7.2 Selecting Samples
Unless you plan on sub-sampling the entire core, perhaps contiguously, you will need to decide where best to take your samples for best model coverage. If you are only interested in a single element, this is trivial - simply ensure your sampling strategy encompasses a wide range of samples. However, where you are attempting to calibrate for many elements, developing an optimal sampling strategy manually can be difficult.
The function below illustrates the use of cluster analysis in optimising the sampling regime. Because it is unrealistic to plan on accurately sub-sampling at the high-resolution of the data (in this case, 1 mm), we use itrax_reduce()
to re sample the data to a lower resolution, in this case, 10 mm.
The following itrax_section()
performs an unconstrained cluster analysis, and also reports the central sample in each cluster. This could be considered the most representative of each cluster, and thus the suitable location for sampling. In the graph below, the black ticks created by geom_rug()
are the suggested sampling locations.
CD166_19_xrf %>%
filter(qc == TRUE) %>%
itrax_reduce(by = 10) %>%
mutate(uid = 1:length(uid)) %>% # we have to spoof uid here
drop_na("depth") %>%
itrax_section(divisions = 60,
elementsonly = c("Al", "Si", "K" , "Ca", "Ti",
"V" , "Cr", "Mn", "Fe", "Ni",
"Cu", "Zn","Sr" , "Y" , "Zr",
"Ba")
) %>%
ggplot(mapping = aes(x = depth, y = 1, fill = group)) +
geom_tile(width = 10) +
scale_x_reverse() +
geom_rug(sides = "b",
data = . %>% filter(calib_sample == TRUE)) +
theme(axis.title.y = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
legend.position = "none")
To return a core sampling plan, some conversions must be performed:
CD166_19_xrf %>%
filter(qc == TRUE) %>%
itrax_reduce(breaks_lower = seq(from = 0, to = 4209-10, by = 10),
breaks_upper = seq(from = 0+10, to = 4209, by = 10)) %>%
# this needs a method for simply adding all the character vectors from each chunk together into a vector...
mutate(uid = 1:length(uid)) %>%
mutate(breaks_lower = seq(from = 0, to = 4209-10, by = 10),
breaks_upper = seq(from = 0+10, to = 4209, by = 10)) %>% # we have to spoof uid here
drop_na("depth") %>%
itrax_section(divisions = 60,
elementsonly = c("Al", "Si", "K" , "Ca", "Ti",
"V" , "Cr", "Mn", "Fe", "Ni",
"Cu", "Zn","Sr" , "Y" , "Zr",
"Ba")
) %>%
filter(calib_sample == TRUE) %>%
select(group, depth, breaks_lower, breaks_upper, everything()) %>%
arrange(depth) %>%
select(breaks_lower, breaks_upper) %>%
mutate(core = if_else(breaks_upper >= min(CD166_19_S1$xrf$depth) & breaks_upper <= max(CD166_19_S1$xrf$depth), "c1", "")) %>%
mutate(core = if_else(breaks_upper >= min(CD166_19_S2$xrf$depth) & breaks_upper <= max(CD166_19_S2$xrf$depth), "c2", core)) %>%
mutate(core = if_else(breaks_upper >= min(CD166_19_S3$xrf$depth) & breaks_upper <= max(CD166_19_S3$xrf$depth), "c3", core)) %>%
mutate(core_position_lower = ifelse(core == "c1", breaks_lower-min(CD166_19_S1$xrf$depth), NA)) %>%
mutate(core_position_lower = ifelse(core == "c2", breaks_lower-min(CD166_19_S2$xrf$depth), core_position_lower)) %>%
mutate(core_position_lower = ifelse(core == "c3", breaks_lower-min(CD166_19_S3$xrf$depth), core_position_lower)) %>%
mutate(core_position_upper = ifelse(core == "c1", breaks_upper-min(CD166_19_S1$xrf$depth), NA)) %>%
mutate(core_position_upper = ifelse(core == "c2", breaks_upper-min(CD166_19_S2$xrf$depth), core_position_upper)) %>%
mutate(core_position_upper = ifelse(core == "c3", breaks_upper-min(CD166_19_S3$xrf$depth), core_position_upper)) %>%
select(core, core_position_lower, core_position_upper)
## # A tibble: 60 × 3
## core core_position_lower core_position_upper
## <chr> <dbl> <dbl>
## 1 c1 80 90
## 2 c1 330 340
## 3 c1 430 440
## 4 c1 440 450
## 5 c1 460 470
## 6 c1 670 680
## 7 c1 770 780
## 8 c1 840 850
## 9 c1 930 940
## 10 c1 1020 1030
## # ℹ 50 more rows