4
152 CHAPTER VI Stratified random sampling for spatial sampling Sampling is fundamental to most ecological studies and a representative sampling design is of high impor- tance for biodiversity monitoring. It was previously recommended that the ecological sampling design should be stratified to improve precision, accu- racy, and to ensure proper spatial cov- erage (Gregory et al. 2004). Hence, stratified random sampling has been one of the designs frequently ap- plied in ecological studies. Among the various options for stratified random sampling, Latin Hypercube An optimal spatial sampling approach for modelling the distribution of species Yu-Pin Lin, Wei-Chih Lin, Yung-Chieh Wang, Wan-Yu Lien, Tzung-Su Ding, Pei-Fen Lee, Tsai-Yu Wu, Reinhard A. Klenke, Dirk S. Schmeller, Klaus Henle Sampling (LHS) is promising. It ef- ficiently samples variables from their multivariate distributions and can be conditioned in the multidimensional space defined by environmental co- variates, then called conditioned Latin Hypercube Sampling (cLHS) (Minas- ny and McBratney 2006). The cLHS approach may be used to optimize the sampling design and improve predictions of species distributions by introducing spatial structures of explanatory variables and their cross- spatial structures into the cLHS opti- mization procedure. This is of special importance as overestimates in spe- cies distribution models often result from a lack of relevant explanatory variables or spatial autocorrelation (Lobo and Tognelli 2011). Environ- mental variables and species distribu- tion data are frequently recorded in different cell (grain) sizes (Lauzeral et al. 2013). Therefore, spatial reso- lution is critical in any examination of distributions of species (Lauzeral et al. 2013). Reliable methods to downscale environmental variables or species distributions from coarse to fine grain resolutions have potential benefits for ecology and conservation studies (Keil et al. 2013). In regard to spatial resolution, species distribution models (SDMs) are impacted by the fact that environmental descriptors of samples are frequently recorded at different resolutions (Lauzeral et al. 2013) and may thus require scal- ing to the same resolution. A method called Area-to-Point (ATP) kriging uses spatial structures of predictors for downscaling to predict species distributions (Keil et al. 2013) by taking spatial dependence of predic- tors into account. We illustrate the approach using Swinhoe’s blue pheas- ants (Lophura swinhoii) in Taiwan as an example. Combining spatial downscaling with conditioned Latin hypercube sampling (sdcLHS) In Latin Hypercube Sampling, one must first decide how many sample points to use, and to remember for each sample point from which row and column the sample point was Figure 1. (a) Flowchart of the procedure of spatial downscaling conditioned Latin Hypercube Sampling (sdcLHS); (b) Interface of the Windows-based tool of spatial conditioned Latin Hypercube Sampling (scLHS). a b Step 2. Environmental variables Z Step 1. Downscale Z by ATP Kriging Steps 5 to 7. Optimization procedures of steps 4 to 6 in Minasny & McBratney (2006) Step 4. Calculate the objective functions Divide Z into n strata Calculate the quantile distribution of each variable Calculate the correlation matrix of z(T) No, repeat Steps 4 to 7 Converging criteria reached ?/ 10,000 iterations completed? End of sdcLHS Randomly select n samples from N N= no. sample sites Step 3. Step 8. Yes

An optimal spatial sampling approach for modelling the ...dirk.die-schmellers.de/.../Lin...Conservation-4.pdf · ecological sampling design should be stratified to improve precision,

  • Upload
    others

  • View
    5

  • Download
    0

Embed Size (px)

Citation preview

Page 1: An optimal spatial sampling approach for modelling the ...dirk.die-schmellers.de/.../Lin...Conservation-4.pdf · ecological sampling design should be stratified to improve precision,

152 CHAPTER VI

Stratified random sampling for spatial sampling

Sampling is fundamental to most ecological studies and a representative sampling design is of high impor-tance for biodiversity monitoring. It was previously recommended that the ecological sampling design should be stratified to improve precision, accu-racy, and to ensure proper spatial cov-erage (Gregory et al. 2004). Hence, stratified random sampling has been one of the designs frequently ap-plied in ecological studies. Among the various options for stratified random sampling, Latin Hypercube

An optimal spatial sampling approach for modelling the distribution of speciesYu-Pin Lin, Wei-Chih Lin, Yung-Chieh Wang, Wan-Yu Lien, Tzung-Su Ding, Pei-Fen Lee, Tsai-Yu Wu, Reinhard A. Klenke, Dirk S. Schmeller, Klaus Henle

Sampling (LHS) is promising. It ef-ficiently samples variables from their multivariate distributions and can be conditioned in the multidimensional space defined by environmental co-variates, then called conditioned Latin Hypercube Sampling (cLHS) (Minas-ny and McBratney 2006). The cLHS approach may be used to optimize the sampling design and improve predictions of species distributions by introducing spatial structures of explanatory variables and their cross-spatial structures into the cLHS opti-mization procedure. This is of special importance as overestimates in spe-cies distribution models often result from a lack of relevant explanatory variables or spatial autocorrelation

(Lobo and Tognelli 2011). Environ-mental variables and species distribu-tion data are frequently recorded in different cell (grain) sizes (Lauzeral et al. 2013). Therefore, spatial reso-lution is critical in any examination of distributions of species (Lauzeral et al. 2013). Reliable methods to downscale environmental variables or species distributions from coarse to fine grain resolutions have potential benefits for ecology and conservation studies (Keil et al. 2013). In regard to spatial resolution, species distribution models (SDMs) are impacted by the fact that environmental descriptors of samples are frequently recorded at different resolutions (Lauzeral et al. 2013) and may thus require scal-ing to the same resolution. A method called Area-to-Point (ATP) kriging uses spatial structures of predictors for downscaling to predict species distributions (Keil et al. 2013) by taking spatial dependence of predic-tors into account. We illustrate the approach using Swinhoe’s blue pheas-ants (Lophura swinhoii) in Taiwan as an example.

Combining spatial downscaling with conditioned Latin hypercube sampling (sdcLHS)

In Latin Hypercube Sampling, one must first decide how many sample points to use, and to remember for each sample point from which row and column the sample point was

Figure 1. (a) Flowchart of the procedure of spatial downscaling conditioned Latin Hypercube Sampling (sdcLHS); (b) Interface of the Windows-based tool of spatial conditioned Latin Hypercube Sampling (scLHS).

a

b

Step 2.

Environmental variables Z

Step 1.Downscale Z by ATP Kriging

Steps 5 to 7.Optimization procedures ofsteps 4 to 6 in Minasny &

McBratney (2006)

Step 4.Calculate the objective

functions

Divide Z inton strata

Calculate the quantile distribution of each variable

Calculate the correlation matrix of

z(T)

No, repeatSteps 4 to 7Converging criteria reached ?/

10,000 iterations completed?

End of sdcLHS

Randomly select n samples from N

N= no. sample sites

Step 3.

Step 8.

Yes

Page 2: An optimal spatial sampling approach for modelling the ...dirk.die-schmellers.de/.../Lin...Conservation-4.pdf · ecological sampling design should be stratified to improve precision,

VI CHAPTER 153

taken. Statistically expressed, the sd-cLHS approach will address the fol-lowing optimization problem: Given N sample sites with environmental variables (Z), select n sample sites (n<<N) such that the sampled sites form a Latin hypercube. For k con-tinuous variables, each component of Z is divided into n equally probable strata based on their distributions and z denotes a sub-sample of Z. The steps of the sdcLHS algorithm, which are based on those in cLHS (Minasny and McBratney 2006), are as follows (Figure 1a):

Step 1. ATP kriging to downscale environmental variables Z from coarse scale to the fine scale.

Step 2. Division of the quantile distri-bution of Z into n strata; calcula-tion of the quantile distribution for each variable.

Step 3. Selection of n random sam-ples from N; calculation of the correlation matrix of z (T).

Step 4. Calculation of the objective functions. The overall objective function integrates four different components (objective functions) (for details see Lin et al. 2014). For

general applications, the weight assigned to each component in the overall objective function is equal.

Steps 5 to 7. Steps 5-7 are optimiza-tion procedures (for details see Minasny and McBratney 2006).

Step 8. Repetition of steps 4 to 7 until either the objective function value falls beyond a given stop criterion or 10,000 iterations are completed.

The spatial conditioned Latin Hy-percube Sampling (scLHS) (Lin et al. 2014) is the sampling part (steps 2 to 8) of sdcLHS (Figure 1a), developed as a Windows-based tool to select optimal sampling sites (Figure 1b).

Illustrative exampleWe applied the optimal sampling

method with a downscaling approach to locate optimal sampling sites at the 2 × 2 km scale and to improve the identification of the spatial structure of the distribution of Swinhoe’s blue pheasants (Figure 2) in Taiwan. The distribution of the focal species was estimated by Maximum Entropy (see Maxent; Phillips et al. 2009) based on the existing 803 2 × 2 km samples and separately based on 725 1 × 1 km samples (Figure 3). The estimated distributions were assumed to be the real distribution of the focal

Rain-gauge stations

Figure 3. Observed samples with presence and absence data of Swinhoe’s blue pheasant (Lee et al. 2004) in (a) 2 × 2 km sample sites; (b) 1 × 1 km sample sites; and (c) rain-gauge stations used for scaling validation. (Blue: presence; Green: absence).

Figure 2. Photo of Swinhoe’s blue pheasants (Lophura swinhoii) (a) male; (b) female.

a b

Page 3: An optimal spatial sampling approach for modelling the ...dirk.die-schmellers.de/.../Lin...Conservation-4.pdf · ecological sampling design should be stratified to improve precision,

154 CHAPTER VI

"

" ""

"" " " " "

" " "" "

" " ""

" " "" " "

" " " " """ " "

" " " ""

" "" " " "" " " "

"" " "

"" "" " "

" " " " "" " "

" """ "

" " " "" "

" "" " " "

" "" " "

" " " " "" " " " " "

" " " "" "" ""

" " " " " "" " " "

" "" " " " "

" " "" " "" "

" " " " "" "" " " " " "

" "" " " "

" " "" " "" "

" " " "" " "

" " "" " " "

" " "" " "" " " " "" "" "

" " " " "" " " " "

" " " "" " " " "

" " " " " "" " "" " " "

" " "" " " " "

"" " " " ""

" " " " " "" " "

" " "" " " " " " "" " "

" " " "" " " "

" " " " " "" " "

"" " "

" " "" " " " " "

" " " " " "" " " "

" "" " "" " " "" "

" " " " " "" " "" "

" " " " "" "" " "

" " " " "" " " "

" " " "" "" "" " " " "

"" " " "" " " " " " " "

" "" "

" " "" "" " " "

" " "" " " " " " " " "" " "

" " " " "" " " "

" " " " "" " " " " "

" " " " " "" " "

" " " "" """ "

" " " " "" " "

" " " "" " " "

" " " " "" " " " " " " "

""" " "" " "

" "" " " " """ ""

" " " " "" " " "

" " " " """ " "

"" " ""

" "" " " "

" " " "" "

" " "" " " " "" "

" "" " " "

" " " "" "

" " ""

" " " " " ""

" " " " " "" " "

" """ "

""" " "

" "" " "" "

"" "

"" " ""

"" ""

"

" " ""

" "

""

"

""

""

"""" "

"

" " " " """ "" " "

"" " "" "" " """ " " " "" "" " ""

" " "" " " "" "

" "" " "" "" " "" " " "

"" " "" """ " """ " """ " """" "" " " """" " """ """"" " "" " "" " "" "" "" " " "" " "" " " "

" "" " "" "" " " " "" " "" ""

"" " "" " "" """ " "" "" "" " "" "" " " " "

" " " " """ " " " "" "" " "" " "" "" " " "" " "" " "

"" "" """ "" " " "" " """ " "" " "

" "" " " " "" "" """ """ "" " "" " """ " """ " " "" """ " "" "" " "" " " "" "" " " " " "" "" "" " " "" " " """" "" " " " "" """

" " " "" "" " " """ " " "" "" " " """ " " "" """ " " " "" " " " "" " "" " " "

"" " "" " "" "" "" " " " "" " " """ " " "" "" " ""

" "" " " " "" " "" " " "" "" " "" "" """ " " "" "" " " "" " " " "" " " " "" "" "" " " " "" "" " " """" "" " " " "" "" " " "" "" " "" " "" " " "" " "" "" "" "

" " " "" "" "" " " "" " "" """ "" "" """ " "

"" " "" " "" "" " " " "" " """ " "" "" "" """" " " "" " " " """ " ""

"" " ""

"" ""

" """ """ " "

" """ "" """ " "" """

"

""" " "

"

" """" "

" ""

"

"" """

""

"

""" " "

"" "" "

" "" " ""

" """ " "

"

""" ""

"""

" """ " ""

""

" """

" ""

" " "" """ "

"

" "" " "" "" " "" "

"

" "

" "

" "" ""

" "" """

" "" " " " "" """

""" ""

" " """ "" ""

" "

""" " " """

""" " " ""

" ""

"" " "" "

" """" "

"" " "" "" "" "

" """

" " """

"""" "" ""

" ""

"

"""

"""

""" " " "

""" ""

""

"

"

"

""

""

"

""

" "" " "

"" "

""

"" " " "

"" " " "

""

" ""

"" "

" "" " "

""

" " " "" "

" ""

"" " " "

"" "

""

""

"" " "

" "" " "

""

"" " "

" " ""

""

" " "" " "

""

" "" "

"" ""

" "" "

"" ""

" ""

" "" "

" " " " "" "

""

""

" ""

" "" " "

" "" " "

"

" ""

""

" " "" " "

" " ""

" "

""

"""

" ""

"" "

" "" "

" "" "

"" " "

" " "" "

""

"

"" "

"

" "

" " ""

" ""

"

""

"

" "" "

"

" ""

"" "

" " ""

" " "" "

" """ "

" " ""

"" "

"" " "

" " " "" " " "

"" " "

"" "

" " """

" " "" "" " "

" "" " "

" "" " ""

" " " "" "

" " " " "" " "

""

" " "" "

" "" " "

"" " " "

" " "" "

" " "" "

" " "" " " "

" "" " " "

" " """ " "

" " "" " " " "

" ""

" "" " "

" " " """

" " "" " " "

" "" " "

" " ""

" " " " " "" " "

" " " "" "

"" "

" " " " "" " " " "

"" " "

" " "" " "

" "" " "

""" " " " "

" " " "" "" " "

" " " "" "

" " "" "

" " "" " " " "" " "

"" " ""

"" " "

" " ""

" " "" "

" ""

" "" " " "

" ""

" "" " "

" " " "" " " " " "" "

" " " " ""

" " " """" "

" " "" " "

" " ""

" " """

"" " "" " "

" "" "

""

" " "" " """ "" " "

" " " "" " "

"" " " "

" "" " " " "

" " ""

" ""

""

"" "

""

" "

""

"

"""

"

"

"

"

"""" ""

" "" """ "" "

" ""

"" "

" " "" " "

" " """ "" ""

"" " "" " "

"" " " "" " "

" " "" "

" " """" """ ""

" " """ "" """ "" " " "

" " """ " "

" """ " """

" "" ""

""" " "" " "" "" "" " " " "" "" "" "" " " "

"" "" "" " """ " "" " "" "

" " """

"""" "" " "" " " "" "" " "" " "" """ """ " "

" "" """ " "" ""

" "" " "" " " """ " """ " "

" "" " "" """ " ""

" " " "" " """ " "" "" "" "" "" ""

" "" "" " " "

" " """ " "" "" " "" "

"" " "" " " "" " " "" " "" "" " "

" "" "" " " """ "" " "" "" "" " ""

" """ " "" "" " " ""

" ""

"" """ " "" """ " "

"" "" " " """ """ " "" " ""

" " "" """ " " "

"" "" "" ""

""

" """" "

"" "

"""

"""

"

"

" ""

"" "" "

"

" 1 × 1 km optimal samples " 2 × 2 km optimal samples

a b c

Figure 4. Locations of (a) 200, (b) 400, and (c) 600 2 × 2 km and 1 × 1 km samples derived by the optimal sdcLHS approach (sdcLHS: spatial downscaling conditional Latin Hypercube Sampling).

species for evaluating our proposed approach.

The presence data of the species at certain locations, determined from a set of samples based on presence-absence data, combined with the val-ues of a selected set of environmental variables were used as input for the calculations. The resulting output represented the distribution of maxi-mum entropy among all distributions satisfying the set of constraints (Phil-lips et al. 2009) These methodological constraints required that the expected value of each environmental variable under the estimated distribution was nearly equal to its empirical average (Phillips et al. 2009). The performanc-es of Maximum Entropy were vali-dated by the Kappa and AUC values.

The sample locations at 2 × 2 km and 1 × 1 km resolution were partially clustered due to similar spatial pat-terns and structures (variograms) of the variation of several environmental parameters (Figure 4). The Kappa value of the Maximum Entropy model was 0.38 and the AUC value was 0.86 in model validations using 401 samples at the 2 × 2 km resolu-

tion. The Kappa and AUC values in the Maximum Entropy method were slightly higher when using 362 sam-ples (Kappa= 0.58; AUC= 0.92) at the 1 × 1 km resolution. The predic-tions with 200, 400 and 600 optimal samples taken from the assumed real distributions showed a consistently high performance, with AUC values of 0.99 and Kappa values of 0.97-1.00 for 1 × 1 km cells and AUC values of 0.98 and Kappa values of 0.96-0.98 for 2 × 2 km cells.

Concluding remarks

Incorporating spatial dependency of variables with different resolution into sampling approaches is critical to achieve efficient, unbiased spatial sam-pling. In the frame of the EU project SCALES, we have tested here an opti-mal sampling approach using the spa-tial downscaling sdcLHS based on se-lected environmental variables without pre-sampled species data, and used a Maximum Entropy approach to show the efficiency of the proposed ap-

proach in capturing the distribution of the endemic Swinhoe’s blue pheasant in Taiwan. Our analysis showed that fine scale data yielded accurate pres-ence/absence maps using a subset of presence/absence data that were op-timally located. Locations of samples tended to be non-randomly spatially distributed when sample size increased at a coarser cell size. In regards to cost and resource efficiency without the loss of spatial structures (variograms) of focal species, our method with a sufficiently large sample size, 200 op-timal samples in this case, performed well in capturing the spatial structure and predicting the spatial distribution of the focal species.

We conclude that the proposed sdcLHS approach considers the sta-tistical distributions and effectively exploits the spatial structures of the selected environmental variables to capture spatial correlations in the original data recorded at various cell sizes. In addition, our approach does not require pre-sampled species data to select spatially unbiased sample locations based on information of parameters collected at various scales.

Page 4: An optimal spatial sampling approach for modelling the ...dirk.die-schmellers.de/.../Lin...Conservation-4.pdf · ecological sampling design should be stratified to improve precision,

VI CHAPTER 155

ReferencesGregory RD, Gibbons DW, Donald

PF (2004) Bird census and survey techniques. In: Sutherland WJ, Newton I, Green RE (Eds) Bird Ecology and Conservation; a Handbook of Techniques. Oxford University Press, Oxford, 17-56.

Keil P, Belmaker J, Wilson AM, Unitt P, Jetz W (2013) Downscaling of species distribution models: A hierarchical approach. Methods in Ecology and Evolution 4: 82-94.

Lauzeral C, Grenouillet G, Brosse S (2013) Spatial range shape drives the grain size effects in species distribution models. Ecography 36: 778-787.

Lee PF, Ding TS, Hsu FH, Geng S (2004) Breeding bird species richness in Taiwan: Distribution on gradients of elevation, primary productivity and urbanization. Journal of Biogeography 31: 307-314.

Lin Y-P, Lin W-C, Li M-Y, Chen Y-Y, Chiang L-C, Wang Y-C (2014) Identification of spatial distributions and uncertainties of multiple heavy metal concentrations by using spatial conditioned Latin Hypercube sampling. Geoderma 01/2014, s 230-231: 9-21. doi: 10.1016/j.geoderma.2014.03.015

Lobo JM, Tognelli MF (2011) Exploring the effects of quantity and location of pseudo-absences and sampling biases on the performance of distribution models with limited point occurrence data. Journal for Nature Conservation 19: 1-7.

Minasny B, McBratney AB (2006) A conditioned Latin hypercube method for sampling in the presence of ancillary information. Computers & Geosciences 32: 1378-1388.

Phillips SJ, Dudík M, Elith J, Graham CH, Lehmann A, Leathwick J, Ferrier S (2009) Sample selection bias and presence-only distribution models: Implications for background and pseudo-absence data. Ecological Applications 19: 181-197. doi: 10.1890/07-2153.1