Key lecture for the EURO-BASIN Training Workshop on Introduction to Statistical Modelling for Habitat Model Development, 26-28 Oct, AZTI-Tecnalia, Pasaia, Spain
Introduction to Statistical Modelling Tools for Habitat Models Development, 26-28th Oct 2011 EURO-BASIN, www.euro-basin.eu
2
• Introduction to species habitat and some concepts in community ecology
• Statistical methods dealing with communities
• Analysis of β-diversity: Similarity and distance matrices & Mantel and partial Mantel test
• Practical session “Community Ecology with R”
• Direct Ordination Methods (CCA and RDA)
• Variation partitioning
• Practical session “Community Ecology with R”
• 4th corner method
Index
3
• Which are the main factors that determine the distribution (or the habitat) of species?
• Environmental factors (e.g. temperature, nutrients, …) → Adaptation processes
versus
• Dispersal limitation factors (reproduction and mortality rate, growth, migration, …) → Historical processes
• for a species, but species compete for resources (hence, for space)
• for an assemblage (or community) of species, within a guild
A guild (or ecological guild) is any group of species that exploit the same resources: e.g. zooplankton, phytoplankton, trees
Hypothesis
4
[Figure: species A to G distributed between Site 1 and Site 2, in two configurations]
• Which are the main factors that determine the species composition of a community in a region?
• What are the factors that determine the maintenance of local and regional diversity?
Shared species↓ Shared species↑
Hypothesis
5
γ-diversity / Landscape
α-diversity / Within a homogeneous habitat
β-diversity / Environmental gradient
Whittaker (1960, 1977)
diversities…
6
[Figure: abundance of species a to e along an environmental gradient]
1. Environmental factors ⇔ Niche ⇔ « Environmental patchiness »
2. Geographic distance ⇔ Dispersal limitation ⇔ « random walk » (Neutral theory, Hubbell 2001)
[Figure: shared species against distance between sites]
Neutral community: all individuals have the same rates of reproduction and mortality
Habitat theories
7
Niche model
• The Hutchinsonian niche views the niche as a multi-dimensional hypervolume, where the
dimensions are the environmental conditions and the resources that define the requirements of
an individual or a species (G. E. Hutchinson, 1957).
• The full range of environmental conditions (physical and biological, i.e. the resources) under
which an organism can exist describes its fundamental niche.
[Figure: unidimensional, two-dimensional and three-dimensional niches; abundance against environmental variable]
8
Dispersal-limited model
• Species composition fluctuates in a random, autocorrelated way.
[Figure: species A to G distributed between Site 1 and Site 2, in two configurations; shared species against geographical distance]
β-diversity
Metacommunity A
Metacommunity B
Similarity ↓ : β-diversity↑ Similarity ↑: β-diversity ↓
Distance decay
Metacommunity: a set of local communities
that are linked by dispersal of multiple,
potentially interacting species
9
A metapopulation is a group of spatially separated populations of the same
species which interact at some level
A metacommunity is a set of local communities that are linked by dispersal of
multiple, potentially interacting species
[Figure: schematic of metapopulations and metacommunities; labels n1, n2, m1, na1, nb1, nc1, na2, nb2, na3, ma1, mb1]
Terminology
10
• The number of species found on an undisturbed island is determined by immigration and
extinction.
• Immigration and emigration are affected by the distance of an island from a source of colonists
(distance effect).
• Large islands => lower extinction rate
• Islands near continents => higher immigration rate
The theory of island biogeography
(MacArthur and Wilson, 1967)
MacArthur, R. H. and Wilson, E. O. 1967. The Theory
of Island Biogeography. Princeton, N.J.: Princeton
University Press.
11
Condit et al. Science, January 25, 2002.
β-diversity
Duivenvoorden et al. Science, January 25, 2002.
Variance partitioning
Dispersal limited model
12
Geographic distance
Spatial Autocorrelation
Legendre, P. (1993) Spatial autocorrelation: trouble or new paradigm. Ecology, 74, 1659–1673.
[Figure axes: shared species against environmental gradient]
• Environmental variables and species distributions tend to be spatially autocorrelated:
• Species distributions are most often aggregated because of contagious biotic processes such as
local dispersal
• But also, environment is structured primarily by climate and geomorphological processes on
land that cause gradients and patchy structures.
• Therefore values of these variables are not stochastically independent from one another. This may
lead to misinterpretation of patterns using classical statistics when ecologists conclude that species–
habitat associations are statistically significant.
• To evaluate the relative importance of environmental segregation and limited dispersal in explaining
species distributions, spatial structure must be considered.
• Spatial autocorrelation can be a problem when explaining the ecological niche of a species. However, it can
improve habitat modelling.
13
Some statistical methods to analyse distribution
patterns of species communities
• Similarity and distance matrices & Mantel and partial Mantel test (Analysis of β-diversity)
• Practical session “Community Ecology with R”
• Direct Ordination Methods (CCA and
RDA)
• Variation partitioning
• Practical session “Community Ecology
with R”
• 4th corner method
14
Analysis of β-diversity: Similarity and distance matrices & Mantel and partial Mantel test
15
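Species Matrix: m sites × n species

$$S = \begin{pmatrix} x_{11} & x_{12} & \dots & x_{1n} \\ x_{21} & \dots & \dots & \dots \\ \vdots & \dots & \dots & \dots \\ x_{m1} & \dots & \dots & x_{mn} \end{pmatrix}$$

→ Similarity Coefficient (Jaccard, …) → β-diversity (e.g. 5 sites):

$$S_{sim} = \begin{pmatrix} 1 & s_{12} & s_{13} & s_{14} & s_{15} \\ . & 1 & s_{23} & s_{24} & s_{25} \\ . & . & 1 & s_{34} & s_{35} \\ . & . & . & 1 & s_{45} \\ . & . & . & . & 1 \end{pmatrix}$$

Environmental Matrix: m sites × q variables

$$AMB = \begin{pmatrix} x_{11} & x_{12} & \dots & x_{1q} \\ x_{21} & \dots & \dots & \dots \\ \vdots & \dots & \dots & \dots \\ x_{m1} & \dots & \dots & x_{mq} \end{pmatrix}$$

→ Distance (Euclidean, …) → Environmental similarity $AMB_{sim}$, a site-by-site matrix with the same triangular layout as $S_{sim}$.

Similarity and distance matrices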
16
(Dis)Similarity and distance indices
Similarity indices (for species data): 0 → 1
• Jaccard index (for presence-absence data) is the number of species shared between the two plots, divided by the total number of species observed in the two plots. 0 (no shared species) → 1 (all species shared)
  Example: Site 1 and Site 2 hold six species (A to F) in total, four of them shared: Jaccard = 4 / 6
• Bray-Curtis index (for abundance data) is defined by 2W/(A+B), where W is the sum over all species of the minimum abundance of each species between the two stations, and A and B are the sums of the abundances of all species at each of the two stations.
  Example:
         sp1   sp2
  St1     3     4
  St2     5     2
  min     3     2    → W = 5
• Bray-Curtis is also known as the Steinhaus, Sørensen or Czekanowski index
• …
Distance indices (for variables):
• Euclidean: $d = \sqrt{\sum_k (x_{1k} - x_{2k})^2}$, where $x_{1k}$ and $x_{2k}$ are the values of variable k at the two stations
  Example:
         var1   var2
  St1    32.3   0.2
  St2    34.6   0.3
  $d_1 = 2.3^2$, $d_2 = 0.1^2$ (squared differences), $d = \sqrt{d_1 + d_2}$
• …
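The three indices on this slide can be sketched in a few lines of plain Python. This is only an illustration, not the vegan implementation used in the practical session: the function names are hypothetical, and the two species lists are an assumption chosen so that four of six species are shared, as in the Jaccard example. The abundance and variable values are those of the slide.

```python
import math

def jaccard(site1, site2):
    """Shared species divided by all species observed at either site."""
    a, b = set(site1), set(site2)
    return len(a & b) / len(a | b)

def bray_curtis_similarity(abund1, abund2):
    """2W / (A + B), with W the sum of per-species minimum abundances."""
    w = sum(min(x, y) for x, y in zip(abund1, abund2))
    return 2 * w / (sum(abund1) + sum(abund2))

def euclidean(vars1, vars2):
    """Square root of the summed squared differences between two stations."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(vars1, vars2)))

# Assumed species lists: six species in total, four shared.
site1 = ["A", "B", "C", "D", "E"]
site2 = ["B", "C", "D", "E", "F"]
print(jaccard(site1, site2))

# Abundances of sp1 and sp2 at St1 and St2 (values from the slide, W = 5).
print(bray_curtis_similarity([3, 4], [5, 2]))

# var1 and var2 at St1 and St2 (values from the slide).
print(euclidean([32.3, 0.2], [34.6, 0.3]))
```

Note that the raw Euclidean distance is dominated by var1 here; a later slide in the deck works with scaled environmental variables for that reason.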
17
Species Matrix $S$ → Similarity Coefficient (Jaccard, …) → $S_{sim}$ (β-diversity)
Environmental Matrix $AMB$ → Distance (Euclidean, …) → $AMB_{sim}$ (Environmental similarity)
Mantel Test: $S_{sim}$ against $AMB_{sim}$
18
Species Matrix $S$ → Similarity Coefficient (Jaccard, …) → $S_{sim}$ (β-diversity)

Site location: x, y → Euclidean → Geographic distance (e.g. 5 sites):

$$\begin{pmatrix} x_1 & y_1 \\ x_2 & y_2 \\ \vdots & \vdots \\ x_m & y_m \end{pmatrix} \rightarrow d = \begin{pmatrix} 0 & d_{12} & d_{13} & d_{14} & d_{15} \\ . & 0 & d_{23} & d_{24} & d_{25} \\ . & . & 0 & d_{34} & d_{35} \\ . & . & . & 0 & d_{45} \\ . & . & . & . & 0 \end{pmatrix}$$

Mantel Test: $S_{sim}$ against $d$
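A simple Mantel test can be sketched as follows, assuming the usual formulation: Pearson correlation between the off-diagonal entries of two site-by-site matrices, with significance obtained by permuting the sites of one matrix. The function names, the site coordinates and the dissimilarity values are all made up for illustration; this is not the code of the practical session.

```python
import math
import random

def upper_triangle(mat):
    """Off-diagonal entries above the diagonal, row by row."""
    n = len(mat)
    return [mat[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mantel(mat_a, mat_b, permutations=999, seed=0):
    """Return (Mantel r, one-sided permutation p-value)."""
    rng = random.Random(seed)
    n = len(mat_a)
    b_vec = upper_triangle(mat_b)
    r_obs = pearson(upper_triangle(mat_a), b_vec)
    hits = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        # Permute rows and columns of mat_a together (relabel the sites).
        permuted = [[mat_a[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(upper_triangle(permuted), b_vec) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)

# Made-up example: 5 sites on a line, dissimilarity increasing with distance.
coords = [0, 1, 2, 4, 7]
geo = [[abs(a - b) for b in coords] for a in coords]
dis = [[1 - math.exp(-0.3 * d) for d in row] for row in geo]
r, p = mantel(dis, geo)
print(r, p)
```

With only 5 sites there are just 120 distinct site orderings, so the smallest attainable p-value is limited; real data sets with more sites do not have this constraint.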
19
Floristic data:
708 tree species (> 10 cm dbh)
53 sites of ~1 ha
Case Study 1: Tree rainforest in Panama
[Figure: precipitation gradient and floristic composition]
Environmental Variables:
• Precipitation
• Elevation
• Slope
• Water accumulation flow
• Geology
• Fragmentation
20
Condit et al. Science, January 25, 2002.
β-diversity
Case Study 1: Tree rainforest in Panama
Factor                          Jaccard
Dispersal-related factors
  Geographical Distance (GD)    0.637
  ln(GD)                        0.696
  Cross-plot forest fraction    0.323
Environmental factors
  Elevation                     0.424
  Slope                         0.318
  Runoff                        0.078
  Precipitation                 0.572
  Dry season                    0.461
  Geologic types                0.126
Spectral data
  Band 1                        0.305
  Band 2                        0.117
  Band 3                        0.127
  Band 4                        0.258
  Band 5                        0.148
  Band 7                        0.160
[Figure: fraction of species shared against distance (km)]
21
Identification of complementary areas of diversity
[Figure: species A to G distributed between Site 1 and Site 2, in two configurations]
• Problem of the minimal area: minimise the total surface while preserving all species
• Problem of the maximal coverage: maximise the number of species within a fixed surface
Optimising γ-diversity
22
[Figure: dendrogram of the plots; similarity axis marked at 0%, 8% and 20%]
Cluster 3.1 3.2 3.3 3.4 3.5 2.1 2.2 1.1 1.2
Plots 34 40 41 36 35 371,3,4,21,22,29,23,27,24,28,30,C1, C4,C2,C3
2,S0,S1,S3,S2,S4,SH,25,26,5,17,13, 10,11,18,14,P1, P2,6,7,12,15,16,8,9, 20,19,G1,G2
31,3233
Step 1. Hierarchical agglomerative clustering
Step 2. Multiple Regression Model between distance matrices
Identification of complementary areas
23
Step 1. Hierarchical agglomerative clustering
[Figure: Jaccard similarity against predicted values, both axes from 0.2 to 1.0; R² = 0.57 (p < 0.001)]
Step 2. Multiple Regression Model between distance matrices
Step 3. Extrapolation of the model and cluster assignation
Ŝ(pixel i, site 1), Ŝ(pixel i, site 2), …, Ŝ(pixel i, site 53)
• Log(GD)
• Elevation
• Bands 1-4
Identification of complementary areas
24
Non-rain forest Water surfaces Cluster 1.1 Cluster 1.2 Cluster 2.1 Cluster 2.2 Cluster 3.1 Cluster 3.2 Cluster 3.3 Cluster 3.4 Cluster 3.5
Predicted floristic types: identification of complementary areas
Chust, G., J. Chave, R. Condit, S. Aguilar, S. Lao, & R. Pérez (2006) Determinants and spatial modeling of beta-diversity in a tropical forest landscape in Panama. Journal of Vegetation Science 17: 83-92.
25
Case Study 2: zooplankton in the Bay of Biscay
[Figure: map of the study area in the Bay of Biscay (axes from -7 to 0 and from 43 to 47) with the 100 m and 200 m isobaths; labelled features: Gironde Estuary, Arcachon Bay, Adour river, Cap Breton Canyon, Cap Ferret Canyon]
267 Zooplankton samples collected from May 2-16, 2004
Irigoien, X., G. Chust, J.A. Fernandes, A. Albaina, L. Zarauz (2011) Factors determining mesozooplankton species distribution and community structure in shelf and coastal waters. Journal of Plankton Research 33: 1182-1192.
Copepod Calanus helgolandicus
24 most abundant copepods
26
[Figure: two panels, species similarity (Bray-Curtis) and species similarity (Jaccard) against distance (km), 0 to 350 km]
Species similarity indices against geographic distance
Case Study 2: zooplankton in the Bay of Biscay
27
Species similarity indices against environment:
• 15 environmental variables (bottom depth; temperature, salinity and density at surface and bottom; difference in density between surface and bottom; Brunt-Väisälä frequency; integrated fluorescence; depth of the maximum fluorescence; fluorescence at the maximum; abundance of chaetognaths, jellyfish and fish eggs)
• 32767 possible subsets were compared
• $\sum_{k=1}^{n} \frac{n!}{k!\,(n-k)!}$, where n is the number of variables and k the number of variables in a subset
• The best subset of environmental variables, i.e. the one explaining the maximum variation of the species similarity, contained 4 variables: Brunt-Väisälä frequency, salinity at surface, density at bottom and jellyfish abundance (for the Bray-Curtis index)
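The count of 32767 subsets can be checked directly: summing the binomial coefficients for k = 1 to 15 gives 2^15 - 1. The sketch below, with hypothetical function names, computes that sum and shows how the non-empty subsets could be enumerated with the standard library.

```python
from itertools import combinations
from math import comb

def count_subsets(n):
    """Number of non-empty subsets of n variables: sum of C(n, k) for k = 1..n."""
    return sum(comb(n, k) for k in range(1, n + 1))

def all_subsets(variables):
    """Yield every non-empty subset of the given variable names."""
    for k in range(1, len(variables) + 1):
        yield from combinations(variables, k)

print(count_subsets(15))  # 32767, the same as 2**15 - 1

# Small illustration with three placeholder variable names.
for subset in all_subsets(["depth", "salinity", "temperature"]):
    print(subset)
```

In a model selection of this kind, each subset would be turned into its own environmental distance matrix and compared with the community dissimilarities; only the enumeration step is shown here.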
Case Study 2: zooplankton in the Bay of Biscay
28
Aim: to select the best subset of environmental variables, so that distances of (scaled) environmental variables have the maximum correlation with community dissimilarities
Model Selection
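Environmental Matrix $AMB$ (m sites × q variables) → Distance (Euclidean, …) → Environmental similarity $AMB_{sim}$

$$AMB = \begin{pmatrix} x_{11} & x_{12} & \dots & x_{1q} \\ x_{21} & \dots & \dots & \dots \\ \vdots & \dots & \dots & \dots \\ x_{m1} & \dots & \dots & x_{mq} \end{pmatrix}$$

Each subset of the columns of $AMB$ gives its own environmental similarity matrix:
n combinations of q variables → n Environmental similarity matrices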
29
Comparison                                              Mantel r   p-value   Terms selected for environmental variables
Bray-Curtis × Environment                               0.54       0.001     Frequency of Brunt-Vaisala, Salinity at surface, Density at bottom, Jellyfish abundance
Bray-Curtis × Distance                                  0.43       0.001
Bray-Curtis × Environment (Distance partialled out)     0.50       0.001
Jaccard × Environment                                   0.44       0.001     Temperature at bottom, Density at surface and at bottom, Fish abundance
Jaccard × Distance                                      0.47       0.001
Jaccard × Environ selec (Distance partialled out)       0.34       0.001
Case Study 2: zooplankton in the Bay of Biscay
[Figure: diagram with ENV and DIS components]
Conclusion: mesozooplankton communities in the Bay of Biscay are subject to a balanced degree of dispersal limitation and niche segregation.
30
Case Study 2: a comparison of estuarine intertidal communities
rM = 0.625, Slope = -0.0021
rM = 0.316, Slope = -0.0020
rM = 0.064, Slope = -0.0003
Saltmarsh and seagrass plants Macroalgae Macroinvertebrates
31
• R: vegan package (Oksanen et al. 2011, see Docs)
• PRIMER (Clarke & Gorley 2006; http://www.primer-e.com/)
• …
Software for Similarity/distance indices and Mantel tests
32
Practical session 1 “Community Ecology with R: vegan package”
33
ANALYZING BETA DIVERSITY: PARTITIONING THE SPATIAL VARIATION OF COMMUNITY COMPOSITION DATA (Legendre et al. 2005, Ecological Monographs)
• The variance of a dissimilarity matrix among sites (rM²) is not the variance of the community composition,
• hence, partitioning on distance matrices should not be used to study the variation in community composition among sites.
• Partitioning on distance matrices underestimated the amount of variation in community composition explained by the raw-data approach.
• The proper statistical procedure for partitioning the spatial variation of community composition data among environmental and spatial components, and for testing hypotheses about the origin and maintenance of variation in community composition among sites, is canonical partitioning.
• The Mantel approach is appropriate for testing other hypotheses, such as the variation in beta diversity among groups of sites. Regression on distance matrices is also appropriate for fitting models to similarity decay plots.
34
Direct (constrained) Ordination Methods & Variation partitioning
35
Constrained (Canonical) Ordination Methods
• Univariate: e.g. (multiple) regression model
  One species (occurrence, abundance) × q environ. var.: $S$ (m × 1 vector $x_1, x_2, \dots, x_m$) × $AMB$ (m × q matrix)
• Multivariate response data: e.g. Canonical Ordination
  Species composition data × q environ. var.: $S$ (m × n matrix) × $AMB$ (m × q matrix)
• Residual variation of multivariate response data: e.g. Partial ordination
  Species composition data × q environ. var. × Spatial terms: $S$ (m × n matrix) × $AMB$ (m × q matrix) × $d$ (site coordinates $x_i, y_i$, i = 1, …, m)
36
Response model   Indirect   Direct (multivariate)
Linear           PCA        Constrained Ordination: RDA
Unimodal                    Constrained Ordination: CCA
• Ordination methods such as principal component analysis (PCA) are used to summarise the variation in community composition in an ordination diagram. (PCA uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of uncorrelated variables called principal components.)
• Constrained (Canonical) Ordination is a combination of ordination and multiple regression. It extracts continuous axes of variation from species abundance data in order to determine which portion of this variation is directly explained by environmental variables. The axes are constrained to be linear combinations of environmental variables. The orthogonal directions found by PCA are only one particular choice, and other directions may well be better related to the environmental variables; Canonical Ordination is a solution for this.
[Diagram: species matrix $S$ (m sites × n species) × environmental matrix $AMB$ (m sites × q variables) → PCA axes pc1, pc2, pc3, …]
Constrained (Canonical) Ordination Methods
37
Redundancy Analysis (RDA): species are assumed to have linear response surfaces with respect to compound environmental gradients. Thus, RDA is a direct extension of multiple regression to the modelling of multivariate response data. It is related to PCA and it is based on Euclidean distances.
[Figure: abundance of species a, b and c along an environmental gradient, with linear responses (RDA) and unimodal responses (CCA)]
Canonical Correspondence Analysis (CCA): species are assumed to have unimodal response surfaces with respect to compound environmental gradients. It is related to Correspondence Analysis and it is based on the Chi-squared distance.
Constrained (Canonical) Ordination Methods
38
[Figure: two 3-D surface plots (X, Y, Z data) illustrating a linear and a cubic trend surface]
Trend surface model
Linear: x, y
Cubic: x, y, xy, x², y², x²y, y²x, x³, y³
Geographic distance d (from site coordinates x, y) for Mantel approaches
Spatial terms for Canonical Ordination Methods: trend surface
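The spatial terms of a trend surface are simply the monomials of the site coordinates up to a given degree. A minimal sketch, with a hypothetical function name and made-up coordinates, that generates the 2 linear and the 9 cubic terms listed on the slide:

```python
def trend_surface_terms(x, y, degree=3):
    """Return (name, value) pairs for all monomials x^i * y^j with 1 <= i + j <= degree."""
    terms = []
    for total in range(1, degree + 1):
        for i in range(total, -1, -1):
            j = total - i
            terms.append((f"x^{i}*y^{j}", x ** i * y ** j))
    return terms

# Made-up coordinates for one site.
for name, value in trend_surface_terms(2.0, 3.0, degree=3):
    print(name, value)
```

In practice the coordinates are usually centred before building the polynomial, and only the terms that contribute significantly are retained; both steps are left out of this sketch.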
39
© AZTI-Tecnalia
Variation Partitioning
Chust, G., et al. (2003). Conservation Biology 17 (6): 1712-1723.
[Figure: Venn diagrams. 2 variable types, e.g. Environment ENV and Distance DIS, with fractions a, b, c. 3 variable types, e.g. Environment ENV, Distance DIS and Anthropogenic ANT, with fractions a to g. UNA: unaccounted (not explained)]
Steps (just algebra):
1. Canonical Ordination (CO) between Species and ENV → a+c
2. Partial CO (pCO) between Species and ENV, partialling out SPA → a; → c = (a+c) - a
3. CO between Species and (ENV & Distance) → a+b+c; i.e. 1 - UNA. Thus, b = (a+b+c) - a - c
Or 3bis. CO between Species and DIS → b+c. Thus b = (b+c) - c; UNA = 1 - [a + b + c]
Example (2 variable types): UNA = 50%; fractions of ENV and DIS: 30%, 10%, 10%
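The algebra of steps 1 to 3 can be written out as a short function. The sketch below assumes that a is the pure environmental fraction, b the pure distance fraction and c the shared fraction, and it assigns the slide's 30%, 10%, 10% to a, c, b in that order; that assignment is an assumption, since the figure labels did not survive. Inputs are percentages of explained variation from the three ordinations.

```python
def partition_variation(env, env_given_dis, env_and_dis):
    """
    env           : CO between Species and ENV           -> a + c
    env_given_dis : pCO with distance partialled out     -> a
    env_and_dis   : CO between Species and (ENV & DIS)   -> a + b + c
    All values are percentages of the total variation.
    """
    a = env_given_dis
    c = env - a
    b = env_and_dis - a - c
    una = 100 - env_and_dis
    return {"a": a, "b": b, "c": c, "UNA": una}

# Assumed inputs consistent with the slide's example (UNA = 50%).
print(partition_variation(env=40, env_given_dis=30, env_and_dis=50))
```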
40
• R: vegan package (Oksanen et al. 2011, see Docs)
• CANOCO (ter Braak and Smilauer 1998; http://www.pri.wur.nl/uk/products/canoco/)
• …
Software for Canonical and Redundancy analysis, and Variation Partitioning:
41
*Chust, G., J. Chave, R. Condit, S. Aguilar, S. Lao, & R. Pérez (2006) Determinants and spatial modeling of beta-diversity in a tropical forest landscape in Panama. Journal of Vegetation Science 17: 83-92.
Conclusion: The distribution of Panamanian tree species appears to be primarily determined by dispersal limitation, then by environmental heterogeneity
Case Study 1: Tree rainforest in Panama
Duivenvoorden et al. Science, 2002.
Based on Mantel test
Chust et al. 2006. JVS*
25% 10%
17%
46%
Shared
Spatial terms
Environment
Not explained
Based on Canonical Correspondence Analysis
42
Practical session 2 “Community Ecology with R: vegan package”
43
4th Corner Method
44
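Presence/Absence × Traits × Environment
A (sp × sites): Presence/Absence, 245 sp × 78 sites
B (sp × trait): Traits, 3 life forms and 4 types of dispersion
C (var × sites): Environment, Fragmentation
D (var × trait): D = C * A’ * B

$$\begin{pmatrix} A\,(sp \times sites) & B\,(sp \times trait) \\ C\,(var \times sites) & D\,(var \times trait) \end{pmatrix}$$

• test F (global)
• Correlation r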
Legendre, P., Galzin, R. & Harmelin-Vivien, L. (1997) Relating behavior to habitat: solutions to the fourth-corner problem. Ecology, 78, 547–562.
4th Corner Method (Legendre et al. 1997)
• The fourth-corner method tests for the association between biological traits and habitat at the locations where the corresponding species are found.
• How do the biological and behavioral characteristics of species determine their relative locations in an ecosystem?
• e.g. are the modes of dispersion related to habitat fragmentation?
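The product D = C * A’ * B can be illustrated on a toy data set. The matrices below are made up (3 species, 2 sites, 2 traits, 1 environmental variable) and the helper names are hypothetical; the sketch only computes D and does not perform the permutation tests behind the F and r statistics.

```python
def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(p, q):
    """Plain matrix product of two lists of rows."""
    q_t = transpose(q)
    return [[sum(a * b for a, b in zip(row, col)) for col in q_t] for row in p]

def fourth_corner(a, b, c):
    """D (var x trait) = C (var x sites) * A' (sites x sp) * B (sp x trait)."""
    return matmul(matmul(c, transpose(a)), b)

# Made-up data: rows of A and B are species.
A = [[1, 0],   # species 1 present at site 1 only
     [1, 1],   # species 2 present at both sites
     [0, 1]]   # species 3 present at site 2 only
B = [[1, 0],   # species 1 has trait 1
     [0, 1],   # species 2 has trait 2
     [0, 1]]   # species 3 has trait 2
C = [[0, 1]]   # one environmental variable, scored 1 at site 2

print(fourth_corner(A, B, C))
```

Each entry of D counts, for one environmental variable and one trait, the species occurrences that carry the trait at sites where the variable is scored.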
45
Case study 1: Coral reef fish data
• Biological and behavioural traits
• Environmental variables: bottom type, depth, …
Legendre et al. 1997
46
Case study 2: Plant traits
3 life forms
4 types of dispersion
Habitat fragmentation
47
Test F
Case study: Plant traits
Correlation
Interpretation: The effects of fragmentation of scrubland on scrub species community are related to the dispersal type
Fragmentation
Interpretation: Wind-dispersed species are positively related to the defragmentation
48
Chust, G., A. Pérez-Haase, J. Chave, & J. Ll. Pretus. (2006) Linking floristic patterns and plant traits of Mediterranean communities in fragmented habitats. Journal of Biogeography 33: 1235–1245.
Case study: Plant traits
[Figure: number of species in scrublands against fraction of scrubland (0-33, 34-66, 67-100 %), for woody plants and annual herbs, and for animal-dispersed and wind-dispersed species; the axis is labelled Fragmentation]
Interpretation: Wind-dispersed and annual species are positively related to the defragmentation of scrublands