Testing hypotheses on differences among groups with multiple response variables: non-parametric approaches to multivariate analysis of variance in R
Alejandro Ordonez
Community and Conservation Ecology Group
Centre for Ecological and Evolutionary Studies (CEES), University of Groningen
What is parametric?
• Parametric statistics
• Based on the assumption that the data come from a given type of probability distribution (with a moment generating function), then making inferences about the parameters of that distribution (PARAMETRIC)
• Non-parametric statistics
• Methods that make NO assumption about the type of probability distribution (and hence its parameters) that generated the observed data (distribution free) ➔ This does not mean assumption free
General assumptions

Parametric:
• Observations must be independent
• Observations must be drawn from a given distribution
• The assumed distribution is described by a group of parameters

Non-parametric:
• Observations must be independent
• The variable under study has underlying continuity
Univariate Vs Multivariate

One response variable (UNIVARIATE) vs. multiple response variables (MULTIVARIATE):
• Parametric, no predictor: χ² associations, Pearson correlations | PCA and DCA
• Non-parametric, no predictor: Spearman rank correlation | PCoA and NMDS
• Parametric, 1 predictor or more: General + generalized linear models | MANOVA, MANCOVA, RDA, CCA
• Non-parametric, 1 predictor or more: Non-parametric tests (Wilcoxon test, Kruskal-Wallis) | TODAY!!!
From univariate to multivariate
• Compare the groups to determine whether they differ (multivariate)
• Multiple response variables simultaneously
• Then test on which variables they differ (univariate)
• One response variable at a time
From univariate to multivariate
The problem
• How to determine differences between groups when multiple response variables are measured?
[Scan of Table 5.1 from Legendre & Legendre 1999 (Numerical Ecology): methods for analysing multidimensional ecological data sets, classified by the level of precision of the descriptors (quantitative, semiquantitative, qualitative, mixed). It covers, among others: Hotelling's T² and log-linear models for differences between two samples; MANOVA, db-RDA and CCA for differences among several samples; association coefficients (covariance, Pearson r, Spearman r, Kendall τ, contingency coefficients); species diversity measures; clustering; ordination (principal component, correspondence and principal coordinate analysis, non-metric multidimensional scaling); regression; path analysis; and canonical analyses (RDA, CCA, CCorA, discriminant analysis).]

Methods for analyzing multidimensional ecological data sets.
Table from Legendre & Legendre 1999 (Numerical Ecology)
Univariate Vs Multivariate

One response variable (UNIVARIATE) vs. multiple response variables (MULTIVARIATE):
• Parametric, no predictor: χ² associations, Pearson correlations | PCA and DCA
• Non-parametric, no predictor: Spearman rank correlation | PCoA and NMDS
• Parametric, 1 predictor or more: General + generalized linear models | MANOVA, MANCOVA, RDA, CCA
• Non-parametric, 1 predictor or more: Non-parametric tests (Wilcoxon test, Kruskal-Wallis) | TODAY!!!
What alternatives are out there?
• Parametric
• MANOVA
• MANCOVA
• Non-parametric
• ANOSIM
• Mantel test
• Multiple response permutation procedures (MRPP)
• Distance-based redundancy analysis (dbRDA)
• Multivariate analysis of variance using distance measures (PERMANOVA)

See Legendre & Legendre 1999 (Numerical Ecology)
MANOVA Vs. PERMANOVA: Assumptions

MANOVA:
• Data come from a multivariate normal distribution
• All groups have the same variance
• Sensitive to correlation among response variables
• Needs many more samples than variables
• Highly sensitive to many zeros

PERMANOVA:
• Distribution free
• Between-group variances may differ (BUT it is sensitive to this)
• Insensitive to correlation among response variables
• There can be more variables than samples
• Insensitive to many zeros

Legendre & Anderson 1999 (Ecological Monographs); Anderson 2001 (Austral Ecology)
Why PERMANOVA?
• The variation can be partitioned among factors
• Simple, nested and random models can be described (blocking factors)
• Choose between metric or semi-metric distances (set by the question)
• Distribution / parameter free
• H0 can be tested using permutations
Assumptions of PERMANOVA
• The observation units are exchangeable under a true null hypothesis
But there are some implicit assumptions:
• Independence of the observations does not imply independence of the variables
• Sensitive to differences in the within-group dispersions
• For designs with more than one factor the test is not strictly non-parametric (semi-parametric)
• It is distribution free because it uses permutations to estimate P-values

McArdle & Anderson 2001 (Ecology); Anderson 2001 (Austral Ecology)
The goal of this method
• Test the significance of individual terms in a multi-factor ANOVA-like framework for multiple response variables
• using a non-parametric permutation approach

McArdle & Anderson 2001 (Ecology); Anderson 2001 (Austral Ecology)
The null hypothesis
• There are NO differences in the position and/or spread, in multivariate space, of the compared groups' attributes

An important but subtle element of the definition:
• PERMANOVA evaluates differences in both location and spread simultaneously
Raw data → Distance matrix (using either metric or semi-metric measures) → Gower matrix → Variance components (SST, SSA, SSE) → Pseudo-F test (test of the ratio F = SSA / SSE) → P-value by permutations or Monte Carlo simulations
A practical example
• Using the Iris and Vegetation.Crawley.csv data
• Evaluate whether the datasets come from a multivariate normal distribution
• Do a MANOVA and a PERMANOVA
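A minimal sketch of the exercise with the built-in iris data (Vegetation.Crawley.csv is not reproduced here). The MANOVA step is base R; the PERMANOVA step assumes the vegan package is available for adonis2:

```r
resp <- as.matrix(iris[, 1:4])

# Parametric MANOVA: assumes multivariate normality and equal covariances
fit <- manova(resp ~ Species, data = iris)
print(summary(fit, test = "Pillai"))

# PERMANOVA on Euclidean distances: distribution free, P-value by permutation
if (requireNamespace("vegan", quietly = TRUE)) {
  set.seed(1)
  print(vegan::adonis2(resp ~ Species, data = iris,
                       method = "euclidean", permutations = 999))
}
```

Both tests agree here that the species differ, but only the MANOVA relies on the distributional assumptions listed above.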
Raw data → Distance matrix → Gower matrix → Variance components (SST, SSA, SSE) → Pseudo-F test (test of ratio)
What is measured in a PERMANOVA

[Scatter plot: Length (m) vs. Mass (kg), with two groups, F and M]

The raw data: an n × m matrix that could summarize:
• Community data
• Location characteristics
• Species attributes or DNA sequences

Ludwig & Reynolds 1988 (Statistical Ecology); figure from Legendre & Legendre 1999 (Numerical Ecology)
[Scan from Legendre & Legendre, "Q and R analyses" (p. 249): environmental impact studies (BACI designs: Before/After-Control/Impact; Green 1979; Underwood 1991, 1992, 1994) form an important category of T-mode problems. It is not always obvious whether an analysis belongs to the Q or the R mode; to prevent confusion, any study starting with the computation of an association matrix among objects is called a Q analysis, whereas studies starting with an association matrix among descriptors are called R analyses. Figure 7.1 shows the three-dimensional data box (OBJECTS × DESCRIPTORS × TIMES), adapted from Cattell (1966).]
Q or R?

Community data:
          Sp. 1  Sp. 2  Sp. 3  Sp. 4
Group A
  1         1      4      7      0
  2         2      6     10      1
Group B
  3         4      2      0      6
  4         5      1      2      9

Environmental data:
          Rain   Elev.  Temp.  Evap.
Group A
  1        100     10     25      0
  2        200     15     40      1
Group B
  3         50   1000     16      6
  4         20   1500     14      9

Species characteristics:
          Trait 1  Trait 2  Trait 3  Trait 4
Group A
  Sp. 1     100     0.01     10       30
  Sp. 2     123     0.03     15       50
Group B
  Sp. 3     300     0.1       1       30
  Sp. 4     400     0.4       0.5     20
Measuring Ecological resemblance
• How to determine the resemblance between either the objects under study (sites) or the variables describing them (species or other descriptors)?
• Association between objects (Q mode)
• Association between descriptors (R mode)
           Group I        Group II
           Sp. 1  Sp. 2   Sp. 3  Sp. 4    → R MODE: associations among descriptors (columns)
Group A
  Site 1     1      2       40     20
  Site 2     2      0       60     10
Group B
  Site 3    10     35        0      4
  Site 4    20     55        2      0
↓ Q MODE: associations among objects (rows)
PERMANOVA: a type of Q-mode analysis

[Scatter plot: Length (m) vs. Mass (kg), groups F and M]
Similarity Vs. Distances
• In Q-mode studies, similarity coefficients between objects are distinguished from distance (or dissimilarity) coefficients
• Similarities: maximum when A = B; minimum when A ≠ B
• Distances: maximum when A ≠ B; minimum when A = B
Transforming similarities to distances
• A similarity can be transformed into a distance, for example by computing its one-complement
• Distances, which in some cases are not bound by a pre-determined upper value, may be normalized

D = 1 − S        D = √(1 − S)        D = √(1 − S²)

D_norm = D / D_max        D_norm = (D − D_min) / (D_max − D_min)
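These transformations are a one-liner in R. A small sketch (sim_to_dist and normalize_dist are illustrative helper names, not slide code):

```r
# Transform a similarity S (in [0, 1]) into a distance
sim_to_dist <- function(S, form = c("1-S", "sqrt(1-S)", "sqrt(1-S^2)")) {
  form <- match.arg(form)
  switch(form,
         "1-S"         = 1 - S,
         "sqrt(1-S)"   = sqrt(1 - S),
         "sqrt(1-S^2)" = sqrt(1 - S^2))
}

# Normalize a distance with no pre-determined upper bound
normalize_dist <- function(D, use_min = FALSE) {
  if (use_min) (D - min(D)) / (max(D) - min(D))  # D_norm = (D - Dmin)/(Dmax - Dmin)
  else D / max(D)                                # D_norm = D / Dmax
}

S <- c(1, 0.75, 0.5, 0)        # similarities: 1 = identical objects
D <- sim_to_dist(S, "1-S")     # distances: 0 for identical objects
```

Note that identical objects (S = 1) map to D = 0 under all three forms, as the definitions above require.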
[Scan from Legendre & Legendre, "Q mode: distance coefficients" (p. 275):] The third group of distances consists of nonmetrics. These coefficients may take negative values, thus violating the positiveness property of metrics. All similarity coefficients from Section 7.3 can be transformed into distances; some properties of the transformations D = 1 − S and D = √(1 − S) are given in Table 7.2. Stating that a distance coefficient is not metric or Euclidean means that a counter-example can be found, not that the coefficient is never metric or Euclidean. A coefficient is said to be Euclidean if the distances are fully embeddable in a Euclidean space; principal coordinate analysis (Section 9.2) of such a distance matrix does not produce negative eigenvalues.

Table 7.2. Properties of distance coefficients calculated from the similarity coefficients of Section 7.3 (from Gower & Legendre, 1986); they apply only when there are no missing data.

Similarity                            D = 1 − S               D = √(1 − S)
                                      metric?     Euclidean?  metric?  Euclidean?
S1 (simple matching; eq. 7.1)         metric      No          Yes      Yes
S2 (Rogers & Tanimoto; eq. 7.2)       metric      No          Yes      Yes
S3 (eq. 7.3)                          semimetric  No          Yes      No
S4 (eq. 7.4)                          nonmetric   No          No       No
S5 (eq. 7.5)                          semimetric  No          No       No
S6 (eq. 7.6)                          semimetric  No          Yes      Yes
S7 (Jaccard; eq. 7.10)                metric      No          Yes      Yes
S8 (Sørensen; eq. 7.11)               semimetric  No          Yes      Yes
S9 (eq. 7.12)                         semimetric  No          No       No
S10 (eq. 7.13)                        metric      No          Yes      Yes
S11 (Russell & Rao; eq. 7.14)         metric      No          Yes      Yes
S12 (Kulczynski; eq. 7.15)            nonmetric   No          No       No

The coefficients, in 2 × 2 contingency-table notation (a = joint presences, b and c = mismatches, d = joint absences):
S1 = (a + d) / (a + b + c + d)
S2 = (a + d) / (a + 2b + 2c + d)
S3 = (2a + 2d) / (2a + b + c + 2d)
S4 = (a + d) / (b + c)
S5 = ¼ [a/(a + b) + a/(a + c) + d/(b + d) + d/(c + d)]
S6 = [a / √((a + b)(a + c))] · [d / √((b + d)(c + d))]
S7 = a / (a + b + c)
S8 = 2a / (2a + b + c)
S9 = 3a / (3a + b + c)
S10 = a / (a + 2b + 2c)
S11 = a / (a + b + c + d)
S12 = a / (b + c)

Properties of distance coefficients calculated from the similarity coefficients.
Table from Legendre & Legendre 1999 (Numerical Ecology)
Distance coefficients
• Distance coefficients are functions that take their maximum value (often 1) for two objects that are entirely different, and 0 for two objects that are identical over all descriptors
[Legendre & Legendre 1999 (Numerical Ecology); Quinn & Keough 2002 (Experimental Design and Data Analysis for Biologists); Ludwig & Reynolds 1988 (Statistical Ecology)]
Types of distances
• Metrics: bounded at the minimum (if A = B, then D(a,b) = 0); always positive (if A ≠ B, then D(a,b) > 0); symmetric (D(a,b) = D(b,a)); follow the triangle inequality (D(a,c) ≤ D(a,b) + D(b,c))
• Semi-metrics (pseudo-metrics): bounded at the minimum; always positive; symmetric; but may violate the triangle inequality
• Non-metrics: symmetric, but may take negative values (violating the positiveness property of metrics)
Formulas (summing over the p descriptors, k = 1 … p):

Euclidean (metric):
d_ij = √( Σ_k (y_ik − y_jk)² )
By definition this dissimilarity has no upper limit; its value increases indefinitely with the number of descriptors. Without proper standardization the Euclidean distance is not good at separating gradients.

City block or Manhattan (metric):
d_ij = Σ_k |y_ik − y_jk|
Has properties similar to the Euclidean distance and will be dominated by variables with large values. Like the Euclidean distance, it is not good at gradient separation without proper standardization.

Minkowski (metric):
d_ij = [ Σ_k |y_ik − y_jk|^r ]^(1/r)
Euclidean (r = 2) and City block (r = 1) are both versions of this metric. When normalized it is useful for comparing dissimilarities between data sets with different numbers of variables.

Chi-square (metric):
d_ij = √( Σ_k (y_•• / y_•k) (y_ik / y_i• − y_jk / y_j•)² )
A dissimilarity measure implicit in some multivariate analyses (e.g. correspondence analysis), only applicable when the variables are counts, such as species abundances.

Bray–Curtis (semi-metric):
d_ij = Σ_k |y_ik − y_jk| / Σ_k (y_ik + y_jk)
A modification of the Manhattan measure (constrained between 0 and 1). Sometimes called percent dissimilarity; well suited to species abundance data because it ignores variables that are zero for both objects (joint absences).

Jaccard's (semi-metric):
An asymmetrical binary coefficient based on presence-absence data (a modified Bray–Curtis). By its formulation it considers the presence of a species more informative than its absence.
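Several of these distances are available directly in base R, and the Bray–Curtis formula is short enough to write out. A sketch on two of the toy sites above (vegan::vegdist(x, "bray") would give the same Bray–Curtis values):

```r
x <- rbind(site1 = c(1, 2, 40, 20),
           site2 = c(2, 0, 60, 10))

d_euc <- dist(x, method = "euclidean")            # sqrt(sum of squared differences)
d_man <- dist(x, method = "manhattan")            # sum of absolute differences
d_min <- dist(x, method = "minkowski", p = 3)     # general form; p = 2 is Euclidean

# Bray-Curtis from its formula: sum|yik - yjk| / sum(yik + yjk)
bray <- function(a, b) sum(abs(a - b)) / sum(a + b)
d_bray <- bray(x[1, ], x[2, ])
```

For these two sites the Manhattan distance is 33 and the Bray–Curtis value is 33/135 ≈ 0.244, bounded in [0, 1] as described above.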
Abundances Vs Presence absence
> dis <- vegdist(decostand(varespec, "norm"), "euclid")
This gives chord distances, which reach a maximum limit of √2 when there are no shared species between two sites. Another recommended alternative is the Hellinger distance, which is based on square roots of sites standardized to unit total:
> dis <- vegdist(decostand(varespec, "hell"), "euclidean")
Despite standardization, these still are Euclidean distances with all their
good properties, but for transformed data. Actually, it is often useful to
transform or standardize data even with other indices. If there is a large
difference between smallest non-zero abundance and largest abundance,
we want to reduce this difference. Usually square root transformation is
sufficient to balance the data. Wisconsin double standardization often
improves the gradient detection ability of dissimilarity indices; this can
be performed using command wisconsin in vegan. Here we first divide
all species by their maxima, and then standardize sites to unit totals.
After this standardization, many dissimilarity indices become identical in rank ordering and should give equal results in NMDS.
You are not restricted to using only vegdist indices in vegan: vegdist
returns similar dissimilarity structure as standard R function dist which
also can be used, as well as any other compatible function in any package.
Some compatible functions are dsvdis (labdsv package), daisy (cluster
package), and distance (analogue package), and beta diversity indices
in betadiver in vegan. Moreover, vegan has function designdist where
you can define your own dissimilarity indices by writing its equation using
either the notation for A, B and J above or, with binary data, the 2 × 2
contingency table notation where a is the number of species found on
both compared sites, and b and c are numbers of species found only in
one of the sites. The following three equations define the same Sørensen
[Margin note: designdist notation]

Quadratic terms:
J = Σ_{i=1}^{N} x_ij x_ik
A = Σ_{i=1}^{N} x_ij²
B = Σ_{i=1}^{N} x_ik²

Minimum terms:
J = Σ_{i=1}^{N} min(x_ij, x_ik)
A = Σ_{i=1}^{N} x_ij
B = Σ_{i=1}^{N} x_ik

Binary terms:
J = shared species
A = no. of species in j
B = no. of species in k

2 × 2 contingency table:
                 Site k
                 present  absent
Site j  present     a       b
        absent      c       d

J = a,  A = a + b,  B = a + c
index where the number of shared species is divided by the average species
richness of compared sites:
> d <- vegdist(varespec, "bray", binary = TRUE)
> d <- designdist(varespec, "(A+B-2*J)/(A+B)")
> d <- designdist(varespec, "(b+c)/(2*a+b+c)", abcd = TRUE)
Function betadiver defines some more binary dissimilarity indices in vegan.
Most published dissimilarity indices can be expressed as designdist formulae. However, it is much easier and safer to use the canned alternatives in existing functions: it is very easy to make errors in writing the dissimilarity equations.
2.3 Comparing ordinations: Procrustes rotation
Two ordinations can be very similar, but this may be difficult to see,
because the axes have slightly different orientation and scaling. Actually, in NMDS the sign, orientation, scale and location of the axes are not defined, although metaMDS uses a simple method to fix the last three components. The best way to compare ordinations is to use Procrustes rotation.
Procrustes rotation uses uniform scaling (expansion or contraction) and
rotation to minimize the squared differences between two ordinations.
Package vegan has function procrustes to perform Procrustes analysis.
Transformations before measuring the distance
• Standardization (continuous data with different units)
• Log-10 transformation (abundances are assumed to follow a log-normal distribution)
• Presence/absence (binning)
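The three pre-treatments above are one line each in base R. A sketch on a toy abundance matrix (vegan's decostand offers the same and more):

```r
x <- rbind(c(1, 10, 1000),
           c(2,  5,  100))

x_std <- scale(x)        # standardization: each column to mean 0, sd 1
x_log <- log10(x + 1)    # log10(y + 1), so zero abundances stay defined
x_pa  <- (x > 0) * 1     # presence/absence binning: 1 if present, 0 if absent
```

After standardization a Euclidean distance no longer depends on the units of each variable; after the log transformation the 1000-fold abundance range is compressed to a 3-fold one.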
Metric Vs. Semi-Metric
• Does the type of measurement influence/determine the type of analysis possible?
YES!!
• But not for PERMANOVA!
(As argued by McArdle & Anderson 2001, Ecology)
A practical example
• Using the Iris and Warb.pop.str.csv data:
• Create different distance matrices and look at their properties
• What if, for the Iris case, we used other distance measures in our PERMANOVA?
• e.g. Euclidean vs. Jaccard
• Transform the data (normalize, log-10)
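A sketch for the Iris part of the exercise: base R's dist(..., method = "binary") is the Jaccard distance on presence/absence data, so the two matrices can be compared without extra packages. The threshold of 3 used to bin the continuous measurements is an arbitrary choice for illustration only:

```r
x <- as.matrix(iris[, 1:4])

d_euc <- dist(x, method = "euclidean")   # metric, sensitive to magnitudes
d_jac <- dist(x > 3, method = "binary")  # Jaccard on an (arbitrary) binning at 3

# Same objects, different geometry: compare the two matrices by rank correlation
rho <- cor(as.vector(d_euc), as.vector(d_jac), method = "spearman")
```

Feeding each matrix to a PERMANOVA (e.g. vegan::adonis2) can then show how much the choice of distance changes the result.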
Distance-based redundancy analysis (db-RDA)
• An approach for testing group differences in dissimilarities for designs with interactions
• PCoA converts the original dissimilarities into Euclidean distances (WARNING: negative eigenvalues)
• The analysis then becomes a multiple linear regression, hence any combination of crossed and nested, fixed and random factors can be included

Legendre & Anderson 1999 (Ecological Monographs)
Raw data (n × m) → Distance matrix (n × n) → Principal coordinate analysis (PCoA), with correction for negative eigenvalues → Matrix Y (n × υ) of principal coordinates + Matrix X (n × p) of dummy variables for the predictors → Redundancy analysis (RDA), F# statistic → Test of F# by permutation

(n = observations; m = variables; p = factors; υ = eigenvectors)

Diagram based on Legendre & Anderson 1999 (Ecological Monographs)
PERMANOVA: Non-parametric MANOVA
• The partitioning of sums-of-squares (SS) and variances used for testing linear models can also be applied directly to dissimilarities
• Even to semi-metric dissimilarities like Bray–Curtis
• Running PCoA on the original dissimilarities is not necessary (no need to worry about those negative eigenvalues)
Estimation of PERMANOVA variance components
• The essence of the analysis:
• Compare the variability within groups (SSW or SSE) against the variability between groups (SSA)
• Determine a pseudo F-statistic
• The same framework as an ANOVA/MANOVA
• Crossed and nested, fixed and random factors can be included

McArdle & Anderson 2001 (Ecology); Anderson 2001 (Austral Ecology)
PERMANOVA: the how-to
• Two essential considerations must be evaluated:
1. The construction of the test statistic (i.e. the pseudo F-test)
2. The calculation of the P-value using some sort of permutation procedure
Construction of the test statistic
• For simplicity, consider a single-factor design with p groups and n objects in each group, so the total number of objects is N = pn
• For this design any distance matrix can be determined
• IMPORTANT: the dissimilarity index used must be adequate for the question and the type of information
The total sum-of-squares (SST): the sum of squared dissimilarities between all pairs of objects, divided by N

SST = (1/N) Σ_{i=1}^{N−1} Σ_{j=i+1}^{N} d_ij²

[Scatter plot: Variable 1 vs. Variable 2]

n = observations per group; p = groups; N = total number of observations (np); i, j = observations i and j
The within-groups sum-of-squares (SSW): the sum of squared dissimilarities between objects within each group, summed over the groups

SSW = (1/n) Σ_{i=1}^{N−1} Σ_{j=i+1}^{N} d_ij² ε_ij,  where ε_ij = 1 if observations i and j are in the same group, and 0 otherwise

[Scatter plot: Variable 1 vs. Variable 2]

n = observations per group; p = groups; N = total number of observations (np); i, j = observations i and j
The between-groups sum-of-squares (SSA): the sum of squared dissimilarities between group means and the overall mean. It can be determined from the usual additive partitioning of SST, as described for ANOVA:

[Scatter plot: Variable 1 vs. Variable 2]

SSA = SST − SSW
SST, SSW and SSA from the distance matrix
• One can calculate the sums of squares from a distance matrix obtained using any distance measure
• Why? Because the sum of squared distances between points and their centroid is equal to the sum of squared inter-point distances divided by the number of points

THAT'S THE COOL THING ABOUT THIS METHOD

Kendall & Stuart 1963 (The Advanced Theory of Statistics); Gower 1966 (Biometrika); Legendre & Legendre 1999 (Numerical Ecology)
[Excerpt from Anderson 2001 (Austral Ecology):] Note that this additive partitioning using a geometric approach yields one value for each of SSW, SSA and SST as sums of squared Euclidean distances, equivalent to the sum of the univariate sums of squares added across all variables. This differs from the traditional MANOVA approach, where partitioning is done for an entire matrix of sums of squares and cross-products (e.g. Mardia et al. 1979).

The key to the non-parametric method is that the sum of squared distances between points and their centroid is equal to (and can be calculated directly from) the sum of squared interpoint distances divided by the number of points (Fig. 2). This relationship has long been known for the Euclidean measure (e.g. Kendall & Stuart 1963; Gower 1966; Calinski & Harabasz 1974; Seber 1984; Pillar & Orlóci 1996; Legendre & Legendre 1998; see also equation B.1 in Appendix B of Legendre & Anderson 1999). What is important is the implication for analyses based on non-Euclidean distances: an additive partitioning of sums of squares can be obtained for any distance measure directly from the distance matrix, without calculating the central locations of groups.

Why is this important? With Euclidean distances, the average of each variable across the observations within a group is the group's measure of central location (its centroid). For many distance measures, however, calculating a central location is problematic: for the semimetric Bray–Curtis measure, a simple average across replicates does not correspond to the 'central location' in multivariate Bray–Curtis space, and an appropriate measure of central location cannot easily be calculated directly from the data.

Fig. 2. The sum of squared distances from individual points to their centroid is equal to the sum of squared inter-point distances divided by the number of points.

Fig. 3. Schematic diagram for the calculation of (a) a distance matrix from a raw data matrix and (b) a non-parametric MANOVA statistic for a one-way design (two groups) directly from the distance matrix. SST = sum of squared distances in the half matrix divided by N (total number of observations); SSW = sum of squared distances within groups divided by n (observations per group); SSA = SST − SSW and F = [SSA/(a − 1)]/[SSW/(N − a)], where a = the number of groups.
The geometric argument
• The sum of squared distances from individual points to their centroid is equal to the sum of squared inter-point distances divided by the number of points.

Legendre & Legendre 1999 (Numerical Ecology)
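The geometric argument is easy to check numerically. A minimal sketch with random 2-D points, using only base R:

```r
# For n points, sum of squared distances to the centroid
# equals the sum of squared inter-point distances divided by n
set.seed(42)
pts <- matrix(rnorm(20), ncol = 2)   # 10 points in 2-D
n   <- nrow(pts)

centroid   <- colMeans(pts)
ss_to_cent <- sum(rowSums(sweep(pts, 2, centroid)^2))
ss_interpt <- sum(dist(pts)^2) / n   # dist() stores each pair once

all.equal(ss_to_cent, ss_interpt)    # TRUE
```

This identity is exactly what lets PERMANOVA compute SSW from the distance matrix alone, without ever computing a centroid, which (as noted above) may not even be well defined for semi-metric measures like Bray–Curtis.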
[Schematic: the same 4 × 4 distance matrix among Sites 1-4 shown three times. SST is computed from all entries of the half matrix, SSW from the within-group blocks only, and SSA follows as SST − SSW.]

DISTANCE MATRIX → SST, SSW, SSA
Building a pseudo F-test
• Now that the main components of an F-ratio can be estimated, an appropriate statistic can be built to test the statistical hypothesis of no effect of the model parameters:

F = [ SSA / (p − 1) ] / [ SSW / (N − p) ]

(SSA: between groups; SSW: within groups)

n = observations per group; p = groups; N = total number of observations (np); i, j = observations i and j
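The whole one-way construction, including the permutation P-value, fits in a few lines of base R. A sketch on Euclidean distances for the iris data (the pseudo_F helper is illustrative, not slide code):

```r
# Squared pairwise distances and the grouping factor
d2  <- as.matrix(dist(iris[, 1:4]))^2
grp <- iris$Species   # p = 3 groups, n = 50 each, N = 150

pseudo_F <- function(d2, grp, n, p) {
  N    <- nrow(d2)
  SST  <- sum(d2[upper.tri(d2)]) / N            # all pairs, divided by N
  same <- outer(grp, grp, "==")                 # TRUE when i and j share a group
  SSW  <- sum(d2[upper.tri(d2) & same]) / n     # within-group pairs, divided by n
  SSA  <- SST - SSW
  (SSA / (p - 1)) / (SSW / (N - p))
}

F_obs <- pseudo_F(d2, grp, n = 50, p = 3)

# Permutation P-value: shuffle group labels, recompute F each time
set.seed(1)
F_perm <- replicate(199, pseudo_F(d2, sample(grp), n = 50, p = 3))
p_val  <- (sum(F_perm >= F_obs) + 1) / (199 + 1)
```

Swapping dist() for any other dissimilarity (Bray–Curtis, Jaccard, …) changes nothing else in the code, which is precisely the point of the method.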
SSW for multi-predictor designs
• Multi-predictor case: several variables are measured simultaneously for each of n replicates in each of the p × q groups
• A natural multivariate analogue is obtained by simply dividing up the sums of squares across all groups
• An F-ratio can then be constructed for each factor
[Schematic: eight sites in a two-way crossed design (Treatment A-1/A-2 × Treatment B-1/B-2, two sites per combination) and the corresponding 8 × 8 DISTANCE MATRIX, with rows and columns labelled by treatment combination.]
Multi-factor SST

[Schematic: the 8 × 8 distance matrix for the 2 × 2 crossed design; all entries of the half matrix enter SST.]

SST = (1/N) Σ_{i=1}^{N−1} Σ_{j=i+1}^{N} d_ij²

n = observations per cell; N = total number of observations; a = levels of factor A; b = levels of factor B; i, j = observations i and j
Multi-factor SSW(A)

[Schematic: only pairs of sites within the same level of Treatment A enter SSW(A).]

SSW(A) = (1/(bn)) Σ_{i=1}^{N−1} Σ_{j=i+1}^{N} d_ij² ε_ij^(A),  where ε_ij^(A) = 1 if observations i and j are in the same level of factor A, and 0 otherwise

n = observations per cell; N = total number of observations; a = levels of factor A; b = levels of factor B; i, j = observations i and j
Multi-factor SSW(B)

[Schematic: only pairs of sites within the same level of Treatment B enter SSW(B).]

SSW(B) = (1/(an)) Σ_{i=1}^{N−1} Σ_{j=i+1}^{N} d_ij² ε_ij^(B),  where ε_ij^(B) = 1 if observations i and j are in the same level of factor B, and 0 otherwise

n = observations per cell; N = total number of observations; a = levels of factor A; b = levels of factor B; i, j = observations i and j
Multi-factor SSR

[Schematic: only pairs of sites within the same A × B cell enter SSR.]

SSR = (1/n) Σ_{i=1}^{N−1} Σ_{j=i+1}^{N} d_ij² ε_ij^(AB),  where ε_ij^(AB) = 1 if observations i and j are in the same combination of factors A and B, and 0 otherwise

n = observations per cell; N = total number of observations; a = levels of factor A; b = levels of factor B; i, j = observations i and j
Source and F-ratios (with SSA = SST − SSW(A), SSB = SST − SSW(B), SSAB = SST − SSA − SSB − SSR, and residual degrees of freedom N − ab):

Among levels of A:   F(A) = [ SSA / (a − 1) ] / [ SSR / (N − ab) ]
Among levels of B:   F(B) = [ SSB / (b − 1) ] / [ SSR / (N − ab) ]
Interaction A × B:   F(AB) = [ SSAB / ((a − 1)(b − 1)) ] / [ SSR / (N − ab) ]
10 PIERRE LEGENDRE AND MARTI J. ANDERSON Ecological MonographsVol. 69, No. 1
TABLE 3. Two-way crossed ANOVA designs. Symbols are as in the text.
SourceMeansquare
Expected meansquare† F ratio
a) Two fixed factorsAmong levels of A MSA � � bnK2 2
e A MSA/MSResAmong levels of B MSB � � anK2 2
e B MSB/MSResInteraction A � B MSAB � � nK2 2
e AB MSAB/MSResResidual MSRes �2e
b) One fixed, one random factorAmong levels of A (fixed) MSA � � n� � bnK2 2 2
e AB A MSA/MSABAmong levels of B (random) MSB � � an�2 2
e B MSB/MSResInteraction A � B MSAB � � n�2 2
e AB MSAB/MSResResidual MSRes �2e
c) Two random factorsAmong levels of A MSA � � n� � bn�2 2 2
e AB A MSA/MSABAmong levels of B MSB � � n� � an�2 2 2
e AB B MSB/MSABInteraction A � B MSAB � � n�2 2
e AB MSAB/MSResResidual MSRes �2e
† In the expected mean square expressions, a � the number of levels in factor A, b � thenumber of levels in factor B, and n � the number of replicates in each group of the balanceddesign. K 2 is defined by Eq. 11.
First, we will consider the linear ANOVA model asit applies when the two factors are fixed, then whenone factor is fixed and the other is random (a ‘‘mixedmodel’’), and finally when the two factors are random.We define the variance of the main effect of any fixedfactor, A, in a univariate analysis as
a2¯(A � A)� i
i�12K � (11)A (a � 1)
where A is the mean across all levels of factor A, a isthe number of levels of factor A, and Ai is the effectof the ith level of factor A. Also, for fixed factors, weassume
a
A � 0 . (12)� ii�1
In addition, we define the estimated variance attributable to a random factor, B, as σ²_B. Similarly, residual error variance is designated by σ²_e. In general, a fixed factor is a factor for which all of the possible levels of the factor (or at least, all of the possible levels of interest for the study) are included in the experiment. By contrast, a random factor is a factor whose levels are a random subset of all possible levels from a population of levels that could have been included in the study. For a more complete discussion of the distinction between fixed and random factors in biological applications, see Underwood (1981, 1997), Winer et al. (1991), and Sokal and Rohlf (1995).

For the multivariate extension, Eqs. 6–9 concerning calculations of mean squares are true for any term in any ANOVA model. As in the one-way case, the X matrix does not change in going from the univariate to the multivariate extension. The dummy-variable coding for interaction terms is described in Appendix C. In brief, dummy variables for interaction terms are simply the direct products of the variables coding for the main effects.
Two fixed factors
The expected mean squares and F ratios for the model when both factors are fixed are given in Table 3a. For a multivariate analog to the F ratio to test for the effect of an interaction term, H01, the RDA statistic is constructed as

F#_AB = MS#_AB / MS#_Res   (13)
where MS#_AB is determined from df_AB, which is the number of columns coding for the interaction in matrix X, and from SS#_AB, the sum of canonical eigenvalues of an RDA of Y on a subset of matrix X, say X_{A×B}, which includes only those dummy variables coding for the interaction term.

If the interaction term is found to be nonsignificant
and the main effects are to be investigated, their corresponding statistics, according to their expected mean squares (Table 3a), are calculated as follows. For a test of the null hypothesis H02 above, where there are two fixed factors, the RDA statistic is

F#_A = MS#_A / MS#_Res   (14)

and for the test of the null hypothesis H03, the RDA statistic is

F#_B = MS#_B / MS#_Res .   (15)
The mixed model: one fixed and one random factor
When the experimental design has one fixed factor (A) and one random factor (B), the calculated mean squares in ANOVA for the sources of variation in the
F-ratios for two-way crossed ANOVA designs. Table from Legendre & Anderson 1999 (Ecological Monographs)
Now I have an F and a P value. What does it mean?
• The idea of PERMANOVA is to detect possible differences in the location (means) and spread (dispersion) of the attributes of the compared groups
BUT REMEMBER!
• PERMANOVA is sensitive to the differences in the dispersion of points between groups
A practical example
• Using the mites and WrightWestobyOZ.csv data:

• Do a 2-way PERMANOVA

• What does the F test tell us?

• Should we include an interaction?
versus abundant species in the analysis, given by Clarke and Green (1988), is followed here. Note that the transformation is not done in an effort to make data conform to any assumptions of the analysis. In this example, the data contained some species that occurred on a very large relative scale of abundance (e.g. spirorbid worms occurred in the thousands), so the data were transformed by taking double-square roots before the analysis. To visualize the multivariate patterns among observations, non-metric multidimensional scaling (MDS) was performed on the Bray–Curtis distances (Kruskal & Wish 1978), using the PRIMER computer program. Non-parametric MANOVA was then done on Bray–Curtis distances, as described in the previous section, using the computer program NPMANOVA, written by the author in FORTRAN.
The number of possible permutations for the one-way test in the case of the grazing experiment is 9.6 × 10^25. A random subset of 4999 permutations was used (Fig. 5). In this case, the null hypothesis of no differences among groups was rejected, as the observed value was much larger than any of the values obtained under permutation (Fig. 5, Table 2).
A POSTERIORI TESTS
As in univariate ANOVA where there is a significant result in a comparison of 3 or more treatments, we may wish to ask for the multivariate case: wherein does the significant difference lie? This can be done by using the same test, given above for the one-way comparison of groups, but where individual pair-wise comparisons between particular groups are done. To continue with the logic of the analogous univariate situation, we can use a t-statistic (which is simply the square root of the value of the F-statistic described above) for these
Fig. 4. Two variables in each of two groups of observations where (a) the groups differ in correlation between variables, but not in location or dispersion and (b) the groups differ in dispersion, but not in location or correlation between variables.
Fig. 5. Distribution of the non-parametric MANOVA F-statistic for 4999 permutations of the data on assemblages in different grazing treatments. The real value of F for these data is very extreme by reference to this distribution (F = 36.62): thus there are strong differences among the assemblages in different grazing treatments.
Table 2. Non-parametric MANOVA on Bray–Curtis distances for assemblages of organisms colonizing intertidal surfaces in estuaries in three grazing treatments (grazers excluded, grazers inside cages, and surfaces open to grazers)
Source	d.f.	SS	MS	F	P
Grazers	2	18 657.65	9328.83	36.61	0.0002
Residual	57	14 520.89	254.75
Total	59	33 178.54
Comparison*	t	P
Open versus caged	8.071	0.0002
Open versus cage control	3.268	0.0002
Caged versus cage control	6.110	0.0002

*Pair-wise a posteriori tests among grazing treatments.
Groups differ in correlation between variables, but not in location or dispersion

Groups differ in dispersion, but not in location or correlation between variables
Figures from: Anderson 2001 (Austral Ecology); Anderson 2006 (Biometrics)
Can we determine the homogeneity of multivariate dispersions?
• Multivariate dispersions are measured on the basis of any distance or dissimilarity index, including those robust to skewed or zero-inflated data

• The test is a multivariate extension of Levene's test (P-values obtained by permutation)

• It is based on the rotational invariance of measures of spread from the multivariate centroid/spatial median in a Euclidean space
Anderson 2006 (Biometrics)
PERMDISP: Permutational analysis of multivariate dispersions
• The test proposed by Anderson (2006) consists of two steps:
1. Calculation of the distances from observations to their centroids
2. Comparison of the average of these distances among groups, using ANOVA
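The two steps can be sketched numerically. Below is an illustrative Python sketch with hypothetical toy data and Euclidean distances (the deck's own examples use R, where this is what vegan's betadisper followed by permutest compute); for simplicity, step 2 here forms the classical one-way F on the distances, whereas PERMDISP obtains its P-value by permutation:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical toy data: two groups with the same centroid but different spread.
A = rng.normal(0.0, 1.0, size=(20, 3))
B = rng.normal(0.0, 3.0, size=(20, 3))
X = np.vstack([A, B])
groups = np.repeat([0, 1], 20)

# Step 1: distance of each observation to its own group centroid.
z = np.empty(len(X))
for g in (0, 1):
    idx = groups == g
    z[idx] = np.linalg.norm(X[idx] - X[idx].mean(axis=0), axis=1)

# Step 2: one-way ANOVA on the distances (Levene-style comparison of spread).
k, N = 2, len(z)
ss_between = sum(idx.sum() * (z[idx].mean() - z.mean()) ** 2
                 for idx in (groups == 0, groups == 1))
ss_within = sum(((z[idx] - z[idx].mean()) ** 2).sum()
                for idx in (groups == 0, groups == 1))
F = (ss_between / (k - 1)) / (ss_within / (N - k))
```

In PERMDISP the P-value for this F is obtained by permuting observations among groups rather than from the F distribution.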
Distances from observations to their centroids
• The goal is to measure the "spread" around a central tendency measure (centroid or spatial median)
• REMEMBER that there are two types of distance measures
• Metrics (e.g. Euclidean)
• Semi-Metrics (e.g. Bray-Curtis)
Measuring dispersion for semi-metric distances

• The rotational invariance only applies to Euclidean spaces, so if the distance measure used is not metric it needs to be transformed
• PCoA analysis allows this (see Legendre & Legendre 1998 for complete details)
• WARNING: remember that semi-metric distance measures will produce negative eigenvalues
Principal Coordinates Analysis (PCoA)
• If we have a distance/dissimilarity matrix,

• a PCoA will produce the corresponding Euclidean coordinates for each replicate using the square root of the eigenvalues
DISTANCE MATRIX (Site 1 – Site 4 × Site 1 – Site 4)
PCoA, the how to

• The procedure is simple:

• Transform the distance matrix to A = (a_ij) = ( −(1/2) d_ij² )

• Double-centre A to calculate Gower's centred matrix G:

G = (I − (1/N) 11′) A (I − (1/N) 11′) = [g_{ll′}] = [a_{ll′} − ā_{l·} − ā_{·l′} + ā_{··}]

• Using a spectral decomposition of G, the eigenvalues (λ_l) are obtained:

G = Σ_{l=1}^{N} λ_l q_l q_l′

• Produce a series of eigenvectors υ_l^+ and υ_l^− representing the real and imaginary (those produced from negative eigenvalues) components
Anderson 2006 (Biometrics)
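The steps above can be sketched as follows; this is an illustrative Python version (not the deck's R workflow), checked on a metric (Euclidean) distance matrix, for which the PCoA coordinates must reproduce the original inter-point distances exactly:

```python
import numpy as np

rng = np.random.default_rng(7)
Y = rng.normal(size=(6, 3))                                   # 6 toy sites, 3 variables
D = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)    # Euclidean distance matrix

N = D.shape[0]
A = -0.5 * D ** 2                                  # step 1: a_ij = -(1/2) d_ij^2
J = np.eye(N) - np.ones((N, N)) / N                # centring matrix (I - (1/N) 11')
G = J @ A @ J                                      # step 2: Gower's centred matrix
lam, Q = np.linalg.eigh(G)                         # step 3: spectral decomposition
order = np.argsort(lam)[::-1]
lam, Q = lam[order], Q[:, order]

pos = lam > 1e-10                                  # a metric distance gives no negative eigenvalues
X = Q[:, pos] * np.sqrt(lam[pos])                  # coordinates: q_l * sqrt(lambda_l)
D_hat = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
```

For a semi-metric index such as Bray–Curtis, some λ_l are negative and the corresponding "imaginary" axes must be kept separately, as the following slides describe.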
PERMDISP: Measuring multivariate dispersion
• From this PCoA space and the G matrix it is possible to determine the deviation of each observation from a central tendency measure (centroid or spatial median)

• For this we use the generated eigenvectors and the equivalence between the distances to a central point and the inter-point distances
PERMDISP: Measuring multivariate dispersion
Distance to the centroid:

z_ij^c = sqrt( Δ²(υ_ij^+, c_i^+) − Δ²(υ_ij^−, c_i^−) )

Distance to the spatial median:

z_ij^m = sqrt( Δ²(υ_ij^+, m_i^+) − Δ²(υ_ij^−, m_i^−) )

where υ_l^+ = (λ_l)^(1/2) q_l are the coordinates on the real axes (λ_l > 0), υ_l^− = (−λ_l)^(1/2) q_l are the coordinates on the imaginary axes (λ_l < 0), Δ² is the squared Euclidean distance, c_i = centroid and m_i = spatial median of group i.

Anderson 2006 (Biometrics)
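An illustrative Python sketch of the corrected distances to the group centroid, z_ij^c (toy data; the centroid is taken directly from the PCoA coordinates rather than via inter-point distances, which is one simple way to do it). With a Euclidean input distance the imaginary part vanishes, so z must equal the ordinary distance to the group centroid, which is used here as a check:

```python
import numpy as np

def pcoa_axes(D):
    """Split PCoA coordinates into real (lambda > 0) and imaginary (lambda < 0) axes."""
    N = D.shape[0]
    J = np.eye(N) - np.ones((N, N)) / N
    lam, Q = np.linalg.eigh(J @ (-0.5 * D ** 2) @ J)
    Xp = Q[:, lam > 1e-10] * np.sqrt(lam[lam > 1e-10])      # real axes
    Xm = Q[:, lam < -1e-10] * np.sqrt(-lam[lam < -1e-10])   # imaginary axes
    return Xp, Xm

def dist_to_centroids(D, groups):
    """z_ij^c = sqrt(Delta+^2 - Delta-^2) for each observation and its group centroid."""
    Xp, Xm = pcoa_axes(D)
    z = np.empty(len(groups))
    for g in np.unique(groups):
        idx = groups == g
        dp2 = np.sum((Xp[idx] - Xp[idx].mean(axis=0)) ** 2, axis=1)
        dm2 = np.sum((Xm[idx] - Xm[idx].mean(axis=0)) ** 2, axis=1)  # zeros if no imaginary axes
        z[idx] = np.sqrt(np.maximum(dp2 - dm2, 0.0))
    return z

rng = np.random.default_rng(3)
Y = rng.normal(size=(10, 4))                                  # hypothetical toy data
groups = np.repeat([0, 1], 5)
D = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)
z = dist_to_centroids(D, groups)
```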
A practical example
• Using the mites data :
• Evaluate the within-group heterogeneity
• If you have time try to do it for a two-factor analysis (Shrub cover and Topography)
P values from permutations
• The p-value or observed significance level p is the chance of getting a test statistic as or more extreme than the observed one, under the null hypothesis Ho (no differences between factors)
P = [ No. of (Fπ ≥ F) + 1 ] / [ Total no. of Fπ + 1 ]
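This counting rule can be sketched with an illustrative one-way PERMANOVA in Python (hypothetical toy data and Euclidean distances; in R the same P-value would come from vegan's adonis):

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(42)
a, n = 2, 10                                # 2 groups, 10 replicates each
N = a * n
groups = np.repeat(np.arange(a), n)
Y = rng.normal(size=(N, 3))
Y[groups == 1] += 1.5                       # build in a real location difference

D2 = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)  # squared distances

def pseudo_F(labels):
    """One-way PERMANOVA pseudo-F from the squared inter-point distances."""
    pairs = list(combinations(range(N), 2))
    SS_T = sum(D2[i, j] for i, j in pairs) / N
    SS_W = sum(D2[i, j] for i, j in pairs if labels[i] == labels[j]) / n
    SS_A = SS_T - SS_W
    return (SS_A / (a - 1)) / (SS_W / (N - a))

F_obs = pseudo_F(groups)
n_perm = 999
n_ge = sum(pseudo_F(rng.permutation(groups)) >= F_obs for _ in range(n_perm))
p_value = (n_ge + 1) / (n_perm + 1)         # P = (No. of F_pi >= F + 1) / (total + 1)
```

The "+1" in numerator and denominator counts the observed ordering as one of the possible permutations, so P can never be exactly zero.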
Exact Vs. Permutations
• Exact P-values: All possible realizations of Groups - Factors are evaluated
But with a groups and n replicates per group, the number of distinct possible realizations in a one-way test is:

(an)! / ( a! (n!)^a )

• Permutation procedures: using a large sample of the domain of realizations, the distribution of Fπ is determined (1000 iterations give an α of 0.05)
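As a check of this count for the grazing example excerpted earlier (a = 3 treatments, n = 20 replicates each), which the paper reports as 9.6 × 10^25:

```python
from math import factorial

def n_distinct_relabellings(a, n):
    """Number of distinct allocations of a*n observations to a unlabelled groups of size n."""
    return factorial(a * n) // (factorial(a) * factorial(n) ** a)

count = n_distinct_relabellings(3, 20)      # the grazing experiment of Anderson (2001)
```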
Permutation alternatives

• The idea is that randomization of the observations across groups allows us to determine the distribution of Fπ under Ho
Permutation method	Description

Raw data	Good approximate test proposed for complex ANOVA designs. Type I error close to α; the method does not need large samples, although with larger sample sizes it tends to be more conservative

Reduced-model residuals	Gives the best power and the most accurate Type I error for complex designs, and it is the closest to a conceptual exact test

Full-model residuals	Described by ter Braak (1992); it aims to obtain residuals of the full model by subtracting from each replicate the mean corresponding to its particular cell
Monte Carlo P-values
• Applied in situations where there are not enough possible permutations to get a reasonable test
• Based on the asymptotic permutation distribution of the numerator (or denominator), given that

• χ² variables can be drawn randomly and independently, and combined with the eigenvalues to construct the asymptotic permutation distribution for the numerator/denominator:

tr(HGH) ~ Σ_l λ_l χ²_l(υ_l)
A practical example
• Using the mites and WrightWestoby data:
• Compare the homogeneity of dispersion of the evaluated groups

• Are the Nutrient/Rainfall dispersions significantly different?

• Is this the same if the analyses are done for Nutrient or Rainfall separately?
Scatterplot: Length (m) vs. Mass (kg), females (F) and males (M)
RAW DATA

Site	Sp. 1	Sp. 2	Sp. 3	Sp. 4	Group
1	1	4	7	0	A
2	2	6	10	1	A
3	4	2	0	6	B
4	5	1	2	9	B

DISTANCE MATRIX (Site 1 – Site 4 × Site 1 – Site 4)
Two-way crossed design:

Site	Treatment A	Treatment B
Site 1	A-1	B-1
Site 2	A-1	B-1
Site 3	A-1	B-2
Site 4	A-1	B-2
Site 5	A-2	B-1
Site 6	A-2	B-1
Site 7	A-2	B-2
Site 8	A-2	B-2
Multi-factor pseudo F-statistics

SS_T = (1/N) Σ_{i=1}^{N−1} Σ_{j=i+1}^{N} d_ij²

SS_W(A) = (1/(bn)) Σ_{i=1}^{N−1} Σ_{j=i+1}^{N} d_ij² ε_ij^(A)

SS_W(B) = (1/(an)) Σ_{i=1}^{N−1} Σ_{j=i+1}^{N} d_ij² ε_ij^(B)

SS_R = (1/n) Σ_{i=1}^{N−1} Σ_{j=i+1}^{N} d_ij² ε_ij^(AB)

where ε_ij^(A) = 1 if observations i and j are in the same level of factor A (0 otherwise), and similarly for factor B and for the cells of the A × B design. The among-level sums of squares then follow by difference:

SS_A = SS_T − SS_W(A),  SS_B = SS_T − SS_W(B),  SS_AB = SS_T − SS_A − SS_B − SS_R

F_(A) = [ SS_A / (a − 1) ] / [ SS_R / (N − ab) ]

F_(B) = [ SS_B / (b − 1) ] / [ SS_R / (N − ab) ]

F_(AB) = [ SS_AB / ((a − 1)(b − 1)) ] / [ SS_R / (N − ab) ]
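An illustrative Python sketch of this two-way partitioning (toy balanced design with hypothetical data and Euclidean distances; in R one would use vegan's adonis instead). For a metric distance, SS_T computed from the inter-point distances equals the usual sum of squared deviations from the grand centroid, which makes the sketch checkable:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
a, b, n = 2, 2, 5                                    # levels of A and B, replicates per cell
N = a * b * n
Y = rng.normal(size=(N, 3))                          # toy multivariate responses
A = np.repeat(np.arange(a), b * n)                   # factor A labels
B = np.tile(np.repeat(np.arange(b), n), a)           # factor B labels

D2 = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
pairs = list(combinations(range(N), 2))

def ss(same, divisor):
    """Sum of squared distances over pairs selected by `same`, divided by `divisor`."""
    return sum(D2[i, j] for i, j in pairs if same(i, j)) / divisor

SS_T  = ss(lambda i, j: True, N)
SS_WA = ss(lambda i, j: A[i] == A[j], b * n)
SS_WB = ss(lambda i, j: B[i] == B[j], a * n)
SS_R  = ss(lambda i, j: A[i] == A[j] and B[i] == B[j], n)
SS_A, SS_B = SS_T - SS_WA, SS_T - SS_WB
SS_AB = SS_T - SS_A - SS_B - SS_R

F_A  = (SS_A / (a - 1)) / (SS_R / (N - a * b))
F_B  = (SS_B / (b - 1)) / (SS_R / (N - a * b))
F_AB = (SS_AB / ((a - 1) * (b - 1))) / (SS_R / (N - a * b))
```

In a PERMANOVA each of these pseudo-F values would then be referred to its own permutation distribution to obtain a P-value.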