Nordic Society Oikos Red-Shifts and Red Herrings in Geographical Ecology Author(s): Jack J. Lennon Source: Ecography, Vol. 23, No. 1 (Feb., 2000), pp. 101-113 Published by: Blackwell Publishing on behalf of Nordic Society Oikos Stable URL: http://www.jstor.org/stable/3682872 . Accessed: 15/08/2011 09:12 Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at . http://www.jstor.org/page/info/about/policies/terms.jsp JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected]. Blackwell Publishing and Nordic Society Oikos are collaborating with JSTOR to digitize, preserve and extend access to Ecography. http://www.jstor.org


ECOGRAPHY 23: 101-113. Copenhagen 2000

Red-shifts and red herrings in geographical ecology

Jack J. Lennon

Lennon, J. J. 2000. Red-shifts and red herrings in geographical ecology. - Ecography 23: 101-113.

I draw attention to the need for ecologists to take spatial structure into account more seriously in hypothesis testing. If spatial autocorrelation is ignored, as it usually is, then analyses of ecological patterns in terms of environmental factors can produce very misleading results. This is demonstrated using synthetic but realistic spatial patterns with known spatial properties which are subjected to classical correlation and multiple regression analyses. Correlation between an autocorrelated response variable and each of a set of explanatory variables is strongly biased in favour of those explanatory variables that are highly autocorrelated - the expected magnitude of the correlation coefficient increases with autocorrelation even if the spatial patterns are completely independent. Similarly, multiple regression analysis finds highly autocorrelated explanatory variables "significant" much more frequently than it should. The chance of mistakenly identifying a "significant" slope across an autocorrelated pattern is very high if classical regression is used. Consequently, under these circumstances strongly autocorrelated environmental factors reported in the literature as associated with ecological patterns may not actually be significant. It is likely that these factors wrongly described as important constitute a red-shifted subset of the set of potential explanations, and that more spatially discontinuous factors (those with bluer spectra) are actually relatively more important than their present status suggests. There is much that ecologists can do to improve on this situation. I discuss various approaches to the problem of spatial autocorrelation from the literature and present a randomisation test for the association of two spatial patterns which has advantages over currently available methods.

J. J. Lennon, Centre for Biodiversity and Conservation, School of Biology, Univ. of Leeds, Leeds, Yorkshire, U.K. LS2 9TJ ([email protected]).

High-quality spatial data, often sampled across a spatially continuous matrix of quadrats, are increasingly available to geographical ecologists. Major new sources are improved automated data capture technology (e.g. satellite remote-sensing) and well-organised species surveys. While this is obviously to be welcomed, the uncritical application of simple statistical methods to these spatial data may in fact obscure more than it reveals. This is primarily a consequence of the presence of spatial autocorrelation in these geographical data.

Spatial autocorrelation (Cliff and Ord 1973, Ripley 1981, Legendre 1993) is the correlation between pairs of points separated by a (spatial) distance. The usual case is for this correlation to be positive and to decrease with increasing distance between the points; the greater the distance between measurements, the weaker the dependency between them.

Positive spatial autocorrelation is present, often strongly, in most spatial data in ecology. Although long known by statisticians as a source of problems for statistical inference, spatial autocorrelation has usually been mentioned, if at all, in passing by geographical ecologists: in general it has been ignored as a statistical inconvenience or simply not observed as a problem in the first place. Sometimes spatial autocorrelation is used as an inferential tool or is more of an object of study in itself (e.g. Sokal and Oden 1991, Koenig 1997). However, in general the impression given by many ecologists working with geographical data is effectively that of complacency, perhaps in the belief that simple, "blunt-instrument" methods will be adequate if the trends are strong enough (there is a small minority of exceptions that take spatial autocorrelation seriously: a few examples are Borcard et al. 1992, Leduc et al. 1992, Smith 1994, Borcard and Legendre 1994, Palmer and van der Maarel 1995, Thioulouse et al. 1995, and Buckland et al. 1996). Unfortunately, not only does ignoring or being unaware of this problem lead to serious difficulties in hypothesis testing, in the sense that significance levels are incorrect; much more importantly, it also results in a systematic bias towards particular kinds of explanation for ecological patterns. These biases may seriously distort our understanding of the processes involved in generating the observed ecological patterns.

Accepted 15 June 1999

Copyright © ECOGRAPHY 2000 ISSN 0906-7590 Printed in Ireland - all rights reserved

ECOGRAPHY 23:1 (2000) 101

Areas of particular concern are medium to large-scale non-experimental geographical studies such as diversity-environment gradient analyses, population density-environment relationships, and climate envelope analysis. This encompasses a considerable number of published studies (a few examples are: Currie and Paquin 1987, Turner et al. 1987, 1988, Adams and Woodward 1989, Nilsson et al. 1989, Lennon 1990, Currie 1991, Shipley 1991, Greenwood and Baillie 1991, Leathwick and Mitchell 1992, McCollin 1993, Blackburn 1996, Blackburn and Gaston 1996, Gaston and Blackburn 1996, Newton and Dale 1996, Austin et al. 1996, Böhning-Gaese 1997, Kurki et al. 1998, O'Brien et al. 1998, Peltonen et al. 1998).

Those analyses that involve several competing environmental spatial patterns each potentially explaining a given ecological spatial pattern are most at risk from spatial autocorrelation if "spaceless" statistical models are applied. Furthermore, when the quadrats form a continuous lattice rather than a scattered distribution the problem of spatial autocorrelation is exacerbated, i.e. somewhat counter-intuitively, the problem is likely to be worse when there are a greater number of data points. This situation is often compounded by associations between explanatory variables (e.g. productivity and temperature) which make it more difficult to discriminate between them.

In this paper, the consequences of ignoring spatial autocorrelation are illustrated using analysis of synthetic spatial patterns and results from statistical theory in the literature.

Methods

Some statistical preliminaries

Consider the ordinary (Pearson) correlation coefficient r, testing the null hypothesis that r = 0. The standard textbook (e.g. Sokal and Rohlf 1995) definition of the significance of r is

$$t = r\sqrt{\frac{n-2}{1-r^2}} \qquad (1)$$

where n is the number of pairs of data values and t is distributed as Student's t with n - 2 degrees of freedom.

So to be significant, r has to be larger than a particular value controlled principally by n. In an important paper, it has been convincingly argued by Clifford et al. (1989) (see also Dutilleul 1993) that in the presence of spatial autocorrelation the effective sample size M is < n. From eq. 1, this means that when n is replaced by M, r has to be larger to be significant. How much larger depends on how much autocorrelation there is in the data. Another way of looking at this is to observe that t is proportional to the square root of the degrees of freedom, so overestimating the degrees of freedom causes correlates to appear more significant than they are (Fig. 1). Clifford et al. (1989) show that M can be estimated from the autocorrelation functions of the two spatial variables (their method does not assume a particular kind of autocorrelation structure). See Haining (1990) for further discussion of this approach. A similarly corrected association test for pairs of binary variables (particularly relevant to species spatial distributions consisting of presence or absence) has also recently become available (Cerioli 1997).
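The dependence of t on the sample size in eq. 1 can be illustrated with a short calculation. The following is a minimal sketch, not part of the original analysis: the function name `t_statistic` and the values of r, n and M are hypothetical, chosen only to show that the same r yields a much smaller t when the nominal n is replaced by a smaller effective sample size M.

```python
import math


def t_statistic(r, n_pairs):
    """Eq. 1: Student's t for a Pearson r computed from n_pairs observations."""
    if n_pairs <= 2 or abs(r) >= 1:
        raise ValueError("need n_pairs > 2 and |r| < 1")
    return r * math.sqrt((n_pairs - 2) / (1 - r * r))


r = 0.3
nominal_n = 256    # every quadrat of a 16 x 16 lattice counted as independent
effective_m = 20   # hypothetical effective sample size, M < n (an assumption)

t_nominal = t_statistic(r, nominal_n)
t_effective = t_statistic(r, effective_m)
print(f"t with n = {nominal_n}: {t_nominal:.2f}")
print(f"t with M = {effective_m}: {t_effective:.2f}")
```

The sketch only evaluates eq. 1; estimating M itself from the autocorrelation functions, as Clifford et al. (1989) do, is not attempted here.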

[Fig. 1 plot: horizontal axis, pairs of observations (10 to 50); vertical axis ticks from 0 to -16 (axis label not recoverable); one curve for each correlation coefficient 0.1, 0.3, 0.5, 0.7 and 0.9.]

Fig. 1. The sensitivity of the significance of the Pearson correlation coefficient to the degrees of freedom DF. The change in significance is most dramatic for larger coefficients, but smaller coefficients are closer to the standard significance boundaries, e.g. $p < 10^{-2}$, to begin with. If the effective DF is much lower than the nominal DF, the actual significance of a given r is much less.


It may seem that this argument is a statistical abstraction that allows a refinement of significance levels but does not have a great deal of relevance for qualitative results from real data analysis. In other words, the broad conclusions from an analysis of, for example, diversity patterns and environmental factors would remain unchanged. However, this is not the case, as is now illustrated by showing how often independent spatial patterns are found to match by chance alone when spatial autocorrelation is unaccounted for.

Chance associations between independent spatial patterns

Two statistical approaches common in the literature were considered: pairwise correlation and multiple regression.

Synthetic patterns were constructed with known spatial autocorrelation properties. Construction of many independent synthetic patterns sharing the same spatial structure allows the correct null model to be generated and a randomisation test for true significance to be made (Manly 1997).

For the case of correlation where the values (e.g. species diversities, population sizes, climatic factors) of each of the two variables are drawn from a normal distribution, this randomisation test is not absolutely necessary since Clifford et al.'s (1989) test for the true significance of r (using the effective degrees of freedom correction) can be applied. However, when the two variables are not both normally distributed, randomisation is necessary. For example, if the objective is to test if there is a significant spatial gradient across a spatial pattern of diversity (e.g. testing for a significant latitudinal gradient), then the diversity pattern might satisfy the normal distribution requirement (i.e. the diversity values are normally distributed), but the other "variable" is a plane surface, since we are testing the null hypothesis that the slope of our pattern is zero. Similarly, this randomisation approach can be used to compare a spatially autocorrelated pattern with a more complex non-linear surface, e.g. that describing spatial isolation between a set of fixed points, proximity to a coastline or river etc.

For the pairwise correlation tests, the spatial autocorrelation structure of the synthetic surfaces was first defined, the surfaces were generated and then an r statistic was calculated from the pairs of surfaces. The empirical distribution of many independent r values so calculated forms the null model, and the differences between the null models for different levels of spatial autocorrelation can be examined. In a similar fashion, for regression, one pattern is generated to represent the response variable and a set of patterns with different strengths of autocorrelation are the explanatory variables. Two additional explanatory variables are included to represent spatial direction as the coordinates of the surface values; this is often seen in analyses in the literature as a means of detecting trends along latitude and longitude axes.

Spatial autocorrelation structure

There are many possible ways of describing positive autocorrelation decreasing with distance. A simple (but perhaps particularly relevant) form of spatial structure for natural patterns was chosen: fractal spatial structure. Such patterns are statistically self-similar at all scales and are independent of quadrat size or resolution. Fractal structures are ubiquitous in nature (Schroeder 1991). Specifying other models discussed in the literature for spatial structure such as those arising from autoregressive or moving-average processes (e.g. Haining 1990) will not qualitatively affect the results below.

Five levels of autocorrelation were used, ranging from zero autocorrelation to very strong autocorrelation. These strengths of autocorrelation correspond to five types of stochastic "noise", and are more conveniently described in terms of the colour of their spectra: white, white-pink, pink, pink-brown and brown, in increasing order of autocorrelation (note that, confusingly, pink noise is sometimes called brown noise in the literature, since it corresponds to Brownian motion). These spectra have a simple relationship between spectral density S(f) and frequency f (Voss 1988) of the form:

$$S(f) \propto f^{\beta} \qquad (2)$$

The values of $\beta$ corresponding to the spectral colours white, white-pink, pink, pink-brown and brown are 0, -1/2, -1, -3/2, and -2 respectively. A white noise "surface" consists of completely spatially independent amplitudes. Brown noise corresponds to a deterministic Euclidean surface, which is, strictly speaking, not fractal (Russ 1994). Pink noise (also known as 1/f noise) is particularly interesting since it appears in a huge range of natural phenomena yet the reason for this remains unexplained (Schroeder 1991). The other two cases are intermediate between these three.

Synthetic pattern generation

There are many ways in which realisations of eq. 2 can be generated. A computationally efficient method is to use the useful properties of the inverse discrete Fourier transform (IDFT). The fast transform algorithm (Pitas 1993) allows rapid generation of isotropic (no directional trends), stationary (statistically identical in all locations) surfaces: surfaces with a fixed autocorrelation structure, and a Gaussian distribution of surface values. For surface values which are real numbers, there are an infinite number of surfaces sharing the same autocorrelation structure. The IDFT in two dimensions can be written as:

$$Z_{x,y} = \sum_{u=0}^{n-1}\sum_{v=0}^{n-1} S(f_{u,v})^{1/2} \sin(2\pi f_{u,v} - \phi_{u,v}) \qquad (3)$$

for a surface of size n by n elements, where $Z_{x,y}$ is the surface value at spatial coordinates x, y, $S(f_{u,v})$ is the spectral density of frequency $f_{u,v}$, and $f_{u,v} = ux/n + vy/n$. The subscripts u and v are the frequency components in the x and y directions. They also indicate the angular direction of the two-dimensional sine wave (imagine it as a corrugated sheet) as $\tan^{-1}(u/v)$, with frequency in this direction of $(u^2 + v^2)^{1/2}$. See Russ (1994) for a graphical introduction to the IDFT or Goodman (1968) for a more detailed mathematical discussion of these technical details.

The key to generating a different surface each time (since the spectral densities are wholly determined by the autocorrelation structure) lies in randomising the wave phases, $\phi_{u,v}$ (the displacement of each sine wave from an arbitrary common origin). The phases of the sine waves of each frequency are chosen independently from a uniform distribution of angles in the interval $0 \ldots 2\pi$. The IDFT formula is then applied. This generates a surface with the required statistical properties. Note that no trace of the sinusoidal components is left in the synthetic surfaces.

A 1024 element surface (32 x 32 measurements) was generated out of which only one quadrant of the lattice (16 x 16) was used. This circumvents the undesirable property (for these purposes) of the inherent periodicity of the IDFT generated synthetic surfaces: the left and right edges, and the top and bottom edges "wrap round" on the surfaces (the surface forms a torus). If left uncorrected, this would result in the surfaces showing more correlation between points at opposite edges than between a point at an edge and another a short distance away towards the centre of the surface.

Note that the technical aspects of surface generation are unimportant - there are many equivalent ways to do this and the IDFT is merely a convenient method.
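The generation procedure described above can be sketched directly from eq. 3. The code below is an illustrative sketch rather than the implementation used for the analyses: it sums the sine waves explicitly instead of using a fast transform, and the function name `synthetic_surface`, the dropping of the zero-frequency term, and the folding of frequency indices above n/2 are all assumptions made here to obtain a usable isotropic surface.

```python
import math
import random


def synthetic_surface(n, beta, rng):
    """Sum sine waves with power-law amplitudes and random phases (cf. eq. 3)."""
    waves = []
    for u in range(n):
        for v in range(n):
            # Assumption: fold indices above n/2 so the radial frequency is isotropic.
            radial = math.hypot(min(u, n - u), min(v, n - v))
            if radial == 0:
                continue  # Assumption: drop the zero-frequency (mean) term.
            amplitude = radial ** (beta / 2)      # S(f)^(1/2) with S(f) = f^beta
            phase = rng.uniform(0, 2 * math.pi)   # independent uniform random phase
            waves.append((u, v, amplitude, phase))
    surface = []
    for x in range(n):
        row = []
        for y in range(n):
            row.append(sum(a * math.sin(2 * math.pi * (u * x + v * y) / n - p)
                           for u, v, a, p in waves))
        surface.append(row)
    return surface


rng = random.Random(1)                       # fixed seed so the sketch is repeatable
full = synthetic_surface(32, -1.0, rng)      # pink noise on a 32 x 32 lattice
quadrant = [row[:16] for row in full[:16]]   # keep one 16 x 16 quadrant
```

The explicit double sum is slow but keeps the correspondence with eq. 3 visible; a fast Fourier transform routine would normally replace it for the large number of replicates used here.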

Pairwise correlation

Pairs of surfaces with these known spatial autocorrelation properties were correlated using the Pearson product-moment correlation coefficient r. This was repeated for $10^5$ pairs of surfaces. The value of r was recorded for each replicate. All 15 combinations of autocorrelation strengths were considered.
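The construction of an empirical null distribution of r can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names `pearson_r` and `null_distribution` are hypothetical, only white-noise surfaces are generated (as flat lists of 256 values), and far fewer replicates are used than in the analyses reported here.

```python
import math
import random


def pearson_r(a, b):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def null_distribution(make_a, make_b, n_replicates):
    """Correlate independently generated pairs of surfaces, one r per replicate."""
    return [pearson_r(make_a(), make_b()) for _ in range(n_replicates)]


rng = random.Random(42)


def white_noise():
    # Stand-in generator: 16 x 16 independent Gaussian values, flattened.
    return [rng.gauss(0.0, 1.0) for _ in range(256)]


null_r = null_distribution(white_noise, white_noise, 1000)
print(min(null_r), max(null_r))
```

Passing generators for autocorrelated surfaces in place of `white_noise` would give the null distributions for the other combinations of autocorrelation strength.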

Multiple regression

One surface was defined as the response variable and a set of five surfaces, one for each of the five strengths of autocorrelation, were defined as the explanatory variables. Two additional explanatory variables represented the x and y coordinates of each surface value. The model is therefore:

$$Z(x,y) = b_0 + b_1 Z_1 + b_2 Z_2 + b_3 Z_3 + b_4 Z_4 + b_5 Z_5 + b_x x + b_y y + \varepsilon \qquad (4)$$

where Z(x, y) is the response variable surface (ecological pattern), x and y are the spatial coordinates and $Z_1$ to $Z_5$ are the explanatory variable surfaces (environmental factors). The classical model assumes the errors ($\varepsilon$) are uncorrelated with each other between surface values (this is the major mistake in model specification).

The model was fitted, and the Student's t values for the b-coefficients were compiled. Again, $10^5$ replicates were taken.

This whole procedure was repeated five times, changing the response variable each time to the next strength of the five strengths of autocorrelation. This allowed consideration of the ideal case where autocorrelation is absent from the response variable (white noise), to the worst case (brown noise) where spatial autocorrelation is very strong in the response variable.
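A single replicate of the classical regression in eq. 4 can be sketched with ordinary least squares. The code below is an illustration only: the function names `invert` and `ols_t_values` are hypothetical, the normal equations are solved with a plain Gauss-Jordan inversion, and all seven explanatory columns other than the coordinates are white-noise stand-ins rather than surfaces of differing autocorrelation.

```python
import math
import random


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    size = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(size)]
           for i, row in enumerate(matrix)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(size):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[size:] for row in aug]


def ols_t_values(columns, response):
    """Fit response = b0 + sum(b_j * column_j); return (coefficients, t values)."""
    n = len(response)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]  # intercept first
    p = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(p)] for a in range(p)]
    xty = [sum(r[a] * y for r, y in zip(rows, response)) for a in range(p)]
    inv = invert(xtx)
    coefs = [sum(inv[a][b] * xty[b] for b in range(p)) for a in range(p)]
    residuals = [y - sum(c * v for c, v in zip(coefs, r)) for r, y in zip(rows, response)]
    sigma2 = sum(e * e for e in residuals) / (n - p)   # classical error variance
    t_values = [coefs[a] / math.sqrt(sigma2 * inv[a][a]) for a in range(p)]
    return coefs, t_values


rng = random.Random(0)
side = 16
coords = [(x, y) for x in range(side) for y in range(side)]
z = [[rng.gauss(0, 1) for _ in coords] for _ in range(5)]   # stand-ins for Z1..Z5
response = [rng.gauss(0, 1) for _ in coords]                # stand-in response surface
columns = z + [[float(x) for x, _ in coords], [float(y) for _, y in coords]]
coefs, t_values = ols_t_values(columns, response)
```

The t values are computed under the classical assumption of uncorrelated errors, which is exactly the assumption the text identifies as the mistake in model specification.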

Quantifying the pattern-matching bias

The shape of the frequency distribution of r for pairwise correlation and t for the b-coefficients in the multiple regressions indicates the strength of any tendency of patterns to match more than that expected by chance (i.e. more than that expected by our choice of $\alpha$). Patterns that match more often than expected will show a relatively flattened frequency distribution, with more of this frequency distribution in the tails beyond the critical values of t or r as defined under assumptions of zero autocorrelation.

This inflation of the number of significant pattern matches can be quantified by taking the ratio of the number of times a "significant" t or r value was found to that expected, i.e. the type I inflation ratio defined as:

$$I(\alpha) = \frac{N_o}{N_e} \qquad (5)$$

where $\alpha$ is the nominal type I significance level, $N_o$ is the observed number of "significant" t or r values and $N_e$ is the number of expected significant values under the correct null hypothesis (i.e. $N_e = \alpha N_{total}$).


Fig. 2. Typical synthetic surfaces generated by the IDFT. The surfaces shown have 32 x 32 elements but the analyses used a 16 x 16 element area to reduce boundary effects. The spectral density-frequency exponents of these surfaces are $\beta = 0$ white noise (D = 3), $\beta = -0.5$ white-pink noise (D = 2.75), $\beta = -1$ pink noise (D = 2.5), $\beta = -1.5$ pink-brown noise (D = 2.25) and $\beta = -2$ brown noise (D = 2), where D is the fractal dimension, related to $\beta$ such that $D = 3 + \beta/2$ (Russ 1994).

This index shows how many times more likely it is to find a significant relationship between a response pattern and an explanatory pattern purely by chance compared to that expected.
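The inflation ratio of eq. 5 is a simple count. The sketch below assumes a two-tailed test with a single critical value; the function name `inflation_ratio` and the ten r values are hypothetical and serve only to show the arithmetic.

```python
def inflation_ratio(statistics, critical_value, alpha):
    """Eq. 5: observed count of |statistic| > critical_value over alpha * N_total."""
    n_observed = sum(1 for s in statistics if abs(s) > critical_value)
    n_expected = alpha * len(statistics)
    return n_observed / n_expected


# Hypothetical r values from ten replicate pairs, for illustration only.
null_r = [0.02, -0.31, 0.18, 0.45, -0.05, 0.22, -0.52, 0.09, 0.33, -0.12]
ratio = inflation_ratio(null_r, critical_value=0.3, alpha=0.1)
print(ratio)
```

A ratio near one indicates that the nominal significance level is being honoured; larger values indicate inflation of type I errors.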

Results

Example surfaces

One surface for each level of spatial autocorrelation is shown in Fig. 2. Loosely speaking, the appearance of the surfaces ranges from smooth (strong autocorrelation) to rough (weak autocorrelation) and the grain of the pattern appears to change from coarse to fine.

Correlation

The frequency distribution of observed r for each combination of spatial autocorrelation is shown in Fig. 3. It can be seen that if there is zero autocorrelation in one of the pair of variables (i.e. one is a white noise process), the degree of autocorrelation in the other variable has no effect on the frequency distribution of r, and the classical correlation significance test is correct. However, when both variables possess some autocorrelation, the effect of increasing autocorrelation in one of the variables is to inflate the magnitude of the correlation coefficient, i.e. the expected magnitude of the correlation coefficient grows with increasing autocorrelation even if the spatial patterns are, in reality, independent of each other. This general trend is in agreement with theoretical expectations (Clifford et al. 1989). The significance inflation index (Fig. 4) shows that for non-zero autocorrelation in one variable there is a rapid increase in the number of observed "significant" correlations over that expected with increasing autocorrelation in the other variable. For example, if we correlate two pink noise surfaces, instead of finding one pair of surfaces in a hundred associated by chance alone, we find closer to twenty times this number, i.e. a reported significance level of p < 0.01 is actually more like p < 0.20.


Regression

The empirical frequency distribution of Student's t for each response variable (level of autocorrelation) regressed against each explanatory variable (representing either a level of spatial autocorrelation or a spatial direction) is shown in Fig. 5. If the response variable is not autocorrelated, then there is no bias: all explanatory variables are chosen as significant equally and correctly. However, when the response variable possesses some autocorrelation, the choice of which explanatory variables are significant is strongly influenced by how much autocorrelation they have: the more autocorrelated explanatory variables are chosen much more frequently as significant. The inflated "significance" of the spatial direction variables x and y is particularly prominent; the finding of a directional trend across a spatially autocorrelated ecological pattern using regression is therefore to be expected.

The significance inflation index for each explanatory variable for the five strengths of autocorrelation in the response variable is shown in Fig. 6. For a white noise response variable, the index is close to unity for all explanatory variables, so there is no bias towards any particular type of explanatory variable when the response variable conforms to the classical assumptions of multiple regression, even if the explanatory variables are autocorrelated. However, when the response variable is autocorrelated, there is a very strong tendency for more autocorrelated explanatory variables to be found significant much more often than expected by chance. Even for weak to moderate autocorrelation in the response variable (e.g. pink noise), the more autocorrelated explanatory variables are found to be significant many times more often than expected by chance. None of the explanatory variables are found to be significant less often than expected, but the more weakly autocorrelated explanatory variables are effectively discriminated against.

The frequency distribution of model $r^2$ values is given in Fig. 7: there is a very strong effect of response variable spatial autocorrelation on the proportion of variance explained by the model. With increasing autocorrelation in the response variable, the model fits increasingly well, despite the fact that the synthetic patterns are generated independently of each other.

Fig. 3. The distribution of correlation coefficients generated by correlating pairs of synthetic surfaces. Each panel is for a different combination of spatial autocorrelation. Only the upper triangle of the symmetric matrix of combinations is shown. Along the diagonal both variables have the same autocorrelation strength. Autocorrelation strength increases from white noise (left and top), to brown noise (right and bottom). The horizontal axis represents the pairwise correlation coefficient r and ranges between -1 and +1. The vertical lines within each panel are the critical values of r outside which an observed r is significant at $\alpha = 0.01$ if the observations are truly independent.

[Fig. 4 plot: horizontal axis, explanatory variable (white, white-pink, pink, pink-brown, brown); vertical axis ticks from 0 to 60; one curve for each response variable (brown, pink-brown, pink, white-pink, white).]

Fig. 4. Significance inflation index for $\alpha = 0.01$ (eq. 5) for the pairwise correlations. This is the ratio of the proportion of the empirical frequency distribution (Fig. 3) lying beyond the critical value of r, divided by $\alpha$. This index describes how many times more likely it is to find a "significant" relationship between the variables when in fact there is none. Each curve is for each of the strengths of spatial dependency in one of the surfaces labelled "response" for convenience. The horizontal axis represents increasing spatial dependency in the other "explanatory" surface.

Discussion

As mentioned in the Introduction, it has been known for a long time (e.g. Student 1914) that lack of independence where independence is assumed can result in rejecting the null hypothesis when it is actually true much more often than expected, i.e. cause inflation of Type I errors. However, it may come as something of a surprise to geographical ecologists to discover what an impact this has: the effect of spatial autocorrelation on the apparent "significance" of associations between independently generated random spatial patterns is striking. For real-world patterns, this means that "significance" is determined to a great extent by the spatial structure of the data. This effect may well overwhelm real, causal associations between response and explanatory variables. An unpleasant consequence of this is that past attempts to construct an importance hierarchy of explanatory factors influencing or explaining an ecological pattern may have only resulted in a ranking of these factors in order of their spatial autocorrelation strength.

Although only classical correlation and regression have been considered, it is likely that these comments apply to other statistical techniques which erroneously assume spatial independence, such as logistic regression and general linear models. The effective sample size problem is a general one.

How to deal with spatial autocorrelation in ecological patterns

Although providing a framework for dealing with spa- tial autocorrelation is not the primary focus of this paper, there are two relatively straightforward ways that ecologists can make an immediate improvement over classical correlation. One way is to apply the spatially explicit correlation test of Clifford et al. (1989). This is easy to do - only the spatial autocorre- lation function of each pattern need be calculated (a simple programming task) and a few simple calcula- tions applied thereafter. Another more flexible ap- proach is based on the surface randomisation methodology described in this paper. This is technically slightly more difficult but software to perform the fourier transform is widely available. An outline of this approach is as follows. To apply this randomisation

method to a pair of real ecological patterns, the auto- correlation structure of one of the pair of real patterns (e.g. a species diversity surface) is first measured using standard procedures. The autocorrelation matrix repre- sentation of the spatial pattern is transformed to the power spectrum representation using the fourier trans- form. Once we have the information in the form of the power spectrum, we can use the procedure outlined in the Methods above to generate many synthetic surfaces with the same spatial structure as the original observed diversity pattern. We then take each of the synthetic diversity surfaces and calculate a matching statistic (perhaps r, but the precise form of the matching statis- tic is relatively unimportant) with the second ecological pattern (e.g. a climate surface). This gives a large number of matching statistic scores for the large num- ber of comparisons between the observed climate sur- face and the synthetic diversity surfaces. The frequency distribution of these matching statistic scores forms the distribution of the null model. To see if our observed diversity surface is an unusually good match with our climate surface, we calculate the same matching statistic between the observed diversity surface and the climate

500

0 5??I:n 'i [

i ir

-25 0 25

LL LkL w"J

1< w

; ~ ; I I ii i: :• i ; i

? ....•ii . . .... : . . I.. .. • ... ,

Fig. 5. The distribution of Student's t for the partial regression coefficients from regression modelling of synthetic surfaces. Each row of panels is for a different level of spatial autocorrelation in the response variable, ranging from white noise (top row) to brown noise (bottom row). Each column represents one of the explanatory variables, in the order shown in eq. 4: white, white-pink, pink, pink-brown, brown, and the spatial directions along the x and y axes of the patterns, left to right respectively (intercept bo not shown). The vertical lines within each panel are the critical values of t outside of which an observed t is significant at a = 0.01 if the observations are truly independent.

108 ECOGRAPHY 23:1 (2000)

Page 10: Nordic Society Oikos - ut

Fig. 6. Significance inflation index for α = 0.01 (eq. 5) for the multiple regression model. This is the proportion of the empirical frequency distribution (Fig. 5) lying beyond the critical values of t, divided by α. This index describes how many times more likely it is to find a "significant" relationship between the response variable and an explanatory variable when in fact there is none. [Plot: vertical axis scaled 0-100; horizontal axis: explanatory variable (white, w_pink, pink, p_brown, brown, x_slope, y_slope); one line per response variable (white, w_pink, pink, p_brown, brown).]

surface. The position of this matching statistic in the frequency distribution tells us whether or not the match is likely to have occurred by chance, and so whether the diversity and climate surfaces are significantly associated.
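The randomisation test described above can be sketched in a few lines of Python with NumPy. This is an illustrative sketch rather than the code used in this paper: it assumes that the synthetic surfaces are produced by keeping the amplitude spectrum of the observed surface and replacing its phases with those of a white noise field, and the function names, the toy "diversity" and "climate" surfaces, the 199 replicates and the two-sided p-value are all hypothetical choices made for the example.

```python
import numpy as np

def phase_randomised_replicate(surface, rng):
    """Synthetic surface with the same power spectrum as `surface` (assumed method)."""
    mean = surface.mean()
    amplitude = np.abs(np.fft.fft2(surface - mean))
    # Phases of the FFT of a real white noise field are conjugate-symmetric,
    # so the inverse transform is real apart from rounding error.
    phase = np.angle(np.fft.fft2(rng.standard_normal(surface.shape)))
    return np.fft.ifft2(amplitude * np.exp(1j * phase)).real + mean

def pearson_r(a, b):
    """Matching statistic: Pearson correlation between two surfaces."""
    return np.corrcoef(a.ravel(), b.ravel())[0, 1]

def randomisation_test(diversity, climate, n_replicates=199, seed=0):
    """Null distribution of r from replicates of `diversity`; two-sided p-value."""
    rng = np.random.default_rng(seed)
    r_obs = pearson_r(diversity, climate)
    r_null = np.array([
        pearson_r(phase_randomised_replicate(diversity, rng), climate)
        for _ in range(n_replicates)
    ])
    p = (1 + np.sum(np.abs(r_null) >= abs(r_obs))) / (n_replicates + 1)
    return r_obs, r_null, p

# Toy surfaces (hypothetical): two independent, strongly autocorrelated fields.
demo_rng = np.random.default_rng(1)
diversity = demo_rng.standard_normal((32, 32)).cumsum(axis=0).cumsum(axis=1)
climate = demo_rng.standard_normal((32, 32)).cumsum(axis=0).cumsum(axis=1)

r_obs, r_null, p_value = randomisation_test(diversity, climate)
print(f"observed r = {r_obs:.3f}, randomisation p = {p_value:.3f}")
```

Only the diversity surface is randomised here; the climate surface is used unaltered, so in this sketch it is the diversity surface that should have approximately normally distributed values.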

This procedure has a strong advantage over the method of Clifford et al. (1989) in that only one of the pair of surfaces is constrained to have surface values which are normally distributed (as opposed to both in their method), i.e. if the surface is species diversity, then the frequency distribution of the diversity values need not be normal so long as the other surface of the pair satisfies this requirement. This is potentially very useful: for example, if one surface describes spatial distance to a coast or other sharp ecotone (such distances are unlikely to be normally distributed), we can still test for association between this distance and a diversity, abundance or other spatial pattern (as long as this second pattern is normally distributed in the sense outlined above). This will work even if one of the patterns is a binary pattern (perhaps presence/absence of a species), which is very useful. However, a problem with this procedure is that it cannot be used when neither of the patterns has normally distributed values. This is because this randomisation method cannot conserve both the autocorrelation structure and the frequency distribution of the surface values for some kinds of data. Consequently, attempting to produce synthetic replicates of, for example, a binary pattern using the above procedure does not work well. When applied to a binary pattern as input, instead of producing a binary pattern as output, a continuous surface is generated (Fig. 8), albeit with the same autocorrelation structure. This is unfortunate, because a straightforward way to produce statistical replicates of binary patterns would be useful in spatial ecology, since we very often


encounter presence/absence patterns, e.g. species distribution "dot maps". However, a major part of the spatial autocorrelation randomisation problem for binary patterns is ultimately definitional: what property of the spatial structure of a pattern must we conserve in the replicates? This has no simple answer. For example, as outlined above, we can choose to conserve the autocorrelation structure of a pattern at the expense of destroying the frequency distribution of the surface values. The approach of Roxburgh and Chesson (1998), which generates pseudo-replicates for binary data by using a randomisation method based on join-count statistics (see Cliff and Ord 1973), is a useful but incomplete solution to the problem for binary data because their method concentrates on conserving a particular set of pattern features, although this approach does look promising and is deserving of development through a more formal theoretical treatment. Recall that when both patterns are binary (perhaps two species distribution maps) the spatially explicit test of Cerioli (1997) can be applied.
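The loss of the binary frequency distribution can be illustrated with a short Python/NumPy sketch. It assumes, as an illustration only, that replicates are made by randomising Fourier phases while keeping the amplitude spectrum; the presence/absence map is a hypothetical one made by thresholding smoothed noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical presence/absence map: threshold a locally averaged noise field.
noise = rng.standard_normal((32, 32))
smooth = sum(np.roll(np.roll(noise, i, axis=0), j, axis=1)
             for i in range(4) for j in range(4))
binary = (smooth > 0).astype(float)

# Replicate with the same amplitude spectrum but random phases (assumed method).
centred = binary - binary.mean()
amplitude = np.abs(np.fft.fft2(centred))
phase = np.angle(np.fft.fft2(rng.standard_normal(binary.shape)))
replicate = np.fft.ifft2(amplitude * np.exp(1j * phase)).real + binary.mean()

# The input takes two values; the replicate takes many distinct values.
n_values_binary = np.unique(binary).size
n_values_replicate = np.unique(np.round(replicate, 6)).size
print(n_values_binary, n_values_replicate)
```

The amplitude spectrum, and hence the autocorrelation structure, is conserved, but the replicate is a continuous surface rather than a presence/absence map.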

The other main route towards satisfactory treatment of spatial autocorrelation is that of parametric modelling which explicitly takes it into account (Ripley 1988, Haining 1990 and references therein). This latter route can deal with multiple patterns simultaneously and models the spatial structure of data by fitting parameters to functions describing both the spatial structure of the "noise" component of ecological patterns and the systematic or deterministic components of the surfaces. This is deserving of much more attention from geographical ecologists than it currently receives, since assumptions about the nature of spatial structure and the effects of explanatory variables on a given

Fig. 7. The percentage of variance accounted for by the multiple regression models as a function of the spatial autocorrelation of the response variable. The line joins the median values; the upper and lower limits of each box delimit the 75th and 25th percentiles. [Box plot: vertical axis scaled 0-100; horizontal axis: response variable (white, w_pink, pink, p_brown, brown).]


Fig. 8. The randomisation procedure applied to a binary pattern (e.g. species presence/absence) does not work well. The randomisation replicates (right) of the binary pattern (left) are not binary, even though the autocorrelation structure is identical in these two patterns. The binary pattern obviously has a binary frequency distribution of surface values, while the synthetic pattern has a normal distribution of surface values. Note that this is not a problem if we want to conduct a randomisation test where one pattern is binary (or has some other non-normal frequency distribution) and the other is approximately normal. In this case, the approximately normal pattern can be randomised and the match of the resultant replicated patterns with the (unaltered) binary pattern used to construct the null model frequency distribution.

response pattern can be tested in parallel. For example, multiple regression can be used where spatial autocorrelation is present if the regression model is modified to take this into account (the usual equation Y = Xβ + ε, where ε is a vector of uncorrelated errors, is replaced with Y = Xβ + u, where the errors u have a non-diagonal covariance matrix V). Fitting spatially dependent regression models requires more effort than the classical methods (Mardia and Marshall 1984).
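To make the modified regression model concrete, here is a minimal generalised least squares sketch in Python with NumPy. It assumes that the error covariance matrix V is known up to a scale factor, which is rarely true in practice, where its parameters must be estimated along with the regression coefficients (e.g. by maximum likelihood). The exponential covariance function, its range of 10 lattice units, the transect layout and the name `gls_fit` are hypothetical choices for illustration.

```python
import numpy as np

def gls_fit(X, y, V):
    """Generalised least squares for y = X beta + u with cov(u) proportional to V."""
    L = np.linalg.cholesky(V)        # V = L L^T
    Xw = np.linalg.solve(L, X)       # "whitened" design matrix
    yw = np.linalg.solve(L, y)       # "whitened" response
    beta, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
    resid = yw - Xw @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(Xw.T @ Xw)))
    return beta, se

# Toy data: 100 sites along a transect with exponentially decaying error correlation.
n = 100
coords = np.arange(n, dtype=float)
dist = np.abs(coords[:, None] - coords[None, :])
V = np.exp(-dist / 10.0)

rng = np.random.default_rng(0)
x = rng.standard_normal(n)
X = np.column_stack([np.ones(n), x])
u = np.linalg.cholesky(V) @ rng.standard_normal(n)   # spatially correlated errors
y = 1.0 + 0.5 * x + u

beta_gls, se_gls = gls_fit(X, y, V)            # accounts for spatial dependence
beta_ols, se_ols = gls_fit(X, y, np.eye(n))    # classical model: V = identity
print("GLS:", beta_gls, se_gls)
print("OLS:", beta_ols, se_ols)
```

Setting V to the identity matrix recovers ordinary least squares, which makes the classical model a special case of the spatial one.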

A contrasting (and somewhat defeatist) approach to spatial dependency is to thin out the response data in the spatial domain to leave larger spatial distances between the measurements, e.g. thin out a lattice by discarding points until the correlation between the remaining neighbouring points is suitably "low". This will reduce the autocorrelation between points, but it will also jettison information. Unfortunately, if simple statistical methods are used (i.e. misapplied), the tendency of explanatory variables to be significant in order of their spatial autocorrelation is unchanged, although the effect of thinning out the response variable may make it more likely that any "real" association between a response and an explanatory variable overcomes this ranking bias. Given that ecological information is often expensive to gather (consider the effort put into mapping species distributions, e.g. Gibbons et al. 1993), this is really not a desirable option.
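The cost of thinning is easy to see in a small Python/NumPy sketch. The surface below is a hypothetical one (white noise averaged over an 8 x 8 moving window), and keeping every fourth row and column is an arbitrary choice; `lag1_correlation` and `thin` are names invented for this example.

```python
import numpy as np

def lag1_correlation(grid):
    """Mean correlation between vertically and horizontally adjacent cells."""
    r_rows = np.corrcoef(grid[:-1, :].ravel(), grid[1:, :].ravel())[0, 1]
    r_cols = np.corrcoef(grid[:, :-1].ravel(), grid[:, 1:].ravel())[0, 1]
    return 0.5 * (r_rows + r_cols)

def thin(grid, step):
    """Keep every `step`-th row and column of a lattice."""
    return grid[::step, ::step]

# Hypothetical autocorrelated surface: white noise summed over an 8 x 8 window.
rng = np.random.default_rng(0)
noise = rng.standard_normal((128, 128))
surface = sum(np.roll(np.roll(noise, i, axis=0), j, axis=1)
              for i in range(8) for j in range(8))

thinned = thin(surface, 4)
r_full = lag1_correlation(surface)
r_thinned = lag1_correlation(thinned)
fraction_kept = thinned.size / surface.size   # 1/16 of the cells survive

print(f"neighbour correlation: full = {r_full:.2f}, thinned = {r_thinned:.2f}")
print(f"fraction of data kept: {fraction_kept:.4f}")
```

Thinning by a factor of four in each direction lowers the correlation between neighbouring points, but only one sixteenth of the original measurements remain.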

Red-shifted explanations in geographical ecology

It is clear that spatial autocorrelation is present in most data, and that simple statistical methods are positively misleading. This suggests that there are serious problems with many if not most published studies in geographical ecology. The effects on classical significance testing (testing that ignores the spatial structure of errors) are not only to make explanatory variables in reality less significant than reported but, much more seriously, to bias the choice of which variables are chosen as "significant" towards those that have the greater spatial autocorrelation. The consequence of this is that the chosen explanation for a spatial ecological pattern is more likely to be the spatially "smoother" rather than the "rougher" environmental variables. In other words, the environmental factors selected by many studies as explanations for ecological patterns are "red-shifted" relative to the set of potential explanatory factors: environmental factors with less spatial autocorrelation and hence bluer spectra are much more likely to be rejected. To compound this situation, it is precisely those ecological patterns that show the most spatial autocorrelation that attract most attention and hence are studied most often, since these systematic patterns seem to be most in need of an explanation: unfortunately, this exacerbates the bias towards explanatory factors with redder spectra still further.

What environmental factors are these? Likely candidates are spatial coordinates, any factors that are intrinsically smooth, and those that arise from any kind of smoothing or interpolation process: this processing typically loses detail in the "real" underlying (sampled) spatial pattern. Temperature and temperature-related climatic factors, such as the productivity surrogates actual and potential evapotranspiration, may be particularly vulnerable (although whether or not these are intrinsically smooth is an open question). Rainfall appears to be an intrinsically less autocorrelated factor, although this obviously depends on the quality of the data collection and processing steps. A comparison of published maps of any of these climatic factors with other common explanatory factors such as altitude or habitat coverage classifications immediately shows that the climatic factors are relatively smooth and hence more autocorrelated (although caution is needed since climatic maps are usually extensively interpolated). This means that when energetic climatic factors are reported in the literature as being more strongly associated with abundance or diversity than habitat type, altitude or rainfall, this may not actually be the case, because of the discrimination against factors with bluer spectra. The only remedy for this uncertainty is re-analysis of affected published work, but this time taking spatial autocorrelation explicitly into account.

It is unlikely that approaching the difficulties of spatial analysis with the kind of one-dimensional descriptions of spatial structure that seem to be the norm in ecology (k of the negative binomial, mean-variance ratios) will be entirely sufficient, although it is undoubtedly a start. There is the potential, at least, that using an inadequate description of spatial structure (in the belief that it is adequate) will lead to a similar trail of mistaken associations and conclusions as that produced by the application of classical correlation and regression to spatial problems. It may well be that the results from well-executed re-analyses will be broadly similar, but there is no way of telling until this has been done, and it is perhaps more likely that spatially rougher or patchier environmental factors will become more important in our understanding of ecological spatial patterns and processes.

Conclusions

Geographical ecologists need to realise that detecting meaningful relationships between spatial patterns is a good deal more difficult than first appears. Simple, non-spatial correlations and regressions should really be the beginning of an analysis (the exploratory data analysis stage) rather than reported as results (accompanied by meaningless significance tests). Ecologists should resist the temptation to use "rough and ready" methods to get a "rough and ready" answer: this answer may be quite wrong. However, ecologists should not think that the problems presented by spatial autocorrelation are intractable: there is a great deal that can be done. Given that the current explosion in the availability of spatially continuous data requires a more enlightened approach, ignoring spatial autocorrelation is really not an option. If spatial structure continues to be ignored as an inconvenience, it is likely that our understanding of inter-relationships between patterns of environmental variables and species spatial distributions, abundances and diversity gradients (to name but a few) will remain confused.

Acknowledgements - I thank Eli Groner, Stephen Hartley and Bill Kunin for reading earlier drafts, and David Currie for his useful and challenging comments.

References

Adams, J. M. and Woodward, F. I. 1989. Patterns in tree species richness as a test of the glacial extinction hypothesis. - Nature 339: 699-701.

Austin, M. P., Pausas, J. G. and Nicholls, A. O. 1996. Patterns of tree species richness in relation to environment in south-eastern New-South-Wales, Australia. - Aust. J. Ecol. 21: 154-164.

Blackburn, T. M. 1996. The distribution of bird species in the New World: patterns in species turnover. - Oikos 77: 146-152.

Blackburn, T. M. and Gaston, K. J. 1996. Spatial patterns in the body sizes of bird species in the New World. - Oikos 77: 436-446.

Böhning-Gaese, K. 1997. Determinants of avian species richness at different spatial scales. - J. Biogeogr. 24: 49-60.

Borcard, D. and Legendre, P. 1994. Environmental control and spatial structure in ecological communities: an example using oribatid mites (Acari, Oribatei). - Environ. Ecol. Stat. 1: 37-61.

Borcard, D., Legendre, P. and Drapeau, P. 1992. Partialling out the spatial component of ecological variation. - Ecology 73: 1045-1055.

Buckland, S. T., Elston, D. A. and Beany, S. J. 1996. Predicting distributional change, with application to bird distributions in northeast Scotland. - Global Ecol. Biogeogr. Lett. 5: 66-84.

Cerioli, A. 1997. Modified tests of independence in 2 x 2 table with spatial data. - Biometrics 53: 619-628.

Cliff, A. D. and Ord, J. K. 1973. Spatial autocorrelation. - Pion, London.

Clifford, P., Richardson, S. and Hemon, D. 1989. Assessing the significance of the correlation between two spatial processes. - Biometrics 45: 123-134.

Currie, D. J. 1991. Energy and large-scale patterns of animal- and plant-species richness. - Am. Nat. 137: 27-49.

Currie, D. J. and Paquin, V. 1987. Large-scale biogeographical patterns of species richness in trees. - Nature 329: 326-327.

Dutilleul, P. 1993. Modifying the t test for assessing the correlation between two spatial processes. - Biometrics 49: 304-314.

Gaston, K. J. and Blackburn, T. M. 1996. The tropics as a museum of biological diversity: an analysis of the New World avifauna. - Proc. R. Soc. Lond. B. 263: 63-68.

Gibbons, D. W., Reid, J. B. and Chapman, R. A. 1993. The new atlas of breeding birds in Britain and Ireland: 1988-1991. - Poyser, London.

Goodman, J. W. 1968. Introduction to Fourier optics. - McGraw-Hill.

Greenwood, J. J. D. and Baillie, S. R. 1991. Effects of density-dependence and weather on population-changes of English passerines using a non-experimental paradigm. - Ibis 133: 121-133.

Haining, R. P. 1990. Spatial data analysis in the social and environmental sciences. - Cambridge Univ. Press.

Koenig, W. D. 1997. Spatial autocorrelation in California land birds. - Conserv. Biol. 12: 612-620.

Kurki, S. et al. 1998. Abundances of red fox and pine marten in relation to the composition of boreal forest landscapes. - J. Anim. Ecol. 67: 874-886.

Leathwick, J. R. and Mitchell, N. D. 1992. Forest pattern, climate and vulcanism in central North Island, New Zealand. - J. Veg. Sci. 3: 603-616.

Leduc, A. et al. 1992. Study of spatial components of forest cover using partial Mantel tests and path analysis. - J. Veg. Sci. 3: 69-78.

Legendre, P. 1993. Spatial autocorrelation: trouble or new paradigm? - Ecology 74: 1659-1673.

Lennon, J. J. 1990. Species richness distributions and climate. - Ph.D. thesis. Univ. of Leeds, U.K.

Manly, B. F. J. 1997. Randomization, Bootstrap and Monte Carlo methods in Biology. - Chapman and Hall.

Mardia, K. V. and Marshall, R. J. 1984. Maximum likelihood estimation of models for residual covariance in spatial regression. - Biometrika 71: 135-146.

McCollin, D. 1993. Avian distribution patterns in a fragmented woodland landscape (North Humberside, UK): the role of between-patch and within-patch structure. - Global Ecol. Biogeogr. Lett. 3: 48-62.

Newton, I. and Dale, L. C. 1996. Relationship between migration and latitude among west European birds. - J. Anim. Ecol. 65: 137-146.

Nilsson, C. et al. 1989. Patterns of plant species richness along riverbanks. - Ecology 70: 77-84.

O'Brien, E. M., Whittaker, R. J. and Field, R. 1998. Climate and woody plant diversity in southern Africa: relationships at species, genus and family levels. - Ecography 21: 495-509.

Palmer, M. W. and van der Maarel, E. 1995. Variance in species richness, species association, and niche limitation. - Oikos 73: 203-213.

Peltonen, M. et al. 1998. Bark beetle diversity at different spatial scales. - Ecography 21: 510-517.

Pitas, I. 1993. Digital image processing algorithms. - Prentice Hall.

Ripley, B. D. 1981. Spatial statistics. - Wiley.

Ripley, B. D. 1988. Statistical inference for spatial processes. - Cambridge Univ. Press.

Roxburgh, S. H. and Chesson, P. 1998. A new method for detecting species associations with spatially autocorrelated data. - Ecology 79: 2180-2192.

Russ, J. C. 1994. Fractal surfaces. - Plenum Press.

Schroeder, M. R. 1991. Fractals, chaos and power laws: minutes from an infinite paradise. - Freeman.

Shipley, B. 1991. A model of species density in shoreline vegetation. - Ecology 72: 1658-1667.

Smith, P. A. 1994. Autocorrelation in logistic regression modelling of species distributions. - Global Ecol. Biodiv. Lett. 4: 47-61.

Sokal, R. R. and Oden, N. L. 1991. Spatial autocorrelation analysis as an inferential tool in population genetics. - Am. Nat. 138: 518-521.

Sokal, R. R. and Rohlf, F. J. 1995. Biometry: the principles and practice of statistics in biological research. 3rd ed. - Freeman.

Student (W. S. Gosset). 1914. The elimination of spurious correlation due to position in time or space. - Biometrika 10: 179-181.

Thioulouse, J., Chessel, D. and Champely, S. 1995. Multivariate analysis of spatial patterns: a unified approach to local and global structures. - Environ. Ecol. Stat. 2: 1-14.

Turner, J. R. G., Gatehouse, C. M. and Corey, C. A. 1987. Does solar energy control organic diversity? Butterflies, moths and the British Climate. - Oikos 48: 195-205.

Turner, J. R. G., Lennon, J. J. and Lawrenson, J. A. 1988. British bird species distributions and the energy theory. - Nature 335: 539-541.

Voss, R. F. 1988. Fractals in nature: from characterisation to simulation. - In: Peitgen, H. O. and Saupe, D. (eds), The science of fractal images. Springer, pp. 21-90.
