Ecological Applications, 17(1), 2007, pp. 266–280
© 2007 by the Ecological Society of America
PREDICTIVE VEGETATION MODELING FOR CONSERVATION: IMPACT OF ERROR PROPAGATION FROM DIGITAL ELEVATION DATA
KIMBERLY P. VAN NIEL1,3
AND MIKE P. AUSTIN2
1School of Earth and Geographical Sciences, University of Western Australia, Crawley, Western Australia 6009, Australia
2CSIRO Sustainable Ecosystems, GPO Box 284, Canberra, Australian Capital Territory 2601, Australia
Abstract. The effect of digital elevation model (DEM) error on environmental variables, and subsequently on predictive habitat models, has not been explored. Based on an error analysis of a DEM, multiple error realizations of the DEM were created and used to develop both direct and indirect environmental variables for input to predictive habitat models. The study explores the effects of DEM error and the resultant uncertainty of results on typical steps in the modeling procedure for prediction of vegetation species presence/absence. Results indicate that all of these steps and results, including the statistical significance of environmental variables, shapes of species response curves in generalized additive models (GAMs), stepwise model selection, coefficients and standard errors for generalized linear models (GLMs), prediction accuracy (Cohen’s kappa and AUC), and spatial extent of predictions, were greatly affected by this type of error. Error in the DEM can affect the reliability of interpretations of model results and level of accuracy in predictions, as well as the spatial extent of the predictions. We suggest that the sensitivity of DEM-derived environmental variables to error in the DEM should be considered before including them in the modeling processes.
Key words: Acmena smithii; Corymbia maculata; digital elevation model (DEM); error analysis; error propagation; generalized additive model (GAM); generalized linear model (GLM); geographic information system (GIS); predictive vegetation modeling; southern New South Wales, Australia; uncertainty.
INTRODUCTION
Predictive models of species spatial distribution are
now widely used in conservation studies as exemplified
by the encyclopaedic proceedings of Scott et al. (2002).
Although errors in prediction are considered (e.g., O’Connor
2002, van Horne 2002), only one paper explicitly
mentions error propagation from maps, Geographic
Information System (GIS) layers, or remotely sensed
data, and its potential impact on the statistical models
obtained (Henebry and Merchant 2002). These authors
(p. 295) state “effective techniques for . . . understanding
of error propagation in GIS are still rudimentary but
experience is rapidly increasing.” In this paper we
present an evaluation of the impact of error propagation
from errors in a digital elevation model (DEM) on all
stages in the fitting of statistical regression models for
predicting the distribution of species and their potential
habitats. The purpose is to understand how error and its
propagation affect final results in predictive modeling
and to consider options that will reduce the impact on
modeling results.
Error can enter into the predictive modeling process
by a number of pathways: spatial inaccuracies in the
sampling points, weaknesses in the assumptions or
calculations developing the spatial extension of the
environmental factors, the fit of the modeled relation-
ship, and error that was propagated from any initial
data sets. Like errors in other data sets, GIS data errors
affect the reliability of the final outcomes of any model,
but how they affect final outcomes is difficult to assess.
In general, we can only calculate the uncertainty in
model results due to input errors introduced into the
modeling processes.
Derivatives of a digital elevation model can have very
high levels of propagated error (Fisher 1998, Holmes et
al. 2000), yet most commonly employed environmental
variables for predictive vegetation modeling are derived
from a DEM but are not tested for problems (Henebry
and Merchant 2002). The prevailing belief is that
geomorphometric (indirect) variables have
less propagated error than more derived, direct variables
(Guisan et al. 1998, Guisan and Zimmermann 2000), but
Van Niel et al. (2004) have demonstrated that this is not
true. They found that the level of propagated error from
a DEM is more complex and depends on the method-
ology of the data development and the nature of the
study site. Van Horssen et al. (2002) considered the
impact of error in kriged prediction surfaces on spatial
predictions of vegetation, but in this case the indepen-
dent and dependent variables were gathered at the same
time at each field site, so there is no uncertainty in the
relationship between the variables beyond model fit.
They still found extensive impact of model fit and
kriging error on the outcomes of spatial predictive
Manuscript received 21 November 2005; revised 5 May 2006; accepted 9 May 2006. Corresponding Editor: J. S. Baron.
3 E-mail: [email protected]
modeling. However, there has been no examination of
how the error and uncertainty in environmental
variables, particularly those developed from DEMs,
affect the process and results of predictive vegetation
modeling. If it has an extensive effect, then sensitivity of
environmental variables to error in the DEM should be
a consideration when selecting variables for inclusion in
a model. This has the potential to affect many aspects
and types of models, not only of terrestrial predictive
habitat modeling, but also climatic, marine benthic, and
ecosystem process models. The implications of the
outcomes of this study are discussed further in the
Discussion, below.
In this paper we explore how error in the DEM leads
to increased uncertainty in results from generalized
additive models (GAMs), generalized linear models
(GLMs), stepwise model selection procedures, accuracy
of model predictions, and the predicted spatial extent of
vegetation species for each of two sets of variables. We
develop species spatial-distribution models for two types
of predictor variables, direct and indirect (sensu Austin
and Smith 1989), which are derived from a DEM. A
probability distribution function based on assessed error
in the DEM is used in a stochastic model to develop 10
equi-probable DEMs. The predictor variables are then
redeveloped from each DEM “realization” and used to
redevelop the species spatial distribution models. The
models are then compared against each other and
against the original model at each stage in the modeling
process.
This research demonstrates the range of model results
that can be expected due to error in the DEM and other
source data sets. In addition, the methods employed here
provide a pathway for analyzing the influence of source-
data error on predictive habitat models, as well as a
guideline for understanding the nature of these effects
and their possible reduction.
DATA AND METHODS
The study was conducted in Murramarang National
Park and South Brooman State Forest on the South
Coast of New South Wales, Australia, ~300 km south
of Sydney. This area has been the subject of numerous
vegetation-prediction studies (Moore et al. 1991, Fitz-
gerald and Lees 1992, 1994), and an extensive GIS
database and set of field plots (424) exists for the region.
The elevation at the site ranges from 8 to 260 m with an
average elevation of 66 m. Predictive models were
developed for two species: Corymbia maculata (Hook.),
a common sclerophyll tree (63% presence on plots), and
Acmena smithii (Poir.), a less common rainforest tree
(20% presence on plots), and spatial predictions were
made for a subset of the region well within the extent of
the field sites. The driving factors were shown to be
different for these species in standard GAM (generalized
additive model) and GLM (generalized linear model)
analysis. The standard, unperturbed-data results were
used throughout this paper as a base model for
comparison with the results from the uncertainty
analysis. Commonly employed (based on the literature)
direct and indirect variables, as defined by Austin and
Smith (1989), were used to develop separate models for
each species, resulting in four different models (Table 1).
The indirect variables used were elevation (e.g., Guisan
et al. 1999, Vayssieres et al. 2000), slope (e.g., Franklin
1998, Guisan et al. 1998), cosine of aspect (e.g., Lees and
Ritman 1991, Guisan et al. 1998), and topographic
position, and the direct variables were net solar
radiation (e.g., Moore et al. 1993, Leathwick 1995),
average air temperature (Austin and Meyers 1996,
Vayssieres et al. 2000), and topographic wetness index
(TWI) (e.g., Moore et al. 1993, Barling et al. 1994).
Models based on indirect and direct variables were
compared to investigate their relative sensitivity to
uncertainty (Guisan and Zimmermann 2000, Austin
2002). A geologic nutrient data set (as described in
Moore et al. [1991]) was included in the models
(appropriate soils data have not been developed for
the region), but was not evaluated for uncertainty.
Assessment of baseline error
The error model applied in this study was developed
from ground-truthed (i.e., field verified) data (Fisher
1998, Holmes et al. 2000) and was used to create a
probability distribution function (pdf) to derive ten
random grids which simulate the known error in the
TABLE 1. Base models applied for stepwise procedure, by species.

Model type†    Base model applied‡

Corymbia maculata
  Direct       GeoNutrients,4 + AvgAirTemp + NetSolarRad + TWI
  Indirect     GeoNutrients,4 + DEM + CosineAspect,2 + Slope,2 + TopoPosition

Acmena smithii
  Direct       GeoNutrients,4 + AvgAirTemp + NetSolarRad + TWI,3
  Indirect     GeoNutrients,4 + DEM,3 + CosineAspect,2 + Slope,4 + TopoPosition,2

† Direct and indirect variables were used to develop separate models. Direct variables used were net solar radiation, average air temperature, and topographic wetness index (TWI); indirect variables were elevation, slope, cosine of aspect, and topographic position.
‡ Note that the value following the variable indicates the level of polynomial plus all lower orders. Thus, the direct model for C. maculata would be interpreted as GeoNutrients + GeoNutrients^2 + GeoNutrients^3 + GeoNutrients^4 + AvgAirTemp + NetSolarRad + TWI.
January 2007 267IMPACT OF ERROR ON PREDICTIVE MODELS
digital elevation model (DEM). The DEM for the study
region was developed from elevation contours (10-m
interval), streamlines, and spot heights that were digitized
from a 1:25 000 scale topographic map and then
interpolated to 30-m pixel resolution using IDRISI
(GIS and image-processing software; Clark Labs, Clark
University, Worcester, Massachusetts, USA), as de-
scribed by Lees (1999). The reference (ground-truthed)
data of 2097 points were captured using an OmniSTAR
Scout 12 differential GPS (OmniSTAR, West Perth,
Western Australia, Australia), at a vertical accuracy of
3.8 m or less, along roads at the study site. The point
data were then aggregated into cells (30-m resolution
and spatially matched to the DEM), taking the mean of
all values. The dense forest canopy across much of the
study site made it impossible to capture vertically
accurate data via a GPS across a range of topographic
positions (e.g., gullies). Also, capture of primarily linear
data meant that directional biases (anisotropy) could
not be assessed properly. Thus, we were unable to
consider these relationships.
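The aggregation of reference points into DEM-aligned cells, and the resulting per-cell elevation error, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the grid origin at (0, 0), the function names, and the toy coordinates and elevations are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean, pstdev

CELL = 30.0  # DEM cell size in metres, as stated in the text


def aggregate_to_cells(points, cell=CELL):
    """Average GPS elevations falling in the same grid cell.

    points: iterable of (x, y, z) in metres, grid origin assumed at (0, 0).
    Returns a dict mapping (col, row) to the mean elevation of that cell.
    """
    buckets = defaultdict(list)
    for x, y, z in points:
        buckets[(int(x // cell), int(y // cell))].append(z)
    return {key: mean(vals) for key, vals in buckets.items()}


def dem_errors(cell_means, dem):
    """DEM minus reference elevation for each cell that has reference data."""
    return [dem[key] - z for key, z in cell_means.items() if key in dem]


# Hypothetical toy data: three GPS fixes, two of them in the same cell.
gps = [(5.0, 5.0, 100.0), (20.0, 10.0, 104.0), (40.0, 5.0, 90.0)]
dem = {(0, 0): 105.0, (1, 0): 86.0}

cells = aggregate_to_cells(gps)
errors = dem_errors(cells, dem)
print(cells)
print(errors, pstdev(errors))
```

The standard deviation of such per-cell differences is the kind of summary statistic that an error model of the DEM can be built on.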
The DEM error was assessed at a standard deviation
of 5.3 m, normally distributed, with a spatial autocor-
relation equivalent to Moran’s I of 0.2776 (Van Niel et
al. 2004) and these settings were used to develop the
pdfs. This level of error compares well to the standard
deviation of error in other DEMs, such as USGS 30-m-
resolution DEMs (4.1 m, as per Holmes et al. [2000]),
British Ordnance Survey (BOS) Profile (10 m) and
Panorama (50 m) DEMs (2.6 m and 7.0 m, respectively,
as per Fisher [1998]), and shuttle radar topography
mission interferometric terrain elevation data (SRTM
ITED-2) DEMs (3.3 m in open landscape and 15.18 m in
forested areas, as per Heipke et al. [2002]).
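For readers unfamiliar with the autocorrelation statistic quoted above, the sketch below computes Moran's I for a small raster. It assumes binary rook-contiguity weights (the four edge-sharing neighbours of each cell); the weighting scheme behind the 0.2776 value is not given in this section, so this choice and the function name are illustrative assumptions.

```python
def morans_i(grid):
    """Moran's I for a rectangular grid using binary rook-contiguity weights."""
    rows, cols = len(grid), len(grid[0])
    n = rows * cols
    mean_val = sum(sum(row) for row in grid) / n
    dev = [[v - mean_val for v in row] for row in grid]

    cross = 0.0   # sum of w_ij * dev_i * dev_j over all neighbour pairs
    weight = 0    # sum of all weights w_ij
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cross += dev[r][c] * dev[rr][cc]
                    weight += 1

    total_sq = sum(d * d for row in dev for d in row)
    return (n / weight) * (cross / total_sq)


# A checkerboard is perfectly negatively autocorrelated; a smooth
# gradient is positively autocorrelated.
checker = [[1 if (r + c) % 2 == 0 else -1 for c in range(4)] for r in range(4)]
gradient = [[float(c) for c in range(4)] for _ in range(4)]
print(morans_i(checker), morans_i(gradient))
```

Values near 0 indicate little spatial structure, so an error surface with Moran's I of about 0.28 is moderately, not strongly, clustered.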
Each error grid was then added to the original DEM,
and the environmental variables were re-calculated. This
process was repeated 10 times to generate 10 alternative
uncertainty realizations of the set of predictor variables.
These realizations were then used to create 10 separate
statistical models, which were compared against each
other and against the original models for each species.
The effect of uncertainty was considered not only on the
final outcomes of the model, but also on each step of the
modeling process.
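The realization procedure described above can be sketched in a few lines. This is a simplified stand-in and not the stochastic model used in the study: spatial autocorrelation is induced here by a 3 x 3 mean filter over white noise, which is an assumption of this example and is not tuned to reproduce the reported Moran's I. Only the error standard deviation (5.3 m) and the number of realizations (10) come from the text; all function names are hypothetical.

```python
import random
from statistics import pstdev

ERROR_SD = 5.3        # metres, assessed DEM error (from the text)
N_REALIZATIONS = 10   # number of equi-probable DEMs (from the text)


def _window_mean(g, r, c):
    """Mean of the 3x3 neighbourhood around (r, c), clipped at the edges."""
    vals = [g[rr][cc]
            for rr in range(max(0, r - 1), min(len(g), r + 2))
            for cc in range(max(0, c - 1), min(len(g[0]), c + 2))]
    return sum(vals) / len(vals)


def error_grid(rows, cols, sd, rng):
    """Smoothed Gaussian noise rescaled to zero mean and standard deviation sd."""
    g = [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]
    g = [[_window_mean(g, r, c) for c in range(cols)] for r in range(rows)]
    flat = [v for row in g for v in row]
    mu, s = sum(flat) / len(flat), pstdev(flat)
    return [[(v - mu) * sd / s for v in row] for row in g]


def realizations(dem, n=N_REALIZATIONS, sd=ERROR_SD, seed=0):
    """Return n perturbed copies of dem, each with its own error grid added."""
    rng = random.Random(seed)
    rows, cols = len(dem), len(dem[0])
    out = []
    for _ in range(n):
        err = error_grid(rows, cols, sd, rng)
        out.append([[dem[r][c] + err[r][c] for c in range(cols)]
                    for r in range(rows)])
    return out


# Hypothetical flat DEM at the site's mean elevation of 66 m.
dem = [[66.0] * 12 for _ in range(12)]
reals = realizations(dem)
print(len(reals))
```

Each perturbed DEM would then be passed through the same terrain-analysis steps as the original to regenerate slope, aspect, TWI, and the other predictors.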
Error impact and assessment
The effect of error on a number of different modeling
processes is considered. The processes selected were
based on their usage in the literature and their
importance in the overall development and outcome of
predictive habitat modeling. There are five steps in
modeling the impact of base error on the modeling
process:
1) Testing of individual variable significance for a
species using GLMs. The effect of propagated error on
the significance of individual predictor variables was
assessed, as this is often the first step in developing a
species spatial-distribution model (Vayssieres et al.
2000). Regression models using each predictor indepen-
dently were fitted for the unperturbed data and the 10
realizations. Regressions for linear, quadratic, cubic,
and quartic polynomial models were fitted separately.
The level of probability determined for each realization
regression was compared with that of the unperturbed
data. The entire set of field plots is used for this analysis.
2) Exploring the shape of the species–environment
relationship curves using GAMs. Shapes of the species-
predictor variable relationship are examined using
GAMs (Brown 1994, Austin and Meyers 1996, Bio et
al. 1998, Franklin 1998, Leathwick 1998, Lehmann
1998). These models are developed for individual
predictor variables, first for comparison to the individ-
ual predictor, and then as full models to consider the
change due to the combination of relationships. The
second case is current practice for later development of a
GLM model (Franklin 1998).
3) Selecting a final model from all predictor variables
by stepwise GLM procedures. Backward stepwise
selection of variables applying Akaike’s information
criterion (AIC) is used for all modeled realizations to
examine the effects of propagated error on this process.
For this analysis and all others, the split-sample
approach was applied with 70% of the data used for
model development and 30% retained for testing.
4) Assessing accuracy of model predictions. Based on
GAM analysis of the unperturbed data, an initial model
was constructed for each species. This model was then
applied to each of the realizations of the data sets. The
coefficients and their standard errors were examined,
and the model accuracy tested using both kappa
accuracies (all at 0.05 thresholds; prediction accuracy
as assessed via the kappa statistic [Cohen’s kappa]) and
area under the receiver operating characteristic curve (AUC; Bradley
1997).
5) Predicting the spatial distribution of the species.
Each of the models developed in the stepwise procedures
was used to develop spatial predictions (Pearce and
Ferrier 2000) of Corymbia maculata and Acmena smithii.
For the predictions, each model was applied using the
complete grids of the variables for its realization. For
example, the model developed from realization 1 was
applied only to those environmental variables developed
for realization 1 for predicting the spatial distribution.
In this way, each model was applied to the data set from
which it was developed. The spatial predictions were
developed for each realization, then they were used to
create three data sets based on the probabilities: the
minimum probability, the maximum probability, and
the range in probability for each cell. This allows for the
display of the differences in probability estimates. For
each data set the threshold that optimized kappa
accuracy was used to reclassify the probability data sets
to presence/absence grids (1/0). These were then
summed to give the number of realizations that
predicted the species as present for each cell. This
KIMBERLY P. VAN NIEL AND MIKE P. AUSTIN268 Ecological ApplicationsVol. 17, No. 1
displays the effect of uncertainty and the range of
possibilities for the final presence/absence predictions.
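The per-cell summaries described in step 5 can be sketched as follows. The function name, the nested-list grid representation, and the toy probabilities are assumptions made for illustration; in the study each realization's threshold is the one that optimized kappa, whereas the thresholds below are arbitrary.

```python
def summarize_realizations(prob_grids, thresholds):
    """Per-cell minimum, maximum, range, and presence count across realizations.

    prob_grids: list of 2-D lists of predicted probabilities, one per realization.
    thresholds: one presence/absence threshold per realization.
    """
    rows, cols = len(prob_grids[0]), len(prob_grids[0][0])
    p_min = [[0.0] * cols for _ in range(rows)]
    p_max = [[0.0] * cols for _ in range(rows)]
    p_range = [[0.0] * cols for _ in range(rows)]
    n_present = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            vals = [g[r][c] for g in prob_grids]
            p_min[r][c] = min(vals)
            p_max[r][c] = max(vals)
            p_range[r][c] = max(vals) - min(vals)
            # Reclassify each realization to 1/0 and sum the presences.
            n_present[r][c] = sum(1 for v, t in zip(vals, thresholds) if v >= t)
    return p_min, p_max, p_range, n_present


# Hypothetical example: three realizations of a 1 x 2 grid.
grids = [[[0.25, 1.0]], [[0.75, 0.75]], [[0.5, 0.875]]]
p_min, p_max, p_range, n_present = summarize_realizations(grids, [0.5, 0.5, 0.5])
print(p_min, p_max, p_range, n_present)
```

Cells where the count equals 0 or the number of realizations are those on which all models agree; intermediate counts mark predictions that are sensitive to DEM error.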
RESULTS
Univariate GLM analysis
Comparison of the results of the univariate GLM
(generalized linear model) regressions for the
realizations with the unperturbed model shows variability in the
outcomes when error in the digital elevation model
(DEM) is considered (Table 2). For all variables and
species, the propagated error affected both the polyno-
mial selected for each predictor variable and whether the
predictor variable was selected at all.
For Corymbia maculata, results depended on the
variable investigated when judged on probability levels
for significance (using P < 0.05). Table 2 shows that
average air temperature, elevation (DEM) and topo-
graphic position all gave identical results for all 40
regressions. Cosine of aspect also showed nonsignificant
results. The topographic wetness index (TWI) generally
produced similar significant regressions to the unper-
turbed data (70% agreement). Not one realization for
net solar radiation or slope gave identical results to the
unperturbed data, and none of the regressions for slope
match those of the unperturbed data using two
significant figures for the probability. At the P < 0.05
level, there is only 23% agreement. TWI is more affected
by error in the DEM than net solar radiation (Van Niel
et al. 2004), yet in this case, the error propagated to net
solar radiation leads to more inconsistent results. Thus,
the variability introduced by the error appears to depend
on both the strength of the relationship and the nature
of the derivation of the predictor variable from the
DEM—for example, where the representation of curva-
ture is critical to the output data set or where the data
conversion is not continuously differentiable (Van Niel
et al. 2004).
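The agreement percentages quoted above can be recovered from the "No. significant" column of Table 2. The short sketch below does this for slope and cosine of aspect; the helper name and data layout are assumptions of this example, and the reading that the 23% figure refers to slope is an inference from the surrounding sentences.

```python
N_REALIZATIONS = 10

# (unperturbed regression significant at P < 0.05?, number of the 10
# realizations significant at P < 0.05), taken from Table 2 for the
# linear, quadratic, cubic, and quartic fits.
slope = {"L": (True, 2), "Qd": (True, 3), "C": (True, 2), "Qt": (True, 2)}
cos_aspect = {"L": (False, 0), "Qd": (False, 0), "C": (False, 0), "Qt": (False, 0)}


def percent_agreement(rows, n=N_REALIZATIONS):
    """Share of realization regressions whose significance matches the base model."""
    agree = sum(k if base_sig else n - k for base_sig, k in rows.values())
    return 100.0 * agree / (n * len(rows))


print(percent_agreement(slope))       # 9 of 40 regressions agree
print(percent_agreement(cos_aspect))  # all 40 agree (all nonsignificant)
```

For slope this gives 9 of 40 regressions, or 22.5%, which is consistent with the 23% agreement reported in the text.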
For Acmena smithii, the results with direct variables
demonstrate patterns similar to those for C. maculata (results
not presented; see Van Niel [2003] for details). In
summary, a strong relationship with net solar radiation
remained unaffected by error in the DEM. Average air
temperature regressions had a 93% agreement, while
TWI had 50% agreement. Regression results using
indirect variables were consistent with the unperturbed
data at the P < 0.05 level for elevation, slope, and
topographic position. For all variables and both species,
TABLE 2. Comparison of unperturbed values and the range of perturbed Prob(χ²) values for GLM analysis of Corymbia maculata presence/absence with direct and indirect variables.

                                                                     Perturbed value, realizations (10)
Predictor variable                Polynomial    Unperturbed    ------------------------------------------------------
                                  selected†     value          Minimum    Maximum    Mean     SD       No. significant‡

C. maculata, direct
Net solar radiation               L             0.830          0.008      0.715      0.394    0.250    1
                                  Qd            0.247          0.013      0.788      0.353    0.255    2
                                  C             0.002          0.001      0.686      0.344    0.285    2
                                  Qt            0.003          0.003      0.681      0.334    0.268    2
Average air temperature           L             <0.001         <0.001     <0.001     ...      ...      10
                                  Qd            <0.001         <0.001     <0.001     ...      ...      10
                                  C             <0.001         <0.001     <0.001     ...      ...      10
                                  Qt            <0.001         <0.001     <0.001     ...      ...      10
Topographic wetness index, TWI    Qd            <0.001         0.001      0.361      0.046    0.111    9
                                  C             <0.001         0.002      0.316      0.054    0.095    8
                                  Qt            <0.001         0.002      0.193      0.060    0.066    7

C. maculata, indirect
Digital elevation model, DEM      L             <0.001         <0.001     <0.001     ...      ...      10
                                  Qd            <0.001         <0.001     <0.001     ...      ...      10
                                  C             <0.001         <0.001     <0.001     ...      ...      10
                                  Qt            <0.001         <0.001     <0.001     ...      ...      10
Slope                             L             <0.001         0.018      0.929      0.268    0.299    2
                                  Qd            <0.001         0.020      0.748      0.289    0.295    3
                                  C             <0.001         0.019      0.822      0.350    0.301    2
                                  Qt            <0.001         0.025      0.748      0.356    0.297    2
Topographic position              L             <0.001         <0.001     <0.001     ...      ...      10
                                  Qd            <0.001         <0.001     0.001      ...      ...      10
                                  C             <0.001         <0.001     <0.001     ...      ...      10
                                  Qt            <0.001         <0.001     <0.001     ...      ...      10
Cosine of aspect                  L             0.244          0.130      0.969      0.560    0.332    0
                                  Qd            0.09           0.257      0.985      0.571    0.252    0
                                  C             0.161          0.203      0.981      0.564    0.244    0
                                  Qt            0.267          0.211      0.991      0.622    0.267    0

Note: For each species and environmental variable, all polynomials to quartic are shown, even if a significant value was attained before reaching the next polynomial, to show the extent of variability in the results.
† Polynomials are: L, linear; Qd, quadratic; C, cubic; and Qt, quartic.
‡ Number of error realizations (out of 10 realizations) significant at P < 0.05.
the propagated error affected both the polynomial
selected for each predictor variable and whether the
predictor variable was selected at all.
Effect on shapes of species response curves
Examination of the nature or shape of the relationship
between species and environmental variables is usually
conducted using generalized additive models (GAMs).
Given the results of the univariate analysis above, we
would expect changes in the shapes of the species
response curves due to slight error perturbation.
Figs. 1 and 2 demonstrate the change in the shape of
species response curves given a relatively strong rela-
tionship with the variable. Fig. 1 shows the relationships
between C. maculata presence/absence and average air
temperature. Although average air temperature had a
strong relationship with C. maculata presence/absence
despite the propagation of DEM error, Fig. 1 shows that
the nature of the relationship may still be distorted by
error in the predictor. Different realizations suggest
different final curves, which vary from linear, to
quadratic to cubic (Fig. 1B). Two of the realizations
demonstrate the differences (Fig. 1C and D). Note that
some of the shape changes occur in sparse-sampling
areas, indicating that more comprehensive sampling
may provide some protection from this problem.
Although topographic position appeared to have a
robust relationship with A. smithii presence/absence in
the univariate GLM analysis, Fig. 2 shows how
relatively minor error can change interpretation of the
nature of the relationship, again with variation in form
of linear, quadratic, and cubic. This predictor relation-
ship was less sensitive to error than those environmental
variables with weaker species relationships, such as with
aspect and slope.
This series of realizations demonstrated the problem
of basing interpretation of species response on error-
prone data. Not only do the shape and complexity of
the response change, affecting for example the selection
of a term for a GLM model, but the general response
pattern can be quite different. This was surprising given
that many of the relationships remained significant in
the univariate analysis, and showed that these relation-
ships were not impervious to error in the DEM. The
problem of interpretation of the relationships and the
shapes of species curves remained, despite the strength
of the GLM results.
Effects on model selection by stepwise procedures
Results from the stepwise model selection revealed
sensitivity to DEM error. Table 3 shows the differences
in backwards step model-development results due to the
propagation of error in the DEM to the environmental
variables, and illustrates the weaker relationship of TWI
to C. maculata presence/absence compared to the other
variables. Seven realizations were now nonsignificant for
TWI compared with the unperturbed result, although
nine realizations had significant linear GLM equations
FIG. 1. Generalized additive model (GAM) plots demonstrating the change in relationship between Corymbia maculata presence/absence and mean air temperature (AvgAirTemp). (A) Relationship for the unperturbed data set. (B) Relationship for 10 different error realizations. (C and D) The variation between two of the realizations. On the y-axis “s” stands for “smoothing spline”; the units are a spline of the variable, and a high value indicates high probability of occurrence (but is not calibrated here). Above the x-axis the short vertical lines are a “rug plot,” providing information on the number of data points defining the spline.
(Table 2). Net solar radiation and average air temper-
ature are less affected by DEM error than TWI (Van
Niel et al. 2004). The other variables were unaffected
either due to a stronger relationship with C. maculata
presence/absence or to their robustness to DEM error.
Note that the geologic nutrients data set was not
changed in the error propagation. Geologic nutrients, an
unperturbed variable from a different data source,
remains in every model, as does elevation (DEM). The
selection of the other variables in the stepwise procedure
FIG. 2. GAM plots demonstrating the change in relationship between Acmena smithii presence/absence and topographic position. (A) Relationship with the unperturbed data set. (B) Relationship for 10 different error realizations. (C and D) The variation between two of the realizations. The format is as in Fig. 1. The topographic-position units are meters of exposure.
TABLE 3. Results, by species, of backwards stepwise model development and change in AIC and residual deviance for 10 error realizations of all models.

                                                                                            No.          Change in AIC     Δ deviance
Species and model type        Model                                                         affected†    Mean     SD       Mean     SD

Corymbia maculata, direct model
  Unperturbed model           GeoNutrients,4 + AvgAirTemp + NetSolarRad + TWI                            0.97     0.83     78.66    6.86
  Dropped from unperturbed    none
  Other models selected       GeoNutrients,4 + AvgAirTemp + NetSolarRad                     7

C. maculata, indirect model
  Unperturbed model           GeoNutrients,4 + DEM + Slope,2 + TopoPosition                              2.83     1.88     77.65    5.97
  Dropped from unperturbed    CosineAspect,2
  Other models selected       GeoNutrients,4 + DEM + Slope,2                                3
                              GeoNutrients,4 + DEM + CosineAspect,2 + Slope,2               1
                              GeoNutrients,4 + DEM + CosineAspect,2                         1
                              GeoNutrients,4 + DEM + TopoPosition                           2

Acmena smithii, direct model
  Unperturbed model           GeoNutrients,4 + AvgAirTemp + NetSolarRad + TWI,3                          1.41     0.60     44.67    7.93
  Dropped from unperturbed    none
  Other models selected       GeoNutrients,4 + NetSolarRad + TWI                            9

A. smithii, indirect model
  Unperturbed model           GeoNutrients,4 + TopoPosition,2                                            8.24     1.25     75.08    14.68
  Dropped from unperturbed    Slope,4; Dem,3
  Other models selected       GeoNutrients,4 + CosineAspect,2 + TopoPosition,2              5
                              GeoNutrients,4 + Slope,4 + TopoPosition,2                     2

† Number of models (out of 10 models) with this outcome.
varied for the different realizations. Only 3 out of 10
realizations have identical predictors to the unperturbed
model. The changes in deviance and AIC showed that
the unperturbed-data model falls within the envelope of
the perturbed-data models. The models for A. smithii
with the direct environmental variables disagreed on the
inclusion of average air temperature; 9 out of 10
realizations do not include it in the regression model.
Average air temperature is not greatly affected by error
in the DEM (Van Niel et al. 2004), but the addition of
error in other variables has resulted in the consistent
exclusion of a predictor, due to weakness in the
relationship, which is consistent with the species’
ecology.
Effect on coefficients in GLM models
For the models of C. maculata with direct environ-
mental variables, the coefficients remained relatively
stable across the realizations (Table 4), even between
realizations that included different sets of variables. The
sign remained the same for all coefficients and the
intercept was quite stable, indicating that the general
slope of the model was similar across realizations. The
indirect-environmental variable models for C. maculata
showed changes in the regression coefficients for the
intercept and the unperturbed geologic nutrients data
set. As with the direct model, the sign of the coefficients
and the relative level of the coefficient were not highly
affected for all variables, even with different variables in
the model. This suggests that the relationships between
C. maculata and both geologic nutrients and elevation
are strong enough to be relatively unaffected by error.
The results of A. smithii models with direct variables
demonstrated much greater variation in the coefficients
than the C. maculata models. In particular, the
coefficient on the intercept was highly variable. To test
whether the sensitivity of TWI to error in the DEM was
the cause of the instability in the intercept seen here, the
TWI parameters were removed for all the models with
the same parameters (1–5, 7–10) and the models were
rebuilt. For these nine models, the range in variation of
the intercept was 1.513 and the range of the standard
errors was 0.583, much less than the analysis shown in
Table 4.
The results of models of A. smithii with indirect
variables supported the instability that was seen in the
direct model (not shown). Again, this was likely due to
the strength of the relationship between topographic
position and A. smithii presence/absence and the
sensitivity of topographic position to errors in the
DEM. As a test, the models were redeveloped excluding
topographic position while leaving all other parameters
as they were shown above. Despite the differences in the
other variables included, the variation of the coefficients
for the intercept and geologic nutrients was much lower.
The intercept and geologic nutrients showed much lower
coefficient ranges (e.g., intercept range became 2.53, as
opposed to 12.82).
Effect on prediction accuracy and threshold
Table 5 shows the differences in kappa accuracy and
the area under the receiver operating characteristic curve (AUC) for
all models. The graphs of the differences in kappa
accuracy for models of C. maculata with the change in
threshold are shown in Fig. 3 for both the direct and
indirect models. Notice that the general trend of the
overall and kappa accuracies was the same for the
realizations as with the unperturbed data. For the
TABLE 4. Summary of the variation in regression coefficients for 10 error realizations of the direct models for Corymbia maculata and Acmena smithii presence/absence.

                                Perturbed realization
                   Number       ------------------------------------------
Variable           included     Minimum     Maximum     Mean       SD         Unperturbed

C. maculata, direct model
Intercept          10           80.688      94.494      86.387     4.093      85.442
GeoNutrients       10           −29.352     −25.119     −27.152    1.267      −27.426
GeoNutrients^2     10           11.910      13.965      12.924     0.608      13.290
GeoNutrients^3     10           −2.506      −2.127      −2.319     0.111      −2.420
GeoNutrients^4     10           0.126       0.150       0.139      0.007      0.146
AvgAirTemp         10           −4.994      −4.232      −4.560     0.252      −4.507
NetSolarRad        10           0.046       0.082       0.058      0.010      0.076
TWI                3            −0.174      −0.111      −0.137     0.033      −0.144

A. smithii, direct model
Intercept          10           −25.069     23.244      −1.883     14.140     −27.367
GeoNutrients       10           10.487      14.052      12.251     1.220      12.106
GeoNutrients^2     10           −6.145      −4.506      −5.343     0.555      −5.283
GeoNutrients^3     10           0.721       0.998       0.864      0.093      0.857
GeoNutrients^4     10           −0.052      −0.037      −0.045     0.005      −0.044
AvgAirTemp         1            0.900       0.900       0.9        NA         −1.244
NetSolarRad        10           −0.050      −0.023      −0.035     0.008      −0.095
TWI                3            −7.710      4.142       2.032      3.636      8.564
TWI^2              10           −0.295      0.647       0.195      0.312      −0.509
TWI^3              10           −0.017      0.007       −0.005     0.009      0.01
indirect models the unperturbed data had the best kappa
accuracy, but for the direct models the error realizations
had higher kappa accuracies. There was a wide range of
results, especially the higher kappa accuracies from the
perturbed data sets in the direct models. One could
argue that the kappa accuracies for the direct model had
nowhere to go but up. Note the models with higher
accuracy were the three that did not include TWI,
whereas TWI was included in the unperturbed model.
The graphs of the differences in kappa accuracies for
A. smithii with the change in threshold are shown in Fig.
4 for both the direct and indirect models. The chart of
the kappa accuracies for both indirect and direct models
(Fig. 4) shows that one realization had higher kappa
accuracies than the unperturbed data.
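The two accuracy measures compared in this section can be computed from observed presence/absence and predicted probabilities as sketched below. This is a generic illustration of Cohen's kappa over a set of thresholds and of AUC as a rank statistic; the function names, the toy test data, and the three thresholds are assumptions of this example and do not come from the study.

```python
def cohens_kappa(observed, predicted):
    """Cohen's kappa for two equal-length lists of 0/1 values."""
    n = len(observed)
    a = sum(1 for o, p in zip(observed, predicted) if o == 1 and p == 1)
    b = sum(1 for o, p in zip(observed, predicted) if o == 0 and p == 1)
    c = sum(1 for o, p in zip(observed, predicted) if o == 1 and p == 0)
    d = sum(1 for o, p in zip(observed, predicted) if o == 0 and p == 0)
    p_obs = (a + d) / n                                        # observed agreement
    p_exp = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2   # chance agreement
    return (p_obs - p_exp) / (1 - p_exp) if p_exp != 1 else 0.0


def kappa_by_threshold(observed, probs, thresholds):
    """Kappa after reclassifying probabilities to 1/0 at each threshold."""
    return {t: cohens_kappa(observed, [1 if p >= t else 0 for p in probs])
            for t in thresholds}


def auc(observed, probs):
    """Probability that a random presence outranks a random absence (ties count 0.5)."""
    pos = [p for o, p in zip(observed, probs) if o == 1]
    neg = [p for o, p in zip(observed, probs) if o == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical test plots: observed presence/absence and model probabilities.
obs = [1, 1, 1, 0, 0, 0]
prob = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

kappas = kappa_by_threshold(obs, prob, [0.25, 0.5, 0.75])
best = max(kappas, key=kappas.get)
print(kappas, best, auc(obs, prob))
```

Because kappa depends on the threshold while AUC does not, a change in the optimal threshold between realizations can alter mapped presence/absence even when AUC is nearly unchanged.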
Effect on spatial predictions of species
For each model of C. maculata and A. smithii, the
spatial predictions were developed for every realization
and then used to create three data sets based on the
probabilities, as well as a map of the spatial agreement
across all realizations. The map in Fig. 5 shows the
variation in predicted probabilities for the indirect
models of A. smithii, while Fig. 6 shows the variation
between the presence/absence prediction for the unper-
turbed analysis and two realizations. Fig. 7 shows the
extent of the variation in presence/absence predictions
across all realizations at the optimum threshold, where
the mapped value is the number of models predicting
presence. There were areas on which all models agreed
on either presence or absence of C. maculata for both
models (Fig. 7A and B), although there was less
agreement for the indirect models (Fig. 7A). Two areas of
agreement for the direct models on predicted absence for
C. maculata were quite notable, in the north and south
of the study area, as well as the lowland running from
the southwest to northwest of the study site along a
river. The realizations also show agreement of presence
in the northwest and eastern parts of the study site.
However, the areas of disagreement and the range in
variation of the probabilities of presence show the extent
to which error in the DEM can affect the spatial
prediction of this species in both models. For A.
smithii, there were very few places where the models all
agreed (Fig. 7D), particularly for the indirect models (Fig.
7C).
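The agreement and probability-range maps described above can be sketched as simple per-cell summaries over the realizations. This is an illustration only: it assumes each realization is available as a small grid of predicted probabilities with its own presence threshold, and the function names and example grids are hypothetical.

```python
def agreement_count(realizations, thresholds):
    """Count, per cell, how many realizations predict presence.

    realizations: list of probability grids (each a list of rows of floats).
    thresholds:   one presence threshold per realization.
    """
    n_rows = len(realizations[0])
    n_cols = len(realizations[0][0])
    counts = [[0] * n_cols for _ in range(n_rows)]
    for grid, threshold in zip(realizations, thresholds):
        for r in range(n_rows):
            for c in range(n_cols):
                if grid[r][c] >= threshold:
                    counts[r][c] += 1
    return counts


def probability_range(realizations):
    """Per-cell minimum, maximum, and range of predicted probability."""
    n_rows = len(realizations[0])
    n_cols = len(realizations[0][0])
    minimum = [[min(g[r][c] for g in realizations) for c in range(n_cols)]
               for r in range(n_rows)]
    maximum = [[max(g[r][c] for g in realizations) for c in range(n_cols)]
               for r in range(n_rows)]
    spread = [[maximum[r][c] - minimum[r][c] for c in range(n_cols)]
              for r in range(n_rows)]
    return minimum, maximum, spread


# Three hypothetical 2 x 2 probability grids, one per error realization.
grids = [
    [[0.9, 0.2], [0.5, 0.1]],
    [[0.8, 0.6], [0.4, 0.1]],
    [[0.7, 0.3], [0.6, 0.2]],
]
counts = agreement_count(grids, thresholds=[0.5, 0.5, 0.5])
minimum, maximum, spread = probability_range(grids)
```

Cells where the count equals zero or the number of realizations are those on which all models agree; intermediate counts mark the areas of disagreement.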
DISCUSSION
Propagated primary data error affected each of the
steps in predictive vegetation model development.
Qualitatively this is to be expected; what this study
shows is the nature, magnitude and complexity of the
effects of DEM errors.
Analysis of the relationships between a species and a
single environmental variable is a relatively common
method. For example, Vayssieres et al. (2000) tested the
relationship of six plant species against 25 separate
environmental variables up to the cubic term for each.
The results here suggest that tests of significance of an
individual variable depend on the relationship between
the true value and the value in the DEM, the type of
error (biased or random), and the type and complexity
of the model (Carroll et al. 1995). Complexity of the
model and decreasing degrees of freedom were expected
to make models more sensitive to error, so one would
expect greater variability in a more complex fit. Our
study shows that the GLM (generalized linear model)
univariate method may be called into question as a way
of eliminating variables before model fitting, simply
based on error in the primary data (e.g., Pearce and
Ferrier 2000, Vayssieres et al. 2000). The strong
relationships for both species, which remained unaffected
by DEM error propagation, were indicated by their
commonly observed ecology (such as the strong
relationship between Acmena smithii, a rainforest
species, and topographic position).

TABLE 5. Summary of variation in optimum kappa threshold, highest kappa accuracy, and area under the receiver operating characteristic curve (AUC) for 10 error-realization direct and indirect models of Corymbia maculata and Acmena smithii presence/absence, together with results of unperturbed data models for comparison.

                  Direct models                      Indirect models
Model             Optimum     Kappa               Optimum     Kappa
                  threshold   accuracy   AUC      threshold   accuracy   AUC

C. maculata
  Unperturbed     0.70        0.209      0.634    0.60        0.316      0.646
  Perturbed
    Minimum       0.50        0.155      0.574    0.35        0.068      0.530
    Maximum       0.80        0.251      0.639    0.85        0.228      0.630
    Mean          0.63        0.207      0.608    0.62        0.157      0.581
    SD            0.09        0.032      0.018    0.16        0.049      0.028

A. smithii
  Unperturbed     0.40        0.452      0.735    0.20        0.397      0.781
  Perturbed
    Minimum       0.25        0.299      0.715    0.20        0.176      0.581
    Maximum       0.50        0.616      0.779    0.75        0.416      0.778
    Mean          0.37        0.426      0.751    0.35        0.305      0.711
    SD            0.09        0.089      0.018    0.17        0.069      0.051

GAMs (generalized additive models) were used for
exploring the shapes of species response to environmen-
tal variables. This method has been used to suggest
relationships for use in GLM models (Brown 1994,
Franklin 1998), to explore ecological theories (Austin
and Meyers 1996), or to develop predictive models (Bio
et al. 1998, Leathwick 1998, Lehmann 1998). The effect
that error in the predictor variables has on the integrity
of GAMs and GLMs influences the success of final
predictive outcomes. More importantly, it can also
affect the development of ecological theories on species
niche and environmental response (Leathwick 1995,
Austin and Meyers 1996). Error in the environmental
variables had a relatively strong effect on the apparent
shapes of species response curves. For some relation-
ships both the slope and the shape of the relationship
were changed by DEM uncertainty.
Where the relationship between the predictor variable
and species presence was either strong or essentially
absent, GAM and GLM univariate analyses were less
sensitive to error in the DEM. For A. smithii, the strong
relationship between presence/absence and topographic
position was relatively unaffected by DEM error, while
the lack of relationship between presence/absence and
elevation was similarly unaffected. More ephemeral
relationships, such as Corymbia maculata presence/
absence with slope, proved to be quite variable.
However, higher levels of error would certainly have a
more dramatic effect on all the relationships. There is an
interaction between the strength of relationship between
the predictor variable and species presence and the level
of error considered in the uncertainty perturbations.
Weaker relationships are more sensitive at lower levels
of error, and stronger relationships at higher levels of
error. Thus, the level of error in the DEM can have a
very powerful effect on the accuracy of the relationships
explored using this method. However, the GAM
analysis showed that the strength of the relationship
FIG. 3. Predictive accuracies (kappa statistics based on p-optimum, a threshold value that leads to the best accuracy value as assessed by kappa accuracy) for 10 error realizations of the (A) direct and (B) indirect models for Corymbia maculata presence/absence, displayed as decimals. All perturbed realizations are shown in gray; the unperturbed results are shown in black.
did not necessarily protect against misinterpretation of
the nature of species–environment relationships.
This study did not include predictor-variable interac-
tions. Given results from this study and Van Niel et al.
(2004), we would expect that the impact from propa-
gated DEM error would be a complex interaction
between the strength of relationship between interacting
variables and the species, the attribute and spatial
relationships between the variables themselves, and the
relationship (co-occurrence) of the spatial pattern of
error in the interacting variables.
Stepwise procedures are widely used for model
selection for GLMs (Guisan et al. 1998, Zimmerman
and Kienast 1999). Stepwise model selection proved to
be relatively stable, although the strength or extreme
lack of relationship dictated the similarity in the results.
It was likely that inclusion of the unperturbed data set,
geologic nutrients, helped to stabilize this step in the
modeling procedure. For stepwise model development,
the models from the indirect variables were more
sensitive to error in the DEM than the direct models.
This was probably due to the larger number of variables
presented to the stepwise procedure and to the fact that
topographic position, slope, and cosine of aspect are
known to be more sensitive to error than net solar
radiation and average air temperature (see Van Niel et
al. 2004), but it may also indicate that these variables
were more sensitive to error during the stepwise
procedure because the relationships are not as strong.
Some studies have attempted to assign importance of
variables in species relationships based on the order in
which they enter or leave a model in stepwise
procedures. Although this inference is warned against
(Harrell et al. 1996), it is often used (e.g., Nicholls 1989).
FIG. 4. Predictive accuracies (kappa statistics based on p-optimum) for 10 error realizations of the (A) direct and (B) indirect models for Acmena smithii presence/absence, displayed as decimals. All perturbed realizations are shown in gray; the unperturbed results are shown in black.
Our analysis also shows that error in the DEM can
distort the order, and supports the notion that inference
of importance due to order should be avoided. In
addition, change in deviance is often reported as a
measure of how well the model was developed. The
results show that change in deviance is impacted by
error in the DEM. A wide range in the change of
deviance (21.72 in Table 5 and 18.54 in Table 6) was
found simply due to relatively slight error in the DEM.
With more realizations, this range would probably
increase. Both order of variable exclusion and change
in deviance are affected by propagated DEM error, and
thus they may be unreliable guides to the assignment of
importance to environmental variables and as a
comparative measure between models.
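To make the change-in-deviance comparison concrete, the following sketch computes the binomial deviance of a presence/absence model and the spread of the change in deviance across a set of realizations. It assumes the change is measured against a null (prevalence-only) model; that baseline, the function names, and the clipping constant are assumptions for illustration rather than details confirmed by the study.

```python
import math


def binomial_deviance(observed, probabilities, eps=1e-12):
    """Deviance of 0/1 observations given predicted probabilities of presence."""
    log_likelihood = 0.0
    for y, p in zip(observed, probabilities):
        p = min(max(p, eps), 1.0 - eps)  # guard against log(0)
        log_likelihood += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -2.0 * log_likelihood


def change_in_deviance(observed, probabilities):
    """Null (prevalence-only) deviance minus the deviance of the fitted model."""
    prevalence = sum(observed) / len(observed)
    null_deviance = binomial_deviance(observed, [prevalence] * len(observed))
    return null_deviance - binomial_deviance(observed, probabilities)


def deviance_change_range(observed, realizations):
    """Minimum, maximum, and range of the change in deviance across realizations."""
    changes = [change_in_deviance(observed, probs) for probs in realizations]
    return min(changes), max(changes), max(changes) - min(changes)
```

Applied to the fitted probabilities from each error realization, `deviance_change_range` would return the kind of range in change of deviance discussed above.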
Examination of GLM coefficients showed that strong
relationships between species presence/absence and an
environmental variable, when combined with high
sensitivity to error in that environmental variable, can
lead to unstable intercepts, coefficients, and standard
errors. This result was more disturbing than results from
the previous model development steps. It suggests that
although the other steps show more stability when
relationships between environmental variables and
species were strong, a strong relationship can also lead
to very unstable results if the environmental variable
was sensitive to propagated error. Removal of the
sensitive variable appears to help stabilize the results, as
shown in the removal of TWI (topographic wetness
index) from the direct model and topographic position
from the indirect model for A. smithii. The other striking
difference with these models was the high variability of
the coefficients for TWI. It was likely that both of these
effects were caused by the sensitivity of TWI to error in
the DEM (as described by Van Niel et al. [2004]) and the
FIG. 5. Prediction maps for the study area on the south coast of New South Wales, Australia, demonstrating the variation in the probability of species presence. The (A) minimum, (B) maximum, and (C) range of probabilities for each cell for 10 error realizations of the indirect models of Acmena smithii presence/absence are shown.
FIG. 6. Predicted presence/absence for the (A) unperturbed model and (B and C) two realizations for the indirect models of Corymbia maculata.
strength of the relationship between TWI and A. smithii
presence/absence. Because the relationship was quite
strong (as shown in the univariate analysis), changes in
TWI through the perturbations caused wide fluctuations
in the intercept, high standard errors on the intercept,
and variability in the TWI coefficients. The greatest
variation occurs with outliers and data sparsity, which
supports the idea that any data collection and analysis
methods that generally strengthen models will also help
to protect models against primary-data error.
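The coefficient summaries in Table 4 (number of realizations in which a term was retained, minimum, maximum, mean, and SD) can be reproduced in outline as follows. The sketch assumes each realization's final model is available as a mapping from term name to coefficient, with terms dropped during stepwise selection simply absent; the data structure and names are hypothetical.

```python
import statistics


def summarize_coefficients(fits):
    """Summarize coefficients across realizations.

    fits: list of dicts, one per realization, mapping term name -> coefficient.
    Terms that were not retained in a realization are omitted from its dict.
    """
    names = []
    for fit in fits:
        for name in fit:
            if name not in names:
                names.append(name)
    summary = {}
    for name in names:
        values = [fit[name] for fit in fits if name in fit]
        summary[name] = {
            "n_included": len(values),
            "min": min(values),
            "max": max(values),
            "mean": statistics.mean(values),
            # SD is undefined when a term was retained in a single realization.
            "sd": statistics.stdev(values) if len(values) > 1 else None,
        }
    return summary


# Hypothetical coefficients from three realizations.
fits = [
    {"Intercept": 1.0, "TWI": 0.5},
    {"Intercept": 3.0},
    {"Intercept": 2.0},
]
summary = summarize_coefficients(fits)
```

A term with a large SD relative to its mean, or one retained in only a few realizations, is a candidate for the kind of instability described for TWI.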
Therefore there is a further interaction between the
strength of the relationship, level of error, and the
sensitivity of the variables to error in the DEM. In the
univariate GLM analysis, we saw that a strong
relationship led to more robust results. However, a
strong relationship with an independent variable that
is highly sensitive to error can lead to the development of
models that are very sensitive to error in the DEM.
This is a major cause for concern in the development of
vegetation-prediction models. In this study, for example,
TWI and topographic position were the strongest
variables for prediction of the rainforest species,
characteristically found only in gullies on the steep
slopes of the two north–south mountain ranges on the
study site. This is due to a number of factors, which
included not only the routing of water in gully channels
but also the availability of soil nutrients accumulated
from deposition and protection from fire. Since both
predictor variables are sensitive to error in the DEM
(Van Niel et al. 2004) and contributed to model
FIG. 7. The number of models (out of 10) that predicted the species as present. None (0) is displayed in red, and all (10) is shown in blue to highlight those areas where all models agreed. Displayed are (A) indirect models of Corymbia maculata, (B) direct models of C. maculata, (C) indirect models of Acmena smithii, and (D) direct models of A. smithii.
instability, neither seemed to be particularly better than
the other for model development for the prediction of
rainforest species at this site. What is required in this
case is a variable that can represent the gullies without
being sensitive to error in the DEM. Unfortunately, the
representation of flow characteristics is highly sensitive
to errors in the base DEM (Holmes et al. 2000), and
therefore estimating this topographic feature requires
more scrutiny among the predictive habitat-modeling
community.
Analysis of model prediction accuracy supports the
stepwise model-development findings. Models that
included only those environmental variables that were
relatively robust to DEM error had the highest
accuracies, as compared to those that included variables
that were highly sensitive to DEM error. In this analysis,
the decision of whether to use direct or indirect variables
was not clear. The most important aspect of variable
selection shown here is that environmental variables
should a priori have a strong hypothetical relationship
to species presence/absence (based on known plant
physiology and ecology) and should also be robust to
error in the source data. This recommendation was also
supported by the analysis of the relationships between
variables and species, where stronger relationships are
more impervious to propagated DEM error (see above).
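Alongside kappa, Table 5 reports AUC as a threshold-independent accuracy measure. A minimal sketch of one standard way to compute it is given below, using the pairwise-comparison (Mann-Whitney) form of the statistic; this is not necessarily the procedure used in the study, and the function name is hypothetical.

```python
def auc(observed, probabilities):
    """Area under the ROC curve via pairwise presence/absence comparisons.

    Equals the proportion of (presence, absence) pairs in which the presence
    site received the higher predicted probability; ties count as one half.
    """
    presences = [p for y, p in zip(observed, probabilities) if y == 1]
    absences = [p for y, p in zip(observed, probabilities) if y == 0]
    if not presences or not absences:
        raise ValueError("AUC needs at least one presence and one absence")
    score = 0.0
    for p_presence in presences:
        for p_absence in absences:
            if p_presence > p_absence:
                score += 1.0
            elif p_presence == p_absence:
                score += 0.5
    return score / (len(presences) * len(absences))
```

Because no threshold is involved, comparing AUC across realizations separates changes in how well the model ranks sites from changes caused by the shifting optimum threshold.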
In addition to the strength of the predictive relation-
ship reducing model sensitivity to DEM error, it is likely
that other measures that generally increase model
robustness would also increase robustness to error.
These measures would likely include a larger number of
sample points, balance in the ratio of presence/absence
in the data set, and limiting data sets to reduce absences
beyond the range of a species. The use of environmental
variables that are less sensitive to error would also be
expected to increase robustness. In our analysis,
however, the strength (or clear absence) of a relationship
was the most important factor in sensitivity to
DEM error in the univariate GLM analyses. Even so, a
strong relationship did not preserve the shape of the
species response curves when the environmental
variable was itself sensitive to DEM error.
The spatial predictions clearly demonstrate the
problem of propagated error on model results. For only
10 realizations, the results show a wide variation in the
spatial extent of the species predictions. The variation in
the extent of spatial predictions has the greatest
implications for management. If the analyses were
expanded to include a larger number of realizations,
which is required for the development of complete
statistical results, then a better estimate of the extent of
variation and the estimate of the mean variation could
be obtained. However, the maps of spatial variation in
the species predictions give a powerful visual demonstration
of the range of results due to primary-data
error. They also provide a spatial context for the
uncertainties introduced from error in the base data.
For example, the spatial pattern of model disagreement
for the indirect models for C. maculata (Fig. 6) shows the
greatest uncertainties in the region where topographic
position is most affected by DEM error (Van Niel et al.
2004), while the same region is affected in the direct
models, probably due to the inconsistency of the
inclusion of TWI in the final models. For both models
of A. smithii, there is general agreement of species
absence on ridges, while uncertainty of species presence
is evident in the gullies (direct model) as well as slopes
and toe slopes (indirect model).
This study has shown that not only final outcomes,
but also many of the components from the process of
predictive habitat modeling are impacted. This includes
prediction accuracies for both plant and animal species
(Pereira and Itami 1991, Zimmerman and Kienast 1999),
spatial extents (Franklin 1998), species realised niche
(Austin et al. 1990), predictor-variable significance
(Vayssieres et al. 2000) and strength of the relationship
between species and predictor variables for species
occurrences (Franklin 1998), interrelationships of envi-
ronmental variables (Austin et al. 1983), stepwise
procedures (Vayssieres et al. 2000), and the shape of
species response curves. Models that predict dominance
(Lenihan 1993) and richness (Heikkinen 1996) could
also be impacted. Although this study has focused on
predictive habitat modeling and modeling methods,
other models based on continuous GIS data or surface
models may also be impacted. For example, distributed
ecosystem process models (e.g., Band et al. 1991), which
rely on input from a DEM to develop secondary
topographic data, will be similarly affected. Climate
and climate-change models that either use DEMs and
their derivatives to spatially extend climate station data
or rely on models that use these data, such as MTCLIM
(Running et al. 1987), would also be impacted (e.g.,
Tingey et al. 2001, Lapp et al. 2005). Affected also
would be predictive process models that are reliant on
derived terrain data, such as fire models like BEHAVE
(Rothermel 1972) and studies that use these models (e.g.,
Stephens 1998). These results may also be extended to
the marine environment, where bathymetry, as a basis
for benthic habitat modeling (Kostylev et al. 2001), may
also be developed from contour data or from data that
require smoothing of striping and scalar (depth) effects
(e.g., MultiBeam techniques). None of the cited papers
consider the effect of propagated error, including those
from GIS data, on the results.
Conclusions
Spatial predictions of vegetation, habitats, and
processes are widely used in management. An under-
standing of how error can affect the prediction of the
spatial extent of a species, community, or process is
critical for management decision making. The results in
this study indicate that error in a DEM can have a
profound effect on modeling processes and outcomes.
Accepting this, it is necessary that we begin to explore
the ramifications of these findings and develop best-
practice methods that reduce the impact of errors. For
example, our results indicate that proper model devel-
opment could help to reduce the impact of DEM error
on predictive habitat modeling. The sensitivity of DEM-
derived environmental variables to error in the DEM
should be considered before including them in any
modeling process. There is clearly a need to examine
environmental variables, not just on the basis of their
relatedness to the original source data or to physiolog-
ical drivers, but also in light of their sensitivity to error
in the source data. In particular, we need to develop
variables that are more robust under these conditions.
This study has shown that propagated uncertainty,
based on error in the DEM, has the potential to have a
significant impact on a number of steps and the final
results from predictive vegetation modeling.
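One way to screen a DEM-derived variable for sensitivity before modeling is a small Monte Carlo experiment: perturb the DEM, recompute the variable, and inspect its spread. The sketch below does this for slope at a single cell. It is a simplified illustration, not the error model used in this study: it assumes independent Gaussian elevation error with a user-supplied standard deviation (real DEM error is usually spatially autocorrelated), central-difference slope, and hypothetical function names.

```python
import math
import random
import statistics


def slope_degrees(dem, row, col, cell_size):
    """Slope (degrees) at an interior cell from central differences."""
    dz_dx = (dem[row][col + 1] - dem[row][col - 1]) / (2.0 * cell_size)
    dz_dy = (dem[row + 1][col] - dem[row - 1][col]) / (2.0 * cell_size)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))


def perturb(dem, error_sd, rng):
    """One error realization: add independent Gaussian noise to every cell.

    Assumption: uncorrelated error, used here only to keep the sketch short.
    """
    return [[z + rng.gauss(0.0, error_sd) for z in row] for row in dem]


def slope_sensitivity(dem, row, col, cell_size, error_sd,
                      n_realizations=100, seed=0):
    """Mean and SD of slope at one cell across perturbed DEM realizations."""
    rng = random.Random(seed)
    slopes = [
        slope_degrees(perturb(dem, error_sd, rng), row, col, cell_size)
        for _ in range(n_realizations)
    ]
    return statistics.mean(slopes), statistics.stdev(slopes)


# Hypothetical 3 x 3 DEM (metres): a plane rising 10 m per 10 m cell.
dem = [[0.0, 10.0, 20.0],
       [0.0, 10.0, 20.0],
       [0.0, 10.0, 20.0]]
mean_slope, sd_slope = slope_sensitivity(dem, 1, 1, cell_size=10.0, error_sd=2.0)
```

Repeating the same experiment for each candidate variable (for example, slope, topographic position, or TWI) and comparing the resulting SDs relative to each variable's range would give a rough ranking of sensitivity to DEM error.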
ACKNOWLEDGMENTS
This study was conducted as part of a Ph.D. thesis at the Australian National University, supported by generous scholarships provided by the ANU and the Australian Department of Youth Services. Special thanks go to thesis supervisor Brian Lees (UNSW-ADFA) for support and comments on the draft and to thesis advisors Stuart Davey (BRS) and Brendan Mackey (ANU). Thanks also to two anonymous reviewers whose comments greatly improved the manuscript.
LITERATURE CITED
Austin, M. P. 2002. Spatial predictions of species distribution: an interface between ecological theory and statistical modelling. Ecological Modelling 157:101–118.
Austin, M. P., R. B. Cunningham, and R. B. Good. 1983. Altitudinal distribution of several eucalypt species in relation to other environmental factors in southern New South Wales. Australian Journal of Ecology 8:169–180.
Austin, M. P., and J. A. Meyers. 1996. Current approaches to modelling the environmental niche of eucalypts: implications for management of forest biodiversity. Forest Ecology and Management 85:95–106.
Austin, M. P., A. O. Nicholls, and C. R. Margules. 1990. Measurement of the realized qualitative niche: environmental niches of five Eucalyptus species. Ecological Monographs 60:161–177.
Austin, M. P., and T. M. Smith. 1989. A new model for the continuum concept. Vegetation 83:35–47.
Band, L. E., D. Peterson, S. Running, J. Coughlin, R. Lammers, J. Dungan, and R. Nemani. 1991. Forest ecosystem processes at the watershed scale: basis for distributed simulation. Ecological Modelling 56:171–196.
Barling, R. D., I. D. Moore, and R. B. Grayson. 1994. A quasi-dynamic wetness index for characterising the spatial distribution of zones of surface saturation and soil water content. Water Resources Research 30:1029–1044.
Bio, A. M. F., R. Alkemade, and A. Barendregt. 1998. Determining alternative models for vegetation response analysis: a non-parametric approach. Journal of Vegetation Science 9:5–16.
Bradley, A. P. 1997. The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognition 30:1145–1159.
Brown, D. G. 1994. Predicting vegetation types at treeline using topography and biophysical disturbance variables. Journal of Vegetation Science 5:641–656.
Carroll, D., D. Ruppert, and L. A. Stefanski. 1995. Measurement error in nonlinear models. Chapman and Hall, London, UK.
Fisher, P. 1998. Improved modeling of elevation error with geostatistics. GeoInformatica 2:215–233.
Fitzgerald, R. W., and B. G. Lees. 1992. The application of neural networks to the floristic classification of remote sensing and GIS data in complex terrain. Pages 570–573 in Proceedings of the 17th ISPRS Congress. American Society for Photogrammetry and Remote Sensing, Washington D.C., USA.
Fitzgerald, R. W., and B. G. Lees. 1994. Spatial context and scale relationships in raster data for thematic mapping in natural systems. Pages 462–476 in T. Waugh and R. Hedley, editors. Advances in GIS research. Taylor and Francis, Southampton, UK.
Franklin, J. 1998. Predicting the distribution of shrub species in southern California from climate and terrain-derived variables. Journal of Vegetation Science 9:733–748.
Guisan, A., J. Theurillat, and F. Kienast. 1998. Predicting the potential distribution of plant species in an alpine environment. Journal of Vegetation Science 9:65–74.
Guisan, A., S. B. Weiss, and A. D. Weiss. 1999. GLM versus CCA spatial modeling of plant species distribution. Plant Ecology 143:107–122.
Guisan, A., and N. E. Zimmermann. 2000. Predictive habitat distribution models in ecology. Ecological Modelling 135:147–186.
Harrell, F. E., K. L. Lee, and D. B. Mark. 1996. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy measuring and reducing errors. Statistics in Medicine 15:361–387.
Heikkinen, R. K. 1996. Predicting patterns of vascular plant species richness with composite variables: a meso-scale study in Finnish Lapland. Vegetatio 126:151–165.
Heipke, C., A. Koche, and P. Lohmann. 2002. Analysis of SRTM DTM. Methodology and practical results. In A. Boberg, editor. Photogrammetry meets geoinformatics. Journal of Swedish Society for Photogrammetry and Remote Sensing 2000(1):69–80.
Henebry, G. M., and J. Merchant. 2002. Geospatial data in time: Limits and prospects for predicting species occurrences. Pages 291–302 in J. Scott, P. Heglund, J. Haufler, J. Morrison, M. Raphael, W. Wall, and F. Samson, editors. Predicting species occurrences: issues of accuracy and scale. Island Press, Washington, D.C., USA.
Holmes, K. W., O. A. Chadwick, and P. C. Kyriakidis. 2000. Error in a USGS 30-meter digital elevation model and its impact on terrain modelling. Journal of Hydrology 233:154–173.
Kostylev, V., B. Todd, G. Fader, R. Courtney, G. Cameron, and R. Pickill. 2001. Benthic habitat mapping on the Scotian Shelf based on multibeam bathymetry, surficial geology, and seafloor photographs. Marine Ecology Progress Series 219:121–137.
Lapp, S., J. Buyrne, I. Townshend, and S. Zienzle. 2005. Climate warming impacts on snowpack accumulation in an alpine watershed. International Journal of Climatology 25:521–536.
Leathwick, J. R. 1995. Climatic relationships of some New Zealand forest tree species. Journal of Vegetation Science 6:237–248.
Leathwick, J. R. 1998. Are New Zealand’s Nothofagus species in equilibrium with their environment? Journal of Vegetation Science 9:719–732.
Lees, B. G. 1999. The Kioloa GLCTS Pathfinder Site. [Accessed 21 October 1999.] <http://sres.anu.edu.au/associated/pathfinder/index.html>
Lees, B. G., and K. Ritman. 1991. Decision-tree and rule-induction approach to integration of remotely sensed and GIS data in mapping vegetation in disturbed or hilly environments. Environmental Management 15:823–831.
Lehmann, A. 1998. GIS modelling of submerged macrophyte distribution using Generalised Additive Models. Plant Ecology 139:113–124.
Lenihan, J. M. 1993. Ecological response surfaces for North American boreal tree species and their use in forest classification. Journal of Vegetation Science 4:667–680.
Moore, D. M., B. G. Lees, and S. M. Davey. 1991. A new method for predicting vegetation distributions using decision tree analysis in a geographic information system. Environmental Management 15:59–71.
Moore, I. D., T. W. Norton, and J. E. Williams. 1993. Modelling environmental heterogeneity in forested landscapes. Journal of Hydrology 150:717–747.
Nicholls, A. O. 1989. How to make biological surveys go further with generalised linear models. Biological Conservation 73:51–75.
O’Connor, R. 2002. The conceptual basis of species distribution modeling: time for a paradigm shift. Pages 25–33 in J. Scott, P. Heglund, J. Haufler, J. Morrison, M. Raphael, W. Wall, and F. Samson, editors. Predicting species occurrences: issues of accuracy and scale. Island Press, Washington, D.C., USA.
Pearce, J., and S. Ferrier. 2000. An evaluation of alternative algorithms for fitting species distribution models. Ecological Modelling 128:127–147.
Pereira, J. M. C., and R. M. Itami. 1991. GIS-based habitat modelling using logistic multiple regression: a study of the Mt. Graham red squirrel. Photogrammetric Engineering and Remote Sensing 57:1475–1486.
Rothermel, R. 1972. A mathematical model for predicting fire spread in wildland fuels. Research Paper INT-115. USDA Forest Service, Ogden, Utah, USA.
Running, S., R. Nemani, and R. Hungerford. 1987. Extrapolation of synoptic meteorological data in mountainous terrain and its use for simulating forest evapotranspiration and photosynthesis. Canadian Journal of Forest Research 17:472–483.
Scott, J., P. Heglund, J. Haufler, J. Morrison, M. Raphael, W. Wall, and F. Samson, editors. 2002. Predicting species occurrences: issues of accuracy and scale. Island Press, Washington, D.C., USA.
Stephens, S. 1998. Evaluation of the effects of silviculture and fuels treatments on potential fire behaviour in Sierra Nevada mixed-conifer forests. Forest Ecology and Management 105:21–35.
Tingey, D., J. Laurence, J. Weber, J. Greene, W. Hogsett, S. Brown, and E. Lee. 2001. Elevated CO2 and temperature alter the response of Pinus ponderosa to ozone: a simulation analysis. Ecological Applications 11:1412–1424.
van Horne, B. 2002. Approach to habitat modelling: the tension between pattern and process and between specificity and generality. Pages 63–72 in J. M. Scott, P. J. Heglund, and M. L. Morrison, editors. Predicting species occurrences. Island Press, Washington, D.C., USA.
van Horssen, P. W., E. J. Pebesma, and P. P. Schot. 2002. Uncertainties in spatially aggregated predictions from a logistic regression. Ecological Modelling 154:93–101.
Van Niel, K. P. 2003. Geographical issues in predictive vegetation modelling: Error and uncertainty in GIS data, methods, and models. Dissertation. The Australian National University, Canberra, Australian Capital Territory, Australia.
Van Niel, K. P., S. W. Laffan, and B. G. Lees. 2004. Effect of error in the DEM on environmental variables for predictive vegetation modelling. Journal of Vegetation Science 15:747–756.
Vayssieres, M. P., R. E. Plant, and B. H. Allen-Diaz. 2000. Classification trees: an alternative non-parametric approach for predicting species distributions. Journal of Vegetation Science 11:679–694.
Zimmerman, N., and F. Kienast. 1999. Predictive mapping of alpine grasslands in Switzerland: Species versus community approach. Journal of Vegetation Science 10:469–482.