University of Calgary
PRISM: University of Calgary's Digital Repository
Graduate Studies The Vault: Electronic Theses and Dissertations
2013-01-25
Range-Wide Habitat Mapping for Ord’s Kangaroo Rats
(Dipodomys ordii)
Robbins, Allison
Robbins, A. (2013). Range-Wide Habitat Mapping for Ord’s Kangaroo Rats (Dipodomys ordii)
(Unpublished master's thesis). University of Calgary, Calgary, AB. doi:10.11575/PRISM/27526
http://hdl.handle.net/11023/483
master thesis
University of Calgary graduate students retain copyright ownership and moral rights for their
thesis. You may use this material in any way that is permitted by the Copyright Act or through
licensing that has been assigned to the document. For uses that are not allowable under
copyright legislation or licensing, you are required to seek permission.
Downloaded from PRISM: https://prism.ucalgary.ca
UNIVERSITY OF CALGARY
Range-Wide Habitat Mapping for Ord’s Kangaroo Rats (Dipodomys ordii)
by
Allison Robbins
A THESIS
SUBMITTED TO THE FACULTY OF GRADUATE STUDIES
IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE
DEGREE OF MASTER OF SCIENCE
DEPARTMENT OF GEOGRAPHY
CALGARY, ALBERTA
JANUARY, 2013
© ALLISON ROBBINS 2013
Abstract
Ord’s kangaroo rat (Dipodomys ordii) is federally listed as an endangered species in
Canada. This is the northern-most population of the species, occurring in only two
Canadian provinces: Alberta and Saskatchewan. Building on the previous models and
extensive knowledge of the kangaroo rat population in Alberta, a habitat model was
developed and extrapolated across the entire Canadian extent. The habitat model was
trained using population data obtained in Alberta. The top model, as indicated by
Akaike’s Information Criterion corrected for small sample sizes (AICc), was used to
estimate the area of available habitat in
Saskatchewan. An alternate model was also developed to test for potential bias toward
river valley slopes in the top model. Population estimates ranged from 6,522 for the top
model to 8,876 for the alternate model. In future studies, the performance of the model
may be improved through better image classification and the use of more biologically
relevant model variables.
Acknowledgements
I would first like to thank my supervisor, Darren Bender, for guiding me through this
process. He patiently taught me about modelling and introduced me to kangaroo rats. I
would also like to thank Randy Dzenkiw for training me in the field and recommending
sites for the vegetation surveys in Saskatchewan.
Dedication
I’d like to dedicate this thesis to my family, because their constant support made it
possible.
Table of Contents
Approval Page
Abstract
Acknowledgements
Dedication
Table of Contents
List of Figures and Illustrations
CHAPTER ONE: BACKGROUND
  1.1 Introduction
  1.2 Study Species
  1.3 Study Area
  1.4 Study Objectives
CHAPTER TWO: HABITAT DISTRIBUTION MODEL
  2.1 Introduction
    2.1.1 Resource Selection Functions
    2.1.2 Validation
  2.2 Methods
    2.2.1 Model Layers
    2.2.2 Training Data for the Habitat Model
    2.2.3 Model Selection and Validation
  2.3 Results
  2.4 Discussion
CHAPTER THREE: HABITAT AND POPULATION ESTIMATION
  3.1 Introduction
  3.2 Methods
    3.2.1 An Alternate Model Surface
    3.2.2 Estimating Population Size
  3.3 Results
    3.3.1 The Alternate RSF Model
    3.3.2 Habitat and Population Estimates for the Top Model by AICc
    3.3.3 Habitat and Population Estimates for the Alternate Model
  3.4 Discussion
CHAPTER FOUR: SUMMARY AND GENERAL CONCLUSIONS
REFERENCES
APPENDIX A: CLASSIFICATION VALIDATION
APPENDIX B: K-FOLD CROSS-VALIDATION
APPENDIX C: PREDICTED HABITAT IN SASKATCHEWAN
List of Tables
Table 2.1. The spatial layers developed for this model were drawn from several
sources in order to cover the entire study area.
Table 2.2. All models run were ranked by AICc. Delta indicates the difference
between AICc scores, and wi is a measurement of the relative value of the model.
Table 2.3. The model coefficients are given for the top-ranked model.
Table 2.4. The χ² statistic for the first fold is significant (p>0.001). Most of the
validation points are in the highest probability bin.
Table 3.1. The area of potential habitat (given in km²) and habitat capacity were
identified by the top model and are organized by habitat type and province. The
total area of potential habitat included habitat that was smaller than a single
home range. The estimated area of potential habitat in dune fields is much lower
than might be expected from the population surveys performed in Alberta.
Table 3.2. The area of potential habitat (given in km²) for the alternate model was
totaled. The habitat capacity was calculated from the totals by province and
habitat type. The predicted potential habitat and habitat capacity were much
higher in Saskatchewan than the prediction made by the top model.
Table 3.3. When the probability threshold for potential habitat was increased and
applied to the alternate model surface, the overall potential habitat and habitat
capacity were reduced from the estimates made above. The area and habitat
capacity in road margin and other habitat types fell more than in dune fields.
List of Figures and Illustrations
Figure 1.1. The study area is shown with the dune fields identified. CFB Suffield is
shown as a hollow outline. Records of kangaroo rats within a dune field are
indicated by the stippled pattern and dune fields without any evidence of
kangaroo rats are shown with cross-hatching.
Figure 2.1. The training points were collected mostly in the Middle Sand Hills region
and in the Suffield NWA. CFB Suffield is shown with the outline and the dune
fields in the area are labelled.
Figure 2.2. The probability surface derived using the top-ranked model and applied
across the entire Canadian range of Ord’s kangaroo rats.
Figure 2.3. A zoomed-in view of the probability surface for a region of CFB Suffield
shows more distinction between individual sand dunes.
Figure 2.4. The ROC curve shows excellent model performance with the curve
falling well above the line indicating random chance. The AUC was 0.938.
Figure 2.5. The slope of the linear regression for the first fold is significantly
different from zero and one.
Figure 3.1. The ROC curve for the model developed without the distance-to-rivers
variable has an AUC of 0.818. While lower than the AUC for the top model, it is
still well above the AUC of 0.5, indicating the model performs better than
random chance.
Figure 3.2. The model developed without the distance-to-rivers variable shows
greater variation in probability values across the study area. Several active
dunes and roads are visible as darker areas of high probability. Note that an
agricultural mask was applied to the model surface, explaining the large
areas of white in the figure.
Chapter One: Background
1.1 Introduction
The study of how a species is distributed across a landscape has long interested
researchers. A species’ spatial distribution can be used to understand what food, land
cover, and other variables are necessary for its survival (Manly et al., 2002). By
identifying, then mapping environmental characteristics associated with a species,
potential habitat can be located and even predicted (e.g. Bender et al., 2010; Hough and
Dieter, 2009; Jędrzejewski et al., 2008). A map of potential habitat can be used to
prioritize conservation efforts and increase the efficiency of field surveys (Guisan and
Zimmermann, 2000).
Niche theory, initially developed in the 1920s, is based on identifying the
parameters within which a species thrives (Vandermeer, 1972). Hutchinson (1957) is
credited with distinguishing two types of niche and formalizing the definitions of the
fundamental niche and the realized niche. He defines a fundamental niche in terms of a
hypervolume, where the axes are defined by resources and environmental characteristics
required by a species. The niche is bounded by the limits of tolerance of a species. The
fundamental niche represents a region in n-dimensional space that a species could inhabit
under ideal circumstances. However, a species rarely occupies its fundamental niche in
its entirety. Interactions with other species, such as predation or competition for
resources like food or territory, can limit access to certain regions of the niche
hypervolume, because those other species occupy a portion of it. With these
limitations applied, the
fundamental niche is thus reduced into a realized niche, which reflects the actual
ecological area a species occupies (Hutchinson, 1957).
The environmental factors defining a species’ niche can be mapped and, using a
statistical model, the overall distribution of potential habitat can be predicted. Habitat
maps are graphical representations of the occupied habitat, and can be useful tools for
locating and understanding a species. Habitat maps can be derived from historic
occurrence records and expert knowledge, which might not be reproducible (Guisan and
Thuiller, 2005). Statistical techniques have been developed that can quantify the
relationships between the environmental requirements of a species and their influence on
the species’ distribution. Statistical models, like resource selection functions (Boyce and
McDonald, 1999; Guisan and Zimmermann, 2000; Manly et al., 2002), combine expert
opinion in choosing environmental variables with the repeatability of empirical
observations.
1.2 Study Species
The Ord’s kangaroo rat (Dipodomys ordii) is a member of the Heteromyid family
of rodents (Day et al., 1956). This species is distributed across the arid and semi-arid
regions of western North America. While common throughout regions of the United
States and Mexico, the kangaroo rats in Canada represent the northernmost extent of the
species’ range (COSEWIC, 2006). The population in Canada is isolated and differs from
individuals in the U.S. and Mexico in several ways (COSEWIC, 2006; Gummer, 1997a).
The mean weight of Ord’s kangaroo rats across the North American range was
recorded at 52 g (Jones, 1985). In Canada, kangaroo rats tend to be larger, with a mean
weight of 70 g (Gummer, 1997a). There is slight sexual dimorphism throughout North
America, with males weighing more (71.1 g) than females (69.0 g) (Gummer, 1997a).
Males also have slightly larger skulls, with a mean skull length of
38.22 mm compared to 37.98 mm for females (Kennedy and Schnell, 1978). Kangaroo
rats have large hind legs which they use in their unique method of locomotion by
hopping. The body is covered in orange-brown pelage with white stripes over the hips
and running down the long tail (Gummer, 1997b).
Ord’s kangaroo rats are nocturnal, emerging from their burrows on the darkest
nights to forage. They are highly sensitive to light, staying underground to minimize
their surface activity on bright moonlit nights (Kaufman and Kaufman, 1982). Northern
kangaroo rats also avoid surface foraging on cold nights (below -5 °C), and activity is
reduced with snow cover (Gummer, 1997a). This differs from southern populations,
where winter conditions are milder and kangaroo rats are active throughout the year
(Hoditschek and Best, 1983; White and Geluso, 2012).
When surface activities cease, kangaroo rats in Canada remain in burrows,
subsisting on seed caches. During cold spells, kangaroo rats have been observed to
undergo shallow torpor (Gummer, 1997a). While torpor has been observed in
conspecifics and other kangaroo rat species, it is usually in response to starvation and
often results in death (Breyen et al., 1973; Yousef and Dill, 1971). In contrast, Ord’s
kangaroo rats in Canada appear to use this as a strategy to maintain fat reserves through
the long winter (Gummer, 1997a).
When underground, kangaroo rats feed on seeds cached from earlier foraging
expeditions. They forage primarily on seeds collected from local plants (COSEWIC,
2006; Gummer, 1997b). Conspecifics elsewhere are known to store seeds in shallow
caches near the surface as well as in larder hoards in underground burrows. They tend to
switch to larder hoards for the winter season (White and Geluso, 2012). It is these
underground larders which provide food for individuals throughout the winter (Gummer,
1997a). When burrows are abandoned by kangaroo rats, seed caches can serve as seed
banks for future vegetation growth.
Kangaroo rats also clip and collect grass blades for bedding in their burrows.
Clipping grass blades can limit the spread of vegetation on dunes (Kerley et al., 1997).
Dunes are kept active and kangaroo rats can alter the plant community by clipping
vegetation, which is evidence for Ord’s kangaroo rats’ potential status as a keystone
species (Kerley et al., 1997).
Ord’s kangaroo rats in Canada breed at a younger age and more frequently than
conspecifics in the U.S. and Mexico (Gummer, 1997a). Canadian kangaroo rats can have
up to 4 broods from April to August. Juveniles from early broods can reach sexual
maturity and reproduce later in the same summer (Gummer, 1997a). Southern
conspecifics tend to breed later in the year and rarely produce more than a single litter in
one season (Hoditschek and Best, 1983).
The high reproductive rate leads to a boom and bust population cycle in Canada.
In harsh northern winters, overwinter survival rates tend to be as low as 5% (Gummer,
1997a). Adults rarely survive past one year and the probability of juveniles surviving
winter is even lower (Gummer, 1997a). In locations where sufficient individuals survive
the winter, the population typically increases rapidly throughout the summer. Especially
when the population is at its peak, kangaroo rats serve as a portion of the diet of many
predatory species, including owls, coyotes, and rattlesnakes.
Ord’s kangaroo rats build their burrows in loose sandy soils (Compton and
Hedges, 1943). In Canada, they are exclusively found on actively eroding sand dunes
and in adjacent areas, particularly along sandy roads, fireguards, trails, or other linear
features. While associated with eroding features, such as active sand dunes, the species
relies most on nearby partially vegetated soil for foraging. Partially vegetated soils
provide adequate food resources and cover while allowing enough bare surface to
facilitate the kangaroo rats’ method of locomotion by hopping. Once established,
kangaroo rats excavate elaborate burrows with multiple entrances and create distinctive
runways to and from their burrow entrances (Gummer, 1997b).
Kangaroo rats also establish burrows along sandy roads and other anthropogenic
features (Gummer, 1997b). However, these secondary habitats are of lower quality and
mortality is often higher along roads (Teucher, 2007). For example, some predators
follow roads while hunting, increasing the likelihood of predation, and the incidence of
parasitism by botflies is higher along roads (Robertson, 2007). Overwinter mortality is
also higher on roads due to cooler soil temperature during the winter, higher rates of
predation and parasitism, and increased soil compaction (Gummer et al., 1997; Teucher,
2007).
In Canada, Ord’s kangaroo rat is classified as endangered by the Committee on the
Status of Endangered Wildlife in Canada (COSEWIC). The species occupies limited and
declining habitat with a small spatial extent. Population fluctuations and isolation
from other populations further jeopardize the species’ persistence. In addition to federal
protection across the species’ range, kangaroo rats also have provincial protection as an
endangered species under the Alberta Wildlife Act. In Saskatchewan, Ord’s kangaroo rat
is recognized as an endangered species sensitive to habitat change by government
research groups (Nielsen, 2007), but no provincial legislation specifically designed to
protect the species was found.
1.3 Study Area
This study includes portions of southeastern Alberta and southwestern
Saskatchewan, encompassing the region defined by historical records of kangaroo rats
from west of Medicine Hat, AB to Swift Current, SK. The Trans-Canada Highway
serves as the southern boundary and the Red Deer River bounds the extent to the north.
Several dune fields fall into the study area, including the Middle Sand Hills and Great
Sand Hills (Figure 1.1).
In Alberta, the majority of the kangaroo rat population is within the Middle Sand
Hills, most of which falls within Canadian Forces Base (CFB) Suffield. The Base
includes the Suffield National Wildlife Area (SNWA), where training activities are
prohibited and access is limited. The majority of active sand dunes occupied by
kangaroo rats are within the protected SNWA rather than the main training area of the
Base. CFB Suffield represents almost 458 km² of virtually intact prairie (Department of
National Defense, 2012). Twenty species designated as “at risk” by COSEWIC have
been recorded on the SNWA (Environment Canada, 2012).
Figure 1.1. The study area is shown with the dune fields identified. CFB
Suffield is shown as a hollow outline. Records of kangaroo rats within a
dune field are indicated by the stippled pattern and dune fields without any
evidence of kangaroo rats are shown with cross-hatching.
The soil material that formed the sand dunes was deposited initially by glaciation
in the Holocene era, and subsequently eroded and activated by wind into sand dune
complexes (Hugenholtz and Wolfe, 2005a). The deposits in the study area have
undergone periods of active development and stabilization (Hugenholtz and Wolfe,
2005a). Recently, dunes in the Great Sand Hills and surrounding areas have been rapidly
stabilizing (Hugenholtz and Wolfe, 2005b). In recent decades, warmer, wetter, and
longer growing seasons have promoted vegetative recolonization, which leads to plant
root stabilization of the sand (Hugenholtz and Wolfe, 2005b). Stabilizing sand dunes
reduces the amount of habitat available to kangaroo rats (COSEWIC, 2006).
1.4 Study Objectives
Nearly 15 years of research has provided information about the population and
biology of Ord’s kangaroo rats in Canada, but only for the portion of the range that falls
within Alberta (Gummer, 1997b; Heinrichs et al., 2010; Heinrichs et al., 2008; Podgurny,
2004; Robertson, 2007; Teucher, 2007). Preliminary data about the population in
Saskatchewan were collected by Kenny (1989). Since then, a grazing study has used
Ord’s kangaroo rat as an indicator of habitat quality in the Great Sand Hills, but not as a
focal species (Nielsen, 2007).
The current study seeks to extrapolate habitat models developed for Ord's
kangaroo rats in Alberta into the Saskatchewan portion of the Canadian range. Previous
models derived from resource selection functions (RSFs) developed for Alberta have
successfully identified kangaroo rat habitat (Bender et al., 2010;
Heinrichs et al., 2008; Koenig, 2010; Podgurny, 2004). This study will use an RSF
model (see Chapter 2 for background on RSFs) and variables similar to those in previous
models (Bender et al., 2010; Heinrichs et al., 2008; Koenig, 2010; Podgurny, 2004) to
develop a habitat model which can be extrapolated across the entire Canadian range of
kangaroo rats.
Several of the most recent habitat models use nearly identical sets of variables.
Two reports prepared for the Alberta Species at Risk Series included exposed sand and
partially vegetated soil (Bender et al., 2010; Heinrichs et al., 2008). Exposed sand and
partially vegetated soils were identified by supervised classification with Indian Remote
Sensing Systems (IRS) panchromatic imagery at 5 m spatial resolution. Aeolian sand and
river layers were digitized using a combination of IRS and Système Pour l’Observation
de la Terre (SPOT) imagery (Bender et al., 2010; Heinrichs et al., 2008). For each
pixel in the study area, the Euclidean distance to the nearest feature of each type
(exposed soil, a river, etc.) was calculated. A log transformation was performed on
each distance layer.
Slope and elevation were derived from a digital elevation model (DEM) (Land Processes
Distributed Active Archive Center, 2012). Koenig (2010) used all of these variables in
her kangaroo rat habitat model, but used SPOT-4 and -5 satellite imagery to derive the
exposed sand and partially vegetated soils layers. The spatial resolution of the spectral
bands was 20 m. All of the previous models used a resource selection function (RSF)
approach to identify potential kangaroo rat habitat in Alberta (Bender et al., 2010;
Heinrichs et al., 2008; Koenig, 2010).
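The distance layers described above can be illustrated with a short sketch. It is not the GIS workflow used in the cited reports. It is a brute-force Python example that assumes a small binary raster (the hypothetical `feature_mask`), a hypothetical cell size, and a natural-log transform with a +1 m offset so that cells lying on a feature do not produce the log of zero. The offset, the function name, and the toy raster are all assumptions made for illustration.

```python
import math

def log_distance_layer(feature_mask, cell_size_m):
    """Return ln(distance + 1) from each cell to the nearest feature cell.

    feature_mask: list of rows, truthy where the feature (e.g. exposed sand) occurs.
    cell_size_m:  hypothetical cell size in metres.
    The +1 m offset is an assumption, not something stated in the source.
    """
    feature_cells = [(r, c)
                     for r, row in enumerate(feature_mask)
                     for c, is_feature in enumerate(row) if is_feature]
    if not feature_cells:
        raise ValueError("feature_mask contains no feature cells")

    layer = []
    for r, row in enumerate(feature_mask):
        out_row = []
        for c in range(len(row)):
            # Brute-force Euclidean distance to the nearest feature cell.
            nearest = min(math.hypot(r - fr, c - fc) for fr, fc in feature_cells)
            out_row.append(math.log(nearest * cell_size_m + 1.0))
        layer.append(out_row)
    return layer

# Toy 2 x 3 raster with one sand cell in the top-left corner, 20 m cells.
sand_mask = [[1, 0, 0],
             [0, 0, 0]]
log_dist_sand = log_distance_layer(sand_mask, 20.0)
```

The brute-force search is only practical for very small rasters; a GIS package or a dedicated distance-transform routine would be used for a full study area.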
Most of the data layers had already been assembled for Alberta. Equivalent road and
river layers had to be obtained from provincial sources in Saskatchewan, while the open
sand and partially vegetated soil layers had to be derived for the entire range. While the
previous Alberta habitat models used IRS-1C imagery, these data were not available for
Saskatchewan; therefore, new imagery was necessary for an extrapolated habitat model.
The imagery for this study had to be available across the entire study area with moderate
to high spatial and spectral resolution. SPOT imagery was chosen because suitable
images from across the range were readily available. Other imagery sources lacked either
the spatial or spectral resolution, were unavailable for the entire study area, or charged a
fee for use of the imagery. The spatial resolution of SPOT is moderate, but sufficient for
this species, as demonstrated in Koenig’s (2010) habitat model.
Once the habitat model has been updated and extrapolated, this study will then
attempt to estimate the area of potential habitat. Potential habitat will be identified from
model surfaces by a probability threshold separating potential habitat from non-habitat.
Potential habitat can be related to population by estimating the number of kangaroo rat
home ranges that could occupy the area of potential habitat. This assumes complete
occupancy and can be considered a habitat capacity if habitat were the sole limiting
factor. The habitat capacity estimates will therefore be reduced to a population estimate
based on a previous population estimate made for Alberta (Gummer, 1997b). The
estimated size and location of potential habitat can be used to guide future conservation
efforts by focusing survey efforts into areas of high habitat potential. By using predictive
habitat modelling for areas with similar environmental factors, conservation
organizations could save the time and cost of extensive surveys by prioritizing areas of
highest habitat potential.
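The chain of reasoning above (threshold, area, home ranges, population) can be sketched in a few lines of Python. Every number below is a hypothetical placeholder rather than a value from this study: the threshold, the cell area, the home range size, and the occupancy ratio are assumptions, as are the function names. The sketch also pools all habitat cells together and therefore ignores whether individual patches are large enough to hold a home range.

```python
def habitat_capacity(prob_surface, threshold, cell_area_m2, home_range_m2):
    """Count cells at or above the threshold and convert the area to home ranges."""
    n_cells = sum(1 for row in prob_surface for p in row if p >= threshold)
    area_m2 = n_cells * cell_area_m2
    # Complete occupancy is assumed: one animal per whole home range.
    capacity = int(area_m2 // home_range_m2)
    return area_m2, capacity

def population_estimate(capacity, occupancy_ratio):
    """Scale habitat capacity by a ratio taken from an earlier population estimate."""
    return round(capacity * occupancy_ratio)

# Toy 2 x 2 relative-probability surface with hypothetical parameters.
surface = [[0.9, 0.2],
           [0.6, 0.7]]
area, capacity = habitat_capacity(surface, threshold=0.5,
                                  cell_area_m2=400.0, home_range_m2=500.0)
```

Keeping the capacity and the scaled population estimate as separate steps mirrors the distinction drawn in the text between habitat capacity under complete occupancy and a realistic population size.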
Chapter Two: Habitat Distribution Model
2.1 Introduction
Empirical habitat models allow researchers to combine the power of expert
opinion to select biologically relevant environmental variables with the power of
statistics to determine empirical relationships between environmental variables and a
species’ distribution (Guisan and Thuiller, 2005; Wiens et al., 2008). The relationships
can then be used in a geographic information system (GIS) to create a visual output and,
potentially, a predictive probability surface indicating likelihood of occurrence for a
species (Manly et al., 2002). This technique has been used to identify habitat for the
Ord’s kangaroo rat, a species with specific habitat requirements (Bender et al., 2010;
Heinrichs et al., 2008; Koenig, 2010). These studies have focused on identifying habitat
in Alberta using resource selection functions based on logistic regression (Manly et al.,
2002).
Habitat models for kangaroo rats by Heinrichs et al. (2008) and Bender et al.
(2010) were used to identify critical habitat for kangaroo rats in Alberta. Heinrichs
(2010) later used a critical habitat model in a population simulation to estimate
population extinction risk and the population dynamics of Ord’s kangaroo rats.
Another recent habitat model by Koenig (2010) examined seasonal variation in
habitat use by Ord’s kangaroo rats in Alberta. By creating habitat models from
population data collected from different parts of the season, Koenig (2010) was able to
quantify temporal variation in habitat selection by kangaroo rats.
This study builds on previous habitat modelling work for kangaroo rats in
Alberta. The previous model (Bender et al., 2010) was updated for Alberta and then
extrapolated into Saskatchewan to identify potential habitat for kangaroo rats there, the
only other Canadian province where this species occurs.
2.1.1 Resource Selection Functions
A resource selection function (RSF) is an empirical model that relates the
distribution of a species’ resources to the geographic distribution of the species (Boyce
and McDonald, 1999). Resources may be related to variables that influence the
distribution of a species, such as land cover, food availability, or mates. Often, these
models are fitted using logistic regression, which allows for the use of binary
data, such as presence-absence or use-available data.
Using presence-absence data in an RSF model yields the probability of species
occurrence for the resulting model surface, also known as a resource selection probability
function (RSPF). While there is value in knowing the probability of species occurrence
for a given location, presence-absence data are often unavailable. Data on species
presence can usually be obtained through surveys, as is the case for kangaroo rats, but
data on species absence are often difficult to collect. A location that is initially recorded as
unoccupied (i.e., absent) might later be found to be occupied (i.e., present), depending on
survey intensity, time of the survey, or experience of the surveyors and the accuracy of
their records. For these reasons, absence data were not used to create this model.
Instead, a use-available design was used, where locations observed to have been used or
occupied by an individual are compared to a random sample of background points
representing the habitat available to the species (Manly et al., 2002). This method results
in a relative probability of occurrence (Boyce et al., 2002).
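A use-available comparison of this kind can be sketched as follows. The example is illustrative only and does not reproduce the model fitted in this study: it assumes a single covariate (a hypothetical log distance to sand), invented toy values, and a plain gradient-ascent fit written out by hand so the block has no dependencies. Under a use-available design the fitted intercept is commonly dropped and the exponential form exp(b1 * x) is reported as a relative score, which is the convention assumed here.

```python
import math

def fit_logistic(xs, ys, learning_rate=0.1, steps=5000):
    """Fit y ~ logistic(b0 + b1 * x) by gradient ascent on the log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad0 = grad1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            grad0 += y - p
            grad1 += (y - p) * x
        b0 += learning_rate * grad0 / n
        b1 += learning_rate * grad1 / n
    return b0, b1

def relative_probability(x, b1):
    """Exponential RSF score; the intercept is dropped, so values are only relative."""
    return math.exp(b1 * x)

# Hypothetical log-distance-to-sand values (not thesis data).
used = [0.1, 0.3, 0.2, 0.5, 0.4]             # locations with kangaroo rat records
available = [1.0, 2.0, 1.5, 0.4, 2.5, 3.0]   # random background points
xs = used + available
ys = [1] * len(used) + [0] * len(available)
b0, b1 = fit_logistic(xs, ys)
```

With these toy values the used points lie closer to sand than the background points, so the fitted coefficient is negative and the relative score declines with distance.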
In a use-available design the available points are randomly selected from across
the study area, so there is potential for those points to fall on locations actually
used by the animal (Keating and Cherry, 2004). This overlap is known as
contamination, and the proportion of available points that fall on locations used by the
animal is generally unknown. Johnson et al. (2006) tested the effects of contamination by introducing
known proportions of used points into the selection of available points. Contaminating
the pool of available points with used points caused relatively little error even with large
amounts (>50%) of contamination (Johnson et al., 2006). For species that occupy a
relatively small proportion of available habitat, the risk of contamination is reduced
because available points are less likely to fall within occupied habitat.
2.1.2 Validation
A model must first be free from typographical, mathematical, and programming
errors before it can be validated. The validation process assesses a model’s relationship to
the real world and how well it emulates reality (Oreskes et al., 1994). When a model is
both free from errors and returns values similar to those measured and observed, it is
considered to be valid (Oreskes et al., 1994). Validation is required to determine the
quality of a model. Most studies claim to have performed some type of validation (e.g.
Colchero et al., 2011; Hough and Dieter, 2009; Jędrzejewski et al., 2008). However,
some researchers argue that a model cannot be fully validated because it over-simplifies a
complex open system (e.g. Konikow and Bredehoeft, 1992).
The simplest method of RSF validation is to construct a confusion matrix using a
threshold probability value (often 0.5) for suitable habitat, and then assess whether
validation points were correctly or incorrectly classified as habitat or non-habitat
(Campbell, 2008). However, the need to assign a single threshold can limit the value of
this statistical method. The receiver operating characteristic (ROC) curve is a method
drawn from medical science that avoids setting a single threshold by iterating over a range of
thresholds (Fawcett, 2006). The ROC curve uses an independent data set to compare the
true positive and false positive rates without the need to set an arbitrary threshold
(Boyce et al., 2002; Fawcett, 2006; Fielding and Bell, 1997). The proportion of true
positive validation points is plotted against the proportion of false positive points. A
good or representative model maximizes the true positive rate while minimizing the false
positive rate (Morrison, 2005). When plotted, this creates a curve. The area under the
curve (AUC) represents the probability that a randomly chosen positive point will
have a higher model value than a randomly chosen negative point. When the AUC is greater than
0.5, the model performs better than random chance (Fawcett, 2006; Fielding and Bell,
1997).
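The threshold sweep and the AUC described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used in this study: the function names and the example values are hypothetical, and the area is computed with the trapezoidal rule.

```python
def roc_points(pos_scores, neg_scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    # Every distinct model value is tried as a threshold, from strictest to loosest.
    thresholds = sorted(set(pos_scores) | set(neg_scores), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        tpr = sum(s >= t for s in pos_scores) / len(pos_scores)
        fpr = sum(s >= t for s in neg_scores) / len(neg_scores)
        points.append((fpr, tpr))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical model values at validation (positive) and random (negative) points.
used = [0.9, 0.8, 0.7, 0.4]
random_pts = [0.6, 0.3, 0.2, 0.1]
curve = roc_points(used, random_pts)
print(auc(curve))
```

In this toy example 15 of the 16 possible positive-negative pairs are ordered correctly, which is the pairwise interpretation of the AUC given in the text.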
When independent data are unavailable or datasets are small, k-fold cross-validation
is commonly applied to assess the internal consistency of the data. The data
are divided into k folds, and the model is trained k times, each time on k-1 of the folds. Each fold is excluded from training
the model once and is used to validate the model trained on the remaining folds. An independent dataset
is not required. Each fold is assessed using χ2 goodness-of-fit tests and linear regression
analysis (Johnson et al., 2006; Wiens et al., 2008). The χ2
goodness-of-fit test is expected to be non-significant, meaning that the observed distribution of
probability values is nearly the same as the distribution of values across the model
surface. Similarly, the linear regression should have a slope near one indicating that the
observed and expected probability values are equal (Johnson et al., 2006; Wiens et al.,
2008).
2.2 Methods
2.2.1 Model Layers
The RSF model for Ord’s kangaroo rat occurrence was developed using a variety
of environmental and topographical layers deemed to be important components of
kangaroo rat habitat, both by inclusion in previous habitat models and as predictors of
habitat. These layers were largely drawn from previous models developed for kangaroo
rats in Alberta (Bender et al., 2010; Heinrichs et al., 2008; Koenig, 2010; Podgurny,
2004). A summary of the layers used in this model is provided in Table 2.1.
Table 2.1. The spatial layers developed for this model were drawn from several
sources in order to cover the entire study area.
Variable                                   Source
Log distance to open sand                  Supervised classification of SPOT imagery
Log distance to partially vegetated soil   Supervised classification of SPOT imagery
Log distance to rivers                     Hydrology shapefiles, acquired through GeoBase for each province, limited to major rivers and merged into single layer
Log distance to roads                      Provincial road networks by GeoBase and GeoSask, limited to unpaved roads and merged into single layer
Slope                                      DEM
Elevation                                  DEM
Local Relief                               DEM
Previous habitat models for Ord’s kangaroo rats used IRS-1C imagery for
information on soils and land cover, but this imagery was only available for Alberta.
Because this study includes habitat in Saskatchewan as well, another source of satellite
imagery was required that covered the entire species’ range and had sufficiently fine
spatial resolution to identify sand dunes. SPOT-4 and SPOT-5 imagery was used to provide
information on soils and land cover for the entire study area (Government of Canada,
2007). The spatial resolution of the SPOT imagery was 20 m, as opposed to the 5 m resolution used in
the previous model (Bender et al., 2010).
The features of interest, mainly sand dunes, tend to be relatively small features on
the landscape. Upon visual examination of the SPOT images, the larger sand dunes were
visible at the coarser resolution, and even relatively narrow sandy roads appeared on the
imagery. Pan-sharpening using the panchromatic band to increase the spatial resolution
to 10 m was used in Koenig’s (2010) model, but the loss of spectral information would
have complicated the supervised classifications performed with spectral bands as part of
this study. SPOT imagery included four bands: green, red, near infrared, and short-wave
infrared. Although the SPOT image tiles were captured on different days, exposed sand
varies little throughout the year in both size and spectral signature. However, temporal
variation could affect the appearance of vegetated areas. Throughout the year, percent
vegetative cover and greenness vary as plants emerge, mature, and senesce.
Because the SPOT imagery was used to identify bare sandy soils, which vary little
throughout the year, and nearby sparsely vegetated soils, the imagery was used
despite the temporal variations.
The SPOT image tiles were mosaicked using PCI Orthoengine (PCI Geomatica,
2009). First, the radiometric differences between tiles were corrected with colour
balancing, which matched the colour contrast of each tile (PCI Geomatica, 2009). This
reduced abrupt changes in colour or brightness where image tiles were joined and
improved the performance of the supervised classification by allowing training areas to
be applied across the entire range rather than to a single tile. PCI Orthoengine
determined the placement of cutlines between image tiles (PCI Geomatica, 2009). The
join areas were chosen to reduce the visibility of the seam at each cutline, based on the
similarity in texture and tone from one tile to another and the uniformity of features in the
overlap area between each tile. Visual inspection of the area around the cutlines showed
minor differences in brightness between adjoining tiles. The classification (described
below) did not show any variation between the image tiles. Lines where the
classification abruptly changed near an area of overlap or a “patchwork” appearance of
classified areas between tiles would indicate a flaw in colour balancing or in joining the
image tiles. These indicators were not seen during visual inspection or after the
supervised classification.
The mosaic of SPOT satellite imagery was classified using a supervised,
minimum distance method to locate open sand and partially vegetated soils. Minimum
distance classification measures the spectral distance between an unknown pixel and the
mean vector of the training samples (Perumal and Bhaskaran, 2010). The unknown pixel
is then assigned to the nearest class. Maximum distance thresholds based on standard
deviation can be used to limit the classification (ENVI, 2012). This method of
classification was chosen because it correctly identified partially vegetated soils more
frequently than the maximum likelihood method. Ambiguous pixels were not forced into
a class and the search radius could be limited to refine the classes.
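The minimum distance rule with a standard deviation threshold can be illustrated with a short Python sketch. It is not the ENVI implementation: the function names, the training values, and the per-band form of the threshold (a pixel is rejected if any band lies more than the allowed number of standard deviations from the class mean) are assumptions made for illustration only.

```python
import math
from statistics import mean, stdev


def class_stats(samples):
    """Per-band mean and standard deviation for a list of training pixels."""
    bands = list(zip(*samples))
    return [mean(b) for b in bands], [stdev(b) for b in bands]


def classify(pixel, stats, max_sd):
    """Assign a pixel to the nearest class mean, or None if it fails the threshold.

    stats maps class name -> (means, sds); max_sd maps class name -> the number
    of standard deviations allowed in each band (an assumed per-band rule).
    """
    nearest = min(stats, key=lambda name: math.dist(pixel, stats[name][0]))
    means, sds = stats[nearest]
    within = all(abs(p - m) <= max_sd[nearest] * s
                 for p, m, s in zip(pixel, means, sds))
    return nearest if within else None


# Hypothetical four-band training pixels (green, red, near infrared, short-wave infrared).
stats = {
    "open_sand": class_stats([(200, 210, 220, 230), (204, 214, 224, 234), (196, 206, 216, 226)]),
    "partial_veg": class_stats([(120, 110, 160, 150), (130, 120, 170, 160), (110, 100, 150, 140)]),
}
limits = {"open_sand": 1.0, "partial_veg": 1.5}

print(classify((202, 211, 219, 231), stats, limits))
print(classify((60, 50, 200, 90), stats, limits))
```

Pixels that are closest to a class mean but fall outside its threshold are returned as None, which mirrors the behaviour described above of leaving ambiguous pixels unclassified.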
Training areas for exposed sand were selected from active sand dunes in Alberta
and Saskatchewan. The training sites were selected from multiple image tiles across the
study area. Several sandy roads in Alberta were also included because they are
occasionally inhabited by kangaroo rats. Partially vegetated areas were trained on areas
adjacent to active sand dunes. These areas were chosen because they were known to
have a mixture of vegetation and open sand used by kangaroo rats for foraging. Partially
vegetated soils are a difficult class to quantify on satellite imagery. The class itself
consists of a mixture of vegetation and sandy soil on a gradient. Drawing boundaries can
be arbitrary; therefore the vegetated areas around known sand dunes were used for
training areas.
Only two classes, exposed sand and partially vegetated soils, were used during the
classification because the classifier could be limited by a threshold standard deviation.
With the threshold set, pixels were assigned to a class only if they fell within the set
number of standard deviations of the class mean; pixels outside that range were left unclassified.
Exposed sandy soils were limited to one standard deviation from the mean of the training
areas. The spectral signature of sand is distinctive enough to allow for a very low
threshold distance. The resulting image successfully identified dunes and sandy roads
when subjected to visual examination. Partially vegetated soils were classified using a
threshold at 1.5 standard deviations. This limited the results to areas near sand dunes and
potentially stabilized dunes. The classified images were assessed using points acquired
during a preliminary vegetation survey (Appendix I).
Classifying partially vegetated soils was difficult because they combine the
spectral signatures of vegetation and bare soil. Different combinations of spectral bands,
including principal component bands and the normalized difference vegetation index
(NDVI), were incorporated in the partially vegetated soils classification. Partially
vegetated soils were classified using three band sets: the original four spectral bands included with SPOT
imagery, the original spectral bands plus NDVI, and the original spectral bands plus the
principal component bands. The resulting classifications were used to train initial RSF
models. The classification using only the four bands included with SPOT imagery was
consistently ranked higher using Akaike’s Information Criterion (AIC; Burnham et al.,
2011; see below) than the other band combinations and was the only one retained for
further model-building. The classified pixels were converted into polygons. The
distance from each pixel to the nearest exposed sand and to the nearest partially vegetated soil was then calculated, and each
distance was transformed using a logarithmic function to accentuate the effect of
distance.
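A log-distance layer of this kind can be sketched on a small grid as follows. The sketch makes several assumptions that are not stated in the text: a natural logarithm is used, one is added to the distance so that cells lying on a feature (distance zero) remain defined, and a brute-force search replaces the GIS distance tool. The function name and grid are hypothetical.

```python
import math


def log_distance(grid, cell_size=20.0):
    """Natural log of (distance + 1) from each cell to the nearest feature cell.

    grid is a list of rows; truthy cells mark the feature (e.g. classified sand).
    Distances are Euclidean, measured between cell centres, in map units.
    The grid must contain at least one feature cell.
    """
    features = [(r, c) for r, row in enumerate(grid)
                for c, v in enumerate(row) if v]
    out = []
    for r, row in enumerate(grid):
        out_row = []
        for c, _ in enumerate(row):
            d = min(math.hypot(r - fr, c - fc) for fr, fc in features) * cell_size
            out_row.append(math.log(d + 1.0))
        out.append(out_row)
    return out


# Hypothetical 2 x 3 grid with a single sand cell in the upper-left corner.
grid = [[1, 0, 0],
        [0, 0, 0]]
surface = log_distance(grid)
print(surface)
```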
Agricultural lands used for crop production were excluded from the classification.
The exposed soil and emerging plants confused the classification of the native open sand
and partially vegetated areas. Crop fields represent poor kangaroo rat habitat (Teucher,
2007). The dense vegetation late in the growing season does not provide foraging areas
for kangaroo rats. Frequent soil agitation (e.g., harrowing) disturbs any established
burrows and increases mortality (COSEWIC, 2006). Therefore, the fields were digitized
into a mask, a layer used to exclude some areas from a classification or layer, and applied
to the imagery during classification. This mask was applied to all layers.
The Euclidean distance to rivers was included as a variable in previous Alberta
habitat models because sand dunes have drifted toward the South Saskatchewan River
and Red Deer River (Bender et al., 2010). While rivers do not directly provide kangaroo
rat habitat, habitat in Alberta tends to occur close to a river (Bender et al., 2010). This
caused proximity to rivers to be a good predictor of kangaroo rat habitat despite the
variable’s limited biological relevance. Hydrological layers depicting the Red Deer River
and South Saskatchewan River, the major rivers within the study area, were obtained and
clipped to the geographic extent of the study (Government of Canada, 2007). The Spatial
Analyst tool in ArcGIS 9.3 was used to calculate the shortest distance from each pixel in
the study area to a river (ESRI, 2011). From the resulting raster surface, the log
transformation of the distance was obtained, similar to previous models (Bender et al.,
2010; Koenig, 2010).
River valley slopes in the study area tend to be steep, eroded, and only sparsely
vegetated (Bender et al., 2010). These slopes were often confused for active sand dunes
during the classification of exposed sandy soils in this study. However, kangaroo rats
generally do not occupy river valley slopes because the soil is unsuitable for burrows and
food resources are unavailable, except when the soils are very sandy (Bender et al.,
2010), as occurs when aeolian deposits are blown into the river valleys. Therefore, a
river valley mask was built using the Riparian Topography Toolbox developed to
calculate flood height (Dilts, 2009). This tool calculates the height above the river
(HAR) and simulates flooding by filling the area around the river like a bathtub. Simulating
a flood allowed the area within the river valley to be determined empirically, rather than
by subjectively digitizing the edges of each river valley by hand. The HAR was calculated
for the Red Deer and South Saskatchewan Rivers. The flood height was first set at 10 m
for both rivers. While this was effective for the Red Deer River, the steep valley slopes
of the South Saskatchewan River limited the extent of simulated flooding and failed to
exclude the areas confused for sand dunes. Therefore, the flood height for the South
Saskatchewan River was increased to 40 m, which excluded the slopes most often
incorrectly identified as sand dunes. The masks were combined and applied to the model
layers.
A layer for sandy roads was previously derived for Alberta for use in habitat
models and population surveys (Bender et al., 2007). This layer was digitized to include
sandy roads within CFB Suffield (Dzenkiw and Bender, 2009). A layer for roads in
Saskatchewan was obtained from the Government of Saskatchewan (Saskatchewan
Ministry of Highways and Infrastructure, 2009). This file included all major highways,
secondary and tertiary roads. Only unpaved dirt or gravel roads serve as potential habitat
for kangaroo rats, so the layer was limited to include only these two features. The two
provincial road layers were merged and the Spatial Analyst tool in ArcGIS 9.3 was used
to calculate the distance of each pixel in the study areas to the nearest road. A
logarithmic transformation was performed on the raster surface as was done in previous
models (Bender et al., 2010; Koenig, 2010).
The topographical layers were derived using an Advanced Spaceborne Thermal
Emission and Reflection Radiometer (ASTER) DEM (Land Processes Distributed Active
Archive Center, 2012). The DEM tiles were merged to cover the study area. When slope
was first derived for the DEM, a pattern of faint vertical and horizontal stripes was visible
across the surface. The reason for this pattern was not evident in the original DEM. To
remedy this situation, the Fill tool from the Hydrology toolbox was used to identify any
large and sudden drops in elevation, which may have resulted from
measurement errors, and to fill them (ESRI, 2011).
Absolute elevation was used in previous models to locate high points, such as the
tops of sand dunes, which kangaroo rats tend to favour (Gummer, 1997b). Because of the
larger extent and regional variations within the study area, absolute elevation does not
adequately describe local high points. Therefore, a measurement of local relief was
included in this model by using a moving window analysis of elevation. A square
window was used starting at 50 cells (1000 m) and increasing by increments of 25 cells to
175 cells (3500 m). The maximum elevation within the window was identified and then
assigned to the focal cell. The layer of local maxima was divided by the measured
elevation at each cell to derive a measurement of local relief.
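The moving-window calculation described above can be sketched in plain Python. The function name and the small elevation grid are hypothetical, and two details are assumed because the text does not specify them: the window is taken to have an odd width so that it is centred on the focal cell, and windows are truncated at the edge of the grid.

```python
def local_relief(dem, window):
    """Focal maximum within a square window divided by the cell's own elevation.

    dem is a list of rows of elevations; window is the window width in cells
    (assumed odd). Windows are truncated at the grid edges.
    """
    half = window // 2
    rows, cols = len(dem), len(dem[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            neighbourhood = [dem[i][j]
                             for i in range(max(0, r - half), min(rows, r + half + 1))
                             for j in range(max(0, c - half), min(cols, c + half + 1))]
            out_row.append(max(neighbourhood) / dem[r][c])
        out.append(out_row)
    return out


# Hypothetical 3 x 3 elevation grid (metres) with a local high point in the centre.
dem = [[700.0, 710.0, 705.0],
       [702.0, 720.0, 708.0],
       [701.0, 704.0, 703.0]]
relief = local_relief(dem, window=3)
print(relief)
```

With this ratio, a cell that is itself the highest point in its window takes the value one, and all other cells take values greater than one.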
2.2.2 Training Data for the Habitat Model
Location points for training the RSF model were drawn from population data
collected as part of an ongoing population survey in CFB Suffield and nearby areas of the
Middle Sand Hills (Alberta Ord's Kangaroo Rat Recovery Team, 2005). Animals were
caught, tagged, and their locations were recorded following a standardized monitoring
protocol (Bender et al., 2007). Animals were captured during foot surveys on natural
dune features or spotted by truck on sandy roads (Bender et al., 2007). Duplicate records
from recaptured individuals were reduced to the most recent location point to avoid
pseudoreplication. Points from the 2009 and 2010 summer surveys were used to train
candidate models. When combined, these years contained 440 unique location points.
Additional survey locations from the 2011 season were used for independent validation.
These data were collected in Alberta and used to validate the Alberta portion of the
model (Figure 2.1).
Figure 2.1. The training points were collected mostly
in the Middle Sand Hills region and in the Suffield
NWA. CFB Suffield is shown with the outline and the
dune fields in the area are labelled.
Randomly generated points were used to provide data about available habitat to
compare with the training points in Alberta. A ratio of three available points to each
presence point was chosen, and 1500 points were generated within the survey area in
Alberta. The number of randomly generated available points could affect the model’s
performance. In other studies where small numbers of presence points were available,
performance improved with more available points used to build the model (Stokland et
al., 2011). With more presence points, as was the case with this model, a smaller
proportion of available points could be used (Lobo and Tognelli, 2011). Any points that
occurred within 50 m of a kangaroo rat record were removed in order to reduce the
inclusion of used points with the available points.
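The generation of available points and the 50 m exclusion rule can be sketched as follows. This is an illustration only: the function name is hypothetical, the survey area is simplified to a rectangular bounding box, and the coordinates are invented values in metres.

```python
import math
import random


def available_points(n, bounds, used, min_dist=50.0, seed=0):
    """Draw n random points within bounds, then drop any within min_dist of a used point."""
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = bounds
    candidates = [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
                  for _ in range(n)]
    # Keep only candidates farther than min_dist from every used location.
    return [p for p in candidates
            if all(math.dist(p, u) > min_dist for u in used)]


# Hypothetical capture locations and a 5 km x 5 km survey area.
used = [(1000.0, 1000.0), (2500.0, 2500.0)]
points = available_points(1500, (0.0, 0.0, 5000.0, 5000.0), used)
print(len(points))
```

Because points are removed after they are drawn, the final number of available points can be smaller than the number generated.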
2.2.3 Model Selection and Validation
AIC model selection was used to choose the model that best explains the variation
in the training data with the fewest variables from the range of candidate models
(Burnham et al., 2011). Corrected AIC (AICc) was used to rank models, and their
weights were calculated. Models are ranked by the lowest AICc value and compared to
each other based on the difference between AICc, or the delta (Δ) value. The delta values
can also be used to calculate the relative weight of the model (wi), which indicates the weight
of evidence that a model is the best model. For example, a model with a weight of
0.4 is twice as likely to be the best model as another model with a weight of 0.2
(Burnham et al., 2011). The top-ranked model was used to generate a probability surface,
which was then validated.
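The small-sample correction and the Akaike weights can be written out explicitly. The sketch below uses the standard formulas, AICc = AIC + 2K(K+1)/(n-K-1) and wi = exp(-Δi/2) / Σ exp(-Δj/2). The function names are hypothetical, and the example uses only the two lowest AICc values reported in Table 2.2, on the assumption that the remaining candidate models carry negligible weight.

```python
import math


def aicc(log_likelihood, k, n):
    """Corrected AIC for a model with k parameters fitted to n observations."""
    aic = -2.0 * log_likelihood + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)


def akaike_weights(aicc_values):
    """Return (deltas, weights) for a set of candidate models."""
    best = min(aicc_values)
    deltas = [a - best for a in aicc_values]
    rel_likelihood = [math.exp(-d / 2.0) for d in deltas]
    total = sum(rel_likelihood)
    return deltas, [r / total for r in rel_likelihood]


# AICc values of the two top-ranked models in Table 2.2.
deltas, weights = akaike_weights([777.96, 781.71])
print(deltas, weights)
```

The ratio of two weights gives the evidence ratio between the corresponding models, which is the sense in which a weight of 0.4 represents twice the support of a weight of 0.2.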
An ROC curve was generated using the 2011 Alberta population data and 500
randomly generated points (MedCalc, 2012). The relative probabilities were extracted
from the RSF model surface. Using the 2011 survey locations as true positive
locations and the randomly generated points as negative locations, the ROC curve plotted the rate of true positive against the rate of false positive
predictions at each probability threshold value (Fawcett, 2006).
The training data from 2009 and 2010 were randomly assigned to four groups or
bins. The model was trained withholding one bin iteratively. The resulting surfaces were
classified into equal-area, ordinal bins as suggested for rare species (Wiens et al., 2008).
The utilization was calculated based on the midpoint of each bin and area within the bin.
The ordinal bin for each withheld point was extracted. The χ2 goodness-of-fit was
calculated comparing the expected values calculated by the utilization to the withheld
data. The expected and observed data were regressed and plotted in a scatter plot. If the
expected and observed data are identical, the slope of the linear regression is equal to one
(Johnson et al., 2006), so the slope observed in this study was assessed for its deviation
from one. Significant difference from one indicates imperfect model performance. In
addition, a Spearman rank correlation was calculated for all folds to supplement the
linear regression (Johnson et al., 2006).
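The per-fold calculations described above can be sketched as follows, assuming the utilization of each bin is its midpoint value multiplied by its area and divided by the sum of those products over all bins, and that the expected count is the utilization multiplied by the number of withheld points. The function names and the three-bin example are hypothetical.

```python
def expected_counts(midpoints, areas, n_withheld):
    """Expected number of withheld points per bin from area-weighted utilization."""
    weighted = [w * a for w, a in zip(midpoints, areas)]
    total = sum(weighted)
    return [n_withheld * wa / total for wa in weighted]


def chi_square(observed, expected):
    """Goodness-of-fit statistic: sum over bins of (o - e)^2 / e."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def ols_slope(x, y):
    """Slope of the least-squares regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical fold: three equal-area bins and 20 withheld points.
midpoints = [0.1, 0.5, 0.9]
areas = [100.0, 100.0, 100.0]
observed = [1, 5, 14]
expected = expected_counts(midpoints, areas, sum(observed))
print(chi_square(observed, expected), ols_slope(expected, observed))
```

A slope above one, as in this toy example, arises when the withheld points are concentrated in the high-probability bins more strongly than the utilization predicts.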
2.3 Results
In this study, the delta value for the AIC rankings between the first two models is
higher than three (Table 2.2), indicating strong support for the first model (Burnham et
al., 2011). The relative weight (wi) of the top model further indicates support for the top
model. The top two models share the same variables: the local relief with a window size
of 125 pixels, and the log distances to open sand, partially vegetated soils, rivers, and
roads. The second-highest ranked model is similar to the top model but lacks the slope
variable. Several other top-ranked models share the same distance variables, but the
window size for local relief differs among them.
Table 2.2. All models run were ranked by AICc. Delta indicates the difference
between AICc scores, and wi is a measurement of the relative value of the model.
Model                        K   AICc       Δ          wi
OS, PV, RI, RO, SP, LR125    7   777.96     0          0.8672
OS, PV, RI, RO, LR125        6   781.71     3.752      0.1328
OS, PV, RI, RO, SP, LR175    7   813.00     35.04      2.134E-08
OS, PV, RI, RO, SP, LR150    7   857.52     79.56      4.591E-18
OS, PV, RI, RO, SP, LR50     7   859.20     81.24      1.982E-18
OS, PV, RI, RO, SP, EL       6   860.87     82.91      8.588E-19
OS, PV, RI, RO, SP, LR100    7   861.18     83.22      7.364E-19
OS, PV, RI, RO, SP, LR75     7   862.97     85.01      3.009E-19
OS, PV, RI, RO, LR150        6   883.60     105.6      9.957E-24
OS, PV, RI, RO, LR50         6   885.27     107.3      4.320E-24
OS, PV, RI, RO, EL           6   888.50     110.5      8.593E-25
OS, PV, RI, RO               5   888.57     110.6      8.318E-25
OS, PV, RI, RO, LR100        6   889.08     111.1      6.429E-25
OS, PV, RI, RO, LR75         6   890.49     112.5      3.178E-25
OS, PV, RO, EL               5   1192.0     414.1      1.052E-90
OS, PV, RO, SP, LR175        6   1259.5     481.5      2.417E-105
OS, PV, RO, LR175            5   1272.8     494.9      2.998E-108
OS, PV, RO, SP, LR125        6   1310.1     532.1      2.487E-116
OS, PV, RO, SP, LR100        6   1335.5     557.5      7.589E-122
OS, PV, RO, SP, LR50         6   1336.3     558.3      5.087E-122
OS, PV, RO, LR125            5   1340.1     562.2      7.292E-123
OS, PV, RO, SP, LR75         6   1340.5     562.5      6.229E-123
OS, PV, RO, LR100            5   1349.1     571.2      8.101E-125
OS, PV, RO, LR50             5   1350.3     572.4      4.446E-125
OS, PV, RO, LR150            5   1352.4     574.5      1.556E-125
OS, PV, RO                   4   1353.9     576.0      7.395E-126
OS, PV, RO, LR75             5   1354.238   576.2774   6.325E-126
OS, PV                       3   1502.715   724.7549   3.6273E-158
Model variables: OS = log distance to open sand, PV = log distance to partially
vegetated soil, RI = log distance to river, RO = log distance to unpaved roads, SP =
slope, EL = elevation, LR = local relief with window size in pixels indicated after
the abbreviation.
The model coefficients are given in Table 2.3.
Table 2.3. The model coefficients are given for the top-ranked model.
Variable               Estimate    Std. Error   p
(Intercept)            -20.3455    4.81828      <0.001
Log Dist. Open Sand    -1.07526    0.09851      <0.001
Log Dist. Part Veg     -0.77332    0.16359      <0.001
Log Dist. River        -4.87005    0.30456      <0.001
Log Dist. Road         -1.25886    0.1123       <0.001
Slope                  -0.07543    0.03147      <0.001
Local Relief (125)     44.35467    5.4061       <0.001
The model coefficients were applied to the appropriate layers to produce Figure
2.2, where darker areas indicate high relative probability of occurrence. While only river
valley slopes are apparent when the entire extent is viewed, sandy roads and individual
dunes are visible at finer scales (Figure 2.3).
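As an illustration of how coefficients are applied to layer values at a single cell, the sketch below uses the estimates from Table 2.3 and assumes a logistic form, exp(η) / (1 + exp(η)), where η is the intercept plus the sum of each coefficient multiplied by its layer value. The logistic form, the function and key names, and the layer values are assumptions made for this example; the layer values are invented and their units are not taken from the study.

```python
import math

# Coefficient estimates from Table 2.3 (keys are hypothetical layer names).
COEFFICIENTS = {
    "log_dist_open_sand": -1.07526,
    "log_dist_part_veg": -0.77332,
    "log_dist_river": -4.87005,
    "log_dist_road": -1.25886,
    "slope": -0.07543,
    "local_relief_125": 44.35467,
}
INTERCEPT = -20.3455


def relative_probability(cell):
    """Logistic transform of the linear predictor for one cell (assumed model form)."""
    eta = INTERCEPT + sum(COEFFICIENTS[name] * value for name, value in cell.items())
    # Numerically stable logistic function.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


# Two hypothetical cells that differ only in their log distance to open sand.
near_sand = {"log_dist_open_sand": 1.0, "log_dist_part_veg": 1.0, "log_dist_river": 3.0,
             "log_dist_road": 2.0, "slope": 2.0, "local_relief_125": 1.01}
far_from_sand = dict(near_sand, log_dist_open_sand=3.0)

print(relative_probability(near_sand), relative_probability(far_from_sand))
```

Because the coefficient for log distance to open sand is negative, the cell farther from sand receives the lower value when all other layers are held equal.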
Figure 2.2. The probability surface derived using the top-ranked model and applied across the entire Canadian
range of Ord’s kangaroo rats.
The ROC curve (Figure 2.4) shows a definite difference from the line representing
random chance. It falls entirely in the upper-left of the graph indicating performance better than
expected by random chance, with an AUC score of 0.938. This indicates that the model will
assign higher probabilities to true occurrences than to random locations (Fawcett, 2006). Values
above 0.7 for AUC indicate excellent model performance (Fawcett, 2006).
Figure 2.3. A zoomed-in view of the probability surface for a region of CFB
Suffield shows more distinction between individual sand dunes.
[Figure: ROC Curve for the Model Selected by AICc. Sensitivity (y-axis) plotted against 100-Specificity (x-axis).]
Figure 2.4. The ROC curve shows excellent model
performance with the curve falling well above the line
indicating random chance. The AUC was 0.938.
The results from the k-fold cross-validation are shown, in part, in Table 2.4 and Figure
2.5. All folds showed a significant χ2 goodness-of-fit value, when a non-significant statistic
would be expected (Johnson et al., 2006). The validation subsets had more data points in bins of high probability than would
be expected from the distribution of probability on the model surface. The linear regressions had
slopes significantly higher than one; a slope of one indicates consistent model performance
(Johnson et al., 2006). However, the Spearman rank correlation was greater than 0.5 for all
folds and greater than 0.75 for three of the folds. The first fold had a Spearman rank correlation
of 0.830, indicating that the ranks of the expected and observed bin values are correlated. Only
the results for the first fold are provided here; results for the
remaining folds can be found in Appendix II.
Probability          Area     wiAi       Utilization   Expected   Observed   (o-e)2/e
0.00 - 0.007817      339817   1327.325   0.00531       0.5841     0          0.5841
0.007817 - 0.01563   167821   1967.366   0.007871      0.8658     0          0.8658
0.01563 - 0.02735    153697   3302.718   0.01321       1.453      0          1.453
0.02735 - 0.04688    152821   5671.722   0.02269       2.496      0          2.496
0.04688 - 0.07422    133097   7992.541   0.03198       3.517      0          3.517
0.07422 - 0.1211     136838   13363.46   0.05346       5.881      0          5.881
0.1211 - 0.2031      128140   20772.9    0.08310       9.141      0          9.141
0.2031 - 0.3555      126803   35415.7    0.1417        15.59      1          13.65
0.3555 - 0.6250      122729   60165.62   0.2407        26.48      9          11.54
0.6250 - 1.000       123056   99982.08   0.39999       43.99      100        71.28
Sum                                                                          120.40
p < 0.001
Table 2.4. The χ2 statistic for the first fold is significant (p<0.001). Most of the
validation points are in the highest probability bin.
[Figure: Linear Regression for Fold One. Observed (y-axis) plotted against Expected (x-axis); fitted line y = 1.9029x - 9.9314, R² = 0.7386.]
Figure 2.5. The slope of the linear regression for the first fold is
significantly different from zero and one.
2.4 Discussion
The RSF models ranked highest by AIC include the majority of variables used in
previous models (Bender et al., 2010; Heinrichs et al., 2008; Koenig, 2010; Podgurny, 2004). The
variables distance to sand, distance to partially vegetated soil, distance to roads, distance to
rivers, and the topographical variables of local relief and slope were included in the top model.
The second-ranked model included all of these variables except slope. Several other high-ranked
models included the same variables, but differed in the scale of variables included, with
only slight differences in the scale of local relief terms and the inclusion of slope in the model.
Models with the local relief terms were consistently ranked above those with absolute elevation,
indicating that this variable may be useful to include in future habitat models. The top models
included local relief measurements at two scales, 125 cells and 175 cells, which equates to about
a 2.5 km and 3.5 km square, respectively. Measuring the local relief accentuates areas with high
amounts of variation within the moving window. The highest points within the window are
favoured (the model coefficient is positive). The high areas of the river valley slopes combined
with the relatively large model coefficient assigned to the log-distance-to-river variable caused
the river valleys to be assigned high probabilities of occurrence.
Areas of potential kangaroo rat habitat in Saskatchewan appear to have lower probability
of occurrence values than similar habitat in Alberta. Lower probability of occurrence values in
Saskatchewan could result from training the model in Alberta and then extrapolating it beyond
the initial study area. Alternatively, lower values in Saskatchewan could be due to differences in
the stabilization of dunes (which was not adequately represented in the model) across the
species’ range. Regional changes, such as the amount of rainfall leading to greater vegetation
density (i.e., dune stabilization) in Saskatchewan may have contributed to the relatively lower
probabilities in Saskatchewan when compared to Alberta.
The combination of vegetation and soil in the spectral signature of partial vegetation
makes the classification of this land cover class difficult. The spectral signature of these pixels
includes characteristics of both vegetation and soil, often in different proportions. Determining
what proportion of each mixed pixel is covered by soil or vegetation is extremely difficult.
Improvements in identifying partially vegetated soils may aid in identifying potential habitat in
Saskatchewan. Ideally, new imagery could be found that has both high spatial and spectral
resolution. A classification scheme including a soil-adjusted vegetation index (SAVI) could
greatly improve the identification of partially vegetated soils.
The high probability along river valleys, seen in the model surface, does not correspond
to known patterns of habitat in Alberta. Extrapolating the model seems to have accentuated the
river valleys while underestimating the potential habitat of active sand dunes in Saskatchewan.
The distance-to-rivers model variable may be the cause of the inconsistent identification of
active sand dunes. The distance-to-rivers variable was included in previous habitat models for
Alberta, and was subsequently included in the current study. However, the influence of rivers
might be a regional artifact rather than an indicator of kangaroo rat habitat. In Alberta, active
sand dunes were blown toward the Red Deer and South Saskatchewan Rivers, resulting in high-
quality habitat near rivers. The correlation of kangaroo rat habitat and proximity to river does
not appear to be consistent throughout the study area. The influence of the river valleys will be
examined in Chapter Three where the area of potential habitat is calculated and an estimate made
of the kangaroo rat population in Alberta and Saskatchewan.
The two validation techniques used in this study led to contradictory conclusions. The
data fit the model well according to an ROC curve. The ROC curve had a very high AUC value
which denotes high confidence in the model. However, when k-fold cross-validation was
applied, the model failed to provide the results indicative of a consistent model. The χ2
goodness-of-fit summary statistic was significant, despite the expectation that it would be non-
significant. The linear regressions had slopes greater than one; a slope of one indicates a
consistent model. The withheld subsets used to validate each fold tended to have more points in
the higher probability bins, causing the results to deviate from the expected distribution and
pulling the linear regressions toward steeper slopes. Hough and Dieter (2009) encountered a
similar issue in validating their model for flying squirrels. The linear regression for their flying
squirrel model had a slope greater than one and a significant value for χ2 goodness-of-fit tests.
Hough and Dieter (2009) determined that their model was still suitable, but future improvements
in the variables chosen and data from across the range could improve the results.
A potential explanation for the discrepancy between ROC and k-fold cross-validation is
the habitat specificity of kangaroo rats. K-fold cross-validation methods were developed on
generalist species, such as elk, that are more flexible in their use of habitat than are kangaroo rats
(Boyce et al., 2002; Johnson et al., 2006; Wiens et al., 2008). These species might occur in areas
of lower probability of occurrence more frequently than a habitat specialist such as kangaroo
rats. Based on the performance of the ROC curve, as well as the pattern of deviation toward high
probability bins in the k-fold cross-validation, the model was retained for further
examination.
Overall, the modifications made to the previous habitat model (Bender et al., 2010) were
successful. Satellite imagery from SPOT was able to identify open sandy soil even at coarser
spatial resolution. While partially vegetated soils were more difficult to classify, the top model
still had an excellent ROC result with an AUC value of 0.938. Given the high AUC value and
Spearman rank correlation, the top model will be carried forward to estimate the
population in Saskatchewan.
Chapter Three: Habitat and Population Estimation
3.1 Introduction
A habitat model surface indicates where a species is likely to occur. Boyce and
McDonald (1999) explored ways to relate two habitat modelling approaches, an RSPF and an
RSF, to population size. In an RSPF (e.g., presence-absence design), population size is assumed
to be proportional to the probability of occurrence (Boyce and McDonald, 1999). The
probabilities across the RSPF surface can be directly related to the probability of use (Boyce and
McDonald, 1999). An RSF (e.g., use-available design used by this study) approach does not
necessarily produce a model that is proportional to the probability of occurrence, and therefore, it
is not meaningful to relate the probabilities across the surface directly to population size (Keating
and Cherry, 2004). A different approach must be taken when estimating population size from an
RSF model. A researcher may assign a threshold probability of occurrence that defines suitable
habitat, but this threshold must be arbitrarily selected (Liu et al., 2005). Any area assigned a
probability of occurrence at or above the threshold value could serve as potential habitat. If the
population density is known, it can be applied to the area of potential habitat and the population
size can be estimated.
Setting a meaningful probability threshold can be difficult. Often, a
probability of 0.5 is chosen as a default threshold to delineate potential habitat from the
surrounding landscape (Liu et al., 2005). Even when a researcher attempts to relate the
probability threshold to the habitat model, the final selection remains arbitrary. In a recent
population study of Ord's kangaroo rats in Alberta (Heinrichs, 2010), the habitat threshold was
arbitrarily determined by the probability that defined enough habitat to include two-thirds of
presence points used in validation.
This chapter will attempt to coarsely estimate Ord's kangaroo rat population size
throughout Saskatchewan by identifying potential habitat and then extrapolating from known
kangaroo rat densities in Alberta. Areas identified as potential habitat will be divided by the
average kangaroo rat densities for habitat in dune fields, road margins, and other potential
habitat, as observed in Alberta (Gummer and Robertson, 2003b; Heinrichs et al., 2008). Each
home range is assumed to be occupied by a single, reproductively mature animal (Gummer and
Robertson, 2003a). While juveniles share territory with their mothers during the first weeks of
life, the average home range was calculated for adults and the resulting population estimate will
represent the number of adult kangaroo rats in the study area.
Simply dividing potential habitat by the average kangaroo rat densities assumes full
saturation of kangaroo rats in the potential habitat, resulting in an estimate of habitat capacity.
However, complete occupancy is not seen in population surveys, as other factors (e.g., predation)
also limit the kangaroo rat population, rather than just the amount of available habitat. In order
to estimate the population, the proportion of habitats occupied must be estimated. The
population estimate recorded in Alberta (Gummer, 1997b) was divided by the habitat capacity
estimate in Alberta in order to estimate the proportion of occupied habitat.
Previous population estimates for Ord’s kangaroo rats in Alberta were based on mark-
recapture surveys between 1994 and 1997. The estimate for Alberta was about 3,000 animals
with upper and lower confidence limits of 4,160 and 2,180, respectively (Gummer, 1997b).
Kenny (1989) based a separate estimate for Saskatchewan on population density and estimated
that 1,370 kangaroo rats occupied the province. The estimated population size for Saskatchewan
and the Canadian total will be a starting point from which further studies can build and the scope
of conservation efforts can be assessed.
3.2 Methods
3.2.1 An Alternate Model Surface
The top model selected by AICc in Chapter Two calculated high probability values along
river valley slopes, while assigning lower probability of occurrence values to active sand dunes
in Saskatchewan. One of the variables included in the model was the log distance-to-rivers
variable, which likely caused the emphasis on river valley slopes. The distance-to-rivers variable
is included in all of the top models derived in Chapter Two. The inclusion of the distance-to-
rivers variable is likely an artifact and largely driven by the proximity of dune fields in Alberta to
the South Saskatchewan and Red Deer Rivers rather than as a biologically relevant indicator of
habitat. The majority of active sand dunes in Alberta are located near rivers. The dunes
provide kangaroo rat habitat, but the nearby river valley slopes, although also characterized
by sparsely vegetated soils, typically do not. River valley slopes in the
study area consisted of bare soil and steep slopes, both of which were positively associated with
kangaroo rat presence, resulting in high probabilities assigned to these areas. However, the river
valley slopes generally consist of finer clays and silts, which produce a compact soil inadequate
for kangaroo rat burrows. River valley slopes are adequate habitat for kangaroo rats in only a
few, unique cases. These areas consist of aeolian dunes blown toward and over the river slopes
in the Suffield National Wildlife Area. Yet the influence of the distance-to-rivers variable
appears in the consistently high probability of occurrence assigned to river valley slopes, despite
the low habitat potential provided.
The top model selected by AICc in Chapter Two emphasized the river valley slopes and
may result in biased estimates of potential habitat and kangaroo rat population in Saskatchewan.
An alternate model was calculated using the same variables as the top model selected by AICc
but excluding the distance-to-rivers variable. By removing the distance-to-rivers variable, the
extent of the bias included in the top model was explored in the prediction made by the alternate
model. The same calculations for potential habitat, habitat capacity, and population size
estimates were performed with the alternate model as were performed with the top model. The
population estimates were compared to those from the top model. An ROC curve was used to assess the
performance of the alternate model.
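As a minimal sketch of how an AUC value for an ROC curve can be obtained, the following Python function uses the rank-based (Mann-Whitney) formulation: the AUC is the proportion of presence/background pairs in which the presence point has the higher modelled probability, with ties counted as one half. The function name and the example probabilities are hypothetical, and this is not necessarily the procedure or software used for the ROC analysis in this study.

```python
def auc_from_scores(presence_scores, background_scores):
    """Rank-based AUC: share of (presence, background) pairs ranked correctly."""
    if not presence_scores or not background_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0   # presence point ranked above background point
            elif p == b:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(presence_scores) * len(background_scores))


# Hypothetical probabilities of occurrence, for illustration only.
presence = [0.9, 0.8, 0.4]
background = [0.7, 0.3, 0.2, 0.1]
print(round(auc_from_scores(presence, background), 3))
```

An AUC of 0.5 corresponds to a model that ranks presence and background points no better than chance, which is the baseline against which the reported values are compared.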
3.2.2 Estimating Population Size
The threshold values that define potential habitat were set following Heinrichs
(2010), such that two-thirds of the validation data (i.e., 2011 Alberta population survey capture
points) were included in the potential habitat. The probability value at each validation point was
extracted from the model surfaces calculated for the top model selected by AICc and for the alternate
model. For each surface, the probabilities of occurrence at the validation points were sorted by value, and the
threshold was set at the probability at or above which two-thirds of the validation points occurred,
leaving one-third of the points below the threshold.
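The threshold rule described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name and the nine example probabilities are hypothetical, and the handling of tied values and of sample sizes not divisible by three (rounding the number of included points up) is an assumption, since the text does not specify it.

```python
import math
from fractions import Fraction


def inclusion_threshold(probabilities, fraction_included=Fraction(2, 3)):
    """Return the probability at or above which the requested share of points falls."""
    if not probabilities:
        raise ValueError("at least one validation point is required")
    ranked = sorted(probabilities, reverse=True)
    # Exact rational arithmetic avoids floating-point error in the point count.
    n_included = math.ceil(fraction_included * len(ranked))
    return ranked[n_included - 1]


# Hypothetical probabilities extracted at nine validation points.
validation_probabilities = [0.91, 0.12, 0.78, 0.66, 0.40, 0.85, 0.30, 0.72, 0.55]
threshold = inclusion_threshold(validation_probabilities)
n_included = sum(p >= threshold for p in validation_probabilities)
print(threshold, n_included)  # 0.55 6
```

With nine points, six fall at or above the returned threshold and three fall below it, matching the two-thirds rule.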
A pixel was classified as habitat if the probability value was greater than or equal to the
threshold value or as non-habitat if the probability value was lower than the threshold value. The
potential habitat may or may not be used as habitat by kangaroo rats. Habitat within 20 m of the
river mask was excluded from the total population and area calculations to reduce the influence
of river valley slopes falsely identified as potential kangaroo rat habitat.
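The pixel classification and river exclusion can be illustrated with a small Python sketch. The grid values, the function names, and the precomputed `near_river` mask (marking pixels within 20 m of the river mask) are hypothetical, and the 20 m pixel size is an assumption used only to convert pixel counts to area.

```python
def classify_habitat(prob_grid, threshold, near_river):
    """Mark a pixel as habitat if its probability is at or above the threshold
    and it does not lie within the river exclusion zone."""
    return [
        [(p >= threshold) and not excluded for p, excluded in zip(p_row, r_row)]
        for p_row, r_row in zip(prob_grid, near_river)
    ]


def habitat_area_m2(habitat_grid, pixel_size_m):
    """Total habitat area: number of habitat pixels times the area of one pixel."""
    n_pixels = sum(cell for row in habitat_grid for cell in row)
    return n_pixels * pixel_size_m ** 2


# Hypothetical 2 x 3 probability surface and river exclusion mask.
probabilities = [[0.70, 0.64, 0.90],
                 [0.65, 0.10, 0.80]]
near_river = [[False, False, True],
              [False, False, False]]

habitat = classify_habitat(probabilities, 0.645, near_river)
print(habitat_area_m2(habitat, 20))  # 1200
```

The high-probability pixel in the top-right corner is dropped because it falls in the exclusion zone, which mirrors the intent of removing river valley slopes from the totals.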
To further refine the potential habitat estimate, the areas identified by the thresholds were
classified by habitat type as dune fields, road margin, or other potential habitat. General dune
field polygons were used to identify habitat within active dune fields. The population density
used to calculate the maximum occupancy of kangaroo rats was one adult kangaroo rat per 1,750
m² in active dune fields, as determined by Gummer and Robertson (2003b) through radio-collar
tracking and used by Heinrichs (2010) in population simulations. Potential habitat within 40 m
of roads was classified as road margin and a density of one adult kangaroo rat per 3,115 m² was
used to calculate the maximum population (Heinrichs, 2010). A density of one adult kangaroo
rat per 2,905 m² was used for all other areas of potential habitat, corresponding to the territory
size used for stabilized sand dunes by Heinrichs (2010). Habitat area was multiplied by average
population density in each habitat type. The result was rounded down to the nearest whole
number and reported as the habitat capacity of kangaroo rats by province and habitat type.
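The capacity calculation can be sketched as follows, using the three densities given above (one adult per 1,750 m², 3,115 m², and 2,905 m²). The patch areas are hypothetical, and applying the rounding to each habitat patch before summing is an assumption made for this sketch; the text states only that results were rounded down, so the sketch is not expected to reproduce the reported capacities.

```python
import math

# Area required per adult kangaroo rat (m^2), by habitat type, as given in the text.
AREA_PER_ADULT_M2 = {
    "dune_field": 1750,
    "road_margin": 3115,
    "other": 2905,
}


def patch_capacity(area_m2, habitat_type):
    """Whole number of adults a single patch can hold, rounded down."""
    return math.floor(area_m2 / AREA_PER_ADULT_M2[habitat_type])


def habitat_capacity(patch_areas_m2, habitat_type):
    """Sum of per-patch capacities (per-patch rounding is an assumption)."""
    return sum(patch_capacity(a, habitat_type) for a in patch_areas_m2)


# Hypothetical patch areas in m^2, for illustration only.
print(habitat_capacity([3500.0, 1000.0, 5300.0], "dune_field"))  # 5
print(habitat_capacity([6230.0, 3000.0], "road_margin"))         # 2
print(habitat_capacity([11620.0], "other"))                      # 4
```

Under this assumption, a patch smaller than one home range (such as the 1,000 m² dune patch) adds area but no capacity.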
To determine a population size from the maximum habitat capacity, an estimated
proportion of occupancy is needed. Therefore, Gummer’s (1997b) population estimate for
Alberta was divided by the maximum habitat capacity calculated for kangaroo rats for all habitat
types in Alberta resulting in an estimated proportion of occupancy. The resulting proportion of
occupancy was multiplied by the habitat capacity for kangaroo rats calculated for Saskatchewan
to obtain a coarse estimate of the kangaroo rat population in Saskatchewan.
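The occupancy scaling can be checked with a short Python sketch that uses the "All Types" habitat capacities reported in Tables 3.1 to 3.3 and the value of 4,160 applied as Gummer's (1997b) Alberta estimate. The function and variable names are hypothetical. Rounding the scaled estimate down is an assumption for this step; it reproduces the Saskatchewan estimates reported in the Results.

```python
ALBERTA_POPULATION = 4160  # value applied as Gummer's (1997b) estimate in the Results

# "All Types" habitat capacities from Tables 3.1, 3.2 and 3.3.
CAPACITY = {
    "top model, threshold 0.645": {"AB": 26678, "SK": 15153},
    "alternate model, threshold 0.260": {"AB": 173738, "SK": 196988},
    "alternate model, threshold 0.50": {"AB": 75903, "SK": 66118},
}


def scaled_population(target_capacity, reference_capacity, reference_population):
    """Scale a habitat capacity by the proportion of occupancy
    (reference_population / reference_capacity), rounding down."""
    return (target_capacity * reference_population) // reference_capacity


for label, cap in CAPACITY.items():
    sk = scaled_population(cap["SK"], cap["AB"], ALBERTA_POPULATION)
    print(f"{label}: SK = {sk}, total = {sk + ALBERTA_POPULATION}")
# top model, threshold 0.645: SK = 2362, total = 6522
# alternate model, threshold 0.260: SK = 4716, total = 8876
# alternate model, threshold 0.50: SK = 3623, total = 7783
```

For the top model, the implied proportion of occupancy is 4,160 / 26,678, or roughly 16% of the calculated habitat capacity in Alberta.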
3.3 Results
3.3.1 The Alternate RSF Model
The ROC curve for the alternate model surface developed without the distance-to-rivers
variable had an AUC of 0.818 (Figure 3.1). This is lower than the AUC of the model selected by
AICc (AUC = 0.938), but well above the AUC of 0.5 expected for random chance and still considered
excellent performance. The surface is shown in Figure 3.2.
[Figure 3.1 plot: ROC Curve for the Alternate Model; x-axis: 100-Specificity (0 to 100); y-axis: Sensitivity (0 to 100)]
Figure 3.1. The ROC curve for the model developed without the
distance-to-rivers variable has an AUC of 0.818. While lower than the
AUC for the top model, it is still well above the AUC of 0.5 indicating the
model performs better than random chance.
Figure 3.2. The model developed without the distance-to-rivers variable shows greater variation in
probability values across the study area. Several active dunes and roads are visible as darker areas of high
probability. Note that an agricultural mask was applied to the model surface, explaining the large areas
of white in the figure.
3.3.2 Habitat and Population Estimates for the Top Model by AICc
The probability threshold for the top model based on AICc was 0.645. Using the
threshold probability value of 0.645, potential habitat was classified from the top model
surface. The total area of potential habitat estimated across the study area for the top
model was 139.8 km² (Table 3.1). This included habitat in dune fields, road margins, and
potential habitat beyond dunes and roads. Assuming that all habitat types were
fully occupied, the model predicted over 41,000 kangaroo rats across the study area (Table 3.1).
The habitat capacity within dune fields was 799, which is much lower than estimates made
through mark-recapture studies (Gummer, 1997b; Kenny, 1989). When the habitat
capacity was scaled by the proportion of occupancy derived from
Gummer’s (1997b) population estimate of 4,160 for Alberta, the population estimate for
Saskatchewan was 2,362. The total population estimate for the entire study area was
6,522.
Table 3.1. The area of potential habitat (given in km²) and the habitat capacity
identified by the top model, organized by habitat type and province. The total
area of potential habitat included habitat that was smaller than a single home range.
The estimated area of potential habitat in dune fields is much lower than might be
expected from the population surveys performed in Alberta.
                      AB                    SK                  Total
Habitat Type      Area  Capacity      Area  Capacity      Area  Capacity
Dune Field       1.700       735    0.1824        64     1.882       799
Other            16.48     3,693     9.596     2,255     26.07     5,948
Road             71.17    22,250     40.64    12,834     111.8    35,084
All Types        89.35    26,678     50.42    15,153     139.8    41,831
3.3.3 Habitat and Population Estimates for the Alternate Model
The threshold for the alternate model was 0.260. The alternate model surface using
a threshold at 0.260 estimated a total of 2,097 km² of potential habitat for the entire study
area (Table 3.2). The potential habitat within dune fields totalled 32.8 km², of which
18.50 km² was in Saskatchewan. The model
predicted 13,350 kangaroo rats if dune fields were fully occupied (Table 3.2). The scaled
population estimate for Saskatchewan was 4,716 when the habitat capacity was reduced by the
proportion of occupancy based on Gummer’s (1997b) population estimate of 4,160
kangaroo rats in Alberta. The total population for the study area was predicted to be
8,876.
Table 3.2. The area of potential habitat (given in km²) for the alternate model was
totalled. The habitat capacity was calculated from the totals by province and habitat
type. The predicted potential habitat and habitat capacity were much higher in
Saskatchewan than the predictions made by the top model.
                      AB                    SK                  Total
Habitat Type      Area  Capacity      Area  Capacity      Area  Capacity
Dune Field       14.28     4,104     18.50     9,246      32.8    13,350
Other            301.8    92,302     129.1    36,785     430.9   129,087
Road             1,099    77,332     534.9   150,957     1,634   228,289
All Types        1,415   173,738     682.4   196,988     2,097   370,726
An arbitrary threshold was set at 0.50 and applied to the alternate model surface.
A threshold of 0.50 is often chosen as a default threshold for suitable habitat (Liu et al.,
2005). When the probability threshold was increased from 0.260 to 0.50, the total area of
potential habitat across the study area was reduced to 719.4 km², but the area within dune
fields was reduced less, to 21.59 km² (Table 3.3). The total area was reduced by
66% when the threshold was increased, but the area within dune fields was reduced by
only 34%. The model predicted 142,021 kangaroo rats across the study area if all habitat types
were fully occupied (Table 3.3). When the habitat capacity was reduced by the
proportion of occupancy based on Gummer’s (1997b) population estimate for Alberta,
3,623 kangaroo rats were predicted for Saskatchewan. The total population estimate for
kangaroo rats across the study area was 7,783.
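The reported reductions can be checked against the table totals with a small Python sketch. The function name is hypothetical; the areas are the "Total" values for all habitat types and for dune fields from Tables 3.2 and 3.3.

```python
def percent_reduction(before, after):
    """Percentage by which a quantity falls from `before` to `after`."""
    return 100.0 * (before - after) / before


# Total potential habitat (km^2) at thresholds 0.260 and 0.50.
print(round(percent_reduction(2097, 719.4)))   # 66
# Potential habitat within dune fields (km^2) at the same thresholds.
print(round(percent_reduction(32.8, 21.59)))   # 34
```

The comparison shows that raising the threshold removed roughly twice as large a share of the total potential habitat as of the dune field habitat.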
Table 3.3. When the probability threshold for potential habitat was increased and
applied to the alternate model surface, the overall potential habitat and habitat
capacity were reduced from the estimates made above. The area and habitat
capacity in road margin and other habitat types fell more than in dune fields.
                      AB                    SK                  Total
Habitat Type      Area  Capacity      Area  Capacity      Area  Capacity
Dune Field       15.49     7,651     6.098     2,858     21.59    10,509
Other            186.0    50,965     78.50    20,056     264.5    71,021
Road             175.4    17,287     258.0    43,204     433.3    60,491
All Types        376.8    75,903     342.6    66,118     719.4   142,021
3.4 Discussion
This chapter explored the use of model surfaces extrapolated from Alberta to
estimate the potential population size of Ord’s kangaroo rats in Saskatchewan. Using two
models and three probability thresholds, the top model selected by AICc at one threshold
and an alternate model that excluded the distance-to-rivers variable at two thresholds,
several estimates of the potential habitat available, the corresponding habitat capacity, and
population size were made.
The top-ranked habitat model from Chapter Two predicted only 0.1824 km² of
potential high-quality habitat associated with dune fields in Saskatchewan. A grazing
study by Nielsen (2007) predicted 5.2 km² of kangaroo rat habitat in the Great Sand Hills
region alone. The focus of the grazing study was the impact of cattle and oil and gas
development in the Great Sand Hills region on multiple native species. Nielsen’s (2007)
model included road margins and other marginal kangaroo rat habitat. The discrepancy
between the habitat predicted in dune fields by the habitat model and Nielsen’s (2007)
grazing study warrants further investigation through intensive habitat modelling for
kangaroo rats, or field surveys.
The top model surface in this study was dominated by the rivers, with most of the
predicted habitat occurring close to the South Saskatchewan and Red Deer Rivers. In
Alberta, many of the dune fields occur near rivers, which is why the variable was
included in previous habitat models (Bender et al., 2010). However, the majority of sand
dunes in Saskatchewan were located away from the rivers and were assigned low
probabilities of occurrence by the top model. The low potential habitat and habitat
capacities predicted in Saskatchewan dune fields indicated a need to improve the top
model when it was extrapolated over the entire study area.
For this reason, a second model surface was developed, identical to the top model,
but excluding the distance-to-rivers variable. The threshold probability of 0.260 for
potential habitat was very low, likely because the alternate model had a slightly more
homogeneous surface than the top model, and it added a large amount of potential
habitat to the total estimates. The reduction in area and capacity estimates made by the
alternate model when the threshold was increased from 0.260 to 0.50 was much smaller
in dune fields than in the other habitat types. Habitat in dune fields was reduced by 34%
while total habitat fell by 66% when the threshold was increased. The proportionally
small change in potential habitat in dune fields indicates that the probability of
occurrence across most of those areas exceeds both the 0.260 and 0.50 thresholds. In areas with
lower probability of occurrence, i.e., road margins, increasing the threshold had a larger
impact on the area and habitat capacity estimates.
The population estimates ranged from 6,522 to 8,876 kangaroo rats throughout
the study area. Because the population was held constant in Alberta, the variation is
driven by population estimates in Saskatchewan ranging from 2,362 to 4,716. If Kenny’s
(1989) estimate of 1,370 kangaroo rats in Saskatchewan reflects the true population
of the province, the top model came closest, estimating the population at
2,362 kangaroo rats. Due to the coarse nature of Kenny’s (1989) estimate and the
tendency for the top model to select for river valley slopes rather than active sand dunes,
the alternate model might provide a better estimate of the population size and habitat
capacity of kangaroo rats throughout the study area. With the threshold set at 0.50, the
alternate model provides a moderate estimate of 7,783 kangaroo rats and a maximum
habitat capacity of 10,509 kangaroo rats in dune fields. The alternate model with the
threshold set at 0.260 estimated the highest population of kangaroo rats and very high
habitat capacity. This may not reflect the actual probability threshold for potential
habitat, and therefore, might not provide the best estimates.
Gummer’s (1997b) Alberta population estimate included kangaroo rats in natural
dune habitat, road margins, and some semi-stabilized dunes together in a single
population estimate. When the habitat capacity was reduced proportionally to a
population estimate in Saskatchewan, only the total could be reported. A population
survey examining the proportion of the kangaroo rat population in dune fields, road
margins, and other habitat could refine the population estimates made in this study.
The extrapolated portion of the habitat model in Saskatchewan was not validated.
At this time, independent records of kangaroo rat location in Saskatchewan are not
available. The independent data collected as part of this study (see Chapter Two)
included only five points near kangaroo rat burrows. More points over a larger area
would be needed to perform a thorough validation of the extrapolated portion of the model.
The alternate model surface and the top model surface can provide guidance on areas to
focus on during population surveys, but should still be validated. Extensive population
monitoring across the study area similar to that in Alberta would be difficult in terms of
time and access to sites on private land in Saskatchewan. A combination of broad scale
surveys across the model surface and at multiple probabilities of occurrence could be
used to validate both the top model and the alternate model. In addition, an intensive
survey of a single region, such as the Great Sand Hills, could be used to assess the habitat
model and the population seen in Saskatchewan. High spatial resolution data is available
for the region and could be used to recreate the model produced in this study. Population
surveys in the Great Sand Hills could provide training data and an estimate of population
based on mark-recapture results.
The extrapolated model surfaces for both the top model and the alternate model
can serve as a starting point for future studies of kangaroo rats in Saskatchewan. They
provide rough and unvalidated estimates of population and habitat capacity in the
province. The model surfaces could be used to target areas with high probabilities of
occurrence for further population surveys or conservation work. From the starting points
of the models developed in this study, researchers can prioritize field work and develop
improved models. The most accurate model surface can be identified and used to
improve another range-wide habitat model that incorporates training data for Alberta and
Saskatchewan. Developments such as the local relief measurement and the inclusion of only
biologically relevant variables in the model will improve the performance of future
habitat models used to guide conservation efforts for Ord’s kangaroo rats in Canada.
Chapter Four: Summary and General Conclusions
This study built upon previous work modelling habitat for Ord’s kangaroo rats in
Alberta. With the intensive study of the kangaroo rat population in Alberta, much is
known about the habitat requirements and population dynamics in the province.
However, very little is known beyond anecdotal evidence about kangaroo rats in
Saskatchewan. This study sought to extrapolate the knowledge gained from habitat
models and extensive population surveys in Alberta to the entire species’ range in
Canada, including Saskatchewan.
To extrapolate the recent habitat model (Bender et al., 2010), several alterations to
the original model variables were implemented in this study, because several data sources
were only available for Alberta. The previous habitat models in Alberta utilized IRS-1C
imagery, which covered the species’ range in Alberta at 5 m spatial resolution. However,
IRS-1C was only available for Alberta. Therefore, SPOT-4 and SPOT-5 imagery, which was
available for the entire study area, was acquired. While the spatial resolution of SPOT
was 20 m, much larger than most sand dunes and roads, both land cover types were
correctly identified by the supervised classification. However, partially vegetated soils
were more difficult to identify than open sand due to the mixed spectral nature of the land
cover class. Potentially, a SAVI could improve the classification of partially vegetated
soil, but the equations found required a blue band, which SPOT lacks (Campbell, 2008).
Another modification to the model variables was the substitution of absolute
elevation with a measure of local relief. In this study, candidate models that included a
local relief variable were ranked higher by AICc-based model selection than those using
absolute elevation. Local relief emphasized local high points and is recommended for
use in future kangaroo rat habitat models.
The top RSF model selected in Chapter Two performed favourably when assessed
in Alberta using an ROC curve. The ROC curve had an extremely high AUC value of
0.938, indicating excellent model performance, well above that expected by random
chance. However, when k-fold cross-validation was used to assess the model
performance, the habitat model failed in several aspects. Many of the kangaroo rat
locations used for validation fell into the high probability bins used in the χ² goodness-of-fit
test and linear regression. This may reflect the highly specialized habitat requirements of
kangaroo rats: few of the location records used for validation occurred in
areas of lower probability, and more of the validation points were in the very high
probability bins than expected from the distribution of probability across the model
surface. This skewed distribution caused the observed results to deviate from those expected in the χ²
goodness-of-fit test. The linear regression of the expected against the observed
distribution of validation points had a slope higher than one, causing the model to fail one
aspect of the criteria set by k-fold cross-validation (Boyce et al., 2002; Johnson et al.,
2006). This deviation toward higher slope was consistent for each fold. When a
Spearman rank correlation was calculated, the correlation statistic was greater than 0.5
for all folds, indicating positive correlation. Taking into account the high AUC value
from ROC, Spearman rank correlation, and the tendency for the k-fold cross-validation to
err toward higher probabilities (rather than the overall k-fold cross-validation results), the
model was extrapolated into Saskatchewan and used for habitat and population estimates.
In the extrapolated area of Saskatchewan, the probability of occurrence values
calculated for the top model produced in Chapter Two were generally lower than those
for Alberta, especially in dune fields, such as the Great Sand Hills. The areas around
river valleys throughout the study area were assigned especially high probabilities of
occurrence by the model, while sand dunes were barely visible on the probability surface.
River valley slopes rarely serve as kangaroo rat habitat (Bender et al., 2010), but were
identified by the model as the most likely areas to be occupied throughout the study area.
Therefore, a second model surface was developed, identical to the top model, but
excluding the distance-to-rivers variable in order to reduce the emphasis on river valley
slopes. When the alternate model was developed without the distance-to-rivers variable,
the emphasis shifted from river valleys to sand dunes throughout the study area. The
distance-to-rivers variable had a high coefficient in the top model. When the distance-to-
rivers variable was removed, other variables, such as distance to open sand and distance to
partially vegetated soils, gained more influence on the model surface. The distance-to-
rivers variable did not have considerable biological relevance to kangaroo rat habitat. It
was included in previous studies because sand dunes in Alberta occur near rivers, making
it a good predictor of where sand dunes are located in Alberta. In Saskatchewan, the
distance-to-rivers variable and local relief predicted river valley slopes, which do not
provide viable habitat for kangaroo rats.
The alternate model showed excellent performance when assessed by an ROC
curve with an AUC value of 0.818, as compared to the AUC value of 0.938 for the top
model. The locations of sand dunes in the alternate model surface were more visible, but
the probability values were more homogeneous on the model surface than in the top
model.
Potential habitat was estimated from the top model surface at a threshold of 0.645 and
from the alternate model surface at two thresholds, 0.260 and 0.50. A habitat capacity was
calculated from the potential habitat. The habitat capacity in Saskatchewan was then reduced by the
proportion of occupancy calculated for Alberta. The population estimate made by the top
model for the entire study area was 6,522. The population estimate for the alternate
model was 7,783 with the threshold at 0.50 and 8,876 with the threshold at 0.260.
These two estimates can be seen as conservative and liberal estimates made from
the alternate surface. The population estimate made in Saskatchewan by the top model
came closest to Kenny’s (1989) estimate of 1,370 made from population density
observations. However, Kenny’s (1989) estimate was limited in its scope and the top
model had problems correctly identifying the location of active sand dunes in
Saskatchewan. With river valley slopes emphasized and active sand dunes assigned low
probabilities of occurrence, the top model may not accurately predict habitat in
Saskatchewan. The alternate model surface may provide better estimates of the potential
habitat in Saskatchewan. High probabilities of occurrence on the
alternate model surface identified the active sand dunes, whereas the top model assigned its
high probabilities of occurrence to the river valley slopes.
Therefore, the population estimate made from the alternate model surface at a threshold
of 0.50 may be a more accurate estimate. It was more conservative than the population
estimate made with the threshold set at 0.260 and closer to the rough estimate made by
Kenny (1989) and to Gummer’s (1997b) estimate made from mark-recapture studies in
Alberta.
These population estimates and the model surface in Saskatchewan are unvalidated.
A validation scheme involving intensive regional modelling and broad
population surveys was proposed in Chapter Three. A regional habitat model for the
Great Sand Hills should include the improvements made as part of this study, particularly
the measure of local relief. Survey points recording kangaroo rat captures or burrows
from across Saskatchewan and from a variety of predicted probabilities of occurrence
from the top model and alternate model surfaces should be used to validate the model
surfaces. Training data from Saskatchewan could drastically improve the model
performance in Saskatchewan. The extrapolated area of this model would be trained
using the same techniques as used to train the model surface in Alberta.
The extrapolated alternate model with the threshold set at 0.5 in Saskatchewan
and its predicted areas of habitat could be particularly useful as a starting point for future
population surveys and conservation efforts in Saskatchewan, where currently very little
is known about kangaroo rat abundance and geographic distribution. Researchers could
prioritize the areas identified with high probability of occurrence or avoid low probability
areas to reduce some of the cost and time spent in extensive field population surveys.
Extensive population monitoring in Saskatchewan, similar to that currently accomplished
in Alberta, would be difficult in terms of time and access. The portion of the study area
in Saskatchewan is larger than the portion in Alberta. Further, most of the potential
kangaroo rat habitat in Saskatchewan falls on private land. Land owners may be difficult
to contact and reluctant to grant access to private property. If researchers cannot access
large portions of potential habitat, accurate population surveys cannot be made for
Saskatchewan.
The alternate model surface can guide researchers to areas of high probability of
occurrence. While the model surface can be used to prioritize surveys and plan sampling
methods, the model should not be used to make final conservation decisions, such as
legal protection of land. The model needs to be validated in Saskatchewan before it can
be used for more than preliminary work.
Extrapolating the model from Alberta to Saskatchewan highlighted issues not
seen at finer scales. Local relief was substituted for elevation in order to identify local
high points favoured by kangaroo rats. The distance-to-rivers variable emphasized river
valley slopes in Saskatchewan. The variable was included in previous habitat models in
Alberta, but had little biological significance to kangaroo rat habitat. When the variable
was excluded from a model, the model surface identified sand dunes in Saskatchewan
better than the top model. Despite the improved identification of active sand dunes across
the study area, the AUC value calculated from the ROC curve was lower than that of the top
model. The discrepancy between river valley slopes favoured by the top model and
patterns of occupation on active sand dunes observed in reality was identified for
kangaroo rats because of their specific habitat requirements. If a species had more
general habitat requirements or the researchers had less background information about a
species, land cover types incorrectly identified as potential habitat, as occurred in
this study, could be overlooked. Careful selection of biologically relevant variables is
therefore required when building a regional model and extrapolating it beyond its training area.
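The AUC comparison discussed above can be illustrated with a minimal sketch. It assumes AUC is read as the probability that a randomly chosen presence location receives a higher predicted value than a randomly chosen background location (the rank-based interpretation of the area under the ROC curve). The function name and all scores are hypothetical and are not data from this study.

```python
def auc(presence_scores, background_scores):
    """Share of presence/background pairs in which the presence point scores higher."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5  # a tie counts as half a win
    return wins / (len(presence_scores) * len(background_scores))


# Hypothetical predicted probabilities, for illustration only
presence = [0.9, 0.8, 0.4]
background = [0.7, 0.3, 0.2, 0.1]

model_auc = auc(presence, background)
print(f"AUC = {model_auc:.3f}")
```

Because a statistic of this kind is calculated only from the evaluation points, it describes how well those points are ranked and says nothing directly about landforms outside the sampled area. This may be one reason a lower AUC can coexist with a surface that identifies sand dunes more convincingly.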
63
63
References
Alberta Ord's Kangaroo Rat Recovery Team, 2005, Recovery Plan for Ord's Kangaroo
Rat in Alberta, in Alberta Sustainable Resource Development Fish and Wildlife
Division, ed.: Edmonton, AB, p. 28 pp.
Bender, D.J., Gummer, D.L., Dzenkiw, R., and Heinrichs, J., 2010, An Occurrence-based
Habitat Model for the Ord's Kangaroo Rat (Dipodomys ordii) in Alberta, in
Alberta Sustainable Resource Development Fish and Wildlife Division, ed.,
Alberta Species at Risk: Edmonton, AB, p. 17.
Bender, D.J., Gummer, D.L., and Dzenkiw, R.E., 2007, Monitoring Protocol for the Ord's
Kangaroo Rat, in Alberta Sustainable Resource Development Fish and Wildlife
Division, ed.: Edmonton, AB.
Boyce, M.S., and McDonald, L.L., 1999, Relating Populations to Habitats Using
Resource Selection Functions: TREE, v. 14, p. 5.
Boyce, M.S., Vernier, P.R., Nielsen, S.E., and Schmiegelow, F.K.A., 2002, Evaluating
resource selection functions: Ecological Modelling, v. 157, p. 281-300.
Breyen, L.J., Bradley, W.G., and Yousef, M.K., 1973, Physiological and ecological
studies on the chisel-toothed kangaroo rat, dipodomys microps: Comparative
Biochemistry and Physiology Part A: Physiology, v. 44, p. 543-555.
64
64
Burnham, K.P., Anderson, D.R., and Huyvaert, K.P., 2011, AIC model selection and
multimodel inference in behavioral ecology: some background, observations, and
comparisons: Behavioral Ecology and Sociobiology, v. 65, p. 23-35.
Campbell, J.B., 2008, Introduction to Remote Sensing: New York, NY, The Guilford
Press.
Colchero, F., Conde, D.A., Manterola, C., Chávez, C., Rivera, A., and Ceballos, G.,
2011, Jaguars on the move: modeling movement to mitigate fragmentation from
road expansion in the Mayan Forest: Animal Conservation, v. 14, p. 158-166.
Compton, L.V., and Hedges, R.F., 1943, Kangaroo Rat Burrows in Earth Structures: The
Journal of Wildlife Management, v. 7, p. 306-316.
COSEWIC, 2006, COSEWIC Assessment and Update Status Report on the Ord's
Kangaroo Rat Dipodomys ordii in Canada, in Committee on the Status of
Endangered Wildlife in Canada, ed., p. 1-27 pp.
Day, B.N., Egosgue, H.J., and Woodbury, A.M., 1956, Ord Kangaroo Rat in Captivity:
Science, v. 124, p. 485-486.
Department of National Defense, 2012, CFB Suffield National Wildlife Area, Volume
2012, Canadian Army.
Dilts, T., 2009, Riparian Topography Toolbox: Reno, NV, University of Nevada, Reno.
65
65
Dzenkiw, R., and Bender, D.J., 2009, Ord's Kagaroo Rat Long Term Population
Monitoring 2008-2009, in Candian Forces Base Suffield, ed.: Calgary, AB.
ENVI, 2012, ENVI Tutorial: Classification Methods, Exelis.
Environment Canada, 2012, Canadian Forces Base Suffield National Wildlife Area,
Volume 2012: Edmonton, AB, Environment Canada.
ESRI, 2011, ArcGIS Desktop: Redlands, CA, Environmental Systems Research Institute.
Fawcett, T., 2006, An introduction to ROC analysis: Pattern Recognition Letter, v. 27, p.
861-874.
Fielding, A.H., and Bell, J.F., 1997, A review of methods for the assessment of prediction
errors in conservation presence/absence models: Environmental Conservation, v.
24, p. 38-49.
Government of Canada, 2007, Geobase Orthoimage 2005-2010: Geobase Orthoimage
2005-2010: Sherbrooke, QB, Natural Resources Canada, Centre for Topographic
Information.
Guisan, A., and Thuiller, W., 2005, Predicting species distribution: offering more than
simple habitat models: Ecology Letters, v. 8, p. 993-1009.
Guisan, A., and Zimmermann, N.E., 2000, Predictive habitat distribution models in
ecology: Ecological Modelling, v. 135, p. 147-186.
66
66
Gummer, D.L., 1997a, Effects of latitude and long-term isolation on the ecology of
norther Ord's kangaroo rats (Dipodomys ordii): M.Sc., Biological Sciences,
University of Calgary. Calgary, AB.
—, 1997b, Ord's Kangaroo Rat (Dipodomys ordii). in Alberta Environmental Protection,
W.M.D., ed.: Wildlife Status Report: Edmunton, AB, p. 16.
Gummer, D.L., Forbes, M.R., Bender, D.J., and Barclay, R.M.R., 1997, Botfly (Diptera:
Oestridae) parasitism of Ord's kangaroo rats (Dipodomys ordii) at Suffield
National Wildlife Area, Alberta, Canada: Journal of Parasitology, v. 83, p. 601-
604.
Gummer, D.L., and Robertson, S.E., 2003a, Distribution of Ord's kangaroo rats in
southeastern Alberta, in Alberta Sustainable Resource Development Fish and
Wildlife Division, ed., Volume Alberta Specis at Risk Report No. 63: Edmunton,
AB.
—, 2003b, Evaluation of activities and survival of Ord's kangaroo rats during and post-
construction of the North Suffield pipeline: Calgary, Alberta, EnCana, p. 43 pp.
Heinrichs, J., 2010, The Relative Influence of Habitat Quality on Population Extinction:
PhD, Department of Biological Sciences, University of Calgary. Calgary, AB.
Heinrichs, J.A., Bender, D.J., Gummer, D.L., and Schumaker, N.H., 2010, Assessing
critical habitat: Evaluating the relative contribution of habitats to population
persistence: Biological Conservation, v. 143, p. 2229-2237.
67
67
Heinrichs, J.A., Dzenkiw, R., Bender, D.J., and Gummer, D.L., 2008, A Critical Habitat
Model for the Ord's Kangaroo Rat, Dipodomys ordii, in Alberta: Calary, AB,
University of Calgary.
Hoditschek, B., and Best, T.L., 1983, Reproductive Biology of Ord's Kangaroo Rat
(Dipodomys ordii) in Oklahoma: Journal of Mammalogy, v. 64, p. 121-127.
Hough, M.J., and Dieter, C.D., 2009, Resource selection habitat model for Northern
flying squirrels in the Black Hills, South Dakota: The American Midland
Naturalist, v. 162, p. 356+.
Hugenholtz, C.H., and Wolfe, S.A., 2005a, Biogeomorphic model of dunefield activation
and stabilization on the northern Great Plains: Geomorphology, v. 70, p. 53-70.
—, 2005b, Recent stabilization of active sand dunes on the Canadian prairies and relation
to recent climate variations: Geomorphology, v. 68, p. 131-147.
Hutchinson, G., 1957, Population studies: Animal ecology and demography: Cold Spring
Harbor Symposia on Quantitative Biology, v. 22, p. 415-427.
Jędrzejewski, W., Jędrzejewska, B., Zawadzka, B., Borowik, T., Nowak, S., and
Mysłajek, R.W., 2008, Habitat suitability model for Polish wolves based on long-
term national census: Animal Conservation, v. 11, p. 377-390.
Johnson, C.J., Nielsen, S.E., Merrill, E.H., McDonald, T.L., and Boyce, M.S., 2006,
Resource selection functions based on use-availability data: Theoretical
68
68
motivation and evaluation methods: Journal of Wildlife Management, v. 70, p.
347-357.
Jones, W.T., 1985, Body Size and Life-History Variables in Heteromyids: Journal of
Mammalogy, v. 66, p. 128-132.
Kaufman, D.W., and Kaufman, G.A., 1982, Effect of Moonlight on Activity and
Microhabitat Use by Ord's Kangaroo Rat (Dipodomys ordii): Journal of
Mammalogy, v. 63, p. 309-312.
Keating, K.A., and Cherry, S., 2004, Use and Interpretation of Logistic Regression in
Habitat-Selection Studies: Journal of Wildlife Management, v. 68, p. 774-789.
Kennedy, M.L., and Schnell, G.D., 1978, Geographic Variation and Sexual Dimorphism
in Ord's Kangaroo Rat, Dipodomys ordii: Journal of Mammalogy, v. 59, p. 45-59.
Kenny, R., 1989, Population, distribution, habitat use, and natural history of Ord's
kagaroo rat (Dipodomys ordii) in the sand hill areas of south-western
Saskatchewan and south-eastern Alberta: MSc, University of Maitoba. Winnipeg,
Manitoba.
Kerley, G.I.H., Whitford, W.G., and Kay, F.R., 1997, Mechanisms for the keystone status
of kangaroo rats: graminivory rather than granivory?: Oecologia, v. 111, p. 422-
428.
Koenig, S., 2010, Seasonal Population Dynamics and Habitat Use for Ord's Kangaroo
Rat (Dipodomys ordii) in Southeastern Alberta: Masters of Geographic
69
69
Information Systems, Department of Geography, University of Clagary. Calgary,
AB.
Konikow, L.F., and Bredehoeft, J.D., 1992, Ground-water models cannot be validated:
Advances in Water Resources, v. 12, p. 75-83.
Land Processes Distributed Active Archive Center, 2012, ASTER Global Digital
Elevation Model, Volume 2012, United States Geologic Survey.
Liu, C., Berry, P.M., Dawson, T.P., and Pearson, R.G., 2005, Selecting thresholds of
occurrence in the prediction of species distributions: Ecography, v. 28, p. 385-
393.
Lobo, J.M., and Tognelli, M.F., 2011, Exploring the effects of quantity and location of
pseudo-absences and sampling biases on the performance of distribution models
with limited point occurrence data: Journal for Nature Conservation, v. 19, p. 1-7.
Manly, B.F.J., McDonald, L.L., Thomas, D.L., McDonal, T.L., and Erickson, W.P.,
2002, Resource Selection by Animals: Statistical Design and Analysis for Field
Studies: London, Kluwer Academic Publishers.
MedCalc, 2012, MedCalc for Windows: Mariakerke, Belgium.
Morrison, A.M., 2005, Receiver Operating Characteristic (ROC) Curve Preparation - A
Tutorial, in Authority, M.W.R., ed., Volume 2005: Boston, Report ENQUAD, p.
5 p.
70
70
Nielsen, S.E., 2007, Biodiversity Assessments for the Great Sand Hills: Response of
Focal Species to Livestock Grazing and Gas Development, in Government of
Saskatchewan, ed., The Great Sand Hills Regional Environmental Study: Regina,
SK, Conservation Planning Institute.
Oreskes, N., Shhrader-Frechette, K., and Belitz, K., 1994, Verification, Validation, and
Confirmation of Numerical Models in the Earth Sciences: Science, v. 263, p. 641-
646.
PCI Geomatica, 2009, Geomatica OrthoEngine: Richmond Hill, ON.
Perumal, K., and Bhaskaran, R., 2010, Supervised Classification Performances of
Multispectral Images: Journal of Computing, v. 2, p. 124-129.
Podgurny, E.M., 2004, Habitat Modeling of Ord's Kangaroo Rat for Southeastern
Alberta: Master of Geographic Information Systems, Department of Geography,
University of Calgary. Calgary, AB.
Robertson, S.E., 2007, Spatial patterns and effects of bot fly (Cuterebra polita) parasitism
in Ord's kangaroo rat (Dipodomys ordii) [MR26370 thesis]: M.Sc., University of
Calgary (Canada). Canada.
Saskatchewan Minstry of Highways and Infrastructure, 2009, Saskatchewan Road
Network Database: Regina, SK, Information Services Corporation.
71
71
Stokland, J.N., Halvorsen, R., and Støa, B., 2011, Species distribution modelling—Effect
of design and sample size of pseudo-absence observations: Ecological Modelling,
v. 222, p. 1800-1809.
Teucher, A.C., 2007, Factors Affecting Ord's Kangaroo Rats (Dipodomys ordii) in
Natural and Anthropogenic Habitats: Master of Science, Department of
Geography, University of Calgary. Calgary, AB.
Vandermeer, J.H., 1972, Niche Theory: Annual Review of Ecology and Systematics, v. 3,
p. 107-132.
White, J.A., and Geluso, K., 2012, Seasonal link between food hoarding and burrow use
in a nonhibernating rodent: Journal of Mammalogy, v. 93, p. 149-160.
Wiens, T.S., Dale, B.C., Boyce, M.S., and Kershaw, G.P., 2008, Three way k-fold cross-
validation of resource selection functions: Ecological Modelling, v. 212, p. 244-
255.
Yousef, M.K., and Dill, D.B., 1971, Daily cycles of hibernation in the kangaroo rat,
Dipodomys merriami: Cryobiology, v. 8, p. 441-446.
72
72
APPENDIX A: CLASSIFICATION VALIDATION
Additional vegetation cover data were collected opportunistically in
Saskatchewan for validation of the extrapolated model area. Locations were selected
based on historical capture locations, uncertain landscape features, and potential
kangaroo rat habitat. Sandy features and partially vegetated soils were identified by
satellite imagery and visited in the field for additional vegetation surveys. Points were
visited between July 1st and 5th, 2011. Vegetation cover was assessed within a 5 m radius
of the point and photographed at cardinal points for future reference. The presence of
any kangaroo rat burrows near the site was also recorded. A total of 23 points were
recorded covering a variety of stabilized dunes, active dunes, and other features.
The classifications of open sand and partially vegetated soils were validated using
the 23 points collected through preliminary vegetation surveys in Saskatchewan. These
points recorded the presence of kangaroo rat burrows and the estimated percentage of
vegetation cover. Because the classification only considered two classes, open sand and
partially vegetated soils, rather than a complete classification of land cover, a traditional
confusion matrix could not be constructed. The classification of open sandy soil
performed well. The majority of points classified as open sand (5 of 6) had less than 30% cover,
and the points not classified were on water or salt flats. The partially vegetated
classification tended to identify areas of higher vegetation cover: all five points in this
class had 50% to 100% cover (Table A.1). It is possible that the spatial and spectral
resolutions of the imagery do not allow for identification of sparsely vegetated areas at lower
percent cover.
Table A.1. The vegetation survey points are summarized by classification type and
estimated percent vegetation cover. About half of the points were classified as open sand or
partially vegetated soil.
                            Estimated Vegetation Cover
Classification Type         < 30%     30% - 50%     50% - 100%
Open Sandy Soil             5         0             1
Partially Vegetated Soil    0         0             5
Unclassified                3         2             7
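The proportions quoted in the text can be checked directly from Table A.1. The following is a minimal sketch in which the counts are transcribed from the table and the variable names are illustrative only.

```python
# Counts transcribed from Table A.1; columns are <30%, 30-50%, and 50-100% cover
table_a1 = {
    "Open Sandy Soil": [5, 0, 1],
    "Partially Vegetated Soil": [0, 0, 5],
    "Unclassified": [3, 2, 7],
}

row_totals = {name: sum(counts) for name, counts in table_a1.items()}
total_points = sum(row_totals.values())

# Points assigned to either of the two mapped classes
classified = total_points - row_totals["Unclassified"]

# Share of open-sand points that had less than 30% vegetation cover
open_sand_low_cover = table_a1["Open Sandy Soil"][0] / row_totals["Open Sandy Soil"]

print(f"{classified} of {total_points} points were classified")
print(f"{open_sand_low_cover:.0%} of open-sand points had < 30% cover")
```

The tallies give 11 classified points out of 23, which is consistent with the caption's "about half".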
APPENDIX B: K-FOLD CROSS-VALIDATION
Table B.1. The χ² distribution for the second fold shows significant deviation from
the expected, especially in the highest probability bin.
Probability            Area    wiAi      Utilization  Expected  Observed  (o-e)²/e
0 - 0.003909           287093  561.1233  0.002436     0.267958  0         0.267958
0.003909 - 0.011721    241831  1889.909  0.008205     0.902505  0         0.902505
0.011721 - 0.02344     179384  3153.66   0.013691     1.505995  0         1.505995
0.02344 - 0.039065     135731  4241.933  0.018415     2.025687  0         2.025687
0.039065 - 0.066408    141720  7473.817  0.032446     3.569037  0         3.569037
0.066408 - 0.113283    131144  11782.7   0.051152     5.626698  0         5.626698
0.113283 - 0.195313    119309  18409.14  0.079919     8.791077  1         6.904829
0.195313 - 0.351562    118502  32402.89  0.140669     15.47364  4         8.507652
0.351562 - 0.632809    115436  56815.93  0.246653     27.13181  15        5.424655
0.632809 - 0.999993    114670  93616.7   0.406415     44.7056   90        45.89095
Sum                                                                       80.62597
p < 0.001
The Spearman rank correlation for the second fold was ρ = 0.894, indicating high correlation.
Figure B.1. Linear regression for fold two, plotting observed against expected counts
(fitted line: y = 1.7385x - 8.1236, R² = 0.8023). The slope of the linear regression for the
second fold is significantly different from zero and one.
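The columns of Table B.1 can be recomputed from the bin limits, bin areas, and observed counts. The sketch below assumes that the weight w_i is the midpoint of each probability bin and that the Spearman coefficient uses the simple rank-difference formula with tied counts given their mean rank. Both assumptions are inferred from the tabulated values rather than stated in the text, and the variable and function names are illustrative.

```python
# Fold-two values transcribed from Table B.1
edges = [0, 0.003909, 0.011721, 0.02344, 0.039065, 0.066408,
         0.113283, 0.195313, 0.351562, 0.632809, 0.999993]
area = [287093, 241831, 179384, 135731, 141720,
        131144, 119309, 118502, 115436, 114670]
observed = [0, 0, 0, 0, 0, 0, 1, 4, 15, 90]

# w_i: midpoint of each probability bin (assumption inferred from the wiAi column)
w = [(lo + hi) / 2 for lo, hi in zip(edges[:-1], edges[1:])]
wa = [wi * ai for wi, ai in zip(w, area)]

# Utilization is each bin's share of the summed w_i * A_i
utilization = [x / sum(wa) for x in wa]

# Expected counts distribute the total number of observations by utilization
n_obs = sum(observed)
expected = [u * n_obs for u in utilization]
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


# Spearman correlation between bin order and observed counts
bin_rank = list(range(1, len(observed) + 1))
obs_rank = average_ranks(observed)
d_squared = sum((a - b) ** 2 for a, b in zip(bin_rank, obs_rank))
k = len(observed)
rho = 1 - 6 * d_squared / (k * (k ** 2 - 1))

print(f"chi-square = {chi_square:.3f}, rho = {rho:.3f}")
```

Under these assumptions the sketch recovers the tabulated sum of about 80.63 and ρ of about 0.894. A tie-corrected implementation of the Spearman coefficient would give a slightly different value because six bins share an observed count of zero.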
Table B.2. The χ² distribution for the third fold shows significant deviation from the
expected, especially in the highest probability bin.
Probability            Area    wiAi      Utilization  Expected  Observed  (o-e)²/e
0 - 0.003907           318175  621.5549  0.002749     0.302357  0         0.302357
0.003907 - 0.01172     237393  1854.87   0.008203     0.902305  2         1.335394
0.01172 - 0.023438     173979  3058.377  0.013525     1.487754  2         0.176371
0.023438 - 0.039063    133058  4158.129  0.018388     2.02273   0         2.02273
0.039063 - 0.066407    138197  7287.819  0.032229     3.545174  0         3.545174
0.066407 - 0.113282    125657  11289.59  0.049926     5.491844  0         5.491844
0.113282 - 0.195313    117876  18187.97  0.080433     8.847576  0         8.847576
0.195313 - 0.351562    114365  31271.68  0.138292     15.21217  2         11.47512
0.351562 - 0.636717    113043  55859.01  0.247025     27.17273  10        10.85289
0.636717 - 0.999996    113078  92538.12  0.409231     45.01536  94        53.30393
Sum                                                                       97.35339
p < 0.001
The Spearman rank correlation, ρ = 0.539, was only slightly above 0.5 for this fold.
Figure B.2. Linear regression for fold three, plotting observed against expected counts
(fitted line: y = 1.7386x - 8.1249, R² = 0.7478). The slope of the linear regression for the
third fold is significantly different from zero and one.
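The fitted line reported for the third fold can be reproduced with ordinary least squares on the expected and observed columns of Table B.2. This is a minimal sketch with illustrative variable names. It assumes a simple unweighted regression of observed on expected counts, which is not stated explicitly in the text.

```python
# Fold-three expected and observed counts transcribed from Table B.2
expected = [0.302357, 0.902305, 1.487754, 2.02273, 3.545174,
            5.491844, 8.847576, 15.21217, 27.17273, 45.01536]
observed = [0, 2, 2, 0, 0, 0, 0, 2, 10, 94]

n = len(expected)
mean_x = sum(expected) / n
mean_y = sum(observed) / n

# Sums of squares and cross-products about the means
sxx = sum((x - mean_x) ** 2 for x in expected)
syy = sum((y - mean_y) ** 2 for y in observed)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(expected, observed))

slope = sxy / sxx
intercept = mean_y - slope * mean_x
r_squared = sxy ** 2 / (sxx * syy)

print(f"y = {slope:.4f}x + ({intercept:.4f}), R^2 = {r_squared:.4f}")
```

Under that assumption the sketch gives a slope near 1.7386, an intercept near -8.1249, and R² near 0.7478, in line with the values reported for Figure B.2. A slope well above one reflects the concentration of observations in the highest probability bin.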
Table B.3. The χ² distribution for the fourth fold shows significant deviation
from the expected, especially in the highest probability bin.
Probability            Area    wiAi      Utilization  Expected  Observed  (o-e)²/e
0 - 0.003908           273994  535.3843  0.002312     0.254367  0         0.254367
0.003908 - 0.011721    233954  1828.234  0.007896     0.868613  0         0.868613
0.011721 - 0.023439    179727  3159.601  0.013647     1.50116   1         0.167312
0.023439 - 0.039064    138540  4329.583  0.0187       2.057031  1         0.543169
0.039064 - 0.066408    147020  7753.247  0.033488     3.683651  0         3.683651
0.066408 - 0.109376    127798  11232.42  0.048515     5.336644  1         3.524028
0.109376 - 0.187501    125078  18566.39  0.080192     8.821092  0         8.821092
0.187501 - 0.335937    120653  31577.18  0.136388     15.00266  2         11.26928
0.335937 - 0.613279    118911  56436.16  0.243758     26.81343  7         14.64087
0.613279 - 0.999995    119145  96106.77  0.415103     45.66136  98        59.99238
Sum                                                                       103.7648
p < 0.001
The Spearman rank correlation, ρ = 0.745, showed high correlation.
Figure B.3. Linear regression for fold four, plotting observed against expected counts
(fitted line: y = 1.794x - 8.7339, R² = 0.7413). The slope for the linear regression calculated
for the fourth fold is significantly different from zero and one.
APPENDIX C: PREDICTED HABITAT IN SASKATCHEWAN
Figure C.1. The Middle Sand Hills regions shown in Chapter Two, here as predicted
by the alternate model surface. The sandy roads and dunes are highlighted in this
model surface.
Figure C.2. The Big Sticks sand dunes are located just south of the Great
Sand Hills. The alternate model surface is shown here. The top model
surface showed almost no contrast between the roads and dunes, and the
surrounding areas.
Figure C.3. The top model surface predicts habitat near the river in
the Hilda Sand Hills. Roads also show higher probabilities of
occurrence.
Figure C.4. The Hilda Sand Hills are displayed here as predicted by the alternate
model surface. Note how there is now more emphasis near the roads and several
smaller dunes near the centre of the map.
Figure C.5. The top model surface for the Burstall Sand Hills shows very high
probabilities of occurrence along the river. These dunes are located near the
Alberta border.
Figure C.6. The alternate model surface predicts much more potential habitat away
from the river for the Burstall dunes. Note that more individual dunes are visible in
this figure than in Figure C.5.
Figure C.7. The Westerham Dunes are shown here as predicted by the top model
surface. Note again how the areas adjacent to the river valley were assigned high
probabilities of occurrence.
Figure C.8. The alternate model surface for the Westerham Dunes shows more
roads and individual dunes near the river in the centre of the image.
Figure C.9. The Cramersburg Sand Hills are located in the north-east section of the
study area. The top model predicts potential habitat along the river and roads in
this area.
Figure C.10. The alternate model shows more defined sand dunes in the
Cramersburg Sand Hills.