Upload
vanhuong
View
215
Download
0
Embed Size (px)
Citation preview
Working with what we know –presence-only / ecological niche models
in marine mammal science
Kristin Kaschner, Colin MacLeod, Laura Mandleberg, Ross Comptom
Sea Around Us Project, Fisheries Centre, University of British Columbia, Canada &FTZ Büsum, Kiel Christian-Albrechts-University, Germany
School of Biological Sciences (Zoology), University of Aberdeen, UK
Outline
• What are presence-only models?
• Why do we use them?
• Which ones are there?
• Do they work?
• What can we do with them?
OUTLINE
What are they?WHAT?
• Predict ecological niches
• Use only presence data
Why do we use them?WHY?
• Data paucity
• Absence data issues
• Niche modeling vs. distribution
Data paucity
Hawaii
WHY?
Data paucityWHY?
OBIS-SEAMAP (http://seamap.env.duke.edu/species)
- compilation & storage of marine mammal occurrence data
- out of 115 species, geo-referenced / effort corrected data
- available/accessible for ~ 50%
- representative coverage = ~ 2 %
Data paucityWHY?
Model evaluationPresence-absence confusion matrix
Predicted presence
Predicted absence
Recorded presence
Recorded absence
a (true presence)
c (false absence)
b (false presence)
d (true absence)
WHY? Absence Data Issues
Model evaluationPresence-absence confusion matrix
Predicted presence
Predicted absence
Recorded presence
Recorded absence
a (true presence)
c (false absence)
b (false presence)
d (true absence)
WHY? Absence Data Issues
Omission error / Model overfitting
Commission error / Model overprediction
Model evaluationPresence-absence confusion matrix
Predicted presence
Predicted absence
Recorded presence
Recorded absence
a (true presence)
c (false absence)
b (false presence)
d (true or perceived absence????)
WHY? Absence Data Issues
Site with environmental
value X
Species present?
Site visited? Species detected?
Presence-only data
M. Nakamura, CONABIO, 2005GBIF Ecological Niche Modelling Workshop, KU
Absence Data IssuesWHY?
Site with environmental
value X
Species present?
Site visited? Species detected?
Presence-absence data
M. Nakamura, CONABIO, 2005GBIF Ecological Niche Modelling Workshop, KU
Absence Data IssuesWHY?
Site with environmental
value X
Species present?
Site visited? Species detected?
True absence data
M. Nakamura, CONABIO, 2005GBIF Ecological Niche Modelling Workshop, KU
Absence Data Issues
False absence data
WHY?
Absence Data IssuesWHY?
0
0.5
1
1.5
2
2.5
0 20 40 60 80Depth [m]
Mea
n de
nsity
[# a
nim
als
/ km
2]
Harbour porpoise density
No animals in deeper waters!
Scheidat, Gilles et al, (unpublished data)
Absence Data IssuesWHY?
0
0.5
1
1.5
2
2.5
0 20 40 60 80Depth [m]
Mea
n de
nsity
[# a
nim
als
/ km
2]
0
2
4
6
8
10
12
0 20 40 60 80Depth [m]
Mea
n Ef
fort
[km
2]
Harbour porpoise density
Aerial survey effort
No animals in deeper waters?
True absences???
Scheidat, Gilles et al, (unpublished data)
0
10
20
30
40
50
60
0 5 10 15 20 25
Effort [km2]
# si
ghtin
gs
Absence Data IssuesWHY?
Spring 2002-2005
Scheidat, Gilles et al, (unpublished data)
Absence Data IssuesWHY?
Spring
0
10
20
30
40
50
60
70
80
0 10 20 30Effort category [km2]
Mea
n #
sigh
tings
Summer
0
10
20
30
40
50
60
70
80
0 10 20 30Effort category [km2]
Mea
n #
sigh
tings
Fall
0
10
20
30
40
50
60
70
80
0 10 20 30Effort category [km2]
Mea
n #
sigh
tings
Scheidat, Gilles et al, (unpublished data)
Absence Data IssuesWHY?
Spring
R2 = 0.4597
0
10
20
30
40
50
60
70
80
0 10 20 30Effort category [km2]
Mea
n #
sigh
tings
Summer
R2 = 0.6622
0
10
20
30
40
50
60
70
80
0 10 20 30Effort category [km2]
Mea
n #
sigh
tings
Fall
R2 = 0.6069
0
10
20
30
40
50
60
70
80
0 10 20 30Effort category [km2]
Mea
n #
sigh
tings
Scheidat, Gilles et al, (unpublished data)
Absence Data IssuesWHY?
Spring
0
10
20
30
40
50
60
70
80
0 10 20 30Effort category [km2]
Mea
n #
sigh
tings
Summer
0
10
20
30
40
50
60
70
80
0 10 20 30Effort category [km2]
Mea
n #
sigh
tings
Fall
0
10
20
30
40
50
60
70
80
0 10 20 30Effort category [km2]
Mea
n #
sigh
tings
True absences???
Scheidat, Gilles et al, (unpublished data)
.20
Variable A
Species #1
Presence
.8Bias
.16Obs. rate
Example: 100 trials
16Observed
.80
Variable B
.1
.08
8
0
Variable C
.1
0
0
× ××
= ==
.32
Variable A
Species #2
.5
.16
16
.16
Variable B
.5
.08
8
.52
Variable C
0
0
0
×× ×
== =
M. Nakamura, CONABIO, 2005GBIF Ecological Niche Modelling Workshop, KU
Absence Data IssuesWHY?
Model Complexity
Pre
dict
ion
Err
or
Low High
Low Variance High Variance
Training sample
Test sample
Hastie et al. (2001)
WHY? Ecological Niche vs Distribution
Model Complexity
Pre
dict
ion
Err
or
Low High
Low Variance High Variance
Training sample
Test sample
Hastie et al. (2001)
WHY? Ecological Niche vs Distribution
Ecological Niche Model
Distribution Model
WHY? Ecological Niche vs Distribution
J. Soberon, CONABIO, 2005GBIF Ecological Niche Modelling Workshop, KU
Geographic space(2 dimensions)
Ecological space(n dimensions / hypervolume)
Which ones are out there?
Envelope models• BioClim• DOMAIN• Fuzzy bioclimatic envelope model• RES*
Machine-learning• Garp**• Maxent**
• ENFA (Biomapper)*
WHICH?
* Models have been applied to and tested for marine mammals ** Preliminary applications to marinemammals
Which ones are out there?Web-based applications• AquaMaps*• KGS-Mapper**• WhyWhere?
* Models have been applied to and tested for marine mammals ** Preliminary applications to marinemammals
WHICH?
Ecological Niche ModelsWHICH?
Geographic spaceEcological / Environmental space
J. Soberon
BioClim / Climate Envelope Range
• Based on presence cells
• Very simple & intuitive
• No interactions betweenvariables
• Unweighted variables
• Binary predictions
• No extrapolations
• Tends to overpredict
Lindenmayer et al. 1991 J. Biogeog. 18: 371-383.Arcscript: http://arcscripts.esri.com/details.asp?dbid=13745
WHICH?
J. Soberon
RES
Kaschner et al. 2006, MEPS / www.seaaroundus.orgSkov & Svenning, 2004, Ecography, 27:366-380
Relative Environmental Suitability Model(Fuzzy Bioclimatic Envelope Model)
Assigned habitat usage categories: Depth, SST, Ice edge
0 -200 -1000 -2000 -3000 -4000 -5000 -6000 -7000 -8000
Depth [m]
Prob
abilit
y
Pmax
Mainly continental slope
-2 0 5 10 15 20 25 30
Subpolar – warm temp.
SST [° C]
Pmax
*-1 0 1 500 1000 2000 8000
No association with ice edge
Distance from ice edge [km]
Pmax
*
• Based on relative occurrences
• Very simple, transparent & intuitive
• Expert knowledge based (no point data required)
• No interactions betweenvariables
• Unweighted variables
• Continuous output
• Tends to overpredict
Sowerby‘s beaked whaleSowerby‘s beaked whale
WHICH?
RES
Kaschner et al. 2006, MEPS / www.seaaroundus.orgSkov & Svenning, 2004, Ecography, 27:366-380
Relative Environmental Suitability Model(Fuzzy Bioclimatic Envelope Model)
Assigned habitat usage categories: Depth, SST, Ice edge
0 -200 -1000 -2000 -3000 -4000 -5000 -6000 -7000 -8000
Depth [m]
Prob
abilit
y
Pmax
Mainly continental slope
-2 0 5 10 15 20 25 30
Subpolar – warm temp.
SST [° C]
Pmax
*-1 0 1 500 1000 2000 8000
No association with ice edge
Distance from ice edge [km]
Pmax
*
• Based on relative occurrences
• Very simple, transparent & intuitive
• Expert knowledge based (no point data required)
• No interactions betweenvariables
• Unweighted variables
• Continuous output
• Tends to overpredict
Sowerby‘s beaked whaleSowerby‘s beaked whale
WHICH?
DOMAIN
• Based on presence cells
• Cluster algorithm in environmental space
• No interactions betweenvariables
• Unweighted variables
• Non-binary predictions
• Tends to overfit
Carpenter et al. 1993 Biodiv. Conservation 2: 667-680.Freeware: http://www.cifor.cgiar.org/docs/_ref/research_tools/domain/
WHICH?
J. Soberon
GARP Genetic Algorithm of Rule-set Prediction
• Based on presence cells
• Machine learning / automated model optimization (not transparent)
• Generates pseudo-absencedata
• Optimization using training / test data sets
• Interactions between weightedvariables
• Non-binary predictions
• Tends to overpredict?
Stockwell & Noble 1992, Math. & Comp. in Simulation 33:385-390Freeware: http://www.lifemapper.org/desktopgarp/
WHICH?
J. Soberon
GARP Genetic Algorithm of Rule-set Prediction
Stockwell & Noble 1992, Math. & Comp. in Simulation 33:385-390Freeware: http://www.lifemapper.org/desktopgarp/
Minke whale (IWC whaling data)
• Based on presence cells
• Machine learning / automated model optimization (not transparent)
• Generates pseudo-absencedata
• Optimization using training / test data sets
• Interactions between weightedvariables
• Non-binary predictions
• Tends to overpredict?
WHICH?
MaxentMaximum entropy
• Based on presence cells
• Machine learning / automated model optimization (not transparent)
• Optimization using training / test data sets
• Interactions between weightedvariables
• Non-binary predictions
• Tends to overfit?
Phillips et al. 2006, Ecological ModellingFreeware: http://www.cs.princeton.edu/~schapire/maxent/
Minke whale (IWC whaling data)
WHICH?
MaxentMaximum entropy
• Based on presence cells
• Machine learning / automated model optimization (not transparent)
• Optimization using training / test data sets
• Interactions between weightedvariables
• Non-binary predictions
• Tends to overfit?
Phillips et al. 2006, Ecological ModellingFreeware: http://www.cs.princeton.edu/~schapire/maxent/
Minke whale (IWC whaling data)
WHICH?
ENFAEcological Niche Factor Analysis
• Based on presence cells
• Compares species occurrence to all available habitat
• Multivariate (Interactionsbetween weighted variables)
• Non-binary predictions (HIS)
Hirzel et al. 2002, Ecology 83: 2027-2036.Biomapper freeware: http://www2.unil.ch/biomapper/
WHICH?
Species occurrence cells Available habitat
ENFAEcological Niche Factor Analysis
• Based on presence cells
• Compares species occurrence to all available habitat
• Multivariate (Interactionsbetween weighted variables)
• Non-binary predictions (HIS)
Hirzel et al. 2002, Ecology 83: 2027-2036.Biomapper freeware: http://www2.unil.ch/biomapper/
WHICH?
Marginality
Specialization
ENFAEcological Niche Factor Analysis
Northern bottlenose whale
(NW Atlantic & Total N Atlantic)
Compton, 2004, MRes Thesis, U of PlymouthMacLeod, 2005, PhD Thesis, U of Aberdeen
WHICH?
MarginalCore
UnsuitableHabitat Suitability
AquaMaps (Coming soon....)
Kaschner et al, in prep,www.fishbase.de
WHICH?
AquaMaps (Coming soon....)
Kaschner et al, in prep,www.fishbase.de
WHICH?
AquaMaps (Coming soon....)
Kaschner et al, in prep,www.fishbase.de
WHICH?
Do they work?Model Evaluations
• Test statistics
• Cross-validation
• External testing
• Comparison with other models
DO THEY WORK?
Predicted presence
Predicted absence
Recorded presence
Recorded absence
a (true presence)
c (false absence)
b (false presence)
d (true absence)
Test statisticsDO THEY WORK?
Sensitivity: Proportion of observed presences correctly predicteda/(a + c)
R. Pearson, AMNH, 2005GBIF Ecological Niche Modelling Workshop, KU
Predicted presence
Predicted absence
Recorded presence
Recorded absence
a (true presence)
c (false absence)
b (false presence)
d (true absence)
Test statisticsDO THEY WORK?
Sensitivity: Proportion of observed presences correctly predicteda/(a + c)
R. Pearson, AMNH, 2005GBIF Ecological Niche Modelling Workshop, KU
Specificity: Proportion of observed (or assumed) absences correctly predicted d/(b + d)
Test statistics: Receiver Operator Curve
DO THEY WORK?
AUC = 0.5 = random
Area Under Curve (AUC)
Predicted presence
Predicted absence
Recorded presence
Recorded absence
a (true presence)
c (false absence)
b (false presence)
d (true absence)
Test statisticsDO THEY WORK?
R. Pearson, AMNH, 2005GBIF Ecological Niche Modelling Workshop, KU
Cohen’s Kappa:
)]/)))(())(((([)]/)))(())(((()[(
ndcdbbacanndcdbbacadak
+++++−+++++−+=
Araújo et al. 2005 Gl. Ch. Biol.
Environmental envelope
Evaluation
Calibration
Projection
Same region
New region
New resolution
New time
Environmental envelope
Evaluation
Calibration
Projection
Same region
New region
New resolution
New time
Same region
New region
New resolution
New time
DO THEY WORK? Cross-validation
0
2
4
6
8
10
12
14
16
0-33 34-66 67-100
HS bin range
Mea
n ar
e-ad
just
ed fr
eque
ncy
Compton, 2004, MRes Thesis, U of Plymouth
Northern bottlenose whale (NW Atlantic)
• Same data set
• Jack-knife procedure
Environmental envelope
Evaluation
Calibration
Projection
New region
New resolution
New time
100%Environmental envelope
Evaluation
Calibration
Projection
New region
New resolution
New time
New region
New resolution
New time
100%
Araújo et al. 2005 Gl. Ch. Biol.
DO THEY WORK? External Testing
Kaschner et al, 2006, MEPS
Southern bottlenosewhale
RES category
Avg
SP
UE
per
cel
l Common nameComparison with random data sets
rho p Simulated p-valueNorthern fur seal 0.54 < 0.0001 0Harbour porpoise 0.59 < 0.0001 0Sperm whale 0.66 < 0.0001 0Antarctic minke whale 0.71 < 0.0001 0Killer whale 0.56 < 0.0001 0.54Fin whale 0.53 < 0.0001 0Blue whale 0.48 < 0.0001 0.268Humpback whale 0.20 <0.05 0.006Hourglass dolphin 0.68 < 0.0001 0Southern bottlenose whale 0.83 < 0.0001 0Dwarf minke whale -0.77 < 0.0001 0
Spearman's non-parametric rank correlation analysis of RES vs mean SPUE
• Different data set
• Permutation
DO THEY WORK? Model Comparison
Mandelbaum, 2005, MSc Thesis, U of Aberdeen
PCA ENFA
GARP GLM
Presence
Absence
Harbour porpoise
DO THEY WORK? Model Comparison
Mandelbaum, 2005, MSc Thesis, U of Aberdeen
0
0.2
0.4
0.6
0.8
1
0 0.2 0.4 0.6 0.8 1
0
0.2
0.4
0.6
0.8
1
0 0.2 0.4 0.6 0.8 1
0
0.2
0.4
0.6
0.8
1
0 0.2 0.4 0.6 0.8 1
0
0.2
0.4
0.6
0.8
1
0 0.2 0.4 0.6 0.8 1
PCA ENFA
GARP GLM
AUC = 0.746 AUC = 0.745
AUC = 0.773 AUC = 0.828
1-specificity
sens
itivi
ty
Significantly different
DO THEY WORK? Model Comparison
Mandelbaum, 2005, MSc Thesis, U of Aberdeen
43210
Combined model
Do they work?Limitations
• Presence cells vs. encounter rates
• Effort / Sampling biases
DO THEY WORK?
DO THEY WORK? Presence cells vs Encounter Rate
Minke whale
Maxent prediction (IWC whaling data)
Kaschner et al, in prep
SOWER sightingsWhaling data
DO THEY WORK? Presence cells vs Encounter Rate
Minke whale
Maxent prediction (IWC whaling data)
Kaschner et al, in prep
SOWER sightingsWhaling data
DO THEY WORK? Presence cells vs Encounter Rate
Minke whale
Maxent prediction (IWC whaling data)
Kaschner et al, in prep
SOWER SPUE
DO THEY WORK? Sampling Biases
Kaschner et al, 2006, MEPS
Minke whale Blue whale Humpback whale
DO THEY WORK?
Minke whale Blue whale Humpback whale
Kaschner et al, 2006, MEPS
Sampling Biases
DO THEY WORK?
Minke whale Blue whale Humpback whale
Kaschner et al, 2006, MEPS
Sampling Biases
What can we do with them? Potential Applications
• Biodiversity Mapping
• Management / Research Prioritiziation
– Risk mitigation
– Marine mammal-fisheries interactions
– Marine Protected Areas
– Climate change
WHAT FOR?
What can we do with them? Potential Applications
• Biodiversity Mapping
• Management / Research Prioritiziation
– Risk mitigation
– Marine mammal-fisheries interactions
– Marine Protected Areas
– Climate change
WHAT FOR?
Kaschner & WormWednesday, 15:15
• Management / Research Prioritiziation
– Risk mitigation
– Marine mammal fisheries interactions
– Marine Protected Areas
– Climate change
Applications
RES > 0.8
Beaked whales
WHAT FOR?
• Management / Research Prioritiziation
– Risk mitigation
– Marine mammal fisheries interactions
– Marine Protected Areas
– Climate change
Applications
RES > 0.8
Beaked whales
M. densirostris
All Mesoplodon Spp.
MacLeod, 2005, PhD Thesis, U of Aberdeen
WHAT FOR?
Applications
• Management / Research Prioritiziation
– Risk mitigation
– Marine mammal-fisheries interactions
– Marine Protected Areas
– Climate change
APPLICATIONS
FisheriesFood consumption
Resource overlap
Kaschner et al, accepted, CJFAS
Applications
• Management / Research Prioritiziation
– Risk mitigation
– Marine mammal fisheries interactions
– Marine Protected Areas
– Climate change
& Seamounts
Species richness
0.00
0.05
0.10
0.15
0.20
0.25
0.30
0 5 10 15 20 25 30 35 40
Species richness category
Mea
n se
amou
nt d
ensi
ty
Kaschner, 2006, Ecology of Seamounts
WHAT FOR?
-0.50-0.40-0.30-0.20-0.100.000.100.200.300.400.50
Applications
• Management / Research Prioritiziation
– Risk mitigation
– Marine mammal fisheries interactions
– Marine Protected Areas
– Climate change
Small toothed whales
Predicted relative changes in distribution size by 2020
WHAT FOR?
• require less data / maximize available data
• less affected by false absences
• more general, useful to investigate large scale patterns & ecological interactions
• time & cost efficient starting points
• can supplement small scale studies and help to focus research and management efforts
Presence-only / Ecological niche models
ConclusionsCONCLUSIONS
• Ed Gregr, MMRU, UBC, Vancouver
• Anita Gilles & Roger Mundry, FTZ Buesum
• Instructors & Participants of the GBIF Ecological Niche Modelling Workshop, 5-9 Dec 2005, Lawrence, Kansas
Acknowledgements