Combining Data in Species Distribution Models

Using point process models to combine different data types for species distribution models. Slides for talk at ISEC 2014, presented on the 3rd July

Bob O’Hara (1), Petr Keil (2), Walter Jetz (2)

(1) BiK-F, Biodiversity and Climate Change Research Centre, Frankfurt am Main, Germany (bobohara)

(2) Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA

Motivation

Map Of Life

www.mol.org/

The Problem

Different data sources

- GBIF

- expert range maps

- eBird and similar citizen science efforts

- organised surveys (BBS, BMSs)

Point Process Models

Point process representation of actual distribution

- Continuous space models

Build different sampling models on top

Point Processes: Model

Intensity ρ(ξ) at point ξ. Assume covariates (features?) X(ξ), and a random field ν(ξ)

$$\log \rho(\xi) = \eta(\xi) = \sum_j \beta_j X_j(\xi) + \nu(\xi)$$

then, for an area A,

$$P(N(A) = r) = \frac{\lambda(A)^r e^{-\lambda(A)}}{r!}$$

where

$$\lambda(A) = \int_A e^{\eta(s)} \, ds$$

In practice...

Constrained refined Delaunay triangulation

Approximate λ(A) numerically: select some integration points, and sum over those

$$\lambda(A) \approx \sum_{s=1}^{N} |A(s)| \, e^{\eta(s)}$$
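
A minimal sketch of this approximation in base R, assuming a one-dimensional area A = [0, 1] and a made-up linear predictor η(s) = -1 + 2s (both are illustrative choices, not from the talk). Equal-width cells with midpoint integration points stand in for the triangulation, so |A(s)| = 1/N, and the sum is compared with the closed-form integral.

```r
# Hypothetical 1-d example: eta(s) = b0 + b1 * s on A = [0, 1]
eta <- function(s) -1 + 2 * s

# N integration points at the midpoints of equal-width cells; |A(s)| = 1 / N
N <- 50
s <- (seq_len(N) - 0.5) / N
w <- rep(1 / N, N)

# lambda(A) is approximated by sum_s |A(s)| * exp(eta(s))
lambda_approx <- sum(w * exp(eta(s)))

# Closed form for this particular eta: exp(b0) * (exp(b1) - 1) / b1
lambda_exact <- exp(-1) * (exp(2) - 1) / 2

# Poisson probabilities of r = 0, 1, 2 individuals in A
p_counts <- dpois(0:2, lambda_approx)
```

With a real mesh the weights |A(s)| would differ between integration points, but the sum has the same form.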

Some Data Types

- Abundance
  - e.g. point counts

- Presence/absence
  - surveys, areal lists

- Point observations
  - museum archives, citizen science observations

- Expert range maps

Abundance

Assume a small area A, so that η(ξ) is constant, and observation for a time t. Then $n(A, t) \sim \mathrm{Po}(e^{\mu(A,t)})$ with

$$\mu(A, t) = \eta(A) + \log(|A|) + \log(t) + \log(p)$$

where p is the probability of observing each individual.
Don’t know all of |A|, t and p, so estimate an intercept
Can also add a sampling model to log(p)
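
One way to see the intercept argument is a small simulation in base R. This is a sketch under assumptions: the data are simulated, the coefficients (0.5 and 0.8) and detection probability p = 0.6 are invented, and |A| and t are treated as known so they can enter as offsets. The unknown log(p) then ends up in the fitted intercept.

```r
set.seed(1)
n_sites  <- 2000
x        <- runif(n_sites, -1, 1)    # one covariate (hypothetical)
area     <- runif(n_sites, 0.5, 2)   # |A|, assumed known here
obs_time <- runif(n_sites, 1, 3)     # t, assumed known here
p        <- 0.6                      # detection probability, unknown in practice

# eta(A) = 0.5 + 0.8 * x; mu = eta + log|A| + log(t) + log(p)
eta <- 0.5 + 0.8 * x
y   <- rpois(n_sites, exp(eta + log(area) + log(obs_time) + log(p)))

# Poisson GLM with log link; known terms go in as an offset
fit <- glm(y ~ x + offset(log(area) + log(obs_time)),
           family = poisson(link = "log"))
est <- unname(coef(fit))

# est[1] estimates 0.5 + log(p), not 0.5: log(p) is absorbed by the intercept
# est[2] estimates the covariate effect 0.8
```

If |A| and t were also unknown but constant across sites, they would be absorbed into the intercept in the same way.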

Presence/Absence for ’points’

As $n(A, t) \sim \mathrm{Po}(e^{\mu(A,t)})$,

$$\mathrm{cloglog}(\Pr(n(A, t) > 0)) = \mu(A, t)$$

with μ(A, t) as before
Again, can make log(|A|) + log(t) + log(p) an intercept
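
The link between the Poisson count and the cloglog scale can be checked numerically. A short base R sketch, with arbitrary illustrative values of μ: the probability of at least one individual is 1 - exp(-exp(μ)), and applying cloglog to it returns μ.

```r
mu <- c(-2, 0, 1)                      # arbitrary values of mu(A, t)

# Pr(n > 0) when n ~ Po(exp(mu))
p_present <- 1 - dpois(0, exp(mu))

# complementary log-log transform: log(-log(1 - p))
cloglog   <- function(p) log(-log(1 - p))
recovered <- cloglog(p_present)

# The same transform as supplied by R's binomial family
recovered_glm <- binomial(link = "cloglog")$linkfun(p_present)
```

So a binomial model with a cloglog link has the same linear predictor as the Poisson model for counts.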

Presence only: point process

log Gaussian Cox Process
Likelihood is a Poisson GLM (but with non-integer response)

Areal Presence/absence

If an area is large enough, we can’t assume constant covariates, so

$$\Pr(n(A) > 0) = 1 - \exp\left(-\int_A e^{\eta(\xi)} \, d\xi\right)$$

in practice this is calculated as

$$1 - \exp\left(-\sum_s |A(s)| \, e^{\eta(s)}\right)$$

which causes problems with the fitting
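
A rough base R illustration of why the constant-covariate shortcut fails for large areas, assuming a one-dimensional area [0, 100] and an invented predictor η(ξ) = -8 + 0.05ξ. It compares the presence probability from the summed intensity with the one obtained by plugging in η at the centre of the area.

```r
# Hypothetical predictor that varies across a large 1-d area A = [0, 100]
eta <- function(xi) -8 + 0.05 * xi

# Integration points: midpoints of N equal cells, each with |A(s)| = 100 / N
N  <- 200
xi <- (seq_len(N) - 0.5) * 100 / N
w  <- rep(100 / N, N)

# Pr(n(A) > 0) = 1 - exp(-sum_s |A(s)| * exp(eta(s)))
p_integrated <- 1 - exp(-sum(w * exp(eta(xi))))

# Shortcut that treats eta as constant at the centre of A
p_constant <- 1 - exp(-100 * exp(eta(50)))

# Exact value for this eta, from the closed-form integral
p_exact <- 1 - exp(-exp(-8) * (exp(5) - 1) / 0.05)
```

In this example the shortcut gives a clearly lower probability than the integrated version, because the exponential of the predictor is convex.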

Expert Range Maps

Not the same as areal presence.
Instead, use distance to range as a covariate

- within range, this is 0.

- Have to estimate the slope for outside the range

Use informative priors to force the slope to be negative

[Figure: intensity (0 to 1) against space (1d, 0 to 100), with the species' range marked]
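
The distance-to-range covariate can be sketched in one dimension with base R. The range limits (30 to 70) and the slope (-0.1) are invented for illustration; in the model the slope would be estimated, with a prior keeping it negative.

```r
# 1-d space and a hypothetical expert range [30, 70]
x        <- 0:100
range_lo <- 30
range_hi <- 70

# Distance to range: 0 inside the range, distance to the nearest edge outside
dist_to_range <- pmax(range_lo - x, 0, x - range_hi)

# Hypothetical coefficients; the slope is negative so intensity declines
# with distance from the range
b0     <- 0
b_dist <- -0.1
intensity <- exp(b0 + b_dist * dist_to_range)
```

Inside the range the covariate contributes nothing, so the other covariates and the random field determine the intensity there.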

Put these together with INLA

Quicker than MCMC

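library(INLA)

# SolTim.formula and stk.all (the combined inla.stack) are built beforehand.
# One likelihood per data type: Poisson (log link) and binomial (cloglog link).
SolTim.res <- inla(SolTim.formula,
                   family = c('poisson', 'binomial'),
                   data = inla.stack.data(stk.all),
                   control.family = list(list(link = "log"),
                                         list(link = "cloglog")),
                   control.predictor = list(A = inla.stack.A(stk.all)),
                   # Ntrials: binomial trials; E: known multiplier of the Poisson mean
                   Ntrials = 1, E = inla.stack.data(stk.all)$e, verbose = FALSE)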

The Solitary Tinamou

Photo credit: Francesco Veronesi on Flickr (https://www.flickr.com/photos/francesco veronesi/12797666343)

Data

[Map legend: whole region; expert range; park, absent; park, present; eBird; GBIF]

- expert range

- 2 point processes (49 points)

- 28 parks

A Fitted Model

              mean    sd   mode
Intercept    -0.30  0.09  -0.30
b.PP          1.37  0.40   1.37
b.GBIF        1.43  0.26   1.43
Forest       -0.03  0.04  -0.03
NPP           0.15  0.05   0.15
Altitude     -0.02  0.04  -0.02
DistToRange  -0.01  0.02  -0.01

Predicted Distribution

[Figure: map of the predicted distribution; colour scale from −0.10 to 0.25; legend: whole region; expert range; park, absent; park, present; eBird; GBIF]

Individual Data Types

[Figure: one map per panel, with colour-scale ranges: Expert Range (−10 to 0), GBIF (−0.060 to −0.048), eBird (−0.060 to −0.048), Parks (−10 to 0), all data (−0.10 to 0.25)]

Summary

Parks and expert range seem to drive distribution
NPP is main covariate, not forest or altitude

What Next

Multiple species

- already being done elsewhere

- estimate sampling biases

More Data

- Point counts (have it working)

Can we estimate absolute probability of presence?

- Distance sampling?

- Mark-recapture?

- scaling issues (in time and space)

Not the final answer...

http://www.gocomics.com/nonsequitur/2014/06/24