Hierarchical models
Represent processes and observations that span multiple levels (aka multi‐level models)
Consider processes important at each scale or at many scales
[Figure: nine plots nested within three regions; plots N1, N2, N3 in region R1; N4, N5, N6 in R2; N7, N8, N9 in R3]
Ni = true abundance on a plot
Consider factors that govern abundance at the plot scale
Rj = true abundance in a region
Consider factors that govern abundance at the regional scale
Hierarchical models
Add additional levels
Define parameters for each level
Hierarchical, because parameters at one level govern parameters at lower level
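[Figure: region R1 (parameter ρ) sits above plot abundances N1, N2, N3 (parameter λ), which are the state processes; each plot yields observations o1, o2, …, on (parameter p), which are the observation process]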
Two‐level hierarchical model
yij ~ N(θi, σi²)   level 1, i = sites, j = surveys
θi ~ N(θ, σ²)   level 2
Key idea: Consider an attribute of a sample unit, θi, as having been drawn from an underlying distribution. We don’t estimate a θi for each sample unit; instead we estimate the parameters of the distribution from which the θi’s were drawn
Parameters of interest are θ and σ², which in this case are the mean and variance of the distribution of θi’s; we estimate these from data
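The two levels can be illustrated by simulating data from them. This is a minimal sketch in base R; the values of theta, sigma, the survey-level standard deviation, and the numbers of sites and surveys are assumptions chosen only for illustration, and a single survey-level standard deviation is used for all sites to keep the example short.

```r
set.seed(1)

n_sites   <- 50    # assumed number of sites (i)
n_surveys <- 5     # assumed number of surveys per site (j)
theta     <- 10    # assumed upper-level mean
sigma     <- 2     # assumed upper-level standard deviation
sigma_obs <- 1     # assumed survey-level standard deviation, shared by all sites

# Level 2: each site's theta_i is drawn from N(theta, sigma^2)
theta_i <- rnorm(n_sites, mean = theta, sd = sigma)

# Level 1: each survey y_ij is drawn from N(theta_i, sigma_obs^2)
# (matrix() fills by column, so rep(theta_i, n_surveys) lines up with rows = sites)
y <- matrix(rnorm(n_sites * n_surveys,
                  mean = rep(theta_i, n_surveys),
                  sd   = sigma_obs),
            nrow = n_sites)

# Crude moment-based summaries of the upper-level distribution
theta_hat <- mean(rowMeans(y))
```

The quantity of interest is theta_hat (and the spread of the site means), not the individual theta_i values.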
Two‐level hierarchical model
Key idea: Estimate parameters of the upper‐level distribution assumed to govern processes that give rise to data observed at lower levels
• Parameters from all levels are estimated simultaneously
• Important because uncertainty at one level affects inferences at other levels
• Most alternative modeling frameworks do not allow us to model state and observation processes simultaneously
  • Modeling density with Program DISTANCE?
  • Modeling abundance in Program MARK?
Hierarchical models
Two common types:
1) Latent-variable models
2) Mixed-effects models
[Figure: latent abundance N3 (parameter λ) with observations o1, o2, …, on (parameter p)]
Hierarchical models in ecology
Ecological Process
Model for describing “state” variables (latent or unobserved): abundance, occupancy, survival
- Parameters: λ, ψ, φ
- Site / individual covariates
Observation Process
Model for describing the detection process
- Parameter: p
- Site / individual covariates
- Survey covariates
Realized Data: y1, y2, y3, …, yn
Imperfect observations
• Wish to estimate abundance of a species on a plot, Ni
• Use a survey method that yields counts on plots, Ci, e.g., point counts, line transects, removals, etc.
• Probability that we observe an individual that is present, β, is often < 1
• No. individuals counted is related to true abundance:
C = β N,
where β ranges from 0 – 1
• Translate C into an estimate of abundance:
N̂ = C / β
• Example: Count 5 quail on a plot; if β = 0.25, then:
N̂ = 5/0.25 = 20
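The quail example can be reproduced in a couple of lines of R. This is only a sketch: the function name estimate_abundance is hypothetical, and it assumes β is known, which in practice it rarely is.

```r
# Hypothetical helper: convert a count to an abundance estimate,
# given a known detection probability beta (0 < beta <= 1)
estimate_abundance <- function(count, beta) {
  stopifnot(beta > 0, beta <= 1)
  count / beta
}

# Quail example from the slide: 5 birds counted, beta = 0.25
n_hat <- estimate_abundance(count = 5, beta = 0.25)
n_hat
```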
Occupancy, single season
Presence‐absence data
• Classifying a species as present or absent across space is the basis for studying biogeography (study of distributions) and many types of habitat analyses
• Changes in presence-absence status over time are the basis for patch dynamics and metapopulation dynamics
• Problem: when detection process is imperfect, we cannot distinguish non‐detection from absence
• Estimates of the area occupied will be biased
What is occupancy?
• Occupancy – proportion of area, patches, or other sample unit occupied by a species
• Probability of occupancy – probability (ψ) that any given unit within a sampling frame is occupied
• Single-season goal: estimate ψ when p < 1 during a single season
• Multi‐season goal: dynamics = colonization and extinction
Changes in geographic range
• Has purple loosestrife spread across the Lake Erie basin? If so, how fast?
• Are eradication methods working?
Habitat relationships and resource selection
• Identify habitat features associated with selection
• Classify presence-absence of species on sample units, then assess with logistic regression
  – Does not account for false absences = imperfect detection
Occupancy as a parameter
Trade-offs:
• Not as sensitive as abundance to changes over time
• Value of ψ is a function of size of sample units (sites)
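[Figure: a site shown in Year 1, Year 2, and Year 3 with ψ = 1 throughout; the same area divided into 4 large units gives ψ = 4/4 = 1.0, and divided into 25 small units gives ψ = 9/25 = 0.36]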
Basic sampling scheme
• Select a sample of s units (“sites”) from a larger set of S units (population)
• Survey each site K times and record whether species of interest is detected or not = temporal replication
• Resurvey all sites in sample, even those where species detected previously – forms the basis for estimating detection probability
• Sampling can be direct (visual) or indirect (tracks)
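[Figure: sampling design for one season, with closure assumed; sites 1, 2, …, S, each surveyed repeatedly (surveys 1, 2, …, K1 at the first site; 1, 2, …, K2 at the second)]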
Occupancy: hierarchical structure
[Figure: example site histories 0,1,0,1,1; 1,0,1,1,1; and 0,0,0,0,0, with each survey marked as detection or no detection]
Encounter histories
• Survey results:
  • 1 = detected
  • 0 = not detected
• Survey history for each site:

Site ID  Survey 1  Survey 2  Survey 3  Survey 4
A        0         1         1         0
B        1         1         0         0
C        0         0         0         0
D        0         1         0         1
E        1         1         0         0
F        0         0         0         0
G        1         1         0         1
…

• When surveys are complete, we have two types of sites:
  • Detection: occupied
  • No detection: either not occupied, or occupied but not detected
Ideas underlying estimates
Site  Survey 1  Survey 2  Survey 3  Survey 4
1     0         1         1         0
2     1         1         1         1
3     1         0         0         0
4     0         0         0         0
If surveys were perfect, 0‐0‐0‐0 would indicate true absence, so we could estimate ψ as proportion of sites with ≥1 detection
Naïve estimate of ψ = ¾ or 0.75
If surveys are imperfect, estimate p from sites with ≥1 detection by averaging each site's proportion of surveys with a detection
p = (0.50 + 1.00 + 0.25) / 3 = 0.58
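The arithmetic for the four-site example can be checked in base R. This is a sketch of the naive calculations only, not a model-based estimate; the object names are chosen for illustration.

```r
# Detection histories: rows = sites 1-4, columns = surveys 1-4
y <- matrix(c(0, 1, 1, 0,
              1, 1, 1, 1,
              1, 0, 0, 0,
              0, 0, 0, 0),
            nrow = 4, byrow = TRUE)

# Sites with at least one detection
detected <- rowSums(y) > 0

# Naive occupancy: proportion of sites with >= 1 detection
naive_psi <- mean(detected)

# Per-site detection rate, using only sites with >= 1 detection
p_site <- rowMeans(y[detected, , drop = FALSE])

# Average of the per-site rates
p_hat <- mean(p_site)
```

Here naive_psi is 3/4 and p_site holds 0.50, 1.00, and 0.25, matching the values on the slide.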
Estimate ψ and p
Use a model‐based approach to estimate occupancy and detection parameters simultaneously
Consider two stochastic processes:
• Occupancy: a site will either be occupied with probability ψ or unoccupied with probability 1 – ψ
• Detection: if a site is unoccupied, the species cannot be detected; if a site is occupied, then at each survey there is some probability of detecting the species (p):
Species detected on a survey = ψp
Species not detected on a survey = 1 – ψ (site unoccupied) or ψ(1 – p) (site occupied, species missed)
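Combining the two processes gives the probability of a whole detection history at a site. The sketch below assumes constant ψ and p and independent surveys; the function name history_prob and the parameter values are hypothetical.

```r
# Probability of one site's detection history y (vector of 0/1),
# assuming constant psi and p and independent surveys
history_prob <- function(y, psi, p) {
  K <- length(y)          # number of surveys
  d <- sum(y)             # number of detections
  occupied <- psi * p^d * (1 - p)^(K - d)
  if (d > 0) {
    occupied              # a detection means the site must be occupied
  } else {
    occupied + (1 - psi)  # all zeros: occupied but missed, or unoccupied
  }
}

pr_0110 <- history_prob(c(0, 1, 1, 0), psi = 0.8, p = 0.5)
pr_0000 <- history_prob(c(0, 0, 0, 0), psi = 0.8, p = 0.5)
```

An all-zero history gets two terms because it can arise in two ways, which is why non-detection cannot be read as absence.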
Binomial distribution
Discrete distribution. Represents the outcome of a number of independent Bernoulli trials = events with two possible outcomes
Notation: Bin(n, p)
Parameters: n = number of trials, p = prob. of success each trial
[Figure: binomial probability mass functions for n = 20 with p = 0.1 (blue), p = 0.5 (green), and p = 0.8 (red)]
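The three distributions in the figure can be computed with R's built-in dbinom. This is a small sketch; the object names are chosen for illustration.

```r
n <- 20
x <- 0:n                      # possible numbers of successes
probs <- c(0.1, 0.5, 0.8)     # success probabilities from the figure

# One column of probabilities per value of p
pmf <- sapply(probs, function(p) dbinom(x, size = n, prob = p))
colnames(pmf) <- paste0("p=", probs)

# Each column sums to 1, and the mean of Bin(n, p) is n * p
totals <- colSums(pmf)
means  <- colSums(x * pmf)
```

Calling matplot(x, pmf, type = "h") would draw a rough version of the figure.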
Occupancy: single‐season
Ecological Process
Zi ~ Bin(1, ψ)   (binomial distribution)
Zi = unobservable true occupancy (state)
ψ = probability of occupancy
Observation Process
yij ~ Bin(1, Zi ∙ p)   (binomial distribution)
yij = observed outcome
Zi = unobservable true state of occupancy
p = probability of detection
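Simulating from the two binomial statements shows how the latent state and the observations relate. This is a sketch in base R; the values of psi, p, and the numbers of sites and surveys are assumptions chosen only for illustration.

```r
set.seed(42)

n_sites   <- 100   # assumed number of sites
n_surveys <- 4     # assumed number of surveys per site
psi <- 0.6         # assumed probability of occupancy
p   <- 0.4         # assumed probability of detection

# Ecological process: Z_i ~ Bin(1, psi)
z <- rbinom(n_sites, size = 1, prob = psi)

# Observation process: y_ij ~ Bin(1, Z_i * p)
# (matrix() fills by column, so rep(z, n_surveys) lines up with rows = sites)
y <- matrix(rbinom(n_sites * n_surveys, size = 1,
                   prob = rep(z, n_surveys) * p),
            nrow = n_sites)

# Proportion of sites truly occupied vs. proportion with >= 1 detection
true_prop <- mean(z)
naive_psi <- mean(rowSums(y) > 0)
```

Unoccupied sites can only produce zeros, so naive_psi can never exceed true_prop in this simulation.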
Logistic regression
Binary response, so represent the response (stochastic part) with a binomial distribution; the mean is a probability or proportion (p)
y ~ Bin(N, p)
y = observed outcome
N = number of trials
p = Prob(occupancy) or Prob(detection)
Link function is the logit (log-odds), applied to the probability: logit(p) = β0 + β1x1 + β2x2 + …
Occupancy state: logit(ψi), i = 1, …, no. sites
Observation process: logit(pij), j = 1, …, no. visits/site
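The logit link and its inverse are available in base R as qlogis and plogis. The sketch below uses hypothetical coefficients and a hypothetical covariate x, chosen only to show how a linear predictor maps to a probability.

```r
# Hypothetical coefficients and covariate values, for illustration only
beta0 <- -1.0
beta1 <- 0.04
x <- c(10, 20, 40, 60)

# Linear predictor on the logit (log-odds) scale
linear_pred <- beta0 + beta1 * x

# Inverse logit maps the linear predictor to a probability in (0, 1)
psi <- plogis(linear_pred)

# The logit recovers the linear predictor
back <- qlogis(psi)
```

Working on the logit scale keeps the estimated probability between 0 and 1 for any values of the coefficients.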
Assumptions
• Species never falsely detected when absent
• Detection of a species at a site independent of detecting species at other sites
• Sites closed to changes in occupancy state during survey period (no colonization or extinction)
• ψ and p constant across sites, unless heterogeneity in parameters is explained by covariates
Accounting for heterogeneity with covariates
Consider additional factors to explain variation in ψ and p
ψ can be modeled as a function of site-level covariates
• covariates for ψ must remain constant during survey period; e.g., plant community, patch size
p can be modeled as a function of:
• site-level covariates; e.g., vegetation cover
• survey-level covariates; e.g., cloud cover, air temperature, observer
Covariates
• Two types:
  • Site-level covariates (for ψ and p)
  • Observation-level covariates (for p)

        Surv.1  Surv.2  Surv.3  Surv.4  Buffel%  Time.1  Time.2  Time.3  Time.4
Site 1  0       1       1       0       40       M       E       M       E
Site 2  1       1       1       1       60       E       M       E       M
Site 3  1       0       0       0       20       E       M       E       M
Site 4  0       0       0       0       10       M       E       M       E

M = morning
E = evening
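The table can be split into the three pieces that occupancy software typically expects: a detection matrix, a site-level covariate table, and a survey-level covariate matrix. The sketch below uses base R only, and the object names are chosen for illustration. In the unmarked package these pieces correspond to the y, siteCovs, and obsCovs arguments of unmarkedFrameOccu, which is not called here.

```r
# Detection histories: rows = sites, columns = surveys
y <- matrix(c(0, 1, 1, 0,
              1, 1, 1, 1,
              1, 0, 0, 0,
              0, 0, 0, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste("Site", 1:4), paste0("Surv.", 1:4)))

# Site-level covariate: one value per site
site_covs <- data.frame(buffel = c(40, 60, 20, 10),
                        row.names = rownames(y))

# Survey-level covariate: one value per site per survey
time <- matrix(c("M", "E", "M", "E",
                 "E", "M", "E", "M",
                 "E", "M", "E", "M",
                 "M", "E", "M", "E"),
               nrow = 4, byrow = TRUE, dimnames = dimnames(y))

# Raw proportion of surveys with a detection, by time of day
detect_by_time <- tapply(as.vector(y), as.vector(time), mean)
```

The raw proportions in detect_by_time mix occupancy and detection, so they are a data check rather than an estimate of p.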
Adding covariates
Ecological and Observation Processes
logit(p) = β0 + β1X1 + β2X2+…+βnXn
Extend models with the Generalized Linear Modeling framework, which allows us to model linear functions of covariates regardless of the distribution of the response
yij ~ Bin(Ni, pij)
Run models to estimate parameters
• For estimates based on maximum-likelihood methods:
  • Code directly in R
  • Use the unmarked package in R
• For estimates based on Bayesian methods:
  • WinBUGS
  • OpenBUGS
  • JAGS
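As an illustration of coding the maximum-likelihood approach directly in R, the sketch below fits a single-season occupancy model with constant ψ and p using optim. It is a sketch under stated assumptions, not the method any particular package uses: the data are a small four-site example, and the function name negloglik is hypothetical.

```r
# Detection histories: rows = sites, columns = surveys (four-site example)
y <- matrix(c(0, 1, 1, 0,
              1, 1, 1, 1,
              1, 0, 0, 0,
              0, 0, 0, 0),
            nrow = 4, byrow = TRUE)

# Negative log-likelihood with constant psi and p;
# parameters are on the logit scale so optim can search without bounds
negloglik <- function(par, y) {
  psi <- plogis(par[1])
  p   <- plogis(par[2])
  K <- ncol(y)                 # surveys per site
  d <- rowSums(y)              # detections per site
  # Occupied-site term, plus the unoccupied term for all-zero histories
  lik <- psi * p^d * (1 - p)^(K - d) + (1 - psi) * (d == 0)
  -sum(log(lik))
}

fit <- optim(par = c(0, 0), fn = negloglik, y = y)

# Back-transform to the probability scale
psi_hat <- plogis(fit$par[1])
p_hat   <- plogis(fit$par[2])
```

With only four sites the estimates are very imprecise; the point is the structure of the likelihood, which combines the occupancy and detection processes in one function.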
Fitting models in Unmarked
• Develop and fit a set of candidate models for the state variable (here, occupancy) and detection process
# occu() takes the detection formula first, then the occupancy formula
rObject <- occu(~detect ~occupancy, UMF)
time.buff <- occu(~time ~buffel, goagUMF)
timeDate.buffYear <- occu(~time + date ~buff + year, goagUMF)
• Use model selection or frequentist methods to establish model for inference