54
MODULE 5: Spatial Statistics in Epidemiology and Public Health Lecture 5: Spatial regression Jon Wakefield and Lance Waller 1/1

MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

MODULE 5: Spatial Statistics in Epidemiologyand Public Health

Lecture 5: Spatial regression

Jon Wakefield and Lance Waller

1 / 1

Page 2: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

References

I Waller and Gotway (2004, Chapter 9) Applied SpatialStatistics for Public Health Data. New York: Wiley.

I Elliott, P., et al. (2000) Spatial Epidemiology: Methods andApplications, Oxford: Oxford University Press.

I Haining, R. (2003). Spatial Data Analysis: Theory andPractice. Cambridge: Cambridge University Press.

I Banerjee, S., Carlin, B.P., and Gelfand, A.E. (2014)Hierarchical Modeling and Analysis for Spatial Data, 2nd Ed.Boca Raton, FL: CRC/Chapman & Hall.

I Blangiardo, M. and Cameletti, M. (2015) Spatial andSpatio-temporal Bayesian Models with R-INLA. Chichester:Wiley.

2 / 1

Page 3: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

What do we have so far?

I Point process ideas (intensities, K -functions).I Data: (x , y) event locations.I Where are the clusters? Use intensities.I How are events clusters (in average)? Use K -functions.

I Inhomogeneous Poisson process → regional counts arePoisson distributed.

I Non-overlapping areas should be independent.

I Point process results provide a basis for the small areaestimation methods from yesterday.

I Tension between statistical precision (want large local samplesizes → big regions), and geographic precision (want smallregions for more detail in map).

3 / 1

Page 4: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

What’s left?

I So we know how to describe and evaluate spatial patterns inhealth outcome data.

I What about linking patterns in health outcomes to patterns inexposures?

I With independent observations we know how to use linear andgeneralized linear models such as linear, Poisson, logisticregression.

I What happens with dependent observations?

4 / 1

Page 5: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Caveat

“...all models are wrong. The practical question is howwrong do they have to be to not be useful.”Box and Draper (1987, p. 74)

5 / 1

Page 6: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

What changes with dependence?

I In statistical modeling, we are often trying to describe themean of the outcome as a function of covariates, assumingerror terms are mutually independent.

I Where do correlated errors come from?

I Perhaps outcomes truly correlated (infectious disease).

I Perhaps we omitted an important variable that has spatialstructure itself.

I If temperature is important and we left it out of a modelapplied to the continental U.S., what would the residuals looklike?

6 / 1

Page 7: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Our plan

I NY leukemia data and covariates (Waller and Gotway, 2004).

I Fit linear and Poisson regression models with various spatialcorrelation structures and compare inferences.

I Remember, all of these models are wrong, but some may beuseful.

7 / 1

Page 8: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Illustrating regression models

I New York leukemia data from Waller et al. (1994)

I 281 census tracts (1980 Census).

I 8 counties in central New York.

I 592 cases for 1978-1982.

I 1,057,673 people at risk.

8 / 1

Page 9: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Crude Rates (per 100,000)

Central New York Census Tracts, 1980Leukemia rates 1978-1982

0.000000 - 0.000021

0.000022 - 0.000324

0.000325 - 0.000588

0.000589 - 0.001096

0.001097 - 0.006993

9 / 1

Page 10: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Building the model

I Let Yi = count for region i .

I xi ,TCE = inverse distance to TCE site.

I xi ,65 = percent over age 65 (census).

I xi ,home = percent who own own home (census).

I A generic regression model:

Yi = β0 + xi ,TCEβTCE + xi ,65β65 + xi ,homeβhome + εi .

10 / 1

Page 11: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Assumptions for regression

I The error terms, εiind∼ N(0, σ2);

I The data have a constant variance, σ2;

I The data are uncorrelated (OLS) or have a specifiedparametric covariance structure (GLS);

11 / 1

Page 12: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Y normally distributed?

Histogram

Incidence Proportions

Freq

uenc

y

0.000 0.004

050

100

150

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●●●●

●●

●●

●●

●●●

● ●●

●●

●●

●●●

●●

●●

●●

●●●●

●●

●●

●●

●●●●

●●●

● ●

●●

●●

●●

●●●

●●

●●●

●●

●●● ●

●●●

● ●●

●●● ●

●●

●●

●●●

●●

●●●

●●●

●●

●●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●●

● ●

●●●●

● ●●

●●●

●●

●●

●●

●●

●●

● ●

●●

−3 −1 0 1 2 30.00

00.

003

0.00

6

Normal Q−Q Plot

Theoretical Quantiles

Sam

ple

Qua

ntile

s

12 / 1

Page 13: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Transformation?

Zi = log

(1000(Yi + 1)

ni

).

Transformed Incidence Proportions

Freq

uenc

y

−2 0 1 2 3 4 5

020

4060

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●●

●●

●●

●●

●●

●●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

−3 −1 0 1 2 3

−20

12

34

Normal Q−Q Plot

Theoretical Quantiles

Sam

ple

Qua

ntile

s

13 / 1

Page 14: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Outliers, where are the top 3?

14 / 1

Page 15: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Scatterplots

●●

●●

●●●●●●

●●●●●

●●

●●

●●●●

●●

● ●

●●

● ●

●●

●●●

●●●

●●●

● ●

●●

●●●

●●●

●●

●●●●●

●●

●●

●●●

●●●

●●

●●●

●●●●●

●●

●●●

●●

●●

●●

●●●●●

●●●

●●●

●●●

●●

●●

●●●

●●

●●●●●

●●

●●●●

●●

●●

●●

●●●

●●●●●●●

●●●

●●

●●

●●●●●●

●●

●●

●●●●

●●●●

●●●●

●●

●●

●●

●●

● ●

●●

●●

●●

0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5

−20

24

Inverse Distance

Tran

sfor

med

Cou

ntInverse Distance vs. Outcome

●●

●●

●●●●

●●

●●

●● ●

● ●

●●

●●●

●●

● ●

●●

● ●

●●

●●●

●●●

●● ●

● ●

●●

●● ●

●●

●●

● ●●●●

●●

●●

●●●

● ●●

●●

●●●

●●●●●

●●

●●●

●●

●●

●●

●●●●●

●●●

●●●

●●●

●●

●●

●●●

●●

●●●●●

●●

●●●●

●●

●●

●●

●●●

●●●●●●

●●●

●●

●●

●●●●●●

●●

●●

●●●

●●●●

●●●●

●●

●●

●●

●●

● ●

● ●

●●

●●

1 2 3 4 5 6

−20

24

Log(100*Inverse Distance)

Tran

sfor

med

Cou

nt

Log(100*Inverse Distance) vs. Outcome

●●

●●

● ●●

●●

●●●●●

●●

●●

●●

●●

●●

● ●

●●

●●

●●

●●●

●● ●

●●●

● ●

●●

●●●

●●

●●

●●●●

●●

●●

● ●

●● ●

●● ●

●●

● ●●

●●

●●

●●

●●

●●

●●

●●

●●●●

●●●

●●

●●

●●

●●

●●●

●●

● ●●

●●

●●

●●● ●

●●

●●

●●

● ●●

●●

●●

●●

●●●

●●

●●

●● ●●

●●

●●

●●

●●

●●

●●●●

●●●●●

●●

●●

●●●

● ●

● ●

●●

●●

0.0 0.1 0.2 0.3 0.4 0.5

−20

24

% > 65 years

Tran

sfor

med

Cou

nt

Percent > 65 vs. Outcome

●●

●●

● ●●

●●

●●●●●

● ●

●●

●●●

●●

●●

●●

●●

●●

●●●

●● ●

●● ●

●●

●●

●● ●

●●

●●

●●● ●

●●

●●

● ●

●●●

● ●●

● ●

● ●●

●●

●●

●●

●●

●●

●●

●●

●● ●●

●●

●●

●●

●●

●●

●●

●●

●●●

● ●

●●

●●●●

●●

●●

● ●

●●●

●●

●●

●●

● ●●

●●

●●

●● ● ●

●●

●●

●●

●●

●●

●● ●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

0.0 0.2 0.4 0.6 0.8 1.0

−20

24

% Own Home

Tran

sfor

med

Cou

nt

Percent Own Home vs. Outcome

15 / 1

Page 16: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Linear Regression (OLS)

Parameter Estimate Std. Error p-value

β0 (Intercept) -0.5173 0.1586 0.0012

β1 (TCE) 0.0488 0.0351 0.1648

β2 (% Age > 65) 3.9509 0.6055 <0.0001

β3 (% Own home) -0.5600 0.1703 0.0011

σ2 0.4318 277 df

R2=0.1932 AIC=567.5

16 / 1

Page 17: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Is OLS appropriate?

I Z s roughly Gaussian (symmetric).

I Do Z s have constant variance?

I No, since population sizes vary.

I Var(Zi ) = Var(

log(1000(Yi+1)

ni

))I Try weighted least squares with weights 1/ni .

17 / 1

Page 18: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Linear Regression (WLS)

Parameter Estimate Std. Error p-value

β0 (Intercept) -0.7784 0.1412 <0.0001

β1 (TCE) 0.0763 0.0273 0.0056

β2 (% Age > 65) 3.8566 0.5713 <0.0001

β3 (% Own home) -0.3987 0.1531 0.0097

σ2 1121.94 277 df

R2=0.1977 AIC=513.5

18 / 1

Page 19: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

What changed?

I The three outliers are all in regions with small ni .

I Weighting reduced their impact on estimates.

I Most profound effect is with respect to TCE.

19 / 1

Page 20: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

WLS fitted values

20 / 1

Page 21: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Residual map

21 / 1

Page 22: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

What are we looking for?

I Patterns in locations of residuals.

I Model underfit (predictions too low) near cities?

I Correlations in residuals?

I We can try maximum likelihood fit incorporating residualcorrelation via estimate of covariance matrix as function ofdistance between observations.

22 / 1

Page 23: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Linear Regression, Correlated Errors (ML)

Parameter Estimate Std. Error p-value

β0 (Intercept) -0.7222 0.1972 <0.0001

β1 (TCE) 0.0826 0.0434 0.0576

β2 (% Age > 65) 3.7093 0.6188 <0.0001

β3 (% Own home) -0.3245 0.2044 0.1136

c0=0.3740 cs=0.0558 a=6.93

AIC=565.6 277 df

23 / 1

Page 24: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Weighting?

I We also need to include weights to account forheteroskedasticity.

I Again we use weights equal to 1/ni .

I What changes?

24 / 1

Page 25: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Linear regression, Correlated, Weighted

Parameter Estimate Std. Error p-value

β0 (Intercept) -0.9161 0.1648 <0.0001

β1 (TCE) 0.0956 0.0322 0.0032

β2 (% Age > 65) 3.5763 0.5920 <0.0001

β3 (% Own home) -0.2285 0.1761 0.1956

c0=997.65 cs=127.12 a=6.86

AIC=514.7 277 df

25 / 1

Page 26: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Fitted values (correlated, weighted)

26 / 1

Page 27: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Modelling counts directly

I Using linear regression required a fair amount of datatransformation, just to meet modelling assumptions.

I Can we model the counts directly?

I In epidemiology, common to use logistic or Poisson regression.

I For rare disease, little difference between logistic and Poisson.

I Both are examples of generalized linear models (McCullaghand Nelder, 1989).

27 / 1

Page 28: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Building the model

I Let Yi = count for region i .

I Let Ei = expected count for region i .

I Let (xi ,TCE , xi ,65, xi ,home) be the associated covariate values.

I Poisson regression:

Yi ∼ Poisson(Eiζi )

where

log(ζi ) = β0 + xi ,TCEβTCE + xi ,65β65 + xi ,homeβhome .

28 / 1

Page 29: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

What’s different?

I Poisson distribution for counts, rather than transformingproportions for normality.

I Link function: Natural log of mean of Yi is a linear functionof covariates.

I βs represent multiplicative increases in expected counts, eβ ameasure of relative risk associated with one unit increase incovariate.

I Ei an offset, what we expect if the covariates have no impact.

I Age, race, sex adjustments in either Ei (standardization) orcovariates.

29 / 1

Page 30: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

How do we add spatial correlation?

I Trickier than in regression, since mean and variance arerelated for Poisson observations.

I Maximum likelihood will be challenging (since the likelihood ismore complicated).

I A common tool is to add a random effect (intercept).

I Represents an impact of region i , not accounted for in Ei orthe covariates.

I We define this random effect to have a spatial distribution.

30 / 1

Page 31: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Building the model: New York data

I Assume Ei known, perhaps age-standardized, or based onglobal (external or internal) rates.

I Our model is

Yi |β, ψiind∼ Poisson(Ei exp(x ′

iβ + ψi )),

log(ζi ) = β0 + xi ,TCEβTCE + xi ,65β65 + xi ,homeβhome + ψi .

I The ψi represent the random intercepts.

I Add overdispersion via ψiind∼ N(0, vψ).

I Add spatial correlation via

ψ ∼ MVN(0,Σ).

31 / 1

Page 32: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Priors and “shrinkage”

I Overdispersion model (i.i.d. ψi ) results in each estimate beinga compromise between the local SMR and the global averageSMR.

I “Borrows information (strength)” from other observations toimprove precision of local estimate.

I “Shrinks” estimate toward global mean. (Note: “shrink” doesnot mean “reduce”, rather means “moves toward”).

32 / 1

Page 33: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Local shrinkage

I Spatial model (correlated ψi ) results in each estimate begin acompromise between the local SMR and the local averageSMR.

I Shrinks each ψi toward the average of its neighbors.

I Can also include both global and local shrinkage (Besag, York,and Mollie 1991).

I How do we fit these models?

33 / 1

Page 34: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Bayesian inference

Bayesian inference regarding model parameters based on posteriordistribution

Pr [β,ψ|Y ]

proportional to the product of the likelihood times the prior

Pr [Y |β,ψ]Pr [ψ]Pr [β].

Defers spatial correlation to the prior rather than the likelihood.

34 / 1

Page 35: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Spatial priors

I Could model joint distribution

ψ ∼ MVN(0,Σ).

I Could also model conditional distribution

ψi |ψj 6=i ∼ N

(∑j 6=i cijψj∑

j 6=i cij,

1

vCAR∑

j 6=i cij

), i = 1, . . . ,N.

where cij are weights defining the neighbors of region i .

I Adjacency weights: cij = 1 if j is a neighbor of i .

35 / 1

Page 36: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

CAR priors

I The conditional specification defines the conditionalautoregressive (CAR) prior (Besag 1974, Besag et al. 1991).

I Under certain conditions on the cij , the CAR prior defines avalid multivariate joint Gaussian distribution.

I Variance covariance matrix a function of the inverse of thematrix of neighbor weights.

36 / 1

Page 37: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Perspective: Generalized linear mixed model

I Given the values of the random effects (ψi s), observations(Yi s) are independent.

I Taking into account correlation in the ψi s, the Yi s arecorrelated.

I Conditionally independent Yi |ψi give likelihood function.

I (Spatially correlated) distribution of the ψi s a priordistribution.

37 / 1

Page 38: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Fitting Bayesian models

I Posterior often difficult to calculate mathematically.

I Markov chain Monte Carlo: Iterative simulation approach tomodel fitting.

I Given full conditional distributions, simulate a new value foreach parameter, holding the other parameter values fixed.

I The set of simulated values converges to a sample from theposterior distribution.

I Alternative: integrated nested Laplace analysis using the inla

package (example code).

38 / 1

Page 39: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Complete model specification

Yi |β, ψiind∼ Poisson(Ei exp(x ′

iβ + ψi )),

log(ζi ) = β0 + xi ,TCEβTCE + xi ,65β65 + xi ,homeβhome + ψi .

βk ∼ Uniform.

ψi |ψj 6=i ∼ N

(∑j 6=i cijψj∑

j 6=i cij,

1

vCAR∑

j 6=i cij

), i = 1, . . . ,N.

1

vCAR∼ Gamma(0.5, 0.0005).

39 / 1

Page 40: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

MCMC trace plots

0 2000 4000 6000 8000

−6−4

−20

Iteration

Inte

rcep

t

0 2000 4000 6000 8000

26

1014

Iteration

Age

Effe

ct

0 2000 4000 6000 8000

−1.5

−0.5

Iteration

Expo

sure

Effe

ct

0 2000 4000 6000 8000

−11

23

Iteration

Hom

e O

wner

ship

Effe

ct

40 / 1

Page 41: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Posterior densities

−0.5 0.0 0.5

0.0

1.0

2.0

Intercept

Dens

ity

1 2 3 4 5 6

0.0

0.2

0.4

0.6

Age Effect

Dens

ity

0.00 0.10 0.20 0.30

02

46

8

Exposure Effect

Dens

ity

−1.0 −0.5 0.0 0.5

0.0

1.0

2.0

Home Ownership Effect

Dens

ity

41 / 1

Page 42: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

MCMC posterior estimates

Covariate Posterior 95% CredibleMedian Set

β0 0.048 (-0.355, 0.408)β65 3.984 (2.736, 5.330)βTCE 0.152 (0.066, 0.226)βhome -0.367 (-0.758, 0.049)

42 / 1

Page 43: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

But there’s more!

I A nifty thing about MCMC estimates:

We get posterior samples from any function of modelparameters by taking that function of the sampled posteriorparameter values.

I Gives us posterior inference for SMRi = Yi ,fit/Ei .

I Also can get Pr [SMRi > 200|Y ] and map these exceedenceprobabilities.

43 / 1

Page 44: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Posterior median SMRs

44 / 1

Page 45: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Posterior exceedence probabilities

45 / 1

Page 46: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Example 2

I Cryptozoology Example: Waller and Carlin (2010) DiseaseMapping. In Handbook of Spatial Statistics, Gelfand et al.(eds.). Boca Raton: CRC/Chapman and Hall.

46 / 1

Page 47: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Cryptozoology example

I County-specific reports of encounters with Sasquatch(Bigfoot).

I Data downloaded from www.bfro.net

I Sightings from counties in Oregon and Washington (PacificNorthwest).

I Probability of report related to population density?

I (Hopefully) rare events in small areas.

I Perhaps spatial smoothing will stabilize local rate estimates.

47 / 1

Page 48: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Sasquatch Data

Wasco

Number of Reports Reports per 2000 Population

2000 Population per Square Mile

0 200 400 600 800100

Kilometers

Legend

Population/Sq. Mi.Reports

0.7 - 12.9

13.0 - 32.1

32.2 - 69.9

70.0 - 180.1

180.2 - 414.0

414.1 - 793.3

793.4 - 1419.3

Reports/Person

0.00000 - 0.00003

0.00003 - 0.00008

0.00008 - 0.00016

0.00016 - 0.00026

0.00026 - 0.00046

0.00046 - 0.00076

0.00076 - 0.00517

Skamania

King

0

1-5

6-10

11-15

16-20

21-25

25-51

48 / 1

Page 49: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Reports vs. Population Density

02

46

0.000 0.001 0.002 0.003 0.004 0.005

log(Census 2000 population density)

Reports per Census 2000 population size

Log of Population Density

02

46

0 e+00 2 e−04 4 e−04 6 e−04 8 e−04 1 e−03

log(Census 2000 population density)

Reports per Census 2000 population size

Ignoring outlier

49 / 1

Page 50: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Predicted relative risks and credible sets

Filled circle = Skamania, Filled square = Wasco

0 5 10 15 20 25 30 35

−2−1

01

23

4

log(

rela

tive

risk)

No Random Effects Exchangeable CAR Convolution

50 / 1

Page 51: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Mapped relative risksNo random effect RRs Exchangeable RRs

CAR RRs Convolution RRs

Legend 0.00 - 1.00

1.01 - 2.00

2.01 - 3.00

3.01 - 4.00

4.01 - 15.00

approxim ately 70.00

51 / 1

Page 52: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Skamania Sasquatch Ordinances

I http://www.skamaniacounty.org/commissioners/

homepage/ordinances-2/

I Big Foot Ordinance 69-1: “THEREFORE BE IT RESOLVEDthat any premeditated, willful and wonton slaying of any suchcreature shall be deemed a felony punishable by a fine not toexceed Ten Thousand Dollars ($10,000.00) and/orimprisonment in the county jail for a period not to exceedFive (5) years. ADOPTED this 1st day of April, 1969.”

I Big Foot Ordinance 1984-2:I Repealed felony and jail sentence.I Established a Sasquatch Refuge (Skamania County).I Clarified penalty (gross misdemeanor vs. misdemeanor) and

penalty (fine and jail time), disallowed insanity defense, andclarified distinction between coroner designation of victim ashumanoid (murder) or anthropoid (this ordinance).

52 / 1

Page 53: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

And...

www.amazon.com/Skamania-County-Washington-Bigfoot-Vintage/dp/B076PWN7ZM

53 / 1

Page 54: MODULE 5: Spatial Statistics in Epidemiology and Public ...faculty.washington.edu/jonno/SISMIDmaterial/2020_SISMID_5_5.pdf · Spatio-temporal Bayesian Models with R-INLA. Chichester:

Conclusions

I What method to use depends on what you want data youhave and what question you want to answer.

I All methods try to balance trend (fixed effects) withcorrelation (here, with random effects).

I All models wrong, some models useful.

I Trying more than one approach often sensible.

54 / 1