Andrew Lawson MUSC

INLA

INLA is a relatively new tool for approximating posterior distributions in Bayesian models.

INLA stands for Integrated Nested Laplace Approximation.

The Laplace approximation has been known for some time (see e.g. Kass and Steffey (1989) JASA).

More recently it has been shown that, if nested approximations are made and sparse matrix theory is exploited, it is possible to obtain reasonably good and fast estimates of many posterior quantities.

INLA more formally

The Laplace approximation fits a Gaussian distribution to the posterior in question by matching its mode and curvature, and uses this Gaussian to provide an integral approximation to the density.

For models whose posterior is close to Gaussian, the approximation is very good.
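To make the mode-and-curvature idea concrete, here is a minimal base R sketch of a one-parameter Laplace approximation. It is an illustration only, not how the INLA package is implemented. The toy model (a Poisson count with a gamma prior, so the exact posterior is known) and all numerical values and variable names are assumptions chosen for the example.

```r
# Toy example (assumed values): Poisson count y with expected count e and a
# Gamma(a, b) prior on the rate multiplier, so the exact posterior is
# Gamma(a + y, b + e) and the approximation can be checked against it.
y <- 12; e <- 8; a <- 2; b <- 2

# Unnormalised log-posterior = log-likelihood + log-prior.
log_post <- function(theta) {
  dpois(y, e * theta, log = TRUE) + dgamma(theta, shape = a, rate = b, log = TRUE)
}

# Step 1: locate the posterior mode numerically.
opt <- optimize(log_post, interval = c(1e-6, 10), maximum = TRUE, tol = 1e-8)
post_mode <- opt$maximum

# Step 2: curvature at the mode from a central finite difference.
h <- 1e-4
curv <- (log_post(post_mode + h) - 2 * log_post(post_mode) +
           log_post(post_mode - h)) / h^2
sd_laplace <- sqrt(-1 / curv)

# Step 3: the Gaussian N(post_mode, sd_laplace^2) is the Laplace approximation.
grid <- seq(0.4, 2.6, by = 0.01)
dens_exact <- dgamma(grid, shape = a + y, rate = b + e)
dens_laplace <- dnorm(grid, mean = post_mode, sd = sd_laplace)
max_abs_diff <- max(abs(dens_exact - dens_laplace))
```

For this toy posterior the exact mode is (a + y - 1)/(b + e) = 1.3, so `post_mode` can be checked directly. Plotting `dens_exact` and `dens_laplace` against `grid` shows where the Gaussian fit departs from the skewed gamma density.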

©Andrew B Lawson 2013

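How it's computed

Notation: outcome data $\mathbf{y}$, parameters $\boldsymbol{\lambda}$, hyperparameters $\theta$.

$$
P(\lambda_i \mid \mathbf{y}) = \int P(\lambda_i \mid \theta, \mathbf{y})\, P(\theta \mid \mathbf{y})\, d\theta
\approx \sum_k \tilde{P}(\lambda_i \mid \theta_k, \mathbf{y})\, \tilde{P}(\theta_k \mid \mathbf{y})\, \Delta_k
$$

where $\tilde{P}$ denotes the Laplace approximation and the $\Delta_k$ are integration weights.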

INLA basics

Can be used for a wide variety of hierarchical models:

Spatial models

Survival data

Longitudinal data

fMRI imaging

Clinical applications

Econometrics

INLA advantages

VERY fast computation

Can fit some spatial models many times faster than MCMC (in the form of WinBUGS)

Can handle very large datasets: we have an example with >60,000 observations which runs quickly on INLA but 'freezes' on WinBUGS

Can handle large numbers of regression predictors in fixed effect models

Handles random effects easily

Models on INLA

INLA operates in the same way as the lm function in R

Two components: formula and inla call

Example:

    > formula1 = y ~ 1 + x
    > result1 = inla(formula1, family = "gaussian", data = dataframe)

where dataframe is the name of the data frame holding y and x.

This fits a linear regression of y on x, with an intercept
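The parallel with lm can be tried in base R alone. The sketch below simulates a small data frame (the data, the seed and the true coefficients are assumptions for illustration) and fits the same formula object with lm. The corresponding inla call is left as a comment because it needs the INLA package to be installed.

```r
# Simulated data (illustrative only): y depends linearly on x plus noise.
set.seed(1)
dataframe <- data.frame(x = runif(50))
dataframe$y <- 2 + 3 * dataframe$x + rnorm(50, sd = 0.5)

# The formula object is defined once, exactly as for inla().
formula1 <- y ~ 1 + x

# Classical least-squares fit with lm.
fit_lm <- lm(formula1, data = dataframe)
coef(fit_lm)

# With the INLA package loaded, the same formula would be passed on as:
# result1 <- inla(formula1, family = "gaussian", data = dataframe)
```

With vague default priors, one would expect the posterior means of the fixed effects from the Gaussian inla fit to be close to the lm coefficients, which makes lm a quick sanity check on a new model formula.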


INLA on R

Basic formulation is akin to using the lm function in R

Two basic calls are made:

Model definition (formula)

Model fitting

Example: linear regression with 2 predictors

    formula1 <- y ~ 1 + x1 + x2
    res1 <- inla(formula1, family = "gaussian", data = As,
                 control.compute = list(dic = TRUE, cpo = TRUE))

Bayesian Disease Mapping

Spatial distribution of incident counts of disease within small areas

I don't consider case events at residential addresses here

Example: SC congenital deaths, 1990

Statistical Issues

Count outcomes in m regions/small areas: $y_i,\ i = 1, \ldots, m$

Need population 'background' for each area (expected count or rate): $e_i,\ i = 1, \ldots, m$

Various methods are used to estimate the expected counts, BUT they are assumed fixed in the analysis.

Simple estimator of relative risk

Standardized Incidence Ratio (SIR): ratio of observed to expected counts:

$$\hat{\theta}_i = y_i / e_i$$

This is a crude estimator: it is sometimes difficult to interpret and can be unstable (a property of ratios)
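The instability of the ratio is easy to see numerically. The following base R sketch uses made-up observed and expected counts for five areas (the values are assumptions for illustration, and the names obs and expe simply echo the variable names used in the INLA examples later in these slides).

```r
# Hypothetical observed and expected counts for m = 5 small areas.
obs  <- c(0, 1, 3, 12, 40)
expe <- c(0.4, 0.9, 2.5, 10.0, 38.0)

# SIR: ratio of observed to expected counts.
sir <- obs / expe

# Rough standard error under a Poisson assumption:
# Var(y_i / e_i) = theta_i / e_i, estimated by y_i / e_i^2.
se_sir <- sqrt(obs) / expe

data.frame(obs, expe, sir = round(sir, 2), se = round(se_sir, 2))
```

Areas with small expected counts show the problem: a single extra case moves the SIR a long way, and an area with zero cases gets an SIR of exactly 0 with an estimated standard error of 0, which is not a sensible statement of uncertainty.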

Basic Model

Poisson count model assumed for small areas:

$$y_i \sim \text{Pois}(\mu_i), \qquad \mu_i = e_i \theta_i \quad \text{(multiplicative model)}$$

This is our data level model and we assume a Poisson likelihood

The main parameter is the relative risk $\theta_i$. This can have a prior distribution (e.g. a gamma or log-normal)

Alternatively the log of relative risk can be modeled
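As a small worked case of putting a prior on the relative risk, the sketch below uses the conjugate gamma prior, for which the posterior is available in closed form: with a Gamma(a, b) prior and a Poisson likelihood with mean equal to expected count times relative risk, the posterior is Gamma(a + observed, b + expected). The counts and the prior values a = b = 2 are assumptions chosen for illustration, not values from the SC example.

```r
# Hypothetical observed and expected counts for five small areas.
obs  <- c(0, 1, 3, 12, 40)
expe <- c(0.4, 0.9, 2.5, 10.0, 38.0)

# Assumed gamma prior on each relative risk: shape a, rate b, prior mean a / b = 1.
a <- 2
b <- 2

# Conjugate update: posterior is Gamma(a + obs, b + expe) for each area.
post_mean <- (obs + a) / (expe + b)
post_lo   <- qgamma(0.025, shape = obs + a, rate = expe + b)
post_hi   <- qgamma(0.975, shape = obs + a, rate = expe + b)

# Compare with the crude ratio of observed to expected counts.
sir <- obs / expe
data.frame(sir = round(sir, 2), post_mean = round(post_mean, 2),
           lo = round(post_lo, 2), hi = round(post_hi, 2))
```

Each posterior mean is a weighted average of the crude ratio and the prior mean, so areas with small expected counts are pulled strongly towards 1 while areas with large expected counts barely move. This smoothing is the basic motivation for the hierarchical models that follow.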

Relative risk models

A) intercept (constant) model

B) log-normal (random intercept) model

C) GLMM

D) Convolution model

$$\log(\mu_i) = \underbrace{\log(e_i)}_{\text{offset}} + \log(\theta_i)$$

$$\log(\theta_i) = \ldots \quad \text{(model terms)}$$

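Risk models I

Constant risk:

$$\log(\theta_i) = \alpha, \qquad \theta_i = \exp(\alpha)$$

Log-normal risk:

$$\log(\theta_i) = \alpha + v_i, \qquad \theta_i = \exp(\alpha + v_i)$$

Generalized Linear Mixed Model (GLMM):

$$\log(\theta_i) = \mathbf{x}_i^T \boldsymbol{\alpha} + \mathbf{z}_i^T \boldsymbol{\gamma}$$

$\mathbf{x}_i^T \boldsymbol{\alpha}$: linear predictor

$\mathbf{z}_i^T \boldsymbol{\gamma}$: linear combination of random effects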

Convolution Models

Special case of GLMM

Includes spatial correlation:

$$\log(\theta_i) = \alpha + v_i + u_i$$

where $u_i$ is a spatial effect and $v_i + u_i$ is called a convolution

Adding covariates is straightforward:

$$\log(\theta_i) = \alpha + \alpha_1 x_{1i} + v_i + u_i$$

where $x_{1i}$ is a covariate

Use of f() function

A powerful feature of the INLA package is the f() function

This allows special links to be specified to predictors:

Can have smooth non-linear links

Can have correlated dependence

Can include random effects via this function


Some examples

SC congenital example: UH only

    # UH model
    formulaUH = obs ~ f(region, model = "iid")
    resultUH = inla(formulaUH, family = "poisson", data = SCcongen90,
                    control.compute = list(dic = TRUE, cpo = TRUE), E = expe)
    summary(resultUH)

SC congenital example: UH + CH + poverty covariate

    # UH + CH + poverty covariate
    setwd("working directory")
    formulaUHCHPov = obs ~ 1 + pov + f(region, model = "iid") +
      f(region2, model = "besag", graph = "SC.graph")
    resultUHCHPov = inla(formulaUHCHPov, family = "poisson", data = SCcongen90,
                         control.compute = list(dic = TRUE, cpo = TRUE), E = expe)
    summary(resultUHCHPov)

Output from UH only model

Diagnostics

Space-time examples

Ohio county level respiratory cancer

A well known dataset (full dataset 21 years), available at http://www.stat.uni-muenchen.de/service/datenarchiv/ohio/ohio_e.html

1979-1988 shown here; SIRs displayed


Basic retrospective model

Infinite population; small disease probability; Poisson assumption

$$y_{ij} \sim \text{Pois}(e_{ij}\theta_{ij})$$

$$\log(\theta_{ij}) = \alpha_0 + S_i + T_j + ST_{ij}$$

$S_i$: spatial terms

$T_j$: temporal terms

$ST_{ij}$: interaction

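Some Random Effect models

model 1a: $\log(\theta_{ij}) = \alpha_0 + v_i + u_i + \beta t_j$

model 1b: $\log(\theta_{ij}) = \alpha_0 + v_i + u_i + \gamma_j$

model 2: $\log(\theta_{ij}) = \alpha_0 + v_i + u_i + \gamma_{1j} + \gamma_{2j}$

model 3: $\log(\theta_{ij}) = \alpha_0 + v_i + u_i + \gamma_{2j} + \psi_{ij}$

model 4: $\log(\theta_{ij}) = \alpha_0 + v_i + u_i + \gamma_{1j} + \gamma_{2j} + \psi_{ij}$

model 5: variants of (3) with alternative forms of $\psi_{ij}$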

Model fitting results (WinBUGS)

    Model   DIC      pD
    1a      5759     80
    1b      5759     80
    2       5759.4   79
    3       5751.4   129
    4       5755.3   129
    5       5750.6   115


DIC comparison

    INLA model                        DIC       pD       WB model
    1  Spatial only (UH)              5758.2    79.65    -
    2  UH + CH                        5757.4    79.66    -
    3  UH + CH + time trend           5759.28   80.05    1a
    4  UH + CH + time iid             5760.4    80.58    -
    5  UH + CH + time RW1             5760.6    80.60    1b
    6  UH + CH + time (iid, rw1)      5763.1    81.97    2
    7  UH + time rw1 + ST int         5753.80   116.78   -
    8  UH + CH + time rw1 + ST int    5757.9    86.41    3

Limitations of INLA

Models must be expressible in the linear model format

There are restrictions on the types of prior distributions that can be assumed

Example: there is no Dirichlet or multinomial distribution currently

Mixtures cannot be modeled, but joint models are available

Finally

Other INLA features:

Measurement error in predictors (mec, meb)

Missingness in outcomes (copy facility)

Geographically weighted regression, e.g. f(ind, x1, model="besag", graph="……")

Smoothed predictors, e.g. f(x1, model="rw1")

Modeling point processes via SPDE facilities (LGCP). Caveat: Taylor and Diggle (2012)

Finally

INLA versus WinBUGS

    Feature                  INLA                               WinBUGS
    Runs on R                Yes                                Only through BRugs or R2WinBUGS
    Large datasets           Yes                                -
    Mixtures                 -                                  Yes
    Posterior functionals    -                                  Yes
    Special spatial models   Yes (some): LGCP for point         Yes: GeoBUGS + CAR models
                             processes
    Missingness              Only outcomes in general, but      Can handle a range of missingness
                             can handle drop-out models

Conclusions

Thanks for your attention! Contact address: [email protected]

INLA examples given in Appendix D of Lawson, A. B. (2013) Bayesian Disease Mapping: hierarchical modeling in spatial epidemiology. 2nd Ed. CRC Press, New York

Full 2 x 2 day courses on BDM (including WinBUGS and INLA) are given each year at MUSC (March) and the University of Edinburgh (June).

Contacts:

MUSC courses: June Watson, email: [email protected]

UOE courses: Bob Carr, email: [email protected]
