Bayesian Geoadditive Latent Variable Models for Discrete and Continuous Responses, Zurich, Sept. 25-26, 2007. 1 Bayesian Geoadditive Latent Variable Models for Discrete and Continuous Responses Ludwig Fahrmeir Department of Statistics, University of Munich, Germany. 1. Applications: - Internet survey ‘Prospect Germany 1‘ - Child morbidity and malnutrition in Nigeria - Post war security in Cambodia 2. Geoadditive latent variable models 3. MCMC inference based on auxiliary variables 4. Applications: Results

Bayesian Geoadditive Latent Variable Models for Discrete and Continuous Responses

Ludwig Fahrmeir

Department of Statistics, University of Munich, Germany.

1. Applications:
- Internet survey “Prospect Germany 1”
- Child morbidity and malnutrition in Nigeria
- Post war security in Cambodia

2. Geoadditive latent variable models

3. MCMC inference based on auxiliary variables

4. Applications: Results

1. Applications: Internet Survey “Prospect Germany 1” (1)

Survey initiated by McKinsey & Co., stern.de and t-online (2001)

General goal: To find out in which areas of life people are willing to bear responsibility themselves, and in which areas they expect the state to take responsibility.

- Data from a subsample of 6804 individuals.
- Most variables are binary or (ordered) categorical.
- Continuous covariate: age of participant (in years).
- Spatial information: 402 administrative districts in Germany.

Analysis with ten indicators and two latent variables:
- The first latent variable reflects the participant’s attitude towards whether social coverage should be one’s own responsibility or whether the state has to take care of it.
- The second latent variable reflects the ambition of the person to achieve something in their job and in society.

1. Applications: Internet Survey “Prospect Germany 1” (2)

Table 1: Variable names, variable types, questions/statements, and response categories of the ten indicators.

1. Applications: Internet Survey “Prospect Germany 1” (3)

Table 2: Variable names, variable types and response categories of the four indirect covariates used in the analyses.

1. Applications: Child morbidity and malnutrition in Nigeria (1)

Data: about 5000 children from DHS for Nigeria (2003)

Goal: Assess impact of personal, socioeconomic and public health factors as well as spatial location on the latent variables “morbidity” and “malnutrition” of children.

Responses / indicators (within two weeks before the interview):

- child had diarrhoea (or not),
- child had cough (or not),
- child had fever (or not).

Malnutrition status of child measured through Z-scores for stunting, wasting and underweight:

$Z\text{-score}_i = \dfrac{AI_i - \mathrm{med}(AI)}{\sigma(AI)}$
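The Z-score above can be illustrated with a minimal sketch. It assumes that med(AI) and σ(AI) are the median and standard deviation of the indicator in a reference population; the slide does not say which population is used. The function name and the numbers are hypothetical.

```python
def z_score(ai, ref_median, ref_sd):
    """Standardise one anthropometric measurement against a reference population.

    ai         -- indicator value AI_i of child i (e.g. height in cm)
    ref_median -- med(AI), assumed here to come from a reference population
    ref_sd     -- sigma(AI), assumed here to come from the same population
    """
    if ref_sd <= 0:
        raise ValueError("reference standard deviation must be positive")
    return (ai - ref_median) / ref_sd


# Hypothetical values: 80 cm against a reference median of 86 cm and sd of 3 cm.
print(z_score(80.0, 86.0, 3.0))  # -2.0
```

A negative value means the child lies below the reference median, measured in reference standard deviations.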

1. Applications: Child morbidity and malnutrition in Nigeria (2)

Covariates:

- age of child (in months)
- age of mother (in years)
- body mass index of mother

Categorical covariates characterize socio-economic and public health environment.

Spatial information: district of Nigeria where mother and child live.

1. Applications: Post war security in Cambodia (1)

Conflict and violence data collected by the monitoring arm of the Government of Cambodia’s decentralization program SEILA (the Khmer word for foundation stone). We use data for 2002, obtained from headmen and leaders of over 13000 villages and urban neighbourhoods.

More details on the data as well as the sociological and political background are given in Benini, Owen and Rue (2006). They used separate geoadditive count data models to analyze the impact of the legacy of war, poverty and resource competition, urbanity, and governance quality on the three dependent variables
- number of serious crimes committed in community,
- number of land conflicts in community,
- number of households in community known to have domestic violence problems.

1. Applications: Post war security in Cambodia (2)

We apply a Poisson indicator LVM to these three indicators, focussing on the latent variable “disposition for violence”. Instead of the total numbers of counts per year, we use the monthly averages y1, y2 and y3 of the three count variables as target variables. Because the yearly numbers are only estimates provided by local leaders, the effect of averaging can be neglected. Averaging also helps to make the data analysis computationally feasible.

1. Applications: Post war security in Cambodia (3)

2. Geoadditive latent variable models (1)

2.1 Measurement models for observable responses

- $y_j$, $j = 1,\dots,p$: observable responses (manifest variables, indicators); $y_{ij}$, $i = 1,\dots,n$: observed value of $y_j$

- Gaussian responses: $y_{ij} \sim N(\mu_{ij}, \sigma_j^2)$, $\mu_{ij} = \alpha_j' u_i + \lambda_j' l_i$
  $\alpha_j$: vector of direct effects of covariate vector $u_i$
  $\lambda_j = (\lambda_{j1},\dots,\lambda_{jq})'$: factor loadings of latent variables $l_i = (l_{i1},\dots,l_{iq})'$, $q < p$

- Binary responses: $y_{ij} \sim B(1, \pi_{ij})$, $\pi_{ij} = \Phi(\alpha_j' u_i + \lambda_j' l_i)$

- Poisson responses: $y_{ij} \sim Po(\mu_{ij})$, $\mu_{ij} = \exp(\alpha_j' u_i + \lambda_j' l_i)$

Generally: $y_j$ Gaussian, binomial, (ordered) categorical, Poisson, lifetime, with

$E(y_{ij}) = h_j(\alpha_j' u_i + \text{nonlinear effects} + \text{spatial effects} + \lambda_j' l_i)$
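As a rough illustration of the measurement models, the sketch below evaluates the expected response for the Gaussian, binary and Poisson cases from one shared predictor. It assumes the binary case uses the probit response function (standard normal cdf), in line with the probit model named later in the talk. All function names and numbers are hypothetical.

```python
import math


def linear_predictor(alpha_j, u_i, lambda_j, l_i):
    """alpha_j' u_i + lambda_j' l_i for one indicator j and one unit i."""
    direct = sum(a * u for a, u in zip(alpha_j, u_i))
    latent = sum(lam * l for lam, l in zip(lambda_j, l_i))
    return direct + latent


def std_normal_cdf(x):
    """Standard normal cdf, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


# Response functions h_j for the three response types.
RESPONSE_FUNCTIONS = {
    "gaussian": lambda eta: eta,  # identity
    "binary": std_normal_cdf,     # probit (assumed)
    "poisson": math.exp,          # log link
}


def expected_response(kind, alpha_j, u_i, lambda_j, l_i):
    eta = linear_predictor(alpha_j, u_i, lambda_j, l_i)
    return RESPONSE_FUNCTIONS[kind](eta)


# Hypothetical values with two covariates and q = 2 latent variables.
alpha_j, u_i = [0.5, -1.0], [1.0, 0.5]
lambda_j, l_i = [2.0, 0.0], [0.0, 1.0]
for kind in ("gaussian", "binary", "poisson"):
    print(kind, expected_response(kind, alpha_j, u_i, lambda_j, l_i))
```

With these numbers the predictor is zero, so the three expected responses are 0, 0.5 and 1 respectively.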

2. Geoadditive latent variable models (2)

2.2 Structural models for latent variables (1)

Structural models relate latent variables $l_{ir}$, $r = 1,\dots,q$, to covariates (of different type).

Linear structural model

$l_{ir} = x_i' \beta_r + \delta_{ir}$, $\delta_{ir}$ i.i.d. $N(0,1)$

with indirect effects $\beta_r$ of covariates $x_i$ is extended to the

Geoadditive structural model

$l_{ir} = \eta_{ir}^{add} + \delta_{ir}$

with (geo-)additive predictor

$\eta_{ir}^{add} = x_i' \beta_r + f_{r,1}(z_{i1}) + \dots + f_{r,k}(z_{ik}) + f_{r,geo}(s_i)$

- $f_{r,1}(z_1),\dots,f_{r,k}(z_k)$: smooth functions of continuous covariates $z_1,\dots,z_k$
- $f_{r,geo}(s)$: spatial effect of location $s \in \{1,\dots,S\}$
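A small sketch of the geoadditive structural model may help to fix ideas. It evaluates the additive predictor for one unit and one latent variable and then adds a standard normal error. The smooth function, the spatial effects and all names are hypothetical stand-ins, not estimates from the talk.

```python
import random


def geoadditive_predictor(x_i, beta_r, z_i, smooth_fns, s_i, spatial_effect):
    """x_i' beta_r + sum_k f_{r,k}(z_ik) + f_{r,geo}(s_i)."""
    linear = sum(x * b for x, b in zip(x_i, beta_r))
    smooth = sum(f(z) for f, z in zip(smooth_fns, z_i))
    return linear + smooth + spatial_effect[s_i]


def draw_latent(eta, rng):
    """l_ir = eta_ir + delta_ir with delta_ir ~ N(0, 1)."""
    return eta + rng.gauss(0.0, 1.0)


# Hypothetical inputs: one linear covariate, one smooth term, two regions.
x_i, beta_r = [1.0], [0.5]
z_i, smooth_fns = [2.0], [lambda z: z ** 2 - 4.0]
spatial_effect = {1: -0.5, 2: 0.25}

eta = geoadditive_predictor(x_i, beta_r, z_i, smooth_fns, 2, spatial_effect)
print(eta)                                  # 0.75
print(draw_latent(eta, random.Random(0)))   # one simulated latent value
```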

2. Geoadditive latent variable models (3)

2.2 Structural models for latent variables (2)

Smooth functions modelled through (Bayesian) P-splines

$f(z) = \sum_{m=1}^{M} \gamma_m B_m(z)$

with random walk priors

$\Delta^d \gamma_m = u_m \sim N(0, \tau^2)$
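The random walk prior can be read as a penalty on d-th order differences of neighbouring spline coefficients. The following sketch computes those differences and the resulting log prior, up to terms that do not depend on the coefficients. It leaves the B-spline basis itself aside; the function names and example coefficients are hypothetical.

```python
def differences(gamma, d):
    """Apply the first-difference operator d times to the coefficient list."""
    out = list(gamma)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def rw_log_prior(gamma, d, tau2):
    """Log density of the d-th order random walk prior, dropping all terms
    that do not depend on gamma."""
    u = differences(gamma, d)
    return -sum(v * v for v in u) / (2.0 * tau2)


print(differences([1.0, 2.0, 4.0, 7.0], 1))        # [1.0, 2.0, 3.0]
print(differences([1.0, 2.0, 4.0, 7.0], 2))        # [1.0, 1.0]
print(rw_log_prior([1.0, 2.0, 4.0, 7.0], 2, 0.5))  # -2.0
```

With d = 2 a coefficient sequence that is exactly linear has zero second differences and is therefore not penalised, while a small variance τ² pulls the fit towards such a sequence.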

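Spatial effects modelled through Markov random field for $\gamma_{geo,s} := f_{geo}(s)$, $s = 1,\dots,S$:

$\gamma_{geo,s} \mid \gamma_{geo,s'},\, s' \neq s \;\sim\; N\Big( \frac{1}{n_s} \sum_{s' \sim s} \gamma_{geo,s'},\; \frac{\tau_{geo}^2}{n_s} \Big)$

$s' \sim s$: neighbors $s'$ of $s$; $n_s$: number of neighbors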
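The Markov random field prior says that, given its neighbours, the effect of a region is centred at the neighbours' average with a variance that shrinks as the number of neighbours grows. A minimal sketch of that conditional distribution follows; the neighbourhood structure and all values are hypothetical.

```python
def mrf_conditional(s, gamma_geo, neighbors, tau2_geo):
    """Mean and variance of gamma_geo[s] given the effects of its neighbours."""
    nb = neighbors[s]
    n_s = len(nb)
    if n_s == 0:
        raise ValueError("region %r has no neighbours" % (s,))
    mean = sum(gamma_geo[t] for t in nb) / n_s
    return mean, tau2_geo / n_s


# Hypothetical map with three regions; region 1 borders regions 2 and 3.
gamma_geo = {1: 0.0, 2: 1.0, 3: 3.0}
neighbors = {1: [2, 3], 2: [1], 3: [1]}

print(mrf_conditional(1, gamma_geo, neighbors, tau2_geo=0.5))  # (2.0, 0.25)
```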

2. Geoadditive latent variable models (4)

2.3 Other priors

- Priors for fixed effects $\alpha$, $\beta$ are flat or weakly informative Gaussian.

- Priors for variances $\sigma^2$ or $\tau^2$ are of inverse Gamma-type (proper or improper).

- Priors for factor loadings $\lambda$ can be chosen as weakly informative for large sample size or many indicators. Otherwise more informative (Gaussian) priors are needed to prevent Heywood cases.

3. MCMC inference based on auxiliary variables (1)

Two strategies for posterior analysis of

- parameters $\alpha, \beta, \gamma, \lambda$,
- variances $\sigma^2, \tau^2$, and
- latent variables $l = (l_{ir},\ i = 1,\dots,n;\ r = 1,\dots,q)$

are

1. Direct posterior analysis for these unknowns based on MCMC with MH steps drawing from full conditionals (not implemented), or

2. MCMC inference based on additional auxiliary Gaussian variables $y_{ij}^*$ constructed from observable responses $y_{ij}$, using Gibbs sampling (implemented).

3. MCMC inference based on auxiliary variables (2)

Gaussian responses $y_{ij}$: $y_{ij}^* = y_{ij}$ directly observable

Binary responses $y_{ij}$: $y_{ij} = 1 \iff y_{ij}^* = \alpha_j' u_i + \lambda_j' l_i + \varepsilon_{ij} > 0$

Probit model for $y_{ij}$

Can be extended to ordinal (implemented) and nominal (not implemented) categorical responses.
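For a binary response, the auxiliary variable can be drawn from a normal distribution truncated to the side of zero that matches the observed value. The sketch below uses the standard truncated-normal data augmentation for probit models via the inverse cdf. It assumes a standard normal error and is not taken from the implementation used in the talk; names are hypothetical, and the clamping makes it unsuitable for predictors of very large absolute value.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def draw_auxiliary(y, eta, rng):
    """Draw y* ~ N(eta, 1) restricted to y* > 0 if y == 1 and y* <= 0 if y == 0."""
    p_nonpos = STD_NORMAL.cdf(-eta)          # P(y* <= 0)
    if y == 1:
        u = rng.uniform(p_nonpos, 1.0)       # upper tail of the error cdf
    else:
        u = rng.uniform(0.0, p_nonpos)       # lower tail of the error cdf
    u = min(max(u, 1e-12), 1.0 - 1e-12)      # keep inv_cdf away from 0 and 1
    return eta + STD_NORMAL.inv_cdf(u)


rng = random.Random(1)
draws_pos = [draw_auxiliary(1, -0.5, rng) for _ in range(1000)]
draws_neg = [draw_auxiliary(0, -0.5, rng) for _ in range(1000)]
print(min(draws_pos) > 0, max(draws_neg) <= 0)
```

Given the drawn values, the remaining steps can treat the auxiliary variables as Gaussian responses, which is what makes Gibbs sampling possible.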

3. MCMC inference based on auxiliary variables (3)

Poisson responses

(Conditional) distribution of $y_{ij}$ is considered as the distribution of the number of jumps of an unobserved Poisson process.

Distribution of log-interarrival times

$y_{ij}^* = (y_{ijm}^*,\ m = 1,\dots,y_{ij}+1)$

is approximated through (mixture of) auxiliary Gaussian linear models

$y_{ijm}^* = \alpha_j' u_i + \lambda_j' l_i + \varepsilon_{ijm}$, $m = 1,\dots,y_{ij}+1$;

see Frühwirth-Schnatter and Wagner (2005, 2006) in a state space model framework (implemented).

Computational problem if $y_{ij}$ is large!

More parsimonious alternative: Frühwirth-Schnatter, Frühwirth, Held and Rue (2007); only 2 auxiliary linear models needed! (not implemented)
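To see why a count of y jumps comes with y + 1 interarrival times, and why large counts are costly, consider the forward simulation below. It generates a Poisson process with rate mu on the unit interval and returns the count together with the interarrival times, the last of which carries the process past time 1. It is only a generative illustration under hypothetical names, not the auxiliary mixture sampler itself.

```python
import random


def simulate_unit_interval(mu, rng):
    """Simulate a Poisson process with rate mu on [0, 1].

    Returns (y, times): y is the number of jumps in [0, 1] and times holds
    the y + 1 exponential interarrival times needed to pass time 1.
    """
    times = []
    elapsed = 0.0
    while elapsed <= 1.0:
        gap = rng.expovariate(mu)  # exponential waiting time with rate mu
        times.append(gap)
        elapsed += gap
    return len(times) - 1, times


y, times = simulate_unit_interval(3.0, random.Random(0))
print(y, len(times))  # the second number is always y + 1
```

Each observation thus contributes one auxiliary variable per interarrival time, so the amount of auxiliary data grows with the size of the counts.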

3. MCMC inference based on auxiliary variables (4)

Gibbs steps

Each step draws one block of unknowns from its full conditional given the data and the current values of all other unknowns, written $p(\,\cdot \mid \cdot\,)$:

1. Draw auxiliary variables from $p(y^* \mid \cdot)$

2. Draw latent variables from $p(l \mid \cdot)$

3. Draw parametric indirect effects from $p(\beta \mid \cdot)$

4. Draw nonparametric indirect effects from $p(\gamma \mid \cdot)$

5. Draw smoothing variances from $p(\tau^2 \mid \cdot)$

6. Draw direct effects and factor loadings from $p(\alpha, \lambda \mid \cdot)$

7. Draw error variances from $p(\sigma^2 \mid \cdot)$
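As one worked instance of these steps, the sketch below gives the draw of a latent variable (step 2) in the simplest setting: a single latent variable (q = 1), Gaussian auxiliary variables with error variances sigma2_j, and a structural predictor eta_i with unit error variance. Under these assumptions the full conditional is Gaussian with precision 1 + Σ λ_j²/σ_j². The names and the restriction to q = 1 are assumptions made for illustration.

```python
import math
import random


def latent_full_conditional(y_star_i, direct, lam, sigma2, eta_i):
    """Mean and variance of the Gaussian full conditional of l_i (q = 1).

    y_star_i -- auxiliary variables y*_ij, j = 1..p, for unit i
    direct   -- direct-effect terms alpha_j' u_i, one per indicator
    lam      -- factor loadings lambda_j
    sigma2   -- error variances sigma_j^2
    eta_i    -- structural predictor; prior is l_i ~ N(eta_i, 1)
    """
    precision = 1.0 + sum(l * l / s2 for l, s2 in zip(lam, sigma2))
    weighted = eta_i + sum(
        l * (y - d) / s2 for y, d, l, s2 in zip(y_star_i, direct, lam, sigma2)
    )
    return weighted / precision, 1.0 / precision


def draw_latent_variable(y_star_i, direct, lam, sigma2, eta_i, rng):
    mean, var = latent_full_conditional(y_star_i, direct, lam, sigma2, eta_i)
    return rng.gauss(mean, math.sqrt(var))


# One indicator with unit loading and variance: prior and data are averaged.
print(latent_full_conditional([2.0], [0.0], [1.0], [1.0], 0.0))  # (1.0, 0.5)
```

The other steps have the same flavour: conditional on the auxiliary variables, each block of unknowns sits in a Gaussian linear model or a conjugate variance model.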

4. Results: Internet Survey “Prospect Germany 1” (1)

10 observable binary/ordinal indicators $y_j$, $j = 1,\dots,10$, related to 10 auxiliary Gaussian variables $y_j^*$ through threshold mechanism.

Measurement model for auxiliary variables $y_j^*$:

$y_{ij}^* = \alpha_j + \lambda_{j1} l_{i1} + \lambda_{j2} l_{i2} + \varepsilon_{ij}^*$, $\varepsilon_{ij}^* \sim N(0,1)$

Structural model for 2 latent variables:

$\eta_{ir} = \beta_{1r}\, Inc2_i + \beta_{2r}\, Inc3_i + \beta_{3r}\, Inc4_i + f_{r1}(Age_i) + f_{r2}(Age_i)\, Sex_i + f_{r,geo}(Reg_i)$
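The threshold mechanism that links an auxiliary Gaussian variable to an observed binary or ordinal indicator can be sketched as follows. The thresholds are hypothetical; the convention that category k corresponds to the interval (θ_{k-1}, θ_k] is an assumption chosen to match the binary rule y = 1 if and only if y* > 0.

```python
from bisect import bisect_left


def category(y_star, thresholds):
    """Map y* to an ordered category 1..K given K - 1 increasing thresholds.

    Category k is returned when thresholds[k-2] < y* <= thresholds[k-1],
    with the outer intervals open towards minus and plus infinity.
    """
    return bisect_left(thresholds, y_star) + 1


# Hypothetical three-category indicator.
thresholds = [-1.0, 0.5]
print([category(v, thresholds) for v in (-2.0, 0.0, 1.0)])  # [1, 2, 3]

# Binary special case with a single threshold at zero: y = category - 1.
print(category(0.3, [0.0]) - 1, category(-0.3, [0.0]) - 1)  # 1 0
```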

4. Results: Internet Survey “Prospect Germany 1” (2)

Estimates of factor loadings and parametric indirect effects.

4. Results: Internet Survey “Prospect Germany 1” (3)

Estimates of the smooth functions modelled by P-spline priors. The mean values are connected by the solid line; the 10% and 90% quantiles are connected by the dashed lines.

4. Results: Internet Survey “Prospect Germany 1” (4)

Estimated spatial effects

Spatial effects for the first latent variable

Regions with
- a significant negative effect (red)
- a significant positive effect (green)
- a non-significant effect (yellow)
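The slides do not state how significance is determined for the maps. One common convention, assumed here purely for illustration, is to call a regional effect significant when a posterior credible interval computed from the MCMC samples excludes zero. The sketch uses the 10% and 90% quantiles; the level, the function names and the sample values are all hypothetical.

```python
def quantile(sorted_xs, p):
    """Empirical quantile by linear interpolation between order statistics."""
    pos = p * (len(sorted_xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_xs) - 1)
    return sorted_xs[lo] + (pos - lo) * (sorted_xs[hi] - sorted_xs[lo])


def classify_effect(samples, lower=0.1, upper=0.9):
    """Label one region from the posterior samples of its spatial effect."""
    xs = sorted(samples)
    q_lo, q_hi = quantile(xs, lower), quantile(xs, upper)
    if q_hi < 0:
        return "negative"         # interval entirely below zero
    if q_lo > 0:
        return "positive"         # interval entirely above zero
    return "non-significant"      # interval covers zero


print(classify_effect([0.5, 0.6, 0.7, 0.8, 0.9]))       # positive
print(classify_effect([-0.9, -0.8, -0.7, -0.6, -0.5]))  # negative
print(classify_effect([-1.0, -0.5, 0.0, 0.5, 1.0]))     # non-significant
```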

4. Results: Internet Survey “Prospect Germany 1” (5)

Estimated spatial effects

Spatial effects for the second latent variable

Regions with
- a significant negative effect (red)
- a significant positive effect (green)
- a non-significant effect (yellow)

4. Results: Child morbidity and malnutrition in Nigeria (1)

3 observable binary indicators: fever, cough, diarrhoea

2 observable continuous indicators: stunting, underweight

Measurement model predictors for 5 indicators:

$\alpha_j' u_i + \lambda_{j1} l_{i1} + \lambda_{j2} l_{i2}$, $j = 1,\dots,5$

$u_i$: vector of covariates with direct effects

Structural model predictors for 2 latent variables:

$\eta_{ir} = x_i' \beta_r + f_{r1}(chage_i) + f_{r2}(mage_i) + f_{r3}(bmi_i) + f_{r,geo}(district_i)$

4. Results: Child morbidity and malnutrition in Nigeria (2)

4. Results: Child morbidity and malnutrition in Nigeria (3)

4. Results: Child morbidity and malnutrition in Nigeria (4)

Nonlinear effects for the first latent variable for Nigeria

4. Results: Child morbidity and malnutrition in Nigeria (5)

Nonlinear effects for the second latent variable for Nigeria

4. Results: Child morbidity and malnutrition in Nigeria (6)

Spatial effects for the first and second latent variable for Nigeria

4. Results: Post war security in Cambodia (1)

3 observable count indicators:

- number of serious crimes committed in community

- number of land conflicts in community

- number of households in community with domestic violence problems

1 latent variable: disposition for violence

Measurement model

$E(y_{ij}) = \exp(\alpha_{j0} + \alpha_{j1}\, nrfam_i + \lambda_j l_i)$, $j = 1,2,3$

Structural model predictor

$\eta_i = f_1(contam_i) + f_2(usbomb_i) + f_{geo}(community_i)$

4. Results: Post war security in Cambodia (2)

4. Results: Post war security in Cambodia (3)

Commune rate for land conflicts, relative to population (light: below average, dark: above average)

4. Results: Post war security in Cambodia (4)

Map of Cambodia with the estimated spatial effects for all 1628 communities

4. Results: Post war security in Cambodia (5)

Estimated effects f1 and f2

References

Benini, A., Owen, T. and Rue, H. (2006). A semi-parametric spatial regression approach to post-war human security: Cambodia 2002-2004. Technical Report.

Fahrmeir, L. and Raach, A. (2007). A Bayesian semiparametric latent variable model for mixed responses. Psychometrika, in press.

Fahrmeir, L. and Steinert, S. (2006). A geoadditive Bayesian latent variable model for Poisson indicators. Discussion Paper 508, Sonderforschungsbereich 386. Ludwig-Maximilians-Universität München.

Frühwirth-Schnatter, S., Frühwirth, R., Held, L. and Rue, H. (2007). Improved auxiliary mixture sampling for hierarchical models of non-Gaussian data. IFAS Research Paper 2007-25.

Frühwirth-Schnatter, S. and Wagner, H. (2006). Auxiliary mixture sampling for parameter-driven models of time series of counts with application to state space modelling. Biometrika 93(4), 827-841.

Thanks to

Alexander Raach

Sven Steinert

Khaled Khatab

The BayesX Group