Bayesian Structural Equations Modeling (SEM)


M’hamed (Hamy) Temkit

Division of Biostatistics, Mayo Clinic, Arizona

Applied Statistics Seminar, November 17, 2016


Outline

Introduction to SEM

Covariance Analysis

SEM Estimation (GLS vs MLE)

CFA

The General Model of SEM

lavaan

Bayesian Paradigm

Bayesian SEM

Bayesian CFA

blavaan

Conclusions


Motivation


Two Paradigms

Covariance Analysis

Σ = Σ(θ)

Bayesian Inference

p(θ | y) ∝ p(y | θ) p(θ)


Brief SEM Terminology

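[Path diagram. Measurement model: the exogenous latent variables ξ1, ξ2, ξ3 are measured by the indicators X1 to X6 (loadings λx11, λx21, λx32, λx42, λx53, λx63; measurement errors δ), and the endogenous latent variables η1, η2 are measured by the indicators y1 to y4 (loadings λy11, λy21, λy32, λy42; measurement errors ε1 to ε4). Structural model: paths γ11, γ12, γ22, γ23 from the exogenous to the endogenous latent variables, path β21 from η1 to η2, and covariances ϕ21, ϕ31, ϕ32 among the exogenous latent variables.]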


Background

Factor Analysis (Spearman, 1904)

Path Analysis (Sewall Wright, 1918, 1921, 1934, 1960)

Confirmatory Factor Analysis (CFA) (Jöreskog, 1969)

General SEM (Jöreskog, 1973; Wiley, 1973)

LISREL model (Wiley, 1973; Jöreskog, 1977)

Generalized least squares (Browne, 1974, 1982, 1984)


Relevant Reading References

Structural Equations With Latent Variables (Bollen, 1989)

Structural Equations Modeling With Amos (Byrne)

Latent Curve Models (Bollen, Curran, 2006)

Structural Equation Modeling, A Bayesian Approach (Sik-Yum Lee, 2007)

Structural Equation Modeling: A Multidisciplinary Journal


First Principle: Linear Regression


Linear Regression: The Machinery

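$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad i = 1, \dots, n \quad \text{(regression line)}$$

$$\min_{\beta_0, \beta_1} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2 \quad \text{(OLS)}$$

and if the $\varepsilon_i \sim N(0, \sigma^2)$ are iid,

$$\max_{\beta_0, \beta_1, \sigma^2} \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{1}{2\sigma^2}(y_i - \beta_0 - \beta_1 x_i)^2\right) \quad \text{(ML)}$$

$$\hat{\beta} \sim N\left(\beta, \sigma^2 (X'X)^{-1}\right)$$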
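The equivalence of the OLS and ML estimates of the regression coefficients under normal errors can be checked numerically. The following is a minimal sketch in base R on simulated data; the true values (1, 2, 0.5), the sample size, and all object names are assumptions made for illustration and do not come from the talk.

```r
# Simulated data; the true coefficients and error SD are arbitrary choices
set.seed(1)
n <- 200
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n, sd = 0.5)

# OLS via the normal equations: beta_hat = (X'X)^{-1} X'y
X <- cbind(1, x)
beta_ols <- drop(solve(crossprod(X), crossprod(X, y)))

# ML: minimize the negative Gaussian log-likelihood over (beta0, beta1, log sigma)
negloglik <- function(par) {
  mu <- par[1] + par[2] * x
  -sum(dnorm(y, mean = mu, sd = exp(par[3]), log = TRUE))
}
fit_ml <- optim(c(0, 0, 0), negloglik, method = "BFGS")
beta_ml <- fit_ml$par[1:2]

# Estimated covariance of beta_hat: sigma^2 (X'X)^{-1}, with sigma^2 estimated from the residuals
sigma2_hat <- sum((y - X %*% beta_ols)^2) / (n - 2)
vcov_ols <- sigma2_hat * solve(crossprod(X))

print(rbind(OLS = beta_ols, ML = beta_ml))
```

The two rows should agree up to the optimizer's tolerance, and `vcov_ols` is the plug-in version of the covariance in the last formula above.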


Pros and Cons of Regression (Linear Models)

Oversimplified view of the phenomena

Ignores measurement error (covariates are treated as fixed)

Does not handle simultaneous equations in general (e.g., mediation)

Lacks the flexibility to fit SEM models


What is SEM

A melding of factor analysis and path (regression) analysis into one comprehensive statistical methodology

Simultaneous equation modeling

Does the model-implied covariance matrix match the observed covariance matrix?

The degree to which they match represents the goodness of fit


Estimation (graph)

[Path diagram of the estimated model: latent variables Eps, Tlr, Eng, Rng with indicators x1 to x8; the edge labels are parameter estimates.]


Estimation (equations)

Measurement Model:

x1 = a1 + epistemology + e1

x2 = a2 + b2 epistemology + e2

x3 = a3 + tolerance + e3

x4 = a4 + b4 tolerance + e4

x5 = a5 + engagement + e5

x6 = a6 + b6 engagement + e6

x7 = a7 + range + e7

x8 = a8 + b8 range + e8

Structural Model:

tolerance = a9 + b9 epistemology + e9

range = a10 + b10 tolerance + b11 engagement + e10

cov(epistemology, engagement) ≠ 0


Estimation: objective function

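Sample covariance matrix:

$$S = \begin{pmatrix} \frac{1}{n}\sum_{i=1}^{n}(x_{1i}-\bar{x}_1)^2 & \frac{1}{n}\sum_{i=1}^{n}(x_{1i}-\bar{x}_1)(x_{2i}-\bar{x}_2) & \cdots & \operatorname{cov}(x_1, x_8) \\ \operatorname{cov}(x_1, x_2) & \operatorname{var}(x_2) & \cdots & \operatorname{cov}(x_2, x_8) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{cov}(x_1, x_8) & \operatorname{cov}(x_2, x_8) & \cdots & \operatorname{var}(x_8) \end{pmatrix}$$

Model-implied covariance matrix:

$$\Sigma(\theta) = \operatorname{cov}(x_1, x_2, \dots, x_8) = \begin{pmatrix} \operatorname{var}(x_1) & \operatorname{cov}(x_1, x_2) & \cdots & \operatorname{cov}(x_1, x_8) \\ \operatorname{cov}(x_1, x_2) & \operatorname{var}(x_2) & \cdots & \operatorname{cov}(x_2, x_8) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{cov}(x_1, x_8) & \operatorname{cov}(x_2, x_8) & \cdots & \operatorname{var}(x_8) \end{pmatrix}$$

$$S \approx \Sigma(\theta)$$

Basically, minimize $f(\Sigma(\theta), S)$ over $\theta$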


Generalized Least Squares (GLS)

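$$x_1, \dots, x_n \sim N(0, \Sigma(\theta_0)) \text{ iid}, \quad x_i \in \mathbb{R}^p$$

$$\operatorname{vec} S \xrightarrow{L} N(\Sigma(\theta_0), C)$$

$$G(\theta) = 2^{-1} \operatorname{tr}\left\{\left[(S - \Sigma(\theta))V\right]^2\right\}, \quad V > 0$$

$$\hat{\theta} \xrightarrow{L} N(\theta_0, D(\theta_0))$$

$$n\,G(\hat{\theta}) \xrightarrow{L} \chi^2_{p^* - q}$$

$$p^* = \frac{p(p+1)}{2}, \quad q \text{ parameters}$$

$$H_0: \Sigma = \Sigma(\theta) \quad \text{vs} \quad H_a: \Sigma \neq \Sigma(\theta)$$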


Maximum Likelihood (ML)

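$$x_1, \dots, x_n \sim N(\mu_0, \Sigma(\theta_0)) \text{ iid}, \quad x_i \in \mathbb{R}^p$$

$$(n - 1)S \sim W_p(R_0, \rho_0)$$

$$F(\theta) = \log\det\Sigma(\theta) + \operatorname{tr}\left(S\,\Sigma(\theta)^{-1}\right) - \log\det S - p$$

$$\hat{\theta}_{ML} \xrightarrow{L} N(\theta_0, C_2(\theta_0))$$

$$n\,F(\hat{\theta}_{ML}) \xrightarrow{L} \chi^2_{p^* - q}$$

$$H_0: \Sigma = \Sigma(\theta) \quad \text{vs} \quad H_a: \Sigma \neq \Sigma(\theta)$$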
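To make the ML discrepancy function concrete, here is a small base R sketch that evaluates F for a given pair (S, Σ). The one-factor model, its parameter values, and the function name `f_ml` are hypothetical and chosen only for illustration; the sketch evaluates F at fixed values and does not carry out the minimization over θ.

```r
# ML discrepancy between a sample covariance S and a model-implied Sigma
f_ml <- function(S, Sigma) {
  p <- nrow(S)
  log(det(Sigma)) + sum(diag(S %*% solve(Sigma))) - log(det(S)) - p
}

# Toy one-factor model with p = 3 indicators (hypothetical parameter values)
lambda <- c(1, 0.8, 0.6)
phi <- 1.5
psi <- c(0.5, 0.4, 0.3)
Sigma_true <- phi * tcrossprod(lambda) + diag(psi)

# Simulate data with covariance Sigma_true and compute the sample covariance
set.seed(2)
n <- 500
X <- matrix(rnorm(n * 3), n, 3) %*% chol(Sigma_true)
S <- cov(X)

f_at_truth  <- f_ml(S, Sigma_true)  # small but positive: sampling error only
f_saturated <- f_ml(S, S)           # zero: a perfect match between S and Sigma
print(c(F_at_truth = f_at_truth, F_saturated = f_saturated))
```

F is zero only when Σ(θ) reproduces S exactly and grows as the two matrices diverge, which is why n F at the minimizer serves as a test statistic.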


SEM Modeling

Model (diagram)

Identifiability (q ≤ p(p + 1)/2); check the identifiability rules in Bollen (page 238)

Constraints (loadings equal to 1)

EDA (distributions, correlations, outliers, etc.)

Estimation (e.g., ML or GLS)

Fit indices (SMR, residuals)

Diagnostics (residuals, outliers, etc.)
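The counting rule q ≤ p(p + 1)/2 can be applied to the two models fitted later in these slides. This sketch counts the free parameters from the lavaan model syntax shown there (first loading of each factor fixed to 1); the helper name `n_moments` is an assumption. The rule is a necessary condition only, so passing it does not by itself establish identifiability.

```r
# Number of distinct elements in a p x p covariance matrix
n_moments <- function(p) p * (p + 1) / 2

# Three-factor CFA with 9 indicators:
# free loadings + residual variances + factor variances + factor covariances
p_cfa <- 9
q_cfa <- 6 + 9 + 3 + 3
df_cfa <- n_moments(p_cfa) - q_cfa

# SEM with 11 indicators:
# free loadings + regressions + residual covariances + residual variances + latent variances
p_sem <- 11
q_sem <- 8 + 3 + 6 + 11 + 3
df_sem <- n_moments(p_sem) - q_sem

print(c(df_cfa = df_cfa, df_sem = df_sem))
```

The results, 24 and 35, agree with the degrees of freedom reported in the lavaan output for the CFA and SEM examples.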


Measurement model (CFA)

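$$x_i = \Lambda \xi_i + \varepsilon_i, \quad i = 1, \dots, n$$

$\xi \sim N(0, \Phi)$, latent variables

$\varepsilon \sim N(0, \Psi_\varepsilon)$, $\Psi_\varepsilon$ diagonal

$\xi$ and $\varepsilon$ are uncorrelated

$$\Sigma = \Lambda \Phi \Lambda^t + \Psi_\varepsilon$$

$\Lambda$, $\Phi$, $\Psi_\varepsilon$ are the parameters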
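The implied covariance matrix follows in one step from the model assumptions. A short derivation, using only the symbols defined on this slide:

```latex
\begin{aligned}
\Sigma = \operatorname{Cov}(x)
  &= \operatorname{Cov}(\Lambda\xi + \varepsilon) \\
  &= \Lambda \operatorname{Cov}(\xi) \Lambda^{t}
     + \Lambda \operatorname{Cov}(\xi, \varepsilon)
     + \operatorname{Cov}(\varepsilon, \xi) \Lambda^{t}
     + \operatorname{Cov}(\varepsilon) \\
  &= \Lambda \Phi \Lambda^{t} + \Psi_{\varepsilon},
\end{aligned}
```

where the two cross terms vanish because ξ and ε are uncorrelated.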


CFA Example (graph)

[Path diagram of the fitted three-factor CFA: latent variables vsl (visual), txt (textual), spd (speed) with indicators x1 to x9; the edge labels are the parameter estimates reported in the lavaan output.]


CFA (loadings and latents)

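$$\xi = \begin{pmatrix} \mathrm{vsl} \\ \mathrm{txt} \\ \mathrm{spd} \end{pmatrix}, \qquad \Lambda = \begin{pmatrix} 1 & 0 & 0 \\ \lambda_{21} & 0 & 0 \\ \lambda_{31} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & \lambda_{52} & 0 \\ 0 & \lambda_{62} & 0 \\ 0 & 0 & 1 \\ 0 & 0 & \lambda_{83} \\ 0 & 0 & \lambda_{93} \end{pmatrix}$$

But also remember the variances and covariances: the factor covariance matrix Φ and the residual variances Ψε are parameters too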


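CFA using lavaan (R)

library(lavaan)      # model fitting
library(semPlot)     # path diagrams
# loaded on the original slide; not needed for the code shown here
library(stringr)
library(DiagrammeR)
library(dplyr)

# specify the model: three correlated factors, three indicators each
HS.model <- "
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
"

# fit the model (the output reports the ML estimator)
fit.HS <- sem(HS.model, data = HolzingerSwineford1939)
summary(fit.HS)

# path diagram labelled with the parameter estimates
semPaths(fit.HS, intercepts = FALSE, whatLabels = "est",
         residuals = TRUE, exoCov = TRUE)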


CFA Example (output)

> summary(fit.HS)

lavaan (0.5-22) converged normally after 35 iterations

Number of observations 301

Estimator ML

Minimum Function Test Statistic 85.306

Degrees of freedom 24

P-value (Chi-square) 0.000

Parameter Estimates:

Information Expected

Standard Errors Standard

Latent Variables:

Estimate Std.Err z-value P(>|z|)

visual =~

x1 1.000

x2 0.554 0.100 5.554 0.000

x3 0.729 0.109 6.685 0.000

textual =~

x4 1.000

x5 1.113 0.065 17.014 0.000

x6 0.926 0.055 16.703 0.000

speed =~

x7 1.000

x8 1.180 0.165 7.152 0.000

x9 1.082 0.151 7.155 0.000

Covariances:

Estimate Std.Err z-value P(>|z|)

visual ~~

textual 0.408 0.074 5.552 0.000

speed 0.262 0.056 4.660 0.000

textual ~~

speed 0.173 0.049 3.518 0.000

Variances:

Estimate Std.Err z-value P(>|z|)

.x1 0.549 0.114 4.833 0.000

.x2 1.134 0.102 11.146 0.000

.x3 0.844 0.091 9.317 0.000

.x4 0.371 0.048 7.779 0.000

.x5 0.446 0.058 7.642 0.000

.x6 0.356 0.043 8.277 0.000

.x7 0.799 0.081 9.823 0.000

.x8 0.488 0.074 6.573 0.000

.x9 0.566 0.071 8.003 0.000

visual 0.809 0.145 5.564 0.000

textual 0.979 0.112 8.737 0.000

speed 0.384 0.086 4.451 0.000
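The estimates in this output can be plugged into Σ = ΛΦΛᵗ + Ψε to obtain the fitted model-implied covariance matrix. The sketch below copies the rounded estimates from the output by hand into base R matrices; the object names are assumptions, and because the inputs are rounded to three decimals the result only approximates what lavaan computes internally.

```r
# Loadings (first indicator of each factor fixed to 1), copied from the output
Lambda <- matrix(0, nrow = 9, ncol = 3,
                 dimnames = list(paste0("x", 1:9), c("visual", "textual", "speed")))
Lambda[1:3, "visual"]  <- c(1, 0.554, 0.729)
Lambda[4:6, "textual"] <- c(1, 1.113, 0.926)
Lambda[7:9, "speed"]   <- c(1, 1.180, 1.082)

# Factor variances (diagonal) and covariances (off-diagonal)
Phi <- matrix(c(0.809, 0.408, 0.262,
                0.408, 0.979, 0.173,
                0.262, 0.173, 0.384), nrow = 3, byrow = TRUE)

# Residual variances of x1..x9 (diagonal matrix)
Psi <- diag(c(0.549, 1.134, 0.844, 0.371, 0.446, 0.356, 0.799, 0.488, 0.566))

# Model-implied covariance matrix
Sigma_hat <- Lambda %*% Phi %*% t(Lambda) + Psi
print(round(Sigma_hat[1:3, 1:3], 3))
```

For example, the implied variance of x1 is 0.809 + 0.549 = 1.358, the factor variance plus the residual variance, since its loading is fixed to 1.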


Structural model (SEM)

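$$\eta = B\eta + \Gamma\xi + \zeta$$

$$y = \Lambda_y \eta + \varepsilon$$

$$x = \Lambda_x \xi + \delta$$

$B$, $\Gamma$, $\Lambda_y$, $\Lambda_x$, $\Phi$, $\Psi$, $\Theta_\varepsilon$, $\Theta_\delta$ are the parameters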
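For reference, the model-implied covariance blocks can be written out from these three equations. This is the standard LISREL result (see Bollen, 1989), stated here under the usual assumptions that I − B is nonsingular, that Φ = Cov(ξ), Ψ = Cov(ζ), Θε = Cov(ε), Θδ = Cov(δ), and that ζ, ε, δ are uncorrelated with each other and with ξ.

```latex
\begin{aligned}
\eta &= (I - B)^{-1}(\Gamma\xi + \zeta) \\
\Sigma_{xx}(\theta) &= \Lambda_x \Phi \Lambda_x^{t} + \Theta_\delta \\
\Sigma_{yx}(\theta) &= \Lambda_y (I - B)^{-1} \Gamma \Phi \Lambda_x^{t} \\
\Sigma_{yy}(\theta) &= \Lambda_y (I - B)^{-1}\left(\Gamma \Phi \Gamma^{t} + \Psi\right)
                       \left[(I - B)^{-1}\right]^{t} \Lambda_y^{t} + \Theta_\varepsilon
\end{aligned}
```

With B = 0 and Γ = 0 dropped from the model, the first block reduces to the CFA form Σ = ΛΦΛᵗ + Ψε used earlier.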


SEM Example (graph)

[Path diagram of the fitted SEM: latent variables i60 (ind60), d60 (dem60), d65 (dem65) with indicators x1 to x3 and y1 to y8; the edge labels are the parameter estimates reported in the lavaan output.]


SEM Example (some equations)

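$$\begin{pmatrix} \mathrm{d60} \\ \mathrm{d65} \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ \beta_{21} & 0 \end{pmatrix} \begin{pmatrix} \mathrm{d60} \\ \mathrm{d65} \end{pmatrix} + \begin{pmatrix} \gamma_{11} \\ \gamma_{21} \end{pmatrix} \begin{pmatrix} \mathrm{i60} \end{pmatrix} + \begin{pmatrix} \zeta_1 \\ \zeta_2 \end{pmatrix}$$

$$\Sigma(\theta) = \begin{pmatrix} \Sigma_{yy}(\theta) & \Sigma_{yx}(\theta) \\ \Sigma_{xy}(\theta) & \Sigma_{xx}(\theta) \end{pmatrix}$$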


SEM Example (R code)

library(lavaan)
library(semPlot)

# specify the model
model <- '
  # latent variables
  ind60 =~ x1 + x2 + x3
  dem60 =~ y1 + y2 + y3 + y4
  dem65 =~ y5 + y6 + y7 + y8
  # regressions
  dem60 ~ ind60
  dem65 ~ ind60 + dem60
  # residual covariances
  y1 ~~ y5
  y2 ~~ y4 + y6
  y3 ~~ y7
  y4 ~~ y8
  y6 ~~ y8
'

fit <- sem(model, data = PoliticalDemocracy)
summary(fit)

# path diagram labelled with the parameter estimates
semPaths(fit, intercepts = FALSE, whatLabels = "est",
         residuals = FALSE, exoCov = FALSE)


SEM Example (output)

summary(fit)

lavaan (0.5-22) converged normally after 68 iterations

Number of observations 75

Estimator ML

Minimum Function Test Statistic 38.125

Degrees of freedom 35

P-value (Chi-square) 0.329

Parameter Estimates:

Information Expected

Standard Errors Standard

Latent Variables:

Estimate Std.Err z-value P(>|z|)

ind60 =~

x1 1.000

x2 2.180 0.139 15.742 0.000

x3 1.819 0.152 11.967 0.000

dem60 =~

y1 1.000

y2 1.257 0.182 6.889 0.000

y3 1.058 0.151 6.987 0.000

y4 1.265 0.145 8.722 0.000

dem65 =~

y5 1.000

y6 1.186 0.169 7.024 0.000

y7 1.280 0.160 8.002 0.000

y8 1.266 0.158 8.007 0.000

Regressions:

Estimate Std.Err z-value P(>|z|)

dem60 ~

ind60 1.483 0.399 3.715 0.000

dem65 ~

ind60 0.572 0.221 2.586 0.010

dem60 0.837 0.098 8.514 0.000


SEM Example (output)

Covariances:

Estimate Std.Err z-value P(>|z|)

.y1 ~~

.y5 0.624 0.358 1.741 0.082

.y2 ~~

.y4 1.313 0.702 1.871 0.061

.y6 2.153 0.734 2.934 0.003

.y3 ~~

.y7 0.795 0.608 1.308 0.191

.y4 ~~

.y8 0.348 0.442 0.787 0.431

.y6 ~~

.y8 1.356 0.568 2.386 0.017

Variances:

Estimate Std.Err z-value P(>|z|)

.x1 0.082 0.019 4.184 0.000

.x2 0.120 0.070 1.718 0.086

.x3 0.467 0.090 5.177 0.000

.y1 1.891 0.444 4.256 0.000

.y2 7.373 1.374 5.366 0.000

.y3 5.067 0.952 5.324 0.000

.y4 3.148 0.739 4.261 0.000

.y5 2.351 0.480 4.895 0.000

.y6 4.954 0.914 5.419 0.000

.y7 3.431 0.713 4.814 0.000

.y8 3.254 0.695 4.685 0.000

ind60 0.448 0.087 5.173 0.000

.dem60 3.956 0.921 4.295 0.000

.dem65 0.172 0.215 0.803 0.422


Why Bayesian

Flexibility to incorporate prior knowledge (priors)

Robust to small sample sizes

Bayes factor and flexibility in comparing models

Easy production of the latent scores (factors)

blavaan (open-source R package)

WinBUGS (free software)


Bayesian References

A Bayesian approach to confirmatory factor analysis (Lee, 1980)

Evaluation of the Bayesian and maximum likelihood approaches in analyzing structural equation models with small sample sizes (Lee, Song, 2004)

Structural Equation Modeling, A Bayesian Approach (Lee, 2007)

Basic and Advanced Bayesian Structural Equation Modeling, With Applications in the Medical and Behavioral Sciences (Song, Lee, 2012)


Bayesian estimation

$$\log p(\theta \mid Y, M) = \log p(Y \mid \theta, M) + \log p(\theta) + \text{constant}$$

M: an arbitrary SEM model

Y: observed data set of raw observations, sample size n

θ: random vector of parameters in M


Conjugate priors

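$$p(y \mid \theta) = \binom{n}{y} \theta^{y} (1 - \theta)^{n - y}, \quad \theta \in (0, 1)$$

$$p(\theta) \propto \theta^{\alpha - 1} (1 - \theta)^{\beta - 1}, \quad \theta \sim \mathrm{Beta}(\alpha, \beta)$$

$$p(\theta \mid y) \propto p(y \mid \theta)\, p(\theta) \propto \theta^{y} (1 - \theta)^{n - y}\, \theta^{\alpha - 1} (1 - \theta)^{\beta - 1} \propto \theta^{y + \alpha - 1} (1 - \theta)^{n - y + \beta - 1}$$

so $\theta \mid y \sim \mathrm{Beta}(y + \alpha,\ n - y + \beta)$.

The prior p(θ) and the posterior p(θ|y) have the same distributional form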
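The conjugate update can be verified numerically. This base R sketch uses a hypothetical Beta(2, 2) prior and hypothetical data (7 successes in 10 trials), none of which come from the talk, and compares the closed-form posterior with a brute-force normalization of likelihood times prior on a grid.

```r
# Hypothetical prior and data: Beta(2, 2) prior, y = 7 successes in n = 10 trials
alpha <- 2; beta <- 2
n <- 10; y <- 7

# Conjugate update: the posterior is Beta(y + alpha, n - y + beta)
post_a <- y + alpha
post_b <- n - y + beta
post_mean <- post_a / (post_a + post_b)

# Numerical check: normalize likelihood x prior on a grid of midpoints
step <- 0.001
theta <- seq(step / 2, 1 - step / 2, by = step)
unnorm <- dbinom(y, n, theta) * dbeta(theta, alpha, beta)
grid_post <- unnorm / sum(unnorm * step)
max_abs_diff <- max(abs(grid_post - dbeta(theta, post_a, post_b)))

print(c(post_mean = post_mean, max_abs_diff = max_abs_diff))
```

The grid posterior and the Beta(9, 5) density should agree up to a small discretization error, so no sampling is needed for this model. In SEM the full posterior is not available in closed form, which is what motivates the Gibbs sampler on the following slides.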


Measurement model (CFA) Bayesian approach

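$$y_i = \Lambda w_i + \varepsilon_i, \quad i = 1, \dots, n, \quad y_i \in \mathbb{R}^p$$

$$w_i \sim N(0, \Phi), \quad w_i \in \mathbb{R}^q$$

$\varepsilon_i \sim N(0, \Psi_\varepsilon)$, $\Psi_\varepsilon$ diagonal with elements $\psi_{\varepsilon k}$, $k = 1, \dots, p$

$w_i$ and $\varepsilon_i$ are independent

$\Lambda$, $\Phi$, $\Psi_\varepsilon$ are the parameters

Let $\Lambda_k^t$ be the kth row of $\Lambda$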


Measurement model (CFA) priors

The conjugate priors on the parameters are:

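$$\psi_{\varepsilon k} \sim \mathrm{IGamma}(\alpha^*_{0\varepsilon k}, \beta^*_{0\varepsilon k})$$

$$[\Lambda_k \mid \psi_{\varepsilon k}] \sim N(\Lambda_{0k}, \psi_{\varepsilon k} H_{0yk})$$

$$\Phi \sim IW_q(R^*_0, \rho_0), \quad R^*_0 \text{ positive definite}$$

The difficulty is choosing the hyperparameters, which determine whether the priors are informative or noninformative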


Measurement model (CFA) Gibbs Sampling (MCMC)

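Let $Y = (y_1, \dots, y_n)$ be the observed data matrix

$\Omega = (w_1, \dots, w_n)$ is the matrix of the latent variables

$(Y, \Omega)$ is the complete data set (augmented data)

$P(\Lambda, \Phi, \Psi_\varepsilon \mid Y)$: the posterior is intractable

$P(\Lambda, \Phi, \Psi_\varepsilon \mid \Omega, Y)$: usually a standard distribution

$P(\Omega \mid \Lambda, \Phi, \Psi_\varepsilon, Y)$: can also be derived based on model M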
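As an illustration of the last point, the conditional distribution of the latent variables for the CFA model above has a closed form. This is the standard normal-normal result for a linear factor model, written with the symbols of these slides; the shorthand C is introduced here only for readability.

```latex
\begin{aligned}
w_i \mid y_i, \Lambda, \Phi, \Psi_\varepsilon
  &\sim N\!\left( C \Lambda^{t} \Psi_\varepsilon^{-1} y_i,\; C \right),
  \qquad i = 1, \dots, n, \\
C &= \left( \Phi^{-1} + \Lambda^{t} \Psi_\varepsilon^{-1} \Lambda \right)^{-1}.
\end{aligned}
```

The w_i are conditionally independent across i, so Ω can be drawn one column at a time.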


Measurement model (CFA) Gibbs Sampling

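The Gibbs sampling algorithm allows us to sample from $P(\Lambda, \Phi, \Psi_\varepsilon, \Omega \mid Y)$

At the (j + 1)th iteration, given $\Omega^{j}, \Lambda^{j}, \Phi^{j}, \Psi_\varepsilon^{j}$:

Generate $\Omega^{j+1} \sim P(\Omega \mid \Lambda^{j}, \Phi^{j}, \Psi_\varepsilon^{j}, Y)$

Generate $\Psi_\varepsilon^{j+1} \sim P(\Psi_\varepsilon \mid \Omega^{j+1}, \Lambda^{j}, \Phi^{j}, Y)$

Generate $\Phi^{j+1} \sim P(\Phi \mid \Omega^{j+1}, \Lambda^{j}, \Psi_\varepsilon^{j+1}, Y)$

Generate $\Lambda^{j+1} \sim P(\Lambda \mid \Omega^{j+1}, \Phi^{j+1}, \Psi_\varepsilon^{j+1}, Y)$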


Measurement model (CFA) Posterior Parameters Estimates

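$$\theta^{(t)} = (\Lambda^{(t)}, \Phi^{(t)}, \Psi_\varepsilon^{(t)}), \quad t = 1, \dots, T^*$$

$$\hat{\theta} = \frac{1}{T^*} \sum_{t=1}^{T^*} \theta^{(t)}$$

$$\widehat{\operatorname{var}}(\theta) = \frac{1}{T^* - 1} \sum_{t=1}^{T^*} (\theta^{(t)} - \hat{\theta})(\theta^{(t)} - \hat{\theta})^{\top}$$

along with 95% credible intervals from the posterior quantiles $Q_{0.025}$ and $Q_{0.975}$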
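These summaries are simple to compute once the retained draws are stored as a matrix with one row per iteration. The sketch below uses simulated draws as a stand-in for real sampler output; the parameter names, distributions, and T* = 4000 are assumptions for illustration only.

```r
# Stand-in for T* retained MCMC draws of two parameters
# (simulated here; not produced by an actual Gibbs sampler)
set.seed(3)
T_star <- 4000
draws <- cbind(lambda21 = rnorm(T_star, mean = 0.55, sd = 0.10),
               psi1     = rgamma(T_star, shape = 25, rate = 45))

post_mean <- colMeans(draws)   # (1/T*) * sum over t of theta^(t)
post_cov  <- cov(draws)        # divisor T* - 1, as in the formula above
post_ci   <- apply(draws, 2, quantile, probs = c(0.025, 0.975))

print(round(rbind(mean = post_mean, sd = sqrt(diag(post_cov)), post_ci), 3))
```

Note that blavaan's summary output shown later reports HPD intervals rather than equal-tailed quantile intervals; the two coincide only for symmetric unimodal posteriors.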


Bayesian CFA Example using blavaan

library(blavaan)

# specify the model
bHS.model <- "
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
  # intercepts (reported as 0.000 in the output)
  x1 ~ 0
  x2 ~ 0
  x3 ~ 0
  x4 ~ 0
  x5 ~ 0
  x6 ~ 0
  x7 ~ 0
  x8 ~ 0
  x9 ~ 0
"

# fit by MCMC with the package's default priors
bfit.HS <- bsem(bHS.model, data = HolzingerSwineford1939)
summary(bfit.HS)

# Bayesian fit measures (PPP, DIC, WAIC, LOOIC, marginal log-likelihood)
fitMeasures(bfit.HS, fit.measures = "all", baseline.model = NULL)


Bayesian CFA Example (output)

blavaan (0.2-2) results of 10000 samples after 5000 adapt+burnin iterations

Number of observations 301

Number of missing patterns 1

Statistic MargLogLik PPP

Value -4481.087 0.000

Parameter Estimates:

Latent Variables:

Estimate Post.SD HPD.025 HPD.975 PSRF Prior

visual =~

x1 1.000

x2 1.221 0.018 1.186 1.255 1.000 dnorm(0,1e-2)

x3 0.463 0.012 0.438 0.487 1.000 dnorm(0,1e-2)

textual =~

x4 1.000

x5 1.404 0.020 1.365 1.445 1.004 dnorm(0,1e-2)

x6 0.731 0.016 0.7 0.761 1.001 dnorm(0,1e-2)

speed =~

x7 1.000

x8 1.320 0.020 1.28 1.357 1.002 dnorm(0,1e-2)

x9 1.286 0.019 1.25 1.325 1.002 dnorm(0,1e-2)


Bayesian CFA Example (output)

Covariances:

Estimate Post.SD HPD.025 HPD.975 PSRF Prior

visual ~~

textual 15.500 1.321 12.998 18.14 1.000 dwish(iden,4)

speed 20.910 1.764 17.576 24.439 1.000 dwish(iden,4)

textual ~~

speed 13.003 1.118 10.9 15.259 1.000 dwish(iden,4)

Intercepts:

Estimate Post.SD HPD.025 HPD.975 PSRF Prior

.x1 0.000

.x2 0.000

.x3 0.000

.x4 0.000

.x5 0.000

.x6 0.000

.x7 0.000

.x8 0.000

.x9 0.000

visual 0.000

textual 0.000

speed 0.000


Bayesian CFA Example (output)

Variances:

Estimate Post.SD HPD.025 HPD.975 PSRF Prior

.x1 0.716 0.088 0.547 0.891 1.001 dgamma(1,.5)

.x2 1.219 0.138 0.96 1.5 1.000 dgamma(1,.5)

.x3 0.993 0.086 0.832 1.164 1.000 dgamma(1,.5)

.x4 0.449 0.053 0.346 0.552 1.001 dgamma(1,.5)

.x5 0.314 0.069 0.184 0.452 1.002 dgamma(1,.5)

.x6 0.509 0.048 0.417 0.604 1.000 dgamma(1,.5)

.x7 0.877 0.084 0.717 1.045 1.000 dgamma(1,.5)

.x8 0.567 0.077 0.417 0.72 1.000 dgamma(1,.5)

.x9 0.478 0.068 0.347 0.61 1.000 dgamma(1,.5)

visual 24.998 2.118 20.929 29.176 1.000 dwish(iden,4)

textual 10.256 0.882 8.518 11.953 1.001 dwish(iden,4)

speed 17.812 1.539 14.813 20.859 1.001 dwish(iden,4)

> fitMeasures(bfit.HS,fit.measures="all", baseline.model= NULL)

npar logl ppp bic dic p_dic waic

21.000 -4398.287 0.000 8916.354 8837.747 20.586 8838.364

p_waic looic p_loo margloglik

20.848 8838.391 20.861 -4481.087


Conclusions

The frequentist SEM approach is based on ML (or GLS) estimation

The Bayesian approach, with data augmentation and MCMC methods, is a flexible way to analyze SEMs

The Bayesian approach may be used when prior knowledge is available or when the sample size is small

Some open problems (power, optimal designs, GSEM, etc.)


THANK YOU!
