ROB CRIBBIE QUANTITATIVE METHODS PROGRAM – DEPARTMENT OF PSYCHOLOGY COORDINATOR - STATISTICAL CONSULTING SERVICE COURSE MATERIALS AVAILABLE AT: WWW.PSYCH.YORKU.CA/CRIBBIE Introduction to Structural Equation Modeling (SEM) Day 2: November 15, 2012



Rob Cribbie

Quantitative Methods Program – Department of Psychology

Coordinator - Statistical Consulting Service

Course materials available at: WWW.PSYCH.YORKU.CA/CRIBBIE

Introduction to Structural Equation Modeling (SEM)

Day 2: November 15, 2012


What are we going to do today?

Confirmatory factor analysis

Full structural equation models


Review from last week….

• Definitions of SEM
• SEM lingo
• SEM assumptions
• Model Identification
• Fit indices (RMSEA, CFI, TLI, IFI, SRMR)


Least Squares Regression Example

This example helps to bridge the gap between regression and SEM

Study Description: 100 Psychology Graduate Students

Outcome: Depression

Predictors:
• Hours Worked per Week
• Quality of Relationship with Supervisor
• Research Productivity


Multiple Regression Output

depression ~ hours + rel_superv + res_prod

Coefficients:
              Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)    8.81743     2.14132    4.118   8.1e-05 ***
hours          0.13045     0.05561    2.346   0.02106 *
rel_superv    -0.05621     0.08027   -0.700   0.48544
res_prod      -0.35047     0.10379   -3.377   0.00106 **

Residual standard error: 3.005 on 96 degrees of freedom
Multiple R-squared: 0.1716, Adjusted R-squared: 0.1457
F-statistic: 6.628 on 3 and 96 DF, p-value: 0.0004072


SEM Model


SEM Results

                              Estimate   S.E.    C.R.     P
depression <--- res_prod        -.350    .102   -3.429   ***
depression <--- rel_superv      -.056    .079    -.711   .477
depression <--- hours            .130    .055    2.382   .017

Number of distinct sample moments: 10

Number of distinct parameters to be estimated: 10

Degrees of freedom (10 - 10): 0

Squared Multiple Correlation (R²) = .17
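
As a rough check on the counts above, the number of distinct sample moments for p observed variables is p(p + 1)/2. The sketch below applies that to the four observed variables in this example. The breakdown of the 10 parameters is an assumption (the usual one for a regression specified as an SEM without means), not something listed on the slide, and the function name is illustrative.

```python
def distinct_sample_moments(n_observed: int) -> int:
    """Unique variances and covariances among the observed variables."""
    return n_observed * (n_observed + 1) // 2


# Observed variables: depression, hours, rel_superv, res_prod
n_moments = distinct_sample_moments(4)

# Assumed breakdown of the free parameters (not shown on the slide)
free_parameters = {
    "regression paths to depression": 3,
    "predictor variances": 3,
    "predictor covariances": 3,
    "residual variance of depression": 1,
}
n_parameters = sum(free_parameters.values())

# Zero degrees of freedom means the model is just-identified (saturated)
degrees_of_freedom = n_moments - n_parameters
print(n_moments, n_parameters, degrees_of_freedom)
```

With zero degrees of freedom the model reproduces the sample covariances exactly, so no test of overall fit is available for it.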


Regression/SEM Example Summary

The only difference between the Regression and SEM analyses is the estimation method

• SEM: Maximum Likelihood
– Iterative search for the parameter values that make the observed data most likely

• Regression: Least Squares
– Parameter values that minimize the sum of squared residuals (i.e., observed – predicted)

Regression and SEM produced parameter estimates and R-squared values for depression that were almost identical
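
The small gaps between the two sets of standard errors can be explored numerically. The sketch below assumes that the SEM standard errors differ from the least-squares ones only by a variance divisor of N - 1 instead of N - 4, and that the SEM p-values use a normal (z) reference distribution rather than t. Both are assumptions about the software, not statements from the slides.

```python
import math

n = 100  # sample size from the study description
k = 4    # estimated regression coefficients, including the intercept

# Least-squares estimates and standard errors from the regression output
ols = {
    "res_prod": (-0.35047, 0.10379),
    "rel_superv": (-0.05621, 0.08027),
    "hours": (0.13045, 0.05561),
}

# Assumption: the SEM program divides by n - 1 where least squares divides by n - k
scale = math.sqrt((n - k) / (n - 1))

sem_like = {}
for name, (estimate, se_ols) in ols.items():
    se = se_ols * scale                     # rescaled standard error
    cr = estimate / se                      # critical ratio, treated as a z statistic
    p = math.erfc(abs(cr) / math.sqrt(2))   # two-sided normal p-value
    sem_like[name] = (round(se, 3), round(cr, 3), round(p, 3))
    print(name, sem_like[name])
```

Under those assumptions the rescaled values can be compared directly with the S.E., C.R., and P columns of the SEM results, which is one way to see that the two analyses differ in conventions rather than in substance.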


Confirmatory Factor Analysis (CFA)

A structural equation model often consists of two components:
• a measurement model linking a set of observed variables to a usually smaller set of latent variables
• a structural model linking the latent variables through a series of specified relationships

CFA corresponds to the measurement model of SEM


Exploratory Factor Analysis (EFA) versus CFA

With EFA, investigators are interested in exploring patterns within the data, whereas with CFA, investigators are interested in explicitly testing specific hypotheses about how the observed variables are related

Exploratory factor analysis (EFA) imposes no substantive constraints on the data
– There are no restrictions on the pattern of relationships between observed and latent variables (e.g., cross-loadings are permitted and the number of factors is generally not fixed)

EFA is data driven


EFA

• For EFA, each common factor is assumed to affect every observed variable, with the common factors being either all correlated or all uncorrelated (i.e., oblique or orthogonal factors)
– Can be estimated with ordinary statistical software packages (e.g., R, SPSS)

• Once the model is estimated, factor scores, proxies of latent variables, are calculated and used for follow-up analysis
– e.g., use factor scores to predict a different outcome in a separate analysis


CFA

Confirmatory factor analysis (CFA), on the other hand, is theory- or hypothesis-driven.

With CFA it is possible to place substantively meaningful constraints on the factor model
– For example, researchers can specify the number of factors, which observed variables should load on which latent variables, which factors should be correlated, etc.

Unlike EFA, CFA produces many goodness-of-fit measures to evaluate the model
– Recall that it is the constraints on the model (e.g., limited number of factors, observed variables that load on only one factor) that determine how well the model fits (i.e., are those constraints reasonable?)


Path Diagrams with Latent Variables

• Measurement models
– Generally, latent variables “cause” the observed/indicator variables (reflective indicators), as shown by single-headed arrows pointing away from the latent variable and towards the observed variables
  – E.g., latent depression with indicators representing scores on three different depression scales (latent depression “causes” scores on the observed variables)
– However, in some instances the observed variables ‘combine’ to determine the latent variable (formative indicators)
  – E.g., latent socioeconomic status variable with indicators income, occupational prestige, and level of education


Sample CFA


Sample CFA with higher order factor


Model Identification - Review

Need to scale the latent variables in order to identify the model

1) Set the regression coefficient (loading) of one indicator equal to 1
– All other indicators are interpreted relative to this value

OR

2) Set the variance of the latent variable to 1 (standardizing)
– Most common method for CFA


Confirmatory Factor Analysis

Indicators are assumed to be normally distributed variables
– What about items from a scale?

Likert-type items are by nature categorical; when they are treated as continuous, covariances are smaller than they should be, model fit tests are biased, and parameter estimates and std. errors are biased.

However, research has found that ordered variables with more than 5 categories can often be treated as continuous

For categorical items (e.g., items with fewer than 5 categories), it is better to use polychoric correlations or item response theory


CFA Example

Greenglass et al. were interested in the influence of a construct called “energized state”; more specifically, whether it could influence coping and stress outcomes
– Do individuals experiencing this energized state cope better with stress?

A measurement model was needed to examine the validity of this construct

Several positive personality variables were measured as indicative of an energized state
– Optimism, positive affect, tendency to perceive difficulties as a challenge, and vigor

N = 404

Variance of the latent variable is set at 1


Example CFA

[Path diagram: latent variable “energized state” with indicators Positive Affect, Vigor, Challenge, and Optimism, and error terms e1 to e4]


CFA

Review: Stages of Modeling
• Ensure that the model is identified
• Screen SEM assumptions: multivariate outliers, univariate normality, multivariate normality, linearity in the relationships between your variables
• Check overall model fit
• Check standardized residual covariance matrix/modification indices if model fit is poor
• Post hoc/exploratory analyses: Make theoretically appropriate changes to model and re-fit
• Interpret parameter estimates


CFA example

Model fit was good (although RMSEA is a little high):

χ²     df    p-value    CFI    TLI    RMSEA    90% CI        SRMR
6.1    2     .047       .99    .97    .07      .006, .139    .0254
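
One common formula for the RMSEA point estimate is sqrt(max(χ² - df, 0) / (df (N - 1))). The sketch below applies it to the values in this table with N = 404 from the study description. The function name is illustrative, and some programs divide by N rather than N - 1, so this is an approximation of what the software reports rather than its exact computation.

```python
import math


def rmsea(chi_square: float, df: int, n: int) -> float:
    """RMSEA point estimate using the N - 1 convention (an assumption)."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Values from the fit table: chi-square = 6.1 on 2 df, N = 404
cfa_rmsea = rmsea(6.1, 2, 404)
print(round(cfa_rmsea, 2))
```

Because the numerator is truncated at zero, any model whose chi-square is smaller than its degrees of freedom gets an RMSEA of exactly 0.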


Example CFA

Parameter Estimates

                                        Estimate     S.E.       C.R.         P
Positive Affect <--- energized state    8.9886581    .477506    18.824168    ***
Vigor <--- energized state              4.2036306    .290409    14.474831    ***
Challenge <--- energized state          2.3353240    .225466    10.357735    ***
Optimism <--- energized state           1.4259721    .221145    6.4481178    ***
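
These loadings were estimated with the factor variance fixed at 1. As a minimal sketch of why the choice of scaling method does not change the model, the code below re-expresses them with Positive Affect as a marker indicator (loading fixed to 1) and checks that the model-implied covariances between indicators are unchanged. The choice of marker and the helper function are illustrative assumptions, not part of the original analysis.

```python
import math

# Unstandardized loadings from the table (factor variance fixed at 1)
loadings = {
    "PositiveAffect": 8.9886581,
    "Vigor": 4.2036306,
    "Challenge": 2.3353240,
    "Optimism": 1.4259721,
}
factor_variance = 1.0

# Alternative scaling: fix the Positive Affect loading to 1 instead
marker = loadings["PositiveAffect"]
marker_loadings = {name: value / marker for name, value in loadings.items()}
marker_factor_variance = factor_variance * marker ** 2


def implied_covariance(lam, phi, a, b):
    """Model-implied covariance between two different indicators of one factor."""
    return lam[a] * phi * lam[b]


names = list(loadings)
same_covariances = all(
    math.isclose(
        implied_covariance(loadings, factor_variance, a, b),
        implied_covariance(marker_loadings, marker_factor_variance, a, b),
    )
    for i, a in enumerate(names)
    for b in names[i + 1:]
)
print(marker_loadings, marker_factor_variance, same_covariances)
```

Only the metric of the latent variable changes between the two scalings; fit and standardized estimates are the same either way.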


Example CFA

Standardized Parameter Estimates:
– Numerous rules of thumb, but standardized parameter estimates are often expected to be > .5

                                        Estimate
Positive Affect <--- energized state    .9435927
Vigor <--- energized state              .7250835
Challenge <--- energized state          .5214540
Optimism <--- energized state           .3317228
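
In a one-factor model like this one (no cross-loadings, uncorrelated errors), squaring a standardized loading gives the proportion of that indicator's variance explained by the factor. The short sketch below does that arithmetic for the four estimates above; the variable names are illustrative.

```python
# Standardized loadings from the table above
standardized = {
    "Positive Affect": 0.9435927,
    "Vigor": 0.7250835,
    "Challenge": 0.5214540,
    "Optimism": 0.3317228,
}

# Squared loading = share of indicator variance explained by the factor
variance_explained = {name: loading ** 2 for name, loading in standardized.items()}

for name, explained in variance_explained.items():
    print(f"{name}: {explained:.2f} explained, {1 - explained:.2f} unique")
```

Seen this way, the factor accounts for most of the variance in Positive Affect but only about a tenth of the variance in Optimism.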


Example CFA

• Model is a reasonable fit to the data
• All loadings on the general factor are statistically significant
• There is a question regarding whether optimism is an important contributor to the latent construct since its loading is relatively small
• Could use “energized state” variable as part of a full structural equation model


Full Structural Equation Models

Once you have established the measurement models for your latent variables, you can evaluate the structural portion of your hypothesized model
– i.e., the relationships among the latent variables and observed variables of interest


Full SEM Example

A researcher was interested in whether attitudes regarding quantitative ability at the start of a statistics course predicted quantitative performance at the end of the course

2 latent variables
• Quant Attitudes – 3 indicators
  – Anxiety, hindrances to doing well in a stats course, self-efficacy
• Quant Performance – 2 indicators
  – Average homework grade, average exam grade

One indicator for each latent variable had its loading fixed to 1


Quantitative Attitudes and Performance


Full SEM example

N = 129

χ² (4) = 3.23, p = .519 (Excellent!)

PROBLEM: The following variances are negative
– e10 is the residual variance for the observed variable “exam average”

e10 = -10.0533927


Improper Solutions

Tempting to look at the “problem” variables (e.g., the residual variance for exam average) and deal with the issue by “fixing” the variance to a positive value (e.g., .01)
– In some instances this is necessary, especially when the value is close to 0 and all other parts of the model fit well

Better to think carefully about the variables in the whole model
– Is something misspecified?
– Are there important parameters missing?


Full SEM Example

If we look through the output, we see that homework average is not a significant indicator of quantitative performance
– Further, if we go back to the bivariate correlations among our variables, we see that homework average is not correlated with any of the indicator variables for quantitative attitudes

Perhaps exam average alone is a better representation of quantitative performance?


Quantitative Attitudes and Performance: No Homework Average

χ² (2) = 1.8, p = .413
CFI = 1.00
IFI = 1.00
TLI = 1.00
RMSEA = 0, 90% CI = (0, .169)
SRMR = .026


Quantitative Attitudes and Performance: Parameter Estimates

                       Estimate    S.E.     C.R.      P
QPerf <--- QAtt          -9.955    2.575    -3.865    ***
HINDR1 <--- QAtt          1.000
ANX1 <--- QAtt            1.720     .380     4.531    ***
SEFF1 <--- QAtt          -1.043     .241    -4.331    ***
EXAMAVG <--- QPerf        1.000

Note: The model on the previous slide is identical to just including the observed ‘exam average’ variable as the outcome (instead of creating a latent ‘quantitative performance’ variable)


Quantitative Attitudes and Performance: Summary

The model fit well with homework average present, but it was not contributing to the model (and in fact it was leading to other issues)

Without homework average, the latent Quantitative Attitudes variable was a significant predictor of Quantitative Performance (now simply exam scores), explaining approximately 20% of the variability in Quantitative Performance
– The relationship was negative, as expected, with higher levels of negative attitudes predicting lower scores (and vice versa)


SEM Example Two

Evaluate the effects of a sixth grade intervention for reducing early sexual behaviours
– More specifically, do these sixth grade intervention strategies reduce the amount of sexual behaviour in grade 7 (time 2) and grade 8 (time 3)?

‘Sexual Gestalt’ is a latent variable that is made up of psychosocial variables related to the individual’s views toward early sexual behaviour


SEM Example Two

[Path diagram: latent variable Sexual Gestalt with indicators Peer Norms, Sexual Limits, Unwanted Advances, and Parental Views (error terms e1 to e4); outcomes Sexual Behavior (T2) and Sexual Behavior (T3), each with a residual]


Results for Model

Chi-square test of absolute model fit
– Chi-square = 33.42 with 9 DF, p < .0001
– Our model does not fit the data on an absolute basis (which is extremely common given that sample sizes are usually large and any non-zero residuals will result in a significant chi-square)

Does our model fit the data on a descriptive or approximate basis?

Descriptive fit measures
– CFI = .96
– RMSEA = .062
– SRMR = .04

Reasonable fit …. but can we do better???


SEM Example: Model Modification

Largest standardized residual covariances:
– Sexual Limits - Sexual Behavior (t3): -2.43
– Peer Norms - Sexual Behavior (t3): -1.84

Modification indices suggest that Sexual Limits to Sexual Behavior (t3) is the single best path to free for estimation
– Index value = 9.10
– The index value is the (minimum) expected drop in the model chi-square fit statistic if we were to free this parameter

Modification indices suggest that Sexual Gestalt to Sexual Behavior (t3) is the next best path to free for estimation


SEM Example: Model Modification

Makes more sense (probably) to connect Sexual Gestalt to Sexual Behavior at time 3 than it does to connect Sexual Limits to Behavior
– It is important to always consider which of the possible modifications makes most sense (in terms of parsimony, theory, etc.), instead of blindly making modifications

Re-specify model with one additional path from the Sexual Gestalt factor to Sexual Behavior at time 3


SEM Example: Modified Model

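[Path diagram: latent variable Sexual Gestalt with indicators Peer Norms, Sexual Limits, Unwanted Advances, and Parental Views (error terms e1 to e4); outcomes Sexual Behavior (T2) and Sexual Behavior (T3), each with a residual; includes the added path from Sexual Gestalt to Sexual Behavior (T3)]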


SEM Example: Modified Model Results

Chi-square = 17.91 with 8 DF, p = .02
– A drop of about 15.5 in the chi-square value (from 33.42) for only one DF: a good tradeoff!

Approximate Fit Indices:
– CFI = .98
– RMSEA = .04
– SRMR = .02
– GFI = .98
– AGFI = .95

No standardized residual covariances exceed |1.50|; most are below |1.00|
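
The improvement from the original to the modified model can be expressed as a chi-square difference test, assuming the two models are nested (the modified model only frees one additional path). The sketch below uses the two reported chi-square values. The closed-form p-value holds only because the difference has 1 degree of freedom, and 3.84 is the usual .05 critical value for 1 df.

```python
import math

# Reported fit of the original and modified models
chi_original, df_original = 33.42, 9
chi_modified, df_modified = 17.91, 8

delta_chi = chi_original - chi_modified
delta_df = df_original - df_modified

# For 1 df, P(chi-square > x) equals the two-sided normal tail at sqrt(x)
if delta_df == 1:
    p_value = math.erfc(math.sqrt(delta_chi / 2))
else:
    p_value = None  # the closed form used here does not apply to other df

critical_value_05 = 3.84  # .05 critical value for 1 df
print(round(delta_chi, 2), delta_df, p_value, delta_chi > critical_value_05)
```

A statistically significant drop supports the added path, but as the earlier slide notes, the choice of which path to free should still rest on theory.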


SEM Example: Modified Model Results

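[Path diagram: modified model with unstandardized estimates]

Chi-square (8 df) = 17.912, p = .022

                                                    Estimate
Sexual Limits <--- Sexual Gestalt                    1.00
Peer Norms <--- Sexual Gestalt                       -.95*
Unwanted Advances <--- Sexual Gestalt                -.28*
Parental Views <--- Sexual Gestalt                   -.30*
Sexual Behavior (T2) <--- Sexual Gestalt            -2.20*
Sexual Behavior (T3) <--- Sexual Behavior (T2)        .31*
Sexual Behavior (T3) <--- Sexual Gestalt            -1.49*

Variances: Sexual Gestalt = .28; e1 = .29; e2 = .27; e3 = .07; e4 = .39; Residual of Sexual Behavior (T2) = .52; Residual of Sexual Behavior (T3) = 2.83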


SEM Example: Modified Model Results

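[Path diagram: modified model with standardized estimates]

Squared multiple correlations:
Sexual Behavior (T3)    .33
Peer Norms              .48
Sexual Limits           .49
Unwanted Advances       .25
Parental Views          .06
Sexual Behavior (T2)    .72

                                                    Estimate
Sexual Limits <--- Sexual Gestalt                     .70
Peer Norms <--- Sexual Gestalt                       -.69
Unwanted Advances <--- Sexual Gestalt                -.50
Parental Views <--- Sexual Gestalt                   -.25
Sexual Behavior (T2) <--- Sexual Gestalt             -.85
Sexual Behavior (T3) <--- Sexual Behavior (T2)        .21
Sexual Behavior (T3) <--- Sexual Gestalt             -.39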