Upload
others
View
21
Download
0
Embed Size (px)
Citation preview
Structural Equation Modeling Using Stata
Paul D. Allison, Ph.D.
Upcoming Seminar: August 16-17, 2018, Stockholm
2/3/2017
1
Introduction toStructural Equation ModelingUsing Stata
Structural Equation Models
What is SEM good for?
SEM
Preview: A Latent Variable SEM
Latent Variable Model (cont.)
Cautions
Outline
Software for SEMs
Favorite Textbook
Linear Regression in SEM
GSS2014 Example
Linear Regression with Stata
FIML for Missing Data
Further Reading
Assumptions
FIML in Stata
Path Diagram (from Mplus)
Path Analysis of Observed Variables
Some Rules and Definitions
Three Predictor Variables
Two-Equation System
Why combine the two equations?
Calculation of Indirect Effect
A More Complex Model
Decomposition of Direct & Indirect Effects
Standardized Coefficients
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
2/3/2017
2
Numerical Examples
More Complex Example
Decomposition of Effects
Illness Data
Summary Data
Covariance Matrix
Covariance Matrix for Illness Data
Illness Regression in Stata
Stata Results - Unstandardized
Counting Moments & Parameters
Mplus Results - Standardized
Illness Model with Indirect Effects
Model Diagram
Path Diagrams in Stata
Results
More Goodness of Fit Measures
Identification Status of the Model
Improving the Model
GOF for Improved Model
Estimates for Improved Model
Indirect Effects
Indirect Effects in Stata
Specific Indirect Effects in Stata
Partial Correlations
Partial Correlations (cont.)
Maruyama (1998) Data
Partial Correlations in Stata
Partial Correlation Results
Causal Ordering
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
2/3/2017
3
How to Decide
Nonrecursive Systems
Identification Problem in Nonrecursive Models
Identification Problem (cont.)
A Just-Identified Model
Reduced Form Equations
Solutions for Structural Parameters
Sufficient Condition for Identification
Varieties of Identification
Problems with Instrumental Variables
Example of a Nonrecursive Model
Nonrecursive Example (cont.)
Stata Code for Nonrecursive Model
Nonrecursive Results
GOF for Nonrecursive
Latent Variable Models
Roadmap for Latent Variables
Classical Test Theory
Random Measurement Error
Reliability
Parallel Measures
Tau-Equivalent Measures
Tau-Equivalance: Example
Tau-Equivalence in Stata
Congeneric Tests
Three Congeneric Tests
Three Congeneric Measures (cont.)
Identification in General
Standardized Version
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
2/3/2017
4
Digression: Tracing Rule for Correlations
Tracing Rule (cont.)
Tracing Rule (cont.)
Standardized Version of 3 Congenerics (cont.)
Three Congenerics: Example
Three Congenerics: Stata
Three Congenerics: Results
Four Congeneric Measures
Overidentification with 4 Congeneric Measures
Four Congeneric Measures with Stata
Estimates for Four Congenerics
GOF for Four Congenerics
Alternative Model for 4 Congeneric Measures
Stata for Alternative Model
Results for Alternative Model
Heywood Case
Factor Models
Factor Models (cont.)
Identification (Standardized)
Identification (cont.)
Two Approaches to Identification Problem
Identification (Unstandardized)
Determining Identification
Normalizing Constraints
Normalizing Constraints
ML Estimation of CFA Models
Multivariate Normality
ML Details
Chi-Square Test
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
2/3/2017
5
Example: Self-Concept Measurement
Self Concept Path Diagram
Self Concept. Results
Self Concept Results (cont.)
Self Concept Results (cont.)
Global Goodness of Fit Measures
Other Global Measures
Other Global Measures (cont.)
Specific Goodness of Fit Measures
Standardized Residuals for Self-Concept Model
Modification Indices
Mod Indices for Self-Concept
Mod Indices for Self-Concept (cont.)
Freeing Up Parameters
Results from Freeing 1 Parameter
Selected Results (cont.)
Correlated Errors
Two Correlated Errors
A Five-Indicator Model
A Two-Factor Model
Example: Self-Concept Data
Selected Results
The General Structural Equation Model
GSS2014 Example: Stata Code
GSS2014 Example: GOF Results
GSS2014: Standardized Results
GSS2014: Standardized Results
Farm Manager Example (Rock et al. 1977)
Farm Managers Path Diagram
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
2/3/2017
6
Farm Managers: Stata Code
Farm Managers. Selected Results
Selected Results (cont.)
A Tau-Equivalent Model
Parallel Model
Identification in SEM Models
An Identified SEM
What to Do If Endogenous Variables Aren’t Normal
Example: NLSY Data
ML Results for NLSY Data
Both Variables Highly Skewed
Satorra-Bentler Robust SE’s
Weighted Least Squares
Weighted Least Squares
WLS Results
Multiple Group Analysis
Subjective Class Example
Reading in the Data in Stata
Subjective Class Models
Stata Code for 2-Group Models
Stata Code (cont.)
Tests for Comparing the Groups
Model 2 Results
Model 2 Results (cont.)
Wald & Score Tests Comparing Groups
Output from estat ginvariant
More output from estat ginvariant
Interactions and Non-Linearities
Ordinal and Binary Data
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
2/3/2017
7
Special Correlations
Special Correlations
Specialized Models
CFA Model with Categorical Indicators
Selected Results
Other Capabilities of gsem
Cautions About SEMs
Examples I Don’t Like
Examples I Like
SEMs and Causality
Exemplary Article
Some Recommendations:Lest you forget
174
175
176
177
178
179
180
181
182
183
184
185
174
175
176
177
178
179
180
181
182
183
184
185
Introduction toStructural Equation Modeling
Using StataPaul D. Allison, Instructor
February 2017www.StatisticalHorizons.com
1
Copyright © 2017 Paul Allison
Structural Equation ModelsThe classic SEM includes many common linear models
used in the behavioral sciences:• Multiple regression• ANOVA• Path analysis• Multivariate ANOVA and regression• Factor analysis• Canonical correlation• Non-recursive simultaneous equations• Seemingly unrelated regressions• Dynamic panel data models
2
What is SEM good for?
• Modeling complex causal mechanisms.• Studying mediation (direct and indirect effects).• Correcting for measurement error in predictor variables.• Avoiding multicollinearity for predictor variables that are
measuring the same thing.• Analysis with instrumental variables.• Modeling reciprocal relationships (2-way causation).• Handling missing data (by maximum likelihood).• Scale construction and development.• Analyzing longitudinal data.• Providing a very general modeling framework to handle all
sorts of different problems in a unified way.
3
SEM
Convergence of psychometrics and econometrics
• Simultaneous equation models, possibly with reciprocal (nonrecursive) relationships
• Latent (unobserved) variables with multiple indicators.
• Latent variables are the most distinguishing feature of SEM. For example:
4
X Y
x1 x2 y1 y2
e1 e2 e3 e4
u
a b
f
c d
X and Y are unobserved variables, x1, x2, y1, and y2 are observed indicators, e1-e4 and u are random errors. a, b, c, d, and f are correlation coefficients.
Preview: A Latent Variable SEM
5
Latent Variable Model (cont.)
6
• If we know the six correlations among the observed variables, simple hand calculations can produce estimates of a through f. We can also test the fit of the model.
• Why is it desirable to estimate models like this? – Most variables are measured with at least some error. – In a regression model, measurement error in
independent variables can produce severe bias in coefficient estimates.
– We can correct this bias if we have multiple indicators for variables with measurement error.
– Multiple indicators can also yield more powerful hypothesis tests.
Cautions
• Although SEM’s can be very useful, the methodology is often used badly and indiscriminately.– Often applied to data where it’s inappropriate.– Can sometimes obscure rather than illuminate. – Easy to get sucked into overly complex modeling.
7
Outline1. Introduction to SEM2. Linear regression with missing data3. Path analysis of observed variables4. Direct and indirect effects5. Identification problem in nonrecursive models6. Reliability: parallel and tau-equivalent measures7. Multiple indicators of latent variables8. Confirmatory factor analysis9. Goodness of fit measures10. Structural relations among latent variables11. Alternative estimation methods.12. Multiple group analysis13. Models for ordinal and nominal data
8
Software for SEMsLISREL – Karl Jöreskog and Dag SörbomEQS – Peter BentlerPROC CALIS (SAS) – W. Hartmann, Yiu-Fai YungAmos – James ArbuckleMplus – Bengt Muthénsem, gsem (Stata)Packages for R:
OpenMX – Michael Nealesem – John Foxlavaan – Yves Rosseel
9
Favorite Textbook
10