Introduction to Path Analysis and Structural Equation Modelling with AMOS
Daniel Stahl
Biostatistics and Computing
Outline
Course:
– SEM and path analysis
– Using AMOS to do path analysis and SEM
– Model specification, identification, and estimation
– Evaluating model fit
– Interpreting parameter estimates
– SEM and causality
Today:
– What is SEM?
– Relationship between correlation, regression, path analysis and SEM
– Basic concepts of path analysis and SEM
– Unobservable traits
– Introduction to AMOS
– Simple analyses with AMOS
Books
• Barbara M. Byrne (2001) Structural Equation Modeling with AMOS
• Randall E. Schumacker and Richard G. Lomax (2004) A Beginner's Guide to Structural Equation Modeling. (presents AMOS examples)
• Rex B. Kline (2004) Principles and Practice of Structural Equation Modeling, 2nd ed.
• Bill Shipley (2004) Cause and Correlation in Biology: A User's Guide to Path Analysis, Structural Equations and Causal Inference.
• James L. Arbuckle (2007) Amos™ 7.0 User’s Guide.
We collected the following variables from 40 cancer patients:
– Body function
– Pain
– Depression
We are interested in the influence of pain and functioning on depression.
What kind of analyses could we do?
[Diagrams: Correlations (Depression and Pain; Depression and Function)]
[Diagrams: Simple linear regression (Depression and Pain; Depression and Function)]
[Diagram: Multiple linear regression (Depression, Pain, Function)]
[Diagram: Mediation (path analysis) (Depression, Pain, Function)]
[Diagrams: Path analysis: competing models, Model 1 and Model 2 (Depression, Pain, Function)]
• Let’s assume that there is no test available for measuring “pain”. We developed a small questionnaire with three questions.
• How could we integrate the answers of the questionnaires in our analysis?
• (Hint: Pain is a latent construct, which we would like to measure with our questionnaire.)
Factor analysis: latent construct “pain”
[Diagram: latent variable “Pain” measured by the items Q1, Q2 and Q3]
We could do a factor analysis and use the factor scores as an estimate of “pain” in the same way as before.
• Now we measured all three variables with a questionnaire with 3 items:
[Diagram: the latent variables Pain, Function and Depression, each measured by three items (Item 1 to Item 3), each item with its own error term (Error 1 to Error 3)]
Path analysis with latent constructs = Structural Equation Modelling
Correlation and Regression
• Correlation describes the linear association between two variables.
• Regression describes the effect of one or more independent (predictor) variables on a dependent variable: Depression = c + β1*Pain + β2*Function + error, with error ~ N(0, σ²)
Correlation and Regression
Pearson’s r=0.68
Depression = 11.86+1.072*Pain + Error (0,32)
Standardised coefficient for pain: 0.678
Coefficients (dependent variable: Depression)
Model 1       Unstandardized B   Std. Error   Standardized Beta   t       Sig.
(Constant)    11.860             1.337                            8.870   .000
Pain          1.072              .248         .678                4.331   .000
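The agreement between Pearson's r (0.68) and the standardised coefficient (0.678) is expected: in a simple linear regression the standardised slope equals the correlation. The following is a minimal Python sketch of that relationship. The data are simulated and the helper names (`mean`, `cov`) are assumptions made for illustration; this is not the course data set.

```python
import random
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def cov(x, y):
    """Sample covariance (n - 1 in the denominator)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

# Simulated scores for 40 hypothetical patients (NOT the course data).
random.seed(1)
pain = [random.gauss(5, 2) for _ in range(40)]
depression = [12 + 1.0 * p + random.gauss(0, 2) for p in pain]

# Unstandardised slope and intercept of Depression = c + b * Pain + error.
slope = cov(pain, depression) / cov(pain, pain)
intercept = mean(depression) - slope * mean(pain)

# Pearson's r and the standardised slope (slope rescaled by the two SDs).
r = cov(pain, depression) / sqrt(cov(pain, pain) * cov(depression, depression))
beta_std = slope * sqrt(cov(pain, pain)) / sqrt(cov(depression, depression))

print(f"slope={slope:.3f} intercept={intercept:.3f} r={r:.3f} beta={beta_std:.3f}")
```

With more than one predictor this equality no longer holds, which is why the standardised coefficients of the multiple regression differ from the bivariate correlations.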
Error also influences our outcome variable “Depression”.
[Diagram: Multiple linear regression (Depression, Pain, Function) with an error term pointing at Depression]
→ We have to add an error term to our dependent variable.
Path analysis
• Technique to examine the causal relationships between two or more variables.
• Path analysis assesses the direct and indirect (mediating) relationships among a set of variables:
– X→Z: direct effect of X on Z
– X→Y→Z: indirect effect of X on Z via Y
– Total effect of X on Z = direct + indirect effect
• Regression is a subset of path analysis. It only studies the direct effects of one or more independent variables on (usually) one dependent variable
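The decomposition "total = direct + indirect" holds exactly when all paths are estimated by ordinary least squares. The sketch below illustrates this for the X→Y→Z notation used above. The data are simulated, and the variable names and coefficient values are assumptions chosen only for illustration.

```python
import random

def cov(x, y):
    """Sample covariance (n - 1 in the denominator)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

# Simulated data: X influences Z directly and indirectly via the mediator Y.
random.seed(7)
n = 200
x = [random.gauss(0, 1) for _ in range(n)]
y = [0.5 * xi + random.gauss(0, 1) for xi in x]
z = [0.3 * xi + 0.4 * yi + random.gauss(0, 1) for xi, yi in zip(x, y)]

sxx, syy, sxy = cov(x, x), cov(y, y), cov(x, y)
sxz, syz = cov(x, z), cov(y, z)
det = sxx * syy - sxy ** 2

a = sxy / sxx                            # simple regression: X -> Y
direct = (sxz * syy - syz * sxy) / det   # multiple regression: X -> Z, Y held constant
b = (syz * sxx - sxz * sxy) / det        # multiple regression: Y -> Z, X held constant
indirect = a * b                         # X -> Y -> Z
total = sxz / sxx                        # simple regression: X -> Z

print(f"direct={direct:.3f} indirect={indirect:.3f} total={total:.3f}")
```

The simple regression of Z on X recovers the total effect, while the multiple regression splits it into the direct part and the part transmitted through Y.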
Structural equation models
• Structural equation modelling additionally allows us to study the effects of unmeasured latent variables.
• Latent variables cannot be observed and must be inferred from measured variables.
• Latent variables are implied by the covariances among two or more measured variables (→ factor analysis).
• SEMs are therefore also called covariance models.
SEM
• SEM consists of two parts: a measurement model and a structural model.
• The structural model deals with the relationship between the latent variables while the measurement model describes the relationship between our measured variables and the latent variables
• For example: Relationship between the measurement model and the structural model relating pain and function to depression:
[Path diagram: the latent variables Pain, Function and Depression, each measured by three items (Item 1 to Item 3) with error terms (Error 1 to Error 3). The links between items and latent variables are labelled “Measurement model”; the paths among the latent variables are labelled “Structural model”.]
Path analysis and SEM
• In path analysis we assume that each latent variable is perfectly measured with one observed variable (perfect correlation = no measurement error).
• Path analysis can be regarded as an SEM in which each latent variable is inferred from one measured variable (e.g. temperature change can be seen as a latent variable measured as the change in a mercury column).
• SEM ≈ combination of path and factor analysis
Family Tree of SEM
[Diagram with the nodes: Bivariate Correlation, Multiple Regression, Path Analysis, Structural Equation Modeling, Factor Analysis, Exploratory Factor Analysis, Confirmatory Factor Analysis]
Unobservable traits
• In psychology and health sciences we are often concerned with questions which are more subjective than questions in other fields of science.
• These include measurements of abilities, knowledge, emotions, feelings, attitudes or personality traits.
• What all these traits have in common is that they are unobservable: unobservable traits = latent traits.
Unobservable, latent traits
• A drug may prolong the life of a patient or cure a symptom, but it may also affect general well-being.
• While prolonging life is rather easy to define, it is not easy to define “well-being”.
• And different people may have different definitions.
• The field of psychometrics is concerned with the theory and technique of measuring such psychological and mental phenomena.
Latent traits in psychology
• Intelligence
• Memory
• Extraversion
• Self-esteem
• Depression
• Anxiety
• Knowledge
• Beliefs
• Feelings and emotions: joy, sadness
• Senses and perception: smell of a flower
• Attitude about something, e.g. foreigners, risk
• Motivation
• Ability to learn statistics or a new language
Latent traits in medical research
• Pain
• Mental disorders
– Depression
– Schizophrenia
– Autism
• Mobility/Function (gerontology)
• Arthritis
• Quality of life
• Patient satisfaction (e.g. in hospital)
Latent vs. observed variables
• An observed variable, like body height, is directly observable and can be measured easily.
• A latent variable (or trait, or construct) is not directly observable. Instead, it is inferred from variables (items) that can be observed.
• The main approach of psychometric measurement involves applying interviews, questionnaires and tests (= instruments).
Latent trait and items
Latent trait: Self Esteem
Items (response scales as printed on the slide):
– “I can feel that my co-workers respect me” 1 2 3 4 5
– “I feel that I am making a useful contribution to work” 1 2 3 4 5
– “I feel good about my work” 1 2 3 4 5 6 7
– “On the whole I get along with others well” 1 2 3 4 5
– “I am proud of my relationship with my supervisor” 1 2 3 4 5
Here, the latent trait “self esteem” elicits a response to each item from 1 (strongly disagree) to 5 (strongly agree). The sum of the observed responses allows a conclusion about the person’s self esteem.
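The sum score described here can be sketched in a few lines of Python. The responses are hypothetical, the short item keys are invented labels, and the sketch assumes that all five items use the 1 to 5 scale.

```python
# Hypothetical responses of one person (1 = strongly disagree, 5 = strongly agree).
responses = {
    "co-workers": 4,
    "contribution": 5,
    "work": 3,
    "others": 4,
    "supervisor": 4,
}

# Guard against values outside the assumed 1-5 response scale.
for item, value in responses.items():
    if not 1 <= value <= 5:
        raise ValueError(f"response for {item!r} is outside the 1-5 scale")

sum_score = sum(responses.values())
min_score, max_score = 1 * len(responses), 5 * len(responses)
print(f"self-esteem sum score: {sum_score} (possible range {min_score}-{max_score})")
```

A sum score weights every item equally; the factor models discussed in this course instead estimate how strongly each item relates to the latent trait.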
Item and latent traits
• All 5 items are measuring the latent trait “self esteem”.
• They should therefore correlate with the latent trait.
• A single item will never measure a construct perfectly (and hence will never correlate perfectly),
• but the 5 items should be an accurate predictor of the latent trait.
• SEM in the form of factor analysis is an important tool to develop such tests.
Confirmatory factor analysis model
[Path diagram: latent variable Self Esteem measured by five items (Item 1 “work”, Item 2 supervisor, Item 3 other people, Item 4 co-worker, Item 5 contribution), with error terms e1 to e5]
SEM: Longitudinal CFA
[Path diagram: latent variables Well-being Time 1 and Well-being Time 2, each measured by pain, depress and function, with error terms e1 to e3]
But SEM can be extended: it allows us to include more latent and observable variables in the analysis:
[Path diagram: latent variable Self Esteem (five items, error terms e1 to e5), latent variable Depression (four items, error terms e1 to e4), observed variable Age, and an error term e]
Longitudinal data analysis using SEM
• A common approach to the analysis of longitudinal data is multilevel modelling (MLM),
• but we can also use the structural equation modeling (SEM) framework to form what are known as “latent curve” or “latent trajectory” models.
• Within this SEM framework, latent trajectory analysis is extremely flexible in terms of the variety of hypotheses that can be tested.
• Rovine & Molenaar (2001) demonstrated the mathematical equivalence of MLM and SEM with balanced data.
• The SEM latent growth curve approach is more flexible.
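To make the “latent curve” idea concrete, the sketch below computes the means and covariances that a linear growth model with a random intercept and a random slope implies for four time points. All numbers are hypothetical, and the slope loadings 0, 1, 2, 3 are an assumption (equally spaced occasions), not values taken from the slides.

```python
# Linear latent growth curve: y_t = intercept + slope * time_t + error_t.
times = [0, 1, 2, 3]                        # assumed slope loadings
loadings = [[1.0, float(t)] for t in times]

# Hypothetical parameter values (illustration only).
factor_means = [10.0, 2.0]                  # mean intercept, mean slope
factor_cov = [[4.0, 0.5],                   # var(intercept), cov(intercept, slope)
              [0.5, 1.0]]                   # cov(intercept, slope), var(slope)
error_var = [1.5, 1.5, 1.5, 1.5]            # uncorrelated errors

# Model-implied means: Lambda * alpha.
implied_means = [sum(l * m for l, m in zip(row, factor_means)) for row in loadings]

def implied_cov(i, j):
    """Element (i, j) of Lambda * Psi * Lambda' + Theta."""
    value = sum(loadings[i][a] * factor_cov[a][b] * loadings[j][b]
                for a in range(2) for b in range(2))
    return value + (error_var[i] if i == j else 0.0)

sigma = [[implied_cov(i, j) for j in range(4)] for i in range(4)]

print("implied means:", implied_means)
for row in sigma:
    print(row)
```

Fitting the model means choosing the factor means, factor covariances and error variances so that these implied moments are as close as possible to the observed means and covariances of the repeated measures.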
[Path diagram: Amos setup of a simple growth curve model with random slope and intercept with 4 time points: latent variables ICEPT and SLOPE, observed variables X1 to X4, error terms E1 to E4]
[Path diagram: growth curve model with random slope and intercept with correlated errors: latent variables ICEPT and Slope, observed variables y1 to y4, error terms e1 to e4 with mean 0 and variances g1 to g4, and error covariances g12, g13, g14, g23, g24, g34]
Literature
• Terry E. Duncan, Susan C. Duncan, Lisa A. Strycker (2006) An Introduction to Latent Variable Growth Curve Modeling: Concepts, Issues, and Applications.
• I will start with regressions and simple path analyses without latent variables.
Exercise
Use SPSS and open the data file “pain.sav”. Do the following analysis:
• Correlation matrix of all four variables
• Simple linear regressions between:
– Function → Depression
– Pain → Depression
– Pain → Function
• Multiple regression:
– Pain + Function → Depression
• Plot a path analysis diagram for standardised and unstandardised estimates
• Calculate the direct, indirect and total effects of pain
• Use AMOS to do the same analysis
• Use AMOS to evaluate the indirect (mediation) effect
SPSS
Results of regression analysis
Model                          Predictor   Parameter estimate   Standardised parameter   p
Pain + Function → Depression   Pain        0.061                0.184                    0.028
                               Function    -0.523               -0.337                   <0.001
Function → Depression          Function    -0.652               -0.421                   <0.001
Pain → Depression              Pain        0.112                0.337                    <0.001
Pain → Function                Pain        -0.097               -0.455                   <0.001
Standardised estimates
[Path diagram: pain → function -.46; function → depress -.34; pain → depress .18; error terms on function and depress]
Result: unstandardised estimates
[Path diagram, Model 2: pain → function -.10; function → depress -.52; pain → depress .06; variance of pain 4.08; error variances .15 (function) and .36 (depress)]
Direct and indirect effects
Indirect effect of pain on depression
• The direct effect of pain on depression (that is, if we keep function constant) is 0.061. But pain also negatively affects body function. Therefore, pain has an indirect effect on depression via function.
• For each 1 unit increase in pain, body function decreases by 0.097. A decrease of 0.097 units in body function increases depression by 0.097*0.523 = 0.051.
• Therefore, the indirect effect of pain on depression is 0.051.
• The total effect of pain on depression is the sum of the direct and indirect effects: 0.061+0.051 = 0.112. This total effect of 0.112 is the same as the effect in the simple linear regression between pain and depression.
• A commonly used test for statistical significance of the indirect effect (=mediation effect) is the Sobel test.
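As a rough illustration of the Sobel test, the sketch below divides the indirect effect a*b by its first-order standard error, sqrt(b²·SE_a² + a²·SE_b²), and compares the ratio with a standard normal distribution. The path coefficients a and b are the estimates reported above; the two standard errors are hypothetical placeholders, because the slides do not report them.

```python
from math import erfc, sqrt

def sobel_test(a, se_a, b, se_b):
    """Return the indirect effect a*b, the Sobel z statistic and a two-sided p-value."""
    indirect = a * b
    se_indirect = sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    z = indirect / se_indirect
    p = erfc(abs(z) / sqrt(2))  # two-sided p-value under the standard normal
    return indirect, z, p

# a: pain -> function, b: function -> depression (estimates from the regressions).
# se_a and se_b are HYPOTHETICAL standard errors, used only for illustration.
indirect, z, p = sobel_test(a=-0.097, se_a=0.02, b=-0.523, se_b=0.10)
print(f"indirect={indirect:.3f}  z={z:.2f}  p={p:.4f}")
```

The test relies on the product a*b being approximately normally distributed, which is often doubtful in small samples.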
Direct, indirect and total effects
The total effect of pain on depression is 0.112.
The direct effect of pain on depression is 0.061.
The indirect effect is -0.097*(-0.523) = 0.051.
Check: Total = 0.061+0.051 = 0.112.
The total standardised effect of pain on depression is 0.337.
The direct standardised effect of pain on depression is 0.184.
The indirect standardised effect is -0.455*(-0.337) = 0.153.
Check: Total = 0.184+0.153 = 0.337.
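The same arithmetic can be reproduced with a small helper. This is only a sketch; the function name `decompose` is an assumption, and the inputs are the coefficients from the regression table.

```python
def decompose(direct, a, b):
    """Return (indirect, total) for the paths pain -> function (a),
    function -> depression (b) and the direct path pain -> depression."""
    indirect = a * b
    return indirect, direct + indirect

# Unstandardised estimates.
indirect, total = decompose(direct=0.061, a=-0.097, b=-0.523)
# Standardised estimates.
indirect_std, total_std = decompose(direct=0.184, a=-0.455, b=-0.337)

print(f"unstandardised: indirect={indirect:.3f}, total={total:.3f}")
print(f"standardised:   indirect={indirect_std:.3f}, total={total_std:.3f}")
```

Both totals agree with the coefficients of the simple regression of depression on pain (0.112 unstandardised, 0.337 standardised).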
AMOS
[Screenshot of the AMOS toolbar; the numbers below refer to the numbered buttons]
1. Observed variables
2. Unobserved variables
3. Drawing latent variable (draws latent variable and items)
4. Drawing path (causal relationship → regression)
5. Draw covariances (correlation, no direction)
6. Unique variable (error variable; add e.g. to each dependent variable)
7. List variables (open data file first, then drag and drop variables)
8. Select one object, select all, deselect
9. Move object
10. Delete
11. Select data file
12. Analysis properties (choose statistics)
13. Calculate estimates (starts the analysis)
14. View text (see results)
15. Copy graph to clipboard
16. Save
Main steps:
1. Draw path diagram
2. Move data into appropriate box
3. Name variable (here only error variable): right-click object
4. Analysis properties: Select statistics
5. Calculate estimates: Run analysis
6. View text: View results
Correlation and Regression as AMOS path models
[Diagrams: Correlation (depress, pain); Simple linear regression (depress, pain, error term for depression); Multiple linear regression (depress, pain, function, error term)]
Endogenous variables have error variables pointing at them; the error variance is the unexplained variance.
Simple mediation analysis with AMOS
[Path diagrams: Mediation (path analysis), Model 1 and Model 2, each with depress, pain and function and the error terms Error 1 and Error 2]
Exogenous and endogenous variables
• In SEM it is difficult to apply the concept of independent and dependent variables.
• Function is a dependent variable in one regression analysis (pain→function) but an independent variable in another (pain+function→depression).
• Therefore, we distinguish between exogenous and endogenous variables:
• Exogenous variables are independent variables with no prior causal variable (= no single-headed arrow pointing at them), although they can be correlated (double-headed arrow) with other variables.
• Endogenous variables: all other variables (at least one single-headed arrow is directed at them).
• Latent variables can be exogenous variables.
[Path diagram, Model 2: pain is marked as the exogenous variable; function and depress, each with an error term, are marked as the endogenous variables]
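The rule “at least one single-headed arrow pointing at it” can be written down directly. The sketch below assumes that Model 2 consists of the paths pain → function, pain → depress and function → depress; the list-of-pairs representation is an illustrative choice, not an AMOS format.

```python
# Single-headed arrows of Model 2 as (cause, effect) pairs (assumed structure).
paths = [
    ("pain", "function"),
    ("pain", "depress"),
    ("function", "depress"),
]

variables = sorted({name for path in paths for name in path})
# Endogenous: at least one single-headed arrow points at the variable.
endogenous = sorted({effect for _, effect in paths})
# Exogenous: no single-headed arrow points at the variable.
exogenous = [name for name in variables if name not in endogenous]

print("exogenous: ", exogenous)
print("endogenous:", endogenous)
```

Each endogenous variable found this way needs its own error term in the AMOS diagram.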
SEM diagram symbols
[Legend of diagram symbols:]
– Observed variable
– Observed variable with measurement error (endogenous variable)
– Latent variable with items (observed variables)
– Latent variable with disturbance or error
– Unidirectional path (“regression”)
– Correlation between variables
– Reciprocal relation between variables
SEM: observed and unobserved variables
• Observed variables:
– Indicator variables
– Manifest variables
– Reference variables
• Unobserved variables:
– Latent variables
– Latent constructs
– Latent factors
• How to test the mediation effect (pain→function→depression)?
• (Sobel test)
• → Alternative: bootstrapping allows us to test the indirect effect and to estimate confidence intervals for it.
• It has more power and is more robust: it does not require distributional assumptions and works even if the specified model is wrong (except for violations of independence).
How to find bootstrap results in AMOS:
Lower Bootstrap CI: 0.019
Upper Bootstrap CI: 0.096
Bootstrap Test (p=0.002)
Indirect effect of pain on depression: 0.051 (95% CI: 0.019 to 0.096)
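The idea behind such a bootstrap interval can be sketched in plain Python: resample the cases with replacement, re-estimate the indirect effect each time, and read the confidence limits off the sorted estimates. The data below are simulated (not “pain.sav”), the coefficient values used for the simulation are assumptions, and the sketch uses a simple percentile interval, which is not necessarily the interval type AMOS reported here.

```python
import random

def cov(x, y):
    """Sample covariance (n - 1 in the denominator)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def indirect_effect(x, m, y):
    """a*b for the paths x -> m (a) and m -> y controlling for x (b)."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    a = sxm / sxx
    b = (smy * sxx - sxy * sxm) / (sxx * smm - sxm ** 2)
    return a * b

# Simulated data for 200 hypothetical patients (NOT the course data).
random.seed(42)
n = 200
pain = [random.gauss(5, 2) for _ in range(n)]
function = [3 - 0.1 * p + random.gauss(0, 0.4) for p in pain]
depress = [2.6 + 0.06 * p - 0.5 * f + random.gauss(0, 0.6)
           for p, f in zip(pain, function)]

estimate = indirect_effect(pain, function, depress)

# Percentile bootstrap: resample cases, re-estimate, take the 2.5% and 97.5% points.
n_boot = 1000
boot = []
for _ in range(n_boot):
    idx = [random.randrange(n) for _ in range(n)]
    boot.append(indirect_effect([pain[i] for i in idx],
                                [function[i] for i in idx],
                                [depress[i] for i in idx]))
boot.sort()
lower = boot[n_boot * 25 // 1000]
upper = boot[n_boot * 975 // 1000 - 1]

print(f"indirect effect {estimate:.3f}, 95% percentile CI {lower:.3f} to {upper:.3f}")
```

If the interval excludes zero, the indirect effect is judged significant at the 5% level, without assuming that the product a*b is normally distributed.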
Results: unstandardised estimates
[Path diagram, Model 2, unstandardised estimates: pain → function -.10; function → depress -.52; pain → depress .06; pain has mean 4.93 and variance 4.08; intercepts 3.10 (function) and 2.61 (depress); the error terms have mean 0 and variances .15 (function) and .36 (depress)]
The unstandardised estimates are the same as the regression coefficients of the regression analyses pain→function and pain+function→depression.
Unstandardised estimates
[Path diagram, Model 2, unstandardised estimates, repeated with annotations:]
– Mean and variance of the exogenous variable (4.93, 4.08) = descriptive mean and variance of “pain”
– Error variances (.36 and .15) = mean squared error (MSE) in the regression analyses (pain→function and pain+function→depression)
[Path diagram, Model 2, unstandardised estimates, repeated with annotation:]
– Estimates of intercepts for predicting the endogenous variables (3.10 for function, 2.61 for depress) = intercepts in the regression analyses (pain→function and pain+function→depression)
Result: standardised estimates
[Path diagram, Model 2, standardised estimates: pain → function -.46; function → depress -.34; pain → depress .18; squared multiple correlations .21 (function) and .20 (depress)]
– Standardised estimates = standardised regression coefficients of the regression analyses (pain→function and pain+function→depression).
– Estimates of squared multiple correlations = explained variance of the two regression models (pain→function and pain+function→depression).
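The two squared multiple correlations can be recovered from the standardised estimates. For standardised variables, the explained variance of a regression with two predictors is β1² + β2² + 2·β1·β2·r12, where r12 is the correlation between the predictors. The sketch below uses the standardised coefficients reported in this course; the variable names are assumptions.

```python
# Standardised estimates from the regression analyses.
beta_pain_function = -0.455    # pain -> function (equals the correlation r12)
beta_pain_depress = 0.184      # pain -> depress, function held constant
beta_function_depress = -0.337 # function -> depress, pain held constant

# One predictor: R^2 is the squared standardised coefficient.
r2_function = beta_pain_function ** 2

# Two correlated predictors: add the cross term 2 * b1 * b2 * r12.
r2_depress = (beta_pain_depress ** 2
              + beta_function_depress ** 2
              + 2 * beta_pain_depress * beta_function_depress * beta_pain_function)

print(f"R^2 function = {r2_function:.2f}, R^2 depress = {r2_depress:.2f}")
```

Rounded to two decimals, these values match the .21 and .20 shown in the standardised AMOS diagram.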
[Path diagram: pain, function and depress, with the error terms Error1 and Error2 (mean 0) connected by a covariance]
Correlated errors (instrumental variable, 2-stage regression)
Exercise
• Do a similar analysis with the data file PATH-INGRAM.sav.
• The data are from: Ingram, K. L., Cope, J. G., Harju, B. L., & Wuensch, K. L. (2000). Applying to graduate school: A test of the theory of planned behavior. Journal of Social Behavior and Personality, 15, 215-226.
• Ajzen’s theory of planned behavior was used to predict students’ intentions and application behaviour (to graduate school) from their attitudes, subjective norms and perceived behavioural control.
Five Variables (derived from questionnaires)
• Perceived Behavioural Control (PBC)
• Subjective norm
• Attitude
• Intention (to apply to college)
• Behaviour (applications)
Ajzen’s theoretical model
• PBC, subjective norm and attitude influence intention
• PBC, subjective norm and attitude correlate with each other
• Intention influences Behaviour
• PBC also influences Behaviour
Exercise
• Draw the path diagram for the model
• Conduct a path analysis with a series of multiple regression analyses using SPSS.
• Calculate the standardised indirect effects using the standardised estimates from the regression analysis.
• Check your results using AMOS
• Use a bootstrap analysis to evaluate the indirect effect.
• Remove some indirect effects and compare the results with the theoretical model.
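Path diagram for AMOS
[Path diagram: Perceived Behavior Control, Attitude and Subjective Norm as predictors of Intention; Intention and Perceived Behavior Control as predictors of Behavior; error terms e1 and e2 on Intention and Behavior]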
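Results
[Path diagram with standardised estimates for Perceived Behavior Control, Attitude, Subjective Norm, Intention and Behavior: squared multiple correlations .60 (Intention) and .34 (Behavior); path coefficients -.13, .09, .81, .35 and .34; correlations among the three predictors .51, .47 and .67]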
Main results: AMOS output

Standardized Regression Weights:
Parameter              Estimate   Lower   Upper   P
Intent <--- PBC        -.126      -.352   .126    .293
Intent <--- SubNorm    .095       -.118   .314    .430
Intent <--- Attitude   .807       .596    .985    .002
Behavior <--- Intent   .350       .075    .548    .013
Behavior <--- PBC      .336       .092    .555    .005

Standardized Indirect Effects:
           Attitude     SubNorm        PBC            Intent
Behavior   .282         .033           -.044          .000
95% CI     (0.08-0.5)   (-0.04-0.13)   (-0.14-0.04)
P          0.007        0.339          0.277
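The standardised indirect effects in the AMOS output can be checked by hand: each one is the product of the path into Intention and the path from Intention to Behavior. The sketch below uses the standardised regression weights from the output; the dictionary layout and variable names are illustrative assumptions.

```python
# Standardised regression weights from the AMOS output.
to_intent = {"Attitude": 0.807, "SubNorm": 0.095, "PBC": -0.126}
intent_to_behavior = 0.350
pbc_to_behavior = 0.336  # direct path PBC -> Behavior

# Indirect effect on Behavior via Intention = product of the two paths.
indirect = {name: weight * intent_to_behavior for name, weight in to_intent.items()}

# PBC is the only predictor with both a direct and an indirect path to Behavior.
total_pbc = pbc_to_behavior + indirect["PBC"]

for name, value in indirect.items():
    print(f"{name:8s} -> Intent -> Behavior: {value:.3f}")
print(f"total standardised effect of PBC on Behavior: {total_pbc:.3f}")
```

The products reproduce the reported values .282, .033 and -.044.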
Quick introduction: How to define latent variables in AMOS:
Example: simple factor analysis:
• Can we reduce the three variables to one factor (latent variable) without losing too much information?
• = Factor analysis with maximum likelihood estimation
Latent variable in AMOS
• Use the button on the top left (#3) to create a latent variable
• Move the variables into the item boxes
• Name the error variables e1, e2, e3
• Name the latent variable “well-being”
• Tick “standardized estimates”, “squared multiple correlations” and “factor score weights” in the analysis properties box
[Path diagram: latent variable Well-being measured by pain, depress and function, with error terms e1, e2 and e3]
Factor analysis SPSS:
Communalities (Extraction Method: Maximum Likelihood)
           Initial   Extraction
pain       .233      .365
depress    .204      .312
function   .288      .568

Factor Matrix (Extraction Method: Maximum Likelihood; 1 factor extracted, 4 iterations required)
           Factor 1
pain       .604
depress    .558
function   -.754

SEM analysis AMOS:
[Path diagram, standardised estimates: Well-being → pain .60, Well-being → depress .56, Well-being → function -.75; squared multiple correlations .36 (pain), .31 (depress), .57 (function); error terms e1, e2, e3]
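The link between the two outputs can be checked numerically. In a one-factor model with standardised variables, each communality (AMOS: squared multiple correlation) is the squared loading, and the model-implied correlation between two items is the product of their loadings. The sketch below uses the loadings from the SPSS factor matrix; the names are assumptions.

```python
from itertools import combinations

# Loadings from the SPSS factor matrix (maximum likelihood, one factor).
loadings = {"pain": 0.604, "depress": 0.558, "function": -0.754}

# Communality = squared loading = variance explained by the latent variable.
communalities = {item: value ** 2 for item, value in loadings.items()}

# Model-implied correlation between two items = product of their loadings.
implied_corr = {
    (first, second): loadings[first] * loadings[second]
    for first, second in combinations(loadings, 2)
}

for item, value in communalities.items():
    print(f"communality {item:8s}: {value:.3f}")
for (first, second), value in implied_corr.items():
    print(f"implied r({first}, {second}) = {value:.3f}")
```

Up to rounding of the loadings, the communalities match the “Extraction” column, and the implied correlations match the standardised coefficients of the simple regressions in the pain example (0.337, -0.455, -0.421).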
Some questions for next week
• Why do we not always get a statistical test for the overall model?
• Why should the test be non-significant?
• How can we compare two models?
• Are there any assumptions for the SEM analysis?
• If yes, how can we check them?
• What are the functions of variances and covariances in SEM?