Direct and Indirect Effects


Introduction to Path Analysis and Structural Equation Modelling with AMOS

Daniel Stahl

Biostatistics and Computing

Outline

Course:
– SEM and path analysis
– Using AMOS to do path analysis and SEM
– Model specification, identification, and estimation
– Evaluating model fit
– Interpreting parameter estimates
– SEM and causality

Today:
– What is SEM?
– Relationship between correlation, regression, path analysis and SEM
– Basic concepts of path analysis and SEM
– Unobservable traits
– Introduction to AMOS
– Simple analyses with AMOS

Books

• Barbara M. Byrne (2001) Structural Equation Modeling with AMOS

• Randall E. Schumacker and Richard G. Lomax (2004) A Beginner's Guide to Structural Equation Modeling. (presents AMOS examples)

• Rex B. Kline (2004) Principles and Practice of Structural Equation Modeling, 2nd ed.

• Bill Shipley (2004) Cause and Correlation in Biology: A User's Guide to Path Analysis, Structural Equations and Causal Inference.

• James L. Arbuckle (2007) Amos™ 7.0 User’s Guide.

We collected the following variables from 40 cancer patients:
– Body function
– Pain
– Depression

We are interested in the influence of pain and functioning on depression.

What kind of analyses could we do?

[Figures: path diagrams relating Depression, Pain and Function for each type of analysis:
– Correlations
– Simple linear regression
– Multiple linear regression
– Mediation (Path analysis)
– Path analysis: competing models (Model 1 and Model 2)]

• Let’s assume that there is no test available for measuring “pain”. We developed a small questionnaire with three questions.

• How could we integrate the answers of the questionnaires in our analysis?

• (Hint: Pain is a latent construct, which we would like to measure with our questionnaire.)

Factor analysis: Latent construct "pain"

[Figure: latent variable "Pain" measured by the three questionnaire items Q 1, Q 2 and Q 3]

We could do a factor analysis and use the factor scores as an estimate of “pain” in the same way as before.

• Now we measured all three variables with a questionnaire with 3 items:

[Figure: Pain, Function and Depression as latent variables, each measured by three items (Item 1 to Item 3), each item with its own error term (Error 1 to Error 3)]

Path analysis with latent constructs = Structural Equation Modelling

Correlation and Regression

• Correlation describes the linear association between two variables.

• Regression describes the effect of one or more independent (predictor) variables on a dependent variable: Depression = c + β1*pain + β2*function + error, with error ~ N(0, σ²)


Pearson’s r=0.68

Depression = 11.86+1.072*Pain + Error (0,32)

Standardised coefficient for pain: 0.678

Coefficients (Dependent Variable: Depression)

Model 1       Unstandardized Coefficients    Standardized Coefficients
              B         Std. Error           Beta                        t        Sig.
(Constant)    11.860    1.337                                            8.870    .000
Pain          1.072     .248                 .678                        4.331    .000
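The output above shows that, with a single predictor, the standardised coefficient (0.678) equals Pearson's r (0.68). A minimal sketch of why, using synthetic data (the numbers below are made up and are not the course data):

```python
import numpy as np

rng = np.random.default_rng(0)               # synthetic data, NOT the course data
pain = rng.normal(5.0, 2.0, size=40)
depression = 12.0 + 1.0 * pain + rng.normal(0.0, 2.0, size=40)

# Unstandardised slope (B) and intercept by ordinary least squares.
slope, intercept = np.polyfit(pain, depression, deg=1)

# Standardised slope: rescale B by the ratio of the standard deviations.
beta_std = slope * pain.std(ddof=1) / depression.std(ddof=1)

# Pearson correlation between predictor and outcome.
r = np.corrcoef(pain, depression)[0, 1]

print(f"B = {slope:.3f}, standardised beta = {beta_std:.3f}, r = {r:.3f}")
```

In a simple linear regression the standardised beta and r always agree; with two or more correlated predictors they generally differ.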



Error also influences our outcome variable “Depression”

[Figure: multiple linear regression of Depression on Pain and Function, with an Error term pointing at Depression]

→ We have to add an error term to our dependent variable.

Path analysis

• Technique to examine the causal relationships between two or more variables.

• Path analysis assesses the direct and indirect (mediating) relationships among a set of variables:
– X → Z: direct effect of X on Z
– X → Y → Z: indirect effect of X on Z via Y
– Total effect of X on Z = direct + indirect effect

• Regression is a special case of path analysis. It only studies the direct effects of one or more independent variables on (usually) one dependent variable.
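The decomposition "total = direct + indirect" can be checked with ordinary regressions. The sketch below uses simulated data with hypothetical path coefficients; the helper `ols` is an illustrative name, not part of SPSS or AMOS.

```python
import numpy as np

def ols(y, *predictors):
    """Return OLS coefficients [intercept, slope_1, slope_2, ...]."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(1)               # simulated data, hypothetical effects
n = 200
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)             # mediator Y depends on X
z = 0.3 * x + 0.4 * y + rng.normal(size=n)   # outcome Z depends on X and Y

a = ols(y, x)[1]                 # X -> Y
direct, b = ols(z, x, y)[1:]     # X -> Z (Y held constant) and Y -> Z
total = ols(z, x)[1]             # X -> Z from the simple regression
indirect = a * b                 # X -> Y -> Z

print(f"direct {direct:.3f} + indirect {indirect:.3f} = {direct + indirect:.3f}")
print(f"total effect from simple regression: {total:.3f}")
```

For least-squares estimates on the same sample the two printed totals agree exactly, not just approximately.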


Structural equation models

• Structural equation modelling additionally allows us to study the effect of unmeasured latent variables.

• Latent variables cannot be observed and must be inferred from measured variables.

• Latent variables are implied by the covariance among two or more measured variables (→ factor analysis).

• SEMs are therefore also called covariance structure models.
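To illustrate how a latent variable is "implied by the covariance" of its indicators, the sketch below builds the covariance matrix implied by a one-factor model. The loadings are hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical standardised loadings of three items on one latent variable.
loadings = np.array([0.8, 0.7, 0.6])
factor_variance = 1.0

# With standardised items, the error variance is what the factor leaves unexplained.
error_variances = 1.0 - loadings**2

# Model-implied covariance: (loadings x loadings') * factor variance + error variances.
implied_cov = factor_variance * np.outer(loadings, loadings) + np.diag(error_variances)

print(np.round(implied_cov, 2))
```

Each off-diagonal entry is the product of two loadings (e.g. 0.8 * 0.7 = 0.56): under this model the items correlate only because they share the latent variable.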

SEM

• SEM consists of two parts: a measurement model and a structural model.

• The structural model deals with the relationship between the latent variables while the measurement model describes the relationship between our measured variables and the latent variables

• For example: Relationship between the measurement model and the structural model relating pain and function to depression:

[Figure: the measurement model links each latent variable (Pain, Function, Depression) to its three items (Item 1 to Item 3, each with an error term Error 1 to Error 3); the structural model links the latent variables Pain and Function to Depression]

Path analysis and SEM

• In path analysis we assume that each latent variable is perfectly measured by one observed variable (perfect correlation = no measurement error).

• Path analysis can be regarded as an SEM in which each latent variable is inferred from one measured variable (e.g. temperature change can be seen as a latent variable measured as the change in a mercury column).

• SEM ≈ combination of path and factor analysis

Family Tree of SEM

[Figure: family tree of SEM with the nodes Bivariate Correlation, Multiple Regression, Path Analysis, Factor Analysis, Exploratory Factor Analysis, Confirmatory Factor Analysis and Structural Equation Modeling]

Unobservable traits

• In psychology and health sciences we are often concerned with questions which are more subjective than questions in other fields of science.

• These include measurements of abilities, knowledge, emotions, feelings, attitudes or personality traits.

• All these traits have in common that they are unobservable: unobservable traits = latent traits.


Unobservable, latent traits

• A drug may prolong the life of a patient or cure a symptom, but it may also affect general well-being.

• While prolonging life is rather easy to define, it is not easy to define “well-being”.

• And different people may have different definitions.

• The field of psychometrics is concerned with the theory and technique of measurement of such psychological and mental phenomena.

Latent traits in psychology

• Intelligence
• Memory
• Extraversion
• Self-esteem
• Depression
• Anxiety
• Knowledge
• Beliefs
• Feelings and emotions: joy, sadness
• Senses and perception: smell of a flower
• Attitude about something, e.g. foreigners, risk
• Motivation
• Ability to learn statistics or a new language

Latent traits in medical research

• Pain
• Mental disorders
– Depression
– Schizophrenia
– Autism
• Mobility/Function (gerontology)
• Arthritis
• Quality of life
• Patient satisfaction (e.g. in hospital)

Latent vs. observed variables

• An observed variable, like body height, is directly observable and can be measured easily.

• A latent variable (trait, construct) is not directly observable. Instead, it is inferred from variables (items) that can be observed.

• The main approach of psychometric measurement is to apply interviews, questionnaires and tests (= instruments).

Latent trait and items

Self Esteem

[Figure: the latent trait “Self Esteem” pointing at five items, each answered on a scale from 1 to 5:]
– “I can feel that my co-workers respect me”
– “I feel that I am making a useful contribution to work”
– “I feel good about my work”
– “On the whole I get along with others well”
– “I am proud of my relationship with my supervisor”

Here, the latent trait “Self esteem” elicits a response to each item from 1 (strongly disagree) to 5 (strongly agree). The sum of the observed responses allows a conclusion about the person’s self esteem.

Item and latent traits

• All 5 items are measuring the latent trait “self esteem”.
• They should therefore correlate with the latent trait.
• A single item will never measure a construct perfectly (and hence will never correlate perfectly),
• but the 5 items should be an accurate predictor of the latent trait.
• SEM in the form of factor analysis is an important tool to develop such tests.


Confirmatory factor analysis model

[Figure: latent variable Self Esteem measured by Item 1 (work), Item 2 (supervisor), Item 3 (other people), Item 4 (coworker) and Item 5 (contribution), with error terms e1 to e5]

SEM: Longitudinal CFA

[Figure: latent variables Well-being Time 1 and Well-being Time 2, each measured by pain, depress and function with error terms e1 to e3]

But SEM can be extended: it allows us to include more latent and observable variables in the analysis:

[Figure: latent variable Self Esteem (Items 1 to 5 with error terms e1 to e5), latent variable Depression (Items 1 to 4 with error terms e1 to e4, plus its own error term e) and the observed variable Age]

Longitudinal data analysis using SEM

• A common approach to the analysis of longitudinal data is multilevel modelling

• but we can also use the structural equation modeling (SEM) framework to form what are known as “latent curve” or “latent trajectory” models.

• Given this SEM framework, latent trajectory analysis is extremely flexible in terms of the variety of potential hypotheses that can be tested.

• Rovine & Molenaar (2001) demonstrated the mathematical equivalence of MLM and SEM with balanced data.

• SEM latent growth curve approach is more flexible.
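A latent growth curve model describes the repeated measures through two latent variables, an intercept and a slope, with fixed loadings. The sketch below shows the means and covariances such a model implies; the time codes, factor means, factor covariances and error variance are all hypothetical values.

```python
import numpy as np

# Hypothetical time codes for 4 measurement occasions.
time_codes = np.array([0.0, 1.0, 2.0, 3.0])

# Fixed loadings: a column of 1s for the intercept, the time codes for the slope.
Lambda = np.column_stack([np.ones(4), time_codes])

factor_means = np.array([10.0, 1.5])          # hypothetical mean intercept and mean slope
factor_cov = np.array([[4.0, 0.5],            # hypothetical variances and covariance of
                       [0.5, 1.0]])           # the random intercept and random slope
error_variance = 2.0                          # hypothetical, equal at all occasions

implied_means = Lambda @ factor_means
implied_cov = Lambda @ factor_cov @ Lambda.T + error_variance * np.eye(4)

print(implied_means)
print(implied_cov)
```

The implied means form a straight line over time, while the implied variances grow with time because individuals differ in their slopes.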

[Figure: Amos setup of a simple growth curve model with random slope and intercept with 4 time points: latent variables ICEPT and SLOPE, observed variables X1 to X4 with error terms E1 to E4]

[Figure: growth curve model with random slope and intercept with correlated errors: observed variables y1 to y4 (labels mg1 to mg4) with error terms e1 to e4 (mean 0, variances g1 to g4), error covariances g12, g13, g14, g23, g24 and g34, and latent variables ICEPT and Slope]


Literature

• Terry E. Duncan, Susan C. Duncan, Lisa A. Strycker (2006) An Introduction to Latent Variable Growth Curve Modeling: Concepts, Issues, and Applications.

• I will start with regressions and simple path analyses without latent variables.

Exercise

Use SPSS and open the data file “pain.sav”. Do the following analysis:
• Correlation matrix of all four variables
• Simple linear regressions between:
– Function → Depression
– Pain → Depression
– Pain → Function
• Multiple regression:
– Pain + Function → Depression
• Plot a path analysis diagram for standardised and unstandardised estimates
• Calculate the direct, indirect and total effects of pain
• Use AMOS to do the same analysis
• Use AMOS to evaluate the indirect (mediation) effect

SPSS: Results of regression analysis

Model                          Predictor   Parameter estimate   Standardised parameter   p
Pain + Function → Depression   Pain         0.061                0.184                   0.028
                               Function    -0.523               -0.337                   <0.001
Function → Depression          Function    -0.652               -0.421                   <0.001
Pain → Depression              Pain         0.112                0.337                   <0.001
Pain → Function                Pain        -0.097               -0.455                   <0.001

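Standardised estimates

[Figure: Model 2 with standardised estimates: pain → function -.46; function → depress -.34; pain → depress .18; with the error terms error and error2]

Result: unstandardised estimates

[Figure: Model 2 with unstandardised estimates: pain → function -.10; function → depress -.52; pain → depress .06; variance of pain 4.08; error variances .15 (function) and .36 (depress)]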




Direct and indirect effects

Indirect effect of pain on depression

• The direct effect of pain on depression (that is, if we keep function constant) is 0.061. But pain also negatively affects body function. Therefore, pain has an indirect effect on depression via function.

• For each 1 unit increase in pain, body function decreases by 0.097. A decrease of 0.097 units of body function increases depression by 0.097*0.523 = 0.051.

• Therefore, the indirect effect of pain on depression is 0.051.

• The total effect of pain on depression is the sum of the direct and indirect effects: 0.061+0.051 = 0.112. This is the same effect as in the simple linear regression between pain and depression.

• A commonly used test for statistical significance of the indirect effect (=mediation effect) is the Sobel test.
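A minimal sketch of the Sobel test follows. The path coefficients are taken from the regression results of this example, but the two standard errors are hypothetical placeholders, because the slides do not report them; `sobel_test` is an illustrative helper name.

```python
import math

def sobel_test(a, se_a, b, se_b):
    """Sobel z statistic and two-sided p-value for the indirect effect a*b."""
    se_ab = math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)   # first-order standard error
    z = (a * b) / se_ab
    p = math.erfc(abs(z) / math.sqrt(2))                 # two-sided normal p-value
    return z, p

# a: pain -> function, b: function -> depression (pain held constant).
# se_a and se_b are HYPOTHETICAL values, used only to make the sketch runnable.
z, p = sobel_test(a=-0.097, se_a=0.02, b=-0.523, se_b=0.15)

print(f"z = {z:.2f}, p = {p:.4f}")
```

The test treats the product a*b as normally distributed, which is only approximately true in small samples.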

Direct, indirect and total effects

The total effect of pain on depression is 0.112.

The direct effect of pain on depression is 0.061.

The indirect effect is -0.097*(-0.523)= 0.051.

Control: Total=0.061+0.051=0.112.

The total standardised effect of pain on depression is 0.337.

The direct standardised effect of pain on depression is 0.184.

The indirect standardised effect is -0.455*(-0.337) = 0.153.

Control: Total = 0.184+0.153 = 0.337.
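The arithmetic on this slide can be re-checked in a few lines, using only the regression coefficients reported for this example:

```python
# Unstandardised estimates from the regression results.
a = -0.097          # pain -> function
b = -0.523          # function -> depression (pain held constant)
direct = 0.061      # pain -> depression (function held constant)

indirect = a * b
total = direct + indirect
print(f"unstandardised: indirect = {indirect:.3f}, total = {total:.3f}")

# Standardised estimates from the same regressions.
a_std, b_std, direct_std = -0.455, -0.337, 0.184

indirect_std = a_std * b_std
total_std = direct_std + indirect_std
print(f"standardised:   indirect = {indirect_std:.3f}, total = {total_std:.3f}")
```

Both totals reproduce the coefficients of the simple regression pain → depression (0.112 and 0.337).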




AMOS

1. Observed variables

2. Unobserved variables

3. Drawing latent variable (draws latent variable and items)

4. Drawing path (causal relationship → regression)

5. Draw covariances (correlation, no direction)

6. Unique variable (error variable, add e.g. to each dependent var

7. List variables (open data file first, then drag and drop variabl

8. Select one object, select all, deselect

9. Move object

10. Delete

11. Select data file

12. Analysis properties (choose statistics)

13. Calculate estimates (starts the analysis)

14. View text (see results)

15. Copy graph to clipboard

16. Save

16.Save

[Figure: AMOS toolbar with the buttons numbered 1 to 16 as in the list above]


Main steps:

1. Draw path diagram

2. Move data into appropriate box

3. Name variable (here only error variable): right-click object

4. Analysis properties: Select statistics

5. Calculate estimates: Run analysis

6. View text: View results

Correlation and Regression as AMOS path models

[Figure: AMOS path diagrams for a correlation (depress, pain), a simple linear regression (depress, pain, error term for depression) and a multiple linear regression (depress, pain, function, error term)]

Endogenous variables have error variables pointing at them (= unexplained variance).

Simple mediation analysis with AMOS

[Figure: two mediation (path analysis) models, Model 1 and Model 2, each relating pain, function and depress, with the error terms Error 1 and Error 2]

Exogenous and endogenous variables

• In SEM it is difficult to apply the concept of independent and dependent variables.
• Function is a dependent variable in one regression analysis (pain → function) but an independent variable in another (pain + function → depression).
• Therefore, we distinguish between exogenous and endogenous variables:
• Exogenous variables are independent variables with no prior causal variable (= no single-headed arrow pointing at them), although they can be correlated (double-headed arrow) with other variables.
• Endogenous: all other variables (at least one single-headed arrow is directed at them).

• Latent variables can be exogenous variables.
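The rule above can be applied mechanically: a variable is endogenous if at least one single-headed arrow points at it. A small sketch for the mediation model of this example (pain → function, pain → depress, function → depress):

```python
# Each single-headed arrow is written as (cause, effect).
paths = [
    ("pain", "function"),
    ("pain", "depress"),
    ("function", "depress"),
]

variables = {name for path in paths for name in path}
endogenous = {effect for _, effect in paths}     # at least one arrow points at them
exogenous = variables - endogenous               # no arrow points at them

print("exogenous: ", sorted(exogenous))
print("endogenous:", sorted(endogenous))
```

Function shows up as a cause in one path and as an effect in another, which is why it is endogenous even though it also acts as a predictor.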

[Figure: Model 2 with pain labelled as the exogenous variable and function and depress labelled as endogenous variables]

SEM diagram symbols

[Figure: legend of diagram symbols for:]
– Observed variable
– Observed variable with measurement error (endogenous variable)
– Latent variable with items (observed variables)
– Latent variable with disturbance or error
– Unidirectional path (“regression”)
– Correlation between variables
– Reciprocal relation between variables

SEM: observed and unobserved variables

• Observed variables:
– Indicator variables
– Manifest variables
– Reference variables

• Unobserved variables:
– Latent variables
– Latent constructs
– Latent factors


• How to test the mediation effect (pain → function → depression)?
• (Sobel test)
• → Alternative: Bootstrapping allows us to test the indirect effect and to estimate confidence intervals for it
• more power, more robust (does not require distributional assumptions; works even if the specified model is wrong, except for violations of independence).
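The idea of bootstrapping the indirect effect can be sketched as follows. The data are simulated with hypothetical coefficients, and a plain percentile interval is used as a simple stand-in; this is not a reproduction of the AMOS procedure or of its output.

```python
import numpy as np

def indirect_effect(x, m, y):
    """Indirect effect a*b: a from m ~ x, b from y ~ x + m (both by OLS)."""
    a = np.polyfit(x, m, 1)[0]
    X = np.column_stack([np.ones(len(x)), x, m])
    b = np.linalg.lstsq(X, y, rcond=None)[0][2]
    return a * b

rng = np.random.default_rng(42)              # simulated data, hypothetical effects
n = 100
pain = rng.normal(5.0, 2.0, n)
function = 3.0 - 0.1 * pain + rng.normal(0.0, 0.4, n)
depress = 2.5 + 0.06 * pain - 0.5 * function + rng.normal(0.0, 0.6, n)

estimate = indirect_effect(pain, function, depress)

# Resample cases with replacement and re-estimate the indirect effect each time.
boot = np.empty(2000)
for i in range(boot.size):
    idx = rng.integers(0, n, size=n)
    boot[i] = indirect_effect(pain[idx], function[idx], depress[idx])

lower, upper = np.percentile(boot, [2.5, 97.5])
print(f"indirect effect = {estimate:.3f}, 95% percentile CI: {lower:.3f} to {upper:.3f}")
```

If the interval excludes 0, the indirect effect is regarded as statistically significant at the 5% level.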

How to find bootstrap results in AMOS:

Lower Bootstrap CI: 0.019

Upper Bootstrap CI: 0.096

Bootstrap Test (p=0.002)

Indirect effect of pain on depression is: 0.051 (95 % CI: 0.019-0.096)

Results


Results: unstandardised estimates

[Figure: Model 2 with unstandardised estimates: pain → function -.10; function → depress -.52; pain → depress .06; pain with mean 4.93 and variance 4.08; intercepts 3.10 (function) and 2.61 (depress); error terms with mean 0 and variances .15 (error2, function) and .36 (error, depress)]

• Path coefficients: unstandardised estimates (same as the regression coefficients of the regression analyses pain → function and pain + function → depression).
• 4.93, 4.08: mean and variance of the exogenous variable = descriptive mean and variance of “pain”.
• 0, .36 and 0, .15: error variances = mean squared error (MSE) in the regression analyses (pain → function and pain + function → depression).
• 3.10 and 2.61: estimates of intercepts for predicting endogenous variables = intercepts in the regression analyses (pain → function and pain + function → depression).

Result: standardised estimates

[Figure: Model 2 with standardised estimates: pain → function -.46; function → depress -.34; pain → depress .18; squared multiple correlations .21 (function) and .20 (depress)]

• Standardised estimates (same as the standardised regression coefficients of the regression analyses pain → function and pain + function → depression).
• Estimates of squared multiple correlations = explained variance of the two regression models (pain → function and pain + function → depression).
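The two squared multiple correlations (.21 and .20) can be recomputed from the standardised estimates alone. A small check, using only the standardised coefficients reported for this example:

```python
# Standardised estimates from the path model.
r_pain_function = -0.455     # pain -> function (equals their correlation)
beta_pain = 0.184            # pain -> depress
beta_function = -0.337       # function -> depress

# One predictor: R^2 is the squared standardised coefficient.
r2_function = r_pain_function**2

# Two correlated predictors: R^2 = b1^2 + b2^2 + 2*b1*b2*r12.
r2_depress = (beta_pain**2 + beta_function**2
              + 2 * beta_pain * beta_function * r_pain_function)

print(f"R^2 function = {r2_function:.2f}, R^2 depress = {r2_depress:.2f}")
```

The cross-product term is positive here because both the product of the two coefficients and the correlation between pain and function are negative.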


Correlated Errors (Instrumental variable, 2-Stage Regression)

[Figure: path model of pain, function and depress in which the error terms Error1 and Error2 are correlated]

Exercise

• Do a similar analysis with the data file PATH-INGRAM.sav.
• The data are from: Ingram, K. L., Cope, J. G., Harju, B. L., & Wuensch, K. L. (2000). Applying to graduate school: A test of the theory of planned behavior. Journal of Social Behavior and Personality, 15, 215-226.

• Ajzen’s theory of planned behavior was used to predict students’ intentions and application behaviour (to graduate school) from their attitudes, subjective norms and perceived behavioural control.

Five Variables (derived from questionnaires)

• Perceived Behavioural Control (PBC)
• Subjective norm
• Attitude
• Intention (to apply to college)
• Behaviour (applications)

Ajzen’s theoretical model

• PBC, subjective norm and attitude influence intention

• PBC, subjective norm and attitude correlate with each other

• Intention influences Behaviour
• PBC also influences Behaviour

Exercise

• Draw the path diagram for the model
• Conduct a path analysis with a series of multiple regression analyses using SPSS.
• Calculate the standardised indirect effects using the standardised estimates from the regression analysis.
• Check your results using AMOS
• Use a bootstrap analysis to evaluate the indirect effect.
• Remove some indirect effects and compare the results with the theoretical model.


Path diagram for AMOS

[Figure: path diagram with Perceived Behavior Control, Attitude, Subjective Norm, Intention and Behavior, and the error terms e1 and e2]

Results

[Figure: the same path diagram with standardised estimates]

Main results: AMOS output

Standardized Indirect Effects

            Attitude      SubNorm        PBC            Intent
Behavior    .282          .033           -.044          .000
95% CI      (0.08-0.5)    (-0.04-0.13)   (-0.14-0.04)
P           0.007         0.339          0.277

Standardized Regression Weights:

Parameter               Estimate   Lower   Upper   P
Intent <--- PBC         -.126      -.352   .126    .293
Intent <--- SubNorm     .095       -.118   .314    .430
Intent <--- Attitude    .807       .596    .985    .002
Behavior <--- Intent    .350       .075    .548    .013
Behavior <--- PBC       .336       .092    .555    .005

Quick introduction: How to define latent variables in AMOS:

Example: simple factor analysis:
• Can we reduce the three variables into one factor (latent variable) without losing too much information?
• = Factor analysis with maximum likelihood estimation

Latent variable in AMOS
• Use the button on the top left (#3) to create a latent variable
• Move variables into the item boxes
• Name the error variables e1, e2, e3
• Name the latent variable “well-being”
• Tick “standardized estimates”, “squared multiple correlations” and “factor score weights” in the analysis properties box

[Figure: latent variable Well-being measured by pain, depress and function, with error terms e1, e2 and e3]


Factor analysis SPSS:

Communalities (Extraction Method: Maximum Likelihood)

            Initial   Extraction
pain        .233      .365
depress     .204      .312
function    .288      .568

Factor Matrix (Extraction Method: Maximum Likelihood; 1 factor extracted, 4 iterations required)

            Factor 1
pain        .604
depress     .558
function    -.754

SEM analysis AMOS:

[Figure: latent variable Well-being with standardised loadings .60 (pain), .56 (depress) and -.75 (function) and squared multiple correlations .36 (pain), .31 (depress) and .57 (function)]
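The SPSS and AMOS outputs agree because, with a single factor, the extraction communality of an item (SPSS) and its squared multiple correlation (AMOS) both equal the squared standardised loading. A quick check with the reported loadings:

```python
# Standardised loadings from the SPSS factor matrix.
loadings = {"pain": 0.604, "depress": 0.558, "function": -0.754}

# Extraction communalities reported by SPSS.
reported = {"pain": 0.365, "depress": 0.312, "function": 0.568}

# In a one-factor model, communality = squared loading.
communalities = {name: loading**2 for name, loading in loadings.items()}

for name in loadings:
    print(f"{name:9s} loading^2 = {communalities[name]:.3f}, reported = {reported[name]:.3f}")
```

Differences in the third decimal place arise because the loadings themselves are rounded to three decimals.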

Some questions for next week

• Why do we not always get a statistical test for the overall model?
• Why should the test be non-significant?
• How can we compare two models?
• Are there any assumptions for the SEM analysis?
• If yes, how can we check them?
• What are the functions of variances and covariances in SEM?
