Confirmatory Factor Analysis Rick DeShon 2008

Confirmatory Factor Analysis 818 - Lecture 0…


Page 1: Confirmatory Factor Analysis

Confirmatory Factor Analysis Rick DeShon

2008

Page 2: Confirmatory Factor Analysis

Overview

Who am I?

Course purpose and focus: understand principles of modern measurement; focus on modelling and inference

Topic coverage

Course website:

www.msu.edu/course/psy/818/deshon

Open time for general questions

Page 3: Confirmatory Factor Analysis

Measurement!

Most constructs in the social sciences are fuzzy, ill-defined, and unobservable

How do you measure something you don't fully understand and can't see?

Success in the physical sciences gives us hope and confidence

Ex: Temperature. Ex: pH.

Page 4: Confirmatory Factor Analysis

Descriptive Statistics

Summation notation: $\sum_{i=1}^{n} x_i$

Mean:

$$\mu = \frac{\sum x_i}{N}; \qquad \bar{x} = \frac{\sum x_i}{n}$$

Variance:

$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}; \qquad s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$

Covariance:

$$\sigma_{xy} = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{N}; \qquad s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$

Page 5: Confirmatory Factor Analysis

Example - Univariate

person      x   x.dev  x.dev.sq
     1  18.32    8.04     64.61
     2   5.55   -4.74     22.46
     3   9.55   -0.74      0.55
     4   9.02   -1.26      1.59
     5  11.55    1.26      1.60
     6   1.02   -9.27     85.86
     7  13.68    3.40     11.55
     8  22.57   12.28    150.92
     9  -1.00  -11.28    127.32
    10  12.59    2.30      5.31
Sum    102.85    0.00    471.76

Mean = 10.29; Variance = 52.42; Standard Deviation = 7.24

[Histogram of x: frequency vs. x, range -5 to 25]
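The univariate summaries above can be reproduced in a few lines; a minimal sketch with NumPy, using the ten x scores from the slide's table:

```python
import numpy as np

# The ten x scores from the slide's univariate example
x = np.array([18.32, 5.55, 9.55, 9.02, 11.55,
              1.02, 13.68, 22.57, -1.00, 12.59])

mean = x.sum() / x.size                # sample mean, approx. 10.29
dev = x - mean                         # deviation scores (x.dev)
ss = (dev ** 2).sum()                  # sum of squared deviations, approx. 471.7
var = ss / (x.size - 1)                # sample variance (n - 1 denominator), approx. 52.4
sd = np.sqrt(var)                      # standard deviation, approx. 7.24
```

Note that the slide's table sums deviations computed from a rounded mean, so its sum of squares (471.76) differs from the exact value in the third decimal.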

Page 6: Confirmatory Factor Analysis

Example - Bivariate

person      x       y   x.dev   y.dev  xy.CrossProd
     1  18.32   32.63    8.04   17.82        143.25
     2   5.55   -4.15   -4.74  -18.97         89.88
     3   9.55   22.26   -0.74    7.45         -5.51
     4   9.02   -0.98   -1.26  -15.80         19.92
     5  11.55   12.74    1.26   -2.07         -2.62
     6   1.02   23.38   -9.27    8.57        -79.37
     7  13.68   23.13    3.40    8.32         28.27
     8  22.57   28.43   12.28   13.62        167.33
     9  -1.00  -13.47  -11.28  -28.28        319.13
    10  12.59   24.16    2.30    9.35         21.53

Covariance = 77.98

[Scatterplot of y vs. x]

Page 7: Confirmatory Factor Analysis

Variance – Covariance Matrix

$$\Sigma = \begin{bmatrix} \sigma^2_x & \sigma_{xy} \\ \sigma_{yx} & \sigma^2_y \end{bmatrix} \qquad S = \begin{bmatrix} s^2_x & s_{xy} \\ s_{yx} & s^2_y \end{bmatrix} = \begin{bmatrix} 52.42 & 77.98 \\ 77.98 & 244.66 \end{bmatrix}$$

Page 8: Confirmatory Factor Analysis

Correlation

Covariance is unbounded and hard to interpret. What is a big covariance?

Correlation transforms the covariance so that it ranges from -1 to +1:

$$r_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y} = \frac{Cov(xy)}{\sqrt{Var(x)\,Var(y)}}$$

For our example:

$$r_{xy} = \frac{77.98}{\sqrt{52.42 \times 244.66}} = .689$$

Correlation matrix:

$$r = \begin{bmatrix} 1.0 & .69 \\ .69 & 1.0 \end{bmatrix}$$
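The covariance and correlation above can be checked directly; a small sketch with NumPy, using the ten (x, y) pairs from the bivariate example:

```python
import numpy as np

x = np.array([18.32, 5.55, 9.55, 9.02, 11.55,
              1.02, 13.68, 22.57, -1.00, 12.59])
y = np.array([32.63, -4.15, 22.26, -0.98, 12.74,
              23.38, 23.13, 28.43, -13.47, 24.16])

# Sample covariance: average cross-product of deviation scores (n - 1 denominator)
s_xy = ((x - x.mean()) * (y - y.mean())).sum() / (x.size - 1)   # approx. 77.97

# Correlation: the covariance rescaled by the two standard deviations
r_xy = s_xy / (x.std(ddof=1) * y.std(ddof=1))                   # approx. .689

# np.corrcoef builds the full correlation matrix shown on the slide
R = np.corrcoef(x, y)
```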

Page 9: Confirmatory Factor Analysis

Simple Regression

Simple linear regression is the foundation of confirmatory factor analysis (CFA), so let's review it...

[Scatterplot of y vs. x]

Page 10: Confirmatory Factor Analysis

Recall from algebra that the equation for a line is:

Slope = m: rise/run, or the change in y for a unit change in x

Y-intercept = b: the value of y when x = 0

Simple regression applies this notion to bivariate data that are not perfectly related

Simple Regression

$$y = m(x) + b$$

Page 11: Confirmatory Factor Analysis

What is the line that “best” fits or characterizes the relationship between x and y?

Population: $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$

Sample: $y_i = b_0 + b_1 x_i + e_i$

Simple Regression

[Scatterplot of y vs. x with a fitted line]

Person 2: e = (-11.92)

Page 12: Confirmatory Factor Analysis

Intercept and slope are constant across individuals; y, x, and e are random variables that are likely different for each individual

Simple Regression

person      x       y
     1  18.32   32.63
     2   5.55   -4.15
     3   9.55   22.26
     4   9.02   -0.98
     5  11.55   12.74
     6   1.02   23.38
     7  13.68   23.13
     8  22.57   28.43
     9  -1.00  -13.47
    10  12.59   24.16

$i = 1:\; 32.63 = -0.49 + 1.48(18.32) + 5.86$
$i = 2:\; -4.15 = -0.49 + 1.48(5.55) - 11.92$
$i = 10:\; 24.16 = -0.49 + 1.48(12.59) + 5.92$

Page 13: Confirmatory Factor Analysis

The variable means in most social science research are arbitrary. Therefore, we may focus on analyzing data that are in deviation form (i.e., centered)

Simple Regression

person  x.dev   y.dev
     1   8.04   17.82
     2  -4.74  -18.97
     3  -0.74    7.45
     4  -1.26  -15.80
     5   1.26   -2.07
     6  -9.27    8.57
     7   3.40    8.32
     8  12.28   13.62
     9 -11.28  -28.28
    10   2.30    9.35

[Scatterplot of y.dev vs. x.dev]

Page 14: Confirmatory Factor Analysis

When variables are in deviation form:

The mean of each variable = 0.0

The intercept in the regression = 0.0

The regression slope is unchanged

The fit of the model is unchanged

The regression model reduces to:

Simple Regression

$$y_i = \beta_1 x_i + \varepsilon_i$$
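The centering claims above are easy to verify numerically; a hypothetical sketch (simulated data, not the slide's) showing that centering zeroes the intercept while leaving the slope unchanged:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(10.0, 3.0, size=100)
y = 2.0 + 1.5 * x + rng.normal(0.0, 2.0, size=100)   # simulated, for illustration

def ols(x, y):
    """OLS slope and intercept via covariance / variance."""
    slope = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    return slope, y.mean() - slope * x.mean()

b1, b0 = ols(x, y)                          # raw data
c1, c0 = ols(x - x.mean(), y - y.mean())    # deviation (centered) form
# b1 == c1 (slope unchanged); c0 == 0 (intercept vanishes)
```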

Page 15: Confirmatory Factor Analysis

To be consistent with CFA, we'll use slightly different notation to represent the same linear regression model

This relationship or equation may be represented graphically as:

Simple Regression

$$y_i = \lambda x_i + \varepsilon_i$$

[Path diagram: x → y with loading λ; error e with variance $\sigma^2_e$ pointing to y]

λ = lambda

Page 16: Confirmatory Factor Analysis

Estimation: Ordinary Least Squares (OLS)

Assumes:
Expected value of residuals = 0
Homoskedasticity (constant variance) of residuals
Residuals are independent of each other and of x
Normality of residuals

If met, OLS yields an unbiased, minimum variance estimate of the population slope

Simple Regression

$$\lambda = \frac{Cov(xy)}{Var(x)} = \frac{\sigma_{xy}}{\sigma^2_x} = r\,\frac{\sigma_y}{\sigma_x}$$

$$\hat{\lambda} = \frac{77.98}{52.42} = 1.48$$
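The OLS slope above is just the ratio of the covariance to the predictor's variance; a one-line check using the summary statistics from the slides:

```python
s_xy = 77.98          # covariance of x and y (from the slide)
s_xx = 52.42          # variance of x
s_y = 244.66 ** 0.5   # standard deviation of y

lam_hat = s_xy / s_xx                    # slope, approx. 1.49 (slide truncates to 1.48)
r_xy = s_xy / (s_xx ** 0.5 * s_y)        # correlation, approx. .689
alt = r_xy * (s_y / s_xx ** 0.5)         # equivalently, lambda = r * (sd_y / sd_x)
```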

Page 17: Confirmatory Factor Analysis

Estimation

Notice that all the information we need to estimate the regression slope is available in the variance-covariance matrix: the covariance of x and y, and the variance of x

Simple Regression

$$\lambda = \frac{Cov(xy)}{Var(x)} = \frac{\sigma_{xy}}{\sigma^2_x}$$

Page 18: Confirmatory Factor Analysis

Hypothesis Testing

Given the magnitude of the estimate ($\hat{\lambda}$), is it reasonable to conclude that the population parameter ($\lambda$) is different from zero?

Standard error of the regression slope:

$$\sigma_{\hat{\lambda}} = \sqrt{\frac{\sigma^2_\varepsilon}{\sigma^2_x (N - 1)}}$$

$\sigma^2_\varepsilon$ = mean square error; $\sigma_\varepsilon$ = root mean square error

Simple Regression

Page 19: Confirmatory Factor Analysis

Hypothesis Testing

The slope estimate and standard error may be used to form a confidence interval or a significance test

Confidence interval for the regression slope:

$$\hat{\lambda} \pm t\,\sigma_{\hat{\lambda}}$$

Significance test for the regression slope:

$$t = \frac{\hat{\lambda}}{\sigma_{\hat{\lambda}}}$$

Simple Regression

Page 20: Confirmatory Factor Analysis

Example... High school and beyond dataset

200 observations from a sample of high school students

Demographic information: gender (female), socio-economic status (ses), ethnic background (race)

Scores on standardized tests: reading (read), writing (write), mathematics (math), social studies (socst)

Simple Regression

Page 21: Confirmatory Factor Analysis

Example... Let's examine the relationship between science achievement scores and SES (numeric)

Simple Regression

$$S = \begin{array}{c|cc} & SES.c & SCI.c \\ \hline SES.c & 0.525 & 2.028 \\ SCI.c & 2.028 & 98.028 \end{array} \qquad r = \begin{array}{c|cc} & SES.c & SCI.c \\ \hline SES.c & 1.0 & 0.283 \\ SCI.c & 0.283 & 1.0 \end{array}$$

[Scatterplot of SCI.c vs. SES.c]

Page 22: Confirmatory Factor Analysis

Example... Let's regress science achievement scores onto SES (numeric):

$$SCI.c_i = \lambda(SES.c_i) + \varepsilon_i$$

$$\hat{\lambda} = 3.867; \qquad \sigma_{\hat{\lambda}} = 0.929; \qquad \sigma_\varepsilon = 9.497$$

$$t = \frac{3.867}{0.929} = 4.16^*$$

95% CI: $t(\alpha = .05) = 1.978$

$$3.867 \pm 1.978 \times 0.929 = 2.029 \text{ to } 5.704$$

Simple Regression
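The t statistic and confidence interval follow mechanically from the estimate and its standard error; a short sketch using the slide's numbers:

```python
lam_hat = 3.867    # estimated slope
se = 0.929         # standard error of the slope
t_crit = 1.978     # t critical value at alpha = .05 (df = 198, from the slide)

t = lam_hat / se                                      # approx. 4.16
ci = (lam_hat - t_crit * se, lam_hat + t_crit * se)   # approx. (2.03, 5.70)
# The CI excludes zero and |t| > t_crit, so the slope is significant at .05
```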

Page 23: Confirmatory Factor Analysis

Example...

Simple Regression

[Scatterplot of SCI.c vs. SES.c with the fitted regression line]

Page 24: Confirmatory Factor Analysis

Multivariate Simple Regression

Not multiple regression! Multiple regression model:

$$y_i = \lambda_1 x_{1i} + \lambda_2 x_{2i} + \dots + \lambda_p x_{pi} + \varepsilon_i$$

Multivariate simple regression model (outcomes $y_{i1}, y_{i2}, y_{i3}, \dots, y_{ip}$):

$$y_{ij} = \lambda_j x_i + \varepsilon_{ij}; \qquad j = 1 \dots p$$

Page 25: Confirmatory Factor Analysis

Multivariate Simple Regression

[Path diagram: a single x with loadings λ1 … λp to outcomes y1, y2, y3, y4, … yp; each yj has an error ej with variance $\sigma^2_{e_j}$]

Page 26: Confirmatory Factor Analysis

Represents a system of linear equations: a separate regression equation for each outcome variable. Moving toward a matrix representation...

Multivariate Simple Regression

$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_p \end{bmatrix} = \begin{bmatrix} \lambda_1 \\ \lambda_2 \\ \vdots \\ \lambda_p \end{bmatrix} x_1 + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_p \end{bmatrix}$$

or $Y = \Lambda X + \varepsilon$ (capital Lambda), where $y_j = \lambda_j x_1 + \varepsilon_j; \; j = 1 \dots p$

Page 27: Confirmatory Factor Analysis

Estimation

If the errors are independent, you can estimate each equation separately using OLS!!

If the errors are contemporaneously correlated, then this is a Seemingly Unrelated Regressions (SUR) estimation problem

If you estimate the system of equations simultaneously, you can impose equality constraints across the equations

More on this later...

Multivariate Simple Regression

Page 28: Confirmatory Factor Analysis

Let's examine the HSB dataset from this perspective: what's the effect of SES on the academic achievement test scores?

All in deviation form (centered): SES.c, SCI.c, READ.c, WRITE.c, MATH.c, SOC.c

Multivariate Simple Regression

Page 29: Confirmatory Factor Analysis

Covariance Matrix (S)

Correlation Matrix (r)

Multivariate Simple Regression

         SES.c  SCI.c READ.c WRITE.c MATH.c  SOC.c
SES.c     0.52   2.03   2.18    1.42   1.85   2.58
SCI.c     2.03  98.03  63.97   53.53  58.50  49.44
READ.c    2.18  63.97 105.12   58.00  63.61  68.41
WRITE.c   1.42  53.53  58.00   89.84  54.83  61.54
MATH.c    1.85  58.50  63.61   54.83  87.77  54.76
SOC.c     2.58  49.44  68.41   61.54  54.76 115.26

         SES.c  SCI.c READ.c WRITE.c MATH.c  SOC.c
SES.c     1.00   0.28   0.29    0.21   0.27   0.33
SCI.c     0.28   1.00   0.63    0.57   0.63   0.47
READ.c    0.29   0.63   1.00    0.60   0.66   0.62
WRITE.c   0.21   0.57   0.60    1.00   0.62   0.60
MATH.c    0.27   0.63   0.66    0.62   1.00   0.54
SOC.c     0.33   0.47   0.62    0.60   0.54   1.00

Page 30: Confirmatory Factor Analysis

Run the multivariate regression analysis:
glm SCI.c READ.c WRITE.c MATH.c SOC.c with SES.
Results...

Multivariate Simple Regression

$$\Lambda = \begin{bmatrix} 3.867^* \\ 4.152^* \\ 2.715^* \\ 3.524^* \\ 4.919^* \end{bmatrix} \qquad \sigma_\varepsilon = \begin{bmatrix} 9.497 \\ 9.802 \\ 9.272 \\ 9.014 \\ 10.130 \end{bmatrix}$$

Page 31: Confirmatory Factor Analysis

Notice that these estimates may be computed directly from the variance covariance matrix!

Multivariate Simple Regression

         SES.c  SCI.c READ.c WRITE.c MATH.c  SOC.c
SES.c     0.52   2.03   2.18    1.42   1.85   2.58
SCI.c     2.03  98.03  63.97   53.53  58.50  49.44
READ.c    2.18  63.97 105.12   58.00  63.61  68.41
WRITE.c   1.42  53.53  58.00   89.84  54.83  61.54
MATH.c    1.85  58.50  63.61   54.83  87.77  54.76
SOC.c     2.58  49.44  68.41   61.54  54.76 115.26

$$\lambda_1 = \frac{Cov(SES.c, SCI.c)}{Var(SES.c)} = \frac{2.03}{.52} = 3.90$$

(Slightly different from 3.867 due to rounding error.)

$\sigma_{\varepsilon_1}$ = variance of y not due to x:

$$9.497 = \sqrt{98.028 - \left(3.867^2 \times .5246\right)}$$
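The point that every estimate is recoverable from the covariance matrix can be demonstrated directly; a sketch that reads the slopes and residual standard deviations off the slide's covariance matrix (values as printed, so results differ slightly from the GLM output due to rounding):

```python
import numpy as np

# Covariance matrix from the slide; row/column order:
# SES.c, SCI.c, READ.c, WRITE.c, MATH.c, SOC.c
S = np.array([
    [0.52,   2.03,   2.18,   1.42,   1.85,   2.58],
    [2.03,  98.03,  63.97,  53.53,  58.50,  49.44],
    [2.18,  63.97, 105.12,  58.00,  63.61,  68.41],
    [1.42,  53.53,  58.00,  89.84,  54.83,  61.54],
    [1.85,  58.50,  63.61,  54.83,  87.77,  54.76],
    [2.58,  49.44,  68.41,  61.54,  54.76, 115.26],
])

lam = S[0, 1:] / S[0, 0]                           # slope per outcome: cov / var(SES.c)
resid_var = np.diag(S)[1:] - lam ** 2 * S[0, 0]    # variance of each y not due to x
resid_sd = np.sqrt(resid_var)                      # compare with the sigma_epsilon vector
```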

Page 32: Confirmatory Factor Analysis

Finally!! CFA is a multivariate regression model where you don't directly observe the exogenous (X) variable(s)

Measurement theory

Can't directly observe many interesting phenomena: e.g., intelligence, personality, temperament, beliefs; e.g., pH, temperature, nuclear particles

So, identify observable phenomena that are thought to be caused by the underlying variable that is latent or unobservable

i.e., temperature vs. thermometer

Confirmatory Factor Analysis

Page 33: Confirmatory Factor Analysis

CFA model is constructed in advance:

Specifies the number of (latent) factors

Specifies the pattern of loadings on the factors (factor loadings can be constrained to be zero, or any other value)

Specifies the pattern of unique variances (measurement errors may be correlated)

Covariances among latent factors can be estimated or constrained; multiple group analysis is possible

Can TEST whether the constraints are consistent with the data

Confirmatory Factor Analysis

Page 34: Confirmatory Factor Analysis

Unidimensional CFA

[Path diagram: latent factor ξ (ksi or xi) with variance $\sigma^2_\xi$ and loadings λ1 … λ4 to indicators y1 … y4; each yj has a measurement error δj with variance $\sigma^2_{\delta_j}$. Compare with the multivariate regression diagram, where the observed x plays the role of ξ.]

Page 35: Confirmatory Factor Analysis

Unidimensional CFA

Mathematical representation:

$$y = \Lambda\xi + \delta$$

y = (q × 1) vector of indicator/manifest variables; q is the number of manifest variables
ξ = (p × 1) vector of latent variables or factors (ksi or xi); p is the number of latent variables
Λ = (q × p) matrix of factor loadings (lambda)
δ = (q × 1) vector of errors of measurement (delta)

Page 36: Confirmatory Factor Analysis

Unidimensional CFA

Mathematical representation: $y = \Lambda\xi + \delta$

$$\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix} = \begin{bmatrix} \lambda_1 \\ \lambda_2 \\ \lambda_3 \\ \lambda_4 \end{bmatrix} \left[\xi_1\right] + \begin{bmatrix} \delta_1 \\ \delta_2 \\ \delta_3 \\ \delta_4 \end{bmatrix}$$

Page 37: Confirmatory Factor Analysis

Unidimensional CFA

Mathematical representation: two additional variance-covariance matrices are needed

$$\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix} = \begin{bmatrix} \lambda_1 \\ \lambda_2 \\ \lambda_3 \\ \lambda_4 \end{bmatrix} \left[\xi_1\right] + \begin{bmatrix} \delta_1 \\ \delta_2 \\ \delta_3 \\ \delta_4 \end{bmatrix}$$

Phi (p × p): $\Phi = \left[\sigma^2_\xi\right]$

Theta-delta (q × q):

$$\Theta_\delta = \begin{bmatrix} \sigma^2_{\delta_1} & 0 & 0 & 0 \\ 0 & \sigma^2_{\delta_2} & 0 & 0 \\ 0 & 0 & \sigma^2_{\delta_3} & 0 \\ 0 & 0 & 0 & \sigma^2_{\delta_4} \end{bmatrix}$$

Page 38: Confirmatory Factor Analysis

Unidimensional CFA

Mathematical representation: the specified model results in an implied or predicted variance-covariance matrix, $\Sigma_\theta$

$$\Sigma_\theta = \Lambda\Phi\Lambda' + \Theta$$

$$\Lambda\Phi\Lambda' + \Theta = \begin{bmatrix} \lambda_1^2\phi_{11} + \theta_{11} & & & \\ \lambda_2\lambda_1\phi_{11} & \lambda_2^2\phi_{11} + \theta_{22} & & \\ \lambda_3\lambda_1\phi_{11} & \lambda_3\lambda_2\phi_{11} & \lambda_3^2\phi_{11} + \theta_{33} & \\ \lambda_4\lambda_1\phi_{11} & \lambda_4\lambda_2\phi_{11} & \lambda_4\lambda_3\phi_{11} & \lambda_4^2\phi_{11} + \theta_{44} \end{bmatrix}$$
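The implied covariance matrix can be assembled in a few lines; a sketch with hypothetical parameter values (the λ's, φ, and θ's below are chosen for illustration, not estimated from data):

```python
import numpy as np

lam = np.array([[0.9], [0.8], [0.7], [0.6]])   # Lambda: 4 loadings, 1 factor (hypothetical)
phi = np.array([[1.0]])                        # Phi: latent variance fixed to 1
theta = np.diag([0.19, 0.36, 0.51, 0.64])      # Theta-delta: unique variances (hypothetical)

sigma = lam @ phi @ lam.T + theta              # implied covariance matrix, Sigma(theta)

# Off-diagonal elements are lambda_i * lambda_j * phi;
# diagonal elements are lambda_j**2 * phi + theta_jj
```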

Page 39: Confirmatory Factor Analysis

If the unidimensional model is reasonable, then:

The covariance of scores on any two indicators is:

The product of the respective regression slopes (loadings) and the variance of the latent variable

The variance of the jth indicator is:

The sum of the squared loading and the error variance of the jth indicator

Unidimensional CFA

$$\sigma_{ij} = \lambda_i\lambda_j\,\sigma^2_{\xi_1}$$

$$\sigma_{jj} = \lambda_j^2\,\sigma^2_{\xi_1} + \sigma^2_{\delta_j}$$

Page 40: Confirmatory Factor Analysis

Can also reverse this process

If we know the indicator variances and covariances, we can determine the model parameters that generated them

The regression slopes (loadings) may be determined from the covariance of any 3 indicators

This means that we must have at least 3 indicators to estimate the regression slopes!!

3 indicators = just identified single factor model

Unidimensional CFA

$$\lambda_j = \sqrt{\frac{\sigma_{jk}\,\sigma_{jl}}{\sigma_{kl}\,\sigma^2_{\xi_1}}}$$
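This triplet identity is easy to verify: construct covariances from known loadings, then recover a loading from three of them. A round-trip sketch with hypothetical values and the latent variance fixed to 1:

```python
import math

# Hypothetical true loadings; latent variance fixed to 1.0
lam_true = [0.9, 0.8, 0.7]
phi = 1.0

# Model-implied covariances: sigma_jk = lambda_j * lambda_k * phi
s01 = lam_true[0] * lam_true[1] * phi   # 0.72
s02 = lam_true[0] * lam_true[2] * phi   # 0.63
s12 = lam_true[1] * lam_true[2] * phi   # 0.56

# Recover lambda_0 from the covariance triplet
lam0 = math.sqrt(s01 * s02 / (s12 * phi))
```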

Page 41: Confirmatory Factor Analysis

Can also reverse this process

The error variances (aka uniquenesses) may be determined via subtraction

Unidimensional CFA

$$\sigma^2_{\delta_j} = \sigma_{jj} - \lambda_j^2\,\sigma^2_{\xi_1}$$

Page 42: Confirmatory Factor Analysis

Problem! Different covariance triplets will likely yield different estimates of the regression slopes. Which one is right... if any?

Solution: Maximum Likelihood estimation. ML is a very general estimation technique that determines the parameter estimates that maximize the fit between the implied covariance matrix and the actual covariance matrix

Minimizes the difference

Unidimensional CFA

Page 43: Confirmatory Factor Analysis

Goal: Specify a model with an estimated covariance matrix that is not significantly different from the sample covariance matrix

Parameters are estimated simultaneously, yielding an estimated covariance matrix, $\Sigma_\theta$. S is the sample covariance matrix

The residual matrix is found by subtracting $\Sigma_\theta$ from S

Process iterates until changes in estimates do not improve model fit

Unidimensional CFA

Page 44: Confirmatory Factor Analysis

Iterations continue until the fit function is minimized; this is called convergence

When the model converges you get estimates of all of the parameters and a residual matrix

χ² is based on the function minimum (from ML estimation) when the model converges: the minimum value is multiplied by N - 1

Unidimensional CFA

Page 45: Confirmatory Factor Analysis

Identification (for single factor models)

Key issue: do the parameters in the model result in a unique covariance matrix?

If different parameter sets yield identical outcomes, then there is no way to determine which set of parameters is superior

Def: The set of parameters θ={Λ,Φ,Θ} is not identified if there exists θ1≠θ2 such that Σ(θ1)= Σ(θ2)

This becomes a bigger issue when we get to multidimensional CFA

Unidimensional CFA

Page 46: Confirmatory Factor Analysis

Identification (for single factor models)

Necessary conditions:

T-rule: p(p + 1)/2 - (# of estimated parameters) > 0

The latent variable scale must be provided using one of two common constraints:

Variance of the latent variable is fixed to 1 (yields standardized loadings)

Loading of one indicator is fixed to 1 (yields unstandardized loadings)

Unidimensional CFA
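The t-rule arithmetic can be made concrete; a sketch for a single-factor model with five indicators, assuming the scale is set by fixing one loading to 1:

```python
q = 5                                # number of indicators
moments = q * (q + 1) // 2           # unique variances/covariances: 15

# Free parameters for a congeneric single-factor model with the first loading
# fixed to 1: 4 free loadings + 5 unique variances + 1 factor variance
free_params = (q - 1) + q + 1        # 10

df = moments - free_params           # 5 degrees of freedom: over-identified
```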

Page 47: Confirmatory Factor Analysis

The unidimensional CFA model is a theory

You believe that the observed variables are manifestations or indicators of a common latent variable that is not directly observable

The indicators may also reflect other sources of variance that are not common across the set of indicators: uniqueness or error variance (δᵢ)

The variances and covariances among the observed variables are due to a single underlying latent (unobservable) variable

This is a very strong model of reality!

Unidimensional CFA

Page 48: Confirmatory Factor Analysis

Example: High school and beyond data

Variance-covariance matrix for the 5 achievement tests

Correlation matrix for the 5 achievement tests

What latent variable might underlie scores on these 5 achievement tests?

Unidimensional CFA

         SCI.c READ.c WRITE.c MATH.c  SOC.c
SCI.c    98.03  63.97   53.53  58.50  49.44
READ.c   63.97 105.12   58.00  63.61  68.41
WRITE.c  53.53  58.00   89.84  54.83  61.54
MATH.c   58.50  63.61   54.83  87.77  54.76
SOC.c    49.44  68.41   61.54  54.76 115.26

         SCI.c READ.c WRITE.c MATH.c  SOC.c
SCI.c     1.00   0.63    0.57   0.63   0.47
READ.c    0.63   1.00    0.60   0.66   0.62
WRITE.c   0.57   0.60    1.00   0.62   0.60
MATH.c    0.63   0.66    0.62   1.00   0.54
SOC.c     0.47   0.62    0.60   0.54   1.00

Page 49: Confirmatory Factor Analysis

Example: High school and beyond data

We're finally ready to fit our first CFA model using AMOS

Start by opening the High School and Beyond dataset in SPSS

Click on the "Analyze" menu... down at the bottom should be an option for "Amos" -> click it

Amos provides a very nice graphical interface for building our CFA models

Unidimensional CFA

Page 50: Confirmatory Factor Analysis

Example: High school and beyond data

Let's build a single factor CFA model for the achievement tests

Notice that AMOS automatically adds the scale constraints: the first path is constrained to 1.0

Click on the "View" menu option and view the variables in the dataset

Drag the academic achievement tests to the desired box in the model

Click on the "Plugins" menu option; use "Name unobserved variables" to automatically name your latent variables for you

Before you estimate the parameters, think about what you expect to see!!! (df, what estimates)

Unidimensional CFA

Page 51: Confirmatory Factor Analysis

Example: High school and beyond data

Click on the "Analyze" menu option and then "Calculate the estimates"

Examine the model with the estimates placed on it

Check your output window to make sure the model converged... "minimum was achieved"

Let's examine the local and global fit information

Does the model fit well?

Unidimensional CFA

Page 52: Confirmatory Factor Analysis

Example: High school and beyond data

Let's fit the model with the latent variance constrained to 1.0:
Select the latent variable and add the 1.0 constraint
Select the path and remove the 1.0 constraint

Does the model fit change? Do the parameter estimates change?

Unidimensional CFA

Page 53: Confirmatory Factor Analysis

Local Fit

Are the regression slopes all significantly different from zero in the population? Look at the t-tests

Global Fit

Chi-square test: can be overpowered (too sensitive) with sample sizes over 200; sensitive to nonnormality

Goodness of fit indices

Page 54: Confirmatory Factor Analysis

Global Fit: Comparative Fit Index (CFI)

Indicates reduction in model misfit of a target model relative to a baseline (independence) model

$$CFI = \frac{(\chi^2_{null} - df_{null}) - (\chi^2_{model} - df_{model})}{\chi^2_{null} - df_{null}}$$

.90-.95 = acceptable; above .95 is good

Goodness of fit indices

Page 55: Confirmatory Factor Analysis

Global Fit: Root Mean Square Error of Approximation (RMSEA)

Discrepancy per degree of freedom

$$RMSEA = \sqrt{\frac{\chi^2_{model}/df_{model} - 1}{N - 1}}$$

RMSEA ≤ 0.05 → close fit
0.05 < RMSEA ≤ 0.08 → reasonable fit
RMSEA > 0.1 → poor fit

Goodness of fit indices
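Both indices are simple functions of the chi-square values; a sketch computing CFI and RMSEA from hypothetical fit statistics (the χ² values below are invented for illustration):

```python
import math

# Hypothetical fit statistics, for illustration only
chi2_null, df_null = 600.0, 10      # independence (baseline) model
chi2_model, df_model = 12.0, 5      # target model
N = 200

cfi = ((chi2_null - df_null) - (chi2_model - df_model)) / (chi2_null - df_null)
# CFI approx. .988: above .95, good by the slide's cutoffs

rmsea = math.sqrt(max(chi2_model / df_model - 1.0, 0.0) / (N - 1))
# RMSEA approx. .084: marginal by the slide's cutoffs
```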

Page 56: Confirmatory Factor Analysis

Global Fit: Standardized Root Mean Square Residual (SRMR)

Standardized difference between the observed covariance matrix and the predicted covariance matrix
A value of zero indicates perfect fit
This measure tends to be smaller as sample size increases and as the number of parameters in the model increases

SRMR < .08 is good fit

Goodness of fit indices

Page 57: Confirmatory Factor Analysis

Global Fit: Hu & Bentler criteria

SRMR ≤ .08 AND (CFI ≥ .95 OR RMSEA ≤ .06)

Goodness of fit indices

Page 58: Confirmatory Factor Analysis

Reliability

The reliability of a set of indicators as a measure of the latent variable may be determined from the estimated parameters

If the indicators are homogeneous (i.e., unidimensional), reliability is the ratio of true score variance to total variance

Unidimensional CFA

$$\omega = \frac{\left(\sum \lambda_i\right)^2}{\left(\sum \lambda_i\right)^2 + \sum \sigma^2_{\delta_i}}$$
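Omega follows directly from the estimated loadings and unique variances; a sketch with hypothetical standardized estimates for four indicators:

```python
# Hypothetical standardized loadings and unique variances (illustration only)
lam = [0.9, 0.8, 0.7, 0.6]
theta = [0.19, 0.36, 0.51, 0.64]

common = sum(lam) ** 2                    # (sum of loadings)^2 = 9.0
omega = common / (common + sum(theta))    # approx. .84
```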

Page 59: Confirmatory Factor Analysis

Omega is:

Ratio of variance due to the common attribute to the total variance in X

Square of the correlation between X (observed score) and the common factor

Correlation between two parallel test scores

Square of the correlation between the total score of m indicators and the total score of an infinite set of indicators

Unidimensional CFA

Page 60: Confirmatory Factor Analysis

Parallel Indicators: equal lambdas (loadings), equal error variances

Tau-Equivalent Indicators: equal lambdas

Congeneric Indicators: nothing needs to be equal

Can compare model fit to determine the best representation

Unidimensional CFA

Page 61: Confirmatory Factor Analysis

If the tau-equivalent model holds, omega = coefficient alpha

If not tau-equivalent, alpha will be less than omega

Standards for reliability (Nunnally, 1968): .7 for research; .9 for application

.7 has become the standard

Unidimensional CFA

Page 62: Confirmatory Factor Analysis

Models are nested whenever one model is the same as the other model except that it has some parameters constrained

In other words, if you unconstrained some parameters in one model, you would get the other

When models are nested, we can perform a chi-square difference test to compare them

Nested Model Comparisons

Page 63: Confirmatory Factor Analysis

Examples:

Testing for redundancy: a one-factor model is nested within a two-factor simple structure CFA (constrains the correlation between factors to equal 1)

Testing for orthogonality: an orthogonal factor structure is nested within an oblique structure

Testing for tau equivalence: a model with factor loadings constrained to be equal is nested within one that lets all loadings be different

Nested Model Comparisons

Page 64: Confirmatory Factor Analysis

Chi-Square difference test

H0: more restrictive model (the one with some parameters constrained)
HA: less restrictive model

Chi-square test to compare these models = chi-square test value for the more restrictive model - chi-square test value for the less restrictive model

Compare this test statistic to a chi-squared distribution with d.f. = (d.f. for the more restrictive model) - (d.f. for the less restrictive model)

Nested Model Comparisons

Page 65: Confirmatory Factor Analysis

Example: Tau-equivalent model

H0: tau-equivalent model vs. HA: congeneric model
H0: λ1 = λ2 = λ3 = λ4 = λ5 vs. HA: all factor loadings can be different

Chi-square test value for the congeneric model = 238.5 with 9 degrees of freedom
Chi-square test value for the tau-equivalent model = 263.8 with 14 degrees of freedom
Chi-square difference test for the hypothesis above = 263.8 - 238.5 = 25.3 with 14 - 9 = 5 df

The p-value associated with this test is < .0001, so we reject the null hypothesis in favor of the alternative. Thus the tau-equivalent model (the more restrictive model) is rejected in favor of the congeneric model

Nested Model Comparisons
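The difference test above reduces to a few arithmetic steps; a sketch using the slide's values, comparing the difference statistic against the χ² critical value for 5 df (11.07 at α = .05):

```python
# Chi-square values from the slide's example
chi2_restricted, df_restricted = 263.8, 14   # tau-equivalent model
chi2_free, df_free = 238.5, 9                # congeneric model

delta_chi2 = chi2_restricted - chi2_free     # 25.3
delta_df = df_restricted - df_free           # 5

CRITICAL_95_DF5 = 11.07                      # chi-square .95 quantile with 5 df
reject_restricted = delta_chi2 > CRITICAL_95_DF5   # True: prefer the congeneric model
```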

Page 66: Confirmatory Factor Analysis

The chi-squared test of model fit and the standard errors for factor loadings are based on a normality assumption for x|f and f, and on asymptotic theory

How big should n be before we trust these numbers? n > 15 (number of free parameters)

How non-normal is non-normal? Rules of thumb: examine univariate skew and kurtosis. West, Finch, and Curran (1995). "Structural equation models with non-normal variables: Problems and remedies." In R. H. Hoyle (Ed.), Structural equation modeling (pp. 56-75). Thousand Oaks, CA: Sage.

Sample Size and Normality

Page 67: Confirmatory Factor Analysis


Missing Data

Page 68: Confirmatory Factor Analysis


Factor Scores