A gentle introduction to growth curves using SPSS

A gentle introduction to growth curves

Dr. Sean P. Mackinnon, Dalhousie University

When to use a growth curve

Growth curves measure patterns of change over time

Specifically, mean-level changes over time

Patterns can be linear, quadratic, cubic, etc.

        Time 1  Time 2  Time 3
John    10      7       5
Mary    8       5       4
Zoe     7       9       9
Sarah   5       2       1
Bill    2       4       3
MEAN    6.4     5.4     4.4

Mean-Level Change
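As a minimal sketch of what "mean-level change" means, the column means in the table can be recomputed from the five participants' scores. The variable names are illustrative, and Python is used only to show the arithmetic; the analysis itself is run in SPSS.

```python
# Scores from the example table: one list per person, ordered Time 1..Time 3.
scores = {
    "John": [10, 7, 5],
    "Mary": [8, 5, 4],
    "Zoe": [7, 9, 9],
    "Sarah": [5, 2, 1],
    "Bill": [2, 4, 3],
}
n_times = 3

# Mean at each time point (the MEAN row of the table).
means = [
    sum(person[t] for person in scores.values()) / len(scores)
    for t in range(n_times)
]

# Change in the mean between adjacent time points.
changes = [round(means[t + 1] - means[t], 10) for t in range(n_times - 1)]

print([round(m, 1) for m in means])
print(changes)
```

Note that the mean drops by one point per occasion even though Zoe and Bill do not follow that pattern; the individual differences are what random effects later capture.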

Limitations of RM-ANOVA

Requires a balanced design (i.e., no missing data)

Requires equal spacing between time points

Requires independence of observations (not often possible in longitudinal data)

Requires homogeneity of variance

Growth Curves overcome these limitations

Accounts for missing data using a full information maximum likelihood (FIML) approach

Does not require equal spacing between time points (can specify unequal time points, e.g., 1, 2, 5, 7, 10)

Does not require independence of observations (can model different types of correlated error structures)

Is robust to violations of homogeneity of variance assumptions required by RM-ANOVA

So… what are growth curves?

Growth curves are a type of mixed (or multilevel) model

Simply put, multilevel models are a way of dealing with clustered data

For example…

[Figure: two-level data structure. Level 2, Between-Subjects (2 participants): Participant ID001 (average) and Participant ID002 (average). Level 1, Within-Subjects (6 measurement occasions).]

Growth Curves are Multilevel Models

All multilevel models (MLMs) partition variance into its appropriate levels (e.g., students nested within schools)

Multilevel models also use maximum likelihood estimation, which handles missing data better and is more flexible when dealing with real data

Growth curves are a specific type of MLM where:

The lowest level of observation is repeated measures

The predictor variable is TIME
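One common way to write a linear growth curve as a two-level model is sketched below. The notation is an assumption (it does not appear in the slides): $y_{ti}$ is the score of person $i$ at occasion $t$, the $\beta$ terms are fixed effects, and the $u$ terms are person-specific (random) deviations.

```latex
\[
\begin{aligned}
\text{Level 1 (within-person):}\quad & y_{ti} = \pi_{0i} + \pi_{1i}\,\mathrm{time}_{ti} + e_{ti} \\
\text{Level 2 (between-person):}\quad & \pi_{0i} = \beta_{00} + u_{0i} \\
& \pi_{1i} = \beta_{10} + u_{1i}
\end{aligned}
\]
```

Setting $u_{1i} = 0$ for everyone gives a fixed slope; letting it vary gives a random slope.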

Application to a clinical context

The randomized controlled trial (RCT) is a common design

Growth curves can be used instead of ANOVA

The time*interv interaction is most important

Leiter et al., 2012

How do you do this in SPSS?

First, you need to convert your data from “WIDE” format to “LONG” format

Wide Format

Long Format (use the syntax provided in the handout to get this)
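To make the wide-to-long idea concrete, here is a small sketch of the restructuring. The data values and the helper `wide_to_long` are hypothetical; the column names follow the ASItotal.0, ASItotal.1, ASItotal.2 naming used in the appendix syntax, and missing occasions are kept as rows, as they are with /NULL=KEEP.

```python
# Wide format: one row per person, one column per measurement occasion.
# Values are made up for illustration; None marks a missing occasion.
wide = [
    {"id": 1, "interv": 0, "ASItotal.0": 30, "ASItotal.1": 28, "ASItotal.2": 27},
    {"id": 2, "interv": 1, "ASItotal.0": 31, "ASItotal.1": 25, "ASItotal.2": None},
]


def wide_to_long(rows, stub="ASItotal", n_times=3):
    """Return one row per person per occasion, with a time index of 1..n_times."""
    long_rows = []
    for row in rows:
        for t in range(n_times):
            long_rows.append(
                {
                    "id": row["id"],
                    "interv": row["interv"],
                    "time": t + 1,
                    stub: row[f"{stub}.{t}"],
                }
            )
    return long_rows


long_data = wide_to_long(wide)
for r in long_data:
    print(r)
```

Each person now contributes three rows, and between-person variables such as interv are simply repeated on every row.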

Coding the Time Variable is Important

The choices you make for your time variable will influence your analyses!

For a linear relationship with equally spaced measurements, the time codes need to be equidistant: 1, 2, 3 OR -1, 0, 1, etc.

If you are expecting a quadratic relationship, you also need to calculate time-squared: 1, 4, 9 OR 1, 0, 1

For unequal time points (e.g., 1 month, 3 months, 12 months), code the actual spacing: 1, 3, 12
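The coding options above can be sketched as follows. The helper name `center` is illustrative and not part of the handout.

```python
def center(codes):
    """Subtract the mean so that the codes are centred on zero."""
    mean = sum(codes) / len(codes)
    return [c - mean for c in codes]


equal_spacing = [1, 2, 3]
centered = center(equal_spacing)               # -1, 0, 1

# Quadratic terms are just the squared time codes.
quad_raw = [t * t for t in equal_spacing]      # 1, 4, 9
quad_centered = [t * t for t in centered]      # 1, 0, 1

# Unequal spacing: use the actual elapsed time (months here) as the code.
months = [1, 3, 12]
gaps = [b - a for a, b in zip(months, months[1:])]

print(centered, quad_raw, quad_centered, gaps)
```

Centring changes what the intercept means (the expected score at the middle occasion rather than at time zero) but not the linear slope.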

Decision 1: ML vs REML

Maximum Likelihood Estimation (ML) vs Restricted Maximum Likelihood Estimation (REML)

REML is generally preferred because it provides less biased estimates of the variance components

ML is preferred if you need to compare nested models that differ in their fixed effects, as REML likelihoods are not comparable across such models

Decision 2: Fixed vs Random

Random vs. Fixed Slopes & Intercepts

Random (varying): Allow to vary across people

Fixed (constant): Force them to be equal across people

Random vs. Fixed has no single, agreed-upon definition (Gelman, 2005); I’m presenting a practical conceptualization

Fixed (constant) intercepts and slopes are more parsimonious and less computationally intensive, but may not be as good a fit to the data. Select the most parsimonious model that fits the data best.
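A rough way to build intuition for "varying" versus "constant" is to fit a separate straight line to each person and compare it with a single line fitted to everyone. This is only a sketch using the five-person example data with time coded 0, 1, 2 (an assumption); a mixed model does not estimate separate regressions like this, and its person-specific estimates are pulled toward the overall line.

```python
scores = {
    "John": [10, 7, 5],
    "Mary": [8, 5, 4],
    "Zoe": [7, 9, 9],
    "Sarah": [5, 2, 1],
    "Bill": [2, 4, 3],
}
time = [0, 1, 2]  # assumed coding: first occasion is time zero


def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# "Varying" view: every person gets their own intercept and slope.
person_lines = {name: ols_line(time, y) for name, y in scores.items()}

# "Constant" view: one intercept and one slope for everyone.
all_x = time * len(scores)
all_y = [value for y in scores.values() for value in y]
pooled_intercept, pooled_slope = ols_line(all_x, all_y)

for name, (b0, b1) in person_lines.items():
    print(f"{name}: intercept={b0:.2f}, slope={b1:.2f}")
print(f"Pooled: intercept={pooled_intercept:.2f}, slope={pooled_slope:.2f}")
```

The spread of the individual intercepts and slopes around the pooled line is what the random-effect variances describe.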

[Figure: Random (varying) Intercepts, Random (varying) Slopes]

http://www.spss.ch/upload/1126184451_Linear%20Mixed%20Effects%20Modeling%20in%20SPSS.pdf

[Figure: Random (varying) Intercepts, Fixed (constant) Slopes]

[Figure: Fixed (constant) Intercepts, Random (varying) Slopes]

Decision 3: Linear, Quadratic, or Cubic?

If slopes are allowed to be random (varying), then you need at least:

3 time points for linear

4 time points for quadratic (add time*time as a predictor)

5 time points for cubic (add time*time and time*time*time as predictors)

One fewer time point is needed if using fixed slopes

Today, I’m focusing on LINEAR relationships

Decision 4: Covariance Structure

Is there a predictable pattern to the errors?

If you are unsure, specify an “unstructured” matrix. It is less parsimonious because every variance and covariance is freely estimated

An AR(1) correlated error structure is also fairly common: errors are autoregressively correlated, with correlations getting smaller as time points get more distant

You can test multiple models with different plausible structures, and choose the one that fits the data best
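To show what an AR(1) pattern looks like, the sketch below builds the covariance matrix it implies for three occasions and compares its parameter count with an unstructured matrix. The variance and correlation values are made up, and the function names are illustrative.

```python
def ar1_cov(n_times, sigma2, rho):
    """AR(1) covariance: sigma2 * rho ** |i - j| between occasions i and j."""
    return [
        [sigma2 * rho ** abs(i - j) for j in range(n_times)]
        for i in range(n_times)
    ]


def n_params_unstructured(n_times):
    """An unstructured matrix estimates every variance and covariance."""
    return n_times * (n_times + 1) // 2


matrix = ar1_cov(3, sigma2=1.0, rho=0.5)
for row in matrix:
    print(row)

# AR(1) always needs two parameters (sigma2 and rho), however many occasions.
print("AR(1) parameters:", 2)
print("Unstructured parameters, 3 occasions:", n_params_unstructured(3))
print("Unstructured parameters, 6 occasions:", n_params_unstructured(6))
```

The gap in parameter counts grows quickly with the number of occasions, which is the parsimony trade-off described above.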

Annotated Syntax

MIXED ASItotal WITH time interv
  /METHOD = REML
  /FIXED = time interv time*interv | SSTYPE(3)
  /RANDOM = INTERCEPT time interv | SUBJECT(id) COVTYPE(UN)
  /PRINT = SOLUTION TESTCOV HISTORY.

*MIXED: Mixed model, dependent variable predicted by time and intervention.

*METHOD: Restricted Maximum Likelihood Estimation (usually better than ML).

*FIXED: Put all predictors after FIXED. Indicate interactions by Var1*Var2.

*RANDOM: The intercept, and the slopes for time and interv are random. The slope for the interaction is fixed because I omitted it from this part.

*COVTYPE: "UN" specifies an unstructured covariance matrix (other types are possible, but require thought).

Annotated Output: Model Comparison

Use the BIC values to compare nested models (e.g., random slopes vs fixed slopes)

Lower values are better; treat a difference as meaningful when ∆BIC > 4
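The comparison can be sketched with the general BIC formula, -2 log-likelihood plus a penalty of k * ln(n). All numbers below are hypothetical rather than taken from the output, and what SPSS counts as k and n depends on the estimation method, so its reported BIC may not match a hand calculation exactly.

```python
import math


def bic(neg2_loglik, n_params, n_obs):
    """General BIC: -2 log-likelihood plus a penalty of k * ln(n)."""
    return neg2_loglik + n_params * math.log(n_obs)


# Hypothetical fit results for two models of the same data.
bic_random_slopes = bic(neg2_loglik=1500.0, n_params=6, n_obs=300)
bic_fixed_slopes = bic(neg2_loglik=1503.0, n_params=4, n_obs=300)

delta = bic_random_slopes - bic_fixed_slopes
preferred = "fixed slopes" if delta > 0 else "random slopes"

print(f"Delta BIC = {delta:.2f}; lower BIC: {preferred}")
print("Meaningful difference (> 4):", abs(delta) > 4)
```

Here the random-slopes model fits slightly better in raw likelihood, but the extra parameters cost more than they gain, so the simpler model has the lower BIC.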

Annotated Output: Covariance Parameters

UN(1,1) = Variance of the Intercept. Significant, so random intercepts are important to include.

UN(2,2) = Variance of the slope for time. Non-significant, which suggests that a more parsimonious model with fixed slopes for time would fit the data better.

Annotated Output

Interpret like ANOVA; parameters adjusted for clustering

Time -> Main effect for time (linear, in this case)

Interv -> Main effect for intervention

Time * interv -> 2-way Interaction

Graphing the interaction is usually important for understanding it

Dummy coding (0, 1) the intervention variable helps a LOT

Graphing the interaction

With this kind of model, you can graph the interaction using tools meant for moderation in linear regression

The parameters in the output are interpreted the same way; they’re just adjusted to account for the clustering due to repeated measurement and for missing data

http://www.jeremydawson.co.uk/slopes.htm
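With intervention dummy coded (0, 1), the fixed effects translate directly into one line per group, which is what gets graphed. The sketch below uses made-up coefficients (they are not from the output shown in the slides) and hypothetical function names.

```python
def predicted(b0, b_time, b_interv, b_inter, time, interv):
    """Model-implied mean from the fixed effects of a time-by-group model."""
    return b0 + b_time * time + b_interv * interv + b_inter * time * interv


def simple_slope(b_time, b_inter, interv):
    """Slope of time within one group (interv = 0 or 1)."""
    return b_time + b_inter * interv


# Hypothetical fixed-effect estimates.
b0, b_time, b_interv, b_inter = 30.0, -1.0, 0.5, -2.0

for group, label in [(0, "control"), (1, "intervention")]:
    line = [predicted(b0, b_time, b_interv, b_inter, t, group) for t in (0, 1, 2)]
    print(label, "slope:", simple_slope(b_time, b_inter, group), "points:", line)
```

With this coding, the time coefficient is the control group's slope and the interaction coefficient is the difference between the two groups' slopes.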




A few closing points

Other software can implement this (e.g., SAS, Mplus, HLM)

Non-normal data may be better modeled with different distributional assumptions (e.g., Poisson)

Modeling of covariance structures may be important, but can be challenging to figure out

Some programs (e.g., Mplus) may use a latent variable approach

Questions? Comments?

Thank you!

P.S. In the handout I provided, there is some syntax and instructions which may be helpful!

Email me if you want an electronic copy of the presentation:

[email protected]

Appendix: Syntax

*Convert data from LONG to WIDE format.
SORT CASES BY id time.
CASESTOVARS
  /ID=id
  /INDEX=time
  /GROUPBY=VARIABLE.

*Convert data from WIDE to LONG format.
VARSTOCASES
  /MAKE ASItotal FROM ASItotal.0 ASItotal.1 ASItotal.2
  /INDEX=time(3)
  /KEEP=id interv
  /NULL=KEEP.

Appendix: Syntax

*Linear Growth Curve with Intervention Group as Moderator (Random Intercept, Random Slopes).

MIXED ASItotal WITH time interv
  /METHOD = REML
  /FIXED = time interv time*interv | SSTYPE(3)
  /RANDOM = INTERCEPT time interv time*interv | SUBJECT(id) COVTYPE(UN)
  /PRINT = SOLUTION TESTCOV HISTORY.

Appendix: Syntax

*Linear Growth Curve with Intervention Group as Moderator (Random Intercept, Fixed Slopes).

MIXED ASItotal WITH time interv
  /METHOD = REML
  /FIXED = time interv time*interv | SSTYPE(3)
  /RANDOM = INTERCEPT | SUBJECT(id) COVTYPE(UN)
  /PRINT = SOLUTION TESTCOV HISTORY.

Appendix: Syntax

*Linear Growth Curve with Intervention Group as Moderator (Fixed Intercept, Random Slopes).

MIXED ASItotal WITH time interv
  /METHOD = REML
  /FIXED = time interv time*interv | SSTYPE(3)
  /RANDOM = time interv time*interv | SUBJECT(id) COVTYPE(UN)
  /PRINT = SOLUTION TESTCOV HISTORY.

Appendix: Syntax

*Quadratic Growth Curve with Intervention Group as Moderator (Random Intercept, Fixed Slopes).

*Compute the squared time term first.
COMPUTE quadtime = time*time.
EXECUTE.

*quadtime must be listed after WITH so that it can be used as a covariate in FIXED.
MIXED ASItotal WITH time quadtime interv
  /METHOD = REML
  /FIXED = time quadtime interv time*interv quadtime*interv | SSTYPE(3)
  /RANDOM = INTERCEPT | SUBJECT(id) COVTYPE(UN)
  /PRINT = SOLUTION TESTCOV HISTORY.
