
Page 1: Binary Models 1

1

Binary Models 1

A (Longitudinal) Latent Class Analysis of Bedwetting

Page 2: Binary Models 1

2

How much fun can you have with 5 binary variables?

Page 3: Binary Models 1

3

Croudace paper:

Page 4: Binary Models 1

4

Page 5: Binary Models 1

5

Gender-specific prevalence and levels of missing data

                    4.5 years       5.5 years       6.5 years       7.5 years       9.5 years

Boys (n=7217)
  No bedwetting     3139            3309            3293            3370            3414
  Bedwetting        1793 (36.4%)    1281 (27.9%)    1061 (24.4%)     844 (20.0%)     526 (13.4%)
  No response       2285            2627            2863            3003            3277

Girls (n=6756)
  No bedwetting     3555            3635            3516            3560            3587
  Bedwetting        1074 (23.2%)     691 (16.0%)     586 (14.3%)     420 (10.6%)     225 (5.9%)
  No response       2127            2430            2654            2776            2944

Page 6: Binary Models 1

6

Latent Class Models

• Deal with patterns of response, e.g. 11111 = Yes at all five time points

00000 = No at all five time points

11000 = Yes early on, followed by no

10101 = Alternating pattern

11*** = Yes followed by missing

1*0*1 = etc.

Page 7: Binary Models 1

7

Latent Class Models

Page 8: Binary Models 1

8

Conditional independence

• The manifest variables are assumed to be independent given latent class, so the probability of any response pattern within a class can be written as the product of the item probabilities for that class.

E.g. for the pattern ‘01’, if pat(1) denotes the first element of a pattern (i.e. the first binary variable), then

P( pattern = ‘01’ | class = i )

= P( pat(1) = ‘0’ | class = i ) * P( pat(2) = ‘1’ | class = i )
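
As a concrete illustration of the product rule – a minimal Python sketch, not part of the original slides; the class-specific item probabilities are invented:

# Hypothetical probability of bedwetting ('1') at each of the five time
# points for one latent class -- illustrative values only
p_yes_given_class = [0.80, 0.60, 0.40, 0.20, 0.10]

def pattern_prob(pattern, p_yes):
    # P(pattern | class) under conditional independence:
    # multiply P(item = observed value | class) over the items
    prob = 1.0
    for obs, p in zip(pattern, p_yes):
        prob *= p if obs == 1 else (1 - p)
    return prob

print(pattern_prob([0, 1], p_yes_given_class[:2]))   # P(pattern = '01' | class)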

Page 9: Binary Models 1

9

Huh?

• Basically, Bayes’ rule plus the conditional independence assumption let you calculate the posterior probability that each response pattern belongs to each class.

• By choosing starting values for the probabilities, or alternatively a starting assignment of patterns to classes, one can iterate to convergence to find the best solution for the chosen number of classes
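
A minimal Python sketch of that Bayes step, again with made-up class prevalences and item probabilities rather than the fitted values:

# Illustrative two-class example -- all numbers are invented
class_prev = [0.7, 0.3]                       # P(class)
p_yes = [[0.05, 0.05, 0.05, 0.05, 0.05],      # P(item = 1 | class 1)
         [0.80, 0.60, 0.40, 0.20, 0.10]]      # P(item = 1 | class 2)

def pattern_prob(pattern, p):
    prob = 1.0
    for obs, pi in zip(pattern, p):
        prob *= pi if obs == 1 else (1 - pi)
    return prob

pattern = [1, 1, 0, 0, 0]                     # the '11000' pattern
joint = [prev * pattern_prob(pattern, p) for prev, p in zip(class_prev, p_yes)]
posterior = [j / sum(joint) for j in joint]   # Bayes' rule: P(class | pattern)
print(posterior)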

Page 10: Binary Models 1

10

Distribution of patterns (complete data)

 +--------------------------------+
      | 4.5 5.5 6.5 7.5 9.5    N |
      |--------------------------------|
   1. | 0 0 0 0 0 3535 |
   2. | 1 0 0 0 0 529 |
   3. | 1 1 1 1 1 303 |
   4. | 1 1 0 0 0 241 |
   5. | 1 1 1 1 0 199 |
      |--------------------------------|
   6. | 1 1 1 0 0 176 |
   7. | 0 1 0 0 0 138 |
   8. | 0 0 1 0 0 99 |
   9. | 1 0 1 0 0 82 |
  10. | 0 0 0 1 0 69 |
      |--------------------------------|
  11. | 1 0 1 1 0 48 |
  12. | 0 0 0 0 1 43 |
  13. | 1 1 0 1 0 40 |
  14. | 0 1 1 0 0 39 |
  15. | 1 1 1 0 1 29 |
      |--------------------------------|
  16. | 1 0 0 1 0 28 |
 +--------------------------------+

 +--------------------------------+
      | 4.5 5.5 6.5 7.5 9.5    N |
      |--------------------------------|
  17. | 0 0 1 1 0 27 |
  18. | 0 1 1 1 0 25 |
  19. | 1 0 0 0 1 24 |
  20. | 0 1 1 1 1 23 |
      |--------------------------------|
  21. | 1 0 1 1 1 20 |
  22. | 1 1 0 0 1 20 |
  23. | 1 1 0 1 1 20 |
  24. | 0 0 1 1 1 15 |
  25. | 0 0 0 1 1 12 |
      |--------------------------------|
  26. | 0 1 0 1 0 11 |
  27. | 1 0 0 1 1 11 |
  28. | 0 1 1 0 1 8 |
  29. | 0 1 0 1 1 8 |
  30. | 0 1 0 0 1 8 |
      |--------------------------------|
  31. | 1 0 1 0 1 7 |
  32. | 0 0 1 0 1 6 |
 +--------------------------------+

Page 11: Binary Models 1

11

Distribution of patterns with some missingness

 +--------------------------------+
      | 4.5 5.5 6.5 7.5 9.5    N |
      |--------------------------------|
   1. | 0 0 0 0 . 458 |
   2. | 0 . . . . 398 |
   3. | 0 0 0 . . 260 |
   4. | 0 0 . . . 252 |
   5. | 0 0 0 . 0 248 |
      |--------------------------------|
   6. | 1 . . . . 150 |
   7. | 0 0 . 0 0 146 |
   8. | . 0 0 0 0 138 |
   9. | 0 . 0 0 0 125 |
  10. | . . . 0 0 124 |
      |--------------------------------|
  11. | . . . . 0 123 |
  12. | 0 0 . 0 . 109 |
  13. | . . . 0 . 107 |
  14. | 0 . 0 . . 94 |
  15. | . 0 . . . 92 |
      |--------------------------------|

 +--------------------------------+
      | 4.5 5.5 6.5 7.5 9.5    N |
      |--------------------------------|

16. | 1 0 0 0 . 71 |

17. | 0 0 . . 0 70 |

18. | 1 1 1 1 . 66 |

19. | 0 . . . 0 65 |

20. | 1 1 . . . 62 |

|--------------------------------|

21. | 0 . . 0 0 57 |

22. | . . 0 . . 55 |

23. | 1 0 0 . 0 54 |

24. | 0 . 0 0 . 53 |

25. | 1 0 . . . 50 |

|--------------------------------|

Etc.

182. | . . 0 1 0 1 |

183. | 0 0 . 1 1 1 |

+--------------------------------+

Page 12: Binary Models 1

12

Thresholds

• Mplus thinks of binary variables as being a dichotomised continuous latent variable

• The point at which a continuous N(0,1) variable must be cut to create a binary variable is called a threshold

• A binary variable with 50% cases corresponds to a threshold of zero

• A binary variable with 2.5% cases corresponds to a threshold of 1.96
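
These two correspondences can be checked against the standard normal CDF – a small Python sketch, assuming SciPy is available:

from scipy.stats import norm

# Proportion of 'cases' above a given threshold on a N(0,1) variable
for tau in (0.0, 1.96):
    print(tau, 1 - norm.cdf(tau))    # 0.0 -> 0.50, 1.96 -> 0.025

# Going the other way: the threshold that gives a chosen case proportion
print(norm.ppf(1 - 0.025))           # ~1.96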

Page 13: Binary Models 1

13

Thresholds

Figure from Uebersax webpage

Page 14: Binary Models 1

14

Categorical variables

• A categorical variable with n-levels requires n-1 thresholds

• i.e. you need to make n-1 cuts in a continuous N(0,1) variable to make your observed n-level variable
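
For example, a Python sketch (same N(0,1) framing, SciPy assumed) of cutting a standard normal variable into a three-level variable with n − 1 = 2 thresholds; the target category proportions are purely illustrative:

import numpy as np
from scipy.stats import norm

props = [0.5, 0.3, 0.2]          # illustrative proportions for a 3-level variable

# n - 1 = 2 thresholds, placed at the cumulative proportions
cuts = np.cumsum(props)[:-1]     # 0.5, 0.8
thresholds = norm.ppf(cuts)      # cuts on the N(0,1) scale, ~[0.00, 0.84]
print(thresholds)

# Check: the thresholds recover the category proportions
cdf = np.concatenate(([0.0], norm.cdf(thresholds), [1.0]))
print(np.diff(cdf))              # ~[0.5, 0.3, 0.2]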

Page 15: Binary Models 1

15

Degrees of freedom

• 32 possible patterns (missing-data patterns don’t count)

• Each additional class requires:
  – 5 df to estimate the 5 prevalences of wetting within that class (i.e. 5 thresholds)
  – 1 df for an additional cut of the latent variable defining the class distribution

• Hence a 5-class model uses up 5*5 + 4 = 29 degrees of freedom, leaving 32 − 1 − 29 = 2 df to test the model (a worked tally follows below)
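
The parameter count and the df left for testing can be tallied in a couple of lines of Python (a sketch; it reproduces the Mplus output shown later, e.g. 23 free parameters and 8 chi-square df for the 4-class model):

def free_params(n_classes, n_items=5):
    # n_items thresholds per class + (n_classes - 1) class-distribution parameters
    return n_classes * n_items + (n_classes - 1)

def test_df(n_classes, n_items=5):
    # 2**n_items possible patterns; one df is lost because the proportions sum to 1
    return 2 ** n_items - 1 - free_params(n_classes, n_items)

for k in range(1, 6):
    print(k, free_params(k), test_df(k))   # 4 classes -> 23, 8;  5 classes -> 29, 2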

Page 16: Binary Models 1

16

Procedure

• Fit 1–5 class models in turn

• Output = within-class thresholds (prevalences) and the latent class distribution

• Select the preferred model using various criteria:
  – Statistical – measures of fit
  – Ease of interpretation of results
  – Parsimony
  – Face validity
  – Astrology

• Celebrate

Page 17: Binary Models 1

17

How to fit in Mplus – black box way

data:
    file is "bedwetting_5tp.txt";
    listwise is on;

variable:
    names are sex bwt marr m_age parity educ tenure ne_kk ne_km ne_kp ne_kr ne_ku;
    categorical = ne_kk ne_km ne_kp ne_kr ne_ku;
    usevariables are ne_kk ne_km ne_kp ne_kr ne_ku;
    missing are ne_kk ne_km ne_kp ne_kr ne_ku (-9);
    classes = c (4);

analysis:
    type = mixture;
    starts = 1000 500;
    stiterations = 10;
    stscale = 20;

model: %OVERALL%

Is that it????

Page 18: Binary Models 1

18

What you’re actually doing:

model: %OVERALL%
    [c#1 c#2 c#3];

    %c#1%
    [ne_kk$1]; [ne_km$1]; [ne_kp$1]; [ne_kr$1]; [ne_ku$1];

    %c#2%
    [ne_kk$1]; [ne_km$1]; [ne_kp$1]; [ne_kr$1]; [ne_ku$1];

    %c#3%
    [ne_kk$1]; [ne_km$1]; [ne_kp$1]; [ne_kr$1]; [ne_ku$1];

    %c#4%
    [ne_kk$1]; [ne_km$1]; [ne_kp$1]; [ne_kr$1]; [ne_ku$1];

4-class model => 3 estimated parameters for the latent class variable (the [c#1 c#2 c#3] means)

plus 5 thresholds within each class

Page 19: Binary Models 1

19

How many random starts?

• Depends on:
  – Sample size
  – Complexity of the model
    • Number of manifest variables
    • Number of classes

• Aim: the best (highest) loglikelihood should be replicated consistently across the random starts within each run

Page 20: Binary Models 1

20

Success:

Loglikelihood values at local maxima, seeds, and initial stage start numbers:

    -10148.718   987174   1689
    -10148.718   777300   2522
    -10148.718   406118   3827
    -10148.718    51296   3485
    -10148.718   997836   1208
    -10148.718   119680   4434
    -10148.718   338892   1432
    -10148.718   765744   4617
    -10148.718   636396    168
    -10148.718   189568   3651
    -10148.718   469158   1145
    -10148.718    90078   4008
    -10148.718   373592   4396
    -10148.718    73484   4058
    -10148.718   154192   3972
    -10148.718   203018   3813
    -10148.718   785278   1603
    -10148.718   235356   2878
    -10148.718   681680   3557
    -10148.718    92764   2064

Not there yet:

Loglikelihood values at local maxima, seeds, and initial stage start numbers:

    -10153.627    23688   4596
    -10153.678   150818   1050
    -10154.388   584226   4481
    -10155.122   735928    916
    -10155.373   309852   2802
    -10155.437   925994   1386
    -10155.482   370560   3292
    -10155.482   662718    460
    -10155.630   320864   2078
    -10155.833   873488   2965
    -10156.017   212934    568
    -10156.231    98352   3636
    -10156.339    12814   4104
    -10156.497   557806   4321
    -10156.644   134830    780
    -10156.741    80226   3041
    -10156.793   276392   2927
    -10156.819   304762   4712
    -10156.950   468300   4176
    -10157.011    83306   2432

Page 21: Binary Models 1

21

What the output looks like

TESTS OF MODEL FIT

Loglikelihood
    H0 Value                          -10153.129
    H0 Scaling Correction Factor           1.007
      for MLR

Information Criteria
    Number of Free Parameters                 23
    Akaike (AIC)                       20352.258
    Bayesian (BIC)                     20505.737
    Sample-Size Adjusted BIC           20432.649
      (n* = (n + 2) / 24)

Chi-Square Test of Model Fit for the Binary outcomes

    Pearson Chi-Square
      Value                               11.543
      Degrees of Freedom                       8
      P-Value                             0.1728

    Likelihood Ratio Chi-Square
      Value                               11.210
      Degrees of Freedom                       8
      P-Value                             0.1901

Page 22: Binary Models 1

22

Measures of entropy

Average Latent Class Probabilities for Most Likely Latent Class Membership (Row) by Latent Class (Column)

1 2 3 4

1 0.846 0.067 0.040 0.047

2 0.110 0.808 0.037 0.046

3 0.030 0.095 0.875 0.000

4 0.041 0.007 0.000 0.952

Entropy (global) 0.844

Page 23: Binary Models 1

23

Model-based + modal-class latent class distribution

FINAL CLASS COUNTS AND PROPORTIONS FOR THE LATENT CLASSES BASED ON THE ESTIMATED MODEL

    Latent classes
        1        790.04734    0.13521
        2        297.76475    0.05096
        3        511.59879    0.08756
        4       4243.58913    0.72627

CLASSIFICATION OF INDIVIDUALS BASED ON THEIR MOST LIKELY LATENT CLASS MEMBERSHIP

Class Counts and Proportions

    Latent classes
        1         674    0.11535
        2         211    0.03611
        3         545    0.09327
        4        4413    0.75526
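
Both sets of counts come directly from the posterior class probabilities: the model-based counts sum the posteriors over individuals, while the modal-class counts assign each individual wholly to their single most likely class. A small Python sketch with made-up posteriors for three individuals:

import numpy as np

# Invented posterior class probabilities (rows = individuals, columns = classes)
post = np.array([[0.90, 0.05, 0.03, 0.02],
                 [0.20, 0.70, 0.05, 0.05],
                 [0.10, 0.10, 0.10, 0.70]])

print(post.sum(axis=0))                               # model-based (fractional) class counts
print(np.bincount(post.argmax(axis=1), minlength=4))  # modal-class counts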

Page 24: Binary Models 1

24

Model results
                         Estimates       S.E.   Est./S.E.

Latent Class 1

  Thresholds
    NE_KK$1                 -1.540      0.238      -6.475
    NE_KM$1                 -0.891      0.199      -4.471
    NE_KP$1                  0.346      0.121       2.864
    NE_KR$1                  2.500      0.472       5.294
    NE_KU$1                  2.359      0.248       9.527

Latent Class 2

  Thresholds
    NE_KK$1                 -0.294      0.208      -1.415
    NE_KM$1                  0.482      0.429       1.123
    NE_KP$1                 -0.632      0.230      -2.754
    NE_KR$1                 -1.539      0.750      -2.053
    NE_KU$1                  0.501      0.197       2.540

Latent Class 3

  Thresholds
    NE_KK$1                 -3.122      0.544      -5.739
    NE_KM$1                -15.000      0.000       0.000
    NE_KP$1                 -3.172      0.481      -6.593
    NE_KR$1                 -3.224      0.609      -5.294
    NE_KU$1                 -0.563      0.117      -4.800

Latent Class 4

  Thresholds
    NE_KK$1                  2.093      0.073      28.484
    NE_KM$1                  3.698      0.198      18.651
    NE_KP$1                  3.796      0.140      27.200
    NE_KR$1                  4.213      0.180      23.383
    NE_KU$1                  4.420      0.169      26.172

Categorical Latent Variables

  Means
    C#1                     -1.681      0.114     -14.746
    C#2                     -2.657      0.240     -11.092
    C#3                     -2.116      0.090     -23.551

Page 25: Binary Models 1

25

RESULTS IN PROBABILITY SCALE

Latent Class 1
    NE_KK   Category 1    0.177    0.035     5.106
            Category 2    0.823    0.035    23.816
    NE_KM   Category 1    0.291    0.041     7.078
            Category 2    0.709    0.041    17.250
    NE_KP   Category 1    0.586    0.029    19.981
            Category 2    0.414    0.029    14.138
    NE_KR   Category 1    0.924    0.033    27.917
            Category 2    0.076    0.033     2.292
    NE_KU   Category 1    0.914    0.020    46.773
            Category 2    0.086    0.020     4.420

Latent Class 2
    NE_KK   Category 1    0.427    0.051     8.387
            Category 2    0.573    0.051    11.258
    NE_KM   Category 1    0.618    0.101     6.103
            Category 2    0.382    0.101     3.769
    NE_KP   Category 1    0.347    0.052     6.673
            Category 2    0.653    0.052    12.555
    NE_KR   Category 1    0.177    0.109     1.620
            Category 2    0.823    0.109     7.551
    NE_KU   Category 1    0.623    0.046    13.445
            Category 2    0.377    0.046     8.150
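
These probabilities can be recovered from the thresholds on the previous slide. A Python sketch, assuming a logit link (which reproduces the probability-scale output here): P(Category 2) = 1 / (1 + exp(threshold)).

import math

def prob_category2(threshold):
    # P(y = 1), i.e. 'Category 2', under a logit link
    return 1.0 / (1.0 + math.exp(threshold))

# Latent Class 1 thresholds from the model results slide
for name, tau in [("NE_KK", -1.540), ("NE_KM", -0.891), ("NE_KP", 0.346),
                  ("NE_KR", 2.500), ("NE_KU", 2.359)]:
    print(name, round(prob_category2(tau), 3))
# NE_KK -> 0.823, NE_KM -> 0.709, ..., matching the Category 2 values above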

Page 26: Binary Models 1

26

These can be plotted

• In Excel from the probabilities:

Page 27: Binary Models 1

27

… or in Mplus

plot:

type is plot2;

series is anybw_t1 (4.5) anybw_t2 (5.5) anybw_t3 (6.5) anybw_t4 (7.5) anybw_t5 (9.5);

Then

Graph> view graphs> estimated probabilities

Page 28: Binary Models 1

28

Yuck!

Page 29: Binary Models 1

29

Model fit stats

                    1 class    2 class    3 class    4 class    5 class

Estimated params          5         11         17         23         29
Loglikelihood      -13784.5   -10432.6   -10200.2   -10153.1   -10148.7
BLRT p-value        <0.0001    <0.0001    <0.0001    <0.0001       0.17
BIC                 27612.4    20960.6    20548.0    20505.7    20548.9
Entropy                   -      0.889      0.824      0.844      0.804

Page 30: Binary Models 1

30

Model fit stats - BIC

• Bayesian Information Criterion

= -2*Log-likelihood + (# params)*ln(sample size)

• A function of the likelihood which rewards a more parsimonious model

• Decrease followed by an increase as extra classes are added

[Plot: BIC against the number of classes (1–5)]
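
A Python sketch of the BIC calculation, using the loglikelihoods and parameter counts from the fit-statistics table; the analysis sample size n that Mplus used is not stated on the slides, so the value below is only a placeholder:

import math

def bic(loglik, n_params, n):
    return -2 * loglik + n_params * math.log(n)

# loglikelihood and number of free parameters for the 1- to 5-class models
models = {1: (-13784.5, 5), 2: (-10432.6, 11), 3: (-10200.2, 17),
          4: (-10153.1, 23), 5: (-10148.7, 29)}

n = 5584   # placeholder sample size (the boys' dataset seen later); the n Mplus used may differ
for k, (ll, p) in models.items():
    print(k, round(bic(ll, p, n), 1))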

Page 31: Binary Models 1

31

Model fit stats - Entropy

• Measure of the certainty with which patterns (and therefore subjects) are assigned to latent classes.

• Higher values indicate a greater delineation of classes – ideally approaching 1 (Celeux & Soromenho)

• 0.62 would indicate 'fuzziness' (Ramaswamy)

• Thorough job:
  – Examine global entropy
  – Examine class-specific entropy
  – Examine the assignment probabilities for each individual pattern
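
A Python sketch of the relative entropy statistic as usually defined – one minus the total classification entropy scaled by n·ln(K) – with made-up posterior probabilities:

import numpy as np

def relative_entropy(post):
    # post: (n_subjects, n_classes) array of posterior class probabilities
    n, k = post.shape
    p = np.clip(post, 1e-12, 1.0)            # avoid log(0)
    raw = -(p * np.log(p)).sum()             # total classification entropy
    return 1.0 - raw / (n * np.log(k))       # 1 = perfectly clean assignment

post = np.array([[0.95, 0.03, 0.01, 0.01],
                 [0.10, 0.85, 0.03, 0.02],
                 [0.05, 0.05, 0.80, 0.10]])
print(relative_entropy(post))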

Page 32: Binary Models 1

32

Model fit stats - BLRT

• Bootstrap Likelihood Ratio Test

• The traditional Δ-likelihood test cannot be used to compare nested latent class models, because the difference in loglikelihoods is not chi-square distributed

• The BLRT instead estimates the distribution of the difference empirically, providing a p-value for the observed difference that can be used in tests of model fit

Page 33: Binary Models 1

33

How to obtain BLRT

• Fit the k-class model

• Select an optseed which replicates the best (highest) loglikelihood solution for k classes

• Re-fit the k-class model using that optseed, specifying tech14 in the output section and choosing an appropriate number of runs for the bootstrap:

    starts = 100 20;
    lrtstarts = 0 0 250 50;   ! first pair: starts for the k-1 class model, second pair: k class model
    lrtbootstrap = 100;

• Ensure that the optimal k-1 class model has been replicated

Page 34: Binary Models 1

34

BLRT Output

PARAMETRIC BOOTSTRAPPED LIKELIHOOD RATIO TEST FOR 4 (H0) VERSUS 5 CLASSES

    H0 Loglikelihood Value                       -2097.721
    2 Times the Loglikelihood Difference             5.513
    Difference in the Number of Parameters               7
    Approximate P-Value                             0.7600
    Successful Bootstrap Draws                         100

Does this agree with what you obtained for the 4-class run?

The BLRT p-value implies no improvement in fit when moving from 4 to 5 classes.

Page 35: Binary Models 1

35

Model fit stats

                    1 class    2 class    3 class    4 class    5 class

Estimated params          5         11         17         23         29
Loglikelihood      -13784.5   -10432.6   -10200.2   -10153.1   -10148.7
BLRT p-value        <0.0001    <0.0001    <0.0001    <0.0001       0.17
BIC                 27612.4    20960.6    20548.0    20505.7    20548.9
Entropy                   -      0.889      0.824      0.844      0.804

BLRT: the 4-class model is an adequate fit to the data and there is little improvement in fit when a 5th class is added

BIC: attains its minimum value at the 4-class model (penalising the 5-class model for its lack of parsimony)

Entropy: the 2-class model has the highest entropy, but all values are reasonably high

Page 36: Binary Models 1

36

Other considerations - interpretation

[Plots of estimated class profiles: 4-class model and 5-class model]

Page 37: Binary Models 1

37

Final decision

• Both the 4- and 5-class models fit the data; parsimony favours the 4-class model

• Both the 4- and 5-class models make intuitive sense:
  – 4 classes: ‘normative’, ‘delayed’, ‘persistent’, ‘relapse’
  – the 5th class adds ‘severe delay’

• 5 class model is in good agreement with Croudace et al (2004)

• Job’s a good ‘un

Page 38: Binary Models 1

38

Conclusions

• Like EFA, LCA is an exploratory tool with the aim of summarising the variability in the dataset in a simple/interpretable way

• These results do not prove that there are 4 or 5 distinct groups of children in real life.

• LCA will find groupings in the data even if there is no reason to think such groups might exist.

Page 39: Binary Models 1

39

Bring on the covariates

Page 40: Binary Models 1

40

Predicting class membership

• One can strengthen the assertion that subjects can be neatly packaged into little groups if one can show that these groups differ with respect to:
  – Co-morbid conditions
  – Aetiological factors
  – Later outcomes

• Even better if such findings support what is seen in (a) other epidemiological studies, (b) clinical settings

Page 41: Binary Models 1

41

Output from LCA:

• The class distribution

• A set of probabilities giving, for each observed pattern, the probability of assignment to each class:

Pattern Normative Delayed Sev delay Persist Relapse

00000 0.989 0.008 0.001 - 0.001

11111 - 0.0 0.026 0.969 0.005

11000 0.038 0.761 0.01 0.002 0.189

01010 0.105 0.088 0.473 0.014 0.320

Page 42: Binary Models 1

42

Incorporating covariates

• 2-stage method:
  – Export the class probabilities to another package – Stata
  – Model class membership as a multinomial model with probability weighting

• Here we use classes derived from the repeated bedwetting measures with partially missing data (gloss over)

Page 43: Binary Models 1

43

Save data from Mplus

savedata:
    file is "boys_5class_output_completecase.txt";
    save = cprob;

Don’t forget to add the ID variable:

variable: <snip>

idvariable is ID;

Page 44: Binary Models 1

44

Dataset:

 t1  t2  t3  t4  t5     ID    p(c1)   p(c2)   p(c3)   p(c4)   p(c5)   Modal class

1 0 0 0 0 30004 0.224 0 0.001 0 0.775 5

0 0 0 0 0 30031 0.017 0 0 0 0.983 5

0 0 0 1 1 30033 0.177 0 0.48 0 0.342 3

0 0 0 0 0 30042 0.017 0 0 0 0.983 5

0 0 0 0 0 30050 0.017 0 0 0 0.983 5

2 2 2 2 0 30051 0 0 0 1 0 4

0 0 0 0 0 30064 0.017 0 0 0 0.983 5

1 0 0 0 0 30068 0.224 0 0.001 0 0.775 5

0 0 0 0 0 30070 0.017 0 0 0 0.983 5

0 0 0 0 0 30072 0.017 0 0 0 0.983 5

0 0 0 0 0 30095 0.017 0 0 0 0.983 5

2 2 2 0 0 30106 0.001 0.989 0 0.01 0 2

1 0 0 0 0 30110 0.224 0 0.001 0 0.775 5

1 0 0 0 0 30116 0.224 0 0.001 0 0.775 5

0 0 0 0 0 30119 0.017 0 0 0 0.983 5

2 2 1 0 0 30140 0.053 0.943 0.004 0.001 0 2

0 0 0 0 0 30147 0.017 0 0 0 0.983 5

Page 45: Binary Models 1

45

Then what?

• Merge the Mplus output with the covariates using ID

• Mplus will only permit one ID variable
  – ALSPAC has one ID to identify parents but two IDs to identify the kids – the parent ID plus another one
  – Create a composite ID, e.g. 1000.1, 1000.2, and then re-derive the proper IDs before matching

• Read into Stata (or similar)

Page 46: Binary Models 1

46

Reshaping the dataset

• The weighted model requires reshaping the dataset so that each child has n rows (for an n-class model) rather than just one

Page 47: Binary Models 1

47

Pre-shaped – first 20 kids

 |    ID    sex   dev_18   dev_42   pclass1   pclass2   pclass3   pclass4   pclass5   modclass |
 |----------------------------------------------------------------------------------------------|
 | 30004   male        3        .      .001         0      .803         0      .197          3 |
 | 30008   male        2        1      .908         0         0      .007      .085          1 |
 | 30010   male        2        2      .053      .001      .052         0      .894          5 |
 | 30023   male        1        3      .115         0      .596      .001      .288          3 |
 | 30031   male        3        4         0         0      .983         0      .016          3 |
 |----------------------------------------------------------------------------------------------|
 | 30033   male        4        4      .392         0      .397         0      .211          3 |
 | 30042   male        1        3         0         0      .983         0      .016          3 |
 | 30050   male        3        2         0         0      .983         0      .016          3 |
 | 30051   male        2        2         0         0         0         1         0          4 |
 | 30057   male        1        3      .135         0      .002         0      .864          5 |
 |----------------------------------------------------------------------------------------------|
 | 30058   male        1        4         0         0      .958         0      .041          3 |
 | 30064   male        2        4         0         0      .983         0      .016          3 |
 | 30068   male        4        3      .001         0      .803         0      .197          3 |
 | 30070   male        3        4         0         0      .983         0      .016          3 |
 | 30072   male        1        1         0         0      .983         0      .016          3 |
 |----------------------------------------------------------------------------------------------|
 | 30075   male        3        3         0         0      .982         0      .018          3 |
 | 30088   male        3        4       .03      .002      .889      .003      .076          3 |
 | 30095   male        3        .         0         0      .983         0      .016          3 |
 | 30098   male        3        .      .068      .158      .173      .018      .583          5 |
 | 30104   male        4        1      .008         0      .775         0      .217          3 |
 +----------------------------------------------------------------------------------------------+

Page 48: Binary Models 1

48

Pre-shaped – first 20 kids

(Same listing as the previous slide, with the columns annotated:)

covariates Posterior probs Modal class

Page 49: Binary Models 1

49

The reshaping

reshape long pclass, i(id) j(class)

(note: j = 1 2 3 4 5)

Data                                     wide   ->   long
---------------------------------------------------------
Number of obs.                           5584   ->   27920
Number of variables                        66   ->      63
j variable (5 values)                           ->   class
xij variables:
              pclass1 pclass2 ... pclass5      ->   pclass
---------------------------------------------------------

Page 50: Binary Models 1

50

Re-shaped – first 3 kids

 +--------------------------------------------------+
 |    id    sex   dev_18   dev_42   pclass   class  |
 |--------------------------------------------------|
 | 30004   male        3        .     .001       1  |   first kid
 | 30004   male        3        .        0       2  |
 | 30004   male        3        .     .803       3  |
 | 30004   male        3        .        0       4  |
 | 30004   male        3        .     .197       5  |
 |--------------------------------------------------|
 | 30008   male        2        1     .908       1  |   second kid
 | 30008   male        2        1        0       2  |
 | 30008   male        2        1        0       3  |
 | 30008   male        2        1     .007       4  |
 | 30008   male        2        1     .085       5  |
 |--------------------------------------------------|
 | 30010   male        2        2     .053       1  |   third kid
 | 30010   male        2        2     .001       2  |
 | 30010   male        2        2     .052       3  |
 | 30010   male        2        2        0       4  |
 | 30010   male        2        2     .894       5  |
 +--------------------------------------------------+

Page 51: Binary Models 1

51

Multinomial model

log using "temperament.log", replace

foreach var of varlist activity rhythmicity approach ///
        adaptability intensity mood persistence distractibility ///
        threshold {
    xi: mlogit class `var' [iw = pclass], rrr
    test `var'
    xi: mlogit class `var' kz021 [iw = pclass], rrr
    test `var'
    xi: mlogit class `var' kz021 b1 b2 b5 b6 b10 b16 b17 i.wisc_70 [iw = pclass], rrr
}

log close

Page 52: Binary Models 1

52

Multinomial model

(Same code as the previous slide – note the [iw = pclass] option.)

Weight by class membership probabilities

Page 53: Binary Models 1

53

Typical output

Multinomial logistic regression                   Number of obs   =       9535
                                                  LR chi2(4)      =      57.66
                                                  Prob > chi2     =     0.0000
Log likelihood = -9613.0706                       Pseudo R2       =     0.0030

------------------------------------------------------------------------------
       class |      RRR   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
1            |
adaptability |  1.146521   .0431225     3.64   0.000     1.065042    1.234232
-------------+----------------------------------------------------------------
2            |
adaptability |  1.180821   .0643773     3.05   0.002     1.061151    1.313986
-------------+----------------------------------------------------------------
3            |
adaptability |  1.230596   .0459751     5.55   0.000     1.143706    1.324087
-------------+----------------------------------------------------------------
5            |
adaptability |  1.172061   .0421842     4.41   0.000     1.092231    1.257727
------------------------------------------------------------------------------
(class==4 is the base outcome)

 ( 1)  [1]adaptability = 0
 ( 2)  [2]adaptability = 0
 ( 3)  [3]adaptability = 0
 ( 4)  [5]adaptability = 0

           chi2(  4) =   57.12
         Prob > chi2 =   0.0000

Page 54: Binary Models 1

54

Summary

• Can use Latent Class Analysis to summarise the variability in patterns of response to binary longitudinal repeated measures

• ~ LLCA [Longitudinal LCA]

• Can use time invariant covariates to predict class membership in a 2-stage model in Stata using posterior probabilities