1
Graphical Diagnostic Tools for Evaluating Latent Class Models:
An Application to Depression in the ECA Study
Elizabeth S. Garrett
Department of Biostatistics, Johns Hopkins University
2
GOAL
Provide tools for choosing the most appropriate latent class model.
Interpret objective diagnostic methods in reference to the latent class model.
3
Table of Contents
1. Introduction
2. Previous Work
3. Model Estimation
4. Diagnostic Methods for Latent Class Models
5. Extensions to Latent Class Regression
6. Application to the ECA Study
7. Validating Diagnostic Criteria for Depression Using LCM
8. Discussion and Further Research
4
Outline
Depression in relation to the LCM
Approach to Estimation
The ECA Study
Predicted Frequency Check Plot
Latent Class Estimability Display
Interpretation of Findings
Revisions
5
Motivating Question
How should we describe “major depression”?
– not depressed, depressed
– none, moderate, severe
– none, mild, moderate, severe
– none, mood symptoms, somatic symptoms, both
6
How we conceptualize “major depression”
We use indicators of symptoms such as self-reported presence of sadness, weight change, etc.
A combination of these indicators is thought to define depression.
Using these combinations, we commonly seek to categorize individuals into depression classes.
These classes represent the construct “depression.” “Depression” is a latent variable.
The construct “depression” can then be used for classification, description, and prediction.
7
Depression in the Diagnostic and Statistical Manual of Mental Disorders, 3rd Edition
DSM-III Criteria (generally):
A. Dysphoria for 2 or more weeks
B. Reported symptoms in 4 or more of the following symptom groups:
1. loss of appetite, weight change
2. insomnia, hypersomnia
3. retarded movement, restlessness
4. disinterest in sex
5. fatigue
6. feelings of guilt or worthlessness
7. trouble concentrating, thoughts slow or mixed
8. morbid thoughts, suicidal thoughts/attempts
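The counting rule above can be written as a small function. This is a minimal sketch, assuming each respondent is represented by a dysphoria flag plus eight 0/1 symptom-group indicators; the function name and argument layout are illustrative and are not taken from the DSM-III text or the interview schedule.

```python
def meets_dsm3_criteria(dysphoria, groups, min_groups=4):
    """Return True if criterion A (dysphoria) holds and at least
    `min_groups` of the eight symptom groups are reported (criterion B)."""
    if len(groups) != 8:
        raise ValueError("expected indicators for 8 symptom groups")
    return bool(dysphoria) and sum(1 for g in groups if g) >= min_groups

# Dysphoria plus symptoms in groups 1, 2, 5, and 6.
print(meets_dsm3_criteria(1, [1, 1, 0, 0, 1, 1, 0, 0]))  # True
# Same four symptom groups, but no dysphoria.
print(meets_dsm3_criteria(0, [1, 1, 0, 0, 1, 1, 0, 0]))  # False
```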
Latent Class Model: Main Ideas
There are M classes of depression (e.g. none, mild, severe). π_m represents the proportion of individuals in the population in class m (m = 1,…,M).
Each person is a member of one of the M classes, but we do not know which. The latent class of individual i is denoted by η_i.
Symptom prevalences vary by class. The prevalence for symptom j in class m is denoted by p_mj.
Given class membership, the symptoms are independent.
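Taken together, these ideas imply that the probability of a symptom pattern is a mixture over classes of products of per-symptom probabilities. A minimal sketch follows; the function name is illustrative, and the two-class, three-symptom numbers are made up for the example and are not ECA estimates.

```python
def pattern_probability(pattern, class_sizes, prevalences):
    """P(pattern) = sum over classes m of
    class_size_m * prod_j p_mj^y_j * (1 - p_mj)^(1 - y_j)."""
    total = 0.0
    for size_m, p_m in zip(class_sizes, prevalences):
        lik = size_m
        for y_j, p_mj in zip(pattern, p_m):
            # Conditional independence: multiply per-symptom probabilities.
            lik *= p_mj if y_j == "1" else 1.0 - p_mj
        total += lik
    return total

# Hypothetical model: 2 classes, 3 symptoms.
class_sizes = [0.9, 0.1]
prevalences = [[0.05, 0.10, 0.05],   # class 1
               [0.60, 0.70, 0.50]]   # class 2

print(round(pattern_probability("000", class_sizes, prevalences), 4))
```

Summing the function over all possible patterns returns 1 whenever the class sizes sum to 1, which is a quick sanity check on any set of estimates.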
9
Latent Class Model
[Diagram: π → η_i → y_i ← p]
M: number of classes
p_η_i: vector of symptom probabilities given latent class η_i
π_m: probability of being in latent class m, m = 1,…,M
η_i: the true latent class of individual i
y_i: vector of individual i’s report of symptoms
10
Estimation Approach
Bayesian Approach:
Quantify beliefs about p, π, and η before and after observing data.
Bayesian Terminology:
Prior Probability: What we believe about unknown parameters before observing data.
Posterior Probability: What we believe about the parameters after observing data.
11
Bayesian Estimation Approach
We estimated the models using a Markov chain Monte Carlo (MCMC) algorithm:
1. Specify the prior probability distribution:
P(p, π, η)
2. Combine the prior with the likelihood to obtain the posterior distribution:
P(p, π, η | Y) ∝ P(p, π, η) × L(Y | p, π, η)
3. Estimate the posterior distribution for each parameter using an iterative procedure, e.g.
P(π_1 | Y) = ∫ P(p, π, η | Y), integrating over all parameters other than π_1
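The three steps can be illustrated with a Gibbs sampler, one common MCMC scheme for latent class models. This is a sketch under stated assumptions, not the algorithm used in the talk (which is not shown): it assumes a flat Dirichlet(1,…,1) prior on the class sizes and independent Beta(1, 1) priors on the symptom prevalences, and the function name `gibbs_lcm` is hypothetical. It does not address label switching or convergence checks.

```python
import random

def gibbs_lcm(Y, M, n_iter=200, seed=0):
    """Gibbs sampler sketch for a latent class model with binary items.
    Y is a list of 0/1 response vectors; returns a list of (pi, p) draws."""
    rng = random.Random(seed)
    J = len(Y[0])
    pi = [1.0 / M] * M
    p = [[rng.uniform(0.2, 0.8) for _ in range(J)] for _ in range(M)]
    draws = []
    for _ in range(n_iter):
        counts = [0] * M                    # individuals per class
        ones = [[0] * J for _ in range(M)]  # positive reports per class and item
        # Step 1: sample each individual's class given pi and p.
        for y in Y:
            weights = []
            for m in range(M):
                lik = pi[m]
                for j in range(J):
                    lik *= p[m][j] if y[j] else 1.0 - p[m][j]
                weights.append(lik)
            m = rng.choices(range(M), weights=weights)[0]
            counts[m] += 1
            for j in range(J):
                ones[m][j] += y[j]
        # Step 2: class sizes given memberships ~ Dirichlet(1 + counts),
        # drawn as normalized Gamma variates.
        g = [rng.gammavariate(1.0 + c, 1.0) for c in counts]
        pi = [x / sum(g) for x in g]
        # Step 3: prevalences given memberships ~ Beta(1 + ones, 1 + zeros).
        p = [[rng.betavariate(1.0 + ones[m][j], 1.0 + counts[m] - ones[m][j])
              for j in range(J)] for m in range(M)]
        draws.append((pi, p))
    return draws

# Tiny simulated data set: 30 high-symptom and 90 low-symptom individuals.
sim = random.Random(1)
Y = [[1 if sim.random() < (0.8 if i < 30 else 0.1) else 0 for _ in range(4)]
     for i in range(120)]
draws = gibbs_lcm(Y, M=2, n_iter=50, seed=2)
print(len(draws))  # 50
```

The marginal posterior of a single parameter, such as the size of class 1, is then approximated by the collection of its sampled values across iterations, after discarding an initial burn-in.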
12
The Epidemiologic Catchment Area Study
3481 community-dwelling individuals in Baltimore were interviewed using the NIMH Diagnostic Interview Schedule.
8 self-reported symptom groups were completed for 2938 individuals*.
6 month prevalence of symptoms was assessed.
Symptom group                                                          prevalence
A.        dysphoria                                                    0.12
B.
Group 1   lost appetite, lost weight, weight gain                      0.06
Group 2   insomnia, hypersomnia                                        0.11
Group 3   retarded movement, restlessness                              0.14
Group 4   disinterest in sex                                           0.07
Group 5   fatigue                                                      0.04
Group 6   guilt/worthless                                              0.09
Group 7   trouble concentrating, thoughts slow or mixed                0.04
Group 8   thoughts of death, wanted to die, suicidal thoughts,
          suicide attempts                                             0.06
* Those with organic brain disorder were omitted as per DSM-III criterion.
13
The Epidemiologic Catchment Area Study
               2 Class Model    3 Class Model            4 Class Model
               Class1  Class2   Class1  Class2  Class3   Class1  Class2  Class3  Class4
class size     0.88    0.12     0.82    0.14    0.02     0.83    0.12    0.04    0.03
weight         0.02    0.42     0.01    0.24    0.77     0.01    0.33    0.21    0.75
sleep          0.06    0.48     0.05    0.36    0.68     0.05    0.33    0.42    0.70
movement       0.07    0.63     0.05    0.50    0.80     0.05    0.49    0.58    0.80
sex            0.02    0.42     0.01    0.25    0.81     0.01    0.12    0.17    0.81
fatigue        0.01    0.20     0.008   0.12    0.36     0.009   0.01    0.19    0.35
guilt          0.04    0.48     0.03    0.35    0.78     0.03    0.20    0.51    0.76
concentration  0.005   0.28     0.004   0.12    0.61     0.003   0.04    0.05    0.65
morbid         0.02    0.40     0.01    0.22    0.80     0.01    0.05    0.11    0.80
dysphoria      0.06    0.51     0.05    0.40    0.77     0.05    0.61    0.23    0.79
14
Predicted Frequency Check (PFC) Plot
Compare observed symptom pattern frequencies to what the model predicts for a new sample of data from the same population.
Symptom patterns:
» 000000000 no reported symptoms
» 000000001 report dysphoria only
» 111111111 report all symptoms
2^9 = 512 possible patterns
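Tabulating observed pattern frequencies is the first step of the check. A minimal sketch, assuming responses are stored as rows of nine 0/1 indicators; the three example rows are hypothetical and are not ECA records.

```python
from collections import Counter
from itertools import product

N_SYMPTOMS = 9

# Enumerate every possible 0/1 pattern of length 9.
all_patterns = ["".join(bits) for bits in product("01", repeat=N_SYMPTOMS)]
print(len(all_patterns))  # 512

def pattern_counts(responses):
    """Count how often each 0/1 response pattern occurs."""
    return Counter("".join(str(int(v)) for v in row) for row in responses)

# Hypothetical responses for three individuals.
responses = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]
counts = pattern_counts(responses)
print(counts.most_common())
```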
15
Example:
Pattern 001000001:
» restlessness/retarded movement
» dysphoria
We observed 24 individuals with this symptom pattern:
X_001000001 = 24
16
Example:
95% confidence interval for frequency?
Non-parametric (saturated model) estimate:
p_001000001 = 24 / 2938 = 0.008
Observed frequency: 24, with 95% confidence interval [15, 34]
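One way to obtain an interval of this kind is to treat the count as Binomial(2938, p), with p fixed at the observed proportion, and read off the 2.5% and 97.5% quantiles. That choice is an assumption made here for illustration; the slide does not state how its interval was computed. The helper names are hypothetical.

```python
import math

def binom_logpmf(k, n, p):
    """Log of the Binomial(n, p) probability mass at k."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1.0 - p))

def binom_quantile(q, n, p):
    """Smallest k with P(X <= k) >= q for X ~ Binomial(n, p)."""
    cdf = 0.0
    for k in range(n + 1):
        cdf += math.exp(binom_logpmf(k, n, p))
        if cdf >= q:
            return k
    return n

n, observed = 2938, 24
p_hat = observed / n
lower = binom_quantile(0.025, n, p_hat)
upper = binom_quantile(0.975, n, p_hat)
print(round(p_hat, 3), lower, upper)
```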
17
Model Based Estimation
Predicted frequency of pattern 001000001 and prediction interval in the 3 class model:
P(X_001000001 = x | Y)
[Figure: distribution of the predicted frequency x. Predicted frequency 18, with 2.5% and 97.5% limits at 9 and 28.]
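A rough version of this calculation can be reproduced from the 3 class estimates printed in the class-estimate table. The sketch below is a plug-in approximation, not the posterior predictive calculation itself: it fixes the parameters at the printed point estimates instead of averaging over posterior draws. It also assumes the digits of a pattern follow the table's row order (weight, sleep, movement, sex, fatigue, guilt, concentration, morbid, dysphoria). Function names are hypothetical.

```python
import random

# 3 class model estimates as printed on the slide (class sizes sum to 0.98).
CLASS_SIZES = [0.82, 0.14, 0.02]
PREVALENCES = [
    [0.01, 0.05, 0.05, 0.01, 0.008, 0.03, 0.004, 0.01, 0.05],  # class 1
    [0.24, 0.36, 0.50, 0.25, 0.12, 0.35, 0.12, 0.22, 0.40],    # class 2
    [0.77, 0.68, 0.80, 0.81, 0.36, 0.78, 0.61, 0.80, 0.77],    # class 3
]
N = 2938

def pattern_probability(pattern, class_sizes, prevalences):
    """Mixture probability of a 0/1 pattern under conditional independence."""
    total = 0.0
    for size_m, p_m in zip(class_sizes, prevalences):
        lik = size_m
        for y_j, p_mj in zip(pattern, p_m):
            lik *= p_mj if y_j == "1" else 1.0 - p_mj
        total += lik
    return total

def plug_in_interval(prob, n, n_sims=400, seed=0):
    """Simulate Binomial(n, prob) counts; return approximate 2.5% and 97.5% quantiles."""
    rng = random.Random(seed)
    sims = sorted(sum(rng.random() < prob for _ in range(n))
                  for _ in range(n_sims))
    return sims[int(0.025 * n_sims)], sims[int(0.975 * n_sims)]

prob = pattern_probability("001000001", CLASS_SIZES, PREVALENCES)
expected = N * prob
low, high = plug_in_interval(prob, N)
print(round(expected, 1), low, high)
```

With the printed estimates the plug-in expectation comes out at roughly 17, close to the slide's 18. Because it ignores uncertainty in the parameters, the plug-in interval should be somewhat narrower than a full posterior predictive interval.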
18
Model Based Estimation
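Comparison of model based prediction interval to empirical confidence interval:
[Figure: prediction intervals from the 2 class, 3 class, and 4 class models shown against the observed frequency and its empirical interval. Marked values: 9, 18, 28 (3 class model prediction) and 15, 24, 34 (observed frequency with empirical interval). Legend: 2.5%, 97.5%, observed.]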
19
Predicted Frequency Check Plot
[Figure: Predicted Frequency Check plot. x-axis: pattern (in order of prevalence), each labeled with its observed N, from N = 1982 for pattern 000000000 down to N = 5. Plotted for each pattern: observed frequency with 2.5% and 97.5% prediction limits.]
20
Predicted Frequency Check Plot
[Figure: Predicted Frequency Check plot, continued for less prevalent patterns. x-axis: pattern (in order of prevalence), each labeled with its observed N, from N = 4 down to N = 2. Plotted for each pattern: observed frequency with 2.5% and 97.5% prediction limits.]
21
Latent Class Estimability Display (LCED)
Is there enough data to estimate all of the parameters in the model?
» 2 class model: 19 parameters
» 3 class model: 29 parameters
» 4 class model: 39 parameters
Problems arise when:
» small data set
» small class size
e.g. N = 1000 and class size = 0.01 ⇒ 10 individuals in class to estimate symptom prevalences
» small data set and small class size
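The parameter counts follow from nine symptom prevalences per class plus the free class sizes (one fewer than the number of classes, since they sum to 1). A short sketch with hypothetical helper names:

```python
def n_parameters(n_classes, n_items=9):
    """Item prevalences (n_classes * n_items) plus free class sizes (n_classes - 1)."""
    return n_classes * n_items + (n_classes - 1)

for m in (2, 3, 4):
    print(m, n_parameters(m))  # 19, 29, 39

def expected_class_count(n, class_size):
    """Expected number of individuals in a class of the given proportion."""
    return n * class_size

print(expected_class_count(1000, 0.01))  # about 10 individuals
```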
22
Weak “Identifiability” (Weak Estimability)
Definition: A parameter in a (Bayesian) model is weakly identified if the posterior distribution of the parameter is approximately the same as the prior.
P(π_1) ≈ P(π_1 | Y)
If a model is weakly identified it is still “valid”, but we cannot make inferences from the data about the weakly identified parameters.
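A simplified conjugate example shows how a posterior can stay close to its prior when a class contains few individuals. The sketch assumes class membership is known (which it is not in a latent class model) and a Beta(1, 1) prior on one symptom prevalence; both are assumptions for illustration. The ratio printed is an ad hoc summary and is not a statistic from the talk.

```python
import math

def beta_sd(a, b):
    """Standard deviation of a Beta(a, b) distribution."""
    return math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))

def posterior_sd(n_in_class, n_with_symptom, prior_a=1.0, prior_b=1.0):
    """Posterior sd of a prevalence after observing n_with_symptom
    positives among n_in_class individuals (Beta-Binomial update)."""
    return beta_sd(prior_a + n_with_symptom,
                   prior_b + n_in_class - n_with_symptom)

prior = beta_sd(1.0, 1.0)
for n, k in [(0, 0), (10, 3), (1000, 300)]:
    # A ratio near 1 means the data barely changed the prior.
    print(n, round(posterior_sd(n, k) / prior, 2))
```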
24
Latent Class Estimability Display
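[Figure: Latent Class Estimability Display. Rows: items 1 to 9 and class size. Columns: each class of the 2 class, 3 class, and 4 class models.]
       2 Class Model      3 Class Model               4 Class Model
       Class 1  Class 2   Class 1  Class 2  Class 3   Class 1  Class 2  Class 3  Class 4
Tau    0.02     0.16      0.01     0.16     0.33      0.02     0.35     0.39     0.35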
25
               none    mild    severe
class size     0.82    0.14    0.02
weight         0.01    0.24    0.77
sleep          0.05    0.36    0.68
movement       0.05    0.50    0.80
sex            0.01    0.25    0.81
fatigue        0.008   0.12    0.36
guilt          0.03    0.35    0.78
concentration  0.004   0.12    0.61
morbid         0.01    0.22    0.80
dysphoria      0.05    0.40    0.77
Interpretation
Depression appears to be ‘dimensional’:
» none
» mild
» severe
2% of population is in severe class
14% in mild class: are they depressed or not?
How does this compare to the DSM-III definition?
26
Work Not Included in Talk
MCMC Algorithm
Log Odds Ratio Check Plot
Predicted Class Assignment Display
Extensions to Regression