Paul Biemer, UNC and RTI
Bac Tran, US Census Bureau
Jane Zavisca, University of Arizona
SAMSI Conference, 11/10/2005
Latent Class Analysis of Rotation Group Bias: The Case of Unemployment
Overview
Motivation: To understand measurement error in the official unemployment rate
Method: Latent Class Analysis: measurement error as classification error
Distinction from previous research: Focus on measurement error mechanisms, as opposed to correcting marginal estimates.
Ultimate goal: To improve survey design.
The Official Unemployment Rate
Source: The Current Population Survey, 2004
Employment Status (share of population): Employed 62.4%, Unemployed 3.6%, Not in Labor Force 34.0%
Unemployment Rate (share of those in the labor force): Employed 94.5%, Unemployed 5.5%
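The step from the employment-status shares to the unemployment rate is a simple re-basing on the labor force. A minimal sketch of that arithmetic, using the shares quoted on this slide (the function and variable names are illustrative, not from the talk):

```python
# Population shares by employment status (percent), as quoted on the slide.
shares = {"employed": 62.4, "unemployed": 3.6, "nilf": 34.0}


def unemployment_rate(shares):
    """Unemployed as a percentage of the labor force (employed + unemployed)."""
    labor_force = shares["employed"] + shares["unemployed"]
    return 100.0 * shares["unemployed"] / labor_force


rate = unemployment_rate(shares)
print(f"Unemployment rate: {rate:.1f}%")
```

Persons not in the labor force drop out of the denominator, which is why misclassification between "unemployed" and "NILF" matters so much for the published rate.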
The Official Unemployment Rate
Categories
• Employed: worked at least one hour in previous week, or temporarily absent from job.
• Unemployed: not employed and “actively” looking for work (unprompted categories), or temporarily laid off.
• Not in Labor Force (NILF): All others.
Evidence for Measurement Error in Labor Force Status (LFS) in the CPS
1. Re-interview inconsistency
2. Rotation group bias
Re-interview Inconsistency
1% random sample of original sample of ≈ 50,000 households is re-interviewed monthly (without replacement).
Re-interview occurs in same week as the original interview.
Inconsistent responses suggest measurement error.
Re-interview Inconsistency (2001-2003)
8.9% of cases are inconsistently classified.
Percent of all cases (rows: First Interview; columns: Reinterview)
First Interview   Empl.   Unempl.   NILF    All
Empl.             58.2     0.4       4.2    62.7
Unempl.            0.5     1.9       1.0     3.4
NILF               2.0     0.8      31.0    33.9
All               60.7     3.1      36.2   100.0
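The 8.9% figure is the share of cases falling off the diagonal of the interview-by-reinterview table. A short sketch of that calculation, using the cell percentages from this slide (names are illustrative):

```python
# Cell percentages from the interview-by-reinterview cross-classification.
# Order of categories in both dimensions: Empl., Unempl., NILF.
table = [
    [58.2, 0.4, 4.2],
    [0.5, 1.9, 1.0],
    [2.0, 0.8, 31.0],
]


def off_diagonal_share(table):
    """Sum of all cells where the two classifications disagree."""
    size = len(table)
    return sum(table[i][j] for i in range(size) for j in range(size) if i != j)


inconsistent = off_diagonal_share(table)
consistent = sum(table[i][i] for i in range(len(table)))
print(f"Inconsistent: {inconsistent:.1f}%  Consistent: {consistent:.1f}%")
```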
Unemployment Inconsistency (2001-2003)
Row percentages (rows: First Interview; columns: Reinterview)
First Interview   Empl.   Unempl.   NILF   All
Empl.             92.6     0.6       6.7   100
Unempl.           13.8    56.4      29.8   100
NILF               6.0     2.4      91.6   100
Rotation Group Design
[Figure: rotation chart. Rows are rotation group begin dates (Oct-01 through Apr-03), columns are sample months (January 2003 through July 2004), and each cell gives the group's month-in-sample (1-8).]
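The chart follows the CPS rotation pattern: a group is interviewed for four consecutive months, rests for eight, then returns for four more. A minimal sketch of that mapping, assuming dates are given as (year, month) pairs; the function name is hypothetical:

```python
def month_in_sample(begin, current):
    """Return the month-in-sample (1-8) for a rotation group, or None if the
    group is not interviewed in `current`. Both arguments are (year, month)."""
    elapsed = (current[0] - begin[0]) * 12 + (current[1] - begin[1])
    if 0 <= elapsed <= 3:       # first four months in sample
        return elapsed + 1
    if 12 <= elapsed <= 15:     # returns after eight months out
        return elapsed - 7
    return None


# Rows of the chart checked against the rule.
print(month_in_sample((2001, 10), (2003, 1)))  # group beginning Oct-01, in Jan 2003
print(month_in_sample((2003, 1), (2004, 1)))   # group beginning Jan-03, in Jan 2004
print(month_in_sample((2003, 1), (2003, 6)))   # same group during its rest period
```

Under this rule every calendar month contains eight rotation groups, one at each month-in-sample, which is what makes month-by-month comparisons of the unemployment rate across groups possible.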
Rotation Group Bias (2002 Full CPS)
[Figure: unemployment rate (vertical axis, 5 to 7) by month-in-sample (horizontal axis, 1 to 8).]
What Could Cause Rotation Group Bias?
• Non-response bias: rotation groups may represent different populations.
• Differences in interview setting
  - telephone vs. face-to-face
  - proxy vs. self
• Time in sample effect
  - Improved understanding of questionnaire
  - Embarrassment at admitting prolonged unemployment
  - Interview changes behavior
Latent Class Analysis to Test Hypotheses
Sources of Rotation Group Bias
• Non-response bias (different populations): Does latent employment status vary by rotation group?
• Measurement error: Does rotation group influence error rates?
• Differences in setting:
  - Does interview mode (telephone vs. face-to-face) at the initial interview influence error rates?
  - Does interview mode account for apparent rotation group effects on error rates?
• Social pressure:
  - Gender influences latent employment status.
  - Does gender also influence error rates?
  - Does the effect of rotation group vary by gender?
Correlation between Month-in-Sample and Interview Mode
[Figure: bar chart of interview mode (In Person vs. By Phone) by month in sample (1, 2-4, 5, 6-8). Plotted values as extracted: 20, 88, 40, 88 and 80, 12, 60, 12.]
Re-interview Data Set
• N = 24,297 (unweighted data)
• X = True Labor Force Status (latent variable)
• A = Observed Labor Force Status at Initial Interview
• B = Observed Labor Force Status at Time 2 (Reinterview)
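Basic Latent Class Model
[Path diagram: latent variable X with observed indicators A and B.]
$$\pi^{ABX}_{ijt} = \pi^{X}_{t}\,\pi^{A|X}_{it}\,\pi^{B|X}_{jt}$$
$$\ln\left(f^{ABX}_{ijt}\right) = \lambda + \lambda^{X}_{t} + \lambda^{A}_{i} + \lambda^{B}_{j} + \lambda^{AX}_{it} + \lambda^{BX}_{jt}$$
Shorthand: X, A|X, B|X
(with usual constraints for identifiability)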
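To make the X, A|X, B|X model concrete, the sketch below computes the expected interview-by-reinterview table by summing over the latent classes. All parameter values are hypothetical and chosen only for illustration; they are not estimates from the talk.

```python
STATES = ("E", "U", "N")  # employed, unemployed, not in labor force

# Hypothetical latent class prevalences P(X = t); illustrative only.
prevalence = {"E": 0.62, "U": 0.04, "N": 0.34}

# Hypothetical classification probabilities P(observed = i | X = t),
# used here for both the interview (A) and the reinterview (B).
classification = {
    "E": {"E": 0.97, "U": 0.01, "N": 0.02},
    "U": {"E": 0.10, "U": 0.75, "N": 0.15},
    "N": {"E": 0.03, "U": 0.02, "N": 0.95},
}


def joint_ab(prevalence, p_a, p_b):
    """P(A = i, B = j) = sum over t of P(X = t) P(A = i | X = t) P(B = j | X = t)."""
    return {
        (i, j): sum(prevalence[t] * p_a[t][i] * p_b[t][j] for t in STATES)
        for i in STATES
        for j in STATES
    }


table = joint_ab(prevalence, classification, classification)
inconsistent = sum(p for (i, j), p in table.items() if i != j)

# Parameter count for an unrestricted model with three classes and two
# three-category indicators, versus the degrees of freedom in a 3x3 table.
k = len(STATES)
free_parameters = (k - 1) + 2 * k * (k - 1)
table_df = k * k - 1

print(f"Expected inconsistency: {inconsistent:.3f}")
print(f"Free parameters: {free_parameters}, table df: {table_df}")
```

The parameter count exceeds the degrees of freedom in the 3x3 table, so an unrestricted three-class model cannot be estimated from A and B alone; additional variables or restrictions are needed.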
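Grouping Variable
[Path diagram: grouping variable S, latent variable X, observed indicators A and B.]
$$\pi^{ABXS}_{ijts} = \pi^{S}_{s}\,\pi^{X|S}_{ts}\,\pi^{A|X}_{it}\,\pi^{B|X}_{jt}$$
Shorthand: S, X|S, A|X, B|X
$$\ln\left(f^{ABXS}_{ijts}\right) = \lambda + \lambda^{S}_{s} + \lambda^{X}_{t} + \lambda^{A}_{i} + \lambda^{B}_{j} + \lambda^{XS}_{ts} + \lambda^{AX}_{it} + \lambda^{BX}_{jt}$$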
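In the S, X|S, A|X, B|X model the grouping variable shifts the latent distribution but leaves the classification probabilities untouched. A small sketch of what that implies for the observed distribution in each group, using hypothetical values throughout (group labels and numbers are assumptions for illustration):

```python
STATES = ("E", "U", "N")

# Hypothetical latent distributions P(X = t | S = s) for two groups.
prevalence_by_group = {
    "men": {"E": 0.68, "U": 0.04, "N": 0.28},
    "women": {"E": 0.57, "U": 0.03, "N": 0.40},
}

# Hypothetical classification probabilities P(A = i | X = t), shared by
# both groups because S acts only on X in this model.
classification = {
    "E": {"E": 0.97, "U": 0.01, "N": 0.02},
    "U": {"E": 0.10, "U": 0.75, "N": 0.15},
    "N": {"E": 0.03, "U": 0.02, "N": 0.95},
}


def observed_distribution(prevalence, classification):
    """P(A = i | S = s) = sum over t of P(X = t | S = s) P(A = i | X = t)."""
    return {
        i: sum(prevalence[t] * classification[t][i] for t in STATES)
        for i in STATES
    }


observed = {
    group: observed_distribution(prev, classification)
    for group, prev in prevalence_by_group.items()
}
for group, dist in observed.items():
    print(group, {state: round(p, 4) for state, p in dist.items()})
```

Any difference between groups in the observed distribution here comes entirely from the latent distribution, which is the pattern the non-response bias hypothesis would predict.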
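External Variable influencing Classification Error
[Path diagram: grouping variable S, external variable M, latent variable X, observed indicators A and B.]
$$\pi^{ABXSM}_{ijtsm} = \pi^{SM}_{sm}\,\pi^{X|S}_{ts}\,\pi^{A|XM}_{itm}\,\pi^{B|XM}_{jtm}$$
Shorthand: SM, X|S, A|XM, B|XM
$$\ln\left(f^{ABXSM}_{ijtsm}\right) = \lambda + \lambda^{S}_{s} + \lambda^{M}_{m} + \lambda^{X}_{t} + \lambda^{A}_{i} + \lambda^{B}_{j} + \lambda^{SM}_{sm} + \lambda^{XS}_{ts} + \lambda^{AX}_{it} + \lambda^{BX}_{jt} + \lambda^{AM}_{im} + \lambda^{BM}_{jm} + \lambda^{AXM}_{itm} + \lambda^{BXM}_{jtm}$$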
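When M acts on the classification probabilities rather than on X, two groups with the same true labor force distribution can still show different observed unemployment rates. The sketch below illustrates that mechanism with hypothetical numbers; none of the values are estimates from the talk.

```python
STATES = ("E", "U", "N")

# Hypothetical latent distribution, identical in both month-in-sample groups.
latent = {"E": 0.62, "U": 0.04, "N": 0.34}

# Hypothetical P(A = i | X = t, M = m): only the row for the truly
# unemployed differs between the two groups.
classification_by_m = {
    "MIS 1,5": {
        "E": {"E": 0.97, "U": 0.01, "N": 0.02},
        "U": {"E": 0.08, "U": 0.85, "N": 0.07},
        "N": {"E": 0.03, "U": 0.02, "N": 0.95},
    },
    "MIS 2-4,6-8": {
        "E": {"E": 0.97, "U": 0.01, "N": 0.02},
        "U": {"E": 0.14, "U": 0.72, "N": 0.14},
        "N": {"E": 0.03, "U": 0.02, "N": 0.95},
    },
}


def observed_unemployment_rate(latent, classification):
    """Observed unemployed as a share of the observed labor force."""
    observed = {
        i: sum(latent[t] * classification[t][i] for t in STATES)
        for i in STATES
    }
    return observed["U"] / (observed["E"] + observed["U"])


true_rate = latent["U"] / (latent["E"] + latent["U"])
rates = {
    m: observed_unemployment_rate(latent, c)
    for m, c in classification_by_m.items()
}
print(f"True rate: {true_rate:.4f}")
for m, r in rates.items():
    print(f"{m}: observed rate {r:.4f}")
```

The observed rates differ across M even though the latent distribution is held fixed, which is how a measurement error explanation of rotation group bias differs from a non-response explanation.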
Grouping versus External Variables
[Path diagram: nodes S, M, X, A, B.]
Shorthand: SM, X|S, A|XMS {AXM AXS}, B|XMS {BXM BXS}
Covariates
• S = Gender. Men: 47%, Women: 52%
• M = Month in Sample. 1 or 5: 28%; 2-4, 6-8: 72%
• T = Interview Mode (Initial Interview). Telephone: 72%, In Person: 18%
Statistical Power & Identifiability Issues
Cell counts (rows: First Interview; columns: Reinterview)
First Interview   Empl.   Unempl.   NILF    All
Empl.             13939      87     1007   15033
Unempl.             113     459      249     821
NILF                500     204     7739    8443
All               14552     750     8995   24297
• Large total N, but relatively small N for unemployed.
• More variables means more identifiable models, but also diminishing cell counts and boundary solutions.
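The sparseness concern can be put in rough numbers. The sketch below counts the cells in the A*B*M*T*S classification table, assuming three labor force categories for A and B and two levels each for M, T, and S (as listed on the Covariates slide), and shows how thin the smallest A-by-B cell becomes once it is spread over the covariate patterns. The even split is a simplifying assumption for illustration only.

```python
import math

# Unweighted interview-by-reinterview counts from this slide.
counts = {
    "Empl.": {"Empl.": 13939, "Unempl.": 87, "NILF": 1007},
    "Unempl.": {"Empl.": 113, "Unempl.": 459, "NILF": 249},
    "NILF": {"Empl.": 500, "Unempl.": 204, "NILF": 7739},
}

n_total = sum(sum(row.values()) for row in counts.values())

# Number of categories per variable in the classification table.
levels = {"A": 3, "B": 3, "M": 2, "T": 2, "S": 2}
n_cells = math.prod(levels.values())
covariate_patterns = levels["M"] * levels["T"] * levels["S"]

smallest_cell = min(c for row in counts.values() for c in row.values())
# Average count per covariate pattern if the smallest cell were split evenly.
avg_per_pattern = smallest_cell / covariate_patterns

print(f"N = {n_total}, cells = {n_cells}")
print(f"Smallest A-by-B cell: {smallest_cell}, about {avg_per_pattern:.1f} per covariate pattern")
```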
Principles of Model Construction
• Always include X|S, A|X, B|X
• Assume 3 latent classes & S as grouping variable
• Fit classification table of A*B*M*T*S.
• Vary following effects:
  - M as grouping variable
  - M &/or T affecting classification error for A & B
  - T affecting A but not B
  - S affecting A & B when identifiable based on other restrictions (including interaction of M & S)
Principles of Model Construction
• Try equality constraints
  - Equal influence of M &/or S on error rate for A & B.
  - Error rate for T at time A = error rate at time B (when T does not affect B).
Principles of Model Selection
• Limit search to theoretically plausible models.
• Limit search to identifiable models.
• Overall model fit:
  - P-value of likelihood ratio test vs. saturated model > .01
  - Dissimilarity index < .05
• Model selection among those meeting above criteria:
  - Bayesian information criterion (BIC)
  - Likelihood ratio test for nested models
• Check substantive interpretation within set of possible best models.
Best-Fitting Models
Model   Group      Effects on Classification                 df   L2     pval   BIC    dissim
1       X|S        A|XM; A|XT; B|XM; B|XT                    24   38.0   0.04   -204   0.007
2       X|S X|M    A|XM; A|XT; B|XM; B|XT                    22   37.0   0.02   -184   0.007
3       X|S        A|XMS=B|XMS; A|XT; B|XT                   20   25.4   0.20   -176   0.006
4       X|S X|M    A|XMS=B|XMS; A|XT; B|XT                   18   20.2   0.32   -161   0.005
5       X|S        A|XM, A|XT, A|XS; B|XM, B|XT, B|XS        12   12.5   0.41   -108   0.005
6       X|S X|M    A|XM, A|XT, A|XS; B|XM, B|XT, B|XS        10    7.6   0.67    -93   0.004
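The fit columns can be roughly cross-checked from df and L2 alone. The sketch below assumes the common definition BIC = L2 - df * ln(N) with N = 24,297, and computes the likelihood ratio p-value from the chi-square distribution using a closed form that is valid only for even df (which holds for every model listed). Because L2 is reported to one decimal, the results agree with the table only approximately, to within about a point for BIC.

```python
import math

N = 24297  # size of the reinterview data set reported in the talk

# (model, df, L2) as listed in the best-fitting models table.
models = [
    (1, 24, 38.0),
    (2, 22, 37.0),
    (3, 20, 25.4),
    (4, 18, 20.2),
    (5, 12, 12.5),
    (6, 10, 7.6),
]


def bic(l2, df, n=N):
    """BIC based on L2 (assumed definition): L2 - df * ln(n)."""
    return l2 - df * math.log(n)


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k      # (x/2)^k / k!
        total += term
    return math.exp(-half) * total


results = {m: (bic(l2, df), chi2_sf_even_df(l2, df)) for m, df, l2 in models}
for m, (b, p) in results.items():
    print(f"Model {m}: BIC = {b:.1f}, p = {p:.3f}")
```

Lower BIC favors the more parsimonious models at the top of the table, while the p-values favor the less restricted models at the bottom.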
Estimated Unemployment Rate
• Model 1 (similar to other top models): UE = 4.9%
• Observed, M.I.S. 1 & 5: UE = 6.0%
• Observed, M.I.S. 2-4, 6-8: UE = 4.7%
Conditional Probabilities for Employment Status
Classification probabilities (%). Current estimates: Model 1 (A, B). Previous estimates: Biemer 1997, Tran 1999.
Latent State   Observed State   A      B      Biemer 1997   Tran 1999
E              E                96.8   95.5   98.7          98.7
E              U                 0.3    0.0    0.4           0.4
E              N                 2.9    4.5    0.8           0.9
U              E                13.7   11.1    8.6           9.8
U              U                77.3   74.1   74.4          72.3
U              N                 9.0   14.8   17.0          17.9
N              E                 4.2    1.7    1.1           2.3
N              U                 2.0    1.9    0.9           1.5
N              N                93.8   97.0   98.0          96.2
Conditional Probabilities for A|TX & B|TX
Classification conditional probabilities, by interview mode
                                Interview          Reinterview
Latent State   Observed State   Phone    Visit     Phone    Visit
E              E                97.3%    96.7%     96.9%    95.9%
E              U                 0.3%     0.4%      0.0%     0.0%
E              N                 2.5%     2.9%      3.1%     4.1%
U              E                12.5%     9.5%      0.0%     0.0%
U              U                61.0%    75.6%     83.6%    82.1%
U              N                26.4%    14.8%     16.4%    17.9%
N              E                 4.1%     6.3%      5.3%     5.3%
N              U                 0.1%     2.0%      1.3%     2.4%
N              N                95.8%    91.7%     93.4%    92.3%
Conditional Probabilities for A|MX & B|MX
Classification conditional probabilities, by month-in-sample (MIS)
                                Interview                 Reinterview
Latent State   Observed State   MIS 1,5   MIS 2-4,6-8     MIS 1,5   MIS 2-4,6-8
E              E                97.7%     98.8%           96.0%     96.7%
E              U                 0.8%      0.6%            0.1%      0.0%
E              N                 1.5%      0.6%            3.8%      3.3%
U              E                 8.2%     14.6%            3.0%      1.6%
U              U                83.3%     71.8%           87.3%     79.6%
U              N                 8.5%     13.6%            9.7%     18.8%
N              E                 7.7%      5.4%            3.1%      5.2%
N              U                 2.8%      1.5%            2.3%      1.2%
N              N                89.5%     93.1%           94.6%     93.6%
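The rows for the latent unemployed are the ones most relevant to rotation group bias. A short sketch that turns the A|MX and B|MX estimates for "latent U, observed U" into miss rates and month-in-sample gaps (values copied from the table on this slide; dictionary keys are illustrative):

```python
# P(observed = U | latent = U), in percent, from the A|MX and B|MX table.
detect_unemployed = {
    ("Interview", "MIS 1,5"): 83.3,
    ("Interview", "MIS 2-4,6-8"): 71.8,
    ("Reinterview", "MIS 1,5"): 87.3,
    ("Reinterview", "MIS 2-4,6-8"): 79.6,
}

# Share of the truly unemployed who are recorded as something else.
miss_rate = {key: 100.0 - value for key, value in detect_unemployed.items()}

# Gap in correct classification between first/fifth months and later months.
gap = {
    occasion: detect_unemployed[(occasion, "MIS 1,5")]
    - detect_unemployed[(occasion, "MIS 2-4,6-8")]
    for occasion in ("Interview", "Reinterview")
}

for key, value in miss_rate.items():
    print(key, f"miss rate {value:.1f}%")
for occasion, value in gap.items():
    print(occasion, f"gap {value:.1f} points")
```

Under Model 1 the truly unemployed are less likely to be recorded as unemployed in months 2-4 and 6-8 than in months 1 and 5, at both the interview and the reinterview.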
Summary Findings
• Change in structural model (treating month-in-sample as grouping variable) does not change the preferred measurement model.
• Models fit nearly as well without M as grouping variable; casts doubt on non-response bias hypothesis.
• M-I-S bias is not just a function of interview mode.
• Covariate effects (esp. S) on response error should be examined further in model with more df; need another grouping variable.
Unresolved Issues
• Ambiguous results for model selection
• Most interested in fit of unemployment classification, but this is overwhelmed in measures of overall fit
• Software limitations: clustering, local & boundary solutions, standard errors not consistently output
Future Research Agenda
• Try finer coding of month-in-sample
• Develop models for other variables: age, race, proxy vs. self
• Pool more years of data
• Develop hypotheses & interpretation based on review of:
  - experimental work
  - analyses of non-response
  - related models including Markov latent class models of employment status transitions
Rotation Group Bias (2001-2003, reinterview data)
[Figure: unemployment rate (vertical axis, 0% to 7%) by month-in-sample (horizontal axis, 1 to 8).]