The Future of Mental Health Measurement

David J. Kupfer, David J. Weiss, Paul Pilkonis, Ellen Frank, R. Darrell Bock

Robert D. Gibbons, University of Chicago

Supported by NIMH Grant R01-MH-66302. The CAT-MH is distributed by Adaptive Testing Technologies (www.adaptivetestingtechnologies.com), in which Drs. Gibbons, Kupfer, Frank, Weiss and Pilkonis have financial interests.

Classical vs. IRT Measurement

Classical Measurement Model

Classical vs. IRT Measurement IRT

Arithmetic, Algebra, Calculus

What is CAT?

Imagine a 1000 Item Math Test

Background: Bi-factor Model

• Psychiatry issue: multidimensionality = excess correlation within domains, violating the conditional independence assumption

• Fitting unidimensional models to multidimensional data

• Small item banks (e.g., PROMIS: 28 items for depression)

• Underestimate posterior variance

• Greater variability of scores between and within individuals

• Solution is to base CAT on multidimensional IRT
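The conditional independence assumption and the bi-factor alternative can be written out as follows. This is a sketch in assumed notation (the symbols u_j, θ_0, θ_k and the index map k(j) are not from the slides): u_j is the response to item j, θ_0 is the primary dimension (e.g., depression severity), and θ_1, …, θ_s are subdomain dimensions.

```latex
% Unidimensional IRT: responses are independent given a single severity \theta
P(u_1,\dots,u_n \mid \theta) = \prod_{j=1}^{n} P(u_j \mid \theta)

% Bi-factor IRT: item j depends on the primary dimension \theta_0 and on
% exactly one subdomain dimension \theta_{k(j)}; independence holds only
% after conditioning on both
P(u_1,\dots,u_n \mid \theta_0,\theta_1,\dots,\theta_s)
  = \prod_{j=1}^{n} P\bigl(u_j \mid \theta_0,\theta_{k(j)}\bigr)
```

If the second form is the true one but the first is fitted, items from the same subdomain are treated as independent pieces of evidence when they are not, which is one way to see why the posterior variance comes out too small.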

Quality of Life Example

Gibbons R.D., Bock R.D., Hedeker D., Weiss D., Segawa E., Bhaumik D.K., Kupfer D., Frank E., Grochocinski V., Stover A. Full-Information Item Bi-Factor Analysis of Graded Response Data. Applied Psychological Measurement, 31, 4-19, 2007.

CAT

• Traditional – all subjects get all items

• Subjects get different items based on severity

• Smallest number of items for fixed precision

• Develop large item banks that completely characterize a disorder such as depression

• Select items dynamically based on responses
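The adaptive loop described in these bullets can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the CAT-MH algorithm: it uses a unidimensional two-parameter logistic (2PL) model rather than the bi-factor model, a simulated item bank with made-up parameters, maximum-information item selection, and a grid-based posterior mean under a standard normal prior. All function names are hypothetical.

```python
import math
import random

def p_endorse(theta, a, b):
    """2PL probability of endorsing an item (discrimination a, severity b)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at severity theta."""
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)

def posterior_mean_sd(responses, grid):
    """Posterior mean and SD of theta on a grid, with a N(0, 1) prior."""
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * theta * theta)  # unnormalized prior density
        for (a, b), endorsed in responses:
            p = p_endorse(theta, a, b)
            w *= p if endorsed else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, math.sqrt(var)

def run_cat(bank, answer, target_se=0.3, max_items=50):
    """Administer items until the posterior SD reaches target_se."""
    grid = [g / 10.0 for g in range(-40, 41)]
    remaining = list(bank)
    responses = []
    theta, se = 0.0, 1.0  # start at the prior mean and prior SD
    while remaining and len(responses) < max_items and se > target_se:
        # Pick the unused item that is most informative at the current estimate.
        item = max(remaining, key=lambda ab: item_information(theta, ab[0], ab[1]))
        remaining.remove(item)
        responses.append((item, answer(item)))
        theta, se = posterior_mean_sd(responses, grid)
    return theta, se, len(responses)

# Simulated bank and respondent (illustrative values only, not real calibrations).
rng = random.Random(0)
bank = [(rng.uniform(1.0, 2.5), rng.uniform(-3.0, 3.0)) for _ in range(400)]
true_theta = 1.0

def simulated_answer(item):
    a, b = item
    return rng.random() < p_endorse(true_theta, a, b)

theta_hat, se, n_used = run_cat(bank, simulated_answer, target_se=0.3)
print(f"estimate={theta_hat:.2f}  se={se:.2f}  items used={n_used} of {len(bank)}")
```

The point of the sketch is the stopping rule: the test ends when the precision target is met, so the number of items varies from one respondent to the next while the standard error does not.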

Paradigm Shift

• Traditional Measurement

– Fix items, allow precision to vary

• IRT-Based CAT

– Fix precision, allow items to vary

• Change precision depending on application

– Epidemiology – fewer items, lower precision (se=.4)

– Primary care screening – medium precision (se=.3)

– RCTs – more items, high precision (se=.2)

• Suicide Screen – C-SSRS items (2-4)
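To see why a tighter standard error costs more items, recall that in IRT the standard error is approximately one over the square root of the accumulated information. The sketch below turns the three targets above into rough item counts. The per-item information of 0.7 and the prior information of 1.0 are assumptions chosen for illustration, not values from the CAT-MH calibration, and `items_needed` is a hypothetical helper.

```python
import math

def items_needed(target_se, item_info=0.7, prior_info=1.0):
    """Rough item count so that 1 / sqrt(prior_info + n * item_info) <= target_se.

    item_info and prior_info are illustrative assumptions; a real CAT
    accumulates different information from each administered item.
    """
    required = 1.0 / target_se ** 2 - prior_info
    return max(0, math.ceil(required / item_info))

for setting, se in [("Epidemiology", 0.4),
                    ("Primary care screening", 0.3),
                    ("RCT", 0.2)]:
    print(f"{setting}: se={se} -> about {items_needed(se)} items")
```

Under these assumed values the counts come out to roughly 8, 15, and 35 items: halving the standard error from .4 to .2 roughly quadruples the information required.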

CAT-DI and CAD-MDD

• Focus on Major Depressive Disorder (MDD)

• MDD is a leading factor in US health care costs

• Created 1008 item bank (DEP, ANX, MANIA)

• Over 800 items remained after calibration

• CAT-DI – Measure Depressive Severity

• CAD-MDD – Diagnose Depression

• CAT-ANX (anxiety), CAT-MANIA (bipolar)

CAT-DI RESULTS

Gibbons R.D., Weiss D.J., Pilkonis P.A., Frank E., Moore T., Kim J.B., Kupfer D.J. The CAT-DI: A computerized adaptive test for depression. Archives of General Psychiatry, 69, 1104-1112, 2012.

Gibbons R.D., Weiss D.J., Pilkonis, P.A., Frank E., Moore T., Kim J.B., Kupfer D.J. Development of the CAT-ANX: A computerized adaptive test for anxiety. American Journal of Psychiatry, 171, 187-194, 2014.

Achtyes E.D., Halstead S., Smart L., Moore T., Frank E., Kupfer D., Gibbons R.D. Validation of computerized adaptive testing in an outpatient non-academic setting. Psychiatric Services, in press.

CAT-DI Depression Scores vs. SCID Diagnosis

[Figure: CAT-DI Scores by Depression Status. Y-axis: Depression (CAT-DI score, -3 to 3); x-axis: SCID diagnosis (Major, Minor+Dys, None).]

HAM-D Scores vs. SCID Diagnosis

[Figure: HAM-D Scores by Depression Status. Y-axis: Depression (HAM-D score, 0 to 45); x-axis: SCID diagnosis (Major, Minor+Dys, None).]

PHQ-9 Scores vs. SCID Diagnosis

[Figure: PHQ-9 Scores by Depression Status. Y-axis: Depression (PHQ-9 score, 0 to 30); x-axis: SCID diagnosis (Major, Minor+Dys, None).]

CAD-MDD – Decision Tree

• Computerized Adaptive Diagnosis

• Begin with 100 DSM-IV MDD items

• Obtain SCID DSM-IV MDD Diagnosis

• Fit Decision Tree using Random Forest

• Develop live CAD-MDD and cross validate
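What makes a decision tree "adaptive" is that each answer determines which item is asked next, so different respondents see different, short item sequences. The sketch below shows only that traversal step on a small hand-written tree. The tree, the item names, and the `adaptive_diagnosis` function are hypothetical; the sketch does not reproduce the fitted CAD-MDD tree or the random forest fitting step.

```python
# Hypothetical tree: internal nodes ask one yes/no item, leaves hold the screen result.
TREE = {
    "item": "item_A",
    "yes": {
        "item": "item_B",
        "yes": {"leaf": "positive"},
        "no": {
            "item": "item_C",
            "yes": {"leaf": "positive"},
            "no": {"leaf": "negative"},
        },
    },
    "no": {
        "item": "item_D",
        "yes": {
            "item": "item_C",
            "yes": {"leaf": "positive"},
            "no": {"leaf": "negative"},
        },
        "no": {"leaf": "negative"},
    },
}

def adaptive_diagnosis(tree, ask):
    """Walk the tree, asking only the items on the path the answers select."""
    node, asked = tree, []
    while "leaf" not in node:
        item = node["item"]
        asked.append(item)
        node = node["yes"] if ask(item) else node["no"]
    return node["leaf"], asked

# Two simulated respondents with different answer patterns.
respondent_1 = {"item_A": True, "item_B": True, "item_C": False, "item_D": False}
respondent_2 = {"item_A": False, "item_B": False, "item_C": False, "item_D": True}

for answers in (respondent_1, respondent_2):
    result, asked = adaptive_diagnosis(TREE, lambda item: answers[item])
    print(result, "after", len(asked), "items:", asked)
```

Here the first respondent is classified after two items and the second after three, even though the hypothetical bank holds four.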

CAD-MDD – Results

Sensitivity: 0.95

Specificity: 0.87

Average of 4 items (max = 6)

Gibbons et al., J. of Clinical Psychiatry (2013)
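For readers less familiar with the two accuracy measures: sensitivity is the share of true cases the screen flags, and specificity is the share of non-cases it clears. A small sketch follows. The counts are hypothetical, chosen only so the rates match the figures above; they are not the study's data, and `sensitivity_specificity` is an illustrative helper.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # flagged cases / all true cases
    specificity = tn / (tn + fp)  # cleared non-cases / all non-cases
    return sensitivity, specificity

# Hypothetical counts: 100 people with MDD and 100 without, per the reference diagnosis.
sens, spec = sensitivity_specificity(tp=95, fn=5, tn=87, fp=13)
print(f"sensitivity={sens:.2f}  specificity={spec:.2f}")
```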

[Figure: Comparison of Diagnostic Screening Tools for Depression as a Function of Time and Sensitivity. Y-axis: True Positive Rate - Sensitivity (0.5 to 1.0); x-axis: Time in Minutes (0 to 5); tools plotted: M3, CAD-MDD, PHQ-2, PHQ-9.]

Rates of Detection and Service Utilization

• Emergency Department, U of Chicago (n=1000)

– 26% MDD positive screens (>50% confidence)

– 22% MDD positive screens (>90% confidence)

– 7% MDD positive + moderate or severe CAT-DI

– 3% suicide screen positive

– 3-fold increase in ED visits in past year, moderate/severe vs. none/mild

– 4-fold increase in hospitalizations in past year, moderate/severe vs. none/mild

– None of these patients had a psychiatric indication

• Primary Care, Spain and US Latino Samples (n=1000)

– 33% MDD positive screens (>50% confidence)

– 25% MDD positive screens (>90% confidence)

– 9% MDD positive + moderate or severe CAT-DI

Independent Validation Study

• Highly Comorbid Community MH sample (n=150)

– High sensitivity 0.96 maintained for entire sample

– Specificity 1.0 for MDD vs Control

– CAT-DI, CAT-ANX, CAT-MANIA all predict Dx

– MDD 28-fold across scale

– GAD and current BP each 12-fold across scale

• 97% accurately reflected mood

• 86% preferred computer interface (10% preferred pp)

• 97% comfortable taking CAT-MH

• 98% answered honestly

Achtyes E.D., Halstead S., Smart L., Moore T., Frank E., Kupfer D., Gibbons R.D. Validation of computerized adaptive testing in an outpatient non-academic setting. Psychiatric Services, in press.

Future Directions

• Screening and monitoring in primary care

• Inexpensive phenotyping for GWAS studies

• Psychiatric epidemiology

• Comparative effectiveness and safety

• Differential Item Functioning – Global Health

• Kiddie CAT – Developmental shifts – vertical scaling

• Spend billions on biology but validate using stone age clinical measurements

• Autism, PTSD, RDoC, …

• Military – Suicide RR=4 within 4 years of discharge

• Cloud computing environments

Relevant Publications

Gibbons R.D., & Hedeker D.R. Full-information item bi-factor analysis. Psychometrika, 57, 423-436, 1992.

Gibbons R.D., Bock R.D., Hedeker D., Weiss D., Segawa E., Bhaumik D.K., Kupfer D., Frank E., Grochocinski V., Stover A. Full-Information Item Bi-Factor Analysis of Graded Response Data. Applied Psychological Measurement, 31, 4-19, 2007.

Gibbons R.D., Weiss D.J., Kupfer D.J., Frank E., Fagiolini A., Grochocinski V.J., Bhaumik D.K., Stover A., Bock R.D., Immekus J.C. Using computerized adaptive testing to reduce the burden of mental health assessment. Psychiatric Services, 59, 361-368, 2008.

Gibbons R.D., Weiss D.J., Pilkonis P.A., Frank E., Moore T., Kim J.B., Kupfer D.J. The CAT-DI: A computerized adaptive test for depression. Archives of General Psychiatry, 69, 1104-1112, 2012.

Gibbons R.D., Hooker G., Finkelman M.D., Weiss D.J., Pilkonis P.A., Frank E., Moore T., Kupfer D.J. The CAD-MDD: A computerized adaptive diagnostic screening tool for depression. Journal of Clinical Psychiatry, 74, 669-674, 2013.

Gibbons R.D., Weiss D.J., Pilkonis, P.A., Frank E., Moore T., Kim J.B., Kupfer D.J. Development of the CAT-ANX: A computerized adaptive test for anxiety. American Journal of Psychiatry, 171, 187-194, 2014.

Achtyes E.D., Halstead S., Smart L., Moore T., Frank E., Kupfer D., Gibbons R.D. Validation of computerized adaptive testing in an outpatient non-academic setting. Psychiatric Services, published on-line.
