CAT Item Selection and Person Fit: Predictive Efficiency and Detection of Atypical Symptom Profiles

Barth B. Riley, Ph.D., Michael L. Dennis, Ph.D., Kendon J. Conrad, Ph.D.

Funded by NIDA grant 1R21DA025731

Introduction

• Do our measures accurately reflect a person’s performance or status?
  – Example: persons with few endorsed symptoms, but symptoms of high severity
• Person fit statistics offer a means of detecting these patterns.
• But detecting person misfit in CAT is problematic:
  – Reduced number of items administered
  – Selected items cover a limited range of the measurement continuum

Item Selection in CAT

• Optimized for efficiency and precision of measurement estimation.
  – e.g., maximizing Fisher’s information function
• Alternative procedures could be devised to balance efficiency/precision against obtaining responses over a wider range of the measurement continuum.
  – e.g., Linacre’s (1995) Bayesian falsification procedure
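As a minimal sketch of maximum-information selection, assuming a dichotomous Rasch model and a hypothetical item bank: under the Rasch model an item's Fisher information at θ is P(θ)(1 − P(θ)), which is largest when the item's difficulty equals θ. The function names and difficulties below are illustrative and are not taken from the authors' software.

```python
import math

def rasch_prob(theta, b):
    """Probability of endorsing an item of difficulty b at measure theta (Rasch)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_information(theta, b):
    """Fisher information of a dichotomous Rasch item: P * (1 - P)."""
    p = rasch_prob(theta, b)
    return p * (1.0 - p)

def select_mfi(theta, difficulties, administered):
    """Index of the unadministered item with maximum information at theta."""
    candidates = [i for i in range(len(difficulties)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, difficulties[i]))

# Hypothetical difficulties in logits (assumption, not the IMDS calibration).
bank = [-2.0, -1.0, -0.2, 0.5, 1.5, 3.0]
print(select_mfi(0.0, bank, administered=set()))
```

Because this rule always picks the item nearest the current estimate, the administered items tend to cluster in a narrow band of the continuum, which is the limitation the slide points to.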

Purpose of Study

• Examine the predictive efficiency and sensitivity of various person fit indices for detecting misfit in CAT
  – Predictive efficiency: how well can we predict the overall pattern of misfit based on item responses collected via CAT?
• What effect do different item selection methods have on our ability to detect person misfit in a CAT context?

Hypotheses

1. Predictive efficiency of CAT-derived person fit statistics will be enhanced by selecting items from a wider range of the measurement continuum.

2. Greater predictive efficiency will improve detection of atypical responding.

Data Source and Simulation Procedure

• Data were from 4,360 individuals presenting to substance abuse treatment upon intake
• Post-hoc CAT simulations were performed:
  – One-parameter IRT (Rasch) dichotomous response model
  – Maximum-likelihood estimation
  – Item selection procedures:
    • Modified “Bayesian” falsification procedure (MBF)
    • Maximum Fisher’s Information (MFI)
  – Stop rule: all items were administered to examine the effects of successive item administration on person fit indices.
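The slides do not describe how the maximum-likelihood step was implemented, so the following is only a sketch under stated assumptions: a dichotomous Rasch model, Newton-Raphson updates, and a hypothetical rule that returns `None` when every response is identical (the ML estimate is infinite in that case). The name `ml_estimate` and the step cap are illustrative choices.

```python
import math

def rasch_prob(theta, b):
    """Rasch probability of endorsing an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def ml_estimate(responses, difficulties, theta0=0.0, max_iter=50, tol=1e-6):
    """Newton-Raphson ML estimate of theta and its standard error.

    responses: list of 0/1; difficulties: matching item difficulties (logits).
    Returns (theta, se), or None when all responses are identical.
    """
    raw = sum(responses)
    if raw == 0 or raw == len(responses):
        return None
    theta = theta0
    for _ in range(max_iter):
        probs = [rasch_prob(theta, b) for b in difficulties]
        gradient = raw - sum(probs)                # first derivative of log-likelihood
        info = sum(p * (1.0 - p) for p in probs)   # test information at theta
        step = gradient / info
        theta += max(-1.0, min(1.0, step))         # cap the step for stability
        if abs(step) < tol:
            break
    probs = [rasch_prob(theta, b) for b in difficulties]
    se = 1.0 / math.sqrt(sum(p * (1.0 - p) for p in probs))
    return theta, se

# Two of three equally difficult items endorsed: theta should be ln(2).
print(ml_estimate([1, 1, 0], [0.0, 0.0, 0.0]))
```

In a post-hoc simulation this function would be called after each administered item, using the responses already recorded in the full data set rather than newly generated ones.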

Internal Mental Distress Scale

• The IMDS is a 42-item instrument that is part of the Global Appraisal of Individual Needs (Dennis et al., 2003).
• Measures:
  – Internal mental distress (second-order factor)
  – Depression
  – Anxiety
  – Trauma
  – Homicidality/Suicidality
  – Somatic complaints
• Validated using a 1-parameter IRT (Rasch) measurement model

Modified Bayesian Falsification Item Selection (MBF)

1. Set the start value for the measure ($\theta_0$) at 0 logits.
2. Calculate a “target” measure:
   i. If the previous item was endorsed, or for the first item: $\theta_T = \theta_{i-1} + \max(2, SE^2)$
   ii. Otherwise: $\theta_T = \theta_{i-1} - \max(2, SE^2)$
3. For each unadministered item, compute the information function $I_{ni}(\theta_T)$.
4. Select the item with the largest information function.
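The four steps above can be sketched as follows. This is an illustration, not the authors' implementation: it assumes Rasch item information P(1 − P), reads the step size as max(2, SE squared) (an assumption about the slide's notation), and uses a hypothetical item bank and a hypothetical starting SE of 1.0, since the slides do not give one.

```python
import math

def item_information(theta, b):
    """Fisher information of a dichotomous Rasch item at theta."""
    p = 1.0 / (1.0 + math.exp(-(theta - b)))
    return p * (1.0 - p)

def mbf_target(theta_prev, se_prev, prev_endorsed):
    """Target measure: step up after an endorsement (or for the first item), else down.

    prev_endorsed is None for the first item. The step size max(2, SE^2)
    is an assumed reading of the slide's formula.
    """
    step = max(2.0, se_prev ** 2)
    if prev_endorsed is None or prev_endorsed:
        return theta_prev + step
    return theta_prev - step

def select_mbf(theta_prev, se_prev, prev_endorsed, difficulties, administered):
    """Index of the unadministered item with maximum information at the target."""
    target = mbf_target(theta_prev, se_prev, prev_endorsed)
    candidates = [i for i in range(len(difficulties)) if i not in administered]
    return max(candidates, key=lambda i: item_information(target, difficulties[i]))

# Hypothetical difficulties in logits (assumption, not the IMDS calibration).
bank = [-2.5, -1.0, 0.0, 1.0, 2.2, 3.5]
first = select_mbf(0.0, 1.0, None, bank, set())
print(first, bank[first])
```

Because the target sits at least 2 logits away from the current estimate, the selected items alternate between hard and easy regions of the bank instead of clustering around the estimate.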

Person Fit Statistics

• Residual-based:
  – Infit, outfit (Wright & Stone, 1979; Wright, 1980)
  – Log infit and outfit (Wright & Stone, 1979)
• Non-parametric:
  – Modified Caution Index (MCI; Harnisch & Linn, 1981)
  – HT (Sijtsma, 1986; Sijtsma & Meijer, 1992)
• Likelihood-based:
  – lz (Drasgow, Levine & Williams, 1985)
• CAT-specific (CUSUM; van Krimpen-Stoop & Meijer, 2000)
  – Used three different methods for estimating response residuals (T1, T3, and T6).
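To make the residual-based and likelihood-based indices concrete, here is a minimal sketch using the standard textbook definitions of infit, outfit, and lz for dichotomous Rasch data. The function name `person_fit` and the example difficulties are assumptions; the non-parametric and CUSUM indices are not covered.

```python
import math

def rasch_prob(theta, b):
    """Rasch probability of endorsing an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def person_fit(responses, difficulties, theta):
    """Return (infit, outfit, lz) for one person's 0/1 responses at measure theta."""
    probs = [rasch_prob(theta, b) for b in difficulties]
    variances = [p * (1.0 - p) for p in probs]
    sq_resid = [(x - p) ** 2 for x, p in zip(responses, probs)]

    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(r / v for r, v in zip(sq_resid, variances)) / len(responses)
    # Infit: information-weighted mean square.
    infit = sum(sq_resid) / sum(variances)

    # lz: standardized log-likelihood of the observed response pattern.
    loglik = sum(x * math.log(p) + (1 - x) * math.log(1.0 - p)
                 for x, p in zip(responses, probs))
    expected = sum(p * math.log(p) + (1.0 - p) * math.log(1.0 - p) for p in probs)
    variance = sum(v * math.log(p / (1.0 - p)) ** 2 for p, v in zip(probs, variances))
    lz = (loglik - expected) / math.sqrt(variance)
    return infit, outfit, lz

# Hypothetical difficulties; the second pattern endorses only the hardest items.
difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]
typical = person_fit([1, 1, 1, 0, 0], difficulties, theta=0.0)
aberrant = person_fit([0, 0, 1, 1, 1], difficulties, theta=0.0)
print(typical, aberrant)
```

Mean squares well above 1 and strongly negative lz values both point to a response pattern that the model finds improbable, such as endorsing severe symptoms while denying mild ones.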

Predictive Efficiency of Person Fit Statistics

Predictive Efficiency, MFI Item Selection

[Figure: R² (%) by items administered, one line per fit statistic: MCI, Infit, Log Infit, HT, CUSUM T1, CUSUM T3, CUSUM T6, Outfit, Log Outfit, lz]

Predictive Efficiency, MBF Item Selection

[Figure: R² (%) by items administered, one line per fit statistic: MCI, Infit, Log Infit, HT, CUSUM T1, CUSUM T3, CUSUM T6, Outfit, Log Outfit, lz]

Min. Number of Items to Achieve R² = .80

Fit Statistic    MFI     MBF
MCI              13      11
HT               18      17
Infit            20      19
Log Infit        15      16
Outfit           39      36
Log Outfit       19      19
lz               38      34
CUSUM (T1)       26      26
CUSUM (T3)       30      32
CUSUM (T6)       39      35
Average          25.7    24.5
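One way a table like this could be produced is sketched below. It assumes, without confirmation from the slides, that R² is the squared Pearson correlation across persons between a fit statistic computed after k CAT items and the same statistic computed on the full instrument. The function names and the toy values are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

def min_items_for_r2(fit_by_k, full_fit, threshold=0.80):
    """Smallest k whose CAT-based fit values reach the R^2 threshold.

    fit_by_k maps items administered (k) to a list of per-person fit values;
    full_fit holds the full-instrument values in the same person order.
    Returns None if no k reaches the threshold.
    """
    for k in sorted(fit_by_k):
        if r_squared(fit_by_k[k], full_fit) >= threshold:
            return k
    return None

# Toy values for five persons (assumption, not study data).
full_fit = [1.0, 2.0, 3.0, 4.0, 5.0]
fit_by_k = {
    5: [3.0, 1.0, 4.0, 2.0, 5.0],
    10: [1.0, 2.0, 4.0, 3.0, 5.0],
    15: [1.0, 2.0, 3.0, 4.0, 5.5],
}
print(min_items_for_r2(fit_by_k, full_fit))
```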

Identification of Persons with Atypical Suicide

Atypical Suicide

• Conrad and colleagues (2010) identified a subgroup with suicidal ideation but lower levels of depression, anxiety, and trauma.
• In this study, however, we defined atypical suicide as persons with:
  – 2+ suicidal symptoms
  – A level of internal mental distress that is not predictive of suicidality
  – Under typical CAT operation, these individuals would be unlikely to receive suicide items during a CAT session.

Suicide Groups Based on 2+ Symptoms

[Pie chart: groups Non-Suicidal, Suicidal, Atypical Suicide; slice labels 91%, 2%, 7%; N=7,348]

Predicting Atypical Suicide: All Items

Variable           AUC     Sensitivity (%)   Specificity (%)
IMDS               0.83    0.0               99.5
MCI                0.38    0.0               100.0
HT                 0.62    0.0               100.0
Infit/Log Infit    0.90    33.2              99.0
Outfit             0.92    14.1              98.6
Log Outfit         0.92    16.3              98.2
lz                 0.92    45.4              98.8
CUSUM (T1)         0.89    15.3              99.1
CUSUM (T3)         0.84    11.8              99.3
CUSUM (T6)         0.87    16.6              99.2
Multivariate       0.98    81.0              97.0
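For readers less familiar with these three columns, a minimal sketch of how they are computed follows. The slides do not state which cutoffs or software were used, so the cutoff, function names, and toy scores here are assumptions. AUC does not depend on a cutoff, while sensitivity and specificity do, which is how a row can pair a high AUC with low sensitivity.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Flag a person when score >= cutoff; labels are 1 (atypical) or 0 (not)."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

def auc(scores, labels):
    """Chance that a random positive case outscores a random negative case.

    Ties count as half a win (the Mann-Whitney formulation of AUC).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy fit-statistic scores for six persons (assumption, not study data).
scores = [0.6, 2.3, 1.1, 1.0, 0.8, 1.4]
labels = [0, 1, 0, 1, 0, 0]
print(auc(scores, labels), sensitivity_specificity(scores, labels, cutoff=1.2))
```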

Sensitivity to Predict Atypical Suicide

[Figure: sensitivity (%) by items administered, four lines: IMDS Only (MFI), IMDS Only (MBF), IMDS + Fit Statistics (MFI), IMDS + Fit Statistics (MBF)]

Comparison of Item Selection Procedures

First 5 Items Administered by CAT

[Bar chart: percentage of the first 5 administered items by IMDS subscale (Depression, Anxiety, Homicidality/Suicidality, Somatic Complaints, Trauma), for MFI and MBF. Data labels: 24.40%, 48.80%, 1.40%, 13.30%, 12.10%; 33.80%, 6.80%, 40.70%, 18.50%, 0.20%]

CAT to Full Instrument Correlation

[Figure: CAT to full instrument correlation (0.00 to 1.00) by items administered, for MFI and MBF]

Measurement Precision (RMSE)

[Figure: RMSE (0.00 to 3.60) by items administered (2 to 42), for MFI and MBF]

Test Information

[Figure: mean cumulative % of test information by items administered, for MFI and MBF]

A Case Example

MFI Item Selection and Measure Estimation

[Figure: item difficulty and estimated measure (−3.0 to 3.0) by items administered (1 to 42), with the first suicide item administered marked]

MBF Item Selection and Measure Estimation

[Figure: item difficulty and estimated measure (−3.0 to 4.0) by items administered (1 to 41), with the first suicide item administered marked]

Comparison

                  MFI      MBF      Full
Measure           -1.57    -0.58    -0.40
Std. Error         0.49     0.48     0.35
Outfit             0.82     2.20     2.10
Infit              0.85     1.25     1.51
lz                 1.85    -1.53    -3.65
# Suicide          0        3        5
# Administered     19       22       42

Conclusions

• Hypothesis 1: Item selection method had only a modest effect on predictive efficiency, though in the hypothesized direction.
  – MBF had the strongest effect on outfit, lz, and CUSUM (T6)
• Partial support for Hypothesis 2:
  – MBF provided efficient detection of the atypical suicide pattern
  – This reflects the type of items selected early in the CAT rather than predictive efficiency
• MBF was found to be somewhat less efficient than MFI

Strengths and Limitations

• Strengths
  – Large sample
  – Clinical sample
  – Several fit statistics examined
• Limitations
  – Multidimensionality
  – Small item bank
  – Further work needed on defining “atypicalness” in clinical context
  – Further validation of approach across instruments, measurement models

References

• Conrad, K. J., Bezruczko, N., Chan, Y. F., Riley, B., Diamond, G., & Dennis, M. L. (2010). Screening for atypical suicide risk with person fit statistics among people presenting to alcohol and other drug treatment. Drug and Alcohol Dependence, 106(1), 92-100.

• Drasgow, F., Levine, M. V., & McLaughlin, M. E. (1987). Detecting inappropriate test scores with optimal and practical appropriateness indices. Applied Psychological Measurement, 11(1), 59-79.

• Harnisch, D. L., & Linn, R. L. (1981). Analysis of item response patterns: Questionable test data and dissimilar curriculum practices. Journal of Educational Measurement, 18(2), 133-146.

• Linacre, J. M. (1995). Computer-adaptive testing CAT: A Bayesian approach. Rasch Measurement Transactions, 9(1), 412.

• Sijtsma, K. (1986). A coefficient of deviance of response patterns. Kwantitatieve Methoden, 7, 131-145.

• Sijtsma, K., & Meijer, R. R. (1992). A method for investigating the intersection of item response functions in Mokken’s non-parametric IRT model. Applied Psychological Measurement, 16(2), 149-157.

• van Krimpen-Stoop, E. M., & Meijer, R. R. (2000). Detecting person misfit in adaptive testing using statistical process control techniques. In W. J. van der Linden & C. A. W. Glas (Eds.), Computerized adaptive testing: Theory and practice. Boston: Kluwer Academic.

• Wright, B. D. (1980). Afterword. In G. Rasch (Ed.), Probabilistic models for some intelligence and attainment tests: With foreword and afterword by Benjamin D. Wright. Chicago: MESA Press.

• Wright, B. D., & Stone, M. H. (1979). Best test design. Chicago: University of Chicago, MESA Press.

Thank you!

For more information, contact: Barth Riley, Ph.D.

[email protected]

For more information about the psychometrics of the Global Appraisal of Individual Needs (GAIN), including the Internal Mental Distress Scale, go to:

http://www.chestnut.org/li/gain/#GAIN%20Working%20Papers