Investigating item quality using fit and other indicators Robert Coe Rasch User Group, Durham, 18 March 2016 @ProfCoe #RUD2016

Quality

1. Utility for measurement
   a) Fit with measurement model (Rasch)
   b) Alignment with intended construct interpretations

2. Utility for learning
   a) Alignment with intended learning aims
   b) Value of diagnostic information
      i. For teachers
      ii. For students
   c) Reinforcement, retrieval

Model fit

§ INFIT/OUTFIT
§ Discrimination
  – IRT parameter/index
  – Item-measure correlation
  – 27% rule (Kelley, 1939)
§ H coefficient of homogeneity (Loevinger, 1948; Mokken, 1971; Mokken & Lewis, 1982)
§ Other fit statistics?
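Of the indicators listed here, the H coefficient may be the least familiar. The following is a minimal sketch for dichotomous items, assuming the usual Loevinger definition: the sum of observed inter-item covariances divided by the sum of the largest covariances attainable given each item's proportion correct. The function name `loevinger_h` and the persons-by-items list layout are illustrative choices, not taken from the slides.

```python
from itertools import combinations


def loevinger_h(responses):
    """Scale-level H for a persons x items matrix of 0/1 scores.

    H = (sum of observed item-pair covariances)
        / (sum of maximum covariances given the item marginals).
    """
    n_persons = len(responses)
    n_items = len(responses[0])
    # Proportion correct for each item.
    p = [sum(row[i] for row in responses) / n_persons for i in range(n_items)]

    cov_sum = 0.0
    cov_max_sum = 0.0
    for i, j in combinations(range(n_items), 2):
        # Proportion of persons answering both items correctly.
        p_ij = sum(row[i] * row[j] for row in responses) / n_persons
        cov_sum += p_ij - p[i] * p[j]
        # With fixed marginals, joint success cannot exceed min(p_i, p_j).
        cov_max_sum += min(p[i], p[j]) - p[i] * p[j]
    return cov_sum / cov_max_sum
```

A perfect Guttman pattern gives H = 1, and statistically independent items give H = 0. The sketch divides by zero if every item pair contains an item that everyone got right or everyone got wrong, so such items would need to be dropped first.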

Problems with INFIT/OUTFIT
Karabatsos (2000), Journal of Applied Measurement

§ ‘Residual fit statistics’ are confounded: parameters are estimated from data; fit stats test fit between data and parameters …

§ Interpretation of INFIT/OUTFIT is sample dependent

§ They do a poor job of identifying misfit
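For reference, the residual fit statistics under discussion can be sketched as follows for a single dichotomous item, assuming the conventional mean-square definitions (outfit as the unweighted mean of squared standardised residuals, infit as the information-weighted version). The names `rasch_p` and `item_fit` are hypothetical. Note that `thetas` and `b` would themselves be estimated from the same responses, which is the confounding the first point describes.

```python
import math


def rasch_p(theta, b):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def item_fit(responses, thetas, b):
    """Return (infit, outfit) mean squares for one dichotomous item."""
    sq_resid = 0.0  # sum of squared raw residuals
    info = 0.0      # sum of model variances p(1 - p)
    z_sq = 0.0      # sum of squared standardised residuals
    for x, theta in zip(responses, thetas):
        p = rasch_p(theta, b)
        var = p * (1.0 - p)
        sq_resid += (x - p) ** 2
        info += var
        z_sq += (x - p) ** 2 / var
    infit = sq_resid / info          # information-weighted mean square
    outfit = z_sq / len(responses)   # unweighted mean square
    return infit, outfit
```

Because outfit gives every person equal weight, a single very unexpected response from a person far from the item's difficulty can move it much more than infit.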

Item Statistics
Number of responses                    7,685
Maximum score                          1
Mean score on item                     0.12
Item difficulty (Rasch measure)        1.98
INFIT (mean sq)                        0.98
OUTFIT (mean sq)                       3.80
IRT discrimination parameter           0.92
Item-measure correlation (actual)      0.42
Item-measure correlation (expected)    0.46
Percent match model (observed)         91
Percent match model (expected)         90
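One way to see how an OUTFIT of 3.80 can sit alongside an INFIT of 0.98 is to look at the squared standardised residual of one unexpected correct answer. Under the dichotomous Rasch model this simplifies to exp(b - theta) for a correct response and exp(theta - b) for an incorrect one. The sketch below uses the item difficulty of 1.98 from the table; the person measures are hypothetical values chosen for illustration.

```python
import math


def z_squared(x, theta, b):
    """Squared standardised residual for a dichotomous Rasch response.

    (x - p)^2 / (p(1 - p)) reduces to (1 - p)/p = exp(b - theta) when x = 1
    and to p/(1 - p) = exp(theta - b) when x = 0.
    """
    return math.exp(b - theta) if x == 1 else math.exp(theta - b)


b = 1.98  # item difficulty from the table
for theta in (2.0, 0.0, -2.0, -4.0):  # hypothetical person measures
    print(f"theta={theta:+.1f}  z^2 for a correct answer = "
          f"{z_squared(1, theta, b):.1f}")
```

A correct answer from a person about four logits below the item contributes a squared residual of roughly 53.5, so a small number of lucky guesses or careless keying among low-scoring persons could plausibly account for a large outfit while leaving infit near 1.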

3. b) Infit and outfit indicate model fit?

infit 1.07, outfit 1.15
infit 1.04, outfit 1.08
infit 1.06, outfit 1.27

WINSTEPS category probability curves can be misleading

[Figure: probability of correct response (y-axis, -0.20 to 1.20) against person ability (x-axis, -3.00 to 3.00). Series: logistic probability of correct response; smoothed local average proportion correct; equal density distribution plot. Infit = 1.31]
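A comparison of this kind, observed proportion correct against the model's logistic curve, can be sketched as below. The slides do not say how the local average was smoothed, so this version simply assumes equal-sized ability groups; `empirical_icc` and `model_icc` are hypothetical names.

```python
import math


def empirical_icc(thetas, responses, n_groups=10):
    """Split persons into equal-sized ability groups and return a list of
    (mean ability, observed proportion correct) pairs, lowest group first."""
    pairs = sorted(zip(thetas, responses))
    size = len(pairs) / n_groups
    points = []
    for g in range(n_groups):
        chunk = pairs[round(g * size):round((g + 1) * size)]
        if not chunk:
            continue  # can happen when there are fewer persons than groups
        mean_theta = sum(t for t, _ in chunk) / len(chunk)
        prop_correct = sum(x for _, x in chunk) / len(chunk)
        points.append((mean_theta, prop_correct))
    return points


def model_icc(theta, b):
    """Rasch model probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))
```

Plotting `empirical_icc(...)` alongside `model_icc(theta, b)` evaluated at the same abilities shows where the observed curve is flatter or steeper than the model expects.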

Enlargement 8→12, so ?→18

[Figure: response categories 12, 14 and Missing]

Algebra

Question: ALG04EEMultn+5by4

Item Statistics
Number of responses                    7,841
Maximum score                          1
Mean score on item                     0.11
Item difficulty (Rasch measure)        1.94
INFIT (mean sq)                        0.92
OUTFIT (mean sq)                       0.6682
IRT discrimination parameter           1.0816
Item-measure correlation (actual)      0.4338
Item-measure correlation (expected)    0.3969
Percent match model (observed)         90.664
Percent match model (expected)         90.128

Outfit for persons for whom the item is:

Threshold 1    Very hard    Hard          About right    Easy          Very easy
               0<p<0.05     0.05<p<0.2    0.2<p<0.8      0.8<p<0.95    0.95<p<1
Outfit         0.42         0.84          1.05           0.79          0.03
N              3997         2512          1288           39            5
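A breakdown like this can be sketched by grouping persons according to the model probability of success and averaging the squared standardised residuals within each group. The band cut-points are those on the slide. The function name `outfit_by_band` is hypothetical, and assigning a probability that falls exactly on a boundary to the lower band is an assumption of this sketch.

```python
import math

# (label, lower, upper) bounds on the model probability of success.
BANDS = [
    ("Very hard", 0.0, 0.05),
    ("Hard", 0.05, 0.2),
    ("About right", 0.2, 0.8),
    ("Easy", 0.8, 0.95),
    ("Very easy", 0.95, 1.0),
]


def outfit_by_band(responses, thetas, b):
    """Return {band label: (outfit mean square or None, N)} for one item."""
    sums = {label: [0.0, 0] for label, _, _ in BANDS}
    for x, theta in zip(responses, thetas):
        p = 1.0 / (1.0 + math.exp(-(theta - b)))
        z_sq = (x - p) ** 2 / (p * (1.0 - p))
        for label, lo, hi in BANDS:
            if lo < p <= hi:  # boundary values go to the lower band
                sums[label][0] += z_sq
                sums[label][1] += 1
                break
    return {
        label: (total / n if n else None, n)
        for label, (total, n) in sums.items()
    }
```

As a consistency check on the table, the five N values sum to 7,841, the number of responses, and the N-weighted mean of the band outfits is about 0.66, close to the item's overall OUTFIT of 0.6682.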

“Multiply n+5 by 4”

[Figure: “Multiply n+5 by 4”, response categories 4n+20, 4(n+5), missing, 4n+5, n+20, 5n+5]

Question: ALG04AAAdd4to8

Item Statistics
Number of responses                    7,841
Maximum score                          1
Mean score on item                     0.74
Item difficulty (Rasch measure)        -2.86
INFIT (mean sq)                        1.2693
OUTFIT (mean sq)                       1.627
IRT discrimination parameter           0.6577
Item-measure correlation (actual)      0.4636
Item-measure correlation (expected)    0.5731
Percent match model (observed)         78.117
Percent match model (expected)         83.017
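The "percent match" rows can be read as follows, assuming the definition commonly given for this statistic in Rasch software (the slides do not define it): the observed value is the percentage of responses that equal the more probable outcome under the model, and the expected value is the percentage the model itself predicts would match. The function name `percent_match` is hypothetical, and treating a probability of exactly 0.5 as a predicted wrong answer is an arbitrary tie rule.

```python
import math


def percent_match(responses, thetas, b):
    """Return (observed %, expected %) of responses matching the model's
    more probable outcome for one dichotomous item."""
    observed = 0
    expected = 0.0
    for x, theta in zip(responses, thetas):
        p = 1.0 / (1.0 + math.exp(-(theta - b)))
        predicted = 1 if p > 0.5 else 0   # tie at 0.5 counts as 0
        observed += int(x == predicted)
        expected += max(p, 1.0 - p)       # chance the model's pick is right
    n = len(responses)
    return 100.0 * observed / n, 100.0 * expected / n
```

On this reading, an observed match below the expected match (78 against 83 here) means the responses are less predictable from the person measures than the model anticipates.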

“Add 4 to 8”

[Figure: “Add 4 to 8”, response categories 12, missing, 8+4]

Question: ALG04FFMult3nby4

Item Statistics
Number of responses                    7,841
Maximum score                          1
Mean score on item                     0.34
Item difficulty (Rasch measure)        -0.11
INFIT (mean sq)                        1.5233
OUTFIT (mean sq)                       1.8801
IRT discrimination parameter           0.1397
Item-measure correlation (actual)      0.3335
Item-measure correlation (expected)    0.5512
Percent match model (observed)         65.338
Percent match model (expected)         78.106
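A low discrimination like this one can also be checked with a classical upper-lower index of the kind associated with the 27% rule (Kelley, 1939): the proportion correct in the top 27% of persons by total score minus the proportion correct in the bottom 27%. This is a minimal sketch under that definition; the name `upper_lower_index` is hypothetical, and ties in total score are broken by input order.

```python
def upper_lower_index(item_scores, total_scores, fraction=0.27):
    """Upper-lower discrimination index D for one dichotomous item.

    D = (proportion correct in the top group)
        - (proportion correct in the bottom group),
    where each group is `fraction` of persons ranked by total score.
    """
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    k = max(1, round(fraction * len(order)))  # group size, at least one person
    lower, upper = order[:k], order[-k:]
    p_lower = sum(item_scores[i] for i in lower) / k
    p_upper = sum(item_scores[i] for i in upper) / k
    return p_upper - p_lower
```

Values near 0 indicate that high and low scorers succeed on the item at similar rates, which is the pattern a weak item-measure correlation would also suggest.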

“Multiply 3n by 4”

[Figure: “Multiply 3n by 4”, response categories missing, 12n, 7n, 4x(3n)]