23
Selected topics in psychometrics L10: Computerized Adaptive Testing Patr´ ıcia Martinkov´ a NMST570, January 7, 2020 Department of Statistical Modelling Institute of Computer Science, Czech Academy of Sciences Institute for Research and Development of Education Faculty of Education, Charles University, Prague

Selected topics in psychometrics - CAS · 2020-01-07 · Selected topics in psychometrics L10: Computerized Adaptive Testing Patr cia Martinkov a NMST570, January 7, 2020 Department

  • Upload
    others

  • View
    8

  • Download
    1

Embed Size (px)

Citation preview

Page 1: Selected topics in psychometrics - CAS · 2020-01-07 · Selected topics in psychometrics L10: Computerized Adaptive Testing Patr cia Martinkov a NMST570, January 7, 2020 Department

Selected topics in psychometrics

L10: Computerized Adaptive Testing

Patrıcia MartinkovaNMST570, January 7, 2020

Department of Statistical Modelling

Institute of Computer Science, Czech Academy of Sciences

Institute for Research and Development of Education

Faculty of Education, Charles University, Prague

Page 2: Selected topics in psychometrics - CAS · 2020-01-07 · Selected topics in psychometrics L10: Computerized Adaptive Testing Patr cia Martinkov a NMST570, January 7, 2020 Department

Outline

1. Introduction

2. Computerized adaptive tests

3. Simulations and examples

4. Conclusion

Page 3: Selected topics in psychometrics - CAS · 2020-01-07 · Selected topics in psychometrics L10: Computerized Adaptive Testing Patr cia Martinkov a NMST570, January 7, 2020 Department

Introduction Computerized adaptive tests Simulations and examples Conclusion

Motivation Types of tests

Motivation

• Student abilities may differ remarkably

• Some students may become bored by too easy items

• Some students may become stressed by too hard items

• Selection of proper items may save time

Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 1/19

Page 4: Selected topics in psychometrics - CAS · 2020-01-07 · Selected topics in psychometrics L10: Computerized Adaptive Testing Patr cia Martinkov a NMST570, January 7, 2020 Department

Introduction Computerized adaptive tests Simulations and examples Conclusion

Motivation Types of tests

Different types of tests

• Linear tests

• most popular

• test takers take all items

• Linear Computer-based tests (CBT)

• linear test administered on computer

• allow for taking track of response time for each item

• Computerized adaptive tests (CAT)

• gets estimate after each item

• guides selection of subsequent items

• Multistage test (MST)

• compromise between Linear and CAT,

• similar to CAT, but by groups of items

Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 2/19

Page 5: Selected topics in psychometrics - CAS · 2020-01-07 · Selected topics in psychometrics L10: Computerized Adaptive Testing Patr cia Martinkov a NMST570, January 7, 2020 Department

Introduction Computerized adaptive tests Simulations and examples Conclusion

CAT flowchart Item Bank Starting Continuing Stopping

CAT flow chart

Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 3/19

Page 6: Selected topics in psychometrics - CAS · 2020-01-07 · Selected topics in psychometrics L10: Computerized Adaptive Testing Patr cia Martinkov a NMST570, January 7, 2020 Department

Introduction Computerized adaptive tests Simulations and examples Conclusion

CAT flowchart Item Bank Starting Continuing Stopping

Computerized adaptive tests: components

1. Initialization

• administer initial item(s)

• estimate student ability

2. Testing cycle

• Select item and gather response

• Re-estimate student ability

• Check termination criteria

3. Output the final ability estimate / final classification

Further possibilities:

• Exposure rate control

• Content balancingPatrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 4/19

Page 7: Selected topics in psychometrics - CAS · 2020-01-07 · Selected topics in psychometrics L10: Computerized Adaptive Testing Patr cia Martinkov a NMST570, January 7, 2020 Department

Introduction Computerized adaptive tests Simulations and examples Conclusion

CAT flowchart Item Bank Starting Continuing Stopping

Item Bank

• Based on test specifications

• Sufficient number of items in each content category

• Item quality reviewed by content experts

• Newly written items pretested

• Reviewed items calibrated with CTT or IRT (parameters a, b, c , d)

• Qualified items selected to item bank

• Item bank re-evaluated for size, specifications, content balance

Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 5/19

Page 8: Selected topics in psychometrics - CAS · 2020-01-07 · Selected topics in psychometrics L10: Computerized Adaptive Testing Patr cia Martinkov a NMST570, January 7, 2020 Department

Introduction Computerized adaptive tests Simulations and examples Conclusion

CAT flowchart Item Bank Starting Continuing Stopping

Item properties described by IRT model

• Selected IRT model

• Rasch, 1PL, 2PL, 3PL, 4PL, (G)PCM, GRM, NRM,...

• Functional form (here for 2PL IRT model)

P(Yij = 1|θi , aj , bj) =eaj (θi−bj )

1 + eaj (θi−bj )

aj discrimination, bj difficulty

• Item parameter estimates

• JML, CML, MML, Bayesian methods

• Item Characteristic Curve (ICC)

• Item Information Curve (IIC)

Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 6/19

Page 9: Selected topics in psychometrics - CAS · 2020-01-07 · Selected topics in psychometrics L10: Computerized Adaptive Testing Patr cia Martinkov a NMST570, January 7, 2020 Department

Introduction Computerized adaptive tests Simulations and examples Conclusion

CAT flowchart Item Bank Starting Continuing Stopping

Initialization

• Initial estimate of student ability

• Mean ability (θ0 = 0)

• Based on a single item

• Based on a pretest

• Item selection for initial estimate of ability / pretest

• Item(s) with mean difficulty (b close to 0)

• Item(s) with difficulty closest to ability estimated by pretest

• Random item(s)

Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 7/19

Page 10: Selected topics in psychometrics - CAS · 2020-01-07 · Selected topics in psychometrics L10: Computerized Adaptive Testing Patr cia Martinkov a NMST570, January 7, 2020 Department

Introduction Computerized adaptive tests Simulations and examples Conclusion

CAT flowchart Item Bank Starting Continuing Stopping

Ability Estimation

• Maximum likelihood (MLE)

• Bayes modal estimator, maximal a posteriori estimator (MAP)

• Expected a posteriori estimator (EAP)

Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 8/19

Page 11: Selected topics in psychometrics - CAS · 2020-01-07 · Selected topics in psychometrics L10: Computerized Adaptive Testing Patr cia Martinkov a NMST570, January 7, 2020 Department

Introduction Computerized adaptive tests Simulations and examples Conclusion

CAT flowchart Item Bank Starting Continuing Stopping

Item Selection

• Sequential (non-adaptive)

• Urry’s criterion: difficulty closest to current ability estimate

• Fisher Information

• maximum Fisher information (MFI)

• maximum likelihood weighted information criterion (MLWI)

• maximum posterior weighted criterion (MPWI)

• Kullback-Leibler information criteria

• Maximum expected information criterion (MEI)

Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 9/19

Page 12: Selected topics in psychometrics - CAS · 2020-01-07 · Selected topics in psychometrics L10: Computerized Adaptive Testing Patr cia Martinkov a NMST570, January 7, 2020 Department

Introduction Computerized adaptive tests Simulations and examples Conclusion

CAT flowchart Item Bank Starting Continuing Stopping

Other Issues

• Exposure rate control

• randomesque

• 5-4-3-2-1 method

• etc.

• Content balancing

Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 10/19

Page 13: Selected topics in psychometrics - CAS · 2020-01-07 · Selected topics in psychometrics L10: Computerized Adaptive Testing Patr cia Martinkov a NMST570, January 7, 2020 Department

Introduction Computerized adaptive tests Simulations and examples Conclusion

CAT flowchart Item Bank Starting Continuing Stopping

Stopping rule

• Test length reached

• Test time reached

• Precision criterion on ability estimate

• Classification criterion: threshold met specification for ability confidence

interval

Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 11/19

Page 14: Selected topics in psychometrics - CAS · 2020-01-07 · Selected topics in psychometrics L10: Computerized Adaptive Testing Patr cia Martinkov a NMST570, January 7, 2020 Department

Introduction Computerized adaptive tests Simulations and examples Conclusion

Post-hoc analysis of real data CAT software

Post-hoc analysis of real data

• Having real data from linear test

• Checking how would the results be in CAT design

• Comparison of different settings

Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 12/19

Page 15: Selected topics in psychometrics - CAS · 2020-01-07 · Selected topics in psychometrics L10: Computerized Adaptive Testing Patr cia Martinkov a NMST570, January 7, 2020 Department

Introduction Computerized adaptive tests Simulations and examples Conclusion

Post-hoc analysis of real data CAT software

Simulation study

• Simulating ability

• Setting item parameters (from real test or based on a model)

• Comparison of different settings

• Correlation with true ability, bias, length,...

Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 13/19

Page 16: Selected topics in psychometrics - CAS · 2020-01-07 · Selected topics in psychometrics L10: Computerized Adaptive Testing Patr cia Martinkov a NMST570, January 7, 2020 Department

Introduction Computerized adaptive tests Simulations and examples Conclusion

Post-hoc analysis of real data CAT software

CAT Examples and software

• catIRT

• catR + Concerto

• mirtCAT

Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 14/19

Page 17: Selected topics in psychometrics - CAS · 2020-01-07 · Selected topics in psychometrics L10: Computerized Adaptive Testing Patr cia Martinkov a NMST570, January 7, 2020 Department

Introduction Computerized adaptive tests Simulations and examples Conclusion

CAT conclusions Course conclusions

Conclusion

• CAT is an up-do-date testing method!

• Good precision using much less items

• Increased security

• Faster score reporting

• Less stress and less boredom for students

Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 15/19

Page 18: Selected topics in psychometrics - CAS · 2020-01-07 · Selected topics in psychometrics L10: Computerized Adaptive Testing Patr cia Martinkov a NMST570, January 7, 2020 Department

Introduction Computerized adaptive tests Simulations and examples Conclusion

CAT conclusions Course conclusions

Selected topics in psychometrics

• Introduction

• Reliability and measurement error

• Validity

• Traditional item analysis

• Regression models for item description

• Item response theory models for binary data

• Item response theory models for ordinal and nominal items

• Differential item functioning (2 lessons)

• Computerized adaptive testing and other topics in psychometrics.

Final project assignment

Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 16/19

Page 19: Selected topics in psychometrics - CAS · 2020-01-07 · Selected topics in psychometrics L10: Computerized Adaptive Testing Patr cia Martinkov a NMST570, January 7, 2020 Department

Introduction Computerized adaptive tests Simulations and examples Conclusion

CAT conclusions Course conclusions

Course goals

1. Name the main topics studied in psychometrics

2. Describe and apply strategies to find proofs about validity

3. Describe and apply strategies to find proofs about reliability

4. Describe and apply traditional methods to describe item functioning

(difficulty, discrimination, distractor analysis)

5. Explain how logistic regression may be used to describe item properties,

apply regression models on real data and interpret results

6. Formulate IRT models for binary, polytomous and nominal items (Rasch,

1-4PL IRT, GRM, GPCM, RSM, NRM), apply them on real data and

interpret results

7. Explain concept of differential item functioning, describe and apply some

DIF detection methods (Mantel-Haenszel test, logistic regression,

IRT-based Lord’s test, Raju’s test etc.)

8. Explain process of computerized adaptive testing, prepare adaptive test

9. Describe process of assessment development and validationPatrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 17/19

Page 20: Selected topics in psychometrics - CAS · 2020-01-07 · Selected topics in psychometrics L10: Computerized Adaptive Testing Patr cia Martinkov a NMST570, January 7, 2020 Department

Introduction Computerized adaptive tests Simulations and examples Conclusion

CAT conclusions Course conclusions

Final project

• Introduction (2 pts)

• Methods (2 pts)

• Results (12 pts)

• A consise descriptive analysis (2 pts)

• Proofs of test reliability (2 pts)

• Proofs of test validity (2 pts)

• Item analysis using traditional methods or regression models (2 pts)

• Item response theory model (2 pts)

• Differential item functioning (2 pts)

• Discussion (2 pts)

• References (2 pts)

• Supplemental Materials (4 pts)

• Bonus (own dataset (5 pts), CAT (5 pts), simulation (5pts)

Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 18/19

Page 21: Selected topics in psychometrics - CAS · 2020-01-07 · Selected topics in psychometrics L10: Computerized Adaptive Testing Patr cia Martinkov a NMST570, January 7, 2020 Department

Introduction Computerized adaptive tests Simulations and examples Conclusion

CAT conclusions Course conclusions

Seminar in Psychometrics NMST571

Seminar covers selected topics in psychometrics in more depth.

Tentative schedule:

• Tuesdays, 3:40-5:10pm

Credit requirements:

• Actively present (2 absences tolerated)

• Presentation during one of the sessions (approx. 30 min presentation plus

15 min discussion).

Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 19/19

Page 22: Selected topics in psychometrics - CAS · 2020-01-07 · Selected topics in psychometrics L10: Computerized Adaptive Testing Patr cia Martinkov a NMST570, January 7, 2020 Department

Thank you for your attention!

www.cs.cas.cz/martinkova

Page 23: Selected topics in psychometrics - CAS · 2020-01-07 · Selected topics in psychometrics L10: Computerized Adaptive Testing Patr cia Martinkov a NMST570, January 7, 2020 Department

References

• Magis D, Yan D, von Davier AA. (2017). Computerized Adaptive and

Multistage Testing with R. Springer.

• Magis D, & Raichle G. (2012). Random Generation of Response Patterns under

Computerized Adaptive Testing with the R Package catR. Journal of Statistical

Software, 48, pp. 1-31. https://www.jstatsoft.org/article/view/v048i08

• Chalmers P. (2016). Generating Adaptive and Non-Adaptive Test Interfaces for

Multidimensional Item Response Theory Applications. Journal of Statistical

Software, 71, pp. 1-38. https://www.jstatsoft.org/article/view/v071i05

• Chalmers P. (2017-08-11). Wiki mirtCAT

https://github.com/philchalmers/mirtCAT/wiki