Study of Device Comparability within the PARCC Field Test


Page 1: Study of Device Comparability within the PARCC Field Test

Study of Device Comparability within the PARCC Field Test

Page 2: Study of Device Comparability within the PARCC Field Test


PARCC’s ultimate goal for test delivery
• Digital delivery of the PARCC ELA and mathematics assessments
  – On the widest variety of devices that will support interchangeable scores
  – E.g., desktop computers, laptops, and tablets

Goal of Test Delivery

Fairness

Page 3: Study of Device Comparability within the PARCC Field Test


• “Tablets” = full-size (10”) iPads
• One form of each of the following tests was chosen for administration on iPads:
  – Grade 4 ELA/Literacy and Mathematics
  – Grade 8 ELA/Literacy and Mathematics
  – Grade 10 ELA/Literacy
  – Geometry
• Selected “condition 1” forms so that the same students took both the PBA and EOY components of the selected forms

Quantitative Comparability Study

Page 4: Study of Device Comparability within the PARCC Field Test


1. Do the individual items/tasks perform similarly across computers and tablets?

2. Are the psychometric properties of the test scores similar across computers and tablets?

3. Do students perform similarly on the overall test across computers and tablets?

Research Questions

Page 5: Study of Device Comparability within the PARCC Field Test

Methodology

Page 6: Study of Device Comparability within the PARCC Field Test


• Grade 8 and high school studies used random assignment of Burlington, MA students to computer and tablet conditions
  – Random assignment to conditions by homeroom or class section
• Grade 4 study used matched sample from MA
  – Burlington students assigned to tablet condition matched to other MA students who tested on computer
  – Matching based on previous scores on state assessment, Massachusetts Comprehensive Assessment System (MCAS)

Data Collection Design
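
The slides do not say how the Grade 4 matching was carried out. As an illustration only, a matched sample could be built by greedy one-to-one nearest-neighbor matching on a prior score, as sketched below. The function name `match_samples`, the student IDs, and the scores are hypothetical and are not taken from the study.

```python
def match_samples(tablet, computer):
    """Greedy 1:1 nearest-neighbor matching on a prior score, without replacement.

    tablet, computer: lists of (student_id, prior_score) tuples.
    Returns a list of (tablet_id, computer_id) pairs.
    This is an assumed procedure, not the one used in the study.
    """
    available = list(computer)
    pairs = []
    for t_id, t_score in tablet:
        if not available:
            break
        # Pick the unused computer-condition student with the closest prior score.
        best = min(available, key=lambda c: abs(c[1] - t_score))
        available.remove(best)
        pairs.append((t_id, best[0]))
    return pairs


# Made-up prior scores for illustration.
tablet = [("t1", 240), ("t2", 262)]
computer = [("c1", 238), ("c2", 250), ("c3", 264), ("c4", 270)]
pairs = match_samples(tablet, computer)
print(pairs)
```

Greedy matching depends on the order in which tablet students are processed. A real study would typically also check how well the matched groups agree on the matching variable, which is the "degree of success in matching" these slides later cite as a caveat.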

Page 7: Study of Device Comparability within the PARCC Field Test


• Item/Task Level Analysis
  – Comparison of p-values and item means
  – Analysis of IRT item difficulty estimates
• Component Level Analysis
  – Correlation between PBA and EOY scores
• Test Level Analysis
  – Reliability
  – Validity
  – Score Interpretation

Analysis Methods
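
Two of the quantities listed above can be sketched in a few lines. In classical test theory an item "p-value" is the proportion of students answering the item correctly, not a significance level. For reliability, the slides do not name a coefficient, so Cronbach's alpha is used here as an assumption. All response data below are made up, and the function names are hypothetical.

```python
from statistics import pvariance


def item_p_values(responses):
    """Proportion correct per item; responses is a list of per-student 0/1 lists."""
    n_students = len(responses)
    n_items = len(responses[0])
    return [sum(r[i] for r in responses) / n_students for i in range(n_items)]


def cronbach_alpha(responses):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total-score variance)."""
    k = len(responses[0])
    item_vars = [pvariance([r[i] for r in responses]) for i in range(k)]
    total_var = pvariance([sum(r) for r in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Made-up scored responses (rows = students, columns = items).
computer = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
tablet = [[1, 1, 1], [1, 0, 0], [0, 0, 0], [0, 0, 0]]

p_computer = item_p_values(computer)
p_tablet = item_p_values(tablet)
# Tablet minus computer: a negative value means the item was harder on tablet.
p_diff = [t - c for t, c in zip(p_tablet, p_computer)]
alpha_computer = cronbach_alpha(computer)

print(p_computer, p_tablet, p_diff, alpha_computer)
```

A raw p-value difference only flags candidate items. Deciding that an item shows a device effect would need a criterion, such as a statistical test or an effect-size threshold, which the slides do not specify.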

Page 8: Study of Device Comparability within the PARCC Field Test

Summary of Results

Page 9: Study of Device Comparability within the PARCC Field Test


• Grade 4 Mathematics
  – Device effect found for 18 of 51 (35%) items
  – Elementary students less familiar with taking mathematics tests online
  – Degree of success in matching samples for Grade 4
• Grade 8 Mathematics
  – Device effect found in component-level and reliability analysis
  – Highest number of items (29 of 67, or 43%) excluded from study

Observed Device Effect

Page 10: Study of Device Comparability within the PARCC Field Test


• Grade 4 ELA
  – Device effect found in validity and score interpretation analysis
  – Elementary students less familiar with taking items/tasks that are not selected responses online
  – Degree of success in matching samples for Grade 4
• Consistent device effect across analyses was not observed for any of the tests in the study
  – Device effect was found for none of the analyses in Grades 8 ELA and Geometry

Observed Device Effect

Page 11: Study of Device Comparability within the PARCC Field Test

Conclusions and Implications

Page 12: Study of Device Comparability within the PARCC Field Test


1. Do the individual items/tasks perform similarly across computers and tablets?
   o YES, for most items/tasks in the study
   o More items with device effect in Grade 4
     – Unfamiliarity with taking certain item types online for elementary school students
     – Degree of success in matched samples
   o Insufficient device effect items to draw conclusions about item features

Conclusions

Page 13: Study of Device Comparability within the PARCC Field Test


2. Are the psychometric properties of the test scores similar across computers and tablets?
   o YES, for all but one test in this study
   o Exception: Grade 8 mathematics (component-level and reliability analyses)
     – Highest number of items excluded from study may have led to less stable correlation estimates

Conclusions

Page 14: Study of Device Comparability within the PARCC Field Test


3. Do students perform similarly on the overall test across computers and tablets?
   o In general, YES: no consistent device effect was observed across analyses for any test in study
   o Device effect found in score interpretation analysis for Grade 4 ELA
     – Unfamiliarity with taking non-selected response tasks online for elementary school students
     – Degree of success in matched samples

Conclusions

Page 15: Study of Device Comparability within the PARCC Field Test


• Comparability of assessments administered on computers and tablets
  – No evidence of large or consistent differences in comparability was found in this study
  – Also supported by device comparability research conducted outside of PARCC (e.g., Davis, Orr, Kong, Lin, 2014; Olsen, 2014; Davis, Kong, McBride, 2015)
  – Further supported by policies in other large-scale assessment programs (e.g., SBAC and other statewide assessments)

Implications

Page 16: Study of Device Comparability within the PARCC Field Test


• Item development and user interface design
  – Consider familiarity of younger students with nontraditional item types online
  – Additional focus groups and/or cognitive labs with elementary school students
  – Minimize the use of item features (e.g., drag and drop) that may lead to differential performance across computers and tablets

Implications