© copyright 2006 Stephen G. Sireci
Stephen G. Sireci
Center for Educational Assessment
University of Massachusetts Amherst
Mary J. Pitoniak
Educational Testing Service
Assessment Accommodations: What Have We Learned From Research?
In this presentation we will
• Discuss validity issues in test accommodations
• List the most common test accommodations used to promote valid score interpretation
• Discuss research conducted on test accommodations
• Suggest areas for future research on test accommodations
Defining “Accommodation”
• The Standards for Educational and Psychological Testing
– use the terms “modification” and “accommodation” almost interchangeably,
– use accommodation “as the general term for any action taken in response to a determination that an individual’s disability requires a departure from standard testing protocol” (p. 101).
Current State Testing Programs
• “Accommodation” is used to refer to test or test administration changes that are not considered to alter the construct measured.
• “Modification” is used to refer to changes that are thought to alter the construct.
Validity Issues in Accommodations
• To support valid test score interpretations for students with disabilities, it is important to remove construct-irrelevant barriers to these students’ test performance, but it is also important to maintain “construct representation.”
• In situations where individuals who take accommodated versions of tests are compared with those who take the standard version, an additional validity issue is the comparability of scores across the different test formats.
The Psychometric Oxymoron
• Accommodated Standardized Test
– Promotes fairness in testing?
Or
– Provides an unfair advantage to some examinees?
What do the Standards for Educational and Psychological Testing say on this issue?
Standards for Educational and Psychological Testing
• Standard 10.1: “In testing individuals with disabilities, test developers, test administrators, and test users should take steps to ensure that the test score inferences accurately reflect the intended construct rather than any disabilities and their associated characteristics extraneous to the intent of the measurement” (AERA et al., p. 106).
Standards for Educational and Psychological Testing
• Standard 10.4: “If modifications are made or recommended by test developers. . . (unless) evidence of validity for a given inference has been established for individuals with the specific disabilities, test developers should issue cautionary statements in manuals or supplementary materials regarding confidence in interpretations based on such test scores” (AERA et al., p. 106).
“Cautionary statements”
• Flagging of test scores: Controversial—most research in this area focused on postsecondary and postgraduate admissions tests (Sireci, 2005).
• How do states handle score reporting issues for accommodated and alternate assessments?
Accommodated Tests and Accommodated Test Administrations Have the Potential to Undermine Validity in at Least 2 Ways:
1. Construct underrepresentation
2. Construct-irrelevant variance
As stated by Messick (1989):
“Tests are imperfect measures of constructs because they either leave out something that should be included…or else include something that should be left out, or both” (p. 34)
• When standardized tests are NOT accommodated for students with disabilities (SWD)
– Construct-irrelevant variance can interfere with test performance
• e.g., ability to see, hear, or focus interferes with measurement of math or reading proficiency
• When standardized tests ARE accommodated
– Construct underrepresentation may occur
• e.g., read-aloud for a reading assessment
What methods do states use to minimize construct-irrelevant variance, while maintaining construct representation?
Categories of Accommodations
• Presentation
• Timing
• Response
• Setting
Thompson, Blount, and Thurlow (2002)
Presentation Accommodations
• Oral (read-aloud, audiocassette)
• Paraphrasing
• Technological
• Braille/large print
• Sign language interpreter
• Encouragement (redirecting)
• Cueing
• Spelling assistance
• Use of manipulatives
Timing Accommodations
• Extended time
• Multiple days/sessions
• Separate sessions
Timing accommodations are not so much an issue on state standards-based assessments because most have generous time limits.
Response Accommodations
• Scribe
• Booklet versus answer sheet
• Marking booklet to maintain place
• Transcription
Setting Accommodations
• Individual administration
• Administration in a separate room
Psychometric Research on Test Accommodations Has Focused On
• Has the accommodation changed the construct measured?
– Speed
– Different skill
• Do accommodations help only those who need them?
– Interaction hypothesis
• Do test scores from accommodated and non-accommodated administrations have the same meaning?
Research on test accommodations for individuals with disabilities:
• Little empirical study
• Some literature reviews
– Willingham et al. (1988)
– Chiu & Pearson (1999)
– Tindal & Fuchs (2000)
– Pitoniak & Royer (2001)
– Thompson et al. (2002)
– Bolt & Thurlow (2004)
– Sireci, Scarpati, & Li (2005)
• Psychometric issues (Geisinger, 1994)
• Legal issues (Phillips, 1994)
• Also: Keeping Score for All (Koenig & Bachman, 2004)
Sireci, Scarpati, & Li (2005) Research Questions
• Do test accommodations improve the scores of students with disabilities (SWD)?
• If so, do such score gains reflect increased validity or unfair advantage?
– Interaction hypothesis
• What specific types of accommodations are best for specific types of students?
Interaction Hypothesis
Figure 1. Illustration of Interaction Hypothesis
[Figure: mean score (y-axis, 10 to 60) by accommodation condition (x-axis: ACC, No ACC), with one line per group (GEN, SWD/ELL)]
MacArthur & Cavalier (2004)
“Differential impact on students with and without disabilities provides evidence that the accommodation removes a barrier based on disability” (p. 55).
Fletcher et al. (2006)
“Because the source of variance is fundamentally irrelevant to the measurement of the construct, a valid accommodation will improve performance only for students with a disability” (p. 138).
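The interaction hypothesis can be made concrete with a small calculation. The sketch below is illustrative only: the function name and the cell means are invented, and the group labels GEN and SWD are borrowed from Figure 1. It computes each group's gain under accommodation and the difference between those gains. Under the hypothesis as quoted above, the GEN gain would be near zero and the contrast positive; a real study would also need a significance test of the interaction.

```python
def interaction_contrast(means):
    """Return per-group gains and the difference between them.

    `means` maps a group label to a dict with the mean score under
    accommodation ("acc") and under standard conditions ("no_acc").
    """
    gain_swd = means["SWD"]["acc"] - means["SWD"]["no_acc"]
    gain_gen = means["GEN"]["acc"] - means["GEN"]["no_acc"]
    return {
        "gain_swd": gain_swd,
        "gain_gen": gain_gen,
        # Positive contrast: SWD gained more than the general group.
        "contrast": gain_swd - gain_gen,
    }


# Invented cell means, for illustration only (not data from any study).
example_means = {
    "SWD": {"acc": 40.0, "no_acc": 25.0},
    "GEN": {"acc": 50.0, "no_acc": 50.0},
}
result = interaction_contrast(example_means)
print(result)
```

A contrast near zero with gains for both groups would mean the accommodation helps everyone about equally, which is the pattern that raises the "unfair advantage" question.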
Are there any general conclusions regarding effects?
• Extended time seems to help and it helps SWD more than non-SWD.
• Oral accommodations show promise (math), but less uniformity across studies. Effects are considered unclear.
Review Process
• ERIC and PsycINFO searches
• E-mails to researchers in this area
Structure of review
• Dimension 1: SWD or ELL
• Dimension 2: Type of accommodation
• Dimension 3: Experimental or non-experimental study
Note that the review was primarily conducted in 2003 and so the results are somewhat dated. We have, however, reviewed additional research since then.
Characteristics of Studies
Research Design | Study Focused On: SWD | Study Focused On: ELL | Total
Experimental | 13 | 8 | 21
Quasi-experimental | 2 | 4 | 6
Non-experimental | 10 | 1 | 11
Total | 25 | 13 | 38
Studies pertaining exclusively to ELL will not be discussed in this presentation.
Types of Accommodations
Type(s) of Accommodation | # of Studies
Presentation:
Oral* | 23
Paraphrase | 2
Technological | 2
Braille/Large Print | 1
Sign Language | 1
Encouragement | 1
Cueing | 1
Spelling assistance | 1
Manipulatives | 1
Timing:
Extended time | 14
Multi day/sessions | 1
Separate sessions | 1
Response:
Scribes | 2
In booklet vs. answer sheet | 1
Mark task book to maintain place | 1
Transcription | 1
Setting (separate room) | 1
*Includes read aloud, audiotape or videotape, and screen-reading software.
Note: Literature reviews and issues papers are not included in this table.
Characteristics of Studies
• Most of the studies focused on elementary school (2/3 between grades 3 and 8).
• Only 41% were published in peer-reviewed journals.
Results: Extended Time
• Most common findings were gains for both SWD and non-SWD.
– Contrast Camara et al. (1998) with Bridgeman et al. (in press)
• Most studies of extended time (6 of 8) looked at students with learning disabilities (SWLD)
Summary of Studies on Extended Time (1)
Study | Subject(s) | Design | Results | H1?
Elliott & Marquart (2004) | Math | Experimental | All student groups gained | No
Runyan (1991) | Reading | Experimental | Greater gains for SWD | Yes
Zurcher & Bryant (2001) | Analogy test | Quasi-experimental | No gains for either group | No
Huesman & Frisbie (2000) | Reading | Quasi-experimental | Gains for LD but not for non-LD groups | Yes
Alster (1997) | Math | Quasi-experimental | Greater gains for SWD | Yes
Summary of Studies on Extended Time (2)
Study | Subject(s) | Design | Results | H1?
Camara, Copeland, & Rothchild (1998) | SAT | Ex post facto | Gains for LD retesters 3x greater than gains of standard retesters | Yes
Ziomek & Andrews (1998) | ACT | Ex post facto | Gains for LD retesters 4x greater than gains of standard retesters | Yes
Zuriff (2000) | Reading, ACT, GRE | 5 experimental | Gains for both SWD and non-SWD | No
Results: Oral
• Results depend on subject
– Gains for SWD only in Math
– No differential gain in other subject areas
– Tends to support oral accommodation for math tests
Study | Subject | Design | Results | H1?
Weston (2002) | Math | Experimental (b/w and w/in groups) | Greater gains for SWD | Yes
Tindal, Heath, et al. (1998) | Math | Experimental (b/w and w/in groups) | Sig. gain for SWD only | Yes
Calhoon, Fuchs, & Hamlett (2000) | Math | Experimental (w/in group) | Sig. gains for oral accom., no differences b/w teacher & computer | Yes
Johnson (2000) | Math | Experimental (b/w group) | Greater gains for SWD | Yes
Huynh, Meyer, & Gallant (2004) | Math | Ex post facto | Accommodated SWD > matched non-accom. SWD | Yes
Helwig & Tindal (2003) | Math | Quasi-experimental | Teachers not accurate in predicting benefit; no gains for either group | No
Meloy, Deville, & Frisbie (2000) | Science, Math, Reading | Experimental (b/w and w/in groups) | Similar gains for SWD and non-SWD | No
Oral (continued)
Study | Subject | Design | Results | H1?
Brown & Augustine (2001) | Science, Social Studies | Experimental (b/w and w/in groups) | No gain | No
Kosciolek & Ysseldyke (2000) | Reading | Quasi-experimental | SWD had greater gains, but not statistically significant | No
McKevitt & Elliot (2003) | Reading | Experimental (b/w and w/in groups) | No sig. effect size differences b/w accom. & standard conditions for either group | No
More Recent Research
• Extended time
– Cohen, Gregg, & Deng (2005)
– Wainer, Bridgeman, Najarian, & Trapani (2004)
• Oral
– Fletcher, Francis, Boudousquie, Copeland, Young, Kalinowski, & Vaughn (2006)
• Dictation software
– MacArthur & Cavalier (2004)
Cohen, Gregg, & Deng (2005)
• Looked at groups of students with and without accommodations and their performance on specific types of math items using differential item functioning (DIF) methods
– Accommodation status “only marginally related to the pattern of accommodation-related DIF”
– Different types of students benefited from the extra time
– DIF not due to accommodations, but to differences in students’ performance across different types of math items
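For readers unfamiliar with DIF, the sketch below shows one common procedure, the Mantel-Haenszel common odds ratio. The slides do not say which DIF method Cohen, Gregg, and Deng used, so this is a stand-in for illustration and not a reproduction of their analysis. The function names, the choice of non-accommodated examinees as the reference group, and all counts are assumptions.

```python
import math


def mantel_haenszel_odds_ratio(strata):
    """Mantel-Haenszel common odds ratio for one item.

    Each stratum (examinees matched on total score) is a tuple:
    (ref_correct, ref_incorrect, focal_correct, focal_incorrect).
    """
    numerator = 0.0
    denominator = 0.0
    for ref_right, ref_wrong, focal_right, focal_wrong in strata:
        n = ref_right + ref_wrong + focal_right + focal_wrong
        if n == 0:
            continue  # an empty stratum contributes nothing
        numerator += ref_right * focal_wrong / n
        denominator += ref_wrong * focal_right / n
    if denominator == 0:
        raise ValueError("odds ratio undefined for these counts")
    return numerator / denominator


def mh_delta(odds_ratio):
    """Rescale to the delta metric often used to report MH DIF."""
    return -2.35 * math.log(odds_ratio)


# Invented counts for one item in three score strata (illustration only).
# Assumed reference group: non-accommodated; focal group: accommodated.
example_strata = [(20, 10, 10, 5), (30, 10, 15, 5), (40, 5, 16, 2)]
alpha = mantel_haenszel_odds_ratio(example_strata)
print(alpha, mh_delta(alpha))
```

An odds ratio near 1 (delta near 0) means matched examinees in the two groups answer the item correctly at similar rates, that is, little evidence of DIF on that item.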
Cohen, Gregg, & Deng (2005)
“Accommodations are more appropriately viewed as leveling the playing field; they do not supply the knowledge necessary to pass tests” (p. 231).
Wainer et al. (2004)
• Reanalysis of Bridgeman, Trapani, & Curley (2004) data
• Evaluated extended time by shortening experimental sections of SAT
• Little difference for verbal (about 5-point gain)
• Big difference for quantitative
– about 10-30 points, with larger gain associated with larger time extension
– Largest gains for highest-scoring students
Wainer et al. (2004)
• Looked at correlations b/w scores from standard and extended time with students’ HS math grades
– Claimed no relationship, but results (correlations and sample sizes) were not reported!
– Important idea to look at external validity criterion
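The kind of reporting asked for here is easy to sketch: for each timing condition, the correlation between test scores and the external criterion, together with the sample size. Everything below is hypothetical, including the function names, the condition labels, and the scores and grades. It only illustrates what "correlations and sample sizes" would look like if reported.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


def validity_report(conditions):
    """Map each condition to (r, n) for test score vs. criterion."""
    return {
        name: (pearson_r(scores, grades), len(scores))
        for name, (scores, grades) in conditions.items()
    }


# Invented test scores and HS math grades (illustration only).
example = {
    "standard": ([480, 520, 560, 600, 640], [2.4, 2.9, 3.0, 3.4, 3.7]),
    "extended": ([500, 530, 590, 610, 660], [2.5, 2.8, 3.1, 3.3, 3.8]),
}
report = validity_report(example)
for name, (r, n) in report.items():
    print(f"{name}: r = {r:.2f}, n = {n}")
```

Similar correlations across conditions would suggest that extra time does not change how scores relate to the criterion; without the values and sample sizes, a reader cannot judge that claim.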
Wainer et al. (2004)
• Claim that results support not flagging verbal scores but flagging quantitative scores
– Don’t acknowledge presence of undesired speededness
– SWD not included in study
• Hard to agree with conclusions
• Supports increasing time limit on SAT-Q
Fletcher et al. (2006)
• Experimental study involving Grade 3 students with (n=91) and without (n=91) decoding difficulties associated with dyslexia
• Oral accommodation vs. standard administration of a reading test (Texas)
Fletcher et al. (2006)
• Accommodation targeted for specific disability
– Oral reading of proper nouns, comprehension stems, & answer choices
– Designed to reduce the impact of word recognition difficulties
Fletcher et al. (2006)
• Results
– Significant group/accommodation interaction
– Only SWD benefited from the accommodation
– Seven times greater likelihood of passing the test with the accommodation
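A "times greater likelihood" figure can be an odds ratio or a ratio of pass rates, and the slide does not say which. The sketch below uses invented counts, chosen only to show how far apart the two can be; the counts and the function name are hypothetical and are not data from Fletcher et al. (2006).

```python
def pass_rate_comparison(pass_acc, fail_acc, pass_std, fail_std):
    """Compare passing under accommodated vs. standard conditions.

    Returns the two pass rates, their ratio (risk ratio), and the
    odds ratio computed from the same 2x2 table of counts.
    """
    rate_acc = pass_acc / (pass_acc + fail_acc)
    rate_std = pass_std / (pass_std + fail_std)
    return {
        "rate_acc": rate_acc,
        "rate_std": rate_std,
        "risk_ratio": rate_acc / rate_std,
        "odds_ratio": (pass_acc * fail_std) / (fail_acc * pass_std),
    }


# Invented counts (illustration only): 30 of 45 pass with the
# accommodation, 10 of 45 pass under standard conditions.
comparison = pass_rate_comparison(30, 15, 10, 35)
print(comparison)
```

With these counts the odds ratio is 7 while the pass rate is only 3 times higher, so it matters which statistic a "seven times" claim refers to.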
MacArthur & Cavalier (2004)
• Looked at accommodations for writing assessments
– Experimental study: SWD (n=21), students w/o documented disability (n=10)
– Three accommodation conditions:
• hand-written
• dictation to scribe
• dictation to speech recognition software
– 48 states allow dictation accommodation (17 exclude scores)
MacArthur & Cavalier (2004)
• Results:
– Dictation improved writing scores for SWD, with scribe > speech recognition software > hand-written
– Dictation did not improve scores for students w/o disability
– No difference between student groups with respect to preference (hand vs. dictation)
MacArthur & Cavalier (2004)
• Caveat
– Small n (21, 10)
• Construct issue
– Dictation okay if construct = “composing”
– Not okay if construct = “writing”
Research on Equivalence of Test Structure
• One aspect of “construct equivalence”
– Rock, Bennett, Kaplan, & Jirele (1988)
– Tippets & Michaels (1997)
– Huynh, Meyer, & Gallant (2004)
– Huynh & Barton (2006)
– Cook, Eignor, Sawaki, Steinberg, & Cline (2006)
Research on Equivalence of Test Structure
Results tend to support similarity of test structure across accommodated and standard test administrations (oral, extended time, various).
Discussion (1)
• Do accommodations hurt or promote valid score interpretations for students with disabilities?
– Accommodations are designed to promote validity by removing barriers (irrelevant variance)
– In general, the research suggests the accommodations being used are sensible and defensible.
Discussion (2)
• Extended time seems to be a valid accommodation.
– Unintended test speededness could explain results for students w/o disabilities
– Results support revised interaction hypothesis or “differential boost.”
Interaction Hypothesis: Typical
[Figure: Illustration of Interaction Hypothesis. Mean score (y-axis, 10 to 60) by accommodation condition (x-axis: ACC, No ACC), with one line per group (GEN, SWD/ELL)]
Interaction Hypothesis: Revised “Differential Boost” (Fuchs, Fuchs, Eaton, Hamlett, & Karns, 2000)
[Figure: Illustration of Revised Interaction Hypothesis. Mean score (y-axis, 10 to 60) by accommodation condition (x-axis: ACC, No ACC), with one line per group (GEN, SWD/ELL)]
Discussion (3)
• Other accommodations have less consistent and convincing results, but no evidence of “harm” or “unfairness.”
• It should be noted that lots of solid and ingenious experimental research has been done in this area.
– Small n, but intense with respect to data collection
Discussion (4)
• Oral accommodation for math seems valid.
• Oral accommodation for reading involves consideration of specific construct changes
– Fletcher et al. (2006) results indicate matching disability and accommodation to one aspect of construct promotes validity
Discussion (5)
• Looking across various studies and accommodation conditions
– Lots of variability across studies with respect to
• Accommodation conditions and how they were implemented
• Student groups (within and between)
• Results
Future Directions for Test Design
• Test Development: Universal test design
– Build tests that are “accessible to all” (i.e., that do not need to be accommodated).
– CBT could be particularly helpful in this regard.
– 19th & 20th century: Standardization
– 21st century? Adaptivity? (can’t be oxymoronic)
Future Directions for Research (1)
• Meta-analysis based on practice
– Non-published test accommodations being conducted in states
– Establish a data warehouse for teachers and test administrators to record results and make comments?
– Would address the small-n issue
Future Directions for Research (2)
• Larger sample sizes due to inclusion, coupled with improved school data management systems, should promote more research on
– Differential item functioning
– Structural equivalence
– Analysis of educational gains
Future Directions for Research (3)
• More needs to be done on potential changes to the construct
– Most often decided by logical analysis
– Structural equivalence research is limited
– Structural equivalence ≠ construct equivalence
Let’s go do it!
Thank you for your attention!