Effects of Question Format on
2005 8th Grade Science WASL Scores
Janet Gordon, Ed.D.
A Big Thank-you!
WERA
Pete Bylsma
Andrea Meld
Roy Beven
Yoonsun Lee
Joe Willhoft
North Central ESD
Today’s Presentation
• National trends in assessment
• Washington State trends
• My research on the science WASL
• A look at the literature to try to explain research results
• Take-home messages
National Trends in Science and Mathematics Assessments
Placing more emphasis on:
• Assessing what is valued in the science professional community (inquiry, application)
• Assessing tightly integrated knowledge linked to application
• Involving teachers and professionals in test development
Compared to:
• What is easily measured
• Discrete bits of knowledge
• Off-the-shelf commercial tests
Improvements in the National Assessment of Educational Progress (NAEP)
• Items grouped into thematic blocks with rich context.
• Real-world application.
• Emphasizes integrated knowledge rather than bits of information.
The NAEP Results
• Lower omission rates on thematically grouped items compared to stand-alone m/c items.
• Increased student motivation to try items.
• Increased student engagement (Silver et al., 2000; Kenney & Lindquist, 2000).
Washington’s Science Standards & Strands
Washington’s Science Strands
2 Science WASL Question Types
Scenario type (most items)
• Rich context
• Clear, authentic task
• 5 to 6 multiple-choice, short-, or extended-constructed-response items
Stand-alone type (few items)
• Discrete bits of knowledge
• 1 multiple-choice or short-constructed-response item
3 Item Response Formats
• Extended Constructed Response (ECR) – students write 3-4 sentences
• Short Constructed Response (SCR) – students write 1-2 sentences
• Multiple-choice (M/C)
3 Categories of Factors That Affect Student Achievement Scores
• The Student – model of cognition: culture, gender, ethnicity, individual differences
• The Test Item – observation: item format
• Interpretation – measurement model (IRT, Bayes Nets)
The Test Item - Observation
• Girls scored much lower on m/c compared to boys (Jones et al., 1992)
• Girls scored higher on constructed response compared to boys (Zenisky et al., 2004)
• Underrepresented groups score higher on performance-like formats (Stecher et al., 2000)
• Embedded Context = Increased comprehension (Solano-Flores, 2002; Zumbach & Reimann, 2002)
State’s 2005 Science WASL Scores
Percent proficient and non-proficient on the 2005 8th-grade science WASL, by ethnicity:
• African American – 20% proficient, 80% not proficient
• Hispanic – 21% proficient, 79% not proficient
• American Indian – 28% proficient, 72% not proficient
• White – 45% proficient, 55% not proficient
Statement of Problem
Is the science WASL accurately measuring
what students know?
Hypothesis
• Contextual, real-world scenarios make information accessible to all ethnicities (“cultural validity”).
• Clear, authentic tasks within scenario questions “unpack” prior knowledge for ALL students.
• Gender neutral – extended and short constructed-response formats, not just m/c.
Research Questions
On the 2005 8th grade science WASL:
Is there any significant difference in performance between gender and/or ethnic groups:
1) on stand-alone question types?
2) on scenario question types?
Methods - Instrument
• OSPI provided results from 8th grade 2005 science WASL
• Entire population: N = 81,690
• Invalid records excluded (e.g. cheating)
• Incomplete records excluded (e.g. gender or ethnicity omitted)
• Actual population: N = 77,692
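The record screening described above amounts to a simple filter over the population. A minimal sketch in Python (field names such as `valid`, `gender`, and `ethnicity` are hypothetical, not OSPI’s actual data schema):

```python
# Illustrative sketch of the record screening described above.
# Field names ("valid", "gender", "ethnicity") are hypothetical.

def screen_records(records):
    """Keep only valid records that report both gender and ethnicity."""
    return [
        r for r in records
        if r.get("valid", True)             # drop invalidated records (e.g., cheating)
        and r.get("gender") is not None     # drop records with gender omitted
        and r.get("ethnicity") is not None  # drop records with ethnicity omitted
    ]

raw = [
    {"valid": True, "gender": "F", "ethnicity": "White"},
    {"valid": False, "gender": "M", "ethnicity": "Hispanic"},  # invalidated
    {"valid": True, "gender": None, "ethnicity": "Black"},     # gender omitted
]
kept = screen_records(raw)
print(len(kept))  # 1
```

The same screening logic explains the drop from N = 81,690 to N = 77,692 in the study’s actual population.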
Methods - Analysis
• MANOVA & follow-up ANOVAs
• Dependent variables: scenario score points; stand-alone score points
• Independent variables: gender; ethnicity
Methods - Analysis
• Analysis I – all item response formats
• Analysis II – multiple-choice response formats only
• Effect size (Cohen’s d) – magnitude of differences
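Cohen’s d, used here to gauge magnitude, is the difference between two group means divided by their pooled standard deviation; conventional benchmarks are roughly 0.2 (small), 0.5 (moderate), and 0.8 (large). A minimal sketch with illustrative data (not the WASL results):

```python
import math

def cohens_d(group1, group2):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    m1 = sum(group1) / n1
    m2 = sum(group2) / n2
    # Sample variances (n - 1 denominator)
    v1 = sum((x - m1) ** 2 for x in group1) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in group2) / (n2 - 1)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

# Illustrative score points for two hypothetical groups.
a = [10, 12, 11, 13, 12]
b = [8, 9, 10, 9, 8]
print(round(cohens_d(a, b), 2))  # 2.8
```

Because d is standardized, it lets the stand-alone and scenario gaps be compared on the same scale even though the two question types carry different numbers of score points.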
Results
Stand-Alone Question Type – Analysis of Variance
Significant differences?
• Gender groups – NO
• Ethnic subgroups – YES
• Ethnicity x gender – YES
Effect size
• Gender – very small
• Ethnicity x gender – very small
• Ethnicity – small to moderate, between the White, Asian, and Multi-Racial groups AND the AI/AN, HPI, Black, and Hispanic groups
Scenario Question Type – Analysis of Variance
Significant differences?
• Gender groups – NO
• Ethnic subgroups – YES
• Ethnicity x gender – YES
Effect size
• Gender – very small
• Ethnicity x gender – very small
• Ethnicity – LARGE, between the White, Asian, and Multi-Racial groups AND the AI/AN, HPI, Black, and Hispanic groups
Result 1
The achievement gap
between ethnic subgroups
is LARGER
on SCENARIO
vs. stand-alone question types.
Result 2
More students
received MORE points
on STAND-ALONE question
types compared to
scenario question types.
Result 3
A new achievement gap
between boys and girls
IS CREATED
when extended
constructed response items
are removed.
Three Prevailing Themes in the Literature to Help Explain Differences in Student Achievement
THEME I - Individual Differences
• Content knowledge
• Strategic processing knowledge
Expert/Novice Theory (Alexander, 2003; Chi, 1988)
• Novice – dependent on working-memory limits.
• Expert – fluent; freed-up working memory to focus on the meaning and execution of the problem.
THEME II - Opportunity To Learn
Quality Teaching & Learning (Darling-Hammond, 2000)
• There are differences between schools in students’ exposure to knowledge, or OTL.
• Deep understanding of science strategic-processing knowledge often requires direct instruction and lots of practice (Garner, 1987).
• OTL is often compromised in high-need schools (lack of PD support, supplies).
Theme III – Attributes of Items
1) Passage length (Davies, 1988)
2) Academic vocabulary (Shaftel et al., 2006)
3) Degree of knowledge transfer (Chi et al., 1987)
4) Ambiguity & complexity in performance-like items (Haydel, 2003)
5) Science strand type (Bruschi & Anderson, 1994)
6) Instructional sensitivity of item (D’Agostino et al., 2007)
Sensitivity of Items to Variations in Classroom Instruction (D’Agostino et al., 2007)
“The Test Gap” vs. “The Learning Gap”
Some item response formats are more sensitive to variations in classroom instruction than others.
Translating This Into Classroom Practice
• Inspired to dig deeper into detailed learning progressions from novice to expert.
• Use these principles in your formative assessment process; they can identify where students need rich feedback.
• Many teachers are creating common classroom-based assessments (CBAs) for quarterly benchmarking.
“To Go” Classroom-Based Assessment (CBA) Creation Checklist
“Because not all items are created equal.”
Did I… / For this reason…
• Use m/c, short, and extended response item types? / To give both boys and girls an equal chance to show evidence of learning.
• Keep passage and sentence length to a minimum? / To uncover gaps in content knowledge separately from reading ability.
• Use the same academic vocabulary that is in the standards? / Items are sensitive to variations in classroom instruction; match instruction to standards.
“Lessons to Go”
• Use all 3 item response types in your classroom-based assessments (CBAs).
• Keep passage length at a minimum to tease apart content knowledge from reading ability and working memory limitations.
• Use the same academic vocabulary in the classroom and on your CBAs that is on the WASL.
• Use embedded context in a way that is similar to how students learned the material.
Suggestions for Future Research
1) Do similar patterns within question types exist between schools? Between classrooms?
2) Deeper examination of performance variance at the item level: what level of strategic-processing knowledge is assumed compared to content knowledge?
3) Students’ perceptions of assessment items (think-aloud protocol).
4) Do the same patterns exist independent of reading proficiency?
References – Page 1
Alexander, P. A. (2003). The development of expertise: The journey from acclimation to proficiency. Educational Researcher, 32(8), 10-14.
Anderson, J. R. (1990). Cognitive Psychology and Its Implications (3rd ed.). New York: W.H. Freeman
Bruschi, B. A., & Anderson, B. T. (1994). Gender and ethnic differences in science achievement of nine-, thirteen-, and seventeen-year-old students. Paper presented at the Eastern Educational Research Association, Sarasota, FL.
Chi, M. T., Glaser, R., & Farr, M. J. (1988). The Nature of Expertise. Hillsdale, NJ: Lawrence Erlbaum Associates.
Cohen, D. K., & Hill, H. C. (2000). Instructional policy and classroom performance: The mathematics reform in California. Teachers College Record, 102(2), 294-343.
D'Agostino, J. V., Welsh, M. E., & Corson, M. E. (2007). Instructional sensitivity of a state's standards-based assessment. Educational Assessment, 12, 1-22.
Darling-Hammond, L. (2000). Teacher quality and student achievement: A review of state policy evidence. Seattle: Center for the Study of Teaching and Policy, University of Washington.
References – Page 2
de Ribaupierre, A., & Rieben, L. (1995). Individual and situational variability in cognitive development. Educational Psychologist, 30(1), 5-14.
Garner, R. (1990). When children and adults do not use learning strategies: Towards a theory of settings. Review of Educational Research, 60, 517-529.
Haydel, A. M. (2003). Using cognitive analysis to understand motivational and situational influences in science achievement. Paper presented at the AERA, Chicago, Il.
Shaftel, J., Belton-Kocher, E., Glasnapp, D., & Poggio, J. (2006). The impact of language characteristics in mathematics test items on the performance of English language learners and students with disabilities. Educational Assessment, 11(2), 105-126.
Woltz, D. J. (2003). Implicit cognitive processes as aptitudes for learning. Educational Psychologist, 38(2), 95-104.