Measuring Student Growth
Feedback Loop
Key Concept
All models are wrong, but some are useful. George Box
Two Conflicting Models
Ptolemaic
Copernican
Key Problems
Theta (θ, the latent ability being estimated)
Lack of Randomness: neither teachers nor students are randomly assigned
Topics
What are the main types of growth models and their applications?
What are the major problems presented by the use of growth models?
What does the actual application of growth models look like?
Assessment and scoring
Item Analysis
Difficulty Index
p = (number correct) / (number of responses)
Discrimination Index
d = ((number correct, upper group) − (number correct, lower group)) / (number in each group)
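These two indices are easy to compute directly. A minimal Python sketch (the function names and example numbers are illustrative, not from the deck):

```python
def difficulty_index(num_correct, num_responses):
    """Item difficulty p: proportion of all examinees answering correctly."""
    return num_correct / num_responses

def discrimination_index(correct_upper, correct_lower, group_size):
    """Item discrimination d: difference in success rates between the
    upper- and lower-scoring groups (each group the same size)."""
    return (correct_upper - correct_lower) / group_size

# Hypothetical item: 60 of 80 students answer correctly overall;
# 25 of the top 30 and 12 of the bottom 30 get it right.
p = difficulty_index(60, 80)          # 0.75: a fairly easy item
d = discrimination_index(25, 12, 30)  # ~0.43: discriminates well
```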
Raw Score v. Scale Score
Raw Score: total point value of correct responses to valid items
Scale Score: equating adjustment to ensure that any given assessment is comparable to previous assessments
Vertical Scale: equating adjustment that requires a higher score in higher grades
CRT v. NRT
CRT: measure of performance relative to a delimited domain of learning tasks
NRT: measure of performance relative to an individual’s standing in a group
In Practice: the primary difference is in the way the scores are interpreted
Three Measures
Center: mean, median, and mode
Spread: variance and standard deviation
Distribution: what does the data look like?
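Python’s standard library covers the center and spread measures directly; the score list below is invented for illustration:

```python
from statistics import mean, median, mode, pvariance, pstdev

# Hypothetical scale scores for a small class
scores = [540, 560, 560, 600, 620, 700, 775]

center = {"mean": mean(scores), "median": median(scores), "mode": mode(scores)}
spread = {"variance": pvariance(scores), "sd": pstdev(scores)}
# The mean (~622) sits above the median (600): the mean follows the tail.
```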
The Mean Follows the Tail
Grade 6 Math 2010
Main Types of Growth Models
Trajectory
Value/Transition Table
Projection/Linear
Trajectory (Growth to Proficiency)
Begins with the current score and the score needed for future proficiency, then divides the required gain into annual targets
Usually dependent on assessments with a vertical scale
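The arithmetic is straightforward. A sketch, assuming a vertical scale and equal annual increments (the scores are hypothetical):

```python
def trajectory_targets(current_score, proficient_score, years):
    """Split the gain needed to reach proficiency into equal annual targets.
    Returns the target score at the end of each year."""
    gain_per_year = (proficient_score - current_score) / years
    return [round(current_score + gain_per_year * (y + 1)) for y in range(years)]

# Hypothetical: a student scoring 425 must reach 700 within three years.
targets = trajectory_targets(425, 700, 3)  # [517, 608, 700]
```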
Trajectory Model
Arkansas Example
Value Table/Transition Model
Creates subdivisions of performance and awards credit for moving students to higher levels.
A categorical approximation of a trajectory model, not dependent on a vertical scale
Credit awarded for each Year 1 (rows) → Year 2 (columns) transition:

Year 1 Level   1A    1B    2A    2B    Proficient
Level 1A       0     150   225   250   300
Level 1B       0     0     175   225   300
Level 2A       0     0     0     200   300
Level 2B       0     0     0     0     300
Proficient     0     0     0     0     300
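The value table above can be encoded as a simple lookup; the point values are those shown, and the function name is illustrative:

```python
# Credit for each year-1 (row) to year-2 (column) transition, per the table above.
VALUE_TABLE = {
    "1A": {"1A": 0, "1B": 150, "2A": 225, "2B": 250, "P": 300},
    "1B": {"1A": 0, "1B": 0,   "2A": 175, "2B": 225, "P": 300},
    "2A": {"1A": 0, "1B": 0,   "2A": 0,   "2B": 200, "P": 300},
    "2B": {"1A": 0, "1B": 0,   "2A": 0,   "2B": 0,   "P": 300},
    "P":  {"1A": 0, "1B": 0,   "2A": 0,   "2B": 0,   "P": 300},
}

def transition_credit(year1_level, year2_level):
    """Points awarded for moving from year1_level to year2_level."""
    return VALUE_TABLE[year1_level][year2_level]

# A student moving from Level 1B to Level 2B earns 225 points;
# a student who stays Proficient earns the full 300.
```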
Value Table/Transition Model
Previous Year (rows) × Current Year (columns); each entry = current-year level − previous-year level:

       1     1.5   2     2.5   3     3.5   4     4.5
1      0     .5    1     1.5   2     2.5   3     3.5
1.5   -.5    0     .5    1     1.5   2     2.5   3
2     -1    -.5    0     .5    1     1.5   2     2.5
2.5   -1.5  -1    -.5    0     .5    1     1.5   2
3     -2    -1.5  -1    -.5    0     .5    1     1.5
3.5   -2.5  -2    -1.5  -1    -.5    0     .5    1
4     -3    -2.5  -2    -1.5  -1    -.5    0     .5
4.5   -3.5  -3    -2.5  -2    -1.5  -1    -.5    0
Arkansas Example
Projection (Linear) Model
Uses current and past scores to predict performance in the future
Such models can be quite complex and difficult for stakeholders to understand
Projection Model
EVAAS Example
SGP Example
GMPP Evaluation
GMPP Participation
State Growth Models
Delaware: Transition Matrix
Iowa: Transition Matrix
Alaska: Trajectory
Arizona: Trajectory
Arkansas: Trajectory
Florida: Trajectory
North Carolina: Trajectory
Ohio: Projection
Tennessee: Projection
Results by State
Schools Meeting AYP
Overall, the increase in schools making AYP when growth was included was 16%
The biggest increases were in Ohio (50%), Arkansas (13%), and Tennessee (10%)
If Ohio’s results are excluded, the overall rate is only 4%
Growth Model Comparisons
Two Viewpoints
Classification Errors
False negative: an effective teacher is classified as a less-effective teacher
False positive: a less-effective teacher is classified as an effective teacher
The Widget Effect
The vast majority of school districts presently employ teacher evaluation systems that result in all teachers receiving the same (top) rating…. In districts that used binary ratings more than 99 percent of teachers were rated satisfactory. In districts using a broader range of ratings, 94 percent received one of the top two ratings and less than 1 percent received an unsatisfactory rating.
…student test scores alone are not sufficiently reliable and valid indicators of teacher effectiveness to be used in high-stakes personnel decisions, even when the most sophisticated statistical applications such as value-added modeling are employed.
Teaching is Complex
The use of imprecise measures to make high-stakes decisions that place societal or institutional interests above those of individuals is widespread and accepted in fields outside of teaching…. Nearly all selective colleges use SAT or ACT scores as a heavily weighted component of their admission decisions even though that produces substantial false negative rates (students who could have succeeded but are denied entry).
Perverse Incentives
…research and experience indicate that approaches to teacher evaluation that rely heavily on test scores can lead to narrowing and over-simplifying the curriculum, and to misidentifying both successful and unsuccessful teachers. These and other problems can undermine teacher morale, as well as provide disincentives for teachers to take on the neediest students. When attached to individual merit pay plans, such approaches may also create disincentives for teacher collaboration.
Perverse Incentives
Much of the fear concerning growth is about “use.” Take that fear away, and what is left is something that is very useful and which teachers are interested in knowing as well.
At least in terms of SGP, the “neediest” students represent a teacher’s best chance to demonstrate superior growth.
Data Requirements
All growth models depend on the ability to track a large percentage of students over time
This is especially difficult at the teacher level, where a host of issues will probably never be fully resolved
Terminology
Is the term “value-add” loaded? Is the term “growth” more palatable and perhaps more descriptive?
Other Populations

Ethnicity  Count    SGP Median
A          5028     60
B          82290    43
H          37857    50
N          2808     51
P          1575     48
T          4707     52
W          259692   52

GT   Count    SGP Median
N    341949   48
Y    52299    63

Sped  Count    SGP Median
N     354723   51
Y     39619    38

ESI Code  Count   SGP Median
AU        1539    43
DB        8       42.5
ED        467     34
HI        408     37.5
MD        214     36.5
MR        2345    28
OHI       7374    34
OI        135     44
SI        9396    44
SLD       17515   37
TBI       82      33
VI        136     41
      G5 2008 Math   G6 2009 Math   Increase
Noah  702            775            73
Ben   425            527            102
Who Had More Growth?
The Normal Distribution
Percentiles at −2, −1, 0, +1, and +2 standard deviations: 2, 16, 50, 84, 98
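Those percentile landmarks fall straight out of the normal CDF; a quick check in Python:

```python
from statistics import NormalDist

# Percentile ranks at -2, -1, 0, +1, and +2 standard deviations
std_normal = NormalDist(mu=0, sigma=1)
percentiles = [round(std_normal.cdf(z) * 100) for z in (-2, -1, 0, 1, 2)]
# percentiles == [2, 16, 50, 84, 98]
```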
Student Growth Percentiles
Quantile Regression
      G5 2008 Math   G6 2009 Math   2009 Math SGP
Noah  702            775            54
Ben   425            527            44
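A student growth percentile ranks a student’s current score among students with similar score histories. Real SGPs fit quantile regressions across the whole prior-score distribution; the exact-match toy below (all numbers and the band parameter are invented) only illustrates the “rank among academic peers” idea:

```python
def toy_growth_percentile(prior, current, cohort, band=25):
    """Percentile rank of `current` among students whose prior score is
    within `band` points of `prior`. A crude stand-in for the quantile-
    regression machinery used by real SGP models."""
    peers = [cur for (pri, cur) in cohort if abs(pri - prior) <= band]
    below = sum(1 for cur in peers if cur < current)
    return round(100 * below / len(peers))

# Invented cohort of (prior, current) score pairs near a prior score of 702
cohort = [(700, 740), (705, 790), (695, 760), (710, 775), (702, 800)]
sgp = toy_growth_percentile(702, 775, cohort)  # 40: 2 of 5 peers scored lower
```

This makes the key contrast with raw gains concrete: a large gain from a low starting point can still be a low percentile if most academic peers gained even more.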
Who Had More Growth?
Math Scale and Growth
Aggregated Growth Percentiles
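When SGPs are rolled up to the classroom or school level, they are commonly summarized with the median. A sketch with invented SGPs:

```python
from statistics import median

# Hypothetical SGPs for one classroom; 50 represents typical growth,
# so a median well above 50 suggests above-typical growth for the group.
classroom_sgps = [12, 35, 44, 50, 54, 63, 71, 88]
median_sgp = median(classroom_sgps)  # 52.0
```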
Goodness of Fit
Distribution
Distribution by Cohort
Prior/Current Math Scale
Math Scale and Growth
Prior/Current Math Growth
Density Plot
Current Math/Literacy Scale
Current Math/Literacy Growth
Density Plot
THOMAS KANE
http://metproject.org/
Wide variation among teachers
Wide variation within schools
Little (if any) difference in teacher preparation
Teachers improve, but improvement flattens after the third year
No schools where all teachers are highly effective
Metproject
Evaluating Teachers
Feedback?