
Page 1: Robert Meyer, Research Professor and Director

TECHNICAL AND CONSEQUENTIAL VALIDITY IN THE DESIGN AND USE OF VALUE-ADDED SYSTEMS
LAFOLLETTE SCHOOL OF PUBLIC AFFAIRS & VALUE-ADDED RESEARCH CENTER, UNIVERSITY OF WISCONSIN-MADISON

Robert Meyer, Research Professor and Director

Page 2: Robert Meyer, Research Professor and Director

VARC Partner Districts and States

Design of Wisconsin State Value-Added System (1989)
Minneapolis (1992)
Milwaukee (1996)
Chicago (2006)
Department of Education: Teacher Incentive Fund (TIF) (2006 and 2010)
Madison (2008)
Wisconsin Value-Added System (2009)
Milwaukee Area Public and Private Schools (2009)
Racine (2009)
New York City (2009)
Minnesota, North Dakota & South Dakota: Teacher Education Institutions and Districts (2009)
Illinois (2010)
Hillsborough (2010)
Atlanta (2010)
Los Angeles (2010)
Tulsa (2010)
Collier County (2012)
New York (2012)
California Charter Schools Association (2012)
Oklahoma Gear Up (2012)

Page 3: Robert Meyer, Research Professor and Director

[Map: Districts and States Working with VARC. Districts labeled: Minneapolis, Milwaukee, Chicago, Madison, Tulsa, Atlanta, New York City, Los Angeles, Hillsborough County, Collier County. States labeled: North Dakota, South Dakota, Minnesota, Wisconsin, Illinois, New York.]

Page 4: Robert Meyer, Research Professor and Director

Context and Research Questions

Page 5: Robert Meyer, Research Professor and Director

Components of Educator Effectiveness Systems

[Diagram: Educator Effectiveness Systems at the center, with surrounding components:]
Value-Added System
Data Requirements and Data Quality
Professional Development (Understanding and Application)
Evaluating Instructional Practices, Programs, and Policies
Alignment with School, District, State Policies and Practices
Embed within a Framework of Data-Informed Decision-Making

Page 6: Robert Meyer, Research Professor and Director

Uses of a Value-Added System

[Diagram: Value-Added at the center, with surrounding uses:]
Evidence that All Students Can Learn
Set School Performance Standards
Triage: Identify Low-Performing Schools
Contribute to District Knowledge about “What Works”
Data-Informed Decision-Making / Performance Management

Page 7: Robert Meyer, Research Professor and Director

Development of a Value-Added System

Clarity: What is the objective?

Dimensions of validity and reliability

Why? Achieve accuracy, fairness, improved teaching and learning

How complex should a value-added model be?

Possible rule: “Simpler is better, unless it is wrong.”

Page 8: Robert Meyer, Research Professor and Director

Dimensions of Validity and Reliability

Accuracy:
  Criterion validity
  Technical (causal) validity
  Reliability (precision)
Consequential validity
Transparency

Page 9: Robert Meyer, Research Professor and Director

Technical Validity

Technical validity measures the degree to which the statistical model and the data used in the model (for example, student outcomes, student characteristics, and student-classroom-teacher linkages) provide consistent (unbiased) estimates of performance using the available student outcomes/assessments.

Requires development of a quasi-experimental model that captures (to the extent possible) the structural factors that determine student achievement and growth in student achievement.

Page 10: Robert Meyer, Research Professor and Director

Consequential Validity

Consequential validity addresses the incentives and decisions that are triggered by the design and use of performance measures and performance systems.

Page 11: Robert Meyer, Research Professor and Director

Transparency

Transparency addresses the consequences of simplicity versus complexity in the design (and clarity of explanation) of value-added models and reports.

Page 12: Robert Meyer, Research Professor and Director

Criterion Validity

Criterion validity captures the degree to which effect estimates based on available student outcome data fully align with estimates based on the complete spectrum of student outcomes valued by stakeholders.

Page 13: Robert Meyer, Research Professor and Director

Reliability

Reliability (or precision) captures statistical error due to the fact that effectiveness estimates are based on finite samples of students, which in the context of estimating classroom and teacher performance are generally small.

Page 14: Robert Meyer, Research Professor and Director

Application of Framework

Develop a value-added model that incorporates important structural factors that determine growth in student achievement, and specify performance parameters that represent educational units (classrooms) and agents (teachers)

Identify and address threats to validity that could cause bias in the estimation of desired performance parameters

Specify data uses, including the design of reports intended to inform decision making

Page 15: Robert Meyer, Research Professor and Director

Technical vs. Consequential Validity I

Consider the consequences of controlling for prior achievement and other predictors – switching from measurement of attainment (as in NCLB) to growth

Positive from the standpoint of technical validity because the estimates are more accurate

Possibly negative from the perspective of consequential validity if controlling for prior achievement and other predictors inevitably leads to reduced expectations for poor and minority students.

Page 16: Robert Meyer, Research Professor and Director

Technical vs. Consequential Validity II

Consequences of inclusion of demographic variables?

Possibly positive from the standpoint of technical validity because the estimates are more accurate

Possibly negative from the perspective of consequential validity because the inclusion of these variables inevitably leads to reduced expectations for poor and minority students.

Or, the reverse is true

Page 17: Robert Meyer, Research Professor and Director

Value-Added Model

Page 18: Robert Meyer, Research Professor and Director

Generally Recommended Value-Added Model Features

Longitudinal student outcome/assessment data
Flexible (data-driven) posttest-on-pretest link, including possible nonlinearities in this relationship
Contextual covariates
Adjust for test measurement error
Address changes in assessments over time
Allow for end-of-grade & end-of-course exams
Dosage / student mobility
Allow differential effects by student characteristics
Statistical shrinkage: address noise due to small samples
Measures of precision and confidence ranges
(A minimal illustrative sketch of several of these features follows.)
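The sketch below is a minimal, illustrative two-period value-added regression in Python, not VARC's model or code: it shows a flexible posttest-on-pretest link (quadratic term), one contextual covariate, classroom effects, and simple empirical-Bayes shrinkage on simulated data. All variable names (frl, cls) and parameter values are assumptions for illustration only.

```python
# Minimal sketch of a two-period value-added regression with shrinkage.
# Simulated data; variable names and parameter values are illustrative only.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_classrooms, n_students = 40, 25
cls = np.repeat(np.arange(n_classrooms), n_students)     # classroom id per student
theta = rng.normal(0.0, 0.25, n_classrooms)              # "true" classroom effects
frl = rng.binomial(1, 0.4, cls.size)                     # illustrative contextual covariate
y0 = rng.normal(0.0, 1.0, cls.size)                      # pretest
y1 = 0.8 * y0 - 0.2 * frl + theta[cls] + rng.normal(0.0, 0.6, cls.size)  # posttest
df = pd.DataFrame({"y1": y1, "y0": y0, "frl": frl, "cls": cls})

# Posttest-on-pretest regression with a quadratic pretest term (data-driven
# nonlinearity), a contextual covariate, and a full set of classroom dummies.
fit = smf.ols("y1 ~ y0 + I(y0**2) + frl + C(cls) - 1", data=df).fit()
raw = fit.params.filter(like="C(cls)").to_numpy()        # raw classroom effects
se = fit.bse.filter(like="C(cls)").to_numpy()            # their standard errors
raw = raw - raw.mean()                                   # center effects at zero

# Empirical-Bayes shrinkage: pull noisy estimates toward the overall mean,
# shrinking less-precise classroom estimates more strongly.
tau2 = max(raw.var() - (se ** 2).mean(), 1e-6)           # between-classroom variance
shrunk = raw * tau2 / (tau2 + se ** 2)
print(pd.DataFrame({"raw": raw, "shrunk": shrunk}).round(3).head())
```

Measurement error in the pretest, changing assessments, dosage weights, and differential effects would add further terms; the point here is only the overall shape of the estimation step.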

Page 19: Robert Meyer, Research Professor and Director

Model Simplifications

Longitudinal data for two time periods (appropriate for early grades)
The model is defined in terms of true test scores; the estimation method controls for test measurement error
The posttest-on-pretest relationship is assumed to be linear (this can be generalized)
Student mobility within the school year is ignored in order to simplify notation

Page 20: Robert Meyer, Research Professor and Director

Structural Determinants of Achievement and Achievement Growth

Student level:
Prior achievement
Student and family contribution
Within-classroom allocation of resources (including student performance expectations)
School contributions external to the classroom (supplemental in-school instruction, after-school instruction, summer school)

Page 21: Robert Meyer, Research Professor and Director

Structural Determinants of Achievement and Achievement Growth

Classroom level:
Peer effects
Contributions external to the teacher (school resources, policies, and climate; class size)
Contributions internal to the teacher (teacher resources, policies, and instructional practices; alignment with the standards implied by assessments) (factors that may be covered by observational rubrics)

Page 22: Robert Meyer, Research Professor and Director

Preview of Alternative Performance Parameters

Teacher performance: \(\theta_{Tjk}\)
Classroom performance: \(\theta_{Cjk}\), which includes contributions in the classroom from student peers and from resources external to the teacher (such as other staff and class size)
Factors external to the classroom (supplemental in-school instruction, after-school instruction, summer school): \(\theta_{Xjk}\)
Classroom/school performance: \(\theta_{Sjk} = \theta_{Cjk} + \theta_{Xjk}\), which includes contributions from the classroom and from resources external to the classroom

Page 23: Robert Meyer, Research Professor and Director

Model Specification Strategy

Include in the model all structural determinants of achievement and achievement growth
Be explicit about how demographic variables and prior achievement contribute directly or indirectly (via other determinants) to achievement and growth
Two types of demographic and prior-achievement variables:
Level I (student level): \(X_i,\ y_{0i}\)
Level II (classroom level): \(X_{jk},\ y_{0jk}\)
Subscripts: student i, teacher j, and school k

Page 24: Robert Meyer, Research Professor and Director

I: Student-Level Equation

\(y_{1i} = \lambda\, y_{0i} + b_i + c_{i(jk)} + d_{i(jk)}\)

Posttest: \(y_{1i}\)
Pretest: \(y_{0i}\), with durability/decay parameter \(\lambda\)
Student and family contribution: \(b_i\)
Within-classroom contribution: \(c_{i(jk)}\)
Supplemental contribution: \(d_{i(jk)}\) (measures of supplemental factors are not observed)
Subscripts: student i, teacher j, and school k

Page 25: Robert Meyer, Research Professor and Director

Alternative Student-Level Equation

Include explicit measures of supplemental resources in the model, producing a multiple-input (crossed-effects) model.

This model is tractable if the crossed effects are not highly collinear. If the crossed effects are highly (or completely) collinear, then it may be possible to address provision of supplemental resources in the second level of the model as a factor external to the teacher.

Our focus is on the conventional one-input model.

Page 26: Robert Meyer, Research Professor and Director

Condition Factors on Student-Level Demographic Variables

Student and family factor: \(b_i = b_0 + b_1\, y_{0i} + b_2\, X_i + e_{1i}\)

Within-classroom factor: \(c_{i(jk)} = c_0 + c_1\, y_{0i} + c_2\, X_i + \theta_{Cjk} + e_{2i}\)

Supplemental factor: \(d_{i(jk)} = d_0 + d_1\, y_{0i} + d_2\, X_i + \theta_{Xjk} + e_{3i}\)
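Substituting these three conditioned factors into the student-level equation \(y_{1i} = \lambda y_{0i} + b_i + c_{i(jk)} + d_{i(jk)}\) collapses the model to a single student-level regression; a brief sketch of that step (the combined coefficients are named on the next slide):

```latex
y_{1i} = (\lambda + b_1 + c_1 + d_1)\,y_{0i}
       + (b_2 + c_2 + d_2)\,X_i
       + (\theta_{Cjk} + \theta_{Xjk})
       + (b_0 + c_0 + d_0)
       + (e_{1i} + e_{2i} + e_{3i})
```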

Page 27: Robert Meyer, Research Professor and Director

Defines a VAM of Student Growth and Classroom/School Performance

Combining the student-level structural factors gives
\(y_{1i} = \tilde{\lambda}\, y_{0i} + \beta\, X_i + \theta_{Sjk} + e_i\)

Pretest coefficient: \(\tilde{\lambda} = \lambda + b_1 + c_1 + d_1\)
Effect of student-level characteristics: \(\beta = b_2 + c_2 + d_2\)
Classroom/school performance: \(\theta_{Sjk} = \theta_{Cjk} + \theta_{Xjk}\)
Error term: \(e_i = e_{1i} + e_{2i} + e_{3i}\)

Page 28: Robert Meyer, Research Professor and Director

Decomposition of Average Achievement

Predicted achievement = prior achievement + student growth:
\(y^{p}_{1} = \tilde{\lambda}\, y_{0} + \beta X\)

Average posttest achievement = predicted achievement + classroom/school performance:
\(y_{1} = y^{p}_{1} + \theta_{S}\)

(Teacher subscripts jk are dropped here.)

Page 29: Robert Meyer, Research Professor and Director

Technical Validity Classroom/school performance from the

value-added model that includes demographic variables is structural parameter of interest:

The performance parameter obtained from a model that excludes demographic variables is (approximately)

This parameter is biased

Sjk

Sjk X
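One way to see where this approximation comes from (a sketch, under the simplifying assumption that the omitted student characteristics are roughly uncorrelated with the pretest within classrooms; the superscript "noX" is just notation introduced here for the parameter from the model without demographics):

```latex
% Model with demographics: y_{1i} = \tilde\lambda\, y_{0i} + \beta X_i + \theta_{Sjk} + e_i.
% Omitting X_i pushes its classroom-average contribution into the classroom term:
y_{1i} \;\approx\; \tilde\lambda\, y_{0i}
      + \bigl(\theta_{Sjk} + \beta X_{jk}\bigr)
      + \bigl[\beta\,(X_i - X_{jk}) + e_i\bigr]
% so the estimated classroom effect is approximately
% \theta^{noX}_{Sjk} = \theta_{Sjk} + \beta X_{jk}.
```

Classrooms serving students with a larger classroom-average \(X_{jk}\) therefore have their measured performance shifted by \(\beta X_{jk}\), which is the bias referred to above.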

Page 30: Robert Meyer, Research Professor and Director

II. Classroom/School-Level Equation

\(\theta_{Sjk} = p_{jk} + q_{jk} + \theta_{Tjk}\)

Classroom/school performance: \(\theta_{Sjk}\)
Peer effects: \(p_{jk}\)
Contributions external to the teacher: \(q_{jk}\)
Contributions internal to the teacher: \(\theta_{Tjk}\)

Page 31: Robert Meyer, Research Professor and Director

Condition Factors on Average Classroom-Level Demographic Variables

Peer effects: \(p_{jk} = p_0 + p_1\, y_{0jk} + p_2\, X_{jk}\)

Contributions external to the teacher: \(q_{jk} = q_0 + q_1\, y_{0jk} + q_2\, X_{jk} + u_{1jk}\)

Contributions internal to the teacher: \(\theta_{Tjk} = r_0 + r_1\, y_{0jk} + r_2\, X_{jk} + u_{2jk}\)

Page 32: Robert Meyer, Research Professor and Director

Defines a Model of Classroom/School Performance

Preferred model (but not identified):
\(\theta_{Sjk} = (p_0 + q_0) + (p_1 + q_1)\, y_{0jk} + (p_2 + q_2)\, X_{jk} + \theta_{Tjk} + u_{1jk}\)

Teacher parameter (not identified): \(\theta^{*}_{Tjk} = \theta_{Tjk} + u_{1jk}\)
Bias: productivity external to the teacher, \(u_{1jk}\)

Feasible model (biased):
\(\theta^{p}_{Tjk} = \theta_{Sjk} - \hat{\theta}_{Sjk} = \theta_{Tjk} + u_{1jk} - (r_1\, y_{0jk} + r_2\, X_{jk})\)

Bias is caused by “over-controlling”
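To make the "over-controlling" point concrete, here is a short derivation under the linear specifications above (treating \(u_{1jk}\) and \(u_{2jk}\) as uncorrelated with the classroom-level covariates, and writing \(\hat{\theta}_{Sjk}\) for the projection of \(\theta_{Sjk}\) on \(y_{0jk}\) and \(X_{jk}\)):

```latex
% Substitute \theta_{Tjk} = r_0 + r_1 y_{0jk} + r_2 X_{jk} + u_{2jk} into the preferred model:
\theta_{Sjk} = (p_0+q_0+r_0) + (p_1+q_1+r_1)\,y_{0jk} + (p_2+q_2+r_2)\,X_{jk} + u_{1jk} + u_{2jk}

% Projecting on the classroom covariates removes every covariate-related term, so
\theta^{p}_{Tjk} = \theta_{Sjk} - \hat{\theta}_{Sjk} = u_{1jk} + u_{2jk}
\;\neq\; \theta_{Tjk} = r_0 + r_1\,y_{0jk} + r_2\,X_{jk} + u_{2jk}
```

The covariate-related part of teacher productivity (\(r_1 y_{0jk} + r_2 X_{jk}\)) is stripped out (the over-controlling), while productivity external to the teacher (\(u_{1jk}\)) is retained.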

Page 33: Robert Meyer, Research Professor and Director

Dilemma in the Choice of Models from the Perspective of Technical Validity

Option A: Use classroom/school performance as a proxy measure of teacher performance; commit an error of “omission”

Option B: Use the feasible, but biased, estimate of teacher performance; commit an error of “commission”

Option C: Use a more complicated model to control for the factors external to the teacher

Page 34: Robert Meyer, Research Professor and Director

Consequential Validity: Uses and Decisions

Parental choice of schools
Teachers' willingness to teach in given schools
Identification of master teachers
Identification of teachers for professional development
Performance-based compensation
Provision of supplemental services
Avoid bubble effects: incentives to deploy resources to particular students as an artifact of the statistical measures (statistics based on means, rather than medians, can be affected by all students)

Page 35: Robert Meyer, Research Professor and Director

Key Point: The Power of Two

Decisions need to be informed by:
A measure of school/classroom or teacher performance
Measures of student achievement: actual average student achievement and a student achievement target (e.g., proficiency status)

Options:
Use only information on student attainment (NCLB)
Use only information on value-added performance
Use both pieces of data to inform decisions

Page 36: Robert Meyer, Research Professor and Director

Achievement Target, Performance, and Achievement Shortfall – Retrospective View

Example with two teachers; focus on use of the classroom/school indicator.

Scale of parameters:
Value-added ratings are centered around zero with a standard deviation of one, and thus range from approximately -3 to 3.
All other parameters (average achievement and the average contribution of demographic characteristics) are centered around zero and have been transformed to the value-added scale, although the standard deviations of these parameters are not constrained to equal one.

Page 37: Robert Meyer, Research Professor and Director

How to Read the Scatter Plots

[Scatter plot: each point is a school in your district. X-axis: Value-Added (2009-2010), roughly 1 to 5; y-axis: Percent Prof/Adv (2009), 0 to 100. Labeled example schools A through E.]

A. Students know a lot and are growing faster than predicted
B. Students are behind, but are growing faster than predicted
C. Students know a lot, but are growing slower than predicted
D. Students are behind, and are growing slower than predicted
E. Students are about average in how much they know and how fast they are growing
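A small illustrative sketch (not VARC code) of how the corner regions A through D of the plot above can be assigned programmatically. The cut points used here (a value-added rating of 3 as the midpoint of the plotted 1-to-5 scale, and 50 percent proficient/advanced as "knows a lot") and the example schools are assumptions for illustration; region E (about average on both) is omitted for brevity.

```python
# Classify schools into the corner regions of the attainment vs. value-added plot.
# Cut points and example data below are illustrative assumptions, not VARC definitions.
from dataclasses import dataclass

VA_MIDPOINT = 3.0      # value-added axis in the plot runs roughly 1 to 5
ATTAINMENT_CUT = 50.0  # percent proficient/advanced treated as "knows a lot"

@dataclass
class School:
    name: str
    pct_prof_adv: float  # y-axis: percent proficient/advanced
    value_added: float   # x-axis: value-added rating

def region(s: School) -> str:
    high_attainment = s.pct_prof_adv >= ATTAINMENT_CUT
    high_growth = s.value_added >= VA_MIDPOINT
    if high_attainment and high_growth:
        return "A: knows a lot, growing faster than predicted"
    if not high_attainment and high_growth:
        return "B: behind, but growing faster than predicted"
    if high_attainment and not high_growth:
        return "C: knows a lot, growing slower than predicted"
    return "D: behind, growing slower than predicted"

for s in [School("School 1", 82, 4.1), School("School 2", 28, 4.4),
          School("School 3", 75, 1.9), School("School 4", 22, 2.2)]:
    print(s.name, "->", region(s))
```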

Page 38: Robert Meyer, Research Professor and Director

Achievement Target, Performance, and Achievement Shortfall – Retrospective View

Achievement Target (\(y^{T}_{1}\)) | Average Prior Achievement (\(y_{0}\)) | Student Factor (\(\beta X\)) | Classroom/School Performance (\(\theta_{S}\)) | Average Posttest (\(y_{1}\)) | Achievement Shortfall
4 | 3 | 1 | -1 | 3 | -1
4 | 0 | -1 | -1 | -2 | -6

Page 39: Robert Meyer, Research Professor and Director

Achievement Target, Performance, and Achievement Shortfall – Prospective View

Teacher 1, holding Achievement Target (\(y^{T}_{1}\)) = 4, Average Prior Achievement (\(y_{0}\)) = 3, and Student Factor (\(\beta X\)) = 1 fixed while classroom/school performance varies:

Classroom/School Performance (\(\theta_{S}\)) | Average Posttest (\(y_{1}\)) | Achievement Shortfall
-1 | 3 | -1
0 | 4 | 0
1 | 5 | NA
2 | 6 | NA
3 | 7 | NA

Page 40: Robert Meyer, Research Professor and Director

Achievement Target, Performance, and Achievement Shortfall – Prospective View

Teacher 2, holding Achievement Target (\(y^{T}_{1}\)) = 4, Average Prior Achievement (\(y_{0}\)) = 0, and Student Factor (\(\beta X\)) = -1 fixed while classroom/school performance varies:

Classroom/School Performance (\(\theta_{S}\)) | Average Posttest (\(y_{1}\)) | Achievement Shortfall
-1 | -2 | -6
0 | -1 | -5
1 | 0 | -4
2 | 1 | -3
3 | 2 | -2
4 | 3 | -1
5 | 4 | 0
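A small sketch of the arithmetic behind the two prospective tables above: predicted achievement is prior achievement plus the student factor, the average posttest adds classroom/school performance, and the shortfall is the gap to the target (shown as NA once the target is exceeded). The function name and printed layout are illustrative only.

```python
# Reproduce the prospective shortfall tables for the two example teachers.
def shortfall_row(target, prior, student_factor, performance):
    predicted = prior + student_factor      # predicted achievement (prior + student growth)
    posttest = predicted + performance      # average posttest achievement
    gap = posttest - target
    return posttest, (gap if gap <= 0 else None)  # None is printed as NA below

teachers = {
    "Teacher 1": (4, 3, 1, range(-1, 4)),   # target, prior, student factor, performance range
    "Teacher 2": (4, 0, -1, range(-1, 6)),
}
for name, (target, prior, factor, perf_range) in teachers.items():
    print(name)
    for perf in perf_range:
        posttest, gap = shortfall_row(target, prior, factor, perf)
        print(f"  performance {perf:+d}: posttest {posttest:+d}, "
              f"shortfall {'NA' if gap is None else gap}")
```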

Page 41: Robert Meyer, Research Professor and Director

The Pros and Cons of Using Attainment Only

It is straightforward to connect actual attainment with achievement targets and maintain a universal target

Average achievement and related attainment indicators such as percent proficient are severely biased as measures of classroom/school performance

Given a universal achievement target, the achievement shortfalls vary enormously across teachers and schools

Page 42: Robert Meyer, Research Professor and Director

The Pros and Cons of Using Value-Added Only

The value-added model provides an unbiased/consistent estimate of classroom/school performance

High value-added targets do not eliminate achievement shortfalls if prior achievement (or more correctly, predicted achievement, which includes student growth) is extremely low

Page 43: Robert Meyer, Research Professor and Director

The Power of Using Both Indicators

The value-added model provides an unbiased/consistent estimate of classroom/school performance.

Achievement shortfalls can be identified prospectively and thus can trigger supplemental resource allocations designed to eliminate them.

Page 44: Robert Meyer, Research Professor and Director

Include Student-Level Demographics?

Yes, to provide more accurate measures of classroom/school performance

Does this reduce expectations? No; achievement targets are set independently.

Predicted achievement shortfalls are not reduced in a model that includes student demographics; in fact, they are identical.

Supplemental resource allocations can be triggered to eliminate achievement shortfalls.

Page 45: Robert Meyer, Research Professor and Director

Does Including Demographic Variables Matter?

[Histogram: statewide data for Grade 3 Math, value-added tier difference after removing demographics. X-axis: Value-Added Difference, binned from -0.7 to 0.3; y-axis: Percent of Schools, 0 to 0.4. Underlying chart data:]

Value-Added Difference | Percent of Schools | Percent of Students | Female | African American | Hispanic | Asian | Indian | White | Free/Reduced Lunch
-0.7 | 0.000894 | 0.000637 | 0.38889 | 0.97222 | 0 | 0 | 0 | 0.02778 | 1
-0.6 | 0.004468 | 0.004155 | 0.5234 | 0.83404 | 0.08085 | 0.025532 | 0.012766 | 0.04681 | 0.93617
-0.5 | 0.001787 | 0.001839 | 0.60577 | 0.86538 | 0.01923 | 0 | 0.009615 | 0.10577 | 0.82692
-0.4 | 0.028597 | 0.026487 | 0.54272 | 0.75501 | 0.14219 | 0.039386 | 0.004005 | 0.05941 | 0.90053
-0.3 | 0.033959 | 0.029387 | 0.49097 | 0.52828 | 0.30987 | 0.054152 | 0.004813 | 0.10289 | 0.90433
-0.2 | 0.064343 | 0.060667 | 0.50277 | 0.29875 | 0.29991 | 0.045468 | 0.009035 | 0.34684 | 0.73535
-0.1 | 0.161752 | 0.141844 | 0.50274 | 0.12653 | 0.16181 | 0.067564 | 0.024433 | 0.61967 | 0.56008
0.0 | 0.341376 | 0.323455 | 0.49702 | 0.04729 | 0.07987 | 0.03843 | 0.021101 | 0.81321 | 0.40638
0.1 | 0.308311 | 0.341526 | 0.47864 | 0.02879 | 0.05307 | 0.029459 | 0.015636 | 0.87305 | 0.29889
0.2 | 0.053619 | 0.068783 | 0.46581 | 0.01851 | 0.04704 | 0.021851 | 0.004627 | 0.90797 | 0.23805
0.3 | 0.000894 | 0.00122 | 0.44928 | 0.01449 | 0.01449 | 0.072464 | 0 | 0.89855 | 0.15942