
Measuring Teacher Effectiveness: Challenges and Opportunities

Laura Goe, Ph.D.

Research Scientist, ETS, and Principal Investigator for the National Comprehensive Center for Teacher Quality

National Association of Latino Elected and Appointed Officials (NALEO) Education Fund

NALEO Audio Conference, April 10, 2012

2

Laura Goe, Ph.D.

• Former teacher in rural & urban schools
- Special education (7th & 8th grade, Tunica, MS)
- Language arts (7th grade, Memphis, TN)

• Graduate of UC Berkeley’s Policy, Organizations, Measurement & Evaluation doctoral program

• Principal Investigator for the National Comprehensive Center for Teacher Quality

• Research Scientist in the Performance Research Group at ETS

3

The National Comprehensive Center for Teacher Quality

• A federally-funded partnership whose mission is to help states carry out the teacher quality mandates of ESEA

• Vanderbilt University

• Learning Point Associates, an affiliate of American Institutes for Research

• Educational Testing Service

4

Today’s presentation available online

• To download a copy of this presentation, go to www.lauragoe.com
- Go to the Publications and Presentations page
- Today’s presentation is at the bottom of the page

5

The goal of teacher evaluation

The ultimate goal of all teacher evaluation should be…

TO IMPROVE TEACHING AND LEARNING

6

Questions to be considered

• What is teacher effectiveness and why should we measure it?

• How do you measure teacher effectiveness?

• What are strengths and cautions to keep in mind when using these measures?

7

Differentiating among teachers

• “It is nearly impossible to discover and act on performance differences among teachers when documented records show them all to be the same.” (Glazerman et al., 2011, pg 1)

8

Trends in teacher evaluation

• The policy imperative to change teacher evaluation has outstripped the research
- Though we don’t yet know which model and combination of measures will identify effective teachers, many states and districts feel compelled to move forward at a rapid pace

• Inclusion of student achievement growth data represents an important “culture shift” in evaluation
- Communication and teacher/administrator participation and buy-in are crucial to ensure change

• The implementation challenges are considerable
- Few models exist for states and districts to adopt or adapt
- Many districts have limited capacity to implement comprehensive systems, and states have limited resources to help them

9

It’s an equity issue

• Value-added research shows that teachers vary greatly in their contributions to student achievement (Rivkin, Hanushek, & Kain, 2005).

• The Widget Effect report (Weisberg et al., 2009) found that 90% of teachers were rated “good” or better in districts where students were failing at high levels

10

A simple definition of teacher effectiveness

Anderson (1991) stated that “… an effective teacher is one who quite consistently achieves goals which either directly or indirectly focus on the learning of their students” (p. 18).

11

Race to the Top definition of effective & highly effective teacher

Effective teacher: students achieve acceptable rates (e.g., at least one grade level in an academic year) of student growth (as defined in this notice). States, LEAs, or schools must include multiple measures, provided that teacher effectiveness is evaluated, in significant part, by student growth (as defined in this notice). Supplemental measures may include, for example, multiple observation-based assessments of teacher performance. (pg 7)

Highly effective teacher: students achieve high rates (e.g., one and one-half grade levels in an academic year) of student growth (as defined in this notice).

12

Race to the Top definition of student growth

• Student growth means the change in student achievement (as defined in this notice) for an individual student between two or more points in time. A State may also include other measures that are rigorous and comparable across classrooms. (pg 11)
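The Race to the Top definitions above can be illustrated with a little arithmetic. The sketch below is only an illustration of the growth component: it takes the "e.g." figures (one grade level and one and one-half grade levels per year) as thresholds, and it assumes that a class is summarized by its average growth. Neither the aggregation rule, the label "below effective", nor the function names come from the notice, and the notice also requires multiple measures beyond student growth.

```python
from statistics import mean

# Illustrative thresholds taken from the "e.g." examples in the definitions;
# actual cut points are set by each state or district.
EFFECTIVE_GROWTH = 1.0         # grade levels per academic year
HIGHLY_EFFECTIVE_GROWTH = 1.5  # grade levels per academic year


def student_growth(start_level, end_level):
    """Change in achievement between two points in time, in grade-level units."""
    return end_level - start_level


def classify_by_growth(scores):
    """Label a class by its average student growth (an assumed aggregation rule).

    scores: list of (start_level, end_level) pairs, one per student.
    """
    average = mean(student_growth(start, end) for start, end in scores)
    if average >= HIGHLY_EFFECTIVE_GROWTH:
        return "highly effective"
    if average >= EFFECTIVE_GROWTH:
        return "effective"
    return "below effective"  # hypothetical label, not from the notice


# Hypothetical (start, end) grade-equivalent scores for three students.
classroom = [(3.0, 4.5), (2.5, 4.0), (4.0, 5.5)]
print(classify_by_growth(classroom))
```

In practice this growth label would be only one input among several, since the definition calls for supplemental measures such as observation-based assessments.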


13

Goe, Bell, & Little (2008) definition of teacher effectiveness

1. Have high expectations for all students and help students learn, as measured by value-added or alternative measures.

2. Contribute to positive academic, attitudinal, and social outcomes for students, such as regular attendance, on-time promotion to the next grade, on-time graduation, self-efficacy, and cooperative behavior.

3. Use diverse resources to plan and structure engaging learning opportunities; monitor student progress formatively, adapting instruction as needed; and evaluate learning using multiple sources of evidence.

4. Contribute to the development of classrooms and schools that value diversity and civic-mindedness.

5. Collaborate with other teachers, administrators, parents, and education professionals to ensure student success, particularly the success of students with special needs and those at high risk for failure.

14

Measures and models: Definitions

• Measures are the instruments, assessments, protocols, rubrics, and tools that are used in determining teacher effectiveness

• Models are the state or district systems of teacher evaluation including all of the inputs and decision points (measures, instruments, processes, training, and scoring, etc.) that result in determinations about individual teachers’ effectiveness

15

Multiple measures of teacher effectiveness

• Evidence of growth in student learning and competency
- Standardized tests, pre/post tests in untested subjects
- Student performance (art, music, etc.)
- Curriculum-based tests given in a standardized manner
- Classroom-based tests such as DIBELS

• Evidence of instructional quality
- Classroom observations
- Lesson plans, assignments, and student work
- Student surveys such as Harvard’s Tripod
- Electronic portfolios/evidence binders

• Evidence of professional responsibility
- Administrator/supervisor reports, parent surveys
- Teacher reflection and self-reports, records of contributions

16

Teacher observations: strengths and weaknesses

• Strengths
- Great for teacher professional growth, if observation is followed by an opportunity to discuss results and if support is provided for those who need it
- Helps evaluators (principals or others) understand teachers’ needs across school or across district

• Weaknesses
- Essential to have alignment between teaching standards and observation instrument
- Resource intensive (personnel time, training, calibrating)
- Validity of observation results may vary with who is doing them, depending on how well trained and calibrated they are

17

Example: University of Virginia’s CLASS observation tool

• Emotional Support
- Positive Climate
- Negative Climate
- Teacher Sensitivity
- Regard for Student (Adolescent) Perspectives

• Classroom Organization
- Behavior Management
- Productivity
- Instructional Learning Formats

• Instructional Support
- Pre-K and K-3: Concept Development, Quality of Feedback, Language Modeling
- Upper Elementary/Secondary: Content Understanding, Analysis and Problem Solving, Quality of Feedback

18

Example: Charlotte Danielson’s Framework for Teaching

Domain 1: Planning and Preparation includes comprehensive understanding of the content to be taught, knowledge of the students’ backgrounds, and designing instruction and assessment.

Domain 2: The Classroom Environment addresses the teacher’s skill in establishing an environment conducive to learning, including both the physical and interpersonal aspects of the environment.

Domain 3: Instruction is concerned with the teacher’s skill in engaging students in learning the content, and includes the wide range of instructional strategies that enable students to learn.

Domain 4: Professional Responsibilities addresses a teacher’s additional professional responsibilities, including self-assessment and reflection, communication with parents, participating in ongoing professional development, and contributing to the school and district environment.

19

Validity of classroom observations is highly dependent on training

• A teacher should get the same score no matter who observes him

This requires that all observers be trained on the instruments and processes

Occasional “calibrating” should be done; more often if there are discrepancies or new observers

Who the evaluators are matters less than the fact that they are trained to recognize evidence and score it consistently

• Teachers should also be trained on the observation forms and processes so they can participate actively and fully in the process

20

Risk management vs. one-size-fits-all in teacher observations

• Conducting high-quality observations is a resource-intensive process
- A more efficient use of resources is for teachers who have not yet demonstrated competence to be on a more intensive observation schedule
  - New teachers
  - Teachers who have changed teaching assignments or schools

• Other measures are less resource intensive and can be used routinely (surveys, student outcomes, portfolios)

21

Reliability results when using different combinations of raters and lessons

Figure 2. Errors and Imprecision: the reliability of different combinations of raters and lessons. From Hill et al., 2012 (see references list). Used with permission of author.
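The figure itself is not reproduced here, but the idea behind it can be sketched: in a generalizability study, the reliability of a teacher's average observation score rises as scores are averaged over more lessons and more raters. The sketch below uses the standard generalizability coefficient for a fully crossed teacher × lesson × rater design. The variance components are hypothetical values chosen for illustration, not estimates from Hill et al. (2012), and their study design may differ from the crossed design assumed here.

```python
def generalizability_coefficient(var_teacher, var_teacher_lesson, var_teacher_rater,
                                 var_residual, n_lessons, n_raters):
    """Reliability of a teacher's mean score over n_lessons lessons and n_raters raters.

    Assumes a fully crossed teacher x lesson x rater design.
    """
    relative_error = (
        var_teacher_lesson / n_lessons
        + var_teacher_rater / n_raters
        + var_residual / (n_lessons * n_raters)
    )
    return var_teacher / (var_teacher + relative_error)


# Hypothetical variance components (NOT estimates from Hill et al., 2012).
components = dict(var_teacher=0.4, var_teacher_lesson=0.2,
                  var_teacher_rater=0.1, var_residual=0.3)

for n_lessons in (1, 2, 4):
    for n_raters in (1, 2):
        g = generalizability_coefficient(**components,
                                         n_lessons=n_lessons, n_raters=n_raters)
        print(f"lessons={n_lessons} raters={n_raters} reliability={g:.2f}")
```

With any positive error components, adding lessons or raters shrinks the error term, which is why observing a teacher once with a single rater is the least reliable combination.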

22

Formal vs. informal observations

• Formal observations are likely to
- Be announced and scheduled in advance according to a pre-determined yearly schedule
- Include pre- and post-conferences with review of lesson plans and artifacts
- Last an entire class period
- Result in a set of scores on multiple indicators

• Informal observations are likely to
- Be unannounced, drop-in visits
- Last less than an entire class period
- Result in informal verbal or written feedback to the teacher, perhaps on only one indicator

23

Questions to ask about observations

• How many observations per year?
- Vary by new vs. experienced teachers?
- Vary by demonstrated competence?
- Combination of formal and informal?

• Who should conduct the observations?

• Will multiple observers be required?

• How will they be trained?
- Workshops?
- Online (video-based)?

• Will they need to be certified?

24

Value-added models

• Many variations on value-added models
- TVAAS (Sanders’ original model) typically uses 3+ years of prior test scores to predict the next score for a student
  - Used since the 1990s for teachers in Tennessee, but not for high-stakes evaluation purposes
  - Most states and districts that currently use VAMs use the Sanders model, also called EVAAS
- There are other models that use less student data to make predictions
- Considerable variation in “controls” used
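The core idea of predicting a student's next score from prior scores, then crediting the teacher with the difference between actual and predicted scores, can be sketched in a few lines. This is a deliberately simplified illustration and not TVAAS/EVAAS, which are considerably more sophisticated statistical models. It assumes one ordinary least squares line fitted on the average of each student's prior scores, no additional "controls", and hypothetical data and function names.

```python
from collections import defaultdict
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - slope * x_bar, slope


def value_added(records):
    """records: list of (teacher, prior_scores, current_score) tuples.

    Returns each teacher's mean residual: how far, on average, their students
    scored above or below the score predicted from prior years.
    """
    priors = [mean(prior_scores) for _, prior_scores, _ in records]
    currents = [current for _, _, current in records]
    intercept, slope = fit_line(priors, currents)
    residuals = defaultdict(list)
    for (teacher, _, current), prior in zip(records, priors):
        residuals[teacher].append(current - (intercept + slope * prior))
    return {teacher: mean(values) for teacher, values in residuals.items()}


# Hypothetical data: three years of prior scores and one current score per student.
records = [
    ("Teacher A", (48, 50, 52), 58), ("Teacher A", (58, 60, 62), 68),
    ("Teacher A", (68, 70, 72), 78), ("Teacher B", (48, 50, 52), 52),
    ("Teacher B", (58, 60, 62), 62), ("Teacher B", (68, 70, 72), 72),
]
for teacher, score in value_added(records).items():
    print(f"{teacher}: {score:+.1f}")
```

Because the residuals are relative to the average prediction, a positive value means a teacher's students scored higher than expected given their history, and a negative value means lower than expected.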


25

Growth vs. Proficiency Models

Figure: achievement (vertical axis) from start of school year to end of year (horizontal axis), with a “Proficient” threshold.

Teacher A: “Success” on achievement levels

Teacher B: “Failure” on achievement levels

In terms of growth, Teachers A and B are performing equally

Slide courtesy of Doug Harris, Ph.D., University of Wisconsin-Madison

26

Growth vs. Proficiency Models (2)

Figure: achievement (vertical axis) from start of school year to end of year (horizontal axis), with a “Proficient” threshold, showing Teacher A and Teacher B.

A teacher with low-proficiency students can still be high in terms of GROWTH (and vice versa)

Slide courtesy of Doug Harris, Ph.D., University of Wisconsin-Madison
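The contrast drawn in the two growth vs. proficiency slides can be made concrete with a small worked example. The numbers below are hypothetical (a made-up scale with an assumed cut score of 50); they are chosen so that Teacher A's class ends the year above the proficiency line and Teacher B's class ends below it, while both classes gain the same amount.

```python
PROFICIENT = 50  # hypothetical cut score on a hypothetical scale

# Hypothetical class-average scores at the start and end of the school year.
classes = {
    "Teacher A": (45, 55),
    "Teacher B": (30, 40),
}


def summarize(start, end, cut=PROFICIENT):
    """Return the end-of-year proficiency status and the growth over the year."""
    return {"proficient": end >= cut, "growth": end - start}


results = {name: summarize(start, end) for name, (start, end) in classes.items()}
for name, result in results.items():
    print(name, result)
```

A proficiency model would label Teacher A a "success" and Teacher B a "failure", while a growth model would treat them as performing equally.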

27

Colorado Growth Model

• Colorado Growth Model
- Focuses on “growth to proficiency”
- Measures students against “academic peers”
- Also called criterion-referenced growth-to-standard models

• The student growth percentile is “descriptive” whereas value-added seeks to determine the contribution of a school or teacher to student achievement (Betebenner 2008)
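The notion of measuring a student against "academic peers" can be sketched as a percentile rank among students with similar prior scores. This is a rough simplification for illustration only: operational student growth percentile calculations are based on quantile regression rather than the simple banding used here, and the `band` parameter, function name, and data are all hypothetical.

```python
def growth_percentile(prior, current, cohort, band=5):
    """Percent of academic peers whose current score is below this student's.

    cohort: list of (prior_score, current_score) pairs for all students.
    Academic peers are approximated as students whose prior score is within
    `band` points of this student's prior score (a hypothetical rule).
    """
    peers = [c for p, c in cohort if abs(p - prior) <= band]
    if not peers:
        raise ValueError("no academic peers found for this prior score")
    below = sum(1 for c in peers if c < current)
    return round(100 * below / len(peers))


# Hypothetical cohort: five low-scoring and four high-scoring students.
cohort = [(40, 40), (40, 45), (40, 50), (40, 55), (40, 60),
          (80, 78), (80, 80), (80, 82), (80, 84)]

print(growth_percentile(40, 55, cohort))  # low prior score, strong growth relative to peers
print(growth_percentile(80, 78, cohort))  # high prior score, weak growth relative to peers
```

The result describes where a student's progress falls relative to similar students; by itself it makes no claim about what caused that progress, which is the "descriptive" point made above.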

28

Colorado Growth Model

Slide courtesy of Damian Betebenner at www.nciea.org

29

What value-added and growth models cannot tell you

• Value-added and growth models are really measuring classroom, not teacher, effects

• Value-added models can’t tell you why a particular teacher’s students are scoring higher than expected
- Maybe the teacher is focusing instruction narrowly on test content
- Or maybe the teacher is offering a rich, engaging curriculum that fosters deep student learning

• How the teacher is achieving results matters!

30

Recommendation from NBPTS Task Force (Linn et al., 2011)

Recommendation 2: Employ measures of student learning explicitly aligned with the elements of curriculum for which the teachers are responsible. This recommendation emphasizes the importance of ensuring that teachers are evaluated for what they are teaching.

31

School-wide VAM illustration

Figure: chart comparing Obs/Surv and VAM components (scale 0 to 8) across teaching assignments: SS1, SS2, Sci1, Sci2, Sped, FL, PE, ELL, Math 1, Math 2, ELA 1, ELA 2.

32

Measuring teachers’ contributions to student learning growth: A summary of current models

Model: Description

Student learning objectives: Teachers assess students at the beginning of the year and set objectives, then assess again at the end of the year; the principal or designee works with the teacher to determine success

Subject & grade alike team models (“Ask a Teacher”): Teachers meet in grade-specific and/or subject-specific teams to consider and agree on appropriate measures that they will all use to determine their individual contributions to student learning growth

Content Collaboratives: Content experts (external) identify measures, and groups of content teachers consider the measures from the perspective of classroom use; may not include pre- and post-measures

Pre- and post-tests model: Identify or create pre- and post-tests for every grade and subject

School-wide value-added: Teachers in tested subjects & grades receive their own value-added score; all other teachers get the school-wide average
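The school-wide value-added rule in the last row can be sketched as a simple assignment step. The sketch assumes that the "school-wide average" is the mean of the individual value-added scores of teachers in tested subjects and grades; a real system might compute a school-level estimate differently. Teacher labels and scores are hypothetical.

```python
from statistics import mean


def assign_growth_scores(teachers):
    """teachers: dict of name -> own value-added score, or None if untested.

    Teachers with their own score keep it; all others receive the school-wide
    average (assumed here to be the mean of the available individual scores).
    """
    own_scores = [score for score in teachers.values() if score is not None]
    school_average = mean(own_scores)
    return {name: (score if score is not None else school_average)
            for name, score in teachers.items()}


# Hypothetical school: two tested assignments and two untested ones.
school = {"Math 1": 2.0, "ELA 1": 4.0, "PE": None, "FL": None}
print(assign_growth_scores(school))
```

Under this rule every untested teacher in a school receives the same score, which ties their evaluation to students and subjects they may not teach.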

33

Tripod Survey domains

• Harvard’s Tripod Survey – the 7 C’s
– Caring about students (nurturing productive relationships);
– Controlling behavior (promoting cooperation and peer support);
– Clarifying ideas and lessons (making success seem feasible);
– Challenging students to work hard and think hard (pressing for effort and rigor);
– Captivating students (making learning interesting and relevant);
– Conferring (eliciting students’ feedback and respecting their ideas);
– Consolidating (connecting and integrating ideas to support learning)

34

Tripod Survey results

• Control is the strongest correlate of value added gains

• However, it is important to keep in mind that a good teacher achieves control by being good on the other dimensions

• Available in English & Spanish, in paper or online versions, at three levels: K-2, 3-5, 6-12

• For more info: http://www.tripodproject.org/index.php/index/


35

Why you should keep (and provide support to) the less effective teachers

• With the right instructional strategies and guidance, motivated teachers can improve practice and student outcomes

• The teachers you hire to replace your less effective teachers are not necessarily going to be more effective

• You may not be able to find better replacements!

• You may not be able to find any replacements!

• The replacements you find may not stay

36

Measures that help teachers grow

• Measures which include protocols and processes that teachers can examine and comprehend

• Measures that are directly and explicitly aligned with teaching standards

• Measures that motivate teachers to examine their own practice against specific standards

• Measures that allow teachers to participate in or co-construct the evaluation (such as portfolios)

• Measures that give teachers opportunities to discuss the results for formative purposes with evaluators, administrators, teacher learning communities, mentors, coaches, etc.

• Measures that are aligned with and used to inform professional growth and development offerings

37

Evaluating Teacher Preparation Programs (TPPs)

Evaluate teacher performance (including student outcomes)

Use results as a measure of TPP success (for evaluation purposes)

Use results to improve TPP curriculum and instruction

K-12 Teaching and learning improves as a result of changes made by TPPs

38

Meeting the “standards”

• It’s possible to be meeting accreditation standards (NCATE, TEAC) but still not be preparing fully effective teachers

• If TPPs are not adequately preparing teachers for the contexts and communities which they serve, their effectiveness may be hampered

39

Final thoughts

• The limitations:
- There are no perfect measures
- There are no perfect models
- Changing the culture of evaluation is hard work

• The opportunities:
- Evidence can be used to trigger support for struggling teachers and acknowledge effective ones
- Multiple sources of evidence can provide powerful information to improve teaching and learning
- Evidence is more valid than “judgment” and provides better information for teachers to improve practice

40

References

Anderson, L. (1991). Increasing teacher effectiveness. Paris: UNESCO, International Institute for Educational Planning.

Glazerman, S., Goldhaber, D., et al. (2011). Passing muster: Evaluating evaluation systems. Washington, DC: Brown Center on Education Policy at Brookings.

http://www.brookings.edu/reports/2010/1117_evaluating_teachers.aspx

Goe, L., Bell, C., et al. (2008). Approaches to evaluating teacher effectiveness: A research synthesis. Washington, DC: National Comprehensive Center for Teacher Quality, 1-103.

http://www.tqsource.org/publications/teacherEffectiveness.php

Hill, H. C., Charalambous, C. Y., & Kraft, M. A. (2012). When rater reliability is not enough: Teacher observation systems and a case for the generalizability study. Educational Researcher, 41(2), 56-64.

Linn, R., Bond, L., Darling-Hammond, L., Harris, D., Hess, F., & Shulman, L. (2011). Student learning, student achievement: How do teachers measure up? Arlington, VA: National Board for Professional Teaching Standards.

http://www.nbpts.org/index.cfm?t=downloader.cfm&id=1305

Race to the Top Application

http://www2.ed.gov/programs/racetothetop/resources.html

Rivkin, S. G., Hanushek, E. A., & Kain, J. F. (2005). Teachers, schools, and academic achievement. Econometrica, 73(2), 417 - 458.

http://www.econ.ucsb.edu/~jon/Econ230C/HanushekRivkin.pdf

Weisberg, D., Sexton, S., Mulhern, J., & Keeling, D. (2009). The widget effect: Our national failure to acknowledge and act on differences in teacher effectiveness. Brooklyn, NY: The New Teacher Project.

http://widgeteffect.org/downloads/TheWidgetEffect.pdf


41

Questions?

42

Laura Goe, [email protected]

https://twitter.com/GoeLaura

National Comprehensive Center for Teacher Quality
1000 Thomas Jefferson Street, NW
Washington, D.C. 20007
www.tqsource.org