1
Washington State Teacher and Principal Evaluation Project
Maximizing Rater Agreement
June 2013
2
Entry Task: Confidence Conversation
As you enter, please have a brief discussion with your district team to decide your level of confidence in whether the following statements are true for your district:
1. Our evaluators demonstrate accuracy and strong rater agreement when using observation data to score teacher performance.
2. Our district’s new evaluation system includes frequent, structured opportunities for evaluators to practice and calibrate their observation and rating skills.
3. Our teachers and principals trust their evaluators to rate their performance accurately and reliably.
Write your district name on three sticky notes and place them on the confidence scales posted on [INSERT LOCATION] for each statement.
3
Agenda
• Welcome! (Introductions, Logistics, Agenda)
• Connecting
• Learning
• Implementing
• Reflecting
• Wrap-Up
4
Modules
• Introduction to Educator Evaluation in Washington
• Using Instructional and Leadership Frameworks in Educator Evaluation
• Preparing and Applying Formative Multiple Measures of Performance: An Introduction to Self-Assessment, Goal Setting, and Criterion Scoring
• Including Student Growth in Educator Evaluation
• Conducting High-Quality Observations and Maximizing Rater Agreement
• Providing High-Quality Feedback for Continuous Professional Growth and Development
• Combining Multiple Measures into a Summative Rating
5
The Evaluation System Components
6
TPEP Core Principles
“We Can’t Fire Our Way to Finland”
1. The critical importance of teacher and leadership quality
2. The professional nature of teaching and leading a school
3. The complex relationship between the system for teacher and principal evaluation and district systems and negotiations
4. The belief in professional learning as an underpinning of the new evaluation system
5. The understanding that the career continuum must be addressed in the new evaluation system
6. The system must determine the balance of “inputs or acts” and “outputs or results”
7
Session Norms
• Pausing
• Paraphrasing
• Posing Questions
• Putting Ideas on the Table
• Providing Data
• Paying Attention to Self and Others
• Presuming Positive Intentions
What Else?
Connecting
Builds community, prepares the team for learning, and links to prior knowledge, other modules, and current work
8
9
Module Overview: 2 Parts
A. Conducting High-Quality Observations
B. Maximizing Rater Agreement
Reminder!
• This module provides an orientation to the basic concepts.
• This module does not go into great depth about evidence relating to any of the specific instructional or leadership frameworks and instead leaves it up to the districts to seek additional training.
10
Overview of Intended Participant Outcomes
Participants will know and be able to:
• Describe the OSPI working definition of rater agreement and the stages for development.
• Identify common rating errors in their own and others’ practice.
• Utilize appropriate strategies for minimizing bias and error in the observation and rating process.
• Understand the elements of high-quality training required to achieve maximum rater agreement.
11
Connecting Content: Importance of Rater Agreement
Even if
• You select a high-quality instructional or leadership framework AND
• Observers use best practices in collecting the observation data:
The results will be meaningless if observers are unable to demonstrate accuracy and consistency in scoring using the framework.
KEY POINT: An educator’s observation scores should be the same regardless of the observer.
12
Importance of Rater Agreement
Demonstrating rater agreement is critical to ensuring that:
• Educators can trust the new evaluation system.
• Educators receive relevant, useful information for professional growth.
• The new system is legally defensible for personnel decisions.
13
Rater Agreement Background
• The new law requires that evaluators of both teachers and principals “must engage in professional development designed to implement the revised systems and maximize rater agreement.”
• The Teacher and Principal Evaluation Project (TPEP) has relied heavily on the growing body of research, the framework authors, and the practical input from practitioners in pilot sites to create a “working definition” of rater agreement for the 2012-13 school year.
14
OSPI Definition of Rater Agreement
The extent to which the scores between the raters have consistency and accuracy against predetermined standards. The predetermined standards are the instructional and leadership frameworks and rubrics that define the basis for summative criterion-level scores.
15
OSPI Definition of Rater Agreement
• Consistency: A measure of observer data quality indicating the extent to which an observer is assigning scores that agree with scores assigned to the same observation of practice by another typical observer.
• Accuracy: A measure of observer data quality indicating the extent to which an observer is assigning scores that agree with scores assigned to the same observation by an expert rater; the extent to which a rater’s scores agree with the true or “correct” score for the performance.
16
Calculating Rater Agreement
Table 1. Illustrating Rater Agreement
Component   Component Score                      Type of Agreement
            Rater A   Rater B   Master Scorer
1           4         4         4                Exact Agreement
2           3         2         3                Adjacent Agreement
3           1         4         4                ?
4           3         3         1                ?
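The agreement types in Table 1 can be worked out mechanically. The sketch below is one possible reading, not an OSPI-specified procedure: it assumes exact agreement means identical scores and adjacent agreement means scores one point apart, and it labels any larger gap "discrepant" (a term that does not appear in the slides). It reports consistency (Rater A against Rater B) and accuracy (each rater against the master scorer) separately. The function and variable names are hypothetical.

```python
def agreement_type(score_a, score_b):
    """Classify the agreement between two scores on the same component."""
    diff = abs(score_a - score_b)
    if diff == 0:
        return "exact"
    if diff == 1:
        return "adjacent"
    return "discrepant"  # assumed label for a gap of more than one point


# Scores from Table 1: component -> (Rater A, Rater B, Master Scorer)
TABLE_1 = {
    1: (4, 4, 4),
    2: (3, 2, 3),
    3: (1, 4, 4),
    4: (3, 3, 1),
}


def summarize(scores):
    """Return consistency and accuracy labels for each component."""
    summary = {}
    for component, (a, b, master) in scores.items():
        summary[component] = {
            "consistency_a_vs_b": agreement_type(a, b),
            "accuracy_a": agreement_type(a, master),
            "accuracy_b": agreement_type(b, master),
        }
    return summary


if __name__ == "__main__":
    for component, labels in summarize(TABLE_1).items():
        print(component, labels)
```

Under these assumptions, component 4 shows the two raters agreeing exactly with each other while both sit more than a point from the master scorer, which is why consistency and accuracy are worth checking separately.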
17
Calculating Rater Agreement
Table II. Illustrating Rater Agreement (Cont.)
Component                   Component Score                      More than 1pt Off
                            Rater A   Rater B   Master Scorer
Subcomponent 1              4         3         4                No
Subcomponent 2              2         1         3                Yes
Subcomponent 3              1         3         4                Yes
Subcomponent 4              4         3         1                Yes
Subcomponent 5              3         4         2                Yes
Component Score (Average)   2.8       2.8       2.8
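The table's point, that identical component averages can hide large subcomponent disagreements, can be checked with a few lines of arithmetic. This is a minimal sketch that assumes "more than 1pt off" means at least one rater is more than one point from the master scorer; the slides do not spell out the rule, and the names below are hypothetical.

```python
# Scores from the table: subcomponent -> (Rater A, Rater B, Master Scorer)
SUBCOMPONENTS = {
    1: (4, 3, 4),
    2: (2, 1, 3),
    3: (1, 3, 4),
    4: (4, 3, 1),
    5: (3, 4, 2),
}


def more_than_one_point_off(a, b, master):
    """Assumed rule: True if either rater is more than 1 point from the master."""
    return abs(a - master) > 1 or abs(b - master) > 1


def component_averages(scores):
    """Average each scorer's subcomponent scores, rounded to one decimal."""
    n = len(scores)
    totals = [sum(row[i] for row in scores.values()) for i in range(3)]
    return tuple(round(total / n, 1) for total in totals)


if __name__ == "__main__":
    for sub, (a, b, master) in SUBCOMPONENTS.items():
        flag = "Yes" if more_than_one_point_off(a, b, master) else "No"
        print(f"Subcomponent {sub}: more than 1pt off? {flag}")
    print("Averages (A, B, Master):", component_averages(SUBCOMPONENTS))
```

All three averages come out to 2.8 even though four of the five subcomponents are flagged, so agreement is better assessed at the subcomponent level than on the averaged score.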
18
Connecting Activity: Where Can You Assess Rater Agreement?
[Diagram. Labels: Evidence (observation evidence; student growth data; artifacts; other relevant evidence); Framework scales (e.g., components, domains, dimensions); Framework Score; Criteria 1 through Criteria 8; Summative Criterion Score]
Learning
Understand common sources of rater error and strategies for minimizing their influence in observer ratings
Understand the role of high-quality observation training in achieving rater agreement
19
20
Learning Content I. Avoiding Rater Error
Recall that a skilled observer:
1. Understands each component and indicator on the district rubric thoroughly and deeply.
2. Gathers and sorts sufficient evidence of practice as it happens in the classroom or school.
3. Recognizes and puts aside preferences and biases.
4. Interprets the evidence appropriately to give an accurate rating using the evaluation instrument.
(McClellan, Atkinson, & Danielson, 2012)
21
Avoiding Common Rater Errors
Central Tendency
A rater evaluates the observation using points in the middle of the scale and avoids extremely high or low ratings.
Strategy to avoid this error?
• Pay careful attention to behavioral anchors that define performance at each scale point.
• Compare observation evidence with the behavioral anchors.
• Keep in mind that behavioral anchors are examples; you do not have to have observational evidence for every single anchor for a particular rating.
22
Avoiding Common Rater Errors
Contrast Effect
A rater directly compares the performance of one educator to that of another educator.
This is particularly problematic when a group of educators selects a common criterion on a focused evaluation cycle.
Strategy to avoid this error?
When assigning observation ratings, do not use another educator’s performance as a point of reference. Raters should only compare the observation evidence against the anchors on the rating scale.
23
Avoiding Common Rater Errors
Focusing on One or Two Incidents
Ratings are based on only a small sample of observation evidence that typically includes either very strong or weak examples of practice.
Strategy to avoid this error?
Be sure to take into account the full range of performance described in the observation evidence. Assess the frequency and depth of the behaviors recorded against the behavioral indicators in the rubric.
24
Avoiding Common Rater Errors
Halo Error
A rater allows ratings on one component/scale to influence ratings on another component/scale.
Strategy to avoid this error?
Remember that framework components are scored separately. Your ratings on one component should not influence ratings on another component.
Consider the observation evidence for each component separately and only use the information that is relevant to the component you are considering.
25
Avoiding Common Rater Errors
Potential Error
A rater gives higher or lower ratings to an educator than is warranted by the observation evidence because he or she believes the educator has (or does not have) the potential to be an excellent educator.
Strategy to avoid this error?
Remember to consider all instances of an educator’s actual observation data. Ratings should be made based only on the observation evidence collected, not on anticipated improvements or declines.
26
Avoiding Common Rater Errors
Leniency and Severity Errors
A rater gives mostly high (lenient) or low (severe) ratings to an educator in a manner inconsistent with the observation data collected.
Strategy to avoid this error?
Pay careful attention to the scale anchors when making your ratings. Also, review the anchors in order to understand how performance is defined at each scale point. Do not try to be intentionally “easy” or “hard” in your ratings.
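One way a district might screen for leniency or severity during practice scoring is to compare a rater's scores with master scores on the same videos. The sketch below is illustrative only and is not part of the OSPI materials: the mean signed difference, the 0.5-point tolerance, and the function names are all assumptions, and the sample scores used with it would be made up.

```python
from statistics import mean


def mean_signed_difference(rater_scores, master_scores):
    """Average of (rater - master); positive means higher than the master."""
    if len(rater_scores) != len(master_scores):
        raise ValueError("score lists must be the same length")
    return mean(r - m for r, m in zip(rater_scores, master_scores))


def describe_tendency(rater_scores, master_scores, tolerance=0.5):
    """Label a rater's overall tendency; the tolerance is a hypothetical cutoff."""
    bias = mean_signed_difference(rater_scores, master_scores)
    if bias > tolerance:
        return "lenient"
    if bias < -tolerance:
        return "severe"
    return "no clear tendency"


if __name__ == "__main__":
    # Made-up practice scores on four indicators.
    print(describe_tendency([4, 4, 3, 4], [3, 3, 2, 3]))
```

A signed average is used here because leniency and severity are directional; an absolute difference would show that a rater disagrees with the master scorer but not in which direction.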
27
Avoiding Common Rater Errors
Recency Bias
A rater is inclined to remember recent events better than those that occur in the past; thus, raters often place greater weight or emphasis on evidence collected near the end of the observation.
Strategy to avoid this error?
Consider all of the observation evidence collected over the entire class period. Remind yourself that the educator’s performance at the beginning of the observation is just as important as his or her performance at the end.
28
Avoiding Common Rater Errors
Similar-to-me Bias
A rater provides higher ratings to educators who are similar to themselves and lower ratings to educators who are dissimilar.
Strategy to avoid this error?
Avoid incorporating personal preferences, feelings, or perceptions about the educator into your ratings. Only actual observation evidence should be used to make an observation rating.
29
Learning Activity I. Practicing Observation Rating
You will need the following:
• Your observation notes from the Conducting High-Quality Observations module
• Your district’s instructional framework
Identify sections of your framework aligned to Criteria 5:
Instructional Framework & Criteria 5 Alignment
CEL 5D+          Danielson             Marzano
CEC 1, 2, 4-7    2a, 2c, 2d, and 2e    5.1 through 5.6
30
Learning Activity I. Practicing Observation Rating
Step 1: As a group:
• Select two indicators from the list, read them through, and discuss the key differences between the performance levels in each.
Step 2: As an individual:
• Read your observation notes and code the evidence relevant to each indicator (e.g., use highlighters, make notations, etc.).
• Select a rating for each indicator based on your coded evidence.
31
Learning Activity I. Practicing Observation Rating
Step 3:
• Select one person to be the “recorder” and write down ratings in Handout 4: Ratings Record.
• Share your indicator ratings for recording.
32
Learning Activity I. Practicing Observation Rating
Step 4:
• Identify any indicator without exact rater agreement.
• Discuss and attempt to achieve a rating consensus on each (e.g., explain your ratings with reference to evidence).
• Note and record, during the discussion, any common rating errors you find in your own ratings or in those of others in your group (see Handout 3: Common Rating Errors for reference).
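If the recorder keeps the ratings in a spreadsheet or simple data structure, the first part of Step 4 can be automated. The following is a minimal sketch under stated assumptions: the ratings record is modeled as a dictionary mapping each indicator to the ratings given by each participant, and the indicator labels, participant names, and scores are hypothetical. Handout 4 itself may be laid out differently.

```python
def indicators_without_exact_agreement(ratings_record):
    """Return indicators where participants did not all give the same rating.

    ratings_record maps indicator -> {participant: rating}.
    """
    return [
        indicator
        for indicator, ratings in ratings_record.items()
        if len(set(ratings.values())) > 1
    ]


def rating_spread(ratings):
    """Difference between the highest and lowest rating for one indicator."""
    values = list(ratings.values())
    return max(values) - min(values)


if __name__ == "__main__":
    # Hypothetical group record for two indicators.
    record = {
        "Indicator 1": {"Ana": 3, "Ben": 3, "Chris": 3},
        "Indicator 2": {"Ana": 2, "Ben": 4, "Chris": 3},
    }
    for indicator in indicators_without_exact_agreement(record):
        print(indicator, "spread:", rating_spread(record[indicator]))
```

The spread gives the group a quick sense of which disagreements to discuss first; the consensus conversation itself still has to be grounded in the observation evidence.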
33
Learning Activity I: Debrief/Wrap-Up
• Did anyone achieve exact rater agreement on at least one indicator? Both?
• Were you able to achieve consensus on ratings where you did not have exact rater agreement?
• What rater errors did you identify and what strategies could you utilize in the future to avoid the error?
34
Learning Content II: Observer Training to Achieve Rater Agreement
Intensive Training to Achieve Rater Agreement
• Orientation and deep understanding of standards and framework, components, and tools
• Practice rating using a combination of videos and live observations
• Feedback, coaching, and discussion of ratings
• Assessment of rater agreement (e.g., certification testing)
35
OSPI’s Stages of Rater Agreement Training
• 2-3 Day Foundational Training
• Ongoing Rater Agreement Training
36
Certification and Calibration Exams
• The certification exam should cover all grades and subjects the observer will observe.
• There are a variety of ways to reduce the time burden of certification:
  • Include a knowledge assessment of the observation rubric
  • Mix shorter videos of practice with longer, full-lesson videos of practice
• The calibration exam should test the observer on a representative selection of skills and content to ensure continued accuracy in rating.
• Certification and calibration exams are high-stakes exams.
37
Ongoing Calibration, Practice, and Monitoring
Rater agreement is NOT ensured by a single training or certification test.
38
Ongoing Calibration, Practice, and Monitoring
Rater drift will naturally occur unless evaluators have:
• Periodic opportunities to re-calibrate.
• Access to practice videos for difficult-to-score domains/components.
• Expectations that their ratings will be monitored.
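Monitoring for rater drift could be as simple as tracking how often an evaluator matches master scores at each calibration session. The sketch below illustrates that idea and is not an OSPI requirement: the exact-agreement rate, the 0.75 minimum, the session labels, and the scores are all hypothetical choices a district would need to set for itself.

```python
def exact_agreement_rate(rater_scores, master_scores):
    """Share of items on which the rater matched the master score exactly."""
    if not master_scores or len(rater_scores) != len(master_scores):
        raise ValueError("score lists must be non-empty and the same length")
    matches = sum(r == m for r, m in zip(rater_scores, master_scores))
    return matches / len(master_scores)


def flag_drift(sessions, minimum_rate=0.75):
    """Return labels of sessions whose agreement rate fell below the minimum.

    sessions is a chronological list of (label, rater_scores, master_scores).
    The minimum_rate default is an illustrative threshold, not a standard.
    """
    return [
        label
        for label, rater_scores, master_scores in sessions
        if exact_agreement_rate(rater_scores, master_scores) < minimum_rate
    ]


if __name__ == "__main__":
    # Made-up calibration results for one evaluator across a school year.
    sessions = [
        ("fall", [3, 2, 4, 3], [3, 2, 4, 3]),
        ("winter", [3, 2, 4, 2], [3, 2, 4, 3]),
        ("spring", [4, 3, 4, 2], [3, 2, 4, 3]),
    ]
    print("Sessions needing re-calibration:", flag_drift(sessions))
```

A flagged session would be a prompt for re-calibration and practice on the difficult components, not a judgment of the evaluator on its own.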
39
Ongoing Calibration, Practice, and Monitoring
Lessons from TPEP Pilots:
• Informal calibration through discussion forums where observers share challenges and best practices has a big impact.
• Use pre-existing professional learning groups (such as principal PLCs) to practice and calibrate.
• To practice, co-observe a classroom lesson, score separately, and meet to compare scores.
40
Learning Activity II: Identifying Opportunities for Calibration
Discuss with your team: What opportunities already exist in your district for ongoing calibration? Identify at least two.
Share: What opportunities have you identified?
Implementing
41
Develop a district plan for ongoing assessment and monitoring of rater agreement
Develop a district plan for ongoing rater calibration and practice
42
Implementing Activity: Monitoring and Maintaining Rater Agreement
• Read “Maximizing Rater Agreement: A Primer” and “Rater Agreement in Washington State’s Evaluation System” (20 minutes).
• Use the Implementation Planning Tool (Handout #5) to begin developing your district’s plan for monitoring and maintaining rater agreement over time.
43
Implementing Activities Debrief
Each team shares two things to debrief our implementing tasks:
1. One decision you made today (could be a key decision, a preliminary decision, a change of course, etc.)
2. One of the immediate next steps you are taking when you return to your district
Reflecting
44
45
Revisiting Our Confidence Conversations
1. Our evaluators demonstrate accuracy and strong rater agreement when using observation data to score teacher performance.
2. Our district’s new evaluation system includes frequent, structured opportunities for evaluators to practice and calibrate their observation and rating skills.
3. Our teachers and principals trust their evaluators to rate their performance accurately and reliably.
46
What’s Next
Homework options:
• District or school teams: use your observation notes to practice scoring additional indicators in your framework and discuss ratings to achieve agreement.
• Identify specific components or dimensions you think will be particularly hard for your observers to score. Prioritize those components or dimensions in your ongoing calibration and practice sessions.
47
Thank you!