Oreilly teacher evaluation presentation

Anju Kuriakose [email protected] Joe O’Reilly [email protected]

Requirements for A Teacher Evaluation System

“In developing these systems it is imperative that LEAs recognize that high stakes decisions about educator effectiveness should only be made using multiple measures that are both valid and reliable. To this end, this framework identifies several sources of data that may be used; however, LEAs should recognize that the majority of teachers do not have a complete compliment of valid and reliable student achievement data. This is particularly true for teachers in special needs areas and for those in grades and subjects where statewide assessments are not required. As LEAs begin the work of developing their own evaluation systems priority should be given to the creation of valid and reliable assessments in these high need areas.” (http://www.azed.gov/teacherprincipal-evaluation/ …SBE Framework)

*  multiple measures that are both valid and reliable

The Most Important Part of that Paragraph:

*  Multiple Measures
    *  “The various types of assessments of student learning, including for example, value-added or growth measures, curriculum-based tests, pre/post tests, capstone projects, oral presentations, performances, or artistic or other projects.” [IF they are reliable and valid]

*  Types of scores
    *  Passing Rate: The percentage of students passing the test
    *  Growth: Student growth percentile ranks

“multiple measures that are both valid and reliable”

*  ADE Definition of Validity: The extent to which a test's content is representative of the actual skills learned and whether the test can allow accurate conclusions concerning achievement.

*  NCME/AERA/APA Definition of Validity: The degree to which the evidence obtained through validation supports the score interpretations and uses to be made of the scores from a certain test administered to a certain person or group on a specific occasion


*  Implications for your evaluation system
    *  You can collect evidence that a test validly measures the academic content, but that is not sufficient.
    *  You should also have evidence that the tests measure the impact of the teacher on student learning and not other factors.

*  Most evaluation systems assume the first bullet is true and ignore the second bullet.


*  Reliability: The ability of an instrument to measure teacher performance consistently across different raters and different contexts.
*  Implication:
    *  Tests should produce approximately the same result if given repeatedly.
    *  Observers seeing the same teaching should score it similarly.


✓ All teachers were considered as Group B.
✓ Teachers get a final rating based on
    *  Individual Observation scores (Danielson Rubric) (60%)
    *  School Level AIMS scores (33%)
    *  Meeting the goals of the incentive plan at the school level (7%)

Teacher Rating in Mesa (pie chart): Teacher Observations, 60%; Student Achievement, 33%; School Achievement, 7%

The Teacher Evaluation System Consists of Three Components

*  Two observations, one in the fall and one in the spring, if a teacher is new or having difficulty
*  A final summative score is given

Teacher Observation Process

*  Danielson Framework
*  Other frameworks: Marzano, Stronge

*  Four domains and 22 components
    *  Planning and preparation (6 components)
    *  Classroom environment (5 components)
    *  Instruction (5 components)
    *  Professional responsibilities (6 components)

Teacher Observation Is Based On The Danielson Framework

*  Highly Effective
    *  Minimum of 7 Highly Effective ratings and no Ineffective or Developing ratings
*  Effective
    *  No Ineffective ratings and fewer than four Developing ratings
*  Developing
    *  Four or more Developing ratings and fewer than 3 Ineffective ratings
*  Ineffective
    *  3 or more Ineffective ratings
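The counting rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `observation_label` and its arguments are hypothetical, the inputs are taken to be counts of component-level ratings, and any combination the slide does not cover (for example, one Ineffective rating with fewer than four Developing ratings) returns `None` rather than a guessed label.

```python
from typing import Optional


def observation_label(highly_effective: int,
                      developing: int,
                      ineffective: int) -> Optional[str]:
    """Apply the four counting rules listed on the slide.

    Hypothetical helper: arguments are counts of component ratings.
    Returns None when none of the stated rules applies.
    """
    # 3 or more Ineffective ratings
    if ineffective >= 3:
        return "Ineffective"
    # At least 7 Highly Effective ratings, no Ineffective or Developing ratings
    if highly_effective >= 7 and developing == 0 and ineffective == 0:
        return "Highly Effective"
    # No Ineffective ratings and fewer than four Developing ratings
    if ineffective == 0 and developing < 4:
        return "Effective"
    # Four or more Developing ratings (fewer than 3 Ineffective is already known)
    if developing >= 4:
        return "Developing"
    # Assumption: the slide gives no rule for the remaining combinations
    return None


print(observation_label(highly_effective=8, developing=0, ineffective=0))
print(observation_label(highly_effective=2, developing=5, ineffective=1))
```

Returning `None` for uncovered cases makes the gaps in the stated rules visible instead of hiding them behind a default label.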

Based on All The Data, Teachers Receive One of Four Labels

Rating             Teachers   Percent
Ineffective              20        1%
Developing               99        3%
Effective              1866       59%
Highly Effective       1161       37%

How MPS Teachers Were Rated

Are Your Ratings Reliable?

*  Reliability
    *  The ability of an instrument to measure teacher performance consistently across different raters and different contexts.

*  What does that mean for teacher evaluation?
    *  Would two raters give the same person the same rating?

*  What training have you done to ensure that two raters view a situation in the same way?
*  Have you checked to see that raters are being accurate and consistent?
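One simple way to check whether two raters give the same rating is the exact agreement rate: the share of observations on which both raters assigned the same score. The sketch below is an illustration only; the function name and the sample scores are made up for the example and are not Mesa data, and exact agreement is just one of several possible consistency checks.

```python
def exact_agreement(rater_a, rater_b):
    """Fraction of paired observations where both raters gave the same score."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("Both raters must score the same non-empty set of observations")
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return matches / len(rater_a)


# Hypothetical 0-3 scores from two observers watching the same four lessons
scores_rater_a = [3, 2, 2, 1]
scores_rater_b = [3, 2, 1, 1]
print(exact_agreement(scores_rater_a, scores_rater_b))
```

A low agreement rate would suggest that raters need more calibration training before their scores are used for high stakes decisions.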

Are Your Ratings Reliable?

*  Are the best teachers getting high ratings and the poorer performers getting lower ratings?

Are Your Ratings Valid?

Does The Evidence Support 96% HE or E?

Grade   Reading   Math   Writing
3rd     23%       29%    --
4th     21%       32%    42%
5th     20%       37%    47%
6th     18%       34%    49%
7th     13%       30%    50%
8th     27%       38%    30%
10th    14%       31%    28%

Percent of Students NOT Showing Grade Level Proficiency (AIMS)



Student Achievement Component

✓ AIMS percent passing and growth in reading and math
✓ AIMS writing percent passing
X  District end of course/grade tests
X  Teacher made tests
X  DIBELS in grades K-3
X  Stanford 10 (grades 2 & 9)

What Measures Do We Use?

Test                      Percent   Score
AIMS Growth Percentiles   0-23      0
                          24-40     1
                          41-69     2
                          70-100    3
AIMS Percent Passing      0-35      0
                          36-59     1
                          60-89     2
                          90-100    3

AIMS Results Are Converted to a 0-‐3 Score
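The band conversion above can be written as a small lookup. This is a minimal sketch, not the district's actual implementation: the names `GROWTH_CUTS`, `PASSING_CUTS`, and `band_score` are hypothetical, and the treatment of fractional values (for example, 23.5 falling in the lower band) is an assumption, since the slide lists whole-number ranges only.

```python
from bisect import bisect_right

# Lower bounds of the bands that earn 1, 2, and 3 points (from the table)
GROWTH_CUTS = (24, 41, 70)    # AIMS growth percentiles
PASSING_CUTS = (36, 60, 90)   # AIMS percent passing


def band_score(value, cuts):
    """Convert a 0-100 percent or percentile to a 0-3 score.

    Assumption: a value between two listed bands stays in the lower band
    until it reaches the next lower bound.
    """
    if not 0 <= value <= 100:
        raise ValueError("value must be between 0 and 100")
    # Number of cut points at or below the value equals the score
    return bisect_right(cuts, value)


print(band_score(45, GROWTH_CUTS))    # growth percentile of 45
print(band_score(88, PASSING_CUTS))   # 88 percent passing
```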

*  Validity: AIMS has evidence that it is a valid measure of reading, writing and math. District and class tests probably do not have that evidence. No test has evidence that it accurately measures the effectiveness of a teacher.

*  Reliability: Classroom median growth percentiles often vary widely from year to year.

What About Reliable & Valid?

Ineffective (0) Developing (1) Effective (2) Highly Effective (3)

0% of incentive plan compensation

Achieve incentive goals at the 50% level



School Achievement Component

Mesa has schools set annual incentive goals. How a school does on these schoolwide goals determines how many points a teacher gets for this component.



You Have Ratings For Each Component, Now What?


Teacher Name   Components             Rating   Calculation
Teacher A      Observation            3        3 * 0.60 = 1.80
               Academic achievement   2        2 * 0.33 = 0.66
               School Goals           3        3 * 0.07 = 0.21
               Total Points                    2.67
               Final Rating                    Highly Effective

Putting the pieces together

Final Teacher Evaluation Rating

Ineffective        Less than 0.8
Developing         0.8-1.6
Effective          1.7-2.4
Highly Effective   2.5 or above
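The weighted total and the final label can be sketched as follows, using the 60/33/7 weights and the Teacher A ratings from the slides. The function and dictionary names are hypothetical. One assumption is labeled in the code: the slide leaves small gaps between bands (for example, between 1.6 and 1.7), so this sketch treats 0.8, 1.7, and 2.5 as the lower cut points of each band.

```python
# Component weights in percent, as stated on the slides
WEIGHTS_PERCENT = {
    "observation": 60,
    "student_achievement": 33,
    "school_goals": 7,
}


def total_points(ratings):
    """Weighted sum of 0-3 component ratings.

    Integer percent weights are summed first and divided once,
    which avoids accumulating floating-point error.
    """
    weighted = sum(WEIGHTS_PERCENT[name] * ratings[name] for name in WEIGHTS_PERCENT)
    return weighted / 100


def final_label(points):
    """Map total points to a label.

    Assumption: scores falling between the listed bands
    (e.g. 1.65) belong to the lower band.
    """
    if points < 0.8:
        return "Ineffective"
    if points < 1.7:
        return "Developing"
    if points < 2.5:
        return "Effective"
    return "Highly Effective"


teacher_a = {"observation": 3, "student_achievement": 2, "school_goals": 3}
points = total_points(teacher_a)
print(points, final_label(points))  # 2.67 Highly Effective
```

Because observations carry 60% of the weight, a teacher rated 3 on observation already has 1.8 points, enough for an Effective label even with 0 on both other components.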

*  Multiple measures
*  Reliability
*  Validity
*  Impact on individual teachers

Did We Create A Reliable and Valid Teacher Evaluation System Using ... Multiple Measures?

