
Local Assessment Validity Study

Rayne A. Sperling
Jonna M. Kulikowich

The Penn State Research Team


Regulation Guiding the Research

“Students shall demonstrate proficiency in reading, writing and mathematics on either the State assessments administered in grade 11 or 12 or local assessment aligned with academic standards and State assessments under § 4.52 (relating to local assessment system) at the proficient level or better to graduate.”


Purpose

• For those students who failed to reach proficiency on the PSSA during the 11th grade administration or the 12th grade retake administration, what are the local assessments districts use to measure proficiency?

– Objectives
  • To describe local assessments submitted by districts
  • To report the alignment of local assessments to proficiency standards
  • To describe districts’ reported local assessment practices as the measures used to establish proficiency relative to the standards


Collecting Local Assessments

• Two requests were sent by PDE (7/28/08, 8/12/08)

• Materials and practices submitted to PDE by October 7, 2008 were included in the study

• PDE recorded date of receipt on submissions

• Documents were boxed and shipped to Penn State, University Park

• Submissions were stored by randomly generated code in a Penn State suite


Response Rates

• Approximately 85% of districts responded to the PDE request for local assessments

• Response rates were roughly equal across rural, urban, and suburban school districts

• Characteristics of non-reporting districts were also examined



Districts Overall

All Districts: TOTAL 497*; Rural 226; Suburban 231; Urban 40

Average District Enrollment 2006-2007: TOTAL 1163**; Rural 762; Suburban 1300; Urban 2683**
Enrollment range: TOTAL 85 – 47789; Rural 85 – 3760; Suburban 188 – 9067; Urban 164 – 47789

Average Per-Pupil Expenditure ($): TOTAL 11925.45; Rural 11375.85; Suburban 12474.21; Urban 11928.21
Expenditure range ($): TOTAL 8174.25 – 23009.15; Rural 8174.25 – 20394.90; Suburban 8484.56 – 23009.15; Urban 9058.45 – 15572.14

Average % below proficient on both PSSA attempts: TOTAL 42.77; Rural 45.74; Suburban 37.4; Urban 58.56
Range: TOTAL 10.2 – 93.1; Rural 19.3 – 77.3; Suburban 10.2 – 75.4; Urban 26.7 – 93.1


* Although there are 501 school districts in PA, 4 do not have high schools and thus are not eligible for this investigation.
** These numbers exclude Districts of the First Class.


Reporting Districts


Reporting Districts: TOTAL 418 (84.1%); Rural 189 (83.6%); Suburban 195 (84.4%); Urban 34 (85.0%)

Average District Enrollment 2006-2007: TOTAL 1185 (1072 without Districts of the First Class); Rural 779; Suburban 1295; Urban 2894 (1446 without Districts of the First Class)
Enrollment range: TOTAL 85 – 47789; Rural 85 – 3760; Suburban 188 – 9067; Urban 164 – 47789

Average Per-Pupil Expenditure ($): TOTAL 11988.41; Rural 11410.02; Suburban 12556.58; Urban 11924.71
Expenditure range ($): TOTAL 8174.25 – 23009.15; Rural 8174.25 – 20394.90; Suburban 8634.41 – 23009.15; Urban 9058.45 – 15572.14

Average % below proficient on both PSSA attempts: TOTAL 42.22; Rural 44.73; Suburban 37.06; Urban 59.09
Range: TOTAL 10.2 – 92.2; Rural 19.3 – 77.3; Suburban 10.2 – 75.3; Urban 33.6 – 92.2
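The reporting percentages above follow directly from the eligible and reporting district counts in this table and the previous one. A minimal sketch of that arithmetic (the counts are copied from the tables; the variable names are illustrative only):

```python
# Reproduce the reporting-district percentages from the counts shown above.
eligible = {"Rural": 226, "Suburban": 231, "Urban": 40}    # "Districts Overall"
reporting = {"Rural": 189, "Suburban": 195, "Urban": 34}   # "Reporting Districts"

total_eligible = sum(eligible.values())    # 497
total_reporting = sum(reporting.values())  # 418

print(f"Overall: {total_reporting}/{total_eligible} = {total_reporting / total_eligible:.1%}")
for locale in eligible:
    rate = reporting[locale] / eligible[locale]
    print(f"{locale}: {reporting[locale]}/{eligible[locale]} = {rate:.1%}")
```

Running this reproduces the 84.1% overall figure and the roughly equal rates across rural, suburban, and urban districts noted earlier.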



Non-Reporting Districts


Non-reporting Districts (% of non-reporters): TOTAL 79; Rural 37 (46.83); Suburban 36 (45.57); Urban 6 (7.59)

Average District Enrollment 2006-2007: TOTAL 1050; Rural 666; Suburban 1295; Urban 1646
Enrollment range: TOTAL 317 – 3868; Rural 317 – 1894; Suburban 382 – 3868; Urban 629 – 3442

Average Number of Graduates 2006-2007: TOTAL 216; Rural 136; Suburban 288; Urban 286
Graduating class range: TOTAL 35 – 904; Rural 35 – 559; Suburban 46 – 904; Urban 146 – 550

Average % below proficient on both PSSA attempts: TOTAL 46.48%; Rural 51.30%; Suburban 39.99%; Urban 55.72%
Range: TOTAL 17.8% – 93.1%; Rural 29.1% – 69.1%; Suburban 17.8% – 75.4%; Urban 26.7% – 93.1%

Average Per-Pupil Expenditure: TOTAL $11,618.46; Rural $11,184.74; Suburban $12,009.50; Urban $11,946.88
Expenditure range: TOTAL $8,484.56 – $16,121.68; Rural $8,921.52 – $13,986.57; Suburban $8,484.56 – $16,121.68; Urban $10,787 – $14,502.57

Average % Free & Reduced Enrollment: TOTAL 34.32%; Rural 37.35%; Suburban 28.11%; Urban 52.93%
Free & Reduced range: TOTAL 2.51% – 82.08%; Rural 8.92% – 60.81%; Suburban 2.51% – 82.08%; Urban 19.83% – 77.58%


Local Assessments: Materials

• Districts reported use of a variety of materials and practices as their local assessment

• There was considerable variance in the nature of the materials submitted by districts
  – Some districts submitted a cover letter describing their local assessment
  – Some districts submitted multiple binders or boxes of materials as their local assessment

• Whatever a district submitted became that district’s local assessment ‘folder’


Materials Submitted

• Materials included curricula, portfolio strategies, published and commercial assessments, and district- and teacher-constructed assessments

• Districts reported use of more than 60 different published or commercially-available assessments

• Districts submitted assessments with varying types of item formats as local assessments



Materials Submitted


Type of Material*: Mathematics frequency (%); Reading frequency (%)
Multiple-choice Test: 314 (75); 315 (75)
Constructed Response: 269 (64); 273 (65)
Standardized/Published Test: 140 (33); 150 (36)
Portfolio: 33 (8); 40 (10)
Curriculum: 36 (9); 35 (8)
Course: 99 (24); 96 (23)
Tutoring: 50 (12); 48 (11)

* Types of materials submitted by districts overall (n = 418, districts that sent materials). Districts may have submitted more than one type of material.


Expert Panels

• Educators from Pennsylvania were invited to serve on an expert panel to review local assessments

• Characteristics
  – The expert panel comprised 24 members
  – The panel was equally divided into two teams: a 12-member panel for Reading and a 12-member panel for Mathematics
  – Among the panelists, 18 are classroom teachers, six of whom are chairs in their respective departments; several teach at nearby colleges
  – Two panelists are administrators (a high school principal and a curriculum supervisor)
  – Two are PSSA coaches (a literacy coach and a Mathematics coach)



Expert Panels (continued)

– One is an instructional specialist
– One is an educational assessment specialist

• Experience
  – Twenty-two panelists reported having experience with school-level curriculum development
  – Half of the panel members reported direct experience with development of their districts’ local assessments
  – Eight panelists participated in development of the PSSA on tasks such as item analysis
  – Panelists’ overall years of experience in education ranged from 7 to 37 years
  – Together they brought nearly 450 years of experience to the coding task



Expert Panels Procedure

• Panels met in State College in October

• Trained by national advisors to code district folder materials for alignment to proficiency standards

• Established criteria to evaluate reported practices for determining proficiency



Materials Alignment

0 = No content areas represented; no alignment of outcomes to standards.

1 = Some content areas represented; some outcomes are aligned.

2 = Many to most content areas represented; most outcomes are aligned.

3 = All content areas represented; all outcomes are aligned.


0 = No content areas represented; no alignment of outcomes to standards.

Mathematics examples:
• No assessment information provided
• Calculus Final Exam
• Content area assessment used as Mathematics proficiency (e.g., Science test or English exam)
• Assessment with errors in items or answers
• Portfolio without Mathematics content
• Curriculum materials with no observable outcome measure

Reading examples:
• No assessment information provided
• Below grade level assessments (e.g., 8th grade Reading test)
• Isolated vocabulary test or grammar test
• Content area assessment used as reading proficiency
• Curriculum materials with no observable outcome measure
• Writing assessments used as reading proficiency



1 = Some content areas represented; some outcomes are aligned.

Mathematics examples:
• Course overview or table of contents, but aligned to proficiency standards
• Assessment that does not match standards
• Assessment above eligible content that may include some standards
• Assessment far below grade level that may include some standards
• A single midterm or final exam for one particular content area (e.g., Algebra or Geometry)

Reading examples:
• Standardized test that does not align to PA Standards
• Above or below grade level assessment, with or without text, that matches some standards
• Good example that only partially addresses limited standards
• No variety of texts (for example, all narrative), without attempt to measure standards
• Collection of tests not aligned to standards
• Assessment with emphasis on recall, not reading



2 = Many to most content areas represented; most outcomes are aligned.

Mathematics examples:
• Assessment or collection of assessments that may be missing only one area (e.g., Geometry or Probability)
• Collection of chapter tests and quizzes
• Assessment that includes content beyond the standards
• Assessment that has all standards, but students are given the option to complete sections

Reading examples:
• Many but not all standards addressed
• Inappropriate text materials, but attempt to meet standards
• Assessment that addresses outdated standards
• Lacks systematic coverage of standards (e.g., may not include propaganda/bias or poetry as part of the assessment)
• Released PSSA items



3 = All content areas represented; all outcomes are aligned.

Mathematics examples:
• An assessment or a collection of assessments that includes at least one item for each standard (even though some students may not take all assessments)
• 4Sight assessment
• IU 8 assessment
• Study Island assessments
• An assessment that covers all proficiency standards, as either one test or a collection of tests, with emphasis on use of knowledge in problem solving

Reading examples:
• Grade level text(s) present with items/tasks that meet all proficiency standards
• Performance assessment representative of standards
• Study Island assessments



Alignment Ratings

All (418): Mathematics mean 2.77 (S.D. 0.57), 31 missing; Reading mean 1.79 (S.D. 0.95), 36 missing
Rural (190): Mathematics mean 2.78 (S.D. 0.58), 18 missing; Reading mean 1.75 (S.D. 1.01), 23 missing
Suburban (196): Mathematics mean 2.79 (S.D. 0.53), 9 missing; Reading mean 1.81 (S.D. 0.90), 9 missing
Urban (32): Mathematics mean 2.61 (S.D. 0.79), 4 missing; Reading mean 1.82 (S.D. 0.86), 4 missing

Score range for all groups is 0 – 3. Ratings were given for the folder of district materials.
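The means and standard deviations above are computed only over folders that received a rating; the "Missing" folders are excluded. A minimal sketch of that calculation, using made-up ratings rather than the study's data (the variable names and values are illustrative only):

```python
from statistics import mean, stdev

# Hypothetical alignment ratings (0-3) for a handful of district folders,
# with None marking folders that could not be rated ("Missing").
math_ratings = [3, 3, 2, None, 3, 1, 3, None, 2, 3]

rated = [r for r in math_ratings if r is not None]   # drop missing folders
print(f"n rated = {len(rated)}, missing = {math_ratings.count(None)}")
print(f"mean = {mean(rated):.2f}, S.D. = {stdev(rated):.2f}")
```

The same computation, applied separately to the Mathematics and Reading ratings within each locale group, yields a table of the form shown above.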



Local Assessments

• There was also considerable variance in the manner in which districts reported that they implement local assessments
  – Some districts reported they did not have a local assessment
  – Some districts reported that the local assessment was not necessary for graduation decisions
  – Some districts reported that their curriculum was aligned to standards and that passing courses was the local assessment



Assessment Practices

• Expert Panels examined local assessment practices

• A survey of 42 representative practices was rated before exposure to local assessments and again after rating the alignment of local assessment materials

• Consensus criteria were established for assigning codes for assessment practices


0 = The practice as reported cannot ensure that the proficiency level in Mathematics and/or Reading is met.

Examples:
• No proficiency requirement for graduation
• Graduation project serves as the proficiency standard
• Summer course attendance after graduation
• Remedial course attendance without any explanation of how grades are assigned
• Courses without an objective measure, or portfolios with no standard rubric system
• Tests not aligned to standards
• Unlimited retakes of unsecured assessments, in part or in full



1 = The information about practices provided is insufficient to determine whether proficiency in Mathematics and/or Reading is met.

Examples:
• Retakes of aligned assessments that exceed the number of forms of the assessment (if perfectly aligned)
• When information was not adequate to fully evaluate the practice and the alignment was at least a ‘2’
• Ambiguous security or retake opportunities for an aligned assessment



2 = There are some good practices reported to determine proficiency level; however, more information is needed.

Examples:
• When information was not adequate to fully evaluate the practice and the alignment was a ‘3’
• Valid practice with an alignment of ‘2’
• Fully aligned assessment with ambiguous security or retake opportunities
• Fully aligned and secure assessment with unclear information about how the scores or ratings determine that proficiency levels have been met
• Aligned assessment without clear criteria for proficiency



3 = The practices reported represent a valid system of procedures to determine proficiency in Mathematics and/or Reading.

Examples:
• Fully aligned, secure assessment with retakes that do not exceed the number of assessment forms
• Fully aligned assessment with randomly generated (e.g., computer-based) items
• Fully aligned and secure assessment that has scores or cut-off points to measure proficiency



Results by Districts

All (418): Mathematics mean 0.90 (S.D. 0.97), 8 missing; Reading mean 0.69 (S.D. 0.91), 10 missing
Rural (190): Mathematics mean 0.98 (S.D. 0.98), 5 missing; Reading mean 0.75 (S.D. 0.95), 7 missing
Suburban (196): Mathematics mean 0.84 (S.D. 0.95), 3 missing; Reading mean 0.65 (S.D. 0.86), 3 missing
Urban (32): Mathematics mean 0.75 (S.D. 1.02), 0 missing; Reading mean 0.56 (S.D. 0.95), 0 missing

Score range for all groups is 0 – 3.


Practices: Mathematics

Alignment (columns) by Practices (rows); cells show count (proportion)

Practices 0: Alignment 0 = 6 (.016); 1 = 10 (.026); 2 = 35 (.091); 3 = 108 (.281)
Practices 1: Alignment 0 = 0 (.000); 1 = 1 (.003); 2 = 11 (.029); 3 = 105 (.273)
Practices 2: Alignment 0 = 0 (.000); 1 = 0 (.000); 2 = 2 (.005); 3 = 74 (.193)
Practices 3: Alignment 0 = 0 (.000); 1 = 0 (.000); 2 = 0 (.000); 3 = 31 (.081)
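The cross-tabulations on this and the following slides pair each district's two ratings; each cell shows a count and, in parentheses, that count as a proportion of the districts with both ratings. A minimal sketch of how such a table could be built, assuming the ratings are held as per-district records; pandas and the tiny dataset here are illustrative, not the study's actual tooling or data:

```python
import pandas as pd

# Hypothetical per-district ratings: each district has an alignment rating
# and a practices rating on the same 0-3 scale.
df = pd.DataFrame({
    "alignment": [3, 3, 2, 3, 1, 3, 2, 3, 0, 3],
    "practices": [0, 1, 0, 2, 0, 3, 1, 2, 0, 1],
})

counts = pd.crosstab(df["practices"], df["alignment"])  # rows = practices, cols = alignment
proportions = counts / counts.values.sum()              # proportion of all rated districts

print(counts)
print(proportions.round(3))
```

The Mathematics & Reading tables later in the deck have the same structure, with the two subjects' ratings as the row and column variables.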



Practices: Reading

Alignment (columns) by Practices (rows); cells show count (proportion)

Practices 0: Alignment 0 = 38 (.101); 1 = 73 (.194); 2 = 74 (.197); 3 = 17 (.045)
Practices 1: Alignment 0 = 0 (.000); 1 = 30 (.080); 2 = 39 (.104); 3 = 26 (.069)
Practices 2: Alignment 0 = 0 (.000); 1 = 0 (.000); 2 = 23 (.061); 3 = 37 (.098)
Practices 3: Alignment 0 = 0 (.000); 1 = 0 (.000); 2 = 0 (.000); 3 = 19 (.051)



Alignment: Mathematics & Reading

Reading alignment (columns) by Mathematics alignment (rows); cells show count (proportion)

Mathematics 0: Reading 0 = 6 (.016); 1 = 0 (.000); 2 = 0 (.000); 3 = 0 (.000)
Mathematics 1: Reading 0 = 2 (.005); 1 = 6 (.016); 2 = 3 (.008); 3 = 0 (.000)
Mathematics 2: Reading 0 = 6 (.016); 1 = 22 (.058); 2 = 17 (.045); 3 = 1 (.002)
Mathematics 3: Reading 0 = 24 (.063); 1 = 74 (.195); 2 = 118 (.312); 3 = 98 (.259)



Practices: Mathematics & Reading

Reading practices (columns) by Mathematics practices (rows); cells show count (proportion)

Mathematics 0: Reading 0 = 180 (.442); 1 = 3 (.007); 2 = 1 (.002); 3 = 0 (.000)
Mathematics 1: Reading 0 = 41 (.101); 1 = 73 (.179); 2 = 1 (.002); 3 = 0 (.000)
Mathematics 2: Reading 0 = 5 (.012); 1 = 17 (.042); 2 = 54 (.133); 3 = 1 (.002)
Mathematics 3: Reading 0 = 4 (.010); 1 = 3 (.007); 2 = 5 (.012); 3 = 18 (.044)



Conclusions

• There was considerable variance in both the nature of materials and practices reported by districts.

• As measured by the district folder, Mathematics assessments were more often aligned to standards than were Reading assessments.

• For Mathematics, thirty-one school districts (8.1%) had ratings of ‘3’ for both alignment and practice, while nineteen (5.1%) districts received ratings of ‘3’ for both alignment and practice in Reading.



Conclusions (continued)

• Based on criteria established by the panels, and given the information districts submitted and reported about both their Mathematics and Reading assessments, evidence of alignment to standards and of practices that could yield valid measures of proficiency was present for 5 percent of school districts statewide. This figure is consistent with the practices cross-tabulation above, in which 18 districts (about 4.4% of those with both ratings) received a ‘3’ for practices in both Mathematics and Reading.
