Slide 1
New England Common Assessment Program
Science Test Item Review Committee
Meeting August 14-15, 2007
Killington, VT
Slide 2
New Hampshire: Tim Kurtz, Jan McLaughlin, Brian Cochrane, Stan Freeda
Rhode Island: Mary Ann Snider, Heather Heineke Agnew, Linda Jzyk, Peter McLaren
Vermont: Michael Hock, Gail Hall, Pat Fitzsimmons, Dave White
Measured Progress: Harold Stephens, Elliot Scharff, Amanda Smith, Josh Evans, Jim Manhart, Tori Henkes, Beneta Brown, Susan Tierney
Welcome and Introductions
An Emerging Vision
Cabot School, Vermont, Web Project Artwork
New England Common Assessment Program
Slide 4
Grades 3–8 (Reading, Math, and Writing)
• Oct 2007: Third Administration
• Jan 2008: Release Results
Grade 11 (Reading, Math, and Writing)
• Oct 2007: First Operational Administration
• Feb 2008: Release Results
Grades 4, 8, and 11 (Science)
• May 2008: First Operational Administration
• Oct 2008: Release Results
NECAP – Where are we now?
Slide 5
• 2007–2008 Schedule
• Test Form Construction
• Bias/Sensitivity
• Depth-of-Knowledge
• Test Item Review & Role of Committees
• Universal Design for Assessment
Science Overview
Slide 6
Item Review Committee meeting: August 14–15• 36 teachers: 12 from each state
Bias Committee meeting: August 14–16• 18 teachers: 6 from each state
Face-to-Face meetings: October/November
Test Form Production: January/February
DOE Reviews: late February / early March
Printing: March
Test Administration Workshops: April 2008
Shipments to schools: April 25, 2008
Test Administration Window: May 12–29, 2008
• 108,000 students from the 3 states
NECAP 2007–2008 Schedule
Slide 7
Collaborative effort among NH, RI, and VT
Based on common content from all three states
Used “Big Ideas of Science” and the domains of science as organizing foundations
Less about isolated facts and more about use and application of information
Overview of Test Design
Slide 8
Who? The NECAP includes “all” students educated at public expense in grades 3–8 and 11 in NH, RI, and VT.
Through explicit planning during test construction and the use of accommodations, the tests will be accessible to as many students as possible.
The NECAP does not include each state’s alternate assessment and English language proficiency assessment programs.
Test Design – Who?
Slide 9
What? The content, skills, and depth of knowledge contained in the Assessment Targets of each state’s Grade Span Expectations (GSEs). The Assessment Targets were developed jointly by the three states expressly for this assessment program.
Physical Science, Life Science, and Earth Space Science at the end of grades 4, 8, and 11.
Each test will be designed to measure a range of student achievement across four performance levels.
Test Design – What?
Slide 10
Why spring testing?
Critical transition points
• Grades 4 to 5, 8 to 9, and HS to beyond
National Standards
• General agreement at transition points
High School Schedule
• 4-by-4 block scheduling
Science is not (yet?) part of AYP
Test Design – Why Spring Testing?
Slide 11
How? Operational Test
• Three Sessions
• Sessions 1 and 2: multiple-choice (MC) and constructed-response (CR) items grouped together in three domains—Life Science, Physical Science, and Earth Space Science
• Session 3: Performance Task
Test Design – How?
Slide 12
Performance Task
Session 3 will be a performance task
• Looking at inquiry and science process
• Focus on one assessment target within the “INQ” code
• Scenario (story) driven
• Work in groups of two or three to begin the session, then answer questions individually
• Focus will vary by grade:
Grade 4: Always hands-on “design an experiment”
Grade 8: Sometimes like Grade 4, sometimes like Grade 11
Grade 11: Students will be given data and asked to draw conclusions
Test Design – Performance Task
Slide 13
Forms Construction—Common/Matrix Design
Common Items
• A common set of items completed by all students
• All achievement level scores (student, school, district, and state) are based solely on common items
Matrix-Sampled Items
• Unique sets of items distributed across forms
• Includes equating and field-test items
Test Design – Forms Construction
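The common/matrix design described above can be sketched in a few lines of code. This is an illustrative mock-up only, not the actual NECAP form-building process: the item IDs, set sizes, and the `build_forms` helper are all invented to show the core idea, namely that every form carries the same scored common block while unique matrix sets (equating and field-test items) rotate across forms.

```python
# Illustrative sketch (assumed names and counts, not the actual NECAP build):
# every form pairs the shared common block, scored for all students, with one
# matrix-sampled set that is unique to that form.

def build_forms(common_items, matrix_sets):
    """Pair the shared common block with each unique matrix set."""
    return [
        {"form": i + 1, "common": list(common_items), "matrix": list(ms)}
        for i, ms in enumerate(matrix_sets)
    ]

common = ["C1", "C2", "C3"]                          # taken by every student
matrix = [["M1", "M2"], ["M3", "M4"], ["F1", "F2"]]  # F* = field-test items
forms = build_forms(common, matrix)

# Achievement-level scores (student, school, district, state) would use
# only the common block, never the matrix items:
scored_items = {item for f in forms for item in f["common"]}
print(sorted(scored_items))  # ['C1', 'C2', 'C3']
```

The design choice this illustrates: matrix sampling lets the program try out many more items (for equating and field testing) without lengthening any single student's test, while the common block keeps all reported scores comparable.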
Slide 14
How do we ensure that this test works well for students from diverse backgrounds?
Bias/Sensitivity Review
Slide 15
What Is Item Bias?
Bias is the presence of some characteristic of an assessment item that results in the differential performance of two individuals of the same ability but from different student subgroups.
Bias is not the same thing as stereotyping, although we don’t want either in NECAP.
We need to ensure that ALL students have an equal opportunity to demonstrate their knowledge and skills.
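The definition above (equal-ability students from different subgroups performing differently on an item) is what psychometricians call differential item functioning (DIF). As a rough illustration only, and not part of NECAP's actual bias-review procedure, here is a minimal sketch that compares proportion-correct between two groups after matching students on total test score; the group labels, toy data, and helper function are all invented.

```python
# Hypothetical sketch of a DIF-style check: match students on overall ability
# (total score), then compare each group's proportion-correct on one item.
# Large gaps at equal ability suggest the item may be biased.

from collections import defaultdict

def dif_by_ability_stratum(responses):
    """responses: list of (group, total_score, item_correct) tuples,
    with group "A" or "B". Returns {total_score: gap in proportion-correct
    between group A and group B} for strata containing both groups."""
    strata = defaultdict(lambda: {"A": [0, 0], "B": [0, 0]})  # [correct, n]
    for group, score, correct in responses:
        cell = strata[score][group]
        cell[0] += int(correct)
        cell[1] += 1
    gaps = {}
    for score, cells in sorted(strata.items()):
        (ca, na), (cb, nb) = cells["A"], cells["B"]
        if na and nb:
            gaps[score] = ca / na - cb / nb
    return gaps

# Toy data: at every ability level, group B answers this item correctly
# less often than equally able group A students.
data = (
    [("A", s, True) for s in (10, 10, 20, 20)]
    + [("B", 10, False), ("B", 10, True), ("B", 20, False), ("B", 20, True)]
)
print(dif_by_ability_stratum(data))  # {10: 0.5, 20: 0.5}
```

Operational programs use more formal versions of this idea (e.g., Mantel-Haenszel statistics), but the principle is the same one the slide states: compare students of the same ability, not the groups overall.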
Slide 16
The Bias/Sensitivity Review Committee DOES need to make recommendations concerning…
• Sensitivity to different cultures, religions, ethnic and socio-economic groups, and disabilities
• Balance of gender roles
• Use of positive language, situations, and images
• In general, items and text that may elicit strong emotions in specific groups of students and, as a result, may prevent those groups of students from accurately demonstrating their skills and knowledge
Role of the Bias/Sensitivity Review Committee
Slide 17
The Bias/Sensitivity Review Committee will not make recommendations concerning…
• Reading Level
• Grade-Level Appropriateness
• Assessment Target Alignment
• Instructional Relevance
• Language Structure and Complexity
• Accessibility
• Overall Item Design
Role of the Bias/Sensitivity Review Committee
Slide 18
How do we ensure that the test contains a range of complexity?
Depth of Knowledge
Slide 19
Level 1 (Recall and Reproduction): recall of a fact, information, or procedure
Level 2 (Skills and Concepts): use of information or conceptual knowledge, two or more steps, etc.
Level 3 (Strategic Thinking): requires reasoning and developing a plan or a sequence of steps; some complexity; more than one possible answer
Level 4 (Extended Thinking): requires an investigation, and time to think through and process multiple conditions of the problem
Depth of Knowledge
Slide 20
This assessment has been designed to support a quality program in science. It has been informed by the input of hundreds of NH, RI, and VT educators. Because we intend to release assessment items each year, the development process continues to depend on the experience, professional judgment, and wisdom of classroom teachers from our three states.
Test Item Review Committees
Slide 21
Today you will be looking at test items in science.
The role of Measured Progress staff is to facilitate the discussion and capture clear, defensible recommendations for each test item.
The role of DoE content specialists is to listen, ask clarifying questions as necessary, and explain background information.
Your role is to advise the states by actively offering opinions based on content knowledge and grade-level expertise.
Role of the Test Item Review Committees
Slide 22
You will be asked to review all items against the following criteria:
1. Assessment Target Alignment
2. Correctness
3. Depth of Knowledge
4. Language
5. Universal Design
Finally, you will recommend each item for field testing, revision, or rejection.
Each committee member will complete a form to gather this information about each item.
Role of Test Item Review Committees
Slide 23
You will also be asked to provide group feedback on the following question:
Does this item measure more specific knowledge and ideas that might be part of an end-of-unit test or does it measure extended learning that would be part of a cumulative science assessment?
Role of Test Item Review Committees
Slide 24
You will also be asked to provide group feedback on the inquiry task by answering the following questions:
1. Is it possible for students at this grade level to answer the questions without completing the task?
2. Do the questions related to this task require scientific knowledge and understanding to answer?
Role of Test Item Review Committees
Slide 25
You are here today to represent your diverse perspectives. We hope that you…
• share your thoughts vigorously and listen just as intensely—we have different expertise and we can learn from each other,
• use the pronouns “we” and “us” rather than “they” and “them”—we are all working together to make this the best assessment possible, and
• grow from this experience—I know we will.
And we hope that today will be the beginning of some new interstate friendships.
Role of the Test Item Review Committees