86
1 1 st Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment David Abrams Sullivan County BOCES 8/20/2013

1 st Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

  • Upload
    lapis

  • View
    23

  • Download
    0

Embed Size (px)

DESCRIPTION

1 st Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment. David Abrams Sullivan County BOCES 8/20/2013. Introduction. Validity is a process not a product. Design assessments to function as a lever for good instructional practices. - PowerPoint PPT Presentation

Citation preview

Page 1: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

1

1st Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

David AbramsSullivan County BOCES

8/20/2013

Page 2: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Introduction

● Validity is a process not a product.● Design assessments to function as a lever for

good instructional practices.● Consciously design assessments so that the data

produced can be used to: inform instruction; provide evidence of student achievement and/or growth; and facilitate the transition to the Common Core College & Career Ready Standards.

● Document the Process.

2

Page 3: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Assessment System Design: Key Questions/Bright Lines

1. What do I want to learn from this assessment?2. Who will use the information gathered from this assessment?3. What action steps will be taken as a result of this assessment?4. What professional development or support structures should be in place to ensure the

action steps are taken appropriately?5. How will student learning improve as a result of using this assessment and will it

improve more than if the assessment were not used?(Perie, Marion, & Gong, 2009)

(Perie, Marion, & Gong, 2009)3

Page 4: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Comprehensive Assessment System:Components

Summative: given one time at the end of the semester or school year to evaluate students’ performance against a defined set of content standards. Can be used for accountability or to inform policy and/or can be teacher administered for grading purposes (Perie, Marion, & Gong, 2009).

4

Page 5: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Comprehensive Assessment System:Components

Interim: Assessments administered during instruction to evaluate students’ knowledge and skills relative to a specific set of academic goals in order to inform policymaker or educator decisions at the classroom, school, or district level. The specific interim assessment designs are driven by the purposes and intended uses, but the results of any interim assessment must be reported in a manner allowing aggregation across students, occasions, or concepts (Perie, Marion, & Gong, 2009).

5

Page 6: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Comprehensive Assessment System:Components

Interim Con’t: Key components of interim assessments are: 1) they evaluate students’ knowledge and skills relative to a specific set of academic goals; and 2) they are designed to inform decisions at both the classroom and beyond the classroom level (Perie, Marion, & Gong, 2009).

6

Page 7: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Comprehensive Assessment System:Components

Formative: used by teachers to diagnose where students are in their learning, where gaps in knowledge and understanding exist, and to help teachers and students improve student learning. The assessment is embedded within the learning activity and linked directly to the current unit of instruction.

7

Page 8: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Comprehensive Assessment System:Components

Formative assessments are used most frequently and have the smallest scope and shortest cycle while summative are administered least frequently and have largest scope and cycle. Interim fall between the two (Perie, Marion, & Gong, 2009).

8

Page 9: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Formative Assessment: Table Discussion-Information Processing

How is Formative Assessment being used in your District, Schools, &/or Classroom?What constitutes a “quality formative assessment and how do you know?”What are your goals for implementing Formative Assessment in your school?Does your district have Professional Learning Communities/Whole Faculty Study Groups, vertical & horizontal, in place?Are you comfortable using data to inform instruction?

9

Page 10: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Formative Assessment: Perie, Marion, Gong

● Formative assessment is used by classroom teachers to diagnose where students are in their learning, where gaps in knowledge and understanding exist, and how to help teachers and student improve student learning.

● The assessment is embedded within the learning activity and linked directly to the current unit of instruction.

● The tasks presented may vary from one student to another depending on the teacher’s judgment.

10

Page 11: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Formative Assessment: Perie, Marion, Gong

● Providing corrective feedback, modifying instruction to improve the student’s understanding, or indicating areas of further instruction are essential aspects of a classroom formative assessment.

11

Page 12: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Formative Assessment

● …true meaning of formative assessment: an activity designed to give meaningful feedback to students and teachers and to improve professional practice and student achievement Reeves (2009).

● Three things must occur for an assessment to be formative: assessment is used to identify students who are experiencing difficulty; those students are provided additional time and support to acquire the intended skill or concept, and the students are given another opportunity to demonstrate what they have learned (DuFour, Eaker, and Karhanek 2010).

12

Page 13: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Formative Assessment

● Formative assessment is a systematic process to continuously gather evidence and provide feedback about learning while instruction is underway.

● The feedback identifies the gap between a student’s current level of learning and a desired learning goal.

● Teachers elicit evidence about student learning using a variety of methods and strategies, e.g. observation, questioning, dialogue, demonstration, and written response.

(Heritage, et al 2009)

13

Page 14: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Formative Assessment

● Teachers must examine the evidence from the perspective of what it shows about student conceptions, misconceptions, skills, and knowledge.

● They need to infer the gap between students’ current learning and desired instructional goals, identifying students’ emerging understanding or skills so they can modify instruction.

● For assessment to be formative, action must be taken to close the gap based on evidence solicited.

(Heritage, et al 2009)

14

Page 15: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Formative Assessment

What We Know:●It is not a kind of test.●Formative Assessment practice, when implemented effectively, can have powerful effects on learning.●Formative Assessment involves teachers making adjustments to their instruction based on evidence collected, and providing students with feedback that advances learning.●Students participate in the practice of formative assessment through self and peer-assessment.

(Heritage, 2011)15

Page 16: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Formative Assessment: Teacher’s Role

● Effective when teachers are clear about the intended learning goals for a lesson: this means focusing on what students will learn, as opposed to what they will do.

● Teachers need to share learning goal with students.● Teachers need to communicate the indicators of

progress toward the learning goal.● There is no single way to collect formative evidence

because formative assessment is not a specific kind of test.

(Heritage, 2011)

16

Page 17: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Formative Assessment: Student’s Role

● Student’s role begins when they have a clear conception of the learning target.

● In self-assessment, students engage in metacognitive activity which involves students in thinking about their own learning while they are learning.

● They generate feedback that allows them to make adjustments to their learning strategies.

17

Page 18: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Formative Assessment: Student’s Role

● It is important to include peer-assessment where students give feedback to their classmates.

● Students use the feedback; it is important that students have to both reflect on their learning and use the feedback advance learning.

(Heritage, 2011)

18

Page 19: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Formative Assessment: Evidence Collection

● Whatever methods teachers use to elicit evidence of learning, it should yield information that is actionable by them and their students.

● Evidence collection is a systematic process and needs to be planned so that teachers have a constant stream of information tied to indicators of progress.

(Heritage, 2011)

19

Page 20: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Formative Assessment: Feedback

● Feedback obtained from planned or spontaneous evidence is an essential resource for teachers to shape new learning through adjustments in their instructions.

● Feedback that teacher provides to students is also an essential resource so that student can take active steps to advance their own learning.

(Heritage, 2011)

20

Page 21: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Common Formative Assessment

● Common assessment refers to those assessments given by teacher teams who teach the same content or grade level; those with “collective responsibiloity for the learning of a group of students who are expected to acquire the same knowledge and skills.”

● No teacher can opt out of the process: common assessment use the same instrument or a common process utilizing the same criteria for determining the quality of student work.

(DuFour et al., 2010)

21

Page 22: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Common Formative Assessment: Benefits

● Promote efficiency for teachers● Promote equity for students● Provide an effective strategy for determining

whether the guaranteed curriculum is being taught and, learned.

● Inform practice of individual teachers● Build a team’s capacity to improve its program● Facilitate a systematic, collective response to

students who are experiencing difficulty● Tool for changing adult behavior and practice

(Bailey & Jakicic, 2012)22

Page 23: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Formative Assessment: Table Discussion-Information Processing

Do these views of Formative Assessment converge with your’s? Why/Why Not?What strikes you as most important given the key points regarding Formative Assessment?How do you see implementing assessment strategies that effectively utilize the crucial aspects of Formative Assessment?Are you comfortable designing Common Formative Assessments?

23

Page 24: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Formative Assessment: Evidence to Action-G-Study

● Heritage et al determined that there is little research to evaluate teachers’ ability to adapt instruction based on assessment of student knowledge and understanding.

● Research has shown, using math, that moving from evidence to action may not always be the seamless process formative assessment demands.

● G-study results provide data showing teachers do better at drawing inferences of student levels of understanding from assessment evidence, while having difficulties in deciding next instructional steps.

24

Page 25: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Formative Assessment: Evidence to Action-G-Study

● Heritage et al conducted a G-Study using 3 mathematical concepts as the instructional learning goal.

● The teachers’ pedagogical knowledge in mathematics was the object of measurement. The study was designed to provide information about potential sources of variation in measuring teachers' pedagogical knowledge in mathematics.

● The study design implies that there were three sources of score variability: rater, mathematics principle, and type of task.

25

Page 26: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Formative Assessment: Evidence to Action-G-Study

● Rater: Study used performance tasks; once source of variance was the possibility of score variation between raters due to interpretation and application of rubric and how stringent/lenient a rater may be.

● Principle: different types of domain specific principles may cause variance due to a given teachers’ preparation, a teacher may have more knowledge about one principle than others. (Study evaluated 3 principles: distributive property, solving equations, & rational number equivalence.)

26

Page 27: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Formative Assessment: Evidence to Action-G-Study

Task: potential source of variability. Study focused on 3 types of tasks: identifying key principles; evaluating student understanding; and planning the next instructional step based on the evaluation of student understanding.

27

Page 28: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Formative Assessment: Evidence to Action-G-Study Results

● Main effect of principle and rater are minimal. Teachers knew the concept and knew how to evaluate the student work to determine learning.

● Important finding: regardless of the math principle, determining next instructional steps based on the examination of student responses tends to be more difficult for teachers.

● If teachers are not clear about what the next steps are to move learning forward, then promise of Formative Assessment to improve student learning is impacted negatively.

28

Page 29: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Formative Assessment: Evidence to Action-G-Study Results & Teacher

Support

● Teachers need clear conceptions of how learning progresses in a domain; they need to know what precursor skills and understandings are for a specific instructional goal, what a good performance of the desired goal looks like, and how the skill increases in sophistication from the current level students have reached.

● Learning progressions describe how concepts and skill increase in sophistication in a domain from the most basic to the highest level, showing the trajectory of learning along which students are expected to progress.

29

Page 30: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Formative Assessment: Evidence to Action-G-Study Results & Teacher Support

● From a learning progression, teachers can access the big picture of what students need to learn, they can grasp what the key building blocks of the domain are, while having sufficient detail for planning instruction to meet short term goals.

● Teaches are able to connect Formative Assessment opportunities to short term goals as a means to track student learning.

● Learning progressions alone will not be sufficient.

30

Page 31: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Formative Assessment: Evidence to Action-G-Study Results & Teacher Support

● Teachers need to know what a good performance of the their specific short-term learning goal looks like. They must also know that good performance does not look like.

● Key finding: using assessment information to plan subsequent instruction tends to be the most difficult task for teachers as compared to other tasks. This finding gives rise to the question: can teachers always use formative evidence to effectively “form” action?

31

Page 32: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Formative Assessment: Instructional Support-Marzano, Pickering, and Pollock (2001)

9 highly effective, research-based strategies1.Identifying similarities and differences2.Summarizing and note-taking3.Reinforcing effort and providing recognition4.Homework and practice5.Nonlinguistic representations6.Cooperative learning7.Setting objectives & Providing Feedback8.Generating & testing hypothesis9.Cues, questions, and advanced organizers

32

Page 33: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Formative Assessment: Table Discussion-Information Processing

How can you use Professional Learning Communities to support teachers and students when implementing Formative Assessment?

Based on Heritage et al’s findings, what type of professional development will you need to support Formative Assessment in your District’s?

33

Page 34: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Formative Assessment: Validity Framework

Review Nichols, Meyers, & Burling (2009) Validity Framework for a Formative System:●Figure 1: General Framework for Evaluating Validity●Figure 2: Framework using the identification of specific procedural errors●Figure 3: Framework for a system of reteachingValidity: The degree to which accumulated evidence and theory support specific interpretations of test scores entailed by proposed use of a test (JS Glossary).

34

Page 35: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Formative Assessment: Design

● Need to unwrap standards or evaluate State Testing Data to determine demonstrated areas of instructional need

● Define & Align to State and District ExpectationsReview Baily Figure 4.1: 5 Steps1. Focus on Key Words2. Map it Out3. Analyze the target4. Determine big ideas5. Establish Guiding Questions for Instruction

35

Page 36: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Formative Assessment: Design

● Assessments must provide information about important learning targets that are clear to students and teacher teams.

● Assessments provide timely information for both students and teacher teams.

● Assessment must provide information that tells students and teacher teams what to do next.

36

Page 37: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Formative Assessment: Design

●Determine Assessment types: selected response; constructed response; performance task●Determine number and balance of items●Select/design assessment●Administer, evaluate responses, and redesign instructional strategies

37

Page 38: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Formative Assessment: Design

●Assess again (in original assessment design, create enough items to have more than one assessment)●Evaluate depth and breath of rigor: Webb’s Depth of Knowledge/Bloom’s Taxonomy●Utilize PLC to support process and data discussions

38

Page 39: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Cognitive Response Demands: Reading Load

Reading Load: The amount and complexity of the textual and visual information provided with an item that an examinee must process and understand in order to respond successfully to an item (Ferrara 2011).

39

Page 40: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Cognitive Response Demands: Reading Load

Low Reading Load: May include a small amount of text.Moderate Reading Load: Lower amounts of text and visuals and less complex text.High Reading Load: May include a large amount of text much of which is complex linguistically or complex because of the content area concepts and terminology used (Ferrara, 2011).

40

Page 41: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Cognitive Response Demands:Mathematical Complexity Ranges for NAEP

Low Complexity: Items may require examinees to recall a property or recognize a concept; these are straightforward, single operation items.Moderate Complexity: May require examinees to make connections between multiple concepts, multiple operations, and to display flexibility in thinking as they decide how to approach a problem.High Complexity: May require examinees to analyze assumptions made in a mathematical model or to use reasoning, planning, judgment, and creative thought; it assumes that students are familiar with the mathematics concepts and skills required by an item (Ferrara, 2011).

41

Page 42: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Common Core Transition: Academic Language

Academic language is used to refer to the form of language expected in contexts such as the exposition of topics in the school curriculum, making arguments, defending propositions, and synthesizing information (Snow, 2010).

(See Coxhead’s High-Incidence Academic Word List)

42

Page 43: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Bailey & Jakicic Sample Assessment Plan

LearningTarget

Knowledge Application Analysis Evaluation

Identify similes & metaphors from text

Five Matching

Explain meaning of common similes & metaphors

Four Multiple-choice

Develop a narrative paragraph w/both

One Constructed Response

43

Page 44: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

44

Quest # Type(MC, CRQ, ERQ)

Point(s) LearningStandard

PI CC LiteracyStandard

Reading OrMathLoad

Lexile or other ReadingFormula

AL* WebbDOK

*AL: Academic Language

Revised Item Map for Locally Developed Assessments-Formative Assessments

Quest # Type(MC, CRQ, ERQ)

Point(s) LearningStd

PI CCLiteracyStd

ReadingOrMathLoad

Lexile orOtherReadingFormula

AL WebbDOK

Item Data

Sample Item Map Template: Prior to Assessment Administration

Sample Item Map Template: Post Assessment Administration

Page 45: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Multiple Measures-Joint Standards

13.7: In educational settings, a decision or characterization that will have major impact on a student should not be made on a simple test score. Other relevant information should be taken into account if it will enhance the overall validity of the decision.

45

Page 46: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Multiple Measures

Multiple Measures are intended to improve quality of high-stakes decision making so decisions are not based on one single measure.

Definition of multiple measures, criteria to evaluate each measure, and how these measures should be combined for use in decision making is not clear.

(Henderson-Montero, Julian, & Yen, 2003)

46

Page 47: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Multiple Measures: Examples

● Test more than one content area● Assess a content area using a combination of MC and CRQ

formats● Assess a content area using an on-demand test and a class

based portfolio (writing)● Assess school performance using a combination of academic

tests and other indicators● Make progressively “higher stakes” decisions about schools

using a combination of accountability scores and other reviews● Use for promotion/graduation processes by meeting certain

criteria, even if student does not pass a State’s on-demand test (Gong & Hill, 2001).

47

Page 48: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Multiple Measures: Examples

● Have several assessment instruments that can be used by students of various proficiency or presentation/response needs

● Allow students multiple opportunities to retake the test to determine whether they meet minimum cut scores

● Allow for promotion/graduation● Double score every constructed response item on tests used

for high school student accountability ● Assess school performance using an average of at least two

years’ of data● Assess school performance using as many grades of students

as practical. (Gong & Hill, 2001)

48

Page 49: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Multiple Measures

● Need to create a framework for combining multiple measures.

Four Categories of Rules: 1. Conjunctive, 2. Compensatory, 3. Mixed conjunctive-compensatory, and 4. Confirmatory

(Henderson-Montero, Julian, and Yen 2003; Chester, 2003).

49

Page 50: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Multiple Measures

● Conjunctive: attainment of a minimum standard (e.g. meeting a designated cut score on a specific exam)

● Compensatory: weaker performance on one measure can be offset by stronger performance on another. Performance on two or more measures is required before they are combined into a compensatory rule.

● Mixed conjunctive-compensatory: uses a combination of conjunctive and compensatory approaches; e.g. minimal performance level is required across measures, but beyond minimal level of performance, poorer performance on one measure can be counterbalanced by better performance on other measures.

50

Page 51: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Multiple Measures

● Confirmatory: employs information from one measure to validate or compare information from another, independent measure (e.g. statewide reading performance compared to NAEP). Generally apply a conjunctive rule where minimum performance on independent measure is required (e.g. Regents alternatives, i.e. AP score of Level 3).

(Henderson-Montero, Julian, and Yen 2003; Chester, 2003).

51

Page 52: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Multiple Measures: Example

  Conjunctive(AND)

Compensatory(+/-)

Complementary(OR)

Measures of Different Constructs 

Test scores inReading & math;Satisfactory teacherMarks on ALL subjectAreas; Successful Completion ofMultidisciplinary &Service Learning Projects

   

Different Measures Of the SameConstructs

Satisfactory teacher marks ANDtest scores

Multiple-Choice andOpen-EndedSections of SAT-9

SAT 9 OR CitywideTest

Multiple Opportunities

    Summer School;Retest in citywide test

Accommodations &Alternate Assessments

    SAT-9 OR Spanish-language Aprenda; Citywide testIn English OR Spanish

52

Philadelphia's approach to combining multiple measures to reach Elementary promotion decisions.

(Chester, 2003)

Page 53: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Multiple Measures: Challenges

● Technical issues in combining specific assessments and indicators: use of common scale and/or index (e.g. combine test scores from NRT & CRT; state and local (NRT; CRT; other); combine test scores and other indicators; combining non-standard and/or different combinations of measures.

● Sufficient data for characterizing school(s).● “Valid” interpretations and uses of accountability for effective

and fair school improvement & services to individual students (disaggregation of data) (Gong & Hill, 2003).

● Use of multiple measures does not necessarily improve the reliability and validity of the decisions that are made; it is the logic by which the measures are combined that determines the accuracy and appropriateness of the high-stakes decisions that are made (Chester, 2003).

53

Page 54: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

54

Appendix Materials

Page 55: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

55

Levels Description

Level 1: Knowledge Recall & Recognition

Level 2: Comprehension Translate, Interpret, & Extrapolate

Level 3: Application Use of generalizations in specific instances

Level 4: Analysis Determine Relationships

Level 5: Synthesis Create new relationships

Level 6: Evaluation Exercise of learned judgment

Cognitive Response Item Coding: Bloom’s Taxonomy

Page 56: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Cognitive Response Item Coding: Webb’s Depth of Knowledge

Level Description

Level 1 Recall: recall of a fact, information, or procedure

Level 2 Skill/Concept: use of information or conceptual knowledge, two or more steps, etc.

Level 3 Strategic Thinking: requires reasoning, developing plan or a sequence of steps, some complexity, more than one possible answer.

Level 4 Extended Thinking: requires an investigation, time to think and process multiple conditions of the problem

56

(www.wcer.wisc.edu)

Page 57: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

57

Bloom’s Taxonomy Webb’s DOK

Knowledge: RecallComprehension: low level processing

Recall

Application: Use of abstractions in concrete situations

Basic Application of Skill/Concept: use of information, conceptual knowledge, multiple steps

Analysis: breakdown of a situation into component parts

Strategic Thinking: requires reasoning, a plan, or multi-step processes

Synthesis & Evaluation: Putting together elements and parts to form a whole

Extended Thinking: requires investigation, time to think, and to process multiple conditions of task

Cognitive Response Item Coding: Bloom & Webb: Side-By-Side

Page 58: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Common Core Transition: Academic Language

Academic language is designed to be concise, precise, and authoritative. To achieve these goals, it uses sophisticated words and complex grammatical constructions that can disrupt reading comprehension and block learning. Students need help in learning academic vocabulary and how to process academic language if they are to become independent learners (Snow, 2010).(See Coxhead’s High-Incidence Academic Word List)

58

Page 59: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Common Core Transition: Academic Language

● Maintaining the impersonal authoritative stance creates a distanced tone that is often puzzling to adolescent readers and is difficult for them to utilize in their writing.

● Students must have access to the all-purpose academic vocabulary that is used to talk about knowledge and that they will need to use in making their own arguments and evaluating others’ arguments (Snow, 2010).

59

Page 60: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Common Core Transition: Academic Language

Formal Register (Formal Academic Tone): language found in most academic, business, & serious non-fiction. It is characterized by:●Emotional distance between writer and audience;●Emotional distance between the writer and topic;●No colloquial language, slang, regionalisms, and dialects; ●Careful attention to conventions of edited American English; and●Attention to the logical relationships among words and ideas (The St. Martin’s Handbook).

60

Page 61: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Item Types: Key Term

Item Format: The variety of test item structures or types that can be used to measure examinees’ knowledge, skills, and abilities; these include: multiple-choice or selected response; open-ended or constructed response; essay; and performance task. (Perie, 2010).

61

Page 62: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Item Types: Key Terms

Selected Response Items: Items or tasks in which students choose from among response or answer choices that are presented to them, e.g. Multiple-Choice, True/False, Matching, Cloze (Carr & Harris, 2001).

62

Page 63: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Item Types: Conventional MC

Three Parts: Stem, Correct Answer, and Distractors (Haladyna, 1999).

●Stem: Stimulus for the response; it should provide a complete idea of the problem to be solved in selecting the right answer. The stem can also be phrased in a partial-sentence format. Whether the stem appears as a question or a partial sentence, it can also present a problem that has several right answers with one option clearly being the best of the right answers.

63

Page 64: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Item Types: Conventional MC

Three Parts: Stem, Correct Answer, and Distractors●Correct Answer: the one and only right answer; it can be a word, phrase, or sentence.●Distractors: Distractors are wrong answers. Each distractor must be plausible to test-takers who have not yet learned the content the item is supposed to measure. To those who have learned the content, the distractors are clearly wrong choices. Distractors should resemble the correct choice in grammatical form, style, and length. Subtle or blatant clues that give away the correct choice should be avoided.

64

Page 65: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Item Types: Key Terms

Constructed Response Item: An exercise for which examinees must create their own responses or products rather than choose a response from an enumerated set. Short-answer items require a few words or a number as an answer, whereas extended-response items require at least a few sentences (Joint Standards, 1999).

65

Page 66: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Item Types: Key Terms

Constructed Response Items Con’t: Constructed response items or tasks are used to assess processes or procedural knowledge or to probe for students’ understanding of knowledge and information. Constructed response tasks are often contrasted with selected response items or tasks (Carr & Harris, 2001).

66

Page 67: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Item Types: Key Terms

Performance Assessments: Product- and behavior-based measurements based on settings designed to emulate real-life contexts or conditions in which specific knowledge or skills are actually applied (Joint Standards, 1999).

67

Page 68: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Item Types: Key Terms

Key Components of CR Items:●Task: A specific item, problem, question, prompt, or assignment●Response: Any kind of performance to be evaluated, including short/extended answer, essay, presentation, & demonstration.●Rubric: Scoring criteria used to evaluate responses●Scorers: People who evaluate responses (ETS, 2005)

68

Page 69: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Item Types: Key Terms

Task-Specific Rubric: A set of scoring guidelines specific to a particular task. The criteria are addressed and described in terms of specific content or capacities that can be demonstrated in terms of particular, identified content relevant to the task (Carr & Harris, 2001).

69

Page 70: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Item Types: Key Terms

High-Inference Constructed Response Items: Format requires expert judgment about the trait being observed and rubrics are used to evaluate. Many abstract qualities are evaluated this way, e.g. writing ability, organization, style, word choice, and math complex problem solving (e.g. proofs, quadratic equations) (Haladyna, 1999).

70

Page 71: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Item Types: Key Terms

Low Inference Constructed Response Items: Format simply involves observation because there is some behavior or answer in mind that is either present or absent (e.g. scaffolded questions, writing conventions, measurements) (Haladyna, 1999).

71

Page 72: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Item Types: High & Low-Inference CR

Attribute High-Inference Low-Inference

Type of Behavior Measured

Usually Abstract, most valued in education

Usually concrete

Ease of Construction

Design of items is complex, involving command or question, conditions for performance, and rubrics

Design of items is not as involved as high inference, involving command or question, conditions for performance, and a simple mechanism

Cost of Scoring Involves training; expensive

Scoring not as complex but costs can still be high

72

High Low

Page 73: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Item Types: High & Low-Inference CR

Attribute High-Inference Low-Inference

Reliability Reliability can be a problem due to inter-rater reliability

Results can be very reliable due to concrete nature of items

Objectivity Can be subjective More objective

Bias: Systemic Error

Possible threats to validity: over or underrating

Seldom yields biased observations

73

(Haladyna, 1999)

Page 74: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Choosing An Item Format(MC v. CR)

● Most efficient and reliable way to measure knowledge is with MC format.

● Most direct way to measure skill is via performance, but many mental skills can be tested via MC with a high degree of proximity (statistical relation between CR and MC items of an isolated skill). If the skill is critical to ultimate interpretation, CR is preferable to MC.

74

Page 75: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Choosing An Item Format(MC v. CR)

● When measuring a fluid ability or intelligence, the complexity of such human traits favors CR item formats of complex nature (high-inference) (Haladyna, 1999).

75

Page 76: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Choosing An Item Format: Conclusions about Criterion Measurement

Criterion Conclusion About MC & CR

Declarative Knowledge Most MC formats provide the same information as an essay, short answer, or completion formats.

Critical Thinking MC formats involving vignettes or scenarios provide a good basis for forms of critical thinking. MC format has good fidelity to the more realistic open-ended behavior elicited by CR.

76

(Haladyna, 1999)

Page 77: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Choosing An Item Format: Conclusions about Criterion Measurement

Criterion Conclusion About MC & CR

Problem Solving Ability MC item sets provide a good basis for testing problem solving. However, more research is still needed.

Creative Thinking Ability MC format limited.

School Abilities (e.g. writing, reading, & mathematics)

Performance has the highest fidelity to criterion for these school abilities. MC is good for measuring foundational aspects of fluid abilities, such as declarative knowledge or knowledge of skills.

77

(Haladyna, 1999)

Page 78: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Test Validation: Data Collection & Analysis

● Answer Sheet Design & Scanning Procedures (Work w/BOCES & RIC)

● Utilize Item Banking Software● Depth of data collection: student demographics;

scores; and item level data, if possible● Evaluate/Revise● Generate trend data ● Review against other data to verify & audit rigor● Document & Save Analyses

78

Page 79: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Test Validation: Data Collection & Analysis

● Parallel NYSTP Test Reporting Approaches● Report General Performance on Test; recommendation, report

%s of students at Levels 1-4 (place 3 cuts into the instrument).● Run data for all student groups and then run Title I

disaggregation: Race/Special Populations (Special Education & ELLs).

● If you have multiple Need Resources Capacity districts/schools, run analysis by category.

● Further analysis: disaggregate by building and class/instructor if possible.

● For shared exams, try to run a comprehensive analysis of all participating CSDs for general trends and Title I disaggregation (allows for possible benchmarking).

79

Page 80: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Test Validation: Data Collection & Analysis

Psychometric Analysis: See Classical Test Analysis in SED ELA/Math 3-8 Technical Manuals. Use baseline measures to inform your analysis.Analysis Includes 4 Primary Elements: 1)Item level statistical information (item response patterns, item difficulty, and item discrimination); 2)Test level data (raw score statistics-mean & SD) & test reliability measures (Cronbach’s Alpha & Feldt-Raju Coefficient); 3)Test speededness (omit rates); & 4)Bias through DIF (differential item functioning).

Use p-values to complete Item Maps for use with administrative and instructional staff.

80

Page 81: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Sources

American Educational Research Association, American Psychological Association, and National Council on Measurement in Education (1999). Standards for Educational and Psychological Testing.

Washington, D.C.: American Educational Research Association.Baily, Kim. & Jakicic, C. (2012). Common Formative Assessment. Bloomington, IN: Solution Tree Press.

81

Page 82: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Sources

Gong, Brian & Hill, Richard. (2001). Some Considerations of Multiple Measures in Assessment and School Accountability. CCSSO & US

DOE Accountability Conference March 23-24, Washington D.C.: 2001. Haladyna, Thomas M. (1999). Developing and Validating Multiple-Choice Test Items. Mahwah, NJ: Lawrence Erlbaum Associates, Publishers.

82

Page 83: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Sources

Henderson-Montero, Dianne, Julian, Marc. C., & Yen, Wendy M. (2003). Multiple Perspectives and Multiple Measures: Alternative Design and Analysis Models. Educational Measurement: Issues and Practice. 22. 2, 7-12.Heritage, Margaret, Kim, J., Vendlinski, T. & Herman, J.

From Evidence to Action: A Seamless Process in Formative Assessment? Educational

Measurement: Issues and Practice, 28.3, 24-31.

83

Page 84: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Sources

Heritage, Margaret. (2011). Formative Assessment: An Enabler of Learning. Retrieved from www.cse.ucla.edu.

Herman, Joan L. and Choi, Kilchan (2012). Validation of ELA and Mathematics Assessments: A General Approach. Retrieved from www.cse.ucla.edu/products/states_schools/ValidationELA_FINA L.pdf

84

Page 85: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

Sources

Nichols, Paul, Meyers, J. & Burling, K (2009). A Framework for Evaluating and Planning Assessments Intended to Improve Student Achievement. Educational Measurement: Issues and Practice, 28.3, 14-23.

Perie, Marianne, Marion, S., & Gong, B. (2009). Moving Toward a Comprehensive Assessment System: A Framework for Considering Interim Assessments. Educational Measurement: Issues and Practice, 28.3, 5-13.

85

Page 86: 1 st  Annual Data Summit: Comprehensive Assessment Systems & Formative Assessment

86

David [email protected]