
CRTIEC Summit 2012 v3crtiec/rti_summit/documents/McConnell... · test: ! Oral Language/ Comprehension Questionnaire and PPVT-IV, r = .418 ! Phonological Awareness/ Early Literacy


9/20/12


IGDIs and Beyond: Measurement and Decision-Making for Language and Literacy RTI Efforts in Early Childhood Education

Scott McConnell, Alisha Wackerle-Hollman and Tracy Bradfield

Disclosure

¨  Scott McConnell and colleagues developed Individual Growth and Development Indicators; intellectual property from this research has been licensed to Early Learning Labs, Inc., for commercial development and sale. Scott and the University of Minnesota have royalty and equity interest in Early Learning Labs, Inc. These relationships have been reviewed and managed by the University of Minnesota in accordance with its conflict of interest policies.

Today’s Session

¨  “Measuring Up” – the logic of General Outcome Measurement and contemporary models

¨  IGDIs 2.0 – Building Items and Scales
¨  IGDIs 2.0 and RTI – A Decision-Making Framework


Logic of General Outcome Measurement and Contemporary Models of RTI, Measurement

Measuring Up

[Figure labels: General Outcome; Standardized and Sound; Sensitive to time, treatment; Repeatable; Fast, Easy; Time.]

GOM Metrics

¨  Status
    ¤ What is the level of performance for a child (or group) today?
    ¤ How does this child (or group) compare to an a priori standard?
¨  Change
    ¤ Has performance changed from last time?
¨  Growth
    ¤ What is the rate of change?
    ¤ Is the child “on track” toward a long-term desired goal?
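The three metrics can be illustrated with a small sketch. It assumes repeated scores for one child, uses the most recent score as status, the difference between the last two scores as change, and an ordinary least-squares slope as the rate of growth. The function name, the weekly time unit, and the sample numbers are hypothetical and are not taken from the IGDI materials.

```python
from statistics import mean


def gom_metrics(weeks, scores, standard):
    """Summarize repeated scores for one child as status, change, and growth."""
    if len(weeks) != len(scores) or len(scores) < 2:
        raise ValueError("need at least two paired observations")

    # Status: the most recent score, compared with an a priori standard.
    status = scores[-1]
    meets_standard = status >= standard

    # Change: difference between the two most recent scores.
    change = scores[-1] - scores[-2]

    # Growth: ordinary least-squares slope of score on time.
    w_bar, s_bar = mean(weeks), mean(scores)
    sxx = sum((w - w_bar) ** 2 for w in weeks)
    sxy = sum((w - w_bar) * (s - s_bar) for w, s in zip(weeks, scores))
    growth_per_week = sxy / sxx

    return {
        "status": status,
        "meets_standard": meets_standard,
        "change": change,
        "growth_per_week": growth_per_week,
    }


# Hypothetical scores collected at weeks 0, 3, 6, and 9.
result = gom_metrics(weeks=[0, 3, 6, 9], scores=[10, 12, 15, 17], standard=16)
print(result)
```

A least-squares slope is only one way to express rate of change; other growth estimates could be substituted without changing the status and change calculations.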


Application to Early Literacy

¨  Four CRTIEC domains
    ¤ Oral Language
    ¤ Phonological Awareness
    ¤ Alphabet Knowledge and Concepts about Print
    ¤ Comprehension
¨  Assumption of concurrent or heterotypic development

[Figure labels: Outside-In Skills; Inside-Out Skills; Sound Units; Print Units; Language Units; Contextual Units (Narrative); Semantic Units (Concepts).]

GOM of Early Literacy

¨  Assessment of interrelated domains
¨  Assessment of domains over time
¨  Assessment for different functions
    ¤ Description of language and early literacy development (“status”)
    ¤ Evaluation of timeliness of development (“growth”)
    ¤ Identification of need for additional intervention
    ¤ Monitoring effects of intervention

Item Response Theory

¨  Assumes an “ability” that is invariant in characteristics across individuals and time
¨  Assumes that items and individuals can be located on this ability
    ¤ Thus, items and individuals vary across ability – an implicit “absolute” scale
¨  Assumes that item and test statistics are invariant across samples
¨  Provides precise way to build, use tests
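As an illustration of locating items and individuals on one ability scale, the sketch below uses the Rasch (one-parameter logistic) model. This slide does not name a specific IRT model; Rasch is assumed here because later slides refer to Rasch output. The function names and the sample values are hypothetical.

```python
import math


def rasch_probability(theta, b):
    """Probability of a correct response under the Rasch model.

    theta: person ability, b: item difficulty, both on the same logit scale.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def item_information(theta, b):
    """Rasch item information at ability theta: p * (1 - p)."""
    p = rasch_probability(theta, b)
    return p * (1.0 - p)


# A child whose ability equals the item difficulty has a 50% chance of success,
# and that is also where the item is most informative.
p_matched = rasch_probability(theta=0.0, b=0.0)
p_easier = rasch_probability(theta=1.0, b=0.0)
info_matched = item_information(theta=0.0, b=0.0)
print(p_matched, round(p_easier, 3), info_matched)
```

Because person and item parameters share one scale, the same function applies whether the question is "how hard is this item for this child" or "which items are informative near a given ability".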


IRT and GOMs

¨  Core, underlying metaphors are similar, and key assumptions are compatible
    ¤ IRT: Trajectory that is invariant across individuals and populations, with items and individuals located on it
    ¤ GOM: Growth toward long-term, common outcome, with variations at individual (and nested) level in status, rate of growth
    ¤ Both efforts – locate individual at single point in time and repeated measures on trajectory (and estimate rate of growth)

Advantages/Assets from IRT

¨  Increased precision in item and scale construction
    ¤ More analytic tools, and more analytic colleagues
    ¤ Item-level analyses for reliability, item information function
    ¤ Greater facility for adding, evaluating items and constructing scales
¨  Expanding item pools
¨  Increasing knowledge of methodological and logistical requirements for design, testing, refinement, implementation

Can We Go from an IRT-Based Scale to General Outcome Measurement?


Progress on an IRT Scale

[Figure: a single Ability axis running from Low to High.]

Progress on a GOM Scale

[Figure: Ability (Low to High) plotted against Time.]

What’s needed to “make the move?”

¨  Construct validation – Relation to long-term outcome(s) of interest
¨  Item pool
¨  Growth scaling – change as a function of time
¨  Evaluation
    ¤ Judgments
    ¤ Norms
    ¤ Benchmarks
    ¤ Empirical analyses


Constructing Measures for Specific Purposes

IGDI 2.0 Items and Scales

Measurement Framework

¨  Wilson, 2005

Early Literacy Construct

Defining the Construct of Early Literacy

Phonological Awareness
Construct definition: The ability to detect and manipulate the sound structure of words independent of their meanings (Phillips, Clancy-Menchetti, Lonigan, 2008), which develops along a continuum of complexity from identification to synthesis to analysis.
Measures of identification level: Rhyming and First Sounds

Oral Language
Construct definition: The ability to use words to communicate ideas and thoughts and to use language as a tool to communicate to others (Dunst, Trivette, Masiello, Roper, & Robyak, 2008; Morgan & Meier, 2008).
•  Expressive language: the use of words to express meaning.
•  Receptive language: the ability to listen, process, and understand the meaning of spoken words.
Measure of Expressive Language: Picture Naming


Early Literacy Construct

Defining the Construct of Early Literacy

Alphabet Knowledge
Construct definition: Knowledge about the names and sounds of the 26 letters of the alphabet (McBride-Chang, 1999)
Measure of Aural identification: Sound Identification

Comprehension
Construct definition:
Text Comprehension: Text comprehension is the ability to understand and interpret text as a whole (Storch & Whitehurst, 2002); it includes the “recognition of pictures and symbols in books and the ability to interpret and infer meaning from what is seen” (Dunst, Trivette, Masiello, Roper, & Robyak, 2006, p. 4).
Listening Comprehension: Listening comprehension is the ability to understand and interpret spoken phonemes, words, phrases, sentences, narratives, and stories (Dickinson & Smith, 1994; Skarakis-Doyle, Dempsey, & Lee, 2008).
Measures of Comprehension: Which One Doesn’t Belong

Item Level Revisions

¨  Cleaning Items
¨  Item level functions
    ¤ Rasch Output Values
        ▪ How is each item contributing to the test?
        ▪ Item/total correlations
        ▪ In-Fit statistics
        ▪ Standard Error of the item
    ¤ Construct Irrelevant Features (CIF)
        ▪ What characteristics of each item provide information to the student?
        ▪ What information distracts the student from the intended content?
        ▪ What features of the items are malleable?
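One of the item-level statistics listed above, the item/total correlation, can be sketched in a few lines. This version correlates each item with the total of the remaining items (a "corrected" item-total correlation); whether the IGDI analyses excluded the item from the total is not stated in the slides, so that choice is an assumption, as are the function names and the toy response matrix.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when either variable has no variance
    return sxy / (sxx * syy) ** 0.5


def item_total_correlations(responses):
    """responses: one row per child, each row a list of 0/1 item scores."""
    n_items = len(responses[0])
    correlations = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        # Total score on all other items, so the item is not correlated with itself.
        rest = [sum(row) - row[j] for row in responses]
        correlations.append(pearson_r(item, rest))
    return correlations


# Hypothetical responses from four children on three items.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
r_values = item_total_correlations(responses)
print([round(r, 3) for r in r_values])
```

Items with low or negative values are the ones to inspect for construct irrelevant features.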

Example: Poorly Functioning Item – Def Vocab

Construct Irrelevant Features
    ¤ Not a real elephant
    ¤ Elephants are big but this one is actually small.


Example: Revision of Item

Example: Poorly Functioning Rhyming Item

Construct Irrelevant Features
    ¤ Some are real images, others are not
    ¤ Some are enclosed, others are not
    ¤ Word content, color content and image clarity all might contribute to a response.

Example: Revised Item

“Bat, Cat” “Bat, Doll”


New Item Development

¨  Expert contributions
¨  Online database for semantic set size (Nelson, McEvoy & Schreiber, 1998)
¨  Age of Acquisition word lists used in previous studies (Carroll & White, 1973; Garlock, 1997; Snodgrass and Yuditsky, 1996)
¨  Phonotactic Probability online calculator (Storkel & Hoover, 2010)
¨  Concreteness, Familiarity, and Imageability ratings online database (Wilson, 1987)

Current item pools

¨  5 measures with over 150 items per measure.

¨  Items appropriately match distributions of students.

¨  All items have been tested with over 100 students, have item/total correlations between .2 and .8, and have in-fit statistics with an absolute value less than 2.
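The retention criteria for the item pool can be expressed as a simple filter. The sketch below is an illustration only: the `ItemStats` record, its field names, and the sample values are hypothetical, and the thresholds are the ones stated on the slide.

```python
from dataclasses import dataclass


@dataclass
class ItemStats:
    item_id: str
    n_students: int      # number of students who have taken the item
    item_total_r: float  # item/total correlation
    infit: float         # in-fit statistic from the Rasch output


def meets_pool_criteria(item: ItemStats) -> bool:
    """Apply the three stated criteria for keeping an item in the pool."""
    tested_enough = item.n_students > 100
    discriminates = 0.2 <= item.item_total_r <= 0.8
    fits_model = abs(item.infit) < 2
    return tested_enough and discriminates and fits_model


# Hypothetical candidate items.
candidates = [
    ItemStats("item_a", 140, 0.45, 0.7),
    ItemStats("item_b", 140, 0.12, 0.3),   # item/total correlation too low
    ItemStats("item_c", 95, 0.50, 1.1),    # not yet tested with enough students
    ItemStats("item_d", 120, 0.60, -2.4),  # in-fit outside the accepted range
]
retained = [item.item_id for item in candidates if meets_pool_criteria(item)]
print(retained)
```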

Assessment Purposes

¨  Screening/ Identification
    ¤ To identify, with increased certainty, children requiring Tier 2 or Tier 3 services in one or more domains.
¨  Progress Monitoring
    ¤ To assess whether individual children are growing in the targeted skill area, specifically and generally.
    ¤ To determine whether individual children continue to require high intensity intervention and when it is appropriate to transition children to different levels of intensity (tiers).


Identification

¨  Tri-annual/seasonal assessments
¨  Criterion-referenced assessment based on contrast-groups design cut score location between Tier 1 and Tier 2/3
    ¤ Decision Making Framework will add predictive power such that Tier 2 and Tier 3 will be able to be differentiated.
¨  Criterion performance is based on over 2000 children represented nationally.
¨  Information is provided to describe performance as pass/fail (go/no-go).

[Figure: First Sounds]

[Figure: Rhyming]


[Figure: Picture Naming]

[Figure: Which One Doesn’t Belong]

[Figure: Sound Identification]


Progress Monitoring

¨  Performance based assessment that examines each child’s ability level based on a Rasch profile.
¨  Assessments are delivered every 3 weeks to examine changes in ability score as a result of intervention or instruction.
¨  Growth is examined in the context of previous performance, but also in reference to the criterion standard for Tier level performance.

Designing Progress Monitoring IGDIs

¨  Sensitive to change
¨  Opportunity for growth within Tiers
¨  Reliable and Valid
¨  Tailored to each child’s unique needs

[Figure: First Sounds]


[Figure: Rhyming]

[Figure: Picture Naming]

[Figure: Which One Doesn’t Belong]


[Figure: Sound Identification]

A Multiple Gating Model of Decision Making

IGDI 2.0 Decision Making Framework

CRTIEC’s Decision Making Framework

¨  Basic principles
¨  Rationale for multiple gating
¨  Current framework
¨  Evidence to date
¨  Coming research


DMF: Principles

¨  Principle #1: According to the Standards for Educational and Psychological Testing (AERA, APA, NCME, 1999), instructional decisions should never be made with only one source of data.

¨  It is important to have multiple sources of data to support instructional decision making.

DMF: Principles & Purpose

¨  A child’s raw score on each IGDI measure is interpretable in terms of its relation to an identified cut score (range) which distinguishes between tier one candidates and tier two/ three candidates.

Establishing Cut Scores

¨  Cut scores (ranges) were established through a standard setting process.
    ¤ These standards consisted of operational definitions of child performance that would be typical of students with needs at each of the respective tier levels, for each domain.
¨  Teachers were given these tier level descriptors and ranked students as good candidates for Tier 1, Tier 2 or Tier 3.


Setting the cut scores and ranges

¨  A combination of Rasch output, ROC analysis, regression analysis and contrasting groups design methods was used to identify the Rasch value that best distinguished between Tier 1 and Tier 2/3 ability.

¨  Cut scores maximized fit between IGDI scores and teacher judgment about ability level.

Picture Naming

How do IGDIs alone function in identification of tier candidacy?

¨  Sensitivity and Specificity of IGDIs alone with provisional cut scores, using teacher judgment as criterion

Measure                      Sensitivity   Specificity
Sound Identification         .75           .87
Alliteration                 .85           .77
Picture Naming               .76           .67
Rhyming                      .71           .70
Which One Doesn’t Belong?    .70           .46
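For readers who want to see how figures like these are derived, the sketch below computes sensitivity and specificity for one cut score, treating teacher judgment as the criterion. It assumes that scores below the cut flag a child as a Tier 2/3 candidate. The function name and the sample data are hypothetical and do not reproduce the study data.

```python
def sensitivity_specificity(scores, needs_support, cut):
    """Return (sensitivity, specificity) for a single cut score.

    scores: IGDI scores, one per child.
    needs_support: criterion judgments (True = Tier 2/3 candidate).
    cut: children scoring below this value are flagged by the measure.
    """
    tp = fn = tn = fp = 0
    for score, truth in zip(scores, needs_support):
        flagged = score < cut
        if truth and flagged:
            tp += 1
        elif truth and not flagged:
            fn += 1
        elif not truth and flagged:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("criterion must include both groups")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical scores and teacher judgments for eight children.
scores = [2, 4, 5, 7, 8, 9, 11, 12]
needs_support = [True, True, True, False, True, False, False, False]
sens, spec = sensitivity_specificity(scores, needs_support, cut=8)
print(sens, spec)
```

Running the same function over a range of candidate cuts gives the points of an ROC curve, which is one of the methods the slides list for locating cut scores.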


Cut Scores and need for DMF

¨ Currently, we have not identified IGDI cut scores/ ranges that distinguish between tier two and tier three candidates.

¨ A Decision Making Framework (DMF) is needed to support IGDI score interpretation to increase their use in supporting instructional decision making.

Multiple Gating Model: Rationale

¨  CRTIEC has adopted a multiple gating model of decision making.
    ¤ Successive “narrowing of the playing field”
    ¤ Maximizes efficient use of resources.
¨  Model uses teacher judgments gathered using a questionnaire at gates B and C.
    ¤ Recent studies have found teacher ratings act as significant predictors of at-risk status (Speece & Ritchey, 2005; Speece et al., 2010).

Multiple Gating Model

Gate A: Fall IGDI Identification Set Administration
    Result: Score Above Cut on IGDI. Action: Tier 1 instruction
    Result: Score Within Cut Range on IGDI. Action: Move to Gate B
    Result: Score Below Cut Range on IGDI. Action: Move to Gate C
Gate B: Teacher fills out Gate B “Disconfirming T1” Questionnaire
    Result: No problems indicated. Action: Tier 1 instruction
    Result: Problems indicated. Action: Move to Gate C
Gate C: Teacher fills out Gate C “T2 vs T3” questionnaire
    Result: No problems. Action: Tier 2 instruction
    Result: Problems indicated. Action: Tier 3 instruction
Gate D: Teacher fills out Gate D questionnaire (regarding behavioral concerns)
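The gating logic in the flowchart can be read as a short decision function. The sketch below is one interpretation under stated assumptions: the cut range is represented by two hypothetical bounds (`cut_low`, `cut_high`), each questionnaire is reduced to a yes/no "problems indicated" flag, and "problems indicated" at Gate C is taken to mean Tier 3. Gate D is left out because the flowchart does not say how behavioral concerns change the tier.

```python
def assign_tier(igdi_score, cut_low, cut_high,
                gate_b_problems=None, gate_c_problems=None):
    """Walk one child through Gates A to C and return a tier label.

    cut_low and cut_high bound the cut range (hypothetical representation).
    gate_b_problems and gate_c_problems are True/False questionnaire outcomes,
    or None when that questionnaire has not been completed.
    """
    # Gate A: IGDI score relative to the cut range.
    if igdi_score > cut_high:
        return "Tier 1"

    if igdi_score >= cut_low:
        # Within the cut range: Gate B, the "Disconfirming T1" questionnaire.
        if gate_b_problems is None:
            raise ValueError("Gate B questionnaire needed")
        if not gate_b_problems:
            return "Tier 1"

    # Below the cut range, or problems indicated at Gate B: Gate C ("T2 vs T3").
    if gate_c_problems is None:
        raise ValueError("Gate C questionnaire needed")
    return "Tier 3" if gate_c_problems else "Tier 2"


# Hypothetical cut range of 10 to 15.
print(assign_tier(20, 10, 15))
print(assign_tier(12, 10, 15, gate_b_problems=False))
print(assign_tier(12, 10, 15, gate_b_problems=True, gate_c_problems=False))
print(assign_tier(5, 10, 15, gate_c_problems=True))
```

Only children who are not cleared at an earlier gate require a later questionnaire, which is the "narrowing of the playing field" the rationale describes.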


Oral Language/ Comprehension Teacher Questionnaire

Evidence to date:

¨  Last year, a study was conducted in 30 classrooms across KS, OH, and MN in which Identification IGDIs were administered and corresponding teacher questionnaire data was collected to support use of the DMF (n=303).

What proportion of children are identified for T2/T3…

¨  …with IGDIs alone

Percent of Total Sample Identified As
Measure          Tier One   Move to Gate 2 (IGDI in cut range)   Move to Gate 3 (IGDI < cut range)
Picture Naming   9          43.8                                 47.3
Rhyming          41.4       53.9                                 4.6
Alliteration     32.2       44.8                                 23.1

¨  …with teacher ratings added into DMF?

Tier   Percent
1      52%
2      24%
3      24%


Promising Evidence

¨  Moderate correlation between score on Teacher Questionnaire and standardized test:
    ¤ Oral Language/ Comprehension Questionnaire and PPVT-IV, r = .418
    ¤ Phonological Awareness/ Early Literacy Questionnaire and TOPEL-PA, r = .333
¨  Significant mean difference on PPVT (t=3.75**) when comparing DMF identified tier 2 and tier 3 candidate performance.

Revision of Teacher Questionnaire

¨  Using results from last year’s study, we have revised both teacher questionnaires:
    ¤ To more closely align scales with decisions needed at each gate of the DMF.
        ▪ Used CRTIEC panel of experts to support this process.
    ¤ To increase the number of items (to increase scale/ score reliability).

OL/ Comprehension Pilot Study

¨  We piloted the revised Oral Language/ Comprehension Teacher Questionnaire with 40 teachers in the metro Twin Cities area (n = 83).
    ¤ Purpose: item and scale analysis
¨  After examining Coefficient Alpha, inter-item and item-total correlations for each scale, we made modifications at the item level resulting in the following internal consistency estimates:
    ¤ Gate B = .977
    ¤ Gate C = .961
    ¤ Gate D = .963
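Coefficient alpha, the internal consistency estimate reported above, can be computed from a respondents-by-items matrix of ratings. The sketch below uses the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores); the function name and the small ratings matrix are hypothetical and unrelated to the pilot data.

```python
from statistics import variance


def coefficient_alpha(ratings):
    """Cronbach's coefficient alpha.

    ratings: one row per completed questionnaire, one column per item.
    """
    n_items = len(ratings[0])
    if n_items < 2 or len(ratings) < 2:
        raise ValueError("need at least two items and two respondents")

    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in ratings]) for j in range(n_items)]
    # Sample variance of the total scale score.
    total_var = variance([sum(row) for row in ratings])
    if total_var == 0:
        raise ValueError("total scores have no variance")

    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: four questionnaires, three items each.
ratings = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
]
alpha = coefficient_alpha(ratings)
print(round(alpha, 3))
```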


Pilot Study

¨  Results of OL/ Comp study also supported identification of cut scores for each scale, to support tier classification.

¨  PA/ Early Literacy questionnaire is being subjected to same pilot test.
    ¤ Just finished item level data entry, analysis to be completed soon.

Current Efforts

¨  Decision Making Validation Study
    ¤ 5 school districts in metro Twin Cities, KS, OR, OH.
    ¤ OL/ Comp measures in Fall
    ¤ PA/ Early Literacy measures in Winter
¨  Identification IGDIs administered, teacher questionnaire completed, tier assignments given.
    ¤ Standardized criterion test given to all tier 2 and 3 identified children plus random sample of tier one identified children.

Current Efforts: RQs

¨  For each domain (Oral Language/ Comprehension or Phonological Awareness/ Early Literacy), what is the relation between score on the teacher questionnaire and score on the standardized criterion measure?

¨  For each domain, what is the classification accuracy of the DMF when the standardized measure is used as the criterion of need?

¨  For each domain, does the mean standardized criterion test score differ significantly across tier assignment groups (tier 1, tier 2, tier 3)?

¨  For each domain, which variables or combination of variables capture the most variance in predicting language and literacy status?