ACCESS for ELLs® Scores, Reliability and Validity Developed by the Center for Applied Linguistics Prepared by Dorry Kenyon, CAL ISBE Meeting, Chicago, IL February 21, 2007


ACCESS for ELLs® Scores, Reliability and Validity

Developed by the Center for Applied Linguistics

Prepared by Dorry Kenyon, CAL

ISBE Meeting, Chicago, IL

February 21, 2007


ISBE Presentation 2/21/2007 © 2007 WIDA/CAL

Outline of my presentation

1. What do scores on ACCESS for ELLs® mean?

2. What do we know about the reliability of ACCESS for ELLs® scores?

3. What do we know about the validity of ACCESS for ELLs® scores?

4. So what does this mean for using scores on ACCESS for ELLs®?

1. What do scores on ACCESS for ELLs® mean?

Two types of scores

WIDA ACCESS for ELLs® Scale Scores = psychometrically-derived measure

WIDA ACCESS for ELLs® Proficiency Level Scores = socially-derived interpretation of the scale score in terms of the WIDA Standards’ Proficiency Level Definitions

What is measured?

Scale Scores (and interpretive Proficiency Level Scores) are given for measures in the four domains
  Listening
  Speaking
  Reading
  Writing

Scale Scores are combined into four composite scores (which are also interpreted in Proficiency Level Scores)
  Oral (listening and speaking)
  Literacy (reading and writing)
  Comprehension (listening and reading)
  Overall Composite (listening, speaking, reading, and writing)

Weighting of the overall composite

Scale Scores of the four domains are weighted differently in the overall composite score
  Listening (15%)
  Speaking (15%)
  Reading (35%)
  Writing (35%)
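
The weighting above amounts to a weighted average of the four domain scale scores. The following is a minimal Python sketch of that arithmetic. The function name, the example scores, and the rounding to a whole scale score are assumptions made for illustration; the slides do not state how the operational composite is rounded.

```python
# Domain weights for the overall composite, as listed on the slide.
WEIGHTS = {"listening": 0.15, "speaking": 0.15, "reading": 0.35, "writing": 0.35}


def overall_composite(scale_scores):
    """Return the weighted average of the four domain scale scores.

    Rounding to a whole number is an illustrative assumption.
    """
    total = sum(WEIGHTS[domain] * scale_scores[domain] for domain in WEIGHTS)
    return round(total)


# Hypothetical student, not real ACCESS data.
example = {"listening": 340, "speaking": 360, "reading": 300, "writing": 320}
composite = overall_composite(example)
print(composite)
```

Because reading and writing together carry 70% of the weight, the composite for this hypothetical student sits much closer to the literacy scores than to the oral scores.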

ACCESS administration times and composite score weights

Listening (15%): 20-25 minutes, machine scored

Reading (35%): 35-40 minutes, machine scored

Writing (35%): Up to 1 hour, rater scored

Speaking (15%): Up to 15 minutes, administrator scored

[Chart] Test Times (Minutes): Listening, 25; Reading, 40; Writing, 60; Speaking, 15

[Chart] Test Weights (Percent): Listening, 15%; Reading, 35%; Writing, 35%; Speaking, 15%

Scale Scores vs. Proficiency Level Scores

The WIDA ACCESS for ELLs® Scale Scores are the psychometrically derived measures of student proficiency
  Range from 100 to 600
  One scale applies to all grades through vertical equating of tests
  Vertical scale score takes into account that assessment tasks taken by students in the grade 9-12 cluster are more challenging than the assessment tasks taken by students in the grade 1-2 cluster
  Average scale scores consistently show an increase from grade to grade

2005-2006 Overall Composite Scale Scores

[Chart: Average Overall Composite Scale Score by Grade. x-axis: Grade (1-12); y-axis: Scale Score (250-400)]

Scale Scores vs. Proficiency Level Scores

Proficiency Level Scores are socially-derived interpretations of the WIDA ACCESS for ELLs® Scale Scores in terms of the six proficiency levels defined in the WIDA Standards
  Composed of two numbers, e.g. 2.5
    First number indicates the proficiency level into which the student’s scale score places him or her (e.g. 2 = Beginning)
    Second number indicates how far, in tenths, the student’s scale score places him or her between the lower and the higher cut score of the proficiency level (e.g. 2.5 = 5/10 or ½ of the way between the cut score for level 2 and for level 3)
  The same scale score is interpreted differently based on what grade level cluster different students are in
  The same proficiency level score corresponds to different scale scores based on the grade level cluster

Example: Scale score of 350

                    Cut
Grades  Domain      1/2   2/3   3/4   4/5   5/6
1-2     Overall     259   285   313   332   354
3-5     Overall     292   325   350   370   394
6-8     Overall     319   347   374   393   410
9-12    Overall     347   373   396   412   429

[On the slide, a marker labeled 350 shows where that scale score falls in each row]
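
The two-number interpretation described earlier can be sketched in Python using the overall composite cut scores from the table above. This is an illustration, not the operational scoring rule. It assumes that a scale score exactly on a cut belongs to the higher level, that tenths are truncated rather than rounded, and that any score at or above the 5/6 cut is reported as 6.0. The function name is hypothetical, and scores below the 1/2 cut are left out because the table gives no lower bound for level 1.

```python
# Overall composite cut scores from the table above: the scale scores at
# which levels 2, 3, 4, 5 and 6 begin in each grade level cluster.
CUTS = {
    "1-2": [259, 285, 313, 332, 354],
    "3-5": [292, 325, 350, 370, 394],
    "6-8": [319, 347, 374, 393, 410],
    "9-12": [347, 373, 396, 412, 429],
}


def proficiency_level_score(scale_score, cluster):
    """Interpret an integer scale score as level.tenths for a grade cluster."""
    cuts = CUTS[cluster]
    if scale_score < cuts[0]:
        # Level 1 would need a lower bound that the table does not give.
        raise ValueError("score is below the 1/2 cut; not covered by this sketch")
    if scale_score >= cuts[-1]:
        return 6.0  # assumption: everything at or above the 5/6 cut is 6.0
    # Index of the highest cut at or below the scale score.
    idx = max(i for i, cut in enumerate(cuts) if scale_score >= cut)
    lower, upper = cuts[idx], cuts[idx + 1]
    # Whole tenths of the way from the lower cut to the upper cut (truncated).
    tenths = (scale_score - lower) * 10 // (upper - lower)
    return round((idx + 2) + tenths / 10, 1)


for cluster in CUTS:
    print(cluster, proficiency_level_score(350, cluster))
```

Under these assumptions, a scale score of 350 is interpreted as 5.8 in grades 1-2, 4.0 in grades 3-5, 3.1 in grades 6-8 and 2.1 in grades 9-12, which is the point of the example: the same scale score means different things in different grade level clusters.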

Example: Overall composite proficiency level score 6.0

[Figure: scale score line from 100 to 600 (ticks at 100, 225, 350, 475, 600), running from easy items / less proficient students to hard items / more proficient students, marking the scale score that corresponds to 6.0 in each grade level cluster]
  Grades 1-2: 354
  Grades 3-5: 394
  Grades 6-8: 410
  Grades 9-12: 429

How are proficiency level scores derived?

While Proficiency Level Scores are socially-derived interpretations, they are not arbitrary
  Set by panels of content experts
  Set following best technical practices
  Set by consensus building procedures (standard setting studies)
  Set by carefully documented replicable procedures

For WIDA ACCESS for ELLs®, these were set by panels of experts in April of 2004, for each grade level cluster (see WIDA Technical Report #1 for complete details)

Originally WIDA had grade level cluster cuts

[Chart: Overall Composite (Current Cuts). x-axis: Grade (0-12); y-axis: Scale Score (220-450); lines: current 1/2, 2/3, 3/4, 4/5 and 5/6 cuts, separating proficiency levels 1 through 6]

Grade level cuts are being introduced this year

[Chart: Overall Composite (Proposed Smoothed Cuts). x-axis: Grade (0-12); y-axis: Scale Score (220-450); lines: proposed 1/2, 2/3, 3/4, 4/5 and 5/6 cuts, separating proficiency levels 1 through 6]

Cluster vs. grade level cuts

[Chart: Overall Composite (Proposed vs. Current Cuts). x-axis: Grade (0-12); y-axis: Scale Score (220-450); lines: current and proposed 1/2, 2/3, 3/4, 4/5 and 5/6 cuts]

2005-2006 Overall Composite Scale Scores

[Chart: Average Overall Composite Scale Score by Grade. x-axis: Grade (1-12); y-axis: Scale Score (250-400)]

Effect of grade level cut scores

[Chart: Proficiency Level Score of Average Overall Composite Scale Score by Grade. x-axis: Grade (1-12); y-axis: Proficiency Level Score (1.0-6.0); series: Current Cluster Cuts, Proposed Grade Level Cuts]

2. What do we know about the reliability of ACCESS for ELLs® scores?

What is reliability?

Psychometrically speaking, reliability refers to the consistency of test scores.

What evidence is there that this test score is not just a chance occurrence, but would have been obtained had the student been tested on multiple occasions or had the test been scored on multiple occasions?

Multiple forms of ACCESS for ELLs®

In the Annual Technical Report, the reliability of each of the 44 separate test forms for ACCESS for ELLs® is reported.

Cluster  Listening  Reading  Writing  Speaking  Total
K            1         1        1        1        4
1-2          3         3        3        1       10
3-5          3         3        3        1       10
6-8          3         3        3        1       10
9-12         3         3        3        1       10
Total       13        13       13        5       44

Types of reliability reported

For all test forms, internal consistency (coefficient alpha) is reported.

For writing, agreement between operational raters is also reported (20%)

For speaking, agreement between administrators from field test data is also given currently, but a larger study is underway

Reliabilities for domain scores based on the individual forms for Series 100 (2004-2005) are within expected and acceptable ranges
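
Coefficient alpha, the internal consistency index named above, can be computed from a students-by-items score matrix as alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The sketch below is a plain Python illustration of that formula. The function name and the small response matrix are made up for the example and are not ACCESS data.

```python
from statistics import pvariance


def coefficient_alpha(item_scores):
    """Cronbach's coefficient alpha.

    item_scores: one list per student, holding that student's score on
    each of the k items (all lists the same length).
    """
    k = len(item_scores[0])
    # Variance of each item across students.
    item_variances = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    # Variance of the students' total scores.
    total_variance = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical right/wrong responses of five students to four items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = coefficient_alpha(responses)
print(round(alpha, 3))
```

The same ratio results whether population or sample variances are used, as long as the choice is consistent for the items and the total.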

Reliability of the overall composite

Results indicate that the reliability of the overall composite score across tiers is similar and very high across all grade level clusters (Series 100).

Grade cluster  Reliability
K              .930
1-2            .949
3-5            .941
6-8            .933
9-12           .936

The most important reliability index

For tests like ACCESS for ELLs®, where decisions are based on a student’s classification into proficiency levels, the accuracy of classification is perhaps the most important reliability index.

This index estimates how reliably a student was placed at or above a certain category (versus below that category).

Accuracy of classification indices (Series 100)

       Grade Cluster
Cut    K      1-2    3-5    6-8    9-12
1/2    .925   .974   .977   .968   .951
2/3    .949   .943   .940   .936   .921
3/4    na     .928   .917   .912   .924
4/5    na     .943   .940   .945   .954
5/6    na     .975   .972   .976   .977

3. What do we know about the validity of ACCESS for ELLs® scores?

What is validity?

Validity refers to an evaluative judgment of the degree to which theoretical rationales and empirical evidence support the adequacy and appropriateness of inferences and actions made on the basis of test scores.

Validity issues for ACCESS for ELLs®

Issues related to ACCESS for ELLs® include
  Do the described proficiency levels exist?
  How does the test relate to other measures of English language proficiency?
  How confident are we in the cut scores that place students into the various levels, that they really define the levels?
  Do we know that ACCESS for ELLs® tests the language needed for academic success and is not a content test?
  And so on…

Study 1: Do the levels of the Standards really exist?

Reading and Listening Selected Response Type Items

  SI = Social and Instructional Language
  LA = language of Language Arts
  MA = language of Math
  SC = language of Science
  SS = language of Social Studies

The Standards guide test development

1. ACCESS for ELLs® makes the WIDA Standards operational

2. WIDA Standards provide
   a. Content (What?)
   b. Performance Levels (How well?)

Large-scale Standards: SC reading

Large-scale Standards: SC reading

Classify living organisms (such as birds and mammals) by using pictures or icons

Large-scale Standards: SC reading

Interpret data presented in text and tables in scientific studies

At the given level of English language proficiency, English language learners will process, understand, produce, or use:

5 - Bridging
  the technical language of the content areas;
  a variety of sentence lengths of varying linguistic complexity in extended oral or written discourse, including stories, essays, or reports;
  oral or written language approaching comparability to that of English proficient peers when presented with grade level material

4 - Expanding
  specific and some technical language of the content areas;
  a variety of sentence lengths of varying linguistic complexity in oral discourse or multiple, related paragraphs;
  oral or written language with minimal phonological, syntactic, or semantic errors that do not impede the overall meaning of the communication when presented with oral or written connected discourse with occasional visual and graphic support

3 - Developing
  general and some specific language of the content areas;
  expanded sentences in oral interaction or written paragraphs;
  oral or written language with phonological, syntactic, or semantic errors that may impede the communication but retain much of its meaning when presented with oral or written, narrative or expository descriptions with occasional visual and graphic support

2 - Beginning
  general language related to the content areas;
  phrases or short sentences;
  oral or written language with phonological, syntactic, or semantic errors that often impede the meaning of the communication when presented with one to multiple-step commands, directions, questions, or a series of statements with visual and graphic support

1 - Entering
  pictorial or graphic representation of the language of the content areas;
  words, phrases, or chunks of language when presented with one-step commands, directions, WH-questions, or statements with visual and graphic support

[Slide callouts]
  5: technical language of the content areas
  2: general language of the content areas
  1: pictorial or graphic representation of the language of the content areas

Validation issues

Validity is about the adequacy and appropriateness of inferences about students made on the basis of test scores.

The WIDA Standards make claims about what students at five different proficiency levels can do.

Can those claims be substantiated empirically?

Research study questions

1. Are the ACCESS for ELLs™ items empirically ordered by difficulty as predicted by the WIDA Standards?

2. Does that ordering differ by domain (listening or reading)?

3. Does that ordering differ by standard (SI, LA, MA, SC, SS)?

Data

Results from ACCESS for ELLs™ field test

Fall 2004

Over 6500 students grades 1 to 12

8 WIDA states

About 3.5% proportional representation

Method

Items were vertically scaled across grade levels using common item equating

Item difficulty was determined using the Rasch measurement model

Items that did not meet the requirements of the model were eliminated from the analysis

Average item difficulties were calculated by proficiency level
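
For readers unfamiliar with the Rasch model named above: it expresses the probability of a correct response as a logistic function of the difference between student ability and item difficulty, both on the same logit scale. The sketch below shows that function together with the last step of the method, averaging item difficulties by proficiency level. It is an illustration only. The function names and item values are hypothetical, and the slides do not describe the estimation software or how logits were converted to the reporting scale, so everything here stays in logits.

```python
import math
from collections import defaultdict


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the Rasch model (logit scale)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def mean_difficulty_by_level(items):
    """Average item difficulty for each targeted proficiency level.

    items: iterable of (proficiency_level, difficulty) pairs.
    """
    grouped = defaultdict(list)
    for level, difficulty in items:
        grouped[level].append(difficulty)
    return {level: sum(vals) / len(vals) for level, vals in sorted(grouped.items())}


# Hypothetical calibrated items: (targeted level, difficulty in logits).
items = [(1, -2.0), (1, -1.0), (2, -0.5), (2, 0.5), (3, 1.0), (3, 2.0)]
averages = mean_difficulty_by_level(items)
print(averages)
```

If the Standards predict difficulty correctly, the averages should rise from level to level, which is the pattern the study set out to check.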

Number of items used = 651

                      WIDA Proficiency Level
Domain     Standard     1     2     3     4     5   Total
Listening  LA           6    14    19    15    11      65
           MA           9    18    27    27    14      95
           SC           5    12    16    15    10      58
           SI          13    17    22    11     5      68
           SS           7    13    19    12     6      57
           Total       40    74   103    80    46     343
Reading    LA           8    13    13    15    13      62
           MA           5    11    17    13     8      54
           SC           5    12    22    20    11      70
           SI           9    18    24    11     4      66
           SS           8    12    20     9     7      56
           Total       35    66    96    68    43     308

Results

[Chart: Listening and Reading Combined, All Grade Level Clusters. x-axis: WIDA Proficiency Level (1-5); y-axis: Item Difficulty (300-600)]

[Chart: All Grade Level Clusters (by Domain). x-axis: WIDA Proficiency Level (1-5); y-axis: Item Difficulty (300-600); series by Domain: Listening, Reading]

[Chart: Listening and Reading Combined, All Grade Level Clusters (by Standard). x-axis: WIDA Proficiency Level (1-5); y-axis: Item Difficulty (300-600); series by Standard: LA, MA, SC, SI, SS]

Conclusions

1. Are the ACCESS for ELLs™ items empirically ordered by difficulty as predicted by the WIDA Standards?

Yes. WIDA Standards (MPIs) provided sufficient content and rationale to develop specifications that operationalized the five proficiency levels through listening and reading selected response items.

2. Does that ordering differ by domain (listening or reading)?

No. The general ordering was similar across listening and reading. Some difference between listening level 5 and reading level 5 was observed.

3. Does that ordering differ by standard (SI, LA, MA, SC, SS)?

Yes. SI (social and instructional language) items showed a clear tendency to be easier than items assessing language in the content areas, particularly at higher proficiency levels.

Items assessing language in the content areas were similar except at level 5 where language arts appeared easier than expected.

Discussion

1. While many additional validation issues remain, this preliminary empirical analysis based on the field test data indicates that the WIDA Standards provide a strong basis for distinguishing among proficiency levels of ELLs.

Discussion

2. The operational plan for ongoing WIDA assessment item renewal and development provides opportunity to tighten item specifications based on empirical research while operationalizing the WIDA Standards.

Process of test development

1. Theory and Research

2. Standards

3. Specifications

4. Assessment

Study 2: Validation evidence from the bridge study

What can we learn about ACCESS for ELLs™ from the WIDA Consortium’s bridge study?
  Study 1: What is the relationship between performances on the older English language proficiency tests and on ACCESS for ELLs™?
  Study 2: What is the relationship between the “cut score” denoting the highest level of proficiency on the older tests and the predicted corresponding score on ACCESS for ELLs™ in terms of ACCESS proficiency levels?

Purpose of the bridge study

To help WIDA Consortium member states understand the performances of their ELLs in acquiring English on the older tests (for which they had data) in terms of the new test, especially to:
  meet compliance with Title III requirements
  provide continuity of data flow for cohorts of English language learners identified in 2002-03, the baseline year
  provide information that may help determine Annual Measurable Achievement Objectives (AMAOs) for the established cohorts in the transitional year

The older tests

IDEA Proficiency Test (IPT)

Language Assessment Scales (LAS)

Language Proficiency Test Series (LPTS)

Maculaitis II (MAC II)

NOTE: The first three tests do NOT have separate scores for listening and speaking!

WIDA levels of English Language Proficiency

  1 ENTERING
  2 BEGINNING
  3 DEVELOPING
  4 EXPANDING
  5 BRIDGING

[The figure marks the point 4.5 on this scale]

Participants

4,985 students from IL and RI

GRADE     IPT    LAS   LPTS  MAC II  TOTAL
K         102     81    246      47    476
1         109    184    216      95    604
2         143    137    246      76    602
3         102     80    290      63    535
4          82     57    146      74    359
5         104     32    216      57    409
6         116     55    142      97    410
7         111    110     58     110    389
8         106     62     48     142    358
9          28     12    150     134    324
10         37     17    120     106    280
11         30      2     92      79    203
12          9      2     31      43     85
Total   1,079    831  1,952   1,123  4,985

Procedures

2005 operational ACCESS administration (AL, ME, VT)

Participating students in IL and RI were administered the older test and operational ACCESS within a 6-8 week window

Older tests were scored within local districts following their standard procedures, and the scores were submitted to the ACCESS scoring vendor

Scoring of ACCESS was with Spring 2005 operational scoring

Data matched by ACCESS scoring vendor

Older test data cleaned at CAL

Analyses at CAL

Analyses: Study 1

Pearson correlations between performances on each form of older test (raw or scale score) and ACCESS for ELLs™ scale scores

Because each form for the older tests was unique, 64 correlational analyses were performed

IPT (14)

LAS (14)

LPTS (16)

MAC II (20)

Summarized by averaging
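
Each of the 64 analyses listed above (14 + 14 + 16 + 20) is a Pearson correlation between paired scores on one older test form and the matching ACCESS scale scores. The sketch below shows the standard formula in plain Python. The function name and the score pairs are made up for illustration and are not study data.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired scores: older-test raw score, ACCESS scale score.
older_raw = [12, 18, 25, 31, 40]
access_scale = [295, 310, 305, 340, 352]
r = pearson_r(older_raw, access_scale)
print(round(r, 3))
```

A value near 1 means students are ranked almost identically by the two tests, while a value near 0 means the two sets of scores are largely unrelated.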

Results: Study 1 example (IPT Reading)

IPT Reading Score with ACCESS Reading Scale Score

Pearson correlation of IPT Read Raw Score with ACCESS Read Scale Score

IPT Form (Read)  Pearson Correlation    N
IPT_EL           .741**               205
IPT_R_1AB        .540**               250
IPT_R_2AB        .618**               296
IPT_R_3AB        .713**               317

Results: Study 1 summary range

Average Correlations (All Levels of Each Test within Domain)

Test     Listening  Speaking  Reading  Writing
IPT      0.601      0.625     0.653    0.631
LAS      0.503      0.570     0.591    0.525
LPTS     0.603      0.651     0.741    0.675
MAC II   0.433      0.453     0.593    0.509

Results: Study 1 summary by test across domains

Average Correlations (All Levels of Each Test within Domain)

Test     List    Speak   Read    Write
IPT      0.601   0.625   0.653   0.631
LAS      0.503   0.570   0.591   0.525
LPTS     0.603   0.651   0.741   0.675
MAC II   0.433   0.453   0.593   0.509

Results: Study 1 summary by domain across tests

Average Correlations (All Levels of Each Test within Domain)

Test     List    Speak   Read    Write
IPT      0.601   0.625   0.653   0.631
LAS      0.503   0.570   0.591   0.525
LPTS     0.603   0.651   0.741   0.675
MAC II   0.433   0.453   0.593   0.509

Discussion: Study 1

Generally moderate to high correlations between ACCESS for ELLs® and older tests; ACCESS appears to be assessing a similar construct (criterion-related validity) but is not interchangeable with the older tests

Correlations across all tests with reading were highest; most familiar to students and test developers?

Correlations across all tests with listening were lowest; but three tests did not have separate scores for listening and speaking!

Correlations across domains between LPTS and ACCESS for ELLs® were highest; LPTS the newest of the ‘older generation’
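
The three comparisons in this discussion can be checked directly against the Study 1 summary table. The sketch below averages the reported correlations by test and by domain; the unweighted means are an assumption made for illustration, and the variable names are not from the study.

```python
from statistics import mean

DOMAINS = ["List", "Speak", "Read", "Write"]

# Average correlations from the Study 1 summary table.
AVG_R = {
    "IPT":    [0.601, 0.625, 0.653, 0.631],
    "LAS":    [0.503, 0.570, 0.591, 0.525],
    "LPTS":   [0.603, 0.651, 0.741, 0.675],
    "MAC II": [0.433, 0.453, 0.593, 0.509],
}

# Row means: one value per older test, averaged across the four domains.
by_test = {test: mean(rs) for test, rs in AVG_R.items()}

# Column means: one value per domain, averaged across the four tests.
by_domain = {
    domain: mean([rs[i] for rs in AVG_R.values()])
    for i, domain in enumerate(DOMAINS)
}

highest_domain = max(by_domain, key=by_domain.get)
lowest_domain = min(by_domain, key=by_domain.get)
highest_test = max(by_test, key=by_test.get)

print(highest_domain, lowest_domain, highest_test)
```

Under this simple averaging, reading comes out highest, listening lowest, and LPTS highest among the tests, which agrees with the three observations above.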

Analyses: Study 2

From predicted scores tables, found for each grade level the ACCESS for ELLs® proficiency level score corresponding to the “cut score” of the highest proficiency level on the older test

Summarized findings by calculating averages and standard deviations
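
The summary step can be sketched as below. The input values are made up for illustration, `summarize` is a hypothetical helper name, and using the sample standard deviation (rather than the population one) is an assumption, since the slides do not say which was used. Rounding follows the way the summary slides report values (one decimal for the mean, two for the standard deviation).

```python
from statistics import mean, stdev


def summarize(levels):
    """Return (mean, sample standard deviation) of proficiency level scores."""
    return round(mean(levels), 1), round(stdev(levels), 2)


# Hypothetical proficiency level scores, one per grade (made-up values).
example_levels = [3.0, 4.0, 5.0]
avg, sd = summarize(example_levels)
print(f"{avg} ({sd:.2f})")
```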

Predicted scores table example

Predicted ACCESS = 270.1 + 1.738 * LAS

LAS RW 2AB Writing Raw Score to WIDA ACCESS Writing Scale Score

LAS RW 2AB   LAS Proficiency   Predicted ACCESS   ACCESS Proficiency   ACCESS Proficiency
Raw Score    Level             Score              Level                Level
(Writing)    (grades 4,5,6)    (Writing)          (grades 4,5)         (grade 6)
0            1                 270                1.9                  1.8
1            1                 272                1.9                  1.8
2            1                 274                1.9                  1.8
3            1                 275                1.9                  1.8
4            1                 277                1.9                  1.9
5            1                 279                1.9                  1.9
…            …                 …                  …                    …
27           1                 317                2.9                  2.4
28           1                 319                3.0                  2.4
29           1                 321                3.1                  2.5
30           1                 322                3.1                  2.5
31           1                 324                3.1                  2.5
32           1                 326                3.1                  2.5
33           2                 327                3.1                  2.5
…            …                 …                  …                    …
55           3                 366                4.5                  3.7
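
The predicted scores in this table follow from the regression equation shown on the slide. A minimal sketch, assuming the predicted value is rounded to the nearest whole scale score (the rounding rule is inferred from the table, not stated on the slide); `predicted_access` is a hypothetical helper name.

```python
# Coefficients from the slide: Predicted ACCESS = 270.1 + 1.738 * LAS
INTERCEPT = 270.1
SLOPE = 1.738


def predicted_access(las_raw):
    """Predicted ACCESS Writing scale score for a LAS RW 2AB Writing raw score."""
    return round(INTERCEPT + SLOPE * las_raw)


# Rows visible on the slide: raw score -> predicted ACCESS score.
SLIDE_ROWS = {0: 270, 1: 272, 2: 274, 3: 275, 4: 277, 5: 279,
              27: 317, 28: 319, 29: 321, 30: 322, 31: 324,
              32: 326, 33: 327, 55: 366}

for raw, shown in SLIDE_ROWS.items():
    print(raw, predicted_access(raw), shown)
```

Each computed value matches the predicted score shown for that raw score.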

Finding the WIDA proficiency level score example

Predicted ACCESS = 270.1 + 1.738 * LAS

LAS RW 2AB Writing Raw Score to WIDA ACCESS Writing Scale Score

LAS RW 2AB   LAS Proficiency   Predicted ACCESS   ACCESS Proficiency   ACCESS Proficiency
Raw Score    Level             Score              Level                Level
(Writing)    (grades 4,5,6)    (Writing)          (grades 4,5)         (grade 6)
…            …                 …                  …                    …
42           2                 343                3.8                  3.2
43           2                 345                3.9                  3.2
44           3                 347                3.9                  3.3
45           3                 348                3.9                  3.3
46           3                 350                4.0                  3.3
…            …                 …                  …                    …
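
The lookup illustrated here can be sketched as a search for the lowest raw score at which the older test's proficiency level reaches the level of interest, followed by reading off the ACCESS proficiency level in the same row. The rows are those visible on the slide; the function name and the choice of level 3 as the target are assumptions made for illustration.

```python
# (LAS raw score, LAS level, predicted ACCESS score,
#  ACCESS proficiency level for grades 4-5, ACCESS proficiency level for grade 6)
ROWS = [
    (42, 2, 343, 3.8, 3.2),
    (43, 2, 345, 3.9, 3.2),
    (44, 3, 347, 3.9, 3.3),
    (45, 3, 348, 3.9, 3.3),
    (46, 3, 350, 4.0, 3.3),
]


def row_at_cut(rows, target_level):
    """Return the row with the lowest raw score whose older-test level reaches target_level."""
    for row in sorted(rows):
        if row[1] >= target_level:
            return row
    return None  # the target level is never reached in these rows


cut_row = row_at_cut(ROWS, target_level=3)  # target level is an assumption
print(cut_row)
```

With these rows, the cut falls at raw score 44, which corresponds to an ACCESS proficiency level score of 3.9 for grades 4 and 5 and 3.3 for grade 6.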

Truncated example results: Listening

Listening

Test     K     1     2     3     …    11    12
IPT      6.0   6.0   6.0   4.3   …    4.8   4.8
LAS      4.7   6.0   6.0   4.7   …    4.4   4.4
LPTS     3.1   3.4   3.8   3.7   …    3.0   3.0
MAC II   3.3   3.3   5.2   3.3   …    2.9   2.9

Results: Study 2 summary range

Average Proficiency Level Score (Standard Deviation)

Test     List         Speak        Read         Write
IPT      4.9 (0.80)   4.0 (0.36)   3.9 (0.97)   2.9 (0.64)
LAS      4.8 (0.67)   5.1 (0.81)   3.1 (1.11)   3.1 (0.67)
LPTS     3.5 (0.53)   2.9 (0.79)   5.3 (0.71)   3.9 (0.74)
MAC II   3.7 (0.78)   3.5 (0.74)   3.5 (0.76)   3.0 (0.40)

Interpretation: Highest test and domain

[Figure: WIDA proficiency level scale (1 Entering, 2 Beginning, 3 Developing, 4 Expanding, 5 Bridging) with LPTS Reading marked.]

Interpretation: Lowest test and domain

[Figure: WIDA proficiency level scale (1 Entering, 2 Beginning, 3 Developing, 4 Expanding, 5 Bridging) with the labels LPTS Reading and IPT Writing.]

Results: Study 2 High and low by test across domains

Average Proficiency Level Score (Standard Deviation)

Test     List         Speak        Read         Write
IPT      4.9 (0.80)   4.0 (0.36)   3.9 (0.97)   2.9 (0.64)
LAS      4.8 (0.67)   5.1 (0.81)   3.1 (1.11)   3.1 (0.67)
LPTS     3.5 (0.53)   2.9 (0.79)   5.3 (0.71)   3.9 (0.74)
MAC II   3.7 (0.78)   3.5 (0.74)   3.5 (0.76)   3.0 (0.40)

Results: Study 2 High and low by domain across tests

Average Proficiency Level Score (Standard Deviation)

Test     List         Speak        Read         Write
IPT      4.9 (0.80)   4.0 (0.36)   3.9 (0.97)   2.9 (0.64)
LAS      4.8 (0.67)   5.1 (0.81)   3.1 (1.11)   3.1 (0.67)
LPTS     3.5 (0.53)   2.9 (0.79)   5.3 (0.71)   3.9 (0.74)
MAC II   3.7 (0.78)   3.5 (0.74)   3.5 (0.76)   3.0 (0.40)

Discussion: Study 2 (1 of 3)

Results varied widely, from a close relationship to the WIDA proficiency span (LPTS Reading) to much lower, though in general “cut scores” on older tests tended to be much lower than the WIDA 6.0

Were ELLs exited too early under the older tests?

Do ACCESS for ELLs® standards and performance level definitions better align with levels of English proficiency needed for academic success?

With a single test across districts within a state, states will have clearer data to better understand the development of English proficiency in ELLs and its relationship to academic achievement

Discussion: Study 2 (2 of 3)

Results varied widely across tests and domains

LPTS, with the highest “cut scores” in reading and writing, had the lowest “cut scores” in listening and speaking

But three tests did not have separate scores for listening and speaking, including LPTS!

LPTS had only “fluent”/“non-fluent” listening and speaking categories?

Discussion: Study 2 (3 of 3)

Across tests, writing had the lowest “cut scores” for three of four tests; is writing on ACCESS for ELLs® unduly hard, or is it more indicative of what is needed for academic success?

Important considerations in interpretations

CONTENT differences between all five tests include:

Degree of alignment with English language proficiency and academic content standards

Number and types of items in each subsection or language domain

Depth of knowledge of the items

Inclusion of the language of math, science, and social studies

Ceiling levels of the measures

Rubrics used for interpreting speaking and writing

METHODOLOGICAL caveats include:

Use of linear regression across all analyses

Sometimes small numbers of students in subgroups

Distribution of observed scores (Spring testing)

Preliminary conclusions

Correlational data show strong support for ACCESS for ELLs® as a measure of English proficiency (criterion-related validity)

Comparison of “cut scores” indicates that the WIDA Standards, as operationalized by ACCESS for ELLs®, describe a longer proficiency continuum than the older tests

Additional studies are needed to explore the relationship between that extended continuum and academic achievement

Validity evidence from the grade level cut score review study

75 teachers from 14 WIDA states

Examined test items and (for writing and speaking) examinee performances in light of the WIDA Standards’ model Performance Indicators and performance level descriptors

Through a structured process, came up with proposed grade level cut scores (starting from empirically derived proposals based on the current cluster level cut scores)

As in the original standard setting study, evaluated the confidence they had in the cut scores representing the different performance levels

Results: Confidence increased greatly over first study

Evaluations from grade level cut score review

Averages across all participants

How confident are you in the cut scores? (4 = high, 1 = low)

Red = below 3.10 / Black = 3.11 to 3.40 / Green = above 3.40

Cut    Read          Write         List          Speak
       Orig   Rev    Orig   Rev    Orig   Rev    Orig   Rev
1/2    3.08   3.41   3.39   3.46   3.22   3.51   3.24   3.46
2/3    2.83   3.47   3.28   3.43   3.15   3.55   3.01   3.39
3/4    2.98   3.48   3.33   3.36   3.17   3.57   2.89   3.37
4/5    3.05   3.54   3.33   3.35   3.19   3.53   2.84   3.37
5/6    3.01   3.52   3.33   3.41   3.18   3.60   2.97   3.56
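
Because the slide's color coding is lost in this text version, the sketch below re-applies the legend to the ratings and compares original with revised values. The handling of a rating of exactly 3.10, which the legend leaves unspecified, is an assumption (no rating in the table takes that value), and all names are illustrative.

```python
from statistics import mean

# (original, revised) mean confidence ratings for cuts 1/2 through 5/6.
CONFIDENCE = {
    "Read":  [(3.08, 3.41), (2.83, 3.47), (2.98, 3.48), (3.05, 3.54), (3.01, 3.52)],
    "Write": [(3.39, 3.46), (3.28, 3.43), (3.33, 3.36), (3.33, 3.35), (3.33, 3.41)],
    "List":  [(3.22, 3.51), (3.15, 3.55), (3.17, 3.57), (3.19, 3.53), (3.18, 3.60)],
    "Speak": [(3.24, 3.46), (3.01, 3.39), (2.89, 3.37), (2.84, 3.37), (2.97, 3.56)],
}


def color(rating):
    """Map a rating to the slide's legend color."""
    if rating > 3.40:
        return "green"
    if rating >= 3.10:  # assumption: exactly 3.10 is treated as black
        return "black"
    return "red"


pairs = [p for domain_pairs in CONFIDENCE.values() for p in domain_pairs]
all_increased = all(rev > orig for orig, rev in pairs)
orig_red = sum(color(orig) == "red" for orig, _ in pairs)
rev_red = sum(color(rev) == "red" for _, rev in pairs)
mean_gain = {d: mean([rev - orig for orig, rev in ps]) for d, ps in CONFIDENCE.items()}

print(all_increased, orig_red, rev_red)
print({d: round(g, 2) for d, g in mean_gain.items()})
```

Every revised rating is higher than its original, and no revised rating falls in the red band.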

Other validity studies underway at CAL

Some ongoing internal research at CAL

(1) What do we learn from the results of the technical analyses of Series 100 to improve item and form specifications?

(2) How do we improve the construction of items appropriate (both from content and empirical results) to their targeted proficiency levels?

(3) What evidence do we have that ACCESS for ELLs tests the language of the content areas and not knowledge of the content areas?

#1 Example from Series 100 analyses

[Figure 8.3.1D. Test Information Function: List 3-5 ABC. Horizontal axis: Ability Measure (-8 to 7); vertical axis: Information (0 to 6).]

#1 Example from Series 100 analyses

[Figure 8.3.2D. Test Information Function: Read 3-5 ABC. Horizontal axis: Ability Measure (-8 to 7); vertical axis: Information (0 to 7).]

#2 Example 3-5 Read Prof Level 2

5.  R_2526_SIp2g35_PlantSale       2   302   1
1.  R_2999_SIp2g35_FamilyNight     2   272   1
3.  R_2675_SIp2g35_Artwork         2   327   2
9.  R_2871_LAp2g35_AngelaPepper    2   321   2
10. R_2172_LAp2g35_KangarooDream   2   329   2
8.  R_2870_LAp2g35_AngelaPepper    2   335   3
9.  R_2171_LAp2g35_KangarooDream   2   365   5

#2 Example 3-5 Read Prof Level 5

13. R_2535_LAp5g35_AthleteBio        5   271   1
3.  R_2541_SIp5g35_PlaygroundRules   5   332   3
12. R_2534_LAp5g35_AthleteBio        5   373   5

Interaction of Performance Level Descriptions and model Performance Indicators

[Figure: Language Proficiency (Performance Level Descriptions: 1 Entering, 2 Beginning, 3 Developing, 4 Expanding, 5 Bridging) shown against PIs at levels L1 to L5, with the criteria Linguistic Complexity, Vocabulary Usage, and Language Control.]

#3 Confirmatory Factor Analyses (SEM)

[Figure: path diagram. Node labels: RSS, RSI, RLA, RMA, RSC; LSI, LSS, LLA, LMA, LSC; ListScore, ReadScore; L-prof, R-prof, Engprof; SS, SC, MA, LA, SI.]

Other research (and possibilities)

1. Native speaker studies (Alabama data)

2. Relationship between performance on ACCESS for ELLs and state content tests (?)

Logistic regression with state data?

[Figure: sketch of a logistic curve. Horizontal axis: ACCESS Scale Score (low to high); vertical axis: Probability (lo% to hi%, with 80% marked). Individual cases are plotted as X marks in a “Yes” row and a “No” row.]
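
The idea sketched on this slide can be illustrated as follows. This is a hedged example only: the slide poses logistic regression as a possibility, so the coefficients `B0` and `B1` are made up, the Yes/No outcome is assumed to be something like meeting a standard on a state content test, and the function names are hypothetical. Given a fitted model, the scale score at which the predicted probability of "Yes" reaches 80% follows from inverting the logistic function.

```python
from math import exp, log


def pass_probability(scale_score, b0, b1):
    """Logistic model: probability of a "Yes" outcome at a given scale score."""
    return 1.0 / (1.0 + exp(-(b0 + b1 * scale_score)))


def score_for_probability(p, b0, b1):
    """Scale score at which the model predicts probability p (inverse of the above)."""
    return (log(p / (1.0 - p)) - b0) / b1


# Hypothetical coefficients, chosen only so the curve is centered at 350.
B0, B1 = -17.5, 0.05

score_at_80 = score_for_probability(0.80, B0, B1)
print(f"scale score at 80% probability: {score_at_80:.1f}")
```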

4. So what does this mean for using scores on ACCESS for ELLs®?

Be sure to understand the meaning of scale scores and proficiency level scores

Have confidence using scores knowing that

the reliability (consistency) of the scale scores is high, in particular for the overall composite score

the accuracy of classification based on the overall composite is also high

initial validity studies strongly support the use of ACCESS for ELLs® test scores as a valid indicator of levels of proficiency in accordance with the WIDA Standards

the WIDA Consortium supports a rigorous program of ongoing test improvement, supported by research

the WIDA Consortium continues to collect evidence in support of the validity of the use of test scores

For more information, please contact the WIDA Hotline: 1-866-276-7735 or www.wida.us/helpform

World Class Instructional Design and Assessment, www.wida.us

Center for Applied Linguistics, www.cal.org

Metritech, Inc., www.metritech.com