Speaking Points
  Current paper-and-pencil-based assessments
  Image scoring
  Computer administration
  Computer scoring
Typical Current Paper-and-Pencil Based Statewide Assessment
  3 grades
  Reading, writing, math, science, social studies
  30 multiple-choice (MC) and 6 open-ended (OE) questions in each of four areas, one essay for writing
  50,000 students per grade
Materials Processed
  150,000 28-page test booklets
    2 million sheets of paper
    10 tons of paper, a stack 700 feet high
  150,000 20-page answer documents
    1.5 million sheets of special paper
    7.5 tons
    600 boxes to store (per year)
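The sheet counts above can be reproduced with a short calculation. This is a minimal sketch, assuming each sheet carries two printed pages (front and back) and that 150,000 is 3 grades times 50,000 students; the function and variable names are illustrative and not part of the original slides.

```python
STUDENTS_PER_GRADE = 50_000
GRADES = 3
PAGES_PER_SHEET = 2  # assumption: every sheet is printed on both sides


def sheets_needed(documents: int, pages_per_document: int) -> int:
    """Number of physical sheets for a batch of identical documents."""
    return documents * pages_per_document // PAGES_PER_SHEET


students = STUDENTS_PER_GRADE * GRADES
test_booklet_sheets = sheets_needed(students, 28)     # 28-page test booklets
answer_document_sheets = sheets_needed(students, 20)  # 20-page answer documents

print(students)
print(test_booklet_sheets)
print(answer_document_sheets)
```

Under that assumption the test booklets come to 2.1 million sheets, which the slide rounds to 2 million, and the answer documents to 1.5 million sheets.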
Process
  Materials shipped to schools
  Materials shipped back to contractor
  Materials logged in
    Count everything, resolve discrepancies
    Note that one misplaced school can stop entire process
Process for Receiving Materials
  Separate answer booklets from test booklets
  Test booklets placed in temporary storage in original boxes, then destroyed after reporting complete
  Answer sheets guillotined
  MC answer sheets scanned
  OE sheets packaged by scoring
Processing of OE Sheets
  Separate by content area
  Sorted by form, randomized across schools
  Scanned to capture ID numbers
  Scoring headers prepared, then merged with answer sheets
Scoring
  Hire, train, qualify
  Score
  Ongoing evaluation of quality of scoring
  Determine papers that need adjudication, then rescore as necessary
  Scan scoring headers
  Merge MC, OE and writing scores
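The step of determining which papers need adjudication can be sketched as a simple rule. The slides do not state the rule used, so the example below assumes a common convention: a paper scored twice goes to adjudication when the two scores differ by more than one point. The threshold, function names, and paper IDs are all hypothetical.

```python
def needs_adjudication(first_score: int, second_score: int, max_gap: int = 1) -> bool:
    """True when two independent scores disagree by more than max_gap points.

    The max_gap=1 default is an assumed convention, not taken from the slides.
    """
    return abs(first_score - second_score) > max_gap


def flag_for_adjudication(double_scored, max_gap: int = 1):
    """Return the IDs of papers whose two scores need a third reading."""
    return [
        paper_id
        for paper_id, first, second in double_scored
        if needs_adjudication(first, second, max_gap)
    ]


# Hypothetical double-scored papers: (paper ID, first score, second score)
papers = [("P001", 3, 3), ("P002", 2, 4), ("P003", 4, 3)]
flagged = flag_for_adjudication(papers)
print(flagged)
```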
Scoring Time
  20 seconds per OE question
  5 minutes per essay (2 scorings plus adjudication, if necessary)
  13 minutes per student
    32,500 hours
    1000 person-weeks, plus training, qualifying, quality control and equating
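The scoring-time figures can be checked with a short calculation. This sketch assumes the 6 OE questions apply to each of the four non-writing areas (24 per student) and that 150,000 students are tested (3 grades of 50,000); the variable names are illustrative.

```python
# Figures taken from the slides
STUDENTS = 3 * 50_000   # 3 grades, 50,000 students per grade
OE_QUESTIONS = 6 * 4    # assumption: 6 OE questions in each of four areas
SECONDS_PER_OE = 20
MINUTES_PER_ESSAY = 5   # covers 2 scorings plus adjudication, if necessary
PERSON_WEEKS = 1000

# Scoring time for one student, in minutes
minutes_per_student = OE_QUESTIONS * SECONDS_PER_OE / 60 + MINUTES_PER_ESSAY

# Total scoring time across all students, in hours
total_hours = STUDENTS * minutes_per_student / 60

# Scoring hours implied for each person-week
hours_per_person_week = total_hours / PERSON_WEEKS

print(minutes_per_student)
print(total_hours)
print(hours_per_person_week)
```

Under these assumptions the totals match the slide: 13 minutes per student and 32,500 hours overall, so 1000 person-weeks corresponds to 32.5 scoring hours per person per week.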
Count, Count, Count
  Initial log-in counts
  After packaging
  Every time a box is opened or closed
  Count boxes, too
Final Steps
  Ship reports back to schools
  Resolve problems
    Missing or misplaced students
    Challenges to scoring (requires finding answer sheets, perhaps all for one student)
  Destroy test materials
  Long-term storage for answer documents
Solution #1: Image Scoring
  High-speed scanners capture images of documents
  All processing is done on CRTs by looking at an electronic image of the original paper
Advantages
  Control
    Scoring
      Blind read-behinds
      Real-time tracking of accuracy of every scorer
      Multiple sites
    Equating
      Blind rescores from previous year
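Real-time tracking of scorer accuracy from blind read-behinds can be illustrated with a per-scorer agreement rate. This is only a sketch: it assumes accuracy means exact agreement between the scorer's score and the read-behind score, and the record layout, function name, and sample data are hypothetical.

```python
from collections import defaultdict


def agreement_rates(read_behinds):
    """Exact-agreement rate per scorer.

    read_behinds: iterable of (scorer_id, scorer_score, read_behind_score).
    Treating exact agreement as "accuracy" is an assumption of this sketch.
    """
    matches = defaultdict(int)
    totals = defaultdict(int)
    for scorer_id, scorer_score, read_behind_score in read_behinds:
        totals[scorer_id] += 1
        if scorer_score == read_behind_score:
            matches[scorer_id] += 1
    return {scorer_id: matches[scorer_id] / totals[scorer_id] for scorer_id in totals}


# Hypothetical read-behind records
sample = [("A", 3, 3), ("A", 2, 3), ("B", 4, 4), ("B", 1, 1)]
rates = agreement_rates(sample)
print(rates)
```

Because every response is an electronic image, a check like this could run continuously as scores arrive rather than after paper packets are returned.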
Advantages (cont’d)
  Scoring speed
    Next response is ready to be scored when first is done
    Scoring stops when rates decline
    No fumbling for papers
    Up to 1/3 faster
Advantages (cont’d)
  Tracking
    No need for counting
    Nothing is lost
    Nothing is damaged
    Records automatically linked
    Special-request papers easy to obtain
      Prep for next year’s scoring
      Challenged papers
      Adjudication
Disadvantages
  Hardware and software costs
    Costs have dropped dramatically ($150,000 server two years ago now selling for $16,000)
  Need to prove that scoring is the same
    Writing vs. OE
  Connectivity
  Power outages
Computer Administered Tests
  Web-based vs. CD
  Comparability
  Standards, especially writing
  Students that write on paper and then just type in
  Full use of computer capabilities
  Underestimation of (some) students’ abilities
Georgia’s Proposed System
  Huge item bank, three levels
  Teachers can create tests
  Capacity concerns for Level III tests
Advantages
  Elimination of paper
  Accommodations
  Adaptive testing
    Shorter tests
    Diagnostic tests
    Lower frustration levels
  Real-time scoring
Issues
  Administration time
    All schools have some computers, but how many?
  Transition
    Recommendation is to test all schools the same way
    Comparability
    Logistics of operating two programs at same time
Computer Scoring
  Major vendors
    NCME Session N1, April 12, 2001
    ETS Technologies: E-rater (Princeton, NJ)
    Vantage Learning: Intellimetric (Yardley, PA)
    TruJudge: Project Essay Grade (PEG) (Purdue)
    Knowledge Analysis Technologies: Intelligent Essay Assessor (Boulder, CO)
Issues
  Accuracy rates
    PA study: computers vs. humans
      Computer more accurate than one human
      Computer less accurate than two humans
    Bias vs. random error
  Beating the system (“Stakes changes everything”)
  Capacity of contractors to deliver logistics