Experiences with Rubrics:
Grading the AP Computer Science Exam
Jim Huggins
Computer Science Department
The Advanced Placement Program
37 courses in 22 subject areas
– since 1955
– 2007: 2.5M exams, 1.5M students
Each exam given a final grade of 1-5
– roughly speaking, 5=A, 4=B, 3=C
– 2007: 425K received at least one 3 or better (15% of all HS students)
The Players
Run by College Board
– 3900 colleges/schools/organizations
Administered by Educational Testing Service (ETS)
– develops, administers exams for AP & others
– obviously, lots of teachers involved here (as temporary employees of ETS)
AP Computer Science
Since 1984
Two exams (A, AB)
– Pascal, C++ (1999), Java (2004)
Includes a year-long case study
– changed from “Fish” to “Gridworld” in 2007
2007: 15,049 (A) + 5,064 (AB) = 20,113
2007: 170 exam “readers” (graders)
– both college and high school instructors
AP CS Grades (2007)
Grade                      AP CS A    AP CS AB
1: No Recommendation       33.9%      19.5%
2: Possibly Qualified       9.5%       9.3%
3: Qualified               14.5%      18.3%
4: Well Qualified          22.8%      19.7%
5: Extremely Qualified     19.3%      33.2%
The AP CS Exam Itself
50%: Multiple choice (40 questions)
– designers' goal: mean score of 20
– 75 minutes
– machine scored
50%: Free response (4 questions)
– requires coding, design, evaluation
– 105 minutes
– must be scored by human “readers”
The Free Response Questions
A1: Self Divisor
– devise and implement an algorithm for identifying and collecting self-divisor numbers
A2: Pounce Fish (Marine Biology Case Study)
– extend the Fish class to allow for pouncing ahead and eating a fish within a range
A3: Answer Sheets
– utilize existing classes to process and score ArrayLists of multiple choice tests
A4: Game Design (Design)
– utilize existing classes and inheritance to implement an abstract game framework
AB1: Sliding Puzzle (Design)
– implement a specified algorithm for filling a 2-D array, and search for a pattern
AB2: Pair Matcher
– utilize existing classes and a complex Map structure for selecting optimal pairs
AB3: Tree Ball
– construct a full binary tree with random data values, and find a maximum path sum
AB4: Environment Iterator (Marine Biology Case Study)
– create and use an iterator to traverse an environment following a diagonal pattern
The Problem At Hand
20,113 exams …
times 4 questions each …
plus 30-40% double-reading …
107,000 questions to score …
170 readers …
yields 630+ questions per reader
– most years, 700+
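The workload arithmetic on this slide can be checked with a short sketch. The class and method names are illustrative, and the 33% double-reading fraction is an assumption chosen from inside the slide's 30-40% range because it reproduces the 107,000 figure.

```java
final class ReadingWorkload {
    /** Question-readings needed when a fraction of responses is read twice. */
    static long totalReadings(int exams, int questionsPerExam, double doubleReadFraction) {
        return Math.round(exams * questionsPerExam * (1.0 + doubleReadFraction));
    }

    /** Average readings per reader, rounded down. */
    static long readingsPerReader(long totalReadings, int readers) {
        return totalReadings / readers;
    }

    public static void main(String[] args) {
        int exams = 20113;         // 2007 A + AB exams, from the slide
        int questionsPerExam = 4;  // free-response questions per exam
        int readers = 170;         // 2007 readers, from the slide

        // 0.33 is an assumed midpoint; the slide only gives the 30-40% range.
        double[] fractions = {0.30, 0.33, 0.40};
        for (double f : fractions) {
            long total = totalReadings(exams, questionsPerExam, f);
            System.out.println(Math.round(f * 100) + "% double-read: " + total
                    + " readings, about " + readingsPerReader(total, readers) + " per reader");
        }
    }
}
```

Under the 33% assumption this gives 107,001 readings and 629 per reader, in line with the slide's rounded 107,000 and 630.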
The Problem At Hand
Internal Consistency
– 700+ exams over 7 consecutive days
External Consistency
– average of 20 readers per question (big questions have over 30)
The Answer:
Rubrics
Definition (NCTM)
“a set of authoritative rules to give direction to the scoring of assessment tasks or activities.”
“describes levels of performance for a particular complex performance task and guides the scoring of that task consistent with relevant performance standards”
The AP Approach
Each question is scored on a 0-9 scale
Correct solutions are described in terms of a set of tiny, verifiable features
Scoring: verifying the presence (or absence) of these features
– (maybe this is why they call it “reading”)
The AP Approach
Each characteristic worth ½-1 point
Deductions for systemic errors
– e.g. confusing array.length & ArrayList.size()
Sum of all points (rounded up) is score
Designers' goal: flat distribution, mean of 5
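The scoring rule described above (half-point or one-point features, deductions, sum rounded up) might look like the following minimal sketch. The class name, the sample feature values, the half-point deduction, and the clamping of the result to the 0-9 range are all assumptions for illustration, not details of the actual AP rubric.

```java
final class RubricScore {
    static final int MAX_SCORE = 9; // top of the 0-9 scale

    /**
     * Adds up the points of every feature the reader found, subtracts
     * deductions, and rounds the result up to a whole score.
     */
    static int score(double[] featurePoints, boolean[] featurePresent, double deductions) {
        if (featurePoints.length != featurePresent.length) {
            throw new IllegalArgumentException("one presence flag per feature is required");
        }
        double earned = 0.0;
        for (int i = 0; i < featurePoints.length; i++) {
            if (featurePresent[i]) {
                earned += featurePoints[i];
            }
        }
        // Assumption: a score never drops below 0 or rises above 9.
        double net = Math.max(0.0, earned - deductions);
        return (int) Math.min(MAX_SCORE, Math.ceil(net));
    }

    public static void main(String[] args) {
        // Hypothetical rubric: five features worth 1 or 1/2 point each.
        double[] points = {1.0, 1.0, 0.5, 0.5, 1.0};
        boolean[] present = {true, true, true, false, true};

        System.out.println(score(points, present, 0.0)); // 3.5 earned, rounded up
        System.out.println(score(points, present, 0.5)); // one assumed half-point deduction
    }
}
```

Half-point values are exactly representable as doubles, so the rounding step does not suffer from floating-point drift.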
2004 Question A4: Personal Robot
Imagine a robot walking down a hallway, picking up toys left on various tiles …
[Figure: hallway diagram]
Question A4: Personal Robot
On a move, the robot does two things (in order):
1. Pick up a toy in the current tile
2. If the tile is (or was) empty, move “forward”
Question A4: Personal Robot
Move 2: nothing to pick up
Move 3: pick up one toy
Move 4: pick up one toy … but I can’t move!
Solution: reverse direction
Move 5: empty cell, move forward
And so on, until the hallway is clear
Question A4: Personal Robot
Moves 6 through 14 [Figure: hallway diagram after each move]
Finally … done after 14 moves
Question A4: Personal Robot
Student given an example sequence (like the above)
Student given the overall class design for the solution
– hallway == array of integers
– robot == int (position in hallway) + boolean (facingRight or not?)
– some predefined methods (isClear())
Finally: The Question
A. Write forwardMoveBlocked()
– determines if stuck at end of hallway
B. Write move()
– performs one “move” of the game
– uses forwardMoveBlocked()
C. Write clearHall()
– runs the whole game (using move())
– counts number of moves
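The three parts can be sketched in Java from the rules given on the slides. This is not the canonical solution from the handout. The class name, the constructor, the accessors position() and isFacingRight(), and the body given for the predefined isClear() are assumptions, and the sample hallway in main is made up rather than taken from the slides.

```java
final class HallwayRobot {
    private final int[] hall;     // number of toys on each tile
    private int pos;              // robot's position in the hallway
    private boolean facingRight;  // direction the robot is facing

    HallwayRobot(int[] toys, int start, boolean facingRight) {
        this.hall = toys.clone();
        this.pos = start;
        this.facingRight = facingRight;
    }

    /** Part A: true if the robot is at the end of the hallway it is facing. */
    boolean forwardMoveBlocked() {
        return facingRight ? pos == hall.length - 1 : pos == 0;
    }

    /** Part B: one move of the game. */
    void move() {
        if (hall[pos] > 0) {
            hall[pos]--;                      // 1. pick up a toy in the current tile
        }
        if (hall[pos] == 0) {                 // 2. the tile is (or was) empty
            if (forwardMoveBlocked()) {
                facingRight = !facingRight;   // stuck at the end: reverse direction
            } else {
                pos += facingRight ? 1 : -1;  // otherwise move forward
            }
        }
    }

    /** Stand-in for the predefined method: true if no toys remain. */
    boolean isClear() {
        for (int toys : hall) {
            if (toys > 0) {
                return false;
            }
        }
        return true;
    }

    /** Part C: run the whole game and count the moves. */
    int clearHall() {
        int moves = 0;
        while (!isClear()) {
            move();
            moves++;
        }
        return moves;
    }

    int position() { return pos; }
    boolean isFacingRight() { return facingRight; }

    public static void main(String[] args) {
        // Made-up hallway: robot starts on tile 1, facing right.
        HallwayRobot robot = new HallwayRobot(new int[] {1, 1, 2, 2}, 1, true);
        System.out.println("moves needed: " + robot.clearHall());
    }
}
```

Keeping the pick-up step and the empty-tile check as two separate if statements mirrors the “is (or was) empty” wording: a tile that held a single toy is emptied and left in the same move.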
Handout:
The Canonical Solutions
The Problem Rubric
2004 Scoring results (all questions)
Training Day
After the usual “welcome” activities …
– and some really bad skits …
– and the assignment of readers to problems …
A presentation of the problem and the canonical solutions
A presentation of the rubric and how its features appear in the canonical solutions
Then … The Training Packet
~15 actual student solutions (photocopied)
Individually scored, group discussed
Attempt to cover representative issues
– “the good, the bad, and the ugly”
The Work (Slowly) Begins
Exams graded in batches (folders) of 25
First 1-2 folders graded as “split-packs”
– graded individually by two readers
– all results compared; differences resolved (with consultation if needed)
Once “comfortable”, individual work begins
Reading Support
Reading partner
– sitting adjacent
– quick questions, reminders, “huh?”
Table Leaders
– one per 4-6 readers
– questions on rubric interpretations
– principal “back-reader” (quality control)
Question Leaders
– in charge of everything else
– provide instruction, review, clarification
The Reading Process
Folders of 25 exams
– exams have all 4 questions; others ignored
– all identifying information removed
Rubric scoring done on separate sheet
Final scores copied onto machine form (“bubble-form”)
Occasional clarification recaps as needed
Quality Control
Back-Reading
– every packet of 25 folders spot-checked: random items rescored until 6-7 agreements
– reader may request specific items be back-read (for those weird answers)
– major problems → consultation
– initial packets back-read (100%) by leaders
Meta-Quality Control
College Comparability Studies
– questions administered in college classes (by faculty volunteers)
– scored simultaneously with “real” exams
College faculty involvement
– exam development, reading, training, etc.
“Make It Or Break It?”
A middle-ground approach:
Must show clear intent
– don’t attempt to read intent into the text
If intent is clear, mercy for common errors
– e.g. syntax, =/==, size/length
– as long as errors don’t admit further confusion
Systemic problems penalized once only
– e.g. method calls on classes vs. objects
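The “penalized once only” rule can be sketched as counting each kind of systemic error a single time, however often it recurs in a response. The class name, the error kinds in the enum, and the per-kind deduction are hypothetical and only illustrate the idea.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

final class SystemicDeductions {
    /** Hypothetical categories of systemic error a reader might flag. */
    enum ErrorKind { CLASS_VS_OBJECT_CALL, LENGTH_VS_SIZE }

    /**
     * Charges the deduction once per distinct kind of error,
     * no matter how many times that kind occurs in the response.
     */
    static double totalDeduction(List<ErrorKind> occurrences, double perKind) {
        Set<ErrorKind> distinct = EnumSet.noneOf(ErrorKind.class);
        distinct.addAll(occurrences);
        return distinct.size() * perKind;
    }

    public static void main(String[] args) {
        // The same mistake made three times is still charged once.
        List<ErrorKind> flagged = Arrays.asList(
                ErrorKind.CLASS_VS_OBJECT_CALL,
                ErrorKind.CLASS_VS_OBJECT_CALL,
                ErrorKind.CLASS_VS_OBJECT_CALL);
        System.out.println(totalDeduction(flagged, 0.5)); // assumed half-point deduction
    }
}
```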
What Kinds of Answers?
The good …
– comparable to the best college students
The bad …
– don’t understand the language
– don’t read & understand the problem
And the “ugly” …
– blank booklets
– long essays on the nature of life …
– sketches of robots, people, basketball nets …
Thoughts on Rubrics
Surprisingly easy to get the feel
– especially with the level of detail
Helps to recognize the richness of even “simple” problems
Obviously, consistency a plus
“It has changed the way I grade my student work.” -- Henry Walker, SIGCSE
Consummate Professionals
ETS staff:
Handles the flow of packets
– forward reading, back reading, times 4
– checking for basic errors (e.g. missing sig.)
The small amenities
– morning / afternoon breaks (no grading session longer than 2 hours)
– simple recreational events (movie night, AA baseball, seminars …)
Consummate Professionals
Fellow readers (75% high-school staff):
Start early, stay late (until chased out!)
Free exchange of info, teaching tips, language issues …
Mutual support (“how nuts is your boss?”)
The unofficial social events (games, eats)
The wacky night of parody songs (“this paper is a nine …”)
Summary
“Either you love reading or you hate it…”
– and I loved it …
The people behind the scenes at ETS/CB really know what they’re doing
At least for CS, the results map meaningfully onto college courses
Rubrics can be a useful way to organize scoring (and later, assigning grades)
http://apcentral.collegeboard.com