MAKING APPROPRIATE PASS-FAIL DECISIONS
DWIGHT HARLEY, Ph.D.
DIVISION OF STUDIES IN MEDICAL EDUCATION
UNIVERSITY OF ALBERTA
PASSING SCORES
Essential component of high stakes exams
Reaffirm standards
Their purpose is to ensure that
  qualified candidates pass
  unqualified candidates do not pass
How much is enough?
Is 50% the passing score on this exam?
REAFFIRMING STANDARDS
Performance standard
  Minimally adequate level of performance to enter practice
Passing score
  Point on the score scale which separates those who are successful and those who are not
THE BASIS FOR PASSING SCORES
Arbitrary judgment unavoidable
Reflect consensus of experts on reasonable expectations for evidence of competence
Imposing discrete categories on a continuum
Set to serve the interests of public and profession
Process should be as open as possible
Based on as much relevant data as possible
Rationale presented as clearly as possible
PROCESS OF SETTING PASSING SCORES
Unreasonable to expect 100% correct
Possible to construct tests with predetermined passing scores
Possible to adjust passing scores to achieve an acceptable pass rate
Possible to estimate a minimum passing score by combining estimates of the importance of individual test items
PASSING SCORE LEVEL
Determined by the situation and purpose
Provide society with enough sufficiently competent practitioners
Raising the passing score increases the average competence of those who pass but decreases their number
Proportions passing should remain constant
The more relevant and demanding the requirements for writing the test, the fewer are expected to fail
If more than a small proportion of successful candidates fail the exam, its validity may be subject to serious challenge.
CRITERIA FOR DEFENSIBILITY
A standard setting method should …
  produce appropriate classification information
  be sensitive to candidate performance
  be sensitive to instruction
  be statistically sound
  identify the “true” standard
  be easy to implement and compute
  be credible and easily interpretable by lay people
STANDARD SETTING METHODS
More than 3 dozen methods
Some of the better known methods include
  Nedelsky
  Angoff
  Bookmark
  Ebel
  Jaeger
  IRT methods
“THE INDUSTRY STANDARD”
The Angoff Method is:
  the most commonly used method
  convenient to use
  well-researched
  easily explained
  easily customized
  applicable to several response formats
ANGOFF METHOD
Judges assign probabilities that a hypothetical minimally competent borderline candidate will be able to answer each item correctly.
For each judge, probabilities are summed to get a minimum performance level (MPL)
MPLs are averaged to get a final passing score
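The two steps above can be written compactly. The notation here is an assumption introduced for illustration and does not come from the slides: p_ij is the probability judge j assigns to item i, n is the number of items, and J is the number of judges.

```latex
\mathrm{MPL}_j = \sum_{i=1}^{n} p_{ij},
\qquad
\text{passing score} = \frac{1}{J} \sum_{j=1}^{J} \mathrm{MPL}_j
```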
MINIMALLY COMPETENT
The effectiveness of the Angoff method rests on the judges’ ability to accurately conceptualize a “minimally competent, borderline candidate.”
Repeated reference to a formal summary of the behaviours and performance indicators is required
Judge training and calibration are essential
ANGOFF CALCULATIONS
Item    Judge 1    Judge 2
1       1.00       0.85
2       0.65       0.50
3       0.80       0.75
4       0.45       0.50
5       0.30       0.40
MPLj    3.2        3.0
Passing score for this test is 3.1 items correct out of 5.
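The arithmetic behind this table can be reproduced with a short Python sketch. The function name `angoff_passing_score` and the dictionary layout are illustrative choices, not part of the original deck; the ratings are the ones in the table.

```python
# Ratings from the table above: one list of item probabilities per judge.
ratings = {
    "Judge 1": [1.00, 0.65, 0.80, 0.45, 0.30],
    "Judge 2": [0.85, 0.50, 0.75, 0.50, 0.40],
}


def angoff_passing_score(ratings):
    """Return (per-judge MPLs, passing score) for judge -> item probabilities."""
    # Each judge's MPL is the sum of that judge's item probabilities.
    mpls = {judge: sum(probs) for judge, probs in ratings.items()}
    # The passing score is the mean of the judges' MPLs.
    passing = sum(mpls.values()) / len(mpls)
    return mpls, passing


mpls, passing = angoff_passing_score(ratings)
for judge, mpl in mpls.items():
    print(f"{judge}: MPL = {mpl:.2f}")
print(f"Passing score = {passing:.2f} of {len(ratings['Judge 1'])} items")
```

Run as written, this gives MPLs of 3.20 and 3.00 and a passing score of 3.10, matching the table.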
A MINOR VARIANT
Judges are asked to imagine a pool of 100 minimally competent borderline students and then estimate the number of these students who would answer the item correctly
Reduces cognitive complexity of the task
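A minimal sketch of how such head counts map back onto Angoff probabilities. The counts below are hypothetical: they simply restate Judge 1's probabilities from the calculations table as numbers out of 100, and the variable names are illustrative.

```python
POOL_SIZE = 100  # size of the imagined pool of borderline students

# Hypothetical counts: how many of the 100 students one judge expects
# to answer each of the five items correctly.
counts = [100, 65, 80, 45, 30]

# Dividing by the pool size recovers the usual Angoff probabilities.
probabilities = [c / POOL_SIZE for c in counts]

# The judge's MPL is the sum of those probabilities.
mpl = sum(counts) / POOL_SIZE

print(probabilities)
print(f"MPL = {mpl:.2f}")
```

The judge's MPL comes out the same as under the probability version; only the way the question is put to the judge changes.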
VARIATIONS ON A THEME
Scales
Iterative process
Feedback between rounds
  Judges’ results
  Past item performance
    p-values
    % passing
Yes/No procedure
SCALES
Probability scales are sometimes provided to simplify the process. For example:
  5%, 20%, 40%, 60%, 75%, 90%, 95%
  0%, 5%, 10%, 15% … 95%, 100%
  20%, 25%, 30% … 95%, 100%
ANGOFF WITH ITERATION
Most commonly used modification.
“Angoff-ing” is done a number of times.
Time between rounds is used for discussion among judges.
Intent is to reduce variability among judges on item estimates.
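One simple way to check whether a round of discussion reduced variability is to compare the per-item spread of the judges' estimates before and after. The sketch below assumes the range (maximum minus minimum) as the spread measure, which is one choice among several. Round one reuses the two-judge table from the Angoff calculations; the round-two values are hypothetical.

```python
def item_spreads(ratings):
    """Range (max minus min) of the judges' estimates for each item."""
    per_item = zip(*ratings.values())  # regroup the ratings item by item
    return [max(vals) - min(vals) for vals in per_item]


def mean_spread(ratings):
    """Average of the per-item ranges."""
    spreads = item_spreads(ratings)
    return sum(spreads) / len(spreads)


round1 = {
    "Judge 1": [1.00, 0.65, 0.80, 0.45, 0.30],
    "Judge 2": [0.85, 0.50, 0.75, 0.50, 0.40],
}
# Hypothetical estimates after discussion between rounds.
round2 = {
    "Judge 1": [0.95, 0.60, 0.80, 0.45, 0.35],
    "Judge 2": [0.90, 0.55, 0.75, 0.50, 0.40],
}

print(f"Round 1 mean spread: {mean_spread(round1):.2f}")
print(f"Round 2 mean spread: {mean_spread(round2):.2f}")
```

With more than two judges, a standard deviation per item may be a more informative measure than the range.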
NORMATIVE DATA
Normative or impact data is presented just prior to the final iteration.
Improves inter-rater reliability.
Greatest impact on items that have been greatly over- or underestimated.
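As a rough illustration of how normative data can point judges to such items, the sketch below compares the judges' mean estimate for each item with an observed p-value and flags large gaps. The p-values and the 0.15 threshold are assumptions made up for this example. The mean estimates come from the two-judge table in the Angoff calculations.

```python
# Mean Angoff estimate per item across the two judges in the earlier table.
angoff_mean = [0.925, 0.575, 0.775, 0.475, 0.35]

# Hypothetical p-values (proportion of all candidates answering correctly)
# from a past administration of the same items.
p_values = [0.95, 0.85, 0.80, 0.30, 0.40]

THRESHOLD = 0.15  # assumed cut-off for a "large" gap; not from the source

# Keep (item number, estimate minus p-value) for every item over the threshold.
flagged = [
    (item, round(est - p, 3))
    for item, (est, p) in enumerate(zip(angoff_mean, p_values), start=1)
    if abs(est - p) > THRESHOLD
]
print(flagged)
```

Because p-values describe all candidates rather than borderline ones, a flagged item is a prompt for discussion among the judges, not proof that the estimate is wrong.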
YES/NO PROCEDURE
Judges decide whether a single minimally competent borderline student would answer the item correctly
Attempt to simplify the cognitive complexity of the judges’ task
Comparable results to the traditional method
YES/NO CALCULATIONS
Item    Judge 1    Judge 2
1       1          1
2       1          0
3       1          1
4       0          0
5       0          0
MPLj    3          2
Passing score = average of MPLs = (3 + 2) / 2 = 2.5 items correct
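The same calculation as a short Python sketch, with 1 for "yes" and 0 for "no". The variable names are illustrative; the votes are those in the table.

```python
# Yes/No judgments from the table above (1 = yes, 0 = no), per judge.
votes = {
    "Judge 1": [1, 1, 1, 0, 0],
    "Judge 2": [1, 0, 1, 0, 0],
}

# Each judge's MPL is simply the number of "yes" judgments.
mpls = {judge: sum(v) for judge, v in votes.items()}

# The passing score is the mean of the MPLs.
passing = sum(mpls.values()) / len(mpls)

print(mpls)
print(f"Passing score = {passing} items correct")
```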
IN AN EMERGENCY
When a committee is not available, Angoff-ing can be done solo
Assign Angoff values to each item and sum the values
Ask a colleague to review your Angoff assignments
Use an item analysis as a reality check
ROUNDING PASSING SCORES
Rarely do derived passing scores produce exact whole numbers
Rounding may have an impact on the pass/fail rate
Consider the consequences of rounding
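To see how much rounding can matter, the sketch below applies the 3.1 passing score from the Angoff calculations to a hypothetical set of ten candidate scores, once rounded down and once rounded up. The scores and the helper `pass_rate` are illustrative assumptions, not data from the deck.

```python
import math

passing_score = 3.1  # derived passing score from the Angoff calculations

# Hypothetical whole-number scores for ten candidates on the 5-item test.
scores = [5, 4, 4, 3, 3, 3, 3, 2, 2, 1]


def pass_rate(scores, cut):
    """Proportion of candidates scoring at or above the cut."""
    return sum(s >= cut for s in scores) / len(scores)


for label, cut in [
    ("round down", math.floor(passing_score)),
    ("round up", math.ceil(passing_score)),
]:
    print(f"{label}: cut = {cut}, pass rate = {pass_rate(scores, cut):.0%}")
```

With these made-up scores the pass rate is 70% at a cut of 3 and 30% at a cut of 4. On short tests in particular, a change of a single item in the cut can move a large share of candidates across the line.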