Revitalizing ‘Pore’ Assessment Practices: Valid and Reliable Assessment Tools in Medical Education

10/25/17

Sarah Shaffer, DO – University of Iowa
Amy Thompson, MD – University of Cincinnati
Alice Chuang, MD, MEd – University of North Carolina

Neither the authors nor their spouses have any relevant financial disclosures.

Learning Objectives

• The learner will be able to identify elements of a poorly constructed multiple-choice question.
• The learner will recognize reliability and validity threats to observational assessments and discuss actions to minimize these.
• The learner will be able to implement key steps to create a performance evaluation (simulation or objective structured clinical exam).

“Only when we are clear what it is we want to assess can we devise effective strategies for achieving that.”

Hammersley, 1987

Why are reliability and validity so confusing?

• Campbell & Fiske, 1967
  • Reliability is the agreement between two efforts to measure the same trait through maximally similar methods.
  • Validity is represented in the agreement between two attempts to measure the same trait through maximally different methods.
• Black & Champion, 1976
  • The reliability of a measuring instrument is defined as the ability of the instrument to measure consistently the phenomenon it is designed to measure.
  • The validity of a measuring instrument is defined as the property of a measure that allows the researcher to say that the instrument measures what he says it measures…

Hammersley, 1987

[Figure: four targets illustrating precision and accuracy; panel labels as given on the slide]

• High precision, low accuracy: random error small, systematic error large
• High precision, high accuracy: random error small, systematic error small
• Better precision, better accuracy: random error large, systematic error large
• Low precision, low accuracy: random error smaller, systematic error large

Validity – Construct we are measuring
Reliability – Instrument we are using to measure the construct, and its ability to produce valid scores
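
The precision and accuracy picture can be made concrete with a small simulation. The sketch below is illustrative only: it assumes a hypothetical true value, treats systematic error as a fixed offset (a validity problem), and treats random error as Gaussian noise (a reliability problem). The names `simulate_measurements` and `summarize` are made up for this example.

```python
import random
import statistics


def simulate_measurements(true_value, bias, spread, n=1000, seed=0):
    """Draw n readings: true value + fixed offset (systematic) + Gaussian noise (random)."""
    rng = random.Random(seed)
    return [true_value + bias + rng.gauss(0, spread) for _ in range(n)]


def summarize(readings, true_value):
    """Return (estimated systematic error, estimated random error)."""
    systematic = statistics.mean(readings) - true_value
    random_error = statistics.stdev(readings)
    return systematic, random_error


TRUE_HEIGHT_CM = 170.0  # hypothetical true value, assumed for illustration

# High precision, low accuracy: small random error, large systematic error.
precise_biased = simulate_measurements(TRUE_HEIGHT_CM, bias=5.0, spread=0.5)
# Low precision, unbiased: large random error, small systematic error.
noisy_unbiased = simulate_measurements(TRUE_HEIGHT_CM, bias=0.0, spread=5.0)

for label, data in [("precise but biased", precise_biased),
                    ("noisy but unbiased", noisy_unbiased)]:
    sys_err, rand_err = summarize(data, TRUE_HEIGHT_CM)
    print(f"{label}: systematic error ~ {sys_err:.2f}, random error ~ {rand_err:.2f}")
```

Repeating a biased measurement many times tightens the estimate of the wrong number, which is one way to see why a consistent instrument is not automatically a valid one.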

Let’s review…

Validity [1]
• “Do scores accurately reflect the presence of a property?”
• “Accuracy in our ability to measure an observable property and correlate with an unobservable property.”
• Do scores accurately reflect the amount of a property? (precision)

Reliability
• “Repetition of the study would result in the same data and conclusions” [2]
• “Ability of an instrument to produce valid scores.” [1]
• An understood requirement for a test to be valid
• A measure of random error in the instrument

[1] Hammersley, 1987
[2] Goode & Hatt, 1952

How tall am I?
• Validity
• Reliability

Pregnancy test example – 50% of patients are truly pregnant

• Hour #1 results: 50 positive, 50 negative
• Hour #2 results: 20 positive, 80 negative

BEST RESULTS: ??%
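
One possible way to reason about the pregnancy test example is to ask how good each run could be at best, given only the counts. The sketch below assumes the same 100 patients were tested in both hours (the slide does not say so) and computes upper bounds only; the function names are hypothetical.

```python
def best_case_accuracy(n_positive, n_negative, n_truly_positive):
    """Upper bound on correct results when only the positive/negative counts are known."""
    n_truly_negative = n_positive + n_negative - n_truly_positive
    return min(n_positive, n_truly_positive) + min(n_negative, n_truly_negative)


def best_case_agreement(run_a, run_b):
    """Upper bound on patients given the same result in both runs.

    Each run is a (positives, negatives) tuple over the same patients.
    """
    return min(run_a[0], run_b[0]) + min(run_a[1], run_b[1])


hour_1 = (50, 50)      # positives, negatives from the slide
hour_2 = (20, 80)
truly_pregnant = 50    # 50% of an assumed 100 patients

print(best_case_accuracy(*hour_1, truly_pregnant))   # 100
print(best_case_accuracy(*hour_2, truly_pregnant))   # 70
print(best_case_agreement(hour_1, hour_2))           # 70
```

Under these assumptions, at least 30 patients must have received different results in the two hours, so the test cannot be fully reliable no matter which run is closer to the truth.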

Written Tests

• Two major formats
  • Constructed Response (CR): requires formation of a unique written response
  • Selected Response (SR): requires choice of a best or correct answer from a set list of options in response to a prompt or “question stem”

[Figure: Miller’s Pyramid shown alongside Kirkpatrick’s Criteria for Learning Outcomes; highlighted: Demonstration of learning]

Kirkpatrick’s Criteria for Learning Outcomes

4. Results
   A. Change in organizational practice
   B. Benefits to patients/clients
3. Behavior
   A. Transfer learning to workplace
   B. Learners apply new knowledge and skills
2. Learning
   A. Change attitudes/perceptions
   B. Change knowledge/skills
1. Reaction
   Student satisfaction

Pros & Cons of Written Tests

• Strengths
  • In-depth & non-cued (CR)
  • Test multiple aspects (CR)
  • Ease of partial credit scoring (CR)
  • Test broad content (SR)
  • Objective (SR)
  • Reproducible (SR)
  • Efficient (SR)
• Limitations
  • Subjective scoring (CR)
  • Issues with reproducibility (CR)
  • Inefficient (CR)
  • Difficult to write well (SR)
  • Issues with publicizing (SR)

Steps to creation of a Written Test

• Selected Response
  • Determine appropriateness of SR written test format
  • Determine content to be assessed
  • Write SR question(s) with consideration of:
    • Formatting
    • Style
    • Stem-specific issues
    • Option-specific issues
  • Score SR item(s)
• Constructed Response
  • Determine appropriateness of CR written test format
  • Determine the content to be assessed
  • Write CR prompt(s) or question(s)
  • Prepare ideal answer(s)
  • Content experts read & score CR item(s)

“Any aspect of a cognitive educational achievement can be tested by means of either the multiple-choice or the true-false form.”

Roger L. Ebel, 1972

SR Written Test: Determine Content

• Learner level
• Background information or instruction
• Goal of content gained by learner
• Time (etc.) constraints
• Test author & test reviewer(s) should be content experts

SR Written Test: Write MCQs

• SR test item anatomy = stem + question + options
• A proposed taxonomy of 31 principles of writing effective MCQs is available
• HIGH POINTS
  • Focus on a single content element or issue
  • Stem must contain adequate information
  • Question should be obvious
  • Brevity is best for answer options
  • Every option should be plausible
  • Three options are generally best
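
The item anatomy (stem + question + options) and a few of the high points can be sketched as a small data structure with a mechanical pre-review check. This is a minimal illustration, not a validated tool: `MCQItem`, `review_item`, and the word-count threshold are assumptions made for the example, and plausibility of options still needs a content expert.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MCQItem:
    stem: str            # background information the examinee needs
    question: str        # the lead-in that is actually being asked
    options: List[str]   # answer choices
    answer_index: int    # position of the single best answer


def review_item(item: MCQItem, preferred_option_count: int = 3,
                max_option_words: int = 12) -> List[str]:
    """Return a list of warnings; thresholds are illustrative assumptions."""
    warnings = []
    if not item.stem.strip():
        warnings.append("stem is empty")
    if not item.question.strip().endswith("?"):
        warnings.append("question is not phrased as a direct question")
    if len(item.options) != preferred_option_count:
        warnings.append(f"{len(item.options)} options; "
                        f"{preferred_option_count} is generally best")
    for option in item.options:
        if len(option.split()) > max_option_words:
            warnings.append(f"option is long: {option!r}")
    normalized = {option.strip().lower() for option in item.options}
    if len(normalized) != len(item.options):
        warnings.append("duplicate options")
    if not 0 <= item.answer_index < len(item.options):
        warnings.append("answer_index does not point at an option")
    return warnings


good_item = MCQItem(
    stem="An educator wants to sample a broad range of content in a short exam.",
    question="Which test format fits this goal best?",
    options=["Selected response", "Constructed response", "Oral exam"],
    answer_index=0,
)
print(review_item(good_item))  # []
```

A check like this only catches structural slips; peer review by another content expert remains the main safeguard.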

SR Written Test: Review & Scoring

• Review by another content expert(s)
  • Apply principles of writing effective MCQs
  • Consider content over/under-representation
  • Consider balance of higher-order & lower-order cognitive objectives (2/3 and 1/3, respectively, are suggested)
• Adequate time for review & revision
• Scoring should be relatively simple with a single best and/or correct answer
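
The review step's checks on content representation and cognitive balance can be tallied from item tags. The sketch below assumes each draft item has been hand-tagged with a topic and a level of "higher" or "lower"; the topics, counts, and function names are hypothetical.

```python
from collections import Counter


def higher_order_share(levels):
    """Fraction of items tagged 'higher' among those tagged 'higher' or 'lower'."""
    counts = Counter(levels)
    total = counts["higher"] + counts["lower"]
    if total == 0:
        raise ValueError("no items tagged 'higher' or 'lower'")
    return counts["higher"] / total


def topic_shares(topics):
    """Fraction of items devoted to each topic."""
    counts = Counter(topics)
    return {topic: n / len(topics) for topic, n in counts.items()}


# Hypothetical 12-item draft: (topic, cognitive level) per item.
draft = ([("validity", "higher")] * 4 + [("validity", "lower")] * 2
         + [("reliability", "higher")] * 2 + [("reliability", "lower")] * 1
         + [("item writing", "higher")] * 2 + [("item writing", "lower")] * 1)

share = higher_order_share([level for _, level in draft])
print(f"higher-order share: {share:.2f} (suggested: about 2/3)")
print(topic_shares([topic for topic, _ in draft]))
```

Comparing the topic shares against the intended content list makes over- and under-representation visible before the test is given.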

Examples: Selected Response MCQs

SR - Possible threats to validity/sources of error

• Choice of content ➡ adequate input from multiple content experts, adequate time to refine list
  • Content-validity considerations
    • Over-representation
    • Degradation due to inefficiency
• Construction of MCQs or other SR items ➡ training of skill, peer review & feedback
• Attempts to correct for guessing ➡ construct-irrelevant variance ➡ limits the number of options and the number of items
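
For readers unfamiliar with what a "correction for guessing" looks like, the classic formula-scoring rule subtracts a fraction of the wrong answers from the number right. The slide does not name a specific method, so the rule below is shown as an assumed, commonly cited example rather than the presenters' recommendation.

```python
def formula_score(n_right, n_wrong, n_options):
    """Classic formula scoring: right minus wrong / (options - 1).

    Omitted items are neither rewarded nor penalized.
    """
    if n_options < 2:
        raise ValueError("an item needs at least two options")
    return n_right - n_wrong / (n_options - 1)


# Hypothetical 40-item, three-option test: 28 right, 9 wrong, 3 omitted.
print(formula_score(28, 9, 3))  # 23.5

# Blind guessing on all 40 items is expected to score about zero.
expected_right = 40 / 3
expected_wrong = 40 - expected_right
print(round(formula_score(expected_right, expected_wrong, 3), 6))
```

Because the penalty changes how willing examinees are to guess, scores start to reflect risk-taking as well as knowledge, which is the construct-irrelevant variance the slide warns about.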

CR - Possible threats to validity/sources of error

• Scoring is inherently subjective ➡
  • Model answers
    • Contain required components
    • Reviewed for accuracy
    • Reviewed for completeness
  • Create a scoring rubric
  • Two independent readers for each
  • Read all answers to same question
  • Train, re-train or replace
• Content under-representation
  • Prompts sample knowledge “deeply” but not “broadly”
• Types of examinee “bluffing”
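
When two independent readers score each constructed-response answer, their agreement can be checked before deciding whether to train, re-train or replace a reader. The sketch below uses Cohen's kappa, a standard chance-corrected agreement statistic; the slides do not prescribe a particular statistic, and the scores shown are made up.

```python
from collections import Counter


def cohens_kappa(reader_a, reader_b):
    """Chance-corrected agreement between two readers scoring the same answers."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("readers must score the same non-empty set of answers")
    n = len(reader_a)
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    freq_a = Counter(reader_a)
    freq_b = Counter(reader_b)
    # Agreement expected by chance if the readers scored independently.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both readers used one identical category throughout; kappa is
        # undefined, so this sketch reports perfect agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical rubric scores (0-3) from two readers on eight answers.
reader_a = [2, 3, 1, 2, 0, 3, 2, 1]
reader_b = [2, 3, 1, 1, 0, 3, 2, 2]
print(round(cohens_kappa(reader_a, reader_b), 3))  # 0.652
```

Raw percent agreement here is 75%, but kappa is lower because some of that agreement would be expected by chance.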

Interactive Activity

Oral Exams

Observational assessments: aka Clinical Performance Evaluation

[Figure repeated: Miller’s Pyramid shown alongside Kirkpatrick’s Criteria for Learning Outcomes; highlighted: Demonstration of learning]

Clinical Performance Evaluation

• Commonly used for formative evaluation
• Often represents major component of composite grade
• Vary from simple to complex
  • Single rater, single setting
  • Single rater, multiple settings
  • Multiple raters, single rater type, single setting
  • Multiple raters, single rater type, multiple settings
  • Multiple raters, multiple rater types, single setting
  • Multiple raters, multiple rater types, multiple settings
• Checklists and rating forms used

Raters and Rating Scale limitations

“The more precise the scale, the more difficult it is to achieve high levels of validity. And, indeed, there is often a temptation to be more precise than the level of validity with which an object can be measured justifies.”

Hammersley, 1987

[Figure labels: Designer; Regular people]

Standardized Assessment (OSCE)
• Often predictable
• Specific participants
• Roles constant
• Level of participation consistent
• Training controlled
• Specific content
• Patient complexity controlled
• Simulated world

Clinical Performance Assessment
• Unpredictable
• Participants can vary
• Roles are complex
• Level of participation varies
• Participants (likely) not formally trained
• Content varies
• Patient complexity varies
• Real consequences

Assessment in Clinical Environments

• Instruction and assessment are inseparable
• Two-way interactive relationships develop
• Subject to pitfalls and biases inherent in human relationships
• Assessors are often learning and working at the same time
• Learners quickly take on a pretense or cloak of competence that shapes evaluation

Validity Threats in Clinical Evaluations

• Too few observations of behavior
• Incomplete observations
• Checklist items inappropriately written
• Evaluator bias
• Systematic evaluator error
• Bluffing of evaluators
• Poorly trained evaluators

Steps to minimize validity threats

• Assessment goals are defined and match assessment tool
• Checklist items are generated by key stakeholders
• Checklist items are clearly written and tested with a variety of evaluators
• Evaluators have clear sense of the assessment goal
• Evaluators know how to use the assessment tool/instrument
• Learners have clear sense of assessment goal

Best Practices

• Instruments should allow for a broad range of clinical situations
• Use multiple evaluators when possible
• Keep instruments short and focused
• Formative assessments should be separate from those for advancement
• Give sufficient time for assessments so that ratings are not rushed
• Minimize lag time between observations and evaluation completion
• No more than 7 response options per item

Williams, Klamen, & McGaghie, 2003; McGaghie, Butter, & Kaye, 2009

Interactive Activity

Performance Assessments

• Definition: Simulated real-life task, performed “in vitro”
• Examples:
  • Driver’s license exam
  • SCUBA certification
  • USMLE Step 2
  • OSCE

Demonstration of learning

Pros and Cons (compared to Observational Assessments)

• Strengths
  • Can control setting
  • Increased standardization
  • Improved patient safety because SPs are used
  • Not disruptive to actual patient care
• Limitations
  • Complex logistically
  • Challenging to simulate real life
  • Expensive

Steps

• Identify task
• Create case
• Create assessment tools
• Pilot
• Make adjustments

Identify task

• What competencies and skills will be assessed?
  • May need to reference school competencies
  • May require a needs assessment
  • May need to review EPAs
• Do you want to test discrete skills (e.g. taking a sexual history)?
• Do you want to test a comprehensive patient encounter (e.g. full H&P with differential and write-up)?

Creating the case/Writing the Scripts

• SP script should contain rich detail including:
  • Personal characteristics of the patient
  • Demographics
  • Special instructions (e.g. affect, props, mood, etc.)
  • Backstory
  • Full medical history
  • Proposed responses to questions
  • Plausible lab results
• Include also:
  • Door note
  • Checklists
  • Task to be accomplished by student
  • Guidelines
  • Feedback
  • Filling out forms

Scoring the performance

• Checklist: contains specific tasks that the student has performed, does not require expert rater, yes or no, behaviors, objective
• Rating Scale: Likert scale to indicate quality of what was done, more subjective
• Case-specific Checklist: tasks developed by panel of experts for specific scenario, should be able to be completed by non-expert, behaviors, objective
• Rubric: used for written products or post-encounter
• ALSO - MUST DECIDE THE STANDARD FOR PASSING: CERTAIN CUMULATIVE SCORE or CERTAIN PERFORMANCE on each case
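
The two passing standards named above can give different decisions for the same learner. The sketch below contrasts a cumulative cut score with a per-case minimum, using made-up scores and cut points; the function names and the optional `min_cases_passed` parameter are assumptions for illustration.

```python
def passes_cumulative(case_scores, cut_score):
    """Pass if the total across all cases reaches the cut score."""
    return sum(case_scores) >= cut_score


def passes_per_case(case_scores, case_minimum, min_cases_passed=None):
    """Pass if enough individual cases reach the per-case minimum.

    By default every case must be passed.
    """
    required = len(case_scores) if min_cases_passed is None else min_cases_passed
    passed = sum(score >= case_minimum for score in case_scores)
    return passed >= required


# Hypothetical three-station OSCE, each station scored out of 10.
scores = [9, 9, 4]
print(passes_cumulative(scores, cut_score=20))    # True: strong stations offset the weak one
print(passes_per_case(scores, case_minimum=6))    # False: one station is below the minimum
```

A cumulative standard lets strong performance on one case compensate for weak performance on another, while a per-case standard does not, so the choice should follow from what the exam is meant to certify.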

Examples: Show Choir Audition Rubric

Singing Skills
• Poor (0 points): Sings with as much expression as a wet noodle, cannot identify tune candidate is singing, cannot identify lyrics of song due to poor pronunciation
• Fair (1 point): Minimally expressive, pitch off significantly on occasion, diction unclear at times
• Good (2 points): Very expressive, sings on pitch most of the time with minor errors, diction clear most of the time
• Superior (3 points): Artistically expressive, sings on pitch, diction consistently clear

Dancing Skills
• Poor (0 points): Has 2 left feet, unable to learn new steps, continues to dance like MC Hammer despite different choreography demonstrated
• Fair (1 point): Missteps frequently, minimal artistic expression in dance moves, difficulty learning new choreography after 3 demonstrations
• Good (2 points): Occasionally missteps, but overall dance steps are accurate, adapts choreography fairly rapidly
• Superior (3 points): Quick and nimble, dances artistically, able to learn new choreography quickly

Enthusiasm
• Poor (0 points): Freely admits not knowing what GLEE is
• Fair (1 point): Endorses enjoyment of GLEE, but unable to identify favorite character
• Good (2 points): Has watched 70% of GLEE episodes
• Superior (3 points): Has seen every episode of GLEE, all GLEE albums confirmed in iTunes library, has been to GLEE LIVE each summer

Possible threats to validity/error

• Checklist items interpreted differently by different people -> faculty development, use multiple inputs for creation of checklist
• Some items/some cases vary by difficulty -> use multiple cases
• Some cases ambiguous with multiple interpretations -> pilot carefully, use PDSA cycle to improve
• SPs of variable quality -> careful training of SPs
• Raters have variable scores secondary to halo effect, leniency, severity, etc. -> use behaviorally anchored scoring, establish frame of reference
• One rater/SP performs inconsistently -> remove them or train
• One particular group performs significantly differently -> may be an occasion-specific factor (such as disappointing presidential election), try to control factors that can be controlled
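
Rater leniency and severity can be screened for by comparing each rater's average score with the overall average. The sketch below is a rough first pass under a strong assumption, namely that raters saw broadly comparable learners; the threshold, rater names, and scores are invented for the example.

```python
import statistics


def rater_offsets(scores_by_rater):
    """Each rater's mean score minus the mean of all scores."""
    all_scores = [s for scores in scores_by_rater.values() for s in scores]
    grand_mean = statistics.mean(all_scores)
    return {rater: statistics.mean(scores) - grand_mean
            for rater, scores in scores_by_rater.items()}


def flag_raters(scores_by_rater, threshold=0.75):
    """Label raters whose offset is at least `threshold` (an assumed cut-off)."""
    flags = {}
    for rater, offset in rater_offsets(scores_by_rater).items():
        if abs(offset) >= threshold:
            flags[rater] = "lenient" if offset > 0 else "severe"
    return flags


# Hypothetical 1-5 ratings from three raters.
ratings = {
    "rater_a": [3, 3, 4, 3],
    "rater_b": [3, 4, 3, 3],
    "rater_c": [5, 5, 5, 4],
}
print(flag_raters(ratings))  # {'rater_c': 'lenient'}
```

A flag is a prompt for frame-of-reference training or a closer look, not proof of bias, since a rater may simply have observed stronger learners.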

Interactive Activity

• Please create a 3-station OSCE that will test if an individual knows how to prepare for bed. This OSCE should test skills in the areas of 1) dental hygiene, 2) toileting and 3) appropriate attire. Remember the steps:
  • Identify task
  • Create case
  • Create assessment tools
  • Pilot
  • Make adjustments

Each table should choose to create a station for this OSCE – select skill 1, 2, or 3.

Thank you for joining us!

Sarah Shaffer, DO – University of Iowa
Amy Thompson, MD – University of Cincinnati
Alice Chuang, MD, MEd – University of North Carolina

References

• Downing SM and Yudkowsky R, eds. Assessment in Health Professions Education. New York: Routledge; 2009.
• Casey PM et al. To the point: reviews in medical education--the Objective Structured Clinical Examination. Am J Obstet Gynecol 2009 Jan;200(1):25-34.
• Hammersley, 1987
• Goode & Hatt, 1952
• Williams, Klamen, & McGaghie, 2003
• McGaghie, Butter, & Kaye, 2009