A Programmatic Approach to Assessment
ADEA/ADEE Meeting: Shaping the Future of Dental Education, 8-9 May, London, UK
Cees van der Vleuten, Maastricht University, The Netherlands
www.maastrichtuniversity.nl/she | www.ceesvandervleuten.com
With gratitude for the support of LiftUPP!
Overview
• From practice to research
• From research to theory
• From theory to practice
• Conclusions
The Toolbox
• MCQ, MEQ, OEQ, SIMP, write-ins, key feature, progress test, PMP, SCT, viva, long case, short case, OSCE, OSPE, DOCEE, SP-based test, video assessment, MSF, Mini-CEX, DOPS, assessment center, self-assessment, peer assessment, incognito SPs, portfolio…
[Miller's pyramid: Knows, Knows how, Shows how, Does]
Knows - fact-oriented assessment: MCQ, write-ins, oral…
Knows how - scenario or case-based assessment: MCQ, write-ins, oral…
Shows how - performance assessment in vitro: assessment centers, OSCE…
Does - performance assessment in vivo: in situ performance assessment, 360° feedback, peer assessment
The way we climbed…
Characteristics of instruments:
• Validity
• Reliability
• Educational impact
• Acceptability
• Cost
(Focus in what follows: validity, reliability, educational impact)
Validity: what are we assessing?
• Curricula have changed from an input orientation to an output orientation
• We went from haphazard learning to integrated learning objectives, to end objectives, and now to (generic) competencies
• We went from teacher-oriented programs to learning-oriented, self-directed programs
Competency frameworks
CanMEDS:
• Medical expert
• Communicator
• Collaborator
• Manager
• Health advocate
• Scholar
• Professional
ACGME:
• Medical knowledge
• Patient care
• Practice-based learning & improvement
• Interpersonal and communication skills
• Professionalism
• Systems-based practice
GMC:
• Good clinical care
• Relationships with patients and families
• Working with colleagues
• Managing the workplace
• Social responsibility and accountability
• Professionalism
Validity: what are we assessing?
• Standardized assessment (fairly established)
• Unstandardized assessment (emerging)
Messages from validity research
• There is no magic bullet; we need a mixture of methods to cover the competency pyramid
• We need BOTH standardized and non-standardized assessment methods
• For standardized assessment, quality control around test development and administration is vital
• For unstandardized assessment, the users (the people) are vital.
Method reliability as a function of testing time

Method                           1 h     2 h     4 h     8 h
MCQ [1]                          0.62    0.77    0.87    0.93
Case-based short essay [2]       0.68    0.81    0.89    0.94
PMP [1]                          0.36    0.53    0.69    0.82
Oral exam [3]                    0.50    0.67    0.80    0.89
Long case [4]                    0.60    0.75    0.86    0.92
OSCE [5]                         0.54    0.70    0.82    0.90
Practice video assessment [7]    0.62    0.77    0.87    0.93
Incognito SPs [8]                0.61    0.76    0.86    0.93
Mini-CEX [6]                     0.73    0.84    0.92    0.96

[1] Norcini et al., 1985; [2] Stalenhoef-Halling et al., 1990; [3] Swanson, 1987; [4] Wass et al., 2001; [5] Van der Vleuten, 1988; [6] Norcini et al., 1999; [7] Ram et al., 1999; [8] Gorter, 2002
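The pattern in the table can be reproduced with the Spearman-Brown prophecy formula, which predicts reliability when testing time is multiplied. The formula itself is standard; that the table was built by extrapolating the one-hour coefficients this way is an assumption, checked here against the MCQ row:

```python
def spearman_brown(r1, n):
    """Predicted reliability when the test is lengthened by a factor n,
    given the reliability r1 of the original (one-hour) test."""
    return n * r1 / (1 + (n - 1) * r1)

# Extrapolate the MCQ row from its one-hour reliability of 0.62.
for hours in (1, 2, 4, 8):
    print(hours, round(spearman_brown(0.62, hours), 2))
# Prints 0.62, 0.77, 0.87, 0.93 -- matching the MCQ row of the table.
```

The same extrapolation reproduces the other rows (e.g. Mini-CEX: 0.73 at one hour gives 0.84 at two), which is why the ranking of methods is driven by the single-hour coefficient, not by testing time.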
Reliability as a function of sample size (Moonen et al., 2013)
[Figure: generalizability coefficient (y-axis, 0.65 to 0.90) versus number of observations (x-axis, 4 to 12) for the Mini-CEX (KPB), OSATS, and MSF, with a reference line at G = 0.80]
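The sample sizes behind such curves follow from rearranging the same Spearman-Brown relation to solve for the number of observations needed to reach a target coefficient such as G = 0.80. A minimal sketch; the single-observation reliability used below is a hypothetical value, not one reported by Moonen et al.:

```python
import math

def observations_needed(r1, target=0.80):
    """Observations required for the aggregated score to reach the target
    reliability, given the reliability r1 of a single observation."""
    return math.ceil(target * (1 - r1) / (r1 * (1 - target)))

# With a hypothetical single-encounter reliability of 0.35:
print(observations_needed(0.35))  # -> 8
```

Because the relation is steeply nonlinear at low r1, small improvements in the quality of a single observation sharply reduce the number of encounters that must be sampled.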
Effect of aggregation across methods (Moonen et al., 2013)

Method      Sample needed stand-alone    Sample needed as a composite
Mini-CEX    8                            5
OSATS       9                            6
MSF         9                            2
Results: reliability per instrument
All years: 8 KPB (Mini-CEX), 9 OSATS, 9 MSF
First year: 6 KPB, 6 OSATS, 6 MSF
Combined, all years: 7 KPB, 8 OSATS, 1 MSF, or 5 KPB, 6 OSATS, 2 MSF
Combined, first year: 5 KPB, 6 OSATS, 1 MSF
Messages from reliability research
• Acceptable reliability is only achieved with large samples of test elements (contexts, cases) and assessors
• No method is inherently better than any other (that includes the new ones!)
• Objectivity is NOT equal to reliability
• Many subjective judgments are pretty reproducible/reliable.
Educational impact: how does assessment drive learning?
• The relationship is complex (cf. Cilliers, 2011, 2012)
• But the impact is often very negative:
– Poor learning styles
– Grade culture (grade hunting, competitiveness)
– Grade inflation (e.g. in the workplace)
• A lot of REDUCTIONISM!
– Little feedback (a grade is the poorest form of feedback one can get)
– Non-alignment with curricular goals
– Non-meaningful aggregation of assessment information
– Few longitudinal elements
– Tick-box exercises (OSCEs, logbooks, work-based assessment).
• All learners construct knowledge from an inner scaffolding of their individual and social experiences, emotions, will, aptitudes, beliefs, values, self-awareness, purpose, and more… if you are learning…, what you understand is determined by how you understand things, who you are, and what you already know.
Peter Senge, Director of the Center for Organizational Learning at MIT (as cited in van Ryn et al., 2014)
Messages from learning impact research
• No assessment without (meaningful) feedback
• Narrative feedback has a lot more impact on complex skills than scores
• Provision of feedback is not enough (feedback is a dialogue)
• Longitudinal assessment is needed.
Overview
• From practice to research
• From research to theory
• From theory to practice
• Conclusions
Limitations of the single-method approach
• No single method can do it all
• Each individual method has (significant) limitations
• Each single method is a considerable compromise on reliability, validity, and educational impact
Implications
• Validity: a multitude of methods is needed
• Reliability: a lot of (combined) information is needed
• Learning impact: assessment should provide (longitudinal) meaningful information for learning
Programmatic assessment
• A curriculum is a good metaphor; in a program of assessment:
– Elements are planned, arranged, coordinated
– The program is systematically evaluated and reformed
• But how? (The literature provides extremely little support!)
Programmatic assessment
• Dijkstra et al., 2012: 73 generic guidelines
• To be done:
– Further validation
– A feasible (self-assessment) instrument
• ASPIRE assessment criteria
Building blocks for programmatic assessment (1)
• Every assessment is but one data point (Δ)
• Every data point is optimized for learning:
– Information-rich (quantitative, qualitative)
– Meaningful
– Variation in format
• Summative versus formative is replaced by a continuum of stakes
• The number of data points is proportionally related to the stakes of the decision to be taken.
Continuum of stakes, number of data points, and their function
No stake (one data point):
• Focused on information
• Feedback-oriented
• Not decision-oriented
Intermediate progress decisions:
• More data points needed
• Focus on diagnosis, remediation, prediction
Very high stake (final decisions on promotion or selection):
• Many data points needed
• Focused on a (non-surprising) heavy decision
Assessment information as pixels
Classical approach to aggregation
Method 1 to assess skill A → Σ
Method 2 to assess skill B → Σ
Method 3 to assess skill C → Σ
Method 4 to assess skill C → Σ
More meaningful aggregation
Methods 1 to 4 each contribute information, aggregated per skill:
Skill A → Σ
Skill B → Σ
Skill C → Σ
Skill D → Σ
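The two aggregation strategies can be sketched in code. The methods, skills, and scores below are hypothetical; the point is only that the same data points give a per-skill picture when grouped by skill rather than by method:

```python
from collections import defaultdict

# Hypothetical data points: (method, skill, score on a 10-point scale).
data_points = [
    ("Mini-CEX", "communication", 7),
    ("Mini-CEX", "clinical reasoning", 6),
    ("OSCE", "communication", 8),
    ("OSCE", "clinical reasoning", 5),
    ("MSF", "professionalism", 9),
]

def aggregate_by(points, key):
    """Average scores grouped by 'method' or 'skill'."""
    index = {"method": 0, "skill": 1}[key]
    groups = defaultdict(list)
    for point in points:
        groups[point[index]].append(point[2])
    return {name: sum(scores) / len(scores) for name, scores in groups.items()}

by_method = aggregate_by(data_points, "method")  # classical: one total per method
by_skill = aggregate_by(data_points, "skill")    # meaningful: one picture per skill
print(by_skill["communication"])  # 7.5: Mini-CEX and OSCE evidence combined
```

Grouping by skill lets evidence about, say, communication accumulate across every method that touched it, which is what makes the aggregated score meaningful for feedback and decisions.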
Overview
• From practice to research
• From research to theory
• From theory to practice
• Conclusions
From theory back to practice
• Existing best practices:
– Veterinary education, Utrecht
– Cleveland Learner Clinic, Cleveland, Ohio
– Dutch specialty training in General Practice
– McMaster Modular Assessment Program in Emergency Medicine
– Graduate entry program, Maastricht
Physician-clinical investigator program
• 4-year graduate entry program
• Competency-based (CanMEDS) with emphasis on research
• PBL program:
– Year 1: classic PBL
– Year 2: real patient PBL
– Year 3: clerkship rotations
– Year 4: participation in research and health care
• High expectations of students in terms of motivation, promotion of excellence, self-directedness
The assessment program
• Assessment in modules: assignments, presentations, end-examination, etc.
• Longitudinal assessment: assignments, reviews, projects, progress tests, evaluation of professional behavior, etc.
• All assessment is informative and low-stake, formative
• The portfolio is the central instrument
Cross-module assessment of professional behavior
Modules 1 to 4, each followed by a progress test (PT 1 to PT 4)
Longitudinal, cross-module assessment of knowledge, skills, and professional behavior
Portfolio, with a counselor meeting after each module
Cross-module assessment of knowledge in the Progress Test
Longitudinal total test scores across 12 measurement moments and predicted future performance
Maastricht electronic portfolio (ePass)
• Comparison between the score of the student and the average score of his/her peers
• Every blue dot corresponds to an assessment form included in the portfolio
Coaching by counselors
• Coaching is essential for successful use of reflective learning skills
• The counselor gives advice/comments (whether asked or not)
• He/she counsels if choices have to be made
• He/she guards and discusses study progress and the development of competencies
Decision-making by committee
• Committee of counselors and externals
• The decision is based on portfolio information, the counselor's recommendation, and competency standards
• Deliberation is proportional to the clarity of the information
• Decisions are justified when needed; a remediation recommendation may be provided
Strategies to establish trustworthiness

Criterion         Strategy                  Potential assessment strategy (sample)
Credibility       Prolonged engagement      Training of examiners
                  Triangulation             Tailored volume of expert judgment based on certainty of information
                  Peer examination          Benchmarking examiners
                  Member checking           Incorporate the learner's view
                  Structural coherence      Scrutiny of committee inconsistencies
Transferability   Time sampling             Judgment based on a broad sample of data points
                  Thick description         Justify decisions
Dependability     Stepwise replication      Use multiple assessors who have credibility
Confirmability    Audit                     Give learners the possibility to appeal the assessment decision
Progress test embedded in programmatic assessment: use of information and feedback to self-direct learning
Overview
• From practice to research
• From research to theory
• From theory to practice
• Conclusions
Conclusions 1: The way forward
• We have to stop thinking in terms of individual assessment methods
• A systematic and programmatic approach is needed, longitudinally oriented
• Every method of assessment may be functional (old and new; standardized and unstandardized)
• Professional judgment is imperative (similar to clinical practice)
• Subjectivity is dealt with through sampling and procedural bias-reduction methods (not with standardization or objectification).
Conclusions 2: The way forward
• The programmatic approach to assessment optimizes:
– The learning function (through information richness)
– The pass/fail decision function (through the combination of rich information)
Further reading: www.ceesvandervleuten.com