OUNL’s assessment model

10 January 2006

Colin Tattersall & Henry Hermans

Positioning this work in TENCompetence

WP 6, Task 3: “Develop a formal specification model and supporting tools that combines new assessment types and the ones included in the IMS QTI (providing input for standards development). The current IMS QTI specification concentrates on classical forms of assessment (multiple choice, short answer, etc.). For competence development also other assessment techniques should be included in the specification (e.g. 360 degrees feedback, portfolio assessment). During the first 18 months activities will build on initial work already done by OUNL and CITO (citogroup.nl) to specify a formal integrative model, and tools will be developed. This will provide valuable input for the further development of the IMS QTI specification.”

12 man months available in the first 18 months

Proposed in December 2005: OUNL (2), Giunti (1), CERTH (2), UB (2), UvA (2), SU (3)

Assessment vs testing

Assessment = all the systematic methods that can be used to gather information and evidence about student properties, based on a process, a product or the progress of a student, for the purposes of certification, placement or diagnoses in formative and summative contexts.

This definition includes classical tests, examinations and questionnaires, as well as newer types of assessment, such as competence-based assessment, portfolio assessment and peer assessment.

Where instruction and assessment are considered as separate activities, assessment is often referred to as ‘testing’

Taken from Joosten-Ten Brink et al (submitted)

Relevance for TENCompetence

The judgment that someone is competent in a domain can be reached by using several assessment instruments

Challenge: select the assessment types that yield the appropriate evidence

Some questions I have heard

Why not just use QTI?

Why not just use QTI+LD?

What about other initiatives, e.g. FREMA?

…

Why an(other) assessment model?

Need for exchange of assessments: development of reliable and valid assessments is time-consuming and expensive

Discrepancy between the state of the art in assessment and current standards (QTI)

Where and how is QTI lacking?

Tendency away from massive, standardized, summative testing with multiple-choice questions based on knowledge acquisition, towards assessment integrated in learning and instruction, process-based, with student involvement.

QTI focuses on those assessment types for which an unambiguous definition in technical terms can be specified.

More needed: assessment process (which steps are to be carried out and by whom), but also rationale (what is being tested and how)

What about using QTI+LD?

Can do more, but much remains implicit, not described in the language of assessment

Assessment model

[Figure: UML class diagram with the classes Item, Section, Assessment, AssessmentScenario, AssessmentPlan (attributes: assessmentFunction: int; candidateDescription, type not shown), UnitOfAssessment, DecisionRule and Trait. The annotations "qti??" and "LD??" appear in the diagram.]

Goals of original OUNL work

Develop conceptual model of assessment

Cover state of the art in assessment

Create foundation for a new standard

Document and evaluate methodology

UML as notation

Model expert knowledge

Project information

Duration: 1 year

Partners: Open University of the Netherlands, Citogroep Arnhem

Assessment experts: Citogroep, Open University, Fontys

Project phases

Initial model (expert sessions)

Internal evaluation

Several (>70) change requests

Modified version

Layered model

Broad coverage of assessments

[Figure: UML class diagram of the full assessment model; the original diagram is in Dutch. Association labels in the diagram, with English glosses: afnamevorm (delivery form), gebruikt (uses), bestaat uit (consists of), baseert beslissingen op (bases decisions on), is basis voor (is the basis for), bepaalt (determines), heeft (has), bepaalt responscodering (determines response coding), bepaalt score (determines score), beoordeelt (rates), beslist inzake (decides on), is toegepast door (is applied by), toegestaan (permitted), toegepast in (applied in), hoofdindicator (main indicator), subindicator (sub-indicator), hoort bij (belongs to), voldoet aan criteria van (meets the criteria of), is bestemd voor (is intended for), levert (delivers), neemt (takes), neemt in overweging mee (takes into consideration), betreft de uitvoering van (concerns the execution of).]

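Classes and attributes in the diagram (Dutch identifiers kept, English glosses in parentheses). Attribute types: Lijst = list, Tekst = text, Datum = date, Tijd = time, Waarde = value, Regel = rule, Tabel = table, Schaal = scale.

Toetsfunctie (assessment function): -typeBeslissing: String (type of decision); -omschrijving: String (description)

Populatie (population): -criteria: Lijst; -naam: String (name); -vooropleiding: Lijst (prior education); -taalbeheersing: Lijst (language proficiency); -disability: Lijst; -gebruikteMethoden: Lijst (methods used); -domeinBehandeld: Lijst (domain covered); -aantalStudenten: int (number of students); -opleiding: String (programme); -vak: String (subject); -OnderwerpenBehandeld: Lijst (topics covered); -niveau: String (level); -specialeKenmerken: String (special characteristics)

Item: -nummer: int (number); -creatiedatum: Datum (creation date); -geschatteAfnametijd: Tijd (estimated administration time); -maximaleAfnametijd: Tijd (maximum administration time); -betrouwbaarheid: double (reliability); -validiteit: double (validity); -rit_waarde: double (Rit value); -irt_waarde: double (IRT value); -kandidaatrol: Lijst (candidate role); -medium: String; -presentatieMedia: Lijst (presentation media); -kandidaatInstructie: Tekst (candidate instruction)

Indicator: -omschrijving: Tekst (description)

Prompt

AssessmentItemBeoordelingsvoorschrift (assessment item rating instruction), association class: -gewicht: int (weight)

Scoringsvoorschrift (scoring prescription): -categorieCodering: Waarde (category coding); -coderingsregel: Regel (coding rule); -categorieScore: Waarde (category score); -scoringsregel: Regel (scoring rule)

Toetsplan (assessment plan): -matrijs: Tabel (test matrix); -omstandigheden: Tekst (conditions); -preconditie: Tekst (precondition)

Unit of Assessment: -titel: Tekst (title); -geschatteAfnametijd: Tijd (estimated administration time); -maximaleAfnametijd: Tijd (maximum administration time); -matrijs: Tabel (test matrix); -samenstellingsregels: Regel (composition rules); -minPersonen: int (minimum number of persons); -maxPersonen: int (maximum number of persons); -presentatieMedia: Lijst (presentation media); -kandidaatrol: Lijst (candidate role); -kandidaatInstructie: Tekst (candidate instruction)

AssessmentItem, association class: -volgordenummer: int (sequence number); -gewicht: double (weight)

Beslisvoorschrift (decision rule): -beslisregels: Lijst (decision rules)

Kandidaat (candidate): -Identificatie: String (identification); -naam: String (name)

AssessmentAfname (assessment take), association class: -datum: Datum (date); -aanvang: Tijd (start); -einde: Tijd (end); -kandidaatrol: String (candidate role); -presentatieMedium: String (presentation medium)

ItemRespons (item response), association class

Beoordelaar (assessor): -Identificatie: String (identification); -naam: String (name)

IndicatorResponsCodering (indicator response coding), association class: -gecodeerdeResponse: Waarde (coded response); -indicatorScore: Waarde (indicator score)

Beslisser (decision maker): -naam: String (name)

SelectieFormat (selection format): -gokkans: double (guessing probability)

DemonstratieFormat (demonstration format)

ConstructieFormat (construction format)

Hint: -Moment: int

Terugkoppeling (feedback): -uitspraak: String (statement); -toelichting: String (explanation); -referentie: String (reference)

Casuspositie (case): -nummer: int (number)

Beoordelingsvoorschrift (rating instruction), association class: -gewicht: double (weight); -beoordelingscriteria: Lijst (rating criteria); -definitieCriterium: Tekst (criterion definition); -prestatieschaal: Schaal (performance scale)

AssessmentIndicatorScore: -indicatorScore: Waarde (indicator score)

Toetskader (assessment framework): -toetsvisie: Tekst (assessment vision)

ToetsplanIndicator (assessment plan indicator), association class: -norm: double; -schaal: Schaal (scale); -scoreBerekeningsregel: Regel (score calculation rule)

Persoon (person): -vooropleiding: tekst (prior education); -persoonlijkOntwPlan: tekst (personal development plan); -persoonlijkActPlan: tekst (personal activity plan)

Groep (group)

AssessmentScenario, association class: -volgorde: int (order); -gewicht: double (weight); -maxDoorlooptijd: Tijd (maximum completion time); -verplichtIndicatie: int (mandatory indication)

AfnameOrganisatie (delivery organisation): -regelingen: Tekst (regulations); -Afnameinstructie: Tekst (administration instruction); -benodigdHulpmiddel: Lijst (required aids); -toegestaanHulpmiddel: Lijst (permitted aids)

Assessmentvorm (assessment form)

Beslissing (decision), association class: -besluit: Waarde (verdict)

Note in the diagram: see the further elaboration in ResponsModus.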

Towards a bird’s-eye view....

Phases in the assessment process

Assessment Design

ItemConstruction

AssessmentConstruction

AssessmentDelivery

ResponsEvaluation

DecisionMaking
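The six phases named above can be read as an ordered pipeline. The following is a minimal Python sketch under the assumption that the phases run strictly in the order listed; the slide does not say whether iteration between phases is allowed. The names `Phase` and `next_phase` are hypothetical and not part of the model.

```python
from enum import Enum
from typing import Optional


class Phase(Enum):
    """Phases of the assessment process, in the order listed on the slide."""
    ASSESSMENT_DESIGN = 1
    ITEM_CONSTRUCTION = 2
    ASSESSMENT_CONSTRUCTION = 3
    ASSESSMENT_DELIVERY = 4
    RESPONSE_EVALUATION = 5   # "ResponsEvaluation" on the slide
    DECISION_MAKING = 6


def next_phase(phase: Phase) -> Optional[Phase]:
    """Return the phase that follows `phase`, or None after the last one.

    Assumption: a strictly linear ordering, which the source does not state.
    """
    members = list(Phase)
    index = members.index(phase)
    return members[index + 1] if index + 1 < len(members) else None
```

The later slides each zoom in on one of these phases, so an enumeration like this could serve as an index into the per-phase parts of the model.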

Assessment design

[Figure: UML class diagram of the assessment design phase with the classes AssessmentPolicy, AssessmentFunction, Population, Candidate, AssessmentPlan, Indicator, AssessmentDefinition, AssessmentScenario, DecisionRule and Trait.]

Item construction

[Figure: UML class diagram of the item construction phase with the classes Indicator, Item, Prompt, Case, ResponsMode, Hint, Feedback and RatingInstruction.]

Assessment construction

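[Figure: UML class diagram of the assessment construction phase with the classes Item, AssessmentDefinition, AssessmentItem (attributes: scoringPrescription: text; indicator, type not shown) and UnitOfAssessment. One association is labelled "is based on".]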
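The classes on this slide can be sketched as plain data structures. The sketch below is only an illustration: it assumes that AssessmentItem links an Item to a UnitOfAssessment and carries the sequence number (`volgordenummer`) and weight (`gewicht`) listed for it in the Dutch class diagram. The `prompt` field and the methods `add` and `ordered_items` are hypothetical additions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    number: int   # "nummer" in the Dutch class diagram
    prompt: str   # hypothetical stand-in for the Prompt class


@dataclass
class AssessmentItem:
    """Places one Item inside one UnitOfAssessment."""
    item: Item
    sequence_number: int            # "volgordenummer"
    weight: float                   # "gewicht"
    scoring_prescription: str = ""  # "scoringPrescription: text"


@dataclass
class UnitOfAssessment:
    title: str  # "titel"
    assessment_items: List[AssessmentItem] = field(default_factory=list)

    def add(self, item: Item, sequence_number: int, weight: float = 1.0) -> None:
        """Attach an item; the same Item may be reused by other units."""
        self.assessment_items.append(AssessmentItem(item, sequence_number, weight))

    def ordered_items(self) -> List[Item]:
        """Return the items in ascending sequence number (assumed delivery order)."""
        ranked = sorted(self.assessment_items, key=lambda ai: ai.sequence_number)
        return [ai.item for ai in ranked]
```

Keeping order and weight on the link rather than on the Item is what would let one item appear in several units of assessment with a different position and weight in each.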

Assessment run

[Figure: UML class diagram of the assessment run with the classes Item, UnitOfAssessment, AssessmentTake, Candidate and ItemRespons.]

Response processing (1)

[Figure: UML class diagram with the classes Assessor, ItemRespons (attributes: respons, type not shown; itemScore: int) and ResponsCode (attribute: rubricScore, type not shown).]
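One possible reading of this slide is that an assessor attaches one or more ResponsCode entries to an ItemRespons and that the item score is derived from their rubric scores. The Python sketch below assumes a simple sum; the slide does not give the scoring rule, and the `indicator` label, the `rate` method and the attribute types other than `itemScore: int` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ResponsCode:
    indicator: str     # hypothetical label for what the code refers to
    rubric_score: int  # "rubricScore" (type not given on the slide; int assumed)


@dataclass
class ItemRespons:
    respons: str                                           # type not given; str assumed
    codes: List[ResponsCode] = field(default_factory=list)
    item_score: int = 0                                    # "itemScore: int"

    def rate(self, codes: List[ResponsCode]) -> int:
        """Store the assessor's codes and derive the item score.

        Assumption: the item score is the plain sum of the rubric scores.
        """
        self.codes = list(codes)
        self.item_score = sum(code.rubric_score for code in self.codes)
        return self.item_score
```

A real scoring prescription could replace the sum with any coding and scoring rule; the sum is only the simplest placeholder.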

Response processing (2)

[Figure: UML class diagram with the classes UnitOfAssessment, Candidate, Indicator and IndicatorScore.]
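The slide relates an IndicatorScore to a candidate and an indicator within a unit of assessment. As a hedged illustration, the sketch below aggregates item scores into one score per candidate and indicator using a weighted sum. The weighted sum, the function name `indicator_scores` and the tuple layout are assumptions; the source does not specify the score calculation rule.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (candidate id, indicator name, item score, item weight): a hypothetical layout
ScoredItem = Tuple[str, str, float, float]


def indicator_scores(scored_items: Iterable[ScoredItem]) -> Dict[Tuple[str, str], float]:
    """Aggregate item scores into one score per (candidate, indicator) pair.

    Assumption: an indicator score is the weighted sum of the item scores.
    """
    totals: Dict[Tuple[str, str], float] = defaultdict(float)
    for candidate, indicator, item_score, weight in scored_items:
        totals[(candidate, indicator)] += item_score * weight
    return dict(totals)
```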

Decision making

[Figure: UML class diagram with the classes Decision (attributes: verdict, decisionMaker, explanation; types not shown), DecisionRule, AssessmentPlan and Candidate.]
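To make the decision-making step concrete, the sketch below models a Decision with the three attributes shown on the slide and applies one illustrative decision rule: a candidate passes only if every indicator score meets its norm. The pass/fail rule, the function name `decide` and the attribute types are assumptions; the model itself leaves the content of a DecisionRule open.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Decision:
    verdict: str         # type not given on the slide; str assumed
    decision_maker: str  # "decisionMaker"
    explanation: str


def decide(scores: Dict[str, float], norms: Dict[str, float], decision_maker: str) -> Decision:
    """Apply a hypothetical decision rule: pass only if every norm is met.

    A missing indicator score is treated as 0.0 (an assumption).
    """
    failed = sorted(name for name, norm in norms.items() if scores.get(name, 0.0) < norm)
    if failed:
        return Decision("fail", decision_maker, "below norm on: " + ", ".join(failed))
    return Decision("pass", decision_maker, "all indicator norms met")
```

Recording the explanation next to the verdict keeps the rationale for a decision available, which is one of the things the deck says is implicit in QTI alone.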

To be discussed

Is it already done then?

Evaluation

What about other initiatives (FREMA)?

Next steps

How might this integrate into the TENCompetence infrastructure?

Tooling?

Relationship with IMS

TENCompetence explicitly mentions a new QTI version (although 2.1 is underway)

Documents
