Workshop 1: Item design for assessments involving ... 1.pdf · Outline Part 1: Wherefore assessments involving collaboration? I Set up the current perspective: performance assessments

Workshop 1: Item design for assessmentsinvolving collaboration

Peter F. Halpin

April 18, 2016

github.com/peterhalpin/BearShare

1 / 46

github.com/peterhalpin/BearShare

Outline

Part 1: Wherefore assessments involving collaboration?

I Set up the current perspective: performance assessments

I Uses of collaboration in assessment contexts

Part 2: Item design for collaboration

I What’s wrong with using “one-player” items for collaboration?

I Some examples of two-player items obtained from one-playeritems

Part 3: OpenEdx and CPSX for building your own collaborativeassessments

I Git repo: github.com/ybergner/cpsx.gitI Live version: collaborative-assessment.org

2 / 46

github.com/ybergner/cpsx.git

collaborative-assessment.org

Outline









3 / 46



Outline









4 / 46



Part 1: Why?

I 21st-century skills, non-cognitive skills, soft skills,hard-to-measure skills, social skills, ...

I Theme: traditional educational tests target a relatively narrowset of constructs

I Analyses of US labour markets indicate that such skills arevalued by employers (Burrus et al. 2013, Deming 2015)

I There is a salient demand for assessments of a broader rangeof student competencies

5 / 46

Part 1: Why?





6 / 46

Part 1: Why?





7 / 46

With apologies to Dr. Duckworth...

upenn.app.box.com/8itemgrit

8 / 46

upenn.app.box.com/8itemgrit

Self-reports

I Self-report measures often do not require the respondent toexhibit the skills about which we wish to make inferences

! Unsuitable for supporting consequential decisions ineducational settings1

1cf. Duckworth, & Yeager (2015). Measurement matters: Assessing personal qualities other than cognitive

ability for educational purposes. Educational Researcher, 44(4), 237-251.

9 / 46

Educational assessments

, Reliability and generalizability in traditional content domains

/ Current psychometric models don’t seem entirely appropriateto “next generation assessments”

I e.g., IRT models don’t use process data

/ Collateral damage: teaching to the test, test anxiety,bubble-filling, ...

I NY opt-out movement: 20% of students (parents) boycottedstate test last year2

2www.wnyc.org/story/

new-york-city-students-make-modest-gains-state-tests-opt-out-numbers-triple/

10 / 46

www.wnyc.org/story/new-york-city-students-make-modest-gains-state-tests-opt-out-numbers-triple/










11 / 46











12 / 46



Performance assessments3

3Davey, Ferrara, Holland, Shavelson, Webb, & Wise (2015). Psychometric Considerations for the Next

Generation of Performance Assessment. Princeton, NJ. p. 10

13 / 46

Collaboration as a modality of performance assessment

I Small group interactions are a highly-valued educationalpractice

I The Jigsaw Classroom (Aronson et al. 1978; jigsaw.org)

I Group-worthy tasks (Cohen et al. 1999)

I The use of information technology to support studentcollaboration is well established

I CSCL (e.g., Hmelo-Silver et al. 2013)

I The use of group work in assessment contexts has a relativelylong-standing history

I Webb 1995, 2014

14 / 46

jigsaw.org

Collaboration as a modality of performance assessment

I Small group interactions are a highly-valued educationalpractice

I The Jigsaw Classroom (Aronson et al. 1978; jigsaw.org)

I Group-worthy tasks (Cohen et al. 1999)

I The use of information technology to support studentcollaboration is well established

I CSCL (e.g., Hmelo-Silver et al. 2013)

I The use of group work in assessment contexts has a relativelylong-standing history

I Webb 1995, 2014

15 / 46

jigsaw.org

Webb 19954

1 Purpose of the assessment

2 The goal of the group work

3 Processes of the group work

4Group Collaboration in Assessment: Multiple Objectives, Processes, and Outcomes. Educational Evaluation

and Policy Analysis, 17(2), p. 241

16 / 46

Webb’s three-part theory of collaboration in assessments

Purpose of assessment

1. Individual learning after collaboration2. Group productivity3. Students’ ability to work together

Goals of group work1. Individual learning2. Group productivity

Group processes1. Co-construction of ideas2. Giving and receiving help3. Conflict and cooperation4. Equality of participation5. Social loafing6. Division of labour

17 / 46






18 / 46






19 / 46






20 / 46






21 / 46






22 / 46

Is this possible?





23 / 46

Part 2: Item design

I PISA 2015 CPS5

I Various item types: jigsaw / information sharing; consensusbuilding; negotiation

I Problems were interactive, but with simulated collaboration(deterministic computer agent)

I Problems were not designed to assess content knowledge in atraditional domain – “below grade level”

5Organisation for Economic Co-operation and Development. (2013). PISA 2015 Draft Collaborative Problem

Solving Framework. Retrieved fromhttp://www.oecd.org/pisa/pisaproducts/DraftPISA2015CollaborativeProblemSolvingFramework.pdf

24 / 46

Part 2: Item design

I PISA 2015 CPS5

I Various item types: jigsaw / information sharing; consensusbuilding; negotiation

I Problems were interactive, but with simulated collaboration(deterministic computer agent)

I Problems were not designed to assess content knowledge in atraditional domain – “below grade level”

5Organisation for Economic Co-operation and Development. (2013). PISA 2015 Draft Collaborative Problem

Solving Framework. Retrieved fromhttp://www.oecd.org/pisa/pisaproducts/DraftPISA2015CollaborativeProblemSolvingFramework.pdf

25 / 46

Item design

I ATC21S6

I 4 prototype tasks7 (3 similar to what I will discuss)

I Problems were interactive, and with interactions between realstudents

I Problems were designed for learning as well as assessment intarget domains

I This doesn’t really let YOU design tasks for collaboration

6Gri�n & Care (2015). Assessment and teaching of 21st century skills: Methods and approach

7http://www.atc21s.org/uploads/3/7/0/0/37007163/pd_module_3_nonadmin.pdf

26 / 46

http://www.atc21s.org/uploads/3/7/0/0/37007163/pd_module_3_nonadmin.pdf

Item design

I ATC21S6

I 4 prototype tasks7 (3 similar to what I will discuss)

I Problems were interactive, and with interactions between realstudents

I Problems were designed for learning as well as assessment intarget domains

I This doesn’t really let YOU design tasks for collaboration

6Gri�n & Care (2015). Assessment and teaching of 21st century skills: Methods and approach

7http://www.atc21s.org/uploads/3/7/0/0/37007163/pd_module_3_nonadmin.pdf

27 / 46

http://www.atc21s.org/uploads/3/7/0/0/37007163/pd_module_3_nonadmin.pdf

Item design

I DIY with CPSX8

I Built on OpenEdx

I Open source LMS, easy to set up (locally or AWS), activedevelopment

I 40+ built-in problem types with automatic grading andadaptive hints9

I Traditional (MC, NR); interactive; customizable (xml, JS)

I CPSX allows for small group chat during problem solving

I Currently in its infancy – e.g., no screen sharing / sharedmanipulables

8https://github.com/ybergner/cpsx.git

9http://docs.edx.org

28 / 46

https://github.com/ybergner/cpsx.git

http://docs.edx.org

Goals of the current approach

1 Make use of tech that’s already available and free (speech andbeer)

2 Tasks clearly anchored in an educational content domain

I Specifically math

3 Navigate “minimal design” vs group-worthy tasks

I Working from psychometric theory towards best practices ingroup work, rather than the other way around

29 / 46

Modifying one-player items

I How can a conventional mathematics test question beadapted to a collaborative context?

I Recipes for creating collaborative tasks from existingassessment materials

I Not an ideal approach to designing group-worthy tasks

I However, retain the strengths of existing assessments

I e.g., Calibrate the one-player version to anchor the domaindi�culty of the two-player version

30 / 46

Standard items

I Change in the instructions while retaining the assessmentmaterials

I e.g., two students, one copy of a math test, evaluationdepends only on what they record on the test form

, Minimal design

/ Group-worthy tasks

31 / 46

Jigsaw / information sharing items

32 / 46


33 / 46


34 / 46


35 / 46

Hints / information requesting items

36 / 46


37 / 46


38 / 46

Multiple answer / negotiation items

39 / 46

A note on process loss10

I Process loss as the discrepancy between potential and actualperformance

I Actual Productivity = Potential Productivity - Process Loss

I Two main sources: motivation and coordination, often treatedas functions of group size

I For our two-player items, process loss due to coordination canbe thought of as an item characteristic

10Steiner (1972) Group processes and productivity

40 / 46

A IRF modified for process loss

I Assume 2PL for one-player items i

logit(pij) = ↵i ✓j + �i

I Process loss for two-player version i0:

logit(pi0j) = ↵i ✓j + �i + �i0

41 / 46

Viable designs for estimating process loss

I Independent samples or repeated measures withcounterbalanced forms where:

I An individual assessment serves as the calibration sample foritem i

I A collaborative assessment serves as the calibration sample foritem i0

I WithP

i0 �i0 = 0 to identify E[✓j ] in the collaborative sample

I But what is ✓j in the collaborative sample? More on thattomorrow!

42 / 46

Viable designs for estimating process loss

I Independent samples or repeated measures withcounterbalanced forms where:

I An individual assessment serves as the calibration sample foritem i

I A collaborative assessment serves as the calibration sample foritem i0

I WithP

i0 �i0 = 0 to identify E[✓j ] in the collaborative sample

I But what is ✓j in the collaborative sample? More on thattomorrow!

43 / 46

OpenEdx / CPSX demo

I LMS (Students): collaborative-assessment.org

I CMS (Content developer): collaborative-assessment.org:18010

I Login credentials: For (i in 100 : 120)

I email: [email protected] password: useri

44 / 46


collaborative-assessment.org:18010

Contact: [email protected]

Collaborators: Yoav Bergner, ETS; Jacqueline Gutman, NYU

Support: This research was funded by a postdoctoral fellowship from theSpencer Foundation and an Education Technology grant from NYU Steinhardt.

45 / 46

Aronson, E., Blaney, N., Stephan, C., Sikes, J., Snapp, M. (1978). The jigsaw classroom. Beverly Hills, CA: Sage.

Cohen, E. G., Lotan, R. A., Scarloss, B. A., & Arellano, A. R. (1999). Complex instruction: Equity in cooperativelearning classrooms. Theory Into Practice, 38, 80-86.

Davey, T., Ferrara, S., Holland, P. W., Shavelson, R. J., Webb, N. M., Wise, L. L. (2015). PsychometricConsiderations for the Next Generation of Performance Assessment. Princeton, NJ.

Deming, D. J. (2015). The Growing Importance of Social Skills in the Labor Market. National Bureau of EconomicResearch Working Paper Series, (21473).

Gri�n, P., Care, E. (2015). Assessment and teaching of 21st century skills: Methods and approach. New York:Springer.

Hmelo-Silver, C. E., Chinn, C. A., Chan, C. K., & O?Donnel, A. M. (2013). International handbook ofcollaborative learning. New York: Taylor and Francis.

Organisation for Economic Co-operation and Development. (2013). PISA 2015 Draft Collaborative ProblemSolving Framework. Retrieved fromhttp://www.oecd.org/pisa/pisaproducts/DraftPISA2015CollaborativeProblemSolvingFramework.pdf

Webb, N. M. (1995). Group Collaboration in Assessment: Multiple Objectives, Processes, and Outcomes.Educational Evaluation and Policy Analysis, 17(2), 239-261.

46 / 46

Documents

Workshop 1: Item design for assessments involving ... 1.pdf · Outline Part 1: Wherefore assessments involving collaboration? I Set up the current perspective: performance assessments