Can I Do This In My Pajamas?
Validation Studies Going Virtual
Hillary Michaels
William Lorié
Michaela Gelin
Susan Davis-Becker
Chad Buckendahl
2011 National Conference on Student Assessment
June 19-22, 2011, Orlando, Florida (www.ccsso.org)
Next Generation Learners: Who are they and what are their needs?
Setting Standards Remotely:
Conditions for Success
William Lorié, Ph.D.
Metrica Research Associates, LLC
Five Questions
• Where We Are Now
▫ Why conduct a standard setting?
▫ Why do people meet in person for standard settings?
• Going Virtual
▫ Is remote suitable for standard setting?
▫ When are panelists ready to make standard setting judgments?
▫ When should test sponsors allow standards to be set remotely?
The Logic of Standard Setting
1. No technical analysis of the test will yield a readily recognizable or acceptable cut score.
2. Each stakeholder representative (panelist) has or can form an opinion regarding minimally acceptable performance.
3. That opinion can be shared, discussed, and modified.
4. A suitably designed process will transform all panelist opinions into a defensible cut score on the test.
Standard Setting Essentials
Panelists understand the rules of the process
Panelist opinions are formed and shared
Panelists record their opinions according to the rules of the process
Why meet face-to-face?
• Reasons unrelated to standard setting goals
▫ Professional development
▫ Networking with colleagues
• Goal-related
▫ Taking the test
▫ Training
▫ Reviewing (secure) materials
▫ Engaging in large and/or small group discussions
Three Types of Goal-Related Standard-Setting Tasks
• One-to-many
▫ Orientation, training, providing feedback
• Individual
▫ Taking the test, reviewing materials, making judgments, making ratings, completing in-process and final evaluations
• Many-to-many
▫ Writing or revising descriptors, discussing the borderline candidate, discussing ratings, discussing feedback
Intra-Panel Elements in Context
[Diagram: example sequences of intra-panel elements, read as three flows: Train → Review → Discuss → Debrief; Train → Assess → Judge → Inform → Discuss; Train → Judge → Inform → Assess → Debrief]
Five Questions, Revisited
• Where We Are Now
▫ Why conduct a standard setting?
▫ Why do people meet in person for standard settings?
• Going Virtual
▫ Is remote suitable for standard setting?
▫ When are panelists ready to make standard setting judgments?
▫ When should test sponsors allow standards to be set remotely?
The Most Serious Objections
• Test and data security at risk
▫ Relative exposure: Test Development < Remote Standard Setting < Testing
▫ Relationship between panelists and sponsoring agency is key
• Discussions not “rich enough” if not in person
▫ Descriptor writing ≠ standard setting
▫ Remote discussions may be more focused
• Cannot discern if participants are on task
▫ Problem is mode-independent
▫ Reducing session lengths helps
Managing Remote Many-to-Many Processes
• Reliable meeting system
▫ E.g., Conference or video call among several people, with a facilitator or group leader
▫ Low chance of interruptions
• Rules for turn-taking and strict schedules
• (Possibly) System for drawing attention to materials
When are panelists ready to make standard setting judgments?
• Understand the objective referent on which to base their judgments
• Understand the procedure for making judgments and have practiced it
• Understand the legitimate reasons for modifying their judgments and the likely effect of specific modifications
Is remote standard setting right for your program? (✔ = Yes)
• Does my program plan to release a practice or sample test form? ✔
• Can I entrust my panelists with secure materials? ✔
• Can I proceed to standard setting voting with very little or no descriptor refinement? ✔
• Can my panelists work with remote meeting technologies? ✔
• Does my organization have the necessary technologies and technical support? ✔
Virtual and Face-to-face Standard Setting: A Blended Model
Michaela Gelin, Ph.D.
CGA Canada
Hillary Michaels, Ph.D.
Consultant
Overview
• Why use a blended model for standard setting?
• Overview of examination
• Modified Angoff standard setting
▫ Virtual training
▫ Virtual Round 1 ratings
▫ Face-to-face discussion & Round 2 ratings
• Feasibility for other standard settings
• Security
• Future directions
Why Use A Blended Model?
• Economic and logistical benefits
• Convenient for panelists; panelists complete the work anywhere, anytime
▫ Advantage: panelists more likely to participate
• Digital media have made it possible
▫ Easily collaborate across distances
▫ Eliminates the need to email files; improves security for sensitive documents
Examination & Standard Setting
• Certification exam (4 hours)
▫ 20 MCQs (multiple-choice questions)
▫ 3 CRQs (constructed-response questions)
• Exam assesses technical competencies & professional qualities and skills (e.g., Problem Solving, Communication, Integrative Approach)
• Modified Angoff standard setting process is used to determine a single Pass/Fail cut score for the exam
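For context, the arithmetic behind a Modified Angoff cut score is simple to sketch. Below is a minimal illustration in Python with invented ratings and panel size; it is not CGA Canada's actual computation, only the standard aggregation (mean rating per item, summed across items):

```python
# Illustrative Modified Angoff aggregation (hypothetical data).
# Each panelist rates each MCQ with the percentage of minimally
# competent candidates expected to answer it correctly.

# rows = panelists, columns = MCQ items (percentages, 0-100)
mcq_ratings = [
    [70, 55, 80, 60, 65],   # Panelist 1
    [75, 50, 85, 55, 70],   # Panelist 2
    [65, 60, 75, 65, 60],   # Panelist 3
]

n_items = len(mcq_ratings[0])

# Mean rating per item, converted from a percentage to an expected item score
item_means = [
    sum(panelist[i] for panelist in mcq_ratings) / len(mcq_ratings) / 100.0
    for i in range(n_items)
]

# Raw cut score = expected total score of the borderline candidate
mcq_cut = sum(item_means)
print(f"MCQ cut score: {mcq_cut:.2f} of {n_items} points")
```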
Virtual Training
• Web-based meeting software
• Share and view documents (PowerPoint, Excel spreadsheets) with panelists in real time
▫ Panelists connect on their computers through a web browser
▫ Everyone sees the same thing (e.g., PowerPoint presentation) at the same time
• Phone conferencing
▫ The facilitator talks while sharing documents
▫ Q&A encouraged during the presentation
Training (cont.)
• Training includes
▫ The purpose and process of standard setting
▫ Discussion and consensus about minimally-competent performance (i.e., borderline candidates)
▫ Factors for making ratings
▫ Practice ratings
• One-on-one follow-up training is provided for panelists who need additional support
Round 1 Virtual Ratings
• Exam and keys couriered, with signature required on delivery
• Panelists take the exam at home under exam-like conditions and self-assess their performance
• Panelists use a secure portal site to complete Round 1 ratings in Excel
• Panelists given ample time (1 week) to complete Round 1 ratings
• Round 1 ratings compiled for sharing at the face-to-face meeting
Round 1 MCQ Excel Ratings
Standard Setting Round 1 Rater Data Sheet. Enter your initials below.
For each question: What percentage of minimally-competent candidates will answer this question correctly? (Enter a percentage from 0-100.)

Question #   Competency Area   Competency   Rating
1            MA                MA:03
2            IT                IT:01
3            AS                AS:03
4            BE                BE:06
5            ET                ET:03
Round 1 CRQ Excel Ratings
Standard Setting Round 1 Rater Data Sheet. You need to provide percentage estimates that total to 100%.

Core and Core-Related competencies
For each question: What percentage of minimally-competent candidates will demonstrate the competency at each of the levels indicated?

Question #   Competency Area   Competency   Level 0   Level 1   Level 2   Level 3   Level 4   Sum   Mean
20           MA                MA:01                                                           0     0
21           MA                MA:02                                                           0     0
22           FAR               FAR:08                                                          0     0
23           FAR               FAR:02                                                          0     0
(Sum and Mean are computed automatically; the blank template shows 0.)

Professional Qualities and Skills
For each question: What percentage of minimally-competent candidates will demonstrate competent performance on this professional quality and skill? (Enter a percentage from 0-100.)

Question #   Competency Area        Competency
28           Communication          CM:01
29           Integrative Approach   IA:02
30           Problem Solving        PS:01
31           Leadership             LD:01
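The Sum and Mean cells above reflect the quality-control features built into the Excel templates (see Conditions for Success, below). A minimal sketch of the equivalent checks; the function names and tolerance are invented for illustration:

```python
# Illustrative quality-control checks mirroring the rating sheets above.
# Names and tolerances are assumptions, not the actual Excel formulas.

def check_mcq_rating(pct: float) -> bool:
    """MCQ ratings must be a percentage from 0 to 100."""
    return 0 <= pct <= 100

def check_crq_levels(level_pcts: list[float]) -> bool:
    """CRQ level estimates (levels 0-4) must total 100%."""
    return abs(sum(level_pcts) - 100.0) < 1e-9

assert check_mcq_rating(65)
assert not check_mcq_rating(120)                   # out of range: flag it
assert check_crq_levels([10, 20, 40, 20, 10])
assert not check_crq_levels([10, 20, 40, 20, 5])   # totals 95, not 100
```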
Round 2 Face-to-face Ratings
• Review standard setting process, assessment program, performance level descriptors
• Discuss exam-taking experience and factors that impact performance
• Repeated process: For each rating, review Round 1, discuss similarities and differences, review candidate data, and make final rating individually
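One common form of "candidate data" reviewed at this step is impact data: the share of candidates who would pass at a proposed cut. A minimal sketch with fabricated scores:

```python
# Illustrative impact data: pass rate at a proposed cut score.
# Scores and cut are fabricated for the example.
candidate_scores = [12.5, 14.0, 9.0, 16.5, 11.0, 15.0, 13.5, 10.5]
proposed_cut = 12.0

pass_rate = sum(s >= proposed_cut for s in candidate_scores) / len(candidate_scores)
print(f"{pass_rate:.0%} of candidates would pass at a cut of {proposed_cut}")
```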
Round 2 Ratings: Google Docs
• Google Docs provides real-time data collection, collaboration, and immediate feedback (e.g., average panel rating; a sketch follows this list)
• Ratings are made one at a time, with no reference to individual questions, content, or identifying information
• Facilitator transfers the content (ratings) of the Google Docs spreadsheet to a secure hard drive
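The immediate-feedback step referenced above amounts to a running aggregation as ratings arrive. A minimal illustration; the data structures are invented, and the Google Docs mechanics themselves are not shown:

```python
# Illustrative real-time feedback: as each Round 2 rating arrives,
# recompute and show the running panel average for the current item.
# This stands in for the shared-spreadsheet behavior; it does not
# use any Google Docs API.

current_item_ratings: list[float] = []

def record_rating(pct: float) -> float:
    """Append a panelist's rating and return the updated panel average."""
    current_item_ratings.append(pct)
    return sum(current_item_ratings) / len(current_item_ratings)

for rating in (70, 65, 80):   # three panelists rate in turn
    print(f"Panel average so far: {record_rating(rating):.1f}%")
```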
Conditions for Success
• Virtual web-based training for the group, including practice ratings
• A helpline for panelists needing additional support
• Thoughtful Round 1 ratings made individually
• Round 2 ratings scheduled shortly after completion of Round 1
• Quality control features built into Excel templates for ratings
• Final ratings and discussion made in person
Security
• Pending signed confidentiality agreements, exam and keys couriered with signature required
• Standard setting begins the day after the exam window ends
• All hard copies collected at the face-to-face meeting
• Use of a secure, password-protected portal for Round 1 collection
• Spreadsheet for capturing Round 2 ratings has no identifiable information in terms of content (e.g., exam, question)
Applicability to other standard settings
• Especially useful if the exam is lengthy and requires thoughtful analysis
• Minimizes the length of face-to-face meetings, thereby saving time and money
• Provides highly flexible schedule for completion of Round 1 ratings (submitted via portal)
• Can conduct multiple training sessions if needed with larger panels
• In this case, Excel spreadsheets were used for both dichotomous and partial-credit items
Future Directions
• Enhance test security: administer the exam via computer rather than sending hard copies out
• Training sessions: record and store on the web for panelist reference
• Improve rating spreadsheet to allow panelists to record their rationales for their ratings and aid recall for Round 2 ratings
Virtual Item Writing and Review
Susan Davis-Becker, Ph.D.
Alpine Testing Solutions
Traditional Item Writing and Review
• Multiple in-person meetings
• Item Writing
▫ Substantial training
▫ Practice exercises
▫ Focused item writing over days
▫ Ongoing mentoring
• Item Review
▫ Peer feedback
▫ Group discussion
Virtual Item Writing Training
• Advance materials
• Deliver training through web-based software
• Focus on core elements
• Link training to materials
• Include example items that can be reviewed by the group
• Demonstrate item writing tool
Virtual Item Writing Process
• Initial assignment
▫ Write 2-3 items to an assigned objective
▫ Facilitator reviews and provides targeted feedback
▫ Item writer completes edits
• Ongoing item writing
▫ Larger item writing assignments
▫ Facilitator monitors progress and provides feedback
Virtual Item Content Review
• Web-based software
• Facilitator role
▫ display item content
▫ ask specific review questions
▫ make changes to items in real time so the group can approve the final item
Virtual Post-Pilot Item Review
• Web-based software
• Display item content and results of pilot analysis
• Facilitator role
▫ display item content
▫ aid in interpretation of analysis results
▫ facilitate discussion on the appropriateness of the item; if not appropriate, guide SMEs in revising the item based on analysis results and make changes in real time so the item can be re-piloted
Virtual Post-Pilot Item Review: Example
Objective: Solve word problems involving subtraction of small whole numbers.
Two sparrows and three chipmunks are resting in a sequoia. One chipmunk runs down the sequoia and hides in a shrub.
How many mammals are left in the tree?
a) 1
b) 2
c) 4
d) 5
P-value = .15; Item-score correlation = -.05

Option analysis:
Response   P-value   0-25%   25-49%   50-74%   75-100%
A          .20       .02     .10      .03      .05
B*         .15       .07     .06      .03      0
C          .50       .03     .07      .15      .25
D          .15       .01     .07      .04      .03
(* keyed response)
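For reference, a minimal sketch of how the two summary statistics above are commonly computed from scored responses. The data are fabricated, and "item-score correlation" is implemented as a Pearson item-total correlation, one common choice:

```python
# Illustrative p-value (proportion correct) and item-total correlation.
# Response data are fabricated; a flawed item like the example tends to
# show a low p-value and a negative item-total correlation.
from statistics import mean, pstdev

item_scores  = [0, 1, 0, 1, 1, 0, 0, 0]          # 1 = correct on this item
total_scores = [30, 12, 25, 28, 10, 27, 22, 26]  # total test scores

p_value = mean(item_scores)

def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (pstdev(x) * pstdev(y))

print(f"p-value: {p_value:.2f}")
print(f"item-score correlation: {pearson(item_scores, total_scores):.2f}")
```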
Virtual Process - Advantages
• Recruitment
▫ No travel
▫ Item writing: smaller fixed time commitment; most work can be done on one's own schedule
▫ Item review: work is conducted over several shorter meetings; some experts may be able to contribute to some but not all meetings
• Less pressure to create all the items in a fixed time frame, reducing fatigue
• Allows time for iterative feedback to hone item-writing skills during the process
Virtual Process - Disadvantages
• Greater security concerns
• Potential for less focus on the process
• Potential for less collaboration among group
• Longer-term commitment
Applicability to other item writing settings
• Technology is available to support virtual item writing and review
• Subject Matter Experts who are
▫ Physically spread out
▫ Comfortable working independently
▫ Comfortable working with technology
• Test development plan allows for longer item writing process
Future Directions
• Improve software for virtual item writing
• Research comparing quality of items by development mode
Additional Blended Development and Validation Activities
Chad W. Buckendahl, Ph.D.
Alpine Testing Solutions
Overview
• “Can” versus “Should”
• Other validation activities can include blended methods of evidence collection
▫ Content specification (e.g., practice analysis)
▫ Alignment studies (e.g., assessments to content standards)
▫ Evaluating consequences (e.g., curricular impact, teacher motivation, student achievement)
Content Specification
• Defining content domain, cognitive processes, and performance expectations
• Stakeholder group involvement
▫ Focus groups
▫ Working committees
▫ Surveys of practitioners
Alignment studies
• Evaluating representation of content, cognitive processes, and performance expectations of assessments relative to content expectations
• Independent reviews by subject matter experts
▫ Training activities
▫ Independent, group discussion
▫ Exploratory, confirmatory
Evaluating consequences
• Evidence of intended and unintended consequences (e.g., curricular and instructional change, public perceptions)
• Methods for collecting evidence
▫ Artifact gathering
▫ Focus groups
▫ Working committees
▫ Surveys of stakeholder groups
Virtual Elements - Advantages
• Stakeholder participation
▫ Broader representation of intended population
▫ Time on task
▫ Reduced travel
• Cost containment, with caveats
• Greater flexibility
• Real-time data collection (e.g., survey questionnaires)
Virtual Elements - Disadvantages
• Equivalent interaction with the technologies
• Engagement of participants
• Redistributed costs (e.g., travel versus technology, time commitments)
• Security risks for sensitive material/content
• Quality of information for intended purpose
Summary
• Validation activities do not need to be Either/Or
▫ Some are more conducive to virtual work than others
• Prioritizing in-person versus virtual activities
• Considerations:
▫ Quality of information
▫ Security risk
▫ Resources, time, cost
▫ Political
Then… and Now