Designing Large-Scale Speaking and Writing Assessments TOEIC ® Speaking as an Example Jakub Novák, Educational Testing Service

Designing Large-Scale Speaking and Writing Assessments

TOEIC® Speaking as an Example

Jakub Novák, Educational Testing Service

TOEIC® Speaking and Writing

• Design process started in August 2005

• First operational tests administered in December 2006

• Design followed principles of Evidence-Centered Design approach (ECD)

Evidence-Centered Design (ECD)

• Framework:

-- claims

-- evidence

-- tasks

• Advantage: transparent and evidentially solid relation between tasks and claims.
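The claims, evidence, and tasks chain can be pictured as a small data structure in which every claim must name the evidence that supports it and the task that elicits that evidence. The sketch below is only an illustration: the class names `Task` and `Claim` and the `is_supported` check are hypothetical and are not taken from the ETS design documents. The example values (pronunciation as evidence, a read-aloud task with 45 seconds to prepare and 45 seconds to speak) come from later slides in this deck.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    """A task type that elicits observable performance (hypothetical model)."""
    name: str
    prep_seconds: int
    response_seconds: int


@dataclass
class Claim:
    """A statement about the test taker that the score is meant to support."""
    statement: str
    evidence: List[str] = field(default_factory=list)  # features raters observe
    tasks: List[Task] = field(default_factory=list)    # tasks that elicit them

    def is_supported(self) -> bool:
        # A claim is traceable only if it names both evidence and a task.
        return bool(self.evidence) and bool(self.tasks)


intelligibility = Claim(
    statement="Can produce language intelligible to proficient English speakers",
    evidence=["pronunciation"],
    tasks=[Task("Read a Text Aloud", prep_seconds=45, response_seconds=45)],
)
unlinked = Claim(statement="A claim with no evidence or task attached yet")

print(intelligibility.is_supported(), unlinked.is_supported())
```

A claim that fails the check is one the test cannot yet justify, which is the kind of gap the ECD approach is meant to expose early.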

Business Requirements for TOEIC® Speaking

• Test should discriminate across a wide range of abilities, starting with the bottom quintile of traditional TOEIC® takers.

• Test should separate candidates into ~ 10 levels.

• Many unique forms of the test will be administered each year.

General claim

• Test taker can communicate in spoken English to function effectively in a global workplace context.

Partial, hierarchical claims

1. Test-taker can create connected, sustained discourse appropriate to the typical workplace.

2. Test-taker can carry out routine social and occupational interactions such as giving and receiving directions, asking for information, asking for clarification, and so forth.

3. Test-taker can produce some language that is intelligible to native and proficient non-native English speakers.

Test-taker can produce some language that is intelligible to native and proficient non-native English speakers.

• Task:

Complete the sentence:

“Whenever I have free time, …”

This task type can give the evidence, but cannot yield enough unique prompts.

Test-taker can produce some language that is intelligible to native and proficient non-native English speakers.

Task:

Read aloud the text on the screen. You will have 45 seconds to prepare. Then you will have 45 seconds to read the text aloud.

Whether you want office supplies for personal or for business use, Sun Office Products is the single source for all your needs. With over 50 years of experience, our professionals can help you find any type of supply for any project…

This task type can give the desired evidence, and can yield many prompts.

TOEIC® Speaking – Read a Text Aloud Evaluation Criteria

• Pronunciation

-- High: Pronunciation is highly intelligible, though the response may include minor lapses and/or other language influence.

-- Medium: Pronunciation is generally intelligible, though it includes some lapses and/or other language influence.

-- Low: Pronunciation may be intelligible at times, but significant other language influence interferes with appropriate delivery of the text.

Ability Levels (idealized case)

From task to evidence to claim

• Performance on a task can be reliably scored, giving evidence for a partial claim.

• Partial claims can be combined into a general claim.

• General claims for all levels are supported by evidence.

Test-taker can create connected, sustained discourse appropriate to the typical workplace.

• Propose a Solution (show that you recognize the problem, and propose a way of dealing with the problem.)

Hi, this is Marsha Syms. Um, I’m calling about my bank card. I went to the bank machine early this morning, you know - the ATM (upspeak) ... because the bank was closed so only the machine was open. Anyway, I put my card in the machine and got my money out....but then my card didn’t come out of the machine. I got my receipt and my money but then my bank card just didn’t come out. And I’m leaving for my vacation tonight so I’m really going to need it. ...I had to get to work early this morning, and couldn’t wait around for the bank to open....Could you call me here at work, and let me know how to get my bank card back? I’m really busy today, and really need you to call me soon. I can’t go on vacation without my bank card. This is Marsha Syms at 555-1234. Thanks.

(30 seconds to prepare, 60 seconds to speak.)

Test-taker can create connected, sustained discourse appropriate to the typical workplace.

• Make a Recommendation

Imagine that your company is planning an international conference for all its clients. Your department is responsible for choosing the hotel for the conference. The chart below includes information about two different hotels. Please take 10 seconds to look at the chart.

Prepare a voice-mail report for Mr. Collins, your supervisor, who has asked you to recommend one hotel for the conference.

(45 seconds to prepare, 60 seconds to speak.)

Scoring a high-level task

Level 5 Response is effective and consists of highly intelligible, sustained, coherent discourse. Characterized by all of the following:

– Response presents a clear progression of ideas and conveys the relevant information required by the tasks. It includes appropriate detail, though it may have minor omissions.

– Speech is clear with generally well-paced flow and fluid expression. Response may include minor lapses or minor difficulties with pronunciation or intonation patterns which do not affect overall intelligibility.

– Response exhibits a fairly high degree of automaticity with good control of basic and complex structures (as appropriate). Some minor errors may be noticeable but do not obscure meaning.

– Use of vocabulary is accurate and precise.

Testing the Test: The Pilot Study

• Four test forms created, administered to 2700 subjects who represented the target range of abilities (Dec. 2005–Jan. 2006)

• Responses scored through the Online Scoring Network (OSN) by trained raters. Each task response was scored by a separate rater unfamiliar with the candidate's other responses.

• Raw scores weighted: highest-level tasks received the highest weight.
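As a rough illustration of the weighting idea, the sketch below multiplies each task's raw score by a task weight and sums the results. The weights and scores are invented for the example; the slides do not give the actual values or the exact weighting formula used in the pilot, and the function name `weighted_raw_score` is hypothetical.

```python
def weighted_raw_score(task_scores, task_weights):
    """Sum each task's raw score multiplied by that task's weight."""
    missing = set(task_scores) - set(task_weights)
    if missing:
        raise KeyError(f"no weight defined for: {sorted(missing)}")
    return sum(score * task_weights[task] for task, score in task_scores.items())


# Hypothetical weights: the higher-level task counts for more.
weights = {"read_aloud": 1.0, "propose_solution": 3.0}

# Hypothetical raw scores for one test taker.
scores = {"read_aloud": 3, "propose_solution": 4}

total = weighted_raw_score(scores, weights)
print(total)  # 3 * 1.0 + 4 * 3.0 = 15.0
```

Under a scheme like this, a one-point gain on a high-level task moves the total more than a one-point gain on a low-level task.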

Results of pilot study

• Test writers can create multiple versions of the same test task of equivalent difficulty.

• Test takers who took more than one version of the test scored the same on both versions.

• Different raters rated the same response with the same score.

• Test takers who performed well on high-level tasks performed well on lower-level tasks as well. The assumption that tasks were hierarchical was confirmed.

• 8 proficiency levels (not 10) supported by data.

• “Make a recommendation” task does not provide good evidence.
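One simple way to quantify the rater-consistency finding above is the exact agreement rate: the share of responses to which two raters gave the same score. The slides do not say which agreement statistic the pilot used, so the following is a minimal sketch with invented ratings and a hypothetical function name, not the study's method.

```python
def exact_agreement(rater_a, rater_b):
    """Return the fraction of responses given identical scores by two raters."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must score the same set of responses")
    if not rater_a:
        raise ValueError("at least one scored response is required")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


# Hypothetical scores from two raters on the same five responses.
rater_a = [5, 4, 3, 5, 2]
rater_b = [5, 4, 2, 5, 2]

rate = exact_agreement(rater_a, rater_b)
print(rate)  # 4 of 5 scores match, so 0.8
```

Exact agreement does not correct for agreement that would occur by chance, so operational programs often report it alongside other statistics.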

Test-taker can create connected, sustained discourse appropriate to the typical workplace.

• Make a Recommendation

Imagine that your company is planning an international conference for all its clients. Your department is responsible for choosing the hotel for the conference. The chart below includes information about two different hotels. Please take 10 seconds to look at the chart.

Prepare a voice-mail report for Mr. Collins, your supervisor, who has asked you to recommend one hotel for the conference.

(45 seconds to prepare, 60 seconds to speak.)

TOEIC® Speaking Test Overview

Score report information: claims for 8 levels

Level 5, Scale Score 110-120

Typically, test takers at level 5 have limited success at expressing an opinion or responding to a complicated request. Responses include problems such as: language that is inaccurate, vague, or repetitive; minimal or no awareness of audience; long pauses and frequent hesitations; limited expression of ideas and connections between ideas; limited vocabulary.

Most of the time, test takers at level 5 can answer questions and give basic information. However, sometimes their responses are difficult to understand or interpret.

When reading aloud, test takers at Level 5 are generally intelligible. However, when creating language, their pronunciation, intonation and stress may be inconsistent.

Inquiries about TOEIC®

www.ets.org

under TOEIC