Test Development Final

  • 7/30/2019 Test Development Final

    1/64

    Summer Intensive College English (ICE): Placement Test for Academic Purpose

    Alice Chan and Takako Kobayashi

    Professor Jean Turner

    EDUC 8540: Language Assessment

    May 21, 2013

    Table of Contents

    The Original Test Specifications for Entire Test

    Objectively-Scored Test as Administered

    Objective Section Scoring Key and Test Administration Guide

    Actual Test Administration Description

    Statistical Analysis

        Descriptive Statistics

        Item Facility

        Item Discrimination

        Internal Consistency (KR20)

    Revisions and Rationale

    Proposed Validity Investigation for Objectively-Scored Section

    Subjectively-Scored Test as Administered

    Subjective Section Scoring Key and Test Administration Guide

    Actual Test Administration Description

    Inter-rater Reliability

    Content Analysis for Subjective Section

    Revisions and Rationale

    References

    Appendix

        Self-Assessment Survey

    The Original Test Specifications for Entire Test

    Test Overview

    Alice and I would like to develop the listening and speaking sections of the placement test for the six-week session of the Summer Intensive College English (ICE) program. Liam, Vincent, and Sage are working on the reading and writing sections of the same test. Although students must have a TOEFL iBT score of at least 49 or an IELTS score of at least 4.5 to enroll, the program currently does not offer a placement test at the beginning of the session. Therefore, the overall purpose of this placement test is to assess students' academic English proficiency before the program starts so that teachers in the session can adjust their lessons to meet the students' actual needs. In terms of student demographics, most of the students in the session are planning to attend colleges or universities in the United States. Thus, we would like to develop a placement test that specifically measures academic English proficiency. One feature of the students in the ICE program is that they are young learners, aged 15 to 19, from a variety of L1 backgrounds. According to last year's student demographics, most of the students came from the Middle East, two from South America, and one from Italy.

    As identified above, the placement test covers four skills. In terms of the target language situation for the test, all task instructions will be given in English. The students are required to take this placement test on the first two days after arrival, as follows:

    Day 1: Reading and Writing

    Day 2: Listening and Speaking

    For the speaking section, students will be asked to sign up for a time slot because this section consists of a face-to-face interview. Therefore, on the second day, students will take the listening test before noon and then take the speaking task individually in the afternoon.

    Listening Section

    In the listening section, there are three categories of speech acts: a lecture, a conversation between a college faculty member and a student, and a conversation between peers about college life. In terms of text types for the listening section, the appropriate resources are authentic lectures and conversations between faculty and students as well as between students. Each listening task will consist of an approximately three-minute listening segment and eight multiple-choice questions, and students will have 20 minutes per task to complete the questions. In total, the listening test will take 60 minutes.

    The listening content should not require test takers to have prior knowledge of the topics. Instead of assessing students' existing knowledge, the listening section will assess comprehension of spoken English in academic content as well as in communication relevant to college life. According to Chapter 12 in Hughes, the global operations 1) obtain the gist, 2) follow an argument, and 3) recognize the attitude of the speaker seem appropriate for the specification of this listening section. In more depth, based on the lists shown in that chapter, the following is a tentative list of the abilities to be assessed in the test.

    Informational:

    obtain factual information

    understand requests for information

    understand expressions of need

    understand requests for help

    understand requests for permission

    recognize and understand opinions

    understand comparisons

    recognize and understand suggestions

    recognize and understand comments

    Interactional:

    understand greetings and introductions

    understand expressions of agreement/disagreement

    recognize speaker's purpose

    understand requests for clarification

    recognize requests for clarification

    recognize requests for opinion

    recognize attempts to persuade others

    The lecture segment will be taken from one of the MIT 9.00SC Introduction to Psychology (Spring 2011) lectures, which are available on YouTube. For the conversation between peers, the topic will be housing because one of our friends has just moved to a new house and agreed to volunteer for the recording. The script for that conversation has already been written (Script), and we are going to record the conversation on April 16th. For the conversation between a college faculty member and a student, we will write a script and record the conversation, asking Professor Jean to participate if possible.

    In terms of task type, the multiple-choice items in this section include both questions and incomplete sentences. As identified above, all items will be presented in English. Given this task type, the Listening section will be paper-based.

    The instructions given to students during the listening section will be as follows:

    1. Listen to the recording once. While listening, students will be allowed to take notes.

    2. After listening, students will be allowed to open the test booklet and start the test. (Students will not be allowed to see the questions beforehand.)

    3. Students will be given 20 minutes to complete each listening task.

    The Listening section is objectively scored because it uses multiple-choice items. The scoring criteria should reveal the degree of each listening ability listed above. The rubric will be created once the tentative list of listening abilities to be assessed is finalized.

    Speaking Section

    For the Speaking test, students choose their preferred time slot on a sign-up sheet that will be passed around the classroom before the Listening test begins. The test will take place in the afternoon after the lunch break. The Speaking test will be conducted as a face-to-face interview, and each testing slot is 10 minutes long.

    For the first 4-5 minutes, the student will be prompted to introduce themselves by talking about topics raised by the examiner. Topics can be eclectic, but the first topic initiated by the examiner is chosen based on the results of the student's self-assessment survey, completed prior to the oral test (Self-Assessment Survey). For example, if a student answered mostly "Neither Agree/Disagree" or "Disagree" to the questions about whether they can respond to simple daily conversation or talk about their interests, the examiner may choose easier conversation openers such as the following:

    1. Do you have any favorite food? What about any dislikes of food?

    2. How do you usually spend your free time?

    3. What is your favorite subject in school? (Prompt them to elaborate when answer is given.)

    On the other hand, if a student answered mostly "Agree" or "Strongly Agree" to the questions on the self-assessment survey, especially those pertaining to the ability to talk freely about personal interests and academic topics and to handle unexpected turn-taking in conversation with confidence, then the conversation opener can be chosen from the following samples:

    1. Can you tell me why do you want to come to ICE for this summer instead of enjoying

    time with your family and friends in your home country?

    2. Can you tell me what are some facts about America that you find interesting?

    3. How is America (or American culture) different from your home country? Are there

    also any similarities?
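    The opener selection described above can be sketched in a few lines of Python. This is only an illustration: the function name `choose_opener_level` is hypothetical, and treating "mostly" as "more than half of the answers" is an assumption, since the specification does not define a cut-off.

```python
# Response labels as printed on the Self-Assessment Survey.
HIGH_CONFIDENCE = {"Strongly Agree", "Agree"}


def choose_opener_level(responses):
    """Return "harder" or "easier" for one student's survey responses.

    Assumption: "mostly" means more than half of the answers; anything
    else (including a tie or an empty survey) falls back to the easier
    conversation openers.
    """
    high = sum(1 for r in responses if r in HIGH_CONFIDENCE)
    return "harder" if high > len(responses) / 2 else "easier"


# Example: five of seven answers are Agree/Strongly Agree.
example = ["Agree", "Strongly Agree", "Agree", "Agree",
           "Disagree", "Neither Agree/Disagree", "Agree"]
level = choose_opener_level(example)
```

    In practice the examiner would still use judgment; a rule like this only makes the starting point consistent across examiners.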

    These questions are meant to relax the students because they prompt personal answers rather than critical thinking. They are also a way for the teachers to get a sense of each student's personality, which may help them understand what might work with each student and adjust the lesson content (adding or removing material) based on the general information elicited in the introductory talk. In addition, this first part of the oral test determines the topic level for the second part, because the main criteria evaluated in the first part are coherence of content, word choice (vocabulary), and fluency (how the speech is delivered). A tentative rubric for the whole Speaking test will be included once the descriptions of quality for each criterion (which will separate students into levels) are finalized.

    In the latter 5 minutes of the oral test, the student will receive a topic card and 1 minute of preparation time to decide on a stance and organize their opinions on the given topic. The topics are academic or centered on college life, given that most of the students may pursue higher education in the US later on. Each topic card has one randomly selected question, and the student will not know the question until the card is handed to them. Sample topics include:

    1. Tell me about your opinion: How do you think about having international students as

    classmates? Do you think it'll help to improve your English? Please elaborate your

    answer.

    2. What do you plan on studying in college? Any particular reason for choosing that major?

    3. Why do you think it is important to learn English, and what are the benefits of able to

    speak English?

    As for marking, each student will be subjectively scored on their performance during the 10-minute oral exam. Throughout the entire speaking test, students are graded on their delivery as well as on the coherence and cohesion of the ideas they present to the examiner. The criteria specifically noted for the latter half of the test are word choice (academic vocabulary knowledge and expression of opinions), grammar (whether they can use grammar points correctly and appropriately), and pronunciation. Pronunciation will not be weighted as heavily as the other criteria, because the purpose of the speaking test is to see whether students can succeed communicatively in an academic setting. Sample questions for both parts of the speaking test are still being developed, so the criteria for each part are subject to change.

    Script

    Topic: Housing

    A: Hey, (Speaker B's name). I found there is a lot of housing information in today's newspaper.

    B: Oh, that's great! Let's take a look together.

    A: Since both of us will be having 8AM class next semester, it would be great if the house is close to the campus, so we won't have to drive to school.

    B: I totally agree! It's difficult to find a parking spot in the morning...

    A: There are three streets close to campus- Van Buren, Watson, and Franklin...

    B: Do you want to live in a single house, or an apartment?

    A: Does single house have a backyard? I want to play with my dog there.

    B: Let's see...maybe we should make a list first so that we can find the best one!

    A: Sounds great! I hope there's a backyard for my dog, also parking spots should be available because we both have cars, and the fee is about $700 for each.

    B: (Speaker B shares her own list)

    A: Okay! Well, on this page of the newspaper, I found two advertisements showing these two single houses on the Watson Street, and they both have backyards. The only difference is that one of the backyards is smaller. But, it seems that they don't have parking spots...we have to park on the street.

    B: What about Van Buren Street or Franklin Avenue? Is there any available option?

    A: Hmm... Most of the housing options that are on Van Buren Streets for now are apartments without backyards, but they do include parking spaces for both of us.

    B: I don't like living on Van Buren Street because it's very noisy sometimes :(

    A: That is true, because it is quite close to downtown, and there are many bars in downtown too.

    B: How about looking for housing options that are still within walking distance to the campus, but just a little bit far up the hills?

    A: Sure, are there any houses available on other streets?

    B: There are few options on Clayton Street, which is about 4 or 5 streets up from Van Buren.

    A: Look! There are several options available on Clayton and the streets nearby it! And some of them have backyards AND parking spaces provided for the tenants too!

    B: Sounds cool. The monthly rent is almost same price...why don't you call each landlord tomorrow to make appointment for visiting?

    A: Okay, sounds great! I'll call these two houses tomorrow!

    Self-Assessment Survey

    Please read the following questions carefully and place a mark next to your choice of answer.

    1. I can initiate spoken greetings and talk about basic personal information.
    ___ Strongly Agree   ___ Agree   ___ Neither Agree/Disagree   ___ Disagree

    2. I can talk about basic things such as school life, food, and my hobbies.
    ___ Strongly Agree   ___ Agree   ___ Neither Agree/Disagree   ___ Disagree

    3. I can perform daily activities such as ordering food and go out shopping alone.
    ___ Strongly Agree   ___ Agree   ___ Neither Agree/Disagree   ___ Disagree

    4. I can talk and share freely about my culture and home country, or just any everyday topic.
    ___ Strongly Agree   ___ Agree   ___ Neither Agree/Disagree   ___ Disagree

    5. I can express and support my opinions in a discussion.
    ___ Strongly Agree   ___ Agree   ___ Neither Agree/Disagree   ___ Disagree

    6. I can manage unexpected turn-takings in discussion or unfamiliar topics.
    ___ Strongly Agree   ___ Agree   ___ Neither Agree/Disagree   ___ Disagree

    7. I can participate in an academic discussion and express my ideas.
    ___ Strongly Agree   ___ Agree   ___ Neither Agree/Disagree   ___ Disagree

    Objectively-Scored Test as Administered

    Listening Test

    45 minutes

    Name: __________________

    Section 1

    Do NOT open the next page until you are directed to do so.

    You will listen to a lecture. Based on what you hear, answer the questions below.

    Please place a check mark (✓) next to the correct answer. You can take notes while you

    are listening. You will listen to the lecture only once. You have 15 minutes to complete

    this section.

    Section 1 Questions 1-8

    1. What is attribution theory?

    ___a) The theory explains the process of attributing outcomes based on only

    internal behavior.

    ___b) The theory explains the process of attributing outcomes based on

    sequences of the event of a matter.

    ___c) The theory explains the process of attributing outcomes based on only

    external events.

    ___d) The theory explains the process of attributing outcomes based on internal

    behavior and external events.

    2. Please check the answers that best describe internal causes and external reasons.

    You may mark more than one correct answer.

    ___a) Internal causes include situational factors and controllable emotions.

    ___b) Internal causes refer to emotions, talents, and personal characteristics.

    ___c) External reasons are environmental factors.

    ___d) External reasons consist of uncontrollable emotions.

    3. According to the lecture, what is the possible external cause associated with the car

    accident?

    ___a) A deer.

    ___b) A dog.

    ___c) A bear.

    ___d) A horse.

    4. If the wind is blowing the right direction when the bat hits the ball, which factor would

    it be categorized under?

    ___a) Internal-Stable Factor.

    ___b) Internal-Unstable Factor.

    ___c) External-Stable Factor.

    ___d) External-Unstable Factor.

    5. What is the tendency that celebrates our own success as the indication of our internal

    abilities and failure as the results of external factors?

    ___a) Self-regulation bias.

    ___b) Self-serving bias.

    ___c) Self-success bias.

    ___d) Self-ability bias.

    6. What is the tendency to blame a person's internal behavior?

    ___a) Internal Attribution Error.

    ___b) External Attribution Error.

    ___c) Fundamental Attribution Error.

    ___d) Attribution Decision Error.

    7. Which hypothesis that assumes judgments on performance are fair because people

    get the outcome they deserve?

    ___a) Just Work Hypothesis.

    ___b) Justice World Hypothesis.

    ___c) Justice Work Hypothesis.

    ___d) Just World Hypothesis.

    8. What are the appropriate summary of attribution? You may mark more than one

    correct answer.

    ___a) People attribute causes to categorize behaviors and events.

    ___b) People always fail to attribute causes correctly.

    ___c) It is important to consider internal and external reasons when interacting

    with others.

    ___d) It is necessary to attribute causes to only internal reasons.

    Section 1:

    Listening Script

    http://www.youtube.com/watch?v=L8SAyOqG1a4

    Section 2

    Do NOT open the next page until you are directed to do so.

    You will listen to a conversation between peers. Based on what you hear, answer the

    questions below. Please place a check mark (✓) next to the correct answer. You can

    take notes while you are listening. You will listen to the conversation only once. You

    have 15 minutes to complete this section.

    Section 2 Questions 9-16

    9. Why do they want to live closer to the campus? You may mark more than one correct

    answer.

    ___a) Because they both have 8am classes next semester.

    ___b) Because it is closer, then they'll be able to sleep-in a little more.

    ___c) Because it is difficult to find parking spots in the morning.

    ___d) Because they both ride bicycles to school.

    10. Which streets are closest to the campus?

    ___a) Clayton, Larkin, and Monroe.

    ___b) Jefferson, Madison, and Hellam.

    ___c) Van Buren, Watson, and Franklin.

    ___d) Scott, Van Buren, and Pierce.

    11. Based on the conversation, choose the answers below and fill in the blanks to

    complete each student's list. An answer can be chosen more than once.

    Student A Student B

    Backyard

    Under $800 for each

    a. Parking.

    b. $ 1000 total.

    c. Balcony.

    d. $ 700.

    e. Big kitchen.

    f. Under $ 800 for each.

    g. Furnished.

    h. Large living room.

    12. What is NOT available on Watson Street?

    ___a) There are no backyards to play with her dog.

    ___b) There are no parking spots, they'll have to park on the street.

    ___c) There are no houses available at this time.

    ___d) There are no houses that allow pets.

    13. Why doesn't she like to live on Van Buren Street?

    ___a) Because it is not safe at night.

    ___b) Because there are a lot of car accidents on Van Buren.

    ___c) Because the grocery store is far away from Van Buren.

    ___d) Because it is quite noisy sometimes.

    14. How far away is Clayton Street from Van Buren Street?

    ___a) About 4-5 street up from Van Buren Street.

    ___b) About 4-5 street down from Van Buren Street.

    ___c) About 6-7 street up from Van Buren Street.

    ___d) About 6-7 street down from Van Buren Street.

    15. Which street are they most likely to find the place to live?

    ___a) Van Buren.

    ___b) Madison.

    ___c) Franklin.

    ___d) Clayton.

    16. What are they going to do tomorrow?

    ___a) They are going to visit each house.

    ___b) They are going to look for more housing options.

    ___c) They are going to call landlords to make appointment for visiting.

    ___d) They are going to stop looking for new place and stay where they live now.

    Section 2:

    Listening Script

    A: Hey, Alice. I found there are a lot of housing information in today's newspaper.

    B: Oh, that's great! Let's take a look together.

    A: Since both of us will be having 8AM class next semester, it would be great if the house is close to the campus, so we won't have to drive to school.

    B: I totally agree! It's difficult to find a parking spot in the morning...

    A: There are three streets close to campus- Van Buren, Watson, and Franklin...

    B: Do you want to live in a house, or an apartment?

    A: Does house have a backyard? I want to play with my dog there.

    B: Let's see...maybe we should make a list first so that we can find the best one!

    A: Sounds great! I hope there's a backyard for my dog, also parking spots should be available because we both have cars, and the fee is about $700 for each.

    B: (Lauren speaks her own list)

    A: Okay! Well, on this page of the newspaper, I found two advertisements showing these two houses on the Watson Street, and they both have backyards. The only difference is that one of the backyards is smaller. But, it seems that they don't have parking spots...we have to park on the street.

    B: What about Van Buren Street or Franklin Avenue? Is there any available option?

    A: Hmm... Most of the housing options that are on Van Buren Streets for now are apartments without backyards, but they do include parking spaces for both of us.

    B: I don't like living on Van Buren Street because it's very noisy sometimes :(

    A: That is true, because it is quite close to downtown, and there are many bars in downtown too.

    B: How about looking for housing options that are still within walking distance to the campus, but just a little bit far up the hills?

    A: Sure, are there any houses available on other streets?

    B: There are few options on Clayton street, which is about 4 or 5 streets up from Van Buren.

    A: Look! There are several options available on Clayton and the streets nearby it! And some of them have backyards AND parking spaces provided for the tenants too!

    B: Sounds cool. The monthly rent is almost same price...why don't you call each landlord tomorrow to make appointment for visiting?

    A: Okay, sounds great! I'll call these two houses tomorrow!

    Section 3

    Do NOT open the next page until you are directed to do so.

    You will listen to a conversation between a professor and a student. Based on what you

    hear, answer the questions below. Please place a check mark (✓) next to the correct

    answer. You can take notes while you are listening. You will listen to the conversation

    only once. You have 15 minutes to complete this section.

    Section 3 Questions 17-24

    17. Why does this student come to Professor Turner?

    ___a) To get advice about a writing assignment.

    ___b) To submit a course assignment.

    ___c) To get advice about class registration.

    ___d) To ask questions about her grade for the course.

    18. What does it call for the courses that students have to take before the main ones?

    ___a) Prep-course.

    ___b) Precalculus.

    ___c) Prerequest.

    ___d) Prerequisite.

    19. Which introductory course is this student currently taking this semester?

    ___a) Introduction to Sociology.

    ___b) Introduction to Psychology.

    ___c) Introduction to Anthropology.

    ___d) Introduction to Physiology.

    20. What introductory course that is required by most of the main Sociology courses?

    ___a) Introduction to Astrology.

    ___b) Introduction to Sociology.

    ___c) Introduction to Psychology.

    ___d) Introduction to Methodology.

    21. Which two majors is the student interested in to be her degree?

    ___a) Psychology and Sociology.

    ___b) Psychology and Anthropology.

    ___c) Sociology and Anthropology.

    ___d) Sociology and Physiology.

    22. When should students decide which major to be their degree?

    ___a) At the end of freshmen year.

    ___b) Before the end of sophomore year.

    ___c) At the beginning of sophomore year.

    ___d) Before the beginning of senior year.

    23. Which courses is the student most likely to take next semester? You may mark

    more than one correct answer.

    ___a) Introduction to Psychology.

    ___b) Advance level Sociology courses.

    ___c) Advance level Psychology courses.

    ___d) Introduction to Sociology.

    24. What are the things that the professor recommends the student to do for selecting

    courses? You may mark more than one correct answer.

    ___a) To get advice from classmates and hear about their opinions of the

    courses.

    ___b) To check if there are other courses that are necessary to complete first.

    ___c) To get course syllabi from professors to check for the workload

    beforehand.

    ___d) To take courses from various majors to find which suits your best interest.

    Thank you for your participation. You can leave earlier once you have completed the test. Please submit this test to either Alice or Takako. If you have any questions regarding this test, please feel free to ask us.

    Again, thank you so much for your support!

    Section 3:

    Listening Script

    A: Hello, Professor Turner, may I come in?

    P: Sure, come on right in! What can I help you?

    A: Um... I have some questions regarding to classes for next semester...

    P: Okay. What about it?

    A: I heard that Sociology courses offered next semester are interesting. I'd like to take one of them, but I'm not quite sure which I should take.

    P: Let's take a look the list together. So, the columns next to the course titles show if there are any prerequisite courses to take the courses.

    A: So that means I'll have to take those courses before I can register for the main ones, right?

    P: Correct. You cannot take the main courses that are required for your degree until you finish those prerequisite courses.

    A: Most of the main courses require Introduction to Sociology.

    P: That's right. Are you taking Introduction to Sociology this semester?

    A: No, I'm taking Introduction to Psychology now.

    P: Then, you should take Introduction to Sociology next semester so that you will be able to register the other Sociology courses in the future.

    A: I have not decided which major to be my degree yet. Is it okay for me to take those introduction classes before I make my decision?

    P: Sure, no problem. But we would advise students to try to find out what you really passionate about before you finish your sophomore year, because starting from your junior year, you'll be taking courses that are mainly for your degree.

    A: I see.

    P: How is Introduction to Psychology? Do you like the class?

    A: Yes. The class is very interesting.

    P: Then, there are also some advance level Psychology courses offered next semester. They are all require the course you're in now, the Introduction to Psychology, as prerequisite. You're currently taking the course, so you are eligible for those advanced courses.

    A: I'm interested in both Sociology and Psychology, so maybe I can take Introduction to Sociology and some advance level courses of Psychology next semester, and then to figure out which one would best suit my interest.

    P: That's good idea. You can also take some of the introductory courses from other majors to see if you find them interesting too, so that you are ready to take advanced courses in junior year.

    A: I'm going to look the courses from other majors, too.

    P: Great! Just make sure you have those prerequisites taken care of first before you register for any advanced level classes.

    A: Thank you, Professor Turner. Now I think I know what I'll take for next semester.

    P: No problem, glad that I could help you. You're more than welcome to stop by again if you still have any question.

    A: Thank you!!

    Objective Section Scoring Key

    1. d
    2. b and c
    3. a
    4. d
    5. b
    6. c
    7. d
    8. a and c
    9. a and c
    10. c
    11. a and c (for student A) / a and e (for student B)
    12. b
    13. d
    14. a
    15. d
    16. c
    17. c
    18. d
    19. b
    20. b
    21. a
    22. b
    23. c and d
    24. b and d
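    For anyone checking papers against this key, the key can be written down as data, as in the rough Python sketch below. How marks are converted into points is not spelled out in this section, so counting the keyed options an examinee marked is an assumption and will not necessarily reproduce the total scores reported in the statistical analysis. The names `ANSWER_KEY` and `keyed_options_marked` and the labels "11A" and "11B" are hypothetical.

```python
# Scoring key transcribed from the list above; item 11 has one key per student list.
ANSWER_KEY = {
    "1": {"d"}, "2": {"b", "c"}, "3": {"a"}, "4": {"d"},
    "5": {"b"}, "6": {"c"}, "7": {"d"}, "8": {"a", "c"},
    "9": {"a", "c"}, "10": {"c"},
    "11A": {"a", "c"}, "11B": {"a", "e"},
    "12": {"b"}, "13": {"d"}, "14": {"a"}, "15": {"d"}, "16": {"c"},
    "17": {"c"}, "18": {"d"}, "19": {"b"}, "20": {"b"},
    "21": {"a"}, "22": {"b"}, "23": {"c", "d"}, "24": {"b", "d"},
}


def keyed_options_marked(item, marked):
    """Count how many keyed options for `item` the examinee marked.

    Assumption: extra (non-keyed) marks are simply ignored here; a real
    scoring rule might penalize them.
    """
    return len(ANSWER_KEY[item] & set(marked))


# Example: on item 2 the examinee marked "a" and "b"; only "b" is keyed.
matched = keyed_options_marked("2", ["a", "b"])
```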

    Administration Guide

    Test takers will be given 45 minutes to complete the objectively scored listening section. The purpose of the test is to assess listening ability, so test takers must work independently during the assessment. The directions are written on the test at the beginning of each section, before the listening questions. Before examinees start the test, however, the administrators will go over the directions with them to prevent any possible confusion. In addition, the administrators will remind the test takers that they will listen to each recording only once and encourage them to take notes.

    Actual Test Administration Description

    The listening test was administered across two weeks, from April 27 to May 3, 2013, to five current Intensive English Program (IEP) students and three students in the Teaching English to Speakers of Other Languages (TESOL) and Teaching Foreign Languages (TFL) programs at the Monterey Institute of International Studies (MIIS).

    During the administration, the test takers worked independently. Each recording was less than three minutes long. After listening to each recording, the examinees took approximately 6-7 minutes to complete the corresponding set of questions. The listening test was administered in regular classrooms at MIIS, an environment familiar to the test takers. There were no issues such as noise during the assessment. Approximately two examinees took the listening test at a time, in well-lit classrooms. Alice and I were available to answer questions during the test, but the examinees were not allowed to talk to each other.

    Statistical Analysis

    Descriptive Statistics

    For the eight examinees, the mean score on the listening section was 27.6. Scores ranged
    from 22 to 36. The most frequent score (mode) was 26, and the median score was also 26. The
    standard deviation (SD) was 4.87.

    N    Mean    Mode    Median    Range    SD      Variance
    8    27.6    26      26        22-36    4.87    23.70
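    The figures above can be checked with a short script. The following is a minimal sketch in Python, assuming the eight total scores listed in the Total Score column of the item facility tables. The use of the sample (n - 1) formulas for SD and variance is an assumption; it is the choice that reproduces the reported 4.87 and 23.70.

```python
import statistics

# Total scores of the eight pilot examinees (S1-S8), taken from the
# Total Score column of the item facility tables.
scores = [25, 26, 36, 31, 23, 22, 26, 32]


def describe(values):
    """Return the descriptive statistics reported for the listening section."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "mode": statistics.mode(values),
        "median": statistics.median(values),
        "low": min(values),
        "high": max(values),
        # Sample (n - 1) formulas: an assumption that matches the
        # reported SD and variance.
        "sd": statistics.stdev(values),
        "variance": statistics.variance(values),
    }


summary = describe(scores)
for name, value in summary.items():
    print(f"{name}: {value}")
```

    Using the population formulas instead would give a smaller SD, so the choice of formula should be stated whenever the statistics are reported.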

    Item Facility

    Section 1:

    Item 1 2 3 4 5 6 7 8 9 10 11 12

    Key d b c a d b c d a c

    S1 d b c a d c a b a d

    S2 d b c a d c a c a c

    S3 d b c a d b c d a c

    S4 d b c a d c c d c

    S5 d a b c d b d d d b c

    S6 d b c a d c a c c

    S7 d b d a d d d a b c

    S8 d c a d b c b a c

    # of Correct Answers 8 7 7 7 6 7 8 2 3 2 4 7

    Item Facility 1 0.875 0.875 0.875 0.750 0.875 1 0.25 0.375 0.25 0.5 0.875

    Section 2:

    Item 13 14 15 16 17 18 19 20 21 22 23 24 25 26
    Key a c c a d a e b d a d c
    S1 a c a c d a g a d a d b
    S2 a c a h g e b d a d c
    S3 a c c a d a e b d a d c
    S4 a c c a d a h b d a d c
    S5 a b c c a c/b a e b d a d c
    S6 a d g d a b b d b c
    S7 a c c a d a e b d a a c
    S8 a c c a d a e b d a d c
    # of Correct Answers 8 7 6 8 6 6 6 7 5 7 8 7 6 7
    Item Facility 1 0.875 0.750 1 0.750 0.750 0.750 0.875 0.625 0.875 1 0.875 0.750 0.875

    Section 3:

    Item 27 28 29 30 31 32 33 34 35 36 Total Score
    Key c d b b a b c d b d
    S1 c d b b d d c d c d 25
    S2 c d a b a c a b b d 26
    S3 c d b b a b c d b d 36
    S4 c d b b a b d d 31
    S5 c a a b a a c b b d 23
    S6 c a b c a d d b d 22
    S7 c a b b a d c d d 26
    S8 c c b b a b c d c d 32
    # of Correct Answers 8 4 6 7 7 3 5 6 4 8
    Item Facility 1 0.5 0.750 0.875 0.875 0.375 0.625 0.750 0.5 1

    Item facility, also called item difficulty, shows how difficult each item on an
    objectively-scored test is (Turner, 2013). It is the proportion of examinees who answered the
    item correctly, so values close to 1 indicate easy items and values close to 0 indicate
    difficult ones. The values for the items in this test ranged from 0.25 to 1. Across the 36
    listening items, the distribution of easy, medium, and difficult items was unbalanced. Most of
    the items could be categorized as easy, whereas only three items (Items 8, 9, and 10) had
    values close to 0. The intended examinees for this listening test are required to have a TOEFL
    iBT score of at least 49 to enroll in the program. Therefore, the distribution of item
    difficulty needs to be shifted toward the difficult end.
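    As an illustration of the calculation, the sketch below recomputes the Section 1 item facility values from the number of correct answers per item. It is a minimal example rather than the procedure actually used; the function names and the cut-offs for the easy and difficult bands (0.70 and 0.30) are assumptions made for this example only.

```python
# Number of correct answers per item in Section 1 (Items 1-12), copied
# from the "# of Correct Answers" row; eight examinees took the test.
correct_counts = [8, 7, 7, 7, 6, 7, 8, 2, 3, 2, 4, 7]
n_examinees = 8


def item_facility(n_correct, n_total):
    """Proportion of examinees who answered the item correctly."""
    return n_correct / n_total


def difficulty_band(value, easy=0.70, hard=0.30):
    """Label an item; the cut-offs are illustrative assumptions."""
    if value >= easy:
        return "easy"
    if value <= hard:
        return "difficult"
    return "medium"


facility = [item_facility(c, n_examinees) for c in correct_counts]
bands = [difficulty_band(v) for v in facility]

for item, (value, band) in enumerate(zip(facility, bands), start=1):
    print(f"Item {item}: {value:.3f} ({band})")
```

    With these assumed cut-offs, eight of the twelve Section 1 items fall in the easy band, which is consistent with the imbalance described above.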

    Item Discrimination

    Section 1:

    Item 1 2 3 4 5 6 7 8 9 10 11 12

    S3 d b c a d b c d a c

    S4 d b c a d c c d c

    S8 d c a d b c b a c

    S1 d b c a d c a b a d

    S5 d a b c d b d d d b c

    S6 d b c a d c a c c

    P (upper) 1 1 0.67 1 1 1 1 0.67 1 0.67 0.67 1

    P (lower) 1 0.67 1 1 0.67 0.67 1 0.67 0.67 0.67 0.33 0.67

    Item Discrimination 0 0.33 -0.33 0 0.33 0.33 0 0 0.33 0 0.33 0.33

    Section 2:

    Item 13 14 15 16 17 18 19 20 21 22 23 24 25 26
    S3 a c c a d a e b d a d c
    S4 a c c a d a h b d a d c
    S8 a c c a d a e b d a d c
    S1 a c a c d a g a d a d b
    S5 a b c c a c/b a e b d a d c
    S6 a d g d a b b d b c
    P (upper) 1 1 1 1 1 1 1 1 0.67 1 1 1 1 1
    P (lower) 1 0.67 0.67 1 0.33 0.33 0.67 1 0.33 0.67 1 0.67 0.67 0.67
    Item Discrimination 0 0.33 0.33 0 0.67 0.67 0.33 0 0.33 0.33 0 0.33 0.33 0.33

    Section 3:

    Item 27 28 29 30 31 32 33 34 35 36
    S3 c d b b a b c d b d
    S4 c d b b a b d d
    S8 c c b b a b c d c d
    S1 c d b b d d c d c d
    S5 c a a b a a c b b d
    S6 c a b c a d d b d
    P (upper) 1 0.67 1 1 1 1 0.67 1 0.33 1
    P (lower) 1 0.33 0.67 0.67 0.67 0 0.67 0.67 0.67 1
    Item Discrimination 0 0.33 0.33 0.33 0.33 1 0 0.33 -0.33 0

    Alongside item facility, item discrimination indicates how well each objectively-scored item
    distinguishes between test takers with high scores and those with low scores (Turner,
    2013). It is the proportion of correct answers in the upper-scoring group minus the proportion
    in the lower-scoring group. Most of the items had values ranging from 0.33 to 0.67. The values
    of Items 1, 4, 7, 8, 10, 13, 16, 20, 23, 27, 33, and 36 were 0, which indicated that these
    items did not discriminate between the two groups. Although some of the items were intended to
    be easy to answer correctly, 12 items, one third of the 36 items, failed to discriminate.
    Therefore, the number of items with a value of 0 should be reduced. Item 8 obtained an item
    facility value of 0.25, which makes it one of the most difficult items in this listening test.
    In terms of item discrimination, however, its value was 0. Therefore, this item should be
    revised to improve its item discrimination value. Items 3 and 35 seemed to be seriously
    problematic and should be revised: despite their item facility values (0.875 and 0.5
    respectively), both items had a discrimination value of -0.33, which revealed that top-scoring
    examinees were more likely to answer them incorrectly than lower-scoring test takers.
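    The item discrimination values for Section 1 can be reproduced with the minimal sketch below. The per-group counts are not reported directly; they are back-derived from the two P rows on the assumption that each group contains three examinees (1 means 3 correct, 0.67 means 2, 0.33 means 1). Exact fractions are used so that two thirds minus one third rounds to 0.33, as in the table.

```python
from fractions import Fraction

GROUP_SIZE = 3  # assumed size of the upper and lower scoring groups

# Correct answers per item (Section 1, Items 1-12) in each group,
# back-derived from the reported P rows (1 -> 3, 0.67 -> 2, 0.33 -> 1).
upper_correct = [3, 3, 2, 3, 3, 3, 3, 2, 3, 2, 2, 3]
lower_correct = [3, 2, 3, 3, 2, 2, 3, 2, 2, 2, 1, 2]


def item_discrimination(upper, lower, group_size=GROUP_SIZE):
    """Proportion correct in the upper group minus that in the lower group."""
    return Fraction(upper, group_size) - Fraction(lower, group_size)


discrimination = [
    round(float(item_discrimination(u, l)), 2)
    for u, l in zip(upper_correct, lower_correct)
]

# Items that do not separate the groups (value 0) or that favor the
# lower group (negative value) are candidates for revision.
flagged = [item for item, d in enumerate(discrimination, start=1) if d <= 0]

print(discrimination)
print(flagged)
```

    With only three examinees per group, a single response changes an item's value by 0.33, so these figures should be read as rough indications rather than stable estimates.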

    Internal Consistency

    To calculate internal consistency, KR-20 was used. A pre-programmed Excel
    spreadsheet for calculating the KR-20 formula was available online. In the spreadsheet, a right
    answer was coded as 1 and a wrong answer as 0. The KR-20 value obtained was 0.80. This value
    is fairly high, which suggests that the items of the objective section were internally
    consistent.

    Split-Half (odd-even) Correlation 0.821066194

    Spearman-Brown Prophecy 0.901742284

    Mean for Test 27.5

    Standard Deviation for Test 4.636809248

    KR21 0.717940199

    KR20 0.795348837
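    Two of the spreadsheet values can be reproduced from the other figures in the table. The sketch below applies the standard Spearman-Brown and KR-21 formulas to the reported split-half correlation, mean, and standard deviation, assuming 36 scored items. A KR-20 function is included for completeness; it needs the item facility of every item and the score variance on the same basis as the spreadsheet, so it is not applied to the pilot data here. The function names are choices made for this example.

```python
def spearman_brown(half_correlation):
    """Step a split-half correlation up to full test length."""
    return 2 * half_correlation / (1 + half_correlation)


def kr21(n_items, mean, sd):
    """KR-21 reliability from the number of items, mean, and SD."""
    variance = sd ** 2
    return (n_items / (n_items - 1)) * (
        1 - mean * (n_items - mean) / (n_items * variance)
    )


def kr20(item_facilities, variance):
    """KR-20 reliability from per-item facility values and score variance."""
    n_items = len(item_facilities)
    sum_pq = sum(p * (1 - p) for p in item_facilities)
    return (n_items / (n_items - 1)) * (1 - sum_pq / variance)


# Values copied from the spreadsheet output above; 36 items is assumed.
sb_value = spearman_brown(0.821066194)
kr21_value = kr21(36, 27.5, 4.636809248)

print(round(sb_value, 4))
print(round(kr21_value, 4))
```

    KR-21 assumes that all items are equally difficult, which is why it comes out lower than KR-20 for a test whose item facility values range from 0.25 to 1.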

    Revised Test Specification for Listening Section

    According to Bejar, Douglas, Jamieson, Nissan and Turner (2000), three areas of

    content have been defined as relevant for the TOEFL 2000 listening measure: academic, class

    related, and campus related (p. 9). Three categories of speech acts in the listening test (an

    academic lecture, a conversation between college faculty and student, and a conversation

    between peers) seem appropriate for the purpose of this placement test. In terms of text types for

    the listening section, the appropriate resources are authentic lectures and conversations between
    faculty and students as well as between students. Each listening task will consist of an
    approximately three-minute listening segment and eight multiple-choice questions. Test takers
    will have 10 minutes to answer the questions for each listening task. In total, the listening
    test is therefore allotted 30 minutes.

    The listening content should not require test takers to have prior knowledge of the
    topics. Instead of assessing students' existing knowledge, the listening section
    will assess comprehension of spoken English in academic contexts as well as in
    communication relevant to college life. Hughes (2003) describes global operations as the
    ability to obtain the gist, follow an argument, and recognize the attitude of the speaker (p. 161).
    These operations seem appropriate for the specification of this listening test. More
    specifically, based on the lists Hughes suggests, the abilities to be assessed in the
    listening test are as follows.

    Informational:

    obtain factual information

    follow sequence of events (narration)

    recognize and understand opinions

    understand comparisons

    recognize and understand suggestions

    recognize and understand comments

    Interactional:

    understand greetings and introductions

    understand expressions of agreement/disagreement

    recognize speakers purpose

    understand requests for clarification

    recognize requests for clarification

    recognize requests for opinion

    recognize attempts to persuade others

    Each listening segment has already been recorded. The lecture segment was taken from
    the series titled MIT 9.00SC Introduction to Psychology, Spring 2011, which is available on
    YouTube. For the conversation between peers, the topic was housing because it is a
    campus-related topic. In addition, one of our friends had just moved to a new house and agreed
    to volunteer for the recording. For the conversation between college faculty and student, the
    topic is class registration, which is a class-related topic. All listening materials were
    recorded before the pilot test was conducted.

    In terms of task type, the multiple-choice items in this section take the form of questions and
    incomplete sentences. All questions are presented in English, and the listening section
    will be administered on paper.

    The instructions given to students during the listening section will be as follows:

    1. Listen to the recording once. While listening, students will be allowed to take notes.

    2. After listening, students will be allowed to open the test booklet to start the test. (Students
    will not be allowed to see the questions beforehand.)

    3. Students will be given 10 minutes to complete each listening task.

    This listening test also attempts to measure note-taking ability. Carrell (2007)
    suggested that note-taking strategies seem to be associated with listening performance. The
    listening section is objectively scored because its tasks use multiple-choice items. The
    scoring criteria should reveal the degree to which test takers have each of the listening
    abilities listed above.

    Revised Test

    Listening Test
    30 minutes

    Name: __________________

    Direction:

    This test is designed to test your listening comprehension. There are three
    sections in this test. You will listen to each recording only once and
    must then answer the questions that follow. Answer all the questions on the basis
    of what is stated or implied by the speaker you hear. While listening, you
    can take notes.

    Do NOT turn to the next page until you are told to do so.


    Section 1

    Directions:

    In this section, you will listen to a lecture in an undergraduate program. The
    lecture will not be repeated. After listening to the lecture, turn to the next
    page and answer the questions based on what is stated or implied in
    the lecture. Please place a check mark (✓) next to the correct answer. You
    can take notes while you are listening. You have 10 minutes to answer the
    questions.

    Do NOT turn to the next page until you are told to do so.

    Section 1 Questions 1-8

    1. What is attribution theory?

    ___a) The theory explains the process of attributing outcomes based on only

    internal behavior.

    ___b) The theory explains the process of attributing outcomes based on

    sequences of the event of a matter.

    ___c) The theory explains the process of attributing outcomes based on only

    external events.

    ___d) The theory explains the process of attributing outcomes based on internal

    behavior and external events.

    2. Please check the answers that best describe internal causes and external reasons.

    You may mark more than one correct answer.

    ___a) Internal causes include situational factors and controllable emotions.

    ___b) Internal causes refer to emotions, talents, and personal characteristics.

    ___c) External reasons are environmental factors.

    ___d) External reasons consist of uncontrollable emotions.

    3. According to the lecture, what is the possible external cause associated with the car

    accident?

    ___a) A deer.

    ___b) A dog.

    ___c) A bear.

    ___d) A horse.

    4. If the wind is blowing in the right direction when the bat hits the ball, this fact
    would be categorized under _______.

    ___a) Internal-Stable Factor.

    ___b) Internal-Unstable Factor.

    ___c) External-Stable Factor.

    ___d) External-Unstable Factor.

    5. What is the tendency that identifies our own success as our internal abilities and failure
    as the results of external factors?

    ___a) Self-regulation bias.

    ___b) Self-serving bias.

    ___c) Self-success bias.

    ___d) Self-ability bias.

    6. What is the tendency to criticize a person's internal behavior?

    ___a) Internal Attribution Error.

    ___b) External Attribution Error.

    ___c) Fundamental Attribution Error.

    ___d) Attribution Decision Error.

    7. What is the hypothesis that assumes judgments on performance are fair

    because people get the outcome they deserve?

    ___a) Just Work Hypothesis.

    ___b) Justice World Hypothesis.

    ___c) Justice Work Hypothesis.

    ___d) Just World Hypothesis.

    8. What are the appropriate summaries of this lecture?

    You may mark more than one correct answer.

    ___a) People attribute causes to categorize behaviors and events.

    ___b) People always fail to attribute causes correctly.

    ___c) It is important to consider internal and external reasons when interacting

    with others.

    ___d) It is necessary to attribute causes to only internal reasons.


    Section 2

    Directions:

    In this section, you will listen to a conversation between friends. The
    conversation will not be repeated. After listening to the conversation, turn to
    the next page and answer the questions based on what is stated or
    implied in the conversation. Please place a check mark (✓) next to the
    correct answer. You can take notes while you are listening. You have 10
    minutes to answer the questions.

    Please use the map while you are listening.

    Do NOT turn to the next page until you are told to do so.

    [Map for Section 2; street labels recoverable from the figure: Oak Street, Lake Street]

    Section 2 Questions 9-16

    9. Why do they want to live closer to the campus?

    You may mark more than one correct answer.

    ___a) Because they both have 8am classes next semester.

    ___b) Because if it is closer, they'll be able to sleep in a little more.

    ___c) Because it is difficult to find parking spots in the morning.

    ___d) Because they both ride bicycles to school.

    10. Which streets are closest to the campus?

    ___a) Oak, Hill, and Lake.

    ___b) Second, Green, and Lake.

    ___c) Main, Second, and Oak.

    ___d) Green, Maple, and Hill.

    11. Based on the conversation, choose from the answers below and write the letters in the
    blanks to complete each person's list. Each blank takes one answer.
    An answer can be chosen more than once.

    Alice Lauren

    Backyard

    Under $800 for each

    a. Parking.

    b. $ 1000 total.

    c. Balcony.

    d. $ 700.

    e. Big kitchen.

    f. Under $ 800 for each.

    g. Furnished.

    h. Large living room.

    12. What is NOT available on Second Street?

    ___a) Backyards to play with pets.

    ___b) Parking spots.

    ___c) Apartments.

    ___d) Houses that allow pets.

    13. Why doesn't Lauren like to live on Main Street?

    ___a) Because it is not safe at night.

    ___b) Because there are a lot of car accidents on Main Street.

    ___c) Because the grocery store is far away from Main Street.

    ___d) Because it is quite noisy sometimes.

    14. How far away is Hill Street from Main Street?

    ___a) About 4-5 streets up.

    ___b) About 4-5 streets down.

    ___c) About 6-7 streets up.

    ___d) About 6-7 streets down.

    15. On which street are they most likely to find a place to live?

    ___a) Main.

    ___b) Lake.

    ___c) Oak.

    ___d) Hill.

    16. What are they going to do tomorrow?

    ___a) They are going to visit each house.

    ___b) They are going to look for more housing options.

    ___c) They are going to call landlords to make appointments for visits.

    ___d) They are going to stop looking for a new place and stay where they live now.


    Section 3

    Directions:

    In this section, you will listen to a conversation between a professor and a
    student. The conversation will not be repeated. After listening to the
    conversation, turn to the next page and answer the questions based
    on what is stated or implied in the conversation. Please place a check mark
    (✓) next to the correct answer. You can take notes while you are listening.
    You have 10 minutes to answer the questions.

    Do NOT turn to the next page until you are told to do so.

    Section 3 Questions 17-24

    17. Why does this student come to Professor Turner?

    ___a) To get advice about a writing assignment.

    ___b) To submit a course assignment.

    ___c) To get advice about class registration.

    ___d) To ask questions about her grade for the course.

    18. What are the courses that students have to take before the main ones called?

    ___a) Prep-course.

    ___b) Precalculus.

    ___c) Prerequest.

    ___d) Prerequisite.

    19. Which introductory course is this student currently taking this semester?

    ___a) Introduction to Sociology.

    ___b) Introduction to Psychology.

    ___c) Introduction to Astronomy.

    ___d) Introduction to Philosophy.

    20. Which introductory course is required by most of the main Sociology courses?

    ___a) Introduction to Anthropology.

    ___b) Introduction to Sociology.

    ___c) Introduction to Psychology.

    ___d) Introduction to Social Work.

    21. Which two majors is the student considering for her degree?

    ___a) Psychology and Sociology.

    ___b) Psychology and Social Work.

    ___c) Sociology and Anthropology.

    ___d) Literature and Physiology.

    22. When should students decide on the major for their degree?

    ___a) At the end of freshman year.

    ___b) Before the end of sophomore year.

    ___c) At the beginning of sophomore year.

    ___d) Before the beginning of senior year.

    23. Which courses is the student most likely to take next semester?

    You may mark more than one correct answer.

    ___a) Introduction to Psychology.

    ___b) Advanced level Sociology courses.

    ___c) Advanced level Psychology courses.

    ___d) Introduction to Sociology.

    24. What does the professor recommend that the student do when selecting
    courses? You may mark more than one correct answer.

    ___a) To get advice from classmates and hear about their opinions of the

    courses.

    ___b) To check if there are other courses that are necessary to complete first.

    ___c) To get course syllabi from professors to check for the assignments

    beforehand.

    ___d) To take courses from various majors to find which best suits the student's
    interests.

    This is the end of the listening section. You can leave early once you have completed the
    test. Please submit the completed test to either Alice or Takako. If you have any questions
    regarding this test, please feel free to ask us.

    Revised Listening Prompt (Section 2)

    Script

    Topic: Housing

    A: Hey, Alice. I found there is a lot of housing information in today's newspaper.

    B: Oh, that's great, Lauren! Let's take a look together.

    A: Since both of us will have 8 AM classes next semester, it would be great if the house is
    close to the campus, so we won't have to drive to school.

    B: I totally agree! It's difficult to find a parking spot in the morning...

    A: There are three streets close to campus: Main, Second, and Oak...

    B: Do you want to live in a house or an apartment?

    A: Does a house have a backyard? I want to play with my dog there.

    B: Let's see... maybe we should make a list first so that we can find the best one!

    A: Sounds great! I hope there's a backyard for my dog. Also, parking spots should be available
    because we both have cars, and the rent should be about $700 for each of us.

    B:

    A: Okay! Well, on this page of the newspaper, I found two advertisements showing two
    houses on Second Street, and they both have backyards. The only difference is that one of
    the backyards is smaller. But it seems that they don't have parking spots... we would have to
    park on the street.

    B: What about Main Street or Oak Street? Are there any options available?

    A: Hmm... Most of the housing options on Main Street for now are apartments without
    backyards, but they do include parking spaces for both of us.

    B: I don't like living on Main Street because it's very noisy sometimes :(

    A: That is true, because it is quite close to downtown, and there are many bars downtown too.

    B: How about looking for housing options that are still within walking distance of the campus,
    but just a little bit farther up the hills?

    A: Sure, are there any houses available on other streets?

    B: There are a few options on Hill Street, which is about 4 or 5 streets up from Main Street.

    A: Look! There are several options available on Hill Street and the streets near it! And some
    of them have backyards AND parking spaces provided for the tenants too!

    B: Sounds cool. The monthly rent is almost the same price... why don't you call each landlord
    tomorrow to make appointments for visits?

    A: Okay, sounds great! I'll call these two houses tomorrow!

    Revised Scoring Key for Objectively-Scored Section

    1. d
    2. Unmarked
    3. b
    4. c
    5. Unmarked
    6. a
    7. d
    8. b
    9. c
    10. d
    11. a
    12. c
    13. a
    14. Unmarked
    15. c
    16. Unmarked
    17. c
    18. a
    19. c
    20. a
    21. e
    22. b
    23. d
    24. a
    25. d
    26. c
    27. c
    28. d
    29. b
    30. b
    31. a
    32. b
    33. c
    34. d
    35. b
    36. d

    Revision and Rationale

    First of all, the directions of the test, including its administration process, were revised.
    Although the administrators (Alice and I) told the examinees not to open the test until they
    were told to do so, most of them actually opened the test as soon as they received it. In terms
    of the administration guide, the administrators need to tell the test takers, before handing
    out the test packets, that they are not allowed to open the pages, and then make sure that they
    comply. In addition, the revised test devotes a separate page to the directions for each
    section, which may prevent the test takers from reading the questions accidentally. The
    directions in each section include more detailed information about the recorded material so
    that the examinees do not have to guess the situations in which the lecture and
    conversations occur.

    Secondly, based on the results of the statistical analysis, the listening prompts and questions
    were revised. Item facility revealed an unbalanced distribution of item difficulty. A closer
    look at the distribution showed that most of the difficult items were in Section 1, whereas
    there were no difficult items in Section 2. Therefore, Section 2 required large-scale revision.
    The recording prompt of Section 2 contained many complicated street names, which might be
    extremely confusing for examinees who do not know the streets in Monterey. Indeed, all
    participants in the pilot test were living in Monterey, which might have helped them to answer
    correctly. In order to measure listening proficiency in English, such pre-existing knowledge
    should not be required. Therefore, the revised prompt uses street names that are simple and
    common in the United States, such as Main, Second, and Oak. Every question associated with
    street names was also revised. In addition, the revised test includes a map with the street
    names that the examinees are allowed to look at while listening (Figure 1).

    Although Sections 1 and 3 did not seem to need wide-ranging changes, several items were
    revised. For example, Item 8 had an item facility value of 0.25 but did not discriminate (its
    item discrimination value was 0). The revised Item 8 uses simpler vocabulary than the
    original one.

    The original Item 8:

    5. What is the tendency that celebrates our own success as the indication of our internal
    abilities and failure as the results of external factors?

    ___a) Self-regulation bias.

    ___b) Self-serving bias.

    ___c) Self-success bias.

    ___d) Self-ability bias.

    The revised Item 8:

    5. What is the tendency that identifies our own success as our internal abilities and failure
    as the results of external factors?

    ___a) Self-regulation bias.

    ___b) Self-serving bias.

    ___c) Self-success bias.

    ___d) Self-ability bias.

    In Section 3, Items 33 and 34 were also revised because the answer options were
    grammatically problematic. Instead of "Advance level Sociology/Psychology courses," the
    revised options read "Advanced level Sociology/Psychology courses."

    The original Items 33 and 34:

    23. Which courses is the student most likely to take next semester? You may mark
    more than one correct answer.

    ___a) Introduction to Psychology.

    ___b) Advance level Sociology courses.

    ___c) Advance level Psychology courses.

    ___d) Introduction to Sociology.

    The revised Items 33 and 34:

    23. Which courses is the student most likely to take next semester?

    You may mark more than one correct answer.

    ___a) Introduction to Psychology.

    ___b) Advanced level Sociology courses.

    ___c) Advanced level Psychology courses.

    ___d) Introduction to Sociology.

    Moreover, the answer options of Items 29, 30, and 31 were revised by replacing the academic
    subjects with ones that are more similar to Psychology or Sociology. Additionally, the provisos
    in the items are bolded and italicized to draw the test takers' attention. Lastly, the
    recording materials also require change. The voices of the two people in the recordings were
    very similar, which might have made it difficult for the examinees to tell the speakers apart.
    Therefore, as a possible change, the conversations should be recorded by people whose voices
    are clearly distinguishable. In addition, even though the recording quality was fine overall,
    the conversations at the beginning of the recordings were too quiet to hear, which suggests
    that the recording should be done with reliable equipment.

    Proposed Validity Investigation for Objectively-Scored Section

    There are several ways to investigate the validity of the listening section. One approach is
    to have experts, such as Summer ICE program directors and instructors, review the listening
    section.

    Criterion-related approaches are another way of investigating validity.
    Concurrent validity shows how strongly performance on this original test correlates with
    performance on an existing validated assessment that measures the same specifications. In
    particular, this listening test was developed by adapting TOEFL iBT Listening.
    Therefore, comparing performance on this listening section with performance on the TOEFL
    iBT listening section may indicate how closely the two assessments are correlated.

    Another criterion-related approach is predictive validity, which reveals how well
    performance on this listening test predicts test takers' future academic performance.
    This listening test attempts to measure students' proficiency levels before they enroll
    in the Summer ICE program. Comparing the scores on this listening test with the
    students' academic performance during the program might therefore indicate how
    the two are related.
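    A concurrent validity check of this kind usually comes down to a correlation coefficient. The sketch below computes a Pearson correlation between the pilot listening scores and a second set of scores. The TOEFL iBT Listening scores in the example are hypothetical placeholders invented for illustration, since no such scores were collected for the pilot group.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two paired score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired lists of at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(ss_x * ss_y)


# Pilot listening totals for S1-S8 (from the item facility tables).
listening_scores = [25, 26, 36, 31, 23, 22, 26, 32]

# HYPOTHETICAL TOEFL iBT Listening scores for the same examinees;
# these values are placeholders, not data from the study.
toefl_listening = [14, 15, 24, 20, 12, 13, 16, 21]

r = pearson_r(listening_scores, toefl_listening)
print(f"r = {r:.2f}")
```

    The same function could be used for predictive validity by pairing the listening scores with a later measure, such as course grades in the program. With only eight examinees, any coefficient obtained would be very unstable, so a larger sample would be needed before drawing conclusions.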

    Subjectively-Scored Section

    (Speaking Test)

    Specifications and Rationale

    The objectively-scored and subjectively-scored sections were designed independently
    and did not reinforce each other. As previously stated in the original specifications,
    the speaking test will be subjectively scored and conducted in a face-to-face interview format,
    and each testing session should be about 10 minutes long. Unlike traditional assessment, which
    requires students to select an answer or to recall information to complete the task (such as
    reading a short story and then retelling it in their own words as a presentation), this
    speaking test is designed on the principle of authentic assessment. According to Wiggins
    (1993), in authentic assessment students must use knowledge to perform the assigned task
    effectively and creatively, and the task can replicate questions or problems that students may
    face in the field. Therefore, while the overall purpose of this placement test is to assess
    students' academic English proficiency levels before the program starts so that teachers in
    the program can modify the lessons to meet the students' actual needs, in this speaking test
    students will also be asked to perform meaningful tasks (i.e., answering questions that may
    arise in real-world settings), and their performance should demonstrate their competencies and
    communicative level.

    For the first part of the test (4-5 minutes), the student will be prompted to talk about
    a topic chosen by the teachers; the topic for the first part is selected based on the
    result of each student's self-assessment survey (see Appendix). The procedure for choosing
    the topic question is explained in the Test Administration Guide section. Before proceeding to
    the second part, the student will receive a topic card randomly selected by the teachers and
    will be given a minute of preparation time to write down, on the paper provided on site, any
    ideas that he/she would like to bring up regarding the given topic. When the minute
    is up, the teachers will prompt the student to respond to the topic. Because most
    students will probably pursue higher education in the US later on, and the purpose of
    the speaking test is to see whether they can communicate successfully in an academic setting,
    the random topics in this part of the test are related to academic or college life. Students
    will have to think critically in order to provide a well-reasoned and consistent response.

    The criteria on which students will be evaluated are listed and elaborated in the Scoring
    Protocol section.

    Subjectively-Scored Test as Administered

    Original Speaking Test Questions (Topic Prompts)

    Part I: 4-5 minutes

    Personal Interest/Information:

    1) Do you have any favorite food? Are there any foods you dislike?

    2) How do you usually spend your free time?

    3) What is your favorite subject in school? (Prompt them to elaborate when answer is

    given.)

    Comparison/Opinion:

    4) Can you tell me why do you want to come to ICE for this summer instead of enjoying

    time with your family and friends in your home country?

    5) Can you tell me what are some facts about America that you find interesting?

    6) How is America (or American culture) different from your home country? Are there also

    any similarities?

    ==== 1 minute of preparation time ====

    Part II: 4-5 minutes

    7) Tell me about your opinion: How do you think about having international students as

    classmates? Do you think it'll help to improve your English? Please elaborate your

    answer.

    8) What do you plan on studying in college? Any particular reason for choosing that major?

    9) Why do you think it is important to learn English, and what are the benefits of able to

    speak English?

    Subjective Section Scoring Key

    Original Test Rubric: Speaking Test (Spring 2013)

    Name: _______________

    Competencies

    Grammar: Sentence structure & verb tense

    Content: Relevance to the prompt

    Vocabulary: Appropriateness & complicacy of terms chosen

    Discourse: Turn-taking, coherence & cohesions

    Delivery: Fluency of the speech & pronunciation

    Level 1

    Grammar: Frequent grammar mistakes, takes longer time to response due to having difficulties

    to deliver ideas.

    Content: Can do basic greetings and answer questions regarding to family and personal

    matters. But may lack relevance in ideas.

    Vocabulary: Incorrect use of vocabulary, lack of proper vocabulary to communicate adequately.

    Discourse: Able to complete solid ideas though not connecting to each other, and shows

    difficulty in take-turning.

    Delivery: Often pauses to connect ideas and slow-paced, incorrect pronunciations were

    frequently made in communication.

    Level 2

    Grammar: Able to deliver ideas adequately with only some grammar errors, and can identify

    when errors were made (self-correct).

    Content: Can talk about personal preferences and academic topics. Ideas are relevant to the

    prompt.

    Vocabulary: Able to use general and some basic academic terminology appropriately with minor

    mistakes. Little difficulty in elaboration when prompted.

    Discourse: Able to perform turn-taking and connect complex ideas, but may lack a little of

    coherence and cohesions.

    Delivery: Makes some pronunciation mistakes but generally understandable, with only some

    pauses, but pacing is appropriate.

    Level 3

    Grammar: Often self-correct right away if errors were made; very few grammar errors, and able

    to deliver ideas with correct grammar accurately.

    Content: Can support personal opinions and able to talk about academic topics. Ideas are

    relevant and elaborated.

    Vocabulary: Able to use both general and variety of academic terms appropriately and

    correctly. Minor mistakes in metaphor or idiom expressions.

    Discourse: Able to perform turn-taking effectively, and able to connect complex ideas with

    coherence and cohesions.

    Delivery: Only minor pronunciation errors but generally easy and clear to understand. Shows

    appropriate pacing and pauses in communication.


    Test Administration Guide

    Prior to the speaking test, students must complete the self-assessment survey. The survey

    will be given to the students at the end of the Listening test, and they must return it to the

    teachers before they leave the classroom. The primary purpose of this self-assessment survey

    is to determine which level of difficulty of questions to ask in the first part of the

    speaking test.

    To prepare for the test, teachers must make sure that the following materials and equipment

    are ready so that each session can be documented and conducted smoothly. First, a laptop

    equipped with Audacity, an iPod or iPhone, or any other audio recording device should be

    present to record each test session; the recording is to be used later for evaluation only

    when the two teachers cannot agree on the level of the student. Some pencils and blank sheets

    of paper are also required because, between the first and second parts of the test, students

    get one minute of preparation time to jot down any ideas that they would like to talk about

    upon receiving the topic card for the second part of the test. Last but not least, the test

    prompts for the first part of the test and the topic cards for the second part of the speaking

    test should be typed and ready on the day of testing.

    There are also several guidelines for teachers to follow before conducting the test. When the

    student enters the testing classroom, teachers and student will exchange greetings, and both

    teachers will take turns giving a brief self-introduction before asking the student to

    introduce him/herself. For the student's self-introduction, make sure to have them include

    where their home country is and how long they have studied English, in order to estimate their

    expected linguistic ability. Prior to beginning the actual test, the teachers should explain

    the following to every student:

    o "Please relax and don't think of this as a test that will judge your English ability.

    Also, we'll not judge your opinions, so please speak freely and just think of these 10

    minutes as a chat between teachers and the student." This is to let the student know that

    his/her opinion on a given matter will not affect how he/she is evaluated, so that students

    do not feel restricted in what they say and can best express themselves.

    o "We will be audio-recording the entire speaking test only for our use in the evaluation

    process later. We'll not distribute nor upload the content." This is to inform them

    politely that we respect their privacy and guarantee that the testing materials will not be

    used for any other purpose; no one outside the faculty and staff will know or hear what

    they say during the test.

    o "Do you have any question before we start?" This is the signal that indicates the

    beginning of the actual test.

    There will be two teachers present at the testing, and students will enter the testing

    classroom one at a time according to the time slot that they signed up for previously (a

    sign-up sheet will be passed around in the classroom prior to the beginning of the Listening

    test, so students can choose their desired time slot for the speaking test). During the test,

    one of the teachers will be the main test giver who initiates questions and turn-taking, while

    the second teacher will be the assistant test giver who asks follow-up questions (for example,

    asking the student to elaborate on his/her answer if the answer lacks supporting reasons).

    For the first part of the speaking test, teachers will begin with the prompt that was

    selected based on the result indicated on the self-assessment survey: if the student answered

    mostly "disagree" to the performance tasks that are difficult, then questions for the first

    part will be selected from the Personal Interest/Information category; however, if the student

    answered mostly "agree" or "strongly agree" to the performance tasks that are difficult, such

    as those pertaining to the ability to talk or discuss freely about personal interests and

    academic-related topics and to feel confident with unexpected turn-taking in conversations,

    then questions for the first part will be chosen from the Comparison/Opinion category. To

    avoid any confusion that would cost the student time, teachers must speak clearly when giving

    the question. As for feedback during the test, teachers may respond by nodding or

    back-channeling; in addition, teachers should maintain eye contact with the student rather

    than writing on the rubric, so that they do not appear to be grading while he/she is still

    speaking. Most importantly, teachers shall not interrupt the student during his/her comment.

    Follow-up questions are necessary if the student only answered the original question without

    providing proper elaboration to support the answer.
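
    The category selection rule described above can be sketched in Python as follows. This is only

    an illustrative sketch under stated assumptions: the guide does not say which survey items

    count as "difficult" or how "mostly" is defined, so the function name

    choose_part_one_category is hypothetical, the caller is assumed to pass in only the answers to

    the difficult items, "mostly" is read as a strict majority, and a tie falls back to the easier

    category.

```python
from collections import Counter

PERSONAL = "Personal Interest/Information"
COMPARISON = "Comparison/Opinion"


def choose_part_one_category(difficult_item_responses):
    """Pick the Part I prompt category from survey answers on the difficult items.

    `difficult_item_responses` holds one answer per difficult survey item, each
    being "Strongly Agree", "Agree", or "Disagree" (the three survey options).
    """
    counts = Counter(r.strip().lower() for r in difficult_item_responses)
    agree = counts["agree"] + counts["strongly agree"]
    disagree = counts["disagree"]
    # Assumption: "mostly" means a strict majority of the difficult items.
    # Assumption: on a tie, start with the easier Personal category.
    return COMPARISON if agree > disagree else PERSONAL


if __name__ == "__main__":
    # Hypothetical answers, for illustration only.
    print(choose_part_one_category(["Agree", "Strongly Agree", "Disagree"]))
    print(choose_part_one_category(["Disagree", "Disagree", "Agree"]))
```

    If the program later fixes an explicit cut-off (for example, a minimum number of "agree"

    answers), only the final comparison would need to change.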

    At the end of the test, teachers will tell the student that the test is over by thanking

    him/her for sharing his/her thoughts and information, and by telling the student that they

    look forward to meeting him/her again in classes. Teachers shall not discuss the result or the

    performance of the student at the end of the testing session.

    Scoring Protocol

    The rubric should be completed by the two teachers immediately upon finishing each test

    session. Clearly circle the level description that best describes the student's performance

    for each competency. The audio recording can be used to re-listen to the conversation if a

    decision cannot be made or details cannot be recalled. The competencies evaluated on the

    speaking test are: grammar (sentence structure and tense agreement), content (relevance to

    the prompt), vocabulary (appropriateness and complicacy of words chosen), discourse

    (turn-taking, coherence and cohesions), and delivery (fluency of the speech and pronunciation

    in general).

    Actual Test Administration Description

    The test was administered to three female students: two were current IEP students at MIIS,

    and one was a first-semester TFL-Chinese student. All three of the test takers had finished

    the objective portion of the test the week before. Each of the test takers was scheduled to

    take the speaking test according to her best available day and time in the week following the

    objective test. While I waited in front of the Holland Center at the meeting time, my

    co-teacher/administrator, Takako, went to look for an empty classroom in Morse for us to

    conduct the test. She would then notify me of the location of the testing classroom via text

    message.

    My original test administration guide requires the test takers to fill out the

    self-assessment survey immediately after finishing the objective portion of the test

    (Listening). However, because my self-assessment survey (Appendix) was still under development

    during the week of piloting the listening test, I was unable to have them fill out the survey

    in advance to determine which category of questions to ask in the first part of the speaking

    test. Therefore, all three of the test takers did their self-assessment survey on the testing

    day, and Takako and I then decided which category of questions to begin the first part of the

    test with, based on how many "disagree" answers were ticked next to the questions that involve

    difficult performance tasks. If the test taker ticked mostly "agree" or "strongly agree" on

    the survey items that involve difficult performance tasks, we started the first part of the

    test with questions from the Comparison/Opinion category. If the test taker ticked mostly

    "disagree" on difficult performance tasks, we selected questions from the Personal

    Interest/Information category.

    Prior to beginning the test, we informed the test taker that we would like to audio-record

    the test session for evaluation purposes only, and that we would not play the recording to

    anyone besides our professor (if necessary). After receiving permission to audio-record, we

    set up our iPhones for recording. To signal the beginning of the test, I first told the test

    taker to relax and just think of the next 10 minutes as a chat between friends (because all of

    our test takers were our acquaintances), and asked her if she had any questions. During the

    speaking test, I was the main test giver, who asked the first questions and managed

    turn-taking, and Takako was the assistant test giver, who asked follow-up questions. At the

    end of each test, we thanked the test taker for being willing to help us pilot our tests.

    Inter-rater Reliability

    For this test, both Takako and I were the raters, and we scored each test taker individually,

    using the same scoring rubric that I created. There are a total of 5 competencies to evaluate,

    and each competency is divided into 3 levels, with Level 1 being the lowest/near-beginner

    stage, Level 2 being the intermediate stage, and Level 3 being the advanced stage. The scoring

    protocol is that immediately after each testing session, teachers evaluate the student's

    performance by circling or placing a check mark in the box of the descriptor that best

    describes the student's level for each competency. Therefore, if a student received mostly

    Level 3 in the competencies across the rubric chart, then that student is determined to be in

    the advanced level of the class. That is to say, if both teachers' evaluations align on a

    single level in three or more competencies, then that level is the level of the student.

    However, if fewer than three descriptors are aligned on a single level between the two

    teachers, then they will have to resolve the differences through discussion, listening to the

    audio recording again, to determine a final placement for the student.
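
    The placement rule above can be sketched in Python as follows. This is a minimal sketch, not a

    definitive implementation: the function name placement_level and the dictionary layout are

    hypothetical, and the rule is read as "count the competencies on which both raters circled the

    same level, and place the student at a level once that count reaches three". The ratings in

    the usage lines are made up for illustration and are not the pilot data.

```python
COMPETENCIES = ("grammar", "content", "vocabulary", "discourse", "delivery")
LEVELS = (1, 2, 3)


def placement_level(rater_a, rater_b, threshold=3):
    """Return the agreed placement level, or None if discussion is needed.

    `rater_a` and `rater_b` map each competency name to the level (1-3)
    circled by that rater. A level counts as "aligned" on a competency only
    when both raters circled that same level for it.
    """
    for level in LEVELS:
        aligned = sum(
            1
            for competency in COMPETENCIES
            if rater_a[competency] == level and rater_b[competency] == level
        )
        # With five competencies, at most one level can reach three alignments.
        if aligned >= threshold:
            return level
    # Fewer than three aligned descriptors: re-listen to the recording and discuss.
    return None


if __name__ == "__main__":
    # Hypothetical ratings: the raters differ only on vocabulary.
    rater_a = dict.fromkeys(COMPETENCIES, 2)
    rater_b = dict(rater_a, vocabulary=3)
    print(placement_level(rater_a, rater_b))  # prints 2
```

    Returning None rather than guessing a level keeps the discussion step explicit, which matches

    the protocol of resolving disagreements with the audio recording.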

    In our pilot test, we happened to score the first two test takers in 100% agreement. On the

    third test taker (shown in attached document: Maggie), Takako and I had only one difference

    in the scoring: I circled the test taker's vocabulary competency at Level 2, whereas Takako

    thought the vocabulary level was at Level 3. We resolved this difference by listening to our

    audio recording again and discussing the word choice in her discourse. The conclusion that we

    arrived at was that she should be placed at Level 2 because, although she was quite fluent in

    her speech delivery and was able to stay on topic (even providing adequate elaboration), most

    of her comments and word choices were rather repetitive, which did not entirely meet the

    requirement of "able to use both general and variety of academic terms appropriately and

    correctly" as indicated in the scoring rubric for the Level 3 Vocabulary descriptor. In

    addition, even with this one difference between Takako's and my ratings, both of us had scored

    her with an alignment in three competencies; therefore, after resolving that difference, we

    were able to reach 100% agreement on the final scores of all three of our test takers.
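
    For comparison with the 100% figure reached after discussion, the agreement before discussion

    can also be expressed per competency. The short Python sketch below uses only the counts

    stated above (five competencies per test taker, full agreement on the first two test takers,

    and one disagreement on the third). Simple percent agreement is an assumption made here for

    illustration; it is not a statistic reported in this paper, and the function name

    percent_agreement is hypothetical.

```python
def percent_agreement(agreed, total):
    """Simple percent agreement: share of ratings on which both raters matched."""
    return 100.0 * agreed / total


COMPETENCIES_PER_TAKER = 5
# Competencies on which the two raters matched before discussion,
# for the first, second, and third test taker respectively.
agreed_per_taker = [5, 5, 4]

third_taker = percent_agreement(agreed_per_taker[2], COMPETENCIES_PER_TAKER)
overall = percent_agreement(
    sum(agreed_per_taker), COMPETENCIES_PER_TAKER * len(agreed_per_taker)
)

print(third_taker)        # 80.0
print(round(overall, 1))  # 93.3
```

    With only three test takers, these percentages are descriptive and should not be read as a

    stable estimate of inter-rater reliability.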

    Content Analysis for Subjective Section

    The question genre was first based on the fact that these students will most likely return to

    the US for higher education. To make sure this test would genuinely measure their

    communicative competency in an academic and campus-life setting, I reviewed my previous

    specifications draft and researched possible questions that could reflect meaningful

    real-world performance, which is when I came across the concept of authentic assessment.

    To remind myself that this test is designed to assess their communicative skill in academic

    life, I checked my draft of the rubric frequently while developing the questions to ensure

    that they could reflect real-world circumstances. Moreover, even though this test is meant to

    be a placement test for Summer ICE program students ranging in age from 15 to 19, when we

    piloted it with two current IEP students and one current TFL student at MIIS, they all seemed

    perfectly relaxed and chatty throughout their test sessions after we told them, prior to

    beginning the test, not to think of this pilot as a test but just as a normal chat between

    acquaintances. Therefore, informing students beforehand and choosing topics that seem

    personally relevant and resemble real-life situations could best put them in a less-stressed

    state of mind, so that they can express themselves better and their output is more natural

    and genuine.

    Revised Speaking Test Questions (Topic Prompts)

    Part I: 4-5 minutes

    Personal Interest/Information:

    1) Do you have any favorite food? (Follow up: What about your second favorite food? Why is it

    not your first choice?)

    What about any dislikes of food?

    What would be your ideal perfect meal on a date? And in what kind of restaurant?

    2) How do you usually spend your free time after you finish your homework assignment?

    (Follow up: Do your parents help you on your homework when you have questions? Or do

    you think that homework should be done alone by yourself?)

    3) What is your favorite subject in school? (Prompt them to elaborate when answer is given.)

    4) Do you enjoy shopping, and what do you usually buy? (Follow up: Do you prefer to shop

    alone, or to shop with your family and friends?)

    5) Do you have any hobby? Are your hobbies relaxing or they'll make you excited?

    Comparison/Opinion:

    6) Can you tell me why do you want to come to ICE for this summer instead of enjoying time

    with your family and friends in your home country?

    7) Can you tell me what are some facts about America that you find interesting?

    8) How is America (or American culture) different from your home country? Are there also any

    similarities?

    9) Do you think fine arts are important in life? Between music and art, such as paintings and

    sculptures, which one do you prefer/or enjoy the most, why and why not.

    10) What method works best for you when you're learning something new? What method have

    you tried that didn't work out so well? (If they cannot understand the word "method",

    rephrase the question to "What is the best strategy that you would use when you're learning

    something new?")

    ===== 1 minute of preparation time =====

    Part II: 4-5 minutes

    11) Tell me about your opinion: How do you think about having international students as

    classmates? Do you think it'll help to improve your English? Please elaborate your answer.

    12) What do you plan on studying in college? Why do you want to study in that area?

    (If student answered "I don't know yet", move on to another topic choice)

    13) Why do you think it is important to learn English, and what are the benefits of able to speak

    English?

    14) Is it important to have a university degree in order to get a good job in your home country?

    What about having an university degree from a foreign university, and will that enhance your

    future employment opportunity in your home country?

    15) If there's one change that you think it's important for modern school system to make, what

    would you recommend and your reasons?

    Revisions and Rationale

    Upon finishing the pilot test, I immediately knew what revisions would be necessary. First of

    all, there was no need to revise the scoring rubric, as each descriptor clearly describes the

    ability expected at each level for each competency, and it was easy to determine the students'

    levels on the five competencies accordingly. Since the subjectively-scored section is a

    speaking test with topic prompts, the only revision I had to make was to add more prompts to

    each part of the test and to change the wording of some of the questions so that students can

    understand them more easily and elaborate on their answers without being asked follow-up

    questions. One thing I noticed during the actual test administration was that all three test

    takers were rather advanced communicatively and could get through a question (even with

    follow-up questions) very fast, which shortened my intended 10-minute speaking test to

    approximately 7 minutes or less. Furthermore, I felt that asking only one main question and

    asking them to elaborate on the answer during the first part of the test wasn't enough (given

    that they were able to answer and elaborate to an extent on their own). Therefore, in case

    there are students with high English proficiency, especially in discourse and fluency, it

    would be best to prepare at least 2 questions each for Part I and Part II of the speaking

    test; that way, if a student gets through the first question with adequate elaboration, there

    will still be another question to fill the time gap and fulfill the intended 4-5 minutes for

    each part of the test.

    Another minor revision that needs to be made is to enforce the one-minute preparation time

    between Part I and Part II of the test. Because all three of our pilot test takers were quite

    advanced in discourse competency and very talkative, they were able to proceed straight into

    the second part of the test (the more academic and college-life-centered questions) without

    pausing to think and write down their topic cues; one of them even said she didn't need that

    one-minute preparation time to sort her thoughts in order to produce a more coherent response

    to the topic. Nevertheless, if this test were administered in the way intended, I would

    definitely enforce the one-minute preparation time rule in order to ensure fairness among all

    the examinees, because not everyone would have the same level of communicative ability in a

    test setting.

    References

    Bejar, I. I., Douglas, D., Jamieson, J., Nissan, S., & Turner, J. (2000). TOEFL 2000 listening

    framework. Princeton, NJ: Educational Testing Service.

    Carrell, P. L. (2007). Notetaking strategies and their relationship to performance on listening

    comprehension and communicative assessment tasks. Research Report-Educational Testing

    Service Princeton RR, 7.

    Hughes, A. (2003). Testing for language teachers (2nd ed.). NY: Cambridge University Press.

    Turner, J. (2013). Item analysis handout (Obtained as class handouts on ).

    Wiggins, G. P. (1993). Assessing student performance. San Francisco: Jossey-Bass Publisher.

    Appendix

    Self-Assessment Survey

    Name: ________________

    Please read the following questions carefully and place a mark next to your choice of answer.

    1. I can initiate spoken greetings and talk about basic personal information.

    ___ Strongly Agree   ___ Agree   ___ Disagree

    2. I can talk about basic things such as school life, food, and my hobbies.

    ___ Strongly Agree   ___ Agree   ___ Disagree

    3. I can perform daily activities such as ordering food and go out shopping alone.

    ___ Strongly Agree   ___ Agree   ___ Disagree

    4. I can t