7/30/2019 Test Development Final
1/64
Summer Intensive College English (ICE): Placement Test for Academic Purpose
Alice Chan and Takako Kobayashi
Professor Jean Turner
EDUC 8540: Language Assessment
May 21, 2013
Table of Contents
The Original Test Specifications for Entire Test
Objectively-Scored Test as Administered
Objective Section Scoring Key and Test Administration Guide
Actual Test Administration Description
Statistical Analysis
    Descriptive Statistics
    Item Facility
    Item Discrimination
    Internal Consistency (KR20)
Revisions and Rationale
Proposed Validity Investigation for Objectively-Scored Section
Subjectively-Scored Test as Administered
Subjective Section Scoring Key and Test Administration Guide
Actual Test Administration Description
Inter-rater Reliability
Content Analysis for Subjective Section
Revisions and Rationale
References
Appendix
    Self-Assessment Survey
The Original Test Specifications for Entire Test
Test Overview
Alice and I would like to develop the listening and speaking sections of the placement test for the six-week session of the Summer Intensive College English (ICE) program. Liam, Vincent, and Sage are working on the reading and writing sections of the same test. Although students must have a TOEFL iBT score of at least 49 or an IELTS score of at least 4.5 to enroll, the program currently does not offer a placement test at the beginning of the session. Therefore, the overall purpose of this placement test is to assess students' academic English proficiency before the program starts so that teachers in the session can adjust their lessons to meet students' actual needs. Most of the students in the session are planning to attend colleges or universities in the United States, so the placement test should specifically measure academic English proficiency. According to last year's student demographics, the students in the ICE program are young learners aged 15 to 19 from a variety of L1 backgrounds: most came from the Middle East, two from South America, and one from Italy.
As identified above, the placement test covers four skills. All tasks will be presented in English. The students are required to take this placement test during the first two days after arrival, as follows:
Day 1: Reading and Writing
Day 2: Listening and Speaking
For the speaking section, students will be asked to sign up for a time slot because this section consists of a face-to-face interview. Therefore, on the second day, students will take the Listening placement test before noon and then take the Speaking task individually in the afternoon.
Listening Section
The listening section covers three categories of speech acts: a lecture, a conversation between a college faculty member and a student, and a conversation between peers about college life. Appropriate text types are therefore authentic lectures and conversations between faculty and students as well as between students. Each listening task will consist of a listening segment of approximately three minutes and eight multiple-choice questions. Students will have 20 minutes to complete the questions for each task, so the listening test will take 60 minutes in total.
The listening content should not require test takers to have prior knowledge of the topics. Instead of assessing students' existing knowledge, the listening section will assess comprehension of spoken English in academic content as well as in communication relevant to college life. According to Chapter 12 in Hughes, three global operations seem appropriate for the specification of these listening items: 1) obtain the gist; 2) follow an argument; and 3) recognize the attitude of the speaker. More specifically, based on the lists shown in that chapter, the following is a tentative list of the abilities to be assessed in the test.
Informational:
obtain factual information
understand request for information
understand expressions of need
understand requests for help
understand requests for permission
recognize and understand opinions
understand comparisons
recognize and understand suggestions
recognize and understand comments
Interactional:
understand greetings and introductions
understand expressions of agreement/disagreement
recognize speaker's purpose
understand requests for clarification
recognize requests for clarification
recognize requests for opinion
recognize attempts to persuade others
A lecture segment will be taken from one of the MIT 9.00SC Introduction to Psychology (Spring 2011) lectures, which are available on YouTube. For the conversation between peers, the topic will be housing because one of our friends has just moved to a new house and agreed to volunteer for the recording. The script for this conversation has already been written (Script), and we are going to record the conversation on April 16th. For the conversation between a college faculty member and a student, we will create a script and record the conversation, asking Professor Jean to participate if possible.
In terms of task type, the multiple-choice items in this section include questions and incomplete sentences. As identified above, all questions will be presented in English. The Listening section will be paper-based.
The instructions given to students during the listening section will be as follows:
1. Listen to the recording once. While listening, students will be allowed to take notes.
2. After listening, students will be allowed to open the test booklet and start the test. (Students will not be allowed to see the questions beforehand.)
3. Students will be given 20 minutes to complete each listening task.
The Listening section is objectively scored because it uses multiple-choice items. The criteria should reveal the degree of each listening ability listed above. The scoring rubric will be created once the tentative list of listening abilities to be assessed is finalized.
Speaking Section
For the Speaking test, students can choose their preferred time slot on a sign-up sheet that will be passed around the classroom before the Listening test begins. The test will take place in the afternoon after the lunch break. It will be conducted as a face-to-face interview, and each testing slot is 10 minutes long.
For the first 4-5 minutes, the student will be prompted to introduce themselves by talking about topics raised by the examiner. Topics can be eclectic, but the first conversation topic initiated by the examiner is chosen based on the results of the student's self-assessment survey, completed prior to the oral test (Self-Assessment Survey). For example, if a student answered mostly "neither agree nor disagree" or "disagree" on questions about whether they can respond to simple daily conversation or talk about their interests, the examiner may choose easier conversation openers such as the following:
1. Do you have any favorite food? What about any dislikes of food?
2. How do you usually spend your free time?
3. What is your favorite subject in school? (Prompt them to elaborate when answer is given.)
On the other hand, if a student answered mostly "agree" or "strongly agree" to the questions on the self-assessment survey, especially those about the ability to talk freely about personal interests and academic topics and to handle unexpected turn-taking in conversation, then the conversation opener can be chosen from the following samples:
1. Can you tell me why you wanted to come to ICE this summer instead of enjoying time with your family and friends in your home country?
2. Can you tell me some facts about America that you find interesting?
3. How is America (or American culture) different from your home country? Are there also any similarities?
These questions are meant to relax the students because they prompt personal answers rather than critical thinking. They also let the teachers get a sense of each student's personality, which may help them understand what might work with each student and adjust the lesson content (adding or removing material) based on the general information elicited from the introductory talk. In addition, this first part of the oral test determines the topic level for the second part: during this first portion, coherence of content, word choice (vocabulary), and fluency of speech (delivery) are the main criteria being evaluated. A tentative rubric for evaluating performance on the whole Speaking test will be included once the descriptions of the qualities for each criterion (which will distinguish students' levels) are finalized.
In the latter 5 minutes of the oral test, the student will receive a topic card and have 1 minute of preparation time to decide on their stance and organize their opinions on the given topic. The topics are related to academics or college life, given that most of the students may pursue higher education in the US later on. Each topic card has one random question, and the student will not know the question until the card is given to them. Sample topics include:
1. Tell me your opinion: What do you think about having international students as classmates? Do you think it'll help to improve your English? Please elaborate on your answer.
2. What do you plan on studying in college? Any particular reason for choosing that major?
3. Why do you think it is important to learn English, and what are the benefits of being able to speak English?
The Speaking test will be subjectively scored based on each student's performance during the 10-minute oral exam. Throughout the entire speaking test, students are graded on their delivery as well as on the coherence and cohesion of the ideas they present to the examiner. Criteria specific to the latter half of the test are word choice (academic vocabulary knowledge and expression of opinions), grammar (whether they can use grammar accurately and appropriately), and pronunciation. Pronunciation will not be weighted as heavily as the other criteria, because the purpose of the speaking test is to see whether students can succeed communicatively in an academic setting. Sample questions for both parts of the speaking test are still being developed, so the criteria for each portion of the test are subject to change.
Script
Topic: Housing
A: Hey, (Speaker B's name). I found there is a lot of housing information in today's newspaper.
B: Oh, that's great! Let's take a look together.
A: Since both of us will be having 8AM class next semester, it would be great if the house is close to the campus, so we won't have to drive to school.
B: I totally agree! It's difficult to find a parking spot in the morning...
A: There are three streets close to campus- Van Buren, Watson, and Franklin...
B: Do you want to live in a single house, or an apartment?
A: Does single house have a backyard? I want to play with my dog there.
B: Let's see...maybe we should make a list first so that we can find the best one!
A: Sounds great! I hope there's a backyard for my dog, also parking spots should be available because we both have cars, and the fee is about $700 for each.
B: (Speaker B shares her own list)
A: Okay! Well, on this page of the newspaper, I found two advertisements showing these two single houses on the Watson Street, and they both have backyards. The only difference is that one of the backyards is smaller. But, it seems that they don't have parking spots...we have to park on the street.
B: What about Van Buren Street or Franklin Avenue? Is there any available option?
A: Hmm... Most of the housing options that are on Van Buren Streets for now are apartments without backyards, but they do include parking spaces for both of us.
B: I don't like living on Van Buren Street because it's very noisy sometimes :(
A: That is true, because it is quite close to downtown, and there are many bars in downtown too.
B: How about looking for housing options that are still within walking distance to the campus, but just a little bit far up the hills?
A: Sure, are there any houses available on other streets?
B: There are few options on Clayton Street, which is about 4 or 5 streets up from Van Buren.
A: Look! There are several options available on Clayton and the streets nearby it! And some of them have backyards AND parking spaces provided for the tenants too!
B: Sounds cool. The monthly rent is almost same price...why don't you call each landlord tomorrow to make appointment for visiting?
A: Okay, sounds great! I'll call these two houses tomorrow!
Self-Assessment Survey
Please read the following questions carefully and place a mark next to your choice of answer.
1. I can initiate spoken greetings and talk about basic personal information.
___ Strongly Agree ___ Agree ___ Neither Agree/Disagree ___ Disagree
2. I can talk about basic things such as school life, food, and my hobbies.
___ Strongly Agree ___ Agree ___ Neither Agree/Disagree ___ Disagree
3. I can perform daily activities such as ordering food and go out shopping alone.
___ Strongly Agree ___ Agree ___ Neither Agree/Disagree ___ Disagree
4. I can talk and share freely about my culture and home country, or just any everyday topic.
___ Strongly Agree ___ Agree ___ Neither Agree/Disagree ___ Disagree
5. I can express and support my opinions in a discussion.
___ Strongly Agree ___ Agree ___ Neither Agree/Disagree ___ Disagree
6. I can manage unexpected turn-takings in discussion or unfamiliar topics.
___ Strongly Agree ___ Agree ___ Neither Agree/Disagree ___ Disagree
7. I can participate in an academic discussion and express my ideas.
___ Strongly Agree ___ Agree ___ Neither Agree/Disagree ___ Disagree
Objectively-Scored Test as Administered
Listening Test
45 minutes
Name: __________________
Section 1
Do NOT open the next page until you are directed to do so.
You will listen to a lecture. Based on what you hear, answer the questions below. Please place a check mark (✓) next to the correct answer. You can take notes while you are listening. You will listen to the lecture only once. You have 15 minutes to complete this section.
Section 1 Questions 1-8
1. What is attribution theory?
___a) The theory explains the process of attributing outcomes based on only
internal behavior.
___b) The theory explains the process of attributing outcomes based on
sequences of the event of a matter.
___c) The theory explains the process of attributing outcomes based on only
external events.
___d) The theory explains the process of attributing outcomes based on internal
behavior and external events.
2. Please check the answers that best describe internal causes and external reasons.
You may mark more than one correct answer.
___a) Internal causes include situational factors and controllable emotions.
___b) Internal causes refer to emotions, talents, and personal characteristics.
___c) External reasons are environmental factors.
___d) External reasons consist of uncontrollable emotions.
3. According to the lecture, what is the possible external cause associated with the car
accident?
___a) A deer.
___b) A dog.
___c) A bear.
___d) A horse.
4. If the wind is blowing the right direction when the bat hits the ball, which factor would
it be categorized under?
___a) Internal-Stable Factor.
___b) Internal-Unstable Factor.
___c) External-Stable Factor.
___d) External-Unstable Factor.
5. What is the tendency that celebrates our own success as the indication of our internal
abilities and failure as the results of external factors?
___a) Self-regulation bias.
___b) Self-serving bias.
___c) Self-success bias.
___d) Self-ability bias.
6. What is the tendency to blame a person's internal behavior?
___a) Internal Attribution Error.
___b) External Attribution Error.
___c) Fundamental Attribution Error.
___d) Attribution Decision Error.
7. Which hypothesis that assumes judgments on performance are fair because people
get the outcome they deserve?
___a) Just Work Hypothesis.
___b) Justice World Hypothesis.
___c) Justice Work Hypothesis.
___d) Just World Hypothesis.
8. What are the appropriate summary of attribution? You may mark more than one
correct answer.
___a) People attribute causes to categorize behaviors and events.
___b) People always fail to attribute causes correctly.
___c) It is important to consider internal and external reasons when interacting
with others.
___d) It is necessary to attribute causes to only internal reasons.
Section 1:
Listening Script
http://www.youtube.com/watch?v=L8SAyOqG1a4
Section 2
Do NOT open the next page until you are directed to do so.
You will listen to a conversation between peers. Based on what you hear, answer the
questions below. Please place a check mark (✓) next to the correct answer. You can
take notes while you are listening. You will listen to the conversation only once. You
have 15 minutes to complete this section.
Section 2 Questions 9-16
9. Why do they want to live closer to the campus? You may mark more than one correct
answer.
___a) Because they both have 8am classes next semester.
___b) Because it is closer, then they'll be able to sleep-in a little more.
___c) Because it is difficult to find parking spots in the morning.
___d) Because they both ride bicycles to school.
10. Which streets are closest to the campus?
___a) Clayton, Larkin, and Monroe.
___b) Jefferson, Madison, and Hellam.
___c) Van Buren, Watson, and Franklin.
___d) Scott, Van Buren, and Pierce.
11. Based on the conversation, choose the answers below and fill in the blanks to
complete each student's list. An answer can be chosen more than once.
Student A Student B
Backyard
Under $800 for each
a. Parking.
b. $ 1000 total.
c. Balcony.
d. $ 700.
e. Big kitchen.
f. Under $ 800 for each.
g. Furnished.
h. Large living room.
12. What is NOT available on Watson Street?
___a) There are no backyards to play with her dog.
___b) There are no parking spots, they'll have to park on the street.
___c) There are no houses available at this time.
___d) There are no houses that allow pets.
13. Why doesn't she like to live on Van Buren Street?
___a) Because it is not safe at night.
___b) Because there are a lot of car accidents on Van Buren.
___c) Because the grocery store is far away from Van Buren.
___d) Because it is quite noisy sometimes.
14. How far away is Clayton Street from Van Buren Street?
___a) About 4-5 street up from Van Buren Street.
___b) About 4-5 street down from Van Buren Street.
___c) About 6-7 street up from Van Buren Street.
___d) About 6-7 street down from Van Buren Street.
15. Which street are they most likely to find the place to live?
___a) Van Buren.
___b) Madison.
___c) Franklin.
___d) Clayton.
16. What are they going to do tomorrow?
___a) They are going to visit each house.
___b) They are going to look for more housing options.
___c) They are going to call landlords to make appointment for visiting.
___d) They are going to stop looking for new place and stay where they live now.
Section 2:
Listening Script
A: Hey, Alice. I found there are a lot of housing information in today's newspaper.
B: Oh, that's great! Let's take a look together.
A: Since both of us will be having 8AM class next semester, it would be great if the house is close to the campus, so we won't have to drive to school.
B: I totally agree! It's difficult to find a parking spot in the morning...
A: There are three streets close to campus- Van Buren, Watson, and Franklin...
B: Do you want to live in a house, or an apartment?
A: Does house have a backyard? I want to play with my dog there.
B: Let's see...maybe we should make a list first so that we can find the best one!
A: Sounds great! I hope there's a backyard for my dog, also parking spots should be available because we both have cars, and the fee is about $700 for each.
B: (Lauren speaks her own list)
A: Okay! Well, on this page of the newspaper, I found two advertisements showing these two houses on the Watson Street, and they both have backyards. The only difference is that one of the backyards is smaller. But, it seems that they don't have parking spots...we have to park on the street.
B: What about Van Buren Street or Franklin Avenue? Is there any available option?
A: Hmm... Most of the housing options that are on Van Buren Streets for now are apartments without backyards, but they do include parking spaces for both of us.
B: I don't like living on Van Buren Street because it's very noisy sometimes :(
A: That is true, because it is quite close to downtown, and there are many bars in downtown too.
B: How about looking for housing options that are still within walking distance to the campus, but just a little bit far up the hills?
A: Sure, are there any houses available on other streets?
B: There are few options on Clayton street, which is about 4 or 5 streets up from Van Buren.
A: Look! There are several options available on Clayton and the streets nearby it! And some of them have backyards AND parking spaces provided for the tenants too!
B: Sounds cool. The monthly rent is almost same price...why don't you call each landlord tomorrow to make appointment for visiting?
A: Okay, sounds great! I'll call these two houses tomorrow!
Section 3
Do NOT open the next page until you are directed to do so.
You will listen to a conversation between a professor and a student. Based on what you
hear, answer the questions below. Please place a check mark (✓) next to the correct
answer. You can take notes while you are listening. You will listen to the conversation only once. You have 15 minutes to complete this section.
Section 3 Questions 17-24
17. Why does this student come to Professor Turner?
___a) To get advice about a writing assignment.
___b) To submit a course assignment.
___c) To get advice about class registration.
___d) To ask questions about her grade for the course.
18. What does it call for the courses that students have to take before the main ones?
___a) Prep-course.
___b) Precalculus.
___c) Prerequest.
___d) Prerequisite.
19. Which introductory course is this student currently taking this semester?
___a) Introduction to Sociology.
___b) Introduction to Psychology.
___c) Introduction to Anthropology.
___d) Introduction to Physiology.
20. What introductory course that is required by most of the main Sociology courses?
___a) Introduction to Astrology.
___b) Introduction to Sociology.
___c) Introduction to Psychology.
___d) Introduction to Methodology.
21. Which two majors is the student interested in to be her degree?
___a) Psychology and Sociology.
___b) Psychology and Anthropology.
___c) Sociology and Anthropology.
___d) Sociology and Physiology.
22. When should students decide which major to be their degree?
___a) At the end of freshmen year.
___b) Before the end of sophomore year.
___c) At the beginning of sophomore year.
___d) Before the beginning of senior year.
23. Which courses is the student most likely to take next semester? You may mark
more than one correct answer.
___a) Introduction to Psychology.
___b) Advance level Sociology courses.
___c) Advance level Psychology courses.
___d) Introduction to Sociology.
24. What are the things that the professor recommends the student to do for selecting
courses? You may mark more than one correct answer.
___a) To get advice from classmates and hear about their opinions of the
courses.
___b) To check if there are other courses that are necessary to complete first.
___c) To get course syllabi from professors to check for the workload
beforehand.
___d) To take courses from various majors to find which suits your best interest.
Thank you for your participation. You can leave earlier once you have completed the test. Please submit this test to either Alice or Takako. If you have any questions regarding this test, please feel free to ask us.
Again, thank you so much for your support!
Section 3:
Listening Script
A: Hello, Professor Turner, may I come in?
P: Sure, come on right in! What can I help you?
A: Um... I have some questions regarding to classes for next semester...
P: Okay. What about it?
A: I heard that Sociology courses offered next semester are interesting. I'd like to take one of them, but I'm not quite sure which I should take.
P: Let's take a look the list together. So, the columns next to the course titles show if there are any prerequisite courses to take the courses.
A: So that means I'll have to take those courses before I can register for the main ones, right?
P: Correct. You cannot take the main courses that are required for your degree until you finish those prerequisite courses.
A: Most of the main courses require Introduction to Sociology.
P: That's right. Are you taking Introduction to Sociology this semester?
A: No, I'm taking Introduction to Psychology now.
P: Then, you should take Introduction to Sociology next semester so that you will be able to register the other Sociology courses in the future.
A: I have not decided which major to be my degree yet. Is it okay for me to take those introduction classes before I make my decision?
P: Sure, no problem. But we would advise students to try to find out what you really passionate about before you finish your sophomore year, because starting from your junior year, you'll be taking courses that are mainly for your degree.
A: I see.
P: How is Introduction to Psychology? Do you like the class?
A: Yes. The class is very interesting.
P: Then, there are also some advance level Psychology courses offered next semester. They are all require the course you're in now, the Introduction to Psychology, as prerequisite. You're currently taking the course, so you are eligible for those advanced courses.
A: I'm interested in both Sociology and Psychology, so maybe I can take Introduction to Sociology and some advance level courses of Psychology next semester, and then to figure out which one would best suit my interest.
P: That's good idea. You can also take some of the introductory courses from other majors to see if you find them interesting too, so that you are ready to take advanced courses in junior year.
A: I'm going to look the courses from other majors, too.
P: Great! Just make sure you have those prerequisites taken care of first before you register for any advanced level classes.
A: Thank you, Professor Turner. Now I think I know what I'll take for next semester.
P: No problem, glad that I could help you. You're more than welcome to stop by again if you still have any question.
A: Thank you!!
Objective Section Scoring Key
1. d
2. b and c
3. a
4. d
5. b
6. c
7. d
8. a and c
9. a and c
10. c
11. a and c (for student A) / a and e (for student B)
12. b
13. d
14. a
15. d
16. c
17. c
18. d
19. b
20. b
21. a
22. b
23. c and d
24. b and d
Administration Guide
Test takers will be given 45 minutes to complete the objectively scored listening section.
The purpose of the test is to assess listening ability, which requires independent work during the
assessment. The directions are written on the test at the beginning of each section before the
listening questions. Before examinees start the test, however, the administrators will go over the
directions together to prevent any possible confusion. In addition, the administrators will remind
the test takers that they will listen to each recording only once and encourage them to take
notes.
Actual Test Administration Description
The listening test was administered between April 27 and May 3, 2013 to five current Intensive English Program (IEP) students and three students in the Teaching English to Speakers of Other Languages (TESOL) and Teaching Foreign Languages (TFL) programs at the Monterey Institute of International Studies (MIIS).
During the administration, the test takers worked independently. Each recording took less than three minutes. After listening to each recording, the examinees took approximately 6-7 minutes to complete each set of questions. The listening test was administered in regular classrooms at MIIS, so the test takers were familiar with the environment. The classrooms were well lit, and there were no issues such as noise during the assessment. About two examinees took the listening test at a time. Alice and I were available to answer questions during the test, but the examinees were not allowed to talk to each other.
Statistical Analysis
Descriptive Statistics
For the eight examinees, the mean score on the listening section was 27.6. Scores ranged from 22 to 36. The most frequent score (mode) was 26, and the median score was also 26. The standard deviation (SD) was 4.87.
N Mean Mode Median Range SD Variance
8 27.6 26 26 36-22 4.87 23.70
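The figures in this table can be checked against the eight total scores listed in the Total Score column of the item facility tables. The following is a minimal Python sketch of that check, not part of the original analysis. It assumes the reported SD and variance are sample statistics (n - 1 in the denominator), and the variable names are illustrative.

```python
import statistics

# Total scores of the eight examinees (S1-S8), copied from the Total Score column.
total_scores = [25, 26, 36, 31, 23, 22, 26, 32]

n = len(total_scores)
mean = statistics.mean(total_scores)
mode = statistics.mode(total_scores)
median = statistics.median(total_scores)
lowest, highest = min(total_scores), max(total_scores)

# Assumption: sample variance and sample SD (n - 1 in the denominator).
variance = statistics.variance(total_scores)
sd = statistics.stdev(total_scores)

print(f"N={n} mean={mean:.1f} mode={mode} median={median} "
      f"range={lowest}-{highest} SD={sd:.2f} variance={variance:.2f}")
```

Under that assumption the rounded results agree with the reported values; using population formulas (dividing by n) would give a smaller SD and variance.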
Item Facility
Section 1:
Item 1 2 3 4 5 6 7 8 9 10 11 12
Key d b c a d b c d a c
S1 d b c a d c a b a d
S2 d b c a d c a c a c
S3 d b c a d b c d a c
S4 d b c a d c c d c
S5 d a b c d b d d d b c
S6 d b c a d c a c c
S7 d b d a d d d a b c
S8 d c a d b c b a c
# of Correct Answers 8 7 7 7 6 7 8 2 3 2 4 7
Item Facility 1 0.875 0.875 0.875 0.750 0.875 1 0.25 0.375 0.25 0.5 0.875
Section 2:
Item 13 14 15 16 17 18 19 20 21 22 23 24 25 26
Key a c c a d a e b d a d c
S1 a c a c d a g a d a d b
S2 a c a h g e b d a d c
S3 a c c a d a e b d a d c
S4 a c c a d a h b d a d c
S5 a b c c a c/b a e b d a d c
S6 a d g d a b b d b c
S7 a c c a d a e b d a a c
S8 a c c a d a e b d a d c
# of Correct Answers 8 7 6 8 6 6 6 7 5 7 8 7 6 7
Item Facility 1 0.875 0.750 1 0.750 0.750 0.750 0.875 0.625 0.875 1 0.875 0.750 0.875
Section 3:
Item 27 28 29 30 31 32 33 34 35 36 Total Score
Key c d b b a b c d b d
S1 c d b b d d c d c d 25
S2 c d a b a c a b b d 26
S3 c d b b a b c d b d 36
S4 c d b b a b d d 31
S5 c a a b a a c b b d 23
S6 c a b c a d d b d 22
S7 c a b b a d c d d 26
S8 c c b b a b c d c d 32
# of Correct Answers 8 4 6 7 7 7 5 6 4 8
Item Facility 1 0.5 0.750 0.875 0.875 0.375 0.625 0.750 0.5 1
Item facility, or item difficulty, shows the level of difficulty of each item on an objectively-scored test (Turner, 2013). The item facility values on this test ranged from 0.25 to 1. Across the 36 listening items, the distribution of easy, medium, and difficult items was unbalanced. Most of the items could be categorized as easy, whereas only three items (Items 8, 9, and 10) had values close to 0. The intended examinees for this listening test are required to have a TOEFL iBT score of at least 49 to enroll in the program. Therefore, the distribution of item difficulty needs to be weighted more toward the difficult end.
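As a minimal sketch of the calculation behind the Item Facility rows, the Python snippet below divides the number of correct answers for each Section 1 item by the number of examinees. The counts are copied from the Section 1 table; the function name `item_facility` is illustrative and not taken from Turner (2013).

```python
def item_facility(n_correct, n_examinees):
    """Proportion of examinees who answered the item correctly."""
    return n_correct / n_examinees

N_EXAMINEES = 8

# "# of Correct Answers" for Section 1, items 1-12.
correct_counts = [8, 7, 7, 7, 6, 7, 8, 2, 3, 2, 4, 7]

facility = [item_facility(c, N_EXAMINEES) for c in correct_counts]

for item_number, value in enumerate(facility, start=1):
    print(f"Item {item_number}: {value:.3f}")
```

Values near 1 mark items that almost every examinee answered correctly, and values near 0 mark items that few answered correctly.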
Item Discrimination
Section 1:
Item 1 2 3 4 5 6 7 8 9 10 11 12
S3 d b c a d b c d a c
S4 d b c a d c c d c
S8 d c a d b c b a c
S1 d b c a d c a b a dS5 d a b c d b d d d b c
S6 d b c a d c a c c
P (upper group) 1 1 0.67 1 1 1 1 0.67 1 0.67 0.67 1
P (lower group) 1 0.67 1 1 0.67 0.67 1 0.67 0.67 0.67 0.33 0.67
Item Discrimination 0 0.33 -0.33 0 0.33 0.33 0 0 0.33 0 0.33 0.33
Section 2:
13 14 15 16 17 18 19 20 21 22 23 24 25 26
a c c a d a e b d a d c
a c c a d a h b d a d c
a c c a d a e b d a d c
a c a c d a g a d a d b
a b c c a c/b a e b d a d c
a d g d a b b d b c
1 1 1 1 1 1 1 1 0.67 1 1 1 1 1
1 0.67 0.67 1 0.33 0.33 0.67 1 0.33 0.67 1 0.67 0.67 0.67
0 0.33 0.33 0 0.67 0.67 0.33 0 0.33 0.33 0 0.33 0.33 0.33
Section 3:
27 28 29 30 31 32 33 34 35 36
c d b b a b c d b d
c d b b a b d d
c c b b a b c d c d
c d b b d d c d c d
c a a b a a c b b d
c a b c a d d b d
1 0.67 1 1 1 1 0.67 1 0.33 1
1 0.33 0.67 0.67 0.67 0 0.67 0.67 0.67 1
0 0.33 0.33 0.33 0.33 1 0 0.33 -0.33 0
In addition to item facility, item discrimination indicates how well each objectively-scored
item distinguishes between test takers with high scores and those with low scores (Turner,
2013). Most of the items had values ranging from 0.33 to 0.67. The values of Items 1, 4, 7, 8,
10, 13, 16, 20, 23, 27, 33, and 36 were 0, which indicated that these items did not discriminate
between the two groups. Although some of the items were intended to be easy to answer correctly,
12 items, one third of the 36 items, failed to discriminate. Therefore, the number of items
whose obtained value was 0 should be decreased. Item 8 obtained an item facility value of 0.25,
which places it among the most difficult questions in this listening test. In terms of item
discrimination, however, the obtained value was 0. Therefore, this item should be revised to
improve its item discrimination value. Items 3 and 35 seemed to be seriously problematic and
should be revised: their item facility values were 0.875 and 0.5 respectively, yet both items
had a discrimination value of -0.33, which revealed that top-scoring examinees were more likely
to answer incorrectly than lower-scoring test takers.
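As a rough illustration of how the item discrimination values above are obtained, the sketch below subtracts the proportion correct in the lower group from the proportion correct in the upper group. It assumes groups of three examinees each (S3, S4, S8 and S1, S5, S6, as listed in the Section 1 table); the counts of correct answers are inferred from the reported proportions (for example, 0.67 is read as 2 of 3), and the function name is ours.

```python
def item_discrimination(upper_correct, lower_correct, group_size):
    """Upper-group proportion correct minus lower-group proportion correct."""
    return [
        round((upper - lower) / group_size, 2)
        for upper, lower in zip(upper_correct, lower_correct)
    ]


# Correct answers per item in each group for Items 1-12 (Section 1).
# These counts are inferred from the proportions in the table (an assumption).
upper_counts = [3, 3, 2, 3, 3, 3, 3, 2, 3, 2, 2, 3]
lower_counts = [3, 2, 3, 3, 2, 2, 3, 2, 2, 2, 1, 2]

section1_id = item_discrimination(upper_counts, lower_counts, group_size=3)
print(section1_id)
```

A value of 0 means both groups answered the item equally well, and a negative value, as for Item 3, means the lower group outperformed the upper group on that item.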
Internal Consistency
To calculate internal consistency, KR-20 was used. A pre-programmed Excel spreadsheet for
calculating the KR-20 formula was available online. In the spreadsheet, a right answer was
coded as 1 and a wrong answer as 0. The value obtained with the KR-20 formula was 0.80. This
value seemed to be quite high, which suggests that the items of the objective section measured
the same ability consistently.
Split-Half (odd-even) Correlation 0.821066194
Spearman-Brown Prophecy 0.901742284
Mean for Test 27.5
Standard Deviation for Test 4.636809248
KR21 0.717940199
KR20 0.795348837
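For readers who want to see what the spreadsheet computes, the sketch below writes out the standard KR-20, KR-21, and Spearman-Brown formulas in Python. It is not the spreadsheet itself: the function names are ours, the use of the population variance in `kr20` is an assumption, and the small `toy_responses` matrix is made up purely to show the input format (the full pilot response matrix is not reproduced here).

```python
def kr20(responses):
    """KR-20 for rows of 0/1 item scores (one row per examinee)."""
    n = len(responses)
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    mean = sum(totals) / n
    # Population variance of total scores (assumption; some tools use n - 1).
    variance = sum((t - mean) ** 2 for t in totals) / n
    sum_pq = 0.0
    for i in range(k):
        p = sum(row[i] for row in responses) / n  # item facility of item i
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / variance)


def kr21(k, mean, sd):
    """KR-21, which needs only the number of items, the mean, and the SD."""
    return (k / (k - 1)) * (1 - mean * (k - mean) / (k * sd ** 2))


def spearman_brown(r_half):
    """Step a split-half correlation up to full-test length."""
    return 2 * r_half / (1 + r_half)


# Made-up 4-examinee, 3-item matrix, used only to illustrate the input format.
toy_responses = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(toy_responses))

# Values reported in the table above: 36 items, mean 27.5, SD 4.636809248,
# split-half (odd-even) correlation 0.821066194.
print(kr21(36, 27.5, 4.636809248))
print(spearman_brown(0.821066194))
```

Entering the reported mean, standard deviation, and split-half correlation gives values that agree with the KR21 and Spearman-Brown Prophecy figures in the table to several decimal places.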
Revised Test Specification for Listening Section
According to Bejar, Douglas, Jamieson, Nissan and Turner (2000), three areas of
content have been defined as relevant for the TOEFL 2000 listening measure: academic, class
related, and campus related (p. 9). Three categories of speech acts in the listening test (an
academic lecture, a conversation between college faculty and student, and a conversation
between peers) seem appropriate for the purpose of this placement test. In terms of text types for
the listening section, the appropriate resources are authentic lectures and conversations between
faculty and students as well as between students. Each listening task will consist of an
approximately three-minute listening segment and eight multiple-choice questions. Test takers
will have 10 minutes to answer the questions for each task. In total, thus, the listening test
will be 30 minutes.
The listening content should not require test takers to have previous knowledge of the
topics. Instead of assessing students' existing knowledge of the topics, the listening section
will assess listening comprehension of spoken English in academic content as well as
communication relevant to college life. Hughes (2003) described the global operations as the
ability to obtain the gist, follow argument; and recognize the attitude of the speaker (p. 161).
These operations seem to be appropriate for the specification of this listening test. In more
depth, based on the lists Hughes suggested, the abilities to be assessed in the listening test
are finalized as follows.
Informational:
obtain factual information
follow sequence of events (narration)
recognize and understand opinions
understand comparisons
recognize and understand suggestions
recognize and understand comments
Interactional:
understand greetings and introductions
understand expressions of agreement/disagreement
recognize speaker's purpose
understand requests for clarification
recognize requests for clarification
recognize requests for opinion
recognize attempts to persuade others
Each listening segment has already been recorded. The lecture segment was taken from the
series titled MIT 9.00SC Introduction to Psychology, Spring 2011, which is available on
YouTube. For the conversation between peers, the topic was housing because it is a
campus-related topic. In addition, one of our friends has just moved to a new house and agreed to
volunteer for the recording. For the conversation between college faculty and student, the
topic is class registration, which is identified as a class-related topic. All listening
materials were successfully recorded before conducting the pilot test.
In terms of task type, the multiple-choice items in this section are written either as questions
or as incomplete sentences. As identified above, all questions are presented in English. Based
on the test task, the Listening section will be paper-based.
The instructions given to students during the listening section will be as follows:
1. Listen to the recording once. While listening, students will be allowed to take notes.
2. After listening, students will be allowed to open the test booklet to start the test. (Students
will not be allowed to see the questions beforehand.)
3. Students will be given 10 minutes to complete each listening task.
This listening test also attempts to measure note-taking ability. Carrell (2007)
suggested that note-taking strategies seemed to be associated with listening performance. The
Listening section is objectively scored because the tasks are multiple choice. The scoring
criteria should reveal the degree of each of the listening abilities listed above.
Revised Test
Listening Test
30 minutes
Name: __________________
Directions:
This test is designed to test your listening comprehension. There are three
sections in this test. You will listen to each recorded material only once and
must answer the questions that follow. Answer all the questions on the basis
of what is stated or implied by the speaker you hear. While listening, you
can take notes.
Do NOT turn the next page until you are told to do so.
Section 1
Directions:
In this section, you will listen to a lecture in an undergraduate program. The
lecture will not be repeated. After listening to the lecture, open the next
page and answer the questions below based on what is stated or implied in
the lecture. Please place a check mark (✓) next to the correct answer. You
can take notes while you are listening. You have 10 minutes to answer the
following questions.
Do NOT turn the next page until you are told to do so.
Section 1 Questions 1-8
1. What is attribution theory?
___a) The theory explains the process of attributing outcomes based on only
internal behavior.
___b) The theory explains the process of attributing outcomes based on
sequences of the event of a matter.
___c) The theory explains the process of attributing outcomes based on only
external events.
___d) The theory explains the process of attributing outcomes based on internal
behavior and external events.
2. Please check the answers that best describe internal causes and external reasons.
You may mark more than one correct answer.
___a) Internal causes include situational factors and controllable emotions.
___b) Internal causes refer to emotions, talents, and personal characteristics.
___c) External reasons are environmental factors.
___d) External reasons consist of uncontrollable emotions.
3. According to the lecture, what is the possible external cause associated with the car
accident?
___a) A deer.
___b) A dog.
___c) A bear.
___d) A horse.
4. If the wind is blowing the right direction when the bat hits the ball, would
this fact be categorized under_______?
___a) Internal-Stable Factor.
___b) Internal-Unstable Factor.
___c) External-Stable Factor.
___d) External-Unstable Factor.
5. What is the tendency that identify our own success as our internal abilities and failure
as the results of external factors?
___a) Self-regulation bias.
___b) Self-serving bias.
___c) Self-success bias.
___d) Self-ability bias.
6. What is the tendency to criticize a person's internal behavior?
___a) Internal Attribution Error.
___b) External Attribution Error.
___c) Fundamental Attribution Error.
___d) Attribution Decision Error.
7. What is the hypothesis that assumes judgments on performance are fair
because people get the outcome they deserve?
___a) Just Work Hypothesis.
___b) Justice World Hypothesis.
___c) Justice Work Hypothesis.
___d) Just World Hypothesis.
8. What are the appropriate summaries of this lecture?
You may mark more than one correct answer.
___a) People attribute causes to categorize behaviors and events.
___b) People always fail to attribute causes correctly.
___c) It is important to consider internal and external reasons when interacting
with others.
___d) It is necessary to attribute causes to only internal reasons.
Section 2
Directions:
In this section, you will listen to a conversation between friends. The
conversation will not be repeated. After listening to the conversation, open
the next page and answer the questions below based on what is stated or
implied in the conversation. Please place a check mark (✓) next to the
correct answer. You can take notes while you are listening. You have 10
minutes to answer the following questions.
Please use the map while you are listening
Do NOT turn the next page until you are told to do so.
[Map with street names; legible labels: Oak Street, Lake Street]
Section 2 Questions 9-16
9. Why do they want to live closer to the campus?
You may mark more than one correct answer.
___a) Because they both have 8am classes next semester.
___b) Because it is closer, then they'll be able to sleep-in a little more.
___c) Because it is difficult to find parking spots in the morning.
___d) Because they both ride bicycles to school.
10. Which streets are closest to the campus?
___a) Oak, Hill, and Lake.
___b) Second, Green, and Lake.
___c) Main, Second, and Oak.
___d) Green, Maple, and Hill.
11. Based on the conversation, choose the answers below and place letters in the
blanks to complete each person's list. Each blank will be filled with one answer.
An answer can be chosen more than once.
Alice Lauren
Backyard
Under $800 for each
a. Parking.
b. $ 1000 total.
c. Balcony.
d. $ 700.
e. Big kitchen.
f. Under $ 800 for each.
g. Furnished.
h. Large living room.
12. What is NOT available on Second Street?
___a) Backyards to play with pets.
___b) Parking spots.
___c) Apartments.
___d) Houses that allow pets.
13. Why doesn't Lauren like to live on Main Street?
___a) Because it is not safe at night.
___b) Because there are a lot of car accidents on Main Street.
___c) Because the grocery store is far away from Main Street.
___d) Because it is quite noisy sometimes.
14. How far away are Hill Street from Main Street?
___a) About 4-5 street up.
___b) About 4-5 street down.
___c) About 6-7 street up.
___d) About 6-7 street down.
15. Which street are they most likely to find the place to live?
___a) Main.
___b) Lake.
___c) Oak.
___d) Hill.
16. What are they going to do tomorrow?
___a) They are going to visit each house.
___b) They are going to look for more housing options.
___c) They are going to call landlords to make appointment for visiting.
___d) They are going to stop looking for new place and stay where they live now.
Section 3
Directions:
In this section, you will listen to a conversation between a professor and a
student. The conversation will not be repeated. After listening to the
conversation, open the next page and answer the questions below based
on what is stated or implied in the conversation. Please place a check mark
(✓) next to the correct answer. You can take notes while you are listening.
You have 10 minutes to answer the following questions.
Do NOT turn the next page until you are told to do so.
Section 3 Questions 17-24
17. Why does this student come to Professor Turner?
___a) To get advice about a writing assignment.
___b) To submit a course assignment.
___c) To get advice about class registration.
___d) To ask questions about her grade for the course.
18. What does it call for the courses that students have to take before the main ones?
___a) Prep-course.
___b) Precalculus.
___c) Prerequest.
___d) Prerequisite.
19. Which introductory course is this student currently taking this semester?
___a) Introduction to Sociology.
___b) Introduction to Psychology.
___c) Introduction to Astronomy.
___d) Introduction to Philosophy.
20. What introductory course that is required by most of the main Sociology courses?
___a) Introduction to Anthropology.
___b) Introduction to Sociology.
___c) Introduction to Psychology.
___d) Introduction to Social Work.
21. Which two majors is the student interested in to be her degree?
___a) Psychology and Sociology.
___b) Psychology and Social Work.
___c) Sociology and Anthropology.
___d) Literature and Physiology.
22. When should students decide which major to be their degree?
___a) At the end of freshmen year.
___b) Before the end of sophomore year.
___c) At the beginning of sophomore year.
___d) Before the beginning of senior year.
23. Which courses is the student most likely to take next semester?
You may mark more than one correct answer.
___a) Introduction to Psychology.
___b) Advanced level Sociology courses.
___c) Advanced level Psychology courses.
___d) Introduction to Sociology.
24. What are the things that the professor recommends the student to do for selecting
courses? You may mark more than one correct answer.
___a) To get advice from classmates and hear about their opinions of the
courses.
___b) To check if there are other courses that are necessary to complete first.
___c) To get course syllabi from professors to check for the assignments
beforehand.
___d) To take courses from various majors to find which suits students best
interest.
This is the end of the listening section. You can leave earlier once
you have completed the test. Please submit the completed test to
either Alice or Takako. If you have any questions regarding this
test, please feel free to ask us.
Revised Listening Prompt (Section 2)
Script
Topic: Housing
A: Hey, Alice. I found there are a lot of housing information in today's newspaper.
B: Oh, that's great. Lauren! Let's take a look together.
A: Since both of us will be having 8AM class next semester, it would be great if the house is
close to the campus, so we won't have to drive to school.
B: I totally agree! It's difficult to find a parking spot in the morning...
A: There are three streets close to campus- Main, Second, and Oak...
B: Do you want to live in a house, or an apartment?
A: Does house have a backyard? I want to play with my dog there.
B: Let's see...maybe we should make a list first so that we can find the best one!
A: Sounds great! I hope there's a backyard for my dog, also parking spots should be available
because we both have cars, and the fee is about $700 for each.
B:
A: Okay! Well, on this page of the newspaper, I found two advertisements showing these two
houses on the Second Street, and they both have backyards. The only difference is that one of
the backyards is smaller. But, it seems that they don't have parking spots...we have to park on
the street.
B: What about Main Street or Oak Street? Is there any available option?
A: Hmm... Most of the housing options that are on Main Streets for now are apartments without
backyards, but they do include parking spaces for both of us.
B: I don't like living on Main Street because it's very noisy sometimes :(
A: That is true, because it is quite close to downtown, and there are many bars in downtown too.
B: How about looking for housing options that are still within walking distance to the campus, but
just a little bit far up the hills?
A: Sure, are there any houses available on other streets?
B: There are few options on Hill Street, which is about 4 or 5 streets up from Main Street.
A: Look! There are several options available on Hill Street and the streets nearby it! And some
of them have backyards AND parking spaces provided for the tenants too!
B: Sounds cool. The monthly rent is almost same price...why don't you call each landlord
tomorrow to make appointment for visiting?
A: Okay, sounds great! I'll call these two houses tomorrow!
Revised Scoring Key for Objectively-Scored Section
1. d   2. Unmarked   3. b   4. c   5. Unmarked   6. a
7. d   8. b   9. c   10. d   11. a   12. c
13. a   14. Unmarked   15. c   16. Unmarked   17. c   18. a
19. c   20. a   21. e   22. b   23. d   24. a
25. d   26. c   27. c   28. d   29. b   30. b
31. a   32. b   33. c   34. d   35. b   36. d
Revision and Rationale
First of all, the directions of the test, including its administration process, were revised.
Although the administrators (Alice and I) told the examinees not to open the test until they were
told to do so, most of them actually opened the test as soon as they received it. In terms of the
administration guide, the administrators need to tell the test takers, before passing out the
test packages, that they are not allowed to open the pages, and make sure that they comply. In
addition, the revised test uses a separate page for the directions of each section, which may
prevent the test takers from reading questions accidentally. The directions in each section
include more detailed information about the recording material so that the examinees do not
have to guess the situations in which the lecture and conversations occur.
Secondly, based on the results of the statistical analysis, the listening prompts and questions
were revised. Item facility revealed an unbalanced distribution of item difficulty. On closer
inspection, most of the difficult items were in Section 1, whereas there were no difficult
items in Section 2. Therefore, Section 2 required large-scale revision. In the recording prompt
of Section 2, there were many complicated street names, which might be extremely confusing for
examinees who do not know the streets in Monterey. Indeed, all participants in the pilot test
were living in Monterey, which might have helped them to answer correctly. In order to measure
listening proficiency in English, such pre-existing knowledge should not be required.
Therefore, the revised prompt uses simple street names that are common in the United States,
such as Main, Second, and Oak. Every question associated with street names was also revised. In
addition, the revised test includes a map with the street names that the examinees are allowed
to look at while listening (Figure 1).
Although Sections 1 and 3 did not seem to need wide-ranging changes, several items were
revised. For example, Item 8 had an item facility value of 0.25 but did not discriminate well
(its item discrimination value was 0). The revised Item 8 uses simpler vocabulary than the
original one.
The original Item 8:
5. What is the tendency that celebrates our own success as the indication of our internal
abilities and failure as the results of external factors?
___a) Self-regulation bias.
___b) Self-serving bias.
___c) Self-success bias.
___d) Self-ability bias.
The revised Item 8:
5. What is the tendency that identify our own success as our internal abilities and failure
as the results of external factors?
___a) Self-regulation bias.
___b) Self-serving bias.
___c) Self-success bias.
___d) Self-ability bias.
In Section 3, Items 33 and 34 were also revised because the answer choices were
grammatically problematic. Instead of "Advance level Sociology/Psychology courses," the
revised choices read "Advanced level Sociology/Psychology courses."
The original Item 33 and 34:
23. Which courses is the student most likely to take next semester? You may mark
more than one correct answer.
___a) Introduction to Psychology.
___b) Advance level Sociology courses.
___c) Advance level Psychology courses.
___d) Introduction to Sociology.
The revised Item 33 and 34:
23. Which courses is the student most likely to take next semester?
You may mark more than one correct answer.
___a) Introduction to Psychology.
___b) Advanced level Sociology courses.
___c) Advanced level Psychology courses.
___d) Introduction to Sociology.
Moreover, the answer choices of Items 29, 30, and 31 were revised by changing them to academic
subjects that are more similar to Psychology or Sociology. Additionally, the provisos in the
items are bolded and italicized to draw the test takers' attention. Lastly, the recording
materials also require change. The voices of the two people in the recordings were very similar,
which might make it difficult for the examinees to tell the speakers apart. Therefore, as a
possible change, the conversations should be recorded by people whose voices are clearly
distinguishable. In addition, even though the recording quality was fine overall, the
conversations at the beginning of the recordings were too quiet to hear, which suggests that
recording should be done with reliable equipment.
Proposed Validity Investigation for Objectively-Scored Section
There are several ways to investigate the validity of the listening section. One approach is
that experts such as Summer ICE program directors and instructors would review the listening
section.
Criterion-related approaches are another way of investigating validity.
Concurrent validity shows how this original test correlates with an existing validated
assessment that measures the same specifications, by comparing performances on the two
tests. In particular, this listening test was developed by adapting TOEFL iBT Listening.
Therefore, comparing performance on this listening section with performance on the TOEFL
iBT listening section may indicate how strongly the two assessments are correlated.
Another criterion-related approach is predictive validity, which reveals how well
performance on this listening test can predict test takers' future academic performance.
This listening test attempts to measure students' proficiency levels before they enroll
in the Summer ICE program. For instance, comparing the scores on this listening test with the
students' academic performance during the program might indicate how
these performances are related to each other.
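Both criterion-related comparisons come down to correlating two sets of scores for the same examinees. The sketch below shows one way this could be computed with a Pearson correlation in Python. The function name is ours, and the two score lists are hypothetical placeholders, not data collected in this project.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(spread_x * spread_y)


# HYPOTHETICAL placeholder scores for five examinees (not real data):
# scores on this listening test and on the criterion measure.
listening_scores = [22, 25, 26, 31, 36]
criterion_scores = [14, 15, 18, 20, 24]

r = pearson_r(listening_scores, criterion_scores)
print(r)
```

For concurrent validity the criterion would be TOEFL iBT listening scores; for predictive validity it would be a measure of academic performance during the program.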
Subjectively-Scored Section
(Speaking Test)
Specifications and Rationale
The objectively-scored and subjectively-scored sections were designed independently
and did not have any mutual reinforcement. As previously stated in the original specifications,
the speaking test will be subjectively scored and conducted in a face-to-face interview format,
and each testing session should be about 10 minutes long. Unlike traditional assessment, which
requires students to select an answer or to recall information to complete the task (such as
reading a short story first, then retelling the story in their own words as a presentation),
this speaking test is designed based on the principle of authentic assessment. According to
Wiggins (1993), in authentic assessment students must use knowledge to effectively and
creatively perform the assigned task, and the task can be a replica of questions or problems
that students may face in the field. Therefore, while the overall purpose of this placement test
is to assess students' academic English proficiency levels before the program starts so that
teachers in the program can modify the lessons accordingly to meet their actual needs, in this
speaking test students will also be asked to perform a meaningful task (i.e., answering
questions that may arise when they're in the real world in any setting), and their performance
should demonstrate their competencies and communicative level.
For the first part of the test (4-5 minutes), the student will be prompted to talk about
the topic chosen by the teachers; the topic for the first part of the test is selected based on
the result of each student's self-assessment survey (see Appendix). The procedure for choosing
the topic question is explained in the Test Administration Guide section. Before proceeding to
the second part, the student will receive a topic card randomly selected by the teachers; one
minute of preparation time will be given to the student to write down, on the paper provided on
site, any ideas that he/she would like to bring up regarding the given topic. When the minute
is up, the teachers will prompt the student to respond to the topic. Because most students will
probably pursue their higher education in the US later on, and the whole purpose of the speaking
test is to see whether they are able to succeed communicatively in an academic setting, the
random topics for this part of the test are related to academic or college-life-centered
themes; students will have to think critically in order to provide a well-reasoned and
consistent response.
The criteria on which students will be evaluated are listed and elaborated in the Scoring
Protocol section.
Subjectively-Scored Test as Administered
Original Speaking Test Questions (Topic Prompts)
Part I: 4-5 minutes
Personal Interest/Information:
1) Do you have any favorite food? What about any dislikes of food?
2) How do you usually spend your free time?
3) What is your favorite subject in school? (Prompt them to elaborate when answer is
given.)
Comparison/Opinion:
4) Can you tell me why do you want to come to ICE for this summer instead of enjoying
time with your family and friends in your home country?
5) Can you tell me what are some facts about America that you find interesting?
6) How is America (or American culture) different from your home country? Are there also
any similarities?
==== 1 minute of preparation time ====
Part II: 4-5 minutes
7) Tell me about your opinion: How do you think about having international students as
classmates? Do you think it'll help to improve your English? Please elaborate your
answer.
8) What do you plan on studying in college? Any particular reason for choosing that major?
Why do you think it is important to learn English, and what are the benefits of able to
speak English?
9) What do you plan on studying in college? Any particular reason for choosing that major?
Subjective Section Scoring Key
Original Test Rubric: Speaking Test (Spring 2013)
Name: _______________
Level Competencies
Grammar: Sentence structure & verb tense
Content: Relevance to the prompt
Vocabulary: Appropriateness & complicacy of terms chosen
Discourse: Turn-taking, coherence & cohesion
Delivery: Fluency of the speech & pronunciation
Level 1
Grammar: Frequent grammar mistakes; takes longer time to respond due to having difficulties delivering ideas.
Content: Can do basic greetings and answer questions regarding family and personal matters, but may lack relevance in ideas.
Vocabulary: Incorrect use of vocabulary; lack of proper vocabulary to communicate adequately.
Discourse: Able to complete solid ideas, though not connecting them to each other, and shows difficulty in turn-taking.
Delivery: Often pauses to connect ideas and is slow-paced; incorrect pronunciations were frequently made in communication.
Level 2
Grammar: Able to deliver ideas adequately with only some grammar errors, and can identify when errors were made (self-correct).
Content: Can talk about personal preferences and academic topics. Ideas are relevant to the prompt.
Vocabulary: Able to use general and some basic academic terminology appropriately with minor mistakes. Little difficulty in elaboration when prompted.
Discourse: Able to perform turn-taking and connect complex ideas, but may lack a little coherence and cohesion.
Delivery: Makes some pronunciation mistakes but is generally understandable, with only some pauses; pacing is appropriate.
Level 3
Grammar: Often self-corrects right away if errors were made; very few grammar errors, and able to deliver ideas accurately with correct grammar.
Content: Can support personal opinions and is able to talk about academic topics. Ideas are relevant and elaborated.
Vocabulary: Able to use both general terms and a variety of academic terms appropriately and correctly. Minor mistakes in metaphor or idiom expressions.
Discourse: Able to perform turn-taking effectively, and able to connect complex ideas with coherence and cohesion.
Delivery: Only minor pronunciation errors; generally easy and clear to understand. Shows appropriate pacing and pauses in communication.
Test Administration Guide
Prior to the test, students must complete the self-assessment survey. The self-assessment
survey will be given to the students at the end of the Listening test, and they will return the
survey to the teachers before they leave the classroom. The primary purpose of this
self-assessment survey is to determine which level of difficulty of questions to ask in the
first part of the speaking test.
To prepare for the test, teachers must prepare several materials and pieces of
equipment that will help to document the test process and ensure that nothing is missing,
so that the test can be conducted smoothly. To start, a laptop equipped with Audacity,
an iPod or iPhone, or any other audio recording device should be present to record
each test session; the recording is to be used later for evaluation only when the
two teachers cannot agree on the level of the student. Some pencils and
blank sheets of paper are also required because, between the first and second parts of the test,
students will get one minute of preparation time to jot down any ideas that they would like to
talk about upon receiving the topic card for the second part of the test. Last but not least,
the test prompts and the topic cards for the second part of the speaking test
should be typed and ready on the day of testing.
There are also several guidelines for teachers to follow before conducting the test. When
the student enters the testing classroom, the teachers and the student will exchange greetings,
and both teachers will take turns giving a brief self-introduction before asking the student to
introduce him/herself. For the student's self-introduction, make sure to have them include
their home country and how long they have studied English, in order to estimate their expected
linguistic ability. Prior to beginning the actual test, the teachers should explain the
following to every student:
o "Please relax and don't think of this as a test that will judge your English ability.
Also, we'll not judge your opinions, so please speak freely and just think of these 10
minutes as a chat between teachers and the student." This is to let the student know
that his/her opinions on a given matter will not affect how he/she is evaluated,
so that students will not feel restricted on topics and will be able to express
themselves freely.
o "We will be audio-recording the entire speaking test only for our use in the evaluation
process later. We'll not distribute nor upload the content." This is to inform them
politely that we respect their privacy and guarantee that the testing materials will
not be used for any other purpose; no one who is not a faculty or staff member will
know or hear what they say during the test.
o "Do you have any question before we start?" This is a proper signal to indicate the
beginning of the actual test.
There will be two teachers present at the testing, and students will enter the testing
classroom one at a time according to the time slot that they signed up for (a sign-up
sheet will be passed around in the classroom prior to the beginning of the Listening test so
that students can choose their desired time slot for the speaking test). During the test, one
of the teachers will be the main test giver, who initiates questions and turn-taking, while the
second teacher will be the assistant test giver, who asks follow-up questions (for example,
asking the student to elaborate on his/her answer if it lacks supporting reasons).
For the first part of the speaking test, teachers will begin with a prompt selected
according to the student's responses on the self-assessment survey. If the student answered
mostly "Disagree" to the items describing difficult performance tasks, then questions for the
first part will be selected from the Personal Interest/Information category. However, if the
student answered mostly "Agree" or "Strongly Agree" to the difficult items, such as those
pertaining to the ability to talk freely about personal interests and academic topics and to
feel confident with unexpected turn-taking in conversation, then questions for the first part
will be chosen from the Comparison/Opinion category. To avoid confusion that would cost the
student time, teachers must speak clearly when asking each question. As feedback during the
test, teachers may nod or back-channel; in addition, teachers should maintain eye contact with
the student rather than write on the rubric, so that they do not appear to be grading while
he/she is still speaking. Most importantly, teachers shall not interrupt the student while
he/she is speaking. Follow-up questions are necessary if the student answers the original
question without elaborating enough to support the answer.
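The category selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `select_part1_category` is hypothetical, the input is assumed to contain only the responses to survey items that describe difficult performance tasks, and "mostly" is read as a strict majority, with ties falling to the easier category. None of these details is specified in the administration guide.

```python
from collections import Counter

PERSONAL = "Personal Interest/Information"
COMPARISON = "Comparison/Opinion"


def select_part1_category(difficult_item_responses):
    """Choose the Part I question category from self-assessment responses.

    difficult_item_responses: one response per survey item that describes
    a difficult performance task; each is "Strongly Agree", "Agree", or
    "Disagree".

    Assumption (not in the guide): "mostly" means a strict majority, and a
    tie defaults to the easier Personal Interest/Information category.
    """
    counts = Counter(difficult_item_responses)
    agree = counts["Agree"] + counts["Strongly Agree"]
    disagree = counts["Disagree"]
    return COMPARISON if agree > disagree else PERSONAL


# Hypothetical responses to three difficult items on the survey.
example_category = select_part1_category(["Agree", "Strongly Agree", "Disagree"])
```

Defaulting ties to the easier category is a cautious choice for a placement interview; a program could just as reasonably decide ties case by case.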
At the end of the test, teachers will tell the student that the test is over by thanking
him/her for sharing his/her thoughts and information and by saying that they look forward to
meeting him/her again in class. Teachers shall not discuss the student's results or
performance at the end of the testing session.
Scoring Protocol
The rubric should be completed by the two teachers immediately after each test session.
Clearly circle the level descriptor that best describes the student's performance in each
competency. The audio recording can be used to listen to the conversation again if a decision
cannot be made or details cannot be recalled. The speaking test evaluates grammar (sentence
structure and tense agreement), content (relevance to the prompt), vocabulary (appropriateness
and complexity of the words chosen), discourse (turn-taking, coherence, and cohesion), and
delivery (fluency of speech and overall pronunciation).
Actual Test Administration Description
The test was administered to three female students: two were current IEP students at
MIIS, and one was a first-semester TFL-Chinese student. All three test takers had finished
the objective portion of the test the week before. Each test taker was scheduled to take the
speaking test on the day and time that suited her best in the week following the objective
test. While I waited in front of the Holland Center at the meeting time, my co-teacher and
co-administrator, Takako, looked for an empty classroom in Morse in which to conduct the
test. She then notified me of the location of the testing classroom via text message.
My original test administration guide requires test takers to fill out the self-assessment
survey immediately after finishing the objective portion of the test (Listening). However, my
self-assessment survey (Appendix) was still under development during the week in which the
listening test was piloted, so I was unable to have the test takers fill it out in advance and
use it to determine which category of questions to ask in the first part of the speaking test.
Instead, all three test takers completed the self-assessment survey on the testing day, and
Takako and I then decided which category of questions to begin with based on how many
"Disagree" responses were ticked next to the items describing difficult performance tasks. If
the test taker ticked "Agree" or "Strongly Agree" for most of the difficult items, we started
the first part of the test with questions from the Comparison/Opinion category. If the test
taker ticked mostly "Disagree" on the difficult items, we selected questions from the Personal
Interest/Information category.
Prior to beginning the test, we informed the test taker that we would like to audio-record
the test session for evaluation purposes only and that we would not play the recording for
anyone besides our professor (if necessary). After receiving permission to audio-record, we
set up our iPhones for recording. To signal the beginning of the test, I first told the test
taker to relax and to think of the next 10 minutes as a chat between friends (all of our test
takers were our acquaintances), and I asked her whether she had any questions. During the
speaking test, I was the main test giver, initiating the first questions and managing
turn-taking, and Takako was the assistant test giver, asking follow-up questions. At the end
of each test, we thanked the test taker for her willingness to help us pilot our tests.
Inter-rater Reliability
For this test, both Takako and I are the raters, and we score each test taker individually,
using the same scoring rubric that I created. There are a total of 5 competencies to evaluate,
and each competency is divided into 3 levels: level 1 is the lowest (near-beginner) stage,
level 2 is intermediate, and level 3 is advanced. Under the scoring protocol, immediately
after each testing session the teachers evaluate the student's performance by circling or
checking the box of the descriptor that best describes the student's level in each
competency. If a student receives mostly level 3 across the competencies on the rubric, then
that student is placed in the advanced level of the class. That is to say, if the evaluations
of both teachers align with a single level in three or more competencies, then that is the
student's level. However, if fewer than three descriptors align with a single level between
the two teachers, then the teachers must resolve the differences through discussion,
listening to the audio recording again, to determine a final placement for the student.
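The placement rule above can be sketched as follows. The sketch rests on one reading of the rule, which is an assumption: each rater's sheet must show the same level in at least three of the five competencies, and the two raters must arrive at the same level; otherwise the case goes to discussion. The function names and the example ratings are hypothetical and are not taken from the rubric or the pilot data.

```python
from collections import Counter

COMPETENCIES = ("grammar", "content", "vocabulary", "discourse", "delivery")


def dominant_level(ratings, minimum=3):
    """Return the level one rater assigned to at least `minimum`
    competencies, or None if no level reaches that count."""
    counts = Counter(ratings[c] for c in COMPETENCIES)
    level, count = counts.most_common(1)[0]
    return level if count >= minimum else None


def place_student(rater_a, rater_b):
    """Return (level, needs_discussion) for one student.

    Assumed reading of the rule: a placement is automatic only when both
    raters' sheets point to the same level in three or more competencies.
    """
    level_a = dominant_level(rater_a)
    level_b = dominant_level(rater_b)
    if level_a is not None and level_a == level_b:
        return level_a, False
    return None, True


# Hypothetical ratings: the raters differ on vocabulary only.
rater_a = {"grammar": 3, "content": 3, "vocabulary": 2, "discourse": 3, "delivery": 3}
rater_b = {"grammar": 3, "content": 3, "vocabulary": 3, "discourse": 3, "delivery": 3}
placement = place_student(rater_a, rater_b)
```

With five competencies and a threshold of three, at most one level can reach the threshold on a single sheet, so the dominant level is never ambiguous.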
In our pilot test, we scored the first two test takers in 100% agreement. On the third test
taker (shown in the attached document: Maggie), Takako and I had only one difference in
scoring: I circled Level 2 for the test taker's vocabulary competency, whereas Takako judged
it to be Level 3. We resolved this difference by listening to our audio recording again and
discussing the test taker's word choice. We concluded that her vocabulary should be placed at
Level 2 because, although she was quite fluent in her delivery and was able to stay on topic
(even providing adequate elaboration), most of her comments and word choices were rather
repetitive, which did not fully meet the requirement of being able to use both general and a
variety of academic terms appropriately and correctly, as indicated in the Level 3 Vocabulary
descriptor of the scoring rubric. In addition, even with this one difference between Takako's
rating and mine, both of us had scored her with an alignment in three competencies; therefore,
we were able to reach 100% agreement on all three test takers' scores.
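For reference, exact agreement before discussion can be computed as the share of paired ratings on which the two raters chose the same level. The sketch below is illustrative only: the function name is hypothetical, and the individual levels are placeholders, since the report gives only the pattern of agreement (two test takers with identical ratings and one with a single difference on vocabulary), not the levels themselves.

```python
def exact_agreement(ratings_a, ratings_b):
    """Proportion of paired ratings on which two raters chose the same level."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


# Placeholder levels in the order grammar, content, vocabulary, discourse,
# delivery. Only the agreement pattern mirrors the pilot; the levels are
# hypothetical.
rater_1 = [[3, 3, 3, 3, 3], [3, 3, 3, 3, 3], [3, 3, 2, 3, 3]]
rater_2 = [[3, 3, 3, 3, 3], [3, 3, 3, 3, 3], [3, 3, 3, 3, 3]]

# Agreement per test taker, then pooled over all 15 paired ratings.
per_taker = [exact_agreement(a, b) for a, b in zip(rater_1, rater_2)]
pooled = exact_agreement(sum(rater_1, []), sum(rater_2, []))
```

Under this pattern, 14 of the 15 paired ratings match before discussion, while the placement decisions match for all three test takers.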
Content Analysis for Subjective Section
The question genre was first based on the fact that these students will most likely
return to the US for higher education. To make sure that this test would genuinely
measure their communicative competence in an academic and campus-life setting, I reviewed my
previous specifications draft and researched possible questions that could reflect
meaningful real-world performance, which is when I came across the concept of authentic
assessment.
To remind myself that this test is designed to assess academic-life communicative
skills, I checked my draft of the rubric frequently while developing the questions to ensure
that they could reflect real-world circumstances. On top of that, even though this test is
meant to be a placement test for Summer ICE program students aged 15 to 19, when we piloted
it with two current IEP students and one current TFL student at MIIS, they all seemed
perfectly relaxed and chatty throughout their test sessions after we told them, prior to
beginning the test, to think of the pilot not as a test but as a normal chat between
acquaintances. Informing students beforehand and choosing topics that are personally relevant
and resemble real-life situations therefore appears to put them in a less stressed state of
mind, so that they can express themselves better and their output is more natural and
genuine.
Revised Speaking Test Questions (Topic Prompts)
Part I: 4-5 minutes
Personal Interest/Information:
1) Do you have a favorite food? (Follow up: What about your second favorite food? Why is it
not your first choice?)
What about any foods you dislike?
What would be your ideal meal on a date? And in what kind of restaurant?
2) How do you usually spend your free time after you finish your homework?
(Follow up: Do your parents help you with your homework when you have questions? Or do
you think that homework should be done by yourself?)
3) What is your favorite subject in school? (Prompt them to elaborate when an answer is given.)
4) Do you enjoy shopping, and what do you usually buy? (Follow up: Do you prefer to shop
alone or with your family and friends?)
5) Do you have any hobbies? Are your hobbies relaxing, or do they make you excited?
Comparison/Opinion:
6) Can you tell me why you want to come to ICE this summer instead of enjoying time
with your family and friends in your home country?
7) Can you tell me some facts about America that you find interesting?
8) How is America (or American culture) different from your home country? Are there also any
similarities?
9) Do you think fine arts are important in life? Between music and art, such as paintings and
sculptures, which one do you prefer or enjoy the most, and why?
10) What method works best for you when you're learning something new? What method have
you tried that didn't work out so well? (If they cannot understand the word "method,"
rephrase the question to "What is the best strategy that you would use when you're learning
something new?")
===== 1 minute of preparation time =====
Part II: 4-5 minutes
11) Tell me your opinion: What do you think about having international students as
classmates? Do you think it'll help to improve your English? Please elaborate on your answer.
12) What do you plan on studying in college? Why do you want to study in that area?
(If the student answers "I don't know yet," move on to another topic choice.)
13) Why do you think it is important to learn English, and what are the benefits of being able
to speak English?
14) Is it important to have a university degree in order to get a good job in your home country?
What about having a university degree from a foreign university? Will that enhance your
future employment opportunities in your home country?
15) If there's one change that you think is important for the modern school system to make,
what would you recommend, and what are your reasons?
Revisions and Rationale
Upon finishing the pilot test, I immediately knew what revisions would be necessary.
First of all, there was no need to revise the scoring rubric, as each descriptor clearly
describes the ability expected at each level of each competency, and it was easy to determine
the students' levels in the five competencies accordingly. Since the subjectively-scored
section is a speaking test with topic prompts, the only revision I had to make was to add more
prompts to each part of the test and to change the wording of some of the questions so that
students can understand them more easily and elaborate on their answers without follow-up
questions. One thing I noticed during the actual test administration was that all three test
takers were rather advanced communicatively and could get through a question (even with
follow-up questions) very quickly, which shortened my intended 10-minute speaking test to
approximately 7 minutes or less. Furthermore, I felt that asking only one main question and
asking them to elaborate on the answer during the first part of the test wasn't enough (given
that they were able to answer and elaborate to an extent on their own). Therefore, for
students with high English proficiency, especially in discourse and fluency, it would be best
to prepare at least 2 questions for each of Part I and Part II of the speaking test; that way,
if a student gets through the first question with adequate elaboration, there will still be
another question to fill the remaining time and fulfill the intended 4-5 minutes for each part
of the test.
Another minor revision is to enforce the one-minute preparation time between Part I and
Part II of the test. Because all three of our pilot test takers were quite advanced in
discourse competency and very talkative, they were able to proceed straight into the second
part of the test (the more academic and college-life-centered questions) without pausing to
think and write down their topic cues; one of them even said she didn't need the one-minute
preparation time to sort her thoughts in order to produce a more coherent response to the
topic. Nevertheless, if this test were administered in the way intended, I would definitely
enforce the one-minute preparation time rule to ensure fairness among all the examinees,
because not everyone has the same level of communicative ability in a test setting.
References
Bejar, I. I., Douglas, D., Jamieson, J., Nissan, S., & Turner, J. (2000). TOEFL 2000 listening
framework. Princeton, NJ: Educational Testing Service.
Carrell, P. L. (2007). Notetaking strategies and their relationship to performance on listening
comprehension and communicative assessment tasks. Research Report-Educational Testing
Service Princeton RR, 7.
Hughes, A. (2003). Testing for language teachers (2nd ed.). NY: Cambridge University Press.
Turner, J. (2013). Item analysis handout (Obtained as class handouts on ).
Wiggins, G. P. (1993). Assessing student performance. San Francisco: Jossey-Bass Publisher.
Appendix
Self-Assessment Survey
Name: ________________
Please read the following statements carefully and place a check mark next to your choice of
answer.
1. I can initiate spoken greetings and talk about basic personal information.
___ Strongly Agree ___ Agree ___ Disagree
2. I can talk about basic things such as school life, food, and my hobbies.
___ Strongly Agree ___ Agree ___ Disagree
3. I can perform daily activities such as ordering food and going shopping alone.
___ Strongly Agree ___ Agree ___ Disagree
4. I can t