Language Testing is a matter of using data to establish evidence of learning.
1
LANGUAGE TESTING, FIRST BIMESTER
SCHOOL: Languages
NAME: Dra. María Arias Córdova
DATE: April / August 2009
2
WHAT IS LANGUAGE TESTING?
Language Testing is a matter of using data to establish evidence of learning.
Testing is a universal feature of social life.
Testing is generally concerned with enumeration, that is, turning performance into numbers.
Testing is about making inferences.
3
TESTING AND EVALUATION
The relationship between testing and evaluation is similar to the relationship between the Curriculum and the Syllabus.
The CURRICULUM comprises the subjects that are studied in schools, and the procedures and approaches used to teach them. It is usually decided by the state.
4
TESTING AND EVALUATION
The SYLLABUS is a set of items for the teacher to cover in a term. But the syllabus is part of a bigger methodological scheme: the CURRICULUM.
EVALUATION    CURRICULUM
TESTING       SYLLABUS
5
ASSESSMENT AND EVALUATION
Assessment: the act of assessing; to estimate or determine the significance, importance, or value of; evaluate. It is important to note that the final purposes and assessment practices in education depend on the theoretical framework of the practitioners and researchers, and on their assumptions and beliefs about the nature of the human mind, the origin of knowledge, and the process of learning.
Evaluate: to find the value or amount of; to judge or determine the worth or quality of.
6
EDUCATIONAL ASSESSMENT
Educational assessment is the process of documenting, usually in measurable terms, knowledge, skills, attitudes and beliefs. Assessment can focus on the individual learner, the learning community (class, workshop, or other organized group of learners), the institution, or the educational system as a whole.
7
SOME ASSESSMENT TERMINOLOGY
Formative Assessment: A relatively informal assessment that takes place during the process of learning, as opposed to at the end. The purpose is to provide feedback, which helps in the learning process.
Performance assessment: Assessment of performance on an oral or written task.
Summative assessment: Formal testing or evaluation at the end of a learning period to measure what a student has learned.
8
SOME ASSESSMENT TERMINOLOGY
Self-Assessment: A reflexive process in which learners evaluate their own work based on pre-set criteria.
9
EVALUATION
Evaluation is an effective means of measuring teaching and learning performances in a language program and of improving the teaching process.
Evaluation is a process to judge or measure the value of a finished or ongoing program, plan, or even a policy (Gasper 1995). In the language teaching field, especially in ESL/EFL programs, there are numerous reports on how to apply evaluation to class activities and program assessment. (Forum magazine)
10
WHAT IS A TEST?
In education, a test is called an examination or exam; it serves to assess or measure a student's performance, knowledge, or skills.
Language tests play a powerful role in many people’s lives, acting as gateways at important transitional moments in education, in employment, and in moving from one country to another.
11
PURPOSE OF TESTS
Personality tests
DNA TESTS
12
PURPOSE OF TESTS
MEDICINE: blood tests, cancer screening, hearing tests, eye tests
13
PURPOSE OF TESTS
THE LAW: paternity tests, lie detection tests
14
TYPES OF TESTS
Not all language tests are of the same kind. They differ with respect to how they are designed and what they are for; in other words, with respect to test method and test purpose.
In terms of method, we can broadly distinguish traditional paper-and-pencil language tests from performance tests.
Language tests also differ according to their purpose. The same form of test may be used for different purposes, although in other cases the purpose may affect the form.
15
REASONS FOR TESTING
Achievement tests: are associated with the process of instruction.
Proficiency tests: look to the future situation of language use without necessarily any reference to the previous process of teaching.
Diagnostic tests: involve identifying specific areas of strength or weakness in language ability so as to assign students to specific courses or learning activities. Lyle F. Bachman and Adrian S. Palmer (2000)
16
REASONS FOR TESTING
Progress Tests: J. B. Heaton (1991) explains that the type of test we give will depend very much on our purpose in testing. There are many reasons for giving a test, and we should always ask ourselves about the real purpose of the test which we are giving to our students. Perhaps the most important reason is to find out how well the students have mastered the language areas and skills which have just been taught. These tests look back at what students have achieved and are called progress tests.
17
REASONS FOR TESTING
Placement Tests: enable us to sort students into groups according to their language ability at the beginning of the course. Such a test should be as general as possible and should concentrate on testing a wide and representative range of ability in English.
18
REASONS FOR TESTING
Proficiency tests: Heaton says that we use proficiency tests to measure how suitable candidates will be for performing a certain task or following a specific course. For example, the British Council administers a proficiency test to overseas students intending to study in universities and polytechnics in Britain. This test has different parts which candidates can choose to do according to their different purposes.
19
REASONS FOR TESTING
It is thus possible for the test to measure candidates' proficiency in certain special fields: life sciences, medicine, social studies, physical science, and technology.
Most proficiency tests concentrate on assessing candidates' ability to use English for a specific purpose. The candidates' general command of English may not form the chief focus of a proficiency test.
20
TESTING LISTENING SKILLS
DISTINGUISH BETWEEN SOUNDS
DICTATION
REPEATING INFORMATION
SHORT STATEMENTS, QUESTIONS AND CONVERSATIONS
COMPLETING PICTURES
FOLLOWING DIRECTIONS
SHORT CONVERSATIONS AND STATEMENTS ABOUT PICTURES
21
TESTING SPEAKING SKILLS
PRONOUNCING WORDS IN ISOLATION
PRONOUNCING WORDS IN SENTENCES
READING ALOUD
RE-TELLING STORIES
USING PICTURES
MAPS
ORAL INTERVIEWS
ASKING QUESTIONS
22
MARKING
J. B. Heaton also says that it is appropriate to state an important principle here; namely, never mark in front of a student. Nothing is more discouraging for a student than to enter into conversation with someone who is constantly breaking off to enter marks and comments. The student should be constantly reassured that what he or she says is being treated as important, rather than how he or she says it. If possible, wait until the student has left the room before you enter your marks and comments.
In spite of all your attempts, it may sometimes be impossible to avoid tension and nervousness on the part of many students. Such feelings of tension can affect performance and change the way they behave in an interview. For example, students at a certain age sometimes become unnaturally quiet or aggressive.
23
OBJECTIVE ITEM TYPES
Objective tests are those that include questions in a true/false, multiple-choice, matching, or fill-in format. Usually the answer is provided, but the student must decide among several possibilities.
24
OBJECTIVE ITEM TYPES
1. Multiple-Choice Items
2. Cloze Tests
3. Dictation
4. Short Answer Questions
5. Dichotomous Items
6. Matching
7. Sentence Completion or Fill-In Questions
8. True/False Questions
25
MULTIPLE-CHOICE ITEMS
The basic structure is:
A stem: the initial part of a test item
A number of options: the alternatives from which examinees have to select the correct one
Key: the correct answer
Distractors: the incorrect options
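The structure just described can be sketched as a small data type. This is only an illustrative sketch: the class name MultipleChoiceItem and its methods are hypothetical, and the key "b" for the vocabulary item is our own reading of that item, not something stated on the slides.

```python
from dataclasses import dataclass


@dataclass
class MultipleChoiceItem:
    """Hypothetical container for one multiple-choice item."""
    stem: str                 # the initial part of the item
    options: dict[str, str]   # option label -> option text
    key: str                  # label of the correct option

    def __post_init__(self) -> None:
        # The key must be one of the options offered to the examinee.
        if self.key not in self.options:
            raise ValueError("key must be one of the option labels")

    @property
    def distractors(self) -> dict[str, str]:
        # Every option that is not the key is a distractor.
        return {label: text for label, text in self.options.items()
                if label != self.key}

    def is_correct(self, response: str) -> bool:
        return response == self.key


# The vocabulary item used as Example 1; the key is assumed to be "b".
item = MultipleChoiceItem(
    stem="The word hazardous is closest in meaning to",
    options={"a": "frequent", "b": "perilous", "c": "outer", "d": "unpredictable"},
    key="b",
)
```

Keeping the key separate from the option list makes it easy to check an item for the defects discussed later, such as more than one plausible key.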
26
MULTIPLE-CHOICE ITEMS
Example 1:
The word hazardous is closest in meaning to
a. frequent
b. perilous
c. outer
d. unpredictable
27
MULTIPLE-CHOICE ITEMS
Example 2:
According to the Cognitive Approach, all of the following may influence the decision whether to act aggressively EXCEPT a person's
a. moral values
b. previous experiences with aggression
c. instinct to avoid aggression
d. beliefs about other people's intentions
28
MULTIPLE-CHOICE ITEMS
Harold S. Madsen
Example 3:
The word disrupted is closest in meaning to
a. prolonged
b. established
c. followed
d. upset
29
MULTIPLE-CHOICE ITEMS
Example 4:
The phrase bound to is closest in meaning to
a. limited to
b. hidden within
c. regarded as
d. venerated as
30
MULTIPLE-CHOICE ITEMS Harold S. Madsen
Example 5:
Poor item: Do you need some _______ to write on?
a. pen
b. paper
c. material
d. ink
defect or weakness: more than one right answer
31
MULTIPLE-CHOICE ITEMS
Example 6:
Poor item: The mouse ______ quickly away.
a. very
b. ran
c. baby
d. little
defect or weakness: nonverbs used
32
MULTIPLE-CHOICE ITEMS
Example 7:
Poor item: I think he will be here in an ______.
a. soon
b. weekend
c. day after
d. hour
defect or weakness: "an" cues answer
33
MULTIPLE-CHOICE ITEMS
Example 8:
Poor item: She is a person of good judgement and courage.
a. sense
b. cents
c. scents
d. since
defect or weakness: spelling trap
34
DICTATION TESTS
Dictation can only be fair to students if it is presented in the same way to them all, and this generally means having the material on tape, so that not only is it presented in an identical way to all candidates, but the speed of delivery and positioning of pauses can be tested in advance. If the use of a tape recording is impossible, the people who deliver the dictation must be very thoroughly trained.
35
DICTATION TESTS
Dictation can be objectively marked if candidates are asked to write down the original text verbatim, and if the examiner has a system for deciding how marks should be allotted. However, such systems are difficult to devise. For example, if the marking instructions say, "deduct one point for each misspelled word and two points for each word that is missing or is not the same as in the original", it is not always clear whether a word is misspelled or just wrong. The same problem occurs even if the marker is told to ignore spelling mistakes.
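A minimal sketch of such a marking scheme shows where the difficulty lies. Everything here is an assumption made for illustration: the function name mark_dictation, the use of Python's difflib to align the two texts, and above all the similarity threshold that decides whether a word counts as "misspelled" (one point off) or "wrong" (two points off). Extra words written by the candidate are simply ignored.

```python
from difflib import SequenceMatcher


def mark_dictation(original: str, response: str, max_score: int,
                   spelling_threshold: float = 0.75) -> int:
    """Deduct 1 point per misspelled word, 2 per missing or wrong word."""
    ref = original.lower().split()
    got = response.lower().split()
    deductions = 0
    # Align the candidate's words with the original text.
    aligned = SequenceMatcher(a=ref, b=got, autojunk=False)
    for tag, i1, i2, j1, j2 in aligned.get_opcodes():
        if tag in ("equal", "insert"):
            continue  # correct words, or extra words (ignored here)
        written = got[j1:j2]
        for k, word in enumerate(ref[i1:i2]):
            # Assumed rule: a written word that closely resembles the
            # original counts as a misspelling, anything else as wrong.
            if (k < len(written) and
                    SequenceMatcher(a=word, b=written[k]).ratio()
                    >= spelling_threshold):
                deductions += 1
            else:
                deductions += 2
    return max(0, max_score - deductions)


text = "the cat sat on the mat"
print(mark_dictation(text, "the cat sat on the matt", 10))  # one misspelling
print(mark_dictation(text, "the cat on the mat", 10))       # one missing word
```

Changing spelling_threshold changes the scores, which is exactly the point made above: the boundary between a misspelled word and a wrong word is a judgement the marking scheme has to fix in advance.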
36
DICTATION TESTS
The other problem with this method of marking dictation is that it is both time-consuming and boring to mark. This means not only that the marking will be expensive but that the markers are likely to make frequent errors. Some test writers avoid this problem by giving a partial dictation in which the candidates are given a copy of the text they are to hear in which words, phrases or sentences have been deleted. Students are asked to fill in the gaps as they listen to the text being read.
37
ADVANTAGES OF MULTIPLE-CHOICE COMPLETION ITEMS
It is impossible for students to avoid the grammar point being evaluated.
Scoring is easy and reliable.
This is a sensitive measure of achievement (and like other multiple-choice language tests, it allows teachers to diagnose specific problems of students).
38
ADVANTAGES OF SIMPLE COMPLETION ITEMS
These tests are generally easier to prepare than are multiple-choice items.
These give the appearance of measuring productive skills because some items permit flexibility and original expression.
There is no exposure to incorrect grammatical forms.
These provide a sensitive measure of achievement.
39
DISADVANTAGES OF MULTIPLE-CHOICE COMPLETION ITEMS
Preparing good items is not easy.
It is easy for students to cheat.
It doesn't appear to measure students' ability to reproduce language structures.
This can have a negative influence on class work if used exclusively.
41
DISADVANTAGES OF SIMPLE COMPLETION ITEMS
These are usually more time consuming to correct than are multiple-choice questions. Not only can poor penmanship be a problem but also irrelevant errors beyond those being tested.
Occasionally students can unexpectedly avoid the structure being tested.
42
CLOZE TESTS
Cloze tests are prose passages, usually a paragraph or more in length, from which words have been deleted. The student relies on the context in order to supply the missing words. Cloze here refers to tests in which words are deleted mechanically. Each word is deleted regardless of what the function of that word is. So, for example, every sixth word might be removed.
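The mechanical deletion described above can be sketched in a few lines. The function name make_cloze and the blank marker are hypothetical choices for this illustration; words are split on whitespace, so any punctuation attached to a deleted word disappears with it.

```python
def make_cloze(passage: str, n: int = 6, blank: str = "______"):
    """Replace every n-th word with a blank, whatever its function."""
    words = passage.split()
    answers = []
    # Positions n-1, 2n-1, ... hold the n-th, 2n-th, ... words.
    for i in range(n - 1, len(words), n):
        answers.append(words[i])
        words[i] = blank
    return " ".join(words), answers


passage = ("Language tests play a powerful role in many people's lives, "
           "acting as gateways at important transitional moments in education.")
cloze_text, answer_key = make_cloze(passage, n=6)
print(cloze_text)
print(answer_key)
```

The returned answer list doubles as a marking key for the exact-word method of scoring.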
43
MATCHING
Matching means items where students are given a list of possible answers which they have to match with some other list of words, phrases, sentences, paragraphs or visual clues. Look at the following example, where the students have to match the questions on the left with the answers on the right.
44
MATCHING
COLUMN A                                       COLUMN B
1. Who is the head of the Catholic Church?     a. the Pope
2. What's the Amazon?                          b. about 4,700 years old
3. How old are the Pyramids of Egypt?          c. It's a river
45
TRUE AND FALSE QUESTIONS
True/False questions are the easiest test questions for the obvious reason that you have at least a fifty-fifty chance of getting the right answer. First, be sure you have read the question correctly. Look for words such as always or never; these words often indicate a false answer. Words such as often, usually, rarely, or sometimes can indicate a true answer. Decide whether the statement is totally true before you mark it true.
46
TRUE AND FALSE QUESTIONS
For example: Are you a good language learner?
TRUE FALSE
1. Practice pronunciation ……. ……..
2. Guess the meaning of new words ……. ……..
3. Make lists of new words ……. ……..
4. Watch TV in English ……. …….
5. Think about grammar ……. …….
6. Read something in English ……. …….
47
TEST CONTENT
From a practical point of view, test design begins with decisions about test content, that is, what will go into the test. These decisions imply a view of the test construct, the way language and language use in test performance are seen.
48
TEST METHOD
The next thing to consider in the test design is the way in which candidates will be required to interact with the test materials, particularly the response format, that is, the way in which the candidate will be required to respond to the materials.
49
TEST SPECIFICATIONS
These are a set of instructions for creating the test, written as if they are to be followed by someone other than the test developer; they are a recipe or blueprint for test construction. The specifications will include information on such matters as the length and structure of each part of the test.
50
TEST SPECIFICATIONS
The response format
The test rubric
How responses are to be scored
51
TEST TRIALS
THE TARGET TEST POPULATION
PROFICIENCY LEVEL
LEARNING BACKGROUND
AGE
52
THE RATING PROCESS
Making judgements about people is a common feature of everyday life. We are continually evaluating what others say and do, in comments called for or not, offering criticism and feedback informally to friends and colleagues about their behavior. Formal, institutional judgements figure prominently in our lives too.
People pass driving tests, survive the probationary period in a new job, get promotion at work, succeed at interviews, win Oscars for performances in a film, win medals in diving competitions, and are released from prison for good behavior.
53
THE RATING PROCESS
Rating scale
A rating scale is a set of categories designed to elicit information about an attribute in social science. Common examples are the Likert scale and 1-10 rating scales, for which a person selects the number which is considered to reflect the perceived quality of a product.
Rating scales are often referenced to a statement which expresses an attitude or perception toward something. The most common example of such a rating scale is the Likert scale, in which a person is asked to select a category label from a list indicating the extent of disagreement or agreement with a statement.
54
THE RATING PROCESS
The basic feature of any rating scale is that it consists of a number of categories. These are usually assigned integers. An example of the use of a Likert scale is as follows.
Statement:
Response options:
1. Strongly Disagree
2. Disagree
3. Agree
4. Strongly Agree
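Assigning integers to the categories is what makes the responses countable. The sketch below is illustrative only: the names LIKERT_4, code_responses and summarize are hypothetical, the sample responses are invented for the demonstration, and taking a mean of Likert codes is one common convention rather than something the slides prescribe.

```python
from statistics import mean

# The four response options of the scale above, coded as integers.
LIKERT_4 = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Agree": 3,
    "Strongly Agree": 4,
}


def code_responses(responses):
    """Turn category labels into their integer codes."""
    return [LIKERT_4[label] for label in responses]


def summarize(responses):
    """Count each category and report the mean code."""
    codes = code_responses(responses)
    counts = {label: responses.count(label) for label in LIKERT_4}
    return {"mean": mean(codes), "counts": counts}


# Invented sample data for illustration.
sample = ["Agree", "Strongly Agree", "Disagree", "Agree"]
print(code_responses(sample))
print(summarize(sample))
```

An unknown label raises a KeyError, which is a simple way to catch responses that fall outside the defined categories.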
55
EXAMPLES OF HOLISTIC RATINGS
Reading comprehension
Reading comprehension can be defined as the level of understanding of a passage or text. For normal reading rates (around 200-220 words per minute), an acceptable level of comprehension is above 75%.
Reading comprehension can be improved by training the ability to self-assess comprehension and by actively testing comprehension.
56
THE RATING PROCESS
Teaching conceptual knowledge is also advantageous.
Self-assessment can be conducted by summarizing and elaborative interrogation, and those skills will gradually become more automatic through practice.
Reading comprehension skills separate the "passive" unskilled reader from the "active" reader. Skilled readers don't just read; they interact with the text.
57
THE RATING PROCESS
To help a beginning reader understand this concept, you might make them privy to the dialogue readers have with themselves while reading.
Skilled readers, for instance:
Predict what will happen next in a story using clues presented in the text
Create questions about the main idea, message, or plot of the text
Monitor understanding of the sequence, context, or characters
Clarify parts of the text which have confused them
Connect the events in the text to prior knowledge or experience
58
THE RATING PROCESS
Introducing the rater into the assessment process is both necessary and problematic. It is problematic because ratings are necessarily subjective. Another way of saying this is that the rating given to a candidate is a reflection, not only of the quality of the performance, but of the qualities as a rater of the person who has judged it.
59
CONSULTED BIBLIOGRAPHY
McNamara, Tim (2000). Language Testing. Oxford University Press.
Pierce, Douglas and Kinsell, Sean (2009). Cracking the TOEFL iBT. Princeton Review. Random House, Inc., New York.
Alderson, J. Charles, Clapham, Caroline and Wall, Dianne (1995). Language Test Construction and Evaluation. Cambridge University Press.
Hughes, Arthur (1995). Testing for Language Teachers. Cambridge University Press.
Wikipedia. The free encyclopedia.
60
THANK YOU