megan-kirk
Principles in language testing
What is a good test?
What is the purpose of testing?
• The purpose of testing is to obtain information on language skills of the learners.
• Information is very costly. The more specific it is, the more cost it involves.
– Is language testing targeting specific information?
– Costs here involve human and material resources and TIME.
– Once an institution/teacher has decided that the information is needed, it/he should be ready to meet the costs.
Types of tests
• Achievement tests (final or progress)
• Proficiency tests
• Pro-achievement tests
• Diagnostic tests
• Placement tests
Test marking
• Assessment scale (also: rating scale)
– criteria by which performances at a given level will be recognized
– levels of performance:
  • 10 (excellent), 9 (very good), 8 (good)
  • bands 0-9 in IELTS
  • 1-100 pts in the national English examination
  • level descriptors – verbal descriptions of performances that illustrate each level of competence on the scale
Communicative language competences
• Linguistic competences
– lexical, grammatical, semantic, phonological, orthographic, orthoepic
• Sociolinguistic competences
– markers of social relations, politeness conventions, expressions of folk wisdom, register differences, dialect and accent
• Pragmatic competences
– discourse competence (ability to arrange sentences in proper sequence), functional competence (requests, invitations etc.)
(adapted from CEFR 2001)
Competences vs. skills
• Competences are tested through skills
• The four major skills are subdivided into minor subskills:
– reading comprehension:
  • reading for general orientation
  • reading for information
  • reading for main ideas
  • reading for specific information
  • reading for implications etc.
(CEFR 2001)
What is good testing?
• It is valid (VALIDITY)
• It is reliable (RELIABILITY)
• It is practical (PRACTICALITY)
• It has positive impact on the teaching process (WASHBACK EFFECT)
Test validity
• It is the appropriateness of the test; OR
• It shows that a test tests what it is supposed to test; OR
• A test is valid if it measures accurately what it is intended to measure.
• To establish that a test is valid, empirical evidence is needed. The evidence comes from different sources…
Types of validity
• Construct validity:
– the extent to which a test measures the underlying psychological construct (“ability, capacity”)
– the extent to which a test reflects the essential aspects of the theory on which that test is based
– an overarching notion of validity reflected in many subordinate forms of validity
In a more complicated way…
• If a test does not have construct validity, test scores will show CONSTRUCT-IRRELEVANT VARIANCE.
– E.g. in an advanced speaking test candidates may be asked to speak on an abstract topic. Personal engagement in the topic, however, may weaken or improve the performance. BUT: having previous knowledge about the abstract topic should not be assessed.
Types of validity
• Content validity:
– the extent to which a test adequately and sufficiently measures the particular skills it sets out to measure (cf. test specifications)
• Response validity:
– … test takers respond in the way expected by the test developers
• Predictive validity:
– … a test accurately predicts future performance
• Concurrent validity:
– … scores on one test relate to scores on another external measure
• Face validity:
– … a test appears to measure whatever it claims to measure
(Hughes 2003: 26-35)
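Concurrent and predictive validity are usually checked by correlating scores on the test with scores on an external measure. The sketch below is only an illustration of that idea: the function name `pearson_r` and both score lists are hypothetical, invented for the example, and a Pearson correlation is just one common choice of statistic.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: the same six candidates on a new test (1-100 pts)
# and on an established external measure (band scores).
new_test = [55, 62, 70, 48, 81, 90]
external = [5.0, 5.5, 6.5, 4.5, 7.0, 8.0]

r = pearson_r(new_test, external)
print(f"correlation with the external measure: {r:.2f}")
```

A coefficient close to 1 would count as one piece of empirical evidence for concurrent validity; for predictive validity the external measure would be collected later (e.g. subsequent course results).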
Types of validity
• Nearly 40 different types have been collected on a language testers’ forum…
• The more different types of validity are established in a test, the more valid that test is considered to be.
Test reliability
• Quality of test scores resulting from test administration:
– accuracy of marking and fairness of scores
– consistency of marking:
  • similar scores on different days
  • similar scores from different markers
– inter-rater reliability (agreement between different markers)
– intra-rater reliability (consistency of the same marker on different occasions)
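Inter-rater reliability can be quantified in several ways. The sketch below uses Cohen's kappa, which is not named on the slides and is swapped in here as one standard agreement statistic for two markers; the function name `cohen_kappa` and the marks are hypothetical. Kappa treats the marks as unordered categories, so for ordered bands a weighted variant or a correlation may be more appropriate.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must mark the same non-empty set of scripts")
    n = len(rater_a)
    # Proportion of scripts on which the two raters gave the same mark.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's own mark distribution.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[mark] * count_b[mark] for mark in count_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used one identical mark throughout
    return (observed - expected) / (1 - expected)


# Hypothetical marks given by two markers to the same five scripts.
marker_1 = [8, 9, 10, 8, 9]
marker_2 = [8, 9, 10, 9, 9]

kappa = cohen_kappa(marker_1, marker_2)
print(f"kappa = {kappa:.4f}")  # prints "kappa = 0.6875"
```

The same function applied to one marker's first and second marking of the same scripts gives a rough picture of intra-rater reliability.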
Factors influencing reliability
1. The performance of test takers
   1. a sufficient number of items
   2. restricted freedom of test behaviour
   3. unambiguous items, clear instructions and rubrics
   4. layout, good copies, familiar format
   5. proper administration
2. The reliability of scorers
   1. objective scoring vs. subjective scoring
   2. restricting freedom of response
   3. a detailed scoring/marking key
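Why "a sufficient number of items" matters can be illustrated with the Spearman-Brown prophecy formula, a standard psychometric result that is not stated on the slides and is brought in here as an assumption. It predicts the reliability of a test lengthened by a factor k with comparable items. The function name and the starting reliability of 0.60 are hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability after changing test length by `length_factor`.

    Spearman-Brown formula: r_new = k * r / (1 + (k - 1) * r),
    where r is the current reliability and k the length factor.
    """
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must lie between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length factor must be positive")
    k, r = length_factor, reliability
    return k * r / (1 + (k - 1) * r)


# Hypothetical case: a short test with reliability 0.60 is doubled in length.
doubled = spearman_brown(0.60, 2)
print(f"predicted reliability: {doubled:.2f}")  # prints "predicted reliability: 0.75"
```

The prediction only holds if the added items are of the same kind and quality as the existing ones.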
Test feasibility/practicality
• It is the ease with which the items/tasks can be replicated in terms of resources needed, e.g. time, materials, people
Washback effect (sometimes ‘backwash’)
• It is a type of impact of examinations/tests on the classroom situation.
• Washback may be positive or negative.
How to achieve positive washback?
1. Test the abilities/skills whose development you want to encourage.
2. Sample widely and unpredictably.
3. Use direct testing.
4. Make testing criterion-referenced.
5. Base achievement tests on objectives.
6. Make sure that the test is known and understood by students and other teachers.
References and additional reading
1. Alderson, J. C., C. Clapham and D. Wall. 1995. Language Test Construction and Evaluation. Cambridge: CUP.
2. Hughes, A. 2003. Testing for Language Teachers. 2nd ed. Cambridge: CUP.
3. Council of Europe. 2001. Common European Framework of Reference for Languages. Cambridge: CUP.