Corpus linguistics and language teaching
The next nexus?
Doug Biber
Northern Arizona University
Goals of the talk
• Introduce corpus linguistics
• Present case studies illustrating the surprising findings that emerge from corpus-based research
• Discuss the application of corpus research to classroom teaching and materials development
What is corpus linguistics?
• A research approach for describing language use:
How do speakers and writers actually use the vocabulary and grammar resources available in a language?
What is a corpus?
• A large, principled collection of ‘natural’ texts stored on computer
• A corpus should ‘represent’ particular language varieties or registers (e.g., conversation or university textbooks)
– Design is important: texts must be sampled from particular target registers
– Size is equally important: Some language features are rare but still have systematic patterns of use
Characteristics of corpus-based analysis (I)
• Relies on computer-assisted techniques
– Concordancers (‘KWIC’ displays = ‘Key Word In Context’)
– Computer programs
   • Automatic (e.g., grammatical ‘taggers’)
   • Interactive (to code grammatical variants)
Example of concordance output (from MonoConc)
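The concordance display itself is not reproduced here. As a rough illustration of what a KWIC display does, the following minimal Python sketch lines up every occurrence of a keyword with a fixed amount of context on each side. It is not MonoConc's implementation; the function name `kwic`, the `width` parameter, and the bracket formatting are assumptions made for this example, and the sample text reuses example sentences from later in the talk.

```python
import re

def kwic(text, keyword, width=30):
    """Return one KWIC line per occurrence of keyword (case-insensitive)."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        # Take up to `width` characters of context on each side of the hit.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keyword forms a single column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right:<{width}}")
    return lines

# Sample text assembled from example sentences quoted in this talk.
sample = ("That was delicious. I can't sleep like that. "
          "This means that the accountant records revenues as they are earned.")

for line in kwic(sample, "that"):
    print(line)
```

Aligning the keyword in one column is what lets the analyst scan the left and right contexts for recurring patterns.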
Characteristics of corpus-based analysis (II)
• Analyses are empirical
• Uses both quantitative and qualitative / interpretive techniques
• Meaningful analyses must be motivated by linguistic research questions (not simply by the availability of a corpus)
So what is corpus linguistics?
• A research approach: a way of thinking about language
– Shines the spotlight on language use: registers and language for specific purposes
– Allows investigation of language choice: Why does a speaker use a particular word or grammatical form rather than alternatives?
– Allows investigation of meaning in context: why synonyms are usually not interchangeable
– Allows investigation of language preference: what forms are rare? What is especially common?
Corpus descriptions capture the complexities of actual use
– Language use is often systematic but complex
– Corpus-based studies can consider the range of relevant factors and the interactions among factors
– Corpus analysis describes the patterns of use, but it cannot directly determine how those findings are relevant for language learning
– That is, corpus analyses provide the basis for informed decisions by teachers – not necessarily the immediate content of our language teaching
Case studies
• Vocabulary
• Grammar
• Lexico-grammar
Corpus-based descriptions of vocabulary: Selected reference works
Learner dictionaries based on corpora:
Longman Dictionary of Contemporary English (LDOCE)
Collins COBUILD English Dictionary
Vocabulary textbooks based on corpora:
McCarthy and O’Dell, Basic Vocabulary in Use
Thornbury, Natural Grammar
Academic studies of collocation:
Sinclair 1991; Partington 1998
Case studies on vocabulary
• Corpus-based dictionaries
• Collocation
• Semantic prosody
Case studies on vocabulary (1): Corpus-based dictionaries
• The order of meanings reflects use
e.g. LDOCE entry for concerned:
Meaning 1: ‘involved in something’ (reach an agreement with all concerned)
Meaning 2: ‘worried’ (concerned about how little I eat)
• Identifies common words and register differences
Words moderately common in speech but not in writing (LDOCE): flood, hopefully, messy, potato, shave, underneath
Words moderately common in writing but not in speech (LDOCE): focus, glance, moreover, pollution, scope, underlying
Case studies on vocabulary (2): Collocations
Synonyms: large, great, and big
For example:
Large (‘quantity’): number(s), scale, proportion, amount
versus
Great (‘impressive’): deal (of), importance, majority
(see Firth 1957; Sinclair 1991; Partington 1998; Biber, Conrad, Reppen 1998)
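Collocate lists like the ones above come from counting which words occur next to a node word across a corpus. The following minimal Python sketch shows the counting step only. The function name `right_collocates`, the `span` parameter, and the one-sentence toy text are invented for illustration; a real study would run over a full corpus and usually apply an association measure rather than raw counts.

```python
import re
from collections import Counter

def right_collocates(text, node, span=1):
    """Count the words that occur within `span` tokens to the right of `node`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            counts.update(tokens[i + 1:i + 1 + span])
    return counts

# Toy text invented for this sketch; it is not corpus data.
toy_text = ("a large number of cases, a great deal of work, "
            "a large amount of data, and a large number of texts")

print(right_collocates(toy_text, "large").most_common())
print(right_collocates(toy_text, "great").most_common())
```

Comparing the two lists side by side is the basic move behind the claim that near-synonyms such as large and great prefer different company.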
Case studies on vocabulary (3): Semantic prosody
Copular verbs that mean ‘become’:
turn black, red, white, pale
come alive, loose, true, unstuck
go crazy, mad, wrong, bad
(Longman Grammar of Spoken and Written English, 444-445)
(cf. Partington 1998)
Corpus-based studies of grammar
• Demonstrative pronouns: this versus that
• Word classes: nouns, verbs, pronouns
• Dependent clauses: that-clauses versus to-clauses
• (From the Longman Grammar of Spoken and Written English)
Case studies on grammar (1)
The grammar of individual words: Demonstrative pronouns this versus that
• The traditional description of the difference:
– This refers to a thing near the speaker
– That refers to something that is not near the speaker
The grammar of individual words (cont.) Demonstrative pronouns that versus this
[Figure: bar chart comparing that and this in Conversation and Academic writing; y-axis: frequency per million words, 0 to 12,000]
Demonstrative pronouns that versus this (cont.)
• Examples of that in conversation (vague or situational reference)
That was delicious.
A: I was, I was flat on my back. B: Uh, I can't sleep like that
• Examples of this in academic writing (text deixis)
GAAP requires that a business use the accrual basis. This means that the accountant records revenues as they are earned…
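The comparison of that and this across registers relies on frequencies normalized per million words, so that corpora of different sizes can be compared. A minimal Python sketch of that normalization follows. The function name `per_million` and the simple regular-expression tokenizer are assumptions for illustration; note that a raw word match cannot tell demonstrative that from that used as a complementizer, which is one reason real studies rely on grammatical tagging.

```python
import re

def per_million(text, word):
    """Occurrences of `word` per million tokens of `text` (0.0 for empty text)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token == word)
    # Normalize the raw count by corpus size, then scale to one million words.
    return hits / len(tokens) * 1_000_000

# Conversation examples quoted above; far too short for a meaningful rate,
# so this only demonstrates the arithmetic.
conversation = "That was delicious. Uh, I can't sleep like that"
print(per_million(conversation, "that"))
print(per_million(conversation, "this"))
```

The same function applied to two register subcorpora would yield directly comparable rates, whatever the size of each subcorpus.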
Case studies on grammar (2)
The register distribution of grammatical classes:
Nouns, verbs, personal pronouns
Distribution of nouns, verbs, and pronouns across four registers
[Figure: bar chart of nouns, verbs, and personal pronouns in Conversation, Classroom Teaching, Textbooks, and Academic Prose; y-axis: frequency per 1,000 words, 0 to 350]
Case studies on grammar (3)
Syntactic features
Dependent clauses are common in writing but rare in speech:
Contrasting intuitions with actual use
That-clauses and to-clauses in conversation vs. academic prose
[Figure: bar chart of Verb + THAT-clause, "Extraposed" THAT-clause, Verb + TO-clause, and "Extraposed" TO-clause in Conversation and Academic prose; y-axis 0 to 7,000]
• Verb + that-clause in conversation:
I know (that) I told you.
I think (that) we picked it up.
• Extraposed to-clauses in academic prose:
It is important to specify the states …
It is difficult to maintain a consistent level…
It is impossible to liquefy a gas …
Corpus-based studies of lexico-grammar
Case studies from the Longman Grammar of Spoken and Written English:
– The grammatical ‘patterns’ of individual words: tell and promise
(cf. Hunston and Francis 2000; Thornbury 2004)
– Passive verbs: common and rare
– Common verbs with that-clauses in conversation
Case studies on lexico-grammar (1)
The grammar of words: tell versus promise
• Both verbs have identical valency patterns:
– They can occur as monotransitive verbs (with a direct object)
– or as ditransitive verbs (with a direct object and an indirect object)
Grammatical patterns for tell and promise in newspaper language
[Figure: bar chart of three grammatical patterns (V + Direct Object, V + Clause, V + Indirect Object + Clause) for TELL and PROMISE; y-axis 0 to 90]
• Example of TELL in newspapers – expressing both the addressee AND the content of the message:
Cheney told [Navy Secretary H. Lawrence Garrett] [that he would cancel the $50 billion project] …
• Example of PROMISE in newspapers – expressing only the content of the promise:
The company promised [to donate about $500,000 to the cause] …
Case studies on lexico-grammar (2)
The words of grammar:
Verbs with passive voice
Verbs with passive voice
• Selected verbs that almost always occur with passive voice in academic prose (over 70% of the time):
– Verbs of scientific methodology: be analyzed, be calculated, be collected, be measured, be tested
– e.g., Their occurrence is measured in a few parts per million.
– Verbs expressing logical relations and interpretations: be based (on), be associated (with), be attributed (to), be interpreted (as), be regarded (as)
– e.g., Their presence must be regarded as especially undesirable.
Verbs with passive voice (2)
• Selected transitive verbs that almost never occur in the passive voice:
agree, guess, have, like, love, quit, reply, try, want, watch, wish, wonder
Case studies on lexico-grammar (3)
Verbs controlling that-clauses versus to-clauses
That-clauses and to-clauses in conversation vs. academic prose
[Figure: bar chart of Verb + THAT-clause, "Extraposed" THAT-clause, Verb + TO-clause, and "Extraposed" TO-clause in Conversation and Academic prose; y-axis 0 to 7,000]
Verbs that control that-clauses
• Almost 200 verbs attested in the LSWE Corpus (e.g., feel, realize, hear, assume, suggest, ensure, indicate, imply, propose)
• Only 4 verbs are extremely common in conversation:
think, say, know, guess
Verbs controlling that-clauses in conversation
[Figure: bar chart of verbs controlling that-clauses in conversation: THINK, SAY, KNOW, GUESS, and all other verbs; y-axis 0 to 2,500]
Applications of corpus-based research
Language for specific purposes
• Language use is mediated by register
• That is, notions like ‘common’, ‘rare’, and ‘typical’ are usually not meaningful for general English.
• Rather, language features and patterns are typical of particular registers.
• Case study of modal verbs in university registers
Modal verb classes across specialized university registers
[Figure: bar chart of possibility, necessity, and prediction modals in Classroom teaching, Classroom management, Textbooks, and Syllabi, etc.; y-axis 0 to 25]
Why are there so many prediction modals in class management?
These usually serve (indirect) directive functions:
• I'd like you to review your quizzes
• I would encourage you to add this to your stack of materials
• and then assignment six will be due Tuesday
Students using corpora in the classroom
• The student as researcher: Data-driven learning (e.g., article use) (Johns – e.g., 1991, ELR Journal)
• LSP applications: student concordancing based on a specialized corpus (see, e.g., Donley and Reppen 2001, TESOL Journal; Gavioli and Aston 2001)
• Do students benefit? Yes: enhances vocabulary learning and transfer of word knowledge (Cobb 1997, System; 1999, CALL)
General considerations for curricula, materials development, and lesson planning
• What language features and grammatical topics to include / exclude
• What vocabulary to include
• Sequencing
• Providing meaningful practice
Using corpus-based materials in the classroom: Issues (1)
• How to adapt corpus-based research findings?
• What kinds of corpus findings are useful for learners?
• How to adapt natural text for classroom use?
• What kinds of gains in proficiency should we expect from corpus-based materials?
Developing corpus-based materials for the classroom: Issues (2)
• How important is frequency / typicality? What about representation of specific target registers?
• Difficulty and learnability of the construction; inter-language sequences – natural order of acquisition.
• To what extent are current practices actually informed by research on acquisition?
• Unreliability of intuitions
Future research directions
• Need for empirical research on the translation of corpus research findings to classroom materials:
– Overall distribution of grammatical features: issues of inclusion and sequencing
– Collocation and lexico-grammatical patterns: issues of word choice and practice within a lesson
– Discourse factors influencing grammatical variation and choice: presentation and practice within a lesson
• What kinds of gains in proficiency, in response to what kinds of materials?