Applications of corpus analysis in EAP: research, learning, and teaching Martin Hewings The University of Birmingham & Cambridge University Press [email protected]

Applications of corpus analysis in EAP: research, learning, and teaching

Martin Hewings

The University of Birmingham &

Cambridge University Press

[email protected]

Outline of talk

• Corpus analysis in EAP research

• Students learning from corpora: Data-driven learning and an alternative

• Teachers learning from corpora: Classroom applications

Features of 31 JEAP ‘corpus’ papers

• Paper types
– 26 corpus analyses

• Corpus content
– 19 writing; 5 speech; 2 both; journal articles predominant; focus on single social science disciplines

• Corpus types
– mainly expert/published

• Focus of analysis
– mainly particular lexical/grammatical features

Corpus research for EAP

Typically…

• written corpora

• expert/published corpora

• particular (social science) disciplines

• lexical/grammatical focus

The future?

• more speech & CMC

• more learner corpora

• more science disciplines (& more comparison)

• greater number of areas of investigation

Corpora as resources for learners: data-driven learning (DDL)

DDL:

• exposes students to ‘target’ language forms

• provides authentic examples

• provides information beyond dictionary or grammar

• encourages inductive learning

• encourages learner autonomy

“…an exceptional group of students – highly acculturated into the genres of their discourse communities, mostly on the way to their PhDs, eager to perfect their English, possessing of advanced computer skills, and perfectly comfortable with quantitative data.”

Lee & Swales (2006): DDL

DDL: some reservations

• lack of evidence to link DDL to language improvement

• are the outcomes worth the time, effort and money?

• it doesn’t suit all students

Selecting corpus data for students (as an alternative to DDL)

• An example: MBA students’ use of ‘I’

• ‘Research article (RA) corpus’: 120,000 words

• ‘MBA corpus’: essays, 22,000 words

MBA corpus

     on TV or from magazine, I am in the opinion that service
  more consumption of fuels. I am almost certain that there
  world imports composition. I believe services commodities
s composition. In the future I believe there will be a new
 osing a million dollars. So I believe services commodities
wector. As a result of this, I can predict that there will
mports appeared. After 1987, I do not think that there was a
 mports about one third. But I don't think it will grow so
 er. As a result, therefore, I expect that the countries
more than other commodities. I expect service industry will
 ervices" and is intangible. I feel that the intangible
 veloped in the next future. I personally see the above idea
 world. But, before I go on, I should make a point. After
ore detail analysis, I think I should take deeper consideratio
  a long term point of view, I suppose the composition of

RA corpus

uring the estimation period. I also computed Patell's (1976, p.
y. The question: When should I buy? has one logical answer:
(SVR) metric (in all cases), I choose to present only the result
he wall' statements such as 'I don't care how you do it, just do
on environment. In addition, I examine several subhypotheses bas
 Size Test. In this section, I first test the hypothesis of diff
y perennial question: should I invest now or wait for the
as a long way from reality: 'I just did not want to be part of a
ep asking themselves,'How do I know? What evidence is there?' Th
y? By information technology I mean the hardware and software, c
I were doing this what would I need?' Another useful heuristic r
ed per ASR No. 190. That is, I test the hypothesis that inflatio
e key questions such as: 'If I were doing this what would I need
h domestically and globally. I will, therefore, focus more on th

Journals: published writing
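
Concordance lines like those for ‘I’ in the two corpora can be produced with a few lines of code. The following is a minimal sketch, not the tool used for the slides: the function name `kwic`, the `width` parameter and the sample text are all assumptions made for illustration.

```python
import re

def kwic(text, node, width=30):
    """Return keyword-in-context lines for each whole-word match of `node`."""
    # Collapse line breaks and repeated spaces so the context columns stay tidy.
    text = " ".join(text.split())
    lines = []
    for m in re.finditer(r"\b" + re.escape(node) + r"\b", text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so every node word falls in one column.
        lines.append(f"{left:>{width}}{m.group()}{right}")
    return lines

# Hypothetical sample text; a real run would read the corpus files instead.
sample = "In this section, I first test the hypothesis. That is, I test it again."
for line in kwic(sample, "I", width=20):
    print(line)
```

The match is case-sensitive, so ‘I’ is found but the ‘i’ inside other words is not; contexts are cut at a fixed number of characters, which is why words at the edges of concordance lines are often truncated.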

Teachers learning from corpora: checking intuitions

• it is [adjective] to-infinitive

• it is [adjective] that

Cambridge Corpus of Academic English (CCAE). About 400 million words of published academic written texts.

• it is [adjective] to-infinitive: 48,170

• it is [adjective] that: 24,115

it is [adjective] to-infinitive

crucial difficult helpful important necessary possible safe straightforward

> 4000 times          < 500 times
possible 7784         straightforward 481
important 5019        crucial 282
difficult 4345        helpful 255
necessary 4103        safe 194

it is [adjective] that

clear interesting likely notable possible significant surprising true

> 1000 times          < 300 times
clear 5284            significant 257
possible 4116         surprising 251
likely 2561           interesting 235
true 1170             notable 206
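
Counts of this kind can be approximated with regular expressions. The sketch below is an assumption about one possible method, not the query used on the CCAE: it restricts the adjective slot to the words listed on the slides, and the name `count_patterns` and the sample sentences are invented for illustration.

```python
import re
from collections import Counter

# Adjectives taken from the slides; a fuller study would use a POS-tagged corpus.
ADJECTIVES = ["clear", "crucial", "difficult", "helpful", "important",
              "interesting", "likely", "necessary", "notable", "possible",
              "safe", "significant", "straightforward", "surprising", "true"]

ADJ = "|".join(ADJECTIVES)
# 'to' followed by any word: this also catches prepositional 'to', a known limitation.
TO_INF = re.compile(rf"\bit is ({ADJ}) to \w+", re.IGNORECASE)
THAT = re.compile(rf"\bit is ({ADJ}) that\b", re.IGNORECASE)

def count_patterns(text):
    """Count adjectives in the 'it is ADJ to ...' and 'it is ADJ that' frames."""
    to_inf = Counter(m.group(1).lower() for m in TO_INF.finditer(text))
    that = Counter(m.group(1).lower() for m in THAT.finditer(text))
    return to_inf, that

# Hypothetical sample sentences, not corpus data.
sample = ("It is true that theory matters, but it is important to test it. "
          "While it is possible that costs rise, it is difficult to say.")
to_inf, that = count_patterns(sample)
print(to_inf)
print(that)
```

Without part-of-speech tagging the pattern cannot tell an infinitive from ‘to’ plus a noun phrase, so counts from a sketch like this would need checking against concordance lines.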

it is true that

• It is true that having a theoretical foundation for what one is doing in the classroom is important, but it is at least equally important to transform that knowledge into activities that are simple, appealing to the students, and successful.

• While it is true that national expenditure estimates are often larger than those of national income, this is not always the case.

What adverbs come before…

… similar but not … different?

… different but not … similar?

… similar or … different?

closely essentially radically rather reasonably roughly strikingly totally vastly

‘highly different’ and ‘almost similar’
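
Intuitions about which adverbs precede ‘similar’ and ‘different’ can be checked by counting the word immediately to the left of each adjective. This is a minimal sketch under stated assumptions: the function name `adverbs_before` is hypothetical, the candidate set is the adverb list from the slide, and the sample text is invented and is not evidence about real usage.

```python
import re
from collections import Counter

# Candidate adverbs from the slide.
ADVERBS = {"closely", "essentially", "radically", "rather", "reasonably",
           "roughly", "strikingly", "totally", "vastly"}

def adverbs_before(text, node):
    """Count candidate adverbs occurring immediately before `node`."""
    preceding = re.findall(rf"\b(\w+) {re.escape(node)}\b", text.lower())
    return Counter(word for word in preceding if word in ADVERBS)

# Invented sample text; a real check would run over a corpus such as BAWE.
sample = ("The two methods are roughly similar, but the results are radically "
          "different. The designs are essentially similar and strikingly "
          "similar in cost.")
print(adverbs_before(sample, "similar"))
print(adverbs_before(sample, "different"))
```

Run over learner and expert corpora separately, the same count would show whether combinations such as ‘highly different’ or ‘almost similar’ occur in published writing at all.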

From corpus research to teaching materials

From corpus research to teaching materials: ‘on the surface’

From corpus research to teaching materials: ‘below the surface’

Corpus analysis in EAP: the future

• In research…

• In (data-driven) learning…

• In teaching…

Cambridge Academic English

Open-access academic corpora include…

• Corpus of Contemporary American English (COCA) (Academic) 120 million words

• British Academic Written English (BAWE) 6.5 million words of good-standard student writing

• Michigan Corpus of Academic Spoken English (MICASE) 1.8 million words

• British Academic Spoken English (BASE) 1.6 million words

Teachers learning from corpora: discovering new information

Some nouns have a related adjective ending:

• -ic: base – basic (not basical)

• -ical: astrology – astrological (not astrologic)

• -ic or -ical: analysis – analytic/analytical

analytic 9,721        analytical 12,107

problematic 11,042    problematical 551

geographic 4,403      geographical 9,322

technologic 47        technological 8,750
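
The raw counts are easier to compare as proportions. The sketch below simply computes the share of the -ic form in each pair from the figures on the slide; the function name `ic_share` is an assumption, and the slide does not state which corpus the counts come from.

```python
# Counts copied from the slide: (-ic form, -ical form).
counts = {
    "analytic/analytical": (9721, 12107),
    "problematic/problematical": (11042, 551),
    "geographic/geographical": (4403, 9322),
    "technologic/technological": (47, 8750),
}

def ic_share(ic, ical):
    """Proportion of all occurrences of the pair that use the -ic form."""
    return ic / (ic + ical)

for pair, (ic, ical) in counts.items():
    print(f"{pair}: {ic_share(ic, ical):.1%} -ic")
```

Seen this way, ‘analytic/analytical’ is a genuinely mixed pair, while ‘problematic’ and ‘technological’ are each close to being the only form in use.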

ecological and geographical

(rather than ecological and geographic)

taxonomic and geographic

(rather than taxonomic and geographical)

-ic or -ical?

technologic or technological?