The Most Frequently Used English Phrasal Verbs in American and British English- A Multicorpus Examination

The Most Frequently Used EnglishPhrasal Verbs in American and BritishEnglish: A Multicorpus Examination

DILIN LIUUniversity of Alabama

Tuscaloosa, Alabama, United States

This study uses the Corpus of Contemporary American English and theBritish National Corpus as data and Biber, Johansson, Leech, Conrad,and Finegan’s (1999) and Gardner and Davies’ (2007) informativestudies as a starting point and reference. The study offers a cross-English variety and cross-register examination of the use of Englishphrasal verbs (PVs), one of the most difficult aspects of English forlearners of English as a foreign language or English as a secondlanguage. The study first identified the frequency and usage patterns ofthe most common PVs in the two corpora and then analyzed the resultsusing statistical procedures, the chi-square and dispersion tests, todetermine any significant cross-variety or -register differences. Besidesvalidating many of the findings of the two previous studies (althoughneither was a cross-English variety examination), the results of thisstudy provide new, useful information about the use of PVs. Inaddition, the study resulted in a comprehensive list of the mostcommon PVs in American and British English, one that complementsthose offered by the two previous studies with more necessary items andmore detailed usage information. The study also presents a cross-register list of the most frequent PVs, showing in which register(s) eachof the PVs is primarily used. Finally, pedagogical and researchimplications are discussed.doi: 10.5054/tq.2011.247707

B ecause of their extremely high frequency in the English languageand the great difficulty they present to language learners, phrasal

verbs (PVs) have long been a subject of interest and importance inEnglish as a foreign language (EFL) or English as a second language(ESL) teaching and research, as evidenced by the many publications onthe topic (Bolinger, 1971; Cordon & Kelly, 2002; Darwin & Gray, 1999;Gardner & Davies, 2007; Liao & Fukuya, 2004; McCarthy & O’Dell, 2004;Side, 1990; Wyss, 2003). The unique challenge for teaching PVs is that,although PVs are ubiquitous in the English language, EFL or ESL

TESOL QUARTERLY Vol. 45, No. 4, December 2011 661

speakers, especially those with a lower and intermediate level ofproficiency, consistently avoid using them (Dagut & Laufer, 1985;Hulstijn & Marchena, 1989; Laufer & Eliasson, 1993; Liao & Fukuya,2004). The reasons for this avoidance are many, including cross-linguistic differences and the complexity of syntactic and semanticstructures of PVs (Dagut & Laufer, 1985; Hulstijn & Marchena, 1989;Laufer & Eliasson, 1993). The enormous number of PVs in English alsocontributes to the problem, because it makes learners feel overwhelmed,not knowing which ones to learn. Thus identifying the most useful PVs isparamount for language learning purposes. Although the answer to thequestion of which PVs are useful may vary depending on learners’objectives and learning contexts, frequency is usually a good criterionfor determining usefulness. This is because, in general, highly frequentPVs are more useful than those with very low frequency. There havebeen two corpus-based frequency studies of English PVs (Biber,Johansson, Leech, Conrad, & Finegan, 1999; Gardner & Davies, 2007),and both have provided valuable information about PVs and theirdistribution patterns. Yet, there are important limitations in each of thetwo studies. It is important, however, to point out that the limitations arenot due to any oversight on the part of the scholars who did the studiesbut simply the result of their specific foci and space constraints.

Being a small section of a comprehensive book on English grammar,Biber et al.’s (1999) treatment of PVs is limited largely to a small set ofPVs (31 in total). Gardner and Davies’ (2007) work, though coveringmany more PVs than Biber et al.’s work, has three limitations of its own.First, their list of the most frequent PVs (a total of 100 items) containsonly PVs made up of the top 20 PV-producing lexical verbs (e.g., come, go,get, and take). In other words, the list does not include highly frequentPVs formed by verbs outside the top 20 PV-producing ones (e.g., keep upis not on the list because keep is not one of the top 20 PV-producingverbs). As a result, their study, although offering new insights about PVs(e.g., a very small group of lexical verbs make up a majority of PVs), doesnot provide a thorough account of the most frequent PVs. Second, withthe British National Corpus (BNC) as the data source, their study dealsexclusively with British English. It remains an interesting questionwhether their findings are also true of any other major varieties ofEnglish. In fact, in their conclusion, Gardner and Davies themselvesexplicitly called for the need to test the validity of their list ‘‘against othermegacorpora’’ (p. 354). Third, limited by space, their study did notrender a cross-register examination of the frequently used PVs. Suchcross-register information is, however, very important for languagelearning purposes, because it indicates the contexts where specific PVsare and are not typical. Gardner and Davies also explicitly recommended‘‘a reanalysis of the [PV] lists across major registers (e.g., spoken versus

662 TESOL QUARTERLY

written English)’’ (p. 354). In order to help fill in the aforementionedinformation gaps about PVs, the present study aims to offer acomparative investigation of the most frequently used PVs betweenAmerican and British English and an examination of the usageinformation of these frequently used PVs across registers in AmericanEnglish.

DEFINITION OF PHRASAL VERB

For any study of PVs, the definition of PV is often the first order ofbusiness. Yet, what constitutes a PV and how to classify PVs have longbeen topics of debate. Many different theories have been proposed, andthey differ largely over what syntactic and semantic features define a PVand how such features should be used to classify PVs (Biber et al., 1999;Celce-Murcia & Larsen-Freeman, 1999; Darwin & Gray, 1999; Gardner &Davies, 2007; Quirk, Greenbaum, Leech, & Svartvik, 1985). However,many of the differences among the theories are quite minuscule,especially from a language learner’s perspective. As Gardner and Davies(2007, p. 341) correctly note, ‘‘if even the linguists and grammariansstruggle with nuances of PV definitions, of what instructional value couldsuch distinctions be for the average second language learner?’’Furthermore, because of the purposes of the present study, there islittle need and room for a lengthy review of the various definitions thathave been proposed so far. This study had two main purposes: (1) toexamine in the Corpus of Contemporary American English (COCA) thefrequencies of the most common PVs and to compare the results withthose reported in Biber et al. (1999) and Gardner and Davies (2007);and (2) to conduct a cross-register distribution analysis of the PVs inCOCA and to compare the results with those of the study by Biber et al.

In order to ensure a meaningful comparison between the findings ofthis study and those of the other two, this study uses Gardner and Davies’(2007) definition of VP: any two-part verb ‘‘consisting of a lexical verb(LV) proper . . . followed by an adverbial particle (tagged as AVP) that iseither contiguous (adjacent) to that verb or noncontiguous (i.e.,separated by one or more intervening words)’’ (p. 341). The reasonfor using Gardner and Davies’ definition rather than Biber et al.’s istwofold. First, it is simpler, because it involves only one syntacticcriterion: ‘‘a verb plus an AVP.’’ In contrast, Biber et al.’s definitionincludes an additional semantic component: PVs must ‘‘have meaningsbeyond the separate meanings of the two parts [i.e., the verb and theAVP]’’ as in the case of ‘‘come on, shut up . . .’’ whereas verb + AVPcombinations in which ‘‘the verb and the adverb have their ownmeanings’’ are ‘‘free combinations like come back, come down . . .’’ (Biber

PHRASAL VERBS IN AMERICAN AND BRITISH ENGLISH 663

et al., 1999, p. 404). The application of this semantic criterion is notalways straightforward and often involves some subjective judgments. Ofcourse, Gardner and Davies’ syntactic criterion is not always simpleeither, because whether a verb particle should be classified as an AVP,regular adverb, or preposition is sometimes open to debate, an issue Iaddress later. The second reason for using Gardner and Davies’definition is that, as is shown next, a majority of the most frequentPVs examined in this study came from Gardner and Davies’ study.

METHOD

Corpora Used

As mentioned earlier, the main corpus used for this study was COCA,a large free online corpus developed by Professor Mark Davies ofBrigham Young University. When this study was conducted, COCAconsisted of 386.89 million words via data gathered from 1990 to 2008,that is, an average of approximately 20 million words from each of the 19years. The corpus contains five subcorpora: spoken, fiction, magazine,newspaper, and academic writing, with each subcorpus contributing anequal amount of data (4 million words per subcorpus per year). Thecorpus is also user friendly. Its search engine allows the user to perform,among other things, the search and comparison of ‘‘the frequency ofwords, phrases and grammatical constructions’’ (Davies, 2008). BesidesCOCA, the 100.47-million-word BNC was also used both indirectly anddirectly: The frequency results of the 100 most common PVs in the BNCreported in Gardner and Davies’ study were compared with the PVs’frequencies in COCA, and I queried the BNC directly through Davies’(2005) BYU interface for the frequency information of the other PVsthat are not on Gardener and Davies’ list of the 100 most frequent PVs.Furthermore, because the results of Biber et al.’s study were also used forcomparison in this study, the corpus they used, the 40-million-wordLongman Spoken and Written English (LSWE) corpus, was alsoindirectly used in this study.

To help the reader better understand the cross-corpora comparisonsto be rendered in the Findings and Discussion section, some relevantbackground information about the LSWE and the BNC is given here.The spoken part of the LSWE consists primarily of face-to-faceconversation (see Biber et al., 1999, p. 29–30). Similarly, a very largeportion of the spoken subcorpus of the BNC is composed of suchconversations. In fact, the British English portion of the LSWE isincluded in the BNC. In contrast, the spoken part of COCA consistsmostly of TV or radio broadcasting speech.

664 TESOL QUARTERLY

Data Gathering and Data Reporting or Analysis Methods

Querying for the frequency of a PV is a challenging task. One cannotaccomplish the search by simply entering the lexical verb lemma of a PV inthe form of [verb] plus its particle (e.g., ‘‘[go] on’’), because not every oneof the tokens generated by such a search is a phrasal verb. For example,the ‘‘[go] on’’ entry may yield non-PV tokens such as ‘‘We typically go onMondays’’ where ‘‘on’’ is a preposition in the time adverbial phrase ‘‘onMondays,’’ not an adverbial particle (AVP) of go. (The lemma searchfunction helps generate the tokens of the various forms of the verb, e.g.,go/goes/going/went/gone for the lemma go.) Thus, to ensure an accuratecount of all the tokens of a PV, sophisticated query methods are called for.One such method is found in Gardner and Davies’ (2007) study. Theyimported the entire tagged BNC data set into the Microsoft SQL server, arelational data program that can help identify all the instances of PVs.This method was not used in this study, however, because COCA does notmake its entire tagged data set accessible to the public. Instead, this studyemployed basically a four-step procedure using the existing searchfunctions in COCA’s interface. This procedure, though more laborintensive, proved to be functional and fundamentally accurate.

The first step was the search for all the PV tokens of a lexical lemma. Thiswas done by entering the verb lemma in the form of [verb] plus [RP*] (RPis the search code for AVPs in COCA and the wildcard * stands for anyAVPs). For example, for all the PV tokens of the lexical verb lemma [go],‘‘[go] [RP*]’’ is entered. The query will generate all the ‘‘go plus AVP’’ PVtokens, including go on, go off, and so forth. The second step was a search ofthe tokens of transitive PVs used with their AVPs separated by oneintervening word. This was carried out by entering for search ‘‘[verb] *[RP*], with the wildcard * between the verb and the AVP standing for anyintervening word. The third step was the search of the tokens of separablePVs with two intervening words (e.g., look the word up). This task wasperformed by entering ‘‘[verb] * * [RP*]’’. No search was done, however,for instances of PVs with their AVPs separated by three or more interveningwords. This is because PVs so used are rare, and a search for them will yield‘‘many false PVs’’ (Gardner & Davies, 2007, pp. 344–345). Furthermore,Gardner and Davies did not include such tokens, making it necessary toexclude them in this study to ensure a meaningful comparison. In steps 2and 3, I read through the result lines to exclude any false tokens. All theaforementioned searches were performed with the cross-section compar-ison search function in COCA activated so that the search results includedthe PVs’ frequency distribution in each of the five registers. The last step wasthe recording and tabulation of the query results, using Excel spreadsheets.For each PV, the frequency results of its various forms in the five registerswere entered, and the subtotal and total frequencies were computed.


As far as the frequency counting or reporting method is concerned, rawfrequency numbers cannot be used for comparison purposes, because ofthe large differences in size among the corpora used in the study. Instead,a number of tokens per number of words norming method must be employed.For examining data in large corpora, researchers typically use the numberof tokens per million words (PMWs) method (cf. Biber, Conrad, &Reppen, 1998; Biber et al., 1999; Liu, 2003, 2008; Moon, 1998). Further-more, given that this method was already used in Biber et al. (1999), it wasadopted for this study for the reporting of most of the data. However, inthe statistical analysis (i.e., the chi-square and the dispersion tests) of theresults to determine whether there were significant differences among thePVs’ distributions, I used only raw observed frequencies, because normal-ized data are inappropriate for such statistical tests.

PVs Examined

In order to render a comparison of the results of this study with thoseof Biber et al. (1999) and Gardner and Davies (2007), I queried COCAfor the frequency of all the PVs in their lists. There were a total of 31 PVsin Biber et al. Each had at least 40 tokens PMWs in at least one register ofthe LSWE. Gardner and Davies’ list consists of 100 items made up of thetop 20 PV-producing verbs. Twenty seven of the 31 in Biber et al.’s listoverlap, however, with those in Gardner and Davies’ list. In other words,only 4 of Biber et al’s 31 PVs are not in Gardner and Davies’ list. Of thesefour, one is go ahead. It is not in Gardner and Davies’ list because ahead isnot tagged as an AVP in the BNC (or in COCA), but rather it is tagged asa regular adverb. The other three PVs not on Gardner and Davies’ listare shut up, stand up, and run out because run, shut, and stand are notamong the top 20 PV-producing lexical verbs that Gardner and Daviesidentified. Because of the overlapping of 27 items, the total number ofPVs from Biber et al.’s and Gardner and Davies’ studies was 104, not 131.

Besides searching these 104 PVs in COCA, I also queried the COCA andthe BNC for the other most common PVs. To do so, I used the four mostrecent comprehensive PV dictionaries as a search list guide: CambridgeInternational Dictionary of Phrasal Verbs (1997), with over 4,500 entries;Longman Phrasal Verbs Dictionary (2000), with over 5,000 PVs; NTC’sDictionary of Phrasal Verbs and Other Idiomatic Verbal Phrases compiled bySpears (1993), with 7,634 entries; and Oxford Phrasal Verbs Dictionary forLearners of English (2001), with over 6,000 entries. I searched a total of 8,847PVs, 5,933 of which were from the dictionaries, whereas 2,914 were not.The latter were not searched intentionally but were the by-product of myquery method [verb] [RP*] which would automatically return all the PVsof the verb being queried, including those not in the dictionaries. For

666 TESOL QUARTERLY

example, my queries [drive] [RP*] returned not only the intended PVsfrom the dictionaries, for example, drive away/up/down/off, but also thosenot listed in the dictionaries, for example, drive about/along/by/round.Considering the large number of PVs listed in each of the four dictionaries,one may wonder why only 5,933 PVs were queried. The reasons were (1)many of the entries in the dictionaries overlap, and (2) the dictionariesinclude verb + preposition structures (e.g., abide by and accede to) that arenot considered PVs relative to the definition used in this study.

According to Gardner and Davies’ (2007) search, there are a total of12,508 PV lemmas in the BNC. This means that my query of 8,847 left3,661 PVs unsearched. This should not, however, be a concern for thefollowing reasons. First, the purpose of my study was to identify the mostfrequently used PVs, and the criterion for inclusion in my list was 10tokens PMWs. As the immediately following discussion shows, only 152out of the 8,847 made the list. Most PVs simply do not have the requiredfrequency. Second, my search covered all the lexical verb lemmas that hada total of 1,000 tokens in the BNC or 3,869 in COCA, because this was theminimum number that would give the verbs the potential for yielding therequired number of PV tokens to make the most common PV list. Finally,because of tagging errors, not all of the 12,508 PV lemmas are PVs.

As already stated, the criterion for a PV to make the most frequentlyused list in this study was 10 tokens PMWs in either COCA or the BNC.The rationale for using this criterion was threefold. First, 73 (70%) ofthe 104 PVs on the Biber et al. and Gardner and Davies’ combined listeach have 10 tokens or more PMWs; only 31 on Gardner and Davies’ listeach show a frequency fewer than 10 PMWs. Second, in order to be trulymeaningful, a list of the most frequently used PVs should not be toolong. Third, as Gardner and Davies (2007) reported, the 100 frequentlyused PVs they identified already ‘‘account for more than half (51.4%) ofall the PV occurrences in the BNC’’ (p. 351).1 Using this ten-tokenPMWs criterion, my search identified 48 additional most frequently-usedPVs. The search results also showed that these 48 PVs and the four from

1 It is necessary to note that there is an error in the frequency number of a PV in Gardnerand Davies’ data that has an implication for the total numbers they reported. In their 100most common PV list, carry out is ranked as the 2nd most frequent PV, boasting a frequencyof 10,798. This frequency number is unusually high and incorrect, based on my search andconsultation with Mark Davies, one of the authors of the Gardner and Davies article. Thecorrect number is 4,180, which means that their reported frequency of this PV is 6,618tokens over the actual frequency. This should also have resulted in an inflation of the totalPV occurrences in the BNC by 6,618. Thus, with the 6,618 removed from both the tokennumbers of the 100 PVs (266,926 2 6,618) and the total token numbers of all the PVoccurrences in the BNC (518,283 2 6,618), the tokens of the 100 PVs (260,168) shouldaccount for 50.78%, instead of the 51.7%, of the total PV tokens (512,305) in the BNC.These adjusted correct numbers are used in the discussion in the remainder of the article.Also, in the appendix, the frequency number and order of carry out in the BNC list isadjusted accordingly (from 2nd to 24th).


Biber et al. that are not on Gardner and Davies’ list together account foranother 12.17% of all the PV occurrences in the BNC. This means thatthe 152 most frequently-used PVs compiled in this study, whilecomprising only 1.2% of the total 12,508 PV lemmas in the BNC, cover62.95% of all the total 512,305 PV occurrences. This helps demonstratethe representativeness and hence the usefulness of these most-frequentlyused PVs. Of course, there are several limitations that should beconsidered when using this list for learning/teaching purposes, such asthe fact that it is a lemmatized list and that many of the PVs have multi-meanings, two very important issues I will address in the next section.

FINDINGS AND DISCUSSION

Most Frequently Used Phrasal Verbs: American English VersusBritish English

This study has uncovered the frequency information of 152 PVs,including the 100 from the Gardner and Davies list, the four from Biberet al. that are not in Gardner and Davies’ list,2 and the 48 additionalmost frequent PVs this study has identified. The frequency informationis reported in a table format in the appendix, with the PVs listed in orderof their frequency in COCA. To allow for an easy comparison of the PVs’frequency in COCA with their frequency in the BNC, their frequencyand rank order information in the BNC is also provided (in the secondand third columns from the right). It is necessary to note that the totalnumber of PVs in the appendix is 150, not 152, because I combined thePVs in each of the following two related pairs that were reported asindividual PVs in Gardner and Davies’ study (2007): look around and lookround; turn around and turn round. Gardner and Davies also have comeround and go round on their list but not come around and go around, giventhat the latter forms are the dominant uses in American English, I haveincluded and combined them with the former in this study. The reasonfor combining the two forms in each pair is that they are synonymousand that they represent mainly a usage variation between American andBritish English, an issue that is discussed later.

Before proceeding to a detailed comparison of the PVs’ frequency andusage patterns in the two corpora, I briefly discuss how some of the resultsof this study support Biber et al.’s (1999) and Gardner and Davies’ (2007)findings about an interesting aspect of PVs: A relatively small number oflexical verbs and AVPs form the majority of the PVs in English. Biber et al.

2 One of them is go ahead. Even though it is not tagged as a PV, as mentioned earlier, I haveincluded it not only because Biber et al. (1999) did but also because I believe ahead isactually an AVP for the verb go, making the phrase a true PV.

668 TESOL QUARTERLY

identified eight verbs and six adverbs as the most productive in formingPVs. Gardner and Davies identified the top 20 PV-producing verbs and thefour most ‘‘prolific’’ AVPs that help form for more than half (53.7%) of allthe PVs in the BNC (2007, p. 347).3 The same pattern is found in thelexical verbs and the AVPs in the 52 additional most frequent PVs (48identified in this study and 4 from Biber et al.). For example, out and upare each the AVPs in 19 of the 52 PVs, that is, they combine for the AVPSof 38 (73.08%) of the 52 PVs. Concerning the verbs in these 52 PVs, it isimportant to first recall that all of them are outside the top 20 PV-producing lexical verbs. Yet even these less productive verbs show someconcentrated use in PVs. One of them (hang) appears in three of the 52PVs, and five (fill, keep, pull, show, stand) each appear in two.

To compare the PVs’ frequency distribution patterns in the twocorpora, it is necessary to note that the data of the two corpora do notcome from the same time period. Although the BNC covers the 1980s to1993, COCA extends from 1990 to the present, that is, COCA startsbasically where the BNC ends. This difference in time periods could beresponsible for some of the PV usage variations between the twocorpora, which is discussed later.4 To compare the general frequencypatterns of the PVs in the two corpora and to determine whether there isany significant difference calls for a chi-square test of the raw observedfrequencies. Given the large difference in size between the two corpora,a one-way chi-square test of the observed frequencies of the PVs from thetwo corpora would not make sense. To account for the effect of thedifference in corpus size, I opted for a two-way chi-square test with thetotal observed frequencies of the 150 PVs measured against the totalnumber of words of their respective corpora minus the total number oftokens of the 150 PVs. In this way, the problem of difference in corpussize was controlled, allowing the chi-square test to determine whetherthe relative frequency of the PVs was statistically equal in both corpora.The results are reported in Table 1 where I also include at the bottomthe PVs’ frequencies PMWs in the two corpora for easier comparison.

A close look at the test results indicates that, although there is asignificant difference between the frequencies of the PVs in the two

3 Although most of the top PV-producing verbs and AVPs identified by Biber et al. (1999)overlap with Gardner and Davies’ (2007), the rank orders of the items between the two listsdiffer. For example, whereas take and get are first and second on the Biber et al. list, go andcome are the first two on Gardner and Davies’ list (also my COCA list). The differenceappears to have resulted from the different definitions of PV used. As mentioned earlier,Biber et al.’s definition involves a semantic criterion, which excludes verb + adverbcombinations where verb and AVP hold separate instead of combined meanings. ThusBiber et al. excluded many of the highly frequent PVs formed by come and go (e.g., go backand come in) listed in Gardner and Davies (2007).

4 I owe this idea to an anonymous reviewer, who suggested that the increased use of certainPVs over the past 20 years in COCA may explain their higher frequencies in COCA than inthe BNC.


corpora, the difference is actually minuscule, as evidenced by the verysmall effect size, a Cramer’s V of only 0.0032, and also by the percentagesof deviations (PDs) of the observed frequencies from the expectedfrequencies, with the frequency in COCA being merely 2.7% higher thanexpected and the frequency in the BNC being only 10.5% lower thanexpected. The effect size is extremely important for statistical analysis incorpus research, because, as Gries (2010, p. 286) explained, ‘‘the largesample sizes that many contemporary corpora provide basicallyguarantee that even minuscule effects will be highly significant.’’ Thusthe significant difference shown by the chi-square test is very likely theresult of the large size of the two corpora. Furthermore, a comparison ofall the individual PVs’ frequency rank order in COCA against their rankorder in the BNC (the results reported in the last column of theappendix) indicates that the PVs’ frequency rank orders in the twocorpora are fairly similar. For example, for each of the following five PVs,its frequency orders in both corpora are identical: go on 1st, come in 14th,get back 19th, bring back 44th, and turn down 94th. (Incidentally, go on isalso the most frequent one in Biber et al.’s study.) Eight out of the top10 PVs in the COCA list also make the top 10 in the BNC list. Forty-six(30.67%) of the 150 PVs show only a single digit difference betweentheir rank orders in the two corpora (e.g., pick up ranks 2nd in COCAand 3rd in the BNC, a rank difference of 1).5 Thirty-seven (24.67%)record a rank order difference between 10 and 19. However, 67(44.67%) display a rank order difference of 20 or above, an issue I returnto later.

5 This rank difference number can be interpreted to mean, depending on one’s perspective,either that the frequency of pick up in COCA is one rank higher (i.e., +1) than its frequencyin the BNC, or its frequency in the BNC is one rank lower (21) than its frequency inCOCA. To make the reporting of this rank order comparison simpler, no +/– sign is used.

TABLE 1

Comparison of the Most Common PVs’ Overall Frequency Patterns in COCA and the BNC

COCA BNC df Chi-square (x2) P Cramer’s V

Total observedfrequency of the150 PVs

1,424,836(+2.7%)*

322,517(210.5%)*

1 4,988.65 0.0001 0.0032

Total number ofwords minus the150 PVs’ totaltokens

385,465,164 100,147,483 1 4,988.65 0.0001 0.0032

Frequency PMWsof the 150 PVs

3,682.79 3,210.09

Note. COCA 5 Corpus of Contemporary American English; BNC 5 British National Corpus; PVs5 phrasal verbs. *Percentage that the observed frequency deviated from the expected frequency.

670 TESOL QUARTERLY

Given the different time periods the BNC and COCA each cover, theabsence of a truly large difference in PV use between the two corpora maysuggest that PV use has remained fairly stable. This fact may in turn implythat the list of the most frequently used PVs produced in this study maywithstand the test of time. In that case, What about the differences in PVuses found between the two corpora, especially the rank disparity of 20 ormore found in 67 of the PVs? What might be the cause(s) for thedifferences? To answer these questions, we should first understand howand to what extent the frequencies and uses of these PVs in the twocorpora differ. A close examination reveals that, although the differencesof their rank orders between the two corpora offer some interestinginformation, the difference between a PV’s frequencies (numbers oftokens) in the two corpora is a much more informative indicator. Forexample, the difference between the rank orders of come up in the twocorpora is only five (4th in COCA and 9th in the BNC), but its frequencydifference in the two corpora is 55.45 PMWs. In contrast, set off has a rankorder difference of 49 but a frequency difference of only 6.81 PMWs.Therefore, I decided to use frequency as the main criterion to examinethe individual PVs’ distribution differences in the two corpora.

Specifically, I tested for any significant difference between the rawfrequencies of those PVs whose frequencies in the two corpora varied by10 or more PMWs. There were a total of 39 such PVs. Given that the twocorpora differ tremendously in size, I conducted a two-way chi-squaretest employing exactly the same method used for testing the totalfrequency difference of the 150 PVs in the two corpora reported earlierin Table 1. Because of the large size of the corpora, the chi-square resultsfor the 39 PVs were all significant, but their Cramer’s Vs were very small,ranging from 0.0006 to 0.0019. In order to have a shorter and morefocused list of PVs which show a truly noticeable difference in theirdistributions between American and British English, I excluded from thelist those PVs with Cramer’s Vs lower than 0.001. This resulted in a list of30 PVs. Twenty are significantly more common in American English:check out, come out, come up, figure out, get out, go ahead, grow up, hang out,hold up, lay out, pick up, pull out, show up, shut down, take off, end up, turnout, take on, turn a/round, and wake up (cf. the appendix table for theirrank orders or frequencies in the two corpora). Ten appear significantlymore frequently in British English: build up, carry on, fill in, get on, set out,set up, sort out, take over, take up, and turn up. Although the reasons forsome of the PVs’ prominent use in one of the two English varieties aredifficult to determine, the causes for some can be attributed to eitherusage differences between the two varieties of English or the increase ofuse in American English, for, as mentioned earlier, COCA starts wherethe BNC ends in terms of the time periods covered.


Regarding usage differences, an examination of some of the tokens ofthe PVs confirms the following information indicated by some of the PVdictionaries. The significantly larger number of tokens of fill in in theBNC appears related to the fact that British English typically uses fill in in‘‘fill in or fill something in a form/document,’’ whereas American Englishgenerally uses fill out in such cases. The quadrupled use of check out inCOCA compared to that in the BNC is the result of the multiplefunctions or meanings of the PV in American English that are not foundin British English, such as its meaning ‘‘paying for things’’ at a store and‘‘borrowing items from a library.’’ Furthermore, the far less frequent useof shut down in the BNC is mostly due to the fact that, in British English,shut up is often used to express the meaning of ‘‘closing a businesstemporarily,’’ a meaning almost always expressed by shut down inAmerican English. This fact also helps explain the lower frequency andrank order of shut up in COCA.

Another noticeable use difference between American and BritishEnglish, as mentioned earlier, relates to the use of around/round in thePVs such as come around/round, go around/round, look around/round, andturn around/round. The distribution of around/round in these PVs inthe two corpora is reported in Table 2. The results demonstrate that,although it is true Americans prefer around and British speakers favorround, Americans’ preference for around over round is much strongerthan the British preference for round over around. The American useof around is more than 90% of the time in each of the four PVs,whereas the British use of round is in general much less than 90% ofthe time.

Concerning frequency differences likely caused by the increased useof certain PVs in American English, a query of COCA indicated that checkout, hang out, show up, and come up each show a noticeable increase innumber of tokens from 1990–1994 to 2005–2009. Check out increased by102%, hang out by 52%, show up by 25%, and come up by 23%. Suchsubstantial increases of the PVs in COCA may help explain their higherfrequencies in COCA than in the BNC. Yet, because we do not have theBritish English data after the early 1990s, we cannot be certain whetherthe same increases would have also occurred in British English. In short,the analysis of the PVs’ frequency patterns indicates that, although theirgeneral distribution patterns are very similar in both corpora, there aresome differences concerning some specific PVs because of (1) usagedifferences between American and British English and/or (2) increaseduse in American English. Knowledge of these differences is useful toEnglish language educators when deciding which PVs should be taughtand learned in which English variety.

672 TESOL QUARTERLY

Cross-Register Differences in the Use of PVs

To determine whether there is a significant difference in the overall rawfrequency distributions of the 150 PVs among the five registers in COCA, Iconducted a one-way chi-square test and a dispersion/adjusted frequencytest using Gries’ (2008b) Dispersions2 program. This dispersion testyields, in addition to a series of adjusted frequencies, a deviation ofproportion (DP) score, which theoretically can range from 0 to 1, butsometimes the number of parts of the corpus and other factors mayprevent it from reaching the maximal value of 1. To address this problem,the test also gives a normalized DP score, shown as DPnorm, which is able todisplay the maximal value. The values of DP near 0 suggest that thefrequencies of a linguistic item are distributed in proportion to the sizes ofthe corpus registers or parts, whereas high values, especially those near 1,signify that the frequencies of the linguistic item are distributed veryunevenly across the registers. An adjusted frequency is a downwardlyadjusted total frequency in proportion to the degree of the unevenness ofthe distribution of the linguistic item. The results of both the chi-squareand the dispersion tests are reported in Table 3. Besides the rawfrequencies of the PVs, I have also reported the frequencies PMWs so theresults can be compared with those of the Biber et al. (1999) study. Theresult of the chi-square test is very significant, with p , 0.0001, but thedeviations of the PVs across the registers are not particularly highaccording to Gries’s DP (0.214; 0.268norm). It is important to note,however, that the specific percentage deviations of the observedfrequencies of the PVs from the expected are fairly high: Whereas theobserved frequencies in the spoken and fiction registers are, respectively,44.34% and 66.12% higher than the expected, those in the magazine,newspaper, and academic registers are 18.36%, 21.02%, and 66.86% lowerthan the expected, respectively.

TABLE 2

Distribution of the PV Particles Around/Round in COCA and the BNC

PV COCA BNC

Verb AVP Tokens PMWs Percentage Tokens PMWs Percentage

come around 6.11 94 1.40 11round 0.39 6 11.02 89

go around 9.37 93 4.36 24round 0.70 7 13.60 76

look around 20.54 99 7.76 53round 0.21 1 6.91 47

turn around 26.82 98 4.21 27round 0.55 2 11.41 73

Note. AVP 5 adverbial particle; PMWs 5 per million words.


TA

BL

E3

Mo

stF

req

uen

tP

Vs

Dis

trib

uti

on

Acr

oss

the

Reg

iste

rsan

dth

eR

esu

lts

of

aO

ne-

Way

Ch

i-S

qu

are

Tes

tan

dG

ries

’sD

isp

ersi

on

Tes

t

Spo

ken

Fic

tio

nM

agaz

ine

New

spap

erA

cad

emic

To

tal

dfC

hi-s

quar

ex

2p

Gri

es’s

DP

Size

(mil

lio

n)

78.8

274

.88

80.6

676

.33

76.2

038

6.89

Raw

freq

uen

cyo

fP

Vs

411,

326

(+44

.34%

)*44

9,72

0(+

66.1

2%)*

244,

270

(218

.36%

)*22

5,07

9(2

21.0

2%)*

94,4

41(2

66.8

6)*

1,42

4,83

6(1

,339

,479

)**

432

4,44

5.88

0.00

010.

214

0.26

8 nor

m

Fre

qu

ency

PM

Ws

5,21

8.55

6,00

5.88

3,02

8.39

2,94

8.76

1,23

9.38

3,68

3.74

432

4,44

5.88

0.00

010.

214

0.26

8 nor

m

Not

e.*T

he

per

cen

tage

the

ob

serv

edfr

equ

ency

dev

iate

dfr

om

the

exp

ecte

dfr

equ

ency

.**

Ro

sen

gren

’sad

just

edfr

equ

ency

pro

du

ced

by

Gri

es’s

Dis

per

sio

ns2

test

.

674 TESOL QUARTERLY

It is thus clear from the test results that the PVs are much morecommon in fiction and spoken English than in magazines, newspapers,and, especially, academic writing. The results support the conclusion ofBiber et al. (1999, p. 408) on the issue: ‘‘Overall, phrasal verbs are usedmost commonly in fiction and conversation; they are rare in academicprose. In fiction and conversation, phrasal verbs occur almost 2,000times per million words.’’ The only difference between the finding ofBiber et al. and my finding is the rate of occurrence. Although the PVfrequency in fiction and conversation in their study is ‘‘almost 2,000 permillion words,’’ the rates in the two registers found in this study arealmost three times that. The reason for this large difference betweentheir number and mine appears, again, to be the narrower definition ofPV used in their study, an issue explained earlier (Footnote 3). One canattribute the difference to this reason quite confidently, because thefrequency numbers in my study and in Gardner and Davies’ (2007) studyare quite comparable, and our two studies used the same definition. InGardner and Davies’ study, the frequency of the top 100 PVs in the BNCis 278,780 or 2,788 PMWs (p. 349), a number that would have been evenhigher if it had included those of the 50 PVs included in my study.Furthermore, given that the 2,788 PMWs frequency is the average thatincluded the much lower frequencies in the newspaper and academicwriting registers, one can certainly expect the numbers in their spokenand fiction registers to go much higher than 2,788 PMWs.

Although the overall cross-register analysis provides informationabout the PVs’ general distribution patterns, it does not offerinformation about the behavioral patterns of the individual PVs,especially those that actually occur more often in the registers otherthan in fiction or speech. Such information is very useful for languagelearning. Therefore, I conducted an analysis of each individual PV’s rawobserved frequencies across the five registers, using both a one-way chi-square test and Gries’ (2008b) Dispersions2 test. Although the chi-square test reveals that every one of the 150 PVs showed a significantdifference (p , 0.001) in its frequencies among the five registers, thedispersion test shows their DPnorm values vary substantially, ranging from0.045 (in the case of make up) to 0.74 (in the case of look up).6 This meansthat, of the 150 PVs, make up is distributed most evenly across theregisters, whereas look up is distributed most unevenly. Based on theirDPs, I divided the PVs into three groups: (1) fairly evenly distributed,with a DPnorm below 0.25 (65 of them or 43.33%), (2) not evenlydistributed, with a PDnorm between 0.25 and 0.499 (68 or 45.33%), and(3) very unevenly distributed, with a PDnorm of 0.5 or above (17 or11.33%). The latter two types combine for 85 (56.67%). Each PV’s

6 DPnorm, instead of DP, is opted for because of its ability to show the maximal value.


dispersion classification is shown in the appendix with a superscriptnumber 1, 2, or 3 after the PV.

Classifying PVs by dispersion pattern is very useful for languagelearning because, as Greis (2008a) showed in his literature review, thedistributional range of lexical items has a great impact on secondlanguage (L2) learners’ processing and learning of them. Items thatboast a wider and more even distributional range are processed fasterthan those that have a narrow one, and hence they should be of higherpriority for L2 learners. However, this does not mean one can overlookthose unevenly distributed PVs in language learning. In fact, the latterPVs occur mostly in one or two registers, and, as such, they are actuallyvery important for English for specific purposes (ESP) learners, who,because of their specific purpose of study, must focus on the register(s)in which these PVs appear most frequently. For example, carry out is anunevenly distributed PV because of its very high frequency in academicwriting. For students studying academic English, it should thus be veryhigh on their list of PVs to be learned. Before examining in detail thenoticeable distribution patterns of the 85 significantly unevenlydistributed PVs, it is important again to note that the PVs reportedhere are lemmatized and many of them are polysemous, just as most PVsare in general. The distribution of the different meanings of apolysemous PV may vary significantly across registers.

For example, make up can mean, among other things, compose orconstitute (e.g., ‘‘Women make up 22 percent of the rural labor force inNicaragua . . .’’); decide, when used in ‘‘make up one’s mind’’ (‘‘SecretaryPowell can make up his own mind’’); compensate (for) (e.g., ‘‘The kids makeup for their lack in experience with enthusiasm’’); and fabricate (‘‘Melaniemade up that story’’).7 I examined the meanings of the first 100 tokens ofthis PV in COCA’s spoken and academic registers. A 2 by 5 Chi-Square testof the meaning distributions (reported in Table 4) yielded a verysignificant result: x2(df 54) 5 104.52, p , 0.0001, with a Cramer’s V of0.7229, indicating a clear, significant difference between the semanticdistributions of make up in the two registers. Although the tokens in spokenEnglish show a fairly even division among the four meanings, the tokens inacademic writing mean mostly compose (79%). This finding suggests clearlythat the cross-register distribution of the different meanings of a PV is alsoimportant information. Unfortunately, because of lack of space, this studyis unable to offer a close examination of this important issue.

Furthermore, as lemmatized lexical items, the 150 PVs are listedwithout information regarding their uses in different tenses, for example,make up versus made up versus making up. This latter information is veryimportant for language learners or teachers when deciding which form to

7 All the examples here and in the following are from COCA.

676 TESOL QUARTERLY

focus on, because the dominant tenses in which specific PVs are usedsometimes differ substantially. For instance, in COCA, although turn out isused roughly 50% of the time in the past tense, go ahead appears 93% ofthe time in the present tense. Thus PVs like turn around may be good itemsfor instruction in teaching the past tense, whereas PVs like go ahead may bebest used for teaching the present tense. Again, for lack of space, thisstudy is unable to offer a detailed treatment of the tense distribution ofthe PVs. Clearly, although the lemmatized list of the 150 PVs is a usefulsource for learning the most common PVs in general, English learnersmay still need to seek further semantic or usage information of the PVswhen learning these PVs. There are some useful sources they can turn tofor help in this regard, including PV dictionaries and online sources likethe Wordnet Search (Miller, 2008).

Concerning the distribution patterns of the 85 significantly unevenly orvery unevenly distributed PVs, almost all of them appear primarily infiction and spoken English, the two registers that record the highestoverall use of PVs. Sixty of the 85 (70.59%) occur mostly in fiction and 22(25.88%) in the spoken registers. Only three (3.53%) appear significantlymore frequently in the other three registers (two in academic writing andone in newspapers). Because of their rarity, the latter three deserve ourattention first. The two PVs that occur mainly in academic writing are bringabout and carry out. Bring about is used so predominantly in academicwriting that its frequency in academic writing (27.44 PMWs) is many times(varying from 3 to 10 times) more than its frequencies in each of the otherfour registers. Also worth mentioning here is that point out, a fairly evenlydistributed PV, registers its highest frequency in academic writing as well.It is important to note that carry out and point out are also found to be usedmost frequently in academic writing in the study by Biber et al. (1999).The reason that bring about is not on their list seems to be that it does nothave at least 40 tokens PMWs in any of the registers, the criterion ofinclusion in their study. Biber et al.’s (1999) analysis also shows that takeon, take up, and set up are more common in academic writing than inconversation. However, in this study, only take up shows this patterntogether with set out, likely because of the data in the spoken register ofCOCA are mostly from TV or radio programs, not from conversations, aswas the case for Biber et al.’s spoken corpus data. Obviously, all these PVs

TABLE 4

Distribution of the Major Meanings of the First 100 Tokens of Make Up in the Spoken andAcademic Registers

Compensate Compose Decide Fabricate Other

Spoken 26 12 27 25 10Academic 18 79 1 2 0


deserve attention in academic writing teaching materials. In addition,fairly evenly distributed PVs that have a substantial frequency in academicwriting (e.g., break down, carry on, follow up, make up, rule out, and sum up)should also be considered.

The only significantly unevenly distributed PV that appears mostfrequently in newspapers is pay off, but there are several in the fairlyevenly distributed group that claim their highest frequency in thenewspaper register: grow up, take over, shut down, wind up, turn down, fillout, and come off. Most of these PVs are expressions used to describebusiness dealings. As such, it is understandable that they often find theirway into news. This finding helps illustrate the ‘‘field’’-specific nature ofthe use of some PVs, an issue Celce-Murcia and Larsen-Freeman (1999,p. 434) have addressed in some detail. Magazine is the only register inwhich none of the significantly unevenly distributed PVs is used mostfrequently. Yet, quite a few PVs (7) in the fairly evenly distributed groupeach record their highest frequency in this register, including break down,break up, build up, check out, set up, sum up, stand out, and take on. The factthat these PVs all come from the fairly evenly distributed group canperhaps be explained by the mixed nature of this register. Magazinearticles cover a variety of topics, and different magazines have differenttarget audiences, making their contents quite diverse.

The 60 significantly unevenly distributed PVs that occur most often infiction are a very large group, but a majority of them (over 40) aremovement or action expressions, for example, look a/round, look up, sitdown, stand up, and walk out. Because describing human actions constitutesa very large part of fiction, it is truly befitting of fiction to make an extensiveuse of these action PVs. It is again important to note that some of these PVsare polysemous (e.g., look up), but they are used mostly as movementdescriptions in fiction. Of the first 100 tokens of look up in the fictionregister, 98 are about upward vision or head movement, as in the example‘‘When Billy opened his eyes and looked up, all he could see out the windowswere stars.’’ Of course, not all action PVs appear most frequently in fiction.Some are more common in spoken English. For example, come down, go in,among others, are used most frequently in spoken English. In fact, themajority of the most frequent PVs in fiction also show a high degree offrequency in spoken English and vice versa, largely due to the fact that asubstantial portion of fiction is made up of dialogues. Still, some action PVsoccur almost exclusively in fiction, with just a few tokens (all in the singledigits) in the other registers, including call out, hang up, and sit back, andthey are largely mono-meaning, used for depicting actions. In contrast,polysemous PVs that can be used to either describe actions or expressother meanings are common in both spoken English and fiction.

Another PV that bears some discussion here is come on. In Biber et al.’s(1999) study, it is the most frequent PV in spoken English, but in COCA

678 TESOL QUARTERLY

and the BNC, go on is the most frequent spoken PV. What is particularlystriking about come on is that its frequency in spoken English in the LSWEcorpus (the corpus Biber et al. used) is over 300 PMWs; 266.97 PMWs in thespoken part of BNC; but only 83.67 PMWs in the spoken register of COCA.The most likely explanation for its extremely high frequency in the LSWEand the BNC is that, as pointed out earlier, the data in the spoken registerin the two corpora are primarily taken from face-to-face conversations,whereas the data in the spoken register of COCA consist mostly of publicspeech mediums like radio or TV broadcasting, a much more formal type ofspoken language. The corpus examples of come on provided by Biber et al.(1999), such as ‘‘Come on, let Andy do it’’ and ‘‘Come on, let’s go’’ helpdemonstrate the conversational nature of their spoken corpus data.

CONCLUSION: IMPLICATIONS AND LIMITATIONS

This study has offered a comparative examination of the usage patternsof the most frequently used PVs in American and British English andacross registers. Besides validating many of the results of Biber et al.’s(1999) and Gardner and Davies’ (2007) studies, it has provided some newinformation about the use of PVs and a comprehensive list of the mostcommon PVs in American and British English, one that complementsthose offered by Biber et al. and Gardner and Davies with more items andmore usage information. In addition, it also presents a cross-register list ofthe most frequent PVs, showing in which register(s) each of the PVs isused primarily. English learners or teachers can elect to use the lists of the150 most common PVs in ways that best meet their learning purposes. Forexample, for a language curriculum or program with a general learningpurpose, either the American or the British overall frequency list may beused as a reference guide, depending on whether American or BritishEnglish is chosen as the target English variety. For ESP programs,however, one of the register-specific frequency lists (e.g., newspapers oracademic texts) can be used as the guide. Currently, the frequency orderis based entirely on the PVs’ overall frequency in COCA. To derive thecorrect frequency order of a register-specific list, one can copy the desiredregister list together with the PV items, place them in an Excelspreadsheet, and have the rank order adjusted according the PVs’frequencies in the register using the sorting function. The following aresome additional pedagogical implications.

1. Although there are some PV usage and frequency differences betweenAmerican and British English, the most common PVs are generallyrather similar between the two English varieties. Thus, except for thoseaforementioned usage differences, learners or teachers of English neednot worry about the problem of learning PVs that are useful only inAmerican or British English.


2. Although PVs that show a wider and more even distribution acrossregisters usually should receive more attention and perhaps be learnedfirst, unevenly distributed PVs may actually deserve special attention forESP learners, because of their high frequency in the register(s) that arethe ESP learners’ focus.

3. Learners of English should be made aware that the use of PVs is registerand field sensitive so they can approach PVs more effectively andappropriately. For students learning academic writing in English, it isimportant to know that, although PVs are generally not common informal writing, there are a few PVs (e.g., carry out and point out) that areactually very useful in academic writing, and it will be to the students’advantage to gain command of them. Writing teachers may want topurposely include these PVs in their teaching.

4. Learners should focus mostly on polysemous or idiomatic PVs, becausemono-meaning and literal meaning PVs are not only easy to understandbut are limited in context and function, as shown in the usage patternsof action PVs uncovered in this study.

5. Learners should also understand that the various meanings andfunctions of polysemous PVs are also often register specific, as in thecase of make up discussed earlier. Learners or teachers should consultvarious sources such as PV dictionaries and online sources like Wordnet3.0 to become familiar with the different meanings, especially the keymeanings of the PVs they are learning.

6. Learners can also take advantage of free online corpora such as COCAand the BNC as useful sources for learning and practicing PVs, especiallytheir different meanings. For example, students can enhance theirability in distinguishing the different meanings of a PV by going throughconcordance lines of a PV query to determine the meaning of eachspecific token. Such exposure to PVs can also help learners becomemore familiar with PVs and then more comfortable in using them,hence helping overcome their inclination to avoid PVs. Furthermore,some useful strategies for learning PVs have been suggested, such asstudying the cognitive motivation of the use of the AVPs in PVs to helpbetter grasp the meanings of PVs (Kovecses & Szabo, 1996) andexamining the typical noun collocates of PVs to better understand andretain idiomatic PVs.

Limitations of the Study and Implications for Future Research

First, limited by space and research design, this study provides onlythe lemmatized most common PVs, and it does not provide anexamination of the use of the various meanings of those polysemousPVs across various registers. A tense-specific list and an analysis of thevarious meanings of the PVs can help better understand how the varioustenses and meanings of a PV are used, including information such as in

680 TESOL QUARTERLY

which register or registers each of its tense forms and meanings is mostfrequently used. Second, as is the case in Biber et al. (1999), the cross-register comparative study of PVs in this study covers only broadcategories, offering little information on the PVs usage patterns inspecific fields, such as air-traffic control. In the future, more field-specific comparative studies are needed.

ACKNOWLEDGMENTS

I thank the three anonymous reviewers and TESOL Quarterly Editor Alan Hirvela fortheir extremely valuable comments and suggestions. They have helped mesignificantly enhance the quality of this article. I also thank one of the reviewersfor suggesting the dispersion/adjusted frequency test and Professor Stefan T. Griesfor allowing me to use his Dispersions2 program.

THE AUTHOR

Dilin Liu is Professor and Director of the Applied Linguistics/TESOL program atthe University of Alabama. His main research interests include the learning andteaching of lexis and grammar, especially corpus-based description and learning oflexicogrammar.

REFERENCES

Biber, D., Conrad, S., & Reppen, R. (1998). Corpus linguistics: Investigating languagestructure and usage. Cambridge, England: Cambridge University Press.

Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). Longmangrammar of spoken and written English. London, England: Longman.

Bolinger, D. (1971). The phrasal verb in English. Cambridge, England: CambridgeUniversity Press.

Cambridge international dictionary of phrasal verbs. (1997). Cambridge, England:Cambridge University Press.

Celce-Murcia, M., & Larsen-Freeman, D. (1999). The grammar book: An ESL/EFLteacher’s course (2nd ed.). Boston, MA: Heinle & Heinle.

Cordon, N., & Kelly, P. (2002). Does cognitive linguistics have anything to offerEnglish language learners in their efforts to master phrasal verbs? ITL Review ofApplied Linguistics, 133–134, 205–231.

Dagut, M., & Laufer, B. (1985). Avoidance of phrasal verbs: A case for contrastiveanalysis. Studies in Second Language Acquisition, 7, 73–79. doi: 10.1017/S0272263100005167.

Darwin, C. M., & Gray, L. S. (1999). Going after the phrasal verb: An alternativeapproach to classification. TESOL Quarterly, 33, 65–83. doi: 10.2307/3588191.

Davies, M. (2005). BYU–BNC (an online interface with the British National Corpus:World Edition). Provo, UT: Brigham Young University. Retrieved from http://corpus.byu.edu/bnc/x.asp.

Davies, M. (2008). The corpus of contemporary American English. Provo, UT:Brigham Young University. Retrieved from http://www.americancorpus.org/.

Gardner, D., & Davies, M. (2007). Pointing out frequent phrasal verbs: A corpus-based analysis. TESOL Quarterly, 41, 339–359.


Gries, S. T. (2008a). Dispersions and adjusted frequencies in corpora. InternationalJournal of Corpus Linguistics, 13, 403–437. doi: 10.1075/ijcl.13.4.02gri.

Gries, S. T. (2008b). Dispersions2 (statistical program) Retrieved from http://www.linguistics.ucsb.edu/faculty/stgries.

Gries, S. T. (2010). Useful statistics for corpus linguistics. In A. Sanchez & M. Almela(Eds.), A mosaic of corpus linguistics: Selected approaches (pp. 269–291). Frankfurt amMain, Germany: Peter Lang.

Hulstijn, J. H., & Marchena, E. (1989). Avoidance: Grammatical or semantic causes?Studies in Second Language Acquisition, 11, 241–255. doi: 10.1017/S0272263100008123.

Kovecses, Z., & Szabo, P. (1996). Idioms: A view from cognitive linguistics. AppliedLinguistics, 17, 326–355. doi: 10.1093/applin/17.3.326.

Laufer, B., & Eliasson, S. (1993). What causes avoidance in L2 learning: L1–L2difference, L1–L2 similarity, or L2 complexity? Studies in Second LanguageAcquisition, 15, 35–48. doi: 10.1017/S0272263100011657.

Liao, Y., & Fukuya, Y. J. (2004). Avoidance of phrasal verbs: The case of Chineselearners of English. Language Learning, 54, 193–226. doi: 10.1111/j.1467-9922.2004.00254.x.

Liu, D. (2003). The most frequently used spoken American English idioms: A corpusstudy analysis and implications. TESOL Quarterly, 37, 671–700. doi: 10.2307/3588217.

Liu, D. (2008). Linking adverbials: An across-register study and its implications.International Journal of Corpus Linguistics, 13, 491–518. doi: 10.1075/ijcl.13.4.05liu.

Longman phrasal verbs dictionary. (2000). Harlow, England: Pearson EducationMcCarthy, M., & O’Dell, F. (2004). English phrasal verbs in use. Cambridge, England:

Cambridge University Press.Miller, G. (2008). Wordnet Online (Version 3.0, first launched in 2006) Retrieved

from http://wordnetweb.princeton.edu/perl/webwn.Moon, R. (1998). Fixed expressions and idioms in English. Oxford, England: Clarendon

Press.Oxford phrasal verbs dictionary for English Learners. (2001). Oxford, England:

Oxford University Press.Quirk, R., Greenbaum, S., Leech, G., & Svartvik, J. (1985). A comprehensive grammar of

the English language. London, England: Longman.Side, R. (1990). Phrasal verbs: Sorting them out. ELT Journal, 44, 144–152. doi:

10.1093/elt/44.2.144.Spears, R. A. (1993). NTC’s dictionary of phrasal verbs and other idiomatic phrasal

verbs and phrases. Lincolnwood, IL: National Textbook.Wyss, R. (2003). Putting phrasal verbs into perspective. TESOL Journal, 12, 37–38.

682 TESOL QUARTERLY

AP

PE

ND

IX

Fre

qu

ency

and

Dis

trib

uti

on

of

the

Mo

stF

req

uen

tP

hra

sal

Ver

bs

inC

OC

Aan

dth

eB

NC

Lis

ted

Acc

ord

ing

toT

hei

rO

vera

llR

ank

Ord

erin

CO

CA

(Fre

qu

ency

Nu

mb

erP

MW

s)

PV

s

Dis

trib

uti

on

acro

ssth

ere

gist

ers

inC

OC

AIn

CO

CA

InB

NC

Ran

k-o

rder

dif

fere

nce

Spo

ken

Fic

tio

nM

agaz

ine

New

spap

erA

cad

emic

To

tal

Ran

ko

rder

To

tal

Ran

ko

rder

goon

2316.3

719

8.70

96.7

110

1.65

52.5

515

3.48

114

8.33

10

pick

up2

108.

67262.1

796

.13

90.2

923

.70

115.

402

89.9

53

1co

me

back

2251.7

516

3.84

50.1

569

.15

11.9

210

9.44

379

.91

52

com

eu

p2250.5

110

2.38

64.4

268

.29

18.9

110

1.48

454

.97

95

goba

ck2

190.9

815

4.92

56.1

565

.03

19.7

497

.31

580

.27

41

fin

dou

t2162.0

899

.41

65.8

850

.84

22.3

580

.43

665

.88

82

com

eou

t2163.2

290

.99

45.3

550

.32

11.4

872

.51

749

.99

125

goou

t2129.9

110

9.79

48.4

756

.02

9.62

70.7

78

76.5

26

2po

int

out1

79.1

338

.72

76.7

062

.05

90.7

269

.71

969

.51

72

grow

up1

**89

.25

61.3

975

.95

97.3

822

.57

69.5

610

18.4

453

43se

tu

p167

.62

55.2

183.5

074

.48

43.3

865

.11

1110

3.12

29

turn

out1

82.8

764

.50

81.8

958

.65

33.3

964

.58

1242

.64

219

get

out2

122.0

510

7.17

41.8

444

.29

6.92

64.4

313

35.2

830

17co

me

in2

121.2

399

.71

39.3

046

.52

10.1

163

.36

1447

.91

140

take

on1

71.6

746

.99

71.9

370

.52

48.5

162

.17

1541

.79

227

give

up1

65.3

172.4

051

.38

67.9

023

.76

56.1

116

41.6

623

7m

ake

up1

54.3

561.6

559

.20

56.8

546

.91

55.8

017

54.4

310

7en

du

p1**

82.4

447

.34

63.9

758

.23

20.3

854

.80

1833

.62

3214

get

back

290

.61

92.4

535

.37

44.5

25.

3353

.56

1945

.31

190

look

up3

16.8

5202.8

719

.96

12.8

44.

3150

.24

2038

.53

266

figu

reou

t1**

70.4

662

.95

52.3

941

.92

12.4

048

.17

212.

7314

712

6si

tdo

wn

255

.75

126.4

927

.65

23.2

36.

3447

.43

2244

.57

202

get

up2

51.7

9126.6

031

.22

24.5

65.

0947

.41

2339

.18

252

take

out3

57.4

587.7

435

.15

33.8

38.

2844

.32

2434

.10

317

com

eon

383

.67

108.7

212

.84

10.6

01.

8643

.22

2548

.07

1312

godo

wn

273.5

164

.34

26.4

727

.41

6.43

39.6

226

47.5

915

11sh

owu

p1**

52.1

145

.79

43.3

047

.22

8.90

39.5

727

7.64

119

92ta

keof

f236

.73

81.6

632

.28

27.5

45.

7136

.58

2821

.52

4618


TA

BL

EC

on

tin

ued

PV

s

Dis

trib

uti

on

acro

ssth

ere

gist

ers

inC

OC

AIn

CO

CA

InB

NC

Ran

k-o

rder

dif

fere

nce

Spo

ken

Fic

tio

nM

agaz

ine

New

spap

erA

cad

emic

To

tal

Ran

ko

rder

To

tal

Ran

ko

rder

wor

kou

t149.6

334

.09

43.5

039

.57

14.6

236

.47

2946

.81

1613

stan

du

p2*

42.7

691.3

220

.94

18.2

05.

7036

.46

3030

.43

344

com

edo

wn

260.3

254

.37

24.2

527

.20

6.82

34.5

831

32.9

033

2go

ahea

d3*

119.4

626

.34

9.69

9.20

2.66

33.8

032

17.4

756

24go

up2

64.8

435

.39

26.8

032

.91

5.55

33.2

333

36.6

129

4lo

okba

ck2

33.9

671.2

019

.91

17.9

97.

9929

.97

3422

.40

428

wak

eu

p2**

34.0

363.8

226

.42

20.4

23.

6629

.54

3516

.07

6227

carr

you

t226

.48

12.1

319

.92

23.8

362.2

528

.86

3641

.60

2412

take

over

130

.01

23.2

826

.08

45.5

716

.21

28.2

437

53.9

511

26ho

ldu

p222

.52

75.1

120

.08

17.3

37.

2628

.16

3816

.16

6123

pull

out2

**22

.82

77.4

718

.45

16.5

73.

3727

.42

3913

.99

7334

turn

a/ro

un

d233

.08

64.1

016

.08

20.3

94.

3327

.37

4015

.62

6424

take

up1

21.4

240.3

629

.42

22.4

922

.56

27.1

941

45.8

618

23lo

okdo

wn

39.

0197.0

911

.98

6.09

3.20

24.9

642

22.1

143

1pu

tu

p135.4

933

.16

20.5

328

.26

4.99

24.4

943

28.2

236

7br

ing

back

134

.78

35.5

620

.63

21.2

19.

5924

.34

4421

.90

440

brin

gu

p244.7

234

.99

17.8

315

.73

8.14

24.3

145

24.9

540

5lo

okou

t314

.91

75.7

215

.01

12.3

63.

5723

.97

4616

.33

5913

brin

gin

138.9

721

.18

18.2

425

.76

11.7

123

.23

4724

.93

416

open

up1

**32.2

521

.53

23.1

219

.11

17.3

322

.74

4820

.43

491

chec

kou

t1**

25.7

328

.62

33.6

219

.74

3.36

22.3

549

5.73

128

79m

ove

on1

35.9

725

.05

18.7

818

.92

6.89

21.1

850

14.1

272

22pu

tou

t238.9

627

.87

16.2

217

.91

4.20

21.0

751

16.5

258

7lo

oka/

rou

nd3

12.7

873.7

99.

317.

572.

1720

.75

5214

.67

6816

catc

hu

p1**

21.9

631.0

020

.22

20.4

58.

4520

.39

5316

.05

6310

goin

248.1

031

.12

7.86

12.4

22.

3420

.37

5419

.65

513

brea

kdo

wn

119

.91

13.9

826.4

917

.27

17.5

519

.15

5521

.89

4510

get

off2

25.6

733.2

915

.46

17.5

82.

4718

.85

5610

.81

9034

keep

up1

**17

.43

25.4

521

.99

21.7

27.

6318

.85

5613

.38

7822

684 TESOL QUARTERLY

TA

BL

EC

on

tin

ued

PV

s

Dis

trib

uti

on

acro

ssth

ere

gist

ers

inC

OC

AIn

CO

CA

InB

NC

Ran

k-o

rder

dif

fere

nce

Spo

ken

Fic

tio

nM

agaz

ine

New

spap

erA

cad

emic

To

tal

Ran

ko

rder

To

tal

Ran

ko

rder

put

dow

n2

17.1

450.1

113

.55

10.5

53.

3518

.75

5828

.60

3523

reac

hou

t2**

13.7

745.1

714

.24

13.7

37.

7218

.74

599.

4510

445

goof

f228

.05

37.6

312

.46

12.1

23.

1918

.62

6020

.94

4713

cut

off1

**20

.49

30.7

416

.89

14.5

07.

6518

.01

6113

.74

7413

turn

back

38.

1764.5

38.

606.

503.

4417

.91

6213

.67

7513

pull

up3

**9.

2057.8

313

.14

8.25

1.93

17.8

163

10.5

392

29se

tou

t112

.74

19.4

023.6

814

.83

17.5

417

.67

6446

.11

1747

clea

nu

p1**

20.8

722.0

917

.88

19.8

17.

1917

.58

659.

2610

540

shu

tdo

wn

1**

23.6

012

.91

17.3

824.4

65.

9116

.92

664.

6914

377

turn

over

120

.25

25.2

713

.53

18.2

05.

4316

.50

679.

7010

235

slow

dow

n1**

17.3

422.1

421

.05

13.9

86.

7116

.29

6811

.91

8517

win

du

p1**

19.9

715

.72

19.8

021.1

33.

1016

.02

698.

2611

142

turn

up1

13.1

926.2

819

.13

14.6

64.

9215

.62

7026

.97

3832

lin

eu

p1**

14.2

220.9

417

.82

20.4

04.

1715

.51

719.

9698

27ta

keba

ck2

23.8

328.4

710

.59

10.4

44.

2415

.47

7216

.20

6012

lay

out1

**20.6

518

.59

16.9

511

.49

8.44

15.2

773

2.64

148

75go

over

226

.11

32.9

67.

877.

492.

2215

.25

749.

8610

127

han

gu

p3**

8.36

52.4

09.

276.

450.

9215

.23

755.

4013

358

goth

rou

gh2

38.1

612

.35

9.20

13.4

32.

5115

.22

769.

6610

327

hold

on2

23.7

628.9

811

.79

8.46

3.12

15.1

977

9.04

107

30pa

yof

f2**

15.6

77.

6022

.50

24.6

04.

4615

.09

786.

1812

547

hold

out3

6.84

49.5

78.

247.

604.

3415

.06

7915

.00

6712

brea

ku

p116

.87

17.9

118.4

714

.76

6.31

14.9

180

12.8

081

1br

ing

out1

22.5

318

.96

13.3

010

.66

7.07

14.5

381

14.1

871

10pu

llba

ck3**

9.67

45.9

49.

146.

881.

7614

.47

827.

5012

038

han

gon

1**

13.4

130.9

013

.50

11.5

02.

8014

.34

8320

.11

5033

buil

du

p1**

17.0

18.

7621.5

512

.16

10.9

914

.21

8437

.34

2856

thro

wou

t2**

23.1

918

.55

10.3

315

.20

3.36

14.1

385

4.91

140

55ha

ng

out1

**15

.55

19.4

717

.07

15.6

72.

6014

.10

862.

7414

660


TA

BL

EC

on

tin

ued

PV

s

Dis

trib

uti

on

acro

ssth

ere

gist

ers

inC

OC

AIn

CO

CA

InB

NC

Ran

k-o

rder

dif

fere

nce

Spo

ken

Fic

tio

nM

agaz

ine

New

spap

erA

cad

emic

To

tal

Ran

ko

rder

To

tal

Ran

ko

rder

put

on2

18.0

428.8

711

.93

8.69

3.12

14.0

887

14.2

168

19ge

tdo

wn

222

.23

23.6

210

.31

9.59

1.96

13.5

388

15.3

165

23co

me

over

216

.40

35.4

18.

575.

991.

3513

.43

899.

9999

10m

ove

in1

13.0

724.9

911

.48

14.9

42.

9813

.43

897.

8611

627

star

tou

t1**

20.5

210

.83

16.4

313

.40

4.03

13.1

491

4.88

141

50ca

llou

t3**

5.28

43.9

97.

486.

043.

5313

.03

923.

7914

452

sit

up3

5.63

50.7

16.

351.

830.

9112

.83

9311

.53

876

turn

dow

n1

13.6

917

.09

12.9

917.1

52.

6312

.71

9410

.46

940

back

up1

**14

.22

18.5

412

.84

13.0

64.

2912

.58

959.

0810

611

put

back

216

.26

25.9

19.

108.

862.

8012

.53

9613

.63

7719

sen

dou

t1**

18.0

814

.56

12.4

312

.08

4.69

12.4

097

13.6

776

21ge

tin

219

.92

22.1

28.

609.

661.

6512

.36

9811

.22

8810

blow

up1

**21.0

514

.46

10.4

911

.84

2.55

12.1

199

7.79

117

18ca

rry

on1

11.5

517.4

112

.09

10.4

38.

8612

.04

100

38.5

127

73se

tof

f19.

4018.5

414

.39

12.3

54.

3311

.79

101

18.6

052

49ke

epon

2**

14.2

520.0

112

.67

9.33

2.31

11.7

110

28.

2811

08

run

out2

*14

.26

21.4

69.

2110

.97

2.17

11.5

710

311

.89

8617

mak

eou

t36.

9336.4

77.

674.

342.

1411

.35

104

11.0

089

15sh

ut

up3

*8.

8838.3

75.

003.

891.

1611

.27

105

14.1

970

35tu

rnof

f210

.28

20.2

213

.63

8.99

3.20

11.2

510

65.

9112

620

brin

gab

out2

10.7

22.

808.

676.

5127.4

411

.22

107

20.7

348

59st

epba

ck3**

8.16

32.3

27.

104.

972.

7410

.92

108

3.31

145

37la

ydo

wn

2**

8.97

24.6

810

.56

5.48

5.12

10.8

910

910

.08

9712

brin

gdo

wn

115.9

215

.46

9.52

9.21

4.19

10.8

611

010

.17

9515

stan

dou

t1**

7.79

12.4

713.9

210

.72

9.32

10.8

511

18.

1411

32

com

eal

ong2

15.0

117.8

69.

269.

511.

8110

.68

112

12.6

482

30pl

ayou

t1**

15.9

56.

499.

0011

.69

8.27

10.3

211

32.

6214

936

brea

kou

t111

.49

12.2

310

.25

11.5

75.

7610

.26

114

9.91

100

14go

a/ro

un

d218.5

617

.11

6.32

6.50

1.93

10.0

711

517

.96

5560

686 TESOL QUARTERLY

TA

BL

EC

on

tin

ued

PV

s

Dis

trib

uti

on

acro

ssth

ere

gist

ers

inC

OC

AIn

CO

CA

InB

NC

Ran

k-o

rder

dif

fere

nce

Spo

ken

Fic

tio

nM

agaz

ine

New

spap

erA

cad

emic

To

tal

Ran

ko

rder

To

tal

Ran

ko

rder

wal

kou

2**

11.9

924.8

45.

796.

521.

3510

.01

116

8.06

115

1ge

tth

rou

gh2-

19.0

611

.66

8.11

7.55

1.92

9.70

117

5.31

134

17ho

ldba

ck2-

8.65

18.6

67.

727.

473.

589.

1611

88.

1911

26

wri

tedo

wn

1**

9.19

16.7

78.

915.

025.

469.

0411

915

.05

6653

mov

eba

ck1-

7.66

14.8

57.

939.

763.

148.

6312

05.

6313

111

fill

out1

**-

8.11

9.28

8.58

10.8

65.

518.

4612

12.

5515

029

sit

back

2-

7.38

23.6

95.

505.

290.

758.

4312

28.

3010

913

rule

out1

**11.8

62.

508.

229.

888.

628.

2512

313

.05

7944

mov

eu

p1-

8.23

10.4

48.

6810

.42

3.26

8.21

124

4.75

142

18pi

ckou

t2-

7.76

16.3

18.

905.

632.

448.

1912

58.

5210

817

take

dow

n2-

10.7

816.9

56.

015.

571.

828.

1912

67.

7111

88

get

on2

13.9

114.7

65.

185.

501.

588.

1712

726

.83

3988

give

back

1-

11.8

810

.71

6.38

7.94

2.93

7.97

128

5.05

138

10ha

nd

over

2**

7.64

17.2

55.

956.

263.

017.

9612

917

.35

5772

sum

up1

**6.

172.

9611.6

38.

249.

627.

7713

012

.28

8446

mov

eou

t2-

10.1

013.2

25.

997.

422.

227.

7613

15.

7012

92

com

eof

f1-

8.83

8.27

7.90

12.6

60.

677.

6713

25.

1313

64

pass

on1**

8.26

8.8

08.

506.

564.

887.

4213

312

.81

8053

take

in2-

6.39

15.1

66.

265.

032.

747.

0713

45.

0713

73

set

dow

n3-

1.89

27.6

42.

951.

431.

596.

9513

55.

0213

94

sort

out1

**9.0

07.

736.

915.

504.

936.

8213

627

.36

3799

follow

up2

**13.4

04.

553.

165.

457.

036.

7313

710

.12

9641

com

eth

rou

gh1-

10.8

28.

645.

436.

841.

546.

6613

85.

6412

99

sett

ledo

wn

2**

4.17

15.6

36.

464.

911.

756.

5313

910

.76

9148

com

ea/

rou

nd2

8.25

14.0

54.

504.

930.

976.

5014

012

.42

8357

fill

in1**

5.65

7.9

26.

356.

343.

745.

9914

118

.18

5487

give

out1

-8.4

27.

774.

805.

311.

775.

6214

25.

3013

57

give

in2-

4.80

10.9

56.

113.

882.

245.

5814

35.

7612

716

goal

ong2

-9.6

36.

374.

294.

491.

565.

2814

47.

1412

321


TA

BL

EC

on

tin

ued

PV

s

Dis

trib

uti

on

acro

ssth

ere

gist

ers

inC

OC

AIn

CO

CA

InB

NC

Ran

k-o

rder

dif

fere

nce

Spo

ken

Fic

tio

nM

agaz

ine

New

spap

erA

cad

emic

To

tal

Ran

ko

rder

To

tal

Ran

ko

rder

brea

kof

f2-

2.96

11.9

34.

562.

701.

934.

7714

55.

4613

213

put

off1

-4.

806.6

15.

315.

201.

444.

6714

67.

3912

125

com

eab

out1

-7.8

22.

524.

093.

315.

284.

6314

77.

3812

225

clos

edo

wn

1**

7.6

03.

583.

333.

882.

174.

1314

810

.48

9355

put

in2-

5.09

7.2

23.

163.

541.

054.

0014

98.

0611

435

set

abou

t1-

1.00

3.2

23.

202.

072.

132.

3215

06.

4212

426

Not

e.Su

per

scri

pte

dn

um

ber

(1,

2,o

r3)

afte

rea

chP

Vin

dic

ates

its

dis

trib

uti

on

pat

tern

acro

ssth

ere

gist

ers:

15

fair

lyev

enly

dis

trib

ute

d;

25

no

tev

enly

dis

trib

ute

d;

35

very

un

even

lyd

istr

ibu

ted

.T

he

bo

ldn

um

ber

inea

chP

Ven

try

isth

eh

igh

est

amo

ng

the

five

regi

ster

s.*In

dic

ates

PV

iso

ne

of

the

4P

Vs

fro

mB

iber

etal

.’s

list

that

isn

ot

on

Gar

dn

eran

dD

avie

s’to

p10

0P

Vli

st.

**In

dic

ates

PV

iso

ne

of

the

33P

Vs

this

stu

dy

has

iden

tifi

ed.

Th

em

inu

ssi

gnaf

ter

aP

Vin

dic

ates

its

nu

mb

ero

fto

ken

sP

MW

sis

bel

ow

10in

eith

erco

rpu

s.

Co

ncl

ud

ed

688 TESOL QUARTERLY

Documents

The Most Frequently Used English Phrasal Verbs in American and British English- A Multicorpus Examination