Collocations - UQ eSpace - University of Queensland

CHAPTER 1

THE STUDY OF COLLOCATIONS

1.0 Introduction

'Collocations' are usually described as "sequences of lexical items which

habitually co-occur [i.e. occur together]" (Cruse 1986:40). Examples of English

collocations are: 'thick eyebrows', 'sour milk', 'to collect stamps', 'to commit

suicide', 'to reject a proposal'.

The term collocation was first introduced by Firth, who considered that

meaning by collocation is lexical meaning "at the syntagmatic level" (Firth

1957:196). The syntagmatic and paradigmatic relations of lexical items can be

schematically represented by two axes: a horizontal and a vertical one. The

paradigmatic axis is the vertical axis and comprises sets of words that belong to

the same class and can be substituted for one another in a specific grammatical

and lexical context. The horizontal axis of language is the syntagmatic axis and

refers to a word's ability to combine with other words. Thus, in the sentence

'John ate the apple' the word 'apple' stands in paradigmatic relation with

'orange', 'sandwich', 'steak', 'chocolate', 'cake', etc., and in syntagmatic relation

with the word 'ate' and 'John'. Collocations represent lexical relations along the

syntagmatic axis.

Firth's attempt to describe the meaning of a word on the collocational

level was innovative in that it looked at the meaning relations between lexical

items, not from the old perspective of paradigmatic relations (e.g. synonyms,

antonyms) but from the level of syntagmatic relations. Syntagmatic relations

between sentence constituents had been widely used by structural linguists

(e.g. 'John ate the apple' is a 'Subject-Verb-Object' construction), but not in the

study of lexical meaning.

To date, studies of collocation have not defined the concept of

collocation rigorously (Cowan 1989:1). Since the term

'collocation' was introduced by Firth to describe meaning at the syntagmatic

level, subsequent linguists and researchers have not often attempted to define

'collocation' in a more thorough and methodical way. Collocation is still

defined as the tendency of a lexical item to co-occur with one or more other

words (Halliday, McIntosh & Strevens 1964:33; Ridout & Waldo-Clarke 1970;

Backlund 1973, 1976; Seaton 1982; Crystal 1985:55; Cruse 1986:40; Zhang

1993:1).

Although the theoretical treatment of collocations has been inadequate,

the teaching of collocations to second language (L2) learners has gained

importance during the last decade. For a long time the emphasis in vocabulary

learning has been on accumulating and memorising lists of word definitions,

followed by gap-filling exercises (Robinson 1989:276; Gitsaki 1992; for a review

of the development of vocabulary teaching see Carter and McCarthy 1988).

However, applied linguists realised that vocabulary skills involve more than

the ability to define a word. Suggestions were made for a new approach to

vocabulary teaching that would avoid the previous emphasis on words in

isolation and on word definitions. The new approach would include an

examination of the syntagmatic relations of collocation between lexical items, a

skill that is evident in the adult native speakers of a language (McCarthy

1984:14-16; Carter 1987:38; Sinclair 1991).

The shift of interest towards lexical learning is also evident in the

introduction of a new approach to L2 teaching. The Lexical Approach, as it is

outlined by Lewis (1993), regards language as grammaticalised lexis and places

the way words combine at the centre of its theoretical perspective (Hewitson &

Steele 1993). Lexis becomes the central organising principle of the syllabus, and

collocation assumes an important syllabus-generating role (Lewis 1993).

Raising the learners' understanding of the collocations of words is a

matter of first-rate importance (McCarthy 1984:21), since the task of learning

collocations can present both intralingual and interlingual problems.

'Collocation' as a term describing lexical relations is not well-defined, and

unfortunately joining words that are in principle semantically compatible does

not always produce acceptable collocations, e.g. 'many thanks' is an acceptable

collocation in English but *'several thanks' is not, in the same way that 'strong

tea' is well-formed but *'powerful tea' is not.

Furthermore, unlike paradigmatic relations between words, which can be

the same for different languages, syntagmatic relations are more likely to differ

from language to language (Mitchell 1975:10). For example, English people

'draw conclusions' while the Greeks 'βγάζουν συμπεράσματα' [take out

conclusions]; in English you have to 'wait for somebody' while in Greek

'περιμένεις κάποιον' [wait somebody]; in English you 'go on a diet' while in

Greek 'κάνεις δίαιτα' [do diet]; in English someone who drinks a lot is a

'heavy drinker' while in Greek he is a 'γερό ποτήρι' [strong glass]; in English

you 'get in touch with someone', while in Greek you 'έρχεσαι σε επαφή με

κάποιον' [come to touch with someone].

The purpose of this thesis is to study syntagmatic lexical relations within

a framework that will allow a more thorough treatment of the phenomenon of

'collocation', and to investigate the acquisition process of English collocations

by L2 learners as an attempt to describe the possible factors affecting the

development of English collocational knowledge.

1.1 The Importance of Collocations in L2 Learning

The importance of collocations for the development of L2 vocabulary

and communicative competence has been underscored by a number of linguists

and language teachers who recommend the teaching and learning of

collocations in the L2 classroom.

Collocation has been considered as a separate level of vocabulary

acquisition. Bolinger (1968, 1976) argues that we learn and memorise

words in chunks and that most of our "manipulative grasp of words is by way

of collocations" (Bolinger 1976:8). The learning of language in segments of

collocation size, especially in children, is proved by the fact that "the collocate is

what the young child produces if you ask him a definition", e.g. a 'hole' is 'a

hole in the ground' (Cazden 1972:129, cited in Bolinger 1976:11). Bolinger

describes language learning as a continuum starting at the morpheme level

with word formation rules, moving to the word level and activating phrase

formation rules. The last stage before storage into memory is the level where

words enter into collocations. When learning a language people may or may

not store a morpheme as such, but they do store phrases. For example, the

phrase 'indelible ink' will be stored as a phrase, but few people will analyse the

word 'indelible' as having the morpheme 'in-' as a prefix (Bolinger 1968:106).

Among the early advocates for the importance of collocations in L2

learning and their inclusion in L2 teaching is Brown (1974), who suggests that

an increase of the students' knowledge of collocation will result in an

improvement of their oral and listening comprehension and their reading

speed. To help advanced students develop a better feel for what

is acceptable and what is appropriate, Brown outlines a number of exercises.

The combination of lexical items as a source of difficulty in vocabulary

acquisition has been noted by researchers like Korosadowicz-Struzynska (1980),

who claims that the learner's mastery of these troublesome combinations, rather

than her/his knowledge of single words, should be an indication of her/his

progress (Korosadowicz-Struzynska 1980:111). Korosadowicz-Struzynska

reports that students face intralingual and interlingual problems in the use of

collocations, and even advanced students who have considerable fluency of

expression in a foreign language make collocational errors. The teaching and

learning of collocations for production reasons is regarded as essential by

Korosadowicz-Struzynska, who also describes certain steps that should be

followed in order to promote the teaching of collocations from the initial stages

of foreign language learning. These include selection of the most essential

words on the basis of usefulness and frequency of occurrence, selection of the

most frequent collocations of these words, presentation of these collocations in

the most typical contexts, and contrasting any of the selected collocations with

the equivalent native-language collocations that could cause interference

problems for the learners.
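
The first two selection steps described above (frequent words, then their most frequent collocations) can be sketched computationally. The following is only a minimal illustration in Python, not Korosadowicz-Struzynska's own procedure: it assumes whitespace tokenisation, treats adjacent word pairs as candidate collocations, and uses a hypothetical function name (`frequent_collocations`) and hypothetical parameters (`top_words`, `per_word`). The criterion of usefulness, which the source also names, is not modelled.

```python
from collections import Counter


def frequent_collocations(sentences, top_words=2, per_word=2):
    """Pick the most frequent words, then their most frequent adjacent pairs.

    Assumption: a 'collocation' is approximated here by two adjacent tokens.
    """
    tokenised = [s.lower().split() for s in sentences]

    # Step 1: frequency of individual words.
    word_counts = Counter(w for toks in tokenised for w in toks)

    # Step 2: frequency of adjacent word pairs within each sentence.
    pair_counts = Counter(
        (a, b) for toks in tokenised for a, b in zip(toks, toks[1:])
    )

    selected = {}
    for word, _ in word_counts.most_common(top_words):
        # Keep only the pairs that contain the selected word.
        pairs = [(p, c) for p, c in pair_counts.most_common() if word in p]
        selected[word] = pairs[:per_word]
    return selected
```

On real text, function words such as 'the' would dominate the counts, so a practical version would need a stop list or a restriction to open-class words.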

The significant role that conventionalised language forms (idioms,

routine formulas and other forms such as collocations) play in the development

of foreign language learners' communicative competence is stressed by Yorio

(1980). One of the functions of conventionalised forms is that they "make

communication more orderly because they are regulatory in nature" (Yorio

1980:438). Realising that random selection on purely subjective grounds from

diverse conventionalised language forms is totally inadequate for the purposes

of foreign language teaching, Yorio describes a set of criteria for the selection of

specific forms to be taught: need, usefulness, productivity, currency, frequency,

and ease (Yorio 1980:439).

It has been claimed that prefabricated language chunks and routinized

formulas play an important role in acquiring and using language (Nattinger &

DeCarrico 1992:1; Nattinger 1980). Nattinger and DeCarrico have argued that a

common characteristic in acquiring a language is the progression from routine

to pattern to creative language use (Nattinger & DeCarrico 1992:116).

Therefore, it is suggested that the learning of prefabricated language patterns

should be promoted in the classroom.

The "apparent rulelessness" of collocations as one factor that interferes

with foreign language vocabulary learning has been noted by Laufer (1988).

Laufer reports that collocations constitute an essential aspect in the learners'

knowledge of vocabulary, and she acknowledges that problems can arise in the

learners' use of word combinations. She also suggests that collocations could

provide help at many levels of vocabulary development and in the

development of self-learning strategies such as guessing (Laufer 1988:16).

Realising the foreign language learner's difficulties in learning

vocabulary, Cowie (1978, 1981) stresses the importance of the compilation of

English dictionaries "in which collocation and examples play a separate but

complementary role" (Cowie 1978:131). Cowie points out that "meaning is not

the only determinant of the extent and semantic variety of collocating

words....The constraint may be situational" (Cowie 1978:134). For example, in

the collocation 'a tea/dinner service of 50 pieces' there is a restriction as to

which meals can combine with 'service' (tea, dinner, breakfast, ?luncheon) and

their combination is based on cultural factors, i.e. which of these meals it is

customary to serve, and whether it is conventional to have separate sets of

dishes and plates for each (Cowie 1978:134). As a result, special treatment of

the cultural factor of collocability in a learner's dictionary is proposed. He also

suggests the inclusion of 'free word-combinations' that could still cause

problems for the foreign language learners, as well as the inclusion of

grammatical rules that will indicate the correct grammatical treatment of the

included collocations (Cowie 1981:226,232).

The teaching of collocations in the classroom could help students

overcome problems of vocabulary, style and usage (Leed & Nakhimovsky

1979). Leed and Nakhimovsky suggest the utilisation of lexical functions, as

these are described by Mel'cuk and Zholkovsky (1988) (see Table 1), for the

construction of foreign language teaching materials, vocabulary exercises and

learners' dictionaries. Leed and Nakhimovsky argue that vocabulary exercises

should be based on the findings of a well-structured lexical analysis, in the

same way that pronunciation exercises are based on phonology (Leed &

Nakhimovsky 1979:111). The theory of lexical functions can provide the basis

for the generation of pedagogical exercises that are more consistent, diversified,

and elaborate, less arbitrary, and ultimately more effective. Such an approach

would help foreign language learners with problems of vocabulary, style and

usage, and give teachers a method to produce and carry out lexical exercises in

the classroom, as well as concentrate on the teaching of restricted collocations

such as 'heavy drinker', 'heavy smoker', 'deep trouble', etc., (Leed &

Nakhimovsky 1979:109).

Table 1. Examples of Lexical Functions

Lexical Functions

Syn (to shoot) = to fire [synonym]

Sync (to shoot) = to machine-gun [narrower synonym]

Anti (victory) = defeat [antonym]

Oper1 (analysis) = to perform [be the subject of]

Oper2 (analysis) = to undergo [be the object of]

(Mel'cuk & Zholkovsky 1970:26; Mel'cuk 1981:39)
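
As a rough illustration of how the entries in Table 1 could feed exercise construction, each lexical function can be treated as a mapping from a keyword to its value. The sketch below is an assumption about one possible representation, not part of Mel'cuk and Zholkovsky's formalism or of Leed and Nakhimovsky's materials; the names `LEXICAL_FUNCTIONS`, `apply_lexical_function` and `exercise_item`, and the gap-fill format, are hypothetical.

```python
# Entries copied from Table 1; the nested-dictionary layout is an assumption.
LEXICAL_FUNCTIONS = {
    "Syn": {"to shoot": "to fire"},
    "Sync": {"to shoot": "to machine-gun"},
    "Anti": {"victory": "defeat"},
    "Oper1": {"analysis": "to perform"},
    "Oper2": {"analysis": "to undergo"},
}


def apply_lexical_function(function, keyword):
    """Return the value of `function` for `keyword`, or None if not listed."""
    return LEXICAL_FUNCTIONS.get(function, {}).get(keyword)


def exercise_item(function, keyword):
    """Build a (prompt, answer) pair for a gap-fill exercise (hypothetical format)."""
    answer = apply_lexical_function(function, keyword)
    if answer is None:
        raise KeyError(f"No entry for {function}({keyword})")
    return f"{function}({keyword}) = ____", answer
```

For instance, `exercise_item("Oper1", "analysis")` would pair the prompt for 'analysis' with the verb 'to perform', the restricted collocation a learner is expected to supply.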

Teaching phrase-patterns and sentence patterns from the early stages of

L2 learning may help vocabulary expansion (Twaddell 1973; Korosadowicz-

Struzynska 1980). Twaddell argues that vocabulary expansion should take

place from the intermediate stages of L2 learning and onwards under the

condition that "the most habitual parts of language use" such as phrase-patterns

and sentence patterns will be "practised and established as early as possible"

(Twaddell 1973:63). After those habits have been adequately established, then

new vocabulary can be assimilated into the L2 patterns. Korosadowicz-

Struzynska also suggests that it is reasonable to teach collocations of words to

learners from the beginning rather than to arrange remedial courses afterwards,

when lexical errors have become fossilised (Korosadowicz-Struzynska

1980:116). She disagrees with Smith's view that "mastery of the utterance

should be the culmination of learning, not the beginning" (Smith 1971:42).

It has been argued that the teaching of collocations facilitates vocabulary

building for University-bound ESL students (Smith 1983). Smith (1983)

illustrates a type of exercise for the teaching of collocations that combines both

paradigmatic and syntagmatic relations between words. A number of

collocations that are primarily used in academic subjects are selected for

teaching, and the key words of these collocations are members of the same

semantic field (e.g. 'same', 'identical', 'equivalent', 'parallel', 'equal',

'homogeneous', 'similar'). According to Smith, this type of exercise could

prove to be useful in an ESP course.

A "carefully graded curriculum" should include word associations

according to Murphy (1983), who treats collocations and word associations as

synonymous. Murphy describes 11 steps that foreign language teachers could

follow in order to include collocations, word association, famous sayings and

catch phrases in their teaching program.

The study of fixed expressions in English has been suggested as a useful

starting point for a principled approach to vocabulary learning and teaching

(Alexander 1984:132). Alexander stresses the benefits in the learning process if

emphasis is placed "on the three C's of vocabulary learning: collocation,

context, and connotation" (Alexander 1984:128).

Contrastive analysis has been suggested as an approach to the teaching

of collocations. The main strategy of this approach is the compilation of lists of

collocations in the learner's L1 and their equivalents in the target language.

Newman (1988) conducted a contrastive analysis of Hebrew and English dress

and cooking verbs and their noun/object collocations. Newman suggests that

providing learners with words that are described in terms of meaning

components, derived from contrastive analysis and collocation restrictions, can

prove to be a useful device at the learners' disposal for making conscious

distinctions and avoiding lexical errors arising from negative L1 transfer

(Newman 1988:303). Therefore, the language learning process should be

complemented by frequent practice and immersion to cater for the acquisition

of idioms and rigidly restricted collocations, along with meaningful mnemonic

operations that will involve the "deliberate exercising of the learner's powers of

analysis and creativeness parallel to the characteristics of the transparent freer

end of the collocational range" (Newman 1988:304). A similar view is reported

by Bahns (1993). He argues that a contrastive analysis of the lexical collocations

in the students' L1 and the target language will reveal which collocations have

direct translational equivalents and therefore need not be taught, allowing

foreign language teaching to concentrate on items for which there is no

translational equivalence in the target language.

The studies reviewed above show the importance of teaching

collocations to ESL learners, and the necessity of the inclusion of collocations in

the second/foreign language curriculum, as this can prove to be beneficial for

the development of L2 vocabulary, communicative competence, and language

performance. Even though some criteria are offered in order to help teachers

decide which collocations to teach, these criteria are arbitrarily established, they

are not based on empirical research, and they are by no means conclusive. For

example, Brown (1974) recommends that 'normal' collocations should be taught

first because they form the basis for 'unusual' collocations (Brown 1974:3), but

she does not define the criteria that would help teachers distinguish 'normal'

from 'unusual' collocations. In addition, the proposed exercises do not seem to

have been constructed systematically; the choice of verbs and nouns to be

combined seems random, and no criteria are given as a means for establishing

the "usefulness" of the collocations provided by the exercises; and the teacher

has to rely on her/his own intuition about which of the collocations are more or

less useful.

Similarly, Laufer (1988) accepts the view that collocations constitute an

essential aspect in the learner’s knowledge of vocabulary, and she

acknowledges that problems can arise in the learner’s use of word

combinations, but she nevertheless concentrates on the paradigmatic lexical

relations, abandoning collocations to their 'rulelessness'. In addition, Laufer

does not explain how the problem of teaching, learning, and use of collocations

can be tackled, even though collocations could be found to provide help in

many levels of vocabulary development (Laufer 1988).

In Murphy's paper (1983) a number of exercises are outlined for the

teaching of collocations, but it is left to the teacher's personal judgement to

decide which collocations, word associations and phrases are more useful than

others and which ones should be taught first.

These are some of the problems presented by studies prescribing the

teaching of collocations. It is apparent that even though the importance of

collocations in L2 teaching and learning has been established, the treatment of

collocations has been inadequate. There are still decisions to be made as to

which collocations should be given priority in the classroom, how many

collocations per new word should be taught, how to practise collocations, at

which level the teaching of collocations should be attempted, and how the

acceptability of specific collocations is to be established.

Finally, the large repertoire of terms employed by linguists and language

pedagogists to refer to word combinations includes 'combinations of lexical

items' (Korosadowicz-Struzynska 1980), 'conventionalised language forms'

(Yorio 1980), 'prefabricated language chunks and routinized formulas'

(Nattinger & DeCarrico 1992), 'phrase patterns and sentence patterns'

(Twaddell 1973), 'word associations' (Murphy 1983), 'fixed expressions'

(Alexander 1984) (see also Kennedy 1990). The variety of terms used

underscores the need for a more precise definition of 'collocation' and a method

for the systematic classification of individual collocations.

1.2 Collocations in L2 Acquisition Research

There have been a number of studies in L2 acquisition research that

investigated how the knowledge and use of collocations by students at different

levels of proficiency affect their communicative competence and language

performance, and so established the importance of collocations in L2 learning.

In her effort to identify the main factors in L2 acquisition for academic

achievement, Saville-Troike studied a group of nineteen non-English speaking

elementary school students who were subsequently taught and tested in

English. The longitudinal study revealed that the most usual verbal interaction

patterns consisted of the use of English routines such as 'don't do' and 'that's

mine' (Saville-Troike 1984:207) and that vocabulary knowledge in English is the

most important aspect of L2 competence for academic achievement (Saville-

Troike 1984:216). Students progressed from simply repeating after the teacher,

to nodding or shaking the head, to using single words, and finally to using

phrase and sentence patterns. These patterns and routines can be considered as

collocations since they are word combinations, and hence Saville-Troike's study

shows that collocations are essential for communicative interaction even from

the initial stages of L2 acquisition.

In an experiment carried out by Bahns and Eldaw (1993), a translation

and a cloze task were used to test German post-secondary learners' active

knowledge of 15 English verb-noun 'lexical collocations' (i.e. collocations that

included words belonging to open-class categories, and excluding words such

as prepositions, articles or conjunctions). The German collocations used in the

translation test were direct equivalents of the English collocations. In the cloze

test there were 15 sentences, each containing one verb-noun collocation

with the verb missing. The analysis of the data revealed that the subjects

produced more than twice as many errors in their translations of the nouns in

the noun-verb collocations as in their translation of general lexical words, while

in the cloze test nearly 52% of the responses were grammatically or

collocationally unacceptable to a native speaker of English. The results show

that for advanced ESL students collocations present a major problem in the

production of correct English. The results also indicate that the learners'

knowledge of collocations does not expand in parallel with their knowledge of

general vocabulary, since they could not identify the specific verb-noun

collocations, although they could use general lexical items. Also, the learners'

inability to paraphrase collocational phrases suggests that "a knowledge of

collocations is essential to full communicative mastery of English" (Bahns &

Eldaw 1993:109). Bahns and Eldaw suggest that the results of their study are

due to the fact that collocations are not taught explicitly in the classroom and

therefore learners do not pay any attention to learning them (p. 109).

Verb-noun collocations were also tested by Aghbar (1990) in a writing

task based on the assumption that the use of formulaic language should be

considered in assessing native and non-native English proficiency. Aghbar

defines formulaic language as language chunks that are used and learnt

together. He reports that "collocations are the less obvious examples of

formulaic language", possibly because they are not fixed in the same way that

idioms and proverbs are (Aghbar 1990:2). The writing test consisted of 50

sentences, appropriate for formal written contexts, with each sentence

containing one formulaic verb-noun expression. In each of these expressions

the verb was missing and the participants had to provide the verb most likely

to be used in a formal written context. The results showed that ESL students

did well where 'get' was the desirable word. However, they used 'get' even

when other more specific and more appropriate verbs were needed. For

example, 'This is an opportunity for you to _______ knowledge in your field of

study' could be filled with 'get' but also with other more appropriate verbs such

as 'acquire', 'accumulate', 'gain', 'demonstrate', 'display' etc. The reason for the

poor ESL performance in the test was the "lack of acquisition of those language

chunks that make discourse fluent and idiomatic" (Aghbar 1990:6). The results

also showed that the performance of American students was similar to that of

ESL students, thus proving that even the native undergraduates' knowledge of

the collocations used in formal written language was inadequate.

Similarly, 200 undergraduate third and fourth year Jordanian students

majoring in English performed poorly in a multiple choice test conducted by

Fayez-Hussein (1990), who aimed to assess the students' ability to collocate

words correctly in English. The multiple choice test consisted of 40 sentences,

with each sentence containing an incomplete collocation (i.e. idioms, fixed

expressions, restricted collocations). The collocations tested were mainly noun-

noun, adjective-noun, and verb-noun phrases. The students' performance on

the test (only 48.4% of the collocations were answered correctly) was found

unsatisfactory. Almost half of the incorrect responses were found to be due to

negative transfer from L1, e.g. in item 5 'By the weekend the death _________

had reached 95', 51% of the subjects chose 'death number' instead of 'death toll'.

Unfamiliarity with the structure of the particular idioms and fixed expressions

was another major factor for incorrect responses, e.g. in item 21 'The first

voyage of a new ship is referred to as a __________ voyage', 45.5% of the

subjects selected 'primary voyage' instead of 'maiden voyage'. Finally, the

students' tendency to use generic terms instead of specific ones accounted for

38.3% of incorrect responses, e.g. in item 29 'After the current repairs of the

city's water supply system, ________ water will be safe for drinking', 48.5% of

the subjects chose 'pipe water' instead of 'tap water'. Fayez-Hussein lists a

number of reasons for the students' inadequate knowledge of English

collocations: the neglect of lexicon in the teaching and learning of English as a

foreign language, the students' insufficient reading experience (which is

assumed to restrict their knowledge of vocabulary, synonyms, lexical

restrictions, etc.), the reduction and simplification that takes place in the

teaching of a foreign language (which can encourage students to use

oversimplified generalisations), and the subjects' overuse of guessing strategies

in answering the test items. The latter could have also been encouraged by the

format of the test, i.e. multiple choice test items.

The lack of emphasis which most syllabuses place on vocabulary has

been reported as the main reason for the frequency of learners' lexical errors

(e.g. collocational errors, over-use of a few general items) by Channell (1981). A

group of eight advanced students of English were asked to fill in a

'collocational grid' which had the adjectives 'handsome, pretty, charming,

lovely' as its vertical axis, and the nouns 'woman, man, child, dog, bird, flower,

weather, landscape, view, house, furniture, bed, picture, dress, present, voice'

as its horizontal axis. The test showed that the students failed to mark a large

number of acceptable collocations, even though they were very familiar with

the words involved in the test. Channell concludes that it is essential that

learners realise the potential of words they know and of the new words they

learn, and she recommends that syllabuses should take into account two things

about every new word the learner needs to learn: how it relates to other words

with similar meaning, and which other words it can be used with and in which

contexts (Channell 1981:116).
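
The collocational grid described above lends itself to a simple tabular representation. The sketch below is an illustration only: the adjectives and nouns are those listed for Channell's grid, but the function names (`empty_grid`, `missed_collocations`) are hypothetical, and any acceptability judgements entered into the grid would have to come from a reference source rather than from this code.

```python
ADJECTIVES = ["handsome", "pretty", "charming", "lovely"]
NOUNS = [
    "woman", "man", "child", "dog", "bird", "flower", "weather", "landscape",
    "view", "house", "furniture", "bed", "picture", "dress", "present", "voice",
]


def empty_grid():
    """Return a grid with every adjective-noun cell unmarked (False)."""
    return {adj: {noun: False for noun in NOUNS} for adj in ADJECTIVES}


def missed_collocations(learner_grid, reference_grid):
    """List pairs marked acceptable in the reference grid but not by the learner."""
    return [
        (adj, noun)
        for adj in ADJECTIVES
        for noun in NOUNS
        if reference_grid[adj][noun] and not learner_grid[adj][noun]
    ]
```

Comparing a learner's grid with a reference grid in this way yields the set of acceptable collocations the learner failed to mark, which is the kind of shortfall the test was designed to reveal.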

An analysis of the writing of four Arab college ESL students by Elkatib

(1984) showed unfamiliarity with collocation as well as overuse of a few general

lexical items to be among the eight main types of lexical errors that were

recorded. In a further analysis of the collocational errors, Elkatib observes that

the learners knew the basic meaning of the lexical item but they did not know

its collocative patterns, which resulted in the use of erroneous collocations such

as 'beautiful noise', 'shooting stones', 'I increased a hundred marks', 'do

progress'. Elkatib concludes that new words should be presented in company

with their most typical collocations in the form of example sentences or of

collocation grids like the ones proposed by Channell (1981). The importance of

such a practice derives from the fact that "students often fail to realise the

potential even of words they know well, because they use them only in a

limited number of collocations of which they are sure" (Elkatib 1984:50).

The analysis of frequent words and their collocations was used in order

to assess the writing proficiency of primary school students in Singapore

(Ghadessy 1989). Writing samples of grade three (8-9 years old) and grade six

(11-12 years old) students were analysed using the KWIC (key-word in context)

method. It was found that grade three students used content words (i.e. nouns,

verbs, adjectives and adverbs) more frequently than grade six students, who

showed a more frequent use of function words (i.e. articles, pronouns,

prepositions, etc.) (Ghadessy 1989:113). According to Ghadessy, the frequent

use of function words is indicative of a more advanced use of collocations,

grammatical patterns and cohesive devices on the part of grade six students

(Ghadessy 1989:114). Ghadessy reports that looking at the collocations students

use is a valid way of investigating what happens during their development

towards a full linguistic communicative competence, i.e. by looking at the

collocations of nouns, one can draw conclusions about the development of the

students' ability to use premodification and postmodification of nouns. For

example, in Ghadessy's study all students used premodification (e.g. 'tall tree',

'tennis ball', 'shady tree') more frequently than postmodification (e.g. 'the tree

that...', 'a tree near the place that...', 'the tree which...'), which appeared mainly

in the writings of grade six students. Therefore, it appears that

postmodification is a more complex skill that develops at later stages of L2

learning, and as such it may be used as an indicator of a more advanced level of

language acquisition.
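
To make the KWIC method mentioned above concrete, the following minimal Python sketch lists each occurrence of a keyword with a few words of context on either side. The function name 'kwic', the 'width' parameter and the sample sentences are illustrative assumptions; they are not taken from Ghadessy's (1989) data or procedure.

```python
def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for each hit of keyword."""
    tokens = text.lower().split()
    hits = []
    for i, tok in enumerate(tokens):
        # Ignore punctuation attached to the token when matching the keyword.
        if tok.strip(".,;:!?'\"") == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, keyword, right))
    return hits

# Hypothetical sample text, invented for illustration only.
sample = "The tall tree fell. We sat under the shady tree near the school."

for left, kw, right in kwic(sample, "tree"):
    # Right-align the left context so that the keyword forms a column.
    print(f"{left:>20} [{kw}] {right}")
```

Aligning the keyword in a single column is what allows recurrent premodifiers (here 'tall' and 'shady') to be spotted at a glance.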

The use of collocations in the writings of native and non-native college

freshmen was examined by Zhang (1993). Samples of written essays, as well as

a fifty-item blank filling test containing 21 types of collocation (11 grammatical

and 10 lexical ones), were analysed in order to examine any associations

between collocational knowledge (as this was measured by the blank filling

tests) and writing quality, on the one hand, and the use of collocations in the

students' essays and writing quality on the other. The results show that

collocational knowledge is a source of fluency in written communication, and

also that the quality of collocations in terms of variety and accuracy is

indicative of the quality of college freshmen writing. An interesting result in

Zhang's study is that the use of more grammatical collocations (e.g. SV to Inf)

and fewer lexical collocations (e.g. Verb Adverb) (see section 1.5 for definitions)

was found to be characteristic of the writing in native Good writers and non-

native Poor writers (Zhang 1993:168). Zhang considers this result indicative of

the development that takes place as non-native speakers develop from poor

writers to good writers to native-like writers. Even though Zhang did not test

subjects from different proficiency levels, he anticipates that learners at the

lower levels of English proficiency use more grammatical collocations, and

fewer lexical collocations in their writing, and whatever collocations they do

use are poor in variety and accuracy. As learners progress to intermediate

levels they produce a greater variety of collocations and fewer collocational

errors, but they are still dependent on the prefabricated routines they have

acquired, and thus they use more lexical collocations than grammatical ones.

At higher levels of English proficiency learners have a better knowledge of

grammatical collocations and they are able to use the analysed parts to create

new ones, resulting in fewer lexical collocations and more grammatical ones

(Zhang 1993:169). Zhang's study suggests that there is some kind of

development in collocational knowledge as L2 learners proceed from low

language proficiency to more advanced language proficiency.

In an investigation of possible ways of facilitating L2 vocabulary

learning, Cohen and Aphek (1981) concluded that intermediate level students

find tasks with contextualised words (average 77% correct) easier than tasks

involving lists of words, which in turn are easier for beginners (average 84%

correct) (Cohen & Aphek 1981:225). Thus, teaching words in their collocations

could be beneficial for intermediate level students but not for elementary

students.

Overall, the use of correct collocations in the reviewed studies was found

to be indicative of a higher level of language proficiency, and the lack of

collocational knowledge was found to impair language performance. Even

though the above studies pursued similar goals, i.e. to reveal that a limited

knowledge of collocations inhibits language performance and that the teaching

of collocations in L2 classroom is necessary, they present a number of

limitations. Some of the studies were limited to the examination of a small

number of collocations, usually belonging to the same pattern (verb-noun

collocations in Bahns & Eldaw 1993; Aghbar 1990; adjective-noun collocations

in Channell 1981). The use of elicitation procedures differed from study to

study, making their results difficult to compare (translation and cloze test in

Bahns & Eldaw 1993; blank filling in Aghbar 1990; collocational grid in

Channell 1981; multiple choice test in Fayez-Houssein 1990; analysis of written

performance in Ghadessy 1989; Elkatib 1984; essay writing and blank filling in

Zhang 1993) (for a critique of the use of multiple choice tests and open-choice

tests in the investigation of collocational knowledge see Aghbar & Tang 1991).

Some studies contained only a small number of subjects (8 subjects in Channell

1981; four in Elkatib 1984; nineteen in Saville-Troike 1984; Cohen & Aphek

1981). There is no common theoretical framework for the study of collocations,

i.e. the studies are mainly descriptions of the problems that learners have with

collocations (word combinations, routinized patterns, phrase patterns, etc.).

With the exception of Zhang's (1993) study, where a number of collocational

patterns are identified and systematically tested, the rest of the studies lack

systematicity and methodology in the selection of the collocations they tested,

which were based mainly on native speaker intuitions. Due to these limitations

the study of the acquisition of collocations is still in need of systematic and

methodologically sound research, while a common framework for the study of

collocations is yet to be established.

The following section outlines the different approaches to the study of

collocations in an attempt to construct a theoretical framework as the basis of

the present study.

1.3 Approaches to the Study of Collocations

Since the 1960s there have been three main approaches to the study of

collocations, focusing on different aspects of the phenomenon of collocation. In

this study, these approaches are referred to as: the lexical composition

approach, the semantic approach, and the structural approach. The lexical

composition approach characterises collocation as a different level of lexical

meaning. The semantic approach attempts to predict the collocates of lexical

units by reference to their semantic features. The structural approach examines

collocations using grammatical patterns. Each approach is described in more

detail in the following sections.

1.3.1 The Lexical Composition Approach

The lexical composition approach in the study of collocations is based on

the assumption that words receive their meaning from the words they co-occur

with. Among those who perceived collocations as a lexical phenomenon

independent of grammar is Firth, who is also believed to be the 'father' of the

term "collocation". Collocation according to Firth is a "mode of meaning". Just

as the light of mixed wave-lengths disperses into a spectrum, "the lexical

meaning of any given word is achieved by multiple statements of meaning at

different levels", e.g. the orthographic level, phonological level, grammatical

level, and collocational level (Firth 1957:192). For example, the meaning of the

word 'peer' is described by Firth in the following way: at the orthographic level

the group of letters 'peer' is distinguished from the group of 'pier'. Next the

pronunciation is stated, then at the grammatical level we state whether 'peer' is

a noun or a verb, and by making such statements at the grammatical level we

make explicit a further component of meaning. Also, formal and etymological

meaning may be added, together with social indications of usage (Firth

1957:192). Finally, at the collocational level, one of the meanings of the word

'peer' is its collocation with 'school', as in 'school peers'. Firth highlights the

"general rule" that every word entering a new context is a new word. Firth also

distinguishes contextual meaning from meaning by collocation, and attempts a

classification of collocations into "general or usual collocations and more

restricted technical or personal collocations", though unfortunately without any

further elaboration (Firth 1957:195). Even though Firth does not enter into a

thorough exploration of a theory of collocations, he uses collocation in his book

as a technique for the stylistic criticism of literary works, e.g. personal or

'unusual' collocations can reflect personal idiosyncratic styles in the use of

language (for the use of collocations in the stylistic analysis of literature, see

Behre 1967).

Halliday (1966) and Sinclair (1966) took Firth's theory of meaning one

step forward and stressed the importance of lexical collocations, i.e. collocations

that consist of lexical items, in an integrated lexical theory. The so-called Neo-

Firthians attempted the study of lexis as a distinct linguistic level. Sinclair saw

Grammar and Lexis as two 'interpenetrating ways' of looking at language form

(Sinclair 1966:411), and Halliday argued that lexical theory is complementary

to, but not part of, grammatical theory (Halliday 1966:148). Grammar organises

language as a system of choices and whatever patterns and/or items fail to

"resolve themselves into systems" are listed at the end of each grammatical

description (Sinclair 1966:411). 'Lexis', on the other hand, is devoted to the

study and description of individual lexical items and their collocational

tendencies that cannot be dealt with by grammar, since they are not a matter of

choice (one rather than another) but of likeliness of occurrence, i.e. "there are

virtually no impossible collocations, but some are more likely than others"

(Sinclair 1966:411), e.g. the collocation 'this lemon is sweet' could be considered

as unusual except in the context of somebody exclaiming over a child's painting

of still life (McIntosh 1961:329).

The Neo-Firthians also introduced a new set of linguistic terms related to

the study of collocations. They used the term Node to refer to a lexical item

whose collocations are being studied, Span to refer to the number of lexical

items on either side of the node that are considered to be relevant to the node,

and Collocates to refer to those items that are in the environment defined by the

span (Sinclair 1966:415). For example, when we study the collocational patterns

of 'tea', 'tea' is the node. If we decide to have a span of 3, that means we study

the 3 lexical items that occur before and after 'tea'. All the lexical items that are

within the span of the word 'tea' are considered to be its collocates.
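
The terms Node, Span and Collocates can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the function name 'collocates', the sample sentence and the whitespace tokenisation are hypothetical, and every token inside the span is counted, whereas the definition above refers to lexical items only.

```python
from collections import Counter

def collocates(tokens, node, span=3):
    """Count the tokens found within `span` positions on either side of each node."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            left = tokens[max(0, i - span):i]      # up to `span` items before the node
            right = tokens[i + 1:i + 1 + span]     # up to `span` items after the node
            counts.update(left + right)
    return counts

# Hypothetical sentence, invented for illustration only.
tokens = "she made a pot of strong tea and drank the tea slowly".split()
result = collocates(tokens, "tea", span=3)

for word, freq in result.most_common():
    print(word, freq)
```

With a span of 3, 'strong' falls inside the environment of the node 'tea', while 'she' and 'made' fall outside it. Widening the span would bring them in, which is why the choice of span has to be stated explicitly in any collocational study.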

To the extent to which words are specified by their collocational

environment, similarities of their collocational restrictions enable linguists to

group lexical items into "lexical sets", i.e. sets of words with similar

collocational restrictions. For example, the words 'bright', 'shine' and 'light' are

members of the same lexical set because they are frequent collocates of the

word 'moon' (Halliday 1966:156). Along the same lines, the lexical items

'bright', 'hot', 'shine', 'light', 'lie' and 'come out' are all members of the same

lexical set because they all collocate with the item 'sun' (Halliday 1966:158). The

criterion for a lexical item to enter a lexical set is its syntagmatic relation to a

specific lexical item (i.e. its collocation with a specific word) rather than its

paradigmatic relation to that lexical item. For example, lexical items like

'strong' and 'powerful' are considered members of the same lexical set because

they collocate with the lexical item 'argument', e.g. 'strong argument' and

'powerful argument'. As far as other collocates are concerned, e.g. 'car' and

'tea', the lexical items 'strong' and 'powerful' will enter different lexical sets, i.e.

'strong' will be a member of the lexical set defined by 'tea', and 'powerful' will

be a member of a lexical set defined by 'car' (Halliday 1966:152). Halliday is

also interested in the collocational patterns that lexical items enter. For

example, 'a strong argument' presents the same collocational pattern as 'the

strength of his argument' and 'he argued strongly'. Since 'strong', 'strength',

and 'strongly', are parts of the same collocational pattern, they are considered

as word-forms of the same lexical item (Halliday 1966:151). Halliday also

points out that lexical items need not have any formal relationship to one

another in order to collocate. For example, 'strong' and 'argument' could be in

different sentences: 'I wasn't convinced of his argument. He had some strong

points but they could all be met'.
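
The notion of lexical set described above can be sketched in Python as membership defined relative to a collocate. The table below is hand-coded from the 'strong'/'powerful' example discussed above; the names 'OBSERVED', 'lexical_set' and 'share_set' are hypothetical, and the sketch assumes that the collocations have already been observed rather than showing how they would be extracted from a corpus.

```python
# Collocates observed with each defining item (hand-coded from the example above).
OBSERVED = {
    "argument": {"strong", "powerful"},
    "tea": {"strong"},
    "car": {"powerful"},
}

def lexical_set(defining_item):
    """Return the lexical set defined by collocation with `defining_item`."""
    return OBSERVED.get(defining_item, set())

def share_set(a, b, defining_item):
    """True if both items belong to the lexical set defined by `defining_item`."""
    return {a, b} <= lexical_set(defining_item)

print(share_set("strong", "powerful", "argument"))
print(share_set("strong", "powerful", "tea"))
```

The same pair of items is grouped together with respect to 'argument' and kept apart with respect to 'tea' and 'car', which reflects the syntagmatic, rather than paradigmatic, criterion for set membership.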

What Halliday refers to as 'collocational pattern' McIntosh calls

'collocational range' in order to distinguish it from its grammatical equivalent,

i.e. 'pattern', which has to do with the structure of the sentences we produce,

while 'collocational range' has to do with the specific collocations we produce

in a series of particular instances (McIntosh 1961:337; McIntosh & Halliday

1966). McIntosh also argues that since collocations are the material out of

which sentences are made, collocational range should be taken into account

within the dictates of pattern when dealing with the text of actual sentences.

A theory of lexical meaning similar to the one outlined by Firth and the

Neo-Firthians is suggested by Anthony (1975). Even though Anthony was not

involved directly in the study of collocation, his proposed theory treats the

lexical word as an empty form capable of bonds to different kinds of meaning

(Anthony 1975:22). Each lexical word becomes a discourse word when it is

used in ordinary discourse, and the particular meaning which is in focus is

called its lexical meaning. For example, the lexical word 'pitch' can mean many

things, i.e. it is capable of bonds to different kinds of meaning (a throwing

action, a tar-like substance, something musical, etc.). The moment 'pitch' is

used communicatively in a group of other words and becomes a discourse

word, then a small portion of its repertory of meanings is in focus and this

becomes its lexical meaning, e.g. in the sentence 'pitch the ball to me', 'pitch'

receives the meaning of 'a throwing action'. Anthony also remarks that a word

that occurs in one grammatical construction differs in lexical meaning from the

same word in another construction. For example the use of 'mother' as a verb

has a different referential meaning from the use of 'mother' as a noun.

Collocation has also been identified by Halliday and Hasan as a form of

lexical cohesion, and it has been defined as the "cohesive effect" of pairs of

words such as 'bee...honey' and 'king...crown' which "depends not so much on

any systematic semantic relationship as on their tendency to share the same

lexical environment, to occur in COLLOCATION with one another" (Halliday

& Hasan 1976:286). However, 'collocational cohesion', as it is used by Halliday

and Hasan, is simply "a cover term" for textual cohesion, a kind of "semantic

interlace that provides texts with their texture- their non-structural cohesion or

lexical form" (Addison 1983:3), and leaves the "specific kinds of co-occurrence

which are variable and complex" to be dealt with by "a general semantic

description of the English language" (Halliday & Hasan 1976:287-288).

Halliday and Hasan's definition of collocation serves the task of textual

analysis, but it is restricted to lexically predictable collocational chains that

extend beyond the boundaries of a sentence. Furthermore, it does not pay

attention to idiosyncratic and unpredictable co-occurrences of words that are

not semantically or environmentally, in a physical sense, associated with each

other, e.g. there is nothing obvious in the meaning of 'tea' that explains why it

collocates with 'strong' but not with 'powerful'.

The main problem with lexical analysis has been identified as "the

circularity of the definition of the basic unit of description, the lexical item"

(Sinclair 1966:412). That is, every item is described in terms of its environment

which in its turn is defined in terms of the item. For example, one of the

meanings of 'night' is its collocability (i.e. ability to collocate) with 'dark', and of

'dark', its collocation with 'night' (Firth 1957:196). The above realisation makes

lexical statements look weaker and less precise than grammatical ones, which

are based on a well-defined and explicit framework.

One of the good points of the lexical composition approach is that it

drew attention to lexis and uncovered the insufficiency of grammatical analysis

to account for the 'patterns' a word enters into, in the Hallidayan sense, and the

collocatory idiosyncrasies of lexical items. The Neo-Firthians argue that

grammar alone cannot describe what the lexical item is, therefore the lexical

item "must be identified within Lexis, on the basis of collocation" (Halliday,

McIntosh & Strevens 1964:35).

Sinclair and Halliday do not underestimate the importance of

grammatical analysis; they rather highlight the significance of being able to

make valid statements about lexis that do not disregard but complement

grammar. However, the Neo-Firthians admit that they do not know "how far

collocational patterns are dependent on the structural relations into which the

items enter" (Halliday 1966:159), and therefore it is essential to examine

collocational patterns in their grammatical environments. In other words, the

advocates of the lexical composition approach recommend that collocational

patterns are best described and analysed through lexical analysis, but they do

admit that help from grammar is still needed.

1.3.2 The Semantic Approach

Collocation as a linguistic phenomenon associated with lexical semantics

was described as early as 2,300 years ago. Greek Stoic philosophers, according

to Robins (1967), rejected the equation of "one word, one meaning" and shed

light on an important aspect of the semantic structure of language: "word

meanings do not exist in isolation, and they may differ according to the

collocation in which they are used" (Robins 1967:21).

In parallel to the lexical composition approach, where linguists

recognised lexis as a level of analysis of language separate from grammar, in

the semantic approach linguists attempted to investigate collocations on the

basis of a semantic framework, also separate from grammar.

Chomsky was among the first to suggest the treatment of collocations by

semantics. Even though Chomsky did not examine collocations, he

distinguished between 'strict subcategorisation rules', i.e. rules that "analyze a

symbol in terms of its categorical context", and 'selectional rules', i.e. rules

which "analyze a symbol in terms of syntactic features of the frames in which it

appears" (Chomsky 1965:95). These rules assist the generation of grammatical

strings. The breaking of strict subcategorisation rules will result in strings such

as 'John found sad' and 'John became Bill to leave', while failure to observe

the selectional rules will give examples like 'Colorless green ideas sleep

furiously' (Chomsky 1965:149). He then finds that selectional rules play a

marginal role in the grammar and suggests that they should be dropped from

the syntax and be taken over by semantics.

The Neo-Firthians' approach to the study of collocations was found

inadequate by semanticists because it sorts lexical items into sets according to

their collocations, but it does not explain why there are lexical items that

collocate only with certain other lexical items. In the lexical composition

approach collocations and sets are studied as if the combinatorial processes of

language were arbitrary (Lehrer 1974:176).

Firth's theory of meaning was found to be insufficient for the study of

collocations (Lyons 1966). Lyons claims that Firth's definition of 'meaning' as a

"complex of contextual relations" is puzzling, and he criticises the apparent lack

of principles by means of which "lexical groups by association" can be

established and "lexical sets" can be defined (Lyons 1966:289-297). Overall,

Lyons proposes an abandonment of Firth's theory of meaning, in which the

statement of meaning by collocation was introduced, because it does not

coincide with well-established theories of meaning and language description

and furthermore there are other "more important meaning relations" which

must be accounted for in a theory of meaning (Lyons 1966:295). Even though

Lyons seems to agree that 'collocations' restricted to "syntagms (or collocations)

composed of a noun and a verb or a noun and an adjective" (Lyons 1977:261)

are worthy of study by the semanticist, he does not believe that a separate

collocational level has to be established. Lyons also proposes that collocations

should be studied only as part of the synchronic and diachronic analysis of

language. For the study of collocations Lyons proposes the notion of "lexical

fields" founded upon "the relations of sense holding between pairs of

syntagmatically connected lexemes" (Lyons 1977:261). However, he advises

against going to the extreme of "defining the meaning of a lexeme to be no

more than the set of its collocations" (Lyons 1977:265-268). He then proceeds to

describe the principles of a strong version of field-theory as if the vocabulary of

a language was a closed set of lexemes with each lexeme being a member of no

more than one field. However, the vocabulary of a language is an open system,

and lexemes do belong to different fields due to their different meanings.

Therefore, the study of vocabulary in a theory of lexical fields based on

syntagmatic relations presents problems. These problems led Lyons to suggest

that descriptive semantics can get along well without syntagmatic relations

(Lyons 1977:268). Thus, Lyons decides to deal with the 'more important'

paradigmatic relations of sense in his study of semantics, setting aside the

study of syntagmatic relations altogether.

Even though Lyons (1977) provided only a criticism of the Firthian

theory of meaning, there have been other semanticists who tried to put together

a theory of lexical meaning based on the semantic properties of lexical units.

This approach is the semantic approach to the study of collocations. According

to the semantic approach, the meaning of a lexical item is perceived as a

combination of the semantic properties of that item. It is the semantic

properties of a lexical item that determine its collocates.

Just as the Neo-Firthians tried to establish the lexis as different from

grammar, the semanticists also tried to establish a semantic theory that is

different from, but complementary to, grammar. Katz and Fodor (1963)

describe a semantic theory that would organise, systematise, and generalise

facts about meaning (Katz & Fodor 1963:170). They state that a semantic theory

of a language would "take over the explanation of the speaker's ability to

produce and understand new sentences at the point where grammar leaves off"

(Katz & Fodor 1963:172-173). They accept that one component of a semantic

theory of a language is a dictionary of that language, and they proceed to

describe the semantic markers for a few lexical entries of a model dictionary of

English. According to the semantic theory proposed by Katz and Fodor, each

entry for a lexical item in the dictionary must contain a selection restriction, i.e.

a condition for that particular lexical item to combine with others. For example,

the lexical item 'sleep' would require a subject with the feature [Animate], and

the lexical item 'break' would require as object something that is a [Physical

object] and [Rigid].
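
The idea of a selection restriction stored in a dictionary entry can be sketched in Python as a feature check. The restrictions for 'sleep' and 'break' follow the examples above; the feature assignments for the nouns, and the names 'FEATURES', 'RESTRICTIONS' and 'satisfies', are hypothetical and serve only as an illustration, not as a reconstruction of Katz and Fodor's model dictionary.

```python
# Hypothetical semantic markers for a few nouns (illustrative assumptions).
FEATURES = {
    "dog": {"Animate", "Physical object"},
    "idea": set(),
    "window": {"Physical object", "Rigid"},
    "water": {"Physical object"},
}

# Selection restrictions from the examples above: the argument role
# and the features that the argument must carry.
RESTRICTIONS = {
    "sleep": ("subject", {"Animate"}),
    "break": ("object", {"Physical object", "Rigid"}),
}

def satisfies(verb, argument):
    """True if `argument` carries every feature that `verb` requires of it."""
    _role, required = RESTRICTIONS[verb]
    return required <= FEATURES[argument]

print(satisfies("sleep", "dog"))
print(satisfies("sleep", "idea"))
print(satisfies("break", "water"))
```

A check of this kind accepts or rejects a combination on the basis of features alone, so it can only capture restrictions that follow from the meaning of the items involved.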

Due to the fact that under the semantic approach to the study of

collocations each lexical item will be defined by semantic markers based on its

meaning or meanings, Lehrer (1974) argues that the semantic approach is more

likely to explain why certain words can be found together. In his examination

of syntagmatic meaning relations between lexical units, Cruse describes

collocational restrictions as co-occurrence restrictions that are arbitrarily

established (Cruse 1986:279). For example, 'kick the bucket' can only be used

with human beings, although its propositional meaning is simply 'die' and not

'die in a characteristically human way'. Similarly, 'blond' refers to hair, but

describing a hairy animal or a fur coat as 'blond' would be unacceptable. Cruse

also distinguishes three kinds of collocational restrictions: systematic, semi-

systematic and idiosyncratic, according to whether, and if so to what degree,

the semantic properties of a lexical item set up an expectation of a certain

collocant. Lexical units that belong to the category of systematic collocational

restrictions are 'grill' and 'toast'. Both verbs denote the same process from the

point of view of the agent, but different patients: normally we 'grill' food that is

raw, while we 'toast' food that is already cooked. Semi-systematic are those

collocational restrictions that still behave as presuppositions of the lexical item

in question, but there can be certain exceptions to the general tendency. For

example, 'customers' obtain something material in exchange for money, while a

'client' receives a less tangible professional or technical service. So, butchers,

bakers, and grocers have 'customers', but solicitors and architects have 'clients'.

However, banks seem to have 'customers' rather than 'clients' (Cruse 1986:281).

Finally, for lexical items that present idiosyncratic collocational restrictions,

their collocational ranges can only be described by enumerating all their

acceptable collocants (Cruse 1986:281). For example, one can 'pay attention/a

visit' but not ?'pay a greeting or welcome'. Idiosyncratic collocational

preferences, such as 'flawless performance' but not *'unblemished

performance', do not give rise to presuppositions, according to the semantic

approach, and Cruse wonders whether "idiosyncratic restrictions are a matter

of semantics at all" (Cruse 1986:282). A close study of what collocational

restrictions can deliver to the sentence they are used in is totally justified, since

they are not 'logically' necessary. For example, 'die' and 'pass away' have the

same meaning, but 'pass away' refers to human beings, so the use of 'pass

away' in the sentence 'My grandfather passed away' adds semantic cohesion to

it; if it is used to describe the death of a pet animal then it anthropomorphises

the animal (Cruse 1986:280). Due to the difficulty of the syntagmatic relations,

Cruse (1986), like most lexical semanticists, finds that paradigmatic sense

relations are "a richer vein to mine than relations of the syntagmatic variety"

(Cruse 1986:86).

One of the weaknesses of the semantic approach - the view that co-

occurrence of words is the result of their semantic properties - is that there is a

large number of idiosyncratic co-occurrences or combinations that are

arbitrarily restricted (see Cruse's examples above). These constructions create

problems for the study of collocations under a theory of lexical fields, and

therefore they are left unexplained and marginal by semanticists. To return to

Halliday's example, since there is nothing in the meaning of 'tea' to explain why

it collocates with 'strong' but not with 'powerful', according to the semantic

approach, it will be listed as an idiom and as such it will be ignored in a study

of lexical semantics. Furthermore, as Lehrer (1974) points out, finding semantic

features for each lexical item that would account for all its collocates is an

extremely ambitious task (Lehrer 1974:178). Fillmore (1978) also points out the

difficulty of estimating the magnitude of collocational binding between lexical

items, while he acknowledges the fact that a semantic theory must not accept

the suggestion that all meanings must be described in the same terms.

An example of how the semantic approach to the study of collocations

can be best utilised was the compilation of a prototypical dictionary, the

Explanatory Combinatorial Dictionary (ECD), of any language. The ECD is

related to the Meaning-Text theory which defines language as "a specific

system of correspondences between an infinite set of meanings and an infinite

set of texts" (Mel'cuk 1988:167). As a core component of the Meaning-Text

Model, the ECD, according to Mel'cuk (1988):

"ensures the lexicalisation of the initial meaning (i.e., of semantic

representation), uniting bundles of configurations of semantic elements

into actual lexical units and supplying the enormous bulk of syntactic

and lexical co-occurrence information that accrues from the individual

lexical units of the language in question" (Mel'cuk 1988:167).

Each ECD entry is divided into three zones: a semantic zone, a syntactic

zone and a lexical co-occurrence zone. The latter comprises all the restricted

lexical co-occurrences of the entry lexeme. For this purpose, Mel'cuk and

Zholkovsky, the ECD initiators, devised the concept of Lexical Functions that

describe all the paradigmatic and syntagmatic relations that a lexeme can have

with other lexemes (Mel'cuk & Zholkovsky 1988:42). The above approach

resulted in a large number of standard basic lexical functions - some of which

had already been utilised in dictionaries for several decades (e.g. 'Syn' for

synonyms) and others were new (e.g. 'Instr' preposition meaning 'by means of',

and 'Propt' preposition meaning 'because of', 'as a result of') (see Table 1,

above). In the ECD version for French, Dictionnaire Explicatif et Combinatoire

du Francais Contemporain, there are 53 lexical functions listed, and these are

used together with the other semantic and syntactic information for the

description of 50 lexical items. Mel'cuk and Zholkovsky are considered

pioneers in their lexicographic principles and the heuristic criteria they used for

the compilation of the ECD. The fact that only 50 lexical items were described

in the French ECD underlines the extremely difficult task of listing all the

semantic features of lexical items in an effort to account for all their collocates.

Despite its limitations, the ECD could be used as "a central component of

automatic text synthesis and analysis", as a "format" for the development of

textbooks, pedagogically oriented dictionaries, and reference works, and also it

can contribute to language theory (Mel'cuk & Zholkovsky 1988:66-67).

Even though semanticists claimed that syntagmatic lexical relations

should be studied under the scope of semantics, they did not proceed any

further with the study of collocation and they did not make the phenomenon of

'collocation' any more explicit. Due to the irregularities and idiosyncrasies that

collocations present, semanticists, who played a role similar to that of grammarians

(i.e. assigning semantic labels to sentence constituents and examining

generalisable tendencies and regularities), preferred to study the more regular

paradigmatic lexical relations, abandoning collocations to their rulelessness.

1.3.3 The Structural Approach

The structural approach is represented by those linguists and researchers who

suggest that collocation is influenced by structure, and collocations occur in

patterns. Therefore, the structural approach recommends that the study of

collocations should include grammar.

The Neo-Firthians' view of separating lexical analysis from grammatical

analysis was criticised by Mitchell (1971), who argues for the "one-ness of

grammar, lexis and meaning" (Mitchell 1971:43). The interdependence of

grammar and lexicon is evident from the fact that 'lexical particularities' derive

their meaning not only from contextual extension of a lexical kind but also from

the generalised grammatical patterns in which they appear (Mitchell 1971:48).

For the study of collocations, Mitchell proposes that "collocations [which are 'of

roots' rather than 'of words'] are to be studied within grammatical matrices"

(Mitchell 1971:65). In a group of word forms like 'drinks', 'drinker' and

'drinking' Mitchell abstracts the common elements of each word form and

labels that as 'root', e.g. /drink-, and the associations of different roots, e.g.

/drink- and /heav-, as 'collocations', e.g. 'heavy drinker', 'drink heavily'

(Mitchell 1971:51). Mitchell refers to the collocation 'heavy drinker' as an

exemplification of the colligation 'adjective + agentive noun' (Mitchell

1966:337). The relationship between 'collocation' and 'colligation' is one of

generality: 'colligations' are the generalised classes of associations and

'collocations' are their particular members (Mitchell 1971:53).

Mitchell's view that collocations are of roots rather than of words does

not hold for every combination of roots. For example, 'faint praise' is an

acceptable English collocation, but not all combinations of the two roots, /faint-

and /praise-, produce acceptable collocations: 'she was damned by faint praise'

is acceptable, but 'he praised her faintly' is not.

Matthews (1965) proposes another way of studying collocations within

grammar. He suggests enriching Chomsky's syntax with extra sets of rules that

will account for the selectional restrictions on lexical items. This approach

deals with the syntagmatic relations along a string of lexical items, a 'kernel

colligation' (p.38), rather than with individual collocational relations of pairs of

words, but Matthews realises that such a description of the language involves

double or triple the number of rules when compared to a description on the

lines of Chomsky's syntax (Matthews 1965). Matthews' theory suggested the

study of syntagmatic relations, and consequently of collocation, along the lines

of transformational grammar, but it was not developed any further.

The influence of grammar on collocation was also discussed by

Greenbaum (1970, 1974), who pointed out that certain instances of collocation

require syntactic information. For example, 'much' collocates with 'prefer' when

it is in a pre-verb position as in 'I much prefer a dry wine', but it does not

collocate with 'prefer' in post-object position as in *'I prefer a dry wine much'

(Greenbaum 1974:82). Greenbaum suggests that the collocability of words (i.e.

their potential co-occurrence with other lexical items) should be "tied" to

syntax, and realises that there are certain lexical items that can occur only in

certain syntactic relationships, e.g. 'His sincerity frightens us' but not 'We

frighten his sincerity' (Greenbaum 1974:82). Without reference to syntax, the

notion of collocability becomes vacuous - virtually any two items can co-occur

at a given arbitrary distance. For example, 'sincerity' can collocate with

'frighten', but the acceptability of the combinations they produce can only be

judged via syntax.

The notion of language blocks and lexicalised sentence stems was

introduced by Pawley and Syder (1983), who suggest that if a learner is going

to achieve a native-like control of a language, then along with the rules of a

generative grammar, she/he needs to "learn a means for knowing which of the

well-formed sentences are native-like -- a way of distinguishing those sentences

that are normal or unmarked from those that are unnatural or highly marked"

(Pawley & Syder 1983:194). Pawley and Syder propose a new way of

examining native-like selection and fluency. According to their approach,

learners memorise a language in blocks, and a big portion of a native speaker's

lexicon consists of "lexicalised sentence stems". For example, an expression of

apology like 'I'm sorry to keep you waiting' gives the sentence stem 'NP be-

TENSE sorry to keep-TENSE you waiting'; the constituents of this sentence are

its 'inflections' and any additional constituents (e.g. 'all this time') are its

'expansions' (Pawley & Syder 1983:210). According to Pawley and Syder

lexicalisation belongs to the domain of competence and a sentence stem can be

lexicalised if it is a standard expression of the meaning in question in a

particular community, or if it is an "arbitrary choice, in terms of linguistic

structure, for the role of standard expression". For example, 'it's twenty to six'

is a standard expression in English since it is a convention that one tells 'to

[Hour]' rather than 'preceding [Hour]' or 'before [Hour]', and 'I want to marry

you' is an arbitrarily established standard usage, compared to a less standard

paraphrase such as 'I wish to be wedded to you', which could be used in a

formal letter or a satirical speech (Pawley & Syder 1983:198). As with most of

the theories examined so far, Pawley and Syder do not define the notion of

lexicalised sentence stems any further, and they did not offer an explicit list of

sentence stems that could be used as a framework in the study of collocations.

The view that language consists of blocks or 'chunks' was also supported

by Nattinger and DeCarrico (1992), who proposed the compilation of a lexical

phrase dictionary for L2 learners. Nattinger and DeCarrico give the following

examples of lexical phrases for inclusion in the dictionary:

Conversational Maintenance (regularities of conversational interaction

that describe how conversations begin, continue and end). Summoning:

Excuse/pardon me (sustained intonation); Hey/hi/hello, (Name); How

are you (doing)? I didn't catch/get your name; Do you live around here?

Hello, I'm + NAME; Good morning/afternoon/evening, (how are you)

What's up? (Nattinger & DeCarrico 1993).

From the examples of lexical phrases, as these were presented by

Nattinger and DeCarrico, it appears that lexical phrases are not the same as

collocations or lexicalised sentence stems. Lexical phrases appear to be more

general than collocations and less systematic than lexicalised sentence stems.

Also, Nattinger and DeCarrico are not concerned with providing explanations

about why certain lexical phrases are put together, which would be more useful

for the study of collocations.

A set of criteria for examining whether a combination of words is a

collocation or not is outlined by Kjellmer (1984), who also suggests the study of

collocations in a grammatical framework. Kjellmer defines collocations as

"lexically determined and grammatically restricted sequences of words"

(Kjellmer 1984:163). According to this definition, only recurring sequences that

are grammatically well-formed can be considered as collocations. For example,

during a search of the Brown Corpus, Kjellmer found the following sequences:

'green ideas', 'try to', 'hall to'. From these strings, it is only 'hall to' and 'try to'

that recur, and from these two, only 'try to' that is grammatically well-formed.

Therefore, only 'try to' is a collocation (Kjellmer 1984:163). Kjellmer also tries to

establish a set of rules for assessing 'collocational distinctiveness'. According to

these, a sequence is highly distinctive when it appears frequently in many and

different categories of texts; it is long (minimum length is two words); and it is

structurally complex.

On the other hand, Renouf and Sinclair (1991) applied their theory of

studying collocations to 'frameworks' consisting of discontinuous sequences of

two words, whose grammatical well-formedness depends on what intervenes,

e.g. 'a + ? + of', 'too + ? + to' (Renouf & Sinclair 1991:128). They found that

in some cases there seems to be a stronger collocational pull exerted by one of

the pair on some items rather than on others, e.g. in the framework 'too + ? + to',

'to' would be able to collocate with 'easy', 'hard', 'good' and 'proud' even in the

absence of 'too', e.g. ‘easy to do’, ‘good to do’, but not with 'much' or 'tired'

which require the presence of 'too', e.g. ‘too tired to dance’, ‘too full to eat’

(Renouf & Sinclair 1991:133). Thus, Renouf and Sinclair demonstrated that the

collocations of grammatical words offer an appropriate basis for studying

collocations, since "co-occurrences in the language most commonly occur

among grammatical words" (Renouf & Sinclair 1991:128).
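The notion of a framework can be illustrated with a short sketch. The code below is only an illustration under stated assumptions, not the procedure used by Renouf and Sinclair: the function name `framework_fillers` and the toy sentence are invented here, and tokenisation is reduced to splitting on spaces. It collects the words that fill the single open slot in a discontinuous pair such as 'too + ? + to'.

```python
from collections import Counter

def framework_fillers(tokens, left, right):
    """Count the words that fill the single slot in 'left + ? + right'."""
    fillers = Counter()
    for i in range(len(tokens) - 2):
        # The framework matches when the two grammatical words sit
        # exactly two positions apart, with one word in between.
        if tokens[i] == left and tokens[i + 2] == right:
            fillers[tokens[i + 1]] += 1
    return fillers

# Invented toy text, not corpus data.
text = "she was too tired to dance and too proud to ask for a lot of help"
tokens = text.lower().split()

too_to = framework_fillers(tokens, "too", "to")  # fillers of 'too + ? + to'
a_of = framework_fillers(tokens, "a", "of")      # fillers of 'a + ? + of'
```

On a real corpus, the filler counts for each framework could then be compared with the counts for the same words followed by 'to' alone, which is the kind of contrast described above for 'easy' and 'tired'.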

The importance of grammatical words for the study of collocations was

also confirmed by Jones and Sinclair (1974). Even though their study on

English lexical collocations was based on a relatively small corpus (147,000

running words), it yielded some interesting results concerning the study of

collocation: the influence of the node does not extend beyond span position

Node (N) + 4 (see also Berry-Rogghe 1973). Grammatical words are not

collocationally neutral (contrary to Haskel 1971). Even though grammatical words

are weak at predicting their environment, they do show ability to predict word

classes at specific span positions, e.g. the collocates of the word 'the' in position

N-1 are mainly verbs and prepositions, while in position N+1 they are nouns

and adjectives. The significance of a collocation takes into account the overall

frequency of the two items concerned, the number of times they occur together,

and the length of the text. Collocations can appear to be 'text dependent'.

Verbs tend to collocate with grammatical items, e.g. 'put' and 'take' collocate

with a great number of prepositions to form phrasal verbs. Association

between lexical items is subject to grammatical influence, e.g. the adjective

'good' is preceded by adverbs and followed by nouns as significant collocates.

Significant collocations show a considerable amount of position dependence,

e.g. in a span of 4, significant collocations most frequently occur in the span

positions immediately next to the node, N-1 and N+1, while very little occurs at

the two extremes of the span, N-4 and N+4. Finally, collocation was found to

be an organising principle that influences the construction and interpretation of

utterances (Jones & Sinclair 1974:48; Leitner 1992).
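The notions of node, span position and significance can be made concrete with a minimal sketch. It assumes a span of four words on either side of the node, as reported above. The expected-frequency calculation is a simple independence assumption chosen here for illustration and should not be read as the formula used by Jones and Sinclair; the function names and the toy text are likewise invented.

```python
from collections import Counter

def collocates_by_position(tokens, node, span=4):
    """Count collocates of `node` at each position from N-span to N+span."""
    counts = {offset: Counter()
              for offset in range(-span, span + 1) if offset != 0}
    for i, token in enumerate(tokens):
        if token != node:
            continue
        for offset in counts:
            j = i + offset
            if 0 <= j < len(tokens):  # ignore positions outside the text
                counts[offset][tokens[j]] += 1
    return counts

def expected_cooccurrences(tokens, node, collocate, span=4):
    """Co-occurrences expected by chance (illustrative assumption only).

    Uses the three quantities named in the text: the frequency of each
    item and the length of the text, scaled by the window size.
    """
    freq = Counter(tokens)
    window = 2 * span  # positions inspected around each node occurrence
    return freq[node] * freq[collocate] * window / len(tokens)

# Invented toy text, far too short for meaningful statistics.
tokens = "the cat sat on the mat and the dog sat on the rug".split()
positions = collocates_by_position(tokens, "sat")
observed = sum(counter["on"] for counter in positions.values())
expected = expected_cooccurrences(tokens, "sat", "on")
```

Keeping a separate counter for each span position makes it possible to inspect position dependence directly, e.g. by comparing `positions[1]` with `positions[4]`. A comparison of `observed` with `expected` is only informative on a corpus of realistic size.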

The study of collocations in structural patterns was also suggested by

Aisenstadt (1979). Aisenstadt distinguishes collocability restrictions as part of

the wide field of collocability. Word combinations whose constituents are

restricted in their 'commutability', i.e. their ability to combine with other words,

are called restricted collocations (Aisenstadt 1979:71). Restricted collocations

are defined as combinations of two or more words used in one of their regular,

non-idiomatic meanings, following certain structural patterns (e.g.

V+(art)+(A)+N), and restricted in their commutability not only by grammatical

and semantic valency (e.g. in the restricted collocation 'shrug one's shoulders'

both components have a narrow semantic valency), but also by usage (e.g. we

can 'bear a grudge' but we cannot *'bear hatred/ scorn') (Aisenstadt 1979:71,

1981:54). Restricted collocations are different from free word-combinations.

For example 'carry' can enter a large number of free word-combinations when it

means 'to support the weight of something' like 'carry a

book/bag/chair/torch/table/etc.', but it may also enter a restricted collocation

pattern 'carry conviction', 'carry persuasion', 'carry weight' when it is used to

denote 'being convincing' or 'winning the argument' (Aisenstadt 1979:72).

Some of the structural patterns of restricted collocations in English listed by

Aisenstadt are given below in Table 2:

Table 2. Examples of structural patterns of restricted collocations in English

Pattern                Example

V+(art)+(A)+N          'command devotion', 'give a loud laugh'
V+prep+(art)+(A)+N     'leap to a sudden conclusion', 'leap to a decision'
A+N                    'cogent argument'
V+Adv                  'take off', 'take away', 'sit down'
I(Intensifier)+A       'dead tired', 'dead drunk', 'stark naked'

Note: V = Verb, art = Article, A = Adjective, N = Noun, prep = Preposition,
Adv = Adverb, I = Intensifier
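A structural pattern with optional elements, such as V+(art)+(A)+N, can be checked mechanically once each word carries a word-class label. The sketch below is a hypothetical illustration rather than Aisenstadt's procedure: the one-letter tags (with D standing for the article) and the hand-assigned tag sequences are assumptions made for this example. The bracketed, optional elements of the pattern are expressed as optional items in a regular expression.

```python
import re

# Hypothetical one-letter tags used only in this sketch:
# V = verb, D = article, A = adjective, N = noun.
V_ART_A_N = re.compile(r"VD?A?N")  # V+(art)+(A)+N

def matches_v_art_a_n(tags):
    """Return True if the whole tag sequence fits V+(art)+(A)+N."""
    return V_ART_A_N.fullmatch("".join(tags)) is not None

# Tag sequences assigned by hand for illustration.
examples = {
    "command devotion": ["V", "N"],
    "give a loud laugh": ["V", "D", "A", "N"],
    "cogent argument": ["A", "N"],
}
results = {phrase: matches_v_art_a_n(tags)
           for phrase, tags in examples.items()}
```

A structural match of this kind identifies only candidates for the pattern. Whether a candidate is a restricted collocation still depends on the restrictions of commutability and usage described above.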

Aisenstadt also reports that restricted collocations have not yet been

studied adequately as a specific linguistic phenomenon, and therefore they

have not received a proper treatment in lexicography: some of them are listed

alongside free word combinations and others are listed in dictionaries of idioms

as idioms (Aisenstadt 1981:53). Aisenstadt concludes that a study of restricted

collocations is of great importance for applied linguistics, translators,

lexicographers, language teachers and students.

The structure-based studies make clear that collocational restrictions do

not apply only to lexical words (as the other two approaches assume) but also

to grammatical words. Furthermore, studies such as Jones and Sinclair (1974),

Renouf and Sinclair (1991), and Aisenstadt (1979) show that it is possible to

study collocations using structural patterns. Thus, there is no need for the

debate among linguists over whether collocations should be described using

lexical analysis, or semantic rules and/or grammar rules. It is possible that by

defining structurally and isolating a particular collocational pattern and

examining its frequency, variability and systematicity in a language corpus, the

notion of collocation could be enriched.

Benson, Benson and Ilson (1986a) compiled the BBI Combinatory

Dictionary of English, a dictionary of English collocations. The difference

between the BBI and the ECD, examined earlier on, is that the BBI includes

more lexical items but gives them a less detailed grammatical and lexical treatment. The

BBI writers do not include in their dictionary "free combinations" that are

predictable and thus not needed, e.g. the collocation of the verb 'to destroy'

with a large number of nouns denoting physical objects like 'bridge', 'house',

'road' etc. (Benson 1985:66; Ilson 1985; Benson et al. 1986a). Fifteen different

types of "essential grammatical and lexical recurrent word combinations" are

defined and included in the BBI dictionary for "general use" (Benson et al.

1986a:7). The BBI distinguishes between grammatical and lexical collocations

in the following way: a grammatical collocation is a phrase that consists of a

dominant word (verb, noun, adjective) and a preposition or grammatical

structure such as an infinitive or clause. Lexical collocations normally do not

contain prepositions, infinitives, or clauses. Typical lexical collocations consist

of nouns, adjectives, verbs, and adverbs. Examples of grammatical and lexical

collocational patterns are given in Table 3.

Table 3. Examples of Grammatical and Lexical Collocations in the BBI Combinatory Dictionary of English

Code    Pattern                 Example

Grammatical Collocations:
(G4)    preposition + noun      in agony, at anchor
(G8)    verb + to infinitive    decide to come, offer to help

Lexical Collocations:
(L1)    verb + noun             make an impression
(L3)    adjective + noun        long hair
(L4)    noun + verb             dogs bark

(Benson et al. 1986)

The BBI contains seven types of lexical collocations, L1...L7, and eight

main types of grammatical collocations, G1...G8, with the eighth type consisting

of nineteen English verb patterns, e.g. SVO to O (e.g. ‘I gave the book to Mary’),

SVV-ing (e.g. ‘I started crying’), SV to inf (e.g. ‘I want to sleep’), etc. Altogether,

there are 33 patterns of grammatical and lexical collocations included in the

BBI.

One of the disadvantages of the BBI is that its writers do not explain how

they established that a word combination is recurrent enough to be included in

their dictionary. The recent advances in corpus analysis provide more accurate

examples of significant collocations for their inclusion in a dictionary (see

COLLINS COBUILD English Words in Use, forthcoming; cited in Bahns 1993;

also Collins COBUILD English Collocations on CD-ROM); for the advantages

of using corpus analysis in lexicography see also Sinclair (1985) and

Greenbaum (1984). The use of language corpora for the detection of collocative

semantic lexical relations in the compilation of dictionaries is also suggested by

Meijs (1992), Noel (1992), Sinclair (1992), (1993), and, for the making of a lexical

and phraseological grammar, Francis (1993).

Even though the BBI has methodological weaknesses, its major

contribution to the study of collocations is that it defines explicitly a number of

patterns and, unlike the previous studies on collocations, it actually organises

the collocations of a large number of words around those patterns, proving that

it is possible to use structural patterns in order to study collocations.

1.3.4 Summary of the Three Approaches

The three approaches to the study of collocations focus on different

aspects of the phenomenon of collocation. The lexical composition approach

regards lexical analysis as independent from grammar and considers lexis an

autonomous entity, choosing its own collocates which can be enumerated and

classified in lexical sets. The semantic approach tries to find semantic features

based on the meaning of lexical units that would enable the prediction of their

collocates. The structural approach tries to establish patterns of collocations

that include grammatical and lexical words alike.

The semantic and the lexical composition approaches are restricted to

the study of a small number of collocations (usually 'verb noun' and 'adjective

noun' collocations); they exclude grammatical words from their scope, and

eventually they achieved only limited results.

The structural approach, on the other hand, examines more patterns of

collocations, includes grammatical words in the study of collocations, and

provides a framework for the study of collocations that is feasible and

systematic (e.g. the collocational patterns included in the BBI).

1.4 Collocations and Idioms

Before proceeding to the description of the framework employed by the

present study on collocations, it is necessary to make reference to one of the

debates concerning the study of collocations: to what degree collocations are

similar to idioms.

Along the continuum with free combinations on one end and idioms on

the other, collocations seem to fall in the middle as they blend together the

semantic transparency of free combinations and the syntagmatic bonds of

idioms. An idiom is usually described as "a constituent or series of constituents

for which the semantic interpretation is not a compositional function of the

formatives of which it is composed" (Nagy 1978:296). Collocations, although

they are combinations of at least two words, exhibit a degree of syntactic

frozenness and resistance to lexical substitution; they are semantically

transparent; and hence they are not idioms. However, there are certain lexical

combinations that are semantically transparent, and therefore should be

classified as collocations, but which also show a certain degree of syntactic

frozenness and resistance to lexical substitution, just like idioms: for example,

'foot the bill', 'curry flavour', 'high explosive', 'highest confidence'. Such

expressions have been called 'bound collocations' (Cruse 1986:41), 'semi-

productive expressions’ (Nagy 1978:296), and 'partial idioms' (Palmer 1976:99).

There are linguists who do not distinguish between idioms and

collocations. For example, Wallace (1979) describes collocations as a class of

idioms, as stereotyped expressions that are easily decoded from the meaning of

their constituent elements. Wallace distinguishes two dimensions to the idiom:

the dimension of meaning (the semantic dimension) and the dimension of

grammatical context (the structural dimension) (Wallace 1979:63). Idioms,

according to the degree of their decodability, are classified as 'transparent', if

they are easily decoded, or 'opaque'. Idioms falling into the area of transparent

stereotypes are called 'restricted collocations', e.g. ‘Pleased to meet you’, ‘be

honest with’, ‘use up’.

The semantic approach to the study of collocations also considers lexical

co-occurrences that are arbitrarily restricted and so lacking a semantic

explanation. These are like idioms, i.e. linguistically non-productive, and as

such they should be left out of the study of lexical fields (Lehrer 1974:187).

By and large, semantic transparency appears to be the only criterion that

could make a difference in the process of classifying expressions as idioms or

collocations, while the importance of how clear-cut the distinction is between

collocations and idioms seems to vary among linguists, with some arguing that

"it is, of course, a matter of terminology whether collocations should be classed

separately from idioms or as a major sub-class" (Bolinger 1976:5).

This study examines collocations, i.e. word combinations, in terms of the

syntactic patterns into which they enter. Therefore, the degree of their semantic

transparency is, for the purposes of this study, overlooked.

1.5 A Framework for the Study of Collocations

For the investigation of the acquisition of collocations, this study adopts

a framework based on the structuralist approach. The framework comprises 37

patterns of collocation. 33 of these patterns are from the BBI, 2 patterns are

extensions of the BBI patterns, and 2 are adapted from Zhang (1993). The use of

structural patterns for the study of collocations has been employed in previous

studies (Zhang 1993; Bahns & Eldaw 1993; Biscup 1992). These patterns are

utilised in this study in order to operationalise the notion of collocation and to

examine the development of English collocational knowledge in L2 learners. In

order to avoid confusion between structural/collocational patterns and

grammatical patterns, the patterns used in this study are, from now on, referred

to as 'types'. 'Type' with a capital 'T' is used for reference to individual

collocation types. For a complete list of the 37 types of collocation with

examples, see Table 4 below.

Table 4. Types of Collocation used in the study*

TYPE                                    EXAMPLE

1. Noun Preposition                     argument about
2. Noun to Infinitive                   (it was a) pleasure to do it
3. Noun that-clause                     he took an oath that he would do ....
4. Preposition Noun                     in agony
5. Adjective Preposition                angry at
6. Predicate Adjective to Infinitive    she is ready to go
7. Adjective that-clause                she was afraid that she would fail...
8. SVO to O/ SVOO                       he sent the book to his brother
9. SVO to O                             they described the book to her
10. SVO for O/ SVOO                     she bought a shirt for her husband
11. SV(O) Preposition O                 we export to many countries
12. SV to Infinitive                    they began to speak
13. SV Infinitive                       we must work
14. SV V-ing                            he kept talking
15. SVO to Infinitive                   she asked me to come
16. SVO Infinitive                      she heard them leave
17. SVO V-ing                           I caught them stealing apples
18. SV Possessive V-ing                 they love his clowning
19. SV(O) that-clause                   they admitted that they were wrong
20. SVO to be c                         we consider her to be well trained
21. SVOc                                she dyed her hair red
22. SVOO                                the teacher asked the pupil a question
23. SV(O) Adverbial                     he carried himself well
24. SV(O) wh-word                       he asked how to do it
25. S(it) VO to Infinitive              it surprised me to learn of her decision
26. SVc                                 he was a teacher
27. Verb Noun/Pronoun (creation)        make an impression
28. Verb Noun (eradication)             reject an appeal
29. Adjective Noun                      strong tea
30. Noun Verb                           bees buzz
31. Noun1 of Noun2                      a piece of advice
32. Adverb Adjective                    deeply absorbed
33. Verb Adverb                         affect deeply
34. Noun Noun                           aptitude test
35. Miscellaneous                       in fact
36. Preposition Determiner Noun         on the contrary
37. Phrasal Verb                        to pass on

Note: S: Subject, V: Verb, O: Object, c: complement

* Henceforth, ‘Preposition’ and ‘Prep’, ‘Adjective’ and ‘Adj’, ‘Noun’ and ‘N’,

‘Verb’ and ‘V’, ‘Infinitive’ and ‘Inf’, ‘creation’ and ‘creat’, ‘determiner’ and ‘det’

are used interchangeably depending on the availability of space in the tables.

See also table of abbreviations.

The categorisation of the above collocation types in lexical and

grammatical collocations by the BBI (see section 1.3.3.) was further refined by

Zhang (1993). According to Zhang, a lexical collocation is "a type of collocation

where one component recurrently co-occurs with one or more other

components as the only lexical choice or one of the few lexical choices in a

combination" (Zhang 1993:14). A grammatical collocation, on the other hand, is

"a type of collocation where one component recurrently co-occurs with one or

more other components as a grammatical category, rather than a particular

lexical item" (Zhang 1993:14). In other words, if a collocation is lexicalised, i.e.

if the combination of an open class word (verb, noun, adjective, adverb) and a

preposition or another open class word is used as a single word, e.g. 'to do

one's homework', 'to depend on', 'strong in', then it is a lexical collocation. If

the collocation is a combination of an open class word (verb, adjective, noun,

adverb) and a clause, infinitive, gerund, or preposition, then it is a grammatical

collocation, e.g. 'enjoy + V-ing', 'want + to infinitive'. Zhang's definition of

lexical and grammatical collocations was found to be more appropriate than the

BBI's for pedagogical research, and this study considers the following types to

be lexical collocations (Types 27, 28, 29, 30, 31, 32, and 33 were also defined as

lexical collocations by the BBI):

Table 5. Types of Lexical Collocations used in the study

Type

1. Noun Prep

5. Adjective Prep

27. Verb Noun (creation)

28. Verb Noun (eradication)

29. Adjective Noun

30. Noun Verb

31. Noun1 of Noun2

32. Adverb Adjective

33. Verb Adverb

36. Prep Det Noun

37. Phrasal Verb

The use of syntactic structures to operationalise English collocations and

to examine the acquisition of an area of vocabulary, i.e. collocations, is

considered appropriate for this study for the following reasons:

i) English collocations have already been found to be influenced by structure

(see studies under the structuralist approach, above). Also, the classification of

English collocations in patterns/types enables a large scale investigation of

vocabulary acquisition, i.e. by using types of collocations a larger area of

vocabulary will be covered than by using a number of specific collocations.

ii) The use of syntactically defined structures will enable the description of the

development of collocational knowledge with respect to types of collocation

rather than to a limited number of specific collocations. If collocational

knowledge is affected by structure and does develop in terms of collocation

types, then the results of this investigation will be applicable for all the specific

collocations that belong to a particular collocation type. For example, if certain

conclusions can be drawn about how collocational knowledge develops with

respect to the 'SV inf' collocations, then the results will hold for all collocations

that belong to this type: 'I can sing', 'we must go', 'he might win', etc.

iii) The old debate in linguistics on the division between grammar/syntax and

vocabulary did not prove a constructive approach to the description of L2

acquisition. If vocabulary is not a mere listing of words in memory but

combinations of words carrying meaning and governed by syntactic rules, as

the studies reviewed in this chapter claim, then investigating the acquisition of

vocabulary in combination with syntactic structures will yield a more complete

picture of L2 vocabulary acquisition.

The investigation of the development of English collocations at different

proficiency levels was considered useful because previous studies have made

assumptions based on their results that learners at different levels of

proficiency use different types of collocation (Zhang 1993). The aim of the

present study will be to describe the development of collocational knowledge

in L2 learners at different proficiency levels and to investigate whether there

are any collocation types that are acquired before others. If different collocation

types are used by L2 learners at different levels of proficiency, could it be that

there are developmentally determined acquisition orders in the acquisition of

English collocations?

The following chapter reviews a number of studies on acquisition orders

and developmental sequences in L2 acquisition.

CHAPTER 2

SECOND LANGUAGE ACQUISITION AND THE DEVELOPMENT

OF COLLOCATIONAL KNOWLEDGE

2.0 Introduction

In the 1970s, research in L1 acquisition provided evidence of

developmental patterns and stages that characterise child language acquisition

(see Brown 1973). Along similar lines, studies in L2 acquisition investigated

how an L2 is acquired and whether it follows a similar developmental route.

Theories of L2 acquisition were formulated, deductively or inductively, and

research in the L2 classroom flourished. Longitudinal and cross-sectional

studies were conducted (for a critique see Meisel, Clahsen & Pienemann 1981;

Rosansky 1976) and the data were analysed to reveal "developmental

sequences" of L2 acquisition. These sequences were then compared to L1

developmental sequences and found to be either similar (Ravem 1968, 1970,

1974; Dato 1970; Milon 1974; Gillis & Weber 1976) or different (Wode 1976).

Among the studies investigating L2 development there is great variation

in the way language "development" is operationalised. Some studies describe

the various "stages" that the learner's interlanguage goes through before a

particular language structure is considered to be acquired, e.g. the five stages of

the acquisition of word order in German (Meisel et al. 1981). Such stages form

a "developmental sequence" that all learners seem to traverse regardless of their

native language or the learning context. Other studies describe "acquisition

orders" for certain language components, e.g. it has been shown that the

acquisition of a number of English morphemes follows such a predetermined

acquisition order (see Krashen 1977). Such orders have also been referred to as

"accuracy orders" because the criterion for a certain item to enter an order is its

accurate use by the L2 learner. Morpheme acquisition orders also support the

existence of developmental sequences in L2 acquisition. The most commonly

researched aspects of language for developmental sequences were the areas of

morphology (Dulay & Burt 1973, 1974; Bailey, Madden & Krashen 1974; Larsen-Freeman

1975; Krashen, Sferlazza, Feldman & Fathman 1976; Mace-Matluck

1977; Fuller 1978; Fathman 1978; Makino 1979; Lightbown 1983), word-order

and syntax (Huang 1970; Butterworth 1972; Ravem 1974; Wagner-Gough 1975;

Adams 1978; Cazden, Cancino, Rosansky & Schumann 1975; Gillis & Weber

1976; Meisel et al. 1981; Pienemann, Johnston & Brindley 1988).

This chapter reviews studies on developmental sequences pertaining to

different aspects of L2 acquisition and highlights the motivation for the present

study, i.e. the investigation of evidence of development in the acquisition of

English collocations.

2.1 Morphology

The Natural Order Hypothesis in Krashen's Monitor Theory suggests

that there is a natural order of acquisition of L2 rules. Some of them are early-

acquired and some are late-acquired. This order does not necessarily depend

on simplicity of form. It can also be influenced by classroom instruction

(Krashen 1985). Evidence for the Natural Order Hypothesis was provided by a

series of research studies investigating morpheme acquisition orders and

showing that grammatical morphemes elicited in free speech and with the use

of specifically designed instruments (e.g. the Bilingual Syntax Measure)

constitute a natural order of morpheme acquisition for performers (Houck,

Robertson & Krashen 1978; Krashen, Houck, Giunchi, Bode, Birnbaum & Strei

1977). Krashen's Natural Order for the acquisition of 9 English morphemes,

from the early acquired morphemes (top) to the late acquired ones (bottom), is

given below in Table 6:

Table 6. The acquisition of English morphemes

Morpheme

-ing

plural

copula

auxiliary

article

irregular past

regular past

3rd person singular

possessive 's

(Krashen 1977).

Dulay and Burt (1973, 1974) used the Bilingual Syntax Measure (BSM)

to elicit speech data from 250 Spanish- and Chinese-speaking children learning

English in the USA. They found statistically significantly related acquisition

orders for the two groups, but these were different from the order of

acquisition for English L1 obtained by Brown (1973) in his longitudinal study of

three children. Dulay and Burt's findings were also confirmed by Bailey et al.

(1974) in their study of 73 Spanish and non-Spanish ESL adults.

Acquisition orders that were L1-neutral were also found by Larsen-

Freeman (1975). She tested the acquisition of ten English morphemes by 24

adults from four different L1 backgrounds (Arabic, Spanish, Japanese, and

Farsi) using five different tasks: the BSM speaking task, a reading task, a

listening comprehension test, an imitating task, and a writing test. Larsen-

Freeman found that language background did not affect performance in

morpheme ordering in a significant way, i.e. there were significantly high

coefficients of concordance produced among the language groups on tasks

within the study, and also the BSM elicited a very similar order of morphemes

for learners from different L1 backgrounds. The BSM ordering from Larsen-

Freeman's study and the ordering obtained by Dulay and Burt (1974) correlated

highly at the .01 level of significance, rho = .87. Also the ordering elicited by

the imitating task correlated significantly with the ordering obtained in Dulay

and Burt (1974), rho = .60. However, the morpheme orderings that the other

three tasks produced had low correlations with Dulay and Burt's study, none of

them reaching statistical significance.
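
The coefficients reported above (rho = .87 and rho = .60) compare two orderings of the same items; rho conventionally denotes Spearman's rank-order correlation coefficient. As a minimal sketch of how such a coefficient is computed, the Python fragment below applies the standard formula for rankings without ties, rho = 1 - 6Σd²/(n(n² - 1)), to two rankings of the nine morphemes listed in Table 6. Both rankings are hypothetical and purely illustrative: they are not the data of Dulay and Burt (1974) or Larsen-Freeman (1975), and the function name is an assumption of this sketch.

```python
# Hypothetical illustration only: neither ranking below comes from any study.
MORPHEMES = ["-ing", "plural", "copula", "auxiliary", "article",
             "irregular past", "regular past", "3rd person singular",
             "possessive 's"]


def spearman_rho(rank_a, rank_b):
    """Spearman's rho for two rankings of the same items (no tied ranks)."""
    if len(rank_a) != len(rank_b):
        raise ValueError("rankings must cover the same items")
    n = len(rank_a)
    # Sum of squared differences between the two ranks of each item.
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))


# Ranking A follows the listed order; ranking B swaps two adjacent pairs.
order_a = list(range(1, len(MORPHEMES) + 1))
order_b = [1, 3, 2, 4, 5, 7, 6, 8, 9]

rho = spearman_rho(order_a, order_b)
print(f"rho = {rho:.2f}")  # prints: rho = 0.97
```

Identical rankings yield rho = 1 and fully reversed rankings yield rho = -1, so a value such as .87 indicates orderings that agree closely without being identical. Whether a given rho is statistically significant depends additionally on the number of items ranked.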

In an attempt to provide an explanation for the similar ordering obtained

by the BSM in both the Dulay and Burt (1974) and the Larsen-Freeman (1975)

studies, Larsen-Freeman suggested that input frequency could be one factor

influencing the order along with other factors (Larsen-Freeman 1975, 1976).

Also, other factors affecting morpheme acquisition by L2 learners are that the

learner supplies certain morphemes correctly because she/he is trying to match

the gestalt of the speech she/he hears, or that these certain morphemes occur in

speech patterns that she/he has memorised (Larsen-Freeman 1978:100).

Other morpheme studies involved learners from Indo-European and

non-Indo-European L1 backgrounds (Mace-Matluck 1977; Fuller 1978), in both

second and foreign language learning contexts (Fathman 1978; Makino 1979;

Lightbown 1983), and on different tasks (Krashen et al. 1976). Morpheme

studies for L2s other than English (e.g. Spanish in van Naerssen 1980; Quiche

Mayan in Bye 1980; and a 'creoloid' (Singapore English) in Platt 1977) also

proved the existence of accuracy orders.

Evidence was also provided for strong similarities in the L2 acquisition

process for learners involved in different learning situations and with different

amounts of exposure (Makino 1979), and for the language acquisition processes

utilised by adults and children (Krashen et al. 1976).

An alternative to the morpheme order studies is reported by Wode,

Bahns, Bedey and Frank (1978). They describe the stages that German children

go through while acquiring one morpheme, i.e. plural in English. The data for

this study were from Wode's four children acquiring English naturalistically

(without classroom instruction) during a six-month field trip to the USA. There

are three stages described:

Stage 1: One form for both singular or plural intention

Stage 2: Two forms for each noun reflecting target singular and plural

Stage 3: Forms with plural target reflexes restricted to plural intention;

forms with singular target reflexes restricted to singular intention

(Wode et al. 1978:178-179).

Wode et al. argue that their approach of investigating morpheme order

and language acquisition as a developmental process can provide more insights

into the mechanisms of the process of language acquisition. However, their

approach was limited to the analysis of the acquisition of English plural

inflections, and it can only be used for the investigation of the acquisition of

morphemes that present a variety of allomorphs, like the English plural.

Although these results strongly suggest that common accuracy and

acquisition orders in morphemes are evident across L2 learners, there are

certain shortcomings in the morpheme studies. Research did not provide

enough empirical support for a theoretical explanation of the developmental

sequences (e.g. for a critique of Krashen's Monitor Theory see Gregg 1984).

Also, only a tiny portion of English grammar was studied, and the acquisition

orders obtained represented a linguistically heterogeneous group of bound and

free NP and VP morphemes. The methodology was also criticised for using a

limited number of elicitation methods (mainly the BSM for which claims have

been made that it is not a valid instrument for measuring the sequence of

morpheme acquisition; for a critique of the BSM see Porter 1977). However,

even though these orders are not rigidly invariable across studies, they are far

from being random (Krashen 1977; Larsen-Freeman & Long 1991).

2.2 Syntax

Empirical evidence for developmental sequences in the area of syntax is

also available. Studies identified developmental sequences for the acquisition

of ESL interrogatives (Huang 1970; Butterworth 1972; Ravem 1970, 1974; Young

1974; Wagner-Gough 1975; Adams 1978; Cazden, Cancino, Rosansky &

Schumann 1975; Gillis & Weber 1976; for a review see Larsen-Freeman & Long

1991). Four stages of interrogative formation in ESL were identified:

Stage 1. Rising intonation

e.g. He work today?

Stage 2. Uninverted Wh-word, with or without an auxiliary

e.g. What he (is) saying?

Stage 3. Overinversion

e.g. Do you know where is it?

Stage 4. Differentiation

e.g. Does she like where she lives?

(examples from Larsen-Freeman & Long 1991:93)

Four stages of acquisition were also identified for negation in ESL:

Stage 1: no + X

e.g. 'No book', 'No you playing here'

Stage 2: no/don't Verb

e.g. 'He don't have job'

Stage 3: auxiliary-negation

e.g. 'I can't play the guitar'

Stage 4: analysed don't

e.g. 'She doesn't drink alcohol'

(examples from Larsen-Freeman & Long 1991:94; for a review see

Schumann 1979).

Studies in German word order acquisition yielded a five-stage model of

development in the acquisition of German L2 (Meisel et al. 1981). The

Multidimensional Model provided a theoretical basis for the observed

acquisition order and was further extended to ESL acquisition (Pienemann &

Johnston 1987). According to the model, invariant developmental stages in the

acquisition of certain morphological and syntactic elements in both German

and English can be predicted and explained in terms of "hierarchically ordered

speech processing constraints" (Pienemann, Johnston & Brindley 1988:217).

Based on the same data, Pienemann (1984, 1985, 1989) suggests that

formal input impedes rather than promotes language acquisition, and that the

formal instruction of syntax can therefore be abandoned (see also Dulay & Burt 1973).

For a review of the debate on whether instruction affects L2 acquisition see

Long (1983); for a critique of the Multidimensional Model see Hudson (1993).

The acquisition of relative clauses in ESL was also investigated and

found to follow a developmental route similar to that found in some L1

acquisition studies (Schumann 1980).

Apart from describing developmental stages for the acquisition of a

single syntactic structure, there have also been studies that investigated the

existence of acquisition orders of grammatical structures. Fathman (1977)

tested the usage of 20 grammatical structures by 500 non-native English-

speaking children learning English in public schools in the United States. She

found difficulty orders (or learning orders) that were similar for students

coming from different language backgrounds and ages. Fathman suggests that

the forms found to be used correctly early in the learning of L2 are those which

are needed for effective communication.

Difficulty orders were also found by Yamada and Matsuura (1982) in the

acquisition of English articles by Japanese students. Yamada and Matsuura

reported that the definite article was the easiest for both intermediate and

advanced students. The zero article was most difficult for the intermediate

students, while the indefinite article remained most difficult for the advanced

level students.

In a functional approach to linguistic universals in L2 acquisition

research, Keenan and Comrie (1977) constructed the Accessibility Hierarchy for

Relativisation. They argue that the degree of difficulty for relativising on a

particular noun phrase proceeds along an implicational order. For example,

sentences with NP in subject position are predicted to relativise more easily than

sentences with NP in direct object position. Keenan and Comrie suggest that

the Accessibility Hierarchy could be considered as an acceptability ordering

within each language and used for the explanation of syntactic processes in

learners' interlanguage. A number of studies have used the Accessibility

Hierarchy for testing predictions concerning ease or difficulty of acquisition.

Gass (1979) and Gass and Ard (1980) tested relative clause formation in English

by learners from different L1 backgrounds. The results indicate that learners

followed the constraints of the Accessibility Hierarchy in their English

regardless of their L1 background. All learners found it easier to relativise

sentences with NP in subject position than sentences with NP in direct object

position.
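
The implicational claim behind the Accessibility Hierarchy can be made concrete with a short sketch. The Python fragment below encodes the hierarchy as given by Keenan and Comrie (1977), namely subject > direct object > indirect object > oblique > genitive > object of comparison, and checks whether the set of positions a learner can relativise forms an unbroken run from the top of the hierarchy. The position labels, the function name and the two learner profiles are hypothetical and serve only as illustration; they are not data from Gass (1979) or Gass and Ard (1980).

```python
# Positions ordered from most to least accessible (Keenan & Comrie 1977).
# The short labels are an assumption of this sketch.
HIERARCHY = ["SU", "DO", "IO", "OBL", "GEN", "OCOMP"]


def follows_hierarchy(relativised):
    """True if the relativised positions form an unbroken run from the top.

    Implicational reading: being able to relativise on a position implies
    being able to relativise on every position above it.
    """
    positions = set(relativised)
    unknown = positions - set(HIERARCHY)
    if unknown:
        raise ValueError(f"unknown positions: {sorted(unknown)}")
    # A consistent profile equals the first len(positions) hierarchy entries.
    return positions == set(HIERARCHY[:len(positions)])


# Hypothetical learner profiles, invented for illustration.
learner_a = {"SU", "DO"}   # subject and direct object
learner_b = {"SU", "IO"}   # skips direct object

print(follows_hierarchy(learner_a))  # True
print(follows_hierarchy(learner_b))  # False
```

A profile such as that of learner_b, in which a lower position is relativised while a higher one is not, is the kind of pattern that would count against the hierarchy's predictions.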

Markedness was also examined as a factor affecting L2 acquisition in the

Principles and Parameters approach (White 1989). Although there are a

number of definitions of markedness, most of them treat as marked those

structures which are exceptions to linguistic generalisations, or which are of

low frequency across the world's languages, or which are very complex (White

1989:117). Markedness has been used to make predictions about L1 and L2

acquisition. It has been claimed that developmental sequences of language

structures based on the criterion of markedness can predict ease or difficulty of

acquisition of specific language structures. For example, it was shown that

learners acquire unmarked forms, i.e. the unmarked dative prepositional

phrase complement (e.g. Mary gave the book to John), before marked forms, i.e.

marked double noun phrase constructions (e.g. Mary gave John the book)

(Mazurkewich 1984). The limitations of the markedness theory in predicting

developmental sequences of L2 acquisition are reported by White (1987). In an

investigation of the value of markedness as a predictor of L1 transferability,

White (1987) concludes that even though markedness can affect acquisition, it is

not a clear predictor of what L2 learners will or will not transfer from L1.

The above studies provide evidence that there are stages of L2 learner

development which are sequenced in a predictable order and which can be

identified and described with a certain degree of accuracy. What is also evident

from the studies reviewed so far is that grammar (in the form of syntax, word-

order or morphology) has been the central issue in L2 acquisition research. In

contrast, phonology and vocabulary have not been investigated to the same

extent that grammar has (Tarone, Swain & Fathman, 1976). Other limitations

reported by Tarone et al. are the undeveloped methodology for data collection

and data analysis (the limitations of data collection instruments such as the

BSM have been noted by a number of researchers), and finally the limited

number of replicated studies in L2 acquisition.

The focus on form rather than function is another limitation in the

interlanguage studies (Long & Sato 1984). Long and Sato also argue that more

research is needed in "a broader array of morphosyntactic features, e.g.

complex syntactic structures, and for lexical choice" (Long & Sato 1984:279).

In the next sections of this chapter, a representative selection of studies in

phonology and vocabulary acquisition is reviewed.

2.3 Phonology

In the limited research to date, claims have been made that L2

phonology also follows certain patterns of development. For example, Tarone

(1976) found that L2 learners prefer to use open syllables (i.e. syllables that end

in a vowel) rather than closed syllables (i.e. syllables that end in a consonant) in

the early stages of L2 acquisition (Tarone 1976, 1978).

Also, Wode (1977) found that children acquire the L2 phonological

system in ordered developmental sequences. In his study of German children

acquiring ESL, he found that German children follow the same developmental

route for /r/ as native English-speaking children (Wode 1977:213).

Similar findings were obtained by Flege and Davidian (1985) in an analysis of

the production of the English syllable-final stops /b d g/ by Spanish, Polish

and Mandarin learners. The authors conclude that the observed developmental

processes are similar to those affecting child L1 speech production.

Markedness theory has also been applied to L2 phonology. Eckman

(1977) claimed that where there are differences between the phonemes of L1

and L2, those phonemes that are more 'marked' (e.g. word-final voicing

contrasts are more marked than medial or initial contrasts) will be more

difficult for the L2 learner.

In her review paper, Tarone (1978) reports that the following processes

have been utilised in shaping the development of L2 phonology:

i) negative transfer from L1

ii) first language acquisition processes

iii) overgeneralisation

iv) approximation

v) avoidance

(Tarone 1978:25, 1987:77).

These processes are similar to the general interlanguage strategies

employed by L2 learners (see Selinker 1972).

As yet there is no substantial evidence as to why some developmental

processes that occur in the acquisition of an L1 phonology are employed by the

L2 learner, and some others are not (Ioup & Weinberger 1987). What these

studies show, though, is that there are certain developmental processes that

learners follow in the acquisition of L2 phonology (for a review on the

acquisition of L2 speech see Leather & James 1991).

2.4 Vocabulary

Until recently, lexical acquisition has been a "victim of discrimination"

(Levenston 1979:147). Traditionally, L2 acquisition research has meant

"grammar" research, in which the focus is on understanding the acquisition of

rules of structural development. Largely ignored was the fact that "using the

right word is the most important aspect of language use" (Politzer 1978:258),

and that lexis is "the major learning priority" in L2 acquisition (Jones 1994:441).

As a result, research in developmental sequences in ESL has been mainly

concerned with morphology and syntax. Lexical development has rarely been

researched (Meara 1978, 1980) even though it is evident that vocabulary is an

important aspect of L2 acquisition. It has been shown that lexical errors

outnumber grammatical ones by almost four to one (Meara 1984), and that a

poor knowledge of vocabulary has negative effects in the writing of L2 learners

(Linnarud 1986). Also, it was found that L2 learners' vocabulary errors are

corrected more frequently by native speakers than errors in syntax (Chun, Day,

Chenoweth & Luppescu 1982).

2.4.1 Vocabulary as a Language Sub-skill

Interest in L2 vocabulary development has been expressed by two

sources: those linguists and language practitioners who saw vocabulary as a

component of one of the four major language skills, i.e. reading, and those who

saw vocabulary as an independent aspect of language development, equal in

importance and status to grammar.

L2 vocabulary development is viewed as a necessary subcomponent of

the development of reading skills because L2 learners need very well

developed vocabularies in order to read authentic selections (Dubin 1989).

However, according to Dubin, ESL learners do not have time to undertake

separate vocabulary building courses, and furthermore, teaching vocabulary

items which are not embedded in some meaningful context, such as a stretch of

text, does not seem to help learners, and therefore vocabulary should be taught

through unedited text.

Krashen's view on vocabulary acquisition is that vocabulary is acquired

in the same way that the rest of the language is acquired (Krashen 1989). In the

skill-building view, vocabulary learning "involves learning words one at a time,

by deliberate study" (Krashen 1989:440); for Krashen, by contrast,

comprehensible input in the form of reading and listening to stories is the way

to successful vocabulary

development. Krashen concludes that explicit teaching of vocabulary is not so

effective and "in addition, many vocabulary teaching methods are at best

boring, and are at worst painful" (Krashen 1989:450). Thus, successful

vocabulary development can only occur through the development of reading

and listening skills.

Along the same lines Fox (1987) suggests an approach to vocabulary

development based on the assumption that "developing vocabulary and

reading skills takes time and extensive practice" (Fox 1987:310). According to

this approach, reading simplified texts followed by more complex ones results

in a gradual development of L2 vocabulary. Fox also expresses the need for

research on rates of acquiring receptive vocabulary.

Oral translation was also suggested as an adequate exercise to build

vocabulary (Heltai 1989) as it makes students devote attention to vocabulary,

and encourages them to extend their vocabulary into new areas, for example

synonymic sets, collocations and idioms (Heltai 1989:292). However, such an

approach can be made possible only under the condition that all the students

and the teacher share the same mother tongue. Other L2 vocabulary teaching

suggestions include the teaching of new words through a "meaningful learning

approach", i.e. teaching the etymology of a word, as opposed to other

techniques such as rote memorisation of words, especially with intermediate

and advanced L2 learners (Pierson 1989:57).

The above studies express an 'interest' in vocabulary acquisition mainly

due to the fact that language practitioners realised that the development of reading

skills was impeded because of the lack of adequate vocabulary. The

suggestions given for vocabulary development are not the product of research

in the development of L2 vocabulary, but ways of circumventing the problem

of inadequate vocabulary in order to develop reading skills.

2.4.2 Vocabulary as a Language Skill

The first attempts to discover how L2 vocabulary is acquired led

researchers to investigate how vocabulary is stored and then retrieved by L2

learners. Evidence for a phonologically organised mental lexicon was provided

by Fay and Cutler (1977) and Cutler and Fay (1982) through an investigation of

"malapropisms" (word substitution errors), e.g. 'we need a few laughs to break

up the monogamy' instead of 'monotony' (Fay & Cutler 1977:505). They conclude

that the mental dictionary lists its entries according to syllable structure and/or

stress pattern, and only within these categories according to sound (Fay &

Cutler 1977:511).

In investigating the problem of how new foreign words are stored in the

learner's mental lexicon, Meara (1978) tested the word associations of 76

English girls learning French in two London Comprehensive schools. The girls

were given a list of 100 French words and were asked to write down, beside

each one, the first French word that it made them think of (Meara 1978:194).

These associations were then compared with the word associations produced

by native French speakers. Meara concludes that the native speaker's mental

dictionary is organised mainly on semantic lines while in L2 learners this

semantic organisation seems to be much less well established (Meara 1978:208).

This lack of proper semantic organisation could be the source of difficulty that

foreign language learners experience in processing both written and spoken

foreign language material (Meara 1978:208). Meara finds it plausible that

learners follow a transition from a mental L2 lexicon organised on non-

semantic criteria to a more native-like one organised on semantic grounds.

Meara's claim that there are transitional stages in the lexicon has been criticised.

The results of his research have been described as "simply messy" and failing

to confirm the existence of developmental patterns (Sharwood-Smith 1984:238).

Despite the negative criticism, Sharwood-Smith suggests that the networks of

semantic associations that exist between words could be a viable avenue to

explore in the investigation of L2 vocabulary acquisition.

In a study of the acquisition of individual words, Meara and Ingle (1986)

tested the acquisition and retrieval of 35 low-frequency French nouns by

English-speaking learners. The nouns were presented and practised

phonetically. They found that the beginnings of L2 words were relatively

resistant to error, while subsequent consonants were more likely to be incorrect.

The results of Meara and Ingle's study are suggestive of how words are stored

and retrieved from the mental lexicon, but they are limited in that they pertain to

words acquired phonetically. Furthermore, they concern individual lexical

items. In a more recent paper Meara (1992) draws attention to the examination

of vocabulary acquisition as a network of structures and associations.

Laufer's (1990a) study suggested that vocabulary acquisition conforms

to the L1 acquisition = L2 acquisition hypothesis, which predicts that

L2 learners follow a similar developmental route to that followed by a

child learning the same language as

L1 (Laufer 1990a:290). Laufer compared adult EFL learners and English native

speaking children in order to examine the similarities and/or differences that

they experience in distinguishing between words of similar form (synforms),

e.g. 'considerate' and 'considerable', 'extend' and 'extent', 'simulate' and

'stimulate'. Laufer concludes that native speaking learners of English and

foreign learners of English share the same order of difficulty in the acquisition

of 'synforms', i.e. suffix synforms (e.g. considerable/considerate) created the

most difficult synformic distinctions, followed by the vocalic (e.g. cute/acute),

and then the prefix (e.g. superficial/artificial) and consonantal (e.g. price/prize)

(Laufer 1990a:281). Despite the interesting results, Laufer's study suffers from

certain shortcomings: she compared adult foreign learners of English and 12-

year-old native speakers of English without justifying why she expected

language development in these two groups to be comparable. Furthermore, the

multiple choice test she used for her research was poorly designed (e.g. the

fourth distractor of each item is almost always one that is definitely wrong - in

the 38 items tested, only one has (d) as the correct answer). Despite its

limitations Laufer's investigation suggests that in L2 vocabulary acquisition,

too, there are developmental sequences.

Palmberg (1987) also investigated patterns of vocabulary development in

Swedish ESL learners in Finland. Palmberg used 'spew' tests, which required

the students to write down as many words as they could think of that began

with a given letter (M or R). This was done for one minute per week for 17

weeks. Palmberg found that the words produced by his subjects consisted

mainly of textbook vocabulary. Results also show a steady increase in the

overall word-production capacity of the subjects over time (see also Palmberg

1988).

The acquisition of modal auxiliaries (i.e. can, could, may, and might) by

L2 learners was investigated by Gibbs (1990). She examined 75 Panjabi-

speaking pupils on their expression of English modal auxiliaries and found that

the acquisition of modal auxiliaries by the L2 learners follows an English L1

developmental pattern.

The acquisition of word formation processes was investigated by

Olshtain (1987). Word formation rules in Hebrew were tested using three tasks

(production, evaluation and interpretation) with a group of native speakers and

two groups of foreign speakers of Hebrew (advanced and intermediate levels).

In the production task, subjects were asked to coin new terms for concepts not

named in the conventional lexicon of Hebrew. In the evaluation task, subjects

were presented with five innovative forms representing word formation

devices in Hebrew and asked to judge which of these forms was the most

suitable name for a specified noun. In the interpretation task, subjects were

asked to supply the most likely meaning of an innovative blend. Olshtain's

results show that L2 learners acquire target word formation processes in a

gradual progression, with the advanced learners exhibiting productivity that is

very similar to native speakers' performance (Olshtain 1987:229). It was also

shown that at the advanced level the L1 influence in the application of L2 word

formation devices is marginal, while at the intermediate level students rely

mainly on word formation devices that were covered in their Hebrew course

(i.e. affixation devices). Olshtain's study strongly suggests a developmental

process in the acquisition of word-formation rules.

Giacobbe and Cammarota (1986) conducted an investigation of the

relationship between L1 and L2 in the construction of lexis during the first

phases of L2 acquisition. They collected their data by interviewing two Spanish

subjects acquiring French during the first months of their stay in France. They

concluded that there are two approaches to the construction of lexis, systematic

and non-systematic, depending on the learner's ability or inability to establish a

relationship between the L1 and L2. In the systematic approach, the learner

forms a General Lexeme Construction Hypothesis (GLCH) which is concretised

by a series of simple operations facilitating the transformation of L1 lexemes

into L2 lexemes. For example, Cacho, one of the subjects in the study,

suppressed the final vowel of Spanish lexemes, e.g. [kurs] instead of 'curso' and

[mism] instead of 'misma', in order to produce French lexemes, e.g. 'cours' and

'même'. The GLCH is further complemented by parallel hypotheses

concerning other aspects of the lexemes such as stress. In the non-systematic

approach, the learner just memorises words that are frequently used in her/his

environment. Even though Giacobbe & Cammarota's study reveals that a

degree of systematicity can exist in the acquisition of L2 lexis, it has certain

shortcomings. First, their study was limited to the examination of only two

Spanish adults acquiring French without formal instruction. Second, the

similarity of the subjects' mother tongue and the L2 could have accentuated the

role of L1 in the construction of rules for the acquisition of lexis.
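
The kind of 'simple operation' that Giacobbe and Cammarota attribute to the GLCH can be sketched in a few lines. The Python fragment below implements only the final-vowel suppression described above for Cacho. It works on orthographic strings rather than on phonetic transcriptions, which is a simplifying assumption, and the function name and vowel inventory are likewise hypothetical.

```python
# Simplifying assumption: unaccented Spanish orthographic vowels only.
VOWELS = set("aeiou")


def suppress_final_vowel(lexeme):
    """Drop one word-final vowel; lexemes ending in a consonant are unchanged."""
    if len(lexeme) > 1 and lexeme[-1].lower() in VOWELS:
        return lexeme[:-1]
    return lexeme


# The two Spanish lexemes cited in the text.
for spanish in ["curso", "misma"]:
    print(spanish, "->", suppress_final_vowel(spanish))
```

For the two cited lexemes the function returns 'curs' and 'mism', which correspond orthographically to the learner forms [kurs] and [mism]. The parallel hypotheses concerning stress and other aspects of the lexeme are not modelled here.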

In the studies reviewed above, vocabulary acquisition has been equated

with the acquisition of individual words by L2 learners, even though it has

been suggested that an examination of vocabulary as a network of semantic

and structural associations would be worthwhile (Meara 1992). So far, results

suggest that in L2 vocabulary acquisition, too, there are certain patterns of

development. However, the scope of these studies has been mainly

exploratory, and there has not been a systematic framework of investigation of

patterns of vocabulary development. The rest of this chapter will focus on

studies exploring the acquisition of sequences of lexical items, i.e. lexical

phrases and collocations.

2.4.3 The Acquisition of Lexical Phrases

The studies considered so far dealt with the acquisition of individual

words. Other studies have also dealt with the acquisition of combinations of

two or more words.

The investigation of the early acquisition and use of prefabricated

patterns such as "can you", "where is", "how to" and others, revealed that in the

initial stages of L2 acquisition learners learn to use multiword phrases as if they

are individual lexical items (Hakuta 1974). Hakuta poses the question of

whether this rote memorisation of prefabricated patterns accelerates or

decelerates language development. Peters, for one, believes that 'chunks' play

an important role in L1 acquisition (Peters 1983).

Krashen and Scarcella (1978) have also identified the memorisation of

syntactic patterns, i.e. prefabricated routines, as part of the early stages of L2

acquisition. However, they conclude that, when more learning has taken place,

"language development proceeds analytically, in the 'one word at a time'

fashion" (Krashen & Scarcella 1978:297). Krashen and Scarcella conclude that

prefabricated routines and patterns are useful for establishing social relations

and also for encouraging intake of target language. However, this intake is

insufficient for successful language acquisition and thus the teaching of

routines and patterns should be minor (Krashen & Scarcella 1978:298). Even

though Krashen and Scarcella provide an answer (negative) to Hakuta's

question, their conclusions are speculative since they have not been based on

empirical evidence.

Counter to Krashen and Scarcella's view of the usefulness of

prefabricated routines, Nattinger and DeCarrico (1992) have argued that

unanalysed chunks of language play an integral part in acquiring and using

language. Nattinger and DeCarrico identified the structural and functional

properties of lexical phrases (e.g. 'I'm sorry to hear that X' (expressing

sympathy), 'by the way' (topic shift), 'Could/Would you X ?' (request)

(DeCarrico & Nattinger, 1993)), and suggested ways for utilising lexical phrases

in language teaching. Nattinger and DeCarrico's lexical approach to language

learning draws attention to the systematic utilisation of lexical phrases in

language teaching. However, there is still little empirical evidence on the way

these 'lexico-grammatical units' are actually acquired by L2 learners.

Furthermore, their approach is limited, for the purposes of this study, by being

focused on the linguistic analysis of native adult language use (Weinert 1995).

Pienemann et al. (1988) also underscore the importance of lexical

phrases. The use of formulae in the oral production of English L2 learners was

classified as Stage 1 structure, i.e. low in processing complexity, and the

formulae were used as indicators of linguistic development by Pienemann et al.

(1988). However, these 'formulae' were left unexplained and the individuals

employed as 'assessors' of linguistic development had considerable difficulties

in identifying when a formula was used or not. It is possible that using an

umbrella term, i.e. 'formulae', to refer to word combinations memorised as

chunks, could create problems when the term is used as an indicator of linguistic

development, since different formulae can exhibit different levels of complexity

depending on factors such as the length of the collocational string, the

frequency of the lexical items in the formula, the formality of the formula, etc.

Thus, more refinement is needed in the description of formulae if they are going

to be used as indicators of linguistic development.

The above studies suggest that the acquisition of formulae/lexical

phrases is characteristic of the initial stages of L2 acquisition, and that their

utilisation for language teaching would be of benefit to the learner. However,

their conclusions and suggestions are not based on empirical evidence, while

the use of the term 'formula' or 'lexical phrase' to describe any combination of

words that could be memorised as a whole is inappropriate and vague for a

detailed investigation and description of the acquisition process of such word

combinations. Still, we need to know much more about the role of formulaic

language in classroom L2 development (Weinert 1995).

2.4.4 The Acquisition of Collocations

Collocational development in L2 vocabulary acquisition has not yet been

investigated in terms of systematic patterns of acquisition, even though

there has been evidence for the existence of such sequences in the fields of

syntax, morphology and phonology, and also evidence that vocabulary

acquisition may follow patterns of development.

There is already no doubt that collocations are an important part of L2

lexical development. It has been shown that collocational errors make up a

high percentage of all errors committed by L2 learners (Grucza & Jaruzelska

1978, cited in Biskup 1992; Marton 1977; Arabski 1979), and linguists have

acknowledged the importance of focusing on the relations that hold between

items in the lexical system in order to describe vocabulary development (White

1988; Meara 1992). It has also been suggested that collocations provide most of

the "initial lexical units", and thus their study is of great importance both for the

early stages of language acquisition and for the following years of vocabulary

development (Greenbaum 1974:89).

The need for research in collocations has long been identified (Levenston

1979), but it is only in recent years that empirical investigations have been

conducted. One reason for this lack of interest could be the shortage of suitable

research instruments designed specifically for testing hypotheses about lexical

acquisition processes (Levenston & Blum 1978:2). The recent research on

collocations has taken a number of forms.

Links between the acquisition and use of collocations and writing

proficiency were reported by Ghadessy (1989) (see Chapter 1). According to

Ghadessy, the use of function words indicates a more advanced use of

collocations, grammatical patterns and cohesive devices on the part of the older

students (Ghadessy 1989:114). Ghadessy's study demonstrates that the

examination of the collocations L2 learners use can be useful in an investigation

of what happens during the L2 learners' development towards a full linguistic

communicative competence.

A developmental process in the acquisition of collocations is also

suggested by Zhang (1993) in his study of the use of collocations in the writings

of native and non-native speakers of English (also see Chapter 1). One of the

results of the study is that poor non-native writers and good native writers use

more grammatical collocations and fewer lexical collocations. Even though

Zhang did not compare the acquisition of English collocations by L2 learners

from different proficiency levels, he assumes that the results of his study

indicate a certain development in the acquisition of collocations by L2 learners:

at the lower levels of English proficiency learners use more grammatical

collocations and fewer lexical collocations; when learners are at intermediate

levels they produce a greater variety of collocations but they still rely greatly on

the prefabricated routines they have acquired at early stages, and therefore use

more lexical collocations than grammatical ones; finally, when learners have

reached an advanced level of proficiency, they have a better knowledge of

grammatical collocations, which they are now able to break down into parts

and use to create new ones, thus resulting in a heavier use of grammatical

collocations. However, a developmental continuum like the one described by

Zhang would require empirical evidence from L2 learners at different

acquisition stages.

The acquisition of lexical collocations by advanced learners of English

from two different L1 backgrounds, Polish and German, was investigated by

Biskup (1992). Subjects were asked to supply the English translation

equivalents of lexical collocations in Polish and German respectively. German

learners were more prone to use descriptive answers and try alternative ways

of rendering the meaning of unfamiliar collocations, while the Polish students

would use a collocation only if they were sure it was the correct one. This

result is explained in the light of the different emphasis on EFL in Poland and

Germany. The Polish educational system insists on accuracy, so the Polish

learners would refrain from giving any answer at all unless they were certain

that it was the correct one. On the other hand, the Germans pay more attention

to communication and fluency and thus the German learners tried to use

alternative ways of expressing the meaning of collocations whose English

equivalents they did not know (Biskup 1992:88). Even though Biskup's study

does not concern the acquisition of collocations from a strictly linguistic point

of view, it suggests that by employing different approaches and taking into account

factors such as the focus of instruction, new and valuable insights in the field

of vocabulary acquisition can be provided.

Aghbar and Tang (1991) devised an instrument to measure the

acquisition of collocations. The principle of the proposed scoring scheme is

based on the assumption that the acquisition and use of collocations evolves

along a continuum from the least semantic approximation to full mastery of

collocations that are idiomatic and appropriate, both semantically and by

register (Aghbar & Tang 1991:2). The scoring instrument was used to test

mastery of verb-noun collocations by 205 university level ESL students. The

collocations were collected using a blank filling test, and they were scored in

terms of their idiomaticity (idiomatic/non-idiomatic), semanticity

(semantic/marginally semantic/not semantic), and register (proper

register/not proper register). Results showed that common verbs

such as 'take', 'get' and 'find' were relatively easy for the low proficiency groups

and therefore do not discriminate between low and high proficiency in

collocations. It was also concluded that open-choice tests are more reflective of

the students' choice of collocations in their own natural communication, and

that low proficiency students are much more likely to choose an appropriate

answer in a multiple choice test.

The acquisition of low frequency (or rare) words and multi-word (or

complex) lexical units (e.g. noun phrases (a damp squib), adjectival/ adverbial/

prepositional phrases (at a pinch), predicates (to bite the bullet), and sayings

(the penny drops)) by advanced L2 learners was investigated by Arnaud and

Savignon (1994). A list of sixty rare words and sixty complex lexical units was

compiled in a multiple choice format (i.e. each item on the list was followed by

four choices, one of them being a paraphrase or a synonym of the item and the

other three distractors). The list was given to French advanced learners of

English, who were asked to complete the multiple choice test by choosing the

appropriate definition for each test item. Results show that native-like

performance was attained in the case of rare words but not in that of complex

lexical units (Arnaud & Savignon 1994). It is possible that because of lack of

awareness of the importance and nature of complex lexical units, learners did

not pay attention to them. Arnaud and Savignon conclude that even though

the acquisition of a large number of complex lexical units (such as collocations)

involves considerable difficulty, such an acquisition is necessary for the

advanced learner's receptive competence (Arnaud & Savignon 1994).

The acquisition of lexical collocations or "conventional syntagms" in

foreign language learning was also investigated by Marton (1977). Results

showed that recurrent exposure to conventional syntagms did not lead to their

retention and recall by the learners. This could be due to the fact that

conventional syntagms are easily decodable and thus they do not cause any

difficulty in the process of recognition. Simple words or more idiomatic

expressions have a stronger impact on the learner's conscious mind as the

learner makes an effort to learn them, and thus they have a better chance of

being remembered. Marton suggests that intensive study of vocabulary and a

conscious effort in memorising and rehearsing a great number of

conventional syntagms is the most effective way to learn how to handle target

language lexical collocations (Marton 1977:55). More recent studies have also

underscored the effects of practice in L2 acquisition (see Kirsner, Lalor & Hird

1993).

The above studies show that an investigation of how collocations are

acquired will be of potential benefit for illuminating some of the processes that

contribute to L2 vocabulary development and for L2 teaching.

2.4.5 Summary

The reviewed literature so far suggests that:

i) L2 vocabulary development has only recently received systematic attention and

examination even though there have been studies suggesting the existence of

developmental patterns in the acquisition of L2 vocabulary.

ii) Given the emerging consensus that vocabulary knowledge is best viewed as

a network of associations, the acquisition of collocations is a valuable avenue to

explore since it represents structural and semantic relationships between lexical

items.

iii) Other language aspects have been found to exhibit developmental

processes and patterns. The acquisition of vocabulary, and in particular the

acquisition of collocations could be found to follow a developmental process of

some kind that can be described and analysed (Ellis 1994:113).

2.5 The Aims of the Present Study

The limited research in the development of L2 vocabulary, and the

availability of English collocations for a study of development, as these are

operationalised in this study (see Chapter 1), provided the rationale for this

study which aimed to investigate whether there are patterns in the

development of collocational knowledge in L2 learners.

Describing developmental 'stages' in the acquisition of collocations (i.e.

the stages that the learner goes through before the correct English collocations

are fully acquired) is not feasible in an investigation of vocabulary learning.

For example, in the investigation of the acquisition of English interrogatives

(Cazden et al. 1975) the end product (i.e. a well-formed interrogative

conforming to English grammar rules) was evident and the researchers had to

describe the stages learners go through in the acquisition of English

interrogatives. In vocabulary acquisition, however, and in particular in the

acquisition of collocations, the end product is not as obvious. For example,

when the learner uses 'bad milk', the end product cannot be confidently

determined. It is possible that the learner is trying to say 'sour milk', or even

that 'the milk is off'.

Due to the above limitation, this study aimed to explore 'patterns' or

'acquisition/difficulty/accuracy orders' rather than 'stages' of development in

the acquisition of collocations. Thus, development in the acquisition of

collocations is in the form of sequences or implicational steps of correctly used

English collocations by learners at different proficiency levels.

For the purposes of the present study, ESL learners from three different

proficiency levels were tested on their free and cued production of collocation

types as these are operationalised in the BBI and other studies on collocations

(see Biskup 1992; Zhang 1993). The proficiency level of the selected L2 learners

was based on the assumption that collocations are important for the early

stages of language acquisition and for the following years of vocabulary

development (Greenbaum 1974). Thus, the subjects in this study were at post-

beginner, intermediate, and post-intermediate levels of proficiency.

The correctly used collocations were sequenced to reveal implicational

orders from 'easy' or 'early acquired' collocation types to 'difficult' or 'late

acquired' types. In this way any systematic patterns of development in the

acquisition of collocations would emerge. As a result of the foregoing, there are

two hypotheses tested in this study:

i) There are stable patterns of development of collocational knowledge across

language proficiency levels.

ii) There are stable patterns of development of collocational knowledge

within language proficiency levels.
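
The sequencing of correctly used collocation types into an implicational order can be sketched as follows. This is only an illustration under stated assumptions: this section does not name a specific scaling procedure, so the sketch assumes a Guttman-style ordering with a coefficient of reproducibility, and the learner records, the function name `implicational_order` and the binary coding (1 = type used correctly, 0 = not) are all hypothetical.

```python
from typing import Dict, List, Tuple


def implicational_order(
    responses: Dict[str, Dict[str, int]]
) -> Tuple[List[str], float]:
    """Order collocation types from 'easy' to 'difficult' and report how
    closely the data fit a perfect implicational scale.

    `responses` maps a learner id to {collocation type: 1 or 0}.
    The fit measure is a Guttman-style coefficient of reproducibility
    (an assumption of this sketch, not a procedure stated in the text).
    """
    types = sorted({t for row in responses.values() for t in row})
    # Number of learners who used each type correctly.
    totals = {t: sum(row.get(t, 0) for row in responses.values()) for t in types}
    # Easiest first; ties are broken alphabetically for determinism.
    order = sorted(types, key=lambda t: (-totals[t], t))

    errors = 0
    for row in responses.values():
        observed = [row.get(t, 0) for t in order]
        score = sum(observed)
        # In a perfect scale a learner with score k masters the k easiest types.
        ideal = [1] * score + [0] * (len(order) - score)
        errors += sum(o != i for o, i in zip(observed, ideal))

    cells = len(responses) * len(order)
    reproducibility = 1.0 - errors / cells if cells else 1.0
    return order, reproducibility


# Hypothetical learners, for illustration only.
demo = {
    "learner_a": {"Adjective Noun": 1, "Phrasal Verb": 1, "Verb Noun (creation)": 1},
    "learner_b": {"Adjective Noun": 1, "Phrasal Verb": 1, "Verb Noun (creation)": 0},
    "learner_c": {"Adjective Noun": 1, "Phrasal Verb": 0, "Verb Noun (creation)": 0},
    "learner_d": {"Adjective Noun": 0, "Phrasal Verb": 0, "Verb Noun (creation)": 1},
}
order, reproducibility = implicational_order(demo)
print(order, round(reproducibility, 3))
```

In this toy data the fourth learner breaks the implicational pattern, which lowers the reproducibility value below 1; a stable pattern of the kind hypothesised above would show values close to 1.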

The next chapter describes the methodology of this study.

CHAPTER 3

METHODOLOGY FOR THE PRESENT STUDY

3.0 Introduction

This chapter specifies the methodology of the present study. It describes

the development of the testing materials, the data collection procedures, the

coding and scoring of the data, and the analyses to be performed in order to

test the two predictions:

1. There are stable patterns in the development of collocational

knowledge across language proficiency levels.

2. There are stable patterns in the development of collocational

knowledge within proficiency levels.

3.1 Analysis of the Teaching Materials

For the purposes of the present study an initial analysis and

classification of the collocations found in three textbooks, namely Task Way

English 1, 2 & 3, was performed. These textbooks are used in all the State

Junior High Schools in Greece for the teaching of English.

The Task Way English (TWE) series was designed by a five-member

English Language Teaching (ELT) committee appointed by the Greek Ministry

of Education. All the members of the committee were Greek and their aim was

to design a series of textbooks for the teaching of English in the State Junior

High Schools that would meet the interests and needs of Greek students.

3.1.1 Curriculum Objectives

The objectives of the Junior High School ELT curriculum were reformed

under the task-based approach to foreign language learning adopted by the

authoring committee of the series. According to the committee, the new

objectives were "related to knowledge of language as a system and to language

as a means of communication" (Dendrinos 1988:2) and the TWE series was

designed to realise these objectives through role-play tasks, listening activities,

and emphasis on communicative competence.

3.1.2 Syllabus and Methodology

The syllabus for each of the three textbooks is graded in terms of both

grammar and the communicative functions of language: the contents page for

each book describes for each unit the title, the grammar points included, and

the language functions that are to be practised. The aim of the textbooks, as

outlined by the authoring team, is to develop the learners' communicative

competence and provide them with practice in using the target language. The

authors of the TWE series wanted to adopt a methodology that follows the

principles of "process-oriented learning" (Dendrinos 1988:5). Such a

methodology, the authors of the books claim, has made the grading of the

formal, semantic and pragmatic properties of language "far less important than

the sequencing of the learning tasks" (Dendrinos 1988:5).

3.1.3 Activities and Tasks

Each unit in the textbooks has a central theme, which is further divided

into several topics and issues leading to situations where the learner is invited

to participate by using her/his communicative skills. Before each task is

performed, the sociolinguistic context of the situation is given.

The team of authors designed the tasks, aiming to develop in learners

both receptive and productive skills and to encourage them to discover new

knowledge rather than impose it on them. They also wanted to offer the

students opportunities for metacognition and metacommunication (Dendrinos

1988:6). For example, learners are asked to look at the usage of different

grammar tenses in a comparative way, or they are informed about the roles of

certain grammatical structures, e.g. Passive Voice is used when we are

interested in the action rather than the agent (TWE3, p.57).

The last part of each unit in the three textbooks aims to help the learner

systematise the knowledge that she/he acquired throughout the unit. For

example, in TWE2, Unit 6, p. 84, an alphabetical list of more than eighty English

verbs with irregular past tense forms is presented in order to help learners

systematise their knowledge of irregular past tense verb forms. However, the

effectiveness of such tasks depends to a large extent on the way these are

presented to the learners, and the use that learners make of them.

The writers of the series did not design activities that could raise the

students' awareness of collocations in a systematic way. In TWE1, there is only

one activity that asks students to list nouns that could be accompanied by a

certain adjective, e.g. big: toe, finger, foot, hand, mouth, ear, eye. In TWE2,

there are no activities that would help learners acquire specific collocations.

Finally, in TWE3, there is one activity in which students are asked to make

adjective-noun and noun-noun compounds using specific words on a list, e.g.

'classified advertisement', 'natural resources', 'entertainment section'.

Instructors follow the curriculum closely and the TWE series textbooks

are the only textbooks used in the classroom. So the textbooks control the

learners' lexical acquisition in the classroom.

3.1.4 The Use of L1

The learners' L1, Greek, is used in the textbooks in order to describe the

context of the tasks to the students, to tell them how to carry out the task, and

in some instances to give rules of language use. The use of Greek is much more

extensive in the first book, which is aimed at beginners.

3.1.5 The Vocabulary

The authors report that due to the communicative purposes of the

textbooks, "the vocabulary which appears in different discourse types has not

been chosen with any formal criteria in mind, while it has not been strictly

graded" (Dendrinos 1988:4).

For the purposes of this study, and in order to understand the linguistic

environment that the subjects of this study have been exposed to, the

vocabulary of the textbooks was analysed in terms of types of collocations. The

list of 37 types of collocation developed by the BBI was used (see Chapter 1).

The classification was performed manually by the researcher. Inter-rater

reliability of 90% was achieved with one other rater on a random 5% sample of

the total number of pages analysed, and it was considered to be sufficient. The

results were entered in a database using the Quattro Pro 3.0 software program.

Descriptive statistics were then calculated.
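
The inter-rater reliability check can be illustrated with a short sketch. It assumes that the reported 90% is simple percent agreement (the proportion of items to which both raters assigned the same collocation type); the text does not state the formula, and the function name `percent_agreement` and the codings below are hypothetical.

```python
from typing import Sequence


def percent_agreement(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Percentage of items that two raters coded identically."""
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must code the same items")
    if not rater_a:
        raise ValueError("No items to compare")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


# Hypothetical codings: each entry is the collocation type number
# assigned to one collocation in the sampled pages.
researcher = ["29", "37", "4", "26", "29", "34", "12", "13", "36", "24"]
second_rater = ["29", "37", "4", "26", "29", "34", "12", "13", "36", "1"]

agreement = percent_agreement(researcher, second_rater)
print(f"Agreement: {agreement:.0f}%")
```

Percent agreement does not correct for agreement expected by chance; a chance-corrected statistic would be a different measure from the one assumed here.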

3.1.6 Descriptive Statistics for the TWE Series

There is a steady increase in the number of English collocations included in the

books (Figure 1). TWE1 contains 2,161 collocations, TWE2 contains 3,922

collocations, and TWE3 contains 5,901 collocations. Token-type ratios were

calculated for each book (see Table 7).

[Figure: sum of collocations per textbook (vertical axis, 0 to 6000) for Task Way English 1, 2, 3 (horizontal axis)]

Figure 1. Distribution of collocations across the TWE series

Table 7. Collocation tokens and token/type ratios in the TWE series

TYPE                                    TWE1    TWE2    TWE3    Total   Tokens/Type
1. Noun Preposition                       76      80     145      301         100.3
2. Noun to Infinitive                      2      16      27       45            15
3. Noun that-clause                        0       0       0        0             0
4. Preposition Noun                      228     215     429      872        290.67
5. Adjective Preposition                  45      75     128      248         82.66
6. Predicate Adjective to Infinitive       3       9      37       49         16.33
7. Adjective that-clause                   8       7      12       27             9
8. SVO to O/ SVOO                         15      26      27       68         22.66
9. SVO to O                                1       3       7       11          3.66
10. SVO for O/ SVOO                        4       6       9       19          6.33
11. SV(O) Preposition O                   10      45       5       60            20
12. SV to Infinitive                      19     234     285      538        179.33
13. SV Infinitive                         26     230     347      603           201
14. SV V-ing                              31       9      22       62         20.66
15. SVO to Infinitive                      4      25      56       85         28.33
16. SVO Infinitive                        11      28      37       76         25.33
17. SVO V-ing                              1      16      14       31         10.33
18. SV Possessive V-ing                    0       0       0        0             0
19. SV(O) that-clause                     45     149     234      428        142.67
20. SVO to be c                            0       0       3        3             1
21. SVOc                                   4       9      32       45            15
22. SVOO                                   3      13       8       24             8
23. SV(O) Adverbial                      110     130      61      301        100.33
24. SV(O) wh-word                         73     174     203      450           150
25. S(it) VO to Infinitive                 0       0       0        0             0
26. SVc                                  402     351     503     1256        418.67
27. Verb Noun/Pronoun (creation)          30     135     143      308        102.67
28. Verb Noun (eradication)                3       5       4       12             4
29. Adjective Noun                       306     704    1346     2356        785.33
30. Noun Verb                              3      11       2       16          5.33
31. Noun1 of Noun2                        16      21      47       84            28
32. Adverb Adjective                       3      15      30       48            16
33. Verb Adverb                           58     173     146      377        125.67
34. Noun Noun                            197     319     498     1014           338
36. Preposition Determiner Noun          240     345     454     1039        346.33
37. Phrasal Verb                         184     344     600     1128           376
TOTAL OCCURRENCES                      2,161   3,922   5,901   11,984        ------
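
The last two columns of Table 7 can be recomputed from the per-book counts. The sketch below uses four rows copied from the table; it assumes that the final column is the type's total divided by the three textbooks (for example 603 / 3 = 201), which is an inference from the figures rather than a definition given in the text. The helper name `summarise` is hypothetical.

```python
from typing import Dict, Tuple

# Per-book token counts (TWE1, TWE2, TWE3) for four rows of Table 7.
rows: Dict[str, Tuple[int, int, int]] = {
    "1. Noun Preposition": (76, 80, 145),
    "4. Preposition Noun": (228, 215, 429),
    "13. SV Infinitive": (26, 230, 347),
    "29. Adjective Noun": (306, 704, 1346),
}


def summarise(counts: Tuple[int, ...]) -> Tuple[int, float]:
    """Return the total number of tokens and the average per textbook."""
    total = sum(counts)
    return total, total / len(counts)


for label, counts in rows.items():
    total, per_book = summarise(counts)
    print(f"{label:<22} total={total:>5}  per-book average={per_book:.2f}")
```

Rounding to two decimals may differ in the last digit from some printed values in the table, which appear to be truncated rather than rounded in places.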

As can be seen from Table 7, there were also collocation types for which

no instances of collocations were found in any of the books:

Type 18. SV Possessive V-ing

Type 25. S(it)VO to Infinitive

Type 3. Noun that-clause

These categories have not been included in the calculation of the means

and standard deviations for each textbook that appear below.

A look at the mean number of collocations found in each book confirms

this steady increase of collocations, see Table 8. The standard deviation was

also calculated for each book, see Table 8, and it was found that as the level of

English progresses in the TWE series the variability from the central point in

the distribution of scores becomes greater, that is, some types of English

collocations are represented by a large number of tokens, which gets even

larger in the third book of the series, while in other types the occurrences

remain consistently low. It also appears that the scores are much more spread

out in the second and third books than in the first one, which has generally low

levels of English collocations.

Table 8. Means and standard deviations per textbook

BOOK      MEAN      STD
TWE 1     61.5    97.01
TWE 2    111.1    150.1
TWE 3    167.8    261.9

As we can also see in Table 7 above, most of the collocation types have a

small number of tokens in all three books, a few have a medium sized number

of tokens, and only a couple have a high number of tokens. This indicates that

some types are represented more than others in the TWE series.

There is little recycling of collocations across the three books. For about

half of the collocation types there were no common collocations appearing in all

three books. The largest numbers of recycled collocations across the three books

are found under the following types:

Type 24. SV(O) wh-word (13 tokens appearing in all three books);

Type 37. Phrasal Verb (18 tokens appearing in all three books);

Type 36. Preposition Det Noun (11 tokens appearing in all three books);

Type 4. Preposition Noun (11 tokens appearing in all three books).

A closer look at these categories revealed the following:

i) Type 24. SV(O) wh-word - The collocations that belong to this category and

appear in all three books are instructions mainly for carrying out role-play tasks

- e.g. "ask if she has got a brother" (TWE1, Unit 5, p.66).

ii) Type 37. Phrasal Verb - Most of the recycled collocations in this category also

appear in task instructions - e.g. "try to fill in the 'What is Done By Whom'

table" (TWE3, Unit 3, p.47).

From the above, it appears that most of the recycled collocations are

mainly standard expressions used for giving instructions to the students about

the task they are asked to perform.

A close look at the textbook data shows that there is little recycling of the

collocations used in TWE1 and TWE2, and TWE1 and TWE3. TWE2 and TWE3

appear to be more compatible as they have 329 collocations in common.

However, an examination of these 329 collocations revealed that more than a quarter of

them (28%) belonged to the types used for task instructions (see above).
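
The notion of 'recycling' used in this section amounts to finding the collocations shared by two or more textbooks. A minimal sketch, assuming each book's collocations are stored as a set of strings: the miniature inventories, the `instruction_like` set and the function name `recycled` are invented for illustration and do not reproduce the actual database.

```python
from typing import Dict, Set


def recycled(books: Dict[str, Set[str]], *names: str) -> Set[str]:
    """Collocations that occur in every one of the named books."""
    selected = [books[name] for name in names]
    return set.intersection(*selected) if selected else set()


# Hypothetical miniature inventories, for illustration only.
books = {
    "TWE1": {"ask if", "fill in", "at home", "big toe"},
    "TWE2": {"ask if", "fill in", "draw conclusions", "natural resources"},
    "TWE3": {"ask if", "fill in", "draw conclusions", "classified advertisement"},
}

all_three = recycled(books, "TWE1", "TWE2", "TWE3")
twe2_twe3 = recycled(books, "TWE2", "TWE3")

# Share of the TWE2/TWE3 overlap that consists of task-instruction items.
instruction_like = {"ask if", "fill in"}
share = 100 * len(twe2_twe3 & instruction_like) / len(twe2_twe3)
print(sorted(all_three), sorted(twe2_twe3), f"{share:.1f}%")
```

Applied to the full inventories, the same intersection would yield the set of shared collocations for any pair of books, and the final ratio corresponds to the kind of percentage reported above.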

3.2 Subjects

Three groups of Greek students of English were involved in the present

study. They were all learners from the same Greek Junior High School and

were taught English via the TWE coursebooks. The first group consists of

students in the first year of Junior High School, the second one consists of

students in the second year of Junior High School, and the third group

comprises students in the third year of Junior High School.

The Junior High School that participated in the study used the TWE

series, and it was situated in an urban area, i.e. Veria, which is the capital of the

prefecture of Imathia, to ensure that its students were mainly from the same

town rather than from the nearby villages. Permission was obtained from the

Principal of the Junior High School and the Department of Education for

Secondary Schools, as the research would engage the participating students for

one hour and forty minutes for the completion of the test.

There were 347 subjects participating in the study: 107 subjects for the

first group, 125 subjects for the second group, and 115 subjects for the third

group. All subjects were between 12 and 15 years of age. They were Greek

nationals who were native speakers of Greek, and they all had the same level of

formal education. These subjects had to be screened with regard to their

language proficiency and their production of collocations, and hence those

subjects that had not written an essay were not included in the study (see

3.5.2.). Ultimately, there were 275 subjects included in the study: 91 subjects for

Group 1, 94 subjects for Group 2, and 90 subjects in Group 3. The subjects'

mean age for each group was calculated: 12 years and 9 months for Group 1, 13

years and 8 months for Group 2, and 14 years and 7 months in Group 3 (see

Appendix A).

The study was conducted three teaching weeks before the end of the

school year in Greece. By this time students were in the final chapters of their

books and were revising in preparation for the annual exams which follow the

end of the school year.

3.3 Materials

The test instrument used for the data collection consisted of a

questionnaire eliciting background information about the subjects, and a

battery of three tests: an essay writing task eliciting free-production data, a

translation task, and a blank-filling task eliciting accuracy in the use of

collocations. The purpose and the contents of the materials are examined in

detail below.

3.3.1 Questionnaire

The first part of the test instrument was a questionnaire in Greek aiming

to elicit information about the students' background. There were 15 questions,

10 open-ended and 5 closed ones, asking for information such as the students' age,

sex, recent marks in English, how many languages they speak, whether they

had any additional exposure to English, when they started learning English,

how often they watch English movies with Greek subtitles or without subtitles,

how often they read English books and/or newspapers, how often they listen

to English songs, whether they speak English with their friends, and whether

they correspond with pen friends in English. The questionnaire was the same

for all three groups. See Appendix A for information on the subjects' gender

and age and Appendix B for the English translation of the questionnaire.

3.3.2 Composition

The first test in the battery of tests was a composition task measuring

free production of collocations. Students were asked to write an essay of

approximately 200 words on a given topic. There were different topics for the

three different groups and each topic had been covered in the textbook of the

particular group. The topic for each group was given in Greek (see Appendix C

for the topics in Greek and their English translations).

3.3.3 Translation

The second test consisted of an elicited translation task. The translation

test measured cued production of collocations. There were 10 sentences in

Greek for each group, and the subjects were asked to translate them into

English (see Appendix D for a word-by-word English translation of the Greek

sentences and their expected translations in English). Each sentence tested one

collocation. The collocations included in the translation test were selected from

the database of the collocations found in the students' textbooks, and each

collocation included in the test was different from its Greek equivalent, e.g.

'draw conclusions' is 'take out conclusions' in Greek.

The types of grammatical and lexical collocations tested in the elicited

translation task for each group are given below in Table 9.

Table 9. Collocation types included in the translation test.

TYPES                         Group1  Group2  Group3
1. Noun Preposition                2       2       2
5. Adjective Preposition           1       2       2
11. SV(O) Preposition O            2       2       1
13. SV Infinitive                  1       0       1
14. SV V-ing                       1       1       1
16. SVO Infinitive                 1       1       1
23. SV(O) Adverbial                1       0       0
27. Verb Noun (creation)           1       2       2
TOTAL:                            10      10      10

3.3.4 Blank-Filling

Finally a blank-filling test was also included. This test measured cued

production of collocations. There were a number of sentences in English (50

sentences for Group 1, 65 for Group 2, and 90 for Group 3), containing

collocations in context. Each sentence contained one collocation. In each

sentence, one part of the collocation was replaced by a blank and students were

asked to read the sentence and provide one suitable word for each blank. As

with the translation test, the collocations tested in the blank-filling test were

selected from those appearing in the students' textbooks, and they were

different from their Greek equivalents (see Appendix E for the sentences

included in the blank-filling tests with the intended collocations underlined and

the missing part bolded).

All parts of the tests were typed. There were instructions in Greek for

each part of the test.

The types of grammatical and lexical collocations tested in the blank-filling

task for each group are given below in Table 10.

Table 10. Collocation types included in the blank-filling test.

TYPES                              Group1  Group2  Group3
1. Noun Preposition                     1       5       3
4. Preposition Noun                     3       6       7
5. Adjective Preposition                5       5      15
11. SV(O) Preposition O                12      14      27
23. SV(O) Adverbial                     1       1       2
24. SV(O) wh-word                       2       1       3
27. Verb Noun (creation)                6      10      11
28. Verb Noun (eradication)             0       0       2
29. Adjective Noun                      1       0       2
30. Noun Verb                           1       0       0
31. Noun1 of Noun2                      3       1       0
33. Verb Adverb                         1       3       1
34. Noun Noun                           2       1       1
36. Preposition Determiner Noun         6      11      12
37. Phrasal Verb                        6       7       4
TOTAL:                                 50      65      90

3.4 Data Collection Procedures

In the following sections, the procedures followed by the researcher for

the collection of data are reported.

3.4.1 Test Administration

The test was administered on three consecutive days: one day for each

group. All subjects for each group were tested on the same day. All the

subjects belonging to a particular group were tested together. During the data

collection, the researcher personally monitored the testing.

The subjects were told that their school had agreed to participate in a

research project undertaken by the Centre for Language Teaching and Research

of the University of Queensland. Their knowledge of English was to be

assessed using a test and the collective results of their performance would be

forwarded to their school after the completion of the project. A complimentary

copy of the Macquarie Dictionary of Australian English would be donated to

the school library for student use as a reward for their participation in the

testing. Subjects were assured that the data would be treated confidentially

and would not affect their course marks.

After the tests were distributed to the subjects, they were asked to first

complete the questionnaire, then to write the essay, then to do the translation

task and finally to complete the blank-filling task. The researcher explained

what each test required the students to do. All test instructions were written

and spoken in Greek. The questionnaire and the topic for the essay were

written in Greek, while the translation and the blank-filling test were

introduced by instructions in Greek asking students to translate and fill in the

sentences respectively (see Appendix F for the exact wording of the

instructions).

Subjects were then asked to complete the tests. They were encouraged

to ask the researcher about anything in the test they might find difficult to

understand, or any unknown words. Even though the vocabulary used for the

test items came out of the students' textbooks, the researcher realised that the

students might not remember certain words under the pressure of time.

Therefore, any words unknown to the students were explained by the

researcher, with care taken that the explanations did not give away the

answers to any of the test items. Such cases were limited to subjects in

Group 1 because of their low level of English.

The subjects were allowed one hour and thirty minutes to complete the

test, and they were told that they should not leave any of the test items

unanswered.

Students finishing earlier than expected were asked to remain seated and

revise their tests, e.g. try to expand their composition. When the time was up

all of the subjects had finished and they were allowed to leave the room.

The same procedure was followed for each day of the testing, until all

three groups of subjects had been tested.

3.4.2 Debriefing

At the end of the last day of testing, the researcher had a meeting with

the two English teachers of the school during which she explained the purpose

of the testing and the research project. She also asked them to complete a

questionnaire about the use of the TWE coursebooks and the teaching of

collocations in the classroom (see Appendix G for the teachers’ questionnaire).

This information was to be used later in the data analysis and the discussion of

the results.

3.5 Coding Procedures

Each set of data from the three tests was coded and scored according to

the following criteria.

3.5.1 Free Composition

The data obtained from the free composition were treated as evidence of

both language proficiency and of collocational use. As language proficiency

data, the essays were analysed with respect to six different measurements:

holistic rating, target-like use of articles, lexical density, length of terminal

units, error-free terminal units, and sentence-nodes per terminal unit. The use

of each of the six measures is explained below.

i) Holistic Rating

The free compositions were scored on a holistic scale of 1-100 which is a

standardised and widely used scale compiled by Jacobs, Zinkgraf, Wormuth,

Hartfiel and Hughey (1981) (see Appendix H for a list of the criteria for

scoring). Each composition was assessed by two raters. The raters were native

speakers of English and experienced English teachers.

Each essay received two scores, one from each rater. In cases where the

two raters had more than ten points difference in their evaluation of a

particular essay, the essay in question was scored by a third rater. If the third

rater gave a score that was half way between the previous two scores, e.g. Rater

1 gave a 50, Rater 2 gave a 30, and Rater 3 gave a 40, then the third rater's score

was counted, and the previous two ignored. If the third rater gave a score that

was the same as or closer to one of the previous two scores, then the third score

and the closest other score were averaged and the estimated score was given to

the essay, e.g. Rater 1 gave a 50, Rater 2 gave a 30, Rater 3 gave also a 30, the

scores from Rater 2 and Rater 3 were averaged, while Rater 1 was ignored. If in

the previous case Rater 3 gave a 60, then Rater 1 and Rater 3 would be

averaged, and Rater 2 would be ignored. In the 32 out of 275 essays (11.6%) for

which a third rater was needed, the third rater's score differed by no more than

nine points from at least one of the previous two scores. Finally, each

essay received a score on the 1-100 scale based on the average of the two ratings.
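
The adjudication rules described above can be sketched in Python as follows. This is only an illustration of the stated rules: the function name, the parameter names and the `max_gap` argument are assumptions introduced here and are not part of the study.

```python
def adjudicated_score(rater1, rater2, rater3=None, max_gap=10):
    """Return the final holistic score (1-100 scale) for one essay."""
    if abs(rater1 - rater2) <= max_gap:
        # The two ratings are within ten points: average them.
        return (rater1 + rater2) / 2
    if rater3 is None:
        raise ValueError("ratings differ by more than max_gap; a third rating is needed")
    gap1 = abs(rater3 - rater1)
    gap2 = abs(rater3 - rater2)
    if gap1 == gap2:
        # The third rating lies exactly half way: it alone is counted.
        return float(rater3)
    # Otherwise average the third rating with the closer of the first two.
    closest = rater1 if gap1 < gap2 else rater2
    return (rater3 + closest) / 2


# The three worked cases given in the text.
print(adjudicated_score(50, 30, 40))  # 40.0
print(adjudicated_score(50, 30, 30))  # 30.0
print(adjudicated_score(50, 30, 60))  # 55.0
```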

ii) Target-Like Use of Articles

The analysis of the Target-Like Use (TLU) of articles was performed by

the researcher. TLU is an accuracy measure. As in Pica (1983), the number of

accurately supplied articles (definite and indefinite) in obligatory contexts was

counted and divided by the sum of the overall number of obligatory contexts in

the essay (whether an article was provided in them or not) and the number of

non-obligatory contexts with inappropriate articles; the result was multiplied by 100. The

TLU percentage score was recorded for each essay.
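
The TLU calculation can be written out as a short function. The sketch below follows the verbal description given here; the function and argument names are assumptions, and the essay counts in the example are hypothetical.

```python
def tlu_percentage(correct_in_obligatory, obligatory_contexts, inappropriate_in_non_obligatory):
    """Target-Like Use score as a percentage.

    Numerator: articles accurately supplied in obligatory contexts.
    Denominator: all obligatory contexts plus non-obligatory contexts
    in which an article was inappropriately supplied.
    """
    denominator = obligatory_contexts + inappropriate_in_non_obligatory
    if denominator == 0:
        raise ValueError("the essay contains no contexts for articles")
    return 100 * correct_in_obligatory / denominator


# Hypothetical essay: 18 correct articles in 24 obligatory contexts,
# plus 1 article supplied where none was required.
print(tlu_percentage(18, 24, 1))  # 72.0
```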

Inter-rater reliability was assessed on a sub-set of the ratings randomly

sampled from the entire set of data. 5% of the data (i.e. 15 essays) was rated in

this way by two other raters and the inter-rater reliability was at 99%. The

raters were given a random selection of 15 essays, 5 essays from each sample,

and an instruction sheet, which reported what the measurement was and what

each rater was required to do (see Appendix I). The raters were given a short

training session by the researcher on the TLU analysis on two other essays.

After the raters had performed the measurement on the sample essays, the

researcher's and each rater's ratings were compared. In a total of 235 instances of

accurate or inaccurate suppliance or omission of articles, there were two instances of

disagreement between the researcher and one of the raters in one of the essays.

In essay 13 from Group 1, there was an ungrammatical sentence, "He plays

basketball and sometimes he plays tennis or going with friends for jogging".

The researcher had considered the phrase "going with friends for jogging" as a

non-obligatory context for an article, provided that it was read as "going (for)

jogging with friends", while the rater had marked it as an obligatory context for

the use of an indefinite article, if it was read as "going with friends for a

jog(ging)". After the case was discussed, both the researcher and the rater

agreed that "going with friends for a jog" was closer to what the student had

written, and as such it was an obligatory context for an article. The second

instance was the phrase "she often climbs on the mountains" in the same essay.

Here the researcher had considered that the phrase did not need an article,

while the rater had marked it as correct, assuming that the student was

probably talking about a specific group of mountains. After discussion it was

agreed that the reading without the article was the more general one and

consequently the appropriate one, especially since the student did not refer any further

to particular mountains in his essay. For the rest of the essays, there was total

agreement between the researcher and the raters.

iii) Lexical Density

A Lexical Density analysis (LD) was also performed by the researcher.

Lexical Density refers to the number of lexical, or 'open class', words divided by

the total number of words in each essay and multiplied by 100 (see Long 1991,

unpublished paper; Linnarud 1986). For this analysis a number of criteria were

defined: abbreviations such as 'etc.', 'e.g.' were not counted at all; proper names

were not counted as lexical words (in Group 1 the students were asked to

describe themselves and their family, so in each essay there was a considerable

number of Greek names that did not really contribute to the semantic richness

of the essay and so they were not considered to be lexical words, see also

Palmberg 1987:212); names of places in Greece were not counted as lexical

words (see above); numbers were not counted as lexical words; adverbs other

than those ending in -ly were not counted as lexical words; the verb 'do' was

not counted as a lexical word, even when it was used as a main verb; misspelt

words that could be easily recognised as English words either in writing or

when pronounced according to the Greek or English phonetic system were

counted (in this case it was assumed that the student knew the word and was

attempting to use it, see also Palmberg 1987:205); words written in Greek were

not counted at all.
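
The counting rules above can be illustrated with a small sketch. It assumes that each word has already been labelled by hand according to the criteria listed, and it interprets "not counted at all" as exclusion from both the lexical count and the total word count; that interpretation, the label names and the example sentence are assumptions made for illustration.

```python
def lexical_density(labels):
    """Lexical Density as a percentage.

    labels: one label per word, either
      'lexical'      open-class word counted as lexical,
      'non_lexical'  counted in the total only (e.g. proper names, numbers),
      'skip'         not counted at all (e.g. 'etc.', words written in Greek).
    """
    counted = [label for label in labels if label != "skip"]
    if not counted:
        raise ValueError("no countable words in the essay")
    lexical = sum(1 for label in counted if label == "lexical")
    return 100 * lexical / len(counted)


# Hypothetical sentence: "My brother Nikos plays football etc."
labels = ["non_lexical", "lexical", "non_lexical", "lexical", "lexical", "skip"]
print(lexical_density(labels))  # 60.0
```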

Inter-rater reliability was also obtained for the LD analysis from two

other raters on a randomly sampled 5% of the data (i.e. 15 essays), and

was at 98%. Each rater was given the same sample of essays that the TLU raters

were given and an instruction sheet (see Appendix J), and was asked to

underline the lexical words in each essay. A short training session was given

by the researcher on two other essays. After the raters performed the LD

analysis on the sample essays, the researcher's and each rater's ratings were

compared. There were three instances in which one of the raters had

underlined the verb "going" as a lexical word in phrases in which it was an

auxiliary, and as such it was a grammatical rather than a lexical word, e.g.

essay 37 Group 3 "we are going to do everything". In one case the adverb

"hard" was underlined as a lexical word even though it did not end in -ly.

Finally, words such as "other" and "everybody" were underlined by the raters

even though they were clearly grammatical words, while there were cases in

which the raters should have underlined words such as "think", "way", "worst",

"better", "use", "thanks" but they did not. Overall, there was agreement

between the researcher and the raters in the LD analysis in 1264 out of 1280

lexical words, and the inter-rater reliability was considered sufficient for the LD

measurement.
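
The study does not state the formula behind its inter-rater reliability percentages. A minimal sketch, assuming simple percentage agreement (agreed decisions divided by all decisions), gives figures consistent with those reported for the LD and TLU analyses; the function name is an assumption.

```python
def percent_agreement(agreed, total):
    """Simple percentage agreement between the researcher and the raters."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * agreed / total


# LD analysis: agreement on 1264 of 1280 lexical words (reported as 98%).
print(percent_agreement(1264, 1280))  # 98.75

# TLU analysis: 2 disagreements in 235 instances (reported as 99%).
print(round(percent_agreement(235 - 2, 235), 1))  # 99.1
```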

iv) Terminal-Units

The essays were also analysed with respect to the number of terminal

units (T-Units) they contained. All the main clauses plus any subordinate

clauses attached to or embedded in them were counted as T-Units (see Long

1991, unpublished paper; Hunt 1966). A T-Unit is a structural discourse unit

and it was used in this study in three different measures: length of T-Units,

Error-Free T-Units, and S-Nodes per T-Unit. Inter-rater reliability for the

number of T-Units was also obtained from two other raters on a randomly

sampled sub-set of the data (5% of the data), and was at 97%. Each of the raters

was given the same sample of essays and the T-Unit instructions (see Appendix

K) and was asked to mark all the T-Units in each essay. A short training

session was given by the researcher on two other essays. After the

measurement, the number of T-Units marked by the raters and the researcher

were compared and the inter-rater reliability (97%) was considered sufficient

for the T-Units measurement.

v) Length of T-Units

After the T-Units in each essay were counted, the average length for

the T-Units in each essay was estimated by dividing the total number of words

in each essay by the number of T-Units in that essay (see Larsen-Freeman 1978),

e.g. an essay with 186 words and 25 T-Units had 7.44 as the average length of a

T-Unit. This was a complexity measure.

vi) Error Free T-Units

The number of error-free T-Units in each essay was also calculated as

an accuracy measure. Only T-Units that were free from grammatical, syntactic,

lexical, spelling and punctuation errors were counted as Error-Free T-Units (see

Larsen-Freeman 1978). Inter-rater reliability on a randomly sampled sub-set of

the data (15 essays) was also obtained from two other raters, and was at 97%.

Each of the raters was given the same sample of essays and instructions for

counting the Error-Free T-Units in each essay (see Appendix K). A short

training session was given by the researcher on two other essays. After the

measurement, the number of Error-Free T-Units marked by the raters and the

researcher was compared, and the inter-rater reliability (97%) was considered

sufficient for the Error-Free T-Units measurement.

vii) S-Nodes per T-Unit

The essays were also analysed with respect to the number of sentence

nodes (S-Nodes) they contained. This was a measure of syntactic

accumulation. The number of underlying sentence nodes, indicated by tensed

and untensed verbs, was calculated for each essay and then the average

number of S-Nodes per T-Unit was estimated by dividing the number of S-

Nodes in each essay by the number of T-Units in that essay (see Long 1991,

unpublished paper). Inter-rater reliability on 5% of the data was also obtained

from two other raters, and was at 98%. Each of the raters was given the same

sample of essays and instructions for counting the S-Nodes in each essay (see

Appendix L). A short training session was given by the researcher on two

other essays. After the measurement, the number of S-Nodes marked by the

raters and the researcher were compared and the inter-rater reliability (98%)

was considered sufficient for the S-Nodes measurement.
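
Both per-T-Unit measures used in this study (length of T-Units and S-Nodes per T-Unit) are simple quotients of a count by the number of T-Units in the essay. The sketch below shows the arithmetic; the function name is an assumption, and the S-Node count in the second call is hypothetical.

```python
def per_t_unit(count, t_units):
    """Average of a count (words or S-Nodes) per T-Unit in one essay."""
    if t_units <= 0:
        raise ValueError("an essay must contain at least one T-Unit")
    return count / t_units


# Length of T-Units: the example given in the text (186 words, 25 T-Units).
print(per_t_unit(186, 25))  # 7.44

# S-Nodes per T-Unit: hypothetical essay with 30 S-Nodes in 25 T-Units.
print(per_t_unit(30, 25))   # 1.2
```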

3.5.2 Use of Collocations in the Essays

The subjects' performance on the free composition task served not only

as a measurement of the subjects' writing proficiency in English (see above), but

also as a measurement of their free production of collocations. Test papers in

which no composition was given, either because of the particular subject's

inadequate level of English or because of lack of time, interest etc., were not

included in the study. Thus, there were 275 complete test papers: 91 complete

test papers in Group 1, 94 in Group 2, and 90 in Group 3.

The essays were then analysed with regard to the collocations they

contained. The students' production of the 37 different collocation types as

these are operationalised in this study (see Chapter 1) was recorded as

frequency data. Where the students provided a correct collocation they were

marked as having used a token of the particular type in which the collocation

belonged. Misspelt collocations were recorded as evidence of collocational use.

Each collocation found in the essays was checked against the collocations

included in the BBI. If the particular collocation was included in the BBI it was

recorded as correct evidence of use of the particular collocation type (see also

Zhang 1993). If it was not included in the BBI it was discarded. Across all

three groups, 13.1% of the collocations were rejected. These collocations were

mainly Adjective Noun combinations with 'big' or 'good' as the adjective. Such

collocations are considered 'free combinations' by the BBI writers and they are

not listed in the BBI (Benson et al. 1986a:xxiv).

The quantity of collocations found in each essay for each of the 37 types

was also recorded as well as the percentage of the type-token ratio. Inter-rater

reliability of 90% was obtained for 5% of the data.
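
The checking and tallying procedure can be pictured as a filter against a reference list followed by a count of tokens per collocation type. In the sketch below the reference set is a tiny hypothetical stand-in for the BBI, and the type labels attached to each collocation are illustrative assumptions rather than the study's actual coding.

```python
from collections import Counter


def tally_collocations(found, reference):
    """Count accepted collocation tokens per type and report the rejection rate.

    found: list of (collocation, collocation_type) pairs taken from the essays.
    reference: set of collocations accepted as correct (here a stand-in for the BBI).
    """
    accepted = [(colloc, ctype) for colloc, ctype in found if colloc in reference]
    tokens_per_type = Counter(ctype for _, ctype in accepted)
    rejected_pct = 100 * (len(found) - len(accepted)) / len(found) if found else 0.0
    return tokens_per_type, rejected_pct


# Hypothetical reference list and essay data.
reference = {"sour milk", "collect stamps", "commit suicide"}
found = [
    ("sour milk", "Adjective Noun"),
    ("big house", "Adjective Noun"),   # free combination, not in the reference list
    ("collect stamps", "Verb Noun"),
    ("commit suicide", "Verb Noun"),
]
counts, rejected = tally_collocations(found, reference)
print(dict(counts), rejected)
```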

3.5.3 Translation

The data from the translation task were scored both as frequency data

and as accuracy data. As frequency data the answers in the translation test

were marked using a binary code: when students used the correct collocation

they received 1, and when they used the wrong collocation or they provided no

collocation at all, they received 0. As accuracy data, the mean accuracy of

response to each collocation type in the translation test was recorded. Spelling

mistakes were disregarded. For a list of the types of collocations tested in the

translation test see Table 9 above.

3.5.4 Blank-Filling

The data from the blank-filling test were also recorded both as frequency

data and as accuracy data. As frequency data, the same binary coding was

used as for the translation data. As accuracy data, the mean accuracy of

response to each of the collocation types included in the blank-filling test was

recorded. Spelling mistakes were disregarded. In the few cases where the

students provided a collocation that did not match with the target one, but

which belonged to the same collocation type, the collocation was recorded as

correct. For a list of the collocation types tested in the blank-filling test see

Table 10 above.
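
The binary coding used for the translation and blank-filling data, and the mean accuracy derived from it, can be sketched as follows. The sketch assumes that mean accuracy for a collocation type is the proportion of responses coded 1 among all items of that type; the function name and the example responses are hypothetical.

```python
from collections import defaultdict


def mean_accuracy_by_type(responses):
    """Mean accuracy per collocation type from binary-coded responses.

    responses: iterable of (collocation_type, is_correct) pairs, where
    is_correct is False for a wrong collocation or no collocation at all.
    """
    codes = defaultdict(list)
    for ctype, is_correct in responses:
        codes[ctype].append(1 if is_correct else 0)  # binary code: 1 or 0
    return {ctype: sum(values) / len(values) for ctype, values in codes.items()}


# Hypothetical responses from one subject.
responses = [
    ("Phrasal Verb", True),
    ("Phrasal Verb", False),
    ("Phrasal Verb", True),
    ("Noun Noun", True),
]
print(mean_accuracy_by_type(responses))
```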

3.6 Analyses

In the following section, the analyses of the language proficiency

measures and the two hypotheses investigated by this study are outlined.

3.6.1 Language Proficiency Measures

Before testing the hypotheses it was necessary to perform a number of

language proficiency measures on the free production data in order to

determine whether the three groups represented different levels of language proficiency.

The analyses performed included the following measurements: holistic rating,

target-like use of articles, lexical density, length of T-Units, error-free T-Units,

and S-Nodes per T-Unit. A one-way MANOVA (factor: Group) was calculated on the

scores obtained by the six different measures in order to determine whether the

three samples were different with respect to all six measures.

Following that, one-way ANOVAs were calculated, one for each of

the six measures.

3.6.2 Analyses for the Hypotheses

The analyses performed for testing the two hypotheses are described

below.

i) Analysis for Hypothesis 1

To address Hypothesis 1, that there are stable patterns of development in

the acquisition of collocations across proficiency levels, non-parametric

Kruskal-Wallis tests were performed on the data. Kruskal-Wallis is the non-

parametric equivalent of ANOVA. Due to the wide range of types of

collocations that were used in the analysis of the essay data, and the unequal

number of tokens for each of the types of collocations tested in the translation

and the blank-filling tests, the data were not expected to be normally

distributed and thus non-parametric statistics were considered suitable to

address the first hypothesis. Previous studies on collocations used non-

parametric statistics too (see Zhang 1993).

For the free production essay data that resulted from the analysis of the

students' essays, Kruskal-Wallis tests followed by Dunn's multiple comparison

procedures were performed on the mean tokens of each of the 37 collocation

types used by subjects in each group.

For the cued production translation data non-parametric Kruskal-Wallis

tests followed by Dunn's procedure were performed on the mean accuracy of

responses to each of the 6 types of collocations repeated across the three

groups.

For the cued production blank-filling data non-parametric Kruskal-

Wallis tests followed by Dunn's procedure were performed on the mean

accuracy of responses to each of the 11 types of collocations repeated across the

three groups.
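
For readers unfamiliar with the Kruskal-Wallis test, the H statistic can be computed as in the sketch below, using the standard formula with mid-ranks and the usual correction for ties. The code is an illustration only: it is not the software used in this study, the scores are hypothetical, and it does not include the p-value (H is referred to a chi-square distribution with k - 1 degrees of freedom) or Dunn's procedure.

```python
from collections import Counter


def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic for two or more independent groups."""
    pooled = [x for group in groups for x in group]
    n = len(pooled)

    def rank(x):
        # Mid-rank of x in the pooled data: tied values share the mean rank.
        lower = sum(1 for y in pooled if y < x)
        equal = sum(1 for y in pooled if y == x)
        return lower + (equal + 1) / 2

    rank_term = sum(sum(rank(x) for x in group) ** 2 / len(group) for group in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)

    # Correction for ties: divide by 1 - sum(t^3 - t) / (N^3 - N).
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction


# Hypothetical scores for three groups with no overlap between them.
print(kruskal_wallis_h([1, 2, 3], [4, 5, 6], [7, 8, 9]))  # approximately 7.2
```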

Implicational scaling analysis for each of the three sets of data was also

performed in order to reveal any acquisition orders. The frequency data were

used for this analysis. For each implicational scale, Guttman's coefficients of

reproducibility and scalability were then calculated (Hatch & Lazaraton 1991:204)

in order to test the validity of the scales and the scalability of the items on the

scales.
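
The two coefficients can be illustrated with a short sketch. It assumes one common formulation: items are ordered by how many subjects supplied them, an error is any cell that departs from the ideal pattern implied by a subject's total score, reproducibility is 1 minus errors over all cells, and scalability is the improvement over the minimum marginal reproducibility divided by 1 minus that minimum. The error-counting convention, the function name and the data matrix are assumptions made for illustration and may differ in detail from the procedure followed in this study.

```python
def guttman_coefficients(matrix):
    """Return (C_rep, MM_rep, C_scal) for a subjects-by-items 0/1 matrix."""
    n_subjects = len(matrix)
    n_items = len(matrix[0])
    totals = [sum(row[j] for row in matrix) for j in range(n_items)]

    # Order items from most to least frequently supplied.
    order = sorted(range(n_items), key=lambda j: -totals[j])

    errors = 0
    for row in matrix:
        score = sum(row)
        for position, j in enumerate(order):
            expected = 1 if position < score else 0  # ideal scale pattern
            if row[j] != expected:
                errors += 1

    cells = n_subjects * n_items
    c_rep = 1 - errors / cells
    # Minimum marginal reproducibility: modal response frequency per item.
    mm_rep = sum(max(t, n_subjects - t) for t in totals) / cells
    if mm_rep == 1:
        raise ValueError("scalability is undefined when every item is uniform")
    c_scal = (c_rep - mm_rep) / (1 - mm_rep)
    return c_rep, mm_rep, c_scal


# Hypothetical data: 5 subjects, 3 collocation types (1 = supplied).
data = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(guttman_coefficients(data))
```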

ii) Analysis for Hypothesis 2

To address Hypothesis 2, that there are patterns of development in the

acquisition of collocations within proficiency levels, non-parametric Friedman

repeated measures tests were performed on each group for each set of data, i.e.

three Friedman tests for each set of data. The Friedman repeated measures test is

the non-parametric equivalent of the repeated measures ANOVA. These tests

were followed by Nemenyi's multiple comparisons tests based on the Friedman

rank sums. The analysis for Hypothesis 2 also includes those collocation types

that were not repeated across groups.
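
The Friedman statistic ranks each subject's scores across the conditions (here, collocation types) and compares the rank sums. The sketch below uses the standard formula without a correction for ties; it is an illustration only, not the software used in this study, the scores are hypothetical, and Nemenyi's follow-up comparisons are not included.

```python
def row_ranks(row):
    """Rank the values within one subject's row; ties share the mean rank."""
    ranks = []
    for x in row:
        lower = sum(1 for y in row if y < x)
        equal = sum(1 for y in row if y == x)
        ranks.append(lower + (equal + 1) / 2)
    return ranks


def friedman_chi_square(rows):
    """Friedman chi-square for n subjects (rows) measured on k conditions (columns)."""
    n = len(rows)
    k = len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for j, r in enumerate(row_ranks(row)):
            rank_sums[j] += r
    return 12 / (n * k * (k + 1)) * sum(r ** 2 for r in rank_sums) - 3 * n * (k + 1)


# Hypothetical scores: 4 subjects, 3 collocation types.
scores = [
    [1, 2, 3],
    [2, 5, 9],
    [1, 9, 4],
    [5, 3, 8],
]
print(friedman_chi_square(scores))  # approximately 4.5
```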

For the free production essay data, separate repeated measures

Friedman tests were performed for each group on the tokens for each of the 37

types of collocation found in the students' essays, followed by Nemenyi's

multiple comparison procedures. Implicational scaling analyses using the

frequency data for each group were also performed and Guttman's

coefficients of reproducibility and scalability were calculated.

For the cued production translation data, separate repeated measures

Friedman tests were performed for each group. There were 8 types of

collocation for Group 1, 6 types for Group 2, and 7 types for Group 3. These

tests were followed by Nemenyi's multiple comparisons procedures.

Implicational scaling analyses using the frequency data for each group were

also performed and Guttman's coefficients of reproducibility and scalability

were calculated.

For the cued production blank-filling data, separate repeated measures

Friedman tests were performed for each group. There were 14 types of

collocation for Group 1, 12 types for Group 2, and 13 types for Group 3. These

tests were followed by Nemenyi's multiple comparisons procedures.

Implicational scaling analyses using the frequency data for each group were

also performed and Guttman's coefficients of reproducibility and scalability

were calculated.

The following chapter presents the results of the analyses.

CHAPTER 4

ANALYSES AND RESULTS

4.0 Introduction

This chapter describes the results from the language proficiency

measures and the main analyses performed to address the two hypotheses

listed in 2.5. The presentation of the results is organised around the three sets

of data used to address each of the hypotheses: the free production essay data,

the elicited production translation data and the elicited production blank-filling

data. In section 4.1 the results from the language proficiency measures

performed on the essay data are reported. The aim of these measures was to

screen the data and establish clear proficiency differences among the three

groups, using different measures of language proficiency on the essay data.

This initial screening of the data was considered necessary, since the

proficiency differences between groups are the major independent variable for

the present study.

In 4.2 the results of the main analyses of each set of data are described.

The analyses and results for Hypothesis 1 are first reported in 4.2.1. The aim of

these analyses was to examine evidence for developmental differences in the

knowledge of collocations, assessed both in terms of ability to use collocations

in the essay data, and in terms of accuracy of response to questions eliciting

collocations in the translation and blank-filling data. These analyses involved

comparisons of collocation use and accuracy between groups using Kruskal-

Wallis tests. The Kruskal-Wallis one-way analysis of variance by ranks is a

non-parametric test for deciding whether a number of independent groups are

from different populations (Siegel & Castellan 1988:206). Evidence for

acquisition orders was then sought using implicational scaling of: i) the use of

collocations in essays across all groups, and ii) mean accuracy of response to

collocation types on the translation and blank-filling tests across all groups. In

this way it was hoped to show what differences existed in the subjects'

knowledge of collocations across different proficiency levels, and how

knowledge of collocations developed. This evidence was used to address the

first hypothesis, which states that there are patterns in the development of

collocational knowledge across all groups.

The analyses and results for Hypothesis 2 are described in section 4.2.2.

These analyses involved comparisons of collocation use and accuracy within

each group using Friedman Repeated Measures tests, the non-parametric

counterpart of repeated-measures ANOVA. Evidence for acquisition

orders within groups was then examined using implicational scaling of

collocation use and accuracy for each group. The aim of these analyses was to

show what developmental differences and sequences existed in the use of

collocations within groups. These results would reveal any group-specific

patterns in the development of collocational knowledge.

The results of all the analyses are summarised in 4.3 in relation to the

two hypotheses of the study.

4.1 Language Proficiency Results

In this section the results of the analyses performed to determine the

proficiency differences between the three groups are described.

4.1.1 Descriptive Statistics

Prior to performing the MANOVA for the six measurements on the free

production data, the data were examined with regard to the normality of their

distribution. The data for each group were tested for kurtosis and skewness.

The results are reported below.
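
As a reminder of what the two statistics measure, the sketch below computes skewness and excess kurtosis from the central moments of a sample. The study does not state which estimator its statistical software applied, so the moment-based definitions used here are an assumption; packages that apply small-sample corrections will return slightly different values. The data are hypothetical.

```python
def skewness_and_kurtosis(values):
    """Moment-based skewness and excess kurtosis (both 0 for a normal distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    if m2 == 0:
        raise ValueError("all values are identical")
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    return skewness, excess_kurtosis


# Hypothetical symmetric, flat sample: skewness 0, negative excess kurtosis.
print(skewness_and_kurtosis([1, 2, 3, 4, 5]))
```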

4.1.1.1 Descriptive statistics for Group 1

The results of the descriptive analysis of the data for Group 1 show

normal distributions for the majority of the dependent variables: Holistic

Rating (Kurtosis: -.689, Skewness: -.162), Words per T-Unit (Kurtosis: -.183,

Skewness: .209), and Error-Free T-Units (Kurtosis: -.669, Skewness: .513). The

results of the analysis for the dependent variable Target-Like Use of Articles

also show no significant effects for kurtosis or skewness (Kurtosis: -1.099,

Skewness: -.229). Results of the analysis for the dependent variable Lexical

Density reveal a slightly peaked distribution (Kurtosis: 1.301, Skewness: -.54).

Finally, the results of the analysis for the dependent variable S-Nodes per T-

Unit show a distribution that is positively skewed and peaked (Kurtosis: 2.514,

Skewness: 1.51).

4.1.1.2 Descriptive Statistics for Group 2

The results of the analysis for the data in Group 2 reveal normal

distributions for the dependent variables: Holistic Rating (Kurtosis: .12,

Skewness: -.447), Target-Like Use of Articles (Kurtosis: .134, Skewness: -.747),

and S-Nodes per T-Unit (Kurtosis: .251, Skewness: .611). The data for Lexical

Density (Kurtosis: 7.005, Skewness: 1.94) and Error-Free T-Units (Kurtosis:

1.067, Skewness: 1.259) reveal distributions that are peaked and positively

skewed. Finally, the results of the analysis for the dependent variable Words

per T-Unit show a peaked distribution (Kurtosis: 3.312, Skewness: .891).

4.1.1.3 Descriptive Statistics for Group 3

The analysis of the data for Group 3 again reveals normal distributions

for the majority of the dependent variables: Holistic Rating (Kurtosis: -.855,

Skewness: -.3), Lexical Density (Kurtosis: .204, Skewness: .129), Words per T-

Unit (Kurtosis: -.042, Skewness: .354), and S-Nodes per T-Unit (Kurtosis: .394,

Skewness: .519). The results of the analysis for the dependent variable Target-

Like Use of Articles show a peaked and negatively skewed distribution

(Kurtosis: 2.986, Skewness: -1.815), while the results of the analysis for the

dependent variable Error-Free T-Units show a distribution that is peaked and

positively skewed (Kurtosis: 3.372, Skewness: 1.569).

Despite the fact that some variables in each group displayed slightly

peaked or skewed distributions, overall the distribution of the data from the

analyses of the language proficiency measures was found to be normal. For a

summary of the results on the frequency distributions of the language

proficiency measures see Table 11 below.

Table 11. Kurtosis* and skewness* for the language proficiency measures

                      Group 1              Group 2              Group 3
Measures         Kurtosis  Skewness   Kurtosis  Skewness   Kurtosis  Skewness

Hol. Rating         -.689     -.162        .12     -.447      -.855       -.3
TLU                -1.099     -.229       .134     -.747      2.986    -1.815
Lex. Density        1.301      -.54      7.005      1.94       .204      .129
Words per T         -.183      .209      3.312      .891      -.042      .354
Error-Free T        -.669      .513      1.067     1.259      3.372     1.569
S-Nodes per T       2.514      1.51       .251      .611       .394      .519

* Values > +1 show distributions that are not normal.

4.1.2 Results of the MANOVA

A MANOVA was performed for the factor Group (three levels) and the

six dependent variables. The results of the MANOVA reveal a significant main

effect for Group (F(6, 268) = 69.363, p = .0001). Following this, six one-way

ANOVAs were performed on the data to examine which of the six language

proficiency measures show differences between the groups. The results of the

univariate ANOVAs for the six variables are reported below.

4.1.3 Holistic Rating

The results of the ANOVA for the holistic rating show no significant

difference between the groups (F(2, 272) = 1.148, p = .3188). As can be seen

from Table 12, the mean holistic rating for each group is similar.

Table 12. Means and standard deviations for the dependent variable: Holistic Rating

GROUP      COUNT     MEAN   STD. DEV.

Group 1       91   63.868      12.638
Group 2       94   66.574      12.333
Group 3       90   66.333      15.039
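
The one-way ANOVA F ratio can be recomputed from the group sizes, means and standard deviations in Table 12. The sketch below assumes that the reported standard deviations are sample standard deviations (computed with n - 1); the function name is an assumption. Under that assumption the recomputed value agrees, to within rounding, with the reported F(2, 272) = 1.148.

```python
def one_way_f(groups):
    """One-way ANOVA F ratio from summary statistics.

    groups: list of (n, mean, sd) tuples, one per group, where sd is the
    sample standard deviation.
    Returns (F, df_between, df_within).
    """
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Summary statistics for Holistic Rating, taken from Table 12.
holistic = [(91, 63.868, 12.638), (94, 66.574, 12.333), (90, 66.333, 15.039)]
f_ratio, df1, df2 = one_way_f(holistic)
print(f_ratio, df1, df2)
```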

4.1.4 Target-Like Use of Articles

The ANOVA for the dependent variable target-like use of articles (TLU)

shows a significant main effect for the factor Group (F(2, 272) = 31.306, p =

.0001). To examine the source of the significant effect for the factor Group, post-

hoc comparisons of the means for each group were performed. The results of

the comparisons are illustrated in Table 13. There is a significant difference

between all groups at the p < .05 level of significance. The table of means, Table

14, shows a steady increase in TLU from Group 1 to Group 3.

Table 13. Post-hoc comparisons for the dependent variable: Target-Like Use of Articles

Comparison            Mean Diff.   Fisher PLSD   Scheffe F-test   Dunnett t

Group 1 vs. Group 2     -152.815       73.324*           8.419*       4.103
Group 1 vs. Group 3     -297.831       74.122*          31.295*       7.911
Group 2 vs. Group 3     -145.016       73.531*           7.539*       3.883

* Significant at .05 level

Table 14. Means and standard deviations for the dependent variable: Target-Like Use of Articles

GROUP      COUNT     MEAN   STD. DEV.

Group 1       91   54.879      32.479
Group 2       94   70.160      23.117
Group 3       90   84.662      18.292

4.1.5 Lexical Density

The ANOVA for the dependent variable Lexical Density shows a

significant main effect for the factor Group (F(2, 272) = 23.642, p = .0001). To

examine the source of the significant main effect, post-hoc comparisons of the

means for each group were performed. The results of the comparisons are

illustrated in Table 15. There is a significant difference between all groups at

the p < .05 level of significance. The table of means, Table 16, shows that there

is a steady decrease in Lexical Density from Group 1 to Group 3. The reason

for this drop is made clear in the light of the further analyses performed.

Table 15. Post-hoc comparisons for the dependent variable: Lexical Density

Comparison            Mean Diff.   Fisher PLSD   Scheffe F-test   Dunnett t

Group 1 vs. Group 2       27.091       13.845*           7.422*       3.853
Group 1 vs. Group 3       48.765       13.995*          23.533*        6.86
Group 2 vs. Group 3       21.674       13.884*           4.724*       3.074

* Significant at .05 level

Table 16. Means and standard deviations for the dependent variable: Lexical Density

GROUP      COUNT     MEAN   STD. DEV.

Group 1       91   42.017       5.240
Group 2       94   39.308       5.030
Group 3       90   37.141       3.956

4.1.6 Length of T-Units

The ANOVA for the dependent variable Words per T-Unit shows a

significant main effect for the factor Group (F(2, 272) = 151.684, p = .0001). To

examine the source of the significant main effect, post-hoc comparisons of the

means for each group were performed. The results of the comparisons are

illustrated in Table 17. There is a significant difference between all groups at

the p < .05 level of significance. The table of means, Table 18, shows a steady

increase of the length of the T-Units from Group 1 to Group 3.

Table 17. Post-hoc comparisons for the dependent variable: Words per T-Unit

Comparison            Mean Diff.   Fisher PLSD   Scheffe F-test   Dunnett t

Group 1 vs. Group 2       -9.734        4.619*           8.609*        4.15
Group 1 vs. Group 3      -39.622        4.669*         139.603*      16.709
Group 2 vs. Group 3      -29.889        4.632*          80.719*      12.706

* Significant at .05 level

Table 18. Means and standard deviations for the dependent variable: Words per T-Unit

GROUP      COUNT     MEAN   STD. DEV.

Group 1       91    6.801       1.070
Group 2       94    7.774       1.339
Group 3       90   10.763       2.177

4.1.7 Error-Free T-Units

The ANOVA for the dependent variable Error-Free T-Units shows a

significant main effect for the factor Group (F(2, 272) = 9.031, p = .0002). To

examine the source of the significant effect for the factor Group, post-hoc

comparisons of the means for each group were performed. The results of the

comparisons are illustrated in Table 19. There is a significant difference

between Group 3 and Group 1, and between Group 3 and Group 2, at the p <

.05 level of significance. The highest proficiency group had the smallest

number of Error-Free T-Units, see Table 20.

Table 19. Post-hoc comparisons for the dependent variable: Error-Free T-

Units

Comparison Mean Diff. Fisher PLSD Scheffe F-test Dunnett t

Group 1 vs. Group 2 -.095 2.09 .004 .09

Group 1 vs. Group 3 3.892 2.112* 6.581* 3.628

Group 2 vs. Group 3 3.988 2.096* 7.019* 3.747

* Significant at .05 level

Table 20. Means and standard deviations for the dependent variable: Error-

Free T-Units

GROUP COUNT MEAN STD. DEV.

Group 1 91 9.681 6.993

Group 2 94 9.777 8.662

Group 3 90 5.789 5.596

4.1.8 S-Nodes per T-Unit

The ANOVA for the dependent variable S-Nodes per T-Unit shows a

significant main effect for the factor Group (F(2, 272) = 89.607, p = .0001). To

examine the source of the significant effect for the factor Group, post-hoc

comparisons of the means for each group were performed. The results of the

comparisons are illustrated in Table 21. There is a significant difference

between all groups at the p < .05 level of significance. There is a steady increase

in the number of S-Nodes per T-Unit from Group 1 to Group 3, see Table 22.

Table 21. Post-hoc comparisons for the dependent variable: S-Nodes per T-

Unit

Comparison Mean Diff. Fisher PLSD Scheffe F-test Dunnett t

Group 1 vs. Group 2 -2.462 .691* 24.614* 7.016

Group 1 vs. Group 3 -4.747 .698* 89.551* 13.383

Group 2 vs. Group 3 -2.285 .693* 21.085* 6.494

* Significant at .05 level

Table 22. Means and standard deviations for the dependent variable: S-

Nodes per T-Unit

GROUP COUNT MEAN STD. DEV.

Group 1 91 1.119 .144

Group 2 94 1.366 .229

Group 3 90 1.594 .312

4.1.9 Summary of the Results for the Language Proficiency Analyses

The results of the analyses for the language proficiency measures show

an overall significant main effect for the factor Group. No difference was found

between the three groups in the holistic rating of the essays, even though the

three groups represent three different levels of language proficiency. This

could be due to the nature of the holistic rating since it takes into account not

just the use of language, but also the structure of the essay, its organisation, the

expression of ideas, the explanations and arguments provided by the writer,

etc. Therefore the ratings based on the holistic scale may obscure differences

among subjects that are attributable to language proficiency, which is of most

interest to this study. However, this lack of significant differences between

groups using this measure is counter-balanced by the fact that reliable

differences were found using the other measures, and these are in line with the

claim that the different groups are composed of subjects at different levels of

proficiency, and possibly different stages of development.

As can be seen from Table 16, there is a significant drop in Lexical

Density for subjects in Groups 2 and 3 compared with subjects in Group 1, and

for subjects in Group 3 compared with subjects in Group 2. To this extent,

increases in proficiency appeared to be related to increases in the number of

grammatical words used in the essays (as the TLU analysis showed).

Lower-level students were grammatically less accurate in their essays (as

the results from the TLU analysis show), and thus the omission of grammatical

words (e.g. articles) contributed to a higher percentage score for Lexical

Density. In Group 3 where the students are grammatically more accurate, as

the TLU analysis showed, the lexical density is lower. These results are also

consistent with the findings of recent research which showed that subjects of

lower proficiency levels use more content words, while those of higher

proficiency levels use more function words, e.g. pronouns, articles, and

prepositions (Ghadessy 1989).
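
The effect described above, in which omitting grammatical words raises Lexical Density, can be illustrated with a small sketch. It assumes that Lexical Density is the percentage of content words among all words in a text. The function-word list and the two example sentences are hypothetical and far smaller than anything used in a real analysis.

```python
# Hypothetical, deliberately tiny list of function (grammatical) words.
FUNCTION_WORDS = {"the", "a", "an", "in", "on", "of", "to", "he", "she", "it", "and"}

def lexical_density(text):
    """Percentage of tokens that are content words (not in FUNCTION_WORDS)."""
    tokens = [w.strip(".,;:!?").lower() for w in text.split()]
    tokens = [w for w in tokens if w]
    if not tokens:
        return 0.0
    content = [w for w in tokens if w not in FUNCTION_WORDS]
    return 100.0 * len(content) / len(tokens)

with_articles = "The boy ate an apple in the garden."
without_articles = "Boy ate apple in garden."
print(lexical_density(with_articles))     # 4 content words out of 8 tokens
print(lexical_density(without_articles))  # 4 content words out of 5 tokens
```

Both sentences contain the same four content words, so the second scores higher only because the articles are missing.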

The results for the dependent variable Words per T-Unit also reflect

different proficiency groupings (see Table 18). The higher the level of

proficiency, the more subordination and embedding the student uses in the

construction of sentences, and thus the longer the sentences they produce. This

finding, it must be noted, is in partial agreement with the finding reported by

Larsen-Freeman and Strom (1977), who found that the mean length of the T-

Units in the writings of the subjects in their study increased steadily with

proficiency level, but the statistical analysis performed on their data did not

yield significant differences. Larsen-Freeman and Strom conclude that length

of T-Units is still "a viable contender on which to base an index of

development" (Larsen-Freeman & Strom 1977:132).

The results for the dependent variable Error-Free T-Units do show

significant differences between the three groups (see Table 19). Although these

differences support the claim that the three different groups reflect different

proficiency groupings, the direction of the difference is in contrast to the

findings of previous research. In line with Larsen-Freeman's findings (1978) it

was expected that more proficient subjects would use more Error-Free T-Units

than less proficient subjects. However, the present findings show that subjects

in Groups 1 and 2, who do not differ significantly from each other (see Table 19),

both use significantly more Error-Free T-Units than

subjects in Group 3. The present results could be due to the fact that the more

proficient subjects in this study are simply trying harder to produce more complex syntax than less

proficient subjects. It is certainly true that subjects in the present study are not

at a sufficiently advanced level to make no mistakes in their writing, since the

subjects in Group 3, who have had the longest period of instruction in English,

and who are older by one and two years on average than subjects in the other

groups, are only at a post-intermediate level. In Larsen-Freeman's study

subjects were from a larger range of proficiency levels (5 groups), from subjects

that were of very low proficiency and needed a great deal of ESL instruction

(Group 1) to subjects that were advanced enough not to need any more ESL

instruction (Group 5). Even though Larsen-Freeman does not report the post-

hoc comparisons for the Error-Free T-Units measure, it is apparent from the

percentages reported in her paper that it is at the advanced level that subjects

use significantly more Error-Free T-Units, e.g. there is a 15 percentage-point increase in the

amount of Error-Free T-Units used by the advanced learners in group 5 (see

Table 23).

Table 23. Percentage of Error-Free T-Units in Larsen-Freeman (1978)

Group Number %EFT

1 37 11.4

2 39 18.5

3 45 22.1

4 56 34.3

5 35 49.6

(Adapted from Larsen-Freeman 1978:445)
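
The group-to-group changes in Table 23 can be checked with a few lines of arithmetic. The sketch below uses only the percentages in the table; the function name is hypothetical.

```python
# Percentages of Error-Free T-Units per group, copied from Table 23.
eft_percent = {1: 11.4, 2: 18.5, 3: 22.1, 4: 34.3, 5: 49.6}

def stepwise_gains(percentages):
    """Percentage-point change from each group to the next higher group."""
    groups = sorted(percentages)
    return {
        (lower, higher): round(percentages[higher] - percentages[lower], 1)
        for lower, higher in zip(groups, groups[1:])
    }

gains = stepwise_gains(eft_percent)
for (lower, higher), gain in gains.items():
    print(f"Group {lower} -> Group {higher}: +{gain} points")
```

The largest single step is the one from Group 4 to Group 5, which agrees with the observation that the gain in Error-Free T-Units is concentrated at the advanced level.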

In line with the above interpretation, that more proficient subjects in

Group 3 try harder to produce more complex syntax and so make greater

numbers of errors, it was shown that Group 3 students write longer T-Units

than subjects in the other groups. They should therefore have a higher chance

of making mistakes than subjects in the other two groups. The shorter

the T-units, the less chance subjects have of making spelling, punctuation,

grammatical, or syntactic mistakes.

Also in line with this argument are the results for the dependent variable

S-Nodes per T-Unit. The higher the level of the students’ proficiency, the more

syntactically complex sentences they produce in writing (see Bardovi-Harlig

1992a). In summary, the higher the level students belong to, the more accurate

they are in the use of articles, and the more syntactically complex and longer

sentences they produce, while their lexical density decreases and their chance

of making an error increases.

4.2 Results of the Main Analyses

In this section the results of the analyses performed to address each of

the two hypotheses are described.

4.2.1 Hypothesis 1: There are patterns of development in collocational

knowledge across proficiency levels

To address Hypothesis 1, the three sets of data were analysed separately:

i) For the free production data, tokens of the correct use of the thirty-seven

types of collocation were recorded. Lack of, or incorrect use of, a particular

type was scored as 0. The data were entered as the sum of tokens of correct

usage of each collocation type by each subject.

ii) For the translation data, the mean accuracy of response to each of the six

types of collocation repeated across groups was calculated.

iii) For the blank filling data, the mean accuracy of response to each of the

eleven types of collocation repeated across groups was calculated.

The data for these analyses were examined and were not found to be

normally distributed. This is due, in the case of the elicited production

measures, to the fact that means for accurate responses to some types were

calculated on the basis of a small number of responses to tokens, thus

restricting the possible range of scores on these types. In the case of the essay

data the mean use of many types of collocation did not follow the normal

pattern of distribution within and across groups. This justifies the use of non-

parametric Kruskal-Wallis tests, followed by Dunn's multiple comparisons

procedures to address the first hypothesis regarding between-group differences

in accuracy and use of collocations.
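
For readers unfamiliar with the procedure, the Kruskal-Wallis statistic with a correction for ties can be sketched as follows. This is a minimal illustration and not the software used in the study: the function names and the toy token counts are hypothetical. All scores are ranked together, tied scores share the mean of their ranks, and H is computed from the rank sums of each group and then divided by the tie correction factor.

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def kruskal_wallis(groups):
    """Return the tie-corrected H statistic and its degrees of freedom."""
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    # Tie correction: each set of t tied scores contributes t^3 - t.
    counts = {}
    for x in pooled:
        counts[x] = counts.get(x, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    correction = 1 - tie_term / (n_total ** 3 - n_total)
    if correction == 0:
        raise ValueError("all scores are identical; H is undefined")
    return h / correction, len(groups) - 1

# Toy token counts for three groups (hypothetical data with many ties).
h, df = kruskal_wallis([[0, 0, 1], [1, 2, 2], [2, 3, 3]])
print(f"H = {h:.3f}, df = {df}")
```

The resulting H is referred to the chi-square distribution with (number of groups - 1) degrees of freedom, which is why the results below are reported with 2 degrees of freedom.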

4.2.1.1 Essay Data (All Groups)

The sum of tokens for each of the 37 types of collocation was calculated

for each essay. Kruskal-Wallis tests were performed to identify significant

between-group differences with respect to each collocation type. The results of

the Kruskal-Wallis tests of the mean tokens of each of the 37 collocation types

used by subjects in each group, corrected for ties, together with the results of

the post-hoc Dunn's multiple comparisons procedures, are reported below.

These are summarised in Table 24. Collocation types that did not show

significant differences across all groups, or which did not contain any tokens

for one or two of the groups, are not included in the table.

Table 24. Summary of the results of the Kruskal-Wallis tests and post-hoc

analyses for the essay data

Dunn’s Procedure: Mean Rank Differences

Type K-W 1 vs. 2 2 vs. 3 1 vs. 3

1. Noun Prep 15.664 122.401-136.926 136.926<154.894* 122.401<154.894*

2. Noun to inf 6.832 132 -139.33 139.33 -142.678 132 -142.678

4. Prep Noun 19.104 129.742<163.234* 163.234>119.994* 129.742-119.994

5. Adjective Prep 6.118 125.242<144.601* 144.601-144.006 125.242<144.006*

11. SV(O)prepO 14.592 146.33 -154.053 154.053>112.811* 146.33 >112.811*

12. SV to inf 41.069 97.242<166.83* 166.83 >149.1* 97.242<149.1*

13. SV inf 71.452 89.11 <144.34* 144.34 <180.811* 89.11 <180.811*

14. SVV-ing 6.19 137.198-147.963 147.963>128.406* 137.198-128.406

15. SVO to inf 16.115 127.473-136.239 136.239-150.483 127.473<150.483*

19. SV(O) that 45.251 103.11 <141.569* 141.569<169.55* 103.11 <169.55*

21. SVOc 19.721 128 -135.154 135.154-151.083 128 <151.083*

24. SV(O) wh 8.585 128.06 <147.096* 147.096-138.55 128.06 -138.55

26. SVc 41.535 170.088>147.202* 147.202>95.944* 170.088>95.944*

29. Adj Noun 63.637 177.049>149.261* 149.261>86.756* 177.049>86.756*

31. N1of N2 8.371 135.588-133.936 133.936-144.683 135.588-144.683

36. Prep Det N 16.584 113.813<158.527* 158.527>141.017* 113.813<141.017*

37. Phrasal Verb 51.136 116.082<175.399* 175.399>121.1* 116.082-121.1

*: Significant at the .05 level

< or >: direction of the difference
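
The pairwise comparisons in Table 24 are differences between group mean ranks. A common formulation of Dunn's procedure converts each difference into a z statistic and compares it with a critical value adjusted for the number of comparisons. The sketch below is an assumption-laden illustration: the function names are hypothetical, a Bonferroni adjustment is assumed (the adjustment used in the study is not stated here), and the numbers are toy values, not the study data.

```python
from math import sqrt
from statistics import NormalDist

def dunn_z(mean_rank_a, mean_rank_b, n_a, n_b, n_total, tie_sizes=()):
    """z statistic for one pairwise comparison of mean ranks.

    tie_sizes lists the size of every set of tied scores in the pooled data;
    leave it empty when there are no ties.
    """
    tie_term = sum(t ** 3 - t for t in tie_sizes)
    variance = n_total * (n_total + 1) / 12 - tie_term / (12 * (n_total - 1))
    standard_error = sqrt(variance * (1 / n_a + 1 / n_b))
    return (mean_rank_a - mean_rank_b) / standard_error

def bonferroni_critical_z(alpha, n_comparisons):
    """Two-sided critical |z| after a Bonferroni adjustment (assumed here)."""
    return NormalDist().inv_cdf(1 - alpha / (2 * n_comparisons))

# Toy example: two groups of three subjects each, nine subjects in total.
z = dunn_z(2.0, 8.0, n_a=3, n_b=3, n_total=9)
critical = bonferroni_critical_z(0.05, 3)
print(f"z = {z:.3f}, critical |z| = {critical:.3f}, significant: {abs(z) > critical}")
```

With heavily tied data, such as token counts in which most subjects score zero, the tie term shrinks the standard error considerably, so the same mean rank difference can be significant for one collocation type and not for another.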

4.2.1.1.1 Kruskal-Wallis Analyses for the Essay Data

Type 1. Noun Preposition - Results of the Kruskal-Wallis test for

numbers of tokens of Noun Preposition collocations used in the subjects' essays

show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N =

275) = 15.664, p = .0004). The table of means, Table 25, shows that the mean

number of tokens used per group increases across groups. The results of the

post-hoc Dunn's multiple comparisons procedure show significant differences

at the p < .05 level between numbers of tokens of this type used in Groups 1

and 3, and in Groups 2 and 3, but not in Groups 1 and 2 (see Table 24).

Type 2. Noun to Infinitive - Results of the Kruskal-Wallis test for

numbers of tokens of Noun to Infinitive collocations used in the subjects' essays

show the difference between groups to be significant (Kruskal-Wallis χ² (2, N =

275) = 6.832, p = .0328). However, the results of the Dunn's multiple

comparisons procedure show no significant difference between any pairs of

groups at the p < .05 level, even though the mean number of tokens used per

group increases across groups and the difference between Group 1 and Group

2 is approaching significance (see Table 24).

Type 4. Preposition Noun - Results of the Kruskal-Wallis test for

numbers of tokens of Preposition Noun collocations used in the subjects' essays

show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N =

275) = 19.104, p = .0001). The table of means, Table 25, shows that subjects in

Group 2 used more collocations of this type in their essays than subjects in

Group 1, and subjects in Group 1 used more collocations of this type than

subjects in Group 3. The results of the post-hoc Dunn's multiple comparisons

procedure show significant differences in numbers of tokens of this type used

between Groups 1 and 2, and between Groups 2 and 3, but not between Groups

1 and 3 (see Table 24).

Type 5. Adjective Preposition - Results of the Kruskal-Wallis test for

numbers of tokens of Adjective Preposition collocations used in the subjects'

essays show a significant main effect for the factor Group (Kruskal-Wallis χ² (2,

N = 275) = 6.118, p = .0469). The table of means, Table 25, shows that the mean

number of tokens used by subjects in Group 2 is nearly equal to the mean number of

tokens of this type used by subjects in Group 3, while subjects in Group 1

produced considerably fewer tokens of this type than subjects in Groups 2 and 3.

The results of the post-hoc Dunn's multiple comparisons procedure show

significant differences in numbers of tokens of this type used between Groups 1

and 3, and between Groups 1 and 2, but not between Groups 2 and 3 (see Table

24).

Type 11. SV(O) Preposition O - Results of the Kruskal-Wallis test for

numbers of tokens of SV(O) Preposition O collocations used in the subjects'

essays show a significant main effect for the factor Group (Kruskal-Wallis χ² (2,

N = 275) = 14.592, p = .0007). The table of means, Table 25, shows that subjects

in Group 2 produced more tokens of this type of collocation than subjects in

Group 1, who used more tokens of this type of collocation than subjects in

Group 3. The results of the post-hoc Dunn's multiple comparisons procedure

show significant differences in numbers of tokens of this type used between

Groups 1 and 3, and between Groups 2 and 3, but not between Groups 1 and 2

(see Table 24).

Type 12. SV to Infinitive - Results of the Kruskal-Wallis test for numbers

of tokens of SV to Infinitive collocations used in the subjects' essays show a

significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) =

41.069, p = .0001). The table of means, Table 25, shows that subjects in Group 2

produced more tokens of this type of collocation than subjects in Group 3, who

used more tokens of this type of collocation than subjects in Group 1. The

results of the post-hoc Dunn's multiple comparisons procedure show

significant differences in numbers of tokens of this type used across all groups

(see Table 24).

Type 13. SV Infinitive - Results of the Kruskal-Wallis test for numbers of

tokens of SV Infinitive collocations used in the subjects' essays show a

significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) =

71.452, p = .0001). The table of means, Table 25, shows that the mean number of

tokens used per group increases across groups. The results of the post-hoc

Dunn's multiple comparisons procedure show significant differences in

numbers of tokens of this type used across all groups (see Table 24).

Type 14. SVV-ing - Results of the Kruskal-Wallis test for numbers of

tokens of SVV-ing collocations used in the subjects' essays show a significant

main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 6.19, p =

.0453). The table of means, Table 25, shows that subjects in Group 2 produced

more tokens of this type of collocation than subjects in Group 1, who used more

tokens of this type of collocation than subjects in Group 3. The results of the

post-hoc Dunn's multiple comparisons procedure show significant differences

in numbers of tokens of this type used only between Groups 2 and 3, but not

between Groups 1 and 2, or between Groups 1 and 3 (see Table 24).

Type 15. SVO to Infinitive - Results of the Kruskal-Wallis test for

numbers of tokens of SVO to Infinitive collocations used in the subjects' essays

show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N =

275) = 16.115, p = .0003). The table of means, Table 25, shows that the mean

number of tokens used increases across groups. The results of the post-hoc

Dunn's multiple comparisons procedure show significant differences in

numbers of tokens of this type only between Groups 1 and 3, but not between

Groups 1 and 2, or between Groups 2 and 3 (see Table 24).

Type 19. SV(O) that-clause - Results of the Kruskal-Wallis test for

numbers of tokens of SV(O) that-clause collocations used in the subjects' essays

show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N =

275) = 45.251, p = .0001). The table of means, Table 25, shows that the mean

number of tokens used per group increases across groups. The results of the

post-hoc Dunn's multiple comparisons procedure show significant differences

in numbers of tokens of this type used between all groups (see Table 24).

Type 21. SVOc - Results of the Kruskal-Wallis test for numbers of tokens

of SVOc collocations used in the subjects' essays show a significant main effect

for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 19.721, p = .0001). The

table of means, Table 25, shows that the mean number of tokens used per

group increases across groups. The results of the post-hoc Dunn's multiple

comparisons procedure show significant differences in numbers of tokens of

this type used only between Groups 1 and 3, but not between Groups 1 and 2,

or between Groups 2 and 3 (see Table 24).

Type 24. SV(O) wh-word - Results of the Kruskal-Wallis test for

numbers of tokens of SV (O) wh-word collocations used in the subjects' essays

show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N =

275) = 8.585, p = .0137). The table of means, Table 25, shows that subjects in

Group 2 produced more tokens of this type of collocation than subjects in

Group 3, who used more tokens of this type of collocation than subjects in

Group 1. The results of the post-hoc Dunn's multiple comparisons procedure

show significant differences in numbers of tokens of this type used only

between Groups 1 and 2, but not between Groups 2 and 3, or Groups 1 and 3

(see Table 24).

Type 26. SVc - Results of the Kruskal-Wallis test for numbers of tokens

of SVc collocations used in the subjects' essays show a significant main effect

for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 41.535, p = .0001). The

table of means, Table 25, shows that the mean number of tokens of this type of

collocation decreases as the proficiency level of the subjects increases. The

results of the post-hoc Dunn's multiple comparisons procedure show

significant differences in numbers of tokens of this type of collocation between

all groups (see Table 24).

Type 29. Adjective Noun - Results of the Kruskal-Wallis test for numbers

of tokens of Adjective Noun collocations used in the subjects' essays show a

significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) =

63.637, p = .0001). The table of means, Table 25, shows that the mean number of

tokens of this type of collocation decreases as the proficiency level of the

subjects increases. The results of the post-hoc Dunn's multiple comparisons

procedure show significant differences in numbers of tokens of this type of

collocation across all groups (see Table 24).

Type 30. Noun Verb - Results of the Kruskal-Wallis test for numbers of

tokens of Noun Verb collocations used in the subjects' essays show a significant

main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 6.212, p =

.0448), though the Dunn's multiple comparisons procedure revealed no

significant between-group differences (see Table 24).

Type 31. Noun1 of Noun2 - Results of the Kruskal-Wallis test for numbers

of tokens of Noun1 of Noun2 collocations used in the subjects' essays show the

difference between groups to be significant (Kruskal-Wallis χ² (2, N = 275) =

8.371, p = .0152). However, the results of the Dunn's multiple comparisons

procedure show no significant differences between any of the groups at the p <

.05 level (see Table 24).

Type 36. Preposition Determiner Noun - Results of the Kruskal-Wallis

test for numbers of tokens of Preposition Determiner Noun collocations used in

the subjects' essays show a significant main effect for the factor Group

(Kruskal-Wallis χ² (2, N = 275) = 16.584, p = .0003). The table of means, Table

25, shows that subjects in Group 2 produced more tokens of this type of

collocation than subjects in Group 3, who used more tokens of this type of

collocation than subjects in Group 1. The results of the post-hoc Dunn's

multiple comparisons procedure show significant differences between all

groups (see Table 24).

Type 37. Phrasal Verb - Results of the Kruskal-Wallis test for numbers of

tokens of Phrasal Verb collocations used in the subjects' essays show a

significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) =

51.136, p = .0001). The table of means, Table 25, shows that subjects in Group 2

produced more tokens of this type of collocation than subjects in Group 3, who

used more tokens of this type of collocation than subjects in Group 1. The

results of the post-hoc Dunn's multiple comparisons procedure show

significant differences between Groups 1 and 2, and between Groups 2 and 3,

but not between Groups 1 and 3 (see Table 24).

Table 25. Means and standard deviations by group for the essay data

Means per Group Std. Dev. per Group

Types Group1 Group2 Group3 Group1 Group2 Group3

1. Noun Prep .099 .234 .400 .335 .517 .650

2. Noun to infinitive 0.000 .074 .083 0.000 .366 .323

3. Noun that 0.000 0.000 0.000 0.000 0.000 0.000

4. Preposition Noun .626 .989 .411 1.217 1.187 .701

5. Adjective Prep .231 .426 .411 .616 .823 .833

6. Pred Adj to inf .033 .128 .100 .180 .421 .337

7. Adj that 0.000 .011 .011 0.000 .103 .105

8. SVO to O/SVOO .209 .117 .133 .587 .384 .429

9. SVO to O 0.000 0.000 .011 0.000 0.000 .105

10. SVO for O/SVOO .011 .011 0.000 .105 .103 0.000

11. SV(O) prep O 2.066 2.213 1.200 2.081 2.047 1.432

12. SV to inf .835 2.404 1.933 1.790 2.329 2.360

13. SV inf .286 1.287 2.389 .898 1.708 2.355

14. SVV-ing .330 .606 .222 .844 1.483 .746

15. SVO to inf .011 .106 .256 .105 .427 .628

16. SVO inf .011 .021 .044 .105 .145 .207

17. SVO V-ing 0.000 .032 0.000 0.000 .177 0.000

18. SV poss V-ing 0.000 0.000 0.000 0.000 0.000 0.000

19. SV(O) that .165 .702 1.078 .719 1.199 1.326

20. SVO to be c 0.000 .011 0.000 0.000 .103 0.000

21. SVOc 0.000 .053 .256 0.000 .226 .646

22. SVOO 0.000 0.000 0.000 0.000 0.000 0.000

23. SV(O) Adverbial .813 .649 .533 .999 .924 .782

24. SV(O) wh-word .055 .245 .133 .273 .581 .373

25. S(it)VO to inf 0.000 0.000 0.000 0.000 0.000 0.000

26. SVc 7.846 6.160 3.789 5.131 3.740 3.000

27. Verb Noun (creat) .549 .681 .800 1.036 .941 1.083

28. Verb Noun (erad) 0.000 0.000 0.000 0.000 0.000 0.000

29. Adj Noun 3.440 2.021 .856 3.078 1.328 1.076

30. Noun Verb 0.000 0.000 .033 0.000 0.000 .181

31. Noun1 of Noun2 .044 .011 .122 .295 .103 .419

32. Adv Adj 0.000 0.000 0.000 0.000 0.000 0.000

33. Verb Adverb 0.000 .021 .056 0.000 .145 .275

34. Noun Noun .209 .298 .367 .548 .583 .800

35. Miscellaneous .033 .043 .022 .233 .203 .148

36. Prep Det Noun .714 1.362 1.033 1.088 1.335 1.146

37. Phrasal Verb .143 .936 .200 .382 1.096 .050

[Figure 3: bar chart titled "Collocation Tokens - Essay Data - All Groups"; x-axis: Collocation Types; y-axis: 0 to 8; series: Group 1, Group 2, Group 3]

Note. Only those collocation types showing significant differences across

groups are included in this figure.

Figure 3. Mean use of collocation tokens - essay data - all groups

4.2.1.1.2 Summary of the Results for the Essay Data

The results of the Kruskal-Wallis analyses of the accurate use of the 37

types of collocations in the students' essays partially support Hypothesis 1,

since there are significant differences between different proficiency groups in

the use of collocations. These differences are clear in the use of the collocation

Types 13. SV Infinitive and 19. SV(O) that. As the proficiency level increases,

the accurate use of these two types of collocations increases, resulting in

significant between-group differences, across all three groups.

Type 1. Noun Preposition collocations are also positively related to

proficiency, since Group 3 subjects use significantly more tokens of this type of

collocation than subjects in the other two groups.

The results also show that the direction of the between-group differences

is not always the expected one. With respect to the collocation Types 26. SVc

and 29. Adjective Noun, proficiency is negatively correlated with accurate use of

these two types across groups: the less proficient students use significantly

more tokens of these two types of collocation than the more proficient students.

There are also collocation Types, 4. Preposition Noun, 12. SV to Infinitive, 36.

Prep Det Noun, and 37. Phrasal Verb, for which Group 2 subjects use

significantly more tokens than either Group 1 or Group 3 subjects. For Type 5.

Adjective Preposition, Group 2 and Group 3 subjects use significantly more

collocations than subjects in Group 1, while for Type 11. SV(O) Preposition O

Group 1 and 2 subjects are significantly better users than Group 3 subjects.

There are also a number of collocation types that did not receive any tokens of

accurate use by any of the groups (3. Noun that, 18. SV possessive V-ing, 22.

SVOO, 25. S(it)VO to inf, 28. Verb Noun (eradication), and 32. Adverb Adjective).

The results of the analysis for this set of data, summarised below in

Table 26, suggest that there are indeed proficiency-related differences in the

accurate use of collocations, and that there are specific types of collocation that

are used in the early stages of proficiency, and others that are used in the later

stages of development.

Table 26. Collocational use distinguishing proficiency levels

Group 1 Group 2 Group 3

Collocation Types Collocation Types Collocation Types

26. SV c** 4. Prep Noun** 1. Noun Prep**

29. Adjective Noun** 12. SV to Inf** 13. SV Inf**

11. SV(O) Prep O* 36. Prep Det Noun** 19. SV(O) that**

37. Phrasal Verb** 5. Adjective Prep*

5. Adjective Prep*

11. SV(O) Prep O*

**: Significantly more occurrences than the other two groups

*: Significantly more occurrences than one other group

4.2.1.1.3 Implicational Scaling for the Essay Data (All Groups)

For the implicational scaling analysis the Guttman procedure was used.

When the Guttman analysis reveals that a particular scale is consistently

interpretable, that is if one item on the scale is statistically consistently more

difficult than another, which is in turn harder than another, then the scale

attains a certain predictive power (Davidson 1987). The coefficient of

reproducibility, which shows how accurately a subject's performance can be

predicted from that person's position in the matrix, and the coefficient of

scalability, which is a single statistic detailing the strength of the items as an

ordered scale and indicating whether a given set of features is truly scalable

and unidimensional, were calculated. The higher the value of the coefficient of

scalability, the more "implicational" the scale (Davidson 1987).
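
The two coefficients can be sketched for a small subject-by-item matrix of 1s (used) and 0s (not used). The sketch makes several assumptions that should be read as such: the function name is hypothetical, errors are counted as deviations from the ideal pattern implied by each subject's total score (one of several error-counting conventions), and the formulas for minimum marginal reproducibility and scalability follow the commonly cited definitions rather than any particular software package.

```python
def guttman_coefficients(matrix):
    """Return (reproducibility, minimum marginal reproducibility, scalability).

    matrix is a list of rows, one per subject, each a list of 0/1 values,
    one per item (here: collocation type).
    """
    n_subjects = len(matrix)
    n_items = len(matrix[0])
    total_cells = n_subjects * n_items
    item_totals = [sum(row[j] for row in matrix) for j in range(n_items)]
    # Order items from most commonly used to least commonly used.
    order = sorted(range(n_items), key=lambda j: -item_totals[j])
    errors = 0
    for row in matrix:
        score = sum(row)
        observed = [row[j] for j in order]
        # Ideal pattern: a subject with score s uses exactly the s easiest items.
        ideal = [1] * score + [0] * (n_items - score)
        errors += sum(o != i for o, i in zip(observed, ideal))
    c_rep = 1 - errors / total_cells
    # Minimum marginal reproducibility: predict each item by its majority value.
    max_marginals = sum(max(t, n_subjects - t) for t in item_totals)
    mm_rep = max_marginals / total_cells
    if mm_rep == 1:
        raise ValueError("no variation in the matrix; scalability is undefined")
    c_scal = (c_rep - mm_rep) / (1 - mm_rep)
    return c_rep, mm_rep, c_scal

# Toy matrix: five subjects, three items (hypothetical data).
toy = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
    [1, 0, 1],  # deviates from the ideal pattern for a score of 2
]
c_rep, mm_rep, c_scal = guttman_coefficients(toy)
print(f"reproducibility = {c_rep:.3f}, scalability = {c_scal:.3f}")
```

The sketch also shows why reproducibility alone can mislead. When most cells in the matrix are 0, the minimum marginal reproducibility is already high, so a high coefficient of reproducibility can coexist with a low coefficient of scalability.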

Each subject was coded as having used (1), or not used (0), each of the 37

types of collocation in their essays. The two axes of the matrix for the

implicational scaling consisted of the 37 items ranked from most commonly

used by all subjects to least commonly used, and the 275 subjects ranked in

order of their frequency of use of all types of collocations, from subjects using

the most types to subjects using the fewest types. This matrix is summarised

for the first 17 types, mean >.1, in Figure 4 below. The coefficient of

reproducibility for this analysis was .90. The coefficient of scalability was .33.

While the coefficient of reproducibility is at the level necessary for this

implicational scale to be considered valid (see Andersen 1978), the coefficient of

scalability is below the recommended level of .6 (Hatch & Lazaraton 1991:212).

This suggests that while the implicational scale for the essay data is valid, the

variance in terms of numbers of errors, and the fact that most subjects did not

use the majority of the scaled collocations, resulted in the low coefficient of

scalability.

Note. Only those types with a mean > .1 are included in this figure.

Figure 4. Mean use of collocation tokens per type in the essay data.

4.2.1.2 Translation Data (All Groups)

The data set used in these analyses consisted of the mean accuracy of

response to each of the six types of collocation repeated across groups in the

translation test. As with the essay data reported above, the procedure followed

in analysing the translation data was to perform Kruskal-Wallis tests of the

differences between groups for each collocation type separately. Subsequently,

where significant group effects were identified, post-hoc Dunn's multiple

comparisons procedures were calculated in order to identify the source of the

significant contrasts between groups. The results of the Kruskal-Wallis tests,

together with the results of post-hoc Dunn's multiple comparisons procedures,

are reported below. These are summarised in Table 27. Type 16. SVO Infinitive

showed no significant differences across groups and therefore is not included

in the table.

Table 27. Summary of the results of the Kruskal-Wallis tests and post-hoc

analyses for the translation data

Dunn’s Procedure: Mean Rank Differences

Type K-W 1 vs. 2 2 vs. 3 1 vs. 3

1. Noun Prep 51.334 122.505-110.681 110.681<182.2* 122.505<182.2*

5. Adjective Prep 6.503 128.703-136.404 136.404-149.06 128.703<149.06*

11. SV(O) Prep O 14.546 127.434<159.479* 159.479>126.25* 127.434-126.25

14. SVV-ing 33.999 122.709-121.793 121.793<170.38* 122.709<170.38*

27. Verb N (creat) 85.758 101.269<126.511* 126.511<187.13* 101.269<187.13*

*: Significant at the .05 level, < or >: Direction of the difference

4.2.1.2.1 Kruskal-Wallis Analyses for the Translation Data

Type 1. Noun Preposition - Results of the Kruskal-Wallis test of

responses to Noun Preposition collocations show a significant main effect for the

factor Group (Kruskal-Wallis χ² (2, N = 275) = 51.334, p = .0001). The results of

the post-hoc Dunn's multiple comparisons procedure show significant

differences in the mean accuracy of response to this type of collocation between

Groups 1 and 3, and between Groups 2 and 3, but not between Groups 1 and 2

(see Table 27).
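The pairwise comparisons can be illustrated with a minimal sketch of Dunn's procedure applied to the Type 1 mean ranks in Table 27. Several points are assumptions rather than details confirmed by this section: the group sizes of 91, 94 and 90 (chosen so that they sum to N = 275), the Bonferroni adjustment over the three comparisons, and the omission of a tie correction in the standard error. The function name is hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import NormalDist


def dunn_comparisons(mean_ranks, sizes, alpha=0.05):
    """Return {(i, j): (z, significant)} for every pair of groups (1-based).

    Uses z = (Ri - Rj) / sqrt(N(N + 1)/12 * (1/ni + 1/nj)), without a tie
    correction, and a two-sided Bonferroni-adjusted critical value.
    """
    n_total = sum(sizes)
    n_pairs = len(sizes) * (len(sizes) - 1) // 2
    z_critical = NormalDist().inv_cdf(1 - alpha / (2 * n_pairs))
    results = {}
    for i, j in combinations(range(len(sizes)), 2):
        standard_error = sqrt(
            n_total * (n_total + 1) / 12 * (1 / sizes[i] + 1 / sizes[j])
        )
        z = (mean_ranks[i] - mean_ranks[j]) / standard_error
        results[(i + 1, j + 1)] = (z, abs(z) > z_critical)
    return results


# Mean ranks for Type 1. Noun Prep from Table 27; group sizes are ASSUMED.
type_1 = dunn_comparisons([122.505, 110.681, 182.2], [91, 94, 90])
for pair, (z, significant) in type_1.items():
    print(pair, round(z, 2), significant)
```

Under these assumptions the sketch flags the same contrasts as the table: Groups 1 and 3, and Groups 2 and 3, but not Groups 1 and 2.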

Type 5. Adjective Preposition - Results of the Kruskal-Wallis test of

responses to Adjective Preposition collocations show a significant main effect for

the factor Group (Kruskal-Wallis χ²(2, N = 275) = 6.503, p = .0387). The table of

means, Table 28, shows that the mean accuracy of response to Adjective

Preposition collocations increases across groups. The results of the post-hoc

Dunn's multiple comparisons procedure show significant differences in the

mean accuracy of response to this type of collocation only between Groups 1

and 3, but not between Groups 2 and 3, or between Groups 1 and 2 (see Table

27).

Type 11. SV(O) Preposition O - Results of the Kruskal-Wallis test of

responses to SV(O) Preposition O collocations show a significant main effect for

the factor Group (Kruskal-Wallis χ²(2, N = 275) = 14.546, p = .0007). The results

of the post-hoc Dunn's multiple comparisons procedure show significant

differences in the mean accuracy of response to this type of collocation between

Groups 1 and 2, and between Groups 2 and 3, but not between Groups 1 and 3

(see Table 27).

Type 14. SVV-ing - Results of the Kruskal-Wallis test of responses to

SVV-ing collocations show a significant main effect for the factor Group

(Kruskal-Wallis χ²(2, N = 275) = 33.999, p = .0001). The results of the post-hoc

Dunn's multiple comparisons procedure show significant differences in the

mean accuracy of response to this type of collocation between Groups 1 and 3,

and between Groups 2 and 3, but not between Groups 1 and 2 (see Table 27).

Type 27. Verb Noun (creation) - Results of the Kruskal-Wallis test of

responses to Verb Noun (creation) collocations show a significant main effect for

the factor Group (Kruskal-Wallis χ²(2, N = 275) = 85.758, p = .0001). The table

of means, Table 28, shows that the mean accuracy of responses to Verb Noun

collocations increases across groups. The results of the post-hoc Dunn's

multiple comparisons procedure show significant differences in the mean

accuracy of response to this type of collocation across all groups (see Table 27).

Table 28. Means and standard deviations by group for the translation data

                         Means per Group                Std. Dev. per Group
Types                    Group1    Group2    Group3     Group1    Group2    Group3
1. Noun Prep             23.077    18.085    57.222     29.162    29.193    4.368
5. Adjective Prep        10.989    10.106    20.000     31.449    21.477    35.790
11. SV(O) Prep O         19.231    40.957    23.333     30.523    44.579    42.532
14. SVV-ing              20.879    20.213    55.556     40.870    40.374    49.969
16. SVO Inf              25.275    23.404    36.667     43.699    42.567    48.459
27. Verb Noun (creat)    3.297     11.702    37.222     17.954    21.283    29.567

[Figure: Mean Accuracy of Response - Translation Test - All Groups. Horizontal axis: Collocation Types (Type 1, Type 5, Type 11, Type 14, Type 16, Type 27); vertical axis: Mean Accuracy of Response (0 to 60); series: Group 1, Group 2, Group 3.]

Figure 5. Mean accuracy of response for the translation data

4.2.1.2.2 Summary of the Results for the Translation Data

As with the results of the analyses of collocational use in the essay data,

the translation data also reveal a significant difference across groups in terms of

the accuracy of their responses to the six types of collocation. The results for

Types 1. Noun Preposition, 14. SVV-ing, and 27. Verb Noun (creation) are

consistent with the claim that differences in the accuracy of translation of

collocations are positively related to differences of proficiency, since the more

proficient subjects in Group 3 are more accurate in the use of these collocations

than subjects in either Group 2 or Group 1. Group 3 subjects were significantly

more accurate than subjects in Group 1, but did not differ significantly from

Group 2 subjects, in their responses to Type 5. Adjective Preposition collocations. With the exception of

Type 11. SV(O) Preposition O collocations, in which Group 2 subjects were

better than either Group 1 or Group 3 subjects, the differences across all groups

are in the predicted direction (see Table 29).

Table 29. Translation accuracy distinguishing proficiency levels

Group 1 Group 2 Group 3

Collocation Types Collocation Types Collocation Types

11. SV(O) Prep O** 1. Noun Prep**

5. Adjective Prep* 14. SVV-ing**

27. Verb Noun (creat)**

5. Adjective Prep*

**: Significantly more accurate than the other two groups

*: Significantly more accurate than one other group

4.2.1.2.3 Implicational Scaling for the Translation Data (All Groups)

For the implicational scaling analysis, following the Guttman procedure,

each subject was coded as having translated accurately (1), or not (0), each of

the 6 types of collocations in the translation test. A criterion of 80% accuracy

was used for the coding of the data (see also Andersen 1978; Anderson 1978).

That is, if a subject was 80 to 100% accurate in translating the particular

collocation type, she/he was coded as 1. Accuracy less than 80% was coded as

0. As with the essay data, the two axes of the matrix for the implicational

scaling consisted of the six items ranked from the most accurately translated by

all subjects to the least accurately translated, and the 275 subjects ranked in

order of their accuracy of response to all types of collocations, from subjects

translating accurately the most types to subjects translating accurately the

fewest types. This matrix is given in Appendix M and summarised in Figure 6

below. The coefficient of reproducibility for this analysis was .92. The

coefficient of scalability was .578 and so approached the recommended level of .6 (see Andersen

1978; Hatch & Lazaraton 1991:212).
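The coding step and the two coefficients can be illustrated with a minimal sketch. It assumes one common formulation of Guttman scaling, which may differ in detail from the error-counting method used in this study: errors are counted as deviations from the ideal pattern implied by each subject's total score, and the coefficient of scalability is the improvement over the minimum marginal reproducibility (MMR) divided by 1 - MMR. The function names and the small matrix are hypothetical.

```python
def code_accuracy(percent_correct, criterion=80.0):
    """Code a subject's accuracy on one collocation type as 1 or 0."""
    return 1 if percent_correct >= criterion else 0


def guttman_coefficients(matrix):
    """matrix: one row per subject, one 0/1 code per item.

    Returns (coefficient of reproducibility, MMR, coefficient of scalability).
    """
    n_subjects, n_items = len(matrix), len(matrix[0])
    item_totals = [sum(row[j] for row in matrix) for j in range(n_items)]
    # Order items from most often to least often coded 1.
    order = sorted(range(n_items), key=lambda j: -item_totals[j])
    errors = 0
    for row in matrix:
        score = sum(row)
        # Ideal pattern: a subject with score k has exactly the k easiest items.
        for position, j in enumerate(order):
            expected = 1 if position < score else 0
            errors += int(row[j] != expected)
    cells = n_subjects * n_items
    c_rep = 1 - errors / cells
    # MMR: proportion of cells reproduced by each item's modal code alone.
    mmr = sum(max(t, n_subjects - t) for t in item_totals) / cells
    if mmr == 1:
        raise ValueError("every item is constant; scalability is undefined")
    c_scal = (c_rep - mmr) / (1 - mmr)
    return c_rep, mmr, c_scal


# Hypothetical percent-correct scores: 4 subjects by 3 collocation types.
scores = [[100, 90, 80], [85, 80, 40], [20, 10, 95], [0, 50, 60]]
coded = [[code_accuracy(p) for p in row] for row in scores]
print(guttman_coefficients(coded))
```

A high coefficient of reproducibility combined with a low coefficient of scalability typically arises when most cells share the same code, because the marginals alone then reproduce most of the matrix.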

[Figure: Accuracy - Translation Test - All Groups. Horizontal axis: Collocation Types, in the order Type 14, Type 16, Type 11, Type 1, Type 5, Type 27; vertical axis: 0 to .35.]

Figure 6. Mean accuracy of response for the translation data - all groups

4.2.1.3 Blank Filling Data (All Groups)

As with the translation data, the mean accuracy of responses to each of

the eleven types of collocation repeated across groups in the blank filling test

was calculated. The procedure used was identical to the procedure followed in

analysing the essay and translation data. The results of the Kruskal-Wallis tests

and the post-hoc Dunn's multiple comparisons procedures are reported below.

These are summarised in Table 30. Type 4. Prep Noun showed no significant

across-group differences and therefore is not included in the table.

Table 30. Summary of the results for the Kruskal-Wallis tests and post-hoc analyses for the blank filling data

                               Dunn’s Procedure: Mean Rank Differences
Type                  K-W      1 vs. 2             2 vs. 3             1 vs. 3
1. Noun Prep          58.03    95.566<146.883*     146.883<171.62*     95.566<171.628*
5. Adjective Prep     19.673   160.791>110.53*     110.532<143.64*     160.791-143.64
11. SV(O) Prep O      14.711   115.747<137.5*      137.5<161.02*       115.747<161.02*
23. SV(O) Adverb      39.351   102.495<149.42*     149.42-161.97       102.495<161.97*
24. SV(O) wh          22.03    111.758<139.91*     139.915<162.53*     111.758<162.53*
27. Verb N (creat.)   29.988   108.967<132.78*     132.787<172.8*      108.967<172.8*
33. Verb Adverb       77.892   116.731-108.138     108.138<190.69*     116.731<190.69*
34. Noun Noun         26.577   167.885>114.82*     114.824<131.98*     167.885>131.98*
36. Prep Det N        17.299   131.505-118.064     118.064<165.38*     131.505<165.38*
37. Phrasal Verb      26.57    116.176-125.872     125.872<172.73*     116.176<172.73*

*: Significant at the .05 level
< or >: Direction of the difference

4.2.1.3.1 Kruskal-Wallis Analyses for the Blank Filling Data

Type 1. Noun Preposition - Results of the Kruskal-Wallis test of

responses to Noun Preposition collocations show a significant main effect for the

factor Group (Kruskal-Wallis χ²(2, N = 275) = 58.03, p = .0001). The mean

accuracy of responses to Noun Preposition collocations increases across groups

(see Table 31). The results of the post-hoc Dunn's multiple comparisons

procedure show significant differences in the mean accuracy of response to this

type of collocation between all groups (see Table 30).

Type 5. Adjective Preposition - Results of the Kruskal-Wallis test of

responses to Adjective Preposition collocations show a significant difference

between conditions (Kruskal-Wallis χ²(2, N = 275) = 19.673, p = .0001). The

results of the post-hoc Dunn's multiple comparisons procedure show

significant differences in the mean accuracy of response to this type of

collocation between Groups 1 and 2, and between Groups 2 and 3, but not

between Groups 1 and 3 (see Table 30).

Type 11. SV(O) Preposition O - Results of the Kruskal-Wallis test of

responses to SV(O) Preposition O collocations show a significant main effect for

the factor Group (Kruskal-Wallis χ²(2, N = 275) = 14.711, p = .0006). The mean

accuracy of responses to SV(O) Preposition O collocations increases across

groups (see Table 31). The results of the post-hoc Dunn's multiple comparisons

procedure show significant differences in the mean accuracy of response to this

type of collocation between all groups (see Table 30).

Type 23. SV(O) Adverbial - Results of the Kruskal-Wallis test of

responses to SV(O) Adverbial collocations show a significant main effect for the

factor Group (Kruskal-Wallis χ²(2, N = 275) = 39.351, p = .0001). The mean

accuracy of responses to SV(O) Adverbial collocations increases across groups

(see Table 31). The results of the post-hoc Dunn's multiple comparisons

procedure show significant differences in the mean accuracy of response to this

type of collocation between Groups 1 and 2, and between Groups 1 and 3, but

not between Groups 2 and 3 (see Table 30).

Type 24. SV(O) wh-word - Results of the Kruskal-Wallis test of

responses to SV(O) wh-word collocations show a significant main effect for the

factor Group (Kruskal-Wallis χ²(2, N = 275) = 22.03, p = .0001). The mean

accuracy of responses to SV(O) wh-word collocations increases across groups

(see Table 31). The results of the post-hoc Dunn's multiple comparisons

procedure show significant differences in the mean accuracy of response to this

type of collocation between all groups (see Table 30).

Type 27. Verb Noun (creation) - Results of the Kruskal-Wallis test of

responses to Verb Noun (creation) collocations show a significant main effect for

the factor Group (Kruskal-Wallis χ²(2, N = 275) = 29.988, p = .0001). The mean

accuracy of responses to Verb Noun (creation) collocations increases across

groups (see Table 31). The results of the post-hoc Dunn's multiple comparisons

procedure show significant differences in the mean accuracy of response to this

type of collocation between all groups (see Table 30).

Type 33. Verb Adverb - Results of the Kruskal-Wallis test of responses to

Verb Adverb collocations show a significant main effect for the factor Group

(Kruskal-Wallis χ²(2, N = 275) = 77.892, p = .0001). The results of the post-hoc

Dunn's multiple comparisons procedure show significant differences in the

mean accuracy of response to this type of collocation between Groups 1 and 3,

and between Groups 2 and 3, but not between Groups 1 and 2 (see Table 30).

Type 34. Noun Noun - Results of the Kruskal-Wallis test of responses to

Noun Noun collocations show a significant main effect for the factor Group

(Kruskal-Wallis χ²(2, N = 275) = 26.577, p = .0001). The results of the post-hoc

Dunn's multiple comparisons procedure show significant differences in the

mean accuracy of response to this type of collocation between all groups (see

Table 30).

Type 36. Preposition Determiner Noun - Results of the Kruskal-Wallis

test of responses to Preposition Determiner Noun collocations show a significant

main effect for the factor Group (Kruskal-Wallis χ²(2, N = 275) = 17.299, p =

.0002). The results of the post-hoc Dunn's multiple comparisons procedure

show significant differences in the mean accuracy of response to this type of

collocation between Groups 1 and 3, and between Groups 2 and 3, but not

between Groups 1 and 2 (see Table 30).

Type 37. Phrasal Verb - Results of the Kruskal-Wallis test of responses to

Phrasal Verb collocations show a significant main effect for the factor Group

(Kruskal-Wallis χ²(2, N = 275) = 26.57, p = .0001). The mean accuracy of

responses to Phrasal Verb collocations increases across groups (see Table 31).

The results of the post-hoc Dunn's multiple comparisons procedure show

significant differences in the mean accuracy of response to this type of

collocation between Groups 1 and 3, and between Groups 2 and 3, but not

between Groups 1 and 2 (see Table 30).

Table 31. Means and standard deviations by group for the blank filling data

                         Means per Group                Std. Dev. per Group
Types                    Group1    Group2    Group3     Group1    Group2    Group3
1. Noun Prep             4.396     13.617    28.656     20.613    15.579    30.304
4. Prep Noun             51.538    43.819    47.989     25.269    23.649    33.728
5. Adjective Prep        41.099    26.170    35.156     24.651    24.012    22.714
11. SV(O) Prep O         24.648    31.468    37.633     21.185    23.631    22.321
23. SV(O) Adverbial      8.791     39.362    41.111     28.474    49.117    39.361
24. SV(O) wh-word        21.429    40.426    48.656     34.272    49.338    38.407
27. Verb Noun (creat)    20.758    25.851    40.600     19.007    20.289    26.100
33. Verb Adverb          23.077    12.638    72.222     42.366    22.233    45.041
34. Noun Noun            58.242    26.596    37.778     39.626    44.421    48.755
36. Prep Det Noun        42.527    38.489    52.311     25.933    23.514    19.253
37. Phrasal Verb         24.725    28.957    45.000     22.684    24.586    27.335

[Figure: Mean Accuracy of Response - Blank Filling Test - All Groups. Horizontal axis: Collocation Types (1, 4, 5, 11, 23, 24, 27, 33, 34, 36, 37); vertical axis: Mean Accuracy of Response (0 to 80); series: Group 1, Group 2, Group 3.]

Figure 7. Mean accuracy of response for the blank filling data

4.2.1.3.2 Summary of the Results for the Blank Filling Data

As with the essay and the translation data, the results for the blank

filling data also lend partial support to Hypothesis 1. The subjects' mean

accuracy of response to Types 1. Noun Preposition, 11. SV(O) Preposition O, 24.

SV(O) wh-word, 27. Verb Noun (creation), 33. Verb Adverb, 36. Preposition

Determiner Noun, and 37. Phrasal Verb collocations, is consistent with the claim

that differences in proficiency are positively related to differences in the

accuracy of collocation use, since the more proficient subjects in Group 3 are

more accurate in the use of these collocations than subjects in Group 2 and

Group 1. Responses to Type 23. SV(O) Adverbial collocations partially confirm

Hypothesis 1, since Groups 2 and 3 are more accurate than Group 1. With

respect to accuracy of responses to Type 5. Adjective Preposition collocations,

Groups 1 and 3 are significantly better than Group 2. The results for Type 34.

Noun Noun collocations are the only exception to the general direction of the

blank filling data, since Group 1 students are significantly more accurate than

students in Groups 2 and 3.

Overall, the results for the blank filling data are in the predicted

direction, that is, accuracy of response to collocational types increases with

proficiency (see Table 32).

Table 32. Blank filling accuracy distinguishing proficiency levels

Group 1               Group 2                 Group 3
Collocation Types     Collocation Types       Collocation Types
34. Noun Noun**       23. SV(O) Adverbial*    1. Noun Prep**
5. Adjective Prep*                            11. SV(O) Prep O**
                                              24. SV(O) wh-word**
                                              27. Verb Noun (creat)**
                                              33. Verb Adverb**
                                              36. Prep Det Noun**
                                              37. Phrasal Verb**
                                              23. SV(O) Adverbial*
                                              5. Adjective Prep*

**: Significantly more accurate than the other two groups
*: Significantly more accurate than one other group

4.2.1.3.3 Implicational Scaling for the Blank Filling Data (All Groups)

For the implicational scaling analysis the Guttman procedure was used.

Each subject was coded as having answered accurately (1), or not (0), each of

the 11 types of collocations repeated across groups in the blank filling test. As

with the translation data, an 80% accuracy criterion was used for the coding of

the data, i.e. accuracy less than 80% was coded as 0, accuracy 80 to 100% was

coded as 1. The two axes of the matrix for the implicational scaling consisted of

the eleven types ranked from most accurately answered by all subjects to the

least accurately answered, and the 275 subjects ranked in order of their

accuracy of response to all types of collocations, from subjects answering

accurately the most types to subjects answering accurately the fewest. This matrix is summarised in

Figure 8 below. The coefficient of reproducibility for this analysis was .91. The

coefficient of scalability was .4. As with the essay data, even though the

coefficient of reproducibility for the blank filling data is at the level necessary for

this implicational scale to be considered valid, the coefficient of scalability is

below the recommended level of .6 (see Andersen 1978; Hatch & Lazaraton

1991:212).

[Figure: Accuracy - Blank Filling Data - All Groups. Horizontal axis: Collocation Types, in the order 34, 33, 24, 23, 4, 36, 5, 37, 27, 1, 11; vertical axis: 0 to .35.]

Figure 8. Mean accuracy of response for the blank filling data - all groups

4.2.1.4 Summary of the Results for Hypothesis 1

The results from the analyses of the three sets of data, the free

production essay data and the elicited production translation and blank filling

data, support Hypothesis 1 by providing evidence that there are differences

between groups in the production and knowledge of collocations, assessed in

this study both in terms of ability to use collocations in the essays, and in terms

of accuracy of response to questions eliciting collocations in the translation and

blank filling tests. Hypothesis 1, however, is only partially supported since

there was limited evidence in the data analysis to support the existence of

accuracy orders in the use and knowledge of collocations across groups. The

implicational scales for the essay and blank filling data, though proven to be

valid according to the coefficient of reproducibility, were found to be only

marginally scalable. The implicational scaling for the translation data

approached statistical significance and revealed a valid accuracy order.

4.2.2 Hypothesis 2: There are patterns in the development of collocational

knowledge within proficiency groups

To address Hypothesis 2 and examine the extent of the within-group

differences in the use and knowledge of collocations, non-parametric Friedman

repeated measures tests were used, followed by post-hoc Nemenyi's multiple

comparisons procedures. Implicational scaling for each of the three groups in

each of the three sets of data was then performed. Results of the analyses are

reported below.
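The Friedman statistic used for the within-group comparisons can be illustrated with a minimal sketch. It assumes the standard rank-sum formula with a correction for ties, referred to the χ² distribution with k - 1 degrees of freedom, where k is the number of collocation types (so 36 for 37 types). The function names and the data are hypothetical, and the Nemenyi post-hoc step is not shown.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def friedman_chi_square(rows):
    """rows: one row per subject, one score per collocation type.

    Returns (tie-corrected Friedman chi-square, mean rank of each type).
    """
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    ties = 0
    for row in rows:
        # Scores are ranked within each subject, across the k types.
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
        counts = {}
        for value in row:
            counts[value] = counts.get(value, 0) + 1
        ties += sum(t ** 3 - t for t in counts.values())
    statistic = (12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums)
                 - 3 * n * (k + 1))
    correction = 1 - ties / (n * (k ** 3 - k))
    if correction == 0:
        raise ValueError("every subject has identical scores on all types")
    return statistic / correction, [r / n for r in rank_sums]


# Hypothetical token counts: 5 subjects by 4 collocation types.
tokens = [[3, 0, 1, 5], [2, 0, 0, 4], [4, 1, 1, 6], [1, 0, 2, 3], [2, 1, 0, 5]]
chi_square, type_mean_ranks = friedman_chi_square(tokens)
print(chi_square, type_mean_ranks)
```

The mean ranks per type are the values whose pairwise differences a rank-sum post-hoc test would then compare.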

4.2.2.1 Essay Data

As for the analyses in Hypothesis 1, the tokens of accurate use of the 37

types of collocation by each subject in each group were used as data for these

analyses.

4.2.2.1.1 Friedman Test for the Essay Data - Group 1

The results of the Friedman test for Group 1 show a significant

difference in the use of the 37 types of collocation in the students' essays

(Friedman χ²(36, N = 37) = 1699.221, p = .0001). Nemenyi's multiple

comparisons tests based on the Friedman rank sums were performed on the

data. The results of these tests are summarised below in Table 33. The results

of the post-hoc analysis show a clustering of certain collocations. Types 11.

SV(O) Prep O, 26. SVc, and 29. Adjective Noun are used significantly more than

almost all the other types of collocation.

4.2.2.1.2 Friedman Test for the Essay Data - Group 2

The results of the Friedman test for Group 2 show a significant difference

in the use of the 37 types of collocation in the students' essays (Friedman χ²(36,

N = 37) = 1823.796, p = .0001). Nemenyi's multiple comparisons tests based on

the Friedman rank sums were performed on the data. The results of these tests

are summarised below in Table 34. The results show types 11. SV(O) Prep O,

12. SV to Inf, 26. SVc, 29. Adjective Noun, and 36. Prep Det Noun to be used

significantly more than all the other types.

4.2.2.1.3 Friedman Test for the Essay Data - Group 3

The results of the Friedman test for Group 3 show a significant

difference in the use of the 37 types of collocation in the students' essays

(Friedman χ²(36, N = 37) = 1401.246, p = .0001). Nemenyi's multiple

comparisons tests based on the Friedman rank sums were performed on the

data. The results of these tests are summarised below in Table 35. The results

show that types 12. SV to Inf, 13. SV Inf, and 26. SVc are used significantly more

than all other types.

Table 33. Nemenyi's multiple comparisons tests of mean rank differences for the essay data - Group 1

Type 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

11 12.2* 13.7* 13.7* 7.28 11.0 13.2* 13.7* 11.3 13.7* 13.5* ----- 7.47 11.1 10.5 13.5* 13.5* 13.7* 13.7*

26 18.1* 19.5* 19.5* 13.1* 16.8* 19.0* 19.5* 17.1* 19.5* 19.3* 5.83 13.3* 16.9* 16.4* 19.3* 19.3* 19.5* 19.5*

29 14.8* 16.2* 16.2* 9.85 13.5* 15.7* 16.2* 13.8* 16.2* 16.1* 2.57 10.0 13.7* 13.1* 16.1* 16.1* 16.2* 16.2*

* Significant at the .05 level

Type 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37

11 12.1* 13.7* 13.7* 13.7* 4.77 13.0* 13.7* 5.83 7.79 13.7* 2.57 13.7* 13.3* 13.7* 13.7* 11.0 13.3* 6.58 11.5

26 18.0* 19.5* 19.5* 19.5* 10.6 18.8* 19.5* ----- 13.6* 19.5* 3.26 19.5* 19.1* 19.5* 19.5* 16.9* 19.1* 12.4* 17.3*

29 14.7* 16.2* 16.2* 16.2* 7.34 15.5* 16.2* 3.26 10.3 16.2* ----- 16.2* 15.8* 16.2* 16.2* 13.6* 15.9* 9.16 14.0*

* Significant at the .05 level

Note. Only those types that were significantly different from the other types are included in the table

Table 34. Nemenyi's multiple comparisons tests of mean rank differences for the essay data - Group 2

Type 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

11 11.6 13.7* 14.6* 4.21 9.68 13.1* 14.5* 13.1* 14.6* 14.5* ----- .309 4.31 10.0 13.4* 14.3* 14.2* 14.6*

12 11.9* 14.0* 15.0* 4.52 9.99 13.4* 14.8* 13.4* 15.0* 14.8* .309 ----- 4.62 10.3 13.7* 14.6* 14.5* 15.0*

26 18.1* 20.2* 21.2* 10.7 16.1* 19.5* 21.0* 19.6* 21.2* 21.0* 6.50 6.19 10.8 16.5* 19.9* 20.8* 20.7* 21.2*

29 13.9* 16.1* 17.0* 6.58 12.0* 15.5* 16.9* 15.5* 17.0* 16.9* 2.36 2.05 6.68 12.4* 15.8* 16.7* 16.6* 17.0*

36 9.49 11.6 12.5* 2.09 7.56 11.0 12.4* 11.0 12.5* 12.4* 2.12 2.43 2.19 7.96 11.3 12.2* 12.1* 12.5*

* Significant at the .05 level

Type 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37

11 8.13 14.5* 13.9* 14.6* 6.79 11.6 14.6* 6.50 6.65 14.6* 2.36 14.6* 14.5* 14.6* 14.3* 10.7 14.0* 2.12 5.31

12 8.44 14.8* 14.2* 15.0* 7.10 12.0* 15.0* 6.19 6.96 15.0* 2.05 15.0* 14.8* 15.0* 14.6* 11.0 14.3* 2.43 5.62

26 14.6* 21.0* 20.4* 21.2* 13.3* 18.2* 21.2* ------ 13.1* 21.2* 4.13 21.2* 21.0* 21.2* 20.8* 17.2* 20.5* 8.62 11.8*

29 10.5 16.9* 16.2* 17.0* 9.24 14.0* 17.0* 4.13 9.02 17.0* ------ 17.0* 16.9* 17.0* 16.7* 13.1* 16.3* 4.48 7.68

36 6.0 12.4* 11.7* 12.5* 4.76 9.57 12.5* 8.62 4.53 12.5* 4.48 12.5* 12.4* 12.5* 12.2* 8.62 11.9* ------ 3.19

Note. Only those types that were significantly different from the other types are included in the table

Table 35. Nemenyi's multiple comparisons tests of mean rank differences for the essay data - Group 3

Type 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

12 8.51 12.6* 13.8* 8.56 8.76 12.4* 13.7* 12.2* 13.7* 13.8* 3.62 ------ 1.24 11.8 10.8 13.1* 13.8* 13.8*

13 9.75 13.8* 15.1* 9.80 10.0 13.7* 14.9* 13.4* 14.9* 15.1* 4.86 1.24 ------ 13.0* 12.1* 14.4* 15.1* 15.1*

26 13.4* 17.5* 18.8* 13.4* 13.6* 17.3* 18.6* 17.1* 18.6* 18.8* 8.53 4.91 3.67 16.7* 15.7* 18.1* 18.8* 18.8*

* Significant at the .05 level

Type 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37

12 3.46 13.8* 11.0 13.8* 7.13 11.8 13.8* 4.91 5.63 13.8* 4.68 13.3* 12.3* 13.8* 13.1* 9.62 13.5* 2.37 11.1

13 4.70 15.1* 12.2* 15.1* 8.38 13.1* 15.1* 3.67 6.87 15.1* 5.92 14.5* 13.6* 15.1* 14.4* 10.8 14.7* 3.62 12.3*

26 8.37 18.8* 15.9* 18.8* 12.0* 16.7* 18.8* ------ 10.5 18.8* 9.59 18.2* 17.2* 18.8* 18.0* 14.5* 18.4* 7.29 16.0*

* Significant at the .05 level

Note. Only those types that were significantly different from the other types are included in the table

4.2.2.1.4 Implicational Scaling for the Essay Data by Groups

The implicational scaling was done by coding each subject as having

used (1), or not used (0), each of the 37 types of collocations in their essays. The

two axes of the matrix for the implicational scaling consisted of the 37

collocation types, ranked from most commonly used to least commonly used,

and the subjects in each group ranked in order of their use of types of

collocations, from subjects using the most types to subjects using the fewest.

This matrix is summarised in Figures 9, 10, and 11 below.

For Group 1, the coefficient of reproducibility was .94, which is

considered to be valid (see Andersen 1978). The coefficient of scalability was

.40. While the coefficient of reproducibility is at the level necessary for this

implicational scale to be considered valid, the coefficient of scalability is below

the recommended level of .6 (see Hatch & Lazaraton 1991:212). This suggests

that while the implicational scale for Group 1 is valid, the fact that most

subjects did not use the majority of the 37 scaled collocation types resulted in

the low coefficient of scalability. The implicational scale shows that the first

three collocation types are those that were found to be used significantly more

than all the other types in the post-hoc analyses (see Table 33).

For Group 2, the coefficient of reproducibility was .90, which is

considered to be valid (see Andersen 1978). The coefficient of scalability was

.33, below the recommended level of .6 (see Hatch & Lazaraton 1991:212). The

implicational scale shows that the first five items are those types that were

found to be used significantly more than all the other types in the post-hoc

analyses (see Table 34).

For Group 3, the coefficient of reproducibility was .89, and the coefficient

of scalability was .31. As for the other two groups, the implicational scale for

Group 3 was found to be below the recommended level of scalability. The

implicational scale shows that the first three items were those collocation types

that were found to be used significantly more than all the other types according

to the post-hoc analyses (see Table 35).

[Figure: Collocation Tokens - Essay Data - Group 1. Horizontal axis: Collocation Types, in the order 26, 29, 11, 23, 36, 12, 4, 27, 14, 5, 34, 13, 8, 37; vertical axis: 0 to 1.]

Note. Only those types with mean > .1 are included in this figure

Figure 9. Mean use of collocation tokens in the essay data - Group 1

[Figure: essay data, Group 2. Vertical axis: 0 to 1.]

Note. Only those types with mean > .1 are included in this figure

Figure 10. Mean use of collocation tokens in the essay data - Group 2

[Figure: essay data, Group 3. Vertical axis: 0 to .9.]

Note. Only those types with mean > .1 are included in this figure

Figure 11. Mean use of collocation tokens in the essay data - Group 3

4.2.2.1.5 Summary of the Results for the Essay Data

The results of the Friedman repeated measures for the essay data

support Hypothesis 2 that there are group-specific patterns in the development

of collocational knowledge. For Group 1, Types 26. SVc, 29. Adjective Noun, and

11. SV(O) Prep O were used significantly more than the other types of

collocation. For Group 2, Types 26. SVc, 29. Adjective Noun, 12. SV to Inf, and

11. SV(O) Prep O were used significantly more than the other types. For Group

3, Types 26. SVc, 13. SV Inf, and 12. SV to Inf were used significantly more than

the other types. These results suggest that for each group there are certain

types that are used more than others, indicating that subjects in each group

prefer to use, and are more accurate in using, specific types of collocation.

These results also indicate the existence of group-specific patterns in the

acquisition of collocation, as was predicted by Hypothesis 2.

4.2.2.1.6 Further Analyses on the Essay Data

Due to the lack of statistical significance of the accuracy orders obtained

from the implicational scaling analyses for the essay data, further analyses were

performed to investigate the correlation of the accuracy orders for the three

groups. Spearman’s Rho Correlation Coefficient was calculated for the

accuracy orders by Groups 1 and 2, Groups 2 and 3, and Groups 1 and 3. Only

those types with a mean greater than .1 were included in the analyses. The

correlation for Groups 1 and 2 was rs = .832, p = .0004; for Groups 2 and 3 rs =

.766, p = .0011; and for Groups 1 and 3 rs = .552, p = .019. The significance of

these results is discussed in the next chapter.
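The rank correlation between two groups' accuracy orders can be illustrated with a minimal sketch. It computes Spearman's rho as the Pearson correlation of the two sets of ranks, which handles ties. The function names and the mean values are hypothetical and are not taken from the study's data.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    n = len(rank_x)
    mean_x, mean_y = sum(rank_x) / n, sum(rank_y) / n
    covariance = sum((a - mean_x) * (b - mean_y)
                     for a, b in zip(rank_x, rank_y))
    var_x = sum((a - mean_x) ** 2 for a in rank_x)
    var_y = sum((b - mean_y) ** 2 for b in rank_y)
    return covariance / sqrt(var_x * var_y)


# Hypothetical mean use of the same six collocation types in two groups.
group_a_means = [0.95, 0.80, 0.62, 0.41, 0.30, 0.12]
group_b_means = [0.90, 0.85, 0.40, 0.55, 0.20, 0.15]
print(spearman_rho(group_a_means, group_b_means))
```

A value near 1 indicates that the two groups order the collocation types in much the same way, even if their absolute levels of use differ.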

4.2.2.2 Translation Data

The mean accuracy of response to each type of collocation supplied in

the translation test was calculated. The number of types differs from group to

group. For Group 1 there were 8 types of collocation tested in the translation

test, 6 types for Group 2, and 7 types for Group 3. The results of the analyses

are summarised below.

4.2.2.2.1 Friedman Test for the Translation Data - Group 1

The results of the Friedman test for Group 1 show a significant

difference in the students' mean accuracy of translation of the 8 types of

collocation in the translation data (Friedman χ²(7, N = 8) = 220.613, p = .0001).

Nemenyi's multiple comparisons tests based on the Friedman's rank sums were

performed on the data. The results of these tests are summarised below in

Table 36. According to the results of the post-hoc analysis, Types 13. SV Inf,

and 23. SV(O) Adverbial are translated significantly more accurately than all the

other types of collocation.

Table 36. Nemenyi's multiple comparisons tests of mean rank differences for the translation data - Group 1

Types                 1       5       11      14      16      27      13      23
1. Noun Prep          ---     .687    .237    .264    .039    1.044   1.186   2.26*
5. Adj Prep                   ---     .45     .423    .648    .357    1.87*   2.95*
11. SV(O) Prep O                      ---     .027    .198    .807    1.42*   2.50*
14. SVV-ing                                   ---     .225    .78     1.45*   2.53*
16. SVO Inf                                           ---     1.005   1.225   2.30*
27. Verb Noun(creat)                                          ---     2.23*   3.31*
13. SV Inf                                                            ---     1.083
23. SV(O) Adverb                                                              ---

*: Significant at the .05 level

4.2.2.2.2 Friedman Test for the Translation Data - Group 2

The results of the Friedman test for Group 2 show a significant

difference in the students' mean accuracy of translation of the 6 types of

collocation in the translation data (Friedman χ²(5, N = 6) = 74.279, p = .0001).

Nemenyi's multiple comparisons tests based on Friedman's rank sums were

performed on the data. The results of these tests are summarised below in

Table 37. According to the results of the post-hoc analysis, only Type 11. SV(O)

Prep O is significantly more accurately translated than all the other types.

Table 37. Nemenyi's multiple comparisons tests of mean rank differences for the translation data - Group 2

Types                 1       5       11      14      16      27
1. Noun Prep          ---     .346    .931*   .037    .154    .298
5. Adj Prep                   ---     1.27*   .383    .5      .048
11. SV(O) Prep O                      ---     .894    .777    1.22*
14. SVV-ing                                   ---     .117    .335
16. SVO Inf                                           ---     .452
27. Verb Noun(creat)                                          ---

*: Significant at the .05 level

4.2.2.2.3 Friedman Test for the Translation Data - Group 3

The results of the Friedman test for Group 3 show a significant

difference in the students' mean accuracy of translation of the 7 types of

collocation in the translation data (Friedman χ²(6, N = 7) = 134.62, p = .0001).

Nemenyi's multiple comparisons tests based on the Friedman rank sums were

performed on the data. The results of these tests are summarised below in

Table 38. According to the post-hoc analysis, Types 13. SV Inf, 14. SVV-ing, and

1. Noun Prep were found to be significantly more accurately translated than all

the other types of collocation.

Table 38. Nemenyi's multiple comparisons tests of mean rank differences for the translation data - Group 3

Types                 1       5       11      14      16      27      13
1. Noun Prep          ---     1.52*   1.33*   .056    .8      .789    .972
5. Adj Prep                   ---     .189    1.47*   .728    .739    2.5*
11. SV(O) Prep O                      ---     1.28*   .539    .55     2.31*
14. SVV-ing                                   ---     .744    .733    1.028
16. SVO Inf                                           ---     .011    1.77*
27. Verb Noun(creat)                                          ---     1.76*
13. SV Inf                                                            ---

*: Significant at the .05 level

4.2.2.2.4 Implicational Scaling for the Translation Data by Groups

As for the implicational scaling analysis of the translation data in the

first hypothesis, each subject was coded as having translated accurately (1), or

not (0), each of the types of collocations in the translation test according to the

80% accuracy criterion. The two axes of the matrix for the implicational scaling

consisted of the collocation types ranked from most accurately translated by all

subjects in each group to least accurately translated, and the subjects in each

group ranked in order of their accuracy of translation of all types of

collocations, from subjects translating accurately the most types to subjects

translating accurately the fewest. The matrix for each group is given in

Appendix N and summarised in Figures 12, 13, and 14 below.

For Group 1, the coefficient of reproducibility was .936, and the

coefficient of scalability was .632. The implicational scale for this set of data

was found to be significant and the items on the scale are scalable (see

Andersen 1978; Hatch & Lazaraton 1991:212). The implicational scale (Figure

12) also shows that the first two items on the scale are the two types of

collocation that were found to be translated significantly more accurately than

all the other types in the post-hoc analyses (see Table 36).
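The coding and the two coefficients described in this section can be sketched in a few lines. The sketch assumes the usual definitions: the coefficient of reproducibility is 1 minus the proportion of error cells, and the coefficient of scalability is the improvement over the minimum marginal reproducibility divided by 1 minus the minimum marginal reproducibility. Errors are counted as deviations from each subject's ideal pattern, which is one of several counting conventions and may differ from the one applied in this study. The accuracy values are hypothetical.

```python
from typing import Dict, List, Sequence


def code_by_criterion(accuracy: Sequence[Sequence[float]],
                      criterion: float = 0.8) -> List[List[int]]:
    """Code each subject x type cell as 1 (criterion reached) or 0."""
    return [[1 if a >= criterion else 0 for a in row] for row in accuracy]


def scale_coefficients(matrix: Sequence[Sequence[int]]) -> Dict[str, float]:
    """Compute reproducibility and scalability for a binary subjects x items matrix."""
    n_subjects = len(matrix)
    n_items = len(matrix[0])
    total_cells = n_subjects * n_items

    # Order items from most to least frequently passed (easiest first).
    item_totals = [sum(row[j] for row in matrix) for j in range(n_items)]
    item_order = sorted(range(n_items), key=lambda j: -item_totals[j])

    # A subject who passes s items is expected to pass exactly the s easiest
    # ones; every cell that departs from that ideal pattern is one error.
    errors = 0
    for row in matrix:
        score = sum(row)
        for position, j in enumerate(item_order):
            ideal = 1 if position < score else 0
            errors += int(row[j] != ideal)

    c_rep = 1.0 - errors / total_cells
    # Minimum marginal reproducibility: what is reproducible from the item
    # marginals alone (the more frequent response for each item).
    max_marginals = sum(max(t, n_subjects - t) for t in item_totals)
    mm_rep = max_marginals / total_cells
    improvement = c_rep - mm_rep
    c_scal = improvement / (1.0 - mm_rep) if mm_rep < 1.0 else float("nan")
    return {"errors": errors, "c_rep": c_rep, "mm_rep": mm_rep, "c_scal": c_scal}


# Hypothetical accuracy proportions: 5 subjects x 4 collocation types.
accuracy = [
    [0.90, 0.85, 0.80, 0.90],
    [1.00, 0.90, 0.80, 0.50],
    [0.90, 0.80, 0.40, 0.30],
    [0.85, 0.20, 0.90, 0.10],
    [0.80, 0.10, 0.20, 0.00],
]
coded = code_by_criterion(accuracy)
coefficients = scale_coefficients(coded)
print(coefficients)
```

In this toy matrix only the fourth subject departs from the ideal pattern, so the scale sits exactly at the conventional thresholds of .90 for reproducibility and .60 for scalability.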

For Group 2, the coefficient of reproducibility was .97 and the coefficient

of scalability was .78. As for Group 1, the implicational scale for this set of data

was found to be significant and the items on the scale scalable. The

implicational scale (Figure 13) also shows that the first item on the scale is Type

11. SV(O) Prep O which was also found to be translated significantly more

accurately than all the other types in the post-hoc analyses (see Table 37).

For Group 3, the coefficient of reproducibility was .89, and the coefficient

of scalability was .59. Both coefficients are approaching significance and it can

be concluded that the implicational scale for Group 3 is valid. The implicational

scale (Figure 14) shows that the first three items on the scale are the ones that

were found to be translated significantly more accurately than all the other

types according to the post-hoc analyses (see Table 38).

Figure 12. Mean accuracy of response for the translation data - Group 1

Figure 13. Mean accuracy of response for the translation data - Group 2

Figure 14. Mean accuracy of response for the translation data - Group 3

4.2.2.2.5 Summary of the Results for the Translation Data

The results of the Friedman repeated measures for the translation data

support Hypothesis 2 that there are significant differences in the knowledge of

collocations within proficiency groups. For Group 1, Types 23. SV(O) Adverbial,

and 13. SV Inf were translated significantly more accurately than the other

types of collocation. For Group 2, only Type 11. SV(O) Prep O was translated

significantly more accurately than the other types. For Group 3, Types 13. SV

Inf, 14. SVV-ing, and 1. Noun Prep were translated significantly more accurately

than the other types. These results suggest that for each group certain

collocation types are easier to translate than others.

4.2.2.3 Blank Filling Data

As for the translation data, for the within-group analyses of the blank

filling data the mean accuracy of responses to each type of collocation included

in the blank filling test for each group was calculated. Thus, the number of

types differs from group to group. For Group 1 there were 14 types of

collocation included in the blank filling test, for Group 2 there were 12 types,

and for Group 3 there were 13 types (see Table 10, Chapter 3). The results of

the analyses are summarised below.

4.2.2.3.1 Friedman Test for the Blank Filling Data - Group 1

The results of the Friedman test for the blank filling data for Group 1

show a significant difference in the students' mean accuracy of response to the

14 types of collocation in the blank filling data (Friedman χ² (13, N = 14) =

541.595, p = .0001). Nemenyi's multiple comparisons tests based on

Friedman's rank sums were performed on the data. The results of these tests

are summarised below in Table 39. According to the results, the significant

differences are spread among many different pairs of collocation types. Thus,

the clustering of only a limited number of types that are significantly different

to all other types, evident in the results of the post-hoc analyses for the

translation and essay data, is not found in the post-hoc analyses for the blank

filling data for Group 1.

4.2.2.3.2 Friedman Test for the Blank Filling Data - Group 2

The results of the Friedman test for the blank filling data for Group 2

show a significant difference in the students' mean accuracy of response to the

12 types of collocation in the blank filling data (Friedman χ² (11, N = 12) =

202.339, p = .0001). Nemenyi's multiple comparisons tests based on

Friedman's rank sums were performed on the data. The results of these tests

are summarised below in Table 40. According to the results of the post-hoc

analysis, Type 4. Prep Noun with the highest mean rank (i.e. most accurately

answered) and Type 33. Verb Adverb with the lowest mean rank (i.e. least

accurately answered) are the ones that show significant differences to most of

the other collocation types.

4.2.2.3.3 Friedman Test for the Blank Filling Data - Group 3

The results of the Friedman test for the blank filling data for Group 3

show a significant difference in the students' mean accuracy of response to the

13 types of collocation in the blank filling data (Friedman χ² (12, N = 13) =

191.452, p = .0001). Nemenyi's multiple comparisons tests based on Friedman's

rank sums were performed on the data. The results of these tests are

summarised below in Table 41. According to the results, Type 33. Verb Adverb

with the highest mean rank (i.e. most accurately answered), and Type 28. Verb

Noun (eradication) with the lowest mean rank (i.e. least accurately answered),

are significantly different to all the other types of collocation.

Table 39. Nemenyi's multiple comparisons tests of mean rank differences for the blank filling data - Group 1

TYPES                   1       4       5       11      23      24      27      33      34      36      37      30      31      29
1. Noun Prep            ---     6.68*   5.65*   3.87*   .45     2.016   2.89    1.956   6.49*   5.97*   3.65*   .115    1.104   .099
4. Prep Noun                    ---     1.022   2.807   6.23*   4.66*   3.79*   4.72*   .187    .709    3.02*   6.56*   5.57*   6.78*
5. Adj Prep                             ---     1.785   5.20*   3.64*   2.769   3.70*   .835    .313    2.005   5.54*   4.55*   5.75*
11. SV(O) Prep O                                ---     3.42*   1.858   .984    1.918   2.62    2.098   .22     3.75*   2.77    3.97*
23. SV(O) Adverbial                                     ---     1.566   2.44    1.506   6.04*   5.52*   3.20*   .335    .654    .549
24. SV(O) wh-word                                               ---     .874    .06     4.47*   3.95*   1.638   1.901   .912    2.115
27. Verb Noun (creat.)                                                  ---     .934    3.60*   3.08*   .764    2.775   1.786   2.98*
33. Verb Adverb                                                                 ---     4.53*   4.01*   1.698   1.841   .852    2.055
34. Noun Noun                                                                           ---     .522    2.84    6.37*   5.39*   6.59*
36. Prep Det Noun                                                                               ---     2.318   5.85*   4.86*   6.07*
37. Phrasal Verb                                                                                        ---     3.53*   2.55    3.75*
30. Noun Verb                                                                                                   ---     .989    .214
31. Noun1 of Noun2                                                                                                      ---     1.203
29. Adj Noun                                                                                                                    ---

*: Significant at the .05 level

Table 40. Nemenyi's multiple comparisons tests of mean rank differences for the blank filling data - Group 2

TYPES                   1       4       5       11      23      24      27      33      34      36      37      31
1. Noun Prep            ---     4.04*   1.362   2.70*   1.559   1.66    1.793   .798    .421    3.64*   1.931   .196
4. Prep Noun                    ---     2.68*   1.335   2.48*   2.38*   2.25    4.84*   3.62*   .394    2.112   4.23*
5. Adj Prep                             ---     1.346   .197    .298    .431    2.16    .941    2.287   .569    1.558
11. SV(O) Prep O                                ---     1.149   1.048   .915    3.50*   2.287   .941    .777    2.90*
23. SV(O) Adverbial                                     ---     .101    .234    2.35*   1.138   2.09    .372    1.755
24. SV(O) wh-word                                               ---     .133    2.45*   1.239   1.989   .271    1.856
27. Verb Noun (creat.)                                                  ---     2.59*   1.372   1.856   .138    1.989
33. Verb Adverb                                                                 ---     1.219   4.44*   2.72*   .602
34. Noun Noun                                                                           ---     3.22*   1.51    .617
36. Prep Det Noun                                                                               ---     1.718   3.84*
37. Phrasal Verb                                                                                        ---     2.127
31. Noun1 of Noun2                                                                                              ---

*: Significant at the .05 level

Table 41. Nemenyi's multiple comparisons tests of mean rank differences for the blank filling data - Group 3

TYPES                   1       4       5       11      23      24      27      33      34      36      37      29      28
1. Noun Prep            ---     2.74*   1.217   1.534   1.284   2.66*   1.884   4.61*   1.017   3.73*   2.506   1.339   1.572
4. Prep Noun                    ---     1.528   1.211   1.461   .078    .861    1.866   1.728   .994    .239    1.406   4.31*
5. Adj Prep                             ---     .317    .067    1.45    .667    3.39*   .2      2.522   1.289   .122    2.78*
11. SV(O) Prep O                                ---     .25     1.133   .35     3.07*   .517    2.205   .972    .195    3.10*
23. SV(O) Adverbial                                     ---     1.383   .6      3.32*   .267    2.455   1.222   .055    2.85*
24. SV(O) wh-word                                               ---     .783    1.944   1.65    1.072   .161    1.328   4.23*
27. Verb Noun (creat.)                                                  ---     2.72*   .867    1.855   .622    .545    3.45*
33. Verb Adverb                                                                 ---     3.59*   .872    2.105   3.27*   6.18*
34. Noun Noun                                                                           ---     2.72*   1.489   .322    2.589
36. Prep Det Noun                                                                               ---     1.233   2.4     5.31*
37. Phrasal Verb                                                                                        ---     1.167   4.07*
29. Adj Noun                                                                                                    ---     2.91*
28. Verb Noun (era)                                                                                                     ---

*: Significant at the .05 level

4.2.2.3.4 Implicational Scaling for the Blank Filling Data by Groups

A matrix was compiled for each group, consisting of the types of collocation

included in the blank filling test, ranked from most accurately answered by all

subjects in each group to least accurately answered, and the subjects in each group

ranked in order of their accuracy of response to all types of collocations, from

subjects responding accurately to the most types to subjects responding accurately

to the fewest types. As with the translation data, the 80% accuracy criterion was

used. These matrices are given in Appendix O and summarised in Figures 15, 16,

and 17 below.

For Group 1, the coefficient of reproducibility was .95, and the coefficient of

scalability was .5. Even though the coefficient of reproducibility for the blank

filling data for Group 1 is at the level necessary for this implicational scale to be

considered valid, the coefficient of scalability is below the recommended level of .6

(see Andersen 1978; Hatch & Lazaraton 1991:212).

For Group 2, the coefficient of reproducibility was .95, and the coefficient of

scalability was .61. The implicational scale for this set of data was found to be

significant and the items on the scale scalable.

For Group 3, the coefficient of reproducibility was .928, and the coefficient

of scalability was .68. The implicational scale for Group 3 was found to be

significant and the items on the scale scalable.

Figure 15. Mean accuracy of response for the blank filling data - Group 1

Figure 16. Mean accuracy of response for the blank filling data - Group 2

Figure 17. Mean accuracy of response for the blank filling data - Group 3

4.2.2.3.5 Summary of the Results for the Blank Filling Data

The results of the Friedman repeated measures for the blank filling data

support Hypothesis 2 that there are significant differences in the knowledge of

collocations within proficiency groups. From the implicational scales it is apparent

that certain types of collocation are answered more accurately than others in each

group. For Group 1, Types 34. Noun Noun, and 33. Verb Adverb were answered

more accurately than the other types of collocation. For Group 2, Types 24. SV(O)

wh-word, 23. SV(O) Adverbial, and 34. Noun Noun were answered more accurately

than the other types. For Group 3, Type 33. Verb Adverb was answered more

accurately than the other types. These results suggest that in each group subjects

found certain collocation types easier to answer than others.

4.2.2.4 Summary of the Results for Hypothesis 2

The results from the analyses of the three sets of data, the essay data, the

translation data and the blank filling data, support Hypothesis 2 by providing

evidence that there are within-group differences in the use and knowledge of

collocation types, assessed in this study both in terms of ability to produce

collocations in the essays, and in terms of accuracy of response to questions

eliciting collocations in the translation and blank filling tests. The statistical

significance for the implicational analyses of the translation and the blank filling

data strongly suggests that there are group-specific patterns in the acquisition of

collocations; that certain collocation types are easier than others to acquire; and

that they do form an accuracy order.

The next chapter discusses the significance of the results for the two

hypotheses.

CHAPTER 5

DISCUSSION OF THE FINDINGS

5.0 Introduction

This study investigated the acquisition of English collocations by ESL

subjects at three proficiency levels - post-beginners, intermediate, and post-

intermediate - in an attempt to describe the development of English collocational

knowledge in L2 learners. The acquisition of English collocations was measured

both as free production of collocations (accuracy of use in the students' essays) and

cued production of collocations (accuracy of response in the translation and blank-

filling tests). Evidence for the development of collocational knowledge was sought

in comparing the production and knowledge of collocation types across and within

the different proficiency groups. An implicational scaling analysis was also

performed on the data in an attempt to find evidence for accuracy orders in the

acquisition of English collocations. The findings are summarised and discussed in

the following sections. In section 5.1. the free production results are discussed; in

5.2. the cued production results are considered; a summary of the findings is then

presented in 5.3.; the factors affecting the development of collocational knowledge

are discussed in 5.4.; and finally a summary of the discussion is given in 5.5. The

pedagogical implications of this study are given in 5.6.

5.1 Free Production of Collocations

The accurate use of collocations in the subjects' essays was used as evidence

for the acquisition of collocations. There were significant differences in the

production of a number of collocation types between and within the three

proficiency groups.

5.1.1 Between-Group Differences

Subjects used significantly more Type 13. SV Inf and 19. SV(O) that

collocations as their level of proficiency increased (see Table 24). There were also a

number of collocations which were used significantly more by subjects in the

highest level group. For example, Type 1. Noun Prep, 5. Adjective Prep, 15. SVO to

Inf, and 21. SVOc collocations were used significantly more by subjects in Group 3.

Types 1 and 5 are lexical collocations, and Types 13, 19, 15 and 21 are grammatical

collocations that are syntactically more demanding than the simple grammatical

collocations SV to Inf and SVc.

The analysis of the collocations in the subjects' textbooks revealed that the

use of Type 13. SV Inf, 19. SV(O) that, 1. Noun Prep, 5. Adjective Prep, 15. SVO to Inf,

and 21. SVOc collocations in the textbooks also increases as the level of difficulty of

the language increases from TWE1 to TWE3. It is possible that the subjects'

exposure to larger numbers of collocations of these types as their level of

proficiency increased has influenced the production of these types in their essays.

That is, the more the subjects were exposed to a particular collocation type, the

more they used it. This is also reflected in the fact that the order of frequency of

the 37 collocation types in each of the three textbooks correlated significantly with

the frequency of use of the 37 types of collocation in the students' essays, i.e. the

order of frequency of use for each group correlated significantly with the order of

frequency of the 37 types of collocation in the textbook for that particular group.

Types 26. SVc and 29. Adjective Noun were used significantly more by

subjects at the lower proficiency levels. This could be due to the fact that these two

types of collocation are more frequent in everyday speech and syntactically simple

(e.g. Type 26. SVc includes constructions such as 'I am a student', 'I am bad', 'She

became a teacher'; Type 29. Adjective Noun includes collocations such as 'long hair',

'good student', 'beautiful girl'). Another explanation for the extensive use of these

collocation types by Group 1 subjects is that students in this group used fewer

collocation types overall, with more tokens used for each type. As the level of

proficiency increased, the number of collocation types used in the essays also

increased. The analysis of the collocations in the essays showed that subjects in the

lowest proficiency level, Group 1, used only 23 out of the 37 collocation types

investigated in this study, while subjects in the higher proficiency levels, Groups 2

and 3, used 29 and 28 of the 37 types respectively. Group 1 used fewer collocation

types and a greater number of tokens for some types (e.g. 26. SVc and 29. Adjective

Noun). Similar results were reported by Zhang (1993), who found that the more

proficient L2 learners used significantly more collocation types than the less

proficient L2 learners (Zhang 1993:147).

Another factor that could have influenced the subjects' performance with

regard to the use of SVc and Adjective Noun collocations is the topic of the essay.

Group 1 subjects had to describe themselves and their family in the essay, a topic

that may have prompted the use of more Adjective Noun and SVc phrases. Group 2

subjects had to describe themselves and their plans for the future, and Group 3

students had to describe and discuss pollution problems in their home town. The

essay topics for Groups 2 and 3 were thus not as purely descriptive as for Group 1.

Type 11. SV(O) Prep O collocations were also more frequent in the Group 1

and 2 essays. The textbook analysis also shows that TWE3, the textbook for Group

3 (post-intermediate students), contains the fewest collocations of this

type when compared to the other two textbooks, i.e. the students' production of

collocations may mirror their exposure to these collocations in their current

textbook, and not necessarily the incremental growth of collocations from TWE1 to

TWE3.

There were also collocation types which were used significantly more by

Group 2 than by Group 1, but they were used less frequently by Group 3. These

types are: 12. SV to Inf, 36. Prep Det Noun, 37. Phrasal Verb, 4. Prep Noun,

14. SVV-ing, and 24. SV(O) wh-word. Such a phenomenon has been described in previous

studies as 'backsliding' (Lightbown 1985a). According to Lightbown, L2

acquisition is not linear and cumulative, but is characterised by backsliding and

loss of forms that appeared to be previously mastered. In this study, Group 3

subjects are able to use the above collocation types, but they seem to rely less on

the use of these types than subjects in the lower proficiency levels. Backsliding has

been reported in previous developmental studies too (see Hyltenstam 1977;

Andersen 1978).

There were also collocation types which were not used at all. These are: 3.

Noun that, 18. SV possessive V-ing, 22. SVOO, 25. S(it)VO to Inf, 28. Verb Noun

(eradication), and 32. Adverb Adjective. The majority of these types are structurally

demanding and infrequent in everyday English. According to the BBI, examples of

these types are: 'We reached an agreement that she would represent us in court',

or 'it was his desire that his estate be divided equally' (Type 3. Noun that); 'They

love his clowning', or 'This fact justifies Bob's coming late' (Type 18. SV Possessive

V-ing); 'It surprised me to learn of her decision' (Type 25. S(it)VO to Inf). Type 22.

SVOO collocations consist of a transitive verb and two objects, neither of which can

be used in the prepositional phrase with to or for, e.g. 'God will forgive them their

sins', or 'we bet her ten pounds'. Previous research has also shown that SV

Possessive V-ing constructions are acquired late (Anderson 1978:97). Also, SVOO

constructions were found to be acquired after the more unmarked SVO to O

constructions (Mazurkewich 1984). What the above collocation types appear to

have in common is a greater degree of complexity. Studies in L1 acquisition have

shown that grammatical complexity is a determinant of acquisition orders (see

Brown 1973). Given that collocations in this study are operationalised in terms of

structurally determined patterns, grammatical complexity could be a factor

affecting the pattern of results obtained in this study. Zhang's (1993) study also

defined collocations in structural terms, and he found that the L2 learners in the

study avoided, and were unable to produce, the more structurally demanding

collocations when compared with native speakers (Zhang 1993:126). These

collocation types are also structurally different from their equivalent collocations in

Greek, e.g. Noun that collocations are Noun to [Passive Voice] Infinitive. Laufer and

Eliasson (1993) have also reported that L1-L2 difference was the best predictor of

avoidance in their investigation of the use of phrasal verbs by Swedish and

Hebrew ESL learners.

Finally, the absence of Type 28. Verb Noun (eradication) and 32. Adverb

Adjective collocations could be due to the fact that these types are relatively fixed

(not free combinations) and therefore difficult to acquire. For example, some Type

28. Verb Noun (eradication) collocations in the BBI are 'to reject an appeal', 'to

reverse a decision', 'to rescind a tax'. The authors of the BBI suggest that

collocations of this type are arbitrary and unpredictable, i.e. no predictions can be

made as to why certain verbs combine with certain nouns, therefore L2 learners

have difficulties acquiring them as they cannot tell why 'make an estimate' is

acceptable but *'make an estimation' is not (Benson et al. 1986b:258). For Type 32.

Adverb Adjective collocations the BBI includes 'deeply absorbed', 'strictly accurate',

'sound asleep'. Previous research has also revealed that adverbs, in particular, are

difficult for the L2 learner to use appropriately because they typically collocate

with specific words, i.e. they are fixed (Linnarud 1986:105). With respect to Types

3, 18, 22, 25, 28, and 32, it is possible that the subjects in this study have not yet

reached a proficiency level advanced enough to use such complex, infrequent,

and/or fixed collocations. Also, the analysis of the TWE series showed that Types

3. Noun that, 18. SV Possessive V-ing, and 25. S(it)VO to Inf do not appear in the

subjects' textbooks, i.e. no tokens of those collocation types were found in any of

the three textbooks. Furthermore, only a limited number of Type 22. SVOO, 28.

Verb Noun (eradication), and 32. Adverb Adjective collocations were found in the

textbooks (see Table 7, Chapter 3). It appears that lack of exposure to specific

collocation types or the low frequency of these collocation types in the subjects'

textbooks has also contributed to the avoidance of these types by the subjects.

Collocational development across groups was examined by implicational

scaling analysis of acquisition orders. The implicational scale for the essay data

was found to have a significant coefficient of reproducibility which means that a

subject's performance can be predicted with a high degree of accuracy from that

subject's position on the scale. Although previous studies using implicational

scaling analysis considered a high coefficient of reproducibility as adequate

evidence for the presence of an implicational scale (see Andersen 1978; Hyltenstam

1977), in this study the coefficient of scalability was also calculated, to provide

additional evidence as to the strength of the collocation types as an ordered scale.

The essay data were found to have a low coefficient of scalability (Cscalability =

.33). It is possible that the large number of items on the scale for this data could

have reduced the strength of the scale. Also, the backsliding learning patterns

which occurred for some collocation types will influence the scalability of the data

(Hatch & Lazaraton 1991:216). Although the essay scale does not reach statistical

significance, the relative magnitude of its predictive power cannot be determined,

because there are no other implicational analysis studies of the acquisition of

collocations with which to compare it. According to Davidson (1987)

"the magnitude of a coefficient of scalability should rightly be judged against

similar findings in the field" (p. 25). Since there are no other studies similar to this

one, it is possible that even scalability of .33 is valid enough as a predictor for the

order of acquisition of collocation types (Davidson 1987:26). However, only future

research and implicational analysis on collocations can verify this.

5.1.2 Within-Group Differences

Differences in the use of collocations were also sought in the writing

performance of each group of subjects. The following types were used most

frequently in each of the three groups.

Table 42. Collocation types used most frequently in the students' essays

Group 1 Group 2 Group 3

26. SVc 26. SVc 26. SVc

29. Adjective Noun 29. Adjective Noun 13. SV Inf

11. SV(O) Prep O 12. SV to Inf 12. SV to Inf

11. SV(O) Prep O

36. Prep Det Noun

It appears that Type 26. SVc collocations were used significantly more than

the other types in all three groups. Given that Type 26. SVc constructions are basic

and frequent in everyday speech, e.g. 'I am a student', 'I am happy', 'She became a

teacher', it is not surprising that subjects in all levels of proficiency used

collocations of this type more than any other type. Zhang's (1993) study also

showed that more SVc collocations were used by all learners, more and less

proficient, in their essays (Zhang 1993:125). These results are also in line with

previous research in the sequence of acquisition of grammatical structures by

Fathman (1977). She found that structures that needed to be produced correctly for

effective communication, such as SVc constructions, were learned early. Also,

according to Pienemann's Processability Model, copula sentences such as 'I am a

student' belong to Stage 1 (basic sentence structures and basic categories) of second

language acquisition (Pienemann 1996). Evidence from Japanese as a second

language has also shown copula sentences to be a Stage 1 structure (Huter 1996).

This collocation type may then be considered a 'core' type in the acquisition of

collocations.

Groups 1 and 2 also used Type 29. Adjective Noun and 11. SV(O) Prep O

collocations significantly more than the other types. As already mentioned above,

it is possible that the topic of the essay for these two groups (see Appendix B)

could have influenced the frequency of use of Type 29. Adjective Noun collocations.

As far as Type 11. SV(O) Prep O collocations are concerned, TWE1 and TWE2

contain more collocations of this type than TWE3.

The implicational scaling for the essay data between groups (see Figure 3)

shows that Types 26, 29 and 11 are also the first three items on the implicational

scale for all groups. Since Group 1 is the lowest proficiency group investigated by

this study, it is understandable that the subjects in this group use the easiest

collocation types more than the others. Therefore, it can be concluded that Types

26, 29 and 11 are early acquired collocation types, as their use was measured in the

writing performance of L2 learners in this study.

Types 12. SV to Inf and 13. SV Inf are also used more than the others by

higher level students, Groups 2 and 3. Both these types are still among the first six

items on the implicational scale of the essay data for all groups (see Figure 3), i.e.

they are among the most frequently used types of collocation, but their use

increases significantly in Groups 2 and 3. Zhang (1993) also reports that these two

types of collocation were used frequently by the L2 learners in his study (Zhang

1993:126). The textbook analysis reveals a few tokens of these two types in TWE1

(19 tokens for Type 12, and 26 tokens for Type 13) and a considerable increase in

TWE2 and TWE3 (TWE2: 234 tokens for Type 12, and 230 tokens for Type 13;

TWE3: 285 tokens for Type 12, and 347 tokens for Type 13). From a linguistic point

of view, the fact that Type 13. SV Inf collocations are acquired later than Type 12.

SV to Inf collocations could be due to cumulative grammatical complexity, a notion

introduced by Brown (1973). Cumulative grammatical complexity assumes

that a construction y is more complex than a construction x only if y involves all the

transformations involved in x plus one or more others (Brown 1973:377). In this

respect, cumulative grammatical complexity differs from the theory of

derivational syntactic complexity which assumes that all transformations involve a

constant increment of complexity (see Brown & Hanlon 1970). Derivational

syntactic complexity proved inadequate for providing an explanation of language

acquisition (see Smith 1988), and Brown claims that the cumulative number of

transformations is a better index of complexity (Brown 1973:377; for other

approaches to assessing lexico-syntactic complexity see Frazier 1988; Crain &

Shankweiler 1988; Cheung & Kemper 1992; Hulstijn & deGraaff 1994; Hulstijn

1995). In the present data, Type 13. SV Inf requires all the rules that constructions

that contain infinitives do, plus one more, i.e. to-deletion. Type 13. SV Inf

collocations are thus more difficult and hence are acquired later. Furthermore,

Type 13. SV Inf represents collocations that contain modal auxiliaries, e.g. 'can,

could, should, would, may + Inf'. Modal auxiliaries constitute a closed class of

verbs with limited distributions and have distinct features when compared to

regular verbs, e.g. they require to-deletion before their combination with an

infinitive, they take no third-person inflection, they have abnormal time reference,

and they can only occur as the first element of the verb phrase (see Quirk,

Greenbaum, Leech & Svartvik 1985; Steele 1981; on the learnability of English

auxiliaries in L1 acquisition see Pinker 1984). From a developmental point of view,

the correct use of Type 13 collocations mainly by Group 3 subjects indicates that

accurate use of modal auxiliaries develops later in L2 learners and thus Type 13

collocations are developmentally 'difficult'. From a learnability point of view,

Type 13 collocations are different from their equivalent collocations in Greek

which do not require to-deletion, e.g. ‘μπορείς να ψωνίσεις εδώ’ [you can to shop

here] is SV[Modal Auxiliary] to Inf. Due to the L1-L2 difference, Type 13

collocations can be considered more difficult than Type 12 collocations. Similar

results regarding the use of modal auxiliaries are also reported by Anderson

(1978). In Ravem (1974) too, it was reported that the acquisition of a full range of

auxiliary morphemes (which included Modals) and their distribution develops late

(Ravem 1974:148).
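The contrast drawn above between cumulative and derivational complexity can be made concrete with a small sketch. Cumulative complexity is a proper-superset relation between sets of transformations, whereas derivational complexity simply counts transformations. The transformation labels below are hypothetical placeholders chosen for illustration, not an analysis proposed in this study.

```python
from typing import AbstractSet


def cumulatively_more_complex(y: AbstractSet[str], x: AbstractSet[str]) -> bool:
    """True only if y involves every transformation in x plus at least one more."""
    return set(x) < set(y)  # proper subset test


def derivational_complexity(construction: AbstractSet[str]) -> int:
    """Count transformations, treating each as a constant increment."""
    return len(set(construction))


# Hypothetical transformation inventories for illustration only.
SV_TO_INF = frozenset({"infinitival complement"})
SV_INF = frozenset({"infinitival complement", "to-deletion"})
UNRELATED = frozenset({"passivisation", "wh-movement"})

print(cumulatively_more_complex(SV_INF, SV_TO_INF))     # SV Inf adds to-deletion
print(cumulatively_more_complex(UNRELATED, SV_TO_INF))  # no shared transformations
print(derivational_complexity(UNRELATED) > derivational_complexity(SV_TO_INF))
```

Under the cumulative definition, the unrelated construction cannot be ordered with respect to SV to Inf at all, even though a simple count would rank it as more complex. This is why the cumulative criterion makes a prediction only for pairs such as Type 12 and Type 13.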

The implicational scales for the essay data by group have coefficients of

scalability below the recommended level of statistical significance. As with the

implicational scale for the essay data (all groups), the large number of items on the

scale could be responsible for the low scalability. Despite the low coefficients of

scalability, the three scales reveal orders of difficulty similar to the patterns of

acquisition, as measured by the Friedman repeated measures analyses. The scale

for Group 1 has Types 26, 29 and 11 as the first three items on the scale. The scale

for Group 2 has Types 26, 29, 12, 11 and 36 as the first five items on the scale.

Finally, the scale for Group 3 has Types 26, 13, 12, and 36 as the first four items on

the scale. The results from the implicational scaling analysis, although not

reaching statistical significance, exhibit a pattern that supports the view that

certain orders exist in the acquisition of collocations, as measured by the writing

performance of L2 learners. These orders appeared to be influenced by exposure,

as the subjects' textbook analysis shows, and/or the complexity, arbitrariness, and

predictability of specific collocation types (see above). The correlation of the three

implicational orders (see also Fathman 1977; Pica 1983) showed that the orders for

subjects in Groups 1 and 2 were highly correlated (rs = .832); the orders for Groups

2 and 3 were also highly correlated (rs = .766); and the orders for Groups 1 and 3

revealed the lowest correlation (rs = .552). These results show a gradual

development of collocational knowledge across the three Groups in the study.

Since the subjects in each Group for this study were only one year apart, the

development of collocational knowledge had progressed to a different stage after

only two years of instruction (exposure to collocations via the TWE textbooks) and

maturation (during the period between 12-15 years of age) had taken place. Thus,

even though the implicational scales for the essay data lack statistical significance,

they can still be used as indicators of the development of English collocational

knowledge in L2 learners.
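
The rs values reported above are presumably Spearman rank-order coefficients, the statistic usually used to compare acquisition orders. The following is a minimal sketch of that computation for two orders without ties. The type labels and the three orders are hypothetical and are not the data of this study.

```python
def spearman_rho(order_a, order_b):
    """Spearman rank correlation between two orderings of the same items.

    Each argument lists the items from first (easiest) to last (hardest).
    Assumes no tied ranks, so the shortcut formula
    rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)) applies.
    """
    items = set(order_a)
    if items != set(order_b):
        raise ValueError("both orders must rank the same items")
    if len(order_a) != len(items) or len(order_b) != len(items):
        raise ValueError("each item must appear exactly once per order")
    n = len(order_a)
    if n < 2:
        raise ValueError("at least two items are needed")

    # Rank of every item in the second order (1 = first on the scale).
    rank_b = {item: rank for rank, item in enumerate(order_b, start=1)}

    # Sum of squared rank differences across items.
    d_squared = sum(
        (rank_a - rank_b[item]) ** 2
        for rank_a, item in enumerate(order_a, start=1)
    )
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical orders for three groups (labels are illustrative only).
group_x = ["A", "B", "C", "D", "E"]
group_y = ["A", "C", "B", "D", "E"]
group_z = ["B", "D", "A", "E", "C"]

print(round(spearman_rho(group_x, group_y), 3))
print(round(spearman_rho(group_x, group_z), 3))
```

In this toy example the adjacent orders (x and y) correlate more highly than the distant ones (x and z), which is the kind of pattern the text interprets as gradual development.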

5.2 Cued Production of Collocations

Knowledge of collocations was also measured as accuracy of response to the

translation and blank-filling tests. Between- and within-group differences in the

accuracy of responses were used as evidence of the development of collocational

knowledge in the three proficiency levels.

5.2.1 Translation Data

Subjects were tested on their ability to translate correctly sentences from

their L1 into English. Each sentence contained an English collocation that was

different from its equivalent in the learners' L1. The significant results for this set

of data are discussed below.

5.2.1.1 Between-Group Differences

The students in Group 3 performed with the greatest accuracy in the

translation test. The results showed that Type 1. Noun Prep, 5. Adjective Prep, 14.

SVV-ing, and 27. Verb Noun (creation) collocations were translated significantly

more accurately by higher proficiency subjects.

The results for Type 1. Noun Prep reflect to some extent the treatment of this

collocation type in the subjects' textbooks: 76 tokens in TWE1, 80 tokens in TWE2

and 145 tokens in TWE3. The use of Type 27. Verb Noun (creation) collocations also

increases in the textbooks in a pattern similar to the one found in the results of the

translation test, i.e. as level increases the number of collocations found in the

textbooks increases. The results for Type 5. Adjective Prep collocations show a

significant difference in accuracy only between Group 1 and Group 3, with the

highest proficiency subjects, Group 3, being more accurate than the others. A

similar pattern is also found in the students' textbooks. Although Type 14. SVV-

ing collocations appear more in TWE1, students were able to translate them with

significantly more accuracy after their level increased. Finally, Type 11. SV(O) Prep

O collocations were translated most accurately by subjects in Group 2.

Collocations of this type were also found more in TWE2 than in the other two

textbooks.

Implicational scaling analysis was also used for the between-group

differences. The coefficient of reproducibility was found to be significant, which

confirmed the predictive power of the scale. The coefficient of scalability for the

translation data approached significance (Cscalability = .57). The smaller number of

collocation types included in the translation test could have contributed to the high

coefficient of scalability. This result also indicates that a small number of items

and a translation test are more likely to yield strong enough differences in

performance for a set of collocation types to be truly scalable, i.e. implicationally

ordered.
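
The coefficients discussed above can be illustrated with a minimal sketch of implicational (Guttman) scaling. It assumes the formulas commonly given in applied-linguistics statistics manuals: Crep = 1 - errors / (subjects x items); minimal marginal reproducibility MMrep = sum of the larger marginal per item / (subjects x items); and Cscal = (Crep - MMrep) / (1 - MMrep), with .90 and .60 as the conventional thresholds. Errors are counted here as deviations from the ideal pattern implied by each subject's total score, which is one of several counting methods and may differ from the one used in this study. All subject labels, type labels and accuracy values are hypothetical.

```python
def implicational_scale(accuracy, criterion=0.80):
    """Return (item order, Crep, MMrep, Cscal) for an accuracy table.

    accuracy: dict mapping subject -> dict mapping item -> proportion correct.
    An item counts as acquired (1) when accuracy >= criterion, else 0.
    """
    subjects = list(accuracy)
    items = list(next(iter(accuracy.values())))

    # Dichotomise the data with the acquisition criterion.
    matrix = {
        s: {i: int(accuracy[s][i] >= criterion) for i in items}
        for s in subjects
    }

    # Order items from easiest (most subjects at criterion) to hardest.
    totals = {i: sum(matrix[s][i] for s in subjects) for i in items}
    order = sorted(items, key=lambda i: -totals[i])

    # A subject with score k should ideally have acquired the k easiest
    # items and nothing else; every other cell counts as an error.
    errors = 0
    for s in subjects:
        k = sum(matrix[s].values())
        for rank, item in enumerate(order):
            ideal = 1 if rank < k else 0
            errors += int(matrix[s][item] != ideal)

    cells = len(subjects) * len(items)
    c_rep = 1 - errors / cells

    # For each item take the larger of its 1-count and 0-count.
    max_marginals = sum(
        max(totals[i], len(subjects) - totals[i]) for i in items
    )
    mm_rep = max_marginals / cells
    c_scal = (c_rep - mm_rep) / (1 - mm_rep) if mm_rep < 1 else float("nan")
    return order, c_rep, mm_rep, c_scal


# Hypothetical accuracy scores for five subjects on four collocation types.
demo = {
    "s1": {"A": 0.9, "B": 0.9, "C": 0.9, "D": 0.9},
    "s2": {"A": 0.9, "B": 0.9, "C": 0.9, "D": 0.5},
    "s3": {"A": 0.9, "B": 0.9, "C": 0.5, "D": 0.5},
    "s4": {"A": 0.9, "B": 0.5, "C": 0.5, "D": 0.5},
    "s5": {"A": 0.5, "B": 0.9, "C": 0.5, "D": 0.5},  # departs from the scale
}

order, c_rep, mm_rep, c_scal = implicational_scale(demo)
print(order, round(c_rep, 2), round(mm_rep, 2), round(c_scal, 2))
```

In this toy data set one subject departs from the ideal pattern, which yields Crep = .90 and Cscal = .60. With more items on the scale there are more cells in which such departures can occur, which is consistent with the lower scalability reported for the larger scales.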

Considering the implicational scale for the translation data, the following

order of accuracy was found after the 80% criterion of acquisition was applied to

the data (types at the top of the order were more accurately translated than types at

the bottom):

Table 43. Accuracy order for the collocation types included in the translation

test - All Groups

Type

14. SVV-ing

16. SVO Inf

11. SV(O) Prep O

1. Noun Prep

5. Adjective Prep

27. Verb Noun (creation)

The results from the implicational analysis indicate that students were more

accurate in translating grammatical collocations (Types 14, 16, and 11) than lexical

collocations (Types 1, 5, and 27). Type 14. SVV-ing was easier to translate than

Type 16. SVO Inf. Similar results were reported by Anderson (1978), who found

that gerund SVV-ing constructions were acquired earlier than SVO Inf

constructions that required to-deletion (Anderson 1978:97).

The most accurately translated lexical collocation type on the scale, Type 1.

Noun Prep, included collocations such as 'things about', 'flight to', 'plans about',

'champion in', 'success in', 'pain in [the stomach]'. Students found these lexical

collocations easier to translate than Type 5. Adjective Prep collocations. The

following Type 5 collocations were included in the translation test: 'afraid of',

'interested in', 'bored with', 'married to'. Type 1 occurred more frequently than

Type 5 in the TWE series, i.e. the type-token ratio for Type 1 in the TWE series was

100.3, while for Type 5 it was 82.6. Also, all of the Type 1 collocations included in

the translation test have a similar structure in Greek, i.e. a noun followed by a

preposition. Some Type 5 collocations, on the other hand, e.g. 'afraid of' and 'bored

with', have a different structure in Greek, i.e. Verb Det Noun ‘φοβάμαι τα φίδια’

[afraid-[Middle Voice Verb] the snakes], ‘βαριέμαι το σχολείο’ [bored-[Middle

Voice Verb] the school]. The L1-L2 difference with regard to the English Adjective

Prep collocations could be one factor responsible for the subjects' low accuracy in

the translation of Type 5. Adjective Prep collocations. It has also been reported that

Adjective Prep collocations are more fixed (i.e. consistently used with a preposition,

e.g. 'fond of', 'afraid of', 'deaf to' (Benson et al. 1986a:xii)) and difficult for low

proficiency students, and as such they are indicative of a higher level of

proficiency. Zhang (1993) also reports that in his investigation of English

collocational knowledge by L2 learners and native speakers, collocations such as

Adjective Prep were used considerably more by native writers than L2 learners.

Type 27, the most difficult collocation type on the scale, included lexical

collocations that are fairly fixed in English, e.g. 'draw conclusions', 'face problems',

and different from their equivalent collocations in Greek, e.g. 'βγάζω

συμπεράσματα' [take out conclusions], 'αντιμετωπίζω προβλήματα' [confront

problems]. The arbitrary nature of Verb Noun (creation) collocations has also been

reported by the writers of the BBI (Benson et al. 1986b). The arbitrariness and

unpredictability of these collocations makes non-native speakers unable to cope

with them. It is not surprising, then, that such collocations were difficult for the

subjects of this study. Also, an examination of the translations supplied by the

students showed considerable influence from Greek. It is possible that the nature

of the test, i.e. translation, could have increased L1 influence. L1 interference has

also been reported in past studies on collocations involving a translation test

(Marton 1977:46).

The acquisition order for the translation data approached statistical

significance: that is, students who correctly translated Type 27. Verb Noun (creation)

collocations, the last and most difficult type on the scale, also translated

correctly the rest of the collocation types included in the translation test.

Overall, results show a very low accuracy in the translation test, i.e. only 88

out of 275 subjects, about 32%, were 80% or more accurate in the translation of

Type 14. SVV-ing collocations, which was the most accurately translated type on

the scale. Hence, translation proved to be a difficult test for the subjects. Previous

research involving advanced L2 learners, i.e. fifth year Polish students majoring in

English, in a translation test, Polish to English, showed that even advanced

students did not have most of the collocations which were tested in their

productive repertoires (Marton 1977:45).

5.2.1.2 Within-Group Differences

The results for the translation data revealed significant within-group

differences in the translation responses.

Group 1

For Group 1, Types 23. SV(O) Adverbial and 13. SV Inf were translated more

accurately than all the other types. The implicational scaling analysis also confirms

that subjects more often reached 80% or higher accuracy in translating Types 23 and 13 than the

other types. The coefficient of reproducibility and the coefficient of scalability

were both found to be statistically significant for this analysis, which suggests that

the accuracy order for the translation data for Group 1 has validity and predictive

power. The order is the following (at the top are those types that were the easiest

to translate, while at the bottom are those types that were the most difficult to

translate):

Table 44. Accuracy order for the collocation types included in the translation

test - Group 1

Type

23. SV(O) Adverbial

13. SV Inf

16. SVO Inf

14. SVV-ing

5. Adjective Prep

11. SV(O) Prep O

1. Noun Prep

27. Verb Noun (creation)

The above accuracy order shows the following.

i) SV Inf collocations are easier to translate than SVO Inf collocations, which in

turn are easier to translate than SVV-ing collocations.

ii) Collocations containing a preposition, i.e. Adjective Prep, SV(O) Prep O, and

Noun Prep, are more difficult to translate than collocations containing an infinitive,

i.e. SV Inf, and SVO Inf. Prepositions are also more likely to cause interference

from the subjects' L1 than infinitives. Greek has a number of prepositions that do

not always coincide with the English prepositions, e.g. 'pain in the stomach' is

‘πόνος στο στομάχι’ [pain to the stomach], 'things about other countries' is

‘πράγματα για άλλες χώρες’ [things for other countries]. On the other hand,

infinitives in Greek are like their English equivalents. Prepositional phrases and

phrasal verbs have also been reported as constructions that exhibit arbitrary lexical

restrictions (Allerton 1984), and as such they are difficult to acquire.

iii) Verb Noun (creation) lexical collocations are the most difficult to translate, a

result that was also evident from the between-group analysis of the translation

data.

Group 2

The results for Group 2 show that only Type 11. SV(O) Prep O collocations

were translated significantly more accurately than the other collocation types. The

implicational scaling analysis for this set of data, which was also found to be

statistically significant, shows the accuracy order given in Table 45:

Table 45. Accuracy order for the collocation types included in the translation

test - Group 2

Type

11. SV(O) Prep O

16. SVO Inf

14. SVV-ing

1. Noun Prep

5. Adjective Prep

27. Verb Noun (creation)

The above accuracy order shows that:

i) Grammatical collocations are easier to translate than lexical collocations.

ii) As with the accuracy order for Group 1, collocations that contain a preposition,

i.e. Noun Prep and Adjective Prep, are more difficult to translate than collocations

that contain an infinitive.

iii) Verb Noun (creation) lexical collocations are also the most difficult to translate

for Group 2.

Comparing the accuracy orders for Groups 1 and 2, it appears that they are

similar with respect to most types. The only exception is Type 11. SV(O) Prep O.

Subjects in Group 2 found collocations of this type easier to translate than the

subjects in Group 1. That is, subjects that received an additional year of instruction

were more accurate in translating Type 11. SV(O) Prep O collocations. Group 2

subjects also received more exposure to Type 11 collocations through their

textbooks than Group 1 students, i.e. Type 11 collocations were found more

frequently in TWE2 than in TWE1.

Group 3

The results for the translation data for Group 3 show Types 13. SV Inf, 14.

SVV-ing and 1. Noun Prep, to be translated more accurately than the other types.

The implicational scaling analysis approached statistical significance (Cscalability =

.59). The following accuracy order was obtained:

Table 46. Accuracy order for the collocation types included in the translation

test - Group 3

Type

13. SV Inf

14. SVV-ing

1. Noun Prep

16. SVO Inf

11. SV(O) Prep O

5. Adjective Prep

27. Verb Noun (creation)

The above accuracy order shows that:

i) SV Inf collocations are easier to translate than SVO Inf collocations.

ii) With the exception of Noun Prep collocations, collocation types that contain a

preposition are more difficult to translate than collocation types that contain an

infinitive.

iii) Verb Noun (creation) collocations are the most difficult to translate for Group 3.

Even for the subjects in the highest proficiency level, Verb Noun (creation)

collocations are the most difficult to translate with accuracy. The same applies for

structures that are grammatically more complex, e.g. SVO Inf versus SV Inf (see

above). According to the cumulative grammatical complexity (Brown 1973), SVO

Inf structures are more complex than SV Inf structures since they require the

insertion of an Object, and as such they are more difficult to acquire. Also, recent

research in L2 acquisition has shown that the greater number of units and

morphemes in some structures obscure their perceptual 'salience' making them

harder to 'notice' and therefore to produce accurately (Bardovi-Harlig 1987;

Robinson 1995; Schmidt 1990, 1995). SVO Inf collocations contain all the units of

SV Inf constructions plus one more, i.e. Object, and as such they are less salient and

harder to produce accurately. The above result is also consistent with previous

studies on the acquisition of grammatical structures (see Anderson 1978).

The subjects' accuracy improves significantly in the translation of Type 1.

Noun Prep collocations, i.e. as the students' level increased, their ability to translate

lexical collocations also improved. This result is in line with Zhang's (1993) study,

in which he found that the high proficiency L2 students had a better command of

English lexical collocations than the low proficiency L2 students (Zhang 1993:148).

Comparing the three accuracy orders for the translation data, we can

conclude the following.

i) Verb Noun (creation) collocations are difficult for all three proficiency groups.

This was also evident from the between-group analysis. Collocations of this type

also appeared infrequently in the students' essays with no significant between-

group differences. As mentioned above, collocations of this type are fixed, e.g. 'to

face problems', 'to draw conclusions', and subjects at all three levels exhibit a

general weakness in the free production (essay data) and cued production

(translation data) of Verb Noun (creation) collocations. Zhang (1993) also found the

use of such collocations to be "weak areas" for L2 learners (Zhang 1993:106).

ii) Grammatical collocations, e.g. SV Inf, SVO Inf, SVV-ing, are easier to translate

than lexical collocations, e.g. Noun Prep, Adjective Prep, Verb Noun (creation).

However, as subjects become more proficient, their accuracy in lexical collocations

improves, i.e. Group 3 subjects become more accurate in translating Noun Prep

collocations than students in Groups 1 and 2. This is also consistent with Zhang's

results.

iii) Collocations that contain prepositions are harder to translate than collocations

that contain infinitives. Zhang reports that "knowing prepositions and being able

to use them in idiomatic combinations with other words are part of native fluency"

(Zhang 1993:135). In his study, too, L2 learners showed a weakness in knowledge

and ability to use collocations that contained prepositions.

5.2.2 Blank-Filling Data

Cued production of collocations was also measured in a blank-filling test.

Each sentence contained an English collocation with one part missing. Subjects

were required to provide the missing part of each collocation. The collocations

included in the blank-filling test could not be translated directly into the learners'

L1, Greek. The results for this set of data are discussed below.

5.2.2.1 Between-Group Differences

The results in the blank-filling data revealed that for Type 1. Noun Prep, 11.

SV(O) Prep O, 23. SV(O) Adverbial, 24. SV(O) wh-word, 27. Verb Noun (creation), 33.

Verb Adverb, 36. Prep Det Noun, and 37. Phrasal Verb collocations, subjects were

significantly more accurate in supplying the correct collocation as their level of

proficiency increased.

The textbook analysis (see Chapter 3) also showed that all but two of the

above collocation types exhibit a similar pattern of increase across the three

textbooks. For example, tokens for Types 1. Noun Prep, 24. SV(O) wh-word, 27. Verb

Noun (creation), 33. Verb Adverb, 36. Prep Det Noun, and 37. Phrasal Verb increase as

the level of language proficiency increases from TWE1 to TWE3. The students'

exposure to these collocation types could have influenced their performance on the

blank-filling test, i.e. the more frequently students were exposed to a particular

type of collocation, the more accurate they became in their knowledge of

collocations of this type.

The subjects' performance on two collocation types showed a U-shaped

pattern of acquisition. For Types 5. Adjective Prep and 34. Noun Noun, subjects in

Group 1 were more accurate than subjects in Group 2, who were also less accurate

than subjects in Group 3. A look at the specific collocations shows that the level of

difficulty increases with proficiency. For example, Type 34. Noun Noun

collocations for Group 1 were 'post office' and 'phone number', for Group 2 'traffic

lights', and for Group 3 'curriculum vitae'. Group 3 subjects were more accurate in

responding to the 'curriculum vitae' collocation than subjects in Group 2 were with

the 'traffic lights' collocation, even though 'curriculum vitae' is less frequent than

'traffic lights' in everyday speech. The analysis of the textbooks shows that 'traffic

lights' appears twice in TWE2, while 'curriculum vitae' appears only in TWE3, four

times. Again, the amount of exposure to a specific collocation appears to influence

performance.

Performance on the Adjective Prep collocations also increased as the level of

proficiency increased. For example, some of the Adjective Prep collocations for

Group 3 were: 'competent in', 'fond of', 'successful in', 'married to', 'unsure about',

'similar to', 'slow in', 'capable of', 'regardless of'. Group 2 subjects were tested on

the following Type 5 collocations: 'full of', 'sympathetic to', 'engaged to', 'upset

about'. Despite the fact that Group 3 students were faced with a larger number of

Type 5 collocations compared to Group 2 students, they were more accurate in

supplying the correct collocations. Both Noun Noun and Adjective Prep collocations

are lexical collocations. It appears that students at the initial stages of ESL learn

specific lexical collocations, possibly as unanalysed chunks, and hence they are

relatively accurate with respect to selected lexical collocations. As their proficiency

increases and their grammatical knowledge develops, their relative accuracy in

lexical collocations declines and they become better in grammatical collocations (in

the translation test too, intermediate level students were better in SV(O) Prep O

collocations). At the post-intermediate level, the subjects' overall accuracy

increases, and they once again become more accurate in lexical collocations. Such a

U-shaped development in L2 learners has been reported in previous linguistic

studies too (see McLaughlin 1987, 1990; Lightbown et al. 1980).

Overall, accuracy improves in the same fashion as in the essay and

translation tasks. Subjects in Group 3 performed the best and subjects in Group 1

performed the worst.

The implicational analysis for the blank-filling data had a low coefficient of

scalability (Cscalability = .33), and thus there is little evidence of a stable acquisition

order. Again, the U-shaped learning patterns probably contributed to the low

scalability of the blank-filling data (see Hatch & Lazaraton 1991:216).

5.2.2.2 Within-Group Differences

These were the significant differences in the accuracy of the subjects'

responses to specific collocation types within each group.

Group 1

After the 80% accuracy criterion was applied to the data for the

implicational scaling analysis, the following accuracy order was revealed:

Table 47. Accuracy order for the collocation types included in the blank-filling

test - Group 1

Type

34. Noun Noun

33. Verb Adverb

24. SV(O) wh-word

36. Prep Det Noun

5. Adjective Prep

23. SV(O) Adverbial

30. Noun Verb

1. Noun Prep

4. Prep Noun

37. Phrasal Verb

29. Adjective Noun

31. Noun1 of Noun2

11. SV(O) Prep O

27. Verb Noun (creation)

The non-significant accuracy order shows that subjects in Group 1 were

more accurate in lexical collocations, i.e. Noun Noun, Verb Adverb, Prep Det Noun,

Adjective Prep, than in grammatical collocations that were more difficult to produce

(longer collocational strings), e.g. SV(O) wh-word, SV(O) Adverbial, SV(O) Prep O.

Group 2

After the 80% accuracy criterion was applied to the data for the

implicational scaling analysis, the following accuracy order was obtained for

Group 2 subjects:

Table 48. Accuracy order for the collocation types included in the blank-filling

test - Group 2

Type

24. SV(O) wh-word

23. SV(O) Adverbial

34. Noun Noun

31. Noun1 of Noun2

4. Prep Noun

5. Adjective Prep

36. Prep Det Noun

37. Phrasal Verb

27. Verb Noun (creation)

11. SV(O) Prep O

33. Verb Adverb

1. Noun Prep

The accuracy order for Group 2 is statistically significant and reveals that as

the level of proficiency increases, subjects become more accurate in their responses

to grammatical collocations that were initially difficult to produce (see scale for

Group 1 subjects), i.e. SV(O) wh-word, SV(O) Adverbial.

With regard to SV(O) wh-word collocations, the subjects' accuracy could be

due to the systematic appearance of this type of collocation in their textbook,

TWE2. SV(O) wh-word collocations are mainly used in the TWE series to give

instructions for the various tasks in the textbooks, e.g. "ask what the area code for

Liverpool is" (TWE2:56), "find out why Sam went back to his home town"

(TWE2:100). TWE1, because it was designed for beginner levels, gives task

instructions in Greek. It is in TWE2 that instructions are given in English and a

large number of SV(O) wh-word collocations are included in the textbook. Hence,

the subjects in Group 2 had more exposure to SV(O) wh-word collocations

compared with Group 1 students.

Group 3

After the 80% accuracy criterion was applied to the data for the

implicational scaling analysis, the following accuracy order was evident for

subjects in Group 3:

Table 49. Accuracy order for the collocation types included in the blank-filling

test - Group 3

Type

33. Verb Adverb

34. Noun Noun

4. Prep Noun

24. SV(O) wh-word

23. SV(O) Adverbial

29. Adjective Noun

27. Verb Noun (creation)

1. Noun Prep

36. Prep Det Noun

37. Phrasal Verb

28. Verb Noun (eradication)

5. Adjective Prep

11. SV(O) Prep O

The above accuracy order for Group 3 was statistically significant. It shows

that subjects in the highest proficiency level were accurate in both lexical and

grammatical collocations.

Overall, as with the translation data, the subjects showed low accuracy in their

responses to the blank-filling test, i.e. only 45 subjects out of 275, about 16%, were

accurate in their responses to Type 34, the type with the most accurate answers.

Comparing the results from the three scales, the following conclusions can

be drawn.

i) Type 11. SV(O) Prep O and 27. Verb Noun (creation) collocations were among the

most difficult collocation types (see also the results for the translation data). Also,

the subjects' responses to Type 28. Verb Noun (eradication) collocations were no

more accurate than their responses to Type 27. Verb Noun (creation) collocations

(see implicational scale for Group 3). It appears that Verb Noun collocations are

difficult to acquire, irrespective of whether or not they denote creation or eradication

(see Benson et al. 1986a).

ii) Subjects in Groups 1 and 2 achieved similar levels of accuracy, while subjects in

Group 3 were clearly more accurate in the blank-filling test, despite the fact that

their test contained more items. Undoubtedly, students at the most proficient level

for this study had a more advanced level of collocational knowledge.

iii) The greatest difference in the three acquisition orders was with respect to Type

33. Verb Adverb collocations. Subjects in Groups 1 and 3 were accurate in their

responses to this type of collocation, with Group 3 subjects significantly more

accurate than Group 1 subjects, while subjects in Group 2 were not at all accurate

on this collocation type. An examination of the specific collocations tested showed

that Groups 1 and 3 were tested only on the collocation 'work hard', while subjects

in Group 2 were tested on 'work hard', 'brake hard', and 'think highly'. In terms of

idiomaticity, 'think highly' is more idiomatic than the other two Verb Adverb

collocations. The idiomaticity of the collocation 'think highly' can be determined in

terms of its level of abstraction and literalness (i.e. the likelihood of its literal

meaning): 'think highly' is a more abstract collocation compared to 'work hard'

and 'brake hard', which represent physical actions; also 'think highly' is of low

literalness (i.e. of unlikely literal meaning), while 'work hard' and 'brake hard' are

collocations with high literalness (see Cronk & Schweigert 1992). The collocation

'think highly' appeared to be especially difficult for subjects in Group 2, and even if

they answered the other two collocations correctly they still would not be able to

score more than 66% accuracy on this type of collocation (less than the 80%

accuracy criterion).

iv) As the level of proficiency increased, the students' performance on Prep Noun

collocations also increased. Despite the fact that subjects in Group 3 had more than

double the number of Prep Noun collocations in their version of the blank-filling

test than subjects in Group 1, they were far more accurate in their responses to this

type of collocation. On the other hand, Noun Prep collocations were difficult for all

three groups. The two types of collocation consist of the same parts of speech (a

noun and a preposition) but in a different order. When the preposition precedes

the noun, collocations are easier for L2 learners. When the preposition comes after

the noun, collocations become more difficult. A look at some of the Prep Noun

collocations included in the test shows that these collocations are fairly fixed,

frequent and regular (i.e. rule-governed), e.g. 'on Sundays' [on + day of the week],

'at 7:06' [at + time], 'in favour', 'in danger'. Noun Prep collocations are also fixed but

less regular, more unpredictable (i.e. no rules can be generated for them) and

associative, e.g. 'skills in', 'attitude towards', 'accusations against', 'degree in'. It is

possible that the order in which the parts of a collocation combine, rather than the

class they belong to (e.g. noun, verb, preposition, etc.), influences the degree of

difficulty and consequently the acquisition of a collocation.

v) SV(O) Adverbial and SV(O) wh-word collocations were relatively easy for all

groups, with SV(O) Adverbial collocations slightly more difficult than SV(O) wh-

word collocations. Both these types have occurred frequently in the TWE series,

with SV(O) wh-word collocations more frequent than SV(O) Adverbial collocations.

vi) As students became more proficient their accuracy on Adjective Noun lexical

collocations also improved. The Adjective Noun collocations for this test were fixed

and formal, e.g. 'sore throat', 'marine life', 'heavy drinker'. The subjects' knowledge

of fixed collocations therefore improved significantly with proficiency.

5.3 Summary of the Findings

As far as the free production of collocations is concerned, the following

conclusions can be drawn:

i) Type 26. SVc collocations are 'core' collocations, as they were the most frequently

used by students at all proficiency levels.

ii) Type 26. SVc and 29. Adjective Noun collocations are acquired early, as their

use by subjects in this study revealed.

iii) Types 3. Noun that, 18. SV Possessive V-ing, 22. SVOO, 25. S(it)VO to Inf, 28. Verb

Noun (eradication), and 32. Adverb Adjective were avoided by subjects in all groups.

These types represent collocations that are structurally demanding, infrequent,

and/or fixed.

iv) The use of Type 12. SV to Inf, 13. SV Inf, 15. SVO to Inf, 21. SVOc, 1. Noun Prep,

and 5. Adjective Prep collocations indicates a higher level of proficiency and

development of collocational knowledge, as they were used mainly by subjects in

Group 3, the highest proficiency level.

v) The development of collocational knowledge occurs gradually, and collocational

use develops significantly after two years of instruction, exposure and maturation

have taken place (see correlations of the acquisition orders for the three Groups).

As far as the learners' cued production of collocations is concerned, the

following conclusions can be drawn.

i) Grammatical collocations are easier to translate than lexical collocations.

Accuracy in translating lexical collocations, Types 1, 5, and 27, increased as

language proficiency increased.

ii) Grammatically more complex types were more difficult, e.g. Type 16. SVO Inf

collocations were more difficult to translate with accuracy than Type 13. SV Inf

collocations.

iii) Collocations containing a preposition were more difficult to translate than

collocations containing an infinitive, as prepositions appeared to be more likely to

cause L1 interference for the subjects in this study, and their combination with

other words produced relatively fixed and difficult collocations.

iv) Type 27. Verb Noun (creation) lexical collocations were the most difficult to

translate with accuracy for all subjects. Verb Noun collocations in the blank-filling

test were also difficult for all subjects and they were also infrequent in the students'

essays.

5.4 Factors Affecting the Development of Collocational Knowledge

In previous developmental studies, frequency in the input has been

considered a determinant of the sequence of acquisition of morphemes (Larsen-

Freeman 1976a, 1976b). In this study too, frequency of input seemed to affect the

development of collocational knowledge. The results from the translation and the

blank-filling tests suggest that the more frequently students are exposed to a

particular collocation type, the more likely they are to know it. There is also

evidence that the amount of exposure to a particular collocation via textbooks can

influence the acquisition of that particular collocation, irrespective of how

frequently that particular collocation occurs in everyday speech, e.g. 'curriculum

vitae'. The results of the essay data strongly suggest that the production of English

collocations by the subjects in the present study was influenced by the frequency of

occurrence of English collocations in their textbooks. Greater frequency could

have made certain collocations more salient and noticeable, supporting the

argument that 'noticing' the form of input leads to learning (Doughty 1991;

Robinson 1995; Schmidt 1990, 1995). Palmberg (1987, 1988) also found that the


vocabulary L2 learners produced consisted mainly of textbook vocabulary.

Instruction has been found to influence the rate of acquisition in other studies too

(Olshtain 1987; Doughty 1991). However, given the fact that the subjects in this

study were tested on collocations already taught to them, their overall low

accuracy in both the translation and the blank-filling tests suggests that mere

exposure to collocations is not enough to facilitate recall. This conclusion is also in

line with past research (Marton 1977:47; Bardovi-Harlig 1992b:272).

Complexity was also considered as another factor influencing the

development of collocational knowledge in ESL learners. With regard to

grammatical collocations, for specific pairs of collocational structures, the type that

was grammatically more complex was also more difficult for L2 learners. For

example, learners were more accurate in SV Inf collocations than in SVO Inf

collocations, and their use of SV Inf collocations increased later than the use of SV

to Inf collocations. Also, grammatically complex and infrequent collocation types

were avoided by the L2 learners in this study, e.g. students showed no evidence of

acquisition of SV Possessive V-ing collocations. With regard to lexical collocations,

'complexity' in terms of arbitrariness, unpredictability and idiomaticity seemed to

influence their acquisition by L2 learners, e.g. subjects were less accurate with fixed

(not free), arbitrary, and unpredictable Verb Noun lexical collocations. Idiomaticity

and arbitrariness have been previously found to affect the acquisition of individual

words too (for a review see Laufer 1990b). Also, in this study, those collocation

types, grammatical and lexical, that were early acquired, i.e. SVc and Adjective


Noun, represent collocations that are structurally 'salient' and need to be produced

correctly for effective communication due to their high frequency in everyday

speech. Similar results with respect to these two structures were reported by

Fathman (1977) in her study of the acquisition of grammatical structures.

Also, there has been suggestive evidence that the order in which the parts of

a certain collocation type combine can influence the degree of regularity of the

collocations represented by that particular type. This has also been found to affect

the degree of difficulty of acquisition for that particular type. For example, Prep

Noun collocations, e.g. 'on Sundays', 'at 7:06', have been found to be more regular (i.e.

rule-governed) and hence easier to acquire than Noun Prep collocations, e.g. 'degree

in', 'attitude towards', 'skills in', which are unpredictable (i.e. associative).

There is also evidence that the degree of L1-L2 difference influences the

salience and consequently the acquisition of certain collocation types. For

example, collocation types that were structurally different from their equivalents

in the subjects' L1 were more difficult to translate: Type 5. Adjective Prep

collocations that correspond to 'Verb Determiner Noun' collocations in Greek are a

case in point, e.g. the Greek equivalent of 'I am bored with school' is

'Βαριέμαι το σχολείο' [bore[Middle Voice Verb] the school].

Finally, for a number of collocation types, knowledge develops as overall

language proficiency increases, i.e. the subjects' accuracy and production of

collocations were influenced by their overall language proficiency, and the most

proficient students performed with greater accuracy in the translation and the


blank-filling tests than the other two groups. By and large, the greatest difference

in performance appeared to be between Groups 1 and 3, which also suggests that

maturation, in terms of language proficiency and age, affects the development of

collocational knowledge.

The following model summarises schematically the factors affecting the

development of collocational knowledge:

[Figure 17: schematic diagram. Recoverable labels: Language Proficiency; Maturation; Instruction; Saliency; L1-L2 Difference; Fixed/Arbitrary, Unpredictable; Lexical Collocations; Regular/Salient; Collocational Knowledge; Complex; Grammatical Collocations; Salient.]

Figure 17. Model of the Development of Collocational Knowledge

According to this model, collocational knowledge develops as overall

language proficiency develops, as students become more mature, and as more

exposure to collocations takes place. The development of collocational knowledge

is influenced by the 'salience' of the particular collocation types. Grammatical

collocations that are simple and frequent in everyday speech are early acquired.


The more complex structures are acquired later. Lexical collocations are more

difficult to acquire than the simple grammatical collocations. They are

syntactically simple (e.g. Noun Verb, Verb Noun, Noun Prep, Prep Noun, Verb

Adverb), but their acquisition is affected by other factors of 'semantic complexity',

e.g. arbitrariness, predictability and idiomaticity, i.e. the more fixed and idiomatic

they are, the more difficult they are to acquire.

Also, based on these results, a continuum of collocational knowledge and

language proficiency can be described. Beginning students (Group 1) are able to

produce simple grammatical collocations; they are more accurate on lexical

collocations than complex grammatical collocations, but their overall accuracy is

low. This can be interpreted as evidence that these students use lexical collocations

as unanalysed blocks of language that they have memorised, and because their

grammatical competence is not yet well-developed, they are less accurate with the

more structurally demanding grammatical collocations. The fact that they can

memorise lexical collocations more than grammatical ones could be due to the

saliency of lexical collocations in terms of length of the collocational strings, i.e.

most of the lexical collocation types consist of two words (Verb Adverb, Adjective

Noun, Noun Prep, Adjective Prep) so they are easier to remember. Grammatical

collocations, on the other hand, are longer and as such harder to memorise.

At the intermediate level (Group 2), students become more accurate with

the more complex grammatical collocations as their grammatical competence


increases, but their accuracy on lexical collocations and their overall accuracy do

not improve.

As students reach a higher level of proficiency, post-intermediate (Group 3),

their overall accuracy in collocations (both lexical and grammatical) increases

considerably, and they once again show greater accuracy on lexical collocations,

indicating a richer vocabulary. Previous research has also shown that more

advanced learners have more lexical and syntactic tools when they approach a

language learning task (Ferris 1991, 1994).

A similar step-by-step model of L1 acquisition is described by Berman

(1986). According to Berman's model, children in acquiring their L1 go through

three main phases:

(a) a PREGRAMMATICAL phase... where children's knowledge is largely

item-bound...; (b) the phase of GRAMMAR ACQUISITION..., where rules

are applied productively across items in terms of linguistic structure, and

items are interrelated within more general systems, categories and

paradigms; and (c) a final phase of APPROPRIATE USAGE where the

repertoire of forms and rules acquired previously are deployed with

increasing skill.

(Berman 1986:193).

The beginners' stage is similar to Berman's pregrammatical phase: they learn

collocations as lexicalised items. At the intermediate level, learners are at the

phase of grammar acquisition: they apply rules productively, increasing their


knowledge of grammatical collocations. At the post-intermediate level, students

are approaching Berman's final phase of appropriate usage: their overall

knowledge of collocations increases for both grammatical and lexical collocations.

Since collocations are one of the key building blocks of language, it is not

surprising that their acquisition follows a pattern similar to that of L1 acquisition.

Pienemann's Processability Model also provides a framework for

understanding the development of collocational knowledge. The first stage of

Pienemann's model consists of basic sentence structures and basic categories

(Pienemann 1996). This stage coincides with the initial stage of collocational

knowledge: learners acquire simple grammatical collocations and relatively free

lexical collocations that are basic and frequent in everyday speech. The second

stage of Pienemann's model contains extensions of the noun phrase, verb phrase,

and sentence. This is the stage where students become able to apply grammatical

rules productively and have a better understanding of the constituents of the

sentence, resulting in the use of more complex collocational strings. Stage 3 of the

Processability Model is characterised by the use of new categories which are filled

with lexical items. The third stage of collocational knowledge is also characterised

by a better command of both lexical and grammatical collocations and a preference for

lexical collocations, signifying a richer vocabulary. The roughly parallel stages between

Pienemann's Processability Model and the model of the development of

collocational knowledge described in this study underscore the existence of a


stage-by-stage development of collocational knowledge and its significance for the

overall development of L2 proficiency.

5.5 Summary of the Discussion

With regard to the main questions in this study - is there development of

collocational knowledge in L2 learners as their overall language proficiency

develops; and are there any differences in development between and within

proficiency levels? - the answer is affirmative. There is significant development of

collocational knowledge as overall language proficiency develops. Evidence has

been provided by both production (essay data) and knowledge of collocations

(translation and blank-filling data). The development of collocational knowledge

has been defined in terms of the differences in the use and knowledge of collocations

between and within three different proficiency levels: post-beginners,

intermediate, and post-intermediate.

This study also explored what possible factors can account for the

acquisition of English collocations by L2 learners, and whether there are

identifiable patterns of acquisition of that part of vocabulary previously described

as 'ruleless'. As with most developmental studies, the main emphasis has

been on describing the emerging patterns of acquisition of English collocations.

The large number of structures examined by this study has led to the emergence of

a number of different patterns of acquisition. Where possible, explanations


pertaining to theories of second language acquisition have been provided with

regard to specific patterns of acquisition. The present study has shown that an

overall explanation of lexical acquisition may require a modular theory of

language acquisition with different modules on the grammatical complexity,

learnability, processability, and developmental order of the different collocational

structures.

The ultimate aim of this study has been to shed light on the acquisition of

collocations, which are considered an important aspect of L2 acquisition.

In the next section, some pedagogical implications of the results of the

present study are given. It is hoped that the data can also provide language

instructors with an anchor point in the teaching of English collocations.

5.6 Pedagogical Implications

The main goal of this study has been to investigate the acquisition of L2

collocations. L2 learners have been tested on how their collocational knowledge

develops. Overall, results show that students from the three proficiency levels

tested were not very accurate in either the translation or the blank-filling tests.

This is indicative of the L2 learners' general weakness in producing acceptable

collocations noted by other researchers, and of the need to provide L2 learners

with help for the improvement of their collocational knowledge.


The subjects in this study did not receive explicit teaching on collocations.

The teachers' questionnaire (see Appendix F) showed that the teachers did not

emphasise either the importance of collocations to their students, or the use of

other resources in learning collocations. The teachers also agreed that the

treatment of vocabulary in TWE is inadequate. The results of this project reveal

certain weaknesses and needs on the part of L2 learners, and ways to utilise these

results in L2 classrooms are suggested below.

The results provide useful information as to how collocational knowledge

develops in L2 learners. Such information can be used for improving the treatment

of collocations in ESL syllabuses. The knowledge of which collocation types are

acquired early in L2 learning, and which are acquired later, can help syllabus

designers order the presentation of collocations to promote a step-by-step

development of collocational knowledge.

Specific collocational problems for L2 learners have also been identified.

Students from all the proficiency levels had difficulties with lexical collocations

that are fairly fixed and arbitrary (not predictable) in English, e.g. Verb Noun

collocations such as 'draw conclusions', 'earn a living', 'take shorthand', 'call a

penalty'. Such lexical combinations require specific collocational knowledge and

native-like ability. L2 learners have no means of telling which words collocate

with which unless they are specifically taught about such collocations.

The findings can also be used as a guide to help teachers decide how to

handle the teaching of collocations in their classroom, e.g. teaching early acquired


types before late acquired types, or more regular and frequently used collocations

before more fixed and idiomatic ones. Also, by analysing teaching materials (e.g.

readings) with respect to which collocation types they contain, teachers can assess

the different teaching materials to be used with the different proficiency levels.
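
The kind of materials analysis suggested above can be sketched as a simple count of collocation types in a text. This is only an illustration under stated assumptions: the function name, the hand-made list of collocations with their type labels, and the sample text are hypothetical, and the matching is literal, so inflected forms such as 'drew conclusions' would not be found.

```python
import re
from collections import Counter


def count_collocation_types(text, collocations):
    """Count occurrences of listed collocations in a text, grouped by type.

    `collocations` maps a collocation string to a type label. Matching is
    case-insensitive and literal (whole words, no inflected variants).
    """
    counts = Counter()
    lowered = text.lower()
    for phrase, collocation_type in collocations.items():
        pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
        counts[collocation_type] += len(re.findall(pattern, lowered))
    return counts


# Hypothetical teacher-supplied list and reading passage.
listed = {
    "earn a living": "Verb Noun",
    "draw conclusions": "Verb Noun",
    "work hard": "Verb Adverb",
}
passage = ("She had to earn a living. They work hard and draw conclusions; "
           "we draw conclusions too.")
type_counts = count_collocation_types(passage, listed)
```

Comparing such counts across readings would give a rough indication of which collocation types a given text exposes students to.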

Making teachers aware of the importance of collocations is not enough.

Students also need to become aware of collocations and develop strategies for their

acquisition. By raising the students' awareness of the existence of collocations and

their usefulness in L2 learning, teachers can help students take note of the

collocations they come across and make more effective use of them. Students

should become aware that words do not occur in isolation, but in combination with

other words. Increased awareness of, and attempts to use, communicatively

redundant grammatical structures may also lead to faster rates of acquisition and

possibly higher levels of L2 attainment (Long 1988:120).

The present study also showed that the L1 can influence the learners'

knowledge of collocations, especially lexical collocations, that are different from

their equivalent collocations in the learners' L1. For example, Adjective Preposition

collocations, such as 'afraid of [snakes]', are Verb Det Noun collocations in Greek,

'φοβάμαι τα φίδια' [afraid-[Middle Voice Verb] the snakes]. As a result, the Greek

learners in this study often translated the Adjective Preposition collocations leaving

out the preposition. Also, in coping with arbitrary Verb Noun collocations, such as

'draw conclusions', 'take an examination', 'earn a living', subjects seemed to use

their knowledge of Greek, e.g. *'take out conclusions' 'βγάζω συμπεράσματα'


[take out conclusions], *'give an examination' 'δίνω εξετάσεις' [give

examinations], *'take out a living' 'βγάζω το ψωμί μου' [take out my bread].

Such differences between the L1 and the L2 should be pointed out to the L2

learners, and L2 learners should be encouraged to practise and use such

collocations in order to sound more idiomatic in the target language.

The accuracy orders reported here may also be relevant as a starting point

for an index of L2 development (see Larsen-Freeman 1978b, 1978c). That is, the

students' language proficiency can be determined according to which collocation

types they have acquired. Such an index of development can also be used for

designing language testing materials, and for the placement of students in a

suitable proficiency level.
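
How such an index might work can be sketched as follows. The sketch assumes that a type counts as acquired when accuracy reaches a criterion of 80 per cent, and that the index is the number of types acquired in sequence from the easiest type onwards; the criterion, the function name, the order and the accuracy figures are all illustrative assumptions rather than results of this study.

```python
def development_index(accuracy_by_type, implicational_order, criterion=0.8):
    """Number of types acquired in sequence, starting from the easiest.

    `implicational_order` lists collocation types from easiest to hardest.
    Counting stops at the first type whose accuracy is below `criterion`.
    """
    index = 0
    for collocation_type in implicational_order:
        if accuracy_by_type.get(collocation_type, 0.0) >= criterion:
            index += 1
        else:
            break
    return index


# Hypothetical order and learner accuracies (not data from the study).
order = ["SVc", "Adjective Noun", "Verb Noun"]
learner = {"SVc": 0.95, "Adjective Noun": 0.85, "Verb Noun": 0.40}
index = development_index(learner, order)
```

A placement procedure could then assign learners with similar index values to the same proficiency level.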


CHAPTER 6

CONCLUSIONS

6.1 Summary and Conclusions

This study has investigated the development of English collocational

knowledge in three different proficiency levels - post-beginners, intermediate and

post-intermediate - of 275 Greek learners of ESL. Three tests measuring the

learners’ knowledge of collocations were used: essay writing, a translation test and

a blank-filling test. The essay writing measured free production of collocations,

and the translation and blank-filling tests were measures of cued production.

Evidence was sought for the development of collocational knowledge between and

within the three proficiency groups. Results revealed that there are specific

patterns of development across and within the three different groups.

Collocational knowledge increased steadily as the overall language proficiency

increased, and the development of collocational knowledge was found to be

influenced by the frequency of the input, the L1-L2 difference, the overall language

proficiency, and the 'saliency' of the collocation types. Grammatical and lexical

collocations that were simple and frequent in everyday use of English were

acquired early and the more complex grammatical collocations were acquired


later. Lexical collocations that were idiomatic, fixed and/or unpredictable were

more difficult than those that were less arbitrary and more rule-bound. Finally, the

development of collocational knowledge in terms of the three proficiency levels

can be described as follows: Post-beginner students have already acquired the

simple and frequent grammatical collocations, e.g. SVc; they use few types of

collocation and a large number of tokens for some of them; they are more accurate

with regard to lexical collocations than complex grammatical collocations, but their

overall accuracy is very low. At the intermediate level, students use more

collocation types and they use both simple and complex grammatical collocations,

but their overall accuracy does not improve. At the post-intermediate level,

students become more accurate with respect to grammatical, both simple and

complex, and lexical collocations, and their collocational knowledge is significantly

advanced.

From a theoretical point of view, the present study developed a

classification of the various studies on collocations into three major approaches:

the lexical composition, the semantic, and the structural approach. Each approach has

been critically reviewed to reveal its strengths and weaknesses for the study of

collocations.

The systematic use of a classification system for classifying collocations

makes the replication of this study possible. If this classification system is used in

future studies on collocations, it will enable a comparison of the results, and

support a systematic contribution to how collocational ability develops.


The empirical contribution of this study lies in the use of the different

elicitation instruments and the analyses of the data. The detailed description of the

construction of the battery of tests used for the collection of data (Chapter 3), as

well as their strengths and weaknesses (see next section), can be used as a guide for

designing future studies on collocations and developing more sensitive and

effective elicitation instruments.

The analysis performed on the data is an improvement over analyses in

other developmental studies, i.e. studies on the order of acquisition of morphemes.

It shows not only the order of acquisition of collocational types, but also the

strength of the relationship of the items on the implicational order.
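
The idea of measuring how well items fit an implicational order can be illustrated with a coefficient of reproducibility of the kind used in implicational (Guttman) scaling. The sketch below is an assumption about how such a measure could be computed, not a description of the analysis carried out in this study; the function name and the toy matrix are hypothetical, and errors are counted as deviations from the ideal pattern implied by each learner's total score, which is one common convention among several.

```python
def coefficient_of_reproducibility(matrix):
    """Return (column order, coefficient) for a binary learner-by-type matrix.

    Each row is one learner; each cell is 1 if the learner has acquired the
    collocation type in that column and 0 otherwise.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    # Order columns from easiest (most learners succeed) to hardest.
    totals = [sum(row[col] for row in matrix) for col in range(n_cols)]
    order = sorted(range(n_cols), key=lambda col: -totals[col])
    errors = 0
    for row in matrix:
        score = sum(row)
        # In a perfect scale, a learner with score k has acquired exactly
        # the k easiest types and none of the others.
        for rank, col in enumerate(order):
            expected = 1 if rank < score else 0
            if row[col] != expected:
                errors += 1
    return order, 1 - errors / (n_rows * n_cols)


# Invented data: columns could stand for SVc, Adjective Noun, Verb Noun.
toy = [
    [1, 1, 1],
    [1, 0, 1],  # deviates from the ideal pattern [1, 1, 0]
    [1, 0, 0],
    [1, 1, 0],
]
column_order, coefficient = coefficient_of_reproducibility(toy)
```

The closer the coefficient is to 1, the more consistently the learners' performance follows a single implicational order.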

From a pedagogical point of view, this study provides a picture of how

English collocational knowledge develops in ESL learners. Knowing how

collocations are acquired is fundamental for devising ways of teaching them and

strategies for learning them.

It was the aim of this study to investigate the development of collocational

knowledge in L2 acquisition, and to provide a starting point towards unravelling

the acquisition process of English collocations. A model for the development of

collocational knowledge has been suggested, and the possible factors affecting the

various stages of collocational knowledge have been examined. Hopefully, the

study of collocations will continue in the future. Further studies should reveal a

more detailed picture of the development of collocational knowledge in L2

learners, with important implications for L2 theory and instruction.


6.2 Directions for Further Research

This study used syntactic structures in defining and operationalising

collocational knowledge, which is traditionally considered an area of lexical

acquisition. The results suggest that defining collocational types syntactically is a

valid approach in the examination of collocational development, especially with

grammatical collocations. The description of the acquisition of lexical collocations,

however, requires further refinement using semantic information. As has

already been mentioned in the discussion of the results (Chapter 5), lexical

collocations are syntactically simple, i.e. they are usually combinations of two

words such as Verb Noun, Adjective Prep, Noun Prep, Verb Adverb, but their

acquisition is influenced by other factors. For example, lexical collocations that

belonged to the same collocation type were found to vary in difficulty, e.g. subjects

had more difficulties with the collocation 'think highly' than with 'work hard' even

though both collocations belonged to the same collocation type, Verb Adverb.

'Think highly' is more idiomatic than 'work hard' and as such it was more difficult

for the ESL learners in this study. Future researchers should be aware that the

acquisition of syntactic forms is a necessary but not sufficient condition for the

development of collocational knowledge, especially with regard to lexical

collocations.


The translation test revealed strong differences in the development of

collocational knowledge between and within proficiency levels. One of the

advantages of translation, as opposed to a blank-filling test, is that it enables the

testing of grammatical collocations as well as lexical ones. However, translation

has proved to be difficult for both beginning and more advanced L2 learners.

Furthermore, there is evidence that it promotes L1 interference in the students'

production. Future research on collocations should take the above limitations into

account before deciding on the use of a translation test.

The blank-filling test for this study contained more lexical than grammatical

collocations, mainly because grammatical collocations are more difficult to test in a

blank-filling test. For example, testing SVO to O collocations in a blank-filling test

creates the problem of where to put the blank space without making the

collocation too general or too obvious. Even though the blank-filling test showed

that most of the differences in accuracy reflect language proficiency, the accuracy

orders were weak. This could be due to the fact that the majority of the test items

tested lexical collocations. Lexical collocations, as already discussed above, are

influenced by semantic factors as well as syntactic ones. Therefore, the students'

performance on the blank-filling test was not consistent enough to produce a

reliable accuracy order, as the students' accuracy of responses reflected not only

their knowledge of the particular collocational type, but also which particular

words were required for the particular lexical collocations. Research on

collocations is in need of a reliable instrument to elicit information on a wider


range of collocational knowledge. For example, future research might examine the

development of collocational knowledge in a two-fold way, i.e. development with

respect to lexical collocations, controlling collocations for formality, frequency of

occurrence and idiomaticity, and development with respect to grammatical

collocations, controlling for grammatical complexity.

Although the present study did not set out to determine the extent to which

syllabuses influence the acquisition and the rate of acquisition of collocations, it

has provided evidence that the frequency of occurrence of collocations in L2

textbooks influences their acquisition (see Long 1988). That is, the more students

were exposed to a particular collocation type, the more accurately they used it.

Future research can test this result by controlling for number of exposures to given

collocations in an experimental condition. One question of interest is how much

exposure to collocations accounts for acquisition orders. This would help identify

the optimal instruction conditions leading to the acquisition of collocations (see

also Chaudron 1988; Sheen 1994). Also, it will be useful to determine whether

instruction can change the order of acquisition, i.e. whether emphasis of exposure

on some types of collocation will produce a change in the acquisition orders

obtained in this study, or whether classroom instruction affects only the rate of

acquisition but not the order of acquisition of collocations (see also Ellis 1989).

Long (1988) also underscores the need for research on collocational ability

achievable with and without instruction.


In this study, essay writing revealed a number of interesting results with

respect to the use of collocations. Subjects were controlled with respect to

variables such as age, formal education, English proficiency, first language

background, and knowledge of vocabulary. Unlike previous studies on

collocations, subjects in the present study were tested on their knowledge of

collocations already taught to them. The collocations included in the translation

and the blank-filling tests were taken from the subjects' textbooks. This ensured

that the subjects were tested on knowledge of collocations already presented to

them. The topics of the essays were also chosen with the subjects' textbooks in

mind. This ensured that subjects from all proficiency levels could perform

successfully in the essay composition and produce those collocations that they had

acquired and felt comfortable with using. However, the use of specific topics has

been shown to promote the use of specific collocation types, such as a large

number of SVc and Adjective Noun constructions in the essays by subjects in Group

1. Future research could investigate the performance of different proficiency levels

in essay writing, using the same topic for all proficiency levels. In this way, any

influences of the essay topic on the use of collocations would be equal for all levels.

The present study has concentrated on accuracy in the use and knowledge

of collocations. The analysis of collocational errors was not part of this study.

However, future research could investigate the misuse of collocations by L2

learners, the possible causes leading to collocational errors, and ways to remedy

them. The use of a corpus-based dictionary could also provide future researchers


with information as to whether collocational misuse is greater with infrequent

collocations or not. Note that the BBI does not provide frequency information.

Further research is also needed on how collocational knowledge develops in

native speakers of English. Such information can be used to compare the routes of

development by L2 learners and native speakers in the acquisition of English

collocations. Also, research in the development of collocational knowledge by

learners from different L1 backgrounds would reveal whether the accuracy orders

found in this study are L1-neutral. A comparison of the collocational errors would

yield important information about the extent of the influence of L1 in the

development of collocational knowledge in L2 learners.

The classification system used in this study has proved to be useful for a

systematic categorisation of the collocations found in the students' essays. Some

types, though, need some fine-tuning. For example, Type 15. SVO to Infinitive, as

it is used in the BBI, implies that the object of the main verb is the subject of the

infinitive, e.g. 'she told him to leave'. There can be cases, though, in which the

subject of the main verb is also the subject of the infinitive, e.g. 'she used the knife

to cut the bread'. In the present study both examples would be classified under the

same type. However, future research could use a different type of collocation for

the second example, e.g. 'SVO to Inf O' or 'SVO to Inf NP' (NP = Noun Phrase).

Such fine-tuning may yield more sensitive differences in collocational performance

among learners from different language proficiency levels.


Studies on collocations to date have concentrated on written data. It would

be interesting also to investigate L2 learners' use of collocations in oral production.

By using the classification system employed by the present study, L2 learners' oral

production data could be analysed in a similar way to reveal acquisition orders

and development of collocational knowledge. These orders could then be

compared with the ones found in this study and reveal helpful information as to

whether collocational knowledge in L2 writing and speech develops in similar or

different ways.

The above are selected directions for future research on collocations. The

development of collocational knowledge in L2 learners is far from being

exhaustively described. More work is needed in the area of lexical acquisition both

for theoretical and pedagogical reasons as it has proved to be a profitable avenue

for inquiry in the study of L2 acquisition.