“Geiriau Saesneg yn slipio i fewn”: Investigating the integration of English-origin verbs in Welsh
Jonathan Stammers, 8 March 2010, Bilingualism Centre
Overview
The Siarad corpus
Code-switching vs. borrowing controversy
Poplack approach: “Nonce Borrowing”
English-origin verbs in Welsh
Analysis: Soft mutation on verbs (2 attempts)
Dealing with word frequency effects
Summary
The Siarad Corpus
40 hours of Welsh/English bilingual speech recorded & fully transcribed in CHAT format
69 naturalistic recordings of informal conversations, typically between 2 speakers, & 30 minutes long
151 speakers of varying age, sex and background
456,266 words (tokens)
Every word tagged for language
Recordings & transcription done by project team (Elen Robert, Peredur Davies, Marika Fusser & myself; Margaret Deuchar – project director)
Freely available to researchers online
Examples in Siarad: Borrowings?
ond mae o mor cheesy mae’n funny yndy ?
“but it’s so cheesy it’s funny isn’t it?” [Fusser29:217]
hynna ’dy’r exam dw i gorod eistedd fory
“that’s the exam I have to sit tomorrow.” [Stammers6: 1273]
Code-switching or Borrowing? Criteria:
Criterion | Borrowing | Code-mixing
no more than one word | + | -
adaptation: phonological | ±/+ | ±/-
adaptation: morphological | + | -
adaptation: syntactic | + | -
frequent use | + | -
replaces own word | + | -
recognised as own word | + | -
semantic change | + | -
(Muysken 2000: 73)
Additional criteria suggested:
“Core/Cultural” distinction: “Cultural” items are not switches
Flagging: self-correction, repetition, hesitation or stammering flags up a switch
Dictionary
Poplack’s approach
Code-switching and borrowing can be distinguished absolutely
“Free morpheme constraint”: no word-internal switching
Variationist approach: comparing morpho-syntactic patterning of donor-language items with native items
“Nonce Borrowing hypothesis”
The Nonce Borrowing hypothesis
“One of the goals of these studies is to develop operational criteria for distinguishing loanwords from codeswitches. Thus, for the Puerto Rican data, a working hypothesis was that loanwords from English were phonologically, morphologically, and syntactically integrated into Spanish, were recurrent and widespread, and that an English word not satisfying these criteria could only occur in English monolingual discourse or in code-switches from Spanish to English. In general, however, borrowing is a much more productive process and is not bound by all of these constraints. In particular, phonological integration and the “social” characteristics of recurrence (in the speech of an individual) and distribution (across the community) need not be satisfied. This type of borrowing is sometimes called “nonce” borrowing.”
(Sankoff, Poplack & Vanniarajan 1990: 74)
Study | Language pair studied | Elements analysed | Linguistic features studied in analyses | Conclusion
Sankoff, Poplack & Vanniarajan 1990 | Tamil-English | Lone English nouns | Case inflections | All are borrowings
Poplack & Meechan 1995 | Wolof-French; Fongbe-French | Lone French nouns | Definite/indefinite reference; NP word order | All are borrowings
Adalar & Tagliamonte 1998 | Turkish-English | Lone English nouns | Vowel harmony; plural affixation; NP word order | All are borrowings
Budzhak-Jones 1998 | Ukrainian-English | Lone English nouns | Case inflections | All are borrowings
Eze 1998 | Igbo-English | Lone English verbs; lone English nouns | Affix distribution; serial constructions; vowel harmony (verbs); determiners; type of nominal reference; NP word order (nouns) | All are borrowings
Samar & Meechan 1998 | Persian-English | Lone English nouns | Definite/indefinite reference; VP word order; case inflections | All are borrowings
Turpin 1998 | (Acadian) French-English | Lone English nouns | Determiners; NP word order; plural marking; discourse flagging | Most are borrowings; minority are switches
Arroyo & Tricker 2000 | Catalan-Spanish | Lone Spanish nouns | Definite/indefinite reference; plural marking; gender | All are borrowings
Shin 2002 | Korean-English | Lone English nouns | Case inflections | All are borrowings
Cacoullos & Aaron 2003 | Spanish-English | Lone English nouns | Determiners | All are borrowings
English verb insertions (1)
More “established English borrowings”:
pasio (to pass), trio (to try), setlo (to settle), canslo (to cancel), meindio (to mind), cysidro (to consider)
sut mae o’n cope-io efo (.) hynna i gyd?
“how is he coping with all that?” [Fusser29:635]
pan dach chi’n defnyddio wide-angle lenses dach chi’n emphasize-io ’r foreground.
“when you use wide-angle lenses, you emphasize the foreground.” [Fusser17: 792]
English verb insertions (2)
bysai hi’m ’di gwisgo helmet ’sai pen hi ’di cael ei crush-o to bits
“if she hadn’t worn a helmet, her head would have been crushed to bits.” [Robert3: 898]
a mae ’di cael ei ºgonnect-io i’r printer yr computer, de
“and it’s been connected to the computer printer, right.” [Roberts2: 627]
English verb insertions (3)
anyway, ges i ’yn gazump-io ar hwnna
“anyway, I got gazumped on that one” [Fusser29:700]
maen nhw’n (.) exfoliate-io chdi gynta (.) ac yn spwnjo chi drosodd gynta
“they exfoliate you first, and sponge you over first” [Fusser30:27]
Soft Mutation in Welsh
Soft mutation on verbs: Environments (1)
After "i" particle
e.g. oedd e’n mynd i ºgostio pres [Fusser6:524]
After "ei" possessive (with masculine subject)
e.g. fyswn i licio ei ºfenthyg o [Fusser9:375]
After various other particles: heb, am, cyn, gan, ar, neu; dy possessive
e.g. sut mae o am ºfihafio [Fusser15:510]
Soft mutation on verbs: Environments (2)
With gwneud (or ddaru) auxiliary + Subject
e.g. wnest ti ºdrio? [Stammers5:708]
After "i" + (non-overt) Subject
e.g. mae’n gwneud i chdi ºgofio rywbeth dydy? [Stammers7:139]
After Finite Verb + Subject
e.g. sut fedra i ºddeud? [Fusser4:257]
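The examples in these environments all show the same alternation on the verb's initial consonant (costio → gostio, benthyg → fenthyg, trio → drio, deud → ddeud). A minimal sketch of that mapping, using the standard soft-mutation correspondences of Welsh orthography. The function name `soft_mutate` is hypothetical, and the sketch works on spelling only, so it makes no attempt to handle English spellings such as initial "k":

```python
# Standard Welsh soft-mutation correspondences, keyed by initial letter(s).
SOFT_MUTATION = {
    "p": "b", "t": "d", "c": "g",
    "b": "f", "d": "dd", "g": "",
    "m": "f", "ll": "l", "rh": "r",
}
# Initial digraphs that are single consonants and do not soft-mutate.
NON_MUTATING_DIGRAPHS = ("ch", "dd", "th", "ph")


def soft_mutate(word: str) -> str:
    """Return the soft-mutated spelling of a word (hypothetical helper)."""
    lower = word.lower()
    first_two = lower[:2]
    if first_two in SOFT_MUTATION:          # ll -> l, rh -> r
        return SOFT_MUTATION[first_two] + word[2:]
    if first_two in NON_MUTATING_DIGRAPHS:  # leave ch-, dd-, th-, ph- alone
        return word
    first = lower[:1]
    if first in SOFT_MUTATION:              # p, t, c, b, d, g, m
        return SOFT_MUTATION[first] + word[1:]
    return word                             # vowels and other consonants


for radical in ["costio", "benthyg", "bihafio", "trio", "cofio", "deud"]:
    print(radical, "->", soft_mutate(radical))
```

The same function applied to an unlisted English-origin verb such as connect-io gives gonnect-io, the form attested in the corpus example above.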
Soft Mutation: Variation
E.g. Welsh verb “cerdded” (to walk):
a maen nhw’n mynd i ºgerdded am tua dwy, dair milltir
“and they’re going to walk for 2 or 3 miles” [Roberts2: 32]
But frequently mutation doesn’t happen where expected (especially in informal spoken Welsh):
a (.) does dim byd i poeni amdano
“and there’s nothing to worry about” [Fusser14: 40]
Three groups of verbs compared in this study (1st Analysis):
Native Welsh: cofio (remember), defnyddio (use), cwyno (complain), pwyso (push), cneifio (shear), treiglo (mutate), twtio (tidy), talu (pay), penderfynu (decide), poeni (worry), lladd (kill), cwrdd (meet), cau (close), dal (hold), dechrau (start), cael (have), mynd (go), gweld (see) [irregular verbs and non –(i)o suffix included]
Listed English: trio, cario, clirio, dreifio, sbwnjo, clariffeio, pinsio, bargeinio, pipo, dipio, trotio, manejio, tsiecio, titso, protestio, cidnapio, twtsiad, dripian [non –(i)o suffix included]
Unlisted English: text-io, download-io, brief-io, quote-io, bulk-io, ban-io, bypass-io, crush-o, trample-o, base-io, connect-io, babysit-io, decorate-io, concentrate-io, mollycoddle-io, power-walk-io
Method
Text-based searches through corpus (and using word frequency lists) for possible verbs, extracting examples where mutation expected (and where consonant can be mutated!)
Coded each verb as mutated or not
First attempt: used a random sampling technique to find the native Welsh verbs
Results (First Attempt): (1)
Absolute freq. | Freq./million words | % mut. overall | Avg. freq. | log(avg. freq.)
1-4 | 1-9 | 34.69% | 2.21 | 0.3452
5-45 | 10-99 | 52.68% | 16.44 | 1.2160
46-450 | 100-999 | 75.29% | 161.56 | 2.2083
451-4500 | 1000-9999 | 89.63% | 1962.67 | 3.2928
Correlation coefficient with overall % mutation: 0.7752 (avg. freq.); 0.9936 (log avg. freq.)
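The two correlation coefficients can be checked directly from the four grouped rows. The sketch below assumes they are Pearson's r between the % mutation column and, respectively, the average frequency and its base-10 log; the slides do not name the statistic, and `pearson_r` is a helper written for this illustration:

```python
import math

# Grouped values from the first-attempt table (one entry per frequency band).
pct_mutation = [34.69, 52.68, 75.29, 89.63]
avg_freq = [2.21, 16.44, 161.56, 1962.67]
log_avg_freq = [0.3452, 1.2160, 2.2083, 3.2928]


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


r_raw = pearson_r(avg_freq, pct_mutation)      # raw average frequency
r_log = pearson_r(log_avg_freq, pct_mutation)  # log of average frequency
print(f"raw: {r_raw:.4f}  log: {r_log:.4f}")
```

Under that assumption the results should come out close to the reported 0.7752 and 0.9936, which is the sense in which the relationship is much closer to linear on the log scale than on the raw scale.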
[Figure: % mutation where expected (0-100) against frequency per million words of verb (grouped data: 1-9, 10-99, 100-999, 1000-9999)]
Analysis: 1st & 2nd Attempts

Corpus
Earlier analysis: 46-transcript subset (66%) of Siarad corpus; 301,072 word tokens
Later analysis: whole Siarad corpus (69 transcripts; 456,266 word tokens)

Instances selected (where soft mutation expected)
Earlier analysis: all English-origin verbs (with any suffix or none); sample of native tokens: 5 randomly distributed tokens per transcript of any Welsh verbs, including irregular verbs; 466 tokens altogether (230 native Welsh; 198 listed English; 38 unlisted English)
Later analysis: all English-origin and native tokens ending in the –(i)o suffix (regular verbs only); 506 tokens altogether (143 native Welsh; 302 listed English; 61 unlisted English)

No. of verb types (overall and by verb status and frequency band)
Earlier analysis: 147 types overall; native Welsh: 65; listed English: 62; unlisted English: 20; 1-9 words per million: 42; 10-99: 54; 100-999: 41; 1000-9999: 10
Later analysis: 159 types overall; native Welsh: 44; listed English: 81; unlisted English: 34; 1-9 words per million: 79; 10-99: 72; 100-999: 7; 1000-9999: 1

Initial consonants
Earlier analysis: verbs starting with /p/, /t/, /k/, /b/, /d/, /m/, /ɬ/ and /g/ included; /rʰ/ excluded
Later analysis: verbs starting with /p/, /t/, /k/, /b/, /d/ and /m/ included; /ɬ/, /g/ and /rʰ/ excluded
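The frequency bands used throughout (1-9, 10-99, 100-999, 1000-9999 words per million) are orders of magnitude of normalised frequency. A minimal sketch of the conversion from a raw token count to a band, assuming counts are normalised by the whole-corpus size of 456,266 tokens; the slides do not state the denominator, and the function names are hypothetical:

```python
import math

CORPUS_TOKENS = 456_266  # whole Siarad corpus (assumed denominator)


def per_million(count: int, corpus_tokens: int = CORPUS_TOKENS) -> float:
    """Normalise a raw token count to occurrences per million words."""
    return count / corpus_tokens * 1_000_000


def frequency_band(count: int, corpus_tokens: int = CORPUS_TOKENS) -> str:
    """Label the order-of-magnitude band of a verb's normalised frequency."""
    exponent = math.floor(math.log10(per_million(count, corpus_tokens)))
    low = 10 ** exponent
    high = 10 ** (exponent + 1) - 1
    return f"{low}-{high}"


for count in (1, 4, 5, 45, 46, 450):
    print(count, round(per_million(count), 1), frequency_band(count))
```

With this denominator, raw counts of 1-4, 5-45 and 46-450 fall in the 1-9, 10-99 and 100-999 bands respectively, which roughly matches the absolute-frequency ranges given in the first-attempt results table.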
Other Possible Variables: (1) Mutation Environment
(A) "i" particle
(B) "gwneud" auxiliary + Subject
(C) "i" + (non-overt) Subject
(D) "ei" possessive
(E) Fin Verb + Subject
(F) other particle
[Figure: bar chart of % mutation where expected by mutation environment. Bar values, in the order A to F: 65.4, 60.5, 65.3, 58.1, 44.8, 62.5]
Other Possible Variables: (2) Initial Consonant
[Figure: bar chart of % mutation where expected by initial consonant of verb. Bar values, in the order b, d, m, p, t, k: 47.5, 59.3, 61.5, 55.8, 64, 68.8]
Three groups of verbs compared in this study (2nd Analysis):
Native Welsh: cofio (remember), defnyddio (use), cwyno (complain), pwyso (push), cneifio (shear), treiglo (mutate), twtio (tidy)
Listed English: trio, cario, clirio, dreifio, sbwnjo, clariffeio, pinsio, bargeinio, pipo, dipio, trotio, manejio, tsiecio, titso, protestio, cidnapio
Unlisted English: text-io, download-io, brief-io, quote-io, bulk-io, ban-io, bypass-io, crush-o, trample-o, base-io, connect-io, babysit-io, decorate-io, concentrate-io, mollycoddle-io, power-walk-io
[Figure: % mutation where expected (0-100) against word frequency per million words (grouped values: 1-9, 10-99, 100-999, 1000-9999)]
Results: Second Analysis
[Figure: bar chart of mutated vs. not mutated tokens by verb status; vertical axis 0-250]
Verb status | Mutated | Not mutated
Native | 73% | 27%
Listed Eng. | 66% | 34%
Unlisted Eng. | 16% | 84%
Results: First & Second Analyses

% Mutation by verb status
Earlier analysis: native Welsh 85.6%; listed English 61.1%; unlisted English 18.4%
Later analysis: native Welsh 72.7%; listed English 66.2%; unlisted English 16.4%

% Mutation by frequency band
Earlier analysis: 1-9 words per million 34.7%; 10-99 52.7%; 100-999 75.3%; 1000-9999 89.6%
Later analysis: 1-9 words per million 40.9%; 10-99 58.9%; 100-999 74.9%; 1000-9999 86.7%
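The percentages for the later analysis can be turned back into token counts using the group sizes given for that analysis (143 native Welsh, 302 listed English, 61 unlisted English tokens). The counts below are reconstructed by rounding, not figures reported in the slides, and the variable names are illustrative:

```python
# (tokens in group, reported % mutated) for the later analysis.
later_analysis = {
    "native Welsh": (143, 72.7),
    "listed English": (302, 66.2),
    "unlisted English": (61, 16.4),
}


def mutated_count(tokens: int, pct: float) -> int:
    """Nearest whole number of mutated tokens implied by a percentage."""
    return round(tokens * pct / 100)


counts = {
    status: mutated_count(tokens, pct)
    for status, (tokens, pct) in later_analysis.items()
}
total_tokens = sum(tokens for tokens, _ in later_analysis.values())
overall_pct = 100 * sum(counts.values()) / total_tokens

for status, mutated in counts.items():
    tokens = later_analysis[status][0]
    print(f"{status}: {mutated} of {tokens} mutated")
print(f"overall: {overall_pct:.1f}% of {total_tokens} tokens")
```

The reconstruction shows how much smaller the unlisted group is than the other two, which is worth bearing in mind when comparing the three mutation rates.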
Results (First Analysis)
Results (Second Analysis)
Statistical Testing: 1st & 2nd Analyses

Results of statistical testing (logistic regression) with raw frequency values
Earlier analysis: Raw frequency marginally significant (p=.044) or not quite significant (p=.072) as a predictor of mutation, depending upon baseline category. Differences between all verb categories significant, including between native Welsh and listed English (where p<.0005).
Later analysis: Raw frequency marginally significant (p=.042) or not at all significant (p=.682) as a predictor of mutation, depending upon baseline category. Differences between verb categories significant, except between native Welsh and listed English (where p=.174).

Results of statistical testing (logistic regression) with log values of frequency
Earlier analysis: Log frequency significant (p=.019 or .01 with native and listed English as baseline respectively) as a predictor of mutation. Differences between native Welsh and unlisted English, and between listed and unlisted English significant (p=.019 and .03, respectively), but difference between native Welsh and listed English not at all significant (p=.448).
Later analysis: Log frequency highly significant as a predictor of mutation (p=.001) with listed English as baseline, but not at all significant (p=.549) with unlisted English as baseline. Differences between native Welsh and unlisted English, and between listed and unlisted English highly significant (p=.001 and .005, respectively), but difference between native Welsh and listed English not at all significant (p=.186).
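The slides do not give the regression specification. Purely as a sketch, a logistic regression with log frequency and verb status as predictors, taking native Welsh as the baseline category, could be written as below. The coefficient symbols and the choice of predictors are assumptions made for illustration, not the author's stated model:

```latex
\operatorname{logit} P(\text{mutated})
  = \ln\frac{P(\text{mutated})}{1 - P(\text{mutated})}
  = \beta_0
  + \beta_1 \log_{10}(\text{freq})
  + \beta_2\,[\text{listed English}]
  + \beta_3\,[\text{unlisted English}]
```

Here the bracketed terms are 0/1 indicators of verb status. In a model of this form without interaction terms, the test on the frequency coefficient would not change with the baseline category, so the baseline-dependent p-values reported for frequency suggest the actual model may have included frequency-by-status interactions, which this sketch omits.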
Summary
English-origin verbs in Welsh are highly productive (–(i)o suffix); according to Poplack, they would almost certainly be considered a simple case of borrowing
A subset of them, identified by a dictionary criterion, was found to be significantly less integrated morpho-syntactically (with respect to soft mutation): these could be considered “switches”
Strong (log-linear) relationship between word frequency and rate of mutation
This goes against Poplack’s “nonce borrowing” hypothesis: “nonce” items pattern significantly differently from “established” items, based on either the dictionary criterion OR frequency