Upload
others
View
9
Download
0
Embed Size (px)
Citation preview
Cognitive Studies, 18(2), 320-328. (June 2011)
● Short Notes●
Effect of Repetition in Pronunciation Practiceon Retrieval of Nonsense Words
Katsumi Nagai
A nonsense word retrieval test was conducted to examine the effectiveness of two
common practices in language classrooms: repetition “with” a teacher and “after” a
teacher. Participants were asked to memorize two lists of bisyllabic test words by
reading the words aloud with and after the model presentation (repetition phase). Par-
ticipants heard test words and were expected to reject the “new” test words as “words
that were not repeated in the test phase”. The number of correct responses and par-
ticipants’ levels of confidence were analysed, and both ANOVA and ROC (Receiver
Operating Chracteristic) curve showed that repetition “with a teacher” significantly
surpassed “after a teacher”. These results demonstrate the advantage of pronunciation
practice simultaneously with a teacher.
Keywords: pronunciation practice(発音練習), repetition(反復), retrieval of words
(単語再認), Receiver Operating Characteristics curve(ROC曲線)
1. Introduction
Pronunciation practice has special importance
for language learning because speech sounds
form the basis of all languages. Foreign lan-
guage learners have been long encouraged to im-
prove their ‘naturalness’ measured against ‘na-
tive’ speakers of the target language. Conse-
quently, pronunciation practice occupies an im-
portant position in language classes when learn-
ers’ mispronunciation is expected to create un-
avoidable difficulties in communication (see re-
views in Stern, 1992; Howatt and Widdowson,
2004). Teaching plans typically include repeti-
tion of minimally-paired target sounds, words,
and phrases following the teacher’s explanation
and model pronunciation (Celce-Murcia, Brin-
ton, and Goodwin, 1996, p.310; Doff, 1988;
Kelly, 2000). However, a principal aim in pro-
nunciation practice can be also an improvement
Foreign Language Centre (Daikyo-centre), KagawaUniversity.
in ‘intelligibility’ — the degree to which the
learner’s speech can be understood by others,
both receptively and productively. This view-
point is free from disagreements over the defini-
tions of ‘good’ and ‘correct’ accent when ‘get-
ting meaning across to the listeners’ is set as
the goal of pronunciation practice (Kenworthy,
1987; Dalton and Seidlhofer, 1994). Despite
persistent efforts of teachers and researchers, it
seems there is a long way to go before standard-
ized scales of ‘naturalness’ and ‘intelligibility’ are
available to language learners (see examples in
Miwa, Sasaki, and Tanno, 2000; Aoyama, Flege,
Guion, Akahane-Yamada, and Yamada, 2004).
Even so, another goal of oral practice can be
set as a substitute for written grammar exercises
because vocalization requires less time and en-
ergy than paper-based tasks. It is true that little
previous research has been published about this
goal, and that some teachers might say that such
oral practice can not be categorized as a pronun-
ciation drill in cases where no weight is given to
Vol. 18 No. 2 Repetition in Pronunciation Practice on Retrieval of Nonsense Words 321
‘naturalness’ or ‘intelligibility’. Nevertheless, it
is still reasonable for researchers to maximize the
efficiency of pronunciation drills as long as audio-
lingual-type oral practice produces a certain and
inevitable effect on learners. In practical terms,
a more important matter for teachers is to clar-
ify how this effect can be maximized. This is the
purpose of the present experiment.
When the first aim of pronunciation drills is
to increase learners’ vocabulary, the effect can
be quantitatively measured by a word recogni-
tion test, in which the scores of nonsense-word
retrieval are measured after pronunciation drills.
In the present experiment, nonsense test words
were synthesized to minimize the deviation of
various acoustic characteristics found in the hu-
man voice. Effectiveness was measured by the
scores of word retrieval tests with a factor of six-
point confidence rating (Massaro, 1975, pp.85–
141; Wickens, 2002, pp.83–92). The pronunci-
ation drills in the present paper are limited to
the following basic twofold types: repetition ‘af-
ter’ the teacher and repetition ‘with’ the teacher;
variations of oral practice can be considered as
derived forms of these two types of practices.
Repetition ‘after’ the teacher is considered the
default and unmarked practice, what most teach-
ers and learners do in their classrooms (see Nagai
2007 for a description of the varieties of pronun-
ciation drills).
Nagai (2007) conducted four experiments to
assess naturalness of learners’ repetition in two
ways — ‘after’ the teacher and ‘with’ the teacher.
The results show that repetition ‘after’ the
teacher surpassed ‘with’ the teacher in natural-
ness of the sentences spoken by the learners prob-
ably because repeating ‘with’ a model is a task
which hinders learners’ precise auditory feed-
back by its random and nonstationary interrup-
tion including the participants’ own voices. An-
other experiment in Nagai (2009) firstly asked
the participants to repeat test sentences orally
‘after’ and ‘with’ the model presentation, and
then asked them to judge whether the sentences
were grammatical or not by pressing keys on a re-
sponse pad. The results indicated that repeating
‘with’ the model tends to yield better grammat-
ical judgment scores than repeating ‘after’ the
model. Do tests on vocabulary show the same
tendency as seen in Nagai (2009)? Does pronun-
ciation practice ‘after’ the teacher also improve
the learners’ scores on a nonsense word memo-
rization test? The present study addresses these
issues through an experiment with word retrieval
tests administered after vocalizing the test words
after and with the model presentation.
2. Experiment
2.1 Participants
Six male and four female college students
aged between 19 and 21 participated in the ex-
periment. They all were born and raised in
Okayama. They received honoraria for their par-
ticipation. None of them had hearing disorders
or experience living abroad.
2.2 Test words
Test words, shown in Table 1, were synthe-
sized using a speech synthesizer (Fujitsu, 1998)
at 16bit/22kHz sampling to minimize durational
variety and to maximize the naturalness of coar-
ticulation within the words. The list con-
tained 120 bisyllabic (two consonant-vowel com-
binations: C1V1.C2V2) nonsense test words of
which the association (i.e. familiarity) values
for Japanese speakers are below 20 (on scale of
1 to 100) in Hayashi (1976). Some candidate
words, either entry words in a popular Japanese
dictionary (Shogakukan, 1988) or Japanese ono-
matopoeia were also removed from the list. The
list of nonsense test words in Hayashi (1976) in-
cludes words of foreign origin. Note that limit-
ing test words to phonetically and phonologically
English-type words would decrease the validity of
the result because the aim of present paper is to
examine vocabulary building through oral prac-
tice. The mean duration of all test words was
476ms (S.D.=20.8). Table 2 indicates variations
322 Cognitive Studies June 2011
Table 1 List of test words
heha nehe rano ruyu
hehu nehi raro sehe
heka neke rayo seho
heme neme rehe sonu
hene newa reke suse
henu nime reme suyo
heyo nino rera tehi
hihe nona rewa teyu
hinu nonu reya tohe
hise noyo reyu tonu
hohi noyu rinu tuhu
honi nuha rite tuko
honu nuhe riwa tunu
huho nuho rohe tusa
kehe nuko romo tuse
keku numi roni tuso
kenu numo ronu wami
keyo numu rowa wamo
mehe nuna royo wane
mehu nune royu waso
memi nuni ruhe wayu
mena nuse ruke yohe
mihu nuso rume yuha
moma nutu runi yuhe
monu nuwa runu yuma
muhe nuya rura yumu
mumo nuyo ruro yunu
munu nuyu rute yuro
muwa raho ruya yuti
nate rani ruyo yuyo
Table 2 Mean durations and standard deviations of C1V1 and C2V2 units in test words.
The duration of whole CV units ranges between 395 ms (shortest ‘ku’) and
528 ms (longest ‘se’).
head
consonants
mean CV
duration (ms)S.D.
h 252.6 18
k 175.5 18
m 240 16
n 238.6 17
r 244.8 15
s 280.5 15
t 201 12
y 243.7 7
vowelsmean CV
duration (ms)S.D.
a 238.3 30
e 247.7 37
i 235.2 19
o 240.4 38
u 218.6 31
of syllabic duration sorted by their consonants
and vowels in the CV units.
2.3 Procedure and apparatus
After a brief explanation of experimental
procedures in a soundproofed room (Yamaha
ANUKC3508), participants put on headphones
(Sony MDR-CD2000) connected to a computer
extension card (Creative Audigy 2ZS). They lis-
tened to stimuli presented at 60 dB (SPL) and
repeated the test words loudly. The first part of
the experiment consisted of six repetition phases
with ‘repeat after me’-type practice and then
six more test phases. One repetition phase in-
cluded ten nonsense test words with pauses of
five seconds between each word, and participants
repeated the test words ten times in one rep-
etition phase. Participants were asked to vo-
calize the test words clearly, and their voices
were recorded digitally to ensure a precise re-
production of the test words (with a Roland R-1
recorder and Sony ECM-55B microphone). Then
the test phase, termed ‘signal plus noise trials’ in
Signal Detecting Theory (Green & Swets, 1966;
Vol. 18 No. 2 Repetition in Pronunciation Practice on Retrieval of Nonsense Words 323
a. Repetition phase(10 words ‘after’ theteacher, 50 seconds)
b. Test phase(10 repeated words +10 new words, self-paced)
c. Repetition phase(10 words ‘with’ theteacher, 50 seconds)
d. Test phase(10 repeated words +10 new words, self-paced)
Figure 1 Procedures of the experiment. One repetition phase (a) and the following test phase
(b) composed one “repetition ‘after’ the teacher” set. The six sets (six a-b, a-b, ...
a-b sets) composed the first part of the experiment. The second half also consisted
of six “repetition ‘with’ the teacher” phases and six test phases (six c-d, c-d, ... c-d
sets). Half of the participants started the experiment of ‘with the teacher’ sets, and
all test words were presented randomly.
Lindsay & Norman, 1977; Wickens, 2002), fol-
lowed the repetition phase after a short cue of
white noise for 100 ms. One test phase included
twenty test words presented in randomized or-
der. The twenty words in the test phase con-
sisted of ten ‘old’ test words from the preced-
ing repetition phase (words the participants vo-
calized), and ten more nonsense words ‘new’ to
the participants (words the participants did not
vocalize). The participants heard twenty test
words through their headphones and self-checked
their memory by expressing their confidence lev-
els. The participants were expected to reject the
‘new’ test words as ‘not the word I repeated.’
The participants’ levels of confidence were de-
fined as certainty factors (CFs hereafter), and
the levels were measured on a scale from one (‘yes
with most certainty’) to six (‘no with most cer-
tainty’) as shown in Table 3. Participants pushed
one of the six keys on a six-button switchbox
(Cydrus Response Pad RB-620). The test phase
was self-paced and no feedback was given to the
participants. After short cue of five white noises
and pauses for 100 ms each, the test phase en-
tered the second repetition phase. Participants
were allowed to have a 30 minute break after six
repetition-and-test phases in the experiment.
The second half of the experiment consisted
of six more repetition phases and six more test
phases. The difference was that participants re-
peated the test words ‘with the teacher’ during
the repetition phases. Participants were pre-
sented the test words twice and were required
to repeat test words chorally and simultaneously
with the second audio presentation. This type
of repetition is similar to practices of ‘echoing,’
‘mirroring,’ or ‘shadowing’ as classified in Nagai
(2007).
Half number of the participants started the
experiment with repetition ‘after’ the teacher,
and the other half started repetition ‘with’ the
teacher to make a counterbalance. The overall
procedures of the experiment are illustrated in
Figure 1. The total number of trials was 2 (‘rep-
etition after me’ and ‘repetition with me’ at the
trial phases) × 20 (trials at the test phases) × 6
(phases at the first and second halves) = 240.
If repetition ‘after’ the teacher is divided into
several cognitive stages (i.e. (1) perception of the
teacher’s voice, (2) holding the teacher’s model
pronunciation in temporary storage, (3) plan-
ning pronunciation by reference to the teacher’s
324 Cognitive Studies June 2011
Table 3 Levels of confidence (certainty
factor, CF). Six buttons on the
switchbox were laid out in a
straight vertical line. The box
was placed with the key no.6
(yes with most certainty) in
front of the participant.
1 no with most certainty
2 no with certainty
3 no without certainty
4 yes without certainty
5 yes with certainty
6 yes with most certainty
model, and (4) articulation by the learner), rep-
etition ‘with’ the teacher has an additional pro-
cess ((5) adjustment of articulation following the
second model pronunciation by their teacher).
The adjustment of articulation (stage (5) when
repeating ‘with’ the teacher) included an over-
lapping process of ‘the second audio presenta-
tion’ and ‘student’s voice being fed back audi-
torily and mentally’. Note here that the stu-
dents were hearing their own voices (auditory
and mental feedback) not only when they re-
peated ‘with’ the teacher, but also when repeat-
ing ‘after’ the teacher. In other words, repetition
‘with the student’s own voice (repetition ‘after’
the teacher)’ and repetition ‘with the teacher’s
voice + the student’s voice (repetition ‘with’ the
teacher) simultaneously’ were compared in the
present experiment. This is unavoidable because
synchronous repetition of nonsense words pre-
sented only once (i.e. ‘presenting test words only
one time when repeating ‘with’ the teacher’ with-
out visual presentation) is too difficult for partic-
ipants, and because ‘presenting test words two
times when repeating ‘after’ the teacher’ is the
same as allowing the participants to listen to the
test words three times (i.e. they hear their own
voices thirdly).
3. Results and discussion
If the CFs in Table 3 are simplified into di-
chotic yes/no responses, the result can be sum-
Figure 2 Number of correct responses counted
by dichotic (yes/no) criteria
marized as shown in Figure 2. The numbers
of correct responses in Figure 2 correspond to
the scores of eleven participants (A-K) and their
scores are summed up in Table 4. Under the as-
sumption that the number of correct responses
equals the scores of each participant with nor-
mal distributions, analyses of variance were cal-
culated to examine the effects of the two repeti-
tion conditions (‘after’ versus ‘with’ the teacher)
and the two other levels of type of trials (‘hit’
and ‘correct rejection,’ detailed below) in Fig-
ure 2. While one variable, ‘repetition after/with
the teacher,’ had a significant main effect on the
number of correct responses (the dependent vari-
able), F (1, 40) = 10.59, p < .05. The other vari-
able ‘hit/correct rejection’ did not have a signif-
icant effect, F (1, 40) = 1.04, p = .31, n.s.. The
post hoc test showed that the score for repeti-
tion ‘with’ the teacher was higher than that of
‘after’ the teacher. It also indicated that ‘hit’
responses outnumbered ‘correct rejections’ when
the participants repeated the test words ‘after’
the teacher, p < .05. These results demonstrate
the importance of repetition ‘with’ the teacher.
They also imply that, in the test phase follow-
ing ‘repetition after the teacher,’ the participants
were better at recalling the test words which they
repeated in the first repetition phase than they
were at rejecting new words added in the test
phase.
Signal detection theory provides a framework
to classify the correct responses into two cate-
Vol. 18 No. 2 Repetition in Pronunciation Practice on Retrieval of Nonsense Words 325
Table 4 Mean number of correct responses (hit and correct rejection)
out of 60 trials each (240 in total)
mean S.D.
repeat after the teacher hit 37.91 5.4
correct rejection 46.18 8.0
subtotal 40.00 5.1
repeat with the teacher hit 42.09 3.9
correct rejection 45.73 6.1
subtotal 43.91 5.4
total 42.98 6.7
gories: ‘hit’ and ‘correct rejection’. ‘Hit’ indi-
cates the correct response elicited by participants
who judged that ‘the word presented in the test
phase’ was included among ‘the test words in
the previous repetition phase’ (i.e. ‘old’ words).
‘Correct rejection’ of newly added test words in
the test phase, on the other hand, represents an-
other type of correct response that is evoked by
participants who judged that ‘the word in the
test phase’ was not ‘the word s/he had repeated
in the preceding repetition phase’ (i.e. a ‘new’
word).
Figure 3 indicates the two idealized contin-
uum distributions of responses. The broken line
can be considered to indicate the number of re-
sponses to the ‘noise’ (test words newly added
in the test phase). The overlapping right curve
corresponds to the number of responses to the
‘noise and signal’ (test words with and without
newly added words). The area of overlap shows
where the participants had difficulty in distin-
guishing ‘noise and signal’ from ‘noise’. Because
the present experiment was of a forced-choice de-
sign, the participants needed to draw and repre-
sented by a vertical line which determines the
cutoff point of the two distributions. Accord-
ingly, it is reasonable to assume that the larger
distance of the two distributions (d′) yields a
more precise distinction between the ‘noise’ and
‘noise and signal’ distribution. The present study
tried to differentiate between two styles of repe-
tition (‘after’ and ‘with’ the teacher in the rep-
Figure 3 Idealized distributions of noise (test
words newly added in the second
test phases) and noise+signal (all
test words in the test phase includ-
ing words repeated in the first rep-
etition phases)
etition phases) by calculating the distances (d′)
of the two distributions.
The cutoff points of the vertical line indicate
trade-offs of the two correct responses (‘hit’ and
‘correct rejections’) and the distance (d′) varies
together. Therefore, the results of the present ex-
periment can be illustrated by plotting the ‘hit’
rate against the ‘false alarm (complement set of
‘correct rejections’)’ rate as shown in Figure 4.
Panels (a-e) in Figure 4 are called Receiver Oper-
ating Characteristic (ROC) curves, and they are
indices of the sensitivity of participants to the
signals. It is also known that as the distance (d′)
increases, the area of the bottom-right portion of
the graphs will become wider (see Wickens, 2002,
326 Cognitive Studies June 2011
Figure 4 Receiver Operating Characteristics (ROC) curves (repetition ‘after’ and ‘with’ the
model pronunciation are abbreviated to a-repeat and w-repeat)
pp. 39–58 for the theoretical background).
The lower right-hand area under the ROC
curves can be maximized by moving the ROC
curves to the left-hand border (the ordinate) and
the top border (the abscissa). Then normalized
ROC curves were computed in Figure 5 to com-
pare the area under the ROC curves in Figure 4.
It is more desirable to have a larger distance (d′)
because a large distance (d′) of the two distribu-
tions indicates more precise detection (reception)
of the signal as explained above, and because
participants’ more precise detection (perception)
can be regarded as indicating better retrieval of
the test words. The values of d′ can be obtained
by deciding crossing points of the linear approx-
imations of the normalized ROC curves and the
abscissa, p(H) = 0. The d′ value of repetition
‘after’ the teacher was 1.35, while that of rep-
etition ‘with’ the teacher phase was 1.57. This
difference shows a tendency for repetition ‘with’
the model presentation (at the repetition phases
of the experiment) to yield better retrieval of test
words than repetition ‘after’ the model presenta-
tion.
4. Conclusion
Memorization of new phonological lexicon is
an essential step in foreign language learning, re-
gardless of learners’ achievement levels. Vocal-
izing words is a useful tactic for minimizing the
target words because oral repetition requires no
written text at hand. This repetition can be
classified into two basic types from the view-
point of teacher’s and learner’s vocal overlap-
ping, i.e. repetition ‘with’ or ‘after’ the teacher.
Using ANOVA and the framework of signal de-
tection theory, the present paper tested which
of the two types of vocalization was more effec-
tive. ANOVA results show that scores of rep-
etition ‘with’ the model presentation is signifi-
cantly higher than that of ‘after’ the teacher’s
voice. The ROC curve analysis also ensures that
Vol.18 No.2 Repetition in Pronunciation Practice on Retrieval of Nonsense Words 327
Figure 5 Normalized Receiver Operating Characteristics (ROC) curves calculated from the
data of Figure 4.
repetition ‘with’ the model presentation height-
ens the value of the discrimination index (d′).
These results suggest that repetition ‘with’ the
teacher is a better activity than repetition ‘after’
the teacher when teaching new words. Vocaliza-
tion ‘with’ the teacher can be regarded as better
practice than repetition ‘after’ the teacher in lan-
guage classrooms, because students will memo-
rize target words (i.e. phonological lexicon) more
easily when they repeat ‘with’ their teacher.
The data were obtained from an experiment
in the retrieval of ‘new’ and ‘old’ words after
an audio presentation and its vocal repetition.
The difference of discrimination index (d′) cal-
culated from the ROC curves is regarded as the
difference of confidence in participants’ ability to
distinguish ‘old’ words (that the participants re-
peated) from ‘new’ words (that they did not re-
peat). The sensitivity of the participants is also
regarded as being indicated by the accuracy of
their responses in the present study. It must be
admitted that the number of audio presentation
is different between repetitions ‘after’ and ‘with’
a model. However, the framework of this exper-
iment may establish a new research method for
the study of the role of repeating in language
learning.
References
Aoyama, K., Flege, J. E., Guion, S. G., Akahane-
Yamada, R., & Yamada, T. (2004). “Per-
ceived phonetic dissimilarity and L2 speech
learning: The case of Japanese /r/ and En-
glish /l/ and /r/.” Journal of Phonetics, 32,
233–250.
Celce-Murcia, M., Brinton, D. M., & Goodwin,
J. M. (1996). Teaching Pronunciation. Cam-
bridge: Cambridge University Press.
Dalton, C., & Seidlhofer, B. (1994). Pronuncia-
tion. Oxford: Oxford University Press.
Doff, A. (1988). Teach English. Cambridge:
Cambridge University Press.
Fujitsu Ltd. (1998). Speech API for Microsoft
Windows. Tokyo: Fujitsu Ltd.
Green, D. M., & Swets, J. W. (1966). Signal de-
tection theory and psychophysics. New York:
Wiley.
Hayashi, S. (1976). New nonsense syllable list.
Nagoya: Tokai University Press.
Howatt, A. P. R., & Widdowson, H. G. (2004). A
History of English Language Teaching Second
edition. Oxford: Oxford University Press.
Kenworthy, J. (1987). Teaching English Pronun-
ciation. London: Longman.
Kelly, G. (2000). How to Teach Pronunciation.
Harlow: Pearson Education.
Lindsay, P. H., & Norman, D. A. (1977). Human
information processing. New York: Academic
Press.
Massaro, D. W. (1975). Experimental psychology
and information processing. Chicago: Rand
McNally College Publishing.
Miwa, J., Sasaki, H., & Tanno, K. (2000).
“Japanese Spoken Language Learning Sys-
328 Cognitive Studies June 2011
tem Using Java Information Technology.”
Sixth International Conference on Spoken
Language Processing (ICSLP 2000), Vol.III,
578–581.
Nagai, K. (2009). “Effect of pronunciation prac-
tices on the acquisition of artificial lan-
guages.” Studies in phonetics and speech
communication. Kinki Society for Phonetics.
6, 225–233.
Nagai, K. (2007). “Differences of pronunciation
practices: A study of ‘Repeat with me’ and
‘Repeat after me’.” Journal of the Phonetic
Society of Japan, 11, 79–93.
Shogakukan (1988). Kokugo Dai Jiten Dictio-
nary (Revised edition).
Stern, H. H. (1992). Issues and Options in Lan-
guage Teaching. Oxford: Oxford University
Press.
Wickens, T. D. (2002). Elementary Signal De-
tection Theory. Oxford: Oxford University
Press.
(Received 9 Sep. 2010)
(Accepted 12 April 2011)
長井 克己(正会員)1996年,M.Sc. in Applied Linguistics (Univer-
sity of Edinburgh).1999年,博士(言語文化学・大阪大学).1987年山口大学人文学部卒業後,大阪府立香里ヶ丘高等学校教諭,同農芸高等学校教諭,津山工業高等専門学校助教授を経て,現在香川大学大学教育開発センタ准教授.日本語,英語,ゲール語の学習者の音声を母語話者と比較することにより,より有効な発音練習の方法を提案することを目指している.日本音声学会,日本音響学会,日本心理学会,Acoustical Society of America などに所属.