Upload
others
View
5
Download
0
Embed Size (px)
Citation preview
Dept. for Speech, Music and Hearing
Quarterly Progress andStatus Report
Acoustical features of hardand soft Russian consonants
in connected speech: Aspectrographic study
Shupljakov, V. and Fant, G. and deSerpa-Leitao, A.
journal: STL-QPSRvolume: 9number: 4year: 1968pages: 001-006
http://www.speech.kth.se/qpsr
STL-QPSR 4/1968 3.
Experimental procedure
Three female and two male speakers of Modern Standard Russian took
par t i n the experiments. The subjects pronounced l i s t s of symmetr ical
sequencesto VCV (vowel-consonant-vowel) type. These l i s t s contained a l l
Russian vowels and consonants, in al l 180 syllables.
This speech mater ia l was recorded on an Ampex magnetic tape-recorder .
Spectrograms were made with a Kay Electr ic Sona-Graph, type 4691, oper-
ated with a broad f i l ter of 300 Hz and a high frequency emphasis of 6 dB
p e r octave above 1000 Hz,
Measurements were made of F and F within the consonant sampling a t 2 3 places of extreme values. If these were obscured, measurements were
taken at the very beginning of a transit ion f rom the consonant to the vowel.
The F and F data were quantized to nearest 100 Hz values fo r the purpose 2 3
of constructing diagrams.
The direction of the transit ions were observed.
Results and conclusions
It was soon verified that hypothesis (3) did not hold. F2-transi t ions
f r o m adjacent vowels to the soft consonant were generally positive, s ee
Fig. I-A-1, but could a l so be neutral o r slightly negative, s ee the word
[ a l ' z j of Fig. I-A-2, where F2 = 2100 Hz in [ie] a s well a s i n the [I] . The
sequence [~ lz ] of Fig. I-A-2 exemplifies a typical negative transit ion f rom
the F2 = 1900 Hz of the [ie] to the F2 min = 830 Hz of the da rk [I].
However, in [odo] and [ aza l , Fig. I-A-2, the transit ion f rom the vowel
to the hard consonant i s neutral o r positive.
Although the direction of transit ion can b e a n important perceptual
parameter , Stevens (1 967), Chistovich (1 968), it apparently does not pro-
vide a simple invariance rule for the soft/hard distinction.
The ce ter i s paribus hypothesis (1) was consistently found to hold. The
F of a soft consonant i s always higher than the F2 of a hard consonant i n 2
the same context and assuming the same speaker. The next s tep was to .
s e e whether there existed an absolute c r i te r ion in F independent of vowel 2
context a s formulated i n hypothesis (2) above. Fig. I-A-3 and Fig. I-A-4
show the distributions for each of the five speakers . Apparently the F2
kHz
Fig. I-A-1. Spectrogram of the syllable uh' u.
SUBJECT: M - n
I I \ I
\ I \
i :I \I Y t
\ \ \ \ \
1 1 1 1 I I 1 J
500 1000 1500 2000 2500
3 0 ~ - - 25 1 SUBJECT: B - r - - - 20 = I - I
I
I I I I
- \ - I \ \ \
I l l I l U \ I
2500
FREQUENCY OF THE SECOND F3RMANT IN Hz
Fig. I -A-3. The f requency of the second formant i n hard (solid l ines ) and soft (dashed l ines ) cons onants, pronounced by two male subjects .
FREQUENCY OF THE SECOND FORMANT IN H z
Fig. I -A-4 . The frequency of the second fo rmant in ha rd (solid l ines ) and soft (dashed l ines) consonants, pronounced by t h r e e different female subjects .
STL-CPSR 4/1968 4.
distributions of hard and soft consonants show very l i t t le overlap. Thei r
c rossover points occur a t 1700 Hz for the two male speakers and 1900 Hz,
2000 Hz, and 2000 Hz i n the distributions of the female speakers . The
number of mistakes that would occur when using a single threshold value
in F2, individually determined for each speaker would be 1.9 % in the 360
utterances of the male subjects and 4 YO in the 540 utterances of the female
group. A s imi l a r tendency was found in the resul ts f rom the experiments
on the perception of hard and soft stationary fricative consonants,
Shupljakov (1 968).
An analysis of the overlap in the F distributions shows that 18 out of 2
the 29 cases originate f rom the ;cquencis[atre] [ik 1 Lug' u][uf u][o J' o]
which a r e r a r e o r unnatural in Russian. The front vowel contexts ([i]
[ &][a] and [ e l account fo r 17 of the overlap points.
The finite amount of overlap can be fur ther reduced by taking into ac-
count the F distributions, see Figs. I-A-5 - I -A-9 . This cue, however, 3
i s effective mainly in front vowel contexts, where F3 of the soft consonant
i s generally higher than in hard consonants. In [ a ] contexts the overlap i s
considerable. The same i s found in Lo] and [u] contexts but i t is interest-
ing to note that for one of the male subjects (h4-n) there i s a c l ea r tendency
of a lower F in soft than in hard consonants. This effect could be attributed 3
to a coarticulation of the hard consonant with "mid-vocal tract" place of
articulation conditioning a high F 3 '
The crossover points of F in front vowel contexts were found to be 3 2600 Hz fo r the male speakers and 3100 Hz, 3100 Hz, and 3000 Hz fo r the
female speakers .
The question now a r i s e s whether the cr i t ical threshold values of second
and third formants separating hard f rom soft consonants can be inferred
f rom some general property of the speaker. One such parameter is the
total length of the speaker ' s vocal t r ac t which i s inversely proportional to
the average F3, Fant (1 959). The average F of each speaker ' s front 3 vowels [i] [&][el and [ze) was accordingly calculated and was found to co-
incide with the hard/snft F3 boundary. The hard/soft F boundary was found 2
to be 0.53 of the male speake r ' s average F and within the female group 3
the F boundary was 0.59 of the speaker ' s average Fj. Thus 2 - F 2 t = 0.53 Fg (males) - F 2 t = 0.59 F3 (females) - F3t = F 3 (males and females)
FREQUENCY OF THE THIRD FORMANT IN Hz
Fig . I-A-5. The frequency of the th i rd fo rmant i n hard (solid l ines ) and soft (dashed l ines) consonants, followed by different vowels (Male speaker : M-n).
FREQUENCY OF THE THIRD FORMANT IN Hz
F i g . I -A-6 . The f requency of the th i rd fo rmant i n ha rd (solid l ines) and soft (dashed l ines) consonants , followed by di f ferent vowels (Male speaker : B - r ) .
FREQUENCY OF THE THIRD FORMANT IN Hz
F i g . I - A - 7 . The frequency of the th i rd fo rmant i n hard (sol id l ines ) and sof t (dashed l ines ) consonants , followed by d i f ferent vowels ( F e m a l e speaker : H - a ) .
FREQUENCY OF THE THIRD FORMANT I N Hz
Fig. I - A - 9 . The frequency of the th i rd formant in hard (solid l ines) and soft (dashed l ines) consonants, followed by di f ferent vowels ( F e m a l e speaker : P - a ) .
STL-CPSR 4/1968 5.
The normalizing function of the third formant determining F 2 boundaries
i n vowel identification has been studied by Fujisaki and Kawashima (1 967
and 1968). F r o m the i r data and those of Fant (1959) i t i s known that the
perceptual importance of F is much g rea te r i n front vowels than i n back 3 vowels. This general rule i s related to our finding that F3 adds to the
phonetic separation of soft and hard consonants i n front vowel context only.
In this connection i t is also worth noting that i n the experiments on the per-
ception of sustained fricatives, Shupljakov (1968), the F2t was constant a s
long a s the par t of the noise spectrum above 2500 Hz was kept the same.
Summary. Perceptual model
Our final conclusion is accordingly that the speech wave charac ter i s t ics
separating Russian hard and soft consonants can be expressed by a single
threshold i n the consonant F with a support of a s imi l a r normalized thres- 2
hold in F i f F2 is high. These thresholds a r e simply related to the 3
speaker ' s average formant pattern. Consonants with formant frequencies
above the thresholds a r e soft and those with formant frequencies below
the thresholds a r e hard. The speech production cor re la te is palatalization
a s opposed to a neutral o r "mid tube" place of tongue-body articulation.
Therefore a l is tener ' s hard/s:Jft identification could be based on n f ~ r m a n t
frequency boundary categorization a s the pr imary cue a s was suggested
fo r vowels by Chistovich, Fant, d e s e r p a - ~ e i t g o , and Tjernlund (1966).
More detailed data on the F-pat tern of various hard and soft consonants
in different contexts will be given i n a forthcoming publication.
Acknowledgments
We a r e thankful to Mrs. 5. Felicett i and Dr. B. Lindblom fo r the help
during the preparation of the paper and to Miss E. Agelfors for thz drzwings.
Rc:fzr;:nc~s Chistovich, L. A. (1968): "Direction of Transit ion a s a Perceptual P a r a -
me te r of Time-Varying Stimuli", paper B-3-7, 6th International Gongr. on Acoustics, Tokyo 1968.
Chistovich, L. , Fant , G., d e s e r p a - ~ e i t c o , A., and Tjernlund, ~ . ( 1 9 6 6 ) : "Mimicking and Perception of Synthetic Vowels. P a r t 11", STL-CPSR 3/1966, pp. 1-3.
Fant, G. (1959): "Acoustic Analysis and Synthesis of Speech with Applica- tions to Swedish", E r i c s son Technics 15, No. 1 (1959). -
Fant, G. (1 960): Acoustic Theory of Speech Production ( ' s-Gravenhage).
STL-QPSR 4/1968 6.
references, conto
Fujisaki, H. and Kawashima, T. (1967): "The Roles of Pi tch and Higher Formants i n the Perception of Vowels", Conference on Speech Communication and Processing, Cambridge, Mass. 1967; publ. in IEEE Transactions on Audio and Electroacoustics AU-16, March 1968, pp. 73-77.
Kawashima, T. and Fujisaki, H. : "The Influence of Various Fac to r s on the Identification and Discrimination of Synthetic Speech Sounds:' paper B-3-6, 6th International Congr. on Acoustics, Tokyo 1968.
Shupljakov, : "Pitch and Hard-Soft Distinction af Fricat ive and / /", Proc , of the Seminar on Speech
Production and Perception, Leningrad 1966; 2. f. Phonetik, Sprachwissenschaft und Kommunikationsforschung - 21, Heft 1/2, 1968.
Shupljakov, V. S. (1 968): "P,coustical Fea tu re of the Perception of Soft Stationary Fricat ive Consonants", 6th Allunion Acoustical Conference, Ivioscow (in ~ u s s i a n ) .
Stevens, K, N. (1 967): "Acoustic Corre la tes of Certain Consonantal Features" , 1967 Conf. on Speech Communication and Processing, Cambridge, Mass. , paper C6 in Conference Prepr in ts , pp. 177 - 184.
ahman, S. (1 964): "N<]te 3n Palat alizati,>n i n Russian", Luar t . P r o g r . Rep. Research Inst. of Electronics, Massachusetts Inst. of Techn. No. 73 (1964)~ pp. 167-172.
Chomsky, N, and Halle, M. : The Sound Pat te rn of English (New York 1968).