Dept. for Speech, Music and Hearing
Quarterly Progress and Status Report
Distinctive features and phonetic dimensions
Fant, G.
journal: STL-QPSR, volume: 10, number: 2-3, year: 1969, pages: 001-018
http://www.speech.kth.se/qpsr
STL-QPSR 2-3/1969
I. SPEECH ANALYSIS
A. DISTINCTIVE FEATURES AND PHONETIC DIMENSIONS*
G. Fant
The purpose of this paper is to offer some comments on recent
developments in distinctive feature theory, with specific reference to the
work of Chomsky and Halle (1968). On the whole I consider their feature
system to be an improvement over that of Jakobson, Fant, and Halle (1952),
one of the main advantages being the introduction of a set of tongue body
features common to vowels and consonants but separate from the
consonantal "place of articulation" features. The basic philosophy of
treating phonetics as an integral part of general linguistics demands that
features, in addition to their classificatory function, shall have a
definite phonetic function reflecting independently controllable aspects of
the speech event or independent elements of perceptual representation.
However, there is a danger that the impact of the theoretical frame, with
its apparent merits of operational efficiency, will give some readers the
impression that the set of features is established once and for all and
that their phonetic basis has been thoroughly investigated. This is not so.
Many of their propositions are interesting and stimulating starting points
for further research, whereas others I find in need of revision.
As pointed out by Chomsky and Halle, there are still serious shortcomings
in our general knowledge of the speech event. Their feature system is
almost entirely based on speech production categorizations. The exclusion
of acoustical and perceptual correlates was a practical limitation in the
scope of their work but also appears to denote the importance laid on the
production stage. It is far easier to construct hypothetical feature
systems than to test them on any level of the speech communication chain.
This is really our present dilemma. Until we have reached a more solid
basis in general phonetics, any feature theory will remain "preliminary".
Here follows my reaction to some of the basic issues in chapter seven
of Sound Pattern of English. My earlier comments on distinctive feature
theory may be found in the list of references, Fant (1960a, b, 1966, 1967,
1968).
* Submitted for publication in the Proceedings of the Second International Congress of Applied Linguistics, Cambridge, England, Sept. 8-12, 1969.
1. Will we ever have a language-universal, finite, and unique set of distinctive features?
The universality aspects are attractive but I am somewhat pessimistic
about the outlook. Features are as universal as the sound producing
constraints of the human speech producing mechanism, and a finite number
should suffice for the classificatory function. However, I am rather
sceptical concerning the uniqueness, and thereby a definite number, of
features, since one and the same facts often can be described in
alternative forms and the criteria for selecting an optimum system are not
very rigid. Even if we had all the knowledge we needed, the choice of
features would be dependent on the particular weight given to phonetic and
general linguistic considerations, and the preferences of the investigator
would in the last instance determine some of the selections. The problem
is the following.
2. Are the demands on a feature system different on the classificatory level and the phonetic level?
There are two ways of arriving at features: (1) to select an inventory
of classes suitable for encoding of language structures and then determine
their phonetic correlates, or (2) to start with an exhaustive analysis of
the modes and constraints of the speech producing mechanisms and perception
and determine their distinctive function in language. Feature theory has to
develop along both lines, and investigators differ only in the relative
importance laid on one or the other. The main approach of Jakobson et al.
(1952) was to start out with an ordering of phonemic oppositions and to
identify minimal distinctions as the same if motivated by phonetic
similarities. The demand for the smallest possible number of features and
the far-reaching identification of features within the vowel and consonant
systems, e.g. that of identifying the relation between dentals and labials
with that of front and back vowels, resulted in an unavoidable trade-off
between encoding efficiency and phonetic reality and specifiability.
Chomsky and Halle (1968) avoided some of these difficulties by introducing
a greater number of features.
One of their basic issues is that a feature system, in addition to
classificatory efficiency, should conform with a natural phonetic
systematization. How have they managed in this respect? In many instances,
such as dealing with the classes of fricatives, stops, nasals, laterals,
etc., the solution is straightforward. On the other hand, I find the
encoding of the class of labial consonants as [+anterior] and [-coronal] to
constitute a clear case of departure from the unifying principles. One
single phonetic dimension, "labiality", which has a distinctive function,
has here lost its identity on the phonological level. It appears to be a
rather far-fetched hypothesis that the actual neural encoding of labial
consonants at some stage should include a selection of a maximally anterior
point of articulation in the vocal tract and a lack of tongue tip elevation
in order for a lower level to find out that this command has to be executed
by the lips and not the tongue.
The major class features "vocalic" and "consonantal", introduced already
in the work of Jakobson et al., and the features "sonorant" and "syllabic"
display a complicated system of interdependencies, as will be described in
later sections.
The starting point for the major class features appears to have been
the need to encode certain pre-established phonetic classes, whereas the
voiced-voiceless feature is a typical example of the opposite approach,
i.e. to start out with a natural phonetic dimension and study its
distinctive role in language. A natural linguistic class, e.g. all
[r]-phonemes, may have rather complicated sets of phonetic correlates, and
a natural phonetic dimension such as voicing may have to be studied
together with several other dimensions, such as tenseness, duration, and
coarticulation, when it comes to the discussion of its distinctive role.
Before we can accomplish the happy marriage between phonology and
phonetics we have to work out the rules for predicting the speech event
given the output of the phonological component of grammar. To me this is
the central, though much neglected, problem of phonetics, and it is of the
same magnitude as that of generative grammar in general and will require a
similar set of transformational rules. The starting point is the feature
matrix of a message as successive phonological segments, i.e. columns, each
with its specific bundle of features, i.e. rows. The particular choice of
classificatory features at this stage is not very important, provided the
conventions relating phonemes to alternative feature systems are known.
The derivation of the rules of this "phonetic component" of language
aims at describing the speech production, speech wave, or perception
correlates of each feature given the "context", in a very general sense, of
co-occurring features within the phonological segment as well as those of
following and preceding segments. One set of sequential constraints is
expressible as coarticulation rules, which may be both universal and
language specific.
In addition to these more or less inertia dependent laws of connecting
vocal gestures there may exist rules of neural reorganization of control
signals for modifying the physical manifestation of a feature in accordance
with a principle of least effort articulation or, on the contrary, a
compensation for maintaining or sharpening a phonetic distinction,
dependent on what features occur or follow in the time domain. In addition
there enter rules for modifications dependent on stress patterns,
intonation, tempo, speaker, sex, type, and dialect, attitude, etc. Rules
for speech segment durations and sound shapes have to be expressed in terms
of larger phonological segments, generally several syllables defining a
natural rhythmical unit in terms of stress and intonation. Very little is
known about these rules. There is some evidence that the phase of maximal
intensity increase within a syllable is a reference point for ordering
rules concerning segment durations (B. Lindblom, personal communication).
This "phonetic component" of the speech event receives very little
attention in the work of Chomsky and Halle, who merely refer to the
phonetic correlates of a feature as a scale with many steps instead of the
binary scaling on the classificatory level. A knowledge of linguistic
structuring is of great importance in practical communication engineering
undertakings such as the administration of synthesis by rule or automatic
identification. However, without access to the rules of the "phonetic
component" the phonetic aspect of features becomes as imaginary and empty
as the "Emperor's New Clothes" in the story of H. C. Andersen. Observing
the speech wave we are not faced with phonemes or features but with sound
segments and more or less continuous sound shapes, with a reciprocal
many-to-one relation between phonological and physical units. The same is
true of speech production studied in relation to the phonological
transcript. In both cases there is the need to define inventories of
physical units, Fant (1968), which are not identical to the distinctive
features but are used to define their phonetic correlates. It may be quite
practical to refer to a specific sequence of segments as a stop followed by
a fricative at the phonetic level while we may want to refer to the whole
unit as an affricate on the phonological level.
Those who want to widen their perspectives on phonology in relation
to phonetics should read Ladefoged's monograph "Linguistic phonetics"
(1967a). A pure phonetic system was outlined by G. E. Peterson (1968).
3. What is the psychological reality of features?
As demonstrated in the previous section, features must, at least under
prototype conditions, have physical correlates as observed by an external
observer of the speech communication act, and they should hopefully reflect
categorical phenomena in the encoding and decoding mechanism. This is not
the same as ascribing each feature to a specific brain location. We can be
aware of a feature by introspection but otherwise it may lack immediate
neurophysiological correlates. The important thing is that the actual
processes are phenomena that have some abstract relation to our feature
matrices.
4. Is the binary principle important?
No, not necessarily, but it is convenient. Language regularities and
language developments may in some instances be more easily described by
scales of three or more levels, cf. Ladefoged (1967a). It is also
questionable whether formulations in terms of feature matrices always
reveal more fundamental rules than formulations in terms of phonemes.
5. Are features independent and orthogonal?
This question can pertain both to the classificatory, "phonological",
level and to the phonetic level, discussing the production, speech wave,
and perceptual correlates. Besides the apparent constraints on possible
sequences of phonological segments there exist universal constraints on
feature combinations within one and the same segment. As discussed by
Chomsky and Halle, [+high] would contradict [+low]. Also, some features or
combinations of features imply specific signs of other features in the same
bundle, as exemplified by [+vocalic] implying [+sonorant]. A closer
analysis of interdependencies within the major class features reveals that
the class of [+sonorants] by definition also incorporates all [+syllabics]
and all [-consonantal] segments. Such constraints will be discussed in
greater detail in the section on major class features. The phonological
dependencies within this set of features are paralleled by phonetic
similarities. Thus the class of [-consonantal], incorporating vowels and
glides, must have much in common with the class of [+vocalic],
incorporating vowels and liquids. In other words, "vocalic" is almost the
negative of the "consonantal" feature.
The phonetic interdependencies are apparent even when they are not
paralleled by classificatory constraints. The situation would have been
ideal in the vowel system if the perceptually relevant number of dimensions
had been the same as the number of classificatory features. We would have
had a perfectly orthogonal system if limited to the [+low] or [-high] and
the [-back] dimensions, corresponding to the +F1 and +F2 dimensions,
respectively. The feature "rounding" is correlated with -(F1+F2+F3) and
thus only partially independent of other features. The same is true of the
feature "tense", which is related to the formant pattern (direction towards
an extreme target) and duration. Additional features and/or scale values
are needed for the Swedish vowel system, as will be discussed later.
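The partial dependence of "rounding" on the other vowel dimensions can be illustrated numerically. The sketch below assumes the simplest possible reading of the text, in which the three correlates are taken directly as F1, F2, and -(F1+F2+F3); the function name `vowel_correlates` and the formant values are hypothetical and chosen for illustration only.

```python
def vowel_correlates(f1, f2, f3):
    """Map formant frequencies (Hz) to the three scalar correlates in the text."""
    return {
        "openness": f1,                # [+low] or [-high] grows with F1
        "frontness": f2,               # [-back] grows with F2
        "rounding": -(f1 + f2 + f3),   # rounding grows as the formant sum falls
    }


# Hypothetical formant values in Hz, not measured data.
unrounded = vowel_correlates(300, 2200, 3000)
rounded = vowel_correlates(300, 1900, 2400)

print(rounded["rounding"] - unrounded["rounding"])
print(rounded["frontness"] - unrounded["frontness"])
```

With these assumed values the rounding correlate rises while the frontness correlate falls, so a change along one dimension cannot be made without moving along another.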
We accordingly have to resort to the minimal claim of Chomsky and
Halle that features should be at least partially independent. At the same
time we have to be aware of considerable interdependencies. This applies
to their classificatory function as well as to their phonetic correlates.
6. Are differences in feature contents of matrices a reliable measure of phonetic distance?
No, not always. On an average basis it might be permissible to express
differences between languages or dialects by summing binary units in the
classificatory domain and expect such differences to represent their
phonetic differences, Ladefoged (1969). However, one cannot expect the
phonetic difference between any two phonemes to be proportional to the
number of features by which they differ. The situation was especially
severe in the Jakobson, Fant, and Halle system. It was stated that the
[ŋ] and the [i] of the word "wing" do not have any features in common,
the [i] being [+voc][-cons][-compact][-grave], the [ŋ] being [-voc]
[+cons][+nasal][+compact]. On the phonetic level, on the other hand,
the difference between the [i] and the [ŋ] is minimal since the entire [i]
is nasalized and the transition from [i] to [ŋ] merely involves a gesture
of tongue closure which in dialectal variants is omitted. Within the
Chomsky-Halle framework the situation is indeed improved since the tongue
body features [-back][-low][+high] are in common for the two segments.
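The comparison of the vowel and the velar nasal of "wing" can be retraced by counting shared and opposed specifications. The sketch below uses only the feature values quoted in the text; the dictionary layout and the helper names `shared` and `differing` are assumptions made for illustration, and a missing key stands for an unspecified feature.

```python
# Feature values as quoted in the text for the Jakobson, Fant, and Halle system.
JFH = {
    "i": {"voc": "+", "cons": "-", "compact": "-", "grave": "-"},
    "ŋ": {"voc": "-", "cons": "+", "nasal": "+", "compact": "+"},
}

# Only the tongue body features that the text reports as common to both
# segments in the Chomsky-Halle framework.
CH_TONGUE_BODY = {
    "i": {"back": "-", "low": "-", "high": "+"},
    "ŋ": {"back": "-", "low": "-", "high": "+"},
}


def shared(a, b):
    """Features specified in both bundles with the same value."""
    return sorted(f for f in a if f in b and a[f] == b[f])


def differing(a, b):
    """Features specified in both bundles with opposite values."""
    return sorted(f for f in a if f in b and a[f] != b[f])


print(shared(JFH["i"], JFH["ŋ"]))                        # []
print(differing(JFH["i"], JFH["ŋ"]))                     # ['compact', 'cons', 'voc']
print(shared(CH_TONGUE_BODY["i"], CH_TONGUE_BODY["ŋ"]))  # ['back', 'high', 'low']
```

A count of this kind measures classificatory distance only; it says nothing about the small phonetic difference between the two sounds.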
Consonantal sounds are produced with a radical constriction in the
midsagittal region of the vocal tract. This constriction limits the flow of
air in the obstruents and in the closed phase of r-sounds, whereas it is
"shunted", i.e. by-passed, in laterals and nasals. Because of the variety
of sounds to be included by the feature, a formulation of the acoustical
correlates becomes rather complex, the common denominator being a deviation
from the ideal "vocalic" pattern by a reduction of the second and/or
higher formants.
Vocalic sounds are produced with an oral opening that shall not exceed
that of the high vowels [i] and [u] and which by definition shall be greater
than that of glides. In addition the vocal cords shall be positioned to
allow for spontaneous voicing. This requirement rules out unvoiced vowels,
which are thus nonvocalic. Oral opening here includes lateral opening and,
in the case of sonorant [r]-sounds, the more open intervals. The acoustic
correlate is a higher F1 and higher overall intensity than in nonvocalic
sounds.
Syllabic sounds form a syllabic peak in the sequence of sound events.
Obstruents are by definition excluded from the possibility of forming
syllabic peaks, whereas syllabic nasals and liquids between obstruents are
basically characterized by the same criterion as that of vowels between
obstruents or glides. A weighted sum of second and first formant intensity
relative to that of adjacent phonetic segments would be the simplest
acoustic correlate.
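The proposed acoustic correlate of syllabicity can be sketched as a local-peak search over a weighted sum of formant intensities. Everything specific here is an assumption made for illustration: the equal weights, the function names `sonority_score` and `syllabic_peaks`, and the intensity levels in the example are hypothetical, and the exclusion of obstruents by definition is not enforced.

```python
def sonority_score(l1_db, l2_db, w1=1.0, w2=1.0):
    """Weighted sum of first- and second-formant intensity levels (dB)."""
    return w1 * l1_db + w2 * l2_db


def syllabic_peaks(segments, w1=1.0, w2=1.0):
    """Return indices of segments whose score exceeds that of both neighbours."""
    scores = [sonority_score(l1, l2, w1, w2) for _, l1, l2 in segments]
    peaks = []
    for i, s in enumerate(scores):
        left = scores[i - 1] if i > 0 else float("-inf")
        right = scores[i + 1] if i + 1 < len(scores) else float("-inf")
        if s > left and s > right:
            peaks.append(i)
    return peaks


# Hypothetical (label, F1 level, F2 level) values in dB, not measured data.
utterance = [("b", 40, 10), ("a", 70, 60), ("t", 5, 5), ("n", 55, 30)]
print([utterance[i][0] for i in syllabic_peaks(utterance)])  # ['a', 'n']
```

In this made-up sequence the vowel and the final nasal following an obstruent both come out as peaks, which corresponds to the case of a syllabic nasal described above.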
Sonorant sounds. The relative degree of sonority can be based on
exactly the same criteria as for syllabicity, except that the relative
degree of sonority is related to alternative compositions of one and the
same segment whereas syllabicity implies comparisons in the time domain.
The production correlate of sonority is the sum of vocal tract openings,
including oral, nasal, and lateral passages, which is larger than that
found in obstruents. Thus [-sonorant] = obstruent. An interesting claim not
yet verified is that nonsonorant sounds would not allow "spontaneous
voicing" and that a compensation of glottal adjustment to counteract the
impaired flow would be necessary.
The interdependencies between basic class features are as apparent
on the phonetic level as on the classificatory level. The situation is even
more complicated by the fact that the continuant-noncontinuant (stop)
feature is the same as the consonantal feature, except that the degree of
primary stricture is total in stops and in the closed interval of
affricates but not total in the [+consonantal][+continuant] fricatives.
I fully agree with Chomsky and Halle on the need for replacing the
"vocalic" feature by the "syllabic" feature. Syllabicity seems to be
more easily testable than vocalicity, which employs a disputable threshold
between liquids and glides that does not focus on the important
differences. Furthermore, I suggest a further reduction of the number of
features dealing with vocal tract opening by replacing the features
"consonantal" and "continuant" by one single feature, (medially) "closed",
which is identical to the "consonantal" feature but for an extension to
separate stops and affricates from fricatives. Before applying this feature
we shall study how some of the main phonetic classes are encoded.
TABLE I-A-1
Rows (features): syllabic, consonantal, sonorant, nasal, lateral, continuant, inst. release
Columns (classes): vowels, nasals, laterals, r-sounds, glides+h, stops, affricates, fricatives
Features that by definition are implied by other features of the same
phonological segment are marked with parentheses. Blank spaces represent
other instances of "unmarkedness", i.e. (a) not applicable because of
physiological constraints, (b) irrelevant for the classificatory function,
or (c) occurrence in rare cases only. In detailed feature analysis it
would be valuable to have separate notations for these four different
aspects of unmarkedness and also for the fifth aspect, that related to
sequential constraints as implied by all higher levels of analysis.
According to Chomsky and Halle the [+nasal] feature when added to stops
could stand for prenasalization, i.e. instance (c) above, whereas [+nasal],
when added to vowels or liquids, is a contextual variant due to adjacent
nasal consonants and can thus be omitted from the matrix (case (b) above).
It is interesting to note that if the feature matrix is to be used for
description of actual phonetic states, it would not be possible to
distinguish between proper nasal consonants and nasalized [r]-sounds. This
is a consequence of liquids being opposed to nasal consonants in terms of
the [-nasal] feature alone instead of by a specific complex such as the
[+vocalic][+consonantal] of the earlier conventions.
A similar case of defining a phonetic category by the negative of
another not directly related category is the encoding of [r]-sounds as
[-lateral]. It is questionable whether an inhibition of the lateral command
in the production of an [l] automatically results in an [r]-sound.
Additional adjustment may be necessary. These examples are analogous to
the [-coronal, +anterior] encoding of labial consonants, which I consider
more objectionable. All these instances of classification in terms of
combinations and selections from a finite set are acceptable provided we
give up the demand that each feature shall represent an independent and
specific production category.
A coding tree related to Table I-A-1 is shown in Fig. I-A-1. The
syllabic feature presides at the top but this is not crucial. The same
number of yes-no branching points would have been needed if we put the
sonority feature on top. Now, coding trees are deceptive in a way, since
all sorts of variations and hierarchies are possible because of inherent
redundancies. However, the manipulation of coding trees has the pedagogical
merit of bringing out these redundancies.
Examples of coding trees for the reduced set of features I have proposed
are shown in Figs. I-A-2 and I-A-3. In one the syllabic feature is placed
on the top, in the other it is given the lowest place and sonorant the top
place. The economy in terms of branching points is the same in all three
figures. Figs. I-A-2 and I-A-3 merely have the merit of a smaller number of
features. It was actually during the construction of such trees that I
observed the complementary distribution of [-continuant] and
[+consonantal]. I prefer the tree of Fig. I-A-2, which starts out with the
sonorant feature related to vocal tract opening irrespective of where it
occurs. Then logically follows the feature of closure in the vocal tract
midsagittal plane, then the manner of release of this closure, which
applies to [-sonorants] only. The medially closed sonorants are then
separated into nasals, laterals, and r-sounds as previously discussed
Fig. I-A-1. Coding tree with the basic Chomsky-Halle features, "syllabic" replacing "vocalic".
Fig. I-A-2. Coding tree with the features consonantal and continuant replaced by a single feature "mid-closure". The feature "sonorant" is given the top level.
Fig. I-A-3. Alternative coding tree with the same features as in Fig. I-A-2 arranged in a different order, the feature "syllabic" in the top. Note the relation to Fig. I-A-1.
and glides are opposed to vowels as nonsyllabic. The main acoustic
correlate of voiced sonorants is their higher F1 intensity, whereas the
acoustic correlate of "closure" is a reduction of formants higher than F1.
The specification of the nasal and the lateral correlates is not so simple.
They will not be discussed here.
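The order of decisions described for the preferred tree can be written out as a nested classification. The sketch below follows the verbal description only, since the figure itself is not reproduced; the function name `classify`, the key names, and the exact placement of the syllabic decision at the bottom of each sonorant branch are assumptions.

```python
def classify(seg):
    """Walk a coding tree in the order sonorant, mid-closure, lower features.

    `seg` maps feature names to True (+) or False (-). Features that are not
    reached on a given branch are never looked up, which is the redundancy
    that the tree brings out.
    """
    if not seg["sonorant"]:
        if not seg["mid_closure"]:
            return "fricative"
        # Manner of release applies to [-sonorant] segments only.
        return "stop" if seg["inst_release"] else "affricate"
    if seg["mid_closure"]:
        if seg["nasal"]:
            kind = "nasal"
        elif seg["lateral"]:
            kind = "lateral"
        else:
            kind = "r-sound"
        return ("syllabic " if seg["syllabic"] else "") + kind
    return "vowel" if seg["syllabic"] else "glide or h"


print(classify({"sonorant": False, "mid_closure": True, "inst_release": False}))
print(classify({"sonorant": True, "mid_closure": False, "syllabic": True}))
print(classify({"sonorant": True, "mid_closure": True, "nasal": False,
                "lateral": True, "syllabic": False}))
```

The three calls return an affricate, a vowel, and a nonsyllabic lateral. Moving the syllabic test to the top would change the order of the questions but not the number of branching points per class.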
Some detailed comments
The class of h-sounds has always been a problem in feature analysis.
I accept the classification of glides (semivowels) and h-sounds given by
Chomsky and Halle as [+sonorant], [-consonantal]*, and [-syllabic], but I
object to their contrasting of h-sounds to other glides as [+low]. This
solution is an apparent mistake since h-sounds display perfect
coarticulation with vowels whether [+low] or [-low]. The h-sounds, voiced
or unvoiced, are produced with an active glottal readjustment.
The presence of the unvoiced h-sound in the class of sonorants weakens
the simple acoustic correlate of intensity of this class, since velar
fricatives display similar acoustic patterns but with more noise in the
region above F2. The degree to which the intensity is associated with the
vocalic formant patterns is accordingly a necessary aspect to take into
account. This fact also correlates with the affinity of sonorants to be
found next to the syllabic nucleus.
Directly related to the classification of h-sounds is the treatment of
aspiration. The statement of Chomsky and Halle that a feature of heightened
subglottal pressure is a necessary requirement for aspiration is not
tenable, see Fant, Acoustic Theory of Speech Production, pp. 277-279.
Instead we need a new feature of "glottal relaxation", yet to be defined,
that covers aspiration in general as well as the class of h-sounds.
On the whole, there is a need for further studies of the phonatory
mechanism in various situations before we can single out the various
phonetic components involved in the various manners of articulation of stop
sounds. The difference between English or Swedish [p, t, k] and [b, d, g]
involves aspiration, tenseness, and voicing as phonetic parameters.
In initial stressed position the aspiration, i.e. glottal relaxation, is
the obvious cause of the delay of voicing in [p, t, and k]. A higher
intraoral stop pressure, when present, appears to reflect a larger glottal
opening
* or [-"midclosure"] instead of [-consonantal].
rather than a higher subglottal pressure. At the same time there appears
to be a prolongation of the state of articulatory narrowing in [p, t, and
k] which accounts for a high frequency "fricative" noise superimposed on
the first part of the aspiration.
There are also coarticulation differences. The range of F2-locus at
the instant of release is greater for the voiced than for the unvoiced
stops, especially so with [b] compared with [p]. This can be seen in the
data of Lehiste and Peterson (1961), and I have measured similar
distributions for Swedish (forthcoming article). At the instant of release
of [b] before a back vowel the tongue takes a position close to that of the
following vowel, whilst the instant of release of the [p] before the same
vowel displays a much higher locus, typical of neutral tongue articulation.
After about 40 msec from the release of the [p] the formant pattern follows
essentially that observed immediately after the release of the [b]. These
temporal relations should be studied more closely.
It could be, as stated by Chomsky and Halle, that the amount of vocal
wall tensening could affect the possibility to maintain a prevoicing
(before the release), but I consider the glottal adjustment to be primary
and that it also is the primary cause of the small difference found in the
time lag of voicing after release comparing the intervocalic [k, p, t] and
[g, b, d] and, associated with this time lag, a difference in the F1
contour (F1 cutback).
The mere fact that there are certain "tense-lax" elements associated
with the distinction between the English or Swedish [k, p, t] versus
[b, d, g] in addition to the obvious glottal adjustments is not a
sufficient basis for selecting the feature "tense" rather than the feature
"voiced". According to Chomsky and Halle the criterion for classifying
[p, t, k] as [+tense] rather than [-voiced] would be that vocal vibrations
are stopped because of articulatory interaction rather than by glottal
relaxation. With this criterion I would lay a greater importance on the
voicing component than on the tenseness component. Further studies are
needed.
The feature "distributed", which on the articulatory level is defined as
a long versus short constriction in the direction of the air flow, has not
been analyzed very closely as to its acoustic correlates, and these are far
from obvious. Differences in source location, size of front cavity, and the
degree of coupling to the back cavities may be affected. A high frequency
extension of the noise could be an acoustic correlate but I cannot really
say anything definite before I have studied actual samples of spectrograms
and cineradiograms. It appears to me that the main difference between
labials and labiodentals is that of a less effective versus a more
effective source and I am rather hesitant to equate it with differences in
tongue articulations.
In Swedish there are both dental and apical alveolar stops, the latter
being lexically induced by a previous /r/. The phonological component
would have to work with classifications that differentiate these
articulations. It is indeed questionable whether the phonetic difference is
that of distributed-nondistributed.
Swedish vowels
The feature "covered", pertaining to a narrowed, tensed pharynx wall and
an elevated larynx, is suggested to have some relevance for the difference
between the Swedish vowels [y] and [ʉ]. There is no evidence to support
this suggestion as far as I can see.
The Swedish vowel system is of considerable interest in view of the
large number of sounds contained. I shall attempt here to construct a
phonetic feature matrix of Swedish long vowel phonemes, [u:], [o:], [a:],
[ɛ:], [e:], [i:], [y:], [ʉ:], [ø:], and the pre-r allophones [æ:] and [œ:]
of [ɛ:] and [ø:], respectively. I shall first attempt to use the
Chomsky-Halle tongue-body features back, low, high, and the rounding
feature. In addition, I have defined two new features, which in the
consistent articulatory terminology are named "palatal" and "labial". These
function as extreme degrees of tongue-height and lip-rounding,
respectively. It has long been recognized that all Swedish long vowels of
extreme low first formant frequency, [i:], [y:], [ʉ:], and [u:], are
pronounced as diphthongs towards a homorganic glide or fricative. However,
what is not so obvious and often overlooked is that the vowel [y:] is made
with a palatal closing gesture just as in [i:] but with added lip-rounding
and that the front vowel [ʉ:] is produced with a labial gesture towards
closure just as in the back vowel [u:], Fant (1968). The historical origin
of [ʉ:] is a tongue fronting of [u:] which was replaced by an [o:] in a
vowel shift. In the Swedish spoken in Finland [ʉ:] and [ʉ] are not
differentiated and are realized with a single sound shape. The tongue
fronting of the "long" [ʉ:] has now progressed to an articulation close to
that of [i:], [y:], [e:], and [ø:], generally a little more open than [y:]
and a little more close than [ø:]. As far as I can judge the element of
velarization has been completely lost*. The position of the mass of the
tongue in the palatal-velar direction is not more "velar" than that of the
other front vowels, and the apex is often slightly raised, thus tending to
shift the location of the tongue-palate constriction somewhat anterior of
[i:]. However, in the class of "short", i.e. lax Swedish vowels,** the
tongue of [ʉ] is lower than that of [ø] but more velarized.
When sampling formant data on vowels the distinction between Swedish
[o:] and [u:] and between [ø:] and [ʉ:] may be obscured if [u:] and [ʉ:]
are sampled at their onset and not at their target values where F1 and F2
are lower. Similarly, the contrast between [y:] and [ʉ:] is increased if
the sampling is performed at the later part of the vowel where F2 of [ʉ:]
has been progressively lowered and F3 of [y:] has been progressively
increased. At the place of the vowel target the main constriction is at the
lips for [u:] and [ʉ:] but at the tongue-palate region for [y:] and [i:].
The progressively decreasing tongue-height in the series [u:], [o:], [ɑ:]
and in [i:], [e:], [ɛ:], [æ:] and in [ʉ:], [ø:], [œ:] is paralleled by an
increasing jaw opening, Lindblom (1967). It has been demonstrated by
Lindblom and Sundberg (1969a and b) that with a minimum jaw opening but
otherwise normal tongue movements the F1 range is considerably reduced.
The jaw opening thus adds not only to the tongue-palate distance but also
to the effective lip-opening, everything else being equal. The six vowel
features classify the Swedish long vowels as follows.
TABLE I-A-2
Swedish long vowels. [æ:] and [œ:] are pre-r allophones of [ɛ:] and [ø:].
Binary system.

          u:  o:  ɑ:  æ:  ɛ:  e:  i:  y:  ʉ:  ø:  œ:
back      +   +   +   -   -   -   -   -   -   -   -
low       -   -   +   +   -   -   -   -   -   -   -
high      +   -   -   -   -   +   +   +   +   +   -
palatal   -   -   -   -   -   -   +   +   -   -   -
round     +   +   -   -   -   -   -   +   +   +   +
labial    +   -   -   -   -   -   -   -   +   -   -
* Lindblom and Sundberg (1969a) classified [ʉ:] as "velar" but expressed
doubts as to the phonetic validity.
** The quality of short /ʉ/ is generally transcribed as [ɵ].
In the consonant system the feature "labial" should be used instead of
[+anterior] [-coronal] to define the class of labial consonants. Labialized
vowels are analogous to "retroflex", i.e. [+coronal], vowels. Long (tense)
Swedish vowels are accordingly diphthongized if they possess the features
"palatal" or "labial". These are the maximally "close" vowels, compare
Lindblom and Sundberg (1969a).
An alternative matrix may be set up with "jaw closure" instead of the
"palatal" feature. The maximum degree of jaw closure is found in [i:],
[y:], [ʉ:], and [u:], which would be labeled [+closed]. With this solution
one gains the distinction in actual tongue-palate opening comparing [ʉ:]
and [ø:] whilst the distinction between [ʉ:] and [y:] is reduced to one of
labialization only. One then has to add the rule that labialization always
determines the diphthongal element when present in the close vowels.
Note the minimal distinction of [-back] separating [ʉ:] from [u:] in either
system. A third and rather different alternative system was suggested
by Lindblom and Sundberg (1969a).
The variety of solutions possible in a system of interrelated physiological
dimensions scaled according to binary principles is indeed a problem.
One source of variability is that the number of possible combinations
generated from a given ensemble is larger than the number of sounds to be
encoded. Therefore there may result an ambiguity in feature selection.
Two or more physiological parameters may contribute to one and the same
acoustical and perceptual effect which may constitute a more natural
candidate for the role of feature, at least in the sense of phonetic feature.
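The surplus of combinations over sounds can be made concrete with a simple count. The following Python sketch assumes the six binary features named in the text and the eleven long-vowel sounds (nine phonemes plus two pre-r allophones); the variable names are illustrative only.

```python
from itertools import product

# Six binary vowel features used in the binary matrix (names as in the text).
FEATURES = ["back", "low", "high", "palatal", "round", "labial"]
N_SOUNDS = 11  # nine long vowel phonemes plus two pre-r allophones

# Every assignment of + / - to the six features.
combinations = list(product("+-", repeat=len(FEATURES)))

print(len(combinations))             # 64 possible feature bundles
print(len(combinations) - N_SOUNDS)  # 53 bundles left unused
```

Some of the unused bundles are articulatorily contradictory (for instance "+low" together with "+high"), but even after discarding those the ensemble remains considerably larger than the inventory, which is what leaves room for alternative matrices.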
Let us see what happens if we try to simplify the inventory of articulatory
parameters by grouping together the features "low", "high", and "palatal"
to one single dimension assigning the value 0 for the most "open" degrees
[ɑ:], [æ:], and [œ:] and the value 3 for the maximally palatal [i:].
Similarly the feature labial is added to that of rounding accounting for
TABLE I-A-3

          u:  o:  ɑ:  æ:  ɛ:  e:  i:  y:  ʉ:  ø:  œ:
back      1   1   1   0   0   0   0   0   0   0   0
high      2   1   0   0   1   2   3   3   2   2   0
round     2   1   0   0   0   0   0   1   2   1   1
A matrix of this sort is easier to comprehend than a multidimensional
binary system. There are apparently three major classes within the system:
the back vowels [u:], [o:], [ɑ:], in which an increase in tongue height
goes with increasing lip rounding (partially jaw dependent); the unrounded
front vowels, which are differentiated by tongue (and jaw) height; and the
rounded front vowels, which are also differentiated by height and by extra
rounding as a special feature of [ʉ:], cf. Malmberg (1956) and Fant (1966).
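The three classes can be read directly off the multi-valued matrix. The Python sketch below encodes the (back, high, round) values of Table I-A-3 and groups the vowels accordingly. The IPA keys are a reading of the table header and should be taken as an assumption, and the function name is hypothetical.

```python
# Multi-valued feature matrix of Table I-A-3: (back, high, round) per vowel.
# The IPA keys are a reading of the table header and are an assumption.
MATRIX = {
    "u:": (1, 2, 2), "o:": (1, 1, 1), "ɑ:": (1, 0, 0),
    "æ:": (0, 0, 0), "ɛ:": (0, 1, 0), "e:": (0, 2, 0), "i:": (0, 3, 0),
    "y:": (0, 3, 1), "ʉ:": (0, 2, 2), "ø:": (0, 2, 1), "œ:": (0, 0, 1),
}

def vowel_class(back, high, rnd):
    """Assign one of the three major classes named in the text."""
    if back:
        return "back"
    return "rounded front" if rnd > 0 else "unrounded front"

classes = {}
for vowel, (back, high, rnd) in MATRIX.items():
    classes.setdefault(vowel_class(back, high, rnd), []).append(vowel)

# Each vowel must receive a distinct feature vector.
assert len(set(MATRIX.values())) == len(MATRIX)

for name, members in classes.items():
    print(name, members)
```

With three dimensions of at most four levels each, every one of the eleven sounds still receives a distinct vector, and within the back series the "high" and "round" values rise together, as stated above.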
At this stage we might ask for the acoustic and perceptual correlates
of these articulatory categories. The phonetic color is mainly dependent
on F1, F2, and F3, but it should be possible to find an optimal projection
of this three-dimensional space on a plane. Pilot experiments now in
progress at the Dept. of Speech Communication, KTH (Fant, Carlson and
Granström) indicate that an F1 versus F2' plot would serve this purpose.
F2' is the frequency of the second formant in a two-formant approximation
to the vowel. In mid and back vowels F2' is identical to F2 and in high
front vowels close to F3.
A tentative F1 versus F2' plot of Swedish long vowels and some short
vowels of specific identity has been made on a mel scale, Fig. I-A-4.
In this diagram we find evidence of a fairly even spread in the perceptual
domain. The average distance between any of the sounds and its closest
neighbor is 180 mels. The articulatory correlate of increasing F1 is
increasing jaw opening and a shift of tongue place towards a pharyngeal
position. The articulatory correlate of the ordinate F2' is a shift of the
tongue away from the velum and towards the palate.
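The text does not state which frequency-to-mel conversion underlies the plot. As a sketch only, the following Python functions use one common technical approximation of the mel scale, 1000 · log2(1 + f/1000). This formula and the example formant values are assumptions and not data from the paper.

```python
import math

def hz_to_mel(f_hz):
    """Approximate mel value: 1000 * log2(1 + f/1000) (assumed formula)."""
    return 1000.0 * math.log2(1.0 + f_hz / 1000.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 1000.0 * (2.0 ** (m / 1000.0) - 1.0)

# Hypothetical formant pair (Hz) for illustration only, not measured data.
f1, f2_prime = 300.0, 3000.0
m1, m2_prime = hz_to_mel(f1), hz_to_mel(f2_prime)
print(round(m1), round(m2_prime))
```

Under this approximation 1000 Hz corresponds to 1000 mel, and the scale is compressed above that point, so that equal mel steps correspond to progressively larger frequency steps.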
It can be seen that back vowels may be separated from front vowels
by a line of the slope +45 degrees and rounded vowels from unrounded
vowels with a line of -45 degrees slope. Therefore a rotation of
coordinates as in Fig. I-A-5 brings out the direct correlates to the main
vowel classes. Back vowels are characterized by a distance between the
first and the second formant lower than 400 mels. All unrounded front
vowels lie close to a line of M1+M2' = 2200 mel and the rounded front
vowels have an abscissa of M1+M2' less than 2100 mel. The quantal steps in
the ordinate comparing [i:], [e:], [ɛ:], and [æ:] are of the order of
250-300 mels whereas the quantal steps in the abscissa are of the order of
200-250 mels. Since we now have condensed the vowel space to a plane we
have only two orthogonal parameters.
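The rotation and the class boundaries quoted above can be sketched as follows in Python. The thresholds of 400 and 2100 mel are taken from the text. The function names and the example mel values are hypothetical, and the rotation is written without the 1/sqrt(2) normalising factor, simply as the sum and the difference of the two mel coordinates.

```python
def rotate(m1, m2p):
    """45-degree rotation of the (M1, M2') plane, unnormalised.

    Returns (total, difference) = (M1 + M2', M2' - M1).
    """
    return m1 + m2p, m2p - m1

def classify(m1, m2p):
    """Apply the class boundaries quoted in the text (values in mel)."""
    total, difference = rotate(m1, m2p)
    if difference < 400:      # first and second formant close together
        return "back"
    if total < 2100:          # lowered sum of the two mel values
        return "rounded front"
    return "unrounded front"  # expected to lie near total = 2200

# Hypothetical mel pairs for illustration only, not measured data.
examples = {"A": (350, 650), "B": (300, 1900), "C": (350, 1600)}
for label, (m1, m2p) in examples.items():
    print(label, rotate(m1, m2p), classify(m1, m2p))
```

The order of the tests matters in this sketch: the back/front decision is made first on the difference, and only front vowels are then divided on the sum.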
The abscissa (M1+M2') is twice the center of gravity of the spectrum,
giving equal weight to M1 and M2', and will be identified with the negative
of the old feature "flat". Labialization, velarization, jaw closing, and
larynx lowering will all lower the center of gravity, whilst the ordinate,
here referred to as the spectral feature "spread", is a measure of
dispersion. Note that it is related to but not identical to any of the old
features such as [-compactness], [+diffuseness], or [-gravity]. The
spectral spread is increased with moving the tongue from a pharyngeal to a
palatal place of articulation. Five levels are indicated by the points
[ɑ:], [æ:], [ɛ:], [e:], [i:]. Note that increasing jaw opening increases in
the first hand M1 and thus makes the spectrum less flat and less spread.
Fig. I-A-5 would motivate a quantization of the long vowels in scales of
"flat" and "spread" as follows.
TABLE I-A-4

           u:  o:  ɑ:  æ:  ɛ:  e:  i:  y:  ʉ:  ø:  œ:   parameter
"spread"   0   0   0   1   2   3   4   3   2   2   1    M2'-M1
"flat"     5   4   2   0   0   0   0   1   2   1   2    -(M1+M2')
These scales are absolute but can of course be reduced according to the
principle of complementary distributions. The progressing "flatness"
from [ɑ:] over [o:] to [u:] is the effect of rounding + velarization
whereas the flatness of [ʉ:] is primarily a matter of small lip opening.
As previously discussed no velarization appears to be involved in [ʉ:]
but possibly an "anteriorization". The possibility of compensatory forms
of articulation in the flatness domain is apparent. In the class of "short",
i.e. lax Swedish vowels, the /ʉ/, phonetically [ɵ], is more "velar" than
the short [ø], see Fig. I-A-6. These facts support a perceptual rather
than an articulatory feature basis.
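The quantized scales lend themselves to a small consistency check. The Python sketch below encodes the (spread, flat) values of Table I-A-4. The IPA keys are a reading of the partly illegible table header and should be treated as an assumption, and the variable names are illustrative.

```python
# Quantized scales of Table I-A-4: (spread, flat) per vowel.
# The IPA keys are a reading of the table header and are an assumption.
SCALES = {
    "u:": (0, 5), "o:": (0, 4), "ɑ:": (0, 2), "æ:": (1, 0),
    "ɛ:": (2, 0), "e:": (3, 0), "i:": (4, 0), "y:": (3, 1),
    "ʉ:": (2, 2), "ø:": (2, 1), "œ:": (1, 2),
}

# The unrounded series from [ɑ:] to [i:] occupies five "spread" levels.
series = ["ɑ:", "æ:", "ɛ:", "e:", "i:"]
spread_levels = [SCALES[v][0] for v in series]
print(spread_levels)  # [0, 1, 2, 3, 4]

# Vowels sharing a "flat" value are still kept apart by "spread".
same_flat = [v for v, (_, flat) in SCALES.items() if flat == 2]
print(same_flat)      # ['ɑ:', 'ʉ:', 'œ:']

# No two vowels may share the same (spread, flat) pair.
assert len(set(SCALES.values())) == len(SCALES)
```

That sounds with different articulations can share one "flat" value while remaining distinct in "spread" is in line with the compensatory forms of articulation mentioned above.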
It has often been suggested that articulatory descriptions of vowels
actually rely on underlying perceptual classifications, Ladefoged (1967b).
Our data indicate that the Swedish vowels are not arbitrarily spaced
individuals in the space of physically producible sounds but show a clear
organization in terms of linear sequences and a tendency of equidistant
spacings in an orthogonal perceptual space. This ordering appears to be
a subset of a language universal system of maximal contrast. This idea
was also expressed by Lindblom and Sundberg (1969a). Further work
along these lines is continuing. Earlier work on mel scale mapping of
Swedish vowels was published by Fant (1959).
references on next page
Fig. I-A-5. Swedish vowels in a "spread" versus "flat" mel scale plot
bringing out some orthogonal vowel categories (back and front vowels) and a
tendency of equidistant mel spacings. Axis labels: feature "spread" (tongue
to palatal place, jaw closing); feature "flat", mels (labialization,
velarization, jaw closing, larynx lowering: formants moving down); jaw
opening vector (increasing F1).
Fig. I-A-6. X-ray tracings of Swedish vowels. (From Fant, G.: "The acoustics of speech", in Proc. of the Third International Congress on Acoustics, Stuttgart 1959, pp. 188-201, Fig. 9, Amsterdam 1961.)