ACOUSTIC INVESTIGATION OF THE VOWEL SYSTEMS OF BUGINESE AND TOBA BATAK

Robert J. Podesva*, Niken Adisasmito-Smith†
*Stanford University, USA; †Cornell University, USA

ABSTRACT

We report here on an investigation of Buginese and Toba Batak vowels, considering formant structure, duration, F0, and intensity in stressed and unstressed syllables. Three native speakers of Buginese (I, N, Y) and one of Batak were recorded producing disyllables with penultimate stress. The vowel spaces for /u/ and /o/ greatly overlap in both languages, and Buginese schwa is nearly as high as /i, u/. For two Buginese speakers and the Batak speaker, F0 correlates with stress, while duration and intensity distinguish stress for the remaining Buginese speaker. For speakers N and I, vowels in stressed position are shorter in duration (final unstressed vowels are longer due to final lengthening) and more centralized than those in unstressed position. This provides support for Lindblom’s undershoot hypothesis (arguing that the shorter a given vowel, the more centralized it is) and argues against the schwa hypothesis (contending that vowels centralize in unstressed position).

1. INTRODUCTION

In this paper, we report on a systematic investigation of the vowels of two Western Austronesian languages in Indonesia: Buginese (South Sulawesi) and Toba Batak (North Sumatra). The study is primarily descriptive in nature, providing results on formant frequency, F0, duration, and intensity, since there is no acoustic work on Buginese vowels and only a preliminary acoustic description of Toba Batak [1]. In fact, of the over 300 languages spoken in Indonesia, only the vowels of Indonesian have been the subject of an acoustic study [2].

As a second goal of this study, we seek to explore the acoustic correlates of Buginese and Toba Batak word stress. Though stress has received much attention in the literature, most studies have focused on European languages. In a departure from European languages, stressed syllables were found to have the highest F0 and greatest intensity in Indonesian, while no strong relationship was found between stressed syllables and duration [3] (which has been found to correlate strongly with stress in many European languages). Here we examine stress in two additional languages spoken in Indonesia.

In the remainder of the paper, we present our experimental methods (section 2); a systematic acoustic description of Buginese and Batak vowels (section 3); the acoustic correlates of stress in both languages (section 4); and a conclusion (section 5), in which we summarize our principal findings and their implications.

2. METHODS

This study reports on the speech of three female speakers (Y, N, I) of the Bone dialect of Buginese and one male speaker (W) of Toba Batak (results for an additional speaker will be presented at the Congress).

Buginese has a six-vowel system: /i, e, a, o, u, ə/, while Batak has the following five vowels: /i, e, a, o, u/. Vowels are exemplified in 37 disyllables (the canonical root shape in both languages) in Buginese and 32 disyllables in Batak. In both languages, stressed vowels were targeted in penultimate syllables (stress in both languages is penultimate) for the acoustic description. In order to examine the correlates of stress, words with identical first and second syllable vowels were chosen to rule out the effects of intrinsic acoustic differences among vowels.

Speakers were recorded producing four repetitions of each word, embedded in a carrier phrase. For all speakers, recordings were made in a quiet setting. The recordings were digitized at a sampling rate of 11025 Hz. Vowel onset and offset were identified on wideband spectrograms and waveforms and labeled using the X-Label attachment of Waves+. Acoustic measurements for F1, F2, duration, F0, and intensity were calculated with a series of scripts designed by Eric Evans at the Cornell Phonetics Laboratory. Values generated by the scripts were checked randomly for accuracy by the researchers. Statistical analysis was carried out in SPSS.

3. ACOUSTIC DESCRIPTION

In this section, we offer an acoustic description of the vowels in Buginese and Batak, focusing on formant structure, duration, F0, and amplitude. Because the four speakers under analysis here display somewhat different patterns, we have opted not to pool results across speakers.

3.1. Formant Structure

The formant plots for the three Buginese speakers are presented in Figures 1-3, with F1 (in Hz) represented on the y-axis and F2 (in Hz) plotted on the x-axis. Each ellipse represents the area of 2 standard deviations from the mean value.

Figure 1. Formant plot for Buginese speaker I

page 535 ICPhS99 San Francisco

Figure 2. Formant plot for Buginese speaker N

Figure 3. Formant plot for Buginese speaker I

Perhaps the most striking characteristic of these formant plots is the degree of overlap between the vowels. For I, only a small portion of the vowel space for /u/ does not overlap with the vowels /o/ and schwa. Moreover, N’s vowel space for /o/ is contained entirely within the vowel space of /u/. The three speakers also show a tendency for schwa to pattern with the high vowels in the F1 (height) dimension, a pattern evidenced in Indonesian as well [2].

The Batak speaker exhibits a pattern similar to Buginese speaker N. As shown in Figure 4, the vowel space for /u/ completely encloses that of /o/. The remaining vowels, on the other hand, are neatly contained and exhibit virtually no overlap.
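As a minimal sketch of how ellipses of this kind might be computed, the Python fragment below derives a 2-standard-deviation ellipse from the sample covariance of one vowel's F1 and F2 tokens. This is only one reading of "2 standard deviations from the mean value"; the paper does not state how its ellipses were drawn, and the function name and token values are hypothetical.

```python
import math
from statistics import mean

def covariance_ellipse(f1_values, f2_values, n_sd=2.0):
    """Return (center, semi_axes, angle_rad) of an n_sd covariance ellipse.

    The plane is (x = F2, y = F1), matching the axes of the formant plots.
    """
    n = len(f1_values)
    m1, m2 = mean(f1_values), mean(f2_values)
    # Sample variances and covariance (n - 1 denominator).
    var_x = sum((x - m2) ** 2 for x in f2_values) / (n - 1)
    var_y = sum((y - m1) ** 2 for y in f1_values) / (n - 1)
    cov_xy = sum((x - m2) * (y - m1)
                 for x, y in zip(f2_values, f1_values)) / (n - 1)
    # Closed-form eigenvalues of the 2x2 symmetric covariance matrix.
    half_trace = (var_x + var_y) / 2.0
    spread = math.hypot((var_x - var_y) / 2.0, cov_xy)
    major = n_sd * math.sqrt(half_trace + spread)
    minor = n_sd * math.sqrt(max(half_trace - spread, 0.0))
    # Orientation of the major axis, measured from the F2 axis.
    angle = 0.5 * math.atan2(2.0 * cov_xy, var_x - var_y)
    return (m2, m1), (major, minor), angle

# Hypothetical /u/ tokens (Hz); these are not measurements from the study.
f1_u = [340, 360, 355, 330, 370]
f2_u = [900, 1050, 980, 870, 1100]
center, axes, angle = covariance_ellipse(f1_u, f2_u)
print(center, axes, math.degrees(angle))
```

Under this reading, two vowels overlap heavily when the center of one falls well inside the ellipse of the other, as described above for /o/ and /u/.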

Figure 4. Formant plot for Toba Batak speaker W

3.2. Duration

The mean durations for the six Buginese and five Batak vowels, sorted by speaker, are presented in Figure 5. For all three speakers of Buginese, as well as the Batak speaker, duration patterns inversely with vowel height (except in the case of schwa). The same general pattern has been found in English and many other languages, most likely because of the greater articulatory movement required for the production of low vowels [4]. Surprisingly, a different pattern has been found in a much more closely related language, Indonesian, in which /e/ and /o/ were longer than /a/ [2].

Figure 5. Mean vowel duration (ms), by speaker (Buginese Y, N, I; Toba Batak W)

3.3. Fundamental Frequency

The mean F0 values (taken at stressed vowel midpoint) for Buginese and Batak vowels are shown in Figure 6. The chart reveals a general tendency for F0 to vary directly with vowel height for speakers of both languages. Again, the same trend was found in English and other languages [4]. As with duration, Indonesian exhibited a different pattern: the F0 values of /o/ are roughly equivalent to those of /a/, rather than to those of its corresponding mid vowel /e/ [2]. Regarding schwa, its F0 was found to be approximately equivalent to that of the high vowels in Indonesian, which may be the case in Buginese as well, though the tendency is insufficiently clear to make a definitive claim, particularly in the case of N.

Figure 6. Fundamental frequency (Hz) at vowel midpoint, by speaker (Buginese Y, N, I; Toba Batak W)

3.4. Intensity

We conclude our acoustic description of Buginese and Batak vowels with a discussion of intensity. The mean intensity values for stressed vowels are charted in Figure 7. We call attention to the fact that the values for Buginese speaker I are considerably lower than those for the other speakers, due to imperfect field recording conditions. Nevertheless, this difference is irrelevant for our purposes here, given that we are considering relative, rather than absolute, intensity.
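Because decibels are logarithmic, a constant difference in recording level shifts every value for a speaker by the same number of dB, so expressing each vowel relative to that speaker's own mean removes the offset. The sketch below illustrates the idea with hypothetical values; it is an assumption about how relative intensity could be computed, not the procedure used in this study.

```python
def relative_intensity(db_by_vowel):
    """Express each vowel's intensity (dB) as an offset from the speaker's mean."""
    speaker_mean = sum(db_by_vowel.values()) / len(db_by_vowel)
    return {vowel: db - speaker_mean for vowel, db in db_by_vowel.items()}

# Hypothetical speakers whose recordings differ by a constant 10 dB offset.
quiet_recording = {"a": 60.0, "e": 64.0}
loud_recording = {"a": 70.0, "e": 74.0}
print(relative_intensity(quiet_recording))
print(relative_intensity(loud_recording))
```

Both hypothetical speakers yield the same relative pattern, which is why an overall level difference need not affect the comparison among vowels.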

Figure 7. Intensity (dB) at vowel midpoint, by speaker (Buginese Y, N, I; Toba Batak W)

Figure 7 supports few conclusive findings concerning the intrinsic intensity of Buginese or Batak vowels, though some clear tendencies are evident. Disregarding schwa in Buginese, the mid front vowel /e/ has the greatest intensity for all speakers of Buginese as well as the Batak speaker. This is surprising, as many studies report /a/ to have the highest intensity in English [5, 6]. In Buginese, the intensity levels for schwa were also surprising. We found that for speaker Y, schwa has the highest intensity of all vowels, while its intensity is second only to /e/ for the other two speakers.

4. ACOUSTIC CORRELATES OF STRESS

In this section, we discuss the acoustic correlates of word stress in Buginese and Batak, considering the parameters discussed in section 3: formant structure, duration, F0, and intensity.

4.1. Formant Structure

We begin our discussion of the acoustic correlates of stress with formant structure, the acoustic counterpart of vowel quality. Figures 8-10 show average values for vowels in stressed and unstressed position for the Buginese speakers. Bullets (•) represent stressed vowels, while minus signs (-) represent unstressed vowels. The figures reveal that the Buginese speakers exhibit different patterns for vowel quality with regard to stress. For Y, unstressed vowels occupy a more centralized position in the vowel space. For I and N, however, unstressed vowels are more peripheral than their stressed counterparts. This finding casts doubt on the schwa hypothesis, which argues that vowels centralize in unstressed position [7]. On the other hand, because unstressed vowels were longer than stressed vowels for speakers I and N (as will be discussed in section 4.2), the results here support Lindblom’s undershoot hypothesis, arguing that the shorter a given vowel, the more centralized it is [8].
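One simple way to quantify "more centralized" is the mean distance of the vowel means from the centroid of the vowel space. The sketch below assumes a plain Euclidean distance in Hz; the metric, the function name, and the values are hypothetical illustrations and are not taken from this study, which compares the plotted positions directly.

```python
import math

def centralization(vowel_means):
    """Return (mean distance, per-vowel distances) from the F1-F2 centroid.

    vowel_means maps a vowel label to its (F1, F2) mean in Hz.
    A smaller mean distance indicates a more centralized vowel space.
    """
    n = len(vowel_means)
    f1_center = sum(f1 for f1, _ in vowel_means.values()) / n
    f2_center = sum(f2 for _, f2 in vowel_means.values()) / n
    distances = {
        vowel: math.hypot(f1 - f1_center, f2 - f2_center)
        for vowel, (f1, f2) in vowel_means.items()
    }
    return sum(distances.values()) / n, distances

# Hypothetical (F1, F2) means in Hz for one speaker; not data from the study.
stressed = {"i": (350, 2300), "a": (800, 1500), "u": (380, 900)}
unstressed = {"i": (400, 2100), "a": (700, 1500), "u": (420, 1050)}
print(centralization(stressed)[0], centralization(unstressed)[0])
```

A raw Hz distance weights F2 differences more heavily than a perceptual scale would, so the same comparison could reasonably be run on transformed formant values.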

Figure 8. Formant plot of stressed vs. unstressed vowels (I)

Figure 9. Formant plot of stressed vs. unstressed vowels (N)

Figure 10. Formant plot of stressed vs. unstressed vowels (Y)

As illustrated in Figure 11, the Batak speaker exhibits the same vowel reduction pattern as Buginese speaker Y. The same trend has been found in English [8, 9] and Swedish [7, 8].

Figure 11. Formant plot of stressed vs. unstressed vowels (W)

4.2. Duration

Table 1 offers a comparison of stressed and unstressed vowel durations for all four speakers. Significant differences (alpha level = 0.05, using paired t-tests) in the expected direction (i.e., stressed vowels are longer, higher in F0, and have greater intensity) are indicated by asterisks (*) next to the values for unstressed vowels, while significant differences in the reverse direction are marked with a pound sign (#). Here again the speakers display different patterns. Duration correlates with stress for speaker Y, a pattern also found in Serbo-Croatian [10]. No stress-duration correlation exists for the remaining Buginese speakers or the Toba Batak speaker. In fact, duration is actually longer in unstressed syllables for speakers I and N (significantly so in the latter), a pattern which we attribute to final lengthening.
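The paired t-test behind the significance markers can be sketched as follows. The pairing assumed here (the stressed and unstressed vowel of the same word token) and all values are hypothetical; the analysis reported in this paper was carried out in SPSS.

```python
import math
from statistics import mean, stdev

def paired_t(stressed, unstressed):
    """Return (t, df) for a paired t-test on matched stressed/unstressed values."""
    if len(stressed) != len(unstressed) or len(stressed) < 2:
        raise ValueError("need at least two matched pairs")
    diffs = [s - u for s, u in zip(stressed, unstressed)]
    n = len(diffs)
    # t is the mean difference divided by its standard error.
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical durations (ms) for five word tokens; not values from the study.
stressed_ms = [110, 120, 130, 115, 125]
unstressed_ms = [90, 95, 112, 100, 101]
t, df = paired_t(stressed_ms, unstressed_ms)
# The two-tailed critical value at alpha = 0.05 for df = 4 is 2.776.
print(f"t = {t:.2f}, df = {df}, significant = {abs(t) > 2.776}")
```

A positive t corresponds to the expected direction (stressed greater than unstressed), and a negative t to the reverse direction. Where SciPy is available, `scipy.stats.ttest_rel` returns the p-value directly.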

V     Buginese Y      Buginese I      Buginese N      Batak W
      str   unstr     str   unstr     str   unstr     str   unstr
i      97    66*       95   104       100   162#       96   101
e       ?    80*      122   125       111   183#      101    98
a     125    91*      136   142       142   213#       98   108
o     116    77*      129   117       117   186#      106   124
u     107    71*       92   112       121   162        88   108#
ə      48    66*       74   102#       84   115#

Table 1. Duration (ms) in stressed and unstressed vowels (? = value illegible in the source)

4.3. Fundamental Frequency

The F0 values for stressed and unstressed vowels are found in Table 2. Again, Y’s pattern contrasts with the patterns exhibited by Buginese speakers I and N and Batak speaker W. The latter three consistently have higher F0 values for stressed vowels, with the small exception of speaker N’s /u/ and schwa. This general finding is consistent with the widely attested tendency for F0 to correlate with stress, as found in English [11], Danish [12], and Indonesian [3], among others. Speaker Y, on the other hand, reveals no consistent pattern.

V     Buginese Y      Buginese I      Buginese N      Batak W
      str   unstr     str   unstr     str   unstr     str   unstr
[rows for /i, e, a, o, u/ not recoverable]
ə     226   224       231   212*      239   241

Table 2. F0 (Hz) in stressed and unstressed vowels

4.4. Intensity

Table 3 lists the intensity level of stressed and unstressed vowels. While intensity does not appear to correlate with stress for I and N in Buginese and W in Toba Batak, a clear pattern emerges for Buginese speaker Y, for whom stressed vowels have a slightly higher intensity than unstressed vowels (though the stressed-unstressed differences do not reach statistical significance for /e/ and /u/). On average, speaker Y’s stressed vowels are 1 dB greater in intensity than unstressed vowels, a perceptible difference [6]. Whether this difference is sufficiently salient to trigger the perception of stress requires further investigation.

[Table 3 body not recoverable]

Table 3. Intensity (dB) in stressed and unstressed vowels

5. CONCLUSION

In conclusion, our main objective has been to describe the acoustic properties of the Buginese and Toba Batak vowel systems. We have found that the vowels /u/ and /o/ occupy much of the same vowel space, overlapping considerably for all speakers. In fact, the results of a MANOVA indicate that Speaker N’s first and second formants for /u/ were not significantly different from those of /o/ (alpha level = 0.05), suggesting potential difficulties in perceiving the categorical distinction between these vowels. We hypothesized that the duration or F0 of these vowels might be sufficiently different to preserve the acoustic distinction between them, but did not find that the differences (/u/ had a shorter duration and higher F0 than /o/) reached a level of significance. In Buginese, we observed a tendency for schwa to pattern with the high vowels along the dimensions of F1 (height) and F0, calling into question the vowel’s traditional transcription as [ə]. Finally, as is consistent with cross-linguistic trends, we found duration to correlate inversely and F0 to correlate directly with vowel height.

We have also endeavored to investigate the acoustic correlates of stress in Buginese and Toba Batak. Though we expected that formant structure (vowel quality) might be related to stress, we discovered instead that quality depends most heavily on duration, as predicted by the undershoot hypothesis [8]. With respect to the other acoustic parameters under analysis, we have found that Buginese speakers I and N, together with Toba Batak speaker W, pattern in opposition to Buginese speaker Y. In the case of the latter, duration and intensity correlate with stress, while F0 is the crucial parameter for the remaining three speakers. Whether these acoustic differences translate into distinct percepts merits attention in future perceptual studies.

ACKNOWLEDGMENTS

This work is supported by Abigail Cohn’s NSF grant number SBR-9511185. The authors would especially like to thank our Buginese and Toba Batak language consultants, Yati Paseng Barnard and Wilson Manik, respectively. Many thanks also to Abby Cohn for helpful comments on earlier drafts. Finally, thanks to Eric Evans for his technical and statistical assistance.

REFERENCES

[1] Nababan, P. W. J. 1981. A Grammar of Toba Batak. Pacific Linguistics, Series D, no. 37. Canberra: Australian National University.
[2] van Zanten, E. 1989. The Indonesian Vowels: Acoustic and Perceptual Explorations. Delft: Eburon.
[3] Adisasmito-Smith, N. and Cohn, A. 1996. Phonetic correlates of primary and secondary stress in Indonesian: A preliminary study. Working Papers of the Cornell Phonetics Laboratory, 11, 1-15.
[4] House, A. and Fairbanks, G. 1953. The influence of consonant environment upon the secondary acoustical characteristics of vowels. JASA, 25, 105-114.
[5] Lehiste, I. and Peterson, G. 1959. Vowel amplitude and phonemic stress in American English. JASA, 31, 428-435.
[6] Lehiste, I. 1970. Suprasegmentals. Cambridge, MA: MIT Press.
[7] Fant, G. 1962. Den akustiska fonetikens grunder. Kungl. Tek. Hogskol. Taltransmissionslab. Rappt. No. 7. Stockholm: Royal Institute of Technology.
[8] Lindblom, B. 1963. Spectrographic study of vowel reduction. JASA, 35, 1173-1181.
[9] Stetson, R. 1951. Motor Phonetics. Amsterdam: North Holland Publishing Co.
[10] Lehiste, I. and Ivic, P. 1963. Accent in Serbo-Croatian: An experimental study. Michigan Slavic Materials, no. 4. Ann Arbor: University of Michigan.
[11] Fry, D. 1958. Experiments in the perception of stress. Language and Speech, 1, 126-152.
[12] Thorsen, N. 1982. On the variability of F0 patterning and the function of F0 timing in languages where pitch cues stress. Phonetica, 39, 203-316.