Glides, Place and Perception

March 18, 2010

News

• The hard drive on the computer has been fixed!

• A couple of new readings have been posted to the course website.

• We’ll start today’s class off with a perception experiment…

X-ray microbeam thoughts

• “Stress locally shifts articulations toward the hyperarticulate end of the continuum. Speakers do whatever is necessary to enhance the realization of segmentally contrasting features. A primary mechanism for enhancing distinctions is to decrease coarticulatory overlap so that gestures for segments in stressed syllables blend less with each other or with segments in neighboring syllables.”

• Counter-thought: coarticulation between segments can often be a useful perceptual cue.

• Especially in the case of stop consonants…

Glides

• Each glide corresponds to a different high vowel.

Vowel   Glide   Place
[i]     [j]     palatal (front, unrounded)
[u]     [w]     labio-velar (back, rounded)
[y]     [ɥ]     labial-palatal (front, rounded)
[ɯ]     [ɰ]     velar (back, unrounded)

• Glides are vowel-like sonorants which are produced…

• with slightly more constriction than a vowel at the same place of articulation.

• Each glide’s acoustics will be similar to those of the vowel it corresponds to.

Glide Acoustics

• Glides look like high vowels, but…

• are shorter than vowels

• They also tend to lack “steady states”

• and exhibit rapid transitions into (or from) vowels

• hence: “glides”

• Also: lower in intensity

• especially in the higher formants

[j] vs. [i]

[w] vs. [u]

Vowel-Glide-Vowel

[iji] [uwu]

More Glides

[wi:] [ju:]

Transitions

• When stops are released, they go through a transition phase in between the stop and the vowel.

• From stop to vowel:

1. Stop closure

2. Release burst

3. (glide-like) transition

4. “steady-state” vowel

• Vowel-to-stop works the same way, in reverse, except:

• Release burst (if any) comes after the stop closure.

Stop Components

• From Armenian: [bag]

Spectrogram labels: closure voicing, vowel, formant transitions, another closure, stop release burst

Release Bursts

• The acoustic characteristics of a stop release burst tend to resemble those of a fricative made at the same place of articulation.

• Ex: labial release bursts have a very diffuse spectrum, just like bilabial and labio-dental fricatives.

[p] burst

Release Bursts: [t]

• Alveolar release bursts tend to lack acoustic energy at the bottom of the spectrum.

• To some extent, higher frequency components are more intense.

[t] burst

Release Bursts: [k]

• Velar release bursts are relatively intense.

• They also often have a strong concentration of energy in the 1500-2000 Hz range (F2/F3).

• There can often be multiple [k] release bursts.

[k] burst

Closure Voicing

• During the stop closure phase, only low-frequency information escapes from the vocal tract (for voiced stops)

• “voicing bar” in spectrogram

• analogy: loud music from the next apartment

Armenian:

[bag]

• This acoustic information provides hardly any cues to place of articulation.

[bag] vs. [bak]

• From Armenian (another language from the Caucasus)

[bag] [bak]

Confusions

• When the spectrogram was first invented…

• phoneticians figured out quite quickly how to identify vowels from their spectral characteristics…

• but they had a much harder time learning how to identify stops by their place of articulation.

• Release bursts and closure voicing only provided weak and unreliable cues to place.

• Eventually they realized:

• the formant transitions between vowels and stops provided a reliable cue to place of articulation.

• Why?

Formant Transitions

• A: the resonant frequencies of the vocal tract change as stop gestures enter or exit the closure phase.

• Simplest case: formant frequencies usually decrease near bilabial stops

Stops vs. Glides

• Note: formant transitions are more rapid for stops than they are for glides.

“baby”

“wave”

Formant Transitions: alveolars

• For other places of articulation, the type of formant transition that appears is more complex.

• From front vowels into alveolars, F2 tends to slope downward.

• From back vowels into alveolars, F2 tends to slope upward.

• In Perturbation Theory terms:

• alveolars constrict somewhat closer to an F2 node (the palate) than to an F2 anti-node (the lips)

[hid]

[hæd]
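
A rough way to see the node/anti-node claim is to treat the vocal tract as a uniform tube, closed at the glottis and open at the lips, and to locate the points where the F2 standing wave has zero or maximal volume velocity. The sketch below assumes a 17.5 cm tube and a sound speed of 35,000 cm/s (neither figure is from the lecture), and it reads "node" as a volume-velocity node. Under those assumptions the F2 node falls about two-thirds of the way from the glottis to the lips, in the palatal region, and an F2 anti-node falls at the lips. In Perturbation Theory, a constriction near a velocity node raises the formant and a constriction near an anti-node lowers it.

```python
# Sketch only: uniform-tube model of the vocal tract.
# ASSUMPTIONS (not from the lecture): tube length and speed of sound below.
TUBE_LENGTH_CM = 17.5
SPEED_OF_SOUND_CM_S = 35000.0


def resonance_hz(n, length_cm=TUBE_LENGTH_CM, c=SPEED_OF_SOUND_CM_S):
    """n-th resonance of a tube closed at one end: (2n - 1) * c / (4 * L)."""
    return (2 * n - 1) * c / (4.0 * length_cm)


def velocity_nodes_cm(n, length_cm=TUBE_LENGTH_CM):
    """Distances from the glottis where volume velocity is zero for resonance n."""
    return [2 * m * length_cm / (2 * n - 1) for m in range(n)]


def velocity_antinodes_cm(n, length_cm=TUBE_LENGTH_CM):
    """Distances from the glottis where volume velocity is maximal for resonance n."""
    return [(2 * m + 1) * length_cm / (2 * n - 1) for m in range(n)]


if __name__ == "__main__":
    print("F2 of the neutral tube (Hz):", resonance_hz(2))
    print("F2 velocity nodes (cm from glottis):", velocity_nodes_cm(2))
    print("F2 velocity anti-nodes (cm from glottis):", velocity_antinodes_cm(2))
```

An alveolar constriction sits between the palatal node and the lip anti-node, which is one way to read the "somewhat closer to a node" wording above.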

Formant Locus

• Whether in a front vowel or back vowel context...

• The formant transitions for alveolars tend to point to the same frequency value (roughly 1650-1700 Hz).

• This (apparent) frequency value is known as the locus of the formant transition.

• Researchers have theorized:

• the locus frequency can be used by listeners to reliably identify place of articulation.

• However, velars posed a problem…
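
The alveolar locus idea can be sketched in a few lines: if F2 always heads toward one fixed frequency, the direction of the transition depends only on whether the vowel's F2 sits above or below that frequency. The locus value below is the upper end of the range quoted above. The two vowel F2 values are hypothetical round numbers chosen for illustration, not measurements, and the function name is made up for this sketch.

```python
# Sketch only: direction of the F2 transition from a vowel into an alveolar
# stop, assuming F2 moves toward a single fixed locus frequency.
ALVEOLAR_F2_LOCUS_HZ = 1700.0  # upper end of the 1650-1700 Hz range in the slides


def f2_transition_direction(vowel_f2_hz, locus_hz=ALVEOLAR_F2_LOCUS_HZ):
    """Return 'falling', 'rising', or 'flat' for a vowel-to-stop F2 transition."""
    if vowel_f2_hz > locus_hz:
        return "falling"
    if vowel_f2_hz < locus_hz:
        return "rising"
    return "flat"


if __name__ == "__main__":
    # Hypothetical vowel F2 values (Hz), for illustration only.
    print("front vowel, F2 = 2300:", f2_transition_direction(2300.0))
    print("back vowel,  F2 = 900: ", f2_transition_direction(900.0))
```

A front vowel with F2 above the locus gives a falling transition and a back vowel with F2 below it gives a rising one, which matches the two alveolar patterns described earlier.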

Velar Transitions

• Velar formant transitions do not always have a reliable locus frequency for F2.

• Velars exhibit a lot of coarticulation with neighboring vowels.

• Fronter (more palatal) next to front vowels

• Locus is high: 1950-2000 Hz

• Backer (more velar) next to back vowels

• Locus is lower: < 1500 Hz

• F2 and F3 often come together in velar transitions

• “Velar Pinch”

The Velar Pinch

[bag] [bak]

“Velar” Co-articulations

Vowels and Consonants

• Just to make things more interesting, Öhman (1966) found that consonantal articulations are effectively superimposed on top of a stream of vowel articulations.

• The formant transitions effectively pass right through the consonants…

øgy øga

Vowels and Consonants

The Easter Bunny’s Nightmare

Charles Hockett (1955): “Imagine a row of Easter eggs carried along a moving belt; the eggs are of various sizes, and variously colored, but not boiled. At a certain point, the belt carries the row of eggs between the two rollers of a wringer, which quite effectively smash them and rub them more or less into each other. The flow of eggs before the wringer represents the series of impulses from the phoneme source; the mess that emerges from the wringer represents the output of the speech transmitter. At a subsequent point, we have an inspector whose task it is to examine the passing mess and decide, on the basis of the broken and unbroken yolks, the variously spread out albumen and the variously colored bits of shell, the nature of the flow of eggs which previously arrived at the wringer.”

The Solution?

• The acoustic record that emerges from connected speech is so complex that some theorists have proposed that we do not perceive speech as a series of “sounds”…

• We perceive the intended underlying gestures instead.

• In Articulatory Phonology terms…

We perceive something like this

Not this

Or this

Extraordinary Claims

• This proposal is formally known as the Motor Theory of speech perception.

• Basic Tenet #1: Speech is special.

• We don’t process/perceive speech the same way we process other sounds.

• Basic Tenet #2: We are biologically specialized to perceive speech.

• Other animals can’t do what we do.

• Basic Tenet #3: A special “module” in our brains is devoted to speech processing.

Extraordinary Evidence?

• What evidence do we have that we do not perceive speech on the basis of its acoustic characteristics?

• The earliest experiments on place perception were conducted in the 1950s, using a speech synthesizer known as the pattern playback.

Pattern Playback Picture

Haskins Formant Transitions

• Testing the perception of two-formant stimuli, with varying F2 transitions, led to a phenomenon known as categorical perception.

Categorical Perception

• Categorical perception =

• continuous physical distinctions are perceived in discrete categories.

• In the in-class perception experiment:

• There were 11 different syllable stimuli

• They only differed in the locus of their F2 transition

• F2 Locus range = 726 - 2217 Hz

Source: http://www.ling.gu.se/~anders/KatPer/Applet/index.eng.html
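
To make the definition concrete, the sketch below builds an 11-step F2 locus continuum between the two endpoints given above and then sorts each step into a discrete bin. Two things here are assumptions and not part of the experiment as described: the steps are taken to be equally spaced in Hz, and the category boundaries and labels are placeholders with no empirical basis. The point is only the shape of the mapping, where a smoothly varying input comes out as a small set of labels.

```python
# Sketch only: a continuous F2 locus continuum mapped onto discrete labels.
import bisect

N_STIMULI = 11
LOCUS_MIN_HZ = 726.0   # endpoint from the slides
LOCUS_MAX_HZ = 2217.0  # endpoint from the slides

# HYPOTHETICAL boundaries and labels, chosen only to illustrate the idea.
HYPOTHETICAL_BOUNDARIES_HZ = [1200.0, 1800.0]
CATEGORY_LABELS = ["category A", "category B", "category C"]


def locus_continuum(n=N_STIMULI, lo=LOCUS_MIN_HZ, hi=LOCUS_MAX_HZ):
    """Return n locus values from lo to hi, ASSUMING equal steps in Hz."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def categorize(locus_hz, boundaries=HYPOTHETICAL_BOUNDARIES_HZ,
               labels=CATEGORY_LABELS):
    """Map a continuous locus value to the label of the bin it falls in."""
    return labels[bisect.bisect_right(boundaries, locus_hz)]


if __name__ == "__main__":
    for number, locus in enumerate(locus_continuum(), start=1):
        print(f"Stimulus #{number}: {locus:7.1f} Hz -> {categorize(locus)}")
```

In a real identification task the boundaries come from listeners' responses, so the placeholder values here should not be read as predictions.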

Example stimuli from the in-class experiment: Stimulus #1, Stimulus #6, Stimulus #11.
