The Perception of Speech


Speech

• Speech is for rapid communication
• Speech is composed of units of sound called phonemes
– examples of phonemes: /b/ in bat, /p/ in pat


Seeing Sound with Spectrograms

• A spectrogram is a 3D plot of sound: time, frequency, and intensity

• Intensity is often coded by colour
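As a rough illustration of the three dimensions named above, the sketch below computes a small spectrogram in plain Python. It is only a minimal example under stated assumptions: the function name `spectrogram`, the frame size, the hop size, and the test tone are all made up for illustration and do not come from the slides. Each row of `intensity` is one moment in time, each column is one frequency, and the stored value is the intensity that a plot would code by colour.

```python
import cmath
import math

def spectrogram(samples, sample_rate, frame_size=64, hop=32):
    """Return (times, freqs, intensity); intensity[t][f] is a magnitude.

    frame_size and hop are illustrative assumptions, not values from the slides.
    """
    times, intensity = [], []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # Taper the frame with a Hann window to reduce spectral leakage.
        windowed = [
            x * 0.5 * (1 - math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, x in enumerate(frame)
        ]
        row = []
        # Naive discrete Fourier transform, keeping non-negative frequencies.
        for k in range(frame_size // 2 + 1):
            total = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            row.append(abs(total))
        times.append(start / sample_rate)
        intensity.append(row)
    freqs = [k * sample_rate / frame_size for k in range(frame_size // 2 + 1)]
    return times, freqs, intensity

# Made-up test sound: a 500 Hz tone followed by a 1500 Hz tone.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 500 * n / sample_rate) for n in range(512)]
tone += [math.sin(2 * math.pi * 1500 * n / sample_rate) for n in range(512)]

times, freqs, intensity = spectrogram(tone, sample_rate)

def peak_frequency(row):
    """Frequency of the most intense bin in one time slice."""
    return freqs[max(range(len(row)), key=lambda k: row[k])]

first_peak = peak_frequency(intensity[0])
last_peak = peak_frequency(intensity[-1])
print(first_peak, last_peak)
```

The most intense frequency moves from the first tone to the second as time advances, which is the kind of change over time that a spectrogram makes visible.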

Acoustic Properties of Speech

• Speech can be characterized by a spectrogram
• A spectrogram reveals differences between phonemes
• The differences are in the formants and the formant transitions

Perceiving Speech

• So perceiving (interpreting) speech sounds is simply a matter of matching the spectrotemporal properties (the shape of the spectrogram) of the incoming sound waves to the appropriate phoneme, right?

• If so, specific phonemes must correspond to specific spectrograms, a property called acoustic-phonetic invariance

• Acoustic-phonetic invariance says that each phoneme should match one and only one pattern in the spectrogram
– This is not the case! For example, the pattern for /d/ changes when it is followed by different vowels

• Clearly, the perception and understanding of speech sounds is more elaborate than simply interpreting an internal spectrogram

• The phrase “Peter buttered the burnt toast” has five /t/ phonemes, but there are not five identical sweeps in the spectrogram

• The Segmentation Problem
• Segmentation is the perception of silence between words
• This silence is often illusory: the phrase “I owe you a Yo-Yo” has no silence in it!

Spoken Input

• The Segmentation Problem:
– The stream of acoustic input is not physically segmented into discrete phonemes, words, phrases, etc.
– Silent gaps don’t always indicate (aren’t perceived as) interruptions in speech
– A continuous speech stream is sometimes perceived as having gaps

Perceiving Speech

• So how do you perceive speech?

Some of the “strategies”:
1. reduce the data
2. use context clues
3. use vision

Categorical Perception

• Categorical perception is a phenomenon in which the brain assigns a stimulus to one category or another but never to an intermediate category

• For example, /ba/ and /pa/ differ in voice onset time, the delay between releasing the stopped airflow and the start of vocal-fold vibration
– /ba/ is formed by stopping the flow of air from the lungs and releasing it, with voicing starting about 10 milliseconds after the release
– /pa/ is similar except that voice onset time is about 50 ms

• Voice onset time can range from zero to more than 50 ms. For example, you could synthesize a sound with a voice onset time of 30 ms, but...

• English speakers will hear either /ba/ or /pa/ but never something in between
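The all-or-none labelling described above can be sketched as a steep identification function over voice onset time. This is only a toy model under loud assumptions: the 30 ms boundary is simply the midpoint of the two values quoted in the slides, and the slope, the logistic form, and the names `prob_pa` and `perceived_syllable` are invented for illustration rather than taken from measured data.

```python
import math

# Assumed values for illustration only (not from the slides or from data).
BOUNDARY_MS = 30.0    # assumed category boundary, midpoint of 10 ms and 50 ms
SLOPE_PER_MS = 0.5    # assumed steepness of the identification function

def prob_pa(vot_ms):
    """Toy probability that a listener labels the sound /pa/."""
    return 1.0 / (1.0 + math.exp(-SLOPE_PER_MS * (vot_ms - BOUNDARY_MS)))

def perceived_syllable(vot_ms):
    """The reported percept is always one of the two categories."""
    return "/pa/" if prob_pa(vot_ms) >= 0.5 else "/ba/"

# The stimulus varies continuously; the label does not.
for vot in range(0, 61, 10):
    print(f"{vot:2d} ms -> {perceived_syllable(vot)}  p(/pa/) = {prob_pa(vot):.3f}")
```

Even though the input changes in equal 10 ms steps, the output only ever takes two values, which is the sense in which nothing "in between" is heard.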

Categorical Perception is Part of Learning a Language

• Babies can discriminate /ba/ from /pa/ and can discriminate these from sounds with intermediate voice onset times!

• By 10 to 12 months, babies (learning English) stop discriminating intermediate voice onset times

• Once category boundaries are learned, it is very difficult to unlearn them
– non-native speakers of any language often cannot hear certain phonemes the way native speakers do
– as a consequence they usually retain at least a slight accent



Perception (of all types) Makes Use of Context

• The stream of information contained in speech is usually ambiguous and incomplete

• Your brain makes a “best guess” based on the circumstances

• Consider the following example, in which a cough replaces the first sound of one word:

“The [cough]eel fell off the shoe.”
“The [cough]eel fell off the car.”

• Listeners report hearing the “appropriate” phoneme during the cough (“heel” with the shoe, “wheel” with the car)
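The "best guess" idea in this example can be sketched as choosing, among the words that fit the obscured sound, the one that fits the rest of the sentence. Everything in the block is a hypothetical stand-in: the table `RELATED_WORDS`, the list `CANDIDATES`, and the function `restore` are made up for illustration and say nothing about how the brain actually stores or searches words.

```python
# Hypothetical word associations, invented for this sketch (not measured data).
RELATED_WORDS = {
    "shoe": {"heel", "lace", "sole"},
    "car": {"wheel", "engine", "door"},
}

# Illustrative words that are all consistent with hearing "[cough]eel".
CANDIDATES = ["heel", "wheel", "peel", "meal"]

def restore(context_word, candidates=CANDIDATES):
    """Return the first candidate related to the context word, else None."""
    related = RELATED_WORDS.get(context_word, set())
    for word in candidates:
        if word in related:
            return word
    return None  # this context does not settle the ambiguity

for last_word in ("shoe", "car"):
    print(f"The [cough]eel fell off the {last_word} -> {restore(last_word)}")
```

Note that the disambiguating word arrives after the cough, so in this sketch, as in the example, the guess depends on information that comes later in the sentence.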

Much of Speech Perception isn’t Auditory!

• Why rely on only one sensory system when there is information in two?!

• The brain seamlessly integrates any information it is given; this is called cross-modal integration

Cross-modal Integration

• Speech perception involves the synthesis of vision and hearing

• The McGurk effect demonstrates the critical role of vision in speech perception

• The McGurk effect suggests that visual and auditory information are combined to enhance speech perception under normal circumstances

• When visual and auditory information are incongruous, the resulting perception is unpredictable and often wrong

Next Time: Taste, Smell, Touch, Balance