
Auditory Neuroscience - Lecture 6

Hearing Speech

[email protected]

auditoryneuroscience.com/lectures

Vocalization in Humans and Animals

Articulation

• Articulators (lips, tongue, jaw, soft palate) move to change the resonance properties of the vocal tract.

http://auditoryneuroscience.com/vocalization/articulators

Can other animals speak?

Other mammals have similar vocal tracts and use them for communication. However, they have only a very limited use of syntax (grammar) and much smaller vocabularies than humans.

http://mustelid.physiol.ox.ac.uk/drupal/?q=mishka

Acoustic features of vocalizations: modulations, harmonics & formants

Speech as a Modulated Signal

Elliott and Theunissen (2009) PLoS Comput Biol

Launch Spectrogram

AN Figure 4.2

Modulation spectra of male and female English speech.

From figure 2 of Elliott and Theunissen (2009) PLoS Comput Biol 5:e1000302.
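As a rough illustration of how such a modulation spectrum can be estimated, the sketch below takes the 2-D Fourier transform of a log spectrogram. This is a minimal sketch, not Elliott and Theunissen's actual pipeline (which uses a log-frequency filter bank and reports spectral modulation in cycles/octave); the file name "speech.wav" is just a placeholder.

```python
# Minimal sketch: joint spectro-temporal modulation spectrum of a speech
# recording, estimated as the 2-D power spectrum of a log-magnitude spectrogram.
import numpy as np
from scipy.signal import spectrogram
from scipy.io import wavfile

fs, x = wavfile.read("speech.wav")           # placeholder file, assumed mono
x = x.astype(float) / np.max(np.abs(x))

# Spectrogram with short analysis windows (good temporal-modulation resolution).
f, t, S = spectrogram(x, fs, nperseg=256, noverlap=192)
logS = np.log(S + 1e-10)
logS -= logS.mean()                          # remove DC before the 2-D FFT

# 2-D FFT: axis 1 becomes temporal modulation (Hz); axis 0 becomes spectral
# modulation in cycles/Hz on this linear axis (cycles/octave in the paper).
M = np.abs(np.fft.fftshift(np.fft.fft2(logS))) ** 2
wt = np.fft.fftshift(np.fft.fftfreq(logS.shape[1], d=t[1] - t[0]))   # Hz
wf = np.fft.fftshift(np.fft.fftfreq(logS.shape[0], d=f[1] - f[0]))   # cycles/Hz

print("Temporal modulations span", wt.min(), "to", wt.max(), "Hz")
```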

Pitch changes made with Hideki Kawahara's "STRAIGHT"

Rising vs. Falling pitch: "I come in peace!" spoken with a rising or a falling pitch contour.

http://www.auditoryneuroscience.com/content/pitchInSpeech
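The rising vs. falling contrast can be approximated with a toy synthesizer. This is a minimal sketch, not STRAIGHT (which resynthesizes real recordings): the same harmonic "vowel" is given either a rising or a falling F0 contour; the F0 values are arbitrary choices.

```python
# Toy illustration: one harmonic complex, two prosodic contours
# (rising, question-like vs. falling, statement-like).
import numpy as np
from scipy.io import wavfile

fs, dur = 16000, 1.0
t = np.arange(int(fs * dur)) / fs

def harmonic_complex(f0_contour, n_harmonics=20):
    """Sum of harmonics whose instantaneous F0 follows f0_contour (Hz)."""
    phase = 2 * np.pi * np.cumsum(f0_contour) / fs    # integrate F0 -> phase
    sig = sum(np.sin(k * phase) / k for k in range(1, n_harmonics + 1))
    return 0.3 * sig / np.max(np.abs(sig))

rising  = harmonic_complex(np.linspace(120, 220, t.size))   # "I come in peace?"
falling = harmonic_complex(np.linspace(220, 120, t.size))   # "I come in peace."

wavfile.write("rising.wav",  fs, (rising  * 32767).astype(np.int16))
wavfile.write("falling.wav", fs, (falling * 32767).astype(np.int16))
```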

"Spectral Modulation": Harmonics & Formants

Formants determine vowel categories

AN Fig 4.3 Adapted from figure 7 of Diehl (2008) Phil Trans Royal Soc B

http://auditoryneuroscience.com/topics/two-formant-artificial-vowels
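A sketch of how two-formant artificial vowels of this kind can be generated (this is not the code behind the demo linked above, and the formant values are approximate textbook figures): a glottal-like pulse train is passed through two second-order resonators.

```python
# Minimal two-formant vowel sketch: impulse-train source (F0 = 120 Hz)
# filtered by two resonators, e.g. roughly /a/-like (F1~700, F2~1100 Hz)
# vs. /i/-like (F1~300, F2~2300 Hz).
import numpy as np
from scipy.signal import lfilter
from scipy.io import wavfile

fs, dur, f0 = 16000, 0.5, 120
n = int(fs * dur)
source = np.zeros(n)
source[::fs // f0] = 1.0                     # crude glottal source

def resonator(freq, bw, fs):
    """Second-order all-pole resonator at `freq` Hz with bandwidth `bw` Hz."""
    r = np.exp(-np.pi * bw / fs)
    theta = 2 * np.pi * freq / fs
    return [1.0 - r], [1.0, -2 * r * np.cos(theta), r * r]

def vowel(f1, f2):
    y = source
    for freq in (f1, f2):
        b, a = resonator(freq, 90.0, fs)     # ~90 Hz formant bandwidth
        y = lfilter(b, a, y)
    return 0.3 * y / np.max(np.abs(y))

wavfile.write("vowel_a.wav", fs, (vowel(700, 1100) * 32767).astype(np.int16))
wavfile.write("vowel_i.wav", fs, (vowel(300, 2300) * 32767).astype(np.int16))
```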

Formant Tracking & Synthesis

http://person2.sol.lu.se/SidneyWood/praate/wavformedform.html

Visual Influences

Visual / Auditory Interactions:The McGurk Effect

http://www.auditoryneuroscience.com/McGurkEffect

Neural Representation of Vocalizations in the Ascending Pathway

Frequency modulations are poorly resolved on the tonotopic axis

AN Fig 4.4, based on data by Young & Sachs (1979)

Speech and Cochlear Implants

• Since tracking a small number of formants is all that is required to extract most of the semantic information in speech, cochlear implants can deliver speech even though they have only a few effective frequency channels.

• https://mustelid.physiol.ox.ac.uk/drupal/?q=prosthetics/noise_vocoded_speech
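A minimal noise-vocoder sketch in the spirit of the demo linked above (not the actual code behind it): the slow amplitude envelope of speech in a handful of frequency bands is used to modulate band-limited noise, mimicking the small number of effective channels of a cochlear implant. Channel count, band edges, and the input file name are assumptions.

```python
# Noise vocoding: band-split speech, extract each band's envelope,
# and reimpose the envelopes on band-limited noise.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert
from scipy.io import wavfile

fs, x = wavfile.read("speech.wav")            # placeholder, assumed mono, fs >= 16 kHz
x = x.astype(float) / np.max(np.abs(x))

n_channels = 4                                # cochlear implants: few effective channels
edges = np.geomspace(100, 6000, n_channels + 1)
noise = np.random.randn(x.size)
out = np.zeros_like(x)
env_sos = butter(2, 30, btype="low", fs=fs, output="sos")   # envelope smoothing

for lo, hi in zip(edges[:-1], edges[1:]):
    sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
    band = sosfiltfilt(sos, x)
    env = sosfiltfilt(env_sos, np.abs(hilbert(band)))        # slow band envelope
    out += env * sosfiltfilt(sos, noise)                     # envelope-modulated noise

out *= np.max(np.abs(x)) / np.max(np.abs(out))
wavfile.write("vocoded.wav", fs, (out * 32767).astype(np.int16))
```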

“Modulation tuning” in Thalamus and Cortex

AN Figs 4.5 & 4.6

From Miller LM, Escabi MA, Read HL, Schreiner CE (2002) Spectrotemporal receptive fields in the lemniscal auditory thalamus and cortex. J Neurophysiol 87:516-527.
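A neuron's modulation tuning of this kind can be summarised by the 2-D amplitude spectrum of its spectrotemporal receptive field (STRF). The sketch below is a generic illustration rather than Miller et al.'s exact analysis; the STRF is a random stand-in and the bin sizes are assumed.

```python
# Generic sketch: modulation transfer function (MTF) as the 2-D amplitude
# spectrum of an STRF (rows = frequency channels, columns = time bins).
import numpy as np

n_freq, n_time = 32, 50
dt, df = 0.005, 0.125                      # 5 ms time bins, 1/8-octave bins (assumed)
strf = np.random.randn(n_freq, n_time)     # stand-in for a measured STRF

mtf = np.abs(np.fft.fftshift(np.fft.fft2(strf)))
temporal_mod = np.fft.fftshift(np.fft.fftfreq(n_time, d=dt))   # Hz
spectral_mod = np.fft.fftshift(np.fft.fftfreq(n_freq, d=df))   # cycles/octave

# Best temporal/spectral modulation = location of the MTF peak.
i, j = np.unravel_index(np.argmax(mtf), mtf.shape)
print(f"best spectral mod ~ {spectral_mod[i]:.2f} cyc/oct, "
      f"best temporal mod ~ {temporal_mod[j]:.1f} Hz")
```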

Which Temporal Modulations are the Most Important?

Elliott and Theunissen (2009) PLoS Comput Biol

http://auditoryneuroscience.com/topics/speech-modulated-signal

Cat cortical modulation transfer functions do not seem particularly well matched to the most important modulation frequencies of speech signals.

Species differences?

Different (“higher order”) cortical areas?

Cortical Specialization for Speech?

Putative Cortical “What” and “Where” Streams

AN Fig 4.11 Adapted from Romanski et al. (1999). Nat Neurosci 2:1131-1136

Human Cortex

Hemispheric “Dominance” for Speech and the Wada test

Broca first proposed that the left hemisphere is “dominant” for speech, based on examinations of post-mortem brains.

Nowadays, "dominance" is usually assessed with the "Wada test" (intracarotid sodium amobarbital procedure): either the left or the right brain hemisphere is anesthetized by injection of amobarbital into the carotid artery through a catheter. The patient's ability to understand and produce speech is then scored.

Left Hemisphere Dominance Predominates

Wada test results suggest that:

About 90% of all right-handed patients and about 75% of all left-handed patients display "left hemisphere dominance" for speech.

The remaining patients are either “mixed dominant” (i.e. they need both hemispheres to process speech) or they have a “bilateral speech representation” (i.e. either hemisphere can support speech without necessarily requiring the other).

Right hemisphere dominance is comparatively rare, seen in no more than 1-2% of the population.

Hierarchical levels of speech perception

Acoustic / phonetic representation: Can the patient tell whether two speech sounds or syllables presented in succession are the same or different?

Phonological analysis: Can the patient tell whether two words rhyme? Or what the first phoneme ("letter") in a given word is?

Semantic processing: Can the patient understand "meaning", e.g. follow spoken instructions?

Human Cortex - Microstimulation

AN Fig 4.8: Sites where acoustic (A), phonological (B), or lexical-semantic (C) deficits can be induced by disruptive electrical stimulation.

From Boatman (2004) Cognition 92:47-65

Where in the Brain does the Transition from Sound to Meaning Happen?

We don't really know.

The "ventral vs. dorsal stream hypothesis" of auditory cortex connectivity would suggest that anterior temporal and frontal structures should be involved.

This fits with neuroimaging studies (e.g. Scott et al (2000) Brain 123 Pt 12:2400-2406)

http://www.auditoryneuroscience.com/?q=node/46

But other electrophysiological and lesion data do not really fit this picture.

Marmoset Twitter Calls as Models for Speech Tokens

"Twitter Selectivity" in Marmoset, Cat and Ferret A1

From Wang & Kadia, J Neurophysiol 2001, and Schnupp et al., J Neurosci 2006.

Cortical representations of vocalizations

AN Fig 4.9

Mapping cortical sensitivity to sound features

Neural sensitivity to pitch, timbre, and location.

Nelken et al., J Neurophysiol 2004; Bizley, Walker, Silverman, King & Schnupp, J Neurosci 2009.

Summary

• Human speech signals carry information mostly in their time-varying formant structure.

• Formants are initially encoded as time-varying activity patterns across the tonotopic array.

• It is rather difficult to pin down which parts of the brain might translate sound acoustics to “meaning”.

• There is a clear left hemisphere bias, but evidence for cortical areas with very clear specialization for speech or vocalization processing remains elusive.

Further Reading

• Auditory Neuroscience – Chapter 4 (Hearing Speech)