Lecture 21 Cochlea to brain, Source Localization Tues

Preview:

Citation preview

1

COMP 546

Lecture 21

Cochlea to brain,Source Localization

Tues. April 3, 2018

Ear

2

auditory canal

pinna

cochlea

outer middle inner

3

Eye

• Lens

• Retina

• Photoreceptors

(light -> chemical)

• Ganglion cells (spikes)

• Optic nerve

Ear

• ?

• ?

• ?

• ?

• ?

4

Eye

• Lens

• Retina

• Photoreceptors

(light -> chemical)

• Ganglion cells (spikes)

• Optic nerve

Ear

• Outer ear

• Cochlea

• hair cells

(mechanical -> chemical)

• Ganglion cells (spikes)

• VestibuloCochlear nerve

Basilar Membrane

5

BM fibres have bandpass frequency mechanical responses.

20,000 Hz20 Hz

Basilar Membrane: Place code (“tonotopic”)

6

Nerve cells (hair + ganglion) are distributed along the BM. They have similar bandpass frequency response functions.

20,000 Hz20 Hz

7

0 1000 2000 3000 4000 …. 22,000

Bandpass responses (more details next lecture)

Neural coding of sound in cochlea

• Basilar membrane responds by vibrating with sound.

• Hair cells at each BM location release neurotransmitter that signal BM amplitude at that location

• Ganglion cells respond to neurotransmitter signals by spiking

8

9

Louder sound within frequency band

→ greater amplitude of BM vibration at that location

→ greater release of neurotransmitter by hair cell

→ greater probability of spike at each peak of filtered wave

10

Hair cell neurotransmitter release can signal exact timingof BM amplitude peaks for frequencies up to ~2 kHz.

For higher frequencies, hair cells encode only the envelope of BM vibrations.

t

Timing of ganglion cell spikes:for frequencies up to 2 KHz (“phase locking”)

11

Hair cells release more neurotransmitter at BM amplitude peaks.

Ganglion cells respond to neurotransmitter peaks by spiking.

This allows exact timing of BM vibrations to be encoded by spikes.

BM vibration

Spikes

12

3,000 hair cellsin each cochlea(left and right)

30,00 ganglion cells in each cochlea

cochlear nerve(to brain)

Ganglion cells cannot spike faster than 500 times per second.So we need many ganglion cells for each hair cell.

“Volley” code

13

Different ganglion cells at same spatial position on BM

14

From cochlea to brain stem

cochlea → cochlear nucleus → lateral and medial superior olive (LSO, MSO)… → auditory cortex

BRAIN STEM

Tonotopic maps

15

cochlear nucleus (CN)

cochlea

lateral superior

olive (LSO)

medial superior

olive (MSO)

auditory nerve

high 𝜔 low 𝜔

Binaural Hearing

16

CN

LSO MSO

high 𝜔 low 𝜔

CN

MSO LSO

low 𝜔 high 𝜔

Binaural Hearing

17

CN

LSO MSO

high 𝜔low 𝜔

CN

MSO LSO

low 𝜔high 𝜔

MSO combines low frequency signals.

Binaural Hearing

18

CN

LSO MSO

high 𝜔

CN

MSO LSO

high 𝜔

LSO combines high frequency signals.

Levels of Analysis

19

- what is the task ? what problem is being solved?

- brain areas and pathways

- neural coding

- neural mechanisms

high

low

20

For high frequency bands, • the head casts a shadow• the timing of the peaks cannot be accurately coded

by the spikes (only the rate of spikes is informative)

For low frequency bands,• the head casts a weak shadow only• the timing of the peaks can be encoded by spikes

Duplex theory of binaural hearing(Rayleigh, 1907)

• level differences computed for higher frequencies (ILD -- interaural level differences)

• timing differences computed for lower frequencies (ITD - interaural timing differences)

21

22

CN

LSO MSO

high 𝜔

CN

MSO LSO

high 𝜔

Level differences (high frequencies)

− −+ +

Excitatory input comes from the ear on the same side. Inhibitory input comes from ear on the opposite side.

23

CN

LSO MSO

high 𝜔low 𝜔

CN

MSO LSO

low 𝜔high 𝜔

Timing differences (low frequencies)

Sum excitatory input from both sides. Reminiscent of binocular complex cells in V1 ?

++

++

Jeffress Model (1948) for timing differences

24

http://auditoryneuroscience.com/topics/jeffress-model-animation

E D C B A from right earfrom left ear

A

B

C

D

E

Spike Timing precision required for Jeffress Model ?

25

from right earfrom left ear

distance = 1

10𝑚𝑖𝑙𝑙𝑖𝑚𝑒𝑡𝑟𝑒𝑠

speed of spike = 10 𝑚𝑒𝑡𝑟𝑒𝑠 𝑠𝑒𝑐𝑜𝑛𝑑−1

⟹ ∆ time =𝑑𝑖𝑠𝑡𝑎𝑛𝑐𝑒

𝑠𝑝𝑒𝑒𝑑=

1

100𝑚𝑖𝑙𝑙𝑖𝑠𝑒𝑐𝑜𝑛𝑑

See Exercises 19 Q2c

E D C B A

26

Jeffress model remains controversial. It is not known exactly how “coincidence detection” occurs in MSO.

𝜔

Coincidence detection for each low

frequency band

A B C D E

A B C D E

A B C D E

A B C D E

Naïve Computational Model of Source Localization(Recall lecture 20)

27

𝐼𝑙 (𝑡) = 𝛼 𝐼𝑟(𝑡 − 𝜏) + 𝜖(𝑡)

shadow model error

delay

Find the 𝛼 and 𝜏 that minimize

where 𝜏 < 0.5 𝑚𝑠.

𝑡=1

𝑇

{ 𝐼𝑙 (𝑡) − 𝛼 𝐼𝑟(𝑡 − 𝜏) }2

28

10 𝑙𝑜𝑔10

𝑡=1𝑇 𝐼𝑙 (𝑡)2

𝑡=1𝑇 𝐼𝑟 (𝑡)2

𝑡

𝐼𝑙 (𝑡) 𝐼𝑟(𝑡 − 𝜏) .

Timing difference: find the 𝜏 that maximizes

Level difference:

29

𝑡

𝐼𝑙𝑒𝑓𝑡𝑗(𝑡) 𝐼𝑟𝑖𝑔ℎ𝑡

𝑗(𝑡 − 𝜏) .

For each low frequency band 𝑗, find the 𝜏 that maximizes

(or use summation model similar to binocular cells or Jeffress model)

An estimated value of delay 𝜏 in frequency band j is consistent with various possible source directions ( 𝜙, θ ).

Similar to cone of confusion, but more general because of frequency dependence

30

𝐼𝐿𝐷𝑗 = 10 𝑙𝑜𝑔10

𝑡=1𝑇 𝐼𝑙𝑒𝑓𝑡

𝑗 (𝑡)2

𝑡=1𝑇 𝐼𝑟𝑖𝑔ℎ𝑡

𝑗 (𝑡)2

For each high frequency band 𝑗, compute interaural level difference (ILD) :

What does each 𝐼𝐿𝐷𝑗 tell us ?

31

𝐼𝑙𝑒𝑓𝑡𝑗

𝑡; 𝜙, 𝜃 = 𝑔𝑗 𝑡 ∗ ℎ𝑙𝑒𝑓𝑡(𝑡; 𝜙, 𝜃) ∗ 𝐼𝑠𝑟𝑐 𝑡; 𝜙, 𝜃

𝐼𝑟𝑖𝑔ℎ𝑡𝑗

𝑡; 𝜙, 𝜃 = 𝑔𝑗 𝑡 ∗ ℎ𝑟𝑖𝑔ℎ𝑡(𝑡; 𝜙, 𝜃) ∗ 𝐼𝑠𝑟𝑐 𝑡; 𝜙, 𝜃

Recall head related impulse response function (HRIR) from last lecture..

If the source direction is (q, f), and 𝑔𝑗 𝑡 is the filter for band 𝑗.

then…

32

𝐼 𝑙𝑒𝑓𝑡𝑗

𝜔; 𝜙, 𝜃 = 𝑔𝑗 𝜔 ℎ𝑙𝑒𝑓𝑡(𝜔; 𝜙, 𝜃) 𝐼𝑠𝑟𝑐 𝜔; 𝜙, 𝜃

𝐼 𝑟𝑖𝑔ℎ𝑡𝑗

𝜔; 𝜙, 𝜃 = 𝑔𝑗 𝜔 ℎ𝑟𝑖𝑔ℎ𝑡(𝜔; 𝜙, 𝜃) 𝐼𝑠𝑟𝑐 𝜔; 𝜙, 𝜃

Take the Fourier transform and apply convolution theorem :

33

𝐼 𝑙𝑒𝑓𝑡𝑗

𝜔; 𝜙, 𝜃 = 𝑔𝑗 𝜔 ℎ𝑙𝑒𝑓𝑡(𝜔; 𝜙, 𝜃) 𝐼𝑠𝑟𝑐 𝜔; 𝜙, 𝜃

𝐼 𝑟𝑖𝑔ℎ𝑡𝑗

𝜔; 𝜙, 𝜃 = 𝑔𝑗 𝜔 ℎ𝑟𝑖𝑔ℎ𝑡(𝜔; 𝜙, 𝜃) 𝐼𝑠𝑟𝑐 𝜔; 𝜙, 𝜃

Take the Fourier transform and apply convolution theorem :

If there is just one source direction (𝜙, 𝜃), then for each frequency 𝜔 within band 𝑗 ∶

𝐼 𝑟𝑖𝑔ℎ𝑡𝑗

𝜔

𝐼 𝑙𝑒𝑓𝑡𝑗

𝜔 ℎ𝑙𝑒𝑓𝑡(𝜔; 𝜙, 𝜃)

ℎ𝑟𝑖𝑔ℎ𝑡(𝜔; 𝜙, 𝜃)≈

34

One can show using Parseval’s theorem of Fourier transforms that if ℎ𝑙𝑒𝑓𝑡(𝜔; 𝜙, 𝜃) and ℎ𝑟𝑖𝑔ℎ𝑡(𝜔; 𝜙, 𝜃) are approximately constant

within band 𝑗, then:

𝑡=1𝑇 𝐼𝑙𝑒𝑓𝑡

𝑗 (𝑡)2

𝑡=1𝑇 𝐼𝑟𝑖𝑔ℎ𝑡

𝑗 (𝑡)2

| ℎ𝑙𝑒𝑓𝑡𝑗

( 𝜙, 𝜃) |2

| ℎ𝑟𝑖𝑔ℎ𝑡𝑗

( 𝜙, 𝜃) |2≈

35

𝑡=1𝑇 𝐼𝑙𝑒𝑓𝑡

𝑗 (𝑡)2

𝑡=1𝑇 𝐼𝑟𝑖𝑔ℎ𝑡

𝑗 (𝑡)2

| ℎ𝑙𝑒𝑓𝑡𝑗

( 𝜙, 𝜃) |2

| ℎ𝑟𝑖𝑔ℎ𝑡𝑗

( 𝜙, 𝜃) |2≈

The ear can measure this… and can infer source directions ( 𝜙, 𝜃) that are consistent with it.

One can show using Parseval’s theorem of Fourier transforms that if ℎ𝑙𝑒𝑓𝑡(𝜔; 𝜙, 𝜃) and ℎ𝑟𝑖𝑔ℎ𝑡(𝜔; 𝜙, 𝜃) are approximately constant

within band 𝑗, then:

36

https://auditoryneuroscience.com/topics/acoustic-cues-sound-location

Each iso-contour in each frequency band is consistent with a measured level difference (dB).

Interaural Level Difference (dB) as a function of (𝜙, 𝜃) for two fixed ω.

700 Hz 11,000 Hz

Monaural spectral cues(Spatial localization with one ear?)

37

𝐼𝑗 𝑡; 𝜙, 𝜃 = 𝑔𝑗 𝑡 ∗ ℎ(𝑡; 𝜙, 𝜃) ∗ 𝐼𝑠𝑟𝑐 𝑡; 𝜙, 𝜃

𝐼𝑗 𝜔; 𝜙, 𝜃 = 𝑔𝑗 𝜔 ℎ𝑗(𝜔; 𝜙, 𝜃) 𝐼𝑠𝑟𝑐 𝜔; 𝜙, 𝜃

If the source is noise, then all frequencies make the same contribution on average.

Pattern of peaks and notches across bands will be due to HRTF, not to the source.

38

HRTF from last lecture

e.g. medial plane

Azimuth 𝜃 = 0

“Pinnal notch” frequency varies with elevation of source e.g. in the medial plane.

𝐼𝑗 𝜔; 𝜙, 𝜃 = 𝑔𝑗 𝜔 ℎ𝑗(𝜔; 𝜙, 𝜃) 𝐼𝑠𝑟𝑐 𝜔; 𝜙, 𝜃

00

Levels of Analysis

39

- what is the task ? what problem is being solved?Source localization using level and timingdifferences within frequency channels.

- brain areas and pathways(cochlea to CN to MSO and LSO in the brainstem)

- neural coding(gave sketch only)

- neural mechanisms

high

low