Introduction to Spatial Auditory Processing Kaushik Patra ([email protected]) 18th Jul, 2014

This presentation introduces the human auditory system and the signal processing that goes on in the brain to deduce spatial information about the sounds we hear.

Page 1: Introduction to Spatial Auditory Processing

Introduction to Spatial Auditory Processing

Kaushik Patra([email protected])

18th Jul, 2014

Page 2: Introduction to Spatial Auditory Processing

Objective

● Review the auditory system.

● Review what sound is.

● Understand auditory information processing.

● Conclusion

Page 3: Introduction to Spatial Auditory Processing

Auditory System

● The auditory system consists of the following components.

– External ear (pinna, ear canal and ear drum).

– Middle ear (three tiny, very lightweight bones called ossicles).

– Inner ear (cochlea, hair cells and a special group of nerve cells called the spiral ganglion).

Fig.1: Schematic of Auditory Path

Fig.2: Schematic Position of Auditory Cortex

Page 4: Introduction to Spatial Auditory Processing

Auditory System

● Spiral ganglion nerve fibres run through the 8th cranial nerve (CN VIII) to bring information to the auditory cortex of the brain.

– Fig. 2 shows, as a pink patch, the part of the brain known to be the auditory processing centre: the auditory cortex.

Fig.1: Schematic of Auditory Path

Fig.2: Schematic Position of Auditory Cortex

Page 5: Introduction to Spatial Auditory Processing

Auditory System

● Fig. 3 shows the relative number of cells present in the various parts of the auditory system.

● Most of the cells are in the auditory cortex.

● Only a small fraction are hair cells (inner or outer, in red), sensory neurons (in blue) and central neurons in various parts of the path (in the black inner circles).

Fig.3: Relative Amount of Cells in Different Parts of Auditory System

Page 6: Introduction to Spatial Auditory Processing

Auditory System

● Why do we need so many auditory cortex cells?

– We'll review auditory information processing later in these slides.

– We need to process sound for speech interpretation and spatial information.

– We need not only to process the auditory signals but also to combine them with visual information to gain spatial information.

Fig.3: Relative Amount of Cells in Different Parts of Auditory System

Page 7: Introduction to Spatial Auditory Processing

What is sound?

● Sound is a physical phenomenon of compression and rarefaction of a medium (gas, liquid or solid) through which energy flows from one place to another.

● It can be represented as a wave of air pressure with respect to time (Fig. 4).

Fig.4: Sound Wave

Page 8: Introduction to Spatial Auditory Processing

Auditory Information Processing

● Two types of processing

– Passive processing and transformation of sound energy.

– Active processing for information extraction.

● Perception: extraction of the meaning of sound (speech recognition, understanding of surroundings and ambiance, etc.)

● Spatial: direction of the source, distance of the source, etc.

Page 9: Introduction to Spatial Auditory Processing

Passive Information Processing

● Passive Processing.

– The external ear canal resonates at frequencies within the range of human speech. Thus it increases the volume of speech sounds inside the ear canal.

– The middle ear transforms vibration in air into vibration of the fluid inside the cochlea. The ossicles connect the tympanic membrane (ear drum) of the external ear to the oval window membrane of the cochlea. The ossicles hammer the oval window membrane at the same frequency as the incoming sound. Part of the energy is lost during this conversion.

– The inner ear transforms vibration energy into electrical pulses along the nerve cells inside the cochlea and transmits them to the brain. Conceptually, a Fourier transformation is performed on the incoming vibration (i.e. a time domain signal is transformed into a frequency domain signal).

Sound wave → External ear → Sound wave with increased volume → Middle ear → Air to fluid vibration transmission → Internal ear → Electrical pulses through transduction → Brain

Page 10: Introduction to Spatial Auditory Processing

Direction Information Processing

● Unlike visual information, audio information does not produce an image, so it is much more challenging to deduce spatial information from sound.

– Spatial information has to be computed.

● The location of a sound source is computed using the interaural time delay (ITD).

– Sound travels a longer path to the far ear.

– As the sound source moves farther toward the side, this difference in path length increases.

– With an ear-to-ear distance of 14 cm, the maximum ITD works out to about 0.41 ms (14 cm divided by the speed of sound in air, roughly 343 m/s).

– The ITD is too small to be measured directly by the action potentials (electrical pulses) of an auditory neuron, each of which lasts on the order of a millisecond.
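The 0.41 ms figure can be checked with a short calculation. This is only a sketch: it assumes the largest ITD occurs when the source is directly to one side, so that the extra path to the far ear is roughly the full ear-to-ear distance of 0.14 m, and it assumes a speed of sound of 343 m/s. Neither assumption is stated on the slide.

```python
# Assumed values: the slide does not state the speed of sound or the formula.
SPEED_OF_SOUND_M_PER_S = 343.0   # dry air at about 20 degrees C (assumption)
EAR_TO_EAR_DISTANCE_M = 0.14     # 14 cm


def max_itd_seconds(ear_distance_m, speed_m_per_s=SPEED_OF_SOUND_M_PER_S):
    """Largest possible ITD: the source is directly to one side, so the
    extra path to the far ear is roughly the full ear-to-ear distance."""
    return ear_distance_m / speed_m_per_s


itd_ms = max_itd_seconds(EAR_TO_EAR_DISTANCE_M) * 1000.0
print(f"maximum ITD = {itd_ms:.2f} ms")  # prints: maximum ITD = 0.41 ms
```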

Page 11: Introduction to Spatial Auditory Processing

Direction Information Processing

● ITD is computed using differential neurons, to which neurons from both ears connect.

– There are two sets of differential neurons: one set is located near the left ear and the other near the right ear.

– The sound signals from the two ears reach these differential neurons at different times.

– Combining the information generated by these differential neurons lets the brain infer the direction of the sound source.

● Besides ITD, sound level is also used to infer the direction of the sound source (Interaural Level Difference, or ILD).

– Sound coming from one side creates an acoustic shadow at the other ear.

– The volume is lower in this acoustic shadow.

– The difference in volume grows as the sound source moves farther from the centre.
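As a rough signal-processing analogy for the two cues above, the sketch below estimates an ITD by cross-correlating the two ear signals and an ILD as a ratio of RMS levels in decibels. Cross-correlation is a stand-in chosen for illustration, not a model of how the differential neurons work. The sampling rate, the 15-sample delay and the 0.5 gain at the far ear are all made-up values.

```python
import math
import random

SAMPLE_RATE_HZ = 48_000      # assumed sampling rate
TRUE_DELAY_SAMPLES = 15      # assumed: the right ear hears the sound 15 samples late
FAR_EAR_GAIN = 0.5           # assumed attenuation in the acoustic shadow

rng = random.Random(0)
n = 480  # 10 ms of signal
source = [rng.uniform(-1.0, 1.0) for _ in range(n + TRUE_DELAY_SAMPLES)]
left = source[TRUE_DELAY_SAMPLES:]                # near ear
right = [FAR_EAR_GAIN * s for s in source[:n]]    # far ear: delayed and quieter


def best_lag(a, b, max_lag):
    """Lag (in samples) at which b best matches a delayed copy of a."""
    def score(lag):
        return sum(a[i] * b[i + lag]
                   for i in range(len(a)) if 0 <= i + lag < len(b))
    return max(range(-max_lag, max_lag + 1), key=score)


def rms(x):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(v * v for v in x) / len(x))


lag = best_lag(left, right, max_lag=30)
itd_ms = 1000.0 * lag / SAMPLE_RATE_HZ              # positive: left ear leads
ild_db = 20.0 * math.log10(rms(left) / rms(right))  # positive: left ear louder
print(f"ITD = {itd_ms:.3f} ms, ILD = {ild_db:.1f} dB")
```

A positive lag together with a positive level difference both point to a source on the left, which is how the two cues reinforce each other in this toy setup.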

Page 12: Introduction to Spatial Auditory Processing

Direction Information Processing

● ITD and ILD provide horizontal information only.

– This produces a cone of confusion.

– The cone of confusion is formed by all the positions of a sound source that produce the same ITD and ILD.

– The confusion can be in the vertical dimension or between front and back.

– We resolve the cone of confusion in two ways.

– Using head movement.

– Using sound frequency.

Page 13: Introduction to Spatial Auditory Processing

Direction Information Processing

● Front/back confusion is resolved by moving the head.

– Head position 'A' in the picture produces the same ITD and ILD for a source in front and a source behind.

– We move our head to position 'B' to obtain a different ITD and ILD and resolve the front/back confusion.

● Similarly, vertical confusion is resolved by tilting the head.

– Head tilt creates a different ITD and ILD, hence we can resolve the confusion.

● These techniques require a longer sound, so that we have time to react and compute the direction by moving the head.

● Direction information processing must also account for the size of the signal that produced the active head movement.
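The front/back ambiguity and its resolution by a head turn can be illustrated with a simplified geometric model. The sketch assumes a distant source, two point-like ears 0.14 m apart and an ITD proportional to the sine of the source angle relative to the head. The model, the angles and the function name `itd_ms` are illustrative assumptions and do not come from the slides.

```python
import math

SPEED_OF_SOUND_M_PER_S = 343.0  # assumption
EAR_DISTANCE_M = 0.14           # assumption: 14 cm between the ears


def itd_ms(source_azimuth_deg, head_yaw_deg=0.0):
    """ITD for a distant source under a simple two-point ear model.

    Azimuth is measured from straight ahead, positive toward the right:
    0 = front, 90 = right, 180 = back. A positive ITD means the sound
    reaches the right ear first.
    """
    relative = math.radians(source_azimuth_deg - head_yaw_deg)
    return 1000.0 * EAR_DISTANCE_M * math.sin(relative) / SPEED_OF_SOUND_M_PER_S


front, back = 30.0, 150.0  # mirror images about the ear-to-ear axis

# Head facing forward: both positions give the same ITD (cone of confusion).
print(itd_ms(front), itd_ms(back))

# Head turned 20 degrees to the right: the two ITDs now differ.
print(itd_ms(front, head_yaw_deg=20.0), itd_ms(back, head_yaw_deg=20.0))
```

After the turn, the ITD of the front source shrinks while that of the back source grows, so the direction of change tells the two apart.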

Page 14: Introduction to Spatial Auditory Processing

Direction Information Processing

● We also process the incoming sound wave using its spectral cues.

● Sound is not a simple sine wave; it is much more complex than that.

● Any complex wave can be expressed as a sum of simple sine waves with different frequencies and amplitudes.

– In mathematical terms, this decomposition is called the Fourier transformation.

● The plot of amplitude vs. frequency for a sound wave is called the spectrum of the sound.
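To make the idea of a spectrum concrete, the sketch below builds a wave from two sine components and recovers their amplitudes with a direct (slow) discrete Fourier sum. The frequencies, amplitudes and sampling rate are made-up values, and `amplitude_at` is a hypothetical helper written for this example.

```python
import cmath
import math

SAMPLE_RATE_HZ = 800   # assumed sampling rate
N = 800                # one second of samples, so whole-Hz frequencies fit exactly

# A "complex" wave built from two sine waves (illustrative values).
signal = [1.0 * math.sin(2 * math.pi * 50 * t / SAMPLE_RATE_HZ)
          + 0.5 * math.sin(2 * math.pi * 120 * t / SAMPLE_RATE_HZ)
          for t in range(N)]


def amplitude_at(samples, freq_hz, sample_rate_hz):
    """Amplitude of one frequency component (a single DFT term)."""
    total = sum(x * cmath.exp(-2j * math.pi * freq_hz * t / sample_rate_hz)
                for t, x in enumerate(samples))
    return 2.0 * abs(total) / len(samples)


# Amplitude vs. frequency: the spectrum, sampled every 10 Hz.
spectrum = {f: amplitude_at(signal, f, SAMPLE_RATE_HZ) for f in range(0, 200, 10)}
for f, a in spectrum.items():
    if a > 0.01:
        print(f"{f} Hz: amplitude {a:.2f}")
# prints:
# 50 Hz: amplitude 1.00
# 120 Hz: amplitude 0.50
```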

Page 15: Introduction to Spatial Auditory Processing

Direction Information Processing

● We do a Fourier transformation inside our ear. Frequency is encoded in two ways.

– Location along the cochlea.

– Timing of the action potentials.

● The basilar membrane in the cochlea has a resonance gradient along its length.

– Towards the outer end (the base) it resonates at higher frequencies, and towards the centre of the spiral (the apex) it resonates at lower frequencies.

– Hair cells at each location along the basilar membrane trigger more action potentials when the incoming sound contains that location's resonant frequency.

● Scraping of the hair cells against the tectorial membrane opens ion channels in the sensory cells, which creates the action potentials (electrical pulses).

– This opening is synchronized (phase locked) with the frequency of the incoming signal. Hence the action potentials have the same frequency as the incoming sound wave.

Page 16: Introduction to Spatial Auditory Processing

Direction Information Processing

● Sounds are first filtered by the external ear's pinna.

– The little folds of the pinna alter the energy at different frequencies.

● Components of the same amplitude but different frequencies, arriving from the same location, end up with different amplitudes after filtering by the pinna.

● The filtering is also direction dependent.

– The same sound wave arriving from different directions is attenuated differently.

– This creates the spectral cue.

● Using the spectral cue, we can also deduce the direction of the sound source.

Page 17: Introduction to Spatial Auditory Processing

Direction Information Processing

● Sound alone is not a very strong cue for identifying direction.

● To confirm our deduction of the location, we also depend on visual cues.

– For example, even though the speech comes from loudspeakers in a movie theater, we perceive that the actor or actress is talking and that the sound is coming from their mouth (by observing their lip movement).

– Ventriloquism uses this association between visual cues and sound to make the puppet appear to be talking while the puppet master talks without moving his or her lips.

– We also move our head to look for a plausible source of the sound, comparing the sound with our prior knowledge of where similar sounds originate.

Page 18: Introduction to Spatial Auditory Processing

Distance Information Processing

● Distance is computed using two factors.

– Loudness

– Echoes

● Sound loudness diminishes with distance.

● We use loudness information to deduce the distance of the sound source.

– This works better with familiar sounds.

– The level of a familiar sound can easily be compared with our prior knowledge, stored in memory, of how loud that sound is at a given distance.
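The loudness cue can be sketched numerically under a strong simplifying assumption: a point source in a free field, where the sound level drops by about 6 dB for each doubling of distance (the inverse-square law). The 60 dB reference level at 1 m stands in for the "prior knowledge" of a familiar sound and is a made-up value; real rooms and real hearing are more complicated.

```python
import math


def level_at_distance_db(level_ref_db, ref_distance_m, distance_m):
    """Free-field level of a point source at distance_m, given its level
    at ref_distance_m (inverse-square law, an idealization)."""
    return level_ref_db - 20.0 * math.log10(distance_m / ref_distance_m)


def estimate_distance_m(level_heard_db, level_ref_db, ref_distance_m=1.0):
    """Invert the law: infer distance from how much quieter the sound is
    than the remembered reference level."""
    return ref_distance_m * 10.0 ** ((level_ref_db - level_heard_db) / 20.0)


# Hypothetical familiar sound: known to be 60 dB at 1 m.
heard_db = level_at_distance_db(60.0, 1.0, 8.0)   # what arrives from 8 m away
estimated_m = estimate_distance_m(heard_db, 60.0)
print(f"heard {heard_db:.1f} dB, estimated distance {estimated_m:.1f} m")
```

Without the remembered reference level the second function cannot be evaluated, which mirrors why this cue works better for familiar sounds.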

Page 19: Introduction to Spatial Auditory Processing

Distance Information Processing

● One 'first principles' distance cue is the delay of the echoes associated with a sound.

– This does not need any prior knowledge about the sound.

● Our auditory environment acts as an 'auditory hall of mirrors', where sound bounces off every surface in the environment in which we hear it.

● The time difference between the principal (direct) sound and its echo gives a cue to the source distance.

– If the source is far away (as in the top picture), the difference is smaller.

– If the source is close (as in the bottom picture), the difference is larger.
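The relationship between source distance and echo delay can be checked with a simple geometric sketch. It assumes a single flat reflecting surface (a floor), with the source and the listener both 1.5 m above it, and a speed of sound of 343 m/s. The reflected path is computed with the mirror-image construction. The geometry is an illustrative assumption and is not taken from the pictures on the slide.

```python
import math

SPEED_OF_SOUND_M_PER_S = 343.0  # assumption


def echo_delay_ms(source_distance_m, height_m=1.5):
    """Delay between the direct sound and its floor reflection.

    The reflected path equals the straight line from the listener to the
    mirror image of the source, which sits 2 * height_m lower than the
    listener.
    """
    direct = source_distance_m
    reflected = math.sqrt(source_distance_m ** 2 + (2.0 * height_m) ** 2)
    return 1000.0 * (reflected - direct) / SPEED_OF_SOUND_M_PER_S


for distance_m in (2.0, 5.0, 20.0):
    print(f"source at {distance_m:4.1f} m: echo arrives "
          f"{echo_delay_ms(distance_m):.2f} ms after the direct sound")
```

The delay shrinks as the source moves away, because the direct and reflected paths become nearly equal in length.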

Page 20: Introduction to Spatial Auditory Processing

Conclusion

● Spatial information processing for sound is complex, involving multilevel signal transformation.

● It involves both audio and visual processing.

● It also depends on physical phenomena such as echoes.

● It involves prior knowledge about the sound (loudness).

Page 21: Introduction to Spatial Auditory Processing

Acknowledgement

● Most of the figures have been taken from:

– Coursera course on 'Neurobiology in Everyday Life'.

– Coursera course on 'Brain and Space'.

– Wikipedia

● Auditory Cortex (http://en.wikipedia.org/wiki/Auditory_cortex)