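SOUND PROPAGATION
[Figure: a sound source radiating toward a listener along distance x; the wave travels at speed c with wavelength λ]
$c = f\lambda$, where $f$ is the frequency (Hz) and $\lambda$ is the wavelength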
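As a quick check of the relation c = f λ, the sketch below computes wavelengths at a few audible frequencies. The speed of sound (343 m/s, roughly air at 20 °C) and the function name `wavelength` are assumptions for illustration, not values taken from the slides.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C


def wavelength(frequency_hz, c=SPEED_OF_SOUND):
    """Return the wavelength in metres from c = f * lambda."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return c / frequency_hz


if __name__ == "__main__":
    # Low, mid, and high ends of the audible band.
    for f in (20.0, 1000.0, 20000.0):
        print(f"{f:8.0f} Hz -> {wavelength(f):.4f} m")
```

Under these assumptions the wavelength spans from many metres at the low end of the band to a couple of centimetres at the high end, which is why objects the size of the head, torso, and pinna affect different frequency ranges.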
AN INTRODUCTION TO HUMAN SPATIAL HEARING
Richard O. Duda
CIPIC Interface Laboratory
UC Davis
http://phosphor.cipic.ucdavis.edu
October 12, 2000
OVERVIEW
•Physics of sound
•Acoustic cues for sound localization
  •Azimuth
  •Elevation
  •Range
•Head-related transfer functions (HRTFs)
•Approaches to synthesizing spatial sound
•Opportunities and challenges
MULTIPATH PROPAGATION
Reflection
Refraction
Scattering
AXIOM I
The sound pressure at the two eardrums is a sufficient stimulus.
Producing the same sound pressure will produce the same auditory perception.
Caveats:
•Bone conduction
•Adaptation
•Conflicting visual cues
•Conflicting expectations
AXIOM II
Exact reproduction of the sound pressure is not necessary for producing the same auditory perception.
The limitations of neural responses allow different (and simpler) stimuli to produce the same response.
Examples:
•Bandwidth (20 Hz to 20 kHz)
•Amplitude (1-dB resolution)
•Monaural phase (2-ms resolution)
•Latency (10-ms resolution)
•Spectral fine structure (critical bands, Q = 8)
AXIOM III
Although it is not necessary to reproduce all of the cues exactly, conflicting cues degrade perception.
Key engineering challenge -- find the most cost-effective approximation.
VERTICAL-POLAR COORDINATES
[Figure: sound source located by azimuth θ, elevation φ, and range r; labeled surfaces are the plane of constant azimuth, the cone of constant elevation, the median plane, and the horizontal plane]
INTERAURAL-POLAR COORDINATES
[Figure: sound source located by azimuth θ, elevation φ, and range r about the interaural axis; labeled surfaces are the plane of constant elevation, the cone of constant azimuth, the median plane, and the horizontal plane]
AZIMUTH CUES
[Figure: sound source at azimuth θ]
•ITD (Interaural Time Difference)
•ILD (Interaural Level Difference)
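WOODWORTH'S FORMULA
[Figure: a distant sound source at azimuth θ and a head of radius a, showing the path segment a sin θ toward the ipsilateral ear and the arc a θ toward the contralateral ear]
$\Delta T_{ips} = -\dfrac{a \sin\theta}{c}$
$\Delta T_{con} = \dfrac{a\,\theta}{c}$
$\mathrm{ITD} = \dfrac{a}{c}\,(\theta + \sin\theta)$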
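Woodworth's formula, ITD = (a / c)(θ + sin θ), is simple enough to evaluate directly. The sketch below does so; the default head radius (0.0875 m), the speed of sound (343 m/s), and the function name are illustrative assumptions rather than values from the slides.

```python
import math


def woodworth_itd(theta_rad, head_radius=0.0875, c=343.0):
    """ITD in seconds from Woodworth's formula: (a / c) * (theta + sin(theta)).

    theta_rad is the azimuth in radians, restricted here to [-pi/2, pi/2].
    head_radius (a) and c are assumed defaults, not values from the slides.
    """
    if abs(theta_rad) > math.pi / 2:
        raise ValueError("theta must lie in [-pi/2, pi/2]")
    return (head_radius / c) * (theta_rad + math.sin(theta_rad))


if __name__ == "__main__":
    for deg in (0, 30, 60, 90):
        itd_us = woodworth_itd(math.radians(deg)) * 1e6
        print(f"azimuth {deg:3d} deg -> ITD {itd_us:7.1f} microseconds")
```

The formula is antisymmetric in θ, so a source on the opposite side gives the same magnitude with the opposite sign.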
ARRIVAL TIME
[Figure: arrival time (ms) versus angle of incidence (deg), comparing Rayleigh's solution (20% rise time) with Woodworth's formula]
ELEVATION CUES
[Figure: sound source at elevation φ]
•Pinna reflections and resonances
•Torso and shoulder reflections
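TORSO REFLECTION
[Figure: sound source at elevation φ above a torso, with a height h marked; labels include the elevations φ, φ_min, and 90°, the torso reflection delay ΔT_T, and the quantity 2h/c]
[Figure: magnitude response |H(f)| versus frequency f, with marked frequencies $\dfrac{1}{2\Delta T_T},\ \dfrac{3}{2\Delta T_T},\ \dfrac{5}{2\Delta T_T},\ \dfrac{7}{2\Delta T_T}$]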
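A direct sound plus one delayed reflection behaves as a comb filter, H(f) = 1 + ρ e^{-j 2π f ΔT}, whose notches fall at odd multiples of 1/(2ΔT) when ρ is positive. The sketch below illustrates this for a torso-style reflection; the reflection coefficient of 0.5, the 1 ms delay, and the function names are assumptions chosen only for illustration.

```python
import numpy as np


def torso_comb_response(freqs_hz, delay_s, rho=0.5):
    """Magnitude of H(f) = 1 + rho * exp(-j 2 pi f delay).

    rho is an assumed positive reflection coefficient.
    """
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    return np.abs(1.0 + rho * np.exp(-2j * np.pi * freqs_hz * delay_s))


def notch_frequencies(delay_s, count=4):
    """First `count` notch frequencies, at odd multiples of 1 / (2 * delay)."""
    n = np.arange(count)
    return (2 * n + 1) / (2.0 * delay_s)


if __name__ == "__main__":
    delay = 1.0e-3  # seconds, illustrative value only
    for f in notch_frequencies(delay):
        mag = float(torso_comb_response(f, delay))
        print(f"notch at {f:7.1f} Hz, |H| = {mag:.3f}")
```

A longer delay moves the notches closer together and lower in frequency, which is how the reflection delay can carry elevation information.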
THE PINNA
[Figure: anatomy of the pinna, with the following parts labeled]
Cavum concha
Cymba concha
Helix
Crus helicis
Triangular fossa
Scaphoid fossa
Lobule
Intertragal incisure
Antihelix
External auditory meatus
Tragus
Antitragus
PINNA PHENOMENA
Pinna reflections (Batteau)
Pinna resonances (Shaw)
PINNAE
RANGE CUES
•Loudness (for familiar sources)
•Excess ILD (for close sources)
•Direct/reverberant (for distant sources)
HEAD-MOTION CUES AND FRONT/BACK CONFUSION
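HEAD-MOTION CUES AND ELEVATION MAGNITUDE
[Figure: three head-rotation cases, with rotation angle α, head radius a, and source elevation φ marked]
$\mathrm{ITD} = \alpha\,\dfrac{2a}{c}$
$\mathrm{ITD} = \alpha \cos\varphi\,\dfrac{2a}{c}$
$\mathrm{ITD} = 0$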
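The way a head rotation reveals elevation magnitude can be sketched numerically. The example assumes the small-angle relation ITD = α cos φ (2a / c), where α is the head rotation in radians, φ is the source elevation, and a is the head radius; the default radius, speed of sound, and function name are illustrative assumptions.

```python
import math


def rotation_itd(rotation_rad, elevation_rad, head_radius=0.0875, c=343.0):
    """Small-angle ITD after rotating the head by rotation_rad.

    Assumed relation: ITD = alpha * cos(phi) * 2a / c, for a source that
    was straight ahead (in azimuth) before the rotation.
    """
    return rotation_rad * math.cos(elevation_rad) * 2.0 * head_radius / c


if __name__ == "__main__":
    alpha = math.radians(5.0)  # illustrative 5-degree head turn
    for elev_deg in (0, 30, 60, 90):
        itd_us = rotation_itd(alpha, math.radians(elev_deg)) * 1e6
        print(f"elevation {elev_deg:2d} deg -> ITD {itd_us:6.2f} microseconds")
```

Under this relation the same head turn yields a smaller ITD as the source rises, falling to zero overhead, so the ITD change per degree of rotation indicates the magnitude of the elevation but not its sign.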
OTHER CUES
•Visual cues
  •Synchronized motion
  •Absence
•Knowledge of source
•Knowledge of environment
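FREE-FIELD RADIATION FROM A SPHERICAL SOURCE
[Figure: spherical sound source with pressure X(f) and the free-field pressure Xff(f) at range r, with r0 and r marked]
X(f) = Fourier transform of source pressure
Xff(f) = Free-field pressure at head center
$X_{ff} = H_{ff}\,X$
$H_{ff}(f) = \dfrac{r_0}{r}\,e^{-jkr}, \qquad k = \dfrac{2\pi f}{c}$
The factor $r_0/r$ is the inverse-range term; the factor $e^{-jkr}$ is the propagation delay.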
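The free-field transfer function H_ff(f) = (r0 / r) e^{-jkr}, with k = 2πf / c, can be evaluated directly to see its two parts: a frequency-independent inverse-range gain and a linear-phase propagation delay. The defaults for r0 and c and the function name below are assumptions for illustration.

```python
import numpy as np


def free_field_tf(freqs_hz, r, r0=1.0, c=343.0):
    """H_ff(f) = (r0 / r) * exp(-j k r), with wavenumber k = 2 pi f / c.

    r0 and c are assumed defaults, not values from the slides.
    """
    k = 2.0 * np.pi * np.asarray(freqs_hz, dtype=float) / c
    return (r0 / r) * np.exp(-1j * k * r)


if __name__ == "__main__":
    freqs = np.array([100.0, 1000.0, 10000.0])
    H = free_field_tf(freqs, r=2.0)
    # The magnitude is r0 / r at every frequency; only the phase varies.
    print(np.abs(H))
    print(np.angle(H))
```

Doubling r halves the magnitude (a 6 dB drop) and doubles the delay r / c.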
THE HEAD-RELATED TRANSFER FUNCTION
[Figure: sound source X(f) reaching the left and right ears through HL(f) and HR(f), producing XL(f) and XR(f)]
X(f) = Fourier transform of source pressure
XL(f) = Fourier transform of left ear pressure
XR(f) = Fourier transform of right ear pressure
Xff(f) = Free-field pressure at the origin
$X_L(f) = H_L(f)\,X_{ff}(f)$
$X_R(f) = H_R(f)\,X_{ff}(f)$
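THE HEAD-RELATED IMPULSE RESPONSE
[Figure: sound source δ(t) reaching the left and right ears through hL(t) and hR(t), producing xL(t) and xR(t)]
xL(t) = Left ear pressure
xR(t) = Right ear pressure
xff(t) = Free-field pressure at the origin
$x_L(t) = \int_{-\infty}^{\infty} h_L(\tau)\,x_{ff}(t-\tau)\,d\tau$
$x_R(t) = \int_{-\infty}^{\infty} h_R(\tau)\,x_{ff}(t-\tau)\,d\tau$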
HRIR SOUND SYNTHESIS
[Figure: the sound signal x(t) of a virtual source feeds two convolvers; the head-related impulse responses hL(t) and hR(t), selected by azimuth θ, elevation φ, and range r, produce the ear signals xL(t) and xR(t)]
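In discrete time, HRIR synthesis amounts to convolving the mono source signal with the left and right impulse responses for the desired direction. The sketch below shows this with NumPy; the function name and the toy impulse responses (a pure delay and attenuation standing in for measured HRIRs) are hypothetical placeholders, not data from the slides.

```python
import numpy as np


def render_binaural(x, h_left, h_right):
    """Convolve a mono signal with a left/right HRIR pair.

    Returns an array of shape (2, N): row 0 is the left ear signal,
    row 1 the right ear signal, zero-padded to a common length.
    """
    x = np.asarray(x, dtype=float)
    left = np.convolve(x, np.asarray(h_left, dtype=float))
    right = np.convolve(x, np.asarray(h_right, dtype=float))
    n = max(len(left), len(right))
    out = np.zeros((2, n))
    out[0, :len(left)] = left
    out[1, :len(right)] = right
    return out


if __name__ == "__main__":
    # Toy stand-ins for measured HRIRs: the right ear hears the source
    # three samples later and at half the amplitude.
    h_l = [1.0]
    h_r = [0.0, 0.0, 0.0, 0.5]
    print(render_binaural([1.0, 2.0, 3.0], h_l, h_r))
```

For long signals or moving sources, block-based FFT convolution with interpolation between HRIRs would normally replace the direct `np.convolve` call used here.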
A STRUCTURAL MODEL
[Figure: block diagram in which the sound signal x(t) of a virtual source passes, for each ear, through head, torso, and room blocks, a summing node, and a pinna block, producing xL(t) and xR(t)]
COMPUTING HRTFs BY BOUNDARY ELEMENT METHODS
•Digitize with a 3-D scanner
•Solve wave equation numerically
* See Kahana et al.
THE KEMAR ACOUSTIC MANIKIN
ACOUSTIC HRTF MEASUREMENT
[Figure: measurement geometry with the interaural axis, azimuth θ, and elevation φ marked]
KEMAR HRIR
Azimuth = -45°, Elevation = 0°
[Figure: left-ear and right-ear impulse responses versus time (ms), from 0 to 2 ms]
KEMAR HRTF
Azimuth = -45°, Elevation = 0°
[Figure: left-ear and right-ear response (dB), from -30 to 30 dB, versus frequency (kHz), from 0.1 to 20 kHz]
RIGHT-EAR HRTF FOR KEMAR (Horizontal Plane)
[Figure, FRONT panel: response (dB) versus frequency (Hz), 100 Hz to 10 kHz, for azimuth = 0°, 90°, and -90°]
[Figure, BACK panel: response (dB) versus frequency (Hz), 100 Hz to 10 kHz, for azimuth = 90°, 180°, and 270°]
HRTF FOR KEMAR, NO PINNA (Horizontal Plane)
[Figure, FRONT panel: response (dB) versus frequency (Hz), 100 Hz to 10 kHz, for azimuth = 0°, 90°, and -90°]
[Figure, BACK panel: response (dB) versus frequency (Hz), 100 Hz to 10 kHz, for azimuth = 90°, 180°, and 270°]
HRTF ELEVATION DEPENDENCE
[Figure: HRTF magnitude (dB, -15 to 15) as a function of frequency (kHz, 2 to 16) and elevation (deg)]
HRTF WITHOUT PINNA
[Figure: HRTF magnitude (dB, -15 to 15) as a function of frequency (kHz, 2 to 16) and elevation (deg)]
A PINNA ON A PLANE
HRTF FOR ISOLATED PINNA
[Figure: HRTF magnitude (dB, -15 to 15) as a function of frequency (kHz, 2 to 16) and elevation (deg)]
CONTRIBUTIONS TO THE HRTF
[Figure: three panels (Full HRTF, Head and torso, Pinna), each showing magnitude (dB, -15 to 15) as a function of frequency (kHz, 2 to 16) and elevation (deg)]
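THE SPHERICAL-HEAD MODEL
[Figure: block diagram in which the signal x(t) of a virtual source at azimuth θ passes, for each ear, through a delay and a filter: $\Delta T_L(\theta)$ and $H_{HS,L}(\theta)$ for the left ear, $\Delta T_R(\theta)$ and $H_{HS,R}(\theta)$ for the right ear, producing xL(t) and xR(t)]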
ASSESSING THE SPHERICAL HEAD MODEL
•Only one parameter -- easily customized
•Well focused
•Good left/right position
•No up/down control -- image elevated
•With a head tracker:
  •Moderately externalized
  •Little front/back confusion
•Without a head tracker:
  •Internalized
  •Usually seems to be in back
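ELLIPSOIDAL-TORSO MODEL
[Figure: sound source at elevation φ; for each ear, a head model is fed by the source together with a torso reflection path marked with $\rho_T$ and $\Delta T_T$]
$\rho_T$ = torso reflection coefficient
$\Delta T_T$ = torso reflection delay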
ASSESSING THE ELLIPSOIDAL TORSO MODEL
•Five parameters; still easily customized
•Provides an elevation cue
  •Significant below 3 kHz
  •Ineffective in median plane
•Only one component of a full model
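STRUCTURAL HRTF MODEL
[Figure: block diagram combining a head model for each ear, a torso model, and a pinna model. Head model: delay $\Delta T_H(\theta)$ and filter $H_{HS}(\theta)$. Torso model: coefficient $\rho_T$ and delay $\Delta T_T(\theta, \varphi)$.]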
SIMPLIFIED PINNA MODEL
[Figure: for each ear, a path with gain $k_P(\varphi)$ and delay $\Delta T_P(\varphi)$ together with a fixed-pole resonator]
SPATIAL SOUND SYSTEMS
Multichannel
Two-channel: headphones
Two-channel: crosstalk-canceled loudspeakers
MULTICHANNEL SYSTEMS
Pros
•Works with a large audience
•No customization needed
•Conceptually simple
Cons
•Speakers must be distant
•Many channels needed for full 3-D
•Space consuming, expensive
TWO-CHANNEL: HEADPHONES
Pros
•Can reproduce full 3-D with only 2 channels
•Private and non-interfering
•Conceptually simple
Cons
•Uncomfortable for extended use
•Clumsy for a large audience
•Requires customization for full 3-D
•Difficult to achieve frontal externalization
[Figure: headphones driven by xL(t) and xR(t)]
TWO-CHANNEL: CROSSTALK-CANCELED LOUDSPEAKERS
Pros
•Can reproduce full 3-D with only 2 channels
•Unencumbered listening
Cons
•Small "sweet spot"
•Cannot be used with a large audience
•Requires customization for full 3-D
•Difficult to get near or rear locations
[Figure: xL(t) and xR(t) pass through inverse HRTFs before driving the two loudspeakers]
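The "inverse HRTFs" idea can be sketched per frequency bin: if a 2x2 matrix C maps the two loudspeaker signals to the two ear signals, then driving the loudspeakers with C^{-1} applied to the desired ear signals cancels the crosstalk. The matrix layout, the function name, and the example values below are assumptions for illustration; a practical design would also need regularization where C is nearly singular.

```python
import numpy as np


def crosstalk_canceller(C):
    """Invert 2x2 speaker-to-ear transfer matrices, one per frequency bin.

    C has shape (n_bins, 2, 2); C[k, i, j] is the assumed transfer
    function from loudspeaker j to ear i at bin k. No regularization
    is applied, so each matrix must be well conditioned.
    """
    C = np.asarray(C, dtype=complex)
    return np.linalg.inv(C)  # batched inverse over the leading axis


if __name__ == "__main__":
    # One illustrative bin: each ear hears its own speaker at gain 1
    # and the opposite speaker at gain 0.5.
    C = np.array([[[1.0, 0.5], [0.5, 1.0]]])
    G = crosstalk_canceller(C)
    desired_ears = np.array([1.0, 0.0])  # signal at the left ear only
    speakers = G[0] @ desired_ears
    print("speaker signals:", speakers)
    print("resulting ear signals:", C[0] @ speakers)
```

Because C depends on the listener's head position, the cancellation holds only near the design position, which is consistent with the small "sweet spot" listed among the cons.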
APPROACHES TO CUSTOMIZATION
•Measure exact HRTF for each person
  •Acoustic
  •Computational
•Nearest-neighbor
  •Trial and error
  •Anthropometry
•Scale a standard HRTF
  •Global
  •Pinna/head/torso components
•Use an adaptive model
  •Match to anthropometry
  •Match to exact HRTF
CHALLENGES AND OPPORTUNITIES
•Frequency range (combining partial HRTFs)
•Elevation perception
  •Front/back confusion
  •Low elevations
•Range perception
  •Headphones: externalization
    •Median plane
    •Frontal
  •Speakers: back locations
•Transducers
  •Headphone compensation
  •Loudspeaker "sweet spot"
•Latency in dynamic systems
•Room acoustics