
Lannion, France, 10-12 September 2008

International Telecommunication Union

Models of Binaural Interaction

Jens Blauert
Emeritus Professor of Acoustics
Ruhr-Universität Bochum, Germany

Jonas Braasch
Ass. Professor of Communication Acoustics
Rensselaer Polytechn. Inst., Troy NY, USA

ITU-T Workshop on "From Speech to Audio: bandwidth extension, binaural perception"
Lannion, France, 10-12 September 2008

Prominent Features of Binaural Hearing

• Localization
  – Formation of Positions of the Auditory Events, i.e., Azimuth, Elevation & Distance
  – Spatial Extent of Auditory Events

• Suppression of
  – Directional Information Coming from Reflections, e.g., Precedence Effect, Localization Dominance, Fusion
  – Reverberation, Coloration and Noise

• Identification & Segregation of
  – Auditory Streams, e.g., Concurrent Talkers (Cocktail-Party Effect), Warning Signals

Models of Binaural Interaction

Part I: Basic Concepts

– Architecture of a Model of Binaural Hearing
– The Jeffress Processor
– The Lindemann/Gaik Extensions
– Interpreting Binaural Activity
– The Effect of Interaural Incoherence
– Binaural Speech Enhancement

Architecture for a Model of Binaural Hearing

[Figure: block diagram of the model; output: binaural-activity display]


Ear-Adequate Band-Pass-Filter Bank


A Simplified Functional Model of the Hair Cells

Probabilistic


The Interaural Cross-Correlation Function

$$\Psi_Y(\tau) = \frac{1}{t_1 - t_0} \sum_{t=t_0}^{t_1} y_l(t)\, y_r(t+\tau)$$

Cherry 1959

usually estimated for the physiologic range of interaural arrival-time differences, |τ| < 1 ms
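The cross-correlation sum above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name, the sampling rate, the test tone and the normalisation by the number of samples (standing in for the 1/(t1 - t0) factor) are assumptions made for illustration only.

```python
import math


def interaural_cross_correlation(y_left, y_right, fs, max_lag_ms=1.0):
    """Estimate the cross-correlation of two ear signals for |tau| <= max_lag_ms.

    Returns (lags_ms, psi). Dividing by the number of samples n stands in
    for the 1/(t1 - t0) factor (an assumption of this sketch).
    """
    n = min(len(y_left), len(y_right))
    max_lag = int(round(max_lag_ms * 1e-3 * fs))
    lags_ms, psi = [], []
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for t in range(n):
            u = t + lag
            if 0 <= u < n:  # samples outside the window count as zero
                acc += y_left[t] * y_right[u]
        lags_ms.append(1000.0 * lag / fs)
        psi.append(acc / n)
    return lags_ms, psi


# Hypothetical test signal: 500-Hz tone, right-ear signal delayed by 8 samples (0.5 ms).
fs = 16000
delay = 8
left = [math.sin(2.0 * math.pi * 500.0 * t / fs) for t in range(1600)]
right = [0.0] * delay + left[:-delay]

lags_ms, psi = interaural_cross_correlation(left, right, fs)
best_lag_ms = lags_ms[psi.index(max(psi))]
print(best_lag_ms)
```

The lag at which the function peaks is the estimate of the interaural arrival-time difference; restricting the search to about 1 ms on either side keeps it within the physiologic range mentioned on the slide.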

The Binaural-Coincidence Processor after Jeffress 1948

There are many of these channels in parallel
FM … firing model
BP … band-pass filter

Output of Jeffress’ Coincidence Processor for Different Angles of Sound Incidence

estimates of the interaural cross-correlation function

panels: about 0.7 kHz, about 6 kHz

videos provided by Braasch 2003

Sample Output of the Binaural Model
one frontal sound source, sending out a musical chord

taken at an instant t = t0 from a running correlogram

The Lindemann Processor
please note the monaural modules, m

Lindemann 1986

Schematic View of the Lindemann Model


Output of Lindemann’s Coincidence Processor with Lateral Inhibition

video provided by Braasch 2005

estimate of the inhibited interaural cross-correlation function

about 0.7 kHz

Output of the Lindemann Processor
with regard to interaural arrival-time differences (ITDs) and interaural level differences (ILDs)

600-Hz sinusoidal sounds

panels: ITDs, ILDs

Single-Cell Responses from the Central Nucleus of the Inferior Colliculus

panels: own HRTFs, other animal’s HRTFs

Guinea-pig experiments, after Sterbing & Hartung 1998

„Natural Combinations“ of ITDs and ILDs

Gaik 1988

Weighted Contralateral Inhibition

after Gaik 1988

Gaik’s Extension

Output of the Jeffress-Lindemann-Gaik Model
frontal broad-band sound source

axes: center frequency of critical band; left < lateral deviation > right

A „Figurative“ Plot of the Output of the Gaik Processor

Block Diagram of the Binaural-Analysis System of IKA Bochum

pure bottom-up processing, signal driven!

blocks: interaural-time-difference analysis, interaural-level-difference analysis, binaural-activity map

Spatial Extent of the Auditory Event as a Function of Interaural Correlation

after Dubrovski & Cherniak 1966

Model Output with Incoherent Ear-Input Signals
parameter: degree of interaural coherence, k

[Figure: predicted model output; axis: left < center > right]

Area Covered by Auditory Events as a Function of the Degree of Interaural Coherence, k

pink noise, 12 subjects

after Blauert & Lindemann, 1985 (plot enhanced for contrast)
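To make the parameter k concrete, the following Python sketch generates a pair of ear signals with a chosen degree of coherence by mixing a shared noise with an independent one. This is an illustration under stated assumptions, not the procedure used in the experiments: it uses white Gaussian noise rather than pink noise, measures coherence as the normalised correlation at zero lag only, and the function names are hypothetical.

```python
import math
import random


def coherent_noise_pair(n, k, seed=0):
    """Return (left, right) noise signals whose expected correlation is k.

    The right signal is k times the shared noise plus sqrt(1 - k^2) times an
    independent noise, so both signals keep the same expected power.
    """
    rng = random.Random(seed)
    common = [rng.gauss(0.0, 1.0) for _ in range(n)]
    independent = [rng.gauss(0.0, 1.0) for _ in range(n)]
    g = math.sqrt(1.0 - k * k)
    left = common
    right = [k * c + g * i for c, i in zip(common, independent)]
    return left, right


def normalized_correlation(x, y):
    """Normalised cross-correlation coefficient at zero lag."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den


left, right = coherent_noise_pair(20000, 0.5)
measured_k = normalized_correlation(left, right)
print(round(measured_k, 2))
```

With k = 1 both ears receive the same signal; as k decreases toward 0 the signals become statistically independent.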

Impulse Response and Running Interaural Cross Correlation in a Room with Reflecting Walls

The Binaural Intelligibility-Level Difference, BILD

Cherry’s Experiment (1959)

Instantaneous Binaural Activity of Two Concurrent Talkers

labels: talker 1, talker 2

axes: left < lateral deviation > right; running time; amplitude

Architecture of a Cocktail-Party Processor, Based on Binaural Modelling

blocks in the diagram: binaural model, direction finder, weight assignment, control unit, spectral decomposition, speech cleansing, Wiener filter, source selection
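The slide names a Wiener filter as one block of the processor but does not say how its weights are derived from the binaural model. As a hedged illustration of the general idea only, the sketch below computes the classic Wiener-type gain per frequency band, target power divided by total power, and applies it to band signals. The function names and the per-band power estimates are hypothetical inputs, not part of the architecture shown.

```python
def wiener_gains(target_power, interferer_power, floor=0.0):
    """Per-band Wiener-type gains: S / (S + N), limited below by `floor`."""
    gains = []
    for s, n in zip(target_power, interferer_power):
        total = s + n
        gain = s / total if total > 0.0 else 0.0
        gains.append(max(gain, floor))
    return gains


def apply_gains(band_signals, gains):
    """Scale each band signal (a list of samples) by its gain."""
    return [[g * x for x in band] for band, g in zip(band_signals, gains)]


# Hypothetical power estimates for three bands (arbitrary units).
target = [4.0, 1.0, 0.0]
interferer = [1.0, 1.0, 2.0]
gains = wiener_gains(target, interferer)
print(gains)
```

Bands dominated by the selected source pass almost unchanged, while bands dominated by the interferer are attenuated.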


Spatial Selectivity of a Model of Binaural Hearing

Bodden 1993

Source Tracking with the Binaural Model

panels: one noise source; two concurrent talkers; two moving talkers; talker plus wall reflection

Bodden 1993

Binaural Modeling for Acoustically-Adverse Conditions

(a) unprocessed

(b) processed

influence of reverberation on binaural activity (model output)

Cross-Modal Influence and Cognitive Influence on the Formation of Auditory Events

top-down processing, hypothesis driven!

binaural-activity map

Blauert 1988

Potential Applications for Binaural Algorithms

Communication Acoustics

Communication Acoustics represents those areas of acoustics that relate to the modern communication and information sciences & technologies

Models of Binaural Interaction

Part II: The Precedence Effect

Contents
- Phenomenology
- Bottom-Up Modelling
- Auditory Scenes
- Build-Up and Break-Down
- Top-Down Modelling


Standard Stereo-Listening Arrangement

Auditory Effects with Two Coherent Sound Sources

labels: lead, lag

regions: summing localization, precedence effect, echo threshold


Precedence Effect, Haas Effect & Backward Inhibition

signal: running speech of 50 syllables/s

delay of the reflection

Auditory-Event Trajectories for Broad-Band and Narrow-Band Signals

panels: narrow-band signal, broad-band signal

Trajectories of Narrow-Band Signals
two loudspeakers in standard stereo position

axis unit: ms

Blauert & Cobben 1978

Widening Effect with Broad-Band Sound Sources

labels: broad-band sounds, primary auditory event, echo

Binaural Model for Studying Inter-Aural Cross-Correlation

includes a simple cochlea-filter and hair-cell model

filter labels: fc = 800 Hz, fc = 800 Hz

Blauert & Cobben 1978

Contra-Lateral Inhibition after Lindemann

ICC

contra-laterally-inhibited ICC

Lindemann 1982


Cross-Correlation vs. Lindemann

impulsive input signals with no ITD, but with an ILD, looked at in an auditory „critical“ band about 800 Hz

schematic plot

Output of the Lindemann Model for a Frontal Sound Plus One Lateral Reflection

includes dynamic contra-lateral inhibition

impulses (clicks)

left panel: cross correlation only
right panel: cross correlation plus contra-lateral inhibition

Precedence Effect for Ongoing Signals

lead and lag pair:
band-passed noise, 500 Hz center frequency, 200 ms
100 Hz, 400 Hz or 800 Hz bandwidth
lead: +300-µs ITD, lag: −300-µs ITD
inter-stimulus intervals, ISI: 0, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.5 ms

6 listeners
acoustic pointer

stimulus presentation via headphones

Braasch & Blauert 2004

Time Course of the Listening Experiments

diagram labels: lead, lag; left channel, right channel; left ear, right ear; ITD 1, ITD 2, ISI; axis: time

200-ms noise bursts, band-pass filtered around 500 Hz
ISI … inter-stimulus interval
ITD 1, ITD 2 … inter-aural arrival-time differences

Blauert & Braasch 2004
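The lead-lag stimulus described above can be sketched in Python as follows. This is a minimal illustration under several assumptions that are not taken from the slides: the lag is treated as a delayed copy of the lead, a positive ITD is taken to mean that the right-ear signal arrives later, the sampling rate of 100 kHz is chosen so that 300 µs is a whole number of samples, and the band-pass filtering around 500 Hz is omitted. All function names are hypothetical.

```python
import random


def delayed(x, d, n):
    """Return x delayed by d samples, zero-padded to length n."""
    out = [0.0] * n
    for i, v in enumerate(x):
        if i + d < n:
            out[i + d] += v
    return out


def lead_lag_pair(burst, fs, itd_lead_us, itd_lag_us, isi_ms):
    """Build left/right ear signals for a lead and a lag with their own ITDs.

    Assumed sign convention: a positive ITD delays the right-ear signal,
    a negative ITD delays the left-ear signal.
    """
    isi = int(round(isi_ms * 1e-3 * fs))
    d_lead = int(round(abs(itd_lead_us) * 1e-6 * fs))
    d_lag = int(round(abs(itd_lag_us) * 1e-6 * fs))
    n = len(burst) + isi + max(d_lead, d_lag)
    lead_l = delayed(burst, d_lead if itd_lead_us < 0 else 0, n)
    lead_r = delayed(burst, d_lead if itd_lead_us > 0 else 0, n)
    lag_l = delayed(burst, isi + (d_lag if itd_lag_us < 0 else 0), n)
    lag_r = delayed(burst, isi + (d_lag if itd_lag_us > 0 else 0), n)
    left = [a + b for a, b in zip(lead_l, lag_l)]
    right = [a + b for a, b in zip(lead_r, lag_r)]
    return left, right


# 200-ms white-noise burst (band-pass filtering omitted in this sketch).
fs = 100000
rng = random.Random(1)
burst = [rng.gauss(0.0, 1.0) for _ in range(int(0.2 * fs))]
left, right = lead_lag_pair(burst, fs, itd_lead_us=300, itd_lag_us=-300, isi_ms=1.5)
print(len(left), len(right))
```

Calling the function once per ISI value from the list on the previous slide would yield the full stimulus set for one bandwidth condition.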

Psycho-Acoustic Results

plot annotation: no localization dominance

band-pass-noise bursts, 200-ms duration, 500-Hz center frequency

Blauert & Braasch 2004

Cross-Correlation Model

legend: x … type I, + … type II
panel label: level diff

Blauert & Braasch 2004


Output of the Jeffress Coincidence Processor – Lag Frontally-Fixed, Lead Moving About –

video provided by Jonas Braasch (2005)

estimate of the interaural cross-correlation function

about 0.7 kHz, 100-ms noise, ISI ... 10 ms

label: lag

Output of the Jeffress Coincidence Processor – Lag Frontally Fixed, Lead Moving About –

video provided by Jonas Braasch (2005)

estimate of the interaural cross-correlation function

broad band, 100-ms noise, ISI ... 10 ms

Modification of the Lindemann Algorithm

Aim: Processing of ITDs and ILDs independently of each other

(a) ITDs: compression of the signal before sending it to the Lindemann model to eliminate the influence of ILDs

(b) ILDs: processed through excitation/inhibition (EI) cells that contain inhibitory elements

Braasch & Blauert 2004

Modified Lindemann Model

legend: x … type I, + … type II
panel labels: inhibited ICC, level diff

Blauert & Braasch 2004

Output of Lindemann’s Processor with Lateral Inhibition – Lag Frontally-Fixed, Lead Moving About –

video provided by Jonas Braasch (2005)

estimate of the inhibited interaural cross-correlation function

about 0.7 kHz, 100-ms noise, ISI ... 10 ms

label: lag

Output of Lindemann’s Coincidence Processor with Lateral Inhibition

video provided by Jonas Braasch (2005)

estimate of the inhibited interaural cross-correlation function

broad band, 100-ms noise, ISI = 10 ms

Conclusions

At 100-Hz bandwidth, localization dominance was only observed in half of the cases.

The psycho-acoustical results could be simulated well using a combined ICC/EI model with elements of contra-lateral inhibition.

In the model simulation, across-frequency interaction was not required.

The degree of localization dominance depends on the signals’ bandwidth.

Simulation for Ongoing Stimuli
different settings of the contra-lateral inhibition

low for impulses, high for ongoing sounds

Conclusions

The necessity to increase the inhibition factor of the Lindemann model to simulate experiments with ongoing sounds suggests that contra-lateral inhibition increases with signal duration.

An alternative solution could employ a second onset-triggered inhibition process with a longer time constant on top of the Lindemann model, as was done by Djelani to simulate the build-up of the Precedence Effect for click trains.

Lindemann’s Algorithm (1985)

dynamically adjustable!

Lindemann 1985

Fast & Slow Melodies – Trumpet

axes: time, Fo

Tsakostas & Blauert 2001

Echo Thresholds

Fast & Slow Melodies

[Bar chart: ET in ms (axis 0 to 100) for L1, L2, L3, L4; series F, FM, S, SM; bar labels in extraction order: 35, 40, 30, 42, 53, 52, 65, 45, 57, 48, 55, 85, 92, 75, 82, 60]

F … fast, FM … fast in mixture, S … slow, SM … slow in mixture

Tsakostas & Blauert 2001


Low- & High-Frequency Noise

[Four time-frequency schematics of the stimuli: LF, LFM, HF, HFM; axes: time, frequency; time: 1000 ms]

durations: 500 ms and 1000 ms
rise & fall time: 10 ms, linear
level: 70 dB SPL, A-weighted

Tsakostas & Blauert 2001

Echo Thresholds

LF & HF

[Bar chart: ET in ms (axis 0 to 160) for L1, L2, L3, L4; series LF, LFM, HF, HFM; bar labels in extraction order: 100, 90, 95, 78, 136, 151, 134, 82, 80, 84, 63, 85, 86, 86, 64, 135]

LF … low, LFM … low in mixture, HF … high, HFM … high in mixture

Tsakostas & Blauert 2001

Wolf’s Experiment I

Wolf 1991

Wolf’s Experiment II
alternative plot

Wolf 1991

The Clifton Effect

Clifton 1987

Sound Sources Alternating in Space

Blauert & Col 1989


Fusion Build-Up: Experimental Set-Up

labels: build-up, build-up, no build-up

Djelani 2001

Fusion-Build-Up Results

Djelani & Blauert 2002

Virtual Test Room

Djelani & Blauert 2002

Test-Room Switching

Djelani & Blauert 2002

Room-Switching Results

Djelani & Blauert 2002

Echo-Build-Up-Decay Results

Djelani & Blauert 2002


Wolf’s Model

blocks: prior knowledge, plausibility; pure cross-correlation maps; cross-correlation maps with dynamic inhibition

Wolf 1991

The Franssen Effect

Franssen 1960

Modelling the Auditory System with Hypotheses-Driven Processing Stages

– situational knowledge
– domain knowledge
– cross-modal influence

binaural-activity map

Audition and Cognition Come in Couples

What does the sound mean to me?
Where and how is the sound?

recognition, interpretation
detection, perception

Communication Acoustics

Jens Blauert, ed. (2005)

Authors: Jens Blauert, Jonas Braasch, Hugo Fastl, Volkmar Hamacher, Dorte Hammershøi, Ulrich Heute, Inga Holube, Herbert Hudde, Ute Jekosch, Georg Klump, Armin Kohlrausch, Arild Lacroix, Henrik Møller, Sebastian Möller, John N. Mourjopoulos, Pedro Novo, Steven van der Par

Springer Berlin–Heidelberg–New York
ISBN 3-540-22162-X

Contents


Thank you!

[email protected]@rpi.edu


Copyright note:

This material is not in the public domain. The authors claim all applicable rights. However, permission to copy it is granted under the condition that proper reference is given to the authors.

Corresponding author:

_____________________________________

Jens Blauert, Emeritus Professor of Acoustics
Institute of Communication Acoustics
Ruhr-Universität Bochum
D-44780 Bochum, Germany

Tel.: +49 234 322 2496 (direct: 3480)
Fax: +49 234 321 4165
e-mail: [email protected]
www.rub.de/ika

____________________________________________

©