23
Noise reduction and binaural Noise reduction and binaural cue preservation of multi- cue preservation of multi- microphone algorithms microphone algorithms Simon Doclo , Tim van den Bogaert, Marc Moonen, Jan Wouters Dept. of Electrical Engineering (ESAT-SCD), KU Leuven, Belgium Dept. of Neurosciences (ExpORL), KU Leuven, Belgium Oldenburg, June 28 2007

Noise reduction and binaural cue preservation of multi- microphone algorithms Simon Doclo, Tim van den Bogaert, Marc Moonen, Jan Wouters Dept. of Electrical

Embed Size (px)

Citation preview

Noise reduction and binaural Noise reduction and binaural

cue preservation of multi-cue preservation of multi-

microphone algorithms microphone algorithms

Simon Doclo, Tim van den Bogaert, Marc Moonen, Jan Wouters

Dept. of Electrical Engineering (ESAT-SCD), KU Leuven, Belgium Dept. of Neurosciences (ExpORL), KU Leuven, Belgium

Oldenburg, June 28 2007

22

OverviewOverview

• Problem statement

o Improve speech intelligibility + preserve spatial awareness

o Bilateral vs. binaural processing

• Binaural signal processing using multi-channel Wiener filter

o MWF: noise reduction and preservation of speech cues, noise cues are distorted

o Extension of MWF to preserve binaural cues of all components:

– MWFv: partial estimation of noise component

– MWF-ITF: extension with Interaural Transfer Function

o Physical and perceptual evaluation

• Reduce bandwidth requirements of wireless link

o Distributed binaural MWF

33

Problem statementProblem statement

• Many hearing impaired are fitted with a hearing aid at both earso Signal processing to selectively enhance useful speech signal

and improve speech intelligibility

o Signal processing to preserve directional hearing and spatial awareness

o Multiple microphone available: spectral + spatial processing• Binaural auditory cues:

o Interaural Time Difference (ITD) – Interaural Level Difference (ILD)

o Binaural cues, in addition to spectral and temporal cues, play an important role in binaural noise reduction and sound localisation

o ITD: f < 1500Hz, ILD: f > 2000Hz

IPD/ITD

ILD

Problem statement

-bilateral/binaural Binaural processing

Bandwidthreduction

Conclusions

44

Bilateral vs. BinauralBilateral vs. Binaural

Bilateral system

Independent left/right processing:Preservation of binaural cuesfor localisation ?

Binaural system

More microphones: better performance ? preservation of binaural cues ?

Need of binaural link

Problem statement

-bilateral/binaural Binaural processing

Bandwidthreduction

Conclusions

55

• Bilateral system:

o Independent processing of left and right hearing aid

o Localisation cues are distorted

RMS error per loudspeaker when accumulating all responses of the different test conditions (NH = normal hearing, NO = hearing impaired without hearing aids, O = omnidirectional configuration, A = adaptive directional configuration)

[Van den Bogaert et al., 2006]

Bilateral vs. BinauralBilateral vs. Binaural

Problem statement

-bilateral/binaural Binaural processing

Bandwidthreduction

Conclusions

also effect on intelligibility through binaural hearing advantage

77

• Bilateral system:

o Independent processing of left and right hearing aid

o Localisation cues are distorted

• Binaural system:

o Cooperation between left and right hearing aid (e.g. wireless link)

o Assumption: all microphone signals are available at the same timeObjectives/requirements for binaural

algorithm:

1. SNR improvement: noise reduction, limit speech distortion

2. Preservation of binaural cues (speech/noise) to exploit binaural hearing advantage

3. No assumption about position of speech source and microphones

[Van den Bogaert et al., 2006]

Bilateral vs. BinauralBilateral vs. Binaural

Problem statement

-bilateral/binaural Binaural processing

Bandwidthreduction

Conclusions

1010

Configuration and signalsConfiguration and signals

• Configuration: microphone array with M microphones at left and right hearing aid, communication between hearing aids

noise component

0, 0, 0, 0( ) = ( ) , 0) =( 1m m mVY X m M 00, 0,, 0( ) = ( )( 0) , = 1mm mY V m MX

speech component

0 0 1 1( ) = ( ) ( ), ( ) = ( ) ( )H HZ Z W Y W Y

• Use all microphone signals to compute output signal at both ears

Problem statement

Binaural processing-MWF-Cue preservation-Physical evaluation-Perceptual eval

Bandwidthreduction

Conclusions

1111

Overview of cost functionsOverview of cost functions

Multi-channel Wiener filter (MWF): MMSE estimate of speech component in microphone signal at both ears

trade-off noise reduction and speech

distortion

Speech-distortion weighted multi-channel Wiener filter

(SDW-MWF)[Doclo 2002, Spriet 2004]binaural cue

preservation of speech + noise

Partial estimation of noise component

(MWFv)[Klasen 2005]

Extension with ITD-ILD or Interaural Transfer

Function (ITF)

[Doclo 2005, Klasen 2006, Van den Bogaert 2007]

Problem statement

Binaural processing-MWF-Cue preservation-Physical evaluation-Perceptual eval

Bandwidthreduction

Conclusions

1212

• Binaural SDW-MWF: estimate of speech component in microphone signal at both ears (usually front microphone) + trade-off between noise reduction and speech distortion

Binaural multi-channel Wiener Binaural multi-channel Wiener filterfilter

0

1

= , = ,x v M xx y v

M x v x

R R 0 rR r R R R

0 R R r

0

1

2 2

0, 0 0

11, 1

( )H H

r

HHr

XJ E

X

W X W VW

W VW X1=SDWW R r

speech componentin front microphones noise

reductionspeech distortion

Problem statement

Binaural processing-MWF-Cue preservation-Physical evaluation-Perceptual eval

Bandwidthreduction

Conclusions

estimate

o Depends on second-order statistics of speech and noise

o Estimate Ry during speech-dominated time-frequency segments, estimate Rv during noise-dominated segments, requiring robust voice activity detection (VAD) mechanism

o No assumptions about positions of microphones and sources

o Adaptive (LMS-based) algorithm available [Spriet 2004, Doclo 2007]

1313

Binaural multi-channel Wiener Binaural multi-channel Wiener filterfilter

• Interpretation for single speech source:

o Spectral and spatial filtering operation

with (spatial) coherence matrix and P (spectral) power

o Equivalent to superdirective beamformer (diffuse noise field) or delay-and-sum beamformer (spatially white noise field) +single-channel WF-based postfilter (spectral subraction)

Spatial separation between speech and noise sources

SNR

0

1 1*

,0 0,1 1 /

Hv v

SDW rH Hv v v s

AP P

Γ A A Γ AW

A Γ A A Γ A

• Binaural cues (ITD-ILD) :

Perfectly preserves binaural cues of speech component

Binaural cues of noise component speech component !!

Problem statement

Binaural processing-MWF-Cue preservation-Physical evaluation-Perceptual eval

Bandwidthreduction

Conclusions

1414

• Partial estimation of noise component

o Estimate of sum of speech component and scaled noise component

o Relationship with SDW-MWF: mix with reference microphone signals

reduction of noise reduction performance

works for multiple noise sources

Partial noise estimation (MWFv)Partial noise estimation (MWFv)

0

1

0

1

0,

2

0, 0

1, 11,

( )r

Hr

Hr

r

X

X

V

VJ E

W YW

W Y

0

1

0

1

2 2

0, 0 0

1,

0,

1 11,

0 1( ) ,r

r

H Hr

H Hr

XJ

VE

X V

W X W VW

W X W V

0

1

0 0, ,0

1 1, ,1

(1 )

(1 )

r SDW

r SDW

Z Y Z

Z Y Z

Problem statement

Binaural processing-MWF-Cue preservation-Physical evaluation-Perceptual eval

Bandwidthreduction

Conclusions

1515

Interaural Wiener filter (MWF-Interaural Wiener filter (MWF-ITF)ITF)

• Extension of SDW-MWF with binaural cueso Add term related to binaural cues of noise (and speech)

component

o Possible cues: ITD, ILD, Interaural Transfer Function (ITF)

( ) = ( ) ( ) ( )x vtot SDW cue cueJ J J J W W W W

0 0

1 1

Hv v

out Hv

ZITF

Z

W V

W V

0 10

1 1 1

*0, 1,0, 0 1

*1, 1 11, 1,

( , )

( , )

r rrv vin

r vr r

E V VV r rITF

V r rE V V

R

Re.g.

0

1

2 2

0, 0 0

11, 1

2 2

0 1 0 1

( ) =H H

r

tot HHr

H x H H v Hin in

XJ E

X

E ITF E ITF

W X W VW

W VW X

W X W X W V W V

ITF preservation speech ITF preservation noise

o Closed form expression!o large changes direction of speech increase weight o Implicit assumption of single noise source

Problem statement

Binaural processing-MWF-Cue preservation-Physical evaluation-Perceptual eval

Bandwidthreduction

Conclusions

1717

Simulation setupSimulation setup

• Identification of HRTFs:o Binaural recordings on CORTEX MK2 artificial head

o 2 omni-directional microphones on each hearing aid (d=1cm)

o LS = -90:15:90, 90:30:270, 1m from head

o Room reverberation: T60=140 ms (and T60=510 ms)

Problem statement

Binaural processing-MWF-Cue preservation-Physical evaluation-Perceptual eval

Bandwidthreduction

Conclusions

1818

Experimental resultsExperimental results

• Simulations:

o SxNy, SNR = 0 dB on left front microphone (broadband)

o fs = 20.48 kHz

• MWF algorithmic parameters:

o batch procedure, perfect VAD

o L=96, =5

o MWFv for different , MWF-ITF for different ,• Physical evaluation:

o Speech = HINT, noise = babble noise

o Speech intelligibility: SNR

o Localisation: ITD / ILD

• Perceptual evaluation:

o Preliminary study with NH subjects

o Speech intelligibility: SRT

o Localisation: localise S and N

Problem statement

Binaural processing-MWF-Cue preservation-Physical evaluation-Perceptual eval

Bandwidthreduction

Conclusions

1919

Physical evaluationPhysical evaluation

• Performance measures:

o Intelligibility weighted SNR improvement (left/right)

o ILD error (speech/noise component) power ratio

x xx out i in i

i

ILD ILD ILD

o ITD error (speech/noise component) phase of cross-correlation

x i x ii

ITD I ITD 1

* *0, 1, 0 10

{ } { }x i r r x xITD E X X E Z Z

L i L ii

SNR I SNR

importance of i-th frequencyfor speech intelligibility

low-pass filter 1500 Hz

Problem statement

Binaural processing-MWF-Cue preservation-Physical evaluation-Perceptual eval

Bandwidthreduction

Conclusions

0 0.2 0.4 0.6 0.8 10

10

20

S

NR

left

[dB

]

0 0.2 0.4 0.6 0.8 10

10

20

S

NR

rig

ht [

dB]

0 0.2 0.4 0.6 0.8 10

2

4

6

IL

D s

peec

h [d

B]

0 0.2 0.4 0.6 0.8 10

2

4

6

IL

D n

oise

[dB

]

0 0.2 0.4 0.6 0.8 10

0.5

1

IT

D s

peec

h [r

ad]

auditec 60deg (=5, L=96, N=4)

0 0.2 0.4 0.6 0.8 10

0.5

1

IT

D n

oise

[ra

d]

Physical evaluation: MWFvPhysical evaluation: MWFvS0N60

2323

• Procedure:o headphone experiments, using measured HRTFs

o Filters are calculated off-line on VU speech-weighted noise as S and multitalker babble noise as N

o All stimuli presented at comfort level, 5 NH subjects (ongoing)

Perceptual evaluationPerceptual evaluation

Headphones

HRTFx

HRTFv

speech

noise

GBinaural filter

Mic L

R

Problem statement

Binaural processing-MWF-Cue preservation-Physical evaluation-Perceptual eval

Bandwidthreduction

Conclusions

• Speech intelligibility:o Adaptive procedure to find 50% Speech Reception Threshold (SRT)

• Localisation:o S and N components (telephone) are sent separately through filter

o Localise S and N in room where HRTFs were measured

o Level roving 6 dB, 3 repetitions per condition for each subject

2424 N270 N315 S0 S45 N60 S90

-90

-75

-60

-45

-30

-15

0

15

30

45

60

75

90

v a mwf02 0dB

Perceptual evaluation: MWFvPerceptual evaluation: MWFv• Algorithms: unprocessed, state-of-the art bilateral, MWF, MWFv

(=0.2)

• Conditions: S0N60, S45N315 and S90N270 (T60=510 ms)

N270 N315 S0 S45 N60 S90

-90

-75

-60

-45

-30

-15

0

15

30

45

60

75

90

v a unproc 0dB

N270 N315 S0 S45 N60 S90

-90

-75

-60

-45

-30

-15

0

15

30

45

60

75

90

v a classic 0dB

N270 N315 S0 S45 N60 S90

-90

-75

-60

-45

-30

-15

0

15

30

45

60

75

90

v a mwf0 0dB

Problem statement

Binaural processing-MWF-Cue preservation-Physical evaluation-Perceptual eval

Bandwidthreduction

Conclusions

2525

Perceptual evaluation: MWFvPerceptual evaluation: MWFv

• With state-of-the-art systems: preservation of binaural cues only within central angle of frontal hemisphere.

• Binaural MWF:

o preserves localization cues for speech source

o preserves localization cues for noise source(s) with small mixing

o Recent SRT experiments (N=2) show no substantial SRT difference between =0 and =0.2

• Ongoing research:

o Perceptual evaluation (SRT and localisation) for MWF-ITF

Problem statement

Binaural processing-MWF-Cue preservation-Physical evaluation-Perceptual eval

Bandwidthreduction

Conclusions

2828

Bandwidth constraintsBandwidth constraints

• Binaural MWF:o 2M microphone signals are transmitted over wireless link

• Reduce bandwidth requirement of wireless link:o Transmit one signal from contralateral ear

Problem statement

Binaural processing

Bandwidthreduction

Conclusions

– Front contralateral microphone signal

– Output of contralateral fixed (e.g. superdirective) beamformer

– Output of MWF using only M contralateral microphone signals

– Iterative distributed binaural MWF scheme

2929

Physical evaluationPhysical evaluation

60 90 120 180 270 300 -60 60 -120 120 120 210 60 120 180 210 60 120 180 270

8

10

12

14

16

18

20

22Performance comparison of MWF-based binaural algorithms

noise source(s) angle (°)

AI

wei

ghte

d S

NR

impr

ovem

ent

(dB

)

MWF-fullMWF-frontMWF-contraMWF-iter

Problem statement

Binaural processing

Bandwidthreduction

Conclusions

Performance of dB-MWF close to full binaural MWF !

3030

Contralateral directivity patternContralateral directivity pattern

T60=140 ms

S0N120

Left HA -50

-45

-40

-35

30

210

60

240

90

270

120

300

150

330

180 0

Left HA - contralateral (N=4,120 deg,=5,SNR=14.6277)

Fullband

SNR=14.6dB

B-MWF

-45

-40

-35

-30

30

210

60

240

90

270

120

300

150

330

180 0

Left HA - front contralateral (N=3)

Fullband

MWF-front

SNR=10.5dB

-55

-50

-45

-40

-35

30

210

60

240

90

270

120

300

150

330

180 0

Left HA - MWF contralateral (N=4,120 deg,=5,SNR=14.2051)

Fullband

MWF-contra

SNR=14.2dB

-50

-45

-40

-35

30

210

60

240

90

270

120

300

150

330

180 0

dB-MWF

SNR=14.2dB

3131

ConclusionsConclusions

• State-of-the art signal processing in (bilateral) HAs: preservation of binaural cues only within central angle of frontal hemisphere

• Binaural MWF:

o Substantial noise reduction (MWF 4 3 > 2)

o Preservation of binaural speech cues

o Distortion of binaural noise cues

o No assumptions about positions and microphones VAD

• Compromise between noise reduction and binaural cue preservation can be achieved with extensions of MWF

o Mixing with microphone signals

o Interaural Transfer Function

• Reduction of bandwidth using distributed MWF

Problem statement

Binaural processing

Bandwidthreduction

Conclusions