
HIWIRE MEETING Torino, March 9-10, 2006 José C. Segura, Javier Ramírez



Schedule

HIWIRE database evaluations
  • New results: HEQ and PEQ

Non-linear feature normalization
  • Using temporal redundancy
  • HEQ integration in Loquendo platform
  • Recursive estimation of the equalization function

New improvements in robust VAD
  • Bispectrum-based VAD
  • SVM-enabled VAD


HIWIRE database evaluations

Results without adaptation (50 test sentences)

MODELS              French   Greek  Italian  Spanish   World  Average
WSJ16k               13.06   23.70    20.41    17.30   12.55    17.60
WSJ16kfon            10.43   19.24    16.52    15.33    8.01    14.12
TIMIT (Loria)         7.30    9.96    11.87     9.27    5.77     8.99
TIMIT (retrained)     8.39    9.76    12.83     8.98    7.72     9.70
TIMIT HEQ            12.76   21.78    16.17    15.33   10.39    15.65
TIMIT PEQ            12.66   17.69    14.39    14.23    9.16    14.01

Results with MLLR adaptation (50 adaptation / 50 test sentences)

MODELS              French   Greek  Italian  Spanish   World  Average
WSJ16k                3.85    4.50     5.94     4.53    3.90     4.55
WSJ16kfon             3.50    3.13     7.00     5.55    3.75     4.45
TIMIT (Loria)         3.13    2.71     3.80     2.99    2.81     3.13
TIMIT (retrained)     3.23    3.26     4.12     3.21    2.96     3.41
TIMIT HEQ             5.09    5.57     5.90     6.13    5.56     5.55
TIMIT PEQ             5.89    4.91     5.90     6.64    5.70     5.72



Temporal redundancy in HEQ

Enhance the normalization by adding a linear transformation that restores temporal correlations

Each equalized cepstral coefficient is time-filtered with an ARMA filter that restores the autocorrelation of clean data
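As a rough illustration of the time-filtering step, the sketch below runs a generic ARMA filter over one equalized cepstral trajectory. The function name `arma_filter` and the coefficients in the usage line are placeholders; the slides do not give the filter design, which would have to be fitted so that the output autocorrelation matches that of clean data.

```python
def arma_filter(x, b, a):
    """Direct-form ARMA filter.

    Implements a[0]*y[t] = sum_k b[k]*x[t-k] - sum_{k>=1} a[k]*y[t-k].
    b holds the moving-average taps, a the autoregressive taps.
    """
    y = []
    for t in range(len(x)):
        # Moving-average part: weighted sum of current and past inputs.
        acc = sum(b[k] * x[t - k] for k in range(len(b)) if t - k >= 0)
        # Autoregressive part: feedback from past outputs.
        acc -= sum(a[k] * y[t - k] for k in range(1, len(a)) if t - k >= 0)
        y.append(acc / a[0])
    return y


# Hypothetical first-order smoother applied to a toy feature trajectory.
smoothed = arma_filter([1.0, 0.0, 0.0], b=[0.5], a=[1.0, -0.5])
```

Each cepstral coefficient would be filtered independently, with its own coefficient set.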

AURORA4

Test set                      1      2      3      4      5      6      7      8      9     10     11     12     13     14    Avg
HIWIRE (baseline)         13.22  24.68  46.00  47.62  52.67  44.80  54.73  22.58  36.21  55.40  58.31  65.34  54.11  62.28  45.57
ECDF (clean ref)          11.82  22.62  37.75  38.90  36.91  37.46  40.92  21.29  32.67  45.93  49.28  50.61  44.60  49.65  37.17
ECDF (clean ref) + TES    11.42  21.40  35.25  37.53  34.59  36.17  38.56  20.15  28.80  43.87  47.66  49.83  44.75  46.77  35.48

AURORA2 (clean test)

                         Test A  Test B  Test C     Avg
HIWIRE (baseline)         36.00   30.90   35.27   33.81
ECDF (clean ref)          17.06   17.30   18.97   17.54
ECDF (clean ref) + TES    16.24   14.21   16.35   15.45


HEQ integration in Loquendo platform

SEGMENTAL

SYSTEM               SAC     WC     WI     WD     WS
Baseline (LASR)    45.70  75.10   0.60  16.60   7.60
denoise-MeanDev    46.60  77.50   4.80   7.20  10.40
denoise-HEQ121     38.20  69.60   4.30  12.60  13.50
denoise-HEQ1001    46.50  77.70   4.00   7.30  11.00

Currently implemented

HIGH MISMATCH

                                                      WAC                    SAC
SYSTEM                  ESTIMATION              HM     MM     WM      HM     MM     WM
HIWIRE                  -                    48.61  85.3   94.49   42.49  61.98  80.9
HEQ_GAUS                SENTENCE-BY-SENTENCE 73.36  80.1   82.72   41.37  46.17  48.97
HEQ(Q31)                SENTENCE-BY-SENTENCE 74.88  75.67  85.86   41.21  34.81  56.23
HEQ(Q31) IIR SORT 0.8   RECURSIVE            81.34  88.65  95.46   53.67  64.69  83.12

New proposal


HEQ integration (recursive estimation) (1)

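Current approach: Gaussian HEQ using ECDF

$$Y = \{y_1, y_2, \ldots, y_T\} \;\rightarrow\; y_{(1)} \le y_{(2)} \le \cdots \le y_{(r)} \le \cdots \le y_{(T)}$$

$$\hat{C}_Y(y_{(r)}) = \frac{r - 0.5}{T}, \qquad r = 1, \ldots, T$$

$$\hat{x}_t = C_X^{-1}\big(\hat{C}_Y(y_t)\big) = C_X^{-1}\!\left(\frac{r(t) - 0.5}{T}\right), \qquad t = 1, \ldots, T$$

where $r(t)$ is the rank of $y_t$ in the sorted sequence and $C_X$ is the reference (Gaussian) CDF.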
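The order-statistics form of Gaussian HEQ can be sketched in a few lines. This is a minimal illustration, assuming a zero-mean, unit-variance Gaussian reference and an ECDF value of (r - 0.5)/T for the sample of rank r; the function name `gaussian_heq` is hypothetical.

```python
from statistics import NormalDist


def gaussian_heq(y):
    """Equalize one feature trajectory to a standard Gaussian via its ECDF.

    Each sample is replaced by the Gaussian quantile of (r - 0.5) / T,
    where r is the rank of the sample among the T samples of the utterance.
    """
    T = len(y)
    # Indices of the samples in ascending order of value.
    order = sorted(range(T), key=lambda t: y[t])
    ref = NormalDist()  # assumed reference: N(0, 1)
    x_hat = [0.0] * T
    for r, t in enumerate(order, start=1):
        x_hat[t] = ref.inv_cdf((r - 0.5) / T)
    return x_hat


equalized = gaussian_heq([3.0, 1.0, 2.0])
```

The transformation is monotonic, so the ordering of the samples is preserved; only their distribution changes.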

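Using quantiles

$$X = \{x_1, x_2, \ldots, x_T\} \;\xrightarrow{\text{SORT}}\; \{x_{(1)}, x_{(2)}, \ldots, x_{(T)}\}$$

$$Q^X = \{Q_1^X, \ldots, Q_k^X, \ldots, Q_K^X\}$$

$$Q_k^X = (1 - f)\,x_{(r)} + f\,x_{(r+1)}, \qquad k = 1, \ldots, K$$

where $r$ is the integer part (floor) and $f$ the fractional part of the position of the $k$-th cumulative probability $C_k$ in the sorted sequence of $T$ samples.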
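A quantile estimate of this kind (sort, then interpolate linearly between adjacent order statistics) might look as follows. The placement of the K probabilities is an assumption made for this sketch: they are spread evenly so that the first quantile is the minimum and the last is the maximum. The slides may use a different placement.

```python
import math


def quantiles(samples, K):
    """Return K quantiles (K >= 2) of the samples by linear interpolation.

    Assumed placement: quantile k sits at position k * (T - 1) / (K - 1)
    in the sorted sequence, so index 0 is the minimum and K - 1 the maximum.
    """
    xs = sorted(samples)
    T = len(xs)
    qs = []
    for k in range(K):
        pos = k * (T - 1) / (K - 1)
        r = math.floor(pos)           # lower order statistic
        f = pos - r                   # interpolation fraction
        upper = xs[min(r + 1, T - 1)]
        qs.append((1 - f) * xs[r] + f * upper)
    return qs


q = quantiles([4.0, 1.0, 3.0, 2.0, 5.0], K=3)
```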


HEQ integration (recursive estimation) (2)

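Equalization by linear interpolation

Averaged over training data:

$$Q^X = \{Q_1^X, \ldots, Q_k^X, \ldots, Q_K^X\}$$

From actual utterance:

$$Q^Y = \{Q_1^Y, \ldots, Q_k^Y, \ldots, Q_K^Y\}$$

Mapping corresponding quantiles:

$$C_X(Q_k^X) = C_k = C_Y(Q_k^Y)$$

[Figure: piecewise-linear mapping of an input value y between the utterance quantiles Q_k^Y and the quantiles Q_k^X]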
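The mapping between corresponding quantiles can be sketched as a piecewise-linear function. The name `equalize` and the clamping of values outside the quantile range are assumptions of this sketch, not details taken from the slides.

```python
def equalize(y, q_y, q_x):
    """Map y so that each utterance quantile q_y[k] lands on q_x[k].

    Values between two quantiles are interpolated linearly; values outside
    the range of q_y are clamped to the end points (an assumption).
    """
    if y <= q_y[0]:
        return q_x[0]
    if y >= q_y[-1]:
        return q_x[-1]
    for k in range(len(q_y) - 1):
        lo, hi = q_y[k], q_y[k + 1]
        if lo <= y <= hi:
            if hi == lo:              # degenerate interval: no slope defined
                return q_x[k]
            w = (y - lo) / (hi - lo)
            return (1 - w) * q_x[k] + w * q_x[k + 1]


value = equalize(0.5, q_y=[0.0, 1.0, 2.0], q_x=[10.0, 20.0, 40.0])
```

Because both quantile lists are sorted, the mapping is monotonic, like the ECDF-based transformation it approximates.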


HEQ integration (recursive estimation) (3)

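Independent for each feature

Equalization: linear interpolation between quantiles (31 quantiles), $\hat{Q} \rightarrow Q_R$

$$\hat{Q} = Q_R \qquad \text{(initially)}$$

$$\hat{Q} \leftarrow \alpha\,\hat{Q} + (1 - \alpha)\,q \qquad \text{(each utterance)}$$

where $q$ are the actual utterance quantiles, $\hat{Q}$ the estimated quantiles, and $Q_R$ the reference quantiles.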

HEQ(31) IIR SORT (BEFORE)

Alpha     HM     MM     WM
0.00   74.88  75.67  85.86
0.20   77.45  85.34  87.67
0.40   76.98  86.10  88.96
0.60   79.92  88.45  91.63
0.80   81.34  88.65  95.46
0.85   80.97  89.77  95.66
0.90   79.16  89.13  95.87
0.95   76.51  88.89  95.02
1.00   46.75  87.14  94.40

[Figure: AURORA3 Italian results (before). WAC (%) versus Alpha for HM, MM and WM]


HEQ integration (recursive estimation) (4)

HEQ(31) IIR SORT (AFTER)

Alpha     HM     MM     WM
0.00   77.51  79.15  90.06
0.20   77.93  81.42  90.78
0.40   79.21  85.78  91.60
0.60   77.38  89.01  89.98
0.80   79.87  89.21  95.26
0.85   78.98  88.41  95.63
0.90   78.06  88.45  95.52
0.95   76.25  88.73  95.10
1.00   46.75  87.14  94.40

[Figure: AURORA3 Italian results (after). WAC (%) versus Alpha for HM, MM and WM]

Utterances are equalized WITHOUT delay

Quantiles are updated AFTER the equalization
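The "equalize first, update afterwards" ordering can be sketched as below. The sketch assumes the recursion has the form "estimate = alpha * estimate + (1 - alpha) * utterance quantiles", starting from the reference quantiles; the function names are hypothetical, and the equalization itself is left out so that only the bookkeeping is shown.

```python
def update_quantiles(q_est, q_utt, alpha):
    """One recursion step per quantile: Q <- alpha * Q + (1 - alpha) * q."""
    return [alpha * Q + (1 - alpha) * q for Q, q in zip(q_est, q_utt)]


def estimates_used(q_ref, utterance_quantiles, alpha):
    """Return the quantile estimate available when each utterance is equalized.

    The estimate starts at the reference quantiles. Each utterance is
    equalized with the current estimate (no delay), and only then are its
    own quantiles folded into the estimate.
    """
    q_est = list(q_ref)
    used = []
    for q_utt in utterance_quantiles:
        used.append(list(q_est))                       # equalize with this
        q_est = update_quantiles(q_est, q_utt, alpha)  # then update
    return used


history = estimates_used([0.0, 1.0], [[2.0, 4.0], [6.0, 8.0]], alpha=0.5)
```

With alpha = 1 the estimate never moves away from the reference quantiles, and with alpha = 0 each utterance is equalized with the quantiles of the previous one.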



Bispectrum-based VAD (1)

Motivations:
  • Ability of HOS methods to detect signals in noise
  • Knowledge of the input processes (Gaussian)

Issues to be addressed:
  • Computationally expensive
  • Variance of bispectrum estimators much higher than that of power spectral estimators (identical data record size)

Solution: Integrated bispectrum
  • J. K. Tugnait, “Detection of non-Gaussian signals using integrated polyspectrum,” IEEE Trans. on Signal Processing, vol. 42, no. 11, pp. 3137–3149, 1994.


Bispectrum-based VAD (2)

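Definitions: Let x(t) be a discrete-time signal

Third order cumulants:

$$C_{3x}(i, k) = E\{x(t)\,x(t+i)\,x(t+k)\}$$

Bispectrum:

$$B_x(\omega_1, \omega_2) = \sum_{i} \sum_{k} C_{3x}(i, k)\,\exp\{-j(\omega_1 i + \omega_2 k)\}$$

Inverse transform:

$$C_{3x}(i, k) = \frac{1}{(2\pi)^2} \int\!\!\int B_x(\omega_1, \omega_2)\,\exp\{j(\omega_1 i + \omega_2 k)\}\,d\omega_1\,d\omega_2$$

so that $C_{3x}(i, k) \leftrightarrow B_x(\omega_1, \omega_2)$ form a transform pair.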
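A direct sample estimate of the third-order cumulant could be written as follows. This is a sketch under the assumption that the mean is removed first, so that the third-order moment and cumulant coincide; the function name is hypothetical.

```python
def third_order_cumulant(x, i, k):
    """Sample estimate of C3x(i, k) = E{x(t) x(t+i) x(t+k)}.

    The mean is removed first; products whose indices fall outside the
    record are skipped, and the sum is normalized by the record length.
    """
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]
    lo = max(0, -i, -k)
    hi = min(n, n - i, n - k)
    return sum(z[t] * z[t + i] * z[t + k] for t in range(lo, hi)) / n


c00 = third_order_cumulant([2.0, -1.0, -1.0], 0, 0)
```

A symmetric sequence gives a zero value at lag (0, 0), while a skewed one does not, which is the property the bispectrum exploits.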


Bispectrum-based VAD (3)

[Figure with two panels: "Noise only" and "Noise + speech"]


Bispectrum-based VAD (4)

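Integrated bispectrum (IBI):

$$y(t) = x^2(t)$$

Cross-spectrum $S_{yx}(\omega)$:

$$S_{yx}(\omega) = \sum_{k} E\{y(t)\,x(t+k)\}\,\exp\{-j\omega k\} = \sum_{k} C_{3x}(0, k)\,\exp\{-j\omega k\}$$

$$C_{3x}(0, k) = r_{yx}(k) = \frac{1}{2\pi} \int S_{yx}(\omega)\,\exp\{j\omega k\}\,d\omega$$

Bispectrum inverse transform ($i = 0$):

$$C_{3x}(0, k) = \frac{1}{(2\pi)^2} \int\!\!\int B_x(\omega_1, \omega_2)\,\exp\{j\omega_2 k\}\,d\omega_1\,d\omega_2$$

Bispectrum – cross spectrum:

$$S_{yx}(\omega) = \frac{1}{2\pi} \int B_x(\omega_1, \omega)\,d\omega_1 = \frac{1}{2\pi} \int B_x(\omega, \omega_2)\,d\omega_2$$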


Bispectrum-based VAD (5)

Integrated bispectrum (IBI):
  • Defined as a cross spectrum between the signal and its square; it is therefore a function of a single frequency variable

Benefits:
  • Less computational cost: computed as a cross spectrum
  • Variance of the same order as that of the power spectrum estimator

Properties:
  • For Gaussian processes the bispectrum is zero, and the integrated bispectrum is zero as well
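Since the integrated bispectrum is a cross spectrum between the signal and its square, it can be estimated with ordinary DFTs. The sketch below averages cross-periodograms over non-overlapping blocks; the per-block mean removal, the 1/N normalization and the plain (slow) DFT are choices made for this illustration only.

```python
import cmath


def dft(x):
    """Plain O(N^2) discrete Fourier transform, enough for a small example."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(n)]


def integrated_bispectrum(x, block_len):
    """Block-averaged cross-spectrum between y(t) = x(t)^2 and x(t)."""
    n_blocks = len(x) // block_len
    acc = [0j] * block_len
    for b in range(n_blocks):
        seg = x[b * block_len:(b + 1) * block_len]
        mean = sum(seg) / block_len
        seg = [v - mean for v in seg]          # remove the block mean
        X = dft(seg)
        Y = dft([v * v for v in seg])
        for f in range(block_len):
            # Cross-periodogram matching sum_k r_yx(k) exp(-j w k).
            acc[f] += Y[f].conjugate() * X[f] / block_len
    return [a / n_blocks for a in acc]


ibi = integrated_bispectrum([2.0, -1.0, -1.0, 0.0], block_len=4)
```

For a sample sequence that is symmetric about its mean, the estimate is numerically zero at every frequency, in line with the Gaussian property listed above.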


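Bispectrum-based VAD (6)

Two alternatives explored for formulating the decision rule:

Estimation by block averaging (BA):

[Figure: block averaging. Labels: l-th frame, K_B blocks of N_B samples (K_B N_B samples in total), frame shift, VAD decision]

$$L_l(\hat{\mathbf{y}}_l) = \frac{p_{\hat{\mathbf{y}}_l | H_1}(\hat{\mathbf{y}}_l \mid H_1)}{p_{\hat{\mathbf{y}}_l | H_0}(\hat{\mathbf{y}}_l \mid H_0)} \;\underset{H_0}{\overset{H_1}{\gtrless}}\; \frac{P(H_0)}{P(H_1)}$$

MO-LRT: Given a set of N = 2m+1 consecutive observations:

$$L_{l,N}(\hat{\mathbf{y}}_{l-m}, \ldots, \hat{\mathbf{y}}_l, \ldots, \hat{\mathbf{y}}_{l+m}) = \prod_{k=l-m}^{l+m} \frac{p_{\hat{\mathbf{y}}_k | H_1}(\hat{\mathbf{y}}_k \mid H_1)}{p_{\hat{\mathbf{y}}_k | H_0}(\hat{\mathbf{y}}_k \mid H_0)}$$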
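The multiple-observation rule combines the likelihood ratios of 2m+1 neighbouring frames before thresholding. The sketch below works in the log domain, where the combination becomes a sum, and takes the per-frame log-likelihood ratios as given, since the statistical model is not reproduced here. Truncating the window at the ends of the signal is an assumption of the sketch.

```python
def mo_lrt_decisions(log_lr, m, log_threshold):
    """Per-frame speech decisions from a (2m+1)-frame window of log-LRs.

    log_lr[l] is the log-likelihood ratio of frame l; the decision for
    frame l compares the sum over frames l-m .. l+m with the threshold.
    """
    n = len(log_lr)
    decisions = []
    for l in range(n):
        lo, hi = max(0, l - m), min(n, l + m + 1)  # truncated at the edges
        score = sum(log_lr[lo:hi])
        decisions.append(score > log_threshold)
    return decisions


flags = mo_lrt_decisions([-1.0, -1.0, 5.0, -1.0, -1.0], m=1, log_threshold=0.0)
```

Using future frames means the decision for frame l is only available m frames later, which is the price paid for the smoother decision.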


Bispectrum-based VAD (7)

LRT evaluation:
  • IBI Gaussian model
  • Variances defined in terms of:
      S_ss (clean speech power spectrum)
      S_nn (noise power spectrum)


Bispectrum-based VAD (8)

Denoising:

[Block diagram: smoothed spectral subtraction, 1st WF design, 1st WF stage, 2nd WF design, 2nd WF stage, 1-frame delay. Signals: S_xx(ω), S_nn(ω), S_1(ω), S_2(ω), S_ss(ω)]


Bispectrum VAD Analysis (1)

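MO-LRT VAD

$$L_{l,N}(\hat{\mathbf{y}}_{l-m}, \ldots, \hat{\mathbf{y}}_l, \ldots, \hat{\mathbf{y}}_{l+m}) = \prod_{k=l-m}^{l+m} \frac{p_{\hat{\mathbf{y}}_k | H_1}(\hat{\mathbf{y}}_k \mid H_1)}{p_{\hat{\mathbf{y}}_k | H_0}(\hat{\mathbf{y}}_k \mid H_0)}$$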


Bispectrum-based VAD results (2)

[Figure: PAUSE HIT RATE (HR0) versus FALSE ALARM RATE (FAR0). Methods compared: G.729, AMR1, AMR2, AFE (Noise Est.), AFE (frame-dropping), Li, Marzinzik, Sohn, Woo, BA-IBI (KB = 1, NB = 256), BA-IBI (KB = 3, NB = 256), BA-IBI (KB = 5, NB = 256)]
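The two axes of such a plot can be computed from frame-level labels. The definitions used here are an assumption, since the slides do not spell them out: HR0 is taken as the percentage of non-speech frames correctly detected, and FAR0 as the percentage of speech frames wrongly labelled as non-speech.

```python
def pause_hit_and_false_alarm(reference, decisions):
    """Return (HR0, FAR0) in percent from frame-level labels.

    reference and decisions are sequences of booleans, True = speech.
    Assumed definitions: HR0 = correctly detected pauses / reference pauses,
    FAR0 = speech frames labelled as pause / reference speech frames.
    """
    n_pause = sum(1 for r in reference if not r)
    n_speech = sum(1 for r in reference if r)
    hits = sum(1 for r, d in zip(reference, decisions) if not r and not d)
    false_alarms = sum(1 for r, d in zip(reference, decisions) if r and not d)
    return 100.0 * hits / n_pause, 100.0 * false_alarms / n_speech


ref = [False, False, False, False, True, True, True, True]
dec = [False, False, False, True, True, True, False, False]
hr0, far0 = pause_hit_and_false_alarm(ref, dec)
```

Sweeping the decision threshold of a detector and plotting the resulting pairs traces one curve of the figure.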


Bispectrum-based VAD results (3)

[Figure: PAUSE HIT RATE (HR0) versus FALSE ALARM RATE (FAR0). Methods compared: G.729, AMR1, AMR2, AFE (Noise Est.), AFE (frame-dropping), Li, Marzinzik, Sohn, Woo, MO-LRT IBI (KB = 1, NB = 256, m = 2), MO-LRT IBI (KB = 1, NB = 256, m = 5), MO-LRT IBI (KB = 1, NB = 256, m = 7)]


Bispectrum-based VAD results (4)

WF: Wiener filtering
FD: Frame-dropping


SVM-enabled VAD (1)

Motivation: Ability of SVMs for learning from experimental data

SVMs enable defining a function:

$$f: \mathbb{R}^N \rightarrow \{-1, +1\}$$

using training data:

$$(\mathbf{x}_1, y_1), (\mathbf{x}_2, y_2), \ldots, (\mathbf{x}_\ell, y_\ell) \in \mathbb{R}^N \times \{-1, +1\}$$

Classify unseen examples (x, y): $f(\mathbf{x}) = +1$ for one class, $-1$ otherwise

Statistical learning theory restricts the class of functions the learning machine can implement.


SVM-enabled VAD (2)

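Hyperplane classifiers:

$$f(\mathbf{x}) = \operatorname{sign}(\mathbf{w} \cdot \mathbf{x} + b)$$

Training: w and b define maximal margin hyperplane

Kernels:

$$f(\mathbf{x}) = \operatorname{sign}\left(\sum_{i=1}^{\ell} \alpha_i\,k(\mathbf{x}_i, \mathbf{x}) + b\right), \qquad k(\mathbf{x}, \mathbf{y}) = \Phi(\mathbf{x}) \cdot \Phi(\mathbf{y})$$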
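A kernel decision function of this form is straightforward to evaluate once the support vectors, their weights and the bias are known. The sketch below uses an RBF kernel purely as an example; the kernel choice, the `gamma` value and the toy support vectors are assumptions, not values from the slides.

```python
import math


def rbf_kernel(u, v, gamma=1.0):
    """Gaussian RBF kernel exp(-gamma * ||u - v||^2), chosen as an example."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def svm_decision(x, support_vectors, weights, b, kernel=rbf_kernel):
    """Evaluate sign(sum_i weight_i * k(x_i, x) + b), returning +1 or -1."""
    g = sum(w * kernel(sv, x) for sv, w in zip(support_vectors, weights)) + b
    return 1 if g >= 0 else -1


# Toy model: one support vector per class, signed weights, zero bias.
svs = [[0.0], [2.0]]
weights = [1.0, -1.0]
label = svm_decision([0.1], svs, weights, b=0.0)
```

Training, which determines the support vectors, weights and bias, is not shown; a library solver would normally provide them.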


SVM-enabled VAD (3)


SVM-enabled VAD (4)

Feature extraction:

Training:


SVM-enabled VAD (5)

Feature extraction:

Decision function (2-band features):

$$g(\mathbf{x}) = \sum_{i=1}^{\ell} \alpha_i\,k(\mathbf{x}_i, \mathbf{x}) + b$$

$$f(\mathbf{x}) = \operatorname{sign}(g(\mathbf{x}))$$


SVM-enabled VAD (6)

Analysis:
  • 4 subbands
  • Noise reduction

Improvements:
  • Contextual speech features
  • Better results without noise reduction


Dissemination (VAD)

Integrated bispectrum:
  • J.M. Górriz, J. Ramírez, C.G. Puntonet, J.C. Segura, “Generalized-LRT based voice activity detector”, IEEE Signal Processing Letters, 2006.
  • J. Ramírez, J.M. Górriz, J.C. Segura, C.G. Puntonet, A. Rubio, “Speech/Non-speech Discrimination based on Contextual Information Integrated Bispectrum LRT”, IEEE Signal Processing Letters, 2006.
  • J.M. Górriz, J. Ramírez, J.C. Segura, C.G. Puntonet, L. García, “Effective Speech/Pause Discrimination Using an Integrated Bispectrum Likelihood Ratio Test”, ICASSP 2006.

SVM VAD:
  • J. Ramírez, P. Yélamos, J.M. Górriz, J.C. Segura, “SVM-based Speech Endpoint Detection Using Contextual Speech Features”, IEE Electronics Letters, 2006.
  • J. Ramírez, P. Yélamos, J.M. Górriz, C.G. Puntonet, J.C. Segura, “SVM-enabled Voice Activity Detection”, ISNN 2006.
  • P. Yélamos, J. Ramírez, J.M. Górriz, C.G. Puntonet, J.C. Segura, “Speech Event Detection Using Support Vector Machines”, ICCS 2006.


HIWIRE MEETING Athens, November 3-4, 2005

José C. Segura, Javier Ramírez