Department of Communication Technology

A Comparative Study of Feature-Domain Error Concealment Techniques for Distributed Speech Recognition

30/08/2004

- Robust2004 workshop, Norwich, UK

Zheng-Hua Tan, Børge Lindberg and Paul Dalsgaard

{zt, bli, pd}@kom.aau.dk

Aalborg University, Denmark



Agenda

• Feature-domain EC techniques
  – repetition
  – linear interpolation
  – subvector concealment

• Speech recognition experiments

• Comparative study
  – MFCC features
  – Euclidean and DP distances
  – HMM state durations



Motivation

Why do this work?

• A variety of EC techniques for DSR exist
  – A survey

• Repetition vs. interpolation
  – Which is better?

• What makes an EC technique good for recognition?



EC techniques

Two classes of EC techniques

• Client-based EC
  – e.g. retransmission and forward error control (FEC)

• Server-based EC (the redundancy in the transmitted signal is exploited)
  – in the model domain
    • Weighted Viterbi, missing feature theory
  – in the feature domain
    • Insertion-based techniques: splicing, substitution, repetition
    • Interpolation-based techniques: linear interpolation
    • Subvector concealment
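The two vector-level feature-domain techniques named above can be sketched as follows. This is a minimal illustration, not the implementation used in the study: the function names are hypothetical, repetition is assumed to copy the nearest error-free frame (ties go to the earlier frame), and at least one error-free frame is assumed to exist.

```python
import numpy as np

def conceal_repetition(frames, bad):
    """Copy the nearest error-free frame into each erroneous frame (ties: earlier frame)."""
    frames = np.asarray(frames, dtype=float)
    bad = np.asarray(bad, dtype=bool)
    good_idx = np.flatnonzero(~bad)          # indices of error-free frames
    out = frames.copy()
    for t in np.flatnonzero(bad):
        nearest = good_idx[np.argmin(np.abs(good_idx - t))]
        out[t] = frames[nearest]
    return out

def conceal_interpolation(frames, bad):
    """Linearly interpolate each feature dimension across the erroneous frames."""
    frames = np.asarray(frames, dtype=float)
    bad = np.asarray(bad, dtype=bool)
    t = np.arange(len(frames))
    out = frames.copy()
    for k in range(frames.shape[1]):
        out[bad, k] = np.interp(t[bad], t[~bad], frames[~bad, k])
    return out

# Toy example: one feature, frames 1..4 corrupted between good frames 0 and 5.
x = np.array([[0.0], [9.0], [9.0], [9.0], [9.0], [5.0]])
bad = np.array([False, True, True, True, True, False])
print(conceal_repetition(x, bad).ravel())     # [0. 0. 0. 5. 5. 5.]
print(conceal_interpolation(x, bad).ravel())  # [0. 1. 2. 3. 4. 5.]
```

Repetition keeps the feature values that were actually observed, while interpolation replaces the burst with a straight line between its two error-free neighbours.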



Subvector concealment

• Observation 1: conventional EC schemes share a common characteristic: they conduct EC at the vector level

• Observation 2: within erroneous vectors, a substantial number of subvectors are often error-free

• Hence: subvector-based EC



Subvector concealment (cont.)

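• The ETSI-DSR standard
  – Feature-pair and SVQ: the n’th vector is

$$
V^n = [c_1^n, c_2^n, \ldots, c_{11}^n, c_{12}^n, c_0^n, \log E^n]^T
$$

    where each pair of adjacent features forms a feature-pair, and

$$
V^n = [[S_0^n]^T, \ldots, [S_5^n]^T, [S_6^n]^T]^T
$$

    where each $S_j^n$ is a subvector.

  – Frame-pair: $[V^n, V^{n+1}]$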



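• Buffering matrix (columns: the vectors $V^{A}, V^{A_1}, V^{A_2}, \ldots, V^{A_{2N-1}}, V^{A_{2N}}, V^{B}$; one row per subvector)

$$
\begin{bmatrix}
S_0^{A} & S_0^{A_1} & S_0^{A_2} & \cdots & S_0^{A_{2N-1}} & S_0^{A_{2N}} & S_0^{B} \\
S_1^{A} & S_1^{A_1} & S_1^{A_2} & \cdots & S_1^{A_{2N-1}} & S_1^{A_{2N}} & S_1^{B} \\
S_2^{A} & S_2^{A_1} & S_2^{A_2} & \cdots & S_2^{A_{2N-1}} & S_2^{A_{2N}} & S_2^{B} \\
S_3^{A} & S_3^{A_1} & S_3^{A_2} & \cdots & S_3^{A_{2N-1}} & S_3^{A_{2N}} & S_3^{B} \\
S_4^{A} & S_4^{A_1} & S_4^{A_2} & \cdots & S_4^{A_{2N-1}} & S_4^{A_{2N}} & S_4^{B} \\
S_5^{A} & S_5^{A_1} & S_5^{A_2} & \cdots & S_5^{A_{2N-1}} & S_5^{A_{2N}} & S_5^{B} \\
S_6^{A} & S_6^{A_1} & S_6^{A_2} & \cdots & S_6^{A_{2N-1}} & S_6^{A_{2N}} & S_6^{B}
\end{bmatrix}
$$

• Consistency test: within a frame-pair, subvector $j$ is classified as inconsistent if

$$
\left(d\big(S_j^{n}(0), S_j^{n+1}(0)\big) > T_j(0)\right) \ \text{OR} \ \left(d\big(S_j^{n}(1), S_j^{n+1}(1)\big) > T_j(1)\right)
$$

  – Example for $S_0 = [c_1, c_2]^T$ in the buffered frame-pair $[V^{A_1}, V^{A_2}]$:

$$
\left(d\big(c_1^{A_1}, c_1^{A_2}\big) > T_0(0)\right) \ \text{OR} \ \left(d\big(c_2^{A_1}, c_2^{A_2}\big) > T_0(1)\right)
$$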
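The consistency test can be sketched per subvector as below. This is an illustration under stated assumptions rather than the authors' code: the function name is hypothetical, $d(x, y)$ is taken to be the absolute difference, the threshold values are made up, and only the frames inside the erroneous frame-pairs are covered (the slides also carry columns for $V^A$ and $V^B$).

```python
import numpy as np

def consistency_matrix(S, T):
    """S: (num_frames, 7, 2) decoded subvectors, frames ordered in frame-pairs.
    T: (7, 2) thresholds T_j(0), T_j(1).
    Returns C with shape (7, num_frames): 1 = consistent, 0 = inconsistent."""
    S = np.asarray(S, dtype=float)
    T = np.asarray(T, dtype=float)
    first, second = S[0::2], S[1::2]          # the two frames of each frame-pair
    diff = np.abs(first - second)             # assumed d(x, y) = |x - y|
    inconsistent = (diff > T).any(axis=2)     # OR over the two subvector elements
    c_pairs = (~inconsistent).astype(int)     # shape (num_pairs, 7)
    # Both frames of a pair share the flag of their subvector.
    return np.repeat(c_pairs, 2, axis=0).T

# Two frame-pairs; only subvector 2 of the second pair differs strongly.
S = np.zeros((4, 7, 2))
S[3, 2, 1] = 10.0
T = np.full((7, 2), 3.0)                      # hypothetical thresholds
C = consistency_matrix(S, T)
print(C[2])                                   # [1 1 0 0]
```

Testing each subvector separately is what lets the error-free subvectors of an erroneous vector survive instead of being discarded with the whole vector.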



Consistency matrix and subvector concealment

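Columns: $V^{A}, V^{A_1}, V^{A_2}, \ldots, V^{A_8}, V^{B}$; one row per subvector ($S_0, \ldots, S_6$); 1 for consistent, 0 for inconsistent.

$$
C = \begin{bmatrix}
1 & 1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 & 1 \\
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & 0 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 & 1 & 1 & 1 & 0 & 0 & 1 \\
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 0 & 0 & 1 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & 1
\end{bmatrix}
$$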
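Given a consistency matrix, concealment can be applied only to the subvectors flagged 0. The slides do not spell out the replacement rule, so the sketch below assumes one: each inconsistent subvector is replaced by the nearest consistent subvector in the same row (ties go to the earlier one). The function name is hypothetical.

```python
import numpy as np

def conceal_subvectors(S, C):
    """S: (num_subvectors, num_frames, 2) buffered subvectors.
    C: (num_subvectors, num_frames) consistency flags (1 = consistent)."""
    S = np.asarray(S, dtype=float)
    C = np.asarray(C, dtype=bool)
    out = S.copy()
    for j in range(C.shape[0]):
        good = np.flatnonzero(C[j])                   # consistent columns of row j
        for n in np.flatnonzero(~C[j]):
            nearest = good[np.argmin(np.abs(good - n))]
            out[j, n] = S[j, nearest]                 # assumed rule: copy nearest consistent subvector
    return out

# One row with the pattern 1110000111; each subvector holds its own frame index.
C = np.array([[1, 1, 1, 0, 0, 0, 0, 1, 1, 1]])
S = np.arange(10.0).reshape(1, 10, 1).repeat(2, axis=2)
out = conceal_subvectors(S, C)
print(out[0, :, 0])   # [0. 1. 2. 2. 2. 7. 7. 7. 8. 9.]
```

Rows that are all 1 pass through unchanged, so only the flagged part of each vector is touched.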




Recognition experiments

• Two tasks: Danish digits and city names
• The HTK-based reference recogniser
• Realistic GSM error patterns (EP), by carrier-to-interference (C/I) ratio:
  – EP1: 10 dB
  – EP2: 7 dB
  – EP3: 4 dB



Recognition experiments (cont.)

The %WER for three EC techniques

[Figure: bar charts of %WER for repetition, interpolation and subvector concealment under EP1, EP2 and EP3; (a) Danish digits, (b) city names]



Comparative study - MFCC features

• Random transmission errors with a bit error rate (BER) of 2% are used.

• The original error-free MFCC features are directly compared with the features corrupted with errors but concealed either
  – by repetition
  – by interpolation
  – by subvector concealment
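A random-error channel of this kind can be simulated by flipping each transmitted bit independently with probability equal to the BER. The sketch below assumes that independence model; the slides do not describe how the errors were generated, and the function name is hypothetical.

```python
import numpy as np

def add_bit_errors(bits, ber, rng):
    """Flip each bit independently with probability `ber`.
    Returns the corrupted bits and the boolean flip mask."""
    bits = np.asarray(bits, dtype=np.uint8)
    flips = rng.random(bits.shape) < ber
    return bits ^ flips.astype(np.uint8), flips

rng = np.random.default_rng(0)
bits = np.zeros(100_000, dtype=np.uint8)
corrupted, flips = add_bit_errors(bits, 0.02, rng)
print(flips.mean())   # observed error rate, close to 0.02
```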



Comparative study - MFCC features (cont.)

• MFCC c0

• Two observations




• Interpolation: straight line – constant value segment – zero value segment




• Repetition-generated feature curves display shapes similar to the iMFCC feature, although they are displaced along the time axis.

• However, the dynamic programming (DP) embedded in the Viterbi algorithm makes this displacement relatively irrelevant.



Comparative study - DP distances

– The Euclidean and DP distances between c0 of the error-free MFCC and c0 of the MFCC generated by different EC techniques, for the word “et”

[Figure: bar chart of Euclidean and DP distances for repetition, interpolation and subvector concealment]

– General expectation: interpolation performs better
  • Signal reconstruction vs. speech recognition
  • Euclidean distance vs. DP distance
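The contrast between the two distances can be illustrated on a toy c0 trajectory. The DP distance below is a classic dynamic time warping recursion with absolute-difference local cost and unit steps; this is an assumption, since the slides do not specify the exact DP formulation, and the function names are hypothetical.

```python
import numpy as np

def euclidean_distance(a, b):
    """Frame-by-frame Euclidean distance between two equal-length sequences."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sqrt(np.sum((a - b) ** 2)))

def dp_distance(a, b):
    """Dynamic time warping distance with |x - y| local cost (assumed formulation)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    D = np.full((len(a) + 1, len(b) + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[-1, -1])

ref     = [0, 0, 1, 2, 3, 3, 3]
shifted = [0, 0, 0, 1, 2, 3, 3]   # same shape, delayed by one frame
print(euclidean_distance(ref, shifted))  # about 1.732
print(dp_distance(ref, shifted))         # 0.0
```

A curve that keeps its shape but is displaced in time is penalised by the Euclidean distance and not by the DP distance, which is the property that matters for a Viterbi-based recogniser.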



Comparative study - DP distances (cont.)

Over 328 test utterances

• Number of smaller distances

[Figure: bar chart of the number of smaller distances (Euclidean and DP) for repetition and interpolation]

• Subvector EC always gives the smallest for both distances.



Comparative study - HMM state durations

• Viterbi decoding tracks the HMM state alignment
• The average state durations
• Two facts are observed:
  – repetition vs. interpolation
  – subvector vs. error-free

[Figure: average state duration for repetition, interpolation, subvector concealment and error-free features]
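An average state duration can be read off a Viterbi alignment by measuring how many consecutive frames are assigned to the same state. The sketch below assumes the alignment is available as one state index per frame; the function name and the example path are hypothetical.

```python
from itertools import groupby

def average_state_duration(state_path):
    """Mean number of consecutive frames spent in a state along an alignment."""
    runs = [sum(1 for _ in group) for _, group in groupby(state_path)]
    return sum(runs) / len(runs)

path = [1, 1, 1, 2, 2, 3, 3, 3, 3, 4]   # state index per frame
print(average_state_duration(path))     # 2.5
```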



Summary

• Three different EC techniques compared
  – the simple repetition technique is as good as or even better than linear interpolation
  – subvector concealment performs best



Thanks!
