INTRODUCTION Sibilant speech is aperiodic. the fricatives /s/, / ʃ /, /z/ and / Ʒ / and the affricatives /t ʃ / and /d Ʒ / we present a sibilant

SIBILANT SPEECH DETECTION IN NOISE

BY: HOSEIN BITARAF

SUPERVISOR: DR. NASERSHARIF

INTRODUCTION

Sibilant speech is aperiodic.

The sibilants are the fricatives /s/, /ʃ/, /z/ and /ʒ/ and the affricates /tʃ/ and /dʒ/.

We present a sibilant detection algorithm that is robust to high levels of noise.

Gaussian model for the noisy speech signal

X_{k,i} = power

k = frequency

i = time frame

µ_{k,i} = mean power
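
To make the indexing concrete, the sketch below computes a power value for every frequency bin k and time frame i of a signal. It is only an illustration: the function name `power_spectrogram`, the rectangular framing without a window, and the frame and hop lengths are assumptions, not settings taken from the slides.

```python
import cmath
import math

def power_spectrogram(signal, frame_len, hop):
    """Return X[k][i]: power at frequency bin k and time frame i."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    n_bins = frame_len // 2 + 1
    X = [[0.0] * n_frames for _ in range(n_bins)]
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len]
        for k in range(n_bins):
            # Naive DFT of one frame at bin k (fine for a sketch, slow in practice).
            coeff = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n, x in enumerate(frame))
            X[k][i] = abs(coeff) ** 2
    return X

# A pure tone at bin 2 of an 8-sample frame puts its power in row k = 2.
tone = [math.cos(2 * math.pi * 2 * n / 8) for n in range(16)]
X = power_spectrogram(tone, frame_len=8, hop=8)
```

Here `X[2][0]` holds the tone's power while the other rows stay near zero.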

PSD for /ʃ/

Log-likelihood

µ_{k,N1} = µ_{k,N2} = a_k

µ_{k,S} = a_k + b_k

Maximizing the log-likelihood

74% of sibilants have a duration between 60 and 130 ms.

|t| < 30 ms: high probability of being inside the sibilant.

|t| > 65 ms: high probability of being outside the sibilant.

The contribution of the transition region, 30 ms < |t| < 65 ms, is reduced.
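
The two thresholds are half of the two durations (60 / 2 = 30 ms and 130 / 2 = 65 ms), which suggests that t is measured from the centre of the sibilant. Under that assumption, a minimal sketch of the three regions could look like this; the function name `region` and the string labels are hypothetical.

```python
def region(t_ms):
    """Classify a time offset t (ms), assumed relative to the sibilant centre."""
    if abs(t_ms) < 30:
        return "sibilant"      # high probability of being inside the sibilant
    if abs(t_ms) > 65:
        return "outside"       # high probability of being outside the sibilant
    return "transition"        # 30 ms <= |t| <= 65 ms, contribution reduced

labels = [region(t) for t in (0, -40, 100)]
```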

Estimate noise and sibilant
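
As a rough illustration of what is being estimated, the sketch below reads a_k as the noise mean power (intervals N1 and N2) and b_k as the extra mean power in the sibilant interval S, so that the mean in S is a_k + b_k. It uses plain unweighted sample means, which is a simplification swapped in for illustration and not the maximum-likelihood solution of the slides; the function name `estimate_powers` is hypothetical.

```python
def estimate_powers(noise_frames, sibilant_frames):
    """Naive per-frequency estimates of a_k (noise) and b_k (sibilant).

    Each argument is a list of power values for one frequency k.
    """
    a_k = sum(noise_frames) / len(noise_frames)
    s_mean = sum(sibilant_frames) / len(sibilant_frames)
    # Mean power in S is modelled as a_k + b_k; clip so b_k is never negative.
    b_k = max(s_mean - a_k, 0.0)
    return a_k, b_k

a_k, b_k = estimate_powers([1.0, 3.0], [6.0, 8.0])
```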

Estimated sibilant mean power

Maximum filter

W = 30
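
A maximum filter replaces each sample with the largest value in a window around it. The sketch below shows the operation on a one-dimensional sequence; the axis the filter runs along and the unit of W are not stated on the slide, so the function name `maximum_filter` and the window placement are assumptions.

```python
def maximum_filter(values, w):
    """Replace each sample by the maximum over a window of w samples around it."""
    out = []
    for n in range(len(values)):
        start = n - w // 2
        lo = max(0, start)                 # clip the window at the left edge
        hi = min(len(values), start + w)   # clip the window at the right edge
        out.append(max(values[lo:hi]))
    return out

filtered = maximum_filter([1, 5, 2, 0, 0, 3], 3)
```

With W = 30 the same call would be `maximum_filter(values, 30)`.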

Normalization

To make the estimate independent of the overall speech level
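
The slide does not say which normalization is used. One simple possibility, shown here purely as an assumption, is to divide the estimated power spectrum by its total power so that only its shape remains; the function name `normalize` is hypothetical.

```python
def normalize(spectrum):
    """Scale a power spectrum so that its components sum to 1."""
    total = sum(spectrum)
    if total == 0:
        return list(spectrum)   # all-zero input: nothing to scale
    return [p / total for p in spectrum]

quiet = normalize([1.0, 3.0])
loud = normalize([10.0, 30.0])   # same shape, 10 dB higher level
```

Both calls give the same vector, which is the level independence the slide asks for.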

Gaussian Mixture Model

Each frame is scored by two Gaussian mixture models (GMMs):

one trained on non-sibilant speech and the other on sibilant speech.
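
A minimal sketch of how two such models can be compared for one frame is given below, using the log-likelihood ratio. The diagonal covariances, the zero threshold, the function names and the tiny one-component models are all assumptions made for illustration; the slides do not give the model sizes or parameters.

```python
import math

def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of vector x under a diagonal-covariance GMM."""
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xd, md, vd in zip(x, mu, var):
            log_p += -0.5 * (math.log(2 * math.pi * vd) + (xd - md) ** 2 / vd)
        log_terms.append(log_p)
    # Log-sum-exp over the mixture components for numerical stability.
    top = max(log_terms)
    return top + math.log(sum(math.exp(t - top) for t in log_terms))

def is_sibilant(x, sib_gmm, non_gmm, threshold=0.0):
    """Decide 'sibilant' when the log-likelihood ratio exceeds the threshold."""
    llr = gmm_log_likelihood(x, *sib_gmm) - gmm_log_likelihood(x, *non_gmm)
    return llr > threshold

# Hypothetical one-component, two-dimensional models, for illustration only.
sib_gmm = ([1.0], [[1.0, 1.0]], [[1.0, 1.0]])
non_gmm = ([1.0], [[0.0, 0.0]], [[1.0, 1.0]])
decision = is_sibilant([0.9, 1.1], sib_gmm, non_gmm)
```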

EXPERIMENTS

Filter: 1.5 kHz to 8 kHz.

The weighting function uses three Hamming windows.
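
For reference, a Hamming window of length N is w[n] = 0.54 - 0.46 cos(2πn / (N - 1)). The sketch below builds a weighting from three such windows placed side by side; the window lengths and the idea of one window per interval (noise, sibilant, noise) are assumptions, since the slide gives neither.

```python
import math

def hamming(n):
    """Hamming window of length n (symmetric form)."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

# Hypothetical interval lengths in frames: noise, sibilant, noise.
weights = hamming(5) + hamming(7) + hamming(5)
```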

GMMs

The input for the GMMs was a 14-component vector containing the estimated sibilant power spectrum from 1.5 kHz to 8 kHz every 500 Hz.
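
The component count follows from the spacing: (8000 - 1500) / 500 + 1 = 14. A short check, with the variable name `freqs_hz` chosen only for this example:

```python
# Sample frequencies in Hz: 1.5 kHz to 8 kHz inclusive, every 500 Hz.
freqs_hz = list(range(1500, 8001, 500))
n_components = len(freqs_hz)
```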

Result

White Gaussian noise was added to the speech files.

It is more difficult to detect sibilants in white noise than in other typical stationary noise.
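
One common way to add white Gaussian noise at a chosen SNR is sketched below. The function name `add_white_noise`, the fixed seed and the use of the whole-file average power as the speech level are assumptions; the slides do not describe how the SNR was measured.

```python
import math
import random

def add_white_noise(speech, snr_db, seed=0):
    """Add white Gaussian noise whose variance is set for the requested SNR (dB)."""
    rng = random.Random(seed)
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = speech_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in speech]

speech = [math.sin(0.1 * n) for n in range(20000)]
noisy = add_white_noise(speech, snr_db=0)
```

The achieved SNR matches the target only on average, because the noise samples are random.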

Result

Pmiss = miss probability

Pfa = false alarm probability
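
Both probabilities can be estimated by counting errors against reference labels. The sketch below assumes per-frame boolean labels and decisions, which is an assumption about the scoring unit; the function name `error_rates` and the example data are hypothetical.

```python
def error_rates(labels, decisions):
    """Return (P_miss, P_fa) from reference labels and detector decisions."""
    misses = sum(1 for l, d in zip(labels, decisions) if l and not d)
    false_alarms = sum(1 for l, d in zip(labels, decisions) if not l and d)
    n_sib = sum(1 for l in labels if l)
    n_non = len(labels) - n_sib
    # P_miss: fraction of sibilant frames missed; P_fa: fraction of
    # non-sibilant frames wrongly flagged as sibilant.
    return misses / n_sib, false_alarms / n_non

labels = [True, True, False, False, False, True]
decisions = [True, False, False, True, False, True]
p_miss, p_fa = error_rates(labels, decisions)
```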

CONCLUSIONS

We have presented a sibilant detection algorithm for speech in noise.

It combines a sibilant mean power estimation stage with the likelihood ratio of two GMMs.

Tested on TIMIT.

80% classification accuracy for positive SNRs.

Future Work

It is possible that the classification accuracy could be further improved by applying temporal constraints to the classification decisions.
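
The slides do not specify which temporal constraints are meant. As one simple possibility, offered only as an assumption, a majority vote over neighbouring frames removes isolated decisions and fills short gaps; the function name `smooth_decisions` and the window width are hypothetical.

```python
def smooth_decisions(decisions, width=3):
    """Majority vote over a sliding window of per-frame boolean decisions."""
    half = width // 2
    out = []
    for n in range(len(decisions)):
        window = decisions[max(0, n - half) : n + half + 1]
        # True only when more than half of the frames in the window are True.
        out.append(sum(window) * 2 > len(window))
    return out

raw = [False, True, False, False, True, True, False, True, True]
smoothed = smooth_decisions(raw)
```

In this example the isolated detection at index 1 is dropped and the one-frame gap at index 6 is filled.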

Thank you