Covariation and weighting of harmonically decomposed streams for ASR

Introduction

Pitch-scaled harmonic filter

Recognition experiments

Results

Conclusion

Motivation and aims

• Most speech sounds are either voiced or unvoiced, and the two have very different properties:

– voiced: quasi-periodic signal from phonation

– unvoiced: aperiodic signal from turbulence noise

• Do these properties allow humans to recognize speech in noise?

Maybe we can use this information to help ASR by computing separate features for the two parts.

• Are the two contributions complementary?

http://www.ee.surrey.ac.uk/Personal/P.Jackson/Columbo/ INTRODUCTION

Voiced and unvoiced parts of a speech signal

[Figure: production of /z/, showing the speech waveform together with its aperiodic and periodic waveforms]

Pitch-scaled harmonic filter

[Figure: PSHF block diagram, with stages labelled pitch extraction, pitch optimisation (optimised pitch), time shifting, PSHF ... PSHF, and re-splicing]
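The core idea behind harmonic filtering can be sketched as follows. This is a simplified illustration, not the PSHF implementation used in the project: it assumes the pitch period is already known and is an integer number of samples, uses a rectangular window spanning exactly four pitch periods, and leaves out the pitch optimisation, time shifting and re-splicing stages. The function names `dft`, `idft` and `harmonic_split`, the choice of four periods per frame, and the toy signal are all assumptions made for the example.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (fine for short frames)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def harmonic_split(frame, periods_per_frame=4):
    """Split a frame that spans `periods_per_frame` whole pitch periods.

    With such a frame, the harmonics of the pitch fall exactly on every
    `periods_per_frame`-th DFT bin; those bins form the periodic estimate
    and the remainder forms the aperiodic estimate.
    """
    spectrum = dft(frame)
    harmonic_bins = [
        value if k % periods_per_frame == 0 else 0j
        for k, value in enumerate(spectrum)
    ]
    periodic = idft(harmonic_bins)
    aperiodic = [s - p for s, p in zip(frame, periodic)]
    return periodic, aperiodic


# Toy frame (assumed values): two harmonics of a 20-sample pitch period
# plus one off-harmonic component standing in for noise.
period = 20
n = 4 * period
voiced = [
    math.sin(2 * math.pi * t / period) + 0.5 * math.sin(2 * math.pi * 3 * t / period)
    for t in range(n)
]
noise = [0.3 * math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
frame = [v + w for v, w in zip(voiced, noise)]

periodic, aperiodic = harmonic_split(frame)
```

In this idealised case the periodic output recovers the two harmonics and the aperiodic output recovers the off-harmonic component. Real speech has a varying, non-integer pitch period, which is why the pitch has to be estimated and optimised before filtering.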

Decomposition example (waveforms)

Decomposition example (spectrograms)

Decomposition example (MFCC specs.)

Speech database: Aurora 2.0

• Derived from the TIdigits database of connected English digit strings (male and female speakers), filtered with G.712 at 8 kHz.

Data type                    Signal-to-Noise Ratio (dB)
clean-condition
multi-condition              20  15  10  5
set A (same noises)          20  15  10  5  0  -5
set B (different noises)     20  15  10  5  0  -5
set C (different channel)    20  15  10  5  0  -5

Description of the experiments

• Baseline experiment: [base]

– standard parameterisation of the original waveforms (i.e., MFCC, +Δ, +ΔΔ)

• PCA experiments: [pca26, pca78, pca13 and pca39]

– decorrelation of the feature vectors, and reduction of the number of coefficients

• Split experiments: [split, split1]

– adjustment of stream weights (periodic vs. aperiodic)

Caveat: pitch values were derived from clean speech files, for the entire database!
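For reference, the +Δ and +ΔΔ terms listed for the baseline are dynamic features appended to the static MFCCs. A common way to compute them is a linear regression over neighbouring frames. The sketch below assumes that formula, a window of two frames on either side, and edge frames repeated at the boundaries; none of these settings is stated in the slides, and the function name `deltas` is hypothetical.

```python
def deltas(frames, window=2):
    """Regression-based delta coefficients for a list of feature vectors.

    d_t = sum_{n=1..window} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n} n^2)
    Frames beyond either end are replaced by the first or last frame.
    """
    denom = 2 * sum(n * n for n in range(1, window + 1))
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        vec = []
        for i in range(len(frames[t])):
            acc = 0.0
            for n in range(1, window + 1):
                ahead = frames[min(t + n, last)][i]
                behind = frames[max(t - n, 0)][i]
                acc += n * (ahead - behind)
            vec.append(acc / denom)
        out.append(vec)
    return out


# Toy static features (assumed data): one coefficient rising by 1.0 per frame.
static = [[float(t)] for t in range(6)]
delta = deltas(static)    # first-order dynamics (+Δ)
delta2 = deltas(delta)    # second-order dynamics (+ΔΔ)

# Concatenate static, Δ and ΔΔ into one vector per frame.
features = [s + d + dd for s, d, dd in zip(static, delta, delta2)]
```

With 13 static MFCCs per frame, the same concatenation would give 39 coefficients per frame.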

Parameterisations

[Figure: block diagrams of each parameterisation, with blocks labelled as follows]

BASE: waveform, MFCC, +Δ, +Δ2, features

PCA26, PCA78, PCA13, PCA39: PSHF, MFCC, +Δ, +Δ2, cat, PCA

SPLIT, SPLIT1: PSHF, MFCC, +Δ, +Δ2, cat

Word Error Rate (%)
          clean   multi   overall
base      47.4    21.7    34.6
pca26     33.8    11.4    22.6
pca78     42.7    12.8    27.7
pca13     28.3    13.0    20.7
pca39     30.3    14.5    22.4

Full-sized PCA results

Variance of Principal Components

[Figure: panels PCA26 and PCA39; legend: clean + multi]
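As a toy illustration of what PCA does to a pair of correlated coefficients, the sketch below rotates two-dimensional data onto its principal axes and computes the variance along each. It is not the project's PCA pipeline: the data are made up, only two dimensions are used so that the eigen-decomposition has a closed form, and the names `covariance` and `pca_2d` are hypothetical.

```python
import math


def covariance(xs, ys):
    """Population covariance of two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def pca_2d(xs, ys):
    """Rotate centred 2-D data onto its principal axes.

    Returns the projected coordinates (u, v); u lies along the direction
    of largest variance, and u and v are uncorrelated.
    """
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    a = covariance(xs, xs)
    b = covariance(xs, ys)
    c = covariance(ys, ys)
    # Angle of the leading eigenvector of [[a, b], [b, c]].
    theta = 0.5 * math.atan2(2 * b, a - c)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    u = [cos_t * (x - mx) + sin_t * (y - my) for x, y in zip(xs, ys)]
    v = [-sin_t * (x - mx) + cos_t * (y - my) for x, y in zip(xs, ys)]
    return u, v


# Made-up, strongly correlated pair of coefficients.
periodic_coef = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
aperiodic_coef = [1.2, 1.9, 3.2, 3.8, 5.1, 6.1]

u, v = pca_2d(periodic_coef, aperiodic_coef)
var_u = covariance(u, u)   # variance of the first principal component
var_v = covariance(v, v)   # variance of the second principal component
cross = covariance(u, v)   # close to zero after decorrelation
```

Dropping the low-variance component `v` is the two-dimensional analogue of reducing the number of coefficients.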

PCA26 experiment’s results

[Figure: panels CLEAN and MULTI]

Summary of best PCA results

Word Error Rate (%)
          clean   multi   overall
pca26     29.0    11.4    20.2
pca78     38.3    12.1    25.2
pca13     27.6    12.6    20.1
pca39     29.3    12.5    20.9

Split experiment’s results

Sample Split results

Word Error Rate (%)
              clean   multi   overall
split (=0)    62.9    44.3    53.6
split (=1)    28.5    11.7    20.1
split (=2)    22.7    11.5    17.1

Note: same value of stream weights used in training as in testing, for Split.
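One common way to apply stream weights, used for example in HTK-style multi-stream HMMs, is to raise each stream's output likelihood to the power of its weight, which becomes a weighted sum in the log domain. The sketch below assumes that convention together with diagonal-covariance Gaussians. The slides do not state how the weights were parameterised, and all names and numbers here are hypothetical.

```python
import math


def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def weighted_log_likelihood(streams, weights):
    """Combine per-stream log likelihoods with exponent weights.

    `streams` is a list of (observation, mean, variance) triples, one per
    stream; `weights` holds the matching stream weights.
    """
    return sum(
        w * log_gaussian(obs, mean, var)
        for (obs, mean, var), w in zip(streams, weights)
    )


# Hypothetical two-stream observation: (observation, mean, variance).
periodic = ([0.5, -0.2], [0.0, 0.0], [1.0, 1.0])
aperiodic = ([1.5, 0.3], [0.0, 0.0], [1.0, 1.0])

equal = weighted_log_likelihood([periodic, aperiodic], [1.0, 1.0])
periodic_only = weighted_log_likelihood([periodic, aperiodic], [1.0, 0.0])
```

Setting a stream's weight to zero removes its influence on the score, while a weight above one makes the recogniser rely on that stream more heavily.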

Split1 experiment’s results

Summary of PCA & Split results

          Word Error Rate (%)       WER reduction (%)
          clean   multi   overall   abs.    rel.
base      47.4    21.7    34.6       0.0     0.0
pca26     29.0    11.4    20.2      14.4    41.6
pca78     38.3    12.1    25.2       9.4    27.2
pca13     27.6    12.6    20.1      14.5    41.9
pca39     29.3    12.5    20.9      13.7    39.6
split     22.6    11.0    16.8      17.8    51.4
split1    21.0    10.9    16.0      18.6    53.8
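The abs. and rel. columns are consistent with WER reductions measured against the baseline's overall WER of 34.6%. A quick check of that reading, using only the overall figures from the table (the function name `wer_reduction` is hypothetical):

```python
def wer_reduction(baseline_wer, system_wer):
    """Return (absolute, relative) WER reduction in percent, rounded to 0.1."""
    absolute = baseline_wer - system_wer
    relative = 100.0 * absolute / baseline_wer
    return round(absolute, 1), round(relative, 1)


base_overall = 34.6
overall = {"pca13": 20.1, "split": 16.8, "split1": 16.0}

reductions = {
    name: wer_reduction(base_overall, wer) for name, wer in overall.items()
}
```

For split1 this gives an absolute reduction of 18.6 points and a relative reduction of 53.8%, matching the table.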

Conclusions

• PSHF module split Aurora’s speech waveforms into two synchronous streams (periodic and aperiodic)

– large improvements over the single-stream Baseline

• Split was better than all PCA combinations:

– PCA26/13 better than PCA78/39, and PCA13 best

– Split1 marginally better than Split

• Periodic speech segments give robustness to noise.

Further work

– Modeling: how best to combine the streams?

– LVCSR: evaluate front end on TIMIT (phone recognition).

– Robust pitch tracking

COLUMBO PROJECT: Harmonic decomposition applied to ASR

Philip J.B. Jackson 1 <p.jackson@surrey.ac.uk>

David M. Moreno 2 <davidm@talp.upc.es>

Javier Hernando 2 <javier@talp.upc.es>

Martin J. Russell 3 <m.j.russell@bham.ac.uk>

http://www.ee.surrey.ac.uk/Personal/P.Jackson/Columbo/
