8. Cepstral Analysis - Xavier Anguera · Cepstral analysis of speech As pointed out at the...

8. Cepstral Analysis

(most slides taken from MIT course by Glass and Zue)

Homomorphic filtering Homomorphic filtering/transformation is a nonlinear

transformation usually applied to image and speech processing used to convert a signal obtained from a convolution of two original signals into the sum of two signals.

In speech processing it can be applied to separate the filter from

the excitation in the source-filter model The cepstrum is one such homomorphic transformation that allows

us to perform such separation. It is an alternative option to linear prediction analysis seen before

x[n]= e[n]!h[n]" x[n]= e[n]+ h[n]

x[n]= D(x[n])

s[n]= u[n]!h[n]

TDP: Cepstral Analysis 3

Cepstral analysis is based on the observation that

by taking the log of X(z)

If the complex log is unique and the z transform is valid then, by applying Z-1

the two convolved signals are now additive.

Basics of cepstral analysis

Basics of cepstral analysis (II) Consider now that we restrict our signal x[n] to have poles and zeros only in the unit circle, i.e.:

then if

This is the complex logarithm of X(w)

Definition of Cepstrum The real cepstrum is defined as:

Its magnitude is real and non-negative And the complex cepstrum:

Where arg() represents the phase. We call it complex because it uses the complex logarithm, not due to the sequence, which can also ne real. In fact, the complex cepstrum of a real sequence is also real

Guido Geßl 21.01.2004

original definition - real cepstrum

Inverse Fourier transfrom of the logarithm of themagnitude of the Fourier transform:

[ ] ( ) !"

! �� #

= log21

magnitude is real and nonnegative

not invertible as the phase is missing

more general definitioncomplex cepstrum

�� )](log[21][ˆ

$$$ $#

�� ))](arg()(log[21

arg is the continuous (unwrapped) phase function.

The term complex cepstrum refers to the use of thecomplex logarithm, not to the sequence.

The complex cepstrum of a real sequence is also a real sequence.

Definition of cesptrum(II)

It can be shown that the the real cepstrum is the even part of the complex cepstrum:

In speech processing we generally use the real cepstrum, which is

obtained by applying an inverse Fourier Transform of the log-spectrum of the signal.

In fact, the name “cepstrum” comes from inverting the first syllable of the word “spectrum”. Similarly, the variable “n” in cx[n] is called “quefrency”, which is the inversion of “frequency”

The real cepstrum is the inverse transform of the real partof and therefore is equal to the conjugate-symmetric part of :

)(ˆ !��][ˆ ��

2][ˆ][ˆ][

* ��

Properties of cepstrums

From this we can derive the following generalproperties:1) the complex cepstrum decays at least as

fast as 1/|n|2) it has infinite duration, even if x[n] has

finite duration3) it is real if x[n] is real (poles and zeros are

in complex conjugate pairs)

NOTE: from 2) and 3) we see why usually a finite number of cepstrums is used in speech processing (12-20 is sufficient), as very high order cepstrums have very small values.

An example

(Using Taylor series expansion and several tricks)

Given the “r” in the denominator, it is an infinite train of deltas that converges to 0

Computational considerations: using DFT In digital signals we replace the Fourier Transform by the Discrete Fourier Transform

Aliasing by repetition of the cepstrums with period N

When N> the number of used cepstrums we do not have a problem (which is usually the case)

Cepstral analysis of speech As pointed out at the beginning, we would like to separate the

excitation from the vocal tract filter h(n) by using a homomorphic transformation.

We can do so easily as the filter parameters usually reside in the lower quefrencies, while the excitation parameters have higher quefrencies

HOMOMORPHIC TRANSFORMATIONS

A homomorphic transformation converts a convolution into a sum:

Consider the problem of recovering a filter's response from a periodic signal (such as a voiced excitation):

The filter response can be recovered if we can separate the output of the homomorphic transformation using a simple filter:

Note that the process of separating the signals is essentially a windowing processing. Is this useful for speech processing?

Cepstral analysis of speech

Cepstrum of a generic voiced signal

magnitude spectrum

cepstrum

•  Contributions to the cepstrum due to periodic excitation will occur at integer multiples of the fundamental period. NOTE that for children and high-pitch women we might have a problem

•  Contributions due to parameters usually modeled by the filter will concentrate in the low quefrency region and will decay quickly with n

Cepstral analysis of speech (voiced signals)

SOURCE-FILTER SEPARATION VIA THE CEPSTRUM

An example of source-filter separation using voiced speech:

(a) Windowed Signal

(b) Log Spectrum

(c) Filtered Cepstrum (n < N)

(d) Smoothed Log Spectrum

(e) Excitation Signal

(f) Log Spectrum (high freq.)

An example of source-filter separation using unvoiced speech:

(a) Windowed Signal

(b) Log Spectrum

The reason this works is simple: the fundamental frequency for the speaker produces a peak in the cepstrum sequence that is far removed (n > N) from the influence of the vocal tract (n < N). You can also demonstrate this using an autocorrelation function. What happens for an extremely high-pitched female or child?

Cepstral analysis of speech (unvoiced signals)

SOURCE-FILTER SEPARATION VIA THE CEPSTRUM

An example of source-filter separation using voiced speech:

(a) Windowed Signal

(b) Log Spectrum

(e) Excitation Signal

(f) Log Spectrum (high freq.)

An example of source-filter separation using unvoiced speech:

(a) Windowed Signal

(b) Log Spectrum

The reason this works is simple: the fundamental frequency for the speaker produces a peak in the cepstrum sequence that is far removed (n > N) from the influence of the vocal tract (n < N). You can also demonstrate this using an autocorrelation function. What happens for an extremely high-pitched female or child?

Cepstral analysis of vowel (rectangular window)

Cepstral analysis of vowel (tapering window)

Cepstral analysis of fricative (rectangular window)

Cepstral analysis of fricative (tapering window)

Use in Speech Recognition

Statistical properties of cepstral coefficients

Mel-frequency cepstral representation

MFCC computation diagram

Mel-filter bank processing

8. Cepstral Analysis - Xavier Anguera · Cepstral analysis of speech As pointed out at the...

Documents

Pitch Prediction from Mel-frequency Cepstral Coefficients ...€¦ · Pitch Prediction from Mel-frequency Cepstral Coe cients Using Sparse Spectrum Recovery Achuth Rao MV, Prasanta

Constant Q Cepstral Coe cients: A Spoo ng Countermeasure ... · Patil, 2015) used cochlear lter cepstral coe cients. Albeit in combina-tion with standard Mel frequency cepstral coe

Year1994 Our Lady Queen of Peace Anguera Bahia Brazil

Year2000 Our Lady Queen of Peace Anguera Bahia Brazil

Glottis Lips Tongue Linear versus Mel Frequency Cepstral ...users.umiacs.umd.edu/~ramani/pubs/Xinhui_ASRU2011_LFCC_vs_MFCC_v19.pdf · MFCC and LFCC (Linear frequency cepstral coefficients)

Year1996 Our Lady Queen of Peace Anguera Bahia Brazil

9. Automatic Speech Recognition - Xavier Anguera

Cepstral- and Spectral-Based Acoustic Measures of Normal

CEPSTRAL ANALYSIS Cepstral analysis synthesis on the mel frequency scale, and an adaptative algorithm for it. Cecilia Caruncho Llaguno

Year1995 Our Lady Queen of Peace Anguera Bahia Brazil

BACKGROUND Our Lady Queen of Peace Anguera Bahia Brazil

L9: Cepstral analysis

One Solution of Extension of Mel-Frequency Cepstral

Year2009 Our Lady Queen of Peace Anguera Bahia Brazil

Feature extractor Mel-Frequency Cepstral Coefficients (MFCCs) Feature vectors

Year1993 Our Lady Queen of Peace Anguera Bahia Brazil

LECTURE 13: CEPSTRAL ANALYSIS - ISIP · LECTURE 13: CEPSTRAL ANALYSIS ... Theory, Algorithm, and System Development, ... models implemented using a complex software paradigm. Such

Year2006 Our Lady Queen of Peace Anguera Bahia Brazil

Year1999 Our Lady Queen of Peace Anguera Bahia Brazil

Implementing Cepstral Filtering Technique using Gabor Filters · other methods use multi-lambda Gabor filters to compute disparity. The ... Now the Cepstral filtering technique is