
The Cepstrum Method


S.B.I.T Cepstrum Analysis

CHAPTER-1

INTRODUCTION

1.1 Introduction

The name "cepstrum" was derived by reversing the first four letters of "spectrum". The operations performed on cepstra are labelled quefrency analysis, liftering, or cepstral analysis.

The cepstrum method serves as an alternative to linear prediction in that it makes no assumption about the characteristics of the data sequence. Bogert et al. [1] developed the cepstrum approach to find echo arrival times in a composite signal by decomposing its non-additive constituents. The cepstrum is the spectrum of the logarithm of the power spectrum; it is defined as a function of pseudo-time, τ, the spectral ripple frequency or quefrency. The cepstrum terms defined by Bogert et al. are summarized below.

1.2 Organization of thesis

The thesis starts with an introductory chapter that gives the basic idea of the project, Cepstrum Analysis.

Chapter 1 describes the basic idea of the project.

Chapter 2 presents the literature survey.

Chapter 3 gives the aim and objectives.

Chapter 4 describes the design implementation.

Chapter 5 lists the source code and outputs.

Chapter 6 describes the results.

Chapter 7 describes the applications.

Chapter 8 gives the conclusion.

Chapter 9 gives the acknowledgements and references.

Department Of E.C.E 1


CHAPTER-2

LITERATURE SURVEY

2.1 Basic principles of cepstral analysis

In general, a signal coming out of a system depends on both the input excitation and the response of the system. From the signal processing point of view, the output of a system can be treated as the convolution of the input excitation with the system response. At times, each component is needed separately for study and/or processing. The process of separating the two components is termed deconvolution.

Deconvolution falls into three cases. In the first case, the input excitation is known, and the system component can be obtained by exciting the system with that input and collecting its response. In the second case, the system response is known, and the input excitation can be recovered by inverse filtering. In the third case, both the input excitation and the system response are unknown. The present study of cepstral analysis of speech comes under this third category.

Speech is composed of excitation source and vocal tract system components. To analyze and model these components independently, and to use them in various speech processing applications, the two components have to be separated from the speech. The objective of cepstral analysis is to separate the speech into its source and system components without any a priori knowledge about the source and/or the system.

According to the source-filter theory of speech production, voiced sounds are produced by exciting the time-varying vocal tract system with a periodic impulse sequence, and unvoiced sounds are produced by exciting it with a random noise sequence. The resulting speech can be considered as the convolution of the respective excitation sequence and the vocal tract filter characteristics. If e(n) is the excitation sequence and h(n) is the vocal tract filter sequence, then the speech sequence s(n) can be expressed as follows:

s(n) = e(n) * h(n) (1)

This can be represented in the frequency domain as

S(ω) = E(ω) · H(ω) (2)
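The relation between Eqns. (1) and (2) can be checked numerically. The following is a minimal Python sketch using only the standard library, not part of the project code; the helper names `dft` and `circular_convolve` and the toy sequences are illustrative assumptions. Because the DFT is used, the convolution in the sketch is circular.

```python
import cmath

def dft(x):
    """Direct O(N^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def circular_convolve(e, h):
    """Circular convolution of two equal-length sequences."""
    n = len(e)
    return [sum(e[m] * h[(t - m) % n] for m in range(n)) for t in range(n)]

# Hypothetical toy data: an impulse-train excitation and a short decaying response.
e = [1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
h = [1.0, 0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0]

s = circular_convolve(e, h)          # time domain, Eqn. (1)
S, E, H = dft(s), dft(e), dft(h)     # frequency domain
# Largest difference between S and the product E.H of Eqn. (2).
max_error = max(abs(S[k] - E[k] * H[k]) for k in range(len(s)))
print(max_error)
```

The printed difference should be at the level of floating-point rounding, which is what Eqn. (2) predicts.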


Eqn. (2) shows that convolution of the excitation and system components in the time domain corresponds to multiplication of their spectra in the frequency domain. To deconvolve the speech sequence into its excitation and vocal tract components, this multiplication has to be converted into a linear combination of the two components. Cepstral analysis does this by transforming the multiplied source and system components in the frequency domain into a linear combination of the two components in the cepstral domain.

From Eqn. (2), the magnitude spectrum of the given speech sequence can be represented as

|S(ω)| = |E(ω)| · |H(ω)| (3)

To combine E(ω) and H(ω) linearly in the frequency domain, a logarithmic representation is used. Taking the logarithm of Eqn. (3) gives

log |S(ω)| = log |E(ω)| + log |H(ω)| (4)

As indicated in Eqn. (4), the log operation transforms the magnitude spectrum of the speech, where the excitation and vocal tract components are multiplied, into a linear combination (sum) of these components; that is, the log operation converts the multiplication into an addition in the frequency domain. The separation can then be done by taking the inverse discrete Fourier transform (IDFT) of the linearly combined log spectra of the excitation and vocal tract system components. Note that the IDFT of a linear spectrum transforms back to the time domain, whereas the IDFT of a log spectrum transforms to the quefrency domain, or cepstral domain, which is similar to the time domain. This is expressed mathematically in Eqn. (5). In the quefrency domain, the vocal tract components are represented by the slowly varying components concentrated in the lower quefrency region, and the excitation components are represented by the fast varying components in the higher quefrency region.

Figure 2.1 details the steps involved in converting a given short-term speech signal to its cepstral domain representation. The outputs obtained at the different stages of the cepstrum computation described in Figure 2.1 are given in Figure 2.2. In Figure 2.2, s(n) is the voiced frame considered and x(n) is the windowed frame, obtained by multiplying s(n) by a Hamming window. |X(ω)| in Figure 2.2 represents the spectrum of the windowed sequence x(n). As the spectrum of the given frame is symmetric, only one half of the spectral components is plotted. log |X(ω)| represents the log magnitude spectrum obtained by taking the logarithm of |X(ω)|. c(n) in Figure 2.2 shows the computed cepstrum for the voiced frame s(n). The obtained cepstrum contains the vocal tract and excitation components, which are linearly combined according to Eqn. (5).

As the cepstrum is derived from the log magnitude of the linear spectrum, it is also symmetric in the quefrency domain, so again only one symmetric half of the cepstrum is plotted. Figure 2.3 plots the various stages in the cepstrum computation for an unvoiced frame. It can be observed that the variations in the lower quefrency region (near the 0 axis) are due to the vocal tract characteristics, while the fast varying nature of the cepstrum towards the upper quefrency region represents the excitation characteristics of the short-term speech segment. Methods have to be devised to extract these vocal tract and excitation characteristics independently. For this purpose, a liftering operation is performed in the quefrency domain. The following section describes the liftering operation used to extract the vocal tract and excitation features independently from the quefrency domain.

c(n) = IDFT(log |S(ω)|) = IDFT(log |E(ω)| + log |H(ω)|) (5)
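As an illustration of Eqn. (5), the sketch below computes the real cepstrum with a direct DFT and checks that the cepstrum of a convolved sequence equals the sum of the cepstra of its two components. It is a minimal Python example under stated assumptions, not the project implementation: the helper names (`dft`, `real_cepstrum`, `circular_convolve`) and the toy sequences are hypothetical, and the convolution is circular so that Eqn. (2) holds exactly for the DFT.

```python
import cmath
import math

def dft(x, inverse=False):
    """Direct O(N^2) DFT; with inverse=True it returns the IDFT (scaled by 1/N)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

def real_cepstrum(x):
    """c(n) = IDFT(log|DFT(x)|), as in Eqn. (5). Assumes no spectral bin is zero."""
    log_mag = [math.log(abs(v)) for v in dft(x)]
    return [v.real for v in dft(log_mag, inverse=True)]

def circular_convolve(e, h):
    """Circular convolution of two equal-length sequences."""
    n = len(e)
    return [sum(e[m] * h[(t - m) % n] for m in range(n)) for t in range(n)]

# Hypothetical toy data: a two-impulse excitation and a decaying system response.
N = 16
e = [0.0] * N
e[0], e[4] = 1.0, 0.5
h = [0.6 ** t for t in range(N)]
s = circular_convolve(e, h)

c_s, c_e, c_h = real_cepstrum(s), real_cepstrum(e), real_cepstrum(h)
# Largest deviation from additivity of the cepstra.
additivity_error = max(abs(c_s[n] - (c_e[n] + c_h[n])) for n in range(N))
print(additivity_error)
```

A direct DFT is used only to keep the sketch self-contained; an FFT would normally be used for frames of realistic length.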

Figure 2.1: Block diagram representing computation of cepstrum


Figure 2.2: 20 ms voiced speech segment and its cepstrum


Figure 2.3: 20 ms unvoiced speech segment and its cepstrum


2.2 Liftering:

Liftering is similar to filtering in the frequency domain: a desired quefrency region is selected for analysis by multiplying the whole cepstrum by a rectangular window at the desired position. Two types of liftering are performed, low-time liftering and high-time liftering. Low-time liftering extracts the vocal tract characteristics in the quefrency domain, and high-time liftering extracts the excitation characteristics of the analysis speech frame.

2.2.1 Low-time liftering for Formant estimation

Low-time liftering is used for estimating the slowly varying vocal tract characteristics from the computed cepstrum of the given speech sequence. The low-time liftering window used for extracting the vocal tract characteristics can be represented as follows,

$$w_e[n] = \begin{cases} 1, & 0 \le n \le L_c \\ 0, & L_c < n \le N/2 \end{cases} \tag{6}$$

where Lc is the cut-off length of the liftering window and N/2 is half the total length of the cepstrum. A typical value of Lc is 15 or 20. The vocal tract characteristics can be obtained by multiplying the cepstrum c(n) by the low-time liftering window, as indicated in Eqn. (7).

ce(n) = we[n] · c(n) (7)

The extraction of the vocal tract characteristics is illustrated in Figure 2.5.

Applying the DFT to the low-time liftered sequence gives its log magnitude spectrum, which is the vocal tract spectrum of the given short-term speech, as given in Eqn. (8).

log |H(ω)| = DFT[ce(n)] (8)
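The effect of Eqns. (7) and (8) can be sketched in Python as follows. This is an illustrative example under assumptions, not the project code: the synthetic frame (a decaying wavelet plus one echo), the cut-off of 8 samples, and the helper names are hypothetical. The sketch also keeps the mirror-image quefrencies so that the full-length DFT of the liftered cepstrum is real, whereas the text applies the window to one half of the cepstrum only.

```python
import cmath
import math

def dft(x, inverse=False):
    """Direct O(N^2) DFT; with inverse=True it returns the IDFT (scaled by 1/N)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

def low_time_lifter(c, cutoff):
    """Keep quefrencies 0..cutoff and their mirror images; zero everything else."""
    n = len(c)
    return [c[q] if (q <= cutoff or q >= n - cutoff) else 0.0 for q in range(n)]

# Hypothetical frame: a smooth "system" wavelet plus one echo 16 samples later.
N, delay, gain, cutoff = 64, 16, 0.5, 8
wavelet = [0.5 ** t for t in range(N)]
frame = [wavelet[t] + (gain * wavelet[t - delay] if t >= delay else 0.0)
         for t in range(N)]

log_mag = [math.log(abs(v)) for v in dft(frame)]
cepstrum = [v.real for v in dft(log_mag, inverse=True)]
# Eqn. (7) followed by Eqn. (8): lifter, then transform back to a log spectrum.
envelope = [v.real for v in dft(low_time_lifter(cepstrum, cutoff))]

# Compare with the log spectrum of the wavelet alone (the "system" part).
wavelet_log_mag = [math.log(abs(v)) for v in dft(wavelet)]
max_deviation = max(abs(envelope[k] - wavelet_log_mag[k]) for k in range(N))
raw_deviation = max(abs(log_mag[k] - wavelet_log_mag[k]) for k in range(N))
print(max_deviation, raw_deviation)
```

The liftered envelope should track the log spectrum of the wavelet much more closely than the unliftered log spectrum does, because the echo ripple sits at quefrencies above the cut-off.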

The important vocal tract parameters, such as formant locations and bandwidths, can be computed from the vocal tract spectrum. The formant locations can be estimated by picking the peaks of the smooth vocal tract spectrum. The block diagram in Figure 2.4 shows the process of formant estimation using low-time liftering. Figure 2.5 shows the computation of low-time liftering. Figure 2.6 shows the formant locations obtained from the peaks in the vocal tract spectrum.

Figure 2.4: Block diagram representing low-time liftering

Figure 2.5: Low-time liftering: Cepstrum of a voiced segment and low-time liftering window (in red color) and vocal tract characteristics of the cepstrum obtained through the low-time liftering


Figure 2.6: Formant locations from vocal tract spectrum

2.2.2 High time liftering for pitch estimation:

As the cepstrum computed from the analysis speech sequence is symmetric, only half the length of the cepstrum is considered for liftering. The excitation characteristics are obtained through a high-time liftering operation using the following window,

$$w_h[n] = \begin{cases} 1, & L_c \le n \le N/2 \\ 0, & \text{otherwise} \end{cases} \tag{9}$$

where Lc is the cut-off length of the liftering window and N/2 is half the total length of the cepstrum. A typical value of Lc is 15 or 20. The excitation characteristics are obtained by multiplying the cepstrum by the high-time liftering window, as given in Eqn. (10).


ch(n) = wh[n] · c(n) (10)

The block diagram in Figure 2.7 shows the high-time liftering process for pitch estimation. The computation of the high-time liftered cepstrum from the cepstrum using the high-time liftering window is shown in Figure 2.8. The pitch period can be estimated as the quefrency corresponding to the highest peak in the high-time liftered cepstrum, as marked in Figure 2.8. Dividing the sampling frequency by the pitch period in samples (that is, multiplying the reciprocal of the pitch period by the sampling frequency) gives the pitch frequency of the analysis speech frame.
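The pitch estimation procedure can be sketched in Python as below. This is a hedged illustration rather than the project code: the synthetic frame (an impulse train with a 40-sample period passed through a simple decaying response), the sampling rate of 8000 Hz, the cut-off of 15, and the function names are all assumptions made for the example.

```python
import cmath
import math

def dft(x, inverse=False):
    """Direct O(N^2) DFT; with inverse=True it returns the IDFT (scaled by 1/N)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

def hamming(n):
    """Symmetric Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * t / (n - 1)) for t in range(n)]

def cepstral_pitch(frame, fs, cutoff):
    """Return (pitch period in samples, pitch frequency in Hz) for one frame."""
    n = len(frame)
    windowed = [s * w for s, w in zip(frame, hamming(n))]
    # The small floor guards against log(0); it is an assumption of this sketch.
    log_mag = [math.log(max(abs(v), 1e-12)) for v in dft(windowed)]
    cepstrum = [v.real for v in dft(log_mag, inverse=True)]
    # High-time liftering window of Eqn. (9): search only cutoff <= q <= N/2.
    period = max(range(cutoff, n // 2 + 1), key=lambda q: cepstrum[q])
    return period, fs / period

# Hypothetical voiced frame: impulses every 40 samples exciting a decaying response.
fs, N, true_period = 8000, 160, 40
frame = [0.0] * N
for start in range(0, N, true_period):
    for t in range(start, N):
        frame[t] += 0.8 ** (t - start)

period, f0 = cepstral_pitch(frame, fs, cutoff=15)
print(period, f0)
```

Quefrency is counted from zero here, so the index of the peak is the pitch period in samples directly.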

Figure 2.7: Block diagram representing high-time liftering

Figure 2.8: High-time liftering: Cepstrum of a voiced segment and liftering window (in red color) and excitation part of the cepstrum obtained through the high-time liftering


CHAPTER-3

Aim and Objective

3.1 Aim

The aim of cepstrum analysis is to convert signals combined by convolution (such as a source and a filter) into sums of their cepstra, so that they can be separated linearly. The cepstrum is the result of taking the inverse Fourier transform of the logarithm of the spectrum of a signal.

3.2 Objective

The objective is to perform the type of deconvolution in which both the input excitation and the system response are assumed to be unknown. The present study of cepstral analysis of speech comes under this category.

This differs from the two simpler cases. In the first, the input excitation is known, and the system component can be obtained by exciting the system with that input and collecting its response. In the second, the system response is known, and the input excitation can be recovered by inverse filtering.


CHAPTER-4

DESIGN IMPLEMENTATION

4.1 Block diagram

Figure 4.1: Block diagram of cepstrum

4.2 Cepstrum

The cepstrum is a representation used to convert signals combined by convolution (such as a source and a filter) into sums of their cepstra, so that they can be separated linearly. The cepstrum is the result of taking the inverse Fourier transform of the logarithm of the spectrum of a signal.

Cepstrum analysis is concerned with the deconvolution of two signal types: the fundamental (basic) wavelet and a train of impulses (the excitation function). The composite signal can be represented in terms of power, complex, or phase cepstra. Emphasis is placed here on the power and complex cepstra.

4.3 The Power Cepstrum

The power cepstrum was first described and used by Bogert et al. [1] in 1963. The purpose of the study was to determine echo arrival times in a composite signal, since the delayed echoes appear as ripples in the logarithmic spectrum of the input data sequence x(n). In practice, the power cepstrum is an effective tool provided that the frequencies of the basic wavelet and the excitation function do not overlap. The power cepstrum of the signal is defined as the square of the inverse z-transform of the logarithm of the magnitude squared of the z-transform of the data sequence, which can be written as

$$x_{pc}(nT) = \left( Z^{-1}\left\{ \log |X(z)|^2 \right\} \right)^2 \tag{4.1}$$

$$x_{pc}(nT) = \left( \frac{1}{2\pi j} \oint_C \log |X(z)|^2 \, z^{n-1} \, dz \right)^2 \tag{4.2}$$


where X(z) represents the z-transform of the data sequence x(nT). Let us assume that the data sequence consists of two convolved sequences, y(nT) and v(nT), which represent the basic wavelet and the excitation function respectively. The data sequence x(nT) can be written as

$$x(nT) = y(nT) * v(nT) \tag{4.3}$$

In the transform domain the convolution becomes a multiplication, so the magnitude-squared z-transforms of the two sequences multiply:

$$|X(z)|^2 = |Y(z)|^2 \, |V(z)|^2 \tag{4.4}$$

Upon taking the logarithm of both sides of the equation, we obtain

$$\log |X(z)|^2 = \log |Y(z)|^2 + \log |V(z)|^2 \tag{4.5}$$

To further elaborate on the power cepstrum analysis, let us assume that the excitation function (signal) is given as

$$v(nT) = \delta(nT) + c\,\delta(nT - n_0 T) \tag{4.6}$$

where δ(nT) denotes the unit impulse function in a sampled data sequence. On the basis of this equation, Eq. (4.4) can be written as

$$|X(z)|^2 = |Y(z)|^2 \, |1 + c z^{-n_0}|^2 \tag{4.7}$$

By taking the logarithm of both sides of this equation and substituting $z = e^{j\omega T}$, we expand Eq. (4.7) as

$$\begin{aligned} \log |X(e^{j\omega T})|^2 &= \log |Y(e^{j\omega T})|^2 + \log\left(1 + c^2 + 2c\cos(\omega n_0 T)\right) \\ &= \log |Y(e^{j\omega T})|^2 + \log(1 + c^2) + \log\left(1 + \frac{2c}{1 + c^2}\cos(\omega n_0 T)\right) \end{aligned} \tag{4.8}$$

The details of these derivations are described elsewhere. It is evident from Eq. (4.8) that the logarithm of the magnitude squared of the z-transform of the data sequence x(nT) will have sinusoidal components (ripples). The amplitudes and frequencies of these ripples correspond to the amplitude c of the excitation function and the time delay n0T. By taking the inverse z-transform of Eq. (4.8), the data sequence x(nT) can now be expressed in terms of its components. It is assumed that the power cepstra of these components are additive, each corresponding to different frequency bands,

$$x_{pc} = y_{pc} + v_{pc} \tag{4.9}$$

where $y_{pc}$ is the power cepstrum of the basic wavelet and $v_{pc}$ is the power cepstrum of the excitation signal.

Note that in the above equation the cross-product term was neglected. If the sequences $y_{pc}$ and $v_{pc}$ occupy different frequency ranges, they can easily be separated by filtering in the pseudo-frequency domain.

In summary, after taking the inverse z-transform and obtaining the power cepstrum, the peaks produced by the excitation function can be identified at the quefrencies (delays) of $n_0T$. Assuming that $v_{pc}(nT)$ is an impulse function, the peaks of the power cepstrum can be detected if $\log |Y(z)|^2$ is quefrency limited to less than $n_0T$ and the ripples of $\log |Y(z)|^2$ have a period (repiod) less than $(n_0T)^{-1}$. While power cepstrum methods have been successfully applied to biomedical signals, including the ECG and diastolic heart sounds, the methods are limited by their failure to maintain the phase information required for precise recovery of the analyzed signals.
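Echo detection with the power cepstrum can be illustrated with a short Python sketch. It uses the DFT in place of the z-transform of Eq. (4.1), and the wavelet, the echo amplitude c = 0.5, the delay of 16 samples, the search range, and the function names are assumptions chosen for the example rather than values from the project.

```python
import cmath
import math

def dft(x, inverse=False):
    """Direct O(N^2) DFT; with inverse=True it returns the IDFT (scaled by 1/N)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

def power_cepstrum(x):
    """x_pc(n) = (IDFT{log|X|^2})^2, the DFT counterpart of Eq. (4.1)."""
    log_power = [math.log(abs(v) ** 2) for v in dft(x)]
    return [v.real ** 2 for v in dft(log_power, inverse=True)]

# Hypothetical composite signal: basic wavelet y plus one echo, x = y * v with
# v(n) = delta(n) + c*delta(n - n0) as in Eq. (4.6).
N, n0, c = 64, 16, 0.5
wavelet = [0.5 ** t for t in range(N)]
x = [wavelet[t] + (c * wavelet[t - n0] if t >= n0 else 0.0) for t in range(N)]

xpc = power_cepstrum(x)
# Skip the lowest quefrencies, where the wavelet itself dominates.
search = range(4, N // 2 + 1)
detected_delay = max(search, key=lambda n: xpc[n])
print(detected_delay, xpc[detected_delay])
```

The wavelet decays quickly in quefrency, so the largest peak above the lowest quefrencies falls at the echo delay.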

4.4 The Complex Cepstrum

The cepstrum computation discussed so far is known as the real cepstrum. As the real cepstrum is computed from the log magnitude spectrum, the phase is ignored, so the sequence cannot be reconstructed from the real cepstrum alone. Reconstruction is possible if the Fourier phase is preserved separately and used together with the real cepstrum. For reconstruction of the sequence from the cepstrum itself, the complex cepstrum is used. Instead of taking the inverse Fourier transform of the log magnitude spectrum, as for the real cepstrum, the complex cepstrum takes the inverse Fourier transform of the logarithm of the complex spectrum. As the logarithm of the full complex spectral values is used, the phase is preserved in the complex cepstral sequence, which can therefore be used to reconstruct the sequence. The methods for computing pitch and formant parameters from the complex cepstrum remain the same as for the real cepstrum, as these parameters are obtained from the magnitude of the complex cepstral coefficients. Figure 4.2 shows the block diagram for complex cepstrum computation.


Figure 4.2: Block diagram representing complex cepstrum

The complex cepstrum is an outgrowth of homomorphic system theory developed by Oppenheim. Although the power cepstrum can be used for detecting echoes, it cannot be used for wavelet recovery since the phase information is lost. The complex cepstrum of a data sequence can be defined as the inverse z-transform of the complex logarithm of the z-transform of the data sequence as follows,

$$\hat{x}(nT) = Z^{-1}\left\{ \log X(z) \right\} \tag{4.10}$$

where $\hat{x}(nT)$ represents the complex cepstrum and X(z) represents the z-transform of the data sequence x(nT). Let us assume that the input sequence is the convolution of two sequences as follows,

$$x(nT) = y(nT) * v(nT) \tag{4.11}$$

where y(nT) represents the basic wavelet and v(nT) represents the excitation function. This can be written in the z-domain as

$$X(z) = Y(z) \, V(z) \tag{4.12}$$

The logarithm of Eq. (4.12) is written as

$$\hat{X}(z) = \log X(z) = \log Y(z) + \log V(z) \tag{4.13}$$

The complex cepstrum can then be estimated by the inverse z-transform of this equation,

$$\hat{x}(nT) = \hat{y}(nT) + \hat{v}(nT) \tag{4.14}$$

where $\hat{x}(nT)$ represents the complex cepstrum of the composite signal x(nT), $\hat{y}(nT)$ represents the complex cepstrum of the wavelet component, and $\hat{v}(nT)$ represents the complex cepstrum of the excitation component. To account for the presence of the excitation function in the complex cepstrum, we assume that the excitation function v(nT) is of the form

$$v(nT) = \delta(nT) + c\,\delta(nT - n_0 T) \tag{4.15}$$


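By taking the z-transform and substituting $z = e^{j\omega T}$, we have

$$V(z) = V(e^{j\omega T}) = 1 + c\,e^{-j\omega n_0 T} \tag{4.16}$$

and

$$X(e^{j\omega T}) = Y(e^{j\omega T}) \left( 1 + c\,e^{-j\omega n_0 T} \right) \tag{4.17}$$

Taking the logarithm of both sides of Eq. (4.17) gives

$$\log X(e^{j\omega T}) = \log Y(e^{j\omega T}) + \log\left( 1 + c\,e^{-j\omega n_0 T} \right) \tag{4.18}$$

When c < 1, the wavelet component dominates and the data sequence exhibits minimum phase characteristics. This is most evident when Eq. (4.18) is expanded, using the series $\log(1 + u) = u - u^2/2 + \cdots$ for $|u| < 1$, to the form

$$\log X(e^{j\omega T}) = \log Y(e^{j\omega T}) + c\,e^{-j\omega n_0 T} - \frac{c^2}{2}\,e^{-2j\omega n_0 T} + \cdots \tag{4.19}$$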

Finally, the complex cepstrum of the data sequence x(nT) is obtained by taking the inverse z-transform of Eq. (4.19),

$$\hat{x}(nT) = \hat{y}(nT) + c\,\delta(nT - n_0 T) - \frac{c^2}{2}\,\delta(nT - 2 n_0 T) + \cdots \tag{4.20}$$

It is evident from Eq. (4.20) that the complex cepstrum includes the complex cepstrum of the wavelet as well as ripples of the excitation function at the positive quefrencies ($n_0T$ and its multiples). The amplitudes and positions of the ripples correspond to the amplitude and delay of the excitation function v(nT).
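The series behind Eqs. (4.19) and (4.20) can be verified numerically. The short Python sketch below compares the complex logarithm of the echo term with its partial sum for c = 0.5 and lists the first impulse amplitudes; the function name `log_echo_series`, the value of c, and the number of terms are assumptions made for illustration, and theta stands for the product of ω, n0 and T.

```python
import cmath

def log_echo_series(c, theta, terms):
    """Partial sum of log(1 + c e^{-j theta}) = sum_m (-1)^(m+1) c^m e^{-j m theta} / m."""
    return sum((-1) ** (m + 1) * c ** m * cmath.exp(-1j * m * theta) / m
               for m in range(1, terms + 1))

c = 0.5  # echo amplitude, c < 1 (minimum phase case)
thetas = [2 * cmath.pi * k / 32 for k in range(32)]
# Largest gap between the exact logarithm and a 60-term partial sum.
series_error = max(
    abs(cmath.log(1 + c * cmath.exp(-1j * t)) - log_echo_series(c, t, 60))
    for t in thetas)

# Impulse amplitudes of Eq. (4.20) at quefrencies n0*T, 2*n0*T and 3*n0*T.
amplitudes = [(-1) ** (m + 1) * c ** m / m for m in range(1, 4)]
print(series_error, amplitudes)
```

The amplitudes alternate in sign and shrink as c^m / m, which is why the first peak at n0T is the easiest one to detect.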

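For the case c > 1, where the data sequence exhibits maximum phase characteristics, Eq. (4.18) can instead be written as

$$\log X(e^{j\omega T}) = \log Y(e^{j\omega T}) + \log c - j\omega n_0 T + \frac{1}{c}\,e^{j\omega n_0 T} - \frac{1}{2c^2}\,e^{2j\omega n_0 T} + \cdots \tag{4.21}$$

Prior to filtering, the linear phase term $-j\omega n_0 T$ in the above equation should be removed in order for the excitation function to show maximum phase characteristics, which requires negative ripple frequencies. Note that the amplitudes of the ripples have been attenuated in Eq. (4.21) by the factor 1/c. The complex cepstrum can then be written as

$$\hat{x}(nT) = \hat{y}(nT) + \log c \; \delta(nT) + \frac{1}{c}\,\delta(nT + n_0 T) - \frac{1}{2c^2}\,\delta(nT + 2 n_0 T) + \cdots \tag{4.22}$$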


After filtering, the linear phase term $-j\omega n_0 T$ should be included again in order to recover the excitation function [3].

The basic wavelet y(nT) can be recovered by low-pass filtering the complex cepstrum and taking the inverse z-transform of the resultant sequence. Note that for effective wavelet recovery it is essential that the frequencies of the wavelet and the excitation function do not overlap. If necessary, the excitation function can also be recovered by first high-pass filtering the complex cepstrum and then taking the inverse z-transform of the resultant signal.

The recovery process requires that the filtered complex cepstrum be z-transformed, exponentiated, and inverse z-transformed. The y(nT) and v(nT) sequences can be restored since the necessary phase information has been retained. Figures 4.3(a) and 4.3(b) show the overall cepstrum (homomorphic deconvolution) wavelet recovery system. Figure 4.3(c) shows the typical filters utilized: short pass (low pass), long pass (high pass), and notch. These filters are defined in the pseudo-frequency domain and are analogous to low-pass, high-pass, and notch filters in the frequency domain. In summary, the complex cepstrum contains the phase information and therefore allows reconstruction of the composite signal. The power cepstrum can be calculated from the complex cepstrum as follows,

$$x_{pc}(nT) = \left( \hat{x}(nT) + \hat{x}(-nT) \right)^2 \tag{4.24}$$

where $x_{pc}(nT)$ represents the power cepstrum and $\hat{x}(nT)$ represents the complex cepstrum.

4.4.1 Phase Unwrapping

Calculation of the complex cepstrum is, however, complicated by the fact that the complex logarithm is multivalued. Where computers and commercial software are employed to compute the imaginary part of the complex logarithm, the principal value is given as

$$-\pi \le \arg[X(z)] \le \pi \tag{4.25}$$

The term arg[X(z)], representing the imaginary part of the complex logarithm, has jump discontinuities of 2π radians. Since the function is discontinuous, calculating log X(z) directly from the principal value is inappropriate. The imaginary part of log X(z) must be continuous, periodic, and thus analytic in some annular region of the z-plane in order to perform the inverse z-transformation of log X(z). Another important requirement is that the imaginary part of log X(z) must be an odd function of ω, since the complex cepstrum of a real function should be real. Therefore, the unwrapped phase is required for calculation of the complex cepstrum.

Fig. 4.3. Overall wavelet recovery system, also known as homomorphic deconvolution (filtering) or cepstrum system. The DFT is performed by an FFT algorithm. xR(n) denotes the recovered wavelet. The input sequence is windowed and then appended with zeros. (a) Simplified block diagram. (b) More detailed block diagram which can be used to process data in real time. (c) Typical lifters for the single-echo, minimum phase (c < 1) case where peaks occur at n0 and multiples thereof. (The notch lifter is sometimes called a comb lifter.) [From Childers et al. [4]].

Several approaches for computation of unwrapped phase values are available [4]. However, only two of them are discussed here. The first requires that the phase be sampled at a very high rate, so that the phase never changes by more than π between samples. As shown in Fig. 4.4, a correction term c(k) is then added to the modulo 2π phase sequence P(k), and it is updated whenever the phase difference between adjacent samples of P(k) exceeds π,

$$c(k) = \begin{cases} c(k-1) - 2\pi, & \text{if } P(k) - P(k-1) > \pi \\ c(k-1) + 2\pi, & \text{if } P(k-1) - P(k) > \pi \\ c(k-1), & \text{otherwise} \end{cases} \tag{4.26}$$

where c(0) = 0.
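The correction sequence of Eq. (4.26) translates almost directly into code. The following Python sketch is illustrative only: the function name `unwrap_phase` and the linear test phase are assumptions, and the method relies on the stated condition that the true phase changes by less than π between samples.

```python
import math

def unwrap_phase(wrapped):
    """Add the correction sequence c(k) of Eq. (4.26) to a modulo-2*pi phase P(k)."""
    corrections = [0.0]                      # c(0) = 0
    for k in range(1, len(wrapped)):
        jump = wrapped[k] - wrapped[k - 1]
        if jump > math.pi:                   # P(k) - P(k-1) > pi
            corrections.append(corrections[-1] - 2 * math.pi)
        elif -jump > math.pi:                # P(k-1) - P(k) > pi
            corrections.append(corrections[-1] + 2 * math.pi)
        else:
            corrections.append(corrections[-1])
    return [p + c for p, c in zip(wrapped, corrections)]

# Hypothetical test phase: a linear phase of -0.4 rad per sample, wrapped to
# its principal value and then unwrapped again.
true_phase = [-0.4 * k for k in range(40)]
wrapped = [math.atan2(math.sin(p), math.cos(p)) for p in true_phase]
unwrapped = unwrap_phase(wrapped)
max_error = max(abs(u - p) for u, p in zip(unwrapped, true_phase))
print(max_error)
```

If the phase changes by more than π between samples, this simple rule picks the wrong multiple of 2π, which motivates the adaptive integration method described next.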

Fig. 4.4. Phase unwrapping. (a) Phase modulo 2π. (b) c(k), the correction sequence. (c) Unwrapped phase. [From Childers et al. [4]].

Alternatively, the phase can be unwrapped by an adaptive numerical integration procedure proposed by Tribolet [6]. This approach combines information regarding the phase derivative and the principal value of the phase. For each frequency, a set of permissible phase values is defined by adding integer multiples of 2π to the principal value of the phase. One of these values may be selected as the unwrapped phase with the help of a phase estimate. This phase estimate is formed by adaptive numerical integration of the phase derivative within a given step interval. The step interval is updated until the phase estimate approaches one of the permissible phase values [6]. A new approach based on finite length cepstrum modeling was proposed by Nadeu, details of which appear in Ref. [7].

4.4.2 Phase Unwrapping Using Adaptive Numerical Integration


The Fourier transform of the data sequence x(n) is given as

$$X(z) = X_R(z) + jX_I(z) = |X(z)| \, e^{j\arg[X(z)]} \tag{4.27}$$

where $X_R(z)$ represents the real part of X(z), $X_I(z)$ represents the imaginary part of X(z), |X(z)| represents the magnitude of X(z), and arg[X(z)] represents the phase of X(z). The logarithm of the Fourier transform of the data sequence x(n) is written as

$$\hat{X}(z) = \log X(z) = \log |X(z)| + j\arg[X(z)] \tag{4.28}$$

The derivative of $\hat{X}(z)$ may be found by assuming that Eq. (4.28) has a valid Fourier transform,

$$\frac{d\hat{X}(z)}{d\omega} = \frac{d \log X(z)}{d\omega} = \frac{dX(z)/d\omega}{X(z)} \tag{4.29}$$

The derivative of arg[X(z)] is obtained from

$$\frac{d\arg[X(z)]}{d\omega} = \frac{X_R(z)\,X_I'(z) - X_R'(z)\,X_I(z)}{|X(z)|^2} \tag{4.30}$$

where the prime denotes the derivative d/dω. Finally, the derivative of X(z) is written as

$$X'(z) = X_R'(z) + jX_I'(z) = -j\,\mathrm{FT}\{n\,x(n)\} \tag{4.31}$$
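Equations (4.30) and (4.31) give the phase derivative directly from the data, without unwrapping. A minimal Python sketch follows, assuming z = e^{jω} with unit sample spacing; the helper names `spectrum` and `phase_derivative` and the two test sequences are hypothetical. For a pure delay of three samples the phase is -3ω, so the derivative should be -3 at every frequency.

```python
import cmath

def spectrum(x, w):
    """X(e^{jw}) = sum_n x(n) e^{-jwn} evaluated at a single frequency w."""
    return sum(v * cmath.exp(-1j * w * n) for n, v in enumerate(x))

def phase_derivative(x, w):
    """d arg[X]/dw from Eqs. (4.30) and (4.31)."""
    X = spectrum(x, w)
    dX = -1j * spectrum([n * v for n, v in enumerate(x)], w)    # Eq. (4.31)
    return (X.real * dX.imag - dX.real * X.imag) / abs(X) ** 2  # Eq. (4.30)

# A pure delay of three samples has phase -3w, hence a constant derivative of -3.
delayed_impulse = [0.0, 0.0, 0.0, 1.0]
slopes = [phase_derivative(delayed_impulse, w) for w in (0.1, 1.0, 2.5)]

# Cross-check on a second sequence against a central finite difference of the
# principal-value phase (no wrap occurs near w = 0.7 for this sequence).
x = [1.0, 0.5]
w, step = 0.7, 1e-6
finite_difference = (cmath.phase(spectrum(x, w + step))
                     - cmath.phase(spectrum(x, w - step))) / (2 * step)
analytic = phase_derivative(x, w)
print(slopes, finite_difference, analytic)
```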

It is evident then that the phase arg[X(z)] can be defined as the integration of the derivative arg'[X(z)] as follows

$$\arg[X(e^{j\omega})] = \int_0^{\omega} \arg'[X(e^{j\eta})] \, d\eta \tag{4.32}$$

based on the initial condition $\arg[X(e^{j0})] = 0$. The phase function exhibiting these properties is called the unwrapped phase function. On the basis of these properties, it is apparent that the unwrapped phase function is continuous and is also an odd function when the mean of the phase derivative is equal to zero. Otherwise, the linear phase component caused by the nonzero mean of the derivative of arg[X(z)] should be removed before phase unwrapping.

To calculate the unwrapped phase function, the principal value of the phase is first calculated at each frequency $\omega_k$. The permissible phase values are then formed from the sum of the principal value and a correction term $2\pi l(\omega_k)$,

$$\mathrm{ARG}[X(e^{j\omega_k})] + 2\pi l(\omega_k) \tag{4.33}$$

where $\mathrm{ARG}[X(e^{j\omega_k})]$ represents the principal value at the given frequency $\omega_k$ (written in capitals here to distinguish it from the unwrapped phase) and $l(\omega_k)$ is an integer. The unwrapped phase is the permissible value selected by the correct integer,

$$\arg[X(e^{j\omega_k})] = \mathrm{ARG}[X(e^{j\omega_k})] + 2\pi l(\omega_k) \tag{4.34}$$

where $\arg[X(e^{j\omega_k})]$ represents the unwrapped phase function at frequency $\omega_k$. The integer $l(\omega_k)$ can be determined with the help of a phase estimate obtained by applying trapezoidal numerical integration to the phase derivative. The current value of the phase estimate is calculated from the previous (already unwrapped) phase value.

Step 1. Form the phase estimate at $\omega_k$,

$$\widetilde{\arg}[X(e^{j\omega_k})] = \arg[X(e^{j\omega_{k-1}})] + \frac{\omega_k - \omega_{k-1}}{2} \left( \arg'[X(e^{j\omega_{k-1}})] + \arg'[X(e^{j\omega_k})] \right) \tag{4.35}$$

The estimate in Eq. (4.35) improves as the step increment, $\Delta\omega = \omega_k - \omega_{k-1}$, becomes small.

Step 2. The integer $l(\omega_k)$ is accepted as consistent if the estimate falls within a predetermined threshold THR of one of the permissible phase values at $\omega_k$,

$$\left| \widetilde{\arg}[X(e^{j\omega_k})] - \left( \mathrm{ARG}[X(e^{j\omega_k})] + 2\pi l(\omega_k) \right) \right| < \mathrm{THR} < \pi \tag{4.36}$$

The resulting $l(\omega_k)$ is used in Eq. (4.34) to form the unwrapped phase at $\omega_k$. The unwrapped phase at $\omega_{k+1}$ is then estimated using the newly unwrapped phase at $\omega_k$. This process continues until all the unwrapped phase values have been determined.
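The two steps can be sketched in Python as follows. This is a simplified illustration, not the adaptive algorithm of Ref. [6]: it uses a fixed frequency grid and raises an error where the full method would refine the step, it assumes the spectrum is real and positive at ω = 0 so that the starting phase is zero, and the function names, the threshold of 1.0, and the test sequence are all hypothetical.

```python
import cmath
import math

def spectrum(x, w):
    """X(e^{jw}) = sum_n x(n) e^{-jwn} evaluated at a single frequency w."""
    return sum(v * cmath.exp(-1j * w * n) for n, v in enumerate(x))

def phase_derivative(x, w):
    """d arg[X]/dw computed from the spectra of x(n) and n*x(n)."""
    X = spectrum(x, w)
    dX = -1j * spectrum([n * v for n, v in enumerate(x)], w)
    return (X.real * dX.imag - dX.real * X.imag) / abs(X) ** 2

def unwrap_by_integration(x, omegas, thr=1.0):
    """Fixed-grid simplification of Steps 1 and 2; assumes X(e^{j0}) > 0."""
    unwrapped = [0.0]
    for k in range(1, len(omegas)):
        w_prev, w_cur = omegas[k - 1], omegas[k]
        # Step 1: trapezoidal phase estimate from the previous unwrapped value.
        estimate = unwrapped[-1] + 0.5 * (w_cur - w_prev) * (
            phase_derivative(x, w_prev) + phase_derivative(x, w_cur))
        principal = cmath.phase(spectrum(x, w_cur))
        multiple = round((estimate - principal) / (2 * math.pi))
        candidate = principal + 2 * math.pi * multiple
        # Step 2: accept only if the estimate is close to a permissible value.
        if abs(estimate - candidate) >= thr:
            raise ValueError("step too large; a full implementation would refine it")
        unwrapped.append(candidate)
    return unwrapped

# Hypothetical sequence: a 5-sample delay times (1 + 0.3 z^-1), whose unwrapped
# phase is known in closed form.
x = [0.0] * 5 + [1.0, 0.3]
omegas = [math.pi * k / 64 for k in range(65)]
unwrapped = unwrap_by_integration(x, omegas)
expected = [-5 * w + math.atan2(-0.3 * math.sin(w), 1 + 0.3 * math.cos(w))
            for w in omegas]
max_error = max(abs(u - e) for u, e in zip(unwrapped, expected))
print(max_error)
```

Because each step restarts from the exactly unwrapped previous value, the integration error does not accumulate across the grid.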


CHAPTER-5

SOURCE CODE AND OUTPUTS

5.1 SOURCE CODE

clc; clear all; close all;

% Read the speech file (audioread replaces the older wavread)
[y, fs] = audioread('03.wav');
% fs = 8000;
y = y / (1.01 * max(abs(y)));      % normalization
y = y(241:400);                    % 160-sample voiced frame
N = 160;
window = hamming(N);
y = y .* window;                   % windowed frame
y1 = log(abs(fft(y)));             % log magnitude spectrum
y2 = real(ifft(y1));               % cepstrum

figure;
subplot(2,1,1); plot(y);
title('Voiced speech segment'); xlabel('time(samples)'); ylabel('amplitude');
subplot(2,1,2); plot(y2);
title('Cepstrum'); xlabel('quefrency(samples)'); ylabel('amplitude');

% High-time liftering (pitch estimation)
y3 = y2(1:length(y2)/2);           % one symmetric half of the cepstrum
L = zeros(1, length(y3));
L(15:length(L)) = 1;               % high-time liftering window
y4 = real(y3 .* L');

[y_val, y_loc] = max(y4);
pitch_period = y_loc - 1           % MATLAB index 1 corresponds to quefrency 0
pitch_frequency = fs / pitch_period

figure;
subplot(2,1,1); plot(y3); hold on; plot(L, 'r'); hold off;
xlabel('quefrency(samples)'); ylabel('amplitude'); title('Cepstrum');
subplot(2,1,2); plot(y4);
xlabel('quefrency(samples)'); ylabel('amplitude'); title('High-time liftered Cepstrum');

% Low-time liftering (formant estimation)
L1 = zeros(1, length(y3));
L1(1:15) = 1;                      % low-time liftering window
y5 = real(y3 .* L1');
y6 = y5(1:15);
f6 = fft(y6, 8000);
f6 = f6(1:4000);
f6 = real(f6);                     % vocal tract (log) spectrum

% Peak picking algorithm
k = 1;
for i = 2:length(f6)-1
    if (f6(i-1) < f6(i)) && (f6(i+1) < f6(i))
        formant_mag(k) = f6(i);
        formant(k) = i;
        k = k + 1;
    end
end

figure;
subplot(2,1,1); plot(y3); hold on; plot(L1, 'r'); hold off;
xlabel('quefrency(samples)'); ylabel('amplitude'); title('Cepstrum');
subplot(2,1,2); plot(y5);
xlabel('quefrency(samples)'); ylabel('amplitude'); title('Low-time liftered Cepstrum');

figure;
plot(f6); hold on; plot(formant, formant_mag, 'r*'); hold off;
title('formant extracted from low time liftered cepstrum');
xlabel('frequency(samples)'); ylabel('amplitude');

5.2 OUTPUTS

Output fig 5.1:

Voiced speech segment and its Cepstrum


Output fig 5.2: Cepstrum and its high-time liftered cepstrum

Output fig 5.3: Cepstrum and its low-time liftered cepstrum

Output fig 5.4: Formants extracted from the low-time liftered cepstrum

CHAPTER-6

RESULTS

6. Result

The cepstrum can therefore serve as a feature vector for representing the human voice and musical signals, and it is used for voice identification, pitch detection and related tasks. It is useful in these applications because the low-frequency periodic excitation from the vocal cords and the formant filtering of the vocal tract, which convolve in the time domain and multiply in the frequency domain, become additive in the quefrency domain and occupy different regions of it. In the program of Chapter 5, high-time liftering isolates the excitation and gives the pitch, while low-time liftering isolates the vocal tract envelope and gives the formants.
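The chain from convolution to addition can be written out in a few lines. This is a sketch in assumed notation, none of which is taken from the text: e[n] is the excitation, h[n] the vocal tract impulse response, x[n] the speech frame, and c[n] the real cepstrum, that is, the inverse Fourier transform of the log-magnitude spectrum.

```latex
\begin{align}
x[n] &= e[n] * h[n] \\
X(\omega) &= E(\omega)\,H(\omega) \\
\log\lvert X(\omega)\rvert &= \log\lvert E(\omega)\rvert + \log\lvert H(\omega)\rvert \\
c_x[n] &= c_e[n] + c_h[n]
\end{align}
```

The last line follows from the linearity of the inverse transform. Since c_h[n] is concentrated at low quefrency and c_e[n] peaks at multiples of the pitch period, a lifter can keep one term and discard the other.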

CHAPTER-7

APPLICATIONS

7. Applications:

The cepstrum can be seen as information about the rate of change in the different spectrum bands. It was originally invented for characterizing the seismic echoes resulting from earthquakes and bomb explosions. It has also been used to determine the fundamental frequency of human speech and to analyze radar signal returns. Cepstrum pitch determination is particularly effective because the effects of the vocal excitation (pitch) and the vocal tract (formants) are additive in the logarithm of the power spectrum and thus clearly separate.[5]

The auto cepstrum is defined as the cepstrum of the autocorrelation. It is more accurate than the cepstrum in the analysis of data with echoes. The power cepstrum in particular is often used as a feature vector for representing the human voice and musical signals. For these applications, the spectrum is usually first transformed using the Mel scale. The result is called the Mel-frequency cepstrum or MFC (its coefficients are called Mel-frequency cepstral coefficients, or MFCCs). It is used for voice identification, pitch detection and much more. The cepstrum is useful in these applications because the low-frequency periodic excitation from the vocal cords and the formant filtering of the vocal tract, which convolve in the time domain and multiply in the frequency domain, are additive and lie in different regions in the quefrency domain.

CHAPTER-8

CONCLUSION

8. CONCLUSION

This study shows that the cepstrum can be seen as information about the rate of change in the different spectrum bands. It was originally invented for characterizing the seismic echoes resulting from earthquakes and bomb explosions, and it has also been used to determine the fundamental frequency of human speech and to analyze radar signal returns. Cepstrum pitch determination is particularly effective because the effects of the vocal excitation (pitch) and the vocal tract (formants) are additive in the logarithm of the power spectrum and thus clearly separate.

The auto cepstrum is defined as the cepstrum of the autocorrelation, and it is more accurate than the cepstrum in the analysis of data with echoes. The power cepstrum in particular is often used as a feature vector for representing the human voice and musical signals. For these applications, the spectrum is usually first transformed using the Mel scale. The result is called the Mel-frequency cepstrum or MFC (its coefficients are called Mel-frequency cepstral coefficients, or MFCCs). It is used for voice identification, pitch detection and much more. The cepstrum is useful in these applications because the low-frequency periodic excitation from the vocal cords and the formant filtering of the vocal tract, which convolve in the time domain and multiply in the frequency domain, are additive and lie in different regions in the quefrency domain.

CHAPTER-9

ACKNOWLEDGEMENTS & REFERENCES

9.1 ACKNOWLEDGEMENTS

The authors wish to thank Cmde PK Chakravorty, General Manager (Technical), Naval Dockyard, Visakhapatnam, as this paper received inspiration from his paper titled 'Signal Processing in Vibration Analysis - Advanced Techniques', published in the July 1991 issue of Defence Science Journal, and also for his constant encouragement in bringing out this paper. The authors wish to extend their gratitude to Dr RS Tripathi, DGM (LAB), Naval Dockyard, Visakhapatnam, for his constant encouragement, interaction and supervision of the study. The paper could not have taken its present shape but for the motivation provided by him.

9.2 REFERENCES

1. Chakravorty, P.K. Signal processing in vibration analysis - advanced techniques. Def. Sci. J., 1991, 41(3), 241-49.
2. Randall, R.B. Frequency analysis. Bruel & Kjaer, Naerum Press, Denmark, 1987. p. 36.
3. Warring, R.H. Handbook of noise and vibration control, 4th Edn. Trade and Technical Press Ltd, Morden, Surrey, England, 1978. p. 386.
4. Harris, Cyril M. & Crede, Charles E. Vibration measurement equipment and signal analyzers & condition monitoring of machinery. In Shock and vibration handbook, edited by Harold, Crawford D & David, Fogarty E. McGraw-Hill Book Company, USA, 1988. pp. 13.43-13.45 and 16.16.
5. Lyon, R.H. & Ordubadi, A. Use of cepstra in acoustic signal analysis. ASME J. Mech. Des., 1982, 104(2), 303-07.
6. Randall, R.B. Cepstrum analysis and gearbox fault diagnosis. Bruel & Kjaer application note, Bruel & Kjaer, Naerum Press, Denmark, 1973. pp. 4-5.
7. Randall, R.B. Cepstrum analysis. In Machine health monitoring using vibration analysis. Canadian Acoustical Association, Vancouver, Canada, October 1983, pp. 1-15.
8. Volen, R.H. Technique and application of mechanical signature analysis. S and V Digest, 1979, 11(9), p. 12.
9. Courrech, J. New techniques for fault diagnostic in rolling element bearings. In Proceedings of the 40th meet of the mechanical failure preventive group, 16-18 April 1985. National Bureau of Standards, Gaithersburg, MD, 1985. pp. 16-20.
