18
Pitch detection 1 In this chapter will we describe a pitch detector or pitch extractor. The goal of a pitch de- tector is to find the fundamental frequency f 0 of a speech signal. There are several different techniques for finding this fundamental frequency. In this chapter a pitch detector will be chosen, based on test of some different pitch detections techniques. 1.1 Introduction A pitch detector is used to detect the fundamental frequency ( f 0 ), called the pitch, in a sig- nal. In the project "Bandwidth expansion of narrowband signals" a pitch detector would be useful to find f 0 of speech signals, due to the fact that telephone signals are limited in the frequency interval 300-3400 Hz. When extending the bandwidth of a signal from a sam- pling frequency of 8000Hz to 16000Hz, it would also be useful to create the lower frequen- cies of a phone-signal, say 50-300 Hz. Because the human pitch lies in the interval 80-350 Hz(table 1.1.1 on page II), where the pitch for men is normally around 150 Hz, for women around 250 Hz and children even higher frequencies, the pitch is needed to construct this part of the speech signal. Some of the most used detectors are; Energy based, Cepstral based, zero crossing, pitch in the difference function and autocorrelation based. In this chapter there will be tested three different pitch detectors, The "On-The-Fly", a Cepstral based and an auto correlation based. "On-The-Fly" [1] is time based an algorithm which is developed to estimate the pitch of a signal, divided into frames. This paper is a description of this algorithm, how it’s imple- mented and an overall test of its ability to estimate a pitch of a speech signal. The Cepstral pitch detector is based on the log of the fourier transformation. This algo- rithm is only shortly described and tested in this paper. The Auto correlation pitch detector [5] is described and tested in this chapter. I

Pitch detection - Aalborg Universitetkom.aau.dk/group/04gr742/pdf/pitch_worksheet.pdf · 2004-12-08 · Pitch detection 1 In this chapter will we describe a pitch detector or pitch

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Pitch detection - Aalborg Universitetkom.aau.dk/group/04gr742/pdf/pitch_worksheet.pdf · 2004-12-08 · Pitch detection 1 In this chapter will we describe a pitch detector or pitch

Pitch detection 1

In this chapter will we describe a pitch detector or pitch extractor. The goal of a pitch de-tector is to find the fundamental frequency f0 of a speech signal. There are several differenttechniques for finding this fundamental frequency. In this chapter a pitch detector will bechosen, based on test of some different pitch detections techniques.

1.1 Introduction

A pitch detector is used to detect the fundamental frequency ( f0), called the pitch, in a sig-nal. In the project "Bandwidth expansion of narrowband signals" a pitch detector would beuseful to find f0 of speech signals, due to the fact that telephone signals are limited in thefrequency interval 300-3400 Hz. When extending the bandwidth of a signal from a sam-pling frequency of 8000Hz to 16000Hz, it would also be useful to create the lower frequen-cies of a phone-signal, say 50-300 Hz. Because the human pitch lies in the interval 80-350Hz(table 1.1.1 on page II), where the pitch for men is normally around 150 Hz, for womenaround 250 Hz and children even higher frequencies, the pitch is needed to construct thispart of the speech signal.

Some of the most used detectors are; Energy based, Cepstral based, zero crossing, pitchin the difference function and autocorrelation based. In this chapter there will be testedthree different pitch detectors, The "On-The-Fly", a Cepstral based and an auto correlationbased.

"On-The-Fly" [1] is time based an algorithm which is developed to estimate the pitch ofa signal, divided into frames. This paper is a description of this algorithm, how it’s imple-mented and an overall test of its ability to estimate a pitch of a speech signal.

The Cepstral pitch detector is based on the log of the fourier transformation. This algo-rithm is only shortly described and tested in this paper.

The Auto correlation pitch detector [5] is described and tested in this chapter.

I

Page 2: Pitch detection - Aalborg Universitetkom.aau.dk/group/04gr742/pdf/pitch_worksheet.pdf · 2004-12-08 · Pitch detection 1 In this chapter will we describe a pitch detector or pitch

CHAPTER 1. PITCH DETECTION

1.1.1 The test signal

The pitch detector must be able to estimate a pitch of signals in a limited frequency range.The speech signal used throughout this paper, is spoken by a man with a pitch around 150Hz± 20Hz, tested with a pitch analyzer program1. If not mentioned otherwise figures illustratea single frame (20ms) and always the same frame, so that the figures can be compared. Thesignal is sampled at 16000Hz and passed though a telefilter, which is a 10’the order band passfilter working at 300-3400Hz. All figures in this paper are illustrated with the original signaland the band passed signal. In this way it is possible to compare the difference betweenthese.

f0min(Hz) f0max(Hz)

men 80 200

woman 150 350

Table 1.1: Range of human speech

Also notice that the implemented algorithms not only have been tested with the above de-scribed signal. This signal is only used to gain understanding of the algorithms.

1.2 On-The-Fly pitch estimation

1.2.1 Common time domain pitch detectors

Many time-domain pitch detectors use the autocorrelation (equation 1.1) method to estimatethe pitch, which is also among the oldest methods. Another method to find the pitch is thedifference function (equation 1.2), which has been shown by Yin [2] and Chowdhury [3] tobe more effective than the autocorrelation method. The algorithm "On-The-Fly", is a pitch-detector is based on the difference function (dt ).

rt(τ) =W−τ

∑j=1

x jx j+τ (1.1)

dt(τ) =W−τ

∑j=1

(x j − x j+r)2 (1.2)

, where W is the length of the window. t indicates this is a time-domain calculation.

1.2.2 Unifying the difference function

The difference function is an effective algorithm to detect the f0 of a signal. Here f0 shouldbe at the global minimum of the difference function. This minimum is although difficult to

1Pitch analyzer v1.1 - written by Franz Josef Elmer

II

Page 3: Pitch detection - Aalborg Universitetkom.aau.dk/group/04gr742/pdf/pitch_worksheet.pdf · 2004-12-08 · Pitch detection 1 In this chapter will we describe a pitch detector or pitch

1.2. ON-THE-FLY PITCH ESTIMATION

locate, because the difference function of a random signal decrease with a slope of a straightline, as shown on figure 1.1.

0 50 100 150 200 250 300 3500

5

10

15

20

25

30

35

40

45Differencen function of the signal

Samples difference index

Am

plitu

de

All pass signalBand limited signal

Figure 1.1: Shows the difference function taken of both the all-passed signaland the band limited signal. The figure shows that both signals approximatelyconverge with a straight line. Also notice the difference between the localminima. The local minimas of the band limited signal are generally muchsmaller than the all pass signal.

To locate the actual global minima, the difference function output needs to be unified. So ifthe time delay τ is equal to the period of the pitch (P), the converging slope (see figure 1.1)can be seen as a straight line of a unity slope. Correlating xt+P with xt will result in a globalmaximum at the pitch P. Therefore by considering the covariance matrix of the randomvector X = [x j xt + τ]T we get that:

cx(τ) = E[(x−m)(x−m)T ] (1.3)

Considering this as WSS, two orthogonal eigenvectors can be found from the covariancematrix, by decomposing it using diagonalization:

Cx(τ) =

[

1 −11 1

][

λ1(τ) 00 λ2(τ)

][

1 1−1 1

]

=

[

λ1(τ)+ λ2(τ) λ1(τ)−λ2(τ)λ1(τ)−λ2(τ) λ1(τ)+ λ2(τ)

]

(1.4)

By assuming uniform distributions and equal means, the eigenvalues can be written likethis:

λ1(τ) ∝W−τ

∑j=1

(x j + x j+τ)2 (1.5)

λ2(τ) ∝W−τ

∑j=1

(x j − x j+τ)2 (1.6)

III

Page 4: Pitch detection - Aalborg Universitetkom.aau.dk/group/04gr742/pdf/pitch_worksheet.pdf · 2004-12-08 · Pitch detection 1 In this chapter will we describe a pitch detector or pitch

CHAPTER 1. PITCH DETECTION

Because the eigen-values and vectors describes new basis for the covariance matrix, a uni-fied difference function can be derived. Thinking of eigenvector 1 and eigenvalue 1 being a1D basis of the 2D space, the corresponding eigenvector 2 and eigenvalue 2 can be thoughtof as the error off this basis and vice versa. Therefore the unified difference function can bewritten as the following:

d′t(τ) =

λ2(τ)λ1(τ)+ λ2(τ)

=

W−τ∑j=1

(x j − x j+r)2

2 ·W−τ∑j=1

x2j + x2

j+r

(1.7)

The unified difference function is plotted figure 1.2.

0 50 100 150 200 250 300 3500

0.2

0.4

0.6

0.8

1Unified differencen function of the signal

Am

plitu

de

0 50 100 150 200 250 300 3500

0.2

0.4

0.6

0.8

1

Samples difference index

Am

plitu

de

All pass signal

Band limited signal

Figure 1.2: Here the two signals are unified. Now the difference min the localminima are really shown.

By Yin the the pitch is found at the global minima of the unified difference function (d′

t ). Thisis found as a sample index and is equal to P. f0 is when calculated by the function:

f0 =f sP

(1.8)

, where fs is the sampling frequency (fs).

IV

Page 5: Pitch detection - Aalborg Universitetkom.aau.dk/group/04gr742/pdf/pitch_worksheet.pdf · 2004-12-08 · Pitch detection 1 In this chapter will we describe a pitch detector or pitch

1.2. ON-THE-FLY PITCH ESTIMATION

1.2.3 Contradiction in results

By taking further notice to the two figures in figure 1.2, we find a contradiction. The twosignals doesn’t seem to have the same global minimum, eventhough f0 of the two signalsshould be equal. In the article [1] it does not say anything about "On-The-Fly" was devel-oped for these types of band limited signals. But it is developed to find the actual pitch ofa signal, due to the fact, that the authors’ claim that the global minimum of the differencefunction is not always equal to the pitch period. This is actually the case for our band limitedsignal, where the situation is as follows:

• The all-pass signal has a global minimum around index 95, which is equal to a pitchof about 170 Hz.

• The band limited signal has a global minimum around index 47, which is equal to apitch of about 340 Hz.

Surely one of these results must be wrong. Considering the test signal originates from aman, with a pitch around 150 Hz, the result of the band limited signal must be wrong.

1.2.4 Idea of the "On-The-Fly"

The fundamental idea of the "On-The-Fly" algorithm is to find other candidates for f0. This isdone by locating a number of the smallest minimas in d

t , where the overall global minimum,is still by offset selected to as f0. Notice that at all time, the global minimum is still thoughtof as being some harmonic of the actual pitch. If the global minimum is not equal to f0, theactual pitch should be found within some limits of this minimum, so it would be a candidateof a harmonic frequency to the global minima.

The surrounding local minima are therefore to be tested in 2 ways:

1. A minimum differs only by a significant amount (threshold stage 1).A small threshold (ATh1) is to be used in testing the difference in the minimas size. Ifthe difference from the found pitch-sample is within the limit, the one with the smallertime period is considered the actual pitch. This has to be accompanied with anotherlarger threshold (TTh1) which is used to test the cohesion between the candidate har-monic of the pitch minima and the new candidates. If the minimum does not lie withthe threshold of the harmonic, this sample should be discarded.

2. A minima differs only a relatively significant amount (threshold stage 2).A larger threshold (ATh2) is to be used in testing the difference in the minimas size. Ifthe difference from the found pitch-sample is within the limit, the one with the smallertime period is considered the actual pitch. This has to be accompanied with anothersmaller threshold (TTh2) which is used to test the cohesion between the candidateharmonic of the pitch minima and the new candidates. If the minimum does not liewith the threshold of the harmonic, this sample should be discarded.

V

Page 6: Pitch detection - Aalborg Universitetkom.aau.dk/group/04gr742/pdf/pitch_worksheet.pdf · 2004-12-08 · Pitch detection 1 In this chapter will we describe a pitch detector or pitch

CHAPTER 1. PITCH DETECTION

Figure 1.3 is illustrating the 2 test conditions where the threshold stages are marked.

0 50 100 150 200 250 300 3500

0.2

0.4

0.6

0.8

1Unified differencen function of the signal with smallest minimas

Am

plit

ude

All pass signal

Minimas

0 50 100 150 200 250 300 3500

0.2

0.4

0.6

0.8

1

Samples difference index

Am

plit

ude

Band limited signal

Minimas

TTh1 ATh1

TTh2 ATh2

Figure 1.3: Same as figure 1.2, though here with the 5 smallest minimasmarked with green lines. The threshold stage 1, ATh1 and TTh1, is shownin the upper figure, while threshold stage 2, ATh2 and TTh2, is shown in thelower figure. Of course both thresholds need to be applied to each signal whentesting

1.3 The On-The-Fly algorithm and implemantation

The algorithm to On-The-Fly is relatively easy and is in figure 1.4 shown in a block diagram.

Calculate the unified difference function (d’

t )

of a frame.

Locate a number (typically 5) of the

smallest minima of d’ t .

Do parabolic interpolation of the

possible candidates.

Do threshold stage 1, with the thresholds ATh1

and TTh1.

Do threshold stage 2, with the thresholds ATh2

and TTh2.

Result of threshold stage 2 is the found pitch

(fundamental frequency)

A B C

D E F

Figure 1.4: The diagram of the On-The-Fly algorithm

A Here d′

t is calculated.

VI

Page 7: Pitch detection - Aalborg Universitetkom.aau.dk/group/04gr742/pdf/pitch_worksheet.pdf · 2004-12-08 · Pitch detection 1 In this chapter will we describe a pitch detector or pitch

1.3. THE ON-THE-FLY ALGORITHM AND IMPLEMANTATION

B Locating a number of the smallest local minima of d′

t can be a challenging task. First aminimum must be defined. But can 2 local minima be separated with 1 or only a fewsamples? If not, where should the limit be?

An example shows that surely minimas with only a few samples between them wouldnot be harmonics of each other. Typical speech signals are samples at fs=16000 Hz.With the maximal pitch of 500 Hz, there would still have to be minimum (16000Hz/500Hz)32 samples between minimas of the d

t . But still a local minimum might occur outsidethe harmonic and thereby eliminate the possibility of locating the actual harmonic tobe found.

In the implementation, an algorithm to find a number of minimas of the d′

t was devel-oped. It works in the following way; First it locates all minimas of d

t . Then any minimalarger that 0.4 is deleted from data, to assure the minimas could actually be the pitch.Next the global minimum among all the minima is found and stored in ti, where ti isthe index to the minima. If any minima exists within 10 samples of the found globalminimum, these minimas are deleted from data along with the found minima, and theprocedure is run again until there is no more data or the number of minimas wantedhas been found.The matlab code for this is found in section 1.8.2.

C The parabolic interpolation is needed to reduce quantization error in the candidatepitch values. An example, say fs=16000 Hz, if a minima is the pitch and located atsample 60, it equals a pitch of 267Hz. If the minima were found at sample 59, it wouldequal a pitch of 271Hz. No pitch is possible to get between 267-271Hz. Further thiserror gets higher, as the pitch rises. The parabolic interpolation solves this problem,and is implemented as a simple curve fitting problem.The matlab code for this is found in section 1.8.3.

D Among the candidate minimas (ti), we need to know if there are a harmonics of theglobal minimum is said to be the pitch. This is decided from the formula:

N −tgti

< T T h1 , where N={2,3, ..,6} (1.9)

, where tg is the index of the overall global minima. The threshold TTh1 here specifieshow much a harmonic can maximal differs from the global minimum. Those ti forwhich at least one of the possible N-values satisfy the threshold are then tested againstthreshold ATh1. ATh1 is a threshold of the difference in the minima value between tgand ti. If any ti are within the 2 threshold values, tg is changed with the one, whichhave the smaller timeperiod (if not tg itself). In this threshold set, TTh1 is set to 0.2 andATh1 is set to 0.07.The matlab code for this is found in section 1.8.4.

E Bullet D and E are exactly the same, just with 2 different threshold sets. In this thresh-old set, TTh2 is set to 0.05 and ATh2 is set to 0.2.

The fully implemented "On-The-Fly" algorithm can be seen in section 1.8.2.

VII

Page 8: Pitch detection - Aalborg Universitetkom.aau.dk/group/04gr742/pdf/pitch_worksheet.pdf · 2004-12-08 · Pitch detection 1 In this chapter will we describe a pitch detector or pitch

CHAPTER 1. PITCH DETECTION

1.3.1 Testresults of "On-The-Fly"

Though this paper a single frame has been used to show how the algorithm works. In figure1.5 we see the final result of the algorithm on this frame.

0 50 100 150 200 250 300 3500

0.2

0.4

0.6

0.8

1Pitch found in the difference function by On−The−Fly

Am

plitu

de

All pass signalf0 (pitch=165.84Hz)

0 50 100 150 200 250 300 3500

0.2

0.4

0.6

0.8

1

Samples difference index

Am

plitu

de

Band limited signalf0 (pitch=338.51Hz)

Figure 1.5: The pitch found in a single frame by the "On The Fly"-algorithm.

The ’o’ in each figure indicates where the global minima of d′

t (offset pitch)were found.

The figure clearly shows that the pitch is not the same in this frame. The pitch should bearound 150Hz, which is the result from the original signal. From the band passed signal apitch was not found anywhere near the actual pitch of the frame.This could although be a insecurity in the algorithm. So in figure 1.6 150 frame of the signalare shown.

VIII

Page 9: Pitch detection - Aalborg Universitetkom.aau.dk/group/04gr742/pdf/pitch_worksheet.pdf · 2004-12-08 · Pitch detection 1 In this chapter will we describe a pitch detector or pitch

1.4. CEPSTRAL BASED PITCH DETECTOR

0 0.5 1 1.5 2 2.5

x 104

−1

0

1On−The−Fly pitch estimation of original signal

Am

plitu

de o

f sig

nal

0 0.5 1 1.5 2 2.5

x 104

−1

0

1On−The−Fly pitch estimation of bandpassed signal

Am

plitu

de o

f sig

nal

0 0.5 1 1.5 2 2.5

x 104

−500

−250

0

250

500

Fre

quen

cy o

f pitc

h

Org. speech signalPitch frequency

0 0.5 1 1.5 2 2.5

x 104

−500

−250

0

250

500

Fre

quen

cy o

f pitc

h

Sample index

Bandpassed speech signalPitch frequency

Figure 1.6: The pitch found by the "On The Fly"-algorithm.

In this figure we see that the found pitch from the original speech signal is far more conse-quente, than the one found in the band passed signal. From this and the fact that the signal isman with a pitch around 150Hz, we can conclude that "On The Fly" may be a good at detect-ing pitch in full frequencyranged signals, but not in band passed signals2. This algorithm istherefore not useful in the project "Bandwidth expansion of narrowband signals".

1.4 Cepstral based pitch detector

The theory behind the cepstral detector is that the fourier transform of a pitched signal usu-ally have a number of regularly peaks, who is representing the harmonic spectrum. Whenlog magnitude of a spectrum is taken, these peaks are reduced (their amplitude brought intoa usable scale). The result is a periodic waveform in the frequency domain, where the pe-riod is related to the fundamental frequency of the original signal. This means that a fouriertransformation of this waveform has a peak representing the fundamental frequency.This method has only shortly been tested in matlab, with a non satisfying result. Due to thetest, and time limitation, this pitch detector will not be explained further.

2A short description the algorithms flaws in detecting pitch in band limited signal can be found in section1.8.1

IX

Page 10: Pitch detection - Aalborg Universitetkom.aau.dk/group/04gr742/pdf/pitch_worksheet.pdf · 2004-12-08 · Pitch detection 1 In this chapter will we describe a pitch detector or pitch

CHAPTER 1. PITCH DETECTION

1.5 Pitch detection using the autocorrelation

When calculating the autocorrelation you get a lot of information of the speech signal. Oneinformation is the pitch period (the fundamental frequency). To make the speech signalclosely approximate a periodic impulse train we must use some kind of spectrum flattening.To do this we have chosen to use "Center clipping spectrum flattener". After the Centerclipping the autocorrelation is calculated and the fundamental frequency is extracted. Anoverview of the system is shown in figure 1.7.

Speech Frame x(n)

LPF

Fc = 900 Hz

Cl = 30% of max{x(n)}

Center clip x(n)

Compare AC Rn(k) for

Fs/350 <= K <= Fs/80

Compute

R = max{Rn(k)}

Compute

R = Rn(0)

Test

R >= 30% of Rn(0)

Pitch period

f0 = K + Fs/350

Frame unvoiced

output = 0

yes No

Figure 1.7: Overview of the autocorrelation pitch detector

1.5.1 Center clipping spectrum flattener

The meaning of center clipping spectrum flattener is to make a threshold (cl) so only thehigh and low pitches are saved. I our case (cl) is chosen to 30% of the max amplitude. thecenter clipping of a speech signal is shown in figure 1.8 on page XI

X

Page 11: Pitch detection - Aalborg Universitetkom.aau.dk/group/04gr742/pdf/pitch_worksheet.pdf · 2004-12-08 · Pitch detection 1 In this chapter will we describe a pitch detector or pitch

1.5. PITCH DETECTION USING THE AUTOCORRELATION

+CL

-CL

Amax

Input speech

Time

Center clipped speech

Time

Figure 1.8: Center clipping

The center clipping is made due to the following definition:

• CL =% of Amax (e.g. 30%)

• if x(n) > CL then y(n)=x(n)- CL

• if x(n) <= CLthen y(n) = 0

1.5.2 Autocorrelation

The autocorrelation is a correlation of a variable with itself over time. The autocorrelationcan be calculated using the equation shown in equation 1.10

R(k) =N.k−1

∑m=0

x(m)x(m + k) (1.10)

In this project we have used the Matlab function "xcorr" to calculate the autocorrelation.

XI

Page 12: Pitch detection - Aalborg Universitetkom.aau.dk/group/04gr742/pdf/pitch_worksheet.pdf · 2004-12-08 · Pitch detection 1 In this chapter will we describe a pitch detector or pitch

CHAPTER 1. PITCH DETECTION

1.5.3 Median smoothing

When we have calculated the pitch detection, we can see that there are some outliers. Tominimize these outliers, we have chosen to use a median smooth (Median Filter). The effectof this median filter is shown in figure 1.9. The window length (n) have been tested withn=3 and n=5. The window length have been chosen to n=5, because the pitch detector theneliminate the outliers in the noisy part. One of the problem using the Median filter with n=5is that it smooths over five windows. I our project that means that we will have to make adelay in our system for 50 ms. The delay is calculated as:

number o f windows ∗ the length o f the window ∗ window overlap = 5∗20ms∗0,5 = 50ms(1.11)

In our project we have chosen not to deal with this problem, cause we are not going toimplement it as an real time system. If we should implement it as a real time system, wewould have to predict the pitch five windows ahead.

0 0.5 1 1.5 2 2.5

x 104

−1

−0.5

0

0.5

1Pitch detection with median smooth n=3

Am

plitu

de o

f sig

nal

0 0.5 1 1.5 2 2.5

x 104

−500

−250

0

250

500

Fre

quen

cy o

f pitc

h0 0.5 1 1.5 2 2.5

x 104

−500

−250

0

250

500

Speech signalmedian smoothpitch detection

0 0.5 1 1.5 2 2.5

x 104

−1

−0.5

0

0.5

1with median smooth n=5

Am

plitu

de o

f sig

nal

0 0.5 1 1.5 2 2.5

x 104

−500

−250

0

250

500

Fre

quen

cy o

f pitc

h

0 0.5 1 1.5 2 2.5

x 104

−500

−250

0

250

500

Speech signalmedian smoothpitch detection

0 0.5 1 1.5 2 2.5

x 104

−1

−0.5

0

0.5

1with median smooth n=7

Am

plitu

de o

f sig

nal

0 0.5 1 1.5 2 2.5

x 104

−500

−250

0

250

500

Fre

quen

cy o

f pitc

h

0 0.5 1 1.5 2 2.5

x 104

−500

−250

0

250

500

Sample index

Speech signalmedian smoothpitch detection

Figure 1.9: Median Smoothing

If the window length is set to n=7, shown in figure 1.9, the result seen perfect. The prob-lem with such a long window is that the algorithm sometime finds a wrong fundamentalfrequency when there are "speech pause".

XII

Page 13: Pitch detection - Aalborg Universitetkom.aau.dk/group/04gr742/pdf/pitch_worksheet.pdf · 2004-12-08 · Pitch detection 1 In this chapter will we describe a pitch detector or pitch

1.6. OVERALL TEST AND RESULTS

1.5.4 Test

The algorithm has been tested against result from the program "pitch analyzer 1.1", and hasshown that the auto correlation pitch detector is very accurate.

1.6 Overall test and results

Thought this paper a speech signal has been used as a testsignal. But since pitch of a randomtestsignal isn’t known, the algorithms has also been tested up againt signals with a knownpitch. Figure 1.10 shows the 2 implemented pitch detectors, estimating the pitch of a linearswept frequency signal, which spectogram is shown in the top figure.

Time

Fre

quen

cy

The spectogram of a Linear swept−frequency signal

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.90

1000

2000

3000

4000

5000

6000

7000

8000

0 2000 4000 6000 8000 10000 12000 14000 16000

−1

−0.5

0

0.5

1

Samples

Am

plitu

de

0 2000 4000 6000 8000 10000 12000 14000 16000−500

0

500

Pitc

h fr

eque

ncy

(Hz)

Pitch estimation of a Linear swept−frequency signal (80−1000Hz)

Linear swept−frequency signal (Amplitude)OnTheFly (pitch freq)Fast−Auto−corr (pitch freq)Fast−Auto−corr shifted 5 windows <−− (pitch freq)Actual pitch of signal (pitch freq)

Figure 1.10: The top figure shows a spectogram of the linear swept frequencysignal, starting at 80 Hz and ending in 1000 Hz. The buttom figure showsthe 2 algorithms estimation of the pitch of the linear swept frequency signal.NOTICE the pitch detectors are constructed to detect the pitch in the interval80-350 Hz.

As the figure shows, both methods are equally good to estimate the pitch within their oper-ation limits 80-350 Hz. Outside these limits the figure shows that both methods quite poorto floor the pitch frequency, which would be the preferred outcome.Further the ’Fast auto-correlation’ method is delayed 5 frames (50ms), because it uses amedian-smooth over 5 estimated pitches. This can’t be compensated for and the only so-

XIII

Page 14: Pitch detection - Aalborg Universitetkom.aau.dk/group/04gr742/pdf/pitch_worksheet.pdf · 2004-12-08 · Pitch detection 1 In this chapter will we describe a pitch detector or pitch

CHAPTER 1. PITCH DETECTION

lution to this, is therefore to remove the smoothing, which will cause more spikes (see fig-ure 1.9 on page XII).

Another important feature are the influence of noise in signals. This is tested with a 200 Hzsinus signal, with a signal/noise ratio shifting from 100% to 0% in a timeinterval, shown infigure 1.11.

0 2000 4000 6000 8000 10000 12000 14000 16000

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

Samples

Am

plitu

de

0 2000 4000 6000 8000 10000 12000 14000 16000−500

−400

−300

−200

−100

0

100

200

300

400

500

Pitc

h fr

eque

ncy

(Hz)

Pitch estimation of a 200 Hz sinus signal with decreasing signal/noise ratio

Signal with noise (Amplitude)Signal/noise (%)OnTheFly (pitch freq)Fast−Auto−corr (pitch freq)Fast−Auto−corr shifted 5 windows <−− (pitch freq)

Figure 1.11: Test of the influence of noise in a signal. Here the signal/noiseratio reises with time.

The figure shows that the Fast-auto-correlation method is very robust against noisy signals,while the On-The-Fly breaks down very early in the test. Because speech signals aren’t justa single sinus signal, such signals may more or less be similar to a noisy sinus. Therefore theFast-auto-correlation would be preferred.

1.7 Conclusion

Between the 2 implemented methods of pitch detection, only the autocorrelation method areusable in the project "Bandwidth expansion of narrowband signals". It was shown (figure1.6) that the "On-The-Fly" algorithm could not estimate a reliable pitch from band limitedsignals, which is a must in the project. The Fast-auto-correlation method, on the contrary,had no problem with either type of signal and is robust againt noisy signals. The preferredpitch detector is therefore the Fast-auto-correlation method.

XIV

Page 15: Pitch detection - Aalborg Universitetkom.aau.dk/group/04gr742/pdf/pitch_worksheet.pdf · 2004-12-08 · Pitch detection 1 In this chapter will we describe a pitch detector or pitch

1.8. APPENDIX

1.8 Appendix

1.8.1 On-The-Fly’s flaws when used on band limited signals

Tests of On-The-Fly algorithm showed that it was unable to estimate an accurate pitch of aband limited signal. So why is that?The problem is with the lower cut-off frequency of the band pass filter. The On-The-Flyalgorithm uses these lower frequencies harmonics to estimate the pitch. Because it is timebased and only used the minimas of the difference equation, the higher order harmonicsget more influence, than the lower order harmonics (see figure 1.2). Next the key idea is tobe the threshold stages is to select the candidate harmonic with the smallest time period,resulting in a higher pitch. Comparing this key idea with the figure, the algorithm will mostlikely select a higher pitch frequency, than it should.

1.8.2 Matlab implementation of the "On-The-Fly" algorithm

1 % T h i s f u n c t i o n c a l c u l a t e s t h e f u n d a m e n t a l f r e q u e n c y o f a f rame o f da ta2 % w i t h t h e OnTheFly a l g o r i t h m .34 % −− I n p u t −−5 % data : Frame o f da ta f o r w i t h p i t c h i s t o be c a l c u l a t e d6 % f s : Sampl ing f r e q u e n c y o f s i g n a l78 % −− Outpu t −−9 % f 0 : The c a l c u l a t e d f u n d a m e n t a l f r e q u e n c y o f t h e i n p u t da ta

10 % 0 i s r e t u r n e d i f no p i t c h i s found1112 f u n c t i o n [ f 0 ] = OnTheFly ( da t a , f s )1314 f r am eLeng th = l e n g t h ( d a t a ) ; % Frameleng th15 c u r v e F i t S a m p l e s = 3 ; % Samples from minima t o l e f t and r i g h t1617 % C a l c u l a t e d i f f e r e n c e f u n c t i o n18 d = zeros ( 1 , f rameLength −1) ;19 f o r i =1 : f rameLength −120 d ( i ) = sum ( ( d a t a ( 1 : f rameLength −i ) − d a t a ( i +1 : f r am eLeng th ) ) . ^ 2 ) ;21 d ( i ) = d ( i ) / ( 2∗ sum ( d a t a ( 1 : f rameLength −i ) . ^ 2 + d a t a ( i +1 : f r am eLeng th ) . ^ 2 ) ) ;22 end2324 % Find g l o b a l minima and 4 l o c a l minima ( s m a l l e s t minima ) w i t h minimum 1 025 % samples be tween minimas . Minimas can have a maximum v a l u e o f 0 . 426 [ minimas minCount ] = f indm in ( d , 5 , 1 0 , 0 . 4 ) ;27 %mins ( end : −1 : 2 , : ) = mins ( 2 : end , : ) ;28 nMins = s i z e ( minimas ) ;2930 % I f t o many l o c a l minimas are l o c a t e d by f indmin , t h e s i g n a l i s l i k e l y t o31 % be no i se , and t h e r e f o r e n o t t o be p r o c e s s e d . Al so i f no minimas has been32 % found , t h e da ta i s p r o p e r l y unvo i ced , and i s n o t t o be p r o c e s s e d .33 i f minCount <= 20 && nMins ( 1 ) > 034 % M i n i m i z a t i o n o f q u a t i z a t i o n e r r o r w i t h p a r a b l e c u r v e f i t t i n g f o r each minima35 f o r i =1 : nMins ( 1 )36 % x−c o o d i n a t e s t o make c u r v e f i t t i n g from37 x = ( minimas ( i , 1 )−c u r v e F i t S a m p l e s ) : ( minimas ( i , 1 ) + c u r v e F i t S a m p l e s ) ;3839 % Removes c o o r d i n a t e s o f x , which i s n o t w i t h i n t h e da ta f rame40 x ( f i n d ( x < 1 | x > ( f rameLength −1) ) ) = [ ] ;41 y = d ( x ) ; % Correspond ing v a l u e s o f x−c o o r d i n a t e s42 [ minimas ( i , 1 ) minimas ( i , 2 ) ] = p a r r e g ( x , y ) ; % C u r v e f i t t i n g and t o p p o i n t e x t r a c t i o n43 end4445 % T h r e s h o l d i n g s t a g e 146 TTh1 = 0 . 2 ; ATh1 = 0 . 0 7 ; % T h r e s h o l d s e t t i n g s47 [ t g mins ] = f i n d f 0 ( TTh1 , ATh1 , minimas ) ;4849 % T h r e s h o l d i n g s t a g e 250 TTh2 = 0 . 0 5 ; ATh2 = 0 . 2 ; % T h r e s h o l d s e t t i n g s51 [ t g mins ] = f i n d f 0 ( TTh2 , ATh2 , [ t g ; minimas ] ) ;5253 f 0 = f s / t g ( 1 ) ; % C a l c u l a t i o n o f t h e f u n d a m e n t a l f r e q u e n c y o f t h e f rame54 e l s e55 f 0 = 0 ; % Zero i f f rame i s u n v o i c e d or n o i s e !

XV

Page 16: Pitch detection - Aalborg Universitetkom.aau.dk/group/04gr742/pdf/pitch_worksheet.pdf · 2004-12-08 · Pitch detection 1 In this chapter will we describe a pitch detector or pitch

CHAPTER 1. PITCH DETECTION

56 end

Locating minimas

Main function in locating minimas of data:

1 % F u n c t i o n l o c a t e s nMins o f t h e s m a l l e s t minimas o f data , where minimas2 % have t o be s e p a r a t e d w i t h a t l e a s t spred samples . Al so a minima can ’ t3 % e x c i d e maxMinValue !45 % −− I n p u t −−6 % data : Data t o l o c a t e minimas i n7 % nMins : # o f minimas t o f i n d i n da ta ( i f p o s s i b l e )8 % spred : Minimum # o f samples be tween minimas l o c a t e d9 % maxMinValue : Maximum v a l u e o f a minima

1011 % −− Outpu t −−12 % t i : A n−by −2 m a t r i x where column 1 i s i n d e x e s o f minima and13 % column 2 i s v a l u e s o f minima .14 % minCount : T o t a l number o f found minima i n da ta1516 f u n c t i o n [ t i minCount ] = f indm in ( da t a , nMins , sp r ed , maxMinValue )1718 [ n m] = s i z e ( d a t a ) ; % F l i p s v e c t o r c o r r e c t !19 i f m = = 120 d a t a = da t a ’ ;21 end2223 minimas = l o c a l m i n i m a ( d a t a ) ’ ; % L o c a t e s ALL minimas i n da ta as i n d e x e s24 minimas ( : , 2 ) = d a t a ( minimas ( : , 1 ) ) ’ ; % Extends mins t o ho ld d a t a v a l u e s t o i n d e x e s o f minimas25 minCount = l e n g t h ( minimas ) ; % T o t a l number o f l o c a l mins found2627 i n d e x e s = f i n d ( minimas ( : , 2 ) >= maxMinValue ) ;% Find i n d e x e s o f minima t h a t e x c i d e s maxMinValue28 minimas ( indexes , : ) = [ ] ; % D e l e t e s minimas t h a t e x c i d e maxMinValue29 [m n ] = s i z e ( minimas ) ; % New s i z e o f mins m a t r i x3031 i f m > = 1 % Only runs i f da ta are a v a i l a b l e32 t i = zeros ( nMins , 2 ) ; % A l l o c a t e s number o f s m a l l e s t minimas wanted33 f o r i =1 : nMins34 [ minima mIndex ] = min ( minimas ( : , 2 ) ) ;% Finds g l o b a l min among minimas35 t i ( i , : ) = minimas ( mIndex , : ) ; % Copies found g l o b a l minima i n t r o new m a t r i c e36 i n d e x e s = f i n d ( abs ( minimas ( mIndex , 1 )−minimas ( : , 1 ) ) < s p r e d ) ; % Finds i n d e x e s o f minimas w i t h i n ’ spred ’ samples

o f f ound minima37 minimas ( [ mIndex indexes ’ ] , : ) = [ ] ; % D e l e t e s minima w i t h i n ’ spred ’ samples o f f ound minima38 i f l e n g t h ( minimas ) < = 0 % Break loop i f no more da ta39 t i ( i +1 : end , : ) = [ ] ; % D e l e t e s used a l l o c a t e d space40 break ;41 end42 end43 e l s e44 t i = zeros ( 0 , 2 ) ; % A l l o c a t e s a 0−by −2 m a t r i x ( t o avo id r u n t i m e e r r o r )45 end

Sub function that locates all minimas of data:1 % mins = LocalMinima ( x )2 %3 % f i n d s p o s i t i o n s o f a l l s t r i c t l o c a l minima i n i n p u t array45 f u n c t i o n mins = LocalMinima ( x )67 n P o i n t s = l e n g t h ( x ) ;8 Middle = x ( 2 : ( n P o i n t s −1) ) ;9 L e f t = x ( 1 : ( n P o i n t s −2) ) ;

10 Righ t = x ( 3 : n P o i n t s ) ;11 mins = 1 + f i n d ( Middle < L e f t & Middle < Righ t ) ;1213 % W r i t t e n by Kenneth D . H a r r i s14 % T h i s s o f t w a r e i s r e l e a s e d under t h e GNU GPL15 % www . gnu . org / c o p y l e f t / g p l . h tml16 % any comments , or i f you make any e x t e n s i o n s17 % l e t me know a t harris@axon . r u t g e r s . edu

XVI

Page 17: Pitch detection - Aalborg Universitetkom.aau.dk/group/04gr742/pdf/pitch_worksheet.pdf · 2004-12-08 · Pitch detection 1 In this chapter will we describe a pitch detector or pitch

1.8. APPENDIX

1.8.3 Curvefitting function

1 % T h i s f u n c t i o n c a l c u l a t e s t h e t o p p o i n t o f a c u r v e f i t t e d p a r a b l e .2 f u n c t i o n [ xx yy ] = p a r r e g ( x , y )3 %x : x−k o o r d i n a t e s t o do c u r v e f i t t i n g around4 %y : y−k o o r d i n a t e s t o do c u r v e f i t t i n g around56 [m n ] = s i z e ( y ) ; % V e c t o r x has t o s t a n d ( a m−by −1 v e c t o r )7 i f n ~ = 18 y = y ’ ;9 end

10 [m n ] = s i z e ( x ) ; % V e c t o r y has t o s t a n d ( a m−by −1 v e c t o r )11 i f n ~ = 112 x = x ’ ;13 end1415 % Curve f i t t i n g i n t h e form Ax=b16 X = [ ones ( s i z e ( x ) ) x x . ^ 2 ] ; % X v a l u e s17 a = X\ y ; % L e a s t s q a u r e s s o l u t i o n o f i n c l i n a t i o n18 xx = − a ( 2 ) / ( 2 ∗ a ( 3 ) ) ; % C a l c u l a t e s x o f t o p p o i n t o f p a r a b l e19 yy = [ 1 xx xx . ^ 2 ] ∗ a ; % C a l c u l a t e s t o f t o p p o i n t o f p a r a b l e

1.8.4 Threshold stage function

1 % F u n c t i o n t h a t l o c a t e s t h e f u n d a m e n t a l f r e q u e n c y o f t i a c c o r d i n g t o i n p u t2 % t h r e s h o l d s TTh and ATh .34 % −− I n p u t −−5 % TTh : Harmonic t h r e s h o l d6 % ATh : Ampl i t ude t h r e s h o l d7 % t i : A n−by −2 m a t r i x [ index_o f_min ima , va lue_o f_min ima ]8 % Row 1 MUST be t h e c u r r e n t p i c k e d f u n d a m e n t a l f r e q u e n c y minima9

10 % −− Outpu t −−11 % t g : New minima o f t h e f u n d a m e n t a l f r e q u e n c y12 % t i : Remaining minimas1314 f u n c t i o n [ tg , t i ] = f i n d f 0 ( TTh , ATh , t i )15 N = [ 2 3 4 5 6 ] ; % T h r e s h o l d c o n s t a n t s t o t e s t f o r harmon ics1617 t g = t i ( 1 , : ) ; % E x t r a c t c u r r e n t f u n d a m e n t a l f r e q u e n c y o f minimas18 t i ( 1 , : ) = [ ] ; % D e l e t e s t h e c u r r e n t f u n d a m e n t a l f r e q u e n c y o f minimas from t19 t i = s o r t r o w s ( t i , 1 ) ; % S o r t s rows a c c o r d i n g t o i n d e x e s ( a s c e n d i n g )20 t i = t i ( end : −1 : 1 , : ) ; % R e v e r s e s t i . . . s o r t must be d e s e n d i n g21 i ndex = [ ] ; % Must t o a l l o c a t e d t o avo id r u n t i m e warn ings2223 % T e s t f o r c a n d i d a t e s o f harmonic minimas t o t h e f u n d a m e n t a l f r e q u e n c y minima24 f o r i =1 : l e n g t h (N) % Run e l e m e n t s i n n − t i m e s25 x = t g ( 1 ) . / t i ( : , 1 ) ; % C a l c u l a t e s t g / t i26 newI ndexes = f i n d ( abs (N( i )−x ) < TTh ) ; % Finds i n d e x e s o f t i , t h a t s a t i f i e s t h e t h r e s h o l d TTh27 i f l e n g t h ( newI ndexes ) > 028 i ndex = [ index ; newI ndexes ] ; % Adds t h e found i n d e x t o a g e n e r a l i n d e x v a r i a b l e29 end30 end3132 % T e s t among t h e c a n d i d a t e s o f a minimas t o t h e f u n d a m e n t a l f r e q u e n c y minima33 % T e s t t h e c a n d i d a t e s i f t h e t h r e d s h o l d i s w i t h i n t h e t h r e s h o l d ATh34 newTgIndex = 0 ; % I ndex among i n d e x o f new f u n d a m e n t a l f r e q u e n c y minima35 f o r i =1 : l e n g t h ( i ndex ) % Runs o n l y among t h e harmonic c a n d i d a t e s36 i f t i ( i ndex ( i ) , 2 ) − t g ( 2 ) < ATh % T e s t i f ATh t h r e s h o l d i f s a t i f i e d37 newTgIndex = i ; % Save i n d e x o f ’ i ndex ’ o f new f u n d a m e n t a l f r e q u e n c y minima38 end39 end40 i f newTgIndex > 0 % I f a new f u n d a m e n t a l f r e q u e n c y was found , s e t new t g41 tgTemp = t i ( i ndex ( newTgIndex ) , : ) ; % Makes copy o f new f u n d a m e n t a l f r e q u e n c y minima42 t i ( i ndex ( i ) , : ) = t g ; % Copies o l d f u n d a m e n t a l f r e q u e n c y minima i n t o t i43 t g = tgTemp ; % Copies new f u n d a m e n t a l f r e q u e n c y minima i n t o t g44 t i = s o r t r o w s ( t i , 1 ) ; % S o r t s rows a c c o r d i n g t o i n d e x e s ( a s c e n d i n g )45 t i = t i ( end : −1 : 1 , : ) ; % R e v e r s e s t i . . . s o r t must be d e s e n d i n g46 end

XVII

Page 18: Pitch detection - Aalborg Universitetkom.aau.dk/group/04gr742/pdf/pitch_worksheet.pdf · 2004-12-08 · Pitch detection 1 In this chapter will we describe a pitch detector or pitch

CHAPTER 1. PITCH DETECTION

1.9 Litterature list

[1] Saurabh Sood and Ashok Krishnamurthy: A Robust On-The Fly Pitch(OTFP) Estimation Algorithm. The Ohio State University, Columbus OH.

[2] A. D. Cheveigne and H. Kawahara. Yin: A fundamental frequencyestimator for speech and music. Journal of the Acoustical Society ofAmerica, 111(4), 2002.

[3] S. Chowdhury, A. K. Datta, and B. B. Chaudhuri. Pitch detectionalgorithm using state phase analysis. Journal of the AcousticalSociety of India, 28(1), Jan. 2000.

[4] http://www.cnmat.berkeley.edu/~tristan/Report/node4.html

[5] http://www.ee.ucla.edu/~ingrid/ee213a/speech/vlad_present.pdf

XVIII