

B. Dumitrescu is on leave from the Department of Automatic Control and Computer Science, "Politehnica" University of Bucharest.

*Corresponding author. Tel.: +358-3-3653911; fax: +358-3-3653857. E-mail addresses: bogdand@cs.tut.fi (B. Dumitrescu), tabus@cs.tut.fi (I. Tăbuş).

Signal Processing 81 (2001) 2019–2031

Predictive LSF computation

Bogdan Dumitrescu, Ioan Tăbuş*
Signal Processing Laboratory, Tampere University of Technology, P.O. Box 553, SF-33101 Tampere, Finland

Received 3 February 2000; received in revised form 11 March 2001

Abstract

This paper deals with a key component of short-time spectrum coding in a speech coding system, namely the computation of the line spectral frequencies (LSF). The method consists in: (1) the use of predictions of the LSF as starting points for the search of the current LSF, (2) the fast bracketing of the LSF, restricted to a grid of points, and (3) the refinement of the LSF by a variable number of bisections. The experimental results show an improvement of the average performance by more than 30% with respect to one frequently used method of LSF computation, the Kabal–Ramachandran method. © 2001 Elsevier Science B.V. All rights reserved.

Keywords: Line spectral frequency; LSF prediction; Speech coding; Kabal–Ramachandran algorithm

1. Introduction

1.1. Problem formulation

At rates higher than 2 kbit/s, one of the most widely used techniques for modelling the envelope of the short-time speech spectrum (for each frame of the sampled signal) is based on AR predictors, specified by the nth order filter

A(z) = 1 + \sum_{i=1}^{n} a_i z^{-i}.   (1)

In order to transmit the spectral information, the model parameters ought to be quantized, but since the coefficients of the polynomial A(z) are too sensitive to quantization noise, they are first transformed to a set of equivalent parameters, named line spectral frequencies (LSF) [4], which are subsequently quantized and transmitted. We assume an even order n, and we use the predictor polynomial A(z) to define the symmetric polynomial P̃ and the antisymmetric polynomial Q̃ as follows:

P̃(z) = A(z) + z^{-(n+1)} A(z^{-1}) = P(z)(1 + z^{-1}),
Q̃(z) = A(z) - z^{-(n+1)} A(z^{-1}) = Q(z)(1 - z^{-1}).   (2)

We suppose that the polynomial A(z) has its roots inside the unit circle, which is true when it is obtained by the autocorrelation method, as in conventional LP analysis procedures. In this case, it can be shown that all roots of P̃ and Q̃ lie on the unit circle [12,13], are distinct and interlaced as

0165-1684/01/$ - see front matter © 2001 Elsevier Science B.V. All rights reserved. PII: S0165-1684(01)00055-X


made explicit below. Therefore, the symmetric polynomials P(z) and Q(z), resulting after deflating the fixed root (at -1 or 1), will have complex zeroes of the form z_i = e^{jω_i}. Since the roots appear in complex conjugate pairs, one has to find only the roots z_i located on the upper unit semicircle, which may be specified either by the line spectral pairs (LSP), defined as x_i = Re z_i, i = 1:n, or by the LSF parameters, defined as ω_i ∈ (0, π), i = 1:n. The relation between the LSP x_i and the LSF ω_i is obviously x_i = cos ω_i. (Although LSP and LSF express

equivalent quantities and are often used one for another in the literature, we prefer to distinguish them for more precision.) Consider the indexing of the LSP set such that its elements are decreasingly ordered. Consequently, the corresponding LSF set (ω_i = arccos x_i) turns out to be increasingly ordered:

1 > x_1 > x_2 > ... > x_n > -1,
0 < ω_1 < ω_2 < ... < ω_n < π.   (3)
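The LSP–LSF relation is a plain cosine mapping; as a quick illustration (our own sketch, not from the paper), an increasingly ordered LSF set maps to a decreasingly ordered LSP set, consistent with (3):

```python
import math

def lsf_to_lsp(omegas):
    # Map increasingly ordered LSF, omega_i in (0, pi), to the LSP
    # x_i = cos(omega_i); by (3) the x_i come out decreasingly ordered.
    return [math.cos(w) for w in omegas]
```

Since cos is strictly decreasing on (0, π), the ordering of (3) is preserved automatically.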

The interlacing property consists of the following: any odd indexed LSF, ω_{2i-1}, corresponds to a root e^{jω_{2i-1}} of P and any even indexed LSF, ω_{2i}, corresponds to a root e^{jω_{2i}} of Q.
The focus of this paper is on the fast computation of the LSP parameters. The rest of this section presents some popular methods for computing the LSP. Section 2 is dedicated to our new method. Section 3 contains a report on our experiments; we first show how the parameters of the method were chosen, then evaluate its complexity and compare it with the well known Kabal–Ramachandran algorithm.

1.2. Existing methods for LSP computation

Several methods have been proposed for the computation of the LSP parameters. The most efficient and popular of them have two stages. In the first stage, the coefficients of two polynomials P_ν(x) and Q_ν(x) are computed, such that P_ν(x) has its roots at the odd indexed νx_i and the polynomial Q_ν(x) has its roots at the even indexed νx_i, where the x_i are ordered as in (3) and ν may be 1 or 2. In the second stage, the roots of the polynomials P_ν(x) and Q_ν(x) are effectively computed.

To make explicit the formation of the polynomial P_1(x) of degree m = n/2, we start from the polynomial P(z) having symmetric coefficients

P(z) = 1 + p_1 z^{-1} + ... + p_m z^{-m} + ... + p_1 z^{-(2m-1)} + z^{-2m}   (4)

and may continue in two different ways:
• The Kabal–Ramachandran algorithm (KR) [6] uses the relation

e^{jkω} + e^{-jkω} = 2 cos kω   (5)

to evaluate the polynomial P on the upper semicircle as

e^{jmω} P(e^{jω}) = p_m + 2p_{m-1} cos ω + ... + 2p_1 cos(m-1)ω + 2 cos mω.   (6)

Using the substitution x = cos ω, making explicit that cos kω = T_k(x) is the kth order Chebyshev polynomial, and removing the linear phase term that does not affect the roots, we may express (6) as a polynomial in x:

P_1(x) = p_m + 2p_{m-1} T_1(x) + ... + 2p_1 T_{m-1}(x) + 2T_m(x).   (7)

In the KR algorithm, the polynomial P_1(x) is not formed explicitly, because it may be efficiently evaluated at any x by means of a specific recurrence relation (a 50% more costly equivalent of the Horner scheme for Chebyshev polynomials).

• The algorithm of Wu and Chen [14] is very similar to the KR algorithm; it transforms

z^m P(z) = (z^m + z^{-m}) + p_1 (z^{m-1} + z^{-(m-1)}) + ... + p_{m-1} (z + z^{-1}) + p_m   (8)

by the substitution x = z + z^{-1} and uses modified Chebyshev polynomials to express (8) as the polynomial P_2(x). In this algorithm, the coefficients of P_2(x) are explicitly computed (which implies some extra operations, with the advantage of using the Horner algorithm for polynomial evaluation); since x = z + z^{-1} = 2 cos ω when z is on the unit circle, the roots of P_2(x) are equal to 2x_i, the double of the LSP. Recently, Rothweiler [10] proposed a faster algorithm for this polynomial transformation.


For both algorithms, the polynomial Q is transformed in the same way as P. Next we review several ways to perform the second stage, the computation of the roots of the reduced polynomials P_1(x) or P_2(x).
• In [6], P_1(x) is evaluated on a fine grid in order to detect sign changes; when a root is bracketed, the search is refined by a fixed number of bisections and a final linear interpolation is performed. The search starts at 1; taking advantage of the interlacing property, when a root of a polynomial is found, the search continues for a root of the other polynomial. Thus, in each point of the grid only one polynomial evaluation is performed.

• In [14], the (modified) Newton–Raphson method is used to find a root of P_2(x), starting with the initial approximation x = 2. The polynomial is then deflated with respect to that root and the same procedure is repeated for the deflated polynomial of order m-1; the roots are refined using a Newton–Raphson iteration applied to the initial polynomial, to compensate for the ill-conditioning of deflation. When the deflated polynomial has degree four, closed form formulas are used to compute the remaining roots. The interlacing property is used to choose good initial approximations for computing the roots of the second polynomial (in its turn deflated down to degree four). The numerical accuracy of deflation, known to be poor, may be improved as shown in [2].
A method introduced in the late 80s by Omologo [8] may be considered a precursor of our method, because it uses the values of the LSP in the previous frame as starting points in an iterative algorithm. However, that iterative algorithm is completely different from ours and, while it has a good average performance, it is uncontrolled with respect to worst case performance.
Other recent methods for computing the LSP use

more sophisticated techniques:
• Saoudi et al. [11] do not compute the coefficients of the polynomials A, P or Q, but construct symmetric tridiagonal matrices whose eigenvalues are the double of the LSP (the split Levinson algorithm [1] is used instead of the classical Levinson–Durbin algorithm to compute the linear predictor).

• Pillai and Stonick [9] track the roots using a homotopy of previous and current frame LSP.

• Another class of techniques uses particular methods, for instance for the practical case m = 5, as in [3].
There is no absolute ranking of the cited methods, one method outperforming the others depending on the rate, delay, desired accuracy and test material. Therefore, we confine our comparison to the settings specific to the real-time speech coders used in mobile telephony. The Kabal–Ramachandran algorithm seems the most appropriate to this application; its execution time is approximately constant regardless of the polynomial coefficients and the corresponding code is small. The KR algorithm is also scalable in the sense that when m grows, the only requirement is to take a finer grid. The Wu–Chen algorithm is faster when the LSP should be computed with high accuracy, which is not the case here; moreover, their procedure being partly iterative, the worst case behaviour may be too far off the average, a drawback in a real-time environment; even so, the algorithm remains somewhat faster, as shown by some preliminary experiments we have performed. But its implementation is complicated by the precautions required by deflation and by the iterative root computation, most notably when n is large (e.g., n = 16); see [2] for a discussion of these aspects.
The ITU-T standard G.729 [5] implements the

Kabal–Ramachandran algorithm, for the case m = 5. The evaluation grid has 60 points (equally spaced in frequency); after bracketing a root, 4 bisections are performed. These values lead to a total worst case of 109 polynomial evaluations. (P_1 is also evaluated in the roots of Q_1 and vice-versa, excepting for the last computed root.)
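The grid-scan-plus-bisection second stage can be sketched as follows (a simplified single-polynomial illustration, our own code; the real KR routine alternates between P_1 and Q_1 and adds a final linear interpolation):

```python
def bracket_and_refine(f, grid, n_bisect=4):
    # Scan a decreasingly ordered grid for sign changes of f, then refine
    # each bracketed root by a fixed number of bisections.
    roots = []
    prev_x, prev_v = grid[0], f(grid[0])
    for x in grid[1:]:
        v = f(x)
        if prev_v * v < 0.0:          # sign change: a root is bracketed
            a, fa, b = prev_x, prev_v, x
            for _ in range(n_bisect):
                mid = 0.5 * (a + b)
                fm = f(mid)
                if fa * fm <= 0.0:
                    b = mid
                else:
                    a, fa = mid, fm
            roots.append(0.5 * (a + b))
        prev_x, prev_v = x, v
    return roots
```

With the G.729 setting (60 grid points, 4 bisections, n = 10 roots), one plausible accounting of the worst case quoted above is 60 grid evaluations + 10·4 bisection evaluations + 9 cross-checks of the other polynomial = 109; the standard's own bookkeeping may differ in detail.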

2. Using predictions for fast root bracketing

Excepting the early work of [8], the use of the past LSF for the fast computation of the current ones has not been exploited up to now. It is well recognized that there is a correlation between the LSF belonging to successive frames of a sampled speech signal. This correlation may be used in the quantization stage, allowing a reduction of the number of


Fig. 1. Possible locations of the current predicted LSP.

bits necessary to obtain a satisfactory coding. There are several available LSF prediction methods, many of them using MA or AR predictors to form the predicted LSF, ω̂_i. Instead of directly quantizing the current (computed) LSF, ω_i, i = 1:n, the differences ω_i - ω̂_i are quantized, since the latter have a smaller variance and their distribution is closer to a Laplacian one.
Our method uses the predicted LSP for the computation of the current LSP, with the goal of reducing the complexity of root searching. Two predictors were used in conjunction with our method:
• the first was simply the set of LSP computed at the previous frame;
• the second is the same as used by the G.729 coder, which computes two different fourth order fixed MA predictions; at quantization, the best of the two is selected. As at the time of LSP computation it is impossible to evaluate which predictor is better, we used the predictor which was the best for the previous frame.
There could be several alternatives for how to proceed from the predictions to compute the exact LSP. We have chosen a solution which preserves as much as possible the spirit of the Kabal–Ramachandran algorithm. We work on a grid of L points g_l, l = 1:L, covering the interval (-1, 1); the points

are conventionally ordered from 1 to -1, i.e., from right to left. For each LSP x_i, i = 1:n, of the current frame, our aim is to find a grid interval bracketing x_i, i.e., an index k (depending on i) such that g_{k+1} < x_i ≤ g_k. (This grid interval will be further bisected to the desired accuracy, like in the KR algorithm.) While the KR algorithm starts the search from x_{i-1} (or the first grid point for i = 1), we start the search from the prediction x̂_i and use the additional information offered by the sign and the slope of the polynomial around the root prediction.
Let us suppose that we already have computed

the first i-1 LSP and we want to compute x_i. The algorithm will be described for the case when x_i is a root of P_1 and P_1(x_{i-1}) > 0; this case is immediately extended for the other polynomial and for the opposite sign. Let g_{k'} be the first grid point at the right of x_{i-1}; of course, x_{i-1} ≤ g_{k'}; due to the ordering (3), we also have x_i < g_{k'}. Given the prediction x̂_i, we begin by finding g_{k̂}, the first grid point at the right of x̂_i. Obtaining k̂ from x̂_i is an easy task; if x̂_i is the previous frame LSP, then k̂ is already available; in G.729, where in fact ω̂_i is computed, as the grid is equidistant in frequency, then k̂ = ⌊ω̂_i L/π⌋ + 1. To complete the notation, recall that g_k is the first grid point at the right of x_i, i.e., the LSP x_i is within the interval (g_{k+1}, g_k]. Our problem is to find k starting from k̂.
The possible locations of the prediction are presented in Fig. 1, where the notations s = g_{k̂}, t = g_{k̂+1} are used. The relevant interval at the left of x_{i-1} is divided into four categories denoted A–D, marked by vertical stripes; notice that A and D are subdivided into two. The categories are characterized by the signs of P_1(s), P_1(t) and their difference P_1(s) - P_1(t), as explained in the sequel.
The algorithm we propose is outlined in Fig. 2 and will be described in the rest of this section. The routine to compute the value of a polynomial is generically named horner, without any connection with the algorithm actually used.
In the first part of the algorithm (lines 1.1-1.3), we correct the prediction, if necessary. If x̂_i is too far at the right, i.e., x̂_i > x_{i-1}, then we move it to the left by reassigning k̂ = k' + 2; actually, instead of testing x̂_i ≥ x_{i-1}, we used the more aggressive test k̂ ≤ k' + 1, as seen in line 1.1 of the algorithm. If the prediction is too far at the left, we


Fig. 2. Description of fast bracketing in the predictive LSP computation algorithm.

move it to the right as in line 1.3, upon a heuristic explained later; we assume that the prediction is never so bad that g_{k̂} falls at the left of stripe A_2; we have never seen this situation in our experiments described in Section 3.1.
In the second part of the algorithm we identify the stripe in which the prediction lies and then search the grid interval bracketing x_i. The first branching is according to the sign of P_1(s). If the sign is negative, then the prediction is in stripe C, the root is already bracketed and g_k will be searched at the right of s. As we want to narrow the bracketing to one grid interval, we may proceed by going backwards on the grid from l = k̂ - 1 down to k' and evaluating P_1(g_l) until a positive value is met; the root is now in one grid interval. A variation is to choose a step δ_C different from 1 when decreasing l, with a potentially faster bracketing of


the root when the prediction is not very near the true LSP, but a loss in performance for very good predictions. (Another method is to bisect directly the interval [g_{k̂}, g_{k'}], until meeting the desired accuracy; this method was experimentally less successful and we will not discuss it further.) The choice of δ_C will be detailed in Section 3.2.

If P_1(s) is positive, then the polynomial is evaluated in the next grid point t = g_{k̂+1}. Now, if P_1(t) is negative, we fall in stripe B, which is the best case: the LSP is bracketed with two polynomial evaluations!
If P_1(t) is also positive, the algorithm we propose

is the following. If P_1(t) < P_1(s), i.e., the slope of the polynomial is positive, the corresponding possible stripes are A_1 and A_2. In A_1, the strategy is exactly the same as in the Kabal–Ramachandran algorithm, since there is no other information available about the root; that is, the polynomial is evaluated on successive grid points (or with step δ_A), starting from k̂ + 2, until finding a sign change. The case A_2 must be avoided, since it is indistinguishable from A_1 and would lead to the bracketing of a farther root of P_1 instead of x_i, and thus to a fatal loss of computed LSP. The solution we adopted is to choose a threshold value Δ and, if k̂ - k' > Δ, to recompute the "prediction" with k̂ = k' + Δ/2. The choice of Δ is clearly a compromise: a large value will make less sure the avoidance of the region A_2, while a small value will possibly modify good predictions when x_{i-1} - x_i is large. We will discuss how to choose

Δ in Section 3.2.
If P_1(t) ≥ P_1(s), i.e., the slope is negative, the corresponding stripes are D_1 and D_2; note that D_2 might not exist, depending on the position of x_{i-1} (which is a root of Q_1) with respect to the maximum of P_1 in the interval [x_i, x_{i-1}]. Since we cannot distinguish between D_1 and D_2, we chose to reset k̂ to k' + 1 and to proceed as in case A, with a possibly different step δ_D. Also, predictions in D_1 may benefit from the Δ-rule, rather than from the reset k̂ ← k' + 1, which is a drastic solution. Thus, the condition for applying the Δ-rule results as in line 1.3 of the algorithm in Fig. 2.
We will name PK the described algorithm

(standing for predictive Kabal–Ramachandran). An initial letter will be added to represent the predictor: PPK uses the previous frame LSP, and GPK the predictor from G.729. It is clear that the PK algorithm does not have to comply with all the details of the basic Kabal algorithm (such as the use of a Chebyshev expression of the polynomials or the use of bisection in order to find a root bracketed in an interval), but only with the limitation of the search to some grid points. However, we will report results comparing to the original KR algorithm, since this algorithm is largely in use. Since both the PK and KR algorithms bracket the LSP in a grid interval and refine this interval in the same way, they will compute the same values. The comparison should thus be limited to the complexity of the two algorithms.
The performance of the algorithm PK is dependent on the quality of the prediction and on the data. A good choice of parameters ensures a good behaviour, as we will show in the next section.
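The branching on signs and slope described in this section can be condensed into a small sketch (our own simplification for the P_1(x_{i-1}) > 0 case; the stripe subdivisions and the Δ-rule are omitted):

```python
def classify_stripe(P1_s, P1_t):
    # Classify the prediction from the two probe values P1(s) and P1(t),
    # where s is the grid point at the right of the prediction and t the
    # next grid point to its left.
    if P1_s < 0.0:
        return "C"    # root already passed: search to the right of s
    if P1_t < 0.0:
        return "B"    # best case: root bracketed between t and s
    if P1_t < P1_s:
        return "A"    # positive slope: scan leftwards from k_hat + 2
    return "D"        # negative slope: reset near x_{i-1}, proceed as in A
```

Only stripe B terminates immediately; the other three outcomes decide in which direction and with which starting index the grid scan continues.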

3. Extensions, tuning of parameters and evaluations

The experiments were performed using the TIMIT database (the test part), on over 500,000 speech frames processed with the G.729 floating point coder, in which we have replaced the original Kabal–Ramachandran algorithm with our routine for predictive LSP computation.

3.1. About the quality of the prediction

The first question is how large our improvement expectations may be, and the answer is related to the quality of the prediction. To evaluate the maximum that could be gained, we measured the distance k̂ - k for all the LSP (where k̂ and k are as before, the indices of the nearest grid points at the right of the predicted and true LSP, respectively). The plot of the number of occurrences of different values of k̂ - k is presented in Fig. 3, for the two predictors we used: the previous frame LSP (denoted P) and the G.729 predictor (denoted G). It is clear that P offers better predictions, as |k̂ - k| has relatively more small values.
Let us suppose now that we would have an algorithm detecting on which side of the true LSP the prediction is and going step by step in the good direction on the grid until the LSP is bracketed. Such an


Fig. 3. The percentage of occurrences of k̂ - k, for the two predictors.

algorithm would perform on average

N̄ = 2 + Σ_{d<0} f_d |d| + Σ_{d>1} f_d (d - 1)   (9)

polynomial evaluations to bracket a root, where f_d is the relative number of occurrences of the value d = k̂ - k (for d = 0 or d = 1, only two evaluations are required, for d = -1 or d = 2, three evaluations are needed, etc.). We obtained N̄ = 2.51 for P, and N̄ = 3.10 for G, which shows again that P is better; in conjunction with the Kabal–Ramachandran algorithm parameters used in G.729, i.e., four bisections for refinement after bracketing, this will result in a total of n(4 + N̄) ≈ 65.1 polynomial evaluations per speech frame for P, and 71 for G. These numbers should be compared with an average of about 105 polynomial evaluations for Kabal's algorithm (the grid has L = 60 points). This result shows that the current approach has a good potential for improvement if one guesses well the direction of search.
A second question is related to Fig. 1: what is the

percentage of predictions falling in the categories A–D? The answer is given in Table 1; for each predictor, the first line shows the result without modifying the prediction, in which case there will be predictions at the right of stripe D_2, i.e., at the right of the latest computed LSP, denoted x_{i-1} in the previous section. For the second line, k̂ is increased to k' + 2 if inferior to this value, as in line 1.1 of the algorithm in Fig. 2. (The third line corresponds to the Δ-rule and will be commented on later.) We notice that the categories that raise difficulties, i.e., D and especially A_2, have low frequencies; moreover, there were no predictions at the left of A_2, ensuring the correctness of our algorithm. The predictor P is again better, having fewer extreme predictions. On the contrary, almost one half of the predictions for P, and over a quarter for G, fall into stripe B, where the LSP is bracketed with two evaluations. Notice that the values for stripe B may be seen in Fig. 3, when k̂ = k.
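The average bracketing cost of Eq. (9) is easy to evaluate from a histogram of d = k̂ - k; a small sketch (the distribution below is hypothetical, not the measured one):

```python
def avg_bracketing_evals(freq):
    # Eq. (9): 2 evaluations for d in {0, 1}; 2 + |d| for d < 0;
    # 2 + (d - 1) for d > 1.  freq maps d to its relative frequency.
    total = 0.0
    for d, f in freq.items():
        if d < 0:
            extra = -d
        elif d > 1:
            extra = d - 1
        else:
            extra = 0
        total += f * (2 + extra)
    return total
```

With four bisections per root and n = 10 roots, the per-frame total is n(4 + N̄), which reproduces the 65.1 figure quoted above for N̄ = 2.51.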

3.2. Tuning the parameters of the algorithm

Coming back to our algorithms, let us tune the parameters which are not yet specified. A first parameter is the threshold Δ, introduced to avoid the location of the prediction in stripe A_2 and to move to the right some of the predictions in D_1.


Table 1
Frequencies (%) of prediction categories, for predictors P and G

                                      right of D_2   D_2    A_1     B       C       D_1       A_2
Predictor P
k̂ not modified                        0.13           0.17   25.65   47.65   26.40   0.00021   0
k̂ ← max(k̂, k' + 2)                    0              0.09   24.23   46.48   29.20   0.00021   0
k̂ ← k' + Δ/2 if k̂ > k' + Δ (Δ = 12)   0              0.28   24.34   46.17   29.20   0.00019   0

Predictor G
k̂ not modified                        1.44           1.26   32.36   24.10   40.81   0.038     0.00025
k̂ ← max(k̂, k' + 2)                    0              0.41   30.27   26.25   43.04   0.038     0.00025
k̂ ← k' + Δ/2 if k̂ > k' + Δ (Δ = 12)   0              0.49   30.30   26.14   43.05   0.020     0

Table 2
Frequencies of occurrence of the distance d_i

d_i                           11–12   13–14   15–16   17–18   19–20
Frequency of occurrence (%)   0.013   0.17    0.60    1.46    2.66

To have an image of the range where Δ could be, we measured (whenever possible, i.e., for 1 ≤ i ≤ n - 4 = 6) the distance d_i, defined as the distance in grid intervals between x_i and x_{i+4}, for all the test frames. We are interested in the small values of d_i; Table 2 presents the frequency of occurrence of d_i for different values less than 20. We need a value of Δ sufficiently large not to affect the prediction unnecessarily, but also sufficiently small to have Δ < d_i in almost all cases. Such a value seems to be Δ = 10, but we preferred the "faster" choice Δ = 12. The very low number of occurrences of d_i ≤ 12, combined with the even lower occurrence of predictions in stripe A_2 (see again Table 1), ensures that the probability of obtaining both these events is negligible for predictor G (the product of two very small probabilities, if we consider the events independent). For predictor P, the Δ-rule seems to have less significance, as there are no predictions in A_2;

however, we kept it as a safety measure.
The modification, upon the Δ-rule, of the predictions far off the last previously computed LSP will affect the location of k̂ not only in stripe A_2, but also in other locations. The third line of Table 1 shows the frequencies of the locations of k̂ after modifying it as in line 1.3 of the algorithm in Fig. 2. For predictor G, one can see that there are no more predictions in A_2 and that half of the occurrences in stripe D_1 have also disappeared; however, stripe D_2 is now more populated. The other categories are only slightly affected. For predictor P, predictions moved rather from B to D_2 and A_1, but in small number, so that this small negative effect is negligible (and compensated by the safety gain). Globally, we can appreciate that the introduction of Δ = 12 is very convenient, but this value is not critical; the results for Δ = 10 are not significantly different.
The problem to be debated further is which is the best step δ_C in stripe C. It is helpful to see the distribution of d = k̂ - k when the prediction falls in stripe C (after applying the Δ-rule); a bargraph similar to that in Fig. 3 is now presented in Fig. 4, with the remark that all differences are positive. The value d = 1 has a frequency of about 75% for P and 50% for G, and the largest value is 16 for P and 19 for G.


Fig. 4. The frequencies of occurrence of k̂ - k, when the prediction is in stripe C.

Recall that if δ_C > 1, further polynomial evaluations are required to reduce the bracketing interval to one grid unit, specifically ⌈log_2 δ_C⌉. We confine the comparison to some heuristic arguments and only for the choices δ_C = 1 and δ_C = 2, because larger values are not likely to give good efficiency, since most of the occurrences of d = k̂ - k are at very small values. With δ_C = 1 the gain is one evaluation for d = 1, but one loses for d > 3, more evaluations as d grows larger. The behaviour could be efficient on average, but poor in the worst case, since there are 0.14% cases for G and 0.04% for P when d > 10. That is why we preferred the choice δ_C = 2.
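The tradeoff between the two step choices can be made concrete with a small cost model (our own sketch, counting only scan evaluations plus the extra bisections needed to shrink a step-δ bracket back to one grid interval):

```python
import math

def bracket_cost(d, step):
    # Evaluations to localize the root when the prediction overshoots by
    # d >= 1 grid intervals in stripe C: ceil(d / step) scan evaluations,
    # plus ceil(log2(step)) bisections to return to a one-interval bracket.
    return math.ceil(d / step) + math.ceil(math.log2(step))
```

For d = 1 the step 1 saves one evaluation (1 vs. 2), the two choices tie for d = 2 or 3, and step 2 wins for d > 3, consistent with the discussion above.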

Finally, the values of the steps in stripes A and D have been chosen δ_A = δ_D = 1. The reason for this simple choice is the favourable slope in stripe A, implying a smaller number of horners until bracketing, and the scarce occurrence of predictions in stripe D (see again Table 1).
It is possible that the values of the parameters, derived experimentally in this section using data from the TIMIT database, may require retuning under very different conditions. Decreasing Δ may be the only necessary adaptation, thus improving the robustness of the algorithm.

3.3. Complexity of the predictive LSP algorithm

We present now the complexity of our algorithm, reported as the number of polynomial evaluations ("horners") N_evals required for the computation of an LSP set, on the experimental data set. This is a reasonable measure since, compared to the KR algorithm, our algorithm from Fig. 2 has only a few more operations required to adjust the prediction and a few more comparisons per root, needed to detect the position of the prediction with respect to the actual LSP. (It is difficult to evaluate exactly the extra cost of the control flow of algorithm PK, but obviously it is only a small fraction of a horner; however, the PK algorithm clearly has a longer program code than KR.) Also, this measure does not take into account the way the polynomial is evaluated and may be as well suited to the straight Horner rule as to the Chebyshev version. Agreeing that four bisections are needed to refine a bracketed root, it is obvious that the minimum possible value of N_evals is 60, in the case where all


Table 3
Frequencies of occurrence of N_evals

N_evals                                       60–65   66–70   71–75   76–80   81–85   86–90   91–95   96–99
Frequency of occurrence (%) (algorithm PPK)   49.36   34.73   9.89    4.53    1.28    0.19    0.022   0.0010
Frequency of occurrence (%) (algorithm GPK)   9.24    46.25   30.83   9.76    3.14    0.70    0.075   0.0046

Fig. 5. Frequencies, in percent, of the number of polynomial evaluations for algorithms PPK and GPK (left side) and KR (right side).

LSP of a speech frame are predicted with high accuracy (all predictions in stripe B or at the extreme right of stripe C); a worst case is difficult to evaluate; for example, if all predictions are at the right of the previously computed LSP, the number of evaluations will be the same as in Kabal's algorithm. It is also clear that the current algorithm computes exactly the same LSP as the KR algorithm, since each LSP is finally bracketed in one grid interval.
The bargraphs from Fig. 5 indicate the frequencies of occurrence of N_evals on the whole test data, for three algorithms: PPK and GPK on the left side, and KR on the right side. For PPK, the average value of N_evals is 66.64; for GPK, the average is 70.68; it is worth noticing that these values are relatively near the theoretical values calculated in Section 3.1 (recall them: 65.1 and 71, respectively). The worst values of N_evals are 98 for PPK and 99 for GPK. The frequencies of the number of occurrences on intervals of length 5 are shown in the second and third lines of Table 3. It is seen that the frequency of N_evals greater than 90 is very small and that the event N_evals > 80 has a frequency of less than 1.5% for PPK and 4% for GPK.
For comparison, the Kabal–Ramachandran algorithm has an average value of N_evals equal to 104.75 polynomial evaluations and a worst value of 109, as already mentioned in Section 1. It is clear that PPK is better than GPK due to the better prediction (see Fig. 3) and that both these variants of our

2028 B. Dumitrescu, I. Tay bus7 / Signal Processing 81 (2001) 2019}2031

Page 11: Predictive LSF computation

Fig. 6. Frequencies of the number of polynomial evaluations for algorithms PPK(80) and GPK(80).

predictive LSP computation algorithm are signi"-cantly better than KR.

3.4. Controlling the worst case behaviour

In a real-time implementation, the worst case of the execution time is the best measure of the performance. The method presented up to now performs very well on average, giving a significant gain over the Kabal–Ramachandran method, but the worst case time is only about 10% better, both for PPK and GPK. There is a simple extension to cope with this difficulty: imposing a maximum number of horners N_max as an upper limit for the execution of the algorithm. During the execution of the algorithm, we monitor the number of polynomial evaluations. The algorithm is reorganized in two steps:
1. Each root is only bracketed in one grid interval, using the same bracketing algorithm described in Section 2. At the end of this step, the number of polynomial evaluations has the value N_1.
2. The resulting intervals are bisected to refine the roots. The number of bisections per root N_b that may be performed within the limit N_max is

   N_b = ⌊(N_max - N_1)/n⌋.
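The two-step reorganization can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: a generic polynomial and a uniform grid stand in for the Chebyshev-transformed LSF polynomials and the actual grid, and the bracketing below is a plain linear scan rather than the prediction-based search of Section 2.

```python
def bracket_on_grid(poly, grid):
    """Step 1: bracket each root in one grid interval via sign changes.

    Returns brackets (lo, hi, f_lo), storing the endpoint value so the
    bisection step needs no re-evaluation, and the evaluation count N_1."""
    brackets = []
    n_evals = 1
    prev_x, prev_v = grid[0], poly(grid[0])
    for x in grid[1:]:
        v = poly(x)
        n_evals += 1
        if prev_v * v < 0:  # sign change: one root inside (prev_x, x)
            brackets.append((prev_x, x, prev_v))
        prev_x, prev_v = x, v
    return brackets, n_evals


def refine_with_budget(poly, brackets, n_max, n_1):
    """Step 2: N_b = floor((N_max - N_1) / n) bisections per root,
    so the total number of evaluations never exceeds N_max."""
    n = len(brackets)
    n_b = (n_max - n_1) // n  # bisections allowed per root
    roots = []
    for lo, hi, f_lo in brackets:
        for _ in range(n_b):  # each bisection costs one evaluation
            mid = 0.5 * (lo + hi)
            f_mid = poly(mid)
            if f_lo * f_mid <= 0:  # root now lies in [lo, mid]
                hi = mid
            else:
                lo, f_lo = mid, f_mid
        roots.append(0.5 * (lo + hi))
    return roots


# Toy demo: a cubic with three roots in (0, 1) and a 51-point uniform grid.
poly = lambda x: (x - 0.213) * (x - 0.517) * (x - 0.842)
grid = [i / 50 for i in range(51)]
brackets, n_1 = bracket_on_grid(poly, grid)  # n_1 = 51, three brackets
roots = refine_with_budget(poly, brackets, n_max=81, n_1=n_1)
# budget: 81 - 51 = 30 remaining evaluations -> 10 bisections per root
```

With four bisections per root and n = 10, this sketch reproduces the accounting in the text: N_eval = N_1 + 4n, and capping N_max trades a few bisections for a hard bound on the frame's work.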

The computational burden associated with this algorithm change is the maintenance of a counter for N_1 (one increment per horner) and the calculation of N_b, once per frame. Also, the 2n values bracketing the LSP should be stored before the final bisections.

Of course, the proposed strategy may affect the accuracy of some computed roots, so N_max should be chosen carefully. Let us denote by PPK(N_max) the algorithm obtained from PPK using the technique described above. (Similar notations are used for GPK and KR.) Looking again at Table 3, we recall that N_eval > 80 appears in less than 1.5% of the cases for PPK (and 4% for GPK). Thus, using N_max = 80 will affect the accuracy of only a very small number of LSP sets. The complexity of the algorithms PPK(80) and GPK(80) is presented in Fig. 6. The average of N_eval is 66.55 for PPK(80) and 70.35 for GPK(80), only slightly modified with respect to the initial algorithms. The algorithms PPK(75) and GPK(75) behave similarly. (Their average numbers of horners are 66.06 and 69.3, respectively.) As a direct measure of the accuracy, we computed for each frame the distance between the set of computed LSP x_hat and the actual LSP x, i.e., the Euclidean norm ||x_hat - x||; as actual LSP we used the values computed in double precision with the polynomial root finder from [7]. Three average errors of the computed LSP are presented in Table 4. In the second column we have the average error over the whole data set, which is only slightly affected by the limitation of N_eval. The next two columns present the average errors only for the frames where the limitation is actually applied, i.e., when N_eval > N_max. We see that for these frames the error increases about 4–5 times. However, the accuracy is still acceptable, as the computation is performed in single precision, where the roundoff is of order 10^-7.

Table 4
Average error of the computed LSP with respect to the actual LSP

Algorithm   Whole data set   Frames with N_1 + 40 > 80   Frames with N_1 + 40 > 75
PPK         8.88e-6          8.86e-6                     9.47e-6
PPK(80)     9.31e-6          3.69e-5                     -
PPK(75)     1.09e-5          -                           4.12e-5
GPK         8.88e-6          8.35e-6                     7.99e-6
GPK(80)     1.00e-5          3.54e-5                     -
GPK(75)     1.31e-5          -                           3.80e-5

A better perceptual measure of the accuracy is the spectral distortion. Therefore, we have computed the spectral distortion of the quantized LSP set in the G.729 coder (with our algorithms instead of KR). Table 5 shows the average spectral distortion over the same sets as Table 4.

Table 5
Spectral distortion for the predictive LSP computation

Algorithm   Whole data set   Frames with N_1 + 40 > 80   Frames with N_1 + 40 > 75
PPK         1.3420           2.3782                      1.9787
PPK(80)     1.3420           2.3782                      -
PPK(75)     1.3421           -                           1.9788
GPK         1.3420           1.9000                      1.6968
GPK(80)     1.3420           1.9001                      -
GPK(75)     1.3421           -                           1.6969

We notice that the change is insignificant, which means that the quantization is practically insensitive to the error introduced by the LSP computation with a reduced number of horners. It can be remarked that the spectral distortion is greater for the frames that require a large number of horners; we could state that the common cause of these phenomena is the bad quality of the predictor for these frames.

These results show that the idea of limiting the number of polynomial evaluations is successful. We should add that the same limitation could be applied as well to the KR algorithm. However, we notice from Fig. 5 that KR(99), which reduces the worst case by 10, affects almost all LSP, which are now computed with three bisections instead of four. Therefore, the accuracy suffers to a greater extent than for the algorithms PPK and GPK.
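The per-frame accuracy measure used above, i.e., the Euclidean norm between the single-precision LSP set and a double-precision reference, is straightforward; a minimal sketch with hypothetical vectors:

```python
import math

def lsp_error(x_computed, x_reference):
    """Euclidean norm ||x_computed - x_reference|| over one frame's LSP set."""
    return math.sqrt(sum((a - b) ** 2
                         for a, b in zip(x_computed, x_reference)))

# Hypothetical frame: a single LSP component off by 1e-4.
err = lsp_error([0.1001, 0.25, 0.40], [0.1000, 0.25, 0.40])
```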

4. Conclusions

The present paper proposes a new fast method for the computation of the line spectral frequencies. The main contribution is the use of the predicted LSP as a starting point in the search for the current LSP. We detail a method to bracket the LSP working on a grid of points, i.e., using as background the Kabal–Ramachandran algorithm. The worst case behaviour is improved by simply limiting the number of polynomial evaluations, without a degradation of the spectral distortion. Experiments performed on the TIMIT database show a good average behaviour of the new method.

Acknowledgements

The authors are grateful to reviewer A, whose detailed comments led to a significant improvement of this paper.

References

[1] P. Delsarte, Y. Genin, The split Levinson algorithm, IEEE Trans. Acoust. Speech Signal Process. 35 (3) (June 1986) 470–478.

[2] B. Dumitrescu, I. Tabus, How to deflate polynomials in LSP computation, Proceedings of the IEEE Workshop on Speech Coding, Vol. 1, Porvoo, Finland, 1999, pp. 52–54.

[3] S. Grassi, A. Dufaux, M. Ansorge, F. Pellandini, Efficient algorithm to compute LSP parameters from 10th order LPC coefficients, Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, Vol. 3, Munich, Germany, 1997, pp. 1707–1710.

[4] F. Itakura, Line spectrum representation of linear predictor coefficients of speech signals, J. Acoust. Soc. Amer. 57 (S35(A)) (1975).

[5] ITU-T, Recommendation G.729: Coding of speech at 8 kbits/s using conjugate-structure algebraic-code-excited linear-prediction (CS-ACELP), March 1996.

[6] P. Kabal, R.P. Ramachandran, Computation of line spectral frequencies using Chebyshev polynomials, IEEE Trans. Acoust. Speech Signal Process. 34 (6) (December 1986) 1419–1426.

[7] M. Lang, B.C. Frenzel, Polynomial root finding, IEEE Signal Process. Lett. 1 (October 1994) 141–143.

[8] M. Omologo, The computation and some spectral considerations on line spectrum pairs (LSP), European Conference on Speech Communication and Technology, Eurospeech, Vol. 2, Paris, France, 1989, pp. 352–355.

[9] U. Pillai, V. Stonick, A scalar homotopy method for parallel and robust tracking of line spectral pairs, Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, Vol. 2, Atlanta, Georgia, 1996, pp. 805–808.

[10] J. Rothweiler, On polynomial reduction in the computation of LSP frequencies, IEEE Trans. Speech Audio Process. 7 (5) (September 1999) 592–594.

[11] S. Saoudi, J.M. Boucher, A. Le Guyader, A new efficient algorithm to compute the LSP parameters for speech coding, Signal Process. 28 (1992) 201–212.

[12] F.K. Soong, B.W. Juang, Line spectrum pair (LSP) and speech data compression, Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, San Diego, CA, 1984, pp. 1.10.1–1.10.4.

[13] P. Stoica, A. Nehorai, The poles of symmetric linear prediction models lie on the unit circle, IEEE Trans. Acoust. Speech Signal Process. 34 (5) (October 1986) 1344–1346.

[14] C.H. Wu, J.H. Chen, A novel two-level method for the computation of the LSP frequencies using a decimation-in-degree algorithm, IEEE Trans. Speech Audio Process. 5 (2) (March 1997) 106–115.
