CHAPTER 4: Adaptive Tapped-delay-line Filters Using the Least Squares (Adaptive Filtering)

Page 1

CHAPTER 4

Adaptive Tapped-delay-line Filters Using the Least Squares

Adaptive Filtering

Page 2

In this presentation, the method of least squares is used to derive a recursive algorithm for automatically adjusting the coefficients of a tapped-delay-line filter, without invoking assumptions about the statistics of the input signals. This procedure, called the recursive least-squares (RLS) algorithm, is capable of realizing a rate of convergence much faster than that of the LMS algorithm, because the RLS algorithm utilizes all the information contained in the input data from the start of the adaptation up to the present.


Page 3

The Deterministic Normal Equations

The requirement is to design the filter in such a way that it minimizes the residual sum of squares of the error.

Figure 4.1 Tapped-delay-line filter

$$J(n) = \sum_{i=1}^{n} e^2(i) \qquad (4.1)$$

Page 4


The filter output is the convolution sum

$$y(i) = \sum_{k=1}^{M} h(k,n)\,u(i-k+1), \qquad i = 1, 2, \ldots, n \qquad (4.2)$$

which, upon substituting $e(i) = d(i) - y(i)$, becomes

$$J(n) = \sum_{i=1}^{n} d^2(i) - 2\sum_{k=1}^{M} h(k,n) \sum_{i=1}^{n} d(i)\,u(i-k+1) + \sum_{k=1}^{M} \sum_{m=1}^{M} h(k,n)\,h(m,n) \sum_{i=1}^{n} u(i-k+1)\,u(i-m+1) \qquad (4.3)$$

where $n \ge M$.
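As a quick illustration of Eq. (4.2), here is a minimal numpy sketch (ours, not from the slides) that computes the tapped-delay-line output under the prewindowing assumption $u(i) = 0$ for $i \le 0$:

```python
import numpy as np

def tdl_output(h, u):
    """Convolution sum of Eq. (4.2): y(i) = sum_{k=1}^{M} h(k) u(i-k+1),
    for i = 1, ..., n, assuming u(i) = 0 for i <= 0 (prewindowing).
    h[k-1] holds h(k); u[i-1] holds u(i)."""
    M, n = len(h), len(u)
    y = np.zeros(n)
    for i in range(1, n + 1):
        for k in range(1, M + 1):
            if i - k + 1 >= 1:                    # taps reaching before i = 1 contribute zero
                y[i - 1] += h[k - 1] * u[i - k]   # u[i-k] holds u(i-k+1)
    return y
```

For a fixed coefficient vector this equals `np.convolve(h, u)[:len(u)]`; the explicit loops are kept only to mirror the index conventions of Eq. (4.2).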

Page 5


Introduce the following definitions:

1) We define the deterministic correlation between the input signals at taps k and m, summed over the data length n, as

$$\phi(n; k, m) = \sum_{i=1}^{n} u(i-k)\,u(i-m), \qquad k, m = 0, 1, \ldots, M-1 \qquad (4.4)$$

2) We define the deterministic correlation between the desired response and the input signal at tap k, summed over the data length n, as

$$\theta(n; k) = \sum_{i=1}^{n} d(i)\,u(i-k), \qquad k = 0, 1, \ldots, M-1 \qquad (4.5)$$

3) We define the energy of the desired response as

$$E_d(n) = \sum_{i=1}^{n} d^2(i) \qquad (4.6)$$
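These three definitions translate directly into code. The sketch below (our naming, same prewindowing convention as before) accumulates the sums of Eqs. (4.4)-(4.6):

```python
import numpy as np

def deterministic_correlations(u, d, M):
    """Eqs. (4.4)-(4.6), with u(i) = 0 for i <= 0 (prewindowing).
    u[i-1] holds u(i) and d[i-1] holds d(i), i = 1, ..., n."""
    n = len(u)
    def u_at(i):                        # u(i), zero outside 1..n
        return u[i - 1] if 1 <= i <= n else 0.0
    Phi = np.zeros((M, M))              # Phi[k, m] = phi(n; k, m)
    theta = np.zeros(M)                 # theta[k]  = theta(n; k)
    for k in range(M):
        theta[k] = sum(d[i - 1] * u_at(i - k) for i in range(1, n + 1))
        for m in range(M):
            Phi[k, m] = sum(u_at(i - k) * u_at(i - m) for i in range(1, n + 1))
    E_d = float(np.dot(d, d))           # energy of the desired response, Eq. (4.6)
    return Phi, theta, E_d
```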

Page 6

The residual sum of squares is now written as

$$J(n) = E_d(n) - 2\sum_{k=1}^{M} h(k,n)\,\theta(n; k-1) + \sum_{k=1}^{M} \sum_{m=1}^{M} h(k,n)\,h(m,n)\,\phi(n; k-1, m-1) \qquad (4.7)$$

We may treat the tap coefficients as constants for the duration of the input data, from 1 to n. Hence, differentiating Eq. (4.7) with respect to h(k, n), we get

$$\frac{\partial J(n)}{\partial h(k,n)} = -2\,\theta(n; k-1) + 2\sum_{m=1}^{M} h(m,n)\,\phi(n; k-1, m-1), \qquad k = 1, 2, \ldots, M \qquad (4.8)$$

Let $\hat{h}(k,n)$ denote the value of the kth tap coefficient for which the derivative $\partial J(n)/\partial h(k,n)$ is zero at time n. Thus, from Eq. (4.8) we get

$$\sum_{m=1}^{M} \hat{h}(m,n)\,\phi(n; k-1, m-1) = \theta(n; k-1), \qquad k = 1, 2, \ldots, M \qquad (4.9)$$

Page 7

This set of M simultaneous equations constitutes the deterministic normal equations, whose solution determines the "least-squares filter".

In vector form, the least-squares filter is

$$\hat{\mathbf{h}}(n) = \left[\hat{h}(1,n), \hat{h}(2,n), \ldots, \hat{h}(M,n)\right]^T \qquad (4.10)$$

The deterministic correlation matrix of the tap inputs is

$$\boldsymbol{\Phi}(n) = \begin{bmatrix} \phi(n;0,0) & \phi(n;0,1) & \cdots & \phi(n;0,M-1) \\ \phi(n;1,0) & \phi(n;1,1) & \cdots & \phi(n;1,M-1) \\ \vdots & \vdots & & \vdots \\ \phi(n;M-1,0) & \phi(n;M-1,1) & \cdots & \phi(n;M-1,M-1) \end{bmatrix} \qquad (4.11)$$

and the deterministic cross-correlation vector is

$$\boldsymbol{\theta}(n) = \left[\theta(n;0), \theta(n;1), \ldots, \theta(n;M-1)\right]^T \qquad (4.12)$$

Page 8


With these definitions the normal equations are expressed as

$$\boldsymbol{\Phi}(n)\,\hat{\mathbf{h}}(n) = \boldsymbol{\theta}(n) \qquad (4.13)$$

Assuming $\boldsymbol{\Phi}(n)$ is nonsingular,

$$\hat{\mathbf{h}}(n) = \boldsymbol{\Phi}^{-1}(n)\,\boldsymbol{\theta}(n) \qquad (4.14)$$

and for the resulting filter the residual sum of squares attains the minimum value:

$$J_{\min}(n) = E_d(n) - \boldsymbol{\theta}^T(n)\,\hat{\mathbf{h}}(n) \qquad (4.15)$$
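Given $\boldsymbol{\Phi}(n)$, $\boldsymbol{\theta}(n)$, and $E_d(n)$ (for instance from the sketch after Eq. (4.6)), Eqs. (4.13)-(4.15) become a short batch solve; the function name is ours. This is the block least-squares solution that the RLS recursion later avoids recomputing from scratch:

```python
import numpy as np

def least_squares_filter(Phi, theta, E_d):
    """Solve the deterministic normal equations, Eq. (4.13)."""
    h_hat = np.linalg.solve(Phi, theta)   # Eq. (4.14); assumes Phi(n) is nonsingular
    J_min = E_d - theta @ h_hat           # Eq. (4.15)
    return h_hat, J_min
```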

Page 9


Properties of the Least-squares Estimate

Property 1. The least-squares estimate of the coefficient vector approaches the optimum Wiener solution as the data length n approaches infinity, if the filter input and the desired response are jointly stationary ergodic processes.

Property 2. The least-squares estimate of the coefficient vector is unbiased if the error signal e(i) has zero mean for all i.

Property 3. The covariance matrix of the least-squares estimate $\hat{\mathbf{h}}$ equals $\boldsymbol{\Phi}^{-1}$, except for a scaling factor, if the error vector $\mathbf{e}_0$ has zero mean and its elements are uncorrelated.

Property 4. If the elements of the error vector $\mathbf{e}_0$ are statistically independent and Gaussian-distributed, then the least-squares estimate is the same as the maximum-likelihood estimate.

Page 10


The Matrix-Inversion Lemma

Let A and B be two positive definite M-by-M matrices related by

$$\mathbf{A} = \mathbf{B}^{-1} + \mathbf{C}\,\mathbf{D}^{-1}\,\mathbf{C}^T \qquad (4.16)$$

where D is another positive definite N-by-N matrix and C is an M-by-N matrix. According to the matrix-inversion lemma, we may express the inverse of the matrix A as follows:

$$\mathbf{A}^{-1} = \mathbf{B} - \mathbf{B}\mathbf{C}\left(\mathbf{D} + \mathbf{C}^T\mathbf{B}\mathbf{C}\right)^{-1}\mathbf{C}^T\mathbf{B} \qquad (4.17)$$
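The lemma is easy to check numerically. The sketch below (dimensions and seed are arbitrary choices of ours) builds random matrices of the required forms and verifies Eq. (4.17) against direct inversion:

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 4, 2
X = rng.standard_normal((M, M)); B = X @ X.T + M * np.eye(M)   # positive definite
Y = rng.standard_normal((N, N)); D = Y @ Y.T + N * np.eye(N)   # positive definite
C = rng.standard_normal((M, N))

A = np.linalg.inv(B) + C @ np.linalg.inv(D) @ C.T              # Eq. (4.16)

# Right-hand side of Eq. (4.17)
A_inv = B - B @ C @ np.linalg.inv(D + C.T @ B @ C) @ C.T @ B

assert np.allclose(A_inv, np.linalg.inv(A))                    # the lemma holds
```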

Page 11


The Recursive Least-Squares (RLS) Algorithm

The deterministic correlation matrix $\boldsymbol{\Phi}(n)$ is now modified term by term as

$$\phi(n; k, m) = \sum_{i=1}^{n} u(i-k)\,u(i-m) + c\,\delta_{mk} \qquad (4.18)$$

where c is a small positive constant and $\delta_{mk}$ is the Kronecker delta:

$$\delta_{mk} = \begin{cases} 1, & m = k \\ 0, & m \neq k \end{cases} \qquad (4.19)$$

Page 12


This expression can be reformulated as

$$\phi(n; k, m) = u(n-k)\,u(n-m) + \left[\sum_{i=1}^{n-1} u(i-k)\,u(i-m) + c\,\delta_{mk}\right] \qquad (4.20)$$

where the bracketed term equals $\phi(n-1; k, m)$, yielding

$$\phi(n; k, m) = \phi(n-1; k, m) + u(n-k)\,u(n-m), \qquad k, m = 0, 1, \ldots, M-1 \qquad (4.21)$$

Note that this recursive equation is independent of the arbitrarily small constant c.

Page 13


Defining the M-by-1 tap input vector

$$\mathbf{u}(n) = \left[u(n), u(n-1), \ldots, u(n-M+1)\right]^T \qquad (4.22)$$

we can express the correlation matrix as

$$\boldsymbol{\Phi}(n) = \boldsymbol{\Phi}(n-1) + \mathbf{u}(n)\,\mathbf{u}^T(n) \qquad (4.23)$$

and make the following associations to use the matrix inversion lemma

$$\mathbf{A} = \boldsymbol{\Phi}(n), \qquad \mathbf{B}^{-1} = \boldsymbol{\Phi}(n-1), \qquad \mathbf{C} = \mathbf{u}(n), \qquad \mathbf{D} = 1$$

Thus the inverse of the correlation matrix takes the recursive form

$$\boldsymbol{\Phi}^{-1}(n) = \boldsymbol{\Phi}^{-1}(n-1) - \frac{\boldsymbol{\Phi}^{-1}(n-1)\,\mathbf{u}(n)\,\mathbf{u}^T(n)\,\boldsymbol{\Phi}^{-1}(n-1)}{1 + \mathbf{u}^T(n)\,\boldsymbol{\Phi}^{-1}(n-1)\,\mathbf{u}(n)} \qquad (4.24)$$
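Eq. (4.24) replaces an $O(M^3)$ matrix inversion with an $O(M^2)$ rank-one update. A small numerical check of our own against direct inversion:

```python
import numpy as np

def update_inverse(P_prev, u):
    """Eq. (4.24): rank-one update of P(n) = Phi^{-1}(n),
    given P(n-1) (symmetric) and the new tap-input vector u(n)."""
    Pu = P_prev @ u
    return P_prev - np.outer(Pu, Pu) / (1.0 + u @ Pu)

rng = np.random.default_rng(1)
M = 4
Phi = 0.01 * np.eye(M)                   # Phi(0) = c I, small c > 0, per Eq. (4.18)
P = np.linalg.inv(Phi)
for _ in range(10):
    u = rng.standard_normal(M)
    Phi += np.outer(u, u)                # Eq. (4.23)
    P = update_inverse(P, u)             # Eq. (4.24)
assert np.allclose(P, np.linalg.inv(Phi))
```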

Page 14


For convenience of computation, let

$$\mathbf{P}(n) = \boldsymbol{\Phi}^{-1}(n) \qquad (4.25)$$

and

$$\mathbf{k}(n) = \frac{\mathbf{P}(n-1)\,\mathbf{u}(n)}{1 + \mathbf{u}^T(n)\,\mathbf{P}(n-1)\,\mathbf{u}(n)} \qquad (4.26)$$

Then, we may rewrite Eq. (4.24) as follows:

$$\mathbf{P}(n) = \mathbf{P}(n-1) - \mathbf{k}(n)\,\mathbf{u}^T(n)\,\mathbf{P}(n-1) \qquad (4.27)$$

The M-by-1 vector k(n) is called the gain vector.

Postmultiplying both sides of Eq. (4.27) by the tap-input vector $\mathbf{u}(n)$, we get

$$\mathbf{P}(n)\,\mathbf{u}(n) = \mathbf{P}(n-1)\,\mathbf{u}(n) - \mathbf{k}(n)\,\mathbf{u}^T(n)\,\mathbf{P}(n-1)\,\mathbf{u}(n) \qquad (4.28)$$

Rearranging Eq. (4.26), we find that

$$\mathbf{k}(n) = \mathbf{P}(n-1)\,\mathbf{u}(n) - \mathbf{k}(n)\,\mathbf{u}^T(n)\,\mathbf{P}(n-1)\,\mathbf{u}(n) \qquad (4.29)$$

Therefore, substituting Eq. (4.29) into Eq. (4.28) and simplifying, we get

$$\mathbf{k}(n) = \mathbf{P}(n)\,\mathbf{u}(n) \qquad (4.30)$$

Page 15


Recall that $\hat{\mathbf{h}}(n) = \boldsymbol{\Phi}^{-1}(n)\,\boldsymbol{\theta}(n) = \mathbf{P}(n)\,\boldsymbol{\theta}(n)$. The recursion therefore requires not only updates for $\mathbf{P}(n) = \boldsymbol{\Phi}^{-1}(n)$, as given by Eq. (4.27), but also recursive updates for the deterministic cross-correlation vector $\boldsymbol{\theta}(n)$, whose elements are defined by

$$\theta(n; k) = \sum_{i=1}^{n} d(i)\,u(i-k), \qquad k = 0, 1, \ldots, M-1 \qquad (4.5)$$

which can be rewritten as

$$\theta(n; k) = d(n)\,u(n-k) + \sum_{i=1}^{n-1} d(i)\,u(i-k) = d(n)\,u(n-k) + \theta(n-1; k) \qquad (4.31)$$

yielding the recursion

$$\boldsymbol{\theta}(n) = \boldsymbol{\theta}(n-1) + d(n)\,\mathbf{u}(n) \qquad (4.32)$$

Page 16


As a result,

$$\hat{\mathbf{h}}(n) = \mathbf{P}(n)\,\boldsymbol{\theta}(n) = \mathbf{P}(n)\,\boldsymbol{\theta}(n-1) + d(n)\,\mathbf{P}(n)\,\mathbf{u}(n) = \mathbf{P}(n)\,\boldsymbol{\theta}(n-1) + d(n)\,\mathbf{k}(n) \qquad (4.33)$$

With the suitable substitutions (Eq. (4.27) for $\mathbf{P}(n)$ and $\hat{\mathbf{h}}(n-1) = \mathbf{P}(n-1)\,\boldsymbol{\theta}(n-1)$), we get

$$\hat{\mathbf{h}}(n) = \mathbf{P}(n-1)\,\boldsymbol{\theta}(n-1) - \mathbf{k}(n)\,\mathbf{u}^T(n)\,\mathbf{P}(n-1)\,\boldsymbol{\theta}(n-1) + d(n)\,\mathbf{k}(n) \qquad (4.34)$$

which can be expressed as

$$\hat{\mathbf{h}}(n) = \hat{\mathbf{h}}(n-1) + \mathbf{k}(n)\left[d(n) - \mathbf{u}^T(n)\,\hat{\mathbf{h}}(n-1)\right] = \hat{\mathbf{h}}(n-1) + \mathbf{k}(n)\,\alpha(n) \qquad (4.35)$$

where $\alpha(n)$ is a "true" estimation error defined as

$$\alpha(n) = d(n) - \mathbf{u}^T(n)\,\hat{\mathbf{h}}(n-1) \qquad (4.36)$$

Equations (4.35) and (4.36) constitute the recursive least-squares (RLS) algorithm.

Page 17


Summary of the RLS Algorithm

1. Let n = 1.

2. Compute the gain vector

$$\mathbf{k}(n) = \frac{\mathbf{P}(n-1)\,\mathbf{u}(n)}{1 + \mathbf{u}^T(n)\,\mathbf{P}(n-1)\,\mathbf{u}(n)}$$

3. Compute the true estimation error

$$\alpha(n) = d(n) - \mathbf{u}^T(n)\,\hat{\mathbf{h}}(n-1)$$

4. Update the estimate of the coefficient vector

$$\hat{\mathbf{h}}(n) = \hat{\mathbf{h}}(n-1) + \mathbf{k}(n)\,\alpha(n)$$

5. Update the error correlation matrix

$$\mathbf{P}(n) = \mathbf{P}(n-1) - \mathbf{k}(n)\,\mathbf{u}^T(n)\,\mathbf{P}(n-1)$$

6. Increment n by 1 and go back to step 2.

Side result: the minimum value of the residual sum of squares obeys the recursion

$$J_{\min}(n) = J_{\min}(n-1) + \alpha(n)\,e(n) \qquad (4.37)$$
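The summary translates almost line for line into code. A minimal sketch, assuming the prewindowing convention $u(i) = 0$ for $i \le 0$ and the initial conditions $\hat{\mathbf{h}}(0) = \mathbf{0}$ and $\mathbf{P}(0) = (1/c)\mathbf{I}$ that follow from the modified correlation of Eq. (4.18); the function name and the choice of c are ours:

```python
import numpy as np

def rls(u, d, M, c=0.01):
    """Recursive least squares per the summary above."""
    h = np.zeros(M)                        # h_hat(0) = 0
    P = np.eye(M) / c                      # P(0) = Phi^{-1}(0) = (1/c) I
    u_pad = np.concatenate([np.zeros(M - 1), np.asarray(u, float)])
    for n in range(len(u)):
        u_n = u_pad[n : n + M][::-1]       # tap-input vector [u(n), ..., u(n-M+1)]
        Pu = P @ u_n
        k = Pu / (1.0 + u_n @ Pu)          # step 2: gain vector, Eq. (4.26)
        alpha = d[n] - u_n @ h             # step 3: true estimation error, Eq. (4.36)
        h = h + k * alpha                  # step 4: coefficient update, Eq. (4.35)
        P = P - np.outer(k, Pu)            # step 5: Eq. (4.27), using P symmetric
    return h
```

As a sanity check, driving an unknown tapped-delay-line filter with white noise and passing its input and (possibly noisy) output to `rls` should recover the tap coefficients within a few multiples of M samples, consistent with the convergence comparison that follows.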

Page 18

Comparison of the RLS and LMS Algorithms


Figure 4.2 Multidimensional signal-flow graph (a) RLS algorithm (b) LMS algorithm

Page 19


1. In the LMS algorithm, the correction that is applied in updating the old estimate of the coefficient vector is based on the instantaneous sample value of the tap-input vector and the error signal. On the other hand, in the RLS algorithm the computation of this correction utilizes all the past available information.

2. In the LMS algorithm, the correction applied to the previous estimate consists of the product of three factors: the (scalar) step-size parameter $\mu$, the error signal e(n-1), and the tap-input vector $\mathbf{u}(n-1)$. On the other hand, in the RLS algorithm this correction consists of the product of two factors: the true estimation error $\alpha(n)$ and the gain vector $\mathbf{k}(n)$. The gain vector itself consists of $\boldsymbol{\Phi}^{-1}(n)$, the inverse of the deterministic correlation matrix, multiplied by the tap-input vector $\mathbf{u}(n)$. The major difference between the LMS and RLS algorithms is therefore the presence of $\boldsymbol{\Phi}^{-1}(n)$ in the correction term of the RLS algorithm, which has the effect of decorrelating the successive tap inputs, thereby making the RLS algorithm self-orthogonalizing. Because of this property, the RLS algorithm is essentially independent of the eigenvalue spread of the correlation matrix of the filter input. (A minimal LMS sketch for contrast follows below.)
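For contrast, here is a minimal LMS sketch in the same conventions as the `rls` routine above (updating with e(n) and u(n) is our bookkeeping choice; the slides index the correction with e(n-1) and u(n-1)). The correction is just the scalar step size times the error times the tap-input vector; no gain vector and no inverse correlation matrix appear:

```python
import numpy as np

def lms(u, d, M, mu=0.05):
    """Minimal LMS sketch for comparison with rls()."""
    h = np.zeros(M)
    u_pad = np.concatenate([np.zeros(M - 1), np.asarray(u, float)])
    for n in range(len(u)):
        u_n = u_pad[n : n + M][::-1]       # tap-input vector [u(n), ..., u(n-M+1)]
        e = d[n] - u_n @ h                 # error signal
        h = h + mu * e * u_n               # step size * error * tap-input vector
    return h
```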

Page 20


3. The LMS algorithm requires approximately 20M iterations to converge in mean square, where M is the number of tap coefficients contained in the tapped-delay-line filter. On the other hand, the RLS algorithm converges in mean square within less than 2M iterations. The rate of convergence of the RLS algorithm is therefore, in general, faster than that of the LMS algorithm by an order of magnitude.

4. Unlike the LMS algorithm, there are no approximations made in the derivation of the RLS algorithm. Accordingly, as the number of iterations approaches infinity, the least-squares estimate of the coefficient vector approaches the optimum Wiener value, and correspondingly, the mean-square error approaches the minimum value possible. In other words, the RLS algorithm, in theory, exhibits zero misadjustment. On the other hand, the LMS algorithm always exhibits a nonzero misadjustment; however, this misadjustment may be made arbitrarily small by using a sufficiently small step-size parameter $\mu$. (The experiment sketched below illustrates the difference in convergence behavior.)
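These convergence claims can be illustrated with a small system-identification experiment, sketched below; the dimensions, noise level, and step size are our own choices, and it reuses the `rls` and `lms` routines defined earlier:

```python
import numpy as np

rng = np.random.default_rng(2)
M = 8
h_true = rng.standard_normal(M)          # unknown tapped-delay-line filter
u = rng.standard_normal(100)             # white input, 100 samples (~12 M)
u_pad = np.concatenate([np.zeros(M - 1), u])
d = np.array([u_pad[n : n + M][::-1] @ h_true for n in range(len(u))])
d += 0.01 * rng.standard_normal(len(u))  # small measurement noise

h_rls = rls(u, d, M)
h_lms = lms(u, d, M, mu=0.05)
# RLS should essentially have converged within ~2M samples; LMS, needing
# roughly 20M iterations, is typically still adapting at this data length.
print("RLS coefficient error:", np.linalg.norm(h_rls - h_true))
print("LMS coefficient error:", np.linalg.norm(h_lms - h_true))
```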

Page 21


5. The superior performance of the RLS algorithm compared to the LMS algorithm, however, is attained at the expense of a large increase in computational complexity. The complexity of an adaptive algorithm for real-time operation is determined by two principal factors: (1) the number of multiplications (with divisions counted as multiplications) per iteration, and (2) the precision required to perform arithmetic operations. The RLS algorithm requires a total of 3M(3 + M)/2 multiplications, which increases as the square of M, the number of filter coefficients. On the other hand, the LMS algorithm requires 2M + 1 multiplications, increasing linearly with M. For example, for M = 31 the RLS algorithm requires 1581 multiplications per iteration, whereas the LMS algorithm requires only 63.