IV. Recursive Least Squares Algorithm (RLS)

• Differences with the LMS algorithm
• Derivation of the iterative scheme
• Application to Adaptive Noise Cancelation (ANC)
• RLS convergence issues
• Applications to adaptive equalization
References: [Manolakis]; [Haykin] S. Haykin, Adaptive Filter Theory, Prentice Hall.
Potential Problem with LMS

• LMS is based on using instantaneous data for statistical estimates.
• As a result, the impact of noise may be high, but the scheme is fast to track changes in signal behavior.
[Figure: adaptive filter block diagram — the input x(n) drives the filter, whose output d̂(n) is subtracted from the desired signal d(n) to form the error e(n).]

d̂(n) = h^H x(n)
h = [h_0, h_1, …, h_{p−1}]^T
x(n) = [x(n), x(n−1), …, x(n−p+1)]^T
e(n) = d(n) − d̂(n)
RLS Minimizes the Error

ξ(n) = Σ_{i=1}^{n} λ^{n−i} |e(i|n)|²

λ: forgetting factor, 0 ≤ λ ≤ 1; λ = 1 gives infinite memory. Usually 0.95 ≤ λ ≤ 0.99, which is effective to track non-stationarities.

e(i|n) = d(i) − h^H(n) x(i)

Information regarding the filter coefficients is known up to time n.
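As a minimal illustration of the forgetting factor (a sketch of mine, not from the slides; the arrays are placeholders), the cost ξ(n) for a candidate weight vector h(n) can be evaluated directly:

```python
import numpy as np

def weighted_cost(h, X, d, lam):
    """Exponentially weighted cost xi(n) = sum_i lam^(n-i) |e(i|n)|^2.

    h   : (p,) current filter estimate h(n)
    X   : (n, p) array; row i is the regressor vector x(i)
    d   : (n,) desired samples d(i)
    lam : forgetting factor, 0 < lam <= 1
    """
    n = len(d)
    e = d - X @ np.conj(h)                    # e(i|n) = d(i) - h^H(n) x(i)
    w = lam ** np.arange(n - 1, -1, -1)       # lam^(n-i); newest sample weighted by 1
    return np.sum(w * np.abs(e) ** 2)
```

With λ close to 1, old errors keep nearly full weight; lowering λ discounts them geometrically, which is what lets the filter track slow non-stationarities.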
• Optimum weights are obtained for ∇_h [ξ(h)] = 0.

Recall LMS: ∇_h [ξ(h)] → −2 e(n) x*(n)

For RLS: ξ[h(n)] = Σ_{i=1}^{n} λ^{n−i} |e(i|n)|²

Information regarding the filter coefficients is known up to time n.

• The RLS criterion is different from the LMS one! The criterion changes with n: it is recomputed and re-optimized at each iteration.
∇_h ξ[h(n)] = ∇_h [ Σ_{i=1}^{n} λ^{n−i} |e(i|n)|² ]
= ∇_h [ Σ_{i=1}^{n} λ^{n−i} |d(i) − h^H(n) x(i)|² ]
= ∇_h [ Σ_{i=1}^{n} λ^{n−i} ( |d(i)|² + h^H(n) x(i) x^H(i) h(n) − d*(i) h^H(n) x(i) − d(i) x^H(i) h(n) ) ]

⇒ ∇_h ξ[h(n)] = 2 Σ_{i=1}^{n} λ^{n−i} ( x(i) x^H(i) h(n) − x(i) d*(i) )
• Optimum weights are obtained with:

∇_h [ξ(h)] = 0 ⇒ Σ_{i=1}^{n} λ^{n−i} x(i) x^H(i) h(n) = Σ_{i=1}^{n} λ^{n−i} x(i) d*(i)

i.e., φ(n) h(n) = θ(n)   (1)

• Note: φ(n) is the estimated (biased) autocorrelation matrix; for λ = 1, lim_{n→+∞} φ(n)/n = R_x.

⇒ h(n) = φ^{−1}(n) θ(n)   (2)
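As a sanity check on (1)–(2), the weighted normal equations can be solved directly at one time index (a sketch of mine; function and variable names are not from the slides):

```python
import numpy as np

def batch_rls_solution(X, d, lam):
    """Solve phi(n) h(n) = theta(n), eqs. (1)-(2), directly at the last index n.

    X   : (n, p) array; row i is the regressor vector x(i)
    d   : (n,) desired samples d(i)
    lam : forgetting factor
    """
    n, p = X.shape
    w = lam ** np.arange(n - 1, -1, -1)     # weights lam^(n-i)
    phi = (X.T * w) @ X.conj()              # phi(n) = sum lam^(n-i) x(i) x^H(i)
    theta = (X.T * w) @ d.conj()            # theta(n) = sum lam^(n-i) x(i) d*(i)
    return np.linalg.solve(phi, theta)      # h(n) = phi^{-1}(n) theta(n)
```

Solving this at every n costs O(p³) per sample, which is exactly the expense the recursion derived next avoids.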
• The above equation is expensive to solve at each time sample.
• How can we solve it recursively instead?

h(n) = φ^{−1}(n) θ(n)   (2)

φ(n) = Σ_{i=1}^{n} λ^{n−i} x(i) x^H(i) = Σ_{i=1}^{n−1} λ^{n−i} x(i) x^H(i) + x(n) x^H(n)
= λ Σ_{i=1}^{n−1} λ^{(n−1)−i} x(i) x^H(i) + x(n) x^H(n)

⇒ φ(n) = λ φ(n−1) + x(n) x^H(n)   (3)
Similarly:

θ(n) = Σ_{i=1}^{n} λ^{n−i} x(i) d*(i) = Σ_{i=1}^{n−1} λ^{n−i} x(i) d*(i) + x(n) d*(n)

⇒ θ(n) = λ θ(n−1) + x(n) d*(n)   (4)

• How to update θ(n)? Solve (2) recursively:

(1) Replace (3) and (4) in (2):

h(n) = [λ φ(n−1) + x(n) x^H(n)]^{−1} [λ θ(n−1) + x(n) d*(n)]   (5)
(2) Use the matrix inversion lemma to simplify (5):

A = B^{−1} + C D^{−1} C^H ⇒ A^{−1} = B − B C (D + C^H B C)^{−1} C^H B   (6)

with A = φ(n), B^{−1} = λ φ(n−1), C = x(n), D = 1.

(3) Use (6) in (5).
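The lemma (6) is easy to verify numerically (a throwaway sketch; dimensions and random data are arbitrary, real-valued for simplicity):

```python
import numpy as np

rng = np.random.default_rng(0)
p, lam = 4, 0.98

# B^{-1} = lam * phi(n-1): build a random positive-definite phi(n-1)
M = rng.standard_normal((p, p))
phi_prev = M @ M.T + p * np.eye(p)
B = np.linalg.inv(lam * phi_prev)
C = rng.standard_normal((p, 1))    # C = x(n)
D = np.eye(1)                      # D = 1

A = np.linalg.inv(B) + C @ np.linalg.inv(D) @ C.T
lemma = B - B @ C @ np.linalg.inv(D + C.T @ B @ C) @ C.T @ B
print(np.allclose(np.linalg.inv(A), lemma))   # True
```

The payoff is that inverting A (a p×p matrix) is replaced by inverting D + C^H B C, which is a scalar here.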
⇒ φ^{−1}(n) = [λ φ(n−1) + x(n) x^H(n)]^{−1}
= λ^{−1} φ^{−1}(n−1) − λ^{−1} φ^{−1}(n−1) x(n) [1 + λ^{−1} x^H(n) φ^{−1}(n−1) x(n)]^{−1} λ^{−1} x^H(n) φ^{−1}(n−1)

⇒ φ^{−1}(n) = λ^{−1} φ^{−1}(n−1) − [λ^{−2} φ^{−1}(n−1) x(n) x^H(n) φ^{−1}(n−1)] / [1 + λ^{−1} x^H(n) φ^{−1}(n−1) x(n)]   (8)
(4) Define P(n) = φ^{−1}(n) and the gain vector

k(n) = [λ^{−1} φ^{−1}(n−1) x(n)] / [1 + λ^{−1} x^H(n) φ^{−1}(n−1) x(n)]
= [λ^{−1} P(n−1) x(n)] / [1 + λ^{−1} x^H(n) P(n−1) x(n)]   (9)

(5) Replace (9) in (8):

φ^{−1}(n) = P(n) = λ^{−1} P(n−1) − λ^{−1} k(n) x^H(n) P(n−1)   (10)
(6) Rearrange (9):

k(n) [1 + λ^{−1} x^H(n) P(n−1) x(n)] = λ^{−1} P(n−1) x(n)   (11)

⇒ k(n) = λ^{−1} P(n−1) x(n) − λ^{−1} k(n) x^H(n) P(n−1) x(n)
= [λ^{−1} P(n−1) − λ^{−1} k(n) x^H(n) P(n−1)] x(n)

From (10):

k(n) = P(n) x(n)   (12)
(7) Update the weights h(n):

h(n) = φ^{−1}(n) θ(n) = P(n) [λ θ(n−1) + x(n) d*(n)]

(8) Replace (10) and (12) in the expression above:

h(n) = [λ^{−1} P(n−1) − λ^{−1} k(n) x^H(n) P(n−1)] [λ θ(n−1) + x(n) d*(n)]
= P(n−1) θ(n−1) − k(n) x^H(n) P(n−1) θ(n−1) + [λ^{−1} P(n−1) − λ^{−1} k(n) x^H(n) P(n−1)] x(n) d*(n)
= P(n−1) θ(n−1) − k(n) x^H(n) P(n−1) θ(n−1) + P(n) x(n) d*(n)
Using h(n−1) = P(n−1) θ(n−1) and k(n) = P(n) x(n) (from (12)):

h(n) = h(n−1) − k(n) x^H(n) h(n−1) + k(n) d*(n)

⇒ h(n) = h(n−1) + k(n) [d(n) − h^H(n−1) x(n)]*

Note: α(n) = d(n) − h^H(n−1) x(n) ≠ e(n) = d(n) − h^H(n) x(n)

α(n): innovation sequence (a priori estimation error); e(n): a posteriori estimation error.
RLS recursion:

k(n) = [λ^{−1} P(n−1) x(n)] / [1 + λ^{−1} x^H(n) P(n−1) x(n)]   (14)

α(n) = d(n) − h^H(n−1) x(n)   (15)   (α(n): innovation sequence)

h(n) = h(n−1) + k(n) α*(n)   (16)

P(n) = λ^{−1} P(n−1) − λ^{−1} k(n) x^H(n) P(n−1)   (17)

How to Initialize
• Initialize P(0) and h(0): pick h(0) at random; define P(0) = δ^{−1} I, δ > 0.
• δ represents the confidence in the initial estimates (pick δ small).
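The recursion (14)–(17) translates directly into code. Below is a minimal sketch of mine (numpy; the function and variable names are my own, not an official implementation):

```python
import numpy as np

def rls(x, d, p, lam=0.98, delta=1e-2):
    """Exponentially weighted RLS following recursion (14)-(17).

    x     : (N,) input samples
    d     : (N,) desired samples
    p     : filter length
    lam   : forgetting factor (0.95..0.99 typical; 1 = infinite memory)
    delta : P(0) = I/delta; small delta reflects low confidence in h(0)
    Returns (h, err): final weights and the a priori error sequence alpha(n).
    """
    N = len(x)
    h = np.zeros(p, dtype=complex)           # h(0) (the slides suggest random init)
    P = np.eye(p, dtype=complex) / delta     # P(0) = delta^{-1} I
    xb = np.zeros(p, dtype=complex)          # regressor [x(n), ..., x(n-p+1)]^T
    err = np.zeros(N, dtype=complex)
    for n in range(N):
        xb = np.roll(xb, 1)
        xb[0] = x[n]
        Px = P @ xb / lam                    # lam^{-1} P(n-1) x(n)
        k = Px / (1.0 + np.vdot(xb, Px))     # gain vector, eq. (14)
        alpha = d[n] - np.vdot(h, xb)        # a priori error, eq. (15)
        h = h + k * np.conj(alpha)           # weight update, eq. (16)
        P = (P - np.outer(k, xb.conj()) @ P) / lam   # eq. (17)
        err[n] = alpha
    return h, err
```

For λ = 1 and stationary data, h approaches the least-squares (Wiener) solution after roughly 2p samples, consistent with the convergence property discussed later in this section.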
Comparisons with LMS

LMS: h(n+1) = h(n) + γ e(n) x*(n)   [optimum filter at convergence (CV)]

RLS: h(n) = h(n−1) + k(n) α*(n)   [optimum filter at each iteration]
• Example: RLS applied to a one-coefficient adaptive noise canceller (ANC).

[Figure: ANC block diagram — the reference input x(n) drives the adaptive filter; its output d̂(n) is subtracted from the primary input d(n) to form e(n) = d(n) − d̂(n).]

Recall:

k(n) = [λ^{−1} P(n−1) x(n)] / [1 + λ^{−1} x^H(n) P(n−1) x(n)],   P(n) = φ^{−1}(n)
α(n) = d(n) − h^H(n−1) x(n)
h(n) = h(n−1) + k(n) α*(n)
P(n) = λ^{−1} P(n−1) − λ^{−1} k(n) x^H(n) P(n−1)

With a single coefficient (p = 1), all of these quantities become scalars.
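A minimal single-coefficient version of the recursion (a sketch; the sinusoid-plus-noise scenario and all signal parameters are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
N, lam, delta = 2000, 0.99, 1e-2

s = np.sin(2 * np.pi * 0.05 * np.arange(N))   # signal of interest
v = rng.standard_normal(N)                    # noise reference x(n)
d = s + 0.8 * v                               # primary input: signal + correlated noise

h, P = 0.0, 1.0 / delta                       # scalar h(0) and P(0)
e = np.zeros(N)
for n in range(N):
    x = v[n]
    k = (P * x / lam) / (1.0 + x * P * x / lam)   # scalar gain, eq. (14)
    alpha = d[n] - h * x                          # a priori error, eq. (15)
    h = h + k * alpha                             # real-valued case: alpha* = alpha
    P = (P - k * x * P) / lam                     # eq. (17)
    e[n] = alpha                                  # e(n) -> s(n) as h -> 0.8

print(f"h = {h:.3f} (ideal 0.8)")
```

Because the noise path gain is 0.8 in this made-up setup, the single coefficient converges to 0.8 and the error output e(n) recovers the sinusoid s(n).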
A Few Comments on Convergence

(1) The sensitivity of the RLS implementation is determined by the minimum eigenvalue (magnitude) → ill-conditioned RLS problems have "bad" convergence properties.
(2) Zero misadjustment in a stationary environment.
(3) Convergence is obtained in about twice the number of filter weights.
(4) For control applications: RLS = Kalman filter.
• Property 2: the MSE can be shown to behave as

MSE(n) ≈ σ²_{e,min} (1 + P/n),   for λ = 1 and n large

There is zero misadjustment in a stationary environment; misadjustment occurs in a non-stationary environment:

Misadjustment ≈ P (1 − λ) / (1 + λ),   P: filter length

• Property 3: algorithm convergence is obtained in about twice the number of filter coefficients.
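For example (numbers mine, just to make the formula concrete): with P = 11 and λ = 0.99, the misadjustment is ≈ 11 × (1 − 0.99)/(1 + 0.99) = 0.11/1.99 ≈ 0.055, i.e., about 5.5%.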
Comparisons Between LMS and RLS

(1) RLS has a faster rate of convergence than LMS in stationary environments.
(2) LMS tracks better than RLS does in non-stationary cases.
• Application to Adaptive Equalization (same example as for the FIR & LMS implementations done earlier)

Goal: design the adaptive equalizer to correct the distortion produced by the channel in the presence of noise.

Assume:
• {a_n} = ±1 with zero mean (BPSK sequence).
• Channel impulse response:

c_n = (1/2) [1 + cos(2π(n − 2)/W)]   for n = 1, 2, 3;   0 otherwise

[Figure: block diagram — the BPSK sequence a_k, produced by random noise generator (2), passes through the channel; random noise of variance σ_v² is added to form the equalizer input x(n); the adaptive equalizer output is compared with a delayed copy of a_k to form e(n).]
• Channel impulse response:

c_n = (1/2) [1 + cos(2π(n − 2)/W)]   for n = 1, 2, 3;   0 otherwise

W represents the amount of amplitude distortion introduced by the channel. Assume W = 3.1 ⇒ c_0 = 0.2798, c_1 = 1, c_2 = 0.2798.

H_c(z) = c_0 + c_1 z^{−1} + c_2 z^{−2}
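These coefficient values are easy to verify with the raised-cosine formula above (a minimal sketch of mine):

```python
import numpy as np

def channel(W):
    """Channel taps c_n = 0.5*(1 + cos(2*pi*(n-2)/W)) for n = 1, 2, 3."""
    n = np.arange(1, 4)
    return 0.5 * (1.0 + np.cos(2.0 * np.pi * (n - 2) / W))

print(channel(3.1))   # [0.2798 1.     0.2798]
```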
(1) Assume σ_v = 0 and instantaneous transfer (delay = 0).
(2) Assume σ_v² = 0.001, P = 11, W = 3.1.

Figure 9.17 → an eigenvalue spread increase causes:
• increased error
• the algorithm to slow down

Figure 9.19 →
• faster convergence with a larger step size
• average error increases with a larger step size

LMS: γ = 0.075. RLS: λ = 1, δ = 0.004.

See Figures 13.6, 13.7, 13.8.
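To make the experiment concrete, here is a self-contained sketch of mine (the slides specify σ_v² = 0.001, P = 11, W = 3.1, λ = 1, δ = 0.004; the decision delay of 7 samples and the random seed are my assumptions, with the delay chosen to roughly center the equalizer response):

```python
import numpy as np

rng = np.random.default_rng(0)
N, P, lam, delta, delay = 500, 11, 1.0, 0.004, 7   # delay: assumed mid-filter value

W = 3.1
c = 0.5 * (1 + np.cos(2 * np.pi * (np.arange(1, 4) - 2) / W))   # channel taps

a = rng.choice([-1.0, 1.0], size=N)                              # BPSK symbols
x = np.convolve(a, c)[:N] + np.sqrt(0.001) * rng.standard_normal(N)

h = np.zeros(P)
Pm = np.eye(P) / delta                                           # P(0) = I / delta
xb = np.zeros(P)
err2 = np.zeros(N)
for n in range(N):
    xb = np.roll(xb, 1)
    xb[0] = x[n]
    d = a[n - delay] if n >= delay else 0.0                      # delayed training symbol
    Px = Pm @ xb / lam
    k = Px / (1.0 + xb @ Px)                                     # eq. (14)
    alpha = d - h @ xb                                           # eq. (15); real data
    h += k * alpha                                               # eq. (16)
    Pm = (Pm - np.outer(k, xb) @ Pm) / lam                       # eq. (17)
    err2[n] = alpha ** 2

# Averaging err2 over many independent runs gives learning curves like Fig. 13.6.
```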
[Figure ([Haykin]): LMS learning curves, average of 20 trials, for W = 2.9 (χ = 6), W = 3.1 (χ = 11), W = 3.3 (χ = 21), W = 3.5 (χ = 46); convergence (CV) slows as the eigenvalue spread χ increases.]
[Figure ([Haykin]): LMS learning curves, average of 100 trials, for W = 2.9 (χ = 6), W = 3.1 (χ = 11), W = 3.3 (χ = 21), W = 3.5 (χ = 46); illustrates step-size misadjustment.]
LMS & RLS
[Haykin]
(30dB SNR, Filter length =11)
Comments:1) RLS CVs in twice the filter length2) RLS CV rate faster than LMS CV rate3) MSE at CV is smaller for RLS than for LMS
Figure 13.6. Learning curves for LMS and RLS algorithms;
a) LMS: step size=0.075; RLS: W=2.9, δ=0.004, λ=1.
b) LMS: step size=0.075; RLS: W=2.9, δ=0.004, λ=1.
LMS & RLS
[Haykin]
(30dB SNR, Filter length =11)
Comments:1) RLS CVs in twice the filter length2) RLS CV rate is independent of χ(Rx ), as shown in Fig. 13.7
Figure 13.6. Learning curves for LMS and RLS algorithms;
c) LMS: step size=0.075; RLS: W=3.3, δ=0.004, λ=1.
d) LMS: step size=0.075; RLS: W=3.5, δ=0.004, λ=1.
LMS & RLS [Haykin]

Comment: as the SNR increases, LMS & RLS converge in approximately the same number of iterations.

Figure 13.8. Learning curves for the RLS & LMS algorithms for W = 3.1; LMS: step size = 0.075; RLS: δ = 0.004, λ = 1.
RLS [Haykin]

[Figure: RLS learning curves for W = 2.9, 3.1, 3.3, 3.5, i.e., eigenvalue spreads χ = λ_max/λ_min = 6, 11, 21, 46.]
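The quoted eigenvalue spreads can be reproduced numerically (a sketch; I assume, as in [Haykin]'s equalization experiment, that χ is computed from the 11×11 correlation matrix of the noisy channel output):

```python
import numpy as np

def eig_spread(W, P=11, var_v=0.001):
    """chi(R_x) = lambda_max / lambda_min of the equalizer-input correlation matrix."""
    c = 0.5 * (1 + np.cos(2 * np.pi * (np.arange(1, 4) - 2) / W))
    # x(n) = sum_k c_k a(n-k) + v(n), with white +/-1 data and white noise:
    r = np.correlate(c, c, mode="full")[len(c) - 1:]      # r(0), r(1), r(2)
    r = np.concatenate([r, np.zeros(P - len(r))])
    r[0] += var_v                                         # add noise variance to r(0)
    R = np.array([[r[abs(i - j)] for j in range(P)] for i in range(P)])
    ev = np.linalg.eigvalsh(R)
    return ev.max() / ev.min()

for W in (2.9, 3.1, 3.3, 3.5):
    print(W, round(eig_spread(W)))   # approx. 6, 11, 21, 46
```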