This article was downloaded by: [Columbia University]
On: 11 November 2014, At: 10:42
Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

International Journal of Control
Publication details, including instructions for authors and subscription information:
http://www.tandfonline.com/loi/tcon20

Stochastic algorithms in system identification
HARALD HÖGE, Siemens Aktiengesellschaft, Zentrallaboratorium für Nachrichtentechnik, D-8000 München 70, Germany
Published online: 27 Mar 2007.

To cite this article: HARALD HÖGE (1973) Stochastic algorithms in system identification, International Journal of Control, 17:6, 1121-1128, DOI: 10.1080/00207177308932456

To link to this article: http://dx.doi.org/10.1080/00207177308932456

PLEASE SCROLL DOWN FOR ARTICLE

Taylor & Francis makes every effort to ensure the accuracy of all the information (the "Content") contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content.

This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http://www.tandfonline.com/page/terms-and-conditions




INT. J. CONTROL, 1973, VOL. 17, NO. 6, 1121-1128

Stochastic algorithms in system identification†

HARALD HÖGE

Siemens Aktiengesellschaft, Zentrallaboratorium für Nachrichtentechnik, D-8000 München 70, Germany

[Received 23 January 1973]

In order to identify a linear system with a discrete non-parametric model, a class of algorithms which is closely related to the Kalman filter is considered. With an input signal of appropriate statistics these algorithms are easy to calculate and are characterized by a speed of convergence which is comparable to that of a Kalman filter given an imperfect a priori knowledge of the error covariance matrix.

1. Introduction

Linear time-invariant systems can often be described by a time-discrete linear model

y_n = C^T W_n + η_n   (T = transposed)

with a scalar output signal y_n, an N-dimensional stochastic input vector W_n, an unknown N-dimensional parameter vector C and a stochastic noise η_n.

In the following, only non-parametric models without correlation between W and η will be considered. Under these conditions the algorithm

Ĉ_{n+1} = Ĉ_n + Γ_n (y_n − Ĉ_n^T W_n)   (1)

converges for every Ĉ_0 in mean towards an unbiased estimate Ĉ of C, for an appropriate N × N matrix Γ_n which determines the stochastic properties of (1) (Zypkin 1970). With gaussian noise η of known variance, Ĉ_n is the best estimate of C if Γ_n is determined by the Kalman filter (Åström 1968).

The gradient method

Γ_n = (1/(n+1)) · W_n / ‖W_n‖²,

on the other hand, generates a less efficient estimate. As a non-parametric model is generally associated with a C of high dimension, the on-line calculation of an optimum Γ_n is, in practice, too complicated and sometimes impossible.

In what follows, algorithms will be developed which lie in the 'gap' between the gradient method and the Kalman filter: they are easy to calculate, and they converge as fast as a Kalman filter given an input W with appropriate statistics.
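As a concrete illustration of recursion (1) with the gradient gain, the following sketch simulates the model y_n = C^T W_n + η_n and iterates the estimate. This is a minimal numerical sketch; the dimension N, the true vector C, and the noise level are arbitrary demo choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 6
C = rng.normal(size=N)            # true parameter vector (arbitrary demo choice)
C_hat = np.zeros(N)               # initial estimate, C_hat_0 = 0
sigma_eta = 0.1                   # noise standard deviation (demo choice)

err0 = np.linalg.norm(C - C_hat)  # initial error ||C - C_hat_0||
for n in range(2000):
    W = rng.normal(size=N)                    # stochastic input vector W_n
    y = C @ W + sigma_eta * rng.normal()      # y_n = C^T W_n + eta_n
    Gamma = W / ((n + 1) * (W @ W))           # gradient gain Gamma_n = W_n / ((n+1) ||W_n||^2)
    C_hat = C_hat + Gamma * (y - C_hat @ W)   # recursion (1)

err = np.linalg.norm(C - C_hat)
```

The gradient gain needs no matrix bookkeeping, which is exactly why it is cheap but slow; the matrix gains developed in Sect. 2 trade memory for speed.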

2. Optimization of Γ_n

Equation (1) will be considered in the special form

Ĉ_{n+1} = Ĉ_n + S_n W_n (y_n − Ĉ_n^T W_n),   (2)

as in both the gradient method and the Kalman filter the term W_n(y_n − Ĉ_n^T W_n)

† Communicated by Professor Dr.-Ing. M. Thoma.



appears. Those matrices S_n are sought which at each step n minimize the variance

r_{n+1} = ⟨(C − Ĉ_{n+1})^T (C − Ĉ_{n+1})⟩_{η_n},   (3)

where ⟨·⟩_{η_n} indicates averaging with respect to

η_n^T = (η_0, η_1, …, η_n).

The samples η_i (i = 0, 1, …) should be mutually statistically independent, independent of the W_k (k = 0, 1, …), and should fulfil the conditions

σ_η² = ⟨η_i²⟩,   0 = ⟨η_i⟩,   i = 0, 1, ….

In order to find algorithms equivalent to the gradient method and the Kalman filter, an S_n with three different 'degrees of freedom' will be examined:

S_n = α(n) I,   (4 a)

S_n = diag(s_1(n), s_2(n), …, s_N(n)),   (4 b)

S_n = (s_ik(n)),  i, k = 1, …, N  (a full matrix).   (4 c)

Equation (4 a) is related to the gradient method and (4 c) leads to the Kalman filter.

Taking

R_n = C − Ĉ_n,   A_n = W_n W_n^T,

(2) can be transformed into

R_{n+1} = R_n − S_n A_n R_n − S_n η_n W_n,

where the error covariance matrix

P_n = ⟨R_n R_n^T⟩_{η_{n−1}}   (5)

can be recursively calculated with

P_{n+1} = P_n − (S_n A_n P_n + P_n A_n S_n^T) + (Sp(A_n P_n) + σ_η²) S_n A_n S_n^T.   (6)

Equation (6) can only be derived for those S_n which are not correlated with η_i (i = 0, 1, …). According to (3) and (5), Sp P (Sp P = sum of the diagonal elements of P) is the same as the variance r. An optimal matrix Ŝ_n which minimizes Sp P_{n+1} can be found using (6) in the form

Sp P_{n+1} = Sp P_n − Sp(S_n A_n P_n + P_n A_n S_n^T) + (Sp(A_n P_n) + σ_η²) Sp(S_n A_n S_n^T).

The optimal value α̂_n of α(n) of (4 a) is determined through the necessary condition

(∂Sp P_{n+1}/∂α)|_{α = α̂_n} = 0,


which gives

Algorithm I

α̂_n = Sp(A_n P_n) / (Sp A_n · (Sp(A_n P_n) + σ_η²)),   Ŝ_n = α̂_n I.   (7)

With (6), P_n can be calculated recursively by

P_{n+1} = P_n − α̂_n (A_n P_n + P_n A_n) + α̂_n² (Sp(A_n P_n) + σ_η²) A_n.   (8)
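The scalar-gain update can be sketched directly from (2), (7) and (8). This is an illustrative transcription only; the demo system, the prior P_0 = I and all numerical values are arbitrary choices, and Sp(·) is written as a trace.

```python
import numpy as np

def algorithm_I_step(C_hat, P, W, y, sigma_eta2):
    """One step of Algorithm I: scalar gain S_n = alpha_n * I (eqs. (2), (7), (8))."""
    A = np.outer(W, W)                 # A_n = W_n W_n^T
    sp_AP = W @ P @ W                  # Sp(A_n P_n) = W_n^T P_n W_n
    alpha = sp_AP / (np.trace(A) * (sp_AP + sigma_eta2))                   # eq. (7)
    C_hat = C_hat + alpha * W * (y - C_hat @ W)                            # eq. (2), S_n = alpha I
    P = P - alpha * (A @ P + P @ A) + alpha**2 * (sp_AP + sigma_eta2) * A  # eq. (8)
    return C_hat, P

# demo run on an arbitrary 6-dimensional system, P_0 estimated as the identity
rng = np.random.default_rng(1)
N = 6
C = rng.normal(size=N)
C_hat, P = np.zeros(N), np.eye(N)
for _ in range(500):
    W = rng.normal(size=N)
    y = C @ W + 0.1 * rng.normal()
    C_hat, P = algorithm_I_step(C_hat, P, W, y, 0.01)
err = np.linalg.norm(C - C_hat)
```

Note that (8) can be rewritten as P_{n+1} = (I − α̂_n A_n) P_n (I − α̂_n A_n)^T + α̂_n² σ_η² A_n, so P stays symmetric positive semi-definite whenever P_0 is.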

From the necessary condition

(∂Sp P_{n+1}/∂s_i(n))|_{s_i(n) = ŝ_i(n)} = 0,   i = 1, 2, …, N,

it follows that the recursive scheme for (4 b) is

Algorithm II

ŝ_i(n) = [A_n P_n]_{ii} / ([A_n]_{ii} (Sp(A_n P_n) + σ_η²)),   i = 1, 2, …, N,   (9)

Ŝ_n = diag(ŝ_1(n), …, ŝ_N(n)),

P_{n+1} = P_n − (Ŝ_n A_n P_n + P_n A_n Ŝ_n) + (Sp(A_n P_n) + σ_η²) Ŝ_n A_n Ŝ_n.   (10)
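Similarly, the diagonal-gain update of (9), (10) can be sketched as follows (illustrative only; the demo values are arbitrary, and note that (9) divides by [A_n]_{ii} = [W_n]_i², which is non-zero almost surely for a continuously distributed input):

```python
import numpy as np

def algorithm_II_step(C_hat, P, W, y, sigma_eta2):
    """One step of Algorithm II: diagonal gain S_n (eqs. (2), (9), (10))."""
    A = np.outer(W, W)                       # A_n = W_n W_n^T
    sp_AP = W @ P @ W                        # Sp(A_n P_n)
    s = np.diag(A @ P) / (np.diag(A) * (sp_AP + sigma_eta2))              # eq. (9)
    S = np.diag(s)
    C_hat = C_hat + (S @ W) * (y - C_hat @ W)                             # eq. (2)
    P = P - (S @ A @ P + P @ A @ S) + (sp_AP + sigma_eta2) * (S @ A @ S)  # eq. (10)
    return C_hat, P

# demo run, same arbitrary setup as used for Algorithm I above
rng = np.random.default_rng(2)
N = 6
C = rng.normal(size=N)
C_hat, P = np.zeros(N), np.eye(N)
for _ in range(500):
    W = rng.normal(size=N)
    y = C @ W + 0.1 * rng.normal()
    C_hat, P = algorithm_II_step(C_hat, P, W, y, 0.01)
err = np.linalg.norm(C - C_hat)
```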

(For any matrix M, [M]_{ik} indicates the element of M with index i, k.) Finally, through (4 c) the condition

(∂Sp P_{n+1}/∂s_ik(n))|_{s_ik(n) = ŝ_ik(n)} = 0,   i, k = 1, 2, …, N

leads to the matrix equation

P_n A_n = (Sp(A_n P_n) + σ_η²) Ŝ_n A_n

with solution

Ŝ_n = P_n / (Sp(A_n P_n) + σ_η²).   (11)

Substituting (11) in (6) gives

P_{n+1} = P_n − P_n A_n P_n / (Sp(A_n P_n) + σ_η²).   (12)

Equations (11) and (12) are identical with the algorithm of the Kalman filter (Åström 1968), as the Kalman gain Γ_n is expressed by

Γ_n = Ŝ_n W_n = P_n W_n / (Sp(A_n P_n) + σ_η²),

P_{n+1} = P_n − Γ_n W_n^T P_n.

The theory of the Kalman filter, however, is limited to a gaussian noise η.
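Since Sp(A_n P_n) = W_n^T P_n W_n, the full-matrix case (11), (12) is the familiar recursive-least-squares / Kalman update. A sketch (the demo system and all numerical values are arbitrary choices):

```python
import numpy as np

def kalman_step(C_hat, P, W, y, sigma_eta2):
    """Full-matrix gain of eqs. (11), (12): the Kalman filter for y_n = C^T W_n + eta_n."""
    d = W @ P @ W + sigma_eta2        # Sp(A_n P_n) + sigma_eta^2
    Gamma = P @ W / d                 # Kalman gain Gamma_n = P_n W_n / d
    C_hat = C_hat + Gamma * (y - C_hat @ W)
    P = P - np.outer(Gamma, P @ W)    # eq. (12): P_n - P_n A_n P_n / d
    return C_hat, P

# demo run with prior P_0 = I
rng = np.random.default_rng(3)
N = 6
C = rng.normal(size=N)
C_hat, P = np.zeros(N), np.eye(N)
for _ in range(200):
    W = rng.normal(size=N)
    y = C @ W + 0.1 * rng.normal()
    C_hat, P = kalman_step(C_hat, P, W, y, 0.01)
err = np.linalg.norm(C - C_hat)
```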

The most tedious part of the algorithms (7)-(12) involves the calculation




of the covariance matrix P_n, which is greatly simplified when the signal W_n has the following statistics:

(a) Signals W_n are statistically independent with respect to n, and the components [W_n]_i (i = 1, 2, …, N) are statistically independent with respect to i.

(b) The moments of W_n are fixed through

⟨[W_n]_i⟩ = 0,   ⟨[W_n]_i²⟩ = σ²,   ⟨[W_n]_i⁴⟩ = λ⁴,   i = 1, …, N;  n = 0, 1, ….   (13)

For a class of matrices S_n that are statistically independent of both W_n and η_n, the covariance matrix P_{n+1} averaged with respect to (W_0, W_1, …, W_n) can be determined recursively through

⟨P_{n+1}⟩_{W_n} = ⟨P_n⟩_{W_{n−1}} − (S_n ⟨A_n⟩_{W_n} ⟨P_n⟩_{W_{n−1}} + ⟨P_n⟩_{W_{n−1}} ⟨A_n⟩_{W_n} S_n^T) + S_n ⟨(Sp(A_n ⟨P_n⟩_{W_{n−1}}) + σ_η²) A_n⟩_{W_n} S_n^T,

using (6) and (13).

Matrices Ŝ_n can then be found that minimize Sp⟨P_{n+1}⟩_{W_n}. Defining

r_n = Sp⟨P_n⟩_{W_{n−1}},   p_i(n) = ⟨[P_n]_{ii}⟩_{W_{n−1}},   i = 1, 2, …, N,

we can simplify (7), (8):

Algorithm I, simplified

α̂_n = σ² r_n / ((λ⁴ + (N − 1) σ⁴) r_n + N σ² σ_η²),   (14)

r_{n+1} = r_n (1 − σ² α̂_n),   (15)

and (9), (10):

Algorithm II, simplified

ŝ_i(n) = σ² p_i(n) / (λ⁴ p_i(n) + σ⁴ Σ_{k≠i} p_k(n) + σ² σ_η²),   i = 1, 2, …, N,   (16)

p_i(n+1) = p_i(n) (1 − σ² ŝ_i(n)),   i = 1, 2, …, N.   (17)

As in (14) only r_n is required to compute α̂_n, not the whole matrix ⟨P_{n+1}⟩_{W_n}; only r_{n+1} = Sp⟨P_{n+1}⟩_{W_n} need be calculated, whereby the matrix eqn. (8) is transformed into the simple scalar eqn. (15).

In (16) only the diagonal elements of ⟨P_n⟩_{W_{n−1}} are needed, which leads to a reduction of (10) to the N equations (17).

Common to the above algorithms is that all of them require an initial value in the iteration process.
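Under the input statistics (13), only the scalar r_n = Sp⟨P_n⟩ has to be propagated. The following sketch of the simplified Algorithm I assumes the reconstruction α̂_n = σ² r_n / ((λ⁴ + (N−1)σ⁴) r_n + N σ² σ_η²) with r_{n+1} = r_n(1 − σ² α̂_n) for eqs. (14), (15); the demo system and noise level are arbitrary choices.

```python
import numpy as np

def simplified_algorithm_I_step(C_hat, r, W, y, sigma2, lam4, sigma_eta2):
    """One step of the simplified Algorithm I: the gain needs only the scalar r_n."""
    N = len(C_hat)
    alpha = sigma2 * r / ((lam4 + (N - 1) * sigma2**2) * r + N * sigma2 * sigma_eta2)  # eq. (14)
    C_hat = C_hat + alpha * W * (y - C_hat @ W)   # eq. (2) with S_n = alpha_n I
    r = r * (1.0 - sigma2 * alpha)                # eq. (15)
    return C_hat, r

# demo: gaussian input, so sigma^2 = 1 and lambda^4 = 3 (fourth moment of N(0, 1))
rng = np.random.default_rng(4)
N = 6
C = rng.normal(size=N)
C_hat = np.zeros(N)
r = float(C @ C)                  # r_0 = Sp P_0 = ||C - C_hat_0||^2
err0 = np.linalg.norm(C - C_hat)
for _ in range(1000):
    W = rng.normal(size=N)
    y = C @ W + 0.1 * rng.normal()
    C_hat, r = simplified_algorithm_I_step(C_hat, r, W, y, 1.0, 3.0, 0.01)
err = np.linalg.norm(C - C_hat)
```

Only one scalar is stored and updated per step, which is the computational advantage over the full matrix recursion (8).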



With a given initial value Ĉ_0 for eqn. (1), the quantities P_0 = (C − Ĉ_0)(C − Ĉ_0)^T, Sp P_0 and [P_0]_{ii} (i = 1, …, N) needed for eqns. (8), (10), (12), (15)-(17) are all defined. In most identification problems little is known a priori about P_0; thus an estimate for P_0 must be found. In the next section the influence of this estimate on the different algorithms will be demonstrated through computer simulations. The latter show that the Kalman filter is superior to the other algorithms only when a good estimate for P_0 can be found.

3. Simulations

The computer simulations were motivated by the investigation of an adaptive digital echo canceller (Sondhi 1967), which has to identify a linear system with time delays. In order to model systems with a band-limited input signal x, a transversal filter (Becker and Rudin 1966) was chosen; the input vector W then has the following form:

W_n^T = (x(nT), x(nT − T), …, x(nT − (N − 1)T)).   (18)

The sampling period T is defined by the upper limit of the band. The statistical properties of signal x and noise η were determined such that the sampled values x(iT) (i = 0, 1, …) and η(kT) (k = 0, 1, …) are statistically independent and have the gaussian distributions

N(0, σ²) and N(0, σ_η²),   (19)

respectively. According to (18), neighbouring values of W_n are strongly correlated, as W_{n+1} is the same as W_n except for two components; consequently, W_n does not fulfil condition (13 a).
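For the transversal-filter model (18), the input vectors are tap-delay snapshots of the sampled signal. A sketch (the function name and the zero-padding before t = 0 are my own conventions, not from the paper):

```python
import numpy as np

def transversal_inputs(x, N):
    """Rows are W_n^T = (x(nT), x(nT-T), ..., x(nT-(N-1)T)) per eq. (18);
    samples before t = 0 are taken as zero (a demo convention)."""
    x = np.asarray(x, dtype=float)
    xp = np.concatenate([np.zeros(N - 1), x])
    return np.stack([xp[n:n + N][::-1] for n in range(len(x))])

W = transversal_inputs([1.0, 2.0, 3.0, 4.0], 3)
# consecutive rows share N - 1 entries, which is exactly why condition (13 a) fails here
```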

Höge (1972) demonstrated that Algorithm I behaves as if condition (13 a) were fulfilled for constant α and a signal W with the statistics (19). The computer simulations showed that this is also true for Algorithms I, II. Therefore the simplified algorithms can be applied. The values

σ̂² = (1/N) W_n^T W_n,   λ̂⁴ = 3 σ̂⁴

are substituted for the a priori unknown values of σ² and λ⁴ in (13 b).

As examples of linear systems, two systems (a, b) with the following properties

N = 6,

C_a^T = (1/√6)(1, 1, 1, 1, 1, 1),

C_b^T = (0.8, −0.5, 0.1, 0.2, 0.05, −0.218),

under the initial condition Ĉ_0 = 0 were examined.

Given P_0 exactly, the speed of convergence according to figs. 1 and 2 increases from Algorithm I through Algorithm II to the Kalman filter. (The curves of figs. 1-4 were calculated by averaging four statistically independent runs of R^T R.)



Fig. 1. [Figure: r versus n, 0 ≤ n ≤ 200, logarithmic r scale; curves for Algorithm I, Algorithm II and the Kalman filter; C = C_a, σ² = 1, σ_η² = 1.] Convergence of Algorithms I, II in comparison to the Kalman filter with complete a priori knowledge about P_0 for system C = C_a.

Fig. 2. [Figure: r versus n, 0 ≤ n ≤ 200, logarithmic r scale; curves for Algorithm I, Algorithm II and the Kalman filter; C = C_b, σ² = 1, σ_η² = 1.] Convergence of Algorithms I, II in comparison to the Kalman filter with complete a priori knowledge about P_0 for system C = C_b.



Fig. 3. [Figure: r versus n, 0 ≤ n ≤ 200, logarithmic r scale; curves for the simplified Algorithm II and the Kalman filter; C = C_b, σ² = 1, σ_η² = 1.] Convergence of simplified Algorithm II in comparison with the Kalman filter with incomplete a priori knowledge of P_0 (system b).

Fig. 4. [Figure: r versus n, logarithmic r scale; curves for the simplified Algorithm I and the Kalman filter; C = C_b, σ² = 1, σ_η² = 1.] Convergence of simplified Algorithm I in comparison with the Kalman filter for unknown P_0 (system b).



When the best a priori estimate of P_0 is given by a diagonal matrix

diag([P_0]_{11}, …, [P_0]_{NN}),

whose diagonal elements are equal to those of P_0, then the speed of convergence of the Kalman filter decreases and becomes similar to that of the simplified Algorithm II (see fig. 3).

When P_0 can only be estimated by a matrix k·1, the Kalman filter is comparable to the simplified Algorithm I (see fig. 4). (When a value for P_0 is chosen, then k is defined by k = Sp P_0 / N.)

4. Conclusions

The speed of convergence shown by the simulations depends on the initial value P_0 of the covariance matrix P. The less that is known a priori about P_0, the smaller is the difference between the Kalman filter and Algorithms I, II. Algorithms I and II require less calculation than the Kalman filter when they are used in their simplified versions, where the input signal W is restricted to a special statistic. The simulations showed that this statistic does not have to fulfil the strict conditions (13).

A subject for further examination is to see to which class of signals W the simplified algorithms can be applied.

REFERENCES

ÅSTRÖM, K. J., 1968, Lectures on the Identification Problem: The Least Squares Method, Lund Institute of Technology, Division of Automatic Control Report.
BECKER, F. K., and RUDIN, H. R., 1966, Bell Syst. tech. J., 45, 1847.
HÖGE, H., 1972, Kybernetik, 11, 55.
SONDHI, M. M., 1967, Bell Syst. tech. J., 46, 497.
ZYPKIN, J. S., 1970, Adaption und Lernen in kybernetischen Systemen (Oldenbourg Verlag).
