SYSTEMS Identification
Ali Karimpour
Assistant Professor
Ferdowsi University of Mashhad
Reference: “System Identification: Theory for the User”, Lennart Ljung
Lecture 11
Recursive estimation methods
Topics to be covered include:
Introduction.
The Recursive Least-Squares Algorithm.
The Recursive IV Method.
Recursive Prediction-Error Methods.
Recursive Pseudolinear Regressions.
The Choice of Updating Step.
Implementation.
Introduction
In many cases it is necessary, or useful, to have a model of the system available on-line. Such an on-line model is needed in order to answer questions such as:
• Which input should be applied at the next sampling instant?
• How should the parameters of a matched filter be tuned?
• What are the best predictions of the next few outputs?
• Has a failure occurred and, if so, of what type?
These are all adaptive problems: adaptive control, adaptive filtering, adaptive signal processing, adaptive prediction.
The on-line computation of the model must be completed during one sampling interval. Identification techniques that comply with this requirement are called:
• Recursive identification methods (the term used in this reference).
• On-line identification.
• Real-time identification.
• Adaptive parameter estimation.
• Sequential parameter estimation.
Algorithm format

A general identification method computes the estimate from all data collected so far:

$$\hat{\theta}_t = F(t, Z^t)$$

where $F$ is, for example, the minimizing argument of some criterion function. This form cannot be used in a recursive algorithm, since the computation cannot be completed in one sampling instant. Instead, a recursive algorithm must comply with:

$$X(t) = H\big(t, X(t-1), y(t), u(t)\big), \qquad \hat{\theta}(t) = h\big(X(t)\big)$$

where $X(t)$ is an information state of fixed dimension. Since the information in the latest pair of measurements $\{\,y(t), u(t)\,\}$ is normally small compared to the previous information, a more suitable form is:

$$X(t) = X(t-1) + \mu(t)\, Q_X\big(X(t-1), y(t), u(t)\big)$$
$$\hat{\theta}(t) = \hat{\theta}(t-1) + \mu(t)\, Q_{\theta}\big(X(t-1), y(t), u(t)\big)$$

where the $\mu(t)$ are small numbers reflecting the relative information value in the latest measurement.
The Recursive Least-Squares Algorithm
Weighted LS Criterion
The weighted least-squares estimate minimizes a weighted quadratic criterion.
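As a sketch in Ljung's standard notation (the slide's own equations are not reproduced in this text), the estimate and its normal-equation quantities read:

$$\hat{\theta}_t = \arg\min_{\theta}\, \sum_{k=1}^{t} \beta(t,k)\,\big[y(k) - \varphi^T(k)\,\theta\big]^2 = \bar{R}^{-1}(t)\, f(t)$$

where

$$\bar{R}(t) = \sum_{k=1}^{t} \beta(t,k)\,\varphi(k)\,\varphi^T(k), \qquad f(t) = \sum_{k=1}^{t} \beta(t,k)\,\varphi(k)\,y(k)$$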
Recursive algorithm
Suppose the weighting sequence has the following multiplicative property; the estimate can then be updated recursively.
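A sketch of the standard property and the resulting recursion, in Ljung's notation:

$$\beta(t,k) = \lambda(t)\,\beta(t-1,k), \quad 0 \le k \le t-1, \qquad \beta(t,t) = 1$$

so that $\beta(t,k) = \prod_{j=k+1}^{t}\lambda(j)$. Then

$$\bar{R}(t) = \lambda(t)\,\bar{R}(t-1) + \varphi(t)\,\varphi^T(t), \qquad f(t) = \lambda(t)\,f(t-1) + \varphi(t)\,y(t)$$

and the estimate is updated as

$$\hat{\theta}(t) = \hat{\theta}(t-1) + \bar{R}^{-1}(t)\,\varphi(t)\,\big[y(t) - \varphi^T(t)\,\hat{\theta}(t-1)\big]$$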
Version with Efficient Matrix Inversion
Recall the matrix inversion lemma. To avoid inverting $\bar{R}(t)$ at each step, introduce $P(t) = \bar{R}^{-1}(t)$.
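A sketch of the lemma and its use here (standard, in Ljung's notation):

$$\big[A + BCD\big]^{-1} = A^{-1} - A^{-1}B\,\big[DA^{-1}B + C^{-1}\big]^{-1} DA^{-1}$$

Applied to $\bar{R}(t) = \lambda(t)\,\bar{R}(t-1) + \varphi(t)\,\varphi^T(t)$ with $A = \lambda(t)\,\bar{R}(t-1)$, $B = D^T = \varphi(t)$, $C = 1$, it yields an update for $P(t)$ that requires no matrix inversion, only division by a scalar.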
Moreover, we have $L(t) = P(t)\,\varphi(t)$. We can summarize this version of the algorithm as follows.
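A sketch of the resulting standard summary (assuming Ljung's notation):

$$\varepsilon(t) = y(t) - \varphi^T(t)\,\hat{\theta}(t-1), \qquad \hat{\theta}(t) = \hat{\theta}(t-1) + L(t)\,\varepsilon(t)$$

$$L(t) = \frac{P(t-1)\,\varphi(t)}{\lambda(t) + \varphi^T(t)\,P(t-1)\,\varphi(t)}$$

$$P(t) = \frac{1}{\lambda(t)}\left[P(t-1) - \frac{P(t-1)\,\varphi(t)\,\varphi^T(t)\,P(t-1)}{\lambda(t) + \varphi^T(t)\,P(t-1)\,\varphi(t)}\right]$$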
Normalized Gain Version
The size of the matrix $\bar{R}(t)$ will depend on λ(t); a normalized-gain form avoids this (see the sketch below).
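A sketch of the normalized-gain form (standard, assuming Ljung's notation), obtained by normalizing $\bar{R}(t)$ with the accumulated weight and introducing a gain sequence $\gamma(t)$:

$$\hat{\theta}(t) = \hat{\theta}(t-1) + \gamma(t)\,R^{-1}(t)\,\varphi(t)\,\varepsilon(t)$$
$$R(t) = R(t-1) + \gamma(t)\,\big[\varphi(t)\,\varphi^T(t) - R(t-1)\big]$$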
Initial Condition
One possibility is to initialize the recursion only at a time instant $t_0$, using the off-line LS estimate based on the data up to $t_0$. Alternatively, the recursion can be started at $t = 0$ with an initial guess $\hat{\theta}_0$ and covariance $P_0$. Clearly, if $P_0$ is large or $t$ is large, the resulting estimate is the same as the off-line LS estimate.
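As an illustration, here is a minimal numpy sketch of the RLS recursion with initial condition $(\theta_0, P_0)$; the function name and the toy data are illustrative, not from the slides. With $P_0$ large, the recursive estimate quickly approaches the off-line LS estimate:

```python
# Minimal RLS sketch with initial condition (theta0, P0) and forgetting factor lam.
import numpy as np

def rls(Phi, y, theta0, P0, lam=1.0):
    """Run RLS over regressors Phi (N x d) and outputs y (N,)."""
    theta, P = theta0.copy(), P0.copy()
    for phi, yt in zip(Phi, y):
        denom = lam + phi @ P @ phi            # scalar: lam + phi^T P phi
        L = P @ phi / denom                    # gain vector L(t)
        theta = theta + L * (yt - phi @ theta) # parameter update
        P = (P - np.outer(P @ phi, phi @ P) / denom) / lam
    return theta, P

# Example: estimate a 2-parameter model from theta0 = 0 with large P0 = 1e4 * I.
rng = np.random.default_rng(0)
Phi = rng.standard_normal((200, 2))
y = Phi @ np.array([1.5, -0.7]) + 0.1 * rng.standard_normal(200)
theta_hat, _ = rls(Phi, y, np.zeros(2), 1e4 * np.eye(2))
```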
Asymptotic Properties of the Estimate
Multivariable case
Recall the SISO algorithm above. For the MIMO case:
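A sketch of the multivariable form, assuming the standard generalization with $y(t) \in \mathbf{R}^p$, $\varphi(t)$ a $d \times p$ matrix, and a weighting matrix $\Lambda$:

$$\hat{\theta}(t) = \hat{\theta}(t-1) + L(t)\,\big[y(t) - \varphi^T(t)\,\hat{\theta}(t-1)\big]$$
$$L(t) = P(t-1)\,\varphi(t)\,\big[\lambda(t)\,\Lambda + \varphi^T(t)\,P(t-1)\,\varphi(t)\big]^{-1}$$
$$P(t) = \frac{1}{\lambda(t)}\,\big[P(t-1) - L(t)\,\varphi^T(t)\,P(t-1)\big]$$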
Kalman Filter Interpretation
Consider the Kalman filter for estimating the state of a linear system. The linear regression model can be cast into this form, and with a suitable choice of noise covariances the Kalman filter reproduces the RLS algorithm.
Exercise: Derive the Kalman filter for the above-mentioned system, and show that it is exactly the same as the recursive least-squares algorithm for the multivariable case.
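A sketch of the casting (standard; the noise covariance is an assumption consistent with the exercise): take the parameter vector as a constant state,

$$x(t+1) = x(t), \qquad y(t) = \varphi^T(t)\,x(t) + v(t)$$

with $x(t) = \theta$ and $E\,v(t)\,v^T(t) = R_2(t)$. The Kalman filter for this system coincides with the multivariable RLS algorithm, and $P(t)$ can be interpreted as the parameter-error covariance matrix.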
Kalman filter interpretation gives important information, as well as some practical hints:
Coping with Time-varying Systems
An important reason for using adaptive methods and recursive identification in practice is that:
• The properties of the system may be time-varying.
• We want the identification algorithm to track the variation.
This is handled by the weighted criterion, which assigns less weight to older measurements.
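For example, exponential forgetting uses

$$\beta(t,k) = \lambda^{t-k}, \qquad 0 < \lambda < 1$$

so the weight on a measurement decays geometrically with its age.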
These choices have the natural effect that in the recursive algorithms the step size will not decrease to zero.
Another, more formal alternative for dealing with time-varying parameters is to assume that the true parameters vary like a random walk.
Exercise: Derive the Kalman filter for the above-mentioned system, and show that it is exactly the same as the recursive least-squares algorithm for the multivariable case.
Note: The additive term R1(t) in the P(t) recursion prevents the gain L(t) from tending to zero.
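A sketch of the random-walk model and the resulting covariance update (standard Kalman-filter form, with $R_1$ and $R_2$ as assumed noise covariances):

$$\theta_0(t+1) = \theta_0(t) + w(t), \qquad E\,w(t)\,w^T(t) = R_1(t)$$

$$P(t) = P(t-1) - \frac{P(t-1)\,\varphi(t)\,\varphi^T(t)\,P(t-1)}{R_2(t) + \varphi^T(t)\,P(t-1)\,\varphi(t)} + R_1(t)$$

The $+R_1(t)$ term keeps $P(t)$, and hence the gain $L(t)$, from tending to zero.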
The Recursive IV Method
Recall the weighted LS criterion and its solution. The IV estimate for the instrumental-variable method is:
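A sketch in standard notation, with instruments $\zeta(k)$:

$$\hat{\theta}_t^{\,IV} = \left[\sum_{k=1}^{t} \beta(t,k)\,\zeta(k)\,\varphi^T(k)\right]^{-1} \sum_{k=1}^{t} \beta(t,k)\,\zeta(k)\,y(k)$$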
The corresponding recursive IV algorithm then follows just as in the RLS case:
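A sketch of the resulting recursion (standard; it is RLS with $\varphi(t)$ replaced by the instrument $\zeta(t)$ in the gain and covariance factors):

$$\hat{\theta}(t) = \hat{\theta}(t-1) + L(t)\,\big[y(t) - \varphi^T(t)\,\hat{\theta}(t-1)\big]$$
$$L(t) = \frac{P(t-1)\,\zeta(t)}{\lambda(t) + \varphi^T(t)\,P(t-1)\,\zeta(t)}$$
$$P(t) = \frac{1}{\lambda(t)}\left[P(t-1) - \frac{P(t-1)\,\zeta(t)\,\varphi^T(t)\,P(t-1)}{\lambda(t) + \varphi^T(t)\,P(t-1)\,\zeta(t)}\right]$$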
Recursive Prediction-Error Methods
Analogous to the weighted LS case, let us consider a weighted quadratic prediction-error criterion; we will need its gradient with respect to θ.
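A sketch in standard notation:

$$\bar{V}_t(\theta) = \frac{1}{2}\sum_{k=1}^{t} \beta(t,k)\,\varepsilon^2(k,\theta), \qquad \varepsilon(k,\theta) = y(k) - \hat{y}(k\mid\theta)$$

with gradient

$$-\bar{V}_t'(\theta) = \sum_{k=1}^{t} \beta(t,k)\,\psi(k,\theta)\,\varepsilon(k,\theta), \qquad \psi(k,\theta) = \frac{\partial}{\partial\theta}\,\hat{y}(k\mid\theta)$$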
Remember the general search algorithm developed for PEM. For each iteration i, we collect one more data point, so it is natural to take one minimization iteration per sample. Now identify the time-t estimate with the i-th iterate, and as an approximation evaluate the prediction error and gradient at the previous estimate (see the sketch below).
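A sketch of this derivation (standard Gauss-Newton reasoning, in Ljung's notation):

$$\hat{\theta}^{(i+1)} = \hat{\theta}^{(i)} - \mu^{(i)}\,\big[R^{(i)}\big]^{-1}\,\bar{V}_t'\big(\hat{\theta}^{(i)}\big)$$

Taking one iteration per sample, identifying $\hat{\theta}(t)$ with $\hat{\theta}^{(t)}$, and approximating $\varepsilon(t,\hat{\theta}(t-1)) \approx \varepsilon(t)$ and $\psi(t,\hat{\theta}(t-1)) \approx \psi(t)$, computed recursively from past estimates, gives an update that uses only the newest data point.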
With the above approximations, and taking μ(t) = 1, we thus arrive at the algorithm:
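A sketch of the resulting Gauss-Newton RPEM (standard form):

$$\hat{\theta}(t) = \hat{\theta}(t-1) + R^{-1}(t)\,\psi(t)\,\varepsilon(t)$$
$$R(t) = \lambda(t)\,R(t-1) + \psi(t)\,\psi^T(t)$$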
The terms ε(t) and ψ(t) must themselves be computed recursively, using the model filters evaluated at the current parameter estimate.
Family of recursive prediction-error methods
According to the model structure, and according to the choice of R(t), we obtain a wide family of methods, which we shall call RPEM.
For example, for the linear regression $\hat{y}(t\mid\theta) = \varphi^T(t)\,\theta$ we have $\psi(t) = \varphi(t)$, and the Gauss-Newton choice of R(t) gives the recursive least-squares method. If we instead consider R(t) = I, where the gain can be normalized by the regressor energy, we obtain a gradient scheme that has been widely used under the name least mean squares (LMS).
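As an illustration, a minimal sketch of the normalized LMS update just described, i.e. the gradient direction R(t) = I with the gain normalized by the regressor energy; `mu` and `eps` are assumed tuning parameters, not values from the slides:

```python
# Normalized LMS: one gradient-direction update for y(t) = phi^T theta + e(t).
import numpy as np

def nlms_step(theta, phi, y, mu=0.5, eps=1e-8):
    err = y - phi @ theta                            # prediction error
    return theta + mu * phi * err / (eps + phi @ phi)  # normalized gradient step
```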
Example 11.1: Recursive Maximum Likelihood
Consider the ARMAX model

$$A(q)\,y(t) = B(q)\,u(t) + C(q)\,e(t)$$

where

$$\theta = \begin{bmatrix} a_1 & \cdots & a_{n_a} & b_1 & \cdots & b_{n_b} & c_1 & \cdots & c_{n_c} \end{bmatrix}^T$$

and

$$\varphi(t,\theta) = \begin{bmatrix} -y(t-1) & \cdots & -y(t-n_a) & u(t-1) & \cdots & u(t-n_b) & \varepsilon(t-1,\theta) & \cdots & \varepsilon(t-n_c,\theta) \end{bmatrix}^T$$

Remember Chapter 10; by rule 11.41 we obtain the algorithm. This scheme is known as recursive maximum likelihood (RML).
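A sketch of the gradient computation behind this scheme (the standard RML construction, stated here as an assumption of the usual derivation): the gradient $\psi(t,\theta)$ is obtained by filtering the regression vector through $1/C(q)$,

$$C(q)\,\psi(t,\theta) = \varphi(t,\theta)$$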
Projection into $D_{\mathcal{M}}$
The model structure is well defined only for parameters $\theta \in D_{\mathcal{M}}$ giving stable predictors. In off-line minimization this must be kept in mind as a constraint. The same is true for the recursive minimization: the estimate must be kept (projected) inside $D_{\mathcal{M}}$.
Asymptotic Properties
The recursive prediction-error method is designed to make updates of θ in a direction that "on the average" is a modified negative gradient of the criterion, i.e.:
Moreover (see Appendix 11A), for the Gauss-Newton RPEM with $\gamma(t) = 1/t$, it can be shown that $\hat{\theta}(t)$ has an asymptotic normal distribution which coincides with that of the corresponding off-line estimate. We thus have:
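A sketch of the standard statement (assuming the usual regularity conditions):

$$\sqrt{t}\,\big(\hat{\theta}(t) - \theta^*\big) \in \mathrm{AsN}\big(0,\, P_\theta\big)$$

with the same asymptotic covariance $P_\theta$ as for the corresponding off-line prediction-error estimate.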
Recursive Pseudolinear Regressions
Consider the pseudolinear representation of the prediction, and recall that this model structure contains, among other models, the general linear SISO model. A bootstrap method for estimating θ was given in Chapter 10 (10.64); applying the Newton-Raphson method to it yields a recursive scheme.
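A sketch in standard notation: the pseudolinear prediction is

$$\hat{y}(t\mid\theta) = \varphi^T(t,\theta)\,\theta$$

and the resulting recursion (the RPLR family) has the same form as the Gauss-Newton RPEM with $\psi(t)$ replaced by $\varphi(t)$:

$$\hat{\theta}(t) = \hat{\theta}(t-1) + R^{-1}(t)\,\varphi(t)\,\varepsilon(t), \qquad R(t) = \lambda(t)\,R(t-1) + \varphi(t)\,\varphi^T(t)$$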
Family of RPLRs
The RPLR scheme represents a family of well-known algorithms when applied to different special cases of the general linear SISO model. The ARMAX case is perhaps the best known of these: with the ARMAX predictor, the scheme is known as extended least squares (ELS).
Other special cases are displayed in the following table:
The Choice of Updating Step
Recursive prediction-error methods are based on the prediction-error approach; recursive pseudolinear regressions are based on the correlation approach.
Recursive Prediction-Error Methods (RPEM) vs. Recursive Pseudolinear Regressions (RPLR)
The difference between the prediction-error approach and the correlation approach lies in the vector that multiplies the prediction error: the gradient ψ(t) for RPEM versus the regression vector φ(t) for RPLR.
We now turn to the term $\gamma(t)\,R^{-1}(t)$, which modifies the update direction and determines the length of the update step. We discuss only RPEM; RPLR is treated in the same way, with ψ replaced by φ.
Update direction
There are two basic choices of update direction:
• The Gauss-Newton direction, with R(t) an approximation of the Hessian: better convergence rate.
• The gradient direction, R(t) = I: easier computation.
Update step: adaptation gain
An important aspect of recursive algorithms is their ability to cope with time-varying systems. There are two different ways of achieving this:
In either case, the choice of update step is a trade-off between
• Tracking ability
• Noise sensitivity
A high gain means that the algorithm is alert in tracking parameter changes, but at the same time sensitive to disturbances in the data.
Choice of forgetting factor
The choice of forgetting profile β(t,k) is conceptually simple. For a system that changes gradually and in a stationary manner, the most common choice is the exponential profile $\beta(t,k) = \lambda^{t-k}$.
The constant λ is chosen slightly less than 1, so that $T_0 = 1/(1-\lambda)$ acts as a memory time constant: measurements older than $T_0$ samples are included in the criterion with a weight $e^{-1} \approx 0.37$ relative to the most recent measurement. We can thus select λ so that $1/(1-\lambda)$ reflects the ratio between the time constant of the variations in the dynamics and that of the dynamics itself.
Typical choices of λ are in the range 0.98 to 0.995; for instance, λ = 0.99 gives $T_0 = 100$ samples. For a system that undergoes sudden changes, rather than steady and slow ones, it is suitable to decrease λ(t) to a small value and then increase it to a value close to 1 again.
Choice of Gain γ(t)
Including a model of parameter changes
Remember the Kalman filter interpretation of RLS. Now let the true parameters vary according to the random-walk model above; the Kalman filter then gives the adaptation gain directly.
In the case of a linear regression model, this algorithm gives the optimal trade-off between tracking ability and noise sensitivity, in terms of the parameter-error covariance matrix.
The case where the parameter variations are themselves of a nonstationary nature [i.e., R1(t) varies with t] requires a parallel algorithm (see Anderson, 1985).
Constant systems
Asymptotic behavior in the time-varying case
Implementation
Recursive Prediction-Error Methods (RPEM) and Recursive Pseudolinear Regressions (RPLR)
The basic, general Gauss-Newton algorithm was given for RPEM and RPLR above. The inverse manipulation is not suited for direct implementation, since the d×d matrix R(t) would have to be inverted at each step. We shall discuss some aspects of how best to implement the recursive algorithm. Using the matrix inversion lemma, the update can be rewritten in terms of $P(t) = R^{-1}(t)$. Here η(t) (a d×p matrix) represents either φ(t) or ψ(t), depending on the approach; the matrix that must then be inverted is only p×p.
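A sketch of the rewritten update (a standard application of the matrix inversion lemma, with $\eta(t)$ of size $d \times p$):

$$L(t) = P(t-1)\,\eta(t)\,\big[\lambda(t)\,I_p + \eta^T(t)\,P(t-1)\,\eta(t)\big]^{-1}$$
$$P(t) = \frac{1}{\lambda(t)}\,\big[P(t-1) - L(t)\,\eta^T(t)\,P(t-1)\big]$$

Only a p×p matrix must be inverted.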
Unfortunately, the P-recursion, which is in fact a Riccati equation, is not numerically sound: the equation is sensitive to round-off errors that can accumulate and make P(t) indefinite.
Using factorization
It is useful to represent the data matrices in factorized form, so as to work with better-conditioned matrices:
• Cholesky decomposition, $P(t) = Q(t)\,Q^T(t)$, in which Q(t) is triangular.
• UD-decomposition, $P(t) = U(t)\,D(t)\,U^T(t)$, in which U(t) is triangular and D(t) is diagonal.
Here we shall give some details of a related algorithm, due to Morf and Kailath, which is based directly on Householder transformations (Problem 10T.1).
• Step 1: Let $P(t-1) = Q(t-1)\,Q^T(t-1)$ and form the (p+d)×(p+d) matrix L(t-1).
• Step 2: Apply an orthogonal (p+d)×(p+d) transformation T (TᵀT = I) to L(t-1) so that TL(t-1) becomes an upper triangular matrix (use QR-factorization). Partition TL(t-1) as:
• Step 3: Now L(t) and P(t) are:
Using factorization: summary
L(t) and P(t) are obtained from the triangularized array as in Step 3.
There are several advantages with this particular way of performing the update:
• The only essential computation is the triangularization step.
• This step gives the new Q and the gain L after simple additional calculations.
• Π(t) is a triangular p×p matrix, so it is easy to invert.
• The condition number of the matrix L(t-1) is much better than that of P.
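As an illustration of Steps 1-3, here is a minimal numpy sketch of a square-root RLS measurement update in the Morf-Kailath spirit for the scalar-output case (p = 1). The exact pre-array layout is an assumption (the slides' matrix L(t-1) is not reproduced in this text), but the triangularization-based update it implements is the same idea: propagate a square root S with P = S Sᵀ and obtain the gain and the new square root from one QR step.

```python
# Square-root (QR-based) RLS update: a sketch, p = 1, forgetting factor lam.
import numpy as np

def sqrt_rls_update(theta, S, phi, y, lam=0.99):
    """One step for y(t) = phi^T theta + e(t); S is a square root of P(t-1)."""
    d = len(phi)
    # Pre-array ((1+d) x (1+d)): its Gram matrix equals
    # [[lam + phi^T P phi, (P phi)^T], [P phi, P]] with P = S S^T.
    pre = np.zeros((1 + d, 1 + d))
    pre[0, 0] = np.sqrt(lam)
    pre[0, 1:] = phi @ S
    pre[1:, 1:] = S
    # QR of the transpose triangularizes from the right: pre @ Q = R^T.
    _, R = np.linalg.qr(pre.T)
    post = R.T                              # lower-triangular post-array
    r_sqrt = post[0, 0]                     # sqrt(lam + phi^T P phi), up to sign
    gain = post[1:, 0] / r_sqrt             # L(t) = P phi / (lam + phi^T P phi)
    S_new = post[1:, 1:] / np.sqrt(lam)     # square root of P(t)
    theta_new = theta + gain * (y - phi @ theta)
    return theta_new, S_new
```

The sign ambiguity of the QR factors cancels in the ratio defining the gain, and S_new remains a valid square root, so no extra sign handling is needed.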