ebt410_l11_2007

8/6/2019 ebt410_l11_2007

University of Pretoria Page 5

SYSTEM IDENTIFICATION USING MATLAB

This step-by-step tutorial will guide you through an example of system identification (SID) using Matlab. A second order system with two inputs and one output will be considered. A graphical representation of the system is given in Figure 1.

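[Block diagram: the input SWF passes through g11(s) and the input CFF passes through g12(s); the two results are summed to give the output CUA.]

Figure 1: The system for which SID will be performed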

The transfer function g11(s) between SWF and CUA will be determined. The procedure

which you have to follow is:

1. Start Matlab with the System Identification Toolbox in its path. Matlab starts, and

displays a window with a >>. This >> is the Matlab prompt, and data or

commands are entered from here.

2. Load the file ebbles6.mat by entering the following command at the Matlab prompt:

>> load ebbles6

The vectors SWF, CFF and CUA are now loaded into Matlab's memory. These vectors contain numerical data from field measurements. The names and sizes of the variables currently in Matlab's memory can be viewed by issuing the following command:

>> whos

3. The output-input pairs are CUA and SWF, and CUA and CFF. Make matrices z1 and z2 of the output-input pairs by issuing the following commands:

>> z1=[CUA SWF];

>> z2=[CUA CFF];

Do not forget the semicolon, otherwise Matlab displays the whole matrix.

4. Plot the output-input pairs:

>> idplot(z1)

>> idplot(z2)

Notice the region where CFF remains constant. It will be in this region where we try

to determine the transfer function between SWF and CUA, as we want to isolate the

influence of SWF on CUA.

5. Now take a look at the region where CFF is constant:

>> idplot(z2(1000:1900,:))

The above command tells Matlab to make a plot of rows 1000 to 1900 and all

columns of z2. Take a look at SWF in the region where CFF is constant:

>> idplot(z1(1000:1900,:))

As can be seen, a big part of SWF is also constant. In order to obtain a transfer

function, we want to isolate the region where there is change in SWF, since the

constant characteristic of SWF contains no information on the dynamics of the

system. This changing region of SWF is the usable region. The usable region can

be viewed by

>> idplot(z1(1500:1900,:))

6. The usable region can now be split up into two regions.

i.) 2/3 of the region is used to fit the data.

ii.) 1/3 of the region is used to validate the model.

The reason why we approach the problem in this way is to validate the model using

data that was not used to fit the model. This is a much more stringent test of the

model than using the same data for fitting and validation.

We now create two new data regions by issuing the following commands:

>> zf=z1(1500:1750,:);

>> zv=z1(1751:1900,:);

zf is used for fitting of the model, and zv is used for validation of the model.

7. Now plot the new data to make sure that the regions are suitable for model fitting and

validation.

>> idplot(zf)

>> idplot(zv)

8. The data will now be pre-treated. We remove the offset of the input SWF and the output CUA, for both the fitting and validation regions, so that an offset parameter does not have to be fitted unnecessarily.

>> zfd=dtrend(zf);

>> zvd=dtrend(zv);
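
To make the pre-treatment concrete, the following is a minimal sketch of removing an offset by hand, assuming that dtrend called with no extra argument only subtracts each column's mean, as the text describes. The matrix z and its values are hypothetical stand-ins for the measured data.

```matlab
% Hypothetical stand-in data: column 1 = output, column 2 = input.
z = [10.2 3.1; 10.4 3.0; 10.1 3.3; 10.3 3.2];

% Subtract each column's mean, which is what removing the offset amounts to.
zd = z - repmat(mean(z, 1), size(z, 1), 1);

% Both column means are now zero up to rounding error.
disp(mean(zd, 1))
```

Only the offsets change; the shape of each signal, and therefore the dynamic information in it, is left untouched.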

9. The time delay between the input and the output is a critical factor that has to be considered when SID is performed. The time delay can best be determined by zooming in on the plot and estimating how long the output takes to react to a large input change. Now zoom in on the pre-treated data:

>> idplot(zfd(1:50,:))

From the plot we can see that the time-delay is approximately 5 sampling periods.

Since the data was acquired using a sampling interval of 10 seconds, the time delay is

approximately 50 seconds.

10. We are now ready to fit the data. We will use the ARX (AutoRegressive with eXternal input) method. In difference equation format the ARX model is given as follows:

$$\underbrace{y(t) + a_1 y(t-T) + a_2 y(t-2T) + \cdots}_{\text{auto-regression part}} = \underbrace{b_1 u(t-nT) + b_2 u(t-nT-T) + \cdots}_{\text{external input part}}$$

where T is the sampling interval, y(t) is the output, u(t) is the input, a_i and b_i are the model coefficients, and n is the number of periods of time delay. In this tutorial we only fit a first order type model. For a first order model only the coefficients a_1 and b_1 are used.

The ARX model in difference equation format is then

$$y(t) + a_1 y(t-T) = b_1 u(t-nT).$$

Now fit the data:

>> th=arx(zfd,[1 1 5]);

[1 1 5] specifies the model structure: one a coefficient (a1), one b coefficient (b1), and a time delay of n = 5 sampling periods. The results of the arx computation are stored in th.
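
To illustrate what a first order ARX fit involves, here is a minimal sketch that estimates a1 and b1 by ordinary least squares using only base Matlab. It is not the toolbox's implementation of arx; the synthetic data, the "true" coefficients and all variable names are assumptions made for the example.

```matlab
% Synthetic, noise-free data from a known first-order system (hypothetical values).
N  = 300;                            % number of samples
n  = 5;                              % time delay in sampling periods
a1 = -0.9;  b1 = 0.05;               % "true" coefficients to be recovered
u  = mod(floor((0:N-1)'/25), 2);     % square-wave input, 25 samples low, 25 high
y  = zeros(N, 1);
for k = n+1:N
    y(k) = -a1*y(k-1) + b1*u(k-n);   % y(t) + a1*y(t-T) = b1*u(t-nT)
end

% Write the model as y(k) = [-y(k-1)  u(k-n)] * [a1; b1] and solve it.
k     = (n+1:N)';
Phi   = [-y(k-1), u(k-n)];           % regression matrix
theta = Phi \ y(k);                  % least-squares estimate of [a1; b1]

fprintf('estimated a1 = %.4f, b1 = %.4f\n', theta(1), theta(2));
```

Because the data here are noise-free, the estimate matches the coefficients used to generate them; with measured data the estimate is only the best fit in the least-squares sense.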

11. Is our model good enough? The following tests are commonly used:

a. First we simulate the model with the data we used to fit the model, and then with the data for validation.

>> compare(zfd,th,1);

The 1 in the above command tells Matlab to use one step ahead prediction. Nearly all

models will fare well with this test since the model knows exactly what the current

real output is, and needs only to predict the next output. A far more stringent test

would be where the model knows only what the inputs are (pure simulation):

>> compare(zfd,th);

The most stringent test of the model occurs when it is validated on data that was not

used to fit the model:

>> compare(zvd,th);
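
The difference between one step ahead prediction and pure simulation can be sketched in base Matlab as follows. The plant, the deliberately imperfect model coefficients and the variable names are all hypothetical; the point is only that the predictor is corrected by the measured output at every step, while the simulation is not.

```matlab
% Hypothetical "measured" data from a first-order system with a 5-sample delay.
N = 200;  n = 5;
u = mod(floor((0:N-1)'/25), 2);        % square-wave input
y = zeros(N, 1);
for k = n+1:N
    y(k) = 0.9*y(k-1) + 0.05*u(k-n);   % "true" plant: a1 = -0.9, b1 = 0.05
end

% A deliberately imperfect model (hypothetical coefficients).
a1m = -0.85;  b1m = 0.05;

yp = zeros(N, 1);   % one-step-ahead prediction: uses the measured y(k-1)
ys = zeros(N, 1);   % pure simulation: uses only the model's own past output
for k = n+1:N
    yp(k) = -a1m*y(k-1)  + b1m*u(k-n);
    ys(k) = -a1m*ys(k-1) + b1m*u(k-n);
end

rms_pred = sqrt(mean((y - yp).^2));
rms_sim  = sqrt(mean((y - ys).^2));
fprintf('RMS error, one-step prediction: %.4f\n', rms_pred);
fprintf('RMS error, pure simulation:     %.4f\n', rms_sim);
```

In this sketch the simulation error accumulates the model mismatch from step to step, so it is larger than the prediction error, which is why simulation is the more demanding test.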

b. The error signal (residuals) contains important information on the shortcomings of the model, and is generated as in Figure 2:

[Block diagram: the input u drives both the real plant, with output y, and the model, with output ŷ; the error is y - ŷ.]

Figure 2: The error signal

An auto-correlation of the error signal determines whether the error signal is white noise or not. If it is, your model is an unbiased estimator, which means that the errors that the model makes have an equal chance of being positive or negative. The error signal as well as the auto-correlation are generated by

>> error=resid(zfd,th);

You can also plot the error signal to inspect it for outliers. Outliers are errors that

are significantly different from average. To plot the error use:

>> plot(error);

The resid command also generates a cross-correlation between the input u and the error signal. From this test one can determine whether the error signal and u are independent.

If correlation is present at negative lags, it means that output feedback occurs: the current error influences the future input. If correlation is present at positive lags, it means that the time delay was estimated incorrectly.

If the auto-correlation of the error signal and the cross-correlation of the error signal and the input are within the dotted red lines, we are 99% sure that the error is white noise, and that the error and input are independent. If this is the case, the model is, statistically speaking, a very good model! (The dotted red lines indicate the 99% confidence intervals.)
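
The whiteness test can be sketched in base Matlab as follows. This is not how resid is implemented; it assumes the usual large-sample approximation in which the normalised auto-correlation of white noise stays within about 2.58/sqrt(N) with 99% probability. The residual sequence e is a hypothetical stand-in generated with randn.

```matlab
% Hypothetical residuals; in practice these would come from the fitted model.
N = 400;
e = randn(N, 1);              % white by construction
e = e - mean(e);

% Normalised auto-correlation for lags 0..maxlag.
maxlag = 25;
r = zeros(maxlag+1, 1);
for lag = 0:maxlag
    r(lag+1) = sum(e(1+lag:N) .* e(1:N-lag)) / sum(e.^2);
end

% Approximate 99% confidence band for a white sequence of length N.
bound   = 2.58 / sqrt(N);
outside = sum(abs(r(2:end)) > bound);
fprintf('%d of %d lags fall outside +/-%.3f\n', outside, maxlag, bound);
```

For a white sequence only the occasional lag should fall outside the band; many lags outside it would suggest that the model has left structure in the residuals.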

12. To get the model in usable form for simulation and controller design, we have to convert the th model to a Laplace transform model. The ARX difference equation with parameters a1 and b1 can be determined as follows.

>> [dend,numd]=th2arx(th)

Two vectors, dend and numd, are returned. Their formats are

dend = [1 a1]

numd = [0 0 0 0 0 b1]

numd contains a zero for each time delay.

13. The ARX model can now be converted to a continuous Laplace transfer function as

follows:

>> [numc,denc]=d2cm([0 numd(6)],dend,10,'zoh');

The 10 in the above command indicates the sampling period, and the 'zoh' indicates zero-order-hold for the discrete to continuous transformation. numd(6) indicates that we are entering only the 6th element of numd, i.e. b1. d2cm stands for convert from discrete-to-continuous with method. In our case the method is zero-order-hold.

The results that you obtain for numc and denc should be

numc = [0 0.9326e-3]

denc = [1 0.0059]

The transfer function is then given by

$$g_{11}(s) = \frac{932.6 \times 10^{-6}\, e^{-50s}}{s + 0.0059} = \frac{0.1581\, e^{-50s}}{169 s + 1}.$$
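
As a check on the last step, the sketch below recomputes the gain and time constant from the continuous coefficients quoted in the text (numc = [0 0.9326e-3], denc = [1 0.0059]), and shows the standard zero-order-hold relation between a first order continuous model and its discrete coefficients. It uses base Matlab only and ignores the time delay; the variable names are assumptions made for the example.

```matlab
T    = 10;                  % sampling interval in seconds (from the text)
numc = [0 0.9326e-3];       % continuous numerator quoted in the text
denc = [1 0.0059];          % continuous denominator quoted in the text

% Gain / time-constant form K/(tau*s + 1).
p   = denc(2);              % continuous pole location
K   = numc(2) / p;          % steady-state gain
tau = 1 / p;                % time constant in seconds

% Discrete coefficients that a zero-order hold pairs with this model.
a1 = -exp(-p*T);
b1 = K * (1 + a1);

% Inverting the mapping recovers the continuous pole and gain.
p_back = -log(-a1) / T;
K_back = b1 / (1 + a1);

fprintf('K = %.4f, tau = %.1f s\n', K, tau);
fprintf('a1 = %.4f, b1 = %.5f\n', a1, b1);
```

The time constant evaluates to roughly 169.5 s, which the transfer function in the text gives as 169.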
