INTERNATIONAL JOURNAL OF ADAPTIVE CONTROL AND SIGNAL PROCESSING
Int. J. Adapt. Control Signal Process. 2002; 16:243-272 (DOI: 10.1002/acs.698)

Adaptive inverse control of unmodeled stable SISO and MIMO linear systems

Gregory L. Plett*†

Department of Electrical and Computer Engineering, University of Colorado at Colorado Springs, 1420 Austin Bluffs Parkway, P.O. Box 7150, Colorado Springs, CO 80933-7150, U.S.A.

* Correspondence to: Gregory L. Plett, Department of Electrical and Computer Engineering, University of Colorado Springs, 1420 Austin Bluffs Parkway, P.O. Box 7150, Colorado Springs, CO 80933-7150, U.S.A.

† E-mail: glp@eas.uccs.edu

Contract/grant sponsor: The National Science Foundation; contract/grant number: ECS-9522085
Contract/grant sponsor: Electric Power Research Institute; contract/grant number: WO8016-17

Received 30 March 2000. Accepted 29 May 2001. Published online 14 February 2002.
Copyright 2002 John Wiley & Sons, Ltd.

SUMMARY

Adaptive-signal-processing techniques have been employed with great success in such applications as system identification, channel equalization, statistical prediction and noise/echo cancellation. From a mathematical point of view, there is little difference between these applications and the types of operations required by control systems to control a dynamical system. This paper presents an approach to control systems called adaptive inverse control in which adaptive-signal-processing techniques are used throughout.

Adaptive inverse control comprises three simultaneous processes. The plant is automatically modeled using adaptive system identification techniques. The dynamic response of the system is adaptively controlled using the resulting model and methods related to channel equalization. Adaptive disturbance canceling is performed using methods similar to noise canceling.

The method applies directly to stable single-input single-output (SISO) and multi-input multi-output (MIMO) plants, and does not require an a priori model of the system. If the plant is unstable, it must first be stabilized using conventional feedback. This implies that at least a rudimentary model need be made if the plant is unstable. Once the plant is stabilized, adaptive inverse control may be applied to the stabilized system.

KEY WORDS: adaptive inverse control, system identification, adaptive filtering, adaptive inverse filtering, disturbance cancellation, BPTM algorithm

1. INTRODUCTION

Basic adaptive-signal-processing techniques are well understood and have been successfully applied to applications as varied as finance, medicine and communications. One application which has traditionally been treated outside the realm of signal processing is that of controlling a dynamical system. In this paper we present an approach to digital control systems called adaptive inverse control [1-6] in which adaptive-signal-processing techniques are used throughout.



Figure 1. Adaptive inverse control system.

Figure 1 shows a block-diagram of an adaptive inverse control system. The dynamical system we wish to control is labeled the 'plant'. It is subject to disturbances, and these are modeled as a stochastic process added to the plant output, without loss of generality. The plant is assumed to be linear, time-invariant (slow variations may be compensated for by the adaptive process), and stable but otherwise unknown. To control the plant, we first generate a plant model $\hat{P}$ using adaptive system-identification techniques. Secondly, the dynamic response of the system is controlled using C, which is adapted using information from $\hat{P}$. The output of the plant model is compared to the measured output from the plant, and the difference is a good estimate of the disturbance. A special adaptive filter X is used to cancel the disturbances.

Control of linear systems requires linear adaptive signal-processing methods. Control of non-linear systems may also be done by using non-linear adaptive signal-processing methods [1-6], but this is beyond the scope of this paper. Here, we proceed by first reviewing linear adaptive filtering and applying it directly to system identification. Next, we discuss adaptive feedforward control of plant dynamics and adaptive disturbance canceling. We conclude with simulation examples to demonstrate the techniques.

2. ADAPTIVE DIGITAL FILTERING

An adaptive filter is illustrated in Figure 2. It has an input, an output, and a 'special input' called the desired response. The desired response $d_k$ specifies the output we wish the filter to have. It is used to calculate an error signal $e_k$, which in turn is used to modify the internal parameters of the filter in such a way that the filter 'learns' to perform a certain function.

A linear filter computes its output as a weighted sum of its current and $N_x$ previous inputs and its $N_y$ previous outputs

$$y_k = W X_k \qquad (1)$$

where the column-vector

$$X_k = [x_k^T,\ x_{k-1}^T \cdots x_{k-N_x}^T,\ y_{k-1}^T,\ y_{k-2}^T \cdots y_{k-N_y}^T]^T$$

and $W$ is the weight matrix of the filter.


Figure 2. Symbolic representation of an adaptive filter.

Linear SISO filters have a single input and a single output at each time instant; for (1) to represent a SISO system, $x_k$ and $y_k$ must be scalars. Linear MIMO filters have (possibly) many inputs and outputs at each time instant; this is also accommodated by (1) if $x_k$ and $y_k$ are (column) vector signals.

Linear adaptive filters come in two basic flavors: finite impulse response (FIR), and infinite impulse response (IIR). An FIR filter has no self-feedback. When an FIR filter is excited by an impulse at its input, the response of the filter is non-zero for a finite period of time. An IIR filter, on the other hand, has self-feedback and may respond with non-zero values for an infinite period of time. Using the notation of (1), an FIR filter is one for which all the weights associated with the self-feedback inputs $y_{k-1}, y_{k-2}, \ldots$ are zero. An IIR filter may have non-zero feedback weights.

Any stable linear system may be approximated by a 'sufficiently long' FIR filter. Therefore, for this work it is sufficient to use FIR filters. However, all derivations are given for IIR filters since they can often represent the same system with fewer parameters, and therefore learn more quickly.
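To make the FIR/IIR distinction concrete, the following is a minimal SISO sketch of equation (1) in plain Python. The function names `linear_filter_step` and `impulse_response`, and the split of the weight matrix into feedforward weights `w_in` and feedback weights `w_fb`, are illustrative assumptions and are not taken from the paper.

```python
def linear_filter_step(w_in, w_fb, x_hist, y_hist):
    """One evaluation of y_k = W X_k for a SISO filter (hypothetical helper).

    x_hist = [x_k, x_{k-1}, ...] and y_hist = [y_{k-1}, y_{k-2}, ...].
    w_in weights the inputs, w_fb weights the fed-back outputs.
    """
    return (sum(w * x for w, x in zip(w_in, x_hist))
            + sum(w * y for w, y in zip(w_fb, y_hist)))


def impulse_response(w_in, w_fb, n):
    """First n samples of the response to a unit impulse at k = 0."""
    x_hist = [0.0] * len(w_in)
    y_hist = [0.0] * len(w_fb)
    out = []
    for k in range(n):
        # Shift the newest input sample into the tapped delay line.
        x_hist = [1.0 if k == 0 else 0.0] + x_hist[:-1]
        y = linear_filter_step(w_in, w_fb, x_hist, y_hist)
        if y_hist:
            # Only an IIR filter stores its own past outputs.
            y_hist = [y] + y_hist[:-1]
        out.append(y)
    return out


fir = impulse_response([1.0, 0.5], [], 4)   # no feedback weights: FIR
iir = impulse_response([1.0], [0.5], 4)     # one feedback weight: IIR
```

With no feedback weights the response dies out after two samples, whereas the single feedback weight of 0.5 produces a geometrically decaying response that never becomes exactly zero.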

2.1. Adaptation algorithm

The weights in $W$ may be adapted in a variety of ways in order that the output of the filter learns to closely match the desired output [7]. The method presented here is probably familiar to most readers. It is a version of real-time recurrent learning (RTRL) [8], modified for linear systems. We re-develop it in detail since the same method is later extended to become the controller and disturbance-canceler adaptation rule.

An adaptive filter has a desired response input signal $d_k$ which is used when adapting the weights of the filter. At each time instant the filter output $y_k$ is compared to this desired response, and the error is computed to be $e_k = d_k - y_k$. As the system runs, we wish to modify the weights of the filter in order to minimize the expected squared error. That is, we wish to minimize the cost function $J_k$, where

$$J_k = E[e_k^T e_k].$$

In practice, we accomplish this in real time by minimizing the stochastic squared error, $J_k = e_k^T e_k$. The weights are updated by a gradient-descent optimization procedure

$$\Delta W = -\mu \frac{dJ_k}{dW}.$$

The small positive constant $\mu$ is called the learning rate, and controls the step size in the direction of the negative gradient.


Suppose $W$ has $N_o$ outputs. Then, the single MIMO filter $W$ is equivalent to $N_o$ single-output linear filters operating on $X_k$ to produce the components of $y_k$. We let $W_1^T$ be the first such filter, $W_2^T$ be the second, and so on. Then, we observe that

$$W = [W_1\ W_2 \cdots W_{N_o}]^T.$$

This definition is useful for actually computing the filter output. However, for the purpose of computing the weight updates, we use

$$\vec{W} = \operatorname{vec}(W) = [W_1^T\ W_2^T \cdots W_{N_o}^T]^T$$

which is the vector generated by stacking the columns of $W$ on top of one another.

We wish to adapt the values in $\vec{W}$ to optimize $J_k$. We can compute $dJ_k/d\vec{W}$ using standard vector calculus

$$\frac{dJ_k}{d\vec{W}} = -2 e_k^T \frac{dy_k}{d\vec{W}}.$$

When this is computed, we reshape the row-vector $dJ_k/d\vec{W}$ into the matrix $dJ_k/dW$ and update the weight matrix $W$. Therefore, to compute the weight update, it remains only to find $dy_k/d\vec{W}$. Using (1) we can write

$$\frac{dy_k}{d\vec{W}} = \frac{\partial y_k}{\partial \vec{W}} + \sum_{i=0}^{N_x} \frac{\partial y_k}{\partial x_{k-i}} \frac{dx_{k-i}}{d\vec{W}} + \sum_{i=1}^{N_y} \frac{\partial y_k}{\partial y_{k-i}} \frac{dy_{k-i}}{d\vec{W}}. \qquad (2)$$

The first term, $\partial y_k/\partial \vec{W}$, is an $N_o \times N_w$ matrix ($N_w$ being the number of entries of $\vec{W}$) which may be computed by vectorizing (1) to get $y_k = [X_k^T \otimes I]\vec{W}$ (see Brogan [9]), where $\otimes$ is the Kronecker matrix product. Then

$$\frac{\partial y_k}{\partial \vec{W}} = [X_k^T \otimes I]. \qquad (3)$$

The first summation in (2) is zero since $x_k$ is independent of $\vec{W}$ and therefore $dx_k/d\vec{W}$ is zero. The first term in the second summation, $\partial y_k/\partial y_{k-i}$, is an $N_o \times N_o$ matrix equal to the columns of $W$ which multiply $y_{k-i}$ in (1). The final term, $dy_{k-i}/d\vec{W}$, is a stored, previously computed version of $dy_k/d\vec{W}$. All terms in (2) are now accounted for, and so the weight updates may be computed.

If the system is FIR, previous versions of $dy_k/d\vec{W}$ need not be stored, and matrices need not be reshaped. The weight update is simply

$$\Delta W = 2 \mu e_k X_k^T.$$

This rule, commonly known as the LMS algorithm, is well described in several textbooks [10,11].
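The FIR update can be sketched in a few lines of plain Python for the SISO case. This is only an illustration under stated assumptions: the function name `lms_identify`, the white uniform excitation, the noise-free desired response and the particular tap values are all hypothetical choices, not part of the paper.

```python
import random


def lms_identify(true_taps, mu, steps, seed=0):
    """Adapt an FIR filter with the LMS rule, Delta W = 2 * mu * e_k * X_k^T.

    The desired response d_k is generated by a hypothetical FIR system with
    coefficients true_taps, so the adapted weights should approach true_taps.
    """
    rng = random.Random(seed)
    n = len(true_taps)
    w = [0.0] * n          # adaptive weights, started at zero
    x_hist = [0.0] * n     # X_k = [x_k, x_{k-1}, ...]
    for _ in range(steps):
        x_hist = [rng.uniform(-1.0, 1.0)] + x_hist[:-1]
        d = sum(t * x for t, x in zip(true_taps, x_hist))   # desired response
        y = sum(wi * x for wi, x in zip(w, x_hist))         # filter output
        e = d - y                                           # error e_k = d_k - y_k
        w = [wi + 2.0 * mu * e * x for wi, x in zip(w, x_hist)]
    return w


weights = lms_identify([0.5, -0.3], mu=0.05, steps=2000)
```

Because no disturbance is added in this sketch, the weights settle essentially exactly on the generating coefficients; with a disturbed desired response they would instead fluctuate around them.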


‡ In some interesting papers by Youla and colleagues [12,13] the Wiener solution is also used for non-adaptive control design, but the plant and disturbance dynamics are assumed known.

2.2. Optimal solution

A nice property of linear filtering is that the optimal solution is mathematically tractable if certain statistical information about the input and desired response is available. This solution is known as the Wiener solution [6,10,11].

The details for the SISO case are widely known [6]. The MIMO solution may be developed similarly to arrive at the following results. If we let $\phi_{xd}(n) = E[x_k d_{k+n}^T]$ be the crosscorrelation function between the input $x_k$ and the desired response $d_k$, and $\phi_{xx}(n) = E[x_k x_{k+n}^T]$ be the input autocorrelation function, then the unconstrained solution, $W^{(\mathrm{opt})}(z)$, may be found to be

$$W^{(\mathrm{opt})}(z) = [\Phi_{xd}(z)]^T [\Phi_{xx}(z)]^{-1}$$

where $\Phi_{xd}(z)$ and $\Phi_{xx}(z)$ are the z-transforms of $\phi_{xd}(n)$ and $\phi_{xx}(n)$, respectively. Note that this solution allows for the filter $W^{(\mathrm{opt})}$ to be non-causal. The Shannon-Bode (sometimes called the Wiener-Hopf) solution for the optimal causal filter is

$$W^{(\mathrm{opt})}_{\mathrm{causal}}(z) = \left[ [\Phi_{xd}(z)]^T [\Phi_{xx}^{-}(z)]^{-1} \right]_+ [\Phi_{xx}^{+}(z)]^{-1}$$

where $\Phi_{xx}(z) = \Phi_{xx}^{+}(z)\,\Phi_{xx}^{-}(z)$ and $\Phi_{xx}^{+}(z)$ has all the poles and zeros of $\Phi_{xx}(z)$ which are inside the unit circle in the z-plane. Furthermore, the operator $[\cdot]_+$ means 'take the time series generated by the inverse-z-transform of the operand, retain only the causal section (set the non-causal entries to zero), and take the z-transform of the result'.

Note that in order to compute the Wiener solution, statistical information regarding $\Phi_{xd}(z)$ and $\Phi_{xx}(z)$ must be known. In this paper we are careful to use the Wiener solution only to demonstrate correctness of algorithms, and not to require knowledge of the Wiener solution when implementing the controller.‡ That is, $\Phi_{xd}(z)$ and $\Phi_{xx}(z)$ exist but need not be known.

2.3. Stability and convergence of algorithms

The LMS algorithm converges in the mean square [10,11] to the Wiener solution if

$$0 < \mu < \frac{2}{\text{average tap-input power}}$$

and if the adaptive filter has an infinite number of taps. Practically, a finite number of taps suffices for stable plants since the impulse response of the plant will decay to nearly zero in finite time. Since the average tap-input power is unknown a priori, it is useful to define a time-varying adaptive gain $\mu_k$ based on the instantaneous tap-input power as

$$\mu_k = \min_{0 \le j \le k} \frac{\alpha}{\|X_j\|^2}$$

where $0 < \alpha < 2$. A choice of $\alpha$ between 0.1 and 1 is typical.
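A running-minimum gain of this kind could be tracked as sketched below. The closure `make_step_size`, its `floor` guard against an all-zero tap vector, and the convention of returning 0.0 until some input power has been observed are assumptions added for the sketch; the paper does not specify how these edge cases are handled.

```python
def make_step_size(alpha, floor=1e-12):
    """Return a function that tracks the smallest alpha / ||X_j||^2 seen so far.

    alpha is the user-chosen constant; floor (an assumption of this sketch)
    skips tap vectors with negligible power to avoid division by zero.
    """
    smallest = [None]   # running minimum, None until power has been observed

    def step(x_vec):
        power = sum(x * x for x in x_vec)        # instantaneous tap-input power
        if power > floor:
            candidate = alpha / power
            if smallest[0] is None or candidate < smallest[0]:
                smallest[0] = candidate
        return 0.0 if smallest[0] is None else smallest[0]

    return step


step = make_step_size(alpha=0.5)
mu_0 = step([1.0, 0.0])   # power 1
mu_1 = step([3.0, 4.0])   # power 25, the gain shrinks
mu_2 = step([1.0, 1.0])   # power 2, the gain keeps its smaller earlier value
```

Since the gain only ever decreases, a single large input burst permanently reduces the step size, which trades convergence speed for guaranteed stability.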


Figure 3. Adaptive plant modeling: (a) overall representation; (b) details of 'feedback'.

3. ADAPTIVE PLANT MODELING

The first step in performing adaptive inverse control is to make an adaptive model of the plant. The model should capture the dynamics of the plant well enough that a controller designed to control the plant model will also control the plant very well. This is a straightforward application of the adaptive filtering techniques in Section 2.

A method for adaptive plant modeling is depicted in Figure 3(a). The plant is excited with the signal $u_k$, and the disturbed output $y_k$ is measured. The plant model $\hat{P}$ is also excited with $u_k$, and its output $\hat{y}_k$ is computed as $\hat{y}_k = W_{\hat{P}} U_k$. The plant modeling error is the difference between the model output and the measured plant output: $e_k^{(\mathrm{mod})} = y_k - \hat{y}_k$. This modeling error is then used by the adaptation algorithm to update the weight values of the adaptive filter.

If the plant model is FIR, then the simple LMS algorithm may be used to adapt its weights. If the plant model is IIR, there is a choice regarding how to adapt its parameters, and the method used depends on how the plant model's self-feedback is 'hooked up'. Figure 3(b) illustrates the two methods. In the series-parallel method, the input vector to the plant model is

$$U_k = [u_k^T,\ u_{k-1}^T \cdots u_{k-N_u}^T,\ y_{k-1}^T,\ y_{k-2}^T \cdots y_{k-N_y}^T]^T.$$

That is, the 'feedback' to the plant model comprises delayed versions of the measured plant output. In the parallel method, the input vector to the plant model is

$$U_k = [u_k^T,\ u_{k-1}^T \cdots u_{k-N_u}^T,\ \hat{y}_{k-1}^T,\ \hat{y}_{k-2}^T \cdots \hat{y}_{k-N_y}^T]^T.$$

That is, the feedback to the plant model comprises delayed versions of the plant-model output.

The advantage to using the series-parallel method is that there is no recurrence in the plant model itself, so the model is 'effectively' FIR and the LMS algorithm may be used to adapt its parameters. Unfortunately, if there is disturbance in the plant output, this method will adapt to a biased solution. The parallel method requires using the RTRL method of Section 2.1 to adapt its parameters, but adapts to an unbiased solution. Since the LMS algorithm often converges much more quickly than the RTRL algorithm, we recommend that the plant model be first configured in series-parallel until the weights have approached convergence. This initializes the weights to reasonable values. Then, the plant model should be configured in parallel and adaptation should continue.
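As a small illustration of the series-parallel configuration, the sketch below identifies a hypothetical first-order plant $y_k = a\,y_{k-1} + b\,u_k$ with LMS, using the regressor $U_k = [u_k, y_{k-1}]$ built from the measured output. The plant, its coefficients, the function name `identify_series_parallel` and the disturbance-free setting are assumptions made for the example; with a disturbed output the same procedure would show the bias described above.

```python
import random


def identify_series_parallel(a, b, mu, steps, seed=0):
    """LMS identification of y_k = a*y_{k-1} + b*u_k in series-parallel form.

    The model input vector is U_k = [u_k, y_{k-1}], where y_{k-1} is the
    measured plant output, so the model contains no recurrence of its own.
    """
    rng = random.Random(seed)
    w = [0.0, 0.0]        # model weights, ideally converging to [b, a]
    y_prev = 0.0
    for _ in range(steps):
        u = rng.uniform(-1.0, 1.0)
        y = a * y_prev + b * u                 # (undisturbed) plant output
        U = [u, y_prev]                        # series-parallel regressor
        y_hat = w[0] * U[0] + w[1] * U[1]      # model output
        e = y - y_hat                          # modeling error
        w = [w[0] + 2.0 * mu * e * U[0],
             w[1] + 2.0 * mu * e * U[1]]       # plain LMS update
        y_prev = y
    return w


w_model = identify_series_parallel(a=0.8, b=0.5, mu=0.05, steps=5000)
```

Switching to the parallel configuration would mean feeding back the model output `y_hat` instead of `y_prev`, at which point the stored-derivative recursion of Section 2.1 is needed in place of the plain LMS update.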


§ For a MIMO system, $p_k$ is matrix-valued at each time index $k$.

Assuming that the plant model has a 'sufficient' number of taps and that the adaptation constant $\mu$ is chosen for stable adaptation, we can use the Wiener optimality techniques to compute the optimal solution for the adaptive plant model.

$$\begin{aligned}
\phi_{uy}(n) &= E[u_k y_{k+n}^T] \\
&= E[u_k ((p * u)_{k+n} + w_{k+n})^T] \\
&= E[u_k ((p * u)_{k+n})^T] + E[u_k w_{k+n}^T] \\
&= \phi_{uu}(n) * p_n^T + \phi_{uw}(n)
\end{aligned}$$

where '$*$' denotes convolution and $p_k$ is the impulse response of the plant.§ If the disturbance is zero-mean and uncorrelated with the plant input, then

$$\begin{aligned}
\phi_{uy}(n) &= \phi_{uu}(n) * p_n^T \\
\Phi_{uy}(z) &= \Phi_{uu}(z) P^T(z) \\
\hat{P}^{(\mathrm{opt})}(z) &= \left[ [\Phi_{uy}(z)]^T [\Phi_{uu}^{-}(z)]^{-1} \right]_+ [\Phi_{uu}^{+}(z)]^{-1} \\
\hat{P}^{(\mathrm{opt})}(z) &= P(z)
\end{aligned}$$

assuming the plant is causal. So, the adaptive plant model converges to the plant. Adaptation of an FIR model using LMS will be stable if

$$\mu_k = \min_{0 \le j \le k} \frac{\alpha}{\|U_j\|^2}, \qquad 0 < \alpha < 2.$$

If the disturbance is not zero-mean (but still uncorrelated with the plant input), then an affine plant model may be made, where the linear model is augmented with a bias weight (a tap with constant input '1') which is adapted to estimate the mean of the disturbance. The dynamic model of the plant is still unbiased.
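The affine plant model can be sketched by appending a constant '1' to the model input vector. In the illustration below the disturbance is simplified to a pure constant offset so that the result is easy to check; the FIR plant taps, the offset value and the function name `identify_affine` are hypothetical.

```python
import random


def identify_affine(taps, offset, mu, steps, seed=0):
    """LMS identification of an FIR plant whose output carries a constant offset.

    The model input is [u_k, u_{k-1}, ..., 1]; the last weight is the bias
    weight, which should converge to the mean of the disturbance (offset).
    """
    rng = random.Random(seed)
    n = len(taps)
    w = [0.0] * (n + 1)     # n dynamic weights plus one bias weight
    u_hist = [0.0] * n
    for _ in range(steps):
        u_hist = [rng.uniform(-1.0, 1.0)] + u_hist[:-1]
        y = sum(p * u for p, u in zip(taps, u_hist)) + offset   # measured output
        U = u_hist + [1.0]                                      # constant input '1'
        e = y - sum(wi * ui for wi, ui in zip(w, U))            # modeling error
        w = [wi + 2.0 * mu * e * ui for wi, ui in zip(w, U)]    # LMS update
    return w


w_affine = identify_affine([0.5, -0.3], offset=0.7, mu=0.05, steps=3000)
```

The first two weights recover the plant dynamics and the last weight absorbs the offset, which is the sense in which the dynamic part of the model remains unbiased.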

A final comment should be made regarding the relative timing of the $u_k$ and $y_k$ signals. In order to be able to model either a strictly proper or a non-strictly proper plant, we output $u_k$ at time $t = (kT)^-$ and measure $y_k$ at time $t = (kT)$, $T$ being the sampling period. We will see that this assumption can limit the extent to which we can effectively cancel disturbance. If we know a priori that the plant is strictly proper, then we may instead measure $y_k$ at time $t = (kT)^-$ and use its value when computing $u_k$ (which is output at time $t = (kT)$), since we know that there is no immediate effect on the plant output due to $u_k$.


¶ Then, the combination of the plant and its feedback stabilizer can be regarded as an equivalent stable plant. This is shown in Appendix D of Widrow and Walach [6]. The author knows of no general method to stabilize an unmodeled unstable linear plant, so at least a rudimentary model is required when controlling unstable systems.

4. FEEDFORWARD MODEL-REFERENCE CONTROL OF PLANT DYNAMICS

To perform adaptive inverse control, we need to be able to adapt the three filters of Figure 1: the plant model $\hat{P}$, the controller C, and the disturbance canceler X. We have now seen how to adapt $\hat{P}$ to make a plant model. For the time being we set aside consideration of the disturbance canceling filter X and concentrate on the design of the feedforward controller C.

Traditionally, feedback is used to ensure stability and performance. Here, we restrict our development to stable plants or plants which have been stabilized via an additional feedback loop,¶ and so do not require feedback to stabilize the plant. Precise control is maintained through the adaptive process which acts as a weak feedback signal.

The goal is to make the dynamics of the controlled system PC approximate the fixed filter M as closely as possible, where M is a user-specified reference model. The input reference signal $r_k$ is filtered through M to create a desired response for the plant output $d_k$. The measured plant output is compared with the desired plant output to create a system error signal $e_k^{(\mathrm{sys})} = d_k - y_k$. We will adapt C to minimize the mean-squared system error while constraining the control effort $u_k$.

The reference model M may be designed in a number of ways. Following traditions of control theory, we might design M to have a certain step response resembling a second-order system which meets design specifications. However, we can often achieve even better tracking control if we let M simply be a delay corresponding to the transport delay of the plant. The controller C will adapt to a delayed inverse of the plant dynamics. (Since we are minimizing system error, we will actually invert the plant dynamics and not the plant-model dynamics, so we do not need to worry about plant/plant-model mismatch as long as the plant model is accurate enough to compute reasonable derivatives for the adaptation algorithm.)

We note that if the plant is minimum-phase, that is, has all of its poles and zeros inside the unit circle in the z-plane, then the inverse will be stable with all of its poles inside the unit circle. If the plant is non-minimum-phase, then some of the poles of the inverse will be outside the unit circle. According to the theory of two-sided z-transforms [6], the inverse will then either be unstable or non-causal. Since minimizing system error will not lead to an unstable solution (which would have unbounded system error), the algorithm will attempt to match a non-causal solution. This will result in very poor control unless the reference model has built-in latency. The longer the latency, the better we can approximate a delayed version of a non-causal inverse with a causal controller. Therefore, there exists a design tradeoff between latency and system error for non-minimum-phase systems.

4.1 Adapting a constrained controller via the BPTM algorithm

Recall that an adaptive filter requires a desired response input signal in order to be able to adapt its weights. While we do have a desired response for the entire system, $d_k$, we do not have a desired response for the output of the controller C. A key hurdle which must be overcome by the controller adaptation algorithm is to find a mechanism for converting the system error to an adaptation signal used to adjust C. For SISO systems, there are a number of simple methods we might use which take advantage of the commutability of transfer functions [6]. For the most general MIMO case, however, these will not work.

Figure 4. Structure diagram illustrating the BPTM method.

One solution is to regard the series combination of C and $\hat{P}$ as a single adaptive filter. There is a desired response for this combined filter: $d_k$. Therefore, we can use $d_k$ to compute weight updates for the conglomerate filter. However, we only apply the weight updates to the weights in C; $\hat{P}$ is still updated using the plant modeling error $e_k^{(\mathrm{mod})}$. Figure 4 shows the general framework to be used. We say that the system error is back-propagated through the plant model, and used to adapt the controller. For this reason, the algorithm is named 'BackProp Through (Plant) Model' (BPTM).

The algorithm is derived as follows. We wish to train the controller C to minimize the mean-squared system error and to simultaneously minimize some function of the control effort. We construct the following cost function which we will minimize

$$J_k = E\left[ e_k^{(\mathrm{sys})T} Q\, e_k^{(\mathrm{sys})} + h(u_k, u_{k-1}, \ldots, u_{k-r}) \right].$$

The differentiable function $h(\cdot)$ defines the cost function associated directly with the control signal $u_k$, and is used to penalize excessive control effort, slew rate and so forth. The system error is the signal $e_k^{(\mathrm{sys})} = d_k - y_k$, and the positive-definite symmetric matrix $Q$ is a weighting matrix which assigns different performance objectives to each plant output.

We let the controller weight matrix be $W_C$ and the plant-model weight matrix be $W_{\hat{P}}$. Then, we can compute

$$u_k = W_C R_k \quad \text{and} \quad y_k \approx \hat{y}_k = W_{\hat{P}} U_k \qquad (4)$$

noting that $y_k$ differs from $\hat{y}_k$ by disturbance and plant-modeling error. Also

$$R_k = [r_k^T,\ r_{k-1}^T \cdots r_{k-N_r}^T,\ u_{k-1}^T,\ u_{k-2}^T \cdots u_{k-N_c}^T]^T$$

$$U_k = [u_k^T,\ u_{k-1}^T \cdots u_{k-N_u}^T,\ \hat{y}_{k-1}^T,\ \hat{y}_{k-2}^T \cdots \hat{y}_{k-N_y}^T]^T.$$

Since the modeling error is uncorrelated with the model input and since we wish to control the undisturbed plant output (and later use a separate circuit to control disturbance) the above approximation is justified.


Using gradient descent, the controller weights are updated in the direction of the negative gradient of the stochastic cost function

$$(\Delta \vec{W}_C)_k^T = -\mu \frac{dJ_k}{d\vec{W}_C} = -\mu \frac{d}{d\vec{W}_C} \left[ e_k^{(\mathrm{sys})T} Q\, e_k^{(\mathrm{sys})} + h(u_k, u_{k-1}, \ldots, u_{k-r}) \right]$$

where $\mu$ is the adaptive learning rate. Continuing,

$$(\Delta \vec{W}_C)_k^T = \mu \left[ 2 e_k^{(\mathrm{sys})T} Q \frac{d\hat{y}_k}{d\vec{W}_C} - \sum_{j=0}^{r} \frac{\partial h(u_k, \ldots, u_{k-r})}{\partial u_{k-j}} \frac{du_{k-j}}{d\vec{W}_C} \right]. \qquad (5)$$

Using (4) and the chain rule for total derivatives, two further substitutions may be made at this time

$$\frac{du_k}{d\vec{W}_C} = \frac{\partial u_k}{\partial \vec{W}_C} + \sum_{j=1}^{N_c} \frac{\partial u_k}{\partial u_{k-j}} \frac{du_{k-j}}{d\vec{W}_C} \qquad (6)$$

$$\frac{d\hat{y}_k}{d\vec{W}_C} = \sum_{j=0}^{N_u} \frac{\partial \hat{y}_k}{\partial u_{k-j}} \frac{du_{k-j}}{d\vec{W}_C} + \sum_{j=1}^{N_y} \frac{\partial \hat{y}_k}{\partial \hat{y}_{k-j}} \frac{d\hat{y}_{k-j}}{d\vec{W}_C}. \qquad (7)$$

With these definitions, we may solve (5) to find the weight update. We need to find three quantities: $\partial h(\cdot)/\partial u_{k-j}$, $du_k/d\vec{W}_C$ and $d\hat{y}_k/d\vec{W}_C$.

First, we note that $\partial h(\cdot)/\partial u_{k-j}$ depends on the user-specified function $h(\cdot)$. It can be calculated given $h(\cdot)$. Secondly, we consider $du_k/d\vec{W}_C$ as expanded in (6). This is equivalent in form to (2) and is computed the same way.

Thirdly, we consider $d\hat{y}_k/d\vec{W}_C$, as expanded in (7). The first term in the first summation, $\partial \hat{y}_k/\partial u_{k-j}$, comprises the columns of $W_{\hat{P}}$ corresponding to the $u_{k-j}$ input. The next term, $du_{k-j}/d\vec{W}_C$, is the current or a previously computed and saved version of $du_k/d\vec{W}_C$. The first term in the second summation, $\partial \hat{y}_k/\partial \hat{y}_{k-j}$, comprises the columns of $W_{\hat{P}}$ corresponding to the $\hat{y}_{k-j}$ input. The final term, $d\hat{y}_{k-j}/d\vec{W}_C$, is a previously computed version of $d\hat{y}_k/d\vec{W}_C$.

If the disturbance is not zero-mean, we adapt an affine plant model. We must also adapt an affine controller where the adaptive bias weight (with constant input '1') computes the dc input to the plant required to subtract out the disturbance mean.

4.2 Special case: FIR controller and FIR plant model

In the common but special case where both C and $\hat{P}$ are FIR, we can derive a simpler result. Define the matrices

$$(dH)_k = \left[ \left(\frac{\partial h}{\partial u_k}\right)^T \left(\frac{\partial h}{\partial u_{k-1}}\right)^T \cdots \left(\frac{\partial h}{\partial u_{k-r}}\right)^T \right]^T$$

and

$$(dU)_k = \left[ \left(\frac{\partial u_k}{\partial \vec{W}_C}\right)^T \left(\frac{\partial u_{k-1}}{\partial \vec{W}_C}\right)^T \cdots \left(\frac{\partial u_{k-r}}{\partial \vec{W}_C}\right)^T \right]^T.$$

In accord with (3) we know that $\partial u_k/\partial \vec{W}_C = [R_k^T \otimes I]$. If we define

$$\mathcal{R}_k = [R_k\ R_{k-1} \cdots R_{k-r}]$$

then $(dU)_k = [\mathcal{R}_k^T \otimes I]$. Equation (7) is then found to be

$$\frac{d\hat{y}_k}{d\vec{W}_C} = W_{\hat{P}} (dU)_k$$

which we may then use to find the weight update to the vectorized controller weight matrix

$$(\Delta \vec{W}_C)_k = \mu\, (dU)_k^T \left[ 2 W_{\hat{P}}^T Q e_k - (dH)_k \right].$$

The vectorized weight update may be converted to a direct update to the $W_C$ matrix which avoids the matrix multiply by the large sparse matrix $[\mathcal{R}_k^T \otimes I]$. We find that

$$(\Delta W_C)_k = \mu \operatorname{unvec}\left( 2 W_{\hat{P}}^T Q e_k - (dH)_k \right) \mathcal{R}_k^T$$

where $\operatorname{unvec}(v)_{a \times b}$ converts the vector $v$ into an $a \times b$ matrix column-by-column, with the dimensions chosen here so that the product with $\mathcal{R}_k^T$ is conformable.
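For intuition, the FIR update can be specialized to a SISO system with $h(\cdot) = 0$ and $Q = 1$, where it reduces to $\Delta c_i = 2\mu\, e_k \sum_j \hat{p}_j\, r_{k-j-i}$. The sketch below applies that scalar form under several simplifying assumptions that are not from the paper: the plant model is taken to be exact and fixed rather than adapted, the reference model is $M = 1$ (so $d_k = r_k$), the plant is a hypothetical minimum-phase FIR system, and the names `bptm_fir_siso` and `convolve` are illustrative.

```python
import random


def convolve(a, b):
    """Impulse response of two FIR filters in cascade."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def bptm_fir_siso(plant, plant_model, n_ctrl, mu, steps, seed=0):
    """Adapt an FIR controller c so that the plant output tracks d_k = r_k.

    Scalar special case of the FIR update with h = 0 and Q = 1:
        c_i += 2 * mu * e_k * sum_j plant_model[j] * r_{k-j-i}
    """
    rng = random.Random(seed)
    n_p = len(plant_model)
    c = [0.0] * n_ctrl
    r_hist = [0.0] * (n_ctrl + n_p - 1)     # r_k, r_{k-1}, ...
    u_hist = [0.0] * len(plant)             # u_k, u_{k-1}, ...
    for _ in range(steps):
        r = rng.uniform(-1.0, 1.0)
        r_hist = [r] + r_hist[:-1]
        u = sum(ci * ri for ci, ri in zip(c, r_hist))        # u_k = W_C R_k
        u_hist = [u] + u_hist[:-1]
        y = sum(pj * uj for pj, uj in zip(plant, u_hist))    # true plant output
        e = r - y                                            # system error
        for i in range(n_ctrl):
            # d y_hat_k / d c_i: reference filtered through the plant model
            g = sum(plant_model[j] * r_hist[i + j] for j in range(n_p))
            c[i] += 2.0 * mu * e * g
    return c


plant = [1.0, 0.5]
c = bptm_fir_siso(plant, plant, n_ctrl=8, mu=0.02, steps=10000)
cascade = convolve(c, plant)    # should be close to a unit impulse
```

The system error is measured at the true plant output, while the plant model is used only to form the gradient, mirroring the role the model plays in the BPTM structure of Figure 4.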

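4.2.1 Convergence of the controller. Convergence of BPTM in the mean square may be established using the same method presented in Haykin [10] to show convergence of LMS. Here we consider the case where both C and $\hat{P}$ are FIR, and where $h(\cdot) = 0$. The weight-update equation gives

$$(\vec{W}_C)_{k+1} = (\vec{W}_C)_k + 2\mu [\mathcal{R}_k \otimes I] W_{\hat{P}}^T Q e_k.$$

We may expand $e_k = d_k - \hat{y}_k = d_k - W_{\hat{P}} U_k \approx d_k - W_{\hat{P}} [\mathcal{R}_k^T \otimes I] (\vec{W}_C)_k$ assuming that $\vec{W}_C$ changes slowly. Then, writing $M_k = W_{\hat{P}} [\mathcal{R}_k^T \otimes I]$,

$$\begin{aligned}
(\vec{W}_C)_{k+1} &= (\vec{W}_C)_k + 2\mu [\mathcal{R}_k \otimes I] W_{\hat{P}}^T Q d_k - 2\mu [\mathcal{R}_k \otimes I] W_{\hat{P}}^T Q W_{\hat{P}} [\mathcal{R}_k^T \otimes I] (\vec{W}_C)_k \\
&= [I - 2\mu M_k^T Q M_k] (\vec{W}_C)_k + 2\mu M_k^T Q d_k.
\end{aligned}$$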


‖ Although this assumption ignores the statistical dependence among successive tap-input vectors to the controller, the correlation structure is preserved in the result. Sufficient information is retained to serve as a reliable design guideline.

Define the weight-vector error $\epsilon_k = W_k^T - w_o$, where $w_o$ is the Wiener-optimal weight vector. We may then find that

$$\epsilon_{k+1} = [I - 2\mu M_k^T Q M_k]\,\epsilon_k + 2\mu M_k^T Q e_k^*$$

where $e_k^*$ is the output error which would have been observed if $W_k^T = w_o$. Let $K_k = E[\epsilon_k \epsilon_k^T]$, $Z = E[M_k^T Q M_k]$, and invoke the 'independence assumption' in Haykin.‡ Then,

$$K_{k+1} = [I - 2\mu Z]\,K_k\,[I - 2\mu Z] + 4\mu^2 J_{\min} Z$$

where $J_{\min}$ is the minimum $E[e_k^T Q e_k]$ obtained when $W_k^T = w_o$. We see that the weight-error covariance will not generally go to zero, but will be driven by the small forcing term $4\mu^2 J_{\min} Z$. There will be convergence in mean-square, however, if $|\mathrm{eig}(I - 2\mu Z)| < 1$. Since $Z$ is symmetric positive-definite, its eigenvalues are real and positive. Also, $\mathrm{eig}(I - 2\mu Z) = 1 - 2\mu\,\mathrm{eig}(Z)$. Therefore, for convergence we have

$$0 < \mu < \frac{1}{\max \mathrm{eig}(Z)}.$$

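Since we do not generally know $Z$ a priori, we can make an instantaneous approximation to get (assuming $\mu$ positive)

$$0 < \mu_k < \frac{1}{\max_{0 \le j \le k}\,\max \mathrm{eig}\bigl([\mathbf{R}_j \otimes I]\,\mathbf{W}_{\hat P}^T Q \mathbf{W}_{\hat P}\,[\mathbf{R}_j^T \otimes I]\bigr)}.$$

A conservative but simpler computation is

$$\mu_k = \frac{\eta}{\max_{0 \le j \le k}\,\mathrm{tr}\bigl([\mathbf{R}_j \otimes I]\,\mathbf{W}_{\hat P}^T Q \mathbf{W}_{\hat P}\,[\mathbf{R}_j^T \otimes I]\bigr)}$$

where $\mathrm{tr}(\cdot)$ is the trace operator and $0 < \eta < 1$. Finally, due to the invariance of the trace operator under cyclic permutations of its operand, we get

$$\mu_k = \frac{\eta}{\max_{0 \le j \le k}\,\mathrm{tr}\bigl(\mathbf{W}_{\hat P}\,[\mathbf{R}_j^T \otimes I][\mathbf{R}_j \otimes I]\,\mathbf{W}_{\hat P}^T Q\bigr)}$$

which results in smaller matrix multiplications. A value of $\eta \approx 0.1$ seems to work well.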
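The two facts used above can be checked numerically for a single time index: the trace of a positive semi-definite matrix is at least its largest eigenvalue (so the trace-based step size is conservative), and the trace is unchanged by cyclic permutation. This is only a sketch under stated assumptions: `M` stands in for $\mathbf{W}_{\hat P}[\mathbf{R}_k^T \otimes I]$ and is filled with random numbers, `Q` is taken as the identity, and `step_size` is a hypothetical helper name.

```python
import numpy as np

def step_size(M, Q, eta=0.1):
    """Conservative step size eta / tr(M^T Q M) for one time index."""
    return eta / np.trace(M.T @ Q @ M)

rng = np.random.default_rng(1)
M = rng.standard_normal((2, 6))      # hypothetical stand-in for M_k
Q = np.eye(2)                        # assumed error weighting

Z_inst = M.T @ Q @ M                 # instantaneous approximation of Z
bound = 1.0 / np.linalg.eigvalsh(Z_inst).max()   # eigenvalue-based limit
mu = step_size(M, Q)                 # trace-based choice, always below bound

# Cyclic permutation: a 6x6 product versus a much smaller 2x2 product.
tr_large = np.trace(M.T @ Q @ M)
tr_small = np.trace(M @ M.T @ Q)
```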


Figure 5. Two methods to close the loop: (a) the output, $y_k$, is fed back to the controller; (b) an estimate of the disturbance, $\hat w_k$, is fed back to the controller.

5. DISTURBANCE CANCELING

Referring again to Fig. 1, we have seen how to adaptively model the plant dynamics, and how to use the plant model to adapt a feedforward adaptive controller. Using this controller, an undisturbed plant will very closely track the desired output. A disturbed plant, on the other hand, may not track the desired output at all well. It remains to determine what can be done to mitigate plant disturbance.

The first idea which may come to mind is to simply 'close the loop'. Two methods commonly used to do this are discussed below. Unfortunately, neither of these methods is appropriate if an adaptive controller is being designed. Closing the loop will cause the controller to adapt to a 'biased' solution. It is shown here that the extent of the bias is dependent on the plant dynamics, the spectrum of the disturbance, and the spectrum of the plant's control signal. An alternate technique is introduced, which leads to the correct solution.

5.1. Conventional disturbance rejection methods fail

Two approaches to disturbance rejection commonly seen in the literature are shown in Figure 5. The $z^{-1}$ (unit delay) block represents our prior assumption that the control signal $u_k$ is output to the plant immediately prior to the plant output $y_k$ being measured. Therefore, only $y_{k-1}$ is available to compute $u_k$. Should the plant be known a priori to be strictly proper and should $y_k$ be measured immediately prior to $u_k$ being computed and sent to the plant, then the delay block may be removed. We will see that this will generally improve the extent to which we may cancel disturbance.

By closing the loop we either feed back the disturbed plant output $y_k$, as in Figure 5(a), or we feed back an estimate of the disturbance $\hat w_k$, as in Figure 5(b). The approach shown in Figure 5(a) is more conventional, but is difficult to use with adaptive inverse control since the transfer function of the closed-loop system is dramatically different from the transfer function of the open-loop system. Methods differing from those presented here are required to adapt C. The approach shown in Figure 5(b) is called internal model control [14–19]. The benefit of using this scheme is that the transfer function of the closed-loop system is equal to the transfer function of the open-loop system if the plant model is identical to the plant. Therefore, the methods found to adapt the controller for feedforward control may be used directly.


Unfortunately, closing the loop using either method in Figure 5 will cause $\hat P$ to adapt to an incorrect solution. In the following analysis the case of a linear SISO plant controlled with internal model control is considered. A similar analysis may be performed for the conventional feedback system in Figure 5(a), with the same conclusion.

5.2. Shannon–Bode solution for $\hat P$ using internal model control

When the loop is closed as in Figure 5(b), the estimated disturbance term $\hat w_{k-1}$ is subtracted from the reference input $r_k$. The resulting composite signal is filtered by the controller C and becomes the plant input signal, $u_k$. In the analysis done so far, we have assumed that $u_k$ and $w_k$ are uncorrelated, but that assumption is no longer valid. We need to revisit the analysis performed for system identification to see if the plant model $\hat P$ still converges to P.

The direct approach to the problem is to calculate the least mean-squared-error solution for $\hat P$ and see if it is equal to P. However, due to the feedback loop involved, there is no obvious way to obtain a closed-form solution for $\hat P$. An indirect approach is taken here. We do not need to know exactly to what $\hat P$ converges; we only need to know whether or not it converges to P. The following method is used:

1. Remove the feedback path (open the loop) and perform on-line plant modeling and controller adaptation.

2. When convergence is reached, we know that $\hat P \approx P$, and the disturbance estimate $\hat w_k$ is very good. We assume that $\hat w_k = w_k$. This assumption allows us to construct a feedforward-only system which is equivalent to the internal-model-control feedback system by substituting $w_k$ for $\hat w_k$.

3. Substitute $w_k$ for $\hat w_k$. If the least mean-squared-error solution for $\hat P$ still converges to P, then the assumption made in the second step remains valid, and the plant is being modeled correctly. If $\hat P$ diverges from P with the assumption made in step 2, then it will not converge to P even if an exact analysis were done.

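We now apply this procedure to analyze the system of Figure 5(b). We first open the loop and allow the plant model to converge to the plant. Secondly, we assume that $\hat w_k \approx w_k$. Finally, we compute the least mean-squared-error solution for $\hat P$. If causality is enforced, this solution is the Shannon–Bode solution from Section 2. To calculate it, we first determine the correlation terms $\Phi_{uy}(z)$ and $\Phi_{uu}(z)$.

$$(\phi_{uy})_n = E[u_k\,y_{k+n}]$$

$$\phantom{(\phi_{uy})_n} = E\bigl[u_k\,\bigl((p * u)_{k+n} + w_{k+n}\bigr)\bigr]$$

$$\phantom{(\phi_{uy})_n} = p_n * (\phi_{uu})_n + E[u_k\,w_{k+n}]$$

where $p_n$ is the impulse response of the plant. To proceed, note that $u_k = c_k * (r_k - \hat w_{k-1}) = c_k * (r_k - w_k * \delta_{k-1})$, where $\delta_k$ is the unit impulse function. So

$$(\phi_{uy})_n = p_n * (\phi_{uu})_n + E\bigl[\bigl(c_k * (r_k - w_k * \delta_{k-1})\bigr)\,w_{k+n}\bigr]$$

$$\phantom{(\phi_{uy})_n} = p_n * (\phi_{uu})_n - c_{-n} * \delta_{n+1} * (\phi_{ww})_n$$

$$\Phi_{uy}(z) = P(z)\,\Phi_{uu}(z) - z\,C(z^{-1})\,\Phi_{ww}(z).$$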


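Similarly, we compute $\Phi_{uu}(z)$

$$(\phi_{uu})_n = E[u_k\,u_{k+n}]$$

$$\phantom{(\phi_{uu})_n} = E\bigl[\bigl(c_k * (r_k - w_{k-1})\bigr)\bigl(c_{k+n} * (r_{k+n} - w_{k+n-1})\bigr)\bigr]$$

$$\phantom{(\phi_{uu})_n} = c_{-n} * c_n * \bigl((\phi_{rr})_n + (\phi_{ww})_n\bigr)$$

$$\Phi_{uu}(z) = C(z)\,C(z^{-1})\,\bigl(\Phi_{rr}(z) + \Phi_{ww}(z)\bigr).$$

The Shannon–Bode solution for $\hat P$ may now be calculated (assuming in the third line that the plant is causal)

$$\hat P^{(\mathrm{causal})}(z) = \frac{1}{\Phi_{uu}^{+}(z)} \left[\frac{\Phi_{uy}(z)}{\Phi_{uu}^{-}(z)}\right]_{+}$$

$$\phantom{\hat P^{(\mathrm{causal})}(z)} = \frac{1}{\Phi_{uu}^{+}(z)} \left[\frac{P(z)\,\Phi_{uu}(z) - z\,C(z^{-1})\,\Phi_{ww}(z)}{\Phi_{uu}^{-}(z)}\right]_{+}$$

$$\phantom{\hat P^{(\mathrm{causal})}(z)} = P(z) - \frac{1}{\Phi_{uu}^{+}(z)} \left[\frac{z\,C(z^{-1})\,\Phi_{ww}(z)}{\Phi_{uu}^{-}(z)}\right]_{+}$$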

Assumptions concerning the nature of the $r_k$ and $w_k$ signals are needed to simplify this further. For example, if $w_k$ is white, then the plant model converges to the plant. Under almost all other conditions, however, the plant model converges to something else. We conclude that, in general, an adaptive plant model made using the internal model control scheme will be biased by disturbance. The biased plant model will cause incorrect weight updates to be computed for C using the BPTM algorithm, and will lead to sub-optimal performance of the control system.
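The bias can be observed in a small simulation. The sketch below is not the paper's experiment: it assumes a hypothetical three-tap FIR plant, a trivial controller C = 1, a perfect disturbance estimate ($\hat w_k = w_k$), a first-order autoregressive disturbance with coefficient `a`, and a batch least-squares fit in place of an adaptive filter (both target the same least mean-squared-error solution).

```python
import numpy as np

def fit_fir(u, y, n_taps):
    """Least-squares FIR fit of y_k ~ sum_j p_j u_{k-j}."""
    # Column j holds u delayed by j samples; the first rows (affected by
    # np.roll wrap-around) are discarded.
    U = np.column_stack([np.roll(u, j) for j in range(n_taps)])[n_taps:]
    return np.linalg.lstsq(U, y[n_taps:], rcond=None)[0]

def imc_bias(a, n=100_000, seed=0):
    """Max |p_hat - p| when the model input is u_k = r_k - w_{k-1}."""
    rng = np.random.default_rng(seed)
    p = np.array([0.5, 0.3, 0.2])            # hypothetical FIR plant
    r = rng.standard_normal(n)               # white reference input
    e = rng.standard_normal(n)
    w = np.zeros(n)
    for k in range(1, n):                    # AR(1) disturbance; a = 0 is white
        w[k] = a * w[k - 1] + e[k]
    u = r - np.concatenate(([0.0], w[:-1]))  # delayed disturbance fed back
    y = np.convolve(u, p)[:n] + w            # disturbed plant output
    return np.max(np.abs(fit_fir(u, y, len(p)) - p))

bias_white = imc_bias(a=0.0)     # disturbance white: model stays near the plant
bias_colored = imc_bias(a=0.9)   # disturbance colored: model is pulled away
```

With a white disturbance the delayed feedback term is uncorrelated with the current disturbance sample, so the fit is unbiased; with a colored disturbance the correlation between $w_{k-1}$ and $w_k$ leaks into the estimate.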

The bottom line is that, even under the idealized assumptions that initially $\hat P = P$ and that $\hat w_k = w_k$, if the loop is closed and the plant model is allowed to continue adapting after the loop is closed, the overall control system will become biased by the disturbance. One solution is to 'freeze' the weights of the plant model just before the loop is closed. This solution will work, but does not allow the control system to respond to time variations in the plant dynamics. Another solution is to perform plant modeling with dither signals [6] rather than with the command input signal $u_k$. However, this will increase the output noise level. A better solution is presented next, in which the plant model is allowed to continue adaptation. There is no extra disturbance at the output. The plant disturbances will be handled by a separate circuit from the one handling the task of dynamic response. This results in an overall control problem which is partitioned in a very nice way.

5.3. A solution allowing on-line adaptation of $\hat P$

The only means at our disposal to cancel disturbance is through the plant input signal $u_k$. This signal must be computed in such a way that the plant output negates (as much as possible) the


Figure 6. Correct on-line adaptive plant modeling in conjunction with disturbance canceling for linear plants. The circuitry for adapting C has been omitted for clarity.

disturbance. Therefore, the plant input signal must be correlated with the disturbance. However, we have just seen that the plant-model input signal cannot contain terms correlated with the disturbance or the plant model will become biased. This conundrum was first solved by Widrow [6] and his solution is shown in Figure 6. The same basic architecture is used here, although the adaptation methods are different to account for MIMO systems.

By studying the figure, we see that the control scheme is very similar to internal model control. The main difference is that the feedback loop is 'moved' in such a way that the disturbance dynamics do not appear at the input to the adaptive plant model, but do appear at the input to the plant. That is, the controller output $u_k$ is used as input to the adaptive plant model $\hat P$; on the other hand, the input to the plant is equal to $u_k + \tilde u_k$, where $\tilde u_k$ is the output of a special disturbance-canceling circuit, X. $\hat P$ is not used directly to estimate the disturbance; rather, a filter whose weights are a digital copy of those in $\hat P$ is used to estimate the disturbance. This filter is called $\hat P_{\mathrm{COPY}}$.

We can quickly show that an adapted plant model will remain at the correct solution when the 'loop' is closed. As before, we follow the three-step procedure. First, we open the loop and allow $\hat P$ to adapt until it converges to P. Secondly, we assume that $\hat w_k \approx w_k$. Lastly, we check to see whether or not $\hat P$ remains converged to P.

The input to the plant model is $u_k$. The signal used as the desired output when adapting $\hat P$ is the disturbed plant output

$$y_k = p_k * (u_k + \tilde u_k) + w_k$$

where $p_k$ is the impulse response of the plant, and $\tilde u_k$ is the output of the disturbance canceling filter X. This latter term is computed to be

$$\tilde u_k = x_k * w_k * \delta_{k-1}$$

where $x_k$ is the impulse response of the disturbance canceling filter X, and $\delta_{k-1}$ is a unit delay. We compute the Shannon–Bode solution for the adaptive plant model by first


Figure 7. A useful way of looking at the feedforward system dynamics.

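computing $\Phi_{uy}(z)$.

$$(\phi_{uy})_n = E\bigl[u_k\,\bigl((p * u)_{k+n} + (p * \tilde u)_{k+n} + w_{k+n}\bigr)^T\bigr]$$

$$\phantom{(\phi_{uy})_n} = (\phi_{uu})_n * p_n^T$$

$$\Phi_{uy}(z) = \Phi_{uu}(z)\,P(z)^T$$

assuming that $u_k$ and $w_k$ are uncorrelated. The Shannon–Bode solution is

$$\hat P^{(\mathrm{causal})}(z) = \bigl[(\Phi_{uu}(z)\,P(z)^T)^T\,(\Phi_{uu}^{-}(z))^{-1}\bigr]_{+}\,(\Phi_{uu}^{+}(z))^{-1}$$

$$\phantom{\hat P^{(\mathrm{causal})}(z)} = P(z)$$

assuming that the plant is causal. So, the plant model converges to the plant regardless of the disturbance (if it is uncorrelated with $u_k$) and regardless of X.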
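The same conclusion can be illustrated numerically. In the sketch below the plant model sees only the controller output, while the plant itself sees the controller output plus the canceler output. Everything specific is an assumption for illustration: the three-tap FIR plant, C = 1, a canceler X that is a plain gain `x_gain` acting on the delayed disturbance, an AR(1) disturbance, and a batch least-squares fit standing in for the adaptive filter.

```python
import numpy as np

def model_bias_with_canceler(a=0.9, x_gain=-0.9, n=100_000, seed=0):
    """Max |p_hat - p| when the model input excludes the canceler output."""
    rng = np.random.default_rng(seed)
    p = np.array([0.5, 0.3, 0.2])             # hypothetical FIR plant
    u = rng.standard_normal(n)                # controller output (C = 1, u = r)
    e = rng.standard_normal(n)
    w = np.zeros(n)
    for k in range(1, n):                     # colored AR(1) disturbance
        w[k] = a * w[k - 1] + e[k]
    # Canceler output: X (here a gain) applied to the delayed disturbance.
    u_tilde = x_gain * np.concatenate(([0.0], w[:-1]))
    y = np.convolve(u + u_tilde, p)[:n] + w   # plant driven by u + u_tilde
    # The plant model is fitted against u alone, not u + u_tilde.
    U = np.column_stack([np.roll(u, j) for j in range(len(p))])[len(p):]
    p_hat = np.linalg.lstsq(U, y[len(p):], rcond=None)[0]
    return np.max(np.abs(p_hat - p))

bias = model_bias_with_canceler()
bias_other_x = model_bias_with_canceler(x_gain=0.5)   # a different X
```

Because the regressor `u` is uncorrelated with both the disturbance and the canceler output, the estimate stays close to the plant for either choice of `x_gain`, even though the disturbance is strongly colored.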

5.4. The function of the disturbance canceler

It is informative to perform a heuristic investigation into the function performed by the disturbance canceler X. The analysis is precise if the plant is minimum phase (that is, if it has a stable, causal inverse), but is merely qualitative if the plant is nonminimum phase. The goal of this analysis is not to be quantitative, but rather to develop an understanding of the function performed by X.

A useful way of considering the overall system is illustrated in Figure 7. We desire that X produce an output so that $y_k = m_k * r_k$. We can express $y_k$ as

$$y_k = w_k + p_k * (u_k + x_k * \hat w_{k-1})$$

We substitute $y_k = m_k * r_k$ and rearrange to solve for the desired response of X. We see that

$$x_k * \hat w_{k-1} = (p^{-1})_k * (m_k * r_k - w_k) - u_k$$

$$\phantom{x_k * \hat w_{k-1}} = (p^{-1})_k * (p_k * u_k - w_k) - u_k$$

$$\phantom{x_k * \hat w_{k-1}} = -(p^{-1})_k * w_k$$


Figure 8. Internal structure of X.

assuming that the controller has adapted until $p_k * u_k \approx m_k * r_k$. The output of X when its input is the $\hat w_{k-1}$ signal is the negative of the output of the plant inverse with input $w_k$. Assuming that the adaptive plant model is perfect, and that the controller has been adapted to convergence, the internal structure of X is then shown in Figure 8. The $w_k$ signal is estimated from previous samples of $\hat w_k$, and then passed through the plant inverse to compute the desired signal $\tilde u_k$.

Thus, we see that the disturbance canceler contains two parts. The first part is an estimator which depends on the dynamics of the disturbance source. The second part is the canceler which depends on the dynamics of the plant.
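As a worked illustration of this two-part structure (an assumed example, not one taken from the paper), suppose the disturbance is a first-order autoregressive process with hypothetical coefficient $a$ driven by white noise $n_k$, the plant is minimum phase with a causal inverse, the disturbance estimate is exact, and the controller has converged so that $p_k * u_k = m_k * r_k$:

```latex
\begin{align*}
w_k &= a\,w_{k-1} + n_k, \qquad |a| < 1
  && \text{(assumed disturbance model)}\\
w^{\mathrm{pred}}_k &= a\,w_{k-1}
  && \text{(estimator: best linear one-step prediction)}\\
\tilde u_k &= -(p^{-1})_k * \bigl(a\,w_{k-1}\bigr)
  && \text{(canceler: plant inverse, negated)}\\
y_k - m_k * r_k &= w_k + p_k * \tilde u_k = w_k - a\,w_{k-1} = n_k
  && \text{(residual output error)}
\end{align*}
```

Under these assumptions the canceler would be $X(z) = -a\,P^{-1}(z)$, and the residual is the white driving noise $n_k$, the part of the disturbance that cannot be predicted from past samples.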

One very important point to notice is that if the process generating the disturbance is not linearly filtered white noise, then the optimal estimator is in general a non-linear function. The conclusion is that the disturbance canceler is best implemented as a non-linear filter even if the plant is a linear dynamical system. Reference [1] details how to train a non-linear disturbance canceler; here, we focus on the linear case.

If the plant is generalized minimum phase (minimum phase with a constant delay), then this solution must be modified slightly. The plant inverse must be a delayed plant inverse, with the delay equal to the delay inherent in the plant. The estimator must estimate the disturbance one time step into the future plus the delay of the plant. If the plant is known a priori to be strictly proper, and the delay $z^{-1}I$ block is removed from the feedback path, then the estimator needs to predict the disturbance one fewer step into the future. Since estimation is imperfect, this will improve performance.

These results are heuristic and do not directly apply if the plant is non-minimum phase. We can see this easily now, since a plant inverse does not exist. However, the results still hold qualitatively since a delayed inverse exists; the solution for X is similar to the one for a generalized minimum phase plant. The structure of X consists of a part depending on the dynamics of the system which amounts to a delayed plant inverse, and a part which depends on the dynamics of the disturbance generating source, which must now predict farther into the future than a single time step. Unlike the case of the generalized minimum phase plant, however, these two parts do not necessarily separate. That is, X implements some combination of predictors and delayed inverses which compute the least mean-squared-error solution.

5.5. Adapting a disturbance canceler via the BPTM algorithm

We have seen how a disturbance-canceling filter can be inserted into the control-system design in such a way that it will not bias the controller. Proceeding to develop an algorithm to adapt X we consider the system error to be composed of the following parts:

• One part of the system error is correlated with the input command vector $r_k, r_{k-1}, \ldots$ in C. This part of the system error may be reduced by adapting C.


Figure 9. Integrated adaptive inverse control system.

• The second part of the system error is correlated with the estimated disturbance vector $\hat w_k, \hat w_{k-1}, \ldots$ in X. This part of the system error may be reduced by adapting X.

• The third part of the system error is the irreducible mean-squared-error. This part of the system error is uncorrelated with both the input command vector in C and the estimated disturbance vector in X. It is irreducible either because the system dynamics prohibit improvement, or because the tapped-delay lines at the input to X or C have not been chosen large enough. In any case, adaptation of the weights in X or C cannot reduce the irreducible mean-squared-error.

• A fourth possible part of the system error is the part which is correlated with both the input command vector and the disturbance vector. However, by assumption, $r_k$ and $w_k$ are uncorrelated, so this part of the system error is zero.

Using the BPTM algorithm to reduce the system error by adapting C, as discussed in Section 4, will reduce the component of the system error correlated with the input $r_k$. Since the disturbance and minimum-mean-squared-error are independent of $r_k$, their presence will not bias the solution of C. The controller will learn to control the feedforward dynamics of the system, but not to cancel disturbance.

If we were to use the BPTM algorithm and backpropagate the system error through the plant model copy, using it to adapt X as well, the disturbance canceler would learn to reduce the component of the system error correlated with the estimated disturbance signal. The component of the system error due to unconverged C and minimum-mean-squared-error will not bias the disturbance canceler.

This method is illustrated in Figure 9 where a complete integrated MIMO control system is drawn. The plant model is adapted directly, as before. The controller is adapted by backpropagating the system error through the plant model and using the BPTM algorithm of Section 4. The disturbance canceler is adapted by backpropagating the system error through the copy of the plant model and using the BPTM algorithm as well. So we see that the BPTM algorithm serves two functions: it is able to adapt both C and X.


5.6. Convergence of the adaptive processes

Stable adaptation of the disturbance canceler requires first that a good plant model be made. Then, the disturbance estimate is accurate, and the input to the disturbance canceler is effectively decoupled from the rest of the control system. So, while the plant model and controller may be adapted simultaneously from the time the system is 'turned on', adaptation of the disturbance canceler must wait until the plant model converges. Convergence may be practically tested by noting when the mean (short-term average) squared modeling error approaches a constant.

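When both the plant model and disturbance canceler are FIR we get the same convergence results as for the controller. Define $\hat{\mathbf{W}}_k = [\,\hat W_k \;\; \hat W_{k-1} \;\; \cdots \;\; \hat W_{k-n}\,]$ where $\hat W_k$ is the input vector to X at time index $k$. Then, stable adaptation of the disturbance canceler is achieved when

$$\mu_k = \frac{\eta}{\max_{0 \le j \le k}\,\mathrm{tr}\bigl(\mathbf{W}_{\hat P}\,[\hat{\mathbf{W}}_j^T \otimes I][\hat{\mathbf{W}}_j \otimes I]\,\mathbf{W}_{\hat P}^T Q\bigr)}, \qquad 0 < \eta < 1.$$

A value of $\eta \approx 0.1$ seems to work well.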

6. EXAMPLES

In order to demonstrate the results of this paper, two different plants were selected: a SISO plant and a MIMO plant. In addition, the SISO plant may be configured as either minimum-phase or non-minimum-phase. The two plants are described, and control experiments and results are presented.

6.1. Linear SISO plant

The linear SISO example was selected from Franklin et al. [20]. The goal is to control the temperature of a tank of water. The flow rate of water into the tank is constant and equal to the flow rate of water out of the tank. The temperature of the incoming water is controlled by a mixing valve that adjusts the relative amounts of hot and cold supplies of the water (see Figure 10). A length of pipe, assumed to have negligible heat loss, separates the mixing valve from the tank. This distance causes a time delay between the application of a change in the mixing valve and changed temperature in the tank. When the plant dynamics are discretized, the transfer function may have a finite zero either inside or outside of the unit circle, depending on the length of the pipe, making the plant either minimum-phase or non-minimum-phase. Two different selections of parameters yield a minimum-phase plant $H_1(z)$ and a non-minimum-phase plant $H_2(z)$

$$H_1(z) = 0.1042\,\frac{z + 0.7402}{z^3 - 0.8187 z^2} \quad\text{and}\quad H_2(z) = 0.0676\,\frac{z + 1.6813}{z^3 - 0.8187 z^2}. \qquad (8)$$
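The minimum-phase and non-minimum-phase labels follow from where the finite zero of each numerator lies relative to the unit circle. A small check is sketched below; it uses only the numerator coefficients and the pole at 0.8187 from (8), so it does not depend on the number of pure delays in the denominator. The helper names `is_minimum_phase` and `dc_gain` are hypothetical.

```python
import numpy as np

POLE = 0.8187  # non-trivial pole shared by both plants in (8)

def is_minimum_phase(num):
    """True if every finite zero of the numerator lies inside the unit circle."""
    return bool(np.all(np.abs(np.roots(num)) < 1.0))

def dc_gain(num):
    """Gain at z = 1; pure delays (powers of z) contribute a factor of 1."""
    return np.polyval(num, 1.0) / (1.0 - POLE)

num_mp = [0.1042, 0.1042 * 0.7402]    # 0.1042 (z + 0.7402)
num_nmp = [0.0676, 0.0676 * 1.6813]   # 0.0676 (z + 1.6813)

mp_flag = is_minimum_phase(num_mp)    # zero at -0.7402, inside the unit circle
nmp_flag = is_minimum_phase(num_nmp)  # zero at -1.6813, outside the unit circle
gains = (dc_gain(num_mp), dc_gain(num_nmp))
```

Both plants come out with a steady-state gain very close to one, which is what one would expect if a constant valve temperature eventually becomes the tank temperature.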

The reference signal the tank temperature is required to track is generated by filtering i.i.d. uniformly distributed random numbers (between 45°C and 55°C) using a one-pole digital filter, where the pole is located at z = 0.7. This is a first-order Markov process. When constraints on the control effort are considered, the mixing-valve temperature is allowed to be in the range


Figure 10. Tank temperature control.

**Even should the control signal go outside of bounds, this relationship is used to preserve the linearity of the problem.

5–95°C. This ensures a practical implementation as water is still in its liquid phase over this range of temperatures. Physically, this means that the hot reservoir is a 95°C hot water source, and that the cold reservoir is a 5°C cold water source.

The controller output selects a desired valve temperature which mixes hot and cold water. The nominal hot and cold reservoir temperatures are $T_h = 95$°C, and $T_c = 5$°C. Therefore, the valve is set based on the control signal, $u_k$, to be**

$$T_v = \left(\frac{u_k - 5}{90}\right) T_h + \left(\frac{95 - u_k}{90}\right) T_c, \qquad 5 \le u_k \le 95.$$

Now, let us assume that the hot source is poorly regulated. It heats up and cools down in a periodic fashion. Let

$$T_h = 95 + 5 \sin(2\pi t/60 + \phi)$$

where $\phi$ is a random variable, uniformly distributed between $[-\pi, \pi]$, and independent of $u_k$. Combining,

$$T_v = u_k + \left(\frac{u_k - 5}{18}\right) \sin(2\pi k/60 + \phi)$$

$$\mathrm{dist}_k = \left(\frac{u_k - 5}{18}\right) \sin(2\pi k/60 + \phi) * p_k$$

where the disturbance is measured at the output of the plant, so is shown convolved with the plant impulse response, $p_k$. This disturbance is interesting for two main reasons: (1) it is non-linear, and (2) it is statistically dependent on $u_k$. Note, however, that it is uncorrelated with $u_k$.
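The "combining" step is plain arithmetic: substituting the fluctuating hot-source temperature into the mixing rule turns the factor $5(u_k - 5)/90$ into $(u_k - 5)/18$. The sketch below checks this at one operating point; the function names and the chosen values of `u`, `k` and `phase` are illustrative assumptions.

```python
import math

def valve_temperature(u, t_hot=95.0, t_cold=5.0):
    """Mixing rule: a weighted blend of the hot and cold reservoirs."""
    return (u - 5.0) / 90.0 * t_hot + (95.0 - u) / 90.0 * t_cold

def hot_source(k, phase):
    """Poorly regulated hot reservoir: 95 degrees plus a 5-degree sinusoid."""
    return 95.0 + 5.0 * math.sin(2.0 * math.pi * k / 60.0 + phase)

u, k, phase = 50.0, 15, 0.0

# With the nominal reservoirs the valve temperature equals the command.
nominal = valve_temperature(u)

# With the fluctuating hot source, compare against the combined formula.
mixed = valve_temperature(u, t_hot=hot_source(k, phase))
combined = u + (u - 5.0) / 18.0 * math.sin(2.0 * math.pi * k / 60.0 + phase)
```

At `k = 15` and zero phase the sinusoid is at its peak, so the hot source is at 100°C and a 50°C command yields 52.5°C at the valve under either expression.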


Figure 11. Aircraft yaw-rate and bank-angle control.

††The primary reference for this section is Franklin et al. [21]. The authors in turn reference a seminal but elusive source [22]. The augmented equations for MIMO control were obtained from Grace et al. [23].

6.2. Linear MIMO plant

Two aspects of flight control for a Boeing 747 aircraft (see Figure 11) were selected to demonstrate linear, MIMO control.†† The dynamics of the airplane have been approximated by a linear model around an equilibrium point. In the case at hand, the equilibrium 'point' is level flight at 40,000 ft and a nominal forward speed of Mach 0.8 (774 ft/s). The resulting linearized equations of motion are eighth-order, but they may be separated into two fourth-order sets representing the perturbations in longitudinal and lateral motion. Here, we wish to control the aircraft's yaw-rate ($r$) and bank-angle ($\phi$).

The dynamics of the system are most compactly represented in state-space form. When converted to discrete-time, we define

$$u_k = \begin{bmatrix} \text{Rudder angle in degrees} \\ \text{Aileron angle in degrees} \end{bmatrix}, \qquad y_k = \begin{bmatrix} \text{Yaw rate, } r_k \text{, in radians/s} \\ \text{Bank angle, } \phi_k \text{, in radians} \end{bmatrix}$$

and

$$x_k = \begin{bmatrix} \text{Sideslip angle, } \beta_k \text{, in radians} \\ \text{Yaw rate, } r_k \text{, in radians/s} \\ \text{Roll rate, } p_k \text{, in radians/s} \\ \text{Bank angle, } \phi_k \text{, in radians} \end{bmatrix}.$$

Then

$$x_{k+1} = A_d x_k + B_d u_k$$

$$y_k = C_d x_k$$


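where

$$A_d = \begin{bmatrix} 0.8876 & -0.3081 & 0.0415 & 0.0198 \\ 0.2020 & 0.3973 & -0.0046 & 0.0024 \\ -1.2515 & 0.5106 & 0.7617 & -0.0139 \\ -0.3313 & 0.1510 & 0.4407 & 0.9976 \end{bmatrix}, \qquad B_d = \begin{bmatrix} 0.4806 & -0.0013 \\ -1.5809 & 0.3887 \\ 0.0599 & 4.8390 \\ 0.0390 & 1.2585 \end{bmatrix}$$

$$C_d = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$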
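For readers who want to experiment with this plant, the state-space recursion can be stepped directly. The sketch below transcribes the three matrices as printed above; the names `A_d`, `B_d`, `C_d` and the `simulate` helper are labels chosen here for illustration, and the one-degree rudder pulse is an arbitrary test input.

```python
import numpy as np

A_d = np.array([[ 0.8876, -0.3081,  0.0415,  0.0198],
                [ 0.2020,  0.3973, -0.0046,  0.0024],
                [-1.2515,  0.5106,  0.7617, -0.0139],
                [-0.3313,  0.1510,  0.4407,  0.9976]])
B_d = np.array([[ 0.4806, -0.0013],
                [-1.5809,  0.3887],
                [ 0.0599,  4.8390],
                [ 0.0390,  1.2585]])
C_d = np.array([[0.0, 1.0, 0.0, 0.0],    # selects yaw rate
                [0.0, 0.0, 0.0, 1.0]])   # selects bank angle

def simulate(u_seq, x0=None):
    """Return y_k for each input u_k, starting from state x0 (default zero)."""
    x = np.zeros(4) if x0 is None else np.asarray(x0, dtype=float)
    outputs = []
    for u in np.atleast_2d(np.asarray(u_seq, dtype=float)):
        outputs.append(C_d @ x)          # y_k = C_d x_k
        x = A_d @ x + B_d @ u            # x_{k+1} = A_d x_k + B_d u_k
    return np.array(outputs)

# One-degree rudder pulse at k = 0, aileron held at zero.
y = simulate([[1.0, 0.0], [0.0, 0.0]])
```

Because $C_d$ picks out the second and fourth states and there is no direct feedthrough, the output one step after the pulse is simply the corresponding entries of the first column of $B_d$.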

The reference command to be tracked is generated independently for each output. The reference command for the desired yaw-rate is a first-order Markov process generated by filtering i.i.d. uniform random variables with maximum value 0.03 using a one-pole filter whose pole is at z = 0.9. The reference command for the desired bank angle is a first-order Markov process generated by filtering i.i.d. uniform random variables with maximum value 0.12 with a one-pole filter whose pole is at z = 0.9.

The primary disturbances experienced by the airplane dynamics are those induced by bursts of wind. It is assumed here that the nominal wind values are incorporated into the dynamic model of flight, and that gusts around that nominal value are the disturbances. The state of the airplane, $x_k$, is affected directly by the wind. So, the full discrete-time model of the airplane dynamics, with disturbance, is

$$x_{k+1} = A_d\,[x_k + \mathrm{dist}_k] + B_d u_k$$

$$y_k = C_d x_k.$$

Furthermore, it is assumed that the wind gusts occur as planar fronts and thus do not affect the yaw-rate, roll-rate or bank-angle directly. Instead, the sideslip angle is directly affected by the wind, and the other state-variables are affected indirectly through the dynamical relationship between themselves and the sideslip angle. If we model the wind in the lateral direction, then the sideslip angle is perturbed by

$$\Delta\beta = \mathrm{atan}\left(\frac{\text{wind speed}}{\text{airplane speed}}\right).$$

The model for generating a wind-speed time series is based on data presented in Reference [24]. An approximation is made to the autocorrelation function of the cited paper. The power spectral density of wind velocity was calculated from the autocorrelation function, and was found to be

$$\Phi(f) = \frac{3950}{1 + (20\pi f)^2}.$$

An FIR filter was designed using a weighted least-squares optimization algorithm to produce this power spectral density given an input stream of i.i.d. uniform random numbers with maximum


Figure 12. Sample wind time series.

magnitude 1. A sample wind time-sequence is shown in Figure 12. The maximum absolute wind speed is in the neighborhood of 20 ft/s, so the maximum perturbation to $\beta$ is around 0.03 radians.
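The quoted size of the sideslip perturbation can be checked from the arctangent relation, using the 774 ft/s nominal forward speed and the roughly 20 ft/s peak wind speed given in the text. The function name below is a hypothetical label.

```python
import math

def sideslip_perturbation(wind_speed, airplane_speed=774.0):
    """Sideslip change in radians for a lateral gust (both speeds in ft/s)."""
    return math.atan(wind_speed / airplane_speed)

peak = sideslip_perturbation(20.0)   # about 0.026 rad, i.e. roughly 0.03 rad
```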

6.3. Feedforward control

Feedforward controllers were trained with the BPTM algorithm for the minimum-phase and non-minimum-phase SISO system, and for the MIMO system. The reference model for the SISO minimum-phase system was a delay of 2 s ($M(z) = z^{-2}$) since the transport delay of the plant does not allow any faster response. The reference model for the MIMO system was a unit delay: $M(z) = z^{-1}I$. Various reference models were used when training controllers for the non-minimum-phase SISO system in order to determine a good tradeoff between latency in the reference model and low system error: $M(z) = z^{-\Delta}$, $\Delta \in [2 \ldots 15]$.

Figure 13 shows the penalty function, $h(u_k)$, used when adapting a constrained controller for the SISO systems. The penalty is zero for control effort between 5.5°C and 94.5°C. For control effort outside this range the penalty is quadratic

$$h(u_k) = \begin{cases} \left(\dfrac{u_k - 5.5}{5 - 5.5}\right)^2, & \text{if } u_k < 5.5; \\[2ex] \left(\dfrac{u_k - 94.5}{95 - 94.5}\right)^2, & \text{if } u_k > 94.5; \\[2ex] 0, & \text{otherwise.} \end{cases}$$

In Figure 13(a) the overall penalty function is plotted. It appears to be a hard limit on the control signal. However, in Figure 13(b) a region of the function is magnified to show the parabolic nature of the constraint.
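The piecewise definition translates directly into code. The sketch below is a plain transcription of the penalty with the breakpoints given above; the function name `penalty` is chosen here for illustration.

```python
def penalty(u):
    """Quadratic penalty on control effort outside the 5.5 to 94.5 band."""
    if u < 5.5:
        return ((u - 5.5) / (5.0 - 5.5)) ** 2
    if u > 94.5:
        return ((u - 94.5) / (95.0 - 94.5)) ** 2
    return 0.0

# The penalty reaches 1 exactly at the physical limits of 5 and 95 and
# grows quadratically beyond them, e.g. 4 at 4.5 and 9 at 96.
samples = [penalty(v) for v in (50.0, 5.0, 95.0, 4.5, 96.0)]
```

The scaling by the half-degree margin is what makes the curve look like a hard wall at plot scale while remaining smooth enough to differentiate during adaptation.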

Figure 14 shows an initial 'training transient' for unconstrained control of the minimum-phase SISO system. The gray stair-step line is the desired response signal; the solid dots indicate the plant output at the sampling instants, and the solid line is the continuous-time output of the plant. Controller weights were initialized to small random numbers, and adaptation was


Figure 13. Penalty function used on the magnitude of the control effort.

Figure 14. Training transient with BPTM for minimum-phase SISO plant.

begun. The general shape of the desired response is followed at this stage of training; as time progresses the controller adapts to better control the plant.

Figure 15 shows fully-trained control of the undisturbed minimum-phase SISO system using either a constrained controller or an unconstrained controller. In Figure 15(a), we see the tracking performance of the unconstrained controller. The gray stair-step line is the desired response signal; the solid dots indicate the plant output at the sampling instants, and the solid line is the continuous-time output of the plant. We see that the unconstrained controller (nearly) exactly controls the discrete-time output of the plant. That is, the output of the plant is nearly the same as the desired response at the sampling instants. The constrained controller (Figure 15(c)) is not able to match this performance at the sampling instants. However, the inter-sample response of the constrained controller is much more reasonable than the response of the unconstrained controller.

Figures 15(b) and 15(d) show the control effort required by the two controllers for the same input signal. We see that the unconstrained controller produces control signals outside the acceptable range. The constrained controller produces acceptable control signals.


Figure 15. Tracking performance and control effort of the two controllers for the minimum-phase plant.

When discussing inverse control of non-minimum-phase systems, we noted that a stable but non-causal inverse exists. By allowing latency between the command input and the plant's response, we can get increasingly better performance by shifting the non-causal inverse into the causal part of the time line. We accomplish this by setting $M(z) = z^{-\Delta}$, where $\Delta$ is the latency. Figure 16 gives empirical evidence of this claim. Unconstrained (solid) and constrained (gray) controllers were trained for different levels of allowed latency. We see that the mean-squared system error decreases as the allowed latency increases. We also see that the constrained controller cannot achieve the same level of performance as the unconstrained controller, which is reasonable; its performance levels out for latencies greater than about 4 s. A plot of this kind helps the control designer select an appropriate latency for the control design. In this case, we see that a latency of 4 s provides a tradeoff between performance and speed of response. The tracking performance of this controller is comparable to that of the constrained controller for the minimum-phase system.
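The effect of latency can be reproduced in a much simpler setting than the one used for Figure 16. The following is a minimal sketch, not the BPTM training used in this paper: it assumes a hypothetical two-tap FIR plant with a zero at z = 2 (outside the unit circle), designs a fixed-length FIR inverse by ordinary least squares for the target $z^{-\Delta}$, and reports the squared error between the plant-controller cascade and the delayed impulse. The plant, the number of taps, and the function name `inverse_mse` are all illustrative assumptions, and the latency here is counted in samples rather than seconds.

```python
import numpy as np


def inverse_mse(plant, n_taps, delay):
    """Least-squares FIR inverse of an FIR plant for the target z^-delay.

    Returns the squared error between the cascade impulse response
    (plant convolved with the inverse) and a unit impulse at `delay`.
    """
    n_out = len(plant) + n_taps - 1
    # Convolution matrix: column j is the plant impulse response shifted by j.
    conv = np.zeros((n_out, n_taps))
    for j in range(n_taps):
        conv[j:j + len(plant), j] = plant
    target = np.zeros(n_out)
    target[delay] = 1.0
    weights, *_ = np.linalg.lstsq(conv, target, rcond=None)
    residual = conv @ weights - target
    return float(residual @ residual)


# Hypothetical non-minimum-phase plant: 1 - 2 z^-1 has its zero at z = 2.
plant = np.array([1.0, -2.0])
delays = list(range(0, 13, 2))
errors = [inverse_mse(plant, n_taps=16, delay=d) for d in delays]

for d, e in zip(delays, errors):
    print(f"latency = {d:2d} samples, squared error = {e:.3e}")
```

In this toy case the error falls steadily as the allowed latency grows, because more of the non-causal inverse fits inside the causal filter window. The sketch has no constraint on control effort, so it corresponds only loosely to the unconstrained curve.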

Figure 17 shows tracking performance for the MIMO plant in the same form as Figure 15. We see that the BPTM algorithm is able to effectively train a controller for MIMO systems as well as SISO systems.

Figure 16. System error versus design latency for unconstrained (solid) versus constrained (gray) control of the non-minimum-phase plant.

Figure 17. Tracking performance and control effort for the Boeing 747 controller.

6.4. Disturbance canceling

When performing disturbance canceling we stated that it was very important to make sure that the feedback is incorporated into the design correctly. Using conventional feedback methods, the plant model will adapt to a biased solution. This is illustrated in Figure 18. In Figure 18(a), internal model control is used with a plant model which adapts on-line. The impulse response of the converged plant model is plotted. In Figure 18(b), the adaptive-inverse-control architecture is used with a plant model which adapts on-line. The impulse response of the converged plant model is plotted. By comparing these figures with the impulse response of the plant transfer function given in (8), we see that the first plant model is biased, and the second plant model is correct.

Figure 18. Biased and unbiased plant-model impulse responses for minimum-phase plant.

The bias in the plant model causes sub-optimal performance. In Figures 19(a) and 19(b), disturbance canceling scenarios for the minimum-phase SISO plant are plotted. In both cases, the system was run until convergence of all adaptive processes. Then, the disturbance cancelers were turned 'off' by opening the loop. The system was allowed to run for 1000 s and the system squared-error was recorded. Then, the disturbance cancelers were turned on and the system was run for another 1000 s, and the system squared-error was recorded. In Figure 19(a), the results are plotted for the internal-model-control system, which has the biased plant model. In Figure 19(b), the results are plotted for the adaptive-inverse-control system. For both scenarios the reference commands to follow and the disturbance experienced were identical. We see that for the first 1000 s the adaptive-inverse-control system performs better than the internal-model-control system because its plant model and controller have not become biased. For the second 1000 s it performs better as well. Because of the highly-predictable nature of the disturbance, even though it is formed through a non-linear process, the linear disturbance canceler is able to cancel almost all of the disturbance.
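The bias seen in Figure 18(a) can be illustrated with a toy identification problem. The sketch below is not the internal-model-control or adaptive-inverse-control architecture of this paper; it only illustrates one generic mechanism, namely that a plant model fitted by least squares becomes biased when the plant input is correlated with a colored disturbance through feedback. The one-tap plant, the AR(1) disturbance, the feedback gain, and the function name `identify_gain` are all hypothetical choices made for the illustration.

```python
import numpy as np


def identify_gain(feedback_gain, n_samples=20000, seed=0):
    """Least-squares estimate of b in y[k] = b*u[k-1] + w[k].

    With feedback_gain = 0 the input is white and independent of the
    disturbance; otherwise u[k] = r[k] - feedback_gain * y[k].
    """
    rng = np.random.default_rng(seed)
    b_true = 0.5                      # hypothetical plant gain
    r = rng.standard_normal(n_samples)       # reference command
    noise = rng.standard_normal(n_samples)   # drives the disturbance
    u = np.zeros(n_samples)
    y = np.zeros(n_samples)
    w = 0.0
    for k in range(1, n_samples):
        w = 0.9 * w + noise[k]        # colored (AR(1)) output disturbance
        y[k] = b_true * u[k - 1] + w
        u[k] = r[k] - feedback_gain * y[k]
    # Regress y[k] on u[k-1].
    return float(u[:-1] @ y[1:] / (u[:-1] @ u[:-1]))


b_open = identify_gain(feedback_gain=0.0)    # input independent of disturbance
b_closed = identify_gain(feedback_gain=0.5)  # disturbance fed back into input
print(f"true gain 0.5, no feedback: {b_open:.3f}, with feedback: {b_closed:.3f}")
```

Without feedback the estimate lands near the true gain; with feedback it is pulled well away from it, because the fed-back disturbance at one sample is correlated with the disturbance at the next.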

In Figure 19(c) we see a similar plot for the non-minimum-phase plant. The system is not able to cancel the disturbance quite as well as for the minimum-phase plant, but it still does very well. Finally, in Figure 19(d), we see a similar plot for the MIMO plant, demonstrating that control and disturbance canceling for a MIMO system may be performed using adaptive inverse control. Almost all of the disturbance (which is highly correlated due to the low-frequency variation in the wind signal) can be predicted and removed.
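Why a highly correlated disturbance is largely removable by a linear filter can be sketched with a one-step LMS predictor. This is an illustration under simplifying assumptions, not the disturbance canceler of this paper: plant dynamics are ignored, the disturbance is a hypothetical low-frequency AR(1) sequence rather than the wind signal, and the step size and number of taps are arbitrary. The 'off' power is that of the raw disturbance; the 'on' power is that of the part the predictor fails to anticipate.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, n_taps, mu = 4000, 4, 0.05   # illustrative values only

# Hypothetical slowly varying (highly correlated) disturbance.
w = np.zeros(n_samples)
for k in range(1, n_samples):
    w[k] = 0.98 * w[k - 1] + 0.1 * rng.standard_normal()

weights = np.zeros(n_taps)
residual = np.zeros(n_samples)
for k in range(n_taps, n_samples):
    past = w[k - n_taps:k][::-1]          # w[k-1], ..., w[k-n_taps]
    prediction = weights @ past           # one-step-ahead prediction
    residual[k] = w[k] - prediction       # what remains after canceling
    weights += mu * residual[k] * past    # LMS weight update

half = n_samples // 2                     # skip the adaptation transient
power_off = float(np.mean(w[half:] ** 2))
power_on = float(np.mean(residual[half:] ** 2))
print(f"canceler off: {power_off:.4f}, canceler on: {power_on:.4f}")
```

Only the unpredictable innovation of the disturbance survives the predictor, which is why the residual power is a small fraction of the original in this toy case.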

Figure 19. Disturbance canceling scenarios. (a) For minimum-phase plant, using internal model control. The other three scenarios use adaptive inverse control. (b) For minimum-phase plant; (c) For non-minimum-phase plant; (d) For MIMO plant.

7. CONCLUSIONS

Adaptive inverse control is a very simple yet highly effective way of controlling SISO or MIMO plants using adaptive signal-processing techniques. The control scheme is partitioned into smaller sub-problems which can be independently optimized. First, an adaptive plant model is made; secondly, a constrained adaptive controller is generated; finally, a disturbance canceler is adapted. All three processes may continue concurrently, and the control architecture is unbiased. Excellent control and disturbance canceling for minimum-phase or non-minimum-phase plants is achieved.

ACKNOWLEDGEMENTS

This work was supported in part by the following organizations: The National Science Foundation under contract ECS-9522085 and the Electric Power Research Institute under contract WO8016-17. The author would also like to thank the reviewers for their helpful comments and suggestions.

REFERENCES

1. Plett G. Adaptive inverse control of plants with disturbances. Ph.D. thesis, Stanford University, Stanford, CA 94305, May 1998.
2. Widrow B, Plett G, Ferreira E, Lamego M. Adaptive inverse control based on nonlinear adaptive filtering. In Proceedings of 5th IFAC Workshop on Algorithms and Architectures for Real-Time Control AARTC'98, Cancun, MX, April 1998; 247-252 (invited paper).
3. Widrow B, Plett G. Nonlinear adaptive inverse control. In Proceedings of the 36th IEEE Conference on Decision and Control, San Diego, CA, vol. 2, 10-12 December 1997; 1032-1037.
4. Bilello M. Nonlinear adaptive inverse control. Ph.D. thesis, Stanford University, Stanford, CA, April 1996.
5. Widrow B, Plett G. Adaptive inverse control based on linear and nonlinear adaptive filtering. In Proceedings of the World Congress on Neural Networks, San Diego, CA, September 1996; 620-627.
6. Widrow B, Walach E. Adaptive Inverse Control. Prentice-Hall PTR: Upper Saddle River, NJ, 1996.
7. Glentis G, Berberidis K, Theodoridis S. Efficient least squares adaptive algorithms for FIR transversal filtering. IEEE Signal Processing Magazine 1999; 16(4):13-41.
8. Williams R, Zipser D. Experimental analysis of the real-time recurrent learning algorithm. Connection Science 1989; 1(1):87-111.
9. Brogan WL. Modern Control Theory (3rd edn). Prentice-Hall: Upper Saddle River, NJ, 1991; 229-230 (Chapter 6).
10. Haykin S. Adaptive Filter Theory (3rd edn). Prentice-Hall: Upper Saddle River, NJ, 1996.
11. Widrow B, Stearns S. Adaptive Signal Processing. Prentice-Hall: Englewood Cliffs, NJ, 1985.
12. Youla D, Bongiorno J Jr, Jabr H. Modern Wiener-Hopf design of optimal controllers Part I: the single-input-output case. IEEE Transactions on Automatic Control 1976; AC-21(1):3-13.
13. Youla D, Jabr H, Bongiorno J Jr. Modern Wiener-Hopf design of optimal controllers Part II: the multivariable case. IEEE Transactions on Automatic Control 1976; AC-21(3):319-338.
14. Garcia C, Morari M. Internal model control. 1. A unifying review and some new results. Industrial and Engineering Chemistry Process Design and Development 1982; 21(2):308-323.
15. Garcia C, Morari M. Internal model control. 2. Design procedure for multivariable systems. Industrial and Engineering Chemistry Process Design and Development 1985; 24(2):472-484.
16. Garcia C, Morari M. Internal model control. 3. Multivariable control law computation and tuning guidelines. Industrial and Engineering Chemistry Process Design and Development 1985; 24(2):484-494.
17. Rivera D, Morari M, Skogestad S. Internal model control. 4. PID controller design. Industrial and Engineering Chemistry Process Design and Development 1986; 25(1):252-265.
18. Economou C, Morari M. Internal model control. 5. Extension to nonlinear systems. Industrial and Engineering Chemistry Process Design and Development 1986; 25(2):403-411.
19. Economou C, Morari M. Internal model control. 6. Multiloop design. Industrial and Engineering Chemistry Process Design and Development 1986; 25(2):411-419.
20. Franklin G, Powell J, Emami-Naeini A. Feedback Control of Dynamic Systems (3rd edn). Addison-Wesley: Reading, MA, 1994; 659-661.
21. Franklin G, Powell J, Emami-Naeini A. Feedback Control of Dynamic Systems (3rd edn). Addison-Wesley: Reading, MA, 1994; 684-693.
22. Heffley R, Jewell W. Aircraft handling qualities. Tech. Rep. 1004-1, System Technology, Inc., Hawthorne, CA, May 1972.
23. Grace A, Laub A, Little J, Thompson C. Control System Toolbox for use with MATLAB. The Math Works Inc.: Natick, MA, 1992; 23-35.
24. Kaminsky F, Kirchhoff R, Syu C, Manwell J. A comparison of alternative approaches for the synthetic generation of a wind speed time series. Transactions of the American Society of Mechanical Engineers. Journal of Solar Energy Engineering 1991; 113(4):280-289.
