6
Adaptive Control of Discrete-Time Systems with Unknown, Unstable Zero Dynamics Syed Aseem Ul Islam, Tam W. Nguyen, Ilya V. Kolmanovsky, and Dennis S. Bernstein Abstract— Adaptive control of systems with poorly known or unmodeled unstable zero dynamics remains a challenging problem. This paper presents an extension of retrospective cost adaptive control (RCAC) called data-driven RCAC (DDRCAC), which does not require that the zero dynamics be known a priori. Instead, the method uses online identification to obtain an approximate model of the plant numerator dynamics for use in the target model of RCAC. DDRCAC is demonstrated numerically on systems with linear and nonlinear unstable zero dynamics that are a priori unknown. I. I NTRODUCTION Systems with unstable zero dynamics arise in many ap- plications [1]. For linear systems, unstable zero dynamics manifest themselves as nonminimum-phase (NMP) zeros. For nonlinear systems, the zero dynamics are characterized by control inputs that make the output identically zero. These systems are difficult to control due to limitations on plant inversion. Systems with unstable zero dynamics are especially diffi- cult to control when the zero dynamics are unknown. This case occurs in hypersonic vehicles, whose zero dynamics are poorly modeled due to the complex aerodynamics [2]–[6]. The present paper applies indirect adaptive control to systems with uncertain, unstable zero dynamics. The starting point for this work is retrospective cost adaptive control (RCAC), which is a discrete-time adaptive control technique that is applicable to stabilization, command following, and disturbance rejection. RCAC is described in [7], [8], and various applications are considered in [9]–[11]. The modeling information needed by RCAC is used to specify the target model G f , which defines the retrospective cost function. In the SISO case, the required minimal model- ing information is the relative degree, the sign of the leading numerator coefficient, and any NMP zeros. In the MIMO case, the required minimal modeling information needed to specify G f is an open problem. In practice, however, a plant may be NMP, but the NMP zeros may be uncertain. If the plant has NMP zeros that are poorly modeled, then RCAC may or may not cancel them, depending on the accuracy of the knowledge of the NMP zeros as well as the aggressiveness of the controller. The tendency of RCAC to cancel NMP zeros is exploited in [12], where unstable pole locations are used as estimates of the NMP zeros. Syed Aseem Ul Islam, Tam W. Nguyen, Ilya V. Kolmanovsky, and Dennis S. Bernstein are with the Department of Aerospace Engineering, Univer- sity of Michigan, Ann Arbor, MI, USA. {aseemisl, twnguyen, ilya, dsbaero}@umich.edu The present paper extends RCAC to address the problem of unknown NMP zeros, especially for the MIMO case. This approach, called data-driven retrospective cost adaptive control (DDRCAC) [13], uses online system identification to construct G f . In particular, closed-loop identification using recursive least squares (RLS) is used to identify an infinite- impulse-response (IIR) model whose numerator is used to update a finite-impulse-response (FIR) target model G f ,k at each step. This technique can be applied to both SISO and MIMO systems. This paper develops DDRCAC and applies this technique to linear NMP systems as well as nonlinear systems with unstable nonlinear zero dynamics. Since DDRCAC relies on online identification, we in- vestigate the effect of the sensor noise on the closed-loop performance. In order to examine this sensitivity, we apply DDRCAC to a system with an unknown NMP zero with increasing levels of sensor noise. Furthermore, we present an example in which the transients that arise due to poor iden- tification lead to improved identification accuracy, which, in turn, facilities closed-loop control using DDRCAC. II. COMMAND-FOLLOWING PROBLEM G c,k G w k u k v k y 0,k y k E - r k z k Fig. 1: Block diagram representation of the adaptive servo problem. The objective is to have the performance variable Ey k follow commands r k , where E is a matrix that selects the components of the measurement y k . The adaptive controller G c,k has access to noisy measurements y k = y 0,k + v k and the adaptation variable z k = r k - Ey k , and computes u k . The plant G is acted upon by u k and w k , and produces the output y 0,k . We consider the servo loop shown in Figure 1, where r k is the command and w k is the disturbance. The plant output is y 0,k , and the measurement is given by y k = y 0,k + v k , where v k is sensor noise. The performance variable is the signal Ey k , where the matrix E selects components of y k that are required to follow the command r k . The command- following error is thus z k 4 = r k - Ey k . The inputs to the adaptive feedback controller G c,k are the measurement y k and the command-following error z k . Note that z k also serves as the adaptation variable, as denoted by the diagonal line passing through G c,k . The objective is to minimize the adaptation variable, that is, the command-following error, in the presence of the disturbance and sensor noise. Note that the servo loop in Figure 1 involves two plants in feedback, namely, G and EG. In practice, each of these plants may be either minimum phase or nonminimum phase, and the presence of NMP zeros in either plant may impact 2020 American Control Conference Denver, CO, USA, July 1-3, 2020 978-1-5386-8266-1/$31.00 ©2020 AACC 1387 Authorized licensed use limited to: Michigan State University. Downloaded on August 01,2020 at 15:23:27 UTC from IEEE Xplore. Restrictions apply.

Adaptive Control of Discrete-Time Systems with Unknown

  • Upload
    others

  • View
    6

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Adaptive Control of Discrete-Time Systems with Unknown

Adaptive Control of Discrete-Time Systemswith Unknown, Unstable Zero Dynamics

Syed Aseem Ul Islam, Tam W. Nguyen, Ilya V. Kolmanovsky, and Dennis S. Bernstein

Abstract— Adaptive control of systems with poorly knownor unmodeled unstable zero dynamics remains a challengingproblem. This paper presents an extension of retrospective costadaptive control (RCAC) called data-driven RCAC (DDRCAC),which does not require that the zero dynamics be known apriori. Instead, the method uses online identification to obtainan approximate model of the plant numerator dynamics foruse in the target model of RCAC. DDRCAC is demonstratednumerically on systems with linear and nonlinear unstable zerodynamics that are a priori unknown.

I. INTRODUCTION

Systems with unstable zero dynamics arise in many ap-plications [1]. For linear systems, unstable zero dynamicsmanifest themselves as nonminimum-phase (NMP) zeros.For nonlinear systems, the zero dynamics are characterizedby control inputs that make the output identically zero. Thesesystems are difficult to control due to limitations on plantinversion.

Systems with unstable zero dynamics are especially diffi-cult to control when the zero dynamics are unknown. Thiscase occurs in hypersonic vehicles, whose zero dynamics arepoorly modeled due to the complex aerodynamics [2]–[6].

The present paper applies indirect adaptive control tosystems with uncertain, unstable zero dynamics. The startingpoint for this work is retrospective cost adaptive control(RCAC), which is a discrete-time adaptive control techniquethat is applicable to stabilization, command following, anddisturbance rejection. RCAC is described in [7], [8], andvarious applications are considered in [9]–[11].

The modeling information needed by RCAC is used tospecify the target model Gf , which defines the retrospectivecost function. In the SISO case, the required minimal model-ing information is the relative degree, the sign of the leadingnumerator coefficient, and any NMP zeros. In the MIMOcase, the required minimal modeling information needed tospecify Gf is an open problem.

In practice, however, a plant may be NMP, but the NMPzeros may be uncertain. If the plant has NMP zeros that arepoorly modeled, then RCAC may or may not cancel them,depending on the accuracy of the knowledge of the NMPzeros as well as the aggressiveness of the controller. Thetendency of RCAC to cancel NMP zeros is exploited in [12],where unstable pole locations are used as estimates of theNMP zeros.

Syed Aseem Ul Islam, Tam W. Nguyen, Ilya V. Kolmanovsky, and DennisS. Bernstein are with the Department of Aerospace Engineering, Univer-sity of Michigan, Ann Arbor, MI, USA. {aseemisl, twnguyen,ilya, dsbaero}@umich.edu

The present paper extends RCAC to address the problemof unknown NMP zeros, especially for the MIMO case.This approach, called data-driven retrospective cost adaptivecontrol (DDRCAC) [13], uses online system identification toconstruct Gf . In particular, closed-loop identification usingrecursive least squares (RLS) is used to identify an infinite-impulse-response (IIR) model whose numerator is used toupdate a finite-impulse-response (FIR) target model Gf,k ateach step. This technique can be applied to both SISO andMIMO systems. This paper develops DDRCAC and appliesthis technique to linear NMP systems as well as nonlinearsystems with unstable nonlinear zero dynamics.

Since DDRCAC relies on online identification, we in-vestigate the effect of the sensor noise on the closed-loopperformance. In order to examine this sensitivity, we applyDDRCAC to a system with an unknown NMP zero withincreasing levels of sensor noise. Furthermore, we present anexample in which the transients that arise due to poor iden-tification lead to improved identification accuracy, which, inturn, facilities closed-loop control using DDRCAC.

II. COMMAND-FOLLOWING PROBLEM

Gc,kGc,k G

wkuk

vky0,k yk

E

−rk zk

Fig. 1: Block diagram representation of the adaptive servo problem. Theobjective is to have the performance variable Eyk follow commands rk ,where E is a matrix that selects the components of the measurement yk. Theadaptive controller Gc,k has access to noisy measurements yk = y0,k+vkand the adaptation variable zk = rk − Eyk, and computes uk. The plantG is acted upon by uk and wk, and produces the output y0,k.

We consider the servo loop shown in Figure 1, where rkis the command and wk is the disturbance. The plant outputis y0,k, and the measurement is given by yk = y0,k + vk,where vk is sensor noise. The performance variable is thesignal Eyk, where the matrix E selects components of ykthat are required to follow the command rk. The command-following error is thus zk

4= rk − Eyk. The inputs to the

adaptive feedback controller Gc,k are the measurement ykand the command-following error zk. Note that zk alsoserves as the adaptation variable, as denoted by the diagonalline passing through Gc,k. The objective is to minimize theadaptation variable, that is, the command-following error, inthe presence of the disturbance and sensor noise.

Note that the servo loop in Figure 1 involves two plantsin feedback, namely, G and EG. In practice, each of theseplants may be either minimum phase or nonminimum phase,and the presence of NMP zeros in either plant may impact

2020 American Control ConferenceDenver, CO, USA, July 1-3, 2020

978-1-5386-8266-1/$31.00 ©2020 AACC 1387

Authorized licensed use limited to: Michigan State University. Downloaded on August 01,2020 at 15:23:27 UTC from IEEE Xplore. Restrictions apply.

Page 2: Adaptive Control of Discrete-Time Systems with Unknown

robustness and achievable performance. In the special casewhere yk is the full state of G, it follows that G has no zerosand thus is minimum phase. In the present paper, we focusexclusively on the output-feedback case where G is NMPand EG = G. Furthermore, rk ∈ Rp, yk ∈ Rp, zk ∈ Rp, anduk ∈ Rm.

The plant G is modeled as the strictly-proper discrete-timeinput-output (IO) system

y0,k = −n∑i=1

Aiy0,k−i +

n∑i=`

Biuk−i + wk−1, (1)

yk = y0,k + vk, (2)

where k is the step, n is the system window length, y0,k ∈Rp is the output, yk ∈ Rp is a noisy measurement of y0,k,uk ∈ Rm is the control, wk ∈ Rp is the disturbance, 0 <` < n is an integer, and Ai ∈ Rp×p, Bi ∈ Rp×m are systemcoefficient matrices. In the cases wk = 0, vk = 0,

G(q)4=(Ipq

n +n∑i=1

Aiqn−i)−1 n∑

i=`

Biqn−i, (3)

is the transfer-function from uk to yk, where q is the forward-shift operator.

III. DATA-DRIVEN RETROSPECTIVE COST ADAPTIVECONTROL

DDRCAC consists of two components, namely, onlineidentification and RCAC. The online identification uses RLSto fit an IIR IO model using yk and uk data obtainedduring closed-loop operation. A second implementation ofRLS is used to update the RCAC adaptive controller using atarget model constructed from the latest identified model.In particular, at each step we construct the target modelGf,k as an FIR filter whose numerator is identical to thenumerator of the latest identified IIR model. Both RLSimplementations use data-dependent variable-rate forgetting(VRF). The following result is RLS with VRF from [14] thatis used to implement DDRCAC.

Proposition 1: For all k ≥ 0, let Yk ∈ RlY ,Φk ∈ RlY ×lΘ , λk ∈ (0, 1], and define ρk

4=∏kj=0 λj .

Let Θ0 ∈ RlΘ , and let P0 ∈ RlΘ×lΘ be positive definite.Furthermore, for all k ≥ 0, denote the minimizer of

Jk(Θ)4=

k∑i=0

ρkρi

(Yi − ΦiΘ)T(Yi − ΦiΘ)

+ ρk(Θ−Θ0)TP−10 (Θ−Θ0). (4)

where Θ ∈ RlΘ , by Θk+14= argmin

Θ ∈ RlΘJk(Θ). Then, for

all k ≥ 0, Θk+1 is given by

Pk+1 = 1λkPk − 1

λkΦTk (λkIlY + ΦkPkΦT

k )−1ΦkPk, (5)

Θk+1 = Θk + Pk+1ΦTk (Yk − ΦkΘk). (6)

A. Online Identification

We fit a strictly-proper linear IO model of the form

yk = −η∑i=1

Fi,kyk−i +

η∑i=1

Gi,kuk−i, (7)

to yk and uk, where η is the model window length, and Fi ∈Rp×p and Gi ∈ Rp×m are the model coefficient matricesthat are to be updated. To perform this update recursivelywe define the plant identification error

zp,k(θp)4= yk − φp,kθp, φp,k

4=

−yk−1

...−yk−ηuk−1

...uk−η

T

⊗ Ip ∈ Rp×lθp ,

(8)θp,k

4= vec [ F1,k · · · Fη,k G1,k · · · Gη,k ] ∈ Rlθp , (9)

The cost (4) with Yk −ΦkΘ = zp,k(θp) is minimized usingProposition 1 with

Yk4= yk, Φk

4= φp,k, Θ

4= θp, λk

4= λp,k, (10)

which recursively produces the minimizer θp,k+1.

B. Retrospective Cost Adaptive Control

Define the dynamic compensator

uk4= satα(φkθc,k), φk

4=

uk−1

...uk−ncyk−1

...yk−nc

T

⊗ Im ∈ Rm×lθ , (11)

nc is the controller window length, lθ = ncm(m + p), andθc,k is a vector of controller coefficients to be optimized.The definition (11) represents an IIR controller whose outputis saturated by a multivariable saturation function satα. Inparticular, each entry xi of a vector x is independentlysaturated by the corresponding entry αi of α, to produceeach entry satαi(xi) of satα(x). That is

satαi(xi)

4=

{xi, |xi| < αi,

sign(xi)αi, |xi| ≥ αi.(12)

Next, define the retrospective cost variable

zk(θc)4= zk +Nkφkθc −NkUk ∈ Rp, (13)

which, assuming that Ru and R∆u are nonzero, is stackedwith

−Ruφkθc ∈ Rm, R∆u(uk−1 − φkθc) ∈ Rm, (14)

where

φk4=

φk−1...

φk−η

∈ Rηm×lθ , Uk4=

uk−1...

uk−η

∈ Rηm, (15)

Nk4=

[ −1p×m 0 · · · 0 ] , Gi,k+1 = 0,

[ −G1,k+1 · · · −Gη,k+1 ] , otherwise,(16)

zk4= rk − yk, (17)

Ru ∈ Rm×m is the positive-semidefinite control weighting,R∆u ∈ Rm×m is the positive-semidefinite control move

1388

Authorized licensed use limited to: Michigan State University. Downloaded on August 01,2020 at 15:23:27 UTC from IEEE Xplore. Restrictions apply.

Page 3: Adaptive Control of Discrete-Time Systems with Unknown

weighting, Gi,k+1 ∈ Rp×m for i = 1, . . . , η are the nu-merator coefficients of the identified model, Nk ∈ Rp×ηm,rk ∈ Rp is the command, and zk ∈ Rp is the command-following error and the adaptation variable. Note that theidentification update at step k can be performed beforethe adaptive control update and thus Gi,k+1 are availablefor the construction of Nk at step k. The cost (4) withYk − ΦkΘ = zk(θc) is minimized using Proposition 1 with

Yk4= zk −NkUk ∈ Rp, (18)

which, assuming that Ru and R∆u are nonzero, is stackedwith

0 ∈ Rm, R∆uuk−1 ∈ Rm, (19)

and

Φk4= −Nkφk ∈ Rp×lθ , (20)

which, assuming that Ru and R∆u are nonzero, is stackedwith

Ruφk ∈ Rm×lθ , R∆uφk ∈ Rm×lθ , (21)

and Θ4= θc, λk

4= λc,k, which recursively produces the

minimizer θc,k+1.For all examples in this paper, θp,k and θc,k are initialized

as 0 in order to reflect the absence of additional priormodeling information. In addition, for both implementationsof RLS P0 = p0IlΘ where p0 > 0 is a tuning hyperparameter.

C. Data-Dependent Variable Rate ForgettingFor data-dependent variable-rate forgetting, we define

λp,k4=

1

1 + γfτ1,τ2,k[zp,k(θp,k)], λc,k

4=

1

1 + γfτ1,τ2,k(zk),

(22)

which are the time-varying data-dependent plant-identification and control forgetting factors, respectively,where

fτ1,τ2,k(xk)4=

ξnξd− 1, ξn

ξd> 1,

0, ξnξd≤ 1 or ξd = 0,

(23)

ξn4=

(1

τ1

k∑i=k−τ1

xTi xi

)1/2

, ξd4=

(1

τ2

k∑i=k−τ2

xTi xi

)1/2

, (24)

xk is a vector and τ1 < τ2 are integers. The terms ξn and ξdare the average norms of xk over periods of length τ1 andτ2, respectively. The integer τ2 is chosen to be large so thatξd approximates the long-term variation of the xk, whereasτ1 is chosen to be small so that ξn approximates the short-term variation of the xk, Consequently, the case ξn

ξd> 1

implies that the variation of xk is greater in the recent pastthan over a longer time interval. The function f is used inDDRCAC to suspend forgetting when the variation of theerror drops below a threshhold determined by the ratio ξn

ξd.

This technique thus prevents DDRCAC from forgetting dueto sensor noise rather than due the magnitude of the noise-free command-following error. A list of hyperparameters tobe selected for DDRCAC is presented in Table I.

Parameter Description Selection

η Model window length Integer ≥ 1

nc Controller window length Integer ≥ 1

Ru Control weighting ruIm, ru ≥ 0

R∆u Control move weighting r∆uIm, r∆u ≥ 0

α Control saturation level vector Actuator saturation

p0 Initial RLS covariance scaling p0 > 0

γ Forgetting parameter 0 ≤ γ < 1

τ1, τ2 Forgetting window lengths Integers τ2 > τ1

TABLE I: Tuning hyperparameters that need to be selected for DDRCAC.

IV. APPLICATION TO NMP SYSTEMS

In this section, DDRCAC is applied to NMP systems. Thefirst example is SISO and unstable, and the second exampleis MIMO and asymptotically stable. Within the context of thelinear discrete-time IO model (1), NMP zeros are associatedwith the zero dynamics. In particular, if the first nonzerocoefficient B` in (1) has full column rank, then setting y0,k ≡0 and wk ≡ 0 in (1) yields the zero dynamics

uk = −(BT` B`)

−1BT`

n∑i=`+1

Biuk+`−i. (25)

G has a NMP transmission zero if and only if the zerodynamics (25) are unstable.

Example 1. Harmonic command following for an un-stable SISO NMP system with step-disturbance rejection.Consider the discrete-time, IO systemy0,k = 1.4y0,k−1 − 1.1y0,k−2 + uk−1 − 1.3uk−2 + wk−1,

(26)

which has the unstable linear zero dynamicsuk = 1.3uk−1. (27)

Hence (26) has unstable poles 0.7 ± 0.781 rad/sample andNMP zero 1.3 rad/sample. The disturbance wk is a stepof height 0.5, the sensor noise vk is a zero-mean Gaussiansequence with standard deviation 0.001, and the commandis rk = sin 0.23k. For DDRCAC we set η = 2, nc = 8,Ru = R∆u = 0, α = 3, p0 = 104, γ = 0.1, τ1 = 30, andτ2 = 100. Figures 2, 3 show that the harmonic command isfollowed asymptotically and the disturbance is rejected.

To investigate the inter-dependence between online-identification and RCAC we repeat the example with p0 =10−4. This value of p0 corresponds to less aggressive iden-tification and control. Consequently, poor identification ofthe true plant is expected, which in turn is expected toresult in poor closed-loop performance. Figure 4(a) showsa large transient in yk between k = 43 and k = 110that corresponds to poor command-following performance.The poor command-following performance is a result of theinsufficient modeling information in Gf,43, which in turn isdue to the poor model of the system at k = 43, as shown inFigure 4(d). The transient in yk causes a large change in theidentification coefficients between k = 43 and k = 80, as

1389

Authorized licensed use limited to: Michigan State University. Downloaded on August 01,2020 at 15:23:27 UTC from IEEE Xplore. Restrictions apply.

Page 4: Adaptive Control of Discrete-Time Systems with Unknown

shown in Figure 4(b), which leads to an identified modelat k = 71 that approximately captures the poles of thetrue system, as shown in Figure 4(e). The large change inthe identification coefficients causes a large change in thecontroller coefficients between k = 71 and k = 110 as shownin Figure 4(c). At k = 110 the identified model capturesthe poles and NMP zero of the true system as shown inFigure 4(f) and thus, for k > 110 Gf,k contains the modelinginformation required by RCAC. Consequently, for k > 110asymptotic command following is achieved. The poles andzeros of the plant and identified model are shown in Figure4(d)–(f) at k = 43, k = 71, and k = 110. This investigationshows that, when the model is poor, the resulting transientresponse of yk provides additional persistence of excitation,which enhances identification accuracy.

Next, we repeat the example with varying levels of sensornoise. The absolute error from these simulations, and thepoles and zeros of the identified model are shown in Figure 5.Increasing levels of sensor noise degrade the identification ofthe poles and zeros, and thus the performance of DDRCAC.Note that, although the accuracy of the identified poles andzeros is poor in the case σ(vk) = 1, closed-loop stability ismaintained. �

-10

0

10

10-3

10-1

101

0 100 200 300 400 500 600

-2

0

2

Fig. 2: Example 1: Harmonic-command following and step-disturbancerejection for an unstable SISO NMP system. The signal-to-noise ratiobetween yk and vk is 64 dB.

-4

-2

0

2

4

6

8

0.9

0.95

1

-50

0

50

0 100 200 300 400 500 600

0.9

0.95

1

Fig. 3: Example 1: Harmonic-command following and step-disturbancerejection for an unstable SISO NMP system. The identification coefficients,identification forgetting factor, controller coefficients, and controller forget-ting factor are shown.

Example 2. Multi-step command following for an asymp-totically stable MIMO NMP system. Consider (1) with

A1 = I2, A2 =

[0.3 0

0 0.4

], B1 =

[−1 4

3 6

], B2 =

[3.2 −0.8−0.6 −1.2

].

(28)

-10

0

10

20

-1

0

1

0 50 100 150 200 250 300

-20

0

20

-2 -1 0 1

-1

0

1

-1 0 1

-1

0

1

-1 0 1

-1

0

1

Fig. 4: Example 1: Harmonic-command following and step-disturbancerejection for an unstable SISO NMP system using a small value p0. (a)shows that, between k = 43 and k = 110, the command-following error hasa large transient. (b) shows that, due to the transient in yk, the identificationcoefficients undergo a large change between k = 43 and k = 80. Note that,the update of the numerator of the identified model lags the update of theidentification of the denominator. (c) shows that, due to the large changein the identification coefficients, the controller coefficients undergo a largechange between k = 71 and k = 110. (d), (e), and (f) show the poles andzeros of the true system (blue) and the identified model (red) at k = 43,k = 71, and k = 110, respectively.

-1 -0.5 0 0.5 1 1.5

-1

-0.5

0

0.5

1

0 100 200 300 400 500 600

10-4

10-3

10-2

10-1

100

101

Fig. 5: Example 1: Harmonic-command following and step-disturbancerejection for an unstable SISO NMP system. The absolute command-following error |zk| is shown for four cases with different levels of sensornoise σ(vk). Higher levels of sensor noise degrade closed-loop performance.The poles and zeros of the identified model at the end of each simulationare also shown. The poles and zeros of the true system are shown in black.

The transmission zeros, channel zeros, and poles of G areshown in Figure 6. G has NMP channel and transmissionzeros, and is asymptotically stable. Using (25) the zero dy-

namics for this system are given by uk =

[1.2 0

−0.5 0.2

]uk−1.

Note that the eigenvalues of the matrix that defines the zerodynamics are 0.2 and 1.2 which are the transmission zerosof G. The disturbance wk and sensor noise vk are zero-meanGaussian sequences with standard deviation 0.001 and 0.01,respectively, and the command is given by

rk =

r1, 0 ≤ k < 150,

r2, 150 ≤ k < 300,

r3, 300 ≤ k < 450,

r4, k ≥ 450,

, (29)

1390

Authorized licensed use limited to: Michigan State University. Downloaded on August 01,2020 at 15:23:27 UTC from IEEE Xplore. Restrictions apply.

Page 5: Adaptive Control of Discrete-Time Systems with Unknown

r14=

[0.2

0

], r2

4=

[1.5

1.25

], r3

4=

[−1.25−1.5

], r4

4=

[−0.250.15

].

(30)

For DDRCAC we set η = 2, nc = 8, Ru = R∆u = 0,α = [ 0.25 0.25 ]T, p0 = 104, γ = 0.1, τ1 = 30, andτ2 = 100. Figures 7, 8 show that the multi-step command isfollowed asymptotically. �

-1 0 1 2 3

-1

0

1

-1 -0.5 0 0.5 1

-1

0

1

-1 -0.5 0 0.5 1

-1

0

1

-1 -0.5 0 0.5 1

-1

0

1

Fig. 6: Example 2: Transmission zeros, channel zeros, and poles of thesystem where Gi,j is the (i, j) entry of the MIMO transfer function G.

-2

0

2

-3

-2

-1

0

1

10-5

10-3

10-1

-0.2

0

0.2

0 100 200 300 400 500 600

-0.2

0

0.2

Fig. 7: Example 2: Multi-step command following for an asymptoticallystable MIMO NMP system. The signal-to-noise ratio between yk and vk is[ 40 41 ]T dB. The control uk and the control saturation levels are shown.

V. APPLICATION TO NONLINEAR IO SYSTEMS WITHNONLINEAR UNSTABLE ZERO DYNAMICS

In this section, DDRCAC is applied to nonlinear discrete-time IO systems of the form

y0,k = −n∑i=1

Aiy0,k−i +B`uk−`

+ h(uk−1−`, . . . , uk−n) + wk−1, (31)

yk+1 = y0,k+1 + vk+1, (32)

where 0 < ` < n is an integer, and h : Rm×· · ·×Rm → Rp.The first example is SISO and asymptotically stable, and thesecond example is MIMO and unstable. Both examples havenonlinear unstable zero dynamics. If B` has full column rank,then setting y0,k ≡ 0 and wk ≡ 0 yields the zero dynamics

Fig. 8: Example 2: Multi-step command following for an asymptoticallystable MIMO NMP system. The identification coefficients, identificationforgetting factor, controller coefficients, and controller forgetting factor areshown.

uk = −(BT` B`)

−1BT` h(uk−1, . . . , uk+`−n). (33)

Note that, despite the fact that the examples in this sectionhave nonlinear zero dynamics, the identification model (7)is linear.

Example 3. Multi-step command following for an asymp-totically stable SISO system with unstable nonlinear zerodynamics. Consider the discrete-time, IO system

y0,k = 0.9y0,k−1 − uk−2 + uk−3 + u3k−3 + wk−1. (34)

This system has unstable nonlinear zero dynamics

uk = (1 + u2k−1)uk−1, (35)

and is open loop unstable. The disturbance wk and sensornoise vk are zero-mean Gaussian sequences with standarddeviation 0.001 and 0.01, respectively, and the command is

rk =

{1, 0 ≤ k < 300,

−1, k ≥ 300.(36)

For DDRCAC we set η = 3, nc = 8, Ru = 0, R∆u =0.3, α = 1, p0 = 104, γ = 0.1, τ1 = 30, and τ2 = 100.Figures 9, 10 show that the multi-step command is followedasymptotically. �

-2

0

2

10-3

10-1

0 100 200 300 400 500 600

-1

0

1

Fig. 9: Example 3: Multi-step command following for an asymptoticallystable SISO system with unstable nonlinear zero dynamics. The signal-to-noise ratio between yk and vk is 40 dB.

1391

Authorized licensed use limited to: Michigan State University. Downloaded on August 01,2020 at 15:23:27 UTC from IEEE Xplore. Restrictions apply.

Page 6: Adaptive Control of Discrete-Time Systems with Unknown

-1

0

1

2

0.9

0.95

1

-50

0

50

100

0 100 200 300 400 500 600

0.9

0.95

1

Fig. 10: Example 3: Multi-step command following for an asymptoticallystable SISO system with unstable nonlinear zero dynamics. The identifica-tion coefficients, identification forgetting factor, controller coefficients, andcontroller forgetting factor are shown.

Example 4. Multi-step command following for an un-stable MIMO system with unstable nonlinear zero dynamicsand harmonic disturbance rejection. Consider (31) with

A1 =

[−0.4 −10 −1.1

], B1 =

[1 0

0 1

], (37)

h(uk−1) =

[| sinu1,k−1|u1,k−1 + u2,k−1

| cosu2,k−1|u2,k−1

]. (38)

Note that the system is open-loop unstable. Using (25) theunstable nonlinear zero dynamics for this system are given

by uk =

[| sinu1,k−1| 1

0 | cosu2,k−1|

]uk−1. The harmonic

disturbance is given by wk = 0.05 sin 0.33k, the sensor noisevk is a zero-mean Gaussian sequence with standard deviation0.01, and the command is given by (30). For DDRCAC weset η = 2, nc = 8, Ru = R∆u = 0, α = [ 0.6 0.6 ]T,p0 = 104, γ = 0.2, τ1 = 30, and τ2 = 100. Figures 11–12show that the multi-step command is followed asymptoticallyand the harmonic disturbance is rejected. �

-202468

-2

0

2

4

6

10-3

10-1

-0.5

0

0.5

0 100 200 300 400 500 600

-0.5

0

0.5

Fig. 11: Example 4: Multi-step command following and harmonic distur-bance rejection for an unstable MIMO system with unstable nonlinear zerodynamics. The signal-to-noise ratio between yk and vk is [ 44 42 ]T dB.The control uk and the control saturation levels are shown.

Fig. 12: Example 4: Multi-step command following and harmonic distur-bance rejection for an unstable MIMO system with unstable nonlinear zerodynamics. The identification coefficients, identification forgetting factor,controller coefficients, and controller forgetting factor are shown.

VI. ACKNOWLEDGMENTS

This research was supported by the Office of Naval Re-search under grant N00014-18-1-2211 of the BRC Program.

REFERENCES

[1] S. A. Al-Hiddabi and N. H. McClamroch, “Tracking and maneuverregulation control for nonlinear nonminimum phase systems: applica-tion to flight control,” IEEE Trans. Contr. Sys. Tech., vol. 10, no. 6,pp. 780–792, Nov 2002.

[2] L. Fiorentini and A. Serrani, “Adaptive restricted trajectory tracking fora non-minimum phase hypersonic vehicle model,” Automatica, vol. 48,no. 7, pp. 1248–1261, 2012.

[3] H. Xu, M. D. Mirmirani, and P. A. Ioannou, “Adaptive sliding modecontrol design for a hypersonic flight vehicle,” AIAA J. Guid. Contr.Dyn., vol. 27, p. 829–838, 2004.

[4] J. T. Parker, A. Serrani, S. Yurkovich, M. A. Bolender, and D. B.Doman, “Control-Oriented Modeling of an Air-Breathing HypersonicVehicle,” AIAA J. Guid. Contr. Dyn., vol. 30, no. 3, pp. 856–869, 2007.

[5] H. Buschek and A. J. Calise, “Uncertainty Modeling and Fixed-OrderController Design for a Hypersonic Vehicle Model,” AIAA J. Guid.Contr. Dyn., vol. 20, no. 1, pp. 42–48, 1997.

[6] M. A. Bolender and D. B. Doman, “Nonlinear Longitudinal DynamicalModel of an Air-Breathing Hypersonic Vehicle,” AIAA J. SpacecraftRockets, vol. 44, no. 2, pp. 374–387, 2007.

[7] Y. Rahman, A. Xie, and D. S. Bernstein, “Retrospective Cost AdaptiveControl: Pole Placement, Frequency Response, and Connections withLQG Control,” IEEE Contr. Sys. Mag., vol. 37, pp. 28–69, Oct. 2017.

[8] J. B. Hoagg and D. S. Bernstein, “Retrospective Cost Model ReferenceAdaptive Control for Nonminimum-Phase Systems,” J. Guid. Contr.Dyn., vol. 35, pp. 1767–1786, 2012.

[9] A. Ansari and D. S. Bernstein, “Retrospective Cost Adaptive Controlof Generic Transport Model Under Uncertainty and Failure,” J. Aero.Info. Sys., vol. 14, no. 3, pp. 123–174, 2017.

[10] Z. Zhao, G. Cruz, and D. S. Bernstein, “Adaptive Spacecraft AttitudeControl Using Single-Gimbal Control Moment Gyroscopes WithoutSingularity Avoidance,” J. Guid. Contr. Dyn. available online, 2019.

[11] A. Xie and D. S. Bernstein, “Adaptive Feedback Noise Control forWide, Square, and Tall Systems,” in Proc. Amer. Contr. Conf., July2019, pp. 435–440.

[12] S. A. U. Islam, A. Xie, and D. S. Bernstein, “Adaptive Control of Sys-tems with Unknown Nonminimum-Phase Zeros Using Cancellation-Based Pseudo-identification,” in Proc. Amer. Contr. Conf., July 2019,pp. 441–446.

[13] S. A. U. Islam, A. L. Bruce, T. W. Nguyen, I. Kolmanovsky, andD. Bernstein, “Adaptive Flight Control with Unknown Time-VaryingUnstable Zero Dynamics,” AIAA Scitech, 2020, AIAA 2020-0839.

[14] A. L. Bruce, A. Goel, and D. S. Bernstein, “Recursive Least Squareswith Matrix Forgetting,” in Proc. Amer. Contr. Conf., July 2020.

1392

Authorized licensed use limited to: Michigan State University. Downloaded on August 01,2020 at 15:23:27 UTC from IEEE Xplore. Restrictions apply.