
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 45, NO. 6, JUNE 2000 1161

Technical Notes and Correspondence

Almost Sure Rate of Convergence of the Parameter Estimates in Stochastic Approximation Algorithm

Miloje S. Radenkovic and Anthony N. Michel

Abstract—This paper presents novel results on the almost sure convergence of the parameter estimates in the stochastic approximation algorithm. It is proved that, surprisingly, this rate has the same order as the best one established for the least squares algorithm. Although we consider the case of self-tuning controllers, the presented results can easily be extended to some other adaptive processes.

Index Terms—Almost sure convergence, martingale, parameter estimation, self-tuning, stochastic approximation.

I. INTRODUCTION

In this paper we address the problem of the almost sure convergence rate of the stochastic approximation (SA) parameter estimator. Despite its computational simplicity and strong theoretical foundation, the SA-type algorithm is considered inferior to the least squares (LS) algorithm.

Fundamental results regarding almost sure convergence of the LS algorithm in the adaptive systems context are presented in [1]–[3], [5], [6]. It is shown that for the LS-based self-tuning controller, the averaged regret satisfies

$$\frac{1}{n}\sum_{t=1}^{n}\left[\tilde\theta(t)^{T}\varphi(t)\right]^{2} = O\!\left(\frac{\log n}{n}\right) \quad \text{(a.s.), as } n\to\infty \tag{1}$$

where $\tilde\theta(t)$ is the parameter estimation error, while $\varphi(t)$ is the regressor. This is the best convergence rate for the averaged regret generated by the LS algorithm [2], [3], [5]. Reference [2] presents the best possible convergence rate for the parameter estimation error in LS-based self-tuning control. This rate is the same as that in the law of the iterated logarithm, and it is given by

$$\|\tilde\theta(n)\|^{2} = O\!\left(\frac{\log\log n}{n}\right) \quad \text{(a.s.), as } n\to\infty. \tag{2}$$

Convergence rates (1) and (2) are not expected to be achievable with the stochastic approximation-based adaptive algorithm. This problem has been recently revisited in [8]. It is shown that the averaged regret has the same order as the one given by (1). Whether or not the convergence rate given by (2) is achievable with the SA algorithm is still an open problem.

The focus of the paper is on the almost sure (a.s.) convergence rate of the parameter estimates generated by the SA-type algorithm. The solution to this problem is given for the case of self-tuning control. We prove that the convergence rate of the parameter estimation error has the same order as that defined by (2). The key ideas of the proof consist of

Manuscript received March 2, 1999; revised October 9, 1999. Recommended by Associate Editor, Q. Zhang.

M. S. Radenkovic is with the Department of Electrical Engineering, University of Colorado at Denver, Denver, CO 80217-3364 USA.

A. N. Michel is with the Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN 46556 USA (e-mail: [email protected]).

Publisher Item Identifier S 0018-9286(00)04171-4.

the use of techniques based on backward recursions in the estimates and appropriate application of the martingale limit theory and the almost sure invariance principle. Compared with the conditions required in [2], we adopt somewhat more restrictive assumptions regarding the system noise and the reference signal. At the same time, we only treat the case of all-pole dynamical systems. The paper is organized as follows. The problem statement is given in Section II. The convergence rates of the averaged regret and of the parameter estimates are presented in Section III.

II. PROBLEM STATEMENT

Consider the following single-input/single-output discrete-time system

$$A(q^{-1})y(t+1) = b\,u(t) + \omega(t+1), \quad t \ge 0 \tag{3}$$

where $\{y(t)\}$, $\{u(t)\}$, and $\{\omega(t)\}$ are the system output, input, and disturbance sequences, respectively, while $q^{-1}$ is the unit delay operator. In (3) $b \ne 0$, and the polynomial $A(q^{-1})$ is given by

$$A(q^{-1}) = 1 + a_1 q^{-1} + \cdots + a_{n_A} q^{-n_A}, \quad n_A \ge 0.$$

We assume that the parameters $b$ and $a_i$, $i = 1, \dots, n_A$, are unknown. Our objective is to design a causal control sequence $\{u(t)\}$, $t \ge 0$, such that the closed-loop system is stable and the following cost function is minimized:

$$J = \lim_{n\to\infty}\frac{1}{n}\sum_{t=1}^{n}\left(y(t+1) - y^{*}(t+1)\right)^{2} \tag{4}$$

where $\{y^{*}(t)\}$ is a reference sequence to be tracked. For the purpose of our analysis we assume that $y^{*}(t)$ is generated by the following signal model:

$$y^{*}(t) = P(q^{-1})v(t) \tag{5}$$

where $v(t)$ is a stochastic process, and $P(q^{-1}) = 1 + p_1 q^{-1} + \cdots + p_{n_P} q^{-n_P}$, $n_P \ge 0$. At the same time we assume that $y^{*}(t)$ is known one step ahead, i.e., $y^{*}(t+1)$ is available at time $t$. The processes $\{\omega(t)\}$ and $\{v(t)\}$ are mutually independent, and they are defined on the underlying probability space $(\Omega, \mathcal{F}, \mathcal{P})$ with an increasing family of $\sigma$-fields $\mathcal{F}_t = \sigma\{\omega(0), \dots, \omega(t); v(0), \dots, v(t+1)\}$, $t \ge 0$. Without loss of generality we assume that $y(t) = u(t) = \omega(t) = v(t) = 0$, $\forall\, t < 0$. For the sequences $\{\omega(t)\}$ and $\{v(t)\}$ we adopt the following assumption.

Assumption A1): Define $x(t)^{T} = (\omega(t), v(t+1))$. Then $\{x(t), \mathcal{F}_t\}$ is a martingale difference sequence, i.e., $x(t)$ is $\mathcal{F}_t$-measurable and $E\{x(t+1) \mid \mathcal{F}_t\} = 0$ (a.s.). The sequence $\{x(t)\}$ satisfies

$$E\{x(t+1)x(t+1)^{T} \mid \mathcal{F}_t\} = \begin{bmatrix}\sigma_\omega^{2} & 0\\ 0 & \sigma_v^{2}\end{bmatrix}, \quad 0 < \sigma_\omega, \sigma_v < \infty \quad \text{(a.s.)}$$

and

$$\sup_{t\ge 0} E\{\|x(t+1)\|^{\beta} \mid \mathcal{F}_t\} < \infty \quad \text{(a.s.) for some } \beta > 4.$$
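The signal model (5) and Assumption A1) can be illustrated with a small simulation sketch. The polynomial coefficients and noise variances below are hypothetical, and Gaussian noise is used because it has finite moments of every order, so the moment condition "for some $\beta > 4$" holds automatically:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
p = np.array([1.0, 0.5, 0.25])   # hypothetical P(q^{-1}) = 1 + 0.5 q^{-1} + 0.25 q^{-2}, n_P = 2

v = rng.normal(0.0, 1.0, n)      # {v(t)}: zero-mean martingale differences (here i.i.d.), sigma_v = 1
omega = rng.normal(0.0, 0.5, n)  # {omega(t)}: independent of {v(t)}, sigma_omega = 0.5

# Reference signal (5): y*(t) = P(q^{-1}) v(t), a moving average of v
y_star = np.convolve(v, p)[:n]

# Gaussian x(t)^T = (omega(t), v(t+1)) has finite conditional moments of every
# order, so Assumption A1)'s condition with beta > 4 is satisfied.
```

Since $y^{*}$ is a finite moving average of $v$, each sample depends on only $n_P + 1$ past values of $v$, which is what makes $\bar\varphi(t)$ in Section III a moving average process.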

0018–9286/00$10.00 © 2000 IEEE


The previous assumption implies two important relations which will be useful in subsequent developments. In fact, by the conditional Chebyshev inequality we have, for any $\epsilon > 4/\beta$,

$$\sum_{t=1}^{\infty}\mathcal{P}\left(\|x(t+1)\|^{4} \ge t^{\epsilon} \mid \mathcal{F}_t\right) = \sum_{t=1}^{\infty}\mathcal{P}\left(\|x(t+1)\|^{\beta} \ge t^{\epsilon\beta/4} \mid \mathcal{F}_t\right) \le \sum_{t=1}^{\infty} t^{-\epsilon\beta/4}\, E\{\|x(t+1)\|^{\beta} \mid \mathcal{F}_t\} < \infty \quad \text{(a.s.)}.$$

Then by using the Borel–Cantelli lemma one obtains

$$\|x(t+1)\|^{4} = O(t^{\epsilon}) \quad \text{(a.s.)} \quad \forall\,\epsilon \in (4/\beta, 1) \tag{6}$$

where $x(t)^{T} = (\omega(t), v(t+1))$. The second statement which follows from Assumption A1) is

$$\limsup_{n\to\infty}\frac{1}{n}\sum_{t=1}^{n}\|x(t+1)\|^{4} \le k_x < \infty \quad \text{(a.s.)}. \tag{7}$$

This relation can be proved by considering the sequence $\{\eta(i)\}$, $i \ge 0$, defined by

$$\eta(i) = \|x(i)\|^{4} - E\{\|x(i)\|^{4} \mid \mathcal{F}_{i-1}\}.$$

Since $\{\eta(i), \mathcal{F}_i\}$ is a martingale difference sequence satisfying

$$\sum_{i=0}^{\infty}\frac{E\{|\eta(i+1)|^{c} \mid \mathcal{F}_i\}}{i^{c}} < \infty, \quad \forall\, c \in \left(1, \min\left(\frac{\beta}{4}, 2\right)\right)$$

by the Martingale Convergence theorem we get

$$\lim_{n\to\infty}\sum_{i=0}^{n}\frac{\eta(i+1)}{i} < \infty \quad \text{(a.s.)}$$

from where relation (7) directly follows by Kronecker's lemma. Denote the unknown parameters in (3) by

$$\theta_0^{T} = [a_1, \dots, a_{n_A}, b] \tag{8}$$

and the corresponding regressor by

$$\varphi(t)^{T} = [-y(t), \dots, -y(t-n_A+1), u(t)]. \tag{9}$$

It is well known that the nonadaptive control law optimal in the sense of (4) is given by $\theta_0^{T}\varphi(t) = y^{*}(t+1)$. In the adaptive case we use the following $\mathcal{F}_t$-measurable control sequence $\{u(t)\}$:

$$\text{solve for } u(t){:}\quad \theta(t)^{T}\varphi(t) = y^{*}(t+1) \tag{10}$$

where $\theta(t)$ is an estimate of the unknown parameter vector $\theta_0$, and it is generated by the following gradient-type algorithm:

$$\theta(t+1) = \theta(t) + \mu\,\frac{\varphi(t)}{d(t)}\,[y(t+1) - y^{*}(t+1)], \quad 0 < \mu < \infty \tag{11}$$

$$d(t) = \max\left\{\alpha\max_{1\le\tau\le t}\|\varphi(\tau)\|^{2},\; r(t)^{1-\eta},\; t\right\}, \quad 0 < \alpha < \infty,\; 0 < \eta < \tfrac{1}{2} \tag{12}$$

$$r(t) = 1 + \sum_{i=1}^{t}\|\varphi(i)\|^{2}, \quad r(0) = 1 \tag{13}$$

with an arbitrary nonrandom initial condition $\theta(0)$ satisfying $\|\theta(0)\| < \infty$ and $b(0) \ne 0$. In this paper we assume that the estimate of $b$ satisfies $b(t) \ne 0$, $\forall\, t \ge 1$, i.e., (10) is solvable $\forall\, t \ge 1$. It is well known [1], [7] that if $\{\omega(t)\}$ is a sequence of independent random samples with continuous distributions, then the estimate $b(t) \ne 0$, $\forall\, t \ge 1$ (a.s.). Without introducing the previous restrictions on $\{\omega(t)\}$, solvability of (10) can be guaranteed by slightly modifying $b(t)$ so that $u(t)$ is well defined from (10). In the case when the persistent excitation (PE) condition is satisfied and $\lim_{t\to\infty}\theta(t) = \theta_0$, after some finite time such a modification is not needed [1]. Since this paper is concerned with the asymptotic convergence rate of the parameter estimates, we do not impose additional restrictions on $\{\omega(t)\}$ or modify $b(t)$, and simply assume that (10) is solvable for all $t \ge 1$.
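The control law (10) and the estimator (11)–(13) can be sketched as a small closed-loop simulation. The plant coefficients, gain $\mu$, constant $\alpha$, and the crude guard keeping $b(t)$ away from zero are illustrative assumptions, not the paper's prescriptions; the $r(t)^{1-\eta}$ term of (12) is omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(1)
n_A, b_true = 2, 1.0
a_true = np.array([-0.4, 0.1])                 # hypothetical stable A(q^{-1}) coefficients a_1, a_2
theta_0 = np.concatenate([a_true, [b_true]])   # theta_0^T = [a_1, ..., a_{n_A}, b]
mu, alpha = 2.0, 10.0                          # gain mu in (11) and constant alpha in (12)

N = 5000
y = np.zeros(N + 1)
theta = np.array([0.0, 0.0, 0.5])              # theta(0), with b(0) != 0
omega = rng.normal(0.0, 0.3, N + 1)            # disturbance {omega(t)}
y_star = rng.normal(0.0, 1.0, N + 1)           # reference, known one step ahead
max_phi_sq = 1.0

for t in range(2, N):
    if abs(theta[-1]) < 0.1:                   # crude guard: the paper assumes b(t) != 0
        theta[-1] = 0.1
    phi_y = np.array([-y[t], -y[t - 1]])
    # control law (10): solve theta(t)^T phi(t) = y*(t+1) for u(t)
    u_t = (y_star[t + 1] - theta[:n_A] @ phi_y) / theta[-1]
    phi = np.append(phi_y, u_t)                # phi(t)^T = [-y(t), -y(t-1), u(t)]
    # plant (3): y(t+1) = -a_1 y(t) - a_2 y(t-1) + b u(t) + omega(t+1)
    y[t + 1] = -a_true @ np.array([y[t], y[t - 1]]) + b_true * u_t + omega[t + 1]
    # gain sequence in the spirit of (12); for large t it reduces to d(t) = t, cf. (29)
    max_phi_sq = max(max_phi_sq, float(phi @ phi))
    d_t = max(alpha * max_phi_sq, t)
    # SA update (11)
    theta = theta + mu * phi / d_t * (y[t + 1] - y_star[t + 1])
```

The $\alpha\max_{\tau}\|\varphi(\tau)\|^{2}$ term in the gain sequence is what keeps the early updates small when the regressor is large; once the loop stabilizes, $d(t) = t$ dominates.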

We now list some of the known results in self-tuning control theory needed in the sequel. It is not difficult to see that from (3) and (10), the closed-loop adaptive system can be written in the form

$$y(t+1) - y^{*}(t+1) = -z(t) + \omega(t+1) \tag{14}$$

where

$$z(t) = \tilde\theta(t)^{T}\varphi(t), \quad \tilde\theta(t) = \theta(t) - \theta_0. \tag{15}$$

Theorem 2.1: Consider (3) together with the algorithm (10)–(13). If Assumption A1) is valid, the following properties hold (a.s.):

i)

$$\lim_{n\to\infty}\frac{1}{n}\sum_{i=0}^{n} z(i)^{2} = 0 \tag{16}$$

ii)

$$\lim_{n\to\infty}\frac{1}{n}\,r(n) = c_0, \quad 0 < c_0 < \infty \tag{17}$$

iii)

$$\lim_{n\to\infty}\sum_{i=k}^{n}\|\theta(i) - \theta(i-k)\|^{2} < \infty, \quad \text{for all finite } k \tag{18}$$

iv)

$$\lim_{n\to\infty}\tilde\theta(n) = 0 \tag{19}$$

v)

$$\lim_{n\to\infty}\max_{1\le r\le n}\|\varphi(r)\|^{2}/n = 0. \tag{20}$$

Proof: Although the normalization sequence $d(t)$ in (11) is different from the standard one, the proof of the above statements is exactly the same as for the case $d(t) = r(t)$ and can be found in a variety of sources (see, for example, [1]). We only comment on statement (20). From (9) and (14) we can derive

$$\max_{1\le r\le t}\|\varphi(r)\|^{2} = O\left\{\max_{1\le\tau\le t} z(\tau)^{2}\right\} + O\left\{\max_{1\le\tau\le t}\left\{\omega(\tau+1)^{2} + y^{*}(\tau+1)^{2}\right\}\right\}. \tag{21}$$

Then (20) follows from (5), (6), and (16).

III. THE CONVERGENCE RATE OF THE PARAMETER ESTIMATES

In this section we evaluate the convergence rate of the parameter estimates generated by the SA algorithm. We show that the order of this rate is the same as the one established for the LS algorithm. We prove the above statement under the condition that the scaling factor $\mu$ in (11) is large enough. In order to quantify the size of the parameter


$\mu$, we need a few simple developments. By using (14) it is easy to see that the regressor $\varphi(t)$ defined by (9) can be represented in the form

$$\varphi(t) = \varphi_z(t) + \bar\varphi(t) \tag{22}$$

where

$$\varphi_z(t)^{T} = \left[z(t-1), \dots, z(t-n_A),\; -\frac{A(q^{-1})}{b}\,z(t)\right] \tag{23}$$

and

$$\bar\varphi(t)^{T} = \left[-y^{*}(t) - \omega(t), \dots, -y^{*}(t-n_A+1) - \omega(t-n_A+1),\; \frac{A(q^{-1})}{b}\left(y^{*}(t+1) + \omega(t+1)\right)\right]. \tag{24}$$

Define the following matrix:

$$W = E\{\bar\varphi(t)\bar\varphi(t)^{T} \mid \mathcal{F}_{t-k}\}, \quad k \ge n_A + n_P + 1 \tag{25}$$

where $n_A$ and $n_P$ are the degrees of the polynomials $A(q^{-1})$ and $P(q^{-1})$, respectively. Since $k$ is chosen so that $\bar\varphi(t)$ is not $\mathcal{F}_{t-k}$-measurable, Assumption A1) implies that $W$ is a positive definite matrix. Next, we define the size of the parameter $\mu$ in (11).

Assumption A2): The gain factor $\mu$ in the estimation algorithm satisfies

$$\mu\lambda^{*} \ge 1$$

where $\lambda^{*}$ is the minimal eigenvalue of the matrix $W$ given by (25).

For future reference we state the following result.

Theorem 3.1: Let Assumptions A1) and A2) hold. Then as $n\to\infty$

i)

$$\sum_{i=0}^{n} z(i)^{2} = O(\log n) \quad \text{(a.s.)} \tag{26}$$

ii)

$$\sum_{i=0}^{n}\|\tilde\theta(i)\|^{2} = O(\log n) \quad \text{(a.s.)}. \tag{27}$$

Proof: The proof of the theorem is given in [8].

Theorem 3.2: Let Assumptions A1) and A2) hold. Then as $n\to\infty$

$$\|\tilde\theta(n)\|^{2} = O\!\left(\frac{\log\log n}{n}\right) \quad \text{(a.s.)}. \tag{28}$$

Proof: By using (17) and (20) we conclude that there exists $t_0$, $t_0 < \infty$, such that the gain sequence $d(t)$ in (11) is given by

$$d(t) = t, \quad \forall\, t \ge t_0 \quad \text{(a.s.)}. \tag{29}$$

Then from (11), (14), and (29), we have

$$\tilde\theta(n+1) = \tilde\theta(n) - \mu\,\frac{\varphi(n)\varphi(n)^{T}\tilde\theta(n)}{n} + \mu\,\frac{\varphi(n)\omega(n+1)}{n} \quad \text{(a.s.)} \tag{30}$$

for all $n \ge t_0$. Using the decomposition (22) of the regressor $\varphi(n)$, and substituting in the previous relation, we derive

$$\tilde\theta(n+1) = \tilde\theta(n) - \mu\,\frac{\bar\varphi(n)\bar\varphi(n)^{T}\tilde\theta(n)}{n} - \mu\,\frac{f(n)}{n} + \mu\,\frac{\varphi(n)\omega(n+1)}{n} \quad \text{(a.s.)} \tag{31}$$

where

$$f(n) = \left[\varphi_z(n)\varphi_z(n)^{T} + \varphi_z(n)\bar\varphi(n)^{T} + \bar\varphi(n)\varphi_z(n)^{T}\right]\tilde\theta(n). \tag{32}$$

Further transformation of (31) gives

$$\tilde\theta(n+1) = \left(I - \mu\,\frac{W}{n}\right)\tilde\theta(n) - \mu\,\frac{f(n) - g(n)}{n} + \mu\,\frac{\varphi(n)\omega(n+1)}{n} \quad \text{(a.s.)} \tag{33}$$

where

$$g(n) = \left(W - \bar\varphi(n)\bar\varphi(n)^{T}\right)\tilde\theta(n) \tag{34}$$

while $W$ is the positive definite matrix defined by (25). Since $W$ is symmetric, there exists an orthogonal matrix $D$ such that $W = DLD^{T}$, where $L$ is a diagonal matrix with the eigenvalues of $W$ on the diagonal. With this in mind, (33) can be written in the form

$$\psi(n+1) = \left(I - \mu\,\frac{L}{n}\right)\psi(n) - \mu\,\frac{D^{T}(f(n) - g(n))}{n} + \mu\,\frac{D^{T}\varphi(n)\omega(n+1)}{n}, \quad n \ge t_0 \quad \text{(a.s.)} \tag{35}$$

with

$$\psi(n) = D^{T}\tilde\theta(n). \tag{36}$$

Denote by $\psi_i(n)$, $i = 1, \dots, n_A+1$, the $i$th component of the vector $\psi(n)$. Then

$$\psi_i(n+1) = \left(1 - \frac{\mu\lambda_i}{n}\right)\psi_i(n) - \mu\,\frac{l_i D^{T}(f(n) - g(n))}{n} + \mu\,\frac{l_i D^{T}\varphi(n)\omega(n+1)}{n} \quad \text{(a.s.)} \tag{37}$$

where $\lambda_i > 0$ is the $i$th eigenvalue of the matrix $L$ (or $W$). In (37) the row vector $l_i$ has the same dimension as $\psi(n)$. All elements of $l_i$ are zero, except the $i$th component, which is equal to one. Note once more that due to (29), relation (30) and consequently (31)–(37) are valid for $n \ge t_0$, $0 < t_0 < \infty$. From (37) one can derive

$$\psi_i(n+1) = R_i(n)\psi_i(t_0) - \mu R_i(n)\sum_{t=t_0}^{n}\frac{1}{t R_i(t)}\, l_i D^{T}(f(t) - g(t)) + \mu R_i(n)\sum_{t=t_0}^{n}\frac{1}{t R_i(t)}\, l_i D^{T}\varphi(t)\omega(t+1) \quad \text{(a.s.)} \tag{38}$$

where

$$R_i(n) = \prod_{t=t_0}^{n}\left(1 - \frac{\mu\lambda_i}{t}\right). \tag{39}$$

Next we bound the second term on the RHS of (38). The definition of $f(t)$ given by (32) and the Schwarz inequality yield

$$\sum_{t=1}^{n}\|f(t)\| = O\left(\sum_{t=1}^{n}\|\varphi_z(t)\|^{2}\|\tilde\theta(t)\|\right) + O\left(\left(\sum_{t=1}^{n}\|\varphi_z(t)\|^{2}\right)^{\frac12}\left(\sum_{t=1}^{n}\|\bar\varphi(t)\|^{2}\|\tilde\theta(t)\|^{2}\right)^{\frac12}\right). \tag{40}$$


Since $\tilde\theta(t) \to 0$ as $t\to\infty$ (a.s.), the previous relation and (23) imply

$$\sum_{t=1}^{n}\|f(t)\| = o(r_z(n)) + O\left(r_z(n)^{\frac12}\, r_{\tilde\theta}(n)^{\frac12}\max_{1\le r\le n}\|\bar\varphi(r)\|\right) \quad \text{(a.s.)} \tag{41}$$

with $r_z(n)$ and $r_{\tilde\theta}(n)$ given by (13) when $\varphi(t) = z(t)$ and $\varphi(t) = \|\tilde\theta(t)\|$, respectively. Noticing that by (6) and (24), $\max_{1\le r\le n}\|\bar\varphi(r)\|^{2} = O(n^{\epsilon})$ (a.s.), $\epsilon \in \left(\frac{2}{\beta}, \frac{1}{2}\right)$, from (26), (27), and (41) one obtains

$$\sum_{t=1}^{n}\|f(t)\| = O(n^{\epsilon}\log n), \quad \frac{1}{\beta} < \epsilon < \frac{1}{4} \quad \text{(a.s.)}. \tag{42}$$

After applying Lemma A2 in the Appendix, from (39) we get

$$O\left((n - t_0)^{-\mu\lambda_i}\right) \le R_i(n) \le O\left((n+1)^{-\mu\lambda_i}\right). \tag{43}$$

Recall now that by Assumption A2)

$$\mu\lambda_i \ge \mu\lambda^{*} \ge 1, \quad 1 \le i \le n_A + 1.$$

Hence (42) and (43) give

$$R_i(n)\sum_{t=t_0}^{n}\frac{1}{t R_i(t)}\, l_i D^{T} f(t) = O\left(\frac{1}{(n+1)^{\mu\lambda_i}}\sum_{t=t_0}^{n}\frac{(t - t_0)^{\mu\lambda_i}}{t}\,\|f(t)\|\right) = O\left(\frac{(n - t_0)^{\mu\lambda_i - 1}}{(n+1)^{\mu\lambda_i}}\sum_{t=t_0}^{n}\|f(t)\|\right) = O\left(\frac{\log n}{n^{3/4}}\right) \quad \text{(a.s.)}. \tag{44}$$

The term $\mu R_i(n)\sum_{t=t_0}^{n}\frac{1}{t R_i(t)}\, l_i D^{T} g(t)$ is bounded by Lemma A1 in the Appendix. Next we bound the third term on the RHS of (38). Define $T_{in}$ as follows:

$$T_{in} = \sum_{t=1}^{n}\frac{1}{R_i(t)\, t^{\mu\lambda_i}}\, l_i D^{T}\varphi(t)\,\omega(t+1). \tag{45}$$

Then by virtue of Assumption A2) the following chain of inequalities follows:

$$\left|\sum_{t=t_0}^{n}\frac{1}{t R_i(t)}\, l_i D^{T}\varphi(t)\omega(t+1)\right| = \left|\sum_{t=t_0}^{n} t^{\mu\lambda_i - 1}\left(T_{it} - T_{i(t-1)}\right)\right|$$
$$= \left|\sum_{t=t_0}^{n}\left(t^{\mu\lambda_i - 1}T_{it} - (t-1)^{\mu\lambda_i - 1}T_{i(t-1)}\right) + \sum_{t=t_0}^{n}\left[(t-1)^{\mu\lambda_i - 1} - t^{\mu\lambda_i - 1}\right]T_{i(t-1)}\right|$$
$$\le \left|n^{\mu\lambda_i - 1}T_{in} - (t_0 - 1)^{\mu\lambda_i - 1}T_{i(t_0 - 1)}\right| + \max_{t_0\le\tau\le n}\left|T_{i(\tau-1)}\right|\left(n^{\mu\lambda_i - 1} - (t_0 - 1)^{\mu\lambda_i - 1}\right)$$
$$= O\left\{n^{\mu\lambda_i - 1}(n\log\log n)^{\frac12}\right\} \quad \text{(a.s.) as } n\to\infty \tag{46}$$

where we used the fact that $t_0$ is finite, and $T_{in}$ is bounded in Lemma A3 in the Appendix. Finally, (43) and (46) yield

$$\left|R_i(n)\sum_{t=t_0}^{n}\frac{1}{t R_i(t)}\, l_i D^{T}\varphi(t)\omega(t+1)\right| = O\left(\frac{1}{n}(n\log\log n)^{\frac12}\right) \quad \text{(a.s.)} \tag{47}$$

as $n\to\infty$. By using (43), (44), Lemma A1 in the Appendix, and (47), from (38) we obtain

$$|\psi_i(n+1)| = O\left(\left(\frac{\log\log n}{n}\right)^{\frac12}\right), \quad i = 1, \dots, n_A + 1 \quad \text{(a.s.)}. \tag{48}$$

Since $D^{T}$ is an orthogonal matrix, the statement of the theorem follows from (36). This completes the proof of the theorem.

From relations (43), (44), and Lemma A1, it is clear that the first two terms on the RHS of (38) have the order $O(n^{-1/2})$. Thus, the convergence rate of the parameter estimates is determined by the third term on the RHS of (38). Lemma A3 and (47) indicate that the convergence rate (28) is the best possible, since it is the same as that in the law of the iterated logarithm.

The proof of the above theorem heavily exploits the fact that the second component $\bar\varphi(t)$ of the regressor $\varphi(t)$ in (22) is a moving average process. As a consequence of this, we were able to define the constant matrix $W$, which plays a significant role in our analysis. This is the reason why we consider only the all-pole system model (3) and the moving-average-type reference signal (5). Such restrictions are not required for the LS algorithm [2]. It is not clear whether similar results hold for systems with zeros or for general ARMAX models with an arbitrary bounded reference signal $y^{*}(t)$.

It is worth noticing that instead of (12), the standard SA algorithm has the gain sequence $d(t) = r(t)$ [1]. Since by (17) $r(t) = O(t)$ (a.s.), it is obvious that the above results still hold in this case. Compared with (12), such a $d(t)$ would introduce additional constants in some of the calculations. Due to (29), the derivations are clearer when $d(t)$ is defined as in (12).

In the above results it is assumed that the reference signal is generated by a stochastic process and, at the same time, that the reference signal is known one step ahead. This assumption can be avoided if, in place of the reference signal, its one-step-ahead prediction is used. Such a prediction can be generated by another SA algorithm. In this case, the proof of Theorem 3.2 becomes much more complicated, involving additional algebraic details. For the sake of simplicity and clarity, we assume that the reference signal is known one step ahead.

It is also worth noting that the size of the algorithm gain $\mu$ plays an essential role in establishing the convergence rate given by (28). Assumption A2) requires that the gain $\mu$ in all directions is not smaller than the inverse of the corresponding eigenvalues of the matrix $W$. This fact coincides with the observation that the gain of the LS algorithm is asymptotically proportional to the inverse of the covariance matrix $W$. In particular, the LS estimator is given by

$$\theta(n+1) = \theta(n) + p(n)\varphi(n)[y(n+1) - y^{*}(n+1)], \quad n \ge 0$$
$$p(n)^{-1} = p(n-1)^{-1} + \varphi(n)\varphi(n)^{T}, \quad p(0)^{-1} > 0$$

with $\varphi(n)$ specified by (9). Since $\lim_{n\to\infty} p(n)^{-1}/n = W$ (a.s.), we have $\lim_{n\to\infty} W n\, p(n) = I$ (a.s.), where $W$ is defined by (25). By virtue of this relation, the gain matrix $p(n)$ asymptotically behaves as $p(n) \approx W^{-1}/n$.
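The LS recursion above and the asymptotic behavior $p(n) \approx W^{-1}/n$ can be checked numerically in a simplified identification-only setting (not the closed-loop controller). As an assumption for the sketch, the regressor $\varphi(n)$ is taken to be i.i.d. standard normal, so that $W = E\{\varphi\varphi^{T}\} = I$; the gain matrix is propagated via the matrix inversion lemma:

```python
import numpy as np

rng = np.random.default_rng(2)
d = 3
theta_true = np.array([0.3, -0.2, 1.0])   # hypothetical parameter vector
theta = np.zeros(d)
P = np.eye(d) * 100.0                     # p(0); equivalently p(0)^{-1} > 0

for n in range(5000):
    phi = rng.normal(0.0, 1.0, d)         # hypothetical regressor phi(n), W = I
    e = phi @ theta_true + rng.normal(0.0, 0.1) - phi @ theta   # prediction error
    # p(n)^{-1} = p(n-1)^{-1} + phi phi^T, implemented via the matrix inversion lemma
    Pphi = P @ phi
    P = P - np.outer(Pphi, Pphi) / (1.0 + phi @ Pphi)
    theta = theta + P @ phi * e

# For this regressor, n * p(n) should approach W^{-1} = I
print(np.round(5000 * P, 2))
```

The printed matrix should be close to the identity, illustrating that the LS gain in each direction is asymptotically the inverse of the corresponding eigenvalue of $W$, which is exactly the lower bound that Assumption A2) imposes on the SA gain $\mu$.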


IV. CONCLUSION

In this paper we have obtained new results on the (a.s.) convergence rate of the parameter estimates generated by the stochastic approximation algorithm. We proved that this rate has the same order as the best one established for the least squares algorithm. Self-tuning control with the all-pole system model and the moving-average-type reference signal has been analyzed. For further study, it would be of interest to generalize the results to a general ARMAX system model with an arbitrary bounded reference signal $y^{*}(t)$.

APPENDIX

Lemma A1: Under the same conditions as in Theorem 3.2, as $n\to\infty$, we have

$$\left|R_i(n)\sum_{t=t_0}^{n}\frac{1}{t R_i(t)}\, l_i D^{T} g(t)\right| \le O(n^{-1/2}) \quad \text{(a.s.)} \tag{49}$$

where all variables are the same as in (38).

Proof: From the definition of $g(t)$ in (34) we write

$$g(t) = p(t) + s(t) \tag{50}$$

where

$$p(t) = \left(W - \bar\varphi(t)\bar\varphi(t)^{T}\right)\tilde\theta(t-k) \tag{51}$$

$$s(t) = \left(W - \bar\varphi(t)\bar\varphi(t)^{T}\right)(\theta(t) - \theta(t-k)) \tag{52}$$

with $k$ defined by (25). Since $\tilde\theta(t-k)$ is $\mathcal{F}_{t-k}$-measurable and from (25) $E\{W - \bar\varphi(t)\bar\varphi(t)^{T} \mid \mathcal{F}_{t-k}\} = 0$ (a.s.), by Assumption A1) and the Local Martingale Convergence theorem (see [7, Lemma 2, p. 157]) we get

$$I_2(n) = \left|\sum_{t=t_0}^{n}\frac{1}{t R_i(t)}\, l_i D^{T} p(t)\right| = O\left\{\left(\sum_{t=t_0}^{n}\frac{1}{t^{2}R_i(t)^{2}}\,\|\tilde\theta(t-k)\|^{2}\right)^{\frac12}\left[\log\left(\sum_{t=t_0}^{n}\frac{1}{t^{2}R_i(t)^{2}}\,\|\tilde\theta(t-k)\|^{2}\right)\right]^{\alpha}\right\}, \quad \forall\,\alpha > \frac{1}{2} \quad \text{(a.s.)} \tag{53}$$


Then (43) and (53) give

$$I_2(n) = O\left\{n^{\mu\lambda_i - 1}\left(\sum_{t=t_0}^{n}\|\tilde\theta(t-k)\|^{2}\right)^{\frac12}\left[\log\left(n^{2(\mu\lambda_i - 1)}\sum_{t=t_0}^{n}\|\tilde\theta(t-k)\|^{2}\right)\right]^{\alpha}\right\}, \quad \forall\,\alpha > \frac{1}{2} \quad \text{(a.s.)} \tag{54}$$

from where by (27) it follows that for some $\delta > 0$

$$I_2(n) = O\left\{n^{\mu\lambda_i - 1}(\log n)^{(1+\delta)/2}\right\} \quad \text{(a.s.)}. \tag{55}$$

By using the Schwarz inequality and (52), we have

$$I_3(n) = \left|\sum_{t=t_0}^{n}\frac{1}{t R_i(t)}\, l_i D^{T} s(t)\right| = O\left(\left(\sum_{t=t_0}^{n}\frac{1}{t^{2}R_i(t)^{2}}\,\|\theta(t) - \theta(t-k)\|^{2}\right)^{\frac12}\left(\sum_{t=t_0}^{n}\left\|W - \bar\varphi(t)\bar\varphi(t)^{T}\right\|^{2}\right)^{\frac12}\right). \tag{56}$$

Since (7) and (24) imply $\sum_{t=t_0}^{n}\|\bar\varphi(t)\|^{4} = O(n)$ (a.s.), from (18), (43), and (56), it is clear that

$$I_3(n) = O\left(n^{\mu\lambda_i - 1}\cdot n^{\frac12}\right) \quad \text{(a.s.)}. \tag{57}$$

Finally, (49) follows from (55), (57), and (43). Thus the lemma is proved.

Lemma A2: Let $R(n)$ be defined as follows:

$$R(n) = \prod_{t=h}^{n}\left(1 - \frac{a}{t}\right) \tag{58}$$

where $a > 0$, $h = [a] + 1$, and $[a]$ is the integer part of $a$. Then the following holds:

$$\left(1 - \frac{a}{h}\right)\frac{e^{-a}}{(n-h)^{a}} \le R(n) \le \left(1 - \frac{a}{h}\right)\frac{(h+1)^{a}}{(n+1)^{a}}. \tag{59}$$

Proof: Observe that $\forall\, t \ge h+1$

$$-\frac{a}{t-h} \le -\frac{a}{t-a} \le \log\left(1 - \frac{a}{t}\right) \le -\frac{a}{t}. \tag{60}$$

Since

$$\sum_{t=h+1}^{n}\frac{1}{t-h} = 1 + \sum_{i=2}^{n-h}\frac{1}{i} \le 1 + \int_{1}^{n-h}\frac{dx}{x} = 1 + \log(n-h) \tag{61}$$

and

$$\sum_{t=h+1}^{n}\frac{1}{t} \ge \sum_{t=h+1}^{n}\int_{t}^{t+1}\frac{dx}{x} = \log(n+1) - \log(h+1) \tag{62}$$

(60) yields

$$-a - a\log(n-h) \le \sum_{t=h+1}^{n}\log\left(1 - \frac{a}{t}\right) \le -a\log(n+1) + a\log(h+1). \tag{63}$$


Using the definition of $R(n)$, from the previous relation we derive

$$-a - a\log(n-h) \le \log\frac{R(n)\,h}{h-a} \le -a\log(n+1) + a\log(h+1)$$

from where (59) directly follows.

Lemma A3: Let Assumptions A1) and A2) hold. Then as $n\to\infty$

$$\left|\sum_{t=1}^{n}\frac{1}{R_i(t)\, t^{\mu\lambda_i}}\, l_i D^{T}\varphi(t)\omega(t+1)\right| = O\left\{(n\log\log n)^{\frac12}\right\} \quad \text{(a.s.)} \tag{64}$$

where all variables are the same as in (38).

Proof: First we observe the following facts.

i) Equation (43) implies that $R_i(t)^{-1}/t^{\mu\lambda_i} \le O(1)$.

ii) Equations (6), (22), and (26) yield $\|\varphi(t)\|^{2} \le O(t^{\epsilon})$, $\forall\,\epsilon \in \left(\frac{2}{\beta}, \frac{1}{2}\right)$ (a.s.).

Then it is clear that

$$\left(\|\varphi(t)\omega(t+1)\| / (R_i(t)\, t^{\mu\lambda_i})\right)^{2} \le O(t^{\epsilon}), \quad \forall\,\epsilon \in \left(\frac{4}{\beta}, 1\right) \quad \text{(a.s.)} \tag{65}$$

and by (17)

$$\sum_{t=1}^{n}\left(\|\varphi(t)\| / (R_i(t)\, t^{\mu\lambda_i})\right)^{2} = O(n) \quad \text{(a.s.)}. \tag{66}$$

Based on the last two relations we can use the theorem on the Almost Sure Invariance Principle (see [4, Th. 3.1, p. 122]) and obtain the statement of the lemma.
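As a quick numerical sanity check of Lemma A2, the two-sided bound (59) on the product (58) can be verified directly; the values of $a$ and $n$ below are arbitrary:

```python
import math

def R(n, a):
    """Product (58): R(n) = prod_{t=h}^{n} (1 - a/t), with h = [a] + 1."""
    h = int(a) + 1
    r = 1.0
    for t in range(h, n + 1):
        r *= 1.0 - a / t
    return r

for a in (0.7, 1.5, 3.2):
    h = int(a) + 1
    for n in (h + 5, 50, 1000):
        # bound (59): (1 - a/h) e^{-a} / (n-h)^a <= R(n) <= (1 - a/h)(h+1)^a / (n+1)^a
        lo = (1 - a / h) * math.exp(-a) / (n - h) ** a
        hi = (1 - a / h) * (h + 1) ** a / (n + 1) ** a
        assert lo <= R(n, a) <= hi
```

Both bounds decay as $n^{-a}$, which is what gives the $O((n-t_0)^{-\mu\lambda_i})$ and $O((n+1)^{-\mu\lambda_i})$ envelopes for $R_i(n)$ in (43).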

REFERENCES

[1] H. F. Chen and L. Guo, Identification and Stochastic Adaptive Control. Boston: Birkhäuser, 1991.

[2] L. Guo, "Further results on least-squares based adaptive minimum variance control," SIAM J. Contr. Optimization, vol. 32, pp. 187–212, 1994.

[3] L. Guo and H. F. Chen, "The Åström–Wittenmark self-tuning regulator revisited and ELS-based adaptive trackers," IEEE Trans. Automat. Contr., vol. 36, pp. 802–812, 1991.

[4] N. C. Jain, K. Jogdeo, and W. F. Stout, "Upper and lower functions for martingales and mixing processes," Annals Probability, vol. 3, pp. 119–145, 1975.

[5] T. L. Lai, "Asymptotically efficient adaptive control in stochastic regression models," Adv. Appl. Math., vol. 7, pp. 23–45, 1986.

[6] T. L. Lai and C. Z. Wei, "Least squares estimates in stochastic regression models with applications to identification and control of dynamic systems," Ann. Statist., vol. 10, pp. 154–166, 1982.

[7] S. P. Meyn and P. E. Caines, "The zero divisor problem of multivariable stochastic adaptive control," Syst. Contr. Letters, vol. 6, pp. 235–238, 1985.

[8] M. S. Radenkovic, "Further results on almost sure convergence of the stochastic gradient algorithm," Int. J. Adaptive Contr. Signal Processing, 1999.

Low-Order Controller Design for SISO Systems Using Coprime Factors and LMI

Shaopeng Wang and Joe H. Chow

Abstract—This paper develops a low-order controller design method for linear continuous time-invariant single-input, single-output systems requiring only the solution of a convex optimization problem. The technique integrates several well-known results in control theory. An important step is the use of coprime factors so that, based on strictly positive real functions, feedback stabilization using low-order controllers becomes a zero-placement problem which is convex. From this result, we develop algorithms to solve two optimal control problems.

Index Terms—Bounded real lemma, coprime factorization, linear matrix inequalities, low-order controller design, strictly positive real functions.

I. INTRODUCTION

A simple, systematic, and reliable method for the design of a low-order stabilizing controller for a linear time-invariant system to optimize a certain $H_2$, $H_\infty$, or pole-placement performance index has eluded control systems researchers for many years [1]. The purpose of this paper is to develop a new control design method for single-input, single-output (SISO) systems so that some of these low-order controller design problems can be solved with simple and reliable algorithms.

It is well known that the design of low-order controllers results in a control problem involving either a nonconvex rank condition [2] or bi-affine matrix inequalities (BMI's) [3], which are nonconvex optimization problems and cannot be solved in polynomial time. Instead of solving the BMI problem directly, several researchers [4]–[9] have shown that low-order controllers can be obtained by iteratively solving linear matrix inequality (LMI) subproblems, which are convex and readily solved using existing semidefinite programming software [10]. Some of these techniques have been applied successfully to practical design problems. However, global convergence has not been established for any of these iterative methods.

In this paper, we will formulate low-order controller design problems as convex optimization problems, without requiring an iterative solution. The design does not involve any iterations, so no convergence result is needed. However, in order to make the problem convex, our controller solution set will not be the entire solution set. The new design formulation is achieved by integrating several well-known results, namely, strictly positive real (SPR) functions [11], the Bounded-Real Lemma [12], and LMI's [12]. Only the theory is presented here. Interested readers are referred to [15] for additional results and design examples.

Remark: We will use bold variables to denote rational functions [like $\mathbf{g}(s)$] and nonbold variables to denote polynomials [like $g(s)$].

II. DESIGN OF LOW-ORDER STABILIZING CONTROLLERS

Consider a strictly proper linear time-invariant SISO system $\mathbf{g}(s)$ with the minimal state-space realization

$$\dot{x} = Ax + Bu, \quad y = Cx \tag{1}$$

Manuscript received September 9, 1999; revised March 20, 1998. Recommended by Associate Editor, E. Feron. This work was supported in part by the National Science Foundation under Grant DMI 9631919 and the General Electric Company.

The authors are with the Electrical, Computer, and Systems Engineering Department, Rensselaer Polytechnic Institute, Troy, NY 12180 USA.

Publisher Item Identifier S 0018-9286(00)04224-0.
