IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 45, NO. 6, JUNE 2000 1161
Technical Notes and Correspondence

Almost Sure Rate of Convergence of the Parameter Estimates in Stochastic Approximation Algorithm
Miloje S. Radenkovic and Anthony N. Michel
Abstract—This paper presents novel results on the almost sure convergence of the parameter estimates in the stochastic approximation algorithm. It is proved that, surprisingly, this rate has the same order as the best one established for the least squares algorithm. Although we consider the case of self-tuning controllers, the presented results can easily be extended to some other adaptive processes.

Index Terms—Almost sure convergence, martingale, parameter estimation, self-tuning, stochastic approximation.
I. INTRODUCTION
In this paper we address the problem of the almost sure convergence rate of the stochastic approximation (SA) parameter estimator. Despite its computational simplicity and strong theoretical foundation, the SA-type algorithm is considered inferior to the least squares (LS) algorithm.

Fundamental results regarding almost sure convergence of the LS algorithm in the adaptive systems context are presented in [1]–[3], [5], [6]. It is shown that for the LS-based self-tuning controller, the averaged regret satisfies
$$\frac{1}{n}\sum_{t=1}^{n}\bigl[\tilde\theta(t)^{T}\varphi(t)\bigr]^{2} = O\Bigl(\frac{\log n}{n}\Bigr)\ \text{(a.s.)},\ \text{as } n\to\infty \tag{1}$$

where $\tilde\theta(t)$ is the parameter estimation error, while $\varphi(t)$ is the regressor. This is the best convergence rate for the averaged regret generated by the LS algorithm [2], [3], [5]. Reference [2] presents the best possible convergence rate for the parameter estimation error in LS-based self-tuning control. This rate is the same as that in the law of the iterated logarithm, and it is given by

$$\|\tilde\theta(n)\|^{2} = O\Bigl(\frac{\log\log n}{n}\Bigr)\ \text{(a.s.)},\ \text{as } n\to\infty. \tag{2}$$

Convergence rates (1) and (2) are not expected to be achievable with the stochastic approximation-based adaptive algorithm. This problem has recently been revisited in [8], where it is shown that the averaged regret has the same order as the one given by (1). Whether or not the convergence rate given by (2) is achievable with the SA algorithm is still an open problem.

The focus of the paper is on the almost sure (a.s.) convergence rate of the parameter estimates generated by the SA-type algorithm. The solution to this problem is given for the case of self-tuning control. We prove that the convergence rate of the parameter estimation error has the same order as that defined by (2). The key ideas of the proof consist of
Manuscript received March 2, 1999; revised October 9, 1999. Recommended by Associate Editor, Q. Zhang.

M. S. Radenkovic is with the Department of Electrical Engineering, University of Colorado at Denver, Denver, CO 80217-3364 USA.

A. N. Michel is with the Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN 46556 USA (e-mail: [email protected]).
Publisher Item Identifier S 0018-9286(00)04171-4.
the use of techniques based on backward recursions in the estimates and appropriate application of martingale limit theory and the almost sure invariance principle. Compared with the conditions required in [2], we adopt somewhat more restrictive assumptions regarding the system noise and the reference signal. At the same time, we treat only the case of all-pole dynamical systems. The paper is organized as follows. The problem statement is given in Section II. The rates of convergence of the averaged regret and of the parameter estimates are presented in Section III.
II. PROBLEM STATEMENT
Consider the following single-input/single-output discrete-time system

$$A(q^{-1})y(t+1) = b\,u(t) + \omega(t+1), \quad t \ge 0 \tag{3}$$

where $\{y(t)\}$, $\{u(t)\}$, and $\{\omega(t)\}$ are the system output, input, and disturbance sequences, respectively, while $q^{-1}$ is the unit delay operator. In (3), $b \neq 0$, and the polynomial $A(q^{-1})$ is given by

$$A(q^{-1}) = 1 + a_{1}q^{-1} + \cdots + a_{n_{A}}q^{-n_{A}}, \quad n_{A} \ge 0.$$

We assume that the parameters $b$ and $a_{i}$, $i = 1, \ldots, n_{A}$, are unknown. Our objective is to design a causal control sequence $\{u(t)\}$, $t \ge 0$, such that the closed-loop system is stable and the following cost function is minimized:

$$J = \lim_{n\to\infty}\frac{1}{n}\sum_{t=1}^{n}\bigl(y(t+1) - y^{*}(t+1)\bigr)^{2} \tag{4}$$

where $\{y^{*}(t)\}$ is a reference sequence to be tracked. For the purpose of our analysis we assume that $y^{*}(t)$ is generated by the following signal model:

$$y^{*}(t) = P(q^{-1})v(t) \tag{5}$$

where $v(t)$ is a stochastic process, and $P(q^{-1}) = 1 + p_{1}q^{-1} + \cdots + p_{n_{P}}q^{-n_{P}}$, $n_{P} \ge 0$. At the same time we assume that $y^{*}(t)$ is known one step ahead, i.e., $y^{*}(t+1)$ is available at time $t$. The processes $\{\omega(t)\}$ and $\{v(t)\}$ are mutually independent, and they are defined on the underlying probability space $(\Omega, \mathcal{F}, P)$ with an increasing family of $\sigma$-fields $\mathcal{F}_{t} = \sigma\{\omega(0), \ldots, \omega(t);\ v(0), \ldots, v(t+1)\}$, $t \ge 0$. Without loss of generality we assume that $y(t) = u(t) = \omega(t) = v(t) = 0$, $\forall\, t < 0$. For the sequences $\{\omega(t)\}$ and $\{v(t)\}$ we adopt the following assumption.
Assumption A1): Define $x(t)^{T} = (\omega(t), v(t+1))$. Then $\{x(t), \mathcal{F}_{t}\}$ is a martingale difference sequence, i.e., $x(t)$ is $\mathcal{F}_{t}$-measurable and $E\{x(t+1) \mid \mathcal{F}_{t}\} = 0$ (a.s.). The sequence $\{x(t)\}$ satisfies

$$E\{x(t+1)x(t+1)^{T} \mid \mathcal{F}_{t}\} = \begin{bmatrix} \sigma_{\omega}^{2} & 0 \\ 0 & \sigma_{v}^{2} \end{bmatrix}, \quad 0 < \sigma_{\omega}, \sigma_{v} < \infty \ \text{(a.s.)}$$

and

$$\sup_{t\ge 0} E\{\|x(t+1)\|^{\beta} \mid \mathcal{F}_{t}\} < \infty \ \text{(a.s.)}, \ \text{for some } \beta > 4.$$
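As a small numerical illustration of the reference model (5) and of the moment condition above (the MA coefficients and noise variance below are arbitrary choices, not taken from the paper; Gaussian noise satisfies the moment bound for every $\beta$), the reference sequence can be generated as follows:

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary illustrative MA polynomial P(q^{-1}) = 1 + 0.5 q^{-1} - 0.2 q^{-2}
p = np.array([1.0, 0.5, -0.2])
n = 1000
v = rng.normal(0.0, 1.0, size=n)   # i.i.d. Gaussian v(t): all moments finite

# y*(t) = v(t) + p1 v(t-1) + p2 v(t-2), with v(t) = 0 for t < 0
y_star = np.convolve(v, p)[:n]

# A finite MA of white noise: sample variance close to sigma_v^2 * sum p_i^2
print(y_star.var(), p @ p)
```

Being a finite moving average of a martingale difference sequence, $y^{*}(t)$ is bounded in mean square, which is the property used repeatedly below.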
0018–9286/00$10.00 © 2000 IEEE
The previous assumption implies two important relations which will be useful in subsequent developments. In fact, by the conditional Chebyshev inequality we have, for any $\varepsilon > 4/\beta$,

$$\sum_{t=1}^{\infty} P\bigl(\|x(t+1)\|^{4} \ge t^{\varepsilon} \mid \mathcal{F}_{t}\bigr) = \sum_{t=1}^{\infty} P\bigl(\|x(t+1)\|^{\beta} \ge t^{\varepsilon\beta/4} \mid \mathcal{F}_{t}\bigr) \le \sum_{t=1}^{\infty} t^{-\varepsilon\beta/4}\,E\{\|x(t+1)\|^{\beta} \mid \mathcal{F}_{t}\} < \infty \ \text{(a.s.)}.$$

Then by using the Borel–Cantelli lemma one obtains

$$\|x(t+1)\|^{4} = O(t^{\varepsilon}) \ \text{(a.s.)} \quad \forall\,\varepsilon \in (4/\beta, 1) \tag{6}$$

where $x(t)^{T} = (\omega(t), v(t+1))$. The second statement which follows from Assumption A1) is

$$\limsup_{n\to\infty}\frac{1}{n}\sum_{t=1}^{n}\|x(t+1)\|^{4} \le k_{x} < \infty \ \text{(a.s.)}. \tag{7}$$
This relation can be proved by considering the sequence $\{\eta(i)\}$, $i \ge 0$, defined by

$$\eta(i) = \|x(i)\|^{4} - E\{\|x(i)\|^{4} \mid \mathcal{F}_{i-1}\}.$$

Since $\{\eta(i), \mathcal{F}_{i}\}$ is a martingale difference sequence satisfying

$$\sum_{i=0}^{\infty} E\Bigl\{\Bigl|\frac{\eta(i+1)}{i+1}\Bigr|^{c}\,\Big|\,\mathcal{F}_{i}\Bigr\} < \infty, \quad \forall\, c \in \Bigl(1, \min\Bigl(\frac{\beta}{4}, 2\Bigr)\Bigr)$$

by the Martingale Convergence theorem we get

$$\lim_{n\to\infty}\sum_{i=0}^{n}\frac{\eta(i+1)}{i+1} < \infty \ \text{(a.s.)}$$

from where, by Kronecker's lemma, relation (7) directly follows.

Denote the unknown parameters in (3) by
$$\theta_{0}^{T} = [a_{1}, \ldots, a_{n_{A}}, b] \tag{8}$$

and the corresponding regressor by

$$\varphi(t)^{T} = [-y(t), \ldots, -y(t - n_{A} + 1),\ u(t)]. \tag{9}$$

It is well known that the nonadaptive control law optimal in the sense of (4) is given by $\theta_{0}^{T}\varphi(t) = y^{*}(t+1)$. In the adaptive case we use the following $\mathcal{F}_{t}$-measurable control sequence $\{u(t)\}$:

$$\text{solve for } u(t){:}\quad \theta(t)^{T}\varphi(t) = y^{*}(t+1) \tag{10}$$

where $\theta(t)$ is an estimate of the unknown parameters $\theta_{0}$, generated by the following gradient-type algorithm:
$$\theta(t+1) = \theta(t) + \frac{\mu\,\varphi(t)}{d(t)}\bigl[y(t+1) - y^{*}(t+1)\bigr], \quad 0 < \mu < \infty \tag{11}$$

$$d(t) = \max\Bigl\{\mu \max_{1\le\tau\le t}\|\varphi(\tau)\|^{2},\ r(t)^{1-\delta},\ t\Bigr\}, \quad 0 < \delta < \tfrac{1}{2} \tag{12}$$

$$r(t) = 1 + \sum_{i=1}^{t}\|\varphi(i)\|^{2}, \quad r(0) = 1 \tag{13}$$
with an arbitrary nonrandom initial condition $\theta(0)$ satisfying $\|\theta(0)\| < \infty$ and $b(0) \neq 0$. In this paper we assume that the estimate of $b$ satisfies $b(t) \neq 0$, $\forall\, t \ge 1$, i.e., (11) is solvable $\forall\, t \ge 1$. It is well known [1], [7] that if $\{\omega(t)\}$ is a sequence of independent random samples with continuous distributions, then the estimate $b(t) \neq 0$, $\forall\, t \ge 1$ (a.s.). Without introducing the previous restrictions on $\{\omega(t)\}$, solvability of (10) can be guaranteed by slightly modifying $b(t)$ so that $u(t)$ is well defined from (10). In the case when the persistent excitation (PE) condition is satisfied and $\lim_{t\to\infty}\theta(t) = \theta_{0}$, such a modification is not needed after some finite time [1]. Since this paper is concerned with the asymptotic convergence rate of the parameter estimates, we do not impose additional restrictions on $\{\omega(t)\}$ or modify $b(t)$, and simply assume that (10) is solvable for all $t \ge 1$.
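To make the algorithm (10)–(13) concrete, the following sketch simulates the closed loop for a first-order all-pole plant. It is an illustration only: the plant coefficients, gain $\mu$, noise levels, and initial estimate are arbitrary choices not taken from the paper, and the normalization (12) is simplified by dropping the $r(t)^{1-\delta}$ term (by (17) and (20), $d(t) = t$ for all large $t$ in any case).

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative first-order all-pole plant (3): y(t+1) = -a1*y(t) + b*u(t) + w(t+1)
a1, b = -0.7, 1.0
theta0 = np.array([a1, b])                  # true parameter vector, cf. (8)

T = 20000
mu = 2.0                                    # gain; Assumption A2) wants mu large enough
w = rng.normal(0.0, 0.3, size=T + 1)        # disturbance w(t)
y_star = rng.normal(0.0, 1.0, size=T + 2)   # reference, known one step ahead

theta = np.array([0.0, 0.8])                # theta(0), with b(0) != 0
y, phi_max = 0.0, 0.0
err2 = np.zeros(T)
for t in range(1, T + 1):
    # Certainty-equivalence control (10): theta(t)^T phi(t) = y*(t+1), solved for u(t)
    u = (y_star[t + 1] + theta[0] * y) / theta[1]
    phi = np.array([-y, u])                 # regressor (9)
    y_next = -a1 * y + b * u + w[t]
    # SA update (11) with simplified normalization d(t) = max(mu*max||phi||^2, t)
    phi_max = max(phi_max, phi @ phi)
    d = max(mu * phi_max, t)
    theta = theta + mu * phi * (y_next - y_star[t + 1]) / d
    if abs(theta[1]) < 1e-3:                # crude guard keeping (10) solvable
        theta[1] = 1e-3
    err2[t - 1] = np.sum((theta - theta0) ** 2)
    y = y_next

print(err2[-1])  # squared parameter error after T steps; small for this run
```

The white-noise reference provides the persistent excitation under which $\theta(t)$ can approach $\theta_0$.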
We now list some known results in self-tuning control theory that are needed in the sequel. It is not difficult to see that from (3) and (10), the closed-loop adaptive system can be written in the form

$$y(t+1) - y^{*}(t+1) = -z(t) + \omega(t+1) \tag{14}$$

where

$$z(t) = \tilde\theta(t)^{T}\varphi(t), \quad \tilde\theta(t) = \theta(t) - \theta_{0}. \tag{15}$$

Theorem 2.1: Consider (3) together with the algorithm (10)–(13). If Assumption A1) is valid, the following properties hold (a.s.):
i)
$$\lim_{n\to\infty}\frac{1}{n}\sum_{i=0}^{n} z(i)^{2} = 0 \tag{16}$$

ii)
$$\lim_{n\to\infty}\frac{1}{n}\,r(n) = c_{0}, \quad 0 < c_{0} < \infty \tag{17}$$

iii)
$$\lim_{n\to\infty}\sum_{i=k}^{n}\|\theta(i) - \theta(i-k)\|^{2} < \infty, \quad \text{for all finite } k \tag{18}$$

iv)
$$\lim_{n\to\infty}\tilde\theta(n) = 0 \tag{19}$$

v)
$$\lim_{n\to\infty}\max_{1\le r\le n}\|\varphi(r)\|^{2}/n = 0. \tag{20}$$
Proof: Although the normalization sequence $d(t)$ in (11) is different from the standard one, the proof of the above statements is exactly the same as for the case $d(t) = r(t)$ and can be found in a variety of literature (see for example [1]). We comment only on statement (20). From (9) and (14) we can derive

$$\max_{1\le r\le t}\|\varphi(r)\|^{2} = O\Bigl\{\max_{1\le\tau\le t} z(\tau)^{2}\Bigr\} + O\Bigl\{\max_{1\le\tau\le t}\bigl\{\omega(\tau+1)^{2} + y^{*}(\tau+1)^{2}\bigr\}\Bigr\}. \tag{21}$$

Then (20) follows from (5), (6), and (16).
III. THE CONVERGENCE RATE OF THE PARAMETER ESTIMATES

In this section we evaluate the convergence rate of the parameter estimates generated by the SA algorithm. We show that the order of this rate is the same as the one established for the LS algorithm. We prove the above statement under the condition that the scaling factor $\mu$ in (11) is large enough. In order to quantify the size of the parameter
$\mu$, we need a few simple developments. By using (14) it is easy to see that the regressor $\varphi(t)$ defined by (9) can be represented in the form

$$\varphi(t) = \varphi_{z}(t) + \bar\varphi(t) \tag{22}$$

where

$$\varphi_{z}(t)^{T} = \Bigl[z(t-1), \ldots, z(t-n_{A}),\ -\frac{A(q^{-1})}{b}\,z(t)\Bigr] \tag{23}$$

and

$$\bar\varphi(t)^{T} = \Bigl[-y^{*}(t) - \omega(t), \ldots, -y^{*}(t - n_{A} + 1) - \omega(t - n_{A} + 1),\ \frac{A(q^{-1})}{b}\bigl(y^{*}(t+1) + \omega(t+1)\bigr)\Bigr]. \tag{24}$$

Define the following matrix:

$$W = E\{\bar\varphi(t)\bar\varphi(t)^{T} \mid \mathcal{F}_{t-k}\}, \quad k \ge n_{A} + n_{P} + 1 \tag{25}$$

where $n_{A}$ and $n_{P}$ are the degrees of the polynomials $A(q^{-1})$ and $P(q^{-1})$, respectively. Since $k$ is chosen so that $\bar\varphi(t)$ is not $\mathcal{F}_{t-k}$-measurable, Assumption A1) implies that $W$ is a positive definite matrix. Next, we define the size of the parameter $\mu$ in (11).
Assumption A2): The gain factor $\mu$ in the estimation algorithm satisfies

$$\mu\lambda^{*} \ge 1$$

where $\lambda^{*}$ is the minimal eigenvalue of the matrix $W$ given by (25).

For future reference we state the following result.

Theorem 3.1: Let Assumptions A1) and A2) hold. Then as $n\to\infty$

i)
$$\sum_{i=0}^{n} z(i)^{2} = O(\log n) \ \text{(a.s.)} \tag{26}$$

ii)
$$\sum_{i=0}^{n}\|\tilde\theta(i)\|^{2} = O(\log n) \ \text{(a.s.)}. \tag{27}$$
Proof: The proof of the theorem is given in [8].

Theorem 3.2: Let Assumptions A1) and A2) hold. Then as $n\to\infty$

$$\|\tilde\theta(n)\|^{2} = O\Bigl(\frac{\log\log n}{n}\Bigr) \ \text{(a.s.)}. \tag{28}$$
Proof: By using (17) and (20) we conclude that there exists $t_{0}$, $t_{0} < \infty$, such that the gain sequence $d(t)$ in (11) is given by

$$d(t) = t, \quad \forall\, t \ge t_{0} \ \text{(a.s.)}. \tag{29}$$

Then from (11), (14), and (29), we have

$$\tilde\theta(n+1) = \tilde\theta(n) - \frac{\mu\,\varphi(n)\varphi(n)^{T}\tilde\theta(n)}{n} + \frac{\mu\,\varphi(n)\omega(n+1)}{n} \ \text{(a.s.)} \tag{30}$$

for all $n \ge t_{0}$. Using the decomposition (22) of the regressor $\varphi(n)$ and substituting it into the previous relation, we derive
$$\tilde\theta(n+1) = \tilde\theta(n) - \frac{\mu\,\bar\varphi(n)\bar\varphi(n)^{T}\tilde\theta(n)}{n} - \mu\,\frac{f(n)}{n} + \frac{\mu\,\varphi(n)\omega(n+1)}{n} \ \text{(a.s.)} \tag{31}$$

where

$$f(n) = \bigl[\varphi_{z}(n)\varphi_{z}(n)^{T} + \varphi_{z}(n)\bar\varphi(n)^{T} + \bar\varphi(n)\varphi_{z}(n)^{T}\bigr]\tilde\theta(n). \tag{32}$$
Further transformation of (31) gives

$$\tilde\theta(n+1) = \Bigl(I - \frac{\mu W}{n}\Bigr)\tilde\theta(n) - \mu\,\frac{f(n) - g(n)}{n} + \frac{\mu\,\varphi(n)\omega(n+1)}{n} \ \text{(a.s.)} \tag{33}$$

where

$$g(n) = \bigl(W - \bar\varphi(n)\bar\varphi(n)^{T}\bigr)\tilde\theta(n) \tag{34}$$
while $W$ is the positive definite matrix defined by (25). Since $W$ is symmetric, there exists an orthogonal matrix $D$ such that $W = DLD^{T}$, where $L$ is a diagonal matrix with the eigenvalues of $W$ on the diagonal. Having this in mind, (33) can be written in the form

$$\psi(n+1) = \Bigl(I - \frac{\mu L}{n}\Bigr)\psi(n) - \mu\,\frac{D^{T}\bigl(f(n) - g(n)\bigr)}{n} + \frac{\mu\,D^{T}\varphi(n)\omega(n+1)}{n}, \quad n \ge t_{0} \ \text{(a.s.)} \tag{35}$$

with

$$\psi(n) = D^{T}\tilde\theta(n). \tag{36}$$
Denote by $\psi_{i}(n)$, $i = 1, \ldots, n_{A}+1$, the $i$th component of the vector $\psi(n)$. Then

$$\psi_{i}(n+1) = \Bigl(1 - \frac{\mu\lambda_{i}}{n}\Bigr)\psi_{i}(n) - \mu\,\frac{l_{i}D^{T}\bigl(f(n) - g(n)\bigr)}{n} + \frac{\mu\,l_{i}D^{T}\varphi(n)\omega(n+1)}{n} \ \text{(a.s.)} \tag{37}$$

where $\lambda_{i} > 0$ is the $i$th eigenvalue of the matrix $L$ (or $W$). In (37) the row vector $l_{i}$ has the same dimension as $\psi(n)$. All elements of $l_{i}$ are zero, except the $i$th component, which is equal to one. Note once more that due to (29), relation (30), and consequently (31)–(37), are valid for $n \ge t_{0}$, $0 < t_{0} < \infty$. From (37) one can derive
$$\psi_{i}(n+1) = R_{i}(n)\psi_{i}(t_{0}) - \mu R_{i}(n)\sum_{t=t_{0}}^{n}\frac{1}{t\,R_{i}(t)}\,l_{i}D^{T}\bigl(f(t) - g(t)\bigr) + \mu R_{i}(n)\sum_{t=t_{0}}^{n}\frac{1}{t\,R_{i}(t)}\,l_{i}D^{T}\varphi(t)\omega(t+1) \ \text{(a.s.)} \tag{38}$$

where

$$R_{i}(n) = \prod_{t=t_{0}}^{n}\Bigl(1 - \frac{\mu\lambda_{i}}{t}\Bigr). \tag{39}$$
Next we bound the second term on the RHS of (38). The definition of $f(t)$ given by (32) and the Schwarz inequality yield

$$\sum_{t=1}^{n}\|f(t)\| = O\Bigl\{\sum_{t=1}^{n}\|\varphi_{z}(t)\|^{2}\|\tilde\theta(t)\|\Bigr\} + O\Bigl\{\Bigl(\sum_{t=1}^{n}\|\varphi_{z}(t)\|^{2}\Bigr)^{\frac12}\Bigl(\sum_{t=1}^{n}\|\bar\varphi(t)\|^{2}\|\tilde\theta(t)\|^{2}\Bigr)^{\frac12}\Bigr\}. \tag{40}$$
1164 IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 45, NO. 6, JUNE 2000
Since $\tilde\theta(t)\to 0$ as $t\to\infty$ (a.s.), the previous relation and (23) imply

$$\sum_{t=1}^{n}\|f(t)\| = o\bigl(r_{z}(n)\bigr) + O\Bigl(r_{z}(n)^{\frac12}\,r_{\tilde\theta}(n)^{\frac12}\max_{1\le r\le n}\|\bar\varphi(r)\|\Bigr) \ \text{(a.s.)} \tag{41}$$

with $r_{z}(n)$ and $r_{\tilde\theta}(n)$ given by (13) when $\varphi(t) = z(t)$ and $\varphi(t) = \|\tilde\theta(t)\|$, respectively. Noticing that by (6) and (24), $\max_{1\le r\le n}\|\bar\varphi(r)\|^{2} = O(n^{\delta})$ (a.s.), $\delta \in \bigl(\frac{2}{\beta}, \frac12\bigr)$, from (26), (27), and (41) one obtains

$$\sum_{t=1}^{n}\|f(t)\| = O\bigl(n^{\delta}\log n\bigr), \quad \frac{1}{\beta} < \delta < \frac{1}{4} \ \text{(a.s.)}. \tag{42}$$
After applying Lemma A2 in the Appendix, from (39) we get

$$O\bigl((n - t_{0})^{-\mu\lambda_{i}}\bigr) \le R_{i}(n) \le O\bigl((n+1)^{-\mu\lambda_{i}}\bigr). \tag{43}$$

Recall now that by Assumption A2)

$$\mu\lambda_{i} \ge \mu\lambda^{*} \ge 1, \quad 1 \le i \le n_{A} + 1.$$
Hence (42) and (43) give

$$\Bigl|R_{i}(n)\sum_{t=t_{0}}^{n}\frac{1}{t\,R_{i}(t)}\,l_{i}D^{T}f(t)\Bigr| = O\Bigl\{\frac{1}{(n+1)^{\mu\lambda_{i}}}\sum_{t=t_{0}}^{n}\frac{(t - t_{0})^{\mu\lambda_{i}}}{t}\,\|f(t)\|\Bigr\} = O\Bigl\{\frac{(n - t_{0})^{\mu\lambda_{i}-1}}{(n+1)^{\mu\lambda_{i}}}\sum_{t=t_{0}}^{n}\|f(t)\|\Bigr\} = O\Bigl\{\frac{\log n}{n^{3/4}}\Bigr\} \ \text{(a.s.)}. \tag{44}$$
The term $\mu R_{i}(n)\sum_{t=t_{0}}^{n}\bigl(1/(t\,R_{i}(t))\bigr)\,l_{i}D^{T}g(t)$ is bounded by Lemma A1 in the Appendix. Next we bound the third term on the RHS of (38). Define $T_{in}$ as follows:

$$T_{in} = \sum_{t=1}^{n}\frac{1}{R_{i}(t)\,t^{\mu\lambda_{i}}}\,l_{i}D^{T}\varphi(t)\omega(t+1). \tag{45}$$
Then by virtue of Assumption A2) the following chain of inequalities follows:

$$\Bigl|\sum_{t=t_{0}}^{n}\frac{1}{t\,R_{i}(t)}\,l_{i}D^{T}\varphi(t)\omega(t+1)\Bigr| = \Bigl|\sum_{t=t_{0}}^{n} t^{\mu\lambda_{i}-1}\bigl(T_{it} - T_{i(t-1)}\bigr)\Bigr| = \Bigl|\sum_{t=t_{0}}^{n}\bigl(t^{\mu\lambda_{i}-1}T_{it} - (t-1)^{\mu\lambda_{i}-1}T_{i(t-1)}\bigr) + \sum_{t=t_{0}}^{n}\bigl[(t-1)^{\mu\lambda_{i}-1} - t^{\mu\lambda_{i}-1}\bigr]T_{i(t-1)}\Bigr| \le \bigl|n^{\mu\lambda_{i}-1}T_{in} - (t_{0}-1)^{\mu\lambda_{i}-1}T_{i(t_{0}-1)}\bigr| + \max_{t_{0}\le\tau\le n}\bigl|T_{i(\tau-1)}\bigr|\,\bigl(n^{\mu\lambda_{i}-1} - (t_{0}-1)^{\mu\lambda_{i}-1}\bigr) = O\bigl\{n^{\mu\lambda_{i}-1}(n\log\log n)^{\frac12}\bigr\} \ \text{(a.s.) as } n\to\infty \tag{46}$$
where we used the fact that $t_{0}$ is finite, and $T_{in}$ is bounded in Lemma A3 in the Appendix. Finally, (43) and (46) yield

$$\Bigl|R_{i}(n)\sum_{t=t_{0}}^{n}\frac{1}{t\,R_{i}(t)}\,l_{i}D^{T}\varphi(t)\omega(t+1)\Bigr| = O\Bigl\{\frac{1}{n}\bigl(n\log\log n\bigr)^{\frac12}\Bigr\} \ \text{(a.s.)} \tag{47}$$
as $n\to\infty$. By using (43), (44), Lemma A1 in the Appendix, and (47), from (38) we obtain

$$|\psi_{i}(n+1)| = O\Bigl(\Bigl(\frac{\log\log n}{n}\Bigr)^{\frac12}\Bigr), \quad i = 1, \ldots, n_{A}+1 \ \text{(a.s.)}. \tag{48}$$

Since $D^{T}$ is an orthogonal matrix, the statement of the theorem follows from (36). This completes the proof of the theorem.

From relations (43), (44), and Lemma A1, it is clear that the first two terms on the RHS of (38) have order $O(n^{-1/2})$. Thus, the convergence rate of the parameter estimates is determined by the third term on the RHS of (38). Lemma A3 and (47) indicate that the convergence rate (28) is the best possible, since it is the same as that in the law of the iterated logarithm.
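The rate (28) is exactly the pointwise rate in the law of the iterated logarithm. A one-dimensional analogue (illustrative only, not the controller analyzed above) makes this transparent: with gain $1/t$ and regressor identically one, the SA recursion reduces to the running sample mean, whose error obeys the classical LIL, so the normalized error stays of order one along the whole path:

```python
import numpy as np

rng = np.random.default_rng(2)

sigma, n = 1.0, 200_000
x = rng.normal(0.0, sigma, size=n)

# Scalar SA estimating the mean 0 with gain 1/t:
# theta(t) = theta(t-1) + (x(t) - theta(t-1)) / t  ==>  theta(n) is the sample mean
theta = np.cumsum(x) / np.arange(1, n + 1)

# LIL normalization: |theta(n)| / sqrt(2 sigma^2 log log n / n) has limsup 1,
# matching the (log log n / n)^{1/2} scaling of the parameter error in (28)
ns = np.arange(10, n)
ratio = np.abs(theta[ns]) / np.sqrt(2 * sigma**2 * np.log(np.log(ns)) / ns)
print(ratio.max())  # stays of order one; does not grow like a power of n
```
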
The proof of the above theorem heavily exploits the fact that the second component $\bar\varphi(t)$ of the regressor $\varphi(t)$ in (22) is a moving average process. As a consequence, we were able to define the constant matrix $W$, which plays a significant role in our analysis. This is the reason why we consider only the all-pole system model (3) and the moving-average-type reference signal (5). Such restrictions are not required for the LS algorithm [2]. It is not clear whether similar results hold for systems with zeros or for general ARMAX models with an arbitrary bounded reference signal $y^{*}(t)$.
It is worth noticing that instead of (12), the standard SA algorithm has the gain sequence $d(t) = r(t)$ [1]. Since by (17) $r(t) = O(t)$ (a.s.), it is obvious that the above results still hold in this case. Compared with (12), such a $d(t)$ would involve somewhat more constants in some of the calculations. Due to (29), derivations of the results are clearer when $d(t)$ is defined as in (12).
In the above results it is assumed that the reference signal is generated by a stochastic process and that, at the same time, the reference signal is known one step ahead. This assumption can be avoided if, in place of the reference signal, its one-step-ahead prediction is used. Such a prediction can be generated by another SA algorithm. In that case, the proof of Theorem 3.2 becomes much more complicated, involving additional algebraic details. For the sake of simplicity and clarity, we assume that the reference signal is known one step ahead.
It is also worth noting that the size of the algorithm gain $\mu$ plays an essential role in establishing the convergence rate given by (28). Assumption A2) requires that the gain $\mu$ in all directions is not smaller than the inverse of the corresponding eigenvalues of the matrix $W$. This fact coincides with the observation that the gain of the LS algorithm is asymptotically proportional to the inverse of the covariance matrix $W$. In particular, the LS estimator equation is given by

$$\theta(n+1) = \theta(n) + p(n)\varphi(n)\bigl[y(n+1) - y^{*}(n+1)\bigr], \quad n \ge 0$$

$$p(n)^{-1} = p(n-1)^{-1} + \varphi(n)\varphi(n)^{T}, \quad p(0)^{-1} > 0$$

with $\varphi(n)$ specified by (9). Since $\lim_{n\to\infty} p(n)^{-1}/n = W$ (a.s.), we have $\lim_{n\to\infty} W n\,p(n) = I$ (a.s.), where $W$ is defined by (25). By virtue of this relation, the gain matrix $p(n)$ asymptotically behaves as $p(n) \approx W^{-1}/n$.
IV. CONCLUSION
In this paper we have obtained new results on the (a.s.) convergence rate of the parameter estimates generated by the stochastic approximation algorithm. We proved that this rate has the same order as the best one established for the least squares algorithm. Self-tuning control with an all-pole system model and a moving-average-type reference signal has been analyzed. For further study, it would be of interest to generalize the results to a general ARMAX system model with an arbitrary bounded reference signal $y^{*}(t)$.
APPENDIX
Lemma A1: Under the same conditions as in Theorem 3.2, as $n\to\infty$, we have

$$\Bigl|R_{i}(n)\sum_{t=t_{0}}^{n}\frac{1}{t\,R_{i}(t)}\,l_{i}D^{T}g(t)\Bigr| \le O\bigl(n^{-\frac12}\bigr) \ \text{(a.s.)} \tag{49}$$

where all variables are the same as in (38).

Proof: From the definition of $g(t)$ in (34) we write

$$g(t) = p(t) + s(t) \tag{50}$$

where

$$p(t) = \bigl(W - \bar\varphi(t)\bar\varphi(t)^{T}\bigr)\tilde\theta(t-k) \tag{51}$$

$$s(t) = \bigl(W - \bar\varphi(t)\bar\varphi(t)^{T}\bigr)\bigl(\theta(t) - \theta(t-k)\bigr) \tag{52}$$

with $k$ defined by (25). Since $\tilde\theta(t-k)$ is $\mathcal{F}_{t-k}$-measurable and from (25) $E\{W - \bar\varphi(t)\bar\varphi(t)^{T} \mid \mathcal{F}_{t-k}\} = 0$ (a.s.), by Assumption A1) and the Local Martingale Convergence theorem (see [7, Lemma 2, p. 157]) we get
$$I_{2}(n) \triangleq \Bigl|\sum_{t=t_{0}}^{n}\frac{1}{t\,R_{i}(t)}\,l_{i}D^{T}p(t)\Bigr| = O\Bigl\{\Bigl(\sum_{t=t_{0}}^{n}\frac{1}{t^{2}R_{i}(t)^{2}}\|\tilde\theta(t-k)\|^{2}\Bigr)^{\frac12}\Bigl[\log\Bigl(\sum_{t=t_{0}}^{n}\frac{1}{t^{2}R_{i}(t)^{2}}\|\tilde\theta(t-k)\|^{2}\Bigr)\Bigr]^{\alpha}\Bigr\}, \quad \forall\,\alpha > \frac{1}{2} \ \text{(a.s.)} \tag{53}$$
Then (43) and (53) give

$$I_{2}(n) = O\Bigl\{n^{\mu\lambda_{i}-1}\Bigl(\sum_{t=t_{0}}^{n}\|\tilde\theta(t-k)\|^{2}\Bigr)^{\frac12}\Bigl[\log\Bigl(n^{2(\mu\lambda_{i}-1)}\sum_{t=t_{0}}^{n}\|\tilde\theta(t-k)\|^{2}\Bigr)\Bigr]^{\alpha}\Bigr\}, \quad \forall\,\alpha > \frac{1}{2} \ \text{(a.s.)} \tag{54}$$

from where by (27) it follows that for some $\varepsilon > 0$:

$$I_{2}(n) = O\bigl\{n^{\mu\lambda_{i}-1}(\log n)^{(1+\varepsilon)/2}\bigr\} \ \text{(a.s.)}. \tag{55}$$
By using the Schwarz inequality and (52), we have

$$I_{3}(n) \triangleq \Bigl|\sum_{t=t_{0}}^{n}\frac{1}{t\,R_{i}(t)}\,l_{i}D^{T}s(t)\Bigr| = O\Bigl\{\Bigl(\sum_{t=t_{0}}^{n}\frac{1}{t^{2}R_{i}(t)^{2}}\|\theta(t) - \theta(t-k)\|^{2}\Bigr)^{\frac12}\Bigl(\sum_{t=t_{0}}^{n}\|W - \bar\varphi(t)\bar\varphi(t)^{T}\|^{2}\Bigr)^{\frac12}\Bigr\}. \tag{56}$$

Since (7) and (24) imply $\sum_{t=t_{0}}^{n}\|\bar\varphi(t)\|^{4} = O(n)$ (a.s.), from (18), (43), and (56) it is clear that

$$I_{3}(n) = O\bigl(n^{\mu\lambda_{i}-1}\cdot n^{\frac12}\bigr) \ \text{(a.s.)}. \tag{57}$$
Finally, (49) follows from (55), (57), and (43). Thus the lemma is proved.

Lemma A2: Let $R(n)$ be defined as follows:

$$R(n) = \prod_{t=h}^{n}\Bigl(1 - \frac{a}{t}\Bigr) \tag{58}$$

where $a > 0$, $h = [a] + 1$, and $[a]$ is the integer part of $a$. Then the following holds:
$$\Bigl(1 - \frac{a}{h}\Bigr)\frac{e^{-a}}{(n-h)^{a}} \le R(n) \le \Bigl(1 - \frac{a}{h}\Bigr)\frac{(h+1)^{a}}{(n+1)^{a}}. \tag{59}$$

Proof: Observe that $\forall\, t \ge h$

$$-\frac{a}{t-h} \le -\frac{a}{t-a} \le \log\Bigl(1 - \frac{a}{t}\Bigr) \le -\frac{a}{t}. \tag{60}$$
Since

$$\sum_{t=h+1}^{n}\frac{1}{t-h} = 1 + \sum_{i=2}^{n-h}\frac{1}{i} = 1 + \sum_{i=2}^{n-h}\int_{i-1}^{i}\frac{1}{i}\,dx \le 1 + \sum_{i=2}^{n-h}\int_{i-1}^{i}\frac{dx}{x} = 1 + \int_{1}^{n-h}\frac{dx}{x} = 1 + \log(n-h) \tag{61}$$

and

$$\sum_{t=h+1}^{n}\frac{1}{t} \ge \sum_{t=h+1}^{n}\int_{t}^{t+1}\frac{dx}{x} = \log(n+1) - \log(h+1) \tag{62}$$

(60) yields

$$-a - a\log(n-h) \le \sum_{t=h+1}^{n}\log\Bigl(1 - \frac{a}{t}\Bigr) \le -a\log(n+1) + a\log(h+1). \tag{63}$$
Using the definition of $R(n)$, from the previous relation we derive

$$-a - a\log(n-h) \le \log\frac{R(n)}{1 - a/h} \le -a\log(n+1) + a\log(h+1)$$

from where (59) directly follows.

Lemma A3: Let Assumptions A1) and A2) hold. Then as $n\to\infty$
$$\Bigl|\sum_{t=1}^{n}\frac{1}{R_{i}(t)\,t^{\mu\lambda_{i}}}\,l_{i}D^{T}\varphi(t)\omega(t+1)\Bigr| = O\bigl\{(n\log\log n)^{\frac12}\bigr\} \ \text{(a.s.)} \tag{64}$$

where all variables are the same as in (38).

Proof: First we observe the following facts.

i) Equation (43) implies that $R_{i}(t)^{-1}/t^{\mu\lambda_{i}} \le O(1)$.

ii) Equations (6), (22), and (26) yield $\|\varphi(t)\|^{2} \le O(t^{\delta})$, $\forall\,\delta \in \bigl(\frac{2}{\beta}, \frac12\bigr)$ (a.s.).

Then it is clear that

$$\bigl(\|\varphi(t)\omega(t+1)\| / (R_{i}(t)\,t^{\mu\lambda_{i}})\bigr)^{2} \le O(t^{\delta}), \quad \forall\,\delta \in \Bigl(\frac{4}{\beta}, 1\Bigr) \ \text{(a.s.)} \tag{65}$$

and by (17)

$$\sum_{t=1}^{n}\bigl(\|\varphi(t)\| / (R_{i}(t)\,t^{\mu\lambda_{i}})\bigr)^{2} = O(n) \ \text{(a.s.)}. \tag{66}$$

Based on the last two relations we can use the theorem on the Almost Sure Invariance Principle (see [4, Th. 3.1, p. 122]) and obtain the statement of the lemma.
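As a quick numerical sanity check of the bounds (59) in Lemma A2 (the sampled values of $a$ and $n$ below are arbitrary), the product $R(n) = \prod_{t=h}^{n}(1 - a/t)$ indeed stays between the two $O(n^{-a})$ envelopes:

```python
import math

def R(n, a):
    """Product R(n) = prod_{t=h}^{n} (1 - a/t) from (58), with h = [a] + 1."""
    h = math.floor(a) + 1
    prod = 1.0
    for t in range(h, n + 1):
        prod *= 1.0 - a / t
    return prod

for a in (0.5, 1.3, 2.7):
    h = math.floor(a) + 1
    for n in (50, 500, 5000):
        lower = (1 - a / h) * math.exp(-a) / (n - h) ** a   # lower envelope in (59)
        upper = (1 - a / h) * (h + 1) ** a / (n + 1) ** a   # upper envelope in (59)
        assert lower <= R(n, a) <= upper, (a, n)

print("bounds (59) hold for all sampled (a, n)")
```
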
REFERENCES
[1] H. F. Chen and L. Guo, Identification and Stochastic Adaptive Control. Boston: Birkhäuser, 1991.
[2] L. Guo, "Further results on least-squares based adaptive minimum variance control," SIAM J. Contr. Optimization, vol. 32, pp. 187–212, 1994.
[3] L. Guo and H. F. Chen, "The Åström–Wittenmark self-tuning regulator revisited and ELS-based adaptive trackers," IEEE Trans. Automat. Contr., vol. 36, pp. 802–812, 1991.
[4] N. C. Jain, K. Jogdeo, and W. F. Stout, "Upper and lower functions for martingales and mixing processes," Annals Probability, vol. 3, pp. 119–145, 1975.
[5] T. L. Lai, "Asymptotically efficient adaptive control in stochastic regression models," Adv. Appl. Math., vol. 7, pp. 23–45, 1986.
[6] T. L. Lai and C. Z. Wei, "Least squares estimates in stochastic regression models with applications to identification and control of dynamic systems," Ann. Statist., vol. 10, pp. 154–166, 1982.
[7] S. P. Meyn and P. E. Caines, "The zero divisor problem of multivariable stochastic adaptive control," Syst. Contr. Letters, vol. 6, pp. 235–238, 1985.
[8] M. S. Radenkovic, "Further results on almost sure convergence of the stochastic gradient algorithm," Int. J. Adaptive Contr. Signal Processing, 1999.
Low-Order Controller Design for SISO Systems Using Coprime Factors and LMI
Shaopeng Wang and Joe H. Chow
Abstract—This paper develops a low-order controller design method for linear continuous time-invariant single-input, single-output systems requiring only the solution of a convex optimization problem. The technique integrates several well-known results in control theory. An important step is the use of coprime factors so that, based on strictly positive real functions, feedback stabilization using low-order controllers becomes a zero-placement problem, which is convex. From this result, we develop algorithms to solve two optimal control problems.

Index Terms—Bounded real lemma, coprime factorization, linear matrix inequalities, low-order controller design, strictly positive real functions.
I. INTRODUCTION
A simple, systematic, and reliable method for the design of a low-order stabilizing controller for a linear time-invariant system to optimize certain $H_{2}$, $H_{\infty}$, and pole-placement performance indices has eluded control systems researchers for many years [1]. The purpose of this paper is to develop a new control design method for single-input, single-output (SISO) systems so that some of these low-order controller design problems can be solved with simple and reliable algorithms.
It is well known that the design of low-order controllers results in a control problem involving either a nonconvex rank condition [2] or bi-affine matrix inequalities (BMIs) [3], which are nonconvex optimization problems and cannot be solved in polynomial time. Instead of solving the BMI problem directly, several researchers [4]–[9] have shown that low-order controllers can be obtained by iteratively solving linear matrix inequality (LMI) subproblems, which are convex and readily solved using existing semidefinite programming software [10]. Some of these techniques have been applied successfully to practical design problems. However, global convergence has not been established for any of these iterative methods.
In this paper, we will formulate low-order controller design problems as convex optimization problems, without requiring an iterative solution. The design does not involve any iterations, so no convergence result is needed. However, in order to make the problem convex, our controller solution set will not be the entire solution set. The new design formulation is achieved by integrating several well-known results, namely, strictly positive real (SPR) functions [11], the Bounded Real Lemma [12], and LMIs [12]. Only the theory is presented here. Interested readers are referred to [15] for additional results and design examples.
Remark: We will use bold variables to denote rational functions [like $\mathbf{g}(s)$] and nonbold variables to denote polynomials [like $g(s)$].
II. DESIGN OF LOW-ORDER STABILIZING CONTROLLERS

Consider a strictly proper linear time-invariant SISO system $\mathbf{g}(s)$ with the minimum state-space realization

$$\dot{x} = Ax + Bu, \quad y = Cx \tag{1}$$
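As a concrete instance of (1) (the matrices below are an arbitrary illustrative choice, not taken from the paper), a strictly proper SISO realization and its transfer function $\mathbf{g}(s) = C(sI - A)^{-1}B$ can be set up as:

```python
import numpy as np

# Arbitrary illustrative minimal realization of a strictly proper SISO system;
# for these matrices g(s) = 1 / (s^2 + 3s + 2)
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
B = np.array([[0.0],
              [1.0]])
C = np.array([[1.0, 0.0]])

def g(s):
    """Evaluate g(s) = C (sI - A)^{-1} B at a (possibly complex) point s."""
    return (C @ np.linalg.solve(s * np.eye(2) - A, B))[0, 0]

print(g(0.0))  # DC gain: -C A^{-1} B = 0.5 for this realization
```
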
Manuscript received September 9, 1999; revised March 20, 1998. Recommended by Associate Editor, E. Feron. This work was supported in part by the National Science Foundation under Grant DMI 9631919 and the General Electric Company.

The authors are with the Electrical, Computer, and Systems Engineering Department, Rensselaer Polytechnic Institute, Troy, NY 12180 USA.
Publisher Item Identifier S 0018-9286(00)04224-0.