
Self-Organizing Approximation-Based Control for Higher Order Systems


1220 IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 18, NO. 4, JULY 2007


Yuanyuan Zhao, Student Member, IEEE, and Jay A. Farrell, Senior Member, IEEE

Abstract: Adaptive approximation-based control typically uses approximators with a predefined set of basis functions. Recently, spatially dependent methods have defined self-organizing approximators where new locally supported basis elements were incorporated when existing basis elements were insufficiently excited. In this paper, performance-dependent self-organizing approximators will be defined. The designer specifies a positive tracking error criterion. The self-organizing approximation-based controller then monitors the tracking performance and adds basis elements only as needed to achieve the tracking specification. The method of this paper is applicable to general $n$th-order input-state feedback linearizable systems. This paper includes a complete stability analysis and a detailed simulation example.

Index Terms: Adaptive nonlinear control, locally weighted learning, self-organizing approximation-based control.

I. INTRODUCTION

ONLINE approximation-based control has been well developed to achieve stability and accurate reference input tracking in the presence of partially unknown nonlinear dynamics. The design and analysis of adaptive systems involving online approximation structures has been extensively addressed in, e.g., [1]–[8], including controller structure selection, automatic adjustment of the control law, and the complete proofs of stability. Since such online approximation-based controllers can never achieve an exact modeling of unknown nonlinearities, inherent approximation errors could arise even if optimal approximator parameters were selected. With the existing results, it is well understood that the tracking performance ultimately achieved is determined by the upper bound on these inherent approximation errors, which can be improved by the choices of the designer, such as approximator structure, parameter adaptation algorithm, etc. Therefore, proper choices of the approximator structure and parameter adaptation will produce better tracking performance without changing the control gain (i.e., bandwidth) of the control system.

Many published universal approximation results [9], [10] state that under reasonable assumptions on the basis function and the function to be approximated, for any given $\epsilon > 0$, if the network approximator has a sufficiently large number of nodes, then approximation accuracy $\epsilon$ can be achieved by proper selection of the approximator parameters. Thus, an ideal design needs to allocate a sufficiently large number of learning parameters to achieve the accuracy specification for function approximation while maintaining low computational complexity. Additionally, allocating too many learning parameters bears the danger of overparameterizing the approximation, which may then be capable of fitting the measurement noise. With these motivations, nonlinear adaptive control with function approximation employing automatic structure adaptation has been discussed in [2] and [11]–[16]. In [13] and [14], the authors use wavelet networks and adapt the structure of the network in response to the evaluation of the magnitude of the output weights by “hard-thresholding.” Smoothly interpolated linear models are considered in [2].

Manuscript received November 20, 2005; revised August 15, 2006; accepted February 5, 2007. This work was supported by the National Science Foundation under Grant ECS-0322635.

The authors are with the Department of Electrical Engineering, University of California, Riverside, CA 92521 USA (e-mail: [email protected]; farrell@ee.ucr.edu).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TNN.2007.899217

The work of [15] and [16] defines local approximators within localized receptive fields, and allows the online approximation to be tuned in a local region without affecting the approximation accuracy previously achieved in other regions. Therefore, a more capable function approximation structure is produced to retain approximation accuracy as a function of the operating point. For the structure adaptation, these papers use gradient descent to adjust the distance metric of each local approximator so that each receptive field is tuned according to local curvature properties of the unknown function. In [2], [11], and [12], the authors use the linearly parameterized model locally, which is a special case of the receptive field weighted regression (RFWR) approach proposed in [15] and [16]. No stability results are given in [15] and [16]. The common drawbacks of the existing approaches [2], [11]–[14] are the following: 1) they only address the stability analysis for the state and the approximator parameters, not the change in the number of basis functions, and 2) the structure adaptation algorithms are defined by the trajectory, not by performance. New approximator nodes are added when the state is sufficiently far from all existing receptive field centers, whether or not additional approximation accuracy is required.

In [17], we suggested an approach to adjust the structure of the approximator based on the tracking performance of the control system. The structure of the function approximator is tuned online such that sufficiently accurate approximation is attained with the least number of local approximators to ensure the required state tracking performance without enlarging the bandwidth of the control system. The analysis of [17] focused on the scalar single-input–single-output (SISO) system $\dot{x} = f_0(x) + f(x) + \big(g_0(x) + g(x)\big)u$ with $g_0$ being known and $g = 0$.

In this paper, we extend the design method of the approximator structure adaptation to $n$th-order input-state feedback linearizable systems with unknown $g$. Our goal is to design a self-organizing online approximation-based control to achieve prespecified tracking accuracy for a linear combination of state tracking errors, without using high-gain control or large-magnitude switching. We will also show that each component of the tracking error satisfies a known bound.

1045-9227/$25.00 © 2007 IEEE

II. PROBLEM STATEMENT

In this paper, we extend the previous self-organizing controller design of [17] from scalar systems to $n$th-order SISO input-state feedback linearizable systems of the form

$$\dot{x}_i = x_{i+1}, \quad i = 1, \ldots, n-1 \tag{1}$$

$$\dot{x}_n = \big(f_0(x) + f(x)\big) + \big(g_0(x) + g(x)\big)u \tag{2}$$

where $x = [x_1, \ldots, x_n]^\top \in \mathbb{R}^n$ is the state vector and $u \in \mathbb{R}$ is the control signal. The functions $f_0$ and $g_0$ represent the known model at the design stage (i.e., the design model). The functions $f$ and $g$ represent nonlinear effects that are unknown at the design stage. The functions $f_0$, $f$, $g_0$, and $g$ are each assumed to be continuous functions.

Our goal is to design the control signal $u$ to steer $x_1(t)$ to track the reference input $x_d(t)$ and to achieve boundedness for the states $x_i$ for $i = 1, \ldots, n$. In regions of state space where the effect of the model error is too large for the desired level of tracking to be achieved, the (provably stable) approach presented herein will adapt the structure of the online approximator to allow the tracking specification to be achieved.

To ensure controllability, it is necessary to assume that $g_0(x) + g(x)$ is bounded away from zero and of known sign. Therefore, without loss of generality, we will invoke the following assumption.

Assumption 2.1: The function $g_0(x) + g(x)$ has a lower bound such that $g_0(x) + g(x) \ge g_l(x) \ge \underline{g} > 0$, where $g_l(x)$ is a known function and $\underline{g}$ is a known constant.

A. Reference Trajectory

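When the physical system is designed, a specified operating range is defined for each element of the state vector as a compact set $\mathcal{D}_i$. Let $\mathcal{D} = \mathcal{D}_1 \times \cdots \times \mathcal{D}_n$. The control system should ensure that the state remains in the physical operating region $\mathcal{D}$. It is, therefore, reasonable to assume that the desired state trajectory is sufficiently inside $\mathcal{D}$. These constraints are stated more rigorously in the following.

There is a desired trajectory $x_d(t)$ with derivatives $x_d^{(i)}(t)$, $i = 1, \ldots, n$, each of which is available and bounded. The vector $X_d(t) = [x_d(t), \dot{x}_d(t), \ldots, x_d^{(n-1)}(t)]^\top \in \mathcal{D}$ for all $t \ge 0$, which implies that $x_d^{(i-1)}(t) \in \mathcal{D}_i$ for all $t \ge 0$. In fact, we will assume existence of a small constant $\gamma > 0$ such that

$$\{x : \|x - X_d(t)\| \le \gamma\} \subset \mathcal{D} \tag{3}$$

for any $t \ge 0$. This condition states that the desired trajectory is at least a distance $\gamma$ from the boundary of $\mathcal{D}$. The unknown functions $f$ and $g$ will be approximated over the region $\mathcal{D}$.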

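Throughout this paper

$$\tilde{x}_i = x_i - x_d^{(i-1)}, \quad i = 1, \ldots, n$$

are the tracking error components and $\tilde{x}$ is the tracking error vector defined as $\tilde{x} = [\tilde{x}_1, \ldots, \tilde{x}_n]^\top$. Note that $\dot{\tilde{x}}_i = \tilde{x}_{i+1}$ for $i = 1, \ldots, n-1$.

For parameter adaptation, it is convenient to define a scalar projection of the tracking error vector

$$e = L^\top \tilde{x} \tag{4}$$

where $L = [l_1, \ldots, l_{n-1}, 1]^\top$. Note that $e = 0$ defines an $(n-1)$-dimensional hyperplane in $\mathbb{R}^n$. The absolute value of $e$ represents the distance of $\tilde{x}$ from this hyperplane. On the hyperplane $e = 0$, the dynamics of $\tilde{x}_1$ are defined by

$$\left(s^{n-1} + l_{n-1}s^{n-2} + \cdots + l_1\right)\tilde{x}_1 = 0$$

where $s$ is the Laplace variable; therefore, $L$ is to be selected so that $s^{n-1} + l_{n-1}s^{n-2} + \cdots + l_1$ is a Hurwitz polynomial. In this case, the transfer function

$$\frac{\tilde{x}_1(s)}{e(s)} = \frac{1}{s^{n-1} + l_{n-1}s^{n-2} + \cdots + l_1}$$

is bounded-input–bounded-output (BIBO) stable. If $e$ can be shown to be bounded for all $t \ge 0$, then each $\tilde{x}_i$ is bounded.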

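To allow the bounds to be easily expressed, as suggested in [18, sec. 7.1], it is convenient to choose $L$ such that

$$s^{n-1} + l_{n-1}s^{n-2} + \cdots + l_1 = (s + \lambda)^{n-1}$$

for some constant $\lambda > 0$. This implies that the vector $L$ in (4) is defined as $l_i = \binom{n-1}{i-1}\lambda^{n-i}$ for $i = 1, \ldots, n$, where $\binom{n-1}{i-1}$ is the binomial coefficient. The transfer functions to $\tilde{x}_i$ from $e$ are

$$\frac{\tilde{x}_i(s)}{e(s)} = \frac{s^{i-1}}{(s + \lambda)^{n-1}}, \quad i = 1, \ldots, n.$$

The advantage of defining $L$ in this manner is that if there exists a constant $\mu_e > 0$ such that the magnitude of $e$ is bounded as $|e(t)| \le \mu_e$, then the tracking errors are asymptotically bounded by

$$|\tilde{x}_i(t)| \le 2^{i-1}\lambda^{i-n}\mu_e, \quad i = 1, \ldots, n \tag{5}$$

which yields

$$\|\tilde{x}(t)\| \le \bar{c}\,\mu_e \quad \text{as } t \to \infty \tag{6}$$

with $\bar{c} = \left(\sum_{i=1}^{n} 4^{i-1}\lambda^{2(i-n)}\right)^{1/2}$ and $\|\cdot\|$ being the two-norm of a vector. See [18, pp. 279–280] for additional detail.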

The self-organizing online approximation-based controller developed in the subsequent sections is designed to maintain stability and to achieve a tracking accuracy of $|e| \le \mu_e$ with $\mu_e > 0$ prespecified at the design stage. If $L$ is selected as previously mentioned, then $|e| \le \mu_e$ ensures that $|\tilde{x}_i| \le 2^{i-1}\lambda^{i-n}\mu_e$ as $t \to \infty$. This tracking objective will be achieved without using large-magnitude switching or high-gain control. The controller will also include terms to ensure that the region $\mathcal{D}$ is attractive (i.e., initial conditions outside $\mathcal{D}$ converge to $\mathcal{D}$ in finite time).
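The choice of the vector in (4) described above can be illustrated with a minimal Python sketch. It assumes the coefficients come from the binomial expansion of $(s + \lambda)^{n-1}$ and that the per-component ultimate bound has the standard form from [18, sec. 7.1]; the function names and the arguments `lam` and `mu_e` are illustrative assumptions, not code from the paper.

```python
from math import comb


def filter_coefficients(n, lam):
    """Coefficients [l_1, ..., l_n] such that
    s^(n-1) + l_(n-1) s^(n-2) + ... + l_1 = (s + lam)^(n-1), with l_n = 1."""
    return [comb(n - 1, i - 1) * lam ** (n - i) for i in range(1, n + 1)]


def combined_error(coeffs, x_tilde):
    """Scalar combined tracking error e = L^T x_tilde."""
    return sum(l * xt for l, xt in zip(coeffs, x_tilde))


def ultimate_bounds(n, lam, mu_e):
    """Assumed bound 2^(i-1) * lam^(i-n) * mu_e on each |x_tilde_i|, i = 1..n,
    valid once |e| <= mu_e holds."""
    return [2 ** (i - 1) * lam ** (i - n) * mu_e for i in range(1, n + 1)]
```

For a second-order system with `lam = 1.0`, the combined error reduces to the first tracking error plus its derivative, and a deadzone of `mu_e` on the combined error bounds the first component by `mu_e` and the second by `2 * mu_e`.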

III. LWL ALGORITHM

In this paper, our development of an online approximation-based control employs locally weighted learning (LWL) [11], [12]. In LWL, the approximation of $f$ at a point $x$ is formed from the normalized weighted average of local approximators $\hat{f}_k(x)$ such that

$$\hat{f}(x) = \frac{\sum_k \omega_k(x)\hat{f}_k(x)}{\sum_k \omega_k(x)} \tag{7}$$

where each $\omega_k(x)$ is nonzero only on a set denoted by $S_k$ [defined in (8)] over which $\hat{f}_k$ will be adapted to improve its accuracy relative to $f$. In the following, we focus on a specific LWL algorithm and give all definitions required for the discussions that follow. Although we will only give definitions based on the function $f$, similar definitions and discussion also apply to $g$.

A. Weighting Functions

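We define a continuous, nonnegative, and locally supported¹ weighting function $\omega_k(x)$ for the $k$th local approximator. Denote the support of $\omega_k(x)$ by

$$S_k = \{x \in \mathcal{D} \mid \omega_k(x) \neq 0\} \tag{8}$$

Let $\bar{S}_k$ denote the closure of $S_k$. Note that $\bar{S}_k$ is a compact set. An example of a weighting function satisfying the aforementioned conditions is the biquadratic kernel defined as

$$\omega_k(x) = \begin{cases} \left(1 - \left(\dfrac{\|x - c_k\|}{\mu_k}\right)^2\right)^2, & \text{if } \|x - c_k\| < \mu_k \\ 0, & \text{otherwise} \end{cases} \tag{9}$$

where $c_k$ is the center location of the $k$th weighting function and $\mu_k$ is a constant which represents the radius of the region of support. In this example, the region of support is

$$S_k = \{x \in \mathcal{D} \mid \|x - c_k\| < \mu_k\}.$$

¹If we define the size of a set $S$ by $\rho(S) = \max_{x, y \in S}(\|x - y\|)$, then “locally supported” means that $\rho(S_k)$ is small relative to $\rho(\mathcal{D})$.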

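Since the approximator is self-organizing, the number of local approximators $N(t)$ is not constant. Conditions for increasing $N(t)$ at discrete instants of time are presented in Section IV-C. Since $N(t)$ is time varying, the region over which the approximator defined in (7) can have a nonzero value is also time varying. This region is defined as

$$\mathcal{A}^{N(t)} = \bigcup_{k=1}^{N(t)} S_k.$$

When $x \in \mathcal{A}^{N(t)}$, there exists at least one $k$ such that $\omega_k(x) \neq 0$. The normalized weighting functions are defined as

$$\bar{\omega}_k(x) = \frac{\omega_k(x)}{\sum_{i=1}^{N(t)} \omega_i(x)}.$$

The set of nonnegative functions $\{\bar{\omega}_k(x)\}_{k=1}^{N(t)}$ forms a partition of unity on $\mathcal{A}^{N(t)}$

$$\sum_{k=1}^{N(t)} \bar{\omega}_k(x) = 1 \quad \text{for all } x \in \mathcal{A}^{N(t)}.$$

Note that the support of $\bar{\omega}_k(x)$ is exactly the same as the support of $\omega_k(x)$.

When $x \notin \mathcal{A}^{N(t)}$, all $\omega_k(x)$ for $1 \le k \le N(t)$ are zero. Therefore, to complete the approximator definition of (7) to be valid for any $x \in \mathcal{D}$

$$\hat{f}(x) = \begin{cases} \sum_k \bar{\omega}_k(x)\hat{f}_k(x), & \text{if } x \in \mathcal{A}^{N(t)} \\ 0, & \text{if } x \notin \mathcal{A}^{N(t)}. \end{cases} \tag{10}$$

In the remainder of this section, we will only consider the case when $x \in \mathcal{A}^{N(t)}$ to give all definitions for the LWL algorithm.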
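The weighting, normalization, and blending steps of this section can be sketched in Python as follows. This is only an illustration under stated assumptions: the biquadratic kernel is taken in its common form (squared one minus squared normalized distance), all local approximators share one radius, and the names `biquadratic_weight`, `normalized_weights`, and `lwl_estimate` are hypothetical.

```python
import math


def biquadratic_weight(x, center, radius):
    """Assumed biquadratic kernel: (1 - (d/radius)^2)^2 inside the support, else 0."""
    d = math.dist(x, center)
    if d >= radius:
        return 0.0
    return (1.0 - (d / radius) ** 2) ** 2


def normalized_weights(x, centers, radius):
    """Normalized weights that sum to one, or None when x activates no
    local approximator (x lies outside the union of the supports)."""
    weights = [biquadratic_weight(x, c, radius) for c in centers]
    total = sum(weights)
    if total == 0.0:
        return None
    return [w / total for w in weights]


def lwl_estimate(x, centers, radius, local_models):
    """Blend local approximators with the normalized weights; the estimate is
    defined as zero where no local approximator is active."""
    wbar = normalized_weights(x, centers, radius)
    if wbar is None:
        return 0.0
    return sum(w * f_k(x) for w, f_k in zip(wbar, local_models) if w > 0.0)
```

Because the normalized weights form a partition of unity wherever at least one kernel is active, the blended estimate is a convex combination of the active local models.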

B. Local Approximators

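We define

$$f_k^*(x) = \phi_k(x)^\top \theta_{f_k}^* \tag{11}$$

where $\phi_k(x)$ is a prespecified vector of continuous basis functions. For the function $f$ in (2), the vector $\theta_{f_k}^*$ denotes the unknown optimal parameter estimates for $x \in \bar{S}_k$

$$\theta_{f_k}^* = \arg\min_{\hat{\theta}_{f_k}} \int_{\bar{S}_k} \omega_k(x)\,\big|f(x) - \hat{f}_k(x)\big|^2\,dx \tag{12}$$

where

$$\hat{f}_k(x) = \phi_k(x)^\top \hat{\theta}_{f_k} \tag{13}$$

Note that $\theta_{f_k}^*$ is well defined for each $k$ because $f$ and $\phi_k$ are smooth on compact $\bar{S}_k$. Therefore, $f_k^*$ will be referred to as the optimal local approximator to $f$ on $\bar{S}_k$.

Let the approximation error on $\bar{S}_k$ be denoted as

$$\epsilon_{f_k}(x) = f(x) - f_k^*(x) \tag{14}$$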


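In order for $\epsilon_{f_k}$ to be defined everywhere, let

$$\epsilon_{f_k}(x) = \begin{cases} f(x) - f_k^*(x), & \text{on } \bar{S}_k \\ 0, & \text{otherwise.} \end{cases}$$

The controller will use a known design constant $\epsilon_f > 0$. We assume that the basis set $\phi_k$ is sufficiently rich and $\mu_k$ is sufficiently small such that $|\epsilon_{f_k}(x)| \le \bar{\epsilon}_f \le \epsilon_f$ for $x \in \bar{S}_k$ for some (unknown) positive constant $\bar{\epsilon}_f$. Note that the boundedness of $\epsilon_{f_k}$ comes from the fact that $f - f_k^*$ is continuous on compact $\bar{S}_k$.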

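For any $x \in \mathcal{A}^{N(t)}$, $f$ can be represented as the weighted sum of the local approximators

$$f(x) = \sum_k \bar{\omega}_k(x) f_k^*(x) + \delta_f(x) \tag{15}$$

This expression defines the approximation error $\delta_f(x)$ on $\mathcal{A}^{N(t)}$, which satisfies $|\delta_f(x)| \le \epsilon_f$, since

$$\delta_f(x) = \sum_k \bar{\omega}_k(x)\big(f(x) - f_k^*(x)\big) = \sum_k \bar{\omega}_k(x)\,\epsilon_{f_k}(x) \tag{16}$$

$$|\delta_f(x)| \le \sum_k \bar{\omega}_k(x)\,|\epsilon_{f_k}(x)| \le \epsilon_f \sum_k \bar{\omega}_k(x) = \epsilon_f \tag{17}$$

Therefore, if each local model $f_k^*$ has accuracy $\epsilon_f$ on $\bar{S}_k$, then the global accuracy of $\sum_k \bar{\omega}_k f_k^*$ on $\mathcal{A}^{N(t)}$ also achieves at least accuracy $\epsilon_f$. The term $\delta_f(x)$ in (15) is the inherent approximation error of $f$ for $x \in \mathcal{A}^{N(t)}$.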

Since we have assumed that $f$ is unknown, the parameter vector $\theta_{f_k}^*$ is unknown for each $k$. The control law will, therefore, be written using approximated functions defined globally on $\mathcal{D}$ by (10) and locally on $\bar{S}_k$ by (13). The controller will be adaptive in the sense that the local parameter vectors $\hat{\theta}_{f_k}$ will be adjusted to improve the function approximation accuracy while the controller is in operation. For analysis of the convergence of the parameter estimates, we define the parameter error vector as $\tilde{\theta}_{f_k} = \hat{\theta}_{f_k} - \theta_{f_k}^*$ for the $k$th local approximator. The controller will also be self-organizing in the sense that the structure of the approximator will be augmented during operation as necessary to achieve a tracking specification.

IV. APPROXIMATION-BASED FEEDBACK LINEARIZATION

We consider the design and analysis of self-organizing online approximation-based control for feedback linearizable systems. The task of this paper is to design an algorithm to achieve prespecified tracking accuracy $|e| \le \mu_e$. In [17], we developed a self-organizing approach, for the special case of system (1) and (2) where $n = 1$ and $g = 0$, that guaranteed $\mu_e$-accurate tracking. In this section, we present a self-organizing control approach appropriate for $n$th-order input-state feedback linearizable systems for general $g$ satisfying Assumption 2.1.

A. Tracking-Error-Based Controller

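The tracking error dynamics are described as

$$\dot{\tilde{x}}_i = \tilde{x}_{i+1}, \quad i = 1, \ldots, n-1 \tag{18}$$

$$\dot{\tilde{x}}_n = \big(f_0(x) + f(x)\big) + \big(g_0(x) + g(x)\big)u - x_d^{(n)} \tag{19}$$

which can be written in the matrix form as

$$\dot{\tilde{x}} = A\tilde{x} + B\Big(\big(f_0(x) + f(x)\big) + \big(g_0(x) + g(x)\big)u - x_d^{(n)}\Big) \tag{20}$$

for the tracking error vector $\tilde{x}$ defined prior to (4), where

$$A = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & \cdots & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}.$$

The time derivative of $e$ can then be written as

$$\dot{e} = L^\top\dot{\tilde{x}} = \sum_{i=1}^{n-1} l_i\tilde{x}_{i+1} + \big(f_0(x) + f(x)\big) + \big(g_0(x) + g(x)\big)u - x_d^{(n)} \tag{21}$$

The control law is designed as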

(22)

(23)

The motivation for this choice of control law will become clear in the subsequent analysis. In the previously described control law design, the linear feedback term is a stabilizing linear controller that could take a variety of forms. Here, for simplicity, we choose the state feedback form

whenwhen

for any scalar control gain defined for $x \in \mathcal{D}$ and the feedback gain vector defined for $x \notin \mathcal{D}$ such that the resulting closed-loop matrix is a Hurwitz matrix (i.e., all of its eigenvalues are in the left half complex plane). We let the sliding mode terms be zero for $x \in \mathcal{D}$. Outside the region $\mathcal{D}$, they are defined to ensure the states will converge to the approximation region $\mathcal{D}$. We design the sliding mode terms as

forfor andfor and

(24)

forfor andfor and

(25)

where and are known bounding functions such thatand for ,

respectively. The and terms are designed to dominate the(small) approximation errors and for . These termswill be zero for . We select the and terms as

(26)

(27)


where and

forotherwise.

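The function $\hat{f}$ is defined in (10) and $\hat{g}$ is defined similarly. We initialize the structure with no local approximators, i.e., $N(0) = 0$; therefore, the set $\mathcal{A}^{N(0)}$ is initially empty.

The following analysis will be concerned with bounding the duration of time for which $|e(t)| > \mu_e$. Therefore, to decrease redundancy later, we introduce the function $\bar{T}(t_1, t_2)$, which measures the amount of time in the interval $[t_1, t_2]$ for which $|e(t)| > \mu_e$. For example, this function could be computed as

$$\bar{T}(t_1, t_2) = \int_{t_1}^{t_2} \mathbb{1}\big(z(\tau)\big)\,d\tau \tag{28}$$

where the function $\mathbb{1}(\cdot)$ is defined as

$$\mathbb{1}(z) = \begin{cases} 1, & \text{if } z > 0 \\ 0, & \text{otherwise.} \end{cases}$$

In (28), the signal $z$ is defined as $z(t) = |e(t)| - \mu_e$.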
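For sampled data, the time spent outside the deadzone can be approximated by a simple rectangle rule. The sketch below is an assumption-laden illustration rather than the paper's implementation: `times` and `errors` are hypothetical sample arrays of the combined tracking error, and `mu_e` is the deadzone width.

```python
def time_outside_deadzone(times, errors, mu_e):
    """Approximate the total time in [times[0], times[-1]] for which |e| > mu_e,
    using a left-endpoint rectangle rule on the sampled error signal."""
    total = 0.0
    for k in range(len(times) - 1):
        if abs(errors[k]) > mu_e:  # indicator of |e| - mu_e > 0 at the left sample
            total += times[k + 1] - times[k]
    return total
```

With a small, fixed sample period this converges to the integral of the indicator function; a controller would typically update the running total once per sample instead of re-scanning the history.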

B. Analysis for $x \notin \mathcal{D}$

The objective of this section is to demonstrate that the definitions of the sliding mode terms in (24) and (25) ensure that all initial conditions outside $\mathcal{D}$ will return to $\mathcal{D}$ in finite time.

For $x \notin \mathcal{D}$, the approximated functions and all terms defined in (26) and (27) are zero; therefore, the resulting closed-loop tracking error dynamics derived from (20) satisfy

(29)

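Since we select the feedback gain vector $K$ such that $A - BK^\top$ is a Hurwitz matrix, for any given positive-definite $Q$, there exists a positive-definite matrix $P$ to satisfy the Lyapunov equation

$$(A - BK^\top)^\top P + P(A - BK^\top) = -Q.$$

In the following analysis, $Q$ is selected to be the identity matrix.

To analyze performance for $x \notin \mathcal{D}$, we define the Lyapunov function

$$V = \tilde{x}^\top P\tilde{x} \tag{30}$$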
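As a worked illustration of the Lyapunov equation above for a second-order system, the following Python sketch uses a closed-form solution. It assumes the closed-loop matrix has the companion form with rows `[0, 1]` and `[-k1, -k2]` (a chain of integrators with state feedback gains `k1`, `k2`) and that the right-hand side is the negative identity; the gain names and functions are hypothetical.

```python
def lyapunov_2x2(k1, k2):
    """Solve Ac^T P + P Ac = -I for Ac = [[0, 1], [-k1, -k2]].
    Ac is Hurwitz exactly when k1 > 0 and k2 > 0."""
    if k1 <= 0 or k2 <= 0:
        raise ValueError("gains must be positive for a Hurwitz closed-loop matrix")
    p12 = 1.0 / (2.0 * k1)
    p22 = (1.0 + k1) / (2.0 * k1 * k2)
    p11 = k1 * p22 + k2 * p12
    return [[p11, p12], [p12, p22]]


def lyapunov_residual(P, k1, k2):
    """Return Ac^T P + P Ac + I, which should be the zero matrix."""
    Ac = [[0.0, 1.0], [-k1, -k2]]
    R = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            s = 1.0 if i == j else 0.0
            for m in range(2):
                s += Ac[m][i] * P[m][j] + P[i][m] * Ac[m][j]
            R[i][j] = s
    return R
```

The eigenvalues of the resulting matrix then give the decay rate and scaling constants used when bounding the time needed to re-enter the operating region.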

The derivative of reduces to

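The design of the sliding mode terms in (24) and (25) will ensure that the second term on the right is nonpositive. This yields

$$\dot{V} \le -\tilde{x}^\top Q\tilde{x} = -\|\tilde{x}\|^2 \le -\frac{1}{\lambda_{\max}(P)}V \tag{31}$$

where $\lambda_{\max}(P)$ and $\lambda_{\min}(P)$ are the maximum and minimum eigenvalues of $P$, which are both positive since $P$ is positive definite. The comparison principle yields

$$\lambda_{\min}(P)\|\tilde{x}(t)\|^2 \le V(t) \le V(0)\,e^{-t/\lambda_{\max}(P)} \quad \text{for any } t \ge 0.$$

Therefore, for any $t$ larger than

$$\lambda_{\max}(P)\ln\frac{V(0)}{\lambda_{\min}(P)\gamma^2}$$

we have that $\|\tilde{x}(t)\| \le \gamma$, where $\gamma$ is the margin in (3). Because the desired trajectory is in $\mathcal{D}$ and is at least $\gamma$ from the boundary of $\mathcal{D}$, this implies that $x$ enters $\mathcal{D}$ in finite time.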

If the region $\mathcal{D}$ is defined such that its boundary coincides with a level curve of (30), then once $x \in \mathcal{D}$, the sliding mode terms will not allow $x$ to leave $\mathcal{D}$. If the boundary of $\mathcal{D}$ is not a level curve of (30), then states in $\mathcal{D}$ may leave that region, but will return to $\mathcal{D}$ in finite time. As long as the ultimate bound in (6) is smaller than the margin in (3), satisfaction of the tracking specification on $e$ (as proved subsequently) will ensure that ultimately the state is confined to $\mathcal{D}$ even in the case where its boundary is not a level curve of (30).

The remainder of this paper will only be concerned with the case of $x \in \mathcal{D}$, where all sliding mode terms are zero.

C. Analysis for $x \in \mathcal{D} - \mathcal{A}^{N(t)}$

With the control law defined in (22) and (23), for $x \in \mathcal{D}$, we consider the closed-loop dynamic equation derived from (21) for $e$ as

(32)

We will first consider the tracking performance for(i.e., ). In this case, the closed-loop dynamic equationfor defined in (32) becomes

(33)

For , define the Lyapunov function as. Then, the derivative of is

(34)

When , , and , the definitionsof and in (26) and (27) will ensure that

Then, the derivative of is reduced to

(35)

Therefore, if $|f(x)| \le \epsilon_f$ and $|g(x)| \le \epsilon_g$ while $|e| > \mu_e$, then the Lyapunov function must decrease. If $V$ increases while $|e| > \mu_e$, then it must be true that $|f(x)| > \epsilon_f$ or $|g(x)| > \epsilon_g$. This will be used as one condition for augmenting the approximator structure. Continuing from (35) using the comparison principle

(36)

(37)

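Therefore, if $|e|$ remains larger than $\mu_e$ for longer than the time bound implied by (36) and (37), then it must be true that $|f(x)| > \epsilon_f$ or $|g(x)| > \epsilon_g$ for some portion of that time interval. This will be used as a second condition for augmenting the approximator structure.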

D. Structure Adaptation

The analysis in Section IV-C demonstrated that the tracking specification (i.e., $|e| \le \mu_e$) can be achieved without approximation of the unknown design model error $f$ or $g$ in those regions of the operating envelope where the known design model provides sufficient accuracy (i.e., $|f(x)| \le \epsilon_f$ and $|g(x)| \le \epsilon_g$). In addition, that analysis showed that the error variable $e$ is a useful indicator of where the design model is not capable of achieving the tracking specification. Based on the analysis of Section IV-C, we define the following criteria for adding a new local approximator to the approximation structure. A local approximator is added if the following holds:

1) the present operating point $x(t)$ does not activate any of the existing local approximators (i.e., $x(t) \notin \mathcal{A}^{N(t)}$);
2) either of the following two conditions is satisfied:
   a) the function $V$ is increasing while $|e| > \mu_e$;
   b) $|e(\tau)| > \mu_e$ for all $\tau$ in a time window longer than the bound implied by (36) and (37).

For $N \ge 1$, we denote the time at which the $N$th local model is added as $T_N$ (i.e., $N(t) = N - 1$ just before $T_N$ and $N(T_N) = N$). The center location of the new local approximator is defined to be $c_N = x(T_N)$.

With this notation, $N(t)$ is constant with value $N$ for $t \in [T_N, T_{N+1})$. It is possible that for some $N$, the approximator has sufficient approximation capability, in which case $T_{N+1} = \infty$.
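The criteria for adding a local approximator can be summarized as a small decision function. This is a hedged sketch: the argument names are hypothetical, and `time_limit` merely stands in for the time bound obtained from (36) and (37), whose exact expression is not reproduced here.

```python
def should_add_local_approximator(point_is_active, v_increasing, abs_e, mu_e,
                                  time_outside, time_limit):
    """Return True when a new local approximator should be allocated.

    point_is_active: True if the current state activates an existing approximator.
    v_increasing:    True if the Lyapunov function is currently increasing.
    abs_e, mu_e:     magnitude of the combined tracking error and deadzone width.
    time_outside:    time the error has continuously stayed outside the deadzone.
    time_limit:      assumed stand-in for the bound implied by (36) and (37).
    """
    if point_is_active:
        return False  # criterion 1 fails: an existing approximator covers x
    outside_deadzone = abs_e > mu_e
    cond_a = v_increasing and outside_deadzone
    cond_b = outside_deadzone and time_outside > time_limit
    return cond_a or cond_b
```

When the function returns `True`, the new center would be placed at the current state, so the newly allocated region always contains the operating point that triggered it.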

E. Analysis for $t \in [T_N, T_{N+1})$

The goal of this section is to prove that for (i.e.,the number of local regions over this time interval is fixed to ),

, , , , , and for and that thetotal time is bounded for which . In this section, tosimplify the notation, we use the fact that and define

.As shown in Section III-B, the accuracy of

on achieves at least accuracy (i.e., forany ). Similarly, the functionachieves at least accuracy of on . Next, we will consider theadaptation of and to achieve the tracking specification(i.e., ultimately) for .

For , we consider the Lyapunov function

where and thepositive-definite matrices representlearning rates.

For , we choose the adaptive laws

ifotherwise

(38)

(39)

ifotherwise

(40)

where the projection modification operation in the adaptive law for the parameters of $\hat{g}$ ensures that $g_0(x) + \hat{g}(x)$ is bounded away from zero (see [19, App. E]). This is in accordance with Assumption 2.1 to ensure the controllability of the estimated model. The parameter adaptation will turn off when either $|e| \le \mu_e$ or $x \notin \mathcal{A}^{N(t)}$.

Let denote a time interval for whichis outside the deadzone (i.e., ). Over this time

interval, the state could be either outside or inside .1) For any subinterval , for which

, the parameter adaptation will automaticallystop because . Therefore, isconstant over this time interval. Section IV-C showed the

is decreasing over this interval. Therefore, alsodecreases for , such that and

(41)

2) For any subinterval , for which, the time derivative of along the solution of (32)

is

(42)

Substituting (38)–(40) and the definition of and from(26) and (27) into (42), we obtain

(43)


Note that since the projection modification ensures that, and have the same sign.

Thus

(44)

for any .Therefore, with , we can show that

(45)

From this, it is straightforward to show that

which implies

Next, we consider the case where enters the deadzone attime , stays inside the deadzone until , and leaves thedeadzone at . Therefore, denotes atime interval for which and is constant. Bydefinition

In addition, the following facts are true: 1) on the in-terval , the approximator parameters are constant(i.e., adaptation is off); 2) ; and3) . Using these facts, itis straightforward to show that and

. Note that these facts are in-dependent of whether enters or leaves the regionover the time interval .

In the following, we will consider the stability propertiesfor any . According to the criteria given inSection IV-D for adding the th local region, at

and . Assume starts at outsidethe deadzone, enters the deadzone at , leaves at , for

, and eventually stays outside the deadzone until .Let be the last time in this interval such that

. Therefore, the total time outside the deadzone foris

(46)

This proves that on each interval $[T_N, T_{N+1})$, the total time outside the deadzone is finite. Therefore, either $T_{N+1}$ is infinite with $|e|$ ultimately bounded by $\mu_e$, or $T_{N+1}$ is finite with $N(t)$ increased by one at $T_{N+1}$.

Another important result is that

which follows directly from previous analysis whetheror . Therefore, for any , , , , , , and

. Note that these properties hold even if thestate enters or leaves the region an infinite number of times,or if the combined tracking error enters or leaves the deadzonean infinite number of times.

Since $\mathcal{D}$ is compact and each increment of $N(t)$ covers a portion of a volume of $\mathcal{D}$ having radius $\mu_k$, only a finite number of increments of $N(t)$ can occur. Thus, $|e|$ is ultimately bounded by $\mu_e$ and the signals considered above remain bounded.

V. STABILITY OF THE SELF-ORGANIZING PROCESS

Here, we summarize the stability properties of the proposed self-organizing controller.

Theorem 5.1: The system described by (1) and (2) with the control law of (22) and (23) using the self-organizing function approximation proposed in Section IV-D and the parameter adaptation laws given in (38)–(40) has the following properties:

1) the closed-loop signals considered in Section IV-E are bounded;
2) $|e|$ is ultimately bounded by $\mu_e$;
3) each $|\tilde{x}_i|$ is ultimately bounded by $2^{i-1}\lambda^{i-n}\mu_e$, for $i = 1, \ldots, n$, with $\lambda$ being a constant selected for designing the $L$ vector.

Proof: Let the time interval of operation be specified as

, where can be infinite. We initialize the approxi-mator structure with . Denote the times at which

increases as , as discussed in Section IV-D.When , and . As shown in

Section IV-C, either the total time such that is lessthan , , and the theorem is proved,or is finite. In either case

For , the th local region is added at . The previoussections have already proved items 1 and 2 of the theorem for

. The only remaining issue is the boundedness ofeach during the transition from to . Wewant to show that each has a finite value. The Lyapunovfunction at is

Note that , because is continuous duringthe transition from to . Since does not activate the


first local approximators when the th local approximator is added at , the parameter estimates are unchanged from to . Thus

It has been proved in the previous section that, for any , . Then, we proceed to attain

(47)

For , each has a finite value as long as the initial parameter estimates at are finite. Similarly, each has a finite value. Since only a finite number of increments of can occur (i.e., ), the summation term on the right-hand side of the inequality (47) has a finite value. The term is also a finite value. This directly yields that , which implies that , , , , , and .

The ultimate bound on each comes directly from (5), which is guaranteed by the selection of as discussed in [18].

VI. NUMERICAL EXAMPLE

We consider for illustrative purposes a second-order system given by

where

For example, and we assume that there is only partial a priori knowledge of the system nonlinearities and . The known “design models” are and ; therefore, the unknown design model errors are and . We also assume that the system is designed to operate over the region .

A desired trajectory that is continuous, bounded, and has , for all , is generated by the third-order system

where two functions are included to ensure that , for any . In our simulations, we select such that the transfer functions

are BIBO stable. As long as is bounded, signals , and will be continuous and bounded. The main idea of such a prefilter approach to generating the reference trajectory is that the user specifies a signal . The prefilter computes the necessary derivatives and ensures that the assumptions of the controller are satisfied. Theoretically, can be any bounded signal. For the purpose of this simulation, is selected to be the sum of and a 0.05-Hz square wave of magnitude 1.5.

The tracking accuracy we expect to achieve is that as , with control gain . The linearly combined tracking error is defined as with selected to be 1. The desired tracking accuracy for is selected to be , so that we can ensure that ultimately. The function approximation accuracies are specified as .
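The prefilter idea described above (the user supplies a bounded signal, and a stable third-order filter produces a smooth desired trajectory together with its derivatives) can be sketched as follows. This is not the filter used in the paper: the triple real pole at `-omega`, the value `omega = 2.0`, the forward-Euler integration, and all function names are assumptions made for illustration, and the saturation functions mentioned in the text are omitted. Only the 0.05-Hz, magnitude-1.5 square-wave component of the input is taken from the text.

```python
def square_wave(t, freq_hz=0.05, magnitude=1.5):
    """Square wave with a 20-s period (0.05 Hz) and magnitude 1.5."""
    phase = (t * freq_hz) % 1.0
    return magnitude if phase < 0.5 else -magnitude


def prefilter_step(state, r, omega, dt):
    """One forward-Euler step of xd''' = -3w xd'' - 3w^2 xd' + w^3 (r - xd),
    i.e. the transfer function w^3 / (s + w)^3 (assumed pole placement)."""
    xd, xd1, xd2 = state
    xd3 = omega**3 * (r - xd) - 3.0 * omega**2 * xd1 - 3.0 * omega * xd2
    return (xd + dt * xd1, xd1 + dt * xd2, xd2 + dt * xd3)


def simulate(r_func, omega=2.0, dt=1e-3, t_end=20.0):
    """Return a list of (time, (xd, xd', xd'')) samples."""
    state = (0.0, 0.0, 0.0)
    history = []
    for k in range(int(round(t_end / dt))):
        state = prefilter_step(state, r_func(k * dt), omega, dt)
        history.append(((k + 1) * dt, state))
    return history


history = simulate(square_wave)
peak = max(abs(s[0]) for _, s in history)
print("peak |xd| over one period:", peak)
```

Because the assumed filter has unit DC gain and a nonnegative impulse response, the filtered trajectory never exceeds the magnitude of the input, and its first two derivatives are available as filter states rather than by numerical differentiation.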

The weighting function is the biquadratic kernel of the form

if
otherwise.

(48)

where

For this example, as done by the authors of [11] and [15], we specify the local basis function as

with being the center of the . Therefore, is the optimal local affine approximation to on . We select . The parameter estimates on the first allocated region and are initialized at as . When the th center is allocated at , the initial parameter estimates and are selected either to be zero vectors or based on and , where


Fig. 1. Combined tracking error e(t) in the time interval of [0, 20] (dotted), [20, 40] (dash-dot), [40, 60] (dashed), and [60, 80] (solid). The second dotted line (with larger magnitude) shows e(t) in the time interval of [0, 20] for the simulation with online approximation turned off.

is the index of the closest existing center to the th center. The logic for the parameter initialization is as follows:

if and

else

This initialization forces the th approximator to have the same value as the th approximator would have at the center of . This is a means of “bootstrapping” the learning process. The three elements of are initialized similarly. The adaptation rate matrices are set to and where is the square diagonal matrix with diagonal component equal to the vector .
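The weighting kernel and the bootstrapped initialization described above can be sketched as follows. Several details are assumptions rather than the paper's definitions: the kernel is taken as `(1 - rho^2)^2` for `rho < 1` and zero otherwise, with `rho` the infinity-norm distance to the center divided by a radius `mu` (the infinity norm is suggested by the square support regions in Fig. 4); the affine basis is taken as `[1, x - c]`, which gives three parameters per approximator for a two-dimensional state; and the new approximator copies the slope terms of its nearest neighbor. The condition that selects between bootstrapping and zero initialization is not recoverable from the text and is not modeled. All function names are hypothetical.

```python
def biquadratic_weight(x, center, mu):
    """Assumed biquadratic kernel: (1 - rho^2)^2 inside the support, else 0."""
    rho = max(abs(a - b) for a, b in zip(x, center)) / mu
    return (1.0 - rho * rho) ** 2 if rho < 1.0 else 0.0


def affine_basis(x, center):
    """Assumed local affine basis [1, x - c]."""
    return [1.0] + [a - b for a, b in zip(x, center)]


def local_value(theta, x, center):
    """Value of one local affine approximator at x."""
    return sum(t * p for t, p in zip(theta, affine_basis(x, center)))


def bootstrap_theta(new_center, centers, thetas):
    """Initialize a new approximator from the nearest existing one so that
    both give the same value at the new center (zero vector if none exist)."""
    if not centers:
        return [0.0] * (len(new_center) + 1)
    dists = [max(abs(a - b) for a, b in zip(new_center, c)) for c in centers]
    j = dists.index(min(dists))
    value = local_value(thetas[j], new_center, centers[j])
    # Copying the slopes is one choice; zero slopes would also match the value.
    return [value] + list(thetas[j][1:])


centers = [[0.0, 0.0]]
thetas = [[0.5, 1.0, -2.0]]
new_center = [0.4, 0.1]
theta_new = bootstrap_theta(new_center, centers, thetas)
print(theta_new)
```

With the slopes copied, the new approximator initially reproduces the neighbor's affine model everywhere, so the approximation does not jump when the new region becomes active.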

Fig. 1 shows the performance of e(t) for t ∈ [0, 80] s. Since the period of the reference input is 20 s, we have plotted e(t) for t ∈ [0, 20] (dotted), [20, 40] (dash-dot), [40, 60] (dashed), and [60, 80] (solid) along the same time axis. The time axis of each plot has been shifted by a multiple of 20 s to increase the resolution of the time axis and to facilitate the comparison over repetitions of the trajectory. Note that with online approximation, since the local regions are being revisited periodically with , the tracking performance improves (i.e., |e(t)| tends to decrease) with each repetition of the trajectory. This indicates that the local approximators are learning to achieve an increasingly accurate approximation of the actual function. It is particularly important to note that the learning is a function of state and the performance improvement will extend to different trajectories to the extent that the new trajectories utilize the same regions in state space. For comparison, the second dotted line shows a clipped portion of e(t) for simulations without learning of the online approximation. For the simulation without adaptation, the tracking performance does not show any improvement over the repetitions of the trajectory; the magnitude of is 0.6.

Fig. 2. Plots of the tracking errors versus time t: (a) x̃₁ versus t; (b) x̃₂ versus t. Note that the horizontal axes are identical and that the caption is only applied to (b). The dotted lines are tracking errors in the time interval of [0, 20]. The dash-dot lines are tracking errors in the time interval of [20, 40]. The dashed lines are tracking errors in the time interval of [40, 60]. The solid lines are tracking errors in the time interval of [60, 80]. The second dotted lines (with larger magnitude) show the tracking errors in the time interval of [0, 20] for the simulation with online approximation turned off.

The tracking performance can be shown more clearly in Fig. 2, which plots the tracking errors x̃₁ and x̃₂ over the first four repetitions of the reference trajectory. Again, for comparison, the second dotted lines (with larger magnitude) show the tracking errors for simulations without the learning of the online approximation. Without learning, the range of was from −0.6 to 0.6.

Fig. 3(a) plots the time outside the deadzone at 20-s intervals, which is also the period of the commanded state . For example, consider . Fig. 3(a) shows that the time outside the deadzone in the interval is approximately 13 s. Prior to 300 s, the combined tracking error enters the deadzone and remains therein for the remainder of the simulation. Therefore, as shown in Fig. 3(b), the total time outside the deadzone is finite as t goes to infinity. This demonstrates that the tracking error specification is ultimately achieved and that the controller does not include large gains or large-amplitude high-frequency switching.
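The two quantities plotted in Fig. 3 can be computed from a sampled error signal as sketched below. The deadzone width 0.02 and the 20-s interval are taken from the Fig. 3 caption and the text; the function name, the uniform sampling step `dt`, and the sample data are hypothetical.

```python
def time_outside_deadzone(e_samples, dt, deadzone=0.02, interval=20.0):
    """Return (per_interval, cumulative): the time with |e| > deadzone in each
    consecutive window of length `interval`, and its running total."""
    steps = int(round(interval / dt))
    per_interval = []
    for start in range(0, len(e_samples), steps):
        window = e_samples[start:start + steps]
        per_interval.append(dt * sum(1 for e in window if abs(e) > deadzone))
    cumulative = []
    total = 0.0
    for value in per_interval:
        total += value
        cumulative.append(total)
    return per_interval, cumulative


# Toy data: 1-s samples, 4-s windows, to keep the example readable.
errors = [0.1, 0.0, 0.03, 0.01, 0.0, 0.0, 0.05, 0.0]
per_interval, cumulative = time_outside_deadzone(errors, dt=1.0, interval=4.0)
print(per_interval, cumulative)
```

Once the error stays inside the deadzone, the per-interval values drop to zero and the cumulative curve becomes flat, which is the behavior Fig. 3(b) is described as showing.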

The allocated center locations for are indicated on a phase plane plot of versus in Fig. 4. During the first 80 s, 35 centers are allocated with 19 on the right of the region and 16 on the left. For the remaining simulation time, no more centers are added. Each “ ” indicates an allocated center location. The small square around each center indicates the support of each local approximator. Since this is an example for illustrative purposes, we already know functions and . We can plot the region

where both has at least the accuracy of and has at least the accuracy of . Note that the separation between adjacent


Fig. 3. Graphs indicating the incremental and cumulative time that the tracking error e is outside the deadzone. (a) Each circle indicates the total time during the previous 20-s interval that the combined tracking error e was outside the deadzone (i.e., |e| > 0.02). (b) Cumulative time outside the deadzone. Note that the horizontal axes are identical and that the caption is only applied to (b).

Fig. 4. Phase plane plot of x versus x for t ∈ [0, 80] s. The dotted line is the desired trajectory for t ∈ [0, 80] s. The solid line shows the actual trajectory. The markers indicate the assigned center locations and the small square around each center location represents the associated region of support. M labels the region where |f| < 0.03 and |g| < 0.03.

centers is approximately either in the horizontal direction or in the vertical direction. No local approximators are allocated inside the region where both model errors and are small (i.e., and ). Even outside the region

Fig. 5. (a) Actual function f and its approximation f̂ at 400 s plotted versus x for x = 0.8. (b) Actual function g and its approximation ĝ at 400 s plotted versus x for x = 0.8. M = {x | −2.21 ≤ x ≤ 0.61} represents the region where |f(x)| < 0.03 and |g(x)| < 0.03 at the fixed value of x = 0.8. (c) Histogram of the x values along the trajectory (x, x) such that 0.4 < x < 1.2. The markers along the horizontal axis indicate the x value of center locations that would be active with x = 0.8.

, no local approximators are allocated when either or . Fig. 4 also depicts the desired (dotted) and actual (solid) trajectories over the time interval [0, 80].

Fig. 5(a) plots the actual function (dotted) and the approximated (solid) at 400 s versus for fixed . The figure also shows the region such that for with , both has at least the accuracy of and has at least the accuracy of . Since no local approximators are allocated for , on . For some , has adapted. Over a portion of the operation range, has converged toward . Fig. 5(c) shows a histogram of the values of along the trajectory for which . These are the trajectory points that could have an influence on the function approximations shown in Fig. 5(a) and (b). Note that the approximators are zero in the following: 1) the region where they are not needed due to the accuracy of the design model, 2) regions where the trajectory did not enter, and 3) regions where the state did traverse but where performance was sufficiently good that center allocation did not occur. Similarly, Fig. 5(b) displays the actual function (dotted) and the approximator (solid) at 400 s versus for . Note that for negative values of , the approximated has overestimated the value of . The theory herein gives no guarantee of convergence of the parameter estimates, but does ensure the following: 1) they are bounded, 2) the Lyapunov function is decreasing, and 3) the tracking error


Fig. 6. Phase plane plot of x versus x for t ∈ [0, 120] s for a simulation with r(t) defined as a sequence of random step inputs with amplitudes in [−4.5, 4.5]. The markers indicate the assigned center locations. The small square around each center location represents the associated region of support. M labels the region where |f| < 0.03 and |g| < 0.03.

is ultimately within the deadzone. This can be accomplished without convergence of the parameter errors.

For Figs. 1–5, r(t) was a periodic signal to facilitate the illustration of the learning process. Fig. 6 shows the trajectory (solid) and center distribution for a 120-s simulation in which r(t) is a step function of random amplitude in [−4.5, 4.5], where the step function amplitude changed every 5.0 s. The randomness of the amplitude leads to a greater exploration of the operating region. Note that the allocated centers still (correctly) lie outside of M.
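A reference of this kind (a piecewise-constant signal whose amplitude is redrawn every 5.0 s from [−4.5, 4.5] over 120 s) can be generated as sketched below. The uniform distribution, the seed, and the function name are assumptions; the text specifies only the amplitude range, the hold time, and the simulation length.

```python
import math
import random


def random_step_reference(t_end=120.0, hold=5.0, amp=4.5, seed=0):
    """Build r(t): constant on each `hold`-second window, with a level drawn
    uniformly (an assumption) from [-amp, amp] for every window."""
    rng = random.Random(seed)
    n_steps = int(math.ceil(t_end / hold))
    levels = [rng.uniform(-amp, amp) for _ in range(n_steps)]

    def r(t):
        index = min(int(t // hold), n_steps - 1)  # clamp at the final window
        return levels[index]

    return r, levels


r, levels = random_step_reference()
print(len(levels), "step levels; r(0) =", r(0.0), "; r(5) =", r(5.0))
```

Feeding such a signal through a reference prefilter keeps the commanded trajectory smooth while still visiting a wider part of the operating region than a periodic input would.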

VII. CONCLUSION

In this paper, we have developed a new self-organizing scheme for nth-order input-state feedback linearizable systems. The online function approximation is based on the LWR framework, but can be extended to other approximators with locally supported basis functions. The proposed online self-organizing structure is stable in the sense of Lyapunov and guarantees that a linear combination of tracking errors (i.e., with the vector selected in the design stage) will ultimately achieve a specified tracking error bound. This is achieved without enlarging the bandwidth of the control system. In addition, with the appropriate selection of the linear combination (i.e., ), each state tracking error can be shown to ultimately achieve a certain accuracy that is related to the prespecified accuracy of the combined tracking error.

The advantage of the self-organizing approach is that it allocates new local approximators only as necessary to achieve the desired tracking accuracy. Previously existing approaches allocated new local approximators based on the state, independent of tracking performance. Therefore, for the same value of , the approach presented herein should always require fewer local approximators. Due to the “curse of dimensionality,” the payoff for self-organizing systems is much higher for systems with higher state dimension, especially when the trajectory does not fully explore the operating envelope denoted herein by .

In our future work, adaptation of the radius of the region of support (i.e., ) is of interest. In this paper, was picked a priori assuming that both continuous functions and could be approximated by linear functions and with at least accuracy and , respectively. While this is a much looser assumption than explicit knowledge of the functional forms of and , methods that adapt in a stable fashion are also of interest. Note that for continuous functions this assumption will be true for small enough . Therefore, methods where positive is monotonically decreased should be straightforward. Of greater interest are methods that adjust based on performance. An additional important direction for future research is extension of performance-based self-organization to systems more general than have been considered herein.

ACKNOWLEDGMENT

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

REFERENCES

[1] F.-C. Chen and H. K. Khalil, “Adaptive control of a class of nonlinear discrete-time systems using neural networks,” IEEE Trans. Autom. Control, vol. 40, no. 5, pp. 791–801, May 1995.

[2] J. Y. Choi and J. A. Farrell, “Nonlinear adaptive control using networks of piecewise linear approximators,” IEEE Trans. Neural Netw., vol. 11, no. 2, pp. 390–401, Mar. 2000.

[3] Z.-P. Jiang and L. Praly, “Design of robust adaptive controllers for nonlinear systems with dynamic uncertainties,” Automatica, vol. 34, no. 7, pp. 825–840, 1998.

[4] F. L. Lewis, K. Liu, and A. Yesildirek, “Neural net robot control with guaranteed tracking performance,” IEEE Trans. Neural Netw., vol. 6, no. 3, pp. 703–715, May 1995.

[5] F. L. Lewis, A. Yesildirek, and K. Liu, “Multilayer neural-net robot controller with guaranteed tracking performance,” IEEE Trans. Neural Netw., vol. 7, no. 2, pp. 388–399, Mar. 1996.

[6] R. M. Sanner and J. J. Slotine, “Gaussian networks for direct adaptive control,” IEEE Trans. Neural Netw., vol. 3, no. 6, pp. 837–863, Nov. 1992.

[7] M. Polycarpou, “Stable adaptive neural control scheme for nonlinear systems,” IEEE Trans. Autom. Control, vol. 41, no. 3, pp. 447–451, Mar. 1996.

[8] M. Polycarpou and M. Mears, “Stable adaptive tracking of uncertain systems using nonlinearly parameterized on-line approximators,” Int. J. Control, vol. 70, no. 3, pp. 363–384, 1998.

[9] K. Hornik, M. Stinchcombe, and H. White, “Multilayer feedforward networks are universal approximators,” Neural Netw., vol. 2, pp. 359–366, 1989.

[10] J. Park and I. W. Sandberg, “Universal approximation using radial basis function networks,” Neural Comput., vol. 3, no. 2, pp. 246–257, 1991.

[11] J. Nakanishi, J. A. Farrell, and S. Schaal, “A locally weighted learning composite adaptive controller with structure adaptation,” in Proc. IEEE/RSJ Int. Conf. Intell. Robot. Syst., 2002, pp. 882–889.

[12] J. Nakanishi, J. A. Farrell, and S. Schaal, “Composite adaptive control with locally weighted statistical learning,” Neural Netw., vol. 18, no. 1, pp. 71–90, 2005.

[13] R. M. Sanner and J. J. Slotine, “Structurally dynamic wavelet networks for adaptive control of robotic systems,” Int. J. Control, vol. 70, no. 3, pp. 405–421, 1998.

[14] M. Cannon and J.-J. Slotine, “Space-frequency localized basis function networks for nonlinear system estimation and control,” Neurocomput., vol. 9, pp. 293–342, 1995.


[15] C. G. Atkeson, A. W. Moore, and S. Schaal, “Locally weighted learning for control,” Artif. Intell. Rev., vol. 11, pp. 75–113, 1996.

[16] S. Schaal and C. G. Atkeson, “Constructive incremental learning from only local information,” Neural Comput., vol. 10, no. 8, pp. 2047–2084, 1998.

[17] J. A. Farrell and Y. Zhao, “Self-organizing approximation based control,” in Proc. Amer. Control Conf., Minneapolis, MN, Jun. 14–16, 2006, pp. 3378–3384.

[18] J. Slotine and W. Li, Applied Nonlinear Control. Englewood Cliffs, NJ: Prentice-Hall, 1991.

[19] M. Krstic, I. Kanellakopoulos, and P. Kokotovic, Nonlinear and Adaptive Control Design. New York: Wiley, 1995.

Yuanyuan Zhao (S’05) received the B.S. and M.S. degrees in electrical engineering from the University of Science and Technology, Beijing, China, in 1997 and 2000, respectively, and the Ph.D. degree in electrical engineering from the University of California, Riverside, in 2007.

Her research interests are in computational intelligence, learning control, nonlinear adaptive control, and nonlinear robust control.

Jay A. Farrell (S’87–M’89–SM’98) received the B.S. degrees in physics and electrical engineering from the Iowa State University, Ames, in 1986 and the M.S. and Ph.D. degrees in electrical engineering from the University of Notre Dame, Notre Dame, IN, in 1988 and 1989, respectively.

From 1989 to 1994, he was a Principal Investigator on projects involving intelligent and learning control systems for autonomous vehicles at the Charles Stark Draper Laboratory. He is a Professor and former Chair of the Department of Electrical Engineering, University of California, Riverside. He is the author of over 140 technical publications. He is also a coauthor of the books The Global Positioning System and Inertial Navigation (New York: McGraw-Hill, 1998) and Adaptive Approximation Based Control: Unifying Neural, Fuzzy and Traditional Adaptive Approximation Approaches (New York: Wiley, 2006).

Dr. Farrell received the Engineering Vice President’s Best Technical Publication Award in 1990 and Recognition Awards for Outstanding Performance and Achievement in 1991 and 1993.