
Analysis of Minimal Radial Basis Function Network Algorithm

for Real-Time Identification of Nonlinear Dynamic Systems

Li Yan, N. Sundararajan and P. Saratchandran
School of Electrical & Electronic Engineering
Nanyang Technological University, Singapore 639798
E-mail: [email protected]

Abstract:

This paper first presents a performance analysis of the recently developed Minimal Resource Allocating Network (MRAN) algorithm for on-line identification of nonlinear dynamic systems. Using nonlinear time-invariant and time-varying identification benchmark problems, MRAN's performance is compared with the recently proposed On-line Structural Adaptive Hybrid Learning (ONSAHL) algorithm of Junge and Unbehauen. The results indicate that MRAN realizes networks using fewer hidden neurons than the ONSAHL algorithm, with better approximation accuracy. Next, methods for improving the run-time performance of MRAN for real-time identification of nonlinear systems are developed. An extension to MRAN, referred to as the Extended Minimum Resource Allocating Network (EMRAN), which utilizes a winner neuron strategy, is highlighted. This modification reduces the computational load of MRAN and leads to a considerable reduction in the learning time with only a slight increase in the approximation error. Using the same benchmark problems, the results show that EMRAN is well suited for fast on-line identification of nonlinear plants.

Keywords: Extended Kalman filter, Extended minimum resource allocating network, Nonlinear dynamic system identification, Radial basis function neural network

1 Introduction

Neural networks have been used as nonlinear dynamic system controllers to tackle problems for which conventional approaches have proven to be ineffective [1]. However, because a large computation time is required for the learning process, the practical use of neural networks in on-line control schemes is sparse¹, especially in areas such as flight control [2] [3]. Hence, the problem of designing fast on-line learning algorithms for practical implementation of neural control schemes remains an active research topic.

Since the late eighties, there has been considerable interest in Radial Basis Function (RBF) neural networks, due to their good global generalization ability and a simple network structure that avoids lengthy calculations [4]. Gaussian functions are selected in the majority of cases as radial basis functions, even though other functions, such as thin-plate functions, can also be used [7].

¹ Generally, an off-line training process for neural controllers is required.


These Gaussian functions have two parameters, namely the center and the width, which have to be determined. A number of algorithms have been proposed for training the RBF network [5] [6] [7]. The classical approach to RBF implementation is to fix the number of hidden neurons a priori, along with their centers and widths, based on some properties of the input data, and then estimate the weights connecting the hidden and output neurons. Two methods have been proposed to find the proper number of hidden neurons for a given problem. [9] introduced the concept of building up the hidden neurons from zero to the required number, with the update of the RBF parameters being done by a gradient descent algorithm. An alternate approach is to start with as many hidden units as the number of inputs and then reduce them using a clustering algorithm, which essentially puts patterns that are close in the input space into a cluster to remove the unnecessary hidden neurons [10]. However, in all these studies, the main learning scheme is of batch type, which is not suitable for on-line learning.

In 1991, Platt [11] proposed a sequential learning algorithm to remedy the above drawbacks. In Platt's Resource Allocating Network (RAN) algorithm, hidden neurons are added based on the novelty of the new data, and the weights connecting the hidden neurons to the output neurons are estimated using the LMS method. Platt showed the resulting network topology to be more parsimonious than the classical RBF networks. [12] proposed modifications to improve RAN by using an EKF instead of the LMS to estimate the network parameters. The resulting network, called RANEKF, is more compact and has better accuracy than RAN. A further improvement to RAN and RANEKF was proposed by [13], in which a pruning strategy was introduced to remove those neurons that consistently made little contribution to the network output. The resulting network, called minimal RAN (MRAN), was shown to be more compact than RAN and RANEKF for several applications in the areas of function approximation and pattern classification [14]. Preliminary results of using MRAN for nonlinear system identification problems were presented in [15].

Recently, another sequential learning algorithm has been proposed by Junge and Unbehauen [16] [17]. Their algorithm incorporates the idea of on-line structural adaptation to add new hidden neurons, and uses an error-sensitive clustering algorithm to adapt the centers and widths of the hidden neurons. The algorithm, known as On-line Structural Adaptive Hybrid Learning (ONSAHL), is shown in [17] to produce compact networks for nonlinear dynamic system identification problems.

In the first part of the paper, we present a comparison of the performance of MRAN with the ONSAHL algorithm on the same nonlinear identification benchmark problems from [17]. This study is intended to compare the complexity of the resulting networks and the accuracy of approximation achieved by MRAN and ONSAHL in the field of nonlinear system identification.

For any practical application of a newly developed identification algorithm, it is important to study its real-time implementation, and such a study is undertaken for the MRAN algorithm in the latter sections of this paper. In the MRAN algorithm, the parameters of the network, including all the hidden neurons' centers, widths and weights, have to be updated at every step. The size of the matrices to be updated therefore becomes large as the number of hidden neurons increases, and the RBF network structure becomes computationally more complex, which directly results in a large computational load and limits the use of MRAN in real-time implementations. An analysis of the breakdown of the computation time for one cycle of MRAN, in terms of the different steps of the algorithm, is then presented under a Visual C++ environment. Based on this analysis, operation counts, specifically the number of multiplications in a cycle, are examined to find the bottleneck in the computation time.


Based on this analysis, an extension to MRAN called Extended MRAN (EMRAN) is proposed. The focus is to reduce the computational load of MRAN and to realize a scheme for fast on-line identification. For this purpose, a 'winner neuron' strategy is incorporated into the conventional MRAN algorithm. The key idea in the EMRAN algorithm is that at every step, only those parameters that are related to the selected winner neuron are updated by the EKF. EMRAN attempts to reduce the on-line computation time considerably and to avoid memory overflow, while retaining the good characteristics of MRAN, namely a small number of hidden neurons and a low approximation error. In this paper, these benefits of EMRAN are illustrated using the same benchmark problems from the nonlinear system identification area, which consist of SISO nonlinear time-varying and MIMO nonlinear time-invariant plants. Simulation results show that EMRAN is well suited for real-time implementation of nonlinear system identification.

The paper is organized as follows: Section 2 gives a brief description of the MRAN and ONSAHL algorithms and also highlights how MRAN can be used for the nonlinear system identification problem. Section 3 presents the results of a comparison on two nonlinear identification problems from [16] [17]. In Section 4, a complete analysis of the MRAN algorithm in terms of elemental computation times is made to identify the bottleneck for real-time implementation. Using this knowledge, the improvements to MRAN leading to the EMRAN algorithm are also discussed in Section 4. Section 5 gives a comparison of both MRAN and EMRAN on these benchmark identification problems for real-time implementation. Section 6 summarizes the conclusions from this study.

2 MRAN Algorithm and Nonlinear System Identification

In this section, the MRAN algorithm is briefly described and the problem of identifying a given nonlinear dynamic system is formulated. A brief introduction to the ONSAHL algorithm is also presented.

2.1 Minimal Resource Allocating Network (MRAN) Algorithm

The MRAN algorithm proposed by Lu Yingwei et al. [13] [14] combines the growth criteria of RAN with a pruning strategy to realize a minimal network structure. This algorithm is an improvement on the RAN of Platt [11] and the RANEKF algorithm of Kadirkamanathan [12]. In this section, we present the MRAN algorithm for training RBF networks. Before explaining the algorithm in detail, the RBF network is briefly described.

Figure 1 shows a typical RBF network with n_x inputs x = (x_1, ..., x_{n_x}) and n_y outputs \hat{y} = (\hat{y}_1, ..., \hat{y}_{n_y}). The hidden layer consists of N computing units (\Phi_1 to \Phi_N), which are connected to the output layer by N weight vectors (a_1 to a_N). The outputs of the network, which approximate the true output y, are

\hat{y} = f(x) = a_0 + \sum_{n=1}^{N} a_n \Phi_n(x) \qquad (1)

where \Phi_n(x) is the response of the nth hidden neuron to the input x, a_n is the weight connecting the nth hidden unit to the output unit, and a_0 is the bias term.


\Phi_n(x) is a Gaussian function given by,

\Phi_n(x) = \exp\left( - \frac{\| x - \mu_n \|^2}{\sigma_n^2} \right) \qquad (2)

where \mu_n is the center of the nth hidden neuron, \sigma_n is the width of the Gaussian function, and \| \cdot \| denotes the Euclidean norm.
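As a minimal illustration of equations (1) and (2), the following Python sketch computes the output of a Gaussian RBF network. The code and its variable names (centers, widths, weights, bias) are our own and are not taken from the paper.

```python
import numpy as np

def rbf_output(x, centers, widths, weights, bias):
    """Equations (1)-(2): y_hat = a0 + sum_n a_n * exp(-||x - mu_n||^2 / sigma_n^2).

    x       : (nx,)    input vector
    centers : (N, nx)  Gaussian centers mu_n
    widths  : (N,)     Gaussian widths sigma_n
    weights : (N, ny)  output weight vectors a_n
    bias    : (ny,)    bias term a0
    """
    dist_sq = np.sum((centers - x) ** 2, axis=1)   # ||x - mu_n||^2 for all n
    phi = np.exp(-dist_sq / widths ** 2)           # hidden unit responses, eq. (2)
    return bias + phi @ weights                    # weighted sum plus bias, eq. (1)
```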

In the MRAN algorithm, the RBF network begins with no hidden units, that is, N = 0. As each input-output training pair (x_i, y_i) (where i is the time index) is received, the network is built up based on certain growth criteria. The following steps describe the basic ideas of the MRAN algorithm.

• Step 1: Calculate the three errors defined below.
The first step of the algorithm is to check whether the criteria for recruiting a new hidden unit are met,

\| e_i \| = \| y_i - \hat{y}_i(x_i) \| > E_1 \qquad (3)

e_{rms,i} = \sqrt{ \sum_{j=i-(M-1)}^{i} \frac{\| e_j \|^2}{M} } > E_2 \qquad (4)

d_i = \| x_i - \mu_{ir} \| > E_3 \qquad (5)

where \mu_{ir} is the center of the hidden unit closest to the current input x_i, and E_1, E_2 and E_3 are thresholds to be selected appropriately. Equation (3) decides whether the existing nodes are insufficient to obtain a network output that meets the error specification. Equation (4) checks whether the network has met the required sum-squared-error specification over the past M outputs of the network. Equation (5) ensures that the new node to be added is sufficiently far from all the existing nodes.

Only when all these criteria are met is a new hidden node recruited. In that case, go to Step 2 to add a new RBF hidden unit; otherwise, go to Step 3 to update all the parameters of the network using the EKF.

• Step 2: Inclusion of a new RBF hidden unit.
When all the criteria in Step 1 are satisfied, a new hidden unit is recruited. Each new hidden unit added to the network has the following parameters associated with it,

a_{N+1} = e_i, \quad \mu_{N+1} = x_i, \quad \sigma_{N+1} = \kappa \| x_i - \mu_{ir} \| \qquad (6)

These parameters are set so as to cancel the error produced by the current input. The overlap of the responses of the hidden units in the input space is determined by the overlap factor \kappa. After adding the new hidden neuron, go to Step 5 to perform the pruning strategy.

• Step 3: Calculating the gradient matrix B_i.
If the three criteria for adding a new hidden unit cannot all be satisfied, an adaptation of the network parameters is carried out instead. B_i = \nabla_w f(x_i) is the gradient matrix of the function f(x_i) with respect to the parameter vector w, evaluated at w_{i-1}; it is used in the next step.

B_i = \left[ I, \; \Phi_1(x_i) I, \; \Phi_1(x_i) \frac{2 a_1}{\sigma_1^2} (x_i - \mu_1)^T, \; \Phi_1(x_i) \frac{2 a_1}{\sigma_1^3} \| x_i - \mu_1 \|^2, \; \ldots, \; \Phi_N(x_i) I, \; \Phi_N(x_i) \frac{2 a_N}{\sigma_N^2} (x_i - \mu_N)^T, \; \Phi_N(x_i) \frac{2 a_N}{\sigma_N^3} \| x_i - \mu_N \|^2 \right]^T \qquad (7)

After this preparation, the vector w can be updated; therefore, go to Step 4.


• Step 4: Updating the parameters using the EKF.
In this step, the network parameters w = [a_0^T, a_1^T, \mu_1^T, \sigma_1, \ldots, a_N^T, \mu_N^T, \sigma_N]^T are adapted using the EKF as follows (a minimal sketch of this update appears after the step list below),

w_i = w_{i-1} + K_i e_i \qquad (8)

where K_i is the Kalman gain matrix given by,

K_i = P_{i-1} B_i \left[ R_i + B_i^T P_{i-1} B_i \right]^{-1} \qquad (9)

R_i is the variance of the measurement noise. P_i is the error covariance matrix, which is updated by,

P_i = \left[ I_{z \times z} - K_i B_i^T \right] P_{i-1} + q I_{z \times z} \qquad (10)

where q is a scalar that determines the allowed random step in the direction of the gradient matrix. If the number of parameters to be adjusted is z, then P_i is a z \times z positive definite symmetric matrix. When a new hidden neuron is allocated, the dimensionality of P_i increases to,

P_i = \begin{pmatrix} P_{i-1} & 0 \\ 0 & p_0 I_{z_1 \times z_1} \end{pmatrix} \qquad (11)

where the new rows and columns are initialized to p_0, an estimate of the uncertainty in the initial values assigned to the new parameters. The dimension z_1 of the identity matrix I is equal to the number of new parameters introduced by the new hidden neuron. Then go to Step 5 for the pruning strategy.

• Step 5: Pruning strategy.
The last step of the algorithm is to prune those hidden neurons that contribute little to the network's output for N_w consecutive observations. Let the matrix O denote the outputs of the hidden layer and A denote the weight matrix A = (a_1, \ldots, a_N), and consider the output O_{nj} (j = 1, \ldots, n_y) of the nth hidden neuron,

O_{nj} = A_{nj} \exp\left( - \frac{\| x - \mu_n \|^2}{\sigma_n^2} \right), \quad n = 1, \ldots, N, \; j = 1, \ldots, n_y \qquad (12)

If A_{nj} or \sigma_n in the above equation is small, O_{nj} may become small. Also, if \| x - \mu_n \| is large, the output will be small; this means that the input is sufficiently far away from the center of this hidden neuron. To reduce the inconsistency caused by using the absolute value of the output, this value is normalized to the highest output,

r_{nj} = \frac{O_{nj}}{\max\{ O_{1j}, O_{2j}, \ldots, O_{Nj} \}}, \quad n = 1, \ldots, N, \; j = 1, \ldots, n_y \qquad (13)

The normalized output of each neuron r_{nj} is then observed for N_w consecutive inputs. A neuron is pruned if its outputs r_{nj} (j = 1, \ldots, n_y) fall below a threshold value \delta for N_w consecutive inputs. The dimensions of all the related matrices are then adjusted to suit the reduced network.
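As referenced in Step 4 above, the following hedged Python sketch performs one EKF parameter update (equations (8) to (10)) and one covariance enlargement (equation (11)); it assumes the gradient matrix B_i of equation (7) has already been formed, and all names are ours rather than the authors'.

```python
import numpy as np

def ekf_update(w, P, B, e, R, q):
    """One EKF step for the full MRAN parameter vector (eqs. (8)-(10)).

    w : (z,)      parameter vector w_{i-1}
    P : (z, z)    error covariance P_{i-1}
    B : (z, ny)   gradient of f(x_i) w.r.t. w at w_{i-1}
    e : (ny,)     output error e_i
    R : (ny, ny)  measurement noise variance
    q : float     allowed random-step scalar
    """
    z = P.shape[0]
    K = P @ B @ np.linalg.inv(R + B.T @ P @ B)          # Kalman gain, eq. (9)
    w_new = w + K @ e                                   # parameter update, eq. (8)
    P_new = (np.eye(z) - K @ B.T) @ P + q * np.eye(z)   # covariance update, eq. (10)
    return w_new, P_new

def grow_covariance(P, z1, p0):
    """Enlarge P when a new hidden neuron is allocated, eq. (11)."""
    z = P.shape[0]
    P_big = np.zeros((z + z1, z + z1))
    P_big[:z, :z] = P
    P_big[z:, z:] = p0 * np.eye(z1)
    return P_big
```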

The sequential learning algorithm for MRAN can be summarized as follows.


1. Obtain an input and calculate the network output (equations (1) and (2)) and the corresponding errors (equations (3) to (5)).

2. Create a new RBF center (equation (6)) if all three inequalities (3) to (5) hold.

3. If the condition in 2. is not met, adjust the weights, centers and widths of the existing RBF network using the EKF (equations (7) to (11)).

In addition, a pruning strategy is adopted:

1. If a center's normalized contribution to the output is found to be below a threshold value for a certain number of consecutive inputs, that center is pruned.

2. The dimensions of the corresponding matrices are adjusted, and the next input is evaluated.

A number of successful applications of MRAN in different areas, such as pattern classification, function approximation and time-series prediction, have been reported in [13] [14].
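To make the growth test of Step 1 and the unit allocation of Step 2 concrete, the sketch below checks the three criteria of equations (3) to (5) and, if they all hold, allocates a new unit as in equation (6). It is a simplified illustration under our own naming conventions, not the authors' code, and it assumes the network already contains at least one hidden unit.

```python
import numpy as np

def maybe_add_neuron(x, e, recent_err_norms, centers, widths, weights,
                     E1, E2, E3, kappa):
    """Growth test of Step 1 and unit allocation of Step 2.

    recent_err_norms : array of ||e_j|| for the last M samples (used in eq. (4)).
    """
    d_nearest = np.linalg.norm(centers - x, axis=1).min()    # distance to mu_ir
    grow = (np.linalg.norm(e) > E1 and                        # eq. (3)
            np.sqrt(np.mean(recent_err_norms ** 2)) > E2 and  # eq. (4)
            d_nearest > E3)                                   # eq. (5)
    if grow:
        centers = np.vstack([centers, x])                 # mu_{N+1} = x_i
        widths = np.append(widths, kappa * d_nearest)     # sigma_{N+1} = kappa * ||x_i - mu_ir||
        weights = np.vstack([weights, e])                 # a_{N+1} = e_i
    return centers, widths, weights, grow
```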

2.2 Nonlinear System Identification Using RBF Network

Generally, a nonlinear multi-input multi-output (MIMO) dynamic system in discrete form is represented by the following input-output description

y(i) = g[\, y(i-1), \ldots, y(i-k_y), \; u(i-1), \ldots, u(i-k_u) \,] \qquad (14)

where y is a vector containing the m system outputs, u is a vector of the n system inputs, g[\cdot] is a nonlinear vector function representing the m hyper-surfaces of the system, i is the time index, and k_y and k_u are the maximum lags of the output and input vectors, respectively.

The problem of identification is: given the output vector y and the input vector u over some interval of time, find the nonlinear function g(\cdot) which fits the data closely. This problem can be converted into a nonlinear approximation problem for RBF networks by defining the variables inside the brackets of equation (14) as the RBF network's inputs x, and \hat{y} (with n_y = m) as the network outputs. Based on the past system outputs and inputs, the input x (of dimension n_x) to the neural network is constructed as

x = [\, y_{i-1}^T, \ldots, y_{i-k_y}^T, \; u_{i-1}^T, \ldots, u_{i-k_u}^T \,]^T \qquad (15)

The output of the network is an approximation to y(i) and is denoted by \hat{y}(i). Essentially, the problem of identifying a nonlinear dynamic system is converted into a nonlinear time-series problem with one-step-ahead prediction. The RBF network is used to give the best approximation of the plant nonlinear function g from the data (x_i, y_i).
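For illustration, the network input of equation (15) can be assembled from the stored input/output history as in the following sketch (a hypothetical helper of our own; each history entry is assumed to be a 1-D NumPy array).

```python
import numpy as np

def build_regressor(y_hist, u_hist, ky, ku):
    """Form x = [y(i-1),...,y(i-ky), u(i-1),...,u(i-ku)]^T as in equation (15).

    y_hist : list of past output vectors; y_hist[-1] is y(i-1)
    u_hist : list of past input vectors;  u_hist[-1] is u(i-1)
    """
    lags_y = [y_hist[-k] for k in range(1, ky + 1)]
    lags_u = [u_hist[-k] for k in range(1, ku + 1)]
    return np.concatenate(lags_y + lags_u)
```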

Many learning algorithms for RBF networks, as mentioned before, exist. Generally, the desirable characteristics of a learning algorithm include a compact network structure, a small number of hidden neurons, a low approximation error, etc. However, for real-time implementation, model identification for one sample of data must be finished before the next sample arrives. That is, if the learning time for one sample of data, referred to as the cycle time, is t_c, then t_c < T, where T is the sampling time. For this reason, the learning time t_c plays a crucial role in selecting an RBF learning algorithm for nonlinear system identification problems.


2.3 The ONSAHL Algorithm

The On-line Structural Adaptive Hybrid Learning (ONSAHL) algorithm has been proposed especially for on-line identification of time-varying nonlinear dynamic systems [17].

The ONSAHL algorithm is designed to train an extended RBF network, referred to as a Direct Linear Feedthrough RBF (DLF-RBF) network. The DLF-RBF is composed of a nonlinear RBF sub-network and a linear DLF sub-network connected in parallel, the latter performing a mapping from the input layer directly to the output layer.

The ONSAHL algorithm uses the same growth criteria as Platt's RAN but differs in the way the centers and widths are adjusted. Unlike RAN, where the centers of all hidden neurons are updated to fit the training data (in the LMS sense), in ONSAHL only the center and width of the neuron nearest to the input are updated in the first step. Then all the weights connected to the output layer are updated using the RLS method. Two nonlinear identification problems, including a time-invariant system and a time-varying dynamic system, both single-input single-output (SISO), were used to test the identification ability of the algorithm. For a detailed description of the ONSAHL algorithm, see [16] [17].

3 Benchmark Problems on Nonlinear Dynamic Systems Identification

In this section, we compare the performance of MRAN with the ONSAHL algorithm on two benchmark problems in nonlinear dynamic systems identification used in [17]. For convenience, these two benchmark problems are referred to as BM-1 and BM-2.

3.1 BM-1: Nonlinear SISO Time-Invariant System

The nonlinear SISO time-invariant system (BM-1) to be identified is described by the following first-order difference equation,

y(i) = \frac{29}{40} \sin\left( \frac{16\, u(i-1) + 8\, y(i-1)}{3 + 4\, u(i-1)^2 + 4\, y(i-1)^2} \right) + \frac{2}{10} u(i-1) + \frac{2}{10} y(i-1) \qquad (16)

A random signal uniformly distributed in the interval [-1, 1] is used for u(i). For comparison purposes, the same error index I_d(i) defined in [16] is used,

I_d(i) = \frac{1}{n_y \times N_w} \sum_{p=0}^{N_w - 1} \sum_{j=1}^{n_y} \left| y_j(i-p) - \hat{y}_j(i-p) \right| \qquad (17)

In this case, MRAN's three criteria thresholds are selected as E_1 = 0.01, E_2 = 0.1, E_3 = \max\{\varepsilon_{max} \times \gamma^i, \varepsilon_{min}\}, with \varepsilon_{max} = 1.2, \varepsilon_{min} = 0.6 and \gamma = 0.997. We use \delta = 0.0001 as the pruning threshold, and the size of the two sliding windows (M, N_w) is 48.
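A minimal sketch of how the BM-1 training data of equation (16) and the error index of equation (17) can be generated is given below; this is our own illustrative code, not the simulation setup used by the authors.

```python
import numpy as np

def bm1_step(y_prev, u_prev):
    """One step of the BM-1 plant, equation (16)."""
    return (29.0 / 40.0 * np.sin((16.0 * u_prev + 8.0 * y_prev)
                                 / (3.0 + 4.0 * u_prev ** 2 + 4.0 * y_prev ** 2))
            + 0.2 * u_prev + 0.2 * y_prev)

def error_index(abs_errors, Nw, ny=1):
    """Sliding-window error index I_d(i), equation (17).

    abs_errors : per-sample |y_j - y_hat_j| values; the last Nw are averaged.
    """
    return np.sum(abs_errors[-Nw:]) / (ny * Nw)

# Generate 5000 training samples with u(i) uniform in [-1, 1]
rng = np.random.default_rng(0)
u = rng.uniform(-1.0, 1.0, 5000)
y = np.zeros(5000)
for i in range(1, 5000):
    y[i] = bm1_step(y[i - 1], u[i - 1])
```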

The identification results using MRAN and ONSAHL are given in Figures 2 and 3. Fig. 2 presents the hidden neuron evolution history along with the time histories of the three error functions. From Fig. 2 one can clearly see how the hidden neurons are added and pruned according to the three criteria.


In this case, MRAN takes 8 hidden units to identify the system, whereas, from [17], ONSAHL takes 23 hidden neurons for the same problem. Fig. 3 shows the time history of the error index I_d. It can be seen from Fig. 3 that MRAN's error decreases rapidly and is eventually lower than that of ONSAHL.

3.2 BM-2: Nonlinear SISO Time-Varying Discrete System

The second example, BM-2, identifies the nonlinear SISO time-varying discrete system given below:

y(i) = \frac{29 \beta(i)}{40} \sin\left( \frac{16\, u(i-1) + 8\, y(i-1) \beta(i)}{3 + 4\, u(i-1)^2 + 4\, y(i-1)^2} \right) + \frac{2}{10} u(i-1) + \frac{2}{10} y(i-1) \qquad (18)

where β(i) is a time-varying parameter given in Table 1.

Table 1: Evolution of β (BM-2)

  index (i)   0 to 1500   1501 to 2500   2501 to 5000
  β(i)        1.0         0.9            0.8

The system input u(i) used in this example is also a random signal uniformly distributed in the interval [-1, 1], whereas some parameters of the RBFN are changed slightly to obtain better performance. Unlike ONSAHL, which starts with a network already trained on BM-1 (23 hidden neurons), MRAN starts with no hidden neurons for this example. Figures 4 and 5 present the identification results for this benchmark problem. The most important observation from Fig. 4 is how the number of hidden neurons changes together with β. Briefly, when the system dynamics change because of a variation in β, the network has to add new neurons to adapt to the changes; when a change in β leaves some existing neurons contributing little, those old neurons are pruned after some time. Compared with the results achieved in [17], MRAN uses only 11 hidden neurons while ONSAHL uses 25, and from Figure 5 one can see that the MRAN algorithm also achieves a better approximation, that is, a smaller I_d.

The identification results for the MRAN and ONSAHL algorithms are compared in Table 2. In Table 2, I_d,av is the average value of I_d calculated from the 1,500th sample to the 5,000th sample. This range is selected because, as can be seen from the figures, after 1,500 samples the identification result converges to a comparatively smooth value.

Table 2: Comparison of Identification Results of MRAN and ONSAHL

                  BM-1                       BM-2
  Performance     I_d,av    Hidden units     I_d,av    Hidden units
  ONSAHL          0.0437    23               0.0586    25
  MRAN            0.0261    8                0.0326    11

Based on these two problems, it can be concluded that, by using the EKF tuning rule and incorporating a pruning strategy, MRAN is able to perform better than ONSAHL with a smaller network structure. However, for practical use, MRAN's on-line implementation issues have to be analyzed.


Any improvement to MRAN that makes it perform with a smaller computation time is always welcome, and such a modification is described in the following sections.

4 Real-Time Implementation and the EMRAN Algorithm

For practical utility, it should be possible to run any identification algorithm in real time, and MRAN is no exception. Since MRAN uses a sequential learning scheme, for real-time implementation the learning time t_c for one set of input-output data must be less than the sampling time T selected for identification. In this section, we estimate the MRAN algorithm's learning time t_c for one cycle based on a detailed breakdown into the computational steps carried out in one cycle. All the time estimates presented here are based on running MRAN on a Pentium 120 MHz computer under Visual C++ 5.0.

Looking at the MRAN equations ((1) to (13)), it is clear that t_c consists mainly of five parts, corresponding to Steps 1 to 5 described in Section 2.1. The computation time for step i is referred to as t_ci, and the cycle time t_c is given by

t_c = \sum_{i=1}^{5} t_{ci} \qquad (19)

For this timing study, the benchmark problem BM-2 was selected as the first candidate. Since the time for training one sample of data is determined by factors such as the RBF network's structure (number of inputs and outputs) and the number of hidden neurons, for finding t_c the network size was varied from 5 to 35 hidden neurons in increments of 5, without regard to the approximation accuracy. That is, the following question was posed: if the MRAN network had, say, 10 hidden neurons for the BM-2 problem, what are the constituent times t_ci, irrespective of whether the 10-hidden-neuron network produces a good approximation? This approach makes the timing analysis easier, since MRAN actually produces a network with a varying number of hidden neurons, and calculating the times on that basis would be difficult. For the benchmark problem BM-2, the breakdown of the times (in ms) for the five steps is given in Table 3.

In Table 3, the last row gives the total time t_c. Although it is not strictly correct to use the 'total' time as the training time for each sample of data², this calculation considers the 'worst case' scenario for t_c.

Even though the BM-2 problem gives some idea about the MRAN computation cycle times, BM-2 is still a single-input single-output nonlinear system, albeit time varying. A realistic assessment can only be made if the selected problem is of a reasonably larger size, such as a multi-input multi-output problem. In this context, the two-input two-output nonlinear system identification problem from [16] is selected here as the BM-3 problem.

² For example, if a new training pair (x_i, y_i) satisfies the condition for recruiting a new hidden unit, then after the hidden neuron is added, the algorithm goes directly to the pruning strategy of Step 5; therefore, t_c is only the sum of t_c1, t_c2 and t_c5.


Table 3: Computation Cycle Time t_c (ms) for MRAN (BM-2 problem)

  Hidden units   5      10     15     20     25     30     35
  t_c1           0.82   1.37   2.07   2.38   3.00   3.65   4.55
  t_c2           0.003  0.004  0.004  0.004  0.005  0.005  0.005
  t_c3           1.10   2.70   3.32   4.80   5.97   7.32   8.97
  t_c4           6.14   30.97  105.5  269.7  360.9  533.2  936.0
  t_c5           0.09   0.17   0.26   0.33   0.44   0.53   0.67
  t_c            8.15   35.21  111.2  277.2  370.3  544.7  950.2

4.1 BM-3: Nonlinear MIMO Time-invariant Discrete System

The MIMO nonlinear dynamic system is given by:

y_1(i) = \frac{15\, u_1(i-1)\, y_2(i-2)}{2 + 50\, u_1(i-1)^2} + \frac{1}{2} u_1(i-1) - \frac{1}{4} y_2(i-2) + \frac{1}{10}

y_2(i) = \frac{\sin[\pi\, u_2(i-1)\, y_1(i-2)] + 2\, u_2(i-1)}{3} \qquad (20)

Two random input signals u_1(i) and u_2(i), uncorrelated with each other and uniformly distributed in the interval [-1, 1], are used to generate the on-line training set together with the outputs. Once the plant is identified by the MRAN network, the signals shown in Figure 10 are used as inputs to test the accuracy of the identified model. These signals have both frequency modulation and amplitude modulation, so that they test the RBFN's generalization ability and adaptability to oscillations in the data.

Table 4 gives the computational cycle times for the five steps with varying numbers of hidden neurons. It can be seen straight away that, because of the larger number of inputs and outputs, the cycle times for BM-3 are considerably higher, especially for larger numbers of hidden neurons. From Table 4 it is also evident that Step 4 is the real bottleneck and consumes a large chunk of the computational overhead.

Table 4: Computation Cycle Time t_c (ms) for MRAN (BM-3 problem)

  Hidden units   5      10     15     20     25     30     35
  t_c1           1.51   2.63   3.87   5.13   7.11   8.07   9.37
  t_c2           0.005  0.005  0.008  0.008  0.009  0.009  0.01
  t_c3           1.60   3.67   5.87   8.13   9.70   11.98  14.28
  t_c4           22.2   154.3  504.1  1202   2080   3548   6195
  t_c5           0.10   0.18   0.26   0.36   0.46   0.55   0.68
  t_c            25.42  160.8  514.1  1215   2097   3568   6219

From both Tables 3 and 4, it can be seen that, as the number of network inputs and outputs and the number of hidden neurons increase, the time consumed in Step 4 (tuning the parameters using the EKF) becomes large. For example, in the BM-3 problem, when the number of hidden neurons reaches 30,


Table 4 shows that Step 4 consumes more than 99% of the total training time; to satisfy t_c < T, the sampling time T must be larger than 3.6 seconds, which is unacceptable for many real systems.

To find out why the time consumed in Step 4 grows so quickly with the number of hidden neurons, one can take a close look at the computations involved in that step, i.e. equations (8) to (10). Because multiplication consumes more time than addition and subtraction, it is worth counting the number of multiplications involved in Step 4. Note that when a matrix U of size c × d multiplies a matrix V of size d × h, the total number of scalar multiplications is c × d × h. The matrices involved in Step 4 (equations (8) to (10)) and their sizes are

K_i: (S, n_y); B_i: (S, n_y); P_i: (S, S); w_i: (S, 1); e_i: (n_y, 1); R_i: (n_y, n_y), where S is defined as S = N \times (n_x + n_y + 1) + n_y. With the sizes of the above matrices known, the number of multiplications needed is

• to calculate K_i B_i^T P_{i-1} (equation (10)): S^3 + S^2 \times n_y

• to calculate equation (9): S^2 \times n_y + S \times n_y^2 (assuming n_y < 10, the inversion of the n_y \times n_y matrix can reasonably be omitted)

• to calculate K_i e_i (equation (8)): S \times n_y

The total number of multiplications in Step 4 is,

\mathrm{Sum}(N, n_x, n_y) = S^3 + 2 \times S^2 \times n_y + S \times (n_y^2 + n_y) \qquad (21)

From equation (21), the total number of multiplications in Step 4 is a third-order polynomial in S and hence a function of N, n_x and n_y. Thus, if the number of neurons, inputs or outputs increases, the computational time for Step 4 increases enormously.
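The growth of this cost is easy to verify numerically; the short sketch below evaluates equation (21) for a few network sizes (the values nx = 6 and ny = 2 are assumed purely for illustration).

```python
def step4_multiplications(N, nx, ny):
    """Approximate multiplication count of Step 4, equation (21)."""
    S = N * (nx + ny + 1) + ny
    return S ** 3 + 2 * S ** 2 * ny + S * (ny ** 2 + ny)

for N in (5, 15, 30):
    print(N, step4_multiplications(N, nx=6, ny=2))   # cubic growth in S
```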

If MRAN is to be used in real time, this bottleneck in Step 4 has to be overcome. Such a modification to MRAN is discussed in the next section.

4.2 Extended-MRAN (EMRAN) Algorithm

We see from equation (21) that the weakness of MRAN which increases its computational load is that all the parameters of the network, including all the hidden neurons' centers, widths and weights, have to be updated at every step, so the size of the matrices to be updated becomes large as the number of hidden neurons increases.

To overcome this bottleneck of MRAN for real-time implementation, a new algorithm called EMRAN, an improved version of the MRAN algorithm, is proposed. For this purpose, a 'winner neuron' strategy is incorporated, similar to the one described in the ONSAHL algorithm. The key idea of the EMRAN algorithm is that at every step, only those parameters that are related to the selected winner neuron are updated by the EKF. The 'winner neuron' is defined as the neuron in the network that is closest (in some norm sense) to the current input data, as in [17]. EMRAN attempts to reduce the on-line computation time considerably and to avoid memory overflow, while retaining the good characteristics of MRAN, namely a small number of hidden neurons, a low approximation error, etc.

Basically, EMRAN has the same form as MRAN; all the equations are the same and hence are not repeated here. Only the changes are highlighted below:


In equation (6), \mu_{ir} is the center of the hidden neuron that is nearest to the current input data x_i in the input space. Here this special hidden neuron is referred to as the 'winner neuron', and the parameters related to it are denoted \mu^*, \sigma^* and a^*. The criteria for adding and pruning hidden neurons are unchanged; the difference is that if a training sample does not meet the criteria for adding a new hidden neuron, only the network parameters w^* related to the winner neuron are updated using the EKF as follows,

w_i^* = w_{i-1}^* + K_i^* e_i \qquad (22)

where w^* = [a_0^T, a^{*T}, \mu^{*T}, \sigma^*]^T \subseteq w.

In this equation, K_i^* is the Kalman gain matrix, of size (2 n_y + n_x + 1) \times n_y:

K_i^* = P_{i-1}^* B_i^* \left[ R_i + B_i^{*T} P_{i-1}^* B_i^* \right]^{-1} \qquad (23)

B_i^* = \nabla_{w^*} f(x_i) is the gradient matrix of the function f(x_i) with respect to the parameter vector w^*, evaluated at w_{i-1}^*:

B_i^* = \left[ I, \; \Phi^*(x_i) I, \; \Phi^*(x_i) \frac{2 a^*}{\sigma^{*2}} (x_i - \mu^*)^T, \; \Phi^*(x_i) \frac{2 a^*}{\sigma^{*3}} \| x_i - \mu^* \|^2 \right]^T \qquad (24)

R_i is the variance of the measurement noise, and P_i^* is the error covariance matrix, which is updated by

P_i^* = \left( I - K_i^* B_i^{*T} \right) P_{i-1}^* + q I \qquad (25)

P_i^* \subseteq P_i consists of the rows and columns of P_i related to the winner neuron; its size is (2 n_y + n_x + 1) \times (2 n_y + n_x + 1).
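As an illustration of the reduced update, the hedged sketch below selects the winner neuron and extracts the sub-vector w* and sub-matrix P* from the full parameter vector and covariance, assuming the parameter ordering w = [a_0, a_1, mu_1, sigma_1, ..., a_N, mu_N, sigma_N] used in Step 4; the index bookkeeping is our own, not the authors'.

```python
import numpy as np

def winner_index(x, centers):
    """Index of the hidden neuron whose center is nearest to the input x."""
    return int(np.argmin(np.linalg.norm(centers - x, axis=1)))

def winner_slice(n_star, nx, ny):
    """Positions of w* = [a0, a*, mu*, sigma*] inside the full vector w.

    Each neuron occupies ny + nx + 1 entries after the bias a0, so the
    winner's block starts at ny + n_star * (ny + nx + 1).
    """
    start = ny + n_star * (ny + nx + 1)
    return list(range(ny)) + list(range(start, start + ny + nx + 1))

def extract_winner(w, P, x, centers, nx, ny):
    """Return the reduced parameter vector w*, covariance P*, and their indices."""
    idx = winner_slice(winner_index(x, centers), nx, ny)
    return w[idx], P[np.ix_(idx, idx)], idx
```

After the EKF step of equations (22) to (25), the updated w* and P* are written back into the same positions idx of the full w and P.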

Using EMRAN, the cycle time and its breakdown for one sample of data are given in Tables 5 and 6 for problems BM-2 and BM-3, respectively. From the tables, it can be seen that there is a large reduction in t_c4 while the other times remain essentially unchanged. This reduction in the Step 4 time results in a major reduction in the cycle time t_c for EMRAN.

Table 5: Computation Cycle Time t_c (ms) for EMRAN (BM-2 problem)

  Hidden units   5      10     15     20     25     30     35
  t_c1           0.82   1.37   2.07   2.38   3.00   3.65   4.55
  t_c2           0.003  0.004  0.004  0.004  0.005  0.005  0.005
  t_c3           0.22   0.27   0.23   0.25   0.24   0.25   0.26
  t_c4           1.67   1.50   1.51   1.42   1.48   1.57   1.54
  t_c5           0.09   0.17   0.26   0.33   0.44   0.53   0.67
  t_c            2.80   3.31   4.07   4.38   5.17   6.00   7.03

To see the reduction in computational overhead clearly, the above data is also displayed in bar-graph form in Figure 6. A logarithmic scale is used because the difference between MRAN and EMRAN is large. For the BM-2 problem, for a typical network with, say, 30 neurons, the cycle time for MRAN is 545 ms whereas for EMRAN it is 6 ms, a significant reduction. The reduction is even larger for the BM-3 problem: for a network of 30 neurons,


Table 6: Computation Cycle Time t_c (ms) for EMRAN (BM-3 problem)

  Hidden units   5      10     15     20     25     30     35
  t_c1           1.51   2.63   3.87   5.13   7.11   8.07   9.37
  t_c2           0.005  0.005  0.008  0.008  0.009  0.009  0.01
  t_c3           0.35   0.38   0.40   0.42   0.40   0.42   0.45
  t_c4           2.12   2.04   2.03   1.95   2.19   2.11   2.05
  t_c5           0.10   0.18   0.26   0.36   0.46   0.55   0.68
  t_c            4.08   5.23   6.57   7.87   10.17  11.16  12.56

MRAN takes 3568 ms whereas EMRAN takes only about 11 ms, a reduction of more than two orders of magnitude. This behavior is observed for networks of all sizes. It is evident from Fig. 6 that EMRAN produces a large reduction in computation time and also that, as the number of hidden neurons increases beyond 20, the cycle time remains approximately flat.

5 Performance Comparison of MRAN vs. EMRAN

The analysis of Section 4 looked at the cycle times of MRAN and EMRAN without examining their performance, i.e. the identification accuracy of MRAN and EMRAN was not compared. In this section, the identification accuracy of MRAN and EMRAN is compared for both problems BM-2 and BM-3.

5.1 BM-2: Nonlinear SISO Time-Variant Dynamic System

The network input and output vectors are the same as before. The parameters for EMRAN are selected as E_1 = 0.01, E_2 = 0.09, E_3 = \max\{\varepsilon_{max} \times \gamma^i, \varepsilon_{min}\}, with \varepsilon_{max} = 1.25, \varepsilon_{min} = 0.4 and \gamma = 0.999. We use \delta = 0.001 as the pruning threshold, and the size of the two sliding windows (M, N_w) is again 48.

Figures 7 and 8 give the neuron history and error history for this system. The continuous line indicates the results obtained by the EMRAN algorithm, while the dotted line is that of MRAN. Compared with the 11 hidden neurons used by MRAN, 12 hidden neurons are used by EMRAN. From Figure 8, the approximation errors of EMRAN and MRAN are close, and whenever there is a change in the dynamics of the system, MRAN gives lower errors than EMRAN. Table 7 presents a comparison of MRAN and EMRAN in terms of network size, approximation error and computational time for both problems BM-2 and BM-3. I_d,av in the table is the average of the error index I_d over 10,000 samples, and for t_c the network has (at most) 15 hidden neurons. In the table, 'Overall time' is the total identification time for all the samples and is given in seconds.


5.2 BM-3: MIMO nonlinear dynamic system

Figure 11 gives the identification results of both MRAN and EMRAN for problem BM-3. Using the test inputs given in Fig. 10, the true outputs along with the outputs of the models identified by MRAN and EMRAN are shown in Fig. 11. We can see from Figure 11 that the adding and pruning capability of EMRAN allows the RBF neural network to identify the higher-dimensional nonlinear system on line. EMRAN produces an output close to that of MRAN, but at a great reduction in computation time, and hence can easily be used for real-time identification applications.

Table 7: Comparison of Identification Results of MRAN and EMRAN

                  BM-2                                          BM-3
  Performance     I_d,av   Hidden    Overall      t_c           I_d,av   Hidden    Overall      t_c
                           neurons   time (sec)   (ms)                   neurons   time (sec)   (ms)
  MRAN            0.0379   11        104.0        111           0.0349   30        28574        6219
  EMRAN           0.0427   12        21.3         3.80          0.0392   32        54.8         12.56

  (10,000 samples; t_c is calculated based on 35 hidden units)

From the table it is clear that, with only a slight increase in error, EMRAN offers a great advantage in computational time compared with MRAN.

Finally, it is worth noting that the threshold values required by the MRAN/EMRAN algorithm, as well as the data window sizes, are critical parameters that have to be chosen properly. As discussed in [15], when no measurement noise is assumed, the selection of the thresholds is more critical. As in most learning algorithms, we use some initial samples to obtain the information for selecting these parameters in this study. Future work has to be carried out to determine these thresholds in a more efficient way.

6 Conclusions

This paper has presented a performance analysis of the recently developed Minimal Resource Allocating Network (MRAN) algorithm for on-line identification of nonlinear dynamic systems. Using nonlinear time-invariant and time-varying identification benchmark problems, MRAN's performance has been compared with that of ONSAHL. The results indicate that MRAN realizes networks using fewer hidden neurons than the ONSAHL algorithm, with better approximation accuracy.

Next, the problems in the real-time implementation of MRAN have been highlighted using detailed timing studies and an analysis of the basic computations in MRAN. An extension to MRAN, referred to as the Extended Minimum Resource Allocating Network (EMRAN), which utilizes a winner neuron strategy within MRAN, has been highlighted. This modification reduces the computational load of MRAN and leads to a considerable reduction in the identification time with only a minimal increase in the approximation error. It also indicates the minimum sampling time one can select when using EMRAN for identification problems. Using the same benchmark problems as before, the results show that, compared with other learning algorithms, EMRAN can 'adaptively track'


the dynamics of the nonlinear system quickly without loss of accuracy and is ideal for fast on-lineidentification of nonlinear plants.

References

[1] Agarwal, M., "A systematic classification of neural-network-based control", Journal of Guidance, Control, and Dynamics, pp. 75-93, Apr. 1997

[2] Sadhukhan, D. and Feteih, S., "F8 neurocontroller based on dynamic inversion", IEEE Control Systems Magazine, Vol. 19, No. 1, pp. 150-156, 1996

[3] Narendra, K. and Parthasarathy, K., "Identification and control of dynamical systems using neural networks", IEEE Transactions on Neural Networks, Vol. 1, No. 1, pp. 4-26, Mar. 1990

[4] Chen, S., Billings, S.A., Cowan, C.F.N. and Grant, P.M., "Practical Identification of NARMAX Models Using Radial Basis Functions", Int. J. Control, Vol. 52, pp. 1327-1350, 1990

[5] Chen, C.L., Chen, W.C. and Chang, F.Y., "Hybrid learning algorithm for Gaussian potential function networks", IEE Proceedings-D, Control Theory and Applications, Vol. 140, No. 6, pp. 442-448, Nov. 1993

[6] Moody, J. and Darken, C.J., "Fast learning in networks of locally-tuned processing units", Neural Computation, Vol. 1, pp. 281-294, 1989

[7] Chen, S. and Billings, S.A., "Neural networks for non-linear system identification", International Journal of Control, Vol. 52, pp. 1327-1350, 1990

[8] Bors, A.G. and Gabbouj, M., "Minimal Topology for a Radial Basis Functions Neural Network for Pattern Classification", Digital Signal Processing, Vol. 4, pp. 173-178, 1994

[9] Lee, S. and Kil, R.M., "A Gaussian potential function network with hierarchically self-organizing learning", Neural Networks, Vol. 4, pp. 207-224, 1991

[10] Musavi, M.T., Ahmed, W., Chan, K.H., Faris, K.B. and Hummels, D.M., "On training of Radial Basis Function classifiers", Neural Networks, Vol. 5, pp. 595-603, 1992

[11] Platt, J.C., "A Resource Allocating Network for Function Interpolation", Neural Computation, Vol. 3, pp. 213-225, 1991

[12] Kadirkamanathan, V. and Niranjan, M., "A Function Estimation Approach to Sequential Learning with Neural Networks", Neural Computation, Vol. 5, pp. 954-975, 1993

[13] Lu, Y.W., Sundararajan, N. and Saratchandran, P., "A Sequential Learning Scheme for Function Approximation Using Minimal Radial Basis Function Neural Networks", Neural Computation, Vol. 9, pp. 1-18, 1997

[14] Lu, Y.W., Sundararajan, N. and Saratchandran, P., "A Sequential Minimal Radial Basis Function (RBF) Neural Network Learning Algorithm", IEEE Transactions on Neural Networks, Vol. 9, No. 2, pp. 308-318, 1998

[15] Lu, Y.W., Sundararajan, N. and Saratchandran, P., "Identification of Time-Varying Nonlinear Systems Using Minimal Radial Basis Function Neural Networks", IEE Proceedings - Control Theory and Applications, Vol. 144, No. 1, pp. 1-7, January 1997

[16] Junge, T.F. and Unbehauen, H., "Off-Line Identification of Nonlinear Systems Using Structurally Adaptive Radial Basis Function Networks", Proceedings of the 35th Conference on Decision and Control, pp. 943-948, Kobe, Japan, Dec. 1996

[17] Junge, T.F. and Unbehauen, H., "On-Line Identification of Nonlinear Time-Variant Systems Using Structurally Adaptive Radial Basis Function Networks", American Control Conference, pp. 1037-1041, Albuquerque, New Mexico, 1997

Figure 1: Radial Basis Function Network Model (input layer x, hidden layer units Φ_1 to Φ_N, and output layer ŷ_1 to ŷ_ny connected through the weights α)


Figure 2: BM-1: Performance of MRAN algorithm (evolution of the number of hidden neurons n, ||e_i|| vs. E_1, e_rms,i vs. E_2, and d_i vs. E_3 against the sample index i)

Figure 3: BM-1: Evolution of Error (I_d), MRAN vs. ONSAHL


Figure 4: BM-2: Performance of MRAN Algorithm (evolution of the number of hidden neurons n, ||e_i|| vs. E_1, e_rms,i vs. E_2, and d_i vs. E_3 against the sample index i)

Figure 5: BM-2: Evolution of Error (I_d), MRAN vs. ONSAHL


Figure 6: Comparison of Cycle Times for the MRAN and EMRAN algorithms (BM-2 and BM-3)


Figure 7: BM-2: Evolution of Hidden Neurons, MRAN vs. EMRAN

Figure 8: BM-2: Evolution of Error (I_d), MRAN vs. EMRAN


Figure 9: BM-3: Evolution of Hidden Neurons, MRAN vs. EMRAN

Figure 10: BM-3: Test Input Signals u_1(i) and u_2(i)


Figure 11: BM-3: Test Output Data (actual, MRAN and EMRAN outputs y_1 and y_2)
