
Nonlinear System Identification using Least Squares Support Vector Machines

Ming-guang Zhang, Xing-gui Wang, Wen-hui Li
School of Electrical and Information Engineering, Lanzhou University of Technology

Lanzhou, 730050, China
E-mail: zhangmingg8@163.com

Abstract-Support vector machines (SVM) are a novel machine learning method based on small-sample Statistical Learning Theory (SLT) and are powerful for problems with small samples, nonlinearity, high dimension, and local minima. SVM have been very successful in pattern recognition, fault diagnosis, and function estimation problems. Least squares support vector machines (LS-SVM) are an SVM variant that works with equality instead of inequality constraints and with a least squares cost function. This paper discusses the LS-SVM estimation algorithm and introduces its application to nonlinear control systems. Identification of MIMO models and soft-sensor modeling based on LS-SVM are then presented. The simulation results show that the proposed method provides a powerful tool for identification and soft-sensor modeling and has promising application in industrial processes.

I. INTRODUCTION

Support Vector Machines (SVM) are a novel, powerful machine learning method based on small-sample Statistical Learning Theory (SLT) [1,2,3]. SVM for classification and nonlinear function estimation, as introduced by Vapnik and further investigated by many others [4-9], is an important new methodology in the area of neural networks and nonlinear modeling [10]. Currently, SVM is an active field in artificial intelligence and has been applied to pattern recognition, function estimation, and signal processing [9,11]. SVM are powerful for problems characterized by small samples, nonlinearity, high dimension and local minima. SVM replace the Empirical Risk Minimization (ERM) principle, which is generally employed in classical methods such as least-squares methods, Maximum Likelihood methods and traditional Artificial Neural Networks, with the Structural Risk Minimization (SRM) principle, which has advantages over the ERM principle [12].

Least squares support vector machines (LS-SVM), proposed by Suykens and Vandewalle [13], are trained by solving a set of linear equations. LS-SVM is an extension of the standard SVM, originally formulated for two-class and multiclass classification problems, and has been investigated for both classification and function estimation [14,15]. In the LS-SVM formulation one works with equality instead of inequality constraints and with a sum squared error (SSE) cost function, as is frequently used in training classical neural networks. This reformulation greatly simplifies the problem: the solution is characterized by a linear system, more precisely a KKT (Karush-Kuhn-Tucker) system, which takes a form similar to the linear system solved in every iteration step of interior point methods for the standard SVM. This linear system can be efficiently solved by iterative methods such as conjugate gradient.

The aim of this paper is to investigate the LS-SVM estimation algorithm and its application to nonlinear control systems. The study consists primarily of two parts: one is the LS-SVM estimation algorithm itself, and the other is its application to MIMO nonlinear system identification and soft-sensor modeling. Simulation results indicate that the LS-SVM estimation algorithm provides a powerful tool for nonlinear system identification and soft-sensor modeling and has promising application in industrial processes.

II. SUPPORT VECTOR MACHINES

Given a set of training data $(x_1, y_1), \dots, (x_l, y_l) \in R^n \times R$, a nonlinear map $\varphi(\cdot)$ is employed to map the original input space $R^n$ into a high dimensional feature space, $\varphi(x) = (\varphi(x_1), \varphi(x_2), \dots, \varphi(x_l))$. The optimal decision function $f(x_i) = w \cdot \varphi(x_i) + b$ is then constructed in this feature space, so that nonlinear function estimation in the original space becomes linear function estimation in the feature space. By the Structural Risk Minimization principle [1-3], we obtain the optimization problem: minimize

$$R = \frac{1}{2}\|w\|^2 + c \cdot R_{emp}$$


where $\frac{1}{2}\|w\|^2$ is the regularization term, $R_{emp}(a) = \frac{1}{l}\sum_{i=1}^{l} L(y_i, f(x_i, a))$ is the empirical risk, $c$ is a regularization parameter, and $L(y_i, f(x_i, a))$ is the loss function. Different SVM formulations can be constructed by selecting different loss functions, such as the $\varepsilon$-insensitive loss.

A. Standard SVM estimation algorithm

The linear $\varepsilon$-insensitive loss function is selected in the standard SVM. The optimization goal of the standard SVM is formulated as

$$\min J(w, \xi, \xi^*) = \frac{1}{2} w^T w + c \sum_{i=1}^{l} (\xi_i + \xi_i^*) \qquad (1)$$

subject to the constraints

$$y_i - w \cdot \varphi(x_i) - b \le \varepsilon + \xi_i, \quad
w \cdot \varphi(x_i) + b - y_i \le \varepsilon + \xi_i^*, \quad
\xi_i, \xi_i^* \ge 0, \; i = 1,\dots,l.$$

The solution to this optimization problem is given by the saddle point of the Lagrangian

$$L(w, b, \xi, \xi^*, \alpha, \alpha^*, \eta, \eta^*) = \frac{1}{2} w^T w + c \sum_{i=1}^{l} (\xi_i + \xi_i^*)
- \sum_{i=1}^{l} \alpha_i \big( \varepsilon + \xi_i - y_i + w \cdot \varphi(x_i) + b \big)
- \sum_{i=1}^{l} \alpha_i^* \big( \varepsilon + \xi_i^* + y_i - w \cdot \varphi(x_i) - b \big)
- \sum_{i=1}^{l} ( \eta_i \xi_i + \eta_i^* \xi_i^* ) \qquad (2)$$

(minimized with respect to $w$, $b$, $\xi_i$, $\xi_i^*$ and maximized with respect to the Lagrange multipliers $\alpha_i \ge 0$, $\alpha_i^* \ge 0$, $\eta_i \ge 0$, $\eta_i^* \ge 0$, $i = 1,\dots,l$).

From the optimality conditions

$$\frac{\partial L}{\partial w} = 0, \quad \frac{\partial L}{\partial b} = 0, \quad \frac{\partial L}{\partial \xi_i} = 0, \quad \frac{\partial L}{\partial \xi_i^*} = 0 \qquad (3)$$

we have

$$w = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) \varphi(x_i), \quad
\sum_{i=1}^{l} (\alpha_i - \alpha_i^*) = 0, \quad
c - \alpha_i - \eta_i = 0, \quad c - \alpha_i^* - \eta_i^* = 0, \; i = 1,\dots,l. \qquad (4)$$

Based on Mercer's condition [13], we define kernels

$$K(x_i, x_j) = \varphi(x_i) \cdot \varphi(x_j) \qquad (5)$$

By (2), (4) and (5), the optimization problem can be rewritten as

$$\max W(\alpha, \alpha^*) = -\frac{1}{2} \sum_{i,j=1}^{l} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*) K(x_i, x_j)
+ \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) y_i - \varepsilon \sum_{i=1}^{l} (\alpha_i + \alpha_i^*) \qquad (6)$$

subject to the constraints

$$\sum_{i=1}^{l} (\alpha_i - \alpha_i^*) = 0, \quad 0 \le \alpha_i \le c, \quad 0 \le \alpha_i^* \le c, \; i = 1,\dots,l. \qquad (7)$$

Finally, the nonlinear function is obtained as

$$f(x) = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) K(x, x_i) + b$$
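The paper does not tie this formulation to a particular solver. Purely as a hedged illustration, the sketch below fits an $\varepsilon$-insensitive support vector regressor of exactly this form using scikit-learn's SVR, whose C and epsilon arguments play the roles of $c$ and $\varepsilon$ in (1)-(7); the data set and parameter values are invented for the example.

# Illustrative only: epsilon-insensitive SVR corresponding to primal (1) and dual (6)-(7).
# scikit-learn's SVR is one off-the-shelf solver; data and hyperparameters are made up.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(200, 1))                  # training inputs x_i
y = np.sin(X).ravel() + 0.05 * rng.standard_normal(200)    # noisy targets y_i

model = SVR(kernel="rbf", C=10.0, epsilon=0.05, gamma=0.5)  # C ~ c, epsilon ~ eps
model.fit(X, y)

X_new = np.linspace(-3.0, 3.0, 50).reshape(-1, 1)
y_hat = model.predict(X_new)     # evaluates f(x) = sum_i (a_i - a_i*) K(x, x_i) + b
print("number of support vectors:", len(model.support_))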

B. Least Squares SVM estimation algorithm


The quadratic loss function is selected in LS-SVM. The optimization problem of LS-SVM is then formulated as

$$\min J(w, \xi) = \frac{1}{2} w^T w + \frac{c}{2} \sum_{i=1}^{l} \xi_i^2 \qquad (8)$$

subject to the equality constraints

$$y_i = w \cdot \varphi(x_i) + b + \xi_i, \quad i = 1,\dots,l.$$

We define the Lagrangian as

$$L(w, b, \xi, \alpha) = \frac{1}{2} w^T w + \frac{c}{2} \sum_{i=1}^{l} \xi_i^2
- \sum_{i=1}^{l} \alpha_i \big( w \cdot \varphi(x_i) + b + \xi_i - y_i \big) \qquad (9)$$

where $\alpha_i$ ($i = 1,\dots,l$) are Lagrange multipliers. The optimality conditions give

$$\frac{\partial L}{\partial w} = 0 \;\Rightarrow\; w = \sum_{i=1}^{l} \alpha_i \varphi(x_i), \quad
\frac{\partial L}{\partial b} = 0 \;\Rightarrow\; \sum_{i=1}^{l} \alpha_i = 0, \quad
\frac{\partial L}{\partial \xi_i} = 0 \;\Rightarrow\; \alpha_i = c\, \xi_i, \quad
\frac{\partial L}{\partial \alpha_i} = 0 \;\Rightarrow\; y_i = w^T \varphi(x_i) + b + \xi_i, \; i = 1,\dots,l. \qquad (10)$$

By (5) and (10), the optimization problem can be rewritten as the linear system

$$\begin{bmatrix} 0 & 1 & \cdots & 1 \\ 1 & K(x_1,x_1) + 1/c & \cdots & K(x_1,x_l) \\ \vdots & \vdots & \ddots & \vdots \\ 1 & K(x_l,x_1) & \cdots & K(x_l,x_l) + 1/c \end{bmatrix}
\begin{bmatrix} b \\ \alpha_1 \\ \vdots \\ \alpha_l \end{bmatrix} =
\begin{bmatrix} 0 \\ y_1 \\ \vdots \\ y_l \end{bmatrix} \qquad (11)$$

Finally, the nonlinear function takes the form


f(x) = "aiK(x,xi)+b (12)

III. ILLUSTRATION AND SIMULATION

A. Identification of MIMO models based on LS-SVM

The LS-SVM algorithm described in this paper was used to identify the following MIMO nonlinear system:

$$x_{1,k+1} = x_{1,k} + \frac{x_{1,k} + x_{2,k}}{(x_{2,k} + \sin(t))^2 + 1}, \qquad
x_{2,k+1} = \frac{10 \sin(x_{1,k} + \sin(t))}{x_{1,k} + \sin(t)} \qquad (13)$$

where $t \in [-10, 10]$. The kernel function adopted in both of the following simulations is the radial basis function

$$K(x, x_i) = \exp\!\left( -\frac{\|x - x_i\|^2}{2\sigma^2} \right) \qquad (14)$$

where $\|x - x_i\|^2 = \sum_{k} (x_k - x_{i,k})^2$ and $\sigma$ is the kernel width.

The testing data set contains 200 data points. The simulation results are shown in Fig. 1 and indicate that the LS-SVM model fits the given MIMO system well; the average errors for the two outputs are 0.1536 and 0.00475, respectively.
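To make the workflow concrete, here is a hypothetical sketch of MIMO identification with one kernel model per output. The dynamics below are a stable toy two-state system standing in for Eq. (13), whose printed form is only partly legible, and scikit-learn's KernelRidge with an RBF kernel stands in for the LS-SVM solver (it solves essentially the same regularized kernel system, without the bias term); all numerical values are placeholders.

# Hypothetical MIMO identification sketch: one kernel regressor per output.
# Toy dynamics stand in for Eq. (13); KernelRidge stands in for LS-SVM.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)
n = 401
u = rng.uniform(-1.0, 1.0, n)                    # excitation input (assumed)
x = np.zeros((n, 2))
for k in range(n - 1):                           # placeholder two-state dynamics
    x[k + 1, 0] = 0.8 * x[k, 0] + 0.2 * np.tanh(x[k, 1]) + 0.1 * u[k]
    x[k + 1, 1] = 0.9 * x[k, 1] + 0.1 * np.sin(x[k, 0]) + 0.05 * u[k]

# Regressors: current state and input; targets: next state.
Z, T = np.hstack([x[:-1], u[:-1, None]]), x[1:]
split = 200                                      # keep 200 points for testing
models = [
    KernelRidge(kernel="rbf", alpha=1.0 / 20.0, gamma=1.0 / (2 * 0.8 ** 2))
    .fit(Z[:split], T[:split, j])
    for j in range(2)
]
pred = np.column_stack([m.predict(Z[split:]) for m in models])
print("average absolute error per output:", np.mean(np.abs(pred - T[split:]), axis=0))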

B. Soft-sensor modeling based on LS-SVM

Soft-sensor modeling based on LS-SVM is black-box modeling, based only on input-output measurements of the industrial process. In this modeling procedure the relationship between plant inputs and outputs is emphasized while the sophisticated inner structure is ignored. The basic structure of the soft sensor based on LS-SVM is shown in Fig. 2. In soft-sensor modeling, the secondary variables, which are selected from the measured input variables X, the manipulated variables u and the measured output variables y, serve as the inputs of the soft-sensor model, while the calculated values or long-interval sample values of the primary variable Y serve as its output. The mapping from secondary variables to the primary variable, i.e. Y = f(u, y, X), is implemented by LS-SVM.
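Purely as an illustration of this input/output arrangement, a small helper that stacks selected secondary variables into the model input matrix and the lab-analysed primary variable into the target vector might look like the sketch below; the variable names are invented stand-ins, since the actual tags come from the plant records.

# Hypothetical helper: assemble soft-sensor training data from process records.
# Keys are invented stand-ins for the plant's secondary and primary variables.
import numpy as np

SECONDARY = ["extraction_temp", "tray19_vapor_temp", "reflux_quantity",
             "circ_extraction_temp", "circ_reflux_temp"]

def build_softsensor_dataset(records):
    """records: iterable of dicts holding one sampling instant each."""
    Z = np.array([[r[name] for name in SECONDARY] for r in records])  # model inputs
    Y = np.array([r["freezing_point"] for r in records])              # lab-analysed output
    return Z, Y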

In this case study, the training and testing samples come from process data records, which are recorded and collected from the DCS system and the corresponding daily laboratory analyses of a refinery.


Fig. 2 Structure of the soft sensor based on LS-SVM

Fig. 1 Identification curves of the MIMO model based on LS-SVM (star line: sample values; solid line: simulated values)


According to analysis of the fractional distillation process, the extraction temperature of light diesel oil, the vapor temperature of the nineteenth tray, the reflux quantity of the first intermediate-section circulation, and the extraction and reflux temperatures of the first intermediate-section circulation are employed as the inputs of the soft-sensor model, and the freezing point of light diesel oil is employed as its output. The variable values are normalized to zero mean and unit standard deviation by a linear transformation, and the radial basis function of formula (14) is employed as the kernel. The estimated outputs of the soft sensor based on LS-SVM and the real values of the freezing point of light diesel oil are shown in Fig. 3, and the experimental results are summarized in Table I.
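A minimal sketch of the normalization step mentioned above (zero mean, unit standard deviation); applying the training-set statistics to the test data is an assumption on our part, since the paper does not specify this detail.

# Z-score normalization: shift and scale each variable using training statistics.
import numpy as np

def zscore_fit(X_train):
    mu = X_train.mean(axis=0)
    sd = X_train.std(axis=0)
    sd[sd == 0.0] = 1.0          # guard against constant columns
    return mu, sd

def zscore_apply(X, mu, sd):
    return (X - mu) / sd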

TABLE I
EXPERIMENT RESULTS OF THE SOFT SENSOR BASED ON LS-SVM (sigma = 0.0341, c = 20.82)

LMSE                        0.101
GMSE                        0.267
Number of support vectors   100


From Fig. 3, it can be seen that the soft sensor based on LS-SVM performs well in estimating the freezing point of light diesel oil: the estimated outputs match the real values and follow the varying trend of the freezing point very well. From Table I, it can be seen that the LS-SVM soft sensor has good generalization and learning performance. In the LS-SVM algorithm, the regularization parameter determines the trade-off between minimizing the training error and minimizing model complexity, so the optimal model is obtained by selecting the optimal regularization parameter and kernel parameters; this trade-off is what yields good generalization performance. Furthermore, the number of support vectors (corresponding to the number of hidden units in a neural network) in the LS-SVM soft-sensor model is determined automatically after training.
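The paper does not state how sigma = 0.0341 and c = 20.82 were chosen; one common way to carry out the parameter selection described above is a cross-validated grid search over the regularization parameter and the kernel width, sketched below with KernelRidge again standing in for the LS-SVM solver and with invented data.

# Hedged sketch of (c, sigma) selection by cross-validated grid search.
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.standard_normal((120, 5))                        # placeholder secondary variables
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(120)     # placeholder primary variable

grid = {
    "alpha": [1.0 / c for c in (1.0, 10.0, 20.0, 100.0)],          # alpha ~ 1/c
    "gamma": [1.0 / (2.0 * s ** 2) for s in (0.05, 0.2, 1.0)],     # gamma ~ 1/(2 sigma^2)
}
search = GridSearchCV(KernelRidge(kernel="rbf"), grid, cv=5,
                      scoring="neg_mean_squared_error")
search.fit(X, y)
print("best parameters:", search.best_params_)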

IV. CONCLUSION

In this paper we have proposed a technique for modeling nonlinear control systems. The method is based on least squares support vector machine function approximation and allows the memoryless static nonlinearity as well as the linear model parameters to be determined from a linear set of equations. The LS-SVM algorithm is applied to the identification of a MIMO model and to soft-sensor modeling of a nonlinear control system. The simulation results show that the proposed method provides a powerful tool for identification and soft-sensor modeling and has promising application in industrial processes.

Fig. 3 Estimated outputs of the soft sensor based on LS-SVM and real values of the freezing point of light diesel oil (sigma = 0.0341, c = 20.82). Top panel: LS-SVM training samples; bottom panel: LS-SVM testing samples. Point line: sample values; solid line: predicted values.


ACKNOWLEDGEMENT

This work was supported by the National 863 Scientific Project Development Foundation of P.R. China under grant No. 2002BA90128A.


REFERENCES

[1] V. Vapnik, The Nature of Statistical Learning Theory, New York: Springer-Verlag, 1995.

[2] V. Vapnik, "An overview of statistical learning theory," IEEE Trans. Neural Networks, 10 (5) (1999) 988-999.

[3] V. Vapnik, The Nature of Statistical Learning Theory, New York: Springer-Verlag, 1999.

[4] N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines, Cambridge: Cambridge University Press, 2000.

[5] B. Schölkopf, K.-K. Sung, C. Burges, F. Girosi, P. Niyogi, T. Poggio, and V. Vapnik, "Comparing support vector machines with Gaussian kernels to radial basis function classifiers," IEEE Trans. Signal Processing, 45 (11) (1997) 2758-2765.

[6] B. Schölkopf, C. Burges, and A. Smola (Eds.), Advances in Kernel Methods: Support Vector Learning, Cambridge, MA: MIT Press, 1998.

[7] B. Schölkopf, S. Mika, C. Burges, P. Knirsch, K.-R. Müller, G. Rätsch, and A. Smola, "Input space vs. feature space in kernel-based methods," IEEE Trans. Neural Networks, 10 (5) (1999) 1000-1017.

[8] A. Smola, B. Schölkopf, and K.-R. Müller, "The connection between regularization operators and support vector kernels," Neural Networks, 11 (1998) 637-649.

[9] A. Smola and B. Schölkopf, "On a kernel-based method for pattern recognition, regression, approximation and operator inversion," Algorithmica, 22 (1998) 211-231.

[10] J.A.K. Suykens and J. Vandewalle (Eds.), Nonlinear Modeling: Advanced Black-box Techniques, Boston: Kluwer Academic Publishers, 1998.

[11] A. Smola and B. Schölkopf, "On a kernel-based method for pattern recognition, regression, approximation and operator inversion," Algorithmica, 22 (1998) 211-231.

[12] B. Schölkopf, S. Mika, C. Burges, P. Knirsch, K.-R. Müller, G. Rätsch, and A. Smola, "Input space vs. feature space in kernel-based methods," IEEE Trans. Neural Networks, 10 (5) (1999) 1000-1017.

[13] J.A.K. Suykens and J. Vandewalle, "Least squares support vector machine classifiers," Neural Processing Letters, 9 (3) (1999) 293-300.

[14] J.A.K. Suykens, J. Vandewalle, and B. De Moor, "Optimal control by least squares support vector machines," Neural Networks, 14 (1) (2001) 23-35.

[15] Ming-guang Zhang, Wei-wu Yan, and Zhan-ting Yuan, "Study of nonlinear system identification based on support vector machines," in Proc. IEEE International Conference on Machine Learning and Cybernetics (ICMLC), Shanghai, China, Aug. 2004, pp. 3287-3290.
