The 2005 IEEE International Conference on Neural Networks and Brain (ICNN&B'05)
Special Session 6
Kernel Methods: Theory, Implementation and Application


Soft Sensor Technique Based on Robust SVM
Huajun Feng, Haoran Zhang
College of Information Science and Engineering, Zhejiang Normal University
Jinhua, Zhejiang, 321004, China
E-mail: {huajun&hylt}@zjnu.cn

Abstract--Support Vector Machine (SVM) is a modern machine learning method based on Vapnik's statistical learning theory. In this paper, a robust regression Support Vector Machine is proposed as a tool for the soft sensor technique: the robust SVM is used to estimate variables that are highly nonlinear, and the estimates are then used to identify an Absorption Stabilization System (ASS) process variable. Case studies are performed and indicate that the proposed method provides satisfactory performance with excellent approximation and generalization properties; the soft sensor technique based on robust SVM achieves superior performance to the conventional method based on neural networks.

I. INTRODUCTION

When monitoring and controlling plant processes, some important variables are difficult to measure on-line because of limitations such as cost and reliability. These problems can be solved with the soft sensor technique, which estimates those variables on-line from other available on-line measurements. The core problem of a soft sensor is to construct an appropriate mathematical model, which is usually obtained with modeling techniques such as multivariate statistics and artificial neural networks [1]. Since neural networks can approximate a nonlinear function with arbitrary precision [2], they have proven to be a powerful methodology for soft sensor modeling [3]. Despite many advances, a number of shortcomings remain, such as the difficulty of determining the network architecture, the existence of local minima, and an excessive dependence on the quantity and quality of the training data. Although these problems have been solved to a certain extent, there is still no effective method to improve the generalization ability. To address this problem, this paper applies a new nonlinear method, motivated by the Support Vector Machine (SVM), to the soft sensor technique.

SVM is a machine learning method introduced recently by Vapnik [4][5]; it has numerous attractive features and promising empirical performance compared with traditional statistical approaches. The formulation of the SVM embodies the Structural Risk Minimization (SRM) principle, which has been shown to be superior to the traditional Empirical Risk Minimization (ERM) principle employed in conventional neural networks. It is this difference that equips SVM with a greater ability to generalize, hence a better generalization ability is guaranteed. This paper first discusses the basic principle of the robust SVM and then uses it as a soft sensor tool to identify an Absorption Stabilization System (ASS) process variable. The method achieves high identification precision with a reasonably small training sample set and can overcome the disadvantages of artificial neural networks (ANNs). The identification experiments are presented and discussed; the results indicate that the SVM method exhibits good generalization performance.

II. A ROBUST SUPPORT VECTOR MACHINE

Several robust cost functions have been used in SVM regression, such as Vapnik's ε-insensitive loss function, Huber's robust cost [6], and the ridge regression approach [7]. Here, we propose a more general robust cost function that has the above-mentioned ones as particular cases. It can be expressed as the following piecewise-defined function:

$$
L(e) = \begin{cases}
0, & |e| \le \varepsilon \\
\frac{1}{2}(|e| - \varepsilon)^2, & \varepsilon < |e| \le e_c \\
c(|e| - \varepsilon) - \frac{1}{2}c^2, & |e| > e_c
\end{cases}
$$

where $e_c = \varepsilon + c$.

The three intervals of the unified function deal with different kinds of noise. The insensitive zone $|e| \le \varepsilon$ is adequate for low-frequency variations such as wander or baseline deviations. The quadratic cost zone takes the observation noise into account; the L2 norm in this zone is appropriate for Gaussian processes. The linear cost zone limits the effect of outliers and jitter noise. The parameter $c$ can be selected by trial and error.
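As a concrete illustration, the unified loss is easy to evaluate directly; the following Python sketch (NumPy-based, with function and variable names of our own choosing, not from the paper) computes L(e) elementwise for given ε and c:

```python
import numpy as np

def unified_robust_loss(e, eps, c):
    """Unified robust cost L(e): insensitive, quadratic, and linear zones."""
    a = np.abs(e)
    e_c = eps + c  # boundary between the quadratic and linear zones
    quadratic = 0.5 * (a - eps) ** 2         # Gaussian observation noise
    linear = c * (a - eps) - 0.5 * c ** 2    # limits outliers / jitter
    return np.where(a <= eps, 0.0, np.where(a <= e_c, quadratic, linear))
```

Note that the quadratic and linear branches meet continuously at $|e| = e_c$ (both equal $c^2/2$ there), with matching slope $c$, which keeps the cost function smooth enough for the gradient-based training used below.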

For a nonlinear regression problem, we first transform it into a linear regression problem. This can be achieved by a nonlinear map $\phi(\cdot)$ from the input space into a high-dimensional feature space, in which we construct a linear regression function, that is:

$$f(x) = w^T \phi(x) + b \qquad (1)$$

We would like to find the function that minimizes the following structural risk:

$$R_{st} = \frac{1}{2}\|w\|^2 + C \cdot R_{emp}[f] \qquad (2)$$

We take the loss function $R_{emp} = L(e)$, the unified robust loss function proposed above. According to formula (2), the above regression problem is transformed into the following constrained optimization problem:

$$
\min_{w,b,\xi,\xi^*} \ \frac{1}{2}\left(\|w\|^2 + b^2\right) + C\left[\frac{1}{2}\sum_{i \in I_1}\left(\xi_i^2 + \xi_i^{*2}\right) + c\sum_{i \in I_2}\left(\xi_i + \xi_i^*\right)\right] \qquad (3)
$$

$$
\text{s.t.} \quad
\begin{cases}
y_i - w^T\phi(x_i) - b \le \varepsilon + \xi_i \\
w^T\phi(x_i) + b - y_i \le \varepsilon + \xi_i^* \\
\xi_i, \xi_i^* \ge 0
\end{cases}
$$

where $I_1$ is the set of samples for which $0 \le \xi_i < c$ or $0 \le \xi_i^* < c$, and $I_2$ is the set of samples for which $\xi_i \ge c$ or $\xi_i^* \ge c$. We append the term $b^2$ to $w^T w$; extensive computational experience indicates that this formulation adds advantages such as strong convexity of the objective function.

For optimization problem (3), we construct a Lagrangian function from the objective function and the corresponding constraints:

$$
\begin{aligned}
L ={}& \frac{1}{2}\left(w^T w + b^2\right) + C\left[\frac{1}{2}\sum_{i \in I_1}\left(\xi_i^2 + \xi_i^{*2}\right) + c\sum_{i \in I_2}\left(\xi_i + \xi_i^*\right)\right] \\
&- \sum_{i=1}^{l} \alpha_i\left(\varepsilon + \xi_i - y_i + w^T\phi(x_i) + b\right) \\
&- \sum_{i=1}^{l} \alpha_i^*\left(\varepsilon + \xi_i^* + y_i - w^T\phi(x_i) - b\right) \\
&- \sum_{i \in I_1}\left(\gamma_i \xi_i + \gamma_i^* \xi_i^*\right) - \sum_{i \in I_2}\left(\lambda_i \xi_i + \lambda_i^* \xi_i^*\right) \qquad (4)
\end{aligned}
$$

The KKT conditions of (4) are as follows:

$$\frac{\partial L}{\partial b} = b - \sum_{i=1}^{l}\left(\alpha_i - \alpha_i^*\right) = 0 \qquad (5)$$

$$\frac{\partial L}{\partial w} = w - \sum_{i=1}^{l}\left(\alpha_i - \alpha_i^*\right)\phi(x_i) = 0 \qquad (6)$$

$$\frac{\partial L}{\partial \xi_i} = C\xi_i - \alpha_i - \gamma_i = 0, \quad i \in I_1 \qquad (7)$$

$$\frac{\partial L}{\partial \xi_i^*} = C\xi_i^* - \alpha_i^* - \gamma_i^* = 0, \quad i \in I_1 \qquad (8)$$

$$\frac{\partial L}{\partial \xi_i} = Cc - \alpha_i - \lambda_i = 0, \quad i \in I_2 \qquad (9)$$

$$\frac{\partial L}{\partial \xi_i^*} = Cc - \alpha_i^* - \lambda_i^* = 0, \quad i \in I_2 \qquad (10)$$

Substituting (5), (6), (7), (8), (9), and (10) into (4) yields the dual optimization problem:

$$
\begin{aligned}
L ={}& -\frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}\left(\alpha_i - \alpha_i^*\right)\left(\alpha_j - \alpha_j^*\right)\left(\phi(x_i)^T\phi(x_j) + 1\right) \\
&- \varepsilon\sum_{i=1}^{l}\left(\alpha_i + \alpha_i^*\right) + \sum_{i=1}^{l} y_i\left(\alpha_i - \alpha_i^*\right) - \frac{1}{2C}\sum_{i \in I_1}\left(\alpha_i^2 + \alpha_i^{*2}\right) \qquad (11)
\end{aligned}
$$

From the KKT conditions we can also get:

$$\frac{1}{2C}\sum_{i \in I_2}\left(\alpha_i^2 + \alpha_i^{*2}\right) = \frac{l_2 C c^2}{2} \qquad (12)$$

where $l_2$ is the number of samples in $I_2$. According to (12), formula (11) can be written as:

$$
\begin{aligned}
L ={}& -\frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}\left(\alpha_i - \alpha_i^*\right)\left(\alpha_j - \alpha_j^*\right)\left(k(x_i, x_j) + 1\right) \\
&- \varepsilon\sum_{i=1}^{l}\left(\alpha_i + \alpha_i^*\right) + \sum_{i=1}^{l} y_i\left(\alpha_i - \alpha_i^*\right) - \frac{1}{2C}\sum_{i=1}^{l}\left(\alpha_i^2 + \alpha_i^{*2}\right) + \frac{l_2 C c^2}{2}
\end{aligned}
$$

Since subtracting a constant from the objective function does not change its optimal solution, we can subtract $l_2 C c^2 / 2$ from $L$, and then we get:

$$
\begin{aligned}
L ={}& -\frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}\left(\alpha_i - \alpha_i^*\right)\left(\alpha_j - \alpha_j^*\right)\left(k(x_i, x_j) + 1\right) \\
&- \varepsilon\sum_{i=1}^{l}\left(\alpha_i + \alpha_i^*\right) + \sum_{i=1}^{l} y_i\left(\alpha_i - \alpha_i^*\right) - \frac{1}{2C}\sum_{i=1}^{l}\left(\alpha_i^2 + \alpha_i^{*2}\right) \qquad (13)
\end{aligned}
$$

where $k(x_i, x_j) = \phi(x_i)^T \phi(x_j)$. From the KKT conditions, we also get the constraint condition of the optimization problem: $0 \le \alpha_i, \alpha_i^* \le Cc$. The output of the new SVM may then be written as:

$$f(x) = \sum_{i=1}^{l}\left(\alpha_i - \alpha_i^*\right)\left(k(x_i, x) + 1\right) \qquad (14)$$
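As a minimal sketch of how (14) is evaluated in code, assuming an RBF kernel and identifiers of our own choosing (the "+1" term absorbs the bias $b$, which by (5) equals the sum of the coefficient differences):

```python
import numpy as np

def svm_predict(x, X_train, alpha, alpha_star, sigma):
    """Regression output f(x) = sum_i (alpha_i - alpha_i*) (k(x_i, x) + 1),
    eq. (14). The '+1' absorbs the bias b via KKT condition (5)."""
    # Gaussian RBF kernel evaluated against every training sample.
    k = np.exp(-((X_train - x) ** 2).sum(axis=-1) / (2.0 * sigma ** 2)) + 1.0
    return float((alpha - alpha_star) @ k)
```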

From the above discussion, we obtain the new SVM's dual optimization problem:



$$
\begin{aligned}
\max_{\alpha, \alpha^*} \quad & -\frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}\left(\alpha_i - \alpha_i^*\right)\left(\alpha_j - \alpha_j^*\right)\left(k(x_i, x_j) + 1\right) - \varepsilon\sum_{i=1}^{l}\left(\alpha_i + \alpha_i^*\right) \\
& + \sum_{i=1}^{l} y_i\left(\alpha_i - \alpha_i^*\right) - \frac{1}{2C}\sum_{i=1}^{l}\left(\alpha_i^2 + \alpha_i^{*2}\right) \qquad (15) \\
\text{s.t.} \quad & 0 \le \alpha_i, \alpha_i^* \le Cc
\end{aligned}
$$

For (15), we compute the gradient of the objective function:

$$\frac{\partial L}{\partial \alpha_i} = -\sum_{j=1}^{l}\left(\alpha_j - \alpha_j^*\right)\left(k(x_i, x_j) + 1\right) - \varepsilon + y_i - \frac{\alpha_i}{C}$$

$$\frac{\partial L}{\partial \alpha_i^*} = \sum_{j=1}^{l}\left(\alpha_j - \alpha_j^*\right)\left(k(x_i, x_j) + 1\right) - \varepsilon - y_i - \frac{\alpha_i^*}{C}$$

Let:

$$E_i = y_i - \sum_{j=1}^{l}\left(\alpha_j - \alpha_j^*\right)\left(k(x_i, x_j) + 1\right) \qquad (16)$$

Then we get:

$$\frac{\partial L}{\partial \alpha_i} = E_i - \varepsilon - \frac{\alpha_i}{C}, \qquad \frac{\partial L}{\partial \alpha_i^*} = -\left(E_i + \varepsilon + \frac{\alpha_i^*}{C}\right)$$

According to the gradient-ascent algorithm, the increments of the optimization variables can be set to:

$$\delta\alpha_i = \eta\,\frac{\partial L}{\partial \alpha_i} = \eta\left(E_i - \varepsilon - \frac{\alpha_i}{C}\right), \qquad \delta\alpha_i^* = \eta\,\frac{\partial L}{\partial \alpha_i^*} = -\eta\left(E_i + \varepsilon + \frac{\alpha_i^*}{C}\right)$$

Because of the constraint conditions $0 \le \alpha_i, \alpha_i^* \le Cc$, we take the clipped increments:

$$
\Delta\alpha_i = \begin{cases}
-\alpha_i, & \delta\alpha_i < -\alpha_i \\
\delta\alpha_i, & -\alpha_i \le \delta\alpha_i \le Cc - \alpha_i \\
Cc - \alpha_i, & \delta\alpha_i > Cc - \alpha_i
\end{cases} \qquad (17)
$$

$$
\Delta\alpha_i^* = \begin{cases}
-\alpha_i^*, & \delta\alpha_i^* < -\alpha_i^* \\
\delta\alpha_i^*, & -\alpha_i^* \le \delta\alpha_i^* \le Cc - \alpha_i^* \\
Cc - \alpha_i^*, & \delta\alpha_i^* > Cc - \alpha_i^*
\end{cases} \qquad (18)
$$

Finally, we can give a training algorithm for the new SVM as follows:

(1) Initialize $\alpha_i = 0$, $\alpha_i^* = 0$;

(2) Compute $k(x_i, x_j) + 1$, $i, j = 1, \ldots, l$;

(3) Choose a sample $i$ at random;

(4)
(4.1) Compute $E_i = y_i - \sum_{j=1}^{l}(\alpha_j - \alpha_j^*)(k(x_i, x_j) + 1)$;
(4.2) Compute $\delta\alpha_i$, $\delta\alpha_i^*$;
(4.3) Compute $\Delta\alpha_i$, $\Delta\alpha_i^*$;
(4.4) Update $\alpha_i \leftarrow \alpha_i + \Delta\alpha_i$, $\alpha_i^* \leftarrow \alpha_i^* + \Delta\alpha_i^*$;

(5) If $\max(|\Delta\alpha_i|, |\Delta\alpha_i^*|) < T$, stop and output the results; otherwise select the sample $i$ that maximizes $\max(|\Delta\alpha_i|, |\Delta\alpha_i^*|)$ and go to (4).
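A minimal Python sketch of steps (1)-(5) is given below, assuming an RBF kernel and a fixed learning rate η; the identifiers and the max_iter safeguard are our own additions, not part of the paper:

```python
import numpy as np

def train_robust_svm(X, y, eps, c, C, sigma, eta=0.1, T=1e-4, max_iter=10000):
    """Gradient-ascent training of the robust SVM dual (15), steps (1)-(5)."""
    l = len(y)
    # Step (2): precompute the modified kernel matrix k(x_i, x_j) + 1.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    K = np.exp(-sq / (2.0 * sigma ** 2)) + 1.0   # RBF kernel, bias absorbed
    alpha = np.zeros(l)                           # step (1)
    alpha_star = np.zeros(l)
    i = np.random.randint(l)                      # step (3)
    for _ in range(max_iter):
        # Step (4.1): E_i = y_i - sum_j (alpha_j - alpha_j*)(k(x_i, x_j) + 1).
        E = y - K @ (alpha - alpha_star)
        # Step (4.2): raw gradient-ascent increments.
        d = eta * (E[i] - eps - alpha[i] / C)
        d_star = -eta * (E[i] + eps + alpha_star[i] / C)
        # Step (4.3): clip so that 0 <= alpha_i, alpha_i* <= C*c, (17)-(18).
        delta = np.clip(d, -alpha[i], C * c - alpha[i])
        delta_star = np.clip(d_star, -alpha_star[i], C * c - alpha_star[i])
        # Step (4.4): update the multipliers of the chosen sample.
        alpha[i] += delta
        alpha_star[i] += delta_star
        # Step (5): stop on a negligible update ...
        if max(abs(delta), abs(delta_star)) < T:
            break
        # ... otherwise pick the sample with the largest admissible update.
        E = y - K @ (alpha - alpha_star)
        d_all = np.clip(eta * (E - eps - alpha / C), -alpha, C * c - alpha)
        d_star_all = np.clip(-eta * (E + eps + alpha_star / C),
                             -alpha_star, C * c - alpha_star)
        i = int(np.argmax(np.maximum(np.abs(d_all), np.abs(d_star_all))))
    return alpha, alpha_star
```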

III. CASE STUDIES

The case study is based on the Absorption Stabilization System (ASS) of a refinery's heavy oil FCCU (Fluidized Catalytic Cracking Unit). The heavy oil FCCU is used to produce light oil in refineries. Because of its low cost and simple production process, the FCCU is the major deep-processing unit for crude oil, and the ASS is one of its most important parts. Its main task is to separate dry gas, liquefied natural gas, and stabilized gasoline from rich gas (including light gasoline and liquefied petroleum gas) and coarse gasoline. The gasoline absorbing rate, the most important criterion of the process, is difficult to measure directly. To overcome this difficulty, we use SVM regression to estimate this variable. According to a technical analysis, the main measurable variables are as follows: input rate of raw oil, input rate of refining oil, temperature of catalytic reaction, peak temperature of the main fractionating tower, separation temperature of light diesel oil, and bottom temperature of the stabilization tower. These six variables are the inputs of the soft sensor used to estimate the gasoline absorbing rate. In this simulation, we use actual sample data consisting of 100 samples. Two methods, the robust SVM and an RBF NN, are applied to the soft sensor. We train the SVM with an RBF kernel on these samples, with the SVM design parameters set as follows:

$\sigma = 0.7$, $\varepsilon = 0.01$, $c = 0.2$, $C = 350$
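To show how the pieces fit together, here is a hypothetical end-to-end run reusing train_robust_svm and svm_predict from the sketches above; the random arrays merely stand in for the unpublished plant samples, and the reading of the design parameters above is ours:

```python
import numpy as np

# Hypothetical stand-ins for the 100 ASS samples: the six measurable process
# variables in, gasoline absorbing rate out (the real plant data are not
# published in the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
y = rng.normal(size=100)

# Design parameters as read from the case study above.
sigma, eps, c, C = 0.7, 0.01, 0.2, 350.0

alpha, alpha_star = train_robust_svm(X, y, eps, c, C, sigma)
y_hat = np.array([svm_predict(x, X, alpha, alpha_star, sigma) for x in X])
print("training RMS error:", np.sqrt(np.mean((y - y_hat) ** 2)))
```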

The following table illustrates the training and testing fitting quality of both the robust SVM and the NN:



TABLE I
THE SIMULATION ERROR OF ROBUST SVM AND NN

training   testing    robust SVM               NN
sets       sets       training   testing      training   testing
                      error      error        error      error
20         20         0.2430     1.1106       0          2.0512
30         30         0.4723     0.5453       0.3162     1.4808
40         40         0.5013     0.6783       0.4233     1.9572
50         50         0.5234     0.5989       0.4583     1.7537
60         40         0.5553     0.3163       0.4773     1.3532
70         30         0.5216     0.3042       0.4911     1.2079
80         20         0.5401     0.3172       0.5150     1.7247

As shown in Table I, the performance of the robust SVM is superior to that of the RBF NN method; the robust SVM method possesses the better generalization ability.

IV. CONCLUSION

In this paper, a new soft sensor method is proposed to estimate immeasurable variables in industrial processes. A robust SVM is presented first; it is then used to identify an Absorption Stabilization System (ASS) process variable, and a learning algorithm to train the robust SVM is given. Simulation results indicate that the soft sensor technique based on SVM is characterized by stronger learning ability, better generalization ability, and less dependence on the size of the training data than NNs, and that it is suitable for estimating immeasurable variables in highly nonlinear industrial processes. This approach therefore shows great potential for application in the industrial field.

REFERENCES

[1] V. Cherkassky and F. Mulier, Learning from Data, New York: John Wiley & Sons, 1998.

[2] G. Cybenko, "Continuous valued neural networks with two hidden layers are sufficient," Technical Report, Department of Computer Science, Tufts University, Medford, Massachusetts, 1988.

[3] J. Glassey, M. Ignova, A. C. Ward, and G. A. Montague, "Bioprocess supervision: neural networks and knowledge based systems," Journal of Biotechnology, vol. 52, pp. 201-205, 1997.

[4] V. Vapnik, The Nature of Statistical Learning Theory, New York: Springer-Verlag, 1995.

[5] J. Suykens and J. Vandewalle, "Chaos control using least squares support vector machines," International Journal of Circuit Theory and Applications, Special Issue on Communications, Information Processing and Control Using Chaos, vol. 27, no. 6, pp. 605-615, 1999.

[6] K.-R. Müller, A. Smola, and G. Rätsch, Advances in Kernel Methods: Support Vector Learning, Cambridge: MIT Press, 1999.

[7] N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, Cambridge: Cambridge Univ. Press, 2000.
