
IEEE 2005 International Conference on Neural Networks and Brain, Beijing, China (13-15 Oct. 2005)


Combining Least Squares Support Vector Machines and Wavelet Transform to Predict Gas Emission Amount

Hai-shan Wu, Cun-liang Jia

College of Information and Electronic Engineering, China University of Mining & Technology, Xuzhou, Jiangsu, China, 221008

E-mail: eshan [email protected]

Abstract- To improve the prediction accuracy of gas emission amount, a novel model based on least squares support vector machines (LS-SVM) and wavelet transform (WT) is presented. First, the historical series is decomposed by wavelet transform into an approximate part and several detail parts. Then each part is predicted by a separate LS-SVM predictor. The reconstruction of the predicted series is used as the final prediction result. The selection of the embedding dimension and of the decomposition level is discussed. The results show that this model has better generalization ability and higher accuracy.

I. INTRODUCTION

Prediction of the expected gas emission amount from the work area of a mine is needed to facilitate ventilation planning and the assessment of methane drainage requirements. Accurate prediction of gas emission amount is crucial to ensure the safety of the workers and the production of coal. Great attention has been paid to the accurate prediction of gas emission amount, and many models have been constructed. Among them, linear regression methods such as the autoregressive (AR) model [1] have been used in practice. Meanwhile, with the development of machine learning theory, nonlinear methods have also been applied to time series prediction. Of the nonlinear models, neural networks are very popular [2].

However, these models have disadvantages. Linear models are inadequate for predicting nonstationary time series, which are affected by several random factors and are therefore hard to predict. Models based on neural networks cannot overcome the overfitting problem because they adopt the empirical risk minimization (ERM) principle. Moreover, they need a large quantity of training samples, and their learning speed is comparatively slow.

Support vector machines (SVM), proposed by Vapnik [3] in 1995, are based on statistical learning theory (SLT). SVM adopts the structural risk minimization (SRM) principle instead of the ERM principle and thus obtains a globally optimal solution by solving a quadratic programming problem. The kernel method efficiently avoids the curse of dimensionality. Least squares support vector machines (LS-SVM) are a kind of SVM whose constraints differ from those of the standard SVM. They have been applied in many fields, such as time series prediction [4].

Wavelet transform (WT), which produces a good local representation of a signal in both the time domain and the frequency domain, has been successfully applied in fields such as data analysis and signal processing. It has also been proposed for time series prediction in combination with other models such as neural networks [5].

In this paper, we propose a model for gas emission amount prediction that combines LS-SVM and WT, which we call the WT-LSSVM model. A simulation experiment is carried out to validate the applicability of the model.

This paper is organized as follows. Section II reviews the basic principles of LS-SVM and WT. In Section III, the prediction model based on LS-SVM and WT is constructed. In Section IV, the simulation experiment is carried out and the selection of the embedding dimension and decomposition level is discussed. Finally, conclusions are drawn in Section V.

II. BACKGROUND

A. Least Squares Support Vector Machines

Suppose we have independent and identically distributed data $\{x_1, y_1\}, \ldots, \{x_l, y_l\}$, where each $x_i \in R^n$ is a sample in the input space with a corresponding target value $y_i \in R$ for $i = 1, \ldots, l$, and $l$ is the size of the training data. The estimating function takes the following form:

$$f(x) = (w \cdot \Phi(x)) + b \tag{1}$$

where $\Phi(x)$ denotes the nonlinear mapping from the input space to the high-dimensional feature space.

This leads to the optimization problem for the standard SVM:

Minimize

$$\frac{1}{2} w^T w + \gamma \sum_{i=1}^{l} \xi_i \tag{2}$$

0-7803-9422-4/05/$20.00 ©2005 IEEE

subject to

$$y_i [w^T \Phi(x_i) + b] \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \ldots, l \tag{3}$$

where $\xi_i$ is a slack variable and $\gamma$ is a positive real constant that determines the penalty on estimation errors.

For LS-SVM, (3) has been modified as follows:

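Minimize

$$\frac{1}{2} w^T w + \gamma \frac{1}{2} \sum_{i=1}^{l} \xi_i^2 \tag{4}$$

subject to the equality constraints:

$$y_i [w^T \Phi(x_i) + b] = 1 - \xi_i, \quad i = 1, \ldots, l \tag{5}$$

By constructing the Lagrange function and applying the KKT conditions, the following equations are obtained:

$$w = \sum_{i=1}^{l} \alpha_i y_i \Phi(x_i), \qquad \sum_{i=1}^{l} \alpha_i y_i = 0, \qquad \alpha_i = \gamma \xi_i, \qquad y_i [w^T \Phi(x_i) + b] - 1 + \xi_i = 0 \tag{6}$$

Then we define:

$$Z = [\Phi(x_1)^T y_1; \ldots; \Phi(x_l)^T y_l], \quad Y = [y_1; \ldots; y_l], \quad \vec{1} = [1; \ldots; 1], \quad \alpha = [\alpha_1; \ldots; \alpha_l] \tag{7}$$

After substituting (7) into (6) and eliminating $w$ and $\xi$, we obtain:

$$\begin{bmatrix} 0 & -Y^T \\ Y & ZZ^T + \gamma^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ \vec{1} \end{bmatrix} \tag{8}$$

By defining $\Omega = ZZ^T$ and applying Mercer's condition [6] within the matrix $\Omega$, each element of the matrix takes the form:

$$\Omega_{ij} = y_i y_j \Phi(x_i)^T \Phi(x_j) = y_i y_j K(x_i, x_j) \tag{9}$$

where $K(x_i, x_j)$ is the kernel function. The value of the kernel equals the inner product of the images $\Phi(x_i)$ and $\Phi(x_j)$ of the two vectors $x_i$ and $x_j$ in the feature space, that is, $K(x_i, x_j) = \Phi(x_i)^T \Phi(x_j)$. Any symmetric function satisfying Mercer's condition can be used as a kernel function. Typical examples are the polynomial kernel and the RBF kernel:

Polynomial: $K(x_i, x_j) = (\gamma (x_i \cdot x_j) + r)^d, \ \gamma > 0$ (here $\gamma$ is a kernel scale parameter, distinct from the regularization constant in (2) and (4));

RBF: $K(x_i, x_j) = \exp\left(-\dfrac{\|x_i - x_j\|^2}{2\sigma^2}\right)$

The resulting LS-SVM model for regression can be expressed as follows:

$$f(x) = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) K(x_i, x) + b \tag{10}$$

B. Wavelet Transform

Suppose the function $\psi(t) \in L^2(R)$ and its Fourier transform $\hat{\psi}(\omega)$ satisfies the condition:

$$\int_R \frac{|\hat{\psi}(\omega)|^2}{|\omega|} \, d\omega < \infty \tag{11}$$

Then $\psi(t)$ can be called a mother wavelet. By dilations and translations of the mother wavelet, the following family of wavelet functions is obtained:

$$\psi_{a,d}(t) = |a|^{-1/2} \, \psi\left(\frac{t - d}{a}\right), \quad (a \ne 0, \ d \in R) \tag{12}$$

where $a$ is the dilation factor and $d$ is the translation factor.

Letting $a = 2^j$ and $d = k 2^j$, the discrete wavelet transform (DWT) can be realized:

$$\psi_{j,k}(t) = 2^{-j/2} \, \psi(2^{-j} t - k) \tag{13}$$

where $k$ is the shift parameter and $j$ is the resolution level. The larger the value of $j$, the lower the frequency.

According to (13), the reconstruction expression of $f(t)$ can be presented as follows:

$$f(t) = \sum_k c_{J,k} \, \varphi_{J,k}(t) + \sum_{j=1}^{J} \sum_k d_{j,k} \, \psi_{j,k}(t) = a_J(t) + \sum_{j=1}^{J} d_j(t) \tag{14}$$

where $\varphi$ is the scaling function associated with $\psi$, and $a_J$ and $d_j$ are the approximate and detail parts of the original signal, respectively.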
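As a quick consistency check on the dyadic sampling step, (13) follows from (12) by direct substitution. The short derivation below uses only the symbols already defined in this subsection ($\psi$, $a$, $d$, $j$, $k$):

```latex
\begin{aligned}
\psi_{a,d}(t)\Big|_{a = 2^{j},\; d = k 2^{j}}
  &= |2^{j}|^{-1/2}\, \psi\!\left(\frac{t - k 2^{j}}{2^{j}}\right) \\
  &= 2^{-j/2}\, \psi\!\left(2^{-j} t - k\right) = \psi_{j,k}(t).
\end{aligned}
```

Each increment of $j$ doubles the dilation factor $a$, which stretches the wavelet in time by a factor of two; this is why a larger $j$ corresponds to a lower frequency band.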

III. PREDICTION MODEL

The prediction model based on WT and LS-SVM can be realized according to the following stages:

A. Decomposition of the Time Series

Given the time series of gas emission amount $\{Q(1), \ldots, Q(l)\}$, it is decomposed by the wavelet at level $j$, whose selection is discussed in the next section. The approximate part $a_j$ and the detail parts $d_i$ ($i = 1, \ldots, j$) are then obtained:

$$Q(t) = a_j + \sum_{i=1}^{j} d_i \tag{15}$$
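The additive structure of (15) can be illustrated with a small, dependency-free sketch. It is not the decomposition used in the experiments: the paper uses the Db3 wavelet, whereas the sketch below swaps in the Haar wavelet because its level-j approximation is simply the mean over blocks of 2^j samples. The function names `haar_additive_decompose` and `reconstruct` are hypothetical, and the series length is assumed to be a multiple of 2^levels (a 96-sample series of four days of hourly data satisfies this for three levels).

```python
def haar_additive_decompose(series, levels):
    """Split `series` into an approximation a_J and details d_1..d_J.

    Assumption: Haar wavelet (not the paper's Db3). The level-j Haar
    approximation is the mean of each block of 2**j consecutive samples,
    held constant over the block, and d_j = a_{j-1} - a_j.
    """
    n = len(series)
    if levels < 1 or n % (2 ** levels) != 0:
        raise ValueError("length must be a positive multiple of 2**levels")
    approx = [float(v) for v in series]  # a_0 is the series itself
    details = []
    for j in range(1, levels + 1):
        block = 2 ** j
        coarser = []
        for start in range(0, n, block):
            block_mean = sum(series[start:start + block]) / block
            coarser.extend([block_mean] * block)
        # Detail at level j is what is lost when moving to the coarser level.
        details.append([fine - coarse for fine, coarse in zip(approx, coarser)])
        approx = coarser
    return approx, details  # details[0] is d_1 (finest), details[-1] is d_J


def reconstruct(approx, details):
    """Sum the parts pointwise, as in Q(t) = a_j + sum_i d_i."""
    return [a + sum(d[t] for d in details) for t, a in enumerate(approx)]
```

Because every part has the same length as the original series, each one can be handed to its own predictor and the predictions summed afterwards, which is the property the model relies on.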


B. Prediction Model Based on LS-SVM

Suppose the current time is $t$. The gas emission amount $Q(t)$ can be predicted from the historical data $Q(t-1), Q(t-2), \ldots, Q(t-p)$. The prediction function can then be expressed as:

$$Q(t) = \Phi[Q(t-1), \ldots, Q(t-p)] \tag{16}$$

where $p$ is referred to as the embedding dimension, whose selection is also discussed in the next section.

Following the decomposition in the previous subsection, the prediction function is applied to each part separately:

$$a_j(t) = \Phi[a_j(t-1), \ldots, a_j(t-p)] \tag{17}$$

$$d_i(t) = \Phi[d_i(t-1), \ldots, d_i(t-p)], \quad i = 1, \ldots, j \tag{18}$$

We construct a multi-input, single-output LS-SVM predictor for each part. According to (17) and (18), and taking $a_j$ as an example, the input vectors and output values of the LS-SVM predictor are obtained as shown in Table I.

TABLE I
STRUCTURE OF INPUT VECTORS AND OUTPUT VECTORS

| Input vectors | Output vectors |
|---|---|
| $a_j(1), \ldots, a_j(p-1), a_j(p)$ | $a_j(p+1)$ |
| ... | ... |
| $a_j(l-p-1), \ldots, a_j(l-3), a_j(l-2)$ | $a_j(l-1)$ |
| $a_j(l-p), \ldots, a_j(l-2), a_j(l-1)$ | $a_j(l)$ |
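The sliding-window layout of Table I can be sketched in a few lines. The helper name `embed` is hypothetical; the code only assumes that a part of the decomposed series is available as a Python list and that p is the embedding dimension.

```python
def embed(series, p):
    """Build (input vector, target) pairs with embedding dimension p.

    Row i holds p consecutive samples as the input and the sample that
    immediately follows them as the target, matching the layout of Table I.
    """
    if p < 1 or p >= len(series):
        raise ValueError("p must satisfy 1 <= p < len(series)")
    n_rows = len(series) - p
    inputs = [list(series[i:i + p]) for i in range(n_rows)]
    targets = [series[i + p] for i in range(n_rows)]
    return inputs, targets
```

A series of length l yields l - p training pairs, so a larger embedding dimension leaves fewer pairs for training.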

C. Reconstruction of the Predicted Value

Using the LS-SVM predictors, the predicted values of the approximate part and the detail parts of the future gas emission series are obtained. Let $\hat{a}_j$ and $\hat{d}_i$ ($i = 1, \ldots, j$) represent the predicted values of the approximate part and the detail parts, respectively. Their reconstruction is used as the final predicted result:

$$\hat{Q}(t) = \hat{a}_j + \sum_{i=1}^{j} \hat{d}_i \tag{19}$$

where $Q(t)$ and $\hat{Q}(t)$ are the real and predicted values of the gas emission amount, respectively.

Figure 1 shows the structure of the prediction model:

[Figure 1: the inputs Q(t-p), ..., Q(t-2), Q(t-1) pass through the wavelet transform; each resulting part (d_1, ..., d_j, a_j) feeds its own LS-SVM predictor; the predicted parts are reconstructed into the predicted Q(t).]

Fig.1 Structure of the prediction model

IV. SIMULATION EXPERIMENT

To test the efficiency of our prediction model, we use the hourly gas emission amounts of four days to forecast those of the next day.

A. Experiment Procedure and Results

The Db3 wavelet is selected as the wavelet function, and the decomposition level is set to 3. The original series and the decomposed series are shown in Fig.2.

Fig.2 The original series (at the top) and decomposed series

In Fig.2, the trend parts, periodic parts, and random parts of the original series are clearly visible.

The decomposed series is used to predict that of the next day with the LS-SVM predictors. In this section, we use the LS-SVMlab software [7], which includes an implementation for solving (8). The RBF kernel is chosen as the kernel function, and the embedding dimension is set to 6. The parameters of the four LS-SVM predictors are shown in TABLE II:

TABLE II
PARAMETERS OF LS-SVM PREDICTORS

| LS-SVM predictor of each part | $\gamma$ | $\sigma^2$ |
|---|---|---|
| approximate parts | 1250 | 20 |
| detail parts in level 3 | 1250 | 20 |
| detail parts in level 2 | 100 | 20 |
| detail parts in level 1 | 350 | 190 |
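To make the roles of the regularization constant and the RBF width concrete, the following is a minimal sketch of LS-SVM training for regression in plain Python. It is not the LS-SVMlab implementation. It assumes the standard function-estimation form of the LS-SVM linear system, [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y], which is the regression counterpart of the classifier system (8), and an RBF kernel of the form exp(-||x - z||^2 / (2 sigma^2)). All function names (`rbf`, `solve`, `lssvm_fit`, `lssvm_predict`) are hypothetical.

```python
from math import exp


def rbf(x, z, sigma2):
    """RBF kernel; the 2*sigma2 scaling is an assumption of this sketch."""
    return exp(-sum((a - b) ** 2 for a, b in zip(x, z)) / (2.0 * sigma2))


def solve(matrix, rhs):
    """Solve a dense linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def lssvm_fit(inputs, targets, gamma, sigma2):
    """Return (b, alpha) for the regression form of the LS-SVM system."""
    n = len(inputs)
    system = [[0.0] + [1.0] * n]  # first row enforces sum(alpha) = 0
    for i in range(n):
        row = [1.0] + [rbf(inputs[i], inputs[j], sigma2) for j in range(n)]
        row[i + 1] += 1.0 / gamma  # ridge term from the squared-error penalty
        system.append(row)
    solution = solve(system, [0.0] + list(targets))
    return solution[0], solution[1:]


def lssvm_predict(x, inputs, alpha, b, sigma2):
    """Evaluate f(x) = sum_i alpha_i K(x_i, x) + b."""
    return sum(a * rbf(x, xi, sigma2) for a, xi in zip(alpha, inputs)) + b
```

In this formulation a larger gamma penalizes training errors more heavily, so the fit follows the training targets more closely, while a larger sigma^2 gives a smoother predictor. One such model would be fitted per decomposed part, with its own parameter pair as in TABLE II.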

By using the LS-SVM predictors, the predicted values of the approximate and detail parts of the gas emission amount of the fifth day can be obtained, as shown in Fig.3.

Fig.3 Predicted values of each part of the original series for the fifth day. In each panel, the solid line and the dotted line represent the actual value and the predicted value, respectively.

According to (19), the reconstruction of each part is used as the final predicted result. The result is shown in Fig.4:

Fig.4 The final predicted result (actual value and predicted value plotted against hour)

B. Performance and Comparison

To make a comparison with the autoregressive model in [1] and the pure LS-SVM model, three indices are used to evaluate the performance of the prediction model:

$$E = \left| \frac{\mathrm{mean}(Q) - \mathrm{mean}(\hat{Q})}{\mathrm{mean}(Q)} \right|$$

$$Co = \frac{\mathrm{cov}(Q, \hat{Q})}{\sqrt{D(Q) \cdot D(\hat{Q})}}$$

$$F = 0.6(1 - E) + 0.4\,Co$$

The values of $E$ and $Co$ measure the error and the correlation between the real value and the predicted value, respectively. The value of $F$, a comprehensive index for evaluating the model, shows the precision of the model.

The comparison result is shown in TABLE III.

TABLE III
COMPARISON OF WT-LSSVM MODEL, AR MODEL, AND LS-SVM MODEL

| Indices | E | Co | F |
|---|---|---|---|
| WT-LSSVM model | 0.00474 | 0.9486 | 0.9766 |
| LS-SVM model | 0.0093 | 0.5309 | 0.8068 |
| AR model | 0.012 | 0.3360 | 0.7270 |

As expected, all three indices of our method are significantly better than those of the two other models. The WT-LSSVM model is thus successful in predicting gas emission amount.

C. Discussion of Parameter Selection

The values of the embedding dimension and the decomposition level are difficult to select. In our experiment, we vary the embedding dimension from one to twelve and the decomposition level from one to three. Two indices are used to validate the efficiency of the model with the selected values: $F$ and MAPE (Mean Absolute Percentage Error):

$$MAPE = \frac{100}{24} \sum_{i=1}^{24} \frac{|Q(i) - \hat{Q}(i)|}{Q(i)}$$

The result is shown in TABLE IV and Fig.5:

TABLE IV
PERFORMANCE WHEN DECOMPOSITION LEVEL VARIES

| Decomposition Level | F | MAPE (%) |
|---|---|---|
| 1 | 0.9345 | 4.274 |
| 2 | 0.9705 | 3.029 |
| 3 | 0.9766 | 2.436 |

Fig.5 Performance when embedding dimension varies (two panels plotted against embedding dimension)

From the table and figure above, we note that when $p$ is greater than five or the decomposition level is greater than two, the performance improves only slightly. The reason is that the gas emission amounts of the preceding five hours are sufficient to predict the value of the next hour; when $p$ is greater than five, the additional information is redundant. Meanwhile, at resolution level two the random parts are already clearly displayed, and too large a resolution level may lead to error propagation. In this paper, we select resolution level three because the periodic parts, the trend parts, and the random parts are then illustrated clearly. The predicted result shown in TABLE IV indicates the validity of our selection.
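The evaluation indices used in this section can be computed with a short sketch such as the one below. It assumes that D denotes the (population) variance, that Co is the usual correlation coefficient, and that MAPE divides each absolute error by the actual value; the function names `evaluate` and `composite_f` are hypothetical.

```python
from math import sqrt


def _mean(values):
    return sum(values) / len(values)


def composite_f(e, co):
    """Comprehensive index F = 0.6 * (1 - E) + 0.4 * Co."""
    return 0.6 * (1.0 - e) + 0.4 * co


def evaluate(actual, predicted):
    """Return E, Co, F and MAPE (%) for two equal-length sequences.

    Assumes nonzero actual values and non-constant sequences, so that no
    denominator below is zero.
    """
    m_a, m_p = _mean(actual), _mean(predicted)
    e = abs((m_a - m_p) / m_a)  # relative error of the means
    cov = _mean([(a - m_a) * (p - m_p) for a, p in zip(actual, predicted)])
    var_a = _mean([(a - m_a) ** 2 for a in actual])
    var_p = _mean([(p - m_p) ** 2 for p in predicted])
    co = cov / sqrt(var_a * var_p)  # correlation coefficient
    mape = 100.0 * _mean([abs(a - p) / a for a, p in zip(actual, predicted)])
    return {"E": e, "Co": co, "F": composite_f(e, co), "MAPE": mape}
```

As a check against TABLE III, `composite_f(0.00474, 0.9486)` evaluates to approximately 0.9766, the F value reported for the WT-LSSVM model.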

V. CONCLUSIONS

In this paper, we combine wavelet transform and least squares support vector machines to predict the time series of gas emission amount. The final results show that this model has better generalization ability and higher accuracy, which means that our method is applicable to predicting the time series of gas emission amount.

However, additional research is necessary to further explore the model combining WT and LS-SVM. In our experiment, we select only RBF as the kernel function. Other kernel functions, such as the wavelet kernel proposed by Zhang et al. [8], may also be promising. In addition, the prediction error mainly results from the random part of the original series. Other data preparation methods [9] may enhance the accuracy.

REFERENCES

[1] Zhi-fang Xu, "Research on Integrated System of Gas Real-time Detecting Information Based on Intranet". China University of Mining & Technology, 2001. (in Chinese)

[2] Zhi-yi Yang, Ya-xuan Xiong, Qian-lin Zhang, "Research on the prediction of gas emission in working face based on neural network". Coal Engineering, no. 10, pp. 73-75, 2004. (in Chinese)

[3] V. Vapnik, "The Nature of Statistical Learning Theory". New York: Springer-Verlag, 1995.

[4] Van Gestel, T., et al., "Financial time series prediction using least squares support vector machines within the evidence framework". IEEE Trans. Neural Networks, vol. 12, no. 4, pp. 809-821, 2001.

[5] Bai-ling Zhang, et al., "Multi-resolution Forecasting for Futures Trading Using Wavelet Decompositions". IEEE Trans. Neural Networks, vol. 12, no. 4, pp. 765-774, 2001.

[6] Shevade, S. K., et al., "Improvements to SMO algorithm for SVM regression". IEEE Trans. Neural Networks, vol. 11, no. 5, pp. 1188-1193, 2000.

[7] LS-SVMlab [Online]. Available: http://www.esat.kuleuven.ac.be/sista/lssvmlab

[8] Li Zhang, Wei-da Zhou, Licheng Jiao, "Wavelet Support Vector Machines". IEEE Trans. Systems, Man and Cybernetics, vol. 34, no. 1, pp. 34-39, 2004.

[9] Bo-juen Chen, Ming-wei Chang, Chih-jen Lin, "Load Forecasting Using Support Vector Machines: A Study on EUNITE Competition 2001". IEEE Trans. Power Systems, vol. 19, no. 4, pp. 1821-1830, 2004.
