Power load forecasts based on hybrid PSO with Gaussian and adaptive mutation
and Wv-SVM
Qi Wu *
Key Laboratory of Measurement and Control of CSE (School of Automation, Southeast University), Ministry of Education, Nanjing, Jiangsu 210096, China
School of Mechanical Engineering, Southeast University, Nanjing, Jiangsu 210096, China
Article info
Keywords:
Load forecasts
Wv-SVM
Particle swarm optimization
Adaptive mutation
Gaussian mutation
Abstract
This paper presents a new load forecasting model based on hybrid particle swarm optimization with Gaussian and adaptive mutation (HAGPSO) and the wavelet v-support vector machine (Wv-SVM). Firstly, it is proved that a mother wavelet function can build a complete basis through horizontal floating and can therefore form a wavelet kernel function; the Wv-SVM with this wavelet kernel function is then proposed. Secondly, to address the disadvantages of the standard PSO, HAGPSO is proposed to seek the optimal parameters of the Wv-SVM. Finally, a load forecasting model based on HAGPSO and Wv-SVM is constructed. The results of its application to load forecasting show that the proposed model is effective and feasible.
2009 Elsevier Ltd. All rights reserved.
1. Introduction
The theoretical study of load forecasting for power systems started in the middle of the last century, simultaneously with the flourishing of system identification and modern control theories. Before that, because the scales of power systems were limited and there were many uncertain factors, the study of load forecasting had not yet taken shape. It was not until the 1980s that the theoretical study of mid- and long-term load forecasting began to emerge, and a series of forecasting methods, such as the AR, MA, general exponential smoothing, ARMA and ARIMA algorithms, were successively developed and are widely accepted in power system load forecasting at present (Chenhui, 1987). With the development of grey systems, artificial neural networks, expert systems, genetic algorithms (Wu, Yan, & Yang, 2008a) and other theories and methods, mid- and long-term load forecasting of power systems has continuously improved (Benaouda, Murtagh, Starck, & Renaud, 2006; Liang, 1997; Santos, Martins, & Pires, 2007; Topalli, Erkmen, & Topalli, 2006; Ying & Pan, 2008). In general, most of the algorithms above are based on time series.
Recently, the SVM developed by Vapnik (1995) has received increasing attention, with remarkable results in the field of load forecasting (Hong, 2009; Pai & Hong, 2005; Wu, Tzeng, & Lin, 2009). The main difference between neural networks (NN) and the SVM is the principle of risk minimization. An NN implements empirical risk minimization (ERM) to minimize the error on the training data, while the SVM implements structural risk minimization (SRM) by constructing an optimal separating hyper-plane in the hidden feature space and using quadratic programming to find a unique solution. The SVM has yielded generalization performance that is significantly better than that of competing methods in load forecasting (Hong, 2009; Pai & Hong, 2005; Wu, Tzeng et al., 2009). However, with the kernel functions used so far, the SVM cannot approximate every curve in $L^2(\mathbb{R}^n)$ (the space of square-integrable functions), because these kernel functions do not form a complete orthonormal basis. Consequently, the regression SVM cannot approximate every function. We therefore need a new kernel function that can build a complete basis through horizontal floating and flexing. Such functions already exist: they are the wavelet functions. An SVM with a wavelet kernel function is called a wavelet SVM (WSVM). Reviewing the load forecasting literature on support vector machine techniques (Hong, 2009; Pai & Hong, 2005; Wu, Tzeng et al., 2009), little has been written on the application of the Wv-SVM to load forecasting.
However, determining the unknown parameters of the Wv-SVM is a complicated process. In fact, it is a multivariable optimization problem in a continuous space. An appropriate parameter combination can enhance how well the model approximates the original series. Therefore, it is necessary to select an evolutionary algorithm to seek the optimal parameters of the Wv-SVM. These unknown parameters have a great effect on the generalization performance of the Wv-SVM: an appropriate parameter combination corresponds to high generalization performance. Particle swarm optimization (PSO), an evolutionary computation technique developed by Kennedy and Eberhart (1995), is considered an excellent technique for solving such combinatorial optimization
problems (Lin, Ying, Chen, & Lee, 2008; Shen, Shi, Kong, & Ye, 2007; Wu, Liu, Xiong, & Liu, 2009; Wu, Yan, & Wang, 2009; Wu, 2009; Wu, Yan, & Yang, 2008b; Wu & Yan, 2009, in press; Yuan & Chu, 2007; Yang, Yuan, Yuan, & Mao, 2007; Zhao & Yang, 2009).
PSO is based on the metaphor of social interaction and communication, such as bird flocking. The original PSO is distinctly different from other evolutionary methods in that it does not use filtering operations (such as crossover and mutation), and the members of the entire population are maintained throughout the search procedure, so that information is socially shared among individuals to direct the search towards the best position in the search space. One of the major drawbacks of the standard PSO is premature convergence. To overcome this shortcoming, many works have focused on modifying PSO (e.g., Lin et al., 2008; Shen et al., 2007; Wu et al., 2008b; Yuan & Chu, 2007; Zhao & Yang, 2009) to solve the parameter selection problem of the SVM, but little attention has been given to the Wv-SVM. Therefore, a hybrid PSO with adaptive mutation and Gaussian mutation (HAGPSO) is proposed in this paper to optimize the parameters of the Wv-SVM.
Based on the above analysis, a new load forecasting model based on HAGPSO and Wv-SVM is proposed in this paper, and its superiority over a traditional model is verified by numerical simulation. The rest of this paper is organized as follows. Section 2 introduces the Wv-SVM. HAGPSO is presented in Section 3. In Section 4 the steps of HAGPSO and the forecasting method are described. Section 5 gives the experimental simulation and results. Conclusions are drawn in the end.
2. Wavelet v-support vector machine (Wv-SVM)
2.1. Wavelet kernel theory
Let us consider a set of data points $(x_1, y_1), (x_2, y_2), \ldots, (x_l, y_l)$, which are independently and randomly generated from an unknown function. Specifically, $x_i$ is a column vector of attributes, $y_i$ is a scalar representing the dependent variable, and $l$ denotes the number of data points in the training set.

The support vector kernel function can be described not only as a dot-product kernel, such as $K(x, x') = K(\langle x, x' \rangle)$, but also as a horizontal floating (translation-invariant) kernel, such as $K(x, x') = K(x - x')$. In fact, a function that satisfies Mercer's condition is an admissible support vector kernel function.

Lemma 1 (Mercer, 1909). The symmetric function $K(x, x')$ is a kernel function of the SVM if and only if, for every function $g \neq 0$ satisfying $\int_{\mathbb{R}^n} g^2(\xi)\,d\xi < \infty$, the following condition holds:

$$\iint K(x, x')\, g(x)\, g(x')\,dx\,dx' \geq 0, \quad x, x' \in \mathbb{R}^n \qquad (1)$$
This lemma provides a simple method for building kernel functions. For a horizontal floating function, however, it is hard to factor the function into two identical functions, so we state a separate condition for horizontal floating kernel functions.
Lemma 2 (Smola and Scholkopf, 1998). A horizontal floating function is an admissible support vector kernel function if and only if the Fourier transform of $K(x)$ satisfies

$$F[K](\omega) = (2\pi)^{-n/2} \int_{\mathbb{R}^n} \exp(-j(\omega \cdot x))\, K(x)\,dx \geq 0, \quad \omega \in \mathbb{R}^n \qquad (2)$$
If a wavelet function $\psi(x)$ satisfies the conditions $\psi(x) \in L^2(\mathbb{R}) \cap L^1(\mathbb{R})$ and $\hat{\psi}(0) = 0$, where $\hat{\psi}$ is the Fourier transform of $\psi(x)$, the wavelet function group can be defined as

$$\psi_{a,m}(x) = |a|^{-1/2}\, \psi\!\left(\frac{x - m}{a}\right), \quad x \in \mathbb{R} \qquad (3)$$

where $a$ is the so-called scaling parameter, $m$ is the horizontal floating coefficient, and $\psi(x)$ is called the "mother wavelet". The translation parameter $m \in \mathbb{R}$ and the dilation $a > 0$ may be continuous or discrete. For a function $f(x) \in L^2(\mathbb{R})$, the wavelet transform of $f(x)$ can be defined as

$$W(a, m) = |a|^{-1/2} \int_{-\infty}^{+\infty} f(x)\, \psi^{*}\!\left(\frac{x - m}{a}\right) dx \qquad (4)$$

where $\psi^{*}(x)$ stands for the complex conjugate of $\psi(x)$.
The wavelet transform $W(a, m)$ can be considered as a function of the translation $m$ at each scale $a$. Eq. (4) indicates that wavelet analysis is a time-frequency, or time-scale, analysis. Unlike the short-time Fourier transform (STFT), the wavelet transform can be used for multi-scale analysis of a signal through dilation and translation, so it can extract the time-frequency features of a signal effectively.
The wavelet transform is also invertible, which provides the possibility of reconstructing the original signal. A classical inversion formula for $f(x)$ is

$$f(x) = C_\psi^{-1} \int_{-\infty}^{+\infty}\!\! \int_{-\infty}^{+\infty} W(a, m)\, \psi_{a,m}(x)\, \frac{da}{a^2}\, dm \qquad (5)$$

where

$$C_\psi = \int_{-\infty}^{+\infty} \frac{|\hat{\psi}(\omega)|^2}{|\omega|}\, d\omega < \infty$$

So far, because a wavelet kernel function must satisfy the conditions of Lemma 2, only a few wavelet kernel functions can be expressed in terms of existing functions. We now give one such wavelet kernel function, the Morlet wavelet kernel function, and prove that it satisfies the condition for an admissible support vector kernel function. The Morlet wavelet function is defined as

$$\psi(x) = \cos(1.75x)\, \exp\!\left(-\frac{x^2}{2}\right) \qquad (10)$$
Theorem 1. The Morlet wavelet kernel function is defined as

$$K(x, x') = \prod_{i=1}^{l} \cos\!\left(1.75\, \frac{x_i - x_i'}{a}\right) \exp\!\left(-\frac{\|x_i - x_i'\|^2}{2a^2}\right), \quad x \in \mathbb{R}^{ld},\ x_i \in \mathbb{R}^{d} \qquad (11)$$

and this kernel function is an admissible support vector kernel function.
Proof. According to Lemma 2, we only need to prove

$$F[K](\omega) = (2\pi)^{-l/2} \int_{\mathbb{R}^{ld}} \exp(-j(\omega \cdot x))\, K(x)\, dx \geq 0 \qquad (12)$$

where $K(x) = \prod_{i=1}^{l}\psi(x_i/a) = \prod_{i=1}^{l}\cos(1.75\,x_i/a)\exp(-\|x_i\|^2/2a^2)$ and $j$ denotes the imaginary unit. We have
$$\int_{\mathbb{R}^{ld}} \exp(-j\omega \cdot x)\, K(x)\, dx = \int_{\mathbb{R}^{ld}} \exp(-j\omega \cdot x) \prod_{i=1}^{l} \cos\!\left(\frac{1.75\, x_i}{a}\right) \exp\!\left(-\frac{\|x_i\|^2}{2a^2}\right) dx$$

$$= \prod_{i=1}^{l} \int_{-\infty}^{+\infty} \exp(-j\omega_i x_i)\, \frac{\exp(j\,1.75\, x_i/a) + \exp(-j\,1.75\, x_i/a)}{2}\, \exp\!\left(-\frac{\|x_i\|^2}{2a^2}\right) dx_i$$

$$= \prod_{i=1}^{l} \frac{1}{2} \int_{-\infty}^{+\infty} \left[ \exp\!\left(-\frac{\|x_i\|^2}{2a^2} + \left(\frac{1.75\,j}{a} - j\omega_i\right) x_i\right) + \exp\!\left(-\frac{\|x_i\|^2}{2a^2} - \left(\frac{1.75\,j}{a} + j\omega_i\right) x_i\right) \right] dx_i$$

$$= \prod_{i=1}^{l} \frac{|a|\sqrt{2\pi}}{2} \left[ \exp\!\left(-\frac{(1.75 - \omega_i a)^2}{2}\right) + \exp\!\left(-\frac{(1.75 + \omega_i a)^2}{2}\right) \right] \qquad (13)$$

Substituting formula (13) into Eq. (12), we obtain

$$F[K](\omega) = \prod_{i=1}^{l} \frac{|a|}{2} \left[ \exp\!\left(-\frac{(1.75 - \omega_i a)^2}{2}\right) + \exp\!\left(-\frac{(1.75 + \omega_i a)^2}{2}\right) \right] \qquad (14)$$

Since $a \neq 0$, we have

$$F[K](\omega) \geq 0 \qquad (15)$$

This completes the proof of Theorem 1. □
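For readers who wish to experiment with the kernel, the following minimal Python/NumPy sketch evaluates Eq. (11) for a pair of sample vectors; the function and argument names (morlet_wavelet_kernel, x, x_prime, a) are illustrative and do not come from the paper's implementation.

import numpy as np

def morlet_wavelet_kernel(x, x_prime, a=1.0):
    # Eq. (11): product over components of cos(1.75*d/a) * exp(-d^2/(2*a^2)),
    # where d = x_i - x'_i and a > 0 is the dilation (scale) parameter.
    d = np.asarray(x, dtype=float) - np.asarray(x_prime, dtype=float)
    return float(np.prod(np.cos(1.75 * d / a) * np.exp(-d ** 2 / (2.0 * a ** 2))))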
2.2. Wavelet v-support vector machine
Combining the wavelet kernel function with the v-SVM, we can build a new SVM learning algorithm, the Wv-SVM. The structure of the Wv-SVM is shown in Fig. 1. For a set of data points $(x_1, y_1), (x_2, y_2), \ldots, (x_l, y_l)$, the Wv-SVM can be described as:

$$\min_{w,\, \xi^{(*)},\, \varepsilon,\, b}\ \tau(w, \xi^{(*)}, \varepsilon) = \frac{1}{2}\|w\|^2 + C\left(v\varepsilon + \frac{1}{l}\sum_{i=1}^{l}(\xi_i + \xi_i^{*})\right) \qquad (16)$$

subject to

$$(w \cdot x_i + b) - y_i \leq \varepsilon + \xi_i \qquad (17)$$

$$y_i - (w \cdot x_i + b) \leq \varepsilon + \xi_i^{*} \qquad (18)$$

$$\xi_i^{(*)} \geq 0, \quad \varepsilon \geq 0, \quad b \in \mathbb{R} \qquad (19)$$

where $w$ and $x_i$ are column vectors with $d$ dimensions, $C > 0$ is a penalty factor, $\xi_i^{(*)}\ (i = 1, \ldots, l)$ are slack variables and $v \in (0, 1]$ is an adjustable regularization parameter.
Problem (16) is a quadratic programming (QP) problem. By means of the Wolfe dual principle, the wavelet kernel function technique and the Karush–Kuhn–Tucker (KKT) conditions, we obtain the dual problem (20) of the original optimization problem (16):

$$\max_{\alpha,\, \alpha^{*}}\ W(\alpha, \alpha^{*}) = -\frac{1}{2}\sum_{i,j=1}^{l}(\alpha_i - \alpha_i^{*})(\alpha_j - \alpha_j^{*})\, K(x_i, x_j) + \sum_{i=1}^{l}(\alpha_i - \alpha_i^{*})\, y_i \qquad (20)$$

$$\text{s.t.}\quad 0 \leq \alpha_i, \alpha_i^{*} \leq \frac{C}{l} \qquad (21)$$

$$\sum_{i=1}^{l}(\alpha_i - \alpha_i^{*}) = 0 \qquad (22)$$

$$\sum_{i=1}^{l}(\alpha_i + \alpha_i^{*}) \leq C v \qquad (23)$$
Select appropriate parameters $C$ and $v$, and take as the kernel function of the Wv-SVM model the mother wavelet function that best matches the original series over some range of scales. The Wv-SVM output function is then described as

$$f(x) = \sum_{i=1}^{l}(\alpha_i - \alpha_i^{*})\prod_{j=1}^{d}\psi\!\left(\frac{x^{j} - x_i^{j}}{a}\right) + b, \quad b \in \mathbb{R} \qquad (24)$$

where $\psi(x)$ is the wavelet function, $a > 0$ is the scaling parameter of the wavelet, $x^{j}$ is the $j$th component of the test vector $x$, and $x_i^{j}$ is the $j$th component of the sample vector $x_i$.

The parameter $b$ can be computed by Eq. (25): select two multipliers $\alpha_j, \alpha_k \in (0, C/l)$, then

$$b = \frac{1}{2}\left[y_j + y_k - \sum_{i=1}^{l}(\alpha_i - \alpha_i^{*})\big(K(x_i, x_j) + K(x_i, x_k)\big)\right] \qquad (25)$$
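As an illustration of how Eq. (24) would be evaluated once the dual coefficients are available, the sketch below assumes that the alphas have already been obtained from a QP solver for problem (20)-(23); the names alpha, alpha_star and support_x are hypothetical and only serve to make the kernel expansion concrete.

import numpy as np

def wv_svm_predict(x, support_x, alpha, alpha_star, b, a=1.0):
    # Eq. (24): f(x) = sum_i (alpha_i - alpha_i^*) K(x, x_i) + b,
    # with K the Morlet wavelet kernel of Eq. (11).
    x = np.asarray(x, dtype=float)
    coef = np.asarray(alpha, dtype=float) - np.asarray(alpha_star, dtype=float)
    k = np.array([
        np.prod(np.cos(1.75 * (x - xi) / a) * np.exp(-(x - xi) ** 2 / (2.0 * a ** 2)))
        for xi in np.asarray(support_x, dtype=float)
    ])
    return float(coef @ k + b)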
3. Hybrid particle swarm optimization
Determining the unknown parameters of the Wv-SVM is a complicated process. In fact, it is a multivariable optimization problem in a continuous space. An appropriate parameter combination can enhance how well the model approximates the original series. Therefore, it is necessary to select an intelligent algorithm to obtain the optimal parameters of the proposed model. The parameters of the Wv-SVM have a great effect on its generalization performance: an appropriate parameter combination corresponds to high generalization performance. The PSO algorithm is considered an excellent technique for solving such combinatorial optimization problems, and the proposed HAGPSO algorithm is used to determine the parameters of the Wv-SVM. The intelligent system shown in Fig. 2, based on the HAGPSO algorithm and the Wv-SVM model, evaluates the performance of the HAGPSO algorithm by forecasting time series. Different Wv-SVMs in different Hilbert spaces are adopted to forecast the power load time series.
Fig. 1. The architecture of the Wv-SVM: the inputs $x_1, x_2, \ldots, x_n$ feed wavelet kernel nodes $K(x, x_1), \ldots, K(x, x_n)$, whose outputs are weighted by $w_1, \ldots, w_n$ and summed to produce the output $y$.
For each particular region, only the most adequate Wv-SVM with the optimal parameters is used for the final forecasting.

To evaluate the forecasting capacity of the intelligent system, the fitness function of the HAGPSO algorithm is designed as follows:

$$\text{fitness} = \frac{1}{l}\sum_{i=1}^{l}\left(\frac{\hat{y}_i - y_i}{y_i}\right)^2 \qquad (26)$$

where $l$ is the size of the selected sample, $\hat{y}_i$ denotes the forecast value of the selected sample, and $y_i$ is the original datum of the selected sample.
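A minimal sketch of the fitness of Eq. (26) follows, assuming y_pred and y_true are arrays holding the forecast and original values of the selected sample (the function name is illustrative):

import numpy as np

def fitness(y_pred, y_true):
    # Eq. (26): mean squared relative error between forecast and original data.
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    return float(np.mean(((y_pred - y_true) / y_true) ** 2))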
3.1. Standard particle swarm optimization
Similarly to other evolutionary computation techniques, PSO (Yang et al., 2007) uses a set of particles representing potential solutions to the problem under consideration. The swarm consists of $N$ particles. Each particle has a position $X_i = (x_{i1}, x_{i2}, \ldots, x_{ij}, \ldots, x_{id})$ and a velocity $V_i = (v_{i1}, v_{i2}, \ldots, v_{ij}, \ldots, v_{id})$, where $i = 1, 2, \ldots, N$ and $j = 1, 2, \ldots, d$, and moves through a $d$-dimensional search space. According to the global variant of PSO, each particle moves towards its best previous position and towards the best particle $p_g$ in the swarm. Let us denote the best previously visited position of the $i$th particle (the one giving its best fitness value) as $p_i = (p_{i1}, p_{i2}, \ldots, p_{ij}, \ldots, p_{id})$, and the best previously visited position of the whole swarm as $p_g = (p_{g1}, p_{g2}, \ldots, p_{gj}, \ldots, p_{gd})$.
The change of position of each particle from one iteration to the next is computed from the distance between the current position and its previous best position and the distance between the current position and the best position of the swarm. The velocity and position of each particle are then updated by

$$v_{ij}^{k+1} = w\, v_{ij}^{k} + c_1 r_1 \left(p_{ij} - x_{ij}^{k}\right) + c_2 r_2 \left(p_{gj} - x_{ij}^{k}\right) \qquad (27)$$

$$x_{ij}^{k+1} = x_{ij}^{k} + v_{ij}^{k+1} \qquad (28)$$

where $w$ is the inertia weight, employed to control the impact of the previous history of velocities on the current one, $k$ denotes the iteration number, $c_1$ is the cognition learning factor, $c_2$ is the social learning factor, and $r_1$ and $r_2$ are random numbers uniformly distributed in the range $[0, 1]$.
Thus, the particle flies through potential solutions towards $p_i^{k}$ and $p_g^{k}$ in a guided way while still exploring new areas via the stochastic mechanism, so as to escape from local optima. Since there is no explicit mechanism for controlling the velocity of a particle, it is necessary to impose a maximum value $V_{\max}$ on it. If the velocity exceeds this threshold, it is set equal to $V_{\max}$, which limits the maximum travel distance at each iteration and prevents the particle from flying past good solutions. PSO is terminated after a maximal number of generations, or when the best particle position of the entire swarm cannot be improved further after a sufficiently large number of generations. PSO has shown its robustness and efficacy in solving function optimization problems in real number spaces.
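As a compact illustration of Eqs. (27) and (28), the following sketch performs one standard PSO update for the whole swarm; the array shapes and default parameter values are assumptions made for illustration, not the settings used in the paper.

import numpy as np

def pso_step(x, v, p_best, g_best, w=0.9, c1=2.0, c2=2.0, v_max=1.0):
    # x, v, p_best: arrays of shape (N, d); g_best: array of shape (d,).
    n, d = x.shape
    r1, r2 = np.random.rand(n, d), np.random.rand(n, d)
    v_new = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)  # Eq. (27)
    v_new = np.clip(v_new, -v_max, v_max)                            # velocity limit V_max
    x_new = x + v_new                                                 # Eq. (28)
    return x_new, v_new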
3.2. Hybrid particle swarm optimization with Gaussian mutation and
adaptive mutation
To address the disadvantages of the standard PSO, an adaptive mutation operator is proposed to regulate the inertia weight of the velocity by means of the fitness value of the objective function and the iteration variable, and a Gaussian mutation operator is introduced to correct the direction of the particle velocity. The aforementioned problem is addressed by incorporating adaptive mutation and Gaussian mutation into the previous velocity of the particle. Thus, HAGPSO updates the velocity and position of each particle by

$$v_{ij}^{k+1} = (1 - \lambda)\, w_{ij}^{k} v_{ij}^{k} + \lambda N\!\left(0, \sigma_i^{k}\right) + c_1 r_1 \left(p_{ij} - x_{ij}^{k}\right) + c_2 r_2 \left(p_{gj} - x_{ij}^{k}\right) \qquad (29)$$

$$x_{ij}^{k+1} = x_{ij}^{k} + v_{ij}^{k+1} \qquad (30)$$

$$w_{ij}^{k} = \beta\left(1 - f\!\left(x_i^{k}\right)/f\!\left(x_m^{k}\right)\right) + (1 - \beta)\, w_{ij}^{0}\exp\!\left(-\alpha k^{2}\right) \qquad (31)$$

$$\sigma_i^{k+1} = \sigma_i^{k}\exp\!\left(N_i(0, M_\sigma)\right) \qquad (32)$$
Fig. 2. The HAGPSO algorithm optimizes the parameters of the Wv-SVM.
where $i = 1, 2, \ldots, N$. $M_\sigma$ is the standard error of the Gaussian distribution, $\beta$ is the adaptive coefficient, $\lambda$ is an increment coefficient, $\alpha$ is the coefficient controlling particle velocity attenuation, $f(x_i^{k})$ is the fitness of the $i$th particle at the $k$th iteration, and $f(x_m^{k})$ is the optimal fitness of the particle swarm at the $k$th iteration.
The parameter $w$ regulates the trade-off between the global and local exploration abilities of the swarm. A large inertia weight facilitates global exploration, while a small one tends to facilitate local exploration. A suitable value of the inertia weight $w$ usually provides a balance between global and local exploration abilities and consequently reduces the number of iterations required to locate the optimum solution.
Adaptive mutation, which makes the quality of the solution depend on the mutation operator, is a highly effective mutation operator in real-coded algorithms. The proposed adaptive mutation operator, based on the iteration variable $k$ and the fitness value $f(x^{k})$, is described in Eq. (31). In the first term on the right of Eq. (29), the velocity inertia weight $w_{ij}^{k}$ provides a balance between global and local exploration abilities and consequently reduces the number of iterations required to locate the optimum solution. In Eq. (31), the term $1 - f(x_i^{k})/f(x_m^{k})$ means that particles with larger fitness mutate within a smaller scope, while those with smaller fitness mutate within a larger scope. The term $w_{ij}^{0}\exp(-\alpha k^{2})$ means that the initial inertia weight $w_{ij}^{0}$ mutates within a large scope and searches a larger space at the start (small $k$), while it mutates within a small scope, searching a smaller space and gradually approaching the global optimum, at the end (large $k$).
The second term of Eq. (29) represents Gaussian mutation based on the iteration variable $k$. The Gaussian mutation operator, which corrects the moving direction of the particle velocity, is given by Eq. (32). In this strategy, the new velocity vector $v^{k+1} = (v_1^{k+1}, v_2^{k+1}, \ldots, v_d^{k+1})$ is built from the last-generation velocity vector $v^{k} = (v_1^{k}, v_2^{k}, \ldots, v_d^{k})$ and the perturbation vector $\sigma^{k} = (\sigma_1^{k}, \sigma_2^{k}, \ldots, \sigma_d^{k})$. The perturbation vector mutates itself according to Eq. (32) at each iteration and acts as a controlling vector of the velocity vector.

The adaptive and Gaussian mutation operators restore the diversity lost by the population and improve the global search capacity of the algorithm.
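The HAGPSO update of Eqs. (29)-(32) for a single particle can be sketched as follows; the default values of beta, lam, alpha and m_sigma mirror the experimental settings reported in Section 5, but the function itself is only an illustrative reconstruction, not the authors' Matlab code.

import numpy as np

def hagpso_velocity_update(x, v, sigma, p_best, g_best, w0, k, f_i, f_best,
                           c1=2.0, c2=2.0, beta=0.8, lam=0.1, alpha=2.0, m_sigma=0.5):
    # x, v, p_best, g_best, w0: arrays of shape (d,); sigma, f_i, f_best: scalars.
    d = x.shape[0]
    # Eq. (31): adaptive inertia weight built from the fitness ratio and the iteration count.
    w_k = beta * (1.0 - f_i / f_best) + (1.0 - beta) * w0 * np.exp(-alpha * k ** 2)
    r1, r2 = np.random.rand(d), np.random.rand(d)
    # Eq. (29): inertia term blended with a Gaussian perturbation N(0, sigma).
    v_new = ((1.0 - lam) * w_k * v
             + lam * np.random.normal(0.0, sigma, size=d)
             + c1 * r1 * (p_best - x)
             + c2 * r2 * (g_best - x))
    x_new = x + v_new                                              # Eq. (30)
    sigma_new = sigma * np.exp(np.random.normal(0.0, m_sigma))     # Eq. (32)
    return x_new, v_new, sigma_new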
4. The procedures of HAGPSO and Wv-SVM
The HAGPSO algorithm is described in the following steps:

Algorithm 1

Step 1. Data preparation: the training, validation, and test sets are denoted Tr, Va, and Te, respectively.
Step 2. Particle initialization and PSO parameter setting: generate the initial particles and set the PSO parameters, including the number of particles (n), the particle dimension (m), the maximal number of iterations (k_max), the error limit of the fitness function, the velocity limit (V_max), the initial inertia weight of the particle velocity (w^0), the Gaussian distribution (N(0, M_sigma)), the initial perturbation momentum (sigma_i^0), the coefficient controlling particle velocity attenuation (alpha), the adaptive coefficient (beta), and the increment coefficient (lambda). Set the iteration variable k = 0 and perform the training process of Steps 3-8.
Step 3. Set the iteration variable k = k + 1.
Step 4. Compute the fitness function value of each particle. Take the current particle as the individual extremum of each particle, and take the particle with the minimal fitness value as the global extremum.
Step 5. Stopping condition check: if a stopping criterion (the predefined maximum number of iterations or the error accuracy of the fitness function) is met, go to Step 8; otherwise, go to the next step.
Step 6. Apply the adaptive mutation operator of Eq. (31) and the Gaussian mutation operator of Eq. (32) to manipulate the particle velocity.
Step 7. Update the particle positions by Eqs. (29) and (30) to form the new particle swarm, and go to Step 3.
Step 8. End the training procedure and output the optimal particle.
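The steps of Algorithm 1 can be strung together into a complete optimization loop, as sketched below for a generic fitness function over box-constrained parameters; the bound handling, initial values and helper names are assumptions introduced only for illustration.

import numpy as np

def hagpso(fitness, lower, upper, n=20, k_max=100, tol=2e-4, c1=2.0, c2=2.0,
           beta=0.8, lam=0.1, alpha=2.0, m_sigma=0.5, w0=0.9):
    lower, upper = np.asarray(lower, dtype=float), np.asarray(upper, dtype=float)
    d = lower.size
    v_max = 0.2 * (upper - lower)
    # Step 2: initialize particles, velocities and perturbation momenta.
    x = lower + np.random.rand(n, d) * (upper - lower)
    v = np.zeros((n, d))
    sigma = np.ones(n)
    f = np.array([fitness(p) for p in x])
    p_best, f_best = x.copy(), f.copy()            # Step 4: individual extrema
    g = int(np.argmin(f_best))                     # global extremum
    for k in range(1, k_max + 1):                  # Step 3
        if f_best[g] < tol:                        # Step 5: stopping check
            break
        w_k = beta * (1.0 - f / f_best[g]) + (1.0 - beta) * w0 * np.exp(-alpha * k ** 2)  # Eq. (31)
        r1, r2 = np.random.rand(n, d), np.random.rand(n, d)
        gauss = np.random.normal(0.0, sigma[:, None], size=(n, d))
        v = ((1.0 - lam) * w_k[:, None] * v + lam * gauss
             + c1 * r1 * (p_best - x) + c2 * r2 * (p_best[g] - x))   # Eq. (29), Step 6
        v = np.clip(v, -v_max, v_max)
        x = np.clip(x + v, lower, upper)           # Eq. (30), Step 7
        sigma *= np.exp(np.random.normal(0.0, m_sigma, size=n))     # Eq. (32)
        f = np.array([fitness(p) for p in x])      # Step 4
        improved = f < f_best
        p_best[improved], f_best[improved] = x[improved], f[improved]
        g = int(np.argmin(f_best))
    return p_best[g], f_best[g]                    # Step 8: output the optimal particle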
On the basis of the Wv-SVM model, we can summarize an estimation algorithm as follows.

Algorithm 2

Step 1. Initialize the original data by normalization and fuzzification, then form the training patterns.
Step 2. Select the appropriate wavelet kernel function K, the control constant v and the penalty factor C, and construct the QP problem (16) of the Wv-SVM.
Step 3. Solve the optimization problem to obtain the parameters alpha_i^(*), and compute the regression coefficient b by Eq. (25).
Step 4. For a new forecasting task, extract the load characteristics and form a set of input variables x, then compute the estimate y-hat by Eq. (24).
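To make Algorithm 2 concrete, the sketch below fits a v-SVR with a precomputed Morlet wavelet Gram matrix, which reproduces the Wv-SVM decision function of Eq. (24) up to the solver used; scikit-learn's NuSVR is a stand-in chosen for illustration, and the data names X_train, y_train, X_test are hypothetical. The commented parameter values are those reported for HAGPSO in Section 5.

import numpy as np
from sklearn.svm import NuSVR

def morlet_gram(A, B, a=1.0):
    # Gram matrix of the Morlet wavelet kernel (Eq. (11)) between the rows of A and B.
    diff = A[:, None, :] - B[None, :, :]
    return np.prod(np.cos(1.75 * diff / a) * np.exp(-diff ** 2 / (2.0 * a ** 2)), axis=2)

# Hypothetical usage, with C, nu and a supplied by the HAGPSO search of Algorithm 1:
# model = NuSVR(C=960.10, nu=0.88, kernel="precomputed")
# model.fit(morlet_gram(X_train, X_train, a=0.89), y_train)
# y_hat = model.predict(morlet_gram(X_test, X_train, a=0.89))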
5. Experiment
To analyze the performance of the proposed HAGPSO algorithm, the forecasting of a power load series by the intelligent system based on HAGPSO and Wv-SVM is studied. For comparison, the standard PSO is also adopted to optimize the parameters of the Wv-SVM; the better algorithm yields the better combination of Wv-SVM parameters, and a better parameter combination provides better forecasting capability in the regression estimation of the Wv-SVM. To evaluate the forecasting capacity of the intelligent system, evaluation indexes such as the mean absolute error (MAE), the mean absolute percentage error (MAPE) and the mean square error (MSE) are applied to the forecasting results of HAGPSOWv-SVM and PSOWv-SVM.
In our experiments, the power load series is selected from past load records of a typical power company. The detailed characteristic data and the load series compose the corresponding training and testing sample sets. In the power load forecasting process, the six influencing factors shown in Table 1, viz. sunlight, data, air pressure, temperature, rainfall and humidity, are taken into account. All linguistic information of the obtained influencing factors is processed by fuzzy comprehensive evaluation (Feng & Xu, 1999) to form numerical information. Suppose the number of variables is $n$, with $n = n_1 + n_2$, where $n_1$ and $n_2$ denote the numbers of fuzzy linguistic variables and crisp numerical variables, respectively. The linguistic variables are evaluated on several description levels, and a real number between 0 and 1 can be assigned to each description level.
Table 1
Influencing factors of power load forecasts.
Load characteristics Unit Expression Weight
Sunlight Dimensionless Linguistic information 0.9
Data Dimensionless Linguistic information 0.7
Air pressure Dimensionless Linguistic information 0.68
Temperature Dimensionless Linguistic information 0.8
Rainfall Dimensionless Linguistic information 0.7
Humidity Dimensionless Linguistic information 0.4
Distinct numerical variables have different dimensions and should be normalized first. The following normalization is adopted:

$$\bar{x}_i^{\,e} = \frac{x_i^{e} - \min_{i=1,\ldots,l} x_i^{e}}{\max_{i=1,\ldots,l} x_i^{e} - \min_{i=1,\ldots,l} x_i^{e}}, \quad e = 1, 2, \ldots, n_2 \qquad (33)$$

where $l$ is the number of samples, and $x_i^{e}$ and $\bar{x}_i^{\,e}$ denote the original value and the normalized value, respectively. In fact, all the numerical variables appearing in (1) through (32) are normalized values, although they are not marked by bars.
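A one-line sketch of the min-max normalization of Eq. (33), assuming each column of X holds one crisp numerical variable, is:

import numpy as np

def normalize(X):
    # Eq. (33): column-wise min-max scaling of the samples to the range [0, 1].
    X = np.asarray(X, dtype=float)
    return (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))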
The proposed HAGPSO algorithm has been implemented in the Matlab 7.1 programming language. The experiments were run on a 1.80 GHz Core(TM)2 CPU personal computer (PC) with 1.0 GB of memory under Microsoft Windows XP Professional. The initial parameters of HAGPSO are given as follows: inertia weight w0 = 0.9; positive acceleration constants c1 = c2 = 2; standard error of the Gaussian distribution M_sigma = 0.5; adaptive coefficient beta = 0.8; increment coefficient lambda = 0.1; fitness accuracy of the normalized samples equal to 0.0002; and coefficient controlling particle velocity attenuation alpha = 2.
Fig. 3. Mexican hat wavelet transform of the load series at different scales.
Fig. 4. Morlet wavelet transform of the load series at different scales.
Fig. 5. Gaussian wavelet transform of the load series at different scales.
The Morlet, Mexican hat and Gaussian wavelets are selected to analyze the load series at different scales, as shown in Figs. 3-5. Among the given wavelet transforms, the Morlet wavelet transform is the one that best matches the original load series over the scale range from 0.3 to 4.
Therefore, the Morlet wavelet can be adopted as the kernel function of the Wv-SVM model, and the three parameters are searched within

$$v \in [0, 1], \quad a \in [0.3, 2], \quad C \in \left[\frac{\max(x_{i,j}) - \min(x_{i,j})}{l}\times 10^{-3},\ \frac{\max(x_{i,j}) - \min(x_{i,j})}{l}\times 10^{3}\right]$$
The trend of the fitness value of HAGPSO is shown in Fig. 6. It is obvious that HAGPSO converges; therefore, HAGPSO can be applied to seek the parameters of the Wv-SVM. The optimal combination of parameters obtained by the HAGPSO algorithm is C = 960.10, v = 0.88 and a = 0.89. Fig. 7 illustrates the load series forecasting results given by HAGPSO and Wv-SVM.
To analyze the parameter-searching capacity of the HAGPSO algorithm, the standard PSO algorithm is also used to optimize the parameters of the Wv-SVM by training on the original load series; the forecasting results of each model for the latest 12 weeks are shown in Table 2. The comparison between HAGPSO and PSO optimizing the parameters of the same model (Wv-SVM) is shown in Table 3.

Table 3 shows the error index distribution of the two models. The MAE, MAPE and MSE of HAGPSOWv-SVM are better than those of PSOWv-SVM. It is obvious that the adaptive and Gaussian mutation operators improve the global search ability of the particle swarm optimization algorithm. The experimental results show that the forecasting precision is improved by HAGPSO compared with PSO under the same conditions.
6. Conclusion
In this paper, a new load forecasting model based on HAGPSO and Wv-SVM is proposed. A new version of PSO, viz. hybrid particle swarm optimization with adaptive mutation and Gaussian mutation (HAGPSO), is also proposed to optimize the parameters of the Wv-SVM. The performance of HAGPSOWv-SVM is evaluated by forecasting power load data, and the simulation results demonstrate that the Wv-SVM is effective in dealing with high dimensionality, nonlinearity and finite samples. Moreover, it is shown that the HAGPSO presented here is well suited to seeking the optimal parameters of the Wv-SVM.
Fig. 6. The change trend of the fitness function.
Fig. 7. The load forecasting results based on HAGPSOWv-SVM model.
Table 2
Comparison of forecasting results from the two models.

Week (latest 12)   Real value   PSOWv-SVM forecast   HAGPSOWv-SVM forecast
1                  580          725                  703
2                  2046         2010                 2018
3                  908          858                  880
4                  1625         1585                 1606
5                  452          547                  525
6                  2937         2880                 2920
7                  1135         1046                 1167
8                  2580         2493                 2499
9                  2561         2508                 2566
10                 781          908                  884
11                 1489         1536                 1516
12                 1532         1519                 1525
Table 3
Error statistic of two forecasting models.
Model MAE MAPE MSE
PSOWv-SVM 69.92 0.068 6292
HAGPSOWv-SVM 45.25 0.048 3473
In our experiments, fixed values of the adaptive coefficients (beta, lambda), of the control parameter M_sigma of the Gaussian mutation, and of the parameter alpha controlling the velocity attenuation are adopted. However, how to choose appropriate values for these coefficients is not addressed in this paper. Studying how the velocity changes when different values of the above parameters are adopted is a meaningful problem for future research.
References
Benaouda, D., Murtagh, F., Starck, J. L., & Renaud, O. (2006). Wavelet-based nonlinear multiscale decomposition model for electricity load forecasting. Neurocomputing, 70(1-3), 139-154.
Chenhui, L. (1987). Theory and method of load forecasting of power systems. Ha'erbin Institute of Technology Press.
Feng, S., & Xu, L. (1999). An intelligent decision support system for fuzzy comprehensive evaluation of urban development. Expert Systems with Applications, 16(1), 21-32.
Hong, W. C. (2009). Electric load forecasting by support vector model. Applied Mathematical Modelling, 33(5), 2444-2454.
Kennedy, J., & Eberhart, R. (1995). Particle swarm optimization. IEEE International Conference on Neural Networks, Australia, 1942-1948.
Liang, R. H. (1997). Application of grey linear programming to short-term hydro scheduling. Electric Power Systems Research, 41(3), 159-165.
Lin, S. W., Ying, K. C., Chen, S. C., & Lee, Z. J. (2008). Particle swarm optimization for parameter determination and feature selection of support vector machines. Expert Systems with Applications, 35(4), 1817-1824.
Mercer, J. (1909). Functions of positive and negative type and their connection with the theory of integral equations. Philosophical Transactions of the Royal Society of London, A-209, 415-446.
Pai, P. F., & Hong, W. C. (2005). Support vector machines with simulated annealing algorithms in electricity load forecasting. Energy Conversion and Management, 46(17), 2669-2688.
Santos, P. J., Martins, A. G., & Pires, A. J. (2007). Designing the input vector to ANN-based models for short-term load forecast in electricity distribution systems. International Journal of Electrical Power and Energy Systems, 29(4), 338-347.
Shen, Q., Shi, W. M., Kong, W., & Ye, B. X. (2007). A combination of modified particle swarm optimization algorithm and support vector machine for gene selection and tumor classification. Talanta, 71(4), 1679-1683.
Smola, A., & Scholkopf, B. (1998). The connection between regularization operators and support vector kernels. Neural Networks, 11, 637-649.
Topalli, A. K., Erkmen, I., & Topalli, I. (2006). Intelligent short-term load forecasting in Turkey. International Journal of Electrical Power and Energy Systems, 28(7), 437-447.
Vapnik, V. (1995). The Nature of Statistical Learning Theory. New York: Springer.
Wu, Q. (2009). The forecasting model based on wavelet v-support vector machine. Expert Systems with Applications, 36(4), 7604-7610.
Wu, Q., Liu, J., Xiong, F. L., & Liu, X. J. (2009). The fuzzy wavelet classifier machine with penalizing hybrid noises from complex diagnosis system. Acta Automatica Sinica, 35(6), 773-779 (in Chinese).
Wu, C. H., Tzeng, G. H., & Lin, R. H. (2009). A novel hybrid genetic algorithm for kernel function and parameter optimization in support vector regression. Expert Systems with Applications, 36(3), 4725-4735.
Wu, Q., & Yan, H. S. (2009). Forecasting method based on support vector machine with Gaussian loss function. Computer Integrated Manufacturing Systems, 15(2), 306-312 (in Chinese).
Wu, Q., & Yan, H. S. (in press). Product sales forecasting model based on robust v-support vector machine. Computer Integrated Manufacturing Systems (in Chinese).
Wu, Q., Yan, H. S., & Wang, B. (2009). The product sales forecasting model based on robust wavelet v-support vector machine. Acta Automatica Sinica, 37(7), 1227-1232 (in Chinese).
Wu, Q., Yan, H. S., & Yang, H. B. (2008a). A hybrid forecasting model based on chaotic mapping and improved support vector machine. In Proceedings of the ninth international conference for young computer scientists (pp. 2701-2706).
Wu, Q., Yan, H. S., & Yang, H. B. (2008b). A forecasting model based support vector machine and particle swarm optimization. In Proceedings of the 2008 workshop on power electronics and intelligent transportation system (pp. 218-222).
Yang, X. M., Yuan, J. S., Yuan, J. Y., & Mao, H. (2007). A modified particle swarm optimizer with dynamic adaptation. Applied Mathematics and Computation, 189, 1205-1213.
Ying, L. C., & Pan, M. C. (2008). Using adaptive network based fuzzy inference system to forecast regional electricity loads. Energy Conversion and Management, 49(2), 205-211.
Yuan, S. F., & Chu, F. L. (2007). Fault diagnostics based on particle swarm optimization and support vector machines. Mechanical Systems and Signal Processing, 21(4), 1787-1798.
Zhao, L., & Yang, Y. (2009). PSO-based single multiplicative neuron model for time series prediction. Expert Systems with Applications, 36(2), 2805-2812.