Upload
harinima
View
213
Download
0
Embed Size (px)
Citation preview
7/27/2019 qpso-ieee
1/5
An Evolutionary Quantum Behaved Particle Swarm Optimization for Mining
Association Rules
K.Indira
Research Scholar,Department of CSE,
Pondicherry Engg. College,
Puducherry, India
S.Kanmani
Professor,Department of IT,
Pondicherry Engg. College,
Puducherry, India
R.agan, !."ala#i, $.%ilton oseph
$inal &ear IT
Pondicherry Engg. College,
Puducherry, India
Abstract' Association Rule Mining is a techniue of datamining that aims in e!tracting interesting patterns" freuent
correlations or associations among set of items in the
transaction data#ases$ Particle Swarm Optimization %PSO& is
one of several methods for mining Association Rules$ 't is atechniue used to e!plore the search space of a given pro#lem
to find the settings or parameters reuired to ma!imize a
particular o#(ective$ )owever" avoiding local optima and
improving the convergence speed is still a tedious tas*$ 'n this
wor*" an Evolutionary Quantum #ehaved Particle Swarm
Optimization %QPSO& algorithm is proposed #y com#ining the
classical PSO philosophy and uantum mechanics to improve
the performance of PSO$ 't is a glo#al+convergence guaranteed
algorithm and has a #etter search a#ility than the traditional
PSO$ ,he performance of QPSO is compared with #asic PSO"
and three other variants of PSO and the e!perimental results
shows that the proposed algorithm outperforms PSO uite
significantly and is also found to #e as effective as the PSO
variants with marginally #etter computational efficiency$
Keywords+ Particle Swarm Optimization- QuantumBehavior- Association Rule Mining- .lo#al convergence$
I. I(TR)D*CTI)(
Data mining, also +non as Knoledge Disco-ery in
Dataases is the process of e/tracting the patterns from data
and ta+ing an action ased on the category of the pattern. 0s
the amount of data stored in dataases and types ofdataases increases rapidly e/tracting the critical hidden
information from these dataases has ecome an importantissue. Se-eral methods such as association rules, clustering
and classification ha-e een used to otain interesting factsfrom data. 0mong these models, association rule mining is
the most idely applied method.
Initially traditional algorithms li+e 0priori, DynamicItemset Counting, $P1!roth tree ha-e een used for
e/tracting the interesting patterns. "ut, as the numer of
rules and the correlations in dataases increases, the
computational comple/ity continues to gro rapidly. 2ater
E-olutionary algorithms such as !enetic 0lgorithms 3!0s4and Particle Sarm )ptimi5ation 3PS)4 ha-e een
introduced for mining association rules. The mainmoti-ation for using !0 in the disco-ery of association
rules is that they cope etter ith attriute interactionthrough mutation and crosso-er operators. 0lthough !0
disco-ers high le-el prediction rules ith etter attriute
interaction, the operators tends to increase the comple/ity of
computation. Similar to !0, PS) is also a population1asediterati-e algorithm. The fundamental prolem in PS) is its
imalanced local and gloal search and also hile trying to
reduce the con-ergence time it gets trapped in local optimal
region hen sol-ing comple/ multimodal prolems. Theseea+nesses ha-e restricted ider applications of PS).
Therefore, accelerating con-ergence speed and a-oidingthe local optima ha-e ecome the to most important and
appealing goals in PS) research. In this or+, an
E-olutionary 6uantum eha-ed Particle Sarm
)ptimi5ation algorithm that features etter search efficiencyy a-oiding local optima and impro-ing the con-ergence
speed is proposed.
The paper is organi5ed as follos. In section II thepreliminaries as association rule mining, PS) are discussed.
Section III analy5es the e/isting literature on PS) and its
-ariants. The proposed 6PS) methodology along ith
e-aluation parameters are gi-en in section I7. Section 7n
discusses the e/perimental results folloed y theconclusion in section 7I.
II. PRE2I%I(0RIES
A. Association Rule Mining
0ssociation rule mining is a techni8ue of data mining that
is -ery idely used to deduce inferences from largedataases. Typically the relationship ill e in the form of a
7/27/2019 qpso-ieee
2/5
rule9 I$ :;Conse8uent?. The rule
should e such that ;& A .
The to parameter that indicate the importance of
association rules are Bsupport and Bconfidence.
Thesupport indicates ho often the rule holds in a set ofdata. It is gi-en y
onstransactiofno.Total
;,ithonstransacti(o.of43sup =Xport
34
The confidence for a gi-en rule is a measure of
ho often the conse8uent is true, gi-en that the antecedentis true. If the conse8uent is false hile the antecedent is
true, then the rule is also false. If the antecedent is not
matched y a gi-en data item, then this item does not
contriute to the determination of the confidence of the rule.It is gi-en y
34
B. PSO Algorithm
PS) introduced y Kennedy and EerhartFG simulates
the eha-iors of ird floc+ing. In PS), the solutions to theprolem are represented as particles in the search space.
PS) is initiali5ed ith a group of random particles
3solutions4 and then searches for optimum -alue y updatingparticles in successi-e generations. In each iteration, all theparticles are updated y folloing to est -alues. The first
one is the est solution 3fitness4 it has achie-ed so far. This
-alue is called Bp#est. 0nother HestH -alue that is trac+ed
y the particle sarm optimi5er is the est -alue, otainedso far y any particle in the population. This est -alue is a
gloal est and called Bg#est.
Particle Sarm has to primary operations9 7elocityupdate and Position update. During each generation each
particle is accelerated toard the particles pre-ious est
position and the gloal est position. 0t each iteration, a
ne -elocity -alue for each particle is calculated ased on
its current -elocity, the distance from its pre-ious estposition, and the distance from the gloal est position. The
ne -elocity -alue is then used to calculate the ne/t position
of the particle in the search space. This process is theniterated a set numer of times or until a minimum error is
achie-ed.
0fter finding the to est -alues, the particle updates its-elocity and positions ith the folloing formulae
34
3J4
here -FG is the particle -elocity, presentFG is the currentparticle. pestFG and gestFG are local est and gloal est
position of particles. rand 34 is a random numer eteen
3,4. c and c are learning factors. *sually c A c A .
The outline of asic Particle Sarm )ptimi5er is as follos
Step /0 Initiali5e the population
Step 10 E-aluate fitness of the indi-idual particle 3update
pest4
Step 20 Keep trac+ of the indi-idual?s highest fitness 3gest4
Step 30 %odify -elocities ased on pest and gest
Step 40 *pdate the particle position
Step 50 Terminate if the condition is met
Step 60 !o to step
III. 2ITER0T*RERE7IEL
Particle sarm optimi5ation 3PS)4 is a population1
ased stochastic optimi5ation algorithm ased on the
metaphor of social interaction and communication such asird floc+ing and fish schooling F,G. In order to gain a
deep insight into the mechanism of PS), many theoretical
analyses ha-e een done on the algorithm. 0s for the
algorithm itself, 7an den "ergh FJGpro-ed that the canonicalPS) is not a gloal search algorithm, e-en not a local one,
according to the con-ergence criteria pro-ided y Solis and
Lets FMG.
In addition to the theoretical analyses mentioned
ao-e, there has een a considerale amount of or+ done
in de-eloping the original -ersion of PS), through
empirical simulations. Shi and Eerhart FG introduced theconcept of inertia eight to the original PS), in order to
alance the local and gloal search during the optimi5ation
process. 0ngeline FNG introduced a tournament selection into
PS) ased on the particle?s current fitness so that the
43sup
43sup43
Xport
YXportYXConfidence
=
7/27/2019 qpso-ieee
3/5
properties that ma+e some solutions superior ere
transferred directly to some of the less effecti-e particles.
This techni8ue impro-ed the performance of the PS)
algorithm on some enchmar+ functions.
Recently, inspired y 8uantum mechanics and tra#ectory
analysis of PS) FOG, Sun et al. used a strategy ased on a
8uantum d potential ell model to sample around thepre-ious est pointsFG, and later introduced the mean est
position into the algorithm and proposed a ne -ersion of
PS), 8uantum1eha-ed particle sarm optimi5ation
36PS)4 FQ,G. The iterati-e e8uation of 6PS) is -erydifferent from that of PS). "esides, unli+e PS), 6PS)
needs no -elocity -ectors for particles, and also has feer
parameters to ad#ust, ma+ing it easier to implement. The
6PS) algorithm has een shon to successfully sol-e aide range of continuous optimi5ation prolems and many
efficient strategies ha-e een proposed to impro-e the
algorithm F,G. This paper applies the 6PS)
methodology for mining association rules.
I7. %ET=)D)2)!&
A. Quantum behaed PSO
The 6uantum inspired Particle Sarm )ptimi5ation
36PS)4 is one of the recent optimi5ation methods ased on
8uantum mechanics. 2i+e any other e-olutionary algorithm,
a 8uantum inspired particle sarm algorithm relies on therepresentation of the indi-idual, the e-aluation function and
the population dynamics. The particularity of 8uantum
particle sarm algorithm stems from the 8uantumrepresentation it adopts hich allos representing thesuperposition of all potential solutions for a gi-en prolem.
%oreo-er, the position of a particle depends on the
proaility amplitudes a andb of the a-e function .
6PS) also stems from the 8uantum operators it uses toe-ol-e the entire population through generations. 6PS)
constitutes a poerful strategy to di-ersify the 6PS)
population and enhance the 6PS)?s performance in
a-oiding premature con-ergence to local optima.
In terms of classical mechanics, a particle is depicted y
its position -ector /iand -elocity -ector -i, hich determine
the tra!ector" of the particle. The particle mo-es along adetermined tra#ectory folloing (etonian mechanics, ut
this is not the case in 8uantum mechanics. In 8uantum
orld, the term tra!ector" is meaningless, ecause / i and
-iof a particle cannot e determined simultaneouslyaccording to uncertaint" principle. Therefore, if indi-idual
particles in a PS) system ha-e 8uantum eha-ior, the PS)
algorithm is ound to or+ in a different fashion.
In the 8uantum model of a PS), the state of a particle is
depicted y a-e function 3#, t4, instead of position and
-elocity. The dynamic eha-ior of the particle is idely
di-ergent from that of the particle in traditional PS)systems. The particles mo-e according to the folloing
iterati-e e8uations9
/3t4 A p U Vmest W /3t4V U ln3Xu4, if + YA .M/3t4 A pWUVmestW/3t4VUln3Xu4, if +Z.M 3M4
Lhere,
p A 3cpid cpgd4X 3c c4
u,+,c,care uniformly distriuted random numers in the
inter-al F,G.
is the contraction1e/pansion coefficient.
pidis the pest of the ithparticle.
pgdis the gloal est particle.
%ean est position or %ainstream Though point 3mest4 of
the population is defined as the mean of the est position of
all particle.
3N4
The steps in-ol-ed in 6uantum Particle Sarm )ptimi5er
are as follos
Step /0 Initiali5e the population
Step 10 E-aluate fitness of the indi-idual particle 3update
pest4
Step 20 Keep trac+ of the indi-idual?s highest fitness 3gest4
Step 30 Calculate mest 3mean of the pest of all particles4
Step 40 *pdate the particle position
Step 50 Terminate if the condition is met
Step 60 !o to step
B. $aluation Parameters
%uch or+ on association rule mining has een donefocusing on efficiency, effecti-eness and redundancy. $ocus
is also needed on the 8uality of rules mined. In this paper
association rule mining using 6PS) is treated as a multi1
o#ecti-e prolem here rules are e-aluated 8uantitati-elyand 8ualitati-ely ased on the folloing measures.
7/27/2019 qpso-ieee
4/5
Predictive Accuracy Predicti-e 0ccuracy 3P04 measures
the effecti-eness of the rules mined. The mined rules must
ha-e high predicti-e accuracy. The formula is gi-en y9
3O4
Lhere V;[&V is the numer of records that satisfy oth theantecedent ; and conse8uent &, V;V is the numer of rules
satisfying the antecedent ;.
7um#er of rules generated 0The count of the rulesgenerated ao-e a certain P0.
8aplace0 It is a confidence estimator that ta+es support into
account, ecoming more pessimistic as the support of ;
decreases. It is useful to detect spurious rules that may occury chance. It is defined as
34
9itness 0 $itness -alue is utili5ed to e-aluate the importanceof each particle. It is gi-en y
$itness3+4 A confidence3+4 Ulog3support3+44 U length3+4
3Q4
The o#ecti-e of this fitness function is ma/imi5ation. The
larger the particle support and confidence, the greater thestrength of the association, meaning that it is an important
association rule.
:onviction 0 Con-iction measures the ea+ness ofconfidence. Con-iction is infinite for logical implications
3confidence 4, and is if ; and & are independent.
34
7. RES*2TS0(DDISC*SSI)(
The goal of this section is to determine the o-erall
performance of 6PS) hen applied for association rule
mining. The proposed algorithm as tested on fi-e
enchmar+ datasets.
2enses, Car e-aluation, Post1)perati-e Patient Care,
=aerman?s Sur-i-al and \oo datasets ere ta+en up from
*CI Ir-ine repository for the e/periment. The effecti-eness
of 6PS) is compared ith !0 and PS) methodology formining association rules FG. To confirm the effecti-eness of
PS) and 6PS), oth the algorithms ere coded in a-a.
The details of the datasets are listed in Tale . The
efficiency of the algorithms as compared y utili5ing theoutput parameters mentioned earlier.
T0"2E I. D0T0SET DESCRIPTI)(
;ataset Attri#utes Swarm
Size
:lasses
2enses J J
Car E-aluation N O J
=aerman?s Sur-i-al
Post1operati-e Patient Care O
\oo N O
A. Predictie Accurac"
The predicti-e accuracy of the association rules mined
y proposed 6PS) methodology is compared ith the
results of !0 an PS) is plotted in $igure .
$igure . Predicti-e 0ccuracy Comparison
The predicti-e accuracy achie-ed y 6PS) is etter
compared to !0 and PS) for all the fi-e datasets ta+en upfor study.
B. %umber of rules generated
The numer of rules generated y 6PS) ith predicti-e
accuracy less than .M percentage of the ma/imum
predicti-e accuracy arri-ed is taulated for mining
X
YXcurac"edictieAc[
Pr =
E43sup
D43sup43
+
+=
Xport
YXportYXlapl
7/27/2019 qpso-ieee
5/5
association rules in comparison ith !0 and PS)
methodology for the datasets as shon in Tale .
T0"2E . C)%P0RIS)( )$ T=E (*%"ER )$ R*2ES !E(ER0TED
;atasets
.A PSO QPSO
2enses3J4
Car E-aluation
3O4
M N M
=aerman?sSur-i-al 34
J
Postoperati-e
Patient3O4
J Q J
\oo34O N M
It is noted that the 6PS) methodology generates
more numer of rules hen compared ith !0 and PS)
0ccuracy alone is not the o#ecti-e of mining
association rules. The interestingness of the association
rules mined is also measured through 2aplace, Con-ictionand 2e-erage measures. Tale presents the results of
2aplace, Con-iction and 2e-erage measures of the rules
mined from the datasets ta+en up for analysis.
T0"2E . %E0S*RES )$ 0SS)CI0TI)( R*2E %I(I(! LIT=
6PS)
;atasets 8aplace :onviction 8everage2enses .MQJ infinity .N
=aerman]s
Sur-i-al
.MN infinity .Q
Car E-aluation .MJ infinity .QJ
Postoperati-e
Patient
.M infinity .
\oo .MJ infinity .JQ
7I. C)(C2*SI)(
The 6PS) algorithm is superior to the standard PS)mainly in three aspects. $irstly, 8uantum theory is an
uncertain system in hich different state of the particles anda ider searching space of the algorithm can e generated.
Secondly, the introduction of mest into 6PS) is a
enchmar+ in 6uantum Theory. In the standard PS), it
con-erges fast, ut at times, the fast con-ergence happens in
the first fe iterations utfalls into a local optimal situationeasily in the ne/t fe iterations. Lith the introduction of
mest in 6PS), the a-erage error is loered, since each
particle cannot con-erge fast ithout considering its
colleagues, hich ma+es the fre8uency loer than PS).2astly, 6PS) has feer parameters compared to standard
PS) and it is much easier to implement and run. =ence the
performance of the algorithm is significantly impro-ed y
6PS).
RE$ERE(CES
FG. . Kennedy, and R. Eerhart, BParticle Sarm )ptimi5ation Proc.
IEEE International Conference on (eural (etor+s 3Perth,
0ustralia4, IEEE Ser-ice Center, Piscataay, (,FG. K.Indira, S.Kanmani, P.Prashanth, 7.=arish Konda Ramcharan Te#a,
and R.ee-a, BPopulation "ased Search %ethods in %ining
0ssociation Rules, Proc. Third International Conference on0d-ances in Communication, (etor+, and Computing W C(C
, 2(CS, , pp. MMWN.FG. &. Shi, R.C. Eerhart, B0 modified particle sarm optimi5er, Proc.
IEEE International Conference on E-olutionary Computation, IEEE
Press, Piscataay, (, QQ, pp. NQWO.
FJG. $. 7an den "ergh, 0n 0nalysis of Particle Sarm )ptimi5ers,
Ph.D. dissertation, *ni-. Pretoria, Pretoria, South 0frica, .FMG. $.. Solis, R. 1". BLets, %inimi5ation y random search
techni8ues, B%athematics of )perations Research, 7ol. N Q, pp.
QW.
FNG. P.. 0ngeline, B*sing selection to impro-e particle sarm
optimi5ation, Proc. IEEE International Conference on E-olutionaryComputation, ICEC QQ, 0nchorage, 0K, *S0, QQ, pp. JWQ.
FOG. %. Clerc, and . Kennedy, BThe particle sarm W e/plosion,
staility, and con-ergence in a multidimensional comple/ space,B
IEEE Transactions on E-olutionary Computation, 7ol. N, , pp.MWO.FG. . Sun, ". $eng, and L. ;u, BParticle sarm optimi5ation ith
particles ha-ing 8uantum eha-ior, Proc. IEEE Congress on
E-olutionary Computation, J, pp.MW.
FQG. . Sun, ". $eng, and L. ;*, B0 !loal search strategy of 8uantum1
eha-ed particle sarm optimi5ation, Proc. IEEE Conference onCyernetics and Intelligent Systems, J, pp. WN.
FG. un, ;. Leno, and $. "in, B0dapti-e parameter control for
8uantum1eha-ed particle sarm optimi5ation on indi-idual le-el, B
-ol. J, M, pp. JQWMJ.
FG. S.%. %i++i, and 0.0. Kish+, B6uantum particle sarmoptimi5ation for electromagnetics, IEEE Transactions on 0ntennas
and Propagation, 7ol. MJ, N, pp. ONJW OOM.
FG. S.(. )m+ar, K. Rahul, T.7.S. 0nanth, !.(. (ai+, and S.
!opala+rishnan, B6uantum eha-ed particle sarm optimi5ation
36PS)4 for multi1o#ecti-e design optimi5ation of composite
structures, E/pert Systems ith 0pplications, 7ol. N, Q, pp.
W