A New Optimizer Using Particle Swarm Theory


Russell Eberhart
Purdue School of Engineering and Technology
Indianapolis, IN 46202-5160
eberhart@engr.iupui.edu

James Kennedy
Bureau of Labor Statistics
Washington, DC 20212
kennedyj@pol.ocsp.bls.gov

ABSTRACT

The optimization of nonlinear functions using particle swarm methodology is described. Implementations of two paradigms are discussed and compared, including a recently developed locally oriented paradigm. Benchmark testing of both paradigms is described, and applications, including neural network training and robot task learning, are proposed. Relationships between particle swarm optimization and both artificial life and evolutionary computation are reviewed.

1 INTRODUCTION

A new method for optimization of continuous nonlinear functions was recently introduced [6]. This paper reviews the particle swarm optimization concept. Discussed next are two paradigms that implement the concept, one globally oriented (GBEST) and one locally oriented (LBEST), followed by results obtained from applications and tests upon which the paradigms have been shown to perform successfully.

Particle swarm optimization has roots in two main component methodologies. Perhaps more obvious are its ties to artificial life (A-life) in general, and to bird flocking, fish schooling, and swarming theory in particular. It is also related, however, to evolutionary computation, and has ties to both genetic algorithms and evolution strategies [1].

Particle swarm optimization comprises a very simple concept, and paradigms are implemented in a few lines of computer code. It requires only primitive mathematical operators, and is computationally inexpensive in terms of both memory requirements and speed. Early testing has found the implementation to be effective with several kinds of problems [6]. This paper discusses application of the algorithm to the training of artificial neural network weights. Particle swarm optimization has also been demonstrated to perform well on genetic algorithm test functions, and it appears to be a promising approach for robot task learning.

Particle swarm optimization can be used to solve many of the same kinds of problems as genetic algorithms (GAs) [6]. This optimization technique does not suffer, however, from some of the GAs' difficulties; interaction in the group enhances rather than detracts from progress toward the solution. Further, a particle swarm system has memory, which the genetic algorithm does not have. Change in genetic populations results in destruction of previous knowledge of the problem, except when elitism is employed, in which case usually one or a small number of individuals retain their identities. In particle swarm optimization, individuals who fly past optima are tugged to return toward them; knowledge of good solutions is retained by all particles.

2 THE PARTICLE SWARM OPTIMIZATION CONCEPT

Particle swarm optimization is similar to a genetic algorithm [2] in that the system is initialized with a population of random solutions. It is unlike a genetic algorithm, however, in that each potential solution is also assigned a randomized velocity, and the potential solutions, called particles, are then "flown" through hyperspace.

Each particle keeps track of its coordinates in hyperspace which are associated with the best solution (fitness) it has achieved so far. (The value of that fitness is also stored.) This value is called pbest. Another "best" value is also tracked. The "global" version of the particle swarm optimizer keeps track of the overall best value, and its location, obtained thus far by any particle in the population; this is called gbest.
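As a concrete illustration, the following is a minimal Python sketch of the global (GBEST) version, assuming the velocity update takes the form described in [6]: each velocity component is accelerated toward pbest and gbest by amounts weighted by fresh random numbers and an acceleration constant, and clamped to within plus or minus VMAX (the two parameters varied in the tables below). All names here are illustrative, not from the paper.

```python
import random

def gbest_pso(f, dim, lo, hi, n_particles=20, acc_const=2.0,
              vmax=6.0, max_iter=1000):
    """Minimize f over [lo, hi]^dim with a GBEST particle swarm (sketch)."""
    # Initialize particles with random positions and velocities.
    x = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[random.uniform(-vmax, vmax) for _ in range(dim)] for _ in range(n_particles)]
    pbest = [xi[:] for xi in x]          # best position each particle has seen
    pbest_val = [f(xi) for xi in x]      # fitness at that position
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(max_iter):
        for i in range(n_particles):
            for d in range(dim):
                # Accelerate toward pbest and gbest, each term weighted by
                # its own random number; clamp velocity to +/- VMAX.
                v[i][d] += (acc_const * random.random() * (pbest[i][d] - x[i][d])
                            + acc_const * random.random() * (gbest[d] - x[i][d]))
                v[i][d] = max(-vmax, min(vmax, v[i][d]))
                x[i][d] += v[i][d]       # "fly" the particle
            val = f(x[i])
            if val < pbest_val[i]:       # new personal best
                pbest[i], pbest_val[i] = x[i][:], val
                if val < gbest_val:      # new global best
                    gbest, gbest_val = x[i][:], val
    return gbest, gbest_val

# Example: minimize a 2-D sphere function.
# best, err = gbest_pso(lambda p: sum(c * c for c in p), dim=2, lo=-10, hi=10)
```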

3.2 The LBEST Version

Based, among other things, on findings from social simulations, it was decided to design a "local" version (paradigm) of the particle swarm concept. In this paradigm, particles have information only of their own and their nearest array neighbors' bests, rather than that of the entire group. Instead of moving toward the stochastic average of pbest and gbest (the best evaluation in the entire group), particles move toward the points defined by pbest and lbest, the index of the particle with the best evaluation in the neighborhood. In the neighborhood=2 model, for instance, particle(i) compares its error value with particle(i-1) and particle(i+1). The lbest version was tested with neighborhoods consisting of the immediately adjacent neighbors (neighborhood=2), and with the three neighbors on each side (neighborhood=6).

Table 1 shows results of performance on the XOR neural-net problem with neighborhood=2. Note that no trials fixated on local optima; nor have any in hundreds of unreported tests. Cluster analysis of sets of weights from this version showed that blocks of neighbors, consisting of regions from 2 to 8 adjacent individuals, had settled into the same regions of the solution space.

It appears that the invulnerability of this version to local optima might result from the fact that a number of groups of particles spontaneously separate and explore different regions. It is thus a more flexible approach to information processing than the GBEST model.

Nonetheless, though this version rarely if ever becomes entrapped in a local optimum, it clearly requires more iterations on average to find a criterion error level. Table 2 represents tests of an LBEST version with neighborhood=6, that is, with the three neighbors on each side of the agent taken into account (arrays wrapped, so the final element was considered to be beside the first one). This version is prone to local optima, at least when VMAX is small, though less so than the GBEST version. Otherwise it seems, in most cases, to perform somewhat less well than the standard GBEST algorithm.

In sum, the neighborhood=2 model offered some intriguing possibilities, in that it seems immune to local optima. It is a highly decentralized model, which could be run with any number of particles. Expanding the neighborhood speeds up convergence, but introduces the frailties of the GBEST model.
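The neighborhood lookup itself is simple. Here is a minimal sketch, in the same illustrative style as the GBEST code above, of how lbest could be computed on a wrapped array; k=1 corresponds to the neighborhood=2 model and k=3 to neighborhood=6:

```python
def lbest_index(pbest_val, i, k):
    """Index of the best (lowest) evaluation among particle i and its k
    nearest array neighbors on each side; the array wraps, so the final
    element is considered to be beside the first one."""
    n = len(pbest_val)
    neighborhood = [(i + off) % n for off in range(-k, k + 1)]
    return min(neighborhood, key=lambda j: pbest_val[j])

# In the GBEST sketch, the only change is to replace gbest[d] with the
# neighborhood best pbest[l][d], where l = lbest_index(pbest_val, i, k):
#   v[i][d] += (acc_const * random.random() * (pbest[i][d] - x[i][d])
#               + acc_const * random.random() * (pbest[l][d] - x[i][d]))
```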

Table 1. Local version, neighborhood=2. Median iterations required to meet a criterion of squared error per node < 0.02. Population=20 particles. There were no trials with iterations > 2000.

              ACC-CONST
  VMAX     2.0      1.0      0.5
  2.0     38.5     73       340.5
  4.0     28.5     37.5     53.5
  6.0     29.5     39.5

Table 2. Local version, neighborhood=6. Median iterations required to meet a criterion of squared error per node < 0.02. Population=20 particles.

              ACC-CONST
  VMAX     2.0        1.0        0.5
  2.0     31.5 (2)   38.5 (1)   27 (1)
  4.0     36 (1)     26         25
  6.0     26.5       29         20

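To make the tables concrete: the XOR problem becomes a swarm optimization problem once the network error is expressed as a function of the weight vector. The following sketch assumes a 2-2-1 sigmoid network (the architecture and weight layout are assumptions, not specified in this paper) and returns squared error averaged over the training cases, in the spirit of the criterion used in Tables 1 and 2:

```python
import math

def xor_error(w):
    """Mean squared error of a 2-2-1 sigmoid network on XOR; w packs the
    9 weights: two hidden units (2 inputs + bias each) and one output
    unit (2 hidden inputs + bias). Assumed layout, for illustration."""
    sig = lambda a: 1.0 / (1.0 + math.exp(-a))
    cases = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
    err = 0.0
    for (x1, x2), t in cases:
        h1 = sig(w[0] * x1 + w[1] * x2 + w[2])
        h2 = sig(w[3] * x1 + w[4] * x2 + w[5])
        out = sig(w[6] * h1 + w[7] * h2 + w[8])
        err += (t - out) ** 2
    return err / len(cases)

# The swarm then searches the 9-dimensional weight space, e.g.:
# best_w, err = gbest_pso(xor_error, dim=9, lo=-10, hi=10)
```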


4 FLOCKS, SWARMS AND PARTICLES

A number of scientists have created computer simulations of various interpretations of the movement of organisms in a bird flock or fish school. Notably, Reynolds [10] and Heppner and Grenander [4] presented simulations of bird flocking.

It became obvious during the development of the particle swarm concept that the behavior of the population of agents is more like a swarm than a flock. The term swarm has a basis in the literature. In particular, the authors use the term in accordance with a paper by Millonas [7], who developed his models for applications in artificial life, and articulated five basic principles of swarm intelligence. First is the proximity principle: the population should be able to carry out simple space and time computations. Second is the quality principle: the population should be able to respond to quality factors in the environment. Third is the principle of diverse response: the population should not commit its activities along excessively narrow channels. Fourth is the principle of stability: the population should not change its mode of behavior every time the environment changes. Fifth is the principle of adaptability: the population must be able to change behavior mode when it's worth the computational price. Note that principles four and five are opposite sides of the same coin.

The particle swarm optimization concept and paradigm presented in this paper seem to adhere to all five principles. Basic to the paradigm are n-dimensional space calculations carried out over a series of time steps. The population is responding to the quality factors pbest and gbest/lbest. The allocation of responses between pbest and gbest/lbest ensures a diversity of response. The population changes its state (mode of behavior) only when gbest/lbest changes, thus adhering to the principle of stability. The population is adaptive because it does change when gbest/lbest changes.

The term particle was selected as a compromise. While it could be argued that the population members are mass- and volume-less, and thus could be called points, it is felt that velocities and accelerations are more appropriately applied to particles, even if each is defined to have arbitrarily small mass and volume. Further, Reeves [9] discusses particle systems consisting of clouds of primitive particles as models of diffuse objects such as clouds, fire and smoke. Thus the label the authors have chosen to represent the optimization concept is particle swarm.

5 TESTS AND EARLY APPLICATIONS OF THE OPTIMIZER

The paradigm has been tested using systematic benchmark tests as well as by observing its performance on applications that are known to be difficult. The neural-net application described in Section 3, for instance, showed that the particle swarm optimizer could train NN weights as effectively as the usual error backpropagation method. The particle swarm optimizer has also been used to train a neural network to classify the Fisher Iris Data Set [3]. Again, the optimizer trained the weights as effectively as the backpropagation method. Over a series of ten training sessions, the particle swarm optimizer paradigm required an average of 284 epochs [6].

The particle swarm optimizer was compared to a benchmark for genetic algorithms in Davis [2]: the extremely nonlinear Schaffer f6 function. This function is very difficult to optimize, as the highly discontinuous data surface features many local optima. The particle swarm paradigm found the global optimum each run, and appears to approximate the results reported for elementary genetic algorithms in Chapter 2 of [2] in terms of the number of evaluations required to reach certain performance levels [6].

GAs have been used to learn complex behaviors characterized by sets of sequential decision rules. One approach uses Cooperative Coevolutionary Genetic Algorithms (CCGAs) to evolve sequential decision rules that control simulated robot behaviors [8]. The GA is used to evolve populations of rule sets, which are applied to problems involving multiple robots in competitive or cooperative tasks. Use of particle swarm optimization instead of the GA, currently being explored, may enhance population evolution. For example, migration among sub-species of robots can be a problem due to GA crossover; this problem should not exist with particle swarms.
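For reference, the Schaffer f6 benchmark can be written as below. This is a sketch using the commonly cited form of f6 (the exact expression is not reproduced in this paper); its surface ripples with concentric rings of local optima around a single global optimum at the origin:

```python
import math

def schaffer_f6(p):
    """Schaffer's f6 benchmark in its commonly cited form: highly
    multimodal, with a global minimum of 0 at the origin."""
    x, y = p
    r2 = x * x + y * y
    return 0.5 + (math.sin(math.sqrt(r2)) ** 2 - 0.5) / (1.0 + 0.001 * r2) ** 2

# Usable directly with the earlier GBEST sketch:
# best, err = gbest_pso(schaffer_f6, dim=2, lo=-100, hi=100)
```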

6 CONCLUSIONS

This paper introduces a new form of the particle swarm optimizer, and examines how changes in the paradigm affect the number of iterations required to meet an error criterion, and the frequency with which models cycle interminably around a nonglobal optimum. Three versions were tested: the GBEST model, in which every agent has information about the group's best evaluation, and two variations of the LBEST version, one with a neighborhood of six, and one with a neighborhood of two.

It appears that the original GBEST version performs best in terms of median number of iterations to convergence, while the LBEST version with a neighborhood of two is most resistant to local minima.

Particle swarm optimization is an extremely simple algorithm that seems to be effective for optimizing a wide range of functions. We view it as a mid-level form of A-life or biologically derived algorithm, occupying the space in nature between evolutionary search, which requires eons, and neural processing, which occurs on the order of milliseconds. Social optimization occurs in the time frame of ordinary experience; in fact, it is ordinary experience. In addition to its ties with A-life, particle swarm optimization has obvious ties with evolutionary computation. Conceptually, it seems to lie somewhere between genetic algorithms and evolutionary programming. It is highly dependent on stochastic processes, like evolutionary programming. The adjustment toward pbest and gbest by the particle swarm optimizer is conceptually similar to the crossover operation utilized by genetic algorithms. It uses the concept of fitness, as do all evolutionary computation paradigms.

Unique to the concept of particle swarm optimization is flying potential solutions through hyperspace, accelerating toward "better" solutions. Other evolutionary computation schemes operate directly on potential solutions, which are represented as locations in hyperspace. Much of the success of particle swarms seems to lie in the agents' tendency to hurtle past their target. Holland's chapter on the optimum allocation of trials [5] reveals the delicate balance between conservative testing of known regions versus risky exploration of the unknown. It appears that the current version of the paradigm allocates trials nearly optimally. The stochastic factors allow thorough search of spaces between regions that have been found to be relatively good, and the "momentum" effect caused by modifying the extant velocities rather than replacing them results in overshooting, or exploration of unknown regions of the problem domain.

Much further research remains to be conducted on this simple new concept and paradigm. The goals in developing it have been to keep it simple and robust, and we seem to have succeeded at that. The algorithm is written in a very few lines of code, and requires only specification of the problem and a few parameters in order to solve it.

ACKNOWLEDGMENT

Portions of this paper are adapted from a chapter on particle swarm optimization in a book entitled Computational Intelligence PC Tools, to be published in early 1996 by Academic Press Professional (APP). The permission of APP to include this material is gratefully acknowledged.

REFERENCES

[1] T. Baeck, Generalized convergence models for tournament and (mu, lambda)-selection. Proc. of the Sixth International Conf. on Genetic Algorithms, pp. 2-7, Morgan Kaufmann Publishers, San Francisco, CA, 1995.

[2] L. Davis, Ed., Handbook of Genetic Algorithms. Van Nostrand Reinhold, New York, NY, 1991.

[3] R. A. Fisher, The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7:179-188, 1936.

[4] F. Heppner and U. Grenander, A stochastic nonlinear model for coordinated bird flocks. In S. Krasner, Ed., The Ubiquity of Chaos, AAAS Publications, Washington, DC, 1990.

[5] J. H. Holland, Adaptation in Natural and Artificial Systems. MIT Press, Cambridge, MA, 1992.

[6] J. Kennedy and R. Eberhart, Particle swarm optimization. Proc. IEEE International Conf. on Neural Networks (Perth, Australia), IEEE Service Center, Piscataway, NJ, 1995 (in press).

[7] M. Millonas, Swarms, phase transitions, and collective intelligence. In C. G. Langton, Ed., Artificial Life III, Addison-Wesley, Reading, MA, 1994.

[8] M. Potter, K. De Jong, and J. Grefenstette, A coevolutionary approach to learning sequential decision rules. Proc. of the Sixth International Conf. on Genetic Algorithms, pp. 366-372, Morgan Kaufmann Publishers, San Francisco, CA, 1995.

[9] W. T. Reeves, Particle systems - a technique for modeling a class of fuzzy objects. ACM Transactions on Graphics, 2(2):91-108, 1983.

[10] C. W. Reynolds, Flocks, herds and schools: a distributed behavioral model. Computer Graphics, 21(4):25-34, 1987.
