The Demons Algorithm


This article was downloaded by: [Universitaetsbibliothek Heidelberg] on 19 August 2013, at 04:05. Publisher: Taylor & Francis. Informa Ltd, registered in England and Wales, Registered Number: 1072954. Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK.

International Journal of Computer Mathematics

Publication details, including instructions for authors and subscription information:
http://www.tandfonline.com/loi/gcom20

The demon algorithm

Theo Zimmermann (a) & Peter Salamon (a)

(a) Department of Mathematical Sciences, San Diego State University, San Diego, CA 92182, U.S.A.

Published online: 20 Mar 2007.

To cite this article: Theo Zimmermann & Peter Salamon (1992) The demon algorithm, International Journal of Computer Mathematics, 42:1-2, 21-31, DOI: 10.1080/00207169208804047

To link to this article: http://dx.doi.org/10.1080/00207169208804047


22  T. ZIMMERMANN AND P. SALAMON

For fixed T, Eq. (1) describes a stationary Markov process with an invariant distribution of states given by

    p(ω) = exp(−E(ω)/T) / Z(T),   Z(T) = Σ_ω exp(−E(ω)/T).   (2)

This is known as the Boltzmann distribution at temperature T and gives the probability of finding the equilibrated system in state ω. The associated distribution of energies is usually addressed by the same name; it is given by

    p(E) = g(E) exp(−E/T) / Z(T).   (3)

The number of states g(E) at an energy E is also called the density of states.

SA proceeds by making moves according to Eq. (1) and slowly lowering the temperature T. If the system is allowed to equilibrate at each temperature, it evolves through a sequence of Boltzmann distributions and will eventually freeze at T = 0. It can be shown that if the cooling is done slowly enough [5] the system will end up in a Boltzmann distribution at T = 0, i.e. in the lowest energy state(s). However, for practical purposes this cooling will take too long and one has to define feasible temperature schedules which have a high probability of ending up with a good, though not necessarily best, solution.

Once a suitable annealing schedule has been defined, an important further question becomes how to allocate given computing resources to arrive at the best possible results. In the simplest implementation all available time is spent on a single annealing run. It is typical for such a run that the majority of the time is spent at low temperature, where it takes very long to improve on the best energy that has been encountered so far. It turns out that for large computing resources it is better to split the effort into several independent runs [6]. This improves the chances of finding low energy states by better sampling of the solution space. A further important advantage of such an approach is that these runs can be done in parallel.

This finding also motivates a change in perspective. Instead of one random walker we now look at an ensemble of random walkers moving independently through the state space Ω. Assuming that they share a common temperature, this gives easy access to useful statistical information in the form of ensemble averages. As will be discussed below, the first two moments of the energy provide the means for a useful adaptive temperature schedule.
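As a point of reference for the generalization introduced below, a bare-bones Metropolis annealing sweep can be sketched in a few lines of Python (a hedged illustration; the function and problem names are ours, not part of the paper):

```python
import math
import random

def metropolis_sweep(state, energy, neighbor, T, steps, rng=random):
    """Propose `steps` moves at fixed temperature T, accepting each move
    with the Metropolis probability min(1, exp(-dE/T)), i.e. Eq. (1)."""
    E = energy(state)
    for _ in range(steps):
        cand = neighbor(state)
        dE = energy(cand) - E
        if dE <= 0 or rng.random() < math.exp(-dE / T):
            state, E = cand, E + dE
    return state, E
```

Annealing then amounts to calling this sweep along a decreasing temperature schedule, e.g. for a toy problem such as minimizing E(x) = x² over the integers with the move class x → x ± 1.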


II THE DEMON ALGORITHM

The main subject of this paper is to further exploit the ensemble approach by introducing a generalization of SA that the approach makes possible. The generalization was motivated by an analysis of stochastic search algorithms in terms of information theory [7]. The basic question asked there is what is the minimum number of evaluations of the objective function needed to proceed from one state of the ensemble to the next one. The state of the ensemble is thereby given by a probability distribution that defines the likelihood of finding a certain state in the ensemble. It is assumed that the basic step of a stochastic search algorithm consists of replacing the current ensemble members by sampling from neighboring states with the aim of achieving a certain target distribution of the energies in the ensemble. Constraints are that the mean energy of the ensemble decrease and that the number of function evaluations to produce the next generation of states be minimal. These requirements actually suggest that the ensemble should proceed via a sequence of Boltzmann distributions with decreasing temperatures [7].

This general setting does not require the Metropolis algorithm in order to produce the desired Boltzmann distribution of energies and suggests more direct ways to achieve the same result. We relax the condition that a member of the ensemble can only be replaced by one of its own neighbors. We treat the ensemble as a population whose members can reproduce and die, so each member can be replaced by any other member's neighbor. To this end the algorithm creates a pool of candidate states. The states for the next generation are selected from this pool, which is formed by the current states and (some of) their neighbors. This defines what we call a collective move class, since the ensemble evolves as a whole. This method shares several features with genetic algorithms and similar optimization algorithms which mimic biological evolution [8].

The algorithm that we are proposing is as follows. Assume there is an ensemble with N members. A current state of the ensemble at step k is then given by a list of N states, L_k = (ω_1, …, ω_N). The aim is to generate a target list L_{k+1} where the energies of the members follow a Boltzmann distribution at some target temperature T_{k+1}. Using the Metropolis algorithm, such a distribution is created as the limiting distribution of the simple Markov process, Eq. (1). In our new method, the corresponding expected number of states at each energy is determined first and candidate states are accepted and rejected accordingly. The calculation of these numbers, we call them target frequencies, is actually a nontrivial task and can be done using a run-time estimation of the local density of states, as discussed below.

In our implementation, the target list is filled in such a way that each state in L_k contributes one neighbor to an auxiliary list L′. This can be easily generalized such that each state in L_k contributes several neighbors. The neighbor list L′ is then scanned (in a random way) and each state with an energy still needed to arrive at the target frequencies is moved to the target list L_{k+1}. Thus states at each energy are gathered until a sufficient number of them have been accepted to fill up the quotas, after which states at that energy are rejected. If L′ is exhausted but L_{k+1} is not complete, the original list L_k is scanned in the same way. If L_{k+1} is still not complete it is filled


with randomly chosen states from L_k. It is then decided using a χ²-test whether L_{k+1} is close enough to the target distribution at some prespecified confidence level. If not, we generate a new neighbor list L′ and try again to reach L_{k+1}. If yes, we set a new target temperature and new target frequencies for step k+1 and the whole process is repeated.

This way of achieving a certain target distribution is reminiscent of the way Maxwell's Demon achieves one: the demon controls a door in a wall separating two gas containers. Initially both gases are at the same temperature. The demon then creates a temperature difference by only allowing fast molecules to go from the left to the right container and slow ones only from the right to the left. Because of this analogy we are calling our method the Demon Algorithm (DA).

Similar to traditional SA, this algorithm proceeds through a sequence of Boltzmann distributions with decreasing temperature. The main difference between the two methods is that in SA the target distribution is achieved through a stochastic relaxation process where a number of random walkers move independently through the state space according to the Metropolis transition probabilities, Eq. (1). In the presence of energy barriers this relaxation can be very slow. The DA, on the other hand, is characterized by a population that evolves under a selection pressure defined by the target distribution, where walkers die and/or reproduce, removing the need for barrier climbing. The only effective barrier climbing that can take place involves transitions within a few energy standard deviations of the current mean energy. It also becomes possible for a state together with one or several of its neighbors to become part of the next list, whereas other states die out in the sense that none of their neighbors becomes part of successive lists. The DA is thus able to achieve a certain target distribution of energies much faster than SA. The question, however, is whether the quality of the solutions will be comparable. This problem will be addressed in Section V.
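The pool-and-quota selection described above might be sketched as follows (an illustrative reading, not the authors' code; `target_freq` maps each energy to its quota n_i):

```python
import random
from collections import Counter

def demon_select(current, energy, neighbor, target_freq, rng=random):
    """Build the next list L_{k+1}: each current state contributes one
    neighbor to an auxiliary pool L'; the pool and then the current list
    are scanned in random order, and a state is accepted only while the
    quota for its energy (target_freq) is not yet exhausted."""
    N = len(current)
    quota = Counter(target_freq)           # energy -> remaining quota
    pool = [neighbor(s) for s in current]  # auxiliary list L'
    nxt = []
    for source in (pool, list(current)):   # scan L', then a copy of L_k
        rng.shuffle(source)
        for s in source:
            if len(nxt) == N:
                return nxt
            if quota[energy(s)] > 0:
                quota[energy(s)] -= 1
                nxt.append(s)
    while len(nxt) < N:                    # pad with random current states
        nxt.append(rng.choice(current))
    return nxt
```

A χ²-test of the resulting list against the target frequencies then decides, as in the text, whether the quotas were met closely enough.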

III ADAPTIVE TEMPERATURE CONTROL AND ESTIMATION OF TARGET FREQUENCIES

To completely specify our optimization algorithms we have to define how the temperature is controlled during an annealing run. We are using a temperature schedule that has been proposed by Nulton and Salamon [9,10] and can be motivated by the idea that two successive Boltzmann distributions should not be too far apart. This can be made more precise and leads to the requirement that the mean energy should always be lowered by a certain fraction of the current energy standard deviation σ_k. For the target mean energy ⟨E⟩_{k+1} at temperature T_{k+1} we then have

    ⟨E⟩_{k+1} = ⟨E⟩_k − v σ_k.   (4)

The constant v is called the thermodynamic speed and Eq. (4) defines a constant thermodynamic speed schedule. The new target temperature T_{k+1} follows implicitly

from Eq. (4) via

    ⟨E⟩(T_{k+1}) = Σ_i E_i g_i exp(−E_i/T_{k+1}) / Σ_i g_i exp(−E_i/T_{k+1}),   (5)

where g_i is the density of states, i.e. the number of states with energy E_i.

The basic problem in using this schedule is that the density of states g_i is usually not known a priori. However, it can be estimated locally during the execution of the algorithm [11]. The idea is to keep a matrix Q of counters q_{ji} which are incremented by 1 each time a state with energy E_i generates a neighbor with energy E_j. Normalizing Q to unit column sums we arrive at estimates

    q̂_{ji} = q_{ji} / Σ_l q_{li}   (6)

for the infinite temperature Metropolis transition probability of going from energy E_i to E_j. The equilibrium distribution of the so defined Markov process is simply the density of states g_i and can be obtained as the eigenvector with eigenvalue 1 of the stochastic matrix Q̂. From the density of states the target frequencies for the DA also follow directly, i.e. the expected number n_i of states at energy E_i in the target list is:

    n_i = N g_i exp(−E_i/T_k) / Z(T_k),   (7)

where N = Σ_i n_i is the size of the ensemble.

At a certain stage of the algorithm only matrix elements q_{ji} connecting energies in the vicinity of the current mean energy will be encountered. For energies below the lowest energy seen so far the corresponding counters will be zero. However, the Boltzmann distribution is typically concentrated around the current mean energy, so it is only in that region that the density of states is actually needed. For this reason we only consider a submatrix R taken from Q that is centered about the current mean energy and covers a few energy standard deviations. If R covers an energy interval [E_m, E_n], the matrix elements q_{li} not included in R, with l < m or l > n, have to be added to the diagonal element q_{ii} in the same column in order to obtain an unbiased estimate of g.

The matrix R is then normalized to unit column sums. The eigenvector with eigenvalue 1 of the resulting stochastic matrix R̂ is the current estimate of g and is used to set up the target frequencies for the DA and to calculate the next target temperature via Eqs. (4) and (5). In our implementation, a new estimate for g is calculated from the updated counter matrix Q whenever a new target temperature has to be set or after a certain number of new lists has been generated.
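The run-time estimation of g and the target frequencies of Eq. (7) can be illustrated numerically; the following is our sketch of the procedure (not the original implementation), using NumPy's eigendecomposition:

```python
import numpy as np

def density_of_states(Q):
    """Estimate the density of states g from the counter matrix Q, where
    Q[j, i] counts moves from energy E_i to a neighbor at energy E_j.
    Normalizing columns gives an infinite-temperature transition matrix;
    g is its stationary eigenvector (eigenvalue 1), normalized to sum 1."""
    P = Q / Q.sum(axis=0, keepdims=True)            # unit column sums
    w, V = np.linalg.eig(P)
    g = np.real(V[:, np.argmin(np.abs(w - 1.0))])   # eigenvector for eigenvalue 1
    g = np.abs(g)
    return g / g.sum()

def target_frequencies(g, E, T, N):
    """Expected number n_i of ensemble states at energy E_i, Eq. (7)."""
    boltz = g * np.exp(-np.asarray(E) / T)
    return N * boltz / boltz.sum()
```

In practice one would apply this to the submatrix R around the current mean energy, as described in the text, rather than to the full counter matrix.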


IV APPLICATION TO GRAPH BIPARTITIONING

A first test of the DA was performed on the caricature of a state space which consisted of a weighted graph defining transition probabilities between its vertices [12]. The main feature was a single energy barrier which separated an unfavorable local minimum with a large weight from the lesser weighted global minimum. Numerical results showed the DA to be an order of magnitude faster than SA. This is of course a very simplified situation; as a more realistic testing ground we implemented the DA for the problem of graph bipartitioning.

In this problem the task is to divide the vertices of a given graph into two sets of equal size such that the number of edges crossing between these sets is minimal. This problem is NP-complete [13] and hence among the hardest combinatorial optimization problems. Graph bipartitioning is a well studied problem with interesting properties. For example, the random graph bipartitioning problem can be mapped onto a spin glass system and it is known that such systems have a complex state space structure with many local energy minima [14]. A random graph is defined by the number n of vertices and a connection probability p, i.e. each of the n(n−1)/2 possible edges is present with probability p. SA has been found to give good results for bipartitioning such graphs [15] and we shall concentrate on this type of partitioning problem.

We implemented the DA as well as the (ensemble-)SA algorithm with a constant thermodynamic speed schedule as described in Section III. For a complete comparison we also implemented a random downhill search which simply generates a sequence of neighboring states whose energy never increases. This is actually equivalent to SA with a target temperature T = 0 and is called "quenching". The random graphs we used had 500 vertices and a connection probability of 0.004. This leads to a mean vertex degree (i.e. number of edges originating from a vertex) of 2. A state of the system to be annealed is now simply given by a partition, i.e. by specifying for each vertex to which one of the two sets it belongs. The move class is defined such that a neighboring partition is generated by randomly selecting one vertex from each set and exchanging them.

To evaluate the performance of the different algorithms we monitor the value of the very best so far energy (vbsfe), i.e., the best energy found so far during an optimization run. Figure 1 shows these values from representative runs of SA, DA and quenching for partitioning the same random graph. The Metropolis based SA used an ensemble size N = 100 and a thermodynamic speed v = 0.1. The DA was run with an ensemble size N = 1000 and a speed v = 0.01. The quenching was performed by doing 1000 independent runs with different initial conditions in parallel. The figure also shows the results for a variant of the Demon algorithm that will be introduced in Section V.

The vbsfe values are plotted with respect to the number of energy evaluations. This does not reflect the actual computing time spent on the problem since quenching does not involve any significant overhead, whereas the adaptive temperature control for SA and DA can take up a significant fraction of the available computing resources. For graph partitioning, the calculation of the energy difference for a move is fairly cheap, but in general the evaluation of the energy function can be a costly step and


Figure 1  Very best energy seen so far with respect to the number of energy evaluations performed. The inset also gives the final vbsfe in parentheses. The data are only displayed up to the last improvement of vbsfe; the calculations have been continued substantially longer.

usually justifies large overhead provided this can reduce the number of energy evaluations.

Figure 1 shows three different regimes in the behavior of the various optimization algorithms. For small computing resources quenching gives the best results. This is simply due to the fact that quenching does not make any uphill moves. For intermediate resources the DA becomes superior. However, the algorithm terminates with a final best energy value of vbsfe = 29. For very large resources, therefore, the SA algorithm performs best with a final vbsfe = 19. The quenching algorithm ended with vbsfe = 26. All runs shown in Figure 1 were only stopped after there was no improvement of the vbsfe for a time span longer than a fixed multiple of the time of the last change of vbsfe.

A closer analysis of the behavior of the DA showed that in the course of the run eventually only states within a single local minimum in state space survive. To make this point more clear, note that the state space Ω together with its topology and objective function E: Ω → R defines an energy hypersurface. For a continuous problem Ω = R^n, while for our problem Ω is discrete. The energy surface can then be viewed as a hierarchy of basins that are formed by energy barriers of various heights [16]. A random walker moving on that surface at a certain mean energy is only restricted by energy barriers that are higher than that energy. For decreasing energy the state space is divided into more and more different basins or components.

In SA the Metropolis algorithm guarantees in the ideal case that the distribution of random walkers is uniform on sets of states with the same energy. The state space


is thus uniformly probed. For the DA, however, sampling fluctuation effects may prematurely remove all states from a basin that would actually lead to a good energy minimum. This effect will be addressed more closely below. There is also a second, more drastic effect. Namely, in our runs the DA typically ended such that a single state in a local minimum reproduced and the state list was rapidly filled with copies or neighbors of that state. This happened when at some low temperature the generation of a lower energy neighbor became a very unlikely event, but the probability for a neighbor of the same energy was considerably higher. For the sparse random graphs treated here, neighboring states with the same energy are actually quite likely since there is a considerable fraction of isolated vertices (i.e. vertices of degree zero).

SA does not have this kind of problem since there are only individual moves and random walkers cannot die out or multiply. The Metropolis algorithm in SA generates a Boltzmann distribution of states as opposed to a Boltzmann distribution of energies. In particular, the Boltzmann distribution over states is uniform over states with the same energy. This guarantees an unbiased probing of the state space. The DA, on the other hand, only enforces a Boltzmann distribution of energies and in this way the distribution of states can be biased due to fluctuation and selection effects.

In a practical application where the Boltzmann distribution is only approximately realized, one can argue that SA works well because large basins attract many walkers and it is plausible that deep basins are associated with large rims. This conjecture is true for R^n and a smooth energy function [7]. It is also true for our graph partitioning problem. It has not been proven in a general context, however, but we believe it to be a basis for the success of SA. The same argument should also make it plausible that the DA should work well; however, one has to take into account the counterproductive effect of statistical sampling fluctuations.
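For concreteness, the graph bipartitioning ingredients used in the experiments of this section (the cut-size energy, the vertex-swap move class, and the random graph model) can be sketched as follows; the function names are illustrative, not the original implementation:

```python
import random

def cut_size(edges, side):
    """Energy of a partition: number of edges crossing between the two
    sets; side[v] is 0 or 1 for each vertex v."""
    return sum(1 for u, v in edges if side[u] != side[v])

def swap_move(side, rng=random):
    """Neighboring partition: exchange one randomly chosen vertex from
    each set, so the two sets stay equal in size."""
    zeros = [v for v, s in side.items() if s == 0]
    ones = [v for v, s in side.items() if s == 1]
    a, b = rng.choice(zeros), rng.choice(ones)
    new = dict(side)
    new[a], new[b] = 1, 0
    return new

def random_graph(n, p, rng=random):
    """Random graph: each of the n(n-1)/2 possible edges present with
    probability p."""
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < p]
```

With n = 500 and p = 0.004 this reproduces the mean vertex degree of 2 quoted above in expectation.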

To understand this effect we investigated a simple "neutral" model of the DA which neglects selective forces. Assume an ensemble of N walkers and a state space with disconnected basins. At some stage of the algorithm, basin i contains m_i walkers. In a Demon step each walker generates k − 1 neighbors (each in the same basin) and we simply assume that the new list is generated by randomly selecting N walkers from the available set of kN walkers, i.e. current and neighboring states. The probability of ending up with m walkers in basin i is then given by

    P(m_i′ = m) = C(k m_i, m) C(k(N − m_i), N − m) / C(kN, N),   (8)

where C(a, b) denotes the binomial coefficient and Σ_i m_i = Σ_i m_i′ = N. Eq. (8) is a hypergeometric distribution and defines a Markov process. It is clear that as soon as a basin is empty (m_i = 0), it will remain empty and eventually all states will be within one basin. It is interesting to note that Eq. (8) is also known from population genetics and describes the effect of "neutral" evolution [17], i.e. evolutionary processes that are solely due to statistical fluctuations in finite populations in the absence of selection forces. For the simple case of two basins one can calculate approximately the mean time

predicted from Eq. (8) until a basin is emptied [18]. This time τ(x) depends on the initial relative population x = m_1/N and is given by

    τ(x) ≈ −(N/(k − 1)) (x ln x + (1 − x) ln(1 − x)).   (9)

It follows that τ grows approximately proportional to the ensemble size N. This means that it may require rather large ensembles to suppress these sampling fluctuations to a sufficient degree such that selective forces alone are dominating. Note that for k ≥ 2, Eq. (9) is maximal for k = 2. This is the value we used in our implementation of the Demon algorithm.
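Our reading of this neutral model is easy to check numerically: the one-step law is an ordinary hypergeometric distribution (draw N survivors from a pool of kN walkers, k·m_i of which sit in basin i), and the emptying-time estimate scales with N and with 1/(k − 1). A small sketch, with names of our own choosing:

```python
from math import comb, log

def neutral_step_pmf(m_i, N, k):
    """Probability of each possible count m of basin-i walkers after one
    Demon step, per the hypergeometric law of Eq. (8): N survivors drawn
    without replacement from k*N pooled walkers, k*m_i of them in basin i."""
    total = comb(k * N, N)
    return [comb(k * m_i, m) * comb(k * (N - m_i), N - m) / total
            for m in range(N + 1)]

def mean_emptying_time(x, N, k):
    """Approximate mean number of Demon steps until one of two basins is
    emptied, per Eq. (9), starting from relative population x = m_1/N."""
    return -(N / (k - 1)) * (x * log(x) + (1 - x) * log(1 - x))
```

The pmf is a proper probability distribution with mean m_i (the basin count is a martingale in the absence of selection), which is exactly the neutral-evolution picture the text invokes.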

V THE DELTA DEMON ALGORITHM

The DA in its present implementation involves a significant overhead for the temperature control. This fact, and also the above mentioned problem of a single state quickly filling up the target list at low temperature, motivated us to investigate a rather simplified version of the DA. This version simply requires the target list to be filled with states at a single energy. In the language of statistical mechanics this means that the ensemble follows a microcanonical distribution. Since this distribution is basically a δ-function centered on the target energy we call this version of the DA the Delta Demon (DD). In our graph partitioning example there are only integer energies and so we defined the target energy to be equal to the current energy minus one.

The DD algorithm only accepts states with energy lower than the current one and is therefore closer to the quenching algorithm. The main difference to quenching is that the DD algorithm is able to reallocate its random walkers to more favorable parts of state space. In this way the algorithm avoids early trapping in unfavorable local minima and eventually outperforms quenching.

The algorithm proceeds in such a way that each state of the current list L_k produces a neighboring state. If the energy of this state is equal to the target energy, the state is moved to the target list L_{k+1}. If its energy is less, then that state is kept in a reservoir from where it will be moved in a later stage of the algorithm into the target list for that energy. In our implementation, the current list is scanned in a random order and the scan is repeated until the target list is filled. In this way it takes at least N sweeps through the current list before the target list could be filled with equal energy neighbors of a single low energy state. For the DA this may happen much faster, in log_2 N steps, since both original and neighbor state can be moved to the target list.
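One Delta Demon step, as just described, might be sketched like this (an illustrative reading, not the authors' code; `target_E` is the current energy minus one, and the reservoir collects states that undershoot it):

```python
import random

def delta_demon_step(current, energy, neighbor, target_E, max_evals,
                     rng=random):
    """Scan the current list in random order, repeatedly, moving generated
    neighbors whose energy equals target_E into the next list until it is
    full or the evaluation budget is spent.  Neighbors with even lower
    energy go to a reservoir for later target energies."""
    N = len(current)
    nxt, reservoir, evals = [], [], 0
    while len(nxt) < N and evals < max_evals:
        order = list(current)
        rng.shuffle(order)
        for s in order:
            if len(nxt) == N or evals == max_evals:
                break
            cand = neighbor(s)
            evals += 1
            E = energy(cand)
            if E == target_E:
                nxt.append(cand)
            elif E < target_E:
                reservoir.append(cand)
    return nxt, reservoir   # nxt may be short if the budget was exhausted
```

Returning a possibly short `nxt` mirrors the shrinking ensemble size discussed in the text when the evaluation limit is hit.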
Since at low energy it can take very long to fill the target list, we set an upper limit on the number of energy evaluations per target list. The algorithm then only continues with the target states found so far. This actually decreases the ensemble size N.

Figure 1 shows the vbsfe values for the DD algorithm for a representative run. The initial ensemble size was N = 1000 and the maximum number of energy evaluations was set to 100000. As expected, the DD algorithm is initially faster than SA and the DA because there are only downhill moves. The algorithm is slower than


quenching because the latter does not require a target list to be filled. Eventually, however, quenching becomes slower due to trapping in local minima. In that regime the DD algorithm performs similarly to the DA, with the tendency of finding slightly better final best energies; in the example of Figure 1 the final value was vbsfe = 24.

It was interesting to observe that there is a certain critical energy E* ≈ 26 below which it rather suddenly becomes very difficult to find lower energies. This threshold is known from the spin glass analogue as the glass transition, indicating the sudden transition to a regime where typical time scales (in particular relaxation times) become very large. Our results indicate that only SA was really able to find energies significantly below E*.

VI SUMMARY AND CONCLUSIONS

Conventional simulated annealing in the ensemble approach is based on individual moves of the ensemble members according to the Metropolis algorithm. In the present paper we generalize this approach by replacing the Metropolis algorithm by a collective move class. The new members of the ensemble are selected from a pool of states that is formed by the current states and some of their neighbors. The selection is done with the aim of achieving a certain target distribution of the energies in the next state of the ensemble. In this way ensemble members can be replaced by more favorable neighbors of other states. In analogy to the Metropolis algorithm we use the Boltzmann distribution in order to set up the required number of states at each energy in the target. Such a target distribution method reminds one of the action of Maxwell's Demon, hence the name Demon Algorithm.

We implemented the Demon Algorithm together with conventional simulated annealing and random downhill search (quenching) for the problem of graph bipartitioning. The results showed that each one of these algorithms is optimal within a certain range of available computing resources. As a measure we took the very best energy seen so far with respect to the number of energy function evaluations performed. For a small number of available energy function evaluations quenching gives the best results, simply because this algorithm does not accept any uphill moves. For intermediate computing resources quenching slows down due to trapping in local minima and the Demon Algorithm becomes superior. In the very long run conventional simulated annealing performs best. The reason for that is that sampling fluctuations in the Demon Algorithm lead to a clustering of all ensemble states in a small region in state space. The algorithm in this way freezes in a single local minimum. We modelled these fluctuations with a simple model which shows that the clustering takes place on a timescale proportional to the ensemble size, emphasizing the need for large ensembles. Metropolis based simulated annealing does not have this kind of problem; the individual moves of the ensemble members guarantee a more uniform sampling of the state space.

Simple techniques can be implemented to reduce the extent of clustering. As an example, we mention the possibility of restricting the fertility of ensemble members, e.g., reducing the likelihood of considering neighbors of states whose neighbors have already been added to the list.


Our results show that the choice of the best algorithm depends on the available computing resources. Where exactly the distinction is to be made will of course depend on the specific optimization problem. The Demon Algorithm allows for a number of possible variations, e.g. the rather simplified Delta Demon discussed in Section V. The implementation of the Demon Algorithm as described in Section IV leaves open the optimal choice of characteristic parameters such as ensemble size, thermodynamic speed and number of neighbors generated per ensemble member.

References

[1] S. Kirkpatrick, C. D. Gelatt Jr. and M. P. Vecchi, Optimization by simulated annealing, Science 220 (1983), 671-680.
[2] V. Černý, Thermodynamical approach to the traveling salesman problem: An efficient simulation algorithm, JOTA 45 (1985), 41.
[3] N. Metropolis, A. Rosenbluth, M. Rosenbluth, A. Teller and E. Teller, Equation of state calculations by fast computing machines, J. Chem. Phys. 21 (1953), 1087-1092.
[4] S. R. White, Concepts of scale in simulated annealing, ICCD 84 (1984), 646.
[5] S. Geman and D. Geman, Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images, IEEE Trans. Pattern Analysis and Machine Intelligence PAMI-6 (1984), 721-741.
[6] G. Ruppeiner, J. M. Pedersen and P. Salamon, Ensemble approach to simulated annealing, J. Phys. I France 1 (1991), 455-470.
[7] P. Salamon, K. H. Hoffmann, J. Harland and J. D. Nulton, An information theoretic bound on the performance of simulated annealing algorithms, SDSU IRC report 88-1, 1988.
[8] J. J. Grefenstette and J. E. Baker, How genetic algorithms work: A critical look at implicit parallelism, ICGA (1989).
[9] J. D. Nulton and P. Salamon, Statistical mechanics of combinatorial optimization, Phys. Rev. A 37 (1988), 1351-1356.
[10] P. Salamon, J. Nulton, J. Robinson, J. Pedersen, G. Ruppeiner and L. Liao, Simulated annealing with constant thermodynamic speed, Comput. Phys. Commun. 49 (1988), 423-428.
[11] B. Andresen, K. H. Hoffmann, K. Mosegaard, J. Nulton, J. M. Pedersen and P. Salamon, On lumped models for thermodynamic properties of simulated annealing problems, J. Phys. France 49 (1988), 1485-1492.
[12] O. Mercado Kalas, The demon algorithm, MS thesis, San Diego State University (1989).
[13] M. R. Garey and D. S. Johnson, Computers and Intractability, Freeman, San Francisco (1979).
[14] Y. Fu and P. W. Anderson, Application of statistical mechanics to NP-complete problems in combinatorial optimization, J. Phys. A 19 (1986), 1605.
[15] D. S. Johnson, C. R. Aragon, L. A. McGeoch and C. Schevon, Optimization by simulated annealing: An experimental evaluation (Part I), Oper. Res. (1989).
[16] B. Hajek, Cooling schedules for optimal annealing, Math. Oper. Res. 13 (1988), 311-329.
[17] J. F. Crow and M. Kimura, An Introduction to Population Genetics, Harper & Row, New York (1970).
[18] M. Kimura and T. Ohta, The average number of generations until fixation of a mutant gene in a finite population, Genetics 61 (1969), 763-771.