
CS321 HS 2009
Autonomic Computer Systems
Evolutionary Computation II
November 19, 2009

Lidia Yamamoto
University of Basel
http://cn.cs.unibas.ch


Overview

Evolutionary Computation, Part II
• Representations
• Performance considerations
• Dynamic environments
Summary


Performance Issues: Optimizing the Optimization

What makes evolutionary computation difficult?

difficult = computationally expensive (in terms of time, memory)

from [Weise2009]:
• Premature convergence
• Ruggedness
• Lack of causality
• Deceptiveness
• Neutrality
• Epistasis
• Noise
• Overfitting
• Oversimplification
• Multi-objectivity
• Dynamic environments
• No Free Lunch


Metrics

• Diversity
• Causality
• Neutrality
• Evolvability


Exploitation vs. Exploration

Crucial to heuristic optimization: striking a good balance between exploration and exploitation (a small sketch follows below)
• Exploration: creation of novel solutions that probe as yet unknown regions of the search space
• Exploitation: making the best use of existing good solutions, and building upon them to construct new ones
Too much exploration:
• lose focus and wander randomly through the search space: can't improve
Too much exploitation:
• stick to a small area near the current (perhaps poor) solutions, without looking around: can't improve either
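To make the trade-off concrete, here is a minimal sketch (an added illustration, not from the original slides): a (1+1) evolution strategy on an assumed toy sphere function, where the mutation strength sigma is the knob between exploration (large sigma) and exploitation (small sigma).

```python
# Minimal sketch: (1+1) evolution strategy on a toy 2-D sphere function
# (the function and sigma values are assumptions for illustration).
import random

def fitness(x):
    return -(x[0] ** 2 + x[1] ** 2)   # maximize; the optimum is at (0, 0)

def one_plus_one_es(sigma, steps=2000):
    parent = [random.uniform(-5, 5), random.uniform(-5, 5)]
    for _ in range(steps):
        child = [xi + random.gauss(0, sigma) for xi in parent]
        if fitness(child) >= fitness(parent):   # keep the better of the two
            parent = child
    return fitness(parent)

print(one_plus_one_es(sigma=2.0))    # explores widely, refines slowly
print(one_plus_one_es(sigma=0.05))   # exploits locally; refines well here, but
                                     # would stall on a multimodal landscape
```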


Premature Convergence

Convergence: an optimization algorithm converges when it no longer produces new solutions, or keeps producing only a very small subset of solutions

Premature convergence: the algorithm converges to a local optimum and cannot improve from there (it is unable to explore other regions of the search space)

Typical in multimodal fitness landscapes:
• multimodal function: has several maxima or minima
• multimodal fitness landscape: has several (local or global) optima


Premature Convergence

Example:

[Figure: two 3-D fitness landscapes z(x, y) from an optimization run: the initial population, spread over the landscape, and the prematurely converged population.]


Premature Convergence

Typically caused by loss of diversity

Diversity: a measure of the amount of variety, i.e. the number of different solutions in the population and how different they are (distance between alternative solutions); a minimal measure is sketched below

Loss of diversity: after the population converges, it becomes very uniform (all solutions resemble the best one). Causes:
• too strong selective pressure towards the best solution
• too much exploitation of existing building blocks from the current population (e.g. by recombining them, or mutating them only slightly)
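A minimal diversity measure, sketched here for assumed bitstring genotypes: the average pairwise Hamming distance in the population.

```python
# Sketch: population diversity as the average pairwise Hamming distance
# (bitstring genotypes are an assumption for illustration).
from itertools import combinations

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def diversity(population):
    pairs = list(combinations(population, 2))
    if not pairs:
        return 0.0
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)

print(diversity(["1100", "1100", "1101", "1100"]))  # 0.5: nearly uniform
print(diversity(["0000", "1111", "1010", "0110"]))  # ~2.33: varied population
```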


Premature Convergence

Fighting premature convergence:
• Restart from scratch (as a last resort)
• Maintain diversity (but this may slow down the optimization)
  – Decrease the selection pressure
  – Random immigrants: insert new random individuals periodically
  – Penalize similarity [Miller1996] (see the sharing sketch after this list):
    • Crowding: similar individuals are more likely to die, to make room for new ones
    • Sharing: similar individuals "share" fitness (fitness is reduced in proportion to the number of similar individuals)
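A sketch of fitness sharing in the spirit of [Miller1996] (the distance measure, the sharing radius sigma_share, and the linear sharing function are assumptions): each individual's fitness is divided by a niche count that grows with the number of similar individuals.

```python
# Sketch of fitness sharing: fitness is divided by a niche count, so crowded
# regions of the search space become less attractive to selection.

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def shared_fitness(population, raw_fitness, sigma_share=2.0):
    shared = []
    for i, ind in enumerate(population):
        niche_count = 0.0
        for other in population:
            d = hamming(ind, other)
            if d < sigma_share:                      # 'other' is in this niche
                niche_count += 1.0 - d / sigma_share
        shared.append(raw_fitness[i] / niche_count)  # niche_count >= 1 (self)
    return shared

pop = ["1100", "1100", "0011"]
print(shared_fitness(pop, [10.0, 10.0, 10.0]))  # duplicates drop to 5.0,
                                                # the lone individual keeps 10.0
```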


Ruggedness

[Figure: two fitness landscapes z(x, y): a multimodal fitness landscape and a rugged fitness landscape.]

Rugged fitness landscape: multimodal, with steep ascents and descents; the optimization algorithm has trouble finding reliable gradient information to follow


Ruggedness

A typical cause of ruggedness: weak causality

Strong causality: small changes in the genotype lead to small changes in fitness (ideal)

Weak causality: a small change in the genotype may lead to a large or unpredictable change in fitness
• a small mutation may convert a very good solution into a very bad one, and vice versa
• optimization becomes erratic; it may still work, but very slowly

Mitigating the effects of ruggedness:
• Large populations, high diversity
• Change the genotype representation for a smoother genotype-phenotype-fitness map (see the encoding sketch below)
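One classic illustration of the representation point (an added example, not from the slides): plain binary encodings of integers have "Hamming cliffs" where adjacent phenotypes are far apart in genotype space, weakening causality; a reflected Gray code removes them.

```python
# Sketch: genotype representation and causality, via the classic Hamming-cliff
# example (binary vs. reflected Gray encoding of integers). Under strong
# causality, neighboring phenotypes should also be neighboring genotypes.

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def binary_encode(n, length=8):
    return format(n, f"0{length}b")

def gray_encode(n, length=8):
    return format(n ^ (n >> 1), f"0{length}b")   # standard reflected Gray code

# Adjacent phenotypes 127 and 128: how far apart are their genotypes?
print(hamming(binary_encode(127), binary_encode(128)))  # 8: every bit flips
print(hamming(gray_encode(127), gray_encode(128)))      # 1: smooth neighborhood

# Gray code guarantees genotype distance 1 for all adjacent integers:
assert all(hamming(gray_encode(n), gray_encode(n + 1)) == 1 for n in range(255))
```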


Deceptiveness

The gradient leads the optimizer away from the optimum
Consequence: the optimizer may perform worse than a random walk
No effective countermeasures
Palliative solutions: large populations, high diversity, increasing causality by grouping related genes

[Figure: a deceptive function f(x): the gradient points away from the global optimum.]


Neutrality

A neutral change (e.g. a neutral mutation) is a transformation of the genotype that produces no change in fitness

Degree of neutrality (a minimal sketch follows below):
• of a genotype: the fraction of neutral results among all possible (1-step) changes that can be applied to it
• of a region of the search space: the average neutrality of the genotypes within this region
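A minimal sketch of the first measure, assuming bitstring genotypes whose 1-step changes are single-bit flips (the toy fitness function is an assumption):

```python
# Sketch: degree of neutrality of a bitstring genotype = the fraction of
# single-bit flips (the 1-step changes) that leave the fitness unchanged.

def degree_of_neutrality(genotype, fitness):
    f0 = fitness(genotype)
    flips = [genotype[:i] + ("1" if genotype[i] == "0" else "0") + genotype[i + 1:]
             for i in range(len(genotype))]
    return sum(fitness(m) == f0 for m in flips) / len(genotype)

# Toy fitness (an assumption): only the first two bits matter; the rest are neutral.
def fitness(bits):
    return int(bits[:2], 2)

print(degree_of_neutrality("10010110", fitness))  # 6 of 8 flips are neutral: 0.75
```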


Neutrality

Example:

[Figure: a function f(x) with flat neutral regions: neutral changes (e.g. mutations) move between neutral genotypes within a region without changing fitness.]


Neutrality and Evolvability

Evolvability:
• in biology: the ability to generate heritable and selectable phenotypic variation
• in optimization: the ability to produce new, fitter solutions

Neutrality has positive and negative influences on evolvability:
• positive: it may help to avoid "death valleys" of poor solutions: neutral changes accumulate until enough of them result in a beneficial outcome
  – punctuated equilibria in biology: long periods of stasis, followed by short periods of rapid phenotypic evolution
• negative: it may slow down convergence: within a neutral region, the algorithm has no hint about how to make progress


Neutrality Bridges

[Figure from [Weise2009]: three scenarios: premature convergence, a small neutral bridge, and a wide neutral bridge.]


Overfitting

Overfitting: the emergence of an overly complicated solution that tries to fit as much of the training data as possible

Typical cause: noise in the measured data used as the training set

Example, in symbolic regression:

[Figure: three plots of f(x): the original function, the measured data (with noise), and the overfitted result.]


Overfitting

Consequence: loss of generality: the generated solution is too specific to the given data set (it includes the noise as part of the solution)

Generality: a solution is general if it is valid not only for the training samples, but also for all the other inputs it should face

Countermeasures (a sketch combining both follows below):
• favor simpler solutions
• larger and randomized training subsets, repeatedly tested
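A sketch combining both countermeasures (the penalty weight, subset size, and helper names are assumptions): a parsimony term penalizes complexity, and the training subset is resampled on every evaluation so a solution cannot memorize one fixed noisy sample.

```python
# Sketch: parsimony pressure plus a freshly randomized training subset.
import random

def regularized_error(candidate, complexity, training_set, error_fn,
                      subset_size=50, parsimony=0.01):
    # Resample the subset each call: repeated winners must generalize.
    subset = random.sample(training_set, min(subset_size, len(training_set)))
    data_error = sum(error_fn(candidate, s) for s in subset) / len(subset)
    return data_error + parsimony * complexity   # lower is better

# Symbolic-regression flavour: noisy samples of y = 2x; complexity = tree size.
training = [(x, 2.0 * x + random.gauss(0, 0.1)) for x in range(100)]
sq_err = lambda f, sample: (f(sample[0]) - sample[1]) ** 2
print(regularized_error(lambda x: 2.0 * x, complexity=3,
                        training_set=training, error_fn=sq_err))
```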


Oversimplification

The opposite of overfitting: the solutions obtained are too simple

Causes:
• Incomplete training set, not sufficiently representative of the problem to be solved
• Premature convergence due to ruggedness or deceptiveness

Solution: careful analysis of the problem space and design of the solution representation

[Figure: three plots of f(x): the original function, the measured data, and the oversimplified result.]


No Free Lunch Theorem

Wolpert and Macready, 1997: No Free Lunch (NFL) Theorem(s)
• averaged over all problems, all search algorithms have the same performance
• or: if an algorithm performs well on a certain category of problems, it must perform poorly on other problems

Performance improvements often rely on more knowledge about the problem domain (e.g. assuming strong causality, or a certain degree of ruggedness)


Other Issues

Epistasis and Pleiotropy
• Epistasis: interaction between different genes
• Pleiotropy: a single gene influences multiple traits
• In GP: one gene (e.g. a program segment) influences other genes (e.g. the code executed afterwards): a mutation may have a cascade effect, leading to weak causality

Multi-Objective Optimization
• multiple, possibly contradictory objectives to be pursued simultaneously
• must find a balance among them: the notion of "better" is replaced by a notion of "dominant" solution (see the dominance sketch after this list)
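A sketch of the dominance notion (assuming, by convention, that all objectives are minimized): solution a dominates b if it is no worse in every objective and strictly better in at least one.

```python
# Sketch: Pareto dominance, assuming all objectives are to be minimized.
# a dominates b <=> a is no worse everywhere and strictly better somewhere.

def dominates(a, b):
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

# Two objectives, e.g. (cost, latency):
print(dominates((1, 2), (2, 3)))  # True: better in both objectives
print(dominates((1, 3), (2, 2)))  # False: mutually non-dominated
print(dominates((2, 2), (2, 2)))  # False: equal solutions never dominate
```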


Overview

Evolutionary Computation, Part II
• Representations
• Performance considerations
• Dynamic environments
Summary


Optimization in Dynamic Environments

Motivation: dynamic applications:
• continuously changing environments
• delivery scheduling, vehicle routing, greenhouse control...
• autonomic environments:
  – detect and respond to changes, continuous self-optimization

Dynamic Optimization: the algorithm should continuously track the optimum in the presence of dynamic changes and uncertainties
• keep performance under (small) changes
• adjust quickly to changes


Optimization in Dynamic Environments

Challenges: change and uncertainty
• noise or errors in the calculation or approximation of the fitness function
• changes in environmental parameters (e.g. in a wireless network: the number of nodes, weather conditions, or obstacles that may affect transmissions)
• changes in the desired optimum, i.e. changes in the fitness function

Re-optimizing (starting from scratch) is expensive

Crucial to keep diversity:
• if the optimum changes, the population must be able to re-adapt: this requires diversity in the population


Optimization in Dynamic Environments

In a dynamic environment, convergence to a given optimum is a problem: how can the population re-adapt to a new optimum?

Solutions:
• Restart from scratch (a last resort, if changes are too severe)
• Recreate diversity after a change: randomization, e.g. hypermutation (but this may destroy previous information)
• Maintain diversity: e.g. random immigrants, sentinels (see the sketch after this list)
  – random immigrants: insert new random individuals periodically
  – sentinels: keep some individuals at fixed locations
  – but: this slows down convergence
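A sketch of the random-immigrants idea (the replacement rate, genotype length, and the choice to replace random slots rather than the worst individuals are assumptions):

```python
# Sketch of random immigrants: each generation, replace a small fraction of
# the population with fresh random individuals so diversity survives.
import random

def random_individual(length):
    return "".join(random.choice("01") for _ in range(length))

def inject_immigrants(population, rate=0.1):
    n = max(1, int(rate * len(population)))
    for i in random.sample(range(len(population)), n):
        population[i] = random_individual(len(population[i]))
    return population

pop = ["1111111111111111"] * 20          # a fully converged population
pop = inject_immigrants(pop, rate=0.2)   # 4 slots replaced with fresh genotypes
print(len(set(pop)))                     # almost surely > 1: diversity restored
```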


Optimization in Dynamic Environments

Solutions (cont.):
• Memory-enhanced algorithms: "remember" previous optima, in case they come back (a minimal sketch follows below)
  – implicit memory: redundant genetic representation (e.g. diploid)
  – explicit memory: explicitly store and retrieve information from memory
    • when the problem changes: retrieve a suitable solution from memory
    • more successful overall than implicit memory [Jin2005]
  – both are only useful in combination with diversity keeping
    • if there is no diversity in the memory, the memory is not very useful
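A minimal explicit-memory sketch (the capacity, storage policy, and reseeding rule are all assumptions, not the specific scheme surveyed in [Jin2005]): store the best individual when a change is detected, and later re-insert remembered solutions, re-evaluated under the new fitness, over the worst current individuals.

```python
# Sketch of explicit memory: remember previous optima and re-insert them
# after a change, in case an old optimum returns.

class OptimaMemory:
    def __init__(self, capacity=5):
        self.capacity = capacity
        self.entries = []                    # remembered genotypes

    def store(self, best):
        """Call when a change is detected: keep the most recent optima."""
        self.entries.append(best)
        self.entries = self.entries[-self.capacity:]

    def reseed(self, population, fitness):
        """Re-evaluate memory under the NEW fitness; overwrite worst individuals."""
        for sol in sorted(self.entries, key=fitness, reverse=True):
            worst = min(range(len(population)),
                        key=lambda i: fitness(population[i]))
            if fitness(sol) > fitness(population[worst]):
                population[worst] = sol
        return population

# Usage: memory.store(best_genotype) on change detection, then
# memory.reseed(pop, new_fitness) afterwards. Only helpful if the memory
# itself stays diverse (see the note above).
```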


Optimization in Dynamic Environments

Solutions (cont.):
• Multi-population approaches: different subpopulations on different peaks, with memory of local optima
  – an example of combining memory with diversity
  – approaches:
    • self-organizing scouts [Branke2000]
    • multi-national GA [Ursem2000]
• Anticipation and prediction [Bosman2005]
  – the system tries to predict the future consequences of current decisions
  – estimate expected values given a probability distribution


Overview

Evolutionary Computation, Part II
• Representations
• Performance considerations
• Dynamic environments
Summary


Summary

Solving problems with evolutionary computation involves a number of design choices:
• Genotype representation for candidate solutions:
  – string, tree, graph, multiset (chemistry), ...
• Phenotype representation:
  – the same as the genotype?
  – or an indirect encoding (e.g. grammatical evolution) with a genotype-phenotype map?
• Choice of reproduction, variation, fitness evaluation and selection mechanisms
  – strike a balance between exploration and exploitation
• Performance considerations


Summary

Performance considerations:
• prevent premature convergence
• keep diversity (especially in multimodal landscapes and dynamic environments)
• face and exploit neutrality
• deal with noisy fitness (e.g. in dynamic environments; avoid overfitting)

Not covered:
• co-evolution: different species (tasks) interact and have an impact on each other's evolution
  – competitive relation, e.g. host-parasite
  – cooperative relation, e.g. symbiosis


References

[Weise2009] T. Weise, M. Zapf, R. Chiong, and A. J. Nebro. "Why Is Optimization Difficult?" In: Nature-Inspired Algorithms for Optimisation, Studies in Computational Intelligence, volume 193, chapter 1, pages 1–50. Springer, 2009.

[Miller1996] B. L. Miller and M. J. Shaw. "Genetic Algorithms with Dynamic Niche Sharing for Multimodal Function Optimization". In: Proc. IEEE International Conference on Evolutionary Computation, Nagoya, Japan, May 1996.

[Jin2005] Y. Jin and J. Branke. "Evolutionary Optimization in Uncertain Environments: A Survey". IEEE Transactions on Evolutionary Computation, 9(3):303–317, June 2005.