Evolutionary Computational Intelligence
Lecture 8: Memetic Algorithms
Ferrante Neri
University of Jyväskylä
The Optimization Problem
Every problem can be formulated as an optimization problem, that is, the search for the maximum (or minimum) of a given objective function
Deterministic methods can fail because they may converge to a local optimum
Evolutionary Algorithms can fail because they may converge to a sub-optimal solution
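To make the first failure mode concrete, here is a minimal sketch (the function and the descent rule are illustrative assumptions, not taken from the lecture) of a deterministic fixed-step descent getting trapped in a local minimum of a multimodal function:

```python
import math

# A toy multimodal function (an assumption for this example, not from
# the lecture): global minimum at x = 0, many local minima elsewhere.
def f(x):
    return x * x + 10.0 * (1.0 - math.cos(2.0 * math.pi * x))

# A deterministic, fixed-step descent: always moves to the better of the
# two neighbouring points and therefore stops at the first local optimum.
def fixed_step_descent(x, step=0.01, iters=10_000):
    for _ in range(iters):
        x = min((x - step, x, x + step), key=f)
    return x

x_final = fixed_step_descent(2.5)     # start inside a non-global basin
print(x_final, f(x_final))            # trapped near x = 2, far from x = 0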
“Dialects” Developing in Artificial Intelligence
Fogel and Owens (USA, 1965): Evolutionary Programming
Holland (USA, 1973): Genetic Algorithms
Rechenberg and Schwefel (Germany, 1973): Evolution Strategies
1990s: the dialects converge under the umbrella term Evolutionary Algorithms (EAs)
Historical Info about MAs
The term Memetic Algorithm (MA) was coined by Moscato (1989)
…but, as is often the case, the same idea had also been presented under other names:
– Hybrid GAs
– Baldwinian GAs
– Lamarckian GAs
– Others…
The Metaphor
The meme, from “The Selfish Gene” (Dawkins, 1976): a meme is a unit of “cultural transmission” in the same way that genes are the units of biological transmission.
In EAs, genes are encodings of candidate solutions; in MAs, the memes are also “strategies” for how to improve the solutions.
Memetic Algorithms
The combination of Evolutionary Algorithms with Local Search operators that work within the EA loop has been termed “Memetic Algorithms”
The term also applies to EAs that use instance-specific knowledge in their operators
Memetic Algorithms have been shown to be orders of magnitude faster and more accurate than EAs on some problems, and are the “state of the art” on many problems
Michalewicz’s view on EAs
Local Searchers
Local Searcher (LS): a deterministic method able to find the nearest local optimum
Local Searchers can be classified according to:
– Order
– Pivot Rule
– Depth
– Neighborhood
Local Searchers’ Classification
Order: zero if the LS uses just the objective function (direct search), one if it uses the first derivative, two if it uses the second derivative
Steepest Ascent Pivot Rule: the LS explores the whole neighborhood before moving (e.g. the Hooke-Jeeves method). Greedy Pivot Rule: the LS moves along the first better search direction found (e.g. the Nelder-Mead method)
Local Searchers’ Classification
The depth of the Local Search defines the termination condition of the search loop (stop criterion), i.e. how long the LS runs before control returns to the EA
The neighborhood generating function n(i) defines the set of points that can be reached by applying some move operator to the point i (both the pivot rules and the neighborhood are illustrated in the sketch below)
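A small sketch of these two classification axes; the bit-string representation and the OneMax fitness are assumptions made only for the illustration:

```python
def fitness(bits):
    return sum(bits)          # toy OneMax objective (assumption)

def neighborhood(bits):
    # n(i): every point reachable by one application of the move
    # operator -- here, flipping exactly one bit.
    for j in range(len(bits)):
        yield bits[:j] + (1 - bits[j],) + bits[j + 1:]

def steepest_ascent_step(bits):
    # Explore the WHOLE neighborhood, then move to its best point.
    best = max(neighborhood(bits), key=fitness)
    return best if fitness(best) > fitness(bits) else bits

def greedy_step(bits):
    # Move to the FIRST improving neighbor found, skipping the rest.
    for nb in neighborhood(bits):
        if fitness(nb) > fitness(bits):
            return nb
    return bits

x = (0, 1, 0, 0, 1)
print(steepest_ascent_step(x))   # best single-flip improvement
print(greedy_step(x))            # first single-flip improvement found
```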
General Scheme of EAs
Pseudo-Code for typical EA
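The original slide shows this as a figure; as a rough stand-in, a typical generational EA loop could be sketched as follows (all operators and parameters are illustrative placeholders, not the slide's exact pseudo-code):

```python
import random

def evolutionary_algorithm(fitness, init, crossover, mutate,
                           pop_size=50, generations=100):
    population = [init() for _ in range(pop_size)]
    for _ in range(generations):
        offspring = []
        while len(offspring) < pop_size:
            # Parent selection: binary tournament.
            p1 = max(random.sample(population, 2), key=fitness)
            p2 = max(random.sample(population, 2), key=fitness)
            offspring.append(mutate(crossover(p1, p2)))
        population = offspring        # generational replacement
    return max(population, key=fitness)

# Usage on a toy OneMax problem:
best = evolutionary_algorithm(
    fitness=sum,
    init=lambda: [random.randint(0, 1) for _ in range(20)],
    crossover=lambda a, b: a[:10] + b[10:],
    mutate=lambda c: [1 - g if random.random() < 0.05 else g for g in c],
)
print(sum(best))
```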
How to Combine EA and LS
Intelligent Initialization
The initial population is not generated pseudo-randomly but according to a heuristic rule.
Examples: quasi-random generators, orthogonal arrays
It increases the average fitness but decreases the diversity
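One simple heuristic rule of this kind, given here only as an assumed example (a stratified, Latin hypercube-style generator), spreads the initial samples more evenly over the decision space than plain pseudo-random sampling:

```python
import random

def stratified_population(pop_size, dim, low=-5.0, high=5.0):
    # For each dimension, put exactly one coordinate in each of the
    # pop_size equal-width strata, then shuffle strata across individuals.
    cell = (high - low) / pop_size
    columns = []
    for _ in range(dim):
        strata = list(range(pop_size))
        random.shuffle(strata)
        columns.append([low + (s + random.random()) * cell for s in strata])
    return [[columns[d][i] for d in range(dim)] for i in range(pop_size)]

pop = stratified_population(pop_size=10, dim=3)
print(pop[0])   # coordinates spread more evenly than uniform sampling
```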
Intelligent Variation Operators
Intelligent Crossover: finds the best combination of the parents in order to generate the best-performing offspring (e.g. heuristic selection of the cut point; a sketch follows below)
Intelligent Mutation: tries several possible mutated individuals in order to obtain the “luckiest” mutation (e.g. the best bit to flip)
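A minimal sketch of the crossover case, assuming a bit-string encoding and a toy fitness (both assumptions for the example): every cut point is tried and the best offspring is kept.

```python
def fitness(bits):
    return sum(bits)   # toy OneMax objective (assumption)

def intelligent_crossover(p1, p2):
    # Try every cut point (in both directions) and keep the best child.
    candidates = []
    for cut in range(1, len(p1)):
        candidates.append(p1[:cut] + p2[cut:])
        candidates.append(p2[:cut] + p1[cut:])
    return max(candidates, key=fitness)

child = intelligent_crossover((1, 1, 0, 0), (0, 0, 1, 1))
print(child, fitness(child))   # (1, 1, 1, 1) 4
```

The price of the operator is visible in the sketch: each crossover now costs 2(n-1) fitness evaluations instead of none.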
Memetic Algorithms Proper: Local Search Acting on the Offspring
Can be viewed as a sort of “lifetime learning”
The LS is applied to the offspring in order to obtain better-performing individuals
A LS can also be viewed as a special mutation operator, and it is often (but not only!) used to speed up the “endgame” of an EA by searching in the vicinity of good solutions
In fact, EAs are efficient at finding solutions near the optimum but not at finalizing the search (a sketch of this loop follows)
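Putting this together, a sketch of the core MA step; the greedy bit-flip LS and the other operators are illustrative assumptions, and the memetic part is the LS call on each offspring:

```python
import random

def local_search(bits, fitness):
    # Greedy single-bit-flip hill climber, run until a local optimum.
    improved = True
    while improved:
        improved = False
        for j in range(len(bits)):
            nb = bits[:j] + [1 - bits[j]] + bits[j + 1:]
            if fitness(nb) > fitness(bits):
                bits, improved = nb, True
                break
    return bits

def memetic_step(population, fitness, mutate):
    offspring = []
    for _ in range(len(population)):
        parent = max(random.sample(population, 2), key=fitness)  # tournament
        child = mutate(parent)
        offspring.append(local_search(child, fitness))  # lifetime learning
    return offspring

pop = [[random.randint(0, 1) for _ in range(12)] for _ in range(8)]
mutate = lambda c: [1 - g if random.random() < 0.1 else g for g in c]
pop = memetic_step(pop, fitness=sum, mutate=mutate)
print(max(sum(ind) for ind in pop))   # on OneMax the LS reaches 12 quickly
```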
How to apply a Local Searcher?
Krasnogor (2002) shows that there are theoretical advantages to using a local search with a move operator (the LS applied to the offspring) that is different from the move operators used by mutation and crossover, but…
How many iterations of the local search are performed?
Is the local search applied to the whole population?
– or just to the best individuals?
– or just to the worst?
– or to a certain part of the population, according to some rules?
Basically, the right choice depends on the problem! (One way to expose these choices as parameters is sketched below.)
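A sketch of such a parameterization; the parameter names, defaults, and the `local_search` signature with a depth limit are assumptions, not the lecture's prescription:

```python
def apply_local_search(population, fitness, local_search,
                       ls_iters=10, fraction=0.2, target="best"):
    # `ls_iters` controls the depth of the LS; `fraction` and `target`
    # control which part of the population receives it.
    k = max(1, int(fraction * len(population)))
    # Rank so that the chosen `target` group sits at the front.
    ranked = sorted(population, key=fitness, reverse=(target == "best"))
    refined = [local_search(ind, fitness, max_iters=ls_iters)
               for ind in ranked[:k]]
    return refined + ranked[k:]
```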
Two Models of Lifetime Adaptation
Lamarckian: traits acquired by an individual during its lifetime can be transmitted to its offspring (the genotype is rewritten), e.g. replace the individual with its fitter neighbour
Baldwinian: traits acquired by an individual cannot be transmitted to its offspring (the LS only suggests a new search direction), e.g. the individual receives the fitness (but not the genotype) of its fitter neighbour
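The difference is easiest to see side by side; in this sketch (function names and the genotype/fitness pairing are assumptions) the only change is what gets written back:

```python
def lamarckian_update(genotype, fitness, local_search):
    # Acquired traits ARE inherited: the refined genotype replaces the
    # original, and the fitness follows it.
    refined = local_search(genotype)
    return refined, fitness(refined)

def baldwinian_update(genotype, fitness, local_search):
    # Acquired traits are NOT inherited: the original genotype is kept,
    # but the individual is credited with the fitter neighbour's fitness.
    refined = local_search(genotype)
    return genotype, fitness(refined)
```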
Efficiency and Robustness of the Memetic Algorithms
Usually the fitness landscape is multimodal and very complex, or the decision space is very large
We would like to implement an algorithm which is able to converge, every time it is run, to the optimal solution in a short time (avoiding premature convergence and stagnation)
Adaptivity and Self-Adaptivity
In order to enhance the efficiency and the robustness of an MA, an adaptive or self-adaptive scheme can be used
Adaptive: the memes are controlled during the evolution by means of some rules depending on the state of the population
Self-Adaptive: the adaptive rules are encoded in the genotype of each individual
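A sketch of the contrast; the diversity rule, the meme gene, and the mutation rate below are illustrative assumptions. In the adaptive case a global rule inspects the population state; in the self-adaptive case the choice travels inside each genotype.

```python
import random

# Adaptive: a global rule inspects the population state to pick the meme.
def adaptive_choice(memes, population, fitness):
    distinct = len({fitness(ind) for ind in population})
    diversity = distinct / len(population)
    # Low fitness diversity -> population is converging -> explore more.
    return memes["explore"] if diversity < 0.3 else memes["exploit"]

# Self-adaptive: each individual carries its own meme index as an extra
# gene, inherited and mutated like any other gene.
def self_adaptive_choice(memes_list, individual, meme_mutation_rate=0.1):
    genotype, meme_index = individual
    if random.random() < meme_mutation_rate:
        meme_index = random.randrange(len(memes_list))
    return memes_list[meme_index], (genotype, meme_index)
```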
Multi-Meme systems
A Memetic Algorithm uses one LS (usually complex)
A Multi-Meme Algorithm (M-MA) employs a set (a list) of LSs (usually simple)
If an M-MA is implemented, the problem of how and when to run the LSs arises, and some rules are therefore needed
Adaptivity + Multi-Meme
In order to properly select from the list the LS to use at the different stages of the evolution, an adaptive strategy can be used
If the “necessities” of the evolutionary process are efficiently encoded, it is possible to use different LSs at different moments and on different individuals (or sets of individuals)
The use of several Local Searchers
Local Searchers with different features explore the search space from different perspectives
Different Local Searchers should “compete” and “cooperate” (Ong 2004), working to solve the classical EA problem of balancing “exploration” and “exploitation”
An Example: Adaptivity + Multi-Meme Based on the Population Diversity
The state of the convergence of the algorithm can be measured on the basis of the coefficient
$$\min\left\{1,\; \frac{\left|f_{\text{best}} - f_{\text{avg}}\right|}{\left|f_{\text{best}}\right|}\right\}$$
where $f_{\text{best}}$ and $f_{\text{avg}}$ are the best and the average fitness values in the population.
If convergence is approaching but still quite far, the Nelder-Mead method is applied: since it is greedy and explorative, it can jump out of the nearest basin of attraction
If convergence is very near, the Hooke-Jeeves method is run: since it is a LS with a steepest ascent pivot rule, it can finalize the work in the (hopefully global) optimum that has been found
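A sketch of this rule; `nelder_mead` and `hooke_jeeves` stand for the two local searchers (placeholders, not a specific library API), and the 0.1 threshold is an illustrative assumption:

```python
def convergence_coefficient(fitness_values):
    # min{1, |f_best - f_avg| / |f_best|}: large while the population is
    # spread out, close to 0 once the average approaches the best.
    f_best = max(fitness_values)
    f_avg = sum(fitness_values) / len(fitness_values)
    return min(1.0, abs(f_best - f_avg) / abs(f_best)) if f_best != 0 else 1.0

def select_meme(fitness_values, nelder_mead, hooke_jeeves, threshold=0.1):
    psi = convergence_coefficient(fitness_values)
    if psi > threshold:
        return nelder_mead    # still far: greedy, explorative meme
    return hooke_jeeves       # very near: steepest-ascent meme to finalize
```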
Thank You for Your Attention
Questions?