IntroductionAlgorithm
Properties and extensionsExamples
Lecture on bioinformatics, chapter 2: optimization and genetic algorithms
Fabian J. Theis
Institute of Biophysics, University of Regensburg, Germany
4th May 2005
Theis Optimization and genetic algorithms

Outline
- Introduction
  - Reinforcement learning
  - Optimization
  - Imitate nature
  - Genetic algorithms
- Algorithm
  - Basic algorithm
  - Data representation
  - Selection
  - Reproduction
- Properties and extensions
  - Overview
  - Convergence analysis
  - Schema theorem
  - Genetic programming
- Examples
  - 2d-function optimization
  - Genetic Mastermind
  - Hyperplane detection

Introduction
- idea of genetic algorithms (GAs)
  - extract optimization strategies nature uses successfully → Darwinian evolution
  - transform them for application in mathematical optimization theory
- abstract goal: find the global optimum of a problem/function in a defined phase space

Optimization
- GA as a special kind of reinforcement learning
  - no access to the full problem/function
  - but: rewards are given for a given action/search space position
  - goal: use rewards to find the optimum
  - this contrasts with learning by (given) examples as in supervised learning, e.g. using neural networks
  - → traverse search space manually

Optimization
- simple algorithm: random sampling
  - pick a single location in the search space
  - store it if the reward is higher than at previous locations, discard it otherwise
  - repeat
- other such algorithms
  - Markov-Chain-Monte-Carlo search (MCMC)
  - simulated annealing
  - if the derivative of the reward is available: (conjugate) gradient ascent/descent etc.
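
The random-sampling loop can be sketched in a few lines. This is only an illustration under assumptions that are not in the lecture: the name `random_sampling`, a box-shaped search space given as `bounds`, and the hypothetical reward `example_reward`.

```python
import random

def random_sampling(reward, bounds, n_samples=1000, seed=0):
    """Pure random sampling: remember the best location seen so far.

    reward -- callable returning the reward at a location (assumed interface)
    bounds -- list of (low, high) pairs, one per coordinate (assumed box)
    """
    rng = random.Random(seed)
    best_x, best_r = None, float("-inf")
    for _ in range(n_samples):
        # pick a single location in the search space
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        r = reward(x)
        # store it if the reward is higher than at all previous locations
        if r > best_r:
            best_x, best_r = x, r
    return best_x, best_r

# hypothetical reward with its maximum at (1, -2)
def example_reward(x):
    return -((x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2)

best_x, best_r = random_sampling(example_reward, [(-5, 5), (-5, 5)], n_samples=5000)
```

Because no sample uses information from earlier ones, the method is easy to run in parallel but needs many samples in high dimensions.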

Optimization
- simple algorithmic maximization is possible, e.g. by gradient ascent:
  - a differentiable function $f : \mathbb{R}^n \to \mathbb{R}$ can be maximized by local updates in the direction of its gradient
  - given a sufficiently small learning rate $\eta > 0$ and a starting point $x(0) \in \mathbb{R}^n$
  - local maxima of $f$ can be found by iterating
    $$x(t+1) = x(t) + \eta \, \Delta x(t)$$
    with
    $$\Delta x(t) = (Df)(x(t))^\top = \operatorname{grad} f(x(t)) = \frac{\partial f}{\partial x}(x(t))$$
    the gradient of $f$ at $x(t)$.
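
The update rule $x(t+1) = x(t) + \eta \, \Delta x(t)$ can be sketched as follows. The function name `gradient_ascent`, the fixed number of steps, and the quadratic example are assumptions made for illustration; the lecture only states the iteration itself.

```python
def gradient_ascent(grad_f, x0, eta=0.1, n_steps=200):
    """Iterate x(t+1) = x(t) + eta * grad f(x(t)) for a fixed number of steps."""
    x = list(x0)
    for _ in range(n_steps):
        g = grad_f(x)
        x = [xi + eta * gi for xi, gi in zip(x, g)]
    return x

# hypothetical example: f(x) = -(x_1 - 1)^2 - (x_2 + 2)^2 with maximum at (1, -2)
def grad_example(x):
    return [-2.0 * (x[0] - 1.0), -2.0 * (x[1] + 2.0)]

x_max = gradient_ascent(grad_example, x0=[0.0, 0.0])
```

For this concave example any small enough $\eta$ reaches the global maximum; for a general $f$ the iteration only finds a local maximum that depends on $x(0)$.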

Stochastic optimization
- Problem: minimize a multivariate, real function $f(x)$, $x = (x_1, x_2, \ldots, x_n)^\top$
- if no analytical expression of $f(x)$ is given (but only some values) or $f$ is non-continuous, $f$ can be minimized by stochastic methods
- Random search
  - choose $x_1$ randomly
  - calculate $f(x_1)$
  - choose $\delta$ randomly
  - calculate $f(x_1 + \delta)$
  - if $f(x_1 + \delta) < f(x_1)$: choose $x_1 + \delta$ as the new starting point
  - if $f(x_1 + \delta) > f(x_1)$: choose a new $\delta$
  - choose $\delta = \delta(t)$ with $\lim_{t\to\infty} \delta(t) = 0$
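
A minimal sketch of this random search, assuming a uniform step distribution and the schedule $\delta(t) \propto 1/\sqrt{t}$ (both are illustrative choices, as are the names `random_search`, `delta0` and `sphere`):

```python
import math
import random

def random_search(f, x_start, n_steps=2000, delta0=1.0, seed=0):
    """Minimize f by random local steps whose size shrinks over time."""
    rng = random.Random(seed)
    x = list(x_start)
    fx = f(x)
    for t in range(1, n_steps + 1):
        scale = delta0 / math.sqrt(t)  # assumed schedule: delta(t) -> 0
        delta = [rng.uniform(-scale, scale) for _ in x]
        candidate = [xi + di for xi, di in zip(x, delta)]
        f_candidate = f(candidate)
        if f_candidate < fx:
            # improvement: the candidate becomes the new starting point
            x, fx = candidate, f_candidate
        # otherwise keep x and draw a new delta in the next iteration
    return x, fx

# hypothetical test function with its minimum at the origin
def sphere(x):
    return sum(xi * xi for xi in x)

x_min, f_min = random_search(sphere, [3.0, -4.0])
```

Only improvements are accepted, so the function value never increases and the search cannot leave a local minimum once the steps have become small.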

Stochastic optimization
- Metropolis sampling
  - like random search, but
  - if $f(x_1 + \delta) > f(x_1)$, use the newly generated $x$ only with probability
    $$P = \frac{1}{1 + \exp\left(\frac{f(x_1 + \delta) - f(x_1)}{\alpha}\right)}$$
    with $\alpha(t) \to 0$ as $t \to \infty$
  - then the algorithm leaves a local minimum with probability $P > 0$
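
The acceptance rule can be sketched as below. The formula for $P$ is taken from the slide; the schedule $\alpha(t) = \alpha_0 / t$, the fixed step width, and all function names are assumptions for illustration.

```python
import math
import random

def acceptance_probability(f_new, f_old, alpha):
    """P = 1 / (1 + exp((f_new - f_old) / alpha))."""
    z = (f_new - f_old) / alpha
    if z > 700.0:  # exp(z) would overflow; P is effectively zero
        return 0.0
    return 1.0 / (1.0 + math.exp(z))

def metropolis_search(f, x_start, n_steps=5000, step=0.5, alpha0=1.0, seed=0):
    """Random search that sometimes accepts a worse point."""
    rng = random.Random(seed)
    x, fx = list(x_start), f(x_start)
    best_x, best_f = x, fx
    for t in range(1, n_steps + 1):
        alpha = alpha0 / t  # assumed schedule: alpha(t) -> 0
        candidate = [xi + rng.uniform(-step, step) for xi in x]
        f_candidate = f(candidate)
        # better points are always used, worse ones only with probability P
        if f_candidate < fx or rng.random() < acceptance_probability(f_candidate, fx, alpha):
            x, fx = candidate, f_candidate
            if fx < best_f:
                best_x, best_f = x, fx
    return best_x, best_f
```

As $\alpha$ shrinks, $P$ for a worse point tends to zero and the method behaves more and more like plain random search.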

Genetic algorithms
- here: imitate nature's robust way of evolving successful organisms
  - organisms ill-suited to an environment die off, whereas fit ones reproduce
  - offspring are similar to their parents, so population fitness increases over the generations
  - mutation can randomly generate new species
  - 'The Origin of Species by Means of Natural Selection', C.R. Darwin, D. Appleton and Company, NY, 1897
- history:
  - introduced by J. Holland 1975
  - further investigated by his students, e.g. K. DeJong 1975
  - more recently theoretical advances, e.g. by M. Vose 1993

Genetic algorithms
- what's good for nature is good for artificial systems
- imagine a population of individual 'explorers' sent into the optimization phase space
  - an explorer is defined by its genes, encoding its phase-space position
  - the optimization problem is given by a fitness function
- the struggle of 'life' begins
  - selection
  - crossover
  - mutation
- according to these rules, populations tend to increase overall fitness

Genetic algorithms
- advantages
  - global, not only local, optimization
  - simple and hence easy to implement
  - easy parallelization possible
- disadvantages
  - how to encode the phase-space position
  - rather low speed and high computational cost
  - parameter dependencies (population size, selection and reproduction parameters)

Basic genetic algorithm
Data:   population, a set of individuals
        fitness-function Fitness, a function measuring the fitness of an individual
Result: an individual
repeat
    parents ← Selection(population, Fitness)
    population ← Reproduction(parents)
until some individual is fit enough
return the best individual in population according to Fitness
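
The pseudocode maps almost line by line onto a small driver function. The sketch below is not from the lecture: `target` (the "fit enough" threshold) and `max_generations` (a safeguard against endless loops) are assumed parameters, and the toy operators exist only to make the example runnable.

```python
def genetic_algorithm(population, fitness, selection, reproduction,
                      target, max_generations=100):
    """Repeat selection and reproduction until some individual is fit enough."""
    for _ in range(max_generations):
        parents = selection(population, fitness)
        population = reproduction(parents)
        if any(fitness(ind) >= target for ind in population):
            break
    # return the best individual in the population according to fitness
    return max(population, key=fitness)

# hypothetical toy problem: individuals are integers, fitness is the value itself
def toy_selection(population, fitness):
    ranked = sorted(population, key=fitness)
    return ranked[len(ranked) // 2:] * 2      # keep the better half twice

def toy_reproduction(parents):
    return [p + 1 for p in parents]           # every child is slightly "fitter"

best = genetic_algorithm([0, 1, 2, 3], lambda ind: ind,
                         toy_selection, toy_reproduction, target=10)
```

Passing `selection` and `reproduction` as functions keeps the driver independent of the data representation.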

Individual
- an individual encodes the data space position
- classic GA approach: representation by a word (chromosome) over a finite alphabet
  - each letter is called a gene
  - real DNA: alphabet is {A, G, T, C}
  - here: usually binary alphabet {0, 1}
  - some authors speak more generally of evolutionary programming if the alphabet is larger
  - a finite alphabet implies a discrete search space
- continuous search space
  - use a continuous 'alphabet', i.e. genes $\in \mathbb{R}$ or bounded genes $\in [a, b]$
  - so individual $\in \mathbb{R}^n$, respectively $\in [a_1, b_1] \times \ldots \times [a_n, b_n]$

Selection
- goal: select individuals that produce the next generation
- probabilistic selection
  - based on fitness function $f$
  - better individuals have an increased chance of reproduction
  - usually selection with replacement → very fit individuals reproduce several times
- selection probabilities
  - roulette wheel (Holland 1975)
    $$P(\text{choice of individual } i) = \frac{f(i)}{\sum_j f(j)}$$
    problem: negative $f$? minimization?
  - ranking methods, i.e. choose individuals according to fitness rank, e.g. normalized geometric ranking (Joines and Houck 1994)
  - tournament selection, i.e. select the best among a randomly selected subset
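
Two of these schemes can be sketched with Python's standard `random` module. The function names, the argument `n` (number of parents to draw) and the tournament `size` are assumptions; the roulette wheel rejects negative fitness values, which is one simple way to handle the problem noted above.

```python
import random

def roulette_wheel(population, fitness, n, rng):
    """Draw n parents with replacement, P(i) = f(i) / sum_j f(j)."""
    weights = [fitness(ind) for ind in population]
    if min(weights) < 0 or sum(weights) <= 0:
        raise ValueError("roulette wheel needs non-negative fitness with positive sum")
    return rng.choices(population, weights=weights, k=n)

def tournament(population, fitness, n, rng, size=2):
    """Draw n parents; each is the best of a random subset of the given size."""
    return [max(rng.sample(population, size), key=fitness) for _ in range(n)]

rng = random.Random(0)
population = [0.0, 1.0, 2.0, 3.0]
parents_rw = roulette_wheel(population, lambda ind: ind, 6, rng)
parents_t = tournament(population, lambda ind: ind, 6, rng, size=2)
```

Tournament selection only compares fitness values, so it also works for negative fitness and, with `min` in place of `max`, for minimization.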

Reproduction
- typically consists of two stages
  - crossover (or mating): selected individuals are randomly paired and (usually two) children are produced
  - mutation: with a small probability, a gene is altered randomly to a different value
- use genetic operators to produce and alter new offspring → basic search mechanism in GAs

Crossover
- let $x, y \in A^n$ be the genes of the two parents
- simple crossover
  - choose $r$ randomly in $\{1, \ldots, n\}$
  - generate children $x', y' \in A^n$ by
    $$x'_i := \begin{cases} x_i & \text{if } i < r \\ y_i & \text{otherwise} \end{cases} \qquad y'_i := \begin{cases} y_i & \text{if } i < r \\ x_i & \text{otherwise} \end{cases}$$
- in the case of continuous genes: arithmetic crossover
  - choose $r$ randomly in $[0, 1]$
  - generate children $x', y' \in A^n$ by
    $$x' := r x + (1 - r) y, \qquad y' := (1 - r) x + r y$$
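
Both operators are short enough to write out directly. In this sketch the cut index `r` and the mixing weight `r` are passed in explicitly so the result is reproducible; in a GA they would be drawn at random as described above. The function names are illustrative.

```python
def simple_crossover(x, y, r):
    """One-point crossover with 1-based cut index r in {1, ..., n}."""
    cut = r - 1  # the positions i < r (1-based) are the first r - 1 entries
    return x[:cut] + y[cut:], y[:cut] + x[cut:]

def arithmetic_crossover(x, y, r):
    """Convex combination of two parents with continuous genes, r in [0, 1]."""
    child_x = [r * xi + (1 - r) * yi for xi, yi in zip(x, y)]
    child_y = [(1 - r) * xi + r * yi for xi, yi in zip(x, y)]
    return child_x, child_y

children_bits = simple_crossover([0, 0, 0, 0], [1, 1, 1, 1], r=3)
children_real = arithmetic_crossover([0.0, 2.0], [4.0, 6.0], r=0.25)
```

Here `children_bits` is `([0, 0, 1, 1], [1, 1, 0, 0])` and `children_real` is `([3.0, 5.0], [1.0, 3.0])`. Note that $r = 1$ in the simple crossover merely swaps the two parents.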

Mutation
- let $x_i \in A$ be the gene of an individual that is to be mutated
- binary gene: binary mutation
  - $x'_i := 1 - x_i$
- discrete or continuous bounded $A$: uniform mutation
  - set $x'_i$ to be a uniformly randomly chosen element of $A$
- also possible: non-uniform mutation
  - needs a fixed distribution for element choice
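
A minimal sketch of binary and uniform mutation, assuming each gene is mutated independently with probability `p_mut` (the per-gene probability, the bounds `low` and `high`, and the function names are illustrative):

```python
import random

def binary_mutation(bits, p_mut, rng):
    """Flip each bit (x_i -> 1 - x_i) independently with probability p_mut."""
    return [1 - b if rng.random() < p_mut else b for b in bits]

def uniform_mutation(genes, p_mut, low, high, rng):
    """Replace each gene by a uniform draw from [low, high] with probability p_mut."""
    return [rng.uniform(low, high) if rng.random() < p_mut else g for g in genes]

rng = random.Random(0)
mutated_bits = binary_mutation([0, 1, 1, 0, 1], 0.1, rng)
mutated_genes = uniform_mutation([0.2, 0.8], 0.5, 0.0, 1.0, rng)
```

A non-uniform mutation would replace `rng.uniform` by a draw from another fixed distribution, for example one centred on the current gene value.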
One generation example

Analytical example
- optimize $f(x) := x^2$ in $[0, 1]$
- $x \to 0.b_9 b_8 \ldots b_0$ ⇒ 1024 values of $x$
- minimal distance of adjacent values is $2^{-10}$
- mutation of bit $i$ changes $x$ by $\delta = 2^{-i}$: $x \to x + \delta$
- crossover of two character strings:
  $$x = 0.b_9 b_8 \ldots b_0, \qquad y = 0.a_9 a_8 \ldots a_0 = x + \delta \quad\Rightarrow\quad z = 0.b_9 b_8 \ldots b_i a_{i-1} \ldots a_0$$
- crossover corresponds to improved stochastic search
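
The 10-bit fixed-point coding can be made concrete as follows. This sketch assumes that the index in $\delta = 2^{-i}$ counts binary places after the point (place 1 is the most significant bit), which is one reading of the slide; all function names are illustrative.

```python
def decode(bits):
    """Map bits (most significant first) to the number x = 0.b9b8...b0."""
    return sum(b * 2.0 ** -place for place, b in enumerate(bits, start=1))

def flip(bits, place):
    """Mutate the bit at binary place `place` (1 = most significant)."""
    flipped = list(bits)
    flipped[place - 1] = 1 - flipped[place - 1]
    return flipped

def crossover(x_bits, y_bits, cut):
    """The child takes the first `cut` bits from x and the rest from y."""
    return x_bits[:cut] + y_bits[cut:]

x_bits = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]   # x = 0.625
y_bits = flip(x_bits, 3)                   # y differs from x by 2**-3
z_bits = crossover(x_bits, y_bits, 2)      # head of x, tail of y
```

With these values `decode(y_bits)` and `decode(z_bits)` are both 0.5: the child inherits the mutated tail of `y`, so crossover passes on a step that mutation found.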

Progress diagram of a genetic algorithm
1. Gen := 0
2. create initial random population
3. termination criteria satisfied? Yes → End; No → continue
4. evaluate fitness of each individual
5. i := 0
6. i = M? Yes → Gen := Gen + 1, go to step 3; No → continue
7. select genetic operation
   - with probability $P_r$: select one individual, perform reproduction, copy into new population
   - with probability $P_c$: select two individuals, perform crossover, insert two offspring into new population, i := i + 1
   - with probability $P_m$: select one individual, perform mutation, insert mutant into new population
8. i := i + 1, go to step 6

Convergence analysis
- corresponds to a 'schema' (J. Holland 1975)
- example goal: maximize $f(x) = x^2$ in $[0, 1]$
  - generate $N$ numbers in $[0, 1]$ using 10-bit fixed-point coding
  - generate a new population by choosing codes with a probability that decreases with decreasing fitness, e.g.

    code 1:  0.1...  →  0.1...  →  0.11...
    code 2:  0.1...  →  0.1...  →  0.11...
    ...
    code N:  0.0...  →  0.1...  →  0.10...

    ⇒ all codes 1, ..., N take the form 0.11..., and the population tends towards 0.1111111111
- each crossover defines new points, with small steps close to $x = 1$ becoming more and more popular; the final step size will be $2^{-10}$

Schema
A schema is a bit pattern over the symbols {0, 1, .} that represents a set of binary strings; the symbol '.' stands for an arbitrary bit.
Genetic algorithms consist of a finite series of the three steps:
1. choice of parent bit pattern
2. recombination
3. mutation
With what probability do the fitter bit patterns survive from generation to generation?

Consider the probability that the fitter bit patterns are selected for reproduction and that they survive the recombination and mutation operations.
- population at time $t$: $N$ binary chains (schemas) of length $l$
- $O(H, t)$: number of elements of the population at time $t$ that contain the schema $H$
- $d(H)$: diameter of a schema, i.e. the length of the shortest subpattern containing all non-'.' characters (fixed bits), e.g. d(..1.1..) = d(1.1) = 3
- maximize fitness $f$ defined on all binary chains of length $l$
- selection of a parent chain with probability
  $$p(H_j) = \frac{f(H_j)}{\sum_{i=1}^{N} f(H_i)}$$
- mean fitness of the population
  $$f_\mu = \frac{1}{N} \sum_{i=1}^{N} f(H_i) \quad\Rightarrow\quad p(H_j) = \frac{f(H_j)}{N \cdot f_\mu}$$
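
The quantities $O(H, t)$ and $d(H)$ are easy to compute when chains and schemas are stored as strings. The sketch below assumes this string representation; the function names are illustrative, and returning 0 for a schema without fixed bits is an assumption the lecture does not cover.

```python
def matches(chain, schema):
    """True if the binary chain fits the schema ('.' matches any bit)."""
    return len(chain) == len(schema) and all(
        s == "." or s == c for c, s in zip(chain, schema))

def occurrences(population, schema):
    """O(H, t): number of chains in the current population that contain H."""
    return sum(matches(chain, schema) for chain in population)

def diameter(schema):
    """d(H): length of the shortest subpattern containing all fixed bits."""
    fixed = [i for i, s in enumerate(schema) if s != "."]
    return fixed[-1] - fixed[0] + 1 if fixed else 0

population = ["0010100", "1111111", "0010000"]
count = occurrences(population, "..1.1..")
```

For this population `count` is 2, and `diameter("..1.1..")` is 3, in agreement with the example above.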

Selection
- selection probability of a chain containing schema $H$
  $$P = \sum_{j=1}^{k} \frac{f(H_j)}{N \cdot f_\mu}$$
  where $H_1, \ldots, H_k$ are the chains of the population with schema $H$
- fitness $f(H)$ of schema $H$ in generation $t$
  $$f(H) = \frac{\sum_j f(H_j)}{O(H, t)} \quad\Rightarrow\quad P = \frac{O(H, t)\, f(H)}{N \cdot f_\mu}$$
- probability $P_A$ for two chains containing $H$ to be selected as parents:
  $$P_A = \left[ \frac{O(H, t)\, f(H)}{N \cdot f_\mu} \right]^2$$
- probability $P_B$ that exactly one of two selected chains contains $H$:
  $$P_B = 2 \cdot \frac{O(H, t)\, f(H)}{N \cdot f_\mu} \cdot \left( 1 - \frac{O(H, t)\, f(H)}{N \cdot f_\mu} \right)$$
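
A small numerical check of these formulas, assuming chains are stored as bit strings, the two parents are drawn independently with replacement, and the total fitness is positive. The function names and the toy fitness (the integer value of the bit string) are illustrative.

```python
def matches(chain, schema):
    """True if the binary chain fits the schema ('.' matches any bit)."""
    return len(chain) == len(schema) and all(
        s == "." or s == c for c, s in zip(chain, schema))

def schema_selection(population, schema, fitness):
    """Return (P, P_A, P_B) for one schema under fitness-proportional selection."""
    n = len(population)
    f_mean = sum(fitness(c) for c in population) / n            # f_mu, assumed > 0
    members = [c for c in population if matches(c, schema)]     # H_1, ..., H_k
    if not members:
        return 0.0, 0.0, 0.0
    f_schema = sum(fitness(c) for c in members) / len(members)  # f(H)
    p = len(members) * f_schema / (n * f_mean)                  # O(H,t) f(H) / (N f_mu)
    return p, p * p, 2.0 * p * (1.0 - p)

population = ["11", "10", "01", "00"]
p, p_a, p_b = schema_selection(population, "1.", lambda c: int(c, 2))
```

Here the fitness values are 3, 2, 1, 0, so $P = 5/6$, $P_A = 25/36$ and $P_B = 10/36$; together with the probability $(1 - P)^2$ that neither parent contains $H$, the three cases sum to one.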
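Recombination
- probability that schema $H$ is contained in the child chain: 1 if both parents contain $H$, and 1/2 on average if only one parent does
- schema $H$ is divided during crossover with probability
  $$P_{\mathrm{div}} = \frac{d(H) - 1}{l - 1}$$
- hence the probability of survival during recombination is bounded by $P_A$ plus $\tfrac{1}{2} P_B$ times the probability that $H$ is not divided:
  $$\begin{aligned}
  W &\ge \left(\frac{O(H,t)\,f(H)}{N \cdot f_\mu}\right)^2 + \frac{2}{2}\left(\frac{O(H,t)\,f(H)}{N \cdot f_\mu}\right)\left(1 - \frac{O(H,t)\,f(H)}{N \cdot f_\mu}\right)\left(1 - \frac{d(H) - 1}{l - 1}\right) \\
    &\ge \frac{O(H,t)\,f(H)}{N \cdot f_\mu}\left(1 - \frac{d(H) - 1}{l - 1}\left(1 - \frac{O(H,t)\,f(H)}{N \cdot f_\mu}\right)\right)
  \end{aligned}$$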
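To see how the diameter enters this bound, the sketch below evaluates it for a compact and a spread-out schema with the same selection probability. The function names and the example schemata are illustrative assumptions; each schema must contain at least one fixed bit.

```python
def diameter(schema):
    """d(H): length of the shortest substring containing all fixed bits."""
    fixed = [i for i, s in enumerate(schema) if s != "."]
    return fixed[-1] - fixed[0] + 1

def recombination_survival_bound(schema, p_select):
    """Lower bound P * (1 - P_div * (1 - P)) on the survival probability W."""
    l = len(schema)
    p_div = (diameter(schema) - 1) / (l - 1)   # probability that H is divided
    return p_select * (1 - p_div * (1 - p_select))

# Same selection probability, different diameters (hypothetical schemata).
compact = recombination_survival_bound("..11....", 0.25)
spread = recombination_survival_bound("1......1", 0.25)
```

With $P = 0.25$ the compact schema keeps a bound of about 0.22, while the schema spanning the whole chain drops to $P^2 = 0.0625$, since any crossover point divides it.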
Mutation
- a schema $H$ with $b(H)$ fixed bits survives mutation with probability $(1-p)^{b(H)}$, where $p$ is the mutation probability per bit
- hence the total survival probability is
  $$W' \ge \frac{O(H,t)\,f(H)}{N \cdot f_\mu}\left(1 - \frac{d(H) - 1}{l - 1}\left(1 - \frac{O(H,t)\,f(H)}{N \cdot f_\mu}\right)\right)(1-p)^{b(H)}$$
- Schema theorem: if $N$ new chains are generated in generation $t$, then the mean number of chains containing $H$ in generation $t+1$ is
  $$\langle O(H,t+1) \rangle = N \cdot W' \ge N \cdot P \cdot \bigl(1 - P_{\mathrm{div}}(1 - P)\bigr)(1-p)^{b(H)}$$
  with
  $$P = \frac{O(H,t)\,f(H)}{N \cdot f_\mu} \quad\text{and}\quad P_{\mathrm{div}} = \frac{d(H) - 1}{l - 1}$$
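The bound can be evaluated directly. The following sketch computes $N \cdot W'$ from the lower bound on $W'$ given above; the function name, the argument names and the example numbers are assumptions made for illustration.

```python
def schema_theorem_bound(count, f_h, f_mu, n, d, l, b, p_mut):
    """Lower bound on the mean number of chains containing H at time t + 1.

    count = O(H,t), f_h = f(H), f_mu = mean fitness, n = population size,
    d = diameter d(H), l = chain length, b = number of fixed bits b(H),
    p_mut = mutation probability per bit.
    """
    p_sel = count * f_h / (n * f_mu)           # P
    p_div = (d - 1) / (l - 1)                  # P_div
    w_bound = p_sel * (1 - p_div * (1 - p_sel)) * (1 - p_mut) ** b
    return n * w_bound

# Hypothetical numbers: 20 of 100 chains carry H, with fitness 20% above the mean.
expected = schema_theorem_bound(count=20, f_h=1.2, f_mu=1.0, n=100,
                                d=3, l=10, b=2, p_mut=0.01)
```

For these numbers the bound is about 19.5 chains, so an above-average fitness alone does not guarantee growth of the schema once crossover and mutation losses are included.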
Conclusions from the schema theorem
- chains with higher fitness and small diameter are favored and reproduce with higher probability
- too high a mutation rate $p$ destroys every schema
- if a schema $H$ is well represented in a population, it reproduces better even at medium fitness
- if $f(H) = f_\mu$, then
  $$1 - P_{\mathrm{div}}(1 - P) = 1 - P_{\mathrm{div}}\left(1 - \frac{O(H,t)}{N}\right)$$
  i.e. schemata with high $O(H,t)/N$ are rarely cut during crossover and survive; hence the algorithm converges to these patterns of mean fitness ⇒ genetic drift (corresponds to a random walk in search space)
- genetic drift loses bit patterns, which can be countered by higher mutation rates
- the correct balance between mutation and crossover is important, but depends on the problem
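Two of these conclusions can be made concrete with a small calculation: the mutation factor $(1-p)^{b(H)}$ for increasing $p$, and the crossover factor for a schema of mean fitness as its share $O(H,t)/N$ grows. The function names and the sample values below are assumptions for illustration only.

```python
def mutation_survival(p_mut, b):
    """Probability (1 - p)^b that all b fixed bits escape mutation."""
    return (1 - p_mut) ** b

def crossover_factor(p_div, share):
    """1 - P_div * (1 - O(H,t)/N), valid for the case f(H) = f_mu."""
    return 1 - p_div * (1 - share)

# Schema with 10 fixed bits under increasing mutation rates.
survival = [mutation_survival(p, b=10) for p in (0.001, 0.01, 0.1, 0.5)]

# Schema of mean fitness with P_div = 0.5 and increasing population share.
factors = [crossover_factor(0.5, share) for share in (0.1, 0.5, 0.9)]
```

At $p = 0.5$ fewer than one in a thousand copies of the schema survive mutation, whereas a schema held by 90% of the population loses only 5% to crossover even with $P_{\mathrm{div}} = 0.5$.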
Genetic programming
- generalization of genetic algorithms to automatically develop computer programs (CPs)
- each structure of the population represents a single CP
- properties of CPs
  - hierarchical operations
  - alternative calculation using conditions
  - iterative calculations
  - manipulation of different data types
  - procedural calculations
Genetic programming: coding
representation of a CP by a tree, e.g.
    if a > b then x := 3
can be represented by the tree
          if
        ↙    ↘
       >      :=
      ↙ ↘    ↙ ↘
     a   b  x   3
- the search for a CP that solves a given problem can be performed using the principles of genetic algorithms
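One possible encoding of such a tree is a nested tuple `(operator, child, ...)`. The sketch below represents the example program this way and evaluates it against a variable environment; the tuple layout, the `evaluate` function and the environment dictionary are assumptions for this illustration, not a format prescribed by the lecture.

```python
def evaluate(node, env):
    """Evaluate a program tree given as nested tuples (operator, child, ...)."""
    if not isinstance(node, tuple):
        # Leaf: a variable name is looked up in env, a constant is returned as is.
        return env.get(node, node) if isinstance(node, str) else node
    op, *args = node
    if op == "if":
        condition, body = args
        if evaluate(condition, env):
            evaluate(body, env)
        return None
    if op == ">":
        return evaluate(args[0], env) > evaluate(args[1], env)
    if op == ":=":
        env[args[0]] = evaluate(args[1], env)   # args[0] is the target variable name
        return None
    raise ValueError(f"unknown operator: {op}")

# Tree for: if a > b then x := 3
program = ("if", (">", "a", "b"), (":=", "x", 3))
env = {"a": 5, "b": 2, "x": 0}
evaluate(program, env)   # a > b holds, so x is set to 3
```

Because every program is just a tree, genetic operators can work on subtrees without knowing what the operators mean.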
Genetic programming: crossover operator
parents generation
        A                    G
      ↙   ↘                ↙   ↘
     B     D              H     I
     ↓    ↙ ↘                  ↙ ↘
     C   E   F                J   K
                                 ↙ ↘
                                L   M
children generation (the subtrees rooted at B and K are exchanged)
        A                    G
      ↙   ↘                ↙   ↘
     K     D              H     I
    ↙ ↘   ↙ ↘                  ↙ ↘
   L   M E   F                J   B
                                  ↓
                                  C
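The subtree exchange can be sketched with trees stored as nested lists `[label, child, ...]`. The list layout, the path convention and the function names are assumptions for this illustration. The example also assumes that J and K hang below I in the second parent, which the flattened figure does not make fully clear, and it fixes the crossover points instead of drawing them at random.

```python
import copy

def get_subtree(tree, path):
    """Follow a path of child indices (index 0 of a node holds its label)."""
    for i in path:
        tree = tree[i]
    return tree

def set_subtree(tree, path, new):
    """Replace the subtree at a non-empty path with `new`."""
    parent = get_subtree(tree, path[:-1])
    parent[path[-1]] = new

def crossover(parent1, parent2, path1, path2):
    """Return two children obtained by swapping the subtrees at the given paths."""
    child1, child2 = copy.deepcopy(parent1), copy.deepcopy(parent2)
    sub1 = get_subtree(child1, path1)
    sub2 = get_subtree(child2, path2)
    set_subtree(child1, path1, sub2)
    set_subtree(child2, path2, sub1)
    return child1, child2

p1 = ["A", ["B", ["C"]], ["D", ["E"], ["F"]]]
p2 = ["G", ["H"], ["I", ["J"], ["K", ["L"], ["M"]]]]
# Swap the subtree rooted at B (path 1) with the one rooted at K (path 2, 2).
c1, c2 = crossover(p1, p2, path1=(1,), path2=(2, 2))
```

In a full genetic programming run the two paths would be chosen at random in each parent; the deep copies keep the parents intact for further selection.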
Genetic programming: mutation operator
a node is chosen randomly and replaced by a randomly generated subtree (here F is replaced by the subtree rooted at K):
        A                    A
      ↙   ↘                ↙   ↘
     B     D              B     D
     ↓    ↙ ↘             ↓    ↙ ↘
     C   E   F            C   E   K
                                 ↙ ↘
                                L   M
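A minimal sketch of this operator, again with trees stored as nested lists `[label, child, ...]`. The list layout, the label alphabet, the depth limit and the way random subtrees are grown are all assumptions chosen for the example; the root itself is never replaced here.

```python
import copy
import random

def all_paths(tree, prefix=()):
    """Yield the index path of every node except the root."""
    for i in range(1, len(tree)):
        yield prefix + (i,)
        yield from all_paths(tree[i], prefix + (i,))

def random_subtree(labels, depth, rng):
    """Grow a random subtree with at most two children per node."""
    node = [rng.choice(labels)]
    if depth > 0:
        for _ in range(rng.randint(0, 2)):
            node.append(random_subtree(labels, depth - 1, rng))
    return node

def mutate(tree, labels, rng, max_depth=2):
    """Return a copy of the tree with one random node replaced by a random subtree."""
    tree = copy.deepcopy(tree)
    path = rng.choice(list(all_paths(tree)))
    parent = tree
    for i in path[:-1]:
        parent = parent[i]
    parent[path[-1]] = random_subtree(labels, max_depth, rng)
    return tree

original = ["A", ["B", ["C"]], ["D", ["E"], ["F"]]]
mutant = mutate(original, labels=list("KLMXYZ"), rng=random.Random(1))
```

Limiting the depth of the generated subtree is one simple way to keep programs from growing without bound over many generations.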
Examples
- continuous example
  - global optimization of a continuous function $f : [a, b] \to \mathbb{R}$
- binary example
  - genetic Mastermind
  - select optimal guess using GA
- example from our research
  - perform overcomplete blind source separation by sparse component analysis
  - key problem: hyperplane detection
  - solution: optimize cost function using GAs
2d-function optimization
[Figure, left panel "multipeak": function plotted over x from −10 to 10, vertical axis from 5 to 50. Right panel "performance (optimal individual and mean)": f on a vertical axis from 20 to 50, horizontal axis from 0 to 100.]
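A minimal binary-coded GA for this kind of problem might look as follows. The objective function of the slide is not given in the text, so the multipeak function below is a made-up stand-in on [−10, 10]; the 16-bit encoding, population size, mutation rate and all names are likewise assumptions. Selection is fitness-proportional, reproduction uses one-point crossover followed by bit-flip mutation.

```python
import math
import random

BITS = 16
LO, HI = -10.0, 10.0

def decode(chain):
    """Map a bit string to a real number in [LO, HI]."""
    return LO + (HI - LO) * int(chain, 2) / (2 ** BITS - 1)

def fitness(chain):
    """Hypothetical multipeak function; always >= 10, so usable as a weight."""
    x = decode(chain)
    return 30.0 + 2.0 * x * math.cos(2.0 * x)

def crossover(a, b, rng):
    """One-point crossover: head of a, tail of b."""
    cut = rng.randint(1, BITS - 1)
    return a[:cut] + b[cut:]

def mutate(chain, p_mut, rng):
    """Flip each bit independently with probability p_mut."""
    return "".join(("1" if c == "0" else "0") if rng.random() < p_mut else c
                   for c in chain)

def run_ga(pop_size=50, generations=100, p_mut=0.01, seed=0):
    rng = random.Random(seed)
    pop = ["".join(rng.choice("01") for _ in range(BITS)) for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        weights = [fitness(c) for c in pop]
        pop = [mutate(crossover(*rng.choices(pop, weights=weights, k=2), rng),
                      p_mut, rng)
               for _ in range(pop_size)]
        # Track the best individual seen so far (it is not re-inserted).
        best = max(pop + [best], key=fitness)
    return decode(best), fitness(best)

x_best, f_best = run_ga()
```

Recording the best and the mean fitness of `pop` in every generation would give curves of the kind shown in the performance panel.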
Genetic Mastermind
Hyperplane detection
[Figure: three-dimensional plots, all axes ranging from −1 to 1]
- perform overcomplete blind source separation by sparse component analysis [Georgiev et al., 2004, Theis et al., 2004]
- key problem: hyperplane detection
- solution: optimize cost function using GAs
Conclusions
- genetic algorithms perform global optimization
- they mimic nature by letting a population evolve according to the fitness of its members
- algorithm
  - selection
  - reproduction: by crossover and mutation
- easy to apply in real-world situations
Resources
- books: [Goldberg, 1989, Schöneburg et al., 1994]
- Matlab GA optimization toolbox: http://www.ie.ncsu.edu/mirage/GAToolBox/gaot
- details and papers on my website: http://fabian.theis.name
- This research was supported by the DFG and BMBF.

References
P. Georgiev, F. Theis, and A. Cichocki. Sparse component analysis and blind source separation of underdetermined mixtures. IEEE Trans. on Neural Networks, in print, 2004.
D. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning. Addison Wesley Publishing, 1989.
E. Schöneburg, F. Heinzmann, and S. Feddersen. Genetische Algorithmen und Evolutionsstrategien. Addison Wesley Publishing, 1994.
F. Theis, P. Georgiev, and A. Cichocki. Robust overcomplete matrix recovery for sparse sources using a generalized Hough transform. In Proc. ESANN 2004, pages 343–348, Bruges, Belgium, 2004. d-side, Evere, Belgium. URL http://homepages.uni-regensburg.de/~thf11669/publications/theis04houghSCA ESANN04.pdf.