Selection Mechanisms & Initialization Strategies

Selection

• Selection Pressure:

– Degree to which phenotypic differences affect differences in reproductive success.

– High selective pressure:

• Small phenotypic diffs => large repro diffs

• Fisher’s Fundamental Theorem of Natural Selection:

– Evolutionary rate = f(reproductive-success variance)

• Reproductive success – Ability to pass genes to the next generation

– Nature: repro success = “fitness”

– EC: repro success ~ Expected Value (EV) = g(fitness)

• Conclusion:

– Nature: Evolutionary rate = f(fitness variance)

– EC: Evolutionary rate = f (EV variance)

– EC: Selection pressure -(+)-> EV variance -(+)-> Evolutionary rate

Selection Pressure in EC

• Two points of application:

– f_fitn : Phenotype |--> Fitness

– f_exp : Fitness |--> Expected Value

• In both cases:

– High selection pressure magnifies differences between points in the domain and range, while low selection pressure maintains or even reduces those differences.

– The net result is the same: selection pressure affects the degree to which phenotypic differences affect differences in the population from generation to generation (i.e., the rate of evolution).

[Figure: pipeline from Phenotype via f_fitn to Fitness, via f_exp to ExpVal, then normalized to roulette-wheel area]

Selection & Landscapes

[Figure: a fitness landscape (Phenotype vs. Fitness) mapped by f_exp to ExpVal landscapes under low and high selection pressure]

* A fitness landscape with high variance is a sign of high selection pressure via f_fitn.

* Fitness landscapes are static for most EC apps, but ExpVal landscapes are very dynamic.

Selection Mechanisms in EC

• Generally refers to f_exp : Fitness |--> Expected Value

• Exception: Tournament-style selection

– Fitness values of 2 or more potential parents are compared to choose a winner.

• Key Feature of a Good Selection Mechanism

– Maintain relatively constant selection pressure (i.e., ExpVal variance) throughout the run.

– Thus, evolution never becomes too static (no ExpVal variance) or too focused/converged (extreme ExpVal variance).

• Problem with EC simulations:

– Early: population fitness variance is high.

– Late: population fitness variance is low (convergence).

• Selection Mechanism Solution:

– Early: f_exp reduces differences.

– Late: f_exp magnifies differences.

[Figure: fitness landscape vs. ExpVal landscape, early and late in a run]

Early: Phenotypes are spread out. The best (relative) individuals are not that good on an absolute scale, so don’t reward them too much.

Late: Phenotypes are clustered about a pretty good spot. Now, small improvements should give high reproductive rewards in order to find the best solution in the region.


Selection Mechanism Comparison

• Assume a population of 10 individuals with fitness values 1…10.

• Each selection mechanism will map these fitness values to different areas on the roulette wheel.

• Notation:

EV_i = expected value of individual i = expected number of children that it will be the parent of in the next generation.

F_i = fitness of individual i.

F̄ = average fitness in the population.

σ² = fitness variance in the population.

N = population size.

Note: In the examples that follow, the expected values are computed and normalized, giving area on the roulette wheel. In practice, it is not always necessary to fully compute the expected values, since corresponding intermediate values may give the same areas on the wheel.


Fitness-Proportionate Selection

• Also called “roulette wheel” selection, even though many selection mechanisms use a stochastic roulette-wheel-like choice procedure for parents.

• In practice, there is no need to divide each term by the average fitness prior to normalizing.

[Figure: roulette wheel areas for individuals with fitness 1…10 under fitness-proportionate selection]

EV_i = F_i / F̄
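As a minimal sketch (Python, using the slide's illustrative fitness values 1…10), the wheel areas and a spin can be computed like this. The function and variable names are this sketch's own, not from the slides:

```python
import random

def wheel_areas(fitnesses):
    """Fitness-proportionate areas: EV_i = F_i / F-bar, normalized to sum to 1.
    Dividing by the average fitness cancels out, so we normalize directly."""
    total = sum(fitnesses)
    return [f / total for f in fitnesses]

def spin(areas, r=None):
    """Spin the wheel: a uniform r in [0, 1) falls into some individual's slice."""
    r = random.random() if r is None else r
    acc = 0.0
    for i, a in enumerate(areas):
        acc += a
        if r < acc:
            return i
    return len(areas) - 1  # guard against floating-point round-off

areas = wheel_areas(list(range(1, 11)))  # fitnesses 1..10
```

Note how individual 10 gets ten times the area of individual 1, which is exactly the high-early-pressure problem the later mechanisms address.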

Sigma-Scaling Selection

• Scaling by variance ensures that selection pressure is (a) not too high when there is a lot of variance, and (b) not too low when there is little variance.

• Thus, selection pressure is kept relatively stable throughout evolution.

[Figure: roulette wheel areas for fitnesses 1…10 under sigma scaling]

EV_i = 1 + (F_i − F̄) / (2σ)
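A minimal Python sketch of the sigma-scaling rule above; the σ = 0 fallback (every individual gets EV = 1) is a standard convention and an assumption here, not stated on the slide:

```python
import statistics

def sigma_scaled_evs(fitnesses):
    """EV_i = 1 + (F_i - mean) / (2 * sigma); all EVs are 1 when sigma = 0."""
    mean = statistics.mean(fitnesses)
    sigma = statistics.pstdev(fitnesses)  # population standard deviation
    if sigma == 0:
        return [1.0] * len(fitnesses)
    return [1.0 + (f - mean) / (2 * sigma) for f in fitnesses]
```

One practical caveat: when F_i is far below the mean the formula can go negative, and implementations typically clip such EVs to a small positive value before building the wheel.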


Boltzmann Selection

[Figure: roulette wheels at Temperature = 50 and Temperature = 10]

High Temp => Low Selection Pressure (Explorative)

EV_i = N · e^{F_i / T} / Σ_j e^{F_j / T}

T = temperature
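The Boltzmann formula above translates directly into a few lines of Python (a sketch; the function name is this example's own):

```python
import math

def boltzmann_evs(fitnesses, temperature):
    """EV_i = N * exp(F_i / T) / sum_j exp(F_j / T)."""
    n = len(fitnesses)
    weights = [math.exp(f / temperature) for f in fitnesses]
    z = sum(weights)
    return [n * w / z for w in weights]
```

Lowering T during a run moves the wheel smoothly from nearly uniform (explorative) toward winner-take-all (exploitative).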


Boltzmann Selection (2)

[Figure: roulette wheels at Temperature = 5, Temperature = 1, and Temperature = 0.5]

Low Temp => High Selection Pressure (Exploitative)


Rank Selection

• Here, the population is sorted by descending fitness values. So the first individual has the best fitness.

[Figure: roulette wheels for rank selection with Range = [10, 11] (lower selection pressure) and Range = [1, 2] (higher selection pressure)]

H = high value, L = low value, (L + H)/2 = average EV

EV_i = H − (i − 1)(H − L) / (N − 1)
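With the population sorted best-first, the linear-rank formula above becomes (a sketch; names are this example's own):

```python
def rank_evs(n, low, high):
    """Population sorted by descending fitness; individual i (1-based,
    best first) gets EV_i = H - (i - 1) * (H - L) / (N - 1)."""
    return [high - (i - 1) * (high - low) / (n - 1) for i in range(1, n + 1)]
```

The best individual gets exactly H, the worst exactly L, and the EVs average to (L + H)/2, so a narrow range such as [10, 11] flattens the wheel regardless of the raw fitness spread.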

Tournament Selection

• Randomly pick a tournament group, G, of K (tournament size) individuals.

• With probability p, pick the individual I in G with the highest fitness, and with probability 1 − p, randomly pick a member of G − {I}.

• Perform a second tournament to get the other parent.

• Low K => low selection pressure, since mediocre and poor individuals are only involved in small competitions and may easily win.

• High K => high selection pressure, since you have to be fitter than many others to reproduce.

• High p => high selection pressure: very little luck involved.

• Low p => lower selection pressure: bad individuals can get lucky.

[Figure: tournament groups of size K = 3 drawn from the population]
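The two-step procedure (sample a group, then a probabilistic winner) can be sketched in Python as follows; the parameter defaults are illustrative assumptions:

```python
import random

def tournament_select(population, fitness, k=3, p=0.9, rng=random):
    """Pick one parent: sample a group of k individuals, take the fittest
    with probability p, otherwise a random other member of the group."""
    group = rng.sample(population, k)
    best = max(group, key=fitness)
    if rng.random() < p or k == 1:
        return best
    rest = [g for g in group if g is not best]
    return rng.choice(rest)
```

Calling it twice yields the two parents for one mating; note that no global fitness normalization or wheel is needed, which is why this mechanism is the exception to f_exp.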

More Selection Mechanisms

(U, L) Selection (“mu, lambda”):

1. Choose the U best parents.

2. Combine them to produce L children.

3. Choose the U best children as the next generation.

(U + L) Selection:

Same as (U, L), but now, in step 3, we choose the U best among the U + L parents and children to form the next generation.

Elitism:

- Allow the K best individuals to be copied directly into the next generation (without crossover or mutation)

- Often used in combination with other selection mechanisms.

- Common to use K=1 elitism in GP.
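The three mechanisms above reduce to one-liners over sorted lists; this is a sketch with hypothetical function names, assuming higher fitness is better:

```python
def comma_selection(parents, children, fitness, u):
    """(U, L): the next generation is the U best of the L children only."""
    return sorted(children, key=fitness, reverse=True)[:u]

def plus_selection(parents, children, fitness, u):
    """(U + L): the U best of parents and children combined survive."""
    return sorted(parents + children, key=fitness, reverse=True)[:u]

def with_elitism(population, next_gen, fitness, k=1):
    """Copy the k best of the old population unchanged into the next one."""
    elites = sorted(population, key=fitness, reverse=True)[:k]
    return elites + next_gen[:len(next_gen) - k]
```

Note the design difference: plus-selection can never lose the best individual found so far, while comma-selection can, which helps escape local optima at the cost of occasional regression.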


GP Tree Population Initialization

• Generate random trees

• Max-depth -vs- Max-init-depth params

– Don’t create huge trees initially.

– Let smaller building blocks emerge and combine to form bigger trees.

• Full -vs- Grow modes

– Full: Generate complete trees, i.e., every leaf node is at max-init-depth. => Make random selections ONLY from the function set until reaching the max-init-depth level.

– Grow: Trees need not be complete => Make random selections from BOTH the terminal and function sets (unless at max-init-depth, where only terminals are chosen).

• Func-bias param - biases random choices toward/away from the function set.

Higher function bias => fuller trees.


Ramped Half-and-half GP Tree Initialization

• Ramped: Generate an equal number of trees of each size from 2 to max-init-depth.

• Half-and-half: For each size class, half full and half grow inits.

• E.g. Population = 100 & max-init-depth = 6

– 20 individuals in each size class: 2, 3, 4, 5, 6.

– 10 full + 10 grow in each class.

• Purpose: Starts the population off with a lot of diversity.

• Popular technique.

[Figure: example Full and Grow trees of size 2 and size 3]
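Putting the Full/Grow modes and the ramp together gives a short sketch; the function and terminal sets are hypothetical, and the fixed 0.5 terminal probability in grow mode stands in for the func-bias parameter:

```python
import random

FUNCTIONS = ['+', '-', '*']   # hypothetical binary function set
TERMINALS = ['x', 'y', '1']   # hypothetical terminal set

def gen_tree(depth, max_depth, mode, rng=random):
    """Full mode picks only functions until max depth; grow mode may
    pick a terminal early (0.5 here stands in for func-bias)."""
    if depth == max_depth or (mode == 'grow' and rng.random() < 0.5):
        return rng.choice(TERMINALS)
    return (rng.choice(FUNCTIONS),
            gen_tree(depth + 1, max_depth, mode, rng),
            gen_tree(depth + 1, max_depth, mode, rng))

def ramped_half_and_half(pop_size, max_init_depth, rng=random):
    """Equal numbers of trees at each depth 2..max-init-depth,
    half built in full mode and half in grow mode."""
    depths = list(range(2, max_init_depth + 1))
    per_class = pop_size // len(depths)
    pop = []
    for d in depths:
        for i in range(per_class):
            mode = 'full' if i < per_class // 2 else 'grow'
            pop.append(gen_tree(1, d, mode, rng))
    return pop
```

With pop_size = 100 and max_init_depth = 6 this reproduces the slide's example: 20 trees per size class, 10 full and 10 grow.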


Linear GP Initialization

• If the linear genome represents a tree (i.e., has tree-based execution), then rules similar to GP tree initialization apply.

• If the linear genome has a linear or graph execution mode, then:

– Insert a fixed header. For instance, to declare registers and to initialize them with input values.

– Length is a randomly chosen L in [M, N], a program size range.

– For I = 1 to L

• Randomly select an operator

• Randomly select registers and/or constants for its operands

• Insert the operation code into the program sequence

– Insert a fixed footer. For instance, this might always return the value in register 0 as the output of the function.
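The header/body/footer recipe above can be sketched like this; the operator set, register count, and tuple encoding are all assumptions of this example:

```python
import random

OPERATORS = ['ADD', 'SUB', 'MUL']  # hypothetical register-machine op set
NUM_REGS = 4

def random_linear_program(min_len, max_len, rng=random):
    """Fixed header, a random-length body of random register ops, fixed footer."""
    program = [('HEADER',)]  # e.g. declare registers, load input values
    body_len = rng.randint(min_len, max_len)  # L chosen in the range [M, N]
    for _ in range(body_len):
        op = rng.choice(OPERATORS)
        dst, src1, src2 = (rng.randrange(NUM_REGS) for _ in range(3))
        program.append((op, dst, src1, src2))
    program.append(('RETURN', 0))  # footer: output is register 0
    return program
```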


Graph-Based GP Initialization

When the STATIC genome structure is a graph, e.g. PADO:

• Insert fixed nodes such as START, END, READ, WRITE.

• Randomly select the total number of nodes, N.

• For I = 1 to N

– Generate a new node and randomly select a primitive function for it to perform.

• Randomly select the total number of connection arcs, C.

• For J = 1 to C

– Generate random source and destination nodes for each arc

• For I = 1 to N

– For node I, randomly generate decision rules for choosing among its departing arcs.

• This is not the exact PADO initialization procedure, but merely an abstract sketch of one possibility.
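In the same spirit as the slide's abstract sketch (and, like it, not PADO's actual procedure), the node-then-arc steps might look like:

```python
import random

PRIMITIVES = ['READ', 'WRITE', 'ADD', 'NOT']  # hypothetical node functions

def random_program_graph(max_nodes, max_arcs, rng=random):
    """Fixed START/END nodes, N random function nodes, C random arcs."""
    n = rng.randint(1, max_nodes)
    nodes = ['START'] + [rng.choice(PRIMITIVES) for _ in range(n)] + ['END']
    c = rng.randint(1, max_arcs)
    arcs = [(rng.randrange(len(nodes)), rng.randrange(len(nodes)))
            for _ in range(c)]
    return nodes, arcs
```

Per-node decision rules for choosing among departing arcs would be generated in a further pass over the node list.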

Fitness Functions

• Error-based

– Fitness inversely proportional to total error on the test data.

– E.g. symbolic regression, classification, image compression, multiplexer design…

• Cost-based

– Fitness inversely proportional to use of resources (e.g. time, space, money, materials, tree nodes).

– E.g. truck-backing, broom-balancing, energy network design…

• Benefit-based

– Fitness proportional to accrued resources or other benefits.

– E.g. foraging, investment strategies.

• Parsimony-based

– Fitness partly proportional to the simplicity of the phenotypes.

– E.g. sorting algorithms, data compression…

• Entropy-based

– Fitness directly or inversely proportional to the statistical entropy of a set of collections: −Σ_i p_i log(p_i)

– E.g. random sequence generators, clustering algorithms, decision trees.
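For the entropy-based case, the Shannon entropy of one collection follows directly from the formula (a sketch; base-2 logs are an assumption, any base works up to scaling):

```python
import math

def entropy(counts):
    """Shannon entropy -sum_i p_i * log2(p_i) of a collection's class counts."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)
```

A clustering fitness would want this low within each cluster; a random-sequence-generator fitness would want it high.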

Page 20: Selection Mechanisms & Initialization Strategies

Population Replacement• Generational (Batch)

– Repeat (for each generation or until other stop criteria are true):

• Evaluate the fitness of each individual

• Choose many parents for reproduction (some are chosen many times)

• Replace most or all of the population with the children

* Most common approach.
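One generational step can be sketched as below; the selection, crossover, and mutation operators are passed in as hypothetical callables, since any of the mechanisms above could fill those roles:

```python
import random

def generational_step(population, fitness, select_parent, crossover, mutate,
                      rng=random):
    """One batch generation: evaluate all individuals, pick parents with
    replacement (good ones may be chosen many times), replace the whole
    population with the children."""
    scored = [(fitness(ind), ind) for ind in population]
    next_gen = []
    while len(next_gen) < len(population):
        mom = select_parent(scored, rng)
        dad = select_parent(scored, rng)
        next_gen.append(mutate(crossover(mom, dad), rng))
    return next_gen
```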

• Steady-State (Incremental)

– Repeat (for fixed # rounds or until other stop criteria are true)

• Select a random group of parents, G.

• Evaluate the fitness of each parent in G

• Select the N parents with the highest fitness in G.

• Let the N good parents reproduce to form C children

• Replace the C lowest-fitness parents in G with the children
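The steady-state round above maps onto a short in-place update; the `breed` callable and the tie between C children and the C worst group members follow the bullet list, while the names are this sketch's own:

```python
import random

def steady_state_round(population, fitness, group_size, n_parents, breed,
                       rng=random):
    """One incremental round: sample a group G, breed its n_parents best,
    and overwrite the lowest-fitness members of G with the children."""
    idxs = rng.sample(range(len(population)), group_size)
    idxs.sort(key=lambda i: fitness(population[i]), reverse=True)
    parents = [population[i] for i in idxs[:n_parents]]
    children = breed(parents)  # produces C offspring
    for i, child in zip(idxs[-len(children):], children):
        population[i] = child
```

Unlike the generational step, only a handful of fitness evaluations happen per round and most of the population survives unchanged, which is why this style is called incremental.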