1

Simulated Annealing (Reading – Section 10.9 of NRC)

Genetic Algorithms

Optimisation Methods

2

Simulated Annealing

The optimisation methods covered to date only find the minimum of the current basin on the hyper-surface

SA (and GAs) are optimisation methods that can handle multiple local minima.

Analogy with thermodynamics: the cooling and annealing of metals, the cooling and freezing of liquids

Must provide the following elements:

A description of possible system states

A generator of random changes in the system (options for next system state)

An objective function (analogue of system energy)

A control parameter (analogue of temperature) and a cooling schedule which describes how the control parameter is lowered from high to low values.
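The four elements listed above can be sketched in a few lines of Python. This is a minimal illustration, not the method from the slides: the Metropolis acceptance rule, the geometric cooling schedule, the test function `bumpy` and all parameter values are assumptions chosen for the example.

```python
import math
import random

def simulated_annealing(energy, neighbour, state, t_start=1.0, t_end=1e-3,
                        cooling=0.95, steps_per_temp=100, seed=0):
    """Minimise `energy`; returns the best state seen and its energy."""
    rng = random.Random(seed)
    current, current_e = state, energy(state)
    best, best_e = current, current_e
    t = t_start                                # control parameter ("temperature")
    while t > t_end:
        for _ in range(steps_per_temp):
            candidate = neighbour(current, rng)    # generator of random changes
            candidate_e = energy(candidate)        # objective function
            delta = candidate_e - current_e
            # Assumed Metropolis rule: always accept downhill moves,
            # accept uphill moves with probability exp(-delta / t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, current_e = candidate, candidate_e
                if current_e < best_e:
                    best, best_e = current, current_e
        t *= cooling                           # assumed geometric cooling schedule
    return best, best_e

def bumpy(x):
    """Hypothetical 1D objective with several local minima."""
    return x * x + 10.0 * math.sin(3.0 * x)

def step(x, rng):
    """System state is a single float; a change is a small random shift."""
    return x + rng.uniform(-0.5, 0.5)

best_x, best_value = simulated_annealing(bumpy, step, state=4.0)
```

Because uphill moves are sometimes accepted while the temperature is high, the search can leave the basin it started in, which plain downhill methods cannot do.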

11

Genetic Algorithms in a slide

Premise: evolution worked once (it produced us!), so it might work again

Basics:

Pool of solutions

Mate existing solutions to produce new solutions

Mutate current solutions for long-term diversity

Cull population

12

Genetic Algorithms in a slide

randomly initialise a pool of solutions

Next_Generation:

  Mutation_Loop
    select a solution from pool using relative fitness
    mutate solution and save
  end_Loop

  Crossover_Loop
    select pairs of solutions from pool using relative fitness
    crossover and save both child solutions
  end_Loop

  Termination_Check
    if not finished
      create new pool from saved solutions
      goto Next_Generation

GA Demo

13

Originator

John Holland

Seminal work: Adaptation in Natural and Artificial Systems (1975), which introduced the main GA concepts

14

Introduction

Computing pioneers (especially in AI) looked to natural systems as guiding metaphors

Evolutionary computation: any biologically motivated computing activity simulating natural evolution

Genetic Algorithms are one form of this activity

Original goals:

Formal study of the phenomenon of adaptation (John Holland)

An optimization tool for engineering problems

15

Main idea

Take a population of candidate solutions to a given problem

Use operators inspired by the mechanisms of natural genetic variation

Apply selective pressure toward certain properties

Evolve a more fit solution

16

Why evolution as a metaphor

Ability to efficiently guide a search through a large solution space

Ability to adapt solutions to changing environments

“Emergent” behavior is the goal

“The hoped-for emergent behavior is the design of high-quality solutions to difficult problems and the ability to adapt these solutions in the face of a changing environment”

Melanie Mitchell, An Introduction to Genetic Algorithms

17

Evolutionary terminology

Abstractions imported from biology:

Chromosomes, Genes, Alleles

Fitness, Selection

Crossover, Mutation

18

GA terminology

In the spirit – but not the letter – of biology

GA chromosomes are strings of genes

Each gene has a number of alleles; i.e., settings

Each chromosome is an encoding of a solution to a problem

A population of such chromosomes is operated on by a GA

19

Encoding

A data structure for representing candidate solutions

Often takes the form of a bit string

Usually has internal structure (i.e., different parts of the string represent different aspects of the solution)

20

Crossover

Mimics biological recombination

Some portion of genetic material is swapped between chromosomes

Typically the swapping produces an offspring

Mechanism for the dissemination of “building blocks” (schemas)

21

Mutation

Selects a random locus – gene location – with some probability and alters the allele at that locus

The intuitive mechanism for the preservation of variety in the population

22

Fitness

A measure of the goodness of the organism

Expressed as the probability that the organism will live another cycle (generation)

Basis for the natural selection simulation

Organisms are selected to mate with probabilities proportional to their fitness

Probabilistically better solutions have a better chance of conferring their building blocks to the next generation (cycle)
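Selection with probability proportional to fitness is often implemented as a "roulette wheel". The sketch below is one possible version; the function name and the assumption that fitnesses are non-negative are illustrative, not taken from the slides.

```python
import random

def select_parent(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness.

    Assumes every fitness is >= 0 and at least one is > 0.
    """
    pick = rng.uniform(0.0, sum(fitnesses))   # a point on the wheel
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit                        # each slice is as wide as the fitness
        if pick < running:
            return individual
    return population[-1]                     # guard against floating-point round-off

rng = random.Random(1)
draws = [select_parent(["a", "b"], [1.0, 3.0], rng) for _ in range(4000)]
share_of_b = draws.count("b") / len(draws)    # should be close to 3 / (1 + 3)
```

An individual with zero fitness is never chosen, and doubling an individual's fitness doubles its expected number of matings.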

23

A Simple GA

Generate initial population
do
  Calculate the fitness of each member
  // simulate another generation
  do
    Select parents from current population
    Perform crossover, add offspring to the new population
  while new population is not full
  Merge new population into the current population
  Mutate current population
while not converged
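A runnable sketch of this loop in Python follows. Several details are assumptions made for the example, since the slide does not fix them: chromosomes are bit lists, the toy fitness `one_max` counts 1-bits, the new population simply replaces the old one instead of being merged, and a fixed number of generations stands in for the convergence test.

```python
import random

def one_max(bits):
    """Toy fitness (an assumption for illustration): count the 1-bits."""
    return sum(bits)

def simple_ga(fitness, n_bits=20, pop_size=30, p_mut=0.01, generations=60, seed=0):
    rng = random.Random(seed)
    # Generate initial population
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        # Calculate the fitness of each member (tiny offset avoids all-zero weights)
        weights = [fitness(ind) + 1e-9 for ind in pop]
        new_pop = []
        while len(new_pop) < pop_size:
            # Select parents with probability proportional to fitness
            mum, dad = rng.choices(pop, weights=weights, k=2)
            # Single-point crossover
            cut = rng.randint(1, n_bits - 1)
            new_pop.append(mum[:cut] + dad[cut:])
        # Assumed replacement policy: offspring replace the old population,
        # then each bit flips with probability p_mut.
        pop = [[bit ^ 1 if rng.random() < p_mut else bit for bit in ind]
               for ind in new_pop]
        best = max(pop + [best], key=fitness)   # remember the best seen so far
    return best

best = simple_ga(one_max)
```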

24

How do GAs work

The structure of a GA is simple to comprehend, but the dynamic behavior is complex

Holland has done significant work on the theoretical foundations of GAs

“GAs work by discovering, emphasizing, and recombining good ‘building blocks’ of solutions in a highly parallel fashion.”

Melanie Mitchell, paraphrasing John Holland

Using formalism:

Notion of a building block is formalized as a schema

Schemas are propagated or destroyed according to the laws of probability

25

Genetic algorithm (GA), I

GA works by using a large population to explore many options in parallel.

The state of a GA is given by a population, with each member of the population being a complete set of parameters

26

Genetic algorithm - Overview

The whole population is updated in generations by four steps:

Fitness: evaluate the function being searched

Reproduction: the members of the new population are selected based on their fitness. Members with a low fitness might disappear, and those with a high fitness can be duplicated.

Crossover: After two parents are randomly chosen based on their fitness, the offspring gets its parameter values based on some kind of random selection from the parents.

Mutation: randomly or in some other way change the parameter values

27

Genetic algorithm - Overview

Distribution of Individuals in Generation 0

Distribution of Individuals in Generation N

28

Example - Cumulative Selection

Target sentence: "Methinks it is like a weasel"

28 characters including blanks, each drawn from 27 possibilities (26 letters plus blank): 27^28 random cases.

Starting from random sentences, we can find the desired sentence by the following procedure:

1. Generate 10 sentences of 28 randomly chosen characters
2. Select the sentence that has the most correct letters
3. Duplicate this best sentence ten times
4. For each duplicate, randomly replace a few letters (mutation rate)
5. Repeat steps 2-4 until the target sentence is matched.

The Weasel Applet: http://home.pacbell.net/s-max/scott/weasel.html
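The procedure can be sketched in Python as follows. One deviation is an assumption made so the run always terminates: one of the ten duplicates is kept unmutated, so the best score never drops. The mutation rate of 0.05 per character is likewise illustrative.

```python
import random
import string

TARGET = "METHINKS IT IS LIKE A WEASEL"
ALPHABET = string.ascii_uppercase + " "      # 26 letters plus blank

def score(sentence):
    """Number of characters that already match the target."""
    return sum(a == b for a, b in zip(sentence, TARGET))

def mutate(sentence, rate, rng):
    """Replace each character with a random one with probability `rate`."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else ch
                   for ch in sentence)

def weasel(copies=10, rate=0.05, seed=0, max_generations=100_000):
    rng = random.Random(seed)
    # Step 1: random starting sentences
    pool = ["".join(rng.choice(ALPHABET) for _ in TARGET) for _ in range(copies)]
    for generation in range(max_generations):
        best = max(pool, key=score)          # step 2: most correct letters
        if best == TARGET:
            return generation, best
        # Steps 3-4: duplicate the best and mutate the duplicates
        # (assumption: one duplicate is left untouched).
        pool = [best] + [mutate(best, rate, rng) for _ in range(copies - 1)]
    return max_generations, max(pool, key=score)
```

Calling `weasel()` returns the number of generations used and the final sentence. Cumulative selection needs vastly fewer trials than the 27^28 cases of blind guessing because each generation builds on the letters already matched.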

29

Genetic algorithm - Overview

Maximization problem in 2D space (0 < x < 1, 0 < y < 1)

Encoding:

Individual: a point at (x, y)

P1 = (0.14429628, 0.72317247), P2 = (0.71281369, 0.83459991)

Encoding to chromosome-like string:

P1 = “1442962872317247”, P2 = “7128136983459991”

30

Genetic algorithm - Overview

Breeding

Crossover:

P1 = 1442962872317247 -> O1 = 1448136983459991

P2 = 7128136983459991 -> O2 = 7122962872317247

Mutation:

O2 = 7122962872317247 -> O2 = 7122962878317247

Decoding:

O1 = 1448136983459991 -> (0.14481369, 0.83459991)

O2 = 7122962878317247 -> (0.71229628, 0.78317247)
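The worked example can be reproduced with a few helper functions. The function names are hypothetical; the cut point (after the third digit) and the mutated locus (the tenth digit) are read off from the strings on the slides, not stated there.

```python
def encode(point, digits=8):
    """Concatenate the decimal digits of each coordinate in (0, 1)."""
    return "".join(f"{value:.{digits}f}"[2:] for value in point)

def decode(chromosome, digits=8):
    """Split the digit string back into coordinates."""
    return tuple(float("0." + chromosome[i:i + digits])
                 for i in range(0, len(chromosome), digits))

def crossover(a, b, cut):
    """Single-point crossover: swap everything after position `cut`."""
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def mutate_digit(chromosome, locus, new_digit):
    """Replace the digit at index `locus` (0-based)."""
    return chromosome[:locus] + new_digit + chromosome[locus + 1:]

p1 = encode((0.14429628, 0.72317247))
p2 = encode((0.71281369, 0.83459991))
o1, o2 = crossover(p1, p2, cut=3)
o2 = mutate_digit(o2, locus=9, new_digit="8")
children = decode(o1), decode(o2)
```

Note that the crossover changes only x in the first child's decoded point relative to P2, while the mutation changes a leading digit of y in the second child, so the size of a jump in (x, y) depends strongly on which digit is hit.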

31

Genetic algorithm - Overview

Elitism: store away the parameters defining the fittest member of the current population, and later copy it intact into the offspring population

Variable mutation rate:

At any given time, keep track of the fitness value of the fittest population member and of the median-ranked member.

The fitness difference f between those two individuals is a measure of population convergence.

If f becomes too small, increase the mutation rate. If f becomes too large, decrease the mutation rate.
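The variable mutation rate rule can be sketched as below. The thresholds `low` and `high`, the adjustment `factor` and the clamping bounds are hypothetical values; the slides only say "too small" and "too large".

```python
import statistics

def adapt_mutation_rate(fitnesses, rate, low=0.05, high=0.25, factor=1.5,
                        min_rate=0.001, max_rate=0.5):
    """Return an updated mutation rate from the current population's fitnesses."""
    # Fitness difference between the fittest and the median-ranked member
    gap = max(fitnesses) - statistics.median(fitnesses)
    if gap < low:        # population has converged: add diversity
        rate *= factor
    elif gap > high:     # population is spread out: mutate less
        rate /= factor
    return min(max(rate, min_rate), max_rate)
```

For example, `adapt_mutation_rate([1.0, 1.0, 1.01], 0.01)` raises the rate because the best and median members are nearly identical.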

32

Genetic algorithm - Overview

Hamming wall & creep mutation

Example: optimal = 21000, current = 19994

Choose a digit

Instead of replacing the digit, add either 1 or -1

Example: creep mutation hitting the middle “9” with +1 gives 20094
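A small sketch of why this helps, with hypothetical function names. In the example, 19994 is numerically close to 21000 but differs from it in every digit, so single-digit replacement has a long way to go; creep mutation adds or subtracts one unit at the chosen decimal place and lets the carry ripple through.

```python
def hamming_distance(a, b):
    """Number of digit positions in which two equal-length numbers differ."""
    return sum(x != y for x, y in zip(str(a), str(b)))

def creep_mutation(value, position, step):
    """Add `step` (+1 or -1) to the digit at `position` (0 = leftmost), with carry.

    Assumes `value` is a non-negative integer.
    """
    n_digits = len(str(value))
    place = 10 ** (n_digits - 1 - position)   # weight of the chosen digit
    return value + step * place

wall = hamming_distance(19994, 21000)         # every digit differs
child = creep_mutation(19994, position=2, step=+1)
```

Here `child` is 20094, which is closer to the optimum than the parent even though no single digit was replaced by its target value.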

33

Schema

A template, much like a regular expression, describing a set of strings

The set of strings represented by a given schema characterizes a set of candidate solutions sharing a property

This property is the encoded equivalent of a building block

34

Example

0 or 1 represents a fixed bit

Asterisk represents a “don’t care”

11****00 is the set of all solutions encoded in 8 bits, beginning with two ones and ending with two zeros

Solutions in this set all share the same variants of the properties encoded at these loci

35

Schema qualifiers

Length: the inclusive distance between the two bits in a schema which are furthest apart (the defining length of the previous example is 8)

Order: the number of fixed bits in a schema (the order of the previous example is 4)
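These definitions are easy to check in code. The sketch below uses hypothetical helper names and follows the inclusive length convention stated on this slide; note that many texts instead define the defining length as the plain distance between the outermost fixed positions, which would give 7 for the same schema.

```python
def matches(schema, bits):
    """True if the bit string is an instance of the schema ('*' = don't care)."""
    return len(schema) == len(bits) and all(
        s in ("*", b) for s, b in zip(schema, bits))

def order(schema):
    """Number of fixed (non-'*') positions."""
    return sum(c != "*" for c in schema)

def inclusive_length(schema):
    """Inclusive distance between the outermost fixed positions (slide convention)."""
    fixed = [i for i, c in enumerate(schema) if c != "*"]
    return fixed[-1] - fixed[0] + 1 if fixed else 0

schema = "11****00"
summary = (order(schema), inclusive_length(schema), matches(schema, "11010100"))
```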

36

Not just sum of the parts

GAs explicitly evaluate and operate on whole solutions

GAs implicitly evaluate and operate on building blocks

Existing schemas may be destroyed or weakened by crossover

New schemas may be spliced together from existing schemas

Crossover includes no notion of a schema – only of the chromosomes

37

Why do they work

Schemas can be destroyed or conserved

So how are good schemas propagated through generations?

Conserved – good – schemas confer higher fitness on the offspring inheriting them

Fitter offspring are probabilistically more likely to be chosen to reproduce

38

Approximating schema dynamics

Let H be a schema with at least one instance present in the population at time t

Let m(H, t) be the number of instances of H at time t

Let x be an instance of H and f(x) be its fitness

The expected number of offspring of x is f(x)/f(pop), where f(pop) is the average fitness of the population (by fitness proportionate selection)

To know E(m(H, t+1)) (the expected number of instances of schema H at the next time unit), sum f(x)/f(pop) for all x in H

GA never explicitly calculates the average fitness of a schema, but schema proliferation depends on its value
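Written as a formula, the sum looks as follows. The notation is an assumption introduced here: \bar{f}(t) stands for the population's average fitness at time t (the slide's f(pop)), and \hat{u}(H, t) for the observed average fitness of the instances of H in the population. Crossover and mutation are ignored at this stage.

```latex
E\big[m(H, t+1)\big]
  \;=\; \sum_{x \in H} \frac{f(x)}{\bar{f}(t)}
  \;=\; \frac{\hat{u}(H, t)}{\bar{f}(t)}\, m(H, t),
\qquad
\hat{u}(H, t) \;=\; \frac{1}{m(H, t)} \sum_{x \in H} f(x)
```

The second equality is why proliferation depends on the schema's average fitness even though the GA never computes it: a schema whose instances are fitter than the population average is expected to gain instances.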

39

Approximating schema dynamics

Approximation can be refined by taking into account the operators

Schemas of long defining length are less likely to survive crossover

Offspring are less likely to be instances of such schemas

Schemas of higher order are less likely to survive mutation

Effects can be used to bound the approximate rates at which schemas proliferate
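One common way to state the resulting bound is given below, under assumptions not spelled out on the slides: single-point crossover applied with probability p_c, independent bitwise mutation with probability p_m, strings of l bits, o(H) the order of H, and d(H) the defining length measured as the distance between the outermost fixed positions. As before, \bar{f}(t) is the population's average fitness and \hat{u}(H, t) the observed average fitness of instances of H.

```latex
E\big[m(H, t+1)\big]
  \;\ge\; \frac{\hat{u}(H, t)}{\bar{f}(t)}\, m(H, t)
  \left(1 - p_c\, \frac{d(H)}{l - 1}\right)
  \left(1 - p_m\right)^{o(H)}
```

The crossover factor shrinks as d(H) grows and the mutation factor shrinks as o(H) grows, matching the two survival statements above. It is only a lower bound because it counts schema destruction but not the creation of new instances by the operators.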

40

Implications

Instances of short, low-order schemas whose average fitness tends to stay above the mean will increase exponentially

Changing the semantics of the operators can change the selective pressures toward different types of schemas

41

Theoretical Foundations

Empirical observation: GAs can work

Goal: learn how to best use the tool

Strategy:

Understand the dynamics of the model

Develop performance metrics in order to quantify success

42

Theoretical Foundations

Issues surrounding the dynamics of the model:

What laws characterize the macroscopic behavior of GAs?

How do microscopic events give rise to this macroscopic behavior?

43

Theoretical Foundation

Holland’s motivation:

Construct a theoretical framework for adaptive systems as seen in nature

Apply this framework to the design of artificial adaptive systems

Issues in performance evaluation:

According to what criteria should GAs be evaluated?

What does it mean for a GA to do well or poorly?

Under what conditions is a GA an appropriate solution strategy for a problem?

44

Theoretical Foundation

Holland’s observations:

An adaptive system must persistently identify, test, and incorporate structural properties hypothesized to give better performance in some environment

Adaptation is impossible in a sufficiently random environment

45

Theoretical Foundation

Holland’s intuition:

A GA is capable of modeling the necessary tasks in an adaptive system

It does so through a combination of explicit computation and implicit estimation of state combined with incremental change of state in directions motivated by these calculations

46

Theoretical Foundation

Holland’s assertion:

The ‘identify and test’ requirement is satisfied by the calculation of the fitnesses of various schemas

The ‘incorporate’ requirement is satisfied by implication of the Schema Theorem

47

Theoretical Foundation

How does a GA identify and test properties?

A schema is the formalization of a property

A GA explicitly calculates fitnesses of individuals and thereby schemas in the population

It implicitly estimates fitnesses of hypothetical individuals sharing known schemas

In this way it efficiently manages information regarding the entire search space

48

Theoretical Foundation

How does a GA incorporate observed good properties into the population?

Implication of the Schema Theorem:

Short, low-order, higher than average fitness schemas will receive exponentially increasing numbers of samples over time

49

Theoretical Foundation

Lemmas to the Schema Theorem:

Selection focuses the search

Crossover combines good schemas

Mutation is the insurance policy

50

Theoretical Foundation

Holland’s characterization:

Adaptation in natural systems is framed by a tension between exploration and exploitation

Any move toward the testing of previously unseen schemas or of those with instances of low fitness takes away from the wholesale incorporation of known high fitness schemas

But without exploration, schemas of even higher fitness can not be discovered

51

Theoretical Foundation

Goal of Holland’s first offering:

The original GA was proposed as an “adaptive plan” for accomplishing a proper balance between exploration and exploitation

52

Theoretical Foundation

GA does in fact model this

Given certain assumptions, the balance is achieved

A key assumption is that the observed and actual fitnesses of schemas are correlated

This assumption creates a stumbling block to which we will return

53

Traveling Salesperson Problem

Find the minimum distance tour around a set of cities, visiting each city only once and ending back where you started from.
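The quantity being minimised can be sketched as a short function. The city coordinates below are hypothetical (four corners of a unit square), and straight-line distance is an assumption; the slides do not give a distance table.

```python
import math

def tour_length(tour, coords):
    """Total length of a closed tour, including the leg back to the start."""
    return sum(math.dist(coords[a], coords[b])
               for a, b in zip(tour, tour[1:] + tour[:1]))

# Hypothetical city positions for illustration only
coords = {1: (0.0, 0.0), 2: (1.0, 0.0), 3: (1.0, 1.0), 4: (0.0, 1.0)}
around_the_square = tour_length((1, 2, 3, 4), coords)
crossing_diagonals = tour_length((1, 3, 2, 4), coords)
```

A GA for this problem would use the negative or the reciprocal of `tour_length` as the fitness, since shorter tours are better.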

54

Initial Population for TSP

(5,3,4,6,2) (2,4,6,3,5) (4,3,6,5,2)

(2,3,4,6,5) (4,3,6,2,5) (3,4,5,2,6)

(3,5,4,6,2) (4,5,3,6,2) (5,4,2,3,6)

(4,6,3,2,5) (3,4,2,6,5) (3,6,5,1,4)

55

Select Parents

(5,3,4,6,2) (2,4,6,3,5) (4,3,6,5,2)

(2,3,4,6,5) (4,3,6,2,5) (3,4,5,2,6)

(3,5,4,6,2) (4,5,3,6,2) (5,4,2,3,6)

(4,6,3,2,5) (3,4,2,6,5) (3,6,5,1,4)

Try to pick the better ones.

56

Create Off-Spring – 1 point

(5,3,4,6,2) (2,4,6,3,5) (4,3,6,5,2)

(2,3,4,6,5) (4,3,6,2,5) (3,4,5,2,6)

(3,5,4,6,2) (4,5,3,6,2) (5,4,2,3,6)

(4,6,3,2,5) (3,4,2,6,5) (3,6,5,1,4)

(3,4,5,6,2)
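Plain one-point crossover can produce an invalid tour, because splicing two permutations may repeat some cities and drop others. The sketch below shows one common repair, which is an assumption and not necessarily the rule used on the slide: keep the head of the first parent, then append the missing cities in the order they appear in the second parent. The parent pairing in the example is a guess that happens to reproduce the offspring shown.

```python
def naive_crossover(parent_a, parent_b, cut):
    """Splice head of A onto tail of B; may repeat or drop cities."""
    return tuple(parent_a[:cut]) + tuple(parent_b[cut:])

def one_point_crossover(parent_a, parent_b, cut):
    """Head of A, then the cities not yet used, in B's order (always a valid tour)."""
    head = list(parent_a[:cut])
    tail = [city for city in parent_b if city not in head]
    return tuple(head + tail)

child = one_point_crossover((3, 4, 5, 2, 6), (3, 5, 4, 6, 2), cut=3)
broken = naive_crossover((5, 3, 4, 6, 2), (2, 4, 6, 3, 5), cut=2)
repaired = one_point_crossover((5, 3, 4, 6, 2), (2, 4, 6, 3, 5), cut=2)
```

Here `broken` visits cities 3 and 5 twice, while `repaired` is a proper permutation of the same five cities.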

57

Create More Offspring

(5,3,4,6,2) (2,4,6,3,5) (4,3,6,5,2)

(2,3,4,6,5) (4,3,6,2,5) (3,4,5,2,6)

(3,5,4,6,2) (4,5,3,6,2) (5,4,2,3,6)

(4,6,3,2,5) (3,4,2,6,5) (3,6,5,1,4)

(3,4,5,6,2) (5,4,2,6,3)

58

Mutate

(5,3,4,6,2) (2,4,6,3,5) (4,3,6,5,2)

(2,3,4,6,5) (4,3,6,2,5) (3,4,5,2,6)

(3,5,4,6,2) (4,5,3,6,2) (5,4,2,3,6)

(4,6,3,2,5) (3,4,2,6,5) (3,6,5,1,4)

(3,4,5,6,2) (5,4,2,6,3)

59

Mutate

(5,3,4,6,2) (2,4,6,3,5) (4,3,6,5,2)

(2,3,4,6,5) (2,3,6,4,5) (3,4,5,2,6)

(3,5,4,6,2) (4,5,3,6,2) (5,4,2,3,6)

(4,6,3,2,5) (3,4,2,6,5) (3,6,5,1,4)

(3,4,5,6,2) (5,4,2,6,3)
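Comparing the two Mutate slides, the only change is that (4,3,6,2,5) became (2,3,6,4,5), which is consistent with exchanging two cities. A swap mutation of that kind can be sketched as follows; the function names and the random variant are illustrative assumptions.

```python
import random

def swap_mutation(tour, i, j):
    """Exchange the cities at positions i and j; the result is still a valid tour."""
    cities = list(tour)
    cities[i], cities[j] = cities[j], cities[i]
    return tuple(cities)

def random_swap_mutation(tour, rng):
    """Swap two distinct, randomly chosen positions."""
    i, j = rng.sample(range(len(tour)), 2)
    return swap_mutation(tour, i, j)

mutated = swap_mutation((4, 3, 6, 2, 5), 0, 3)
shuffled = random_swap_mutation((4, 3, 6, 2, 5), random.Random(0))
```

Unlike flipping a digit in a string encoding, a swap can never create a duplicate city, so no repair step is needed after mutation.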

60

Eliminate

(5,3,4,6,2) (2,4,6,3,5) (4,3,6,5,2)

(2,3,4,6,5) (2,3,6,4,5) (3,4,5,2,6)

(3,5,4,6,2) (4,5,3,6,2) (5,4,2,3,6)

(4,6,3,2,5) (3,4,2,6,5) (3,6,5,1,4)

Tend to kill off the worst ones.

(3,4,5,6,2) (5,4,2,6,3)

61

Integrate

(5,3,4,6,2) (2,4,6,3,5)

(2,3,6,4,5) (3,4,5,2,6)

(3,5,4,6,2) (4,5,3,6,2) (5,4,2,3,6)

(4,6,3,2,5) (3,4,2,6,5) (3,6,5,1,4)

(3,4,5,6,2)

(5,4,2,6,3)

62

Restart

(5,3,4,6,2) (2,4,6,3,5)

(2,3,6,4,5) (3,4,5,2,6)

(3,5,4,6,2) (4,5,3,6,2) (5,4,2,3,6)

(4,6,3,2,5) (3,4,2,6,5) (3,6,5,1,4)

(3,4,5,6,2)

(5,4,2,6,3)

63

Genetic Algorithms

Facts:

Very robust but slow

Can make simulated annealing seem fast

In the limit, optimal

64

Other GA-TSP Possibilities

Ordinal Representation

Partially-Mapped Crossover

Edge Recombination Crossover

Problem: operators are not sufficiently exploiting the proper “building blocks” used to create new solutions.

65

Genetic Algorithms

Some ideas:

Parallelism

Punctuated equilibria

Jump starting

Problem-specific information

Synthesize with simulated annealing

Perturbation operator

66

Heuristic H

Let T be the optimal tour.

Length(MST) < Length(T)
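The inequality holds because deleting any one edge from a tour leaves a spanning tree, and the minimum spanning tree (MST) is no longer than any spanning tree. A small numerical check is sketched below; the point sets are hypothetical, Prim's algorithm is one standard way to build the MST, and the brute-force tour search is only practical for a handful of cities.

```python
import itertools
import math

def mst_length(points):
    """Total edge length of the MST, using Prim's algorithm on the complete graph."""
    # Cheapest known connection from the growing tree to each outside point
    best = {i: math.dist(points[0], points[i]) for i in range(1, len(points))}
    total = 0.0
    while best:
        nxt = min(best, key=best.get)     # closest point not yet in the tree
        total += best.pop(nxt)
        for i in best:                    # update connection costs via the new point
            best[i] = min(best[i], math.dist(points[nxt], points[i]))
    return total

def optimal_tour_length(points):
    """Shortest closed tour by exhaustive search (start fixed at point 0)."""
    n = len(points)
    def length(order):
        path = (0,) + order
        return sum(math.dist(points[path[k]], points[path[(k + 1) % n]])
                   for k in range(n))
    return min(length(p) for p in itertools.permutations(range(1, n)))

square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
scattered = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0), (0.0, 4.0), (1.0, 2.0)]
bounds = [(mst_length(p), optimal_tour_length(p)) for p in (square, scattered)]
```

For the unit square the MST has length 3 and the optimal tour has length 4, so the MST gives a cheap lower bound on any tour.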

67

Heuristic H

Tour T’ Tour T’’

68

Perturbation of points