Genetic Algorithms CSCI-2300 Introduction to Algorithms David Goldschmidt, Ph.D. Rensselaer Polytechnic Institute April 29, 2013


Evolutionary Computing

Evolutionary computing produces high-quality partial solutions to problems through natural selection and survival of the fittest

– Compare to natural biological systems that adapt and learn over time


Genetic Algorithm Example

Find the maximum value of the function f(x) = -x^2 + 15x

– Represent the problem using chromosomes built from four genes:

Integer  Binary code    Integer  Binary code    Integer  Binary code
1        0 0 0 1        6        0 1 1 0        11       1 0 1 1
2        0 0 1 0        7        0 1 1 1        12       1 1 0 0
3        0 0 1 1        8        1 0 0 0        13       1 1 0 1
4        0 1 0 0        9        1 0 0 1        14       1 1 1 0
5        0 1 0 1        10       1 0 1 0        15       1 1 1 1

Source: http://www.webgraphing.com
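The four-gene representation can be sketched in Python as follows. This is a minimal sketch, assuming the genes are bits read most-significant first, as the table of binary codes suggests. The helper names `decode`, `encode`, and `fitness` are illustrative and do not come from the slides.

```python
def decode(chromosome: str) -> int:
    """Convert a bit-string chromosome such as '1100' to its integer value."""
    return int(chromosome, 2)


def encode(x: int, n_genes: int = 4) -> str:
    """Convert an integer in [0, 2**n_genes - 1] to a fixed-length bit string."""
    if not 0 <= x < 2 ** n_genes:
        raise ValueError("x does not fit in the chromosome")
    return format(x, f"0{n_genes}b")


def fitness(chromosome: str) -> int:
    """Evaluate f(x) = -x^2 + 15x for the integer encoded by the chromosome."""
    x = decode(chromosome)
    return -x * x + 15 * x


if __name__ == "__main__":
    for x in (1, 7, 12):
        c = encode(x)
        print(x, c, fitness(c))
```

Over the integers 0 to 15, f(x) peaks at x = 7 and x = 8, where f(x) = 56, so a good run should end near chromosomes 0111 or 1000.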


Genetic Algorithm Example

Initial random population of size N = 6:

Chromosome  Chromosome  Decoded  Chromosome  Fitness
label       string      integer  fitness     ratio, %
X1          1 1 0 0     12       36          16.5
X2          0 1 0 0     4        44          20.2
X3          0 0 0 1     1        14          6.4
X4          1 1 1 0     14       14          6.4
X5          0 1 1 1     7        56          25.7
X6          1 0 0 1     9        54          24.8

[Figure: plots of f(x) against x, with x from 0 to 15 and f(x) from 0 to 60. (a) Chromosome initial locations. (b) Chromosome final locations.]


Genetic Algorithm Example

Determine the fitness of each chromosome:

– The fitness function here is simply the original function f(x) = -x^2 + 15x
– The six fitness values total 218, so the fitness ratios total 100.0%
– Each fitness ratio is the chromosome's fitness as a percentage of that total, e.g. 36 / 218 = 16.5% for X1
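The fitness and fitness-ratio columns can be reproduced with a few lines of Python. This is a sketch only: the dictionary `population` and the function `fitness_ratios` are names introduced here for illustration, and the code assumes the total fitness is non-zero.

```python
# The six chromosomes of the example population, keyed by label.
population = {
    "X1": "1100", "X2": "0100", "X3": "0001",
    "X4": "1110", "X5": "0111", "X6": "1001",
}


def fitness(chromosome: str) -> int:
    """f(x) = -x^2 + 15x, with x decoded from the bit string."""
    x = int(chromosome, 2)
    return -x * x + 15 * x


def fitness_ratios(pop: dict[str, str]) -> dict[str, float]:
    """Return each chromosome's share of the total fitness, in percent."""
    scores = {label: fitness(c) for label, c in pop.items()}
    total = sum(scores.values())  # assumed non-zero
    return {label: 100.0 * s / total for label, s in scores.items()}


ratios = fitness_ratios(population)
for label, ratio in ratios.items():
    print(f"{label}: fitness {fitness(population[label])}, ratio {ratio:.1f}%")
```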


Genetic Algorithm Example

Use fitness ratios to determine which chromosomes are selected for crossover and mutation operations:

Roulette wheel slices (cumulative boundaries at 0, 16.5, 36.7, 43.1, 49.5, 75.2, 100):
X1: 16.5%
X2: 20.2%
X3: 6.4%
X4: 6.4%
X5: 25.7%
X6: 24.8%
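Fitness-proportionate selection of this kind can be sketched as a spin of a roulette wheel. The function name `roulette_select` and the use of a seeded `random.Random` instance are choices made for this illustration, not something the slides specify.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def roulette_select(labels, fitnesses, rng):
    """Pick one label with probability proportional to its fitness."""
    cumulative = list(accumulate(fitnesses))   # e.g. [36, 80, 94, 108, 164, 218]
    spin = rng.random() * cumulative[-1]       # uniform in [0, total)
    index = bisect_right(cumulative, spin)     # first slice whose boundary exceeds spin
    return labels[min(index, len(labels) - 1)]  # guard against float round-off


labels = ["X1", "X2", "X3", "X4", "X5", "X6"]
fitnesses = [36, 44, 14, 14, 56, 54]

rng = random.Random(0)
counts = {label: 0 for label in labels}
for _ in range(10_000):
    counts[roulette_select(labels, fitnesses, rng)] += 1
print(counts)
```

Over many spins the counts should roughly track the fitness ratios. The standard library's `random.choices(labels, weights=fitnesses)` performs the same kind of weighted draw in one call.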


Genetic Algorithm Example

Converge on a near-optimal solution:

[Figure: plots of f(x) against x, with x from 0 to 15 and f(x) from 0 to 60. (a) Chromosome initial locations. (b) Chromosome final locations.]


Convergence Example

[Figure labels: local maximum, global maximum]


Genetic Algorithms – Step 1

Represent the problem domain as a chromosome of fixed length
– Use a fixed number of genes to represent a solution
– Use individual bits or characters for efficient memory use and speed
– e.g. Traveling Salesman Problem (TSP) http://www.lalena.com/AI/Tsp/

[Figure: a fixed-length chromosome of 16 binary genes]


Genetic Algorithms – Step 2

Define a fitness function f(x) to measure the quality of individual chromosomes

The fitness function determines
– which chromosomes carry over to the next generation
– which chromosomes are crossed over with one another
– which chromosomes are individually mutated


Genetic Algorithms – Step 3

Establish our genetic algorithm parameters:
– Choose the size of the population, N
– Set the crossover probability, p_c
– Set the mutation probability, p_m

Randomly generate an initial population of chromosomes:
– x_1, x_2, ..., x_N

[Figure: a column of fixed-length bit-string chromosomes, one per member of the population]
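Step 3 might look like this in Python. The parameter values below are placeholders picked for the sketch; the slides do not prescribe specific numbers, and `random_population` is a hypothetical helper name.

```python
import random


def random_population(n: int, n_genes: int, rng: random.Random) -> list[str]:
    """Generate n random bit-string chromosomes of n_genes bits each."""
    return ["".join(rng.choice("01") for _ in range(n_genes)) for _ in range(n)]


# Illustrative parameter values (assumptions, not taken from the slides).
N = 6        # population size
P_C = 0.7    # crossover probability
P_M = 0.001  # mutation probability

population = random_population(N, n_genes=16, rng=random.Random(42))
for chromosome in population:
    print(chromosome)
```

Passing in a seeded `random.Random` keeps runs reproducible, which helps when tuning N, p_c, and p_m.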


Genetic Algorithms – Step 4

Calculate the fitness of each individual chromosome using f(x):
– f(x_1), f(x_2), ..., f(x_N)

Order the population based on fitness values


Genetic Algorithms – Step 5

Using p_c, select pairs of chromosomes for crossover

Using p_m, select chromosomes for mutation

Chromosomes are selected based on their fitness values using a roulette wheel approach:

Roulette wheel slices (cumulative boundaries at 0, 16.5, 36.7, 43.1, 49.5, 75.2, 100):
X1: 16.5%
X2: 20.2%
X3: 6.4%
X4: 6.4%
X5: 25.7%
X6: 24.8%


Genetic Algorithms – Step 6

Create a pair of offspring chromosomes by applying a crossover operation:

[Figure: crossover applied to the parent pairs (X6i, X2i), (X1i, X5i), and (X2i, X5i)]
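A single-point crossover on bit strings is one common way to implement this step. The sketch below assumes that variant. The parents 1001 and 0100 are X6 and X2 from the example population, and the crossover point 3 is chosen here purely for illustration.

```python
import random


def single_point_crossover(parent_a: str, parent_b: str, point: int) -> tuple[str, str]:
    """Swap the tails of two equal-length chromosomes after position `point`."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    if not 0 < point < len(parent_a):
        raise ValueError("point must fall strictly inside the chromosome")
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


def maybe_crossover(parent_a: str, parent_b: str, p_c: float, rng: random.Random):
    """With probability p_c cross the parents at a random point, else clone them."""
    if rng.random() < p_c:
        point = rng.randint(1, len(parent_a) - 1)
        return single_point_crossover(parent_a, parent_b, point)
    return parent_a, parent_b


print(single_point_crossover("1001", "0100", 3))  # ('1000', '0101')
```

When the random draw is not below p_c, the pair is copied unchanged, so some parents pass straight into the next generation.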


Genetic Algorithms – Step 6

Mutate an offspring chromosome by applying a mutation operation:

[Figure: mutation of chromosomes X6'i, X2'i, X1'i, X5'i, X2i, and X5i, producing the mutated chromosomes X1"i and X2"i]
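For bit-string chromosomes, mutation usually means flipping a gene. The sketch below assumes per-gene mutation, where each bit flips independently with probability p_m. The slides only say that p_m selects chromosomes for mutation, so treat this as one possible reading. Both function names are hypothetical.

```python
import random


def mutate(chromosome: str, p_m: float, rng: random.Random) -> str:
    """Flip each bit independently with probability p_m."""
    flipped = {"0": "1", "1": "0"}
    return "".join(flipped[g] if rng.random() < p_m else g for g in chromosome)


def flip_gene(chromosome: str, index: int) -> str:
    """Flip exactly one gene, useful for tracing a mutation by hand."""
    gene = "1" if chromosome[index] == "0" else "0"
    return chromosome[:index] + gene + chromosome[index + 1:]


print(flip_gene("1111", 1))  # '1011'
print(mutate("0101", 0.25, random.Random(3)))
```

Mutation probabilities are normally kept small, so that mutation introduces occasional variety without undoing the progress made by selection and crossover.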


Genetic Algorithms – Steps 7 & 8

Step 7:
– Place all generated offspring chromosomes in a new population

Step 8:
– Go back to Step 5 until the size of the new population is equal to the size of the initial population, N


Genetic Algorithms – Steps 9 & 10

Step 9:
– Replace the initial population with the new population

Step 10:
– Go back to Step 4 and repeat the process until termination criteria are satisfied
– Typically repeat this process for 50-5000+ generations

[Figure: Generation i is transformed into Generation (i + 1) by crossover followed by mutation]

Generation i                  Generation (i + 1)
X1i   1 1 0 0   f = 36        X1i+1   1 0 0 0   f = 56
X2i   0 1 0 0   f = 44        X2i+1   0 1 0 1   f = 50
X3i   0 0 0 1   f = 14        X3i+1   1 0 1 1   f = 44
X4i   1 1 1 0   f = 14        X4i+1   0 1 0 0   f = 44
X5i   0 1 1 1   f = 56        X5i+1   0 1 1 0   f = 54
X6i   1 0 0 1   f = 54        X6i+1   0 1 1 1   f = 56
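Steps 3 through 10 can be tied together in one compact loop. The following is a sketch under stated assumptions, not a definitive implementation. It uses the f(x) = -x^2 + 15x example, single-point crossover, per-bit mutation, and illustrative parameter defaults. Tracking the best chromosome seen so far is a bookkeeping addition that the slides do not mention.

```python
import random


def fitness(c: str) -> int:
    x = int(c, 2)
    return -x * x + 15 * x


def select(pop, rng):
    # Roulette wheel; the tiny offset avoids an all-zero weight list.
    weights = [fitness(c) + 1e-9 for c in pop]
    return rng.choices(pop, weights=weights, k=1)[0]


def crossover(a, b, p_c, rng):
    if rng.random() < p_c:
        point = rng.randint(1, len(a) - 1)
        return a[:point] + b[point:], b[:point] + a[point:]
    return a, b


def mutate(c, p_m, rng):
    return "".join(("1" if g == "0" else "0") if rng.random() < p_m else g for g in c)


def run_ga(n=6, n_genes=4, p_c=0.7, p_m=0.01, generations=50, seed=0):
    rng = random.Random(seed)
    # Step 3: random initial population.
    pop = ["".join(rng.choice("01") for _ in range(n_genes)) for _ in range(n)]
    best = max(pop, key=fitness)                       # Step 4: evaluate fitness
    for _ in range(generations):                       # Step 10: repeat
        new_pop = []
        while len(new_pop) < n:                        # Step 8: fill new population
            a, b = select(pop, rng), select(pop, rng)  # Step 5: selection
            for child in crossover(a, b, p_c, rng):    # Step 6: crossover
                new_pop.append(mutate(child, p_m, rng))  # Steps 6-7: mutate, collect
        pop = new_pop[:n]                              # Step 9: replace population
        best = max(pop + [best], key=fitness)          # best seen so far (addition)
    return best, pop


best, final_pop = run_ga()
print(best, int(best, 2), fitness(best))
```

A fixed generation count stands in for the termination criteria here. Any other stopping rule can replace the `for` loop condition.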


Crossword Puzzle Construction

Given:
– Dictionary of valid words and phrases
– Empty crossword grid

Problem:
– Fill the crossword grid such that all words both across and down are valid (assign clues later)


Crossword Puzzle Construction

Genetic Algorithm (GA)
– Evolve a solution by crossovers and mutations through many generations
– Initial population of crossword grids:
    – Random letters?
    – Random letters based on Scrabble® frequencies?
    – Random words from dictionary?
– Fitness of each grid is the number of valid words
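The fitness measure for a crossword grid could be sketched as below. The grid format is an assumption made for this example: a list of equal-length strings with '#' marking black squares. The three-word dictionary is a toy stand-in for a real word list.

```python
def grid_words(grid: list[str]) -> list[str]:
    """Return all across and down entries of length >= 2; '#' is a black square."""
    rows = list(grid)
    cols = ["".join(col) for col in zip(*grid)]
    words = []
    for line in rows + cols:
        words.extend(w for w in line.split("#") if len(w) >= 2)
    return words


def crossword_fitness(grid: list[str], dictionary: set[str]) -> int:
    """Fitness = number of across/down entries that are valid dictionary words."""
    return sum(1 for w in grid_words(grid) if w in dictionary)


dictionary = {"CAT", "ARE", "TEN"}  # toy word list (hypothetical)
solved = ["CAT",
          "ARE",
          "TEN"]
mutated = ["CAT",
           "ARE",
           "TEX"]
print(crossword_fitness(solved, dictionary))   # 6
print(crossword_fitness(mutated, dictionary))  # 4
```

Changing one letter breaks both the across and the down entry that pass through it, which is why the mutated grid loses two points rather than one.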


Termination Criteria

When do we stop?
– Pause a genetic algorithm after a given number of generations, then check the fittest chromosomes
    – If the fittest chromosomes are fit beyond a given threshold, terminate the genetic algorithm
– Also consider stopping when the highest fitness value does not change for a large number of generations
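Both stopping rules can be folded into one check that runs after each generation. The function below is a sketch: `should_stop`, `threshold`, and `patience` are names invented for this example, and `best_history` is assumed to hold the highest fitness value of every generation so far.

```python
def should_stop(best_history: list[float], threshold: float, patience: int) -> bool:
    """Decide whether to terminate, given the best fitness of each generation."""
    if not best_history:
        return False
    if best_history[-1] >= threshold:      # fit beyond the given threshold
        return True
    if len(best_history) > patience:       # enough history to judge stagnation
        recent = best_history[-patience:]
        # Stop if the last `patience` generations never beat the one before them.
        return max(recent) <= best_history[-patience - 1]
    return False


print(should_stop([10, 56], threshold=56, patience=3))          # True
print(should_stop([50, 50, 50, 50], threshold=56, patience=3))  # True
print(should_stop([40, 50, 50, 50], threshold=56, patience=3))  # False
```

The threshold rule needs some idea of what a good fitness value is, whereas the stagnation rule works even when the optimum is unknown.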


Computational Complexity

How long does it take for an algorithm to produce a solution?
– Depends on the size of the input and the complexity of the algorithm
– The size of the input is n
– The complexity of the algorithm is classified based on its expected run time


Computational Complexity

Big-O notation describes how the run time of an algorithm grows with the input size n (i.e. its computational complexity)
– Constant time: O(1)
– Logarithmic time: O(log n)
– Linear time: O(n)
– Linearithmic time: O(n log n)
– Quadratic time: O(n^2)
– Exponential time: O(c^n)
– Factorial time: O(n!)

[Figure labels: P, NP]


Genetic Algorithms

Genetic algorithms are often well-suited to producing reasonable solutions to intractable problems
– Intractable problems are problems with excessive computational complexity
    – e.g. the hardest problems in the Nondeterministic Polynomial (NP) class, for which no polynomial-time algorithm is known
– A reasonable solution is a partial or inexact solution that adequately solves the problem in polynomial time


Genetic Algorithms Example

Consider the Traveling Salesman Problem (TSP), in which a salesman aims to visit each of n cities exactly once while covering the least distance
    http://mathworld.wolfram.com/TravelingSalesmanProblem.html
    http://www.tsp.gatech.edu/games/index.html

– Starting at any given node, choose from n-1 remaining nodes, then choose from n-2 remaining nodes, etc.
– Testing every possible route takes (n-1)! steps (yikes!)
    see http://bio.math.berkeley.edu/classes/195/2000/lec14/index.html
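To see why exhaustive search does not scale, here is a brute-force sketch that tries every ordering of the remaining cities. It assumes the route is a circuit that returns to the starting city, and the 4-city distance matrix is made up for the example.

```python
from itertools import permutations
from math import factorial


def brute_force_tsp(dist: list[list[float]]) -> tuple[float, tuple[int, ...]]:
    """Try every circuit that starts and ends at city 0; return (length, route)."""
    n = len(dist)
    best_len, best_route = float("inf"), ()
    for rest in permutations(range(1, n)):  # (n-1)! orderings
        route = (0, *rest)
        length = sum(dist[route[i]][route[(i + 1) % n]] for i in range(n))
        if length < best_len:
            best_len, best_route = length, route
    return best_len, best_route


# Hypothetical symmetric distance matrix for 4 cities.
dist = [
    [0, 1, 4, 3],
    [1, 0, 2, 5],
    [4, 2, 0, 1],
    [3, 5, 1, 0],
]
print(brute_force_tsp(dist))  # (7, (0, 1, 2, 3))

# Number of routes to test as n grows.
for n in (5, 10, 15, 20):
    print(n, factorial(n - 1))
```

Four cities need only 6 orderings, but the count printed for 20 cities already has 18 digits.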


Genetic Algorithms Example

Use a genetic algorithm to evolve a near-optimal solution to the TSP
– Label cities A, B, C, D, E, F, etc.
– Example circuits: ABCDEF, BDAFCE, FBECAD
– How do we perform crossover operations?
    – Basic crossovers might result in invalid members of the population
    – e.g. combining ABCDEF and BDAFCE may result in ABCFCE
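The invalid offspring is easy to reproduce. In this small sketch, `naive_crossover` and `is_valid_tour` are illustrative names, and the crossover point 3 is the one that yields the ABCFCE example.

```python
def naive_crossover(a: str, b: str, point: int) -> str:
    """Take the head of tour a and the tail of tour b, with no repair."""
    return a[:point] + b[point:]


def is_valid_tour(tour: str, cities: str) -> bool:
    """A valid tour visits every city exactly once."""
    return sorted(tour) == sorted(cities)


child = naive_crossover("ABCDEF", "BDAFCE", 3)
print(child, is_valid_tour(child, "ABCDEF"))  # ABCFCE False
```

The child visits C twice and never reaches D, so it is not a member of the search space at all.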


Genetic Algorithms Example

A key challenge in developing a genetic algorithm is often the representation of the problem
– For TSP, consider a standard ordering ABCDEF, numbering its positions 123456
– All other sequences are encoded based on the removal of letters
– Basic crossover works...


Genetic Algorithms Example

All other sequences are encoded based on the removal of letters from the standard ordering
– Sequence BDAFCE has code 231311
    – B is 2 in ABCDEF
    – D is 3 in ACDEF
    – A is 1 in ACEF
    – F is 3 in CEF
    – C is 1 in CE
    – E is 1 in E
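The removal-based encoding and its inverse can be sketched in a few lines. The function names are hypothetical, and positions are 1-based to match the code 231311.

```python
def encode_tour(tour: str, standard: str) -> list[int]:
    """Encode a tour as 1-based positions in the shrinking standard ordering."""
    remaining = list(standard)
    code = []
    for city in tour:
        idx = remaining.index(city)
        code.append(idx + 1)   # position of the city among those still unvisited
        remaining.pop(idx)     # remove it before encoding the next city
    return code


def decode_tour(code: list[int], standard: str) -> str:
    """Rebuild the tour by repeatedly removing the indicated remaining city."""
    remaining = list(standard)
    return "".join(remaining.pop(i - 1) for i in code)


code = encode_tour("BDAFCE", "ABCDEF")
print(code)                          # [2, 3, 1, 3, 1, 1]
print(decode_tour(code, "ABCDEF"))   # BDAFCE
```

For n cities, the k-th digit of any valid code lies between 1 and n - k + 1, and every code that respects those ranges decodes to a valid tour.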


Genetic Algorithms Example

Crossover Operation

Crossing ACEDB with ABCED...


Genetic Algorithms Example

Combining ACEDB with ABCED...
...yields ACBED

from A.K. Dewdney’s The (New) Turing Omnibus, Computer Science Press, New York, 1993

another approach: http://www.dna-evolutions.com/dnaappletsample.html
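The ACBED result can be checked by crossing the encoded forms of the two tours. This is a sketch that assumes a single-point crossover after the second digit. The slides do not state the crossover point, but that choice reproduces the stated offspring. Helper names are illustrative.

```python
def encode_tour(tour: str, standard: str) -> list[int]:
    """Encode a tour as 1-based positions in the shrinking standard ordering."""
    remaining = list(standard)
    code = []
    for city in tour:
        idx = remaining.index(city)
        code.append(idx + 1)
        remaining.pop(idx)
    return code


def decode_tour(code: list[int], standard: str) -> str:
    """Rebuild the tour by repeatedly removing the indicated remaining city."""
    remaining = list(standard)
    return "".join(remaining.pop(i - 1) for i in code)


def cross_codes(code_a: list[int], code_b: list[int], point: int):
    """Basic single-point crossover applied to the codes, not to the tours."""
    return code_a[:point] + code_b[point:], code_b[:point] + code_a[point:]


standard = "ABCDE"
a = encode_tour("ACEDB", standard)  # [1, 2, 3, 2, 1]
b = encode_tour("ABCED", standard)  # [1, 1, 1, 2, 1]
child_1, child_2 = cross_codes(a, b, 2)
print(decode_tour(child_1, standard))  # ACBED
print(decode_tour(child_2, standard))  # ABEDC
```

Crossover keeps each digit in its original position, so each digit stays within the range allowed for that position and both children decode to valid tours.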


Genetic Algorithms

Advantages of genetic algorithms:
– Often outperform “brute force” approaches by randomly jumping around the search space
– Ideal for problem domains in which near-optimal (as opposed to exact) solutions are adequate

Disadvantages of genetic algorithms:
– Might not find any satisfactory partial solutions
– Tuning can be a challenge