65

Lecture-08-GA.ppt

Embed Size (px)

Citation preview

Page 1: Lecture-08-GA.ppt
Page 2: Lecture-08-GA.ppt

2

EvolutionHere’s a very oversimplified description of how evolution works

in biologyOrganisms (animals or plants) produce a number of offspring

which are almost, but not entirely, like themselvesVariation may be due to mutation (random changes)Variation may be due to reproduction (offspring have some characteristics

from each parent)

Some of these offspring may survive to produce offspring of their own—some won’tThe “better adapted” offspring are more likely to surviveOver time, later generations become better and better adapted

Genetic algorithms use this same process to “evolve” better programs

Page 3: Lecture-08-GA.ppt

3

GENETIC ALGORITHM

A biologically inspired model of intelligence and the principles of biological evolution are applied to find solutions to difficult problems

The problems are not solved by reasoning logically about them; rather populations of competing candidate solutions are spawned and then evolved to become better solutions through a process patterned after biological evolution

Less worthy candidate solutions tend to die out, while those that show promise of solving a problem survive and reproduce by constructing new solutions out of their components

Page 4: Lecture-08-GA.ppt

GENETIC ALGORITHM

GA begin with a population of candidate problem solutions

Candidate solutions are evaluated according to their ability to solve problem instances: only the fittest survive and combine with each other to produce the next generation of possible solutions

Thus increasingly powerful solutions emerge in a Darwinian universe

This method is heuristic in nature and it was introduced by John Holland in 1975

Page 5: Lecture-08-GA.ppt

5

Basic Genetic Algorithm

Start with a large “population” of randomly generated “attempted solutions” to a problem

Repeatedly do the following:Evaluate each of the attempted solutionsKeep a subset of these solutions (the “best” ones)Produce next generation from these solutions (using“inheritance” and “mutation”)

Quit when you have a satisfactory solution (or you run out of time)

Page 6: Lecture-08-GA.ppt

GENETIC ALGORITHMBasic Algorithm

begin set time t = 0; initialise population P(t) = {x1

t, x2t, …, xn

t} of solutions;

while the termination condition is not met do begin evaluate fitness of each member of P(t); select some members of P(t) for creating offspring; produce offspring by genetic operators; replace some members with the new offspring; set time t = t + 1; endend

Page 7: Lecture-08-GA.ppt

GENETIC ALGORITHM

Representation of Solutions: The Chromosome

Gene: A basic unit, which represents one characteristic of the individual. The value of each gene is called an allele

Chromosome: A string of genes; it represents an individual i.e. a possible solution of a problem. Each chromosome represents a point in the search space

Population: A collection of chromosomes

An appropriate chromosome representation is important for the efficiency and complexity of the GA

Page 8: Lecture-08-GA.ppt

GENETIC ALGORITHM

Evaluation/Fitness Function

It is used to determine the fitness of a chromosome

Creating a good fitness function is one of the challenging tasks of using GA

Page 9: Lecture-08-GA.ppt

Fitness Function

The fitness function can be the score of the classification accuracy of the rule-set over a set of provided training examples

Often other criteria may be included as well, such as the complexity of the rules or the size of the rule set

GENETIC ALGORITHM

Page 10: Lecture-08-GA.ppt

GENETIC ALGORITHM

Selection Operators (Algorithms)

They are used to select parents from the current population

The selection is primarily based on the fitness. The better the fitness of a chromosome, the greater its chance of being selected to be a parent

The most popular method of selection is Proportionate Selection

Page 11: Lecture-08-GA.ppt

GENETIC ALGORITHMReproduction Operators

Genetic operators are applied to chromosomes that are selected to be parents, to create offspring

Basically of two types: Crossover and Mutation

Crossover operators create offspring by recombining the chromosomes of selected parents

Mutation is used to make small random changes to a chromosome in an effort to add diversity to the population

Page 12: Lecture-08-GA.ppt

GENETIC ALGORITHM

Reproduction Operators: Mutation

Mutation is another important genetic operator

Mutation takes a single candidate and randomly changes some aspect (gene) of it

For example, mutation may randomly select a bit in the pattern and change it, switching a 1 to a 0 or to # (don’t care)

Page 13: Lecture-08-GA.ppt

13

Simple exampleSuppose your “organisms” are 32-bit computer wordsYou want a string in which all the bits are onesHere’s how you can do it:

Create 100 randomly generated computer wordsRepeatedly do the following:

Count the 1 bits in each word Exit if any of the words have all 32 bits set to 1 Keep the ten words that have the most 1s (discard the rest) From each word, generate 9 new words as follows:

Pick a random bit in the word and toggle (change) it

Note that this procedure does not guarantee that the next “generation” will have more 1 bits, but it’s likely

Page 14: Lecture-08-GA.ppt

Realistic Example Suppose you have a large number of (x, y) data points

• For example, (1, 4), (3, 9), (5, 8), ...

You would like to fit a polynomial (of up to degree 1) through these data points

• That is, you want a formula y = mx + c that gives you a reasonably good fit to the actual data

• Here’s the usual way to compute goodness of fit: Compute the sum of (actual y – predicted y)2 for all the data points The lowest sum represents the best fit

You can use a genetic algorithm to find a “pretty good” solution

Page 15: Lecture-08-GA.ppt

Realistic Example Your formula is y = mx + c

Your unknowns are m and c; where m and c are integers

Your representation is the array [m, c]

Your evaluation function for one array is:

• For every actual data point (x, y) Compute ý = mx + c Find the sum of (y – ý)2 over all x The sum is your measure of “badness” (larger numbers are worse)

• Example: For [m,c] = [5, 7] and the data points (1, 10) and (2, 13): ý = 5x + 7 = 12 when x is 1 ý = 5x + 7 = 17 when x is 2 Now compute the badness

(10 - 12)2 + (13 – 17)2 = 22 + 42 = 20 If these are the only two data points, the “badness” of [5, 7] is 20

Page 16: Lecture-08-GA.ppt

Realistic Example Your GA might be as follows:

• Create two-element arrays of random numbers

• Repeat 50 times (or any other number):For each of the arrays, compute its badness (using all data

points)Keep the best arrays (with low badness)From the arrays you keep, generate new arrays as follows:

• Convert the numbers in the array to binary, toggle one of the bits at random

• Quit if the badness of any of the solution is zero

• After all 50 trials, pick the best array as your final answer

Page 17: Lecture-08-GA.ppt

Realistic Example (x, y) : {(1,5) (3, 9)}

[2 7][1 3] (initial random population, where m and c represent genes) ý = 2x + 7 = 9 when x is 1 ý = 2x + 7 = 13 when x is 3 Badness: (5 – 9)2 + (9 – 13)2 = 42 + 42 = 32 ý = 1x + 3 = 4 when x is 1 ý = 1x + 3 = 6 when x is 3 Badness: (5 – 4)2 + (9 – 6)2 = 12 + 32 = 10

Now, lets keep the one with low “badness” [1 3]

Binary representation [001 011]

Apply mutation to generate new arrays [011 011]

Now we have [1 3] [3 3] as the new population considering that we keep the two best individuals

Page 18: Lecture-08-GA.ppt

Realistic Example (x, y) : {(1,5) (3, 9)}

[1 3][3 3] (current population) ý = 1x + 3 = 4 when x is 1 ý = 1x + 3 = 6 when x is 3 Badness: (5 – 4)2 + (9 – 6)2 = 1 + 9 = 10 ý = 3x + 3 = 6 when x is 1 ý = 3x + 3 = 12 when x is 3 Badness: (5 – 6)2 + (9 – 12)2 = 1 + 9 = 10

Lets keep the [3 3]

Representation [011 011]

Apply mutation to generate new arrays [010 011] i.e. [2,3]

Now we have [3 3] [2 3] as the new population

Page 19: Lecture-08-GA.ppt

Realistic Example

(x, y) : {(1,5) (3, 9)}

[3 3][2 3] (current population) ý = 3x + 3 = 6 when x is 1 ý = 3x + 3 = 12 when x is 3 Badness: (5 – 6)2 + (9 – 12)2 = 1 + 9 = 10

ý = 2x + 3 = 5 when x is 1 ý = 2x + 3 = 9 when x is 3 Badness: (5 – 5)2 + (9 – 9)2 = 02 + 02 = 0

Solution found [2 3]

y = 2x+3

Note: It is not necessary that the badness must always be zero. It can be some other threshold value as well.

Page 20: Lecture-08-GA.ppt

GENETIC ALGORITHM

Reproduction Operators: Crossover

Crossover operation takes two candidate solutions and divides them, swapping components to produce two new candidates

Page 21: Lecture-08-GA.ppt

GENETIC ALGORITHM

Reproduction Operators: Crossover

Figure illustrates crossover on bit string patterns of length 8

The operator splits them and forms two children whose initial segment comes from one parent and whose tail comes from the other

Input Bit Strings1 1 # 0 1 0 1 # # 1 1 0 # 0 # 1

Resulting Strings1 1 # 0 # 0 # 1 # 1 1 0 1 0 1 #

Page 22: Lecture-08-GA.ppt

The simple example again Suppose your “individuals” are 32-bit computer words, and

you want a string in which all the bits are ones Here’s how you can do it:

• Create 100 randomly generated computer words

• Repeatedly do the following: Count the 1 bits in each word Exit if any of the words have all 32 bits set to 1 Keep the 10 words that have the most 1s (discard the

rest). From each word, generate 9 new words as follows:

• Choose one of the words• Take the first half of this word and combine it

with the second half of some other word

Page 23: Lecture-08-GA.ppt

The simple example again

Half from one, half from the other:

A = 0110 1001 0100 1110 1010 1101 1011 0101

B = 1101 0100 0101 1010 1011 0100 1010 0101

-----------------------------------------------------------------

C = 0110 1001 0100 1110 1011 0100 1010 0101

Page 24: Lecture-08-GA.ppt

Mutation vs Crossover In the simple example of 32-bit words (trying to get all 1s):

• The (two-parent, no mutation) approach, if it succeeds, is likely to succeed much faster

Because up to half of the bits change each time, not just one bit

• However, without mutation, it may not succeed at all By pure bad luck, maybe none of the first randomly generated words

have (say) bit 17 set to 1

• Then there is no way a 1 could ever occur in this position as we are not changing individual bits separately

Another problem is lack of genetic diversity

• Maybe some of the first generation did have bit 17 set to 1, but none of them were selected for the second generation

The best technique in general turns out to be crossover with mutation

Page 25: Lecture-08-GA.ppt

GENETIC ALGORITHM

Reproduction Operators: Crossover

The place of split in the candidate solution is an arbitrary choice. This split may be at any point in the solution

This splitting point may be randomly chosen or changed systematically during the solution process

Crossover can unite an individual that is doing well in one dimension with another individual that is doing well in the other dimension

Page 26: Lecture-08-GA.ppt

GENETIC ALGORITHM

Reproduction Operators: Crossover

Two types: Single point crossover & Uniform crossover

Single type crossoverThis operator takes two parents and randomly selects a single point between two genes to cut both chromosomes into two parts (this point is called cut point)The first part of the first parent is combined with the second part of the second parent to create the first childThe first part of the second parent is combined with the second part of first parent to create the second child

1000010 10000011110001 1110010

Page 27: Lecture-08-GA.ppt

GENETIC ALGORITHM

Reproduction Operators: Crossover

Uniform crossoverThe value of each gene of an offspring’s chromosome is randomly taken from either parentThis is equivalent to multiple point crossover

10000101110001 1010010

Page 28: Lecture-08-GA.ppt

28

GENETIC ALGORITHMExample

Find a number: 001010

You have guessed this binary number. If you write a program to find it, then there are 26 = 64 possibilities

If you find it with the help of Genetic Algorithm, then the program gives a number and you tell its fitness

The fitness score is the number of correctly guessed bits

Page 29: Lecture-08-GA.ppt

29

GENETIC ALGORITHMExample

Find a number: 001010

Step 1. Chromosomes produced.A) 010101 - 1B) 111101 - 1C) 011011 - 4 *D) 101100 - 3 *

 Best ones C & D

Page 30: Lecture-08-GA.ppt

30

GENETIC ALGORITHM

Example

Find a number: 001010

Mating New Variants EvaluationC) 01:1011 01:1100 (E) 3D) 10:1100 10:1011 (F) 4 *C) 0110:11 0110:00 (G) 4 *D) 1011:00 1011:11 (H) 3 Selection of F & G

Page 31: Lecture-08-GA.ppt

31

GENETIC ALGORITHMExample  Mating New Variants EvaluationF) 1:01011 1:11000 (H) 3G) 0:11000 0:01011 (I) 5 *F) 101:011 101:000 (J) 4 *G) 011:000 011:011 (K) 4 Selection of I and J 

Page 32: Lecture-08-GA.ppt

32

GENETIC ALGORITHM

Example  Mating New Variants EvaluationI) 0010:11 0010:00 (L) 5J) 1010:00 1010:11 (M) 4 I) 00101:1 00101:0 (N) 6 (success) *J) 10100:0 10100:1 (O) 3 In this game success was achieved after 16 questions, which is 4 times faster then checking all possible 26 = 64 combination

Page 33: Lecture-08-GA.ppt

33

GENETIC ALGORITHM

Example Mutation was not used in this example. Mutation would have been necessary, if, e.g. there was a 0 in the third bit of all 3 initial individuals. In that case no matter how the individuals are combined, we can never change this bit into 1. Mutation takes evolution out of a dead end.

Page 34: Lecture-08-GA.ppt

Eight Queens Problem

The problem is to place 8 queens on a chess board so that none of them can attack the other. A chess board can be considered a plain board with eight columns and eight rows.

Page 35: Lecture-08-GA.ppt

Eight Queens Problem

The possible cells that the Queen can move to when placed in a particular square are shaded

Page 36: Lecture-08-GA.ppt

Eight Queens Problem

We need a scheme to denote the board’s position at any given time

2 6 8 3 4 5 3 1

Page 37: Lecture-08-GA.ppt

Eight Queens Problem

We need a scheme to denote the board’s position at any given time

2 6 8 3 4 5 3 1

Page 38: Lecture-08-GA.ppt

Eight Queens ProblemNow we need a fitness function, a function by

which we can tell which board position is nearer to our goal. Since we are going to select best individuals at every step, we need to define a method to rate these board positions.

One fitness function can be to count the number of Queens that do not attack others

Page 39: Lecture-08-GA.ppt

Eight Queens Problem

Fitness Function:Q1 can attack NONEQ2 can attack NONEQ3 can attack Q6Q4 can attack Q5Q5 can attack Q4Q6 can attack Q5Q7 can attack Q4Q8 can attack Q5 Fitness = No of. Queens

that can attack noneFitness = 2

Page 40: Lecture-08-GA.ppt

Eight Queens ProblemChoose initial population of board

configurationsEvaluate the fitness of each individual

(configuration)Choose the best individuals from the

population for crossover

Page 41: Lecture-08-GA.ppt

Eight Queens ProblemSuppose the following individuals are chosen for crossover

8 5 7 2 7 1 3 5 4 5 8 2 7 1 6 5

Page 42: Lecture-08-GA.ppt

Eight Queens Problem Using Crossover Parents Children

8 5 7 2 7 1 3 5 8 5 7 2

4 5 8 2 7 1 6 5 4 5 8 2

Page 43: Lecture-08-GA.ppt

Eight Queens Problem

Page 44: Lecture-08-GA.ppt

Eight Queens ProblemMutation, flip bits at random4 5 8 2 7 1 6 5

0100 0101 1000 0010 0111 0001 0110 0101

0100 0101 1000 0010 0111 0001 0011 0101

4 5 8 2 7 1 3 5

Page 45: Lecture-08-GA.ppt

Eight Queens ProblemThis process is repeated until an

individual with required fitness level is found. If no such individual is found, then the process is repeated further until the overall fitness of the population or any of its individuals gets very close to the required fitness level. An upper limit on the number of iterations is usually put to end the process in finite time.

Page 46: Lecture-08-GA.ppt

Eight Queens ProblemSolution!

Q

Q

Q

Q

Q

Q

Q

Q

4 6 8 2 7 1 3 5

8

Page 47: Lecture-08-GA.ppt

47

Selection Process

GENETIC ALGORITHM

Page 48: Lecture-08-GA.ppt

48

Selection Process

It is used to select parents from the current population. The selection is primarily based on the fitness. The better the fitness of a chromosome, the greater its chance of being selected to be a parent

The rate at which a selection algorithm selects individuals with above average fitness is selective pressure

If there is not enough selective pressure, the population will fail to converge upon a solution. If there is too much, the population may not have enough diversity & converge prematurely

GENETIC ALGORITHMGENETIC ALGORITHM

Page 49: Lecture-08-GA.ppt

49

Selection Process: Random Selection

Random Selection:

Individuals are selected randomly with no reference to fitness at all

All the individuals, good or bad, have an equal chance of being selected

GENETIC ALGORITHMGENETIC ALGORITHM

Page 50: Lecture-08-GA.ppt

50

Selection Process: Proportional Selection

Proportional Selection:

We can select the fittest chromosomesHowever, the selection of only the fitter chromosomes may result in the loss of a correct gene value which may be present in a less fit member

One way to overcome this risk is to assign probability of selection to each chromosome based on its fitness

In this way even the less fit members have some chance of surviving into the next generation

GENETIC ALGORITHMGENETIC ALGORITHM

Page 51: Lecture-08-GA.ppt

51

Selection Process: Proportional Selection

The probability of selection of a chromosome “i” may be calculated as

pi = fitnessi / j fitnessj

Example

Chromosome Fitness Selection Probability1 7 7/142 4 4/143 2 2/144 1 1/14

GENETIC ALGORITHMGENETIC ALGORITHM

Page 52: Lecture-08-GA.ppt

52

Selection Process: Proportional Selection

GENETIC ALGORITHMGENETIC ALGORITHM

Page 53: Lecture-08-GA.ppt

53

Selection Process: Proportional Selection

Chromosomes are selected based on their fitness relative to the fitness of all other chromosomes

For this all the fitness are added to form a sum S and each chromosome is assigned a relative fitness (which is its fitness divided by the total fitness S)

A process similar to spinning a roulette wheel is adopted to choose a parent; the better a chromosome’s relative fitness, the higher its chances of selection

GENETIC ALGORITHMGENETIC ALGORITHM

Page 54: Lecture-08-GA.ppt

54

Selection Process: Proportional Selection

Once a parent is selected, the wheel is given a spin for finding the second parent. If the same chromosome is selected as the second parent, it is rejected and the wheel is spun again

After finding a pair, a second pair is selected, and so on

A chromosome may get selected several times and appear as a parent several times

GENETIC ALGORITHMGENETIC ALGORITHM

Page 55: Lecture-08-GA.ppt

55

Selection Process: Proportional Selection

Advantage

Selective pressure varies with the distribution of fitness within a population. If there is a lot of fitness difference between the more fit and less fit chromosomes, then the selective pressure will be higher

Disadvantage

As the population converges upon a solution, the selective pressure decreases, which may hinder the GA to find better solutions

GENETIC ALGORITHMGENETIC ALGORITHM

Page 56: Lecture-08-GA.ppt

56

Selection Process: Tournament Selection

Tournament Selection:

One parent is selected by comparing a subset b of the available chromosomes, and selecting the fittest; a second parent may be selected by repeating the process

The selection pressure increases as b increases.

Value of b = 2 is most commonly used

GENETIC ALGORITHMGENETIC ALGORITHM

Page 57: Lecture-08-GA.ppt

57

Selection Process: Tournament Selection

Its advantage is that the worse individuals of the population will have very little probability of selection, whereas the best individuals will not dominate the selection process, thus ensuring diversity

GENETIC ALGORITHMGENETIC ALGORITHM

Page 58: Lecture-08-GA.ppt

58

Selection Process: Rank based selection

Rank Based Selection:

Rank based selection uses the rank ordering of the fitness values to determine the probability of selection and not the fitness values themselves

This means that the selection probability is independent of the actual fitness value

Ranking therefore has the advantage that a highly fit individual will not dominate in the selection process as a function of the magnitude of its fitness

GENETIC ALGORITHMGENETIC ALGORITHM

Page 59: Lecture-08-GA.ppt

59

Selection Process: Rank based selection

The population is sorted from best to worst according to the fitness

Each chromosome is then assigned a newfitness based on a linear ranking function

New Fitness = (P – r) + 1

where P = population size, r = fitness rank of the chromosome

If P = 11, then a chromosome of rank 1 will have a New Fitness of 10 + 1 = 11 & a chromosome of rank 6 will have 6

GENETIC ALGORITHMGENETIC ALGORITHM

Page 60: Lecture-08-GA.ppt

60

Selection Process: Rank based selection

A user adjusted slope can also be incorporated

New Fitness = {(P – r) (max - min)/(P – 1)} + min

where max and min are set by the user to determine the slope (max - min)/(P – 1) of the function

Let P = 11, max = 8, min = 3, then a chromosome of rank 1 will have a New fitness of

10*5/10 + 3 = 8& a chromosome of rank 6 will have 5*5/10 + 3 = 5.5

GENETIC ALGORITHMGENETIC ALGORITHM

Page 61: Lecture-08-GA.ppt

61

Termination Requirement

GENETIC ALGORITHMGENETIC ALGORITHM

Page 62: Lecture-08-GA.ppt

62

Termination Requirement

The GA continues until some termination requirement is met, such as

- having a solution whose fitness exceeds some threshold- pre-specified number of generations have evolved- the fitness of solutions becomes stable & stops improving

GENETIC ALGORITHMGENETIC ALGORITHM

Page 63: Lecture-08-GA.ppt

63

Population Size

GENETIC ALGORITHMGENETIC ALGORITHM

Page 64: Lecture-08-GA.ppt

64

Population Size

Number of individuals present in an iteration (generation)

If the population size is too large, the processing time is high and the GA tends to take longer to converge upon a solution (because less fit members have to be selected to make up the required population)

If the population size is too small, the GA is in danger of premature convergence upon a sub-optimal solution (all chromosomes will soon have identical traits). This is primarily because there may not be enough diversity in the population to allow the GA to escape local optima

GENETIC ALGORITHMGENETIC ALGORITHM

Page 65: Lecture-08-GA.ppt

Genetic Algorithms

Advantages of genetic algorithms:Often outperform “brute force” approaches by

randomly jumping around the search spaceIdeal for problem domains in which near-

optimal (as opposed to exact) solutions are adequate

Disadvantages of genetic algorithms:Might not find any satisfactory partial solutionsTuning can be a challenge