Upload
bilalhsa100
View
214
Download
1
Tags:
Embed Size (px)
Citation preview
2
EvolutionHere’s a very oversimplified description of how evolution works
in biologyOrganisms (animals or plants) produce a number of offspring
which are almost, but not entirely, like themselvesVariation may be due to mutation (random changes)Variation may be due to reproduction (offspring have some characteristics
from each parent)
Some of these offspring may survive to produce offspring of their own—some won’tThe “better adapted” offspring are more likely to surviveOver time, later generations become better and better adapted
Genetic algorithms use this same process to “evolve” better programs
3
GENETIC ALGORITHM
A biologically inspired model of intelligence and the principles of biological evolution are applied to find solutions to difficult problems
The problems are not solved by reasoning logically about them; rather populations of competing candidate solutions are spawned and then evolved to become better solutions through a process patterned after biological evolution
Less worthy candidate solutions tend to die out, while those that show promise of solving a problem survive and reproduce by constructing new solutions out of their components
GENETIC ALGORITHM
GA begin with a population of candidate problem solutions
Candidate solutions are evaluated according to their ability to solve problem instances: only the fittest survive and combine with each other to produce the next generation of possible solutions
Thus increasingly powerful solutions emerge in a Darwinian universe
This method is heuristic in nature and it was introduced by John Holland in 1975
5
Basic Genetic Algorithm
Start with a large “population” of randomly generated “attempted solutions” to a problem
Repeatedly do the following:Evaluate each of the attempted solutionsKeep a subset of these solutions (the “best” ones)Produce next generation from these solutions (using“inheritance” and “mutation”)
Quit when you have a satisfactory solution (or you run out of time)
GENETIC ALGORITHMBasic Algorithm
begin set time t = 0; initialise population P(t) = {x1
t, x2t, …, xn
t} of solutions;
while the termination condition is not met do begin evaluate fitness of each member of P(t); select some members of P(t) for creating offspring; produce offspring by genetic operators; replace some members with the new offspring; set time t = t + 1; endend
GENETIC ALGORITHM
Representation of Solutions: The Chromosome
Gene: A basic unit, which represents one characteristic of the individual. The value of each gene is called an allele
Chromosome: A string of genes; it represents an individual i.e. a possible solution of a problem. Each chromosome represents a point in the search space
Population: A collection of chromosomes
An appropriate chromosome representation is important for the efficiency and complexity of the GA
GENETIC ALGORITHM
Evaluation/Fitness Function
It is used to determine the fitness of a chromosome
Creating a good fitness function is one of the challenging tasks of using GA
Fitness Function
The fitness function can be the score of the classification accuracy of the rule-set over a set of provided training examples
Often other criteria may be included as well, such as the complexity of the rules or the size of the rule set
GENETIC ALGORITHM
GENETIC ALGORITHM
Selection Operators (Algorithms)
They are used to select parents from the current population
The selection is primarily based on the fitness. The better the fitness of a chromosome, the greater its chance of being selected to be a parent
The most popular method of selection is Proportionate Selection
GENETIC ALGORITHMReproduction Operators
Genetic operators are applied to chromosomes that are selected to be parents, to create offspring
Basically of two types: Crossover and Mutation
Crossover operators create offspring by recombining the chromosomes of selected parents
Mutation is used to make small random changes to a chromosome in an effort to add diversity to the population
GENETIC ALGORITHM
Reproduction Operators: Mutation
Mutation is another important genetic operator
Mutation takes a single candidate and randomly changes some aspect (gene) of it
For example, mutation may randomly select a bit in the pattern and change it, switching a 1 to a 0 or to # (don’t care)
13
Simple exampleSuppose your “organisms” are 32-bit computer wordsYou want a string in which all the bits are onesHere’s how you can do it:
Create 100 randomly generated computer wordsRepeatedly do the following:
Count the 1 bits in each word Exit if any of the words have all 32 bits set to 1 Keep the ten words that have the most 1s (discard the rest) From each word, generate 9 new words as follows:
Pick a random bit in the word and toggle (change) it
Note that this procedure does not guarantee that the next “generation” will have more 1 bits, but it’s likely
Realistic Example Suppose you have a large number of (x, y) data points
• For example, (1, 4), (3, 9), (5, 8), ...
You would like to fit a polynomial (of up to degree 1) through these data points
• That is, you want a formula y = mx + c that gives you a reasonably good fit to the actual data
• Here’s the usual way to compute goodness of fit: Compute the sum of (actual y – predicted y)2 for all the data points The lowest sum represents the best fit
You can use a genetic algorithm to find a “pretty good” solution
Realistic Example Your formula is y = mx + c
Your unknowns are m and c; where m and c are integers
Your representation is the array [m, c]
Your evaluation function for one array is:
• For every actual data point (x, y) Compute ý = mx + c Find the sum of (y – ý)2 over all x The sum is your measure of “badness” (larger numbers are worse)
• Example: For [m,c] = [5, 7] and the data points (1, 10) and (2, 13): ý = 5x + 7 = 12 when x is 1 ý = 5x + 7 = 17 when x is 2 Now compute the badness
(10 - 12)2 + (13 – 17)2 = 22 + 42 = 20 If these are the only two data points, the “badness” of [5, 7] is 20
Realistic Example Your GA might be as follows:
• Create two-element arrays of random numbers
• Repeat 50 times (or any other number):For each of the arrays, compute its badness (using all data
points)Keep the best arrays (with low badness)From the arrays you keep, generate new arrays as follows:
• Convert the numbers in the array to binary, toggle one of the bits at random
• Quit if the badness of any of the solution is zero
• After all 50 trials, pick the best array as your final answer
Realistic Example (x, y) : {(1,5) (3, 9)}
[2 7][1 3] (initial random population, where m and c represent genes) ý = 2x + 7 = 9 when x is 1 ý = 2x + 7 = 13 when x is 3 Badness: (5 – 9)2 + (9 – 13)2 = 42 + 42 = 32 ý = 1x + 3 = 4 when x is 1 ý = 1x + 3 = 6 when x is 3 Badness: (5 – 4)2 + (9 – 6)2 = 12 + 32 = 10
Now, lets keep the one with low “badness” [1 3]
Binary representation [001 011]
Apply mutation to generate new arrays [011 011]
Now we have [1 3] [3 3] as the new population considering that we keep the two best individuals
Realistic Example (x, y) : {(1,5) (3, 9)}
[1 3][3 3] (current population) ý = 1x + 3 = 4 when x is 1 ý = 1x + 3 = 6 when x is 3 Badness: (5 – 4)2 + (9 – 6)2 = 1 + 9 = 10 ý = 3x + 3 = 6 when x is 1 ý = 3x + 3 = 12 when x is 3 Badness: (5 – 6)2 + (9 – 12)2 = 1 + 9 = 10
Lets keep the [3 3]
Representation [011 011]
Apply mutation to generate new arrays [010 011] i.e. [2,3]
Now we have [3 3] [2 3] as the new population
Realistic Example
(x, y) : {(1,5) (3, 9)}
[3 3][2 3] (current population) ý = 3x + 3 = 6 when x is 1 ý = 3x + 3 = 12 when x is 3 Badness: (5 – 6)2 + (9 – 12)2 = 1 + 9 = 10
ý = 2x + 3 = 5 when x is 1 ý = 2x + 3 = 9 when x is 3 Badness: (5 – 5)2 + (9 – 9)2 = 02 + 02 = 0
Solution found [2 3]
y = 2x+3
Note: It is not necessary that the badness must always be zero. It can be some other threshold value as well.
GENETIC ALGORITHM
Reproduction Operators: Crossover
Crossover operation takes two candidate solutions and divides them, swapping components to produce two new candidates
GENETIC ALGORITHM
Reproduction Operators: Crossover
Figure illustrates crossover on bit string patterns of length 8
The operator splits them and forms two children whose initial segment comes from one parent and whose tail comes from the other
Input Bit Strings1 1 # 0 1 0 1 # # 1 1 0 # 0 # 1
Resulting Strings1 1 # 0 # 0 # 1 # 1 1 0 1 0 1 #
The simple example again Suppose your “individuals” are 32-bit computer words, and
you want a string in which all the bits are ones Here’s how you can do it:
• Create 100 randomly generated computer words
• Repeatedly do the following: Count the 1 bits in each word Exit if any of the words have all 32 bits set to 1 Keep the 10 words that have the most 1s (discard the
rest). From each word, generate 9 new words as follows:
• Choose one of the words• Take the first half of this word and combine it
with the second half of some other word
The simple example again
Half from one, half from the other:
A = 0110 1001 0100 1110 1010 1101 1011 0101
B = 1101 0100 0101 1010 1011 0100 1010 0101
-----------------------------------------------------------------
C = 0110 1001 0100 1110 1011 0100 1010 0101
Mutation vs Crossover In the simple example of 32-bit words (trying to get all 1s):
• The (two-parent, no mutation) approach, if it succeeds, is likely to succeed much faster
Because up to half of the bits change each time, not just one bit
• However, without mutation, it may not succeed at all By pure bad luck, maybe none of the first randomly generated words
have (say) bit 17 set to 1
• Then there is no way a 1 could ever occur in this position as we are not changing individual bits separately
Another problem is lack of genetic diversity
• Maybe some of the first generation did have bit 17 set to 1, but none of them were selected for the second generation
The best technique in general turns out to be crossover with mutation
GENETIC ALGORITHM
Reproduction Operators: Crossover
The place of split in the candidate solution is an arbitrary choice. This split may be at any point in the solution
This splitting point may be randomly chosen or changed systematically during the solution process
Crossover can unite an individual that is doing well in one dimension with another individual that is doing well in the other dimension
GENETIC ALGORITHM
Reproduction Operators: Crossover
Two types: Single point crossover & Uniform crossover
Single type crossoverThis operator takes two parents and randomly selects a single point between two genes to cut both chromosomes into two parts (this point is called cut point)The first part of the first parent is combined with the second part of the second parent to create the first childThe first part of the second parent is combined with the second part of first parent to create the second child
1000010 10000011110001 1110010
GENETIC ALGORITHM
Reproduction Operators: Crossover
Uniform crossoverThe value of each gene of an offspring’s chromosome is randomly taken from either parentThis is equivalent to multiple point crossover
10000101110001 1010010
28
GENETIC ALGORITHMExample
Find a number: 001010
You have guessed this binary number. If you write a program to find it, then there are 26 = 64 possibilities
If you find it with the help of Genetic Algorithm, then the program gives a number and you tell its fitness
The fitness score is the number of correctly guessed bits
29
GENETIC ALGORITHMExample
Find a number: 001010
Step 1. Chromosomes produced.A) 010101 - 1B) 111101 - 1C) 011011 - 4 *D) 101100 - 3 *
Best ones C & D
30
GENETIC ALGORITHM
Example
Find a number: 001010
Mating New Variants EvaluationC) 01:1011 01:1100 (E) 3D) 10:1100 10:1011 (F) 4 *C) 0110:11 0110:00 (G) 4 *D) 1011:00 1011:11 (H) 3 Selection of F & G
31
GENETIC ALGORITHMExample Mating New Variants EvaluationF) 1:01011 1:11000 (H) 3G) 0:11000 0:01011 (I) 5 *F) 101:011 101:000 (J) 4 *G) 011:000 011:011 (K) 4 Selection of I and J
32
GENETIC ALGORITHM
Example Mating New Variants EvaluationI) 0010:11 0010:00 (L) 5J) 1010:00 1010:11 (M) 4 I) 00101:1 00101:0 (N) 6 (success) *J) 10100:0 10100:1 (O) 3 In this game success was achieved after 16 questions, which is 4 times faster then checking all possible 26 = 64 combination
33
GENETIC ALGORITHM
Example Mutation was not used in this example. Mutation would have been necessary, if, e.g. there was a 0 in the third bit of all 3 initial individuals. In that case no matter how the individuals are combined, we can never change this bit into 1. Mutation takes evolution out of a dead end.
Eight Queens Problem
The problem is to place 8 queens on a chess board so that none of them can attack the other. A chess board can be considered a plain board with eight columns and eight rows.
Eight Queens Problem
The possible cells that the Queen can move to when placed in a particular square are shaded
Eight Queens Problem
We need a scheme to denote the board’s position at any given time
2 6 8 3 4 5 3 1
Eight Queens Problem
We need a scheme to denote the board’s position at any given time
2 6 8 3 4 5 3 1
Eight Queens ProblemNow we need a fitness function, a function by
which we can tell which board position is nearer to our goal. Since we are going to select best individuals at every step, we need to define a method to rate these board positions.
One fitness function can be to count the number of Queens that do not attack others
Eight Queens Problem
Fitness Function:Q1 can attack NONEQ2 can attack NONEQ3 can attack Q6Q4 can attack Q5Q5 can attack Q4Q6 can attack Q5Q7 can attack Q4Q8 can attack Q5 Fitness = No of. Queens
that can attack noneFitness = 2
Eight Queens ProblemChoose initial population of board
configurationsEvaluate the fitness of each individual
(configuration)Choose the best individuals from the
population for crossover
Eight Queens ProblemSuppose the following individuals are chosen for crossover
8 5 7 2 7 1 3 5 4 5 8 2 7 1 6 5
Eight Queens Problem Using Crossover Parents Children
8 5 7 2 7 1 3 5 8 5 7 2
4 5 8 2 7 1 6 5 4 5 8 2
Eight Queens Problem
Eight Queens ProblemMutation, flip bits at random4 5 8 2 7 1 6 5
0100 0101 1000 0010 0111 0001 0110 0101
0100 0101 1000 0010 0111 0001 0011 0101
4 5 8 2 7 1 3 5
Eight Queens ProblemThis process is repeated until an
individual with required fitness level is found. If no such individual is found, then the process is repeated further until the overall fitness of the population or any of its individuals gets very close to the required fitness level. An upper limit on the number of iterations is usually put to end the process in finite time.
Eight Queens ProblemSolution!
Q
Q
Q
Q
Q
Q
Q
Q
4 6 8 2 7 1 3 5
8
47
Selection Process
GENETIC ALGORITHM
48
Selection Process
It is used to select parents from the current population. The selection is primarily based on the fitness. The better the fitness of a chromosome, the greater its chance of being selected to be a parent
The rate at which a selection algorithm selects individuals with above average fitness is selective pressure
If there is not enough selective pressure, the population will fail to converge upon a solution. If there is too much, the population may not have enough diversity & converge prematurely
GENETIC ALGORITHMGENETIC ALGORITHM
49
Selection Process: Random Selection
Random Selection:
Individuals are selected randomly with no reference to fitness at all
All the individuals, good or bad, have an equal chance of being selected
GENETIC ALGORITHMGENETIC ALGORITHM
50
Selection Process: Proportional Selection
Proportional Selection:
We can select the fittest chromosomesHowever, the selection of only the fitter chromosomes may result in the loss of a correct gene value which may be present in a less fit member
One way to overcome this risk is to assign probability of selection to each chromosome based on its fitness
In this way even the less fit members have some chance of surviving into the next generation
GENETIC ALGORITHMGENETIC ALGORITHM
51
Selection Process: Proportional Selection
The probability of selection of a chromosome “i” may be calculated as
pi = fitnessi / j fitnessj
Example
Chromosome Fitness Selection Probability1 7 7/142 4 4/143 2 2/144 1 1/14
GENETIC ALGORITHMGENETIC ALGORITHM
52
Selection Process: Proportional Selection
GENETIC ALGORITHMGENETIC ALGORITHM
53
Selection Process: Proportional Selection
Chromosomes are selected based on their fitness relative to the fitness of all other chromosomes
For this all the fitness are added to form a sum S and each chromosome is assigned a relative fitness (which is its fitness divided by the total fitness S)
A process similar to spinning a roulette wheel is adopted to choose a parent; the better a chromosome’s relative fitness, the higher its chances of selection
GENETIC ALGORITHMGENETIC ALGORITHM
54
Selection Process: Proportional Selection
Once a parent is selected, the wheel is given a spin for finding the second parent. If the same chromosome is selected as the second parent, it is rejected and the wheel is spun again
After finding a pair, a second pair is selected, and so on
A chromosome may get selected several times and appear as a parent several times
GENETIC ALGORITHMGENETIC ALGORITHM
55
Selection Process: Proportional Selection
Advantage
Selective pressure varies with the distribution of fitness within a population. If there is a lot of fitness difference between the more fit and less fit chromosomes, then the selective pressure will be higher
Disadvantage
As the population converges upon a solution, the selective pressure decreases, which may hinder the GA to find better solutions
GENETIC ALGORITHMGENETIC ALGORITHM
56
Selection Process: Tournament Selection
Tournament Selection:
One parent is selected by comparing a subset b of the available chromosomes, and selecting the fittest; a second parent may be selected by repeating the process
The selection pressure increases as b increases.
Value of b = 2 is most commonly used
GENETIC ALGORITHMGENETIC ALGORITHM
57
Selection Process: Tournament Selection
Its advantage is that the worse individuals of the population will have very little probability of selection, whereas the best individuals will not dominate the selection process, thus ensuring diversity
GENETIC ALGORITHMGENETIC ALGORITHM
58
Selection Process: Rank based selection
Rank Based Selection:
Rank based selection uses the rank ordering of the fitness values to determine the probability of selection and not the fitness values themselves
This means that the selection probability is independent of the actual fitness value
Ranking therefore has the advantage that a highly fit individual will not dominate in the selection process as a function of the magnitude of its fitness
GENETIC ALGORITHMGENETIC ALGORITHM
59
Selection Process: Rank based selection
The population is sorted from best to worst according to the fitness
Each chromosome is then assigned a newfitness based on a linear ranking function
New Fitness = (P – r) + 1
where P = population size, r = fitness rank of the chromosome
If P = 11, then a chromosome of rank 1 will have a New Fitness of 10 + 1 = 11 & a chromosome of rank 6 will have 6
GENETIC ALGORITHMGENETIC ALGORITHM
60
Selection Process: Rank based selection
A user adjusted slope can also be incorporated
New Fitness = {(P – r) (max - min)/(P – 1)} + min
where max and min are set by the user to determine the slope (max - min)/(P – 1) of the function
Let P = 11, max = 8, min = 3, then a chromosome of rank 1 will have a New fitness of
10*5/10 + 3 = 8& a chromosome of rank 6 will have 5*5/10 + 3 = 5.5
GENETIC ALGORITHMGENETIC ALGORITHM
61
Termination Requirement
GENETIC ALGORITHMGENETIC ALGORITHM
62
Termination Requirement
The GA continues until some termination requirement is met, such as
- having a solution whose fitness exceeds some threshold- pre-specified number of generations have evolved- the fitness of solutions becomes stable & stops improving
GENETIC ALGORITHMGENETIC ALGORITHM
63
Population Size
GENETIC ALGORITHMGENETIC ALGORITHM
64
Population Size
Number of individuals present in an iteration (generation)
If the population size is too large, the processing time is high and the GA tends to take longer to converge upon a solution (because less fit members have to be selected to make up the required population)
If the population size is too small, the GA is in danger of premature convergence upon a sub-optimal solution (all chromosomes will soon have identical traits). This is primarily because there may not be enough diversity in the population to allow the GA to escape local optima
GENETIC ALGORITHMGENETIC ALGORITHM
Genetic Algorithms
Advantages of genetic algorithms:Often outperform “brute force” approaches by
randomly jumping around the search spaceIdeal for problem domains in which near-
optimal (as opposed to exact) solutions are adequate
Disadvantages of genetic algorithms:Might not find any satisfactory partial solutionsTuning can be a challenge