Genetic Algorithms Brandon Andrews

Brandon Andrews. What are genetic algorithms? 3 steps Applications to Bioinformatics

Topics
• What are genetic algorithms?
• 3 steps
• Applications to Bioinformatics

What are genetic algorithms?

• Introduced by John Holland, who published them in 1975
• Cells have DNA, which defines their properties
• Reproduction crosses the DNA of both parents, merging properties from each
  • Random mutations can occur during this step
• The fitness of the resulting organism is then tested
  • The organism is scored against others on criteria for survival
  • Essentially evolution

3 Steps
• Selection step
  • Based on the calculated fitness
• Reproduction step
  • Mutations
  • Strategies for crossing
• Termination step
  • When the goal is met

Steps Expanded
1) Generate random properties (chromosomes) for N entities
2) Calculate their fitness and discard the ones that fall below the threshold
  • Fitness can be determined through a simulation
3) Randomly cross over pairs that survive the selection step
  • Also randomly choose properties and mutate them; this can be as simple as jittering them
4) Go to step 2 until a goal is reached
  • Return the best set of properties
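The four steps above can be sketched as a short loop. This is a minimal illustration, not a reference implementation: the bit-string chromosomes, the toy `fitness` function (count of 1 bits), the median threshold that keeps the top half, and every parameter name are assumptions made for the example.

```python
import random

def fitness(chromosome):
    # Hypothetical fitness: number of 1 bits (a common toy problem).
    return sum(chromosome)

def run_ga(n_entities=30, n_genes=20, mutation_rate=0.02,
           max_generations=200, seed=0):
    rng = random.Random(seed)
    # Step 1: random chromosomes for N entities.
    population = [[rng.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(n_entities)]
    best = max(population, key=fitness)
    for _ in range(max_generations):
        # Step 2: score everyone and discard those below the threshold
        # (assumed here to be the median, so the top half survives).
        ranked = sorted(population, key=fitness, reverse=True)
        survivors = ranked[: n_entities // 2]
        best = max(best, survivors[0], key=fitness)
        # Step 4: stop once the goal (all ones) is reached.
        if fitness(best) == n_genes:
            break
        # Step 3: cross random pairs of survivors, then mutate the children.
        children = []
        while len(survivors) + len(children) < n_entities:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_genes)
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = survivors + children
    return best

if __name__ == "__main__":
    result = run_ga()
    print(fitness(result), result)
```

Keeping the survivors in the next generation means the best score never gets worse from one generation to the next; other designs replace the whole population with children.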

Fitness Function
• Could be anything that scores a candidate
• The goal is to minimize or maximize the fitness function; the best score should normally improve after each step
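As an illustration of "could be anything", here are two hypothetical fitness functions, one to maximize and one to minimize, plus a helper that picks the best candidate either way. All names are invented for this sketch.

```python
def ones_count(chromosome):
    # Maximize: more 1 bits is better.
    return sum(chromosome)

def squared_error(chromosome, target):
    # Minimize: distance from a known target vector.
    return sum((g - t) ** 2 for g, t in zip(chromosome, target))

def best_of(population, score, maximize=True):
    # Pick the best candidate under either convention.
    return max(population, key=score) if maximize else min(population, key=score)

print(best_of([[0, 1], [1, 1]], ones_count))  # prints [1, 1]
```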

Crossover Probability
• How often crossover happens
  • 0% means no crossover: both parents are simply moved to the next step
  • 100% means all parents are crossed and only their children are moved to the next step
• The hope is that the good properties of both parents are merged, or that a good parent is preserved completely if it has no flaws that a crossing could fix
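A minimal sketch of how a crossover probability might be applied to one pair of parents. The function name `maybe_crossover` and the choice of a single random cut point are assumptions for the example.

```python
import random

def maybe_crossover(parent_a, parent_b, crossover_prob, rng=random):
    # With probability crossover_prob, swap tails at a random cut point.
    if rng.random() < crossover_prob:
        cut = rng.randrange(1, len(parent_a))
        return (parent_a[:cut] + parent_b[cut:],
                parent_b[:cut] + parent_a[cut:])
    # Otherwise both parents move to the next step unchanged (as copies).
    return parent_a[:], parent_b[:]
```

With `crossover_prob=0.0` the parents always pass through untouched; with `1.0` every pair is crossed.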

Mutation Probability
• The probability that part of the chromosome is changed after a crossing
  • 0%: none of it is changed
    • Not useful, since variety is needed to approach the best solution; otherwise you're stuck with the first generated properties
  • 100%: all of it is changed
    • Not useful, since it negates the point of crossing at all and essentially causes a random search
• The concept is to stop the algorithm from halting at a local maximum: mutations have a chance to produce small improvements
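One way to read the mutation probability is as a per-gene chance of change. The sketch below assumes bit-valued genes, where "changing" a gene means flipping it; `mutate` is a hypothetical helper name.

```python
import random

def mutate(chromosome, mutation_prob, rng=random):
    # Flip each bit independently with probability mutation_prob.
    return [1 - gene if rng.random() < mutation_prob else gene
            for gene in chromosome]

print(mutate([0, 1, 0, 1], 0.0))  # prints [0, 1, 0, 1]
print(mutate([0, 1, 0, 1], 1.0))  # prints [1, 0, 1, 0]
```

The two extremes match the bullets above: at 0% the child is unchanged, and at 100% every gene is changed, which discards what crossing produced.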

Termination
• When the expected error is low
  • Sometimes it's hard to calculate an error since the solution isn't known
• Or when the results stop improving for a few iterations (stop decreasing or stop increasing, depending on the problem)
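Both stopping rules can be sketched as a single check over the history of best scores. The name `should_stop`, the `patience` parameter, and its default of 5 iterations are assumptions for illustration.

```python
def should_stop(history, patience=5, target=None, maximize=True):
    """history: best fitness per generation, oldest first."""
    if not history:
        return False
    # Rule 1: a known goal has been reached.
    if target is not None:
        latest = history[-1]
        if (latest >= target) if maximize else (latest <= target):
            return True
    # Rule 2: no improvement over the last `patience` generations.
    if len(history) <= patience:
        return False
    earlier, recent = history[:-patience], history[-patience:]
    if maximize:
        return max(recent) <= max(earlier)
    return min(recent) >= min(earlier)
```

The second rule needs no knowledge of the true solution, which is why it is the usual fallback when an error cannot be computed.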

Approximate Solutions
• It might be obvious, but genetic algorithms are approximate by design, since they only attempt to optimize toward a solution
  • The result is only as good as the fitness function, the number of iterations, and the crossing and mutation probabilities

Applications
• Multiple Sequence Alignment
  • Initial generation: random generation of alignments based on the given sequences
    • Authors do not agree on the initial size of the population
  • Selection via a tournament-style pairing, crossing the possible alignments
  • The fitness function: "Sum of pairs" objective function (everyone uses a different one)
  • The survival rate is different for each alignment
    • Sum all alignment scores together and take each alignment's percentage of the total
    • Basically, better alignments have a higher chance to survive
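A rough sketch of a sum-of-pairs score and the percentage-based survival rate. The match, mismatch, and gap values are made-up defaults (as noted above, everyone uses a different objective function), and shifting the scores so that negative totals still yield valid percentages is an assumption of this example.

```python
from itertools import combinations

def sum_of_pairs(alignment, match=1, mismatch=-1, gap=-2):
    """alignment: equal-length strings, with '-' marking a gap."""
    score = 0
    for column in zip(*alignment):
        for x, y in combinations(column, 2):
            if x == "-" and y == "-":
                continue  # gap against gap contributes nothing here
            if x == "-" or y == "-":
                score += gap
            elif x == y:
                score += match
            else:
                score += mismatch
    return score

def survival_shares(scores):
    # Shift so the worst alignment keeps a small positive share,
    # then take each score as a fraction of the total.
    lowest = min(scores)
    shifted = [s - lowest + 1 for s in scores]
    total = sum(shifted)
    return [s / total for s in shifted]

scores = [sum_of_pairs(["ACGT", "ACGT"]), sum_of_pairs(["AC-T", "ACGT"])]
print(scores)  # prints [4, 1]
```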

• Reproduction
  • Crossing uses a "one-point crossover"
    • Takes the first half of the first parent and crosses it with the second half of the second parent
    • ABCD and EFGH -> ABGH
  • Or a "point-to-point crossover"
    • A random index is chosen
    • ABCD and EFGH -> ABCH
• Mutation
  • Remove or insert a gap in the alignment
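The two crossings and the gap mutation can be sketched on plain strings, mirroring the ABCD/EFGH example. This is a toy: the function names are invented, the 50/50 choice between removing and inserting a gap is an assumption, and a real alignment would also need its rows kept at equal length.

```python
import random

def one_point_crossover(a, b):
    # First half of the first parent, second half of the second.
    mid = len(a) // 2
    return a[:mid] + b[mid:]

def random_point_crossover(a, b, rng=random):
    # Same idea, but the cut index is chosen at random.
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate_gap(row, rng=random):
    # Remove a random existing gap, or insert a new one at a random position.
    gaps = [i for i, ch in enumerate(row) if ch == "-"]
    if gaps and rng.random() < 0.5:
        i = rng.choice(gaps)
        return row[:i] + row[i + 1:]
    i = rng.randrange(len(row) + 1)
    return row[:i] + "-" + row[i:]

print(one_point_crossover("ABCD", "EFGH"))  # prints ABGH
```

A random cut at index 3 reproduces the second example, ABCD and EFGH -> ABCH.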

References
• Obitko M. (1998). Genetic Algorithms. Retrieved from http://www.obitko.com/tutorials/genetic-algorithms/
• Radenbaugh A. (2008). Applications of genetic algorithms in bioinformatics. Retrieved from http://scholarworks.sjsu.edu/cgi/viewcontent.cgi?article=4491&context=etd_theses