Representation
Chapter 4
Luke, Essentials of Metaheuristics, 2011
Byung-Hyun Ha
R1
Outline
Introduction
Vectors
Directed encoded graphs
Trees and Genetic Programming
Grammatical Evolution
Rulesets
Bloat
Summary
Introduction Representation of individual
Approach to construct, tweak, and present an individual for fitness assessment
Metaheuristics as general framework
  • Mostly, only the representation differs from problem to problem
Introduction Examples of representation
TSP
  • Permutation-based (order-based)
    • 3-1-2-0 ≡ 1-2-0-3 ≡ 2-0-3-1 ≡ 0-3-1-2 (the same tour), so not distinct
  • Locus-based (gene i is the city visited after city i)
    • 3/2/0/1 (≡ 3-1-2-0, permutation-based), distinct
  • Random-key (cities are visited in ascending order of their keys)
    • 0.78:0.56:0.69:0.11 (≡ 3-1-2-0, permutation-based)
VRP (vehicle routing problems)
  • Using separator
    • 6-9-0-2-4-7-5-0-8-1-3
  • Mutation and crossover?
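The decodings in the examples above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function names are hypothetical, and treating 0 as the route separator in the VRP string is an assumption read off the example.

```python
def random_key_to_tour(keys):
    """Decode random keys: visit cities in ascending order of their keys."""
    return sorted(range(len(keys)), key=lambda city: keys[city])


def locus_to_tour(successor, start):
    """Decode a locus-based genotype, where successor[i] is the city after city i.

    Assumes the genotype encodes a single cycle through all cities.
    """
    tour, city = [start], successor[start]
    while city != start:
        tour.append(city)
        city = successor[city]
    return tour


def split_routes(genotype, separator=0):
    """Decode a separator-based VRP genotype into one route per vehicle."""
    routes, current = [], []
    for gene in genotype:
        if gene == separator:
            routes.append(current)
            current = []
        else:
            current.append(gene)
    routes.append(current)
    return routes


print(random_key_to_tour([0.78, 0.56, 0.69, 0.11]))       # [3, 1, 2, 0]
print(locus_to_tour([3, 2, 0, 1], start=3))               # [3, 1, 2, 0]
print(split_routes([6, 9, 0, 2, 4, 7, 5, 0, 8, 1, 3]))    # [[6, 9], [2, 4, 7, 5], [8, 1, 3]]
```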
Encoding and decoding
  • Phenotype → (encoding) → genotype
  • Genotype → (decoding) → phenotype
  • Tweak is applied to the genotype
source: http://neo.lcc.uma.es/cEA-web/VRP.htm
Introduction Tweak in representation
Phenotype → (encoding) → genotype → (Tweak) → genotype → (decoding) → phenotype
Determining fitness landscape
  • Example: Hamming cliff and Gray coding
Remember small change!
  • A small change in the genotype should give a small change in the phenotype; this usually helps metaheuristics.
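A small sketch of the Hamming cliff, assuming integers encoded as bit strings (function names are illustrative only). Going from 7 to 8 flips four bits in plain binary, but only one bit in the reflected Gray code.

```python
def to_gray(n):
    """Convert a non-negative integer to its reflected Gray code."""
    return n ^ (n >> 1)


def from_gray(g):
    """Invert to_gray by folding in successively shifted copies."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


# Plain binary: 7 = 0111, 8 = 1000, so a step of one costs four bit flips.
print(hamming(7, 8))                      # 4
# Gray code: neighbouring integers always differ in exactly one bit.
print(hamming(to_gray(7), to_gray(8)))    # 1
```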
Introduction Much of representation is an art, not a science!
e.g., workflow (business process)
  • How to encode and tweak?
source: http://www.tonymarston.net/php-mysql/workflow.html
Introduction Properties, required (Talbi, 2009)
Completeness
  • All solutions should be representable
Connexity
  • A search path must exist between any two solutions (in particular, to the global optimum)
Efficiency
  • Ease of manipulation
Representation-solution mapping (Talbi, 2009)
One-to-one
Many-to-one
  • Redundancy will enlarge the size of the search space.
One-to-many
  • A good solution should be constructed from an individual.
Vectors Initialization and bias
Not difficult to initialize
  • Some totally-random initialization method (covered already)
Bias?
  • e.g., solution for robot walking using heuristic (e.g., by motion capture)
  • But diversity is useful, particularly early on.
  • Some suggestions
    1) Biasing is dangerous.
    2) Start with values that aren’t all or exactly based on heuristic bias.
Mutation
Examples
  • Gaussian convolution, bit-flip mutation, ...
  • Integer vector: Integer Randomization Mutation, Random Walk Mutation, ...
c.f., point mutation
  • Useful when there is less chance to get improvement by changing several genes at a time
  • But it can be trapped in a local optimum.
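The mutation operators named above can be sketched roughly as follows. Parameter names and defaults (p, sigma, lo, hi) are assumptions made for illustration, and the bounded resampling in the Gaussian case is one common choice rather than the only one.

```python
import random


def gaussian_convolution(v, p=1.0, sigma=0.1, lo=0.0, hi=1.0, rng=random):
    """With probability p per gene, add Gaussian noise, resampling until in [lo, hi].

    Assumes every gene of v already lies in [lo, hi] and sigma > 0.
    """
    out = list(v)
    for i in range(len(out)):
        if rng.random() < p:
            while True:
                candidate = out[i] + rng.gauss(0.0, sigma)
                if lo <= candidate <= hi:
                    out[i] = candidate
                    break
    return out


def bit_flip(v, p, rng=random):
    """Flip each bit independently with probability p."""
    return [(1 - bit) if rng.random() < p else bit for bit in v]


def point_mutation(v, lo, hi, rng=random):
    """Re-randomize exactly one randomly chosen integer gene."""
    out = list(v)
    i = rng.randrange(len(out))
    out[i] = rng.randint(lo, hi)
    return out
```

Point mutation changes one gene per call, which is why it cannot cross a valley that requires two genes to change at once.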
Vectors Recombination
One- and Two-point Crossover, Uniform Crossover
Line Recombination, Intermediate Recombination
...
Phenotype-specific mutation or crossover
  • e.g., Jung & Moon, The Natural Crossover for the 2D Euclidean TSP, 2002
Consider fitness landscape.
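A rough sketch of the generic vector crossovers listed above, assuming both parents are Python lists of equal length. The names, the swap probability p, and the `extra` range parameter are illustrative; bounds checking after line recombination is omitted.

```python
import random


def one_point_crossover(a, b, rng=random):
    """Swap the tails of a and b after a random cut point."""
    c = rng.randint(0, len(a))
    return a[:c] + b[c:], b[:c] + a[c:]


def two_point_crossover(a, b, rng=random):
    """Swap the segment between two random cut points."""
    c, d = sorted((rng.randint(0, len(a)), rng.randint(0, len(a))))
    return a[:c] + b[c:d] + a[d:], b[:c] + a[c:d] + b[d:]


def uniform_crossover(a, b, p=0.5, rng=random):
    """Swap each gene independently with probability p."""
    a, b = list(a), list(b)
    for i in range(len(a)):
        if rng.random() < p:
            a[i], b[i] = b[i], a[i]
    return a, b


def line_recombination(a, b, extra=0.25, rng=random):
    """Pick two children on the line through a and b (real-valued vectors)."""
    alpha = rng.uniform(-extra, 1 + extra)
    beta = rng.uniform(-extra, 1 + extra)
    child1 = [alpha * x + (1 - alpha) * y for x, y in zip(a, b)]
    child2 = [beta * y + (1 - beta) * x for x, y in zip(a, b)]
    return child1, child2
```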
Directed Encoded Graphs Graphs
Examples
  • Neural networks, finite-state automata, Petri nets, electrical circuits, ...
Types
  • Directed, undirected, with labels, with weights, cyclic, acyclic, recurrent, feed-forward, sparse, dense, planar, ...
  • These are constraints that Tweak must respect.
Arbitrary-structured graph
  • Target of graph representation
Types of encoding
Direct encoding
  • Exact node and edge description in representation
Indirect (developmental) encoding
  • Some (production) rules for constructing a graph as a solution (discussed later)
Directed Encoded Graphs Full adjacency matrix
e.g., a recurrent directed graph structure, with
  • no more than 5 nodes
  • no more than one edge between any two nodes
  • self-edges allowed
  • weights for edges
Mutation and crossover
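One way to realize the example above is a 5×5 matrix whose entry [i][j] is either an edge weight or None. The sketch below is an assumption-laden illustration: the weight range, probabilities, and function names are invented for the example, and crossover simply flattens the matrix and reuses a vector operator.

```python
import random

N = 5  # at most 5 nodes, as in the example above


def random_matrix(edge_prob=0.3, rng=random):
    """m[i][j] is the weight of edge i -> j, or None if absent (i == j is a self-edge)."""
    return [[rng.uniform(-1.0, 1.0) if rng.random() < edge_prob else None
             for _ in range(N)] for _ in range(N)]


def mutate(m, p=0.1, sigma=0.2, rng=random):
    """With probability p per cell: add an edge, remove it, or perturb its weight."""
    out = [row[:] for row in m]
    for i in range(N):
        for j in range(N):
            if rng.random() < p:
                if out[i][j] is None:
                    out[i][j] = rng.uniform(-1.0, 1.0)   # add a missing edge
                elif rng.random() < 0.5:
                    out[i][j] = None                     # remove the edge
                else:
                    out[i][j] += rng.gauss(0.0, sigma)   # perturb the weight
    return out


def flatten(m):
    return [w for row in m for w in row]


def unflatten(v):
    return [v[i * N:(i + 1) * N] for i in range(N)]


def crossover(m1, m2, rng=random):
    """One-point crossover on the flattened matrices."""
    v1, v2 = flatten(m1), flatten(m2)
    c = rng.randint(0, N * N)
    return unflatten(v1[:c] + v2[c:]), unflatten(v2[:c] + v1[c:])
```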
Directed Encoded Graphs Arbitrary graph structure
Initialization of graph (N, E)
  • Determination of number of nodes and edges
    • e.g., using geometric distribution
  • Creation of a node and an edge, depending on type of target graph
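A minimal sketch of this initialization, assuming a directed graph with self-edges allowed and no parallel edges. The success probabilities p_nodes and p_edges and the function names are hypothetical.

```python
import random


def geometric(p, rng=random):
    """Number of trials up to and including the first success (always >= 1)."""
    n = 1
    while rng.random() >= p:
        n += 1
    return n


def random_graph(p_nodes=0.2, p_edges=0.1, rng=random):
    """Draw node and edge counts from geometric distributions, then add random edges."""
    n = geometric(p_nodes, rng)
    nodes = list(range(n))
    # Cap the edge count at n * n: directed, self-edges allowed, no parallel edges.
    m = min(geometric(p_edges, rng) - 1, n * n)
    edges = set()
    while len(edges) < m:
        edges.add((rng.choice(nodes), rng.choice(nodes)))
    return nodes, edges
```

Other target graph types (for example acyclic or connected graphs) would need extra checks when each edge is created.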
Directed Encoded Graphs Arbitrary graph structure (cont’d)
Further considerations in initialization
  • e.g., connected and directed acyclic graph
  • c.f., general algorithms textbook
Mutation
  • e.g., do one of the following, a random number of times
    • delete a random edge
    • add a random edge
    • delete a node and all its edges
    • add a node
    • relabel a node
    • relabel an edge
Recombination
  • c.f., goal of crossover is to transfer essential and useful elements to another individual
  • Determining elements to transfer
    • Selecting subset of nodes and edges, or selecting subgraph
  • Coping with missing targets of edges and with disjoint subgraphs
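The mutation list above might look like this for a directed graph held as a set of nodes, a set of (from, to) edges, and a dict of node labels. This is a sketch under those assumptions; the label alphabet "abc" is invented, and edge relabeling is left out because edges here carry no labels.

```python
import random


def mutate_graph(nodes, edges, labels, rng=random):
    """Apply one randomly chosen structural mutation in place and return its name."""
    op = rng.choice(["delete_edge", "add_edge", "delete_node",
                     "add_node", "relabel_node"])
    if op == "delete_edge" and edges:
        edges.discard(rng.choice(sorted(edges)))
    elif op == "add_edge" and nodes:
        edges.add((rng.choice(sorted(nodes)), rng.choice(sorted(nodes))))
    elif op == "delete_node" and len(nodes) > 1:
        victim = rng.choice(sorted(nodes))
        nodes.discard(victim)
        labels.pop(victim, None)
        # Deleting a node also deletes every edge that touches it.
        edges.difference_update({e for e in edges if victim in e})
    elif op == "add_node":
        new = max(nodes, default=-1) + 1
        nodes.add(new)
        labels[new] = rng.choice("abc")
    elif op == "relabel_node" and nodes:
        labels[rng.choice(sorted(nodes))] = rng.choice("abc")
    return op
```

Calling it a random number of times (for instance a geometrically distributed count) gives the "random number of times" behaviour.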
Directed Encoded Graphs Arbitrary graph structure (cont’d)
Recombination (cont’d)
Directed Encoded Graphs Example of container terminal operations
Relocation of containers in a bay for efficient loading
  • Solution as a list of movements
    • e.g., (8-7) (4-1) (3-7) (2-4) (6-7) (2-4) (2-3) (2-7) (8-4) (0-3) (8-2) (9-6) (1-2) (5-1) (5-2) (9-2) (9-2) (7-3) (9-6) (5-0) (7-9) (8-9) (6-9) (7-5) (7-9) (6-1) (6-3) (6-8) (3-6) (1-6) (8-6)
    • Weakness?
  • Solution as a graph
    • Sufficient?
[Figure: example graph with nodes a, b, c, d, e, f]
Trees and Genetic Programming Genetic Programming
How to use stochastic methods to search for and optimize small computer programs or other computational devices
Concept of suboptimality, required
  • Not simply right or wrong
Examples
  • Team soccer robot behavior, fitting a mathematical equation to a data set, finding finite-state automata that match a given language
Representation
Lists or trees, usually
  • e.g., an artificial ant, sin(cos(x – sin x) + xx) for symbolic regression
Trees and Genetic Programming Primitives in representation
Basic functions (e.g., kick-toward-goal) or CPU operations (e.g., +)
Constraints of context
  • e.g., 4 + kick-toward-goal() makes no sense
  • e.g., matrix-multiply, expecting exactly two children and ...
Tweaks need to maintain closure (valid individuals)
Fitness assessment
Conversion of data (genotype) to code (phenotype), and evaluation
Examples
  • Symbolic regression: sum of squared errors
  • Artificial ant: amount of food eaten
Tree-Style Genetic Programming Pipeline
  • Sec. 3.3.3
  • One of the popular algorithms for Genetic Programming (but not limited to it)
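For symbolic regression, fitness assessment could be sketched as below. The nested-tuple tree format, the function set, and the names are assumptions made for this example; a tree is either the variable "x", a numeric constant, or (function_name, child, ...).

```python
import math

# Hypothetical function set: name -> (arity, implementation).
FUNCTIONS = {
    "add": (2, lambda a, b: a + b),
    "sub": (2, lambda a, b: a - b),
    "mul": (2, lambda a, b: a * b),
    "sin": (1, math.sin),
    "cos": (1, math.cos),
}


def evaluate(tree, x):
    """Interpret the genotype (a tree) as code and run it on input x."""
    if isinstance(tree, str):
        if tree != "x":
            raise ValueError(f"unknown terminal: {tree!r}")
        return x
    if isinstance(tree, (int, float)):
        return float(tree)
    name, *children = tree
    arity, fn = FUNCTIONS[name]
    if len(children) != arity:
        raise ValueError(f"{name} expects {arity} children")
    return fn(*(evaluate(child, x) for child in children))


def sum_squared_error(tree, samples):
    """Lower is better: squared error over (x, target) pairs."""
    return sum((evaluate(tree, x) - y) ** 2 for x, y in samples)


tree = ("add", ("mul", "x", "x"), 1.0)            # represents x*x + 1
samples = [(x, x * x + 1.0) for x in (-1.0, 0.0, 2.0)]
print(sum_squared_error(tree, samples))           # 0.0
```

The arity check is a small instance of the closure requirement: a Tweak that produced a node with the wrong number of children would be rejected here.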
Trees and Genetic Programming Initialization
New trees by repeatedly selecting from a function set
  • Considering arity (predefined number of children)
  • e.g., Grow, Full, Ramped Half-and-Half, PTC2 algorithms
Ephemeral random constants
  • Handling constants for leaves (e.g., 0.2462, 0.9, –2.34, 3.14, “s%&e:m”)
  • Special leaf nodes to be transformed into randomly-generated constants
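A simplified sketch of Full, Grow, and Ramped Half-and-Half, with an ephemeral random constant (ERC) leaf. The function set, the constant range, and the depth convention (root at depth 0) are assumptions for illustration and may differ from the textbook pseudocode.

```python
import random

FUNCTION_ARITY = {"add": 2, "sub": 2, "mul": 2, "sin": 1, "cos": 1}
TERMINALS = ["x", "ERC"]  # "ERC" is a placeholder replaced by a random constant


def make_leaf(rng):
    leaf = rng.choice(TERMINALS)
    return rng.uniform(-1.0, 1.0) if leaf == "ERC" else leaf


def full(depth, max_depth, rng=random):
    """Every branch reaches max_depth: only functions above it, only leaves at it."""
    if depth >= max_depth:
        return make_leaf(rng)
    name = rng.choice(sorted(FUNCTION_ARITY))
    return (name, *(full(depth + 1, max_depth, rng)
                    for _ in range(FUNCTION_ARITY[name])))


def grow(depth, max_depth, rng=random):
    """Branches may stop early: choose among functions and terminals at each node."""
    if depth >= max_depth:
        return make_leaf(rng)
    choice = rng.choice(sorted(FUNCTION_ARITY) + TERMINALS)
    if choice in FUNCTION_ARITY:
        return (choice, *(grow(depth + 1, max_depth, rng)
                          for _ in range(FUNCTION_ARITY[choice])))
    return rng.uniform(-1.0, 1.0) if choice == "ERC" else choice


def ramped_half_and_half(min_depth, max_depth, rng=random):
    """Pick a depth at random, then use Full or Grow with equal probability."""
    d = rng.randint(min_depth, max_depth)
    return full(0, d, rng) if rng.random() < 0.5 else grow(0, d, rng)


def depth_of(tree):
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth_of(child) for child in tree[1:])
```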
Trees and Genetic Programming Recombination
e.g., subtree crossover: swap two selected subtrees
  • Non-homologous (i.e., global)
Mutation
Examples
  • Replacing random subtree with randomly-generated one (subtree mutation)
  • Replacing random non-leaf node with one of its subtrees
  • Picking random non-leaf node and swapping its subtrees
  • Mutating ephemeral random constants by introducing some noise
  • Swapping two disjoint subtrees
c.f., mutation is not popular, because crossover is usually non-homologous
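Subtree crossover can be sketched on immutable nested-tuple trees of the form (function_name, child, ...). The helper names are hypothetical, and nodes are picked uniformly here, whereas practical systems often bias the choice toward non-leaf nodes.

```python
import random


def paths(tree, prefix=()):
    """Yield the path (tuple of child indices) to every node in the tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))


def get(tree, path):
    """Return the subtree found by following path from the root."""
    for i in path:
        tree = tree[i]
    return tree


def replace(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]


def subtree_crossover(a, b, rng=random):
    """Swap one randomly selected subtree of a with one of b."""
    pa = rng.choice(list(paths(a)))
    pb = rng.choice(list(paths(b)))
    return replace(a, pa, get(b, pb)), replace(b, pb, get(a, pa))


def size(tree):
    return sum(1 for _ in paths(tree))
```

The two crossover points are chosen independently in each parent, which is what makes the operator non-homologous.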
Trees and Genetic Programming Forests
e.g., forest of soccer robot team with each member as tree
Automatically defined functions (ADF)
Not predefined functions but trees called by primary tree
c.f., Modularity
  • Useful when we believe a good solution has repetitive parts
Trees and Genetic Programming Edge encoding
e.g., an edge encoding for a finite-state automaton (it’s a graph) that interprets regular language (1|0)*01
  • c.f., http://en.wikipedia.org/wiki/Lexical_analysis
Indirect encoding (developmental encoding)
Trees and Genetic Programming Example of container terminal operations
Container-grounding position determination by weighted sum of scores
  • Solution as a list of weights
    • Weakness?
  • Genetic programming?
Grammatical Evolution Using predefined grammar for tree
Trees generated by lists (indirect encoding)
  • c.f., http://en.wikipedia.org/wiki/Backus-Naur_form
Pros and cons
  • Almost always valid tree, reduced size of search space
  • Tiny changes early in the list result in gigantic changes in the tree (un-smoothness).
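A toy decoder illustrating the idea, under several assumptions: the grammar below is invented, every nonterminal expansion consumes one integer from the list (some variants skip nonterminals with a single option), and the list may wrap around a limited number of times before the individual is declared invalid.

```python
# Hypothetical BNF-style grammar: nonterminal -> list of alternative expansions.
GRAMMAR = {
    "<expr>": [["<expr>", "<op>", "<expr>"], ["<var>"]],
    "<op>": [["+"], ["*"]],
    "<var>": [["x"], ["1"]],
}


def decode(genome, start="<expr>", max_wraps=2):
    """Expand the leftmost nonterminal using genome[i] mod (number of options)."""
    symbols, out = [start], []
    i, budget = 0, len(genome) * (max_wraps + 1)
    while symbols:
        sym = symbols.pop(0)
        if sym not in GRAMMAR:
            out.append(sym)          # terminal: emit as-is
            continue
        if i >= budget:
            return None              # ran out of genes: invalid individual
        options = GRAMMAR[sym]
        choice = options[genome[i % len(genome)] % len(options)]
        i += 1
        symbols = list(choice) + symbols
    return " ".join(out)


print(decode([0, 1, 0, 0, 1, 1]))   # x + 1
print(decode([1, 1, 0, 0, 1, 1]))   # 1
```

Changing only the first gene from 0 to 1 collapses the whole expression, which is the un-smoothness noted above.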
Rulesets
A policy as solution of problem
Consisting of a set of rules
e.g., stock trading program, entities in simulations
State-action rules
Typical form
  • a ∧ b ∧ ... ∧ y → z
  • e.g., (left sonar value > 3.2) ∧ (forward sonar value 5.0) → (turn left to 50)
An interpretation
  • Mapping from state space into actions
Under-specification and over-specification
  • Default rules, vote, ...
Fitness assessment
  • On a ruleset, or on a series of rules
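A minimal sketch of interpreting a state-action ruleset, assuming a rule is a (condition, action) pair. The default action handles under-specification and a simple vote handles over-specification; the sensor names, thresholds, and actions below are hypothetical.

```python
def act(ruleset, state, default_action):
    """Return the action chosen by the rules whose conditions match the state."""
    votes = {}
    for condition, action in ruleset:
        if condition(state):
            votes[action] = votes.get(action, 0) + 1
    if not votes:
        return default_action            # under-specification: no rule fired
    return max(votes, key=votes.get)     # over-specification: most votes wins


# Hypothetical ruleset for a robot with two sonar readings.
rules = [
    (lambda s: s["left"] > 3.2, "turn_left"),
    (lambda s: s["forward"] < 1.0, "stop"),
    (lambda s: s["left"] > 5.0, "turn_left"),
]

print(act(rules, {"left": 6.0, "forward": 0.5}, "forward"))   # turn_left
print(act(rules, {"left": 0.0, "forward": 9.0}, "forward"))   # forward
```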
Rulesets
Production rules
Typical form
  • a → b ∧ c ∧ ... ∧ z
Modular indirect encoding
  • Describing large complex solution with lots of repetitions by small and compact rule (search) space
e.g., 8-node directed unlabeled graph structure as solution
Rulesets
Production rules (cont’d)
e.g., Lindenmayer systems (L-systems)
  • e.g., Koch Curve
    • F → F + F – F – F + F
    • F: draw a line forward, +: turn left, –: turn right
F
F+F-F-F+F
F+F-F-F+F+F+F-F-F+F-F+F-F-F+F-F+F-F-F+F+F+F-F-F+F
F+F-F-F+F+F+F-F-F+F-F+F-F-F+F-F+F-F-F+F+F+F-F-F+F+F+F-F-F+F+F+F-F-F+F-F+F-F-F+F-F+F-F-F+F+F+F-F-F+F-F+F-F-F+F+F+F-F-F+F-F+F-F-F+F-F+F-F-F+F+F+F-F-F+F-F+F-F-F+F+F+F-F-F+F-F+F-F-F+F-F+F-F-F+F+F+F-F-F+F+F+F-F-F+F+F+F-F-F+F-F+F-F-F+F-F+F-F-F+F+F+F-F-F+F
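The expansions above can be reproduced with a few lines of string rewriting. The turtle interpretation in `trace` assumes unit steps and 90-degree turns, which the slide does not state; the names are illustrative.

```python
def expand(axiom, rules, generations):
    """Rewrite every symbol in parallel, once per generation."""
    s = axiom
    for _ in range(generations):
        s = "".join(rules.get(ch, ch) for ch in s)
    return s


def trace(commands):
    """Interpret F, +, - as turtle moves on a grid (90-degree turns assumed)."""
    x, y, dx, dy = 0, 0, 1, 0
    points = [(x, y)]
    for ch in commands:
        if ch == "F":
            x, y = x + dx, y + dy
            points.append((x, y))
        elif ch == "+":
            dx, dy = -dy, dx     # turn left
        elif ch == "-":
            dx, dy = dy, -dx     # turn right
    return points


KOCH = {"F": "F+F-F-F+F"}
print(expand("F", KOCH, 1))             # F+F-F-F+F
print(trace(expand("F", KOCH, 1)))      # [(0, 0), (1, 0), (1, 1), (2, 1), (2, 0), (3, 0)]
```

A single short rule generates an exponentially long drawing, which is the point of modular indirect encoding.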
Bloat Code bloat or code growth
A problem with variable-sized representation
Far from optimum usually, memory consumption, ... and ugly
Common ways of handling
Limiting size when individual is Tweaked
Editing individual, to remove introns and the like
Punishing individual for being very large
  • e.g., linear parsimony pressure (problem?)
    • revised fitness f = αr – (1 – α)s, where r: fitness, s: size of individual, α: constant weighting fitness against size
  • e.g., non-parametric parsimony pressure
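Both kinds of parsimony pressure can be sketched briefly. The default α and the function names are assumptions, and the lexicographic tournament (fitter wins, ties go to the smaller individual) is shown as one example of a non-parametric method, not the only one.

```python
import random


def linear_parsimony(raw_fitness, size, alpha=0.9):
    """Revised fitness f = alpha * r - (1 - alpha) * s (higher is better)."""
    return alpha * raw_fitness - (1 - alpha) * size


def lexicographic_tournament(population, fitness, size, t=2, rng=random):
    """Tournament selection: the fitter contestant wins, ties go to the smaller one."""
    contestants = [rng.choice(population) for _ in range(t)]
    return max(contestants, key=lambda ind: (fitness(ind), -size(ind)))


print(linear_parsimony(10.0, 20, alpha=0.5))   # -5.0
```

The linear form needs fitness and size on comparable scales, since a poorly chosen α lets one term swamp the other; the tournament variant avoids that by comparing size only on ties.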
Summary
Phenotype & genotype
Encoding & decoding
Representations
  • Vectors
  • Graphs
    + Indirect-encoded graphs (edge encoding)
  • Trees
    + Indirect-encoded trees (Grammatical Evolution)
  • Rulesets
Bloat