Slide 1
A classification approach for structure discovery in search spaces of combinatorial optimization problems
Daniel Porumbel 1, 2, *, Jin-Kao Hao 2, Pascale Kuntz 1
1 LINA CS Laboratory, University of Nantes, France
2 LERIA CS Laboratory, University of Angers, France
* New affiliation from September 2010: LGI2A CS Lab., Artois University, France

Slide 2
Data Mining in Search Spaces
Search space: the set of all (potential) solutions of an optimization problem
A heuristic search algorithm goes through this search space to (try to) find the optimal solution
Information on the structure of the search space is essential for guiding the heuristic
Main drawback: the search space is too large for complete enumeration or characterization
Use data mining to discover hidden structures

Slide 3
Towards guided optimization heuristics
General objective: adapt the search process (the heuristic) according to discovered search space properties
Our approach: consider a sample of the search space, i.e. a set of best candidate solutions
Question: how are these solutions organized/structured?

Slide 4
Graph K-coloring problem
A graph G=(V,E) is K-colorable if we can label its vertices with exactly K colors so that any two adjacent vertices have different labels (colors)
Graph coloring problem: determine the minimal number K such that G is K-colorable, i.e. the chromatic number χ(G)
NP-hard problem
Our experimental framework: the graph coloring problem

Slide 5
Encoding of Coloring Solutions
Solution C = a vector (c_1 | c_2 | ... | c_|V|) of labels/colors
Conflict number of C: number of edges with both ends of the same color
K-coloring problem: find an optimal solution C* minimizing the conflict number
The problem is solved if C* has no conflicts
Example: solution C=(3|1|1|1|2), conflict number = 2 (V3-V2 and V3-V4)

Slide 6
Search Space for K-coloring
The set of all solutions has cardinality K^|V| ~ a space of dimension |V|
Neighborhood relation to pass from solution to solution: pass from a solution C to a neighboring C' by changing just one color in a conflicting vertex of C
Our approach for solving the coloring problem: Tabu Search

Slide 7
Tabu Search [Glover and Laguna, Tabu Search, 1997]
A local search moving from solution to solution by applying the neighborhood relation
It can move to a worse solution when no move decreases the number of conflicts (no down move)
Risk: repeating the same set of moves again and again => Tabu list = a list of temporarily forbidden moves

Slide 8
Classification of candidate solutions
Question: given a set of best solutions, how are they distributed in the search space?
Our approach:
1. compute the distance (dissimilarity) between each pair of best solutions discovered by Tabu Search
2. analyze the solution distribution via the observed distance values
This requires a distance that can be computed with low algorithmic complexity

Slide 9
Fast Distance Calculation
K-coloring: the transfer distance between partitions [Régnier, Sur quelques aspects mathématiques des problèmes de classification automatique, 1965]
Classical complexity: Hungarian algorithm, O(|V| + k^3)
Las Vegas algorithm: O(|V|) time if the partitions are close enough (condition required for our problem) [Porumbel, Hao, Kuntz, An improved algorithm for computing the partition distance, Discrete Applied Mathematics, 2010]

Slide 10
High Quality Solutions plotted via Multidimensional Scaling
350 best local minima represented via Multidimensional Scaling
DIMACS graph G=dsjc250.5 with k = 27
These points form clusters that can be covered by spheres of small diameter.

Slide 11
Tabu Search trajectory: MDS representation
We consider a Tabu Search exploring the search space
We launch it from a local optimum and let it explore
This figure plots the high-quality solutions (those not worse than the starting point)
Intuitively, the visited high-quality colorings are grouped in clusters

Slide 12
Trajectory of long Tabu Search processes
Consider a Tabu Search process exploring the search space:
Record the first 40,000 high-quality solutions and compute all pairwise distances between them
Histogram: number of pairs of solutions for each observed distance value
We observe:
Small distances: solutions in the same cluster
Large distances: solutions in different clusters

Slide 13
Integrating learned information in the search process
These analyses led us to the clustering hypothesis: the best candidate solutions are grouped into clusters that can be covered by spheres of a specific diameter (10% of |V|)
Question: how to exploit such information to improve a search process?
Answer: consider a sphere-based search space organization

Slide 15
Numerical Results
Integrating learned information (e.g. clustering information) helped a basic local search reach competitive results
We proposed one of the first local search algorithms that can compete with complex population-based heuristics [Porumbel, Hao & Kuntz, A Search Space Cartography for Guiding Graph Coloring Heuristics, Computers & OR, 2010]
First coloring with k=223 colors for the well-studied DIMACS graph dsjc1000.9

Slide 16
Detailed Numerical Results

Slide 17
Conclusions
Clustering information can be very useful in guiding any search algorithm
We also employed the clustering hypothesis in an evolutionary coloring approach
Such techniques can be used for any combinatorial optimization problem given a distance function that can be rapidly computed
More advanced techniques can be used to improve our knowledge of search spaces
Using learning in optimization seems a promising direction (e.g. the Learning and Intelligent Optimization)
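
Illustration (not part of the original slides)
The two quantities the talk relies on, the conflict number of a coloring and the transfer distance between two colorings, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function names and the 5-vertex cycle graph are hypothetical (the slide's own graph is not recoverable from the transcript), and the distance is computed by brute force over all relabelings of the colors. It is neither the Hungarian algorithm nor the authors' O(|V|) Las Vegas algorithm, and is only practical for very small k. The transfer distance is taken here in its usual sense: the minimum number of vertices that must change class to turn one partition into the other.

```python
from itertools import permutations


def conflict_number(coloring, edges):
    """Number of edges whose two endpoints share a color."""
    return sum(1 for u, v in edges if coloring[u] == coloring[v])


def transfer_distance(a, b, k):
    """Minimum number of vertices that must change class to turn
    partition a into partition b (colors are labels 1..k).

    Assumption: brute force over all k! color relabelings, so this is
    only usable for tiny k.
    """
    # overlap[i][j] = number of vertices colored i+1 in a and j+1 in b
    overlap = [[0] * k for _ in range(k)]
    for ca, cb in zip(a, b):
        overlap[ca - 1][cb - 1] += 1
    # Best one-to-one matching between the color classes of a and b.
    best = max(
        sum(overlap[i][perm[i]] for i in range(k))
        for perm in permutations(range(k))
    )
    return len(a) - best


# Hypothetical graph: the cycle V1-V2-V3-V4-V5-V1, vertices 0-indexed.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 4)]
c = [3, 1, 1, 1, 2]  # the example coloring C=(3|1|1|1|2)

print(conflict_number(c, edges))                 # conflicts V2-V3 and V3-V4
print(transfer_distance(c, [1, 2, 2, 2, 3], 3))  # same partition, relabeled
print(transfer_distance(c, [3, 1, 1, 2, 2], 3))  # one vertex moved
```

On this assumed graph the example coloring has two conflicts, as on the slide. A pure relabeling of the colors gives distance 0, which is why the distance is defined on partitions rather than on raw color vectors.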