Using Personal Condor to Solve Quadratic Assignment Problems
Jeff Linderoth, Axioma, Inc.
Partners in Crime
• Kurt Anstreicher, Nate Brixius (University of Iowa)
• Jean-Pierre Goux (MCS Division, ANL)
• LOTS of people in this room! (University of Wisconsin)
Our Mission
1. Find the best possible solution to large quadratic assignment problem (QAP) instances
2. Prove that the solution is indeed optimal
3. Show how to exploit the Computational Grid offered by Personal Condor to make it happen
What’s a QAP?
Can be thought of as a facility location problem
The QAP is NP-REALLY-Hard
• TSP: Solve n=13509
• QAP: Solve n=25
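The facility-location view can be made concrete with a small sketch. It assumes the standard Koopmans-Beckmann form of the QAP (flow between two facilities times the distance between their assigned locations), which the slides do not spell out. The matrices `flow` and `dist`, the function names, and the 3-facility instance are made up for illustration.

```python
from itertools import permutations


def qap_cost(flow, dist, perm):
    """Cost of placing facility i at location perm[i]."""
    n = len(perm)
    return sum(flow[i][j] * dist[perm[i]][perm[j]]
               for i in range(n) for j in range(n))


def brute_force_qap(flow, dist):
    """Try all n! assignments; only feasible for tiny n."""
    n = len(flow)
    return min((qap_cost(flow, dist, p), p) for p in permutations(range(n)))


# Hypothetical instance: three locations on a line, made-up flows.
flow = [[0, 5, 2],
        [5, 0, 3],
        [2, 3, 0]]
dist = [[0, 1, 2],
        [1, 0, 1],
        [2, 1, 0]]

best_cost, best_perm = brute_force_qap(flow, dist)
```

Complete enumeration stops being an option very quickly: at n=25 there are 25!, roughly 1.55 × 10^25, assignments.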
Q: Why Is This Important?
Answer #1: Practical applications
• Facility Location
• Hospital Design
• Flight Instrument Layout
Answer #2: Similarity
• Comparable to other practically important combinatorial optimization problems
• TSP, MIP
The REAL Answer – It’s NOT!
“The Journey Is The Reward”
What can we learn about solving complex numerical problems on Computational Grids?
The Perfect Marriage
While my wife likes this slide, really it’s the QAP and Condor that make the perfect marriage!
Making the Perfect Marriage
• Something Old
• Something New
• Something Borrowed
• Something Blue
Something Old: Branch-and-Bound
1. Bound
• Solve “auxiliary” problem that gives a lower bound on the optimal solution to the problem
• Any assignment of facilities to locations gives an upper bound on the optimal solution
• What if lower bound < upper bound?
2. Branch
• Divide-and-Conquer!
• Recursively make problem smaller by assigning each facility to a fixed location
Without the bounding, this is complete enumeration (n!).
This is not “pleasantly parallel” computing!
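The bound/branch loop can be sketched on a toy instance. This is a minimal illustration, not the solver from the talk. It assumes the Koopmans-Beckmann cost with non-negative data, and it uses the cost of the facilities placed so far as a deliberately weak lower bound in place of the "auxiliary" problem. All names and the instance are hypothetical.

```python
def qap_branch_and_bound(flow, dist):
    """Depth-first branch-and-bound for a tiny QAP with non-negative data."""
    n = len(flow)
    best = {"cost": float("inf"), "perm": None, "nodes": 0}

    def partial_cost(perm):
        # Cost among already-placed facilities. With non-negative flows and
        # distances, no completion can cost less, so this is a valid
        # (if weak) lower bound.
        k = len(perm)
        return sum(flow[i][j] * dist[perm[i]][perm[j]]
                   for i in range(k) for j in range(k))

    def search(perm, free):
        best["nodes"] += 1
        bound = partial_cost(perm)
        if bound >= best["cost"]:
            return  # lower bound meets the upper bound: prune this subtree
        if not free:
            # A complete assignment: its cost is a new, better upper bound.
            best["cost"], best["perm"] = bound, tuple(perm)
            return
        # Branch: fix the next facility at each free location in turn.
        for loc in sorted(free):
            search(perm + [loc], free - {loc})

    search([], frozenset(range(n)))
    return best


# Hypothetical instance: three locations on a line, made-up flows.
flow = [[0, 5, 2],
        [5, 0, 3],
        [2, 3, 0]]
dist = [[0, 1, 2],
        [1, 0, 1],
        [2, 1, 0]]

result = qap_branch_and_bound(flow, dist)
```

The subtrees are not independent: a good upper bound found in one of them prunes work in all the others, which is one way to see why the search is not pleasantly parallel.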
Something New
• A convex quadratic programming relaxation
• Solved with the Frank-Wolfe Algorithm*
• Each iteration is one linear assignment problem
* Something VERY old
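To see why each Frank-Wolfe iteration costs one linear assignment problem (LAP), consider minimizing a convex quadratic over the doubly stochastic matrices, whose vertices are the permutation matrices. The sketch below is an assumption-laden stand-in. The objective 0.5 * ||X - target||^2 is a placeholder rather than the relaxation used in the talk, the LAP is solved by brute force, and every name is hypothetical.

```python
from itertools import permutations


def solve_lap(cost):
    """Linear assignment by brute force (fine for tiny n); returns a permutation."""
    n = len(cost)
    return min(permutations(range(n)),
               key=lambda p: sum(cost[i][p[i]] for i in range(n)))


def frank_wolfe(target, iters=50):
    """Minimize 0.5 * ||X - target||_F^2 over doubly stochastic X."""
    n = len(target)
    # Start at a vertex of the feasible set: the identity permutation matrix.
    x = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    history = []
    for _ in range(iters):
        grad = [[x[i][j] - target[i][j] for j in range(n)] for i in range(n)]
        history.append(0.5 * sum(g * g for row in grad for g in row))
        # Linearized subproblem: the best vertex for this gradient is a LAP.
        p = solve_lap(grad)
        d = [[(1.0 if p[i] == j else 0.0) - x[i][j] for j in range(n)]
             for i in range(n)]
        gd = sum(grad[i][j] * d[i][j] for i in range(n) for j in range(n))
        dd = sum(v * v for row in d for v in row)
        if gd >= -1e-12 or dd == 0.0:
            break  # no descent direction left
        step = min(1.0, -gd / dd)  # exact line search for this quadratic
        x = [[x[i][j] + step * d[i][j] for j in range(n)] for i in range(n)]
    return x, history


# Hypothetical doubly stochastic target matrix.
target = [[0.5, 0.5, 0.0],
          [0.5, 0.0, 0.5],
          [0.0, 0.5, 0.5]]
x, history = frank_wolfe(target)
```

Every iterate is a convex combination of permutation matrices, so it stays feasible without any projection step.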
Something Borrowed
With Condor it is easy to “borrow” CPU cycles
1. Call your friends and colleagues and flock with their Condor pools
2. Write an NPACI proposal and Glide-In to supercomputer resources
3. If all else fails (Condor/Globus not installed), hobble in!
My Personal Grid
Number  Type           Location            Method
414     Intel/Linux    Argonne             Hobble-In
96      SGI/Irix       Argonne             Glide-In
1024    SGI/Irix       NCSA                Glide-In
16      Intel/Linux    NCSA                Flocked
45      SGI/Irix       NCSA                Flocked
246     Intel/Linux    Wisconsin           Flocked
146     Intel/Solaris  Wisconsin           Flocked
133     Sun/Solaris    Wisconsin           Flocked
190     Intel/Linux    Georgia Tech        Flocked
94      Intel/Solaris  Georgia Tech        Flocked
54      Intel/Linux    Italy (INFN)        Flocked
25      Intel/Linux    New Mexico (AHPCC)  Flocked
5       Intel/Linux    Columbia U.         Flocked
10      Sun/Solaris    Columbia U.         Flocked
12      Sun/Solaris    Northwestern        Flocked
Something Blue?
You could work until you’re blue in the face and not solve QAP instances*
* My sincerest apologies for the terrible pun
Instance  Arch.           Wall Time  Person    Date
Nug22     Ultra 360MHz    56 Hours   Hahn      1999
Nug24     Ultra 360MHz    9 days     Hahn      1999
Nug25     Ultra 360MHz    66 days    Hahn      1999
Nug22     48-96 Cenju-3   9 days     Marzetta  1998
Nug25     64-128 Paragon  30 days    Marzetta  1998
The Holy Grail
• We want to solve nug30!
• Extrapolating results and using an idea of Knuth*, we conjecture that we will need roughly 10-15 years of CPU time
• How can we be sure to use 10-15 years of CPU time somewhat efficiently?
• We have the additional burden of working in Condor’s extremely dynamic environment!
* Something Old
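The slide does not say which idea of Knuth is meant. One plausible reading is his random-probe estimator for the size of a search tree: walk one random root-to-leaf path and multiply the branching factors seen along the way. The sketch below shows that estimator on a toy tree. The function names and the tree are hypothetical, and turning a node count into CPU years would also need a per-node cost estimate, which is not modelled here.

```python
import random


def knuth_estimate(root, children, rng):
    """One random probe: an unbiased estimate of the number of tree nodes."""
    estimate, weight, node = 1, 1, root
    while True:
        kids = children(node)
        if not kids:
            return estimate
        weight *= len(kids)   # estimated number of nodes on the next level
        estimate += weight
        node = rng.choice(kids)


def average_estimate(root, children, probes=1000, seed=0):
    """Average several independent probes to reduce the variance."""
    rng = random.Random(seed)
    return sum(knuth_estimate(root, children, rng) for _ in range(probes)) / probes


# Hypothetical toy tree: a complete binary tree of depth 4 (31 nodes).
def binary_children(node):
    depth, _ = node
    return [(depth + 1, 0), (depth + 1, 1)] if depth < 4 else []


est = average_estimate((0, 0), binary_children, probes=10)
```

On a perfectly regular tree every probe returns the exact node count; on an irregular branch-and-bound tree individual probes vary widely, so many probes are averaged.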
Making the Marriage Work
• The MW runtime support library helps us cope with the dynamic nature of our platform
• MW – Master Worker paradigm
• Must deal with contention at the master
• Search/ordering strategies at both master and worker are important!
• Parallel Efficiency improves from 50% to 90%
• Lots more details! Paper available at www.optimization-online.org
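The master-worker idea in a dynamic environment can be illustrated with a toy simulation: the master hands out tasks, and any task whose worker disappears goes back on the queue. This is not the MW library's API. The function `run_master`, its parameters, and the random "worker lost" model are all assumptions made for the sketch.

```python
from collections import deque
import random


def run_master(tasks, work, n_workers=4, loss_prob=0.3, seed=0):
    """Toy master loop: hand out tasks, requeue those whose worker vanished."""
    rng = random.Random(seed)
    todo = deque(tasks)
    results = {}
    reassigned = 0
    while todo:
        # Hand at most one task to each currently available worker.
        batch = [todo.popleft() for _ in range(min(n_workers, len(todo)))]
        for task in batch:
            if rng.random() < loss_prob:
                # Simulated eviction: the machine was reclaimed by its owner,
                # so the master puts the task back on its list.
                todo.append(task)
                reassigned += 1
            else:
                results[task] = work(task)
    return results, reassigned


# Hypothetical workload: square twenty integers.
results, reassigned = run_master(range(20), lambda t: t * t)
```

Because the master owns the task list, losing a worker costs only the work in flight on that machine.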
Mission Accomplished!
Wall Clock Time 6:22:04:31
Avg. # Machines 653
Max. # Machines 1007
CPU Time Approx. 11 years
Nodes 11,892,208,412
LAPs 574,254,156,532
Parallel Efficiency 92%
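A few derived quantities follow from these figures by plain arithmetic. The sketch assumes the wall clock time reads as days:hours:minutes:seconds and uses a 365.25-day year. It does not attempt to reproduce the reported parallel efficiency, whose exact definition is not given on the slide.

```python
# Figures copied from the slide.
days, hours, minutes, seconds = 6, 22, 4, 31   # assumes a D:HH:MM:SS reading
avg_machines = 653
nodes = 11_892_208_412
laps = 574_254_156_532

wall_seconds = ((days * 24 + hours) * 60 + minutes) * 60 + seconds
# Machine time made available: average pool size times wall clock time.
machine_years = wall_seconds * avg_machines / (365.25 * 24 * 3600)
laps_per_node = laps / nodes
nodes_per_wall_second = nodes / wall_seconds
```

Under these assumptions the pool supplied about 12.4 machine-years, the same order as the roughly 11 years of CPU time reported, and each branch-and-bound node needed about 48 LAPs on average.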
Solution Characteristics
[Figure: Number of Workers]
The Ups & Downs
1. Human (read Jeff) error
• Master compiled for <= 1000 workers
2. Condor schedd bug (Gasp!!!!)
3. Master shut down to fix NFS problems
4. Condor schedd bug
5. Human (read Jeff) error
• Incorrect editing of configuration files resulting in many incorrect submissions
[Figure: Number of Workers on June 12]
[Figure: Number of Workers at Three Biggest Contributors]
[Figure: Number of Workers at Three Next Largest Contributors]
[Figure: KLAPS]
The Moral of the Story
• A good wedding/marriage requires four key ingredients
• There were also four key ingredients to solving nug30
1. Powerful mathematics for producing a lower bound
2. Innovative branching techniques
3. An EXTREMELY powerful computing platform
4. “Marrying” the algorithm to the platform in an appropriate manner
The TRUE Moral
• It is possible to do complex numerical calculations on the Computational Grid using Condor!
• It opens the doors to attacking heretofore unsolved problems!
http://www.mcs.anl.gov/metaneos