Common Algorithmic Approaches in VLSI CAD
• Divide & Conquer (D&C) [e.g., merge-sort, partition-driven placement, tech. mapping of a fanout-free ckt for dynamic power min.]
• Reduce & Conquer (R&C) [e.g., multilevel techniques such as the hMetis partitioner]
• Dynamic programming [e.g., matrix multiplication, optimal buffer insertion]
• Mathematical programming: linear, quadratic, 0/1 integer programming [e.g., floorplanning, global placement]
Common Algorithmic Approaches in VLSI CAD (contd)
• Search Methods:
– Depth-first search (DFS): mainly used to find any solution when cost is not an issue [e.g., FPGA detailed routing---cost generally determined at the global routing phase]
– Breadth-first search (BFS): mainly used to find a soln at min. distance from root of search tree [e.g., maze routing when cost = dist. from root]
– Best-first search (BeFS): used to find optimal or provably sub-optimal (at most a given factor of optimal) solutions w/ any cost function; can be done when a provable lower bound of the cost can be determined for each branching choice from the “current partial soln node” [e.g., TSP, global routing]
• Iterative Improvement: deterministic, stochastic
• Min-cost network flow
Divide & Conquer
• Determine if the problem can be solved in a hierarchical or divide-&-conquer (D&C) manner:
– D&C approach: see if the problem can be “broken up” into 2 or more smaller subproblems whose solns can be “stitched up” to give a soln. to the parent prob.
– Do this recursively for each large subprob until the subprobs are small enough for an “easy” solution technique (could be exhaustive!)
– If the subprobs are of a similar kind to the root prob, then the breakup and stitching will also be similar
– The final design may or may not be optimal (will be optimal if the problem has the dynamic programming property; see later)
[Fig.: root problem A is broken up into subprobs. A1 and A2, and these recursively into A1,1, A1,2, A2,1, A2,2; the solns to A1 and A2 are stitched up to form the complete soln to A. Recurse until the subprob size is s.t. an exhaustive-based optimal design is doable.]
Example from CAD: Min-total-sw-prob. (or min-dynamic power) tech. mapping of a fanout-free circuit.
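The D&C recipe above can be illustrated by the merge-sort example cited on the first slide: a minimal sketch in Python, not CAD-specific.

```python
def merge_sort(a):
    # Base case: subproblem small enough for an "easy" (trivial) solution
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    # Break up into 2 smaller subproblems of the same kind, solved recursively
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    # Stitch-up: merge the two sorted halves into a soln of the parent problem
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]
```

Here the stitch-up (merge) is cheap and the breakup is balanced; in the tech-mapping example the breakup is over cuts of the circuit instead.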
Reduce-&-Conquer
• Reduce the problem size (coarsening), solve the reduced problem, then uncoarsen and refine the solution
• Examples: Multilevel graph/hypergraph partitioning (e.g., hMetis), multilevel routing
Dynamic Programming (DP)
• The above primary property of DP (optimal substructure: optimal solns. of sub-problems are part of the optimal soln. of the parent problem) also means that every time we optimally solve a subproblem, we can store/record the soln and reuse it every time it is part of the formulation of a higher-level problem. The occurrence of a subproblem multiple times in different higher-level problems is called the overlapping subproblem property. It is, however, not a necessary feature of a DP problem.
[Fig.: root problem A w/ subproblems A1-A4 and a stitch-up function f: optimal soln of root = f(optimal solns of subproblems) = f(opt(A1), opt(A2), opt(A3), opt(A4)); a subproblem's soln, once computed, is reused wherever that subproblem recurs.]
Dynamic Programming (contd.)
• A negative example: total sw. probability minimization in tech. mapping of a fanout-free circuit = SwP_Min(C, p(z)): C is a fanout-free ckt w/ z as its output. The problem is to minimize p(z) + the sum of sw. probabilities (0->1 transition probabilities) at the o/ps of TM’ed gates in C excluding z (z’s sw. prob. is included in p(z)).
• For a cut Ci w/ z at its o/p that can be TM’ed to a gate gi in the library, let x, y be its 2 i/ps. Let p(x,y) be the mapping of p(z), based on gi, in terms of only the 4 transition probs. at x and y. Then, since p(x,y) is inseparable in terms of the trans. probs. of x and y, the exact problem to be solved is SwP_Min(C - Ci, p(x,y)), where C - Ci has 2 o/ps x, y; thus independent cuts have to be taken for x and y, and the combination of these 2 sets of cuts comes into play. This leads to a combinatorial explosion as we go further down the circuit to the inputs of each pair of cuts for x and y.
• The final formulation is SwP_Min(C, p(z)) = Min over all feasible Ci at z of SwP_Min(C - Ci, p(X(Ci))), where X(Ci) is the set of i/ps generated by Ci.
• The above is not a D&C approach. In a D&C approach, we would create two subproblems SwP_Min(T(x), p(x) = p(x, yconst)) and SwP_Min(T(y), p(y) = p(xconst, y)), where T(x) is the sub-circuit of C (a subtree) w/ x as its o/p, and p(x, yconst) is p(x, y) assuming some constant values for the 4 trans. probs. at y (or the subset of trans. probs. of y involved in p(x,y)).
• Since there is no guarantee, and in fact it is unlikely, that the assumed constant values for the trans. probs. at y will be the exact trans. probs. obtained by optimally solving SwP_Min(C - Ci, p(x,y)) (which is the exact problem to solve), an optimal soln. to SwP_Min(T(x), p(x, yconst)) is not guaranteed to be part of the optimal soln. to SwP_Min(C, p(z)). A similar argument holds for the optimal soln. to SwP_Min(T(y), p(xconst, y)).
[Fig.: D&C approach for SwP_Min(C, P0->1(z)): a cut Ci w/ z at its o/p has i/ps x and y, splitting C into subtrees and giving subproblems SwP_Min(T(x), p(x, yconst)) and SwP_Min(T(y), p(xconst, y)). The sw. prob. at z is expressed in terms of the various trans. probs. at all fanins cut by subset Si(z).]
• Another way to look at the reason for this is to see that the two subproblems are not independent: the trans. probs. implied at their o/ps by their solns. are needed to solve each other's subproblem, leading to a cyclic dependency.
• Since the above D&C seems to be the only way to break up SwP_Min(C, p(z)) into subproblems, this problem is not amenable to DP as it does not have the optimal substructure property.
Dynamic Programming (contd.)
• A positive example: total wire minimization in tech. mapping of a fanout-free circuit = DP_TM(C): C is a fanout-free ckt w/, say, z as its output. The problem is to minimize the sum of the number of outputs of TM’ed gates (each o/p contributes an “exposed” wire in the circuit that needs to be routed), i.e., the sum of wires at the o/ps of TM’ed gates.
• For a cut Ci w/ z at its o/p that can be TM’ed to a gate gi in the library, let x, y be its 2 i/ps. Then the problems of minimizing the # of o/p wires in T(x) and T(y) are clearly independent, and the optimal soln. to each is part of the optimal soln. to DP_TM(C) given the cut Ci.
• So the overall optimal formulation is to take the minimum soln. over all feasible cuts Ci w/ z at their o/p:
• DP_TM(C) = Min over all feasible Ci at z = o/p of C of [Sum over xj in X(Ci) of DP_TM(T(xj))], where X(Ci) is the set of i/ps generated by Ci.
• Whichever cut Ck produces the min. soln., the optimal solns. to the subproblems at its i/ps are part of the optimal soln. for DP_TM(C).
• Thus, since the optimal substructure property holds, this problem is amenable to dynamic programming.
[Fig.: a cut Ci w/ z at its o/p and i/ps x, y; subproblems DP_TM(T(x)) and DP_TM(T(y)) combine to give DP_TM(C).]
Dynamic Programming (contd)
• Matrix multiplication example: most computationally efficient way to perform the series of matrix mults M = M1 x M2 x … x Mn, where Mi is of size ri x ci w/ ri = c(i-1) for i > 1.
• DP formulation: opt_seq(M) = (by defn) opt_seq(M(1,n)) = min over i=1 to n-1 of {opt_seq(M(1, i)) + opt_seq(M(i+1, n)) + r1·ci·cn}
• Correctness rests on the property that the optimal ways of multiplying M1 x … x Mi and M(i+1) x … x Mn will be used in the “min” stitch-up function to determine the optimal soln for M
• Thus if the optimal soln involves a “cut” at Mr, then opt_seq(M(1,r)) & opt_seq(M(r+1,n)) will be part of opt_seq(M)
• Perform the computation bottom-up (smallest sequences first)
• Complexity: note that each subseq M(j, k) will appear in the above computation and is solved exactly once (irrespective of how many times it appears).
• Time to solve M(j, k), k >= j, not counting the time to solve its subproblems (which is accounted for in the complexity of each subproblem), is l - 1, where l = k - j + 1 is the length of the subsequence (the min of l - 1 different options is computed).
• The # of different M(j, k)’s of length l is n - l + 1, 2 <= l <= n.
• Total complexity = Sum over l = 2 to n of (l-1)(n-l+1) = Θ(n^3) (as opposed to, say, O(2^n) using exhaustive search)
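The recurrence above can be computed bottom-up (smallest sequences first) in Θ(n^3) time; a sketch, w/ dims[i-1] x dims[i] the size of Mi:

```python
def matrix_chain_cost(dims):
    """Min # of scalar mults to compute M1 x ... x Mn, Mi of size dims[i-1] x dims[i]."""
    n = len(dims) - 1
    # cost[j][k] = opt_seq(M(j, k)); each subsequence is solved exactly once
    cost = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):          # bottom-up: by subsequence length
        for j in range(1, n - length + 2):
            k = j + length - 1
            # "min" stitch-up over the l-1 possible cut positions i
            cost[j][k] = min(
                cost[j][i] + cost[i + 1][k] + dims[j - 1] * dims[i] * dims[k]
                for i in range(j, k))
    return cost[1][n]
```

E.g., for sizes 10x30, 30x5, 5x60, multiplying (M1 M2) M3 costs 1500 + 3000 = 4500 scalar mults, while M1 (M2 M3) costs 27000.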
DP in VLSI CAD
• Example for the simple problem of only an optimization objective: min-wire-cost tech. mapping of a fanout-free circuit, where the cost is the # of wires. Thus the best cost of a subproblem is easy to define and is a single value
• However, in CAD, the problems are generally multi-parameter ones: one opt. objective (min. or max.) and several upper-bound or lower-bound constraints on several metrics/parameters
• Which solution of a subproblem (i.e., a partial solution) is best is now harder to determine among the several at a particular node of the DP tree or dag (directed acyclic graph)
• The concept of domination is now important: a partial solution X of a subproblem, represented by a vector of opt. and constraint metrics (a1, a2, …, ak), is “best” if no other partial soln. of the same subproblem is at least as good as X in every metric (i.e., X is not dominated by any other partial soln. of the same subproblem).
• So there are multiple “best” solutions of a subproblem, one or more of which can be part of the optimal/best solution(s) of the parent problem. So after solving a subproblem, we will get multiple solutions (partial sols. of the parent problem), and we need to keep the non-dominated ones only and combine them w/ non-dominated solns of sibling subproblems to determine solns. to the parent problem.
• Note that we need to get rid of all dominated partial solns. as they are guaranteed not to lead to the optimal soln. of the full problem or more locally to non-dominated/best solns. of the parent problem.
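Domination pruning can be sketched as below, assuming (hypothetically) that each partial soln is a (cap, slack) pair in which smaller cap and larger slack are better:

```python
def prune_dominated(solns):
    """Keep only non-dominated (cap, slack) pairs: a soln is dropped if some
    other (distinct) soln has cap no larger AND slack no smaller, i.e., is
    no worse in both metrics. Assumes the pairs are distinct."""
    return [s for s in solns
            if not any(o != s and o[0] <= s[0] and o[1] >= s[1] for o in solns)]
```

Only the surviving (non-dominated) partial solns need to be combined w/ those of sibling subproblems.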
A DP Example: Simple Buffer Insertion Problem
Given: Source and sink locations, sink capacitances and RATs (reqd. arrival times), a buffer type, source delay rules, unit wire resistance and capacitance
[Fig.: a source s0 driving a routing tree w/ 4 sinks having reqd. arrival times RAT1-RAT4; a buffer may be inserted at candidate locations along the tree]
Courtesy: Chuck Alpert, IBM
Simple Buffer Insertion Problem (contd)
Find: Buffer locations and a routing tree such that the slack (i.e., RAT) at the source is maximized—this gives the greatest flexibility at the source in various ways: getting +ve RATs at fanin gates w/ fewer buffers at fanin nets, thus indirectly optimizing some other metrics, e.g., total leakage power or total cell/gate area.
[Fig.: the source s0 and sinks s1-s4 w/ required arrival times RAT1-RAT4]
Courtesy: Chuck Alpert, IBM
RAT/slack at the source: q(s0) = min over i=1 to 4 of {RAT(si) - delay(s0, si)}
Possible buffer insertion points [nodes]—at and below branch nodes, and intermediate points on a long branchless interconnect
Slack/RAT Example
• Sinks w/ (RAT = 400, delay = 600) and (RAT = 500, delay = 350): Slack/RAT at source = min(400 - 600, 500 - 350) = -200: unsynthesizable!
• Sinks w/ (RAT = 400, delay = 300) and (RAT = 500, delay = 400): Slack/RAT at source = min(400 - 300, 500 - 400) = +100
Courtesy: Chuck Alpert, IBM
Elmore Delay
Delay(A->C) = R1(C1 + C2) + R2·C2
[Fig.: node A connected to B through res. R1 and B to C through res. R2, w/ cap C1 at B and cap C2 at C]
Courtesy: Chuck Alpert, IBM
(= Delay(A->B) + Delay(B->C), the sum of delays of “branch-less” segments on the path from A to C). Delay of a branchless seg: Delay(A->B) = res(A->B)·(total cap seen by this res.) + wire delay (RwCw/2), where Rw (Cw) = wire res. (cap) [wire delay ignored above]
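The formula above, for a branchless RC path w/ lumped caps at segment ends, can be sketched as:

```python
def elmore_delay(rs, cs):
    """Elmore delay of a branchless path: rs[i] is the resistance of segment i,
    cs[i] the lumped cap at its downstream node. Each resistance is multiplied
    by the total cap at and beyond its own downstream node (wire delay ignored)."""
    return sum(r * sum(cs[i:]) for i, r in enumerate(rs))
```

For the two-segment example in the figure this gives R1·(C1 + C2) + R2·C2.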
DP Example: Van Ginneken Buffer Insertion Algorithm [ISCAS’90]
• Associate each leaf node/sink with two metrics (Ct, Tt): downstream loading capacitance Ct and RAT Tt. Ct (cap seen) is useful since the upstream delay depends on Ct (how much depends on the upstream res., which is not known at this point—it depends on the buffer-insertion options taken later), and the upstream RAT depends on both Ct and Tt. Want to min. Ct and max. Tt
• The DP-based algo propagates potential solutions bottom-up [Van Ginneken, 90]. At each intermediate node t (a branch node or an artificial node on a long branch/interconnect), for each downstream soln. (Cn, Tn) do:
a) Add a wire of res. Rw and cap. Cw:
   Ct = Cn + Cw
   Tt = Tn - Rw·Ln - (1/2)·Rw·Cw    (note: Ln here is the same as Cn)
b) Subsequently add a buffer of input cap. Cb, intrinsic delay Tb, and output res. Rb:
   Ct = Cb
   Tt = Tn - Tb - Rb·Ln
c) Consider both buffer and no-buffer (i.e., wire-only) solns. among the set of solns. at t.
d) If t is a branch node, merge every pair of sub-solutions, one from each sub-tree: for each pair of soln. vectors Zn = (Cn, Tn), Zm = (Cm, Tm) in the 2 subtrees, create a soln vector Zt = (Ct, Tt), where (note that wire-only/buffer options at this node are considered after merging):
   Ct = Cn + Cm
   Tt = min(Tn, Tm)
Courtesy: UCLA
[Fig.: a wire (Cw, Rw) above a node w/ soln. (Cn, Tn); a buffer above a node w/ soln. (Cn, Tn); a branch node merging (Cn, Tn) and (Cm, Tm) into (Ct, Tt)]
DP Example (contd)
d) (contd.) After merging:
 i. Add a wire to each merged solution Zt (same cap. & delay change formulation as before)
 ii. Add a buffer to each Zt as before
e) Delete all dominated solutions at t: Zt1 = (Ct1, Tt1) is dominated if there exists a Zt2 = (Ct2, Tt2) s.t. Ct1 >= Ct2 and Tt1 <= Tt2 (i.e., Zt1 is no better in either metric)
f) The remaining soln vectors are all “optimal”/“best” solns at t, and one of them will be part of the optimal solution at the root/driver of the net---this is the DP feature of this algorithm
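Steps a), b), and d) above map onto small transformations of the (C, T) soln set; a sketch (taking Ln = Cn as noted):

```python
def add_wire(solns, rw, cw):
    # a) wire of res. rw, cap cw: Ct = Cn + cw, Tt = Tn - rw*Cn - rw*cw/2
    return [(c + cw, t - rw * c - rw * cw / 2) for c, t in solns]

def add_buffer(solns, cb, tb, rb):
    # b)+c) buffered copies (Ct = cb, Tt = Tn - tb - rb*Cn) plus the wire-only solns
    return solns + [(cb, t - tb - rb * c) for c, t in solns]

def merge(s1, s2):
    # d) branch node: Ct = Cn + Cm, Tt = min(Tn, Tm), for every pair
    return [(c1 + c2, min(t1, t2)) for c1, t1 in s1 for c2, t2 in s2]
```

Interleaving these w/ the pruning of step e) keeps the soln sets small as they propagate toward the source.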
Van Ginneken Example
[Fig.:] Start w/ the sink soln. (20, 400). Add a wire (C=10, d=150): (30, 250); optionally add a buffer (C=5, d=30): (5, 220). Soln. set at this node: {(30, 250), (5, 220)}. Add another wire (C=15; d=200 for the 1st subsoln, d=120 for the 2nd): (45, 50), (20, 100); optionally add a buffer (C=5; d=50 for the 1st, d=30 for the 2nd): (5, 0), (5, 70). Soln. set: {(45, 50), (5, 0), (20, 100), (5, 70)}. Intermediate nodes mark possible buffer locations.
Courtesy: Chuck Alpert, IBM
Van Ginneken Example Cont’d
From {(45, 50), (5, 0), (20, 100), (5, 70)}: (5, 0) is inferior to (5, 70), and (45, 50) is inferior to (20, 100); prune them, leaving {(20, 100), (5, 70)}. Add a wire (C=10; d=90 for the 1st soln, d=80 for the 2nd): (30, 10), (15, -10). Pick the solution with the largest slack (max RAT) and follow the arrows forward to get the final complete solution.
Courtesy: Chuck Alpert, IBM
Mathematical Programming
• Linear programming (LP), e.g., Obj: Min 2x1 - x2 + x3 w/ constraints x1 + x2 <= a, x1 - x3 <= b -- solvable in polynomial time
• Quadratic programming (QP), e.g., Min x1^2 - x2·x3 w/ linear constraints -- solvable in polynomial (cubic) time w/ equality constraints
• Others:
– Mixed integer linear prog. (ILP): some vars are integers -- NP-hard
– Mixed integer quad. prog. (IQP): some vars are integers -- NP-hard
– Mixed 0/1 integer linear prog. (0/1 ILP): some vars are in {0,1} -- NP-hard
– Mixed 0/1 integer quad. prog. (0/1 IQP): some vars are in {0,1} -- NP-hard
0/1 ILP/IQP Examples
• Generally useful for “assignment” problems, where objects {O1, ..., On} are to be assigned (possibly exclusively) to bins {B1, ..., Bm}
• 0/1 variable x_{i,j} = 1 if object Oi is assigned to bin Bj
• Min-cut bi-partitioning for graphs G(V,E) can be modeled as a 0/1 IQP
IQP modeling of min-cut part. (into V1, V2):
➢ x_{i,1} = 1 => ui in V1, else ui in V2 (the 2nd var. x_{i,2} is not needed due to mutual exclusivity & implication by x_{i,1})
➢ Edge (ui, uj) is in the cutset iff x_{i,1}(1 - x_{j,1}) + (1 - x_{i,1})·x_{j,1} = 1
➢ Objective function: Min Sum over (ui, uj) in E of c(i,j)·[x_{i,1}(1 - x_{j,1}) + (1 - x_{i,1})·x_{j,1}]
➢ Constraint: Sum over i of w(ui)·x_{i,1} <= max-size
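The IQP objective above can be checked by brute force on a tiny graph; a sketch for illustration only (it enumerates all 2^n assignments, so it is not a practical partitioner):

```python
from itertools import product

def cut_cost(x, edges):
    # IQP objective: sum of c(i,j) * (x_i(1-x_j) + (1-x_i)x_j) over edges (i, j, c)
    return sum(c * (x[i] * (1 - x[j]) + (1 - x[i]) * x[j]) for i, j, c in edges)

def min_cut_bipartition(n, edges, weights, max_size):
    """Exhaustive 0/1 search; x[i] = 1 means u_i is in V1, else V2."""
    best = None
    for x in product((0, 1), repeat=n):
        # size constraint on both sides of the partition
        if sum(w for i, w in enumerate(weights) if x[i]) > max_size:
            continue
        if sum(w for i, w in enumerate(weights) if not x[i]) > max_size:
            continue
        cost = cut_cost(x, edges)
        if best is None or cost < best[0]:
            best = (cost, x)
    return best
```
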
21 EE 5301 - VLSI Design Automation I
Example 2 for ILP/IQP: HLS Resource-Constrained Scheduling
• Constrained scheduling
– General case NP-complete
– Minimize latency given constraints on area or the resources (ML-RCS)
– Minimize resources subject to a bound on latency (MR-LCS)
• Exact solution methods
– ILP: Integer Linear Programming
– Hu’s heuristic algorithm for identical processors/ALUs
• Heuristics
– List scheduling
– Force-directed scheduling
ILP Formulation of ML-RCS [Mic94] p.198
• Use binary decision variables
– i = 0, 1, ..., n
– l = 1, 2, ..., λ'+1, where λ' is a given upper bound on the latency
– x_{i,l} = 1 if operation i starts at step l, 0 otherwise.
• Set of linear inequalities (constraints), and an objective function (min latency)
• Observations:
– x_{i,l} = 0 for l < t_i^S and for l > t_i^L, where t_i^S = ASAP(v_i) and t_i^L = ALAP(v_i)
– t_i = start time of op i = Sum over l of l·x_{i,l}
– Is op v_i (still) executing at step l? Check whether Sum over m = l-d_i+1 to l of x_{i,m} = 1
Start Time vs. Execution Time
• For each operation v_i, only one start time
• If d_i = 1, then the following questions are the same:
– Does operation v_i start at step l?
– Is operation v_i running at step l?
• But if d_i > 1, then the two questions should be formulated as:
– Does operation v_i start at step l? (Does x_{i,l} = 1 hold?)
– Is operation v_i running at step l? (Does Sum over m = l-d_i+1 to l of x_{i,m} = 1 hold?)
Operation v_i Still Running at Step l?
• Is v9 running at step 6? Is x_{9,4} + x_{9,5} + x_{9,6} = 1?
[Fig.: three cases for v9 (w/ d_9 = 3): starting at step 4 (x_{9,4} = 1), step 5 (x_{9,5} = 1), or step 6 (x_{9,6} = 1); in each case v9 occupies step 6]
• Note:
– Only one (if any) of the above three cases can happen
– To meet resource constraints, we have to ask the same question for ALL steps, and ALL operations of that type
Operation v_i Still Running at Step l?
• Is v_i running at step l? Is x_{i,l} + x_{i,l-1} + ... + x_{i,l-d_i+1} = 1?
[Fig.: v_i occupies step l if it starts at any of the steps l-d_i+1, ..., l-1, l]
ILP Formulation of ML-RCS (cont.)
• Constraints:
– Exactly one start time per operation i: Sum over l in [t_i^S, t_i^L] of x_{i,l} = 1, for each i
– Sequencing (dependency) relations must be satisfied: t_i >= t_j + d_j for all (v_j, v_i) in E, i.e., Sum over l of l·x_{i,l} >= Sum over l of l·x_{j,l} + d_j
– Resource constraints: Sum over {i : T(v_i) = k} of [Sum over m = l-d_i+1 to l of x_{i,m}] <= a_k, k = 1, ..., n_res, for all steps l
• Objective: min Sum over l of l·x_{n,l}
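The ML-RCS constraints above can be checked exhaustively on a toy instance; a sketch that enumerates start times directly instead of calling an ILP solver (the op names and the instance below are made up):

```python
from itertools import product

def ml_rcs(ops, deps, resource, lam):
    """ops: {op: (type, delay)}; deps: (pred, succ) pairs; resource: {type: count};
    lam: upper bound on the schedule length. Returns (min latency, start times)."""
    names = list(ops)
    best = None
    for starts in product(range(1, lam + 1), repeat=len(names)):
        t = dict(zip(names, starts))
        # dependency constraints: t_succ >= t_pred + d_pred
        if any(t[s] < t[p] + ops[p][1] for p, s in deps):
            continue
        # resource constraints: ops of type ty running at step l must be <= its count k
        if not all(sum(1 for o in names
                       if ops[o][0] == ty and t[o] <= l < t[o] + ops[o][1]) <= k
                   for l in range(1, lam + 1) for ty, k in resource.items()):
            continue
        latency = max(t[o] + ops[o][1] - 1 for o in names)
        if best is None or latency < best[0]:
            best = (latency, t)
    return best
```
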
ILP Example
• Assume λ = 4
• First, perform ASAP and ALAP (we can write the ILP without ASAP and ALAP, but using ASAP and ALAP will simplify the inequalities)
[Fig.: ASAP and ALAP schedules (steps 1-4) of the example dataflow graph: source NOP, operations v1-v11 (*, +, -, < ops), and sink NOP v_n]
ILP Example: Unique Start Times Constraint
• Without using ASAP and ALAP values:
x_{1,1} + x_{1,2} + x_{1,3} + x_{1,4} = 1
x_{2,1} + x_{2,2} + x_{2,3} + x_{2,4} = 1
...
x_{11,1} + x_{11,2} + x_{11,3} + x_{11,4} = 1
• Using ASAP and ALAP:
x_{1,1} = 1
x_{2,1} = 1
x_{3,2} = 1
x_{4,3} = 1
x_{5,4} = 1
x_{6,1} + x_{6,2} = 1
x_{7,2} + x_{7,3} = 1
x_{8,1} + x_{8,2} + x_{8,3} = 1
x_{9,2} + x_{9,3} + x_{9,4} = 1
...
ILP Example: Dependency Constraints
• Using ASAP and ALAP, the non-trivial inequalities are (assuming unit delay for + and *):
5·x_{n,5} - 2·x_{11,2} - 3·x_{11,3} - 4·x_{11,4} - 1 >= 0
5·x_{n,5} - 2·x_{9,2} - 3·x_{9,3} - 4·x_{9,4} - 1 >= 0
4·x_{5,4} - 2·x_{7,2} - 3·x_{7,3} - 1 >= 0
2·x_{11,2} + 3·x_{11,3} + 4·x_{11,4} - x_{10,1} - 2·x_{10,2} - 3·x_{10,3} - 1 >= 0
2·x_{9,2} + 3·x_{9,3} + 4·x_{9,4} - x_{8,1} - 2·x_{8,2} - 3·x_{8,3} - 1 >= 0
2·x_{7,2} + 3·x_{7,3} - x_{6,1} - 2·x_{6,2} - 1 >= 0
ILP Example: Resource Constraints
• Resource constraints (assuming 2 adders and 2 multipliers):
– Adders/ALUs (steps 1-4):
x_{10,1} <= 2
x_{9,2} + x_{10,2} + x_{11,2} <= 2
x_{4,3} + x_{9,3} + x_{10,3} + x_{11,3} <= 2
x_{5,4} + x_{9,4} + x_{11,4} <= 2
– Multipliers (steps 1-3):
x_{1,1} + x_{2,1} + x_{6,1} + x_{8,1} <= 2
x_{3,2} + x_{6,2} + x_{7,2} + x_{8,2} <= 2
x_{7,3} + x_{8,3} <= 2
• Objective: since λ = 4 and the sink has no mobility, any feasible solution is optimum, but we can use the following anyway:
Min x_{n,1} + 2·x_{n,2} + 3·x_{n,3} + 4·x_{n,4}
ILP Formulation of MR-LCS
• Dual problem to ML-RCS
• Objective:
– Goal is to optimize the total resource usage vector a.
– Objective function is c^T·a, where the entries in c are the respective area costs of the resources (the a_k in the resource inequality constraint of ML-RCS is now a variable, the element a_k of a, on the RHS).
• Constraints:
– Same as the ML-RCS constraints, plus a latency constraint: Sum over l of l·x_{n,l} <= λ + 1
[©Gupta]
Search Techniques
[Fig.: a graph w/ vertices A-G, shown w/ its DFS visit order (1-6) and its BFS visit order (1-7)]
dfs(v)  /* for basic graph visit, or for soln finding when nodes are partial solns */
  v.mark = 1;
  for each (v,u) in E
    if (u.mark != 1) then dfs(u)

Algorithm Depth_First_Search
  for each v in V
    v.mark = 0;
  for each v in V
    if v.mark = 0 then
      if G has partial soln nodes then dfs(v);
      else soln_dfs(v);

soln_dfs(v)  /* used when nodes are basic elts of the problem and not partial soln nodes */
  v.mark = 1;
  if path to v is a soln, then return(1);
  for each (v,u) in E
    if (u.mark != 1) then
      soln_found = soln_dfs(u)
      if (soln_found = 1) then return(soln_found)
  end for;
  v.mark = 0;  /* can visit v again to form another soln on a different path */
  return(0)
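A runnable version of soln_dfs, using the current path itself as the mark so a vertex can be revisited on a different path (the graph and the is_soln predicate are caller-supplied):

```python
def soln_dfs(graph, v, path, is_soln):
    """Backtracking DFS over basic elements; returns the first soln path found."""
    path.append(v)                      # mark v by placing it on the path
    if is_soln(path):
        return list(path)
    for u in graph.get(v, []):
        if u not in path:               # visit only unmarked vertices
            found = soln_dfs(graph, u, path, is_soln)
            if found:
                return found
    path.pop()                          # unmark v: it may lie on another soln path
    return None
```
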
Search Techniques—Exhaustive DFS
[Fig.: the graph w/ vertices A-G and its DFS visit order (1-6)]
optimal_soln_dfs(v)  /* used when nodes are basic elts of the problem and not partial soln nodes */
begin
  v.mark = 1;
  if path to v is a soln, then begin
    if cost < best_cost then begin best_soln = soln; best_cost = cost; endif
    v.mark = 0;
    return;
  endif
  for each (v,u) in E
    if (u.mark != 1) then
      cost = cost + edge_cost(v,u);  /* global var. */
      optimal_soln_dfs(u);
      cost = cost - edge_cost(v,u);  /* undo on backtrack */
  end for;
  v.mark = 0;  /* can visit v again to form another soln on a different path */
end

Algorithm Depth_First_Search
  for each v in V
    v.mark = 0;
  best_cost = infinity; cost = 0;
  optimal_soln_dfs(root);
Best-First Search

BeFS(root)
begin
  open = {root}  /* open is the list of generated but not expanded nodes—partial solns */
  best_soln_cost = infinity;
  while open != nullset do begin
    curr = first(open);
    if curr is a soln then return(curr)  /* curr is an optimal soln */
    else children = Expand_&_est_cost(curr);
      /* generate all children of curr & estimate their costs---cost(u) should be a
         lower bound of the cost of the best soln reachable from u */
    for each child in children do begin
      if child is a soln then
        delete all nodes w in open s.t. cost(w) >= cost(child);
      endif
      store child in open in increasing order of cost;
    endfor
  endwhile
end /* BeFS */

Expand_&_est_cost(Y)
begin
  children = nullset;
  for each basic elt x of the problem “reachable” from Y & that can be part of the current partial soln. Y do begin
    if x not in Y and if feasible
      child = Y U {x};
      path_cost(child) = path_cost(Y) + cost(Y, x)  /* cost(Y, x) is the cost of reaching x from Y */
      est(child) = lower bound on the cost of the best soln reachable from child;
      cost(child) = path_cost(child) + est(child);
      children = children U {child};
  endfor
end /* Expand_&_est_cost */

Y = partial soln. = a path from the root to the current “node” (a basic elt. of the problem, e.g., a city in TSP, a vertex in V0 or V1 in min-cut partitioning). We go from each such “node” u to the next node that is “reachable” from u in the problem “graph” (which is part of what you have to formulate).
[Fig.: a BeFS search tree; the node u has cost 10 and children w/ costs 12, 15, 19; other frontier costs are 16, 17, 18, 18; (1), (2), (3) mark the expansion order from the root]
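The sorted “open” list of BeFS is naturally a min-heap; a sketch where the caller's expand function supplies cost = path_cost + lower-bound estimate (w/ est = 0 this degenerates to a Dijkstra-like search):

```python
import heapq

def best_first_search(root, expand, is_soln):
    """expand(node, cost) -> [(child, child_cost)]; if child_cost is a true lower
    bound on any soln through child, the first soln popped is optimal."""
    open_list = [(0, 0, root)]          # (cost, tie-break counter, node)
    tick = 0
    while open_list:
        cost, _, node = heapq.heappop(open_list)
        if is_soln(node):
            return cost, node           # optimal by the lower-bound argument
        for child, child_cost in expand(node, cost):
            tick += 1
            heapq.heappush(open_list, (child_cost, tick, child))
    return None
```
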
Best-First Search: Proof of optimality when cost is a LB
• The current set of nodes in “open” represents a complete front of generated nodes, i.e., the rest of the nodes in the search space are descendants of “open”
• Assuming the basic cost (the cost of adding an elt to a partial soln to construct another partial soln that is closer to the soln) is non-negative, the cost is monotonic, i.e., cost of child >= cost of parent
• If the first node curr in “open” is a soln, then cost(curr) <= cost(w) for each w in “open”
• The cost of any node in the search space not in “open” and not yet generated is >= the cost of its ancestor in “open”, and thus >= cost(curr). Thus curr is the optimal (min-cost) soln
Search techs for a TSP example
[Fig.: a 6-city TSP graph on cities A-F w/ edge costs 9, 5, 2, 1, 3, 5, 4, 8, 7, 5. Exhaustive search using DFS (w/ backtrack) from A enumerates the tours; dead-end branches are pruned (x), and the solution nodes (tours returning to A) have costs 27, 31, 33, so the optimal tour costs 27.]
Search techs for a TSP example (contd)
[Fig.: BeFS for finding an optimal TSP solution on the same graph. Each node's cost = path cost + LB estimate (e.g., 5+15, 8+16, 11+14, 14+9, 20, 21+6, 22+9, 23+8); branches that cannot be optimal are crossed out (X), and the first soln reached has cost 27. For the node (A, E, F): path cost = 8, and the LB estimate = cost of MST{F, A, B, C, D} = 16.]
• Lower-bound cost estimate: MST({unvisited cities} U {current city} U {start city})
• This is a LB because the spanning-tree structure is a relaxation of the reqd soln structure: the set S of all spanning trees of a graph G is a superset of the set S' of all Hamiltonian paths (paths that visit each node exactly once), and min(metric M's values in set S) <= min(M's values in subset S')
• Similarly for max??
[Fig.: Venn diagram: S (all spanning trees of G) contains S' (all Hamiltonian paths of G)]
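The MST lower bound above can be computed w/ Prim's algorithm; a sketch assuming a complete, symmetric distance dict:

```python
def mst_cost(vertices, dist):
    """Weight of a minimum spanning tree over the given vertices (Prim's algorithm);
    it lower-bounds the cost of any Hamiltonian path over the same vertices."""
    vs = list(vertices)
    if len(vs) < 2:
        return 0
    in_tree, cost = {vs[0]}, 0
    while len(in_tree) < len(vs):
        # cheapest edge leaving the tree
        w, v = min((dist[a][b], b) for a in in_tree for b in vs if b not in in_tree)
        cost += w
        in_tree.add(v)
    return cost
```

In the BeFS above, est(child) = mst_cost over {unvisited cities} U {current city, start city}.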
BFS for 0/1 ILP Solution
• X = {x1, ..., xm} are 0/1 vars
• Choose vars xi = 0/1 as next nodes in some order (random or heuristic-based)
[Fig.: branch-&-bound tree. Root: no vars expanded. Branch on x2: solve the LP relaxation w/ x2=0 (cost C1) and w/ x2=1 (cost C2). Under x2=1, branch on x4: LP w/ x2=1, x4=0 (cost C3) and w/ x2=1, x4=1 (cost C4). Under x2=1, x4=1, branch on x5: LP w/ x2=1, x4=1, x5=0 (cost C5; the optimal soln) and w/ x2=1, x4=1, x5=1 (cost C6).]
• Cost relations: C5 < C3 < C1 < C6; C2 < C1; C4 < C3
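The same bound-and-prune idea (a relaxation bounding each node of the 0/1 tree) can be sketched on a 0/1 knapsack, w/ the fractional greedy relaxation standing in for the LP relaxation:

```python
import heapq

def knapsack_bnb(items, cap):
    """Max-profit 0/1 knapsack by best-first branch & bound.
    items: (weight, profit) pairs; the fractional relaxation of the remaining
    items gives an upper bound at each node, used to prune the tree."""
    items = sorted(items, key=lambda it: it[1] / it[0], reverse=True)  # by density

    def bound(i, w, p):
        # relaxation: fill remaining capacity greedily, splitting the last item
        for wt, pr in items[i:]:
            if w + wt <= cap:
                w, p = w + wt, p + pr
            else:
                return p + pr * (cap - w) / wt
        return p

    best = 0
    heap = [(-bound(0, 0, 0), 0, 0, 0)]        # (-bound, next item, weight, profit)
    while heap:
        nb, i, w, p = heapq.heappop(heap)
        if -nb <= best or i == len(items):     # bound can't beat incumbent: prune
            continue
        wt, pr = items[i]
        if w + wt <= cap:                      # branch: take item i
            best = max(best, p + pr)
            heapq.heappush(heap, (-bound(i + 1, w + wt, p + pr), i + 1, w + wt, p + pr))
        heapq.heappush(heap, (-bound(i + 1, w, p), i + 1, w, p))   # branch: skip item i
    return best
```
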
Iterative Improvement Techniques
• Iterative improvement techniques are either deterministic (greedy) or stochastic (non-greedy)
• Deterministic greedy techniques can be locally/immediately greedy or non-locally greedy:
– Locally greedy: make the move that is immediately (locally) best, until no further impr. (e.g., FM)
– Non-locally greedy: make the move that is best according to some non-immediate (non-local) metric (e.g., probability-based lookahead as in PROP), until no further impr.
• Stochastic: make a combination of deterministic greedy moves and probabilistic moves that cause a deterioration (can help to jump out of local minima), until the stopping criteria are satisfied
• A stopping criterion could be an upper bound on the total # of moves or iterations
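The stochastic scheme above (greedy moves plus occasionally accepted deteriorations) is the essence of simulated annealing; a generic sketch whose cooling schedule and parameters are arbitrary illustrative choices, not from the slides:

```python
import math, random

def anneal(cost, neighbor, x0, t0=10.0, alpha=0.95, moves=2000, seed=0):
    """Accept downhill moves always; accept an uphill move of size d w/
    probability exp(-d/T), letting the search jump out of local minima."""
    rng = random.Random(seed)
    x, c = x0, cost(x0)
    best, best_c = x, c
    t = t0
    for _ in range(moves):
        y = neighbor(x, rng)
        d = cost(y) - c
        if d <= 0 or rng.random() < math.exp(-d / t):
            x, c = y, c + d
            if c < best_c:
                best, best_c = x, c
        t *= alpha ** (1 / 100)     # cool by a factor of alpha every 100 moves
    return best, best_c
```

The stopping criterion here is simply the move budget; a frozen-temperature or no-improvement test works equally well.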