Instruction Scheduling Using Max-Min Ant System Optimization
Gang Wang, Wenrui Gong, and Ryan Kastner
Dept. of Electrical and Computer Engineering
University of California, Santa Barbara
GLSVLSI’2005, Chicago, April 18, 2005
Instruction Scheduling
Instruction Scheduling is a fundamental synthesis problem
  Software - compilers for microprocessors
  Hardware - behavioral synthesis for ASIC, FPGA
Requisite Moore’s Law reference
  Moore transistors = moore computational resources
  Larger applications = moore operations
How to best utilize the resources?
Instruction Scheduling Definition
Given a set of instructions and a collection of computational units
Instructions modeled using a data flow graph (DFG)
  Directed acyclic graph
  Each node is an instruction
  Each edge is a data dependence
Find schedule for instructions to minimize some function (latency, area, power, …)
We focus on resource constrained problem, i.e. minimize latency given a set of resources.
Auto Regressive Filter
Instruction Scheduling
NP-hard
Fundamental problem - many different heuristic methods have been developed
  ILP
  Force directed
  Genetic algorithm
  Path based
  Graph theoretic
  Computational geometry
  List scheduling
[Figure: example data flow graph; nodes start, v1-v11, vn, end; operators +, <, -; labels 1-4]
List Scheduling
Simple and effective
  Greedy strategy
  Operation selection decided by criticality
  O(n) time complexity
Make a priority list of the instructions based on some measure (mobility, instruction depth, number of successors, etc.)
No single priority function works well over all applications
  Highly dependent on problem instance
  Solution quality varies greatly across different functions
List Scheduling
Procedure ListScheduling(G, R, L)
Input: DFG G(V,E), resource set R, priority list L
Output: instruction schedule
cycle ← 0
ReadyList ← successors of start
while node end is not scheduled do
    for op in ReadyList in descending priority do
        if resource exists for op to start then
            schedule op at time cycle
        end if
        Update ReadyList
    end for
    cycle ← cycle + 1
end while
return schedule
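The procedure above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name list_schedule, the dictionaries op_type, latency and resources, and the four-operation example graph are all hypothetical, and functional units are assumed to be non-pipelined (a unit stays busy until its operation finishes).

```python
from collections import defaultdict


def list_schedule(ops, edges, op_type, latency, resources, priority):
    """Resource-constrained list scheduling (illustrative sketch).

    ops:       list of operation names
    edges:     list of (pred, succ) data dependences
    op_type:   op -> resource type (assumed naming)
    latency:   resource type -> execution time in cycles
    resources: resource type -> number of available units (must be >= 1)
    priority:  op -> number; larger means scheduled earlier
    Returns a dict mapping each op to its start cycle.
    """
    preds = defaultdict(list)
    for u, v in edges:
        preds[v].append(u)

    start, finish = {}, {}
    busy = defaultdict(list)  # resource type -> finish cycles of occupied units
    cycle = 0
    while len(start) < len(ops):
        # Release units whose operation has completed by this cycle.
        for t in list(busy):
            busy[t] = [f for f in busy[t] if f > cycle]
        # An op is ready once all its predecessors have finished.
        ready = [op for op in ops if op not in start
                 and all(p in finish and finish[p] <= cycle for p in preds[op])]
        # Visit ready ops in descending priority; schedule if a unit is free.
        for op in sorted(ready, key=lambda o: -priority[o]):
            t = op_type[op]
            if len(busy[t]) < resources[t]:
                start[op] = cycle
                finish[op] = cycle + latency[t]
                busy[t].append(finish[op])
        cycle += 1
    return start


# Hypothetical example: two multiplications feed an addition chain.
ops = ["a", "b", "c", "d"]
edges = [("a", "c"), ("b", "c"), ("c", "d")]
op_type = {"a": "mul", "b": "mul", "c": "add", "d": "add"}
latency = {"mul": 2, "add": 1}
priority = {"a": 3, "b": 2, "c": 1, "d": 0}
demo_schedule = list_schedule(ops, edges, op_type, latency,
                              {"mul": 1, "add": 1}, priority)
```

With a single multiplier the two multiplications are serialized; giving the sketch a second multiplier lets them start in the same cycle, which shortens the schedule.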
Our approach – Ant System Heuristic
Inspired by ethological study on the behavior of ants [Goss et al. 1989]
A meta heuristic
A multi-agent cooperative searching method
A new way for combining global/local heuristics
Extensible and flexible
Formulating Problems Using Ant Search
Problem model – define search space, create decision variables
Pheromone model – used as a global heuristic, distribution of pheromones, evaporation and strengthening strategies
Ant search strategy – local heuristics and solution space traversal
Solution construction – method of creating an answer from decision variables
Feedback – provide assessment of solution quality and adjust pheromones accordingly
Hybrid Ants with Lists
Combine Ant System Optimization and List Scheduling
  Ants determine priority list
  List scheduling framework evaluates the “goodness” of the list
Iterative approach
Pheromone Model For Instruction Scheduling
[Figure: pheromone trails τij linking instructions op1-op6 to priority list positions 1-6]
Each instruction opi ∈ I is associated with n pheromone trails τij, where j = 1, …, n, which indicate the favorableness of assigning instruction i to position j
Each instruction also has a dynamic local heuristic ηij
Ant Search Strategy
[Figure: instructions op1-op6 and priority list positions 1-6]
Multiple ants independently create their own priority list
Fill one instruction at a time
Iterative process
[Figure: example priority list built from instructions op1-op6: op5, op4, op1, op6, op2, op3]
Ant Search Strategy
Each ant has memory about instructions already selected
At step j ant has already selected j-1 instructions
jth instruction selected probabilistically
[Figure: instructions op1-op6 and priority list positions 1-6; partially built list after three selections: op5, op4, op1]
Ant Search Strategy
τij(k) : global heuristic (pheromone) for selecting instruction i at position j
ηj(k) : local heuristic – can use different properties
  Instruction mobility (IM)
  Instruction depth (ID)
  Latency weighted instruction depth (LWID)
  Successor number (SN)
α, β control the influence of the global and local heuristics
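The selection formula itself does not survive in this text, so the sketch below assumes the standard Ant System rule: the probability of placing instruction i at position j is proportional to τij^α · ηi^β, normalized over the instructions not yet selected. The function names, the static (rather than dynamic) local heuristic eta, and indexing eta by candidate instruction are assumptions made for illustration.

```python
import random


def selection_probabilities(tau, eta, position, candidates, alpha=1.0, beta=1.0):
    """Probability of each candidate instruction for a given list position.

    tau[i][j]: pheromone for placing instruction i at position j
    eta[i]:    local heuristic value of instruction i (assumed static here)
    """
    weights = {i: (tau[i][position] ** alpha) * (eta[i] ** beta)
               for i in candidates}
    total = sum(weights.values())
    return {i: w / total for i, w in weights.items()}


def build_priority_list(tau, eta, alpha=1.0, beta=1.0, rng=random):
    """One ant fills the priority list one position at a time."""
    n = len(tau)
    remaining = list(range(n))  # the ant's memory: instructions not yet chosen
    order = []
    for position in range(n):
        probs = selection_probabilities(tau, eta, position, remaining, alpha, beta)
        chosen = rng.choices(list(probs), weights=list(probs.values()), k=1)[0]
        order.append(chosen)
        remaining.remove(chosen)
    return order
```

Raising alpha makes ants follow the pheromone more closely, while raising beta gives more weight to the local heuristic such as IM or ID.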
Pheromone Update
Priority lists evaluated using list scheduling
  Latency Lh for the result from ant h
Evaporation – prevent stigmergy and punish “useless” trails
Reinforcement – reward trails with better quality
Pheromone Update
[Figure: instructions op1-op6 and priority list positions 1-6]
Evaporation happens on all trails
Reward the used trails based on the solution’s quality
[Figure: trails used by the priority list op5, op4, op1, op6, op2, op3]
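The two update steps described here, evaporation on every trail followed by a reward on the trails each ant used, can be sketched as follows. The update formula is not preserved in this text, so the multiplicative factor rho and the reward q / Lh are assumptions borrowed from the common Ant System formulation; the function name is hypothetical.

```python
def update_pheromones(tau, ant_results, rho=0.98, q=1.0):
    """Evaporate all trails, then reinforce the trails used by each ant.

    tau[i][j]:   pheromone for placing instruction i at position j
    ant_results: list of (priority_list, latency) pairs, one per ant
    rho:         fraction of pheromone kept after evaporation (assumed name)
    q:           reward scale (assumed); shorter latency gives a larger reward
    """
    n = len(tau)
    # Evaporation happens on all trails, used or not.
    for i in range(n):
        for j in range(n):
            tau[i][j] *= rho
    # Reinforcement: each ant rewards the (instruction, position) pairs it used.
    for order, latency in ant_results:
        reward = q / latency
        for position, op in enumerate(order):
            tau[op][position] += reward
    return tau
```

Because the reward is inversely proportional to latency, priority lists that lead to shorter schedules leave stronger trails for later iterations.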
Max-Min Ant System (MMAS)
Risks of Ant System optimization
  Too much feedback
  Dynamic range of pheromone trails can increase rapidly
  Limited search of solution space
    Unused trails can be repeatedly punished, so certain schedules may never be selected
  Premature convergence
MMAS is designed to address this problem
  Built upon original AS
  Idea is to limit the pheromone trails within an evolving bound so that broader exploration is possible
  Better balance between exploration and exploitation
  Prevent premature convergence
Max-Min Ant System (MMAS)
Limit τ(t) within τmin(t) and τmax(t)
Sgb is the best global solution found so far at t-1
f(.) is the quality evaluation function, e.g. latency in our case
avg is the average size of decision choices
Pbest ∈ (0,1] is the controlling parameter
  Smaller Pbest → tighter range for more emphasis on exploration
When Pbest → 0, we set τmin = τmax
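The bound formulas are missing from this text. The sketch below assumes the expressions commonly used for Max-Min Ant System: τmax = 1 / ((1 - ρ) · f(Sgb)) and τmin = τmax · (1 - Pbest^(1/n)) / ((avg - 1) · Pbest^(1/n)), where ρ is the fraction of pheromone that persists after evaporation and n is the number of decisions. The function and parameter names are hypothetical, and the exact formulas in the talk may differ.

```python
def mmas_bounds(best_quality, n, avg, p_best, persistence):
    """Compute (tau_min, tau_max) under the assumed standard MMAS formulas.

    best_quality: f(Sgb), e.g. latency of the best schedule found so far
    n:            number of decisions (instructions to place)
    avg:          average number of choices per decision, must be > 1
    p_best:       controlling parameter in (0, 1]
    persistence:  fraction of pheromone kept after evaporation, in (0, 1)
    """
    if not 0.0 < p_best <= 1.0:
        raise ValueError("p_best must be in (0, 1]")
    if avg <= 1.0:
        raise ValueError("avg must be greater than 1")
    tau_max = 1.0 / ((1.0 - persistence) * best_quality)
    root = p_best ** (1.0 / n)
    tau_min = tau_max * (1.0 - root) / ((avg - 1.0) * root)
    # For very small p_best the formula exceeds tau_max; collapse the range.
    return min(tau_min, tau_max), tau_max


def clamp_trails(tau, tau_min, tau_max):
    """Limit every pheromone trail to the interval [tau_min, tau_max]."""
    return [[min(max(value, tau_min), tau_max) for value in row] for row in tau]
```

With p_best = 1 the lower bound is 0 and the trails are only capped from above; as p_best shrinks, the lower bound rises toward the upper bound and all choices become nearly equally likely.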
Other Algorithmic Refinements
Dynamically evolving local heuristics
  Example: dynamically adjust instruction mobility
  Benefit: dynamic search space reduction
Taking advantage of topological sorting of DFG when constructing priority list
  Each step ants select from the ready instructions instead of from all unscheduled instructions
  Benefit: greatly reduce the search space
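The second refinement can be illustrated with a variant of the ant's list construction that draws only from ready instructions, meaning those whose predecessors have all been placed already. This is a sketch under assumptions: the function name, the integer instruction indices, the static local heuristic eta, and the weighting τij^α · ηi^β are illustrative and not taken from the talk.

```python
import random


def build_topological_priority_list(n, edges, tau, eta,
                                    alpha=1.0, beta=1.0, rng=random):
    """Build a priority list that is also a topological order of the DFG.

    n:     number of instructions, indexed 0..n-1
    edges: list of (pred, succ) dependences; the graph is assumed acyclic
    tau[i][j], eta[i]: pheromone and local heuristic (assumed layout)
    """
    preds_left = [0] * n
    succs = [[] for _ in range(n)]
    for u, v in edges:
        succs[u].append(v)
        preds_left[v] += 1

    ready = [i for i in range(n) if preds_left[i] == 0]
    order = []
    for position in range(n):
        # Only ready instructions compete for this position.
        weights = [(tau[i][position] ** alpha) * (eta[i] ** beta) for i in ready]
        chosen = rng.choices(ready, weights=weights, k=1)[0]
        order.append(chosen)
        ready.remove(chosen)
        # Placing an instruction may make its successors ready.
        for s in succs[chosen]:
            preds_left[s] -= 1
            if preds_left[s] == 0:
                ready.append(s)
    return order
```

Each step now chooses among the ready set rather than all unscheduled instructions, so lists that place an instruction ahead of its predecessors are never generated.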
Experimental Results
ILP (optimal) using CPLEX
List scheduling
  Instruction mobility (IM), instruction depth (ID), latency weighted instruction depth (LWID), successor number (SN)
Ant scheduling results using different local heuristics (averaged over 5 runs, each run 100 iterations with 5 ants)
Benchmark        Resources             CPLEX              Force     List Scheduling         MMAS-IS (average over 5 runs)
(nodes/edges)                          (latency/runtime)  Directed  IM    ID    LWID  SN    IM    ID    LWID  SN
HAL (21/25)      1a, 1fm, 1m, 3i, 3o   8/32               8         8     8     9     8     8     8     8     8
ARF (28/30)      2a, 1fm, 2m           11/22              11        11    13    13    13    11    11    11    11
EWF (34/47)      1a, 1fm, 1m           27/24000           28        28    31    31    28    27.2  27.2  27    27.2
FIR1 (40/39)     2a, 2m, 3i, 3o        13/232             19        19    19    19    18    17.2  17.2  17    17.8
FIR2 (44/43)     1a, 1fm, 1m, 3i, 3o   14/11560           19        19    21    21    21    16.2  16.4  16.2  17
COSINE1 (66/76)  2a, 2m, 1fm, 3i, 3o   -                  18        19    20    18    18    17.4  18.2  17.6  17.6
COSINE2 (82/91)  2a, 2m, 1fm, 3i, 3o   -                  23        23    23    23    23    21.2  21.2  21.2  21.2
Average                                                   18        18.2  19.3  20.5  18.5  16.8  17.0  16.9  17.1
Conclusion
Proposed a novel heuristic method for the resource-constrained instruction scheduling problem
  Hybrid MMAS and List Scheduling
  Created a mathematical model for combining global and local heuristics
Experimental results are promising
  Near optimal results
  Outperforms widely used force-directed approach
  More stable and insensitive to choice of application/local heuristics
Thanks! Questions?