Instruction Scheduling Using Max-Min Ant System Optimization Gang Wang, Wenrui Gong, and Ryan Kastner Dept. of Electrical and Computer Engineering University

Instruction Scheduling Using Max-Min Ant System Optimization

Gang Wang, Wenrui Gong, and Ryan Kastner

Dept. of Electrical and Computer Engineering

University of California, Santa Barbara

GLSVLSI’2005, Chicago, April 18, 2005

Instruction Scheduling

Instruction scheduling is a fundamental synthesis problem
  Software - compilers for microprocessors
  Hardware - behavioral synthesis for ASIC, FPGA

Requisite Moore’s Law reference
  Moore transistors = moore computational resources
  Larger applications = moore operations

How to best utilize the resources?

Instruction Scheduling Definition

Given a set of instructions and collection of computational units

Instructions modeled using a data flow graph (DFG)
  Directed acyclic graph
  Each node is an instruction
  Each edge is a data dependence

Find schedule for instructions to minimize some function (latency, area, power, …)

We focus on the resource-constrained problem, i.e. minimizing latency given a set of resources.
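A minimal Python sketch of the problem model, assuming a small made-up DFG (the node names, resource types, and latencies below are illustrative and not from the talk). It computes the earliest start time of each instruction when resources are unlimited; the resulting critical path length is a lower bound on the latency of any resource-constrained schedule.

```python
from collections import defaultdict

# Hypothetical DFG: node -> (resource type, latency in cycles).
ops = {"a": ("mul", 2), "b": ("mul", 2), "c": ("add", 1),
       "d": ("add", 1), "e": ("add", 1)}
# Each edge (u, v) is a data dependence: v consumes the result of u.
edges = [("a", "c"), ("b", "c"), ("c", "e"), ("d", "e")]


def asap_start_times(ops, edges):
    """Earliest start cycle of each instruction with unlimited resources."""
    preds = defaultdict(list)
    for u, v in edges:
        preds[v].append(u)
    start = {}

    def visit(v):
        # An instruction can start once all of its predecessors have finished.
        if v not in start:
            start[v] = max((visit(u) + ops[u][1] for u in preds[v]), default=0)
        return start[v]

    for v in ops:
        visit(v)
    return start


start = asap_start_times(ops, edges)
critical_path = max(start[v] + ops[v][1] for v in ops)
```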

Auto Regressive Filter

Instruction Scheduling

NP-hard

Fundamental problem - many different heuristic methods have been developed
  ILP
  Force directed
  Genetic algorithm
  Path based
  Graph theoretic
  Computational geometry
  List scheduling

[Figure: example DFG with start and end nodes, operations v1 through v11 and vn, and control steps 1 to 4]

List Scheduling

Simple and effective
Greedy strategy
Operation selection decided by criticality
O(n) time complexity

Make a priority list of the instructions based on some measure (mobility, instruction depth, number of successors, etc.)

No single priority function works well over all applications
  Highly dependent on problem instance
  Solution quality varies greatly across different functions
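To illustrate how the choice of measure changes the priority list, here is a small Python sketch. The definitions used for instruction depth (nodes on the longest path to a sink) and successor number (count of immediate successors) are assumptions, and the graph is made up for the example.

```python
def successors_map(nodes, edges):
    succs = {v: [] for v in nodes}
    for u, v in edges:
        succs[u].append(v)
    return succs


def instruction_depth(nodes, edges):
    """Nodes on the longest path from each node to a sink (assumed definition)."""
    succs = successors_map(nodes, edges)
    depth = {}

    def visit(v):
        if v not in depth:
            depth[v] = 1 + max((visit(w) for w in succs[v]), default=0)
        return depth[v]

    for v in nodes:
        visit(v)
    return depth


def successor_number(nodes, edges):
    """Number of immediate successors of each node (assumed definition)."""
    succs = successors_map(nodes, edges)
    return {v: len(succs[v]) for v in nodes}


def priority_list(nodes, score):
    # Higher score first; ties broken by name so the result is deterministic.
    return sorted(nodes, key=lambda v: (-score[v], v))


nodes = ["a", "b", "c", "d", "e", "f", "g"]
edges = [("a", "c"), ("b", "c"), ("c", "e"), ("d", "e"), ("d", "f"), ("d", "g")]
by_depth = priority_list(nodes, instruction_depth(nodes, edges))
by_successors = priority_list(nodes, successor_number(nodes, edges))
```

On this graph the depth-based list starts with a and b, while the successor-based list starts with d, so the same list scheduler would make different choices under the two measures.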

List Scheduling

[Figure: example DFG with start and end nodes, operations v1 through v11 and vn, and control steps 1 to 4]

Procedure ListScheduling(G, R, L)

Input: DFG G(V, E), resource set R, priority list L

Output: instruction schedule

cycle ← 0
ReadyList ← successors of start
while node end is not scheduled do
    for op in ReadyList in descending priority do
        if resource exists for op to start then
            schedule op at time cycle
        end if
        Update ReadyList
    end for
    cycle ← cycle + 1
end while
return schedule
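A runnable Python sketch of the list scheduling idea, under stated assumptions: units are non-pipelined, every operation has a latency of at least one cycle, every needed resource type has at least one unit, and the ready list is recomputed once per cycle rather than inside the inner loop. The names list_schedule, ops, and resources are illustrative and do not come from the talk.

```python
def list_schedule(ops, edges, resources, priority):
    """ops: name -> (resource type, latency); resources: type -> unit count;
    priority: instruction names, highest priority first.
    Returns (start cycle per instruction, total latency)."""
    preds = {v: [] for v in ops}
    for u, v in edges:
        preds[v].append(u)
    rank = {v: i for i, v in enumerate(priority)}
    start, finish = {}, {}
    cycle = 0
    while len(start) < len(ops):
        # Count units still occupied by operations started in earlier cycles.
        busy = {r: 0 for r in resources}
        for v, s in start.items():
            if s <= cycle < finish[v]:
                busy[ops[v][0]] += 1
        # Ready: unscheduled operations whose predecessors have all finished.
        ready = [v for v in ops if v not in start
                 and all(u in finish and finish[u] <= cycle for u in preds[v])]
        for v in sorted(ready, key=rank.__getitem__):
            kind, latency = ops[v]
            if busy[kind] < resources[kind]:
                start[v] = cycle
                finish[v] = cycle + latency
                busy[kind] += 1
        cycle += 1
    return start, max(finish.values())


# Made-up example: two multiplies feed an add, whose result feeds a final add.
ops = {"a": ("mul", 2), "b": ("mul", 2), "c": ("add", 1),
       "d": ("add", 1), "e": ("add", 1)}
edges = [("a", "c"), ("b", "c"), ("c", "e"), ("d", "e")]
order = ["a", "b", "c", "d", "e"]
one_mul_start, one_mul_latency = list_schedule(ops, edges, {"mul": 1, "add": 1}, order)
two_mul_start, two_mul_latency = list_schedule(ops, edges, {"mul": 2, "add": 1}, order)
```

With a single multiplier the two multiplies are serialized, so the schedule is longer than with two multipliers.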

Our approach – Ant System Heuristic

Inspired by ethological study on the behavior of ants [Goss et al. 1989]

A meta heuristic
A multi-agent cooperative searching method
A new way for combining global/local heuristics
Extensible and flexible

Ant System Heuristic









Autocatalytic Effect

Formulating Problems Using Ant Search

Problem model – define search space, create decision variables

Pheromone model – used as a global heuristic, distribution of pheromones, evaporation and strengthening strategies

Ant search strategy – local heuristics and solution space traversal

Solution construction – method of creating an answer from decision variables

Feedback – provide assessment of solution quality and adjust pheromones accordingly

Hybrid Ants with Lists

Combine Ant System Optimization and List Scheduling
  Ants determine priority list
  List scheduling framework evaluates the “goodness” of the list
  Iterative approach

Pheromone Model For Instruction Scheduling

[Figure: pheromone trails τ_ij linking instructions op1 to op6 to priority list positions 1 to 6]

Each instruction op_i ∈ I is associated with n pheromone trails τ_ij, where j = 1, …, n, which indicate the favorableness of assigning instruction i to position j

Each instruction also has a dynamic local heuristic η_ij

Ant Search Strategy

[Figure: instructions op1 to op6 and an empty priority list with positions 1 to 6]

Multiple ants independently create their own priority list

Fill one instruction at a time

Iterative process

[Figure: instructions op1 to op6 and an example completed priority list: op5, op4, op1, op6, op2, op3]

Ant Search Strategy

Each ant has memory about instructions already selected

At step j ant has already selected j-1 instructions

jth instruction selected probabilistically

[Figure: instructions op1 to op6 and a partially built priority list with op5, op4, op1 in positions 1 to 3]

Ant Search Strategy

τ_ij(k): global heuristic (pheromone) for selecting instruction i at position j

η_j(k): local heuristic – can use different properties
  Instruction mobility (IM)
  Instruction depth (ID)
  Latency weighted instruction depth (LWID)
  Successor number (SN)

α, β control the influence of the global and local heuristics
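The selection formula itself is not preserved in the text of this slide. The Python sketch below assumes the usual ant system rule, in which the probability of placing instruction i at the current position is proportional to the pheromone raised to α times the local heuristic raised to β, normalized over the instructions not yet selected. The function names and the per-instruction indexing of the local heuristic are assumptions.

```python
import random


def selection_probabilities(tau, eta, candidates, position, alpha=1.0, beta=1.0):
    """tau[i][position]: pheromone for putting instruction i at this position;
    eta[i]: local heuristic value of instruction i (e.g. its depth)."""
    weights = {i: (tau[i][position] ** alpha) * (eta[i] ** beta)
               for i in candidates}
    total = sum(weights.values())
    return {i: w / total for i, w in weights.items()}


def pick_instruction(tau, eta, candidates, position, rng, alpha=1.0, beta=1.0):
    """Draw one candidate at random according to the probabilities above."""
    probs = selection_probabilities(tau, eta, candidates, position, alpha, beta)
    names = list(probs)
    return rng.choices(names, weights=[probs[i] for i in names], k=1)[0]


# Made-up values: op2 has three times the pheromone of op1 at position 0.
tau = {"op1": [1.0, 1.0], "op2": [3.0, 1.0]}
eta = {"op1": 2.0, "op2": 2.0}
probs = selection_probabilities(tau, eta, ["op1", "op2"], position=0)
choice = pick_instruction(tau, eta, ["op1", "op2"], 0, random.Random(0))
```

Setting beta to 0 makes the ants follow pheromones only, and setting alpha to 0 reduces the choice to a randomized version of the local heuristic.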

Pheromone Update

Priority lists evaluated using list scheduling
  Latency L_h for the result from ant h

Evaporation – prevent stagnation and punish “useless” trails

Reinforcement – reward trails with better quality
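The update equations are not preserved in the text of this slide. The Python sketch below assumes the common ant system form: every trail is first multiplied by a persistence factor rho between 0 and 1, then each ant adds Q / L_h to the trails it used, so shorter schedules deposit more pheromone. The parameter names rho and Q and their default values are assumptions.

```python
def update_pheromones(tau, ant_results, rho=0.98, Q=1.0):
    """tau[i][j]: pheromone for instruction i at position j (modified in place).
    ant_results: list of (priority_list, latency) pairs, one per ant."""
    # Evaporation happens on all trails.
    for i in tau:
        tau[i] = [rho * t for t in tau[i]]
    # Reinforcement: reward the trails each ant used, scaled by solution quality.
    for priority_list, latency in ant_results:
        for position, i in enumerate(priority_list):
            tau[i][position] += Q / latency
    return tau


# Made-up example with two instructions and a single ant.
tau = {"op1": [1.0, 1.0], "op2": [1.0, 1.0]}
updated = update_pheromones(tau, [(["op2", "op1"], 4)], rho=0.5, Q=2.0)
```

In the example, the trails used by the ant (op2 at position 0, op1 at position 1) return to 1.0 while the unused trails decay to 0.5.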

Pheromone Update

[Figure: instructions op1 to op6 and priority list positions 1 to 6]

Evaporation happens on all trails

Reward the used trails based on the solution’s quality

[Figure: instructions op1 to op6 and the priority list op5, op4, op1, op6, op2, op3 whose trails are rewarded]

Max-Min Ant System (MMAS)

Risks of Ant System optimization
  Too much feedback
  Dynamic range of pheromone trails can increase rapidly
  Limited search of solution space
    Unused trails can be repetitively punished, therefore certain schedules may never be selected
  Premature convergence

MMAS is designed to address this problem
  Built upon original AS
  Idea is to limit the pheromone trails within an evolving bound so that broader exploration is possible
  Better balance between exploration and exploitation
  Prevent premature convergence

Max-Min Ant System (MMAS)

Limit τ(t) within τ_min(t) and τ_max(t)

Sgb is the best global solution found so far at t-1
f(.) is the quality evaluation function, e.g. latency in our case
avg is the average size of decision choices

Pbest ∈ (0, 1] is the controlling parameter
  Smaller Pbest → tighter range for more emphasis on exploration

When Pbest → 0, we set τ_min → τ_max
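The bound formulas are not preserved in the text of this slide. The Python sketch below assumes the bounds commonly used in the MMAS literature: τ_max = 1 / ((1 - rho) · f(Sgb)) and τ_min = τ_max · (1 - Pbest^(1/n)) / ((avg - 1) · Pbest^(1/n)), with τ_min capped at τ_max. Here rho is the pheromone persistence factor and n is taken to be the number of instructions; both readings are assumptions.

```python
def mmas_bounds(best_quality, rho, n, avg, p_best):
    """Return (tau_min, tau_max) for the current iteration.
    best_quality: f(Sgb), e.g. the best latency found so far."""
    if not 0.0 < p_best <= 1.0:
        raise ValueError("p_best must be in (0, 1]")
    if avg <= 1.0:
        raise ValueError("avg must be greater than 1")
    tau_max = 1.0 / ((1.0 - rho) * best_quality)
    root = p_best ** (1.0 / n)
    tau_min = tau_max * (1.0 - root) / ((avg - 1.0) * root)
    # As p_best approaches 0 the formula exceeds tau_max, so cap it.
    return min(tau_min, tau_max), tau_max


def clamp_trails(tau, tau_min, tau_max):
    """Force every pheromone trail into [tau_min, tau_max]."""
    return {i: [min(max(t, tau_min), tau_max) for t in row]
            for i, row in tau.items()}


tau_min, tau_max = mmas_bounds(best_quality=10, rho=0.5, n=2, avg=3, p_best=0.25)
clamped = clamp_trails({"op1": [0.05, 0.5]}, tau_min, tau_max)
tight_min, tight_max = mmas_bounds(best_quality=10, rho=0.5, n=2, avg=3, p_best=1e-9)
```

With a very small p_best the two bounds coincide, which matches the limiting case stated above.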

Other Algorithmic Refinements

Dynamically evolving local heuristics
  Example: dynamically adjust instruction mobility
  Benefit: dynamic search space reduction

Taking advantage of topological sorting of DFG when constructing priority list
  Each step ants select from the ready instructions instead of from all unscheduled instructions
  Benefit: greatly reduce the search space
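A short Python sketch of the second refinement, assuming the DFG is acyclic. At each position the ant chooses only among ready instructions, meaning those whose predecessors are already in the list, so every list it builds is a topological order. The weight function stands in for the combined pheromone and local heuristic value and is a placeholder, as are the function and variable names.

```python
import random


def build_priority_list(nodes, edges, rng, weight=lambda v, position: 1.0):
    """Build one priority list, picking each entry from the ready instructions."""
    preds = {v: set() for v in nodes}
    for u, v in edges:
        preds[v].add(u)
    chosen, placed = [], set()
    while len(chosen) < len(nodes):
        # Ready: not yet placed, and all predecessors already placed.
        ready = [v for v in nodes if v not in placed and preds[v] <= placed]
        weights = [weight(v, len(chosen)) for v in ready]
        pick = rng.choices(ready, weights=weights, k=1)[0]
        chosen.append(pick)
        placed.add(pick)
    return chosen


nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "c"), ("b", "c"), ("c", "e"), ("d", "e")]
plist = build_priority_list(nodes, edges, random.Random(1))
```

Whatever the random draws are, a predecessor always appears before its successors in the resulting list.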

MMAS Instruction Scheduling Algorithm

ARF Pheromones

Experimental Results

ILP (optimal) using CPLEX

List scheduling
  Instruction mobility (IM), instruction depth (ID), latency weighted instruction depth (LWID), successor number (SN)

Ant scheduling results using different local heuristics (averaged over 5 runs, each run 100 iterations with 5 ants)

Benchmark        Resources             CPLEX              Force     List Scheduling           MMAS-IS (average over 5 runs)
(nodes/edges)                          (latency/runtime)  Directed  IM    ID    LWID  SN      IM    ID    LWID  SN
HAL (21/25)      1a, 1fm, 1m, 3i, 3o   8/32               8         8     8     9     8       8     8     8     8
ARF (28/30)      2a, 1fm, 2m           11/22              11        11    13    13    13      11    11    11    11
EWF (34/47)      1a, 1fm, 1m           27/24000           28        28    31    31    28      27.2  27.2  27    27.2
FIR1 (40/39)     2a, 2m, 3i, 3o        13/232             19        19    19    19    18      17.2  17.2  17    17.8
FIR2 (44/43)     1a, 1fm, 1m, 3i, 3o   14/11560           19        19    21    21    21      16.2  16.4  16.2  17
COSINE1 (66/76)  2a, 2m, 1fm, 3i, 3o   -                  18        19    20    18    18      17.4  18.2  17.6  17.6
COSINE2 (82/91)  2a, 2m, 1fm, 3i, 3o   -                  23        23    23    23    23      21.2  21.2  21.2  21.2
Average          -                     -                  18        18.2  19.3  20.5  18.5    16.8  17.0  16.9  17.1

Conclusion

Proposed a novel heuristic method for the resource-constrained instruction scheduling problem
  Hybrid MMAS and List Scheduling
  Create a mathematical model for combining global and local heuristics

Experimental results are promising
  Near optimal results
  Outperforms widely used force-directed approach
  More stable and insensitive to choice of application/local heuristics

Thanks! Questions?

Extra Slides

DFG Size Distribution

Auto Regressive Filter
