Reconfigurable Computing (EN2911X, Fall07), Lecture 11: RC Principles: Software (4/4). Prof. Sherief Reda, Division of Engineering, Brown University. http://ic.engin.brown.edu


Page 1

Reconfigurable Computing (EN2911X, Fall07)

Lecture 11: RC Principles: Software (4/4)

Prof. Sherief Reda
Division of Engineering, Brown University

http://ic.engin.brown.edu

Page 2

Summary of the last 3 lectures

[Figure: design flow. The System Specification is partitioned into SW and HW. Stage labels: compiling, Verilog, synthesis, mapping & packing, place & route, configuration file, compile, link, executable image, download to board. Annotations: previous lectures, this lecture, traditional compiler class.]

Page 3

Embedding a digital circuit to FPGA fabric

[Figure: FPGA fabric showing the programmable interconnect, the programmable logic blocks, and a programmable logic element; from Maxfield’04]

1. Mapping decomposes the circuit into logic sections and flip-flops such that each section fits into a K-LUT LE.
2. Packing groups LEs into clusters so that each cluster fits into a LAB.
3. Placement determines the position of each cluster in the LABs of the island-style FPGA.
4. Routing determines the exact routes between the communicating LEs/LABs.

What objectives/metrics should these algorithms pursue?

Page 4

1. Mapping finds a covering for a given circuit using K-LUTs

Map to a LUT in an LB

[Figure from Cong FPGA’01]

Page 5

A covering example

[From Ling et al. DAC’05]

There can be many possible coverings. Which one should be picked?
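To make the covering question concrete, here is a minimal Python sketch that enumerates the K-feasible cuts of a node. Each non-trivial cut is one candidate LUT rooted at that node, so different cut choices give different coverings. The three-gate netlist, the FANINS table, and the function name enumerate_cuts are hypothetical and are not taken from the cited papers.

```python
from itertools import product

# Hypothetical gate-level netlist: gate -> list of fanins.
# Names that are not keys (a, b, c, d) are primary inputs.
FANINS = {
    "g1": ["a", "b"],
    "g2": ["c", "d"],
    "g3": ["g1", "g2"],
}

def enumerate_cuts(node, k, memo=None):
    """Return every cut of `node` with at most k leaves, as frozensets."""
    if memo is None:
        memo = {}
    if node in memo:
        return memo[node]
    # The trivial cut {node} lets a fanout gate use this node as a LUT input.
    cuts = {frozenset([node])}
    if node in FANINS:
        fanin_cuts = [enumerate_cuts(f, k, memo) for f in FANINS[node]]
        # Merge one cut per fanin and keep the result if it fits in a K-LUT.
        for combo in product(*fanin_cuts):
            merged = frozenset().union(*combo)
            if len(merged) <= k:
                cuts.add(merged)
    memo[node] = cuts
    return cuts

for cut in sorted(enumerate_cuts("g3", 4), key=lambda c: (len(c), sorted(c))):
    print(sorted(cut))
```

In this toy circuit with K = 4, the cut {a, b, c, d} covers all three gates with a single LUT, while the cut {g1, g2} needs three LUTs. Typical selection criteria are the number of LUTs (area) or the LUT depth (delay).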

Page 6

2. Packing

How can we decide which LEs should go together in the same logic cluster?

Possible method (VPACK): construct each cluster sequentially
• Start by choosing a seed LE for the cluster
• Then greedily select the LE that shares the most inputs and outputs with the cluster being constructed
• Repeat greedily until the cluster is full or the number of inputs exceeds the limit I
• Can adding an LE to a cluster reduce the number of distinct inputs?
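The greedy procedure can be sketched in Python as follows. This is a simplified illustration in the spirit of VPACK, not the tool's actual code: the LES netlist, the seed rule (the unclustered LE with the most inputs), and the function names are assumptions made for the example.

```python
# Hypothetical LEs: each has a set of input nets and one output net.
LES = {
    "A": {"ins": {"x1", "x2"}, "out": "nA"},
    "B": {"ins": {"x1", "x2", "nA"}, "out": "nB"},
    "C": {"ins": {"y1", "y2"}, "out": "nC"},
    "D": {"ins": {"y1", "nC"}, "out": "nD"},
}

def cluster_inputs(members, les):
    """Distinct nets the cluster must receive from outside."""
    used = set().union(*(les[m]["ins"] for m in members))
    made = {les[m]["out"] for m in members}
    return used - made  # nets produced inside the cluster are not inputs

def pack(les, size, max_inputs):
    """Build clusters one at a time; assumes each LE alone fits the limit."""
    unplaced = set(les)
    clusters = []
    while unplaced:
        # Assumed seed rule: the unclustered LE with the most inputs.
        seed = max(sorted(unplaced), key=lambda n: len(les[n]["ins"]))
        cluster = [seed]
        unplaced.remove(seed)
        while len(cluster) < size:
            nets = set()
            for m in cluster:
                nets |= les[m]["ins"] | {les[m]["out"]}
            best, best_gain = None, -1
            for cand in sorted(unplaced):
                shared = (les[cand]["ins"] | {les[cand]["out"]}) & nets
                fits = len(cluster_inputs(cluster + [cand], les)) <= max_inputs
                if fits and len(shared) > best_gain:
                    best, best_gain = cand, len(shared)
            if best is None:  # no remaining LE respects the input limit
                break
            cluster.append(best)
            unplaced.remove(best)
        clusters.append(cluster)
    return clusters

clusters = pack(LES, size=2, max_inputs=4)
print(clusters)
print(len(cluster_inputs(["B"], LES)), len(cluster_inputs(["B", "A"], LES)))
```

In this example the cluster containing only B needs three distinct inputs, but after A is added it needs only two: A's output nA becomes internal to the cluster, and A's own inputs were already cluster inputs.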

Page 7

3. Placement

• Placement assigns an exact position or LAB for each cluster in the input netlist

• Suppose you start with a random placement, how can you improve it?

Possible algorithm:
- Pick a pair of cells and swap their locations if this leads to a reduction in wirelength (WL)

What’s wrong with this greedy algorithm? It can simply get stuck in a locally optimal result.

[Figure: WL results plotted over the possible placements, with a local optimum and the global optimum marked]
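A minimal Python sketch of the greedy pairwise-swap idea follows. It assumes half-perimeter wirelength (HPWL) as the WL estimate and a tiny four-cell netlist; both, along with the function names, are choices made for this illustration only.

```python
from itertools import combinations

def wirelength(nets, pos):
    """Half-perimeter wirelength (HPWL), used here as the WL estimate."""
    total = 0
    for net in nets:
        xs = [pos[c][0] for c in net]
        ys = [pos[c][1] for c in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

def greedy_swap(nets, pos):
    """Keep swapping pairs of cells while some swap reduces WL."""
    pos = dict(pos)
    best = wirelength(nets, pos)
    improved = True
    while improved:
        improved = False
        for a, b in combinations(sorted(pos), 2):
            pos[a], pos[b] = pos[b], pos[a]      # try the swap
            cost = wirelength(nets, pos)
            if cost < best:
                best = cost                      # keep an improving swap
                improved = True
            else:
                pos[a], pos[b] = pos[b], pos[a]  # undo a non-improving swap
    return pos, best

# Hypothetical chain a-b-c-d placed on one row in a scrambled order.
nets = [("a", "b"), ("b", "c"), ("c", "d")]
start = {"a": (0, 0), "c": (1, 0), "b": (2, 0), "d": (3, 0)}
final_pos, final_cost = greedy_swap(nets, start)
print(wirelength(nets, start), final_cost)
```

Because only improving swaps are accepted, the loop stops at the first placement that no single swap can improve. On this small example that happens to be the best placement, but on larger netlists it is generally only a local optimum.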

Page 8

Simulated annealing allows us to avoid getting trapped in a local minimum

Modified algorithm
• Generate a random move (say, a swap of two cells)
  – calculate the change in WL ($\Delta L$) due to the move
  – if the move results in a reduction ($\Delta L < 0$) then accept
  – else reject with probability $1 - e^{-\Delta L/T}$

• T (temperature) controls the rejection probability

• Initially, T is high (thus avoiding getting trapped early in a local minimum); then the temperature cools down in a scheduled manner; at the end, the rejection probability for WL-increasing moves approaches 1

• With the right “slow-enough” cooling schedule, simulated annealing is guaranteed, in theory, to converge to the global optimum

[Figure: WL results plotted over the possible placements, with a local optimum and the global optimum marked]
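The modified algorithm can be sketched in Python as below. The acceptance rule follows the slide: a worsening move is accepted with probability e^(-ΔL/T), which is the same as rejecting it with probability 1 - e^(-ΔL/T). The HPWL cost, the geometric cooling factor alpha, the parameter defaults, and the habit of remembering the best placement seen are assumptions of this sketch.

```python
import math
import random

def hpwl(nets, pos):
    """Half-perimeter wirelength, used here as the WL estimate."""
    total = 0
    for net in nets:
        xs = [pos[c][0] for c in net]
        ys = [pos[c][1] for c in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

def anneal(nets, pos, t_start=10.0, t_end=0.01, alpha=0.9,
           moves_per_t=50, seed=0):
    rng = random.Random(seed)
    pos = dict(pos)
    cells = sorted(pos)
    cost = hpwl(nets, pos)
    best_pos, best_cost = dict(pos), cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_t):
            a, b = rng.sample(cells, 2)          # random move: swap two cells
            pos[a], pos[b] = pos[b], pos[a]
            delta = hpwl(nets, pos) - cost       # change in WL
            if delta < 0 or rng.random() < math.exp(-delta / t):
                cost += delta                    # accept the move
                if cost < best_cost:
                    best_pos, best_cost = dict(pos), cost
            else:
                pos[a], pos[b] = pos[b], pos[a]  # reject: undo the swap
        t *= alpha                               # assumed geometric cooling
    return best_pos, best_cost

nets = [("a", "b"), ("b", "c"), ("c", "d")]
start = {"a": (0, 0), "c": (1, 0), "b": (2, 0), "d": (3, 0)}
placed, cost = anneal(nets, start)
print(hpwl(nets, start), cost)
```

At high T almost every move is accepted, so the search can climb out of a local minimum; as T drops, the loop behaves more and more like the greedy swap algorithm.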

Page 9

What do the cooling schedule and the corresponding cost function look like?

[source: I. Markov]
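As a rough numerical illustration, the Python sketch below tabulates a geometric cooling schedule and the probability of accepting a move that worsens WL by a fixed amount. The geometric form, the starting temperature, the factor 0.5, and the function names are assumptions; the schedule in the cited figure may differ.

```python
import math

def geometric_schedule(t_start, alpha, steps):
    """Yield temperatures t_start, alpha*t_start, alpha^2*t_start, ..."""
    t = t_start
    for _ in range(steps):
        yield t
        t *= alpha

def acceptance_probability(delta, t):
    """Improving or neutral moves are always accepted; others with e^(-delta/t)."""
    return 1.0 if delta <= 0 else math.exp(-delta / t)

temps = list(geometric_schedule(t_start=10.0, alpha=0.5, steps=6))
probs = [acceptance_probability(2.0, t) for t in temps]
for t, p in zip(temps, probs):
    print(f"T = {t:7.4f}   P(accept move with WL increase 2) = {p:.4f}")
```

The acceptance probability of a worsening move falls steadily as the temperature drops, which is why the cost curve fluctuates strongly early on and settles toward the end of the run.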

Page 10

Placement before & after simulated annealing

[using VPR tool]

Page 11

4. Routing

Assign an exact route in the FPGA fabric to each wire in the given circuit such that no two wires overlap

General idea:
• Order the wires according to some criteria
• Sequentially route each wire using a shortest-path algorithm (after removing the resources consumed by previously routed wires)

Page 12

Maze routing

Problem: Find the shortest path for a 2-pin wire from s to t

[Figure: routing grid in which a wavefront expands from s, labeling each reachable cell with its distance (1, 2, 3, ...) until t is reached. Legend: grid cell capacity is full; grid cell still has available tracks]

Speed-ups are possible using A* search and other AI search techniques
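The wave expansion can be sketched in Python as a breadth-first search followed by a backtrace from t, in the style of Lee's maze-routing algorithm. The grid encoding (1 for a cell that still has free tracks, 0 for a full cell), the example grid, and the function name maze_route are assumptions of this sketch.

```python
from collections import deque

def maze_route(grid, s, t):
    """Return a shortest path from s to t as a list of cells, or None."""
    rows, cols = len(grid), len(grid[0])
    dist = {s: 0}                      # wavefront labels
    queue = deque([s])
    while queue:
        r, c = queue.popleft()
        if (r, c) == t:
            break
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] and (nr, nc) not in dist):
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    if t not in dist:
        return None                    # t is unreachable
    # Backtrace: from t, step to any neighbor whose label is one smaller.
    path = [t]
    while path[-1] != s:
        r, c = path[-1]
        for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if dist.get(nb) == dist[(r, c)] - 1:
                path.append(nb)
                break
    return path[::-1]

# Hypothetical 3x5 grid; 0 marks cells whose capacity is full.
grid = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 0, 1],
]
path = maze_route(grid, s=(2, 2), t=(0, 2))
print(len(path) - 1, path)
```

Here the direct route is blocked, so the search returns the detour around the left side of the obstacle. An A* variant would explore the same grid but prioritize cells by distance so far plus an estimate of the remaining distance to t.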

Page 13

Impact of Net Ordering

A bad net ordering may unnecessarily increase the total wirelength or even render the chip unroutable!

• Example: Two nets A and B

[Figure: the same two nets routed in two different orders. B first then A (good order); A first then B (bad order)]

Ordering criteria: length in placement, timing criticality
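The effect of net ordering can be reproduced with a small Python sketch of sequential routing. The 3x5 grid, the pin locations of nets A and B, the unit capacity per grid cell, and the function names are all hypothetical choices for this example; they are not the layout drawn in the slide.

```python
from collections import deque

def shortest_path(rows, cols, blocked, s, t):
    """Breadth-first search on a grid; returns a list of cells or None."""
    prev = {s: None}
    queue = deque([s])
    while queue:
        cell = queue.popleft()
        if cell == t:
            path = []
            while cell is not None:    # walk the predecessor chain back to s
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        r, c = cell
        for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nb[0] < rows and 0 <= nb[1] < cols
                    and nb not in blocked and nb not in prev):
                prev[nb] = cell
                queue.append(nb)
    return None

def route_in_order(rows, cols, nets, order):
    """Route nets one by one; cells used by earlier nets become unavailable."""
    all_pins = {p for pins in nets.values() for p in pins}
    used = set()
    routes = {}
    for name in order:
        s, t = nets[name]
        blocked = used | (all_pins - {s, t})   # also keep off other nets' pins
        path = shortest_path(rows, cols, blocked, s, t)
        routes[name] = path
        if path is not None:
            used.update(path)
    return routes

NETS = {"A": ((0, 2), (2, 2)), "B": ((1, 1), (1, 3))}
good = route_in_order(3, 5, NETS, ["B", "A"])
bad = route_in_order(3, 5, NETS, ["A", "B"])
print("B then A:", good)
print("A then B:", bad)
```

With B routed first, A still finds a detour around B. With A routed first, A takes the whole middle column of this narrow grid and B can no longer be routed at all.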

Page 14

When a route for a net can’t be found then rip up and re-route

[Figure: three nets A, B, and C, shown in three steps. (1) Cannot route C. (2) So rip up B and route C first. (3) Finally route B.]

[Example from Prof. D. Pan Lecture]
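A minimal rip-up and re-route loop might look like the Python sketch below. When a net cannot be routed, the sketch removes one previously routed net at a time, routes the failed net, and then tries to route the removed net again. The victim selection (earliest routed net first), the grid, the two-net example, and the function names are assumptions; practical routers use more refined strategies.

```python
from collections import deque

def find_path(rows, cols, blocked, s, t):
    """Breadth-first search on a grid; returns a list of cells or None."""
    prev = {s: None}
    queue = deque([s])
    while queue:
        cell = queue.popleft()
        if cell == t:
            path = []
            while cell is not None:
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        r, c = cell
        for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nb[0] < rows and 0 <= nb[1] < cols
                    and nb not in blocked and nb not in prev):
                prev[nb] = cell
                queue.append(nb)
    return None

def route_with_ripup(rows, cols, nets, order):
    all_pins = {p for pins in nets.values() for p in pins}
    routes = {}

    def try_route(name):
        s, t = nets[name]
        blocked = all_pins - {s, t}
        for path in routes.values():
            blocked = blocked | set(path)
        return find_path(rows, cols, blocked, s, t)

    for name in order:
        path = try_route(name)
        if path is not None:
            routes[name] = path
            continue
        for victim in list(routes):
            saved = routes.pop(victim)           # rip up the victim
            path = try_route(name)
            if path is not None:
                routes[name] = path              # route the failed net first
                retry = try_route(victim)
                if retry is not None:
                    routes[victim] = retry       # finally re-route the victim
                    break
                del routes[name]                 # no luck: restore and move on
            routes[victim] = saved
    return routes

NETS = {"A": ((0, 2), (2, 2)), "B": ((1, 1), (1, 3))}
routes = route_with_ripup(3, 5, NETS, ["A", "B"])
print(routes)
```

In this example A is routed first and blocks B; ripping up A lets B take its direct route, and A is then re-routed around B.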

Page 15

VPR. After routing

[Figure panels: After placement; After placement and routing]

You probably saw similar layouts from the Quartus II tool

Page 16

Finally programming the FPGA

[Figure: FPGA with configuration data in and configuration data out. Legend: I/O pin/pad; SRAM cell]

Page 17

Summary

• Done with the software part of reconfigurable computing
• Next lecture: project overview
• The one after is the midterm
• Afterwards, we will start looking at SystemC as a higher-level method to synthesize systems