Hardware Compilation Gordon J. Pace December 2007


Page 1: Hardware Compilation Gordon J. Pace December 2007

Hardware Compilation

Gordon J. Pace

December 2007

Page 2: Hardware Compilation Gordon J. Pace December 2007

Introduction

Circuits can be used to implement algorithms …

And are obviously as expressive as software …

But would you want to design a circuit for a task by hand?

How can high level languages be automatically compiled to hardware?

Page 3: Hardware Compilation Gordon J. Pace December 2007

But …

Running compiled code as software involves simply putting the program in memory …

But running code compiled to hardware requires building the circuit. Or does it?

One can use programmable logic devices (such as FPGAs, field-programmable gate arrays).

Page 4: Hardware Compilation Gordon J. Pace December 2007

Programmable Logic Devices

Not as fast as ASIC (application-specific integrated circuit), and not as big, and not as cheap…

But you can download any circuit onto such a board!

Consists of a large array of gates which can be interconnected as desired.

May include more complex units on-board than simple logic gates.

Page 5: Hardware Compilation Gordon J. Pace December 2007

This lecture

I will be introducing the concepts behind hardware compilation, starting with a simple imperative parallel language and gradually extending it.

I will describe hardware using Lava – a hardware description language embedded in Haskell.

Which will enable me to show you the code of an actual hardware compiler.

Page 6: Hardware Compilation Gordon J. Pace December 2007

But before we start …

A short reminder about circuit components and a Lava primer

Page 7: Hardware Compilation Gordon J. Pace December 2007

Hardware Components

We will be using the following basic components in our circuits:

[Figure: gate symbols for and, or, not and delay]

Page 8: Hardware Compilation Gordon J. Pace December 2007

Building a Multiplexer

[Diagram: multiplexer with inputs sel, i0, i1 and output o]

o = if sel then i1 else i0

mux (sel, (i0, i1)) = or2 (m0, m1)
  where
    m0 = and2 (i0, inv sel)
    m1 = and2 (i1, sel)
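The multiplexer can be tried out without Lava installed. The following is a minimal simulation sketch, assuming signals are modelled as plain Haskell lists with one Bool per clock tick; this list model and the name `muxDemo` are assumptions made for illustration, not part of Lava.

```haskell
-- Simulation stand-in for Lava signals: one Bool per clock tick.
type Signal = [Bool]

inv :: Signal -> Signal
inv = map not

and2, or2 :: (Signal, Signal) -> Signal
and2 (a, b) = zipWith (&&) a b
or2  (a, b) = zipWith (||) a b

-- The multiplexer from the slide, unchanged apart from layout.
mux :: (Signal, (Signal, Signal)) -> Signal
mux (sel, (i0, i1)) = or2 (m0, m1)
  where
    m0 = and2 (i0, inv sel)
    m1 = and2 (i1, sel)

-- Hypothetical test vector: sel is low for two ticks, then high for two.
muxDemo :: Signal
muxDemo =
  mux ( [False, False, True, True]
      , ([True, False, True, False], [False, True, False, True]) )
```

While sel is low the output follows i0, and while it is high the output follows i1.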

Page 9: Hardware Compilation Gordon J. Pace December 2007

Building a Register

o = if write then i else keep the previous value

register (write, i) = o
  where
    o   = mux (write, (old, i))
    old = delay low o

[Diagram: register with inputs write and i and output o]
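The feedback loop through the delay can be simulated in plain Haskell thanks to laziness. This is a sketch only, assuming signals are lists with one Bool per clock tick and that `delay` prepends its initial value; the list model and the name `registerDemo` are assumptions, not Lava's implementation.

```haskell
-- Simulation stand-in for Lava signals: one Bool per clock tick.
type Signal = [Bool]

low :: Signal
low = repeat False

inv :: Signal -> Signal
inv = map not

and2, or2 :: (Signal, Signal) -> Signal
and2 (a, b) = zipWith (&&) a b
or2  (a, b) = zipWith (||) a b

-- Unit delay; the first argument supplies the initial value, as in `delay low o`.
delay :: Signal -> Signal -> Signal
delay initial s = take 1 initial ++ s

mux :: (Signal, (Signal, Signal)) -> Signal
mux (sel, (i0, i1)) = or2 (and2 (i0, inv sel), and2 (i1, sel))

-- The register from the slide: o feeds back into itself through the delay.
register :: (Signal, Signal) -> Signal
register (write, i) = o
  where
    o   = mux (write, (old, i))
    old = delay low o

-- Hypothetical test vector: write True at tick 0 and False at tick 3.
registerDemo :: Signal
registerDemo =
  register ( [True, False, False, True, False]
           , [True, False, False, False, True] )
```

In this model the written value appears on the output in the same tick as the write and is held until the next write.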

Page 10: Hardware Compilation Gordon J. Pace December 2007

Back to Hardware Compilation

How can one compile a high-level program into a hardware circuit?

main (input n, pow: int16, output result: int16) {
  local e: int16;
  result := 1;
  e := pow;
  while (e > 0) {
    result := result * n;
    e := e - 1;
  }
}

[Diagram: circuit with inputs n and pow and output result]

Page 11: Hardware Compilation Gordon J. Pace December 2007

But how should it work?

Option 1:

Execute the algorithm at every clock tick

What about non-terminating programs?

Behaviour is identical in all clock ticks

Algorithms can only describe combinational circuits

What about long combinational logic?

How long should a clock tick take?

Page 12: Hardware Compilation Gordon J. Pace December 2007

But how should it work?

Option 2:

Program runs over several clock ticks

When will the result be available?

How will we know that the result is ready?

Should the control over clock ticks be in the hands of the programmer or the compiler?

Page 13: Hardware Compilation Gordon J. Pace December 2007

Mini-Flash

An imperative parallel language with one output variable

Page 14: Hardware Compilation Gordon J. Pace December 2007

Mini-Flash Syntax

program ::= Skip
          | Delay
          | Emit
          | program ; program
          | if signal then program else program
          | while signal do program
          | program || program

Page 15: Hardware Compilation Gordon J. Pace December 2007

Informal semantics of Mini-Flash

Skip terminates immediately and does nothing.

Delay takes one clock tick to terminate.

; is sequential composition.

|| is fork-join parallel composition: execute in parallel, and terminate once both sides have terminated.

Conditionals and loops work as usual.

Page 16: Hardware Compilation Gordon J. Pace December 2007

What about Emit?

The program has only one output wire, which is always low unless an Emit instruction has been executed in that clock tick.

The Emit instruction terminates immediately.

Page 17: Hardware Compilation Gordon J. Pace December 2007

Macros

We will be defining macros to avoid giving long, difficult-to-understand programs as examples, and to avoid having to deal with function definitions.

emit1 ≡ Emit; Delay

Page 18: Hardware Compilation Gordon J. Pace December 2007

Example: An Oscillator

oscillator ≡ forever { Delay; emit1 }

where forever is defined as:

forever P ≡ while high do P

Page 19: Hardware Compilation Gordon J. Pace December 2007

Example: Reading Input

Copy the input to the output:

copy in ≡ forever { if in then emit1 else Delay }

Copy input to output but avoiding two sequential high values:

safecopy in ≡ forever { if in then { emit1; Delay } else Delay }

Page 20: Hardware Compilation Gordon J. Pace December 2007

Example: Waiting for a Signal

Block the program until a signal becomes high:

wait in ≡ while (inv in) do Delay

Page 21: Hardware Compilation Gordon J. Pace December 2007

A final example: A detonator

Two persons get a controller each. Each controller has two buttons, which must be pressed in sequence (possibly with a pause in between) to enable. Both controllers must be triggered to enable the detonator.

detonator ((a, b), (c, d)) ≡ { wait a; wait b } || { wait c; wait d }; Emit

[Diagram: detonator with input buttons a, b, c and d]

Page 22: Hardware Compilation Gordon J. Pace December 2007

Compiling Mini-Flash

From programs to circuits.

Page 23: Hardware Compilation Gordon J. Pace December 2007

The Problem

Unlike execution of software on traditional processors, hardware is inherently parallel.

How do we perform sequential composition?

How do we know a circuit has finished its computation?

Page 24: Hardware Compilation Gordon J. Pace December 2007

The Shape of Circuits to Come

Solution: The circuits will have an extra input, start, to start their execution, and an extra output, finish, to mark their termination:

[Circuit diagram: inputs and start go in; finish and emit come out]

Page 25: Hardware Compilation Gordon J. Pace December 2007

Compiling Skip

[Circuit diagram: start, finish, emit (low)]

Page 26: Hardware Compilation Gordon J. Pace December 2007

Compiling Delay

[Circuit diagram: start, finish, emit (low)]

Page 27: Hardware Compilation Gordon J. Pace December 2007

Compiling Emit

[Circuit diagram: start, finish, emit]

Page 28: Hardware Compilation Gordon J. Pace December 2007

Compiling P;Q

[Circuit diagram: sub-circuits P and Q; start, finish, emit]

Page 29: Hardware Compilation Gordon J. Pace December 2007

Compiling Conditionals

[Circuit diagram: sub-circuits P and Q with input cond; start, finish, emit]

Page 30: Hardware Compilation Gordon J. Pace December 2007

Compiling Parallel Composition

[Circuit diagram: sub-circuits P and Q with a synchroniser; start, finish, emit]

Page 31: Hardware Compilation Gordon J. Pace December 2007

The Synchroniser

Outputs high when both inputs have become high …

Can be implemented using the other language constructs¹:

synchronise (f1, f2) ≡
  forever {
    wait (or2 (f1, f2));
    if (and2 (f1, f2)) then Skip else { Delay; wait (or2 (f1, f2)) };
    emit1
  }

¹ Well, this synchroniser almost always works …
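For comparison, a synchroniser can also be built directly from gates and delays. The sketch below is an assumed alternative design, not the Mini-Flash definition from the slide: it remembers which side has already finished and fires in the tick where both have. Signals are simulated as lists with one Bool per clock tick, which is also an assumption of this sketch.

```haskell
-- Simulation stand-in for Lava signals: one Bool per clock tick.
type Signal = [Bool]

low :: Signal
low = repeat False

inv :: Signal -> Signal
inv = map not

and2, or2 :: (Signal, Signal) -> Signal
and2 (a, b) = zipWith (&&) a b
or2  (a, b) = zipWith (||) a b

-- Unit delay; the first argument supplies the initial value.
delay :: Signal -> Signal -> Signal
delay initial s = take 1 initial ++ s

-- Hypothetical register-based synchroniser (not the slide's definition).
synchroniser :: (Signal, Signal) -> Signal
synchroniser (f1, f2) = out
  where
    got1  = or2 (f1, seen1)                   -- side 1 finished now or earlier
    got2  = or2 (f2, seen2)                   -- side 2 finished now or earlier
    out   = and2 (got1, got2)                 -- both sides have finished
    seen1 = delay low (and2 (got1, inv out))  -- remember until the join fires
    seen2 = delay low (and2 (got2, inv out))

-- Side 1 finishes at tick 1, side 2 at tick 3.
syncDemo :: Signal
syncDemo =
  synchroniser ( [False, True, False, False, False, False]
               , [False, False, False, True, False, False] )
```

In this sketch the memory is cleared when the join fires, so the circuit can be reused if the parallel block is started again later.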

Page 32: Hardware Compilation Gordon J. Pace December 2007

Compiling Loops

[Circuit diagram: sub-circuit P with input cond; start, finish, emit]

Page 33: Hardware Compilation Gordon J. Pace December 2007

Beware!

Compiling:

while high do Skip

[Circuit diagram, built up over several slides: the loop circuit with cond set to high and Skip as the body; wires start, finish, emit (low)]

Combinational loop!

To avoid combinational loops, ensure that bodies of while loops always take time to execute.

Page 40: Hardware Compilation Gordon J. Pace December 2007

Writing a Mini-Flash Compiler

From programs to circuits in one page of code!

Page 41: Hardware Compilation Gordon J. Pace December 2007

Mini-Flash Syntax

data MiniFlash
  = Skip
  | Delay
  | Emit
  | MiniFlash :>: MiniFlash
  | IfThenElse Signal (MiniFlash, MiniFlash)
  | While Signal MiniFlash
  | MiniFlash :|: MiniFlash
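With the syntax embedded as a datatype, the macros from the earlier slides become ordinary Haskell functions. The sketch below assumes a small symbolic type `Sig` in place of Lava's Signal so that programs can be compared and printed. `Sig`, the fixity declarations and the explicit grouping in `detonator` are assumptions of this sketch.

```haskell
-- Hypothetical symbolic signal expressions, standing in for Lava's Signal type.
data Sig = Var String | High | Low | Inv Sig | And2 Sig Sig | Or2 Sig Sig
  deriving (Eq, Show)

infixr 3 :>:  -- assumed fixities; the slides do not declare any
infixr 2 :|:

data MiniFlash
  = Skip
  | Delay
  | Emit
  | MiniFlash :>: MiniFlash
  | IfThenElse Sig (MiniFlash, MiniFlash)
  | While Sig MiniFlash
  | MiniFlash :|: MiniFlash
  deriving (Eq, Show)

-- emit1 ≡ Emit; Delay
emit1 :: MiniFlash
emit1 = Emit :>: Delay

-- forever P ≡ while high do P
forever :: MiniFlash -> MiniFlash
forever = While High

-- oscillator ≡ forever { Delay; emit1 }
oscillator :: MiniFlash
oscillator = forever (Delay :>: emit1)

-- wait in ≡ while (inv in) do Delay
wait :: Sig -> MiniFlash
wait s = While (Inv s) Delay

-- copy in ≡ forever { if in then emit1 else Delay }
copy :: Sig -> MiniFlash
copy inp = forever (IfThenElse inp (emit1, Delay))

-- The detonator, read as "run both sides in parallel, then Emit".
detonator :: ((Sig, Sig), (Sig, Sig)) -> MiniFlash
detonator ((a, b), (c, d)) =
  ((wait a :>: wait b) :|: (wait c :>: wait d)) :>: Emit
```

Because macros are expanded by Haskell before compilation, the compiler itself only ever sees the seven core constructs.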

Page 42: Hardware Compilation Gordon J. Pace December 2007

Mini-Flash in Lava

The type of circuits produced:

type MiniFlashCircuit = Signal -> (Signal, Signal)

The compiler:

compile :: MiniFlash -> MiniFlashCircuit

Page 43: Hardware Compilation Gordon J. Pace December 2007

Compiling Skip

[Circuit diagram: start, finish, emit (low)]

compile Skip start = (finish, emit)
  where
    finish = start
    emit   = low

Page 44: Hardware Compilation Gordon J. Pace December 2007

Compiling Delay

compile Delay start = (finish, emit)
  where
    finish = delay low start
    emit   = low

[Circuit diagram: start, finish, emit (low)]

Page 45: Hardware Compilation Gordon J. Pace December 2007

Compiling Emit

[Circuit diagram: start, finish, emit]

compile Emit start = (finish, emit)
  where
    finish = start
    emit   = start

Page 46: Hardware Compilation Gordon J. Pace December 2007

Compiling P;Q

compile (p :>: q) start = (finish, emit)
  where
    (middle, emit1) = compile p start
    (finish, emit2) = compile q middle
    emit            = or2 (emit1, emit2)

[Circuit diagram: sub-circuits P and Q; start, finish, emit]

Page 47: Hardware Compilation Gordon J. Pace December 2007

Compiling Conditional

compile (IfThenElse c (p, q)) start = (finish, emit)
  where
    (finish1, emit1) = compile p (and2 (start, c))
    (finish2, emit2) = compile q (and2 (start, inv c))
    emit             = or2 (emit1, emit2)
    finish           = or2 (finish1, finish2)

[Circuit diagram: sub-circuits P and Q; start, finish, emit]

Page 48: Hardware Compilation Gordon J. Pace December 2007

Compiling Parallel Composition

compile (p :|: q) start = (finish, emit)
  where
    (finish1, emit1) = compile p start
    (finish2, emit2) = compile q start
    emit             = or2 (emit1, emit2)
    finish           = synchroniser (finish1, finish2)

[Circuit diagram: sub-circuits P and Q with a synchroniser; start, finish, emit]

Page 49: Hardware Compilation Gordon J. Pace December 2007

Compiling Loops

compile (While c p) start = (finish, emit)
  where
    start1          = or2 (start, finish1)
    (finish1, emit) = compile p (and2 (c, start1))
    finish          = and2 (inv c, start1)

[Circuit diagram: sub-circuit P with input cond; start, finish, emit]
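The compilation rules can be exercised without Lava by simulating signals as lazy lists, one Bool per clock tick. The following is a sketch under that assumption, not the Lava implementation. Parallel composition is left out because it needs a synchroniser circuit. The list model, the fixity declaration and the names `startPulse` and `oscillatorTrace` are assumptions.

```haskell
-- Simulation stand-in for Lava: a signal is a lazy list, one Bool per tick.
type Signal = [Bool]

low, high :: Signal
low  = repeat False
high = repeat True

inv :: Signal -> Signal
inv = map not

and2, or2 :: (Signal, Signal) -> Signal
and2 (a, b) = zipWith (&&) a b
or2  (a, b) = zipWith (||) a b

-- Unit delay; the first argument supplies the initial value.
delay :: Signal -> Signal -> Signal
delay initial s = take 1 initial ++ s

infixr 3 :>:  -- assumed fixity; the slides do not declare one

data MiniFlash
  = Skip
  | Delay
  | Emit
  | MiniFlash :>: MiniFlash
  | IfThenElse Signal (MiniFlash, MiniFlash)
  | While Signal MiniFlash

type MiniFlashCircuit = Signal -> (Signal, Signal)

-- Each case returns (finish, emit), following the slides.
compile :: MiniFlash -> MiniFlashCircuit
compile Skip  start = (start, low)
compile Delay start = (delay low start, low)
compile Emit  start = (start, start)
compile (p :>: q) start = (finish, or2 (emit1, emit2))
  where
    (middle, emit1) = compile p start
    (finish, emit2) = compile q middle
compile (IfThenElse c (p, q)) start =
  (or2 (finish1, finish2), or2 (emit1, emit2))
  where
    (finish1, emit1) = compile p (and2 (start, c))
    (finish2, emit2) = compile q (and2 (start, inv c))
compile (While c p) start = (finish, emit)
  where
    start1          = or2 (start, finish1)  -- feedback from the loop body
    (finish1, emit) = compile p (and2 (c, start1))
    finish          = and2 (inv c, start1)

-- A single start pulse on the first clock tick.
startPulse :: Signal
startPulse = True : repeat False

-- oscillator ≡ forever { Delay; Emit; Delay }
oscillator :: MiniFlash
oscillator = While high (Delay :>: Emit :>: Delay)

-- First six ticks of the oscillator's emit wire.
oscillatorTrace :: [Bool]
oscillatorTrace = take 6 (snd (compile oscillator startPulse))
```

The feedback in the While case only works in this model because the loop body contains a Delay. With a body such as Skip, the lazy lists would depend on themselves within a single tick and the simulation would not terminate, mirroring the combinational loop in hardware.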

Page 50: Hardware Compilation Gordon J. Pace December 2007

Sanity Check for Loop Body

takesTime Skip                  = False
takesTime Emit                  = False
takesTime Delay                 = True
takesTime (IfThenElse _ (p, q)) = takesTime p && takesTime q
takesTime (p :>: q)             = takesTime p || takesTime q
takesTime (p :|: q)             = takesTime p || takesTime q
takesTime (While _ _)           = False

Warning! This may give false negatives.
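One way to use this check is to walk the whole program and test every loop body before compiling. The sketch below assumes a placeholder `Signal` type (a String) since only the program structure matters here. The helper `loopsTakeTime` and the fixity declarations are hypothetical additions, not part of the slides.

```haskell
-- Placeholder for Lava's Signal type; only the program structure is inspected.
type Signal = String

infixr 3 :>:  -- assumed fixities
infixr 2 :|:

data MiniFlash
  = Skip
  | Delay
  | Emit
  | MiniFlash :>: MiniFlash
  | IfThenElse Signal (MiniFlash, MiniFlash)
  | While Signal MiniFlash
  | MiniFlash :|: MiniFlash

-- The conservative check from the slide.
takesTime :: MiniFlash -> Bool
takesTime Skip                  = False
takesTime Emit                  = False
takesTime Delay                 = True
takesTime (IfThenElse _ (p, q)) = takesTime p && takesTime q
takesTime (p :>: q)             = takesTime p || takesTime q
takesTime (p :|: q)             = takesTime p || takesTime q
takesTime (While _ _)           = False

-- Hypothetical helper: True when every loop body is known to take time.
loopsTakeTime :: MiniFlash -> Bool
loopsTakeTime (While _ p)           = takesTime p && loopsTakeTime p
loopsTakeTime (p :>: q)             = loopsTakeTime p && loopsTakeTime q
loopsTakeTime (p :|: q)             = loopsTakeTime p && loopsTakeTime q
loopsTakeTime (IfThenElse _ (p, q)) = loopsTakeTime p && loopsTakeTime q
loopsTakeTime _                     = True
```

Here `loopsTakeTime (While "high" Skip)` is False, so the problematic program from the earlier slides is rejected. Since `takesTime` returns False for every loop, a loop whose body is itself just a loop is also rejected by this sketch.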

Page 51: Hardware Compilation Gordon J. Pace December 2007

Extending the Language

Page 52: Hardware Compilation Gordon J. Pace December 2007

Assignments

Extending the language to still have one output variable, but set using assignments rather than Emit signals.

Changes the shape of the circuits produced:

[Circuit diagram: inputs and start go in; finish, assign and value come out]

assign: high when the output is being assigned a value

value: contains the value which is being assigned

Page 54: Hardware Compilation Gordon J. Pace December 2007

Assignments

Combining the output of two circuits now requires more logic:

[Circuit diagram: (value1, assign1) and (value2, assign2) are combined into (value, assign)]

What happens if two parallel blocks assign to the variable at the same time?

Page 56: Hardware Compilation Gordon J. Pace December 2007

Assignments

At the top level, we can extract the actual value of the output:

[Circuit diagram: the circuit's assign and value outputs feed a register (reg), whose output is the actual value]
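The assignment wiring can be sketched in the same list-based simulation style. The slides do not give the combining logic, so `merge` below is one assumed possibility: each value is masked by its own assign wire and the results are or-ed together. The top-level extraction reuses the register from the primer. The list model and the names `merge`, `actualValue` and `assignDemo` are assumptions of this sketch.

```haskell
-- Simulation stand-in for Lava signals: one Bool per clock tick.
type Signal = [Bool]

low :: Signal
low = repeat False

inv :: Signal -> Signal
inv = map not

and2, or2 :: (Signal, Signal) -> Signal
and2 (a, b) = zipWith (&&) a b
or2  (a, b) = zipWith (||) a b

delay :: Signal -> Signal -> Signal
delay initial s = take 1 initial ++ s

mux :: (Signal, (Signal, Signal)) -> Signal
mux (sel, (i0, i1)) = or2 (and2 (i0, inv sel), and2 (i1, sel))

register :: (Signal, Signal) -> Signal
register (write, i) = o
  where
    o   = mux (write, (old, i))
    old = delay low o

-- Hypothetical combining logic for two (assign, value) pairs.
merge :: (Signal, Signal) -> (Signal, Signal) -> (Signal, Signal)
merge (assign1, value1) (assign2, value2) = (assign, value)
  where
    assign = or2 (assign1, assign2)
    value  = or2 (and2 (assign1, value1), and2 (assign2, value2))

-- Top level: a register holds the most recently assigned value.
actualValue :: (Signal, Signal) -> Signal
actualValue (assign, value) = register (assign, value)

-- One block assigns True at tick 1, the other assigns False at tick 3.
assignDemo :: Signal
assignDemo =
  actualValue
    (merge ([False, True, False, False, False], [False, True, False, False, False])
           ([False, False, False, True, False], [True, True, True, False, True]))
```

With this particular `merge`, two simultaneous assignments yield the or of the two values. That is a property of this sketch only; the slides leave the question open.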

Page 57: Hardware Compilation Gordon J. Pace December 2007

Reading the Output Variable

We simply add another input wire (which could be bundled with the inputs we already have) and connect it at the top level:

[Circuit diagram: as on the previous slide, with the register's actual value fed back into the circuit as an extra input]

Page 58: Hardware Compilation Gordon J. Pace December 2007

Multiple Output Variables

Can be done by replicating the output wires:

[Circuit diagram: inputs and start go in; finish and emit1, emit2, …, emitn come out]

Page 59: Hardware Compilation Gordon J. Pace December 2007

Multiple Output Variables

And replicating the logic used to combine the outputs.

[Circuit diagram: emitP1 … emitPn and emitQ1 … emitQn are combined pairwise into emit1 … emitn]

Note that the number of outputs has to be statically determined at compile-time!
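Replicating the combining logic amounts to applying the same gate to each pair of wires. A minimal sketch, assuming the n emit wires of a circuit are held in a Haskell list and signals are simulated as lists of Bool; the name `combineOutputs` is hypothetical.

```haskell
-- Simulation stand-in for Lava signals: one Bool per clock tick.
type Signal = [Bool]

or2 :: (Signal, Signal) -> Signal
or2 (a, b) = zipWith (||) a b

-- Hypothetical helper: combine the emit wires of P and Q output by output.
combineOutputs :: [Signal] -> [Signal] -> [Signal]
combineOutputs = zipWith (curry or2)
```

For example, `combineOutputs [[True, False], [False, False]] [[False, False], [False, True]]` gives `[[True, False], [False, True]]`. The list of wires has a fixed length when the circuit is generated, which matches the requirement that the number of outputs is known at compile time.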

Page 61: Hardware Compilation Gordon J. Pace December 2007

Other Possible Extensions

Channel communication.

Common blocks of code (reusing the same hardware is not always possible).

Adding in-built functions (e.g. arithmetic).

Dynamic variable allocation (well, sort of).

Page 62: Hardware Compilation Gordon J. Pace December 2007

Conclusions

Page 63: Hardware Compilation Gordon J. Pace December 2007

Hardware Compilation

So algorithms can be compiled directly into circuits.

The extra start and finish wires play the role that the program counter plays in software.

After all, hardware is not so different from software.

Page 64: Hardware Compilation Gordon J. Pace December 2007

Other Issues Arising

Compact placement of gates is an NP-hard problem. How do we decide placement at compile time?

Some compilation/synthesis techniques compile algorithms without explicit timing.

The compilation presented here is very naïve. How can one optimise (number of gates, power consumption, combinational depth, etc)?

A single algorithm can now be realised in a combination of software and hardware. How can one partition code for software-hardware codesign effectively?

Page 65: Hardware Compilation Gordon J. Pace December 2007

Extra slides

Page 66: Hardware Compilation Gordon J. Pace December 2007

Example: Waiting for an edge

Detect a rising edge:

rising in ≡ forever { wait (inv in); wait in; emit1 }

A falling edge:

falling in ≡ rising (inv in)

Any edge:

edge in ≡ rising in || falling in