5/11/2008
1
Lava 4 (relevant to take home exam)
Stepping back to see the bigger picture
Where can more info. be found?
What are the hot research topics?
1
Prefix
Given inputs x1, x2, x3 … xn
Compute x1, x1*x2, x1*x2*x3, … , x1*x2*…*xn
Where * is an arbitrary associative (but not necessarily commutative) operator
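As a reference point, the specification above can be written directly in plain Haskell with the standard `scanl1`. This is only a sketch of the specification (a serial computation), not a circuit description; the names `prefixSpec`, `sums` and `concats` are made up for illustration.

```haskell
-- Reference specification of prefix: running "product" from the left.
-- The operator must be associative but need not be commutative.
prefixSpec :: (a -> a -> a) -> [a] -> [a]
prefixSpec = scanl1

-- Addition is associative and commutative.
sums :: [Int]
sums = prefixSpec (+) [1 .. 10]

-- String concatenation is associative but NOT commutative,
-- so the order of the inputs must be preserved.
concats :: [String]
concats = prefixSpec (++) ["a", "b", "c"]
```

Here `sums` evaluates to `[1,3,6,10,15,21,28,36,45,55]` and `concats` to `["a","ab","abc"]`.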
2
Why interesting?
Microprocessors contain LOTS of parallel prefix circuits
not only binary and FP adders
address calculation
priority encoding etc.
Overall performance depends on making them fast
But they should also have low power consumption...
Parallel prefix is a good example of a connection pattern for which it is interesting to do better synthesis
3
Serial prefix
(figure: wires ordered from least significant to most significant)
inputs n=8, depth d=7, size s=7 (number of ops)
Pictures generated by symbolic evaluation of Lava descriptions
Style is specific to parallel prefix
4
5
serr _  [a]      = [a]
serr op (a:b:bs) = a:cs
  where
    c  = op(a,b)
    cs = serr op (c:bs)
*Main> simulate (serr plus) [1..10]
[1,3,6,10,15,21,28,36,45,55]
Sklansky
6
Sklansky
32 inputs, depth 5, 80 operators
7
skl _  [a] = [a]
skl op as  = init los ++ ros'
  where
    (los,ros) = (skl op las, skl op ras)
    ros'      = fan op (last los : ros)
    (las,ras) = halveList as
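To try the Sklansky construction outside Lava, the following is a minimal plain-list sketch. The definitions of `halveList` and `fan` are assumptions standing in for the Lava combinators, the empty-list case is added so that the function is total, and `sklSize` is a hypothetical helper that mirrors the recursion to count operators.

```haskell
-- Assumed stand-in: split a list into two halves (left half gets n `div` 2).
halveList :: [a] -> ([a], [a])
halveList xs = splitAt (length xs `div` 2) xs

-- Assumed stand-in: the first wire passes through and is combined
-- with each of the other wires.
fan :: ((a, a) -> a) -> [a] -> [a]
fan _  []       = []
fan op (a : bs) = a : [op (a, b) | b <- bs]

-- Sklansky: solve both halves, then fan the last output of the
-- left half across all outputs of the right half.
skl :: ((a, a) -> a) -> [a] -> [a]
skl _  []  = []
skl _  [a] = [a]
skl op xs  = init los ++ ros'
  where
    (las, ras) = halveList xs
    (los, ros) = (skl op las, skl op ras)
    ros'       = fan op (last los : ros)

-- Operator count of the construction above, following the same recursion.
sklSize :: Int -> Int
sklSize n
  | n < 2     = 0
  | otherwise = sklSize l + sklSize r + r
  where
    l = n `div` 2
    r = n - l
```

With these definitions `skl (uncurry (+)) [1..10]` agrees with `scanl1 (+) [1..10]`, and `sklSize 32` gives the 80 operators quoted for 32 inputs.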
8
9
Brent Kung
fewer ops, at cost of being deeper. Fanout only 2
BK recursive pattern
10
P is another half-size network operating only on the thick wires
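The recursive pattern can be sketched on plain lists as follows. This is one reading of the picture, not the Lava code from the lecture: the top layer combines adjacent pairs (the thick wires), P is the recursive call on those results, and a bottom layer fixes up the remaining positions. The names `bk` and `bkSize` and the handling of an odd leftover wire are assumptions.

```haskell
-- Brent-Kung style recursion on plain lists (a sketch).
bk :: ((a, a) -> a) -> [a] -> [a]
bk _  []  = []
bk _  [a] = [a]
bk op xs@(x0 : rest) = x0 : bottom zs rest
  where
    -- P: a half-size prefix network on the pair results (thick wires)
    zs = bk op (pairs xs)

    -- top layer: combine adjacent pairs; an odd leftover wire bypasses P
    pairs (a : b : more) = op (a, b) : pairs more
    pairs _              = []

    -- bottom layer: odd positions take P's outputs directly,
    -- later even positions need one more operator each
    bottom (z : zs') (_ : e : more) = z : op (z, e) : bottom zs' more
    bottom (z : _)   [_]            = [z]
    bottom _         _              = []

-- Operator count of this construction, following the same recursion.
bkSize :: Int -> Int
bkSize n
  | n < 2     = 0
  | otherwise = n `div` 2 + bkSize (n `div` 2) + (n - 1) `div` 2
```

For 32 inputs `bkSize 32` is 57, compared with 80 operators for the Sklansky network with the same number of inputs.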
11
Ladner Fischer
NOT the same as Sklansky; many books and papers are wrong about this (including slides from Digital Circuit Design course)
Question
How do we design fast low power prefix networks?
12
Answer
Generalise the above recursive constructions
Use dynamic programming to search for a good solution
Use Wired to increase accuracy of power and delay estimations (see later lecture by Emil)
13
BK recursive pattern
14
P is another half-size network operating only on the thick wires
This is an alternative view to the "forwards and backwards trees" that some of you saw in Jeppson's course
BK recursive pattern generalised
15
Each S is a serial network like that shown earlier
16
4 2 3 … 4
This sequence of numbers determines how the outer "layer" looks
17
Top: 4 2 3 … 4
Bottom: 4 2 3 … 4, with -1 on the first entry and +1 on the last
Sequence for widths of fans at bottom is closely related
18
Top: 4 2 3 … 4
Bottom: 3 2 3 … 5
Sequence for widths of fans at bottom is closely related
19
4 2 3 … 4
So just look at all possibilities for this sequence
and for each one find the best possibility for the smaller P
Then pick best overall!
Dynamic programming
Search!
need a measure function (e.g. number of operators)
Very similar to a "shortest paths" algorithm
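The search idea can be illustrated with a much simpler family of networks than the one used in the lecture. The sketch below is an assumption-laden toy: it only considers "split, solve both parts within depth d-1, then fan the last left output across the right part", uses the number of operators as the measure function, and memoises through a lazily built table. `minOps` and its helpers are hypothetical names; `Nothing` plays the role of failure.

```haskell
-- Fewest operators for n inputs within depth d, over the toy family only.
minOps :: Int -> Int -> Maybe Int
minOps n d
  | n < 1 || d < 0 = Nothing
  | otherwise      = table !! n !! d
  where
    -- memo table: each entry is evaluated at most once (laziness)
    table :: [[Maybe Int]]
    table = [[best k e | e <- [0 .. d]] | k <- [0 .. n]]

    best :: Int -> Int -> Maybe Int
    best 1 _ = Just 0          -- base case: a single wire needs no operators
    best _ 0 = Nothing         -- no depth left: fail
    best k e = pickBest
      [ (\a b -> a + b + r) <$> table !! l !! (e - 1) <*> table !! r !! (e - 1)
      | l <- [1 .. k - 1], let r = k - l ]   -- r operators for the final fan

    -- drop the failed candidates, then pick the best of the rest
    pickBest :: [Maybe Int] -> Maybe Int
    pickBest cs = case [c | Just c <- cs] of
      [] -> Nothing
      ok -> Just (minimum ok)
```

In this toy family `minOps 32 5` is `Just 80` (the splits are forced to be balanced, which gives Sklansky), `minOps 8 7` is `Just 7` (the serial network), and `minOps 3 1` is `Nothing` because three inputs do not fit in depth 1.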
20
21
The real code!
wsoE f1 g ctx = getans (error "no fit") (prefix f1 ctx)
  where
    prefix f = memo pm
      where
        pm ([d],_,w) = trywire ([d],w)
        pm (is,_,w) | 2^h < length is = Fail
          where h = maxd(is,w)
        pm (is,xs,w) = ((bestOnE xs is f).dropFail)
                         [wrpC ds (prefix f) | ds <- topds g h (length is)]
          where
            . . . .
22
f1 is the measure function being optimised for
23
g is max width of small S and F networks. Controls fanout.
24
The context is the triple (is,xs,w): delays in, wire numbers (positions) in, allowed depth
25
use memoisation to avoid expensive recomputation
26
base case: single wire
27
Fail if it is simply impossible to fit a prefix network in the available depth
28
For each candidate sequence: build the resulting network (where the call of (prefix f) gives the best network for the recursive call inside)
(Needed to think hard about controlling size of search space)
29
The real code!
parpre f1 g ctx = getans (error "no fit") (prefix f1 ctx)
  where
    prefix f = memo pm
      where
        pm ([d],_,w) = trywire ([d],w)
        pm (is,_,w) | 2^h < length is = Fail
          where h = maxd(is,w)
        pm (is,xs,w) = ((bestOnE xs is f).dropFail)
                         [wrpC ds (prefix f) | ds <- topds g h (length is)]
          where
            . . . .
Finally, pick the best among all these candidates
30
Result when minimising number of ops, depth 6, 33 inputs, fanout 7
This network is Depth Size Optimal (DSO)
depth + number of ops = 2(number of inputs)-2 (known to be smallest possible no. ops for given depth, inputs)
6 + 58 = 2*33 – 2
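The depth-size optimality condition is simple arithmetic, so it can be checked mechanically. A minimal sketch, with `isDSO` as a hypothetical helper name and the figures taken from the slides:

```haskell
-- A network is Depth Size Optimal when depth + number of ops = 2n - 2.
isDSO :: Int  -- number of inputs n
      -> Int  -- depth
      -> Int  -- number of operators (size)
      -> Bool
isDSO n depth size = depth + size == 2 * n - 2

-- Figures quoted on the slides:
searchResult33, serial8, sklansky32 :: Bool
searchResult33 = isDSO 33 6 58   -- 6 + 58 = 64 = 2*33 - 2
serial8        = isDSO 8 7 7     -- 7 + 7  = 14 = 2*8 - 2
sklansky32     = isDSO 32 5 80   -- 5 + 80 = 85, but 2*32 - 2 = 62
```

The first two are `True`; the Sklansky network with 32 inputs is not DSO, which reflects the price paid in operators for its minimum depth.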
31
64 inputs, depth 8, size 118 (also DSO)
BUT not min. depth.
We need to move away from DSO if we want shallow networks
A further generalisation
32
33
parpre1 f1 f2 g m ctx = getans (error "no fit") (prefix f1 ctx)
  where
    prefix f = memo pm
      where
        pm ([],_,w)  = trywire ([],w)
        pm ([i],_,w) = trywire ([i],w)
        pm (is,_,w) | 2^h < length is = Fail
          where h = maxd(is,w)
        pm (is,xs,w) = ((bestOnE xs is f).dropFail)
                         [wrpC1 ds (prefix f) (prefix f2) | ds <- topds1 g h m lis]
34
extra base case for 0 inputs
35
now there are 2 recursive calls
Result
When minimising no. of ops: gives same as Ladner Fischer for 2^n inputs, depth n,
considerably fewer ops and lower fanout elsewhere (non power of 2, deeper)
Translates into low power plus decent speed when exported to Design Compiler
36
37
Link to Wired allows more accurate estimates. Can then explore design space
38
Can also export to Cadence SoC Encounter
Wired
Start with Lava-like description and then gradually add placement info. + wiring "guides"
Can still use our bag of programming tricks
(still embedded in Haskell)
Quick but relatively accurate design exploration
See lecture by Emil on Thursday
39
Obvious questions
This is very low level. What about higher up, earlier in the design?
(Tentative assertion: these were general programming idioms with possible application at other levels of abstraction.)
What about the cases when such a structural approach is inappropriate?
Can we make refinement work?
Can we design appropriate GENERIC verification methods?
40
Putting the designer in control
Connection patterns are essential first step (and give some layout awareness when wanted)
We write circuit generators rather than circuit descriptions. Everything is done behind the scenes by symbolic evaluation. Full power of Haskell is available to the user (but we have some useful idioms to reduce the fear).
Circuit generators are short and sweet and LOOK LIKE circuit descriptions.
41
It’s all about programming
Non-standard interpretation used after generation (as we have long done) and now also to guide synthesis
Clever circuits are a good idiom. Can control choice of components, wiring and topology. Greatly increase expressive power of the connection patterns approach.
Having a full functional language available is great once one has had some practice. More idioms to be discovered
Ideas compatible with Intel’s IDV
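Non-standard interpretation can be illustrated without Lava: the same generator is run once with an operator on numbers and once with an operator that tracks logic depth. This is a plain-list sketch; `plus` is defined here as an assumption (the lecture uses Lava's), and `depthOp` and `depthOf` are hypothetical names.

```haskell
-- Serial prefix generator, as on the earlier slide, with an added
-- empty-list case so that the function is total.
serr :: ((a, a) -> a) -> [a] -> [a]
serr _  []           = []
serr _  [a]          = [a]
serr op (a : b : bs) = a : serr op (op (a, b) : bs)

-- Standard interpretation: wires carry numbers.
plus :: (Int, Int) -> Int
plus (a, b) = a + b

-- Non-standard interpretation: wires carry their logic depth.
-- An operator's output is one level deeper than its deepest input.
depthOp :: (Int, Int) -> Int
depthOp (a, b) = 1 + max a b

-- Depth of a network on n >= 1 inputs, all arriving at depth 0.
depthOf :: (((Int, Int) -> Int) -> [Int] -> [Int]) -> Int -> Int
depthOf net n = maximum (net depthOp (replicate n 0))
```

`serr plus [1..10]` reproduces the simulation result `[1,3,6,10,15,21,28,36,45,55]`, while `depthOf serr 8` gives 7, the depth quoted for the serial network with 8 inputs.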
42
We can’t only think about function
Clever circuits give a way to allow non-functional properties to influence design (even early on). Makes blocks context sensitive.
Vital as we move to deep sub-micron
Separation of concerns becoming less and less possible
First experiments are (and will be) about module generation
Remains to be seen if there are applications at higher levels
Hopefully, a project on DSP Algorithm Design with Ericsson will explore this
43
44
The Big Picture (Design and Verification Languages) (see chapter in e-Book)
VHDL Verilog
CUML
45
The Big Picture (Languages)
VHDL Verilog
CUML
46
Intel
IDV (Seger)
Forte (Intel’s FV system)
IBM
SystemML (now called HDML, on sourceforge)
Masters projects possible
Behavioural Lava (York)
Lava + Wired
etc.
Bluespec SV
Lustre, Esterel
Cryptol
47
The Big Picture (Verification methods) (see course intro., lectures by Seger and Kunz)
Equivalence Checking (formal)
Simulation
Property Checking
Formal
48
Kunz (Infineon, Siemens, Bosch… OneSpin)
processor and SoC verification
SAT-based
Extremely impressive!
see also work at companies like NVIDIA, Freescale, … (see panel at FMCAD 2007 (links page))
A problem is that there is a lot of unpublished work….
49
Intel (Seger’s lecture)
Forte (STE)
niches (such as Floating Point Arith.)
IBM Sixth Sense
combines formal and semi-formal
emphasises scalability and automation
see great presentation by Baumgartner from FMCAD 2006 (links page)
Hot research topics
Coverage (OneSpin look to have something very interesting, but it is not public)
Methodology, Finding new FV "recipes"
Moving up in abstraction levels
Satisfiability Modulo Theories (SMT), First Order Logic
How to design (and verify) complete systems
has become harder because of multicore
Getting control of non-functional properties (particularly power consumption)
50
Hot research topics
Parallelisation of EDA algorithms
Protocol verification
Increasing automation of FV
(e.g. transformation-based verification à la Sixth Sense)
how to build and use verification IP
reuse
Post-silicon verification
51
You should think about
The two different design flows that you have seen
What was good and bad about them
YOUR opinions based on your experience (which is influenced by previous expertise)
Formal Verification
evidence about its use (suitable niches, module verification)
limitations (a main one being scalability)
what it can give when it works
52