Logic Decomposition of Asynchronous Circuits Using STG Unfoldings
Victor Khomenko
School of Computing Science, Newcastle University, UK
2
Asynchronous circuits
• The traditional synchronous (clocked) designs lack flexibility to cope with contemporary design technology challenges
• Asynchronous circuits have no clocks:
  – Low power consumption and EMI
  – Tolerant of voltage, temperature and manufacturing process variations
  – Modularity: no problems with the clock skew and related subtle issues
  – [ITRS’09]: 22% of designs will be driven by ‘handshake clocking’ in 2013, and 40% in 2020
  – Synthesis algorithms are complicated
  – Computationally hard to synthesize efficient circuits
3
Motivation
• Logic decomposition is one of the most difficult tasks in logic synthesis
• The quality of the resulting circuit (in terms of area and latency) depends to a large extent on the way logic decomposition was performed
4
Speed-independency assumptions
• Gates are atomic (so no internal hazards)
• Gates’ delays are positive and unbounded (and perhaps variable)
• Wire delays are negligible (SI) or, alternatively, wire forks are isochronic (QDI)
[Figure: gate model consisting of an instant evaluator of F followed by a delay element]
5
Speed-independent decomposition
[Figure: a gate modelled as an instant evaluator of F followed by a delay, and its decomposition into gates H1, …, Hk and G, each with its own delay]
6
VME Bus Controller
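[Figure: block diagram of the VME Bus Controller placed between the Device (signals lds, ldtack), the Data Transceiver (signal d) and the Bus (signals dsr, dtack), together with its STG over the transitions lds-, d-, ldtack-, ldtack+, dsr-, dtack+, d+, dtack-, dsr+, lds+, csc+, csc-]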
7
Complex-gate implementation
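[Figure: complex-gate implementation of the controller connecting the Device, the Data Transceiver and the Bus, with signals d, dsr, dtack, lds, ldtack, csc]
A complex gate may not be in the gate library, in which case it has to be decomposed.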
8
Naïve decomposition is hazardous
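[Figure: the circuit over signals d, dsr, dtack, lds, ldtack, csc with a new internal signal x introduced by naïve decomposition, next to the STG over the transitions lds-, d-, ldtack-, ldtack+, dsr-, dtack+, d+, dtack-, dsr+, lds+, csc+, csc-; two points in the behaviour are marked "Unexpected!"]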
9
Decompose at the PN level!
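[Figure: the circuit over signals d, dsr, dtack, lds, ldtack, csc with a new signal dec, next to the STG over the transitions lds-, d-, ldtack-, ldtack+, dsr-, dtack+, d+, dtack-, dsr+, lds+, csc+, csc- extended with dec+ and dec-]
Insert a new signal dec whose implementation is [dec] = ldtack + csc
Multiway acknowledgement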
10
Latch utilisation
[Figure: two implementations over the signals d, dsr, dtack, lds, ldtack, csc; in the second one csc is produced by a latch marked C]
Only possible because there is no globally reachable state at which dsr=ldtack=0 and csc=1
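The condition above can be checked mechanically. The following is a minimal sketch, assuming (the slide text does not state this) that the complex gate for csc computes dsr AND (csc OR NOT ldtack) and that the latch is a C-element with inputs dsr and NOT ldtack. The function names are illustrative only. It enumerates all value combinations and lists those on which the two implementations disagree.

```python
from itertools import product

def complex_gate(dsr: int, ldtack: int, csc: int) -> int:
    # ASSUMED complex-gate function for csc: dsr AND (csc OR NOT ldtack).
    return dsr & (csc | (1 - ldtack))

def c_element(a: int, b: int, prev: int) -> int:
    # Muller C-element: output becomes 1 when both inputs are 1, becomes 0
    # when both are 0, and otherwise keeps its previous value.
    return (a & b) | (prev & (a | b))

def latch_version(dsr: int, ldtack: int, csc: int) -> int:
    # ASSUMED latch-based implementation: C-element fed by dsr and NOT ldtack.
    return c_element(dsr, 1 - ldtack, csc)

# States (dsr, ldtack, csc) on which the two next-state functions differ.
differences = [
    (dsr, ldtack, csc)
    for dsr, ldtack, csc in product((0, 1), repeat=3)
    if complex_gate(dsr, ldtack, csc) != latch_version(dsr, ldtack, csc)
]
print(differences)  # [(0, 0, 1)]
```

Under these assumptions the only disagreement is at dsr=ldtack=0, csc=1, so the latch is a valid replacement exactly when that state is unreachable.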
11
State Graphs vs. Unfoldings
State Graphs:
• Relatively easy theory
• Many algorithms
• Not visual
• State space explosion problem
12
State Graphs vs. Unfoldings
Unfoldings:
• Alleviate the state space explosion problem
• More visual than state graphs
• Proven efficient for model checking
• Quite complicated theory
• Not sufficiently investigated
• Relatively few algorithms
13
Logic decomposition algorithm [Cortadella et al, ’99]

forever do
    for all non-input signals x do
        S[x] ← ∅
        for all G ∈ {latches, gates} do
            S[x] ← S[x] ∪ decompositions(x, G)
        bestH[x] ← best SI candidate in S[x]
    if for each x, bestH[x] is implementable        (library matching)
        stop
    if for each x, bestH[x] = UNDEFINED
        fail
    H ← the most complex bestH
    Insert a new signal z implementing H into the STG        (function-guided signal insertion)
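The control flow of this loop can be sketched in Python. This is only an illustration: all callbacks (`decompositions`, `best_si_candidate`, `implementable`, `insert_signal`) are hypothetical placeholders, speed-independence is not checked, and the toy instance below simply treats every function as a wide AND that must be split into 2-input ANDs. Restricting the choice of H to candidates that are not yet implementable is an assumption of this sketch.

```python
from itertools import combinations

def decompose(signals, gate_types, decompositions, best_si_candidate,
              implementable, complexity, insert_signal, max_rounds=100):
    """Return True once every signal has an implementable candidate."""
    for _ in range(max_rounds):                  # bounded stand-in for "forever do"
        best = {}
        for x in signals():
            candidates = []
            for gate in gate_types:
                candidates.extend(decompositions(x, gate))
            best[x] = best_si_candidate(candidates)   # None stands for UNDEFINED
        if all(h is not None and implementable(h) for h in best.values()):
            return True                          # library matching would follow here
        if all(h is None for h in best.values()):
            return False
        # Sketch assumption: only candidates that still need work are considered.
        pending = [h for h in best.values()
                   if h is not None and not implementable(h)]
        if not pending:
            return False
        insert_signal(max(pending, key=complexity))
    return False

# Toy instance (hypothetical): y = a AND b AND c AND d, library = 2-input ANDs.
funcs = {"y": frozenset("abcd")}

def signals():
    return list(funcs)

def decompositions(x, fan_in):
    f = funcs[x]
    if len(f) <= fan_in:
        return [f]                               # already fits the library
    return [frozenset(p) for p in combinations(sorted(f), fan_in)]

def best_si_candidate(cands):
    return min(cands, key=sorted) if cands else None

def implementable(h):
    return h in funcs.values()                   # h is exactly some signal's gate

def insert_signal(h):
    z = f"z{len(funcs) - 1}"                     # fresh name: z0, z1, ...
    for x, f in list(funcs.items()):
        if h <= f:
            funcs[x] = (f - h) | {z}             # replace the literals of h by z
    funcs[z] = h

ok = decompose(signals, [2], decompositions, best_si_candidate,
               implementable, len, insert_signal)
print(ok, {x: sorted(f) for x, f in funcs.items()})
```

In this toy run two signals are inserted before every function fits a 2-input gate; a real tool would rank candidates by the speed-independence and cost criteria discussed in the rest of the talk.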
14
Function-guided signal insertion
Problem: given a Boolean function F, insert a new signal dec (i.e. a set of new transitions labelled dec+ or dec-) with the implementation [dec]=F into the STG. Only the unfolding prefix (rather than the state graph) may be used.
15
Previous work: Transformations [PN’07]
• Sequential pre-insertion
• Sequential post-insertion
• Concurrent insertion
16
Previous work: main results [PN’07]
• Validity criteria: safeness & bisimilarity
  – can be checked before the transformation is performed, i.e. on the original prefix (to avoid backtracking)
• Perform the insertion directly on the prefix
  – avoid re-unfolding
  – good for visualization (re-unfolding can dramatically change the look of the prefix)
  – can transfer some information between the iterations of the algorithm
• The suite of transformations is good in practice for resolution of encoding conflicts
17
Motivation for more transformations
• The suite of transformations is not sufficient for logic decomposition; intuitively:
  – only linear (in the PN size) number of sequential pre- and post-insertions (assuming that the pre- and postset sizes are bounded)
  – only quadratic (in the PN size) number of concurrent insertions
  – exponential number of ‘cuts’ in the PN where a Boolean expression can change its value
18
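Example: imec-sbuf-ram-write
[Figure: block diagram of imec-sbuf-ram-write with signals req, precharged, done, wsldin, wenin, prbar, wen, wsen, ack, wsld and the new signal dec, together with the inserted transitions dec+ and dec-]
Implementation of prbar: (csc2 req) csc1 wsldin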
19
Generalised transition insertion [ICGT’10]
[Figure: a transition inserted between the sources s1, s2, s3 and the destinations d1, d2]
• All previously listed good points hold for GTIs as well
• Exponentially many GTIs can exist:
  – more likely that an appropriate transformation exists
  – no longer practical to enumerate them all
  – can enumerate only the ‘potentially useful’ (for logic decomposition) GTIs
20
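Compatible insertions
An insertion I is compatible with F if whenever an x can fire and trigger I, F'_x = 1, where
F'_x = F|_{x=0} ⊕ F|_{x=1}
Intuitively, when x fires, the value of F must change, as I becomes enabled.
[Figure: a configuration C followed by x and then I, with the value of F annotated before and after x fires]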
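The compatibility condition relies on the Boolean difference of F with respect to x, i.e. the XOR of the two cofactors of F. A minimal sketch of that computation follows; the example function F = a AND b and the helper name `boolean_difference` are chosen for illustration and do not come from the slides.

```python
def boolean_difference(f, x):
    """Return F'_x as a function of an assignment dict: F|x=0 XOR F|x=1."""
    def diff(assignment):
        low = f({**assignment, x: 0})    # cofactor with x forced to 0
        high = f({**assignment, x: 1})   # cofactor with x forced to 1
        return low ^ high
    return diff

# Illustrative function only: F = a AND b.
def F(v):
    return v["a"] & v["b"]

dF_da = boolean_difference(F, "a")

# Values of b for which a change of a flips the value of F.
sensitive = [b for b in (0, 1) if dF_da({"a": 0, "b": b}) == 1]
print(sensitive)  # [1]
```

Here F'_a equals b, so a transition of a changes F only in states where b = 1; in the terminology above, an insertion triggered by a could be compatible with F only in such states.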
21
Compatible insertions
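[Figure: unfolding prefix of the VME bus controller STG over the transitions lds-, d-, ldtack-, ldtack+, dsr-, dtack+, d+, dtack-, dsr+, lds+, csc+, csc- (dsr+ and csc+ occur twice), annotated with F = ldtack csc and with regions labelled F=1, F=0, F=1]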
22
Reduction to (incremental) SAT [ACSD’07]
Find a satisfying assignment, optimal w.r.t. a heuristic cost function, of the Boolean formula
MUTEX ∧ SA ∧ CUTOFF ∧ FUN
depending on the variables I1, ..., Ik corresponding to the compatible insertions, and conveying that:
• no two selected insertions are non-commuting, or concurrent, or in auto-conflict, or such that one of them can trigger the other (MUTEX)
• consistent assignment of signs is possible in the prefix (SA) and beyond cut-offs (CUTOFF)
• F is a possible implementation of the newly inserted signal (FUN)
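To make "optimal satisfying assignment" concrete, here is a brute-force sketch. It is not how an incremental SAT solver works and it is not the encoding from the cited paper: the clauses, the per-insertion costs and the additive cost model are all made up for illustration. Clauses use DIMACS-style integers (n for "In is selected", -n for "In is not selected").

```python
from itertools import product

def satisfies(assignment, clauses):
    # assignment[i] is the value of variable i+1; a clause holds if at least
    # one of its literals agrees with the assignment.
    return all(
        any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def cheapest_model(num_vars, clauses, cost):
    """Enumerate all assignments and return (cost, assignment) of a cheapest model."""
    best = None
    for bits in product((False, True), repeat=num_vars):
        if satisfies(bits, clauses):
            total = sum(cost[i] for i, on in enumerate(bits) if on)
            if best is None or total < best[0]:
                best = (total, bits)
    return best

# Hypothetical instance with three candidate insertions I1, I2, I3.
mutex = [[-1, -2]]        # I1 and I2 must not be selected together
fun = [[1, 2], [2, 3]]    # each clause requires at least one of its insertions
cost = [3, 2, 1]          # made-up cost of selecting each insertion

result = cheapest_model(3, mutex + fun, cost)
print(result)  # (2, (False, True, False))
```

Selecting only I2 satisfies every clause at the lowest cost in this toy instance. An incremental solver would reach the same kind of answer by repeatedly tightening a cost bound instead of enumerating all assignments.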
23
Cost function
Parameterised by the user; takes into account:
• the delay introduced by the insertion
• the number of syntactic triggers of all non-input signals
• the number of inserted transitions of a signal
• the number of signals which are not locked with the newly inserted signal
• …
24
Building FUN
Let C be a configuration enabling some x with F'_x = 1, and let ℐ be the set of compatible insertions such that:
[Figure: the configuration C followed by x and then I, with the value of F annotated before and after x fires]
Then the clause ⋁_{I∈ℐ} I is in FUN. One can build a Boolean formula FUNGEN depending on C and compatible insertions whose SAT assignments satisfy this condition.
25
Building FUN (cont’d)
Problem: it is infeasible to enumerate all configurations.
Idea 1: The same clause can be generated by many different configurations, and hence once one such configuration is found, the others can be excluded from the search.
Idea 2: Clauses subsumed by already generated ones can be excluded from the search.
It is enough to add a clause ⋁_{I∈ℐ} I to FUNGEN whenever a new clause ⋁_{I∈ℐ} I is computed.
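The subsumption idea can be illustrated on clauses that are disjunctions of positive insertion variables: a clause A subsumes a clause B when A's variables are a subset of B's, because any assignment satisfying A then satisfies B. The sketch below is an illustration with hypothetical names, not the FUNGEN construction itself; removing older clauses that the new one subsumes is an extra step added here.

```python
def add_clause(clauses, new):
    """Add `new` (a frozenset of insertion names) unless an existing clause
    already subsumes it. Returns True if the clause was added."""
    if any(old <= new for old in clauses):
        return False                     # `new` adds no information
    # Extra step of this sketch: drop older clauses that `new` subsumes.
    clauses[:] = [old for old in clauses if not new <= old]
    clauses.append(new)
    return True

fun = []
first = add_clause(fun, frozenset({"I1", "I2", "I3"}))
second = add_clause(fun, frozenset({"I1", "I2"}))        # stronger clause replaces the first
third = add_clause(fun, frozenset({"I1", "I2", "I4"}))   # subsumed, so it is skipped
print(first, second, third, len(fun))
```

After the three calls only the clause over I1 and I2 remains, which is the kind of pruning that keeps the number of generated clauses manageable.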
26
Building FUN (example)
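[Figure: unfolding prefix of the VME bus controller STG over the transitions lds-, d-, ldtack-, ldtack+, dsr-, dtack+, d+, dtack-, dsr+, lds+, csc+, csc- (dsr+ and csc+ occur twice), annotated with F = ldtack csc, with regions labelled F=1, F=0, F=1 and two configurations marked C]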
27
Experimental results
• Implemented in MPSAT (library matching not implemented yet) and compared with PETRIFY
• Assorted small benchmarks:
  – Similar failure rates and quality of circuits; structural insertions seem sufficient
  – The tests reflect the quality of heuristics in choosing the decomposition in each step rather than the quality of the signal insertion routine
• Large benchmarks:
  – Tend to be non-decomposable by both tools
  – Only one series (scalable pipelines) was useful
    – can be solved by a single insertion, hence minimizes the impact of heuristics and reflects the quality of the signal insertion routine
    – huge reachability graphs, so unfoldings win
28
Conclusions
• Unfolding-based decomposition algorithm
  – alleviates state space explosion
  – completes the design cycle based fully on unfoldings (i.e. state graphs are never built)
• All advantages of state-based decomposition are retained:
  – multiway acknowledgement
  – latch utilisation
  – highly optimised circuits
29
Thank you!
Any questions?