Synthesis of the Optimal 4-bit Reversible Circuits
Dmitri Maslov (speaker), University of Waterloo, Waterloo, ON, Canada
Oleg Golubitsky, Google Inc., Waterloo, ON, Canada
Sean Falconer, Stanford University, Stanford, CA
Basic Definitions
[Figure: the NOT, CNOT, Toffoli, and Toffoli-4 gates. The Toffoli gate maps (x, y, z) to (x, y, z ⊕ xy).]
A reversible circuit is a string of gates. A reversible n-bit function is a permutation of 2^n elements.
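To make these definitions concrete, here is a minimal Python sketch that turns a string of gates into the permutation it computes. The gate encoding as a (controls, target) pair and the name `circuit_to_permutation` are illustrative assumptions, not the representation used by the authors.

```python
def circuit_to_permutation(circuit, n):
    """Return the permutation of 2^n values computed by a list of gates.

    Each gate is a (controls, target) pair (an assumed encoding):
    the target bit is flipped when every control bit equals 1.
    """
    perm = []
    for x in range(1 << n):
        y = x
        for controls, target in circuit:
            if all((y >> c) & 1 for c in controls):
                y ^= 1 << target
        perm.append(y)
    return tuple(perm)


NOT_0 = ((), 0)            # NOT on wire 0
CNOT_0_1 = ((0,), 1)       # CNOT: control wire 0, target wire 1
TOFFOLI = ((0, 1), 2)      # Toffoli: controls 0 and 1, target 2

toffoli_perm = circuit_to_permutation([TOFFOLI], 3)
three_gate_perm = circuit_to_permutation([NOT_0, CNOT_0_1, TOFFOLI], 3)
```

Every output is a rearrangement of 0 .. 2^n - 1, which is what makes the function reversible.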
Problem
Synthesize optimal 4-bit reversible circuits, i.e., containing minimal number of gates.
Complexity
-- There are 16! = 20,922,789,888,000 reversible functions.
-- There are 32 gates.
-- An average optimal circuit requires 11.94 gates.
:: 20,922,789,888,000 * log2(32) * 11.94 bits > 100 TB.
Murphy, David. "Western Digital Launches World-First 2TB Hard Drive". PC World. Retrieved 2009-01-27.
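The storage estimate above can be reproduced with a few lines of Python. This is only a check of the arithmetic on the slide; the variable names are illustrative, and decimal terabytes (10^12 bytes) are assumed.

```python
import math

num_functions = math.factorial(16)      # 16! reversible 4-bit functions
bits_per_gate = math.log2(32)           # 5 bits select one of 32 gates
avg_gates = 11.94                       # average optimal circuit size (from the slide)

total_bits = num_functions * bits_per_gate * avg_gates
total_terabytes = total_bits / 8 / 1e12  # assuming 1 TB = 10^12 bytes

print(f"{total_terabytes:.1f} TB")
```

Under these assumptions the result is well above 100 TB, far beyond the 2 TB drives cited above.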
Importance
Library for physicists interested in performing a small experiment, but having very limited control over their system.
Indispensable for peep-hole optimization methods. Peep-hole optimizations are an important part of any modern compiler.
Mathematical curiosity. Computing the value of Shannon’s complexity function: L(3) = 8, L(4) ∈ [14, 17], L(5) = ?
Solution
Rough complexity analysis
-- space: N
-- time: N
Denote N = 16! (formally, N := (2^n)!).
Next, reduce these complexity figures to something manageable.
Solution
Rough complexity analysis
-- space:
-- time:
Synthesize and save only halves of all optimal circuits.
An optimal circuit for any function may be found by searching for both of its halves.
N
NNN *
Optimization 1
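The idea of storing only halves and searching for both halves of a circuit can be sketched on 3-bit functions, where everything fits in memory easily. This is a toy illustration under stated assumptions, not the authors' implementation: functions are plain tuples, the table is a Python dict (standing in for the hash table), and all names are hypothetical.

```python
from itertools import combinations

N_BITS = 3
IDENTITY = tuple(range(1 << N_BITS))


def gate_permutations(n):
    """All Toffoli-type gates (NOT, CNOT, Toffoli, ...) on n wires as permutations."""
    gates = []
    for target in range(n):
        others = [w for w in range(n) if w != target]
        for k in range(n):
            for controls in combinations(others, k):
                mask = sum(1 << c for c in controls)
                gates.append(tuple(x ^ (1 << target) if x & mask == mask else x
                                   for x in range(1 << n)))
    return gates


def compose(f, g):
    """Apply f first, then g."""
    return tuple(g[y] for y in f)


def inverse(f):
    inv = [0] * len(f)
    for x, y in enumerate(f):
        inv[y] = x
    return tuple(inv)


def build_halves(gates, max_gates):
    """Breadth-first search: optimal size of every function with <= max_gates gates."""
    table = {IDENTITY: 0}
    frontier = [IDENTITY]
    for size in range(1, max_gates + 1):
        next_frontier = []
        for f in frontier:
            for g in gates:
                h = compose(f, g)
                if h not in table:
                    table[h] = size
                    next_frontier.append(h)
        frontier = next_frontier
    return table


def optimal_size(f, table):
    """Optimal gate count of f, found by matching a stored first half with a stored second half."""
    if f in table:
        return table[f]
    best = None
    for first_half, size1 in table.items():
        # f = first_half followed by second_half, so second_half = first_half^-1 followed by f
        size2 = table.get(compose(inverse(first_half), f))
        if size2 is not None and (best is None or size1 + size2 < best):
            best = size1 + size2
    return best  # None if f needs more than twice the stored depth


gates = gate_permutations(N_BITS)
halves = build_halves(gates, 4)
```

Since L(3) = 8, storing circuits with up to 4 gates is enough to find an optimal circuit size for every 3-bit function in this sketch.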
Solution
Rough complexity analysis
-- space: √N
-- time: soft √N
Store optimal halves in a hash table.
Optimization 2
Actual complexity is closer to
-- space: O(SpaceRequiredToStoreHalves(N))
-- time: soft O(SpaceRequiredToStoreHalves(N))
Solution
Simultaneous input/output relabeling does not change the optimality of a circuit. Thus, for each class of relabelings we store a single canonical representative: the one whose binary string is lexicographically smallest.
Optimization 3
In practice, a function has almost 4! = 24 distinct relabelings, which reduces the storage complexity by a factor of almost 24 and helps to reduce the runtime.
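A minimal Python sketch of canonicalization under simultaneous input/output relabeling follows. It assumes functions are stored as tuples of 16 values and uses tuple comparison as a stand-in for the lexicographic order on binary strings; `relabel` and `canonical_relabeling` are hypothetical names.

```python
from itertools import permutations


def relabel(f, sigma, n=4):
    """Rename wire i to sigma[i] on both the inputs and the outputs of f."""
    def move_bits(x):
        return sum(((x >> i) & 1) << sigma[i] for i in range(n))

    g = [0] * (1 << n)
    for x in range(1 << n):
        g[move_bits(x)] = move_bits(f[x])
    return tuple(g)


def canonical_relabeling(f, n=4):
    """Smallest tuple among all n! simultaneous relabelings of f."""
    return min(relabel(f, sigma, n) for sigma in permutations(range(n)))


# Two CNOT gates on different wires belong to the same relabeling class.
cnot_0_1 = tuple(x ^ ((x & 1) << 1) for x in range(16))
cnot_2_3 = tuple(x ^ (((x >> 2) & 1) << 3) for x in range(16))
same_class = canonical_relabeling(cnot_0_1) == canonical_relabeling(cnot_2_3)
```

Highly symmetric functions such as a single CNOT have fewer than 24 distinct relabelings (12 in this case), which is why the saving is "almost" rather than exactly a factor of 24.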
Solution
If an optimal circuit is found for a function f, an optimal circuit for the inverse function f^-1 can be obtained by reversing the optimal circuit for f.
Optimization 4
In practice, a random f usually differs from f^-1, which reduces the storage requirement by an additional factor of almost 2 and helps to further reduce the runtime.
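The inverse trick relies on every NOT, CNOT, and Toffoli-type gate being its own inverse, so reading a circuit backwards computes f^-1. The Python sketch below checks this on one example; the (controls, target) gate encoding and the function names are assumptions made for illustration.

```python
def run_circuit(circuit, n):
    """Permutation computed by a list of (controls, target) gates on n wires."""
    perm = []
    for x in range(1 << n):
        y = x
        for controls, target in circuit:
            if all((y >> c) & 1 for c in controls):
                y ^= 1 << target
        perm.append(y)
    return tuple(perm)


def inverse(f):
    inv = [0] * len(f)
    for x, y in enumerate(f):
        inv[y] = x
    return tuple(inv)


def canonical_up_to_inverse(f):
    """Keep one representative of the pair {f, f^-1}: the smaller tuple."""
    return min(f, inverse(f))


circuit = [((), 0), ((0,), 1), ((0, 1), 2), ((2,), 3)]
f = run_circuit(circuit, 4)
f_reversed = run_circuit(circuit[::-1], 4)   # same gates, opposite order
```

Here `f_reversed` equals `inverse(f)` and uses the same number of gates, so only one of the two needs to be stored. Combining this with the relabeling canonical form is left out of the sketch.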
Performance
k              7        8       9
Size           2^25     2^28    2^32
Memory usage   256 MB   2 GB    32 GB
Load factor    0.58     0.91    0.51
Parameters of the linear hash table storing canonical representatives.
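The memory figures in the table are consistent with one 8-byte slot per entry (2^25 slots * 8 bytes = 256 MB), and a 4-bit permutation does fit in 64 bits. The Python sketch below shows such a packing together with a small open-addressing table, assuming that "linear" refers to linear probing. The class name, the multiplicative hash constant, and the use of a Python list for the slots are illustrative choices, not details taken from the talk.

```python
def pack(f):
    """Pack a 4-bit permutation (16 values of 4 bits each) into one 64-bit integer."""
    key = 0
    for x, y in enumerate(f):
        key |= y << (4 * x)
    return key


class LinearProbingSet:
    """Fixed-size set of 64-bit keys using open addressing with linear probing."""

    def __init__(self, log2_slots):
        self.log2_slots = log2_slots
        self.mask = (1 << log2_slots) - 1
        self.slots = [None] * (1 << log2_slots)
        self.count = 0

    def _find(self, key):
        # Multiplicative hash (arbitrary constant), then scan forward to the key or a free slot.
        i = ((key * 0x9E3779B97F4A7C15) & 0xFFFFFFFFFFFFFFFF) >> (64 - self.log2_slots)
        while self.slots[i] is not None and self.slots[i] != key:
            i = (i + 1) & self.mask
        return i

    def add(self, key):
        i = self._find(key)
        if self.slots[i] is None:
            if self.count + 1 >= len(self.slots):
                raise MemoryError("table full")  # always keep one free slot so probing terminates
            self.slots[i] = key
            self.count += 1

    def __contains__(self, key):
        return self.slots[self._find(key)] is not None

    def load_factor(self):
        return self.count / len(self.slots)


table = LinearProbingSet(log2_slots=4)
table.add(pack(tuple(range(16))))
```

A production table would use a flat array of 64-bit words rather than a Python list; the probing logic is the part this sketch is meant to show.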
Using a high-performance server with 16 AMD Opteron 2300 MHz processors, 64 GB RAM, and a Seagate Barracuda ES2 SCSI 7200 RPM HDD running Linux, it took 10,549 seconds (under 3 hours) to synthesize all optimal circuits with up to 9 gates.
Performance
Size   Functions
14     17,191
13     2,371,039
12     5,110,943
11     2,051,507
10     392,108
9      50,861
8      5,269
7      455
6      24
5      3
Synthesis of 10,000,000 random functions (generated by a Fisher-Yates shuffle driven by the Mersenne Twister random number generator) took 104,616.716 seconds (about 29 hours) of user time, with a maximal memory usage of 43.04 GB.
Loading optimal circuits with up to 9 gates into RAM took 1,111 seconds.
On average, excluding this loading time, it took only 0.01035 seconds to synthesize an optimal circuit.
For comparison, the access time of a 5400-RPM HDD may be expected to be on the order of 0.01 to 0.02 seconds.
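The averages quoted on this slide can be rechecked from the numbers shown. The short Python sketch below does only that arithmetic; the variable names are illustrative.

```python
# Size distribution of the random sample, as listed in the table above.
counts = {
    14: 17_191, 13: 2_371_039, 12: 5_110_943, 11: 2_051_507, 10: 392_108,
    9: 50_861, 8: 5_269, 7: 455, 6: 24, 5: 3,
}

listed_total = sum(counts.values())
avg_gates = sum(size * n for size, n in counts.items()) / listed_total

total_seconds = 104_616.716   # user time for the whole run
load_seconds = 1_111          # time to load circuits with up to 9 gates
avg_seconds = (total_seconds - load_seconds) / 10_000_000

print(f"average size: {avg_gates:.2f} gates")
print(f"average synthesis time: {avg_seconds:.5f} s")
```

The gate-count average agrees with the 11.94 figure used in the complexity estimate, and the per-function time matches 0.01035 seconds once the loading time is subtracted. Note that the counts as listed sum to 9,999,400 rather than 10,000,000, so the average here is taken over the listed entries.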
Performance
Distribution of the number of functions requiring a circuit of a specified size (gate count).
Performance
Distribution of the number of linear functions requiring a circuit of a specified size (gate count).
It took under 2 seconds to synthesize all these circuits.
Performance
Size   Functions
10     138
9      13,555
8      84,225
7      118,424
6      72,062
5      26,182
4      6,589
3      1,206
2      162
1      16
0      1
Weighted average: 6.8816   Total: 322,560
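The linear case is small enough to explore directly. The Python sketch below runs a breadth-first search over all 4-bit linear reversible functions, assuming the gate set is NOT plus CNOT (the slides do not spell this out) and encoding each output wire as a 5-bit affine form. The encoding and the function name are illustrative; this is not the authors' code and makes no attempt to match their running time.

```python
from collections import Counter


def linear_size_distribution(n=4):
    """Count linear reversible n-bit functions by optimal NOT/CNOT gate count."""
    width = n + 1                      # n coefficient bits plus one constant bit per output wire
    form_mask = (1 << width) - 1
    const_bit = 1 << n
    identity = 0
    for i in range(n):
        identity |= (1 << i) << (width * i)   # output wire i equals input wire i

    size_of = {identity: 0}
    frontier = [identity]
    size = 0
    while frontier:
        size += 1
        next_frontier = []
        for s in frontier:
            for t in range(n):
                shift_t = width * t
                candidates = [s ^ (const_bit << shift_t)]            # NOT on wire t
                for c in range(n):
                    if c != t:                                       # CNOT: control c, target t
                        candidates.append(s ^ (((s >> (width * c)) & form_mask) << shift_t))
                for cand in candidates:
                    if cand not in size_of:
                        size_of[cand] = size
                        next_frontier.append(cand)
        frontier = next_frontier
    return Counter(size_of.values())


distribution = linear_size_distribution()
weighted_average = sum(k * v for k, v in distribution.items()) / sum(distribution.values())
```

Under the stated gate-set assumption, the search visits 322,560 functions in total, matching the table, with 16 functions of size 1 and 162 of size 2.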
Future directions
Larger circuits
-- There are 80 transformations resulting from the application of all possible Toffoli-type gates on 5 bits.
-- 80^6 * (log2 80) / 5! / 2 ~ 7.1 billion bits, which fits into RAM.
-- 6 + 6 = 12. That is, it is reasonable to expect that extending the search for optimal 4-bit reversible circuits will make it possible to find optimal 5-bit reversible circuits with up to 12 gates.
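Both the gate counts and the memory estimate can be checked quickly. In the Python sketch below, the closed form n * 2^(n-1) assumes a Toffoli-type gate is determined by its target wire and any subset of the remaining wires as controls; the function name is hypothetical.

```python
import math


def num_toffoli_type_gates(n):
    """Choose a target wire, then any subset of the other n - 1 wires as controls."""
    return n * 2 ** (n - 1)


gates_4 = num_toffoli_type_gates(4)   # gate count used for the 4-bit search
gates_5 = num_toffoli_type_gates(5)   # gate count for the 5-bit extension

# Upper bound on 6-gate circuits, times bits per gate, divided by the
# 5! relabelings and by 2 for inverses, as in the estimate above.
estimate_bits = gates_5 ** 6 * math.log2(gates_5) / math.factorial(5) / 2
estimate_gigabytes = estimate_bits / 8 / 1e9
```

Evaluated directly, the expression comes out at roughly 7 billion bits, which is under 1 GB.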
Future directions
Optimal circuits using other cost metrics
This search can be easily extended to account for other cost metrics:
Weighted gate count optimal circuits: organize the breadth-first search so that a gate with cost G is appended to a circuit of cost C at iteration number G+C.
Depth optimal circuits: choose a different set of elementary transformations, e.g., the circuit NOT(a)CNOT(b,c) is now an elementary transformation.
Depth optimal weighted gate circuits: combine the previous two modifications.
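One way to read the weighted variant is as a search organized by cost buckets: functions first reached at cost C are expanded at iteration C, and a gate of cost G places the result in bucket G+C. The Python sketch below applies this to 3-bit functions. The costs (NOT = 1, CNOT = 1, Toffoli = 5) and all names are assumptions chosen for illustration, not values from the talk.

```python
from collections import defaultdict
from itertools import combinations


def weighted_gates(n, cost_by_controls):
    """All Toffoli-type gates on n wires, each paired with a cost chosen by its control count."""
    gates = []
    for target in range(n):
        others = [w for w in range(n) if w != target]
        for k in range(n):
            for controls in combinations(others, k):
                mask = sum(1 << c for c in controls)
                perm = tuple(x ^ (1 << target) if x & mask == mask else x
                             for x in range(1 << n))
                gates.append((perm, cost_by_controls[k]))
    return gates


def min_cost_table(n, cost_by_controls):
    """Minimal total gate cost of every reachable function, processed in order of cost."""
    gates = weighted_gates(n, cost_by_controls)
    identity = tuple(range(1 << n))
    best = {identity: 0}
    buckets = defaultdict(list)
    buckets[0].append(identity)
    cost, max_cost = 0, 0
    while cost <= max_cost:
        for f in buckets.get(cost, ()):
            if best[f] != cost:
                continue                      # stale entry: a cheaper circuit was found later
            for perm, gate_cost in gates:
                h = tuple(perm[y] for y in f)  # append the gate to the circuit for f
                new_cost = cost + gate_cost    # iteration number G + C
                if new_cost < best.get(h, float("inf")):
                    best[h] = new_cost
                    buckets[new_cost].append(h)
                    max_cost = max(max_cost, new_cost)
        cost += 1
    return best


costs = min_cost_table(3, {0: 1, 1: 1, 2: 5})
```

With unit costs for every gate this reduces to the ordinary breadth-first search by gate count.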
END
Questions?
2^10! = 1024! ≈ 5.418528796 × 10^2639 (a 2,640-digit number)
Classically: 2^10! / (Lifespan_of_universe_in_Planck_time_units * Estimated_number_of_atoms) ~ 10^2452 universes!
Work in progress
Synthesize all optimal 4-bit circuits
-- Store circuits with up to 9 gates, as we do now.
-- Store a bit vector (~250 GB) for canonical representatives of circuits with 10, 11, 12, 13, and 14 gates, one at a time.
-- Use a minimal number of uploads/downloads of parts of each such vector to and from RAM.