IntegratedSystemsLaboratory,EPFL,Switzerland lsi.epﬂ.ch...

Logic Synthesis for Quantum ComputingMathias Soeken, Alan Mishchenko, Luca Amarù, Robert K. Brayton

Integrated Systems Laboratory, EPFL, Switzerland

lsi.epfl.ch • msoeken.github.io

Quantum computing is getting real

I 17-qubit quantum computer from IBM based onsuperconducting qubits (16-qubit version available via cloudservice)

I 9-qubit quantum computer from Google based onsuperconducting circuits

I 5-qubit quantum computer at University of Maryland based onion traps

I Microsoft is investigating topological quantum computersI Intel is investigating silicon-based qubits

I “Quantum supremacy” experiment may be possible with ≈50qubits (45-qubit simulation has been performed classically)

I Smallest practical problems require ≈100 qubits

Quantum computing is getting real

I 17-qubit quantum computer from IBM based onsuperconducting qubits (16-qubit version available via cloudservice)

I 9-qubit quantum computer from Google based onsuperconducting circuits

I 5-qubit quantum computer at University of Maryland based onion traps

I Microsoft is investigating topological quantum computersI Intel is investigating silicon-based qubits

I “Quantum supremacy” experiment may be possible with ≈50qubits (45-qubit simulation has been performed classically)

I Smallest practical problems require ≈100 qubits

Challenges in logic synthesis for quantum computing

1. Quantum computers process qubits not bits

2. All qubit operations, called quantum gates, must be reversible3. Standard gate library for today’s physical quantum computers

is non-trivial4. Circuit is not allowed to produce intermediate results, called

garbage qubits

Classical half-adder

Quantum half-adder

2(|0〉+ |1〉)

|x〉|s〉

1/2(|000〉+ |010〉+ |110〉+ |101〉)

1. Quantum computers process qubits not bits

2. All qubit operations, called quantum gates, must be reversible3. Standard gate library for today’s physical quantum computers

garbage qubits

Quantum half-adder

2(|0〉+ |1〉)

2(|0〉+ |1〉)|0〉

|x〉|s〉

1/2(|000〉+ |010〉+ |110〉+ |101〉)

1. Quantum computers process qubits not bits2. All qubit operations, called quantum gates, must be reversible

3. Standard gate library for today’s physical quantum computersis non-trivial

4. Circuit is not allowed to produce intermediate results, calledgarbage qubits

Quantum half-adder

2(|0〉+ |1〉)

|x〉|s〉

1/2(|000〉+ |010〉+ |110〉+ |101〉)

1. Quantum computers process qubits not bits2. All qubit operations, called quantum gates, must be reversible

3. Standard gate library for today’s physical quantum computersis non-trivial

Quantum half-adder

2(|0〉+ |1〉)

|x〉|s〉

1/2(|000〉+ |010〉+ |110〉+ |101〉)

1. Quantum computers process qubits not bits2. All qubit operations, called quantum gates, must be reversible3. Standard gate library for today’s physical quantum computers

is non-trivial

Quantum half-adder

2(|0〉+ |1〉)

|x〉|s〉

1/2(|000〉+ |010〉+ |110〉+ |101〉)

1. Quantum computers process qubits not bits2. All qubit operations, called quantum gates, must be reversible3. Standard gate library for today’s physical quantum computers

garbage qubits

Quantum half-adder

2(|0〉+ |1〉)

|x〉|s〉

1/2(|000〉+ |010〉+ |110〉+ |101〉)

Reversible gates

x1 x̄1

x1 ⊕ x2

x3 ⊕ x1x2

Toffoli

x3 ⊕ f (x1, x2)

Single-target

x5 ⊕ x1x̄2x3x̄4

Multiple-controlled Toffoli

x1 ⊕ x2 ⊕ x3

〈x1x2x3〉

Full adder

Reversible gates

x1 x̄1

x1 ⊕ x2

x3 ⊕ x1x2

Toffoli

x3 ⊕ f (x1, x2)

Single-target

x5 ⊕ x1x̄2x3x̄4

x1 ⊕ x2 ⊕ x3

〈x1x2x3〉

Full adder

Reversible gates

x1 x̄1

x1 ⊕ x2

x3 ⊕ x1x2

Toffoli

x3 ⊕ f (x1, x2)

Single-target

x5 ⊕ x1x̄2x3x̄4

x1 ⊕ x2 ⊕ x3

〈x1x2x3〉

Full adder

Reversible gates

x1 x̄1

x1 ⊕ x2

x3 ⊕ x1x2

Toffoli

x3 ⊕ f (x1, x2)

Single-target

x5 ⊕ x1x̄2x3x̄4

x1 ⊕ x2 ⊕ x3

〈x1x2x3〉

Full adder

Quantum gates

I Qubit is vector |ϕ〉 = ( αβ ) with |α2|+ |β2| = 1.

I Classical 0 is |0〉 = ( 10 ); Classical 1 is |1〉 = ( 0

|ϕ1〉|ϕ2〉

(1 0 0 00 1 0 00 0 0 10 0 1 0

)|ϕ1ϕ2〉

|ϕ〉 H1√2

( 1 11 −1

)|ϕ〉

Hadamard

|ϕ〉 T(

0 eiπ4

)|ϕ〉

Quantum gates

I Qubit is vector |ϕ〉 = ( αβ ) with |α2|+ |β2| = 1.

I Classical 0 is |0〉 = ( 10 ); Classical 1 is |1〉 = ( 0

|ϕ1〉|ϕ2〉

(1 0 0 00 1 0 00 0 0 10 0 1 0

)|ϕ1ϕ2〉

|ϕ〉 H1√2

( 1 11 −1

)|ϕ〉

Hadamard

|ϕ〉 T(

0 eiπ4

)|ϕ〉

Composing quantum gates

I Applying a quantum gate to a quantum state (matrix-vectormultiplication)

|ϕ〉 = ( αβ ) U U|ϕ〉

I Applying quantum gates in sequence (matrix product)

|ϕ〉 = ( αβ ) U1 U2 (U2U1)|ϕ〉

I Applying quantum gates in parallel (Kronecker product)

|ϕ1〉 =( α1β1

)|ϕ2〉 =

( α2β2

U2(U1 ⊗ U2)|ϕ1ϕ2〉

|ϕ〉 = ( αβ ) U U|ϕ〉

|ϕ〉 = ( αβ ) U1 U2 (U2U1)|ϕ〉

|ϕ1〉 =( α1β1

)|ϕ2〉 =

( α2β2

U2(U1 ⊗ U2)|ϕ1ϕ2〉

|ϕ〉 = ( αβ ) U U|ϕ〉

|ϕ〉 = ( αβ ) U1 U2 (U2U1)|ϕ〉

|ϕ1〉 =( α1β1

)|ϕ2〉 =

( α2β2

U2(U1 ⊗ U2)|ϕ1ϕ2〉

Mapping Toffoli gatesx1

T †T †

x3 ⊕ x1x2

Clifford+T circuit [Amy et al., TCAD 32, 2013]

x1x2x3x4

x5 ⊕ x1x2x3x4

Ç Costs are number of qubits and number of T gates

T †T †

x3 ⊕ x1x2

x1x2x3x4

x5 ⊕ x1x2x3x4

T †T †

x3 ⊕ x1x2

x1x2x3x4

x5 ⊕ x1x2x3x4

LUT-based hierarchical reversible synthesisGoal: Automatically synthesizing large Boolean functions intoClifford+T networks of reasonable quality (qubits and T -count)

Algorithm: LUT-based hierarchical reversible synthesis (LHRS)

1. Represent input function as classical logicnetwork and optimize it

2. Map network into k-LUT network3. Translate k-LUT network into reversible

network with k-input single-target gates4. Map single-target gates into Clifford+T

networks

conv.alg.

affects

#qubits

affects

networks

conv.alg.

affects

#qubits

affects

2. Map network into k-LUT network

3. Translate k-LUT network into reversiblenetwork with k-input single-target gates

4. Map single-target gates into Clifford+Tnetworks

conv.alg.

affects

#qubits

affects

network with k-input single-target gates

4. Map single-target gates into Clifford+Tnetworks

conv.alg.

affects

#qubits

affects

networks

conv.alg.

affects

#qubits

affects

networks

conv.alg.

affects

#qubits

affects

networks

conv.alg.

affects

#qubits

affects

networks

conv.alg.

affects

#qubits

affects

networks

conv.alg.

affects

#qubits

affects

LUT mapping

I Realizing a logic function or logic circuit in terms of a k-LUTlogic network

I A k-LUT is any Boolean function with at most k inputsI One of the most effective methods used in logic synthesis

I Typical objective functions are size (number of LUTs) anddepth (longest path from inputs to outputs)

I Open source software ABC can generate industrial-scalemappings

I Can be used as technology mapper for FPGAs (e.g., whenk ≤ 7)

LUT mapping

I A k-LUT is any Boolean function with at most k inputsI One of the most effective methods used in logic synthesisI Typical objective functions are size (number of LUTs) and

depth (longest path from inputs to outputs)I Open source software ABC can generate industrial-scale

mappings

I Can be used as technology mapper for FPGAs (e.g., whenk ≤ 7)

LUT mapping

I A k-LUT is any Boolean function with at most k inputsI One of the most effective methods used in logic synthesisI Typical objective functions are size (number of LUTs) and

depth (longest path from inputs to outputs)I Open source software ABC can generate industrial-scale

mappingsI Can be used as technology mapper for FPGAs (e.g., when

k ≤ 7)

k-LUT network to reversible network

x3x2x1 x4 x5

x1x2x3x4x5

00y10y2

x1x2x3x4x50000

x1x2x3x4x50y10y2

U k-LUT corresponds to k-controlled single-target gateI non-output LUTs need to be uncomputedI order of LUT traversal determines number of ancillasI maximum output cone determines minimum number of ancillas� fast mapping that generates a fixed-space skeleton for

subnetwork synthesis

x3x2x1 x4 x5

x1x2x3x4x50

x1x2x3x4x5

00y10y2

x1x2x3x4x50000

x1x2x3x4x50y10y2

U k-LUT corresponds to k-controlled single-target gate

I non-output LUTs need to be uncomputedI order of LUT traversal determines number of ancillasI maximum output cone determines minimum number of ancillas� fast mapping that generates a fixed-space skeleton for

x3x2x1 x4 x5

x1x2x3x4x500

x1x2x3x4x5

00y10y2

x1x2x3x4x50000

x1x2x3x4x50y10y2

x3x2x1 x4 x5

x1x2x3x4x5000

x1x2x3x4x5

x1x2x3x4x50000

x1x2x3x4x50y10y2

x3x2x1 x4 x5

x1x2x3x4x50000

x1x2x3x4x5

x1x2x3x4x50000

x1x2x3x4x50y10y2

x3x2x1 x4 x5

x1x2x3x4x500000

x1x2x3x4x5

x1x2x3x4x50000

x1x2x3x4x50y10y2

x3x2x1 x4 x5

x1x2x3x4x500000

x1x2x3x4x5

x1x2x3x4x50000

x1x2x3x4x50y10y2

U k-LUT corresponds to k-controlled single-target gateI non-output LUTs need to be uncomputed

I order of LUT traversal determines number of ancillasI maximum output cone determines minimum number of ancillas� fast mapping that generates a fixed-space skeleton for

x3x2x1 x4 x5

x1x2x3x4x500000

x1x2x3x4x5

0y10y2

x1x2x3x4x50000

x1x2x3x4x50y10y2

x3x2x1 x4 x5

x1x2x3x4x500000

x1x2x3x4x500y10y2

x1x2x3x4x50000

x1x2x3x4x50y10y2

x3x2x1 x4 x5

x1x2x3x4x500000

x1x2x3x4x500y10y2

x1x2x3x4x50000

x1x2x3x4x50y10y2

U k-LUT corresponds to k-controlled single-target gateI non-output LUTs need to be uncomputedI order of LUT traversal determines number of ancillas

I maximum output cone determines minimum number of ancillas� fast mapping that generates a fixed-space skeleton for

x3x2x1 x4 x5

x1x2x3x4x500000

x1x2x3x4x500y10y2

x1x2x3x4x50000

x1x2x3x4x50y10y2

U k-LUT corresponds to k-controlled single-target gateI non-output LUTs need to be uncomputedI order of LUT traversal determines number of ancillasI maximum output cone determines minimum number of ancillas

� fast mapping that generates a fixed-space skeleton forsubnetwork synthesis

x3x2x1 x4 x5

x1x2x3x4x500000

x1x2x3x4x500y10y2

x1x2x3x4x50000

x1x2x3x4x50y10y2

U k-LUT corresponds to k-controlled single-target gateI non-output LUTs need to be uncomputedI order of LUT traversal determines number of ancillasI maximum output cone determines minimum number of ancillas� fast mapping that generates a fixed-space skeleton for

eleven

Single-target gate LUT mapping

x1x2x3x4x500000

x1x2x3x4x500y10y2

I Mapping problem: Given a single-target gate Tf (X , xt) (withcontrol function f , control lines X , and target line xt), a set ofclean ancillas Xc, and a set of dirty ancillas Xd, find the bestmapping into a Clifford+T network, such that all ancillas arerestored to their original value.

twelve

Single-target gate mapping algorithms

I Direct

I Map each control function using ESOP based synthesisI Does not use ancillae

I LUT-based

I Map control function into smaller LUT networkI Map small LUTs into pre-computed optimum quantum circuits

twelve

I DirectI Map each control function using ESOP based synthesis

I Does not use ancillaeI LUT-based

twelve

I DirectI Map each control function using ESOP based synthesisI Does not use ancillae

I LUT-based

twelve

I LUT-based

twelve

I LUT-basedI Map control function into smaller LUT network

I Map small LUTs into pre-computed optimum quantum circuits

twelve

I LUT-basedI Map control function into smaller LUT networkI Map small LUTs into pre-computed optimum quantum circuits

thirteen

Direct mapping

f (x1, x2, x3, x4) = [(x4x3x2x1)2 is prime]

= x̄4x̄3x2 ∨ x̄4x3x1 ∨ x4x̄3x2x1 ∨ x4x3x̄2x1

= x4x2x1 ⊕ x3x1 ⊕ x̄4x̄3x2

x1x2x3x4

f (x1, x2, x3, x4)

I Each multiple-controlled Toffoli gate is mapped to Clifford+T

� ESOP minimization tools (e.g., exorcism) optimize for cubecount

thirteen

Direct mapping

= x̄4x̄3x2 ∨ x̄4x3x1 ∨ x4x̄3x2x1 ∨ x4x3x̄2x1

= x4x2x1 ⊕ x3x1 ⊕ x̄4x̄3x2

x1x2x3x4

f (x1, x2, x3, x4)

thirteen

Direct mapping

= x̄4x̄3x2 ∨ x̄4x3x1 ∨ x4x̄3x2x1 ∨ x4x3x̄2x1

= x4x2x1 ⊕ x3x1 ⊕ x̄4x̄3x2

x1x2x3x4

f (x1, x2, x3, x4)

thirteen

Direct mapping

= x̄4x̄3x2 ∨ x̄4x3x1 ∨ x4x̄3x2x1 ∨ x4x3x̄2x1

= x4x2x1 ⊕ x3x1 ⊕ x̄4x̄3x2

x1x2x3x4

f (x1, x2, x3, x4)

fourteen

LUT-based single-target gate mapping

x2 x3 x1x4

fourteen

x2 x3 x1x4

fourteen

x2 x3 x1x4

fifteen

Exploiting Boolean function classificationx1x2x3x4

f = f ′

x1x2x3x4

swap inputs

x1x2x3x4

f = f ′

x1x2x3x4

invert inputs

x1x2x3x4

f = f ′

x1x2x3x4

invert output

x1x2x3x4

f = f ′

x1x2x3x4

spectral translation

x1x2x3x4

f = f ′

x1x2x3x4

disjoint spectral translation

I Operations do not influence T -count of the quantum circuit� All optimum circuits in an equivalence class have the same

T -count

sixteen

Classification of all 4-input functions

I All 65,356 4-input functions collapse into only 8 equivalenceclasses

(all 4,294,967,296 5-input functions collapse into 48classes)

I Classification simple by comparing coefficients in the function’sWalsh spectrum

(and auto-correlation spectrum)

x1x2x3x4

x1x2x3x4f(4)1

x1x2x3x4

x1x2x3x4f(4)2

x1x2x3x4

x1x2x3x4f(4)3

x1x2x3

x1x2x3x4

x1x2x3x4f(4)4

x1x2 ⊕ x1x2x3x4

x1x2x3x4

x1x2x3x4f(4)5

x1x2x3x4

x1x2x3x4f(4)6

x1x2x3 ⊕ x3x4

x1x2x3x4

x1x2x3x4f(4)7

x1x2⊕ x1x2x̄3x̄4⊕ x3x4

x1x2x3x4

x1x2x3x4f(4)8

x1x2 ⊕ x3x4

sixteen

I All 65,356 4-input functions collapse into only 8 equivalenceclasses

(all 4,294,967,296 5-input functions collapse into 48classes)

I Classification simple by comparing coefficients in the function’sWalsh spectrum

(and auto-correlation spectrum)

x1x2x3x4

x1x2x3x4f(4)1

x1x2x3x4

x1x2x3x4f(4)2

x1x2x3x4

x1x2x3x4f(4)3

x1x2x3

x1x2x3x4

x1x2x3x4f(4)4

x1x2 ⊕ x1x2x3x4

x1x2x3x4

x1x2x3x4f(4)5

x1x2x3x4

x1x2x3x4f(4)6

x1x2x3 ⊕ x3x4

x1x2x3x4

x1x2x3x4f(4)7

x1x2⊕ x1x2x̄3x̄4⊕ x3x4

x1x2x3x4

x1x2x3x4f(4)8

x1x2 ⊕ x3x4

sixteen

I All 65,356 4-input functions collapse into only 8 equivalenceclasses (all 4,294,967,296 5-input functions collapse into 48classes)

I Classification simple by comparing coefficients in the function’sWalsh spectrum (and auto-correlation spectrum)

x1x2x3x4

x1x2x3x4f(4)1

x1x2x3x4

x1x2x3x4f(4)2

x1x2x3x4

x1x2x3x4f(4)3

x1x2x3

x1x2x3x4

x1x2x3x4f(4)4

x1x2 ⊕ x1x2x3x4

x1x2x3x4

x1x2x3x4f(4)5

x1x2x3x4

x1x2x3x4f(4)6

x1x2x3 ⊕ x3x4

x1x2x3x4

x1x2x3x4f(4)7

x1x2⊕ x1x2x̄3x̄4⊕ x3x4

x1x2x3x4

x1x2x3x4f(4)8

x1x2 ⊕ x3x4

seventeen

Trading off size and space with LUT sizeSynthesizing a 16-bit floating point adder with different LUT sizes

4 6 8 10 12 14 16 18 200

LUT size k

qubits

2 · 105

4 · 105qubits

T gates (direct)T gates (LUT-based)

eighteen

Experimental results: Quantum floating point library

I Existing implementations of 16-bit, 32-bit, and 64-bit floatingpoint components

� quantumfpl.stationq.comI Optimization using ABC (academic, open source)I LHRS implemented in RevKit (academic, open source)

revkit> read_aiger add_32.aigrevkit> ps -a[i] add_32: i/o = 64 / 32 and = 1763 lev = 137revkit> lhrs -k 16[i] run-time: 9.32 secsrevkit> ps -cLines: 368Gates: 6141T-count: 256668Logic qubits: 368

eighteen

Experimental results: Quantum floating point library

I Existing implementations of 16-bit, 32-bit, and 64-bit floatingpoint components

� quantumfpl.stationq.comI Optimization using ABC (academic, open source)I LHRS implemented in RevKit (academic, open source)

revkit> read_aiger add_32.aigrevkit> ps -a[i] add_32: i/o = 64 / 32 and = 1763 lev = 137revkit> lhrs -k 16[i] run-time: 9.32 secsrevkit> ps -cLines: 368Gates: 6141T-count: 256668Logic qubits: 368

nineteen

The LHRS ecosystem arxiv.org/abs/1706.02721

Mapping into LUTs

Aligning LUTs as single-target gates

Mapping single-target gates

2 LHRS

let gaussian a b c d x =let den = (x - b) * (x - b)let nom = -2.0f * c * ca * exp (den / nom)

/ program

LIQUi |〉 ProjectQ Quipper

nineteen

Mapping into LUTs

2 LHRS

/ program

nineteen

Mapping into LUTs

2 LHRS

/ program

twenty

Summary

I LHRS is the first scalable synthesis algorithm that allowsreasonable space/time trade-offs

I Valuable tool to estimate the cost of future quantumalgorithms

I Step 1: Mapping to efficiently partition large function intosubfunctions (determines qubits)

I Step 2: High effort methods to map subfunctions intoClifford+T networks

I Most steps in the algorithm are (still) performed usingconventional algorithms

twenty

Summary

twenty

Summary

twenty

Summary

twenty

Summary

Logic Synthesis for Quantum ComputingMathias Soeken, Alan Mishchenko, Luca Amarù, Robert K. Brayton

Integrated Systems Laboratory, EPFL, Switzerland

lsi.epfl.ch • msoeken.github.io

IntegratedSystemsLaboratory,EPFL,Switzerland lsi.epﬂ.ch...

Documents

Geotechnical Ref 1911 TELEMAC...· Crissier MMM - Switzerland , 1998 · Eiblschrofn rock-wall - Austria , 1999 · EMO Kade quay wall - Netherlands , 2011 · EPFL MT Building - Switzerland

4K ultraHD TV wireless REmote PROduction SYStems · 2017. 4. 28. · European Broadcasting Union (EBU) / Eurovision, Switzerland Ecole Polytechnique Fédérale Lausanne (EPFL), Switzerland

Jonas Latt EPFL, Lausanne, Switzerland Rome, DSFD 2010dsfd2010/TUTORIALS/Latt.pdf · Tutorial: the open-source library Palabos in your daily work Jonas Latt EPFL, Lausanne, Switzerland

Genova, Genova, Italy (EPFL), Lausanne, Switzerland arXiv

Yongluan Zhou University of Southern Denmark Ali Salehi, Karl Aberer EPFL, Switzerland - Manisha Mishra

Fundamentals of compressed sensing for radio ...€¦ · Signal Processing Lab. (EPFL, Switzerland) & Medical Image Processing Lab. (UniGE/EPFL, Switzerland) 5th SKA Workshop on Calibration

OptiMoS: Optimal Sensing for Mobile Sensors Optimal Sensing for Mobile Sensors Zhixian Yan , Julien Eberle†, Karl Aberer EPFL, Switzerland † Nokia Research Center, Lausanne, Switzerland

EPFL, SWITZERLAND The Sound of Perfection Thanks to the ... · the acoustics group at the EPFL (École Polytechnique Fédérale de Lausanne), said: “Traditional membrane resonator

1.3 Interaction of Radiation with Matter 1 By Archana SHARMA CERN Geneva Switzerland March 2009 Troisieme Cycle EPFL Lausanne, Switzerland Gaseous Particle

Analysis of SwissCovid · Analysis of SwissCovid Serge Vaudenay1 and Martin Vuagnoux2 2020, June 5 1 EPFL, Lausanne, Switzerland 2 base23, Geneva, Switzerland What follows is the

Benoit Salvant (EPFL / CERN Switzerland)

Modular Heap Analysis Of Higher Order Programs Ravichandhran Madhavan + * Ganesan Ramalingam * Kapil Vaswani * * Microsoft Research India + EPFL, Switzerland

Herbert Shea Institute of Microengineering EPFL Switzerland

CHIPP Plenary meeting 23-24 August 2010, Gersau, Switzerland W. Lustermann, ETH Zurich TU Dortmund, Dortmund, Germany ISDC, Geneva, Switzerland EPFL,

SOFC-Life (256885) - fch.europa.eu Review day 2012... · EDF France HEXIS Switzerland partners - research & university DTU-EC Denmark EPFL Switzerland VTT Finnland EMPA Switzerland

treeKL: A distance between high dimension empirical ...Riwal Lefort (1), Fran˘cois Fleuret (1;2) (1) Idiap research institute, Switzerland, riwal.lefort@idiap.ch 5 (2) EPFL, Switzerland,

Computer Graphics from your pockets to your CAVE Achille Peternier, Ph. D. Student VRLab, EPFL, Switzerland

The Scala Experience Martin Odersky EPFL Lausanne, Switzerland

1 CO-OPN: Concurrent Object-Oriented Petri Nets Giovanna Di Marzo Serugendo University of Geneva, Switzerland (Courtesy of Didier Buchs, EPFL, Switzerland)

TM Input Acceptance Vincent Gramoli, Derin Harmanci, Pascal Felber EPFL LPD - University of Neuchâtel Switzerland