IntegratedSystemsLaboratory,EPFL,Switzerland lsi.epfl.ch...

Preview:

Citation preview

Logic Synthesis for Quantum ComputingMathias Soeken, Alan Mishchenko, Luca Amarù, Robert K. Brayton

Integrated Systems Laboratory, EPFL, Switzerland

lsi.epfl.ch • msoeken.github.io

two

Quantum computing is getting real

I 17-qubit quantum computer from IBM based onsuperconducting qubits (16-qubit version available via cloudservice)

I 9-qubit quantum computer from Google based onsuperconducting circuits

I 5-qubit quantum computer at University of Maryland based onion traps

I Microsoft is investigating topological quantum computersI Intel is investigating silicon-based qubits

I “Quantum supremacy” experiment may be possible with ≈50qubits (45-qubit simulation has been performed classically)

I Smallest practical problems require ≈100 qubits

two

Quantum computing is getting real

I 17-qubit quantum computer from IBM based onsuperconducting qubits (16-qubit version available via cloudservice)

I 9-qubit quantum computer from Google based onsuperconducting circuits

I 5-qubit quantum computer at University of Maryland based onion traps

I Microsoft is investigating topological quantum computersI Intel is investigating silicon-based qubits

I “Quantum supremacy” experiment may be possible with ≈50qubits (45-qubit simulation has been performed classically)

I Smallest practical problems require ≈100 qubits

three

Challenges in logic synthesis for quantum computing

1. Quantum computers process qubits not bits

2. All qubit operations, called quantum gates, must be reversible3. Standard gate library for today’s physical quantum computers

is non-trivial4. Circuit is not allowed to produce intermediate results, called

garbage qubits

Classical half-adder

x

0

y

1

s

1

c

0

Quantum half-adder

|x〉

1/√

2(|0〉+ |1〉)

|y〉

1/√

2(|0〉+ |1〉)

|0〉

|x〉|s〉

1/2(|000〉+ |010〉+ |110〉+ |101〉)

|c〉

three

Challenges in logic synthesis for quantum computing

1. Quantum computers process qubits not bits

2. All qubit operations, called quantum gates, must be reversible3. Standard gate library for today’s physical quantum computers

is non-trivial4. Circuit is not allowed to produce intermediate results, called

garbage qubits

Classical half-adder

x

0

y

1

s

1

c

0

Quantum half-adder

|x〉

1/√

2(|0〉+ |1〉)

|y〉

1/√

2(|0〉+ |1〉)|0〉

|x〉|s〉

1/2(|000〉+ |010〉+ |110〉+ |101〉)

|c〉

three

Challenges in logic synthesis for quantum computing

1. Quantum computers process qubits not bits2. All qubit operations, called quantum gates, must be reversible

3. Standard gate library for today’s physical quantum computersis non-trivial

4. Circuit is not allowed to produce intermediate results, calledgarbage qubits

Classical half-adder

x

0

y

1

s

1

c

0

Quantum half-adder

|x〉

1/√

2(|0〉+ |1〉)

|y〉

1/√

2(|0〉+ |1〉)

|0〉

|x〉|s〉

1/2(|000〉+ |010〉+ |110〉+ |101〉)

|c〉

three

Challenges in logic synthesis for quantum computing

1. Quantum computers process qubits not bits2. All qubit operations, called quantum gates, must be reversible

3. Standard gate library for today’s physical quantum computersis non-trivial

4. Circuit is not allowed to produce intermediate results, calledgarbage qubits

Classical half-adder

x

0

y

1

s

1

c

0

Quantum half-adder

|x〉

1/√

2(|0〉+ |1〉)

|y〉

1/√

2(|0〉+ |1〉)

|0〉

|x〉|s〉

1/2(|000〉+ |010〉+ |110〉+ |101〉)

|c〉

three

Challenges in logic synthesis for quantum computing

1. Quantum computers process qubits not bits2. All qubit operations, called quantum gates, must be reversible3. Standard gate library for today’s physical quantum computers

is non-trivial

4. Circuit is not allowed to produce intermediate results, calledgarbage qubits

Classical half-adder

x

0

y

1

s

1

c

0

Quantum half-adder

|x〉

1/√

2(|0〉+ |1〉)

|y〉

1/√

2(|0〉+ |1〉)

|0〉

|x〉|s〉

1/2(|000〉+ |010〉+ |110〉+ |101〉)

|c〉

three

Challenges in logic synthesis for quantum computing

1. Quantum computers process qubits not bits2. All qubit operations, called quantum gates, must be reversible3. Standard gate library for today’s physical quantum computers

is non-trivial4. Circuit is not allowed to produce intermediate results, called

garbage qubits

Classical half-adder

x

0

y

1

s

1

c

0

Quantum half-adder

|x〉

1/√

2(|0〉+ |1〉)

|y〉

1/√

2(|0〉+ |1〉)

|0〉

|x〉|s〉

1/2(|000〉+ |010〉+ |110〉+ |101〉)

|c〉

four

Reversible gates

x1 x̄1

NOT

x1

x2

x1

x1 ⊕ x2

CNOT

x1

x2

x3

x1

x2

x3 ⊕ x1x2

Toffoli

x1

x2

x3

fx1

x2

x3 ⊕ f (x1, x2)

Single-target

x1

x2

x3

x4

x5

x1

x2

x3

x4

x5 ⊕ x1x̄2x3x̄4

Multiple-controlled Toffoli

x1

x2

x3

0

x1

x2

x1 ⊕ x2 ⊕ x3

〈x1x2x3〉

Full adder

four

Reversible gates

x1 x̄1

NOT

x1

x2

x1

x1 ⊕ x2

CNOT

x1

x2

x3

x1

x2

x3 ⊕ x1x2

Toffoli

x1

x2

x3

fx1

x2

x3 ⊕ f (x1, x2)

Single-target

x1

x2

x3

x4

x5

x1

x2

x3

x4

x5 ⊕ x1x̄2x3x̄4

Multiple-controlled Toffoli

x1

x2

x3

0

x1

x2

x1 ⊕ x2 ⊕ x3

〈x1x2x3〉

Full adder

four

Reversible gates

x1 x̄1

NOT

x1

x2

x1

x1 ⊕ x2

CNOT

x1

x2

x3

x1

x2

x3 ⊕ x1x2

Toffoli

x1

x2

x3

fx1

x2

x3 ⊕ f (x1, x2)

Single-target

x1

x2

x3

x4

x5

x1

x2

x3

x4

x5 ⊕ x1x̄2x3x̄4

Multiple-controlled Toffoli

x1

x2

x3

0

x1

x2

x1 ⊕ x2 ⊕ x3

〈x1x2x3〉

Full adder

four

Reversible gates

x1 x̄1

NOT

x1

x2

x1

x1 ⊕ x2

CNOT

x1

x2

x3

x1

x2

x3 ⊕ x1x2

Toffoli

x1

x2

x3

fx1

x2

x3 ⊕ f (x1, x2)

Single-target

x1

x2

x3

x4

x5

x1

x2

x3

x4

x5 ⊕ x1x̄2x3x̄4

Multiple-controlled Toffoli

x1

x2

x3

0

x1

x2

x1 ⊕ x2 ⊕ x3

〈x1x2x3〉

Full adder

five

Quantum gates

I Qubit is vector |ϕ〉 = ( αβ ) with |α2|+ |β2| = 1.

I Classical 0 is |0〉 = ( 10 ); Classical 1 is |1〉 = ( 0

1 )

|ϕ1〉|ϕ2〉

(1 0 0 00 1 0 00 0 0 10 0 1 0

)|ϕ1ϕ2〉

CNOT

|ϕ〉 H1√2

( 1 11 −1

)|ϕ〉

Hadamard

|ϕ〉 T(

1 0

0 eiπ4

)|ϕ〉

T

five

Quantum gates

I Qubit is vector |ϕ〉 = ( αβ ) with |α2|+ |β2| = 1.

I Classical 0 is |0〉 = ( 10 ); Classical 1 is |1〉 = ( 0

1 )

|ϕ1〉|ϕ2〉

(1 0 0 00 1 0 00 0 0 10 0 1 0

)|ϕ1ϕ2〉

CNOT

|ϕ〉 H1√2

( 1 11 −1

)|ϕ〉

Hadamard

|ϕ〉 T(

1 0

0 eiπ4

)|ϕ〉

T

six

Composing quantum gates

I Applying a quantum gate to a quantum state (matrix-vectormultiplication)

|ϕ〉 = ( αβ ) U U|ϕ〉

I Applying quantum gates in sequence (matrix product)

|ϕ〉 = ( αβ ) U1 U2 (U2U1)|ϕ〉

I Applying quantum gates in parallel (Kronecker product)

|ϕ1〉 =( α1β1

)|ϕ2〉 =

( α2β2

) U1

U2(U1 ⊗ U2)|ϕ1ϕ2〉

six

Composing quantum gates

I Applying a quantum gate to a quantum state (matrix-vectormultiplication)

|ϕ〉 = ( αβ ) U U|ϕ〉

I Applying quantum gates in sequence (matrix product)

|ϕ〉 = ( αβ ) U1 U2 (U2U1)|ϕ〉

I Applying quantum gates in parallel (Kronecker product)

|ϕ1〉 =( α1β1

)|ϕ2〉 =

( α2β2

) U1

U2(U1 ⊗ U2)|ϕ1ϕ2〉

six

Composing quantum gates

I Applying a quantum gate to a quantum state (matrix-vectormultiplication)

|ϕ〉 = ( αβ ) U U|ϕ〉

I Applying quantum gates in sequence (matrix product)

|ϕ〉 = ( αβ ) U1 U2 (U2U1)|ϕ〉

I Applying quantum gates in parallel (Kronecker product)

|ϕ1〉 =( α1β1

)|ϕ2〉 =

( α2β2

) U1

U2(U1 ⊗ U2)|ϕ1ϕ2〉

seven

Mapping Toffoli gatesx1

x2

x3

=

H

T

T

T

T †T †

T †

T H

x1

x2

x3 ⊕ x1x2

Clifford+T circuit [Amy et al., TCAD 32, 2013]

x1x2x3x4

w

x5

=

x1x2x3x4

w

x5 ⊕ x1x2x3x4

Ç Costs are number of qubits and number of T gates

seven

Mapping Toffoli gatesx1

x2

x3

=

H

T

T

T

T †T †

T †

T H

x1

x2

x3 ⊕ x1x2

Clifford+T circuit [Amy et al., TCAD 32, 2013]

x1x2x3x4

w

x5

=

x1x2x3x4

w

x5 ⊕ x1x2x3x4

Ç Costs are number of qubits and number of T gates

seven

Mapping Toffoli gatesx1

x2

x3

=

H

T

T

T

T †T †

T †

T H

x1

x2

x3 ⊕ x1x2

Clifford+T circuit [Amy et al., TCAD 32, 2013]

x1x2x3x4

w

x5

=

x1x2x3x4

w

x5 ⊕ x1x2x3x4

Ç Costs are number of qubits and number of T gates

eight

LUT-based hierarchical reversible synthesisGoal: Automatically synthesizing large Boolean functions intoClifford+T networks of reasonable quality (qubits and T -count)

Algorithm: LUT-based hierarchical reversible synthesis (LHRS)

1. Represent input function as classical logicnetwork and optimize it

2. Map network into k-LUT network3. Translate k-LUT network into reversible

network with k-input single-target gates4. Map single-target gates into Clifford+T

networks

conv.alg.

new

alg.

affects

#qubits

affects

#T

gates

eight

LUT-based hierarchical reversible synthesisGoal: Automatically synthesizing large Boolean functions intoClifford+T networks of reasonable quality (qubits and T -count)

Algorithm: LUT-based hierarchical reversible synthesis (LHRS)

1. Represent input function as classical logicnetwork and optimize it

2. Map network into k-LUT network3. Translate k-LUT network into reversible

network with k-input single-target gates4. Map single-target gates into Clifford+T

networks

conv.alg.

new

alg.

affects

#qubits

affects

#T

gates

eight

LUT-based hierarchical reversible synthesisGoal: Automatically synthesizing large Boolean functions intoClifford+T networks of reasonable quality (qubits and T -count)

Algorithm: LUT-based hierarchical reversible synthesis (LHRS)

1. Represent input function as classical logicnetwork and optimize it

2. Map network into k-LUT network

3. Translate k-LUT network into reversiblenetwork with k-input single-target gates

4. Map single-target gates into Clifford+Tnetworks

conv.alg.

new

alg.

affects

#qubits

affects

#T

gates

eight

LUT-based hierarchical reversible synthesisGoal: Automatically synthesizing large Boolean functions intoClifford+T networks of reasonable quality (qubits and T -count)

Algorithm: LUT-based hierarchical reversible synthesis (LHRS)

1. Represent input function as classical logicnetwork and optimize it

2. Map network into k-LUT network3. Translate k-LUT network into reversible

network with k-input single-target gates

4. Map single-target gates into Clifford+Tnetworks

conv.alg.

new

alg.

affects

#qubits

affects

#T

gates

eight

LUT-based hierarchical reversible synthesisGoal: Automatically synthesizing large Boolean functions intoClifford+T networks of reasonable quality (qubits and T -count)

Algorithm: LUT-based hierarchical reversible synthesis (LHRS)

1. Represent input function as classical logicnetwork and optimize it

2. Map network into k-LUT network3. Translate k-LUT network into reversible

network with k-input single-target gates4. Map single-target gates into Clifford+T

networks

conv.alg.

new

alg.

affects

#qubits

affects

#T

gates

eight

LUT-based hierarchical reversible synthesisGoal: Automatically synthesizing large Boolean functions intoClifford+T networks of reasonable quality (qubits and T -count)

Algorithm: LUT-based hierarchical reversible synthesis (LHRS)

1. Represent input function as classical logicnetwork and optimize it

2. Map network into k-LUT network3. Translate k-LUT network into reversible

network with k-input single-target gates4. Map single-target gates into Clifford+T

networks

conv.alg.

new

alg.

affects

#qubits

affects

#T

gates

eight

LUT-based hierarchical reversible synthesisGoal: Automatically synthesizing large Boolean functions intoClifford+T networks of reasonable quality (qubits and T -count)

Algorithm: LUT-based hierarchical reversible synthesis (LHRS)

1. Represent input function as classical logicnetwork and optimize it

2. Map network into k-LUT network3. Translate k-LUT network into reversible

network with k-input single-target gates4. Map single-target gates into Clifford+T

networks

conv.alg.

new

alg.

affects

#qubits

affects

#T

gates

eight

LUT-based hierarchical reversible synthesisGoal: Automatically synthesizing large Boolean functions intoClifford+T networks of reasonable quality (qubits and T -count)

Algorithm: LUT-based hierarchical reversible synthesis (LHRS)

1. Represent input function as classical logicnetwork and optimize it

2. Map network into k-LUT network3. Translate k-LUT network into reversible

network with k-input single-target gates4. Map single-target gates into Clifford+T

networks

conv.alg.

new

alg.

affects

#qubits

affects

#T

gates

eight

LUT-based hierarchical reversible synthesisGoal: Automatically synthesizing large Boolean functions intoClifford+T networks of reasonable quality (qubits and T -count)

Algorithm: LUT-based hierarchical reversible synthesis (LHRS)

1. Represent input function as classical logicnetwork and optimize it

2. Map network into k-LUT network3. Translate k-LUT network into reversible

network with k-input single-target gates4. Map single-target gates into Clifford+T

networks

conv.alg.

new

alg.

affects

#qubits

affects

#T

gates

nine

LUT mapping

I Realizing a logic function or logic circuit in terms of a k-LUTlogic network

I A k-LUT is any Boolean function with at most k inputsI One of the most effective methods used in logic synthesis

I Typical objective functions are size (number of LUTs) anddepth (longest path from inputs to outputs)

I Open source software ABC can generate industrial-scalemappings

I Can be used as technology mapper for FPGAs (e.g., whenk ≤ 7)

nine

LUT mapping

I Realizing a logic function or logic circuit in terms of a k-LUTlogic network

I A k-LUT is any Boolean function with at most k inputsI One of the most effective methods used in logic synthesisI Typical objective functions are size (number of LUTs) and

depth (longest path from inputs to outputs)I Open source software ABC can generate industrial-scale

mappings

I Can be used as technology mapper for FPGAs (e.g., whenk ≤ 7)

nine

LUT mapping

I Realizing a logic function or logic circuit in terms of a k-LUTlogic network

I A k-LUT is any Boolean function with at most k inputsI One of the most effective methods used in logic synthesisI Typical objective functions are size (number of LUTs) and

depth (longest path from inputs to outputs)I Open source software ABC can generate industrial-scale

mappingsI Can be used as technology mapper for FPGAs (e.g., when

k ≤ 7)

ten

k-LUT network to reversible network

y2y1

x3x2x1 x4 x5

1 2

3 4

5

x1x2x3x4x5

00000

1

23

4 5 4

2

1

x1x2x3x4x5

00y10y2

x1x2x3x4x50000

1

2

45

4

23

1

x1x2x3x4x50y10y2

U k-LUT corresponds to k-controlled single-target gateI non-output LUTs need to be uncomputedI order of LUT traversal determines number of ancillasI maximum output cone determines minimum number of ancillas� fast mapping that generates a fixed-space skeleton for

subnetwork synthesis

ten

k-LUT network to reversible network

y2y1

x3x2x1 x4 x5

1 2

3 4

5

x1x2x3x4x50

0000

1

23

4 5 4

2

1

x1x2x3x4x5

00y10y2

x1x2x3x4x50000

1

2

45

4

23

1

x1x2x3x4x50y10y2

U k-LUT corresponds to k-controlled single-target gate

I non-output LUTs need to be uncomputedI order of LUT traversal determines number of ancillasI maximum output cone determines minimum number of ancillas� fast mapping that generates a fixed-space skeleton for

subnetwork synthesis

ten

k-LUT network to reversible network

y2y1

x3x2x1 x4 x5

1 2

3 4

5

x1x2x3x4x500

000

1

2

3

4 5 4

2

1

x1x2x3x4x5

00y10y2

x1x2x3x4x50000

1

2

45

4

23

1

x1x2x3x4x50y10y2

U k-LUT corresponds to k-controlled single-target gate

I non-output LUTs need to be uncomputedI order of LUT traversal determines number of ancillasI maximum output cone determines minimum number of ancillas� fast mapping that generates a fixed-space skeleton for

subnetwork synthesis

ten

k-LUT network to reversible network

y2y1

x3x2x1 x4 x5

1 2

3 4

5

x1x2x3x4x5000

00

1

23

4 5 4

2

1

x1x2x3x4x5

00

y1

0y2

x1x2x3x4x50000

1

2

45

4

23

1

x1x2x3x4x50y10y2

U k-LUT corresponds to k-controlled single-target gate

I non-output LUTs need to be uncomputedI order of LUT traversal determines number of ancillasI maximum output cone determines minimum number of ancillas� fast mapping that generates a fixed-space skeleton for

subnetwork synthesis

ten

k-LUT network to reversible network

y2y1

x3x2x1 x4 x5

1 2

3 4

5

x1x2x3x4x50000

0

1

23

4

5 4

2

1

x1x2x3x4x5

00

y1

0y2

x1x2x3x4x50000

1

2

45

4

23

1

x1x2x3x4x50y10y2

U k-LUT corresponds to k-controlled single-target gate

I non-output LUTs need to be uncomputedI order of LUT traversal determines number of ancillasI maximum output cone determines minimum number of ancillas� fast mapping that generates a fixed-space skeleton for

subnetwork synthesis

ten

k-LUT network to reversible network

y2y1

x3x2x1 x4 x5

1 2

3 4

5

x1x2x3x4x500000

1

23

4 5

4

2

1

x1x2x3x4x5

00

y1

0

y2

x1x2x3x4x50000

1

2

45

4

23

1

x1x2x3x4x50y10y2

U k-LUT corresponds to k-controlled single-target gate

I non-output LUTs need to be uncomputedI order of LUT traversal determines number of ancillasI maximum output cone determines minimum number of ancillas� fast mapping that generates a fixed-space skeleton for

subnetwork synthesis

ten

k-LUT network to reversible network

y2y1

x3x2x1 x4 x5

1 2

3 4

5

x1x2x3x4x500000

1

23

4 5 4

2

1

x1x2x3x4x5

00

y10y2

x1x2x3x4x50000

1

2

45

4

23

1

x1x2x3x4x50y10y2

U k-LUT corresponds to k-controlled single-target gateI non-output LUTs need to be uncomputed

I order of LUT traversal determines number of ancillasI maximum output cone determines minimum number of ancillas� fast mapping that generates a fixed-space skeleton for

subnetwork synthesis

ten

k-LUT network to reversible network

y2y1

x3x2x1 x4 x5

1 2

3 4

5

x1x2x3x4x500000

1

23

4 5 4

2

1

x1x2x3x4x5

0

0y10y2

x1x2x3x4x50000

1

2

45

4

23

1

x1x2x3x4x50y10y2

U k-LUT corresponds to k-controlled single-target gateI non-output LUTs need to be uncomputed

I order of LUT traversal determines number of ancillasI maximum output cone determines minimum number of ancillas� fast mapping that generates a fixed-space skeleton for

subnetwork synthesis

ten

k-LUT network to reversible network

y2y1

x3x2x1 x4 x5

1 2

3 4

5

x1x2x3x4x500000

1

23

4 5 4

2

1

x1x2x3x4x500y10y2

x1x2x3x4x50000

1

2

45

4

23

1

x1x2x3x4x50y10y2

U k-LUT corresponds to k-controlled single-target gateI non-output LUTs need to be uncomputed

I order of LUT traversal determines number of ancillasI maximum output cone determines minimum number of ancillas� fast mapping that generates a fixed-space skeleton for

subnetwork synthesis

ten

k-LUT network to reversible network

y2y1

x3x2x1 x4 x5

1 2

3 4

5

x1x2x3x4x500000

1

23

4 5 4

2

1

x1x2x3x4x500y10y2

x1x2x3x4x50000

1

2

45

4

23

1

x1x2x3x4x50y10y2

U k-LUT corresponds to k-controlled single-target gateI non-output LUTs need to be uncomputedI order of LUT traversal determines number of ancillas

I maximum output cone determines minimum number of ancillas� fast mapping that generates a fixed-space skeleton for

subnetwork synthesis

ten

k-LUT network to reversible network

y2y1

x3x2x1 x4 x5

1 2

3 4

5

x1x2x3x4x500000

1

23

4 5 4

2

1

x1x2x3x4x500y10y2

x1x2x3x4x50000

1

2

45

4

23

1

x1x2x3x4x50y10y2

U k-LUT corresponds to k-controlled single-target gateI non-output LUTs need to be uncomputedI order of LUT traversal determines number of ancillasI maximum output cone determines minimum number of ancillas

� fast mapping that generates a fixed-space skeleton forsubnetwork synthesis

ten

k-LUT network to reversible network

y2y1

x3x2x1 x4 x5

1 2

3 4

5

x1x2x3x4x500000

1

23

4 5 4

2

1

x1x2x3x4x500y10y2

x1x2x3x4x50000

1

2

45

4

23

1

x1x2x3x4x50y10y2

U k-LUT corresponds to k-controlled single-target gateI non-output LUTs need to be uncomputedI order of LUT traversal determines number of ancillasI maximum output cone determines minimum number of ancillas� fast mapping that generates a fixed-space skeleton for

subnetwork synthesis

eleven

Single-target gate LUT mapping

x1x2x3x4x500000

1

23

4 5 4

2

1

x1x2x3x4x500y10y2

I Mapping problem: Given a single-target gate Tf (X , xt) (withcontrol function f , control lines X , and target line xt), a set ofclean ancillas Xc, and a set of dirty ancillas Xd, find the bestmapping into a Clifford+T network, such that all ancillas arerestored to their original value.

twelve

Single-target gate mapping algorithms

I Direct

I Map each control function using ESOP based synthesisI Does not use ancillae

I LUT-based

I Map control function into smaller LUT networkI Map small LUTs into pre-computed optimum quantum circuits

twelve

Single-target gate mapping algorithms

I DirectI Map each control function using ESOP based synthesis

I Does not use ancillaeI LUT-based

I Map control function into smaller LUT networkI Map small LUTs into pre-computed optimum quantum circuits

twelve

Single-target gate mapping algorithms

I DirectI Map each control function using ESOP based synthesisI Does not use ancillae

I LUT-based

I Map control function into smaller LUT networkI Map small LUTs into pre-computed optimum quantum circuits

twelve

Single-target gate mapping algorithms

I DirectI Map each control function using ESOP based synthesisI Does not use ancillae

I LUT-based

I Map control function into smaller LUT networkI Map small LUTs into pre-computed optimum quantum circuits

twelve

Single-target gate mapping algorithms

I DirectI Map each control function using ESOP based synthesisI Does not use ancillae

I LUT-basedI Map control function into smaller LUT network

I Map small LUTs into pre-computed optimum quantum circuits

twelve

Single-target gate mapping algorithms

I DirectI Map each control function using ESOP based synthesisI Does not use ancillae

I LUT-basedI Map control function into smaller LUT networkI Map small LUTs into pre-computed optimum quantum circuits

thirteen

Direct mapping

f (x1, x2, x3, x4) = [(x4x3x2x1)2 is prime]

= x̄4x̄3x2 ∨ x̄4x3x1 ∨ x4x̄3x2x1 ∨ x4x3x̄2x1

= x4x2x1 ⊕ x3x1 ⊕ x̄4x̄3x2

x1x2x3x4

0

f

=

x1x2x3x4

f (x1, x2, x3, x4)

I Each multiple-controlled Toffoli gate is mapped to Clifford+T

� ESOP minimization tools (e.g., exorcism) optimize for cubecount

thirteen

Direct mapping

f (x1, x2, x3, x4) = [(x4x3x2x1)2 is prime]

= x̄4x̄3x2 ∨ x̄4x3x1 ∨ x4x̄3x2x1 ∨ x4x3x̄2x1

= x4x2x1 ⊕ x3x1 ⊕ x̄4x̄3x2

x1x2x3x4

0

f=

x1x2x3x4

f (x1, x2, x3, x4)

I Each multiple-controlled Toffoli gate is mapped to Clifford+T

� ESOP minimization tools (e.g., exorcism) optimize for cubecount

thirteen

Direct mapping

f (x1, x2, x3, x4) = [(x4x3x2x1)2 is prime]

= x̄4x̄3x2 ∨ x̄4x3x1 ∨ x4x̄3x2x1 ∨ x4x3x̄2x1

= x4x2x1 ⊕ x3x1 ⊕ x̄4x̄3x2

x1x2x3x4

0

f=

x1x2x3x4

f (x1, x2, x3, x4)

I Each multiple-controlled Toffoli gate is mapped to Clifford+T

� ESOP minimization tools (e.g., exorcism) optimize for cubecount

thirteen

Direct mapping

f (x1, x2, x3, x4) = [(x4x3x2x1)2 is prime]

= x̄4x̄3x2 ∨ x̄4x3x1 ∨ x4x̄3x2x1 ∨ x4x3x̄2x1

= x4x2x1 ⊕ x3x1 ⊕ x̄4x̄3x2

x1x2x3x4

0

f=

x1x2x3x4

f (x1, x2, x3, x4)

I Each multiple-controlled Toffoli gate is mapped to Clifford+T

� ESOP minimization tools (e.g., exorcism) optimize for cubecount

fourteen

LUT-based single-target gate mapping

x1

x2

x3

x4

0w

000

f

=

1

2

3

4

3

2

1

x1

x2

x3

x4

f

w

000

3

4

1

2

x2 x3 x1x4

f

fourteen

LUT-based single-target gate mapping

x1

x2

x3

x4

0w

000

f

=

1

2

3

4

3

2

1

x1

x2

x3

x4

f

w

000

3

4

1

2

x2 x3 x1x4

f

fourteen

LUT-based single-target gate mapping

x1

x2

x3

x4

0w

000

f

=

1

2

3

4

3

2

1

x1

x2

x3

x4

f

w

000

3

4

1

2

x2 x3 x1x4

f

fifteen

Exploiting Boolean function classificationx1x2x3x4

0

f = f ′

x1x2x3x4

f ′

swap inputs

x1x2x3x4

0

f = f ′

x1x2x3x4

f ′

invert inputs

x1x2x3x4

0

f = f ′

x1x2x3x4

f ′

invert output

x1x2x3x4

0

f = f ′

x1x2x3x4

f ′

spectral translation

x1x2x3x4

0

f = f ′

x1x2x3x4

f ′

disjoint spectral translation

I Operations do not influence T -count of the quantum circuit� All optimum circuits in an equivalence class have the same

T -count

sixteen

Classification of all 4-input functions

I All 65,356 4-input functions collapse into only 8 equivalenceclasses

(all 4,294,967,296 5-input functions collapse into 48classes)

I Classification simple by comparing coefficients in the function’sWalsh spectrum

(and auto-correlation spectrum)

x1x2x3x4

x1x2x3x4f(4)1

x1x2x3x4

x1x2x3x4f(4)2

x1x2x3x4

x1x2x3x4

x1x2x3x4f(4)3

x1x2x3

x1x2x3x4

x1x2x3x4f(4)4

x1x2 ⊕ x1x2x3x4

x1x2x3x4

x1x2x3x4f(4)5

x1x2

x1x2x3x4

=

x1x2x3x4f(4)6

x1x2x3 ⊕ x3x4

x1x2x3x4

x1x2x3x4f(4)7

x1x2⊕ x1x2x̄3x̄4⊕ x3x4

x1x2x3x4

x1x2x3x4f(4)8

x1x2 ⊕ x3x4

sixteen

Classification of all 4-input functions

I All 65,356 4-input functions collapse into only 8 equivalenceclasses

(all 4,294,967,296 5-input functions collapse into 48classes)

I Classification simple by comparing coefficients in the function’sWalsh spectrum

(and auto-correlation spectrum)

x1x2x3x4

x1x2x3x4f(4)1

x1x2x3x4

x1x2x3x4f(4)2

x1x2x3x4

x1x2x3x4

x1x2x3x4f(4)3

x1x2x3

x1x2x3x4

x1x2x3x4f(4)4

x1x2 ⊕ x1x2x3x4

x1x2x3x4

x1x2x3x4f(4)5

x1x2

x1x2x3x4

=

x1x2x3x4f(4)6

x1x2x3 ⊕ x3x4

x1x2x3x4

x1x2x3x4f(4)7

x1x2⊕ x1x2x̄3x̄4⊕ x3x4

x1x2x3x4

x1x2x3x4f(4)8

x1x2 ⊕ x3x4

sixteen

Classification of all 4-input functions

I All 65,356 4-input functions collapse into only 8 equivalenceclasses (all 4,294,967,296 5-input functions collapse into 48classes)

I Classification simple by comparing coefficients in the function’sWalsh spectrum (and auto-correlation spectrum)

x1x2x3x4

x1x2x3x4f(4)1

x1x2x3x4

x1x2x3x4f(4)2

x1x2x3x4

x1x2x3x4

x1x2x3x4f(4)3

x1x2x3

x1x2x3x4

x1x2x3x4f(4)4

x1x2 ⊕ x1x2x3x4

x1x2x3x4

x1x2x3x4f(4)5

x1x2

x1x2x3x4

=

x1x2x3x4f(4)6

x1x2x3 ⊕ x3x4

x1x2x3x4

x1x2x3x4f(4)7

x1x2⊕ x1x2x̄3x̄4⊕ x3x4

x1x2x3x4

x1x2x3x4f(4)8

x1x2 ⊕ x3x4

seventeen

Trading off size and space with LUT sizeSynthesizing a 16-bit floating point adder with different LUT sizes

4 6 8 10 12 14 16 18 200

100

200

300

LUT size k

qubits

2 · 105

4 · 105qubits

T gates (direct)T gates (LUT-based)

eighteen

Experimental results: Quantum floating point library

I Existing implementations of 16-bit, 32-bit, and 64-bit floatingpoint components

� quantumfpl.stationq.comI Optimization using ABC (academic, open source)I LHRS implemented in RevKit (academic, open source)

revkit> read_aiger add_32.aigrevkit> ps -a[i] add_32: i/o = 64 / 32 and = 1763 lev = 137revkit> lhrs -k 16[i] run-time: 9.32 secsrevkit> ps -cLines: 368Gates: 6141T-count: 256668Logic qubits: 368

eighteen

Experimental results: Quantum floating point library

I Existing implementations of 16-bit, 32-bit, and 64-bit floatingpoint components

� quantumfpl.stationq.comI Optimization using ABC (academic, open source)I LHRS implemented in RevKit (academic, open source)

revkit> read_aiger add_32.aigrevkit> ps -a[i] add_32: i/o = 64 / 32 and = 1763 lev = 137revkit> lhrs -k 16[i] run-time: 9.32 secsrevkit> ps -cLines: 368Gates: 6141T-count: 256668Logic qubits: 368

nineteen

The LHRS ecosystem arxiv.org/abs/1706.02721

Mapping into LUTs

Aligning LUTs as single-target gates

Mapping single-target gates

2 LHRS

let gaussian a b c d x =let den = (x - b) * (x - b)let nom = -2.0f * c * ca * exp (den / nom)

/ program

LIQUi |〉 ProjectQ Quipper

nineteen

The LHRS ecosystem arxiv.org/abs/1706.02721

Mapping into LUTs

Aligning LUTs as single-target gates

Mapping single-target gates

2 LHRS

let gaussian a b c d x =let den = (x - b) * (x - b)let nom = -2.0f * c * ca * exp (den / nom)

/ program

LIQUi |〉 ProjectQ Quipper

nineteen

The LHRS ecosystem arxiv.org/abs/1706.02721

Mapping into LUTs

Aligning LUTs as single-target gates

Mapping single-target gates

2 LHRS

let gaussian a b c d x =let den = (x - b) * (x - b)let nom = -2.0f * c * ca * exp (den / nom)

/ program

LIQUi |〉 ProjectQ Quipper

twenty

Summary

I LHRS is the first scalable synthesis algorithm that allowsreasonable space/time trade-offs

I Valuable tool to estimate the cost of future quantumalgorithms

I Step 1: Mapping to efficiently partition large function intosubfunctions (determines qubits)

I Step 2: High effort methods to map subfunctions intoClifford+T networks

I Most steps in the algorithm are (still) performed usingconventional algorithms

twenty

Summary

I LHRS is the first scalable synthesis algorithm that allowsreasonable space/time trade-offs

I Valuable tool to estimate the cost of future quantumalgorithms

I Step 1: Mapping to efficiently partition large function intosubfunctions (determines qubits)

I Step 2: High effort methods to map subfunctions intoClifford+T networks

I Most steps in the algorithm are (still) performed usingconventional algorithms

twenty

Summary

I LHRS is the first scalable synthesis algorithm that allowsreasonable space/time trade-offs

I Valuable tool to estimate the cost of future quantumalgorithms

I Step 1: Mapping to efficiently partition large function intosubfunctions (determines qubits)

I Step 2: High effort methods to map subfunctions intoClifford+T networks

I Most steps in the algorithm are (still) performed usingconventional algorithms

twenty

Summary

I LHRS is the first scalable synthesis algorithm that allowsreasonable space/time trade-offs

I Valuable tool to estimate the cost of future quantumalgorithms

I Step 1: Mapping to efficiently partition large function intosubfunctions (determines qubits)

I Step 2: High effort methods to map subfunctions intoClifford+T networks

I Most steps in the algorithm are (still) performed usingconventional algorithms

twenty

Summary

I LHRS is the first scalable synthesis algorithm that allowsreasonable space/time trade-offs

I Valuable tool to estimate the cost of future quantumalgorithms

I Step 1: Mapping to efficiently partition large function intosubfunctions (determines qubits)

I Step 2: High effort methods to map subfunctions intoClifford+T networks

I Most steps in the algorithm are (still) performed usingconventional algorithms

Logic Synthesis for Quantum ComputingMathias Soeken, Alan Mishchenko, Luca Amarù, Robert K. Brayton

Integrated Systems Laboratory, EPFL, Switzerland

lsi.epfl.ch • msoeken.github.io

Recommended