
Quantum Wasserstein GANs
Shouvanik Chakrabarti∗, Yiming Huang∗, Tongyang Li, Soheil Feizi, and Xiaodi Wu

Department of Computer Science, Institute for Advanced Computer Studies, and Joint Center for Quantum Information and Computer Science, University of Maryland


arXiv:1911.00111 · NeurIPS 2019

Abstract

We propose the first design of quantum Wasserstein Generative Adversarial Networks (WGANs), which has been shown to improve the robustness and the scalability of the adversarial training of quantum generative models even on noisy quantum hardware. Specifically, we propose a definition of the Wasserstein semimetric between distributions over quantum data. We also demonstrate how to turn the quantum Wasserstein semimetric into a concrete design of quantum WGANs that can be efficiently implemented on quantum machines. We use our quantum WGAN to generate a 3-qubit quantum circuit of 50 gates that well approximates a 3-qubit 1-d Hamiltonian simulation circuit that requires over 10k gates using standard techniques.

Motivation

The first quantum computers will be small-scale and noisy systems. Their practical applications will include the training of parameterized quantum circuits, which are networks of quantum gates controlled by classical parameters.

These circuits can be used as a parameterized representation of functions, known as quantum neural networks, which can be applied to classical supervised learning models [3] or to construct generative models. Quantum generative models can be used to characterize unknown physical processes or quantum algorithms.


Quantum Generative Adversarial Networks [5] mimic the adversarial structure of classical GANs [4] to train a quantum circuit to generate an unknown quantum state.

Formulating classical GANs using the Wasserstein distance [1] has been observed to improve their training. We improve the training of quantum GANs by formulating them based on a quantum divergence derived from the Wasserstein distance.

Quantum vs Classical Distributions

- Quantum distributions over a space Γ are described by a density operator: a positive semi-definite matrix over a space X ≅ C^Γ with trace 1.

- Generalization of a classical random process: classical distributions are characterized by diagonal density operators.

- A quantum distribution over a system with components X, Y is described by a density operator in the Kronecker product space X ⊗ Y. The marginal distributions are obtained via the partial traces Tr_X, Tr_Y.
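For concreteness, the partial trace can be computed numerically by reshaping the joint density matrix into a four-index tensor and summing over one subsystem's indices. A minimal NumPy sketch (illustrative, not the poster's code):

```python
import numpy as np

def partial_trace(rho, dims, keep):
    """Partial trace of a bipartite density matrix rho with subsystem
    dimensions dims = (dX, dY). keep=0 returns Tr_Y(rho), keep=1 returns
    Tr_X(rho)."""
    dX, dY = dims
    # Reshape the (dX*dY) x (dX*dY) matrix into a rank-4 tensor t[i, j, k, l]
    # indexed by (row of X, row of Y, col of X, col of Y).
    t = rho.reshape(dX, dY, dX, dY)
    if keep == 0:          # sum over the Y indices -> dX x dX marginal
        return np.einsum('ijkj->ik', t)
    else:                  # sum over the X indices -> dY x dY marginal
        return np.einsum('ijil->jl', t)

# Sanity check: a product state P ⊗ Q has marginals exactly P and Q.
P = np.array([[0.7, 0.0], [0.0, 0.3]])
Q = np.array([[0.5, 0.5], [0.5, 0.5]])
rho = np.kron(P, Q)
assert np.allclose(partial_trace(rho, (2, 2), keep=0), P)
assert np.allclose(partial_trace(rho, (2, 2), keep=1), Q)
```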

- Different quantum distributions can provide the same distribution of classical outcomes, e.g. the uniform classical distribution ½ [1 0; 0 1] and the quantum state ½ [1 1; 1 1].
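As a quick numerical check of this example (a sketch, not part of the poster): the diagonal of each density matrix gives the computational-basis outcome probabilities, so both states produce the same classical statistics while being different quantum states.

```python
import numpy as np

# Maximally mixed state: behaves like a classical uniform coin.
rho_mixed = 0.5 * np.array([[1.0, 0.0], [0.0, 1.0]])
# Pure superposition state |+><+|: also uniform in the computational basis.
rho_plus = 0.5 * np.array([[1.0, 1.0], [1.0, 1.0]])

# Measurement probabilities in the computational basis are the diagonals.
assert np.allclose(np.diag(rho_mixed), [0.5, 0.5])
assert np.allclose(np.diag(rho_plus), [0.5, 0.5])

# Yet the states differ: |+><+| is pure (Tr rho^2 = 1), the mixed one is not.
assert np.isclose(np.trace(rho_plus @ rho_plus), 1.0)
assert np.isclose(np.trace(rho_mixed @ rho_mixed), 0.5)
```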

Wasserstein Distance

- Classical Wasserstein distance: the minimum cost of transporting one probability distribution to another, given the cost c(x, y) of transporting unit mass between points.

- The transport plan is encoded as a joint distribution with its marginals equal to the two input distributions. Given input distributions P, Q,

  W(P, Q) = min_π ∫ c(x, y) π(x, y) dx dy,
  s.t. ∫ π(x, y) dy = P(x), ∫ π(x, y) dx = Q(y), π(x, y) ≥ 0.

- Can be reformulated in matrix form as

  W(P, Q) = min_π Tr(πC),
  s.t. π ∈ Pos(X ⊗ Y), Tr_X(π) = diag(Q), Tr_Y(π) = diag(P),

  where C = diag(c(x, y)).

- Advantages over other distances: if an input distribution is generated by applying a continuous parameterized function, the Wasserstein distance is continuous in the parameters, unlike measures such as the KL-divergence and JS-divergence.
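In the discrete case the matrix formulation is a linear program over the transport plan. A minimal sketch using `scipy.optimize.linprog` (illustrative, not the poster's code):

```python
import numpy as np
from scipy.optimize import linprog

def wasserstein_lp(P, Q, cost):
    """Discrete Wasserstein distance: minimize sum_ij cost[i,j] * pi[i,j]
    subject to row sums of pi equal P, column sums equal Q, and pi >= 0."""
    n, m = len(P), len(Q)
    # Equality constraints: n row-sum constraints, then m column-sum constraints.
    A_eq = np.zeros((n + m, n * m))
    for i in range(n):
        A_eq[i, i * m:(i + 1) * m] = 1.0      # sum_j pi[i, j] = P[i]
    for j in range(m):
        A_eq[n + j, j::m] = 1.0               # sum_i pi[i, j] = Q[j]
    b_eq = np.concatenate([P, Q])
    res = linprog(cost.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return res.fun

# Points {0, 1} with cost |x - y|: moving half the mass one step costs 0.5.
P = np.array([0.5, 0.5])
Q = np.array([1.0, 0.0])
cost = np.abs(np.subtract.outer([0.0, 1.0], [0.0, 1.0]))
print(wasserstein_lp(P, Q, cost))  # ~0.5
```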

- Regularization using the relative entropy [6] between π and P ⊗ Q makes the optimization problem strongly convex and differentiable in the generator parameters.
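The entropy-regularized problem admits the well-known Sinkhorn iteration, which alternately rescales the plan to match the two marginals. A standard-technique sketch for illustration (not code from the poster):

```python
import numpy as np

def sinkhorn(P, Q, cost, lam=0.05, iters=500):
    """Entropy-regularized optimal transport between discrete distributions
    P and Q via Sinkhorn's alternating marginal-scaling iterations."""
    K = np.exp(-cost / lam)          # Gibbs kernel from the cost matrix
    u = np.ones_like(P)
    for _ in range(iters):
        v = Q / (K.T @ u)            # rescale to match column marginals
        u = P / (K @ v)              # rescale to match row marginals
    pi = u[:, None] * K * v[None, :]
    return np.sum(pi * cost), pi

# Points {0, 1} with cost |x - y|: optimal unregularized cost is 0.25.
P = np.array([0.5, 0.5])
Q = np.array([0.75, 0.25])
cost = np.array([[0.0, 1.0], [1.0, 0.0]])
w_reg, pi = sinkhorn(P, Q, cost)
# For small lam the regularized cost is close to the unregularized 0.25.
```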

- Wasserstein GANs have been formulated using the Wasserstein distance with cost ‖x − y‖_1.

Quantum Wasserstein Semi-metric

- Motivation: define a divergence between quantum states resembling the Wasserstein distance.

- To allow for quantum states, relax the requirement that P, Q be diagonal:

  qW(P, Q) = min_π Tr(πC),
  s.t. π ∈ Pos(X ⊗ Y), Tr_X(π) = Q, Tr_Y(π) = P.

- An arbitrarily chosen cost matrix C will not preserve qW(P, P) = 0 for all quantum states. In particular, no classical cost (diagonal C) can work.

- To ensure that qW(P, P) = 0 for all quantum states, C is fixed to be ½(I_{X⊗Y} − SWAP), where the SWAP operator swaps the subsystems X, Y of a composite system (and leaves P ⊗ P unchanged).
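A quick numerical sanity check of this choice (a NumPy sketch, not from the poster): for a pure state P, the product coupling π = P ⊗ P is feasible and achieves Tr(πC) = 0, since Tr((A ⊗ B)·SWAP) = Tr(AB) and Tr(P²) = 1 for pure P.

```python
import numpy as np

d = 2
# SWAP on C^d ⊗ C^d: maps |i>|j> to |j>|i>.
SWAP = np.zeros((d * d, d * d))
for i in range(d):
    for j in range(d):
        SWAP[j * d + i, i * d + j] = 1.0
C = 0.5 * (np.eye(d * d) - SWAP)

# A pure state P = |psi><psi|.
psi = np.array([0.6, 0.8])
P = np.outer(psi, psi)

# The coupling pi = P ⊗ P has both marginals equal to P, and zero cost:
pi = np.kron(P, P)
print(np.trace(pi @ C))  # ~0, since Tr((P⊗P)·SWAP) = Tr(P^2) = 1 for pure P
```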

- Symmetric and positive, but does not satisfy the triangle inequality.

- The quantum relative entropy λ Tr(π log π − π log(P ⊗ Q)) is added to the primal, which makes the optimization problem strongly convex and a differentiable function of the input distributions.

Quantum Wasserstein GAN

- The adversarial optimization problem for a GAN is formulated using the dual form of the semi-metric with an entropic regularizer to remove the hard constraints.

  min_G max_{φ,ψ} E_real[ψ] − E_fake[φ] − E_{real⊗fake}[ξ_R]

- The fake state is generated using a parameterized quantum circuit G. We use a well-studied model for parameterized circuits [3].
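As a toy illustration of what a parameterized generator computes (a hypothetical single-qubit sketch, not the circuit family used in the poster), the generated state is a smooth function of classical rotation angles:

```python
import numpy as np

def ry(theta):
    """Single-qubit Y-rotation gate, a common building block of
    parameterized quantum circuits."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def generator(theta):
    """One-parameter toy 'circuit' G: apply RY(theta) to |0> and return
    the generated pure-state density matrix."""
    psi = ry(theta) @ np.array([1.0, 0.0])
    return np.outer(psi, psi)

# The fake state varies smoothly with the parameter:
rho = generator(np.pi / 2)          # the |+> state
print(np.round(rho, 3))             # ~[[0.5, 0.5], [0.5, 0.5]]
```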

- The discriminator φ, ψ and regularizer ξ_R are quantities measured on the real and fake states. These measurements can also be implemented using quantum circuits or as a linear combination of simpler measurements.

Structure of quantum Wasserstein GAN

- The system is scalable: the loss function and gradients can be evaluated efficiently on a quantum computer (using a number of basic gates polynomial in the number of parameters).

Numerical Evaluation

[Figure: learning curves for 2-, 4-, and 8-qubit pure states; 2- and 3-qubit mixed states; and learning with Gaussian noise.]

Learning unknown quantum states (pure and mixed). Parameters are randomly initialized from a normal distribution. The quality of learning is measured by the fidelity (1 indicates perfect learning).

Compressing Quantum Circuits

The QWGAN can be used to approximate complex quantum circuits with smaller circuits. A smaller circuit is trained to approximate the Choi state, which encodes the action of a quantum circuit. Closeness of the Choi states of two circuits implies closeness of the circuits' outputs on average. We show that the quantum Hamiltonian simulation circuit for the 1-d 3-qubit Heisenberg model in [2] can be approximated by a circuit of 52 gates with an average output fidelity over 0.9999 and a worst-case error 0.15, while the best-known circuit based on the product formula would need ∼11,900 gates for a worst-case error of 0.001.
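The Choi state mentioned above can be formed numerically by applying the circuit's unitary to half of a maximally entangled state. A small sketch (illustrative, not the poster's implementation):

```python
import numpy as np

def choi_state(U):
    """Choi state of a unitary U on d dimensions: apply (I ⊗ U) to the
    maximally entangled state |Phi> = (1/sqrt(d)) sum_i |i>|i>."""
    d = U.shape[0]
    phi = np.eye(d).reshape(d * d) / np.sqrt(d)   # |Phi> as a flat vector
    out = np.kron(np.eye(d), U) @ phi
    return np.outer(out, out.conj())

# Example target: the Hadamard gate. Its Choi state is a valid density
# matrix; training a small circuit's Choi state toward it matches the
# target's action on average inputs.
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
J = choi_state(H)
assert np.isclose(np.trace(J), 1.0)               # unit trace
assert np.allclose(J, J.conj().T)                 # Hermitian
```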

References

[1] Martin Arjovsky, Soumith Chintala, and Léon Bottou, Wasserstein generative adversarial networks, Proceedings of the 34th International Conference on Machine Learning, pp. 214–223, 2017, arXiv:1701.07875.

[2] Andrew M. Childs, Dmitri Maslov, Yunseong Nam, Neil J. Ross, and Yuan Su, Toward the first quantum simulation with quantum speedup, Proceedings of the National Academy of Sciences 115 (2018), no. 38, 9456–9461, arXiv:1711.10980.

[3] Edward Farhi and Hartmut Neven, Classification with quantum neural networks on near term processors, 2018, arXiv:1802.06002.

[4] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio, Generative adversarial nets, Advances in Neural Information Processing Systems 27, pp. 2672–2680, 2014, arXiv:1406.2661.

[5] Seth Lloyd and Christian Weedbrook, Quantum generative adversarial learning, Physical Review Letters 121 (2018), 040502, arXiv:1804.09139.

[6] Maziar Sanjabi, Jimmy Ba, Meisam Razaviyayn, and Jason D. Lee, On the convergence and robustness of training GANs with regularized optimal transport, Advances in Neural Information Processing Systems 31, pp. 7091–7101, Curran Associates, Inc., 2018, arXiv:1802.08249.
