
Optimization and Machine Learning in Quantum Information Theory

Peter Wittek

ICFO-The Institute of Photonic Sciences and

University of Borås

09 July 2015

Seminar at the University of Tokyo


Introduction · SDP · Elements of Machine Learning · Quantum Physics and Machine Learning · Conclusions

Introduction

Global, nonconvex optimization problems are pervasive, and quantum information theory is no exception.

Examples: finding the quantum bound of Bell inequalities, estimating the guessing probability, fidelity, ground-state energy, adaptive phase estimation...

Machine learning faces similar problems, so it is worth looking at the interaction of the two fields:

Classical optimization and learning theory applied in quantum information theory.
Using quantum algorithms, protocols, and strategies in machine learning.

Peter Wittek Optimization and Learning in Quantum Information Theory


Polynomial Optimization Problems of Noncommuting Variables

The generic form is:

p⋆ = inf_{X,φ} 〈φ, p(X)φ〉

s.t. ‖φ‖ = 1,
g_i(X) ⪰ 0, i = 1, …, m_g,
〈φ|s_i(X)|φ〉 ≥ 0, i = 1, …, m_s.


Words and Involution

Given n noncommuting variables, words are sequences of letters of x = (x1, x2, …, xn) and x* = (x1*, x2*, …, xn*). E.g., w = x1 x2*.
Involution: similar to complex conjugation on sequences of letters.
A polynomial is a linear combination of words, p = Σ_w p_w w.

Hermitian moment matrix.
Hermitian variables.
Versus the commutative case.


The Relaxation

We replace the optimization problem by the following SDP:

min_y Σ_w p_w y_w    (1)

s.t. M(y) ⪰ 0,
M(g_i y) ⪰ 0, i = 1, …, m_g,
Σ_{|w|≤2k} s_{i,w} y_w ≥ 0, i = 1, …, m_s.

Pironio, S.; Navascués, M. & Acín, A. Convergent relaxations of polynomial optimization problems with noncommuting variables. SIAM Journal on Optimization, 2010, 20, 2157–2180.


Toy Example: Polynomial Optimization

Consider the following polynomial optimization problem:

min_{x,φ} 〈φ| x1x2 + x2x1 |φ〉

such that

‖φ‖ = 1,
−x2² + x2 + 0.5 ⪰ 0,
x1² = x1,
x1x2 = x2x1.
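Under these constraints the optimum can be checked by hand: x1² = x1 together with [x1, x2] = 0 makes x1 a projector, so on a joint eigenvector the objective 2 x1 x2 becomes 2ab with a ∈ {0, 1} and b limited by the inequality constraint. A minimal stdlib sanity check (not the SDP itself; any convergent relaxation hierarchy lower-bounds this value):

```python
import math

# -b^2 + b + 0.5 >= 0 means b lies between the roots of b^2 - b - 0.5 = 0.
lo = (1 - math.sqrt(3)) / 2
hi = (1 + math.sqrt(3)) / 2

# The objective 2ab is linear in b, so the extremes a in {0,1}, b in {lo,hi}
# suffice; the minimum is attained at a = 1, b = (1 - sqrt(3))/2.
optimum = min(2 * a * b for a in (0, 1) for b in (lo, hi))
print(optimum)  # 1 - sqrt(3) ≈ -0.7321
```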


Toy Example: Moment and localizing matrices

min_x 2x1x2

such that

[ 1       x1       x2       x1x2     x2²
  x1      x1       x1x2     x1x2     x1x2²
  x2      x1x2     x2²      x1x2²    x2³
  x1x2    x1x2     x1x2²    x1x2²    x1x2³
  x2²     x1x2²    x2³      x1x2³    x2⁴   ] ⪰ 0

[ −x2² + x2 + 0.5          −x1x2² + x1x2 + 0.5x1       −x2³ + x2² + 0.5x2
  −x1x2² + x1x2 + 0.5x1    −x1x2² + x1x2 + 0.5x1       −x1x2³ + x1x2² + 0.5x1x2
  −x2³ + x2² + 0.5x2       −x1x2³ + x1x2² + 0.5x1x2    −x2⁴ + x2³ + 0.5x2²   ] ⪰ 0.


Toy Example: Corresponding SDP

min_y 2y12

such that

[ 1      y1      y2      y12      y22
  y1     y1      y12     y12      y122
  y2     y12     y22     y122     y222
  y12    y12     y122    y122     y1222
  y22    y122    y222    y1222    y2222 ] ⪰ 0

[ −y22 + y2 + 0.5        −y122 + y12 + 0.5y1      −y222 + y22 + 0.5y2
  −y122 + y12 + 0.5y1    −y122 + y12 + 0.5y1      −y1222 + y122 + 0.5y12
  −y222 + y22 + 0.5y2    −y1222 + y122 + 0.5y12   −y2222 + y222 + 0.5y22 ] ⪰ 0.


Bounding Quantum Correlations

max_{E,φ} 〈φ, Σ_ij c_ij E_i E_j φ〉

subject to

‖φ‖ = 1,
E_i E_j = δ_ij E_i ∀ i, j,
Σ_i E_i = 1,
[E_i, E_j] = 0 ∀ i, j.

Navascués, M.; Pironio, S. & Acín, A. Bounding the set of quantum correlations. Physical Review Letters, 2007, 98, 010401.
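As a concrete illustration, the CHSH Bell operator with a standard (assumed) choice of qubit observables attains Tsirelson's bound 2√2, the value the first level of this hierarchy already certifies for CHSH:

```python
import numpy as np

# Pauli observables; the specific measurement choice below is an assumption
# (it is the standard optimal one for CHSH, not fixed by the slide).
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])

A0, A1 = Z, X                      # Alice's observables
B0 = (Z + X) / np.sqrt(2)          # Bob's observables
B1 = (Z - X) / np.sqrt(2)

chsh = (np.kron(A0, B0) + np.kron(A0, B1)
        + np.kron(A1, B0) - np.kron(A1, B1))

# Largest eigenvalue = quantum bound of CHSH = 2*sqrt(2)
print(np.linalg.eigvalsh(chsh).max())  # ≈ 2.8284
```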


Another Example

Hubbard Model:

H = −t Σ_{<r,s>} [c†_r c_s + c†_s c_r] + (U/2) Σ_{<r,s>} n_r n_s,

{c_r, c†_s} = δ_rs I_r,
{c†_r, c†_s} = 0,
{c_r, c_s} = 0.
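The anticommutation relations can be checked numerically in a Jordan-Wigner representation of two modes; a sketch in which the values t = 1, U = 4 are arbitrary illustration choices, not taken from the slide:

```python
import numpy as np

I2 = np.eye(2)
sz = np.diag([1.0, -1.0])
a = np.array([[0.0, 1.0], [0.0, 0.0]])   # single-mode annihilator: |1> -> |0>

c = [np.kron(a, I2), np.kron(sz, a)]     # c_0, c_1 with the Jordan-Wigner string
cd = [m.conj().T for m in c]             # creation operators

def anticomm(A, B):
    return A @ B + B @ A

# {c_r, c_s^dag} = delta_rs * I and {c_r, c_s} = 0, as quoted above.
assert np.allclose(anticomm(c[0], cd[0]), np.eye(4))
assert np.allclose(anticomm(c[0], cd[1]), np.zeros((4, 4)))
assert np.allclose(anticomm(c[0], c[1]), np.zeros((4, 4)))

# Two-site instance of the Hamiltonian above, with n_r = c_r^dag c_r.
t, U = 1.0, 4.0
n = [cd[0] @ c[0], cd[1] @ c[1]]
H = -t * (cd[0] @ c[1] + cd[1] @ c[0]) + (U / 2) * (n[0] @ n[1])

print(np.linalg.eigvalsh(H)[0])          # ground-state energy: -t = -1.0
```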


The Complexity of Translation

Generating the moment and localizing matrices is not a trivial task.
The number of words, i.e., the monomial basis, grows exponentially with the order of the relaxation.
The number of elements in the moment matrix is the square of that.


The Problem of Translation

Ncpol2SDPA: converts a symbolic description of a (non)commutative polynomial optimization problem to a numerical SDP relaxation.
Sparsest possible output.
SDPA:

Parallel and distributed SDP solver.
Arbitrary-precision variant.

Wittek, P. Algorithm 950: Ncpol2sdpa - Sparse Semidefinite Programming Relaxations for Polynomial Optimization Problems of Noncommuting Variables. ACM Transactions on Mathematical Software, 2015, 41(3):21. arXiv:1308.6029


Large-Scale Problems

Structural redundancy is resolved on an ongoing basis.
SDPs with up to 250,000 variables have been solved.
Quantum chemistry problems.

Working towards a more scalable Hubbard model.


Generalizations

Bilevel problems.
Mixed states.
Steering.
A numerically stable way of restricting the dimension of the Hilbert space.


The Roots of Machine Learning

Statistics
Artificial intelligence
Theory of computation
Furthermore:

Optimization
Control


Assumptions, Parameters, and Statistics

Descriptive and inferential statistics.
Assumptions derive from probability theory.
Parameters enter through assumed probability distributions.

It is often assumed that the data is generated by certain probability distributions described by a finite number of unknown parameters.

Statistical models.
E.g., linear regression with a Gaussian error term.


Sample Complexity

Think metrology:
The Cramér-Rao bound and the standard quantum limit: 1/N.
The Heisenberg limit: 1/N².

We can establish guarantees on accuracy based on the sample size.


Theory of Computation

Solving problems efficiently by an algorithm.
The number of steps required to arrive at a solution.

Computational complexity.
Big-O notation: O(n).

Complexity classes: P versus NP.


Artificial Intelligence

Reasoning and deduction.
Formal logic and combinatorial explosion.

∃ clouds ⇒ rain

Knowledge representation and ontologies.
Uncertainty in AI.
Bayesian inference, Bayesian networks.


What Machine Learning Should Be About

Data-driven.
Looking for patterns.
Classes: groups of similar objects.
Mainly quantitative, but can also be qualitative.

Robust: tolerates noise.
Generalizes well beyond the training data.
We seek a balance between

computational complexity,
model complexity, and
sample complexity.


Learning Approach

Supervised: (x1, y1), …, (xn, yn).
Biomedical: recognizing cancer cells.
Recognizing handwriting.
Spam detection.

Unsupervised:
Recommendation engines.
Finding groups of similar patents.
Identifying trends in a dynamic environment.

Transductive learning.
Reinforcement learning.

[Figures: two labeled classes separated by a decision surface; unlabeled instances with a decision boundary]


VC Dimension and Model Complexity

Shattering sets of labelled points.
The XOR problem.
The VC dimension can be infinite.

The VC dimension is not perfect: see Rademacher complexity.


VC Theorem and Structural Risk Minimization

Generalize well beyond the training data.
Bounds relate generalization performance to model complexity.
As opposed to empirical risk minimization.

P( E_N(f) ≤ E + √[ (h(log(2N/h) + 1) − log(η/4)) / N ] ) = 1 − η,

where
E_N(f) is the error of the learned function f over the whole distribution, given the sample;
E is the error on the sample;
h is the VC dimension.
The VC dimension is not perfect: see Rademacher complexity.
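The bound is easy to evaluate; a stdlib sketch with illustrative values (η is the significance level, and the numbers below are made up for demonstration):

```python
import math

def vc_bound(sample_error, h, N, eta=0.05):
    """Upper confidence bound on the true error from the VC theorem above."""
    conf = math.sqrt((h * (math.log(2 * N / h) + 1) - math.log(eta / 4)) / N)
    return sample_error + conf

# The guarantee tightens as the sample size N grows and loosens with the
# VC dimension h.
for N in (100, 1000, 10000):
    print(N, round(vc_bound(0.1, h=10, N=N), 3))
```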


Risk Minimization in Supervised Learning: Support Vector Machines

Maximum margin classifiers.
Training example set:

{(x1, y1), …, (xN, yN)},

xi ∈ R^d are the data points,
yi ∈ {−1, 1} are the binary class labels.

Minimize

(1/2) uᵀu

subject to

yi(uᵀxi + b) ≥ 1, i = 1, …, N.

The output is a hyperplane; the decision function is sgn(uᵀx + b).
We had this result in the 1960s.

[Figure: two classes separated by a decision surface with a margin]
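A minimal sketch (not the deck's method, and the 2D dataset is made up): subgradient descent on the soft-margin primal (1/2)‖u‖² + C Σᵢ max(0, 1 − yᵢ(u·xᵢ + b)); with separable data and a large C this approaches the maximum-margin hyperplane, here u ≈ (1, 0), b ≈ −1.

```python
# Four separable 2D points, labels in {-1, +1}.
data = [((0.0, 0.0), -1), ((0.0, 1.0), -1), ((2.0, 0.0), 1), ((2.0, 1.0), 1)]
u, b, C, lr = [0.0, 0.0], 0.0, 10.0, 0.01

for _ in range(5000):
    gu, gb = [u[0], u[1]], 0.0                        # gradient of (1/2)u.u
    for (x, y) in data:
        if y * (u[0] * x[0] + u[1] * x[1] + b) < 1:   # hinge is active
            gu[0] -= C * y * x[0]
            gu[1] -= C * y * x[1]
            gb -= C * y
    u = [u[0] - lr * gu[0], u[1] - lr * gu[1]]
    b -= lr * gb

print([round(v, 1) for v in u], round(b, 1))
```

A constant step size oscillates near the optimum, so the output is only approximate; the learned hyperplane nonetheless separates the training set.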


Making Support Vector Machines Practical

Allow for some mixing of the classes via slack variables ξi ≥ 0.

Minimize

(1/2) uᵀu + C Σ_{i=1}^N ξi

subject to

yi(uᵀxi + b) ≥ 1 − ξi, ξi ≥ 0, i = 1, …, N.

Dual formulation:

max_α Σ_{i=1}^N αi − (1/2) Σ_{i,j} αi αj yi yj xiᵀxj

subject to

0 ≤ αi ≤ C, i = 1, …, N,
Σ_{i=1}^N αi yi = 0.

The importance of the αi and of the positive definite kernel.


Neural networks

Feedforward network:

Connection to spin glasses.
Shallow learners.
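A feedforward network with a single hidden layer already solves the XOR problem mentioned under VC dimension; a tiny sketch with hand-picked weights (threshold units chosen for clarity, not learned):

```python
def step(z):
    return 1 if z > 0 else 0

def xor_net(x1, x2):
    # One hidden layer of two threshold units, weights chosen by hand.
    h1 = step(x1 + x2 - 0.5)      # fires if at least one input is on
    h2 = step(x1 + x2 - 1.5)      # fires only if both inputs are on
    return step(h1 - h2 - 0.5)    # "at least one but not both" = XOR

print([xor_net(p, q) for p in (0, 1) for q in (0, 1)])  # [0, 1, 1, 0]
```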


Deep Learning

Many-layered artificial neural networks.

Image is from https://colah.github.io/posts/2015-01-Visualizing-Representations/


Main Research Directions

Classical learning applied to quantum physics problems.
Quantum machine learning (quantum computational learning).
Quantum learning (quantum statistical learning).

Group similar states together according to some fidelity measure.
Quantum template matching.
Learnability of unknown quantum measurements.


Classical Learning in Quantum Physics Problems

Adaptive quantum phase estimation: classical reinforcement learning.

Other attempts: measurement-based quantum computing, quantum logic gates with gradient ascent pulse engineering, simulating quantum circuits on spin systems.


Quantum Machine Learning

Classical data:
Grover's search.
Quantum associative memories.
A form of quantum support vector machines.
Hierarchical clustering.
Adiabatic optimization.

Quantum data:
Solving linear equations and self-analysis.
Quantum principal component analysis.
Quantum support vector machines.
Quantum nearest-neighbors algorithm.
Topological analysis.
Learning of unitary transformations: similar to process tomography.
Regression and transductive learning.


Learning and Grover’s search

Without decoherence, Grover's search finds an element in an unordered set quadratically faster than the classical limit.
There is a variant for finding the minimum and the maximum.
It is a plug-and-play method.
Implementations are not quite clear on the actual speedup.
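The decoherence-free case is easy to simulate directly on the amplitude vector; a sketch for N = 16 items and one marked index (both values arbitrary):

```python
import numpy as np

N, marked = 16, 11
state = np.full(N, 1.0 / np.sqrt(N))             # uniform superposition

iterations = int(round(np.pi / 4 * np.sqrt(N)))  # 3 iterations for N = 16
for _ in range(iterations):
    state[marked] *= -1.0                        # oracle flips the marked amplitude
    state = 2 * state.mean() - state             # diffusion: invert about the mean

print(state[marked] ** 2)                        # success probability ≈ 0.96
```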


Adiabatic Quantum Computing

Find the global minimum of a given function f : {0,1}^n → (0,∞), where min_x f(x) = f0 and f(x) = f0 iff x = x0.
Consider the Hamiltonian H1 = Σ_{x∈{0,1}^n} f(x)|x〉〈x|. Its ground state is |x0〉.
To find this ground state, consider the Hamiltonian H(λ) = (1 − λ)H0 + λH1.
Demonstrations: search engine ranking and binary classification.

[Figure: Hamiltonians Hmem, Hinp, and their sum Hmem + Hinp]
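The interpolation can be sketched numerically for a few bits. The slide leaves H0 unspecified; the code below assumes the usual transverse-field mixer, and the random cost function f is purely illustrative:

```python
import numpy as np

n = 3
rng = np.random.default_rng(0)
f = rng.uniform(1.0, 2.0, size=2 ** n)      # f : {0,1}^n -> (0, inf)
x0 = int(np.argmin(f))                      # the unique minimizer

X, I = np.array([[0.0, 1.0], [1.0, 0.0]]), np.eye(2)

def on_qubit(op, i):
    """op acting on qubit i, identity elsewhere."""
    out = op if i == 0 else I
    for j in range(1, n):
        out = np.kron(out, op if j == i else I)
    return out

H0 = sum((np.eye(2 ** n) - on_qubit(X, i)) / 2 for i in range(n))
H1 = np.diag(f)

# Sweep lambda and track the spectral gap; if it stays open, a slow enough
# sweep ends in the ground state |x0> of H1.
gaps = []
for lam in np.linspace(0.0, 1.0, 21):
    evals = np.linalg.eigvalsh((1 - lam) * H0 + lam * H1)
    gaps.append(evals[1] - evals[0])

ground = np.linalg.eigh(H1)[1][:, 0]        # ground state of the final Hamiltonian
print(min(gaps), int(np.argmax(np.abs(ground))) == x0)
```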


Intermezzo: Least-Squares Support Vector Machines

Minimize

(1/2) uᵀu + (γ/2) Σ_{i=1}^N ei²    (2)

subject to the equality constraints

yi(uᵀφ(xi) + b) = 1 − ei, i = 1, …, N.    (3)
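Because the constraints are equalities, training reduces to a single linear solve of the LS-SVM KKT system [[0, yᵀ], [y, Ω + I/γ]][b; α] = [0; 1] with Ω_ij = yi yj K(xi, xj). A sketch with a linear kernel φ(x) = x; the dataset and γ = 10 are illustrative choices:

```python
import numpy as np

X = np.array([[0.0, 0.0], [0.0, 1.0], [2.0, 0.0], [2.0, 1.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
gamma = 10.0

K = X @ X.T                                   # linear kernel matrix
Omega = np.outer(y, y) * K
N = len(y)

A = np.zeros((N + 1, N + 1))                  # KKT system [[0, y^T], [y, Omega + I/gamma]]
A[0, 1:], A[1:, 0] = y, y
A[1:, 1:] = Omega + np.eye(N) / gamma
sol = np.linalg.solve(A, np.concatenate(([0.0], np.ones(N))))
b, alpha = sol[0], sol[1:]

def predict(x):
    return np.sign(np.sum(alpha * y * (X @ x)) + b)

print([predict(x) for x in X])                # recovers the training labels
```

Note that every αi comes out nonzero: this is the loss of sparsity mentioned on the next slide.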


Quantum Least-Squares Support Vector Machines

Use an alternative formulation of support vector machines.
Trade-off: sparsity is lost (model complexity increases).

Core ideas:
Quantum matrix inversion is fast.
Simulation of sparse matrices is efficient.
Non-sparse density matrices reveal the eigenstructure exponentially faster than classical algorithms.


Learning a Unitary Transformation

N uses of a black-box unitary transformation, followed by K uses of the learned function.
A form of quantum process tomography.
Regression problem: unknown function == unknown quantum channel.
Double optimization: over the input state and the strategy.
Transductive learning.

[Figure: unlabeled instances alongside two labeled classes]


Generalization of Causal Networks

Hidden Markov models.
The d-separation theorem and its quantum variants.

Challenges Reichenbach's Common Cause Principle.
Sequential measurements and inference.

Entropic description to linearize the equations.

Connection to nonlocality.


Book

Monograph.
Reached the 1,009,508th bestselling position.


Summary

Nonconvex optimization is ubiquitous in both quantum information theory and machine learning.
Classical and quantum learning can help in quantum physics problems.

Robust heuristics.
Structural risk minimization.
Adaptive techniques: reinforcement learning.
