Quantum Algorithm for LSOE


  • 8/2/2019 Quantum Algorithm for LSOE

    1/24

    Quantum algorithm for linear systems of

    equations

Aram W. Harrow, Avinatan Hassidim, and Seth Lloyd

    June 2, 2009

    Abstract

Solving linear systems of equations is a common problem that arises both on its own and as a subroutine in more complex problems: given a matrix A and a vector b, find a vector x such that Ax = b. We consider the case where one doesn't need to know the solution x itself, but rather an approximation of the expectation value of some operator associated with x, e.g., x†Mx for some matrix M. In this case, when A is sparse and well-conditioned, with largest dimension N, the best classical algorithms can find x and estimate x†Mx in O(N poly log(N)) time. Here, we exhibit a quantum algorithm for this task that runs in poly(log N) time, an exponential improvement over the best classical algorithm.

Quantum computers are devices that harness quantum mechanics to perform computations in ways that classical computers cannot. For certain problems, quantum algorithms supply exponential speedups over their classical counterparts, the most famous example being Shor's factoring algorithm [1]. Few such exponential speedups are known, and those that are (such as the use of quantum computers to simulate other quantum systems [2]) have found little use outside the domain of quantum information theory. This paper presents a quantum algorithm that can give an exponential speedup for a broad range of applications.

Linear equations play an important role in virtually all fields of science and engineering. The sizes of the data sets that define the equations are growing rapidly over time, so that terabytes and even petabytes of data may need to be processed to obtain a solution. The minimum time it takes to exactly solve such a set on a classical computer scales at least as N, where N is the size of the data set. Indeed, merely to write out the solution takes time of order N.

Frequently, however, one is interested not in the full solution to the equations, but rather in computing some function of that solution, such as determining the total weight of some subset of the indices. We show that in some cases, a

Department of Mathematics, University of Bristol, Bristol, BS8 1TW, U.K.
MIT - Research Laboratory of Electronics, Cambridge, MA 02139, USA
MIT - Research Laboratory of Electronics and Department of Mechanical Engineering, Cambridge, MA 02139, USA


quantum computer can approximate the value of such a function in time which is polylogarithmic in N, an exponential speedup over the best known classical algorithms. In fact, under standard complexity-theoretic assumptions, we prove that in performing this task any classical algorithm must be exponentially slower than the algorithm presented here. Moreover, we show that our algorithm is almost the optimal quantum algorithm for the task.

We begin by presenting the main ideas behind the construction. Then we give an informal description of the algorithm, making many simplifying assumptions. Finally we present generalizations and extensions. The full proofs appear in the supporting online material [3].

Assume we are given the equation Ax = b, where b has N entries. The algorithm works by mapping b to a quantum state |b⟩ and by mapping A to a suitable quantum operator. For example, A could represent a discretized differential operator which is mapped to a Hermitian matrix with efficiently computable entries, and |b⟩ could be the ground state of a physical system, or the output of some other quantum computation. Alternatively, the entries of A and b could represent classical data stored in memory. The key requirement here, as in all quantum information theory, is the ability to perform actions in superposition (also called quantum parallelism). We present an informal discussion of superposition, and its meaning in this context. Suppose that an algorithm (which we can take to be reversible without loss of generality) exists to map input (x, 0) to output (x, f(x)). Quantum mechanics predicts that given a superposition of (x, 0) and (y, 0), evaluating this function on a quantum computer will produce a superposition of (x, f(x)) and (y, f(y)), while requiring no extra time to execute. One can view accessing a classical memory cell as applying a function whose input is the address of the cell and whose output is the contents of this cell. We require that we can access this function in superposition.

In the following paragraphs we assume that A is sparse and Hermitian.

Both assumptions can be relaxed, but this complicates the presentation. We also ignore some normalization issues (which are treated in the supplementary material). The exponential speedup is attained when the condition number of A is polylogarithmic in N, and the required accuracy is 1/poly log(N).

The algorithm maps the N entries of b onto the log₂ N qubits required to represent the state |b⟩. When A is sparse, the transformation e^{iAt}|b⟩ can be implemented efficiently. This ability to exponentiate A translates, via the well-known technique of phase estimation, into the ability to decompose |b⟩ in the eigenbasis of A and to find the corresponding eigenvalues λ_j. Informally, the state of the system after this stage is close to Σ_j β_j |u_j⟩|λ_j⟩, where {|u_j⟩} is the eigenvector basis of A, and |b⟩ = Σ_j β_j |u_j⟩. As the eigenvalue corresponding to each eigenvector is entangled with it, one can hope to apply an operation which would take Σ_j β_j |u_j⟩|λ_j⟩ to Σ_j λ_j⁻¹ β_j |u_j⟩|λ_j⟩. However, this is not a linear operation, and therefore performing it requires a unitary followed by a successful measurement. This allows us to extract the state |x⟩ = A⁻¹|b⟩. The total number of resources required to perform these transformations scales poly-logarithmically with N.
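The eigenbasis manipulation described above can be mirrored classically for intuition. The following plain-Python sketch (example 2×2 matrix and closed-form symmetric eigendecomposition; illustrative only, not the quantum procedure itself) decomposes b into coefficients β_j and applies the λ_j⁻¹ rescaling to recover x = A⁻¹b:

```python
import math

# Small symmetric matrix A and vector b (hypothetical example data).
A = [[2.0, 1.0], [1.0, 3.0]]
b = [1.0, 1.0]

# Closed-form eigendecomposition of a 2x2 symmetric matrix.
tr = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
gap = math.sqrt(tr * tr / 4 - det)
lams = [tr / 2 - gap, tr / 2 + gap]          # eigenvalues lambda_j

def unit_eigvec(lam):
    # (A - lam I) v = 0  =>  v is proportional to (A01, lam - A00).
    v = (A[0][1], lam - A[0][0])
    n = math.hypot(*v)
    return (v[0] / n, v[1] / n)

us = [unit_eigvec(lam) for lam in lams]       # eigenvectors u_j

# beta_j = <u_j | b>, then x = sum_j (beta_j / lambda_j) u_j.
betas = [u[0] * b[0] + u[1] * b[1] for u in us]
x = [sum(beta / lam * u[i] for beta, lam, u in zip(betas, lams, us))
     for i in range(2)]

# Check that A x = b.
Ax = [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]
print(x, Ax)
```

The quantum algorithm performs the same λ_j⁻¹ reweighting, but on amplitudes and without ever writing out all N components.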


This procedure yields a quantum-mechanical representation |x⟩ of the desired vector x. Clearly, to read out all the components of x would require one to perform the procedure at least N times. However, one may be interested not in x itself, but in some expectation value x†Mx, where M is some linear operator (our procedure also accommodates nonlinear operators, as described below). By mapping M to a quantum-mechanical operator, and performing the quantum measurement corresponding to M, we obtain an estimate of the expectation value ⟨x|M|x⟩ = x†Mx, as desired. A wide variety of features of the vector x can be extracted in this way, including normalization, weights in different parts of the state space, moments, etc.
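Since the measurement returns ⟨x|M|x⟩ for the normalized state, the unnormalized x†Mx differs only by the factor ‖x‖², which the algorithm obtains from the success probability of the postselection. A small illustration with made-up numbers:

```python
import math

# Hypothetical solution vector x and a symmetric operator M.
x = [3.0, 4.0]
M = [[2.0, 0.0], [0.0, 1.0]]

norm2 = sum(xi * xi for xi in x)              # ||x||^2 = 25
xhat = [xi / math.sqrt(norm2) for xi in x]    # normalized state |x>

def quad_form(M, v):
    # v^T M v for a real vector v.
    return sum(v[i] * M[i][j] * v[j] for i in range(2) for j in range(2))

expect_normalized = quad_form(M, xhat)        # what the measurement estimates
print(expect_normalized * norm2, quad_form(M, x))  # both equal x^T M x
```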

A simple example where the algorithm can be used is to see whether two different stochastic processes have similar stable states [4]. Consider a stochastic process x_t = A x_{t−1} + b, where the i-th coordinate of the vector x_t represents the abundance of item i at time t. The stable state of this distribution is given by |x⟩ = (I − A)⁻¹|b⟩. Let x′_t = A′ x′_{t−1} + b′, and |x′⟩ = (I − A′)⁻¹|b′⟩. To know whether |x⟩ and |x′⟩ are similar, we perform the SWAP test between them [5]. We note that classically finding out whether two probability distributions are similar requires at least O(√N) samples [6]. One can apply similar ideas to determine whether different pictures are similar, or to identify the relation between two pictures. In general, different problems require us to extract different features, and it is an important question to identify what are the important features to extract.
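The SWAP test referenced here accepts with probability ½ + ½|⟨x|x′⟩|², so similar stable states pass nearly always while orthogonal ones pass only half the time. A sketch with hypothetical normalized states:

```python
import math

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def swap_test_pass_prob(u, v):
    # SWAP test on states |u>, |v>: P(accept) = 1/2 + |<u|v>|^2 / 2.
    overlap = sum(a * b for a, b in zip(u, v))
    return 0.5 + 0.5 * overlap ** 2

x1 = normalize([1.0, 1.0])       # stable state of the first process (example)
x2 = normalize([1.0, 0.9])       # a similar stable state
x3 = normalize([1.0, -1.0])      # an orthogonal one

print(swap_test_pass_prob(x1, x2))  # close to 1
print(swap_test_pass_prob(x1, x3))  # exactly 1/2
```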

Estimating expectation values on solutions of systems of linear equations is quite powerful. In particular, we show that it is universal for quantum computation: anything that a quantum computer can do can be written as a set of linear equations, such that the result of the computation is encoded in some expectation value of the solution of the system. Thus, matrix inversion can be thought of as an alternate paradigm for quantum computing, along with [7, 8, 9, 10, 11, 12]. Matrix inversion has the advantage of being a natural problem that is not obviously related to quantum mechanics. We use the universality result to show that our algorithm is almost optimal and that classical algorithms cannot match its performance.

An important factor in the performance of the matrix inversion algorithm is κ, the condition number of A, i.e., the ratio between A's largest and smallest eigenvalues. As the condition number grows, A becomes closer to a matrix which cannot be inverted, and the solutions become less stable. Such a matrix is said to be ill-conditioned. Our algorithms will generally assume that the singular values of A lie between 1/κ and 1; equivalently κ⁻²I ≼ A†A ≼ I. In this case, we will achieve a runtime proportional to κ² log N. However, we also present a technique to handle ill-conditioned matrices. The run-time also scales as 1/ε if we allow an additive error of ε in the output state |x⟩. Therefore, if κ and 1/ε are both poly log(N), the run-time will also be poly log(N). In this case, our quantum algorithm is exponentially faster than any classical method.

Previous papers utilized quantum computers to perform linear algebraic operations in a limited setting [13]. Our work was extended by [14] to solving nonlinear differential equations.


We now give a more detailed explanation of the algorithm. First, we want to transform a given Hermitian matrix A into a unitary operator e^{iAt} which we can apply at will. This is possible (for example) if A is s-sparse and efficiently row computable, meaning it has at most s nonzero entries per row and, given a row index, these entries can be computed in time O(s). Under these assumptions, Ref. [15] shows how to simulate e^{iAt} in time

Õ(log(N) s² t),

where the Õ suppresses more slowly-growing terms (included in the supporting material [3]). If A is not Hermitian, define

C = ( 0   A )
    ( A†  0 ).    (1)

As C is Hermitian, we can solve the equation Cy = (b, 0)ᵀ to obtain y = (0, x)ᵀ.

    Applying this reduction if necessary, the rest of the paper assumes that A isHermitian.
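The reduction in Eq. (1) can be checked numerically: for real A the dagger is just the transpose, and solving Cy = (b, 0)ᵀ places the solution of Ax = b in the lower block of y. A sketch with an example non-symmetric matrix and a generic elimination solver:

```python
def solve(M, rhs):
    # Gauss-Jordan elimination with partial pivoting (illustrative, dense).
    n = len(M)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(n):
            if r != col:
                fct = aug[r][col] / aug[col][col]
                aug[r] = [a - fct * c for a, c in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

A = [[1.0, 2.0], [0.0, 1.0]]      # non-symmetric example matrix
b = [5.0, 1.0]

# Hermitian embedding C = [[0, A], [A^T, 0]] (real case: dagger = transpose).
n = 2
C = [[0.0] * (2 * n) for _ in range(2 * n)]
for i in range(n):
    for j in range(n):
        C[i][n + j] = A[i][j]
        C[n + i][j] = A[j][i]

y = solve(C, b + [0.0] * n)       # right-hand side is (b, 0)
x = y[n:]                          # the solution of A x = b sits in the lower block
print(y[:n], x)                    # first block ~ 0, second block solves A x = b
```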

We also need an efficient procedure to prepare |b⟩. For example, if the b_i and Σ_{i=i₁}^{i₂} |b_i|² are efficiently computable, then we can use the procedure of Ref. [16] to prepare |b⟩.

The next step is to decompose |b⟩ in the eigenvector basis, using phase estimation [17, 18]. Denote by |u_j⟩ the eigenvectors of e^{iAt}, and by λ_j the corresponding eigenvalues. Let

|Ψ₀⟩ := √(2/T) Σ_{τ=0}^{T−1} sin(π(τ + ½)/T) |τ⟩    (2)

for some large T. The coefficients of |Ψ₀⟩ are chosen (following [18]) to minimize a certain quadratic loss function which appears in our error analysis (see supplementary material for details).
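The coefficients in Eq. (2) define a properly normalized state for any T; a quick numerical check (the value of T here is an arbitrary example):

```python
import math

T = 64  # number of basis states in the clock register (example value)

# Amplitudes psi_tau = sqrt(2/T) * sin(pi * (tau + 1/2) / T) from Eq. (2).
psi = [math.sqrt(2.0 / T) * math.sin(math.pi * (tau + 0.5) / T)
       for tau in range(T)]

norm = sum(a * a for a in psi)
print(norm)  # sums to 1, so |Psi_0> is a valid quantum state
```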

Next we apply the conditional Hamiltonian evolution Σ_{τ=0}^{T−1} |τ⟩⟨τ|^C ⊗ e^{iAτt₀/T} on |Ψ₀⟩^C ⊗ |b⟩, where t₀ = O(κ/ε). Fourier transforming the first register gives the state

Σ_{j=1}^{N} Σ_{k=0}^{T−1} α_{k|j} β_j |k⟩ |u_j⟩,    (3)

where |k⟩ are the Fourier basis states, and |α_{k|j}| is large if and only if λ_j ≈ 2πk/t₀. Defining λ̃_k := 2πk/t₀, we can relabel the |k⟩ register to obtain

Σ_{j=1}^{N} Σ_{k=0}^{T−1} α_{k|j} β_j |λ̃_k⟩ |u_j⟩.

Adding an ancilla qubit and rotating conditioned on |λ̃_k⟩ yields

Σ_{j=1}^{N} Σ_{k=0}^{T−1} α_{k|j} β_j |λ̃_k⟩ |u_j⟩ ( √(1 − C²/λ̃_k²) |0⟩ + (C/λ̃_k) |1⟩ ),


where C = O(1/κ). We now undo the phase estimation to uncompute the |λ̃_k⟩. If the phase estimation were perfect, we would have α_{k|j} = 1 if λ̃_k = λ_j, and 0 otherwise. Assuming this for now, we obtain

Σ_{j=1}^{N} β_j |u_j⟩ ( √(1 − C²/λ_j²) |0⟩ + (C/λ_j) |1⟩ ).

To finish the inversion we measure the last qubit. Conditioned on seeing 1, we have the state

(1 / √(Σ_{j=1}^{N} C²|β_j|²/|λ_j|²)) Σ_{j=1}^{N} (C β_j / λ_j) |u_j⟩,

which corresponds to |x⟩ = Σ_{j=1}^{N} λ_j⁻¹ β_j |u_j⟩ up to normalization. We can determine the normalization constant from the probability of obtaining 1. Finally, we make a measurement M whose expectation value ⟨x|M|x⟩ corresponds to the feature of x that we wish to evaluate.
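Assuming ideal phase estimation, the ancilla rotation and postselection amount to multiplying each β_j by C/λ_j and renormalizing, with success probability p = C² Σ_j |β_j|²/λ_j². A classical emulation with made-up eigenvalues and coefficients (C here is chosen as the smallest eigenvalue, consistent with C = O(1/κ) when the largest eigenvalue is 1):

```python
import math

# Eigenvalues lambda_j of A and coefficients beta_j of |b> (example data).
lams = [0.25, 0.5, 1.0]
betas = [0.5, 0.5, 1.0 / math.sqrt(2.0)]   # normalized: sum |beta_j|^2 = 1

C = min(lams)  # C = O(1/kappa); here kappa = 1/min(lams) since max(lams) = 1

# Amplitude on the |1> ancilla branch for eigenvector j: beta_j * C / lambda_j.
branch = [beta * C / lam for beta, lam in zip(betas, lams)]
p_success = sum(a * a for a in branch)      # probability of measuring 1

# Postselected state coefficients, proportional to beta_j / lambda_j.
coeffs = [a / math.sqrt(p_success) for a in branch]
print(p_success, coeffs)
```

The renormalized coefficients are proportional to β_j/λ_j, i.e. to the expansion of A⁻¹b in the eigenbasis, and p_success is what amplitude amplification later boosts.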

We present an informal description of the sources of error; the exact error analysis and runtime considerations are presented in [3]. Performing the phase estimation is done by simulating e^{iAt}. Assuming that A is s-sparse, this can be done with negligible error in time nearly linear in t and quadratic in s.

The dominant source of error is phase estimation. This step errs by O(1/t₀) in estimating λ, which translates into a relative error of O(1/(λt₀)) in λ⁻¹. If λ ≥ 1/κ, taking t₀ = O(κ/ε) induces a final error of ε. Finally, we consider the success probability of the post-selection process. Since C = O(1/κ) and λ ≤ 1, this probability is at least Ω(1/κ²). Using amplitude amplification [19], we find that O(κ) repetitions are sufficient. Putting this all together, the runtime is

Õ(log(N) s² κ² / ε).

By contrast, one of the best general-purpose classical matrix inversion algorithms is the conjugate gradient method [20], which, when A is positive definite, uses O(√κ log(1/ε)) matrix-vector multiplications, each taking time O(Ns), for a total runtime of O(Ns√κ log(1/ε)). (If A is not positive definite, O(κ log(1/ε)) multiplications are required, for a total time of O(Nsκ log(1/ε)).) An important question is whether classical methods can be improved when only a summary statistic of the solution, such as x†Mx, is required. Another question is whether our quantum algorithm could be improved, say to achieve error ε in time proportional to poly log(1/ε). We show that the answer to both questions is negative, using an argument from complexity theory. Our strategy is to prove that the ability to invert matrices (with the right choice of parameters) can be used to simulate a general quantum computation.
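The classical baseline mentioned above is easy to state concretely. A minimal conjugate gradient solver for a symmetric positive definite A, in the textbook spirit of [20] (matrix, tolerance, and iteration cap are illustrative):

```python
def conjugate_gradient(A, b, tol=1e-10, max_iter=100):
    # Solve A x = b for symmetric positive definite A (lists of lists).
    n = len(b)
    mv = lambda v: [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    x = [0.0] * n
    r = b[:]                       # residual b - A x (x = 0 initially)
    p = r[:]
    rs = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        Ap = mv(p)
        alpha = rs / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        if rs_new < tol * tol:     # residual small enough: converged
            break
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

A = [[4.0, 1.0], [1.0, 3.0]]       # symmetric positive definite example
b = [1.0, 2.0]
print(conjugate_gradient(A, b))    # in exact arithmetic CG converges in <= n steps
```

Each iteration costs one O(Ns) matrix-vector product, which is where the O(Ns√κ log(1/ε)) total comes from.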

We show that a quantum circuit using n qubits and T gates can be simulated by inverting an O(1)-sparse matrix A of dimension N = O(2ⁿ). The condition number κ is O(T²) if we need A to be positive definite, or O(T) if not. This implies that a classical poly(log N, κ, 1/ε)-time algorithm would be able to simulate a poly(n)-gate quantum algorithm in poly(n) time. Such a simulation is


strongly conjectured to be false, and is known to be impossible in the presence of oracles [21].

The reduction from a general quantum circuit to a matrix inversion problem also implies that our algorithm cannot be substantially improved (under standard assumptions). If the run-time could be made polylogarithmic in κ, then any problem solvable on n qubits could be solved in poly(n) time (i.e. BQP = PSPACE), a highly implausible result [22]. Even improving our κ-dependence to κ^{1−δ} for δ > 0 would allow any time-T quantum algorithm to be simulated in time o(T); iterating this would again imply that BQP = PSPACE. Similarly, improving the error dependence to poly log(1/ε) would imply that BQP includes PP, and even minor improvements would contradict oracle lower bounds [22].

We now present the key reduction from simulating a quantum circuit to matrix inversion. Let C be a quantum circuit acting on n = log N qubits which applies T two-qubit gates U₁, . . . , U_T. The initial state is |0⟩^⊗n and the answer is determined by measuring the first qubit of the final state.

Now adjoin an ancilla register of dimension 3T and define a unitary

U = Σ_{t=1}^{T} ( |t+1⟩⟨t| ⊗ U_t + |t+T+1⟩⟨t+T| ⊗ I + |t+2T+1 mod 3T⟩⟨t+2T| ⊗ U†_{3T+1−t} ).    (4)

We have chosen U so that, for T+1 ≤ t ≤ 2T, applying U^t to |1⟩|ψ⟩ yields |t+1⟩ ⊗ U_T · · · U₁|ψ⟩. If we now define A = I − U e^{−1/T}, then κ(A) = O(T), and we can expand

A⁻¹ = Σ_{k≥0} U^k e^{−k/T}.    (5)

This can be interpreted as applying U^t for t a geometrically-distributed random variable. Since U^{3T} = I, we can assume 1 ≤ t ≤ 3T. If we measure the first register and obtain T+1 ≤ t ≤ 2T (which occurs with probability e⁻²/(1 + e⁻² + e⁻⁴) ≥ 1/10) then we are left with the second register in the state U_T · · · U₁|ψ⟩, corresponding to a successful computation. Sampling from |x⟩ allows us to sample from the results of the computation. This establishes that matrix inversion is BQP-complete, and proves our above claims about the difficulty of improving our algorithm.
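Eq. (5) is a geometric (Neumann) series, and its convergence to (I − U e^{−1/T})⁻¹ can be checked numerically for a small example unitary (here a 2×2 rotation; T, the angle, and the truncation length are illustrative):

```python
import math

T = 8
theta = 0.3
U = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]   # a small example unitary

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Partial sum S = sum_{k=0}^{K} U^k e^{-k/T}, which should approach
# (I - U e^{-1/T})^{-1} = A^{-1} since ||U e^{-1/T}|| < 1.
I = [[1.0, 0.0], [0.0, 1.0]]
S, Uk = [[0.0, 0.0], [0.0, 0.0]], I
for k in range(200):
    w = math.exp(-k / T)
    S = [[S[i][j] + w * Uk[i][j] for j in range(2)] for i in range(2)]
    Uk = matmul(Uk, U)

# Verify A S ~ I for A = I - U e^{-1/T}.
A = [[I[i][j] - math.exp(-1.0 / T) * U[i][j] for j in range(2)]
     for i in range(2)]
AS = matmul(A, S)
print(AS)  # approximately the identity matrix
```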

We now discuss ways to extend our algorithm and relax the assumptions we made while presenting it. First, we show how a broader class of matrices can be inverted, and then consider measuring other features of x and performing operations on A other than inversion.

Certain non-sparse A can also be simulated and therefore inverted; see [23] for a list of examples. It is also possible to invert non-square matrices, using the reduction presented from the non-Hermitian case to the Hermitian one.

The matrix inversion algorithm can also handle ill-conditioned matrices by inverting only the part of |b⟩ which is in the well-conditioned part of the matrix. Formally, instead of transforming |b⟩ = Σ_j β_j |u_j⟩ to |x⟩ = Σ_j λ_j⁻¹ β_j |u_j⟩, we transform it to a state which is close to Σ_{j: λ_j ≥ 1/κ} λ_j⁻¹ β_j |u_j⟩, inverting only the well-conditioned part of |b⟩ while flagging the ill-conditioned remainder.


The number of qubits required is exponentially smaller than the size of the matrix to be inverted: a quantum computer with under one hundred qubits suffices to invert a matrix with Avogadro's number of entries. Similarly, the exponential speedup of the algorithm allows it to be performed with a relatively small number of quantum operations, thereby reducing the overhead required for quantum error correction.
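The qubit count above is simple arithmetic: N amplitudes require only ⌈log₂ N⌉ qubits to index. For instance:

```python
import math

N = 6.022e23                      # roughly Avogadro's number of entries
qubits = math.ceil(math.log2(N))  # qubits needed to index N amplitudes
print(qubits)                     # 79, comfortably under one hundred
```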

Acknowledgements. We thank the W.M. Keck foundation for support, and AWH thanks them as well as MIT for hospitality while this work was carried out. AWH was also funded by the U.K. EPSRC grant QIP IRC. SL thanks R. Zecchina for encouraging him to work on this problem. D. Farmer, M. Tegmark, S. Mitter, and P. Parillo supplied useful applications for this algorithm. We are grateful as well to R. Cleve, S. Gharabian and D. Spielman for helpful discussions.

References

[1] P. W. Shor. Algorithms for quantum computation: discrete logarithms and factoring. In S. Goldwasser, editor, Proceedings: 35th Annual Symposium on Foundations of Computer Science, pages 124-134. IEEE Computer Society Press, 1994.

[2] S. Lloyd. Universal quantum simulators. Science, 273:1073-1078, August 1996.

[3] Aram W. Harrow, Avinatan Hassidim, and Seth Lloyd. Quantum algorithm for solving linear systems of equations, 2009. Supplementary material.

[4] D.G. Luenberger. Introduction to Dynamic Systems: Theory, Models, and Applications. Wiley, New York, 1979.

[5] H. Buhrman, R. Cleve, J. Watrous, and R. de Wolf. Quantum fingerprinting. Physical Review Letters, 87(16):167902, 2001.

[6] Paul Valiant. Testing symmetric properties of distributions. In STOC, pages 383-392, 2008.

[7] Edward Farhi, Jeffrey Goldstone, Sam Gutmann, Joshua Lapan, Andrew Lundgren, and Daniel Preda. A Quantum Adiabatic Evolution Algorithm Applied to Random Instances of an NP-Complete Problem. Science, 292(5516):472-475, 2001.

[8] E. Knill, R. Laflamme, and G.J. Milburn. A scheme for efficient quantum computation with linear optics. Nature, 409:46-52, 2001.

[9] D. Aharonov and I. Arad. The BQP-hardness of approximating the Jones Polynomial, 2006. arXiv:quant-ph/0605181.

[10] M.A. Nielsen, M.R. Dowling, M. Gu, and A.C. Doherty. Quantum Computation as Geometry, 2006.


[11] M.H. Freedman, M. Larsen, and Z. Wang. A modular functor which is universal for quantum computation. Comm. Math. Phys., 227(3):605-622, 2002.

[12] M.H. Freedman, A. Kitaev, M.J. Larsen, and Z. Wang. Topological quantum computation. Bull. Am. Math. Soc., 40(1):31-38, 2003.

[13] A. Klappenecker and M. Roetteler. Engineering Functional Quantum Algorithms. Phys. Rev. A, 67:010302, 2003.

[14] S. K. Leyton and T. J. Osborne. A quantum algorithm to solve nonlinear differential equations, 2008. arXiv:0812.4423.

[15] D.W. Berry, G. Ahokas, R. Cleve, and B.C. Sanders. Efficient Quantum Algorithms for Simulating Sparse Hamiltonians. Comm. Math. Phys., 270(2):359-371, 2007. arXiv:quant-ph/0508139.

[16] L. Grover and T. Rudolph. Creating superpositions that correspond to efficiently integrable probability distributions. arXiv:quant-ph/0208112.

[17] R. Cleve, A. Ekert, C. Macchiavello, and M. Mosca. Quantum Algorithms Revisited, 1997. arXiv:quant-ph/9708016.

[18] V. Buzek, R. Derka, and S. Massar. Optimal quantum clocks. Phys. Rev. Lett., 82:2207-2210, 1999. arXiv:quant-ph/9808042.

[19] G. Brassard, P. Høyer, M. Mosca, and A. Tapp. Quantum Amplitude Amplification and Estimation, volume 305 of Contemporary Mathematics Series Millennium Volume. AMS, 2002. arXiv:quant-ph/0005055.

[20] Jonathan R. Shewchuk. An Introduction to the Conjugate Gradient Method Without the Agonizing Pain. Technical Report CMU-CS-94-125, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, March 1994.

[21] Daniel R. Simon. On the power of quantum computation. SIAM J. Comp., 26:116-123, 1997.

[22] E. Farhi, J. Goldstone, S. Gutmann, and M. Sipser. A limit on the speed of quantum computation in determining parity. Phys. Rev. Lett., 81:5442-5444, 1998. arXiv:quant-ph/9802045.

[23] Andrew M. Childs. On the relationship between continuous- and discrete-time quantum walk, 2008. arXiv:0810.0312.

[24] K. Chen. Matrix preconditioning techniques and applications. Cambridge Univ. Press, Cambridge, U.K., 2005.

[25] L. Sheridan, D. Maslov, and M. Mosca. Approximating fractional time quantum evolution, 2008. arXiv:0810.3843.


[26] Vittorio Giovannetti, Seth Lloyd, and Lorenzo Maccone. Quantum random access memory. Phys. Rev. Lett., 100:160501, 2008.

[27] D. Aharonov and A. Ta-Shma. Adiabatic quantum state generation and statistical zero knowledge. In Proceedings of the thirty-fifth annual ACM Symposium on Theory of Computing (STOC), pages 20-29. ACM Press, New York, NY, USA, 2003. arXiv:quant-ph/0301023.

[28] P.C. Hansen. Rank-deficient and discrete ill-posed problems: Numerical aspects of linear inversion. SIAM, Philadelphia, PA, 1998.

[29] M. Sipser. Introduction to the Theory of Computation. International Thomson Publishing, 1996.

[30] C.H. Bennett, E. Bernstein, G. Brassard, and U. Vazirani. The strengths and weaknesses of quantum computation. SIAM Journal on Computing, 26:1510-1523, 1997.

    A Supplementary Online Material

In this appendix, we describe and analyze our algorithm in full detail. While the body of the paper attempted to convey the spirit of the procedure and left out various improvements, here we take the opposite approach and describe everything, albeit possibly in a less intuitive way. We also describe in more detail our reductions from non-Hermitian matrix inversion to Hermitian matrix inversion (Section A.4) and from a general quantum computation to matrix inversion (Section A.5).

As inputs we require a procedure to produce the state |b⟩, a method of producing the s non-zero elements of any row of A, and a choice of cutoff κ. Our run-time will be roughly quadratic in κ and our algorithm is guaranteed to be correct if ‖A‖ ≤ 1 and ‖A⁻¹‖ ≤ κ.

The condition number κ is a crucial parameter in the algorithm. Here we present one possible method of handling ill-conditioned matrices. We will define the well-conditioned part of A to be the span of the eigenspaces corresponding to eigenvalues ≥ 1/κ and the ill-conditioned part to be the rest. Our strategy will be to flag the ill-conditioned part of the matrix (without inverting it), and let the user choose how to further handle this. Since we cannot exactly resolve any eigenvalue, we can only approximately determine whether vectors are in the well- or ill-conditioned subspaces. Accordingly, we choose some κ′ > κ (say κ′ = 2κ). Our algorithm then inverts the well-conditioned part of the matrix, flags any eigenvector with eigenvalue ≤ 1/κ′ as ill-conditioned, and interpolates between these two behaviors when 1/κ′ < |λ| < 1/κ. This is described formally in the next section. We present this strategy not because it is necessarily ideal in all cases, but because it gives a concrete illustration of the key components of our algorithm.

Finally, the algorithm produces |x⟩ only up to some error ε, which is given as part of the input. We work only with pure states, and so define error in terms


of distance between vectors, i.e. ‖ |α⟩ − |β⟩ ‖² = 2(1 − Re⟨α|β⟩). Since ancilla states are produced and then imperfectly uncomputed by the algorithm, our output state will technically have high fidelity not with |x⟩ but with |x⟩|000 . . .⟩. In general we do not write down ancilla qubits in the |0⟩ state, so we write |x⟩ instead of |x⟩|000 . . .⟩ for the target state, |b⟩ instead of |b⟩|000 . . .⟩ for the initial state, and so on.

    A.1 Detailed description of the algorithm

To produce the input state |b⟩, we assume that there exists an efficiently-implementable unitary B, which when applied to |initial⟩ produces the state |b⟩, possibly along with garbage in an ancilla register. We make no further assumption about B; it may represent another part of a larger algorithm, or a standard state-preparation procedure such as [16]. Let T_B be the number of gates required to implement B. We neglect the possibility that B errs in producing |b⟩ since, without any other way of producing or verifying the state |b⟩, we have no way to mitigate these errors. Thus, any errors in producing |b⟩ necessarily translate directly into errors in the final state |x⟩.

Next, we define the state

|Ψ₀⟩ = √(2/T) Σ_{τ=0}^{T−1} sin(π(τ + ½)/T) |τ⟩    (6)

for a T to be chosen later. Using [16], we can prepare |Ψ₀⟩ up to error ε_Ψ in time poly log(T/ε_Ψ).

One other subroutine we will need is Hamiltonian simulation. Using the reductions described in Section A.4, we can assume that A is Hermitian. To simulate e^{iAt} for some t ≥ 0, we use the algorithm of [15]. If A is s-sparse, t ≤ t₀, and we want to guarantee that the error is ≤ ε_H, then this requires time

T_H = O(log(N) (log*(N))² s² t₀ 9^{√(log(s²t₀/ε_H))}) = Õ(log(N) s² t₀).    (7)

The scaling here is better than any power of 1/ε_H, which means that the additional error introduced by this step is negligible compared with the rest of the algorithm, and the runtime is almost linear in t₀. Note that this is the only step where we require that A be sparse; as there are some other types of Hamiltonians which can be simulated efficiently (e.g. [27, 15, 23]), this broadens the set of matrices we can handle.

The key subroutine of the algorithm, denoted U_invert, is defined as follows:

1. Prepare |Ψ₀⟩^C from |0⟩ up to error ε_Ψ.

2. Apply the conditional Hamiltonian evolution Σ_{τ=0}^{T−1} |τ⟩⟨τ|^C ⊗ e^{iAτt₀/T} up to error ε_H.

3. Apply the Fourier transform to the register C. Denote the resulting basis states by |k⟩, for k = 0, . . . , T−1. Define λ̃_k := 2πk/t₀.


4. Adjoin a three-dimensional register S in the state

|h(λ̃_k)⟩^S := √(1 − f(λ̃_k)² − g(λ̃_k)²) |nothing⟩^S + f(λ̃_k) |well⟩^S + g(λ̃_k) |ill⟩^S,

for functions f(λ), g(λ) defined below in (8). Here 'nothing' indicates that the desired matrix inversion hasn't taken place, 'well' indicates that it has, and 'ill' means that part of |b⟩ is in the ill-conditioned subspace of A.

5. Reverse steps 1-3, uncomputing any garbage produced along the way.

The functions f(λ), g(λ) are known as filter functions [28], and are chosen so that for some constant C > 1: f(λ) = 1/(Cκλ) for λ ≥ 1/κ, g(λ) = 1/C for λ ≤ 1/κ′ := 1/(2κ), and f(λ)² + g(λ)² ≤ 1 for all λ. Additionally, f(λ) should satisfy a certain continuity property that we will describe in the next section.

Otherwise the functions are arbitrary. One possible choice is

f(λ) = 1/(2κλ)                                        when λ ≥ 1/κ,
f(λ) = (1/2) sin( (π/2) · (λ − 1/κ′)/(1/κ − 1/κ′) )   when 1/κ > λ ≥ 1/κ′,
f(λ) = 0                                              when 1/κ′ > λ;    (8a)

g(λ) = 0                                              when λ ≥ 1/κ,
g(λ) = (1/2) cos( (π/2) · (λ − 1/κ′)/(1/κ − 1/κ′) )   when 1/κ > λ ≥ 1/κ′,
g(λ) = 1/2                                            when 1/κ′ > λ.    (8b)

If U_invert is applied to |u_j⟩ it will, up to an error we will discuss below, adjoin the state |h(λ_j)⟩. If instead we apply U_invert to |b⟩ (i.e. a superposition of different |u_j⟩), measure S and obtain the outcome 'well', then we will have approximately applied an operator proportional to A⁻¹. Let p̃ (computed in the next section) denote the success probability of this measurement. Rather than repeating 1/p̃ times, we will use amplitude amplification [19] to obtain the same results with O(1/√p̃) repetitions. To describe the procedure, we introduce two new operators:

R_succ = I^S − 2|well⟩⟨well|^S,

acting only on the S register, and

R_init = I − 2|initial⟩⟨initial|.

Our main algorithm then follows the amplitude amplification procedure: we start with U_invert B |initial⟩ and repeatedly apply U_invert B R_init B† U_invert† R_succ. Finally we measure S and stop when we obtain the result 'well'. The number of repetitions would ideally be π/(4√p̃), which in the next section we will show is O(κ). While p̃ is initially unknown, the procedure has a constant probability of success if the number of repetitions is a constant fraction of π/(4√p̃). Thus, following [19] we repeat the entire procedure with a geometrically increasing number of repetitions each time: 1, 2, 4, 8, . . . , until we have reached a power


of two that is ≥ κ. This yields a constant probability of success using ≤ 4κ repetitions.

Putting everything together, the run-time is Õ(κ(T_B + t₀ s² log(N))), where the Õ suppresses the more-slowly growing terms of (log*(N))², exp(O(√(log(t₀/ε_H)))), and poly log(T/ε_Ψ). In the next section, we will show that t₀ can be taken to be O(κ/ε) so that the total run-time is Õ(κ T_B + κ² s² log(N)/ε).

    A.2 Error Analysis

In this section we show that taking t₀ = O(κ/ε) introduces an error of ≤ ε in the final state. The main subtlety in analyzing the error comes from the post-selection step, in which we choose only the part of the state attached to the |well⟩ register. This can potentially magnify errors in the overall state. On the other hand, we may also be interested in the non-postselected state, which results from applying U_invert a single time to |b⟩. For instance, this could be used to estimate the amount of weight of |b⟩ lying in the ill-conditioned components of A. Somewhat surprisingly, we show that the error in both cases is upper-bounded by O(κ/t₀).

In this section, it will be convenient to ignore the error terms ε_H and ε_Ψ, as these can be made negligible with relatively little effort and it is the errors from phase estimation that will dominate. Let Ũ denote a version of U_invert in which everything except the phase estimation is exact. Since ‖Ũ − U_invert‖ ≤ O(ε_H + ε_Ψ), it is sufficient to work with Ũ. Define U to be the ideal version of U_invert in which there is no error in any step.

Theorem A.1 (Error bounds).

1. In the case when no post-selection is performed, the error is bounded as

‖Ũ − U‖ ≤ O(κ/t₀).    (9)

2. If we post-select on the flag register being in the space spanned by {|well⟩, |ill⟩} and define the normalized ideal state to be |x⟩ and our actual state to be |x̃⟩, then

‖ |x̃⟩ − |x⟩ ‖ ≤ O(κ/t₀).    (10)

3. If |b⟩ is entirely within the well-conditioned subspace of A and we post-select on the flag register being |well⟩, then

‖ |x̃⟩ − |x⟩ ‖ ≤ O(κ/t₀).    (11)

The third claim is often of the most practical interest, but the other two are useful if we want to work with the ill-conditioned space, or estimate its weight.

    The rest of the section is devoted to the proof of Theorem A.1. We firstshow that the third claim is a corollary of the second, and then prove the firsttwo claims more or less independently. To prove (10 assuming (9), observe thatif |b is entirely in the well-conditioned space, the ideal state |x is proportional

    13

  • 8/2/2019 Quantum Algorithm for LSOE

    14/24

to A⁻¹|b⟩ ⊗ |well⟩. Model the post-selection on |well⟩ as a post-selection first onto the space spanned by {|well⟩, |ill⟩}, followed by a post-selection onto |well⟩. By (10), the first post-selection leaves us with error O(κ/t0). This implies that the second post-selection will succeed with probability 1 − O(κ²/t0²), and therefore will increase the error by at most O(κ/t0). The final error is then O(κ/t0), as claimed in (11).
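The two-stage post-selection argument above can be illustrated numerically: projecting a slightly perturbed state back onto the block that carries the ideal state increases the error by at most a constant factor. A minimal sketch (the 8-dimensional space, the 4/4 block split, and the error size η are illustrative choices, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(3)
eta = 1e-3   # stand-in for the O(kappa/t0) error before post-selection
dim = 8      # the first 4 coordinates play the role of the well-conditioned block

ideal = np.zeros(dim)
ideal[:4] = rng.normal(size=4)
ideal /= np.linalg.norm(ideal)           # ideal state, supported on the block

pert = rng.normal(size=dim)
actual = ideal + eta * pert / np.linalg.norm(pert)
actual /= np.linalg.norm(actual)         # actual state, within ~2*eta of the ideal

post = actual.copy()
post[4:] = 0.0                           # post-select onto the block...
post /= np.linalg.norm(post)             # ...which succeeds with prob. 1 - O(eta^2)

assert np.linalg.norm(post - ideal) <= 5 * eta   # error grows by at most a constant
```

The factor 5 is a safe deterministic bound: the projection cannot increase the distance, and renormalizing a vector of norm at least 1 − 2η moves it by at most another 2η.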

Now we turn to the proof of (9). A crucial piece of the proof will be the following statement about the continuity of |h(λ)⟩.

Lemma A.2. The map λ ↦ |h(λ)⟩ is O(κ)-Lipschitz, meaning that for any λ1 ≠ λ2,

\[ \| |h(\lambda_1)\rangle - |h(\lambda_2)\rangle \| = \sqrt{2\left(1 - \mathrm{Re}\,\langle h(\lambda_1)|h(\lambda_2)\rangle\right)} \le c\,\kappa\,|\lambda_1 - \lambda_2|, \]

for some c = O(1).

Proof. Since |h(λ)⟩ is continuous everywhere and differentiable everywhere except at 1/κ and 1/κ′, it suffices to bound the norm of the derivative of |h(λ)⟩. We consider it piece by piece. When λ > 1/κ,

\[ \frac{d}{d\lambda}|h(\lambda)\rangle = \frac{1}{2\kappa^2\lambda^3}\left(1 - \frac{1}{2\kappa^2\lambda^2}\right)^{-1/2}|\mathrm{nothing}\rangle - \frac{1}{\sqrt{2}\,\kappa\lambda^2}\,|\mathrm{well}\rangle, \]

which has squared norm \(\frac{1}{2\kappa^2\lambda^4(2\kappa^2\lambda^2-1)} + \frac{1}{2\kappa^2\lambda^4} \le \kappa^2\). Next, when 1/κ′ ≤ λ < 1/κ, the derivatives of f and g are O(κ), since each varies by at most O(1) over an interval of width 1/κ − 1/κ′ = (κ′ − κ)/(κκ′), as desired.
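The O(κ)-Lipschitz behavior of |h(λ)⟩ can be sanity-checked numerically. The filter functions below are assumed, illustrative forms (f(λ) = 1/(√2 κλ) above 1/κ, matching the derivative computed above, with a sine/cosine interpolation on [1/κ′, 1/κ] and κ′ = 2κ; the paper's exact definitions appear in an earlier section):

```python
import numpy as np

kappa = 8.0
kp = 2 * kappa  # kappa' = 2*kappa (assumed); interpolation interval [1/kp, 1/kappa]

def f(l):       # filter for the well-conditioned part (assumed illustrative form)
    if l >= 1 / kappa:
        return 1 / (np.sqrt(2) * kappa * l)
    if l >= 1 / kp:
        return np.sin(np.pi / 2 * (l - 1 / kp) / (1 / kappa - 1 / kp)) / np.sqrt(2)
    return 0.0

def g(l):       # filter for the ill-conditioned part (assumed illustrative form)
    if l >= 1 / kappa:
        return 0.0
    if l >= 1 / kp:
        return np.cos(np.pi / 2 * (l - 1 / kp) / (1 / kappa - 1 / kp)) / np.sqrt(2)
    return 1 / np.sqrt(2)

def h(l):       # |h(lambda)> = sqrt(1-f^2-g^2)|nothing> + f|well> + g|ill>
    fl, gl = f(l), g(l)
    return np.array([np.sqrt(max(1 - fl * fl - gl * gl, 0.0)), fl, gl])

lams = np.linspace(1 / (2 * kp), 1.0, 20001)
ratios = [np.linalg.norm(h(a) - h(b)) / (b - a) for a, b in zip(lams[:-1], lams[1:])]
assert kappa <= max(ratios) <= 4 * kappa   # Lipschitz constant is Theta(kappa) here
```

With these choices the steepest variation occurs in the interpolation window, where the difference quotient is about πκ/√2, consistent with a Lipschitz constant cκ for c = O(1).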

Now, suppose that λ < 1/κ. Then

\[ |f(\lambda) - \tilde f(\lambda)|^2 \le \frac{2\pi^2}{t_0^2}\,\max_\lambda|f'(\lambda)|^2 = \frac{\pi^4\kappa'^2}{4t_0^2}. \]

And similarly

\[ |g(\lambda) - \tilde g(\lambda)|^2 \le \frac{2\pi^2}{t_0^2}\,\max_\lambda|g'(\lambda)|^2 = \frac{\pi^4\kappa'^2}{4t_0^2}. \]

Finally, f(λ)² + g(λ)² ≥ 1/(2κ′²) for any λ ≥ 1/κ′, implying the result.

Now we use Lemma A.3 to bound the two error contributions in (18). First we bound

\[ \frac{\mathbb{E}\!\left[(f-\tilde f)^2 + (g-\tilde g)^2\right]}{2p} \le O\!\left(\frac{\kappa^2}{t_0^2}\right)\frac{\mathbb{E}\!\left[(f^2+g^2)\,\delta^2\right]}{\mathbb{E}\!\left[f^2+g^2\right]} \le O\!\left(\frac{\kappa^2}{t_0^2}\right). \tag{25} \]

The first inequality used Lemma A.3 and the second used the fact that E[δ²] ≤ O(1) even when conditioned on an arbitrary value of λ_j (or equivalently of j).

Next,

\begin{align}
\frac{\mathbb{E}\!\left[(f-\tilde f)f + (g-\tilde g)g\right]}{p}
&\le \frac{\mathbb{E}\!\left[\sqrt{(f-\tilde f)^2 + (g-\tilde g)^2}\,\sqrt{f^2+g^2}\right]}{p} \tag{26}\\
&\le \frac{\mathbb{E}\!\left[\sqrt{\frac{\kappa^2\delta^2}{t_0^2}\,(f^2+g^2)^2}\right]}{p} \tag{27}\\
&\le O\!\left(\frac{\kappa}{t_0}\right), \tag{28}
\end{align}


where the first inequality is Cauchy–Schwarz, the second is Lemma A.3 and the last uses the fact that E[|δ|] ≤ √E[δ²] = O(1) even when conditioned on j. We now substitute (25) and (28) into (21) (and assume κ ≤ t0) to find

\[ \frac{|\tilde p - p|}{p} \le O\!\left(\frac{\kappa}{t_0}\right). \tag{29} \]

Substituting (25), (28) and (29) into (22), we find Re⟨x|x̃⟩ ≥ 1 − O(κ²/t0²), or equivalently, that ‖|x̃⟩ − |x⟩‖ ≤ O(κ/t0). This completes the proof of Theorem A.1.

    A.3 Phase estimation calculations

Here we describe, in our notation, the improved phase-estimation procedure of [18], and prove the concentration bounds on |α_{k|j}|. Adjoin the state

\[ |\Psi_0\rangle = \sqrt{\frac{2}{T}}\,\sum_{\tau=0}^{T-1}\sin\frac{\pi\left(\tau+\frac12\right)}{T}\,|\tau\rangle. \]

Apply the conditional Hamiltonian evolution Σ_τ |τ⟩⟨τ| ⊗ e^{iAτt0/T}. Assume the target state is |u_j⟩, so this becomes simply the conditional phase Σ_τ |τ⟩⟨τ| e^{iλ_jτt0/T}. The resulting state is

\[ |\Psi_{\lambda_j t_0}\rangle = \sqrt{\frac{2}{T}}\,\sum_{\tau=0}^{T-1} e^{\frac{i\lambda_j t_0 \tau}{T}}\,\sin\frac{\pi\left(\tau+\frac12\right)}{T}\,|\tau\rangle\,|u_j\rangle. \]


We now measure in the Fourier basis, and find that the inner product with \(\frac{1}{\sqrt T}\sum_{\tau=0}^{T-1} e^{2\pi i k\tau/T}\,|\tau\rangle\,|u_j\rangle\) is (defining δ := λ_j t0 − 2πk):

\begin{align}
\alpha_{k|j} &= \frac{\sqrt2}{T}\sum_{\tau=0}^{T-1} e^{\frac{i\tau}{T}(\lambda_j t_0 - 2\pi k)}\,\sin\frac{\pi\left(\tau+\frac12\right)}{T} \tag{30}\\
&= \frac{1}{i\sqrt2\,T}\sum_{\tau=0}^{T-1} e^{\frac{i\tau\delta}{T}}\left(e^{\frac{i\pi(\tau+1/2)}{T}} - e^{-\frac{i\pi(\tau+1/2)}{T}}\right) \tag{31}\\
&= \frac{1}{i\sqrt2\,T}\sum_{\tau=0}^{T-1}\left(e^{\frac{i\pi}{2T}}\,e^{\frac{i\tau(\delta+\pi)}{T}} - e^{-\frac{i\pi}{2T}}\,e^{\frac{i\tau(\delta-\pi)}{T}}\right) \tag{32}\\
&= \frac{1}{i\sqrt2\,T}\left(e^{\frac{i\pi}{2T}}\,\frac{1 - e^{i(\delta+\pi)}}{1 - e^{\frac{i(\delta+\pi)}{T}}} - e^{-\frac{i\pi}{2T}}\,\frac{1 - e^{i(\delta-\pi)}}{1 - e^{\frac{i(\delta-\pi)}{T}}}\right) \tag{33}\\
&= \frac{1 + e^{i\delta}}{i\sqrt2\,T}\left(\frac{e^{\frac{i\pi}{2T}}}{1 - e^{\frac{i(\delta+\pi)}{T}}} - \frac{e^{-\frac{i\pi}{2T}}}{1 - e^{\frac{i(\delta-\pi)}{T}}}\right) \tag{34}\\
&= \frac{(1 + e^{i\delta})\,e^{-\frac{i\delta}{2T}}}{i\sqrt2\,T}\left(\frac{1}{-2i\sin\frac{\delta+\pi}{2T}} - \frac{1}{-2i\sin\frac{\delta-\pi}{2T}}\right) \tag{35}\\
&= e^{\frac{i\delta}{2}\left(1-\frac1T\right)}\,\frac{\cos\frac\delta2}{\sqrt2\,T}\left(\frac{1}{\sin\frac{\delta+\pi}{2T}} - \frac{1}{\sin\frac{\delta-\pi}{2T}}\right) \tag{36}\\
&= e^{\frac{i\delta}{2}\left(1-\frac1T\right)}\,\frac{\cos\frac\delta2}{\sqrt2\,T}\cdot\frac{\sin\frac{\delta-\pi}{2T} - \sin\frac{\delta+\pi}{2T}}{\sin\frac{\delta+\pi}{2T}\,\sin\frac{\delta-\pi}{2T}} \tag{37}\\
&= -e^{\frac{i\delta}{2}\left(1-\frac1T\right)}\,\frac{\sqrt2\,\cos\frac\delta2\,\cos\frac{\delta}{2T}\,\sin\frac{\pi}{2T}}{T\,\sin\frac{\delta+\pi}{2T}\,\sin\frac{\delta-\pi}{2T}} \tag{38}
\end{align}

Following [18], we make the assumption that 2π ≤ |δ| ≤ T/10. Further using sin θ ≥ θ − θ³/6 and ignoring phases, we find that

\[ |\alpha_{k|j}| \le \frac{4\sqrt2\,\pi}{(\delta^2-\pi^2)\left(1 - \frac{\delta^2+\pi^2}{3T^2}\right)} \le \frac{8\pi}{\delta^2}. \tag{39} \]

Thus |α_{k|j}|² ≤ 64π²/δ⁴ whenever |k − λ_j t0/2π| ≥ 1.
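The tail bound (39) can be checked directly by preparing the sine-windowed state, applying the conditional phase, and computing the Fourier-basis amplitudes numerically (the values of T and λ_j t0 below are arbitrary test choices):

```python
import numpy as np

T = 512
lam_t0 = 2 * np.pi * 20.3   # lambda_j * t0, chosen so the phase falls between grid points
tau = np.arange(T)

# Sine-windowed initial state with the conditional phase applied
psi = np.sqrt(2 / T) * np.sin(np.pi * (tau + 0.5) / T) * np.exp(1j * lam_t0 * tau / T)
assert np.isclose(np.linalg.norm(psi), 1.0)

checked = 0
for k in range(T):
    # amplitude alpha_{k|j}: overlap with the k-th Fourier basis vector
    alpha = np.vdot(np.exp(2j * np.pi * k * tau / T) / np.sqrt(T), psi)
    delta = lam_t0 - 2 * np.pi * k
    # test the bound in the regime it was derived for: 2*pi <= |delta| <= T/10
    if abs(k - lam_t0 / (2 * np.pi)) >= 1 and 2 * np.pi <= abs(delta) <= T / 10:
        assert abs(alpha) <= 8 * np.pi / delta**2
        checked += 1
assert checked > 0   # the check was not vacuous
```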

    A.4 The non-Hermitian case

Suppose A ∈ C^{M×N} with M < N. Generically Ax = b is now underconstrained. Let the singular value decomposition of A be

\[ A = \sum_{j=1}^{M} \lambda_j\,|u_j\rangle\langle v_j|, \]

with |u_j⟩ ∈ C^M, |v_j⟩ ∈ C^N and λ1 ≥ ··· ≥ λM ≥ 0. Let V = span{|v1⟩, ..., |vM⟩}. Define

\[ H = \begin{pmatrix} 0 & A \\ A^\dagger & 0 \end{pmatrix}. \tag{40} \]


H is Hermitian with eigenvalues ±λ1, ..., ±λM, corresponding to eigenvectors |w_j^±⟩ := (1/√2)(|0⟩|u_j⟩ ± |1⟩|v_j⟩). It also has N − M zero eigenvalues, corresponding to |1⟩ tensored with the orthogonal complement of V. To run our algorithm we use the input |0⟩|b⟩. If |b⟩ = Σ_{j=1}^M β_j|u_j⟩, then

\[ |0\rangle|b\rangle = \sum_{j=1}^{M} \beta_j\,\frac{1}{\sqrt2}\left(|w_j^+\rangle + |w_j^-\rangle\right), \]

and running the inversion algorithm yields a state proportional to

\[ H^{-1}|0\rangle|b\rangle = \sum_{j=1}^{M} \beta_j\,\lambda_j^{-1}\,\frac{1}{\sqrt2}\left(|w_j^+\rangle - |w_j^-\rangle\right) = \sum_{j=1}^{M} \beta_j\,\lambda_j^{-1}\,|1\rangle|v_j\rangle. \]

Dropping the initial |1⟩, this defines our solution |x⟩. Note that our algorithm does not produce any component in the orthogonal complement of V, although doing so would have also yielded valid solutions. In this sense, it could be said to be finding the |x⟩ that minimizes ⟨x|x⟩ while solving A|x⟩ = |b⟩.
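The claim that the Hermitian embedding returns the minimum-norm solution can be checked against the Moore–Penrose pseudoinverse. A small sketch using the classical pseudoinverse of H in place of the quantum inversion (the sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 3, 5                     # M < N: underconstrained system
A = rng.normal(size=(M, N))
b = rng.normal(size=M)

# Hermitian embedding H = [[0, A], [A^dagger, 0]] acting on C^{M+N}
H = np.block([[np.zeros((M, M)), A],
              [A.conj().T, np.zeros((N, N))]])

# Classical stand-in for the quantum inversion: apply the pseudoinverse of H
# to |0>|b> = (b, 0); the solution sits in the last N coordinates.
y = np.linalg.pinv(H) @ np.concatenate([b, np.zeros(N)])
x = y[M:]

assert np.allclose(y[:M], 0)                  # no component left on the |0> register
assert np.allclose(A @ x, b)                  # x solves the system
assert np.allclose(x, np.linalg.pinv(A) @ b)  # and is the minimum-norm solution
```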

On the other hand, if M > N then the problem is overconstrained. Let U = span{|u1⟩, ..., |uN⟩}. The equation A|x⟩ = |b⟩ is satisfiable only if |b⟩ ∈ U. In this case, applying H⁻¹ to |0⟩|b⟩ will return a valid solution. But if |b⟩ has some weight outside U, then |0⟩|b⟩ will have some weight in the zero eigenspace of H, which will be flagged as ill-conditioned by our algorithm. We might choose to ignore this part, in which case the algorithm will return an |x⟩ satisfying A|x⟩ = Σ_{j=1}^N |u_j⟩⟨u_j| · |b⟩.

A.5 Optimality

In this section, we explain in detail two important ways in which our algorithm is optimal up to polynomial factors. First, no classical algorithm can perform the same matrix inversion task; and second, our dependence on condition number and accuracy cannot be substantially improved.

We present two versions of our lower bounds; one based on complexity theory, and one based on oracles. We say that an algorithm solves matrix inversion if its input and output are

1. Input: An O(1)-sparse matrix A, specified either via an oracle or via a poly(log(N))-time algorithm that returns the nonzero elements in a row.

2. Output: A bit that equals one with probability ⟨x|M|x⟩, where M = |0⟩⟨0| ⊗ I_{N/2} corresponds to measuring the first qubit and |x⟩ is a normalized state proportional to A⁻¹|b⟩ for |b⟩ = |0⟩.

Further, we demand that A is Hermitian and that I/κ ⪯ A ⪯ I. We take ε to be a fixed constant, such as 1/100, and deal with the dependency on ε later. If the algorithm works when A is specified by an oracle, we say that it is relativizing. Even though this is a very weak definition of inverting matrices, this task is still hard for classical computers.


Theorem A.4.

1. If a quantum algorithm exists for matrix inversion running in time κ^{1−δ}·poly log(N) for some δ > 0, then BQP = PSPACE.

2. No relativizing quantum algorithm can run in time κ^{1−δ}·poly log(N).

3. If a classical algorithm exists for matrix inversion running in time poly(κ, log(N)), then BPP = BQP.

Given an n-qubit, T-gate quantum computation, define U as in (4). Define

\[ A = \begin{pmatrix} 0 & I - U e^{-1/T} \\ I - U^\dagger e^{-1/T} & 0 \end{pmatrix}. \tag{41} \]

Note that A is Hermitian, has condition number κ ≤ 2T, and dimension N = 6T·2^n. Solving the matrix inversion problem corresponding to A produces an ε-approximation of the quantum computation corresponding to applying U1, ..., U_T, assuming we are allowed to make any two-outcome measurement on the output state |x⟩. Recall that

\[ \left(I - U e^{-1/T}\right)^{-1} = \sum_{k \ge 0} U^k e^{-k/T}. \tag{42} \]
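The geometric series (42) is easy to verify numerically, with a small random unitary standing in for U (illustrative only; in the paper U encodes the circuit):

```python
import numpy as np

rng = np.random.default_rng(0)
T = 8
# Random 4x4 unitary via QR decomposition (a stand-in for the circuit unitary U)
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))

direct = np.linalg.inv(np.eye(4) - Q * np.exp(-1 / T))
# Truncated series sum_{k>=0} U^k e^{-k/T}; terms decay like e^{-k/T}
series = sum(np.linalg.matrix_power(Q, k) * np.exp(-k / T) for k in range(400))
assert np.allclose(direct, series)
```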

We define a measurement M0 which outputs one if the time register t is between T+1 and 2T and the original measurement's output was one. As Pr(T+1 ≤ t ≤ 2T) = e^{−2}/(1 + e^{−2} + e^{−4}) and is independent of the result of the measurement M, we can estimate the expectation of M with accuracy ε by iterating this procedure O(1/ε²) times.

In order to perform the simulation when measuring only the first qubit, define

\[ B = \begin{pmatrix} I_{6T\cdot 2^n} & 0 \\ 0 & I_{3T\cdot 2^n} - U e^{-1/T} \end{pmatrix}. \tag{43} \]

We now define B̃ to be the matrix B after permuting the rows and columns such that if

\[ C = \begin{pmatrix} 0 & \tilde B \\ \tilde B^\dagger & 0 \end{pmatrix} \tag{44} \]

and C|y⟩ = (b, 0)ᵀ, then measuring the first qubit of |y⟩ would correspond to performing M0 on |x⟩. The condition number of C is equal to that of A, but the dimension is now N = 18T·2^n.

Now suppose we could solve matrix inversion in time κ^{1−δ}(log(N)/ε)^{c1} for constants c1 ≥ 2, δ > 0. Given a computation with T ≤ 2^{2n}/18, let m = ⌈2 log(2n)/log(log(n))⌉ and ε = 1/100m. For sufficiently large n, ε ≥ 1/log(n). Then

\[ \kappa^{1-\delta}\left(\frac{\log(N)}{\varepsilon}\right)^{c_1} \le (2T)^{1-\delta}\left(\frac{3n}{\varepsilon}\right)^{c_1} \le T^{1-\delta}\,c_2\,(n\log(n))^{c_1}, \]

where c2 = 2·3^{c1} is another constant.


We now have a recipe for simulating an n_i-qubit, T_i-gate computation with n_{i+1} = n_i + log(18T_i) qubits, T_{i+1} = T_i^{1−δ}·c3·(n_i log(n_i))^{c1} gates, and error ε. Our strategy is to start with an n0-qubit, T0-gate computation and iterate this simulation m times, ending with an n_m-qubit, T_m-gate computation with error mε ≤ 1/100. We stop iterating either after m steps, or whenever T_{i+1} > T_i^{1−δ/2}, whichever comes first. In the latter case, we set ℓ equal to the first i for which T_{i+1} > T_i^{1−δ/2}.

In the case where we iterated the reduction m times, we have T_i ≤ T0^{(1−δ/2)^i} ≤ 2^{(1−δ/2)^i·2n0}, implying that T_m ≤ n0. On the other hand, suppose we stop for some ℓ < m. For each i < ℓ we have T_{i+1} ≤ T_i^{1−δ/2}. Thus T_i ≤ 2^{(1−δ/2)^i·2n0} for each i ≤ ℓ. This allows us to bound

\[ n_\ell = n_0 + \sum_{j=0}^{\ell-1}\log(18T_j) \le n_0 + 2n_0\sum_{j=0}^{\ell-1}\left(1-\frac{\delta}{2}\right)^j + \ell\log(18) \le \left(\frac{4}{\delta} + 1\right)n_0 + m\log(18). \]

Defining yet another constant, this implies that T_{ℓ+1} ≤ T_ℓ^{1−δ}·c4·(n0 log(n0))^{c1}. Combining this with our stopping condition T_{ℓ+1} > T_ℓ^{1−δ/2}, we find that

\[ T_\ell \le \left(c_4\,(n_0\log(n_0))^{c_1}\right)^{2/\delta} = \mathrm{poly}(n_0). \]

Therefore, the runtime of the procedure is polynomial in n0 regardless of the reason we stopped iterating. The number of qubits used increases only linearly.
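The stopping argument can be illustrated by iterating the recurrence directly, treating the c4·(n0 log(n0))^{c1} factor as a single constant C (an assumed simplification for illustration):

```python
# Illustrative iteration of the recurrence T_{i+1} = T_i^{1-delta} * C, where C
# stands in for the c4*(n0*log(n0))^{c1} factor. Once the stopping condition
# T_{i+1} > T_i^{1-delta/2} triggers, we must already have C > T^{delta/2},
# i.e. T < C^{2/delta}, which is polynomial in the (n0 log n0) factor.
delta, C = 0.5, 100.0
T = 1e30                       # an enormous starting gate count
while True:
    T_next = T ** (1 - delta) * C
    if T_next > T ** (1 - delta / 2):
        break
    T = T_next
assert T <= C ** (2 / delta)   # here: T <= 100^4 = 1e8
```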

Recall that the TQBF (totally quantified Boolean formula satisfiability) problem is PSPACE-complete, meaning that any k-bit problem instance for any language in PSPACE can be reduced to a TQBF problem of length n = poly(k) (see [29] for more information). The formula can be solved in time T ≤ 2^{2n}/18 by exhaustive enumeration over the variables. Thus a PSPACE computation can be solved in quantum polynomial time. This proves the first part of the theorem.

To incorporate oracles, note that our construction of U in (4) could simply replace some of the U_i's with oracle queries. This preserves sparsity, although we now need the rows of A to be specified by oracle queries. We can iterate the speedup in exactly the same manner. However, we would conclude with the ability to solve the OR problem on 2^n inputs in poly(n) time and queries. This, of course, is impossible [30], and so the purported relativizing quantum algorithm must also be impossible.

The proof of part 3 of Theorem A.4 simply formulates a poly(n)-time, n-qubit quantum computation as a κ = poly(n), N = 2^n·poly(n) matrix inversion problem and applies the classical algorithm which we have assumed exists.

Theorem A.4 established the universality of the matrix inversion algorithm. To extend the simulation to problems which are not decision problems, note that the algorithm actually supplies us with |x⟩ (up to some accuracy). For example, instead of measuring an observable M, we can measure |x⟩ in the computational basis, obtaining the result i with probability |⟨i|x⟩|². This gives a way to simulate quantum computation by classical matrix inversion algorithms. In turn, this can be used to prove lower bounds on classical matrix inversion algorithms,


where we assume that the classical algorithms output samples according to this distribution.

Theorem A.5. No relativizing classical matrix inversion algorithm can run in time N^α·2^{βκ} unless 3α + 4β ≥ 1/2. If we consider matrix inversion algorithms that work only on positive definite matrices, then the N^α·2^{βκ} bound becomes N^α·2^{β√κ}.

Proof. Recall Simon's problem [21], in which we are given f : Z2^n → {0,1}^{2n} such that f(x) = f(y) iff x + y = a for some a ∈ Z2^n that we would like to find. It can be solved by running a 3n-qubit, (2n+1)-gate quantum computation O(n) times and performing a poly(n) classical computation. The randomized classical lower bound is Ω(2^{n/2}), from birthday arguments. Converting Simon's algorithm to a matrix A yields κ ≤ 4n and N ≤ 36n·2^{3n}. The run-time is N^α·2^{βκ} ≤ 2^{(3α+4β)n}·poly(n). To avoid violating the oracle lower bound, we must have 3α + 4β ≥ 1/2, as required.

Next, we argue that the accuracy of our algorithm cannot be substantially improved. Returning now to the problem of estimating ⟨x|M|x⟩, we recall that classical algorithms can approximate this to accuracy ε in time O(N·poly log(1/ε)). This poly log(1/ε) dependence arises because writing the vectors |b⟩ and |x⟩ as bit strings means that adding one additional bit doubles the accuracy. However, sampling-based algorithms such as ours cannot hope for a better than poly(1/ε) dependence of the run-time on the error. Thus proving that our algorithm's error performance cannot be improved will require a slight redefinition of the problem.

Define the matrix inversion estimation problem as follows. Given A, b, M, ε, κ, s with ‖A‖ ≤ 1, ‖A⁻¹‖ ≤ κ, A s-sparse and efficiently row-computable, |b⟩ = |0⟩ and M = |0⟩⟨0| ⊗ I_{N/2}: output a number that is within ε of ⟨x|M|x⟩ with probability ≥ 2/3, where |x⟩ is the unit vector proportional to A⁻¹|b⟩.

The algorithm presented in our paper can be used to solve this problem with a small amount of overhead. By producing |x⟩ up to trace distance ε/2 in time Õ(log(N)κ²s²/ε), we can obtain a sample of a bit which equals one with probability μ satisfying |μ − ⟨x|M|x⟩| ≤ ε/2. Since the variance of this bit is at most 1/4, taking O(1/ε²) samples gives us a 2/3 probability of obtaining an estimate within ε/2 of μ. Thus quantum computers can solve the matrix inversion estimation problem in time Õ(log(N)κ²s²/ε³).
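The sampling overhead can be made concrete with a Chebyshev-style count (the value p below is an arbitrary stand-in for ⟨x|M|x⟩):

```python
import numpy as np

eps = 0.05
p = 0.37                        # arbitrary stand-in for <x|M|x>
n = int(np.ceil(3 / eps ** 2))  # O(1/eps^2) samples

# Chebyshev: a {0,1}-valued sample has variance <= 1/4, so the mean of n samples
# deviates from its expectation by more than eps/2 with probability at most
# (1/(4n)) / (eps/2)^2 = 1/(n*eps^2) <= 1/3.
assert 1 / (n * eps ** 2) <= 1 / 3

rng = np.random.default_rng(2)
est = rng.binomial(n, p) / n    # simulate n samples of the measurement bit
assert abs(est - p) < 3 * eps   # a much looser check than eps/2, so it passes comfortably
```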

We can now show that the error dependence of our algorithm cannot be substantially improved.

Theorem A.6.

1. If a quantum algorithm exists for the matrix inversion estimation problem running in time poly(κ, log(N), log(1/ε)), then BQP = PP.

2. No relativizing quantum algorithm for the matrix inversion estimation problem can run in time N^α·poly(κ)/ε^β unless α + β ≥ 1.


Proof. 1. A complete problem for the class PP is to count the number of satisfying assignments to a SAT formula. Given such a formula φ, a quantum circuit can apply it to a superposition of all 2^n variable assignments, generating the state

\[ 2^{-n/2}\sum_{z_1,\ldots,z_n \in \{0,1\}} |z_1,\ldots,z_n\rangle\,|\varphi(z_1,\ldots,z_n)\rangle. \]

The probability of obtaining 1 when measuring the last qubit is equal to the number of satisfying assignments divided by 2^n. A matrix inversion estimation procedure which runs in time poly log(1/ε) would enable us to estimate this probability to accuracy ε = 2^{−2n} in time poly(log(2^{2n})) = poly(n). This would imply that BQP = PP, as required.
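The counting-to-probability reduction in part 1 can be checked on a toy instance (the 3-variable formula φ below is a hypothetical example):

```python
import numpy as np
from itertools import product

# Toy 3-variable formula phi(z) = (z1 or z2) and (not z3) -- a hypothetical example.
def phi(z1, z2, z3):
    return int((z1 or z2) and not z3)

n = 3
amps = np.zeros((2 ** n, 2))            # amplitude of |z>|phi(z)>
for idx, z in enumerate(product([0, 1], repeat=n)):
    amps[idx, phi(*z)] = 2 ** (-n / 2)  # uniform superposition over assignments
p_one = np.sum(amps[:, 1] ** 2)         # probability of measuring 1 on the last qubit
count = sum(phi(*z) for z in product([0, 1], repeat=n))
assert np.isclose(p_one, count / 2 ** n)
```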

2. Now assume that φ(z) is provided by the output of an oracle. Let C denote the number of z ∈ {0,1}^n such that φ(z) = 1. From [22], we know that determining the parity of C requires Ω(2^n) queries to φ. However, exactly determining C reduces to the matrix inversion estimation problem with N = 2^n·poly(n), κ = O(n²) and ε = 2^{−n}/2. By assumption we can solve this in time 2^{(α+β)n}·poly(n), implying that α + β ≥ 1.
