Quantum Kolmogorov complexity based on classical descriptions

2464 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 6, SEPTEMBER 2001

Paul M. B. Vitányi

Abstract: We develop a theory of the algorithmic information in bits contained in an individual pure quantum state. This extends classical Kolmogorov complexity to the quantum domain retaining classical descriptions. Quantum Kolmogorov complexity coincides with the classical Kolmogorov complexity on the classical domain. Quantum Kolmogorov complexity is upper-bounded and can be effectively approximated from above under certain conditions. With high probability, a quantum object is incompressible. Upper and lower bounds of the quantum complexity of multiple copies of individual pure quantum states are derived and may shed some light on the no-cloning properties of quantum states. In the quantum situation complexity is not subadditive. We discuss some relations with “no-cloning” and “approximate cloning” properties.

Index Terms: Algorithmic information theory (quantum), classical descriptions of quantum states, information theory (quantum), Kolmogorov complexity (quantum), quantum cloning.

I. INTRODUCTION

QUANTUM information theory, the quantum-mechanical analog of classical information theory [6], is experiencing a renaissance [2] due to the rising interest in the notion of quantum computation and the possibility of realizing a quantum computer [16]. While Kolmogorov complexity [12] is the accepted absolute measure of information content in an individual classical finite object, a similar absolute notion is needed for the information content of an individual pure quantum state. One motivation is to extend probabilistic quantum information theory to Kolmogorov’s absolute individual notion. Another reason is to try and duplicate the success of classical Kolmogorov complexity as a general proof method in applications ranging from combinatorics to the analysis of algorithms, and from pattern recognition to learning theory [13]. We propose a theory of quantum Kolmogorov complexity based on classical descriptions and derive the results given in the abstract. A preliminary partial version appeared as [19].

Manuscript received April 13, 2000; revised February 8, 2001. This work was supported in part by the EU fifth framework project QAIP, IST-1999-11234, the NoE QUIPROCONE IST-1999-29064, the ESF QiT Programme, and the EU Fourth Framework BRA NeuroCOLT II Working Group EP 27150. Part of this work was done during the author’s 1998 stay at the Tokyo Institute of Technology, Tokyo, Japan, as Gaikoku-Jin Kenkyuin at INCOCSAT, and appeared in a preliminary version in Proc. 15th IEEE Conf. Computational Complexity, pp. 263–270, 2000, archived as quant-ph/9907035.

The author is with the Centrum voor Wiskunde en Informatica (CWI), 1098 SJ Amsterdam, The Netherlands, and has an appointment at the University of Amsterdam, Amsterdam, The Netherlands (e-mail: [email protected]).

Communicated by P. Shor, Associate Editor for Quantum Information Theory.

Publisher Item Identifier S 0018-9448(01)07018-3.

What are the problems and choices to be made in developing a theory of quantum Kolmogorov complexity? Quantum theory assumes that every complex vector of unit length represents a realizable pure quantum state [17]. There arises the question of how to design the equipment that prepares such a pure state. While there are continuously many pure states in a finite-dimensional complex vector space (corresponding to all vectors of unit length), we can finitely describe only a countable subset. Imposing effectiveness on such descriptions leads to constructive procedures. The most general such procedures satisfying universally agreed-upon logical principles of effectiveness are quantum Turing machines [3]. To define quantum Kolmogorov complexity by way of quantum Turing machines leaves essentially two options:

1) we want to describe every quantum superposition exactly; or

2) we want to take into account the number of bits per qubit in the specification as well as the accuracy of the quantum state produced.

We have to deal with the following three problems.

• There are continuously many quantum Turing machines.

• There are continuously many pure quantum states.

• There are continuously many qubit descriptions.

There are uncountably many quantum Turing machines only if we allow arbitrary real rotations in the definition of machines. Then, a quantum Turing machine can only be universal in the sense that it can approximate the computation of an arbitrary machine [3]. In descriptions using universal quantum Turing machines we would have to account for the closeness of approximation, the number of steps required to get this precision, and the like. In contrast, if we fix the rotation of all contemplated machines to a single primitive rotation $\theta$ with $\cos\theta = 3/5$ and $\sin\theta = 4/5$, then there are only countably many Turing machines and the universal machine simulates the others exactly [1]. Every quantum Turing machine computation, using arbitrary real rotations to obtain a target pure quantum state, can be approximated to every precision by machines with fixed rotation $\theta$ but in general cannot be simulated exactly, just as in the case of the simulation of arbitrary quantum Turing machines by a universal quantum Turing machine. Since exact simulation is impossible by a fixed universal quantum Turing machine anyhow, but arbitrarily close approximations are possible by Turing machines using a fixed rotation such as $\theta$, we are motivated to fix $Q_1, Q_2, \ldots$ as a standard enumeration of quantum Turing machines using only rotation $\theta$.
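As a sanity check on the fixed primitive rotation, the following minimal sketch uses exact rational arithmetic to confirm that a rotation with $\cos\theta = 3/5$ and $\sin\theta = 4/5$ is unitary and that none of its first 30 powers is the identity. The values 3/5 and 4/5, the 2×2 matrix layout, and the names `R`, `matmul`, `transpose`, and `power` are assumptions made for illustration, not definitions taken from this paper.

```python
from fractions import Fraction

# Assumed values and matrix layout of the primitive rotation (illustrative only).
COS, SIN = Fraction(3, 5), Fraction(4, 5)
R = [[COS, -SIN],
     [SIN, COS]]
IDENTITY = [[Fraction(1), Fraction(0)],
            [Fraction(0), Fraction(1)]]


def matmul(a, b):
    """Multiply two 2x2 matrices with exact rational entries."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    """Transpose a 2x2 matrix."""
    return [[a[j][i] for j in range(2)] for i in range(2)]


def power(a, k):
    """Return the k-th power of a 2x2 matrix (k >= 0)."""
    result = IDENTITY
    for _ in range(k):
        result = matmul(result, a)
    return result


# R is real, so unitarity reduces to R^T R = I.
is_unitary = matmul(transpose(R), R) == IDENTITY

# Every power of R has rational entries; check whether any of the first 30 is I.
returns_to_identity = any(power(R, k) == IDENTITY for k in range(1, 31))
```

Because all entries stay rational, every amplitude produced by repeated use of this one rotation has an exact finite description, which fits the countability argument in the text.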

0018–9448/01$10.00 © 2001 IEEE

Our next question is whether we want programs (descriptions) to be in classical bits or in qubits. The intuitive notion of computability requires the programs to be classical. Namely, to prepare a quantum state requires a physical apparatus that “computes” this quantum state from classical specifications. Since such specifications have effective descriptions, every quantum state that can be prepared can be described effectively in descriptions consisting of classical bits. Descriptions consisting of arbitrary pure quantum states allow noncomputable (or hard to compute) information to be hidden in the bits of the amplitudes. In Definition 4, we call a pure quantum state directly computable if there is a (classical) program such that the universal quantum Turing machine computes that state from the program and then halts in an appropriate fashion. In a computational setting, we naturally require that directly computable pure quantum states can be prepared. By repeating the preparation we can obtain arbitrarily many copies of the pure quantum state.

If descriptions are not effective then we are not going to use them in our algorithms except possibly on inputs from an “unprepared” origin. Every quantum state used in a quantum computation arises from some classical preparation or is possibly captured from some unknown origin. If the latter is the case, then we can consume it as conditional side information or an oracle.

Restricting ourselves to an effective enumeration of quantum Turing machines and classical descriptions to describe by approximation continuously many pure quantum states is reminiscent of the construction of continuously many real numbers from Cauchy sequences of rational numbers, the rationals being effectively enumerable.

Kolmogorov Complexity: We summarize some basic definitions in Appendix I (see also [20]) in order to establish notations and recall the notion of shortest effective descriptions. More details can be found in the textbook [13]. Shortest effective descriptions are “effective” in the sense that they are programs: we can compute the described objects from them. Unfortunately [12], there is no algorithm that computes the shortest program and then halts, that is, there is no general method to compute the length of a shortest description (the Kolmogorov complexity) from the object being described. This obviously impedes actual use. Instead, one needs to consider computable approximations to shortest descriptions, for example, by restricting the allowable approximation time. Apart from computability and approximability, there is another property of descriptions that is important to us. A set of descriptions is prefix-free if no description is a proper prefix of another description. Such a set is called a prefix code. Since a code message consists of concatenated codewords, we have to parse it into its constituent codewords to retrieve the encoded source message. If the code is uniquely decodable, then every code message can be decoded in only one way. The importance of prefix codes stems from the fact that i) they are uniquely decodable from left to right without backing up, and ii) for every uniquely decodable code there is a prefix code with the same length codewords. Therefore, we can restrict ourselves to prefix codes. In our setting, we require the set of programs to be prefix-free and hence to be a prefix code for the objects being described. It is well known that with every prefix code there corresponds a probability distribution $P(\cdot)$ such that the prefix code is a Shannon–Fano code$^1$ that assigns prefix code length $l_x = -\log P(x)$ to $x$, irrespective of the regularities in $x$. For example, with the uniform distribution $P(x) = 2^{-n}$ on the set of $n$-bit source words, the Shannon–Fano codeword length of an all-zero source word equals the codeword length of a truly irregular source word. The Shannon–Fano code gives an expected codeword length close to the entropy, and, by Shannon’s Noiseless Coding Theorem, it possesses the optimal expected codeword length. But the Shannon–Fano code is not optimal for individual elements: it does not take advantage of the regularity in some elements to encode those shorter. In contrast, one can view the Kolmogorov complexity $K(x)$ as the codeword length of the shortest program $x^*$ for $x$, the set of shortest programs constituting the Shannon–Fano code of the so-called “universal distribution” $\mathbf{m}(x) = 2^{-K(x)}$. The code consisting of the shortest programs has the remarkable property that it achieves i) an expected code length that is about optimal since it is close to the entropy and, simultaneously, ii) every individual object is coded as short as is effectively possible, that is, squeezing out all regularity. In this sense, the set of shortest programs constitutes the optimal effective Shannon–Fano code, induced by the optimal effective distribution (the universal distribution).

$^1$ In what follows, “log” denotes the binary logarithm.
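The contrast between distribution-based code lengths and regularity can be made concrete with a small sketch. It computes Shannon–Fano codeword lengths $\lceil -\log P(x) \rceil$ for a uniform distribution on 8-bit words and for a skewed four-word distribution. The function name `shannon_fano_length`, the choice $n = 8$, and the skewed distribution are hypothetical and chosen only for illustration.

```python
import math


def shannon_fano_length(probability):
    """Codeword length ceil(-log2 P(x)) for a source word of probability P(x)."""
    return math.ceil(-math.log2(probability))


n = 8
uniform = 2.0 ** -n  # P(x) = 2^{-n} for every n-bit source word

# Under the uniform distribution the length depends only on P(x), so the
# all-zero word is coded no shorter than an irregular word.
length_all_zero = shannon_fano_length(uniform)
length_irregular = shannon_fano_length(uniform)

# Kraft sum over all 2^n codewords; a value <= 1 is what a prefix code requires.
kraft_sum = (2 ** n) * 2.0 ** -length_all_zero

# A skewed (hypothetical) distribution: likely words get short codewords.
skewed = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
lengths = {word: shannon_fano_length(p) for word, p in skewed.items()}
expected_length = sum(p * lengths[word] for word, p in skewed.items())
entropy = -sum(p * math.log2(p) for p in skewed.values())
```

For the dyadic skewed distribution the expected codeword length equals the entropy, while in the uniform case no individual word, however regular, receives a shorter codeword.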

Quantum Computing: We summarize some basic definitions in Appendix II in order to establish notations and briefly review the notion of a quantum Turing machine computation. See also the survey [2] on quantum information theory. More details can be found in the textbook [16]. Loosely speaking, as randomized computation is a generalization of deterministic computation, so is quantum computation a generalization of randomized computation. Realizing a mathematical random source to drive a random computation is, in its ideal form, presumably impossible (or impossible to certify) in practice. Thus, in applications an algorithmic random number generator is used. Strictly speaking, this invalidates the analysis based on mathematical randomized computation. As John von Neumann [15], [22] put it: “Any one who considers arithmetical methods of producing random digits is, of course, in a state of sin. For, as has been pointed out several times, there is no such thing as a random number; there are only methods to produce random numbers, and a strict arithmetical procedure is of course not such a method.” In practice, randomized computations reasonably satisfy theoretical analysis. In the quantum computation setting, the practical problem is that the ideal coherent superposition cannot really be maintained during computation but deteriorates: it decoheres. In our analysis, we abstract from that problem and one hopes that in practice anti-decoherence techniques will suffice to approximate the idealized performance sufficiently.

We view a quantum Turing machine as a generalization of the classic probabilistic (that is, randomized) Turing machine. The probabilistic Turing machine computation follows multiple computation paths in parallel, each path with a certain associated probability. The quantum Turing machine computation follows multiple computation paths in parallel, but now every path has an associated complex probability amplitude. If it is possible to reach the same state via different paths, then in the probabilistic case the probability of observing that state is simply the sum of the path probabilities. In the quantum case, it is the squared norm of the summed path probability amplitudes. Since the probability amplitudes can be of opposite sign, the observation probability can vanish; if the path probability amplitudes are of equal sign then the observation probability can get boosted since it is the square of the norm of the sum. While this generalizes the probabilistic aspect, and boosts the computation power through the phenomenon of interference between parallel computation paths, there are extra restrictions vis-à-vis probabilistic computation in that the quantum evolution must be unitary.
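The difference between adding path probabilities and adding path amplitudes can be sketched in a few lines. The two-path setup and the function names `classical_probability` and `quantum_probability` are illustrative assumptions, and the amplitudes are not normalized over a full set of outcomes.

```python
def classical_probability(path_probabilities):
    """Probability of reaching a state: the sum of the path probabilities."""
    return sum(path_probabilities)


def quantum_probability(path_amplitudes):
    """Probability of observing a state: squared norm of the summed amplitudes."""
    return abs(sum(path_amplitudes)) ** 2


# Two paths reach the same state; each path alone has probability 1/4.
classical = classical_probability([0.25, 0.25])   # probabilities simply add
destructive = quantum_probability([0.5, -0.5])    # opposite signs cancel
constructive = quantum_probability([0.5, 0.5])    # equal signs reinforce
```

With these inputs the classical sum is 1/2, the amplitudes of opposite sign give probability 0, and the amplitudes of equal sign give probability 1.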

Quantum Kolmogorov Complexity: We define the Kolmogorov complexity of a pure quantum state as the length of the shortest two-part code consisting of a classical program to compute an approximate pure quantum state and the negative log-fidelity of the approximation to the target quantum state. We show that the resulting quantum Kolmogorov complexity coincides with the classical self-delimiting complexity on the domain of classical objects; and that certain properties that we love and cherish in the classical Kolmogorov complexity are shared by the new quantum Kolmogorov complexity: quantum Kolmogorov complexity of an $n$-qubit object is upper-bounded by about $2n$; it is not computable but can under certain conditions be approximated from above by a computable process; and with high probability a quantum object is incompressible. We may call this quantum Kolmogorov complexity the bit complexity of a pure quantum state $|\phi\rangle$ (using Dirac’s “ket” notation) and denote it by $K(|\phi\rangle)$. From now on, we will denote by $\stackrel{+}{<}$ an inequality to within an additive constant, and by $\stackrel{+}{=}$ the situation when both $\stackrel{+}{<}$ and $\stackrel{+}{>}$ hold. For example, we will show that, for $n$-qubit states $|\phi\rangle$, the complexity satisfies $K(|\phi\rangle \mid n) \stackrel{+}{<} 2n$. For certain restricted pure quantum states, quantum Kolmogorov complexity satisfies the subadditive property

$$K(|\phi, \psi\rangle) \stackrel{+}{<} K(|\phi\rangle) + K(|\psi\rangle \mid |\phi\rangle).$$

But, in general, quantum Kolmogorov complexity is not subadditive. Although “cloning” of nonorthogonal states is forbidden in the quantum setting [21], [7], $m$ copies of the same quantum state have combined complexity that can be considerably lower than $m$ times the complexity of a single copy. In fact, quantum Kolmogorov complexity appears to enable us to express and partially quantify “nonclonability” and “approximate clonability” of individual pure quantum states.

Related Work: In the classical situation there are several variants of Kolmogorov complexity that are very meaningful in their respective settings: plain Kolmogorov complexity, prefix complexity, monotone complexity, uniform complexity, negative logarithm of universal measure, and so on [13]. It is, therefore, not surprising that in the more complicated situation of quantum information several different choices of complexity can be meaningful and unavoidable in different settings. Following the preliminary version [19] of this work there have been alternative proposals.

Qubit Descriptions: The most straightforward way to define a notion of quantum Kolmogorov complexity is to consider the shortest effective qubit description of a pure quantum state, which is studied in [4]. (This qubit complexity can also be formulated in terms of the conditional version of bit complexity as in [19].) An advantage of qubit complexity is that the upper bound on the complexity of a pure quantum state is immediately given by the number of qubits involved in the literal description of that pure quantum state. Let us denote the resulting qubit complexity of a pure quantum state $|\phi\rangle$ by $KQ(|\phi\rangle)$.

While it is clear that (just as with the previous approach) the qubit complexity is not computable, it is unlikely that one can approximate the qubit complexity from above by a computable process in some meaningful sense. In particular, the dovetailing method used in our approach does not seem applicable there, due to the noncountability of the potential qubit program candidates. The quantitative incompressibility properties are much like the classical case (this is important for future applications). There are some interesting exceptions in case of objects consisting of multiple copies related to the “no-cloning” property of quantum objects [21], [7]. Qubit complexity does not satisfy the subadditive property, and a certain version of it (bounded fidelity) is bounded above by the von Neumann entropy.

Density Matrices: In classical algorithmic information theory it turns out that the negative logarithm of the “largest” probability distribution effectively approximable from below (the universal distribution) coincides with the self-delimiting Kolmogorov complexity. In [8], Gács defines two notions of complexities based on the negative logarithm of the “largest” density matrix $\boldsymbol{\mu}$ effectively approximable from below. There arise two different complexities of $|\phi\rangle$ based on whether we take the logarithm inside as

$$KG(|\phi\rangle) = -\langle\phi| \log \boldsymbol{\mu} |\phi\rangle$$

or outside as

$$Kg(|\phi\rangle) = -\log \langle\phi| \boldsymbol{\mu} |\phi\rangle.$$

It turns out that $Kg(|\phi\rangle) \stackrel{+}{<} KG(|\phi\rangle)$. This approach serves to compare the two approaches above. It was shown that $Kg(|\phi\rangle)$ is within a factor of four of $K(|\phi\rangle)$; that $KG(|\phi\rangle)$ essentially is a lower bound on $KQ(|\phi\rangle)$, and an oracle version of $KG$ is essentially an upper bound on qubit complexity $KQ$. Since qubit complexity is trivially $\stackrel{+}{<} n$ and it was shown that bit complexity is typically close to $2n$, at first glance, this leaves the possibility that the two complexities are within a factor of two of each other. This turns out not to be the case since it was shown that the $Kg$ complexity can for some arguments be much smaller than the $KG$ complexity, so that the bit complexity is in these cases also much smaller than the qubit complexity. As [8] states: this is due to the permissive way the bit complexity deals with approximation. The von Neumann entropy of a computable density matrix is within an additive constant (the complexity of the program computing the density matrix) of a notion of average complexity. The drawback of density-matrix-based complexity is that we seem to have lost the direct relation with a meaningful interpretation in terms of description length: a crucial aspect of classical Kolmogorov complexity in most applications [13].

Real Descriptions: A version of quantum Kolmogorov complexity briefly considered in [19] uses computable real parameters to describe the pure quantum state with complex probability amplitudes. This requires two reals per complex probability amplitude, that is, for $n$ qubits one requires $2^{n+1}$ real numbers in the worst case. A real number is computable if there is a fixed program that outputs consecutive bits of the binary expansion of the number forever. Since every computable real number may require a separate program, a computable $n$-qubit pure state may require $2^{n+1}$ finite programs. Most $n$-qubit pure states have parameters that are noncomputable and increased precision will require increasingly long programs. For example, if the parameters are recursively enumerable (the positions of the “1”s in the binary expansion form a recursively enumerable set), then a $\log k$ length program per parameter, to achieve $k$-bit precision per recursively enumerable real, is sufficient and for some recursively enumerable reals also necessary. In certain contexts, where the approximation of the real parameters is a central concern, such considerations may be useful. While this approach does not allow the development of a clean theory in the sense of the previous approaches, it can be directly developed in terms of algorithmic thermodynamics, an extension of Kolmogorov complexity to randomness of infinite sequences (such as binary expansions of real numbers) in terms of coarse-graining and sequential Martin-Löf tests, analogous to the classical case in [9], [13]. But this is outside the scope of the present paper.

II. QUANTUM TURING MACHINE MODEL

We assume the notation and definitions in Appendixes I and II. Our model of computation is a quantum Turing machine equipped with an input tape that is one-way infinite with the classical input (the program) in binary left adjusted from the beginning. We require that the input tape is read-only from left-to-right without backing up. This automatically yields a property we require in the sequel: the set of halting programs is prefix-free. Additionally, the machine contains a one-way infinite work tape containing qubits, a one-way infinite auxiliary tape containing qubits, and a one-way infinite output tape containing qubits. Initially, the input tape contains a classical binary program $p$, and all (qu)bits of the work tape, auxiliary tape, and output tape are set to $|0\rangle$. In case the Turing machine has an auxiliary input (classical or quantum), then initially the leftmost qubits of the auxiliary tape contain this input. A quantum Turing machine $Q$ with classical program $p$ and auxiliary input $y$ computes until it halts with output $Q(p, y)$ on its output tape or it computes forever. Halting is a more complicated matter here than in the classical case since quantum Turing machines are reversible, which means that there must be an ongoing evolution with nonrepeating configurations. There are various ways to resolve this problem [3] and we do not discuss this matter further. We only consider quantum Turing machines that do not modify the output tape after halting. Another, related, problem is that after halting the quantum state on the output tape may be “entangled” with the quantum state of the remainder of the machine, that is, the input tape, the finite control, the work tape, and the auxiliary tape. This has the effect that the output state viewed in isolation may not be a pure quantum state but a mixture of pure quantum states. This problem does not arise if the output and the remainder of the machine form a tensor product so that the output is unentangled with the remainder. The results in this paper are invariant under these different assumptions, but considering output entangled with the remainder of the machine complicates formulas and calculations. Correspondingly, we restrict our consideration to outputs that form a tensor product with the remainder of the machine, with the understanding that the same results hold with about the same proofs if we choose the other option, except in the case of Theorem 4 item ii); see the pertinent caveat there. Note that the Kolmogorov complexity based on entangled output tapes is at most (and conceivably less than) the Kolmogorov complexity based on unentangled output tapes.

Definition 1: Define the output $Q(p, y)$ of a quantum Turing machine $Q$ with classical program $p$ and auxiliary input $y$ as the pure quantum state $|\psi\rangle$ resulting from $Q$ computing until it halts with output $|\psi\rangle$ on its output tape. Moreover, $|\psi\rangle$ does not change after halting, and it is unentangled with the remainder of $Q$’s configuration. We write $Q(p, y) < \infty$. If there is no such $|\psi\rangle$ then $Q(p, y)$ is undefined and we write $Q(p, y) = \infty$. By definition, the input tape is read-only from left-to-right without backing up: therefore, the set of halting programs

$$P_y = \{ p : Q(p, y) < \infty \}$$

is prefix-free: no program in $P_y$ is a proper prefix of another program in $P_y$. Put differently, the Turing machine scans all of a halting program $p$ but never scans the bit following the last bit of $p$: it is self-delimiting.

We fix the rotation of all contemplated machines to a single primitive rotation $\theta$ with $\cos\theta = 3/5$ and $\sin\theta = 4/5$. There are only countably many such Turing machines. Using a standard ordering, we fix $Q_1, Q_2, \ldots$ as a standard enumeration of quantum Turing machines using only rotation $\theta$. By [1], there is a universal machine $U$ in this enumeration that simulates the others exactly:

$$U(1^i 0 p, y) = Q_i(p, y) \quad \text{for all } i, p, y.$$

(Instead of the many-bit encoding $1^i 0$ for $i$ we can use a shorter self-delimiting code like the one in Appendix I.) As noted in the Introduction, every quantum Turing machine computation using arbitrary real rotations can be approximated to arbitrary precision by machines with fixed rotation $\theta$ but in general cannot be simulated exactly.
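A minimal sketch of the unary index encoding may help: the universal machine reads a run of 1s terminated by a 0 to recover the machine index and treats the rest of the input as the program. The bit-string representation and the helper names `encode`, `decode`, and `is_prefix_free` are assumptions for illustration; the sketch models only the parsing of the input, not any quantum computation.

```python
def encode(index, program):
    """Prepend the unary self-delimiting index 1^i 0 to a program bit string."""
    return "1" * index + "0" + program


def decode(tape):
    """Split an input of the form 1^i 0 p into (i, p), reading left to right."""
    index = tape.index("0")  # number of leading 1s before the first 0
    return index, tape[index + 1:]


def is_prefix_free(words):
    """True if no word is a proper prefix of another word in the collection."""
    ordered = sorted(set(words))
    # In lexicographic order a prefix is always followed directly by one of
    # its extensions, so checking adjacent pairs is sufficient.
    return not any(b.startswith(a) for a, b in zip(ordered, ordered[1:]))


index_codes = ["1" * i + "0" for i in range(1, 6)]
round_trip = decode(encode(3, "0110"))
```

The index codes form a prefix-free set, so the machine can tell where the index ends without a separator; the cost is $i + 1$ bits for index $i$, which is why a shorter self-delimiting code can be preferable.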

Remark 1: There are two possible interpretations for the computation relation $Q(p, y) = |x\rangle$. In the narrow interpretation we require that $Q$ with $p$ on the input tape and $y$ on the conditional tape halts with $|x\rangle$ on the output tape. In the wide interpretation we can define pure quantum states by requiring that for every precision parameter $\delta > 0$ the computation of $Q$ with $p$ on the input tape and $y$ on the conditional tape, with $\delta$ on a special new tape where the precision is to be supplied, halts with $|x'\rangle$ on the output tape and $|\langle x | x'\rangle|^2 \geq 1 - \delta$. Such a notion of “computable” or “recursive” pure quantum states is similar to Turing’s notion of “computable numbers.” In the remainder of this section we use the narrow interpretation.

Remark 2: As remarked in [8], the notion of a quantum computer is not essential to the theory here or in [4], [8]. Since the computation time of the machine is not limited in the theory of description complexity as developed here, a quantum computer can be simulated by a classical computer to every desired degree of precision. We can rephrase everything in terms of the standard enumeration $T_1, T_2, \ldots$ of classical Turing machines. Let $|x\rangle = \sum_{i=0}^{N-1} \alpha_i |e_i\rangle$ (with $N = 2^n$) be an $n$-qubit state. We can write $T(p) = |x\rangle$ if $T$ either outputs

i) algebraic definitions of the coefficients of $|x\rangle$ (in case these are algebraic); or

ii) a sequence of approximations $(\alpha_{0,m}, \ldots, \alpha_{N-1,m})$ for $m = 1, 2, \ldots$, where $\alpha_{i,m}$ is an algebraic approximation of $\alpha_i$ to within $2^{-m}$.
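Option ii) can be illustrated for a single amplitude. The sketch below produces rational (hence algebraic) approximations of the amplitude $1/\sqrt{2}$ of the state $(|0\rangle + |1\rangle)/\sqrt{2}$ to within $2^{-m}$, using only integer arithmetic. The choice of this state and the names `approx_inv_sqrt2` and `within_tolerance` are assumptions for illustration, not constructions from the paper.

```python
from fractions import Fraction
from math import isqrt


def approx_inv_sqrt2(m):
    """Rational a with a <= 1/sqrt(2) < a + 2^-(m+1), hence within 2^-m."""
    # floor(sqrt(2) * 2^m), computed exactly with integer arithmetic
    numerator = isqrt(2 * 4 ** m)
    return Fraction(numerator, 2 ** (m + 1))


def within_tolerance(a, m):
    """Check |a - 1/sqrt(2)| < 2^-m without floating point."""
    step = Fraction(1, 2 ** m)
    lower, upper = a - step, a + step
    half = Fraction(1, 2)
    # 1/sqrt(2) is positive, so it exceeds lower if lower is negative or
    # lower^2 < 1/2, and it is below the positive upper if upper^2 > 1/2.
    return (lower < 0 or lower ** 2 < half) and upper ** 2 > half


# The sequence of approximations for m = 1, 2, ..., 10.
approximations = {m: approx_inv_sqrt2(m) for m in range(1, 11)}
all_ok = all(within_tolerance(a, m) for m, a in approximations.items())
```

A classical program emitting such a sequence for every amplitude describes the state in the sense of item ii) even though the amplitude itself is irrational.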

III. CLASSICAL DESCRIPTIONS OF PURE QUANTUM STATES

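The complex quantity $\langle x | z \rangle$ is the inner product of vectors $\langle x|$ and $|z\rangle$. Since pure quantum states $|x\rangle, |z\rangle$ have unit length, $|\langle x | z \rangle| = |\cos\theta|$, where $\theta$ is the angle between vectors $|x\rangle$ and $|z\rangle$. The quantity $|\langle x | z \rangle|^2$, the fidelity between $|x\rangle$ and $|z\rangle$, is a measure of how “close” or “confusable” the vectors $|x\rangle$ and $|z\rangle$ are. It is the probability of outcome $|x\rangle$ being measured from state $|z\rangle$. Essentially, we project $|z\rangle$ on outcome $|x\rangle$ using projection $|x\rangle\langle x|$, resulting in $\langle x | z \rangle |x\rangle$.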
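The fidelity is easy to compute for small examples. The following sketch represents pure states as lists of amplitudes and evaluates $|\langle x | z \rangle|^2$; the list representation and the names `inner`, `fidelity`, `ket0`, `ket1`, and `plus` are assumptions made for illustration.

```python
import math


def inner(x, z):
    """Inner product <x|z>: conjugate the entries of x, then sum the products."""
    return sum(complex(a).conjugate() * complex(b) for a, b in zip(x, z))


def fidelity(x, z):
    """Fidelity |<x|z>|^2 between two unit vectors."""
    return abs(inner(x, z)) ** 2


ket0 = [1, 0]
ket1 = [0, 1]
plus = [1 / math.sqrt(2), 1 / math.sqrt(2)]

# A global phase factor does not change the fidelity.
plus_with_phase = [1j * amplitude for amplitude in plus]
```

Orthogonal states such as `ket0` and `ket1` have fidelity 0, identical states have fidelity 1, and `ket0` against `plus` gives about 1/2, matching an angle of $\pi/4$ between the vectors.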

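Definition 2: The (self-delimiting) complexity of $|x\rangle$ with respect to quantum Turing machine $Q$ with $y$ as conditional input given for free is

$$K_Q(|x\rangle \mid y) = \min_p \{\, l(p) + \lceil -\log |\langle z | x \rangle|^2 \rceil : Q(p, y) = |z\rangle \,\} \qquad (1)$$

where $l(p)$ is the number of bits in the program $p$, auxiliary $y$ is an input (possibly quantum) state, and $|x\rangle$ is the target state that one is trying to describe.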

Note that $|z\rangle$ is the quantum state produced by the computation $Q(p, y)$ and is therefore, given $Q$ and $y$, completely determined by $p$. Therefore, we obtain the minimum of the right-hand side of the equality by minimizing over $p$ only. We call the $|z\rangle$ that minimizes the right-hand side the directly computed part of $K_Q(|x\rangle \mid y)$, while $\lceil -\log |\langle z | x \rangle|^2 \rceil$ is the approximation part.

Quantum Kolmogorov complexity is the sum of two terms: the first term is the integral length of a binary program, and the second term, the min-log probability term, corresponds to the length of the corresponding codeword in the Shannon–Fano code associated with that probability distribution (see, for example, [6]) and is thus also expressed in an integral number of bits. Let us consider this relation more closely. For a quantum state $|z\rangle$ the quantity $|\langle z | x \rangle|^2$ is the probability that the state passes a test for $|x\rangle$, and vice versa. The term $\lceil -\log |\langle z | x \rangle|^2 \rceil$ can be viewed as the codeword length to redescribe $|x\rangle$, given $|z\rangle$ and an orthonormal basis with $|x\rangle$ as one of the basis vectors, using the Shannon–Fano prefix code. This works as follows. Write $N = 2^n$. For every state $|z\rangle$ in $N$-dimensional Hilbert space with basis vectors $B = \{|e_0\rangle, \ldots, |e_{N-1}\rangle\}$ we have

$$\sum_{i=0}^{N-1} |\langle e_i | z \rangle|^2 = 1.$$

If the basis has $|x\rangle$ as one of the basis vectors, then we can consider $|z\rangle$ as a random variable that assumes value $|x\rangle$ with probability $|\langle x | z \rangle|^2$. The Shannon–Fano codeword for $|x\rangle$ in the probabilistic ensemble $B, (|\langle e_i | z \rangle|^2)_i$ is based on the probability $|\langle x | z \rangle|^2$ of $|x\rangle$, given $|z\rangle$, and has length $\lceil -\log |\langle x | z \rangle|^2 \rceil$. Considering a canonical method of constructing an orthonormal basis $B = |e_0\rangle, \ldots, |e_{N-1}\rangle$ from a given basis vector, we can choose $B$ such that

$$K(B) = \min_i \{ K(|e_i\rangle) \} + O(1).$$

The Shannon–Fano code is appropriate for our purpose since it is optimal in that it achieves the least expected codeword length (the expectation taken over the probability of the source words) up to 1 bit by Shannon’s Noiseless Coding Theorem. As in the classical case, the quantum Kolmogorov complexity is an integral number.
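The two-part cost of Definition 2 can be traced on a toy example. The sketch below replaces the quantum Turing machine by a hypothetical finite lookup table from prefix-free programs to single-qubit output states, and minimizes program length plus $\lceil -\log |\langle z | x \rangle|^2 \rceil$ over the table. The table `toy_machine`, the target state, and the function names are all assumptions for illustration; a real reference machine is universal and cannot be searched exhaustively like this.

```python
import math


def fidelity(x, z):
    """Fidelity |<x|z>|^2 between two unit vectors given as amplitude lists."""
    return abs(sum(complex(a).conjugate() * complex(b) for a, b in zip(x, z))) ** 2


def toy_complexity(target, machine):
    """Return (cost, program) minimizing l(p) + ceil(-log2 |<z|x>|^2), z = machine[p]."""
    best_cost, best_program = math.inf, None
    for program, state in machine.items():
        f = fidelity(state, target)
        if f == 0:
            continue  # orthogonal output: the approximation term is infinite
        # Clamp at 0 so rounding slightly above fidelity 1 cannot go negative.
        approximation_bits = max(0, math.ceil(-math.log2(f)))
        cost = len(program) + approximation_bits
        if cost < best_cost:
            best_cost, best_program = cost, program
    return best_cost, best_program


# Hypothetical machine: three prefix-free programs and the states they output.
toy_machine = {
    "0": [1, 0],
    "10": [0, 1],
    "110": [1 / math.sqrt(2), 1 / math.sqrt(2)],
}
target = [0.8, 0.6]
result = toy_complexity(target, toy_machine)
```

For this target the minimum is 2: the one-bit program `"0"` outputs $|0\rangle$ with fidelity 0.64, so the approximation part costs 1 bit, whereas the longer programs cost 4 in total.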

The main property required to be able to develop a meaningful theory is that our definition satisfies a so-called Invariance Theorem (see also Appendix I). Below we use “$U$” to denote a special type of universal (quantum) Turing machine rather than a unitary matrix.

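Theorem 1 (Invariance): There is a universal machine $U$ such that, for all machines $Q$, there is a constant $c_Q$ (the length of the description of the index of $Q$ in the enumeration), such that for all quantum states $|x\rangle$ and all auxiliary inputs $y$ we have

$$K_U(|x\rangle \mid y) \leq K_Q(|x\rangle \mid y) + c_Q.$$

Proof: Assume that the program $p$ that minimizes the right-hand side of (1) is $p_0$ and the computed $|z\rangle$ is $|z_0\rangle$:

$$K_Q(|x\rangle \mid y) = l(p_0) + \lceil -\log |\langle z_0 | x \rangle|^2 \rceil.$$

There is a universal quantum Turing machine $U$ in the standard enumeration $Q_1, Q_2, \ldots$ such that for every quantum Turing machine $Q$ in the enumeration there is a self-delimiting program $i_Q$ (the index of $Q$) and $U(i_Q p, y) = Q(p, y)$ for all $p, y$: if $Q(p, y) = |z\rangle$ then $U(i_Q p, y) = |z\rangle$. In particular, this holds for $p_0$ such that $Q$ with auxiliary input $y$ halts with output $|z_0\rangle$. But $U$ with auxiliary input $y$ halts on input $i_Q p_0$ also with output $|z_0\rangle$. Consequently, the program $q$ that minimizes the right-hand side of (1) with $U$ substituted for $Q$, and computes $U(q, y) = |u\rangle$ for some state $|u\rangle$ possibly different from $|z_0\rangle$, satisfies

$$K_U(|x\rangle \mid y) = l(q) + \lceil -\log |\langle u | x \rangle|^2 \rceil \leq l(i_Q p_0) + \lceil -\log |\langle z_0 | x \rangle|^2 \rceil.$$

Combining the two displayed relations, and setting $c_Q = l(i_Q)$, proves the theorem.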

The key point is not that the universal Turing machine viewed as a description method necessarily gives the shortest description in each case, but that no other effective description method can improve on it infinitely often by more than a fixed constant. For every pair of universal Turing machines $U$ and $U'$ as in the proof of Theorem 1, there is a fixed constant $c_{U,U'}$, depending only on $U$ and $U'$, such that for all $|x\rangle, y$ we have

$$|K_U(|x\rangle \mid y) - K_{U'}(|x\rangle \mid y)| \leq c_{U,U'}.$$

To see this, substitute $U'$ for $Q$ in Theorem 1, and, conversely, substitute $U'$ for $U$ and $U$ for $Q$ in Theorem 1, and combine the two resulting inequalities. While the complexities according to $U$ and $U'$ are not exactly equal, they are equal up to a fixed constant for all $|x\rangle$ and $y$. Therefore, one or the other fixed choice of reference universal machine $U$ yields resulting complexities that are in a fixed constant envelope from each other for all arguments.
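The combination of the two inequalities can be written out explicitly. In the sketch below, $U$ and $U'$ denote the two universal machines, $c_U$ and $c_{U'}$ denote the constants that Theorem 1 assigns to them as simulated machines, and taking the maximum is one admissible choice of the joint constant rather than the paper's stated definition.

```latex
\begin{align*}
K_U(|x\rangle \mid y) &\le K_{U'}(|x\rangle \mid y) + c_{U'}
  && \text{(Theorem 1 with } Q = U'\text{)} \\
K_{U'}(|x\rangle \mid y) &\le K_U(|x\rangle \mid y) + c_U
  && \text{(Theorem 1 with } U' \text{ universal and } Q = U\text{)} \\
\bigl| K_U(|x\rangle \mid y) - K_{U'}(|x\rangle \mid y) \bigr|
  &\le \max\{c_U, c_{U'}\} =: c_{U,U'}
\end{align*}
```

Neither constant depends on $|x\rangle$ or $y$, which is why the bound holds uniformly over all arguments.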

Programmers are generally aware that programs for symbolic manipulation tend to be shorter when they are expressed in the LISP programming language than if they are expressed in Fortran, while for numerical calculations the opposite is the case. Or is it? The Invariance Theorem in fact shows that to express an algorithm succinctly in a program, it does not matter which programming language we use, up to a fixed additive constant (representing the length of compiling programs from either language into the other language) that depends only on the two programming languages compared. For further discussion of effective optimality and invariance see [13].

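Definition 3: We fix once and for all a reference universal quantum Turing machine $U$ and define the quantum Kolmogorov complexity as

$$K(|x\rangle \mid y) = K_U(|x\rangle \mid y), \qquad K(|x\rangle) = K_U(|x\rangle \mid \epsilon),$$

where $\epsilon$ denotes the absence of conditional information.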

The definition is continuous. If two quantum states are very close then their quantum Kolmogorov complexities are very close. Furthermore, we can approximate every (pure quantum) state $|x\rangle$ to arbitrary closeness [3]; in particular, for every constant $\epsilon > 0$ we can compute a (pure quantum) state $|z\rangle$ such that $|\langle z | x \rangle|^2 > 1 - \epsilon$. One can view this as the probability of obtaining the possibly noncomputable outcome $|x\rangle$ when executing projection $|x\rangle\langle x|$ on $|z\rangle$ and measuring outcome $|x\rangle$. For this definition to be useful it should satisfy the following conditions.

• The complexity of a pure state that can be directly computed should be the length of the shortest program that computes that state. (If the complexity is less, this may lead to discontinuities when we restrict quantum Kolmogorov complexity to the domain of classical objects.)

• The quantum Kolmogorov complexity of a classical object should equal the classical Kolmogorov complexity of that object (up to a constant additive term).

• The quantum Kolmogorov complexity of a quantum object should have an upper bound. (This is necessary for the complexity to be approximable from above, even if the quantum object is available in as many copies as we require.)

• Most objects should be “incompressible” in terms of quantum Kolmogorov complexity.

• In the classical case, the average self-delimiting Kolmogorov complexity of the discrete set of all $n$-bit strings under some distribution equals the Shannon entropy up to an additive constant depending on the complexity of the distribution concerned. In our setting we would like to know the relation of the expected $n$-qubit quantum Kolmogorov complexity, the expectation taken over a computable (semi-)measure over the continuously many $n$-qubit states, with von Neumann entropy. Perhaps the continuous set can be restricted to a representative discrete set. We have no results along these lines. One problem may be that in the quantum situation there can be many different mixtures of pure quantum states that give rise to the same density matrix, and thus have the same von Neumann entropy. It is possible that the average Kolmogorov complexities of different mixtures with the same density matrix (or density matrices with the same eigenvalues) are also different (and, therefore, not all of them can be equal to the single fixed von Neumann entropy which depends only on the eigenvalues). In contrast, in the approach of [8], using semicomputable semi-density matrices, as discussed in the Introduction, equality of “average - universal density” to the von Neumann entropy (up to the Kolmogorov complexity of the semicomputable density itself) follows simply and similarly to the classical case. However, in this approach the interpretation of “ - universal density” in terms of length of descriptions of one form or the other is quite problematic (in contrast with the classical case) and we thus lose the main motivation of quantum Kolmogorov complexity.

A. Consistency With Classical Complexity

Our proposal would not be useful if it were the case that for a directly computable object the complexity is less than the shortest program to compute that object. This would imply that the code corresponding to the probabilistic component in the description is possibly shorter than the difference in program lengths for programs for an approximation of the object and the object itself. This would penalize definite description compared to probabilistic description and, in case of classical objects, would make quantum Kolmogorov complexity less than classical Kolmogorov complexity.

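Theorem 2 (Consistency): Let $U$ be the reference universal quantum Turing machine and let $|x\rangle$ be a basis vector in a directly computable orthonormal basis $\mathcal{B}$, given classical auxiliary input $y$: there is a program $p$ such that $U(p, y) = |x\rangle$. Then

$$K(|x\rangle \mid y) = \min_p \{ l(p) : U(p, y) = |x\rangle \}$$

up to $K(\mathcal{B} \mid y) + O(1)$.

Proof: Let $|z\rangle$ be such that

$$K(|x\rangle \mid y) = \min_q \{ l(q) + \lceil -\log \|\langle z|x\rangle\|^2 \rceil : U(q, y) = |z\rangle \}.$$

Denote the program $q$ that minimizes the right-hand side by $q_{\min}$ and the program $p$ that minimizes the expression in the statement of the theorem by $p_{\min}$.

A dovetailed computation is a method related to Cantor’s celebrated diagonalization method: run all programs alternatingly in such a way that every program eventually makes progress. On a list of programs $p_1, p_2, \ldots$ one divides the overall computation into stages $k := 1, 2, \ldots$. In stage $k$ of the overall computation, one executes the $i$th computation step of every program $p_{k-i+1}$ for $i := 1, \ldots, k$.

By running $U$ on all binary strings (candidate programs) simultaneously in a dovetailed fashion, one can enumerate all objects that are directly computable, given $y$, in order of their halting programs. Assume that $U$ is also given a $K(\mathcal{B} \mid y)$-length program $b$ to compute $\mathcal{B}$, that is, to enumerate the basis vectors in $\mathcal{B}$. This way, $q_{\min}$ computes $|z\rangle$, the program $b$ computes $\mathcal{B}$. Now since the vectors of $\mathcal{B}$ are mutually orthogonal

$$\sum_{|e\rangle \in \mathcal{B}} \|\langle z|e\rangle\|^2 = 1.$$

Since $|x\rangle$ is one of the basis vectors we have that $\lceil -\log \|\langle z|x\rangle\|^2 \rceil$ is the length of a prefix code (the Shannon–Fano code) to compute $|x\rangle$ from $|z\rangle$ and $\mathcal{B}$. Denoting this code by $r$, we have that the concatenation $q_{\min} b r$ is a program to compute $|x\rangle$: parse it into $q_{\min}$, $b$, and $r$ using the self-delimiting property of $q_{\min}$ and $b$. Use $q_{\min}$ to compute $|z\rangle$ and use $b$ to compute $\mathcal{B}$, determine the probabilities $\|\langle z|e\rangle\|^2$ for all basis vectors $|e\rangle$ in $\mathcal{B}$. Determine the Shannon–Fano codewords for all the basis vectors from these probabilities. Since $r$ is the codeword for $|x\rangle$ we can now decode $|x\rangle$. Therefore,

$$l(q_{\min}) + \lceil -\log \|\langle z|x\rangle\|^2 \rceil \ge l(p_{\min}) - K(\mathcal{B} \mid y) - O(1)$$

which was what we had to prove.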
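The dovetailing idea used in the proof can be illustrated with a minimal sketch. It models each "program" as a Python generator that yields once per computation step and returns its output on halting. The names `dovetail`, `halts_after`, and `loops_forever` are hypothetical and not from the paper, and the scheduler below admits one new program per stage and then advances every admitted program by one step, which is a simplified bookkeeping that keeps the essential property: every program eventually makes progress, so every halting program is eventually enumerated.

```python
def halts_after(steps, output):
    """Toy program: performs `steps` computation steps, then halts with `output`."""
    for _ in range(steps):
        yield
    return output

def loops_forever():
    """Toy program that never halts."""
    while True:
        yield

def dovetail(programs, max_stages):
    """Yield (index, output) for halting programs, in order of halting.

    Stage s admits program number s (if any) and then advances every
    admitted, still-running program by exactly one step.
    """
    running = {}
    source = iter(enumerate(programs))
    for _stage in range(max_stages):
        nxt = next(source, None)
        if nxt is not None:
            running[nxt[0]] = nxt[1]
        for idx in list(running):
            try:
                next(running[idx])
            except StopIteration as halt:
                # The generator returned: the program halted with halt.value.
                del running[idx]
                yield idx, halt.value

programs = [loops_forever(), halts_after(5, "a"), halts_after(1, "b"), loops_forever()]
print(list(dovetail(programs, 20)))
```

The non-halting programs never block the enumeration, which is the point of dovetailing; a real enumeration would of course run without the `max_stages` cutoff.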

Corollary 1: On classical objects (that is, the natural numbers or finite binary strings that are all directly computable) the quantum Kolmogorov complexity coincides up to a fixed additional constant with the self-delimiting Kolmogorov complexity since $K(\mathcal{B} \mid n) = O(1)$ for the standard classical basis $\mathcal{B} = \{0,1\}^n$. (We assume that the information about the dimensionality of the Hilbert space is given conditionally.)

Remark 3: Fixed additional constants are no problem since the complexity also varies by fixed additional constants due to the choice of reference universal Turing machine.

Remark 4: The original plain complexity defined by Kolmogorov [13] is based on Turing machines where the input is delimited by distinguished markers. A similar proof used to compare quantum Kolmogorov complexity with the plain (not self-delimiting) Kolmogorov complexity on classical objects shows that they coincide, but only up to a logarithmic additive term.

B. Upper Bound on Complexity

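A priori, in the worst case $K(|x\rangle \mid n)$ is possibly $\infty$. We show that the worst case has a $2n$ upper bound.

Theorem 3 (Upper Bound): For all $n$-qubit quantum states $|x\rangle$ we have $K(|x\rangle \mid n) \le 2n + O(1)$.

Proof: Write $N = 2^n$. For every state $|x\rangle$ in $N$-dimensional Hilbert space with basis vectors $|e_0\rangle, \ldots, |e_{N-1}\rangle$ we have

$$\sum_{i=0}^{N-1} \|\langle e_i|x\rangle\|^2 = 1.$$

Hence there is an $i$ such that $\|\langle e_i|x\rangle\|^2 \ge 1/N$. Let $p$ be a $K(i \mid n) + O(1)$-bit program to construct a basis state $|e_i\rangle$ given $n$. Then $l(p) \le n + O(1)$. Then

$$K(|x\rangle \mid n) \le l(p) - \log(1/N) \le 2n + O(1).$$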
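The pigeonhole step of this proof can be checked numerically with a small sketch. It draws random unit vectors in dimension $2^n$ and confirms that the best computational basis state has squared overlap large enough that the fidelity penalty, the negative base-2 logarithm of the squared overlap rounded up, is at most $n$ bits. The helper names `random_state` and `best_basis_penalty` are illustrative assumptions, and only the standard library is used.

```python
import math
import random

def random_state(n_qubits, rng):
    """Random unit vector in C^(2^n), as a list of complex amplitudes."""
    dim = 2 ** n_qubits
    amps = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]
    norm = math.sqrt(sum(abs(a) ** 2 for a in amps))
    return [a / norm for a in amps]

def best_basis_penalty(state):
    """Return (i, penalty): the basis index with the largest squared
    amplitude and the corresponding ceil(-log2 |<e_i|x>|^2)."""
    i = max(range(len(state)), key=lambda j: abs(state[j]) ** 2)
    return i, math.ceil(-math.log2(abs(state[i]) ** 2))

rng = random.Random(0)
n = 6
for _ in range(5):
    i, penalty = best_basis_penalty(random_state(n, rng))
    # Naming the index i costs at most n bits; the penalty is at most n bits.
    assert 0 <= penalty <= n
    print(i, penalty)
```

Adding the at most $n$ bits needed to name the index to the at most $n$ bits of penalty gives the $2n$ bound, up to the additive constant.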

Remark 5: This upper bound is sharp since Gács [8] has recently shown that there are states $|x\rangle$ with $K(|x\rangle \mid n) > 2n - 2\log n$.

C. Computability

In the classical case, Kolmogorov complexity is not computable but can be approximated from above by a computable process. The noncloning property prevents us from perfectly copying an unknown pure quantum state given to us [21], [7]. Therefore, an approximation from above that requires checking every output state against the target state destroys the latter. It is possible to prepare approximate copies from the target state, but the more copies one prepares the less they approximate the target state [10], and this deterioration appears on the surface to prevent use in our application below. To sidestep the fragility of the pure quantum target state, we simply require that it is an outcome, in as many copies as we require, in a measurement that we have available. Another caveat with respect to item ii) in the theorem below is that, since the approximation algorithm in the proof does not discriminate between entangled output states and unentangled output states, we approximate the quantum Kolmogorov complexity by a directly computed part that is possibly a mixture rather than a pure state. Thus, the approximated value may be that of quantum Kolmogorov complexity based on computations halting with entangled output states, which is conceivably less than that of unentangled outputs. This is the only result in this paper that depends on that distinction.

Theorem 4 (Computability): Let $|x\rangle$ be the pure quantum state we want to describe.

i) The quantum Kolmogorov complexity $K(|x\rangle)$ is not computable.

ii) If we can repeatedly execute the projection $|x\rangle\langle x|$ and perform a measurement with outcome $|x\rangle$, then the quantum Kolmogorov complexity $K(|x\rangle)$ can be approximated from above by a computable process with arbitrarily small probability of error of giving a too small value.

Proof: The uncomputability follows a fortiori from the classical case. The semicomputability follows because we have established an upper bound on the quantum Kolmogorov complexity, and we can simply enumerate all halting classical programs up to that length by running their computations in dovetailed fashion. The idea is as follows. Let the target state be $|x\rangle$ of $n$ qubits. Then, $K(|x\rangle \mid n) \le 2n + O(1)$. (The unconditional case $K(|x\rangle)$ is similar with $2n$ replaced by $2(n + \log n)$.) We want to identify a program $x^*$ such that $p := x^*$ minimizes $l(p) - \log \|\langle x|U(p, n)\rangle\|^2$ among all candidate programs. To identify it in the limit, for some fixed $k$ satisfying (3) below for given $n, \varepsilon, \delta$, repeat the computation of every halting program $p$ with $l(p) \le 2n + O(1)$ at least $k$ times and perform the assumed projection and measurement. For every halting program $p$ in the dovetailing process we estimate the probability $q := \|\langle x|U(p, n)\rangle\|^2$ from the fraction $m/k$: the fraction of $m$ positive outcomes out of $k$ measurements. The probability that the estimate $m/k$ is off from the real value $q$ by more than an $\varepsilon q$ is given by Chernoff’s bound: for $0 \le \varepsilon \le 1$

$$P(|m - qk| > \varepsilon q k) \le 2 e^{-\varepsilon^2 q k / 3}. \qquad (2)$$

This means that the probability that the deviation $|m/k - q|$ exceeds $\varepsilon q$ vanishes exponentially with growing $k$. Every candidate program $p$ satisfies (2) with its own $q$ or $1 - q$. There are $O(2^{2n})$ candidate programs $p$ and hence also $O(2^{2n})$ outcomes $U(p, n)$ with halting computations. We use this estimate to upper-bound the probability of error $\delta$. For given $k$, the probability that some halting candidate program $p$ satisfies

$$|m - qk| > \varepsilon q k$$

is at most $\delta$ with

$$\delta \le \sum_{U(p, n) < \infty} 2 e^{-\varepsilon^2 q k / 3}.$$

The probability that no halting program does so is at least $1 - \delta$. That is, with probability at least $1 - \delta$ we have

$$(1 - \varepsilon) q \le \frac{m}{k} \le (1 + \varepsilon) q$$

for every halting program $p$. It is convenient to restrict attention to the case that all $q$’s are large. Without loss of generality, if $q < 1/2$ then consider $1 - q$ instead of $q$. Then

$$\log \delta \le 2n - \frac{\varepsilon^2 k \log e}{6} + O(1). \qquad (3)$$

The approximation algorithm is as follows.

Step 0: Set the required degree of approximation $\varepsilon$ and the number $k$ of trials to achieve the required probability of error $\delta$.

Step 1: Dovetail the running of all candidate programs until the next halting program is enumerated. Repeat the computation of the new halting program $k$ times.

Step 2: If there is more than one programthat achieves thecurrent minimum, then choose the program with theleast length (and hence the least number of success-full observations). If is the selected program with

successes out of trials then set the current ap-proximation of to

This exceeds the proper value of the approximationbased on the real instead of by at most 1 bitfor all .

Step 3: Goto Step 1.
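The loop in Steps 0 to 3 can be simulated classically with a rough sketch under explicit assumptions. Each halting candidate program is represented only by its length and its true success probability, the projective measurement is replaced by a Bernoulli draw, and the current approximation is taken to be the program length minus the base-2 logarithm of the observed success fraction, which is a simplification and not necessarily the exact expression of the paper. The trial count comes from the standard multiplicative Chernoff bound combined with a union bound over the candidates, assuming every success probability is at least `q_min`. All function and parameter names are hypothetical.

```python
import math
import random

def trials_needed(eps, delta, num_programs, q_min=0.5):
    """Smallest k with num_programs * 2 * exp(-eps^2 * q_min * k / 3) <= delta."""
    return math.ceil(3 * math.log(2 * num_programs / delta) / (eps ** 2 * q_min))

def approximate_complexity(candidates, k, rng):
    """candidates: (program_length, true_success_probability) pairs, listed in
    the order in which the dovetailed enumeration finds them halting.
    Returns the running minimum of length - log2(m / k) after each program."""
    best = math.inf
    history = []
    for length, q in candidates:
        # Simulate k repetitions of "run program, project, measure".
        m = sum(rng.random() < q for _ in range(k))
        if m > 0:
            best = min(best, length - math.log2(m / k))
        history.append(best)
    return history

rng = random.Random(1)
candidates = [(12, 0.9), (7, 0.6), (9, 1.0)]
k = trials_needed(eps=0.1, delta=0.01, num_programs=len(candidates))
print(k, approximate_complexity(candidates, k, rng))
```

The running minimum can only decrease as more halting programs are enumerated, which is what approximation from above means here; the error probability enters only through the estimate of each success probability.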

D. Incompressibility

Definition 4: A pure quantum state $|x\rangle$ is computable if $K(|x\rangle) < \infty$. Hence, all finite-dimensional pure quantum states are computable. We call a pure quantum state directly computable if there is a program $p$ such that $U(p) = |x\rangle$.

We have shown that quantum Kolmogorov complexity coincides with classical Kolmogorov complexity on classical objects in Theorem 2. In the proof, we demonstrated in fact that the quantum Kolmogorov complexity is the length of the classical program that directly computes the classical objects. By the standard counting argument, Appendix I, the standard orthonormal basis (consisting of all $n$-bit strings) of the $2^n$-dimensional Hilbert space has at least $2^n(1 - 2^{-c})$ basis vectors $|e_i\rangle$ that satisfy $K(|e_i\rangle \mid n) \ge n - c$. But what about nonclassical orthonormal bases? They may not satisfy the standard counting argument. Since there are continuously many pure quantum states and the range of quantum Kolmogorov complexity has only countably many values, there are integer values that are the Kolmogorov complexities of continuously many pure quantum states.

In particular, since the quantum Kolmogorov complexity of

an -qubit state is , the set of directly computable pure-qubit states has cardinality . They divide the

set of unit vectors in , the surface of the -dimensional ballwith unit radius in Hilbert space, into-many -dimen-sional connected surfaces, calledpatches, each consisting of onedirectly computable pure-qubit state together with thosepure -qubit states of which is the directly computedpart (Definition 2). In every patch, all with the samehave both the same complexity and the same directly computedpart, and for every fixed patch and every fixed value of approxi-mation part occurring in the patch, there are continuously many

with identical directly computed parts and approximationparts.A priori it is possible that this is the case for two distinctbasis vectors in a nonclassical orthonormal bases, which impliesthat the standard counting argument cannot be used to show theincompressibility of basis vectors of nonclassical orthonormalbases.

Lemma 1: There is a particular (possibly nonclassical)orthonormal basis of the -dimensional Hilbert space

, such that at least basis vectors satisfy.

Proof: Every orthonormal basis of has basis vec-tors and there are at most

programs of length less than . Hence, there are at mostprograms of length available to approximate the

basis vectors. We construct an orthonormal basis satisfyingthe lemma. The set of directly computed pure quantum states

span an -dimensional subspace within the -dimensional Hilbert space such that

. Here is a -dimensional subspaceof such that every vector in it is perpendicular to everyvector in . We can write every element as

where the ’s form an orthonormal basis of and the ’sform an orthonormal basis of so that the ’s and ’s


form an orthonormal basis for . For every state ,directly computed by a program , given , and basis vector

we have . Therefore, the assumptionthat is the directly computed part of implies

which is impossible. Hence, the directly computed part is inand has length . This proves the lemma.

Theorem 5 (Incompressibility):The uniform probability

Proof: The theorem follows immediately from a general-ization of Lemma 1 to arbitrary orthonormal bases as follows.

Claim 1: Every orthonormal basis of the-dimensional Hilbert space has at least

basis vectors that satisfy .Proof: Use the notation of the proof of Lemma 1. Let

be a set initially containing the programs of length less than, and let be a set initially containing the set of basis

vectors with . Assume to the contrarythat . Then at least two basis vectors, say and

, and some pure quantum state directly computed froma -length program satisfy

(4)

with being the directly computed part of both .This means that since not both and

can be equal to . Hence, for every directly computedpure quantum state of complexity there is at mostone basis state, say , of the same complexity (in fact, onlyif that basis state is identical with the directly computed state).Now eliminate every directly computed pure quantum stateof complexity from the set , and the basis state asabove (if it exists) from . We are now left withbasis states of which the directly computed parts are included in

with with every element in of complexity. Repeating the same argument, we end up with

basis vectors of which the directly computed parts areelements of the empty set, which is impossible.

Remark 6: Theorem 3 states an upper bound of $2n$ on $K(|x\rangle \mid n)$. This leaves a relatively large gap with the lower bound of $n$ established here. But, as stated earlier, Gács [8] has shown that there are states $|x\rangle$ with $K(|x\rangle \mid n) > 2n - 2\log n$; in fact, most states satisfy this. The proof appears to support about the same incompressibility results as in this section, with $n$ replaced by $2n - 2\log n$. The proof follows [5], analyzing coverings of the $2^n$-dimensional ball of unit radius.

E. Multiple Copies

For classical complexity we have $K(x, x) = K(x) + O(1)$, since a classical program to compute $x$ can be used twice; indeed, it can be used many times. In the quantum world things are not so easy: the no-cloning property mentioned earlier, see [21], [7], or the textbooks [17] and [16], prevents cloning an unknown pure state $|x\rangle$ perfectly to obtain $|x\rangle|x\rangle$: that is,

There is a considerable literature on the possibility of approximate cloning to obtain $m$ imperfect copies from an unknown pure state, see for example [10]. Generally speaking, the more qubits are involved in the original copy and the more clones one wants to obtain, the more the fidelity of the obtained clones deteriorates with respect to the original copy. This stands to reason since high-fidelity cloning would enable both superluminal signal transmission [11] and extraction of essentially unbounded information concerning the probability amplitude from the original qubits. The approximate cloning possibility suggests that in our setting the approximation penalty induced by the second (fidelity) term of Definition 2 may be lenient insofar that the complexity of multiple copies increases sublinearly with the number of copies. Even apart from this, the $m$-fold tensor product $|x\rangle^{\otimes m}$ of $|x\rangle$ with itself lives in a small-dimensional symmetric subspace with the result that $K(|x\rangle^{\otimes m})$ can be considerably below $mK(|x\rangle)$. This effect was first noticed in the context of qubit complexity [4], and it similarly holds for the $\underline{H}$ and $\overline{H}$ complexities in [8]. Define

is a pure -qubit quantum state

and write . The following theorem states that the-foldcopy of every -qubit pure quantum state has complexityat most about , and there is a pure quantumstate for which the complexity of the -fold copy achieves

.

Theorem 6 (Multiples):Assume the above terminology.

Proof: Recall the and complexities of purequantum states [8] mentioned in the Introduction. Denoteby and the maximal values of

and over all qubit states ,respectively. All of the following was shown in [8] (the notationas above and an arbitrary state, for example )

Combining these inequalities gives the theorem.
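The remark that the $m$-fold copy lives in a small-dimensional symmetric subspace can be made concrete with a short computation. It is a standard fact that the symmetric subspace of $m$ copies of an $N$-dimensional space has dimension $\binom{N + m - 1}{m}$; the sketch below compares the logarithm of that dimension with the naive qubit count $mn$ for $N = 2^n$. The function name `log2_sym_dim` is an illustrative assumption, and the sketch does not reproduce the additive lower-order terms of the theorem.

```python
import math

def log2_sym_dim(n_qubits, m):
    """log2 of C(N + m - 1, m), the dimension of the symmetric subspace
    of m copies of an N = 2^n dimensional space."""
    N = 2 ** n_qubits
    return math.log2(math.comb(N + m - 1, m))

n = 4
for m in (1, 2, 8, 64):
    # Compare with m * n, the number of qubits in m independent copies.
    print(m, round(log2_sym_dim(n, m), 2), m * n)
```

For a single copy the two quantities agree, while for many copies the logarithm of the symmetric dimension grows far more slowly than $mn$, which is the sublinear growth the text alludes to.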


The theorem gives a measure of how “clonable” individual $n$-qubit pure quantum states are, rather than indicating the average success of a fixed cloning algorithm for all $n$-qubit pure quantum states, as in the approximate and probabilistic cloning algorithms referred to above. In particular, it gives an upper bound on the nonclonability of every individual pure quantum state, and, moreover, it tells us that there exist individual pure quantum states that are quite nonclonable. One can view this as an application of quantum Kolmogorov complexity. The difference $K(|x\rangle^{\otimes m}) - K(|x\rangle)$ expresses the amount of extra information required for $m$ copies of $|x\rangle$ over that of one copy, in our particular meaning of (1).

F. Conditional Complexity and Cloning

In Definition 2, the conditional complexity $K(|x\rangle \mid y)$ is the minimum sum of the length of a classical program to compute $|z\rangle$ plus the negative logarithm of the probability of outcome $|x\rangle$ when executing projection $|x\rangle\langle x|$ on $|z\rangle$ and measuring, given $y$ as input on an auxiliary input tape. In case $y$ is a classical object, a finite binary string, there is no problem with this definition. The situation is more complicated if instead of a classical “$y$” we consider the pure quantum state $|y\rangle$ as input on an auxiliary “quantum” input tape. In the quantum situation, the notion of inputs consisting of pure quantum states is subject to very special rules.

First, if we are given an unknown pure quantum state $|y\rangle$ as input it can be used only once, that is, it is irrevocably consumed and lost in the computation. It cannot be perfectly copied or cloned without destroying the original as discussed above. This means that there is a profound difference between representing a directly computable pure quantum state on the auxiliary tape as a classical program or giving it literally. Given as a classical program, we can prepare and use arbitrarily many copies of it. Given as an (unknown) pure quantum state in superposition, it can be used as perfect input to a computation only once. Thus, the manner in which the conditional information is provided may make a big difference. A classical program for computing a directly computable quantum state carries more information than the directly computable quantum state itself, much like a shortest program for a classical object carries more information than the object itself. In the latter case, it consists of partial information about the halting problem. In the quantum case of a directly computable pure state we have the additional information that the state is directly computable and in case of a shortest classical program additional information about the halting problem. Thus, for classical objects we have in contrast to the following.

Theorem 7 (Cloning):For every pure quantum state andevery , we have

(5)

Moreover, for every there exists an-qubit pure quantum state, such that for every , we have

(6)

Proof: Equation (5) is obvious. Equation (6) follows fromTheorem 6.

This holds even if is directly computable but is given inthe conditional in the form of an unknown pure quantum state.The lemma quantifies the “no-cloning” property of an individualpure quantum state . Given and the task to obtaincopies of , we require at least th of the information to ob-tain copies of —everything in the sense of quantumKolmogorov complexity (1). However, if is directly com-putable and the conditional is a classical program to computethis directly computable state, then that program can be usedover and over again, just like in the case of classical objects.

Lemma 2: For every directly computable pure quantum statecomputed by a classical program, and every

(7)

G. Subadditivity

Let and . Recall the following notation. Ifis a pure quantum state in -dimensional Hilbert space of

qubits, and is a pure quantum state in -di-mensional Hilbert space of qubits, then

is a pure quantum state in the -dimensional Hilbert spaceconsisting of the tensor product of the two initial spaces con-sisting of qubits.

In the classical Kolmogorov complexity case we have

for every pair of individual finite binary strings and(the analog of the similar familiar relation that holds amongentropies—a stochastic notion—in Shannon’s informationtheory). The second inequality is thesubadditivityproperty ofclassical Kolmogorov complexity. Obviously, in the quantum

setting also for every pair of individualpure quantum states . Below, we shall show that thesubadditive property doesnot hold for quantum Kolmogorovcomplexity. But in the restricted case of directly computablepure quantum states in simple orthonormal bases quantumKolmogorov complexity is subadditive, just like classicalKolmogorov complexity.

Lemma 3: For directly computable both of whichbelong to (possibly different) orthonormal bases of Kolmogorovcomplexity we have

up to an additive constant term.Proof: By Theorem 2, there is a program to compute

with and, by a similar argument as was usedin the proof of Theorem 2, a program to compute from

with up to additive constants. Useto construct two copies of and to construct fromone of the copies of . The separation between these concate-nated binary programs is taken care of by the self-delimitingproperty of the subprograms. An additional constant term takescare of the couple of -bit programs that are required.

In the classical case, we have equality in the lemma (up to an additive logarithmic term). The proof of the remaining inequality, as given in the classical case, see [13], does not hold for the quantum case. It would require a decision procedure that establishes equality between two pure quantum states without error. It is unknown to the author whether some approximate decision rule would give some result along the required lines. We additionally note the following.

Lemma 4: For all directly computable pure states andwe have up to an additivelogarithmic term.

Proof: by the proof of The-orem 2. Then, the lemma follows by Lemma 3.

In contrast, quantum Kolmogorov complexity of arbitrary individual pure quantum states dramatically fails to be subadditive.

Theorem 8 (Non-Subadditivity):There are pure quantumstates of every length such that

Proof: Only the second inequality is nonobvious. Let

and let be a maximally complex classical-bit state. Then,. Hence the -bit program approxi-

mating by observing input , and outputting the resultingoutcome, demonstrates . Furthermore, isapproximated by with .Thus,

(the -term is due to the specification of the length of, and the term is due to the requirement of self-

delimiting coding). The lemma follows since .

Note that the witness states in the proof have

If we add the length in the qubit state in the conditional, thenthe upper bound reduces to , while the left-hand side in the

lemma stays . In the light of Theorem 2 (with substitutedin the conditional), this result indicates that statein the proof,although obviously directly computable, is not directly com-putable as an element from an orthonormal basis of low com-plexity. Every orthonormal basis, of which is a basis ele-ment, has complexity

The “no-cloning” or “approximate cloning” theorems in [21],[7], [10], [11], [16], [17] essentially show the following. Per-fect cloning is only possible if we measure according to anorthonormal basis of which one of the basis elements is thepure quantum state to be measured. Then, the measured purequantum state can be reproduced at will. Approximate cloningconsiders how to optimize measurements so that for a randompure quantum state (possibly from a restricted set) the repro-duced clone has on average optimal fidility with the original.Here we see that while the complexity of the orig-inal state in the proof above is , the complexity of an or-

thonormal bases of which it is a basis element can be (and usu-ally is in view of the incompressibility theorems) at leastforuniform at random chosen states—or every other complexityin between and by choice of . This gives a rigorous quan-tification of the quantum cloning fact that if we have full infor-mation to reproduce the basis of which the unknownindividualpure quantum state is a basis element, then the quantum Kol-mogorov complexity of that element is about zero—that is, wecan reproduce it at will.

It is easy to see that for the general case of pure states, an alternative demonstration of why the subadditivity property fails can be given by way of the “noncloning” property of Theorem 6.

Lemma 5: There are infinitely many and such that thereare pure -qubit states for which

where “ ” is meant in the sense of “.”Proof: With we have2

for with fixed. Substitution in Theorem 6 showsthat there exists a state such that (up to lower order additiveterms) and . So writing(again up to lower order additive terms)

we obtain , up to an additive lower order term, which,

with , can only hold for . Hence, for large

enough and , one of the inequalities in the above chainmust be false.

APPENDIX I
CLASSICAL KOLMOGOROV COMPLEXITY

It is useful to summarize the relevant parts and definitions of classical Kolmogorov complexity; see also [20], and the textbook [13]. The Kolmogorov complexity [12] of a finite object $x$ is the length of the shortest effective binary description of $x$. Let $x, y, z \in \mathcal{N}$, where $\mathcal{N}$ denotes the natural numbers and we identify $\mathcal{N}$ and $\{0,1\}^*$ according to the correspondence

$$(0, \epsilon), (1, 0), (2, 1), (3, 00), (4, 01), \ldots$$

Here, $\epsilon$ denotes the empty word with no letters. The length $l(x)$ of $x$ is the number of bits in the binary string $x$. For example, $l(010) = 3$ and $l(\epsilon) = 0$.

The emphasis is on binary sequences only for convenience; observations in every finite or countably infinite alphabet can be so encoded in a way that is “theory neutral.”

2 Use the following formula [13, p. 10]:

$$\log \binom{a}{b} = b \log \frac{a}{b} + (a - b) \log \frac{a}{a - b} + \frac{1}{2} \log \frac{a}{b(a - b)} + O(1).$$


A binary string is aproper prefixof a binary string if wecan write for . A set isprefix-freeif for every pair of distinct elements in the set neitheris a proper prefix of the other. A prefix-free set is also called aprefix code. Each binary string has a specialtype of prefix code, called aself-delimiting code

where if and otherwise. Thistakes care of all strings of length . The empty stringis encoded by . This code is self-delimiting because wecan determine where the codewordends by reading it fromleft to right without backing up. Using this code, we define thestandard self-delimiting code for to be . It is easyto check that and .
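A self-delimiting code of the kind described here can be sketched as follows. The exact codeword layout used below is an assumption chosen for illustration: `bar` writes the length in unary, then a `0` separator, then the string, and `prime` applies `bar` to the ordinary binary representation of the length (not the correspondence between numbers and strings fixed earlier in this appendix) before appending the string itself. The function names are hypothetical. The property that matters is the one stated in the text: the decoder reads left to right and knows where a codeword ends without backing up.

```python
def bar(x: str) -> str:
    """Assumed layout: len(x) ones, a 0 separator, then x itself."""
    return "1" * len(x) + "0" + x

def prime(x: str) -> str:
    """Self-delimiting code: bar(binary length of x) followed by x."""
    return bar(format(len(x), "b")) + x

def decode_prime(stream: str):
    """Read one prime-coded string from the front of `stream`.
    Returns (decoded string, remaining stream)."""
    k = stream.index("0")                 # leading ones = bits in the length field
    n = int(stream[k + 1 : 2 * k + 1], 2) # the length field itself
    start = 2 * k + 1
    return stream[start : start + n], stream[start + n :]

# Concatenated codewords can be separated again without any markers.
stream = prime("10110") + prime("") + prime("0")
first, stream = decode_prime(stream)
second, stream = decode_prime(stream)
third, stream = decode_prime(stream)
print(repr(first), repr(second), repr(third), repr(stream))
```

Under this layout `bar` costs about twice the length of the string, while `prime` costs the length of the string plus about twice the logarithm of that length, which is why the second form is preferred for long strings.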

Let be a standard one-one mapping from to ,for technical reasons chosen such that

An example is . This can be iterated to.

Let $T_1, T_2, \ldots$ be a standard enumeration of all Turing machines, and let $\phi_1, \phi_2, \ldots$ be the enumeration of corresponding functions which are computed by the respective Turing machines. That is, $T_i$ computes $\phi_i$. These functions are the partial recursive functions or computable functions. The conditional complexity of $x$ given $y$ with respect to a Turing machine $T$ is

$$C_T(x \mid y) = \min_{p \in \{0,1\}^*} \{ l(p) : T(\langle p, y\rangle) = x \}.$$

Theorem 9 (Invariance): There is a universal Turing machine $U$, such that for all machines $T$, there is a constant $c_T$ (the length of a self-delimiting description of the index of $T$ in the enumeration), such that for all $x$ and $y$ we have

$$C_U(x \mid y) \le C_T(x \mid y) + c_T.$$

Proof: Choose a universal Turing machine $U$ that expresses its universality in the following manner:

$$U(\langle \langle i, p\rangle, y\rangle) = T_i(\langle p, y\rangle)$$

for all $i$ and $\langle p, y\rangle$.

For every pair $U, U'$ of universal Turing machines for which the theorem holds, there is a fixed constant $c_{U,U'}$, depending only on $U$ and $U'$, such that for all $x, y$ we have

$$|C_U(x \mid y) - C_{U'}(x \mid y)| \le c_{U,U'}.$$

To see this, substitute $U'$ for $T$ in the theorem, and, conversely, substitute $U'$ for $U$ and $U$ for $T$ in the theorem, and combine the two resulting inequalities. While the complexities according to $U$ and $U'$ are not exactly equal, they are equal up to a fixed constant for all $x$ and $y$. Therefore, one or the other fixed choice of reference universal machine $U$ yields resulting complexities that are in a fixed constant envelope from each other for all arguments.

Definition 5: We fix $U$ as our reference universal computer and define the conditional Kolmogorov complexity of $x$ given $y$ by

$$C(x \mid y) = C_U(x \mid y).$$

The unconditional Kolmogorov complexity of $x$ is defined by $C(x) = C(x \mid \epsilon)$.

The Kolmogorov complexity $C(x)$ of $x$ is the length of the shortest binary program from which $x$ is computed. Though defined in terms of a particular machine model, the Kolmogorov complexity is machine-independent up to an additive constant and acquires an asymptotically universal and absolute character through Church’s thesis, from the ability of universal machines to simulate one another and execute every effective process. The Kolmogorov complexity of an object can be viewed as an absolute and objective quantification of the amount of information in it. This leads to a theory of absolute information contents of individual objects in contrast to classic information theory which deals with average information to communicate objects produced by a random source [13].

Incompressibility: Since there is a Turing machine, say $T_i$, that computes the identity function $T_i(x) = x$ for all $x$, it follows that $C(x) \le l(x) + c$ for fixed $c$ and all $x$.

It is easy to see that there are also strings that can be described by programs much shorter than themselves. For instance, the function defined by $f(1) = 2$ and $f(i) = 2^{f(i-1)}$ for $i > 1$ grows very fast, $f(k)$ is a “stack” of $k$ twos. Yet, for every $k$, it is clear that $f(k)$ has complexity at most $C(k) + O(1)$. What about incompressibility? For every $n$ there are $2^n$ binary strings of length $n$, but only $\sum_{i=0}^{n-1} 2^i = 2^n - 1$ descriptions in binary string format of length less than $n$. Therefore, there is at least one binary string $x$ of length $n$ such that $C(x) \ge n$. We call such strings incompressible. The same argument holds for conditional complexity: since for every length $n$ there are at most $2^n - 1$ binary programs of length less than $n$, for every binary string $y$ there is a binary string $x$ of length $n$ such that $C(x \mid y) \ge n$. “Randomness deficiency” measures how far the object falls short of the maximum possible Kolmogorov complexity. For every constant $\delta$ we say a string $x$ has randomness deficiency at most $\delta$ if $C(x) \ge l(x) - \delta$. Strings that are incompressible (say, with small randomness deficiency) are patternless, since a pattern could be used to reduce the description length. Intuitively, we think of such patternless sequences as being random, and we use “random sequence” synonymously with “incompressible sequence.” (It is possible to give a rigorous formalization of the intuitive notion of a random sequence as a sequence that passes all effective tests for randomness, see, for example, [13].)

Since there are few short programs, there can be only few objects of low complexity: the number of strings of length $n$ that have randomness deficiency at most $\delta$ is at least $2^n - 2^{n-\delta} + 1$. Hence, there is at least one string of length $n$ with randomness deficiency $0$, at least one-half of all strings of length $n$ have randomness deficiency $1$, at least three-fourths of all strings of length $n$ have randomness deficiency $2$, and at least the $(1 - 1/2^{\delta})$th part of all $2^n$ strings of length $n$ have randomness deficiency at most $\delta$.

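Lemma 6: Let $\delta$ be a positive integer. For every fixed $y$, every set $S$ of cardinality $m$ has at least $m(1 - 2^{-\delta}) + 1$ elements $x$ with $C(x \mid y) \ge \lfloor \log m \rfloor - \delta$.

Proof: There are

$$\sum_{i=0}^{n-1} 2^i = 2^n - 1$$

binary strings of length less than $n$. A fortiori there are at most $2^n - 1$ elements of $S$ that can be computed by binary programs of length less than $n$, given $y$. This implies that at least $m - 2^n + 1$ elements of $S$ cannot be computed by binary programs of length less than $n$, given $y$. Substituting $n$ by $\lfloor \log m \rfloor - \delta$ together with Definition 5 yields the lemma.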
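The counting argument can be checked directly for small parameters. The sketch below counts the binary strings of length less than a given bound and derives the resulting lower bound on the fraction of strings of a given length that cannot be compressed by more than a fixed number of bits. It relies only on the pigeonhole principle, not on any computation of Kolmogorov complexity, which is uncomputable; the function names are illustrative.

```python
def count_short_programs(n):
    """Number of binary strings of length less than n, i.e. 2^n - 1."""
    return sum(2 ** i for i in range(n))

def min_incompressible_fraction(n, c):
    """Lower bound on the fraction of n-bit strings x with C(x) >= n - c:
    at most 2^(n-c) - 1 strings can have a program shorter than n - c."""
    return 1 - count_short_programs(n - c) / 2 ** n

for c in (0, 1, 2, 8):
    print(c, min_incompressible_fraction(16, c))
```

Each extra bit of allowed compression halves the fraction of strings that can possibly benefit from it, which is the sense in which most strings are almost incompressible.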

If we are given $S$ as an explicit table then we can simply enumerate its elements (in, say, lexicographical order) using a fixed program not depending on $S$ or $y$. Such a fixed program can be given in $O(1)$ bits. Hence we can upper-bound the complexity as $C(x \mid S, y) \le \log |S| + O(1)$.

Incompressibility Method: One reason to formulate a notion of quantum Kolmogorov complexity, apart from its interpretation as the information in an individual quantum state, is the following. We hope to duplicate the success of the classical version as a proof method, the incompressibility method, in the theory of computation and combinatorics [13]. In a typical proof using the incompressibility method, one first chooses an incompressible object from the class under discussion. The argument invariably says that if a desired property does not hold, then in contrast with the assumption, the object can be compressed. This yields the required contradiction. Since most objects are almost incompressible, the desired property usually also holds for almost all objects, and hence on average. The hope is that one can use the quantum Kolmogorov complexity to show, for example, lower bounds on the complexity of quantum computations.

Prefix Kolmogorov Complexity: For technical reasons we also need a variant of complexity, so-called prefix complexity, associated with Turing machines for which the set of programs resulting in a halting computation is prefix-free. We can realize this by equipping the Turing machine with a read-only input tape which is read from left to right without backing up, a separate read/write work tape, an auxiliary read-only input tape, and a write-only output tape that is written from left to right without backing up. All tapes are one-way infinite. Such Turing machines are called prefix machines since the set of halting programs for such a machine forms a prefix-free set. Taking the universal prefix machine $U$ we can define the prefix complexity analogously with the plain Kolmogorov complexity. Let $x^*$ be the shortest program for $x$ that is enumerated first in a fixed general enumeration process (say, by dovetailing the running of all candidate programs) of all programs for which the reference universal prefix machine computes $x$. Then, the set

$$\{x^* : x \in \{0,1\}^*\}$$

is a prefix code. That is, if $x^*$ and $y^*$ are codewords for $x$ and $y$, respectively, with $x \ne y$, then $x^*$ is not a prefix of $y^*$.
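To make the prefix-free property concrete, here is a minimal sketch. It does not use shortest programs, which are not computable. Instead it assumes a simple self-delimiting code that writes $l(x)$ ones, a zero, and then $x$; the names `bar`, `read_bar`, and `is_prefix_free` are hypothetical.

```python
from itertools import product


def bar(x: str) -> str:
    """Self-delimiting code word for x: l(x) ones, a zero, then x itself."""
    return "1" * len(x) + "0" + x


def read_bar(stream: str):
    """Read one code word from the left; return (decoded string, unread rest)."""
    n = stream.index("0")                  # number of leading ones equals l(x)
    return stream[n + 1: 2 * n + 1], stream[2 * n + 1:]


def is_prefix_free(codewords) -> bool:
    """True if no code word is a proper prefix of another code word."""
    return not any(a != b and b.startswith(a) for a in codewords for b in codewords)


# All binary strings of length at most 3 and their code words.
words = ["".join(bits) for k in range(4) for bits in product("01", repeat=k)]
print(is_prefix_free([bar(w) for w in words]))
print(read_bar(bar("101") + bar("01")))
```

Because no code word is a prefix of another, a machine reading from left to right without backing up can tell where one code word ends and the next begins, which is what `read_bar` does.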

Let $\langle \cdot \rangle$ be a standard invertible effective one-to-one encoding from $\mathcal{N} \times \mathcal{N}$ to a prefix-free recursive subset of $\mathcal{N}$. For example, we can set $\langle x, y \rangle = x'y'$. We insist on a prefix-free subset and recursiveness because we want a universal Turing machine to be able to read an image under $\langle \cdot \rangle$ from left to right and determine where it ends. Let $T_1, T_2, \ldots$ be a standard enumeration of all prefix machines, and let $\phi_1, \phi_2, \ldots$ be the enumeration of corresponding functions that are computed: $T_i$ computes $\phi_i$. It is easy to see that (up to the prefix-free encoding) these functions are exactly the partial recursive functions or computable functions. The conditional complexity of $x$ given $y$ with respect to a prefix machine $T$ is

$$K_T(x \mid y) = \min_{p \in \{0,1\}^*} \{ l(p) : T(\langle p, y \rangle) = x \}.$$

Choose a universal prefix machine $U$ that expresses its universality in the following manner:

$$U(\langle \langle i, p \rangle, y \rangle) = T_i(\langle p, y \rangle)$$

for all $i$ and $\langle p, y \rangle$. Proving the Invariance Theorem for prefix machines proceeds by the same reasoning as before. Then, we can define the following.

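Definition 6: Fix a $U$ as above as our reference universal prefix computer, and define the conditional prefix complexity of $x$ given $y$ by

$$K(x \mid y) = \min_{p \in \{0,1\}^*} \{ l(p) : U(\langle p, y \rangle) = x \}.$$

The unconditional Kolmogorov complexity of $x$ is defined by $K(x) = K(x \mid \epsilon)$.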

Note that $K(x \mid y)$ can be slightly larger than $C(x \mid y)$, but for all $x, y$ we have

$$C(x \mid y) \le K(x \mid y) \le C(x \mid y) + 2 \log C(x \mid y).$$

For example, the incompressibility laws hold also for $K(x)$ but in slightly different form. The nice thing about $K(x)$ is that we can interpret $2^{-K(x)}$ as a probability distribution since $K(x)$ is the length of a shortest prefix-free program for $x$. By the fundamental Kraft's inequality, see for example [6], [13], we know that if $l_1, l_2, \ldots$ are the codeword lengths of a prefix code, then $\sum_x 2^{-l_x} \le 1$. This leads to the notion of the "universal distribution" $\mathbf{m}(x) = 2^{-K(x)}$ that assigns high probability to simple objects (that is, with low prefix complexity) and low probability to complex objects (that is, with high prefix complexity), a rigorous form of Occam's Razor.
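Kraft's inequality can be illustrated on a small code. The sketch below is only an illustration under stated assumptions: the prefix code `{0, 10, 110, 111}` and the function name `kraft_sum` are hypothetical, and the true prefix complexities are not computable, so the toy code stands in for the shortest programs.

```python
from fractions import Fraction


def kraft_sum(lengths):
    """Sum of 2^(-l) over the given codeword lengths, as an exact fraction."""
    return sum(Fraction(1, 2 ** l) for l in lengths)


# Hypothetical prefix code with codeword lengths 1, 2, 3, 3.
prefix_code = ["0", "10", "110", "111"]
total = kraft_sum(len(w) for w in prefix_code)

# The weights 2^(-length) behave like probabilities: shorter codewords,
# that is, simpler objects, receive larger weight.
weights = {w: Fraction(1, 2 ** len(w)) for w in prefix_code}
print(total, weights)
```

Lengths such as 1, 1, 2 give a sum larger than 1, so no prefix code can have them; this is the constraint that allows $2^{-K(x)}$ to be read as a probability.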

APPENDIX II
QUANTUM TURING MACHINES

We base quantum Kolmogorov complexity on quantum Turing machines. The simplest way to explain the idea of quantum computation is perhaps by way of probabilistic (randomized) computation. This we explain here. Then, the definition of the quantum (prefix) Turing machine is given in the main text in Section II.

A. Notation

For every $N$, the finite-dimensional Hilbert space $\mathcal{H}_N$ has a canonical basis $|e_0\rangle, \ldots, |e_{N-1}\rangle$. Assume that the canonical basis of $\mathcal{H}_N$ is also the beginning of the canonical basis of $\mathcal{H}_{N+1}$. The $n$-fold tensor product $\bigotimes^n \mathcal{H}_N$ of a Hilbert space $\mathcal{H}_N$ is denoted by $\mathcal{H}_N^n$.

A pure quantum state $\phi$ represented as a unit length vector in such a Hilbert space is denoted as $|\phi\rangle$ and the corresponding element of the dual space (the conjugate transpose) is written as $\langle \phi |$ or $|\phi\rangle^{\dagger}$. The inner product of $|\phi\rangle$ and $|\psi\rangle$ is written in physics notation as $\langle \phi | \psi \rangle$ and in mathematics notation as $(\phi, \psi)$. The "bra-ket" notation is due to Dirac and is the standard quantum-mechanics notation. The "bra" $\langle x |$ denotes a row vector with complex entries, and "ket" $|x\rangle$ is the column vector consisting of the conjugate transpose of $\langle x |$ (columns interchanged with rows and the imaginary part of the entries negated, that is, $\sqrt{-1}$ is replaced by $-\sqrt{-1}$).

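Of special importance is the two-dimensional Hilbert space $\mathcal{C}^2$, where $\mathcal{C}$ is the set of complex numbers, and $|0\rangle, |1\rangle$ is its canonical orthonormal basis. An element of $\mathcal{C}^2$ is called a qubit (quantum bit, in analogy with an element of $\{0,1\}$ which is called a bit for "binary digit"). To generalize this to strings of $n$ qubits, we consider the quantum state space $\mathcal{C}^N$ with $N = 2^n$. The basis vectors $e_0, \ldots, e_{N-1}$ of this space are parameterized by binary strings of length $n$, so that $e_0$ is shorthand for $e_{0 \ldots 0}$ and $e_{N-1}$ is shorthand for $e_{1 \ldots 1}$. Mathematically, $\mathcal{C}^N$ is decomposed into a tensor product of $n$ copies of $\mathcal{C}^2$, written as $\bigotimes^n \mathcal{C}^2$, and an $n$-qubit state $|a_1 \ldots a_n\rangle$ in bra-ket notation can also be written as the tensor product $|a_1\rangle \otimes \cdots \otimes |a_n\rangle$, or in shorthand as $|a_1\rangle \cdots |a_n\rangle$, a string of $n$ qubits, the qubits being distinguished by position.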

B. Probabilistic Computation

Consider the well-known probabilistic Turing machine which is just like an ordinary Turing machine, except that at each step the machine can make a probabilistic move which consists in flipping a (say fair) coin and depending on the outcome changing its state to either one of two alternatives. This means that at each such probabilistic move the computation of the machine splits into two distinct further computations each with probability $1/2$. Ignoring the deterministic computation steps, a computation involving $m$ coin flips can be viewed as a binary computation tree of depth $m$ with $2^m$ leaves, where the set of nodes at level $t \le m$ corresponds to the possible states of the system after $t$ coin flips, every state occurring with probability $1/2^t$. For convenience, we can label the edges connecting a state $x$ directly with a state $y$ with the probability that a state $x$ changes into state $y$ in a single coin flip (in this example all edges are labeled "$1/2$").

For instance, given an arbitrary Boolean formula containing $m$ variables, a probabilistic machine can flip its coin $m$ times to generate each of the $2^m$ possible truth assignments at the $m$-level nodes, and subsequently check in each node deterministically whether the local assignment makes the formula true. If there are $k$ distinct such assignments then the probabilistic machine finds that the formula is satisfiable with probability at least $k/2^m$, since there are $k$ distinct computation paths leading to a satisfying assignment.

Now suppose the probabilistic machine is hidden in a black box and the computation proceeds without us knowing the outcomes of the coin flips. Suppose that after $m$ coin flips we open part of the black box and observe the bit which denotes the truth assignment for variable $x_5$ ($m \ge 5$). Before we opened the black box all $2^m$ initial truth assignments to variables $x_1, \ldots, x_m$ were still equally possible, each with probability $1/2^m$. After we observed the state of variable $x_5$, say $0$, the probability space of possibilities has collapsed to the $2^{m-1}$ truth assignments which consist of all binary vectors with a $0$ in the fifth position, each of which has probability renormalized to $1/2^{m-1}$.
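The two observations above, the success probability of the coin-flipping machine and the renormalization after a partial observation, can be reproduced by brute-force enumeration. This is a minimal sketch under assumptions: the formula `(x1 or x2) and not x5` on five variables is a hypothetical example, not one taken from the text.

```python
from fractions import Fraction
from itertools import product


def formula(a):
    """Hypothetical example formula on m = 5 variables: (x1 or x2) and not x5."""
    x1, x2, x3, x4, x5 = a
    return bool((x1 or x2) and not x5)


m = 5
leaves = list(product([0, 1], repeat=m))        # one leaf per sequence of m coin flips
k = sum(1 for a in leaves if formula(a))        # number of satisfying assignments
p_success = Fraction(k, 2 ** m)                 # each leaf has probability 1 / 2^m

# Observing the fifth variable to be 0 keeps only the consistent leaves,
# and their probabilities are renormalized to sum to 1.
consistent = [a for a in leaves if a[4] == 0]
p_each = Fraction(1, len(consistent))

print(k, p_success, len(consistent), p_each)
```

Enumerating all leaves takes time exponential in the number of variables; the sketch is only meant to make the probability bookkeeping explicit.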

C. Quantum Computation

A quantum Turing machine can be viewed as a generalization of the probabilistic Turing machine. Consider the same computation tree. In the probabilistic computation, there is a probability $p_i \ge 0$ associated with each node $i$ (state of the system), such that $\sum p_i = 1$, summed over the nodes at the same level. In a quantum-mechanical computation there is a "probability amplitude" $\alpha_i$ associated with each basis state $|i\rangle$ of the system. Ignore for the moment the quantum equivalent of the probabilistic coin flip to produce the computation tree. Consider the simple case (corresponding to the probabilistic example of the states of the nodes at the $m$th level of the computation tree) where $i$ runs through the classical values $0 \ldots 0$ through $1 \ldots 1$, in the quantum case represented by the orthonormal basis of $m$-qubit states $|0 \ldots 0\rangle$ through $|1 \ldots 1\rangle$. Then, the nodes at level $m$ are in a superposition

$$|\psi\rangle = \sum_{i \in \{0,1\}^m} \alpha_i |i\rangle$$

with the probability amplitudes satisfying

$$\sum_{i \in \{0,1\}^m} |\alpha_i|^2 = 1.$$

The amplitudes are complex numbers satisfying $\sum |\alpha_i|^2 = 1$, where if $\alpha_i = a + b\sqrt{-1}$ then $|\alpha_i| = \sqrt{a^2 + b^2}$, and the summation is taken over all distinct states of the observable at a particular instant. We say "distinct" states since the quantum-mechanical calculus dictates that equal states are grouped together: if state $|x\rangle$ of probability amplitude $\alpha$ equals state $|y\rangle$ of probability amplitude $\beta$, then their combined contribution in the sum is $|\alpha + \beta|^2$. The computation steps are governed by a matrix $U$ which represents the program being executed. Such a program has to satisfy the following constraints. Denote the set of possible configurations of the Turing machine by $X$, where $X$ is the set of $m$-bit column vectors (the basis states) for simplicity. Then, $U$ maps the column vector $\alpha$ to $U\alpha$. Here $\alpha$ is a $2^m$-element complex vector of amplitudes of the quantum superposition of the $2^m$ basis states before the step, and $U\alpha$ the same after the step concerned. The special property which $U$ needs to satisfy in quantum mechanics is that it is unitary, that is, $U^{\dagger}U = I$, where $I$ is the identity matrix and $U^{\dagger}$ is the conjugate transpose of $U$ (as with the bra-ket, "conjugate" means that all $\sqrt{-1}$'s are replaced by $-\sqrt{-1}$'s and "transpose" means that the rows and columns are interchanged). In other words, $U$ is unitary iff $U^{\dagger} = U^{-1}$.

The unitary constraint on the evolution of the computation enforces two facts.

1) If $U^0\alpha = \alpha$ and $U^t\alpha = U U^{t-1}\alpha$ then

$$\sum_{x \in X} |(U^t\alpha)_x|^2 = 1$$

for all $t$ (discretizing time for convenience).

2) A quantum computation is reversible (replace $U$ by $U^{\dagger} = U^{-1}$). In particular this means that a computation $U^t\alpha = \beta$ is undone by running the computation stepwise in reverse: $(U^{\dagger})^t\beta = \alpha$.
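Both consequences of unitarity, preservation of total probability and reversibility, can be checked on a small example. This is a minimal sketch under assumptions: the one-qubit matrix `U` below (a rotation with a complex phase on the second row) and the start vector are hypothetical choices, and all helper names are illustrative.

```python
import cmath
import math


def matvec(U, v):
    """Apply matrix U (a list of rows) to the column vector v."""
    return [sum(u * x for u, x in zip(row, v)) for row in U]


def dagger(U):
    """Conjugate transpose of U."""
    return [[complex(U[r][c]).conjugate() for r in range(len(U))]
            for c in range(len(U[0]))]


def norm_sq(v):
    """Sum of the squared moduli of the amplitudes."""
    return sum(abs(x) ** 2 for x in v)


# Hypothetical one-qubit unitary: a rotation with a phase on the second row.
theta, phase = 0.3, cmath.exp(0.7j)
U = [[math.cos(theta), -math.sin(theta)],
     [phase * math.sin(theta), phase * math.cos(theta)]]

alpha = [0.6, 0.8j]                  # unit-length start vector
state = alpha
for _ in range(5):                   # five steps forward
    state = matvec(U, state)

back = state
for _ in range(5):                   # five steps with the conjugate transpose
    back = matvec(dagger(U), back)

print(norm_sq(state), back)
```

Up to floating-point rounding, the squared norm stays 1 after every step, and the reverse run recovers the start vector.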

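The quantum version of a single classical bit is a superposition of the two basis states

$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$$

where $|\alpha|^2 + |\beta|^2 = 1$. Such a state $|\psi\rangle$ is called a quantum bit or qubit. It consists in part of the basis state $|0\rangle$ and in part of the basis state $|1\rangle$. The states are denoted by the column vectors of the appropriate complex probability amplitudes. For the basis states the vector notations are $|0\rangle = \binom{1}{0}$ (that is, $\alpha = 1$ and $\beta = 0$) and $|1\rangle = \binom{0}{1}$ (that is, $\alpha = 0$ and $\beta = 1$). We also write $|\psi\rangle$ as the column vector $\psi = \binom{\alpha}{\beta}$.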

Physically, for example, the state $\psi$ can be the state of a polarized photon, and the basis states are either horizontal or vertical polarization, respectively. Upon measuring according to the basis states, that is, passing the photon through a medium that is polarized either in the horizontal or vertical orientation, the photon is observed with probability $|\alpha|^2$ or probability $|\beta|^2$, respectively. Consider a sample computation on a one-bit computer executing the unitary operator

$$S = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}. \qquad (8)$$

It is easy to verify, using common matrix calculation, that

$$S|0\rangle = \tfrac{1}{\sqrt{2}}|0\rangle - \tfrac{1}{\sqrt{2}}|1\rangle, \qquad S|1\rangle = \tfrac{1}{\sqrt{2}}|0\rangle + \tfrac{1}{\sqrt{2}}|1\rangle,$$

$$S^2|0\rangle = 0|0\rangle - 1|1\rangle, \qquad S^2|1\rangle = 1|0\rangle + 0|1\rangle.$$

If we observe the computer in state $S|0\rangle$, then the probability of observing state $|0\rangle$ is $(1/\sqrt{2})^2 = 1/2$, and the probability to observe $|1\rangle$ is $(-1/\sqrt{2})^2 = 1/2$. However, if we observe the computer in state $S^2|0\rangle$, then the probability of observing state $|0\rangle$ is $0$, and the probability to observe $|1\rangle$ is $1$. Similarly, if we observe the computer in state $S|1\rangle$, then the probability of observing state $|0\rangle$ is $(1/\sqrt{2})^2 = 1/2$, and the probability to observe $|1\rangle$ is $(1/\sqrt{2})^2 = 1/2$. If we observe the computer in state $S^2|1\rangle$, then the probability of observing state $|0\rangle$ is $1$, and the probability to observe $|1\rangle$ is $0$. Therefore, the operator $S$ inverts a bit when it is applied twice in a row, and hence has acquired the charming name square root of "not". In contrast, with the analogous probabilistic calculation, flipping a coin two times in a row, we would have found that the probability of each computation path in the complete binary computation tree of depth $2$ was $1/4$, and the states at the four leaves of the tree were $|0\rangle, |1\rangle, |0\rangle, |1\rangle$, resulting in a total probability of observing $|0\rangle$ being $1/2$, and the total probability of observing $|1\rangle$ being $1/2$ as well.
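The contrast between the quantum and the probabilistic calculation can be replayed numerically. The sketch below rests on an assumption: it takes the operator of (8) to be $S = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}$, one unitary matrix whose square inverts a bit; the sign convention is a guess, and the helper names are hypothetical.

```python
import math

r = 1 / math.sqrt(2)
S = [[r, r],
     [-r, r]]                 # assumed form of the operator in (8)


def apply(M, v):
    """Multiply the 2x2 matrix M with the column vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def probs(v):
    """Observation probabilities: squared moduli of the amplitudes."""
    return [abs(x) ** 2 for x in v]


ket0 = [1.0, 0.0]
once = apply(S, ket0)         # amplitudes (1/sqrt 2, -1/sqrt 2)
twice = apply(S, once)        # amplitudes (0, -1): the bit is inverted

# Classical analogue: a fair coin flip as a stochastic matrix acting on
# probabilities instead of amplitudes.
C = [[0.5, 0.5],
     [0.5, 0.5]]
coin_twice = apply(C, apply(C, ket0))

print(probs(once), probs(twice), coin_twice)
```

After one step both outcomes have probability about $1/2$ in either model. After two steps the amplitudes leading to $|0\rangle$ cancel in the quantum case, while the coin-flip probabilities remain $1/2$ each.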

The quantum principle involved in the above example is called interference, similar to the related light phenomenon in the seminal "two-slit experiment." If we put a screen with a single small enough hole between a light source and a target, then we observe a gradually dimming illumination of the target screen, the brightest spot being collinear with the light source and the hole. If we put a screen with two small holes in between, then we observe a diffraction pattern of bright and dark stripes due to interference. Namely, the light hits every point on the screen via two different routes (through the two different holes). If the two routes differ by an even number of half wavelengths, then the wave amplitudes at the target are added, resulting in twice the amplitude and a bright spot, and if they differ by an odd number of half wavelengths then the wave amplitudes are in opposite phase and are subtracted, resulting in zero and a dark spot. Similarly, with quantum computation, if the quantum state is $|\psi\rangle = \alpha|x\rangle + \beta|y\rangle$, then for $x = y$ we have a probability of observing $|x\rangle$ of $|\alpha + \beta|^2$, rather than $|\alpha|^2 + |\beta|^2$ which it would have been in a probabilistic fashion. For example, if $\alpha = 1/2$ and $\beta = -1/2$ then the probability of observing $|x\rangle$ is $0$ rather than $1/2$, and with the sign of $\beta$ inverted we observe $|x\rangle$ with probability $1$.

D. Quantum Algorithmics

A quantum algorithm corresponds to a unitary transformation $U$ that is built up from elementary unitary transformations, every one of which only acts on one or two qubits. The algorithm applies $U$ to an initial classical state containing the input and then makes a final measurement to extract the output from the final quantum state. The algorithm is "efficient" if the number of elementary operations is "small," which usually means at most polynomial in the length of the input. Quantum computers can do everything a classical computer can do probabilistically, and more.

We are now in the position to explain the quantum equivalent of a probabilistic coin flip as promised in Appendix II-C. This is a main trick enhancing the power of quantum computation. A sequence of $n$ fair coin flips "corresponds" to a sequence of $n$ one-qubit unitary operations, the Hadamard transform

$$H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$$

on the successive bits of a register of $n$ bits originally in the all-zero state $|0 \ldots 0\rangle$. The result is a superposition

$$\sum_{x \in \{0,1\}^n} 2^{-n/2} |x\rangle$$

of all the $2^n$ possible states of the register, each with amplitude $2^{-n/2}$ (and hence probability of being observed of $2^{-n}$).

The Hadamard transform is ubiquitous in quantum computing; its single-fold action is similar to that of the transform $S$ of (8) with the roles of "$1$" and "$-1$" partly interchanged. In contrast to $S^2$, which implements the logical "not," we have $H^2 = I$ with $I$ the identity matrix.
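The action of the Hadamard transform on a register can be traced on a small case. The following minimal sketch assumes the standard matrix $H = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$ and represents a register state as a dictionary from bit tuples to amplitudes; the helper name `apply_to_qubit` is hypothetical.

```python
import math

r = 1 / math.sqrt(2)
H = [[r, r],
     [r, -r]]


def apply_to_qubit(M, state, k):
    """Apply the one-qubit matrix M to qubit k (0 is the leftmost) of a
    register state given as a dict from bit tuples to amplitudes."""
    out = {}
    for bits, amp in state.items():
        for new in (0, 1):
            b = bits[:k] + (new,) + bits[k + 1:]
            out[b] = out.get(b, 0.0) + M[new][bits[k]] * amp
    return out


n = 3
state = {(0,) * n: 1.0}                  # register in the all-zero state
for k in range(n):                       # one Hadamard transform per qubit
    state = apply_to_qubit(H, state, k)

# Applying H twice to a single qubit gives back the original state.
single = apply_to_qubit(H, apply_to_qubit(H, {(0,): 1.0}, 0), 0)

print(len(state), sorted(state.values())[0], single)
```

For three qubits the result has eight basis states, each with amplitude $2^{-3/2}$, and the double application on one qubit returns amplitude 1 on $|0\rangle$ and 0 on $|1\rangle$ up to rounding.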


Subsequent to application of the Hadamard transforms, the computation proceeds in parallel along the exponentially many computation paths in quantum coherent superposition. A sequence of tricky further unitary operations, for example the "quantum Fourier transform," and observations serves to exploit interference (and so-called entanglement) phenomena to effect a high probability of eventually observing outcomes that allow us to determine the desired result, and to suppress the undesired spurious outcomes.

One principle that is used in many quantum algorithms is as follows. If $A$ is a classical algorithm for computing some function $f$, possibly even irreversible like $f(x) \equiv x \pmod 2$, then we can turn it into a unitary transformation $V$ which maps classical state $|x, 0\rangle$ to $|x, f(x)\rangle$. Note that we can apply $V$ to a superposition of all $2^n$ inputs simultaneously

$$V\left(2^{-n/2} \sum_x |x, 0\rangle\right) = 2^{-n/2} \sum_x |x, f(x)\rangle.$$

In some sense this state contains the results of computing $f$ for all possible inputs $x$, but we have only applied $V$ once to obtain it. This effect together with the interference phenomenon is responsible for one of the advantages of quantum over classical randomized computing and is called quantum parallelism.
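A small simulation shows how one application of such a transformation acts on all inputs at once. This is a sketch under assumptions: `f` is taken to be the parity of the last input bit, an assumed example of an irreversible function, and the transformation is realized as $|x, b\rangle \mapsto |x, b \oplus f(x)\rangle$, a common way to make it a permutation of basis states and hence unitary. The text itself only fixes the action on $|x, 0\rangle$.

```python
from itertools import product


def f(x):
    """Assumed example of an irreversible function: the last bit of x."""
    return x[-1]


def V(state):
    """Map each basis state |x, b> to |x, b XOR f(x)>, so that |x, 0> goes
    to |x, f(x)>. The state is a dict from (x, b) pairs to amplitudes."""
    return {(x, b ^ f(x)): amp for (x, b), amp in state.items()}


n = 3
amp = 2 ** (-n / 2)
inputs = {(x, 0): amp for x in product((0, 1), repeat=n)}   # all 2^n inputs
outputs = V(inputs)                                          # a single application

print(sorted(outputs))
```

The output dictionary holds one entry $(x, f(x))$ per input $x$. Observing the register would still reveal only one of these pairs, which is why interference is needed to extract useful information.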

This leaves the question of how the input to a computation is provided and how the output is obtained. Generally, we restrict ourselves to the case where the quantum computer has a classical input. If the input $x$ has $n$ bits, and the number of qubits used by the computation is $m \ge n$ (input plus work space), then we pad the input with nonsignificant $0$'s and start the quantum computation in an initial state $|x 0 \ldots 0\rangle$ (which must be in $\{0,1\}^m$). When the computation finishes the resulting state is a unit vector in $\mathcal{C}^{2^m}$, say $\sum_i \alpha_i |i\rangle$, where $i$ runs through $\{0,1\}^m$ and the probability amplitudes $\alpha_i$ satisfy $\sum_i |\alpha_i|^2 = 1$. The output is obtained by performing a measurement with the basis vectors as possible outcomes. The observed output is probabilistic: we observe basis vector $|i\rangle$, that is, the $m$-bit string $i$, with probability $|\alpha_i|^2$.
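The final measurement can be simulated by sampling a basis state with probability equal to the squared modulus of its amplitude. This is a minimal sketch: the two-qubit final state and the function name `measure` are hypothetical, and the sampling uses Python's pseudorandom generator rather than any physical process.

```python
import random


def measure(amplitudes, rng=random):
    """Sample a basis label i with probability |alpha_i|^2.
    `amplitudes` maps basis labels (bit strings) to complex amplitudes."""
    labels = list(amplitudes)
    weights = [abs(amplitudes[i]) ** 2 for i in labels]
    return rng.choices(labels, weights=weights, k=1)[0]


# Hypothetical final state on m = 2 qubits; all squared moduli equal 1/4.
final = {"00": 0.5, "01": 0.5j, "10": -0.5, "11": 0.5}

rng = random.Random(0)                   # fixed seed for repeatability
counts = {i: 0 for i in final}
for _ in range(10_000):
    counts[measure(final, rng)] += 1

print(counts)
```

The phases of the amplitudes do not affect the outcome statistics of this single measurement; they matter only through interference in the unitary steps that precede it.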

ACKNOWLEDGMENT

The ideas presented in this paper were developed from 1995 through early 1998. Other interests prevented the author from earlier publication. The author wishes to thank H. Buhrman, R. Cleve, W. van Dam, P. Gács, B. Terhal, J. Tromp, R. de Wolf, and a referee for discussions and comments on quantum Kolmogorov complexity.

REFERENCES

[1] L. M. Adleman, J. Demarrais, and M.-D. A. Huang, “Quantum computability,” SIAM J. Comput., vol. 26, no. 5, pp. 1524–1540, 1997.

[2] C. H. Bennett and P. W. Shor, “Quantum information theory,” IEEE Trans. Inform. Theory, vol. 44, pp. 2724–2742, Oct. 1998.

[3] E. Bernstein and U. Vazirani, “Quantum complexity theory,” SIAM J. Comput., vol. 26, no. 5, pp. 1411–1473, 1997.

[4] A. Berthiaume, W. van Dam, and S. Laplante, “Quantum Kolmogorov complexity,” J. Comput. System Sci., to be published.

[5] J. H. Conway and N. J. A. Sloane, Sphere Packings, Lattices, and Groups, 3rd ed. New York: Springer-Verlag, 1998.

[6] T. M. Cover and J. A. Thomas, Elements of Information Theory. New York: Wiley, 1991.

[7] D. Dieks, “Communication by EPR devices,” Phys. Lett. A, vol. 92, pp. 271–272, 1982.

[8] P. Gács, “Quantum algorithmic entropy,” in Proc. 16th IEEE Conf. Computational Complexity, 2001, available at http://xxx.lanl.gov/abs/quant-ph/0011046. Also, J. Phys. A: Math. Gen., vol. 34, 2001.

[9] P. Gács, “The Boltzmann entropy and randomness tests,” in Proc. IEEE Physics and Computation Conf., 1994, pp. 209–216.

[10] N. Gisin and S. Massar, “Optimal quantum cloning machines,” Phys. Rev. Lett., vol. 79, pp. 2153–2156, 1997.

[11] N. Gisin, “Quantum cloning without signalling,” Phys. Lett. A, vol. 242, pp. 1–3, 1998.

[12] A. N. Kolmogorov, “Three approaches to the quantitative definition of information,” Probl. Inform. Transm., vol. 1, no. 1, pp. 1–7, 1965.

[13] M. Li and P. M. B. Vitányi, An Introduction to Kolmogorov Complexity and Its Applications, 2nd ed. New York: Springer-Verlag, 1997.

[14] P. Martin-Löf, “The definition of random sequences,” Inform. Contr., vol. 9, pp. 602–619, 1966.

[15] J. L. von Neumann, “Various techniques used in connection with random digits,” J. Res. Nat. Bur. Stand. Appl. Math. Ser., vol. 3, pp. 36–38, 1951.

[16] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information. Cambridge, U.K.: Cambridge Univ. Press, 2000.

[17] A. Peres, Quantum Theory: Concepts and Methods. Norwell, MA: Kluwer, 1995.

[18] P. W. Shor, “Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer,” SIAM J. Comput., vol. 26, no. 5, pp. 1484–1509, 1997.

[19] P. M. B. Vitányi, “Three approaches to the quantitative definition of information in an individual pure quantum state,” in Proc. 15th IEEE Conf. Computational Complexity, 2000, pp. 263–270.

[20] P. M. B. Vitányi and M. Li, “Minimum description length induction, Bayesianism, and Kolmogorov complexity,” IEEE Trans. Inform. Theory, vol. 46, pp. 446–464, Mar. 2000.

[21] W. K. Wootters and W. H. Zurek, “A single quantum cannot be cloned,” Nature, vol. 299, pp. 802–803, 1982.

[22] J. L. von Neumann, Collected Works, Vol. 1, A. H. Taub, Ed. Oxford, U.K.: Pergamon, 1963, pp. 768–770.
