

Computing over the Reals: Foundations for Scientific Computing
Mark Braverman and Stephen Cook

318 NOTICES OF THE AMS VOLUME 53, NUMBER 3

Introduction

The problems of scientific computing often arise from the study of continuous processes, and questions of computability and complexity over the reals are of central importance in laying the foundations for the subject. The first step is defining a suitable computational model for functions over the reals.

Computability and complexity over discrete spaces have been very well studied since the 1930s. Different approaches have been proved to yield equivalent definitions of computability and nearly equivalent definitions of complexity. From the tradition of formal logic we have the notions of recursiveness and the Turing machine. From computational complexity we have Turing machine variants and abstract Random Access Machines (RAMs). All of these converge to define the same well-accepted notion of computability. The Church-Turing thesis asserts that this formal notion of computability is broad enough, at least in the discrete setting, to include all functions that could reasonably be construed to be computable.

In the continuous setting, where the objects are numbers in R, computability and complexity have received less attention, and there is no one accepted computation model. Alan Turing defined the notion of a single computable real number in his landmark 1936 paper [Tur36]: a real number is computable if its decimal expansion can be computed in the discrete sense (i.e., output by some Turing machine). But he did not go on to define the notion of a computable real function.

There are now two main approaches to modeling computation with real number inputs. The first approach, which we call the bit model and which is the subject of this paper, reflects the fact that computers can store only finite approximations to real numbers. Roughly speaking, a real function f is computable in the bit model if there is an algorithm which, given a good rational approximation to x, finds a good rational approximation to f(x).

The second approach is the algebraic approach, which abstracts away the messiness of finite approximations and assumes that real numbers can be represented exactly and each arithmetic operation can be performed exactly in one step. The complexity of a computation is usually taken to be the number of arithmetic operations (for example, additions and multiplications) performed. The algebraic approach applies naturally to arbitrary rings and fields, although for modeling scientific computation the underlying structure is usually R or C. Algebraic complexity theory goes back to the 1950s (see [BM75, BCS97] for surveys).

For scientific computing the most influential model in the algebraic setting is due to Blum, Shub, and Smale (BSS) [BSS89]. The model and its theory are thoroughly developed in the book [BCSS98]

Mark Braverman is a Ph.D. student in the Department of Computer Science, University of Toronto. His email address is [email protected]. Partially supported by an NSERC Postgraduate Scholarship.

Stephen Cook is Distinguished University Professor in the Department of Computer Science, University of Toronto. His email address is [email protected]. Partially supported by an NSERC Discovery Grant.


MARCH 2006 NOTICES OF THE AMS 319

(see also the article [Blum04] in the Notices for an exposition). In the BSS model, the computer has registers which can hold arbitrary elements of the underlying ring R. Computer programs perform exact arithmetic (+, −, ·, and ÷ in the case R is a field) and can branch on conditions based on exact comparisons between ring elements (=, and also <, in the case of an ordered field). Newton's method, for example, can be nicely presented in the BSS model as a program (which may not halt) for finding an approximate zero of a rational function, when R is the field R of real numbers. A nice feature of the BSS model is its generality: the underlying ring R is arbitrary, and the resulting computability theory can be studied for each R. In particular, when R = Z_2, the model is equivalent to the standard bit computer model in the discrete setting.

One of the big successes of discrete computability theory is the ability to prove uncomputability results. The solution of Hilbert's 10th problem [Mat93] is a good example. The theorem states that there is no procedure (e.g., no Turing machine) which always correctly determines whether a given Diophantine equation has a solution. The result is convincing because of the general acceptance of the Church-Turing thesis.

A weakness of the BSS approach as a model of scientific computing is that uncomputability results do not correspond to computing practice in the case R = R. Since intermediate register values of a computation are rational functions of the inputs, it is not hard to see that simple transcendental functions such as e^x are not explicitly computable by a BSS machine. In the bit model these functions are computable, because the underlying philosophy is that the inputs and outputs to the computer are rational approximations to the actual real numbers they represent. The definition of computability in the BSS model might be modified to allow the program to approximate the exact output values, so that functions like e^x become computable. However, formulating a good general definition in the BSS model along these lines is not straightforward: see [Sma97] for an informal treatment and [Brv05] for a discussion and a possible formal model.

For uncomputability results, BSS theory concentrates on set decidability rather than function computation. A set C ⊆ R^n is decidable if some BSS computer program halts on each input x ∈ R^n and outputs either 1 or 0, depending on whether x ∈ C. Theorem 1 in [BCSS98] states that if C ⊆ R^n is decided by a BSS program over R then C is a countable disjoint union of semi-algebraic sets. A number of sets are proved undecidable as corollaries, including the Mandelbrot set and all nondegenerate Julia sets (Figure 2). However, again it is hard to interpret these undecidability results in terms of practical computing, because simple subsets of R^2 which can easily be “drawn”, such as the Koch snowflake and the graph of y = e^x (Figure 1), are undecidable in this sense [Brt03].

In the bit model there is a nice definition of decidability (bit-computability) for bounded subsets of R^n. For the case of R^2, the idea is that the set is bit-computable if some computer program can draw it on a computer screen. The program should be able to zoom in at any point in the set and draw a neighborhood of that point with arbitrarily fine detail. Such programs can be easily written for simple sets such as the Koch snowflake and the graph of the equation y = e^x, and more sophisticated programs can be written for many Julia sets (as will be seen below). A Google search on the World Wide Web turns up programs that apparently do the same for the Mandelbrot set. However, no one knows how accurate these programs really are. The bit-computability of the Mandelbrot set is an open question, although we will see later that it holds subject to a major conjecture. Because of the Church-Turing thesis, a proof of bit-uncomputability of the Mandelbrot set would carry some force: any program for any computing device

Figure 1. The Koch snowflake and the graph of the equation y = e^x.


purporting to draw it must draw it wrong, at least at some level of detail.

In the rest of the paper we concentrate on the bit model, because we believe that this is the most accurate abstraction of how computers are used to solve problems over the reals. The bit model is not widely appreciated in the mathematical community, perhaps because its principal references are not written to appeal to a wide audience. In contrast, the BSS model has received widespread attention, partly because its presentation [BSS89], and especially the excellent reference [BCSS98], not only present the model but provide a rigorous treatment of matters of interest in scientific computation. The authors deserve credit not just for presenting an elegant model, but also for arousing interest in foundational issues for numerical analysis and inspiring considerable research. Undoubtedly the concept of abstracting away round-off error in computations over the reals poses important natural questions. One example is linear programming: although polynomial-time algorithms are known for solving linear programming problems with rational inputs, these algorithms assume that the problem description size (in bits) is available as an input parameter. This is not the usual framework for either the bit model or the BSS model. The commonly used simplex algorithm can be neatly formalized in the BSS model, but it requires exponential time in the worst case. It would be very nice to find a polynomial-time BSS algorithm for linear programming. Such an algorithm would be called strongly polynomial-time in the field of linear programming, and its existence is an important open question.

Here is an outline of the rest of the paper. The next section gives examples of easy and hard computational problems over the reals. The following section motivates and defines the bit model, both for computing real functions and subsets of R^n. Computational complexity issues are discussed. After that we consider the computability and complexity of the Mandelbrot and Julia sets as a particular application of the definitions. Simple programs are available which seem to compute the Mandelbrot set, but they may draw pieces which should not be there. The bit-computability of the Mandelbrot set is open, but it is implied by a major conjecture. Many Julia sets are not only computable, but are efficiently computable (in polynomial time). On the other hand, some Julia sets are uncomputable in a fundamental sense. Finally, the last section discusses a fundamental question related to the Church-Turing thesis: are there physical systems that can compute functions which are uncomputable in the standard computer model?

Some of the material presented here is given in more detail in [Brv05]. See [Ko91] and [Wei00] for general references on bit-computability models.

Examples of “Easy” and “Hard” Problems

“Easy” Problems

We start with examples of problems over the reals that should be “easy” according to any reasonable model. The everyday operations, such as addition, subtraction, and multiplication, should be considered easy. More generally, any of the operations that can be found on a common calculator can be regarded as “easy”, at least in some reasonable region. Such functions include, among others, sin x, e^x, √x, and log x.

Functions with singularities, such as x ÷ y and tan x, are easily computable on any closed region which excludes the singularities. The computational problem usually gets harder as we approach the singularity point. For example, computing tan x becomes increasingly difficult as x tends to π/2, because the expression becomes increasingly sensitive in x.

Some basic numerical problems that are known to have efficient solutions should also be relatively “easy” in the model. This includes inverting a well-conditioned matrix A. A matrix is well conditioned in this setting if A^-1 does not vary too sharply under small perturbations of A.

Simple problems that arise naturally in the discrete setting should usually remain simple when passing to the continuous setting. This includes problems such as sorting a list of real numbers and finding lengths of shortest paths in a graph with real edge lengths.

When one considers subsets of R^2, a set should be considered “easy” if we can draw it quickly with arbitrarily high precision. Examples include simple “paintbrush” shapes, such as the disc B((0,0), 2) in R^2, as well as simple fractal sets, such as the Koch snowflake (Figure 1).

To summarize, the model should classify a problem as “easy” if there is an efficient algorithm to solve it in some practical sense. This algorithm may be inspired by a discrete algorithm, a numerical-analytic technique, or both.

“Hard” Problems

Naturally, the “hard” problems are the ones for which no efficient algorithm exists. For example, it is hard to compute the inverse A^-1 of a poorly conditioned matrix A. Note that even simple numerical problems, such as the division (x − 1) ÷ (y − 1), become increasingly difficult in the poorly conditioned case. It becomes increasingly hard to evaluate the latter expression as x and y approach 1.

Many problems that appear to be computationally hard arise while trying to model processes in nature. A famous example is the N-body problem, which simulates the motion of planets. An even harder example is solving the Navier-Stokes equations used in simulations for fluid mechanics. We


will return to questions of hardness in physical systems in the last section.

Other problems that should be hard are the natural extensions of very difficult discrete problems. Consider, for example, the Travelling Salesman Problem (TSP). In this problem we are given a graph G = (V, E) with costs c(e) associated with the edges e ∈ E. Our goal is to minimize the cost of a Hamiltonian cycle (a cycle which goes through each vertex exactly once) in G. This problem is widely believed not to have an efficient solution in the discrete case. In fact, it is NP-complete in this case [GJ79], and having a polynomial-time algorithm for it would imply that P = NP, which is believed to be unlikely. There is no reason to think that it should be any easier in the continuous setting (where the costs c(e) need not be integers) than in the discrete case.

The hardness of numerical problems may significantly vary with the domain of application. Consider for example the problem of computing the integral I(x) = ∫_0^x f(t) dt. While it is very easy to compute I(x) from f(x) in the case f is a polynomial, integrating a general polynomial-time computable Lipschitz function is as hard as counting the number of different shortest routes in the Travelling Salesman Problem. The latter problem is complete for a class called #P, which is believed to subsume NP.

Quasi-fractal Examples: The Mandelbrot and Julia Sets

The Mandelbrot and the Julia sets are common examples of computer-drawn sets. Beautiful high-resolution images have become available to us with the rapid development of computers. Amazingly, these extremely complex images arise from very simple discrete iterated processes on the complex plane C.

For a point c ∈ C, define f_c(z) = z² + c. The point c is said to be in the Mandelbrot set M if the iterated sequence c, f_c(c), f_c(f_c(c)), … does not diverge to ∞. While (we believe) very precise images of M can be generated on a computer, proving that these images approximate M would probably involve solving some difficult open questions about it.
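The familiar pictures of M come from escape-time tests along the following lines (a minimal sketch of our own, not the paper's algorithm; the bound max_iter is an arbitrary cutoff). Once the orbit leaves the disc of radius 2 it provably diverges, so a positive answer is certain, while a negative answer is only “has not escaped yet”, which is exactly why such images are unverified.

```python
def escapes(c: complex, max_iter: int = 200) -> bool:
    """True if the orbit of 0 under z -> z^2 + c provably leaves the
    disc of radius 2 (hence diverges) within max_iter steps.
    False is inconclusive: the point may still escape later."""
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > 2:
            return True
    return False
```

For instance, c = −1 yields the bounded orbit 0, −1, 0, −1, …, so it is reported as (apparently) in M, while c = 1 escapes after three steps.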

The family of Julia sets is parameterized by functions. In the simple case of quadratic functions f_c(z) = z² + c, the filled Julia set K_c is the set of points x such that the sequence x, f_c(x), f_c(f_c(x)), … does not diverge to ∞. The Julia set J_c is defined as the boundary of K_c. While many Julia sets, such as the ones in Figure 2, are quite easy to draw, there are

Figure 2. a: the Mandelbrot set; b-d: Julia sets with parameter values i, −1.57, and −0.15 + 0.7i, respectively.


explicit sets of which we simply cannot produce useful pictures.

As we see, it is not a priori clear whether these sets should be “easy” or “hard”. This gives rise to a series of questions:
• Is the Mandelbrot set computable?
• Which Julia sets are computable?
• Can the Mandelbrot set and its zoom-ins be drawn quickly on a computer?
• Which Julia sets and their zoom-ins can be drawn quickly on a computer?
These questions are meaningless unless we agree upon a model of computation for this setting. In the following sections we develop such a model, based on “drawability” on a computer.

The Bit Model

Bit Computability for Functions

The motivation behind the bit model of computation is idealizing the process of scientific computing. Consider, for example, the simple task of computing the function x → e^x for an x in the interval [−1, 1]. The most natural solution appears to be taking the Taylor series expansion around 0:

(1)   e^x = Σ_{k=0}^∞ x^k/k!.

Obviously in a practical computation we will only be able to add up a finite number of terms from (1). How many terms should we consider? By taking more terms, we can improve the precision of the computation. On the other hand, we pay the price of increasing the running time as we consider more terms. The answer is that we should take just enough terms to satisfy our precision needs.

Depending on the application, we might need the value of e^x within different degrees of precision. Suppose we are trying to compute it with a precision of 2^-n. That is, on an input x we need to output a y such that |e^x − y| < 2^-n. It suffices to take n + 1 terms from (1) to achieve this level of approximation. Indeed, assuming n ≥ 4, for any x ∈ [−1, 1],

  |e^x − Σ_{k=0}^n x^k/k!| = |Σ_{k=n+1}^∞ x^k/k!| ≤ Σ_{k=n+1}^∞ |x|^k/k! ≤ Σ_{k=n+1}^∞ 1/k! < Σ_{k=n+1}^∞ 1/2^{k+1} = 2^-(n+1).

In fact, a smaller number of terms would suffice for the precision 2^-n by itself. We take a portion of the series that yields error 2^-(n+1) < 2^-n, to allow room for computation (round-off) errors in the evaluation of the finite sum. All we have to do now is to evaluate the polynomial p_n(x) = Σ_{k=0}^n x^k/k! within an error of 2^-(n+1). To do this, we need to know x with a certain precision 2^-m. It is convenient to use dyadic rationals to express approximations to x and e^x, where the dyadic rationals form the set

  D = {m/2^n | m ∈ Z, n ∈ N}.

We can then take a dyadic q = r/2^s ∈ D such that |x − q| < 2^-m, and evaluate p_n(q) within an error of 2^-(n+2) using finite-precision dyadic arithmetic. This gives us an approximation y ∈ D of p_n(q) such that |y − p_n(q)| < 2^-(n+2). In our example, the assumption |x − q| < 2^-(n+4) guarantees that |p_n(x) − p_n(q)| < 2^-(n+2), and we can take m = n + 4. To summarize, we have

  |e^x − y| ≤ |e^x − p_n(x)| + |p_n(x) − p_n(q)| + |p_n(q) − y| < 2^-(n+1) + 2^-(n+2) + 2^-(n+2) = 2^-n,

and y is the answer we want. The running time of the computation is dominated by the time it takes to compute our approximation to p_n(q). Note that this entire computation is done over the dyadic rationals and can be implemented on a computer in time polynomial in n.
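The scheme above can be sketched in a few lines. In this illustration of ours, exact rational arithmetic (Python's Fraction) stands in for the finite-precision dyadic arithmetic of the text, so the only error left is the Taylor truncation bounded above.

```python
from fractions import Fraction

def exp_approx(q: Fraction, n: int) -> Fraction:
    """Approximate e^q within 2^-n for a rational q in [-1, 1] that already
    approximates the real input x. Sums the n+1 terms of (1) exactly, so
    (for n >= 4) the error is the truncation bound 2^-(n+1) < 2^-n."""
    total = Fraction(0)
    term = Fraction(1)               # x^k / k!, starting at k = 0
    for k in range(n + 1):
        total += term
        term = term * q / (k + 1)    # next Taylor term
    return total
```

Replacing Fraction by dyadic arithmetic truncated to n + 2 bits after the point recovers the 2^-(n+2) round-off budget of the text.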

To find the answer we only needed to know x within an error of 2^-(n+4). This is especially important when one tries to compose several computations. For example, to compute e^{e^x−1} within an error 2^-n on the appropriate interval, it suffices to know x within an error of 2^-(n+8).

While evaluating the function e^x is hardly a challenge for scientific computing, the process described above illustrates the main ideas in the bit model of computation. Below are the main points that we have seen in this example and that characterize the bit model for computing functions. Suppose we are trying to compute f : [a, b] → R^n. Denote the program computing f by M_f.
• The goal of M_f(x, n) is to compute f(x) within an error of 2^-n;
• M_f computes a precision parameter m; it needs to know x within an error of 2^-m to proceed with the computation;
• M_f receives a dyadic q = r/2^s such that |x − q| < 2^-m;
• M_f computes a dyadic y such that |f(x) − y| < 2^-n;
• the running time of M_f(x, n) is the time it takes to compute m plus the time it takes to compute y from q (both of which have finite representations by bits).
Note that the entire computation of M_f is performed only with finitely presented dyadic numbers. There is a nice way to present the computation using oracle terminology. An oracle for a real number x is a function φ : N → D such that for all n, |φ(n) − x| < 2^-n. Note that the q in the description above can be taken to be q = φ(m). Instead of querying the value of x once, we can allow M_f unlimited access to the oracle φ. The only


limitation would be that the time it takes M_f to read the output of φ(m) is at least m. The oracle can be thought of as a READx(m) instruction, which prompts the user to enter x with precision 2^-m. We emphasize the fact that x is presented to M_f as an oracle by writing M_f^φ(n) instead of M_f(x, n). Just as the quality of the answer of M_f(x, n) should not depend on the specific 2^-m-approximation q for x, M_f^φ(n) should output a 2^-n-approximation of f(x) for any valid oracle φ for x.

The running time T(n) of M_f^φ(n) is the worst-case time a computation on this machine can take with a valid input and precision n. If T(n) is bounded by some polynomial p(n), we say that M_f^φ works in polynomial time.

The output of M_f^φ(n) can itself be viewed as an oracle for f(x). This allows us to compose functions. For example, given a machine M_f^φ(n) for computing f(x) and a machine M_g^φ(n) for computing g(x), we can compute f ∘ g(x) by M_f^{M_g^φ}(n).

This is the bit-computability notion for functions.
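As a toy illustration of the oracle formalism (the machines and helper names here are ours, not from the paper), an oracle can be modeled as a function n ↦ q with |q − x| < 2^-n, and composition is literally feeding one machine's output, viewed as a function of the precision, to the other:

```python
from fractions import Fraction

def oracle_for(x) -> "callable":
    """Oracle for a rational x: it can answer every precision query exactly."""
    return lambda n: Fraction(x)

def M_double(phi, n: int) -> Fraction:
    """Machine for f(x) = 2x within 2^-n: |2q - 2x| < 2 * 2^-(n+1) = 2^-n."""
    return 2 * phi(n + 1)

def M_square(phi, n: int) -> Fraction:
    """Machine for g(x) = x^2 on [-1, 1] within 2^-n:
    |q^2 - x^2| = |q - x| * |q + x| < 2^-(n+2) * 4 = 2^-n."""
    q = phi(n + 2)
    return q * q

def compose(Mf, Mg):
    """f∘g: the output of Mg, as a function of the precision queried,
    is itself a valid oracle for g(x), and is fed straight into Mf."""
    return lambda phi, n: Mf(lambda m: Mg(phi, m), n)
```

Note that each machine chooses its own precision parameter m (here n + 1 and n + 2), exactly as in the summary above.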

Early work on the computability of real functions was done by Banach and Mazur in 1936-1939. Because of the Second World War the results were only published many years later [Maz63]. A definition which is equivalent to bit-computability was first proposed by Grzegorczyk [Grz55] and Lacombe [Lac55]. It has since been developed and generalized. More recent references on the subject include [Ko91], [PR89], and [Wei00]. Let us see some examples to illustrate this notion.

Examples of Bit-computability

We start with a family of the simplest possible functions, the constant functions. For a function f(x) = c, with c ∈ R a constant, we can completely ignore the input x. The complexity of computing f(x) with precision 2^-n is the complexity of computing the number c within this error. In the original work by Turing [Tur36], the motivation for introducing Turing machines was defining which numbers can be computed, and which cannot.

For example, the numbers 1/3 = (0.01010101…)_2 and √2 appear to be “easy”, with the latter being “harder” than the former. There are also easy algorithms for computing transcendental numbers such as π and e. But there are only countably many programs, hence all but countably many reals cannot be approximated by any of them. To give a specific example of a very hard number, consider some standard encoding of all the Diophantine equations, φ : {equations} → N. Let D = {φ(EQ) : EQ is a solvable equation} ⊂ N, and

  d = Σ_{n∈D} 4^-n.

Then computing d with arbitrarily high precision would allow us to decide whether φ(EQ) is in D for any specific Diophantine equation EQ. This

would contradict the solution to Hilbert's 10th Problem, which states that no such decision procedure can exist.

The following example will illustrate the bit-computability of a more general function.

Example: Consider the function f(x) = ∛(1 − x³) on the interval [0, 1]. It is a composition of two functions: g : x → 1 − x³ and h : x → ∛x, both on the interval [0, 1]. g is easier to compute in this case. It is computed the same way x → e^x was computed in the previous section. The function h(x) is slightly trickier. One possible approach is to use Newton's method to solve (approximately) the equation λ³ − x = 0 to obtain λ ≈ ∛x. The fact that g(x) and h(x) are computable is not surprising. In fact, both of these functions can be found on a common scientific calculator.
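The Newton's-method step for λ³ − x = 0 can be sketched as follows. The restriction to x ∈ [1/8, 1] is an assumption we add to keep the error analysis short (∛x has unbounded derivative near 0): there the root r = ∛x satisfies r ≥ 1/2, so λ − r ≤ (λ³ − x)/(3r²) ≤ (4/3)(λ³ − x), and a stopping tolerance of 2^-(n+1) on λ³ − x guarantees precision 2^-n.

```python
from fractions import Fraction

def cbrt(x: Fraction, n: int) -> Fraction:
    """Approximate the cube root of x in [1/8, 1] within 2^-n by Newton's
    method on lam^3 - x = 0. Starting at lam = 1 >= x^(1/3), the iterates
    decrease monotonically toward the root, so lam^3 - x >= 0 throughout."""
    assert Fraction(1, 8) <= x <= 1
    lam = Fraction(1)
    tol = Fraction(1, 2 ** (n + 1))
    while lam ** 3 - x >= tol:
        # Newton step: lam <- lam - (lam^3 - x) / (3 lam^2)
        lam = (2 * lam + x / (lam * lam)) / 3
    # here lam - x^(1/3) <= (4/3)(lam^3 - x) < (4/3) * 2^-(n+1) < 2^-n
    return lam
```

Quadratic convergence means the loop runs only O(log n) times, though a practical version would also truncate λ to a dyadic of bounded length each iteration, as in the e^x example.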

In general, all “calculator functions” are computable on a properly chosen domain. For example, x → 1/x is computable on any domain which excludes 0. We can bound the time required to compute the inverse if the domain is properly bounded away from 0. The only true limiting factor here is that computable functions as described above must be continuous on the domain of their application. This is because the value of f(x) we compute must be a good approximation for all points near x.

Theorem 1. Suppose that a function f : S → R^k is computable. Then f must be continuous on S.

This puts a limitation on the applicability of the computability notion above. While it is “good” at classifying continuous functions, it classifies all discontinuous functions, however simple, as being uncomputable. We will return to this point at the end of the section.

Bit-computability for Sets

Just as the bit-computability of functions formalizes finite-precision numerical computation, we would like the bit-computability of sets to formalize the process of generating images of subsets of R^k, such as the Mandelbrot and Julia sets which were discussed earlier.

An image is a collection of pixels, which can be taken to be small balls or cubes of size ε. ε can be seen as the resolution of the image: the smaller it is, the finer the image. The hardness of producing the image generally increases as ε gets smaller. A collection of pixels P is a good image of a bounded set S if the following conditions are fulfilled:
• P covers S. This ensures that we “see” the entire set S, and
• P is contained in the ε-neighborhood of S. This ensures that we don't get “irrelevant” components which are far from S.


It is convenient to take ε = 4 · 2^-n, our computation precision in this case.

Suppose now that we are trying to construct P as a union of pixels of radius 2^-n centered at grid points in (2^-(n+k) · Z)^k. The basic decision we have to make is whether or not to draw each particular pixel, so that the union P satisfies the conditions above. To ensure that P covers S, we include all the pixels that intersect S. To satisfy the second condition, we exclude all the pixels that are 2^-n-far from S. If neither of these conditions holds, we are in the “gray” area, and either including or excluding the pixel is fine. In other words, we should compute a function from the family

(2)   f(d, n) = 1 if B(d, 2^-n) ∩ S ≠ ∅;  0 if B(d, 2 · 2^-n) ∩ S = ∅;  0 or 1 otherwise,

for every point d ∈ (2^-(n+k) · Z)^k. Here f is computed in the classical discrete sense.
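For a concrete instance of the family (2), take S to be the closed disc B((0,0), 2) from the “easy” examples. The sketch below (our own illustration) resolves the gray area by drawing a pixel exactly when dist(d, S) ≤ (3/2) · 2^-n, a threshold that satisfies both clauses of (2) and reduces to one exact comparison of rationals:

```python
from fractions import Fraction

def draw_pixel(dx: Fraction, dy: Fraction, n: int) -> int:
    """Pixel decision from family (2) for S = B((0,0), 2).
    Draw iff |d| <= 2 + (3/2)*2^-n:
      - if B(d, 2^-n) meets S then |d| <= 2 + 2^-n, so we return 1 (clause 1);
      - if we return 1 then |d| < 2 + 2*2^-n, so B(d, 2*2^-n) meets S,
        and clause 2 does not force a 0."""
    r = Fraction(2) + Fraction(3, 2 ** (n + 1))   # 2 + (3/2) * 2^-n
    return 1 if dx * dx + dy * dy <= r * r else 0
```

Comparing squared norms avoids square roots entirely, so the computation stays within exact rational (in practice, dyadic) arithmetic.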

Example: A “simple” set, such as a point, line segment, circle, or ellipse, is computable if and only if all of its parameters are computable numbers. For example, a circle is computable if and only if the coordinates of its center and its radius are computable.

The way we have arrived at the definition of bit-computability might suggest that it is specifically tailored to computer-graphics needs and is not mathematically robust. This is not the case, as will be seen in the following theorem. Recall that the Hausdorff distance between bounded subsets of R^k is defined as

  d_H(S, T) = inf{d : S ⊂ B(T, d) and T ⊂ B(S, d)}.

We have the following.

Theorem 2. Let S ⊂ R^k be a bounded set. Then the following are equivalent.
1. A function from the family (2) is computable.
2. There is a program that for any ε > 0 gives an ε-approximation of S in the Hausdorff metric.
3. The distance function d_S(x) = inf{|x − y| : y ∈ S} is bit-computable.
Conditions 1 and 3 remain equivalent even if S is not bounded.

Example: The finite approximations K_i of the Koch snowflake are polygons that are obviously computable. The convergence K_i → K is uniform in the Hausdorff metric, so K can be approximated in the Hausdorff metric with any desired precision. Thus the Koch snowflake is bit-computable.
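The approximations K_i are easy to generate explicitly. The following sketch (our own, using complex numbers for points of the plane; the side on which the bump lands depends on the vertex orientation chosen) performs the classical edge-replacement step; since edge lengths shrink by a factor of 3 per step, d_H(K_i, K) → 0 geometrically.

```python
import cmath

def koch_step(pts):
    """One Koch refinement of a closed polygon given as complex vertices:
    each edge [a, b] is replaced by four edges with an equilateral bump,
    multiplying the vertex count by 4."""
    rot = cmath.exp(1j * cmath.pi / 3)          # rotation by 60 degrees
    out = []
    for a, b in zip(pts, pts[1:] + pts[:1]):    # edges of the closed polygon
        c, d = a + (b - a) / 3, a + 2 * (b - a) / 3
        out += [a, c, c + (d - c) * rot, d]
    return out

def koch(i):
    """The i-th polygonal approximation K_i, starting from a triangle."""
    pts = [0j, 1 + 0j, cmath.exp(1j * cmath.pi / 3)]
    for _ in range(i):
        pts = koch_step(pts)
    return pts
```

K_i has 3 · 4^i vertices, so the approximations grow exponentially in i, but for a fixed on-screen resolution only a bounded i is ever needed.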

The last characterization of set bit-computability in Theorem 2 connects the computability of sets and functions. There is another natural connection between the computability notions for these objects: through plots of a function's graph.

Theorem 3. Let D ⊂ Rk be a closed and bounded computable domain, and let f : D → R be a continuous function. Then f is computable as a function if and only if the graph Γf = {(x, f(x)) : x ∈ D} is computable as a set.

Example: Consider the set S = {(x, y) : x, y ∈ [0,1], x3 + y3 = 1}. It is the graph of the function f(x) = (1 − x3)1/3 on [0,1], which we have seen to be computable. By Theorem 3, S is a bit-computable set. This is despite the fact (pointed out in [Blum04]) that by the cubic case of Fermat's Last Theorem the only rational points in S are (0,1) and (1,0).

A more detailed discussion of bit-computability for sets can be found in [BW99, Wei00, Brv05].

Computational Complexity in the Bit Model

Since the basic object in the discussion above is a Turing machine, the computational complexity for the bit model follows naturally from the standard notions of computational complexity in the discrete setting. Basically, the time cost of a computation is the number of bit operations required.

For example, the time complexity Tπ(n) for computing the number π is the number of bit operations required to compute a 2−n-approximation of π. The time complexity Tex(n) of computing the exponential function x → ex on [−1,1] is the number of bit operations it takes to compute a 2−n-approximation of ex given an x ∈ [−1,1]. This running time is assessed at the worst possible admissible x. We have seen that Tex(n) is bounded by a polynomial in n, and it is not hard to see that the same holds for Tπ(n).
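To make the bit-model cost concrete, here is a sketch (our own, not from the paper) of a 2−n-approximator for ex on [−1,1] using exact rational arithmetic. The number of Taylor terms and the bit sizes of the rationals both grow polynomially in n, which is the shape of the polynomial bound on Tex(n):

```python
from fractions import Fraction

def exp_approx(x, n):
    """Return a rational within 2**-n of e**x, for rational x in [-1, 1].

    Sums the Taylor series with exact arithmetic; for |x| <= 1 the tail
    after the terms 0..k is at most 2/(k+1)!, so we stop once that bound
    drops below 2**-n.
    """
    x = Fraction(x)
    assert -1 <= x <= 1
    total = term = Fraction(1)   # term for k = 0
    k = 0
    tail_bound = Fraction(2)     # maintains the value 2/(k+1)!
    while tail_bound > Fraction(1, 2 ** n):
        k += 1
        term = term * x / k
        total += term
        tail_bound /= k + 1
    return total
```

The output is a single rational number; rounding it to n + 1 binary digits gives the dyadic approximation a Turing machine would print.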

This computational complexity notion can be used to assess the hardness of the different numerical-analytic problems arising in scientific computing. For example, the dependence of matrix-inversion hardness on the condition number of the matrix fits nicely into this setting.

Another example is a result by Schönhage [Sch82, Sch87] showing how the fundamental theorem of algebra can be implemented by a polynomial-time algorithm in the bit model. More precisely, he has shown how to solve the following problem in time O((n3 + n2 s) log3(ns)): given any polynomial P(z) = an zn + · · · + a0 with aj ∈ C and |P| = Σv |av| ≤ 1, and given any s ∈ N, find approximate linear factors Lj(z) = uj z + vj (1 ≤ j ≤ n) such that |P − L1 L2 · · · Ln| < 2−s holds.

Some of the early work regarding the computational complexity of operators such as differentiation and integration was done in [KF82]. A more detailed exposition of the results can be found in [Ko91].


MARCH 2006 NOTICES OF THE AMS 325

The complexity of computing a set is the time T(n) it takes to decide one pixel. More formally, it is the time required to compute a function from the family (2). Thus a set is polynomial-time computable if it takes time polynomial in n to decide one pixel of resolution 2−n.

To see why this is the “right” definition, suppose we are trying to draw a set S on a computer screen which has a 1000 × 1000 pixel resolution. A 2−n-zoomed-in picture of S has O(22n) pixels of size 2−n, and it would take time O(T(n) · 22n) to compute. This quantity is exponential in n, even if T(n) is bounded by a polynomial. But we are drawing S on a finite-resolution display, and we will need to draw only 1000 · 1000 = 106 pixels. Deciding these pixels requires O(106 · T(n)) = O(T(n)) steps. This running time is polynomial in n if and only if T(n) is polynomial. Hence T(n) reflects the “true” cost of zooming in when drawing S.

Beyond the Continuous Case

As we have seen earlier, the bit-model notion of computability is very intuitive for sets and for continuous functions. However, by Theorem 1 it completely excludes even the simplest discontinuous functions. For example, the step function can be defined by

(3)   sα(x) = 0   if x < α,
            = 1   if x ≥ α.

Consider s0, the simplest case, when α = 0. The function is bit-computable on any domain that excludes 0. One could make the argument that a physical device really cannot compute s0: there is no bound on the precision of x needed to compute s0(x) near 0. Hence no finite approximation of x suffices to compute s0 even within an error of 1/3.
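The argument can be made concrete by the standard adversary trick: a machine that has read x only to precision 2−m has seen data consistent with two inputs on which s0 disagrees, so its single answer is off by at least 1/2 on one of them. A small sketch (our own notation):

```python
m = 10
a = 0.0  # the approximation the machine has read, accurate to within 2**-m

# Two true inputs consistent with the same data a:
x_pos, x_neg = 0.0, -2.0 ** -(m + 1)

s0 = lambda x: 1 if x >= 0 else 0   # the step function at 0

# Both inputs are within 2**-m of a, yet s0 differs on them, so any
# output computed from a alone errs by at least 1/2 on one of them.
assert abs(a - x_pos) <= 2 ** -m and abs(a - x_neg) <= 2 ** -m
assert s0(x_pos) != s0(x_neg)
```

Since this holds for every m, no finite amount of input precision ever pins down the answer.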

On the other hand, one might want to include this function, and other simple functions, in the computable class, because the primary goal of this classification is to distinguish between “easy” and “hard” problems, and computing s0 does not look hard. If we were to allow the step functions, we would probably like sα to be computable if and only if α is a computable real. There are several different approaches one can take to the computability of discontinuous functions. We will mention only two here.

One possibility would be to say that a function is computable if for any input n we can approximate it correctly on a 1 − 1/n portion of the x's (in measure). It is not hard to see that under this definition computability of α implies computability of sα(x). Another approach is to say that a function is computable if we can plot its graph. By Theorem 3, this definition extends the standard bit-computability definition in the continuous case. Obviously it makes sα computable whenever α is computable.

How Hard Are the Mandelbrot and Julia Sets?

First let us consider questions of computability of the Mandelbrot set M. Despite the relatively simple process defining M, the set itself is extremely complex and exhibits different kinds of behavior as we zoom into it. In Figure 3 we see some of the variety of images arising in M.

The most common algorithm used to compute M is presented in Figure 4. The idea is to fix some number T, the number of steps we are willing to iterate. Then for every grid point c iterate fc(z) = z2 + c on c for at most T steps. If the orbit escapes B(0,2), we know that c ∉ M. Otherwise, we say that c ∈ M. This is equivalent to taking the inverse image of B(0,2) under the polynomial map fT(c) = (fc ◦ fc ◦ · · · ◦ fc)(c), where fc is applied T times. In Figure 4 (right) a few of these inverse images and their convergence to M are shown.
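In code, the escape-time test reads roughly as follows (a floating-point sketch of the algorithm of Figure 4; the function and parameter names are ours):

```python
def escapes(c, T, radius=2.0):
    """Iterate f_c(z) = z*z + c starting from z = c for at most T steps.
    True means the orbit left B(0, 2), so c is certainly outside M;
    False only means the orbit has not escaped yet, not that c is in M."""
    z = c
    for _ in range(T):
        if abs(z) > radius:
            return True
        z = z * z + c
    return False
```

Note the asymmetry: True answers are reliable, while False answers depend on the chosen T.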

One problem with this algorithm is that its analysis should take into account the roundoff error involved in the computation z ← z2 + c. But there are other problems as well. For example, we take an arbitrary grid point c to be the representative of an entire pixel. If c is not in M, we will miss this entire pixel even if part of it intersects M. This problem arises especially when we are trying to draw “thin” components of M, such as the one in the upper-right corner of Figure 3.

Perhaps a deeper objection to this algorithm is the fact that we do not have any estimate on the number of steps T(n) we need to take to make the picture 2−n-accurate, that is, a T(n) such that for any c which is 2−n-far from M, the orbit of c escapes in at most T(n) steps. In fact, no such estimates are known in general, and the question of their existence is equivalent to the bit-computability of M (cf. [Hert05]).

Some of the most fundamental properties of M remain open. For example, it is conjectured to be locally connected, but no proof is known so far.

Conjecture 4. The Mandelbrot set is locally connected.

When one looks at a picture of M, one sees a somewhat regular structure. There is a cardioidal component in the middle, a smaller round component to the left of it, and two even smaller components on the sides of the main cardioid. In fact, many of these components can be described combinatorially based on the limit behavior of the orbit of c. E.g., in the main cardioid, the orbit of c converges to an attracting point. These components are called hyperbolic components because they index the hyperbolic Julia sets that will be discussed below. Douady and Hubbard have shown that Conjecture 4 implies the density of hyperbolic components in M:

Conjecture 5. Hyperbolic components are dense in M.

Hertling [Hert05] has shown that Conjecture 5 implies the computability of M. There is a possibility that M is computable even without this conjecture holding, but it is hard to imagine such a construction without a much deeper understanding of the structure of M. Moreover, even if M is computable, questions surrounding its computational complexity remain wide open. As we will see, the situation is much clearer for most Julia sets.

A Julia set Jr is defined for every rational function r from the Riemann sphere into itself. Here we restrict our attention to Julia sets corresponding to quadratic polynomials of the form fc(z) = z2 + c. Recall that the filled Julia set Kc is the set of points x such that the sequence x, fc(x), fc(fc(x)), . . . does not diverge to ∞. The Julia set Jc = ∂Kc is the boundary of the filled Julia set.

For every parameter value c there is a different set Jc, so in total there are uncountably many Julia sets, and we cannot hope to have a machine computing Jc for each value of c. What we can hope for is a machine computing Jc when given access to the parameter c with arbitrarily high precision. The existence of such a machine and the amount of time the computation takes depend on the properties of the particular Julia set. An excellent exposition on the properties of Julia sets can be found in [Mil00].

Figure 3. A variety of different images arising in zoom-ins of the Mandelbrot set (in black).

Computationally, the “easiest” case is that of the hyperbolic Julia sets. These are the sets for which the orbit of the point 0 either diverges to ∞ or converges to an attracting orbit. Equivalently, these are the sets for which there is a smooth metric µ on a neighborhood N of Jc such that the map fc is strictly expanding on N in µ. Hence, points escape the neighborhood of Jc exponentially fast. That is, if d(Jc, x) > 2−n, then the orbit of x will escape some fixed neighborhood of Jc in O(n) steps. This gives us the divergence speed estimate we lacked in the computation of the Mandelbrot set M and shows that in this case Jc is computable in polynomial time. The set M can be viewed as the set of parameters c for which Jc is connected. The hyperbolic Julia sets correspond to the values of c which are either outside M or in the hyperbolic components inside M. It can be shown that if Conjecture 5 holds, all the points in the interior of M correspond to hyperbolic Julia sets as well. None of the points in ∂M correspond to hyperbolic Julia sets.
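For the simplest hyperbolic parameter, c = 0, where J0 is the unit circle, the O(n) escape speed is easy to observe directly (a numerical illustration of ours, not part of the proof):

```python
def escape_time(x, c, T, radius=2.0):
    """Number of iterations of z -> z*z + c (starting at z = x) before the
    orbit leaves B(0, radius), or None if it survives all T steps."""
    z = x
    for t in range(T):
        if abs(z) > radius:
            return t
        z = z * z + c
    return None

# For c = 0 the orbit of x is x**(2**t), so a starting point at distance
# 2**-n outside the unit circle J_0 crosses |z| = 2 after about n squarings.
```

A point at distance 2−20 from J0 escapes in roughly 20 steps, while points strictly inside the unit disk never escape.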

We have just seen that the most “common” Julia sets are computable relatively efficiently. These are the Julia sets that are usually drawn, such as the ones in Figure 2. This raises the question of whether all Julia sets are computable so efficiently. The answer is negative. In fact, it has been shown in [BY04] that there are some values of c for which Jc cannot be computed (even with oracle access to c). The construction is based on Julia sets with Siegel disks. A parameter c which “fools” all the machines attempting to compute Jc is constructed via a diagonalization argument similar to the one used in other noncomputability results. Thus, while “most” Julia sets are relatively easy to draw, there are some whose pictures we might never see.

Computational Hardness of Physical Systems and the Church-Turing Thesis

In the previous sections we have developed tools which allow us to discuss the complexity of computational problems in the continuous setting. As in the discrete case, the true hardness of problems depends on the belief that all physical computational devices have roughly the same computational power. In this section we present a connection between the tractability of physical systems in the bit model and the possibility of having computing devices more powerful than the standard computer. This provides further motivation for exploring the computability and computational complexity of physical problems in the bit model. The discussion is based in part on [Yao02].

The Church-Turing thesis (CT), in its common interpretation, is the belief that the Turing machine, which is computationally equivalent to the idealized common computer, is the most general model of computation. That is, if a function can be computed using any physical device, then it can be computed by a Turing machine.

Negative results in computability theory depend on the Church-Turing thesis to give them strong practical meaning. For example, by the negative solution of Hilbert's 10th Problem, Diophantine equations cannot in general be solved by a Turing machine. This implies that the problem cannot be solved on a standard computer, which is computationally equivalent to the Turing machine. We need the CT to assert that the problem of solving these equations cannot be solved on any physical device and thus is truly hard.

When we discuss the computational complexity of problems, we are interested not only in whether a function can be computed, but also in the time it would take to compute it. The Extended Church-Turing thesis (ECT) states that any physical system is roughly as efficient as a Turing machine. That is, if a physical system computes a function f in time T(n), then f can be computed by a Turing machine in time T(n)c for some constant c.

In recent years the ECT has been questioned, in particular by advances in the theory of quantum computation. In principle, if a quantum computer could be implemented, it would allow us to factor an integer N in time polynomial in log N [Shor97]. This would probably violate the ECT, since factoring is believed to require superpolynomial time on a classical computer. On the other hand, there is no apparent way in which quantum computation would violate the CT.

Figure 4. The naïve algorithm for computing M, and some of its outputs.

Let f be some uncomputable function. Suppose that we had a physical system A and two feasible translators φ and ψ, such that φ translates an input x to f into a state φ(x) of A. The evolution of A on σ0 = φ(x) should yield a state σT such that ψ(σT) = f(x) (Figure 5). This means that at least in principle we should be able to build a physical device which would allow us to compute an uncomputable function. To compute f on an input x, we translate x into a state σ0 = φ(x) of A. We then allow A to evolve from state σ0 to σT; this is the part of the computation that cannot be simulated by a computer. We translate σT to obtain the output f(x) = ψ(σT).

To make this scheme practical, we should require A to be robust around σ0 = φ(x), at least in some probabilistic sense. That is, for a small random perturbation σ0 + ε of σ0, ψ(σT) should be equal to f(x) with high probability.

It is apparent from this discussion that such an A should be hard to simulate numerically, for otherwise f would be computable via a numerical simulation of A. On the other hand, “hardness to simulate” does not immediately imply “computational hardness”. Consider for example a fair coin. It is virtually impossible to simulate a coin toss numerically due to the extreme sensitivity of the process to small changes in initial conditions. Despite its unpredictability, a fair coin toss cannot be used to compute “hard” functions because it lacks robustness. In fact, due to noise, for any initial conditions that put the coin far enough from the ground, we know the probability distribution of the outcome: 50% “heads” and 50% “tails”. Another example, where “theoretical hardness” of the wave equation does not immediately imply a violation of the CT, is presented in [WZ02].

This leads to a question that is closely related to the CT:

(∗) Is there a robust physical system that is hard to simulate numerically?

This is a question that can be formulated in the framework of bit-computability. Since the only numerical simulations a computer can perform are bit simulations, hardness of some robust system A for the bit model implies a positive answer to (∗). On the other hand, proving that all computationally hard systems are inherently highly unstable would yield a negative answer to this question.

Note that even if the answer to (∗) is affirmative and CT does not hold, so that there exists some physical device A that can compute an uncomputable function f, it does not follow that this device could serve some “useful” purpose. That is, it might be able to compute some meaningless function f, but none of the “interesting” undecidable problems, such as the Halting Problem or the solvability of Diophantine equations.

Acknowledgments

The authors are grateful to the following people for helpful comments on a preliminary version of this paper: Eric Allender, Lenore Blum, Bill Casselman, Peter Hertling, Ken Jackson, Charles Rackoff, Michael Shub, Klaus Weihrauch, and Michael Yampolsky. We would like to thank Philipp Hertel for supplying a program that was helpful in producing the images.

References

[Blum04] L. BLUM, Computing over the reals: Where Turing meets Newton, Notices Amer. Math. Soc. 51 (2004), 1024–1034.

[BSS89] L. BLUM, M. SHUB, and S. SMALE, On a theory of computation and complexity over the real numbers: NP-completeness, recursive functions and universal machines, Bull. Amer. Math. Soc. 21 (1989), 1–46.

[BCSS98] L. BLUM, F. CUCKER, M. SHUB, and S. SMALE, Complexity and Real Computation, Springer, New York, 1998.

[BM75] A. BORODIN and I. MUNRO, The Computational Complexity of Algebraic and Numeric Problems, Elsevier, New York, 1975.

Figure 5. Computing f using a “hard” physical device A (left); a fair coin cannot be considered a “hard” device.

[Brt03] V. BRATTKA, The emperor's new recursiveness: The epigraph of the exponential function in two models of computability, Words, Languages & Combinatorics III (Masami Ito and Teruo Imaoka, eds.), pp. 63–72, World Scientific Publishing, Singapore, 2003. ICWLC 2000, Kyoto, Japan, March 14–18, 2000.

[BW99] V. BRATTKA and K. WEIHRAUCH, Computability of subsets of Euclidean space I: Closed and compact subsets, Theoretical Computer Science 219 (1999), 65–93.

[BY04] M. BRAVERMAN and M. YAMPOLSKY, Non-computable Julia sets, J. Amer. Math. Soc., to appear.

[Brv05] M. BRAVERMAN, On the complexity of real functions, available from http://www.arxiv.org/abs/cs.CC/0502066. Also in Proc. of 46th IEEE FOCS, pp. 155–164, 2005.

[BCS97] P. BÜRGISSER, M. CLAUSEN, and M. A. SHOKROLLAHI, Algebraic Complexity Theory, Springer, New York, 1997.

[GJ79] M. R. GAREY and D. S. JOHNSON, Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman and Company, 1979.

[Grz55] A. GRZEGORCZYK, Computable functionals, Fund. Math. 42 (1955), 168–202.

[Hert05] P. HERTLING, Is the Mandelbrot set computable? Math. Logic Quart. 51(1) (2005), 5–18.

[KF82] K. KO and H. FRIEDMAN, Computational complexity of real functions, Theor. Comp. Sci. 20(3) (1982), 323–352.

[Ko91] K. KO, Complexity Theory of Real Functions, Birkhäuser, Boston, 1991.

[Lac55] D. LACOMBE, Extension de la notion de fonction récursive aux fonctions d'une ou plusieurs variables réelles, C. R. Acad. Sci. Paris 240 (1955), 2478–2480; 241 (1955), 13–14, 151–153.

[Mat93] Y. MATIYASEVICH, Hilbert's Tenth Problem, MIT Press, Cambridge, London, 1993.

[Maz63] S. MAZUR, Computable Analysis, Rozprawy Matematyczne, Vol. 33, Warsaw, 1963.

[Mil00] J. MILNOR, Dynamics in One Complex Variable: Introductory Lectures, second edition, Vieweg, 2000.

[PR89] M. B. POUR-EL and J. I. RICHARDS, Computability in Analysis and Physics, Perspectives in Mathematical Logic, Springer, Berlin, 1989.

[Sch82] A. SCHÖNHAGE, The fundamental theorem of algebra in terms of computational complexity, Technical report, Math. Institut der Univ. Tübingen, 1982.

[Sch87] ———, Equation solving in terms of computational complexity, Proceedings of the International Congress of Mathematicians, 1986 (A. Gleason, ed.), Amer. Math. Soc., 1987.

[Shor97] P. SHOR, Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer, SIAM J. Comput. 26 (1997), 1484–1509.

[Sma97] S. SMALE, Complexity theory and numerical analysis, Acta Numerica 6 (1997), 523–551.

[Tur36] A. M. TURING, On Computable Numbers, with an Application to the Entscheidungsproblem, Proc. London Math. Soc. (1936), 230–265.

[Wei00] K. WEIHRAUCH, Computable Analysis, Springer, Berlin, 2000.

[WZ02] K. WEIHRAUCH and N. ZHONG, Is wave propagation computable or can wave computers beat the Turing machine?, Proc. London Math. Soc. 85(3) (2002), 312–332.

[Yao02] A. YAO, Classical physics and the Church-Turing thesis, J. ACM 50 (2003), 100–105. Available from http://www.eccc.uni-trier.de/eccc-reports/2002/TR02-062/.