Cooking the Classics
IAN STEWART
In Chapter 17 of Mathematical Carnival [3] Martin Gardner tells us that “When a mathematical puzzle is found to contain a major flaw (when the answer is wrong, when there is no answer, or when, contrary to claims, there is more than one answer or a better answer) the puzzle is said to be ‘cooked’.” Gardner gives several examples, the simplest being a puzzle he himself set in a children’s book: in the array of numbers
    9 9 9
    5 5 5
    3 3 3
    1 1 1
circle six digits to make the total of circled numbers equal 21. This is impossible on grounds of parity. Gardner’s answer, in effect cooking his own puzzle, is to turn the page upside down and circle the three 6’s and the three 1’s that then appear. But a reader, Howard Wilkerson, circled each of the 3’s, one of the 1’s, and then drew a big circle round the other two 1’s (giving 11). This is better, since upside down 3’s and 5’s don’t look like digits.
Gardner calls this kind of cook a quibble-cook. It exploits an imprecise definition of the question to obtain an unexpected answer. Mathematics is well up to speed on precise definitions these days. Even so, the mathematics that we teach to our students, indeed tell to each other, is also sometimes open to cookery, especially some of our classic theorems, which occasionally have become clichés. Over the years, I’ve compiled a mental list of cooked classics: a few contentious, all open to debate, all a matter of taste. I am now taking the dangerous step of committing them to print. At the very least, they might be offered to students as exercises, or for class discussion, to avoid giving the impression that the classic proof is holy writ. Many of them have a dynamical systems flavour, even when the topic is number theory. A few don’t.
I’ll start with a couple of warm-up examples, which will be familiar to most of you.
Root Two is Irrational
The classic proof of the irrationality of $\sqrt{2}$ proceeds by contradiction, assuming that $\sqrt{2} = p/q$ in lowest terms, and deducing that both p and q must be even. The argument runs like this: $p^2 = 2q^2$ so p must be even, say p = 2k. But then $2k^2 = q^2$ so q must be even, which is a contradiction.
One problem with this proof is that ‘lowest terms’ involves existence and uniqueness of prime factorization, but that can be got round by defining ‘lowest terms’ to mean ‘minimise q’.
If you’re prepared to accept existence and uniqueness of prime factorization, then it’s more informative to observe that $p^2 = 2q^2$ has an even power of 2 on the left-hand side, but an odd power on the right. Better still, prove:
THEOREM 0.1 A rational number a is a perfect square if
and only if every prime occurs to an even power in the
factorization of a.
This does involve extending prime factorization to rationals, allowing negative powers, but that’s easy and the proof is trivial. I think this theorem puts the topic into an appropriate context, and makes the whole idea much clearer than a rather artificial argument tailored specifically to $\sqrt{2}$. Sometimes generalities are better than examples.
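The parity criterion of Theorem 0.1 can be tried out mechanically. The following is a minimal sketch, assuming the rational is positive and given as a Python `Fraction`; the helper names (`prime_exponents`, `rational_exponents`, `is_rational_square`) are illustrative, and trial division is used only because it is short.

```python
from fractions import Fraction

def prime_exponents(n):
    """Return {prime: exponent} for a positive integer n, by trial division."""
    exps = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            exps[d] = exps.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        exps[n] = exps.get(n, 0) + 1
    return exps

def rational_exponents(a):
    """Signed prime exponents of a positive rational (negative for the denominator)."""
    exps = prime_exponents(a.numerator)
    for prime, e in prime_exponents(a.denominator).items():
        exps[prime] = exps.get(prime, 0) - e
    return exps

def is_rational_square(a):
    """Assumption: only positive rationals are considered."""
    return a > 0 and all(e % 2 == 0 for e in rational_exponents(a).values())

print(is_rational_square(Fraction(9, 4)), is_rational_square(Fraction(2)))
```

Here 2 fails the test because the prime 2 occurs to the odd power 1, which is the whole content of the irrationality of $\sqrt{2}$ in this setting.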
However, if you don’t want to go that route, prime factorization can be eliminated completely by using what is in effect the original Greek proof, thereby ‘classic-ing the cook’:
Suppose that $\sqrt{2} = p/q$ where q is as small as possible. Then p > q and 2q > p. Since
$$\frac{2 - \sqrt{2}}{\sqrt{2} - 1} = \sqrt{2}$$
we have
$$\sqrt{2} = \frac{2 - p/q}{p/q - 1} = \frac{2q - p}{p - q}$$
which, since p - q < q, is a contradiction.
© 2011 Springer Science+Business Media, LLC, Volume 33, Number 1, 2011 61
The GCD Is a Z-Linear Combination
A basic property of the integers is:
THEOREM 0.2 If g = gcd(a, b) then there exist p, q such that g = pa + qb.
One favoured route to this result is to set up the Division Algorithm and the Euclidean Algorithm, and proceed inductively. This gets quite complicated.
An alternative is to use some ring theory, note that $\mathbb{Z}$ is a principal ideal domain, and consider the ideal generated by {a, b}. But this involves a fair amount of machinery.
However, a bare hands version of the PID proof is quick and simple, and avoids both algorithms:
Let k be the smallest positive integer of the form pa + qb. Clearly g divides k. I claim that k divides a. To see why, choose the largest m such that $mk \leq a$. (If you don’t like this step, choose the smallest m such that (m + 1)k > a.) If $mk \neq a$ then 0 < s = a - mk < k (or else we can replace m by m + 1). Now
$$k - s = pa + qb - s = pa + qb + km - a = pa + qb - a + m(pa + qb) = p'a + q'b$$
for suitable p', q', contrary to the definition of k. This contradiction proves that k divides a. Similarly, k divides b.
In fact, we can define the GCD this way, and prove existence alongside the ‘linear combination’ property.
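As a sanity check on this definition, the smallest positive value of pa + qb can be found by brute force and compared with the usual GCD. This is only a sketch: it assumes a and b are positive integers, the search box $|p|, |q| \leq \max(a, b)$ is an arbitrary choice that happens to be large enough, and the function name is hypothetical.

```python
from math import gcd

def smallest_positive_combination(a, b):
    """Search p, q in [-bound, bound] for the smallest positive value of p*a + q*b.

    Returns (k, p, q) with k = p*a + q*b. Assumes a, b are positive integers.
    """
    bound = max(a, b)
    best = None
    for p in range(-bound, bound + 1):
        for q in range(-bound, bound + 1):
            k = p * a + q * b
            if k > 0 and (best is None or k < best[0]):
                best = (k, p, q)
    return best

k, p, q = smallest_positive_combination(12, 42)
print(k, gcd(12, 42), p * 12 + q * 42)
```

The search is quadratic in the bound, so it is suitable only for small inputs; its purpose is to illustrate the definition, not to compute anything efficiently.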
The GCD as a Dynamical System
Let’s get more ambitious. I’ve always found the Euclidean algorithm slightly complicated: not to understand or perform, but to argue about theoretically. Expressing the GCD as an integer linear combination involves a complicated induction, working backward through the algorithm, and somehow the point gets lost.
Now, the place where recursion comes into its own in mathematics is dynamical systems. And it is straightforward to turn the Euclidean algorithm into a dynamical system, by doing something slightly more simple-minded. As a bonus, we don’t need the division algorithm.
Define a map $\phi : \mathbb{N} \times \mathbb{N} \to \mathbb{N} \times \mathbb{N}$ by
$$\phi(x, y) = (\max(x, y) - \min(x, y), \min(x, y))$$
I will prove that if $(a, b) \neq (0, 0)$, then after finitely many steps the iterates of (a, b) reach a fixed point (d, 0) where d is the GCD of a and b.
First, I establish several simple facts:
1) $\phi(x, y) = (0, 0)$ if and only if (x, y) = (0, 0).
2) Define ||(x, y)|| = x + y. Then $\|\phi(x, y)\| \leq \|(x, y)\|$ with equality if and only if x = 0 or y = 0.
3) The fixed points of $\phi$ are precisely the points (z, 0).
4) If $(x, y) \neq (0, 0)$ then $\phi$ preserves the GCD. That is, $\gcd(\phi(x, y)) = \gcd(x, y)$.
The proofs are trivial: I give the fourth, which is typical. I claim that z divides both x and y if and only if it divides the two components of $\phi(x, y)$. If z|x and z|y then z|max(x, y) and z|min(x, y). So z divides the two components of $\phi(x, y)$. If z|(max(x, y) - min(x, y)) and z|min(x, y) then z|(max(x, y) - min(x, y) + min(x, y)) = max(x, y). Therefore z divides both x and y.
THEOREM 0.3 Let $(a, b) \neq (0, 0)$. Then there exists $t \geq 1$ such that $\phi^t(a, b) = (d, 0)$, and then d = gcd(a, b).
PROOF. The norm ||(x, y)|| is a Liapunov function for $\phi$: that is, it decreases when $\phi$ is applied, strictly unless x = 0 or y = 0. Since $x, y \in \mathbb{N}$ there must exist some t such that $\|\phi^{t+1}(a, b)\| = \|\phi^t(a, b)\|$. Then $\phi^t(a, b) = (0, d)$ or (d, 0) for some d. Since $\phi(0, d) = (d, 0)$, we have $\phi^{t+1}(a, b) = (d, 0)$.
Since GCD is a conserved quantity for the dynamics, gcd(a, b) = gcd(d, 0) = d.
THEOREM 0.4 If d = gcd(a, b) then there exist p, q such that d = pa + qb.
PROOF. Let $X \subseteq \mathbb{N} \times \mathbb{N}$ consist of all pairs $(p_1 a + q_1 b, p_2 a + q_2 b)$ for $p_1, p_2, q_1, q_2 \in \mathbb{Z}$. It is trivial to prove that X is invariant under $\phi$: that is, $\phi(X) \subseteq X$.
Since $(a, b) \in X$, so is $\phi^t(a, b)$ for all $t \geq 0$. So $(d, 0) \in X$, implying that d = pa + qb for some p, q.
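The two theorems can be watched in action by iterating the map and carrying along the integer coefficients that express each component in terms of a and b, which is the invariance of X made explicit. This is a sketch under the assumption that a and b are natural numbers, not both zero; the function name and the bookkeeping tuples are illustrative choices.

```python
def gcd_with_coefficients(a, b):
    """Iterate phi(x, y) = (max - min, min) from (a, b) until reaching (d, 0).

    Invariant: x = cx[0]*a + cx[1]*b and y = cy[0]*a + cy[1]*b.
    Returns (d, p, q, steps) with d = p*a + q*b.
    """
    x, y = a, b
    cx, cy = (1, 0), (0, 1)
    steps = 0
    while y != 0:
        if x >= y:
            # phi(x, y) = (x - y, y): only the first component changes.
            x, cx = x - y, (cx[0] - cy[0], cx[1] - cy[1])
        else:
            # phi(x, y) = (y - x, x): the old first component moves to second place.
            x, y, cx, cy = y - x, x, (cy[0] - cx[0], cy[1] - cx[1]), cx
        steps += 1
    return x, cx[0], cx[1], steps

d, p, q, steps = gcd_with_coefficients(12, 42)
print(d, p, q, steps, p * 12 + q * 42)
```

For (12, 42) the orbit is (30, 12), (18, 12), (6, 12), (6, 6), (0, 6), (6, 0), so the fixed point is reached after six applications of the map.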
This is all rather cute, and it gets cuter. The map $\phi$ also has a scaling property:
$$\phi(ka, kb) = k\,\phi(a, b)$$
for $k \in \mathbb{N}$. So $\mathbb{N} \times \mathbb{N}$ is the disjoint union of subsets $X_k$, where
$$X_0 = \{(0, 0)\}, \qquad X_k = \{(a, b) : \gcd(a, b) = k\} \quad (k > 0)$$
AUTHOR
IAN STEWART, Emeritus Professor of Mathematics at Warwick University, is the author of many research papers and books for broad audiences. Currently he works on pattern formation and network dynamics. A Fellow of the Royal Society, his awards include the Royal Society’s Faraday Medal, the Gold Medal of the Institute for Mathematics and Its Applications, the Public Understanding of Science Award of AAAS, and the Zeeman Medal. His nonmathematical interests include science fiction, Egyptology, and geology.
Mathematics Institute
University of Warwick
Coventry, CV4 7AL
UK
e-mail: [email protected]
62 THE MATHEMATICAL INTELLIGENCER
Moreover, each $X_k$ is $\phi$-invariant, and (aside from k = 0) the dynamics of $\phi$ on $X_k$ is conjugate to that of $\phi$ on $X_1$ via the map $(a, b) \mapsto (ka, kb)$. So we can understand the dynamics of $\phi$ by restricting attention to $X_1$, the set of all pairs of coprime natural numbers.
Each such pair has a uniquely defined height, which is the smallest t for which $\phi^t(a, b) = (1, 0)$. And we can classify pairs by increasing height, using:
LEMMA 0.5 $\phi(a, b) = (c, d)$ if and only if (a, b) = (c + d, d) or (a, b) = (d, c + d). That is, $\phi^{-1}(c, d) = \{(c + d, d), (d, c + d)\}$.
We then find:
Height 0: (1,0)
Height 1: (0,1)
Height 2: (1,1)
Height 3: (2,1), (1,2)
Height 4: (3,1), (1,3), (3,2), (2,3)
Height 5: (4,1), (1,4), (4,3), (3,4), (5,2), (2,5), (5,3), (3,5)
So there are $2^{n-2}$ pairs of height n when $n \geq 2$.
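The list of heights can be regenerated from Lemma 0.5 by repeatedly taking preimages, starting from the fixed point (1, 0). The sketch below uses hypothetical helper names; `height` follows the definition in the text (the smallest t with $\phi^t(a, b) = (1, 0)$) and assumes its arguments are coprime, since otherwise the loop never terminates.

```python
def phi(a, b):
    return (max(a, b) - min(a, b), min(a, b))

def preimages(c, d):
    """Lemma 0.5: the pairs that phi sends to (c, d)."""
    return {(c + d, d), (d, c + d)}

def pairs_by_height(max_height):
    """levels[n] is the set of coprime pairs of height n, for n = 0..max_height."""
    levels = [{(1, 0)}]
    seen = {(1, 0)}
    for _ in range(max_height):
        nxt = set()
        for pair in levels[-1]:
            nxt |= preimages(*pair)
        nxt -= seen  # (1, 0) is its own preimage and must not be counted again
        seen |= nxt
        levels.append(nxt)
    return levels

def height(a, b):
    """Smallest t with phi^t(a, b) = (1, 0); assumes gcd(a, b) = 1."""
    t = 0
    while (a, b) != (1, 0):
        a, b = phi(a, b)
        t += 1
    return t

levels = pairs_by_height(6)
print([len(level) for level in levels])
```

The level sizes reproduce the count $2^{n-2}$ for $n \geq 2$, and `height` agrees with the level in which each pair appears.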
The matrix of heights is curious:
    1  2  3  4  5  6  7  8  9
    2  ∗  3  ∗  4  ∗  5  ∗  6
    3  3  ∗  4  4  ∗  5  5  ∗
    4  ∗  4  ∗  5  ∗  5  ∗  6
    5  4  4  5  ∗  6  5  5  6
    6  ∗  ∗  ∗  6  ∗  7  ∗  ∗
    7  5  5  5  5  7  ∗  8  6
    8  ∗  5  ∗  5  ∗  8  ∗  9
    9  6  ∗  6  6  ∗  6  9  ∗
Here the entry in row a, column b is the height of (a, b), and we have written ∗ whenever a, b are not coprime.
The reason for the ∗ symbols is that the entire infinite matrix has a recursive structure: the part marked with ∗ symbols repeats the same entries, but the row and column positions are scaled. (This is the decomposition into the $X_k$ mentioned earlier.)
The form of this matrix is not obvious, and it could be worth investigating.
The map $\phi$ is a formal version of the procedure known to the ancient Greeks as anthyphaeresis (Fowler [2]), in which squares are trimmed off a rectangle until the remaining part is too small, at which point smaller squares are trimmed, and so on. It is of course well known that this procedure is equivalent to the Euclidean algorithm, which in turn is equivalent to the continued fraction expansion. The Greek proof that $\sqrt{2}$ is irrational occupies similar territory.
The map $\phi$ makes sense on $\mathbb{R}^+ \times \mathbb{R}^+$, and is also related to the continued fraction of x/y or y/x. The norm remains a Liapunov function, so there are no periodic points. However, there are periodic points if we also rescale (x, y), and these occur when x/y is a quadratic irrational.
Euler’s Formula
The famous formula
$$e^{i\pi} = -1$$
is often presented as something of a mystery. We all know how to justify it, but the usual approaches lack motivation and present it as some sort of accident. There is at least one way to make it natural and inevitable, using differential equations. Quite a bit of machinery needs to be set up to do this, but it’s all good stuff in its own right.
Consider the ODE
$$\frac{dz}{dt} = iz$$
in the complex plane, with initial conditions z(0) = 1.
The solution is $z(t) = e^{it}$. (You can define the exponential this way.)
Now consider the geometrical meaning of the equation. Since iz is orthogonal to z, the tangent vector to a solution at a point z(t) is at right angles to the radius vector from 0 to z(t). It follows (and can be checked by a simple calculation: convert to polars) that
• The unit circle centre 0 through z(t) is invariant under the flow (dr/dt = 0)
• In polar coordinates, $d\theta/dt = 1$.
Therefore the unit circle is the trajectory of the solution when z(0) = 1, and t is arc length in radians, so the point z(t) moves at uniform speed along the circle.
The Greek definition of $\pi$ tells us that the circumference of the circle is $2\pi$, so halfway round occurs when $t = \pi$. But halfway round is the point z = -1. Hence $e^{i\pi} = -1$.
All the ingredients of this proof are well known and form part of various standard approaches to the trigonometric and exponential functions. But the overall package seems not to get much prominence. Its big advantage is to explain why circles (the definition of $\pi$) have anything to do with the exponential.
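The flow can also be followed numerically, without ever calling an exponential function. The sketch below integrates dz/dt = iz from z(0) = 1 with the classical fourth-order Runge-Kutta method; the choice of integrator, the step count, and the function name are assumptions made for illustration only.

```python
import math

def flow(t_end, steps=1000):
    """Integrate dz/dt = i*z from z(0) = 1 up to time t_end using RK4."""
    z = 1 + 0j
    h = t_end / steps

    def f(w):
        return 1j * w

    for _ in range(steps):
        k1 = f(z)
        k2 = f(z + h / 2 * k1)
        k3 = f(z + h / 2 * k2)
        k4 = f(z + h * k3)
        z += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return z

z_half_turn = flow(math.pi)
print(z_half_turn, abs(z_half_turn))
```

Up to discretisation error the point stays on the unit circle and arrives at -1 when t reaches $\pi$, which is the geometric content of the argument.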
Infinitude of Primes
Euclid’s proof that the number of primes exceeds any finite bound is wonderfully clever, but I’ve always felt that considering $p_1 \cdots p_k + 1$ is something of a rabbit out of a hat. What follows pretty much explains how the rabbit got into the hat.
As a warm-up, suppose that the only primes were 2, 3, and 5. Then a systematic list of all products of powers of these would yield all possible numbers. The list is most easily generated in non-numerical order, something like 1, 2, 3, 5, 2·2, 2·3, 2·5, 3·3, 3·5, 5·5, ….
We want to prove that something is missing, and a good way to do that is to count how many numbers this process yields, up to some limit N.
The number 1 occurs once.
Multiples of 2 occur $\lfloor N/2 \rfloor$ times.
Multiples of 3 occur $\lfloor N/3 \rfloor$ times.
Multiples of 5 occur $\lfloor N/5 \rfloor$ times.
However, we are overcounting since (for instance) multiples of 6 are multiples of 2 and multiples of 3. So we
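must subtract multiples of 2·3, which occur $\lfloor N/(2 \cdot 3) \rfloor$ times. But then... Well, you can see what’s coming. By the inclusion-exclusion principle the number of numbers from 1 to N (which of course is N) has to be
$$1 + \left\lfloor \frac{N}{2} \right\rfloor + \left\lfloor \frac{N}{3} \right\rfloor + \left\lfloor \frac{N}{5} \right\rfloor - \left\lfloor \frac{N}{2 \cdot 3} \right\rfloor - \left\lfloor \frac{N}{2 \cdot 5} \right\rfloor - \left\lfloor \frac{N}{3 \cdot 5} \right\rfloor + \left\lfloor \frac{N}{2 \cdot 3 \cdot 5} \right\rfloor$$
exactly.
Oh, but those floor functions are a pain. So let’s make N be exactly divisible by all the denominators; that is, N is a multiple of 2·3·5. Better still, why not set N = 2·3·5 = 30? Then the expression becomes
$$1 + 15 + 10 + 6 - 5 - 3 - 2 + 1 = 23$$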
which isn’t 30.
The generalization is now obvious. Suppose that $p_1, \ldots, p_k$ is the entire (finite) list of primes. Let $N = p_1 \cdots p_k$. The same argument using the inclusion-exclusion principle now implies that the number of numbers between 1 and N, namely N, satisfies
$$N = 1 + \sum_{i} \frac{N}{p_i} - \sum_{i,j} \frac{N}{p_i p_j} + \sum_{i,j,k} \frac{N}{p_i p_j p_k} - \cdots$$
where the subscripts in all sums are unequal. Therefore
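$$1 = N\left(1 - \sum_{i} \frac{1}{p_i} + \sum_{i,j} \frac{1}{p_i p_j} - \sum_{i,j,k} \frac{1}{p_i p_j p_k} + \cdots\right)$$
$$= N\left(1 - \frac{1}{p_1}\right) \cdots \left(1 - \frac{1}{p_k}\right)$$
$$= p_1\left(1 - \frac{1}{p_1}\right) \cdots p_k\left(1 - \frac{1}{p_k}\right)$$
$$= (p_1 - 1) \cdots (p_k - 1)$$
which is absurd unless k = 1 and $p_1 = 2$. But now 3 is missing from the list of primes.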
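The warm-up count with the pretend prime list 2, 3, 5 and N = 30 is easy to reproduce. The sketch below evaluates the inclusion-exclusion expression and, separately, lists the numbers that no prime on the list divides; the function names are illustrative and `math.prod` assumes Python 3.8 or later.

```python
from itertools import combinations
from math import prod

def count_by_inclusion_exclusion(primes, n):
    """1 plus the inclusion-exclusion count of numbers in 1..n divisible by a listed prime."""
    total = 1  # the number 1 occurs once
    for size in range(1, len(primes) + 1):
        sign = 1 if size % 2 == 1 else -1
        for subset in combinations(primes, size):
            total += sign * (n // prod(subset))
    return total

def missing_numbers(primes, n):
    """Numbers in 2..n that are not multiples of any prime on the list."""
    return [m for m in range(2, n + 1) if all(m % p != 0 for p in primes)]

primes = [2, 3, 5]
n = prod(primes)
print(count_by_inclusion_exclusion(primes, n), missing_numbers(primes, n))
```

The shortfall between the count and N is exactly the length of the list of missing numbers, each of which is a prime that the hypothetical list omitted.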
Having now realized that some numbers are missing, we quickly notice that an obvious missing number is $N - 1 = p_1 \cdots p_k - 1$. Having seen that, we reconstruct Euclid’s proof when we decide that a remainder of 1 is easier to explain than a remainder of -1.
There are, of course, innumerable proofs of the infinitude of primes. A very simple one, related to the above calculation, is to compute the Euler $\phi$-function of $N = p_1 \cdots p_k$. Supposing that every number between 2 and N - 1 is a multiple of some $p_i$, clearly $\phi(N) = 1$. On the other hand,
$$\phi(N) = \phi(p_1) \cdots \phi(p_k) = (p_1 - 1) \cdots (p_k - 1)$$
So all $p_i = 2$, and k = 1, as before.
Solution of Polynomials by Radicals
It might seem unlikely that such a tried and tested area as the solution of the quadratic, cubic, and quartic could have anything new to offer. However, some minor variations on the traditional themes are possible.
Quadratics
As a warm-up, I solve the general quadratic
$$x^2 + px + q = 0 \qquad (0.1)$$
by an unorthodox method. The idea is that when it comes to the crunch, the only quadratic we can factorise is $x^2 - k^2$. So we seek a, b such that
$$x^2 + px + q = (x + a)^2 - b^2$$
This leads directly to
$$2a = p$$
$$a^2 - b^2 = q$$
Therefore a = p/2, so $b^2 = p^2/4 - q$, so $b = \pm\sqrt{p^2/4 - q}$.
Now (0.1) becomes
$$0 = (x + a)^2 - b^2 = (x + a + b)(x + a - b)$$
so x = -a ± b. That is,
$$x = -\frac{p}{2} \pm \sqrt{p^2/4 - q}$$
which is the traditional formula. (What else?)
Quartics
If a trick works, use it again. Thus emboldened, I attempt the general quartic
$$x^4 + px^3 + qx^2 + rx + s = 0 \qquad (0.2)$$
by the same method. We seek a, b, c, d such that
$$x^4 + px^3 + qx^2 + rx + s = (x^2 + ax + b)^2 - (cx + d)^2$$
which leads to
$$2a = p$$
$$a^2 + 2b - c^2 = q$$
$$2ab - 2cd = r$$
$$b^2 - d^2 = s$$
Clearly we must set a = p/2. Set $b = (q + c^2 - p^2/4)/2$ to solve the second equation for b in terms of c, and
$$d = \frac{-r + 2ab}{2c} = \frac{-r + p(q + c^2 - p^2/4)/2}{2c} = \frac{-2r + p(q + c^2 - p^2/4)}{4c}$$
to solve the third equation for d in terms of c. Finally, substitute all of this into the fourth equation:
$$\left(\frac{q + c^2 - p^2/4}{2}\right)^2 - \left(\frac{-2r + p(q + c^2 - p^2/4)}{4c}\right)^2 - s = 0$$
and (with hindsight) multiply through by $c^2$ to remove denominators. This yields
$$0 = \frac{1}{4}c^6 + \left(\frac{q}{2} - \frac{3p^2}{16}\right)c^4 + \left(\frac{3p^4}{64} - \frac{p^2 q}{4} + \frac{q^2}{4} + \frac{pr}{4} - s\right)c^2 + \left(-\frac{p^6}{256} + \frac{p^4 q}{32} - \frac{p^2 q^2}{16} - \frac{p^3 r}{16} + \frac{pqr}{4} - \frac{r^2}{4}\right)$$
which (miraculously) is a cubic in $c^2$. (It is of course a variant of Ferrari’s resolvent cubic.)
Therefore we can solve quartics provided we can solve cubics.
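The resolvent can be checked in exact arithmetic by running the construction backwards: pick a, b, c, d, expand the difference of squares to obtain p, q, r, s, and confirm that $u = c^2$ is a root of the cubic. This is only a consistency check under those assumptions, with hypothetical function names, and not a quartic solver.

```python
from fractions import Fraction as F

def quartic_from_squares(a, b, c, d):
    """Coefficients (p, q, r, s) of (x^2 + a x + b)^2 - (c x + d)^2."""
    return (2 * a, a * a + 2 * b - c * c, 2 * a * b - 2 * c * d, b * b - d * d)

def resolvent(p, q, r, s, u):
    """Value of the resolvent cubic at u = c^2, computed exactly with fractions."""
    p, q, r, s, u = map(F, (p, q, r, s, u))
    return (u**3 / 4
            + (q / 2 - 3 * p**2 / 16) * u**2
            + (3 * p**4 / 64 - p**2 * q / 4 + q**2 / 4 + p * r / 4 - s) * u
            + (-p**6 / 256 + p**4 * q / 32 - p**2 * q**2 / 16
               - p**3 * r / 16 + p * q * r / 4 - r**2 / 4))

a, b, c, d = 1, 2, 3, 1
coefficients = quartic_from_squares(a, b, c, d)
print(coefficients, resolvent(*coefficients, c * c))
```

With a, b, c, d = 1, 2, 3, 1 the quartic is $x^4 + 2x^3 - 4x^2 - 2x + 3$, and the resolvent vanishes at u = 9 as it should.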
Cubics
The same trick seems not to work, partly because 3 is odd. A variant succeeds in reducing the cubic to... the same cubic. The classic trick is to reduce the equation to
$$x^3 + mx + n = 0$$
by translating x and then making a clever substitution. As a variation, don’t bother: consider
$$x^3 + px^2 + qx + r = 0$$
and substitute
$$x = az + b + cz^{-1}$$
(For motivation, consider the traditional x = t + 1/t for palindromic polynomials.) Then
$$x^3 + px^2 + qx + r = Az^3 + Bz^2 + Cz + D + Ez^{-1} + Fz^{-2} + Gz^{-3}$$
where
$$A = a^3$$
$$B = a^2(3b + p)$$
$$C = a(3b^2 + 3ac + 2bp + q)$$
$$D = b^3 + 6abc + b^2 p + 2acp + bq + r$$
$$E = c(3b^2 + 3ac + 2bp + q)$$
$$F = c^2(3b + p)$$
$$G = c^3$$
A fortuitous coincidence now smacks us between the eyes. Making two expressions 3b + p and $3b^2 + 3ac + 2bp + q$ vanish causes the four coefficients of $z^2, z, z^{-1}, z^{-2}$ to vanish. What luck! This happens when
$$a = \frac{p^2 - 3q}{9c}, \qquad b = -\frac{p}{3}$$
and c is a free parameter. (All we need is $9ac = p^2 - 3q$.) Now the cubic becomes
$$Az^3 + D + Gz^{-3} = 0$$
so
$$Az^6 + Dz^3 + G = 0$$
which is quadratic in $z^3$ so can be solved by radicals. Then substitute to get x.
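The substitution translates almost line by line into a numerical routine. The sketch below fixes the free parameter at c = 1, assumes $p^2 \neq 3q$ so that a is nonzero, and works in floating-point complex arithmetic; the function name and these choices are illustrative assumptions rather than part of the method as stated.

```python
import cmath

def solve_cubic(p, q, r, c=1.0):
    """Roots of x^3 + p x^2 + q x + r via x = a z + b + c / z.

    Assumes p*p != 3*q, so that a = (p^2 - 3q) / (9c) is nonzero.
    """
    b = -p / 3
    a = (p * p - 3 * q) / (9 * c)
    if a == 0:
        raise ValueError("degenerate case p^2 = 3q is not handled in this sketch")
    A = a**3
    D = b**3 + 6 * a * b * c + b * b * p + 2 * a * c * p + b * q + r
    G = c**3
    # Solve the quadratic A w^2 + D w + G = 0 for w = z^3.
    w = (-D + cmath.sqrt(D * D - 4 * A * G)) / (2 * A)
    z0 = w ** (1 / 3)                      # one complex cube root of w
    omega = cmath.exp(2j * cmath.pi / 3)   # primitive cube root of unity
    return [a * z + b + c / z for z in (z0, z0 * omega, z0 * omega**2)]

for root in solve_cubic(-6, 11, -6):
    print(root)
```

For $x^3 - 6x^2 + 11x - 6$ the three cube roots of $z^3$ produce the roots 1, 2, 3, up to rounding error in the imaginary parts.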
Quintics
Thanks to Abel, Galois, and their predecessors, we know there isn’t a formula. I’ve concocted a stripped-down proof using very little technical machinery, but it’s about ten pages long. It’s not so much a cook as an attempt to reverse-engineer what the algebraists from Legendre to Kronecker already knew, and reassemble the bits that are needed. It proves that $x^5 - 80x + 30 = 0$ can’t be solved by radicals. I may publish it elsewhere, once it’s been polished up.
Trisection of Angles
The usual proof that angles cannot be trisected (see for example [6]) relies on a cubic equation for $\cos(2\pi/9)$ and the multiplicativity of the degree of a field extension. Here’s an alternative using less machinery and a more natural setting.
Identify Euclid’s plane with the complex plane $\mathbb{C}$. Define $z \in \mathbb{C}$ to be constructible if it can be constructed from $\mathbb{Q} \subseteq \mathbb{R} \subseteq \mathbb{C}$ by ruler and compass. (Note that we don’t consider the real and imaginary parts separately.)
The usual coordinate calculations prove:
LEMMA 0.6 A point z is constructible if and only if there is a finite sequence of complex numbers $\alpha_1, \ldots, \alpha_k$ such that $\alpha_1^2 \in \mathbb{Q}$, $\alpha_j^2 \in \mathbb{Q}(\alpha_1, \ldots, \alpha_{j-1})$ for $j = 2, \ldots, k$, and $z \in \mathbb{Q}(\alpha_1, \ldots, \alpha_k)$.
Here, as usual, $\mathbb{Q}(\ldots)$ denotes the subfield of $\mathbb{C}$ generated by the contents of the parentheses.
Observe that if K is a subfield of $\mathbb{C}$ and $\alpha^2 \in K$, then
$$K(\alpha) = \{x + \alpha y : x, y \in K\}$$
I now prove:
THEOREM 0.7 The primitive 9th root of unity $\zeta = e^{2\pi i/9}$ is not constructible.
Since $\zeta$ trisects $\omega = e^{2\pi i/3}$ it follows that the angle $2\pi/3$ cannot be trisected using ruler and compass.
It remains to prove the theorem without using multiplicativity of the degree of a field extension.
PROOF. Assume for a contradiction that $\zeta$ is constructible. Define a tower of subfields
$$\mathbb{Q} = K_0 \subseteq \mathbb{Q}(\omega) = K_1 \subseteq K_2 \subseteq \cdots \subseteq K_s$$
such that $\zeta \in K_s$ and $K_j = K_{j-1}(\alpha_j)$ where $\alpha_j^2 \in K_{j-1}$, for $j = 2, \ldots, s$. Note that the same goes for j = 1 since $\omega = (-1 + i\sqrt{3})/2$ so $\mathbb{Q}(\omega) = \mathbb{Q}(i\sqrt{3})$, and $(i\sqrt{3})^2 = -3 \in \mathbb{Q}$. Such a tower exists if and only if $\zeta$ is constructible. Choose one for which s is minimal. Then
$$\zeta = a + b\sqrt{\beta}$$
where $a, b, \beta \in K_{s-1}$. Minimality implies that $b \neq 0$, whence also $a \neq 0$. But $\zeta^3 = \omega$, so
$$\omega = (a + b\sqrt{\beta})^3 = (a^3 + 3ab^2\beta) + (3a^2 b + b^3\beta)\sqrt{\beta}$$
If $3a^2 b + b^3\beta \neq 0$ then
$$\sqrt{\beta} = \frac{\omega - a^3 - 3ab^2\beta}{3a^2 b + b^3\beta}$$
which lies in $K_{s-1}$, contrary to minimality. Therefore $3a^2 b + b^3\beta = 0$, so
$$\omega = a^3 + 3ab^2\beta$$
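But now
$$(a - b\sqrt{\beta})^3 = (a^3 + 3ab^2\beta) - (3a^2 b + b^3\beta)\sqrt{\beta} = a^3 + 3ab^2\beta = \omega$$
The cube roots of $\omega$ are $\zeta, \omega\zeta, \omega^2\zeta$. Therefore
$$a - b\sqrt{\beta} = \omega^c \zeta$$
where c = 1 or c = 2. (We can’t have c = 0 since $b \neq 0$.) But we already know that
$$a + b\sqrt{\beta} = \zeta$$
Adding, we get $2a = (1 + \omega^c)\zeta$, and $1 + \omega^c \neq 0$, so
$$\zeta = \frac{2a}{1 + \omega^c}$$
which is in $K_{s-1}$, a contradiction.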
Two Squares Theorem
Fermat’s Two Squares Theorem, proved by Euler, states that any prime of the form 4k + 1 is a sum of two squares. There’s more, but this is the hard part. The traditional approaches either use quadratic residues or prove that some multiple of the prime is a sum of two squares and use descent.
The following proof must be well known to number-theorists, but I’ve not seen it in the texts. It is short, conceptual, and straightforward.
Recall that the Gaussian integers $\mathbb{Z}[i]$ comprise all complex numbers a + bi, where $a, b \in \mathbb{Z}$. There is a norm
$$N(a + bi) = a^2 + b^2$$
and this is multiplicative:
$$N(xy) = N(x)N(y)$$
The Gaussian integers form a unique factorisation domain: in fact the norm provides a Euclidean algorithm and this can be proved quickly by elementary means.
Let $p \in \mathbb{Z}$ be prime (in the usual sense). We claim that if $p \equiv 1 \pmod 4$ then p is not prime in $\mathbb{Z}[i]$. For a contradiction, suppose that p = 4k + 1 is a Gaussian prime. Then $\mathbb{Z}[i]/p$ is a field. It has a subfield $\mathbb{Z}/p$, which does not contain i, since if it did, i would be real, indeed in $\mathbb{Z}$. The multiplicative group of this subfield is cyclic of order 4k so has an element a of order 4. Now the quartic polynomial $t^4 - 1$ has at least five distinct zeros: $1, a, a^2, a^3$, and i. This is a contradiction.
Now $N(p) = p^2$, so multiplicativity of the norm implies that $p = q_1 q_2$ where $q_1, q_2$ are prime in $\mathbb{Z}[i]$. Since p is real, $q_2 = \bar{q}_1$. Let $q_1 = a + bi$. Then
$$p = (a + bi)(a - bi) = a^2 + b^2$$
Strictly speaking, we get this up to a unit, but the units are ±1, ±i. Since p and $a^2 + b^2$ are real and positive, the equation follows.
A tactical variant is to observe that $\mathbb{Z}[i]/p$ is $(\mathbb{Z}/p)[t]/\langle t^2 + 1 \rangle$, and $t^2 + 1$ is reducible (with a zero a) so the quotient cannot be a field. This is marginally more elegant but slightly less direct.
There is a nice analogue for $\mathbb{Z}[\omega]$ where $\omega$ is a cube root of unity, and now we prove that primes 6k + 1 are of the form $a^2 + b^2 - ab$, or equivalently $a^2 + 3b^2$.
Of course quadratic reciprocity gives far more, but that doesn’t count as ‘elementary’.
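Two ingredients of the argument are easy to see in small cases: the element of order 4 in the multiplicative group of $\mathbb{Z}/p$ (a square root of -1), and the resulting decomposition $p = a^2 + b^2$. The sketch below finds both by brute-force search, which is not the method of the proof; the function names are hypothetical and the input is assumed to be an odd prime.

```python
from math import isqrt

def sqrt_minus_one(p):
    """Smallest a in 2..p-1 with a*a = -1 (mod p), or None if there is none."""
    for a in range(2, p):
        if a * a % p == p - 1:
            return a
    return None

def two_squares(p):
    """Return (a, b) with a <= b and a*a + b*b == p, or None if no such pair exists."""
    for a in range(1, isqrt(p) + 1):
        b = isqrt(p - a * a)
        if b >= a and a * a + b * b == p:
            return (a, b)
    return None

for p in (5, 13, 17, 29, 7, 11):
    print(p, sqrt_minus_one(p), two_squares(p))
```

For the primes congruent to 1 modulo 4 both searches succeed, while for 7 and 11 both return `None`, in line with the theorem and with the fact that $t^2 + 1$ has no zero modulo those primes.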
‘Give Me a Place to Stand, and I Will Move the Earth’
So, famously, said Archimedes. I claim he already had a place to stand. This is a quibble-cook, I think. Archimedes didn’t get anything wrong. Just missed the point.
Archimedes was dramatizing the law of the lever, and what he had in mind was basically Figure 1. I don’t think he was interested in the position of the Earth in space, but he wanted the pivot point to be fixed, and in order to apply the law of the lever he needed uniform gravity, contrary to astronomical fact. He also needed a perfectly rigid lever of zero mass.
No matter. I don’t want to get into discussions about inertia or other quibbles. Let’s grant him all those things. My question is: when the Earth moves, how far does it move?
Assume Archimedes can exert a force sufficient to lift his own weight, say 100 kg. The mass of the Earth is about $6 \times 10^{24}$ kg. If the pivot is 1 metre from the Earth then the Law of the Lever tells us that the distance from the pivot to Archimedes is $6 \times 10^{22}$ metres, and his lever is $1 + 6 \times 10^{22}$ metres long. If Archimedes moves his end of the lever one metre, similar triangles tell us that the Earth moves $1.6 \times 10^{-23}$ metres. A proton has diameter $10^{-15}$ metres...
OK, but it still moves, dammit!
True. But suppose that instead of this huge and improbable apparatus, Archimedes stands on the surface of the Earth and jumps. For every metre he clears, the Earth moves $1.6 \times 10^{-23}$ metres the other way (action/reaction). Basically, this has exactly the same effect as a lever $1 + 6 \times 10^{22}$
metres long: about 6.3 million light years, or roughly two and a half times the distance to the Andromeda Galaxy.
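The lever arithmetic is short enough to spell out. The sketch below uses the round figures quoted in the text (an Earth mass of $6 \times 10^{24}$ kg, a 100 kg Archimedes, a 1 metre short arm); the constant and function names are illustrative.

```python
EARTH_MASS_KG = 6e24    # round figure quoted in the text
ARCHIMEDES_KG = 100.0   # the mass Archimedes is assumed able to lift
SHORT_ARM_M = 1.0       # distance from the pivot to the Earth

def long_arm_m(load_kg, effort_kg, short_arm_m):
    """Law of the lever: load * short arm = effort * long arm."""
    return load_kg * short_arm_m / effort_kg

def load_shift_m(effort_shift_m, short_arm_m, long_arm):
    """Similar triangles: displacements are proportional to the arm lengths."""
    return effort_shift_m * short_arm_m / long_arm

arm = long_arm_m(EARTH_MASS_KG, ARCHIMEDES_KG, SHORT_ARM_M)
shift = load_shift_m(1.0, SHORT_ARM_M, arm)
print(f"long arm: {arm:.1e} m, Earth shift per metre: {shift:.2e} m")
```

The same ratio, 100 kg to $6 \times 10^{24}$ kg, governs the jump: conservation of momentum gives the Earth a displacement in that proportion to the jumper’s.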
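Figure 1. Archimedes’s lever.
The Reals are Uncountable
I like the ‘diagonal’ proof, but it does need some intricate maneuvers with infinite decimals.
Suppose $\mathbb{R}$ is countable with $\mathbb{R} = \{a_j : j = 1, 2, \ldots\}$. Define two functions $\mathbb{R} \to \mathbb{R}$:
$$f(x) = 1 \quad \forall x \in \mathbb{R}$$
$$g(x) = \begin{cases} 1 & \text{if } |x - a_j| \leq 2^{-j} \text{ for some } j \\ 0 & \text{otherwise} \end{cases}$$
Then f(x) = g(x) for all $x \in \mathbb{R}$. But
$$\int_{-\infty}^{\infty} f = \infty, \qquad \int_{-\infty}^{\infty} g \leq \sum_{j=1}^{\infty} 2 \cdot 2^{-j} = 2$$
since the j-th interval has length $2 \cdot 2^{-j}$.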
Of course we need the Lebesgue integral to make this work, and for that we need Lebesgue measure on $\mathbb{R}$. Now the cook gets cooked, because all we have to do is prove that a countable subset of $\mathbb{R}$ has measure zero.
So here’s a topological proof using less machinery. With the same assumptions, choose a sequence of non-empty closed intervals $A_j \subseteq [0, 1]$ such that $a_j \notin A_j$ and $A_{j+1} \subseteq A_j$. (This is easy.) Then $A_1 \cap A_2 \cap A_3 \cap \cdots$ is non-empty (by compactness of [0,1]) but contains none of the $a_j$.
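One concrete way to carry out the ‘this is easy’ step is to cut the current interval into thirds and keep an outer third that misses the next point. The sketch below does this for a finite list of rationals in exact arithmetic; the thirds rule and the function name are assumptions chosen for illustration, and any other rule producing nested closed intervals would serve equally well.

```python
from fractions import Fraction

def nested_intervals(points):
    """Return closed intervals A_1, A_2, ... in [0, 1], nested, with points[j] not in A_(j+1)."""
    lo, hi = Fraction(0), Fraction(1)
    intervals = []
    for a in points:
        third = (hi - lo) / 3
        if lo <= a <= lo + third:
            # a lies in the left third, so the right third [hi - third, hi] avoids it.
            lo = hi - third
        else:
            # a misses the left third [lo, lo + third], so keep that.
            hi = lo + third
        intervals.append((lo, hi))
    return intervals

points = [Fraction(1, 2), Fraction(0), Fraction(1, 9), Fraction(1, 3), Fraction(5, 27)]
for a, (lo, hi) in zip(points, nested_intervals(points)):
    print(a, "is not in", (lo, hi))
```

Each interval has positive length and sits inside its predecessor, so every point already processed stays excluded from all later intervals.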
Squares and Rectangles
The next item for cooking is a problem discussed by Terence Tao in his book on problem-solving [7]. My treatment is not suitable for his intended audience. Still, here goes.
The problem is about four rectangles, of equal area, that fit together to form a big rectangle as in Figure 2, leaving a rectangular hole (shaded). The problem is to prove that if the outer rectangle is actually square, then so is the shaded hole. Tao assumes that the outer square has side 1, and homes in on a key fact that cracks the problem wide open: the sides x, y of each smaller rectangle must sum to 1. But how to prove this? He studies how the equal-area condition propagates information from each rectangle to the next, points out that four such steps must lead back to the original rectangle, does some calculations, and out pops x + y = 1.
I’m a dynamical systems person, and yet again this looks suspiciously like a dynamical systems problem to me. The function that maps each rectangle to the next has a period-4 point, and we have to deduce that this is really a fixed-point. (All four rectangles have to be congruent if the construction is to work with a surrounding square; this is not immediately apparent from Tao’s treatment. The condition x + y = 1 turns out to be irrelevant, though true.) Anyway, when I pursued this line of attack, it led to a continued fraction and some simple one-dimensional dynamics, like so:
Suppose that the surrounding square has unit side, and let the sides of the first rectangle be $x_0, y_0$, which lie in (0,1) to avoid trivial cases. Let the common area of the four rectangles be A. Then 4A < 1 if the picture is to be believed, otherwise the rectangles would overlap or the shaded region would have zero area. So $A < \tfrac{1}{4}$.
[Figure 2. Configuration of four squares. The rectangles are labelled $R_0$, $R_1$, $R_2$, $R_3$; the marked lengths are x, y, and 1 - y.]
Following Tao, we observe that
$$(x_1, y_1) = \left(1 - y_0, \frac{A}{1 - y_0}\right)$$
where $A = x_0 y_0$, and in general
$$(x_{i+1}, y_{i+1}) = \left(1 - y_i, \frac{A}{1 - y_i}\right)$$
for all i because area is preserved. Already we see a discrete dynamical system on $\mathbb{R}^2$. Moreover, $(x_0, y_0)$ is a period-4 point, otherwise the rectangles would not fit together correctly.
Although this is a 2-dimensional system, it reduces to one dimension because $x_i y_i = A$. So $y_i = A/x_i$, and what matters is the function $\phi$ defined by
$$\phi(x) = 1 - \frac{A}{x}$$
which determines the trajectory of the x-coordinate.
The 4-periodicity tells us that $\phi^4(x_0) = x_0$. It is therefore natural to work out what $\phi^4(x)$ is. Being lazy, and prone to computational errors, I did this using Mathematica, and the result was what always happens to me when I use Mathematica:
$$\phi(x) = 1 - \frac{A}{x}$$
$$\phi^2(x) = 1 - \cfrac{A}{1 - \cfrac{A}{x}}$$
$$\phi^3(x) = 1 - \cfrac{A}{1 - \cfrac{A}{1 - \cfrac{A}{x}}}$$
$$\phi^4(x) = 1 - \cfrac{A}{1 - \cfrac{A}{1 - \cfrac{A}{1 - \cfrac{A}{x}}}}$$
I always forget that you have to tell Mathematica to simplify and expand expressions the way a human mathematician would. Here it was just substituting them unchanged, and getting needlessly complicated. I wanted to set $x = \phi^4(x)$ and solve, but I also wanted to keep track of what was going on, so I tried to get the machine to simplify the expression, which never works the way I expect it to, somehow. The result was uninformative in any case.
I then realised that Mathematica was telling me something, which I wouldn’t have noticed if the software had helpfully simplified the formula as it went along. You don’t have to be a genius to see that we are developing a continued fraction, whose terms visibly have period one, not four. It is obvious what all iterates of $\phi$ look like. Moreover, if the limit of this continued fraction is z, then clearly $z = 1 - A/z$, so z is actually a fixed point of $\phi$. In a little more detail:
LEMMA 0.8 If the sequence of all iterates of a periodic point converges, then it must be a fixed-point.
PROOF. Let $x_0$ be periodic with period p. We know that $\phi^n(x_0)$ converges, and its limit $\xi$ satisfies $\phi(\xi) = \xi$. But the subsequence $\phi^{np}(x_0)$ is constant, with every term equal to $x_0$. So $x_0 = \xi$ and is a fixed-point.
But now we are done aside from some routine checking. If x is a period-4 point of $\phi$, then z = x, so x is a fixed-point. Therefore all four rectangles are congruent, the entire figure has rotational symmetry through $\pi/2$, and the shaded part is square.
Rigour demands a little care with convergence. The continued fraction is not ‘regular’, with 1 on top in place of A, so maybe it doesn’t converge. Actually, it does. This follows easily from a few simple features of the dynamics of $\phi$, illustrated in Figure 3.
The fixed-points of $\phi$ are the solutions of the quadratic $x^2 - x + A = 0$, namely
$$\lambda = \frac{1 - \sqrt{1 - 4A}}{2}, \qquad \mu = \frac{1 + \sqrt{1 - 4A}}{2}$$
Bearing in mind that $A < \tfrac{1}{4}$, these solutions are real. It is routine to establish the inequalities
$$0 < A < \lambda < \sqrt{A} < \mu < 1$$
The graph of $\phi$ on [0,1] looks like Figure 3.
There is a geometric symmetry in the problem: the two sides $x_0, y_0$ of $R_0$ are interchangeable. The symmetry is nonlinear, since it maps x to A/x. We use it in a trivial way: without loss of generality we can choose $x_0 > \sqrt{A}$, that is, make it the longer side of $R_0$. The map $\phi$ preserves this property. Moreover, if $\sqrt{A} < x < \mu$, then $x < \phi(x) < \mu$, and if $\mu < x < 1$, then $\mu < \phi(x) < x$. (These facts are clear from the figure, and can be checked by routine calculations.) By monotonicity, the sequence $\phi^n(x)$ converges for all $x \in (\sqrt{A}, 1)$. Since the limit is a fixed-point of $\phi$, it must be $\mu$. Now we can appeal to Lemma 0.8 and we are done.
Note that x + y = 1 plays no role in this approach. We gain something by this method, too. Suppose we start with any initial rectangle, taking $x_0 > \sqrt{A}$, and go round and round the square forming new rectangles. Although in general this sequence does not have period 4, it converges toward a rectangle of period 4, indeed period 1.
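The convergence is easy to observe numerically. The sketch below iterates $\phi(x) = 1 - A/x$ from a starting side longer than $\sqrt{A}$ and compares the result with the larger fixed-point $(1 + \sqrt{1 - 4A})/2$; the sample values A = 0.2 and $x_0 = 0.5$, and the function names, are arbitrary illustrative choices.

```python
import math

def phi(x, area):
    """One step round the square: the next rectangle's x-side."""
    return 1 - area / x

def orbit(x0, area, steps):
    """The list x0, phi(x0), phi^2(x0), ..., phi^steps(x0)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(phi(xs[-1], area))
    return xs

def upper_fixed_point(area):
    """Larger root of x^2 - x + area = 0; requires area < 1/4."""
    return (1 + math.sqrt(1 - 4 * area)) / 2

area, x0 = 0.2, 0.5
xs = orbit(x0, area, 200)
print(xs[:5], xs[-1], upper_fixed_point(area))
```

The starting point is not 4-periodic, since the fifth entry of the orbit differs from the first, yet the orbit climbs monotonically toward the fixed-point, as the argument predicts.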
Courant-Robbins Train
This is metacookery, perhaps meta-metacookery. The issues are complicated and what gets cooked is not the obvious statement. It can be uncooked, but this seems like luck more than judgement, and doesn’t happen in very similar questions.
I mention this one because it opens up an important general issue. Whenever I try to explain what’s wrong, someone always writes in and complains that with a bit of extra work the conclusion remains correct. That’s true (and this remark goes right back to Poston [4] when he pointed out the difficulty in 1976) but the whole point is that you need the extra work, contrary to what Courant and Robbins say. Indeed, if I wanted to be pedantic I could point out that they don’t state the precise hypotheses needed for their argument, so it’s not clear how generally they think it applies.
Courant and Robbins [1] state the problem like this:
‘Suppose a train travels from station A to station B along a straight section of track. The journey need not be of uniform speed or acceleration... But the exact motion of the train is supposed to be known in advance. On the floor... a rod is pivoted so that it may move without friction either forward or backward until it touches the floor. If it touches the floor, we assume that it remains on the floor henceforth; this will be the case if the rod does not bounce. Is it possible to place the rod in such a position that, if it is released at the instant when the train starts and allowed to move solely under the influence of gravity and the motion of the train, it will not fall to the floor during the entire journey from A to B?’
The answer they give is ‘yes’, and the reasoning is continuity:
‘No detailed knowledge of the laws of dynamics is needed; only the following assumption of a physical nature need be granted: The motion of the rod depends continuously on its initial position.’
Figure 4 is basically theirs, with added variables and wheels removed. To spell out the proof: starting at angle 0, the rod stays there; the same goes for angle $\pi$. So $[0, \pi]$ maps continuously to $[0, \pi]$ with both endpoints fixed, so by the Intermediate Value Theorem the image contains all possible angles between 0 and $\pi$.
Actually, the important lesson in this example is that boundary conditions can destroy ‘intuitive’ continuity properties. As it happens, they don’t, in the simplest model for this problem. But they do in slightly more complicated (and more realistic) models, and in very similar problems. The offhand reference to an ‘assumption of a physical nature’ could easily lead readers to think that the assumption is harmless, and would apply in all similar problems. That’s not so.
How justified, then, is the continuity assumption? What do they mean when they state that ‘the motion of the rod’ is a continuous function of the initial position? There are at least three meanings: the entire trajectory (continuity in some function space), its location after some fixed time before imposing the absorbing boundary conditions, or its location after some fixed time after imposing the absorbing boundary conditions. These could have different continuity properties.
Figure 3. Graph of $\phi$ showing fixed-points.
If the rod is free to rotate to any angle in the circle, there’s no great difficulty, and the property of continuity holds for any ODE with well-behaved solutions for all time. Singularities can make the question ill-posed, but they don’t occur here.
However, as Poston [4] pointed out, the ‘absorbing boundary conditions’ at angles 0, $\pi$ are more problematic. In fact, for a wide class of ODEs, the boundary conditions destroy continuity, as we’ll shortly see.
So the train problem is ‘cooked’, not because Courant and Robbins gave the wrong answer; not even because they gave the wrong reason for it. It is cooked because they made no effort at all to justify the reason, citing physical intuition. Never forget that ODEs and PDEs have boundary conditions. And those may have a dramatic effect on continuity properties of solutions.
Suppose, for instance, that Courant and Robbins had modified the problem to allow the wind to blow, with a prespecified velocity, depending smoothly on time. Or (see below) allowed the train’s floor not to be flat (which, incidentally, they don’t specify, although their picture points that way). Most readers would probably have accepted the same ‘physical assumption’, but this time it would be plain wrong. Poston remarks that the only way he can see to salvage the argument is to impose some stringent conditions: perfectly level track, no springs in the wheels of the train... Then you still have to explain why those conditions do the trick. This is nontrivial, and can’t just be dismissed as a simple property based on physical considerations.
The problem is worth analysing in detail.
Consider a smooth vector field on a circle, depending smoothly on time $t \in \mathbb{R}$. Then the associated flow $\psi$ determines diffeomorphisms $\psi_t$ of the circle for all $t$. So the time-$t$ flow, for any given $t$, is smooth. For any given position function $F(t)$ for the train, the time taken to go from A to B is determined. So the map from initial state to final state, for the position $\theta$ of the rod, is a smooth diffeomorphism.
Suppose, however, that the flow looks like Figure 5, which is entirely reasonable for a general smooth ODE. If we now impose the absorbing boundary conditions, we find that all initial conditions lead, after finite time, to states $\theta = 0, \pi$. That is, the rod hits the floor.
As I said, I’m less interested in developing conditions on the mechanics that ensure such things cannot happen, than in observing that in general ODEs they do happen, robustly. That alone makes the continuity assumption far from obvious. Let’s see why, which will also explain the conditions under which Courant and Robbins are correct about continuity.
Figure 4 shows about the simplest model of the train that I can invent. The train itself is reduced to a point A that moves along the horizontal line; its location is $x$ relative to the origin. The rod is inclined at angle $\theta$, and we assume its mass $m$ is concentrated at the end. To remove various constants, choose units to make the length of the rod equal to 1, the mass $m = 1$, and the acceleration due to gravity $g = 1$.
Assume that the position of the train at time $t$ is a prespecified function $F(t)$, which we take to be of class $C^2$ to avoid analytic issues. Take coordinates in a moving frame attached to point A. This introduces a ‘fictitious force’ $-\ddot F(t)$, and aside from this we now have a simple pendulum. The angular position $\theta$ satisfies the ODE
$$\ddot\theta = \ddot F(t) \sin\theta - \cos\theta$$
and we take initial conditions $\theta = \theta_0$, $\dot\theta = 0$ at time $t = 0$. (I use $\dot\theta = 0$ because Courant and Robbins say ‘released’. It’s not essential.) The absorbing boundary conditions can destroy continuity if there exists $\theta_0$ for which the trajectory is tangent to the boundary $\theta = 0$, while $\theta \ge 0$ locally along the trajectory, as in Figure 6(b). Then we would expect to be able to arrange $F(t)$ to make nearby $\theta_0$ hit the boundary transversely or miss it altogether as in Figures 6(a, c). Similar problems arise at the other boundary $\theta = \pi$ if $\theta \le \pi$ locally.
The tangency condition on the boundary at 0 implies that $\theta = 0$ and $\dot\theta = 0$. But when $\theta = 0$, the equation of motion implies that $\ddot\theta = -1$, so locally $\theta \le 0$. Similarly, at the $\pi$ boundary, $\ddot\theta = 1$. So the problematic kind of ‘grazing’ trajectory cannot occur.
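For readers who want to check the equation of motion and the two boundary values, here is one possible derivation from a Lagrangian. It is a sketch under an assumed convention that the text does not spell out: the pivot is at $(F(t), 0)$ and $\theta$ is measured from the forward horizontal, with unit length, unit mass and $g = 1$. This convention is chosen only because it reproduces the stated ODE.

```latex
% Assumed convention (not stated in the article): pivot at (F(t), 0),
% theta measured from the forward horizontal, length = mass = g = 1.
\begin{align*}
(X, Y) &= \bigl(F(t) + \cos\theta,\ \sin\theta\bigr), \\
L &= \tfrac12\bigl(\dot X^2 + \dot Y^2\bigr) - Y
   = \tfrac12\bigl(\dot F^2 - 2\dot F\,\dot\theta\sin\theta + \dot\theta^2\bigr) - \sin\theta, \\
\frac{d}{dt}\frac{\partial L}{\partial\dot\theta}
  &= \ddot\theta - \ddot F\sin\theta - \dot F\,\dot\theta\cos\theta, \\
\frac{\partial L}{\partial\theta}
  &= -\dot F\,\dot\theta\cos\theta - \cos\theta, \\
\ddot\theta &= \ddot F(t)\sin\theta - \cos\theta .
\end{align*}
% Boundary values: the F-dependent term vanishes because sin(0) = sin(pi) = 0.
\begin{align*}
\theta = 0 &: \quad \ddot\theta = -\cos 0 = -1, \\
\theta = \pi &: \quad \ddot\theta = -\cos\pi = +1 .
\end{align*}
```

The terms in $\dot F\,\dot\theta\cos\theta$ cancel in the Euler-Lagrange equation, which is why only the acceleration $\ddot F$ of the train appears.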
Even now, it is a nontrivial exercise to prove that the final position is a continuous function of the initial one when the absorbing boundary conditions hold. There might be other sources of discontinuity, for all we know. So technically Courant and Robbins are absolutely right, because they make continuity an explicit assumption. But continuity fails if we make apparently harmless modifications to the question.
One of the simplest such modifications is to place the boundaries at $\pi/4$, $3\pi/4$. Most readers would still be happy to accept the continuity assumption. However, if we take $F(t) = t^4/12$ so that $\ddot F(t) = t^2$, which is not exactly rocket science, then numerical experiments find a grazing trajectory with initial conditions close to $\theta_0 = 1.0664$. So the continuity argument might break down. Further experiments indicate that it does, and there is no intermediate position that stays upright (Figure 6).
Figure 4. The Courant-Robbins train.
Figure 5. Why absorbing boundary conditions can destroy continuity.
For the next experiment, put the boundaries at $\pi/50$, $49\pi/50$. Again we find numerical evidence for a grazing trajectory, and there is no intermediate position that stays upright beyond about 6 seconds (Figure 7).
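Experiments of this kind are easy to try. The following is a minimal sketch, not the method used to produce the figures (the article does not say how those were computed). It assumes a classical fourth-order Runge-Kutta scheme with a fixed step, and the function names `accel`, `rk4_step`, `run_until_absorbed` and `bisect_switch` are hypothetical. It integrates $\ddot\theta = t^2 \sin\theta - \cos\theta$ from rest and stops as soon as the rod reaches either absorbing boundary.

```python
import math


def accel(t, theta):
    """Angular acceleration theta'' = F''(t) sin(theta) - cos(theta), with F''(t) = t**2."""
    return t * t * math.sin(theta) - math.cos(theta)


def rk4_step(t, theta, v, dt):
    """One classical Runge-Kutta step for theta' = v, v' = accel(t, theta)."""
    k1t, k1v = v, accel(t, theta)
    k2t, k2v = v + 0.5 * dt * k1v, accel(t + 0.5 * dt, theta + 0.5 * dt * k1t)
    k3t, k3v = v + 0.5 * dt * k2v, accel(t + 0.5 * dt, theta + 0.5 * dt * k2t)
    k4t, k4v = v + dt * k3v, accel(t + dt, theta + dt * k3t)
    theta_new = theta + dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6.0
    v_new = v + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
    return theta_new, v_new


def run_until_absorbed(theta0, lower, upper, dt=1e-3, t_max=20.0):
    """Integrate from rest at theta0 until theta leaves the open interval (lower, upper).

    Returns (which, t): which is "lower", "upper", or None if neither
    boundary is reached before t_max; t is the time at which integration stopped.
    """
    t, theta, v = 0.0, theta0, 0.0
    while t < t_max:
        theta, v = rk4_step(t, theta, v, dt)
        t += dt
        if theta <= lower:
            return "lower", t
        if theta >= upper:
            return "upper", t
    return None, t


def bisect_switch(lo, hi, lower, upper, tol=1e-5):
    """Bisect on theta0 between a start absorbed at `lower` and one that is not.

    Assumes run_until_absorbed(lo, ...) ends at "lower" and
    run_until_absorbed(hi, ...) does not.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        which, _ = run_until_absorbed(mid, lower, upper)
        if which == "lower":
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    lower, upper = math.pi / 4, 3 * math.pi / 4
    for theta0 in (0.9, 1.1, 1.3):
        print(theta0, run_until_absorbed(theta0, lower, upper))
    print("outcome switches near theta0 =", bisect_switch(0.9, 1.3, lower, upper))
```

The bisection only locates a value of $\theta_0$ at which the outcome switches from one boundary to the other. It assumes there is a single such switch between the two starting values, and its result depends on the step size, so it should be read as a rough estimate of where a grazing trajectory might lie.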
These results accord with physical intuition, of course. With a perfectly horizontal track, no applied acceleration can lift the rod off the floor if it is instantaneously at rest at $\theta = 0$, and the same goes for $\theta = \pi$. But if the rod is slightly above the horizontal position, a suitable acceleration could lift it. This is why pretty much the only boundaries that produce continuity are the ones employed by Courant and Robbins. Simple estimates prove:
THEOREM 0.9 Suppose $0 < \alpha < \pi$ and the boundaries are at $\alpha$, $\pi - \alpha$. Suppose $\ddot F(t) = t^2$. Then the rod hits the floor after a finite time $T(\alpha)$.
PROOF. Let $v(t) = \dot\theta(t)$ be the angular velocity. Then the ODE reduces to the system
$$\dot\theta(t) = v(t), \qquad \dot v(t) = t^2 \sin\theta(t) - \cos\theta(t).$$
If $t > 2$ then
$$\dot v(t) \ge (2^2 - 1)\sin\alpha = 3\sin\alpha.$$
Therefore
$$v(2 + t) \ge 1$$
provided
$$V + 3t\sin\alpha \ge 1,$$
where
$$V = \min\{v(2) : \theta(0) \in [\alpha, \pi - \alpha],\ v(0) = 0\}.$$
Therefore we must take
$$t \ge \frac{1 - V}{3\sin\alpha}.$$
Now, if $\dot\theta(t) \ge 1$ then $\theta(t + \pi - 2\alpha) \ge \pi - \alpha$, so the rod has hit the boundary. Thus the rod hits the boundary after at most time
$$T(\alpha) = \pi - 2\alpha + 2 + \frac{1 - V}{3\sin\alpha}.$$
It remains to prove that $V$ is finite. (It might be $-\infty$.) But
$$\dot v(t) \ge -\cos\alpha$$
so
$$v(t) \ge -t\cos\alpha$$
and
$$v(2) \ge -2\cos\alpha.$$
Therefore $V \ge -2\cos\alpha$. Thus we may take
$$T(\alpha) = \pi - 2\alpha + 2 + \frac{1 + 2\cos\alpha}{3\sin\alpha}.$$
Since we are taking $F(t) = t^4/12$, the stations can be placed at $A = 0$, $B = T(\alpha)^4/12$. Note that $T(\alpha) \to \infty$ as $\alpha \to 0$.
Figure 6. Four trajectories. Initial conditions respectively are $\theta(0) = 0.2, 1.0664, 1.1, 1.3$.
Figure 7. Four trajectories. Initial conditions respectively are $\theta(0) = 0.9, 0.943974, 1, 1.1$.
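To get a feel for the size of the bound, the closed formula for the hitting time and the resulting position of station B can be evaluated directly. This is only an illustrative sketch of the formula as stated in the proof; the names `hitting_time_bound` and `far_station` are hypothetical, and `alpha` stands for the boundary angle in radians.

```python
import math


def hitting_time_bound(alpha):
    """Upper bound T(alpha) = pi - 2*alpha + 2 + (1 + 2 cos(alpha)) / (3 sin(alpha))."""
    return math.pi - 2 * alpha + 2 + (1 + 2 * math.cos(alpha)) / (3 * math.sin(alpha))


def far_station(alpha):
    """Position B = T(alpha)**4 / 12 reached at time T(alpha) when F(t) = t**4 / 12."""
    return hitting_time_bound(alpha) ** 4 / 12


if __name__ == "__main__":
    # The bound grows without limit as the boundary angle alpha shrinks to 0.
    for alpha in (math.pi / 4, math.pi / 50, 1e-3):
        print(alpha, hitting_time_bound(alpha), far_station(alpha))
```

Because $\sin\alpha$ appears in the denominator, the bound behaves like $1/\alpha$ for small $\alpha$. It is an upper estimate only, so the rod may well fall much sooner than $T(\alpha)$.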
Poston [4] says: ‘Given the usual laws, the only physical assumptions I can find which guarantee a nonfalling history are that the pivot is perfect with the movement of the train perfectly, totally horizontal. (Not just level track: the train must have no springs.)’ And he adds: ‘Courant and Robbins did not make a silly mistake, but Dynamical Systems has progressed... it would be silly now.’
Pouring from a Toroidal Bottle
I wonder if something similar is going on in a provocative article by Sarkaria [5] published in the Intelligencer in 2001. This discusses the hypothesis of continuity in the motion of matter, tracing it back to Anaxagoras, and demonstrates that ‘matter and motion cannot both be assumed continuous’. The example cited is the impossibility of emptying ‘a tyre-tube filled with water into a bucket in any finite length of time’. The proof is that a homotopically nontrivial loop in the tyre would flow to define a homotopically trivial loop in the bucket. That is, the fundamental group $\pi_1$ is an obstacle to emptying the bottle. Sarkaria observes that a similar problem still arises if distinct fluid particles are allowed to occupy the same location. The usual assumption of fluid dynamics, that particles don’t do that, implies that the flow of a body of fluid preserves its topological type.
We can cook this elegant but slightly artificial example by using the invariant $\pi_0$ instead: the space of connected components. Now there is a more commonplace, but equally dramatic scenario: the same appeal to continuity implies that you can’t empty a normal-shaped wine bottle into two or more glasses. Initially the wine forms one connected component, and a continuous map can’t increase the number of components.
Once again, though, we must ask: is continuity of the flow justified? It surely depends on the boundary conditions, not just the PDE. Tyres and bottles have boundaries, and physically sensible boundary conditions could (perhaps should) allow the fluid to ‘break’ when its surface meets the boundary in the right way, perhaps tangentially. Without this boundary condition, the flow would be continuous, but with it, the fluid can split discontinuously.
I don’t know enough about the Navier-Stokes equation to analyse this possibility, which is a question for experts; I imagine that the answer depends on the precise physical assumptions encoded in the PDE and its boundary conditions.
Afterword
As I said at the start, this is a personal compilation and I’m not making any claims of originality or superiority. Variations on proofs tend to be rediscovered over and over again, and I’m sure that historians will shortly be telling me that the approach to cubics was known to some Portuguese algebraist in 1573, whereas experts in number theory will quietly draw me aside to explain that Serre knew everything in this article in the 1950s. Still, I think it’s a useful exercise to find alternatives to the classic proofs, some of which have become cliches and some of which do make a bit of a mouthful of things that can be done more clearly and more simply. I’d be interested to see other examples where the classics can be cooked. Maybe it would be worth setting up a website on mathematical cookery.
REFERENCES
[1] R. Courant and H. Robbins. What is Mathematics?, Oxford: Oxford University Press, 1941.
[2] D.H. Fowler. The Mathematics of Plato’s Academy, Oxford: Oxford University Press, 1987.
[3] M. Gardner. Mathematical Carnival, New York: Knopf, 1975.
[4] T. Poston. Au Courant with differential equations, Manifold 18 (1976) 6–9.
[5] K.S. Sarkaria. A topological paradox of motion, Mathematical Intelligencer 23 no. 4 (2001) 66–68.
[6] I. Stewart. Galois Theory, Boca Raton: Chapman and Hall/CRC, 2004.
[7] T. Tao. Solving Mathematical Problems: A Personal Perspective, Oxford: Oxford University Press, 2006.
© 2011 Springer Science+Business Media, LLC, Volume 33, Number 1, 2011 71