41
Old and new algorithms for computing Bernoulli numbers David Harvey University of New South Wales 25th September 2012, University of Ballarat David Harvey Old and new algorithms for computing Bernoulli numbers

Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

  • Upload
    hathien

  • View
    228

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Old and new algorithms for computing Bernoullinumbers

David Harvey

University of New South Wales

25th September 2012, University of Ballarat

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 2: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Bernoulli numbers

Rational numbers B0,B1, . . . defined by:

x

ex − 1=∑n≥0

Bn

n!xn.

The first few Bernoulli numbers:

B0 = 1 B1 = −1/2

B2 = 1/6 B3 = 0

B4 = −1/30 B5 = 0

B6 = 1/42 B7 = 0

B8 = −1/30 B9 = 0

B10 = 5/66 B11 = 0

B12 = −691/2730 B13 = 0

B14 = 7/6 B15 = 0

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 3: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Bernoulli numbers

Applications:

I Sums of the form 1n + 2n + · · ·+ Nn

I Euler–Maclaurin formula

I Connections to Riemann zeta function, e.g. Euler’s formula:

ζ(2n) =1

12n+

1

22n+

1

32n+ · · · = (−1)n+1 (2π)2nB2n

2(2n)!

I Arithmetic of cyclotomic fields, p-parts of class groups

I . . .

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 4: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Three algorithmic problems

I will discuss three algorithmic problems:

1. Given n, compute B0,B1, . . . ,Bn.

2. Given n, compute just Bn.

3. Given prime p, compute B0,B2, . . . ,Bp−3 modulo p.

How fast can we solve these problems?

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 5: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

How big is Bn?

Write B2n as a reduced fraction

B2n =N2n

D2n.

D2n has O(n) bits (von Staudt–Clausen theorem).

N2n has Θ(n log n) bits (Euler’s formula and Stirling’s formula).

Example:

B1000 =−18243 . . . (1769 digits) . . . 78901

2 · 3 · 5 · 11 · 41 · 101 · 251.

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 6: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing B0,B1, . . . ,Bn — cubic algorithms

Equating coefficients in(ex − 1

x

)∑n≥0

Bn

n!xn = 1

leads to the recursive formula

Bn = −n−1∑k=0

1

k!(n − k + 1)Bk .

Given B0, . . . ,Bn−1, we can compute Bn using O(n) arithmeticoperations.

But each Bk has O(k log k) bits, so we need O(n2+o(1)) bitoperations.

For computing B0, . . . ,Bn, we need O(n3+o(1)) bit operations.

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 7: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing B0,B1, . . . ,Bn — cubic algorithms

There are many other identities of this sort that lead to O(n3+o(1))algorithms for computing B0, . . . ,Bn, for example:

I Seidel (1877)

I Ramanujan (1911)

I Knuth–Buckholtz (1967)

I Atkinson (1986)

I Akiyama–Tanigawa (1999)

I Brent–H. (2011)

They have various big-O constants and numerical stabilityproperties.

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 8: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing B0,B1, . . . ,Bn — fast algorithms

But B0, . . . ,Bn only occupy O(n2 log n) bits altogether.

Can we compute them all in only O(n2+o(1)) bit operations?

Yes! (Brent–H., 2011)

There are several ways — I will sketch a particularly cute version.

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 9: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing B0,B1, . . . ,Bn — fast algorithms

Define the Tangent numbers T1,T2, . . . by

tan x =∑n≥1

Tn

(2n − 1)!x2n−1.

Here are the first few:

1, 2, 16, 272, 7936, 353792, . . .

They are all positive integers, and it turns out that

Tn =(−1)n−122n(22n − 1)

2nB2n.

So it suffices to compute the first n Tangent numbers.

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 10: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing B0,B1, . . . ,Bn — fast algorithms

Watch for the non-overlapping coefficients:

9! tan(10−3) = 362.880120960048384019584007936 . . .

We can easily read off the Tangent numbers:

T1 = 362880/9! = 1

T2 = 120960/(9!/3!) = 2

T3 = 48384/(9!/5!) = 16

T4 = 19584/(9!/7!) = 272

T5 = 7936/(9!/9!) = 7936.

In general, to compute T1, . . . ,Tn, it suffices to compute(2n − 1)! tan(10−p) to about n2 log n significant digits, wherep = O(n log n).

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 11: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing B0,B1, . . . ,Bn — fast algorithms

How do we compute (2n − 1)! tan(10−p)?

Like this:

9! tan(10−3) = 8!9! sin(10−3)

8! cos(10−3)

= 8!362.879939520003023999928000001 . . .

40319.979840001679999944000001 . . .

Writing down the required digits of sin(10−p) and cos(10−p) costsO(n2 log2 n) bit operations.

Then perform a single floating-point division with O(n2 log n)digits of accuracy. (Some error analysis required.)

Assuming FFT-based algorithms, total cost is O(n2 log2+o(1) n) bitoperations.

(Of course we work in base 2, not base 10!)

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 12: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing Bn

What about computing a single Bn, without necessarily computingB0, . . . ,Bn−1?

Since Bn has only O(n log n) bits, maybe we can do it inO(n1+o(1)) bit operations?

Can we at least do better than O(n2+o(1))?

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 13: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing Bn

I’ll discuss three algorithms:

I The zeta-function algorithm(rediscovered many times... Euler, Chowla–Hartung 1972,Fillebrown–Wilf 1992, Fee–Plouffe 1996, Kellner 2002,Cohen–Belabas–Daly (?))

I The multimodular algorithm (H. 2008)

I A new p-adic multimodular algorithm (H., 2012 — see arXiv)

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 14: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing Bn — the zeta-function algorithm

Idea is to use Euler’s formula

B2n = (−1)n+1 2(2n)!

(2π)2nζ(2n).

It suffices to compute O(n log n) significant bits of the right handside.

It is known how to compute n! and π to this precision inO(n1+o(1)) bit operations.

But computing enough bits of ζ(2n) is more expensive, so weconcentrate on this.

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 15: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing Bn — the zeta-function algorithm

We use the Euler product:

ζ(2n)−1 =∏

p prime

(1− p−2n)

It turns out that we need O(n/ log n) primes to get enough bits,and we need to do O(n1+o(1)) work per prime.

Total: O(n2+o(1)) bit operations.

No better than computing all B0, . . . ,Bn!

(Fillebrown gives the more precise bound O(n2 log2+o(1) n). Maybethis can be improved by a logarithmic factor?)

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 16: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing Bn — the multimodular algorithm

Idea: compute Bn (mod p) for many primes p and reconstructusing the Chinese Remainder Theorem.

Using fast interpolation algorithms, reconstruction step costs onlyO(n1+o(1)) bit operations.

How many primes do we need? Since∑

p<N log p ∼ N, we need allprimes up to O(n log n), i.e. O(n) primes.

The expensive part is computing Bn (mod p) for each p. How dowe do this?

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 17: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing Bn — the multimodular algorithm

Assume n ≥ 2, n even, and p ≥ 3.

Define ‘integralised’ Bernoulli number

B∗n = 2(1− 2n)Bn ∈ Z.

Basic identity (a ‘Voronoi congruence’):

B∗n = n

p−1∑j=1

(−1)j jn−1 (mod p)

= n(−1n−1 + 2n−1 − · · ·+ (p − 1)n−1) (mod p).

Straightforward evaluation scheme: compute in turn 1n−1, 2n−1,etc, all modulo p.

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 18: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing Bn — the multimodular algorithm

Better: rearrange terms. Put j = g i for a generator g .

Example to compute B8 (mod 23), using g = 5:

−17−57+27+107+47+207+87−177+167−117−97 (mod 23).

Can keep track of g i , g i(n−1) with only O(1) operations per term.

The sign of each term depends on the parity of g i .

The sum for g11, . . . , g22 is the same as for g0, . . . , g10.

Complexity: O(p1+o(1)) bit operations per prime.

Overall complexity to get Bn: still O(n2+o(1)).

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 19: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing Bn — the multimodular algorithm

Various tricks to squeeze out large constant factor... beyond thescope of this talk.

Current record for a single Bn is held by the multimodularalgorithm.

Timing data from 2008 paper:

I B107 : 40 CPU hours using zeta function algorithm (Pari/GP).

I B107 : 10 CPU hours using multimodular algorithm.

I B108 : 1164 CPU hours using multimodular algorithm.

This data nicely illustrates quasi-quadratic asymptotics.

Wikipedia reports that Holoborodko recently used myimplementation to compute B2×108 .

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 20: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing Bn — the p-adic multimodular algorithm

While preparing for this talk, I discovered a new algorithm:

Theorem (H., 2012)

The Bernoulli number Bn may be computed in

O(n4/3+o(1))

bit operations.

This is the first known subquadratic algorithm for this problem.

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 21: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing Bn — the p-adic multimodular algorithm

Sketch: the Voronoi identity may be generalised to obtaininformation modulo powers of p:

B∗n =

p−1∑j=0

s−1∑k=0

(n

k + 1

)B∗k+1p

k(−1)j jn−k−1 (mod ps)

(assuming s ≤ n).

For s = 1 this simplifies to the previous formula.

There are ps terms, and we need to do arithmetic modulo ps .

So the complexity of evaluating this sum (ignoring logarithmicfactors) is roughly ps2.

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 22: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing Bn — the p-adic multimodular algorithm

Optimisation problem: select precision parameter sp for each p.

Want to minimise the cost (roughly∑

p ps2p) while ensuring that

we collect enough bits (roughly∑

p sp log p) to determine Bn.

Unfortunately this doesn’t work — it still leads to overall costO(n2+o(1)).

(This remains true even if you throw in bits from ζ(n) to yieldinformation at the prime at infinity.)

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 23: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing Bn — the p-adic multimodular algorithm

But we can rewrite the identity like this:

B∗n =

p−1∑j=0

(−1)j jn−sps−1F (j/p) (mod ps)

where

F (x) =s−1∑k=0

(n

k + 1

)B∗k+1x

s−1−k ∈ Z[x ].

Note F (x) is a polynomial of degree s − 1, and is independent of p.

Select parameters s = n2/3+o(1), M = n2/3+o(1), N = n1/3+o(1).

Evaluate F (j/p) (mod 2M) for every p < N, every 0 ≤ j < p,simultaneously using a fast multipoint evaluation algorithm.

Then add everything up and use the Chinese Remainder Theorem!

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 24: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing Bn — the p-adic multimodular algorithm

Actually one can interpolate between

I O(n4/3+o(1)) time and O(n4/3+o(1)) space, and

I O(n3/2+o(1)) time and O(n1+o(1)) space.

For what n do we beat the previous algorithms in practice?

No idea. I hope to find out soon...

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 25: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing B0, . . . ,Bp−3 (mod p) — motivation

The residues B0,B2 . . . ,Bp−3 (mod p) are closely related to thearithmetic of cyclotomic fields.

Let K = Q(ζp) and let h be the class number of K .

Then

p | h ⇐⇒ p | Bk for some k = 0, 2, . . . , p − 3.

A prime p is regular if Bk 6= 0 (mod p) for all k = 0, 2, . . . , p − 3,i.e. p - h.

Much of Kummer’s work on Fermat’s Last Theorem applies only toregular primes, essentially because one can then always take pthroots in the class group.

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 26: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing B0, . . . ,Bp−3 (mod p) — motivation

If p is not regular it is called irregular.

The smallest irregular prime is p = 37:

B32 =−7709321041217

5100= 0 (mod 37).

We call (37, 32) an irregular pair.

Each irregular pair corresponds to a nontrivial eigenspace in theGalois action on the p-Sylow part of the class group of K .

The irregular pairs have been determined for all p < 163 577 856(Buhler–H., 2011).

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 27: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing B0, . . . ,Bp−3 (mod p) — motivation

Once the irregular pairs for p are known, one can quickly testinteresting conjectures, such as the Kummer–Vandiver conjecture.

This states that the class number of the maximal real subfield

K+ = K ∩ R = Q(ζp + ζ−1p )

is not divisible by p.

Some people think it’s true.

Some people think it’s false.

(It’s true for all p < 163 577 856.)

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 28: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing B0, . . . ,Bp−3 (mod p)

Modern algorithms for computing B0, . . . ,Bp−3 (mod p):

I Power series inversion

I Voronoi identity & Bluestein’s trick

I Voronoi identity & Rader’s trick

All have theoretical bit complexity O(p log2+o(1) p).

These days we can only fight to improve the constant.

(Pre-1992 algorithms had complexity O(p2+o(1)).)

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 29: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing B0, . . . ,Bp−3 (mod p) — series inversion

Idea: invert the power series

ex − 1

x= 1 +

x

2!+

x2

3!+ · · ·+ xp−3

(p − 2)!+ O(xp−2)

in Fp[[x ]].

Actually we only need coefficents of even index. Can save a factorof two by using for example

x2

cosh x − 1= −2

∑n≥0

(2n − 1)B2n

(2n)!x2n.

So we need to invert (cosh x − 1)/x2 to about p/2 terms.

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 30: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing B0, . . . ,Bp−3 (mod p) — series inversion

Series inversion can be reduced to polynomial multiplication viaNewton’s method.

So the basic question is: how to multiply polynomials of degreeO(p) in Fp[x ]?

Fastest way to multiply polynomials is via FFT.

But we can’t perform an efficient FFT over Fp, because usually Fp

won’t have sufficiently many roots of unity of smooth order.

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 31: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing B0, . . . ,Bp−3 (mod p) — series inversion

We can lift the multiplication problem to Z[x ], and then do either:

I Use floating-point FFT to multiply over R[x ], and roundresults to the nearest integer, or

I Choose a few word-sized primes q such that Fq supportsefficient FFT, multiply in Fq[x ] for each q, then reconstructvia CRT, or

I Some combination of the above.

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 32: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing B0, . . . ,Bp−3 (mod p) — series inversion

Suppose f , g ∈ Z[x ], of degree p, with coefficients bounded by p.

Coefficients of h = fg are bounded by p3.

Example from 1992: p ∼ 106, so coefficients of h have 60 bits.

I IEEE double-precision floating-point is only 53 bits!Doesn’t quite fit.

I Could use two primes near 232.

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 33: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing B0, . . . ,Bp−3 (mod p) — series inversion

Let’s cheat: represent coefficients in (−p/2, p/2) instead of [0, p).

Then statistically we expect coefficients of h = fg to be boundedby O(p2.5).

Try p ∼ 106 again: now coefficents of h have 50 bits.

This fits into floating-point (just), but no longer provably correct.

Fast-forward to 2012: now perhaps p ∼ 109.

Even using O(p2.5) bound, coefficients of h need 75 bits.

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 34: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing B0, . . . ,Bp−3 (mod p) — Voronoi identity

Recall Voronoi identity:

B∗nn

=

p−1∑j=1

(−1)j jn−1 (mod p).

Again let’s put j = g i for g a generator of (Z/pZ)∗, and let xi bethe corresponding ±1 sign for each i . Also write ` = n − 1, andb` = B∗n/n. The sum becomes

b` =

p−2∑i=0

g `ixi .

We want to evaluate this for all (odd) 0 ≤ ` < p − 1.

This is precisely a discrete Fourier transform of the sequence(x0, . . . , xp−2) over Fp.

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 35: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing B0, . . . ,Bp−3 (mod p) — Voronoi identity

Can’t evaluate the DFT directly over Fp (same reason as before).

Idea: use Bluestein’s trick to convert to polynomial multiplication.

Using the identity `i =(`2

)+(−i2

)−(`−i2

), the DFT

b` =

p−2∑i=0

g `ixi

becomes

g−(`2)b` =

p−2∑i=0

g−(`−i2 )(g(−i

2 )xi

).

The right hand side is now a convolution of g−( i2) and g(−i

2 )xi .

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 36: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing B0, . . . ,Bp−3 (mod p) — Voronoi identity

Resulting algorithm involves multiplying two polynomials of degreeabout p/2 with coefficients in Fp.

We run into the same problem as before: for p ∼ 109, the productstill has 75-bit coefficients (even if allow cheating).

I will now sketch how Rader’s trick mitigates this problem. Thisseems to new in the context of computing Bernoulli numbers.

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 37: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing B0, . . . ,Bp−3 (mod p) — Rader’s trick

First, some symmetry implies we only need sum over odd terms:

b` = 2

p−2∑i=0i odd

g `ixi .

For simplicity let us assume that q = (p − 1)/2 is prime.

Then (Z/2qZ)∗ is cyclic of order q − 1.

In fact (Z/2qZ)∗ consists of all odd 0 ≤ i < p − 1, except for q.

Sob` = 2g `qxq + 2

∑i∈(Z/2qZ)∗

g `ixi .

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 38: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing B0, . . . ,Bp−3 (mod p) — Rader’s trick

Now h be a generator of (Z/2qZ)∗. Put i = h−s and ` = ht .

Then the sumy` =

∑i∈(Z/2qZ)∗

g `ixi

becomes

yht =

q−2∑s=0

ght−sxh−s .

The right hand side is now a convolution of ghs and xh−s .

But notice that the values xh−s are all ±1!

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 39: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing B0, . . . ,Bp−3 (mod p) — Rader’s trick

So we end up with a polynomial multiplication fg , where:

I Both have degree about p/2,

I The coefficients of f are ‘random’ looking integers in [0, p),

I The coefficients of g are all ±1.

Now the coefficients of h = fg are bounded by about p2.

Even for p ∼ 109 this fits into 64 bits.

If we cheat, we get the bound p1.5. For p ∼ 109 this is about 45bits — it even fits into IEEE double-precision floating-point.

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 40: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Computing B0, . . . ,Bp−3 (mod p) — timings

Historical comparison:

Year Hardware Algorithm p Time

1992 Motorola 68040 series inversion 106 200s2012 Intel ‘Westmere’ (12 cores) Rader’s trick 109 25s

Current plan is to do this for all p < 230 ∼ 109.

Should cost about 1.5 million CPU hours.

David Harvey Old and new algorithms for computing Bernoulli numbers

Page 41: Old and new algorithms for computing Bernoulli numbersweb.maths.unsw.edu.au/~davidharvey/talks/bernoulli.pdf · Old and new algorithms for computing Bernoulli numbers David Harvey

Thank you!

David Harvey Old and new algorithms for computing Bernoulli numbers