
MATH 465 NUMBER THEORY

6. Arithmetical Functions

6.1. Introduction. I have decided that it makes more sense to cover chapters 6 and 7 in reverse order, so this will follow my discussion of Chapter 7. It is convenient to make the following definition.

Definition. Let $\mathcal{A}$ denote the set of arithmetical functions, that is, the functions defined by

$$\mathcal{A} = \{ f : \mathbb{N} \to \mathbb{C} \}.$$

Of course the range of any particular function might well be a subset of $\mathbb{C}$, such as $\mathbb{R}$ or $\mathbb{Z}$. There are quite a number of important arithmetical functions. Some examples are


The divisor function. The number of positive divisors of n,

$$d(n) = \sum_{m \mid n} 1.$$

Euler's function. We have seen this function and some of its properties already. It is the number $\varphi(n)$ of integers m with $1 \le m \le n$ and $(m,n) = 1$. We have already seen that

$$\sum_{m \mid n} \varphi(m) = n.$$

The Möbius function. This is a more peculiar function. It is defined by

$$\mu(n) = \begin{cases} (-1)^k & \text{if } n \text{ is a product of } k \text{ distinct primes},\\ 0 & \text{if there is a prime } p \text{ such that } p^2 \mid n. \end{cases}$$

It is also convenient to introduce three very boring functions.

The unit.

$$e(n) = \begin{cases} 1 & (n = 1),\\ 0 & (n > 1). \end{cases}$$


The one.

1(n) = 1 for every n.

The identity.

N(n) = n.
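The functions introduced so far are easy to tabulate by brute force. The following is a minimal sketch, not an efficient implementation; the helper names (`divisors`, `one`, and so on) are our own choices for illustration. It evaluates d, ϕ and µ directly from their definitions and checks that the sum of ϕ(m) over the divisors m of n is n.

```python
from math import gcd

def divisors(n):
    """All positive divisors of n, by trial division (fine for small n)."""
    return [m for m in range(1, n + 1) if n % m == 0]

def d(n):
    """Divisor function: the number of positive divisors of n."""
    return len(divisors(n))

def phi(n):
    """Euler's function: the number of m in [1, n] with (m, n) = 1."""
    return sum(1 for m in range(1, n + 1) if gcd(m, n) == 1)

def mu(n):
    """Moebius function, computed by trial factorisation."""
    k, p = 0, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divided the original n
                return 0
            k += 1
        p += 1
    if n > 1:                   # one remaining prime factor
        k += 1
    return (-1) ** k

def e(n):
    return 1 if n == 1 else 0   # the unit

def one(n):
    return 1                    # the function 1

def N(n):
    return n                    # the identity

for n in (1, 12, 30):
    print(n, d(n), phi(n), mu(n), sum(phi(m) for m in divisors(n)))
```

The last column printed should reproduce n itself.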

Two other functions which have interesting structures but which we will say less about at this stage are

The primitive character modulo 4. We define

$$\chi_1(n) = \begin{cases} (-1)^{(n-1)/2} & 2 \nmid n,\\ 0 & 2 \mid n. \end{cases}$$

Similar functions we have already met are the Legendre symbol and its generalization, the Jacobi symbol, which for a fixed odd natural number m is written $\left(\frac{n}{m}\right)_J$.

Sums of two squares. We define r(n) to be the number of ways of writing n as the sum of two squares of integers.

For example, $1 = 0^2 + (\pm 1)^2 = (\pm 1)^2 + 0^2$, so r(1) = 4, r(3) = r(6) = r(7) = 0, r(9) = 4, and $65 = (\pm 1)^2 + (\pm 8)^2 = (\pm 4)^2 + (\pm 7)^2$, which together with the representations obtained by interchanging the two squares gives r(65) = 16. This is the function $r_2(n)$ of the previous chapter.

The functions $d$, $\varphi$, $e$, $1$, $N$, $\chi_1$ and $\left(\frac{\cdot}{m}\right)_J$ have an interesting property, namely that they are multiplicative. We already discussed this in connection with Euler's function and the Legendre and Jacobi symbols. Here is a reminder.

Definition. An arithmetical function f which is not identically 0 is multiplicative when it satisfies

$$f(mn) = f(m)f(n)$$

whenever (m,n) = 1. Let $\mathcal{M}$ denote the set of multiplicative functions.

The function r(n) is not multiplicative, since r(65) = 16 but r(5) = r(13) = 8. Indeed the fact that r(1) = 4 would contradict the next theorem. However it is true that r(n)/4 is multiplicative, but this is a little trickier to prove.
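The values of r(n) quoted above can be checked by direct enumeration. This is a small sketch under the obvious brute-force approach (the function name `r` simply mirrors the notation of the notes). It also tests numerically, for small coprime pairs only, the claim that r(n)/4 is multiplicative; this is evidence, not a proof.

```python
from math import gcd, isqrt

def r(n):
    """Number of pairs (a, b) of integers with a^2 + b^2 = n, for n >= 1."""
    count = 0
    for a in range(-isqrt(n), isqrt(n) + 1):
        rest = n - a * a
        b = isqrt(rest)
        if b * b == rest:
            # b = 0 gives one solution, otherwise b and -b both work
            count += 1 if b == 0 else 2
    return count

print([r(n) for n in (1, 3, 5, 9, 13, 65)])

# r(mn)/4 = (r(m)/4)(r(n)/4) is equivalent to 4 r(mn) = r(m) r(n)
ok = all(4 * r(m * n) == r(m) * r(n)
         for m in range(1, 31) for n in range(1, 31) if gcd(m, n) == 1)
print(ok)
```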


Theorem 6.1. Suppose that $f \in \mathcal{M}$. Then f(1) = 1.

Proof. Since f is not identically 0 there is an n such that $f(n) \ne 0$. Hence $f(n) = f(n \times 1) = f(n)f(1)$, and the conclusion follows on dividing by f(n).

It is pretty obvious that e, 1 and N are in $\mathcal{M}$, and it is actually quite easy to show the following.

Theorem 6.2. We have $\mu \in \mathcal{M}$.

Proof. Suppose that (m,n) = 1. If $p^2 \mid mn$, then $p^2 \mid m$ or $p^2 \mid n$, so $\mu(mn) = 0 = \mu(m)\mu(n)$. If

$$m = p_1 \cdots p_k, \qquad n = p'_1 \cdots p'_l$$

with the $p_i$, $p'_j$ distinct, then

$$\mu(mn) = (-1)^{k+l} = (-1)^k (-1)^l = \mu(m)\mu(n).$$

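The following is very useful.

Theorem 6.3. Suppose that $f \in \mathcal{M}$, $g \in \mathcal{M}$ and h is defined for each n by

$$h(n) = \sum_{m \mid n} f(m)\, g(n/m).$$

Then $h \in \mathcal{M}$.

Proof. Suppose $(n_1, n_2) = 1$. Then a typical divisor m of $n_1 n_2$ is uniquely of the form $m_1 m_2$ with $m_1 \mid n_1$ and $m_2 \mid n_2$, and then $(m_1, m_2) = 1$ and $(n_1/m_1, n_2/m_2) = 1$. Hence

$$h(n_1 n_2) = \sum_{m_1 \mid n_1} \sum_{m_2 \mid n_2} f(m_1 m_2)\, g\big(n_1 n_2/(m_1 m_2)\big) = \sum_{m_1 \mid n_1} f(m_1)\, g(n_1/m_1) \sum_{m_2 \mid n_2} f(m_2)\, g(n_2/m_2) = h(n_1)h(n_2).$$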

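This enables us to establish an interesting property of the Möbius function.

Theorem 6.4. We have

$$\sum_{m \mid n} \mu(m) = e(n).$$

Proof. By the definition of 1 the sum here is

$$\sum_{m \mid n} \mu(m)\, 1(n/m)$$

and so by the previous theorem it is in $\mathcal{M}$. Moreover if $k \ge 1$, then

$$\sum_{m \mid p^k} \mu(m) = \mu(1) + \mu(p) = 1 - 1 = 0 = e(p^k).$$

Thus the sum agrees with e at 1 and at every prime power, and since both functions are multiplicative they agree everywhere.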

This suggests a general way of defining new functions.

Definition. Given two arithmetical functions f and g we define the Dirichlet convolution f ∗ g to be the function defined by

$$(f * g)(n) = \sum_{m \mid n} f(m)\, g(n/m).$$

Note that this operation is commutative: simply replace m by n/m. It is also quite easy to see that

$$(f * g) * h = f * (g * h).$$

Write the left hand side as

$$\sum_{m \mid n} \Big( \sum_{l \mid m} f(l)\, g(m/l) \Big) h(n/m)$$

and interchange the order of summation and replace m by kl.

Dirichlet convolution has some interesting properties.

1. $f * e = e * f = f$ for any $f \in \mathcal{A}$, so e is really acting as a unit.
2. $\mu * 1 = 1 * \mu = e$, so µ is the inverse of 1, and vice versa.
3. $d = 1 * 1$, so $d \in \mathcal{M}$. Hence
4. $d(p^k) = k + 1$ and $d(p_1^{k_1} \cdots p_r^{k_r}) = (k_1 + 1) \cdots (k_r + 1)$.
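The listed properties can be verified numerically for small n. The sketch below represents arithmetical functions as ordinary Python functions and builds the convolution as a closure; the names `conv`, `one` and `divisors` are assumptions made for this illustration only.

```python
def divisors(n):
    return [m for m in range(1, n + 1) if n % m == 0]

def conv(f, g):
    """Dirichlet convolution: (f * g)(n) = sum over m | n of f(m) g(n/m)."""
    return lambda n: sum(f(m) * g(n // m) for m in divisors(n))

def mu(n):
    """Moebius function by trial factorisation."""
    k, p = 0, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            k += 1
        p += 1
    return (-1) ** (k + (n > 1))

e = lambda n: 1 if n == 1 else 0
one = lambda n: 1
N = lambda n: n

d = conv(one, one)                       # property 3: d = 1 * 1

for n in (1, 8, 12, 72):
    print(n, conv(mu, one)(n), e(n), d(n))

# associativity on a sample: (mu * N) * 1 against mu * (N * 1)
lhs = conv(conv(mu, N), one)
rhs = conv(mu, conv(N, one))
print(all(lhs(n) == rhs(n) for n in range(1, 100)))
```

Closures keep the code close to the algebra, at the cost of recomputing divisor lists; for larger ranges one would tabulate values in arrays instead.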

Theorem 6.5 (Möbius inversion I). Suppose that $f \in \mathcal{A}$ and $g = f * 1$. Then $f = g * \mu$.

Proof. We have

$$g * \mu = (f * 1) * \mu = f * (1 * \mu) = f * e = f.$$

Theorem 6.6 (Möbius inversion II). Suppose that $g \in \mathcal{A}$ and $f = g * \mu$. Then $g = f * 1$.

The proof is similar.


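Theorem 6.7. We have $\varphi = \mu * N$ and $\varphi \in \mathcal{M}$. Moreover

$$\varphi(n) = n \sum_{m \mid n} \frac{\mu(m)}{m} = n \prod_{p \mid n} \Big( 1 - \frac{1}{p} \Big).$$

Proof. We saw earlier that $\varphi * 1 = N$. Hence by Möbius inversion (Theorem 6.5) we have

$$\varphi = N * \mu = \mu * N.$$

Therefore, by Theorem 6.3, $\varphi \in \mathcal{M}$. Moreover $\varphi(p^k) = p^k - p^{k-1}$ and we are done.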

Theorem 6.8. Let $\mathcal{D} = \{ f \in \mathcal{A} : f(1) \ne 0 \}$. Then $\langle \mathcal{D}, * \rangle$ is an abelian group.

Proof. Of course e is the unit, and closure is obvious. We already checked commutativity and associativity. It remains, given $f \in \mathcal{D}$, to construct an inverse. Define g iteratively by

$$g(1) = 1/f(1),$$

$$g(n) = -\frac{1}{f(1)} \sum_{\substack{m \mid n \\ m > 1}} f(m)\, g(n/m) \qquad (n > 1),$$

and it is clear that $f * g = e$.
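The iterative construction in the proof is directly computable. Here is a hedged sketch (the function name `dirichlet_inverse` and the use of exact rational arithmetic via `fractions.Fraction` are our choices, not part of the notes) which tabulates g(1), ..., g(limit) from the recursion.

```python
from fractions import Fraction

def dirichlet_inverse(f, limit):
    """Table g[1..limit] with (f * g)(n) = e(n), assuming f(1) != 0."""
    g = {1: Fraction(1) / f(1)}
    for n in range(2, limit + 1):
        # sum over divisors m > 1 of n, exactly as in the proof
        s = sum(f(m) * g[n // m] for m in range(2, n + 1) if n % m == 0)
        g[n] = -s / f(1)
    return g

inv_one = dirichlet_inverse(lambda n: 1, 30)   # should reproduce mu
inv_N = dirichlet_inverse(lambda n: n, 30)     # inverse of the identity N
print([int(inv_one[n]) for n in range(1, 13)])
print([int(inv_N[n]) for n in range(1, 13)])
```

The first list should be recognisable as the values of µ, in line with property 2 above.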


There is a function which we have already seen, but which we have only used so far as a form of shorthand. This is the floor function. It is not an arithmetical function, since it is defined on $\mathbb{R}$ and not on $\mathbb{N}$.

Definition. Given a real number x we define the floor function of x, $\lfloor x \rfloor$, as the greatest integer l with $l \le x$. Occasionally it is also useful to define the ceiling function $\lceil x \rceil$ as the smallest integer u such that $x \le u$. The difference $x - \lfloor x \rfloor$ is often called the fractional part of x and is sometimes denoted by $\{x\}$.

Examples. $\lfloor \pi \rfloor = 3$, $\lceil \pi \rceil = 4$, $\lfloor \sqrt{2} \rfloor = 1$, $\lfloor -\sqrt{2} \rfloor = -2$, $\lceil -\sqrt{2} \rceil = -1$.

Another related function which is very useful in some parts of number theory, although we will not use it here, is $\|x\|$, the distance of x from a nearest integer,

$$\|x\| = \min_{n \in \mathbb{Z}} |x - n| = \min\big( x - \lfloor x \rfloor, \lceil x \rceil - x \big).$$

The floor function has some useful properties.

Theorem 6.9 (Properties of the floor function).
(i) For any $x \in \mathbb{R}$ we have $0 \le x - \lfloor x \rfloor < 1$.
(ii) For any $x \in \mathbb{R}$ and $k \in \mathbb{Z}$ we have $\lfloor x + k \rfloor = \lfloor x \rfloor + k$.
(iii) For any $x \in \mathbb{R}$ and any $n \in \mathbb{N}$ we have $\lfloor x/n \rfloor = \lfloor \lfloor x \rfloor / n \rfloor$.
(iv) For any $x, y \in \mathbb{R}$ we have $\lfloor x \rfloor + \lfloor y \rfloor \le \lfloor x + y \rfloor \le \lfloor x \rfloor + \lfloor y \rfloor + 1$.
(v) For $x \in \mathbb{R}$ define $b(x) = \lfloor x \rfloor - 2\lfloor x/2 \rfloor$. Then b(x) is periodic with period 2, and b(x) = 0 when $0 \le x < 1$ and b(x) = 1 when $1 \le x < 2$.

Proof. (i) This is pretty obvious. If $x - \lfloor x \rfloor < 0$, then $x < \lfloor x \rfloor$, contradicting the definition. If $1 \le x - \lfloor x \rfloor$, then $1 + \lfloor x \rfloor \le x$, also contradicting the definition. This also shows that $\lfloor x \rfloor$ is unique.

(ii) One way to see this is to observe that by (i) we have $x = \lfloor x \rfloor + \theta$ for some θ with $0 \le \theta < 1$. Then $x + k - \lfloor x \rfloor - k = \theta$, and since there is only one integer l with $0 \le x + k - l < 1$, and this l is $\lfloor x + k \rfloor$, we must have $\lfloor x + k \rfloor = \lfloor x \rfloor + k$.

(iii) We know by (i) that $\theta = x/n - \lfloor x/n \rfloor$ satisfies $0 \le \theta < 1$. Now $x = n\lfloor x/n \rfloor + n\theta$ and so by (ii) $\lfloor x \rfloor = n\lfloor x/n \rfloor + \lfloor n\theta \rfloor$. Hence $\lfloor x \rfloor / n = \lfloor x/n \rfloor + \lfloor n\theta \rfloor / n$ and, since $0 \le \lfloor n\theta \rfloor < n$, we have $\lfloor x/n \rfloor \le \lfloor x \rfloor / n < \lfloor x/n \rfloor + 1$ and so $\lfloor x/n \rfloor = \lfloor \lfloor x \rfloor / n \rfloor$.

(iv) Put $x = \lfloor x \rfloor + \theta$ and $y = \lfloor y \rfloor + \phi$ where $0 \le \theta, \phi < 1$. Then by (ii) $\lfloor x + y \rfloor = \lfloor \theta + \phi \rfloor + \lfloor x \rfloor + \lfloor y \rfloor$ and $0 \le \theta + \phi < 2$, so that $\lfloor \theta + \phi \rfloor$ is 0 or 1.

(v) The periodicity is easy, since for any $k \in \mathbb{Z}$ we have

$$b(x + 2k) = \lfloor x \rfloor + 2k - 2\lfloor (x/2) + k \rfloor = \lfloor x \rfloor + 2k - 2\lfloor x/2 \rfloor - 2k = b(x).$$

Hence we only have to evaluate it when $0 \le x < 2$. It is pretty clear that b(x) = 0 when $0 \le x < 1$ and b(x) = 1 when $1 \le x < 2$.

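6.2. Euler and primes. Here is a proof of the infinitude of primes which is essentially due to Euler and is analytic in nature. To begin, first consider the important sum

$$S(x) = \sum_{n \le x} \frac{1}{n}$$

where x is a large real number. Of course the sum behaves a bit like the integral, so is a bit like log x. In fact there is something more precise which one can say, which was discovered by Euler. We have

$$S(x) = \sum_{n \le x} \Big( \frac{1}{x} + \int_n^x \frac{dt}{t^2} \Big) = \frac{\lfloor x \rfloor}{x} + \int_1^x \frac{\lfloor t \rfloor}{t^2}\, dt$$

$$= \int_1^x \frac{dt}{t} + 1 - \int_1^x \frac{t - \lfloor t \rfloor}{t^2}\, dt - \frac{x - \lfloor x \rfloor}{x}$$

$$= \log x + C_0 + \int_x^\infty \frac{t - \lfloor t \rfloor}{t^2}\, dt - \frac{x - \lfloor x \rfloor}{x}$$

where

$$C_0 = 1 - \int_1^\infty \frac{t - \lfloor t \rfloor}{t^2}\, dt,$$

which gives

$$\sum_{n \le x} \frac{1}{n} = \log x + C_0 + O(1/x). \qquad (6.1)$$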
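Formula (6.1) is easy to observe numerically. The following sketch (the function name `S` mirrors the notation above; floating point summation is assumed to be accurate enough at this scale) prints S(x) − log x for a few values of x, which should settle down near the constant 0.577... at a rate consistent with the O(1/x) error term.

```python
from math import log

def S(x):
    """Partial sum of the harmonic series over n <= x, for real x >= 1."""
    return sum(1.0 / n for n in range(1, int(x) + 1))

for x in (10, 100, 1000, 10 ** 5):
    print(x, S(x) - log(x))
```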

Here $C_0 = 0.577\ldots$ is Euler's constant, which he computed to 19 decimal places (by hand of course). Actually that is not so hard and we might say something about it later.

By the way, it would be good here to say something about notation. Typically most Latin letters will be integers or natural numbers, but t, x, y may well be real numbers, according to context, and z, and in Dirichlet series s, will be complex numbers.

Given functions f and g defined on some domain $\mathcal{X}$ with $g(x) \ge 0$ for all $x \in \mathcal{X}$ we write

$$f(x) = O(g(x)) \qquad (6.2)$$

to mean that there is some constant C such that

$$|f(x)| \le C g(x)$$

for every $x \in \mathcal{X}$. We also use

$$f(x) = o(g(x))$$

to mean that if there is some limiting operation, such as $x \to \infty$, then

$$\frac{f(x)}{g(x)} \to 0,$$

and

$$f(x) \sim g(x)$$

to mean

$$\frac{f(x)}{g(x)} \to 1.$$

The symbol O was introduced by Bachmann about 120 years ago, and the notation was popularised by Landau, who added the symbol o. The O-symbol can be a bit clumsy for complicated expressions and we will often instead use the Vinogradov symbols, which I. M. Vinogradov introduced about 90 years ago. Thus we will use

$$f \ll g \qquad (6.3)$$

to mean (6.2). This has the advantage that we can write strings of inequalities in the form

$$f_1 \ll f_2 \ll f_3 \ll \ldots.$$

Also if f is also non-negative we may use

$$g \gg f$$

to mean (6.3).

Return to

$$S(x) = \sum_{n \le x} \frac{1}{n}.$$

Less precise than Euler's result is the observation that

$$S(x) \ge \sum_{n \le x} \int_n^{n+1} \frac{dt}{t} \ge \int_1^x \frac{dt}{t} = \log x.$$

Now consider

$$P(x) = \prod_{p \le x} (1 - 1/p)^{-1}$$

where the product is over the primes not exceeding x. Then

$$P(x) = \prod_{p \le x} \Big( 1 + \frac{1}{p} + \frac{1}{p^2} + \cdots \Big) \ge \sum_{n \le x} \frac{1}{n} \ge \log x.$$

Note that when one multiplies out the left hand side every fraction $\frac{1}{n}$ with $n \le x$ occurs. Since $\log x \to \infty$ as $x \to \infty$, there have to be infinitely many primes. Actually one can get something a bit more precise. Take logs on both sides. Thus

$$-\sum_{p \le x} \log(1 - 1/p) \ge \log\log x.$$

Moreover the expression on the left is

$$-\sum_{p \le x} \log(1 - 1/p) = \sum_{p \le x} \sum_{k=1}^\infty \frac{1}{k p^k}.$$

Here the terms with $k \ge 2$ contribute at most

$$\sum_{p \le x} \frac{1}{2} \sum_{k=2}^\infty \frac{1}{p^k} \le \frac{1}{2} \sum_{n=2}^\infty \frac{1}{n(n-1)} = \frac{1}{2}.$$

Hence we have just proved that

$$\sum_{p \le x} \frac{1}{p} \ge \log\log x - \frac{1}{2}.$$

This is quite close to the truth, and we will show in a while that there is a constant $C_1$ such that

$$\sum_{p \le x} \frac{1}{p} = \log\log x + C_1 + o(1).$$

Since

$$\int_2^x \frac{dt}{t \log t} = \log\log x - \log\log 2$$

it suggests that about $1/\log n$ of the numbers near n are prime, or in other words the "probability" that n is prime is $1/\log n$. Hence one might guess that π(x), the number of primes not exceeding x, is indeed about

$$\int_2^x \frac{dt}{\log t}$$

and the following table indicates that this is indeed true for x out to about $10^{22}$.


x        π(x)                      li(x)
10       4                         5.12
10^2     25                        29
10^3     168                       176
10^4     1229                      1245
10^5     9592                      9628
10^6     78498                     78626
10^7     664579                    664917
10^8     5761455                   5762208
10^9     50847534                  50849233
10^10    455052511                 455055613
10^11    4118054813                4118066399
10^12    37607912018               37607950279
10^13    346065536839              346065645809
10^14    3204941750802             3204942065690
10^15    29844570422669            29844571475286
10^16    279238341033925           279238344248555
10^17    2623557157654233          2623557165610820
10^18    24739954287740860         24739954309690413
10^19    234057667276344607        234057667376222382
10^20    2220819602560918840       2220819602783663483
10^21    21127269486018731928      21127269486616126182
10^22    201467286689315906290     201467286691248261498
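The first few rows of the table can be reproduced on any computer. The sketch below uses a sieve of Eratosthenes for π(x) and, as a deliberately crude assumption, a trapezoidal rule with unit steps for the integral of dt/log t from 2 to x; the function names are ours, and the integral values are only approximate.

```python
from math import isqrt, log

def prime_counts(limit, checkpoints):
    """pi(x) for each x in checkpoints (all <= limit), via a sieve."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytes(len(range(p * p, limit + 1, p)))
    wanted, counts, running = set(checkpoints), {}, 0
    for n in range(limit + 1):
        running += sieve[n]
        if n in wanted:
            counts[n] = running
    return counts

def li_from_2(x):
    """Trapezoidal approximation to the integral of dt/log t over [2, x]."""
    return sum(0.5 * (1 / log(n) + 1 / log(n + 1)) for n in range(2, x))

xs = [10 ** k for k in range(1, 7)]
pi = prime_counts(10 ** 6, xs)
for x in xs:
    print(x, pi[x], round(li_from_2(x), 2))
```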


Euler's result on primes is often quoted as follows.

Theorem 6.10 (Euler). The sum

$$\sum_p \frac{1}{p}$$

diverges.

6.3. Averages of arithmetical functions. In Euler's work above we have already seen that taking averages is an interesting way of examining things. Indeed one of the most powerful techniques we have is to take an average. One of the more famous theorems of this kind is

Theorem 6.11 (Dirichlet). Suppose that $X \in \mathbb{R}$ and $X \ge 2$. Then

$$\sum_{n \le X} d(n) = X \log X + (2C_0 - 1)X + O(X^{1/2}).$$

Proof. We follow Dirichlet's proof method, which has become known as the method of the hyperbola. The divisor function d(n) can be thought of as the number of ordered pairs of positive integers m, l such that ml = n. Thus when we sum over $n \le X$ we are just counting the number of ordered pairs m, l such that $ml \le X$. In other words we are counting the number of lattice points (m, l) under the rectangular hyperbola

$$xy = X.$$

We could just crudely count, given $m \le X$, the number of choices for l, namely

$$\Big\lfloor \frac{X}{m} \Big\rfloor,$$

and obtain

$$\sum_{m \le X} \frac{X}{m} + O(X),$$

but this gives a much weaker error term.

Dirichlet's idea is to divide the region under the hyperbola into two parts. That with

$$m \le \sqrt{X}, \quad l \le \frac{X}{m}$$

and that with

$$l \le \sqrt{X}, \quad m \le \frac{X}{l}.$$

Clearly each region has the same number of lattice points. However the points (m, l) with $m \le \sqrt{X}$ and $l \le \sqrt{X}$ are counted in both regions. Thus we obtain

$$\sum_{n \le X} d(n) = 2 \sum_{m \le \sqrt{X}} \Big\lfloor \frac{X}{m} \Big\rfloor - \lfloor \sqrt{X} \rfloor^2$$

$$= 2 \sum_{m \le \sqrt{X}} \frac{X}{m} - X + O(X^{1/2})$$

$$= 2X \big( \log(\sqrt{X}) + C_0 \big) - X + O(X^{1/2}),$$

where in the last line we used Euler's estimate (6.1).
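The first identity in the display above is exact, and it is worth seeing that it computes the divisor sum with only about √X operations. The sketch below (function names are ours; X is taken to be a positive integer for simplicity) compares the hyperbola formula with the crude count and with the main terms of Theorem 6.11, using the numerical value 0.5772156649 for Euler's constant.

```python
from math import isqrt, log

def divisor_sum_hyperbola(X):
    """Sum of d(n) for n <= X via 2 * sum_{m <= sqrt X} floor(X/m) - floor(sqrt X)^2."""
    s = isqrt(X)
    return 2 * sum(X // m for m in range(1, s + 1)) - s * s

def divisor_sum_crude(X):
    """The same sum, counting the choices of l for every m <= X."""
    return sum(X // m for m in range(1, X + 1))

C0 = 0.5772156649  # Euler's constant, to the accuracy needed here
for X in (10, 1000, 10 ** 6):
    exact = divisor_sum_hyperbola(X)
    main = X * log(X) + (2 * C0 - 1) * X
    print(X, exact, round(exact - main, 2), round(X ** 0.5, 2))
```

The third column is the error term, to be compared with the square root of X in the last column.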

One can also compute an average for Euler's function.

Theorem 6.12. Suppose that $x \in \mathbb{R}$ and $x \ge 2$. Then

$$\sum_{n \le x} \varphi(n) = \frac{x^2}{2} \sum_{m=1}^\infty \frac{\mu(m)}{m^2} + O(x \log x).$$

We remark that the infinite series here is "well known" to be $\frac{6}{\pi^2}$.


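Proof. We have $\varphi = \mu * N$. Thus

$$\sum_{n \le x} \varphi(n) = \sum_{n \le x} n \sum_{m \mid n} \frac{\mu(m)}{m} = \sum_{m \le x} \mu(m) \sum_{l \le x/m} l.$$

We want a good approximation to the inner sum. This is just the sum of an arithmetic progression of $\lfloor x/m \rfloor$ terms with first term 1 and last term $\lfloor x/m \rfloor$. Thus the sum is

$$\frac{1}{2} \Big\lfloor \frac{x}{m} \Big\rfloor \Big( 1 + \Big\lfloor \frac{x}{m} \Big\rfloor \Big) = \frac{1}{2} \Big( \frac{x}{m} \Big)^2 + O\Big( \frac{x}{m} \Big).$$

Inserting this in the formula above gives

$$\sum_{n \le x} \varphi(n) = \frac{x^2}{2} \sum_{m \le x} \frac{\mu(m)}{m^2} + O\Big( \sum_{m \le x} \frac{x}{m} \Big).$$

The error term is $\ll x \log x$ by Euler's bound applied to the sum. The main term is

$$\frac{x^2}{2} \sum_{m=1}^\infty \frac{\mu(m)}{m^2} + O\Big( \sum_{m > x} \frac{x^2}{m^2} \Big).$$

The error term here, by the monotonicity of the general term, is

$$\ll x^2 \int_x^\infty \frac{dy}{y^2} \ll x.$$

Collecting together our bounds gives the theorem.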

Likewise for the sum of two squares function.

Theorem 6.13 (Gauss). Suppose that $X \in \mathbb{R}$ and $X \ge 2$. Then

$$\sum_{n \le X} r(n) = \pi X + O(X^{1/2}).$$

We leave the proof as an exercise. As a hint, one can again consider it as a lattice point problem, this time the number of lattice points inside a closed circle centred at the origin and of radius $\sqrt{X}$. Then, there is a general principle, which is easy to prove in this case, that the number of lattice points in a convex region is equal to the area of the region with an error proportional to the length of the boundary. One way of seeing this is to associate the square $[u, u+1) \times [v, v+1)$ with the lattice point (u, v) and to observe that all the relevant squares are inside the circle of radius $\sqrt{X} + \sqrt{2}$ and their union contains the circle of radius $\sqrt{X} - \sqrt{2}$.


6.4. Elementary Prime number theory. The strongest results we know about the distribution of primes use complex analytic methods. However there are some very useful and basic results that can be established elementarily. Many expositions of the results we are going to describe use nothing more than properties of binomial coefficients, but it is good to start to get the flavour of more sophisticated methods even though here they could be interpreted as just properties of binomial coefficients. We start by introducing

The von Mangoldt function. This is defined by

$$\Lambda(n) = \begin{cases} 0 & \text{if } n = 1,\\ 0 & \text{if } p_1 p_2 \mid n \text{ with } p_1 \ne p_2,\\ \log p & \text{if } n = p^k \text{ with } k \ge 1. \end{cases}$$

The interesting thing is that the support of Λ is on the prime powers, and the higher powers $p^k$ with $k \ge 2$ are quite rare: there are $O(\sqrt{x})$ of them not exceeding x.


This function is definitely not multiplicative, since Λ(1) = 0. However it does satisfy some interesting relationships.

Lemma 6.14. Let $n \in \mathbb{N}$. Then

$$\sum_{m \mid n} \Lambda(m) = \log n.$$

Proof. Write $n = p_1^{k_1} \cdots p_r^{k_r}$ with the $p_j$ distinct. Then for a non-zero contribution to the sum we have $m = p_s^{j_s}$ for some s with $1 \le s \le r$ and $j_s$ with $1 \le j_s \le k_s$. Thus the sum is

$$\sum_{s=1}^r \sum_{j_s=1}^{k_s} \log p_s = \sum_{s=1}^r k_s \log p_s = \log n.$$
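Lemma 6.14 can be checked numerically in a few lines. In this sketch the function name `mangoldt` is our own, and Λ is computed by trial division, which is adequate only for small arguments; floating point logarithms are compared with a small tolerance.

```python
from math import isclose, log

def mangoldt(n):
    """Lambda(n): log p if n is a power p^k (k >= 1) of a prime p, else 0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            # n was a power of p exactly when nothing is left over
            return log(p) if n == 1 else 0.0
        p += 1
    return log(n)  # no divisor up to sqrt(n), so n itself is prime

def divisor_sum_of_mangoldt(n):
    return sum(mangoldt(m) for m in range(1, n + 1) if n % m == 0)

for n in (1, 8, 12, 360):
    print(n, divisor_sum_of_mangoldt(n), log(n))
```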

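We need to know something about the average of log n.

Lemma 6.15 (Stirling). Suppose that $X \in \mathbb{R}$ and $X \ge 2$. Then

$$\sum_{n \le X} \log n = X(\log X - 1) + O(\log X).$$

This can be thought of as the logarithm of Stirling's formula for $\lfloor X \rfloor !$.

Proof. We have

$$\sum_{n \le X} \log n = \sum_{n \le X} \Big( \log X - \int_n^X \frac{dt}{t} \Big) = \lfloor X \rfloor \log X - \int_1^X \frac{\lfloor t \rfloor}{t}\, dt$$

$$= X(\log X - 1) + \int_1^X \frac{t - \lfloor t \rfloor}{t}\, dt + O(\log X).$$

The remaining integral is at most $\int_1^X dt/t = \log X$, which gives the lemma.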

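Now we can say something about averages of the von Mangoldt function.

Theorem 6.16. Suppose that $X \in \mathbb{R}$ and $X \ge 2$. Then

$$\sum_{m \le X} \Lambda(m) \Big\lfloor \frac{X}{m} \Big\rfloor = X(\log X - 1) + O(\log X).$$

Proof. The sum in question is

$$\sum_{m \le X} \Lambda(m) \sum_{k \le X/m} 1.$$

Collecting together the ordered pairs with mk = n for a given n and rearranging gives

$$\sum_{n \le X} \sum_{\substack{k, m \\ km = n}} \Lambda(m)$$

and this is

$$\sum_{n \le X} \sum_{m \mid n} \Lambda(m).$$

By Lemma 6.14 this is

$$\sum_{n \le X} \log n$$

and by Lemma 6.15 it is

$$X(\log X - 1) + O(\log X).$$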

At this stage it is necessary to introduce some of the fundamental counting functions of prime number theory. For $X \ge 0$ we define

$$\psi(X) = \sum_{n \le X} \Lambda(n),$$

$$\vartheta(X) = \sum_{p \le X} \log p,$$

$$\pi(X) = \sum_{p \le X} 1.$$

The following theorem shows the close relationship between these three functions.

Theorem 6.17. Suppose that $X \ge 2$. Then

$$\psi(X) = \sum_k \vartheta(X^{1/k}),$$

$$\vartheta(X) = \sum_k \mu(k)\, \psi(X^{1/k}),$$

$$\pi(X) = \frac{\vartheta(X)}{\log X} + \int_2^X \frac{\vartheta(t)}{t \log^2 t}\, dt,$$

$$\vartheta(X) = \pi(X) \log X - \int_2^X \frac{\pi(t)}{t}\, dt.$$

Note that each of these functions is 0 when X < 2, so the sums are all finite.


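Proof. By the definition of Λ we have

$$\psi(X) = \sum_k \sum_{p \le X^{1/k}} \log p = \sum_k \vartheta(X^{1/k}).$$

Hence we have

$$\sum_k \mu(k)\, \psi(X^{1/k}) = \sum_k \mu(k) \sum_l \vartheta(X^{1/(kl)}).$$

Collecting together the terms for which kl = m for a given m, and using Theorem 6.4, this becomes

$$\sum_m \vartheta(X^{1/m}) \sum_{k \mid m} \mu(k) = \vartheta(X).$$

We also have

$$\pi(X) = \sum_{p \le X} (\log p) \Big( \frac{1}{\log X} + \int_p^X \frac{dt}{t \log^2 t} \Big) = \frac{\vartheta(X)}{\log X} + \int_2^X \frac{\vartheta(t)}{t \log^2 t}\, dt.$$

The final identity is similar:

$$\vartheta(X) = \sum_{p \le X} \log X - \sum_{p \le X} \int_p^X \frac{dt}{t},$$

etcetera.

Now we come to a series of theorems which are still used frequently.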


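Theorem 6.18 (Chebyshev). There are positive constants $C_1$ and $C_2$ such that for each $X \in \mathbb{R}$ with $X \ge 2$ we have

$$C_1 X < \psi(X) < C_2 X.$$

Proof. Recall the function

$$b(x) = \lfloor x \rfloor - 2 \Big\lfloor \frac{x}{2} \Big\rfloor$$

defined in Theorem 6.9 for $x \in \mathbb{R}$. There we showed that b is periodic with period 2 and

$$b(x) = \begin{cases} 0 & (0 \le x < 1),\\ 1 & (1 \le x < 2). \end{cases}$$

Hence, since $b \le 1$,

$$\psi(X) \ge \sum_{n \le X} \Lambda(n)\, b(X/n) = \sum_{n \le X} \Lambda(n) \Big\lfloor \frac{X}{n} \Big\rfloor - 2 \sum_{n \le X/2} \Lambda(n) \Big\lfloor \frac{X/2}{n} \Big\rfloor.$$

Here we used the fact that there is no contribution to the second sum when $X/2 < n \le X$. Now we apply Theorem 6.16 and obtain for $X \ge 4$

$$X(\log X - 1) - 2 \cdot \frac{X}{2} \Big( \log \frac{X}{2} - 1 \Big) + O(\log X) = X \log 2 + O(\log X).$$

This establishes the first inequality of the theorem for all X > C for some positive constant C. Since $\psi(X) \ge \log 2$ for all $X \ge 2$ the conclusion follows if $C_1$ is small enough.

We also have, for $X \ge 4$,

$$\psi(X) - \psi(X/2) \le \sum_{n \le X} \Lambda(n)\, b(X/n),$$

since b(X/n) = 1 when $X/2 < n \le X$, and we have already seen that this is

$$X \log 2 + O(\log X).$$

Hence for some positive constant C we have, for all X > 0,

$$\psi(X) - \psi(X/2) \le CX.$$

Hence, for any $k \ge 0$,

$$\psi(X 2^{-k}) - \psi(X 2^{-k-1}) \le C X 2^{-k}.$$

Summing over all k gives $\psi(X) \le 2CX$, which is the desired upper bound. We can now obtain the following.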


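Corollary 6.19 (Chebyshev). There are positive constants $C_3$, $C_4$, $C_5$, $C_6$ such that for every $X \ge 2$ we have

$$C_3 X < \vartheta(X) < C_4 X,$$

$$\frac{C_5 X}{\log X} < \pi(X) < \frac{C_6 X}{\log X}.$$

Proof. The second result of Theorem 6.17 states that

$$\vartheta(X) = \sum_{k=1}^\infty \mu(k)\, \psi(X^{1/k}).$$

Remember that the series is really finite because the terms are all 0 once $X^{1/k} < 2$, i.e. $k > (\log X)/(\log 2)$. Thus by the previous theorem

$$\Big| \sum_{k=2}^\infty \mu(k)\, \psi(X^{1/k}) \Big| \le C_2 X^{1/2} + C_2 X^{1/3} \frac{\log X}{\log 2} < C X^{1/2}$$

for some constant C. Thus

$$|\vartheta(X) - \psi(X)| < C X^{1/2}$$

and so by the previous theorem again

$$C_1 X - C X^{1/2} < \vartheta(X) < C_2 X + C X^{1/2} < C_4 X$$

with, say, $C_4 = C_2 + C$. If we take $0 < C' < C_1$, then

$$C' X < C_1 X - C X^{1/2}$$

provided that $X > X_0 = \Big( \dfrac{C}{C_1 - C'} \Big)^2$. Since $\vartheta(X) \ge \log 2$ whenever $X \ge 2$ we can take $C_3$ to be the minimum of $C'$ and

$$\min_{2 \le X \le X_0} \Big( \frac{\vartheta(X)}{X} \Big).$$

Now turn to π(X). By the third formula in Theorem 6.17 we have

$$\pi(X) = \frac{\vartheta(X)}{\log X} + \int_2^X \frac{\vartheta(t)}{t \log^2 t}\, dt.$$

Thus, at once

$$\pi(X) \ge \frac{\vartheta(X)}{\log X} \ge \frac{C_3 X}{\log X}.$$

The upper bound is more annoying. We have

$$\pi(X) \le \frac{C_4 X}{\log X} + \int_2^X \frac{C_4\, dt}{\log^2 t}.$$

The integral here is bounded by

$$\int_2^{\sqrt{X}} \frac{C_4\, dt}{(\log 2)^2} + \int_{\sqrt{X}}^X \frac{C_4\, dt}{(\log \sqrt{X})^2} < \frac{C_4 \sqrt{X}}{(\log 2)^2} + \frac{4 C_4 X}{(\log X)^2} < \frac{C'' X}{\log X}$$

for a suitable constant $C''$.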
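Chebyshev's bounds say that the ratios ψ(X)/X, ϑ(X)/X and π(X) log X / X stay between fixed positive constants. The following sketch tabulates these ratios for a few X; the function name `chebyshev_functions` is an assumption of this illustration, and the sieve expects an integer X ≥ 2.

```python
from math import isqrt, log

def chebyshev_functions(X):
    """Return (psi(X), theta(X), pi(X)) for an integer X >= 2."""
    sieve = bytearray([1]) * (X + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, isqrt(X) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytes(len(range(p * p, X + 1, p)))
    psi = theta = 0.0
    count = 0
    for p in range(2, X + 1):
        if sieve[p]:
            count += 1
            theta += log(p)
            q = p
            while q <= X:          # each prime power p^k <= X adds log p to psi
                psi += log(p)
                q *= p
    return psi, theta, count

for X in (100, 10 ** 4, 10 ** 6):
    psi, theta, count = chebyshev_functions(X)
    print(X, round(psi / X, 4), round(theta / X, 4), round(count * log(X) / X, 4))
```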


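It is also possible to establish a more precise version of Euler's result on the primes.

Theorem 6.20 (Mertens). There are constants B and c such that whenever $X \ge 2$ we have

$$\sum_{n \le X} \frac{\Lambda(n)}{n} = \log X + O(1),$$

$$\sum_{p \le X} \frac{\log p}{p} = \log X + O(1),$$

$$\sum_{p \le X} \frac{1}{p} = \log\log X + B + O\Big( \frac{1}{\log X} \Big),$$

$$\prod_{p \le X} \Big( 1 - \frac{1}{p} \Big) = \frac{c}{\log X} + O\Big( \frac{1}{(\log X)^2} \Big).$$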

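Proof. By Theorem 6.16 we have

$$\sum_{m \le X} \Lambda(m) \Big\lfloor \frac{X}{m} \Big\rfloor = X(\log X - 1) + O(\log X).$$

The left hand side is

$$X \sum_{m \le X} \frac{\Lambda(m)}{m} + O(\psi(X)).$$

Hence by Chebyshev's theorem we have

$$X \sum_{m \le X} \frac{\Lambda(m)}{m} = X \log X + O(X).$$

Dividing by X gives the first result.

We also have

$$\sum_{m \le X} \frac{\Lambda(m)}{m} = \sum_k \sum_{p^k \le X} \frac{\log p}{p^k}.$$

The terms with $k \ge 2$ contribute

$$\le \sum_p \sum_{k \ge 2} \frac{\log p}{p^k} \le \sum_{n=2}^\infty \frac{\log n}{n(n-1)}$$

which is convergent, and this gives the second expression.

Finally we can see that

$$\sum_{p \le X} \frac{1}{p} = \sum_{p \le X} \frac{\log p}{p} \Big( \frac{1}{\log X} + \int_p^X \frac{dt}{t \log^2 t} \Big) = \frac{1}{\log X} \sum_{p \le X} \frac{\log p}{p} + \int_2^X \sum_{p \le t} \frac{\log p}{p}\, \frac{dt}{t \log^2 t}.$$

Let

$$E(t) = \sum_{p \le t} \frac{\log p}{p} - \log t$$

so that by the second part of the theorem we have $E(t) \ll 1$. Then the above is

$$= \frac{\log X + E(X)}{\log X} + \int_2^X \frac{\log t + E(t)}{t \log^2 t}\, dt$$

$$= \log\log X + 1 - \log\log 2 + \int_2^\infty \frac{E(t)}{t \log^2 t}\, dt + \frac{E(X)}{\log X} - \int_X^\infty \frac{E(t)}{t \log^2 t}\, dt.$$

The first integral here converges and the last two terms are

$$\ll \frac{1}{\log X}.$$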

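For the final assertion of the theorem observe that

$$-\log\Big( 1 - \frac{1}{p} \Big) = \sum_{k=1}^\infty \frac{1}{k p^k}$$

and so

$$-\log \prod_{p \le X} \Big( 1 - \frac{1}{p} \Big) = \sum_{p \le X} \frac{1}{p} + B_1 - \sum_{p > X} \sum_{k=2}^\infty \frac{1}{k p^k}$$

where

$$B_1 = \sum_p \sum_{k=2}^\infty \frac{1}{k p^k}$$

which converges absolutely since

$$\sum_{k=2}^\infty \frac{1}{k p^k} \le \sum_{k=2}^\infty \frac{1}{p^k} = \frac{1}{p(p-1)}.$$

The other series is bounded by

$$\sum_{p > X} \frac{1}{p(p-1)} \ll X^{-1}.$$

Hence, by the third part of the theorem,

$$-\log \prod_{p \le X} \Big( 1 - \frac{1}{p} \Big) = \log\log X + B_2 + O\Big( \frac{1}{\log X} \Big)$$

for some real constant $B_2$. Exponentiating both sides gives the desired conclusion.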
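The third and fourth parts of Mertens' theorem can be observed numerically. The sketch below (function names are ours) computes the sum of 1/p minus log log X, and the product of (1 − 1/p) times log X, for primes up to X. The notes do not evaluate the constants; it is a known fact, not proved here, that B is approximately 0.2615 and that c equals exp(−C0), approximately 0.5615, and the printed values should be close to these.

```python
from math import isqrt, log

def primes_up_to(X):
    """List of primes not exceeding the integer X >= 2, via a sieve."""
    sieve = bytearray([1]) * (X + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, isqrt(X) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytes(len(range(p * p, X + 1, p)))
    return [p for p in range(2, X + 1) if sieve[p]]

def mertens_data(X):
    """Approximations to B and c obtained by truncating at X."""
    ps = primes_up_to(X)
    reciprocal_sum = sum(1.0 / p for p in ps)
    product = 1.0
    for p in ps:
        product *= 1.0 - 1.0 / p
    return reciprocal_sum - log(log(X)), product * log(X)

for X in (100, 10 ** 4, 10 ** 5):
    print(X, mertens_data(X))
```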

There is an interesting application of the above which led to some important developments. As a companion to the definition of a multiplicative function we have

Definition. An $f \in \mathcal{A}$ is additive when it satisfies f(mn) = f(m) + f(n) whenever (m,n) = 1.

Now we introduce two further functions.

Definition. We define ω(n) to be the number of different prime factors of n and Ω(n) to be the total number of prime factors of n.

Example. We have $360 = 2^3 \cdot 3^2 \cdot 5$ so that ω(360) = 3 and Ω(360) = 6. Generally, when the $p_j$ are distinct, $\omega(p_1^{k_1} \cdots p_r^{k_r}) = r$ and $\Omega(p_1^{k_1} \cdots p_r^{k_r}) = k_1 + \cdots + k_r$.

One might expect that most of the time Ω is appreciably bigger than ω, but in fact this is not so. By the way, there is some connection with the divisor function. It is not hard to show that

$$2^{\omega(n)} \le d(n) \le 2^{\Omega(n)}.$$

In fact this is a simple consequence of the chain of inequalities

$$2 \le k + 1 \le 2^k,$$

valid for every $k \ge 1$.
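The example and the inequality between d(n) and the powers of 2 are easy to confirm by machine. In this sketch all three functions are read off a trial-division factorisation; the names `factorise`, `omega` and `big_omega` are our own.

```python
def factorise(n):
    """Prime factorisation of n >= 1 as a dict {prime: exponent}."""
    out, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            out[p] = out.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        out[n] = out.get(n, 0) + 1
    return out

def omega(n):
    return len(factorise(n))              # number of distinct primes

def big_omega(n):
    return sum(factorise(n).values())     # primes counted with multiplicity

def d(n):
    result = 1
    for k in factorise(n).values():
        result *= k + 1                   # d(p^k) = k + 1, and d is multiplicative
    return result

print(factorise(360), omega(360), big_omega(360), d(360))
print(all(2 ** omega(n) <= d(n) <= 2 ** big_omega(n) for n in range(1, 2000)))
```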


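Theorem 6.21. Suppose that $X \ge 2$. Then

$$\sum_{n \le X} \omega(n) = X \log\log X + BX + O\Big( \frac{X}{\log X} \Big)$$

where B is the constant of Theorem 6.20, and

$$\sum_{n \le X} \Omega(n) = X \log\log X + \Big( B + \sum_p \frac{1}{p(p-1)} \Big) X + O\Big( \frac{X}{\log X} \Big).$$

Proof. We have

$$\sum_{n \le X} \omega(n) = \sum_{n \le X} \sum_{p \mid n} 1 = \sum_{p \le X} \Big\lfloor \frac{X}{p} \Big\rfloor = X \sum_{p \le X} \frac{1}{p} + O\big( \pi(X) \big)$$

and the result follows by combining Corollary 6.19 and Theorem 6.20.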


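The case of Ω is similar. We have

$$\sum_{n \le X} \Omega(n) = X \sum_{\substack{p, k \\ p^k \le X}} \frac{1}{p^k} + O\Big( \sum_{k \le (\log X)/(\log 2)} \pi(X^{1/k}) \Big).$$

When $k \ge 2$ the terms in the error are $\ll X^{1/2}$ and so the total contribution from the $k \ge 2$ is $\ll X^{1/2} \log X$. In the main term, the terms with $k \ge 2$ sum to $\sum_p \frac{1}{p(p-1)}$ less a tail, so it remains to understand the behaviour of

$$\sum_{k \ge 2} \sum_{p > X^{1/k}} \frac{1}{p^k} \le \sum_{p > X^{1/2}} \frac{1}{p^2} + \sum_{k \ge 3} \frac{1}{(X^{1/k})^{k/2}} \sum_p \frac{1}{p^{k/2}}.$$

The first sum is $\ll X^{-1/2}$ and the second is

$$\ll X^{-1/2} \sum_p \frac{1}{p(p^{1/2} - 1)} \ll X^{-1/2}.$$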

Hardy and Ramanujan made the remarkable discovery that log log n is not just the average of ω(n), but is its normal order. Later Turán found a simple proof of this.

Theorem 6.22 (Hardy & Ramanujan). Suppose that $X \ge 2$. Then

$$\sum_{n \le X} \Big( \omega(n) - \sum_{p \le X} \frac{1}{p} \Big)^2 \ll X \sum_{p \le X} \frac{1}{p},$$

$$\sum_{n \le X} \big( \omega(n) - \log\log X \big)^2 \ll X \log\log X$$

and

$$\sum_{2 \le n \le X} \big( \omega(n) - \log\log n \big)^2 \ll X \log\log X.$$
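Proof (Turán). It is easily seen from Theorem 6.20 that

$$\sum_{n \le X} \Big( \sum_{p \le X} \frac{1}{p} - \log\log X \Big)^2 \ll X$$

and (generally if $Y \ge 1$ we have $\log Y \le 2Y^{1/2}$)

$$\sum_{2 \le n \le X} (\log\log X - \log\log n)^2 = \sum_{2 \le n \le X} \Big( \log \frac{\log X}{\log n} \Big)^2 \ll \sum_{2 \le n \le X} \frac{\log X}{\log n} \ll X.$$

For the last step, the terms with $n \le \sqrt{X}$ contribute $\ll \sqrt{X} \log X$, and each term with $n > \sqrt{X}$ is at most 2.

Thus it suffices to prove the second statement in the theorem. We have

$$\sum_{n \le X} \omega(n)^2 = \sum_{p_1 \le X} \sum_{\substack{p_2 \le X \\ p_2 \ne p_1}} \Big\lfloor \frac{X}{p_1 p_2} \Big\rfloor + \sum_{p \le X} \Big\lfloor \frac{X}{p} \Big\rfloor \le X (\log\log X)^2 + O(X \log\log X).$$

Hence

$$\sum_{n \le X} (\omega(n) - \log\log X)^2 \le 2X (\log\log X)^2 - 2 (\log\log X) \sum_{n \le X} \omega(n) + O(X \log\log X)$$

and by Theorem 6.21 this is $\ll X \log\log X$.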

One way of interpreting this theorem is to think of it probabilistically. It is saying that the events $p \mid n$ are approximately independent and occur with probability $\frac{1}{p}$. One might guess that the distribution is normal, and this indeed is true and was established by Erdős and Kac about 1941. Let

$$\Phi(a, b) = \lim_{x \to \infty} \frac{1}{x}\, \mathrm{card}\Big\{ n \le x : a < \frac{\omega(n) - \log\log n}{\sqrt{\log\log n}} \le b \Big\}.$$

Then

$$\Phi(a, b) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-t^2/2}\, dt.$$

The proof uses sieve theory, which we might explore later.


6.5. Orders of magnitude of arithmetical functions. It is sometimes useful to know something about the way that an arithmetical function grows. Multiplicative functions tend to oscillate quite a bit in size. For example d(p) = 2, but if we take n to be the product of the first k primes, say

$$n = \prod_{p \le X} p$$

for some large X, then

$$\log n = \vartheta(X)$$

so that

$$X \ll \log n \ll X$$

by Chebyshev and so

$$\log X \sim \log\log n,$$

but

$$d(n) = 2^{\pi(X)}$$

so that

$$\log d(n) = (\log 2)\, \pi(X) \ge (\log 2) \frac{\vartheta(X)}{\log X} \sim (\log 2) \frac{\log n}{\log\log n}.$$

Theorem 6.23. For every ε > 0 there are infinitely many n such that

$$d(n) > \exp\Big( \frac{(\log 2 - \varepsilon) \log n}{\log\log n} \Big).$$

The function d(n) also arises in comparisons, for example in deciding the convergence of certain important series. Thus it is useful to have a simple universal upper bound.

Theorem 6.24. Let ε > 0. Then there is a positive number C which depends at most on ε such that for every $n \in \mathbb{N}$ we have

$$d(n) < C n^{\varepsilon}.$$

Note, such a statement is often written as

$$d(n) = O_\varepsilon(n^\varepsilon)$$

or

$$d(n) \ll_\varepsilon n^\varepsilon.$$

There is also a more precise form of the theorem which states that

$$d(n) \ll_\varepsilon \exp\Big( \frac{(\log 2 + \varepsilon) \log n}{\log\log n} \Big)$$

and the proof we give could be adapted to prove this.

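Proof. Note that it suffices to prove the theorem when

$$\varepsilon \le \frac{1}{\log 2}.$$

Write $n = p_1^{k_1} \cdots p_r^{k_r}$ where the $p_j$ are distinct. Recall that

$$d(n) = (k_1 + 1) \cdots (k_r + 1).$$

Thus

$$\frac{d(n)}{n^\varepsilon} = \prod_{j=1}^r \frac{k_j + 1}{p_j^{\varepsilon k_j}}.$$

Since we are only interested in an upper bound the terms for which $p_j^\varepsilon > 2$ can be thrown away since $2^k \ge k + 1$. However there are only $\le 2^{1/\varepsilon}$ primes $p_j$ for which

$$p_j^\varepsilon \le 2.$$

Moreover for any such prime we have

$$p_j^{\varepsilon k_j} \ge 2^{\varepsilon k_j} = \exp(\varepsilon k_j \log 2) \ge 1 + \varepsilon k_j \log 2 \ge (k_j + 1)\, \varepsilon \log 2.$$

Thus

$$\frac{d(n)}{n^\varepsilon} \le \Big( \frac{1}{\varepsilon \log 2} \Big)^{2^{1/\varepsilon}}. \qquad (6.4)$$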
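Inequality (6.4) gives an explicit, if very generous, admissible constant. As a sanity check, the sketch below compares the largest value of d(n)/n^ε over a finite range with the right hand side of (6.4); the function names `bound_6_4` and `worst_ratio` are hypothetical, and d is computed by brute force, so the range is kept small.

```python
from math import log

def d(n):
    """Divisor function by brute force."""
    return sum(1 for m in range(1, n + 1) if n % m == 0)

def bound_6_4(eps):
    """Right hand side of (6.4), valid for 0 < eps <= 1/log 2."""
    return (1.0 / (eps * log(2))) ** (2 ** (1.0 / eps))

def worst_ratio(eps, limit):
    """Maximum of d(n) / n^eps over 1 <= n <= limit."""
    return max(d(n) / n ** eps for n in range(1, limit + 1))

for eps in (1.0, 0.5, 0.25):
    print(eps, worst_ratio(eps, 2000), bound_6_4(eps))
```

The gap between the two columns illustrates how much room the argument leaves, which is what the refinement that follows exploits.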

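The above can be refined to give a companion to Theorem 6.23.

Theorem 6.25. Let ε > 0. Then for every $n \ge 3$ we have

$$d(n) \ll_\varepsilon \exp\Big( \frac{(\log 2 + \varepsilon) \log n}{\log\log n} \Big).$$

Proof. We may suppose that n is larger than some function of ε. In (6.4) replace the ε of that inequality by

$$\frac{\log 2 + \frac{\varepsilon}{2}}{\log\log n}.$$

The $n^\varepsilon$ becomes

$$\exp\Big( \Big( \log 2 + \frac{\varepsilon}{2} \Big) \frac{\log n}{\log\log n} \Big),$$

and the right hand side becomes

$$\exp\Big( 2^{\frac{\log\log n}{\log 2 + \varepsilon/2}} \log \frac{\log\log n}{(\log 2 + \varepsilon/2) \log 2} \Big).$$

The first factor in the exponent is

$$(\log n)^{\left( 1 + \varepsilon/(2 \log 2) \right)^{-1}}.$$

Hence the above is

$$\ll \exp\Big( \frac{\varepsilon \log n}{2 \log\log n} \Big)$$

and the theorem follows.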
