
Algebraic Number Theory

Sam Mundy

Last Modified: 2/17/2014

Introduction

These notes give a complete introduction to the basic theory of algebraic numbers, as well as some fundamental topics. The basic theory covered in these notes is standard, including rings of integers, the theory of ramification, the finiteness of the class number and the unit theorem. The focus is on number fields. The local theory is treated, but not in any serious depth. After the unit theorem, we cover adeles and ideles and zeta functions.

After the basic theory, we explore many important topics, including class field theory and Tate's thesis. Also, I have included a brief treatment of function fields (the other kind of global field) because I find the basic literature lacking in terms of this subject (though see Rosen [11]).

Conventions and Prerequisites

The basic prerequisites for these notes are a standard year-long graduate course in abstract algebra and, for some of the later topics, a semester in complex analysis. For Part II, the reader will need other prerequisites, and each chapter in this part will explain what material is needed.

As for conventions, we use the standard notations Z, Q, and so on, and $F_q$ will denote the finite field with q elements. The notation $Z_p$ will not mean Z/pZ; instead, $Z_p$ will denote the p-adic integers, which will be introduced in these notes. Finally, all rings are commutative with identity, and 0 ∉ N.

Contents

I Basic Theory

1 Dedekind Domains and Number Fields
1.1 Integral Closure
1.2 Dedekind Domains
1.3 Number Fields and the Discriminant
1.4 Quadratic Fields
1.5 Cyclotomic Fields
1.6 Quadratic Gauss Sums and Quadratic Reciprocity
1.7 Elementary Proof of Quadratic Reciprocity

2 Prime Decomposition
2.1 Galois Theory and Prime Decomposition
2.2 Prime Decomposition in Dedekind Domains
2.3 The Norm of an Ideal
2.4 The Frobenius Automorphism

3 Local Theory
3.1 Absolute Values
3.2 Completions and the p-adic Numbers
3.3 Hensel's Lemma
3.4 Extensions of Local Fields
3.5 Absolute Values on Number Fields
3.6 Ramification in Local Fields
3.7 The Different and the Relative Discriminant

4 Finiteness of the Class Number and the Unit Theorem
4.1 The Embedding of a Number Field into n-Space
4.2 Minkowski's Theorem
4.3 The Proof of the Finiteness of the Class Number
4.4 Dirichlet's Unit Theorem
4.5 The Riemann-Roch Theorem For Number Fields

5 Adeles and Ideles
5.1 Definitions
5.2 Compactness Theorems
5.3 S-Units and the Recovery of the Unit Theorem and the Finiteness of $h_K$

6 Zeta Functions and L-Functions
6.1 Dirichlet Series and the Riemann Zeta Function
6.2 The Functional Equation for the Zeta Function
6.3 The Dedekind Zeta Function
6.4 The Class Number Formula
6.5 L-Functions and the Evaluation of the Class Number
6.6 Dirichlet's Theorem on Primes in Arithmetic Progressions
6.7 Densities

II Towards a More Advanced Theory

7 Class Field Theory
7.1 Global Class Field Theory
7.2 Local Class Field Theory
7.3 The Proofs I: Group Cohomology
7.4 The Proofs II: Local Class Field Theory
7.5 The Proofs III: Global Class Field Theory
7.6 The Proofs IV: The Chebotarev Density Theorem

8 Local Fields and Function Fields
8.1 Local Fields and Their Classification
8.2 The Arithmetic of Function Fields
8.3 Finiteness of the Class Number and Unit Theorem for Function Fields
8.4 The Zeta Function of a Function Field
8.5 The Analytic Continuation and Functional Equation of the Zeta Function
8.6 Overview: Adeles, Ideles, and Class Field Theory

9 Tate's Thesis
9.1 Abstract Harmonic Analysis
9.2 Analysis on Local Fields
9.3 Analysis on Adeles and Ideles
9.4 Local Zeta Functions
9.5 Tate's Riemann-Roch Theorem
9.6 Global Zeta Functions and Their Functional Equation
9.7 Another Proof of Arithmetic Serre Duality

Part I

Basic Theory

1 Dedekind Domains and Number Fields

The principal object of our study will be the ring of integers in an algebraic number field (Section 1.3). Their elements are characterized by certain polynomial relations, which are described in Section 1.1. However, we can make our approach more general; rings of integers are particular examples of rings called Dedekind domains. These are studied in Section 1.2, and then the theory is applied to rings of integers in Section 1.3. Afterwards, we describe and study two special kinds of number fields, namely quadratic fields and cyclotomic fields, and we examine their rings of integers. These examples are particularly important because they motivated the early development of the theory. As well, the cyclotomic fields are essentially the basis on which class field theory sits (though our treatment of this subject in Section 7 will not make use of this fact, called the Kronecker-Weber Theorem).

Sources

Sections 1.1, 1.2 and 1.6 derive their exposition from the text of Lang [5]. A reference for Sections 1.3 and 1.5 is the text of Marcus [6]. I first saw the argument of Section 1.4 in the notes of Milne [7]. I am indebted to the PROMYS program for first letting me experience the contents of Section 1.7. For a beautiful and much more thorough treatment of the ideas in Sections 1.6 and 1.7, see the text of Ireland and Rosen [4]. Lang's text [5] also contains a somewhat thorough, but brief, treatment of the basic theory of cyclotomic fields.

1.1 Integral Closure

The setup is this: A is an integral domain with fraction field K, and A (hence K) sits inside a field L.

Definition 1.1. An element x ∈ L is called integral over A if x satisfies

\[ x^n + a_{n-1}x^{n-1} + \cdots + a_0 = 0 \]

for some $a_i \in A$ and n ≥ 1. (Note that this polynomial is monic.) A subring B of L is integral over A if every element of B is integral over A.

Note that this implies, of course, that A itself is integral over A. Integrality is essentially an analog of the notion of algebraicity in field theory. In fact, it is a generalization, for if A = K, then x ∈ L being integral over K simply means that it is algebraic over K.

Now this definition is convenient for verifying the integrality of a concrete element, but it is awkward to use in theoretical arguments, so we prove its equivalence to another condition.

Proposition 1.2. Let x ∈ L. Then x is integral over A if and only if there exists a nonzero finitely generated A-submodule M ⊂ L such that xM ⊂ M.


Proof. Let x be integral, satisfying $x^n + a_{n-1}x^{n-1} + \cdots + a_0 = 0$, and consider A[x]. Then $1, x, \dots, x^{n-1}$ generate A[x] as an A-module (why?) and xA[x] ⊂ A[x].

Conversely, assume that xM ⊂ M for some nonzero finitely generated A-submodule M of L. Let $v_1, \dots, v_m$ be generators for M. Since $xv_i \in M$ for each i, we can write $xv_i = a_{i1}v_1 + \cdots + a_{im}v_m$ with the $a_{ij} \in A$. This means precisely that the matrix $a = [a_{ij}]$ (considered as a matrix over L) has $[v_j]$ as an eigenvector with eigenvalue x; this vector is nonzero because M ≠ 0. Hence, the characteristic polynomial of a, which is monic with coefficients in A, has x as a root. This exhibits x as integral over A.
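As an illustration of the converse direction (a sketch that is not part of the original notes), take A = Z, L = R, x = √2 + √3, and let M be the Z-module generated by 1, √2, √3, √6. The matrix `a` below records each product $xv_i$ in terms of these generators. The helper name `char_poly` is hypothetical; it is implemented here with the Faddeev-LeVerrier recursion, which is simply one convenient way to obtain the characteristic polynomial in exact arithmetic.

```python
from fractions import Fraction

def char_poly(a):
    """Coefficients (leading first) of det(t*I - a), via Faddeev-LeVerrier."""
    n = len(a)
    identity = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    coeffs = [Fraction(1)]
    m = identity
    for k in range(1, n + 1):
        # am = a * m, computed exactly with rational arithmetic
        am = [[sum(Fraction(a[i][l]) * m[l][j] for l in range(n))
               for j in range(n)] for i in range(n)]
        c = -sum(am[i][i] for i in range(n)) / k
        coeffs.append(c)
        m = [[am[i][j] + c * identity[i][j] for j in range(n)] for i in range(n)]
    return coeffs

# Row i expresses x * v_i in the generators v = (1, sqrt2, sqrt3, sqrt6),
# where x = sqrt2 + sqrt3.
a = [[0, 1, 1, 0],
     [2, 0, 0, 1],
     [3, 0, 0, 1],
     [0, 3, 2, 0]]

poly = char_poly(a)
x = 2 ** 0.5 + 3 ** 0.5
# Evaluate the polynomial at x numerically; the result should be close to 0.
value = sum(float(c) * x ** (len(poly) - 1 - i) for i, c in enumerate(poly))
print([int(c) for c in poly], abs(value) < 1e-9)
```

The coefficients come out as integers, so the resulting monic polynomial $t^4 - 10t^2 + 1$ exhibits √2 + √3 as integral over Z, exactly as the proof predicts.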

Corollary 1.3. Let B be a subring of L containing A, which is integral over A. If B is finitely generated as an A-algebra, then it is actually finitely generated as an A-module.

Proof. Let $x_1, \dots, x_n$ be generators for B as an A-algebra. By the proof of the proposition, $A[x_1]$ is finitely generated as an A-module. But clearly if R ⊂ S ⊂ T are any rings with S finitely generated as an R-module and T finitely generated as an S-module, then T is finitely generated as an R-module. So we may just apply this to the chain $A \subset A[x_1] \subset \cdots \subset A[x_1, \dots, x_n] = B$.

Corollary 1.4. The set B of all elements of L which are integral over A forms a ring.

Proof. Let x, y ∈ B and let M, N be nonzero finitely generated A-submodules of L for which xM ⊂ M and yN ⊂ N. Clearly MN is finitely generated as an A-module, and we have (x + y)MN ⊂ MN and xyMN ⊂ MN. Also, (−x)M ⊂ M.

The ring B in the above corollary is called the integral closure of A in L. If A = B, then we say that A is integrally closed in L. In particular, if A is integrally closed in K, its fraction field, then we say simply that A is integrally closed, or is an integrally closed domain. It is clear that integral closures, hence integrally closed domains, are integral domains, since they are subrings of fields.

Proposition 1.5. Let B, C be integral domains with A ⊂ B ⊂ C ⊂ L. Then C is integral over A if and only if C is integral over B and B is integral over A.

Proof. For the forward direction, the integrality of B over A is clear since B ⊂ C. Similarly the integrality of C over B is clear because any polynomial with coefficients in A is a polynomial with coefficients in B.

For the converse, let x ∈ C be integral over B, satisfying $x^n + b_{n-1}x^{n-1} + \cdots + b_0 = 0$ with the $b_i \in B$. We construct a nonzero finitely generated A-module M for which xM ⊂ M. Consider $A' = A[b_0, \dots, b_{n-1}]$, which is integral over A, and hence finitely generated as an A-module by Corollary 1.3. Then A′[x] is finitely generated as an A′-module (by $1, x, \dots, x^{n-1}$) and hence it is finitely generated as an A-module. Clearly xA′[x] ⊂ A′[x], so we may take M = A′[x].

When A ⊂ B, it is often convenient to speak of the integrality over A of an element x ∈ B where B is simply an integral domain with no reference to a field containing B. When we do this, we will mean that x is integral over A as an element of the field of fractions Frac(B). Note, however, that taking the field of fractions may not be compatible with integral closures, i.e. there may be elements of Frac(B) which are integral over A but which do not belong to B. (We will see for instance that $Z \subset Z[\sqrt{-3}]$ is an example by Proposition 1.31.)

We would now like to prove a theorem about the interaction between elements of L which are integral over A, and the field theory of the extension L/K. First we need some lemmas.

Lemma 1.6. If an element x ∈ L is algebraic over K, then there is a nonzero element a ∈ A such that ax is integral over A.

Proof. Let $a_nx^n + a_{n-1}x^{n-1} + \cdots + a_0 = 0$ with the $a_i$'s in A and $a_n \neq 0$. Such an equation exists since x is algebraic over K and K is the fraction field of A. Thus, multiplying by $a_n^{n-1}$, we get

\[ (a_nx)^n + a_{n-1}(a_nx)^{n-1} + \cdots + a_0a_n^{n-1} = 0. \]

So $a_nx$ is integral over A.
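The scaling trick in this proof is mechanical, and a short sketch may help. The function name `monic_for_scaled_root` is a hypothetical helper introduced only for illustration; it takes the coefficients $a_n, \dots, a_0$ of a relation for x and returns the coefficients of the monic relation satisfied by $a_nx$, using the identity $a_kx^k \cdot a_n^{n-1} = a_ka_n^{n-1-k}(a_nx)^k$.

```python
def monic_for_scaled_root(coeffs):
    """Given [a_n, ..., a_0] with a_n != 0 and a_n x^n + ... + a_0 = 0,
    return the monic coefficients of the polynomial satisfied by y = a_n * x."""
    a_n = coeffs[0]
    n = len(coeffs) - 1
    if a_n == 0 or n < 1:
        raise ValueError("need a nonzero leading coefficient and degree >= 1")
    # coeffs[n - k] is a_k; the coefficient of y^k becomes a_k * a_n^(n-1-k).
    return [1] + [coeffs[n - k] * a_n ** (n - 1 - k) for k in range(n - 1, -1, -1)]

# x = 1/sqrt(2) satisfies 2x^2 - 1 = 0, so y = 2x = sqrt(2) satisfies y^2 - 2 = 0.
scaled = monic_for_scaled_root([2, 0, -1])
y = 2 * (1 / 2 ** 0.5)
residual = y ** 2 + scaled[1] * y + scaled[2]
print(scaled, abs(residual) < 1e-12)
```

Here A = Z and the element 1/√2 is algebraic but not integral over Z, while its multiple by the leading coefficient 2 is integral.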

Lemma 1.7. Let A ⊂ B be integral domains, and let σ : B → σ(B) be a homomorphism of rings. If x ∈ B is integral over A, then σ(x) is integral over σ(A).

Proof. If $x^n + a_{n-1}x^{n-1} + \cdots + a_0 = 0$ for some $a_i \in A$, then $\sigma(x)^n + \sigma(a_{n-1})\sigma(x)^{n-1} + \cdots + \sigma(a_0) = 0$, and this suffices.

Corollary 1.8. Let A, K, L be as usual, and assume L/K is finite and separable. Assume x ∈ L is integral over A. Then $\mathrm{Nm}_{L/K}(x)$ and $\mathrm{Tr}_{L/K}(x)$ are integral over A.

Proof. We work in the Galois closure M of L over K. In M, we know that σ(x) is integral over A for all K-homomorphisms σ of L into M (Lemma 1.7). The corollary follows because $\mathrm{Tr}_{L/K}(x)$ is the sum of these σ(x), and $\mathrm{Nm}_{L/K}(x)$ is the product.
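For a concrete feel, here is a sketch with A = Z, K = Q and $L = Q(\sqrt{d})$ (quadratic fields are only treated in Section 1.4, so this is an assumption made for illustration). The two embeddings send √d to ±√d, so the trace and norm of a + b√d are 2a and $a^2 - db^2$. The function names are hypothetical. Since an element of Q that is integral over Z lies in Z (Corollary 1.16), the corollary gives a necessary test for integrality.

```python
from fractions import Fraction

def trace_and_norm(a, b, d):
    """Trace and norm of a + b*sqrt(d) from Q(sqrt(d)) down to Q.
    The two embeddings send sqrt(d) to +sqrt(d) and -sqrt(d)."""
    a, b = Fraction(a), Fraction(b)
    return 2 * a, a * a - d * b * b

def passes_integrality_test(a, b, d):
    """Necessary condition from Corollary 1.8 with A = Z:
    trace and norm must both be ordinary integers."""
    t, n = trace_and_norm(a, b, d)
    return t.denominator == 1 and n.denominator == 1

half = Fraction(1, 2)
golden = trace_and_norm(half, half, 5)         # (1 + sqrt 5)/2
print(golden, passes_integrality_test(half, half, 5))
print(trace_and_norm(half, half, 3), passes_integrality_test(half, half, 3))
```

The element (1 + √5)/2 passes, while (1 + √3)/2 has norm −1/2 and therefore cannot be integral over Z. In the quadratic case the test is also sufficient, since x is a root of $t^2 - \mathrm{Tr}(x)t + \mathrm{Nm}(x)$.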

From Dedekind's independence of characters, we know that if L/K is a finite separable extension, then $\mathrm{Tr}_{L/K}$ is non-degenerate, i.e. $\mathrm{Tr}_{L/K}(x)$ is not zero for some x ∈ L. Let α ∈ L be non-zero. Since the trace is K-linear, the map $x \mapsto \mathrm{Tr}_{L/K}(\alpha x)$ therefore defines a non-zero element of the dual space $\mathrm{Hom}_K(L, K)$ of L over K. Letting α vary, we get a K-linear map of L into the dual space whose kernel must be trivial. Since L and its dual space have the same dimension over K, this map is an isomorphism. Thus, given a basis $w_1, \dots, w_n$ of L over K, we may speak of the dual basis $w'_1, \dots, w'_n$ with respect to this isomorphism. It is characterized by the relations $\mathrm{Tr}_{L/K}(w_iw'_j) = \delta_{ij}$.

Theorem 1.9. Let A be Noetherian and integrally closed, K the field of fractions of A, and L/K a finite separable extension. Then the integral closure B of A in L is finitely generated as an A-module.

Proof. We will show that B is contained in a finitely generated A-module, which will prove the theorem since A is Noetherian. We construct this module as follows. Let $w_1, \dots, w_n$ be a basis of L as a K-vector space. Let $w'_1, \dots, w'_n$ be the dual basis as described above. By Lemma 1.6, there is a nonzero c ∈ A such that each $cw'_i$ is integral over A (take the product of the elements that the lemma provides for the individual $w'_i$). The module we will use is $M = Ac^{-1}w_1 + \cdots + Ac^{-1}w_n$, which is obviously finitely generated.

Now let x ∈ B. Write $x = c_1w_1 + \cdots + c_nw_n$ with the $c_i \in K$. Then since x and the $cw'_i$ are all integral over A, so is $\mathrm{Tr}_{L/K}(cw'_ix) = cc_i$ (Corollary 1.8). Since A is integrally closed, $cc_i \in A$. This implies immediately, by the definition of M, that x ∈ M. So B ⊂ M, as desired.


Our main application of this theorem is the next one.

Theorem 1.10. Let A be a principal ideal domain and K its field of fractions. Let L be a finite separable extension of K of degree n. If B is the integral closure of A in L, then B is a free A-module of rank n.

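Proof. By Theorem 1.9, B is finitely generated as an A-module (a principal ideal domain is Noetherian, and it is integrally closed by Proposition 1.15 below). It is also torsion-free because it is an integral domain. So it is free, by the structure theorem for finitely generated modules over a principal ideal domain.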

Let r be the rank of B over A. Since L has a basis of elements which are integral over A (see the proof of the previous theorem) we know that r ≥ n (why does this follow?). But in the proof of the theorem, we showed that B is contained in an A-module generated by n elements. Hence actually r = n.

This theorem will be very useful to us in Section 1.3, where it will be immediately applied to rings of integers in number fields. We will hold off on this, however, in order to first discuss a more general topic in the next section.

We now wish to discuss how integral closure interacts with prime ideal theory. We shall have occasion to localize in order to apply Nakayama's lemma (Exercise 1.2 if you cannot recall this result), so we need to know that localization preserves integrality.

Proposition 1.11. Let A ⊂ B be integral domains and assume B is integral over A. Let S ⊂ A be a multiplicative subset. Then $S^{-1}B$ is integral over $S^{-1}A$. Furthermore, if A is integrally closed, then so is $S^{-1}A$.

Proof. For the first assertion, let x ∈ B and let M be a nonzero finitely generated A-submodule of the fraction field of B such that xM ⊂ M. Then $S^{-1}M$ is a finitely generated $S^{-1}A$-submodule of the fraction field of B (or, equivalently, of $S^{-1}B$). If s ∈ S, then clearly $(x/s)S^{-1}M \subset S^{-1}M$. This shows the first assertion.

For the second, we need to show that any element in the integral closure of $S^{-1}A$ (in Frac(A)) is again in $S^{-1}A$. So let x be in the integral closure of $S^{-1}A$. Then we know there is an equation

\[ x^n + \frac{a_{n-1}}{s_{n-1}}x^{n-1} + \cdots + \frac{a_0}{s_0} = 0 \]

where the $a_i \in A$ and the $s_i \in S$. Clearing denominators, we get an equation with coefficients in A, and so the same trick as in the proof of Lemma 1.6 shows that sx is integral over A for some s ∈ S. Therefore, sx ∈ A and so $x \in S^{-1}A$.

We now proceed to the prime ideal theory. An extremely important concept in the study of algebraic number theory is the following idea.

Definition 1.12. Let A ⊂ B be rings, and p ⊂ A and P ⊂ B be prime ideals. Then P is said to lie over p if P ∩ A = p.

If A, B, p, P are as in the definition, then clearly the injection of A into B induces an injection A/p → B/P as follows. We have a map A → B → B/P whose kernel consists of those elements of A which lie in P, i.e. the kernel is p. Thus this map factors through A/p and the map A/p → B/P is injective. To summarize, the following diagram is commutative:

\[
\begin{array}{ccc}
A & \hookrightarrow & B \\
\downarrow & & \downarrow \\
A/p & \hookrightarrow & B/P
\end{array}
\]

This situation will occur again and again in our studies.

Theorem 1.13. Let A ⊂ B be integral domains with B integral over A, and let p ⊂ A be a prime ideal. Then pB ≠ B, and there is a prime ideal P ⊂ B lying over p.

Proof. The idea is to reduce to a local problem. So let m be the maximal ideal of the localization $A_p$ so that $m = pA_p$. We know $B_p$ is integral over $A_p$ by Proposition 1.11. It is easy to see that $pB_p = mB_p$. Therefore, if $mB_p \neq B_p$, then we must have pB ≠ B as well. Thus we only need to prove that pB ≠ B when A is local and p is its maximal ideal.

We assume this now, and also assume, looking for a contradiction, that actually pB = B. Then we can write 1 in terms of elements of pB like

\[ 1 = a_1b_1 + \cdots + a_nb_n \]

where the $a_i \in p$ and the $b_i \in B$. Let $A' = A[b_1, \dots, b_n]$. Since the $b_i$ are integral over A, Corollary 1.3 says that A′ is finitely generated as an A-module. As well, the elements $a_ib_i$ are in the ideal pA′, so 1 ∈ pA′, so pA′ = A′. Then Nakayama's lemma (Exercise 1.2) implies immediately that A′ = 0, which is the desired contradiction. So pB ≠ B.

Now we return to our original hypotheses. We need to show that there is a prime of B lying over p. We localize again, so that $mB_p \neq B_p$, where m is the maximal ideal in $A_p$. This implies that $mB_p$ is inside a maximal ideal M of $B_p$.

We have first

\[ M \cap B_p \cap A_p = M \cap A_p. \]

Since 1 ∉ M, we have $1 \notin M \cap A_p$ and so $M \cap A_p$ is proper. Therefore it is contained inside of m. But $m \subset M \cap A_p$ by construction, so the two are equal. Combining this with the equation above gives

\[ M \cap B_p \cap A_p \cap A = m \cap A = p. \]

Now we have the equalities $B_p \cap B \cap A = A = B_p \cap A_p \cap A$. This gives

\[ p = M \cap B_p \cap B \cap A = M \cap B \cap A. \]

But the ideal M ∩ B is prime, equal to P, say. Hence P ∩ A = p. This exhibits the desired ideal.
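A small computational sketch of "lying over", under assumptions not established at this point in the notes: take A = Z, B = Z[i] (integral over Z because $i^2 + 1 = 0$), and P = (2 + i), which is assumed here to be a prime ideal. The hypothetical helper `in_ideal` tests membership in a principal ideal of Z[i] by exact division, and the loop computes which ordinary integers lie in P.

```python
def in_ideal(z, g):
    """Is the Gaussian integer z = (a, b), meaning a + b*i, a multiple of g = (c, d)?"""
    a, b = z
    c, d = g
    norm = c * c + d * d
    # z / g = z * conj(g) / norm; membership means both coordinates are integers.
    re, im = a * c + b * d, b * c - a * d
    return re % norm == 0 and im % norm == 0

g = (2, 1)  # generator of P = (2 + i)
# Ordinary integers in a window that lie in P, i.e. a sample of P ∩ Z.
contraction = [n for n in range(-20, 21) if in_ideal((n, 0), g)]
print(contraction)
```

Within the sampled window the integers in P are exactly the multiples of 5, consistent with P ∩ Z = 5Z, i.e. with P lying over the prime (5) of Z.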

Proposition 1.14. Let A ⊂ B be integral domains with B integral over A. Assume P is a prime of B lying over a prime p in A. Then P is maximal if and only if p is.


Proof. This is actually just field theory. By applying Lemma 1.7 with σ the quotient map B → B/P, we see that B/P is integral over A/p (this is where we use that P lies over p). Maximality of p (respectively P) is equivalent to A/p (respectively B/P) being a field. So assume A/p is a field. Then integrality of an element x in B/P over A/p means algebraicity. By basic field theory, (A/p)[x] is still a field, so x is invertible if it is nonzero. Hence B/P is a field.

Conversely, if B/P is a field and A/p is not, then A/p has a nonzero maximal ideal, and so B/P has a prime which lies over it (Theorem 1.13). But since fields have no nonzero proper ideals, this is a contradiction. So A/p is a field.

We conclude with a proposition which gives us a lot of examples of integrally closed domains.

Proposition 1.15. All unique factorization domains are integrally closed.

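Proof. Let A be a unique factorization domain and K its field of fractions. Assume a/b ∈ K is fully reduced and integral over A so that

\[ \left(\frac{a}{b}\right)^n + \alpha_{n-1}\left(\frac{a}{b}\right)^{n-1} + \cdots + \alpha_0 = 0 \]

with the $\alpha_i \in A$. Multiplying by $b^n$ and rearranging, we get

\[ b^n\alpha_0 = -\left(a^n + \alpha_{n-1}ba^{n-1} + \cdots + \alpha_1b^{n-1}a\right). \]

Now b divides the left-hand side and every term on the right-hand side other than $a^n$. Hence $b \mid a^n$, and every irreducible factor of b must divide a. Since a/b was fully reduced, this implies that b has no irreducible factors, and so is a unit. Hence a/b ∈ A and A is integrally closed.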

Corollary 1.16. Z is integrally closed.
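Corollary 1.16 is the familiar rational root theorem in disguise: a rational root of a monic polynomial with integer coefficients must be an integer. The following sketch (the helper names `divisors` and `rational_roots` are hypothetical) enumerates the candidate roots ±p/q with p dividing the constant term and q dividing the leading coefficient, and keeps those that are actual roots.

```python
from fractions import Fraction

def divisors(m):
    """Positive divisors of a nonzero integer m."""
    m = abs(m)
    return [k for k in range(1, m + 1) if m % k == 0]

def rational_roots(coeffs):
    """All rational roots of the integer polynomial [a_n, ..., a_0],
    assuming a_n and a_0 are both nonzero."""
    a_n, a_0 = coeffs[0], coeffs[-1]
    candidates = {Fraction(s * p, q)
                  for p in divisors(a_0) for q in divisors(a_n) for s in (1, -1)}

    def evaluate(x):
        result = Fraction(0)
        for c in coeffs:              # Horner's rule
            result = result * x + c
        return result

    return sorted(r for r in candidates if evaluate(r) == 0)

monic_roots = rational_roots([1, 0, -7, 6])    # x^3 - 7x + 6
non_monic_roots = rational_roots([2, -3, 1])   # 2x^2 - 3x + 1
print(monic_roots, non_monic_roots)
```

In the monic example every rational root is an integer, as the corollary requires. The non-monic example has the root 1/2, which shows that the monic hypothesis in Definition 1.1 cannot be dropped.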

1.2 Dedekind Domains

In this section, we define Dedekind domains and study their most basic properties. Rings of integers in number fields are a particular example of Dedekind domains, and we will study some of their properties through this more general context.

Definition 1.17. An integral domain A which is Noetherian, integrally closed, and in which every non-zero prime ideal is maximal (i.e. the Krull dimension is at most 1), is called a Dedekind domain.

The first main point about Dedekind domains is that while they are not necessarily unique factorization domains, they actually have an analogue of unique prime factorization. In order to state and prove this, we need to make the following definition.
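To see concretely how unique factorization of elements can fail, consider the ring $Z[\sqrt{-5}]$. That this ring is a Dedekind domain is assumed here; it is the ring of integers of $Q(\sqrt{-5})$, a fact belonging to Sections 1.3 and 1.4. The sketch below uses the multiplicative norm $a^2 + 5b^2$ of $a + b\sqrt{-5}$. The function names are hypothetical.

```python
def norm(a, b):
    """Norm of a + b*sqrt(-5)."""
    return a * a + 5 * b * b

def mul(u, v):
    """Product in Z[sqrt(-5)], with (a, b) standing for a + b*sqrt(-5)."""
    (a, b), (c, d) = u, v
    return (a * c - 5 * b * d, a * d + b * c)

# No element has norm 2 or 3; it suffices to search a^2 + 5 b^2 <= 3.
small_norms = {norm(a, b) for a in range(-2, 3) for b in range(-1, 2)}
no_norm_2_or_3 = 2 not in small_norms and 3 not in small_norms

# Two factorizations of 6, into elements of norms 4, 9 and of norms 6, 6.
assert mul((2, 0), (3, 0)) == (6, 0)
assert mul((1, 1), (1, -1)) == (6, 0)
print(no_norm_2_or_3)
```

Since the norm is multiplicative and the only elements of norm 1 are ±1, a proper factor of 2, 3 or $1 \pm \sqrt{-5}$ would have norm 2 or 3. No such element exists, so all four are irreducible and $6 = 2 \cdot 3 = (1 + \sqrt{-5})(1 - \sqrt{-5})$ are two genuinely different factorizations.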

Definition 1.18. Let A be an integral domain. An A-submodule a of Frac(A) is called a fractional ideal if there is a nonzero element x ∈ A such that xa ⊂ A.

Remark. In terms of algebraic geometry, the spectrum of a Dedekind domain is a curve, and so the divisor theory is well-behaved. In the next two theorems we will actually prove that the nonzero fractional ideals form a group under the obvious multiplication, together with a fact that is tantamount to saying that the divisor group is isomorphic in a natural way to the group of fractional ideals.


Fractional ideals in Dedekind domains are finitely generated: If a is a fractional ideal in a Dedekind domain A, then xa ⊂ A for some nonzero x, and so xa is an ideal in A. But A is Noetherian, so xa is finitely generated. Since xa is isomorphic to a as an A-module, a is also finitely generated. Conversely, any finitely generated A-submodule of K = Frac(A) is a fractional ideal. To see this, simply clear the denominators of a set of generators.

If we have two fractional ideals a, b ⊂ K = Frac(A), we may define their product to be the set

\[ ab = \left\{ \sum a_ib_i \;\middle|\; a_i \in a,\ b_i \in b \right\}. \]

This is obviously an A-submodule of K. Let x, y ∈ A be nonzero and such that xa ⊂ A and yb ⊂ A. Then obviously xyab ⊂ A. So ab is a fractional ideal. We also define

\[ a^{-1} = \{x \in K \mid xa \subset A\}. \]

This is a fractional ideal when a ≠ 0 because $aa^{-1} \subset A$, so if x is a nonzero element of a ∩ A, then $xa^{-1} \subset A$. It is clearly an A-module.

Denote the set of all nonzero fractional ideals of A by J(A).
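A worked instance of these definitions, again under the assumption (not yet justified at this point) that $A = Z[\sqrt{-5}]$ is a Dedekind domain: let $p = (2, 1 + \sqrt{-5})$. The sketch below checks by direct computation that pp = (2). The product ideal is generated by the pairwise products of the generators, so it suffices to show that each product lies in (2) and that 2 is a combination of those products.

```python
def mul(u, v):
    """Product in Z[sqrt(-5)], with (a, b) standing for a + b*sqrt(-5)."""
    (a, b), (c, d) = u, v
    return (a * c - 5 * b * d, a * d + b * c)

gens = [(2, 0), (1, 1)]                      # p = (2, 1 + sqrt(-5))
products = [mul(g, h) for g in gens for h in gens]

# p*p is contained in (2): every generator of p*p has even coordinates.
contained_in_2 = all(a % 2 == 0 and b % 2 == 0 for a, b in products)

# (2) is contained in p*p:  2 = 2*(1 + sqrt(-5)) - (1 + sqrt(-5))^2 - 2*2.
p12 = mul(gens[0], gens[1])
p22 = mul(gens[1], gens[1])
p11 = mul(gens[0], gens[0])
two = (p12[0] - p22[0] - p11[0], p12[1] - p22[1] - p11[1])
print(contained_in_2, two)
```

From pp = (2) one gets $p \cdot \tfrac{1}{2}p = A$, so the fractional ideal $\tfrac{1}{2}p$ acts as an inverse of p, which is the kind of statement the next theorem establishes in general.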

Theorem 1.19. Let A be a Dedekind domain. Then J(A) is a group under the multiplication described above, with identity A. The inverse of a ∈ J(A) is given by the fractional ideal $a^{-1}$, also described above.

Proof. We proceed in five steps.

(i) We claim that any nonzero ideal in A contains a product of nonzero prime ideals. So assume not. Since A is Noetherian, the set S of nonzero ideals in A which do not contain products of nonzero primes has a maximal element (apply Zorn's lemma; every chain in S is bounded above by Noetherianity). Call the maximal element a. The idea is to find ideals $a_1, a_2$ properly containing a such that $a_1a_2 \subset a$. Since a is maximal in S, the $a_i$'s will contain a product of primes, and hence so will their product, a contradiction.

To construct the $a_i$'s, first note that a is not prime, for otherwise it would contain itself. This means that there are $b_1, b_2 \in A$ with $b_1, b_2 \notin a$ but $b_1b_2 \in a$. We let $a_i = (a, b_i)$. The $a_i$'s have the desired property, which proves the original claim.

(ii) Next we claim that if p is a nonzero maximal ideal, then $pp^{-1} = A$. Clearly $A \subset p^{-1}$, and we claim this containment is strict by constructing an element in $p^{-1} \setminus A$ as follows. Let a ∈ p be nonzero. Let $p_1 \cdots p_r \subset (a)$ be a product of primes as in (i) with r minimal. Since p is prime and contains this product, one of the $p_i$'s, say $p_1$, is contained inside p. So it is equal to p since nonzero primes are maximal in A.

Now consider $p_2 \cdots p_r$. This is not contained in (a) because r was minimal, and so we can pick an element b ∉ (a) with $b \in p_2 \cdots p_r$. Then bp ⊂ (a), so $ba^{-1}p \subset A$ and $ba^{-1} \in p^{-1}$. However b ∉ (a) = aA so $ba^{-1} \notin A$, so we have found our desired element.

We can now proceed with (ii). Since $p = pA \subset pp^{-1} \subset A$ and p is maximal, we have either $pp^{-1} = A$ or $pp^{-1} = p$. If the latter were true, then since p is a finitely generated A-module, Proposition 1.2 would tell us that the elements of $p^{-1}$ are integral over A. Since A is integrally closed, $p^{-1}$ would be contained in A, which is a contradiction. Hence $pp^{-1} = A$.

(iii) We now claim that for any nonzero ideal a, there is a fractional ideal b such that ab = A (i.e. all nonzero ideals are invertible). If there were a non-invertible nonzero ideal in A, then we could find one which is maximal with this property. Call it a. We have A ≠ a, so a is contained in a maximal ideal p. By (ii), a ≠ p. Clearly $p^{-1} \subset a^{-1}$ and

\[ a \subset ap^{-1} \subset aa^{-1} \subset A. \]

Similarly as in (ii), $ap^{-1} \neq a$, and hence $ap^{-1}$ is an ideal which properly contains a. Thus $ap^{-1}$ is invertible, by the maximality of a. Then $A = (ap^{-1})(ap^{-1})^{-1} = a(p^{-1}(ap^{-1})^{-1})$, and we have an inverse for a. This is a contradiction, so every nonzero ideal is invertible.

(iv) We claim that if a is an ideal and c is a fractional ideal with ac = A, then actually $c = a^{-1}$. Clearly we have $c \subset a^{-1}$ by the definition of $a^{-1}$. On the other hand, let $x \in a^{-1}$. Then xa ⊂ A so that xac ⊂ c. Since xac = xA, this shows that x ∈ c, proving our claim.

(v) Finally, we claim that every nonzero fractional ideal is invertible. It follows easily from this that the nonzero fractional ideals form a group as described. Let a be a nonzero fractional ideal, and let x ∈ A be nonzero and such that xa ⊂ A. Then by (iii) there is a fractional ideal b with xab = A. Hence xb provides an inverse to a.

Let a, b be nonzero fractional ideals of a Dedekind domain A. We will say that a divides b, and write a|b, if a ⊃ b. This is the case if and only if $A \supset a^{-1}b$. By the group structure of J(A), $c = a^{-1}b$ is the unique fractional ideal such that ac = b. Since c is then actually an ideal of A, we see that a|b if and only if there is an ideal c with ac = b, hence the terminology.

Theorem 1.20. Every nonzero ideal of a Dedekind domain A factors uniquely into prime ideals. Hence J(A) is free-abelian on the nonzero primes of A.

Proof. The proof proceeds in much the same way as the proof that Z is a UFD, and the second claim follows easily from the first (in the same way $Q^\times/\{\pm 1\}$ is free-abelian on the primes of Z).

We claim first that every nonzero ideal a has a factorization into prime ideals (the ideal A itself being the empty product). Assume this is not the case, so that there is an ideal without a factorization into primes, and let a be maximal with this property. Then obviously a is not prime, and so a is contained in and not equal to some maximal ideal p. Then p|a, and $ap^{-1} \mid a$ (since $ap^{-1}p = a$). Hence $ap^{-1}$ has a prime factorization because it properly contains a (that $ap^{-1} \neq a$ follows as in step (iii) of the proof of Theorem 1.19). Thus $ap^{-1}p = a$ has a factorization, a contradiction. Hence the claim is proved.

Now note that if a, b are ideals in A and if p is prime, then by the definition of prime ideal, if p|ab then p|a or p|b. With this in mind, we proceed to the uniqueness of the prime factorization of an ideal a.

First, we know that because every nonzero prime is maximal, no nonzero prime ideal properly divides another. Hence the factorization of a prime ideal into prime ideals is unique. Now we proceed by induction. Assume $a = p_1 \cdots p_r$ and $a = q_1 \cdots q_s$ are two prime factorizations of the ideal a. If r or s were 1, then we would be done. Without loss of generality, assume r ≤ s and assume that all ideals with a factorization into r − 1 primes have a unique factorization. We will show that, up to reordering, r = s and $p_i = q_i$. Since $p_1 \mid a$, we know $p_1 \mid q_1 \cdots q_s$. Hence $p_1 \mid q_i$ for some i, say i = 1, and so $p_1 = q_1$. So divide by $p_1$. Then $p_2 \cdots p_r = q_2 \cdots q_s$. Hence r = s and, after reordering, $p_i = q_i$ for i ≥ 2, by the induction hypothesis. But we knew $p_1 = q_1$ already, so we are done.

We conclude with an interesting property of Dedekind domains, which provides for them a converse to the statement PID ⇒ UFD.


Theorem 1.21. Let A be a Dedekind domain. If A is a unique factorization domain, then actually A is a principal ideal domain.

Proof. By basic ring theory, we know that if A is a UFD and π ∈ A is irreducible, then the ideal (π) is prime. So let p ⊂ A be a nonzero prime ideal, and let α ∈ p be nonzero. Because p is prime, one of the irreducible factors of α, say π, is in p. Thus (π) ⊂ p, so (π) = p because every nonzero prime is maximal. By the fact that any nonzero ideal factors into prime ideals in A, every ideal is actually principal.

Definition 1.22. A fractional ideal in an integral domain A is called principal if it is of the form αA for some α ∈ K = Frac(A).

If A is a Dedekind domain, it is easy to show that the nonzero principal fractional ideals form a subgroup P(A) of J(A). The quotient J(A)/P(A) is called the ideal class group of A. It is easy to show that A is a principal ideal domain if and only if the ideal class group is trivial. By Theorem 1.21, the order of the ideal class group gives some measure of how close a Dedekind domain is to having unique factorization. We will prove later that for rings of integers in number fields, the ideal class group is always finite.

1.3 Number Fields and the Discriminant

We now give the main definitions of these notes.

Definition 1.23. A number field is a finite extension field of the field Q. The ring of integers in a number field K, denoted $\mathcal{O}_K$, is the integral closure of Z in K.

From the general theory we have developed and the fact that Z is a principal ideal domain (in particular integrally closed), $\mathcal{O}_K$ is a free Z-module of rank [K : Q] (Theorem 1.10). Therefore $\mathcal{O}_K$ is Noetherian, because it is Noetherian as a Z-module. It is integrally closed by definition and Proposition 1.5. If $p \subset \mathcal{O}_K$ is a nonzero prime ideal, then it is easy to show that p ∩ Z is prime. Moreover p ∩ Z is nonzero because $\mathrm{Nm}_{K/Q}(\alpha) \in p \cap Z$ for any nonzero α ∈ p (Corollary 1.8). Hence p ∩ Z is maximal, and therefore so is p (Proposition 1.14). Therefore $\mathcal{O}_K$ is a Dedekind domain.

By the group structure of the fractional ideal group $J(\mathcal{O}_K)$, any nonzero ideal in $\mathcal{O}_K$ can be multiplied by another ideal to make it principal. Therefore, every nonzero ideal lies between $\mathcal{O}_K$ and a nonzero principal ideal, and so all nonzero ideals are of rank [K : Q] as abelian groups. (In particular, every residue ring $\mathcal{O}_K/a$ with a ≠ 0 is torsion and finitely generated, hence finite.) Finally, note that if L/K is an extension of number fields, then of course $\mathcal{O}_L$ contains $\mathcal{O}_K$ and, in fact, $\mathcal{O}_L$ is the integral closure of $\mathcal{O}_K$ in L by Proposition 1.5. We will apply the prime ideal theory of Section 1.1 in Chapter 2.

We fix throughout this section a number field K whose degree over Q is n. We define an operation on n-tuples of K which will give rise to a very important invariant of K.

Definition 1.24. For an n-tuple $\alpha_1, \dots, \alpha_n \in K$, define the discriminant of $\alpha_1, \dots, \alpha_n$ to be

\[ \mathrm{Disc}(\alpha_1, \dots, \alpha_n) = (\det[\sigma_i(\alpha_j)])^2 \]

where the $\sigma_i$ vary over the n embeddings of K into a fixed algebraic closure.


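Proposition 1.25. Let $\alpha_1, \dots, \alpha_n \in K$. Then $\mathrm{Disc}(\alpha_1, \dots, \alpha_n) = \det[\mathrm{Tr}(\alpha_i\alpha_j)]$. In particular, $\mathrm{Disc}(\alpha_1, \dots, \alpha_n) \in Q$ and, if $\alpha_1, \dots, \alpha_n \in \mathcal{O}_K$, then $\mathrm{Disc}(\alpha_1, \dots, \alpha_n) \in Z$.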

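Proof. Let $M = [\sigma_i(\alpha_j)]$, so that its transpose is $M^T = [\sigma_j(\alpha_i)]$. We have

\[ M^TM = [\sigma_1(\alpha_i\alpha_j) + \cdots + \sigma_n(\alpha_i\alpha_j)] = [\mathrm{Tr}(\alpha_i\alpha_j)], \]

and taking determinants gives the formula. The remaining assertions follow because the trace takes values in Q, and in Z on elements of $\mathcal{O}_K$ (Corollaries 1.8 and 1.16).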
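As a sanity check of this identity, here is a sketch in the quadratic field $Q(\sqrt{5})$ (an assumption for illustration; quadratic fields are treated in Section 1.4). Elements a + b√5 are stored as pairs of rationals, the two embeddings are the identity and conjugation, and the discriminant is computed both from Definition 1.24 and from the trace matrix. All function names are hypothetical.

```python
from fractions import Fraction

def mul(u, v, d):
    """Product of a + b*sqrt(d) and c + e*sqrt(d), as coordinate pairs."""
    (a, b), (c, e) = u, v
    return (a * c + d * b * e, a * e + b * c)

def conj(u):
    return (u[0], -u[1])

def disc_by_embeddings(x, y, d):
    # det [[x, y], [conj x, conj y]] = x*conj(y) - y*conj(x); then square it.
    p, q = mul(x, conj(y), d), mul(y, conj(x), d)
    det = (p[0] - q[0], p[1] - q[1])
    square = mul(det, det, d)
    assert square[1] == 0          # the squared determinant is rational
    return square[0]

def disc_by_traces(x, y, d):
    tr = lambda u: 2 * u[0]        # trace of a + b*sqrt(d) is 2a
    return tr(mul(x, x, d)) * tr(mul(y, y, d)) - tr(mul(x, y, d)) ** 2

one = (Fraction(1), Fraction(0))
sqrt5 = (Fraction(0), Fraction(1))
omega = (Fraction(1, 2), Fraction(1, 2))   # (1 + sqrt 5)/2

print(disc_by_embeddings(one, sqrt5, 5), disc_by_traces(one, sqrt5, 5))
print(disc_by_embeddings(one, omega, 5), disc_by_traces(one, omega, 5))
```

Both methods agree: the pair (1, √5) has discriminant 20 and the pair (1, (1 + √5)/2) has discriminant 5.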

Proposition 1.26. Let $\alpha_1, \dots, \alpha_n \in K$. Then $\mathrm{Disc}(\alpha_1, \dots, \alpha_n) = 0$ if and only if $\alpha_1, \dots, \alpha_n$ are Q-linearly dependent.

Proof. If $\alpha_1, \dots, \alpha_n$ are Q-linearly dependent, say $a_1\alpha_1 + \cdots + a_n\alpha_n = 0$ with the $a_j \in Q$ not all zero, then $a_1\sigma_i(\alpha_1) + \cdots + a_n\sigma_i(\alpha_n) = 0$ as well, for all homomorphisms $\sigma_i$ of K into a fixed algebraic closure. Hence the columns of the matrix $[\sigma_i(\alpha_j)]$ are Q-linearly dependent. Thus the discriminant is zero.

Conversely, if $\mathrm{Disc}(\alpha_1, \dots, \alpha_n) = 0$, then the rows of the matrix $[\mathrm{Tr}(\alpha_i\alpha_j)]$ are linearly dependent (over Q since the entries of these rows are in Q). Assume the $\alpha_i$ were Q-linearly independent. Then there are $a_i \in Q$, not all zero, such that $\beta = a_1\alpha_1 + \cdots + a_n\alpha_n$ satisfies $\mathrm{Tr}(\beta\alpha_j) = 0$ for all j, and β ≠ 0 by independence. But the $\beta\alpha_j$ are Q-linearly independent for j = 1, . . . , n because the $\alpha_j$ are, so they form a Q-basis of K and the trace vanishes on all of K. This is a contradiction because the trace is non-degenerate.

Proposition 1.27. Let α1, . . . , αn and β1, . . . , βn be integral bases, i.e. bases for OK as a free abelian group. Then

Disc(α1, . . . , αn) = Disc(β1, . . . , βn)

This quantity is denoted ∆K and is called the discriminant of K.

Note that ∆K ∈ Z.

Proof. Let M be the change of basis matrix from α1, . . . , αn to β1, . . . , βn. Then M has integer entries, and $(\det M)^2 \operatorname{Disc}(\alpha_1, \dots, \alpha_n) = \operatorname{Disc}(\beta_1, \dots, \beta_n)$ by definition. Thus Disc(α1, . . . , αn) divides Disc(β1, . . . , βn) in Z, by a positive number. Similarly Disc(β1, . . . , βn) divides Disc(α1, . . . , αn) in Z by a positive number, so the two are equal.

Actually, a stronger result can be obtained, and the previous result actually follows as a corollary:

Proposition 1.28. Let α1, . . . , αn be as in the previous proposition, but only assume now that β1, . . . , βn ∈ OK are linearly independent over Q. Let N be the Z-module that β1, . . . , βn span. Then

[OK : N ]2 Disc(α1, . . . , αn) = Disc(β1, . . . , βn).

Proof. Let M be the matrix bringing α1, . . . , αn to β1, . . . , βn. Then

(detM)2 Disc(α1, . . . , αn) = Disc(β1, . . . , βn).

But |det M| is the index of N in OK by the general theory of finitely generated abelian groups (details left to the reader).


The following result can be useful in detecting integral bases.

Corollary 1.29. (a) Elements α1, . . . , αn ∈ OK form an integral basis if and only if Disc(α1, . . . , αn) = ∆K.

(b) If β1, . . . , βn ∈ OK have Disc(β1, . . . , βn) square-free, then β1, . . . , βn form an integral basis.

Proof. (a) is obvious. (b) follows from the formula

[OK : N ]2 Disc(α1, . . . , αn) = Disc(β1, . . . , βn),

either side of which can only be square-free if [OK : N ] = 1.

1.4 Quadratic Fields

By a quadratic field we mean an extension of Q of the form Q(√d) with d ≠ 1 a square-free integer. We will quickly compute the ring of integers and discriminant of any such field. This can actually be done by computing traces, norms and minimal polynomials of certain elements of Q(√d): if the minimal polynomial of an element has integer coefficients, then it is an algebraic integer, and if the trace or the norm of an element is not an integer, then it is not an algebraic integer. However, we prefer to prove a general theorem about discriminants and then use it to deduce the desired results.

Theorem 1.30 (Stickelberger). Let K be any number field. Then ∆K ≡ 0, 1 (mod 4).

Proof. Let σ1, . . . , σn be the homomorphisms of K into a fixed algebraic closure and let α1, . . . , αn be an integral basis for OK. Recall how the determinant of any n × n matrix can be written as a sum over the symmetric group Sn. For the particular matrix [σi(αj)], this expansion is

\[ \det[\sigma_i(\alpha_j)] = \sum_{\tau \in S_n} \operatorname{sign}(\tau) \prod_{i=1}^{n} \sigma_i(\alpha_{\tau(i)}). \]

Let P be the part of this sum which is over the even permutations and −N the part which is over the odd ones. Then

\[ \Delta_K = (P - N)^2 = (P + N)^2 - 4PN. \]

An application of one of the σk's (extended to the Galois closure of K) to a term in the sum above permutes the σi, and so has the effect of multiplying the permutations τ by some fixed permutation τ0, i.e.

\[ \sigma_k\Bigl(\prod_{i=1}^{n} \sigma_i(\alpha_{\tau(i)})\Bigr) = \prod_{i=1}^{n} \sigma_i(\alpha_{\tau_0\tau(i)}). \]

Thus, σk either fixes both P and N, or switches them, depending on whether the corresponding permutation τ0 is even or odd. In either case, σk fixes both P + N and PN, and hence P + N and PN are rational. But they are algebraic integers, being polynomials in the σi(αj), so they are in Z. The formula $\Delta_K = (P + N)^2 - 4PN$ immediately implies the theorem, since a square is congruent to 0 or 1 modulo 4.


Proposition 1.31. Let K = Q(√d) be a quadratic field. Then

(a) If d ≡ 2, 3 (mod 4), then OK = Z[√d] and ∆K = 4d;

(b) If d ≡ 1 (mod 4), then $O_K = \mathbf{Z}\bigl[\tfrac{1+\sqrt{d}}{2}\bigr]$ and ∆K = d.

Proof. (a) It is easy to see that Disc(1, √d) = 4d for any quadratic field. Hence 4d is a square multiple of ∆K by Proposition 1.28. So ∆K is either 4d or d since d is square-free, and Stickelberger's theorem rules out d because d ≡ 2, 3 (mod 4). Thus ∆K = 4d, and therefore 1, √d forms an integral basis for OK. This proves that OK = Z[√d].

(b) The element $\tfrac{1+\sqrt{d}}{2}$ is an algebraic integer because its minimal polynomial is $x^2 - x - \tfrac{d-1}{4}$, which has integer coefficients when d ≡ 1 (mod 4). The discriminant of $1, \tfrac{1+\sqrt{d}}{2}$ is d, so ∆K = d and $O_K = \mathbf{Z}\bigl[\tfrac{1+\sqrt{d}}{2}\bigr]$ because d is square-free (Corollary 1.29).
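As a sanity check on Proposition 1.31, the following is a minimal Python sketch that computes Disc(α1, α2) = det[Tr(αiαj)] (Proposition 1.25) with exact rational arithmetic. The representation of a + b√d as a pair (a, b), and the function names `mul`, `trace`, `disc` and `field_disc`, are assumptions made for this illustration only.

```python
from fractions import Fraction

def mul(x, y, d):
    """Multiply a + b*sqrt(d) and c + e*sqrt(d), stored as pairs (a, b)."""
    a, b = x
    c, e = y
    return (a * c + b * e * d, a * e + b * c)

def trace(x):
    """The trace from Q(sqrt(d)) to Q of a + b*sqrt(d) is 2a."""
    return 2 * x[0]

def disc(basis, d):
    """Disc(alpha_1, alpha_2) = det[Tr(alpha_i * alpha_j)] for a pair of elements."""
    t = [[trace(mul(u, v, d)) for v in basis] for u in basis]
    return t[0][0] * t[1][1] - t[0][1] * t[1][0]

def field_disc(d):
    """Discriminant of Q(sqrt(d)) for square-free d != 1, using the
    integral basis given in Proposition 1.31."""
    one = (Fraction(1), Fraction(0))
    if d % 4 == 1:
        omega = (Fraction(1, 2), Fraction(1, 2))  # (1 + sqrt(d)) / 2
    else:
        omega = (Fraction(0), Fraction(1))        # sqrt(d)
    return disc([one, omega], d)
```

For instance, `field_disc(5)` uses the basis 1, (1 + √5)/2, while `disc` applied to the pair 1, √5 gives 20 = 2² · 5, in line with Proposition 1.28 (the index of Z[√5] in the ring of integers is 2).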

1.5 Cyclotomic Fields

Let ζm denote a primitive mth root of unity, i.e. a root of the polynomial $x^m - 1$ over Q which is not a root of $x^n - 1$ for any positive n < m. Much like the previous section, the main goal here is to compute the ring of integers in the number field Q(ζm). However, the degree of the extension Q(ζm)/Q is not as easy to compute as in the case of a quadratic field (where it is trivial, of course). The degree of this extension is actually ϕ(m) where ϕ denotes the Euler function, i.e. the number of positive integers n ≤ m with (n, m) = 1 or, equivalently, the number of invertible elements in the ring Z/mZ. This fact is sometimes proved in a course on field theory, but we will provide a proof here.

Lemma 1.32. Let ζ be a primitive mth root of unity (i.e. it is not an nth root of unity for any n < m) and let K = Q(ζ). Then there is a homomorphism Gal(K/Q) → (Z/mZ)× given by $(\zeta \mapsto \zeta^a) \mapsto a \pmod{m}$ which is, moreover, injective.

Proof. First note that K is Galois since it is the splitting field of $x^m - 1$. Since the roots of $x^m - 1$ are $\zeta^i$ for 0 ≤ i ≤ m − 1, we know that any element of G = Gal(K/Q) maps ζ to $\zeta^i$ for some i. Looking at the orders of ζ and $\zeta^i$ for this i, we see that i must be relatively prime to m. So the map described is well defined and clearly a homomorphism. Since any automorphism of K is determined by its action on ζ, we find that the map is injective, as desired.

We would like to show that the map in the lemma is an isomorphism. It will be enough to show that Gal(K/Q) has order ϕ(m), or equivalently, that [K : Q] = ϕ(m). This will be accomplished by showing that ζ has ϕ(m) conjugates.

Theorem 1.33. Let ζ be a primitive mth root of unity and let K = Q(ζ). Then [K : Q] = ϕ(m) and Gal(K/Q) ≅ (Z/mZ)×.

Proof. For a relatively prime to m, we would like to show that ζ and $\zeta^a$ have the same minimal polynomial. As noted above, this is enough to prove the theorem. We prove this when a = p is a prime not dividing m. Since a is a product of primes necessarily relatively prime to m, and since $\zeta^p$ is again a primitive mth root of unity, this will suffice.

So assume ζ and $\zeta^p$ have different minimal polynomials, say f, g respectively, and write $x^m - 1 = f(x)g(x)h(x)$ for some monic polynomial h(x) ∈ Q[x]. Then actually f(x), g(x), h(x) ∈ Z[x] by Gauss' lemma.

We consider the polynomials $\bar f(x), \bar g(x), \bar h(x) \in \mathbf{F}_p[x]$ which are the reductions modulo p of the polynomials f, g, h. Then $x^m - 1 = \bar f(x)\bar g(x)\bar h(x)$ in $\mathbf{F}_p[x]$. But p does not divide m, so $x^m - 1$ is separable over $\mathbf{F}_p$, as it and its derivative $m x^{m-1}$ clearly share no roots. Hence $\bar f$ and $\bar g$ are relatively prime in $\mathbf{F}_p[x]$.

Now ζ is a root of $g(x^p)$ by definition, so $f(x) \mid g(x^p)$. Reducing modulo p clearly implies $\bar f(x) \mid \bar g(x^p) = (\bar g(x))^p$, the equality by the Freshman's Dream. This contradicts the fact that $\bar f$ and $\bar g$ are relatively prime, proving the theorem.
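The degree count in Theorem 1.33 can be illustrated by a short Python sketch. It computes the monic integer polynomial whose roots are exactly the primitive mth roots of unity (usually called the mth cyclotomic polynomial, a name not used in these notes) by dividing $x^m - 1$ by the corresponding polynomials for the proper divisors of m, and compares its degree with ϕ(m). The function names are hypothetical, and the code checks only the degree, not the irreducibility that the theorem actually proves.

```python
from functools import lru_cache
from math import gcd

def euler_phi(m):
    """Number of integers 1 <= k <= m with gcd(k, m) = 1."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)

def exact_divide(num, den):
    """Exact division of integer polynomials (coefficients listed from the
    constant term upward); den must be monic and must divide num."""
    num = list(num)
    quot = [0] * (len(num) - len(den) + 1)
    for i in range(len(quot) - 1, -1, -1):
        quot[i] = num[i + len(den) - 1]
        for j, c in enumerate(den):
            num[i + j] -= quot[i] * c
    assert not any(num), "division was not exact"
    return quot

@lru_cache(maxsize=None)
def primitive_root_poly(m):
    """Polynomial whose roots are the primitive m-th roots of unity."""
    poly = [-1] + [0] * (m - 1) + [1]          # x^m - 1
    for d in range(1, m):
        if m % d == 0:                          # remove roots of smaller order
            poly = exact_divide(poly, list(primitive_root_poly(d)))
    return tuple(poly)
```

For example, `primitive_root_poly(6)` returns the coefficients of $x^2 - x + 1$, whose degree 2 equals ϕ(6).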

In the sequel ζm will denote a primitive mth root of unity. The rest of this section will be spent proving the following theorem.

Theorem 1.34. Let K = Q(ζm). Then OK = Z[ζm].

Discriminants will play a big role. Our strategy is to prove this for prime powers, and then attach these results together. First, some general results on discriminants.

Proposition 1.35. Let K = Q(α) for some algebraic number α, and let σ1, . . . , σn be the embeddings of K into a fixed algebraic closure (so [K : Q] = n). Let f be the minimal polynomial of α over Q. Then

\[ \operatorname{Disc}(1, \alpha, \dots, \alpha^{n-1}) = \prod_{1 \le i < j \le n} (\sigma_i(\alpha) - \sigma_j(\alpha))^2 = \pm \operatorname{Nm}_{K/\mathbf{Q}}(f'(\alpha)) \]

with the + sign if and only if n ≡ 0, 1 (mod 4).

Proof. The first equality follows from the fact that the discriminant being considered here is the square of a Vandermonde determinant. We recall the result:

\[ \det[a_i^{\,j-1}] = \prod_{1 \le i < j \le n} (a_j - a_i), \]

if the ai's are elements of some commutative ring.

Now to prove the second equality, recall that $f'(\alpha) = \prod_{\sigma_i \ne \mathrm{id}} (\alpha - \sigma_i(\alpha))$. Taking the product of all $\sigma_i(f'(\alpha))$ then gives us the result, with the sign $(-1)^{n(n-1)/2}$ coming from rearranging terms $(\sigma_i(\alpha) - \sigma_j(\alpha))$ so that i < j.

For K a number field of degree n and α ∈ K, we denote the quantity $\operatorname{Disc}(1, \alpha, \dots, \alpha^{n-1})$ simply by Disc(α).

Proposition 1.36. Let K be a number field of degree n over Q, and let α1, . . . , αn ∈ OK be linearly independent over Q. Set d = Disc(α1, . . . , αn). Then every α ∈ OK can be written in the form

\[ \frac{m_1}{d}\alpha_1 + \cdots + \frac{m_n}{d}\alpha_n \]

where the mi are integers with $d \mid m_i^2$.


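Proof. Let α ∈ OK. Write α = x1α1 + · · · + xnαn, where xi ∈ Q. Applying the embeddings gives the linear system σj(α) = x1σj(α1) + · · · + xnσj(αn) for j = 1, . . . , n. We can solve for the xi's using Cramer's rule. This gives $x_i = \gamma_i/\delta$, where $\delta = \det[\sigma_j(\alpha_i)]$ and γi is the determinant of this matrix with the ith row replaced by the row [σj(α)]. By definition, $\delta^2 = d$, and γi, δ are algebraic integers. But $dx_i = \delta\gamma_i$, which is both rational and an algebraic integer, so dxi ∈ Z. Hence setting mi = dxi gives the proposition as long as $d \mid m_i^2$. But $m_i^2 = d^2x_i^2 = d^2\gamma_i^2/\delta^2 = d\gamma_i^2$, and $\gamma_i^2 = m_i^2/d$ is rational and an algebraic integer, hence an integer.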

Lemma 1.37. Over Q(ζm), $\operatorname{Disc}(\zeta_m) \mid m^{\varphi(m)}$ in Z.

Proof. Let f(x) ∈ Q[x] be the minimal polynomial of ζm (actually f(x) ∈ Z[x] by Gauss' Lemma). Then $x^m - 1 = f(x)g(x)$ with g(x) ∈ Z[x]. Taking derivatives at ζm gives $m\zeta_m^{m-1} = f'(\zeta_m)g(\zeta_m)$, so $m = \zeta_m f'(\zeta_m) g(\zeta_m)$. Taking norms now gives

\[ m^{\varphi(m)} = \pm \operatorname{Nm}(\zeta_m g(\zeta_m)) \operatorname{Disc}(\zeta_m). \]

Since ζm and g(ζm) are integral, Nm(ζm g(ζm)) ∈ Z. This proves the lemma.

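Lemma 1.38. Over Q(ζm), Disc(ζm) = Disc(1 − ζm).

Proof. We have (σi(ζm) − σj(ζm)) = −(σi(1 − ζm) − σj(1 − ζm)), giving

\begin{align*}
\operatorname{Disc}(\zeta_m) &= \prod_{1 \le i < j \le \varphi(m)} (\sigma_i(\zeta_m) - \sigma_j(\zeta_m))^2 \\
&= (-1)^{\varphi(m)(\varphi(m)-1)} \prod_{1 \le i < j \le \varphi(m)} (\sigma_i(1 - \zeta_m) - \sigma_j(1 - \zeta_m))^2 \\
&= (-1)^{\varphi(m)(\varphi(m)-1)} \operatorname{Disc}(1 - \zeta_m).
\end{align*}

But $(-1)^{\varphi(m)(\varphi(m)-1)} = 1$ because ϕ(m)(ϕ(m) − 1) is even, which proves the lemma.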

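Lemma 1.39. Let $m = p^r$ be a prime power. Then

\[ \prod_{\substack{1 \le i \le m \\ p \nmid i}} (1 - \zeta_m^i) = p. \]

Proof. Let

\[ f(x) = \frac{x^{p^r} - 1}{x^{p^{r-1}} - 1} = 1 + x^{p^{r-1}} + x^{2p^{r-1}} + \cdots + x^{(p-1)p^{r-1}}. \]

Then the $\zeta_m^i$ are roots of the numerator of f for all i with p ∤ i, while they are not roots of the denominator. Since there are $\varphi(p^r) = (p-1)p^{r-1} = \deg f$ of them,

\[ f(x) = \prod_{\substack{1 \le i \le m \\ p \nmid i}} (x - \zeta_m^i). \]

Now just note that f(1) = p.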
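Lemma 1.39 is easy to check numerically. The sketch below, written in Python with floating-point complex numbers (so equality holds only up to rounding error), multiplies out the factors $1 - \zeta_m^i$ for $m = p^r$ and p ∤ i. The function name `product_one_minus_zeta` is an assumption made for this illustration.

```python
import cmath

def product_one_minus_zeta(p, r):
    """Numerically evaluate the product of (1 - zeta^i) over 1 <= i <= m
    with p not dividing i, where m = p**r and zeta = exp(2*pi*i/m)."""
    m = p ** r
    zeta = cmath.exp(2j * cmath.pi / m)
    prod = 1
    for i in range(1, m + 1):
        if i % p != 0:
            prod *= 1 - zeta ** i
    return prod
```

The returned value is a complex number whose real part is close to p and whose imaginary part is close to 0; for example `product_one_minus_zeta(3, 2)` is approximately 3.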

Proposition 1.40. Theorem 1.34 holds for m = pr, a prime power.


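Proof. Since Z[ζm] = Z[1 − ζm], it suffices to show that $O_{\mathbf{Q}(\zeta_m)} = \mathbf{Z}[1 - \zeta_m]$. Let n = ϕ(m) and d = Disc(1 − ζm). By Lemmas 1.37 and 1.38, d is, up to sign, a power of p. By Proposition 1.36, every $\alpha \in O_{\mathbf{Q}(\zeta_m)}$ has the form

\[ \frac{x_0}{d} + \frac{x_1}{d}(1 - \zeta_m) + \cdots + \frac{x_{n-1}}{d}(1 - \zeta_m)^{n-1} \]

with the xi ∈ Z and $d \mid x_i^2$ for all i.

Now assume $O_{\mathbf{Q}(\zeta_m)}$ were actually bigger than Z[1 − ζm]. Then there is an $\alpha \in O_{\mathbf{Q}(\zeta_m)}$ which has the form above, but d ∤ xi for some i. Let $p^e$ be the smallest power of p occurring exactly in one of the xi's, and choose j to be the first index for which $p^e$ divides xj but $p^{e+1}$ does not. Since d ∤ xi for some i, but $p^e \mid x_i$ for all i, and both d and $p^e$ are (up to sign) powers of p, it follows that $p^{e+1} \mid d$. Then, multiplying α by $d/p^{e+1}$ and subtracting off the terms before the jth term, which lie in Z[1 − ζm], we obtain an element $\beta \in O_{\mathbf{Q}(\zeta_m)}$ of the form

\[ \beta = \frac{y_j}{p}(1 - \zeta_m)^j + \cdots + \frac{y_{n-1}}{p}(1 - \zeta_m)^{n-1}. \]

By construction, p ∤ yj and the yi's are integers.

Now since (1 − ζm) divides $(1 - \zeta_m^i)$ in Z[ζm] for all i, Lemma 1.39 shows that $p/(1 - \zeta_m)^n \in \mathbf{Z}[\zeta_m]$. Therefore $p/(1 - \zeta_m)^{j+1} \in \mathbf{Z}[\zeta_m]$ (as j < n) and so, multiplying β by this element,

\[ \frac{y_j}{1 - \zeta_m} + y_{j+1} + \cdots + y_{n-1}(1 - \zeta_m)^{n-j-2} \in O_{\mathbf{Q}(\zeta_m)}. \]

Therefore, subtracting off the (j + 1)th and higher terms, which are obviously in $O_{\mathbf{Q}(\zeta_m)}$, we get that $y_j/(1 - \zeta_m) \in O_{\mathbf{Q}(\zeta_m)}$. Hence its norm is an integer. But $\operatorname{Nm}(y_j/(1 - \zeta_m)) = y_j^n/\operatorname{Nm}(1 - \zeta_m)$, and Nm(1 − ζm) = p by Lemma 1.39. Thus $p \mid y_j^n$, and so p | yj, a contradiction. We conclude that $O_{\mathbf{Q}(\zeta_m)} = \mathbf{Z}[1 - \zeta_m] = \mathbf{Z}[\zeta_m]$.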

The next proposition is a general one which will allow us to go from the case $m = p^r$ to general m.

Proposition 1.41. Let K, L be number fields of degrees n, m respectively, and assume [KL : Q] = nm. Let d = gcd(∆K, ∆L). Then $O_{KL} \subset \frac{1}{d} O_K O_L$. In particular, if d = 1, then $O_{KL} = O_K O_L$.

Proof. Let α1, . . . , αn be an integral basis for OK and β1, . . . , βm an integral basis for OL. Then clearly {αiβj} is a basis for OKOL as a free abelian group, and also a basis for KL over Q. Let $\alpha \in O_{KL}$. We may write α as

\[ \alpha = \sum_{i,j} \frac{m_{ij}}{r}\,\alpha_i\beta_j \]

where the mij and r are integers which are all together relatively prime. We want to show r | d. To do this, we show r | ∆K and the same arguments will show r | ∆L.

A standard fact from field theory states that every embedding of K and of L into a fixed algebraic closure extends uniquely to an embedding of KL which gives the original embeddings as restrictions. Hence any embedding σ of K extends to one (also call it σ) of KL which gives the identity on L. We therefore have

\[ \sigma(\alpha) = \sum_{i,j} \frac{m_{ij}}{r}\,\sigma(\alpha_i)\beta_j \]

for all embeddings σ of K. Let

\[ x_i = \sum_{j=1}^{m} \frac{m_{ij}}{r}\,\beta_j \]

so that

\[ \sum_{i=1}^{n} \sigma(\alpha_i)x_i = \sigma(\alpha). \]

We may now employ the techniques of the proof of Proposition 1.36. We solve for xi using Cramer's rule, just like in that proposition, and obtain $x_i = \gamma_i/\delta$ where the γi and δ are algebraic integers, and $\delta^2 = \Delta_K$. Thus $\Delta_K x_i = \delta\gamma_i$ is an algebraic integer. But

\[ \Delta_K x_i = \sum_{j=1}^{m} \frac{\Delta_K m_{ij}}{r}\,\beta_j \in L, \]

so in fact ∆Kxi ∈ OL. Thus, since the βj's form an integral basis of OL, the numbers $\Delta_K m_{ij}/r$ are all integers. Since r is relatively prime to the mij taken together, we must have r | ∆K.

We can now finish the proof of Theorem 1.34. Let m, n be relatively prime positive integers. A standard result from elementary number theory (but also an immediate consequence of the Chinese Remainder Theorem) says that ϕ(mn) = ϕ(m)ϕ(n). Thus [Q(ζm)Q(ζn) : Q] = [Q(ζm) : Q][Q(ζn) : Q]. Clearly also Q(ζm)Q(ζn) = Q(ζmn). Finally, the discriminants of Q(ζm) and Q(ζn) are relatively prime by Lemma 1.37 (together with Proposition 1.28). Hence, by the previous proposition, $O_{\mathbf{Q}(\zeta_m)\mathbf{Q}(\zeta_n)} = \mathbf{Z}[\zeta_m]\mathbf{Z}[\zeta_n]$ for any m, n for which Theorem 1.34 holds. Clearly Z[ζm]Z[ζn] = Z[ζmn]. Thus, since we know Theorem 1.34 for powers of primes, we know it for all m.

1.6 Quadratic Gauss Sums and Quadratic Reciprocity

We want to look now at the diophantine equation

\[ x^2 - a = 0 \]

modulo primes p. We define a classical symbol which denotes the existence or non-existence of a solution to this equation. Although originally introduced more or less as a notational convenience, this symbol is now understood to reflect deep arithmetic information.

Definition 1.42. Let p be prime. For a ∈ Z, the Legendre symbol $\left(\frac{a}{p}\right)$ is defined to be 1 if p ∤ a and $x^2 - a = 0$ has a solution modulo p; it is defined to be −1 if p ∤ a and there is no solution to $x^2 - a = 0$ modulo p; and it is zero if p | a. In the case $\left(\frac{a}{p}\right) = 1$, a is called a quadratic residue modulo p. If $\left(\frac{a}{p}\right) = -1$, then a is called a quadratic non-residue modulo p.

Clearly $\left(\frac{a}{p}\right)$ is a well defined function of a modulo p. In fact:

Proposition 1.43. The Legendre symbol is completely multiplicative, i.e. $\left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right)$ for all a, b ∈ Z and primes p. Hence it induces a homomorphism

\[ \left(\frac{\cdot}{p}\right) : (\mathbf{Z}/p\mathbf{Z})^{\times} \to \{\pm 1\}. \]


Proof. The statement that $\left(\frac{a}{p}\right) = 1$ is the same as saying that the class of a modulo p is in $((\mathbf{Z}/p\mathbf{Z})^{\times})^2$. In view of the cyclicity of the group $(\mathbf{Z}/p\mathbf{Z})^{\times}$, and its even order (unless p = 2, in which case the proposition is trivial), the proposition is clear.

One sees from this proof that, for p odd, there are exactly as many quadratic residues as non-residues in $(\mathbf{Z}/p\mathbf{Z})^{\times}$.

There are only two homomorphisms from a cyclic group with even order to the group with two elements, namely the trivial homomorphism and the one sending a generator to −1. Thus the Legendre symbol is this non-trivial homomorphism. But this homomorphism can also be described as sending the class of an integer a modulo p to the class of $a^{(p-1)/2}$, for if g generates $(\mathbf{Z}/p\mathbf{Z})^{\times}$, then $g^{(p-1)/2}$ is the class of −1 modulo p.
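The two descriptions of the Legendre symbol given above (existence of a square root modulo p, and the class of $a^{(p-1)/2}$) can be compared directly. The following Python sketch does this for an odd prime p; the function names are assumptions made for this illustration, and the brute-force search is only meant for small p.

```python
def legendre_by_definition(a, p):
    """Legendre symbol via Definition 1.42: search for x with x^2 = a mod p."""
    a %= p
    if a == 0:
        return 0
    return 1 if any((x * x - a) % p == 0 for x in range(1, p)) else -1

def legendre_by_power(a, p):
    """Legendre symbol for an odd prime p as the class of a^((p-1)/2) mod p."""
    t = pow(a, (p - 1) // 2, p)   # t is 0, 1 or p - 1
    return -1 if t == p - 1 else t
```

Both functions agree for every integer a when p is an odd prime, and multiplicativity (Proposition 1.43) can be checked on the same range of inputs.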

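We desire to prove the following famous theorem.

Theorem 1.44 (Quadratic Reciprocity). Let p, q be distinct odd primes. Then we have the following formulas:

\[ \left(\frac{-1}{p}\right) = (-1)^{\frac{p-1}{2}}; \qquad \left(\frac{2}{p}\right) = (-1)^{\frac{p^2-1}{8}}; \qquad \left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2}\frac{q-1}{2}}. \]

The first formula follows at once from the description of the homomorphism $\left(\frac{\cdot}{p}\right)$ we gave above.

To prove the second formula, let ζ be a primitive 8th root of unity. Then $(\zeta + \zeta^{-1})^2 = 2$. Thus, modulo pZ[ζ], we have $(\zeta + \zeta^{-1})^{p-1} = 2^{(p-1)/2} \equiv \left(\frac{2}{p}\right)$. On the other hand, again modulo pZ[ζ], we have $(\zeta + \zeta^{-1})^p \equiv \zeta^p + \zeta^{-p}$, which is $(\zeta + \zeta^{-1})$ when p ≡ ±1 (mod 8) and is $-(\zeta + \zeta^{-1})$ if p ≡ ±3 (mod 8). That is, $(\zeta + \zeta^{-1})^p \equiv (\zeta + \zeta^{-1})(-1)^{(p^2-1)/8}$. Since $\zeta + \zeta^{-1}$ is invertible modulo p (its square is 2), we get, modulo pZ[ζ],

\[ (-1)^{(p^2-1)/8} \equiv (\zeta + \zeta^{-1})^{p-1} \equiv \left(\frac{2}{p}\right). \]

Both sides are ±1 and p is odd, so this congruence is an equality.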

For the general case, we need a lemma.

Lemma 1.45. Let p be an odd prime and consider the quadratic Gauss sum

\[ S = \sum_{a=1}^{p-1} \left(\frac{a}{p}\right)\zeta_p^a \in \mathbf{Z}[\zeta_p]. \]

Then $S^2 = (-1)^{(p-1)/2}p$.

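Proof. We have

\[ S^2 = \sum_{a,b} \left(\frac{ab}{p}\right)\zeta_p^{a+b}. \]

But for each fixed b, as a runs through the non-zero residue classes modulo p, so does ab. Replacing a by ab, we get

\begin{align*}
S^2 &= \sum_{a,b} \left(\frac{ab^2}{p}\right)\zeta_p^{b(a+1)} = \sum_{a,b} \left(\frac{a}{p}\right)\zeta_p^{b(a+1)} \\
&= \sum_{b} \left(\frac{-1}{p}\right)\zeta_p^{0} + \sum_{a \ne -1} \left(\frac{a}{p}\right) \sum_{b} \zeta_p^{b(a+1)}.
\end{align*}

Then since $1 + \zeta_p + \cdots + \zeta_p^{p-1} = 0$, we have

\[ \sum_{b} \zeta_p^{b(a+1)} = -1 \]

for a ≠ −1. Hence

\begin{align*}
S^2 &= (p-1)\left(\frac{-1}{p}\right) - \sum_{a \ne -1} \left(\frac{a}{p}\right) \\
&= p\left(\frac{-1}{p}\right) - \sum_{a} \left(\frac{a}{p}\right) = p\left(\frac{-1}{p}\right),
\end{align*}

the last line because there are as many quadratic residues as non-residues.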
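The identity $S^2 = (-1)^{(p-1)/2}p$ of Lemma 1.45 can be checked numerically for small primes. The sketch below is written in Python with floating-point complex arithmetic, so the comparison is only up to rounding error; the helper names `legendre` and `gauss_sum` are assumptions made for this illustration.

```python
import cmath

def legendre(a, p):
    """Legendre symbol for an odd prime p, via the class of a^((p-1)/2)."""
    t = pow(a, (p - 1) // 2, p)
    return -1 if t == p - 1 else t

def gauss_sum(p):
    """Numerical value of S = sum_{a=1}^{p-1} (a/p) * zeta_p^a."""
    zeta = cmath.exp(2j * cmath.pi / p)
    return sum(legendre(a, p) * zeta ** a for a in range(1, p))
```

For p = 5 the square of `gauss_sum(5)` is close to 5, while for p = 7 the square of `gauss_sum(7)` is close to −7, matching the sign $(-1)^{(p-1)/2}$.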

Incidentally, we have the following corollary.

Corollary 1.46. The field Q(√p) is contained in Q(ζp) or Q(ζp, i), depending on the congruence class of p modulo 4.

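Now we complete the proof of quadratic reciprocity. Let p, q be distinct odd primes and let S denote the quadratic Gauss sum in terms of ζp. We have

\[ S^q = S(S^2)^{\frac{q-1}{2}} = S(-1)^{\frac{p-1}{2}\frac{q-1}{2}} p^{\frac{q-1}{2}} \equiv S(-1)^{\frac{p-1}{2}\frac{q-1}{2}} \left(\frac{p}{q}\right) \pmod{q\mathbf{Z}[\zeta_p]}. \]

But, by the Freshman's Dream modulo q,

\[ S^q \equiv \sum_{a=1}^{p-1} \left(\frac{a}{p}\right)\zeta_p^{aq} \equiv \left(\frac{q}{p}\right) \sum_{a=1}^{p-1} \left(\frac{aq}{p}\right)\zeta_p^{aq} \equiv \left(\frac{q}{p}\right) S \pmod{q\mathbf{Z}[\zeta_p]}. \]

Thus

\[ S(-1)^{\frac{p-1}{2}\frac{q-1}{2}} \left(\frac{p}{q}\right) \equiv \left(\frac{q}{p}\right) S \pmod{q\mathbf{Z}[\zeta_p]}. \]

We then multiply by S and cancel $S^2 = (-1)^{(p-1)/2}p$, which is invertible modulo q, to obtain the theorem.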
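Theorem 1.44 lends itself to a brute-force check over small primes. The following Python sketch verifies the three formulas for all odd primes below a bound; the helper names (`legendre`, `is_prime`, `supplements_hold`, `reciprocity_holds`) are assumptions made for this illustration, and the check is of course no substitute for the proof.

```python
def legendre(a, p):
    """Legendre symbol for an odd prime p, via the class of a^((p-1)/2)."""
    t = pow(a, (p - 1) // 2, p)
    return -1 if t == p - 1 else t

def is_prime(n):
    """Trial division; adequate for the small range used here."""
    if n < 2:
        return False
    return all(n % k for k in range(2, int(n ** 0.5) + 1))

def supplements_hold(p):
    """Check the formulas for (-1/p) and (2/p) at an odd prime p."""
    return (legendre(-1, p) == (-1) ** ((p - 1) // 2)
            and legendre(2, p) == (-1) ** ((p * p - 1) // 8))

def reciprocity_holds(p, q):
    """Check (p/q)(q/p) = (-1)^(((p-1)/2)((q-1)/2)) for distinct odd primes."""
    sign = (-1) ** (((p - 1) // 2) * ((q - 1) // 2))
    return legendre(p, q) * legendre(q, p) == sign
```

A typical use is to loop over `odd_primes = [n for n in range(3, 200) if is_prime(n)]` and confirm that `supplements_hold(p)` and `reciprocity_holds(p, q)` return `True` for every prime and every pair of distinct primes in the list.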


1.7 Elementary Proof of Quadratic Reciprocity

Gauss is known to have discovered eight proofs of quadratic reciprocity during his lifetime. This is particularly remarkable since he was the first to prove this theorem since Euler conjectured it about sixty years before him. As well, Gauss discovered his first proof of quadratic reciprocity at the age of nineteen. This proof and his second were published in his famous Disquisitiones Arithmeticae, which also contained many other remarkable new ideas in arithmetic.

In this section we present Gauss's third proof. It is both strikingly elementary and also geometric. We begin with a lemma.

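Lemma 1.47 (Gauss). Let p be an odd prime and consider the set

\[ S = \left\{ -\frac{p-1}{2}, -\frac{p-3}{2}, \dots, \frac{p-1}{2} \right\}. \]

Let $S^-$ be the set of negative elements of S. For a ∈ Z with p ∤ a, let µ be the number of elements of

\[ \left\{ a, 2a, \dots, \frac{p-1}{2}a \right\} \]

which are congruent modulo p to an element of $S^-$. Then $\left(\frac{a}{p}\right) = (-1)^{\mu}$.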

Proof. First of all, it is clear that since the residue classes of the elements $a, 2a, \dots, \frac{p-1}{2}a$ are distinct and non-zero modulo p, each of these elements is congruent to a unique non-zero element of S modulo p. So for $1 \le i \le \frac{p-1}{2}$, write $ia \equiv \varepsilon_i m_i \pmod{p}$ with mi ∈ S positive and εi = ±1. Then

\[ (-1)^{\mu} = \prod_{i=1}^{(p-1)/2} \varepsilon_i \]

by definition.

Now we show that mi ≠ mj for i ≠ j with $1 \le i, j \le \frac{p-1}{2}$. If we did have mi = mj but i ≠ j, then clearly εi ≠ εj since the residue classes of the ia's are distinct. Therefore ia ≡ −ja (mod p), so i ≡ p − j (mod p). Since $i \in \{1, \dots, \frac{p-1}{2}\}$ and $p - j \in \{\frac{p+1}{2}, \dots, p-1\}$, this gives a contradiction. Therefore the mi's are all distinct. Since they are all in the set $\{1, \dots, \frac{p-1}{2}\}$, the set of mi's actually equals this set.

Now we compute

\[ \prod_{i=1}^{(p-1)/2} ia \equiv \prod_{i=1}^{(p-1)/2} \varepsilon_i m_i \equiv (-1)^{\mu}\left(\frac{p-1}{2}\right)! \pmod{p} \]

and

\[ \prod_{i=1}^{(p-1)/2} ia = a^{\frac{p-1}{2}}\left(\frac{p-1}{2}\right)! \equiv \left(\frac{a}{p}\right)\left(\frac{p-1}{2}\right)! \pmod{p}. \]

Upon cancellation of $\left(\frac{p-1}{2}\right)!$, we obtain the lemma.
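Gauss's lemma gives a purely combinatorial way to compute the Legendre symbol, which is easy to try out. In the Python sketch below, `gauss_mu` counts the multiples ia (1 ≤ i ≤ (p − 1)/2) whose least non-negative residue exceeds (p − 1)/2, which is the same as saying that the representative in S is negative; `legendre` is the power formula from Section 1.6. Both function names are assumptions made for this illustration.

```python
def gauss_mu(a, p):
    """Count i in 1..(p-1)/2 such that i*a is congruent mod p to a
    negative element of S = {-(p-1)/2, ..., (p-1)/2}."""
    half = (p - 1) // 2
    mu = 0
    for i in range(1, half + 1):
        r = (i * a) % p        # least non-negative residue of i*a
        if r > half:           # then r - p is the (negative) representative in S
            mu += 1
    return mu

def legendre(a, p):
    """Legendre symbol for an odd prime p, via the class of a^((p-1)/2)."""
    t = pow(a, (p - 1) // 2, p)
    return -1 if t == p - 1 else t
```

For example, with p = 7 and a = 3 the multiples 3, 6, 9 reduce to 3, 6, 2, of which only 6 exceeds 3, so µ = 1 and the lemma gives (3/7) = −1.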


To find the quadratic character of 2, we let a = 2 in the lemma, and we consider the mi's and εi's as in the proof. It is clear that for $2i \le \frac{p-1}{2}$, the mi's are even and εi = 1, while for $2i > \frac{p-1}{2}$, the mi's are odd and εi = −1. Thus, $\varepsilon_i = (-1)^{m_i}$, and so

\[ (-1)^{\mu} = (-1)^{1 + 2 + \cdots + \frac{p-1}{2}} = (-1)^{\frac{p^2-1}{8}}. \]

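For the general case, let [x] denote the greatest integer in x. Let εi, mi be as in the proof of Gauss's lemma, with a = q. Then $iq = \left[\frac{iq}{p}\right]p + m_i$ if εi = 1, and $iq = \left[\frac{iq}{p}\right]p + p - m_i$ if εi = −1. Summing over i gives

\[ q\,\frac{p^2-1}{8} = \sum_{i=1}^{(p-1)/2} iq = \mu p + \sum_{i=1}^{(p-1)/2} \left[\frac{iq}{p}\right]p + \sum_{i=1}^{(p-1)/2} \varepsilon_i m_i. \]

Since p is odd, since εimi ≡ mi (mod 2), and since the mi run over $\{1, \dots, \frac{p-1}{2}\}$ (so that their sum is $\frac{p^2-1}{8}$), reducing modulo 2 gives

\[ q\,\frac{p^2-1}{8} \equiv \mu + \sum_{i=1}^{(p-1)/2} \left[\frac{iq}{p}\right] + \frac{p^2-1}{8} \pmod{2}. \]

Rearranging, and using that q − 1 is even, we get

\[ \mu \equiv (q-1)\frac{p^2-1}{8} + \sum_{i=1}^{(p-1)/2} \left[\frac{iq}{p}\right] \equiv \sum_{i=1}^{(p-1)/2} \left[\frac{iq}{p}\right] \pmod{2}. \]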

Thus, by Gauss's lemma, it suffices to prove that

\[ \sum_{i=1}^{(p-1)/2} \left[\frac{iq}{p}\right] + \sum_{j=1}^{(q-1)/2} \left[\frac{jp}{q}\right] = \frac{p-1}{2}\cdot\frac{q-1}{2}. \]

This will be clear to any reader who knows how to correctly visualize these sums. Let us explain.

Let T be the set

{(c, d) | c, d ∈ Z, 1 ≤ c < p/2, 1 ≤ d < q/2}.

This consists of the lattice points in the xy-plane lying strictly inside the rectangle bounded by the axes and the lines x = p/2 and y = q/2. We split this rectangle across the diagonal y = qx/p: Let A be the number of elements in the set

{(a, b) ∈ T : b < qa/p},

and let B be the number of elements in the set

{(a, b) ∈ T : b > qa/p}.

Since b ≠ qa/p for (a, b) ∈ T (qa/p is not an integer for 1 ≤ a < p/2, i.e. the diagonal contains no lattice points), these sets are disjoint with union T and we have

\[ A + B = |T| = \frac{p-1}{2}\cdot\frac{q-1}{2}. \]

Now for a fixed a with 1 ≤ a < p/2, the set {b ∈ N : b < qa/p} has $\left[\frac{aq}{p}\right]$ elements, all of which satisfy b < q/2, so summing over all a with 1 ≤ a < p/2 gives

\[ A = \sum_{a=1}^{(p-1)/2} \left[\frac{aq}{p}\right]. \]

Also, since the condition b > qa/p is equivalent to a < pb/q, a similar argument gives

\[ B = \sum_{b=1}^{(q-1)/2} \left[\frac{bp}{q}\right]. \]

Thus,

\[ \sum_{a=1}^{(p-1)/2} \left[\frac{aq}{p}\right] + \sum_{b=1}^{(q-1)/2} \left[\frac{bp}{q}\right] = \frac{p-1}{2}\cdot\frac{q-1}{2}. \]

This is what we wanted to show.
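The lattice point count can be reproduced by machine. The Python sketch below computes the floor sums and, independently, counts the points of T below the diagonal y = qx/p using integer comparisons only. The function names `floor_sum` and `points_below_diagonal` are assumptions made for this illustration.

```python
def floor_sum(p, q):
    """Sum of [i*q/p] for 1 <= i <= (p-1)/2."""
    return sum((i * q) // p for i in range(1, (p - 1) // 2 + 1))

def points_below_diagonal(p, q):
    """Count lattice points (a, b) with 1 <= a < p/2, 1 <= b < q/2 and
    b < q*a/p; the comparison is done as b*p < q*a to avoid fractions."""
    return sum(1
               for a in range(1, (p - 1) // 2 + 1)
               for b in range(1, (q - 1) // 2 + 1)
               if b * p < q * a)
```

For distinct odd primes p and q, `points_below_diagonal(p, q)` agrees with `floor_sum(p, q)`, and `floor_sum(p, q) + floor_sum(q, p)` equals `((p - 1) // 2) * ((q - 1) // 2)`. For p = 3 and q = 5 the two floor sums are 1 and 1, and the rectangle contains 1 · 2 = 2 lattice points.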

I want to finish this chapter by paraphrasing Glenn Stevens, who runs an intensive number theory program for high school students called PROMYS. One third of the program is spent trying to understand quadratic reciprocity. After his last lecture in 2013, he remarked on the history of mathematicians' attempts at this.

It is clear that Gauss who, as mentioned above, found the first proof of this theorem amongst seven others, was not satisfied with his own understanding of it. He developed the first inklings of a theory of algebraic numbers in order to prove theorems which, in some sense, complement or extend quadratic reciprocity. These theorems include the sign of the quadratic Gauss sum and biquadratic reciprocity.

The proofs we presented admittedly shed no considerable light on the deep nature of this theorem. Any reader who seriously thinks otherwise may consider spreading his or her wisdom to the rest of the mathematical community.

The first notable attempts, besides those of Gauss, to understand quadratic reciprocity consisted of generalizations. One such generalization is Eisenstein reciprocity, which may be treated with the tools developed in the first part of these notes. Another is Artin reciprocity, which was known to generalize all other reciprocity laws at the time of its proof. This particular theorem was a major achievement in class field theory, which will be discussed in a later part of these notes.

Artin reciprocity, in particular, has an interpretation in terms of properties of L-functions. In brief, one can interpret a character on an abelian Galois group as a 1-dimensional representation of the Galois group, and to any such representation one can associate a meromorphic function called an L-function. (Similar functions pop up just about everywhere in number theory.) Artin reciprocity implies certain properties of this function.

Now in the 1970's, Robert Langlands formulated an extremely influential and unifying program in number theory now called the Langlands program. It is essentially a series of conjectures which relate L-functions from number theory, like the Artin L-functions, to L-functions appearing in geometry. In particular, Langlands interpreted the Hecke L-series as a geometric L-series (technically, the L-series associated to an automorphic form on the reductive algebraic group GL1(AK)) and noticed that Artin reciprocity is essentially the statement that every Artin L-function is equal to some Hecke L-function. Thus the notion of reciprocity may be a reflection of the interaction between L-functions from geometry and L-functions from number theory.

A famous example of this interaction is actually the proof of Fermat's Last Theorem. The main point of the proof was that every L-function coming from an elliptic curve also came from a modular form. Here the modular forms are on the geometric side, while the elliptic curves are on the arithmetic side. Thus the proof of Fermat's Last Theorem is, in some sense, the establishment of a reciprocity law.

Even with all of this, most serious mathematicians still do not claim to understand quadratic reciprocity. It is believed, however, that we are making progress.

Exercises

Exercise 1.1. Prove the last assertion (on the rank of B) of Theorem 1.10 by considering B ⊗A K.

Exercise 1.2. Prove Nakayama's lemma: Let A be local with maximal ideal p, and assume that M is a finitely generated A-module. If pM = M, then M = 0. [Hint: Write out generators for M and express one of them in terms of the others, with coefficients in p. Then induct on the number of generators.]

Exercise 1.3. Let p be a positive prime.

(a) Prove that $1, \zeta_p, \dots, \zeta_p^{p-2}$ is an integral basis for Z[ζp].

(b) Compute $\Delta_{\mathbf{Q}(\zeta_p)}$.


2 Prime Decomposition

In this section we consider more deeply the notion of primes lying over other primes. This is especially interesting in Dedekind domains, where primes can factor, or decompose, in extensions. The Galois theory of the fraction fields plays a key role because the Galois group permutes transitively the primes lying over a given prime.

Sources

The exposition in Sections 2.1 through 2.3 is heavily influenced by Lang's text [5], though Section 2.2 is also influenced by Marcus [6].

2.1 Galois Theory and Prime Decomposition

We return to the setting of Section 1.1 and work very generally. We fix a ring A, integrally closed in its field of fractions K, and we let L be a finite separable extension of K and B the integral closure of A in L. If L/K is Galois, we fix the group G = Gal(L/K). The example to keep in mind, of course, is when K, L are number fields, and A = OK, so that B = OL.

We want to see how G interacts with the primes P in B lying over a fixed maximal ideal p in A. Recall we have a diagram

\[ \begin{array}{ccc} A & \hookrightarrow & B \\ \downarrow & & \downarrow \\ A/p & \hookrightarrow & B/P. \end{array} \]

(See the remarks after Definition 1.12.) Furthermore, since we assume p is maximal, so is P (Proposition 1.14). Thus we have an extension of residue fields A/p ⊂ B/P. The Galois group of this extension will also interact with the Galois group of L over K.

Let us begin with the following proposition.

Proposition 2.1. Let p ⊂ A be maximal and assume L/K is Galois. Then given any two primes P, Q ⊂ B lying over p, there is a σ ∈ G with P = σQ.

Proof. We leave it as an exercise to show that, first of all, every σ ∈ G induces a ring automorphism of B which fixes A. This implies that σQ is prime for all σ.

Now assume P ≠ σQ for all σ ∈ G. Then by the Chinese Remainder Theorem, we can find an element x ∈ B with x ∈ P and x ≡ 1 (mod σQ) for all σ ∈ G. Then NmL/K(x) ∈ A (Corollary 1.8) and it is also a multiple of x (it is the product of the σ(x) for σ ∈ G), so NmL/K(x) lies in P ∩ A = p.

Since x ∉ σQ for all σ, we have σ(x) ∉ Q for all σ. But because NmL/K(x) ∈ p = Q ∩ A, we find that σ(x) ∈ Q for some σ since Q is prime. This is a contradiction.

Corollary 2.2. Given a maximal ideal p ⊂ A, there are only finitely many primes in B lying above p (L/K need not be Galois).

Proof. If L/K is Galois, then there can only be as many primes lying over p as automorphisms in G, so the result is immediate. Hence we will reduce to the Galois case.

Let F be the Galois closure of L over K, and C the integral closure of A in F. There are finitely many primes in C lying over p. Thus we need to show that if P1, P2 are distinct primes of B lying over p, and if Q1, Q2 ⊂ C are primes lying over P1, P2 respectively, then the Qi lie over p and are distinct. The first assertion follows from

p = Pi ∩ A = Qi ∩ B ∩ A = Qi ∩ A,

and the second from

Q1 ∩ B = P1 ≠ P2 = Q2 ∩ B.

Definition 2.3. Let L/K be Galois. Let P ⊂ B be a prime lying over a maximal p ⊂ A. Let D(P/p) be the set of elements σ of G such that σP = P. It is clear that D(P/p) is a subgroup of G. We call it the decomposition group of P over p. We often drop the p from the notation and write D(P). The fixed field of D(P) is denoted Ld and is called the decomposition field of P.

The importance of the decomposition group is that there is a canonical homomorphism D(P/p) → Gal((B/P)/(A/p)), described as follows. We have that σ(P) = P for all σ ∈ D(P/p), and that each σ is also a ring automorphism of B. Hence σ induces a map $\bar\sigma : B/P \to B/P$, which fixes A/p pointwise (because σ fixes A pointwise). Therefore $\bar\sigma$ is a well-defined element of Gal((B/P)/(A/p)).

Obviously $\sigma \mapsto \bar\sigma$ is a homomorphism. Its kernel is denoted E(P/p), or E(P), and is called the inertia group of P over p. Clearly E(P) consists of all elements σ of D(P) (in fact of G) for which σ(α) ≡ α (mod P) for all α ∈ B. The fixed field of E(P) is denoted Le and is called the inertia field of P. Obviously,

K ⊂ Ld ⊂ Le ⊂ L.

We will compute the degrees of these extensions for the case L/K a Galois extension of fraction fields of Dedekind domains in the next section.

We want to prove that the homomorphism D(P/p) → Gal((B/P)/(A/p)) is surjective, and this will be done eventually. First we prove a basic fact about the extension of primes to the decomposition field.

Proposition 2.4. Let L/K be Galois, and P, p be as usual. Let Bd be the integral closure of A in Ld. Consider the prime P ∩ Bd. Then the only prime of B lying above P ∩ Bd is P itself. The field Ld is the smallest subfield of L containing K with this property.

Proof. First note that P and P ∩ Bd are maximal because they lie over the maximal p. Since Gal(L/Ld) = D(P/p) fixes P, we see that P is the only prime lying above P ∩ Bd by Proposition 2.1.

Now let E be any subfield of L containing K such that, if C is the integral closure of A in E, we have that P is the only prime lying above P ∩ C. Let σ ∈ Gal(L/E). Then σ(P) ∩ C = P ∩ C because σ fixes C. It follows that σ(P) = P and so σ ∈ D(P). Hence Gal(L/E) ⊂ D(P), and so Ld ⊂ E. This suffices.


Proposition 2.5. Let the setting be as in the previous proposition, with Q = P ∩ Bd. Then the injection A/p → Bd/Q is an isomorphism.

Proof. If σ ∈ G\D(P), then σP ≠ P by definition, and hence also $P \ne \sigma^{-1}P$. We let $Q_\sigma = \sigma^{-1}P \cap B_d$. By the previous proposition, $Q_\sigma \ne Q$.

Let x ∈ Bd. We will essentially be looking at the image of x modulo Q, and we will construct an element in A to which x is congruent. We use the Chinese Remainder Theorem to find an element y ∈ Bd satisfying

\[ y \equiv x \pmod{Q}, \qquad y \equiv 1 \pmod{Q_\sigma} \]

for all σ ∈ G\D(P). Since Q = P ∩ Bd and $Q_\sigma = \sigma^{-1}P \cap B_d$, we may lift these congruences to B and obtain

\[ y \equiv x \pmod{P}, \qquad y \equiv 1 \pmod{\sigma^{-1}P} \]

for all σ ∈ G\D(P). This last congruence implies σy ≡ 1 (mod P) for all σ ∈ G\D(P). Taking norms from Ld to K multiplies (some of) these congruences together, giving

\[ \operatorname{Nm}_{L_d/K}(y) \equiv x \pmod{P}. \]

This congruence holds modulo Q because $\operatorname{Nm}_{L_d/K}(y) \in A$ since y ∈ Bd, and so $\operatorname{Nm}_{L_d/K}(y) \in B_d$. Thus we have a congruence between x and an element of A, as desired.

Theorem 2.6. Let A,B,K,L, p,P be as usual, with L/K Galois. Then B/P is a normalextension of A/p and the map D(P)→ Gal((B/P)/(A/p)) is surjective.

Proof. Let us set some notation. Let l = B/P and k = A/p. We denote the reduction ofx ∈ B to l by x, and similarly with A to k. The reduction of any polynomial f over B orA to l or k is similarly denoted f .

We first prove the finiteness of any separable subextension of l/k. The normality of l/kwill fall out of this part of the proof. The technique will be to choose an element x ∈ l butinstead study the minimal polynomial of x ∈ L. The surjectivity will also be proved alongthese lines.

So choose $\bar{x} \in l$ such that $k(\bar{x})$ is separable over k, let x ∈ B be a lift of $\bar{x}$, and let f be the minimal polynomial of x over K. The coefficients of f lie in A and all of the roots $x_i$ of f are integral over A. Since L/K is Galois, f splits into linear factors

$$f(t) = \prod (t - x_i).$$

Hence the reduction $\bar{f}$ of f splits in l as $\prod (t - \bar{x}_i)$. Since $\bar{f}(\bar{x}) = 0$, the minimal polynomial of $\bar{x}$ over k divides $\bar{f}$, and thus splits. We obtain simultaneously the inequality $[k(\bar{x}) : k] \le [K(x) : K]$ and the normality of l/k. In particular $[k(\bar{x}) : k] \le [L : K]$ for any $\bar{x}$ which generates a separable subextension of l/k. Since any separable subextension is generated by one element (Primitive Element Theorem), this proves that the maximal separable subextension is finite.

Now we tackle the surjectivity. We can prove this easily for the case when only one prime lies above p. Since the residue fields of $L_d$ and K are the same (Proposition 2.5) and since $D(P) = \mathrm{Gal}(L/L_d)$, we may assume $K = L_d$. Proposition 2.4 shows that P is the only prime of L lying above $P \cap B_d$.

Let $\bar{x}$ generate the maximal separable subextension of l/k, and let x ∈ B be a lift of $\bar{x}$. Similarly to the above, the minimal polynomial f of x over K splits in L, and so does $\bar{f}$ in l. The Galois group of L/K, which is D(P) by our reduction, permutes transitively the roots $x_i$ of f, and similarly elements of Gal(l/k) send $\bar{x}$ to other roots of $\bar{f}$. Because $\bar{x}$ generates the maximal separable subextension of l/k, the automorphisms of l/k are determined by their action on $\bar{x}$. Summarizing, for any i, there is a σ ∈ D(P) such that $\sigma(x) = x_i$. Hence $\bar{\sigma}(\bar{x}) = \bar{x}_i$, and this action determines any τ ∈ Gal(l/k). So any τ is some $\bar{\sigma}$, proving surjectivity.

Corollary 2.7. Notation as above, we have an isomorphism D(P)/E(P) ∼= Gal(l/k).

2.2 Prime Decomposition in Dedekind Domains

We now specialize to Dedekind domains, where the prime extensions have a nice description. This description will also allow us to determine the order of the groups D(P)/E(P) and E(P).

Let A be a Dedekind domain with field of fractions K, L a finite separable extension of K, and B the integral closure of A in L. Then if p is prime in A, we consider the ideal pB in B. This has a unique factorization $P_1^{e_1} \cdots P_r^{e_r}$. The prime p is said to split into this factorization in B. The $e_i = e(P_i/p)$ are called the ramification indices of $P_i$ over p. If one of the $e_i > 1$, the prime p is said to ramify in B, and p is ramified. Otherwise p is unramified.

The ideal $P_i \cap A$ is prime in A and it contains pB ∩ A = p. Since non-zero primes are maximal, $P_i \cap A = p$ and so $P_i$ lies over p.

Conversely, let P be a prime of B. Then P ∩ A is prime, and (P ∩ A)B ⊂ PB ∩ AB = P. Hence P occurs in the factorization of (P ∩ A)B. So every prime of B arises from the factorization of the unique prime of A over which it lies. In particular, no prime of B occurs in the factorization of two primes. In some sense, this suggests a sort of “evenness” to the splitting of primes in extensions. This intuition will be justified by the next theorem.

To state this theorem, we need another invariant associated to a prime P ⊂ B lying over the prime p ⊂ A. As we saw in the last section, there is a field extension (B/P)/(A/p). This extension is finite (in the case of Dedekind domains) for the following reasons. We know that B is a finitely generated free A-module of rank n = [L : K]. Thus B/pB is a finitely generated A/p-module. But by the Chinese Remainder Theorem, B/pB splits as

$$(B/P_1^{e_1}) \times \cdots \times (B/P_r^{e_r}).$$

One of the $P_i$’s is P, and so for that i, $B/P_i^{e_i}$ is an A/p-module of finite rank. Hence so is B/P.

With this notation, we denote by $f_i = f(P_i/p)$ the degree $[B/P_i : A/p]$. We will call this the inertia degree of $P_i$ over p.

Theorem 2.8. Let A be a Dedekind domain with field of fractions K, let L be a finite separable extension of K of degree n, and let B be the integral closure of A in L. Let p be a prime in A which splits in B as $P_1^{e_1} \cdots P_r^{e_r}$. Then we have the formula

$$\sum_{i=1}^{r} e_i f_i = n.$$

In particular, if L/K is Galois, then all the $e_i$’s are equal, to e, say, and all the $f_i$ are equal, to f, say, and we have

$$ref = n.$$

The proof will require a few preliminaries. The idea will be to localize because local Dedekind domains are extremely well behaved, so much so that they have a name of their own.

Definition 2.9. A Dedekind domain which is local is called a discrete valuation ring.

This way of stating the definition is perhaps non-standard. We will prove a proposition stating the equivalence of this definition to a different one. First, a couple of lemmas.

Lemma 2.10. Let A be a Dedekind domain, and S ⊂ A a multiplicative system. Then $S^{-1}A$ is a Dedekind domain.

Proof. By Proposition 1.11, $S^{-1}A$ is still integrally closed. Since there is a one-to-one inclusion preserving correspondence between the primes of $S^{-1}A$ and the primes of A which do not intersect S, every prime is still maximal. Finally, localization respects Noetherianity.

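Lemma 2.11. Let A be a Dedekind domain and let S be a multiplicatively closed subset of A. If a, b are ideals of A, then $S^{-1}(ab) = (S^{-1}a)(S^{-1}b)$. In other words, the map $I(A) \to I(S^{-1}A)$ given by $a \mapsto S^{-1}a$ is a homomorphism.

Proof. We have

$$S^{-1}(ab) = \left\{ \sum \frac{\alpha\beta}{x} \,\middle|\, \alpha \in a,\ \beta \in b,\ x \in S \right\} = \left\{ \sum \frac{\alpha}{x} \sum \frac{\beta}{y} \,\middle|\, \alpha \in a,\ \beta \in b,\ x, y \in S \right\} = (S^{-1}a)(S^{-1}b).$$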

Lemma 2.12. A Dedekind domain A with only finitely many primes is a principal ideal domain.

Proof. Let $p_1, \dots, p_n$ be the primes of A. By the Chinese Remainder Theorem, we can find an element α of A which is in some $p_i$ but which is not in $p_j$ for j ≠ i. Then $p_i \mid (\alpha)$ and $p_j \nmid (\alpha)$. It follows that $(\alpha) = p_i^r$ for some r. Hence $\alpha = \beta_1 \cdots \beta_r$ for some $\beta_1, \dots, \beta_r \in p_i$. Thus $(\beta_1) \cdots (\beta_r) = p_i^r$. By unique prime ideal factorization, $(\beta_j) = p_i$ for some j (in fact, all of the ideals $(\beta_j)$ are equal to $p_i$, but we do not require this much information). Therefore, all primes are principal. Since any ideal can be factored into primes, every ideal is principal.

Proposition 2.13. A ring A is a discrete valuation ring if and only if A is a principal ideal domain with exactly one prime.

Proof. The forward direction follows by definition and by the previous lemma. Conversely, if A is a principal ideal domain with exactly one prime, then this prime is equal to the maximal ideal which contains it; every ideal is (finitely) generated by one element; and A is integrally closed by Proposition 1.15. So A is Dedekind, and also local by assumption.

Remark. The usual definition of discrete valuation ring is the equivalent condition given in the last proposition. The terminology stems from the fact that, as a unique factorization domain, A has only one irreducible element π (up to multiplication by units). This is because every principal ideal generated by an irreducible element must be prime, but there is only one prime. Thus every non-zero element is uniquely written as $u\pi^r$ for some unit u and some r. Let K = Frac(A). Then we see that there is an isomorphism $K^\times \cong A^\times \times \pi^{\mathbb{Z}} \cong A^\times \times \mathbb{Z}$. The composition of this isomorphism with the projection $A^\times \times \mathbb{Z} \to \mathbb{Z}$ is called the valuation. This terminology itself stems from the fact that the p-adic integers $\mathbb{Z}_p$ (to be introduced later) form a discrete valuation ring, and the associated valuation gives rise to an absolute value in the usual sense.

An example of a map which is considered a valuation and which is not discrete is the map log | · | on R× or C×. These ideas come from the fact that R, C, and $\mathbb{Q}_p = \mathrm{Frac}(\mathbb{Z}_p)$ are examples of local fields, the theory of which will be discussed later. In fact, the topology and associated measure theory of these fields unifies them and their valuations into one theory.

We can prove Theorem 2.8 now. Localization at p clearly does not affect residue fields, and Lemma 2.11 tells us that the factorization of the localized prime $p_p$ in $B_p$ does not change. Hence $f_i$, $e_i$ do not change. Furthermore, since $B_p$ is the integral closure of $A_p$ in L (Proposition 1.11), $B_p$ is a free $A_p$-module of rank n. Hence it suffices to prove the theorem when A is local. We assume this now. In particular, now A is principal, and so is B because its only primes are the ones lying above p (why?). So Lemma 2.12 tells us that B is principal.

Consider the A/p-module B/pB. This is a free A/p-module of rank n (to see this, just tensor B with A/p). By the Chinese Remainder Theorem,

$$B/pB \cong B/P_1^{e_1} \oplus \cdots \oplus B/P_r^{e_r},$$

as an A/p-module. So it will be enough to prove that $B/P_i^{e_i}$ is a $B/P_i$-vector space of dimension $e_i$.

Let π be the generator of the ideal $P_i$ (recall B is a principal ideal domain). Let j ≥ 0 and consider the map $B/P_i \to P_i^j/P_i^{j+1}$ given by multiplication by $\pi^j$. This is clearly an isomorphism (for instance, tensor $0 \to B \to (\pi^j) \to 0$ with $B/(\pi)$). Hence the inclusions

$$B \supset P_i \supset P_i^2 \supset \cdots \supset P_i^{e_i}$$

give a composition series which shows that $B/P_i^{e_i}$ is an $e_i$-dimensional $B/P_i$-vector space. As mentioned, this suffices.

To prove the last statement, just note that the Galois group of L over K permutes the primes $P_i$ transitively. So because σ ∈ Gal(L/K) restricts to a ring automorphism of B, we clearly get the same ramification indices and inertia degrees for each $P_i$.
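As a small sanity check of the formula ref = n, here is a minimal Python sketch for the Galois extension Q(i)/Q, so n = 2. It assumes the standard fact, not proved in this section, that the factorization of pZ[i] mirrors the factorization of x^2 + 1 modulo p. The function name `splitting_type_gaussian` is hypothetical and chosen only for this illustration.

```python
def splitting_type_gaussian(p):
    """Return (r, e, f) for the rational prime p in Z[i].

    Assumption (not from the notes): the factorization of pZ[i]
    mirrors the factorization of x^2 + 1 modulo p.
    """
    roots = [x for x in range(p) if (x * x + 1) % p == 0]
    if len(roots) == 2:      # two distinct linear factors: p splits
        return (2, 1, 1)
    if len(roots) == 1:      # one repeated root: p ramifies (only p = 2)
        return (1, 2, 1)
    return (1, 1, 2)         # x^2 + 1 irreducible mod p: p is inert


for p in [2, 3, 5, 7, 11, 13, 17]:
    r, e, f = splitting_type_gaussian(p)
    assert r * e * f == 2    # n = [Q(i) : Q] = 2
    print(p, (r, e, f))
```

In each case the three invariants multiply to 2, as the theorem predicts.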


Definition 2.14. Let A be a Dedekind domain with fraction field K, let L be a finite separable extension of K of degree n, and let B be the integral closure of A in L. Consider a prime p of A. If e = f = 1 for all primes lying over p in B, then p is said to split completely. If e = n for some prime above p (in which case there is only one prime above p), then p is said to be totally ramified.

Let us now consider multiple extensions. We have the following useful proposition.

Proposition 2.15. Let A be a Dedekind domain with fraction field K. Let K ⊂ L ⊂ M be finite separable extensions, and A ⊂ B ⊂ C be the corresponding integral closures of A. Consider a prime P of C lying over a prime q in B, which lies over a prime p in A. Then we have the following formulas.

e(P/p) = e(P/q)e(q/p)

f(P/p) = f(P/q)f(q/p).

Proof. The first formula follows from (pB)C = pC, and the second follows from the multiplicativity of degrees of field extensions in towers.

Now we will prove a theorem which brings together all of our results.

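Theorem 2.16. Let A be a Dedekind domain with fraction field K, let L be a finite Galois extension of K, and let B be the integral closure of A in L. Consider a prime P in B lying over a prime p in A. Let $B_e$ be the integral closure of A in $L_e$, and $B_d$ the integral closure of A in $L_d$. Let $P_e = P \cap B_e$ and $P_d = P \cap B_d$. Then we have the following data.

$$K \overset{r}{\subset} L_d \overset{f}{\subset} L_e \overset{e}{\subset} L \quad \text{(degrees)}$$

$$p \overset{1}{\subset} P_d \overset{1}{\subset} P_e \overset{e}{\subset} P \quad \text{(ramification)}$$

$$p \overset{1}{\subset} P_d \overset{f}{\subset} P_e \overset{1}{\subset} P \quad \text{(inertia)}$$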

Proof. We use repeatedly the fact that ref = n (Theorem 2.8).

Corollary 2.7 shows immediately that $[L_e : L_d] = [B/P : A/p] = f$. We want to show $[L_d : K] = r$. Well, $[L_d : K] = [G : D(P)]$. Let σ, τ ∈ G. By definition, σP = τP if and only if $\tau^{-1}\sigma P = P$, if and only if $\tau^{-1}\sigma \in D(P)$, if and only if σD(P) = τD(P). Since the set of σP’s includes all primes lying above p, this gives a one-to-one correspondence between these r primes and the cosets of D(P) in G. So $[L_d : K] = r$. Since ref = n, we have $[L : L_e] = e$.

Since there is only one prime of L above $P_d$ (Proposition 2.4) and since we know $[L : L_d] = ef$, we must have $e(P/P_d) f(P/P_d) = ef$. Therefore, since clearly $e(P/P_d) \le e$ and $f(P/P_d) \le f$, we have equality in both cases. Hence $e(P_d/p) = f(P_d/p) = 1$. Furthermore, if we prove any one of the equalities $e(P_e/P_d) = 1$, $e(P/P_e) = e$, $f(P_e/P_d) = f$, or $f(P/P_e) = 1$, then we have the other three, and so we will be done. We choose to prove the fourth. But this is clear; since the Galois group of $L/L_e$ acts trivially on the residue field B/P by definition, Corollary 2.7 (with $L_e$ in place of K) tells us that the Galois group of $(B/P)/(B_e/P_e)$ is trivial. This proves $f(P/P_e) = 1$, and hence the theorem.


Theorem 2.17. Let the notation be as in the previous theorem.

(a) $L_d$ is the largest intermediate field K′ such that e(p′/p) = f(p′/p) = 1. Here p′ = P ∩ A′ where A′ is the integral closure of A in K′.

(b) $L_d$ is the smallest intermediate field K′ such that P is the only prime lying over p′.

(c) $L_e$ is the largest intermediate field K′ such that e(p′/p) = 1.

(d) $L_e$ is the smallest intermediate field K′ such that p′ is totally ramified in L.

Proof. (a) We know that e(P/p′) = e and f(P/p′) = f by the multiplicativity of e and f in towers. Thus the same is true for the primes in $L_dK'$, and also [L : K′] ≥ ef. By the previous theorem, $[L_dK' : K] \ge r$, and hence $[L : L_dK'] \le ef$. Thus we have equality $[L : L_dK'] = ef = [L : L_d]$. So $K' \subset L_d$.

(b) is Proposition 2.4.

(c) is similar to (a).

(d) is again similar: If p′ is totally ramified, then $[L : K'] = e(P/p') \le e = [L : L_e]$, so we conclude by examining $L_eK'$.

2.3 The Norm of an Ideal

Let A be a Dedekind domain with fraction field K, L a finite separable extension of K, and B the integral closure of A in L. We have a homomorphism I(A) → I(B) given by a ↦ aB, whose properties we just studied. Now we wish to define a homomorphism I(B) → I(A). This will be called the norm and will be denoted $\mathrm{Nm}_{L/K}$. It is defined on the primes as follows. For a prime P of B lying above a prime p of A, we let $\mathrm{Nm}_{L/K} P = p^{f(P/p)}$. We extend this to the whole of I(B) by linearity (recall I(B) is free abelian on the primes). The terminology is justified by the next few propositions.

Proposition 2.18. Let A be a Dedekind domain with field of fractions K, let L be a finite separable extension of K, and let B be the integral closure of A in L. Assume L/K is Galois with Galois group G. Let P be a prime of B lying above a prime p in A. Then, with notation as in Theorem 2.8,

$$(\mathrm{Nm}_{L/K} P)B = \prod_{\sigma \in G} \sigma P = (P_1 \cdots P_r)^{ef}.$$

Proof. We already know $pB = (P_1 \cdots P_r)^e$, from which the equality of the first and last terms in the formula follows immediately. The equality between the second and last terms is just the fact that G permutes the primes over p transitively, each one corresponding to a coset of D(P) in G (see Theorem 2.16 and its proof, for instance).

Proposition 2.19. Let A be a Dedekind domain with field of fractions K, let L be a finite separable extension of K, and let B be the integral closure of A in L. Let a ⊂ A be a fractional ideal. Then $\mathrm{Nm}_{L/K}(aB) = a^{[L:K]}$.

Proof. This follows immediately for primes from Theorem 2.8. Just extend by linearity.

Proposition 2.20. Let K ⊂ E ⊂ F be a tower of finite separable extensions, which are fraction fields of Dedekind domains A ⊂ B ⊂ C, B, C being the integral closures of A in their fraction fields. Then

$$\mathrm{Nm}_{E/K} \circ \mathrm{Nm}_{F/E} = \mathrm{Nm}_{F/K}.$$

Proof. This follows immediately for primes from the multiplicativity of inertia degrees in towers. Extend by linearity.

Proposition 2.21. Let A be a Dedekind domain with field of fractions K, let L be a finite separable extension of K, and let B be the integral closure of A in L. Let β ∈ L, β ≠ 0, and consider the principal fractional ideal βB = (β). Then

$$\mathrm{Nm}_{L/K}\,(\beta) = (\mathrm{Nm}_{L/K}\,\beta).$$

Proof. Let F be the Galois closure of L over K, and C the integral closure of A in F. Then $\mathrm{Nm}_{F/L}(\beta C) = (\beta)^{[F:L]}$ by Proposition 2.19, and we have $(\mathrm{Nm}_{F/L}\,\beta) = (\beta^{[F:L]}) = (\beta)^{[F:L]}$. Therefore, if we know the proposition for F/K, we know it for L/K by the transitivity of norms (of both elements and ideals). But now the proposition follows easily from Proposition 2.18.

Let A be a Dedekind domain and assume all residue fields A/p are finite. For an ideal a ⊂ A, define Na = |A/a|. The next proposition shows that this is finite, and determines the value. This will be very useful in studying the arithmetic of number fields and local fields.

Proposition 2.22. With the situation as above, assume a has the factorization $a = p_1^{n_1} \cdots p_r^{n_r}$. Then

$$Na = \prod (Np_i)^{n_i}.$$

Proof. By the Chinese Remainder Theorem, $A/a \cong A/p_1^{n_1} \times \cdots \times A/p_r^{n_r}$. As in the proof of Theorem 2.8, each $A/p_i^{n_i}$ has a composition series with $n_i$ successive quotients isomorphic to $A/p_i$, so that $|A/p_i^{n_i}| = (Np_i)^{n_i}$. This suffices.

An example of where we have this situation is when K/Q is a number field. Then for an ideal $a \subset O_K$, we have $\mathrm{Nm}_{K/\mathbb{Q}}\, a = (Na)$. We leave the proof as an exercise.

2.4 The Frobenius Automorphism

Let us very briefly discuss the Frobenius automorphism, which will be at the heart of class field theory later. The setup is this. Let L/K be a Galois extension of number fields, and let p be a prime of K which is unramified in L. Let P be a prime of L lying above p. Then we can associate to the pair (P, p) an automorphism φ(P, p) ∈ Gal(L/K) called the Frobenius automorphism, defined as follows.

Since p is unramified, E(P/p) is trivial and so $D(P/p) \cong \mathrm{Gal}((O_L/P)/(O_K/p))$. Since $(O_L/P)/(O_K/p)$ is an extension of finite fields, its Galois group has a canonical generator σ which is characterized by $\sigma a = a^{Np}$ (since Np is the order of the field $O_K/p$). Hence D(P/p) has a generator, the preimage of this σ, which is thus characterized by

$$\phi(P, p)\alpha \equiv \alpha^{Np} \pmod{P}.$$


This is the Frobenius automorphism.

Let τ be an automorphism of L over K. Then

$$\phi(P, p)(\tau^{-1}\alpha) \equiv (\tau^{-1}\alpha)^{Np} \pmod{P}$$

and so

$$(\tau\phi(P, p)\tau^{-1})\alpha \equiv \alpha^{Np} \pmod{\tau P}.$$

Hence $\tau\phi(P, p)\tau^{-1} = \phi(\tau P, p)$. Since the Galois group of L/K permutes transitively the primes of L lying over p, we see that the set of Frobenius elements associated to a prime p is the conjugacy class of any one of these Frobenius elements. We denote this set by φ(p).

When L/K is an abelian extension, this implies there is only one element of φ(p), i.e. the Frobenius elements depend only on the prime p of K. In this case, we use the notation φ(p) for the set and its element interchangeably.
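The defining congruence can be checked by hand in the abelian extension Q(i)/Q, where Np = p for an odd rational prime p. The Python sketch below computes α^p in Z[i]/pZ[i] and tests whether raising to the p-th power acts as the identity or as complex conjugation. The pair encoding and the helper names `gauss_mul`, `gauss_pow`, and `frobenius_is_conjugation` are hypothetical, introduced only for this example.

```python
def gauss_mul(z, w, p):
    """Multiply Gaussian integers z = (a, b), w = (c, d) modulo p."""
    (a, b), (c, d) = z, w
    return ((a * c - b * d) % p, (a * d + b * c) % p)


def gauss_pow(z, n, p):
    """Compute z^n in Z[i]/pZ[i] by repeated squaring."""
    result = (1 % p, 0)
    base = (z[0] % p, z[1] % p)
    while n > 0:
        if n & 1:
            result = gauss_mul(result, base, p)
        base = gauss_mul(base, base, p)
        n >>= 1
    return result


def frobenius_is_conjugation(p):
    """For an odd prime p, test whether i^p is congruent to -i modulo p."""
    return gauss_pow((0, 1), p, p) == (0, p - 1)


for p in [3, 5, 7, 11, 13]:
    print(p, "conjugation" if frobenius_is_conjugation(p) else "identity")
```

Under these assumptions the Frobenius element depends only on p, as expected for an abelian extension.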

So we have a map from primes to Gal(L/K) when L/K is abelian. We can therefore extend it to the whole ideal group $J(O_K)$ by linearity. The result is the Artin map, which is the main object of study in class field theory. If L/K is furthermore unramified (i.e. all primes of K are unramified in L), then the Artin Reciprocity Law states that the kernel of this map is $\mathrm{Nm}_{L/K}(I_L)$ (there is a ramified version as well).

The Artin Reciprocity Law is called so because, at the time of its proof, it was known to generalize all known reciprocity laws.

One application of the Artin Reciprocity Law is the Chebotarev density theorem, which does not require that L/K be abelian. Vaguely, it says that the primes of K are equidistributed according to which Frobenius elements they give rise to. All of this will be studied with class field theory.

Exercises

Exercise 2.1. This exercise outlines a proof that if a prime p ∈ Z is ramified in K, then $p \mid \Delta_K$. (We will see a much more general version of this theorem in the next chapter. In particular, the converse is true.)

(a) Assume p is ramified in K. Show that there is an element $\alpha \in O_K$ that is in every prime above p, but not in $pO_K$. If L is the Galois closure of K, show that σ(α) must then lie in every prime above p in L, for all σ ∈ Gal(L/Q).

(b) Let $\alpha_1, \dots, \alpha_n$ be an integral basis for $O_K$. Show that there is an i such that

$$\mathrm{Disc}(\alpha_1, \dots, \alpha_{i-1}, \alpha, \alpha_{i+1}, \dots, \alpha_n) = m\Delta_K$$

for some m coprime to p.

(c) Using (a), show that $m\Delta_K \in p\mathbb{Z}$. Conclude that $p \mid \Delta_K$.

(d) Prove that only finitely many primes are ramified in any extension of number fields.

Exercise 2.2. This exercise shows how primes in Z split in $\mathbb{Q}(\zeta_m)$. Let p be a prime in Z and write $m = np^k$ with gcd(p, n) = 1.

(a) Show that

$$p = (1 - \zeta_{p^k})^{\varphi(p^k)}$$

and use this to verify that the principal ideal $(1 - \zeta_{p^k})$ in the ring of integers in $\mathbb{Q}(\zeta_{p^k})$ is prime.

(b) Verify that $pO_{\mathbb{Q}(\zeta_n)}$ is unramified in $\mathbb{Q}(\zeta_n)$, say $p = P_1 \cdots P_r$. (See Exercise 2.1.)

(c) Show that

$$\prod_{i=1}^{n-1} (1 - \zeta_n^i) = n.$$

(d) Let P be any prime $P_i$ from (b). Let a ∈ Z. Using the identity in (c), show that $\zeta_n^{p^a} \equiv \zeta_n \pmod{P}$ if and only if $p^a \equiv 1 \pmod{n}$.

(e) Let $\sigma \in \mathrm{Gal}(\mathbb{Q}(\zeta_n)/\mathbb{Q})$ correspond to the automorphism $\zeta_n \mapsto \zeta_n^p$. Let φ = φ(P/p) be the Frobenius automorphism. Show that σ and φ generate the same subgroup of $\mathrm{Gal}(\mathbb{Q}(\zeta_n)/\mathbb{Q})$. Conclude that f(P/p) is the order of p in (Z/nZ)×.

(f) Let p have the factorization $(Q_1 \cdots Q_s)^e$ in $\mathbb{Q}(\zeta_m)$, each $Q_i$ with common inertia degree f. Show that $e = \varphi(p^k)$, f is the order of p in (Z/nZ)×, and $s = \varphi(n)/f$.
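The invariants in part (f) are easy to tabulate once the exercise is taken as given. The following Python sketch computes the triple (e, f, number of primes) for a rational prime p in Q(ζ_m) directly from the stated formulas. It is an illustration of the claimed result, not a proof, and the function names are hypothetical.

```python
from math import gcd


def euler_phi(n):
    """Euler's totient function, by direct counting."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def multiplicative_order(a, n):
    """Order of a in (Z/nZ)^x, assuming gcd(a, n) = 1."""
    if n == 1:
        return 1
    k, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        k += 1
    return k


def cyclotomic_splitting(p, m):
    """Return (e, f, s) for the prime p in Q(zeta_m), using Exercise 2.2(f)."""
    n, pk = m, 1
    while n % p == 0:        # write m = n * p^k with gcd(p, n) = 1
        n //= p
        pk *= p
    e = euler_phi(pk)
    f = multiplicative_order(p, n)
    s = euler_phi(n) // f
    return e, f, s


for p, m in [(2, 7), (3, 12), (5, 5), (7, 20)]:
    e, f, s = cyclotomic_splitting(p, m)
    assert e * f * s == euler_phi(m)   # consistent with Theorem 2.8
    print(p, m, (e, f, s))
```

The assertion checks that the three invariants multiply to the degree ϕ(m) of Q(ζ_m) over Q.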

Exercise 2.3. This exercise outlines another proof of Quadratic Reciprocity (Theorem 1.44) using the splitting of primes in cyclotomic extensions (Exercise 2.2).

(a) Let p be an odd prime. For every d | p − 1, prove that there is a unique subfield $F_d$ of $\mathbb{Q}(\zeta_p)$ of degree d over Q. What is it, in terms of Galois theory?

(b) Let q ≠ p be a different odd prime. Using the result of Exercise 2.2, prove that q is a dth power in (Z/pZ)× if and only if q splits completely in $F_d$.

(c) Prove quadratic reciprocity by considering d = 2 in (b). Can you find a proof of the quadratic character of 2 along these lines?
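Before attempting part (c), it can be reassuring to confirm the statement of quadratic reciprocity numerically. The Python sketch below evaluates the Legendre symbol by Euler's criterion (a standard fact assumed here, not taken from this section) and checks the reciprocity law for pairs of small odd primes. The names `legendre` and `reciprocity_holds` are hypothetical.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    t = pow(a, (p - 1) // 2, p)
    return -1 if t == p - 1 else t


def reciprocity_holds(p, q):
    """Check (p/q)(q/p) = (-1)^(((p-1)/2)((q-1)/2)) for distinct odd primes."""
    sign = (-1) ** (((p - 1) // 2) * ((q - 1) // 2))
    return legendre(p, q) * legendre(q, p) == sign


odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43]
print(all(reciprocity_holds(p, q)
          for p in odd_primes for q in odd_primes if p != q))
```

This only tests finitely many cases and says nothing about the proof strategy of the exercise.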


3 Local Theory

In this chapter we develop the local methods of the theory of algebraic numbers. By this we do not mean any theory surrounding simply the localization of the ring of integers of a number field at a prime. While this does play a small role in the theory, we will instead consider a completion of a number field with respect to an absolute value, in the sense of completing Q to R. The absolute value itself, however, will be (in most cases) obtained from a prime ideal (but in other cases, it will have nothing to do with prime ideals).

This completion will be a large field containing our number field (highly transcendental, though this is immaterial) and it will contain a subring which replaces the localization of the ring of integers at a prime. The reason we study this ring, which seems more complicated, is that the situation with polynomials is actually much simpler, as Hensel’s Lemma (Theorem 3.19) will show.

It is interesting to mention that while there is a great interplay between absolute values on a number field and prime ideals, the theories related to either have, in general, very different flavors. In Section 3.5 we will classify the absolute values on a number field and we will see that the absolute values are actually just a very slight extension of the notion of prime ideal.

Sources

This entire chapter is a blend of inspiration from Neukirch [9], Lang [5], and my own ideas.

3.1 Absolute Values

We begin once again by working very generally.

Definition 3.1. Let K be a field. A function | · | : K → R is called an absolute value if the following three properties are satisfied for all x, y ∈ K:

(1) |x| ≥ 0, equality holding if and only if x = 0;

(2) |xy| = |x||y|;

(3) [Triangle inequality] |x + y| ≤ |x| + |y|.

An absolute value on K is called non-archimedean if, in addition, the following fourth property is satisfied:

(4) [Ultrametric inequality] |x + y| ≤ max{|x|, |y|}.

Property (4) implies property (3), so it is also called the strong triangle inequality. If property (4) does not hold, then we call K archimedean.

If K is a field, then there is always an absolute value, called the trivial absolute value, defined by |x| = 1 if x ≠ 0, and |0| = 0. We will always assume that the absolute values we are dealing with are not this one. An absolute value | · | on a field K induces a metric on K by

d(x, y) = |x − y|.

The trivial absolute value induces the discrete metric. The topology induced by the metric above respects the field operations in the following sense.


Proposition 3.2. Let K be a field with an absolute value | · |. The following four functions are continuous in the topology induced by | · | (give K × K the product topology):

$$(x, y) \mapsto x + y : K \times K \to K;$$

$$x \mapsto -x : K \to K;$$

$$(x, y) \mapsto xy : K \times K \to K;$$

$$x \mapsto x^{-1} : K^\times \to K.$$

Proof. The proof is the same as for R, which is presented in any course on elementary analysis.

Remark. The above proposition is the same as saying | · | makes K into a topological field. One can ask what topological properties we may give these fields, and how they will affect the algebra. It turns out that the most interesting case for us will be when the field is locally compact. Such fields are called local fields. We will give a definition of local field in this chapter, and it will look much stronger than simply requiring that the field be locally compact as a topological field. However, it turns out that the two definitions are equivalent. One recovers the norm using the notion of Haar measure, which we will not develop in detail here.

We classify the local fields up to isomorphism in Chapter 8.

In elementary analysis, one proves that N ⊂ R is not bounded, and this is called the archimedean property. We now justify our use of this term in this context.

Proposition 3.3. Let K be a field with absolute value | · |. Then K is non-archimedean if and only if the set {n · 1 | n ∈ N} is bounded in the metric topology induced by | · |.

Proof. If K is non-archimedean, then since |1| = 1 (why?), we must have |1 + · · · + 1| ≤ 1.

The converse is a nice application of the binomial theorem. Assume N is such that |n| ≤ N for all n ∈ N. Let x, y ∈ K and assume, without loss of generality, that |x| ≥ |y|. If m is an integer with 0 ≤ m ≤ n, then $|x|^m |y|^{n-m} \le |x|^n$. Therefore, by the triangle inequality,

$$|x + y|^n \le \sum_{m=0}^{n} \left|\binom{n}{m}\right| |x|^m |y|^{n-m} \le N(n+1)|x|^n.$$

Taking nth roots gives

$$|x + y| \le N^{1/n}(n+1)^{1/n}|x|.$$

Since this holds for any n, we may take the limit as n → ∞ and obtain |x + y| ≤ |x|. Since x was the larger of the two, we are done.

Definition 3.4. Let K be a field. Two absolute values $|\cdot|_1$ and $|\cdot|_2$ are equivalent if they induce the same topology on K.

We know from the elementary theory of metric spaces that a metric d(·, ·) on a metric space X induces the same topology as the metric $d(\cdot, \cdot)^s$ for all s > 0. The converse is true in this setting.


Proposition 3.5. Let K be a field. Two (non-trivial) absolute values $|\cdot|_1$ and $|\cdot|_2$ are equivalent if and only if there is an s > 0 such that $|\cdot|_1 = |\cdot|_2^s$.

Proof. The backward direction is clear by the remarks above. For the forward direction, we claim that for any absolute value | · | on K, we have |x| < 1 if and only if $x^n \to 0$ as n → ∞. But this is clear since $|x^n| = |x|^n$, which converges to 0 ∈ R as n → ∞ if and only if |x| < 1. Thus, since 0 ∈ K is the only element with absolute value 0, it follows that $x^n \to 0$ if and only if |x| < 1. We therefore have that |x| < 1 is a topological property of x, and hence $|x|_1 < 1$ if and only if $|x|_2 < 1$.

Now let y ∈ K with $|y|_1 > 1$. Such a y exists because $|\cdot|_1$ is non-trivial. Assume x ∈ K is non-zero, and find the a ∈ R with $|x|_1 = |y|_1^a$. Let $n_i/m_i$ be a sequence of rational numbers (with $m_i > 0$) approaching a from above, and $n'_i/m'_i$ be one which approaches a from below. Then $|x|_1 = |y|_1^a < |y|_1^{n_i/m_i}$. Hence, raising to the $m_i$th power, we have $|x^{m_i}/y^{n_i}|_1 < 1$. Thus $|x^{m_i}/y^{n_i}|_2 < 1$, and so $|x|_2 < |y|_2^{n_i/m_i}$. Letting i → ∞ gives $|x|_2 \le |y|_2^a$. Similarly, we may use $n'_i/m'_i$ to get the inequality $|x|_2 \ge |y|_2^a$. Hence, $|x|_2 = |y|_2^a$ and we have transferred the equality across absolute values.

Now this implies

$$\frac{\log |x|_1}{\log |x|_2} = \frac{\log |y|_1}{\log |y|_2}.$$

Define this common value to be s. Then $\log |x|_1 = s \log |x|_2$, so that $|x|_1 = |x|_2^s$. Since $|y|_1$ was assumed to be larger than 1, so is $|y|_2$, and hence s is a quotient of positive numbers. This implies that s > 0, and we are done.
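A quick numerical illustration of the exponent s: on Q, compare the absolute value $p^{-v_p(x)}$ with $\exp(-v_p(x))$, which induce the same topology. The Python sketch below computes the ratio of logarithms for several x and finds the same value log p each time. The two normalizations and the names `vp` and `exponent_s` are choices made for this example, not notation from the notes.

```python
import math
from fractions import Fraction


def vp(a, p):
    """p-adic valuation of a non-zero rational number a."""
    a = Fraction(a)
    if a == 0:
        raise ValueError("the valuation of 0 is not defined here")
    v, num, den = 0, a.numerator, a.denominator
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v


def exponent_s(x, p):
    """Return log|x|_1 / log|x|_2 for x with non-zero valuation."""
    abs1 = float(p) ** (-vp(x, p))     # |x|_1 = p^(-v_p(x))
    abs2 = math.exp(-vp(x, p))         # |x|_2 = exp(-v_p(x))
    return math.log(abs1) / math.log(abs2)


for x in [Fraction(12), Fraction(5, 8), Fraction(2, 7)]:
    print(x, exponent_s(x, 2), math.log(2))
```

The ratio is independent of x, which is the content of the last step of the proof in this special case.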

Theorem 3.6 (Approximation Theorem). Let K be a field and assume K has absolute values $|\cdot|_1, \dots, |\cdot|_n$ which are pairwise inequivalent. Let $\alpha_1, \dots, \alpha_n$ be elements of K. Then for every ε > 0, there is an x ∈ K which satisfies

$$|x - \alpha_i|_i < \varepsilon$$

for all i.

Proof. The proof of the previous proposition shows that, since $|\cdot|_1$ and $|\cdot|_n$ are inequivalent, there is an element α ∈ K with $|\alpha|_1 < 1$ and $|\alpha|_n \ge 1$. Similarly, there is an element β ∈ K with $|\beta|_1 \ge 1$ and $|\beta|_n < 1$. Hence y = β/α has the property that $|y|_1 > 1$ and $|y|_n < 1$.

We claim there is a z ∈ K with $|z|_1 > 1$ and $|z|_i < 1$ for 2 ≤ i ≤ n. We prove this by induction on n. We just saw the case n = 2. Assume we have a $z_0$ with $|z_0|_1 > 1$ and $|z_0|_i < 1$ for 2 ≤ i ≤ n − 1, and let y be as above. We have two cases.

Assume $|z_0|_n \le 1$. If m is large enough, then $z_0^m y$ satisfies the desired inequalities.

Assume now that $|z_0|_n > 1$. Consider the elements $t_m = z_0^m/(z_0^m + 1)$. The $|t_m|_i$ satisfy the desired inequalities for all i with 1 ≤ i ≤ n − 1, and all m. Furthermore, as m → ∞, $|t_m|_1$ and $|t_m|_n$ approach 1, while $|t_m|_i \to 0$ for 2 ≤ i ≤ n − 1. So for m sufficiently large, $t_m y$ satisfies the desired inequalities.

Now we have a z with the desired properties, and we consider again the elements $t_m = z^m/(z^m + 1)$, so that $|t_m|_1 \to 1$ and $|t_m|_i \to 0$ for 2 ≤ i ≤ n. Construct $z_i \in K$ similarly to z, but with $|z_i|_i > 1$ and $|z_i|_j < 1$ for j ≠ i. Consider $t_{im} = z_i^m/(z_i^m + 1)$. Then for sufficiently large m, it is easy to see that

$$x = \sum_{i=1}^{n} \alpha_i t_{im}$$

has the properties stated in the theorem.
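The construction in the proof can be carried out explicitly on Q with the 2-adic and 3-adic absolute values. In the Python sketch below, $z_1 = 3/2$ is large 2-adically and small 3-adically, and $z_2 = 2/3$ is the reverse, so $x = \alpha_1 t_{1m} + \alpha_2 t_{2m}$ should be 2-adically close to $\alpha_1$ and 3-adically close to $\alpha_2$. The specific choice of $z_1$, $z_2$ and the function names are assumptions for this illustration.

```python
from fractions import Fraction


def abs_p(a, p):
    """p-adic absolute value of a rational number, as an exact Fraction."""
    a = Fraction(a)
    if a == 0:
        return Fraction(0)
    v, num, den = 0, a.numerator, a.denominator
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return Fraction(1, p) ** v


def approximate(alpha1, alpha2, m):
    """Return x = alpha1 * t_1m + alpha2 * t_2m with z_1 = 3/2, z_2 = 2/3."""
    z1, z2 = Fraction(3, 2), Fraction(2, 3)
    t1 = z1 ** m / (1 + z1 ** m)   # tends to 1 2-adically, to 0 3-adically
    t2 = z2 ** m / (1 + z2 ** m)   # tends to 0 2-adically, to 1 3-adically
    return alpha1 * t1 + alpha2 * t2


a1, a2 = Fraction(5), Fraction(-1, 7)
x = approximate(a1, a2, 10)
print(float(abs_p(x - a1, 2)), float(abs_p(x - a2, 3)))
```

Increasing m makes both distances smaller, which is how the proof reaches an arbitrary ε.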

Remark. The previous theorem may be regarded as an analogue of the Chinese Remainder Theorem for absolute values. Actually, we will see soon that for a number field, there is a deep relationship between prime ideals and absolute values.

3.2 Completions and the p-adic Numbers

We are about to discuss the local theory of number fields. By this, we mean something different from, but still similar to, the localization of a number field at a prime. We consider instead a larger ring, denoted $O_\mathfrak{p}$ (for $\mathfrak{p}$ a prime). In some sense, the step from the localization to the ring $O_\mathfrak{p}$ is analogous to the step from the localized polynomial ring $k[x]_{(x)}$ (k some field) to the power series ring k[[x]]. This analogy is actually very far reaching.

Let K be a number field and $\mathfrak{p}$ a prime of K. The characteristic of the residue field $O_K/\mathfrak{p}$ is a rational prime p, and $N\mathfrak{p} = p^f$ is its order.

We will define an absolute value on K as follows. Let α ∈ K be non-zero and let (α) be the principal fractional ideal generated by α. Let $\prod \mathfrak{p}^{n_\mathfrak{p}}$ be the factorization of (α) into primes. Here, $n_\mathfrak{p} \in \mathbb{Z}$ and the product is over all primes of K, but it is finite because almost all of the $n_\mathfrak{p}$’s are zero. We define $v_\mathfrak{p}(\alpha) = n_\mathfrak{p}$. Note that $v_\mathfrak{p} : K^\times \to \mathbb{Z}$ is a homomorphism. Finally, we define the $\mathfrak{p}$-adic absolute value by

$$|\alpha|_\mathfrak{p} = (N\mathfrak{p})^{-v_\mathfrak{p}(\alpha)}, \qquad |0|_\mathfrak{p} = 0.$$

Clearly, $|\cdot|_\mathfrak{p}$ satisfies properties (1) and (2) of Definition 3.1. It actually satisfies property (4), and hence also (3). To see this, note that if α, β ∈ K× with $v_\mathfrak{p}(\alpha) > v_\mathfrak{p}(\beta)$, then $v_\mathfrak{p}(\alpha + \beta) = v_\mathfrak{p}(\beta) = \min\{v_\mathfrak{p}(\alpha), v_\mathfrak{p}(\beta)\}$. If instead $v_\mathfrak{p}(\alpha) = v_\mathfrak{p}(\beta)$ and α + β ≠ 0, then $v_\mathfrak{p}(\alpha + \beta) \ge v_\mathfrak{p}(\alpha) = v_\mathfrak{p}(\beta)$. Clearly these statements imply the ultrametric inequality.

If we take K = Q and p ∈ Z a prime, then we can consider the p-adic absolute value $|\cdot|_p$ on Q. If a ∈ Q is non-zero, write $a = p^r b$ with r ∈ Z and b ∈ Q with no power of p occurring in the numerator or denominator of b. Then $|a|_p = 1/p^r$.
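For K = Q the definition is simple enough to compute with directly. The Python sketch below implements the valuation and the p-adic absolute value on rationals with exact arithmetic, then spot-checks multiplicativity, the ultrametric inequality, and the boundedness of the integers. The names `vp` and `abs_p` are hypothetical helpers for this illustration.

```python
from fractions import Fraction


def vp(a, p):
    """p-adic valuation of a non-zero rational number a."""
    a = Fraction(a)
    if a == 0:
        raise ValueError("the valuation of 0 is not defined here")
    v, num, den = 0, a.numerator, a.denominator
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v


def abs_p(a, p):
    """p-adic absolute value |a|_p = p^(-v_p(a)), with |0|_p = 0."""
    a = Fraction(a)
    if a == 0:
        return Fraction(0)
    return Fraction(1, p) ** vp(a, p)


x, y = Fraction(3, 4), Fraction(5, 12)
assert abs_p(x * y, 2) == abs_p(x, 2) * abs_p(y, 2)          # property (2)
assert abs_p(x + y, 2) <= max(abs_p(x, 2), abs_p(y, 2))      # property (4)
assert all(abs_p(n, 5) <= 1 for n in range(1, 200))          # integers are bounded
print(abs_p(Fraction(12), 2), abs_p(Fraction(5, 9), 3))
```

The last assertion is the boundedness criterion of Proposition 3.3 in this example.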

Let us now define the field $K_\mathfrak{p}$ of $\mathfrak{p}$-adic numbers. The definition is simple. It is essentially the same process as the completion of Q to R: topologically, $K_\mathfrak{p}$ is the metric space completion of K under the metric induced by the $\mathfrak{p}$-adic absolute value. Recall that this consists of equivalence classes of Cauchy sequences, two sequences being equivalent if the distance between their nth terms approaches zero as n → ∞. Because K is a field, this is the same as the componentwise difference of the sequences converging to zero. The addition and multiplication of the equivalence classes is done componentwise.

It is worthwhile to describe this process algebraically. We can take the ring of all sequences in K which are Cauchy with respect to the $\mathfrak{p}$-adic absolute value. Then we can take the quotient by the maximal ideal of null sequences (i.e. sequences which converge to zero). The field obtained is the same as the one just described.

In either construction, the absolute value of an element of $K_\mathfrak{p}$ is the limit of the absolute values of the entries of any Cauchy sequence which represents this element. K embeds into $K_\mathfrak{p}$ by assigning to any element α ∈ K the constant sequence {α, α, . . . }. The absolute value on $K_\mathfrak{p}$ clearly restricts to the $\mathfrak{p}$-adic absolute value on K.
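To see a Cauchy sequence in this metric, consider the partial sums 1, 1 + 2, 1 + 2 + 4, ... in Q with the 2-adic absolute value. They diverge in R, but the Python sketch below shows that their 2-adic distance to −1 shrinks like $2^{-n}$. The example and the names `v2`, `partial_sum`, and `dist2` are chosen for illustration only.

```python
def v2(n):
    """2-adic valuation of a non-zero integer n."""
    if n == 0:
        raise ValueError("v2(0) is not defined here")
    v = 0
    while n % 2 == 0:
        n //= 2
        v += 1
    return v


def partial_sum(n):
    """s_n = 1 + 2 + 4 + ... + 2^(n-1)."""
    return sum(2 ** k for k in range(n))


def dist2(a, b):
    """2-adic distance |a - b|_2 between two integers, as a float."""
    return 0.0 if a == b else 2.0 ** (-v2(a - b))


for n in range(1, 8):
    print(n, partial_sum(n), dist2(partial_sum(n), -1))
```

Here the limit −1 already lies in Q, so this sequence represents the constant sequence {−1, −1, . . . } in the completion.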

We now choose to study a more general class of fields than just the fields Kp.

Definition 3.7. Let K be a field. A valuation is a function v : K× → R such that exp ∘ (−v) is an absolute value on K. A valuation is discrete if its image in R is discrete. We say that v is non-archimedean if it induces a non-archimedean absolute value. Equivalently, v(a + b) ≥ min{v(a), v(b)} for a, b, a + b ∈ K×.

Let K be a field which is complete with respect to a non-archimedean absolute value given by a discrete valuation. We define O = {α ∈ K | |α| ≤ 1}. It is easy to check that this is a subring of K. It is closed under addition by the ultrametric inequality. The units in O are clearly the elements with absolute value 1; their inverses must still have absolute value 1. But the complement of the units in O, which we denote $\mathfrak{p}$ = {α ∈ K | |α| < 1}, is once again closed under addition by the ultrametric inequality, and also under multiplication by elements in O. It is therefore an ideal, and since its complement in O consists of the units of O, it follows immediately that O is local with maximal ideal $\mathfrak{p}$.

Definition 3.8. A non-archimedean field is a local field if it is complete with respect to adiscrete non-archimedean valuation and has finite residue field.

Remark. There is a definition for archimedean fields, and it is the same, but without theword “non-archimedean”. In either case, this definition is extremely redundant. In fact,as we mentioned, one can prove that every locally compact topological field is local, andvice-versa.

We now assume K is local and that its residue field has p^f elements with p the characteristic. We normalize the absolute value on K so that an element with the largest absolute value smaller than 1 (which exists by discreteness of the valuation) has absolute value p^{-f}.

Proposition 3.9. O is a discrete valuation ring.

Proof. By Proposition 2.13, it suffices to prove that O is principal and that p is the only prime. Let π ∈ p be an element with largest absolute value, so that |π| = p^{-f}. We claim that every element of K× can be written uniquely as uπ^m for some u ∈ O× and some m ∈ Z. To see this, let α ∈ K× have absolute value p^{-fm} with m ∈ Z (all elements of K× are like this). Then |π^{-m}α| = 1, so π^{-m}α is a unit, call it u. We then find that α = uπ^m. Uniqueness is clear.

Let a ⊂ O be a non-zero ideal, and choose an element α = uπ^m ∈ a with largest absolute value. Then (α) ⊂ a. We claim the opposite inclusion holds. In fact, we wish to claim something stronger, namely that (α) ⊃ {β ∈ O | |β| ≤ |π^m|}. This set clearly contains a, so we will have what we wanted. To prove this inclusion, let β ∈ O be non-zero with |β| = p^{-fn} and n ≥ m. Then there is a v ∈ O× with β = vπ^n. Hence β = (v/u)π^{n−m}α ∈ (α). Thus a = (α), as desired. Therefore O is principal. But we have proved more, namely that every non-zero ideal is of the form (π^n) for some integer n ≥ 0. Clearly p = (π), so that every non-zero ideal in O is of the form p^n. Since (π^n) is obviously not prime for n ≥ 2, p is the unique non-zero prime in O.

Corollary 3.10. K× ∼= O× × Z.

Proof. This follows from the representation α = uπ^m described in the proof above. The isomorphism is α ↦ (u, m).
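The decomposition α = uπ^m can be tried out on rational numbers, which are dense in Q_p, taking π = p. The following is only a sketch under that assumption; the name `decompose` is hypothetical.

```python
from fractions import Fraction

def decompose(x, p):
    """Return (u, m) with x = u * p**m and u a p-adic unit, for a non-zero rational x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("0 is not an element of the multiplicative group")
    m = 0
    while x.numerator % p == 0:      # strip factors of p from the numerator
        x /= p
        m += 1
    while x.denominator % p == 0:    # strip factors of p from the denominator
        x *= p
        m -= 1
    return x, m
```

The map x ↦ (u, m) is multiplicative: the unit parts multiply and the exponents add, which is the content of the isomorphism in the corollary.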

We now wish to describe the additive structure of K and O. We will describe it for O, and the structure of K will follow from the fact that Frac(O) = K (prove this if you have not).

Proposition 3.11. A series ∑ α_n in K converges if and only if the sequence {α_n} converges to 0.

Proof. The forward direction is obvious. For the converse, let a_n be the nth partial sum of the series ∑ α_n. Let ε > 0. Then for n sufficiently large and for m > n, we have |α_m| < ε.

The ultrametric inequality implies

|a_m − a_n| = |α_{n+1} + α_{n+2} + · · · + α_m| ≤ max{|α_{n+1}|, . . . , |α_m|} < ε.

So the sequence {a_n} is Cauchy and therefore converges since K is complete.
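A standard example of this criterion is the series 0! + 1! + 2! + · · · in Q_p: its terms tend to 0 p-adically, so it converges. Working modulo p^k, convergence shows up as the partial sums becoming constant. The sketch below (with the hypothetical name `factorial_partial_sums`) computes those residues.

```python
def factorial_partial_sums(p, k, terms):
    """Partial sums of 0! + 1! + 2! + ... reduced modulo p**k."""
    modulus = p ** k
    sums, total, fact = [], 0, 1
    for n in range(terms):
        if n > 0:
            fact *= n                      # fact == n!
        total = (total + fact) % modulus   # n-th partial sum modulo p**k
        sums.append(total)
    return sums
```

With p = 3 and k = 5, every n! with n ≥ 12 is divisible by 3^5, so the list returned is constant from index 11 onwards.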

Proposition 3.12. Let π be a prime element in O, or, equivalently, a generator of the unique prime ideal p. Let n = p^f − 1 and let u_0 = 0, u_1, . . . , u_n be representatives in O of the residue field O/p. Then every element α in O is equal to a unique convergent series of the form

α = u_{i_0} + u_{i_1} π + u_{i_2} π^2 + · · ·

with 0 ≤ i_k ≤ n for all k ≥ 0 (any series like this is convergent by Proposition 3.11). Similarly, every β ∈ K is equal to a unique convergent series

β = u_{i_m} π^m + u_{i_{m+1}} π^{m+1} + · · ·

for some m ∈ Z.

Proof. Let α ∈ O. First of all, since O is a discrete valuation ring, we know that p^r/p^{r+1} ≅ O/p as O-modules by the map x ↦ π^r x (see the proof of Theorem 2.8). This implies that α_1 = α − u_{i_0} ∈ p for some i_0, that α_1 − u_{i_1} π ∈ p^2 for some i_1, and so on. In this way, we get a sequence

a_0 = u_{i_0}, a_1 = u_{i_0} + u_{i_1} π, a_2 = u_{i_0} + u_{i_1} π + u_{i_2} π^2, a_3 = · · ·

such that α − a_t ∈ p^t for integers t ≥ 0, i.e. |α − a_t| ≤ p^{-ft}. Therefore a_t → α.

Now we need to show uniqueness. If

∑_{j=0}^∞ u_{i_j} π^j = ∑_{j=0}^∞ u_{i'_j} π^j

then we may subtract term by term:

0 = ∑ (u_{i_j} − u_{i'_j}) π^j.

If we had i_j ≠ i'_j for some j, then |u_{i_j} − u_{i'_j}| = 1 because the residue class of u_{i_j} − u_{i'_j} modulo p is non-zero. Thus |(u_{i_j} − u_{i'_j}) π^j| = |π^j|. Hence if j_0 is the smallest j with i_j ≠ i'_j, then the j_0 term has strictly larger absolute value than every later term, so |∑ (u_{i_j} − u_{i'_j}) π^j| = |π^{j_0}|. But this sum is zero, a contradiction. This proves uniqueness.

As for β ∈ K, let m be such that |β| = p^{-fm}. Then π^{-m}β ∈ O, so that π^{-m}β equals a unique series of the above form. Multiplying this series by π^m yields the result.
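The proof above is constructive: one reads off the residue modulo p, subtracts its representative, divides by π, and repeats. Here is a sketch of that procedure for K = Q_p, π = p and representatives 0, . . . , p − 1, applied to rational numbers whose denominator is prime to p. The name `padic_digits` is ours, and `pow(b, -1, p)` (modular inverse) assumes Python 3.8 or later.

```python
from fractions import Fraction

def padic_digits(x, p, count):
    """First `count` digits a_0, a_1, ... of x = sum a_i * p**i, for x rational
    with denominator prime to p."""
    x = Fraction(x)
    if x.denominator % p == 0:
        raise ValueError("x must have denominator prime to p")
    digits = []
    for _ in range(count):
        # Residue of x modulo p: numerator times the inverse of the denominator.
        d = (x.numerator * pow(x.denominator, -1, p)) % p
        digits.append(d)
        x = (x - d) / p      # x - d lies in (p), so the quotient is again p-integral
    return digits
```

For example, `padic_digits(-1, 5, 4)` gives `[4, 4, 4, 4]`, reflecting −1 = 4 + 4·5 + 4·5^2 + · · · in Q_5, and `padic_digits(Fraction(1, 3), 5, 5)` gives `[2, 3, 1, 3, 1]`.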

Let us discuss briefly the topology of K. We have

Proposition 3.13. O and O× are compact.

Proof. Let {a_n} be a sequence in O. Let u_0, . . . , u_n and π be as in Proposition 3.12 and let i_{n,j} be such that u_{i_{n,j}} is the jth coefficient of the series representation of a_n. Then for some i, i_{n,0} = i for infinitely many n. Let a_{n_0} be the subsequence of the a_n with i_{n,0} = i. Repeat this for i_{n_0,1} to get a smaller subsequence a_{n_1}, and continue. We obtain sequences

a_{0_0}, a_{1_0}, a_{2_0}, . . .

a_{0_1}, a_{1_1}, a_{2_1}, . . .

a_{0_2}, a_{1_2}, a_{2_2}, . . .

⋮

The diagonal terms a_{i_i} form the desired subsequence: from the kth term on, all of them share their coefficients of index at most k, so the subsequence is Cauchy and converges in O. So O is compact. O× is compact because it is a closed subset of O (it is the unit circle).

Corollary 3.14. K is locally compact.

Proof. The desired compact neighborhood about any point x ∈ K can be taken to be x + O, which is both open and closed (why?).

Let us give another characterization of O. This one is not too important for the sequel, and so the reader who is not familiar with the following construction may want to skip it.

Let

A_1 ←[f_1] A_2 ←[f_2] A_3 ←[f_3] · · ·

be a decreasing sequence of groups (or rings, modules, etc.) with homomorphisms f_i : A_{i+1} → A_i between them. Then the group

lim←− A_i = { (a_i) ∈ ∏_{i=1}^∞ A_i | a_i = f_i(a_{i+1}) for all i }

is called the inverse limit of the groups A_i. This is a well defined group, and for all i, there are natural homomorphisms ψ_i : lim←− A_i → A_i (the coordinate projections) which satisfy f_i ∘ ψ_{i+1} = ψ_i. Furthermore, if the A_i are topological groups, i.e. groups which are topological spaces such that multiplication and inversion are continuous, and the f_i are continuous, then so is lim←− A_i. It inherits its topology from the product topology on ∏ A_i. The reader who is not familiar with these things should either prove these statements or consult a text which treats inverse limits.

Proposition 3.15. There is an isomorphism

O ≅ lim←−_{n∈N} O/p^n

which is also a homeomorphism when the O/p^n are given the discrete topology.

Proof. To define the isomorphism O → lim←− O/p^n, we simply take the projection of O onto the components O/p^n. This is clearly a well defined homomorphism. It is also injective because its kernel is ⋂_n p^n = 0.

For the surjectivity, let (a_1, a_2, . . . ) ∈ lim←− O/p^n. Let u_0, . . . , u_n and π be as in Proposition 3.12. Then it is easy to show that each a_n is represented uniquely by a sum of the form

u_{i_0} + u_{i_1} π + u_{i_2} π^2 + · · · + u_{i_{n−1}} π^{n−1}.

Furthermore, since the projection O/p^m → O/p^n (for m ≥ n) will simply delete the terms of this sum after the (n − 1)st, the u_{i_j}’s are independent of n. Thus the element

u_{i_0} + u_{i_1} π + u_{i_2} π^2 + · · ·

maps to (a_1, a_2, . . . ).

Now we need to show that this bijection is a homeomorphism. We will show that this map establishes a one-to-one correspondence between a base of neighborhoods of each space. In fact, it is enough to specify bases of neighborhoods about the zero elements (why?). A base of neighborhoods about 0 ∈ O is given by the ideals p^n since these are the balls around zero. A base of neighborhoods about 0 ∈ lim←− O/p^n is given by the intersections of lim←− O/p^n with the sets

∏_{i=1}^{n−1} {0} × ∏_{i=n}^∞ O/p^i

(this follows from a general fact about inverse limits). These are obviously in one-to-one correspondence via the map given above.
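For O = Z_p, an element of the inverse limit is just a sequence of residues modulo p, p^2, p^3, . . . that are compatible under reduction. The sketch below builds the first few residues of a rational number with denominator prime to p and checks the compatibility condition a_n = f_n(a_{n+1}). The names `coherent_sequence` and `is_coherent` are hypothetical, and the modular inverse via `pow(b, -1, m)` assumes Python 3.8 or later.

```python
from fractions import Fraction

def coherent_sequence(x, p, depth):
    """Images of x in Z/p, Z/p**2, ..., Z/p**depth (x rational, denominator prime to p)."""
    x = Fraction(x)
    return [(x.numerator * pow(x.denominator, -1, p ** n)) % p ** n
            for n in range(1, depth + 1)]

def is_coherent(seq, p):
    """Check a_n = f_n(a_{n+1}), where f_n is the reduction Z/p**(n+1) -> Z/p**n."""
    # seq[n] lives in Z/p**(n+1); reducing it modulo p**n must give seq[n - 1].
    return all(seq[n] % p ** n == seq[n - 1] for n in range(1, len(seq)))
```

For example, −1 gives the sequence `[2, 8, 26, 80]` for p = 3, while `[2, 5, 26]` fails the check because 26 reduces to 8, not 5, modulo 9.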

Let K once again be a number field.

Proposition 3.16. O_p is the (topological) closure of O_K in K_p.

Proof. Any Cauchy sequence in O_K has elements with p-adic absolute value smaller than or equal to 1. Thus, so does the element to which it converges in K_p.

Conversely, let {a_n/b_n} be a Cauchy sequence whose limit α has |α| ≤ 1. In fact, let |α| = p^{-fm} with m ≥ 0 an integer. Furthermore, assume that v_p(b_n) = 0 (which can be achieved by multiplying the numerator and denominator of a_n/b_n by an element of p\p^2 a suitable number of times). First of all, we know that since the image of the p-adic absolute value on K is discrete, the numbers |a_n/b_n|_p must stabilize, and hence do so at p^{-fm}. Find a solution c_n ∈ O_K to the congruence c_n b_n ≡ a_n (mod p^n) for n > m. This is possible since v_p(b_n) = 0 so that b_n ≢ 0 (mod p). Then v_p(c_n − a_n/b_n) ≥ n, and hence |c_n − a_n/b_n|_p → 0 as n → ∞. This means that c_n → lim(a_n/b_n) = α. Since c_n ∈ O_K, we are done.

Proposition 3.17. We have a canonical isomorphism O_p/p ≅ O_K/p.

Proof. Consider the map O_K → O_p → O_p/p. This map obviously has kernel p ⊂ O_K and so we just need to show it is surjective. Let x ∈ O_p and let {a_n} be a sequence in O_K converging to x. Then for sufficiently large n we have |a_n − x| ≤ p^{-f} in O_p. Therefore a_n ≡ x (mod p). Thus, for n sufficiently large, a_n represents x modulo p, so the map above is surjective.

Let us now consider the case K = Q. Let p ∈ Z be a prime. The ring O_(p) in this case is denoted Z_p and is called the ring of p-adic integers. The field K_(p) is denoted Q_p, the field of p-adic numbers. We have the following proposition, which describes the structure of Z_p.

Proposition 3.18. The ring Z_p is isomorphic to the ring of all power series

∑_{i=0}^∞ a_i p^i

with a_i ∈ {0, . . . , p − 1}, and the addition and multiplication given like in base-p, by carrying. In other words,

Z_p ≅ Z[[x]]/(x − p).

The field Q_p is similar, but with series of the form

∑_{i=−n}^∞ a_i p^i

for some n ≥ 0.

Proof. Exercise; this is a special case of Proposition 3.12.
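The phrase “by carrying” can be made concrete on truncated expansions. The following sketch adds two digit lists (lowest power of p first) exactly as in base-p arithmetic; the name `add_digits` is ours, and truncating to a fixed number of digits amounts to working in Z/p^n.

```python
def add_digits(a, b, p):
    """Add two truncated p-adic expansions given as equal-length digit lists,
    lowest power of p first."""
    result, carry = [], 0
    for x, y in zip(a, b):
        carry, digit = divmod(x + y + carry, p)   # carry moves to the next power of p
        result.append(digit)
    return result   # the final carry is dropped: we work modulo p**len(a)
```

For instance, adding the expansions of 1 and −1 in Z_5, `add_digits([1, 0, 0, 0], [4, 4, 4, 4], 5)`, gives `[0, 0, 0, 0]`: the carry propagates forever, which is why −1 has all digits equal to p − 1.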

3.3 Hensel’s Lemma

Here K is a non-archimedean local field, O its valuation ring, and p the prime of O. We also set the notation k = O/p.

Recall that a polynomial with coefficients in a unique factorization domain is called primitive if its coefficients do not all share a prime factor. For f ∈ O[x], this means that one of the coefficients of f is a unit.

Theorem 3.19 (Hensel’s Lemma). Let f ∈ O[x] be primitive. Assume f admits a factorization modulo p,

f̄(x) = u(x)v(x) ∈ k[x]

where f̄ is the reduction of f modulo p, and u, v ∈ k[x] are relatively prime. Then there are polynomials g, h ∈ O[x] such that f = gh, ḡ(x) = u(x), h̄(x) = v(x), and deg(g) = deg(u).

Proof. The idea is to approximate the polynomials g, h and take the limit. Let us set some notation. Let d = deg(f) and m = deg(u) so that deg(v) ≤ d − m. Let g_0, h_0 ∈ O[x] have ḡ_0 = u and h̄_0 = v, and deg(g_0) = m and deg(h_0) ≤ d − m. We know, since u, v are relatively prime, that there are polynomials a, b ∈ O[x] with ag_0 + bh_0 ≡ 1 (mod p). So we have two polynomials in p[x], namely f − g_0h_0 and ag_0 + bh_0 − 1. Among their coefficients, pick one with highest absolute value, say π. (This may not be a prime element!) Then f ≡ g_0h_0 (mod π).

We will look now for polynomials p_i, q_i with deg(p_i) < m and deg(q_i) ≤ d − m such that, if we define

g_n = g_0 + p_1π + · · · + p_nπ^n

h_n = h_0 + q_1π + · · · + q_nπ^n

then f ≡ g_{n−1}h_{n−1} (mod π^n) and deg(g_n) = m. Then we will take the limit (i.e. the limit of the coefficients as n → ∞) and obtain our result. The degree of g will be correct because it will be the degree of g_0 since deg(p_i) < m.

Assume we have polynomials p_i, q_i as above for 1 ≤ i ≤ n − 1. We search for p_n, q_n. Since g_n = g_{n−1} + p_nπ^n and h_n = h_{n−1} + q_nπ^n, the condition f ≡ g_nh_n (mod π^{n+1}) means that we want to find p_n, q_n with

f − g_{n−1}h_{n−1} ≡ π^n(g_{n−1}q_n + h_{n−1}p_n) (mod π^{n+1}).

Let f_n = π^{-n}(f − g_{n−1}h_{n−1}) ∈ O[x]. Then this is equivalent to

f_n ≡ g_{n−1}q_n + h_{n−1}p_n ≡ g_0q_n + h_0p_n (mod π).

On the other hand, since ag_0 + bh_0 − 1 ≡ 0 (mod π), we obtain

ag_0f_n + bh_0f_n ≡ f_n (mod π).

Write

bf_n = qg_0 + p_n

so that deg(p_n) < m. This will be the p_n we are looking for. The fact that ḡ_0 = u and that g_0 and u have the same degree shows that the leading coefficient of g_0 is a unit. Hence the quotient q has coefficients in O. Therefore,

f_n ≡ ag_0f_n + bh_0f_n ≡ g_0(af_n + qh_0) + p_nh_0 (mod π).

Omit all coefficients divisible by π from the polynomial af_n + qh_0 and call the result q_n. It is easy to check (do so) that p_n, q_n have the desired properties.

The following corollary also bears Hensel’s name.

Corollary 3.20 (Hensel’s Lemma). Let f ∈ O[x] be primitive. Assume f̄ has a root α in k at which the derivative f̄′ does not vanish. Then f has a root a in O with a ≡ α (mod p).
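For K = Q_p and f with integer coefficients, the root in Corollary 3.20 can be approximated explicitly. The sketch below uses Newton's iteration r ↦ r − f(r)/f′(r) modulo increasing powers of p, which is a common way to carry out the lifting and not the successive approximation used in the proof of Theorem 3.19. All function names are ours, and `pow(b, -1, m)` assumes Python 3.8 or later.

```python
def poly_eval(coeffs, x, mod):
    """Evaluate a polynomial (coefficients listed from highest degree) at x modulo mod."""
    acc = 0
    for c in coeffs:
        acc = (acc * x + c) % mod      # Horner's rule
    return acc

def poly_deriv(coeffs):
    """Coefficients of the formal derivative, highest degree first."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]

def hensel_lift(coeffs, root, p, k):
    """Lift a simple root of f modulo p to a root of f modulo p**k."""
    dcoeffs = poly_deriv(coeffs)
    if poly_eval(coeffs, root, p) != 0:
        raise ValueError("root is not a root of f modulo p")
    if poly_eval(dcoeffs, root, p) == 0:
        raise ValueError("the derivative vanishes at the root modulo p")
    r, mod = root % p, p
    for _ in range(1, k):
        mod *= p
        # Newton step modulo the next power of p; f'(r) is a unit there.
        inv = pow(poly_eval(dcoeffs, r, mod), -1, mod)
        r = (r - poly_eval(coeffs, r, mod) * inv) % mod
    return r
```

For example, x^2 − 2 has the simple root 3 modulo 7, and `hensel_lift([1, 0, -2], 3, 7, 6)` returns a square root of 2 modulo 7^6 which is congruent to 3 modulo 7. This shows that 2 is a square in Z_7.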

Corollary 3.21. Let q = p^f be the order of k. Then O contains the (p^f − 1)th roots of unity and, along with 0, these form a complete set of coset representatives of O modulo p.

Proof. The polynomial x^{q−1} − 1 splits in k[x] and k× consists of its roots, all of which are simple. Thus all of the roots of x^{q−1} − 1 in k lift to roots in O, and consequently they are all incongruent modulo p. This proves the corollary.
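For O = Z_p these representatives (often called Teichmüller representatives, a name not used in these notes) can be computed modulo p^k without Hensel lifting: raising to the pth power improves a congruence modulo p^j to one modulo p^{j+1}, so a^{p^{k−1}} is already a (p − 1)th root of unity modulo p^k. A sketch, with the hypothetical name `teichmuller`:

```python
def teichmuller(a, p, k):
    """Element of Z/p**k which is a (p-1)th root of unity (or 0) and is
    congruent to a modulo p."""
    return pow(a, p ** (k - 1), p ** k)
```

For p = 5 and k = 4 the values `teichmuller(a, 5, 4)` for a = 0, . . . , 4 are pairwise incongruent modulo 5, and the four non-zero ones satisfy w^4 ≡ 1 (mod 625).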

Corollary 3.22. Let f(x) = a_nx^n + · · · + a_1x + a_0 ∈ K[x] be irreducible with a_n ≠ 0 and a_0 ≠ 0 (this last condition is, of course, automatically true). Then

max{|a_0|, |a_1|, . . . , |a_n|} = max{|a_0|, |a_n|}.

In particular, if f is monic, then a_0 ∈ O if and only if f ∈ O[x].

Proof. Denote max{|a_0|, |a_1|, . . . , |a_n|} = |f| (this notation is standard). After multiplying by a suitable power of a prime element, we may assume that |f| = 1 and f ∈ O[x]. Let r be smallest with |a_r| = 1, so that

f ≡ (a_nx^{n−r} + · · · + a_{r+1}x + a_r)x^r (mod p).

If max{|a_0|, |a_n|} < 1, then r ≠ 0 and r ≠ n. The two factors above are relatively prime modulo p, since the first has non-zero constant term. So Hensel’s Lemma, applied with u = x^r, gives a factor of f of degree r with 0 < r < n. This contradicts the irreducibility of f, proving the corollary.

3.4 Extensions of Local Fields

Theorem 3.23. Let K be non-archimedean and local with respect to an absolute value | · |, and let L/K be an algebraic extension. Then | · | extends uniquely to a non-archimedean absolute value of L. If L/K has degree n, then | · | is given by the formula

|α| = |Nm_{L/K}(α)|^{1/n}.

In this case, the field L is complete under this absolute value.

Proof. If the extension L/K is infinite, it is still the union of its finite subextensions. Therefore, the theorem for finite extensions implies the theorem for infinite extensions. So we will assume L/K is finite of degree n.

Let O be the valuation ring of K and O′ the integral closure of O in L. We claim first that O′ = {α ∈ L | Nm_{L/K}(α) ∈ O}. The inclusion ⊂ is obvious. For the other inclusion, let Nm_{L/K}(α) ∈ O and let f(x) = x^d + · · · + a_0 be the minimal polynomial of α. Then m = n/d is an integer and ±Nm_{L/K}(α) = a_0^m, which is thus in O. Hence a_0 ∈ O, so f ∈ O[x] by Corollary 3.22, i.e. α ∈ O′.

Now we define the absolute value on L by the formula in the theorem. It clearly has the property |α| = 0 if and only if α = 0, and is also multiplicative by definition. For the strong triangle inequality, we notice that the inequality |α + β| ≤ max{|α|, |β|} is, after dividing by the larger of α or β, equivalent to |α + 1| ≤ 1 for |α| ≤ 1. This, in turn, is equivalent to |Nm_{L/K}(α + 1)| ≤ 1, or, Nm_{L/K}(α + 1) ∈ O, for |α| ≤ 1, i.e. for Nm_{L/K}(α) ∈ O. Hence the strong triangle inequality is equivalent to (α + 1) ∈ O′ whenever α ∈ O′, which holds trivially. This argument also shows that O′ is the valuation ring for L. Finally, the fact that it extends the given valuation on K is clear.

For the uniqueness, let | · |_0 be another non-archimedean absolute value extending the one on K, let O_0 be the corresponding valuation ring, and P_0 its maximal ideal. Also, let P′ be the maximal ideal of O′. We claim that O′ ⊂ O_0. Assume not, and let α ∈ O′ but α ∉ O_0. Then α^{-1} ∈ P_0. Let f(x) = x^d + a_{d−1}x^{d−1} + · · · + a_0 be the minimal polynomial of α, which has coefficients in O since α ∈ O′. Then dividing the equation f(α) = 0 by α^d yields

1 = −a_{d−1}α^{-1} − · · · − a_0(α^{-1})^d.

This shows that 1 ∈ P_0, which is a contradiction.

We conclude that |α| ≤ 1 implies |α|_0 ≤ 1. We claim that this implies that | · | and | · |_0 are equivalent. For if not, then the Approximation Theorem (Theorem 3.6) implies that there is a β ∈ L with |β| ≤ 1 and |β|_0 > 1, a contradiction. Hence | · |_0 = | · |^s for some s ∈ R>0 (Proposition 3.5). But s = 1 since | · | and | · |_0 agree on K.

To show completeness, we refer to the next proposition. It is a basic proposition, usually proved for the field R in a course which treats the elementary theory of Banach spaces. The proof is the same for K, and we leave it as an exercise.

Proposition 3.24. Let K be a complete field under the absolute value | · | and V a normed K-vector space of dimension n. Then any norm on V is equivalent to the maximum norm, given by

‖c_1α_1 + · · · + c_nα_n‖ = max{|c_1|, . . . , |c_n|}

for a basis α_1, . . . , α_n of V. In particular, V is complete.

Note that the valuation on K extends by the formula

v(α) = (1/n) v(Nm_{L/K}(α)).
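As an illustration of this formula, take L = Q_p(√d) with d not a square in Q_p, so that n = 2 and the norm of a + b√d is a^2 − db^2. The sketch below evaluates v on elements with rational a, b; the names `vp` and `v_quadratic` are ours, and the assumption that d is a non-square in Q_p is not checked by the code.

```python
from fractions import Fraction

def vp(x, p):
    """p-adic valuation of a non-zero rational number."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("the valuation is only defined on non-zero elements")
    v, num, den = 0, x.numerator, x.denominator
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

def v_quadratic(a, b, d, p):
    """Valuation of a + b*sqrt(d) in Q_p(sqrt(d)), computed as (1/2) * v_p(norm).
    Assumes d is not a square in Q_p, so the extension has degree 2."""
    norm = Fraction(a) ** 2 - d * Fraction(b) ** 2
    return Fraction(vp(norm, p), 2)
```

For p = 2 and d = 2 this gives v(√2) = 1/2, so the value group grows from Z to (1/2)Z, as one expects for a ramified quadratic extension, while elements of Q keep their 2-adic valuation.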

Proposition 3.25 (Krasner’s Lemma). Let K be non-archimedean and local. Let α, β be two elements of the algebraic closure of K and assume α is separable over K(β). Assume that for all embeddings σ of K(α) into the algebraic closure, different from the identity, we have

|β − α| < |σα − α|.

Here, of course, the absolute value on K(α, β) is the unique one, extending those on K, K(α), and K(β). Then actually K(α) ⊂ K(β).

Proof. Let τ be an embedding of K(α, β) into the algebraic closure which fixes K(β). We need to show that τ fixes α. Let σ be any non-identity embedding of K(α) into the algebraic closure. Looking inside of the Galois closure of K(α, β) over K, with its unique absolute value, we find

|τβ − τα| = |β − α| < |σα − α|

because obviously |τ(·)| = | · |. Hence, since τβ = β,

|τα − α| = |τα − β + β − α| ≤ max{|τα − β|, |β − α|} < |σα − α|.

Hence τα ≠ σα. Thus τ|_{K(α)} must be the identity, which is what we wanted to show.

Proposition 3.26. Let K be local and non-archimedean and let f ∈ K[x] be a separable monic polynomial of degree n. Let |f| denote the maximum of the absolute values of the coefficients of f. Assume f has a factorization

f(x) = ∏ (x − α_j)

in an algebraic closure of K. Let {f_i} be a sequence of monic degree n polynomials in K[x] with |f_i − f| → 0. Let f_i = ∏ (x − α_{ij}) be a factorization of f_i. Then, after reordering the j’s for each i, α_{ij} → α_j for all j as i → ∞.

Proof. First, let g ∈ K[x] be a monic polynomial of degree n. Let m_g denote the largest absolute value of the roots of g. We claim that m_g ≤ |g|^{1/(n−i)} for some i with 0 ≤ i < n. Let g(x) = x^n + a_{n−1}x^{n−1} + · · · + a_0 and let α be a root of g. Then the equation g(α) = 0 yields

|α^n| = |a_{n−1}α^{n−1} + . . . + a_1α + a_0|.

Thus, upon applying the ultrametric inequality,

|α^n| ≤ |a_iα^i|

for some i < n and hence

|α^{n−i}| ≤ |a_i| ≤ |g|,

which proves the claim. In particular, if |g| is extremely small, then so are the roots.

Now let β_i be a root of f_i. Then |f(β_i)| = |f(β_i) − f_i(β_i)| → 0 because the β_i’s are in a bounded set (why?). This shows that the roots of the f_i’s become arbitrarily close to those of f. Now we need to show that all roots of f are limits of roots of the f_i’s. Since f is separable, this will prove that every root is a limit of roots of the f_i’s exactly once, and this will prove the proposition.

Let α be a root of f. Assume that there is no sequence of roots α_{ij} of the f_i’s converging to α. Then the roots of the f_i converge to different roots of f. But |f_i(α)| = |f_i(α) − f(α)| → 0. Let d be the distance of α from the other roots α_j of f. Then |α − α_{ij}| ≥ d/2 for sufficiently large i, and hence

|f_i(α)| = ∏ |α − α_{ij}| ≥ d^n / 2^n.

This contradicts |f_i(α)| → 0, proving the proposition.

This can be done without the assumption that f be separable, but the only case we need is when f is separable. However, see Exercise 3.1.

Proposition 3.27. Let K be local and non-archimedean and f, g ∈ K[x] have the same degree n. Assume f is irreducible and separable. Then there is an ε such that if |f − g| < ε, then g is irreducible. Let α_i be the roots of f and β_i the roots of g. Then furthermore, ε may be chosen so that, after rearranging the β_i’s, we also have K(α_i) = K(β_i) for all i.

Proof. By the previous proposition, if ε is sufficiently small, then, after rearrangement, |α_i − β_i| < |α_i − α_j| and |α_i − β_i| < |β_i − β_j| for all i ≠ j. Krasner’s Lemma immediately implies K(β_i) ⊂ K(α_i) and also K(α_i) ⊂ K(β_i). So K(α_i) = K(β_i), and this itself implies that β_i has degree n. Thus g is, up to a constant factor, the minimal polynomial of β_i, and hence g is irreducible.

The main point of the above proposition for us is the following.

Proposition 3.28. Let K be a number field and p a prime of K. Let F/K_p be a finite extension of degree n. Then there is an extension field L of K such that [L : K] = n and F = L_P for some prime P lying over p. In other words, every extension of a p-adic field is P-adic.

Proof. Let α ∈ F be such that F = K_p(α) and let f be the minimal polynomial over K_p of α. Then f is irreducible with coefficients in K_p. Approximate very closely the coefficients of f by elements of K to get a new polynomial g ∈ K[x]. Then by the previous proposition, a fine enough approximation yields g irreducible in K_p[x], and hence in K[x], such that g has a root β with K_p(β) = F. Let L = K(β). Then [L : K] = n because g is irreducible. Let P be any prime of L lying over p. Then L_P is complete containing K and β. Thus it contains the complete field K_p(β). If we could show that [L_P : K_p] ≤ n, we would be done.

Let π ∈ K be such that π ∈ p but π ∉ p^2. Then v_p(π) = 1 and hence π is a prime element in K_p. But P divides the ideal (π) ⊂ O_L e times, where e = e(P/p) is the ramification index. Therefore, v_P(π) = e. If π′ is a prime element of L_P so that v_P(π′) = 1, this shows that (π) = (π′)^e in O_P. Hence the ramification index of the extension of the (only) prime in O_P over the prime of O_p is e. Moreover, Proposition 3.17 shows that the inertia degree is f = f(P/p). Hence [L_P : K_p] = ef ≤ n, and we are done.

The methods used to prove the above proposition actually show something stronger about the degree of an extension of p-adic fields. In fact, we have the following two propositions:

Proposition 3.29. Let K be a number field, L a finite extension of K, p a prime of K, and P a prime of L lying over p. Let e be the ramification index of P/p. Then | · |_P = | · |_p^e on K. In particular, for any other prime Q lying above p, the absolute values | · |_P and | · |_Q induce equivalent absolute values on K.

Proposition 3.30. Let K be a number field, L a finite extension of K, p a prime of K, and P a prime of L lying over p. Let e, f be, respectively, the ramification index and inertia degree of P/p. Then [L_P : K_p] = ef.

Finally, let us briefly discuss the Galois theory of p-adic fields.

Proposition 3.31. Let K be a number field and L a finite Galois extension, and let p be a prime of K and P a prime of L lying over p. Then L_P/K_p is Galois and Gal(L_P/K_p) ≅ D(P) where the decomposition group is the one for the extension L/K.

Proof. Let σ ∈ D(P). We construct an automorphism σ_P of the extension L_P/K_p as follows. First note that |α|_P = |σ(α)|_P because σ(P) = P by definition of the decomposition group. Thus σ is continuous under the P-adic absolute value. So σ extends to a continuous map from the completion L_P to itself. It is easy to check that it is a homomorphism. It is an isomorphism because the extension of σ^{-1} to L_P clearly provides an inverse. Thus we have a map σ ↦ σ_P : D(P) → hom_{K_p}(L_P, L_P). It is easy to check that this map respects composition. It is injective because each σ_P has a different action on L ⊂ L_P. But D(P) has order ef, which is the degree of the extension L_P/K_p. Thus L_P/K_p is Galois and D(P) → Gal(L_P/K_p) is an isomorphism.

3.5 Absolute Values on Number Fields

Let K be a number field. We have seen non-archimedean absolute values on K coming from prime ideals in O_K, but there are others, which are archimedean, defined as follows. Let σ : K → C be an embedding. There are [K : Q] such embeddings because C is algebraically closed. Then we may define | · |_σ = |σ(·)| where | · | is the ordinary absolute value on C.

Definition 3.32. Let K be a number field. We let V_K be the set of equivalence classes of absolute values on K, the equivalence considered here being usual equivalence of absolute values. The elements of V_K are called primes of K. The equivalence classes of p-adic absolute values for prime ideals p will be called the finite primes of K and the set of such absolute values will be denoted V_f. The ones arising from embeddings of K into C are called infinite primes and their set will be denoted V_∞.

Remark. Earlier, we remarked that the non-zero prime ideals of a number field are like points on a variety; they are, in fact, closed points in the scheme Spec O_K. But this scheme is affine and not complete. The fix for this, for all intents and purposes arithmetic, seems to be including the infinite primes in the picture. This is the justification of the above terminology; the infinite primes are like the points at infinity on a projectivization of Spec O_K. The reason this analogy makes sense is essentially the product formula (proved at the end of this section) which is the analogue of the theorem deg ∘ div = 0 for projective curves in algebraic geometry. See Section 4.5 for details.

We want to prove that the prime ideals along with the embeddings K → C are in natural bijection with V_K.

Proposition 3.33. Let K be a number field, L a finite extension of K, and p a prime ideal of K. Let | · | be an absolute value on L extending | · |_p. Then | · | is equivalent to | · |_P for some P a prime ideal of L lying over p.

Proof. Consider the completion L̂ of L with respect to the absolute value | · |. The extended absolute value on L̂ is non-archimedean (N is still bounded), so the elements of absolute value less than 1 form a maximal ideal in the closed unit ball. Call the closed unit ball O and the maximal ideal m. Then O ⊃ O_L because it contains O_K and is integrally closed. Hence P = m ∩ O_L is non-zero (it contains p) and prime in O_L. In particular, P ∩ O_K = p.

Now the localized ring (O_L)_P is contained in O because any element in O_L\P has absolute value 1, and so the inverses of these elements lie in O. Furthermore, since P consists of elements of absolute value strictly less than 1, L\(O_L)_P contains only elements of absolute value strictly greater than 1. This proves that | · | and | · |_P give the same unit balls, and are hence equivalent (why?).

Theorem 3.34 (Ostrowski). Up to equivalence, the only (non-trivial) absolute values on Q are the p-adic ones, for primes p, and the one arising from the embedding Q → R. These are all inequivalent.

Proof. Clearly these are inequivalent, for if p ≠ q are primes, we have |p|_p = p^{-1} ≠ |p|_q^s = 1 for any s > 0, and as well, the real absolute value is archimedean while the p-adic ones are not. It remains to show that these are the only ones.

Suppose | · | is a non-archimedean absolute value on Q. Then since N is bounded (by 1, in fact, since |1 + . . . + 1| ≤ 1) and | · | is not trivial, the set a = {a ∈ Z | |a| < 1} contains non-zero elements. It is an ideal by the ultrametric inequality, and it is prime since |ab| < 1 implies |a| < 1 or |b| < 1. It is thus equal to an ideal (p) for some prime p. Write a ∈ Z as a = bp^m with p ∤ b. Then b ∉ a so that

|a| = |b||p^m| = |p|^m = |p|_p^{ms} = |a|_p^s

for s = log |p| / log |p|_p, which is greater than 0 (both logarithms are negative). Thus | · | is equivalent to | · |_p.

Now assume | · | is archimedean. We claim that for all m, n ∈ N with m, n > 1, we have |m|^{1/ log m} = |n|^{1/ log n}. Write m to base n as

m = a_0 + a_1n + · · · + a_rn^r

with 0 ≤ a_i ≤ n − 1 and a_r ≠ 0. Then n^r ≤ m implies r ≤ log m/ log n. Also, |a_i| = |1 + · · · + 1| ≤ a_i|1| = a_i ≤ n. Hence

|m| ≤ ∑ a_i|n|^i ≤ ∑ n|n|^r ≤ (r + 1)n|n|^r ≤ (1 + log m/ log n) n · |n|^{log m/ log n}.

We substitute m^k for m, take kth roots, and let k → ∞ to eliminate the left hand factors in the last product. This gives

|m| ≤ |n|^{log m/ log n}

or,

|m|^{1/ log m} ≤ |n|^{1/ log n}.

Switching the roles of m and n yields the claim.

Now set c = |n|^{1/ log n}. Then c is independent of n, and c is strictly larger than 1 because | · | is archimedean. Let c = e^s so that s > 0. Then |n| = n^s = |n|_R^s where | · |_R is the usual absolute value. Obviously this identity extends to all of Q. Hence | · | is equivalent to | · |_R.

Lemma 3.35. Let K be a number field. The only archimedean absolute values on K come from embeddings K → C.

Proof. Let | · | be an archimedean absolute value on K and let K̂ be the completion of K with respect to | · |. We will show that R ⊂ K̂ topologically, and that K̂ is algebraic over R. This will imply either K̂ = R topologically, or K̂ = C and | · | induces the usual absolute value on R. The latter case implies that K̂ = C has the usual absolute value on C. In either case, we would be done.

To show R ⊂ K̂ topologically, we simply restrict | · | from K to Q, where it is archimedean, and therefore equivalent to the usual absolute value by Ostrowski’s Theorem. Passing to completions shows R ⊂ K̂ with the topologies corresponding.

To show K̂/R is algebraic, let α ∈ K̂ and let {a_i} be a sequence in K converging to α. Let n = [K : Q]. Let f_i(x) = b_{n,i}x^n + · · · + b_{0,i} be the minimal polynomial of a_i over Q (padded with zero coefficients if its degree is less than n), but normalized so that the coefficients are at most 1 in absolute value, with at least one of absolute value exactly 1. Then the b_{j,i} are bounded as i ranges through the natural numbers. Therefore, there is a subsequence of these polynomials such that the coefficients converge in R. This subsequence, say {f_{i_m}}, has f_{i_m}(a_{i_m}) = 0 for all m. Let b_j be the limit of the coefficients b_{j,i_m}; by the normalization, the b_j are not all zero. Then, upon taking limits, we see immediately b_nα^n + · · · + b_0 = 0. Thus α is algebraic over R. This completes the proof.

Finally, we prove that VK has the description given above.

Theorem 3.36. Let K be a number field. The only absolute values on K are the P-adic ones for prime ideals P of K and the ones coming from embeddings K → C. These are all inequivalent except for any two absolute values coming from embeddings K → C that differ by complex conjugation.

Proof. Since every absolute value extends an absolute value on Q, Proposition 3.33, Lemma 3.35, and Ostrowski’s Theorem 3.34 show that every absolute value on K is of the form in the theorem. We only need to show that they are not equivalent.

For non-archimedean absolute values, this is easy. Let P ≠ Q be prime ideals of K. Then we can find an element α in P but not in Q. Then |α|_P < 1 while |α|_Q = 1. Thus | · |_P ≠ | · |_Q^s for any s > 0.

Now assume | · |_1, | · |_2 come from distinct embeddings σ_1, σ_2 : K → C. If they are complex conjugates, then clearly we get the same absolute value. So assume they are not complex conjugates, and assume that the absolute values | · |_1 and | · |_2 are equivalent. Then since | · |_1 = | · |_2^s and they are equal on Q, we have equality everywhere. Thus σ_2σ_1^{-1} : σ_1(K) → σ_2(K) is a Q-isomorphism which respects the usual absolute value on C, and hence extends to a continuous R-automorphism of C which, by assumption, is neither the identity nor conjugation. This is impossible, and so | · |_1 and | · |_2 are not equivalent.

We derive several corollaries.

Corollary 3.37. Let L/K be an extension of number fields of degree n and v = v(·) an absolute value on K. Write w|v to mean that w is an absolute value on L extending v. Let K_v, L_w denote the respective completions. Then

∑_{w|v} [L_w : K_v] = n.

Proof. For finite absolute values this follows from the theorem, along with Proposition 3.30 and Theorem 2.8. For infinite places this follows from the fact that there are n K-homomorphisms L → C, and two embeddings give rise to the same absolute value if and only if they differ by complex conjugation, which means that the degree [L_w : K_v] = 2 anyway.

Corollary 3.38. Let the notation be as above. Then

Nm_{L/K}(α) = ∏_{w|v} Nm_{L_w/K_v}(α)

and

Tr_{L/K}(α) = ∑_{w|v} Tr_{L_w/K_v}(α).

Proof. Exercise.

Corollary 3.39. Let K be a number field of degree n and v_0 an absolute value on Q. For v|v_0, let n_v = [K_v : Q_{v_0}]. Then

|Nm_{K/Q}(α)|_{v_0} = ∏_{v|v_0} |α|_v.

Theorem 3.40 (Product Formula). Let K be a number field and let α ∈ K×. Then

∏_{v∈V_K} |α|_v = 1.

Proof. First we note that the product is well defined because only finitely many primes occur in the factorization of (α). Next we note that if we know this for K = Q, then we know it in general because, by Corollary 3.39, we have

1 = ∏_{v∈V_Q} |Nm_{K/Q}(α)|_v = ∏_{v∈V_Q} ∏_{w|v} |α|_w = ∏_{w∈V_K} |α|_w.

Now the formula is obviously multiplicative. So since Q×/{±1} is free abelian on the primes, we need only to prove it for the primes of Z (and −1, but this is obvious). But if p ∈ Z is prime, then we have |p|_p = p^{-1}, |p| = p for the ordinary absolute value, and |p|_q = 1 for primes q ≠ p. Hence the formula holds.
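For K = Q the product formula is easy to verify numerically, since only the primes dividing the numerator or denominator contribute. The following sketch does this with exact rational arithmetic; the function names are ours, and factoring is done by naive trial division, which is only sensible for small inputs.

```python
from fractions import Fraction

def prime_factors(n):
    """Set of primes dividing the positive integer n (trial division)."""
    primes, q = set(), 2
    while q * q <= n:
        while n % q == 0:
            primes.add(q)
            n //= q
        q += 1
    if n > 1:
        primes.add(n)
    return primes

def abs_p(x, p):
    """Normalized p-adic absolute value of a non-zero rational x."""
    v, num, den = 0, x.numerator, x.denominator
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return Fraction(p) ** (-v)

def product_of_absolute_values(x):
    """Product of |x|_v over the infinite prime and all finite primes of Q."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("the product formula concerns non-zero elements")
    result = abs(x)    # contribution of the infinite prime
    for p in prime_factors(abs(x.numerator)) | prime_factors(x.denominator):
        result *= abs_p(x, p)
    return result
```

For x = −360/77 the infinite prime contributes 360/77 and the primes 2, 3, 5, 7, 11 contribute 1/8, 1/9, 1/5, 7 and 11, so the product is 1.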

3.6 Ramification in Local Fields

In this section we consider two different types of ramification which occur often in the theory of local fields and which are very similar. Namely, we will consider when an extension is unramified, in the sense we already know, and when an extension is tamely ramified, which is to be explained. Note that since local fields have only one prime, it will cause no confusion to say that an extension of local fields has a certain ramification property, rather than the corresponding extension of primes has that property. For instance, we may say that an extension of local fields L/K is unramified if the corresponding extension of primes is unramified.

Proposition 3.41. Let L/K and E/K be finite separable extensions of local fields with both L and E lying inside a fixed algebraic closure of K. Let F = LE. If L/K is unramified, then so is F/E.

Proof. In general, we will let a bar denote the reduction of an element modulo a prime. Let $\mathfrak{p} \subset \mathcal{O}_K$ and $\mathfrak{P} \subset \mathcal{O}_L$ be the primes, and $k = \mathcal{O}_K/\mathfrak{p}$ and $l = \mathcal{O}_L/\mathfrak{P}$. Then l/k is a finite extension of finite fields of degree [L : K]. This is because [L : K] = ef and e = 1. Let $\alpha \in \mathcal{O}_L$ be such that $l = k(\bar\alpha)$, and let $f \in \mathcal{O}_K[x]$ be the minimal polynomial of α over K. Then $\bar f(\bar\alpha) = 0$ and so the minimal polynomial for $\bar\alpha$ divides $\bar f$. Hence [l : k] ≤ [K(α) : K], but [K(α) : K] ≤ [L : K] = [l : k]. Thus K(α) = L. Furthermore, $[K(\alpha) : K] = [l : k] = [k(\bar\alpha) : k]$, which implies that the minimal polynomial of $\bar\alpha$ has degree that of f. Hence it is $\bar f$. Even more, F = E(α).

Let $\mathfrak{q} \subset \mathcal{O}_E$ and $\mathfrak{Q} \subset \mathcal{O}_F$ be the primes, and $\kappa = \mathcal{O}_E/\mathfrak{q}$ and $\lambda = \mathcal{O}_F/\mathfrak{Q}$. Let $g \in \mathcal{O}_E[x]$ be the minimal polynomial of α over E. Then g divides f. We also have that $\bar g$ is irreducible over κ because, if it were not, then Hensel's Lemma would produce a factorization of g (the factors of $\bar g$ are coprime because $\bar g$ divides the separable polynomial $\bar f$). Hence $[F : E] = [\kappa(\bar\alpha) : \kappa]$. But [λ : κ] ≤ [F : E] and $[\kappa(\bar\alpha) : \kappa] \le [\lambda : \kappa]$. Thus these are all equal, and in particular, [F : E] = [λ : κ]. So F/E is unramified.

Corollary 3.42. A subextension of an unramified extension is unramified.


Proof. If E ⊂ L in the proposition, then we would have L/K is unramified by assumption, and we would conclude that L/E is unramified. Since the ramification degree is multiplicative in towers, E/K is unramified.

Corollary 3.43. The composite of two unramified extensions is unramified.

Proof. In the proposition, if E/K is unramified, then since we conclude anyway that F/E is unramified, multiplicativity in towers gives that F/K is unramified.

Given an extension of local fields L/K it makes sense to talk about the maximal unramified subextension T of L/K, which is the composite of all unramified subextensions of L/K, and is therefore unramified. Every nontrivial extension of T in L must then be ramified. The composite of all unramified extensions of a local field K in a fixed algebraic closure is called the maximal unramified extension of K. We denote it $K^{ur}$. Any finite subextension of $K^{ur}/K$ is thus unramified.

Proposition 3.44. Let K be a local field with residue field k. There is a one-to-one correspondence between finite unramified extensions of K (in some fixed algebraic closure) and finite extensions of k (in some fixed algebraic closure) given by taking the residue field.

Proof. Let E, F be unramified extensions of K with residue fields κ, λ respectively, such that κ/k and λ/k have the same degree. Then κ = λ because k is finite. Since EF has residue field λκ (why?), we find that the inertia degrees associated to EF/K, E/K and F/K are all the same. But these are all unramified. So EF has the same degree over K as E or F. Thus E = F = EF, and the correspondence as stated in the proposition is injective.

To show it is surjective, let β be an element which generates a finite extension l of k. Let $\bar f$ be the minimal polynomial of β over k and let f be a monic lift of this polynomial to $\mathcal{O}_K[x]$. Then f is irreducible because $\bar f$ is. Let α be a root of f and consider K(α). Then [K(α) : K] = [l : k] and the residue field of K(α) contains l. Thus, by examining degrees, we see that it must equal l, and even more that K(α)/K is unramified. This proves the surjectivity.

Let p be the characteristic of the residue field k of a local field K. Let $\bar k$ be an algebraic closure of k. Then any finite extension l of k in $\bar k$ is generated by an mth root of unity for some m with p ∤ m. Let $\bar f$ be the minimal polynomial of a generator of l over k, so that $\bar f \mid x^m - 1 \in k[x]$. Then by Hensel's Lemma, $\bar f$ lifts to a polynomial $f \in \mathcal{O}_K[x]$ which divides $x^m - 1 \in \mathcal{O}_K[x]$. Thus the α we obtained in the proof above may as well have been an mth root of unity over K. Thus we have proved

Proposition 3.45. Let K be a local field whose residue field is of characteristic p. Then $K^{ur}$ is the extension of K obtained by adjoining all the mth roots of unity to K with p ∤ m.
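To make the correspondence with residue fields concrete: if the residue field k has q elements and p ∤ m, then the residue field extension generated by a primitive mth root of unity has degree equal to the multiplicative order of q modulo m, and by the discussion above this is also the degree of the unramified extension obtained by adjoining the mth roots of unity to K. The sketch below computes that order; the function name is a hypothetical choice made for this illustration.

```python
from math import gcd

def unramified_degree(q, m):
    """Smallest f >= 1 with q**f ≡ 1 (mod m).

    This is the degree of F_q(zeta_m) over F_q, where q is the size of the
    residue field and m is prime to the residue characteristic.
    """
    if gcd(q, m) != 1:
        raise ValueError("m must be prime to the residue characteristic")
    f, power = 1, q % m
    while power != 1 % m:
        power = (power * q) % m
        f += 1
    return f
```

For example, with q = 2 and m = 7 the order is 3, which reflects the fact that the 7th roots of unity generate the field with 8 elements over the field with 2 elements.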

Tamely ramified extensions to be developed.

3.7 The Different and the Relative Discriminant

Omitted for now.


Exercises

Exercise 3.1. Extend Proposition 3.26 to the case where f is not separable, with correct multiplicity conditions.


4 Finiteness of the Class Number and the Unit Theorem

In this chapter, we prove theorems which greatly clarify the arithmetic structure of a number field. First we prove that the ideal class group of $\mathcal{O}_K$ is finite. Then we completely describe the structure of the units $\mathcal{O}_K^\times$. As an extra topic, we include the basics of Riemann-Roch theory for number fields, which develops analogues of the Riemann-Roch theorem and Serre Duality in arithmetic.

The methods used in this chapter are heavily geometric. We study embeddings of $\mathcal{O}_K$ into $\mathbb{R}^n$ and of $\mathcal{O}_K^\times$ into another real space. The images of $\mathcal{O}_K$ and its units have interesting geometric properties in these spaces, and we can study the arithmetic structure of $\mathcal{O}_K$ and its units through this geometry.

Sources

Besides the section on Riemann-Roch theory, this chapter is derived from the notes of Milne [7]. I first saw the material of the section on Riemann-Roch in Neukirch [9]. The proof of the main theorem is derived from the one in Neukirch and also a theorem in Lang [5].

4.1 The Embedding of a Number Field into n-Space

Let K be a number field of degree n. Recall the construction of the ideal class group. If $J(\mathcal{O}_K)$ is the group of fractional ideals of $\mathcal{O}_K$, we denote by $P(\mathcal{O}_K)$ the subgroup of principal fractional ideals. Then $C_K = J(\mathcal{O}_K)/P(\mathcal{O}_K)$ is the ideal class group. We let $h_K$, called the class number, denote its cardinality.

The first main goal of this chapter is to prove that $h_K$ is finite. To do this, we will need to introduce a little bit of analysis. The trick will be to embed K into $\mathbb{R}^n$. Then it will turn out that, under this embedding, $\mathcal{O}_K$ will have a nice geometric structure, as will any fractional ideal in K. We study this geometry and deduce that every ideal class is represented by an ideal containing a sufficiently small element of $\mathcal{O}_K$, and the geometry of $\mathcal{O}_K$ will be such that there are only finitely many elements of bounded size.

Let us now describe the embedding. Recall that there are exactly n embeddings K → C, where n = [K : Q]. Some of these have image in R; these are called the real embeddings, and we denote the number of such embeddings by r. The ones which do not have image in R are the complex embeddings, and they come in complex conjugate pairs. Let s be the number of such pairs, so that n = r + 2s. We may therefore write

$$\hom(K, \mathbb{C}) = \{\sigma_1, \dots, \sigma_r, \sigma_{r+1}, \bar\sigma_{r+1}, \dots, \sigma_{r+s}, \bar\sigma_{r+s}\}$$

where the first r embeddings are real and the last 2s are complex, coming in pairs.

Now we have a vector space isomorphism $\mathbb{R}^r \times \mathbb{C}^s \cong \mathbb{R}^n$. We compose this isomorphism with the embedding

$$\alpha \mapsto (\sigma_1(\alpha), \dots, \sigma_{r+s}(\alpha)) : K \to \mathbb{R}^r \times \mathbb{C}^s$$

to get an embedding $i : K \to \mathbb{R}^n$. Here we chose exactly one embedding K → C for each pair of complex embeddings.

Note that this is indeed injective, because it is injective on each component.

Recall the following definition.


Definition 4.1. Let V be a finite dimensional real vector space, inheriting a topology from an isomorphism $V \cong \mathbb{R}^m$ for some m. A subgroup Λ ⊂ V is called a lattice if it is discrete of rank m.

Note that Λ as above is automatically free abelian of rank m because $\mathbb{R}^m$ has no torsion. In fact,

Proposition 4.2. A subgroup Λ of a real vector space $V \cong \mathbb{R}^m$ is a lattice if and only if it can be generated (as an abelian group) by a basis of V.

Proof. This is a straightforward exercise for those readers who may not have seen a proof. For a more detailed description of a lattice, see Milne [7], chapter 4.

Proposition 4.3. i(OK) is a lattice in Rn.

Proof. Let $\alpha_1, \dots, \alpha_n$ be an integral basis. Consider the vectors $i(\alpha_1), \dots, i(\alpha_n)$. By the previous proposition, we need to show that these are linearly independent in $\mathbb{R}^n$. Write $i(\alpha_j) = (\beta_{j1}, \dots, \Re\beta_{j,r+s}, \Im\beta_{j,r+s})$, where we split up the complex components into real and imaginary parts. Consider the matrix

$$\begin{pmatrix} \beta_{11} & \cdots & \Re\beta_{1,r+s} & \Im\beta_{1,r+s} \\ \vdots & \ddots & \vdots & \vdots \\ \beta_{n1} & \cdots & \Re\beta_{n,r+s} & \Im\beta_{n,r+s} \end{pmatrix}.$$

Treating this as a complex matrix, we can perform some elementary column operations and put this matrix in the form

$$\begin{pmatrix} \beta_{11} & \cdots & \beta_{1,r+s} & \bar\beta_{1,r+s} \\ \vdots & \ddots & \vdots & \vdots \\ \beta_{n1} & \cdots & \beta_{n,r+s} & \bar\beta_{n,r+s} \end{pmatrix}.$$

The absolute value of the determinant picks up a factor of $2^s$ through these operations (details to the reader). But this is now the matrix defining the discriminant, so the determinant of this matrix, in absolute value, is $\sqrt{|\Delta_K|}$ by definition. Hence the original matrix had determinant of absolute value $2^{-s}\sqrt{|\Delta_K|} \neq 0$. This proves that the $i(\alpha_j)$ are linearly independent.
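The determinant computed in this proof can be checked numerically in a small case. The sketch below takes K to be the field generated by the real cube root θ of 2, so that r = 1 and s = 1, and uses the standard facts (assumed here, not proved in these notes) that 1, θ, θ² is an integral basis and that the discriminant is −108. The helper names are illustrative.

```python
import cmath
import math

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, k) = m
    return a * (e * k - f * h) - b * (d * k - f * g) + c * (d * h - e * g)

theta_real = 2 ** (1 / 3)                               # the real cube root of 2
theta_cplx = theta_real * cmath.exp(2j * math.pi / 3)   # one of the two complex roots

def embed(coeffs):
    """Image in R^3 of a + b*theta + c*theta^2 under (sigma_1, Re sigma_2, Im sigma_2)."""
    a, b, c = coeffs
    x = a + b * theta_real + c * theta_real ** 2
    z = a + b * theta_cplx + c * theta_cplx ** 2
    return [x, z.real, z.imag]

basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]               # 1, theta, theta^2
covolume = abs(det3([embed(v) for v in basis]))
expected = 2 ** (-1) * math.sqrt(108)                   # 2^{-s} sqrt(|disc|) with s = 1
```

Both `covolume` and `expected` come out to approximately 5.196, in agreement with the factor $2^{-s}$ found in the proof.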

This proof showed another piece of useful information. We need a definition to state it.

Definition 4.4. Let Λ ⊂ V be a lattice in a real vector space of dimension m, and let $\lambda_1, \dots, \lambda_m$ be generators for Λ as a free abelian group. The set

$$F = \left\{ \sum_{i=1}^m a_i\lambda_i \;\middle|\; 0 \le a_i \le 1 \text{ for all } i \right\}$$

is called a fundamental parallelopiped of Λ.

Proposition 4.5. Let F be a fundamental parallelopiped for a lattice Λ in the vector space $V \cong \mathbb{R}^m$, based on generators $\lambda_1, \dots, \lambda_m$ of Λ. Suppose V inherits the Lebesgue measure on $\mathbb{R}^m$. Form the matrix A out of the column vectors $\lambda_1, \dots, \lambda_m$. Then Vol(F) = |det A|. Furthermore, the volume of a fundamental parallelopiped is independent of the generators $\lambda_i$ chosen.


Proof. The first statement is usually proved in any course on real analysis treating integration in arbitrary dimensions. The second follows from the fact that the change of basis matrix from one basis of Λ to another has determinant 1 in absolute value (Λ is free abelian).

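Proposition 4.6. Let $\mathfrak{a}$ be a fractional ideal of K. Then $i(\mathfrak{a})$ is a lattice in $\mathbb{R}^n$. The fundamental parallelopiped of $i(\mathfrak{a})$ has volume $2^{-s}\,\mathrm{N}\mathfrak{a}\sqrt{|\Delta_K|}$ under the Lebesgue measure on $\mathbb{R}^n$.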

Proof. By the above proposition, for $\mathfrak{a} = \mathcal{O}_K$, we proved what we needed in the proof of Proposition 4.3. For an integral ideal $\mathfrak{a}$ this follows from $\mathrm{N}\mathfrak{a} = |\mathcal{O}_K/\mathfrak{a}|$ and the volume computation for $\mathcal{O}_K$. For a fractional ideal $\mathfrak{a}$, this follows by reducing to the case of an integral ideal by multiplying by a suitable element of $\mathcal{O}_K$.

Remark. The measure we will use after these next few sections will actually be $2^s$ times the Lebesgue measure on $\mathbb{R}^n$, the factor arising from using twice the Lebesgue measure on each complex component of $\mathbb{R}^r \times \mathbb{C}^s$.

4.2 Minkowski’s Theorem

In this section, we introduce a useful tool for detecting lattice points in a region in $\mathbb{R}^n$. It will be a vital key in the proof of the finiteness of the class number.

Definition 4.7. Let E be a compact region in $\mathbb{R}^n$. E is called convex if for any two points in the region, the line segment connecting them also lies in the region. E is symmetric about the origin if for all α ∈ E, we have −α ∈ E.

Theorem 4.8 (Minkowski). Let Λ be a lattice in $\mathbb{R}^n$ and E a compact convex region which is symmetric about the origin. Let F be a fundamental parallelopiped for Λ. If $\mathrm{Vol}(E) > 2^n\,\mathrm{Vol}(F)$, then E contains a non-zero point of Λ.

Proof. Let $\lambda_1, \dots, \lambda_n$ be the basis used in the construction of F, and consider the sets

$$2F = \left\{ \sum_{i=1}^n a_i\lambda_i \;\middle|\; 0 \le a_i < 2 \text{ for all } i \right\}$$

and

$$E/2\Lambda = \{\alpha \in 2F \mid \alpha \equiv \beta \pmod{2\Lambda} \text{ for some } \beta \in E\}.$$

The volume of E/2Λ is at most the volume of 2F, which is $2^n\,\mathrm{Vol}(F) < \mathrm{Vol}(E)$. But E/2Λ consists of the union of the translates of points in E to 2F. Since translation preserves volume and the volume of E/2Λ is smaller than the volume of E, this implies that two distinct points of E are translated to the same point in E/2Λ. Hence there are distinct $\alpha_1, \alpha_2 \in E$ such that $\alpha_1 \equiv \alpha_2 \pmod{2\Lambda}$. Since E is symmetric about the origin, $-\alpha_2 \in E$. Since E is convex, the midpoint between $\alpha_1$ and $-\alpha_2$ is in E, i.e. $(\alpha_1 - \alpha_2)/2 \in E$. But $\alpha_1/2 \equiv \alpha_2/2 \pmod{\Lambda}$, so $(\alpha_1 - \alpha_2)/2 \in \Lambda$. Since $\alpha_1 \neq \alpha_2$, this point is non-zero, and we have constructed the desired point of Λ and E.
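A small planar instance may help fix ideas. The sketch below takes the lattice generated by (5, 2) and (2, 5), whose fundamental parallelopiped has area 21, and a closed square about the origin of area slightly more than 4 · 21; a brute-force search then exhibits the non-zero lattice point whose existence the theorem guarantees. The function name and the `search` cutoff are hypothetical choices for this illustration only.

```python
from itertools import product

def nonzero_lattice_point_in_square(v1, v2, half_side, search=10):
    """Search for a nonzero point a*v1 + b*v2 (a, b integers) in the closed
    square [-half_side, half_side]^2. `search` bounds |a| and |b|; it is an
    ad hoc cutoff that suffices for this small example."""
    for a, b in product(range(-search, search + 1), repeat=2):
        if (a, b) == (0, 0):
            continue
        x, y = a * v1[0] + b * v2[0], a * v1[1] + b * v2[1]
        if abs(x) <= half_side and abs(y) <= half_side:
            return (x, y)
    return None

v1, v2 = (5, 2), (2, 5)
covolume = abs(v1[0] * v2[1] - v1[1] * v2[0])   # Vol(F) = |det| = 21
half_side = 4.6                                  # area (2 * 4.6)^2 = 84.64 > 2^2 * 21
point = nonzero_lattice_point_in_square(v1, v2, half_side)
```

Neither generator lies in the square, so the point found is a genuine combination of the two, such as ±(3, −3).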

We quickly give one application of this theorem to elementary number theory.


Theorem 4.9 (Fermat). Every non-negative integer can be written as the sum of four squares.

Proof. First of all, we have the formulas

$$(a^2 + b^2 + c^2 + d^2)(e^2 + f^2 + g^2 + h^2) = (ae - bf - cg - dh)^2 + (af + be + ch - dg)^2 + (ag - bh + ce + df)^2 + (ah + bg - cf + de)^2$$

and

$$2 = 1^2 + 1^2 + 0^2 + 0^2.$$

Thus we need only to prove the theorem for an odd prime p.

For such p, first we observe that there are m, n ∈ Z such that $m^2 + n^2 + 1 \equiv 0 \pmod p$. Indeed the set of possible values of $m^2 \pmod p$ and the set of possible values of $-n^2 - 1 \pmod p$ both have $\frac{p+1}{2}$ elements. Hence they overlap in the set of all values modulo p.

For such a choice of m, n, consider the lattice in $\mathbb{R}^4$

Λ = {(a, b, c, d) ∈ Z4 | c ≡ ma+ nb (mod p), d ≡ mb− na (mod p)}.

We find immediately that $\Lambda \supset p\mathbb{Z}^4$. Furthermore, $\Lambda/p\mathbb{Z}^4$ is a two dimensional $\mathbb{F}_p$-vector space because a and b are allowed to be arbitrary modulo p, but then they fix c and d. Hence, if F is a fundamental parallelopiped of Λ, then $\mathrm{Vol}(F) = p^2$.

Now consider a closed ball S of radius r where $1.9p < r^2 < 2p$, say, centered about the origin. S has volume $\pi^2 r^4/2$ so that

$$\mathrm{Vol}(S) > \pi^2 (1.9)^2 p^2/2 > 2^4 p^2 = 2^4\,\mathrm{Vol}(F).$$

Thus, by Minkowski's Theorem, S contains a non-zero point (a, b, c, d) of Λ.

Now for this point, we have

a2 + b2 + c2 + d2 ≡ (a2 + b2)(1 +m2 + n2) ≡ 0 (mod p).

Hence $p \mid (a^2 + b^2 + c^2 + d^2)$. On the other hand, since (a, b, c, d) ∈ S, we have $a^2 + b^2 + c^2 + d^2 < 2p$. Since the point is non-zero, this implies $p = a^2 + b^2 + c^2 + d^2$.
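The lattice argument for an odd prime p can be carried out by brute force for small p. The sketch below follows the steps of the proof: it finds m, n with m² + n² + 1 ≡ 0 (mod p), then searches the lattice Λ for a non-zero point of squared length less than 2p. It is meant only as an illustration of the proof (the function name is an invented one), and the exhaustive search is far from efficient.

```python
from itertools import product
from math import isqrt

def four_squares_via_lattice(p):
    """For an odd prime p, return (a, b, c, d) with a^2 + b^2 + c^2 + d^2 == p,
    found by searching the lattice used in the proof."""
    # Step 1: find m, n with m^2 + n^2 + 1 ≡ 0 (mod p); these exist by the
    # counting argument in the proof.
    m, n = next((m, n) for m in range(p) for n in range(p)
                if (m * m + n * n + 1) % p == 0)
    # Step 2: search for a nonzero lattice point of squared length < 2p.
    bound = isqrt(2 * p)
    for a, b in product(range(-bound, bound + 1), repeat=2):
        for c, d in product(range(-bound, bound + 1), repeat=2):
            if (a, b, c, d) == (0, 0, 0, 0):
                continue
            in_lattice = ((c - (m * a + n * b)) % p == 0
                          and (d - (m * b - n * a)) % p == 0)
            if in_lattice and a * a + b * b + c * c + d * d < 2 * p:
                return (a, b, c, d)
    raise AssertionError("Minkowski's theorem guarantees such a point")
```

Any point returned has squared length divisible by p, positive, and less than 2p, hence equal to p, exactly as in the last step of the proof.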

4.3 The Proof of the Finiteness of the Class Number

We state two lemmas which we do not prove because they are elementary advanced calculus. The reader may wish to prove them, or see Milne [7] for the second. Then we state a third lemma which we do prove. This will be all of the calculus we need for the proof of the finiteness of the class number. Note that we have used only a little bit of number theory, the main arithmetic point so far being that the discriminant is non-zero. Let us state the first lemma, which is about arithmetic and geometric means.

Lemma 4.10. Let $a_1, \dots, a_n > 0$ be positive real numbers. Then

$$\left(\prod a_i\right)^{1/n} \le \frac{1}{n}\sum a_i.$$


The next lemma, at least in a more general form, is an exercise in using the gamma function.

Lemma 4.11. For t > 0, let $Z(t) = \{(x_1, \dots, x_m) \in \mathbb{R}^m \mid x_i \ge 0, \sum x_i < t\}$ be an m-simplex of size t, and let 0 ≤ a ≤ m be an integer. Then

$$\int_{Z(t)} x_{a+1} \cdots x_m \, dx_1 \cdots dx_m = \frac{t^{2m-a}}{(2m-a)!}.$$

Define a norm on Rr × Cs by

$$\|(x_1, \dots, x_r, z_{r+1}, \dots, z_{r+s})\| = \sum_{i=1}^{r} |x_i| + 2\sum_{i=r+1}^{r+s} |z_i|.$$

Lemma 4.12. For t > 0, let $X(t) = \{x \in \mathbb{R}^r \times \mathbb{C}^s \mid \|x\| \le t\}$. Then

$$\mathrm{Vol}(X(t)) = 2^r (\pi/2)^s t^n/n!.$$

Proof. Write $x = (x_1, \dots, x_r, z_{r+1}, \dots, z_{r+s})$. Let $Y(t) = X(t) \cap \{x \mid x_1, \dots, x_r \ge 0\}$ so that $\mathrm{Vol}(X(t)) = 2^r\,\mathrm{Vol}(Y(t))$. For the complex components, make the change of variable

$$z_j = \tfrac{1}{2}\rho_j(\cos\theta_j + i\sin\theta_j).$$

The corresponding Jacobian is $\rho_j/4$. The volume of Y(t) is an integral over the $x_i$'s, $\rho_j$'s, and $\theta_j$'s, which, after performing the integral on the $\theta_j$'s, becomes

$$\mathrm{Vol}(Y(t)) = 4^{-s}(2\pi)^s \int_Z \rho_{r+1} \cdots \rho_{r+s} \, dx_1 \cdots dx_r \, d\rho_{r+1} \cdots d\rho_{r+s}$$

where Z is like in the previous lemma,

$$Z = \Bigl\{(x_1, \dots, x_r, \rho_{r+1}, \dots, \rho_{r+s}) \Bigm| x_i, \rho_j \ge 0,\ \sum x_i + \sum \rho_j \le t\Bigr\}.$$

The previous lemma now computes the volume as

$$\mathrm{Vol}(X(t)) = 2^r 4^{-s}(2\pi)^s \frac{t^{2r+2s-r}}{(2r+2s-r)!} = 2^r(\pi/2)^s t^n/n!.$$
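As a sanity check on the formula of Lemma 4.12 (not part of the original argument), the two cases with n = 2 can be computed by elementary geometry. For r = 2, s = 0 the region is a square with vertices (±t, 0) and (0, ±t); for r = 0, s = 1 it is a disk of radius t/2.

```latex
% r = 2, s = 0: X(t) = \{ |x_1| + |x_2| \le t \} is a square with diagonal 2t.
\mathrm{Vol}(X(t)) = \tfrac{1}{2}(2t)^2 = 2t^2 = 2^2 \left(\tfrac{\pi}{2}\right)^0 \frac{t^2}{2!}.

% r = 0, s = 1: X(t) = \{ 2|z| \le t \} is a disk of radius t/2.
\mathrm{Vol}(X(t)) = \pi \left(\tfrac{t}{2}\right)^2 = \frac{\pi t^2}{4} = 2^0 \left(\tfrac{\pi}{2}\right)^1 \frac{t^2}{2!}.
```

Both agree with $2^r(\pi/2)^s t^n/n!$.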

We now prove a bound on the minimal size of an element in an ideal in $\mathcal{O}_K$, and derive a corollary.

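Theorem 4.13. Let $\mathfrak{a}$ be a non-zero ideal in $\mathcal{O}_K$. Then there is a non-zero element $\alpha \in \mathfrak{a}$ with

$$|\mathrm{Nm}_{K/\mathbb{Q}}(\alpha)| \le \left(\frac{4}{\pi}\right)^s \frac{n!}{n^n}\,\mathrm{N}\mathfrak{a}\sqrt{|\Delta_K|}.$$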


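Proof. Let X(t) be as in Lemma 4.12, and F the fundamental parallelopiped of $i(\mathfrak{a})$. The set X(t) is compact, convex, and symmetric about the origin, so we may apply Minkowski's Theorem. This says that if $\mathrm{Vol}(X(t)) \ge 2^n\,\mathrm{Vol}(F)$, then X(t) contains a non-zero point of $i(\mathfrak{a})$ (for a compact region the non-strict inequality suffices: apply the theorem to slightly larger regions and use that the lattice is discrete). Thus, we want

$$2^r(\pi/2)^s t^n/n! \ge 2^n 2^{-s}\,\mathrm{N}\mathfrak{a}\sqrt{|\Delta_K|}$$

or,

$$t^n/n^n \ge (4/\pi)^s (n!/n^n)\,\mathrm{N}\mathfrak{a}\sqrt{|\Delta_K|}.$$

Define t by making this an equality, so that there is a non-zero $\alpha \in \mathfrak{a}$ with $i(\alpha) \in X(t)$. Let $\hom(K, \mathbb{C}) = \{\sigma_1, \dots, \sigma_r, \sigma_{r+1}, \bar\sigma_{r+1}, \dots, \sigma_{r+s}, \bar\sigma_{r+s}\}$. Since $i(\alpha)$ is in X(t), we have, by Lemma 4.10,

$$|\mathrm{Nm}_{K/\mathbb{Q}}(\alpha)| = |\sigma_1(\alpha)| \cdots |\sigma_r(\alpha)|\,|\sigma_{r+1}(\alpha)|^2 \cdots |\sigma_{r+s}(\alpha)|^2 \le \left(\frac{1}{n}\Bigl(\sum_{i=1}^{r} |\sigma_i(\alpha)| + 2\sum_{i=r+1}^{r+s} |\sigma_i(\alpha)|\Bigr)\right)^{n} \le t^n/n^n.$$

Our definition of $t^n/n^n$ was the bound stated in the theorem.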

The bound in the next theorem is often called the Minkowski bound. The finiteness of the class number will follow immediately.

Theorem 4.14. Every ideal class in $C_K$ has an integral representative $\mathfrak{a}$ which satisfies

$$\mathrm{N}\mathfrak{a} \le \left(\frac{4}{\pi}\right)^s \frac{n!}{n^n}\sqrt{|\Delta_K|}.$$

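Proof. Let $\mathfrak{c}$ be a fractional ideal, and $d \in K^\times$ with $d\mathfrak{c}^{-1}$ integral, say equal to the ideal $\mathfrak{b}$. Then by the previous theorem, there is a non-zero $\beta \in \mathfrak{b}$ such that

$$|\mathrm{Nm}_{K/\mathbb{Q}}(\beta)| \le \left(\frac{4}{\pi}\right)^s \frac{n!}{n^n}\,\mathrm{N}\mathfrak{b}\sqrt{|\Delta_K|}.$$

Since $\mathfrak{b} \supset (\beta)$, there is an ideal $\mathfrak{a}$ with $\mathfrak{a}\mathfrak{b} = (\beta)$. Thus we have

$$\mathrm{N}\mathfrak{a}\,\mathrm{N}\mathfrak{b} = |\mathrm{Nm}_{K/\mathbb{Q}}(\beta)| \le \left(\frac{4}{\pi}\right)^s \frac{n!}{n^n}\,\mathrm{N}\mathfrak{b}\sqrt{|\Delta_K|}.$$

We cancel $\mathrm{N}\mathfrak{b}$ to obtain the theorem, as long as $\mathfrak{a}$ represents the same ideal class as $\mathfrak{c}$. But $\mathfrak{a} = \beta\mathfrak{b}^{-1} = \beta d^{-1}\mathfrak{c}$, so we are done.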

Theorem 4.15. The class number of K is finite.

Proof. By the previous theorem, we only need to show that there are finitely many integral ideals in $\mathcal{O}_K$ with bounded norm. This is easy. Each prime p in Z splits into finitely many primes in K, each with norm at least p. So since there are only finitely many primes in Z of bounded size, there are only finitely many primes in K of bounded norm. There are only finitely many ways to combine finitely many primes before exceeding a given bound, and this proves the theorem.

Remark. The particularly attentive reader may have noticed that the preceding proof can be made effective.
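To illustrate how the bound of Theorem 4.14 is used in practice, here is a short sketch that evaluates it numerically. The function name and the sample discriminants (−4 for Q(√−1), −20 for Q(√−5), 5 for Q(√5)) are supplied for this example; they are standard values, but they are not computed in this section.

```python
from math import factorial, pi, sqrt

def minkowski_bound(n, s, disc):
    """(4/pi)^s * n!/n^n * sqrt(|disc|) for a field of degree n with s pairs
    of complex embeddings and discriminant disc."""
    return (4 / pi) ** s * factorial(n) / n ** n * sqrt(abs(disc))

bound_gaussian = minkowski_bound(2, 1, -4)     # Q(sqrt(-1))
bound_sqrt_m5 = minkowski_bound(2, 1, -20)     # Q(sqrt(-5))
bound_sqrt_5 = minkowski_bound(2, 0, 5)        # Q(sqrt(5))
```

When the bound is smaller than 2, every ideal class contains an integral ideal of norm 1, which must be the whole ring of integers, so the class number is 1; this happens for Q(√−1) and Q(√5). For Q(√−5) the bound lies between 2 and 3, so only the ideals of norm at most 2 need to be examined.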


4.4 Dirichlet’s Unit Theorem

Let us state the unit theorem.

Theorem 4.16. The units in $\mathcal{O}_K$ are a finitely generated abelian group of rank r + s − 1 with torsion µ(K), the roots of unity in K.

We will show that the torsion is the roots of unity, then that the units are finitely generated of rank ≤ r + s − 1, and finally that the rank is ≥ r + s − 1. The full proof requires several components.

Proposition 4.17. $(\mathcal{O}_K^\times)_{tors} = \mu(K)$, i.e. the torsion of the units is the roots of unity.

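Proof. Clearly $\mu(K) \subset (\mathcal{O}_K^\times)_{tors}$. For the opposite inclusion, let $\alpha \in (\mathcal{O}_K^\times)_{tors}$. Then $\alpha^m = 1$ for some m by definition, so α is a root of unity in K, i.e. α ∈ µ(K).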

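Proposition 4.18. Let $\alpha \in \mathcal{O}_K$. Then $\alpha \in \mathcal{O}_K^\times$ if and only if $|\mathrm{Nm}_{K/\mathbb{Q}}(\alpha)| = 1$.

Proof. For the forward implication, assume $\alpha \in \mathcal{O}_K^\times$. Then $\alpha^{-1} \in \mathcal{O}_K$. So we have

$$1 = \mathrm{Nm}_{K/\mathbb{Q}}(1) = \mathrm{Nm}_{K/\mathbb{Q}}(\alpha)\,\mathrm{Nm}_{K/\mathbb{Q}}(\alpha^{-1}).$$

But $\mathrm{Nm}_{K/\mathbb{Q}}(\alpha)$ and $\mathrm{Nm}_{K/\mathbb{Q}}(\alpha^{-1})$ are in Z. Thus they are both ±1.

For the converse, assume $\alpha \in \mathcal{O}_K$ has norm ±1. Then

$$\prod_{\sigma \in \hom(K,\mathbb{C})} \sigma(\alpha) = \pm 1$$

so that, for any $\sigma_0 \in \hom(K, \mathbb{C})$,

$$\sigma_0(\alpha) \prod_{\sigma \neq \sigma_0} \sigma(\alpha) = \pm 1,$$

so we have constructed an inverse to $\sigma_0(\alpha)$ which is an algebraic integer and lies in $\sigma_0(K)$ since its inverse $\sigma_0(\alpha)$ does. Hence it is in $\mathcal{O}_K$ (or more accurately, $\sigma_0(\mathcal{O}_K)$) and so α is a unit.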

Lemma 4.19. Let M, N be non-negative integers. Then the set of all algebraic integers α of degree ≤ M and such that |σα| < N for all embeddings σ : K → C, is finite.

Proof. The first condition implies a bound on the degree of the minimal polynomial of α, which, moreover, has integer coefficients. The second condition implies a bound on the size of the coefficients, because they are the elementary symmetric polynomials in the σα. There are only finitely many integer polynomials of bounded degree with bounded coefficients, and hence only finitely many such α.

Proposition 4.20. The set of $\alpha \in \mathcal{O}_K$ such that |σα| = 1 for all σ : K → C is precisely the roots of unity.

Proof. By the previous lemma, the set $\{\alpha, \alpha^2, \dots\}$ lies in a finite set, and so is finite. Hence $\alpha^a = \alpha^b$ for some a < b, so $\alpha^{b-a} = 1$. This suffices.


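Write $\hom(K, \mathbb{C}) = \{\sigma_1, \dots, \sigma_r, \sigma_{r+1}, \bar\sigma_{r+1}, \dots, \sigma_{r+s}, \bar\sigma_{r+s}\}$. Consider $H \subset \mathbb{R}^{r+s}$ defined by $H = \{(x_1, \dots, x_{r+s}) \mid x_1 + \cdots + x_r + 2x_{r+1} + \cdots + 2x_{r+s} = 0\}$. This is a vector space of dimension r + s − 1. We define a map $\log : \mathcal{O}_K^\times \to H$ by

$$\alpha \mapsto (\log|\sigma_1\alpha|, \dots, \log|\sigma_r\alpha|, \log|\sigma_{r+1}\alpha|, \dots, \log|\sigma_{r+s}\alpha|).$$

This is well defined by Proposition 4.18, and its kernel is $(\mathcal{O}_K^\times)_{tors} = \mu(K)$ by the above proposition.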

Proposition 4.21. $\mathcal{O}_K^\times$ is finitely generated.

Proof. If we prove that $\log(\mathcal{O}_K^\times)$ is discrete, it will be a lattice in the subspace it generates, and so its rank will be at most r + s − 1. To prove discreteness, let E be a bounded subset of $\log(\mathcal{O}_K^\times)$. Then for all units u with log u ∈ E and all j, $\log|\sigma_j u|$ is bounded, hence so is $|\sigma_j u|$, and hence E is finite by Lemma 4.19. This proves that $\log(\mathcal{O}_K^\times)$ is discrete.

The rest of the section will be devoted to proving Theorem 4.23, which immediately implies the unit theorem. First, we set some ground. We treat $\mathbb{R}^r \times \mathbb{C}^s$ as a ring with multiplication componentwise. For $x = (x_1, \dots, x_{r+s}) \in \mathbb{R}^r \times \mathbb{C}^s$ (the first r coordinates real and the last s complex), let

$$\mathrm{Nm}(x) = x_1 \cdots x_r\, x_{r+1}\bar{x}_{r+1} \cdots x_{r+s}\bar{x}_{r+s}.$$

Then $|\mathrm{Nm}(x)| = |x_1| \cdots |x_r|\,|x_{r+1}|^2 \cdots |x_{r+s}|^2$. Let i be the embedding $\mathcal{O}_K \to \mathbb{R}^r \times \mathbb{C}^s$ as in the proof of the finiteness of the class number. Then $|\mathrm{Nm}(i(\alpha))| = |\mathrm{Nm}_{K/\mathbb{Q}}(\alpha)|$ for $\alpha \in \mathcal{O}_K$.

We will consider sets of the form $x \cdot i(\mathcal{O}_K)$, where the multiplication takes place in the ring $\mathbb{R}^r \times \mathbb{C}^s$. It is clear that this is still a lattice for |Nm(x)| > 0 (i.e. all coordinates nonzero). Its fundamental parallelopiped has volume $2^{-s}\sqrt{|\Delta_K|}\,|\mathrm{Nm}(x)|$ (exercise; see Proposition 4.6 and its proof).

The next result constructs units whose inverses have coordinates distributed like arbitrary $x \in \mathbb{R}^r \times \mathbb{C}^s$. This will give us enough units to prove Theorem 4.23.

Lemma 4.22. There exists a constant M > 0 such that, for every x with 1/2 ≤ |Nm(x)| ≤ 1, there exists a unit $\varepsilon \in \mathcal{O}_K^\times$ such that every coordinate of $x \cdot i(\varepsilon)$ is smaller than M in absolute value.

Proof. Let x be as in the statement of the lemma. Let $T \subset \mathbb{R}^r \times \mathbb{C}^s$ be compact, convex, symmetric about the origin, and have volume larger than $2^n 2^{-s}\sqrt{|\Delta_K|}$, so that the volume of T is larger than $2^n$ times that of the fundamental parallelopiped of $x \cdot i(\mathcal{O}_K)$. Then Minkowski's Theorem implies that there is a nonzero $\gamma \in \mathcal{O}_K$ with $x \cdot i(\gamma) \in T$. Let $M_0$ be the radius of T (the largest distance of any point in T from the origin). Then $|\mathrm{Nm}(x \cdot i(\gamma))| \le M_0^n$, hence, since |Nm(x)| ≥ 1/2, $|\mathrm{Nm}(i(\gamma))| = |\mathrm{Nm}_{K/\mathbb{Q}}(\gamma)| \le 2M_0^n$.

Now consider the principal ideals $(\gamma) \subset \mathcal{O}_K$ for all $\gamma \in \mathcal{O}_K$ which arise in this way for some x. They have bounded norms, and so there are only finitely many (this is the argument used at the end of the proof of the finiteness of the class number). Let $\gamma_1, \dots, \gamma_t$ be generators of these ideals. Take now any γ with $x \cdot i(\gamma) \in T$, so that $(\gamma) = (\gamma_i)$ for some i. Then $\gamma = \gamma_i\varepsilon$ for some unit ε. We find that $x \cdot i(\varepsilon) \in i(\gamma_i^{-1}) \cdot T$, and hence

$$x \cdot i(\varepsilon) \in T' = i(\gamma_1^{-1}) \cdot T \cup \cdots \cup i(\gamma_t^{-1}) \cdot T.$$

But T′ is bounded and independent of x. Let M be its radius. Then $x \cdot i(\varepsilon)$ has coordinates bounded in absolute value by M.

Theorem 4.23. $\log(\mathcal{O}_K^\times)$ is a (full) lattice in the vector space H.

Proof. Let $x = x_i \in \mathbb{R}^r \times \mathbb{C}^s$ have |Nm(x)| = 1 and have the ith coordinate extremely small and all others large. Then there is an M as in the previous lemma and a unit $\varepsilon_i$ such that each coordinate of $x \cdot i(\varepsilon_i)$ is smaller than M. If we choose x correctly, then multiplication by $\varepsilon_i$ decreases the size of each coordinate of x except the ith coordinate. Hence $|\sigma_j(\varepsilon_i)| < 1$ for all j ≠ i, and so $\log(|\sigma_j(\varepsilon_i)|) < 0$. The ith coordinate will, of course, have $\log(|\sigma_i(\varepsilon_i)|) > 0$.

We now assume r + s − 1 ≥ 1, otherwise there is nothing to prove. (What are the fields with r + s = 1?)

We claim that $\log\varepsilon_1, \dots, \log\varepsilon_{r+s-1}$ are linearly independent in H. To do this, we will show that the matrix $[\log(|\sigma_j(\varepsilon_i)|)]$ for 1 ≤ i, j ≤ r + s − 1 is invertible. Well, the ith row of this matrix, along with $\log(|\sigma_{r+s}(\varepsilon_i)|)$, sums to 0 once the entries coming from complex embeddings are counted twice (doubling those columns does not affect invertibility). Hence the sum of the entries of any row is strictly larger than 0. Also, the only positive entries are on the diagonal, while the rest are negative. The next lemma, which is purely linear algebra, implies immediately that this matrix is invertible.

Lemma 4.24. Let $A = [a_{ij}]$ be an n × n real matrix with negative terms off the diagonal and each row summing to a positive value. Then A is invertible.

Proof. Assume not. Find a non-trivial solution $[x_j]$ to the system $\sum_j a_{ij}x_j = 0$. Let $i_0$ be such that $|x_{i_0}|$ is maximal. Scale the solution so that $x_{i_0} = 1$, and hence $|x_j| \le 1$. Then

$$0 = \sum_j a_{i_0,j}x_j \ge \sum_j a_{i_0,j} > 0,$$

a contradiction.
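The lemma is easy to test experimentally. The sketch below builds random integer matrices satisfying its hypotheses and confirms that their determinants, computed exactly over the rationals, are non-zero. This is only a numerical illustration with invented helper names, not a substitute for the proof.

```python
from fractions import Fraction
import random

def determinant(rows):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in rows]
    n, det = len(a), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap changes the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def random_lemma_matrix(n, rng):
    """Negative off-diagonal entries; diagonal chosen so each row sum is positive."""
    rows = []
    for i in range(n):
        row = [-rng.randint(1, 9) for _ in range(n)]
        row[i] = -sum(row[j] for j in range(n) if j != i) + rng.randint(1, 9)
        rows.append(row)
    return rows
```

For example, the 2 × 2 matrix with rows (2, −1) and (−1, 2) satisfies the hypotheses and has determinant 3.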

This concludes the proof of the unit theorem.

Let us define an invariant which will become important to us later.

Definition 4.25. A fundamental system of units of K is a set of generators for the free part of $\mathcal{O}_K^\times$. If $\varepsilon_1, \dots, \varepsilon_{r+s-1}$ is such a system, then the absolute value of the determinant of the matrix whose ith row is

$$\log(|\sigma_1(\varepsilon_i)|), \dots, \log(|\sigma_r(\varepsilon_i)|), 2\log(|\sigma_{r+1}(\varepsilon_i)|), \dots, 2\log(|\sigma_{r+s-1}(\varepsilon_i)|)$$

is called the regulator of K, denoted $\mathrm{Reg}_K$.

The regulator is the analogue of the discriminant for the units. In particular, fundamental systems of units can be detected using the regulator. Also, it is clear that the regulator is nonzero (by the above theorem) and independent of the choice of fundamental system of units, just as the discriminant is independent of the choice of integral basis.

Remark. Continued fractions (see almost any book on elementary number theory) show up in finding units for $\mathbb{Q}(\sqrt{d})$ when d > 0. This is because they are useful in finding integral solutions to Pell's equation $x^2 - dy^2 = 1$. This is the same as $\mathrm{Nm}_{\mathbb{Q}(\sqrt{d})/\mathbb{Q}}(x + y\sqrt{d}) = 1$, which would indicate that the solution gives a unit. The smallest solution to this equation via the method of continued fractions often yields a fundamental unit. If it does not, then it yields a small power (the square, the cube, or the sixth power) of a fundamental unit.
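The continued fraction method mentioned in the remark can be sketched as follows. The code runs the standard recurrence for the continued fraction expansion of √d and stops at the first convergent p/q with p² − dq² = 1. The function name is chosen for this example, and the routine is a plain illustration rather than an optimized implementation.

```python
from math import isqrt

def pell_fundamental_solution(d):
    """Smallest (x, y) with y > 0 and x^2 - d*y^2 == 1, for a non-square d > 0,
    found among the convergents of the continued fraction of sqrt(d)."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    m, den, a = 0, 1, a0       # state of the expansion: sqrt(d) ~ (sqrt(d) + m) / den
    p_prev, p = 1, a0          # numerators of consecutive convergents
    q_prev, q = 0, 1           # denominators of consecutive convergents
    while p * p - d * q * q != 1:
        m = den * a - m
        den = (d - m * m) // den
        a = (a0 + m) // den
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
    return p, q
```

For d = 2 this returns (3, 2), and 3 + 2√2 is the square of the fundamental unit 1 + √2, which has norm −1. For d = 5 it returns (9, 4), and 9 + 4√5 is the sixth power of the fundamental unit (1 + √5)/2.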


4.5 The Riemann-Roch Theorem For Number Fields

In this section, we take the analogies between algebraic geometry and number theory very far and formulate an arithmetic theorem analogous to the Riemann-Roch Theorem for projective curves in algebraic geometry. A note on prerequisites: The only material needed in order to understand the contents of this section is the material developed in these notes. The reader unfamiliar with algebraic geometry will probably find this section to be unmotivated, and it may be skipped. However, there is one lemma (Lemma 4.31) which is relevant here for the development of future material and it is not algebro-geometric in nature. It is, rather, Minkowski theoretic. The main result of this section will also be cited in Chapter 5, but its full power will not be needed there. Instead, the reader may refer to Lang [5] for the weaker theorem, when it comes up.

The philosophy of this section is that the primes (finite and infinite) of a number field should be viewed as points on a projective algebraic curve. The infinite primes will behave differently, because the completion of a number field at an infinite prime does not have a discrete valuation, which will explain the appearance of R in the divisor group of a number field. The number field itself should be viewed as the field of rational functions on this curve, and its ring of integers the regular functions. Finally, the constant base field should be viewed as µ(K) ∪ {0}, though this is not crucial. The reason for this is that we will have ker div = µ(K) for the divisor function we will define.

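Fix a number field K. We begin with the definition of our analogue of the divisor group, called the Arakelov divisors. This is the set of formal sums

$$\mathrm{Div}(\mathcal{O}_K) = \left\{ \sum_{\mathfrak{p} \in V_K} n_{\mathfrak{p}}\mathfrak{p} \;\middle|\; n_{\mathfrak{p}} \in \mathbb{Z} \text{ if } \mathfrak{p} \nmid \infty,\ n_{\mathfrak{p}} \in \mathbb{R} \text{ if } \mathfrak{p} \mid \infty,\ n_{\mathfrak{p}} = 0 \text{ for all but finitely many } \mathfrak{p} \right\}.$$

Thus,

$$\mathrm{Div}(\mathcal{O}_K) \cong \bigoplus_{\mathfrak{p} \nmid \infty} \mathbb{Z} \oplus \mathbb{R}^{r+s}$$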

where r is the number of real embeddings and 2s the number of complex embeddings. In particular, the group of fractional ideals $J(\mathcal{O}_K)$ embeds naturally into $\mathrm{Div}(\mathcal{O}_K)$. In fact, we will want to take advantage of this embedding. So for convenience, we would like to view $\mathrm{Div}(\mathcal{O}_K)$ multiplicatively, like the ideal class group. For this purpose, we define the replete ideals as

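$$J(\mathcal{O}_K) = \left\{ \prod_{\mathfrak{p} \in V_K} \mathfrak{p}^{n_{\mathfrak{p}}} \;\middle|\; n_{\mathfrak{p}} \in \mathbb{Z} \text{ if } \mathfrak{p} \nmid \infty,\ n_{\mathfrak{p}} \in \mathbb{R}_{>0} \text{ if } \mathfrak{p} \mid \infty,\ n_{\mathfrak{p}} = 0 \text{ for all but finitely many } \mathfrak{p} \right\}.$$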

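We will define an isomorphism $L : J(\mathcal{O}_K) \to \mathrm{Div}(\mathcal{O}_K)$ by essentially extracting valuations as follows:

$$\prod_{\mathfrak{p} \nmid \infty} \mathfrak{p}^{n_{\mathfrak{p}}} \times \prod_{\mathfrak{p} \text{ real}} \mathfrak{p}^{n_{\mathfrak{p}}} \times \prod_{\mathfrak{p} \text{ complex}} \mathfrak{p}^{n_{\mathfrak{p}}} \mapsto \sum_{\mathfrak{p} \nmid \infty} -n_{\mathfrak{p}}\mathfrak{p} + \sum_{\mathfrak{p} \text{ real}} (\log n_{\mathfrak{p}})\mathfrak{p} + \sum_{\mathfrak{p} \text{ complex}} (2\log n_{\mathfrak{p}})\mathfrak{p}.$$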

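We also define a map $\mathrm{div} : K^\times \to \mathrm{Div}(\mathcal{O}_K)$ by first defining $d : K^\times \to J(\mathcal{O}_K)$ by

$$\alpha \mapsto \prod_{\mathfrak{p} \nmid \infty} \mathfrak{p}^{v_{\mathfrak{p}}(\alpha)} \times \prod_{\mathfrak{p} \mid \infty} \mathfrak{p}^{|\alpha|_{\mathfrak{p}}}$$

where $|\cdot|_{\mathfrak{p}}$ is the absolute value induced by the embedding associated to the infinite prime $\mathfrak{p}$. Then

$$\mathrm{div} = L \circ d.$$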

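Finally we have an analogue of the degree function, $\deg : \mathrm{Div}(\mathcal{O}_K) \to \mathbb{R}$, given by

$$\sum n_{\mathfrak{p}}\mathfrak{p} \mapsto \sum_{\mathfrak{p} \nmid \infty} n_{\mathfrak{p}} \log \mathrm{N}\mathfrak{p} + \sum_{\mathfrak{p} \mid \infty} n_{\mathfrak{p}}.$$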

Theorem 4.26. deg ◦ div = 0.

Proof. This is just a matter of unraveling the definitions and applying the Product Formula (Theorem 3.40).

Let P (OK) = Im d where d is the map in div. Then we may define

Pic(OK) = J(OK)/P (OK).

This is the analogue of the Picard group.

Let $\mathfrak{a}$ be a replete ideal. It has a finite component $\mathfrak{a}_f$ in the obvious sense, which can naturally be identified with a fractional ideal. The remaining infinite component is denoted by $\mathfrak{a}_\infty$. We can "embed" a replete ideal into n-space by embedding its finite component in the usual way, and then we scale the lattice by the infinite components as follows. We identify $\mathbb{R}^n \cong \mathbb{R}^r \times \mathbb{C}^s$. If $(x_1, \dots, x_{r+s}) \in \mathbb{R}^r \times \mathbb{C}^s$ comes from the finite part $\mathfrak{a}_f$ of the replete ideal, then we choose to scale the ith component by $n_{\mathfrak{p}_i}$, where $\mathfrak{p}_i$ is the infinite prime corresponding to the ith component.

One convention we set here is to endow C with twice the Lebesgue measure. This means that the space $\mathbb{R}^r \times \mathbb{C}^s$ has $2^s$ times its Lebesgue measure. This turns out to often be the right choice for arithmetic purposes. With this in mind, we define the Euler characteristic associated to a replete ideal $\mathfrak{a}$ to be

χ(a) = − log Vol(a)

where the volume above means the volume of the fundamental parallelopiped associated to the lattice in $\mathbb{R}^n$ coming from $\mathfrak{a}$.

The analogy between this definition and the usual definition of the Euler characteristic in algebraic geometry requires lengthy explanation in order to be understood. Briefly, one explanation is that it is equivalent to the definition which comes naturally from the arithmetic analogue of the Grothendieck-Riemann-Roch Theorem. Another is explained adelically via some of my research. In both cases this definition turns out to be the same, incidentally, as the natural one.

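It is easy to compute the Euler characteristic. For a replete ideal $\mathfrak{a} = \prod \mathfrak{p}^{n_{\mathfrak{p}}}$, define

$$\mathrm{N}\mathfrak{a} = \mathrm{N}\mathfrak{a}_f \cdot \prod_{\mathfrak{p} \mid \infty} n_{\mathfrak{p}}.$$

Then

$$\chi(\mathfrak{a}) = -\log\bigl(\sqrt{|\Delta_K|}\,\mathrm{N}\mathfrak{a}\bigr)$$

because of Proposition 4.6. (Keep in mind that the complex components have twice the Lebesgue measure.)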

Write $O = \prod \mathfrak{p}^{1}$ for the identity of $J(\mathcal{O}_K)$. Then the next theorem follows immediately from these computations.

Theorem 4.27 (Riemann-Roch, first form). For any replete ideal a, we have

χ(a) = deg a + χ(O)

where by deg(a) we mean deg(L(a)) = logNa.

It may be surprising that the proof is so easy. The difficulty is in showing that χ behaves correctly (in a manner to be explained), and we have not yet shown this. This step will be the arithmetic analogue of Serre Duality.

We would like to be able to state the Riemann-Roch Theorem now in a way which is susceptible to Serre Duality. To do this, we define an analogue of the vector space of global sections of the line bundle associated to a divisor, i.e. $H^0(X, D)$. For a replete ideal $\mathfrak{a} = \prod \mathfrak{p}^{n_{\mathfrak{p}}}$, we define

$$H^0(\mathfrak{a}) = \{\alpha \in K^\times \mid \nu_{\mathfrak{p}}(\alpha) \le n_{\mathfrak{p}}\}$$

where $\nu_{\mathfrak{p}}(\alpha) = v_{\mathfrak{p}}(\alpha)$ for $\mathfrak{p}$ finite and $\nu_{\mathfrak{p}}(\alpha) = |\alpha|_{\mathfrak{p}}$ for $\mathfrak{p}$ infinite. This is a finite set because the restrictions on the finite primes force $H^0(\mathfrak{a})$ to be in the lattice associated to $\mathfrak{a}_f$ and the conditions at the infinite primes bound these lattice points in space.

As an analogue of the dimension of this space, we define

$$\ell(\mathfrak{a}) = \log\frac{|H^0(\mathfrak{a})|}{2^r(2\pi)^s}.$$

The factor on the bottom comes from the fact that the conditions on $H^0(O)$ at the infinite primes bound a set in $\mathbb{R}^r \times \mathbb{C}^s$ of volume $2^r(2\pi)^s$. It is a normalization factor.

In the case of the replete ideal O, we can consider the constant $g = \ell(O) - \chi(O)$, which should be the analogue of the genus. We see that

$$g = \log\frac{w_K\sqrt{|\Delta_K|}}{2^r(2\pi)^s}$$

where $w_K = |\mu(K)|$.

Define in general

$$i(\mathfrak{a}) = \ell(\mathfrak{a}) - \chi(\mathfrak{a}).$$

The next theorem follows immediately from these definitions.

Theorem 4.28 (Riemann-Roch). Let a be a replete ideal. Then

`(a)− i(a) = deg(a) + `(O)− g.

The analogue of Serre Duality is the only thing left. It is essentially a Schwartz type vanishing condition on $i(\mathfrak{a})$ in the sense of harmonic analysis. Recall that in algebraic geometry, the index of specialty of a divisor vanishes very quickly with the degree. The index of specialty we have in arithmetic will not completely vanish, however, because we allowed the degree to take on real values.

The proof of the analogue of Serre Duality will require some lemmas. We will be using Minkowski theory to obtain not an existence theorem, but rather an asymptotic result. Hence we will require a finer analysis of the number of lattice points which fall into a given region.

To state the first lemma, we need a couple of definitions. The first is standard from analysis.

Definition 4.29. Let $S \subset \mathbb{R}^k$ for some $k$, and for some $n$, let $f : S \to \mathbb{R}^n$ be a map. Then $f$ is said to satisfy a Lipschitz condition if there is a constant $C$ such that

|f(x)− f(y)| ≤ C|x− y|

for all x, y ∈ S.

Let $I^k = \{(x_1, \dots, x_k) \in \mathbb{R}^k \mid 0 \le x_i \le 1 \text{ for all } i\}$ denote the $k$-dimensional unit cube.

Definition 4.30. A subset $E \subset \mathbb{R}^n$ is said to be $k$-Lipschitz parametrizable if there are finitely many Lipschitz maps $I^k \to E$ whose images cover $E$.

We will consider a bounded region $D \subset \mathbb{R}^n$. The notation $\partial D$ will be used for its boundary. For $t \in \mathbb{R}_{>0}$, we let $tD = \{tx \mid x \in D\}$. Clearly $\partial(tD) = t\partial D$. The rigorous proof is by double inclusion.

Lemma 4.31. Let $D \subset \mathbb{R}^n$ be a bounded region and assume $\partial D$ is $(n-1)$-Lipschitz parametrizable. Let $\Lambda \subset \mathbb{R}^n$ be a lattice with fundamental parallelepiped $F$. For $t \in \mathbb{R}_{>0}$, let $N(t)$ be the number of lattice points of $\Lambda$ in $tD$. Then

\[ N(t) = \frac{\mathrm{Vol}(D)}{\mathrm{Vol}(F)}\, t^n + O(t^{n-1}) \]

where $O(f(t))$ is the standard notation denoting a function whose absolute value, for sufficiently large $t$, is bounded above by $Cf(t)$ for some constant $C$. In this case, the constant in the $O$ term depends on $D$, $\Lambda$, and the Lipschitz constants.

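Proof. For $\lambda \in \Lambda$, let $F_\lambda$ be $F$ shifted by $\lambda$. It is clear that $\mathbb{R}^n$ is the union of the $F_\lambda$'s, overlapping only at the boundaries. If $\lambda \in tD$, then $F_\lambda$ intersects $tD$ and we must have that $F_\lambda$ either lies in the interior of $tD$ or that $F_\lambda$ intersects $\partial(tD) = t\partial D$. Let $M(t)$ be the number of $\lambda \in \Lambda$ with $F_\lambda$ contained in the interior of $tD$, and let $B(t)$ be the number of $\lambda \in \Lambda$ with $F_\lambda$ intersecting $t\partial D$. Then we have

\[ M(t) \le N(t) \le M(t) + B(t). \]

But

\[ M(t)\,\mathrm{Vol}(F) \le \mathrm{Vol}(tD) \le (M(t) + B(t))\,\mathrm{Vol}(F). \]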

Therefore, we have

\[ M(t) \le \frac{\mathrm{Vol}(D)}{\mathrm{Vol}(F)}\, t^n \le M(t) + B(t) \]

and so we only need to show $B(t) = O(t^{n-1})$.

Let $f : I^{n-1} \to \partial D$ be one of the Lipschitz maps which parametrize $\partial D$, with constant $C$. Then it is clear that $tf$ parametrizes $\partial(tD)$ with the constant $tC$. Partition each side of $I^{n-1}$ into $\lfloor t \rfloor$ equal pieces so that $I^{n-1}$ is partitioned into $\lfloor t \rfloor^{n-1}$ equal cubes (here $\lfloor x \rfloor$ denotes the greatest integer in $x$). Each of these small cubes has diameter $\sqrt{n-1}/\lfloor t \rfloor$. Therefore, the image of each small cube under $tf$ has diameter at most $Ct\sqrt{n-1}/\lfloor t \rfloor \le 2C\sqrt{n-1}$ (using $t/\lfloor t \rfloor \le 2$ for $t \ge 1$). This bounds above the number of $F_\lambda$ which can intersect the image of one of the small cubes by some constant $C'$ depending on $C$ and the volume of $F$ (hence on $\Lambda$). This means exactly that $B(t) \le C'' k \lfloor t \rfloor^{n-1}$ where $k$ is the number of Lipschitz maps parametrizing the boundary of $D$, and $C''$ is the maximum of the $C'$'s obtained as above for each Lipschitz map. This implies the lemma.
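The lemma can be observed numerically in the simplest case. The sketch below is an illustration under stated assumptions, not part of the proof: it takes Λ = Z^2 (so Vol(F) = 1), D the closed unit disk (so Vol(D) = π and n = 2), and integer values of t; the function name is hypothetical.

```python
import math

def lattice_points_in_disk(t):
    """Count points of Z^2 in tD, where D is the closed unit disk and t is a
    non-negative integer."""
    count = 0
    for x in range(-t, t + 1):
        # Largest y >= 0 with x^2 + y^2 <= t^2; the column over x then
        # contains the 2*y_max + 1 points (x, -y_max), ..., (x, y_max).
        y_max = math.isqrt(t * t - x * x)
        count += 2 * y_max + 1
    return count

for t in (10, 100, 1000):
    n_t = lattice_points_in_disk(t)
    error = n_t - math.pi * t * t
    # The lemma predicts that error / t^(n-1) = error / t stays bounded.
    print(t, n_t, error / t)
```

The ratio printed in the last column is the quantity that the lemma asserts to be bounded as t grows.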

The next proposition has a direct analogy in algebraic geometry.

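Proposition 4.32. The functions $\chi$ and $\ell$ depend only on the class of a replete ideal in $\mathrm{Pic}(\mathcal{O}_K)$.

Proof. By definition, it is clear that for a replete ideal $\mathfrak{a}$, $\mathrm{Vol}(\mathfrak{a})$ depends only on the degree of $\mathfrak{a}$. So use $\deg \circ \operatorname{div} = 0$. For $\ell$, the proposition follows from the bijection $H^0(\mathfrak{a}) \to H^0(\alpha\mathfrak{a})$ given by $x \mapsto \alpha x$ for $\alpha \in K^\times$.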

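Lemma 4.33. Let $h = h_K$ be the class number of $K$ and let $\mathfrak{a}_1, \dots, \mathfrak{a}_h$ be representatives for the ideal class group in $J(\mathcal{O}_K)$. Let $c > 0$ and

\[ A_i(c) = \Bigl\{ \mathfrak{a} = \prod_{\mathfrak{p}} \mathfrak{p}^{n_\mathfrak{p}} \in J(\mathcal{O}_K) \Bigm| \mathfrak{a}_f = \mathfrak{a}_i,\ n_\mathfrak{p}^{f_\mathfrak{p}} \le c\,(N\mathfrak{a})^{f_\mathfrak{p}/n} \text{ for } \mathfrak{p} \mid \infty \Bigr\} \]

where $f_\mathfrak{p} = [K_\mathfrak{p} : \mathbb{R}]$ (and so is either 1 or 2). Then $c$ may be chosen so that

\[ J(\mathcal{O}_K) = \bigcup_{i=1}^{h} A_i(c)\, P(\mathcal{O}_K). \]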

Proof. Let $B_i = \{\mathfrak{a} \in J(\mathcal{O}_K) \mid \mathfrak{a}_f = \mathfrak{a}_i\}$. After multiplication by a suitable element of $P(\mathcal{O}_K)$, obviously we can put any element of $J(\mathcal{O}_K)$ into one of the $B_i$'s. So we want to show that $B_i \subset A_i(c)P(\mathcal{O}_K)$ for a suitable $c$. This will suffice.

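To do this, consider a replete ideal $\mathfrak{a} = \mathfrak{a}_i \mathfrak{a}_\infty \in B_i$. Write $\mathfrak{a}_\infty = \prod_{\mathfrak{p} \mid \infty} \mathfrak{p}^{n_\mathfrak{p}}$. Normalize $\mathfrak{a}_\infty$ as

\[ \mathfrak{a}'_\infty = \mathfrak{a}_\infty (N\mathfrak{a}_\infty)^{-1/n} \]

and write

\[ \mathfrak{a}'_\infty = \prod_{\mathfrak{p} \mid \infty} \mathfrak{p}^{n'_\mathfrak{p}}. \]

Then the vector $(f_\mathfrak{p} \log n'_\mathfrak{p})_{\mathfrak{p} \mid \infty} \in \mathbb{R}^{r+s}$ lies in trace-zero space $H = \{(x_1, \dots, x_{r+s}) \in \mathbb{R}^{r+s} \mid \sum x_i = 0\}$. Let $\log$ be as in the previous section so that $\log(\mathcal{O}_K^\times) \subset H$ forms a full lattice. If $c_0$ is the diameter of the fundamental parallelepiped of this lattice, then $(f_\mathfrak{p} \log n'_\mathfrak{p})_{\mathfrak{p} \mid \infty}$ lies within a distance of $c_0$ from $\log(u)$ for some $u \in \mathcal{O}_K^\times$. Thus, for this $u$,

\[ \bigl| f_\mathfrak{p} \log n'_\mathfrak{p} - f_\mathfrak{p} \log|\sigma_\mathfrak{p}(u)| \bigr| \le f_\mathfrak{p} c_0 \]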

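where $\sigma_\mathfrak{p}$ has the obvious meaning. Thus

\begin{align*}
\log n_\mathfrak{p} - \log|\sigma_\mathfrak{p}(u)| &= \log n'_\mathfrak{p} + \frac{1}{n}\log N\mathfrak{a}_\infty - \log|\sigma_\mathfrak{p}(u)| \\
&\le \frac{1}{n}\log N\mathfrak{a}_\infty + c_0 \\
&= \frac{1}{n}\log N\mathfrak{a} + c_i
\end{align*}

where $c_i = c_0 - \frac{1}{n}\log N\mathfrak{a}_i$. Let $\mathfrak{b} = u^{-1}\mathfrak{a}$. Multiplication by a unit does not change the finite part of a replete ideal, so $\mathfrak{b} \in B_i$. Furthermore, if $\mathfrak{b} = \prod \mathfrak{p}^{\nu_\mathfrak{p}}$, then

\[ f_\mathfrak{p} \log \nu_\mathfrak{p} = f_\mathfrak{p}\bigl(\log n_\mathfrak{p} - \log|\sigma_\mathfrak{p}(u)|\bigr) \le \frac{f_\mathfrak{p}}{n}\log N\mathfrak{a} + 2c_i. \]

Thus $\nu_\mathfrak{p}^{f_\mathfrak{p}} \le e^{2c_i}(N\mathfrak{a})^{f_\mathfrak{p}/n}$ for infinite $\mathfrak{p}$. Since $N\mathfrak{b} = N\mathfrak{a}$, this gives $\mathfrak{b} = u^{-1}\mathfrak{a} \in A_i(e^{2c_i})$. Hence, taking the maximum $c$ of the $e^{2c_i}$, we find that $\mathfrak{a} \in A_i(c)P(\mathcal{O}_K)$, as desired.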

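Theorem 4.34 (Arithmetic Serre Duality). As $\mathfrak{a}$ ranges through $J(\mathcal{O}_K)$, we have the estimate

\[ |H^0(\mathfrak{a}^{-1})| = \frac{2^r(2\pi)^s}{\sqrt{|\Delta_K|}}\, N\mathfrak{a} + O\bigl((N\mathfrak{a})^{1-\frac{1}{n}}\bigr). \]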

Proof. We prove the estimate for the set $H^0(\mathfrak{a}) \cup \{0\}$, which suffices because this differs from $H^0(\mathfrak{a})$ by one element. By Proposition 4.32, we need only focus on a representative of each class in $\mathrm{Pic}(\mathcal{O}_K)$. By the previous lemma, we may focus further on elements of $A_i(c)$ as in that lemma. So let $\mathfrak{a} = \prod \mathfrak{p}^{n_\mathfrak{p}} \in A_i(c)$.

Now $H^0(\mathfrak{a}^{-1})$ may be identified with those points $(x_1, \dots, x_r, z_{r+1}, \dots, z_{r+s}) \in \mathbb{R}^r \times \mathbb{C}^s$ in the lattice $i(\mathfrak{a}_f^{-1})$ which are, furthermore, bounded by the conditions $|x_i| \le n_\mathfrak{p}$ or $|z_i| \le n_\mathfrak{p}$, where $\mathfrak{p}$ is the infinite prime corresponding to the $i$th component. The lattice $i(\mathfrak{a}_f^{-1})$ has fundamental parallelepiped $F$ with $\mathrm{Vol}(F) = \sqrt{|\Delta_K|}\,(N\mathfrak{a}_f)^{-1}$. Let $D_i$ be the unit interval $[-1, 1]$ in $\mathbb{R}$ or the unit disk in $\mathbb{C}$, depending on whether $i$ is a real or complex component. Let $D = \prod_i n_\mathfrak{p} D_i$ where, once again, the infinite prime $\mathfrak{p}$ is the one corresponding to the $i$th component. Then the conditions described above on the lattice points at the infinite places are equivalent to the condition that the lattice point is in $D$. Note that $D$ has volume $\mathrm{Vol}(D) = 2^r(2\pi)^s N\mathfrak{a}_\infty$. Assume for now that $\partial D$ is $(n-1)$-Lipschitz parametrizable. If we let $\mathfrak{a}$ go to $t\mathfrak{a}$, then $N\mathfrak{a}$ goes to $t^n N\mathfrak{a}$ and, furthermore, we have the estimate

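\[ |H^0((t\mathfrak{a})^{-1})| = \frac{\mathrm{Vol}(D)}{\mathrm{Vol}(F)}\, t^n + O(t^{n-1}) = \frac{2^r(2\pi)^s}{\sqrt{|\Delta_K|}}\, t^n N\mathfrak{a} + O\bigl((t^n N\mathfrak{a})^{1-\frac{1}{n}}\bigr). \]

Here, by $t\mathfrak{a}$ we mean the replete ideal obtained from $\mathfrak{a}$ by multiplying the infinite components by $t$.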

It remains to prove that $\partial D$ is $(n-1)$-Lipschitz parametrizable. Well, $\partial D = \bigcup_i \bigl(n_i \partial D_i \times \prod_{j \ne i} n_j D_j\bigr)$, so we need to prove that there is a Lipschitz map $I^{n-1} \to n_i \partial D_i \times \prod_{j \ne i} n_j D_j$ for each $i$. If $i$ is real, we further separate this set into two copies of $\prod_{j \ne i} n_j D_j$. In this case, we parametrize componentwise and use $t \mapsto 2t - 1$ on the real components, and $(r, \theta) \mapsto r\cos\theta + ir\sin\theta$ on the complex components. This is obviously Lipschitz. If the $i$th component is complex, we use $\theta \mapsto e^{i\theta}$ on the $i$th component and the maps above on the rest. This is also Lipschitz, and we thus have our parametrization. This completes the proof.

Corollary 4.35. If we define $\ell$ and $i$ for Arakelov divisors $D$ via composition with $L^{-1}$, then we have

\[ \ell(D) - i(D) = \deg D + \ell(\mathcal{O}) - g \]

where $i(D) = O\bigl(\exp(-\tfrac{1}{n}\deg D)\bigr)$ as $\deg D \to \infty$.

Remark. This version of Arithmetic Serre Duality, due to Serge Lang, is approximate. Van der Geer and Schoof redefined slightly the function $\ell$ to give an exact Serre Duality. In fact, $i(D)$ is replaced in their theory by $\ell(D_{K/\mathbb{Q}} - D)$. It had also been well known that the absolute different is the correct analogue of the canonical divisor from algebraic geometry, so its appearance here makes sense.

Their definition of $\ell$ weights the points of $\mathfrak{a}_f$ with an "effectivity", in the sense that a divisor is no longer either effective or not effective. Instead, the effectivity is a number between 0 and 1 which decays extremely quickly. See [12] for their paper.


5 Adeles and Ideles

Let K be a number field. In this chapter we construct a ring containing K and all of its completions at any absolute value. It will almost be the cartesian product of all completions, but there will be a certain finiteness restriction. The resulting ring will be locally compact as a topological ring. These are the adeles.

Adelic methods can act as a replacement for the Minkowski-style methods, or as they are also called, the geometry of numbers, which we saw in the last chapter. In particular, all three Minkowski theoretic results of the last chapter (the finiteness of the class number, the unit theorem, and Lang's arithmetic analogue of Serre Duality) have proofs which stay within the adeles instead of analyzing Rn. The proofs of the first two theorems are classical, and the third is mine. However, the latter requires a good deal of the theory of measures and Fourier analysis on locally compact abelian groups. It will be given at the end of these notes.

Adeles and ideles seem to be the right way to work globally in arithmetic. By this we mean they are the right way to consider all primes at once. A great example of this will be class field theory, where all of the main theorems of global class field theory can be naturally formulated in terms of ideles.

Sources

The material of this section is very loosely based on Lang [5].

5.1 Definitions

Definition 5.1. A group $G$ is called a topological group if it has been endowed with a topology such that the functions

\[ (x, y) \mapsto xy : G \times G \to G \]

\[ x \mapsto x^{-1} : G \to G \]

are continuous. A topological ring is a ring whose additive group is a topological group, and which has a continuous multiplication.

It follows at once that if $R$ is a topological ring, then multiplication by an element of $R^\times$ is an automorphism of the additive group which is a homeomorphism. Similarly, translation is a homeomorphism on a topological group.

An important thing to note about topological groups is that, to specify a base of neighborhoods, it is enough to specify one about any point. One can then translate the base to other points.

Important examples of topological groups for us are the completions of a number field $K$. We will see very soon that the adeles and the ideles have a very nice topological structure.

We now define the adeles. Let $K$ be a number field. Recall that $V_K$ is the set of absolute values on $K$ up to equivalence. For $v \in V_K$, let $K_v$ denote the completion of $K$ with respect to $v$. Thus $K_v$ is $\mathbb{R}$, $\mathbb{C}$ or a p-adic field. Then we define

\[ \mathbb{A}_K = \Bigl\{ (\alpha_v)_{v \in V_K} \in \prod_{v \in V_K} K_v \Bigm| \alpha_v \in \mathcal{O}_v \text{ for all but finitely many } v \Bigr\}. \]

These are the adeles. They are a ring under componentwise addition and multiplication. We give them a topology by specifying a base of open subsets about 0 to be

\[ \Bigl\{ \prod_{v \in V_K} U_v \Bigm| U_v \text{ open in } K_v,\ 0 \in U_v,\ U_v = \mathcal{O}_v \text{ for all but finitely many } v \notin V_\infty \Bigr\}. \]

Thus $\mathbb{A}_K$ is locally compact because there is a base of compact neighborhoods about 0 (recall that $\mathcal{O}_v$ is compact for finite $v$ and that every $K_v$ is locally compact).

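Next we define

\[ \mathbb{I}_K = \mathbb{A}_K^\times = \Bigl\{ (\alpha_v)_{v \in V_K} \in \prod_{v \in V_K} K_v^\times \Bigm| \alpha_v \in \mathcal{O}_v^\times \text{ for all but finitely many } v \Bigr\}. \]

These are the ideles. A base of open sets about 1 is

\[ \Bigl\{ \prod_{v \in V_K} U_v \Bigm| U_v \text{ open in } K_v^\times,\ 1 \in U_v,\ U_v = \mathcal{O}_v^\times \text{ for all but finitely many } v \notin V_\infty \Bigr\}. \]

The reader must be careful here, as this does not give the subspace topology. In fact this topology is finer. Once again $\mathbb{I}_K$ is locally compact.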

We can embed K into AK diagonally via the ring homomorphism

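\[ \alpha \mapsto (\alpha, \alpha, \dots). \]

This is well defined because $v_\mathfrak{p}(\alpha) = 0$ for all but finitely many $\mathfrak{p}$. We identify $K$ with its embedding in $\mathbb{A}_K$. This embedding, when restricted to $K^\times$, gives an embedding $K^\times \to \mathbb{I}_K$.

For $S$ a finite set of primes including the infinite primes, we let $\mathbb{A}_S$ be those adeles whose $v$ components are in $\mathcal{O}_v$ for $v \notin S$. If $S = V_\infty$, we write $\mathbb{A}_\infty$ for $\mathbb{A}_S$. Define similarly $\mathbb{I}_S$ and $\mathbb{I}_\infty$.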

Theorem 5.2 (Approximation Theorem).

AK = K + A∞.

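Proof. Let $x = (x_v)_{v \in V_K} \in \mathbb{A}_K$ and let $m \in \mathbb{Z}$ be nonzero and such that $mx_v \in \mathcal{O}_v$ for all $v \notin V_\infty$. Let $\prod_i \mathfrak{p}_i^{n_i}$ be the prime decomposition of $m$ in $\mathcal{O}_K$, and let $v_i$ be the place corresponding to $\mathfrak{p}_i$. Let $\alpha \in \mathcal{O}_K$ be such that $mx_{v_i} \equiv \alpha \pmod{\mathfrak{p}_i^{n_i}}$ for all $i$. Then $x - \alpha/m$ has integral components at the finite places. This proves the theorem.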
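The proof is constructive, and for K = Q it can be carried out by hand. The following sketch follows the same steps (clear denominators with m, then solve the congruences by the Chinese Remainder Theorem). It is only an illustration: the function names are hypothetical, and each non-integral component x_p is assumed to be given by a nonzero rational number with p-power denominator standing in for its p-adic expansion.

```python
from fractions import Fraction
from math import prod

def vp(q, p):
    """p-adic valuation of a nonzero Fraction q."""
    if q == 0:
        raise ValueError("valuation of 0 is infinite")
    v, num, den = 0, q.numerator, q.denominator
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

def approximate(finite_part):
    """finite_part: dict {p: x_p} listing the components that may fail to be
    integral (all other finite components are assumed integral).
    Returns alpha/m in Q with x_p - alpha/m in Z_p for every listed p."""
    # m = prod p^{n_p} clears all denominators.
    exps = {p: max(0, -vp(x, p)) for p, x in finite_part.items()}
    m = prod(p**e for p, e in exps.items())
    # Chinese Remainder Theorem: alpha = m * x_p (mod p^{n_p}) for each p.
    alpha = 0
    for p, x in finite_part.items():
        mod = p ** exps[p]
        if mod == 1:
            continue
        rest = m // mod                  # coprime to p
        target = int(m * x) % mod        # m * x is an integer by assumption
        alpha += rest * ((target * pow(rest, -1, mod)) % mod)
    return Fraction(alpha % m, m)

x = {2: Fraction(1, 4), 3: Fraction(2, 3)}
q = approximate(x)
print(q, [vp(x_p - q, p) for p, x_p in x.items()])
```

For the sample input the element α/m is 11/12, and both differences have non-negative valuation, so x − α/m lies in A∞ as the theorem asserts.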

Remark. Both this theorem and the earlier Approximation Theorem 3.6 may be seen as extensions of the Chinese Remainder Theorem. This theorem is extremely similar to the earlier Approximation Theorem, which was visibly about approximation. Whence the name.


There is a homomorphism $\mathbb{I}_K \to \mathbb{R}_{>0}$ defined as follows. For $x = (x_v)_{v \in V_K} \in \mathbb{I}_K$, we define $|\cdot| : \mathbb{I}_K \to \mathbb{R}_{>0}$ to be

\[ |x| = \prod_{v \in V_K} |x_v|_v \]

where $|\cdot|_v$ is the absolute value associated to $v$. This is well defined because $|x_v|_v = 1$ for all but finitely many $v$ by definition. It is obviously a homomorphism, and it is continuous (exercise). Its kernel is denoted $\mathbb{I}^1_K$. The quotient $C_K = \mathbb{I}_K/K^\times$ is called the idele class group (This conflicts with our notation for the ideal class group. We will always try to make clear which group we are working with). We will see later that the idele class group has all of the ray class groups from class field theory as quotients. The elements of the subgroup $C^1_K = \mathbb{I}^1_K/K^\times$ are the norm-one idele classes.

Theorem 5.3 (Product Formula). $K^\times \subset \mathbb{I}^1_K$.

Proof. This is an immediate consequence of the product formula (Theorem 3.40) we saw before.

Remark. This is another analogue of the theorem deg ◦ div = 0 from algebraic geometry.
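For K = Q the statement can be checked directly on any nonzero rational number. The sketch below is illustrative; the helper names are hypothetical, and it uses the usual normalization |q|_p = p^(-v_p(q)) together with the ordinary absolute value at the infinite place.

```python
from fractions import Fraction

def padic_abs(q, p):
    """|q|_p = p^(-v_p(q)) for a nonzero rational q."""
    num, den, v = q.numerator, q.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return Fraction(1, p**v) if v >= 0 else Fraction(p ** (-v))

def prime_factors(n):
    """Set of primes dividing the nonzero integer n."""
    n, primes, d = abs(n), set(), 2
    while d * d <= n:
        while n % d == 0:
            primes.add(d)
            n //= d
        d += 1
    if n > 1:
        primes.add(n)
    return primes

def idele_norm_of_rational(q):
    """Product of |q|_v over all places v of Q. Only primes dividing the
    numerator or denominator contribute a factor different from 1."""
    norm = abs(q)                                  # the infinite place
    for p in prime_factors(q.numerator) | prime_factors(q.denominator):
        norm *= padic_abs(q, p)
    return norm

print(idele_norm_of_rational(Fraction(-360, 77)))
```

The arithmetic is exact because everything is done with `Fraction`, so the result is the rational number 1 and not merely a floating point approximation to it.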

We should prove a fact about the topology of K and K× in AK and IK .

Proposition 5.4. $K$ is discrete in $\mathbb{A}_K$ and $K^\times$ is discrete in $\mathbb{I}_K$.

Proof. We need to construct a neighborhood of 0 in $\mathbb{A}_K$ which contains no other element of $K$. Consider the neighborhood $\mathbb{A}_\infty$. The other elements of $K$ in this neighborhood are only the ones in $\mathcal{O}_K$ (why?). $\mathcal{O}_K$ is a lattice in the infinite components. Thus we can restrict the infinite components to a neighborhood which contains only 0 from $K$.

The proof for $K^\times \subset \mathbb{I}_K$ is similar (but note that the methods of the proof of the unit theorem are not needed as we never consider the logarithmic embedding).

5.2 Compactness Theorems

In this section we prove some theorems about compactness of certain adelic constructions. Here is the first.

Theorem 5.5. $\mathbb{A}_K/K$, with the quotient topology, is compact.

Proof. By the Approximation Theorem 5.2, every adele can be translated by an element of $K$ to $\mathbb{A}_\infty$. Then we can bound this adele at the infinite places by shifting by an algebraic integer, since $\mathcal{O}_K$ is a lattice in $\prod_{v \in V_\infty} K_v$. Thus all elements of $\mathbb{A}_K/K$ have a representative within a fixed compact subset of the adeles. This proves the compactness of $\mathbb{A}_K/K$.

We now wish to prove that $C^1_K$ is compact. We will use Lang's arithmetic analogue of Serre Duality to do this, but it is worth noting that it can be done without this by using a much weaker result as in Lang [5], or skipping the Minkowski theory altogether. The result in Lang asserts that for a replete ideal $\mathfrak{a}$ with sufficiently large norm, $H^0(\mathfrak{a})$ is non-empty. Ramakrishnan and Valenza [10] develop the theory of adeles and ideles using the theory of the Haar measure on locally compact abelian groups and avoid all Minkowski theory. In particular, the proof we give of Lang's arithmetic analogue of Serre duality at the end of these notes is not redundant, and the proofs we give of the finiteness of the class number and the unit theorem in this chapter could have no Minkowski theory in their background.

Let $\rho > 1$. $C^1_K$ is the preimage of 1 under the norm map. We consider the preimage of $\rho$ under the norm map, and denote it by $C^\rho_K$. It is clearly homeomorphic to $C^1_K$ via the map which is translation by an idele of norm $\rho$. Such an idele exists. Indeed, consider the idele which is 1 at finite places and $\rho^{1/n}$ at the infinite places, where $n = [K : \mathbb{Q}]$. Thus it suffices to prove that $C^\rho_K$ is compact for some $\rho$.

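Lemma 5.6. There is a constant $c$, depending only on $K$, such that if $x = (x_v)_{v \in V_K} \in \mathbb{I}_K$ is such that $|x| \ge c$, then there is an $\alpha \in K^\times$ with $1 \le |\alpha x_v|_v \le |x|$ for all $v \in V_K$.

Proof. Consider the replete ideal $\mathfrak{a}$ with $v$-component $|x_v|_v$. Then $|x| = N\mathfrak{a}$. By Theorem 4.34, for sufficiently large $N\mathfrak{a}$, $H^0(\mathfrak{a})$ has large order, hence is non-empty. This means precisely that there is a $\beta \in K^\times$ with $|\beta|_v \le |x_v|_v$ for all $v$. Hence $1 \le |\beta^{-1}x_v|_v$ for all $v$. Write $\rho = |x|$; by the product formula, $|\beta^{-1}x| = \rho$ as well. Therefore

\[ 1 \le |\beta^{-1}x_v|_v = \frac{\prod_{w \in V_K} |\beta^{-1}x_w|_w}{\prod_{w \ne v} |\beta^{-1}x_w|_w} \le \frac{\rho}{1} = \rho. \]

We take $\alpha = \beta^{-1}$ and obtain the lemma.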

Theorem 5.7. $C^1_K$ is compact.

Proof. Let $\rho \ge c$ be as in the lemma and $x \in \mathbb{I}_K$ of norm $\rho$. Then there is an $\alpha$ such that $1 \le |\alpha x_v|_v \le |x|$. In fact, $|\alpha x_v|_v = 1$ for all but finitely many $v$ by the definition of idele. The annuli $\{a \in K_v^\times \mid 1 \le |a|_v \le \rho\}$ are compact because, in the infinite case, they are closed and bounded; in the finite case, they are a union of finitely many translates of the unit circle. Thus the set described by the above conditions, namely $1 \le |\alpha x_v|_v \le |x|$ and $|\alpha x_v|_v = 1$ for all but finitely many fixed $v$, is compact, as it is a product of compact sets.

Under the quotient map $\mathbb{I}_K \to C_K$, $C^\rho_K$ is contained in the image of the compact set described above. It is closed because it is a preimage of the closed set $\{\rho\}$ under the norm map, which is continuous. Hence it is a closed subset of a compact set, so it is compact. As mentioned before the lemma, this is enough to prove the theorem.

5.3 S-Units and the Recovery of the Unit Theorem and the Finiteness of $h_K$

In this section we will recover the finiteness of the class number of a number field $K$ and the unit theorem. We also introduce a generalization of the units in $\mathcal{O}_K$ and prove a generalization of the unit theorem for them. We will note once again that the theorems of this chapter can be treated without any Minkowski theory.

Let $C^1_\infty$ denote the quotient $(\mathbb{I}^1_K \cap \mathbb{I}_\infty)/(K^\times \cap \mathbb{I}_\infty)$. Then $C^1_\infty$ embeds as a subgroup of $C^1_K$ which is open because $\mathbb{I}^1_\infty$ is open in $\mathbb{I}^1_K$.

Theorem 5.8. $C(\mathcal{O}_K) \cong C^1_K/C^1_\infty$.


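Proof. First, for $v$ finite, let $\mathfrak{p}_v$ be the ideal associated to $v$. Let $x = (x_v)_{v \in V_K}$ be an idele. We can associate to it a fractional ideal by

\[ \mathfrak{a}(x) = \prod_{v \nmid \infty} \mathfrak{p}_v^{v_{\mathfrak{p}_v}(x_v)}. \]

This gives a surjective homomorphism $\mathbb{I}_K \to J(\mathcal{O}_K)$, and hence a surjective homomorphism $C_K \to J(\mathcal{O}_K)/P(\mathcal{O}_K) = C(\mathcal{O}_K)$ (this is a notation for the ideal class group). The kernel of $\mathbb{I}_K \to J(\mathcal{O}_K)$ is obviously $\mathbb{I}_\infty$, hence $C_K/C_\infty \cong C(\mathcal{O}_K)$. But $C^1_K/C^1_\infty \cong C_K/C_\infty$ because $C_K \cong \mathbb{R}_{>0} \times C^1_K$ through translation by an idele of a given norm which is 1 on the finite components. Similarly $C_\infty \cong \mathbb{R}_{>0} \times C^1_\infty$. This completes the proof.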

Corollary 5.9. The class number of $K$ is finite.

Proof. $C^1_K/C^1_\infty$ is compact because $C^1_K$ is. It is discrete because $C^1_\infty$ is open in $C^1_K$ (why does this follow?). Hence it is finite. By the theorem, this suffices.

We now prove a generalization of the unit theorem. Let $S$ be a finite set of primes including the infinite primes. Let $K_S = \mathbb{I}_S \cap K^\times$. These are the S-units. They are the elements $\alpha \in K^\times$ with $v_\mathfrak{p}(\alpha) = 0$ for all finite primes $\mathfrak{p}$ not associated to valuations in $S$. In particular $K_{V_\infty} = \mathcal{O}_K^\times$.

Let $t = |S|$. Consider the mapping $\log : \mathbb{I}_S \to \mathbb{R}^t$ given by

\[ (x_v)_{v \in V_K} \mapsto (\log|x_v|_v)_{v \in S}. \]

The image of $\mathbb{I}^1_S = \mathbb{I}_S \cap \mathbb{I}^1_K$ lies in trace-zero space $H = \{(a_1, \dots, a_t) \in \mathbb{R}^t \mid \sum a_i = 0\}$. We restrict $\log$ to $K_S$. The same argument which shows that the log mapping of Section 4.4 is injective modulo torsion and has discrete image shows that this mapping is injective on $K_S/\mu(K)$ with discrete image.

Notice also that the group $\mathbb{I}^1_S/K_S$ is compact because it is a closed subgroup of the compact group $C^1_K$. This will be important in the following proof.

Theorem 5.10. log(KS) is a full lattice in H.

Proof. For convenience, order $S$ so that the last absolute value is archimedean. We note that $\log(\mathbb{I}^1_S)$ generates $H$ over $\mathbb{R}$ since we can choose the first $t - 1$ coordinates arbitrarily and then adjust the last (archimedean) one to make the result lie in $H$. Let $W$ be the subspace of $H$ generated by $\log(K_S)$. We need to show $H = W$.

We have a (continuous) homomorphism $\mathbb{I}^1_S/K_S \to H/W$ whose image is a subgroup which generates $H/W$ as an $\mathbb{R}$-vector space. But the image is compact since $\mathbb{I}^1_S/K_S$ is, and so it is trivial (the only compact subgroups of $\mathbb{R}^n$ are trivial, for any $n$). Hence $H/W = 0$, so $H = W$, and we are done.

Corollary 5.11. The S-units are finitely generated with free part of rank $|S| - 1$ and torsion $\mu(K)$.

Note that this implies the ordinary unit theorem by taking S = V∞.
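A small case makes the corollary concrete. For K = Q and S = {∞, 2, 3} the S-units are the numbers ±2^a 3^b, so the rank should be |S| − 1 = 2. The sketch below is illustrative, with hypothetical function names; it computes the log map on such units and checks that the images of 2 and 3 lie in trace-zero space and are linearly independent.

```python
import math

def log_vector(a, b):
    """Image of the S-unit +-2^a * 3^b under
    x -> (log|x|_inf, log|x|_2, log|x|_3).
    The sign does not matter because only absolute values enter."""
    return (a * math.log(2) + b * math.log(3),   # archimedean place
            -a * math.log(2),                    # |2^a|_2 = 2^(-a)
            -b * math.log(3))                    # |3^b|_3 = 3^(-b)

u = log_vector(1, 0)   # image of 2
v = log_vector(0, 1)   # image of 3

# Both images lie in H = {sum of coordinates = 0}.
print(sum(u), sum(v))

# A nonzero 2x2 minor shows that u and v are linearly independent, so the
# image is a lattice of rank 2 in the 2-dimensional space H.
minor = u[1] * v[2] - u[2] * v[1]
print(minor)
```

The torsion part is {1, −1} = µ(Q), which is exactly the kernel of the log map on these units.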


6 Zeta Functions and L-Functions

Here we will study the basic parts of the analytic side of algebraic number theory. We will essentially be encoding arithmetic information in certain analytic functions, and then we will work together with analysis to prove results in arithmetic. This process is carried out all over arithmetic geometry.

The specific examples of functions we will study are the Riemann and Dedekind zeta functions, and the Dirichlet L-functions, all of which are examples of Dirichlet series.

Sources

Section 6.1 is loosely based on Lang [5]. Sections 6.4 and 6.5 are based on Marcus [6]. Section 6.6 is loosely based on Ireland and Rosen [4] (which, in turn, derives its exposition from the beautiful text on multiplicative analytic number theory by Davenport [2]). Section 6.7 is based on Milne's notes [8].

6.1 Dirichlet Series and the Riemann Zeta Function

The main object of study in this chapter will be series of the form

\[ \sum_{n=1}^{\infty} \frac{a_n}{n^s} \]

where the $a_n \in \mathbb{C}$ and $s$ is a complex variable, and $n^s$ is computed using the principal branch of the logarithm. A series of this form is called a Dirichlet series. We let $s = \sigma + it$ be the decomposition of $s$ into real and imaginary parts. The reason for this notation is largely historical. It was used by Riemann in his 1859 paper Über die Anzahl der Primzahlen unter einer gegebenen Größe (English translation: On the Number of Primes Less Than a Given Magnitude) in which he defined his zeta function and used it to obtain results on the distribution of prime numbers (see the remarks below).

All series of the above form which we will define will converge absolutely and uniformly in an open right half plane. However, an arbitrary Dirichlet series can converge nowhere.

Proposition 6.1. Let

\[ \sum_{n=1}^{\infty} \frac{a_n}{n^s} \]

be a Dirichlet series which converges for some $s = s_0 \in \mathbb{C}$. Let $\sigma_0 = \Re(s_0)$. Then for any $s$ with $\Re(s) > \sigma_0$, the series converges, and it does so uniformly on compact subsets of this region.

Proof. First we note the following formula on partial summation. Let $\{a_n\}$ and $\{b_n\}$ be sequences of complex numbers, and denote $A_n = a_1 + \cdots + a_n$ and $B_n = b_1 + \cdots + b_n$. Then, setting $b_0 = 0$, we have

\[ \sum_{n=1}^{N} a_n b_n = A_N b_N + \sum_{n=1}^{N-1} A_n (b_n - b_{n+1}). \]

The proof is an easy induction and we leave it to the reader.

Now let $\Re(s) > \sigma_0$. We apply this to the series

\[ \sum \frac{a_n}{n^{s_0}} \cdot \frac{1}{n^{s - s_0}}. \]

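This equals our original Dirichlet series, and partial summation gives

\[ \sum_{n=1}^{N} \frac{a_n}{n^s} = \frac{1}{N^{s-s_0}} \sum_{n=1}^{N} \frac{a_n}{n^{s_0}} + \sum_{n=1}^{N-1} \Bigl( \sum_{k=1}^{n} \frac{a_k}{k^{s_0}} \Bigr) \Bigl( \frac{1}{n^{s-s_0}} - \frac{1}{(n+1)^{s-s_0}} \Bigr). \]

The first term on the right hand side converges to 0 as $N \to \infty$ because the series converges at $s_0$ by hypothesis. As for the second term, we note

\[ \frac{1}{n^{s-s_0}} - \frac{1}{(n+1)^{s-s_0}} = (s - s_0) \int_n^{n+1} \frac{1}{x^{s-s_0+1}}\, dx. \]

Hence,

\[ \sum_{n=1}^{N-1} \Bigl( \sum_{k=1}^{n} \frac{a_k}{k^{s_0}} \Bigr) \Bigl( \frac{1}{n^{s-s_0}} - \frac{1}{(n+1)^{s-s_0}} \Bigr) = (s - s_0) \sum_{n=1}^{N-1} \Bigl( \sum_{k=1}^{n} \frac{a_k}{k^{s_0}} \Bigr) \int_n^{n+1} \frac{1}{x^{s-s_0+1}}\, dx. \]

The inner sums are bounded, say by $M$, because the series converges at $s_0$. The $n$th term on the right is therefore bounded in absolute value by $M|s - s_0| \int_n^{n+1} x^{-(\sigma - \sigma_0 + 1)}\, dx$, and so the series converges uniformly on compact subsets by the Weierstrass M-test. This proves the proposition.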
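The partial summation formula used at the start of this proof is easy to test numerically. The following sketch (function names hypothetical) compares both sides for random integer sequences, for which the arithmetic is exact.

```python
import random

def abel_lhs(a, b):
    """sum_{n=1}^{N} a_n b_n"""
    return sum(x * y for x, y in zip(a, b))

def abel_rhs(a, b):
    """A_N b_N + sum_{n=1}^{N-1} A_n (b_n - b_{n+1}), with A_n = a_1 + ... + a_n."""
    N = len(a)
    A, partial = [], 0
    for x in a:
        partial += x
        A.append(partial)            # A[n-1] holds A_n (lists are 0-indexed)
    boundary = A[N - 1] * b[N - 1]
    interior = sum(A[n] * (b[n] - b[n + 1]) for n in range(N - 1))
    return boundary + interior

random.seed(0)
a = [random.randint(-9, 9) for _ in range(50)]
b = [random.randint(-9, 9) for _ in range(50)]
print(abel_lhs(a, b), abel_rhs(a, b))
```

The two printed values agree, as the induction left to the reader predicts.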

Corollary 6.2. Let $\sum a_n/n^s$ be a Dirichlet series. Then there is a $\sigma_0 \in \mathbb{R} \cup \{\pm\infty\}$ such that the series converges for any $s$ with $\Re s > \sigma_0$ and does not for $\Re(s) < \sigma_0$. This is called the abscissa of convergence for the series.

Proposition 6.3. Let $\{a_n\}$ be a sequence of complex numbers, and let $A_n = a_1 + \cdots + a_n$ be the $n$th partial sum. Assume that there is a $C$ and a $\sigma_1 \ge 0$ such that $|A_n| \le Cn^{\sigma_1}$ for all $n$. Let $\sigma_0$ be the abscissa of convergence of the series $\sum a_n/n^s$. Then $\sigma_0 \le \sigma_1$.

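Proof. For convenience, we write $P_n(s)$ for the $n$th partial sum of the series. Let $n > m$. Using the partial summation technique of the previous proposition, we find

\[ P_n(s) - P_m(s) = A_n \frac{1}{n^s} - A_m \frac{1}{(m+1)^s} + \sum_{k=m+1}^{n-1} A_k \Bigl( \frac{1}{k^s} - \frac{1}{(k+1)^s} \Bigr). \]

As before, we have

\[ \frac{1}{k^s} - \frac{1}{(k+1)^s} = s \int_k^{k+1} \frac{1}{x^{s+1}}\, dx. \]

Thus,

\[ P_n(s) - P_m(s) = A_n \frac{1}{n^s} - A_m \frac{1}{(m+1)^s} + \sum_{k=m+1}^{n-1} A_k\, s \int_k^{k+1} \frac{1}{x^{s+1}}\, dx. \]

Assume now that $\sigma = \Re s \ge \sigma_1 + \delta$ for some $\delta > 0$. Then

\[ \Bigl| A_k \int_k^{k+1} \frac{1}{x^{s+1}}\, dx \Bigr| \le Ck^{\sigma_1} \int_k^{k+1} \frac{1}{x^{\sigma+1}}\, dx \le C \int_k^{k+1} \frac{1}{x^{\sigma - \sigma_1 + 1}}\, dx. \]

Therefore, if we take the sum from $m + 1$ to $\infty$, we get

\begin{align*}
|P_n(s) - P_m(s)| &\le |A_n| \frac{1}{n^\sigma} + |A_m| \frac{1}{(m+1)^\sigma} + C|s| \int_{m+1}^{\infty} \frac{1}{x^{\sigma - \sigma_1 + 1}}\, dx \\
&\le \frac{C}{n^\delta} + \frac{C}{m^\delta} + \frac{C|s|}{\delta} \frac{1}{(m+1)^\delta}.
\end{align*}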

This proves that the sequence of partial sums is Cauchy, and hence convergent.
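Two quick instances of Proposition 6.3 may help fix ideas; they are offered as a worked illustration, with the divergence claims being standard facts rather than results proved above. For $a_n = 1$ we have $A_n = n$, so the hypothesis holds with $C = 1$ and $\sigma_1 = 1$:

\[ |A_n| = n \le 1 \cdot n^{1} \quad\Longrightarrow\quad \sigma_0\Bigl(\sum \frac{1}{n^s}\Bigr) \le 1. \]

For $a_n = (-1)^{n+1}$ the partial sums alternate between 1 and 0, so the hypothesis holds with $C = 1$ and $\sigma_1 = 0$:

\[ |A_n| \le 1 = 1 \cdot n^{0} \quad\Longrightarrow\quad \sigma_0\Bigl(\sum \frac{(-1)^{n+1}}{n^s}\Bigr) \le 0. \]

In both cases the bound is sharp: the first series diverges at $s = 1$ (the harmonic series), and the terms of the second do not tend to 0 at $s = 0$.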

According to the previous proposition, the Dirichlet series

\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} \]

converges (and hence does so absolutely) for $\Re s \ge 1 + \delta$ for any $\delta > 0$. By comparing it to the series for $\zeta(1 + \delta)$, we see that $\zeta$ converges uniformly on compact subsets of the region defined by $\Re s > 1$. It therefore defines a holomorphic function there. The function $\zeta$ is called the Riemann zeta function.

Theorem 6.4. The function $\zeta(s)$ can be analytically continued to the region $\Re s > 0$ except for a simple pole at $s = 1$ with residue 1.

Proof. For $r \in \mathbb{N}$ with $r \ge 2$, consider the function

\[ \zeta_r(s) = \zeta(s) - r \sum_{n=1}^{\infty} \frac{1}{(rn)^s} = 1 + \frac{1}{2^s} + \cdots + \frac{1}{(r-1)^s} - \frac{r-1}{r^s} + \frac{1}{(r+1)^s} + \cdots. \]

It follows immediately from Proposition 6.3 that the series above for $\zeta_r(s)$ converges for $\Re s > 0$ because the partial sums of the coefficients are bounded. This series converges uniformly on compact subsets of this region by the Weierstrass M-test (by comparison with what series?). Hence it defines a holomorphic function in that region.

Also, notice that

\[ r \sum_{n=1}^{\infty} \frac{1}{(rn)^s} = r^{1-s} \zeta(s) \]

and hence

\[ \zeta(s) = \frac{\zeta_r(s)}{1 - r^{1-s}} \]

for $\Re s > 1$. Thus $\zeta$ continues analytically to $\Re s > 0$ except possibly at points where $1 = r^{1-s}$.

We use the following argument. First, if we have $1 = r^{1-s}$, then $2\pi i m_r = (1 - s)\log r$ for an integer $m_r$, or

\[ s = 1 - \frac{2\pi i m_r}{\log r}. \]

Let $r = 2, 3$. Then at any pole, we must have

\[ 1 - \frac{2\pi i m_2}{\log 2} = 1 - \frac{2\pi i m_3}{\log 3} \]

or, $m_2 \log 3 = m_3 \log 2$. This implies $3^{m_2} = 2^{m_3}$, so $m_2 = m_3 = 0$ and hence $s = 1$.

To prove there is a pole at $s = 1$, we can use that $\sum n^{-1}$ diverges, but the following method gives us more information. We compare the series for $\zeta$ with the integral $\int_1^\infty x^{-s}\, dx$. We immediately find, for $\sigma > 1$, that

\[ \frac{1}{\sigma - 1} = \int_1^{\infty} \frac{1}{x^\sigma}\, dx \le \sum_{n=1}^{\infty} \frac{1}{n^\sigma} \le 1 + \int_1^{\infty} \frac{1}{x^\sigma}\, dx = 1 + \frac{1}{\sigma - 1}. \]

This implies that ζ(s) has a pole at s = 1 with residue 1.
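The continuation formula with r = 2 is concrete enough to evaluate. The sketch below is a numerical illustration for real s only, under the assumption that a few hundred thousand terms of the alternating series ζ_2(s) are accurate enough; the function names and the averaging of two consecutive partial sums (to damp the alternating error) are choices made here, not part of the proof.

```python
import math

def eta(s, terms=200_000):
    """Approximate zeta_2(s) = sum_{n>=1} (-1)^(n+1) / n^s for real s > 0 by
    averaging the last two partial sums."""
    total, previous = 0.0, 0.0
    for n in range(1, terms + 1):
        previous = total
        total += (-1) ** (n + 1) / n ** s
    return (total + previous) / 2

def zeta(s):
    """zeta(s) = zeta_2(s) / (1 - 2^(1-s)) for real s > 0, s != 1."""
    return eta(s) / (1 - 2 ** (1 - s))

print(zeta(2.0), math.pi ** 2 / 6)       # agreement in the region Re s > 1
print(zeta(0.5))                         # a value in the strip 0 < s < 1

s = 1.001
print((s - 1) * zeta(s))                 # close to the residue 1
print(1 / (s - 1) <= zeta(s) <= 1 + 1 / (s - 1))   # the integral comparison
```

The third line of output illustrates the residue computation, and the last line checks the two-sided integral bound from the proof at σ = 1.001.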

Now we state a theorem about ζ which hints at its relation to number theory.

Theorem 6.5. We have, for $\Re s > 1$,

\[ \zeta(s) = \prod_p \frac{1}{1 - p^{-s}} \]

where the product is taken over all positive integer primes. This product converges absolutely uniformly on compact subsets of the region $\Re s > 1$.

Proof. The product converges absolutely uniformly on compact subsets because its reciprocal does. Indeed, from a standard fact from the theory of infinite products, $\prod (1 - p^{-s})$ converges absolutely uniformly if and only if $\sum \log(1 - p^{-s})$ does, where $\log$ is defined by the usual series. But this converges absolutely uniformly on compact subsets of the region $\Re s > 1$ by comparison with the series for $\zeta$.

As for the product, notice

\[ \frac{1}{1 - p^{-s}} = 1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \cdots. \]

By unique factorization, every term $n^{-s}$ is a product of terms of these series in exactly one way, hence the identity. The reader who desires a more rigorous argument should have no trouble supplying it.
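The identity can be seen numerically by truncating the product. The following is a minimal sketch at s = 2, where ζ(2) = π²/6 is a known value used here for comparison; the function names and the truncation bound are illustrative choices.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: all primes p <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            # Cross out p^2, p^2 + p, ... up to n.
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]

def euler_product(s, bound):
    """Product of 1 / (1 - p^(-s)) over primes p <= bound."""
    product = 1.0
    for p in primes_up_to(bound):
        product *= 1.0 / (1.0 - p ** (-s))
    return product

print(euler_product(2.0, 10_000), math.pi ** 2 / 6)
```

The truncated product is slightly smaller than ζ(2), since every omitted factor is greater than 1.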

Remark. Euler was the first to introduce the zeta function (but only for real s) and study its relation to number theory. He used its factorization and its pole at s = 1 to prove that there are infinitely many primes; if there were finitely many, the product would converge and then so would the harmonic series, contradiction. This method of proof is immensely different from the proof of Euclid, and it illustrates the power of analytic methods in number theory.

The next paper which utilized these ideas was written almost a century later by Dirichlet, in which he proved that every arithmetic progression a, a + m, a + 2m, . . . for a, m coprime, contains infinitely many prime numbers. To do this, he introduced certain Dirichlet series (hence the name), called L-functions, whose coefficients were Dirichlet characters evaluated at n and made a study of primes in arithmetic progressions analogous to Euler's study, but the analysis was much more difficult. The algebra was also a little ahead of its time, since Dirichlet characters are an example of group characters which, unsurprisingly, had not been invented. See section 6.6.

Dirichlet also studied his series only for real s. Riemann was the first to study the zeta function for complex s. He discovered its analytic continuation to the whole complex plane (without s = 1, where it has a simple pole) and functional equation. He used Euler's factorization to prove certain theorems about the distribution of prime numbers.

These ideas were truly the birth of an entire branch of mathematics called analytic number theory, in which (often deep) methods of analysis are used to tackle problems (usually about the distribution of certain sets of numbers) in number theory. For instance, the Twin Prime Conjecture, that there exist infinitely many pairs of primes with difference equal to 2, is being furiously tackled via deep analysis as I write these notes. This is due to a breakthrough in 2013 by a mathematician named Yitang Zhang.

Perhaps the first really concrete result in this area is the Prime Number Theorem, which is a very beautiful theorem on the distribution of prime numbers. If $\pi(x)$ denotes the number of positive primes less than or equal to the real number $x$, then

\[ \pi(x) \sim \frac{x}{\log x}, \]

where the tilde denotes asymptotic equivalence. This was proved in 1896 independently by Hadamard and de la Vallée Poussin. Their proofs were very intricate and used fine analysis of the zeta function inspired by Riemann. The crux of their proof was showing that the zeta function had no zeros on the line $\Re s = 1$. (By comparison, the crux of Dirichlet's proof of his theorem on primes in arithmetic progressions was to show that the L-functions did not vanish at s = 1). Today an extremely elegant proof, about three pages long, is known. It consists of some extremely elegant and short results discovered independently by various mathematicians working on refining the proof of the Prime Number Theorem. It was completed by D. J. Newman in the 1970's.

The prime number theorem was actually conjectured by Gauss when he was fourteen years old. Progress was made by Chebychev, who was also inspired by Euler's methods, in the late 1840's and early 1850's before Riemann's paper. Chebychev was able to bound the quotient $\pi(x)/(x/\log x)$ above and below for sufficiently large x by constants very close to 1. He proved Bertrand's "Postulate" that there is always a prime between n and 2n using his methods.

Much more is conjectured about the zeros of the zeta function than the theorem of Hadamard and de la Vallée Poussin, namely that it only vanishes for $\Re s = 1/2$ in the strip $0 \le \Re s \le 1$ (it also vanishes for negative even integers, which follows immediately from Riemann's work, but this fact is almost always immaterial). This is the famous Riemann Hypothesis, which was conjectured by Riemann in his paper. Many analogues of this conjecture exist in analytic number theory, as well as algebraic number theory and algebraic geometry. Some are known to be true as well. For instance, the Riemann Hypothesis for projective varieties over finite fields (part of the Weil Conjectures) was proved by Deligne in the 1970s and was perhaps the most important achievement in arithmetic geometry of that decade. (In fact, the Weil conjectures were one of Grothendieck's main motivations for his formulation of the theory of schemes).


6.2 The Functional Equation for the Zeta Function

Omitted for now.

6.3 The Dedekind Zeta Function

Let K be a number field. We will soon define an analytic function which encodes arithmetic information about K in its coefficients. We can then recover information about K by studying the analytic properties of this function. This will be done when we study class field theory, specifically the Chebotarev Density Theorem and its implications towards Artin's Reciprocity Law. (It is worth noting here, though, that analytic methods are not actually needed; in fact, the motivation for Chevalley's invention of ideles was to avoid this analysis).

A lot can be said about this method of approaching number theory. For instance, the conjecture of Birch and Swinnerton-Dyer relates data about an elliptic curve to analytic information about an associated L-function. It is one of the most important and influential conjectures in number theory, and has many implications for the arithmetic of elliptic curves (the finiteness of the Shafarevich-Tate group, for one).

We now give our definition.

Definition 6.6. The Dedekind zeta function, defined for $\operatorname{Re} s > 1$, is given by the series

\[ \zeta_K(s) = \sum_{\mathfrak{a} \subset \mathcal{O}_K} \frac{1}{(N\mathfrak{a})^s} \]

where the sum is over all integral ideals of $K$.

Proposition 6.7. The series for $\zeta_K$ converges absolutely and uniformly on compact subsets of the region $\operatorname{Re} s > 1$ and therefore defines a holomorphic function in that region. Furthermore, we have the Euler product

\[ \zeta_K(s) = \prod_{\mathfrak{p}} \frac{1}{1 - (N\mathfrak{p})^{-s}} \]

where the product is taken over all prime ideals of $K$. This product converges absolutely and uniformly on compact subsets of the region $\operatorname{Re} s > 1$.

Proof. The proof is much like that of Theorem 6.5. However, we work with the product first, before proving that the series for $\zeta_K$ converges.

First, we prove that the product converges. It suffices to consider the logarithm, which has terms $-\log(1 - (N\mathfrak{p})^{-s})$. There are at most $n = [K : \mathbb{Q}]$ prime ideals $\mathfrak{p}$ for which a given prime $p$ divides $N\mathfrak{p}$, and each such $\mathfrak{p}$ has $N\mathfrak{p} \ge p$. Hence, writing $\sigma = \operatorname{Re} s$, the series of logarithms is bounded in absolute value by the series

\[ \sum_{k \ge 2} -n \log(1 - k^{-\sigma}). \]

This has the desired convergence properties because the series for $\zeta(s)$ does.

Now to prove the formula here, just repeat the proof of the formula in Theorem 6.5, with prime replaced by prime ideal. Note that this works because every integral ideal factors uniquely into prime ideals. This proves, incidentally, that the series for $\zeta_K$ converges absolutely and uniformly on compact subsets of the region $\operatorname{Re} s > 1$ (why?) and so defines a holomorphic function in that region.
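As a numerical sanity check of Proposition 6.7, the following sketch compares a truncated series with a truncated Euler product for $K = \mathbb{Q}(i)$ at $s = 2$. It relies on two standard facts that are assumptions here, not results of this section: the number of ideals of norm $n$ in $\mathbb{Z}[i]$ is $\sum_{d \mid n} \chi_4(d)$, and a rational prime $p$ is ramified, split, or inert in $\mathbb{Z}[i]$ according to whether $p = 2$, $p \equiv 1$, or $p \equiv 3 \pmod 4$. All function names are illustrative.

```python
def chi4(n):
    """Nontrivial Dirichlet character modulo 4."""
    return (0, 1, 0, -1)[n % 4]

def ideal_counts(limit):
    """a[n] = number of ideals of Z[i] of norm n (assumed formula: sum of chi4 over divisors)."""
    a = [0] * (limit + 1)
    for d in range(1, limit + 1):
        c = chi4(d)
        if c:
            for multiple in range(d, limit + 1, d):
                a[multiple] += c
    return a

def primes_up_to(limit):
    """Sieve of Eratosthenes."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for q in range(p * p, limit + 1, p):
                is_prime[q] = False
    return [p for p, flag in enumerate(is_prime) if flag]

def zeta_series(s, limit):
    """Truncated series: sum over ideals of norm <= limit of (norm)^(-s)."""
    a = ideal_counts(limit)
    return sum(a[n] / n ** s for n in range(1, limit + 1))

def zeta_euler(s, limit):
    """Truncated Euler product over prime ideals lying above primes p <= limit."""
    product = 1.0
    for p in primes_up_to(limit):
        if p == 2:              # ramified: one prime ideal of norm 2
            product /= 1 - 2.0 ** (-s)
        elif p % 4 == 1:        # split: two prime ideals of norm p
            product /= (1 - float(p) ** (-s)) ** 2
        else:                   # inert: one prime ideal of norm p^2
            product /= 1 - float(p) ** (-2 * s)
    return product
```

Both truncations should approach the same value as the cutoff grows; at $s = 2$ the common limit is approximately 1.5067.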


6.4 The Class Number Formula

Let $K$ be a number field. Our goal in this section is to extend $\zeta_K$ and prove the following beautiful formula, called the Class Number Formula:

\[ \operatorname{Res}_{s=1} \zeta_K(s) = \frac{2^r (2\pi)^s \operatorname{Reg}_K h_K}{w_K \sqrt{|\Delta_K|}} = e^{-g} \operatorname{Reg}_K h_K \]

where $r$ is the number of real embeddings of $K$, $2s$ the number of complex embeddings, $\operatorname{Reg}_K$ the regulator, $h_K$ the class number, $w_K$ the number of roots of unity in $K$, $\Delta_K$ the discriminant of $K$, and $g$ the genus of a number field occurring in the Riemann-Roch theory.

I do not entirely understand the occurrence of the genus in this theory, but there is a connection coming from the adelic methods of Tate. The Riemann-Roch theory can be recovered from these methods, as I have shown in an adelic proof of Theorem 4.34 which I have reproduced at the end of these notes. The class number formula can also be recovered via these methods, and the quantity $e^{-g}$ occurs as a certain idelic volume while the constant $\frac{2^r(2\pi)^s}{\sqrt{|\Delta_K|}}$ occurs as a related adelic volume.

To prove the class number formula, we will need to work with the methods of Chapter 4 and estimate the number of ideals in a given ideal class with norm smaller than a given number. We will not do this directly, however, as we shall see now.

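Fix an ideal class $C \in C(\mathcal{O}_K)$. For $t > 0$, consider the set

\[ \{\mathfrak{a} \in C \mid \mathfrak{a} \subset \mathcal{O}_K,\ N\mathfrak{a} \le t\}. \]

Let $i_C(t)$ be the order of this set. This is the number we wish to estimate. Fix an integral ideal $\mathfrak{b} \in C^{-1}$. Then there is a bijection between this set and

\[ \{(\alpha) \subset \mathfrak{b} \mid N(\alpha) = |\mathrm{Nm}_{K/\mathbb{Q}}(\alpha)| \le t \cdot N\mathfrak{b}\}. \]

So we will count elements in the above set instead.

Fix a free abelian subgroup $V \subset \mathcal{O}_K^\times$ of rank $r + s - 1$. Let $(x_1, \dots, x_r, z_{r+1}, \dots, z_{r+s}) \in (\mathbb{R}^\times)^r \times (\mathbb{C}^\times)^s$. Define

\[ \log(x_1, \dots, x_r, z_{r+1}, \dots, z_{r+s}) = (\log|x_1|, \dots, \log|x_r|, 2\log|z_{r+1}|, \dots, 2\log|z_{r+s}|). \]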

This extends the log map of Section 4.4 on $\mathcal{O}_K^\times$. In fact, the log map on $\mathcal{O}_K^\times$ is the composition of this one and the embedding $i$ of Section 4.1. It is clear that the restriction of $i$ to $V$ into the group $(\mathbb{R}^\times)^r \times (\mathbb{C}^\times)^s$ is an injective homomorphism.

Let $D$ be a set of coset representatives for $i(V)$ in $(\mathbb{R}^\times)^r \times (\mathbb{C}^\times)^s$. Let $\mathrm{Nm}$ be the function on $(\mathbb{R}^\times)^r \times (\mathbb{C}^\times)^s$ from Section 4.4. Consider the set

\[ \{x \in D \mid x \in i(\mathfrak{b}),\ \mathrm{Nm}(x) \le t \cdot N\mathfrak{b}\}. \]

Then since the generators of a principal ideal differ by units, and since $D$ is a set of coset representatives for the units modulo $\mu(K)$, the number of elements in this set is $w_K \cdot i_C(t)$.

We will construct a particular $D$ whose boundary is $(n-1)$-Lipschitz parametrizable (Section 4.5), and whose volume is easy to measure.

The map $\log$ defined above takes $i(V)$ to the hyperplane

\[ H = \{(x_1, \dots, x_{r+s}) \mid x_1 + \dots + x_r + 2x_{r+1} + \dots + 2x_{r+s} = 0\} \]

and embeds $i(V)$ as a full lattice in $H$. Let $F$ be the fundamental parallelepiped of this lattice. Then $F$ has volume $\operatorname{Reg}_K$ (see Definition 4.25). Let $v$ be the vector

\[ (1, \dots, 1, 2, \dots, 2) \in \mathbb{R}^{r+s} \]

where the first $r$ entries are 1 and the last $s$ are 2. Then we let $D = \log^{-1}(F \oplus \mathbb{R}v)$. Then clearly $D = aD$ for all $a \in \mathbb{R}^\times$, i.e. $D$ is homogeneous.
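To make the log map concrete, here is a small sketch for $K = \mathbb{Q}(\sqrt{2})$, where $r = 2$, $s = 0$ and the hyperplane is simply $x_1 + x_2 = 0$. It assumes the standard fact (not proved here) that $1 + \sqrt{2}$ generates the units modulo $\pm 1$, so that $V$ can be taken to be the group it generates and the regulator is $\log(1 + \sqrt{2})$. The helper names are illustrative.

```python
import math

SQRT2 = math.sqrt(2.0)

def unit_power(k):
    """Integer coefficients (a, b) with a + b*sqrt(2) = (1 + sqrt(2))**k, for k >= 0."""
    a, b = 1, 0
    for _ in range(k):
        a, b = a + 2 * b, a + b      # multiply by 1 + sqrt(2)
    return a, b

def embeddings(a, b):
    """The two real embeddings of a + b*sqrt(2)."""
    return (a + b * SQRT2, a - b * SQRT2)

def log_map(point):
    """log(x_1, x_2) = (log|x_1|, log|x_2|) in the case r = 2, s = 0."""
    return tuple(math.log(abs(x)) for x in point)

REGULATOR = math.log(1 + SQRT2)      # assumed regulator of Q(sqrt 2)
```

For each $k$, `log_map(embeddings(*unit_power(k)))` should be numerically close to $(k \cdot \mathrm{Reg}, -k \cdot \mathrm{Reg})$, a lattice point on the line $x_1 + x_2 = 0$.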

Let $a > 0$. Define $D_a = \{x \in D \mid |\mathrm{Nm}(x)| \le a\}$. By homogeneity, clearly $D_a = a^{1/n} D_1$. Thus, $w_K \cdot i_C(t)$ is the number of points in $i(\mathfrak{b}) \cap (t \cdot N\mathfrak{b})^{1/n} D_1$.

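Assume for now that $\partial D_1$ (hence $\partial D_a$) is $(n-1)$-Lipschitz parametrizable. Then by Lemma 4.31, we have

\[ i_C(t) = \frac{\operatorname{Vol} D_1}{w_K \operatorname{Vol}(\mathbb{R}^n / i(\mathfrak{b}))}\, N\mathfrak{b} \cdot t + O(t^{1 - \frac{1}{n}}) = \frac{2^s \operatorname{Vol}(D_1)}{w_K \sqrt{|\Delta_K|}}\, t + O(t^{1 - \frac{1}{n}}). \]

It thus remains to show that $\partial D_1$ is $(n-1)$-Lipschitz parametrizable and to compute the volume of $D_1$.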

Lemma 6.8. $\partial D_1$ is $(n-1)$-Lipschitz parametrizable. The volume of $D_1$ is $2^r \pi^s \operatorname{Reg}_K$.

Proof. First of all, $D_1$ is bounded because $F$ is bounded, hence $F \oplus (-\infty, 0)v$ is bounded above, hence $\log^{-1}(F \oplus (-\infty, 0)v) = D_1$ is bounded. We will focus on $D_1^+ = D_1 \cap \{(x_1, \dots, z_{r+s}) \mid x_1, \dots, x_r > 0\}$. Then $\operatorname{Vol} D_1 = 2^r \operatorname{Vol} D_1^+$, and the boundary of $D_1$ is $(n-1)$-Lipschitz parametrizable if and only if the boundary of $D_1^+$ is. So we will work with $D_1^+$.

Let $v_1, \dots, v_{r+s-1}$ be a basis for the lattice $\log(i(V))$. Let $v_i^j$ denote the $j$th coordinate of $v_i$, $1 \le j \le r + s$. Let $(x_1, \dots, x_r, z_{r+1}, \dots, z_{r+s}) \in D_1^+$. Then this point is characterized by equations

\[ \log x_i = \sum_{j=1}^{r+s-1} t_j v_j^i + a \]

for the real components and

\[ 2\log|z_i| = \sum_{j=1}^{r+s-1} t_j v_j^i + 2a \]

for the complex components, where $x_i > 0$, $t_j \in [0, 1)$, $a \in (-\infty, 0]$. If we let $t_{r+s} = e^a$ and if we write $z_j$ in polar coordinates as $(\rho_j, \theta_j)$, then $D_1^+$ is therefore the set of all $(x_1, \dots, x_r, \rho_{r+1}e^{i\theta_{r+1}}, \dots, \rho_{r+s}e^{i\theta_{r+s}})$ with

\[ x_i = t_{r+s} \exp\Bigl( \sum_{j=1}^{r+s-1} t_j v_j^i \Bigr) \]

and

\[ \rho_i = t_{r+s} \exp\Bigl( \frac{1}{2} \sum_{j=1}^{r+s-1} t_j v_j^i \Bigr) \]

and

\[ \theta_i = 2\pi t_{s+i} \]

where $t_i \in [0, 1)$ for $i \ne r + s$ and $t_{r+s} \in (0, 1]$. Extending this parametrization to the boundary gives a map $I^n \to D_1^+$ which is smooth, hence Lipschitz (this implication is a standard fact from multivariable analysis). We only need to see that this function maps open sets into open sets, for this would imply that the boundary of $I^n$ is mapped onto a set containing $\partial D_1^+$. But it is easy to see that the Jacobian determinant of this function is non-zero on the interior of $I^n$ (exercise), so this suffices by the open mapping theorem.

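It remains to compute the volume. We need $\operatorname{Vol}(D_1^+) = \pi^s \operatorname{Reg}_K$. We have

\[ \operatorname{Vol}(D_1^+) = \int_{D_1^+} \rho_{r+1} \cdots \rho_{r+s} \, dx_1 \cdots dx_r \, d\rho_{r+1} \cdots d\rho_{r+s} \, d\theta_{r+1} \cdots d\theta_{r+s}. \]

This is equal to

\[ \int_{I^n} \rho_{r+1} \cdots \rho_{r+s} \, |J(t_1, \dots, t_n)| \, dt_1 \cdots dt_n \]

where $J(t_1, \dots, t_n)$ is the Jacobian mentioned above. The reader who computes this will find that the integral becomes

\[ \operatorname{Vol}(D_1^+) = \pi^s \operatorname{Reg}_K \int_{I^n} \frac{1}{t_{r+s}}\, x_1 \cdots x_r \, \rho_{r+1}^2 \cdots \rho_{r+s}^2 \, dt_1 \cdots dt_n. \]

This is equal to the desired volume.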

Theorem 6.9.

\[ i_C(t) = \frac{2^r (2\pi)^s \operatorname{Reg}_K}{w_K \sqrt{|\Delta_K|}}\, t + O(t^{1 - \frac{1}{n}}). \]

Proof. This is immediate from the lemma and the remarks directly preceding it.
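Theorem 6.9 can be tested numerically in the simplest imaginary quadratic case. The sketch below takes $K = \mathbb{Q}(i)$, assuming the standard facts that $\mathbb{Z}[i]$ is a principal ideal domain with $w_K = 4$, $\Delta_K = -4$, and trivial regulator (the unit rank $r + s - 1$ is 0). There is then only one ideal class, and ideals of norm at most $t$ correspond to nonzero lattice points $a + bi$ with $a^2 + b^2 \le t$, counted up to the four units. Function names are illustrative.

```python
import math

def count_ideals_gaussian(t):
    """Number of nonzero ideals of Z[i] of norm <= t (t a nonnegative integer)."""
    bound = math.isqrt(t)
    points = 0
    for a in range(-bound, bound + 1):
        b_max = math.isqrt(t - a * a)
        points += 2 * b_max + 1          # all b with a^2 + b^2 <= t
    return (points - 1) // 4             # drop the origin, divide by the 4 units

def predicted_slope():
    """Leading coefficient 2^r (2 pi)^s Reg / (w sqrt|Delta|) for Q(i)."""
    r, s = 0, 1
    regulator = 1.0
    w = 4
    disc = -4
    return 2 ** r * (2 * math.pi) ** s * regulator / (w * math.sqrt(abs(disc)))
```

Here the predicted slope is $\pi/4$, and `count_ideals_gaussian(t) / t` should drift toward it as `t` grows, with an error consistent with $O(t^{-1/2})$.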

Now we prove the Class Number Formula. Let

\[ \zeta_K(C, s) = \sum_{\mathfrak{a} \in C} \frac{1}{(N\mathfrak{a})^s}. \]

This defines a holomorphic function in the region $\operatorname{Re} s > 1$ by comparison to the Dedekind zeta function; this is a subseries. Furthermore,

\[ \sum_{C \in C(\mathcal{O}_K)} \zeta_K(C, s) = \zeta_K(s). \]

Theorem 6.10 (Class Number Formula). The function $\zeta_K(s)$ extends to the region $\operatorname{Re} s > 1 - 1/n$ analytically except for a simple pole at $s = 1$ with residue

\[ \frac{2^r (2\pi)^s \operatorname{Reg}_K h_K}{w_K \sqrt{|\Delta_K|}}. \]


Proof. First note that the partial sums $\sum_{N\mathfrak{a} \le t} (N\mathfrak{a})^{-s}$ of the series for $\zeta_K(C, s)$ have exactly $i_C(t)$ terms. Let

\[ \kappa = \frac{2^r (2\pi)^s \operatorname{Reg}_K}{w_K \sqrt{|\Delta_K|}} \]

and consider the sum

\[ \sum_{\mathfrak{a} \in C} \frac{1}{(N\mathfrak{a})^s} - \sum_{k=1}^{\infty} \frac{\kappa}{k^s}. \]

Rearranging terms, we find that this is a Dirichlet series in $k$ whose coefficients have partial sums which, by Theorem 6.9, are $O(k^{1-1/n})$. By Proposition 6.3, this series converges for $\operatorname{Re} s > 1 - 1/n$ and must do so absolutely and uniformly on compact subsets of this region. This is $\zeta_K(C, s) - \kappa\zeta(s)$, and hence $\zeta_K(C, s)$ continues analytically to the desired region except for a simple pole at $s = 1$ with residue $\kappa$. Summing over all $C \in C(\mathcal{O}_K)$, we obtain the desired result about $\zeta_K$.

6.5 L-Functions and the Evaluation of the Class Number

In this section we introduce new Dirichlet series and examine their behavior at $s = 1$. These series will turn out to be “pieces” of the Dedekind zeta function of an abelian extension of $\mathbb{Q}$. First we define the functions which give rise to their coefficients.

Definition 6.11. Let $G$ be a finite abelian group. A character of $G$ is a homomorphism $G \to \mathbb{C}^\times$. A Dirichlet character (or simply character) modulo $m$ is a character $\chi : (\mathbb{Z}/m\mathbb{Z})^\times \to \mathbb{C}^\times$.

It is immediate that a character takes values in the $|G|$th roots of unity.

We may view a Dirichlet character as a function $\mathbb{Z} \to \mathbb{C}$ by setting $\chi(a) = \chi(\bar{a})$ if $(a, m) = 1$ (here the bar denotes the class of $a$ modulo $m$) and $\chi(a) = 0$ if $(a, m) > 1$. Thus we may define

\[ L(s, \chi) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s} \]

and call this a Dirichlet L-series (or simply L-series, or L-function). In order to examine the convergence of these series, we must examine the behavior of their coefficients.

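Let $G$ be a finite abelian group. We write $\hat{G} = \hom(G, \mathbb{C}^\times)$ for the character group.

Proposition 6.12. There is an isomorphism $G \cong \hat{G}$ (which is non-canonical).

Proof. First note that because the image of $G$ in $\mathbb{C}^\times$ under any character lies in the $|G|$th roots of unity, we have an isomorphism $\hat{G} \cong \hom(G, \mathbb{Z}/|G|\mathbb{Z})$. Let $G = Z_1 \oplus \dots \oplus Z_n$ be a decomposition of $G$ into finite cyclic groups. Then $|Z_i|$ divides $|G|$, and

\begin{align*}
\hat{G} &\cong \hom(G, \mathbb{Z}/|G|\mathbb{Z}) \\
&\cong \hom(Z_1, \mathbb{Z}/|G|\mathbb{Z}) \oplus \dots \oplus \hom(Z_n, \mathbb{Z}/|G|\mathbb{Z}) \\
&\cong \mathbb{Z}/(|Z_1|, |G|)\mathbb{Z} \oplus \dots \oplus \mathbb{Z}/(|Z_n|, |G|)\mathbb{Z} \\
&\cong Z_1 \oplus \dots \oplus Z_n \cong G.
\end{align*}

This is what we wanted to show.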


For a character $\chi$, it is customary to write $\bar{\chi}$ instead of $\chi^{-1}$. Of course, these are the same thing.

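Proposition 6.13 (Orthogonality Relations). Let $G$ be a finite abelian group. Let $g, h \in G$ and $\chi, \psi \in \hat{G}$. Then we have

(a) $\sum_{g \in G} \chi(g)\bar{\psi}(g) = |G|\,\delta(\chi, \psi)$;

(b) $\sum_{\chi \in \hat{G}} \chi(g)\chi(h^{-1}) = |G|\,\delta(g, h)$.

Here $\delta$ denotes the Kronecker delta.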

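Proof. Let $\varphi = \chi\bar{\psi}$. If $\varphi$ is not the identity, then there is a $k \in G$ such that $\varphi(k) \ne 1$. Then

\[ \sum_{g \in G} \varphi(g) = \sum_{g \in G} \varphi(gk) = \sum_{g \in G} \varphi(k)\varphi(g) = \varphi(k) \sum_{g \in G} \varphi(g) \]

which implies $\sum \varphi(g) = 0$. Otherwise $\sum \varphi(g) = \sum 1 = |G|$. This proves (a), and (b) is similar.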
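The orthogonality relations are easy to check by machine for a small group. The following sketch builds all characters of $(\mathbb{Z}/p\mathbb{Z})^\times$ for an odd prime $p$ from a primitive root (found by brute force) and evaluates both sums. The restriction to prime modulus and the function names are choices made for this illustration only.

```python
import cmath

def primitive_root(p):
    """Smallest generator of (Z/pZ)^* for an odd prime p, found by brute force."""
    for g in range(2, p):
        if len({pow(g, k, p) for k in range(p - 1)}) == p - 1:
            return g
    raise ValueError("no primitive root found; is p an odd prime?")

def characters(p):
    """All characters of (Z/pZ)^*, each stored as a dict {residue: value}."""
    g = primitive_root(p)
    n = p - 1
    index = {pow(g, k, p): k for k in range(n)}   # discrete logarithm base g
    return [
        {a: cmath.exp(2j * cmath.pi * j * index[a] / n) for a in index}
        for j in range(n)
    ]

def sum_over_group(chi, psi):
    """Relation (a): sum over g of chi(g) * conj(psi(g))."""
    return sum(chi[a] * psi[a].conjugate() for a in chi)

def sum_over_characters(chars, g, h):
    """Relation (b): sum over chi of chi(g) * chi(h^{-1}) = chi(g) * conj(chi(h))."""
    return sum(chi[g] * chi[h].conjugate() for chi in chars)
```

For $p = 7$ each sum should be numerically close to 6 on the diagonal and to 0 elsewhere.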

We call the Dirichlet character which is identically 1 the trivial character and denote it $\chi_1$.

Corollary 6.14. If $\chi \ne \chi_1$ is a nontrivial Dirichlet character modulo $m$, then $\sum_{n=1}^{m-1} \chi(n) = 0$.

This corollary, along with Proposition 6.3, proves that $L(s, \chi)$ converges for $\chi \ne \chi_1$, and does so uniformly on compact subsets of the region $\operatorname{Re} s > 0$. Therefore it defines a holomorphic function in this region. However, $L(s, \chi_1)$ diverges at $s = 1$ by comparison with the zeta function.

The L(s, χ) have Euler products:

Proposition 6.15. Let $\chi$ be a Dirichlet character modulo $m$. Then

\[ L(s, \chi) = \prod_p (1 - \chi(p)p^{-s})^{-1}. \]

The product here may be taken either over all primes, or over all primes $p \nmid m$.

Proof. Exercise; this is just like the Euler product for the zeta function, using the fact that $\chi$ is completely multiplicative. The last statement follows from the fact that $\chi(p) = 0$ if and only if $p \mid m$.

Corollary 6.16. We have the formula

\[ L(s, \chi_1) = \prod_{p \mid m} (1 - p^{-s})\, \zeta(s). \]

Thus $L(s, \chi_1)$ can be analytically continued to $\operatorname{Re} s > 0$ except for a simple pole at $s = 1$.


Now let $K/\mathbb{Q}$ be abelian. Assume that $K$ can be embedded in a cyclotomic field $\mathbb{Q}(\zeta_m)$. (This hypothesis is actually satisfied for any abelian extension $K/\mathbb{Q}$ by a result called the Kronecker-Weber Theorem.) For a prime $p$ of $\mathbb{Z}$, write $r_p$ for the number of primes into which $p$ splits in $\mathcal{O}_K$, and $f_p$ for the inertia degree. Let $G$ be the Galois group of $K$ over $\mathbb{Q}$.

Then $G$ is a homomorphic image of $(\mathbb{Z}/m\mathbb{Z})^\times = \operatorname{Gal}(\mathbb{Q}(\zeta_m)/\mathbb{Q})$. Hence $\hat{G}$ may be identified with a subgroup of the character group of $(\mathbb{Z}/m\mathbb{Z})^\times$. Therefore elements of $\hat{G}$ define Dirichlet characters modulo $m$.

Lemma 6.17. Let $p$ be a prime not dividing $m$ and let $\bar{p}$ be the image of $p$ in $G$ under the canonical homomorphism $(\mathbb{Z}/m\mathbb{Z})^\times \to G$. Then the order of $\bar{p}$ in $G$ is $f_p$.

Proof. The image of $p$ in $\operatorname{Gal}(\mathbb{Q}(\zeta_m)/\mathbb{Q})$ is the automorphism such that $\zeta_m \mapsto \zeta_m^p$. Hence this induces the automorphism $\alpha \mapsto \alpha^p$ on $\mathcal{O}_{\mathbb{Q}(\zeta_m)}/\mathfrak{P}$, where $\mathfrak{P}$ is a prime lying above $p$. Thus the image of $p$ in $\operatorname{Gal}(\mathbb{Q}(\zeta_m)/\mathbb{Q})$ is the Frobenius automorphism $\phi(p)$. The prime $p$ is unramified in $\mathbb{Q}(\zeta_m)$ because it does not divide the discriminant. Indeed, we showed in Section 1.5 that the discriminant divides $m^{\varphi(m)}$. Thus the Frobenius automorphism generates the decomposition group of $\mathbb{Q}(\zeta_m)$ over $\mathbb{Q}$.

If we can show that the Frobenius automorphism of $\mathbb{Q}(\zeta_m)$ over $\mathbb{Q}$ restricts to the Frobenius automorphism of $K$ over $\mathbb{Q}$, then this would show that the order of $\bar{p}$ in $G$ is the order of the decomposition group of $K$ over $\mathbb{Q}$ ($p$ is again unramified in $K$), which is $f_p$, and we would be done. We can show what we need in a more general setting.

Let $M/K$ now be an abelian extension of number fields, $L$ an intermediate extension, and let $\mathfrak{Q}$ be a prime of $M$ lying over a prime $\mathfrak{P}$ of $L$ and a prime $\mathfrak{p}$ of $K$. Assume $\mathfrak{Q}$ is unramified in $M$. We claim that $\phi(\mathfrak{Q}, \mathfrak{p})|_L = \phi(\mathfrak{P}, \mathfrak{p})$. But this is clear, since $\phi(\mathfrak{Q}, \mathfrak{p})|_L$ induces the automorphism $\alpha \mapsto \alpha^{N\mathfrak{p}}$ on the residue field, which characterizes the Frobenius. This completes the proof.

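Theorem 6.18. With notation as above, we have the formula

\[ \frac{\zeta_K(s)}{\zeta(s)} = \prod_{p \mid m} (1 - p^{-s})(1 - p^{-f_p s})^{-r_p} \prod_{\chi \in \hat{G},\ \chi \ne \chi_1} L(s, \chi). \]

In particular,

\[ h_K = \frac{w_K \sqrt{|\Delta_K|}}{2^r (2\pi)^s \operatorname{Reg}_K} \prod_{p \mid m} (1 - p^{-1})(1 - p^{-f_p})^{-r_p} \prod_{\chi \in \hat{G},\ \chi \ne \chi_1} L(1, \chi). \]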

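Proof. Assume $p \nmid m$. Let $\chi$ run through $\hat{G}$. The homomorphisms $(\mathbb{Z}/m\mathbb{Z})^\times \to \mathbb{C}^\times$ induce the homomorphisms $\chi$, and the $\chi(p)$ are $f_p$th roots of unity by the lemma. The homomorphism $\psi : G \to \mathbb{C}$ which sends $g$ to $\zeta_{a_g}$, where $a_g$ is the order of $g$ in $G$, has the property that $\psi(p)$ is a primitive $f_p$th root of unity. Since $\chi \mapsto \chi(p)$ is a homomorphism, every $f_p$th root of unity occurs as a $\chi(p)$ equally many times. This amount is thus $|G|/f_p$, which equals $r_p$ because the ramification index $e_p$ of $p$ in $\mathcal{O}_K$ is 1.

With this in mind, we have that

\[ \prod_{\chi \in \hat{G}} (1 - \chi(p)p^{-s}) = (1 - p^{-f_p s})^{r_p}. \]

Hence, taking the product of the Euler products of Proposition 6.15 over all $\chi \in \hat{G}$, we have

\[ \prod_{\chi \in \hat{G}} L(s, \chi) = \prod_{p \nmid m} (1 - p^{-f_p s})^{-r_p}. \]

On the other hand, the Euler product for the Dedekind zeta function in our Galois case becomes

\[ \zeta_K(s) = \prod_p (1 - p^{-f_p s})^{-r_p}. \]

Hence

\[ \zeta_K(s) = \prod_{p \mid m} (1 - p^{-f_p s})^{-r_p} \prod_{\chi \in \hat{G}} L(s, \chi). \]

Dividing by the Riemann zeta function, and using Corollary 6.16 for the factor $L(s, \chi_1)$, gives

\[ \frac{\zeta_K(s)}{\zeta(s)} = \prod_{p \mid m} (1 - p^{-s})(1 - p^{-f_p s})^{-r_p} \prod_{\chi \in \hat{G},\ \chi \ne \chi_1} L(s, \chi) \]

as desired.

The last statement follows from this and the Class Number Formula.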
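The key identity in this proof, $\prod_\chi (1 - \chi(p)X) = (1 - X^{f_p})^{r_p}$, can be checked numerically. The sketch below does this for $K = \mathbb{Q}(\zeta_q)$ with $q$ an odd prime, so that $G = (\mathbb{Z}/q\mathbb{Z})^\times$, $f_p$ is the multiplicative order of $p$ modulo $q$, and $r_p = (q - 1)/f_p$. The choice of field and all function names are assumptions made for this example.

```python
import cmath

def multiplicative_order(p, q):
    """Order of p in (Z/qZ)^*, assuming gcd(p, q) = 1."""
    k, x = 1, p % q
    while x != 1:
        x = x * p % q
        k += 1
    return k

def character_values_at(p, q):
    """The values chi(p) as chi runs over all characters of (Z/qZ)^*, q an odd prime."""
    n = q - 1
    g = next(g for g in range(2, q)
             if len({pow(g, k, q) for k in range(n)}) == n)      # primitive root
    index = next(k for k in range(n) if pow(g, k, q) == p % q)   # p = g^index mod q
    return [cmath.exp(2j * cmath.pi * j * index / n) for j in range(n)]

def both_sides(p, q, x):
    """Return (product over chi of (1 - chi(p) x), (1 - x^f)^r)."""
    f = multiplicative_order(p, q)
    r = (q - 1) // f
    lhs = 1
    for value in character_values_at(p, q):
        lhs *= 1 - value * x
    rhs = (1 - x ** f) ** r
    return lhs, rhs
```

For instance, with $p = 2$ and $q = 7$ we have $f_p = 3$ and $r_p = 2$, and the two sides agree to rounding error for any test value of $X$; substituting $X = p^{-s}$ recovers the displayed identity.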

We deduce a very important corollary.

Corollary 6.19. $L(1, \chi) \ne 0$ for $\chi \ne \chi_1$.

Proof. If this were not true, then $h_{\mathbb{Q}(\zeta_m)} = 0$ by the second formula in the theorem.

This will be key in the next section. Before this, however, we compute the values of $L(1, \chi)$ for $\chi \ne \chi_1$. This will give a formula for computing the class number of an abelian extension, assuming we can compute the regulator, which depends on the units.

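Theorem 6.20. Let $\chi \in \hat{G}$ be nontrivial. Then we have the formula

\[ L(1, \chi) = -\frac{1}{m} \sum_{k=1}^{m-1} \tau_k(\chi) \log(1 - e^{-2\pi i k/m}) \]

where $\tau_k(\chi)$ is the Gauss sum

\[ \sum_{a \in (\mathbb{Z}/m\mathbb{Z})^\times} \chi(a) e^{2\pi i a k/m} \]

(of which the sums in Section 1.6 are a special case) and the logarithm is computed via the series

\[ \log(1 - z) = -\sum_{n=1}^{\infty} \frac{z^n}{n}. \]

Proof. We have

\begin{align*}
L(s, \chi) &= \sum_{a \in (\mathbb{Z}/m\mathbb{Z})^\times} \chi(a) \sum_{n \equiv a,\ n \ge 1} \frac{1}{n^s} \\
&= \sum_{a \in (\mathbb{Z}/m\mathbb{Z})^\times} \chi(a) \sum_{n=1}^{\infty} \frac{1}{n^s} \cdot \frac{1}{m} \sum_{k=0}^{m-1} e^{2\pi i (a-n)k/m} \\
&= \frac{1}{m} \sum_{k=0}^{m-1} \tau_k(\chi) \sum_{n=1}^{\infty} \frac{e^{-2\pi i n k/m}}{n^s}
\end{align*}

where the $k = 0$ term vanishes by the orthogonality relations. Set $s = 1$ to complete the proof.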
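The finite formula for $L(1, \chi)$ can be compared against partial sums of the defining series. In the sketch below the exponent inside the logarithm is taken to be $-2\pi i k/m$, which is the sign produced by the factor $e^{2\pi i (a - n)k/m}$ in the proof. The two characters used, the nontrivial character modulo 4 and the quadratic character modulo 5, are written out by hand; they and the function names are choices made for this example.

```python
import cmath

def gauss_sum(chi, k, m):
    """tau_k(chi) = sum over a in (Z/mZ)^* of chi(a) * exp(2 pi i a k / m)."""
    return sum(v * cmath.exp(2j * cmath.pi * a * k / m) for a, v in chi.items())

def l_one_closed_form(chi, m):
    """-(1/m) * sum_{k=1}^{m-1} tau_k(chi) * log(1 - exp(-2 pi i k / m))."""
    total = 0
    for k in range(1, m):
        z = cmath.exp(-2j * cmath.pi * k / m)
        # 1 - z has positive real part, so the principal branch of the
        # logarithm agrees with the power series for log(1 - z).
        total += gauss_sum(chi, k, m) * cmath.log(1 - z)
    return -total / m

def l_one_series(chi, m, blocks):
    """Partial sum of sum chi(n)/n over n = 1, ..., blocks * m."""
    return sum(chi.get(n % m, 0) / n for n in range(1, blocks * m + 1))

CHI4 = {1: 1, 3: -1}                  # nontrivial character modulo 4 (odd)
CHI5 = {1: 1, 2: -1, 3: -1, 4: 1}     # quadratic character modulo 5 (even)
```

For `CHI4` the closed form should return $\pi/4$ up to rounding, matching the Leibniz series $1 - 1/3 + 1/5 - \cdots$.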

6.6 Dirichlet’s Theorem on Primes in Arithmetic Progressions

We want to prove in this section that the arithmetic progression $a, a + m, a + 2m, \dots$ contains infinitely many primes for $(a, m) = 1$. To do this, let $\chi$ be a Dirichlet character modulo $m$ and consider, for $s$ near 1 and $\operatorname{Re} s > 1$, the function

\[ G(s, \chi) = \log L(s, \chi) = \log \prod_{p \nmid m} (1 - \chi(p)p^{-s})^{-1} = -\sum_{p \nmid m} \log(1 - \chi(p)p^{-s}) = \sum_{p \nmid m} \frac{\chi(p)}{p^s} + R_\chi(s) \]

where $R_\chi(s)$ is a function which remains bounded as $s$ gets close to 1. By Corollary 6.19, this function is bounded near $s = 1$ unless $\chi = \chi_1$, in which case it differs from $\sum_p p^{-s}$ by a bounded amount.

It is worth noting that our proof of Corollary 6.19 involved a lot of what we have done in these notes. It is easily the deepest part of the proof of Dirichlet’s theorem. There are proofs, however, without any algebraic number theory. Indeed, algebraic number theory came formally after the time of Dirichlet.

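Now let $(a, m) = 1$. By the orthogonality relations, we have

\[ \sum_\chi \bar{\chi}(a) G(s, \chi) = \sum_{p \nmid m} \sum_\chi \bar{\chi}(a)\chi(p) p^{-s} + R_{\chi,a}(s) = \varphi(m) \sum_{p \equiv a \ (\mathrm{mod}\ m)} p^{-s} + R_{\chi,a}(s) \]

where $R_{\chi,a}(s)$ is also bounded as $s \to 1$. But the $G(s, \chi)$ are bounded for $\chi \ne \chi_1$, so this shows that

\[ G(s, \chi_1) - \varphi(m) \sum_{p \equiv a \ (\mathrm{mod}\ m)} p^{-s} \]

is bounded. Thus we have proved

Theorem 6.21 (Dirichlet). The difference

\[ \sum_{p \equiv a \ (\mathrm{mod}\ m)} p^{-s} - \frac{1}{\varphi(m)} \sum_p p^{-s} \]

is bounded as $s \to 1$.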


Since the second term in the above difference diverges, we obtain

Corollary 6.22. There are infinitely many primes in the arithmetic progression $a, a + m, a + 2m, \dots$.
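An empirical illustration: the sketch below counts primes up to a bound in each reduced residue class modulo $m$. Note the hedge that this measures natural density, which is a stronger statement than the Dirichlet-density result proved above and is not established in these notes; the counts are only meant to make the equidistribution visible. Function names are illustrative.

```python
from math import gcd

def primes_up_to(limit):
    """Sieve of Eratosthenes."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for q in range(p * p, limit + 1, p):
                is_prime[q] = False
    return [p for p, flag in enumerate(is_prime) if flag]

def primes_by_residue(m, limit):
    """Map each a with gcd(a, m) = 1 to the number of primes p <= limit with p = a mod m."""
    counts = {a: 0 for a in range(m) if gcd(a, m) == 1}
    for p in primes_up_to(limit):
        if p % m in counts:
            counts[p % m] += 1
    return counts
```

For $m = 10$ and a bound of $10^5$, the four classes 1, 3, 7, 9 each receive close to a quarter of the primes counted.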

No proof of the above corollary is known which stays strictly within the realm of elementary number theory. By this we mean there is no proof known which uses only rational methods learned in an elementary number theory course. However, special cases of this corollary can be obtained via such methods. There is an elementary proof, in the sense that complex analysis is avoided, due to Selberg.

6.7 Densities

Let $K$ be a number field. In this section, we will give two ways to measure how many primes of $K$ are in a given set compared to the set of all primes of $K$.

Definition 6.23. Let $T$ be a set of primes in $K$. If there is a real number $d$ such that

\[ \sum_{\mathfrak{p} \in T} \frac{1}{(N\mathfrak{p})^s} \sim d \log \frac{1}{s - 1} \]

as $s \downarrow 1$, then we say that $T$ has Dirichlet density $d$, and we often denote $d = \delta(T)$. Here, $s \downarrow 1$ means that $s$ approaches 1 from above in the real numbers, and the tilde denotes that the difference of these terms is bounded in this limit.

Proposition 6.24. Let $\delta$ be the Dirichlet density and $T$ a set of primes in $K$.

(a) The set of all primes in $K$ has Dirichlet density 1.

(b) A finite set of primes in $K$ has Dirichlet density 0.

(c) $0 \le \delta(T) \le 1$ if $\delta(T)$ is defined.

(d) If $T$ is the disjoint union of $T_1$ and $T_2$, and if any two of $T_1, T_2, T$ have well defined Dirichlet density, then so does the third and $\delta(T_1) + \delta(T_2) = \delta(T)$.

(e) If $T' \subset T$, and if both $T$ and $T'$ have well defined Dirichlet density, then $\delta(T') \le \delta(T)$.

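Proof. If we take the logarithm of the Euler product for $\zeta_K$, we obtain

\[ \log \zeta_K(s) = -\sum_{\mathfrak{p}} \log(1 - (N\mathfrak{p})^{-s}) = \sum_{\mathfrak{p}} \sum_{m=1}^{\infty} \frac{1}{m (N\mathfrak{p})^{ms}}. \]

Also,

\[ \sum_{\mathfrak{p}} \frac{1}{(N\mathfrak{p})^s} \le \sum_{\mathfrak{p}} \sum_{m=1}^{\infty} \frac{1}{m (N\mathfrak{p})^{ms}} \le \sum_{\mathfrak{p}} \frac{1}{(N\mathfrak{p})^s} + \zeta_K(2). \]

Now $\zeta_K(s)$ and $\frac{1}{s-1}$ both have simple poles at $s = 1$. Let $\rho$ be the residue of $\zeta_K(s)$ at $s = 1$. Then

\[ \log \zeta_K(s) - \log \frac{1}{s - 1} - \log \rho \to 0 \]

as $s \downarrow 1$. This all proves (a).

(b) is clear. To prove (d), assume that $T_1, T_2$ have well defined Dirichlet density. Then

\[ \sum_{\mathfrak{p} \in T} \frac{1}{(N\mathfrak{p})^s} = \sum_{\mathfrak{p} \in T_1} \frac{1}{(N\mathfrak{p})^s} + \sum_{\mathfrak{p} \in T_2} \frac{1}{(N\mathfrak{p})^s} \sim (\delta(T_1) + \delta(T_2)) \log \frac{1}{s - 1}. \]

The proof if any other two are defined is similar.

(e) follows from the proof of (d) using $T, T', T \setminus T'$, and noting that all terms in the sums involved are positive. (c) follows from (e) and the fact that $T$ always contains a finite set of primes and is contained in the set of all primes.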

By (a) of the proposition, Dirichlet’s Theorem says that the primes in the arithmetic progression $\{mn + a\}$ have Dirichlet density $1/\varphi(m)$.

Let us introduce another kind of density and relate it to the Dirichlet density.

Definition 6.25. Let $T$ be a set of primes in $K$. Let

\[ \zeta_{K,T}(s) = \prod_{\mathfrak{p} \in T} (1 - (N\mathfrak{p})^{-s})^{-1} \]

and assume that $(\zeta_{K,T}(s))^n$ extends to a neighborhood of $s = 1$ and that it has a pole of order $m$ at $s = 1$. Then we say that $T$ has polar density $m/n$.

Proposition 6.26. If a set $T$ of primes of $K$ has a polar density, then this density is unique.

Proof. Assume $T$ has polar densities $m/n$ and $m'/n'$. Then $(\zeta_{K,T}(s))^{nn'}$ has a pole of order $mn'$ and also of order $m'n$ at $s = 1$. Hence $m'n = mn'$.

Proposition 6.27. Let $\delta$ be the polar density and $T$ a set of primes in $K$.

(a) The set of all primes in $K$ has polar density 1.

(b) A finite set of primes in $K$ has polar density 0.

(c) $0 \le \delta(T) \le 1$ if $\delta(T)$ is defined.

(d) If $T$ is the disjoint union of $T_1$ and $T_2$, and if any two of $T_1, T_2, T$ have well defined polar density, then so does the third and $\delta(T_1) + \delta(T_2) = \delta(T)$.

(e) If $T' \subset T$, and if both $T$ and $T'$ have well defined polar density, then $\delta(T') \le \delta(T)$.

Proof. (a) is contained in our statement of the Class Number Formula. (b) is immediate from the fact that a finite product of the form

\[ \prod (1 - (N\mathfrak{p})^{-s})^{-1} \]

is holomorphic near $s = 1$.

To prove (d), first note that

\begin{align*}
\zeta_{K,T_1}(s)\,\zeta_{K,T_2}(s) &= \prod_{\mathfrak{p} \in T_1} (1 - (N\mathfrak{p})^{-s})^{-1} \prod_{\mathfrak{p} \in T_2} (1 - (N\mathfrak{p})^{-s})^{-1} \\
&= \prod_{\mathfrak{p} \in T_1 \cup T_2} (1 - (N\mathfrak{p})^{-s})^{-1} = \zeta_{K,T_1 \cup T_2}(s)
\end{align*}

for $\operatorname{Re} s > 1$ (where the product is absolutely convergent). Assume $\delta(T_1) = m_1/n_1$ and $\delta(T_2) = m_2/n_2$ are defined. Then $(\zeta_{K,T_1}(s))^{n_1 n_2}$ has a pole of order $m_1 n_2$ at $s = 1$, and $(\zeta_{K,T_2}(s))^{n_1 n_2}$ a pole of order $m_2 n_1$. Hence $(\zeta_{K,T_1 \cup T_2}(s))^{n_1 n_2}$ has a pole of order $m_1 n_2 + m_2 n_1$, so that $T_1 \cup T_2$ has polar density $(m_1 n_2 + m_2 n_1)/n_1 n_2 = m_1/n_1 + m_2/n_2$. The arguments in which any other two densities are defined are similar.

Now we claim that $\delta(T) \ge 0$. If $\delta(T) < 0$, then $(\zeta_{K,T}(s))^n$ has a zero at $s = 1$ for some $n$. However, the product

\[ \prod_{\mathfrak{p} \in T} (1 - (N\mathfrak{p})^{-s})^{-1} \]

has only terms which are larger than 1 at $s = 1$, and hence so does any power of this product. So the product cannot be zero as $s \to 1$. Now (e) follows from (d) using $T, T', T \setminus T'$, and (c) follows from (e) as in the case of Dirichlet density.

Proposition 6.28. Let $T$ be a set of primes of $K$. If the polar density of $T$ exists, then so does the Dirichlet density and the two are equal.

Proof. Assume $T$ has polar density $m/n$, and let

\[ (\zeta_{K,T}(s))^n = \frac{a}{(s - 1)^m} + \frac{f(s)}{(s - 1)^{m-1}} \]

be the expansion of $(\zeta_{K,T}(s))^n$ about $s = 1$, where $f$ is holomorphic near 1. Then $a > 0$ since the terms in the product for $\zeta_{K,T}(s)$ are positive near $s = 1$. Hence, using the methods in Proposition 6.24, we can take the logarithm of both sides and find

\[ n \sum_{\mathfrak{p} \in T} \frac{1}{(N\mathfrak{p})^s} \sim m \log \frac{1}{s - 1}. \]

This proves the proposition.

Remark. The converse of the proposition is not true. In particular, there are sets with irrational Dirichlet density.

Now we prove some results about the splitting of primes in extensions.

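Proposition 6.29. Let $T$ be a set of primes in $K$. If $T$ does not contain any primes $\mathfrak{p}$ for which $N\mathfrak{p}$ is prime in $\mathbb{Z}$ (i.e. the inertia degree over $\mathbb{Q}$ is 1), then $\delta(T) = 0$ with the polar density.

Proof. Let $\mathfrak{p} \in T$ and $N\mathfrak{p} = p^{f_{\mathfrak{p}}}$ for $p$ prime. Then $f_{\mathfrak{p}} \ge 2$. Hence, for real $s > 1/2$,

\[ \zeta_{K,T}(s) = \prod_{\mathfrak{p} \in T} (1 - p^{-f_{\mathfrak{p}} s})^{-1} \le \prod_{\mathfrak{p} \in T} (1 - p^{-2s})^{-1}. \]

This last product converges and is holomorphic at $s = 1$ because it is a subproduct of the product for $(\zeta(2s))^n$. Thus the same is true for $\zeta_{K,T}(s)$. The holomorphicity of $\zeta_{K,T}(s)$ at $s = 1$ means exactly that $\delta(T) = 0$.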


Lemma 6.30. Let $L, M$ be finite extensions of $K$ and $\mathfrak{p}$ a prime of $K$. Then $\mathfrak{p}$ splits completely in $L$ and $M$ if and only if it splits completely in $LM$.

Proof. The backwards direction is obvious. For the forward direction, let $\mathfrak{Q}$ be a prime in any normal extension $F$ containing $LM$, lying over $\mathfrak{p}$, and let $F^d$ be the corresponding decomposition field. Then by hypothesis $F^d$ contains both $L$ and $M$ (Theorem 2.17). Hence it contains $LM$, which proves the lemma.

Let $L/K$ be number fields. We will use the notation $\operatorname{Spl}(L/K)$ for the set of primes of $K$ which split completely in $L$.

Theorem 6.31. Let $L/K$ be number fields, and $M$ the normal closure of $L/K$. Then $\delta(\operatorname{Spl}(L/K)) = 1/[M : K]$.

Proof. A prime in $K$ splits completely in $L$ if and only if it splits completely in every conjugate of $L$ over $K$, if and only if it splits completely in $M$ by the lemma. So we may assume $L/K$ is Galois. Now let $T$ be the set of primes of $L$ lying over a prime in $\operatorname{Spl}(L/K)$. Then $T$ is the complement of the set of primes which ramify or have inertia degree at least 2. This set itself is a union of a finite set and a set of density zero (by Proposition 6.29). Thus $T$ has density 1. However, since $N\mathfrak{p} = N\mathfrak{P}$ for $\mathfrak{P} \in T$ lying over $\mathfrak{p} \in \operatorname{Spl}(L/K)$, we have

\[ \Bigl( \prod_{\mathfrak{p} \in \operatorname{Spl}(L/K)} (1 - (N\mathfrak{p})^{-s})^{-1} \Bigr)^{[L:K]} = \prod_{\mathfrak{P} \in T} (1 - (N\mathfrak{P})^{-s})^{-1}. \]

We just showed that the right hand side extends to have a simple pole at $s = 1$, hence so does the left hand side. This means exactly that $\delta(\operatorname{Spl}(L/K)) = 1/[L : K]$.
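A non-Galois example makes the role of the normal closure visible. Take $K = \mathbb{Q}$ and $L = \mathbb{Q}(\sqrt[3]{2})$, whose normal closure $M = \mathbb{Q}(\sqrt[3]{2}, \zeta_3)$ has degree 6, so the theorem predicts density $1/6$. The sketch below assumes the standard criterion (not proved in this section) that a prime $p > 3$ splits completely in $L$ exactly when $x^3 - 2$ has three roots modulo $p$, which happens exactly when $p \equiv 1 \pmod 3$ and $2^{(p-1)/3} \equiv 1 \pmod p$. It measures the proportion among primes up to a bound, which is a natural-density experiment rather than a polar-density computation. Function names are illustrative.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for q in range(p * p, limit + 1, p):
                is_prime[q] = False
    return [p for p, flag in enumerate(is_prime) if flag]

def splits_completely(p):
    """Assumed criterion for p > 3: x^3 - 2 has three distinct roots modulo p."""
    return p % 3 == 1 and pow(2, (p - 1) // 3, p) == 1

def split_primes(limit):
    """Primes 3 < p <= limit that split completely in Q(2^(1/3))."""
    return [p for p in primes_up_to(limit) if p > 3 and splits_completely(p)]

def split_fraction(limit):
    """Proportion of primes 3 < p <= limit that split completely."""
    candidates = [p for p in primes_up_to(limit) if p > 3]
    return len(split_primes(limit)) / len(candidates)
```

The first few such primes are 31, 43, 109, 127, 157, and the observed proportion should settle near $1/6 \approx 0.1667$, not near $1/3 = 1/[L : \mathbb{Q}]$.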

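Theorem 6.32. Let $L, M$ be finite Galois extensions of $K$. Then $L \subset M$ if and only if $\operatorname{Spl}(M/K) \subset \operatorname{Spl}(L/K)$. Hence $L = M$ if and only if $\operatorname{Spl}(L/K) = \operatorname{Spl}(M/K)$.

Proof. By Lemma 6.30, $\operatorname{Spl}(LM/K) = \operatorname{Spl}(L/K) \cap \operatorname{Spl}(M/K)$. This implies that if $\operatorname{Spl}(M/K) \subset \operatorname{Spl}(L/K)$, then $\operatorname{Spl}(M/K) = \operatorname{Spl}(LM/K)$. Then by the previous theorem, $[M : K] = [LM : K]$, i.e. $L \subset M$. This proves one implication. The other is obvious.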


Part II

Towards a More Advanced Theory

7 Class Field Theory

Class field theory is exceptional for many reasons. One is that the theory has only a few main results, which are considerably beautiful and simple. A second reason is that the proofs of these results are considerably difficult. A third is that there are many different proofs of the same results, which are considerably diverse. A fourth is that the theory has a considerably rich history.

The approach we take to class field theory in these notes is considerably unorthodox. First, we do not develop the theory in full as we leave many of the auxiliary results as exercises. But the theory is developed in full in that, if the reader solves all of the exercises, he or she shall enjoy a complete proof of the main theorems.

The approach we take to the proofs is principally cohomological. We will use the cohomology of groups to develop the proofs of the main theorems of local class field theory. Local class field theory will be used to develop global class field theory besides the Chebotarev Density Theorem, and the Chebotarev Density Theorem will be proved using the global class field theory already developed.

We have divided our exposition of class field theory into global and local. This means, respectively, the theory of abelian extensions of number fields, and the theory of abelian extensions of local fields (characteristic 0 and characteristic $p$ are considered simultaneously). Our goal will be to classify the abelian extensions of these fields in terms of the arithmetic of the fields themselves. The most prominent theorem is the Artin Reciprocity Law, which has a local and a global version. It provides a natural map from certain arithmetic constructions arising from the field in question, to the abelianized absolute Galois group of the field in question. Moreover, it says that this map has some very nice properties. The fact that the arithmetic of these fields has so much to do with their Galois theory is remarkable.

Sources

Prerequisites

The reader should know infinite Galois theory, which is the Galois theory of infinite algebraic extensions. It is also the inverse limit of ordinary Galois theory. It would also help the reader to know some basic homological algebra, up to the theory of Ext and Tor.

Here is the first exercise.

Exercise 7.1. Prove that the composition of two abelian extensions is abelian. Prove that the composition of two unramified extensions (of a local field or number field) is unramified. Conclude that every field has a largest abelian extension, whose Galois group must be the abelianization (largest abelian quotient) of the absolute Galois group. Conclude also that any number field or local field has a maximal unramified extension. You do not need to give the Galois group here.


7.1 Global Class Field Theory

7.2 Local Class Field Theory

7.3 The Proofs I: Group Cohomology

7.4 The Proofs II: Local Class Field Theory

7.5 The Proofs III: Global Class Field Theory

7.6 The Proofs IV: The Chebotarev Density Theorem

8 Local Fields and Function Fields

8.1 Local Fields and Their Classification

8.2 The Arithmetic of Function Fields

8.3 Finiteness of the Class Number and Unit Theorem for Function Fields

8.4 The Zeta Function of a Function Field

8.5 The Analytic Continuation and Functional Equation of the Zeta Function

8.6 Overview: Adeles, Ideles, and Class Field Theory

9 Tate’s Thesis

9.1 Abstract Harmonic Analysis

9.2 Analysis on Local Fields

9.3 Analysis on Adeles and Ideles

9.4 Local Zeta Functions

9.5 Tate’s Riemann-Roch Theorem

9.6 Global Zeta Functions and Their Functional Equation

9.7 Another Proof of Arithmetic Serre Duality

References

[1] J. W. S. Cassels and A. Fröhlich, Algebraic Number Theory. Academic Press, London, 1967.

[2] H. Davenport, Multiplicative Number Theory, Third Edition. Graduate Texts in Mathematics 74. Springer-Verlag, Berlin, 2000.

[3] G. Folland, A Course in Abstract Harmonic Analysis. Studies in Advanced Mathematics. CRC Press, Boca Raton, 1995.

[4] K. Ireland and M. Rosen, Classical Introduction to Modern Number Theory. Graduate Texts in Mathematics 84. Springer-Verlag, Berlin, 1990.

[5] S. Lang, Algebraic Number Theory, Second Edition. Graduate Texts in Mathematics 110. Springer-Verlag, Berlin, 1994.

[6] D. Marcus, Number Fields. Universitext. Springer-Verlag, Berlin, 1977.

[7] J. S. Milne, Algebraic Number Theory. http://www.jmilne.org/math/CourseNotes/ANT.pdf.

[8] J. S. Milne, Class Field Theory. http://www.jmilne.org/math/CourseNotes/CFT.pdf.

[9] J. Neukirch, Algebraic Number Theory. Grundlehren der Mathematischen Wissenschaften 322. Springer-Verlag, Berlin, 1999.

[10] D. Ramakrishnan and R. Valenza, Fourier Analysis on Number Fields. Graduate Texts in Mathematics 186. Springer-Verlag, Berlin, 1999.

[11] M. Rosen, Number Theory in Function Fields. Graduate Texts in Mathematics 210. Springer-Verlag, Berlin, 2002.

[12] G. van der Geer and R. Schoof, Effectivity of Arakelov Divisors and the Analogue of the Theta Divisor of a Number Field. Selecta Math. New Ser. 6 (2000), 377-398.
