MATH 4581: ABSTRACT ALGEBRA II NOTES by PROFESSOR … · 2020-01-10 · Chapter 1: Review of Linear Algebra Linear algebra is the most extensively applied area in all of algebra,

MATH 4581:

ABSTRACT ALGEBRA II

NOTES by PROFESSOR RONALD SOLOMON

Copyright May 2013


Table of Contents

Chapter 0: Introduction

Chapter 1: Review of Linear Algebra

Chapter 2: Linear Operators

Chapter 3: Inner Products and Orthogonal Matrices

Chapter 4: Permutations, Orbits, and Lagrange’s Theorem

Chapter 5: The Platonic Solids and their Symmetries

Chapter 6: The Orbit Counting Formula

Chapter 7: Finite Subgroups of SO(3)

Chapter 8: Imaginaries and Galois Fields

Chapter 9: Symmetric Polynomials & the Fundamental Theorem of Algebra

Chapter 10: The Cubic and Quartic Equations Revisited

Chapter 11: Galois’ Theory of Equations

Chapter 12: The Galois Correspondence

Index


Chapter 0: Introduction

Welcome back! This semester we will go deeper in our investigation of group theory and its applications to geometry and to the theory of equations.

The first high point of the course will be the discussion of the Platonic solids and their symmetry groups, culminating in the classification of all finite groups of rotations of $\mathbb{R}^3$, a theorem of great importance to crystallographers, indeed essentially discovered by the French crystallographer Auguste Bravais around 1837.

The second high point of the course will be the theory of symmetric polynomials and its application to give Euler’s proof of the Fundamental Theorem of Algebra.

The third high point of the course will be the description of Galois’ fundamental work on the theory of equations, including the famous Galois Correspondence Theorem, discovered by Évariste Galois in 1832.

Before reaching these goals we need to review and reinforce our knowledge of some basic linear algebra, especially the theory of linear isometries of $\mathbb{R}^3$. This will be covered in the first three chapters of the notes.

Next, we will need to review and expand our understanding of permutation groups, including Cauchy’s efficient methods for computing with permutations. This material will culminate with a proof of Lagrange’s Orbit-Stabilizer Theorem, the first and still one of the most important theorems in all of group theory. We will apply this theorem to give Cauchy’s proof of the Orbit Counting Formula, sometimes incorrectly called Burnside’s Counting Formula. This is the most important tool for using symmetry to facilitate difficult counting problems. Its applications were extensively developed by Pólya in the early 20th century. This material, along with an introduction to the Platonic solids and their symmetry groups, will occupy Chapters 4 through 6. We will then be equipped to do the classification of the finite 3-dimensional rotation groups, in Chapter 7.

Next we will shift to the theory of equations, beginning in Chapter 8. In Chapter 9, we will develop the theory of symmetric functions and prove the Fundamental Theorem of Algebra. Then, after reviewing the theory of the cubic and quartic polynomials, we will move on to a discussion of Galois’ theory of equations in Chapters 11 and 12.

If time permits during the semester (or if not, after the semester ends), you should study Chapter 13, which takes a deeper look at the arithmetic of the Gaussian integers and uses this to prove Fermat’s Two Squares Theorem.


Chapter 1: Review of Linear Algebra

Linear algebra is the most extensively applied area in all of algebra, and indeed perhaps in all of advanced mathematics. Its only competitor is differential equations. In the next three chapters we review and extend much of the basic linear algebra which you learned in earlier math courses. This first chapter mostly reproduces material from Chapter 14 of the Math 4580 text. But this is so important, it bears repetition and review. Also, note that Lemma 1.13 and Theorem 1.14 are new.

In subsequent chapters, we shall stick to real vector spaces, but for later use in our study of field extensions, we do this preliminary material over an arbitrary field of scalars.

Definition 1.1. Let $F$ be a field. An $F$-vector space is a set $V$ of objects, called vectors, for which two operations are defined:

Vector Addition: To each pair $(v, w)$ of vectors in $V$, there is a vector $v + w \in V$.

and

Scalar Multiplication: To each pair $(a, v)$, $a \in F$, $v \in V$, there is a vector $a \cdot v \in V$.

The following rules are satisfied:

(1) $v + w = w + v$ for all $v, w \in V$;
(2) $u + (v + w) = (u + v) + w$ for all $u, v, w \in V$;
(3) There exists a zero vector $0$ such that $v + 0 = v$ for all $v \in V$;
(4) For each vector $v \in V$, there exists a vector $-v \in V$ such that $v + (-v) = 0$;
(5) $a \cdot (v + w) = a \cdot v + a \cdot w$ for all $a \in F$, $v, w \in V$;
(6) $(a + b) \cdot v = a \cdot v + b \cdot v$ for all $a, b \in F$, $v \in V$;
(7) $(ab) \cdot v = a \cdot (b \cdot v)$ for all $a, b \in F$, $v \in V$; and
(8) $1 \cdot v = v$ for all $v \in V$.

Examples of Vector Spaces

(1) $F^n$ is the vector space of all $n$-tuples $(a_1, \dots, a_n)$ with $a_i \in F$, with the operations of position-wise addition and scalar multiplication. This is just the obvious generalization of $\mathbb{R}^n$, considered algebraically;

(2) $\mathcal{F}$ is the set of all real-valued functions $f : \mathbb{R} \to \mathbb{R}$, with pointwise addition and scalar multiplication, i.e.,

$$(f + g)(x) = f(x) + g(x) \quad \text{for all } x \in \mathbb{R},$$

and

$$(a \cdot f)(x) = a \cdot f(x) \quad \text{for all } a \in \mathbb{R},\ x \in \mathbb{R}.$$

(3) $\mathcal{C}$ is the set of all everywhere continuous functions $f : \mathbb{R} \to \mathbb{R}$;

(4) $\mathcal{D}$ is the set of all everywhere differentiable functions $f : \mathbb{R} \to \mathbb{R}$;

(5) $\mathcal{P}$ is the set of all polynomial functions $f : \mathbb{R} \to \mathbb{R}$.


A subset $W$ of an $F$-vector space $V$ which is itself an $F$-vector space under the same operations is called a subspace of $V$. In the roster of examples (2)–(5) above, each space is a subspace of the one that precedes it. Since the larger space $V$ satisfies all of the listed axioms, it suffices, in order to prove that a non-empty subset $W$ is a subspace, to verify:

(1) $w + w_1 \in W$ for all $w, w_1 \in W$; and
(2) $a \cdot w \in W$ for all $a \in F$ and $w \in W$.

All the examples given above, except for Example 1, are infinite-dimensional vector spaces. We shall be interested primarily in finite-dimensional vector spaces.

Definition 1.2. Let $\{v_1, v_2, \dots, v_n\}$ be a set of vectors in $V$. Any vector $v = c_1 \cdot v_1 + \dots + c_n \cdot v_n$, with $c_i \in F$ for all $i$, is called a linear combination of the vectors $v_1, \dots, v_n$. The set of all linear combinations of these vectors is called the span of (or the space spanned by) $\{v_1, \dots, v_n\}$.

Lemma 1.3. $\mathrm{Span}(\{v_1, \dots, v_n\})$ is a subspace of $V$.

Proof. We have

$$(a_1 \cdot v_1 + \dots + a_n \cdot v_n) + (b_1 \cdot v_1 + \dots + b_n \cdot v_n) = (a_1 + b_1) \cdot v_1 + \dots + (a_n + b_n) \cdot v_n;$$

and

$$c \cdot (a_1 \cdot v_1 + \dots + a_n \cdot v_n) = (ca_1) \cdot v_1 + \dots + (ca_n) \cdot v_n.$$

Hence the span is closed under vector addition and scalar multiplication.

□

Definition 1.4. We say that $V$ is a finite-dimensional vector space if there exists a finite set $\{v_1, \dots, v_n\}$ of vectors which spans $V$.

Lemma 1.5. Suppose $V$ is a finite-dimensional vector space. Let $B = \{v_1, \dots, v_n\}$ be a minimal spanning set for $V$. Then every vector $v \in V$ is uniquely expressible as a linear combination of the vectors in $B$.

Proof. Let $v \in V$. By definition of a spanning set, $v$ is a linear combination of the vectors in $B$. Suppose the expression is not unique, i.e.,

$$v = a_1 \cdot v_1 + \dots + a_n \cdot v_n = b_1 \cdot v_1 + \dots + b_n \cdot v_n,$$

with some $a_i \neq b_i$. Renumbering the vectors if necessary, we may assume that $a_n \neq b_n$. Then

$$v_n = \frac{1}{b_n - a_n}\bigl((a_1 - b_1) \cdot v_1 + \dots + (a_{n-1} - b_{n-1}) \cdot v_{n-1}\bigr).$$

But then $\{v_1, \dots, v_{n-1}\}$ is a spanning set for $V$, properly contained in $B$, contradicting the minimal choice of $B$. Hence the expression for each vector $v$ is unique.

□

We call any minimal spanning set for $V$ a basis for $V$.

This is the important point: If $B = \{v_1, \dots, v_n\}$ is a basis for $V$, then every vector $v$ has a unique set $(a_1, a_2, \dots, a_n)$ of coordinates with respect to the basis $B$.


Definition 1.6. Two $F$-vector spaces $V$ and $W$ are isomorphic if there is a bijective function $f : V \to W$ such that

(1) $f(v + v') = f(v) + f(v')$ for all $v, v' \in V$; and
(2) $f(a \cdot v) = a \cdot f(v)$ for all $a \in F$, $v \in V$.

In other words, $f : V \to W$ is an invertible linear transformation. We write $V \cong W$.

Theorem 1.7. Let $V$ be a finite-dimensional $F$-vector space with a basis $B = \{v_1, \dots, v_n\}$. Then $V \cong F^n$.

Proof. For each $v \in V$, let

$$v = a_1 \cdot v_1 + \dots + a_n \cdot v_n$$

be the unique expression for $v$ as a linear combination of the vectors in $B$. Define $f : F^n \to V$ by

$$f(a_1, \dots, a_n) = a_1 \cdot v_1 + \dots + a_n \cdot v_n.$$

Uniqueness of expression implies that this function is one-to-one. Since $B$ is a basis for $V$, the map is surjective. Hence $f$ is a bijective function. It is easy to check that $f$ is a linear transformation.

□

Thus, every finite-dimensional vector space can be coordinatized and thereby identified with $F^n$. The identification is not at all “natural”. Each choice of basis gives a different coordinatization. However, as we shall see, the dimension of $V$ is a fixed number, independent of the choice of coordinatization.

Example 1.8: Let $\mathcal{P}_n$ be the set of all polynomials (with real coefficients) of degree at most $n$. Clearly, each polynomial $p(x)$ is uniquely expressible as

$$p(x) = a_0 \cdot 1 + a_1 \cdot x + \dots + a_n \cdot x^n,$$

for some $a_0, a_1, \dots, a_n \in \mathbb{R}$. In other words, $\{1, x, x^2, \dots, x^n\}$ is a basis for $\mathcal{P}_n$, and the map

$$p(x) \mapsto (a_0, a_1, \dots, a_n)$$

defines an isomorphism between $\mathcal{P}_n$ and $\mathbb{R}^{n+1}$.

But, again, the basis $\{1, x, \dots, x^n\}$, though an obvious one, is by no means the only one. Indeed the whole subject of orthogonal polynomials investigates other choices of basis (such as Legendre polynomials) which are more useful for certain applications. We will not pursue this theme here.
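The coordinatization in Example 1.8 can be tried out concretely. The following is a minimal sketch in Python, assuming polynomials are stored as dictionaries mapping degree to coefficient; the names `coords`, `add`, and `scale` are illustrative and do not come from these notes. It checks, on one example, that taking coordinates respects addition and scalar multiplication.

```python
def coords(poly, n):
    """Coordinates (a_0, ..., a_n) of a polynomial stored as {degree: coefficient}."""
    return tuple(poly.get(k, 0) for k in range(n + 1))

def add(p, q):
    """Sum of two polynomials stored as {degree: coefficient}."""
    return {k: p.get(k, 0) + q.get(k, 0) for k in set(p) | set(q)}

def scale(c, p):
    """Scalar multiple c * p."""
    return {k: c * a for k, a in p.items()}

n = 2
p = {0: 1, 2: 3}     # 1 + 3x^2
q = {1: -2, 2: 1}    # -2x + x^2

# Coordinates of p + q versus the sum of the coordinate vectors.
lhs = coords(add(p, q), n)
rhs = tuple(a + b for a, b in zip(coords(p, n), coords(q, n)))
print(lhs, rhs)
```

A single example is of course not a proof; it only illustrates what "the map is linear" means in coordinates.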

Definition 1.9. A set $S = \{v_1, \dots, v_k\}$ of vectors is linearly independent if the only linear combination of $v_1, \dots, v_k$ which equals the zero vector is the “trivial” one:

$$0 \cdot v_1 + \dots + 0 \cdot v_k = 0.$$

In other words, $0$ is uniquely expressible as a linear combination of the vectors in $S$.


Lemma 1.10. If $S$ is a linearly independent set of vectors in $F^m$, then $|S| \le m$.

Proof. Let $\{v_1, \dots, v_n\}$ be a subset of $F^m$. Write

$$v_i = (a_{1i}, a_{2i}, \dots, a_{mi})$$

for each $i$. Then $c_1 v_1 + \dots + c_n v_n = 0$ if and only if $(c_1, c_2, \dots, c_n)$ is a solution of the homogeneous system of simultaneous linear equations $Ax = 0$, where

$$A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix}.$$

If $n > m$, there is a non-zero solution $(c_1, c_2, \dots, c_n)$, and so $\{v_1, \dots, v_n\}$ is not linearly independent.

□
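The key step of this proof, that a homogeneous system with more unknowns than equations has a non-zero solution, can be made concrete. Below is a minimal sketch in Python, assuming $F = \mathbb{Q}$ and using Gauss-Jordan elimination with exact fractions; the function name `nonzero_solution` is illustrative and not part of these notes.

```python
from fractions import Fraction

def nonzero_solution(A):
    """Return a non-zero solution c of A c = 0 for an m x n matrix A with n > m."""
    m, n = len(A), len(A[0])
    assert n > m, "needs more unknowns than equations"
    M = [[Fraction(x) for x in row] for row in A]
    pivots = []              # pivot column of each pivot row, in order
    r = 0                    # index of the next pivot row
    for col in range(n):
        # Look for a row at or below r with a non-zero entry in this column.
        pr = next((i for i in range(r, m) if M[i][col] != 0), None)
        if pr is None:
            continue
        M[r], M[pr] = M[pr], M[r]
        lead = M[r][col]
        M[r] = [x / lead for x in M[r]]
        for i in range(m):
            if i != r:
                factor = M[i][col]
                M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        pivots.append(col)
        r += 1
        if r == m:
            break
    # With n > m there is always at least one non-pivot (free) column.
    free = next(c for c in range(n) if c not in pivots)
    sol = [Fraction(0)] * n
    sol[free] = Fraction(1)
    for row, pc in enumerate(pivots):
        sol[pc] = -M[row][free]
    return sol

# Columns are v1 = (1, 2), v2 = (3, 4), v3 = (5, 6): three vectors in Q^2.
A = [[1, 3, 5],
     [2, 4, 6]]
c = nonzero_solution(A)
print(c)
```

The returned coefficients give an explicit linear dependence among the three column vectors, which is exactly what the lemma predicts for $n = 3 > m = 2$.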

Lemma 1.11. Let $f : V \to W$ be an isomorphism of vector spaces. Then $f$ maps linearly independent sets to linearly independent sets, and $f$ maps spanning sets to spanning sets.

Proof. Let $\{v_1, \dots, v_m\}$ be a linearly independent subset of $V$. If

$$c_1 \cdot f(v_1) + \dots + c_m \cdot f(v_m) = 0,$$

then

$$f(c_1 \cdot v_1 + \dots + c_m \cdot v_m) = 0,$$

and then, since $f$ is injective,

$$c_1 \cdot v_1 + \dots + c_m \cdot v_m = 0.$$

Hence $c_1 = \dots = c_m = 0$, and so $\{f(v_1), \dots, f(v_m)\}$ is a linearly independent subset of $W$.

Suppose $\{v_1, \dots, v_m\}$ spans $V$. Let $w \in W$. Since $f$ is surjective, there exists $v \in V$ with $f(v) = w$. Write

$$v = a_1 \cdot v_1 + \dots + a_m \cdot v_m.$$

Then

$$w = f(v) = a_1 \cdot f(v_1) + \dots + a_m \cdot f(v_m).$$

Hence $\{f(v_1), \dots, f(v_m)\}$ is a spanning set for $W$.

□

Theorem 1.12. The following conclusions hold:

(1) If $F^n \cong F^m$, then $m = n$;
(2) If $V$ is a finite-dimensional vector space, then every basis $B$ of $V$ has the same cardinality $n$. (We call this number the dimension of $V$, $\dim(V)$.); and
(3) If $\dim(V) = n$, then every linearly independent subset of $V$ has cardinality at most $n$.


Proof. Suppose that $m < n$ and $f : F^n \to F^m$ is an isomorphism. Clearly, the standard basis $B = \{e_1 = (1, 0, \dots, 0), \dots, e_n = (0, 0, \dots, 1)\}$ is a linearly independent set of vectors in $F^n$. Then, by Lemma 1.11, $f(B) = \{f(e_1), \dots, f(e_n)\}$ is a linearly independent subset of $F^m$ of cardinality $n > m$, contrary to Lemma 1.10. If instead $m > n$, the same argument applies to the isomorphism $f^{-1} : F^m \to F^n$. This proves (1).

Now if $V$ is a finite-dimensional vector space with a basis $B$ of cardinality $n$, then $V \cong F^n$. Since isomorphism is a transitive relation, it follows by (1) that $n$ is uniquely determined, proving (2). Finally (3) follows by the same argument as in (1): If $S$ is a linearly independent subset of $V$ and $f : V \to F^n$ is an isomorphism, then $f(S)$ is a linearly independent subset of $F^n$, whence $|S| \le n$.

□

We have seen that every minimal spanning set for $V$ is a basis. Here is another way to recognize a set as a basis for $V$.

Lemma 1.13. Let $V$ be a finite-dimensional vector space. If $B$ is a maximal linearly independent subset of $V$, then $B$ is a basis for $V$.

Proof. Let $B = \{v_1, \dots, v_n\}$. Let $v \in V - B$. Then the set $B \cup \{v\}$ is not linearly independent. Hence there is a set of scalars, not all $0$, such that

$$c_1 \cdot v_1 + \dots + c_n \cdot v_n + c \cdot v = 0.$$

If $c = 0$, then $B$ is not a linearly independent set, contrary to assumption. Hence $c \neq 0$ and we can solve for $v$:

$$v = -\frac{1}{c}(c_1 \cdot v_1 + \dots + c_n \cdot v_n).$$

Hence $B$ is a spanning set for $V$. Suppose that

$$a_1 \cdot v_1 + \dots + a_n \cdot v_n = b_1 \cdot v_1 + \dots + b_n \cdot v_n.$$

Then

$$(a_1 - b_1) \cdot v_1 + \dots + (a_n - b_n) \cdot v_n = 0.$$

Since $B$ is a linearly independent set, $a_i = b_i$ for all $i$. Hence each vector in $V$ is uniquely expressible as a linear combination of the vectors in $B$, i.e., $B$ is a basis for $V$, as claimed.

□

From this, we get the following very important extendibility result.

Theorem 1.14. Let $V$ be a finite-dimensional vector space. Let $U$ be a subspace of $V$. Then $U$ is also finite-dimensional, with $\dim(U) \le \dim(V)$ and with equality only if $U = V$. Moreover, if $B$ is any basis for $U$, then it is extendible to a basis $B^*$ for $V$.

Note. By convention, if $U = \{0\}$, then the empty set is a basis for $U$, and $\dim(U) = 0$.

Proof. Suppose $\dim(V) = n$. By the Note, we may assume that $U$ contains a non-zero vector $u$. Then $\{u\}$ is a linearly independent subset of $U$. Since any subset of $U$ containing more than $n$ vectors is linearly dependent, there must be a maximal linearly independent subset $B$ of $U$ with $|B| \le n$. Then $B$ is a basis for $U$ and $\dim(U) = |B| \le n$. Extend $B$ to a maximal linearly independent subset $B^*$ of $V$, of cardinality $n$. Then $B^*$ is a basis for $V$. If $\dim(U) = n$, then $B = B^*$ and so $U = V$.

□

Exercises

1. Let $X$ be any non-empty set. Let $\mathcal{F}(X, \mathbb{R}^n)$ be the set of all functions with domain $X$ and co-domain $\mathbb{R}^n$. (The actual range of any one of these functions may be a proper subset of $\mathbb{R}^n$. Indeed, it may be a single point in $\mathbb{R}^n$.) Define addition and scalar multiplication pointwise, i.e., if $f$ and $g$ are functions in $\mathcal{F}(X, \mathbb{R}^n)$, and if $f(x) = (a_1, a_2, \dots, a_n)$, $g(x) = (b_1, b_2, \dots, b_n)$, and if $c$ is a real number, then

$$(f + g)(x) = f(x) + g(x) = (a_1 + b_1, a_2 + b_2, \dots, a_n + b_n),$$

and

$$(c \cdot f)(x) = c \cdot f(x) = (ca_1, ca_2, \dots, ca_n).$$

Prove: $\mathcal{F}(X, \mathbb{R}^n)$ is a real vector space.

2. Verify that $\mathcal{C}$, $\mathcal{D}$ and $\mathcal{P}$ are vector subspaces of $\mathcal{F}$.

3. Verify that the function $f : F^n \to V$ defined in Theorem 1.7 is an isomorphism of vector spaces.

4. Let $\Pi$ be the plane in $\mathbb{R}^3$ whose equation is: $x - y + z = 0$.

(a) Verify that $\Pi$ is a vector subspace of $\mathbb{R}^3$.

(b) Give a basis $B$ for $\Pi$.

(c) Extend the basis $B$ to a basis $B^*$ for $\mathbb{R}^3$.

5. Let $\Lambda$ be the line in $\mathbb{R}^3$ given parametrically by $(x, y, z) = (3t, t, -2t)$.

(a) Verify that $\Lambda$ is a vector subspace of $\mathbb{R}^3$.

(b) Give an equation for the plane $\Pi$ passing through $(0, 0, 0)$ which is perpendicular to $\Lambda$.

(c) Give an orthonormal basis $B = \{u, v, w\}$ for $\mathbb{R}^3$ such that $\{u\}$ is a basis for $\Lambda$. [$u$, $v$, and $w$ are mutually perpendicular unit vectors.]

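6. Let $A \cdot x = 0$ be a system of linear equations, where

$$A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix}.$$

Prove: The set of all solutions of this system of equations is a vector subspace of $\mathbb{R}^n$.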

7. Let $y'''(t) + a\,y''(t) + b\,y(t) = 0$ be a linear differential equation, for some $a, b \in \mathbb{R}$. Prove: The set of all solutions of this differential equation is a vector subspace of the space $\mathcal{D}$ of all everywhere differentiable functions $f : \mathbb{R} \to \mathbb{R}$.

[Note: There is nothing special about three derivatives. This is just an example. The same statement would be true for arbitrary $n$-th order linear ODEs.]

8. Let $V$ be a vector space. Let $U$ and $W$ be subspaces of $V$.

(a) Prove: $U \cap W$ is a subspace of $V$.

(b) Prove: $U \cup W$ is a subspace of $V$ if and only if $U \subseteq W$ or $W \subseteq U$.

(c) Let $U + W := \{v = u + w \in V : u \in U \text{ and } w \in W\}$. Prove that $U + W$ is a subspace of $V$.

(d) Prove: Suppose $U \cap W = \{0\}$. Suppose that $B$ is a finite basis for $U$ and $B_1$ is a finite basis for $W$. Then $B \cup B_1$ is a finite basis for $U + W$. (It is then common to denote $U + W$ as $U \oplus W$, and call it the direct sum of $U$ and $W$.)

(e) Prove: Suppose $U + W$ is finite-dimensional. It is possible to choose a basis $B$ for $U$ and a basis $B_1$ for $W$ such that $B \cap B_1$ is a basis for $U \cap W$.

(f) Show by example that, in general, if $B$ is a basis for $U$ and $B_1$ is a basis for $W$, then $B \cap B_1$ is NOT a basis for $U \cap W$.

9. Let $V$ be a real vector space. We say that a (possibly infinite) subset $B$ of $V$ is a basis for $V$ if and only if every vector $v \in V$ is uniquely expressible as a finite linear combination of vectors in $B$. (Thus, $B$ is a linearly independent spanning set for $V$.) Suppose that $B$ is a basis for $V$.

Prove: $V$ is isomorphic to the subspace $\mathcal{F}_0$ of the vector space $\mathcal{F}(B, \mathbb{R})$ defined by: $f \in \mathcal{F}_0$ if and only if $f(b) = 0$ for all but finitely many $b \in B$.

[Hint: Define $\phi : V \to \mathcal{F}(B, \mathbb{R})$ as follows. If $v = c_1 \cdot b_1 + \dots + c_n \cdot b_n$ for some $b_1, \dots, b_n$ in $B$ and some scalars $c_1, \dots, c_n$, let $\phi(v)$ be the function $f_v : B \to \mathbb{R}$ defined by: $f_v(b_i) = c_i$, $1 \le i \le n$, $f_v(b) = 0$ for all $b \in B - \{b_1, \dots, b_n\}$.]

Note: Using the Axiom of Choice, it is possible to prove that every vector space has a basis.


Chapter 2: Linear Operators

Most of the material in this chapter remains valid if $\mathbb{R}$ is replaced by an arbitrary field $F$, but we restrict our attention to operators on real vector spaces for application to Euclidean geometry.

The most elementary non-trivial class of functions from $\mathbb{R}^n$ to $\mathbb{R}^n$ are the linear operators:

$$f(x_1, x_2, \dots, x_n) = (a_{11} x_1 + a_{12} x_2 + \dots + a_{1n} x_n,\ \dots,\ a_{n1} x_1 + a_{n2} x_2 + \dots + a_{nn} x_n).$$

The value of linear operators is threefold:

(1) We can understand them better than more complicated functions.
(2) We can approximate nice smooth functions $f$ in the neighborhood of each point $P$ by the Jacobian matrix $\mathrm{Jac}_f(P)$ of partial derivatives, which is a linear function approximating $f$ near the point $P$. (This is the generalization of approximating a smooth curve in $\mathbb{R}^2$ by its tangent line at the point $P$.)
(3) Many important functions are linear (or affine) operators, including isometries and projection maps.

It is convenient to use matrix notation for linear operators, and for this, it is standard to write vectors as column vectors:

$$v = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}.$$

Then, if

$$A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{pmatrix},$$

we have

$$f(v) = A \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}.$$

The following is a basis-free definition of linear operators.

Definition 2.1. Let $V$ be a vector space and let $f : V \to V$ be a function satisfying:

(1) $f(v + w) = f(v) + f(w)$ for all $v, w \in V$; and
(2) $f(a \cdot v) = a \cdot f(v)$ for all $a \in \mathbb{R}$, $v \in V$.

Then we call $f$ a linear operator on $V$.

In the finite-dimensional case, the use of a basis and coordinates affords the relationship between the two definitions.


Lemma 2.2. Let $V$ be a finite-dimensional vector space with a basis $B = \{v_1, \dots, v_n\}$. Let $f : V \to V$ be a linear operator. Identify $V$ with $\mathbb{R}^n$ via the coordinatization:

$$v = a_1 \cdot v_1 + \dots + a_n \cdot v_n \longmapsto \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix}.$$

Then $f$, written in $B$-coordinates, is the function:

$$f\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}$$

where

$$f(v_i) = \begin{pmatrix} a_{1i} \\ a_{2i} \\ \vdots \\ a_{ni} \end{pmatrix}.$$

Conversely, given any $n \times n$ matrix $A$, the function $f_A : V \to V$ defined, with respect to the coordinatization given by $B$, by:

$$f_A\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = A \cdot \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}$$

is a linear operator on $V$.

Proof. The rules of matrix multiplication show that any function $f(x) = A \cdot x$ is a linear operator. Moreover, any linear operator $f$ is uniquely determined by the set of values $\{f(v_1), \dots, f(v_n)\}$ where $\{v_1, \dots, v_n\}$ is any basis for $V$. If $A$ is the $n \times n$ matrix whose Column $i$ is the column vector $f(v_i)$, then $f(v_i) = A \cdot v_i$, since in $B$-coordinates $v_i$ is the $i$-th standard basis vector. Hence, the linear operator $f$ agrees with the function $x \mapsto A \cdot x$.

□

The discussion above also shows:

Lemma 2.3. Let $V$ be a vector space with a basis $B$. Let $f : B \to V$ be any function. Then there is a unique linear operator $f^* : V \to V$ such that $f^*$ extends $f$. Moreover $f^*$ is an invertible linear operator if and only if $f(B)$ is a basis for $V$ if and only if the associated matrix $A$ is invertible.

Examples of Linear Operators

1. Let $\mathcal{P}$ be the vector space of all polynomials. Define $D : \mathcal{P} \to \mathcal{P}$ by $D(p) = \frac{dp}{dx}$.

Then $D$ is a linear operator on $\mathcal{P}$. Note that $D^2 := D \circ D$ is the second derivative operator. Indeed, if $p(x) = a_n x^n + \dots + a_1 x + a_0$ is any polynomial, then

$$p(D) := a_n \cdot D^n + \dots + a_1 \cdot D + a_0 \cdot I$$

is a linear differential operator on $\mathcal{P}$, and more generally, on the space of all infinitely differentiable functions of a real variable.


2. Let $\mathcal{C}$ be the vector space of all continuous real-valued functions. Define $J : \mathcal{C} \to \mathcal{C}$ by $J(f) = \int_0^x f(t)\,dt$. Then $J$ is a linear operator.

The theory of differential operators and integral operators has played an important role in the study of differential equations and physics over the past century.

3. Let $\rho : \mathbb{R}^2 \to \mathbb{R}^2$ be the rotation about $(0, 0)$ counterclockwise through the angle $\theta$. Then $\rho$ is a linear operator with associated matrix

$$\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.$$

4. Let $r : \mathbb{R}^2 \to \mathbb{R}^2$ be the reflection of $\mathbb{R}^2$ across the line $y = mx$, where $m = \tan(\theta/2)$. Then $r$ is a linear operator with associated matrix

$$\begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}.$$

Projection maps onto lines $y = mx$ are also linear operators on $\mathbb{R}^2$.

Two subspaces naturally associated with a linear operator $T : V \to V$ are the kernel of $T$,

$$\mathrm{Ker}(T) = \{v \in V : T(v) = 0\},$$

and the range of $T$, $T(V)$,

$$T(V) = \{T(v) : v \in V\}.$$

The following theorem is fundamental.

Theorem 2.4. Let $T : V \to V$ be a linear operator on the finite-dimensional vector space $V$. Then

$$\dim(\mathrm{Ker}(T)) + \dim(T(V)) = \dim(V).$$

Proof. Let $B_0$ be a basis for $\mathrm{Ker}(T)$. By Theorem 1.14, extend $B_0$ to a basis $B = B_0 \cup B_1$ for $V$. Let $V_1 = \mathrm{Span}(B_1)$ and let $T_1 : V_1 \to T(V)$ be the restriction of $T$ to $V_1$. Since $B_0 \cap B_1 = \emptyset$ and $B$ is linearly independent, we have that

$$\mathrm{Ker}(T) \cap V_1 = \{0\}.$$

We claim that $T_1$ is an isomorphism of vector spaces.

Note that $V = \mathrm{Ker}(T) + V_1$. Let $v \in V$. Write $v = u + v_1$ with $u \in \mathrm{Ker}(T)$, $v_1 \in V_1$. Then

$$T(v) = T(u + v_1) = T(u) + T(v_1) = 0 + T(v_1) = T(v_1) = T_1(v_1).$$

Hence $T_1(V_1) = T(V_1) = T(V)$, i.e., $T_1$ is surjective. Suppose that $v \in \mathrm{Ker}(T_1)$. Then $v \in \mathrm{Ker}(T) \cap V_1 = \{0\}$. Hence $T_1$ is injective. Thus $T_1$ is an isomorphism, as claimed. Hence

$$\dim(T(V)) = \dim(V_1) = |B_1| = |B| - |B_0| = \dim(V) - \dim(\mathrm{Ker}(T)),$$

proving the theorem.

□
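The dimension count in Theorem 2.4 can be checked on a concrete matrix. The sketch below (Python, exact rational arithmetic) row-reduces a matrix, reads off the rank as the number of pivots, and builds an explicit basis of the kernel; the helper names `rref` and `kernel_basis` are illustrative and not taken from these notes. It illustrates the theorem on one example and is not a substitute for the proof.

```python
from fractions import Fraction

def rref(A):
    """Return (R, pivots): the reduced row echelon form of A and its pivot columns."""
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols = len(M), len(M[0])
    pivots, r = [], 0
    for c in range(cols):
        pr = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pr is None:
            continue
        M[r], M[pr] = M[pr], M[r]
        lead = M[r][c]
        M[r] = [x / lead for x in M[r]]
        for i in range(rows):
            if i != r:
                factor = M[i][c]
                M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == rows:
            break
    return M, pivots

def kernel_basis(A):
    """One basis vector of the null space of A for each non-pivot column."""
    R, pivots = rref(A)
    n = len(A[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(1)
        for row, pc in enumerate(pivots):
            v[pc] = -R[row][free]
        basis.append(v)
    return basis

A = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]
rank = len(rref(A)[1])       # dim T(V): number of pivot columns
kernel = kernel_basis(A)
nullity = len(kernel)        # dim Ker(T)
print(rank, nullity, kernel)
```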

For the purpose of studying linear operators on higher dimensional vector spaces, it is convenient whenever possible to break the problem down into “bite-sized” pieces, by finding smaller invariant subspaces.

Definition 2.5. Let $T : V \to V$ be a linear operator. A subspace $W$ of $V$ is said to be $T$-invariant if $T(w) \in W$ for all $w \in W$, i.e. by restriction of domain, $T$ defines a linear operator $T_W : W \to W$.

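Example. Let $D : \mathcal{P} \to \mathcal{P}$ be the differentiation operator. Let $\mathcal{P}_n$ be the subspace of polynomials of degree at most $n$. Then, for all $n$, $\mathcal{P}_n$ is an $(n+1)$-dimensional $D$-invariant subspace of $\mathcal{P}$. [Note: In fact, $D(\mathcal{P}_n) = \mathcal{P}_{n-1} \subseteq \mathcal{P}_n$.]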
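Because $\mathcal{P}_n$ is $D$-invariant and finite-dimensional, Lemma 2.2 gives $D$ a matrix on it. A small sketch in Python, assuming the basis $\{1, x, \dots, x^n\}$ of Example 1.8 (the names `derivative_matrix` and `apply` are illustrative, not from these notes): column $j$ holds the coordinates of $D(x^j) = j\,x^{j-1}$.

```python
def derivative_matrix(n):
    """Matrix of D restricted to P_n in the basis {1, x, ..., x^n}."""
    A = [[0] * (n + 1) for _ in range(n + 1)]
    for j in range(1, n + 1):
        A[j - 1][j] = j          # D(x^j) = j * x^(j-1)
    return A

def apply(A, v):
    """Matrix-vector product A . v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = derivative_matrix(3)
p = [5, -1, 0, 2]                # coordinates of 5 - x + 2x^3
dp = apply(A, p)                 # coordinates of the derivative -1 + 6x^2
print(dp)
```

The last row of the matrix is zero, which reflects the fact that the image of $\mathcal{P}_n$ under $D$ lies in $\mathcal{P}_{n-1}$.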

Other than the $0$-subspace, the smallest possible $T$-invariant subspaces are 1-dimensional. If $W = \mathrm{Span}(\{w\})$ is a 1-dimensional $T$-invariant subspace (line), then

$$T(w) = \lambda \cdot w$$

for some scalar $\lambda$. We then call $w$ an eigenvector for $T$ with eigenvalue $\lambda$.

Examples

1. Let $D : \mathcal{P} \to \mathcal{P}$ be the differentiation operator. Since $D(f)$ is of lower degree than $f$, the only possible way that $D(f) = \lambda \cdot f$ is if $\lambda = 0$, i.e., $D(f) = 0$. Thus the only eigenvectors for $D$ are non-zero constant functions and the only eigenvalue is $0$. (On the other hand, if we enlarge our space from $\mathcal{P}$ to $C^\infty(\mathbb{R})$, the space of infinitely differentiable real-valued functions, then $D : C^\infty(\mathbb{R}) \to C^\infty(\mathbb{R})$ is still a linear operator, and now the function $f(x) = e^{\lambda x}$ is an eigenfunction for $D$ with eigenvalue $\lambda$, for any real number $\lambda$.)

2. Let $r : \mathbb{R}^2 \to \mathbb{R}^2$ be the reflection across the line $y = mx$. Then the line $y = mx$ is an $r$-invariant 1-dimensional subspace of $\mathbb{R}^2$, and so $(1, m)$ is an eigenvector for $r$ with eigenvalue $1$. Also, if $m \neq 0$, the orthogonal line $y = -\frac{1}{m}x$ is $r$-invariant, but each vector on this line is mapped to its negative. Hence $(m, -1)$ is an eigenvector for $r$ with eigenvalue $-1$. [If $R : \mathbb{R}^2 \to \mathbb{R}^2$ is the reflection across the $x$-axis, $y = 0$, then the $y$-axis is also $R$-invariant, and $(0, 1)$ is an eigenvector with eigenvalue $-1$.]

3. If $\rho = \rho_\theta : \mathbb{R}^2 \to \mathbb{R}^2$ is a non-identity rotation, then no 1-dimensional subspace (line through $(0, 0)$) is mapped to itself, unless $\theta = \pi$, in which case each vector is mapped to its negative, and so every non-zero vector is an eigenvector for $\rho_\pi$ with eigenvalue $-1$. (By convention, the zero vector is never considered to be an eigenvector.)
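The reflection example can be checked numerically against the reflection matrix given earlier in the chapter. The following is a rough sketch in Python with floating-point arithmetic, so equalities hold only up to rounding; the angle `theta = 1.0` is an arbitrary choice and the function names are illustrative.

```python
import math

def reflection_matrix(theta):
    """Matrix of the reflection across y = m x, where m = tan(theta / 2)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [s, -c]]

def apply(A, v):
    """Matrix-vector product A . v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

theta = 1.0                          # arbitrary angle, chosen for illustration
m = math.tan(theta / 2)
R = reflection_matrix(theta)

fixed = apply(R, [1.0, m])           # should be close to (1, m): eigenvalue 1
flipped = apply(R, [m, -1.0])        # should be close to (-m, 1): eigenvalue -1
print(fixed, flipped)
```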

There is a lovely strategy for finding eigenvectors. Let $T : V \to V$ be a linear operator. Fix a basis $B$ and a coordinatization for $V$ relative to $B$. Then $T(x) = A \cdot x$ for some matrix $A$. We wish to solve the matrix equation:

$$A \cdot x = \lambda \cdot x.$$

Bringing everything to the left side of the equation, this is equivalent to solving

$$(A - \lambda I) \cdot x = 0.$$


Definition 2.6. Let $A$ be an $n \times n$ matrix and let $\lambda \in \mathbb{R}$. The $\lambda$-eigenspace for $A$ is

$$\{v \in \mathbb{R}^n : A \cdot v = \lambda \cdot v\},$$

i.e. it is the set of all eigenvectors for $A$ with eigenvalue $\lambda$ plus the vector $0$.

Our brief remarks above show that

Lemma 2.7. The $\lambda$-eigenspace for $A$ is the null space for $A - \lambda I$, i.e.

$$\{v \in \mathbb{R}^n : (A - \lambda \cdot I) \cdot v = 0\}.$$

Now, for any given number $\lambda$, the problem of finding the null space for $A - \lambda I$ is the problem of solving a certain homogeneous system of $n$ linear equations in $n$ unknowns. But, for almost all $\lambda$, this system will have $(0, 0, \dots, 0)$ as its only solution. The Eigenvalue Problem is:

Determine those (few) values of $\lambda$ for which the system has a non-zero solution.

This will be true if and only if the matrix $A - \lambda I$ is singular, i.e. not invertible. And this property can be determined by the determinant $\det(A - \lambda I)$. There is a very interesting general theory of the determinant. We will restrict our attention to the case of $n \times n$ matrices with $n = 2$ or $n = 3$. Even then, we will just sketch the ideas, from a geometric viewpoint.

Definition 2.8. Let $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Then the determinant of $A$ is

$$\det(A) = ad - bc.$$

It is easy to see that the homogeneous system $A \cdot x = 0$ has a non-zero solution if and only if the rows $(a, b)$ and $(c, d)$ are proportional if and only if $ad - bc = 0$.

Now we proceed to the 3-dimensional case. We shall restrict our discussion of linear algebra henceforth mostly to the 3-dimensional case, although almost every statement extends to the general $n$-dimensional case.

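Definition 2.9. Let

$$A = \begin{pmatrix} a & b & c \\ d & e & f \\ g & h & m \end{pmatrix}.$$

Then

$$\det(A) = a \cdot \det\begin{pmatrix} e & f \\ h & m \end{pmatrix} - b \cdot \det\begin{pmatrix} d & f \\ g & m \end{pmatrix} + c \cdot \det\begin{pmatrix} d & e \\ g & h \end{pmatrix}.$$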
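Definitions 2.8 and 2.9 translate directly into code. A minimal sketch in Python, with matrices stored as lists of rows; the function names `det2` and `det3` are illustrative:

```python
def det2(M):
    """Determinant of a 2 x 2 matrix, as in Definition 2.8."""
    (a, b), (c, d) = M
    return a * d - b * c

def det3(M):
    """Determinant of a 3 x 3 matrix by expansion along the first row (Definition 2.9)."""
    (a, b, c), (d, e, f), (g, h, m) = M
    return (a * det2([[e, f], [h, m]])
            - b * det2([[d, f], [g, m]])
            + c * det2([[d, e], [g, h]]))

singular = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # third row = 2 * (second row) - (first row)
regular = [[1, 2, 3], [0, 1, 4], [5, 6, 0]]
print(det3(singular), det3(regular))
```

The first matrix has linearly dependent rows, so its determinant vanishes; the second does not.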

We recall the definition of the cross product of two vectors in $\mathbb{R}^3$. We use the standard notation $i = (1, 0, 0)$, $j = (0, 1, 0)$, $k = (0, 0, 1)$.

Definition 2.10. Let $u = ai + bj + ck$ and $v = di + ej + fk$ be two vectors in $\mathbb{R}^3$. Then the cross product of $u$ and $v$ is

$$u \times v = \det\begin{pmatrix} i & j & k \\ a & b & c \\ d & e & f \end{pmatrix}.$$

It is straightforward to check that $u \times v$ is perpendicular to the vectors $u$ and $v$. Moreover, if $u$ and $v$ are non-zero vectors, then $\|u \times v\|$ is equal to the area of the parallelogram determined by the vectors $u$ and $v$. The next lemma is clear from the definition of dot product.

Lemma 2.11. Let $u = (a, b, c)$, $v = (d, e, f)$, and $w = (x, y, z)$ be three vectors in $\mathbb{R}^3$. Then

$$w \cdot (u \times v) = \det\begin{pmatrix} x & y & z \\ a & b & c \\ d & e & f \end{pmatrix}.$$
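Both the perpendicularity of $u \times v$ and the identity of Lemma 2.11 are easy to test on sample vectors. A short sketch in Python, with integer vectors chosen arbitrarily for illustration and illustrative function names:

```python
def cross(u, v):
    """Cross product, expanding the formal determinant of Definition 2.10 along its first row."""
    a, b, c = u
    d, e, f = v
    return (b * f - c * e, c * d - a * f, a * e - b * d)

def dot(u, v):
    """Dot product of two vectors."""
    return sum(x * y for x, y in zip(u, v))

def det3(M):
    """3 x 3 determinant by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, m) = M
    return a * (e * m - f * h) - b * (d * m - f * g) + c * (d * h - e * g)

u, v, w = (1, 2, 3), (4, 5, 6), (7, 8, 10)
n = cross(u, v)
triple = dot(w, n)               # w . (u x v)
print(n, triple, det3([w, u, v]))
```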

It follows that

Lemma 2.12. $\det(A) = 0$ if and only if the rows of $A$ form a linearly dependent set of vectors if and only if the null space of $A$ contains a non-zero vector.

We will need a few additional facts about determinants, the first two of which are somewhat difficult to prove. We refer students to other books for their proof.

Determinant Theorems 2.13. Let $A$ and $B$ be $n \times n$ matrices. The following properties hold:

(1) $\det(AB) = \det(A)\det(B)$;
(2) $\det(A^T) = \det(A)$;
(3) If $\det(A) \neq 0$, then $A^{-1}$ exists and $\det(A^{-1}) = \frac{1}{\det(A)}$.

Returning to the eigenvalue problem, we obtain the following important result.

Theorem 2.14. Let $A$ be a $3 \times 3$ matrix. Let $\lambda$ be a real number. The following are equivalent:

(1) The $\lambda$-eigenspace of $A$ is non-zero;
(2) The null space of $A - \lambda \cdot I$ is non-zero;
(3) $\lambda$ is a root of the characteristic polynomial of $A$, $c_A(x) := \det(x \cdot I - A)$.

Note that, by the Intermediate Value Theorem, every real cubic polynomial crosses the $x$-axis at least one time. Indeed, $c_A(x)$ has either one real root and a pair of complex conjugate roots, or $c_A(x)$ has three real roots. Thus $A$ has at least one real eigenvalue. Of course, a real root may occur with multiplicity 1, 2, or 3.

Computing $\det(x \cdot I - A)$ is a bit tedious, but we can make two easy and important observations.

Lemma 2.15. Let $A$ be a $3 \times 3$ matrix and let $c_A(x)$ be the characteristic polynomial of $A$. Write

$$c_A(x) = x^3 - c_2 x^2 + c_1 x - c_0.$$

Then $c_2 = \mathrm{Tr}(A)$ is the trace of the matrix $A$, and $c_0 = \det(A)$ is the determinant of the matrix $A$.

Proof. Considering the matrix

$$x \cdot I - A = \begin{pmatrix} x - a_{11} & -a_{12} & -a_{13} \\ -a_{21} & x - a_{22} & -a_{23} \\ -a_{31} & -a_{32} & x - a_{33} \end{pmatrix},$$

we see that the cubic and quadratic terms of $c_A(x)$ all come from the product of the diagonal entries:

$$(x - a_{11})(x - a_{22})(x - a_{33}) = x^3 - (a_{11} + a_{22} + a_{33})x^2 + \dots = x^3 - \mathrm{Tr}(A)x^2 + \dots.$$

Thus $c_2 = \mathrm{Tr}(A)$. Now

$$-c_0 = c_A(0) = \det(0 - A) = \det(-A) = (-1)^3\det(A) = -\det(A).$$

Hence $c_0 = \det(A)$, completing the proof.

□
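Lemma 2.15 can be spot-checked by evaluating $\det(x \cdot I - A)$ directly at several values of $x$ and comparing with the polynomial $x^3 - c_2x^2 + c_1x - c_0$. In the Python sketch below, the formula used for the middle coefficient $c_1$ (the sum of the three principal $2 \times 2$ minors) is a standard fact that is not stated in these notes and is included here as an assumption; the function names are illustrative.

```python
def det3(M):
    """3 x 3 determinant by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, m) = M
    return a * (e * m - f * h) - b * (d * m - f * g) + c * (d * h - e * g)

def char_poly_at(A, x):
    """Evaluate c_A(x) = det(x I - A) directly from the definition."""
    return det3([[(x if i == j else 0) - A[i][j] for j in range(3)] for i in range(3)])

def coefficients(A):
    """Return (c2, c1, c0) with c_A(x) = x^3 - c2 x^2 + c1 x - c0."""
    c2 = A[0][0] + A[1][1] + A[2][2]                      # trace (Lemma 2.15)
    c1 = ((A[0][0] * A[1][1] - A[0][1] * A[1][0])         # assumption: sum of the
          + (A[0][0] * A[2][2] - A[0][2] * A[2][0])       # principal 2 x 2 minors
          + (A[1][1] * A[2][2] - A[1][2] * A[2][1]))
    c0 = det3(A)                                          # determinant (Lemma 2.15)
    return c2, c1, c0

A = [[2, 1, 0],
     [1, 3, 1],
     [0, 1, 4]]
c2, c1, c0 = coefficients(A)
agree = all(char_poly_at(A, x) == x**3 - c2 * x**2 + c1 * x - c0 for x in range(-3, 4))
print(c2, c1, c0, agree)
```

Two cubic polynomials that agree at seven points are identical, so agreement on this range confirms the coefficients for this particular matrix.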

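Clearly, if $v$ is an eigenvector for $A$ with eigenvalue $\lambda$, then every non-zero vector $c \cdot v$ collinear with $v$ is an eigenvector with the same eigenvalue $\lambda$. Thus if $v$ and $w$ are eigenvectors for $A$ with distinct eigenvalues, they cannot be collinear. The following stronger statement is true.

Lemma 2.16. Let $T : \mathbb{R}^3 \to \mathbb{R}^3$ be a linear operator with matrix $A$, with respect to the standard basis for $\mathbb{R}^3$. Suppose that $u$, $v$, and $w$ are eigenvectors for $A$ with distinct eigenvalues $\lambda$, $\mu$, and $\nu$. Then $\{u, v, w\}$ is a basis for $\mathbb{R}^3$.

Proof. By Theorem 1.12(3) and Lemma 1.13, any three linearly independent vectors in $\mathbb{R}^3$ form a basis, so it suffices to show that $\{u, v, w\}$ is linearly independent. Suppose not. Then there are scalars $c_1, c_2, c_3$, not all $0$, such that

(1) $c_1 \cdot u + c_2 \cdot v + c_3 \cdot w = 0.$

Since no two of the vectors $u, v, w$ are collinear, $c_i \neq 0$ for all $i$. (If one coefficient were $0$, equation (1) would make the other two vectors collinear.) Apply $T$ to both sides of this equation, getting:

(2) $c_1\lambda \cdot u + c_2\mu \cdot v + c_3\nu \cdot w = 0.$

Multiply equation (1) by $\lambda$, getting:

(3) $c_1\lambda \cdot u + c_2\lambda \cdot v + c_3\lambda \cdot w = 0.$

Now subtract equation (3) from equation (2), getting:

(4) $c_2(\mu - \lambda) \cdot v + c_3(\nu - \lambda) \cdot w = 0.$

But now, since $c_3 \neq 0$ and $\nu \neq \lambda$, we may solve for $w$ and get:

$$w = \frac{c_2(\lambda - \mu)}{c_3(\nu - \lambda)} \cdot v,$$

whence $v$ and $w$ are collinear, a contradiction proving the lemma.

□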

Suppose that $u$, $v$ and $w$ are eigenvectors for $A$ with eigenvalues $\lambda$, $\mu$, $\nu$, not necessarily distinct. Suppose that $\{u, v, w\}$ forms a basis for $\mathbb{R}^3$. Form the matrix $C$ whose columns are the coordinate vectors for $u$, $v$, and $w$ with respect to the standard basis for $\mathbb{R}^3$. Then the matrix $AC$ has columns $Au = \lambda \cdot u$, $Av = \mu \cdot v$, and $Aw = \nu \cdot w$. Hence

$$AC = CD, \quad \text{where } D := \begin{pmatrix} \lambda & 0 & 0 \\ 0 & \mu & 0 \\ 0 & 0 & \nu \end{pmatrix}.$$

Since $\{u, v, w\}$ is a basis for $\mathbb{R}^3$, $C$ is an invertible matrix and

$$C^{-1}AC = D = \begin{pmatrix} \lambda & 0 & 0 \\ 0 & \mu & 0 \\ 0 & 0 & \nu \end{pmatrix}.$$

We say that $A$ is similar to the diagonal matrix $D$. We also say that $A$ is diagonalizable.
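The relation $AC = CD$ can be verified on a small example without computing $C^{-1}$. The matrix $A$ below and its eigenvectors are chosen purely for illustration (they do not come from these notes), and the helper names are likewise illustrative; the sketch is in Python with integer arithmetic.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def det3(M):
    """3 x 3 determinant by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, m) = M
    return a * (e * m - f * h) - b * (d * m - f * g) + c * (d * h - e * g)

A = [[1, 2, 0],
     [2, 1, 0],
     [0, 0, 5]]

# Eigenvectors u = (1, 1, 0), v = (1, -1, 0), w = (0, 0, 1) with eigenvalues 3, -1, 5,
# placed as the columns of C.
C = [[1, 1, 0],
     [1, -1, 0],
     [0, 0, 1]]
D = [[3, 0, 0],
     [0, -1, 0],
     [0, 0, 5]]

AC = matmul(A, C)
CD = matmul(C, D)
print(AC == CD, det3(C))
```

Since $\det(C) \neq 0$, the matrix $C$ is invertible, and $AC = CD$ is then equivalent to $C^{-1}AC = D$.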

Definition 2.17. Two $n \times n$ matrices $A$ and $D$ are said to be similar if there exists an invertible $n \times n$ matrix $C$ such that $C^{-1}AC = D$.

It is left as an exercise to show that the relation of similarity is an equivalence relation on the set of $n \times n$ matrices. The equivalence classes are called similarity classes.

Now is a good time to recall some definitions from Math 4580. Indeed, this would be a good time for you to start reviewing the material in Chapters 9 through 12 of the Math 4580 notes.

Definition 2.18. A nonempty set $G$ of invertible functions is a group of functions if $G$ is closed under composition of functions and under taking inverses. [See Definition 10.3 on page 84 of the Math 4580 notes.]

Definition 2.19. Two functions $f$ and $f_1$ in a group $G$ are called conjugate if there is a function $g \in G$ such that $f_1 = g \circ f \circ g^{-1}$. Conjugacy is an equivalence relation on the set $G$, and the equivalence classes are called conjugacy classes. [See page 78 of the Math 4580 notes.]

Notice that the intersection of a similarity class of $n \times n$ matrices with the group $GL(n,\mathbb{R})$ of all invertible $n \times n$ matrices is a conjugacy class in this group. [See Exercise 8.]

The following definition will be needed in Exercise 8.

Definition 2.20. Let $G$ be a group (of functions) and $H$ a subgroup of $G$. We say that $H$ is a normal subgroup of $G$ if

\[ H = gHg^{-1} = \{ghg^{-1} : h \in H\} \]

for all $g \in G$. (Equivalently, $gH = Hg$ for all $g \in G$, where $gH$ and $Hg$ are cosets of $H$.) [See Exercise 7 in Chapter 11 of the Math 4580 notes.]

There are two interpretations for a pair of similar matrices, $A$ and $D$, in terms of linear operators. Holding one basis, $B$, fixed, $A$ and $D$ are the matrices for two different linear operators with respect to this fixed choice of basis. On the other hand, we may regard the invertible matrix $C$ as a change of basis matrix, and then we may interpret $A$ and $D$ as two different matrix representations for the same linear operator $T : \mathbb{R}^n \to \mathbb{R}^n$ with respect to two different bases. Thus, in the diagonalizable case, it is convenient to think as follows:

$T : \mathbb{R}^3 \to \mathbb{R}^3$ is a linear operator whose matrix is $A$ with respect to the standard basis $\{(1, 0, 0), (0, 1, 0), (0, 0, 1)\}$ for $\mathbb{R}^3$. On the other hand, there is a better choice of basis for the purpose of understanding $T$ geometrically. Namely, there are three eigenlines $\mathbb{R} \cdot u$, $\mathbb{R} \cdot v$, $\mathbb{R} \cdot w$, such that $T$ “stretches” (or maybe, shrinks) these lines by scaling factors $\lambda$, $\mu$, and $\nu$, respectively. With respect to the basis $\{u, v, w\}$, $T$ is represented by the diagonal matrix $D$. So, $A$ and $D$ represent the same linear operator $T$ with respect to different choices of basis.

Exercises

1. Let $f : V \to V$ be a linear operator on the finite-dimensional real vector space $V$. Let $B$ be a basis for $V$. Let $A$ be the matrix which represents $f$ with respect to the basis $B$.

(a) Prove: f is invertible if and only if f(B) is a basis for V .

(b) Prove: f is invertible if and only if A is an invertible matrix.

2. Prove: The differentiation operator $D : \mathcal{P} \to \mathcal{P}$ is a linear operator.

3. Prove: The integration operator $J : \mathcal{C} \to \mathcal{C}$ is a linear operator.

4. Prove: $D \circ J : \mathcal{P} \to \mathcal{P}$ is the identity operator on $\mathcal{P}$, but $J \circ D : \mathcal{P} \to \mathcal{P}$ is not the identity operator.

5. Let $T : V \to V$ be a linear operator.

(a) Prove: The range $T(V)$ is a subspace of $V$.

(b) Let $V_\lambda(T) := \{v \in V : T(v) = \lambda \cdot v\}$. Prove: $V_\lambda(T)$ is a $T$-invariant subspace of $V$.

(c) Prove: $\mathrm{Ker}(T) = V_0(T)$.

(d) Prove: $V_\lambda(T) = \mathrm{Ker}(T - \lambda \cdot I)$.

6. Let $W$ be a subspace of the real vector space $V$. We say that a linear operator $P : V \to V$ is a projection operator onto $W$ if $P(V) = W$ and $P(w) = w$ for all $w \in W$.

Prove: A linear operator $P : V \to V$ is a projection operator if and only if $P \circ P := P^2 = P$. [Hint: If $P^2 = P$, write every vector $v \in V$ as $v = P(v) + (v - P(v))$. Conclude that $V = V_0(P) + V_1(P)$ and $P$ is a projection operator onto $V_1(P)$.]

7. Let $u$ and $v$ be vectors in $\mathbb{R}^3$.

(a) Verify that $u \times v$ is perpendicular both to $u$ and to $v$.

(b) Verify that $u \times v = 0$ if and only if $\{u, v\}$ is a linearly dependent set, i.e. $u$ and $v$ are collinear.

(c) Using the scalar triple product interpretation of the $3 \times 3$ determinant, verify that $\det(A) = 0$ if and only if the rows of $A$ form a linearly dependent set of vectors.

8. Use the Determinant Theorem 2.13 for this exercise. Let $GL(\mathbb{R}^n)$ be the set of all invertible linear operators on $\mathbb{R}^n$. To each linear operator $T \in GL(\mathbb{R}^n)$, associate the $n \times n$ matrix for $T$ with respect to the standard basis for $\mathbb{R}^n$. Let $GL(n,\mathbb{R})$ denote the set of associated matrices.

(a) Prove: $GL(n,\mathbb{R})$ is the set of all $n \times n$ matrices $A$ such that $\det(A) \neq 0$.

(b) Prove: If $A, B \in GL(n,\mathbb{R})$, then $AB \in GL(n,\mathbb{R})$ and $A^{-1} \in GL(n,\mathbb{R})$. Conclude that $GL(n,\mathbb{R})$ is a group under the operation of matrix multiplication.

(c) Let $SL(n,\mathbb{R})$ be the set of all $n \times n$ matrices $A$ with $\det(A) = 1$. Prove: $SL(n,\mathbb{R})$ is a normal subgroup of $GL(n,\mathbb{R})$. [You must prove that it is a subgroup, and also that it is normal.]

(d) Let $\Phi : GL(\mathbb{R}^n) \to GL(n,\mathbb{R})$ be the function which associates to each invertible linear operator $T$ the matrix representing $T$ with respect to the standard basis for $\mathbb{R}^n$. Prove: $\Phi$ is an isomorphism of groups.

9(a) Prove: The relation of similarity is an equivalence relation on the set of all $n \times n$ matrices.

(b) Prove: A subgroup $H$ of a group $G$ is normal if and only if $H$ is a union of conjugacy classes of $G$.

10. Let $\Pi$ be a plane passing through $(0, 0, 0)$ in $\mathbb{R}^3$ with equation:

\[ ax + by + cz = 0. \]

Let $R_\Pi : \mathbb{R}^3 \to \mathbb{R}^3$ be the reflection map across the plane $\Pi$. Let $P_n : \mathbb{R}^3 \to \mathbb{R}^3$ be the orthogonal projection map onto the line through the normal vector $n = (a, b, c)$ to the plane $\Pi$. Let $P_\Pi : \mathbb{R}^3 \to \mathbb{R}^3$ be the orthogonal projection map onto the plane $\Pi$.

(a) Prove: For all $(x, y, z) \in \mathbb{R}^3$,

\[ P_n(x, y, z) = \frac{ax + by + cz}{a^2 + b^2 + c^2} \cdot n. \]

(b) Deduce that $P_n$ is a linear operator on $\mathbb{R}^3$.

(c) If $n = (1, 0, -1)$, write the matrix for the operator $P_n$ with respect to the standard basis for $\mathbb{R}^3$.

(d) Argue geometrically what the eigenvalues and eigenspaces for $P_n$ are.

(e) Verify your claims in (d) for the example $n = (1, 0, -1)$ by explicit matrix calculation using the matrix found in part (c).

(f) Prove: $P_\Pi = I - P_n$.

(g) If $n = (1, 0, -1)$, write the matrix for the operator $P_\Pi$.

(h) Argue both geometrically and by matrix calculation what the eigenvalues and eigenspaces for $P_\Pi$ are.

(h) Prove: $P_n^2 = P_n$ and $P_\Pi^2 = P_\Pi$. Hence both are projection operators.

(i) Prove by a geometrical argument: $R_\Pi = 2P_\Pi - I$.

(j) Deduce what the eigenvalues and eigenspaces for $R_\Pi$ are.

(k) For $n = (1, 0, -1)$, find the matrix for $R_\Pi$ and check your claims concerning the eigenvalues and eigenspaces.


Chapter 3: Inner Products and Orthogonal Matrices

The space $\mathbb{R}^n$ is not simply a vector space. It is a metric space with a geometric structure given by the dot product: If $u = (u_1, \dots, u_n)$ and $v = (v_1, \dots, v_n)$, then

\[ u \cdot v = u_1v_1 + \cdots + u_nv_n. \]

In particular, $u \cdot u$ is the square of the Euclidean distance from $(u_1, \dots, u_n)$ to $(0, \dots, 0)$. Moreover, if $u$ and $v$ are vectors of unit length, then $u \cdot v$ is the cosine of the angle between them. Indeed, for any vectors $u$ and $v$,

$u \cdot v = 0$ if and only if $u$ and $v$ are orthogonal (perpendicular to each other).

It will be convenient to write vectors as column vectors: $u = \begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix}$, $v = \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix}$. Then, the (matrix) transpose of $u$, $u^T$, is

\[ u^T := (u_1, \dots, u_n), \]

and the dot product $u \cdot v$ is the matrix product $u^Tv$.

Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear operator. Let $A$ be the matrix representing $T$ with respect to the standard basis. Then the dot product $T(u) \cdot T(v)$ may be computed as the matrix product:

\[ (Au)^T(Av) = u^TA^TAv. \]

In particular, if $T : \mathbb{R}^n \to \mathbb{R}^n$ is a linear isometry, then $T$ preserves distances and angles, whence, for all $u$, $v$:

\[ u \cdot v = T(u) \cdot T(v), \quad \text{i.e.,} \quad u^Tv = (Au)^T(Av) = u^T(A^TA)v. \]

We leave as an exercise to show that $u^Tv = u^T(A^TA)v$ for all $u$, $v$ if and only if

\[ A^TA = I. \]

Thus we have the following fact.

Lemma 3.1. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear operator. Let $A$ be the matrix representing $T$ with respect to the standard basis for $\mathbb{R}^n$. Then $T$ is an isometry if and only if $A^TA = I$.

We call a square matrix $A$ orthogonal if and only if $A^TA = I$, i.e. the columns of $A$ form an orthonormal basis for $\mathbb{R}^n$ (and so do the rows, since $AA^T = I$ as well). Likewise, we call a linear operator an orthogonal operator if the associated matrix with respect to the standard basis for $\mathbb{R}^n$ is an orthogonal matrix.
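As an illustration of the condition $A^TA = I$, the following sketch checks it for one sample rotation matrix with rational entries and confirms that the dot product of two sample vectors is preserved. The matrix, the vectors, and the helper names (`transpose`, `matmul`, `apply`, `dot`) are assumptions chosen for this example only; exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction as F

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def apply(M, v):
    """Matrix times column vector."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Sample matrix (an assumption for illustration): rotation with cos = 3/5, sin = 4/5.
A = [[F(3, 5), F(-4, 5)],
     [F(4, 5), F(3, 5)]]
identity = [[1, 0], [0, 1]]

is_orthogonal = matmul(transpose(A), A) == identity

u, v = [1, 2], [3, -1]
dot_before = dot(u, v)
dot_after = dot(apply(A, u), apply(A, v))
print(is_orthogonal, dot_before == dot_after)
```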

From the properties of determinants, we see that

Lemma 3.2. If $A$ is an orthogonal matrix, then $\det(A) = \pm 1$.

Definition 3.3. $O(n)$ is the set of all orthogonal $n \times n$ matrices. $SO(n)$ is the set of all orthogonal $n \times n$ matrices of determinant 1.

We leave as an exercise to show

Theorem 3.4. $O(n)$ is a subgroup of $GL(n,\mathbb{R})$. $SO(n)$ is a normal subgroup of index 2 in $O(n)$.

We can now complete our discussion of isometries of $\mathbb{R}^n$ by proving the following analogue of Theorem 11.9 in the Math 4580 notes.

Theorem 3.5. Let $f : \mathbb{R}^n \to \mathbb{R}^n$ be an isometry. Let $f(0, 0, \dots, 0) = a = (a_1, a_2, \dots, a_n)$. Then $f = T_a \circ R$, where $T_a$ is the translation operator by the vector $a$, and $R$ is an orthogonal operator.

To prove Theorem 3.5, we need the following lemma, which is analogous to Lemma 11.1 in the Math 4580 notes.

Lemma 3.6. (a) Let $a = (a_1, a_2, \dots, a_n)$ and $b = (b_1, b_2, \dots, b_n)$ be two points in $\mathbb{R}^n$ which are equidistant from each of the points

\[ e_0 = (0, 0, \dots, 0),\ e_1 = (1, 0, \dots, 0),\ e_2 = (0, 1, 0, \dots, 0),\ \dots,\ e_n = (0, 0, \dots, 0, 1). \]

Then $a = b$.

(b) Let $f : \mathbb{R}^n \to \mathbb{R}^n$ be an isometry fixing each of the points

\[ e_0 = (0, 0, \dots, 0),\ e_1 = (1, 0, \dots, 0),\ e_2 = (0, 1, 0, \dots, 0),\ \dots,\ e_n = (0, 0, \dots, 0, 1). \]

Then $f = I$, the identity map.

Proof. This is Homework Exercise 11.

Proof of Theorem 3.5. Let $g = T_{-a} \circ f$. Then $g(e_0) = e_0$. Let

\[ g(e_1) = a_1,\ g(e_2) = a_2,\ \dots,\ g(e_n) = a_n. \]

Since $g$ is an isometry fixing $(0, 0, \dots, 0)$, and since $B := \{e_1, e_2, \dots, e_n\}$ is the standard orthonormal basis for $\mathbb{R}^n$, we have that $\{g(e_1), g(e_2), \dots, g(e_n)\}$ is likewise an orthonormal basis for $\mathbb{R}^n$. Let $R : \mathbb{R}^n \to \mathbb{R}^n$ be the linear operator whose matrix $A$ with respect to the basis $B$ has columns $g(e_1), g(e_2), \dots, g(e_n)$. Then $A$ is an orthogonal matrix, and so $R$ is an orthogonal operator. Moreover $g(e_i) = R(e_i)$ for all $i$, $0 \le i \le n$. Hence, $R^{-1} \circ g$ is an isometry of $\mathbb{R}^n$ fixing $e_i$ for all $i$. Thus, by Lemma 3.6(b), $R^{-1} \circ g = I$, and so $g = R$ is an orthogonal operator. Since $g = T_a^{-1} \circ f$, we conclude that $f = T_a \circ R$, as claimed.

We have the following immediate corollary to Theorem 3.5.

Corollary 3.7. Every isometry of $\mathbb{R}^n$ is an invertible function.

There is another important subgroup of O(n).


Definition 3.8. We call a matrix $P$ a permutation matrix if each row and each column has exactly one entry equal to 1 and every other entry equal to 0. Equivalently, every entry of $P$ is either 0 or 1, and the columns of $P$ form an orthonormal basis for $\mathbb{R}^n$ (as do the rows). We let $P_n$ denote the set of all $n \times n$ permutation matrices.

Theorem 3.9. Let $\{e_1 = (1, 0, \dots, 0), e_2 = (0, 1, 0, \dots, 0), \dots, e_n = (0, 0, \dots, 0, 1)\}$ be the standard orthonormal basis for $\mathbb{R}^n$. For each permutation $\sigma \in S_n$, let $T_\sigma$ be the unique linear operator on $\mathbb{R}^n$ such that

\[ T_\sigma(e_i) = e_{\sigma(i)} \]

for all $i$, $1 \le i \le n$. Let $P_\sigma$ be the matrix representing the linear operator $T_\sigma$ with respect to the standard basis for $\mathbb{R}^n$. Then $P_\sigma$ is a permutation matrix, and the function $\Theta : S_n \to P_n$ by

\[ \Theta(\sigma) = P_\sigma \]

is an isomorphism of groups.

Proof. Clearly $\Theta$ is a bijection. Also

\[ \Theta(\sigma \circ \tau)(e_i) = e_{(\sigma \circ \tau)(i)} = e_{\sigma(\tau(i))}, \]

while

\[ (\Theta(\sigma) \circ \Theta(\tau))(e_i) = \Theta(\sigma)(e_{\tau(i)}) = e_{\sigma(\tau(i))} = \Theta(\sigma \circ \tau)(e_i). \]

Hence $\Theta$ is an isomorphism of groups.
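The homomorphism property used in the proof of Theorem 3.9 can be checked by brute force for small $n$. The sketch below assumes 0-based indices (so $S_3$ permutes $\{0, 1, 2\}$) and uses hypothetical helper names `perm_matrix`, `compose`, and `matmul`. It builds the matrix whose $j$-th column is $e_{\sigma(j)}$ and compares the matrix of a composite with the product of the two matrices.

```python
from itertools import permutations

def perm_matrix(sigma):
    """Matrix whose j-th column is e_{sigma(j)}; sigma is a tuple on range(n)."""
    n = len(sigma)
    return [[1 if sigma[j] == i else 0 for j in range(n)] for i in range(n)]

def compose(sigma, tau):
    """sigma o tau: apply tau first, then sigma."""
    return tuple(sigma[tau[i]] for i in range(len(tau)))

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

S3 = list(permutations(range(3)))

# Matrix of a composite equals the product of the matrices, for every pair.
is_homomorphism = all(
    perm_matrix(compose(s, t)) == matmul(perm_matrix(s), perm_matrix(t))
    for s in S3 for t in S3)

# Distinct permutations give distinct matrices.
is_injective = len({str(perm_matrix(s)) for s in S3}) == len(S3)

print(is_homomorphism, is_injective)
```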

There is a larger interesting subgroup of $O(n)$, the Weyl group $W(n)$.

Definition 3.10. We call a matrix $M$ a signed permutation matrix if each row and each column has exactly one entry equal either to 1 or to $-1$, and all other entries equal to 0. Equivalently, every entry of $M$ is either 0, 1, or $-1$, and the columns of $M$ form an orthonormal basis for $\mathbb{R}^n$ (as do the rows). We let $W(n)$ denote the set of all $n \times n$ signed permutation matrices.

Theorem 3.11. $W(n)$ is a subgroup of $O(n)$ of cardinality $2^n \cdot n!$.

Proof. Clearly, for each permutation matrix $P$, there are $2^n$ signed permutation matrices whose non-zero entries are in the same location as the 1 entries of $P$. Hence the set $W(n)$ has cardinality $2^n \cdot n!$, and clearly, by definition, $W(n) \subseteq O(n)$.

Consider the set $\mathcal{Z}$ of all matrices in $O(n)$ with integer entries. Clearly, the product of any two such matrices is another such matrix. Also, if $A \in O(n)$, then $A^{-1} = A^T$. Hence if $A$ has integer entries, then so does $A^{-1}$. As $I$ has integer entries, $\mathcal{Z}$ is a subgroup of $O(n)$. But now let

\[ \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix} \]

be a column of some matrix $A$ in $\mathcal{Z}$. Then this column vector is a unit vector. So

\[ a_1^2 + a_2^2 + \cdots + a_n^2 = 1. \]

Since each $a_i \in \mathbb{Z}$, exactly one $a_i = \pm 1$ and the other $a_j$’s are all 0. Moreover the columns of $A$ form an orthonormal set. So $\mathcal{Z} = W(n)$, proving that $W(n)$ is a subgroup of $O(n)$.
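The argument above says that the orthogonal matrices with integer entries are exactly the signed permutation matrices, so there should be $2^n \cdot n!$ of them. Here is a brute-force sanity check for small $n$, offered only as an illustration. Since a unit column with integer entries can only contain 0 and $\pm 1$, it is enough to search matrices with entries in $\{-1, 0, 1\}$. The function names are hypothetical.

```python
from itertools import product
from math import factorial

def has_orthonormal_columns(rows):
    """True when the columns of the square matrix are orthonormal."""
    cols = list(zip(*rows))
    n = len(cols)
    return all(
        sum(a * b for a, b in zip(cols[i], cols[j])) == (1 if i == j else 0)
        for i in range(n) for j in range(n))

def count_integer_orthogonal(n):
    """Count n x n matrices with entries in {-1, 0, 1} and orthonormal columns."""
    count = 0
    for entries in product((-1, 0, 1), repeat=n * n):
        rows = [entries[i * n:(i + 1) * n] for i in range(n)]
        if has_orthonormal_columns(rows):
            count += 1
    return count

for n in (1, 2, 3):
    # Second and third printed values should agree: the count and 2^n * n!.
    print(n, count_integer_orthogonal(n), 2 ** n * factorial(n))
```

The search space has $3^{n^2}$ matrices, so this check is only practical for very small $n$.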

We will use permutation matrices and signed permutation matrices in the next chapter when we study the symmetry group of the regular tetrahedron. But now we return to general properties of orthogonal matrices. There are very limited possibilities for the real eigenvalues of orthogonal matrices.

Lemma 3.12. Let $A \in O(n)$ and let $\lambda$ be a (real) eigenvalue of $A$. Then $\lambda = \pm 1$.

Proof. Let $u$ be an eigenvector for $A$ with eigenvalue $\lambda$. Then

\[ u^Tu = (Au)^T(Au) = (\lambda \cdot u^T)(\lambda \cdot u) = \lambda^2 \cdot (u^Tu). \]

Since $u \neq 0$, $u^Tu \neq 0$ and so $\lambda^2 = 1$. Hence, $\lambda = \pm 1$, as claimed.

Once again, we restrict attention to the 3-dimensional case. Let $T : \mathbb{R}^3 \to \mathbb{R}^3$ be a linear isometry. Since the characteristic polynomial $c_T(x)$ is a cubic polynomial, $c_T(x)$ has at least one real root $\epsilon$, and by Lemma 3.12, $\epsilon = \pm 1$. Let $u$ be a unit eigenvector for $T$ with eigenvalue $\epsilon$. Since $T$ is an isometry,

\[ u^\perp := \{v \in \mathbb{R}^3 : u \cdot v = 0\} \]

is a $T$-invariant plane in $\mathbb{R}^3$ (with normal vector $u$). Since $T : u^\perp \to u^\perp$ is an isometry, $T$ acts as either a rotation or a reflection on the plane $u^\perp$. Hence there is an orthonormal basis $\{v, w\}$ for the plane $u^\perp$ such that the matrix for the $T$-action on $u^\perp$ with respect to this basis is:

\[ \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix} \quad \text{or} \quad \begin{pmatrix} \cos(\theta) & \sin(\theta) \\ \sin(\theta) & -\cos(\theta) \end{pmatrix} \]

for some angle $\theta$, $0 \le \theta < 2\pi$.

Thus the matrix for $T$ with respect to the orthonormal basis $\{u, v, w\}$ for $\mathbb{R}^3$ is:

\[ \begin{pmatrix} \epsilon & 0 & 0 \\ 0 & \cos(\theta) & -\sin(\theta) \\ 0 & \sin(\theta) & \cos(\theta) \end{pmatrix} \quad \text{or} \quad \begin{pmatrix} \epsilon & 0 & 0 \\ 0 & \cos(\theta) & \sin(\theta) \\ 0 & \sin(\theta) & -\cos(\theta) \end{pmatrix}. \]

In the case where $T$ acts as a reflection on the plane $u^\perp$, we can make a special choice of orthonormal basis for $u^\perp$: Take $v'$ to be a unit vector along the reflecting mirror for $T$, and take $w'$ to be a unit vector perpendicular to $v'$. Then

\[ T(v') = v' \quad \text{and} \quad T(w') = -w'. \]

Hence, with respect to the orthonormal basis $\{u, v', w'\}$ for $\mathbb{R}^3$, the matrix for $T$ is:

\[ \begin{pmatrix} \epsilon & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}. \]

If $T \in SO(3)$ and the restriction of $T$ to $u^\perp$ is a reflection, then $1 = \det(T) = -\epsilon$. Hence $\epsilon = -1$ and we see that $T$ is a $180^\circ$ rotation of $\mathbb{R}^3$ around the axis determined by the eigenvector $v'$. On the other hand, if $T \in SO(3)$ and the restriction of $T$ to $u^\perp$ is a rotation, then $1 = \det(T) = \epsilon$, and $T$ is a rotation of $\mathbb{R}^3$ through an angle $\theta$ about the axis determined by the eigenvector $u$. Thus we have the following result.

Theorem 3.13. Let $T : \mathbb{R}^3 \to \mathbb{R}^3$ be a linear isometry with $\det(T) = 1$. Then $T$ is a rotation of $\mathbb{R}^3$ about an axis $\mathbb{R} \cdot u$, and there is an angle $\theta$ and an orthonormal basis for $\mathbb{R}^3$ such that the matrix for $T$ with respect to this basis is

\[ \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos(\theta) & -\sin(\theta) \\ 0 & \sin(\theta) & \cos(\theta) \end{pmatrix}. \]

Moreover, if $T$ is not the identity operator, then the line $\mathbb{R} \cdot u$ is the unique line of eigenvectors with eigenvalue 1 for $T$, i.e. if $v \in \mathbb{R}^3$ with $T(v) = v$, then $v = c \cdot u$ for some scalar $c$.

Note that if $T \in SO(3)$ and $T \neq I$, then $\mathbb{R} \cdot u$ is the unique eigenline for $T$, unless $T$ induces a $180^\circ$ rotation of $u^\perp$, in which case $u^\perp$ is the $(-1)$-eigenspace for $T$.
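For a concrete rotation, the axis and angle can be read off numerically. The sketch below uses an example that is not from the notes: the permutation matrix sending $e_1 \to e_2 \to e_3 \to e_1$. It relies on the standard fact that the trace of a matrix is unchanged by a change of basis, so by the matrix in Theorem 3.13 the trace equals $1 + 2\cos(\theta)$. The helper names `apply` and `det3` are hypothetical.

```python
import math

# Columns are the images of e1, e2, e3: e1 -> e2, e2 -> e3, e3 -> e1.
R = [[0, 0, 1],
     [1, 0, 0],
     [0, 1, 0]]

def apply(M, v):
    """Matrix times column vector."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

def det3(M):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

axis = [1, 1, 1]
axis_is_fixed = apply(R, axis) == axis      # (1, 1, 1) spans the eigenline for eigenvalue 1

trace = sum(R[i][i] for i in range(3))
theta_degrees = math.degrees(math.acos((trace - 1) / 2))

print(det3(R), axis_is_fixed, round(theta_degrees, 6))
```

The trace only determines $\cos(\theta)$, so this recovers the angle up to the choice of orientation of the axis.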

We now have a rather complete description of linear operators $T \in SO(3)$. In Chapter 7, we will use this knowledge in combination with some basic facts about permutation groups, including the lovely Orbit Counting Formula from Chapter 6, to describe the finite subgroups of $SO(3)$.

In Math 4507, you study the group of isometries of Spherical Geometry. This is precisely the group $O(3)$ of all isometries of the 2-sphere $S^2$ in $\mathbb{R}^3$, with the metric induced from $\mathbb{R}^3$. You also study the group of isometries of Hyperbolic Geometry. This group is isomorphic to

\[ H := \left\{ A \in GL(3,\mathbb{R}) : A^T \cdot \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix} \cdot A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix} \right\}. \]

Exercises

1. Prove: Let $A$ be an $n \times n$ matrix such that $u^TAv = u^Tv$ for all vectors $u, v \in \mathbb{R}^n$. Then $A = I$, the identity matrix. [Hint: Let $e_i$ be the unit vector with 0 in the $j$-th entry for all $j \neq i$ and with 1 in the $i$-th entry. Compute $e_i^TAe_j$ and compare with $e_i^Te_j$.]

2. In this exercise, you may use basic properties of the dot product in $\mathbb{R}^n$. Let $u$ and $v$ be vectors in $\mathbb{R}^n$.

(a) Prove: $u \cdot v = \frac{1}{2}(\|u + v\|^2 - \|u\|^2 - \|v\|^2)$.

(b) Prove: Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear isometry. Then $T(u) \cdot T(v) = u \cdot v$ for all $u, v \in \mathbb{R}^n$.

(c) Prove: Let $T : \mathbb{R}^3 \to \mathbb{R}^3$ be a linear isometry. Suppose that $u \in \mathbb{R}^3$ is an eigenvector for $T$. Then the plane

\[ u^\perp := \{v \in \mathbb{R}^3 : u \cdot v = 0\} \]

is a $T$-invariant subspace of $\mathbb{R}^3$.

3. Using Determinant Theorems 2.12, prove: If $A$ is an orthogonal matrix, then $\det(A) = \pm 1$.

4. Prove Theorem 3.4: $O(n)$ is a subgroup of $GL(n,\mathbb{R})$, and $SO(n)$ is a normal subgroup of index 2 in $O(n)$.

5. Let $D(n)$ be the set of all $n \times n$ diagonal matrices (i.e., every entry off the main diagonal is 0) such that every diagonal entry is $\pm 1$.

(a) Prove: $W(n) = D(n) \cdot P_n = \{DP : D \in D(n), P \in P_n\}$.

(b) Prove: $D(n)$ is a normal subgroup of $W(n)$.

6. Prove: Let $R : \mathbb{R}^3 \to \mathbb{R}^3$ be a linear operator. Then $R$ is a reflection across a plane $\Pi$ through $(0, 0, 0)$ if and only if the following three conditions hold:

(1) $R$ has a 1-dimensional eigenspace $U$ with eigenvalue $-1$;

(2) $R$ has a 2-dimensional eigenspace $W$ with eigenvalue 1; and

(3) $U \perp W$.

7.(a) Give an example to show that not every linear isometry of $\mathbb{R}^3$ is a rotation or a reflection.

(b) Prove: If $T : \mathbb{R}^3 \to \mathbb{R}^3$ is a linear isometry, then either $T$ is a rotation or $T = R \circ \rho$, where $R$ is a reflection and $\rho$ is a rotation (possibly the identity rotation).

8. Prove the following theorem of Euler: Let $f : \mathbb{R}^3 \to \mathbb{R}^3$ be an isometry (not necessarily linear). Then either $f = T_v \circ \rho$ or $f = T_v \circ R \circ \rho$, where $T_v : \mathbb{R}^3 \to \mathbb{R}^3$ is the translation by the vector $v \in \mathbb{R}^3$, $\rho$ is a rotation about $(0, 0, 0)$, and $R$ is a reflection across a plane passing through $(0, 0, 0)$.

9. Prove the following theorems of Euler.

(a) Let $f = T_v \circ \rho$ be an isometry, using the notation of Exercise 8, with $\rho$ not the identity rotation. Let $u$ be an eigenvector for $\rho$ with eigenvalue 1. Then $f$ is a rotation about some point in $\mathbb{R}^3$ if and only if $u \cdot v = 0$, i.e., $v$ lies in the plane $u^\perp$. Moreover, if $f$ is a rotation, then the axis of rotation is parallel to the vector $u$.

(b) Let $f = T_v \circ \rho$ be as in (a). Write $v = u_1 + w_1$, where $u_1$ is the orthogonal projection of $v$ onto the line through $u$, and $w_1$ is the orthogonal projection of $v$ into the plane $u^\perp$. Then

\[ f = T_{u_1} \circ (T_{w_1} \circ \rho) = T_{u_1} \circ \rho_1, \]

where $\rho_1 := T_{w_1} \circ \rho$ is a rotation about an axis parallel to the vector $u$, and $T_{u_1}$ is a translation by the vector $u_1$, which is also parallel to the vector $u$. [Note: If $u_1 \neq 0$, then $f$ is a screw motion along the axis of the rotation $\rho$.]

10. Prove: The set $H$ defined at the end of this chapter is a subgroup of $GL(3,\mathbb{R})$.


Chapter 4: Permutations, Orbits, and Lagrange’s Theorem

We now return to the topic of group theory, in particular to the study of groups of permutations and groups of isometries. Armed with the tools of linear algebra, we shall be equipped to study the isometries of $\mathbb{R}^3$, and in particular, the symmetries of the Platonic solids. First, however, we need to study the theory of permutation groups more deeply. The highlight of this chapter will be the proof of Lagrange’s Theorem, which most people would point to as the beginning of the history of group theory.

Consider the equilateral triangle $\triangle ABC$ and the rotation $\rho$ counterclockwise about its center through an angle of $120^\circ$. We can think of $\rho$ as a permutation of the vertex set $\{A, B, C\}$:

\[ \rho(A) = B,\ \rho(B) = C,\ \rho(C) = A. \]

If, instead of an equilateral triangle, the figure was a regular octagon and $\rho$ was a $45^\circ$ rotation about its center, then defining $\rho$ as a function on the vertex set would require an enumeration of eight function values:

\[ \rho(A) = B,\ \rho(B) = C,\ \rho(C) = D,\ \rho(D) = E,\ \rho(E) = F,\ \rho(F) = G,\ \rho(G) = H,\ \rho(H) = A. \]

This can become tedious. Cauchy devised a somewhat more efficient notation for these functions, called cycle notation. It is based on the following principle.

Definition 4.1. Let $H$ be a group of permutations of a set $X$. The $H$-orbit containing the point $x$ is the set $x^H := \{h(x) : h \in H\}$.

Definition 4.2. Let $\sigma$ be a permutation of the set $X$. Recall that the cyclic group generated by $\sigma$ is the set $\langle\sigma\rangle := \{\sigma^i : i \in \mathbb{Z}\}$. We write $x^\sigma$ for $x^{\langle\sigma\rangle}$, and call it the $\sigma$-orbit containing the point $x$.

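We shall be particularly interested in the case where $\sigma$ has finite order.

Examples

1. If $r$ is a reflection across a line in $\mathbb{R}^2$, then $r$ has order 2, since $r^2 = r \circ r = I$, but $r \neq I$.

2. If $\rho$ is a $120^\circ$ rotation of $\mathbb{R}^2$ about the point $P$, then $\rho$ has order 3, since $\rho^3 = I$, but $\rho \neq I \neq \rho^2$.

3. If $\rho$ is a $45^\circ$ rotation of $\mathbb{R}^2$ about the point $P$, then $\rho$ has order 8.

Lemma 4.3. Let $\sigma \in G$ be an element of order $m$. Let $n \in \mathbb{Z}$. Write

\[ n = q \cdot m + r \]

with $q, r \in \mathbb{Z}$, and $0 \le r < m$, as given by the Division Algorithm. Then $\sigma^n = \sigma^r$. Hence

\[ \langle\sigma\rangle = \{I = \sigma^0, \sigma, \sigma^2, \dots, \sigma^{m-1}\}. \]

Proof.

\[ \sigma^n = \sigma^{q \cdot m + r} = (\sigma^m)^q \circ \sigma^r = I^q \circ \sigma^r = \sigma^r. \]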

□

Now suppose that $\sigma$ is a permutation of the set $X$ and $\sigma$ has order $m$. Let $x$ be a point of $X$. Then we can enumerate the elements of $x^\sigma$ as:

\[ x^\sigma = \{x, \sigma(x), \sigma^2(x), \dots, \sigma^{m-1}(x)\}. \]

You might be tempted to guess that the $\sigma$-orbit containing $x$ always has cardinality $m$, where $m$ is the order of the permutation $\sigma$. But it is easy to see that this is not the case. For example, if $X = \mathbb{R}^2$ and $\rho$ is a $120^\circ$ rotation about the point $P$, then $\rho$ is a permutation of the set $X$ and $\rho$ has order 3, but the $\rho$-orbit containing the point $P$ is simply $\{P\}$, since $\rho(P) = \rho^2(P) = P$. So there can be repetitions in the set

\[ \{x, \sigma(x), \sigma^2(x), \dots, \sigma^{m-1}(x)\}. \]

A more interesting example is the following one:

Let $X = \{1, 2, 3, 4, 5\}$. Let $\sigma : X \to X$ be the permutation defined by:

\[ \sigma(1) = 2,\ \sigma(2) = 3,\ \sigma(3) = 1,\ \sigma(4) = 5,\ \sigma(5) = 4. \]

It is easy to check that $\sigma^i(1) = 1$ if and only if $i$ is a multiple of 3, and $\sigma^j(4) = 4$ if and only if $j$ is even. It is then not hard to conclude that $\sigma$ has order 6. But

\[ 1^\sigma = \{1, \sigma(1), \sigma^2(1), \sigma^3(1), \sigma^4(1), \sigma^5(1)\} = \{1, 2, 3, 1, 2, 3\} = \{1, 2, 3\}. \]

We cycle through the same numbers twice. Similarly,

\[ 4^\sigma = \{4, \sigma(4), \sigma^2(4), \sigma^3(4), \sigma^4(4), \sigma^5(4)\} = \{4, 5, 4, 5, 4, 5\} = \{4, 5\}. \]

This time we cycle through the same numbers three times. So the $\sigma$-orbits on $X$ have cardinality 2 and 3, but $\sigma$ has order 6.

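In general, if $\mathcal{O}$ is a $\sigma$-orbit on the set $X$, we may define the permutation $\sigma_{\mathcal{O}} : \mathcal{O} \to \mathcal{O}$ to be the function $\sigma$ with its domain restricted to $\mathcal{O}$. If

\[ \mathcal{O} = \{x, \sigma(x), \sigma^2(x), \dots, \sigma^{k-1}(x)\} \]

with $\sigma^k(x) = x$, but $\sigma^i(x) \neq x$ for all $i$, $1 \le i < k$, then clearly $\sigma_{\mathcal{O}}^i \neq I$ for $1 \le i < k$, but $\sigma_{\mathcal{O}}^k = I$, since

\[ \sigma_{\mathcal{O}}^k(\sigma^i(x)) = \sigma^k(\sigma^i(x)) = \sigma^i(\sigma^k(x)) = \sigma^i(x) \]

for all $i$. So, $\sigma_{\mathcal{O}}$ has order $k = |\mathcal{O}|$. Since $\sigma_{\mathcal{O}}^m = \sigma^m = I$, it follows, by Exercise 3b of Chapter 5, that

If $\mathcal{O}$ is a $\sigma$-orbit on $X$, then $|\mathcal{O}|$ divides $m = |\langle\sigma\rangle|$.

This is a special case of the famous Theorem of Lagrange, which we will prove later in this chapter.

Going back to the example $\sigma : \{1, 2, 3, 4, 5\} \to \{1, 2, 3, 4, 5\}$ defined by

\[ \sigma(1) = 2,\ \sigma(2) = 3,\ \sigma(3) = 1,\ \sigma(4) = 5,\ \sigma(5) = 4, \]

we see that

\[ 1^\sigma = 2^\sigma = 3^\sigma, \quad \text{and} \quad 4^\sigma = 5^\sigma. \]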
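The orbit computations in this example can be reproduced mechanically. The following sketch stores the permutation as a Python dictionary and uses two hypothetical helpers, `orbit` and `order`; it is meant only as a check of the numbers above, not as an efficient algorithm.

```python
# The permutation of {1, 2, 3, 4, 5} from the example, stored as x -> sigma(x).
sigma = {1: 2, 2: 3, 3: 1, 4: 5, 5: 4}

def orbit(perm, x):
    """List x, perm(x), perm^2(x), ... until the values return to x."""
    out = [x]
    y = perm[x]
    while y != x:
        out.append(y)
        y = perm[y]
    return out

def order(perm):
    """Smallest m >= 1 with perm^m equal to the identity, by repeated composition."""
    identity = {x: x for x in perm}
    power, m = dict(perm), 1
    while power != identity:
        power = {x: perm[power[x]] for x in perm}
        m += 1
    return m

print(orbit(sigma, 1))   # [1, 2, 3]
print(orbit(sigma, 4))   # [4, 5]
print(order(sigma))      # 6
```

Both orbit sizes, 3 and 2, divide the order 6, as the divisibility statement above predicts.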


Lemma 4.4. Let $H$ be a group of permutations of the set $X$. Define a relation on $X$ by

\[ x \equiv_H y \text{ if and only if } x \in y^H. \]

Then $\equiv_H$ is an equivalence relation. The equivalence classes are the $H$-orbits on $X$.

Proof. We must check the three properties of an equivalence relation.

Reflexivity: Let $x \in X$. Since $I \in H$, $x = I(x) \in x^H$. So $x \equiv_H x$.

Symmetry: Suppose $x \equiv_H y$. Then $x \in y^H$. Hence, there exists $h \in H$ with $h(y) = x$. Then $h^{-1} \in H$ and $h^{-1}(x) = y$. So $y \in x^H$, i.e., $y \equiv_H x$.

Transitivity: Suppose $x \equiv_H y$ and $y \equiv_H z$. Then there exist $h, h' \in H$ with $h(y) = x$ and $h'(z) = y$. Then $h \circ h' \in H$ and

\[ (h \circ h')(z) = h(h'(z)) = h(y) = x. \]

Hence $x \in z^H$, i.e., $x \equiv_H z$.

□

From the basic properties of equivalence relations and partitions, we obtain the following corollary.

Corollary 4.5. If $\mathcal{O}$ and $\mathcal{O}_1$ are two $H$-orbits on $X$, then either $\mathcal{O} = \mathcal{O}_1$ or $\mathcal{O} \cap \mathcal{O}_1 = \emptyset$.

Rather than defining Cauchy cycle notation, we shall illustrate it by some examples.

1. Let $\rho$ be the $45^\circ$ rotation of the regular octagon mentioned before, regarded as the following function on the set of vertices:

\[ \rho(A) = B,\ \rho(B) = C,\ \rho(C) = D,\ \rho(D) = E,\ \rho(E) = F,\ \rho(F) = G,\ \rho(G) = H,\ \rho(H) = A. \]

In Cauchy notation, we write:

\[ \rho = (A, B, C, D, E, F, G, H). \]

This signifies that $\rho$ maps the first entry in the 8-tuple, $A$, to the second entry, $B$; the second entry $B$ to the third entry $C$, \dots, and finally, $\rho$ maps the last entry $H$ back to the first entry $A$. The 8-tuple $(A, B, C, D, E, F, G, H)$ is called a cycle. The symbols in a cycle comprise all of the elements in some $\rho$-orbit. In this case, there is only one $\rho$-orbit.

We can compute powers of a permutation by leap-frogging in the cycle. For example, suppose we wish to compute the cycle structure of the $135^\circ$ rotation $\rho^3$. We have $\rho(A) = B$, $\rho^2(A) = \rho(B) = C$, and $\rho^3(A) = \rho(\rho^2(A)) = \rho(C) = D$. So we leapfrog over two symbols in the cycle:

\[ \rho^3 = (A, D, G, B, E, H, C, F). \]

Similarly we can compute the $90^\circ$ rotation $\rho^2$ by leapfrogging over one symbol each time. But now there are two $\rho^2$-orbits, not just one:

\[ \rho^2 = (A, C, E, G)(B, D, F, H). \]

To compute $\rho^{-1}$, we may simply write the cycle for $\rho$ backwards:

\[ \rho^{-1} = (H, G, F, E, D, C, B, A), \]

or, if you prefer to start with $A$:

\[ \rho^{-1} = (A, H, G, F, E, D, C, B). \]

This illustrates the important fact that there are many different Cauchy cycle structures representing exactly the same permutation.

We can compute $\rho^4$ by leapfrogging over three symbols at a time in the cycle for $\rho$, but we can also compute it by squaring each cycle in $\rho^2$:

\[ \rho^4 = (A, E)(C, G)(B, F)(D, H). \]

2. Now let $\sigma : \{1, 2, 3, 4, 5\} \to \{1, 2, 3, 4, 5\}$ be the function discussed above:

\[ \sigma(1) = 2,\ \sigma(2) = 3,\ \sigma(3) = 1,\ \sigma(4) = 5,\ \sigma(5) = 4. \]

In Cauchy cycle notation,

\[ \sigma = (1, 2, 3)(4, 5). \]

Then

\[ \sigma^2 = (1, 3, 2)(4)(5). \]

\[ \sigma^3 = (1)(2)(3)(4, 5). \]

\[ \sigma^4 = (1, 2, 3)(4)(5). \]

\[ \sigma^5 = (1, 3, 2)(4, 5). \]

\[ \sigma^6 = I = (1)(2)(3)(4)(5). \]

As a further simplification of Cauchy notation, it is common, when the domain $X$ is unambiguous, to omit cycles (orbits) of length 1. But, for the identity permutation, it is customary to write $(1)$, rather than $\emptyset$. Hence, in this example, we would write:

\[ \sigma = (1, 2, 3)(4, 5) \]

\[ \sigma^2 = (1, 3, 2) \]

\[ \sigma^3 = (4, 5) \]

\[ \sigma^4 = (1, 2, 3) \]

\[ \sigma^5 = (1, 3, 2)(4, 5) \]

\[ \sigma^6 = (1). \]

This clearly verifies the assertion made earlier that $\sigma$ has order 6. Indeed, it is easy to see that the following assertion is true.

Lemma 4.6. If the permutation $\sigma$ is a single cycle of length $k$, then the order of $\sigma$ is $k$. If the permutation $\sigma$ is the product of disjoint cycles of lengths $k_1, k_2, \dots, k_m$, then the order of $\sigma$ is the least common multiple of $\{k_1, k_2, \dots, k_m\}$.
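Lemma 4.6 and the table of powers preceding it can be checked with a short computation. The sketch below again stores a permutation as a dictionary; the helper names `cycles`, `power`, and `lcm_of_cycle_lengths` are hypothetical, and cycles of length 1 are dropped when printing, with `(1,)` standing in for the identity.

```python
from math import gcd

def cycles(perm):
    """Disjoint cycles of perm (a dict), each started at its smallest element."""
    seen, result = set(), []
    for x in sorted(perm):
        if x not in seen:
            cyc = [x]
            seen.add(x)
            y = perm[x]
            while y != x:
                cyc.append(y)
                seen.add(y)
                y = perm[y]
            result.append(tuple(cyc))
    return result

def power(perm, k):
    """k-th power of perm for k >= 0, by applying perm k times."""
    out = {x: x for x in perm}
    for _ in range(k):
        out = {x: perm[out[x]] for x in perm}
    return out

def lcm_of_cycle_lengths(perm):
    result = 1
    for c in cycles(perm):
        result = result * len(c) // gcd(result, len(c))
    return result

sigma = {1: 2, 2: 3, 3: 1, 4: 5, 5: 4}   # (1, 2, 3)(4, 5)

for k in range(1, 7):
    nontrivial = [c for c in cycles(power(sigma, k)) if len(c) > 1]
    print(k, nontrivial or [(1,)])

print(lcm_of_cycle_lengths(sigma))   # 6, the least common multiple of 3 and 2
```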

The fact that Cauchy cycle notation gives a valid representation of a permutation $\sigma$ is a consequence of the fact that the $\sigma$-orbits on $X$ form a disjoint partition of the set $X$. Hence each element of the domain $X$ appears in one and only one cycle.

With some care, we can use the Cauchy notation to multiply permutations. Thus, suppose we have the permutations $\rho = (1, 2, 3)(4, 5)$ and $\sigma = (1, 3, 4, 2)$ of $X = \{1, 2, 3, 4, 5\}$. To compute $\rho \circ \sigma$, we work from right to left in the cycles:

\[ \rho \circ \sigma = (1, 2, 3)(4, 5)(1, 3, 4, 2). \]

Thus, in the rightmost cycle, 1 goes to 3. Then moving to the left, we see that 3 goes to 1. So the net effect is that 1 stays fixed. Next, in the rightmost cycle, 2 goes to 1. Then in the leftmost cycle, 1 goes to 2. Again, the net effect is that 2 is fixed. Next, in the rightmost cycle, 3 goes to 4. Then in the next cycle to the left, 4 goes to 5. So, 3 goes to 5. Likewise, 4 goes to 2, which goes to 3. So 4 goes to 3. Finally, 5 goes to 4. Hence we conclude that

\[ \rho \circ \sigma = (1)(2)(3, 5, 4) = (3, 5, 4). \]

In the exercises, you will be asked to practice your skills at the “calculus of permutations”. Practice makes perfect.
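The right-to-left rule for multiplying permutations can be mirrored in a few lines of code, which may help when checking hand computations. In this sketch `from_cycles` and `compose` are hypothetical helper names, and `from_cycles` assumes that the cycles it is given are disjoint.

```python
def from_cycles(cycle_list, domain):
    """Build the permutation (as a dict) given by a list of disjoint cycles."""
    perm = {x: x for x in domain}
    for cyc in cycle_list:
        for i, x in enumerate(cyc):
            perm[x] = cyc[(i + 1) % len(cyc)]
    return perm

def compose(f, g):
    """f o g: apply g first, then f (right to left, as in the text)."""
    return {x: f[g[x]] for x in g}

X = range(1, 6)
rho = from_cycles([(1, 2, 3), (4, 5)], X)
sigma = from_cycles([(1, 3, 4, 2)], X)

rho_sigma = compose(rho, sigma)   # fixes 1 and 2; 3 -> 5, 5 -> 4, 4 -> 3
sigma_rho = compose(sigma, rho)   # fixes 1 and 3; 2 -> 4, 4 -> 5, 5 -> 2

print(rho_sigma)
print(sigma_rho)
```

The two products are different, $(3, 5, 4)$ and $(2, 4, 5)$, which shows that the order of the factors matters.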

The multiset of numbers listing the lengths of the disjoint cycles (orbits) for a permutation $\sigma$ is called the cycle structure of $\sigma$. For example, the cycle structure of the permutation $(1, 2, 3)(4, 5)(6, 7)(8)(9)(10) = (1, 2, 3)(4, 5)(6, 7)$ in $S_{10}$ is $\{3, 2, 2, 1, 1, 1\}$. The following fact is fundamental.

Lemma 4.7. Let $H$ be a subgroup of $\mathrm{Sym}(X)$ and let $\tau$ be a permutation of the set $X$. Let $\mathcal{O}$ be an $H$-orbit on $X$. Then $\tau(\mathcal{O})$ is a $\tau \circ H \circ \tau^{-1}$-orbit on $X$ of the same length as $\mathcal{O}$. In particular, if $H = \tau \circ H \circ \tau^{-1}$, then $\mathcal{O}$ and $\tau(\mathcal{O})$ are two $H$-orbits of the same length. They may or may not be the same orbit.

Proof. Let $x$ and $y$ be in $\tau(\mathcal{O})$. Then there exist $a$ and $b$ in $\mathcal{O}$ with $\tau(a) = x$, $\tau(b) = y$. Since $a$ and $b$ are in $\mathcal{O}$, $b = \sigma(a)$ for some $\sigma \in H$. Then

\[ (\tau \circ \sigma \circ \tau^{-1})(x) = (\tau \circ \sigma)(a) = \tau(b) = y. \]

Hence $x$ and $y$ are in the same $\tau \circ H \circ \tau^{-1}$-orbit.

On the other hand, let $a \in \mathcal{O}$ and $x = \tau(a)$. Suppose that $y$ is in the same $\tau \circ H \circ \tau^{-1}$-orbit as $x$. Then, for some $\sigma \in H$,

\[ y = (\tau \circ \sigma \circ \tau^{-1})(x) = \tau(\sigma(a)). \]

Since $\sigma(a) \in \mathcal{O}$, $y \in \tau(\mathcal{O})$. Hence $\tau(\mathcal{O})$ is the $\tau \circ H \circ \tau^{-1}$-orbit containing $x$. Since $\tau$ is a bijective map on $X$, $|\mathcal{O}| = |\tau(\mathcal{O})|$.

□

As a corollary we obtain the following important fact.

Corollary 4.8. Let $\sigma$ and $\sigma_1$ be two permutations in $S_n$. Then $\sigma$ and $\sigma_1$ are in the same $S_n$-conjugacy class if and only if they have the same cycle structure. Thus the $S_n$-conjugacy classes are in one-to-one correspondence with the partitions of $n$, i.e. the decompositions:

\[ n = n_1 + n_2 + \cdots + n_r, \]

with $n_1 \ge n_2 \ge \cdots \ge n_r > 0$, and all $n_i \in \mathbb{N}$.
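Before turning to the proof, the statement of Corollary 4.8 can be tested by brute force for a small case. The sketch below works in $S_4$ with 0-based indices (permutations of $\{0, 1, 2, 3\}$ stored as tuples); `cycle_type` and `conjugate` are hypothetical helper names. It groups the 24 permutations into conjugacy classes and records the cycle structures occurring in each class.

```python
from itertools import permutations

def cycle_type(p):
    """Cycle lengths of p (a tuple on range(n)), sorted in decreasing order."""
    seen, lengths = set(), []
    for x in range(len(p)):
        if x not in seen:
            length, y = 0, x
            while y not in seen:
                seen.add(y)
                y = p[y]
                length += 1
            lengths.append(length)
    return tuple(sorted(lengths, reverse=True))

def conjugate(t, s):
    """The permutation t o s o t^{-1}."""
    t_inv = [0] * len(t)
    for i, ti in enumerate(t):
        t_inv[ti] = i
    return tuple(t[s[t_inv[x]]] for x in range(len(s)))

n = 4
S_n = list(permutations(range(n)))

# Each class is the set of all conjugates of one permutation.
classes = {frozenset(conjugate(t, s) for t in S_n) for s in S_n}
types_per_class = [{cycle_type(p) for p in cls} for cls in classes]
all_types = {cycle_type(p) for p in S_n}

print(len(classes))                                # 5, the number of partitions of 4
print(all(len(t) == 1 for t in types_per_class))   # True: one cycle structure per class
print(sorted(all_types, reverse=True))
```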

Proof. Let $\mathcal{O}_1, \dots, \mathcal{O}_r$ be the $\sigma$-orbits on $\{1, 2, \dots, n\}$. If $\sigma_1 = \tau \circ \sigma \circ \tau^{-1}$, then by Lemma 4.7, $\tau(\mathcal{O}_1), \dots, \tau(\mathcal{O}_r)$ are the $\sigma_1$-orbits on $\{1, 2, \dots, n\}$, and $|\mathcal{O}_k| = |\tau(\mathcal{O}_k)|$ for all $k$. Hence $\sigma$ and $\sigma_1$ have the same cycle structure.

Now suppose that $\sigma$ and $\sigma_1$ have the same cycle structure. Let $(a_1, a_2, \dots, a_t)$ be a cycle of $\sigma$ and $(b_1, b_2, \dots, b_t)$ be a cycle of $\sigma_1$. Let $\tau$ be a permutation in $S_n$ with $\tau(a_i) = b_i$, $1 \le i \le t$. Then

\[ (\tau \circ \sigma \circ \tau^{-1})(b_i) = (\tau \circ \sigma)(a_i) = \tau(a_{i+1}) = b_{i+1}. \]

(Here we understand $t + 1$ to be 1.) Hence $\tau \circ \sigma \circ \tau^{-1}$ and $\sigma_1$ both contain the cycle $(b_1, b_2, \dots, b_t)$. Since cycles are disjoint and since $\sigma$ and $\sigma_1$ have the same cycle structure, we can define $\tau$ cycle by cycle, so that $\tau \circ \sigma \circ \tau^{-1} = \sigma_1$, i.e., $\sigma$ and $\sigma_1$ are in the same $S_n$-conjugacy class.

□

We conclude this chapter with a proof of the most important basic theorem in the theory of groups. It is also historically the first theorem in the theory of groups. It was stated by Lagrange in 1771, almost 60 years before the word group was coined. The striking nature of the result convinced mathematicians of the importance of using groups to organize their thinking about permutations.

Lagrange’s Orbit-Stabilizer Theorem. Let $G$ be a finite group of permutations of the set $X$. Let $\mathcal{O}$ be a $G$-orbit on the set $X$ containing the point $x$. Let

\[ G_x := \{g \in G : g(x) = x\}. \]

Then

\[ |G| = |\mathcal{O}| \cdot |G_x|. \]

We call $G_x$ the stabilizer in $G$ of the point $x$. In order to prove Lagrange’s Theorem, we need a few remarks. The first is a special case of Theorem 10.11(1) in the Math 4580 text.

Lemma 4.9. For any $x \in X$, $G_x$ is a subgroup of $G$.

Next comes the crucial observation.

Lemma 4.10. Let $g \in G$ and let $y = g(x)$. Then $y \in \mathcal{O}$ and

\[ g \circ G_x := \{g \circ h : h \in G_x\} = \{g' \in G : g'(x) = y\}. \]

Proof. By definition of orbits, if $y = g(x)$, then $y \in \mathcal{O}$. Let $g \circ h \in g \circ G_x$. Then

\[ (g \circ h)(x) = g(h(x)) = g(x) = y. \]

Hence

\[ g \circ G_x \subseteq \{g' \in G : g'(x) = y\}. \]

Now let $g' \in G$ with $g'(x) = y = g(x)$. Then $(g^{-1} \circ g')(x) = x$. Hence $g^{-1} \circ g' = h$ for some $h \in G_x$. So $g' = g \circ h \in g \circ G_x$. Hence

\[ \{g' \in G : g'(x) = y\} \subseteq g \circ G_x. \]

Hence

\[ g \circ G_x = \{g' \in G : g'(x) = y\}, \]

completing the proof.

□

Our next remark is essentially the same as Lemma 12.2 in the Math 4580 text, but now for left cosets.

Lemma 4.11. Let $g \in G$. Then the function $\varphi : G_x \to g \circ G_x$, defined by

\[ \varphi(h) = g \circ h \]

for all $h \in G_x$, is a bijection of sets. Hence $|G_x| = |g \circ G_x|$ for all $g \in G$.

Proof. Clearly, by definition of $g \circ G_x$, $\varphi$ is surjective. Suppose that $\varphi(h) = \varphi(h')$. Then $g \circ h = g \circ h'$, and so, by the Cancellation Law, $h = h'$. Hence $\varphi$ is also injective. So $\varphi$ is bijective. Now the equality of cardinalities is immediate.

□

Now we can proceed to a proof of Lagrange’s Theorem.

Proof of Lagrange’s Theorem. Since $\mathcal{O} = \{g(x) : g \in G\}$, $|\mathcal{O}| \le |G| < \infty$. Let

\[ \mathcal{O} = \{x = x_1, x_2, \dots, x_m\}. \]

Thus $|\mathcal{O}| = m$. For every $g \in G$, $g(x) = x_i$ for a unique $i$, $1 \le i \le m$. Hence if we set

\[ G_i = \{g \in G : g(x) = x_i\}, \]

then

\[ G = G_1 \cup G_2 \cup \cdots \cup G_m, \]

and

\[ G_i \cap G_j = \emptyset \text{ for all } i \neq j. \]

Hence

\[ |G| = |G_1| + |G_2| + \cdots + |G_m|. \]

By Lemma 4.10, $G_i = g_i \circ G_x$, where $g_i \in G$ and $g_i(x) = x_i$. Moreover, by Lemma 4.11,

\[ |G_i| = |g_i \circ G_x| = |G_x| \text{ for all } i. \]

Hence

\[ |G| = |G_x| + |G_x| + \cdots + |G_x| = m \cdot |G_x| = |\mathcal{O}| \cdot |G_x|, \]

proving Lagrange’s Theorem.

□

You may wonder how Lagrange could even STATE Lagrange’s Theorem, given that the concept of a group had not yet been defined. In fact, Lagrange considered only one type of group, $S_n$, the group of all permutations of the set $\{1, 2, \dots, n\}$. His version of Lagrange’s Theorem was more like the following:

Theorem 4.12. Let $S_n$ act as a group of permutations of the set $X$. Then the cardinality of every $S_n$-orbit on $X$ is a divisor of $n!$.

If you think of $X = \{1, 2, \ldots, n\}$, then this is the trivial statement:

$n$ is a divisor of $n!$.

However, $S_n$ acts “naturally” on other sets as well. For example, let

$$X = \{(i, j) : 1 \le i, j \le n\}.$$

Then $|X| = n^2$ and $S_n$ acts naturally on $X$ via:

$$\sigma(i, j) = (\sigma(i), \sigma(j)).$$

In this case, $S_n$ has two orbits on $X$:

$$\Delta := \{(i, i) : 1 \le i \le n\} \text{ and } X - \Delta.$$

As $|\Delta| = n$ and $|X - \Delta| = n^2 - n = n(n-1)$, it is easy to verify Lagrange’s Theorem in this case as well. Lagrange was actually interested in the action of $S_n$ on the infinite set $P_n$ of all multi-variable polynomials $p(x_1, x_2, \ldots, x_n)$, under the action

$$\sigma(p(x_1, x_2, \ldots, x_n)) = p(x_{\sigma(1)}, x_{\sigma(2)}, \ldots, x_{\sigma(n)}).$$

Although $P_n$ is an infinite set, of course all of the $S_n$-orbits have finite length, and Lagrange’s Theorem tells us that in fact this length must always divide $n!$.
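The action of $S_n$ on ordered pairs is small enough to check by brute force. The following is a minimal sketch, assuming $n = 4$ and points numbered from 0; the names `group`, `act`, `orbit`, and `stabilizer` are illustrative choices and do not come from the notes. It lists the orbit sizes and confirms that $|G| = |\mathcal{O}| \cdot |G_x|$ at every point.

```python
from itertools import permutations, product
from math import factorial

n = 4
# Elements of S_n as tuples s, where s[i] is the image of the point i (0-based).
group = list(permutations(range(n)))

def act(s, pair):
    """The natural action sigma(i, j) = (sigma(i), sigma(j))."""
    i, j = pair
    return (s[i], s[j])

# X is the set of all ordered pairs (i, j).
X = list(product(range(n), repeat=2))

def orbit(x):
    """The S_n-orbit of the pair x."""
    return {act(s, x) for s in group}

def stabilizer(x):
    """All group elements fixing the pair x."""
    return [s for s in group if act(s, x) == x]

orbits = {frozenset(orbit(x)) for x in X}
orbit_sizes = sorted(len(o) for o in orbits)

# Lagrange's Theorem: |G| = |orbit| * |stabilizer| for every point x.
lagrange_holds = all(
    len(group) == len(orbit(x)) * len(stabilizer(x)) for x in X
)
print(orbit_sizes, lagrange_holds)
```

The two orbit sizes are $n$ (the diagonal) and $n(n-1)$, and both divide $n!$.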

We shall return to this setting for Lagrange’s Theorem in Chapter 9, when we study polynomials and their roots. Now we shall return to geometric considerations and see how we can use Lagrange’s Theorem to understand the structure of the symmetry groups of the Platonic solids. But first we review some linear algebra.

Exercises

1. List all of the partitions of 6. For each partition $\pi$, give a permutation $\sigma_\pi$ in $S_6$ whose cycle structure is given by that partition. For each $\sigma_\pi$, list all of the powers of $\sigma_\pi$ and indicate the order of $\sigma_\pi$.

2. Repeat Exercise 1 for 8 in place of 6.

3. Let $\rho = (1, 2, 3, 4)$, $\sigma = (1, 2, 3)(4, 5)$ and $\tau = (2, 4, 5)$ in $S_5$. In each case below, write your answer in Cauchy cycle notation.

(a) Compute $\rho \circ \sigma$.

(b) Compute $\sigma \circ \rho$.

(c) Compute $\sigma \circ \tau$.

(d) Compute $\tau \circ \sigma$.

(e) Compute $\rho \circ \tau$.

(f) Compute $\tau \circ \rho$.

(g) Compute $\rho \circ \sigma \circ \tau$.

(h) Compute $\rho \circ \tau \circ \sigma$.

(i) Compute $\sigma \circ \rho \circ \tau$.

(j) Compute $\sigma \circ \tau \circ \rho$.

(k) Compute $\tau \circ \rho \circ \sigma$.

(l) Compute $\tau \circ \sigma \circ \rho$.

4. Let $\rho : \mathbb{R}^2 \to \mathbb{R}^2$ be a nonidentity rotation about the point $P$. Describe geometrically the $\rho$-orbits on the set $\mathcal{P}$ of all points of $\mathbb{R}^2$. Does this explain why orbits are called orbits?

5. Let $G$ be a group. Let $N$ be a normal subgroup of $G$. Let $H$ be any subgroup of $G$. Let

$$NH = \{n \circ h : n \in N \text{ and } h \in H\}.$$

(a) Prove: $NH$ is a subgroup of $G$.

(b) Suppose that $N \cap H = \{I\}$. Let $h, h' \in H$. Prove: $N \circ h = N \circ h'$ if and only if $h = h'$. Conclude that $|NH| = |N||H|$.

6. Let $G$ be a group. Let $g \in G$ and let $H$ be a subgroup of $G$. Define the function $c_g : H \to g \circ H \circ g^{-1}$ by

$$c_g(h) = g \circ h \circ g^{-1} \text{ for all } h \in H.$$

Prove: $c_g$ is an isomorphism of groups.

7. Consider the group $S_4$.

(a) Prove that $\{(1, 2)(3, 4), (1, 3)(2, 4), (1, 4)(2, 3)\}$ is an $S_4$-conjugacy class.

(b) Let $V := \{(1), (1, 2)(3, 4), (1, 3)(2, 4), (1, 4)(2, 3)\}$. Prove that $V$ is a normal abelian subgroup of $S_4$ isomorphic to the Klein 4-group $V_4$.

(c) Let $H = \{(1), (1, 2, 3), (1, 3, 2)\}$. Prove: $VH$ is a normal subgroup of $S_4$ of cardinality 12, which is the union of three $S_4$-conjugacy classes.

(d) Prove that $S_4$ contains exactly three subgroups of cardinality 8, and that every element of $S_4$ of order 4 is contained in exactly one of these subgroups. Conclude that these subgroups are $S_4$-conjugate, and that each is isomorphic to $D_4$, the symmetry group of the square.

In the next two exercises we prove another version of Lagrange’s Theorem, using an argument very similar to that for Lagrange’s Orbit-Stabilizer Theorem.

8. Let $G$ be a group and let $H$ be a subgroup of $G$. Define a relation on $G$ by:

$$g \equiv_H g_1 \text{ if and only if } g^{-1} \circ g_1 \in H.$$

(a) Prove: $\equiv_H$ is an equivalence relation on $G$.

(b) Prove: The $\equiv_H$-equivalence classes are the left cosets $g \circ H$ of $H$ in $G$.

(c) Prove: $|g \circ H| = |H|$ for all $g \in G$.

9. Prove Lagrange’s Theorem: Let $G$ be a finite group and $H$ a subgroup of $G$. Then $|H|$ divides $|G|$. [Hint: Use Exercise 8. Show that if $m$ is the number of left cosets of $H$ in $G$, then $|G| = m \cdot |H|$.]

10. Prove: Let $p$ be a prime. Then every finite group of cardinality $p$ is cyclic.

11. Prove: Let $G$ be a group and $H$ a subgroup of $G$. Then $H$ is a normal subgroup of $G$ if and only if $g \circ H = H \circ g$ for all $g \in G$.

Definition. Let $G$ be a (not necessarily finite) group and let $H$ be a subgroup of $G$. If there are only finitely many right cosets of $H$ in $G$, then the (right) index of $H$ in $G$ is the number $m$ of right cosets of $H$ in $G$. It is denoted $(G : H)$.

12. Let $G$ be a (not necessarily finite) group and $H$ a subgroup of $G$ with $(G : H) = m < \infty$.

(a) Prove: The number of left cosets of $H$ in $G$ is also finite and is equal to $m$. [Hint: Prove that the map $H \circ g \mapsto (H \circ g)^{-1} = g^{-1} \circ H$ is a bijection between the set of right cosets of $H$ in $G$ and the set of left cosets of $H$ in $G$.] Thus we may speak simply of the index of $H$ in $G$.

(b) Prove: If $(G : H) = 2$, then $H$ is a normal subgroup of $G$.

Definition. Let $G$ be a group of permutations of the set $X$. We say that $G$ is transitive on $X$, or that $G$ acts transitively on $X$, if $X$ is a single $G$-orbit.

13. Prove: Let $G$ be a finite group of permutations of the set $X$. Suppose $G$ acts transitively on $X$. Then $X$ is a finite set, and $|X|$ divides $|G|$.

14. Prove: Let $H$ be a group of permutations of the set $X$. Let $K$ be a normal subgroup of $H$ having a unique fixed point $y$ on $X$, i.e.,

$$\{y\} = \{x \in X : k(x) = x \ \forall k \in K\}.$$

Then $h(y) = y$ for all $h \in H$.

15. Let $G$ be a group with $|G| = 4$. Define the map $\Lambda : G \to \mathrm{Sym}(G)$ by

$$\Lambda(g)(x) = g \circ x$$

for all $g$ in the group $G$ and all $x$ in the set $G$.

(a) Prove: $\Lambda$ defines an isomorphism between $G$ and the subgroup $\Lambda(G)$ of $\mathrm{Sym}(G)$.

(b) Prove: Either $G$ is cyclic or $G$ is isomorphic to $V_4$.

16. Let $G$ be a group with $|G| = 6$.

(a) Prove: $G$ contains elements of order 2 and 3. [Hint: Use Exercises 9 and 10 from Chapter 12 of the Math 4580 text.]

(b) Prove: If $G$ is not cyclic, then $G$ has exactly three elements of order 2: $\tau_1$, $\tau_2$, $\tau_3$. [Hint: Suppose not. Then $G$ has a unique element $\tau$ of order 2. Let $g \in G$ be of order 3. Prove that $g \circ \tau = \tau \circ g$. Conclude that $g \circ \tau$ has order 6.]

(c) Prove: If $G$ is not cyclic, then $G \cong S_3$. [Hint: Define the map $c : G \to \mathrm{Sym}(\{\tau_1, \tau_2, \tau_3\})$ by:

$$c(g)(\tau_i) = g \circ \tau_i \circ g^{-1}.$$

Prove that $c$ is an isomorphism of groups.]

17. In this exercise, we determine all of the subgroups of $S_4$.

(a) Prove: $S_4$ has exactly one subgroup of cardinality 12. (We call this subgroup the alternating group on four letters and denote it $A_4$.)

(b) Prove: If $H$ is a subgroup of $S_4$ containing two distinct cyclic subgroups of cardinality 3, then $H$ acts transitively on $\{1, 2, 3, 4\}$, and hence $H = A_4$ or $H = S_4$.

(c) Prove: Suppose $H$ is a subgroup of $S_4$ containing the cyclic subgroup $K = \langle (a, b, c) \rangle$. Suppose that $H \ne A_4$ and $H \ne S_4$. Then either $H = K$ or $H = (S_4)_d$, the stabilizer in $S_4$ of the point $d$. In the latter case, $H \cong S_3$. [Hint: Apply Exercise 14. You must justify that $K$ is a normal subgroup of $H$.]

(d) Prove: $D_4$ contains two subgroups isomorphic to $V_4$ and one subgroup isomorphic to $C_4$. All other subgroups have order 1, 2, or 8.

(e) Prove: Every subgroup of $S_4$ is either cyclic of order 1, 2, 3, or 4, or is isomorphic to $V_4$, $S_3$, $A_4$, or $S_4$.

18.(a) Prove: Let $G$ be a finite group and let $K$ be a $G$-conjugacy class. Then $|K|$ divides $|G|$.

[Hint: Use Lagrange’s Orbit-Stabilizer Theorem.]

(b) Prove: Let $G$ be a group. Then $Z(G)$ is the union of all $G$-conjugacy classes $K$ such that $|K| = 1$.

(c) Prove: Let $p$ be a prime, and let $G$ be a finite group of cardinality $p^n$ for some $n \in \mathbb{N}$. Then $Z(G)$ contains a non-identity element of $G$.

(d) Prove: Let $p$ be a prime and let $G$ be a finite group of cardinality $p^2$. Then $G$ is an abelian group. [Hint: By (c), $Z(G) \ne \{I\}$. If $g \in G - Z(G)$, argue that every element of $G$ is equal to $z \circ g^i$ for some $z \in Z(G)$ and $i \in \mathbb{N}$. Conclude that $G$ is abelian.]

19. (Bonus) Let $p$ be a prime and let $G$ be a finite group with $|G| = 2p$. Prove: Either $G$ is cyclic or $G \cong D_p$. [Hint: Suppose $G$ is not cyclic. As in Exercise 16, prove that $G$ has exactly $p$ elements of order 2, and a normal cyclic subgroup $K = \langle x \rangle$ of cardinality $p$. Let $t \in G - K$. Show that $x \circ t = t \circ x^{-1}$. Using this, prove that the map $\varphi : G \to D_p$ defined by

$$\varphi(x^i \circ t^j) = \rho^i \circ R^j$$

for $0 \le i \le p - 1$, $0 \le j \le 1$, where $\rho = \rho_{2\pi/p}$ and $R$ is reflection across the $x$-axis, is an isomorphism of groups.]

Chapter 5: The Platonic Solids and their Symmetries

It is quite remarkable that, although there are infinitely many regular polygons, there are, up to similarity, only five regular polyhedra: the Platonic solids.

Definition 5.1. A regular polyhedron is the surface $S$ formed by a finite set of congruent regular polygons, called the faces of $S$, completely enclosing a region in $\mathbb{R}^3$, such that any two faces are either disjoint or have exactly one vertex in common or have exactly one edge in common.

Since $S$ encloses a region, at least 3 faces must meet at any vertex of $S$, and the convexity of $S$ forces the sum of the angles at any vertex to be less than $360^\circ$. Hence the polygons have interior angle less than $120^\circ$, i.e. they are triangles, squares, or pentagons. Moreover, in the case of squares and pentagons, exactly three polygons meet at a vertex, while in the case of triangles, at most five meet at a vertex.

Let $V$, $E$, and $F$ denote the number of vertices, edges, and faces of $S$, respectively. We shall use Descartes’ Formula:

$$V - E + F = 2,$$

in conjunction with the local data:

$$2E = rV = mF,$$

where $r$ is the number of edges meeting at a vertex $v$, and each face is a regular $m$-gon. Now we can easily compute, case-by-case. For example, if $m = r = 3$, then

$$\frac{2}{3}E - E + \frac{2}{3}E = 2,$$

whence $E = 6$, and then $V = F = 4$. Thus we get the following table:

m    r    V    E    F
3    3    4    6    4
3    4    6    12   8
3    5    12   30   20
4    3    8    12   6
5    3    20   30   12

This is the data for the tetrahedron, the octahedron, the icosahedron, the cube, and the dodecahedron, respectively. It can be verified that if $S$ is a Platonic solid and if one marks points at the center of each face of $S$ and then joins points at a minimal distance apart by edges, the resulting surface is another Platonic solid, $S^*$, said to be dual to $S$. This is easy to visualize in the case of the cube $S$, and the resulting figure is a regular octahedron. Less obviously, the icosahedron and the dodecahedron are dual figures.
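The case-by-case computation behind the table can be mechanized. The sketch below is an illustration only: the function name `platonic_data` and the search range $3 \le m, r \le 6$ are our own choices. It substitutes $V = 2E/r$ and $F = 2E/m$ into $V - E + F = 2$ and keeps the pairs $(m, r)$ that yield positive integers.

```python
from fractions import Fraction

def platonic_data(m, r):
    """Solve V - E + F = 2 together with 2E = rV = mF.

    m is the number of sides of each face, r the number of edges at each vertex.
    Returns (V, E, F), or None when there is no positive integer solution.
    """
    # Substituting V = 2E/r and F = 2E/m gives E * (2/r - 1 + 2/m) = 2.
    coeff = Fraction(2, r) - 1 + Fraction(2, m)
    if coeff <= 0:
        return None
    E = Fraction(2) / coeff
    V, F = 2 * E / r, 2 * E / m
    if any(q.denominator != 1 for q in (V, E, F)):
        return None
    return int(V), int(E), int(F)

# Search a small range of (m, r); larger values only make coeff more negative.
table = {}
for m in range(3, 7):
    for r in range(3, 7):
        data = platonic_data(m, r)
        if data is not None:
            table[(m, r)] = data

for (m, r), (V, E, F) in sorted(table.items()):
    print(m, r, V, E, F)
```

Only five pairs $(m, r)$ survive, matching the five rows of the table.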

Dual Platonic solids have the identical group of symmetries. We shall only consider the group of rotational symmetries of the tetrahedron, the octahedron, and the icosahedron. These are called the tetrahedral group $T$, the octahedral group $O$, and the icosahedral group $I$.

Lagrange’s Theorem enables us easily to give an upper bound for the sizes of these groups.

Lemma 5.2. The following upper bounds hold:

(1) $|T| \le 12$;
(2) $|O| \le 24$; and
(3) $|I| \le 60$.

Proof. First consider $T$. Let $v$ be a vertex of the tetrahedron $S(4)$. If $\rho$ is a rotation in $T$ fixing the vertex $v$, then $\rho$ must also fix the center $c$ of the opposite face and hence, $\rho$ is either the identity map or $\rho$ is a rotation through $120^\circ$ (either clockwise or counterclockwise) about the axis determined by $v$ and $c$. Hence the stabilizer $T_v$ satisfies: $|T_v| \le 3$. Moreover, every rotation of $S(4)$ moves $v$ to some vertex of $S(4)$. Hence the orbit $v^T$ satisfies: $|v^T| \le 4$. Hence by Lagrange’s Theorem,

$$|T| = |v^T| \cdot |T_v| \le 4 \cdot 3 = 12.$$

For the octahedron and the icosahedron, each triangular face $f$ is opposite an antipodal triangular face $-f$. Any non-identity rotation $\rho$ fixing $f$ (as a set) must fix the centers of both $f$ and $-f$ and induce a $120^\circ$ rotation about the axis joining those centers. Hence $|O_f| \le 3$ and $|I_f| \le 3$. As the octahedron has 8 faces and the icosahedron has 20 faces, we conclude, using Lagrange’s Theorem, that

$$|O| = |f^O| \cdot |O_f| \le 8 \cdot 3 = 24,$$

and

$$|I| = |f^I| \cdot |I_f| \le 20 \cdot 3 = 60,$$

completing the proof of the lemma.

Now we wish to argue that these upper bounds are actually achieved. We will only do this carefully in the first two cases.

The Symmetries of the Tetrahedron

There is a clever trick for describing a tetrahedron in a way that makes it easy to compute its symmetries. Inside $\mathbb{R}^4$, we take the four unit vectors $v_1 = (1, 0, 0, 0)$, $v_2 = (0, 1, 0, 0)$, $v_3 = (0, 0, 1, 0)$, and $v_4 = (0, 0, 0, 1)$. Each of these points is $\sqrt{2}$ units from each of the others. So these points are the vertices of a regular tetrahedron in $\mathbb{R}^4$. Indeed, the solid tetrahedron is the convex hull of these four points:

$$\{t_1 v_1 + t_2 v_2 + t_3 v_3 + t_4 v_4 : t_1 + t_2 + t_3 + t_4 = 1,\ 0 \le t_i \le 1 \ \forall i\}.$$

This tetrahedron is contained in the 3-dimensional space defined by the equation:

$$x + y + z + w = 1.$$

Now consider the subgroup $P_4$ of the orthogonal group $O(4)$ consisting of all $4 \times 4$ permutation matrices. Each permutation $\sigma$ in $P_4$ permutes the four vertices $v_1, v_2, v_3, v_4$, and hence is a symmetry of the tetrahedron $S(4)$. Moreover, $\sigma$ leaves invariant

$$\mathbb{R}^3 := \{(x, y, z, w) \in \mathbb{R}^4 : x + y + z + w = 1\},$$

and, since $\sigma$ is an isometry of $\mathbb{R}^4$, $\sigma$ induces an isometry of $\mathbb{R}^3$ which is a symmetry of $S(4)$.

Since any isometry of $S(4)$ is completely determined by its action on the four vertices of $S(4)$, and since $P_4 \cong S_4$, it follows that $P_4 = \mathrm{Isom}(S(4))$. Thus we have:

Theorem 5.3. If $S(4)$ is a tetrahedron, then $\mathrm{Isom}(S(4)) \cong S_4$, and the group $T$ of rotational symmetries of $S(4)$ is isomorphic to a subgroup of $S_4$ of cardinality 12.

Proof. The argument above shows that $\mathrm{Isom}(S(4)) \cong S_4$. Since $O(3) = SO(3) \times \{\pm I\}$, either all the symmetries in $\mathrm{Isom}(S(4))$ are rotations or half are. Since we have shown that $|T| \le 12$, we conclude that exactly half the symmetries are rotations and $|T| = 12$. Thus, $T$ is isomorphic to a subgroup of $S_4$ of cardinality 12.

In Chapter 4, Exercise 17a, you have shown that there is only one subgroup of $S_4$ of cardinality 12. It is called the alternating group of degree 4, and is denoted $A_4$. It contains eight elements of order 3: for each vertex $v$ of $S(4)$, there are two $120^\circ$ rotations about an axis passing through $v$ and through the center of the face of $S(4)$ opposite $v$. The remaining three non-identity rotational symmetries are obtained as follows: The six edges of $S(4)$ subdivide into three pairs $\{e, e'\}$, where $e$ and $e'$ have no vertex in common. If $L_{e,e'}$ is the line through the midpoints of $e$ and $e'$, then the $180^\circ$ rotation of $\mathbb{R}^3$ about this line maps $e$ to $e$ and $e'$ to $e'$, and hence is a symmetry of $S(4)$ of order 2.
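The count of elements of each order in $A_4$ can be confirmed by enumeration. This is a minimal sketch under the assumption that $A_4$ is realized as the even permutations of $\{0, 1, 2, 3\}$ (a standard description, though the notes define $A_4$ only as the unique subgroup of order 12); the helper names `sign` and `order` are our own.

```python
from itertools import permutations
from collections import Counter

def sign(p):
    """Sign of the permutation p (a tuple of images of 0..n-1), via its inversion count."""
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1

def order(p):
    """Smallest k >= 1 such that the k-th power of p is the identity."""
    identity = tuple(range(len(p)))
    q, k = p, 1
    while q != identity:
        q = tuple(p[i] for i in q)  # replace q by p composed with q
        k += 1
    return k

# The even permutations of four points.
A4 = [p for p in permutations(range(4)) if sign(p) == 1]
order_counts = Counter(order(p) for p in A4)
print(len(A4), dict(order_counts))
```

The tally is one identity, three elements of order 2 (the edge-pair rotations), and eight elements of order 3 (the vertex rotations).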

The Rotational Symmetries of the Octahedron

We may construct the octahedron in $\mathbb{R}^3$ by taking as vertices the six points $(\pm 1, 0, 0)$, $(0, \pm 1, 0)$, $(0, 0, \pm 1)$. Notice that the four points lying in the $x,y$-plane form the vertices of a square of side $\sqrt{2}$, and each of these points is at distance $\sqrt{2}$ from both $(0, 0, 1)$ and $(0, 0, -1)$. Hence, joining $(0, 0, 1)$ to the vertices at the endpoints of an edge of the square in the $x,y$-plane gives an equilateral triangle, and similarly for $(0, 0, -1)$. These are the eight triangular faces of the octahedron $S(8)$.

Let $\sigma$ be any permutation of $\{1, 2, 3\}$. Let $\epsilon_i = 1$ or $-1$, for each $i$, $1 \le i \le 3$. Setting $e_1 = (1, 0, 0)$, $e_2 = (0, 1, 0)$, and $e_3 = (0, 0, 1)$, we consider the unique linear operator $\varphi$ which is the extension to $\mathbb{R}^3$ of the function:

$$\varphi(e_1) = \epsilon_1 \cdot e_{\sigma(1)},$$

$$\varphi(e_2) = \epsilon_2 \cdot e_{\sigma(2)},$$

$$\varphi(e_3) = \epsilon_3 \cdot e_{\sigma(3)}.$$

For each choice of $\sigma$ and of the $\epsilon_i$, $\varphi$ is an isometry of $\mathbb{R}^3$ permuting the vertices of the octahedron $S(8)$. Hence each $\varphi$ is a symmetry of $S(8)$. The matrix representing $\varphi$ with respect to the standard basis $\{e_1, e_2, e_3\}$ for $\mathbb{R}^3$ is a signed permutation matrix.

Thus the Weyl group $W(3)$ is a subgroup of $\mathrm{Isom}(S(8))$. As we have seen, $|W(3)| = 2^3 \cdot 3! = 48$. Thus $|\mathrm{Isom}(S(8))| \ge 48$. On the other hand, we have computed, using Lagrange’s Theorem, that $|O| \le 24$, and we know that $|O| \ge \frac{1}{2}|\mathrm{Isom}(S(8))|$. The only possible conclusion is that $W(3) = \mathrm{Isom}(S(8))$ and $O$ is a subgroup of $W(3)$ of index 2, with $|O| = 24$.
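The counts 48 and 24 can be checked directly. The sketch below is illustrative only: it encodes a signed permutation as a pair `(sigma, eps)` with 0-based indices, a representation chosen here for convenience, and uses the fact that the determinant of a signed permutation matrix is the sign of the permutation times the product of the signs.

```python
from itertools import permutations, product

def perm_sign(p):
    """Sign of the permutation p, computed from its inversion count."""
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1

# A signed permutation (sigma, eps) sends the basis vector e_i to eps[i] * e_{sigma(i)}.
W3 = [
    (sigma, eps)
    for sigma in permutations(range(3))
    for eps in product((1, -1), repeat=3)
]

def det(sigma, eps):
    """Determinant of the signed permutation matrix."""
    return perm_sign(sigma) * eps[0] * eps[1] * eps[2]

def apply(elem, v):
    """Apply the linear operator to the vector v = (v_0, v_1, v_2)."""
    sigma, eps = elem
    w = [0, 0, 0]
    for i in range(3):
        w[sigma[i]] += eps[i] * v[i]
    return tuple(w)

vertices = {(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)}

# The rotations are the elements of determinant +1.
rotations = [g for g in W3 if det(*g) == 1]
preserves_octahedron = all(
    {apply(g, v) for v in vertices} == vertices for g in W3
)
print(len(W3), len(rotations), preserves_octahedron)
```

Every one of the 48 operators permutes the six vertices, and exactly half of them have determinant $+1$.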

As remarked before, the centers of the eight faces of the octahedron $S(8)$ are the vertices of a cube $C$ having the same group $O$ of rotational symmetries. Let $\Delta$ be the set of four long diagonals joining opposite vertices of $C$. If $\rho$ is a symmetry of $C$ which maps each long diagonal to itself, then $\rho = I$ or $\rho = -I$. (Exercise.) Since $-I$ is not a rotation, $I$ is the only rotational symmetry in $O$ fixing every element of $\Delta$. Thus if

$$\Phi : O \to \mathrm{Sym}(\Delta)$$

is the function which maps each rotation in $O$ to the permutation it induces on $\Delta$, then $\Phi$ is an injective homomorphism. But $|O| = 24 = |\mathrm{Sym}(\Delta)|$. Hence $O \cong \mathrm{Sym}(\Delta) \cong S_4$.

Thus we have proved the following theorem.

Theorem 5.4. The group $O$ of rotational symmetries of the octahedron (or the cube) is isomorphic to the symmetric group $S_4$.

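We can enumerate the 24 symmetries in $O$ as follows. First, regarding them as symmetries of the octahedron, we have:

(1) 6 $90^\circ$ rotations (clockwise or counterclockwise) about the axes joining opposite vertices;

(2) 3 $180^\circ$ rotations about the axes joining opposite vertices;

(3) 8 $120^\circ$ rotations (clockwise or counterclockwise) about the axes joining opposite faces;

(4) 6 $180^\circ$ rotations about the axes joining opposite edges; and

(5) 1 identity rotation.

Regarded as symmetries of the cube $C$, we have:

(1) 6 $90^\circ$ rotations (clockwise or counterclockwise) about the axes joining opposite faces;

(2) 3 $180^\circ$ rotations about the axes joining opposite faces;

(3) 8 $120^\circ$ rotations (clockwise or counterclockwise) about the axes joining opposite vertices;

(4) 6 $180^\circ$ rotations about the axes joining opposite edges; and

(5) 1 identity rotation.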

The Rotational Symmetries of the Icosahedron

We shall not attempt a rigorous determination of the group Icos of rotational symmetries of the icosahedron. We shall simply state the following facts:

Theorem 5.5. The group Icos of rotational symmetries of the icosahedron (and the dodecahedron) is isomorphic to the alternating group of degree 5, the unique subgroup of cardinality 60 in the symmetric group $S_5$.

We can enumerate the 60 symmetries in Icos as follows. First, regarding them as symmetries of the icosahedron, we have:

(1) 12 $72^\circ$ rotations (clockwise or counterclockwise) about the axes joining opposite vertices;

(2) 12 $144^\circ$ rotations (clockwise or counterclockwise) about the axes joining opposite vertices;

(3) 20 $120^\circ$ rotations (clockwise or counterclockwise) about the axes joining opposite triangular faces;

(4) 15 $180^\circ$ rotations about the axes joining opposite edges; and

(5) 1 identity rotation.

Regarded as symmetries of the dodecahedron, we have:

(1) 12 $72^\circ$ rotations (clockwise or counterclockwise) about the axes joining opposite pentagonal faces;

(2) 12 $144^\circ$ rotations (clockwise or counterclockwise) about the axes joining opposite pentagonal faces;

(3) 20 $120^\circ$ rotations (clockwise or counterclockwise) about the axes joining opposite vertices;

(4) 15 $180^\circ$ rotations about the axes joining opposite edges; and

(5) 1 identity rotation.

Exercises

1. Consider the pattern $F$ in $\mathbb{R}^2$ which is the tiling of the plane by unit squares, whose vertices have coordinates $(a, b)$ with $a, b \in \mathbb{Z}$. Let $G = \mathrm{Isom}(F)$. Prove that $G = T \cdot G_0$, where

$$T := \{T_{(a,b)} : a, b \in \mathbb{Z}\}$$

is the normal abelian subgroup of translational symmetries of $F$, and $G_0$ is the stabilizer in $G$ of the point $(0, 0)$, with $G_0 \cong D_4$, the group of all symmetries of the square.

2.(a) Consider the pattern $F_1$ in $\mathbb{R}^2$ which is the tiling of the plane by congruent equilateral triangles with sides of unit length, including the triangle $T$ having vertices at $(0, 0)$, $(1, 0)$, and $(\frac{1}{2}, \frac{\sqrt{3}}{2})$. Prove: $\mathrm{Isom}(F_1) = T_1 \cdot H$, where

$$T_1 = \{a \cdot T_{(1,0)} + b \cdot T_{(\frac{1}{2}, \frac{\sqrt{3}}{2})} : a, b \in \mathbb{Z}\}$$

is the normal abelian subgroup of translational symmetries of $F_1$, and

$$H = \mathrm{Isom}(F_1)_0 \cong D_6,$$

the group of all symmetries of the regular hexagon. [Note: There are six triangles meeting at the point $(0, 0)$, comprising the six wedges of a regular hexagon centered at $(0, 0)$.]

(b) Let $P$ be the center of the triangle $T$ from (a). Find the coordinates of $P$. Prove: The stabilizer, $\mathrm{Isom}(F_1)_P$, of the point $P$ is isomorphic to $D_3$, the group of all symmetries of the equilateral triangle.

3. Let $C$ be a cube in $\mathbb{R}^3$ centered at $(0, 0, 0)$. Prove: If $f \in \mathrm{Isom}(C)$ and $f$ maps each long diagonal of $C$ to itself (but not necessarily pointwise), then $f = I$ or $f = -I$.

4. In this exercise, you may assume that the icosahedral group Icos transitively permutes the sets $V$, $F$, and $E$ of all vertices, faces, and edges of the icosahedron, respectively.

(a) Prove: The set of 15 elements of order 2 in Icos forms a single Icos-conjugacy class.

(b) Prove: Icos contains 10 subgroups of cardinality 3, and Icos permutes these subgroups transitively under conjugation.

(c) Prove: Icos contains 6 subgroups of cardinality 5, and Icos permutes these subgroups transitively under conjugation.

(d) Prove: The only normal subgroups of Icos are the identity subgroup $\{I\}$ and the full group Icos. [Hint: Recall that if $N$ is a subgroup of Icos, then $|N|$ divides $60 = |\mathrm{Icos}|$. Moreover, if $N$ is a normal subgroup of Icos and $N$ contains the subgroup $H$ of Icos, then $N$ contains $g \circ H \circ g^{-1}$ for all $g \in \mathrm{Icos}$. Now use (a), (b), and (c).]

Definition. We say that a group $G$ is a simple group if $\{I\}$ and $G$ are the only normal subgroups of $G$.

5. Prove: If $G$ is an abelian group, then $G$ is a simple group if and only if $|G| = p$ for some prime $p$.

The group Icos is the smallest non-abelian simple group.

Chapter 6: The Orbit Counting Formula

We now can derive a very useful counting formula. It was apparently first discovered by Cauchy in 1845. It was rediscovered by Frobenius in 1887, and was included by Burnside in his textbook on group theory. Sometimes it is misnamed the Burnside Counting Formula.

The Orbit Counting Formula. Let $G$ be a finite group of permutations of a finite set $X$. The number of $G$-orbits on $X$ equals the average number of fixed points on $X$ of the elements of $G$. In other words, let $f : G \to \mathbb{N} \cup \{0\}$ be the function defined by

$$f(g) = |\{x \in X : g(x) = x\}|.$$

Let $r$ denote the number of $G$-orbits on $X$. Then

$$r = \frac{1}{|G|} \sum_{g \in G} f(g).$$

The keys to the proof are Lagrange’s Orbit-Stabilizer Theorem and the following elementary but useful counting observation:

Lemma 6.1. Let $G$ be a finite group of permutations of a finite set $X$. Then

$$\sum_{g \in G} f(g) = \sum_{x \in X} |G_x|.$$

Proof. The easiest way to visualize why this is true is to imagine a rectangular array (a matrix, if you will) whose rows are labeled by the elements of $G$ and whose columns are labeled by the points in the set $X$. The $(g, x)$ entry of this array is 1 if $g(x) = x$, and is 0 if $g(x) \ne x$. Now, add up all of the entries in the matrix. Adding one row at a time gives the answer $\sum_{g \in G} f(g)$. Adding one column at a time gives the answer $\sum_{x \in X} |G_x|$, noting that the $(g, x)$ entry is 1 if and only if $g \in G_x$. This proves the lemma.

Next we make the following observation.

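Lemma 6.2. Let $G$ be a finite group of permutations of a finite set $X$. Let $\mathcal{O}$ be any $G$-orbit on $X$. Then

$$\sum_{x \in \mathcal{O}} |G_x| = |G|.$$

Thus

$$\frac{1}{|G|} \sum_{x \in \mathcal{O}} |G_x| = 1,$$

for every $G$-orbit $\mathcal{O}$ on $X$.

Proof. Lagrange’s Orbit-Stabilizer Theorem tells us that, for any $x \in \mathcal{O}$,

$$|G| = |G_x| \cdot |\mathcal{O}|.$$

In particular, this means that, for all $x \in \mathcal{O}$,

$$|G_x| = \frac{|G|}{|\mathcal{O}|},$$

independent of the choice of $x$. But then

$$\sum_{x \in \mathcal{O}} |G_x| = |\mathcal{O}| \cdot |G_x| = |G|,$$

proving the lemma.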

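Now we can quickly complete the proof of the Orbit Counting Formula. By Lemma 6.1,

$$\frac{1}{|G|} \sum_{g \in G} f(g) = \frac{1}{|G|} \sum_{x \in X} |G_x|.$$

Let $\mathcal{O}_1, \mathcal{O}_2, \ldots, \mathcal{O}_r$ be the $G$-orbits on $X$. Then, counting orbit by orbit, and using Lemma 6.2, we have

$$\frac{1}{|G|} \sum_{x \in X} |G_x| = \sum_{i=1}^{r} \left( \frac{1}{|G|} \sum_{x \in \mathcal{O}_i} |G_x| \right) = \sum_{i=1}^{r} 1 = r.$$

Thus

$$\frac{1}{|G|} \sum_{g \in G} f(g) = r,$$

completing the proof of the Orbit Counting Formula.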
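Both the double count of Lemma 6.1 and the formula itself are easy to test on a toy case. The sketch below uses a hypothetical example that is not in the notes: the four rotations of a square acting on the 16 ways to label its corners with the letters a and b. The row sums of the array from the proof are collected in `fixed_counts`, and the column sums in `stabilizer_sizes`.

```python
from itertools import product

def rotate(k):
    """Action of rotation by k quarter-turns on labelings (tuples of length 4)."""
    return lambda c: tuple(c[(i - k) % 4] for i in range(4))

# The group: rotations by 0, 1, 2, 3 quarter-turns.
G = [rotate(k) for k in range(4)]
# The set acted on: all labelings of the four corners by 'a' or 'b'.
X = list(product("ab", repeat=4))

# Row sums of the 0/1 array: f(g) for each g.
fixed_counts = [sum(1 for x in X if g(x) == x) for g in G]
# Column sums of the same array: |G_x| for each x.
stabilizer_sizes = [sum(1 for g in G if g(x) == x) for x in X]

# Direct orbit count, for comparison with the average of f.
orbits = {frozenset(g(x) for g in G) for x in X}
average_fixed = sum(fixed_counts) // len(G)
print(fixed_counts, sum(stabilizer_sizes), len(orbits), average_fixed)
```

The two sums agree, and the average number of fixed points equals the number of orbits found directly.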

You may well object that you have never counted orbits in your life and can’t imagine a situation where it would be useful to do so. Here are a couple of examples where orbits arise in a natural way.

Orbit Example 1. Let’s call an organic hexad a ring-shaped molecule consisting of six atoms, each of which is either a carbon atom (C) or a hydrogen atom (H). How many different organic hexads are possible? [Assume that all bonds are isomorphic.]

What makes this an orbit problem is the fact that each molecule can be rotated into six different positions, and can be flipped across six axes. So each molecule can be thought of as an orbit of the symmetry group $D_6$ of the regular hexagon on the set $S$ of all possible labelings of each vertex with a letter C or H. Since each vertex has two possible labelings and the choices are independent, there are $2^6 = 64$ different labelings, i.e. $|S| = 64$. But we want to count the number of orbits of $D_6$ on $S$.

Let’s count fixed points instead. Suppose $\rho$ is a $60^\circ$ rotation in either direction. If $L$ is a labeling with $\rho(L) = L$, then every vertex must have the same label as its adjacent vertex, i.e. the labeling $L$ must be “monochromatic”, i.e., $L = \{C, C, C, C, C, C\}$ or $\{H, H, H, H, H, H\}$. So

$$f(\rho) = 2.$$

On the other hand, if $\rho^2$ is a $120^\circ$ rotation, then adjacent vertices may be labeled either the same or differently, but every other vertex must have the same label. Hence there are two additional labelings fixed by $\rho^2$:

$\{C, H, C, H, C, H\}$ and $\{H, C, H, C, H, C\}$.

Thus

$$f(\rho^2) = 4.$$

Likewise, labelings fixed by a $180^\circ$ rotation must have opposite vertices labeled the same. So there are eight labelings fixed by $\rho^3$:

$$f(\rho^3) = 8.$$

There are two different kinds of reflections. If $r_v$ is a reflection fixing two opposite vertices, then these two vertices may be labeled any way, but mirror-image vertices must have the same label. So

$$f(r_v) = 2 \times 2 \times 2 \times 2 = 16.$$

If $r_e$ is a reflection fixing no vertices, then there are three mirror-image pairs and

$$f(r_e) = 2 \times 2 \times 2 = 8.$$

Next we add up the number of fixed points, keeping track of the number of symmetries of each type:

$$\sum_{g \in D_6} f(g) = f(I) + 2f(\rho) + 2f(\rho^2) + f(\rho^3) + 3f(r_v) + 3f(r_e) = 64 + 4 + 8 + 8 + 48 + 24 = 156.$$

Finally we divide by $|D_6|$ to get that the number of distinct organic hexads is $\frac{156}{12} = 13$.

Note: Don’t tell your chemistry professor about “organic hexads”. They have no basis in chemical reality. However, similar arguments can be used in genuine chemistry problems.
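The hexad count can be double-checked by brute force. This is a sketch under our own conventions: vertices are numbered 0 to 5 around the ring, rotations are the maps $i \mapsto i + k$ and reflections the maps $i \mapsto k - i$ (mod 6), and the names `D6`, `act`, and `fixed` are illustrative.

```python
from itertools import product

N = 6

def rotation(k):
    """The rotation sending vertex i to vertex i + k (mod 6)."""
    return tuple((i + k) % N for i in range(N))

def reflection(k):
    """The reflection sending vertex i to vertex k - i (mod 6)."""
    return tuple((k - i) % N for i in range(N))

# The twelve symmetries of the regular hexagon.
D6 = [rotation(k) for k in range(N)] + [reflection(k) for k in range(N)]

def act(g, labeling):
    """Move the label at vertex i to vertex g[i]."""
    out = [None] * N
    for i, label in enumerate(labeling):
        out[g[i]] = label
    return tuple(out)

# All 2^6 labelings of the vertices by C or H.
S = list(product("CH", repeat=N))

# f(g) for each symmetry g, then the Orbit Counting Formula.
fixed = [sum(1 for L in S if act(g, L) == L) for g in D6]
hexads_by_formula = sum(fixed) // len(D6)

# Independent check: list the orbits directly.
hexads_by_orbits = len({frozenset(act(g, L) for g in D6) for L in S})
print(sum(fixed), hexads_by_formula, hexads_by_orbits)
```

The fixed-point total is 156, and both methods give 13 hexads.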

Before proceeding to the next example, we make a useful observation, which is implicit in the last calculation.

Lemma 6.3. Let $G$ be a group of permutations of a finite set $X$ and let $f(g)$ denote the number of fixed points of the element $g \in G$. If $h \circ g \circ h^{-1}$ is any conjugate of $g$ in $G$, then

$$f(g) = f(h \circ g \circ h^{-1}).$$

Proof. For any element $g \in G$, let

$$F(g) = \{x \in X : g(x) = x\}.$$

Then $f(g) = |F(g)|$. We shall show that

$$F(h \circ g \circ h^{-1}) = h(F(g)).$$

Then, since $h$ is a bijective mapping on $X$,

$$|F(h \circ g \circ h^{-1})| = |F(g)|,$$

as claimed. First let $x \in F(g)$. Then

$$(h \circ g \circ h^{-1})(h(x)) = (h \circ g)(h^{-1}(h(x))) = h(g(x)) = h(x).$$

Thus $h(x) \in F(h \circ g \circ h^{-1})$ whenever $x \in F(g)$, i.e.

$$h(F(g)) \subseteq F(h \circ g \circ h^{-1}).$$

Secondly, let $y \in F(h \circ g \circ h^{-1})$. We wish to show that $h^{-1}(y) \in F(g)$. Now

$$y = (h \circ g)(h^{-1}(y)).$$

So, applying $h^{-1}$ to both sides, we get

$$h^{-1}(y) = h^{-1}((h \circ g)(h^{-1}(y))) = (h^{-1} \circ h)(g(h^{-1}(y))) = g(h^{-1}(y)).$$

So $h^{-1}(y) \in F(g)$, as desired, and hence $y = h(h^{-1}(y)) \in h(F(g))$. Thus

$$F(h \circ g \circ h^{-1}) \subseteq h(F(g)),$$

and so

$$F(h \circ g \circ h^{-1}) = h(F(g)),$$

as claimed.

Thus, in order to perform the calculations required for the Orbit Counting Formula, we only need to compute $f(g)$ for one representative of each conjugacy class of $G$, and we need to know the size of each conjugacy class. This is in fact what we did in Example 1, where the “types” of rotations and reflections were actually the different conjugacy classes of the group $D_6$.

Example 2. Let’s call a crystal tetrad a crystalline molecule in the shape of a tetrahedron with each vertex containing either a silicon atom (Si), an oxygen atom (O), or a hydrogen atom (H). How many different crystal tetrads are possible?

This problem is very similar to the previous one, only now we are considering the orbits of the symmetry group of the tetrahedron on the set $S$ of all labelings of the vertices of the tetrahedron with the label Si, O or H. Thus there are $3^4 = 81$ possible labelings, i.e. $|S| = 81$.

We have seen that the full symmetry group $\mathrm{Isom}(S(4))$ of the tetrahedron is isomorphic to $S_4$, acting as the group of all possible permutations of the four vertices of the tetrahedron. We have also seen that two permutations in $S_n$ are conjugate if and only if they have the same cycle structure. Here we must compute fixed points for five different permutations: $(1)$, $(1, 2)$, $(1, 2, 3)$, $(1, 2, 3, 4)$, and $(1, 2)(3, 4)$.

If a labeling $L$ is fixed by the isometry $g$, then two vertices of the tetrahedron which are in the same $g$-orbit must have the same label, while vertices in different $g$-orbits may be labeled independently. Thus we easily establish the following table:

Type(g)         |Type(g)|   f(g)
(1)             1           81
(1, 2)          6           27
(1, 2, 3)       8           9
(1, 2, 3, 4)    6           3
(1, 2)(3, 4)    3           9

Thus we conclude that the number of crystal tetrads is

$$\frac{(1 \times 81) + (6 \times 27) + (8 \times 9) + (6 \times 3) + (3 \times 9)}{24} = \frac{27 + 54 + 24 + 6 + 9}{8} = \frac{120}{8} = 15.$$
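The table and the final count can be reproduced mechanically. The following sketch assumes, as in the table, that the acting group is all of $S_4$ permuting the four vertices (numbered 0 to 3 here); the helper `cycle_count` and the other names are illustrative. It uses the observation that a labeling is fixed by $g$ exactly when it is constant on each cycle of $g$.

```python
from itertools import permutations, product

def cycle_count(p):
    """Number of cycles (fixed points included) of the permutation p of 0..n-1."""
    seen, cycles = set(), 0
    for start in range(len(p)):
        if start not in seen:
            cycles += 1
            i = start
            while i not in seen:
                seen.add(i)
                i = p[i]
    return cycles

S4 = list(permutations(range(4)))
colors = 3  # Si, O, H

# f(g) = colors ** (number of cycles of g), summed over the group.
total_fixed = sum(colors ** cycle_count(g) for g in S4)
tetrads = total_fixed // len(S4)

def act(g, labeling):
    """Move the label at vertex i to vertex g[i]."""
    out = [None] * 4
    for i in range(4):
        out[g[i]] = labeling[i]
    return tuple(out)

# Independent check: enumerate the orbits on all 81 labelings.
labelings = list(product(range(colors), repeat=4))
orbits = {frozenset(act(g, L) for g in S4) for L in labelings}
print(total_fixed, tetrads, len(orbits))
```

The fixed-point total is 360, and both the formula and the direct enumeration give 15.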

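Here is a somewhat more complicated example, taken from “Abstract Algebra” by Ted Shifrin.

Example 3. Suppose we are going to paint two faces of a cube red, two white, and two blue. How many differently colored cubes are possible?

First we need to count the size of the set $C$ of possible colorings. We have to choose two faces out of six to color red. We can do this in $\frac{6 \times 5}{2} = 15$ ways. Then we must choose two of the remaining four faces to color white. We can do this in $\frac{4 \times 3}{2} = 6$ ways. So there are $15 \times 6 = 90$ colorings.

Next we must count the number of orbits of the octahedral group $O$ on $C$. Again, we do this by counting fixed points for each type of rotation in $O$. We recall the types:

(1) 6 $90^\circ$ rotations about the axes joining opposite faces;
(2) 3 $180^\circ$ rotations about the axes joining opposite faces;
(3) 8 $120^\circ$ rotations about the axes joining opposite vertices;
(4) 6 $180^\circ$ rotations about the axes joining opposite edges; and
(5) 1 identity rotation.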

If $\rho$ is a $90^\circ$ rotation and $c$ is a coloring fixed by $\rho$, then four faces of $c$ must have the same color, which is impossible. So $f(\rho) = 0$. Similarly, if $\rho$ is a $120^\circ$ rotation, then the three faces having a common vertex must be colored the same. So, again $f(\rho) = 0$.

If $\rho$ is a $180^\circ$ rotation about the axis through the centers of two opposite faces, then there are $3 \times 2 \times 1$ colorings fixed by $\rho$: the rotated faces must be colored in pairs. So the fixed faces must share the same third color. Thus $f(\rho) = 6$.

If $\rho$ is a $180^\circ$ rotation about the axis joining midpoints of opposite edges, then pairs of interchanged faces must share the same color. Again, $f(\rho) = 6$.

Of course, $f(I) = 90$, and so we have that the answer is

$$\frac{3 \times 6 + 6 \times 6 + 1 \times 90}{24} = \frac{144}{24} = 6.$$
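As a check on this count, one can build the 24 rotations as permutations of the six faces and apply the formula by machine. The sketch below makes several assumptions of its own: faces are indexed 0 to 5 as $+x, -x, +y, -y, +z, -z$, the group is generated from two quarter turns `ROT_Z` and `ROT_X`, and colorings are tuples of the letters R, W, B.

```python
from itertools import permutations

# Faces indexed 0..5 as +x, -x, +y, -y, +z, -z; g[i] is the image of face i.
ROT_Z = (2, 3, 1, 0, 4, 5)  # quarter turn about the z-axis: +x -> +y -> -x -> -y
ROT_X = (0, 1, 4, 5, 3, 2)  # quarter turn about the x-axis: +y -> +z -> -y -> -z

def compose(a, b):
    """The permutation 'a after b'."""
    return tuple(a[b[i]] for i in range(6))

def generate(gens):
    """Close a set of generators under composition (enough in a finite group)."""
    identity = tuple(range(6))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

O = generate([ROT_Z, ROT_X])

# All colorings with two red, two white, and two blue faces.
colorings = set(permutations("RRWWBB"))

def act(g, c):
    """Move the color on face i to face g[i]."""
    out = [None] * 6
    for i in range(6):
        out[g[i]] = c[i]
    return tuple(out)

fixed_total = sum(1 for g in O for c in colorings if act(g, c) == c)
cubes = fixed_total // len(O)
print(len(O), len(colorings), fixed_total, cubes)
```

The generated group has 24 elements, there are 90 colorings, and the fixed-point total of 144 gives 6 distinct cubes.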

We conclude this section with an amusing example.

Example 4. Let $p$ be a prime number. Imagine a pinwheel with $p$ identically shaped pins. Suppose that there are $n$ different colors of pins to choose from. How many different pinwheels are possible?

Since the pins are different, front to back, only rotations are possible, not reflections. Thus we are counting the number of orbits of the cyclic group of $p$ rotations on the set $S$ of $n^p$ different colorings. For any non-identity rotation $\rho$, there is only one $\rho$-orbit on the $p$ pins. Hence the only colorings fixed by $\rho$ are the monochromatic colorings. Thus

$$f(\rho) = n$$

for all non-identity $\rho$, of which there are $p - 1$. Also, clearly

$$f(I) = n^p.$$

Hence the number of different pinwheels is

$$\frac{(1 \times n^p) + ((p - 1) \times n)}{p} = n + \frac{n^p - n}{p}.$$

Since the number of different pinwheels must be an integer, we obtain the following corollary.

Fermat’s Little Theorem. Let $p$ be a prime and $n$ be any natural number. Then

$$n^p \equiv n \pmod{p}.$$
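For small primes the pinwheel formula can be compared with a direct enumeration. The sketch below is an illustration only; the function names are our own, and the brute-force count is practical only for small $n$ and $p$.

```python
from itertools import product

def pinwheels_by_formula(n, p):
    """(n^p + (p - 1) * n) / p, the Orbit Counting Formula for a prime p."""
    total = n ** p + (p - 1) * n
    assert total % p == 0  # this divisibility is Fermat's Little Theorem
    return total // p

def pinwheels_by_orbits(n, p):
    """Count colorings of p pins with n colors up to rotation, by direct search."""
    seen, count = set(), 0
    for c in product(range(n), repeat=p):
        if c not in seen:
            count += 1
            # Mark every rotation of c as already counted.
            seen.update(c[k:] + c[:k] for k in range(p))
    return count

for p in (2, 3, 5):
    for n in (1, 2, 3):
        print(p, n, pinwheels_by_formula(n, p), pinwheels_by_orbits(n, p))
```

In every case tried the two counts agree, and the divisibility check inside `pinwheels_by_formula` is exactly the congruence $n^p \equiv n \pmod{p}$.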

Exercises

1. How many different colored tetrahedra are there in which each face is colored either red or white or blue?

2. How many different colored cubes are there in which each face is colored either red or white or blue?

3. How many different pinwheels with 6 identically shaped pins are there, if each pin can be colored one of 4 different colors?

4. How many different pinwheels with 8 identically shaped pins are there, if each pin can be colored one of 4 different colors?

5. Instead of pinwheels, consider bracelets with 8 identical size spherical beads, each of one of 4 different colors. How many different bracelets are there? [Now reflectional symmetries must be considered, in addition to rotational symmetries.]

6. A toy pyramid in the shape of a regular tetrahedron is built out of six pegs. Count the number of different designs if there are

(a) two each of red, white, and blue pegs; or

(b) three each of red and white pegs.

7. The skeleton of a cube is made out of twelve pegs. How many distinguishable such cubes can be made from:

(a) seven blue and five white pegs?

(b) six blue, two white, and four red pegs?

8. A soccer ball is more or less a regular dodecahedron in which the vertices have been replaced by regular hexagons. An artistic soccer ball manufacturer wants to make soccer balls in which the hexagons are all black, but four of the pentagons are colored silver, four gold, and four scarlet. How many differently colored soccer balls can he manufacture?

9. How many different square patchwork quilts, four patches by four patches, can be made from six red, four white, and six blue squares, assuming that the quilts

(a) cannot be turned over?

(b) can be turned over?

10. Prove: Let $G$ be a finite group of permutations acting transitively on the finite set $X$, where $|X| > 1$. There exists at least one element $g \in G$ having no fixed point on $X$, i.e., such that $g(x) \ne x$ for all $x \in X$.

52

7. Finite Subgroups of SO(3)

In this concluding chapter on symmetries and isometries, we shall combine all of the ideas we have developed so far to prove a very beautiful theorem of mid-19th century mathematics.

Theorem 7.1. Let G be a finite group of rotations of $\mathbb{R}^3$. Then one of the following conclusions holds:

(1) G is a finite cyclic group of rotations, all about the same axis L; or

(2) G fixes a line L as a set and a point P on L, and G induces on the plane $L^\perp$ perpendicular to L and passing through P the full dihedral symmetry group of some regular polygon; or

(3) |G| = 12 and G induces the full group of rotational symmetries of some tetrahedron; or

(4) |G| = 24 and G induces the full group of rotational symmetries of some octahedron; or

(5) |G| = 60 and G induces the full group of rotational symmetries of some icosahedron.

This is a very remarkable theorem, obviously closely related to the theorem that there are only five Platonic solids. It challenges our intuition. Although there are infinitely many different 2-dimensional rotation groups, one for each regular polygon, there are only three different essentially 3-dimensional rotation groups. Somehow, the extra dimension provides less freedom, not more.

In fact, we shall prove an even stronger statement.

Definition 7.2. Let $f : \mathbb{R}^n \to \mathbb{R}^n$ be a function. We call f an affine transformation if there exists a vector $v \in \mathbb{R}^n$ and a linear operator $g : \mathbb{R}^n \to \mathbb{R}^n$ such that $f = T_v \circ g$, where $T_v : \mathbb{R}^n \to \mathbb{R}^n$ is translation by the vector v. We denote by $\mathrm{Aff}(\mathbb{R}^n)$ the group of all invertible affine transformations of $\mathbb{R}^n$.

First we prove the following striking fact:

Theorem 7.3. Let G be a finite subgroup of $\mathrm{Aff}(\mathbb{R}^n)$. Then G fixes a point, i.e. there exists a point $P \in \mathbb{R}^n$ such that $g(P) = P$ for all $g \in G$.

This statement is certainly false for many infinite subgroups of $\mathrm{Aff}(\mathbb{R}^n)$, e.g. the translation group $T_n$. So we have to use the only two facts we know:

(1) G is finite; and

(2) Every $g \in G$ acts as an affine transformation, i.e. for some vector $v \in \mathbb{R}^n$,

$$g = T_v \circ f,$$

where $T_v$ is translation by v and $f : \mathbb{R}^n \to \mathbb{R}^n$ is a linear transformation.

We use statement (2) to understand affine transformations a little better:

Lemma 7.4. Let $g = T_v \circ f$ be an affine transformation of $\mathbb{R}^n$, where f is a linear transformation and $T_v$ is a translation map. Let $v_1, v_2, \ldots, v_m$ be vectors in $\mathbb{R}^n$ and let c be a real scalar. Then the following formulas hold:

(1) $g(v_1 + v_2 + \cdots + v_m) = g(v_1) + g(v_2) + \cdots + g(v_m) - (m-1)v$; and

(2) $g(c \cdot v_1) = c \cdot g(v_1) + (1 - c) \cdot v$.
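Before the proof, both formulas can be checked on a concrete example. The sketch below uses exact rational arithmetic; the matrix `A`, the translation vector `v`, and the test vectors are arbitrary illustrative choices, and the helper names are hypothetical.

```python
from fractions import Fraction

def mat_vec(A, x):
    """Matrix-vector product with A given as a tuple of rows."""
    return tuple(sum(a * xi for a, xi in zip(row, x)) for row in A)

def add(*vs):
    """Componentwise sum of any number of vectors."""
    return tuple(sum(c) for c in zip(*vs))

def scale(c, x):
    return tuple(c * xi for xi in x)

def make_affine(A, v):
    """Return g = T_v ∘ f, where f(x) = A x."""
    return lambda x: add(mat_vec(A, x), v)

A = ((2, 1), (0, 3))      # linear part f (illustrative)
v = (5, -7)               # translation vector (illustrative)
g = make_affine(A, v)

v1, v2, v3 = (1, 2), (-4, 0), (3, 3)
m = 3

# Formula (1): g(v1 + v2 + v3) = g(v1) + g(v2) + g(v3) - (m - 1) v
lhs1 = g(add(v1, v2, v3))
rhs1 = add(g(v1), g(v2), g(v3), scale(-(m - 1), v))

# Formula (2): g(c v1) = c g(v1) + (1 - c) v
c = Fraction(2, 5)
lhs2 = g(scale(c, v1))
rhs2 = add(scale(c, g(v1)), scale(1 - c, v))

print(lhs1 == rhs1, lhs2 == rhs2)
```

The correction terms $-(m-1)v$ and $(1-c)v$ are exactly what is needed to compensate for g not being linear.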


Proof. (1) We have

$$g(v_1 + \cdots + v_m) = f(v_1 + \cdots + v_m) + v = f(v_1) + \cdots + f(v_m) + v.$$

On the other hand,

$$g(v_1) + \cdots + g(v_m) - (m-1)v = (f(v_1) + v) + \cdots + (f(v_m) + v) - (m-1)v$$

$$= f(v_1) + \cdots + f(v_m) + mv - (m-1)v = f(v_1) + \cdots + f(v_m) + v = g(v_1 + \cdots + v_m).$$

(2) We have

$$g(c \cdot v_1) = f(c \cdot v_1) + v = c \cdot f(v_1) + v = c \cdot (g(v_1) - v) + v = c \cdot g(v_1) + (1 - c) \cdot v.$$

□

Now we can describe the averaging trick which will permit us to find a fixed point for G. Let’s imagine first that G is the cyclic group generated by a rotation $\rho$ about a point P through an angle of $\frac{2\pi}{n}$. Hence $\rho^n = I$. Now suppose we didn’t know P. We could pick a random point Q and look at the $\rho$-orbit of Q:

$$Q, \ \rho(Q), \ \rho^2(Q), \ \ldots, \ \rho^{n-1}(Q).$$

The next step, $\rho^n(Q)$, would take us back to Q. What we would see is n points evenly spaced on the circumference of a circle, and we would realize that P must be the center of that circle. If we change the choice of Q, we change the set of points and we probably even change the circle, but the center is always P!

This is the magic that we will exploit.
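The picture just described can be reproduced numerically. In the sketch below, the center P, the value n = 7, and the starting points Q are arbitrary illustrative choices, and the function names are hypothetical. Whatever Q is chosen, the average of its orbit comes out (up to rounding) at P.

```python
import math

def rotate_about(P, angle):
    """Return the rotation of the plane about the point P through `angle`."""
    c, s = math.cos(angle), math.sin(angle)

    def rho(Q):
        dx, dy = Q[0] - P[0], Q[1] - P[1]
        return (P[0] + c * dx - s * dy, P[1] + s * dx + c * dy)

    return rho

def orbit_centroid(rho, Q, n):
    """Average of the n points Q, rho(Q), ..., rho^(n-1)(Q)."""
    pts = []
    for _ in range(n):
        pts.append(Q)
        Q = rho(Q)
    return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)

n = 7
P = (3.0, -1.0)                      # the "unknown" center
rho = rotate_about(P, 2 * math.pi / n)

for Q in [(0.0, 0.0), (10.0, 4.0), (-2.5, 8.0)]:
    cx, cy = orbit_centroid(rho, Q, n)
    print(Q, "->", (round(cx, 9), round(cy, 9)))
```

The computation uses floating point, so the centroid agrees with P only up to rounding error.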

First we need the following remark.

Lemma 7.5. Let G be a group and let h be an element of G. Consider the function $\lambda_h : G \to G$ defined by

$$\lambda_h(g) = h \circ g \quad \text{for all } g \in G.$$

This is a bijective function.
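For a small finite group the statement can be checked by brute force. The sketch below takes G to be the symmetric group $S_3$, with permutations stored as tuples; this choice of group and the helper `compose` are illustrative assumptions only.

```python
from itertools import permutations

def compose(h, g):
    """(h ∘ g)(i) = h(g(i)); permutations are stored as tuples of images."""
    return tuple(h[g[i]] for i in range(len(g)))

G = list(permutations(range(3)))          # the symmetric group S_3

for h in G:
    image = [compose(h, g) for g in G]    # the values of left multiplication by h
    assert sorted(image) == sorted(G)     # the image is a rearrangement of G

print("left multiplication by each h permutes the", len(G), "elements of G")
```

Each row of the multiplication table of G is therefore a rearrangement of the elements of G, which is the form in which the lemma is used below.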

Proof. Since G is a group, $h \circ g \in G$ for all $g \in G$. So $\lambda_h$ is indeed a function from G into G. Suppose $h \circ g = h \circ g_1$. Then by the Left Cancellation Law, $g = g_1$. Thus $\lambda_h$ is an injective map.

For any $g \in G$, $h^{-1} \circ g \in G$, and

$$\lambda_h(h^{-1} \circ g) = h \circ (h^{-1} \circ g) = (h \circ h^{-1}) \circ g = g.$$

Thus $\lambda_h$ is also a surjective map, hence a bijective map.

□

Now we can prove Theorem 7.3.

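Proof of Theorem 7.3. Let v be any point in $\mathbb{R}^n$. Let

$$\bar{v} = \frac{1}{|G|} \cdot \sum_{g \in G} g(v).$$

Let $h \in G$. We claim that $h(\bar{v}) = \bar{v}$. Write $h = T_w \circ f$, where $T_w$ is translation by the vector w and $f \in GL(\mathbb{R}^n)$. Then by Lemma 7.4,

$$h(\bar{v}) = \frac{1}{|G|} \cdot h\Big(\sum_{g \in G} g(v)\Big) + \Big(1 - \frac{1}{|G|}\Big) \cdot w = \frac{1}{|G|}\Big(\sum_{g \in G} (h \circ g)(v) - (|G| - 1) \cdot w\Big) + \Big(1 - \frac{1}{|G|}\Big) \cdot w.$$

Now by Lemma 7.5, $\sum_{g \in G} (h \circ g)(v) = \sum_{g \in G} g(v)$, since as g runs through the elements of G once each, so does $h \circ g$. Hence

$$h(\bar{v}) = \frac{1}{|G|}\Big(\sum_{g \in G} g(v) - (|G| - 1) \cdot w\Big) + \Big(1 - \frac{1}{|G|}\Big) \cdot w = \bar{v} - \frac{|G| - 1}{|G|} \cdot w + \Big(1 - \frac{1}{|G|}\Big) \cdot w = \bar{v},$$

completing the proof of Theorem 7.3.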

□

Corollary 7.6. Let G be a finite subgroup of $\mathrm{Isom}(\mathbb{R}^3)$. Then G is conjugate to a subgroup of O(3).
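The averaging argument of Theorem 7.3 can be tried on a small example. The sketch below works in the plane with exact arithmetic: it builds the dihedral group of order 8 of a square centered at a point c, with each element stored as a pair (A, v) meaning x ↦ Ax + v, and then recovers c by averaging the orbit of an arbitrary point. The center c, the starting point, and all function names are illustrative assumptions.

```python
from fractions import Fraction

def mat_mul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def mat_vec(A, x):
    return tuple(sum(a * xi for a, xi in zip(row, x)) for row in A)

def compose(g, h):
    """(A1, v1) ∘ (A2, v2) = (A1 A2, A1 v2 + v1)."""
    (A1, v1), (A2, v2) = g, h
    w = mat_vec(A1, v2)
    return (mat_mul(A1, A2), (w[0] + v1[0], w[1] + v1[1]))

def apply(g, x):
    A, v = g
    y = mat_vec(A, x)
    return (y[0] + v[0], y[1] + v[1])

def about(A, c):
    """The affine map x -> A(x - c) + c, i.e. A recentred at c."""
    Ac = mat_vec(A, c)
    return (A, (c[0] - Ac[0], c[1] - Ac[1]))

def generate(gens):
    """Close a set of affine maps under composition (finite case only)."""
    group, frontier = set(gens), list(gens)
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

c = (2, 5)                                   # hidden center (illustrative)
rot90 = about(((0, -1), (1, 0)), c)          # quarter turn about c
flip = about(((1, 0), (0, -1)), c)           # reflection in a line through c
G = generate([rot90, flip])                  # dihedral group of order 8

v = (7, -3)                                  # arbitrary starting point
images = [apply(g, v) for g in G]
v_bar = tuple(Fraction(sum(p[i] for p in images), len(G)) for i in range(2))

print(len(G), v_bar)
assert all(apply(g, v_bar) == v_bar for g in G)   # v_bar is fixed by all of G
```

Since every element of G fixes `v_bar`, conjugating by the translation through `v_bar` turns each (A, v) into the purely linear map A, which is the content of Corollary 7.6 in this example.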

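Proof. By Theorem 7.3, there is a point $v \in \mathbb{R}^3$ such that $g(v) = v$ for all $g \in G$. Then

$$(T_{-v} \circ g \circ T_v)(0, 0, 0) = T_{-v}(g(v)) = T_{-v}(v) = (0, 0, 0).$$

Thus every element of $T_v^{-1} \circ G \circ T_v$ is an isometry of $\mathbb{R}^3$ fixing the origin, and so lies in O(3). Hence $T_v^{-1} \circ G \circ T_v \subseteq O(3)$, as claimed.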

□

We now study the finite subgroups of SO(3). It is a bit more complicated to describe all of the finite subgroups of O(3). You will explore this in the exercises.

Let G be a finite subgroup of SO(3) with $G \ne \{I\}$. Recall that every non-identity element $\rho$ of G is a non-identity rotation about an axis passing through (0, 0, 0), and the only points of $\mathbb{R}^3$ fixed by $\rho$ lie on this axis of rotation. We let $S^2$ denote the unit sphere in $\mathbb{R}^3$:

$$S^2 = \{(x, y, z) \in \mathbb{R}^3 : x^2 + y^2 + z^2 = 1\}.$$

Then the axis of rotation of $\rho$ intersects $S^2$ in exactly two antipodal points (a, b, c) and (−a, −b, −c). We think of (a, b, c) and (−a, −b, −c) as the north and south poles for the rotation $\rho$. Let

$$\mathcal{P} = \{P = (a, b, c) \in S^2 : g(P) = P \text{ for some } g \in G - \{I\}\}.$$

We call $\mathcal{P}$ the set of poles for the group G. Notice that each non-identity element of G contributes two poles to the set $\mathcal{P}$. However, two different non-identity elements may contribute the same pair of poles. In any case, we see that:

Lemma 7.7. We have $2 \le |\mathcal{P}| \le 2(|G| - 1)$.

Thus $\mathcal{P}$ is a finite set. Even better, we have the following fact.

Lemma 7.8. G acts as a group of permutations of the set $\mathcal{P}$, i.e., if $P \in \mathcal{P}$ and $g \in G$, then $g(P) \in \mathcal{P}$.

Proof. Let $P \in \mathcal{P}$. Then by definition of $\mathcal{P}$, there exists some $h \in G$ with $h \ne I$ and with $h(P) = P$. Now let g be any element of G. Then

$$(g \circ h \circ g^{-1})(g(P)) = (g \circ h)(g^{-1}(g(P))) = (g \circ h)(P) = g(h(P)) = g(P).$$

Also, if $g \circ h \circ g^{-1} = I$, then

$$h = g^{-1} \circ (g \circ h \circ g^{-1}) \circ g = g^{-1} \circ I \circ g = I,$$

contrary to the fact that $h \ne I$. Hence $g \circ h \circ g^{-1} \ne I$ and $(g \circ h \circ g^{-1})(g(P)) = g(P)$. So $g(P) \in \mathcal{P}$ for all $P \in \mathcal{P}$ and all $g \in G$, as claimed.

□

Remarkably, we can now count the number of G-orbits on $\mathcal{P}$. Except for one trivial case, there must be exactly three orbits.

Lemma 7.9. The following conclusions hold:

(1) G has either two or three orbits on $\mathcal{P}$.

(2) If G has only two orbits on $\mathcal{P}$, then $\mathcal{P} = \{P, -P\}$ is a pair of antipodal points on $S^2$, and G is a cyclic group of rotations about the axis L through P and −P.

(3) If G has three orbits on $\mathcal{P}$, then $|\mathcal{P}| = |G| + 2$.

Proof. Using the notation of the Orbit Counting Formula, $f(g) = 2$ for all $g \in G - \{I\}$, while $f(I) = |\mathcal{P}|$. Let m be the number of G-orbits on $\mathcal{P}$. Then, by the Orbit Counting Formula,

$$m = \frac{(|G| - 1) \cdot 2 + |\mathcal{P}|}{|G|} = 2 + \frac{|\mathcal{P}| - 2}{|G|}.$$

Thus $m \ge 2$ and, if m = 2, then $|\mathcal{P}| = 2$, while if m = 3, then $|\mathcal{P}| = |G| + 2$. Since $|\mathcal{P}| \le 2(|G| - 1)$ by Lemma 7.7, we have that

$$m = 2 + \frac{|\mathcal{P}| - 2}{|G|} \le 2 + \frac{2|G| - 4}{|G|} = 2 + 2 + \frac{-4}{|G|} < 4.$$

Hence m = 2 or 3.

Finally, suppose that $m = |\mathcal{P}| = 2$. If P is a pole of G, then so is the antipodal point −P. Hence $\mathcal{P} = \{P, -P\}$ and these points are fixed by every element of G. Thus the line L through P and −P is held pointwise fixed by every element of G, and so using Theorem 12.12 in the Math 4580 text, we see that G acts as a finite cyclic group of rotations of the plane $L^\perp$ through (0, 0, 0) perpendicular to L.

□
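Lemma 7.9 can be checked on a concrete group. The sketch below takes G to be the 24 rotations of the cube $[-1, 1]^3$, realized as signed permutation matrices of determinant 1. This choice of G is an assumption made for illustration, as is the use of integer direction vectors with entries in {−1, 0, 1} to stand in for the points of $S^2$ on the same rays.

```python
from itertools import permutations, product

def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def act(M, v):
    return tuple(sum(m * x for m, x in zip(row, v)) for row in M)

# Rotations of the cube: signed permutation matrices with determinant +1.
G = []
for perm in permutations(range(3)):
    for signs in product((1, -1), repeat=3):
        M = tuple(tuple(signs[i] if perm[i] == j else 0 for j in range(3))
                  for i in range(3))
        if det3(M) == 1:
            G.append(M)
I = ((1, 0, 0), (0, 1, 0), (0, 0, 1))

# Candidate directions; each stands for the point of S^2 on its ray.
candidates = [v for v in product((-1, 0, 1), repeat=3) if any(v)]

# Every non-identity rotation fixes exactly two of them, so f(g) = 2.
for M in G:
    if M != I:
        assert sum(act(M, v) == v for v in candidates) == 2

poles = {v for v in candidates for M in G if M != I and act(M, v) == v}
orbits = {frozenset(act(M, v) for M in G) for v in poles}
sizes = sorted(len(o) for o in orbits)

# Orbit Counting Formula: the average number of fixed poles is the orbit count.
fixed = sum(sum(act(M, v) == v for v in poles) for M in G)
assert fixed // len(G) == len(orbits)

print(len(G), len(poles), sizes)   # 24 26 [6, 8, 12]
```

Here there are three orbits and $|\mathcal{P}| = 26 = |G| + 2$, as in part (3) of the lemma. The orbits of sizes 6, 8, and 12 consist of the directions through the face centers, the vertices, and the edge midpoints of the cube.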


For the remainder of this chapter, we shall assume that m = 3, and let $O_1$, $O_2$, and $O_3$ be the three G-orbits on $\mathcal{P}$. Choose notation so that

$$|O_1| \ge |O_2| \ge |O_3|.$$

Let P be a point in $O_1$, Q a point in $O_2$, and R a point in $O_3$. Recall the notation:

$$G_P = \{g \in G : g(P) = P\}.$$

By Lagrange’s Orbit Stabilizer Theorem,

$$|G| = |O_1| \cdot |G_P| = |O_2| \cdot |G_Q| = |O_3| \cdot |G_R|.$$

Let $|G_P| = p$, $|G_Q| = q$, and $|G_R| = r$. By the ordering of the orbits, we have

$$p \le q \le r.$$

Also, since every pole is fixed both by I and by at least one non-identity element of G, we have

$$2 \le p \le q \le r.$$

We now obtain the key formula, which may be interpreted as a statement in spherical geometry:

Lemma 7.10.

$$\frac{1}{p} + \frac{1}{q} + \frac{1}{r} = 1 + \frac{2}{|G|} > 1.$$

Proof. By Lemma 7.9(3),

$$|G| + 2 = |\mathcal{P}| = |O_1| + |O_2| + |O_3|.$$

Hence, dividing by |G| and using Lagrange’s formula, we have

$$1 + \frac{2}{|G|} = \frac{|O_1|}{|G|} + \frac{|O_2|}{|G|} + \frac{|O_3|}{|G|} = \frac{1}{p} + \frac{1}{q} + \frac{1}{r}.$$

□

From this, we immediately get that the following are the only possibilities for p, q, and r.

Lemma 7.11. One of the following possibilities holds:

(1) p = q = 2 and $r = \frac{|G|}{2}$; or

(2) p = 2, q = r = 3, and |G| = 12; or

(3) p = 2, q = 3, r = 4, and |G| = 24; or

(4) p = 2, q = 3, r = 5, and |G| = 60.

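Proof. If p > 2, then

$$\frac{1}{p} + \frac{1}{q} + \frac{1}{r} \le \frac{1}{3} + \frac{1}{3} + \frac{1}{3} = 1 < 1 + \frac{2}{|G|},$$

contrary to Lemma 7.10. Hence p = 2.

If p = q = 2, then by Lemma 7.10, $\frac{1}{r} = \frac{2}{|G|}$. So $r = \frac{|G|}{2}$, as claimed in (1).

Otherwise $q \ge 3$. Suppose that q > 3. Then

$$\frac{1}{p} + \frac{1}{q} + \frac{1}{r} \le \frac{1}{2} + \frac{1}{4} + \frac{1}{4} = 1,$$

again contrary to Lemma 7.10. Hence we may assume that p = 2 and q = 3. If r > 5, then

$$\frac{1}{p} + \frac{1}{q} + \frac{1}{r} \le \frac{1}{2} + \frac{1}{3} + \frac{1}{6} = 1,$$

again contrary to Lemma 7.10. Hence if q = 3, then $r \in \{3, 4, 5\}$, and Lemma 7.10 gives |G| = 12, 24, or 60, respectively, completing the proof.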

□

We notice that in the last three cases, G has the cardinality of the tetrahedral group, the octahedral group or the icosahedral group. But first, let’s try to understand Case 1.
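The case analysis of Lemma 7.11 can be double-checked by a short search. The sketch below lists all triples $2 \le p \le q \le r$ up to a cutoff with $\frac{1}{p} + \frac{1}{q} + \frac{1}{r} > 1$ and solves Lemma 7.10 for |G|. The cutoff `r_max` and the function name are illustrative assumptions; the cutoff only truncates the infinite family p = q = 2.

```python
from fractions import Fraction

def solutions(r_max):
    """All 2 <= p <= q <= r <= r_max with 1/p + 1/q + 1/r > 1, with the order |G|."""
    found = []
    for p in range(2, r_max + 1):
        for q in range(p, r_max + 1):
            for r in range(q, r_max + 1):
                excess = Fraction(1, p) + Fraction(1, q) + Fraction(1, r) - 1
                if excess > 0:
                    # Lemma 7.10: excess = 2/|G|, so |G| = 2/excess.
                    found.append((p, q, r, 2 / excess))
    return found

for p, q, r, order in solutions(8):
    print(p, q, r, order)
```

Apart from the family (2, 2, r) with |G| = 2r, the search returns only (2, 3, 3), (2, 3, 4), and (2, 3, 5), with orders 12, 24, and 60.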

Lemma 7.12. If p = q = 2, then $O_3 = \{R, -R\}$. Let L be the line through R and −R, and let $\Pi$ be the plane through (0, 0, 0) perpendicular to L. Then G acts as a dihedral group of isometries of the plane $\Pi$, containing r rotations and r reflections.

Proof. As $r = |G_R| = \frac{|G|}{2}$, $G_R$ is a normal subgroup of G with

$$G = G_R \cup f \circ G_R,$$

for some $f \in G - G_R$. Every element of $G_R$ fixes both R and −R, and so $G_R$ acts as a cyclic group of rotations of the plane $\Pi$ about the point (0, 0, 0). Let $g \in G_R$ with $g \ne I$. Then, since $G_R$ is a normal subgroup of G, $g_1 := f^{-1} \circ g \circ f$ is also an element of $G_R$. Hence $g \circ f = f \circ g_1$, and so

$$g(f(R)) = (g \circ f)(R) = (f \circ g_1)(R) = f(g_1(R)) = f(R).$$

Hence f(R) is a pole of g. But the poles of g are R and −R. Since f is not in $G_R$, $f(R) \ne R$. So f(R) = −R and f(−R) = R.

In any case, f fixes setwise both the line L and the plane $\Pi$ perpendicular to L. Hence every element of G fixes setwise both the line L and the plane $\Pi$.

Also, notice that, since f(R) = −R, the two poles, S and −S, of f lie in the plane $\Pi$. Now $f^2(R) = R$, and so $f^2$ fixes the non-collinear points R, S, and (0, 0, 0). So $f^2 = I$ for all $f \in G - G_R$. Hence f is a 180° rotation about the line M through S and −S. So f induces a reflection of the plane $\Pi$ across the mirror-line M.

Thus G acts as a dihedral group of isometries of the plane $\Pi$ with the r elements of $G_R$ acting as rotations of $\Pi$ about (0, 0, 0), and with the r elements of $G - G_R$ acting as reflections across lines in $\Pi$ passing through (0, 0, 0).

□

Next we consider the case

$$p = 2, \quad q = r = 3.$$

Consider the orbit $O_2$ of size 4, and write T for the group G in this case. Since no non-identity rotation fixes four poles, the map $\varphi : T \to \mathrm{Sym}(O_2)$ by restriction of domain is an injective map of T into $\mathrm{Sym}(O_2) \cong S_4$. Since |T| = 12, T is isomorphic to the unique subgroup of $S_4$ of cardinality 12, namely, the alternating group $A_4$. As you will show in Exercise 1 below, $A_4$ transitively permutes the set of unordered pairs $\{\{i, j\} : 1 \le i < j \le 4\}$. Translating this into geometry: T transitively permutes the set of six edges joining pairs of vertices in $O_2$. As T is a group of isometries, all edges have the same length, i.e. the figure S(4) formed in this way is a tetrahedron, and T is the group of rotational symmetries of S(4). If we were to use the other orbit of length 4, $O_3$, we would have constructed the dual tetrahedron, $S(4)^*$.

Next, we consider the case

$$p = 2, \quad q = 3, \quad r = 4.$$

Now we have that |G| = 24 and so the orbit $O_3$ has length 6. Suppose that X and −X are two antipodal poles. Then $G_X = G_{-X}$. In particular, X and −X lie in orbits of the same length, namely, $\frac{|G|}{|G_X|}$. Since no two orbits have the same length in this case (in contrast to the tetrahedral case), we see that X and −X lie in the same orbit. In particular

$$O_3 = \{R, -R, S, -S, T, -T\}$$

for some poles R, S, T and their antipodes.

The group $G_R$ acts as a group of rotations of the latitudinal planes perpendicular to the axis passing through R and −R. Hence, as we have seen from studying the 2-dimensional case, $G_R$ is a cyclic group of cardinality $4 = \frac{24}{6}$. If $\rho$ is a cyclic generator of $G_R$, then $\rho$ is a 90° rotation about the {R, −R} axis. Since $\rho^2$ fixes only the poles R and −R, the set {S, −S, T, −T} is a $\rho$-orbit on $\mathcal{P}$. As an exercise, you are asked to show that this is possible if and only if S, T, −S, −T lie at a set of compass points on the equatorial plane relative to the poles R and −R. Thus S, T, −S, −T determine a square in the equatorial plane, and {±R, ±S, ±T} is the vertex set of an octahedron $C^*$ with G as its group of rotational symmetries.

The orbit $O_2$ contains four pairs of antipodal poles, which may be obtained as follows: Draw the lines through the centers of opposite faces of the octahedron $C^*$. Each such line L gives a pair $\{Q_L, -Q_L\}$ of antipodal points on the unit sphere $S^2$, which are poles for the rotational symmetries $\rho_F$ and $\rho_F^2$ of order 3 fixing the face F. This set of eight points on $S^2$ is the set of points in the orbit $O_2$. Clearly, it may be identified with the vertices of a cube which is a dilation of the cube C dual to the octahedron $C^*$.

Finally we say a few words about the most difficult case

$$p = 2, \quad q = 3, \quad r = 5.$$

Now |G| = 60 and the orbit $O_3$ has length 12. Again, because no two orbits have the same length, antipodal poles lie in the same orbit. In particular, $G_R$ is a cyclic group of cardinality 5, fixing the points R and −R, while permuting the remaining 10 points of $O_3$ in two orbits of length 5.

Suppose one $G_R$-orbit consists of five points in the equatorial plane relative to R and −R. Then, since these points are the vertices of a regular pentagon, they do not contain antipodal pairs. Hence the other $G_R$-orbit must consist of their antipodal points, also lying in the equatorial plane. Let C be the great circle formed by the intersection of this equatorial plane with the unit sphere $S^2$. Then, clearly C is the only great circle on $S^2$ containing 10 points from $O_3$. But then since G acts on the orbit $O_3$, G fixes the circle C (as a set), contrary to the fact that $O_3$ is a G-orbit and R does not lie on C.

Hence there are two latitudinal circles, $C_N$ and $C_S$, each containing the points in one $G_R$-orbit of length 5 on $O_3$, placed at the vertices of a regular pentagon. Moreover the regular pentagon on $C_N$ is antipodal to the regular pentagon on $C_S$.

Using some further symmetry arguments, it can be shown that $O_3$ is the set of vertices of an icosahedron inscribed in $S^2$. And, then, in a similar way to the octahedral case, it may be argued that $O_2$ is the set of vertices of an inscribed dodecahedron, which is the dilation of the dodecahedron dual to the inscribed icosahedron.

Finally, as |G| = 60, it follows that $G \cong \mathcal{I}$, the group of all rotational symmetries of the icosahedron.

Exercises

1. Let $G = A_4$, the alternating group on 4 letters. Let

$$X = \{\{i, j\} : 1 \le i < j \le 4\}.$$

Let G act on X via

$$\sigma(\{i, j\}) = \{\sigma(i), \sigma(j)\}.$$

Prove: G acts transitively on the set X, i.e., for all $\{i, j\} \in X$, there exists $\sigma \in G$ with

$$\sigma(\{1, 2\}) = \{i, j\}.$$

2. (a) Justify the statement in the discussion of the octahedral case: “Since $\rho^2$ fixes only the poles R and −R, the set {S, −S, T, −T} is a $\rho$-orbit on $\mathcal{P}$.”

(b) In the same context as (2a), prove: S, T, −S, −T lie at a set of compass points on the equatorial plane relative to the poles R and −R.

3. The Direct Product: Let G be a group having subgroups H and K satisfying the following two conditions:

(a) $h \circ k = k \circ h$ for all $h \in H$ and $k \in K$; and

(b) $H \cap K = \{I\}$.

Prove: The subset HK of G is a subgroup of G which is isomorphic to the following formal group, called the direct product of H and K:

$$H \times K := \{(h, k) : h \in H \text{ and } k \in K\},$$

with the operation

$$(h, k) \circ (h', k') = (h \circ h', k \circ k').$$

4. Let $G = HK \cong H \times K$ be a group. Let $\pi_H : G \to H$ be the projection map defined by:

$$\pi_H(hk) = h \quad \text{for all } g = hk \in G.$$

Define $\pi_K : G \to K$ analogously.

(a) Prove: Let M be any subgroup of G. Then $\pi_H(M)$ is a subgroup of H and $\pi_K(M)$ is a subgroup of K.

(b) Prove: M is a subgroup of the group $\pi_H(M) \cdot \pi_K(M) \cong \pi_H(M) \times \pi_K(M)$.

5. (a) Prove: $O(3) \cong SO(3) \times \{I, -I\} \cong SO(3) \times C_2$.

(b) Conclude that every finite subgroup of O(3) is isomorphic to a subgroup of $H \times K$, where H is a finite subgroup of SO(3) and |K| = 1 or 2.

6. Let S be a regular tetrahedron in $\mathbb{R}^3$. We have proved that $\mathrm{Sym}(S) \cong S_4$. On the other hand, the tetrahedral group T of all rotational symmetries of S is isomorphic to $A_4$. Verify that $S_4$ is not isomorphic to a subgroup of $T \times \{I, -I\}$. Explain how this could be true.

7. If S is either a regular octahedron or a regular icosahedron centered at (0, 0, 0), then the antipodal map −I is a symmetry of S. Using this fact, prove that the symmetry group of S is $\mathrm{Sym}(S) = \mathcal{O} \times \{\pm I\}$ if S is an octahedron, and $\mathrm{Sym}(S) = \mathrm{Icos} \times \{\pm I\}$ if S is an icosahedron.

8. Let $D_3$ be the group of all diagonal matrices in O(3). Prove: $D_3 \cong C_2 \times C_2 \times C_2$, an abelian group of cardinality 8, all of whose non-identity elements have order 2.

9. (a) Prove: Let p be a prime and let G be a finite group of cardinality $p^2$. Then either G is cyclic or $G \cong C_p \times C_p$.

(b) Exhibit a subgroup P of $S_6$ with |P| = 9. Verify that $P \cong C_3 \times C_3$.

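10. Inner Product Spaces: A function $\langle \cdot, \cdot \rangle : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}$ is an inner product on $\mathbb{R}^3$ if the following properties hold:

(i) $\langle u, v \rangle = \langle v, u \rangle$ for all $u, v \in \mathbb{R}^3$;

(ii) $\langle u + v, w \rangle = \langle u, w \rangle + \langle v, w \rangle$ for all $u, v, w \in \mathbb{R}^3$;

(iii) $\langle cu, v \rangle = c \cdot \langle u, v \rangle$ for all $u, v \in \mathbb{R}^3$, $c \in \mathbb{R}$; and

(iv) $\langle u, u \rangle \ge 0$ for all $u \in \mathbb{R}^3$, with equality if and only if u = 0.

(a) Prove: If $\langle \cdot, \cdot \rangle$ is an inner product on $\mathbb{R}^3$, there is an orthonormal basis for $\mathbb{R}^3$ with respect to $\langle \cdot, \cdot \rangle$, i.e. there exist vectors $e_1, e_2, e_3$ such that

$$\langle e_i, e_j \rangle = 0 \text{ for all } i \ne j, \quad \text{and} \quad \langle e_i, e_i \rangle = 1 \text{ for all } i.$$

(b) Let $G^*$ be the set of all linear isometries of $\mathbb{R}^3$ with respect to $\langle \cdot, \cdot \rangle$, i.e.,

$$G^* := \{T \in GL(\mathbb{R}^3) : \langle u, v \rangle = \langle T(u), T(v) \rangle \ \text{for all } u, v \in \mathbb{R}^3\}.$$

Prove: $G^*$ is a subgroup of $GL(\mathbb{R}^3)$ and $G^* \cong O(3)$. [Hint: With respect to the standard basis for $\mathbb{R}^3$,

$$O(3) \cong \{A \in GL(3, \mathbb{R}) : A^T A = I\}.$$

Show that the same is true for $G^*$ with respect to a suitable choice of basis for $\mathbb{R}^3$.]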

11. Another Averaging Trick. Prove E. H. Moore’s Theorem: Let G be a finite subgroup of $GL(3, \mathbb{R})$. Then G is isomorphic to a subgroup of O(3). [Hint: Define a function on $\mathbb{R}^3 \times \mathbb{R}^3$ by

$$\langle u, v \rangle = \frac{1}{|G|} \sum_{g \in G} g(u) \cdot g(v)$$

for all $u, v \in \mathbb{R}^3$. Prove that $\langle \cdot, \cdot \rangle$ is an inner product on $\mathbb{R}^3$. Then, prove that G is a group of linear isometries with respect to $\langle \cdot, \cdot \rangle$. Now use Exercise 10b.]

12. Prove: Let G be a finite subgroup of $\mathrm{Aff}(\mathbb{R}^3)$. Then G is isomorphic to a subgroup of O(3).

[Hint: Use Theorem 7.3 and the proof of Corollary 7.6 to show that G is conjugate to a subgroup $G_1$ of $GL(3, \mathbb{R})$. Now use Exercise 11.]

13. Give an example of a bijective function $f : \mathbb{R} \to \mathbb{R}$ of finite order, i.e., $f^n = I$ for some $n \in \mathbb{N}$, such that f fixes no point of $\mathbb{R}$, i.e., $f(x) \ne x$ for all $x \in \mathbb{R}$.

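14. (a) Consider the following set of 2 × 2 matrices with complex entries:

$$Q_8 := \left\{ \pm I, \ \pm\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \ \pm\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \ \pm\begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix} \right\}.$$

Prove: $Q_8$ is a subgroup of the group of all invertible 2 × 2 matrices with complex entries.

(b) Let $m \in \mathbb{N}$ and let $\zeta_m = \cos(\frac{2\pi}{m}) + i \sin(\frac{2\pi}{m})$. Using De Moivre’s Formula, prove that $\zeta_m^m = 1$, but $\zeta_m^d \ne 1$ for all d < m, $d \in \mathbb{N}$.

(c) Let p and q be distinct prime numbers. Let $Z_{pq} = \begin{pmatrix} \zeta_p & 0 \\ 0 & \zeta_q \end{pmatrix}$. Prove: $Z_{pq}$ generates a cyclic subgroup of cardinality pq in $GL(2, \mathbb{C})$, the group of 2 × 2 invertible matrices with complex entries.

(d) Let p be a prime number. Let $a, b \in \mathbb{N}$ with $a \le b$. Let H be the subgroup of $GL(2, \mathbb{C})$ generated by the two matrices:

$$\begin{pmatrix} \zeta_{p^a} & 0 \\ 0 & 1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1 & 0 \\ 0 & \zeta_{p^b} \end{pmatrix}.$$

Prove: $H \cong C_{p^a} \times C_{p^b}$.

(e) Prove: Let L be the subgroup of $GL(2, \mathbb{C})$ generated by the two matrices:

$$\begin{pmatrix} \zeta_3 & 0 \\ 0 & \zeta_3^{-1} \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$$

Then L is a nonabelian group of cardinality 12 with |Z(L)| = 2 and with only one element of order 2.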

Remark. We have now constructed examples of all finite groups G with $|G| \le 15$:

(1) $C_n$, $n \le 15$;

(2) $V_4 = D_2 \cong C_2 \times C_2$;

(3) $D_3 \cong S_3$;

(4) $D_4$;

(5) $C_4 \times C_2 \cong \left\langle \begin{pmatrix} i & 0 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \right\rangle$;

(6) $C_2 \times C_2 \times C_2$, the group of all diagonal matrices in O(3);

(7) $Q_8 \subseteq GL(2, \mathbb{C})$;

(8) $C_3 \times C_3 \subseteq S_6$;

(9) $D_5$;

(10) $C_6 \times C_2 \subseteq GL(2, \mathbb{C})$;

(11) $D_6$;

(12) $A_4$;

(13) $L \subseteq GL(2, \mathbb{C})$;

(14) $D_7$.

15. Justify the statement that no two of the 28 groups listed above are isomorphic.

It is not terribly difficult, but it is a bit beyond the scope of this course to prove that every group G with $|G| \le 15$ is isomorphic to one of these 28 groups.

8. Imaginaries and Galois fields

“... and gives to airy nothing a local habitation and a name ...”– Wm. Shakespeare

We now change topic, returning to the theme of polynomial equations and their roots. De Moivre’s Formula, which you studied last semester, demonstrated the existence of exactly n complex nth roots for any number, i.e. a full complement of n solutions to the equation $x^n - \alpha = 0$ could be found among the complex numbers, for any given complex number $\alpha$. This lent support to the idea that every polynomial equation of degree n with complex coefficients should have a full set of n solutions (counting multiplicity) in the field $\mathbb{C}$. A somewhat cryptic version of this statement was first made (without proof) by the French mathematician Girard as early as 1629 (before Descartes published his Factor Theorem), along with formulas expressing the coefficients as symmetric functions of the roots.

Nevertheless, as late as the early 1700s, Leibniz thought he had a counterexample. D’Alembert published a somewhat incomplete proof of this “Fundamental Theorem of Algebra” using the methods of calculus, in 1746. Euler attempted a more algebraic proof in 1749, which was improved by de Foncenex and then Lagrange in 1772.

There was one big problem with Euler’s proof, and this was pointed out by Gauss, who proposed his own calculus-based proof in 1799, and a second more algebraic proof in 1816. The problem detected by Gauss was:

Euler’s proof assumed the existence somewhere of a set of n roots of an nth degree polynomial p(x) with real coefficients. The proof then proceeded to show that these roots were in fact complex numbers. But, said Gauss, this misses the entire point. Why do roots of p(x) exist anywhere??

If Euler were still alive when Gauss wrote this, he might have responded: What do you mean exist somewhere?? We can simply invent them, as needed. After all, the imaginary number i was invented to provide a root for the polynomial $f(x) = x^2 + 1 \in \mathbb{R}[x]$. By adding i to $\mathbb{R}$, we were able to create a larger field, $\mathbb{C}$, containing the roots of all quadratic equations. Just keep on doing this, as needed.

Well, Gauss had a point. One can certainly invent symbols, but can one construct an algebraic structure containing these symbols and having all of the usual nice properties that one needs to carry out Euler’s proof? In modern language, given a field F and a polynomial $f(x) \in F[x]$, can one always construct a larger field E containing F and also containing a complete set of roots for this polynomial? Galois was perhaps the first person to demonstrate that this is always possible. In 1831, he wrote a note on fields of numbers, hinting at the construction we shall describe in this section. This was later clarified and elaborated by Leopold Kronecker.

Let’s construct $\mathbb{C}$ a different way: Let $\mathbb{R}[x]$ be the domain of all polynomials with real coefficients. Let $(x^2 + 1)$ denote the principal ideal in $\mathbb{R}[x]$ generated by the polynomial $q(x) = x^2 + 1$. Form the quotient ring $C := \mathbb{R}[x]/(x^2 + 1)$. By the Division Algorithm, we see that every element of C has the form

$$(a + bx) + (x^2 + 1)$$

for some $a, b \in \mathbb{R}$. If we define the symbol i to denote the coset $x + (x^2 + 1)$, then we see that

$$i^2 = x^2 + (x^2 + 1) = -1 + (x^2 + 1).$$

Thus we have created a commutative ring whose objects are the symbols a + bi for a and b real numbers, satisfying the condition $i^2 = -1$, i.e. we have recreated the complex numbers.
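The multiplication in this quotient ring can be written out directly. In the sketch below a coset $(a + bx) + (x^2 + 1)$ is stored as the pair (a, b); the function name and the sample values are illustrative. The product is compared with Python's built-in complex multiplication.

```python
def mul_mod_x2_plus_1(u, v):
    """Multiply the cosets of a + bx and c + dx in R[x]/(x^2 + 1).

    (a + bx)(c + dx) = ac + (ad + bc)x + bd x^2, and x^2 ≡ -1.
    """
    a, b = u
    c, d = v
    return (a * c - b * d, a * d + b * c)

i = (0, 1)                        # the coset of x
print(mul_mod_x2_plus_1(i, i))    # (-1, 0), the coset of -1

u, v = (2, 3), (-1, 4)
w = mul_mod_x2_plus_1(u, v)
z = complex(*u) * complex(*v)
print(w, z)
assert (z.real, z.imag) == w
```

The rule for multiplying pairs is exactly the familiar rule for multiplying complex numbers, which is the point of the construction.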

Clearly, we can imitate this construction in great generality:

Let F be a field and let $f(x) \in F[x]$. Take the principal ideal (f(x)) in F[x] and form the quotient ring F[x]/(f(x)).

It is easy to see that if we set

$$j = x + (f(x)) \in F[x]/(f(x)),$$

then f(j) = 0. So we have created a ring in which f(x) has a root.

Experimentation shows that in general the ring we have created is not a field. However, if f(x) is an irreducible polynomial in F[x], then indeed Euclid’s Lemma for Polynomials will guarantee that every non-zero element of F[x]/(f(x)) has a multiplicative inverse in F[x]/(f(x)), i.e. F[x]/(f(x)) is indeed a field.

Theorem 8.1. Let F be a field and let $f(x) \in F[x]$ be a (non-constant) irreducible polynomial. Then the ring E := F[x]/(f(x)) is a field. Moreover, F is isomorphic to the following subfield of E:

$$F_0 := \{\alpha + (f(x)) : \alpha \in F\}.$$

(Here we are identifying the number $\alpha$ with the constant polynomial $c(x) = \alpha$.)

Proof. For any $p(x) \in F[x]$, let $[p(x)] := p(x) + (f(x)) \in E$. We need to prove that if [g(x)] is a non-zero element of E, then there is a polynomial $h(x) \in F[x]$ such that $[g(x)] \cdot [h(x)] = [1]$ in E, i.e.,

$$(g(x) + (f(x)))(h(x) + (f(x))) = 1 + (f(x)) \in F[x]/(f(x)).$$

This is true if and only if there exists a polynomial $k(x) \in F[x]$ such that:

$$g(x)h(x) = 1 + k(x)f(x).$$

Rewriting this as

$$g(x)h(x) - k(x)f(x) = 1,$$

reminds us of Euclid’s Lemma. Indeed, since f(x) is irreducible and g(x) is not a multiple of f(x), we do indeed have that gcd(g(x), f(x)) = 1. Hence there exist polynomials h(x) and m(x) such that

$$h(x)g(x) + m(x)f(x) = 1.$$

Taking k(x) = −m(x), we are done. Thus $[g(x)] \cdot [h(x)] = [1]$ in F[x]/(f(x)). So this ring is indeed a field.

Now let $\varphi : F \to F_0$ be the function

$$\varphi(\alpha) = \alpha + (f(x)) \quad \text{for all } \alpha \in F.$$

Then clearly $\varphi$ is a homomorphism of F onto $F_0$. If $\varphi(\alpha) = 0 + (f(x))$, then the constant polynomial $\alpha$ is in the ideal (f(x)), i.e., $\alpha$ is a multiple of f(x). Since f(x) is not a constant polynomial, this is possible only if $\alpha = 0$. Hence $\varphi : F \to F_0$ is an isomorphism of fields, as claimed. This completes the proof of the theorem.

□

It is easy to see that the element $j := x + (f(x))$ in E := F[x]/(f(x)) is a root of the polynomial $f(x) \in E[x]$, where one regards F as a subfield of E, and hence F[x] as a subdomain of E[x].
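The inverse whose existence is shown in the proof of Theorem 8.1 can be computed with the extended Euclidean algorithm. The sketch below works over $F = \mathbb{Q}$, with polynomials stored as coefficient lists (lowest degree first); this representation, the function names, and the sample polynomial $f(x) = x^3 - 2$ (irreducible over $\mathbb{Q}$ by Eisenstein's criterion) are illustrative assumptions.

```python
from fractions import Fraction

def trim(p):
    """Convert to Fractions and drop trailing zero coefficients."""
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q, sign=1):
    """Return p + sign*q."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([a + sign * b for a, b in zip(p, q)])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def divmod_poly(p, d):
    """Division Algorithm: return (q, r) with p = q*d + r and deg r < deg d."""
    q, r, d = [], trim(p), trim(d)
    while len(r) >= len(d):
        term = [Fraction(0)] * (len(r) - len(d)) + [r[-1] / d[-1]]
        q = add(q, term)
        r = add(r, mul(term, d), -1)
    return q, r

def inverse_mod(g, f):
    """Return h with g*h ≡ 1 (mod f), assuming gcd(g, f) = 1."""
    r0, r1 = trim(f), trim(g)
    h0, h1 = [], [Fraction(1)]        # invariant: r_k ≡ h_k * g (mod f)
    while r1:
        q, r = divmod_poly(r0, r1)
        r0, r1 = r1, r
        h0, h1 = h1, add(h0, mul(q, h1), -1)
    assert len(r0) == 1, "g and f are not coprime"
    return [c / r0[0] for c in h0]

f = [-2, 0, 0, 1]                     # f(x) = x^3 - 2
g = [1, 1]                            # g(x) = 1 + x
h = inverse_mod(g, f)
print(h)                              # coefficients of (1 - x + x^2)/3
print(divmod_poly(mul(g, h), f)[1])   # remainder of g*h mod f, namely [1]
```

In this example the inverse of [1 + x] is $\frac{1}{3}[1 - x + x^2]$, which matches the identity $(1 + x)(1 - x + x^2) = 1 + x^3 \equiv 3$ modulo $x^3 - 2$.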

We are not done however. In order to fully answer Gauss’ objection, we have to construct an extension field of F which contains all of the roots of f(x). Consider for example the irreducible polynomial $f(x) = x^4 - 2 \in \mathbb{Q}[x]$. If $\alpha$ is the positive real fourth root of 2, then the roots of f(x) in $\mathbb{C}$ are $\alpha$, $-\alpha$, $i \cdot \alpha$, and $-i \cdot \alpha$. Thus $\mathbb{Q}(\alpha, i)$ contains all the roots of f(x) in $\mathbb{C}$, but clearly $\mathbb{Q}(\alpha)$, being a subfield of $\mathbb{R}$, does not. With some further thought, we can see that $\mathbb{Q}(i \cdot \alpha)$ also does not contain all of the roots of f(x).

We solve this problem by repeating the process.

Theorem 8.2. Let F be a field and let $p(x) \in F[x]$ be a polynomial of degree n. There exists a field E containing a subfield $F_0$ isomorphic to F and such that p(x) factors as a product of n linear factors in E[x].

Proof. We proceed by complete mathematical induction on the degree n of p(x). If n = 1, then p(x) itself is linear and there is nothing to prove. Suppose then that the theorem is true for all polynomials of degree less than n.

Suppose first that p(x) = g(x)h(x) with g(x) and h(x) non-constant polynomials in F[x]. By induction, there is a field $E_1$ containing a subfield $F_1$ isomorphic to F and such that g(x) factors into linear factors in $E_1[x]$. Identifying F and $F_1$, we may assume that $h(x) \in E_1[x]$. Then by induction, there exists a field E containing a subfield $E_0$ isomorphic to $E_1$ and such that h(x) factors into linear factors in E[x]. Now, since $E_0$ is isomorphic to $E_1$, $E_0$ contains a subfield $F_0$ isomorphic to F. Thus, after suitable identifications, we have $p(x) = g(x)h(x) \in E[x]$ with both g(x) and h(x) factoring into linear factors in E[x]. Hence p(x) factors into linear factors in E[x], and we are done.

Thus we may assume that $p(x) \in F[x]$ is irreducible. Then by Theorem 8.1, there exists a field $E_1$ containing a subfield $F_1$ isomorphic to F, and such that p(x) has a root $\alpha \in E_1$. Thus, by Descartes’ Factor Theorem, there exists $g(x) \in E_1[x]$ such that

$$p(x) = (x - \alpha)g(x) \in E_1[x].$$

Since g(x) has degree n − 1, we may apply induction as before to conclude that there exists a field E containing a subfield $E_0$ isomorphic to $E_1$, and such that g(x) factors into linear factors in E[x]. As before, E contains a subfield $F_0$ isomorphic to F, and p(x) factors into linear factors in E[x], as desired.

□

Definition 8.3. Let F be a field and let $p(x) \in F[x]$ be a polynomial. Suppose that E is a field which contains a subfield $F_0$ isomorphic to F, and such that p(x) factors into a product of linear factors in E[x]. Let $r_1, r_2, \ldots, r_n$ be the roots of p(x) in E, and let $E_0 = F(r_1, r_2, \ldots, r_n) \subseteq E$. Then we call $E_0$ a splitting field for p(x) over F.

It is very useful to be able to measure the “relative sizes” of fields F and E, where E is an extension field of F, i.e., F is a subfield of E. Since both fields are, in general, infinite, cardinality is not a good measuring stick. Fortunately we have an alternative, suggested by the following observation.

Lemma 8.4. Let E be an extension field of F. Then the usual operations of addition and multiplication in E make E an F-vector space.

We leave the proof as an exercise. Note that, since E is a field, we know that (E, +) is an abelian group. The axioms for scalar multiplication:

(1) $\alpha \cdot (u + v) = \alpha \cdot u + \alpha \cdot v$, for $\alpha \in F$, $u, v \in E$;

(2) $(\alpha + \beta) \cdot u = \alpha \cdot u + \beta \cdot u$ for $\alpha, \beta \in F$, $u \in E$;

(3) $(\alpha\beta) \cdot u = \alpha \cdot (\beta \cdot u)$ for $\alpha, \beta \in F$, $u \in E$; and

(4) $1 \cdot v = v$ for all $v \in E$

follow easily from the properties of the field E.

Since E is an F-vector space, we can measure its size relative to F by the dimension of E as an F-vector space. Note that, if E = F, then {1} is a basis for E as an F-space, and so $\dim_F(F) = 1$.

Definition 8.5. We write (E : F) and speak of the degree of E over F, to denote the dimension of E as an F-vector space.

This degree is most useful when it is finite. This will be the case in the situations of interest to us. We need the following remark, whose proof we leave as an exercise.

Lemma 8.6. Let F be a field and E an extension field of F containing a root $\alpha$ of some non-zero polynomial in F[x]. Then the set

$$K(\alpha) = \{f(x) \in F[x] : f(\alpha) = 0\}$$

is a principal ideal in F[x] generated by a monic irreducible polynomial m(x). In particular, if p(x) is any irreducible polynomial in F[x] with $p(\alpha) = 0$, then $p(x) = c \cdot m(x)$ for some $c \in F$.

We call m(x) the minimum polynomial of $\alpha$ in F[x]. Note that m(x) depends heavily both on $\alpha$ and on F.

Theorem 8.7. Let F be a field and let $p(x) \in F[x]$ be a polynomial of degree $n \ge 1$. Let E = F[x]/(p(x)). Then (E : F) = n.
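Before turning to the proof, the statement can be seen concretely over a finite field, where dimension n translates into a count of $|F|^n$ elements. The sketch below assumes the field $\mathbb{Z}_3$ of integers mod 3 and the polynomial $p(x) = x^2 + 1$, which is irreducible mod 3 because −1 is not a square mod 3. Both choices, and the function names, are illustrative and not taken from the notes.

```python
from itertools import product

P = 3                      # work over the field Z_3 (illustrative choice)
MOD = (1, 0, 1)            # p(x) = 1 + 0x + x^2, lowest degree first, monic

def reduce_mod(f):
    """Remainder of f on division by the monic polynomial MOD, over Z_P."""
    f = [c % P for c in f]
    n = len(MOD) - 1
    for k in range(len(f) - 1, n - 1, -1):      # eliminate x^k for k >= n
        c = f[k]
        for j, m in enumerate(MOD):
            f[k - n + j] = (f[k - n + j] - c * m) % P
    r = f[:n]
    return tuple(r + [0] * (n - len(r)))

def mul(u, v):
    prod = [0] * (len(u) + len(v) - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            prod[i + j] += a * b
    return reduce_mod(prod)

# Reduce every polynomial of degree <= 4 over Z_3 and collect the cosets.
cosets = {reduce_mod(f) for f in product(range(P), repeat=5)}
print(len(cosets))         # 9 = 3^2: each coset is a unique a + b[x]

# Since p(x) is irreducible, every non-zero coset is invertible (Theorem 8.1).
zero, one = (0, 0), (1, 0)
assert all(any(mul(u, w) == one for w in cosets) for u in cosets - {zero})
```

Each coset is represented by exactly one pair (a, b), that is, by exactly one combination a[1] + b[x], which is what it means for {[1], [x]} to be a basis.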

Proof. If $f(x) \in F[x]$, we let $[f(x)] := f(x) + (p(x)) \in E$. By the Division Algorithm, there exist polynomials q(x) and $r(x) \in F[x]$, with either r(x) = 0 or r(x) of smaller degree than n, such that

$$f(x) = q(x)p(x) + r(x).$$

Thus

$$[f(x)] = [q(x)p(x) + r(x)] = [q(x)][p(x)] + [r(x)] = [q(x)][0] + [r(x)] = [r(x)].$$

Set

$$r(x) = a_0 + a_1 x + \cdots + a_{n-1}x^{n-1} \in F[x].$$

Being a bit sloppy, we shall assume that $F \subseteq E$ and denote by a the coset $a + (p(x)) \in E$ for any $a \in F$. Thus

$$[r(x)] = a_0 \cdot [1] + a_1 \cdot [x] + \cdots + a_{n-1} \cdot [x^{n-1}] \in E.$$

In other words, the set

$$B := \{[1], [x], \ldots, [x^{n-1}]\}$$

is a spanning set for E as an F-vector space. We claim that B is also an F-linearly independent set. For, suppose that

$$c_0 \cdot [1] + c_1 \cdot [x] + \cdots + c_{n-1} \cdot [x^{n-1}] = [0] \in E,$$

for some $c_0, c_1, \ldots, c_{n-1} \in F$. Then

$$[c_0 + c_1 x + \cdots + c_{n-1}x^{n-1}] = [0].$$

Setting $h(x) = c_0 + c_1 x + \cdots + c_{n-1}x^{n-1} \in F[x]$, we conclude that h(x) is a multiple of p(x). However, either h(x) is the zero polynomial or $\deg(h(x)) \le n - 1 < n = \deg(p(x))$. Hence $h(x) \equiv 0$, i.e.

$$c_0 = c_1 = \cdots = c_{n-1} = 0.$$

Thus B is indeed an F-linearly independent set. So B is an F-basis for E, whence (E : F) = n, as claimed.

⇤As a corollary, we obtain the following important fact.

Corollary 8.8. Let $E$ be a field and let $F$ be a subfield of $E$. Let $\alpha \in E$ and suppose that $m(x) \in F[x]$ is the minimum polynomial of $\alpha$ in $F[x]$. If the degree of $m(x)$ is $n$, then $(F(\alpha) : F) = n$.

Proof. $F(\alpha) \cong F[x]/(m(x))$, so the claim follows from Theorem 8.7. ∎

Note that not every number has a minimum polynomial. For example, if $F = \mathbb{Q}$ and $E = \mathbb{R}$, then $\pi \in \mathbb{R}$ and $\pi$ is not the root of any polynomial equation with rational coefficients. [This is a fairly difficult theorem to prove. It was first proved by Lindemann.] We say that a number $\alpha$ is algebraic over $F$ if $\alpha$ is the root of a polynomial equation with coefficients in $F$. Otherwise, we say that $\alpha$ is transcendental over $F$. We shall restrict our attention to algebraic numbers. We have the following converse to Corollary 8.8.

Theorem 8.9. Let $E$ be an extension field of $F$ with $(E : F) = n < \infty$. Then every element of $E$ is algebraic over $F$.

Proof. Let $\alpha \in E$. Consider the set

$$S := \{1, \alpha, \alpha^2, \ldots, \alpha^n\}.$$

Since $|S| = n + 1$ and $\dim_F(E) = n$, $S$ is a linearly dependent set. Hence there exist numbers $c_0, c_1, \ldots, c_n \in F$, not all 0, such that

$$c_0 + c_1\alpha + \cdots + c_n\alpha^n = 0.$$

Let $p(x) = c_n x^n + \cdots + c_1 x + c_0 \in F[x]$. Then $p(\alpha) = 0$. Hence $\alpha$ is algebraic over $F$, as claimed. ∎
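Theorems 8.7 and 8.9 can be made concrete with a small computation. The following Python sketch works in $E = \mathbb{Q}[x]/(x^3 - 2)$, stores each coset by its coordinates in the basis $\{[1], [x], [x^2]\}$ of Theorem 8.7, and then finds a linear dependence among $1, \alpha, \alpha^2, \alpha^3$ for $\alpha = [1 + x]$, as in the proof of Theorem 8.9. The choice of $p(x)$ and $\alpha$, and the helper names `mul_mod` and `dependence`, are illustrative assumptions and not part of the notes.

```python
from fractions import Fraction

def mul_mod(a, b, p):
    """Multiply cosets a, b (coordinate lists of length n, lowest degree first)
    in F[x]/(p(x)), where p is monic of degree n (lowest degree first)."""
    n = len(p) - 1
    prod = [Fraction(0)] * (2 * n - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] += Fraction(ai) * Fraction(bj)
    # Use x^n = -(p_0 + p_1 x + ... + p_{n-1} x^{n-1}) to remove high powers.
    for k in range(2 * n - 2, n - 1, -1):
        c = prod[k]
        for i in range(n):
            prod[k - n + i] -= c * p[i]
        prod[k] = Fraction(0)
    return prod[:n]

def dependence(vectors):
    """Return c (not all zero) with sum_k c[k] * vectors[k] = 0.
    Assumes there are more vectors than coordinates, so a dependence exists."""
    rows, cols = len(vectors[0]), len(vectors)
    M = [[Fraction(vectors[j][i]) for j in range(cols)] for i in range(rows)]
    pivots, r = [], 0
    for col in range(cols):                      # Gauss-Jordan elimination
        pr = next((i for i in range(r, rows) if M[i][col] != 0), None)
        if pr is None:
            continue
        M[r], M[pr] = M[pr], M[r]
        pivot = M[r][col]
        M[r] = [v / pivot for v in M[r]]
        for i in range(rows):
            if i != r and M[i][col] != 0:
                factor = M[i][col]
                M[i] = [vi - factor * vr for vi, vr in zip(M[i], M[r])]
        pivots.append(col)
        r += 1
        if r == rows:
            break
    free = next(c for c in range(cols) if c not in pivots)
    coeffs = [Fraction(0)] * cols
    coeffs[free] = Fraction(1)
    for row, pc in enumerate(pivots):
        coeffs[pc] = -M[row][free]
    return coeffs

p = [-2, 0, 0, 1]                 # p(x) = x^3 - 2 (an assumed example)
alpha = [1, 1, 0]                 # alpha = [1 + x]
powers = [[Fraction(1), Fraction(0), Fraction(0)]]
for _ in range(3):
    powers.append(mul_mod(powers[-1], alpha, p))
coeffs = dependence(powers)       # c_0, c_1, c_2, c_3 with sum c_k alpha^k = 0
print(coeffs)
```

The resulting coefficients describe a polynomial in $\mathbb{Q}[x]$ having $\alpha$ as a root; by hand, $\alpha - 1 = [x]$ gives $(\alpha - 1)^3 - 2 = 0$.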

Exercises

1. Describe the multiplication in the ring $F[x]/(x^2)$. Is this a field? What type of element is $[x]$?

2. Describe the multiplication in the ring $\mathbb{Q}[x]/(x^2 - x)$. Is this a field? What type of element is $[x]$?

3a. Let $p(x) = g(x)h(x) \in \mathbb{Q}[x]$ with $g(x)$ and $h(x)$ non-constant irreducible polynomials with $\gcd(g(x), h(x)) = 1$. Prove: $\mathbb{Q}[x]/(p(x)) \cong F_1 \oplus F_2$, with $F_1$ and $F_2$ extension fields of $\mathbb{Q}$. [Hint: Use the Chinese Remainder Theorem from last semester.]

b. Let $p(x) = g(x)h(x) \in \mathbb{Q}[x]$ with $g(x)$ and $h(x)$ non-constant irreducible polynomials. Prove: $\mathbb{Q}[x]/(p(x))$ is not a field, but also it has no non-zero nilpotent elements.

c. (Bonus) Give necessary and sufficient conditions on a polynomial $p(x) \in \mathbb{Q}[x]$ for the ring $\mathbb{Q}[x]/(p(x))$ to contain non-zero nilpotent elements.

4a. Describe the multiplication in the ring $\mathbb{Q}[x]/(x^2 + x + 1)$. Is this a field? What is the multiplicative inverse of $[x]$?

b. Let $\omega = -\frac{1}{2} + \frac{\sqrt{3}}{2} i \in \mathbb{C}$. Let

$$E = \{a + b\omega + c\omega^2 : a, b, c \in \mathbb{Q}\} \subseteq \mathbb{C}.$$

Prove: $E$ is closed under addition, subtraction, multiplication, and division (by non-zero elements).

c. Let $E$ be as in (b). Prove: $E \cong \mathbb{Q}[x]/(x^2 + x + 1)$.

5. Prove Lemma 8.4.

6. Prove Lemma 8.6.

7. Prove: Let $F$ be a field. Let $F_0$ be the intersection of all subfields of $F$. Then $F_0$ is a subfield of $F$. [Hence $F_0$ is the unique smallest subfield of $F$.]

The next few exercises relate to finite fields. These were first described by Galois, and are sometimes called Galois fields.

8a. Prove: Let $E$ be a finite field. Let $F$ be the smallest subfield of $E$, i.e.,

$$F = \{0_E, 1_E, 1_E + 1_E, \ldots\}.$$

Then $F \cong \mathbb{Z}/p\mathbb{Z}$ for some prime $p$. [Recall: $p$ is the characteristic of the field $E$.]

b. Prove: Let $E$ and $F$ be as in (a). Then $E$ is a finite-dimensional $F$-vector space. In particular, $|E| = p^n$ for some $n \in \mathbb{N}$.

c. Prove: Let $E$ be a finite field with $|E| = p^n$. Then $x^{p^n} = x$ for all $x \in E$. [Hint: $E - \{0\}$ is a finite group under multiplication. Apply Lagrange's Theorem.]

d. Prove: Let $E$ be a field of characteristic $p$. Let $a, b \in E$. Then

$$(a + b)^{p^n} = a^{p^n} + b^{p^n}$$

for all $n \in \mathbb{N}$.

9. Let $F = \mathbb{Z}/p\mathbb{Z}$. Let $f(x) = x^{p^n} - x \in F[x]$. Let $E$ be a splitting field for $f(x)$ over $F$.

a. Prove: $f(x)$ has $p^n$ distinct roots in $E$. [Hint: Suppose on the contrary that $f(x) = (x - a)^2 \cdot g(x) \in E[x]$. Compute $f'(x)$ in two different ways to get a contradiction.]

b. Let $f(x)$ be as above. Let $S := \{a \in E : f(a) = 0\}$. Prove: $S$ is closed under addition, subtraction, multiplication, and division (by non-zero elements), i.e., $S$ is a subfield of $E$.

c. Let $S$ be as in (b). Prove: $S = E$, i.e. $|E| = p^n$.

10. Prove: If $E$ and $E_1$ are two fields with $|E| = p^n = |E_1|$, then $E \cong E_1$. [Thus, up to isomorphism, there is one and only one field of cardinality $p^n$ for each prime $p$ and each $n \in \mathbb{N}$.]

11. Prove: Let $E$ be a field of characteristic $p$. Define $f : E \to E$ by $f(x) = x^p$ for all $x \in E$. Then $f$ is an injective ring homomorphism. In particular, if $|E| = p^n$ for some $n \in \mathbb{N}$, then $f$ is an automorphism of $E$.

9. Symmetric Polynomials and the Fundamental Theorem of Algebra

Symmetry is of course a ubiquitous topic in Euclidean geometry. But geometric symmetry did not lead to the development of group theory. Instead, that had to wait 2000 years until the problem of finding the roots of polynomial equations pushed mathematicians to develop the theory of symmetric polynomials. In this chapter we shall develop some basic properties of symmetric polynomials and apply them to give Euler's proof of the Fundamental Theorem of Algebra.

Definition 9.1. A polynomial in $n$ commuting variables $f(r_1, r_2, \ldots, r_n)$ is called a symmetric polynomial in these variables if, for every $\sigma \in S_n$,

$$f(r_{\sigma(1)}, r_{\sigma(2)}, \ldots, r_{\sigma(n)}) = f(r_1, r_2, \ldots, r_n).$$

The concept of a symmetric polynomial is not interesting when $n = 1$. For $n = 2$, some examples are:

$$r_1 + r_2, \qquad r_1 r_2, \qquad r_1^2 + r_2^2 + r_1 r_2, \qquad r_1^3 + r_2^3.$$

For $n = 3$, another type of example is:

$$r_1 r_2 + r_1 r_3 + r_2 r_3.$$
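Definition 9.1 can be checked mechanically for small $n$ by trying every permutation. The following Python sketch represents a polynomial as a dictionary from exponent vectors $(e_1, \ldots, e_n)$ to integer coefficients; this representation and the names `permute` and `is_symmetric` are assumptions made for illustration, and the variables are indexed from 0 inside the code.

```python
from itertools import permutations

def permute(poly, sigma):
    """Apply sigma to the variables: the exponent of variable i
    becomes the exponent of variable sigma[i]."""
    out = {}
    for exps, coeff in poly.items():
        new = [0] * len(exps)
        for i, e in enumerate(exps):
            new[sigma[i]] = e
        out[tuple(new)] = coeff
    return out

def is_symmetric(poly):
    """True if every permutation of the variables leaves poly unchanged."""
    n = len(next(iter(poly)))
    return all(permute(poly, s) == poly for s in permutations(range(n)))

# The n = 2 examples from the text, plus one non-example.
f = {(2, 0): 1, (0, 2): 1, (1, 1): 1}            # r1^2 + r2^2 + r1*r2
g = {(3, 0): 1, (0, 3): 1}                       # r1^3 + r2^3
h = {(2, 1): 1}                                  # r1^2 * r2 (not symmetric)
# The n = 3 example from the text.
k = {(1, 1, 0): 1, (1, 0, 1): 1, (0, 1, 1): 1}   # r1r2 + r1r3 + r2r3

print(is_symmetric(f), is_symmetric(g), is_symmetric(h), is_symmetric(k))
```

Testing all $n!$ permutations is only practical for small $n$; it is used here because it mirrors the definition directly.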

We shall restrict our attention to symmetric polynomials with integer coefficients. We leave it as an exercise to prove the following theorem.

Theorem 9.2. Let $S$ be the set of all symmetric polynomials in the variables $r_1, r_2, \ldots, r_n$ with integer coefficients. Then $S$ is a subring of $\mathbb{Z}[r_1, r_2, \ldots, r_n]$, i.e. $S$ contains 0 and 1, and $S$ is closed under addition, subtraction, and multiplication.

Now we are really interested in polynomials $p(x)$ in one variable. The context in which these multivariable polynomials arise is the following:

Suppose that $p(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 \in F[x]$ is a monic polynomial having roots $r_1, r_2, \ldots, r_n$ in some splitting field $E$ containing $F$. Then

$$p(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 = (x - r_1)(x - r_2) \cdots (x - r_n) \in E[x].$$

Equating coefficients, we get $n$ formulas of the type:

$$-a_{n-1} = r_1 + r_2 + \cdots + r_n,$$

$$a_{n-2} = r_1 r_2 + r_1 r_3 + \cdots + r_{n-1} r_n,$$

$$\ldots,$$

$$(-1)^n a_0 = r_1 r_2 \cdots r_n.$$

Each of the expressions on the right hand side can be thought of as a polynomial in the ring $\mathbb{Z}[r_1, r_2, \ldots, r_n]$. In fact, each lies in the subring of symmetric polynomials, since obviously $p(x)$ is unchanged by any permutation in the ordering of the linear factors. Indeed, these $n$ polynomials are called the elementary symmetric polynomials in $\mathbb{Z}[r_1, r_2, \ldots, r_n]$:

$$s_1 = r_1 + r_2 + \cdots + r_n,$$

$$s_2 = \sum_{i < j} r_i r_j,$$

$$s_3 = \sum_{i < j < k} r_i r_j r_k,$$

$$\ldots$$

$$s_n = r_1 r_2 \cdots r_n.$$
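The relation between the coefficients of $p(x)$ and the elementary symmetric polynomials can be checked numerically: the coefficient of $x^{n-k}$ in $(x - r_1) \cdots (x - r_n)$ is $(-1)^k s_k$. The following Python sketch does this for one set of sample integer roots; the roots and the helper names are assumptions chosen for illustration.

```python
from itertools import combinations
from math import prod

def elementary_symmetric(roots, k):
    """s_k: the sum of all products of k distinct roots."""
    return sum(prod(c) for c in combinations(roots, k))

def expand(roots):
    """Coefficients of (x - r_1)...(x - r_n), highest degree first."""
    coeffs = [1]
    for r in roots:
        shifted = coeffs + [0]            # multiply the current polynomial by x
        for i, c in enumerate(coeffs):
            shifted[i + 1] -= r * c       # subtract r times the current polynomial
        coeffs = shifted
    return coeffs

roots = [2, -1, 3, 5]                     # sample roots (an assumption)
n = len(roots)
coeffs = expand(roots)                    # [1, a_{n-1}, ..., a_1, a_0]
for k in range(1, n + 1):
    # The coefficient of x^(n-k) equals (-1)^k * s_k.
    assert coeffs[k] == (-1) ** k * elementary_symmetric(roots, k)
print(coeffs)
```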

The following fundamental theorem was probably known to Isaac Newton, but was first explicitly proved somewhat later by Edward Waring.

Waring's Theorem. Let $S$ be the subring of $\mathbb{Z}[r_1, r_2, \ldots, r_n]$ consisting of all symmetric polynomials in the variables $r_1, r_2, \ldots, r_n$ with integer coefficients. Then

$$S = \mathbb{Z}[s_1, s_2, \ldots, s_n].$$

The proof of Waring's Theorem amounts to an algorithm for the following:

Given a symmetric polynomial $f(r_1, r_2, \ldots, r_n)$, rewrite this polynomial as

$$f(r_1, r_2, \ldots, r_n) = F(s_1, s_2, \ldots, s_n),$$

for some polynomial $F$ with integer coefficients, depending of course on $f$.

Rather than describe the algorithm in full gory generality, let's look at an illustrative example in three variables. To save ink, let's call the variables $r$, $s$, and $t$, instead of $r_1, r_2, r_3$.

If you want to cook up an example of a symmetric polynomial in three variables, you can symmetrize any monomial by adding up all of its possible permutations. For example, starting with the monomial

$$m(r, s, t) = r^2 s,$$

we get the symmetrized polynomial

$$p(r, s, t) = r^2 s + r^2 t + s^2 r + s^2 t + t^2 r + t^2 s.$$

An important feature of any algorithm is to have a way of measuring whether you are making steady progress in the correct direction, or just going around in circles. To do this, we choose a way to say that a symmetric polynomial $p(r, s, t)$ is bigger than some other symmetric polynomial $q(r, s, t)$. Then we will look for an algorithm that makes our polynomial smaller and smaller.

For a monomial $m(r, s, t) = a r^i s^j t^k$ (with $a \in \mathbb{Z}$), we call $(i, j, k)$ its degree vector. Thus, the monomial $m(r, s, t) = r^2 s$ has degree vector $(2, 1, 0)$. We order the degree vectors lexicographically reading from left to right. Thus

$$(3, 0, 0) > (2, 1, 1) > (2, 1, 0) > (2, 0, 7) > (0, 3, 5),$$

for example, i.e.,

$$r^3 > r^2 s t > r^2 s > r^2 t^7 > s^3 t^5.$$

Here is a crucial point: If $f(r, s, t)$ is a symmetric polynomial containing a monomial with degree vector $(i, j, k)$, then by symmetry, $f$ must also contain monomials whose degree vectors are all possible permutations of $(i, j, k)$. In one of these, we must have $i \ge j \ge k$. In particular, the highest term of any symmetric polynomial $f(r, s, t)$ has degree vector $(a, b, c)$ with $a \ge b \ge c$.

Here is another crucial point: The highest terms of the elementary symmetric polynomials in $r, s, t$ have degree vectors as follows:

If $s_1 = r + s + t$, degree vector is $(1, 0, 0)$.

If $s_2 = rs + rt + st$, degree vector is $(1, 1, 0)$.

If $s_3 = rst$, degree vector is $(1, 1, 1)$.

When we multiply monomials, we add their degree vectors. Hence if $i \ge j \ge k$, then

$$s_1^{i-j} s_2^{j-k} s_3^{k} \text{ has degree vector } (i, j, k).$$

For example, if we want a symmetric polynomial with degree vector $(2, 1, 0)$, then we note that

$$(2, 1, 0) = (1, 0, 0) + (1, 1, 0),$$

and so $s_1 s_2$ should do the job. Let's check:

$$s_1 s_2 = (r + s + t)(rs + rt + st) = r^2 s + r^2 t + r t^2 + s^2 t + r s^2 + s t^2 + 3rst.$$

Thus, indeed, the highest monomial term of $s_1 s_2$ is $r^2 s$, which has degree vector $(2, 1, 0)$.

Now let's go back to our polynomial

$$p(r, s, t) = r^2 s + r^2 t + s^2 r + s^2 t + t^2 r + t^2 s.$$

If we let $q(r, s, t) = p(r, s, t) - s_1 s_2$, then we have succeeded in canceling the term of highest degree out of $p(r, s, t)$. Thus $q(r, s, t) = -3rst$ has highest degree vector $(1, 1, 1) < (2, 1, 0)$. So we are making progress. In fact, in this case we are done, since $q(r, s, t) = -3 s_3$. So we have

$$p(r, s, t) = s_1 s_2 + q(r, s, t) = s_1 s_2 - 3 s_3 \in \mathbb{Z}[s_1, s_2, s_3],$$

as desired.

Of course, in general, the procedure takes much longer, but clearly we keep simplifying our problem. So like the Euclidean algorithm or the Gaussian elimination algorithm, we must eventually succeed. You will be asked to try a few harder examples in the homework exercises.
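The reduction procedure described above can be sketched in Python for any number of variables. Polynomials are stored as dictionaries from degree vectors to integer coefficients, so that the built-in tuple comparison is exactly the lexicographic order on degree vectors. This is a minimal sketch under those representation assumptions; the function names are hypothetical, and the code assumes its input really is symmetric.

```python
from collections import defaultdict
from itertools import combinations

def poly_mul(f, g):
    out = defaultdict(int)
    for ef, cf in f.items():
        for eg, cg in g.items():
            out[tuple(a + b for a, b in zip(ef, eg))] += cf * cg
    return {e: c for e, c in out.items() if c != 0}

def poly_sub(f, g):
    out = defaultdict(int, f)
    for e, c in g.items():
        out[e] -= c
    return {e: c for e, c in out.items() if c != 0}

def elementary(n, k):
    """k-th elementary symmetric polynomial in n variables."""
    return {tuple(1 if i in chosen else 0 for i in range(n)): 1
            for chosen in combinations(range(n), k)}

def poly_pow(f, m, n):
    result = {(0,) * n: 1}
    for _ in range(m):
        result = poly_mul(result, f)
    return result

def waring(f, n):
    """Rewrite symmetric f as a polynomial in s_1, ..., s_n.
    Returns {(e_1, ..., e_n): coeff}, meaning sum of coeff * s_1^e_1 ... s_n^e_n."""
    result = {}
    while f:
        lead = max(f)                       # highest degree vector (lex order)
        coeff = f[lead]
        # Exponents of s_1, ..., s_n: successive differences of the degree vector.
        exps = tuple(lead[i] - (lead[i + 1] if i + 1 < n else 0) for i in range(n))
        if min(exps) < 0:
            raise ValueError("input is not a symmetric polynomial")
        term = {(0,) * n: coeff}
        for k, e in enumerate(exps, start=1):
            term = poly_mul(term, poly_pow(elementary(n, k), e, n))
        f = poly_sub(f, term)               # cancels the highest term of f
        result[exps] = result.get(exps, 0) + coeff
    return result

# The worked example: p = r^2 s + r^2 t + s^2 r + s^2 t + t^2 r + t^2 s.
p = {(2, 1, 0): 1, (2, 0, 1): 1, (1, 2, 0): 1,
     (0, 2, 1): 1, (1, 0, 2): 1, (0, 1, 2): 1}
print(waring(p, 3))
```

On the worked example the output encodes $s_1 s_2 - 3 s_3$, in agreement with the computation in the text. Each pass removes the current highest term and introduces only lexicographically smaller ones, which is the progress measure discussed above.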

Here is the way that we will apply Waring's Theorem to prove the Fundamental Theorem of Algebra.

Lemma 9.3. Let $p(x) \in \mathbb{R}[x]$ be a monic polynomial of degree $n$. Let $E$ be a splitting field for $p(x)$ containing $\mathbb{R}$. Denote by $r_1, r_2, \ldots, r_n$ the roots of $p(x)$ in $E$. Let $c \in \mathbb{N}$ and consider the $\binom{n}{2}$ numbers $a_{ij} := r_i + r_j + c\, r_i r_j$ where $1 \le i < j \le n$. Let $f(x) \in E[x]$ be the polynomial

$$f(x) = \prod_{1 \le i < j \le n} (x - a_{ij})$$

of degree $\binom{n}{2}$. Then $f(x) \in \mathbb{R}[x]$, i.e., all of the coefficients of $f(x)$ are real numbers.

Proof. Let $A = \{a_{ij} : 1 \le i, j \le n,\ i \ne j\}$. Let $S_n$ act on $A$ by:

$$\sigma(a_{ij}) = a_{\sigma(i)\sigma(j)} \quad \text{for all } \sigma \in S_n.$$

Then $S_n$ permutes the elements of $A$. Hence each of the $\binom{n}{2}$ elementary symmetric polynomials in the elements of $A$:

$$E_1 := a_{12} + a_{13} + \cdots + a_{n-1,n},$$

$$E_2 := \sum a_{ij} a_{kl} \quad \text{(the sum over all unordered pairs of distinct elements of } A\text{)},$$

$$\ldots$$

$$E_{\binom{n}{2}} := a_{12} \cdot a_{13} \cdots a_{n-1,n},$$

is fixed by every permutation in $S_n$. Thus, if we view each $E_i$ as a polynomial in $\mathbb{Z}[r_1, r_2, \ldots, r_n]$, then in fact each $E_i$ lies in the subring $S$ of all symmetric polynomials in the variables $r_1, r_2, \ldots, r_n$. By Waring's Theorem, $S = \mathbb{Z}[s_1, s_2, \ldots, s_n]$, where

$$s_1 = r_1 + r_2 + \cdots + r_n,$$

$$s_2 = \sum_{i < j} r_i r_j,$$

$$\ldots$$

$$s_n = r_1 r_2 \cdots r_n.$$

But then, for each $i$, either $s_i$ or $-s_i$ is a coefficient of the polynomial $p(x) \in \mathbb{R}[x]$, i.e., each $s_i$ is a real number. Hence $\mathbb{Z}[s_1, s_2, \ldots, s_n]$ is a subring of $\mathbb{R}$. It follows that each $E_i$ is a real number. But

$$f(x) = x^{\binom{n}{2}} - E_1 x^{\binom{n}{2} - 1} + \cdots \pm E_{\binom{n}{2}}.$$

Hence $f(x) \in \mathbb{R}[x]$, as claimed. ∎

We are now ready to attack the Fundamental Theorem of Algebra in the manner of Euler.
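Before moving on, Lemma 9.3 can be sanity-checked numerically on a small case. The following Python sketch takes $p(x) = x^3 - 1$, whose roots are $1, \omega, \omega^2$, forms the numbers $a_{ij}$ with $c = 1$, and multiplies out $f(x)$; a hand computation gives $f(x) = x^3 + 3x$. The choice of $p(x)$ and $c$ is an assumption for illustration, and because floating-point arithmetic is used the imaginary parts vanish only up to rounding.

```python
import cmath
from itertools import combinations

def poly_from_roots(roots):
    """Coefficients of prod (x - root), highest degree first."""
    coeffs = [1 + 0j]
    for r in roots:
        shifted = coeffs + [0j]           # multiply by x
        for i, c in enumerate(coeffs):
            shifted[i + 1] -= r * c       # subtract r times the old polynomial
        coeffs = shifted
    return coeffs

# Roots of p(x) = x^3 - 1 in C (two of them are not real).
roots = [cmath.exp(2j * cmath.pi * k / 3) for k in range(3)]
c = 1
a = [ri + rj + c * ri * rj for ri, rj in combinations(roots, 2)]
f = poly_from_roots(a)

max_imag = max(abs(z.imag) for z in f)    # should be at rounding-error level
print([round(z.real, 9) for z in f], max_imag)
```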

Fundamental Theorem of Algebra. Let $p(x) \in \mathbb{C}[x]$ be a polynomial of degree $n \ge 1$. Then there exist complex numbers $c, r_1, r_2, \ldots, r_n$ such that

$$p(x) = c(x - r_1)(x - r_2) \cdots (x - r_n) \in \mathbb{C}[x].$$

We begin with a few easy reductions. We shall refer to the Fundamental Theorem of Algebra as FTA.

Lemma 9.4. FTA is true provided that the following statement is true:

(*) Let $f(x)$ be any monic polynomial in $\mathbb{C}[x]$ of degree $n \ge 1$. Then there is at least one complex number $r$ with $f(r) = 0$.

Proof. Assuming the statement above, we shall prove FTA by induction on the degree $n$ of $p(x)$. If $n = 1$, then $p(x) = ax + b$ for some $a, b \in \mathbb{C}$, and so

$$p(x) = a\left(x - \left(-\frac{b}{a}\right)\right).$$

Thus we are done, taking $c = a$ and $r_1 = -\frac{b}{a}$.

Now suppose FTA is true for polynomials of degree $n$, and let $p(x) \in \mathbb{C}[x]$ have degree $n + 1$. Write

$$p(x) = a_{n+1}x^{n+1} + a_n x^n + \cdots + a_1 x + a_0.$$

Let $b_i = \frac{a_i}{a_{n+1}}$ for $0 \le i \le n$. Then $p(x) = a_{n+1} f(x)$, where $f(x) \in \mathbb{C}[x]$ is the monic polynomial

$$f(x) = x^{n+1} + b_n x^n + \cdots + b_1 x + b_0.$$

By assertion (*), there exists a complex number $r$ with $f(r) = 0$. Then by Descartes' Factor Theorem,

$$f(x) = (x - r)g(x),$$

where $g(x)$ is a monic polynomial in $\mathbb{C}[x]$ of degree $n$. By induction, there exist complex numbers $r_1, r_2, \ldots, r_n$ with

$$g(x) = (x - r_1)(x - r_2) \cdots (x - r_n).$$

Thus

$$p(x) = a_{n+1} f(x) = a_{n+1}(x - r)g(x) = a_{n+1}(x - r)(x - r_1)(x - r_2) \cdots (x - r_n).$$

Thus $p(x)$ has a factorization as claimed in FTA. Hence FTA is true, provided that (*) is true. ∎

The next step is to reduce (*) to the case when $f(x)$ has real coefficients.

Lemma 9.5. (*) is true if and only if the following statement is true:

(**) Let $f(x)$ be any monic polynomial in $\mathbb{R}[x]$ of degree $n \ge 1$. Then there is at least one complex number $r$ with $f(r) = 0$.

Proof. It is clear that (*) implies (**). Suppose now that (**) is true, and let $f(x)$ be a monic polynomial in $\mathbb{C}[x]$ of degree $n \ge 1$. Write

$$f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0.$$

Define the conjugate polynomial $\bar{f}(x)$ by

$$\bar{f}(x) = x^n + \bar{a}_{n-1}x^{n-1} + \cdots + \bar{a}_1 x + \bar{a}_0.$$

Let $g(x) = f(x) \cdot \bar{f}(x)$. Then

$$\bar{g}(x) = \overline{f(x) \cdot \bar{f}(x)} = \bar{f}(x) \cdot \bar{\bar{f}}(x) = \bar{f}(x) \cdot f(x) = g(x).$$

Thus $g(x) \in \mathbb{R}[x]$. Hence by (**), there exists $r \in \mathbb{C}$ with $g(r) = 0$. Thus

$$0 = g(r) = f(r) \cdot \bar{f}(r).$$

Since $\mathbb{C}$ is a domain, either $f(r) = 0$ or $\bar{f}(r) = 0$. If $f(r) = 0$, then (*) holds and we are done. Suppose, then, that $\bar{f}(r) = 0$, i.e.

$$r^n + \bar{a}_{n-1}r^{n-1} + \cdots + \bar{a}_1 r + \bar{a}_0 = 0.$$

Taking complex conjugates of both sides, we get

$$\bar{r}^n + a_{n-1}\bar{r}^{n-1} + \cdots + a_1 \bar{r} + a_0 = 0.$$

Hence $f(\bar{r}) = 0$, and again (*) holds, and we are done in this case as well. ∎

Finally, we have reached the heart of the problem. We must show that every non-constant monic polynomial with real coefficients has at least one complex root.
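The key step in the reduction of Lemma 9.5 is that the product of a polynomial with its conjugate polynomial has real coefficients. A short Python check on one example, $f(x) = x^2 + ix + (1 + i)$, is sketched below; the example polynomial and helper names are assumptions for illustration.

```python
def poly_mul(f, g):
    """Multiply two polynomials given as coefficient lists (highest degree first)."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def conjugate_poly(f):
    """Conjugate every coefficient of f."""
    return [complex(a).conjugate() for a in f]

f = [1, 1j, 1 + 1j]                    # f(x) = x^2 + i x + (1 + i)
g = poly_mul(f, conjugate_poly(f))     # g = f * conj(f)
print(g)
assert all(complex(c).imag == 0 for c in g)
```

Here the coefficients are Gaussian integers, so the arithmetic is exact and the imaginary parts of $g$ are exactly zero.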

It is impossible to give a purely algebraic proof of the Fundamental Theorem of Algebra, because the real number field is not a purely algebraic object. Its construction depends on taking limits of Cauchy sequences or finding least upper bounds of infinite sets, both of which are analytic constructions. Euler uses calculus in his proof via the Intermediate Value Theorem, of which we state (without proof) the following special case.

Intermediate Value Theorem. Let $f(x) \in \mathbb{R}[x]$ be a polynomial. Suppose that there exist real numbers $a$ and $b$ such that $f(a) \le 0$ and $f(b) \ge 0$. Then there exists a real number $c$ (between $a$ and $b$) such that $f(c) = 0$.

The application of the Intermediate Value Theorem which we need is the following corollary.

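Corollary 9.6. Let $f(x)$ be a monic polynomial in $\mathbb{R}[x]$ of odd degree. Then there is at least one real number $r$ with $f(r) = 0$.

Idea of Proof. Let $f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 \in \mathbb{R}[x]$. Then

$$f(x) = x^n \cdot \left(1 + a_{n-1}\frac{1}{x} + \cdots + a_1 \frac{1}{x^{n-1}} + a_0 \frac{1}{x^n}\right).$$

Let $M = \max_{0 \le i < n} |a_i|$ and choose $x$ such that $|x| > \max(1, nM)$. Then $|x^k| \ge |x| > nM$ for all $k \ge 1$. So

$$\frac{|a_{n-k}|}{|x^k|} < \frac{M}{nM} = \frac{1}{n} \quad \text{for } 1 \le k \le n$$

(if $M = 0$, every $a_i$ is 0 and the bound holds trivially). Hence

$$\left| a_{n-1}\frac{1}{x} + \cdots + a_1 \frac{1}{x^{n-1}} + a_0 \frac{1}{x^n} \right| \le \frac{|a_{n-1}|}{|x|} + \cdots + \frac{|a_1|}{|x^{n-1}|} + \frac{|a_0|}{|x^n|} < \frac{1}{n} + \cdots + \frac{1}{n} + \frac{1}{n} = 1.$$

Hence

$$1 + a_{n-1}\frac{1}{x} + \cdots + a_1 \frac{1}{x^{n-1}} + a_0 \frac{1}{x^n} > 0.$$

Since $n$ is odd, $x^n$ has the same sign as $x$. Hence $f(x) > 0$ for all $x > \max(1, nM)$, and $f(x) < 0$ for all $x < \min(-1, -nM)$. It follows from the Intermediate Value Theorem that there exists at least one real number $r$ with $f(r) = 0$, as claimed. ∎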
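The estimate in this proof gives an explicit interval on which the sign of $f$ changes, so a real root can be located by bisection. The following Python sketch does this for a monic polynomial of odd degree; bisection is a stand-in for the Intermediate Value Theorem, not part of the notes, and the sample polynomial $x^3 - 2x - 5$ is an arbitrary choice.

```python
def evaluate(coeffs, x):
    """Horner evaluation; coeffs = [1, a_{n-1}, ..., a_1, a_0]."""
    value = 0.0
    for c in coeffs:
        value = value * x + c
    return value

def real_root_odd(coeffs, steps=200):
    """Approximate a real root of a monic polynomial of odd degree."""
    n = len(coeffs) - 1
    assert coeffs[0] == 1 and n % 2 == 1, "expects a monic polynomial of odd degree"
    M = max(abs(c) for c in coeffs[1:])
    bound = max(1.0, n * M) + 1.0      # any |x| > max(1, nM) works
    lo, hi = -bound, bound             # f(lo) < 0 < f(hi) by the estimate in the proof
    for _ in range(steps):
        mid = (lo + hi) / 2
        if evaluate(coeffs, mid) <= 0:
            lo = mid                   # keep the invariant f(lo) <= 0
        else:
            hi = mid                   # keep the invariant f(hi) > 0
    return (lo + hi) / 2

coeffs = [1, 0, -2, -5]                # x^3 - 2x - 5 (an illustrative choice)
root = real_root_odd(coeffs)
print(root, evaluate(coeffs, root))
```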

Now finally we arrive at Euler's brilliant idea. He would like to prove the Fundamental Theorem of Algebra by induction. But rather than trying to use the obvious induction on the degree of $f(x)$, he uses induction on the 2-part of the degree of $f(x)$.

Theorem 9.7. Let $f(x)$ be a monic polynomial in $\mathbb{R}[x]$ of degree $n > 0$. Then there is at least one complex number $r$ with $f(r) = 0$.

Euler's Proof. Write $n = 2^m \cdot n_1$, where $n_1$ is odd. The proof is by mathematical induction on $m$ for $m \ge 0$. If $m = 0$, then $n = n_1$ is odd, and we are done by Corollary 9.6. Hence we may assume that $m > 0$ and that the theorem is true for all monic polynomials in $\mathbb{R}[x]$ of degree $k = 2^{m-1} \cdot k_1$ with $k_1$ odd.

Let $E$ be a splitting field for $f(x)$ containing $\mathbb{C}$. Let $r_1, r_2, \ldots, r_n$ be the roots of $f(x)$ in $E$. Our goal is to show that at least one of these roots is in the subfield $\mathbb{C}$. We construct an infinite family of new polynomials $g_1(x), g_2(x), \ldots$, one for each natural number, by the following rule:

$$g_c(x) = \prod_{1 \le i < j \le n} \bigl(x - (r_i + r_j + c\, r_i r_j)\bigr).$$

By Lemma 9.3, $g_c(x) \in \mathbb{R}[x]$ for all $c \in \mathbb{N}$. Moreover, the degree of $g_c(x)$ is

$$\binom{n}{2} = \frac{n(n-1)}{2} = \frac{2^m \cdot n_1 \cdot (n-1)}{2} = 2^{m-1} \cdot n_1 (n-1),$$

with $n_1(n - 1)$ odd. Hence, by the inductive hypothesis, each $g_c(x)$ has at least one complex root. In other words, for each $c \in \mathbb{N}$, there exists a pair $\{i, j\}$ with $1 \le i \ne j \le n$, such that

$$r_i + r_j + c\, r_i r_j \in \mathbb{C}.$$

Since there are infinitely many natural numbers but only finitely many pairs $\{i, j\}$, it follows by the Pigeonhole Principle that there exist two different natural numbers $c$ and $d$ for which the same choice of $\{i, j\}$ yields a complex number. In other words, both

$$a := r_i + r_j + c\, r_i r_j \in \mathbb{C}$$

and

$$b := r_i + r_j + d\, r_i r_j \in \mathbb{C}.$$

Then

$$r_i r_j = \frac{a - b}{c - d} =: C \in \mathbb{C},$$

and

$$r_i + r_j = a - c\, r_i r_j = a - c\,\frac{a - b}{c - d} =: -B \in \mathbb{C}.$$

But then

$$q(x) := (x - r_i)(x - r_j) = x^2 - (r_i + r_j)x + r_i r_j = x^2 + Bx + C \in \mathbb{C}[x].$$

Now the Quadratic Formula is valid for polynomials with complex coefficients. Let $\delta$ denote one of the two complex square roots of $B^2 - 4C$, as given by DeMoivre's Formula. Then the roots of $q(x)$ are:

$$r_i = \frac{-B + \delta}{2} \in \mathbb{C} \quad \text{and} \quad r_j = \frac{-B - \delta}{2} \in \mathbb{C}.$$

Thus $r_i$ and $r_j$ are complex numbers which are roots of the polynomial $f(x)$, completing the proof. ∎
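Two computational steps of Euler's proof are easy to check in Python: the drop in the power of 2 when passing from $n$ to $\binom{n}{2}$, and the recovery of $r_i$ and $r_j$ from the two complex numbers $a$ and $b$. The following is a minimal sketch; the function names and the sample values of $r_i$, $r_j$, $c$, $d$ are assumptions for illustration.

```python
import cmath
from math import comb

def two_exponent(n):
    """Largest m such that 2^m divides n (for n > 0)."""
    m = 0
    while n % 2 == 0:
        n //= 2
        m += 1
    return m

# For even n, the degree binom(n, 2) of g_c has one fewer factor of 2 than n.
assert all(two_exponent(comb(n, 2)) == two_exponent(n) - 1
           for n in range(2, 200, 2))

def recover_pair(a, b, c, d):
    """Given a = ri + rj + c*ri*rj and b = ri + rj + d*ri*rj with c != d,
    return the two roots of x^2 + B x + C."""
    C = (a - b) / (c - d)              # ri * rj
    B = -(a - c * C)                   # -(ri + rj)
    delta = cmath.sqrt(B * B - 4 * C)  # one complex square root of B^2 - 4C
    return (-B + delta) / 2, (-B - delta) / 2

ri, rj = 1 + 2j, 3 - 1j                # sample values (assumptions)
a = ri + rj + 1 * ri * rj              # c = 1
b = ri + rj + 2 * ri * rj              # d = 2
x1, x2 = recover_pair(a, b, 1, 2)
print(x1, x2)
```

The two returned numbers agree with the chosen $r_i$ and $r_j$ up to ordering and rounding.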

And indeed, by combining Lemmas 9.4 and 9.5, Corollary 9.6, and Theorem 9.7, we have completed Euler's beautiful proof of the Fundamental Theorem of Algebra.

Exercises

1. Prove Theorem 9.2. [Hint: For $f = f(r_1, r_2, \ldots, r_n) \in \mathbb{Z}[r_1, r_2, \ldots, r_n]$ and for $\sigma \in S_n$, let $\sigma(f) := f(r_{\sigma(1)}, r_{\sigma(2)}, \ldots, r_{\sigma(n)})$. You may use the facts that $\sigma(f + g) = \sigma(f) + \sigma(g)$ and $\sigma(f \cdot g) = \sigma(f) \cdot \sigma(g)$.]

2a. Express $r^2 + s^2 + t^2$ as a polynomial in the elementary symmetric polynomials $s_1, s_2, s_3$.

b. Do the same for $r^3 + s^3 + t^3$.

3. Express $r_1^2 + r_2^2 + r_3^2 + r_4^2$ as a polynomial in the four elementary symmetric polynomials $s_i(r_1, r_2, r_3, r_4)$, $1 \le i \le 4$.

4a. Using the Fundamental Theorem of Algebra, prove that every polynomial in $\mathbb{R}[x]$ can be factored into a product of polynomials of degree 1 or 2, each in $\mathbb{R}[x]$.

b. Give necessary and sufficient conditions for a polynomial $p(x) = a_n x^n + \cdots + a_1 x + a_0 \in \mathbb{R}[x]$ to be irreducible.

5. Give an example of a quadratic polynomial in $\mathbb{Q}[x]$ which is irreducible in $\mathbb{Q}[x]$, but is not irreducible in $\mathbb{R}[x]$.

6a. Prove: Counting multiplicity, a polynomial of even degree in $\mathbb{R}[x]$ has an even number of real roots.

b. Give an example of a quartic polynomial $p(x)$ in $\mathbb{R}[x]$ having no real roots. Factor $p(x)$ as a product of two quadratic polynomials in $\mathbb{R}[x]$.

7. Let $f(r_1, r_2, \ldots, r_n) \in \mathbb{Q}[r_1, r_2, \ldots, r_n]$. We call $f(r_1, r_2, \ldots, r_n)$ an alternating polynomial if the $S_n$-orbit containing $f(r_1, r_2, \ldots, r_n)$ has cardinality 2, i.e., there exists a polynomial $g(r_1, r_2, \ldots, r_n) \in \mathbb{Q}[r_1, r_2, \ldots, r_n]$ such that, for every $\sigma \in S_n$, either

$$f(r_{\sigma(1)}, r_{\sigma(2)}, \ldots, r_{\sigma(n)}) = f(r_1, r_2, \ldots, r_n),$$

or

$$f(r_{\sigma(1)}, r_{\sigma(2)}, \ldots, r_{\sigma(n)}) = g(r_1, r_2, \ldots, r_n).$$

Define the discriminant polynomial $\Delta(r_1, r_2, \ldots, r_n)$ by

$$\Delta(r_1, r_2, \ldots, r_n) = \prod_{i < j} (r_i - r_j).$$

a. Prove: $\Delta(r_1, r_2, \ldots, r_n)$ is an alternating polynomial.

b. Prove: $\Delta^2(r_1, r_2, \ldots, r_n)$ is a symmetric polynomial.

c. Let $f(x) = x^2 + bx + c = (x - r_1)(x - r_2)$. Express $\Delta^2(r_1, r_2)$ in terms of $b$ and $c$.

8. Let $\Delta(r_1, r_2, \ldots, r_n)$ be the discriminant polynomial. We call a permutation $\sigma \in S_n$ a transposition if $\sigma = (i, j)$ for some $i, j$ with $1 \le i < j \le n$.

a. Express the $m$-cycle $(1, 2, \ldots, m) \in S_n$ as a product of transpositions.

b. Explain why every permutation in $S_n$ can be written as a product of transpositions.

c. Prove: If $\tau \in S_n$ is a transposition, then

$$\tau(\Delta(r_1, r_2, \ldots, r_n)) = -\Delta(r_1, r_2, \ldots, r_n).$$

d. Prove: Suppose $\sigma \in S_n$ and

$$\sigma = \tau_1 \cdot \tau_2 \cdots \tau_m = t_1 \cdot t_2 \cdots t_r,$$

where each $\tau_i$ and each $t_j$ is a transposition. Then $m$ is even if and only if $r$ is even. [Hint: Use (c).]

e. Let $A_n$ be the set of all permutations $\sigma \in S_n$ such that $\sigma$ is expressible as a product of an even number of transpositions. Prove: $A_n$ is a subgroup of $S_n$. [$A_n$ is called the alternating group on $n$ letters.]

f. Prove: $|A_n| = \frac{n!}{2}$.

10. The Cubic and Quartic Equations Revisited

No one was more deeply influenced by the work of Euler than his young contemporary, Joseph Louis Lagrange. During the period 1770–1772, while serving as court mathematician to Frederick the Great in Berlin, Lagrange undertook a deep study of the known methods for solving equations.

Let's begin with the very easy case of the quadratic equation:

$$x^2 + bx + c = 0.$$

If we call the roots $r$ and $s$, then we have

$$x^2 + bx + c = (x - r)(x - s) = x^2 - (r + s)x + rs.$$

So, we are given the symmetric numbers $r + s = -b$ and $r \cdot s = c$, and we want to find the asymmetric numbers $r$ and $s$. Since the set of all symmetric numbers is a ring, we can't escape from it by doing addition, subtraction or multiplication. The one thing which we are allowed to do which decreases symmetry in a controlled fashion is to take square roots, cube roots, etc. This is a first clue.

Lagrange talks about how many "values" ("valeurs" in French) an expression takes when you permute the variables. Thus the expression

$$f(r, s) = r$$

gets permuted to the expression

$$g(r, s) = s$$

by the permutation

$$\tau = (r, s) \in \mathrm{Sym}(\{r, s\}).$$

Since there are only two permutations in $\mathrm{Sym}(\{r, s\})$, namely $\tau = (r, s)$ and $I$, the identity permutation, the total number of values taken by $f(r, s) = r$ is 2.

In modern language we would say the following: The group $\mathrm{Sym}(\{r, s\})$ acts on the infinite set of all polynomials in $\mathbb{C}[r, s]$ by permuting the variables. Each polynomial $p(r, s)$ is contained in some $\mathrm{Sym}(\{r, s\})$-orbit, which must have size either 1 or 2. The symmetric polynomials are precisely the polynomials which are in an orbit of size 1.

Now an interesting example of a polynomial in an orbit of size 2 is:

$$\Delta(r, s) = r - s.$$

This polynomial is called the discriminant polynomial. It has the interesting property that:

$$\tau(\Delta(r, s)) = \tau(r - s) = s - r = -(r - s) = -\Delta(r, s).$$

So, the $\mathrm{Sym}(\{r, s\})$-orbit containing $\Delta(r, s)$ is:

$$\{\Delta(r, s), -\Delta(r, s)\}.$$

Since $\tau(\Delta) = -\Delta$, we have

$$\tau(\Delta^2) = (-\Delta)^2 = \Delta^2,$$

i.e., $\Delta^2$ is a symmetric polynomial in $r$ and $s$. Hence, by Waring's Theorem, $\Delta^2$ is expressible as a polynomial in the elementary symmetric polynomials $r + s = -b$ and $rs = c$. Of course, it is easy to compute explicitly:

$$\Delta^2 = (r - s)^2 = (r + s)^2 - 4rs = (-b)^2 - 4c = b^2 - 4c.$$

Hence

$$\Delta = \sqrt{b^2 - 4c}$$

for a suitable choice of square root. Of course, we can now find $r$ and $s$ by solving the system of linear equations:

(1) $r + s = -b$
(2) $r - s = \sqrt{b^2 - 4c}$

Thus we recover the Quadratic Formula. Now let's try to apply the same reasoning to the cubic equation:

$$x^3 + px - q = 0.$$

Again set

$$x^3 + px - q = (x - r)(x - s)(x - t).$$

Here the process is a bit more complicated. Lagrange's key observation is that, in the quadratic case, $r + s$ and $r - s$ should be thought of as $r + 1 \cdot s$ and $r + (-1) \cdot s$, where $1$ and $-1$ are the two square roots of one. Therefore the analogous objects of study in the cubic case should be of the form

$$\lambda = \lambda(r, s, t) := r + \omega s + \omega^2 t,$$

where $1$, $\omega = -\frac{1}{2} + \frac{\sqrt{3}}{2} i$ and $\omega^2 = -\frac{1}{2} - \frac{\sqrt{3}}{2} i$ are the three cube roots of 1, and $r$, $s$, $t$ are the three roots of $f(x) := x^3 + px - q$. An expression of this form is called a Lagrange resolvent for the cubic $f(x)$.
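To get a feel for how such a resolvent behaves under permutation of the roots, one can simply evaluate it numerically. The following Python sketch uses the sample cubic $x^3 - 7x + 6 = (x - 1)(x - 2)(x + 3)$, so $p = -7$ and $q = -6$ in the notation above, and counts how many distinct values $\lambda$ and $\lambda^3$ take over all six permutations of the roots. The sample cubic and the rounding used to compare floating-point values are assumptions for illustration.

```python
import cmath
from itertools import permutations

omega = cmath.exp(2j * cmath.pi / 3)         # -1/2 + (sqrt(3)/2) i

def resolvent(r, s, t):
    """Lagrange resolvent r + omega*s + omega^2*t."""
    return r + omega * s + omega**2 * t

def distinct(values, digits=6):
    """Count distinct complex values after rounding away floating-point noise."""
    return len({(round(z.real, digits), round(z.imag, digits)) for z in values})

roots = (1, 2, -3)                           # roots of x^3 - 7x + 6 (sum is 0)
values = [resolvent(*perm) for perm in permutations(roots)]
cubes = [v**3 for v in values]
print(distinct(values), distinct(cubes))     # number of values of lambda and of lambda^3
```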

Now, $\lambda(r, s, t)$ takes six values under the action of $\mathrm{Sym}(\{r, s, t\})$, suggesting why Cardano ended up with an equation of degree 6 in attempting to solve the cubic. If we set

$$\mu = \mu(r, s, t) := \omega r + s + \omega^2 t,$$

then the six values taken by $\lambda$ are

$$\lambda,\ \omega \cdot \lambda,\ \omega^2 \cdot \lambda,\ \mu,\ \omega \cdot \mu,\ \omega^2 \cdot \mu.$$

Since $\omega^3 = (\omega^2)^3 = 1$, the function $\lambda^3$ takes only two values under permutation:

$$\lambda^3 \text{ and } \mu^3.$$

Thus, the set $\{\lambda^3, \mu^3\}$ is a $\mathrm{Sym}(\{r, s, t\})$-orbit of size 2 on the set $\mathbb{C}[r, s, t]$. We leave as an exercise, the following corollary:

Lemma 10.1. Let $\sigma$ be a permutation in $\mathrm{Sym}(\{r, s, t\})$. If $\sigma$ has order 1 or 3, then

$$\sigma(\lambda^3) = \lambda^3 \text{ and } \sigma(\mu^3) = \mu^3.$$

If $\sigma$ has order 2, then $\sigma$ interchanges $\lambda^3$ and $\mu^3$.

In any case, it follows that every permutation in $\mathrm{Sym}(\{r, s, t\})$ fixes both

$$\lambda^3 + \mu^3 \text{ and } \lambda^3 \cdot \mu^3,$$

i.e. these are symmetric polynomials in $r, s, t$. As a somewhat tedious exercise, you will be asked to write out $\lambda^3 + \mu^3$ explicitly as a polynomial in $r, s, t$, and then to express it as a polynomial in the three elementary symmetric functions in $r, s, t$. Note that, in this case,

(1) $s_1 = r + s + t = 0$,
(2) $s_2 = rs + rt + st = p$, and
(3) $s_3 = rst = q$.

Now $\lambda^3$ and $\mu^3$ are the two roots of the quadratic polynomial

$$q(x) := x^2 - (\lambda^3 + \mu^3)x + \lambda^3 \cdot \mu^3.$$

Hence, using the Quadratic Formula, we could explicitly solve for $\lambda^3$ and $\mu^3$ in terms of $p$ and $q$. Then, by taking cube roots, we could find $\lambda$ and $\mu$. Finally, we end up with a system of three linear equations in the three unknowns $r$, $s$, and $t$:

$$r + s + t = 0$$

$$r + \omega s + \omega^2 t = \lambda$$

$$\omega r + s + \omega^2 t = \mu$$

to be solved in order to find $r$, $s$, and $t$. We leave as an exercise for you to verify that the coefficient matrix

$$\begin{pmatrix} 1 & 1 & 1 \\ 1 & \omega & \omega^2 \\ \omega & 1 & \omega^2 \end{pmatrix}$$

is invertible, and hence the system has a unique solution.

Lagrange further extended these ideas to explain the solution of the quartic equation. We give a brief description in the general spirit of his work. Consider the quartic

$$f(x) = x^4 + ax^2 + bx + c.$$

Let the roots of $f(x)$ be $r_1$, $r_2$, $r_3$, and $r_4$. Consider the elements

$$\theta_1 = (r_1 + r_2)(r_3 + r_4)$$

$$\theta_2 = (r_1 + r_3)(r_2 + r_4)$$

$$\theta_3 = (r_1 + r_4)(r_2 + r_3)$$

in $\mathbb{Z}[r_1, r_2, r_3, r_4]$. We leave it as an exercise to verify that the set

$$T := \{\theta_1, \theta_2, \theta_3\}$$

is an orbit of $S := \mathrm{Sym}(\{r_1, r_2, r_3, r_4\})$ on $\mathbb{Z}[r_1, r_2, r_3, r_4]$. The kernel of the action of $S$ on this orbit is a normal Klein 4-subgroup $V$ of $S$. It is very important, as we shall see later, that $V$ is a normal subgroup of $S$.

Since $S$ permutes the set $T$, it follows that the elementary symmetric functions in the $\theta_i$'s are fixed by all of the elements of $S$, and hence are expressible in terms of the elementary symmetric functions in the roots $r_1, r_2, r_3, r_4$, i.e., in terms of the coefficients $a, b, c$ of $f(x)$. In fact, computation shows that

$$\theta_1 + \theta_2 + \theta_3 = 2a$$

$$\theta_1\theta_2 + \theta_1\theta_3 + \theta_2\theta_3 = a^2 - 4c$$

$$\theta_1\theta_2\theta_3 = -b^2.$$

It follows that $\theta_1, \theta_2, \theta_3$ are the roots of the resolvent cubic

$$h(x) = x^3 - 2ax^2 + (a^2 - 4c)x + b^2.$$

Now, assuming that we have found $\theta_1$, $\theta_2$, and $\theta_3$, we can easily solve for the roots $r_1, r_2, r_3, r_4$ of $f(x)$. For example, since $f(x)$ has no cubic term, we have

$$(r_1 + r_2) + (r_3 + r_4) = 0 \quad \text{and} \quad (r_1 + r_2)(r_3 + r_4) = \theta_1.$$

So $r_1 + r_2$ and $r_3 + r_4$ are the two roots of the quadratic equation

$$q(x) := x^2 - 0x + \theta_1 = x^2 + \theta_1.$$

Hence $r_1 + r_2$ and $r_3 + r_4$ are the two square roots of $-\theta_1$. Similarly, $r_1 + r_3$ and $r_2 + r_4$ are the two square roots of $-\theta_2$; and $r_1 + r_4$ and $r_2 + r_3$ are the two square roots of $-\theta_3$. Finally, as in the cubic case, one can solve a system of linear equations to find the roots of $f(x)$. For example,

$$r_1 = \frac{1}{2}\left(\sqrt{-\theta_1} + \sqrt{-\theta_2} + \sqrt{-\theta_3}\right).$$
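The identities for the $\theta_i$ and the final formula can be checked on a quartic with known integer roots. The following Python sketch uses the roots $1, 2, 3, -6$ (whose sum is 0, so there is no cubic term); these roots are an assumption chosen so that every $-\theta_i$ is a perfect square. The sign condition used in the code, that the product of the three chosen square roots equals $-b$, is not stated in the text; it follows from $(r_1 + r_2)(r_1 + r_3)(r_1 + r_4) = s_3 = -b$ when $r_1 + r_2 + r_3 + r_4 = 0$.

```python
from itertools import combinations, product
from math import isqrt, prod

roots = [1, 2, 3, -6]                        # sum is 0, so f has no cubic term
r1, r2, r3, r4 = roots

def s(k):
    """k-th elementary symmetric polynomial of the roots."""
    return sum(prod(c) for c in combinations(roots, k))

a, b, c = s(2), -s(3), s(4)                  # f(x) = x^4 + a x^2 + b x + c

theta = [(r1 + r2) * (r3 + r4),
         (r1 + r3) * (r2 + r4),
         (r1 + r4) * (r2 + r3)]

# Elementary symmetric functions of the thetas, as stated in the text.
assert sum(theta) == 2 * a
assert sum(x * y for x, y in combinations(theta, 2)) == a * a - 4 * c
assert prod(theta) == -b * b

# Square roots of -theta_i; here they are integers by construction.
u = [isqrt(-t) for t in theta]
recovered = set()
for signs in product((1, -1), repeat=3):
    v = [sign * ui for sign, ui in zip(signs, u)]
    if prod(v) == -b:                        # consistent choice of square roots
        recovered.add(sum(v) // 2)           # (1/2)(sqrt + sqrt + sqrt)
print(sorted(recovered))
```

In this example the four consistent sign choices return exactly the four roots of $f(x)$.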

Lagrange found himself unable to extend his methods to the case of the quintic equation. There was a good reason for this failure, but it would only be clearly elucidated 60 years later by Evariste Galois. However, Lagrange's work was a failure only in the sense that Columbus' voyages were failures. Lagrange had touched upon a new world: the world of groups.

Here is Lagrange's formulation of the great theorem which has come to bear his name.

Lagrange's Theorem. Let $f(r_1, r_2, \ldots, r_n)$ be a polynomial in $n$ commuting variables. The number of values taken by $f$ under permutation of the variables must be a divisor of $n!$.

In the exercises you will be asked to reformulate this theorem in modern language and to explain why it is a corollary of Lagrange's Orbit-Stabilizer Theorem, as stated and proved in Chapter 4.
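Lagrange's count of "values" can be reproduced by brute force for small $n$. The following Python sketch stores a polynomial in four variables as a dictionary from degree vectors to coefficients, applies all $4! = 24$ permutations of the variables, and counts the distinct results. The representation and function names are assumptions for illustration; the three sample polynomials are $r_1$, $\theta_1 = (r_1 + r_2)(r_3 + r_4)$ and $s_1$.

```python
from itertools import permutations
from math import factorial

def permute(poly, sigma):
    """Apply sigma to the variables of poly (exponent i moves to slot sigma[i])."""
    out = {}
    for exps, coeff in poly.items():
        new = [0] * len(exps)
        for i, e in enumerate(exps):
            new[sigma[i]] = e
        out[tuple(new)] = coeff
    return out

def number_of_values(poly):
    """Number of distinct polynomials obtained by permuting the variables."""
    n = len(next(iter(poly)))
    orbit = {frozenset(permute(poly, s).items()) for s in permutations(range(n))}
    return len(orbit)

r1 = {(1, 0, 0, 0): 1}
theta1 = {(1, 0, 1, 0): 1, (1, 0, 0, 1): 1,       # (r1 + r2)(r3 + r4) expanded
          (0, 1, 1, 0): 1, (0, 1, 0, 1): 1}
s1 = {(1, 0, 0, 0): 1, (0, 1, 0, 0): 1, (0, 0, 1, 0): 1, (0, 0, 0, 1): 1}

counts = [number_of_values(f) for f in (r1, theta1, s1)]
print(counts)
assert all(factorial(4) % v == 0 for v in counts)  # each count divides 4!
```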

Exercises

1. Prove Lemma 10.1.

2a. Write out $\lambda^3 + \mu^3$ explicitly as a polynomial in $r, s, t$.

b. Express $\lambda^3 + \mu^3$ as a polynomial in the three elementary symmetric functions in $r, s, t$.

3. Prove that the matrix

$$\begin{pmatrix} 1 & 1 & 1 \\ 1 & \omega & \omega^2 \\ \omega & 1 & \omega^2 \end{pmatrix}$$

is invertible.

4. Using the notation from the discussion of the quartic equation, prove that the set $T = \{\theta_1, \theta_2, \theta_3\}$ is a $\mathrm{Sym}(\{r_1, r_2, r_3, r_4\})$-orbit on $\mathbb{Z}[r_1, r_2, r_3, r_4]$, and prove that the kernel of the action of $\mathrm{Sym}(\{r_1, r_2, r_3, r_4\})$ on $T$ is the group

$$V_4 = \{(1), (r_1, r_2)(r_3, r_4), (r_1, r_3)(r_2, r_4), (r_1, r_4)(r_2, r_3)\}.$$

5a. Reformulate Lagrange's Theorem as stated in this chapter in more modern language, but stick to the same context in which Lagrange stated it. In other words, your reformulated theorem should not be any more general than the one Lagrange stated, but it should be in the language of a certain group acting on a certain set, and it should not use the term "values".

b. Prove that the theorem you stated in (a) is a corollary of Lagrange's Orbit-Stabilizer Theorem, as stated and proved in Chapter 4.

6a. For each divisor $d$ of 24, give a polynomial $f_d(r_1, r_2, r_3, r_4)$ such that $f_d$ takes exactly $d$ distinct "values" under permutation of the variables.

b. For each polynomial $f_d$ from (a), explicitly give the subgroup

$$G_d = \{\sigma \in S_4 : \sigma(f_d) = f_d\}.$$

7a. Prove: If $H$ is a subgroup of $S_4$ with $|H| = 6$, then $H$ contains a transposition.

b. Prove: If $H$ is a subgroup of $S_4$ with $|H| = 8$, then $H$ contains a transposition.

8a. Prove: For all $n \ge 2$, $S_n$ is generated by the transpositions

$$(1, 2), (2, 3), \ldots, (n - 1, n).$$

[Hint: Let $H$ be the subgroup generated by these transpositions. Prove that $(1, 2, \ldots, n) \in H$. Now use induction and Lagrange's Orbit-Stabilizer Theorem.]

b. Prove: For all $n \ge 2$, $S_n$ is generated by $(1, 2)$ and $(1, 2, \ldots, n)$. [Hint: Let $\sigma = (1, 2, \ldots, n)$. Compute $\sigma^k \circ (1, 2) \circ \sigma^{-k}$ for $1 \le k \le n$.]

c. Prove: Let $p$ be a prime. Let $\tau = (i, j) \in S_p$, and let $\sigma$ be a $p$-cycle in $S_p$. Then $S_p$ is generated by $\tau$ and $\sigma$. [Hint: Renumber so that $\tau = (1, 2)$. Argue that for some $k$, $\sigma^k$ is a $p$-cycle with $\sigma^k(1) = 2$. Renumber so that $\sigma^k(i) = i + 1$ for all $i$, $1 \le i < p$.]

9. Let $G$ be a group and $H$ a subgroup of $G$ with $(G : H) = n$. Let

$$X = \{g_1 H = H, g_2 H, \ldots, g_n H\}$$

be the set of all left cosets of $H$ in $G$. For each $g \in G$, define the function $\sigma_g : X \to X$ by

$$\sigma_g(g_i H) = g g_i H \quad \text{for all } g_i H \in X.$$

a. Prove: $\sigma_g : X \to X$ is a permutation of $X$, i.e. a bijective function.

b. Prove: The order of $\sigma_g$ as an element of $\mathrm{Sym}(X)$ is a divisor of the order of $g$ in $G$.

c. Prove: $\sigma_g(H) = H$ if and only if $g \in H$.

10. Let $H$ be a subgroup of $S_5$ with $|H| = 30$ or $40$.

a. Prove: $H$ contains a 5-cycle $\sigma$. [Hint: Use Exercise 9.]

b. Let $H_5 := \{h \in H : h(5) = 5\}$. Prove: $H_5$ contains a transposition of $S_5$. [Hint: Use Exercise 7.]

c. Conclude that $S_5$ has no subgroup $H$ with $|H| = 30$ or $40$. [Hint: Use Exercise 8.]

d. Prove: If $f(r_1, r_2, r_3, r_4, r_5)$ takes at most 4 values under permutation of the variables, then $f$ is either an alternating or a symmetric function.

11. Galois’ Theory of Equations

Let’s try to imagine the thought processes of the young genius Evariste Galois as he contemplated the work of his predecessors on the theory of equations.

On the one hand, there was the great paper of Lagrange, in which Lagrange examined the work of his predecessors and attempted to extract a universal guiding principle. The principle Lagrange discovered was that of symmetries of the roots of the polynomial p(x). He let the symmetric group S_n act on the roots and found auxiliary equations (Lagrange resolvents) whose solution would lead to a solution of p(x) = 0 itself. However, Lagrange’s paper was ultimately pessimistic. He concluded (although he did not prove) that these methods would never give a formula for solving equations of degree greater than 4.

On the other hand, there was the work of Gauss, which you studied in Math 4580, in which Gauss showed how to solve the polynomial equation x^17 − 1 = 0 by successive extraction of square roots, and indeed gave a general analysis of the equations x^n − 1 = 0. Here the key role was played by a much smaller group, Aut(Q(e^{2πi/n})), isomorphic to the group U_n of invertible elements of Z/nZ.

Galois realized that Gauss was on the right track. When considering a specific polynomial p(x), one should not treat its roots as “indeterminates” r_1, r_2, . . . , r_n and indiscriminately apply every possible permutation in S_n. Rather, one should remember the algebraic relationships among the roots and apply only those permutations which respect those relationships. [Of course, Lagrange was right too. He was looking for a general formula valid for all equations, not a specific formula for a specific equation.]

Note: Throughout our discussion of Galois Theory, we shall assume without further comment, as did Galois, that we are working with subfields of the complex numbers. Most of this theory can be extended, with minor changes, to much more general contexts. Recall that, by the Fundamental Theorem of Algebra, if p(x) is any polynomial with coefficients in any subfield F of C, then a splitting field, E, for p(x) over F can be found as a subfield of C. We have the following fundamental definition.

Definition 11.1. Let F be a subfield of C and let E be the splitting field over F of the polynomial p(x) ∈ F[x]. The Galois group of p(x) over F, Gal(E/F), is the group of all σ ∈ Aut(E) such that σ(x) = x for all x ∈ F. We call E/F a Galois extension. As noted, we shall assume without further comment that F ⊆ E ⊆ C.

Thus, Gal(E/F) = Aut(E/F) for the special case when E is a splitting field over F. In particular, in the important case when F = Q, we simply have Gal(E/Q) = Aut(E), since every automorphism of E fixes every rational number. We leave the proof of the following fact as an exercise.

Theorem 11.2. Let E/F be a Galois extension. Suppose a, b ∈ E and σ ∈ Gal(E/F) with σ(a) = b. Then a and b have the same minimum polynomial in F[x].

Other than the identity function, it is not obvious that there are any Galois automorphisms of E/F. The remarkable fact is that there are quite a few. This is the content of the following converse of Theorem 11.2.


Theorem 11.3. Let E/F be a Galois extension. Let a ∈ E with minimum polynomial p(x) ∈ F[x]. Let b be any root of p(x). Then b ∈ E and there exists σ ∈ Gal(E/F) with σ(a) = b.

We shall proceed via a sequence of intermediate results, arriving finally at a stronger theorem which will imply Theorem 11.3.

Theorem 11.4. Let F and F′ be subfields of C and let h : F → F′ be an isomorphism of fields. Extend h to an isomorphism h : F[x] → F′[x] via:

h(a_n x^n + · · · + a_1 x + a_0) = h(a_n)x^n + · · · + h(a_1)x + h(a_0).

Let a be a root of the irreducible polynomial p(x) ∈ F[x], and let b be a root of the irreducible polynomial h(p(x)) ∈ F′[x]. Then there is an isomorphism h* : F(a) → F′(b) such that

(1) h*(c) = h(c) for all c ∈ F; and
(2) h*(a) = b.

Proof. Set p′(x) = h(p(x)) ∈ F′[x]. By the construction of extension fields, there are isomorphisms:

f : F[x]/(p(x)) → F(a)

and

g : F′[x]/(p′(x)) → F′(b),

given by

f(c + (p(x))) = c for all c ∈ F, and f(x + (p(x))) = a,

and

g(c′ + (p′(x))) = c′ for all c′ ∈ F′, and g(x + (p′(x))) = b.

The ring isomorphism h : F[x] → F′[x] maps the principal ideal (p(x)) to the principal ideal (p′(x)), and so there is an induced isomorphism

h̄ : F[x]/(p(x)) → F′[x]/(p′(x))

such that

h̄(c + (p(x))) = h(c) + (p′(x)) for all c ∈ F, and h̄(x + (p(x))) = x + (p′(x)).

Now define h* = g ∘ h̄ ∘ f^{−1} : F(a) → F′(b). As h* is a composition of isomorphisms, h* is an isomorphism. Moreover, direct computation shows that h*(c) = h(c) for all c ∈ F and h*(a) = b, as claimed.

□
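The "direct computation" at the end of the proof of Theorem 11.4 can be written out as follows. This is only a sketch of the bookkeeping; here \bar h denotes the induced isomorphism on the quotient rings and p'(x) = h(p(x)), as in the proof.

```latex
\begin{align*}
h^*(c) &= g\bigl(\bar h(f^{-1}(c))\bigr)
        = g\bigl(\bar h(c + (p(x)))\bigr)
        = g\bigl(h(c) + (p'(x))\bigr)
        = h(c) \quad \text{for all } c \in F,\\
h^*(a) &= g\bigl(\bar h(f^{-1}(a))\bigr)
        = g\bigl(\bar h(x + (p(x)))\bigr)
        = g\bigl(x + (p'(x))\bigr)
        = b.
\end{align*}
```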


Theorem 11.5. Let E be the splitting field over F of the polynomial p(x) ∈ F[x]. Let E′ be another subfield of C containing F, and let h : E → E′ be an isomorphism of fields satisfying:

h(x) = x for all x ∈ F.

Then the following statements are true:

(1) E′ = E,
(2) h is a Galois automorphism of E/F, and
(3) h permutes the roots of p(x).

Proof. Let p(x) = a_n x^n + · · · + a_1 x + a_0 ∈ F[x]. Let

{α_1, α_2, . . . , α_n}

be the set of all roots of p(x). (Note: p(x) may have repeated roots. So there may be redundancies on this list.) Then, for all i,

a_n · α_i^n + · · · + a_1 · α_i + a_0 = 0.

Since α_i ∈ E for all i, we may apply h to this equation, yielding:

a_n · h(α_i)^n + · · · + a_1 · h(α_i) + a_0 = 0.

Thus h(α_i) is also a root of p(x), for all i. In other words, since h is an injective function, h permutes the roots of p(x), as claimed. In particular, h(α_i) ∈ E for all i. However, since E is the splitting field for p(x) over F, E = F(α_1, α_2, . . . , α_n). Hence, since h(F) = F, we have that h(E) ⊆ E, i.e., E′ ⊆ E. Note: Since E′ is an infinite set, this does not immediately guarantee that E′ = E. However, in this case, h : E → E′ is, in particular, an isomorphism of F-vector spaces. Since E is finite-dimensional as an F-vector space, it now follows that E′ = E, as desired.
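As a small numerical illustration of statement (3), one can take F = Q, p(x) = x^3 − 2, and h to be complex conjugation restricted to the splitting field E. This example is an assumption chosen for illustration and is not taken from the notes; the floating-point check below is a sketch, not a proof.

```python
import cmath

# Roots of p(x) = x^3 - 2: the real cube root of 2 times the cube roots of unity.
cube_root = 2 ** (1 / 3)
omega = cmath.exp(2j * cmath.pi / 3)
roots = [cube_root * omega ** k for k in range(3)]


def p(x):
    return x ** 3 - 2


def index_of(z, candidates, tol=1e-9):
    """Return the index of the candidate within tol of z, or fail if there is none."""
    for i, c in enumerate(candidates):
        if abs(z - c) < tol:
            return i
    raise ValueError("value is not one of the listed roots")


# Complex conjugation fixes Q, so it must send each root of p(x) to a root of p(x).
permutation = [index_of(r.conjugate(), roots) for r in roots]
print(permutation)  # → [0, 2, 1]
```

The real root is fixed and the two non-real roots are interchanged, so conjugation acts on the roots as a transposition.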

Now we are ready to prove the Main Theorem on Galois automorphisms.

Theorem 11.6. Let E/F be a Galois extension. Let L be any subfield of E and let L′ be another subfield of C containing F. Suppose that g : L → L′ is an isomorphism of fields satisfying:

g(x) = x for all x ∈ F.

Then there exists σ ∈ Gal(E/F) extending g, i.e.,

σ(y) = g(y) for all y ∈ L.

In particular, L′ is a subfield of E.

Proof. We proceed by complete mathematical induction on n := (E : L). Notice that the case n = 1 is precisely Theorem 11.5. Now assume that the theorem is true for all subfields K of E with (E : K) < n. Since n > 1, we may choose a ∈ E − L. Let p(x) ∈ L[x] be the minimum polynomial for a over L. As before, we may extend g to an isomorphism g : L[x] → L′[x]. Let b be any root of the polynomial p*(x) = g(p(x)) ∈ L′[x]. By Theorem 11.4, g extends to an isomorphism g* : L(a) → L′(b). Since (E : L(a)) < n = (E : L), our inductive hypothesis implies that g* extends to a Galois automorphism σ ∈ Gal(E/F). Clearly, σ is the desired extension of g and we are done.

□

As a corollary we have Theorem 11.3, which we repeat now for emphasis.

Theorem 11.3. Let E/F be a Galois extension. Let a ∈ E with minimum polynomial p(x) ∈ F[x]. Let b be any root of p(x). Then b ∈ E and there exists σ ∈ Gal(E/F) with σ(a) = b.

Proof. We may take L := F(a) and L′ := F(b). By Theorem 11.4, taking F = F′ and h to be the identity map on F, there is an isomorphism g : L → L′ with g(x) = x for all x ∈ F, and with g(a) = b. Hence, by Theorem 11.6, there exists σ ∈ Gal(E/F) with σ(a) = b, as claimed.

□

Here is an application to constructible numbers.

Theorem 11.7. Let α ∈ C be a constructible number. Let p(x) ∈ Q[x] be the minimum polynomial for α. Then every root of p(x) is a constructible number.

Proof. Since α is constructible, for some n ∈ N, there is a finite tower of fields:

Q = K_0 ⊆ K_1 ⊆ · · · ⊆ K_n

such that α ∈ K_n and (K_{i+1} : K_i) = 2 for all i. Now p(x) ∈ K_n[x]. Let E be a splitting field for p(x) over K_n. Let β be any root of p(x) in E. Then Q(α) ≅ Q(β) and so, by Theorem 11.3, there exists σ ∈ Gal(E/Q) = Aut(E) such that σ(α) = β. Since K_i ⊆ E for all i, we may apply σ to the tower above to get a new tower of fields:

Q = L_0 ⊆ L_1 = σ(K_1) ⊆ · · · ⊆ L_n = σ(K_n)

such that β = σ(α) ∈ L_n and (L_{i+1} : L_i) = 2 for all i. Hence β is constructible, as claimed.

□

Exercises

1. Prove Theorem 11.2.

2. Let E be the splitting field of p(x) = (x^2 − 2)(x^3 − 1) over Q.

a. Prove: (E : Q) = 4.

b. Prove that q(x) = x^2 − 2 remains irreducible over Q(ω), the splitting field of x^3 − 1 over Q.

c. Prove that Gal(E/Q) is a noncyclic group of cardinality 4.

d. Give three different subfields of E, each of degree 2 over Q.

3. Let E be as in Exercise 2. Prove that E is also the splitting field of f(x) = (x^2 + 3)(x^2 − 4x + 2) over Q.

4. Find the splitting field and Galois group of g(x) = x^3 − 5 over Q.


5. Find the splitting field and Galois group for h(x) = x^4 − 2x^2 + 9 over Q.

6. Let L be the splitting field of k(x) = x^4 − 2 over Q.

a. Prove: (L : Q) = 8.

b. Prove that Gal(L/Q) is a subgroup of S_4 isomorphic to D_4.


12. The Galois Correspondence

We are finally ready to state and prove the amazing Galois Correspondence Theorem, relating the subfield structure of the Galois extension E/F to the subgroup structure of its Galois group Gal(E/F). One amazing aspect of this theorem is that it describes the internal structure of an infinite, albeit finite-dimensional, object E in terms of the internal structure of its finite group of automorphisms. Thus, in particular, we shall see that, although E has infinitely many subspaces as a vector space over F, E has only finitely many subfields containing the field F. This is, in fact, an easy consequence of our previous results. First note that if E is the splitting field of the polynomial p(x) ∈ F[x], and if K is any intermediate field between F and E, then E is also the splitting field of the same polynomial p(x) regarded as a polynomial in K[x]. Thus, it makes sense to speak of the Galois group Gal(E/K). By definition,

Gal(E/K) = {σ ∈ Aut(E) : σ(x) = x for all x ∈ K}.

Since F ⊆ K, any element of Gal(E/K) satisfies:

σ(y) = y for all y ∈ F.

Thus Gal(E/K) is a subgroup of Gal(E/F). This easy remark has the following important consequence.

Theorem 12.1. Let E/F be a Galois extension and let K be any intermediate field between F and E. For any α ∈ E − K, there exists σ ∈ Gal(E/K) with σ(α) ≠ α.

Proof. Since α ∉ K, the minimum polynomial f(x) ∈ K[x] for α has degree at least 2. By an earlier result, since f(x) ∈ K[x] is irreducible, f(x) does not have a repeated root. Hence there is a root β of f(x) with β ≠ α. Then by Theorem 11.3, applied to the Galois extension E/K, there exists σ ∈ Gal(E/K) with σ(α) = β ≠ α, as claimed.

□

Now we can define the fundamental Galois Correspondence. We fix a Galois extension E/F and let G = Gal(E/F). We let

ℱ = {K : F ⊆ K ⊆ E}

be the set of all subfields of E containing the field F. We let

𝒢 = {H : H ⊆ G}

be the set of all subgroups of G. Recall that if H is a subgroup of G, we define

E^H := {x ∈ E : h(x) = x for all h ∈ H}.

We define two functions:

Γ : ℱ → 𝒢 via Γ(K) = Gal(E/K) for all K ∈ ℱ,

and

Θ : 𝒢 → ℱ via Θ(H) = E^H for H ∈ 𝒢.


Our goal is to show that these two functions are inverses of each other and define a one-to-one correspondence between the fields in ℱ and the groups in 𝒢.

We leave the following theorem as an exercise. It is an easy corollary of Theorem 12.1.

Theorem 12.2. For all K ∈ ℱ, K = Θ(Γ(K)), i.e.,

K = E^{Gal(E/K)}.

In particular, Θ : 𝒢 → ℱ is a surjective map, and so

|ℱ| ≤ |𝒢| < ∞.

Thus, there are only finitely many subfields of E containing F.

Already, we have achieved the surprising result, announced above, that there are only a finite number of fields lying between F and E, even though there are infinitely many F-vector spaces lying between F and E.

To complete the proof of the Galois Correspondence Theorem, we need to know that for all subgroups H of G, we have

H = Gal(E/E^H).

Note that

Gal(E/E^H) = {σ ∈ G : σ(x) = x whenever h(x) = x for all h ∈ H}.

Clearly, by the definition,

H ⊆ Gal(E/E^H) ⊆ G.

We need to verify that Gal(E/E^H) is not bigger than it “should be”. This will follow from the fundamental Primitive Element Theorem of Galois.

Primitive Element Theorem. Let E/F be a Galois extension. There exists α ∈ E such that E = F(α).

We call α a primitive element of E/F. The Primitive Element Theorem is an immediate corollary of the following linear algebra fact.

Theorem 12.3. Let V be a finite-dimensional vector space over an infinite field F. Then V is not the union of any finite collection of proper F-subspaces of V.

This is intuitively obvious. No finite set of lines completely covers R^2. No finite set of planes completely covers R^3.
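The hypothesis in Theorem 12.3 that F is infinite cannot simply be dropped. The following minimal sketch (an illustration that is not part of the notes, using the two-element field F_2 as an assumed example) checks that the plane over F_2 is the union of its three lines through the origin.

```python
from itertools import product

# V = F_2^2, the 2-dimensional vector space over the field with two elements.
V = set(product(range(2), repeat=2))


def span(v):
    """The line spanned by a nonzero vector v over F_2; the only scalars are 0 and 1."""
    return {(0, 0), v}


# The three 1-dimensional subspaces (lines through the origin) of V.
lines = [span(v) for v in sorted(V) if v != (0, 0)]

union = set().union(*lines)
print(len(lines), union == V)  # → 3 True
```

Each line is a proper subspace with two elements, yet the three lines together exhaust all four vectors of V.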

Proof. Let dim_F(V) = n. We shall prove the theorem by induction on n. If n = 1, then the only proper subspace of V is {0}, and the theorem is obvious. Henceforth assume n ≥ 2.

We call a subspace H of V a hyperplane if dim_F(H) = n − 1. First we argue that V contains infinitely many hyperplanes. Let B = {e_1, e_2, . . . , e_{n−1}, e_n} be an F-basis for V. For α ∈ F, let H_α be the subspace of V spanned by

B_α := {e_1, e_2, . . . , e_{n−1} + α · e_n}.

Clearly H_α is a hyperplane of V for all choices of α. Suppose H_α = H_β. Then there exist c_i ∈ F, 1 ≤ i ≤ n − 1, with

e_{n−1} + β · e_n = c_1 · e_1 + c_2 · e_2 + · · · + c_{n−1} · (e_{n−1} + α · e_n).

Thus

c_1 · e_1 + c_2 · e_2 + · · · + (c_{n−1} − 1) · e_{n−1} + (c_{n−1}α − β) · e_n = 0.

Since B is a linearly independent set, it follows first that c_{n−1} = 1 and then that α = β. Thus

H_α = H_β if and only if α = β.

Since F is an infinite set, we conclude that V contains infinitely many hyperplanes, as claimed.

Now suppose that the theorem is true whenever dim_F(W) = n − 1, but suppose that the theorem is false for V. Since every proper subspace of V is contained in a hyperplane, there then exists a finite set

ℋ = {H_1, H_2, . . . , H_r}

of hyperplanes of V such that

V = H_1 ∪ H_2 ∪ · · · ∪ H_r.

Since V contains infinitely many hyperplanes, we may choose a hyperplane H of V such that H ≠ H_i for any i, 1 ≤ i ≤ r.

If H = H ∩ H_i for some i, then H ⊆ H_i. But then H = H_i, since both have dimension n − 1, contrary to the choice of H. Hence H ∩ H_i is a proper subspace of H for all i. But

H = (H ∩ H_1) ∪ (H ∩ H_2) ∪ · · · ∪ (H ∩ H_r),

contrary to the inductive hypothesis. This completes the proof.

□

We now restate and prove the Primitive Element Theorem.

Primitive Element Theorem. Let E/F be a Galois extension. There exists α ∈ E such that E = F(α).

Proof. Suppose that α is not a primitive element of E for any α ∈ E. Then, for every α ∈ E, F(α) is a proper subfield of E containing F. But |ℱ| < ∞, i.e., there are only finitely many proper subfields of E containing F, and E is the union of these finitely many proper subspaces, contrary to Theorem 12.3. This completes the proof.

□

Now we can complete the fundamental Galois Correspondence Theorem after two corollaries.


Corollary 12.4. |Gal(E/F)| = (E : F).

Proof. Let E = F(α) by the Primitive Element Theorem. Then the minimum polynomial p(x) ∈ F[x] for α has degree n := (E : F). Let σ be any Galois automorphism of E/F. Then σ is completely determined by σ(α). Moreover, σ(α) is one of the n roots of p(x), and σ(α) ∈ E by Theorem 11.3. Hence

|Gal(E/F)| ≤ n = (E : F).

On the other hand, by the Main Theorem on Galois Automorphisms, for every root β of p(x), there exists one (and only one) Galois automorphism σ ∈ Gal(E/F) with σ(α) = β. Thus

|Gal(E/F)| ≥ n = (E : F).

Hence equality holds, as claimed.

□

Corollary 12.5. Let H be any subgroup of G := Gal(E/F). Then H = Gal(E/E^H).

Proof. Let K = E^H and let

H* = Gal(E/K) = {σ ∈ G : σ(x) = x for all x ∈ K}.

Then H ⊆ H*. By Corollary 12.4,

|H*| = (E : K).

We must show that (E : K) ≤ |H|.

Let H = {h_1, h_2, . . . , h_m}, with h_1 the identity automorphism. Let α be a primitive element of E/K. Set

g(x) = (x − h_1(α)) · (x − h_2(α)) · . . . · (x − h_m(α)).

The coefficients of g(x) are the elementary symmetric polynomials in

{h_1(α), h_2(α), . . . , h_m(α)},

and so they are fixed by every automorphism in H, i.e., if c_i is a coefficient of g(x), then

h(c_i) = c_i for all h ∈ H.

Hence c_i ∈ E^H = K for all coefficients c_i of g(x), i.e. g(x) ∈ K[x]. Let f(x) ∈ K[x] be the minimum polynomial of α over K. Since g(α) = 0, f(x) divides g(x). Since E = K(α), we have

|H*| = (E : K) = (K(α) : K) = deg(f(x)) ≤ deg(g(x)) = m = |H|,

as claimed. Since H ⊆ H*, we conclude that H = H* = Gal(E/E^H), as claimed.

□

Now, in the notation established at the beginning of this section, we have

(Γ ∘ Θ)(H) = Γ(E^H) = Gal(E/E^H) = H for all subgroups H of G.

Earlier we established that

(Θ ∘ Γ)(K) = Θ(Gal(E/K)) = E^{Gal(E/K)} = K for all fields K with F ⊆ K ⊆ E.

Thus Γ and Θ are inverses of each other. Hence both Γ and Θ are bijections between the sets ℱ and 𝒢. This completes the proof of the Fundamental Theorem of Galois Theory.

The Fundamental Theorem of Galois Theory. Let F ⊆ E ⊆ C with E/F a Galois extension of fields. Then the correspondence

K = E^H ⟺ H = Gal(E/K)

defines a one-to-one inclusion-reversing correspondence between the subfields of E containing F and the subgroups of Gal(E/F). Moreover,

|H| = (E : E^H)

for every subgroup H of Gal(E/F).
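To see the correspondence in a concrete case, here is a small sketch for the assumed example E = Q(√2, √3) over F = Q, which is not worked in the notes. Its Galois group consists of the four automorphisms sending √2 to ±√2 and √3 to ±√3, encoded below as pairs (a, b) of exponents of −1. The helper names are hypothetical.

```python
from itertools import product

# Q-basis of E = Q(sqrt2, sqrt3); each automorphism scales each basis vector by +1 or -1.
BASIS = ["1", "sqrt2", "sqrt3", "sqrt6"]


def signs(a, b):
    """Scaling factors on BASIS for sqrt2 -> (-1)^a sqrt2, sqrt3 -> (-1)^b sqrt3."""
    return (1, (-1) ** a, (-1) ** b, (-1) ** (a + b))


G = list(product(range(2), repeat=2))  # the four automorphisms, as pairs (a, b)


def is_subgroup(H):
    """Contains the identity and is closed under composition (addition of pairs mod 2)."""
    return (0, 0) in H and all(
        ((a + c) % 2, (b + d) % 2) in H for (a, b) in H for (c, d) in H
    )


def subsets(items):
    for mask in range(1 << len(items)):
        yield frozenset(x for i, x in enumerate(items) if (mask >> i) & 1)


subgroups = [H for H in subsets(G) if is_subgroup(H)]


def fixed_field_basis(H):
    """Basis vectors fixed by every element of H. They span E^H because every
    automorphism acts diagonally on BASIS."""
    return [v for i, v in enumerate(BASIS) if all(signs(a, b)[i] == 1 for (a, b) in H)]


for H in sorted(subgroups, key=len):
    fixed = fixed_field_basis(H)
    # |H| = (E : E^H) and (E : F) = 4, so |H| * dim_F(E^H) should equal 4.
    print(len(H), fixed, len(H) * len(fixed) == 4)
```

The five subgroups correspond to the five intermediate fields E, Q(√2), Q(√3), Q(√6), and Q, and larger subgroups fix smaller fields.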

We apply this theorem to obtain a necessary and sufficient condition for a complex number to be constructible. First we need a general fact about finite p-groups.

Theorem 12.6. Let p be a prime number and let G be a finite group with |G| = p^n for some n ∈ N. Then there is a tower of subgroups

G = G_0 ⊇ G_1 ⊇ G_2 ⊇ · · · ⊇ G_n = {e},

where (G_i : G_{i+1}) = p for all i, and e is the identity element of G.

The proof of Theorem 12.6 will require us to develop a little more basic group theory. Back in Math 4580, we defined the relation of conjugacy on a group G. We recall this definition now.

Definition 12.7. Let G be a group. We say that two elements x and y of G are conjugate if there exists an element g ∈ G such that y = gxg^{−1}. We shall write x ∼ y to denote the fact that x is conjugate to y.

We leave the following fact as an exercise.

Lemma 12.8. The relation of conjugacy is an equivalence relation on a group G.

We refer to the equivalence classes under this relation as conjugacy classes. Thus the conjugacy classes of G define a partition of G into disjoint subsets. We can think of this in a more sophisticated way in terms of group actions.

Lemma 12.9. Let G be a group. Then G acts as a group of functions on the set G via:

g(x) = gxg^{−1} for all x ∈ G.


Proof. We must verify that multiplication in G corresponds to composition of functions, i.e., that

(gh)(x) = g(h(x))

for all g and h in the group G and all x in the set G:

(gh)(x) = (gh)x(gh)^{−1} = g(hxh^{−1})g^{−1} = g(h(x)),

as claimed.

□

Notice that the conjugacy classes of G are the G-orbits on the set G under the conjugation action. We introduce the following notation.

Definition 12.10. Let G be a group and let x be an element of G. Then the centralizer in G of x is the set

C_G(x) := {g ∈ G : g · x = x · g} = {g ∈ G : g(x) = x}.

Thus C_G(x) is the stabilizer in G of the “point” x under the conjugation action. In particular, C_G(x) is a subgroup of G.

In consequence of Lemma 12.9, we may apply Lagrange’s Orbit-Stabilizer Theorem to the conjugation action of the group G on the set G to obtain the following result, whose proof we leave as an exercise.

Theorem 12.11. Let G be a finite group. Let C_1, C_2, . . . , C_n be the conjugacy classes of G. For each i, let x_i be an element in C_i. Then the following conclusions hold:

(1) |G| = |C_1| + |C_2| + · · · + |C_n|;
(2) For each i, |G| = |C_i| · |C_G(x_i)|;
(3) C_i = {x_i} if and only if x_i ∈ Z(G).
(4) Suppose that C_1, . . . , C_r are the conjugacy classes of G such that |C_i| > 1. Then

|G| = |Z(G)| + |C_1| + · · · + |C_r|.

Note that (2) implies, in particular, that |C_i| divides |G| for all i.
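The conclusions of Theorem 12.11 can be checked by brute force on a small group. The sketch below uses the dihedral group D_4 of order 8, realized as permutations of the vertices 0, 1, 2, 3 of a square. The choice of group and the helper names are assumptions made for illustration.

```python
def compose(p, q):
    """Composition p∘q of permutations stored as tuples: (p∘q)(i) = p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)


def generate(gens):
    """Closure of the generators under composition (sufficient in a finite group)."""
    identity = tuple(range(len(gens[0])))
    group = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(g, x)
            if y not in group:
                group.add(y)
                frontier.append(y)
    return group


r = (1, 2, 3, 0)  # rotation i -> i + 1 (mod 4)
s = (0, 3, 2, 1)  # reflection i -> -i (mod 4)
G = generate([r, s])


def conjugacy_class(x):
    return frozenset(compose(compose(g, x), inverse(g)) for g in G)


def centralizer(x):
    return {g for g in G if compose(g, x) == compose(x, g)}


classes = {conjugacy_class(x) for x in G}
center = {x for x in G if len(conjugacy_class(x)) == 1}
big = [c for c in classes if len(c) > 1]
print(len(G), len(center), sorted(len(c) for c in big))  # → 8 2 [2, 2, 2]
```

Here the equation in part (4) reads 8 = 2 + 2 + 2 + 2, and each class size divides 8 as part (2) predicts.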

The equation in part (4) of Theorem 12.11 is usually referred to as Cauchy’s Class Equation. It has the following important corollary.

Theorem 12.12. Let p be a prime and let G be a finite group with |G| = p^n for some n ∈ N. Then Z(G) ≠ {1}. In particular, G has a normal subgroup N with |N| = p.

Proof. By Theorem 12.11, if C_i is a conjugacy class of G, then |C_i| divides p^n. Hence either |C_i| = 1 or p divides |C_i|. Of course, p divides |G|. Thus in the notation of Cauchy’s Class Equation, p divides |C_i| for all i with 1 ≤ i ≤ r. Hence p divides

|Z(G)| = |G| − (|C_1| + · · · + |C_r|).


Since 1 ∈ Z(G), we have |Z(G)| > 0. Hence |Z(G)| ≥ p.

Now let x ∈ Z(G) with x ≠ 1. By Lagrange’s Theorem, the order of x divides |G| = p^n. Hence the order of x is p^a for some a ≥ 1. Set

z = x^{p^{a−1}}.

Then z is an element of Z(G) of order p. Let N = ⟨z⟩ be the cyclic subgroup of Z(G) generated by z. Then |N| = p. Indeed,

N = {1, z, z^2, . . . , z^{p−1}}.

Let g ∈ G. Then, since N ⊆ Z(G),

g · z^i · g^{−1} = g · g^{−1} · z^i = 1 · z^i = z^i ∈ N,

for all i. Thus N is a normal subgroup of G with |N| = p.

□

We would like to produce a tower of subgroups

1 = N_0 ⊆ N = N_1 ⊆ · · · ⊆ N_n = G

with |N_i| = p^i for all i. This will follow easily by induction once we generalize the quotient group construction.

Definition 12.13. Let G be a group and let N be a normal subgroup of G. We define a quotient group G/N as follows. The set G/N is the set of all cosets g · N for g ∈ G. Multiplication in G/N is defined by the rule:

(g · N) · (g_1 · N) = (g · g_1) · N

for all g, g_1 ∈ G.

Note that we do not have to specify left or right cosets, since the condition that N is a normal subgroup of G is equivalent to the assertion

g · N = N · g for all g ∈ G.

As usual, we must verify that the multiplication operation is well-defined. This follows directly from the equation above and the Associative Law:

(g · N) · (g_1 · N) = g · (N · g_1) · N = g · (g_1 · N) · N = (g · g_1) · (N · N) = (g · g_1) · N,

the last equality holding because N is a group. We leave as an exercise to verify the following facts.

Theorem 12.14. Let G be a group and let N be a normal subgroup of G. Then

(1) G/N is a group; and

(2) If |G| < ∞, then |G/N| = |G|/|N|.
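The well-definedness of coset multiplication and the count |G/N| = |G|/|N| can be checked directly in a small case. The following sketch takes G = S_3 and N = A_3 as an assumed example that is not worked in the notes.

```python
from itertools import permutations


def compose(p, q):
    """Composition p∘q of permutations stored as tuples: (p∘q)(i) = p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(q)))


def sign(p):
    """Sign of a permutation, computed from its number of inversions."""
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1


G = list(permutations(range(3)))             # S_3
N = frozenset(g for g in G if sign(g) == 1)  # A_3, a normal subgroup of S_3


def coset(g):
    return frozenset(compose(g, n) for n in N)


cosets = {coset(g) for g in G}

# Well-definedness: the coset of a product depends only on the cosets of the factors.
well_defined = all(
    coset(compose(g, g1)) == coset(compose(h, h1))
    for g in G for g1 in G
    for h in coset(g) for h1 in coset(g1)
)
print(len(cosets), well_defined)  # → 2 True
```

Replacing N by a subgroup that is not normal, such as the one generated by a single transposition, makes the same check fail, which is why normality is required in Definition 12.13.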

We need one further fact.


Lemma 12.15. Let G be a group and let N be a normal subgroup of G. Define the function f : G → G/N by

f(g) = g · N for all g ∈ G.

Then f is a surjective group homomorphism. Moreover if H/N is a subgroup of G/N and if

H := {h ∈ G : f(h) ∈ H/N},

then H is a subgroup of G. Moreover if |G| < ∞, then |H| = |N| · |H/N|.

Proof. We leave as an exercise to show that f is a surjective group homomorphism. Suppose that H/N is a subgroup of G/N and H is defined as above. Since H/N is a group, the identity coset 1 · N is in H/N. Hence 1 ∈ H. Suppose that h, h_1 ∈ H. Then h · N ∈ H/N and h_1 · N ∈ H/N. Hence h^{−1} · N = (h · N)^{−1} ∈ H/N and

(h · h_1) · N = (h · N) · (h_1 · N) ∈ H/N.

Hence h^{−1} ∈ H and h · h_1 ∈ H. Thus H is indeed a subgroup of G. Moreover, if G is finite, then H is the union of |H/N| cosets of N. Each of these cosets has cardinality |N|. Hence

|H| = |H/N| · |N|,

as claimed.

□

We can now easily establish the existence of the desired tower of subgroups in a finite p-group. The following result is a bit stronger than what we need for Theorem 12.17, but we will use it again in Corollary 12.18.

Theorem 12.16. Let p be a prime and let G be a finite group with |G| = p^n for some integer n ≥ 0. (We call such a group a finite p-group.) Let H be a subgroup of G with |H| = p^m. Then there is a tower of subgroups

H = H_0 ⊆ H_1 ⊆ · · · ⊆ H_{n−m} = G

with |H_i| = p^{m+i} for all i, 0 ≤ i ≤ n − m.

Proof. We proceed by induction on n and, for fixed n, by induction on k := n − m. The result is trivial if n − m = 0. Suppose then that n − m ≥ 1, that the result is true for all p-groups of order less than p^n, and that it is true for all subgroups of G of order greater than p^m. By Theorem 12.12, G has a normal subgroup N with |N| = p. If N is not contained in H, then H ⊆ NH with |NH| = p^{m+1}. Since by induction, the result is true for k − 1 = n − m − 1 = n − (m + 1), it follows that there is a tower of subgroups

NH = H_1 ⊆ · · · ⊆ H_{n−m} = G

with |H_i| = p^{m+i}. Then taking H_0 := H, we are done.

Hence, we may assume that N ⊆ H. Let Ḡ = G/N and H̄ = H/N, so that |Ḡ| = p^{n−1} and |H̄| = p^{m−1}. By induction, since |Ḡ| < |G| and (Ḡ : H̄) = p^{n−m}, there is a tower of subgroups

H̄_0 = H̄ ⊆ H̄_1 ⊆ · · · ⊆ H̄_{n−m} = Ḡ

with |H̄_i| = p^{m+i−1} for all i, 0 ≤ i ≤ n − m. For each i, let H_i be the pre-image in G of H̄_i under the homomorphism f : G → G/N via f(g) = g · N for all g ∈ G. Then by Lemma 12.15, H_i is a subgroup of G for all i, 0 ≤ i ≤ n − m, and

|H_i| = |H̄_i| · |N| = p^{m+i−1} · p = p^{m+i},

for all i, 0 ≤ i ≤ n − m. Clearly, H_0 = H and H_i ⊆ H_{i+1} for all i. Hence these groups provide the desired tower of subgroups of G.

□

We can now complete our characterization of constructible numbers.

Theorem 12.17. Let α ∈ C. Let f(x) ∈ Q[x] be the minimum polynomial of α over Q. Let E be the splitting field of f(x) over Q, and let G = Gal(E/Q). Then α is constructible if and only if |G| is a power of 2.

Proof. Suppose first that α is constructible. We have seen earlier that then every root of f(x) is constructible. Hence by combining towers of fields, we can achieve a tower

Q = K_0 ⊆ K_1 ⊆ K_2 ⊆ · · · ⊆ K_n

such that E ⊆ K_n and (K_{i+1} : K_i) = 2 for all i. Hence (K_n : Q) = 2^n by Theorem 14.14 in the Math 4580 text, and since E ⊆ K_n, (E : Q) is also a power of 2. Then |G| = (E : Q) is a power of 2, as claimed. Note that this extends Theorem 14.16 in the Math 4580 text, which guarantees that the degree of f(x) is a power of 2.

Next suppose that |G| is a power of 2. We claim that there is a tower of fields

Q = E_0 ⊆ E_1 ⊆ E_2 ⊆ · · · ⊆ E_n = E

with (E_{i+1} : E_i) = 2 for all i. By the Galois Correspondence Theorem, this is true if and only if there is a tower of subgroups

G = G_0 ⊇ G_1 ⊇ G_2 ⊇ · · · ⊇ G_n = {I}

with (G_i : G_{i+1}) = 2. Since |G| is a power of the prime 2, this is immediate from Theorem 12.16.

□

Using Theorems 12.16 and 12.17, we can obtain the following sharpened purely field-theoretic criterion for a number to be constructible.

Corollary 12.18. Let α ∈ C with (Q(α) : Q) = 2^n. Then α is constructible if and only if there exists a tower of subfields of C:

Q = F_0 ⊆ F_1 ⊆ · · · ⊆ F_n = Q(α)

with (F_{i+1} : F_i) = 2 for all i.

Proof. Clearly if the tower exists, then this demonstrates that α is constructible, by Theorem 13.16 in the Math 4580 text. Now assume that α is constructible. Let f(x) ∈ Q[x] be the minimum polynomial for α and let E be the splitting field for f(x) over Q. By Theorem 12.17, the Galois group G := Gal(E/Q) is a 2-group. Let H = Gal(E/Q(α)) ⊆ G. Then by the Galois Correspondence Theorem, (G : H) = 2^n. By Theorem 12.16, there exists a tower of subgroups

H = H_0 ⊆ H_1 ⊆ · · · ⊆ H_n = G

with (G : H_i) = 2^{n−i}. Let F_i := E^{H_{n−i}}. Again, by the Galois Correspondence Theorem, we get a tower of fields

Q = E^G = F_0 ⊆ F_1 ⊆ · · · ⊆ F_{n−1} ⊆ F_n = E^H = Q(α)

with (F_{i+1} : F_i) = 2 for all i, as claimed.

□

Corollary 12.18 does not seem very different from Theorem 13.16 in the Math 4580 text. However, without Galois Theory, it is not clear that Corollary 12.18 can be deduced directly from the earlier Theorem 13.16, even in the following very elementary case. Suppose that α is a constructible number which is the root of a quartic irreducible polynomial p(x) ∈ Q[x]. Let F = Q(α) and suppose that E is the splitting field for p(x) over Q with (E : Q) = 8. Suppose you know that there is a tower of fields:

Q = E_0 ⊆ E_1 ⊆ E_2 ⊆ E_3 = E

with (E_i : Q) = 2^i. Show that there exists a subfield F_1 of F with (F_1 : Q) = 2. I don’t know how to do this without using Galois Theory.

We now discuss a way to show that there exist nonconstructible numbers whose minimum polynomial over Q has degree a power of 2, specifically 4. Going back to Lagrange’s analysis of the quartic polynomial, we recall that if

p(x) = x^4 + cx^2 + dx + e = (x − r_1)(x − r_2)(x − r_3)(x − r_4)

with c, d, e ∈ Q, then the numbers

t_1 := (r_1 + r_2)(r_3 + r_4), t_2 := (r_1 + r_3)(r_2 + r_4), and t_3 := (r_1 + r_4)(r_2 + r_3)

are roots of the Lagrange resolvent cubic polynomial

R(x) := x^3 − 2cx^2 + (c^2 − 4e)x + d^2.

If this cubic polynomial is irreducible, then the Galois group of p(x) over Q is isomorphic to the symmetric group S_4 or to the alternating group A_4. In either case, it is not a 2-group, and hence the r_i’s are not constructible numbers. In the exercises, you will be asked to work out an explicit example of this, namely when p(x) = x^4 + x + 1.
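The resolvent formula can be sanity-checked in exact integer arithmetic. The sketch below uses a hypothetical test quartic with integer roots 1, 2, 3, −6 (chosen only because they sum to 0, so there is no x^3 term), and then evaluates the resolvent of x^4 + x + 1 at the only candidate rational roots ±1. The function names are illustrative assumptions.

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, highest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_eval(p, x):
    """Evaluate a polynomial (highest degree first) by Horner's rule."""
    acc = 0
    for a in p:
        acc = acc * x + a
    return acc


def resolvent_cubic(c, d, e):
    """Coefficients of R(x) = x^3 - 2c x^2 + (c^2 - 4e) x + d^2."""
    return [1, -2 * c, c * c - 4 * e, d * d]


# Test quartic (x - 1)(x - 2)(x - 3)(x + 6) = x^4 + c x^2 + d x + e.
r1, r2, r3, r4 = 1, 2, 3, -6
quartic = [1]
for root in (r1, r2, r3, r4):
    quartic = poly_mul(quartic, [1, -root])
_, _, c, d, e = quartic

resolvent = resolvent_cubic(c, d, e)
t = [(r1 + r2) * (r3 + r4), (r1 + r3) * (r2 + r4), (r1 + r4) * (r2 + r3)]
print(resolvent, [poly_eval(resolvent, ti) for ti in t])  # → [1, 50, 769, 3600] [0, 0, 0]

# Resolvent of x^4 + x + 1 (c = 0, d = 1, e = 1) and its values at x = 1 and x = -1.
special = resolvent_cubic(0, 1, 1)
print(special, poly_eval(special, 1), poly_eval(special, -1))  # → [1, 0, -4, 1] -2 4
```

Since x^3 − 4x + 1 is monic with constant term 1, its only possible rational roots are ±1, and neither is a root. A cubic with no rational root is irreducible over Q, which is the situation described above.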

We thank Professor S. K. Wong for the following easier argument for p(x) = x^4 + x + 1 ∈ Q[x], avoiding the use of Lagrange resolvents, but using Corollary 12.18. We leave some details to the exercises. Let θ ∈ C be a root of p(x). By the Rational Root Theorem from Math 4580 (Chapter 6, Exercise 9a), since p(1) = 3 and p(−1) = 1, p(x) has no rational root. Hence (K : Q) = 2 or 4, where K = Q(θ).


We outline a proof that K has no subfield F with (K : F) = 2. It will follow then that (K : Q) = 4, and then, by Corollary 12.18, that θ is not constructible. Suppose then that K has a subfield F with (K : F) = 2. Since K = F(θ), there is a monic quadratic polynomial q(x) ∈ F[x] with q(θ) = 0. Since p(x) ∈ F[x] with p(θ) = 0, we see that q(x) divides p(x). Write q(x) = x^2 + ax + b ∈ F[x] and factor

p(x) = x^4 + 0 · x^3 + 0 · x^2 + x + 1 = (x^2 + ax + b)(x^2 + cx + d) ∈ F[x].

Equating coefficients, we obtain

(1) a + c = 0;
(2) ac + b + d = 0;
(3) ad + bc = 1; and
(4) bd = 1.

Since c = −a, we see that a ≠ 0 by equation (3), and then from (2) and (3) we get

(5) a^2 = d + b; and
(6) 1/a = d − b.

Adding and subtracting these, we get formulas for 2d and 2b, which we can multiply to get

(a^2 + 1/a)(a^2 − 1/a) = 4bd = 4.

We conclude that a^2 is a root of the polynomial r(x) = x^3 − 4x − 1 ∈ Q[x]. Since r(x) is irreducible in Q[x], it follows that (Q(a^2) : Q) = 3. However a ∈ F and so Q(a^2) ⊆ F with (F : Q) ≤ 2, a contradiction.
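For readers who want the intermediate algebra, the passage from the coefficient equations to r(x) can be written out as follows. This is a sketch of the computation only; the irreducibility check uses the Rational Root Theorem, as in the argument for p(x).

```latex
\begin{align*}
2d &= a^2 + \frac{1}{a}, \qquad 2b = a^2 - \frac{1}{a},\\
4 = 4bd &= \Bigl(a^2 + \frac{1}{a}\Bigr)\Bigl(a^2 - \frac{1}{a}\Bigr) = a^4 - \frac{1}{a^2},\\
0 &= a^6 - 4a^2 - 1 = (a^2)^3 - 4(a^2) - 1 = r(a^2),\\
r(1) &= -4 \neq 0, \qquad r(-1) = 2 \neq 0.
\end{align*}
```

The last line shows that the monic cubic r(x) has no rational root, and a cubic with no rational root is irreducible in Q[x].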

The concept of a normal subgroup was one of Galois’ great contributions to the emerging theory of groups. It plays the following crucial role in the theory of equations, extending the earlier observation of Abel.

Theorem 12.19. Let E/F be a Galois extension of fields with Galois group G. Suppose that N is a normal subgroup of G. Then E^N/F is a Galois extension of fields with Galois group isomorphic to G/N. Conversely, if H is a subgroup of G such that E^H/F is a Galois extension of fields, then H is a normal subgroup of G.

Proof. Let α ∈ E^N, let g ∈ G, and let h ∈ N. Since N is normal in G, there exists h_1 ∈ N with

h · g = g · h_1.

Then

h(g(α)) = (h · g)(α) = (g · h_1)(α) = g(h_1(α)) = g(α).

Since this is true for every h ∈ N, it follows that

g(α) ∈ E^N for all α ∈ E^N and all g ∈ G.

Now let α_1, . . . , α_r be a set of elements of E^N such that

E^N = F(α_1, . . . , α_r),

and let m_i(x) ∈ F[x] be the minimum polynomial of α_i over F. If β_i is any root of m_i(x), then β_i ∈ E and there exists g_i ∈ G with g_i(α_i) = β_i. But g_i(α_i) ∈ E^N. Hence β_i ∈ E^N. In other words, E^N contains all of the roots of m_i(x) for all i, 1 ≤ i ≤ r. Thus E^N is the splitting field of m(x) = m_1(x) · . . . · m_r(x) ∈ F[x], i.e., E^N/F is a Galois extension of fields.

Galois’ work finally clarified the question of when a polynomial equation can be solved by a process involving only addition, subtraction, multiplication, division, and extraction of roots. The fact that this was impossible for the general polynomial equation of degree n ≥ 5 had been established a bit earlier by Ruffini and Abel.

Speaking informally, what Galois showed was the following: Suppose p(x) is a polynomial with rational coefficients, having splitting field E/Q. The problem of finding the roots of this polynomial can be reduced to the problem of finding roots of polynomials of lower degree if and only if there exists a Galois extension F/Q with F a proper subfield of E. If there is such a subfield, then one can first try to solve the polynomial equation f(x) = 0 for which F/Q is the splitting field. Next one can try to solve the polynomial equation g(x) = 0, with g(x) ∈ F[x], for which E is the splitting field. Since (F : Q) < (E : Q) and (E : F) < (E : Q), the problem has been reduced to two smaller problems. By the fundamental Galois Correspondence Theorem, this is possible if and only if the Galois group G := Gal(E/Q) contains a proper normal subgroup, i.e., a normal subgroup N with N ≠ {1} and N ≠ G. Then one can choose F to be E^N.

A polynomial equation is solvable by radicals, i.e. its solutions can be found using only addition, subtraction, multiplication, division, and extraction of roots, if and only if this process can continue to be refined until one finally reaches field extensions F_{i+1}/F_i, all of prime degree, as Gauss did in his reduction of the cyclotomic polynomials. Galois’ fundamental result in this context requires the following definition.

Definition 12.20. A group G is solvable if there is a tower of normal subgroups

G = G_0 ⊇ G_1 ⊇ G_2 ⊇ · · · ⊇ G_n = {1}

such that each quotient group G_i/G_{i+1} is an abelian group.
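As an illustration of this definition (an example not worked in the notes), the symmetric group S_4 is solvable. One tower that works is given below, where V_4 denotes the subgroup of S_4 consisting of the identity and the three products of two disjoint transpositions.

```latex
\[
S_4 \supseteq A_4 \supseteq V_4 \supseteq \{1\},
\qquad
V_4 = \{\,1,\ (1\,2)(3\,4),\ (1\,3)(2\,4),\ (1\,4)(2\,3)\,\},
\]
\[
|S_4/A_4| = 2, \qquad |A_4/V_4| = 3, \qquad |V_4/\{1\}| = 4 .
\]
```

Each of A_4, V_4, and {1} is normal in S_4. The first two quotients have prime order and so are cyclic, and V_4 itself is abelian, so every quotient in the tower is abelian.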

Theorem. Let p(x) 2 Q[x]. Then p(x) can be solved by a process involving onlyaddition, subtraction, multiplication, division, and extraction of roots if and only ifthe Galois group of p(x) is a solvable group.

It is possible to show that most polynomial equations of degree n have Galoisgroup Sn, the full symmetric group on n letters. Since Sn is not solvable for n � 5,it follows that most polynomials of degree at least 5 are not “solvable by radicals”.

Galois’ work to a large extent closed the book on the subject of finding algebraic algorithms for solving general polynomial equations of degree greater than 4. However, other mathematicians, notably Leopold Kronecker, pursued this algorithmic question much more deeply, in the case of polynomials with solvable Galois group. Even more importantly, Galois’ work opened the book of group theory and more generally, in conjunction with the great work of his predecessors such as Lagrange and Gauss, opened a vast and fascinating book of abstract mathematical structures (like groups, rings, and fields) and the “Galois correspondences” which link them.


Although Galois’ work appeared almost impenetrable to his contemporaries, it was clarified by Camille Jordan in his book Traité des substitutions et des équations algébriques, published in 1870. Two young men who came to Paris at that time, Felix Klein and Sophus Lie, learned Galois’ theory from Jordan and were profoundly influenced by it. Klein was led to articulate his Erlanger Programm describing all geometries in terms of the action of groups of isometries on spaces. Lie was motivated to search for a Galois correspondence for differential equations, which led him to the important concepts of a Lie group and a Lie algebra. This in turn had a profound impact on modern physics.

And so mathematics evolves.

Exercises


2. Let G be a group. Define the conjugacy relation x ∼ y on G by:

x ∼ y if and only if y = g x g^{-1} for some g ∈ G.

Prove: The conjugacy relation is an equivalence relation on G.


4. Let G be a group and let N be a normal subgroup of G. a. Prove: G/N is a group (as defined in Definition 12.13).

b. Prove: If |G| < ∞, then |G/N| = |G|/|N|.

5. Let H be a finite group with |H| = p^a · m, with p a prime and with gcd(p, m) = 1. Suppose that H has a normal subgroup P with |P| = p^a. a. Prove: If α ∈ Aut(H), then α(P) = P.

b. Prove: If H is a normal subgroup of a (larger) group G, then P is also a normal subgroup of G.

6. Verify that the function f defined in Lemma 12.15 is a surjective group homomorphism.

7a. Prove: If G is a finite group with |G| even, then G contains an element g of order 2.

b. Prove: Suppose A is an abelian group with |A| = 2^a · p_1 · p_2 · ... · p_r, where the p_i are distinct odd primes. Then A has a subgroup B with either |B| = 2^a or with |B| = p_i for some i.

c. Recall from Corollary 4.8 that two elements of S_5 are conjugate if and only if they have the same cycle structure. Use this to list all the conjugacy classes of S_5 and their sizes.

d. Prove: S_5 is not a solvable group. [Hint: Suppose that S_5 is a solvable group. Use (b) to argue that S_5 must have a normal subgroup B with |B| = 2, 3, 4, 5, or 8. Now use (c) to derive a contradiction.]

8. For each of the equations listed below, determine the Galois group over Q of the splitting field of the equation. List all of the subgroups of the Galois group. List all of the subfields of the splitting field of the equation, and draw a diagram illustrating the Galois correspondence between subgroups and subfields for each example.

a. (x^2 + 1)(x^2 − 2)

b. (x^2 − 2)(x^2 − 3)(x^2 + 1) (Note: You must prove by explicit calculation that √3 is not contained in Q[√2].)

c. x^3 − 2

d. x^7 − 1

e. x^4 − 3

f. x^11 − 1

9. For each finite group G with |G| ≤ 7, give an example of an equation whose Galois group over Q is isomorphic to G.

10. Let p(x) = x^4 + x + 1. Let E be the splitting field for p(x) over Q. a. Find the resolvent cubic R(x).

b. Prove that R(x) is irreducible over Q.

c. Prove that (E : Q) = 12 or 24.

d. Prove: Gal(E/Q) ≅ A_4 or S_4.

e. If p(x) = (x^2 + ax + b)(x^2 + cx + d), verify the calculations on page 100 which show that a^2 is a root of the cubic polynomial r(x) = x^3 − 4x − 1.

f. Prove: r(x) = x^3 − 4x − 1 is irreducible in Q[x].

g. Explain why (Q(a^2) : Q) = 3 and (F : Q) ≤ 2 combine to give a contradiction to the assumed existence of the field F.


INDEX

affine group, Aff(R^n) 52
affine transformation 53
algebraic number 67
alternating group, A_n 38, 79
alternating polynomial 78
averaging trick 53
basis 5
Cauchy’s Class Equation 96
centralizer of a group element, C_G(x) 96
characteristic polynomial 16
conjugate elements, conjugacy class 18, 95
coordinates 5
cross product 16
cycle notation 28
cycle structure 30–31
cyclic group 28
degree of a field extension, (E : F) 66
determinant 15–16
diagonalizable matrix 18
dimension 6–7
direct product 59
direct sum 10
discriminant 78, 80
dodecahedron 39
eigenspace 15
eigenvalue 14
eigenvector 14
elementary symmetric polynomial 71
extension field 65
face of a polyhedron 39
Fermat’s Little Theorem 50
Fundamental Theorem of Algebra (FTA) 74
Fundamental Theorem of Galois Theory 95
Galois Correspondence 91
Galois extension 86
Galois field 68–69
Galois group 86
GL(n, R), GL(R^n) 20
group 18
hyperplane 92
icosahedron 42–43
inner product 60
Intermediate Value Theorem 75
invariant subspace 14
isomorphism (of vector spaces) 5
Lagrange’s Orbit-Stabilizer Theorem 33
Lagrange resolvent 81
Lagrange’s Theorem 37, 82
linear combination 5
linear independence 6
linear operator 11
linear transformation 6
Main Theorem on Galois Automorphisms 88
minimum polynomial of a complex number 66
normal subgroup 18, 101
octahedron 41–42
orbit 28
Orbit Counting Formula 45
orthogonal basis 22
orthogonal group, O(n) 23
orthogonal matrix 22
orthogonal operator 22
orthogonal vectors 22
permutation matrix 24
p-group (finite) 98
Primitive Element Theorem 92–93
projection operator 19
quotient group 97
regular polyhedron 39
resolvent cubic 83
scalars 4
similar matrices 18
simple group 44
singular matrix 15
SL(n, R) 20
solvability by radicals 102
solvable group 102
spanning set 5
special orthogonal group, SO(n) 23
splitting field 66
subspace 5
symmetric polynomial 70
tetrahedron 39–41
transcendental number 67
transitive group action 37
vector space 4
Waring’s Theorem 71
