
ABSTRACT ALGEBRA I NOTES

MICHAEL PENKAVA

1. Peano Postulates of the Natural Numbers

1.1. The Principle of Mathematical Induction. The principle of mathematical induction is usually stated as follows:

Theorem 1.1. Let Pn be a sequence of statements indexed by the positive integers n ∈ P. Suppose that

• P1 is true.
• If Pn is true, then Pn+1 is true.

Then Pn is true for all n ∈ P.

This formulation makes the idea of mathematical induction into a property of statements. However, in reality, there is a deeper level to this principle, as a property of the positive integers themselves. Let us state this as a property of the set of positive integers.

Theorem 1.2. Let S be a subset of P satisfying the following:

• 1 ∈ S.
• If n ∈ S then n + 1 ∈ S.

Then S = P.

We don’t give a proof of either of these versions of the principle of mathematical induction. However, it is not difficult to show that both of these versions are equivalent. That is to say, if the version described in terms of statements is true, then the version given in terms of subsets of the positive integers is true and vice versa. Instead, we will give an axiomatic construction of the positive integers, including the notions of addition and multiplication of such numbers, in terms of what are called the Peano Postulates.

Before giving this axiomatic construction, we will give some simple examples of how to use the principle of mathematical induction to prove some explicit formulae. We begin with an apocryphal story about the mathematician Carl Friedrich Gauss, 1777–1855, who was one of the most significant contributors to modern mathematics.

According to the story, Gauss was an elementary school student who was constantly disrupting his class, and his teacher decided to give him a task to occupy his time: adding up the numbers from 1 to 100. Unfortunately for his teacher, Gauss was able to give the answer immediately: “They sum to 5050.” In various versions of the story, his teacher doubted the answer, but Gauss was able to give a simple explanation for his result. There are 100 numbers from 1 to 100, and they can be divided into 50 pairs of numbers, each of which sums to 101: 1 and 100, 2 and 99, etc. Thus, the sum is 50 · 101 = 5050.


The above reasoning is certainly clever, but we can give a very general answer to the question of how to sum the numbers from 1 to n using mathematical induction. In general, mathematical induction can be used to prove a conjecture, but usually the conjecture cannot be discovered by inductive methods. This may seem strange, that in order to establish a general result you have to first know the answer, but this is a deep mystery of mathematics: seeing what is true and being able to show it are very different activities. The statement of the sum formula is as follows.

Theorem 1.3.

∑_{k=1}^{n} k = n(n + 1)/2.

Proof. We use the principle of mathematical induction. Let Pn be the statement ∑_{k=1}^{n} k = n(n + 1)/2. We first show that P1 is true. To see this, note that if n = 1, the left hand side of P1 is simply the sum from k = 1 to 1 of k, which is just 1. On the other hand, the right hand side of P1 is 1(1 + 1)/2, which is also equal to 1. Thus we have shown that P1 is true.

Next, assume that Pn is true. Now Pn+1 is the statement ∑_{k=1}^{n+1} k = (n + 1)(n + 2)/2. Let us compute

∑_{k=1}^{n+1} k = n + 1 + ∑_{k=1}^{n} k = n + 1 + n(n + 1)/2 = 2(n + 1)/2 + n(n + 1)/2 = (n + 2)(n + 1)/2 = (n + 1)(n + 2)/2.

Notice that in the third equality above, we used the statement Pn. By the principle of mathematical induction, the statement Pn is true for all n. □

Exercise 1.4. Prove the following statements using mathematical induction.

(1) ∑_{k=1}^{n} k^2 = n(n + 1)(2n + 1)/6.
(2) ∑_{k=1}^{n} k^3 = n^2(n + 1)^2/4.
(3) 3^n > 2^n for all positive integers n.
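Although the notes prove these identities by induction, a quick numerical check can build confidence in the statements before attempting a proof. The following short Python sketch (not part of the original notes) tests Theorem 1.3 and the formulas of Exercise 1.4 for small values of n.

```python
# Numerical sanity check of Theorem 1.3 and Exercise 1.4 for small n.
for n in range(1, 50):
    ks = range(1, n + 1)
    assert sum(ks) == n * (n + 1) // 2                             # Theorem 1.3
    assert sum(k**2 for k in ks) == n * (n + 1) * (2*n + 1) // 6   # Exercise 1.4 (1)
    assert sum(k**3 for k in ks) == n**2 * (n + 1)**2 // 4         # Exercise 1.4 (2)
    assert 3**n > 2**n                                             # Exercise 1.4 (3)
print("all checks passed")
```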


1.2. The Peano Postulates. The Peano postulates for the natural numbers were first given by the mathematician Giuseppe Peano, 1858–1932, in the year 1889. These axioms were the culmination of about a century of work in developing the notions of arithmetic as a system of formal reasoning. Here, we will give the axioms and constructions for the set P of positive integers. Peano’s axioms were originally stated for the natural numbers N = {0, 1, . . . }.

Definition 1.5. The positive integers P is a set with a map s : P → P, called the successor map, satisfying:

(1) There is an element 1 of P such that 1 ≠ s(n) for any n ∈ P.
(2) If s(m) = s(n) then m = n.
(3) If S is a subset of P such that
    • 1 ∈ S.
    • If n ∈ S, then s(n) ∈ S.
Then S = P.

It is not hard to show that if P and P′ are two such sets, then there is a unique bijection φ : P → P′ such that φ(1) = 1 and φ(s(n)) = s(φ(n)) for all n ∈ P. This means that in some sense, the Peano postulates uniquely determine the set of positive integers.

The construction of all of the elementary arithmetic operations from the Peano postulates was given in the Principia Mathematica, a three-volume tome written by the mathematicians Alfred North Whitehead, 1861–1947, and Bertrand Russell, 1872–1970, consisting of thousands of pages. Clearly, in a course on Abstract Algebra, there is not enough time to give this kind of in-depth treatment of elementary arithmetic, so we will only establish a few of the highlights of the material.

Theorem 1.6. If n ∈ P, then n ≠ s(n).

Proof. Let S be the set of all elements of P which are not equal to their successors, that is, all n ∈ P such that n ≠ s(n). If we can show that S = P, then the theorem is true. First we show that 1 ∈ S. This is true because by hypothesis, 1 is not a successor of any element. Now suppose that n ∈ S. Then n ≠ s(n). If s(n) = s(s(n)), then it would follow that n = s(n), since both n and s(n) have the same successor. However, by assumption n ≠ s(n). Thus s(n) ≠ s(s(n)). It follows that s(n) ∈ S. From this we conclude that S = P. □

The proof above illustrates a common technique in the theory of arithmetic on P. We use the inductive property of the natural numbers to show the property we wish to establish. We give one more example of such a proof.

Theorem 1.7. Let n ∈ P and suppose that n ≠ s(m) for any m ∈ P. Then n = 1.

In other words, the only element of P which is not a successor is 1.

Proof. Let S be the set of all elements n ∈ P such that either n = 1 or n = s(m) for some m ∈ P. Notice that 1 ∈ S by the definition of S. Let us suppose that n ∈ S. Then s(n) ∈ S, since we have s(n) = s(m) where m = n. It follows that S = P. Thus if n ∈ P, then n ∈ S, so that if n ≠ 1, then n = s(m) for some m ∈ P. Therefore, if n ≠ s(m) for all m ∈ P, we must have n = 1. □

Definition 1.8 (Recursion). A function f defined on P is said to be defined recursively if f is defined as follows. First, f(1) is explicitly given. Secondly, the value f(s(n)) is given by some rule that depends only on the value of f(n).


The principle of mathematical induction shows that a recursive definition gives a well-defined function, provided that the rule for f(s(n)) can always be evaluated. The rules for addition and multiplication of positive numbers are given by recursive definitions.

Definition 1.9 (Definition of addition). Addition is a binary operation on P given by the following rules:

• m + 1 = s(m).
• m + s(n) = s(m + n).

Notice that there can be no conflict between the two rules because 1 is not a successor. The fact that addition is well-defined is an elementary exercise. One shows that the set S of all n such that m + n is defined satisfies the induction hypotheses, so it is all of P. From the definition of addition, we are able to show the properties of associativity and commutativity of addition. The order in which these two properties are established is quite important. One of the difficulties that Whitehead and Russell encountered in writing the Principia Mathematica was that there is a certain natural order in which the properties need to be established, and the difficulty is determining that natural order.
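To make the recursive definition concrete, here is a small Python sketch (not from the notes) that represents a positive integer as a chain of applications of the successor map to 1 and implements addition by exactly the two rules above. The names one, s, and add are illustrative choices.

```python
# A minimal sketch of Peano addition: a positive integer is represented by the
# number of times the successor map s has been applied to 1.

one = ("1",)                     # the element 1, which is not a successor

def s(n):
    """The successor map s(n)."""
    return ("s", n)

def add(m, n):
    """m + n, defined by m + 1 = s(m) and m + s(k) = s(m + k)."""
    if n == one:                 # rule: m + 1 = s(m)
        return s(m)
    _, k = n                     # otherwise n = s(k)
    return s(add(m, k))          # rule: m + s(k) = s(m + k)

def to_int(n):
    """Convert a Peano numeral to an ordinary integer, for display only."""
    count = 1
    while n != one:
        _, n = n
        count += 1
    return count

two, three = s(one), s(s(one))
print(to_int(add(two, three)))   # prints 5
```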

Theorem 1.10 (Associativity of addition).

(a+ b) + c = a+ (b+ c),

for all positive integers a, b and c.

Proof. The first difficulty one has to overcome in this proof is that there are three variables, but mathematical induction gives conditions for a subset S of P to be all of P. This means we should somehow reduce our proof to a one-variable proof. One way to do this is to imagine that a and b are fixed numbers, and to show that the set S consisting of all c ∈ P such that (a + b) + c = a + (b + c) is all of P.

First we show that 1 ∈ S. To see this, note that (a + b) + 1 = s(a + b) by the first rule of addition. Secondly, a + (b + 1) = a + s(b) = s(a + b), by the second rule of addition. It follows that (a + b) + 1 = s(a + b) = a + (b + 1). This shows that 1 ∈ S.

Next, suppose that c ∈ S, so that (a + b) + c = a + (b + c). Then

(a + b) + s(c) = s((a + b) + c) = s(a + (b + c)) = a + s(b + c) = a + (b + s(c)).

But this means that s(c) ∈ S. By the inductive principle of natural numbers, we have S = P.

Finally, we note that although we fixed a and b to give this property for c, we did not use any properties of a and b, so we finally see that the formula for associativity holds for all positive integers a, b and c. □

Theorem 1.11 (Commutativity of addition). For all positive integers m, n ∈ P,

m + n = n + m.

Sometimes, it helps to prove a technical or special case of a theorem, which will help in the general proof, as a separate result. Such a result is usually called a lemma. Of course, a lemma is a theorem, but we usually reserve that word for results which are primarily useful in proving a more important result. However, there are cases where an important result is also called a lemma, so one has to be careful.


Lemma 1.12. For all positive integers m, m+ 1 = 1 +m.

Proof of the lemma. Let S be the subset of all m ∈ P such that m + 1 = 1 + m. Evidently 1 ∈ S, since 1 + 1 = 1 + 1. Now suppose that m ∈ S. Then

1+s(m) = s(1+m) = s(m+1) = m+s(1) = m+(1+1) = (m+1)+1 = s(m)+1.

Thus, by induction S = P, and the lemma holds. □

Notice that we used associativity in the proof of this lemma, so it was important that the associative law of addition was established first.

Proof of the theorem. Fix m ∈ P. Let S be the subset of all n ∈ P such that m + n = n + m. Then by the lemma, 1 ∈ S. Suppose that n ∈ S. Then m + n = n + m. As a consequence,

m+ s(n) = s(m+ n) = s(n+m) = n+ s(m) = n+ (m+ 1)

= n+ (1 +m) = (n+ 1) +m = s(n) +m.

Thus S = P and the commutative law of addition holds. □

Definition 1.13 (Definition of multiplication). Multiplication is a binary operation on P given by the following rules:

• m · 1 = m.
• m · s(n) = m · n + m.
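Continuing the earlier sketch of Peano addition, the following self-contained Python fragment (again not from the notes, with illustrative names) implements multiplication by the two rules just given.

```python
# A minimal sketch of Peano multiplication on successor-built numerals.
one = ("1",)

def s(n):
    return ("s", n)

def add(m, n):
    # addition as defined earlier: m + 1 = s(m), m + s(k) = s(m + k)
    return s(m) if n == one else s(add(m, n[1]))

def mul(m, n):
    """m * n, defined by m * 1 = m and m * s(k) = m * k + m."""
    if n == one:                  # rule: m * 1 = m
        return m
    _, k = n                      # otherwise n = s(k)
    return add(mul(m, k), m)      # rule: m * s(k) = m * k + m

def to_int(n):
    count = 1
    while n != one:
        _, n = n
        count += 1
    return count

two, three = s(one), s(s(one))
print(to_int(mul(two, three)))    # prints 6
```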

There are two properties of multiplication, associativity and commutativity, and a property involving addition and multiplication called the distributive law.

Theorem 1.14 (The distributive law). For all a, b and c in P we have

a · (b+ c) = a · b+ a · c

Proof. Once again, we prove this result by fixing a and b and showing that the set S of all c ∈ P for which the equation above holds satisfies the induction hypotheses. First, if c = 1, we note that

a · (b + 1) = a · s(b) = a · b + a = a · b + a · 1,

so 1 ∈ S. Suppose now that c ∈ S. Then

a · (b + s(c)) = a · (b + (c + 1)) = a · ((b + c) + 1)
= a · (b + c) + a · 1 = (a · b + a · c) + a · 1
= a · b + (a · c + a · 1) = a · b + a · s(c).

Thus s(c) ∈ S, and by induction S = P. □

Actually, there are two distributive laws. The one stated above is often called the left distributive law. The right distributive law is stated as follows:

(a + b) · c = a · c + b · c.

When the commutative law of multiplication holds, the right distributive law follows directly from the left distributive law. However, many structures with addition and multiplication do not have a commutative multiplication, so in those cases, the left and right distributive laws do not follow directly from each other, and other methods of proof are necessary. It also may seem strange that we first proved the distributive law instead of the laws involving multiplication alone, but it will emerge that we use the distributive law in proving the other properties. An interesting feature of the proof of the distributive law is that all the work seems to be in moving parentheses around. This is a key feature of proofs in algebra.

Exercise 1.15. Prove the right distributive law: For all positive integers a, b and c, we have

(a + b) · c = a · c + b · c.

Theorem 1.16 (Associative law of multiplication). For all a, b and c in P, we have

a · (b · c) = (a · b) · c.

Proof. As usual, we fix a and b and show that the set S of all c ∈ P such that a · (b · c) = (a · b) · c is all of P. Now

a · (b · 1) = a · b = (a · b) · 1,

so 1 ∈ S. Now suppose c ∈ S. Then

(a · b) · s(c) = (a · b) · c + (a · b) = a · (b · c) + a · b = a · (b · c + b) = a · (b · s(c)). □

Notice that in the proof of the associative law of multiplication we used the left distributive law. Finally, we are ready to prove the commutative law of multiplication.

Theorem 1.17. For all positive integers a and b we have

a · b = b · a.

To simplify the proof, we first state and prove the following lemma.

Lemma 1.18. For all positive integers a, we have a · 1 = 1 · a.

Proof of the lemma. Let S be the set of all positive integers a such that a · 1 = 1 · a. Then 1 ∈ S because 1 · 1 = 1 · 1. Now suppose that a ∈ S. Then

1 · s(a) = 1 · a + 1 = a · 1 + 1 · 1 = (a + 1) · 1 = s(a) · 1. □

Proof of the theorem. Fix a and let S be the set of all b ∈ P such that a · b = b · a. Then by the lemma, 1 ∈ S. Suppose now that b ∈ S. Then

a · s(b) = a · b + a · 1 = b · a + 1 · a = (b + 1) · a = s(b) · a.

This shows that S = P, so the commutative law of multiplication holds for the positive integers. □

Next, we introduce the notion of inequality for the positive integers.

Definition 1.19 (Definition of inequality). We say that a < b, a is less than b, precisely when there is some c such that b = a + c.

Although we won’t develop the properties of inequalities, we point out that the usual properties of inequalities involving positive integers can all be established using the properties of addition and multiplication which we have developed thus far. To illustrate this principle, we state and prove the following theorem.

Theorem 1.20. If a < b then a+ c < b+ c for any c ∈ P.


Proof. Suppose that a < b. Then there is some x ∈ P such that b = a+ x. Thus

b + c = (a + x) + c = a + (x + c) = a + (c + x) = (a + c) + x.

It follows that a + c < b + c. □

1.3. Well Ordering and Strong Induction.

Definition 1.21. A set X is ordered provided it is equipped with a binary relation < satisfying:

(1) If a, b ∈ X, then exactly one of the following holds:
    • a < b.
    • a = b.
    • b < a.
(2) If a < b and b < c then a < c.

For an ordered set X, we write a ≤ b if a < b or a = b.

Definition 1.22. An ordered set X satisfies the Principle of Strong Induction if given any subset S which satisfies:

• If x ∈ S for all x < n, then n ∈ S.

Then S = X. A subset of an ordered set X is said to be strongly inductive if it satisfies the condition above.

One can restate the principle of strong induction in the form: X satisfies the principle of strong induction if every strongly inductive subset is all of X.

Theorem 1.23 (Strong Induction). P satisfies the principle of strong induction.

Proof. Let S ⊆ P be a strongly inductive subset of P. We need to show that S = P. To see this, we will show that a certain subset of S is already all of P. Let Y be the subset of S consisting of all elements n ∈ S such that x ∈ S for all x < n. We show that Y satisfies the inductive hypotheses.

First, note that x ∈ S for all x < 1, since there are no such values of x. Therefore 1 ∈ S. Furthermore, it is clear that 1 ∈ Y as well. Next, suppose that n ∈ Y. Then for all x < n, x ∈ S, and since n ∈ S, it follows that for all x < s(n), x ∈ S. Thus s(n) ∈ S. It follows that s(n) ∈ Y. Since Y satisfies the hypotheses of induction, Y = P. It follows that S = P as well. □

Definition 1.24. If X is an ordered set, and Q is a subset of X, then c ∈ Q is called a least element of Q if c ≤ x for all x ∈ Q.

An ordered set X is well ordered or satisfies the least element property provided that any nonempty subset Q of X has a least element.

Theorem 1.25. The set P satisfies the least element property.

Proof. Let Q be a subset of P which does not have a least element, and let S be the subset of P consisting of all x ∈ P such that y ∉ Q for all y ≤ x. We show that S satisfies the hypothesis of strong induction, which implies it is all of P. Suppose that x ∈ S for all x < n. Then y ∉ Q for all y < n. If n ∈ Q, it would be the least element of Q. Thus n ∉ Q, so n ∈ S. Thus S must be all of P, which forces Q to be empty. Consequently, every nonempty subset of P has a least element. □


2. Equivalence of forms of induction and well ordering

Both the Principle of Strong Induction and the Well Ordering Principle refer only to an ordering on a set X. The Principle of Mathematical Induction which we gave as part of the Peano Postulates, which is also known as weak induction, requires a successor operation, and there must be a connection between the ordering and the successor operation. We have already shown that the set of positive integers, with the ordering given by the construction from the Peano postulates, satisfies the Well Ordering Principle and the Principle of Strong Induction.

Theorem 2.1. Let X be an ordered set. Then X is well ordered if and only if it satisfies the principle of strong induction.

Proof. We show that well ordering implies the principle of strong induction. We leave the reverse direction as an exercise. Suppose that X is well ordered and S is a strongly inductive subset of X. We must show that S = X. Let Q be the complement of S. It is enough to show that Q must be the empty set. Suppose that it is not empty. Then Q has a least element c. It follows that for all x < c, x is not an element of Q, which means that x is in S. Thus for all x < c, x ∈ S. Since S is strongly inductive, it follows that c ∈ S. But this contradicts the fact that c ∈ Q. This shows that Q is empty. □

Exercise 2.2. Show that an ordered set satisfying the principle of strong induction is well ordered.

It can be shown that every set X can be well-ordered, using the axiom of choice, which is an axiom of a certain set theory, called Zermelo–Fraenkel set theory with Choice, often denoted as ZFC. To understand this construction would take us too far into the realm of set theory for this course. However, we note that if X is well ordered, then the principle of strong induction holds, by the theorem above.

Transfinite Induction refers to proofs using the principle of strong induction on a well ordered set. Since every set can be well ordered, transfinite induction can be used to prove many interesting results in set theory; in particular, it is used to study ordinal numbers.

3. The Division Algorithm

From the positive integers, the integers are constructed in a straightforward manner, and all of the usual properties of addition, multiplication and inequalities can be established in a routine manner. Nevertheless, the construction takes a lot of detail and would take too long to carry out in this course. We will assume that all of these basic properties have been shown, and will begin our analysis of the integers with the division algorithm.

Theorem 3.1. Suppose that m, n ∈ Z and m ≠ 0. Then there are unique q, r ∈ Z such that 0 ≤ r < |m| and

n = qm + r.

Proof. We first show uniqueness of q and r. Suppose that n = mq + r and n = mq′ + r′, where 0 ≤ r < |m| and 0 ≤ r′ < |m|. If r = r′, it follows that mq = mq′, so m(q − q′) = 0. By the zero product property of the integers, either q − q′ = 0 or m = 0. Since we have explicitly assumed that m ≠ 0, it follows that q − q′ = 0, so q = q′. Now, let us assume that r ≠ r′. Then we can assume without loss of generality that r′ > r, so that r′ − r > 0. But m(q − q′) = r′ − r, so |m||q − q′| = r′ − r. However r′ − r < |m| − r ≤ |m|, while |m||q − q′| ≥ |m| unless q = q′. It follows that q = q′, and then r′ − r = m(q − q′) = 0, contradicting r′ > r. Thus r = r′, and this proves uniqueness.

We will use the least element property of P to prove the existence of q and r satisfying the properties. Let X = {n − mq | q ∈ Z} ∩ P. Because m ≠ 0, X ≠ ∅. Therefore X has a least element r. We have n = mq + r for some q. If r < |m| we are done, so suppose that r ≥ |m|. Then r′ = r − |m| ≥ 0. If m > 0, then n = mq + r = m(q + 1) + r′, and if m < 0, then n = m(q − 1) + r′. If r′ = 0, this expresses n with remainder 0 and we are done. Otherwise r′ ∈ X, which contradicts the fact that r is the least element of X, since r′ < r. □
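The statement of Theorem 3.1 is easy to test computationally. The following Python sketch (not from the notes) produces the quotient and remainder with 0 ≤ r < |m| for any nonzero m; note that it has to adjust the sign of the quotient for negative divisors, since Python's built-in divmod only guarantees this form of remainder for positive ones.

```python
# Division algorithm: given n and m != 0, find q, r with n = q*m + r, 0 <= r < |m|.
def division_algorithm(n: int, m: int):
    if m == 0:
        raise ValueError("m must be nonzero")
    q, r = divmod(n, abs(m))      # 0 <= r < |m| because abs(m) > 0
    if m < 0:
        q = -q                    # n = q*(-|m|) + r = q*m + r
    return q, r

for n, m in [(17, 5), (-17, 5), (17, -5), (-17, -5)]:
    q, r = division_algorithm(n, m)
    assert n == q * m + r and 0 <= r < abs(m)
    print(f"{n} = ({q})*({m}) + {r}")
```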

Definition 3.2. Let a, b ∈ Z. We say that a divides b, and denote this by a|b, provided that there is some integer x such that ax = b.

Note that a|b is a statement, not a number.

Definition 3.3. Let m, n ∈ Z. Then c is called a greatest common divisor of m and n provided that

(1) c|m and c|n.
(2) If d|m and d|n then d|c.

Notice that we did not define the greatest common divisor. In fact, in general, the greatest common divisor is only determined up to multiplication by ±1, as we shall show. However, this fact does allow us to define the greatest common divisor as the unique greatest common divisor which is nonnegative, which is exactly what most textbooks do. Note also that the definition of a greatest common divisor does not imply that such a thing exists. It simply gives a criterion for determining whether a number c is a greatest common divisor. It is common to write c = gcd(m, n) to express that c is a greatest common divisor of m and n, even though there is some ambiguity about c.

Proposition 3.4. Suppose that a|b and b|a. Then b = ±a.

Proof. Let a|b and b|a. Then there are x, y ∈ Z such that b = ax and a = by. It follows that b = byx, so b(1 − yx) = 0. If b = 0, then a = 0, so b = a. Otherwise we must have 1 − yx = 0, so xy = 1. In particular, x has a multiplicative inverse. But the only integers which have a multiplicative inverse are ±1, so x = ±1, and b = ±a. □

Theorem 3.5. Let c and d be two greatest common divisors of m and n. Then d = ±c.

Proof. Since c is a gcd of m and n, we have c|m and c|n. Since d is a gcd of m and n, it follows that c|d. Similarly, d|c. Thus, according to Proposition 3.4, d = ±c. □

Proposition 3.6. Let m ∈ Z. Then

(1) gcd(m, 0) = m.
(2) gcd(m, 1) = 1.

Exercise 3.7. Prove Proposition 3.6.

Definition 3.8. Let m, n ∈ Z. Then m and n are said to be relatively prime if gcd(m, n) = 1. In other words, 1 is a greatest common divisor of m and n.


Notice that m and 1 are relatively prime for any m ∈ Z, by Proposition 3.6.

Now we will show that given m, n ∈ Z, there is always a greatest common divisor of m and n. In other words, greatest common divisors exist!

Theorem 3.9. Let m, n ∈ Z, and suppose that n ≠ 0. Let

X = {rm + sn | r, s ∈ Z} ∩ P.

Then X has a least element c, and this least element is a greatest common divisor of m and n.

Moreover, for any m, n ∈ Z, if c is a gcd of m and n, then c = rm + sn for some r, s ∈ Z.

Proof. Since n ≠ 0, |n| ∈ P. Moreover, |n| = sn where s = 1 or s = −1. Thus |n| = 0 · m + sn ∈ X, so X is nonempty. As a consequence, it has a least element c, and since c ∈ X, c = rm + sn for some r, s ∈ Z. Since c ≠ 0, there are unique q, d such that 0 ≤ d < c and m = cq + d. But then d = m − cq = m − (rm + sn)q = (1 − rq)m + (−sq)n. If d > 0, it follows that d ∈ X and d < c, which contradicts our assumption that c is the least element of X. Thus d = 0, so m = cq. Thus c|m. Similarly, c|n.

Now suppose that d ∈ Z satisfies d|m and d|n. Then m = xd and n = yd for some x, y ∈ Z. Thus c = rxd + syd = (rx + sy)d. It follows that d|c. Thus c is a gcd of m and n.

Finally, from what we have shown, when n ≠ 0, we have constructed a gcd c of m and n which satisfies c = rm + sn for some r, s ∈ Z. If d is another gcd of m and n, then either d = c or d = −c. But −c = (−r)m + (−s)n, so d can be expressed in the required form. We still have to address the case when n = 0, but then gcd(m, n) = m, so any gcd of m and n is of the form rm + sn where r = ±1 and s = 0. □

Corollary 3.10. Let m, n ∈ Z. Then m and n are relatively prime if and only if there are r, s ∈ Z such that 1 = rm + sn. In other words, we can express 1 as a linear combination of m and n.

Proof. If m and n are relatively prime, then 1 is a gcd of m and n. Thus, by the theorem, 1 = rm + sn for some r, s ∈ Z. On the other hand, suppose 1 = rm + sn for some r, s ∈ Z. Now, by the theorem, the least element in X = {rm + sn | r, s ∈ Z} ∩ P is a gcd of m and n, and by assumption, 1 ∈ X. It follows that 1 must be the least element in X, so 1 is a gcd of m and n. □

Theorem 3.11 (Euclidean Algorithm). Suppose that n, m ∈ Z, and n = mq + r. Then gcd(m, n) = gcd(m, r).

Proof. Let c = gcd(m, n) and d = gcd(m, r). Then m = xd and r = yd for some x, y ∈ Z. Thus n = mq + r = (qx + y)d, so d|n. Since d|m and c is a gcd of m and n, it follows that d|c. Next, note that m = uc and n = vc for some u, v ∈ Z, so r = n − mq = (v − uq)c. It follows that c|r and c|m, so c|d. Therefore d = ±c. Thus gcd(m, n) = gcd(m, r). □

It may seem that the Euclidean algorithm is not an algorithm at all, since it does not tell one how to compute the gcd of m and n. The trick is to notice that if we first express n = mq + r with 0 ≤ r < |m|, and then we express m = q1r + r1 with 0 ≤ r1 < r, and continue this process, we obtain a sequence of elements r > r1 > · · · > rn. Eventually, this process must terminate with some rn+1 = 0. But we have gcd(m, n) = gcd(m, r) = gcd(r, r1) = · · · = gcd(rn, rn+1) = rn, since rn+1 = 0. Thus, the Euclidean algorithm computes the gcd of m and n. In fact, the Euclidean algorithm is efficient in this computation. Moreover, we can adapt the Euclidean algorithm to find numbers x and y so that the gcd c of m and n satisfies c = xm + yn. Dr. Nick Passell, a professor emeritus of the department of mathematics at the University of Wisconsin-Eau Claire, developed an efficient algorithm, which we illustrate below.

Let us find the gcd c of 78 and 30, as well as x and y such that c = 30x + 78y. First make a table with 4 columns, with headings r, −q, m and n. We will use it to keep track in each row of how the element r can be expressed as a linear combination of m and n. For simplicity, we start with the largest element n = 78, and the first row expresses that it is zero times m = 30 plus 1 times n. In the next row, before filling in the q column, first note that m = 1 · m + 0 · n, so put a 1 in the m column and a 0 in the n column. Now, note that when we use the division algorithm to express n = mq + r, with 0 ≤ r < |m|, we have q = 2, so write −2 in the q column, and put the r = 18 in the r column in the next row.

To figure out the m column in the current row, add the m column from 2 rows above and −q times the m column in the row above, and do similarly for the n column. Then we begin again by figuring out how to express 30 in the form 30 = 18q + r. We write the −q, which in this case is −1, in the q column, and proceed as before. In each case, we determine the m column by adding the value in the m column two rows above plus −q times the value in the m column in the row above, and similarly for the n column.

Finally, when the number c in the r column divides the number in the r column in the row above, that c is the gcd, and the numbers we calculate in the m and n columns become the x and y so that c = xm + yn. The complete calculation is given in the table below.

 r   −q    m    n
78         0    1
30   −2    1    0
18   −1   −2    1
12   −1    3   −1
 6        −5    2

From this calculation we see that 6 is the gcd of 78 and 30, and that 6 = −5 · 30 + 2 · 78.
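For comparison, here is a compact Python sketch (not part of the notes) of the extended Euclidean algorithm, which performs the same bookkeeping as the table: every remainder is carried along as a linear combination of m and n. Function and variable names are illustrative.

```python
# Extended Euclidean algorithm: returns (g, x, y) with g = gcd(m, n) = x*m + y*n.
def extended_gcd(m: int, n: int):
    r0, x0, y0 = m, 1, 0          # r0 = 1*m + 0*n
    r1, x1, y1 = n, 0, 1          # r1 = 0*m + 1*n
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1  # the invariant r = x*m + y*n is preserved
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return r0, x0, y0

g, x, y = extended_gcd(30, 78)
print(g, x, y)                    # 6 -5 2, matching 6 = -5*30 + 2*78
assert g == x * 30 + y * 78
```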

Definition 3.12. An element a ∈ Z is called a unit if a has a multiplicative inverse.

Of course we already know that the units in Z are precisely the numbers ±1.

Definition 3.13. Let p ∈ Z and suppose that p is not zero and not a unit. Then

• p is said to be irreducible if whenever p = ab then either a or b is a unit.
• p is said to be prime if whenever p|ab then p|a or p|b.

We will show that the notions of primeness and irreducibility coincide for the integers.

Theorem 3.14. Let p ∈ Z be prime. Then p is irreducible.


Proof. Suppose that p is prime and p = ab. Then p|ab so either p|a or p|b. Suppose that p|a. Then a = px for some x and thus p = pxb. It follows that p(1 − xb) = 0. Since p ≠ 0, we must have xb = 1, so b is a unit. Similarly, if p|b, then we can show that a is a unit. It follows that p is irreducible. □

Theorem 3.15. Let p be irreducible and a ∈ Z. Then either gcd(p, a) = 1 or p|a.

Proof. Let c = gcd(p, a). Then c|p, so p = cx, and a = cy for some x, y ∈ Z. If c is a unit, then gcd(p, a) = 1. Otherwise, x is a unit, so c = px^{−1} and a = cy = px^{−1}y. Thus p|a. □

Theorem 3.16. Suppose that p is irreducible. Then p is prime. As a consequence, we have p is prime if and only if p is irreducible.

Proof. Suppose p is irreducible and p|ab. Then ab = xp for some x ∈ Z. If p ∤ a, then gcd(p, a) = 1, so 1 = rp + sa for some r, s ∈ Z. Thus

b = brp + sab = brp + sxp = (br + sx)p.

It follows that p|b. Thus p is prime. □

Proposition 3.17. Suppose that a and b are relatively prime and that a|bx. Then a|x.

Exercise 3.18. Prove the above proposition.

4. Modular Arithmetic

Modular arithmetic is also called clock arithmetic, because the rules of addition resemble the rules for addition on a clock. In order to give a rigorous definition, we will first introduce the notion of an equivalence relation. A relation on a set X is a subset of X × X, that is, a set of pairs (a, b) with a, b ∈ X. If we have a relation, we often denote it by introducing some symbol R, and write xRy to mean that (x, y) lies in the relation. For example, the relation equality is given by the symbol “=” and we write a = b to mean that (a, b) lies in the relation equality. Other examples of relations given by symbols are <, ≤, ⊆. If ∼ is the symbol of a relation, we will usually just call the relation ∼, rather than say that it is the symbol of the relation.

Definition 4.1. Suppose ∼ is a relation on a set X. Then ∼ is called an equivalence relation provided that

(1) a ∼ a for all a ∈ X. (Reflexivity)
(2) If a ∼ b then b ∼ a. (Symmetry)
(3) If a ∼ b and b ∼ c then a ∼ c. (Transitivity)

Definition 4.2. If ∼ is an equivalence relation on X and b ∈ X, then the equivalence class of b, denoted by $\overline{b}$, is

$\overline{b} = \{a \in X \mid a \sim b\}.$

The set of all equivalence classes of elements in X is denoted by X/∼ or sometimes $\overline{X}$.

Theorem 4.3. Let ∼ be an equivalence relation on X. Then the following properties hold:

(1) If a ∈ X, then $a \in \overline{a}$. Thus $\overline{a} \neq \emptyset$.
(2) If $\overline{a} \cap \overline{b} \neq \emptyset$, then $\overline{a} = \overline{b}$.
(3) $\bigcup\{\overline{a} \mid a \in X\} = X$.


Proof. Since a ∼ a by reflexivity, it follows that $a \in \overline{a}$. Thus $\overline{a} \neq \emptyset$. Suppose that $x \in \overline{a} \cap \overline{b}$. Then x ∼ b and x ∼ a. Then by symmetry, b ∼ x. Let $y \in \overline{b}$. Then y ∼ b, and by transitivity y ∼ x, and applying the transitive rule a second time, we have y ∼ a. It follows that $y \in \overline{a}$. This shows $\overline{b} \subseteq \overline{a}$. By a similar argument $\overline{a} \subseteq \overline{b}$. Thus we must have $\overline{a} = \overline{b}$. Finally, let x ∈ X. Then $x \in \overline{x}$, so $x \in \bigcup\{\overline{a} \mid a \in X\}$. It follows that $\bigcup\{\overline{a} \mid a \in X\} = X$. □

Definition 4.4. Let X be a set and C be a collection of subsets of X. Then C is said to be a partition of X provided that

(1) If A ∈ C, then A ≠ ∅.
(2) If A and B are in C, and A ∩ B ≠ ∅, then A = B.
(3) If x ∈ X then x ∈ A for some A ∈ C.

Theorem 4.5. If ∼ is an equivalence relation on a nonempty set X, then the collection $\overline{X}$ is a partition of X.

Exercise 4.6. Prove the above theorem.

Definition 4.7. Let n ∈ Z and define a relation on Z by

x = y (mod n) if y − x = kn for some k ∈ Z.

Theorem 4.8. The relation = (mod n) is an equivalence relation.

Proof. First, note that a = a (mod n), because a − a = 0 = 0 · n. Suppose that a = b (mod n), so b − a = kn for some k ∈ Z. But then a − b = (−k)n, which shows that b = a (mod n). Finally, suppose that a = b (mod n) and b = c (mod n). Then b − a = kn and c − b = ln for some k, l ∈ Z. Thus

c − a = c − b + b − a = ln + kn = (l + k)n.

It follows that a = c (mod n). □

Definition 4.9. For the equivalence relation = (mod n), the set of equivalence classes is denoted by Zn. (Some authors denote it by Z/nZ.)

Theorem 4.10. There is a well-defined binary operation + on Zn given by

$\overline{a} + \overline{b} = \overline{a + b}.$

Moreover, this operation satisfies the following properties.

(1) $\overline{a} + (\overline{b} + \overline{c}) = (\overline{a} + \overline{b}) + \overline{c}$. (Associativity)
(2) $\overline{a} + \overline{b} = \overline{b} + \overline{a}$. (Commutativity)
(3) $\overline{a} + \overline{0} = \overline{a}$. (Existence of additive identity)
(4) $\overline{a} + \overline{-a} = \overline{0}$. (Existence of additive inverse)

Proof. It turns out that the hard part is showing that the addition is well defined. What causes the problem is that the set $\overline{a}$ does not determine the element a. So what the operation actually says is that to add the two sets, take arbitrary elements a and b out of the sets and form the set $\overline{a + b}$. The problem is that we need to show that the set $\overline{a + b}$ does not depend on the choice of a and b.

To do this, let $a_1 \in \overline{a}$ and $b_1 \in \overline{b}$. We need to show that $\overline{a_1 + b_1} = \overline{a + b}$. Now a1 = a (mod n), and b1 = b (mod n), so a − a1 = kn and b − b1 = ln for some k, l ∈ Z. It follows that

(a+ b)− (a1 + b1) = a− a1 + b− b1 = kn+ ln = (k + l)n.


Thus a + b = a1 + b1 (mod n). It follows that $a + b \in \overline{a_1 + b_1}$, and since $a + b \in \overline{a + b}$, we see that $\overline{a_1 + b_1} \cap \overline{a + b} \neq \emptyset$. Therefore, $\overline{a_1 + b_1} = \overline{a + b}$. This shows that addition is well defined.

Now, to show the associative law, we proceed as follows:

$\overline{a} + (\overline{b} + \overline{c}) = \overline{a} + \overline{b + c} = \overline{a + b + c} = \overline{a + b} + \overline{c} = (\overline{a} + \overline{b}) + \overline{c}.$

To show commutativity:

$\overline{a} + \overline{b} = \overline{a + b} = \overline{b + a} = \overline{b} + \overline{a}.$

Next, we compute

$\overline{a} + \overline{0} = \overline{a + 0} = \overline{a}.$

Finally,

$\overline{a} + \overline{-a} = \overline{a + (-a)} = \overline{0}.$ □

Theorem 4.11. There is a well-defined binary operation · on Zn, called multiplication, given by

$\overline{a} \cdot \overline{b} = \overline{ab}.$

This operation satisfies the following properties:

(1) $\overline{a} \cdot (\overline{b} \cdot \overline{c}) = (\overline{a} \cdot \overline{b}) \cdot \overline{c}$. (Associativity)
(2) $\overline{a} \cdot \overline{b} = \overline{b} \cdot \overline{a}$. (Commutativity)
(3) $\overline{a} \cdot (\overline{b} + \overline{c}) = \overline{a} \cdot \overline{b} + \overline{a} \cdot \overline{c}$. (Distributive Law)
(4) $\overline{a} \cdot \overline{1} = \overline{a}$. (Existence of a multiplicative identity)

Proof. As usual, well-definedness is the hard part. Suppose that a1 = a (mod n) and b1 = b (mod n). We need to show that a1b1 = ab (mod n). Now a1 = a + kn and b1 = b + ln for some k, l ∈ Z. Thus

a1b1 − ab = a1b1 − a1b + a1b − ab = a1(b1 − b) + (a1 − a)b = a1ln + knb = (a1l + kb)n.

Thus a1b1 = ab (mod n), and it follows that multiplication is well defined. The properties are straightforward to show and are left as an exercise. □

Theorem 4.12. Let a ∈ Z. Then $\overline{a}$ is a unit in Zn precisely when gcd(a, n) = 1. In that case, if we express 1 = xa + yn, then $\overline{a}^{-1} = \overline{x}$. In particular, Zp is a field if and only if p is prime.

Exercise 4.13. Prove the theorem above.
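Theorem 4.12 is also the practical recipe for inverting elements of Zn: run the extended Euclidean algorithm and read off the coefficient of a. The following Python sketch (not from the notes; names are illustrative) does exactly that.

```python
# Computing the inverse of the class of a in Z_n via 1 = x*a + y*n (Theorem 4.12).
def extended_gcd(a: int, b: int):
    r0, x0, y0 = a, 1, 0
    r1, x1, y1 = b, 0, 1
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return r0, x0, y0

def mod_inverse(a: int, n: int) -> int:
    g, x, _ = extended_gcd(a, n)
    if g != 1:
        raise ValueError("a is not a unit in Z_n")   # gcd(a, n) must be 1
    return x % n

print(mod_inverse(7, 30))                  # 13, since 7*13 = 91 = 1 + 3*30
assert (7 * mod_inverse(7, 30)) % 30 == 1
```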

Theorem 4.14 (Freshmen Exponentiation). Let p ∈ P be prime and a, b ∈ Z. Then (a + b)^p = a^p + b^p (mod p).

Proof. Recall the binomial theorem for n ∈ P:

$(a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k.$

Note that $\binom{n}{k} = \frac{n!}{k!(n-k)!}$, and that $\binom{n}{k} \in P$. As a consequence, when n = p is prime, we note that for any 1 ≤ x < p, we have gcd(x, p) = 1. But this means that gcd(k!, p) = 1 and gcd((p − k)!, p) = 1 if 1 ≤ k < p. Therefore gcd(k!(p − k)!, p) = 1 if 1 ≤ k < p, and since k!(p − k)! divides p! = p · (p − 1)!, it follows from Proposition 3.17 that k!(p − k)! divides (p − 1)!. But this means that $p \mid \binom{p}{k}$, and thus $\binom{p}{k} = 0$ (mod p) for 1 ≤ k < p. It follows that every term in the binomial formula is equal to zero mod p except for the terms with k = 0 and k = p. But the term corresponding to k = 0 is a^p and the term corresponding to k = p is b^p. This gives the exponentiation formula in the theorem. □


Theorem 4.15 (Fermat’s Little Theorem). Suppose that p ∈ P is prime. Then if a ∈ Z, a^p = a (mod p). In particular, if gcd(a, p) = 1, then a^{p−1} = 1 (mod p).

Proof. We first show the statement is true whenever a ∈ P. For a = 1, the statement is trivial. Suppose that a^p = a (mod p). Then

(a + 1)^p = a^p + 1^p = a + 1 (mod p).

Thus by induction, we see that the statement is true for all a ∈ P.

Next, note that 0^p = 0, so the statement holds for a = 0. If p is odd, then if a ∈ P, we have (−a)^p = (−1)^p a^p = −a (mod p), so the statement holds when a < 0. Thus we only have to handle the case when a < 0 and p = 2. But −a = a (mod 2), since −a − a = −2a is divisible by 2. Thus (−a)^2 = a^2 = a = −a (mod 2). Thus the statement holds when p = 2 and a < 0.

Finally, suppose that gcd(a, p) = 1. Now a^p = a (mod p), so a(a^{p−1} − 1) = 0 (mod p). Since Zp is a field and a ≠ 0 (mod p), it follows that a^{p−1} = 1 (mod p). □
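A quick numerical spot check of the theorem (not part of the notes) for a few small primes, including negative values of a:

```python
# Check Fermat's Little Theorem: a^p = a (mod p), and a^(p-1) = 1 (mod p) when p does not divide a.
for p in (2, 3, 5, 7, 11, 13):
    for a in range(-20, 21):
        assert (a**p - a) % p == 0
        if a % p != 0:
            assert (a**(p - 1) - 1) % p == 0
print("Fermat's Little Theorem holds on the sample")
```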

Theorem 4.16 (Chinese Remainder Theorem). Suppose that m and n are relatively prime and a, b ∈ Z. Then there is an x ∈ Z such that

x = a mod m

x = b mod n.

Proof. If there is an x satisfying the statement above, then x = a + km and x = b + ln for some k, l ∈ Z. As a consequence a + km = b + ln. This means that b − a = km − ln. On the other hand, since gcd(m, n) = 1, we know that 1 = rm + sn for some r, s ∈ Z. It follows that b − a = (b − a)rm + (b − a)sn. Thus if we set k = (b − a)r and l = (a − b)s, we have expressed b − a in the required form b − a = km − ln, and then x = a + km satisfies both congruences. □

Theorem 4.17 (General Chinese Remainder Theorem). Let m1, . . . , mn ∈ Z be pairwise coprime; that is, gcd(mi, mj) = 1 if i ≠ j. Let a1, . . . , an ∈ Z. Then there is an integer x such that x = ai (mod mi) for i = 1, . . . , n.

Proof. Let M = m1 · · · mn, and let Mi be the product of the mj for j ≠ i. Then miMi = M. Moreover mi and Mi are relatively prime, so there are integers ri, si such that rimi + siMi = 1. Let ei = siMi. Then rimi + ei = 1, so ei = 1 (mod mi). Moreover, if j ≠ i, then mj | Mi, so mj | ei, and ei = 0 (mod mj). Let x = a1e1 + · · · + anen. It follows that x = ai (mod mi) for all i. □

Exercise 4.18. Suppose that gcd(a, n) = 1. Show that the equation ax = b (mod n) has a solution for any b ∈ Z. Moreover, if $\overline{x}$ is the equivalence class (mod n) of a particular solution x to the equation, then the solutions to the equation are precisely the elements of the equivalence class of x mod n.

Exercise 4.19. Let gcd(a, n) = c, and express a = ca′ and n = cn′. Show that the equation ax = b (mod n) has a solution if and only if c|b. In that case, if b = cb′, and x is a solution to a′x = b′ (mod n′), then x is a solution to ax = b (mod n).

5. Permutations

Definition 5.1. If f : X → Y is a map, then

• f is injective if f(x) = f(x′) implies that x = x′.
• f is surjective if given any y ∈ Y, there is some x ∈ X such that f(x) = y.


f is said to be a bijection if f is both injective and surjective.

Theorem 5.2. Suppose that f : Y → Z and g : X → Y are maps. Then

• If f and g are both injective then f ◦ g is injective.
• If f and g are both surjective then f ◦ g is surjective.
• If f and g are both bijective then f ◦ g is bijective.

If h : W → X is another map, then

(f ◦ g) ◦ h = f ◦ (g ◦ h).

Proof. Suppose that both f and g are injective and (f ◦ g)(x) = (f ◦ g)(x′). Then f(g(x)) = f(g(x′)), and since f is injective it follows that g(x) = g(x′). But then, since g is injective, we see that x = x′. Thus f ◦ g is injective.

Next, suppose that f and g are surjective, and let z ∈ Z. Then since f is surjective, there is some y ∈ Y such that f(y) = z. Since g is surjective, there is some x ∈ X such that g(x) = y. Then (f ◦ g)(x) = f(g(x)) = f(y) = z. Thus f ◦ g is surjective.

Putting the two results together, we see that if f and g are bijective, then f ◦ g is bijective.

Finally, the associativity of function composition is easy to see and is left as an exercise to the reader. □

Definition 5.3. Let X be a set. Then the set SX = {f : X → X | f is bijective} is called the permutation group of X. The permutation group of n = {1, . . . , n} is denoted simply as Sn.

Often, the permutation group of n is denoted by Σn instead of Sn.

Theorem 5.4. Function composition is a well defined binary operation SX × SX → SX. This operation, called the product of permutations, is usually denoted by juxtaposition instead of the composition symbol ◦. It satisfies the following properties.

(1) (στ)ϕ = σ(τϕ). (Associativity)
(2) The identity map 1X, defined by 1X(x) = x, is a permutation and

σ · 1X = 1X · σ = σ for all σ ∈ SX. (Existence of identity)

(3) The inverse map σ−1 to σ, defined by σ−1(y) = x if and only if σ(x) = y, is a permutation of X and

σ · σ−1 = σ−1 · σ = 1X. (Existence of inverse)

Proof. Since the composition of bijections is a bijection, we see that the product of permutations is well defined. Since function composition is associative, the product is associative. Clearly, 1X is a bijection. We have (σ · 1X)(x) = σ(1X(x)) = σ(x) for any x ∈ X. Thus σ · 1X = σ. Similarly, 1X · σ = σ. □

The identity element in Sn is often denoted as e, since the notation 1SX is cumbersome. Note that with this notation, there is some ambiguity about which Sn the element e belongs to, which needs to be determined by context.

Definition 5.5 (Matrix Notation for Permutations). If σ ∈ Sn, then the matrix notation for σ is

(   1    · · ·    n   )
(  σ(1)  · · ·  σ(n)  )


Definition 5.6. Let a1, . . . , ak be a sequence of distinct elements of X. Then the cycle σ associated to the sequence is the map σ : X → X given by

σ(x) = ai+1 if x = ai and 1 ≤ i < k,
σ(x) = a1 if x = ak,
σ(x) = x if x ∉ {a1, . . . , ak}.

We say that the cycle σ has length k, and we denote it by σ = (a1, . . . , ak). If τ = (b1, . . . , bℓ) is another cycle, then the cycles σ and τ are said to be disjoint if the sets {a1, . . . , ak} and {b1, . . . , bℓ} are disjoint.

Exercise 5.7. Show that a cycle σ : X → X is actually a permutation of X.

Theorem 5.8. The product of disjoint cycles commutes.

Proof. Let σ = (a1, . . . , ak) and τ = (b1, . . . , bℓ) be two disjoint cycles. Let ϕ = στ and ψ = τσ. Let x ∈ X. Then exactly one of three possibilities holds: x ∈ {a1, . . . , ak}, x ∈ {b1, . . . , bℓ}, or x ∉ {a1, . . . , ak, b1, . . . , bℓ}. Let us examine what happens in each case.

Case 1: x ∈ {a1, . . . , ak}. In this case σ(x) ∉ {b1, . . . , bℓ}, so ψ(x) = τ(σ(x)) = σ(x). Moreover, τ(x) = x, so ϕ(x) = σ(τ(x)) = σ(x). Thus ϕ(x) = ψ(x).

Case 2: x ∈ {b1, . . . , bℓ}. In this case τ(x) ∉ {a1, . . . , ak}, so ϕ(x) = σ(τ(x)) = τ(x). Moreover, σ(x) = x, so ψ(x) = τ(σ(x)) = τ(x). Thus ϕ(x) = ψ(x).

Case 3: x ∉ {a1, . . . , ak, b1, . . . , bℓ}. In this case, both σ(x) = x and τ(x) = x, so ψ(x) = x = ϕ(x).

Since ϕ(x) = ψ(x) for all x, we see that σ and τ commute. □

We can generalize the result above and combine it with the associative law to see that if σ1, . . . , σm is a sequence of disjoint cycles, then the order of multiplication does not affect their product.

Theorem 5.9. If X is a nonempty finite set, then every permutation can be written as a product of disjoint cycles so that every element of X appears in one of the cycles. Moreover, this product is unique up to order.

Exercise 5.10. Prove the above theorem.

Note that there is some ambiguity about which Sn a permutation written in disjoint cycle notation belongs to. For example, σ = (1, 3, 2) might belong to Sn for any n ≥ 3. Sometimes this ambiguity is advantageous. Note that there is no ambiguity about the n when a permutation is expressed in matrix notation.

Theorem 5.11. Let σ = (a1, . . . , ak) be a cycle. Then σ−1 = (ak, . . . , a1). In other words, to compute the inverse of a cycle, you just reverse the order of the elements in the cycle.

Exercise 5.12. Prove the theorem above.

Theorem 5.13. If σ, τ ∈ SX , then (στ)−1 = τ−1σ−1.

Proof. If f : X → Y and g : Y → X, then we know that g = f−1 precisely when g ◦ f = 1X and f ◦ g = 1Y. Thus we compute

(στ)(τ−1σ−1) = σ(ττ−1)σ−1 = σ · 1Xσ−1 = σσ−1 = 1X.

Similarly, (τ−1σ−1)(στ) = 1X. Thus (στ)−1 = τ−1σ−1. □


In the study of linear algebra, you learned that if A, B are invertible n × n matrices, then (AB)−1 = B−1A−1. The rule for computing the inverse of a product of permutations is analogous to the rule for matrix inverse computation.

If you combine the rule for computing the inverse of a cycle and the rule for computing the inverse of a product of permutations, you obtain a simple method for computing the inverse of a product of any number of cycles, whether they are disjoint or not. In the case when one has a product of disjoint cycles, this gives a very simple method of computing the inverse.

Example 5.14. Let σ = (1, 3, 5, 6)(2, 4, 8). Then σ−1 = (6, 5, 3, 1)(8, 4, 2). Notice that we don’t have to reverse the order because the two cycles are disjoint, so their inverses are also disjoint, and thus can be multiplied in any order.

It is also easy to multiply permutations which are expressed in cycle notation. In fact, one can compute the product of a number of permutations in a very quick fashion. It is also easy to convert the matrix notation for a permutation into disjoint cycle notation.

Exercise 5.15. Let

σ = (  1  2  3  4  5  6  7  8  )
    (  3  5  4  6  1  8  7  2  ).

Then σ = (1, 3, 4, 6, 8, 2, 5). Notice that σ = (1, 3, 4, 6, 8, 2, 5)(7) as well, but it is customary to drop the singleton cycles from the expression for σ, as they are not necessary. Now let τ = (1, 4, 5)(2, 3)(7, 8) be a permutation in S8 expressed in cycle notation. Then the matrix notation for τ is

τ = (  1  2  3  4  5  6  7  8  )
    (  4  3  2  5  1  6  8  7  ).

To find the product of σ and τ write

στ = (1, 3, 4, 6, 8, 2, 5)(1, 4, 5)(2, 3)(7, 8) = (1, 6, 8, 7, 2, 4)(3, 5).

To calculate this, we first note that when computing the product of permutations, you must remember that the permutation on the right acts first. To get the right hand side of the equation, you first start a cycle with any number. We started with 1, so we first wrote (1. Now reading from right to left, we track down where 1 goes to. First, the cycle (1, 4, 5) acts on 1, taking it to 4. Then the cycle (1, 3, 4, 6, 8, 2, 5) takes 4 to 6. Thus we put a comma, followed by a 6, so we have (1, 6 so far. Next we do the same thing as we did with 1, but starting with 6, and find that 6 goes to 8. We continue in this manner until we have (1, 6, 8, 7, 2, 4. When we repeat the process with 4, we find 4 goes to 5, which then goes to 1. Since 4 goes to 1, which is the first element in the cycle, we have computed a cycle in the product. Next, we look for a number which is not in the first cycle. 3 is such a number, so we can start a new cycle with (3. In this manner, we compute the product.
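The right-to-left tracing described above is easy to mechanize. The following Python sketch (not from the notes; names are illustrative) multiplies permutations given in cycle notation, with the right factor acting first, and returns the answer in disjoint cycle notation.

```python
# Composing permutations given by cycles on {1, ..., n}, right factor acting first.
def cycles_to_map(cycles, n):
    perm = {i: i for i in range(1, n + 1)}
    for cyc in cycles:
        for i, a in enumerate(cyc):
            perm[a] = cyc[(i + 1) % len(cyc)]
    return perm

def compose(sigma, tau):
    """(sigma tau)(x) = sigma(tau(x)): the permutation on the right acts first."""
    return {x: sigma[tau[x]] for x in tau}

def to_disjoint_cycles(perm):
    seen, cycles = set(), []
    for start in sorted(perm):
        if start in seen or perm[start] == start:    # drop singleton cycles
            continue
        cyc, x = [], start
        while x not in seen:
            seen.add(x)
            cyc.append(x)
            x = perm[x]
        cycles.append(tuple(cyc))
    return cycles

sigma = cycles_to_map([(1, 3, 4, 6, 8, 2, 5)], 8)
tau = cycles_to_map([(1, 4, 5), (2, 3), (7, 8)], 8)
print(to_disjoint_cycles(compose(sigma, tau)))       # [(1, 6, 8, 7, 2, 4), (3, 5)]
```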

Note that the method above can be applied when multiplying more than two permutations together. Thus it is a very efficient method of computing the product of two permutations. One might ask, if the disjoint cycle notation is so advantageous for computing inverses and products of cycles, what is the value of the matrix notation? It turns out that the matrix notation has some applications which do not arise in a course in abstract algebra, and the notation is a common notation as well, so it is valuable to learn.

Definition 5.16. If σ ∈ SX, then the order of σ, denoted o(σ), is the least positive integer m such that σ^m = 1X. If there is no such integer, then we say that the order of σ is ∞ and write o(σ) = ∞.


Theorem 5.17. Let σ = (a1, . . . , ak) be a cycle. Then o(σ) = k.

Proof. Suppose that 1 ≤ i < k. Then it is a straightforward induction to see that σ^i(a1) = ai+1. Since ai+1 ≠ a1, it follows that σ^i ≠ e. In particular σ^k(a1) = σ(ak) = a1. Since σ = (aj, . . . , ak, a1, . . . , aj−1) for any 1 ≤ j ≤ k, it follows that σ^k(aj) = aj for any 1 ≤ j ≤ k. Moreover, if x ∉ {a1, . . . , ak}, then σ(x) = x, so σ^k(x) = x. It follows that σ^k = e. □

Recall that if n1, . . . , nℓ ∈ P, then lcm(n1, . . . , nℓ) is the least common multiple of n1, . . . , nℓ. It is the smallest positive integer x such that ni|x for all i = 1, . . . , ℓ.

Corollary 5.18. Let σ = σ1 · · · σm be a product of disjoint cycles σ1, . . . , σm. Then

o(σ) = lcm(o(σ1), . . . , o(σm)).
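As a small check of Corollary 5.18 (not from the notes), the permutation σ = (1, 3, 5, 6)(2, 4, 8) from Example 5.14 is a product of disjoint cycles of lengths 4 and 3, so its order should be lcm(4, 3) = 12; the sketch below verifies this by brute force. It reuses the illustrative cycles_to_map helper from the previous sketch.

```python
# Verifying o(sigma) = lcm of the lengths of its disjoint cycles.
from math import lcm

def cycles_to_map(cycles, n):
    perm = {i: i for i in range(1, n + 1)}
    for cyc in cycles:
        for i, a in enumerate(cyc):
            perm[a] = cyc[(i + 1) % len(cyc)]
    return perm

def order(perm):
    """Smallest m >= 1 such that the m-th power of perm is the identity."""
    current, m = dict(perm), 1
    while any(current[x] != x for x in current):
        current = {x: perm[current[x]] for x in current}   # one more factor of perm
        m += 1
    return m

sigma = cycles_to_map([(1, 3, 5, 6), (2, 4, 8)], 8)
assert order(sigma) == lcm(4, 3) == 12
print(order(sigma))    # 12
```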

Definition 5.19. A cycle of the form (a1, a2) is called a transposition.

Theorem 5.20. Let k > 1 and σ = (a1, . . . , ak) be a cycle. Then

σ = (a1, a2)(a2, a3) · · · (ak−1, ak).

As a consequence, any element of Sn can be written as a product of transpositions when n > 1.

Proof. That σ = (a1, a2)(a2, a3) · · · (ak−1, ak) is a matter of calculation. If σ is not the identity, then it can be written as a product of disjoint cycles, each of which has length at least 2. Thus, after factoring each of them as a product of transpositions, we have found a factorization of σ in the desired form. It remains to consider the case when σ = e. But e = (1, 2)(1, 2), so it is a product of transpositions. □

Definition 5.21. Let n > 1. Then a permutation σ ∈ Sn is said to be even if it can be expressed as a product of an even number of transpositions. A permutation which is not even is said to be odd.

Note that if n > 1, then a permutation which is odd can be expressed as a product of an odd number of transpositions, since every permutation is a product of transpositions. What is not so obvious is that a permutation cannot be expressed both as a product of an even number of transpositions and an odd number of transpositions. In order to prove this fact, we need to develop some properties of permutations.

Definition 5.22. Suppose that σ ∈ Sn can be expressed as a product of k disjoint cycles so that every number 1 ≤ i ≤ n appears in one of the cycles. Then the orbit number of σ is n − k.

Notice that since the decomposition of σ into such a product is unique up to the order of the cycles, the orbit number is well defined. Also, the orbit number of the identity element e is zero, since it is a product of n disjoint cycles.

Theorem 5.23. Suppose that σ ∈ Sn and let τ = (a, b) be a transposition. Then the orbit number of στ is either 1 larger or 1 less than the orbit number of σ. More precisely, if a and b lie in the same cycle of σ, then στ has 1 more orbit than σ, and if a and b lie in different cycles of σ, then στ has one less orbit.


Proof. Suppose that a and b belong to the same orbit of σ. We can suppose that σ = (a, a1, . . . , ak, b, b1, . . . , bℓ), as the other cycles in σ will not influence the outcome of the product. Then

στ = (a, a1, . . . , ak, b, b1, . . . , bℓ)(a, b) = (b, a1, . . . , ak)(a, b1, . . . , bℓ),

so στ has one more orbit.

Next, suppose that a and b belong to different orbits. Then we can suppose that σ = (a, a1, . . . , ak)(b, b1, . . . , bℓ). In this case, we have

στ = (a, a1, . . . , ak)(b, b1, . . . , bℓ)(a, b) = (a, b1, . . . , bℓ, b, a1, . . . , ak),

so that στ has one less orbit. □

Corollary 5.24. If n > 1, then an element σ ∈ Sn has a factorization as a product of an even number of transpositions or an odd number of transpositions, but not both. In fact, σ is even precisely when its orbit number is even. Moreover, we have the following:

• The product of two even elements is even.
• The product of an even element and an odd element in either order is an odd element.
• The product of two odd elements is even.
• The inverse of an even element is even.
• The inverse of an odd element is odd.

6. Groups

Definition 6.1. A set G, equipped with a binary operation ⋆, called the product or group operation, is called a group provided that

(1) a ⋆ (b ⋆ c) = (a ⋆ b) ⋆ c for all a, b, c ∈ G. (Associativity)
(2) There is an element e ∈ G such that a ⋆ e = e ⋆ a = a for all a ∈ G. (Existence of identity)
(3) Given a ∈ G there is some b ∈ G such that a ⋆ b = b ⋆ a = e. (Existence of inverse)

Frequently, the group operation is indicated by juxtaposition; i.e., we write gh instead of g ⋆ h. If we wish to emphasize the group operation, we sometimes say (G, ⋆) is a group. This may be important when the set G is equipped with more than one operation. It is also common for the operation to be written as +, but in that case, almost always we require the operation to be commutative, which we define below.

Definition 6.2. A group G with product ⋆ is said to be commutative provided that a ⋆ b = b ⋆ a for all a, b ∈ G.

Examples of commutative groups are (Z,+), (Zn,+), (Q,+), (R,+), and any vector space over any field k with the operation of addition. In all of these cases the identity element is called 0. Commutative groups whose group operation is not written as + are (Z∗, ·), (Z∗n, ·), (R∗, ·), where the ∗ means the subset of elements invertible under the group operation.

The set GL(n, k) of invertible n × n matrices with coefficients in a field k is a group under matrix multiplication, which is not commutative if n > 1. The permutation group SX is a group under composition of maps, which is also not commutative when X has more than two elements.


A careful reading of the definition of a group reveals that it does not state that there is only one identity element or one element satisfying the inverse property. Luckily, we can prove this uniqueness of identity and inverse.

Theorem 6.3 (Uniqueness of Identity). Suppose that G is a group and e, e′ both satisfy the condition of identity in the second axiom of a group. Then e = e′. In fact, if e is the identity and e′ ⋆ a = a or a ⋆ e′ = a for some a ∈ G, then e′ = e.

Proof. Suppose that e ⋆ a = a ⋆ e = a for all a ∈ G, and that e′ ⋆ a = a for some a ∈ G. By the third axiom of groups, there is some b ∈ G such that a ⋆ b = e. Then

e′ = e′ ⋆ e = e′ ⋆ (a ⋆ b) = (e′ ⋆ a) ⋆ b = a ⋆ b = e.

The proof is similar if we assume a ⋆ e′ = a for some a ∈ G. �

Theorem 6.4 (Uniqueness of Inverse). Suppose that G is a group, a, b ∈ G and a ⋆ b = b ⋆ a = e. Let b′ ∈ G satisfy b′ ⋆ a = e or a ⋆ b′ = e. Then b′ = b.

Proof. Let a, b be as in the statement of the theorem, and suppose that b′ ⋆ a = e. Then

b′ = b′ ⋆ e = b′ ⋆ (a ⋆ b) = (b′ ⋆ a) ⋆ b = e ⋆ b = b.

A similar argument holds when a ⋆ b′ = e. �

As a consequence of the above theorem, we can give the definition below.

Definition 6.5. If G is a group with identity e and g ∈ G, then the inverse of g is the unique element h such that g ⋆ h = h ⋆ g = e. When the group operation of G is written in some multiplicative form (either by juxtaposition or ⋆), we denote the inverse of g by g−1. When the group operation of G is a commutative operation written as +, we write the inverse of g as −g. Most of the time, we will assume that the group in question is written multiplicatively, so we will state our results in that form. Later, we will give a table which compares the multiplicative forms of our results to their additively written counterparts.

Theorem 6.6. Let G be a group (written multiplicatively). Then

• If g ∈ G, then (g−1)−1 = g.
• If g, h ∈ G, then (gh)−1 = h−1g−1.

Exercise 6.7. Prove the above theorem.

Definition 6.8 (Exponentiation). Let G be a group. For n ∈ P we define the power gn for g ∈ G recursively as follows:

• g1 = g.
• gs(n) = gng.

This definition is extended to all n ∈ Z as follows

• g0 = e.
• g−n = (gn)−1 if n ∈ P.
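The recursive definition above translates directly into code. The following Python sketch (mine, not from the notes) computes gn in any group described by its operation, identity, and inverse function, following g1 = g, gs(n) = gng, g0 = e, and g−n = (gn)−1.

    def power(g, n, op, e, inv):
        """Exponentiation following Definition 6.8, extended to all integers."""
        if n == 0:
            return e                               # g^0 = e
        if n < 0:
            return inv(power(g, -n, op, e, inv))   # g^(-n) = (g^n)^(-1)
        result = g                                 # g^1 = g
        for _ in range(n - 1):                     # g^(s(k)) = g^k * g
            result = op(result, g)
        return result

    # Example: the commutative group Z_7* = {1, ..., 6} under multiplication mod 7.
    op = lambda a, b: (a * b) % 7
    inv = lambda a: pow(a, 5, 7)   # a^5 is the inverse of a mod 7, since a^6 = 1
    print(power(3, 4, op, 1, inv))    # 3^4 = 81 = 4 (mod 7)
    print(power(3, -1, op, 1, inv))   # 5, since 3 * 5 = 15 = 1 (mod 7)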

Lemma 6.9. Let G be a group, g ∈ G, and m,n ∈ P. Then

(1) gmgn = gm+n.
(2) (gm)n = gmn.

Page 22: The Principle of Mathematical Induction. Theorem 1.1. P nThe principle of mathematical induction shows that a recursive definition gives a well-defined function, provided that the

22 MICHAEL PENKAVA

Proof. To establish the first equation, we show that the set S = {n ∈ P | gmgn = gm+n for all m ∈ P} is an inductive subset of P. Note gm+1 = gmg = gmg1, so 1 ∈ S. Suppose that n ∈ S. Then

gm+s(n) = g(m+n)+1 = gm+ng = gmgng = gmgn+1 = gmgs(n).

Thus S is inductive, so it follows that S = P.

Next, we show that the set S = {n ∈ P | (gm)n = gmn for all m ∈ P} is an inductive subset of P. Note (gm)1 = gm = gm·1, so 1 ∈ S. Suppose that n ∈ S. Then

(gm)s(n) = (gm)n+1 = (gm)n(gm)1 = gmngm = gmn+m = gm(n+1) = gm·s(n).

Thus S is inductive so that S = P. �

Theorem 6.10. Let G be a group, g ∈ G and n ∈ P. Then g−n = (g−1)n.

Proof. We proceed by induction. Let S = {n ∈ P | g−n = (g−1)n}. Since g−1 = (g−1)1 by the definition of exponentiation, it follows that 1 ∈ S. Suppose that n ∈ S. Then

g−s(n) = (gs(n))−1 = (gng)−1 = g−1(gn)−1 = g−1(g−1)n = (g−1)n+1 = (g−1)s(n).

Thus S is inductive, so S = P. �

Now we are ready to show that the statements of Lemma 6.9 hold for all integers.

Theorem 6.11. Let G be a group, g ∈ G, and m,n ∈ Z. Then

(1) gmgn = gm+n.
(2) (gm)n = gmn.

Proof. Let us first note that both formulas are immediate whenever m or n is equal to zero. Thus, we can restrict to the case when either both m and n are negative, or when one is positive and the other is negative.

Let us examine the case when both coefficients are negative, so m = −k and n = −ℓ for some k, ℓ ∈ P. Then

gm+n = gn+m = g−(ℓ+k) = (gℓ+k)−1 = (gℓgk)−1

= (gk)−1(gℓ)−1 = g−kg−ℓ = gmgn.

Next

(gm)n = (g−k)−ℓ = ((g−k)ℓ)−1 = (((gk)−1)ℓ)−1
= (((gk)ℓ)−1)−1 = (gk)ℓ = gkℓ = gmn.

Thus, both formulae hold when m and n are negative. Next, suppose that m ≥ n, where m, n ∈ P. Then

gmg−n = gm−ngn(gn)−1 = gm−n.

A similar formula holds when 1 ≤ m < n, but this time we factor g−n = g−mg−(n−m). It is an exercise for the reader to extend this to the case g−mgn when m, n ∈ P.

Finally, let us consider the multiplicative formula. Let m,n ∈ P. Then

(g−m)n = ((gm)−1)n = (gm)−n = ((gm)n)−1 = (gmn)−1 = g−mn.

Finally, the case (gm)−n is handled similarly. �


Theorem 6.12. Suppose that G is a group, g, h ∈ G. Then if gh = hg, we have (gh)m = gmhm for all m ∈ Z. Moreover, g and h commute precisely when (gh)2 = g2h2, so if g and h fail to commute, the formula does not hold for all m ∈ Z.

Exercise 6.13. Prove the above theorem.

Let us consider a group G with a commutative operation +. In this case, it is natural to write a + a as 2a rather than a2. We could define na for a ∈ G and n ∈ Z by a recursive definition for n ∈ P, together with 0a = 0 and (−n)a = −(na). Then we could prove properties corresponding to the exponentiation rules we have shown above. However, it is not really necessary to do this, as these rules represent a translation of the power rules into additive notation. In the table below, we give a comparison between the exponential properties of a group, written multiplicatively, and the properties of multiplication by integers in a group with the group operation given as +, which we assume is commutative.

Property              Exponential Notation             Additive Notation
Power                 gn                               ng
Inverse               g−1                              −g
Sum Rule              gm+n = gmgn                      (m + n)g = mg + ng
Multiplication Rule   (gm)n = gmn                      (mn)g = m(ng)
Power of Products     if gh = hg, then (gh)m = gmhm    m(g + h) = mg + mh

Proposition 6.14 (Cancelation Laws). Let G be a group and a, b, c ∈ G.

(1) If ab = ac then b = c. (left cancelation)
(2) If ba = ca then b = c. (right cancelation)

Exercise 6.15. Prove the above proposition.

Proposition 6.16. If a, b ∈ G, then

• The equation ax = b has the unique solution x = a−1b.
• The equation xa = b has the unique solution x = ba−1.

Exercise 6.17. Prove the above proposition.

Definition 6.18. If G is a finite group, then a Cayley Table of the group is an n × n matrix whose columns and rows are headed by elements of the group ordered as g1, . . . , gn, and whose entry in the ith row and jth column is gigj.

It is conventional to list the elements in the same order in the rows as in the columnsand to list the identity element e first. Cayley tables first appeared in an 1854 paperby Arthur Cayley 1821–1895.

Example 6.19. Recall that the symmetric group S3 is given by S3 = {e, ρ, ρ2, σ, ρσ, ρ2σ}, where ρ = (1, 2, 3) and σ = (1, 2) are given in cyclic notation. Then a Cayley table for S3 is


       e     ρ     ρ2    σ     ρσ    ρ2σ
 e     e     ρ     ρ2    σ     ρσ    ρ2σ
 ρ     ρ     ρ2    e     ρσ    ρ2σ   σ
 ρ2    ρ2    e     ρ     ρ2σ   σ     ρσ
 σ     σ     ρ2σ   ρσ    e     ρ2    ρ
 ρσ    ρσ    σ     ρ2σ   ρ     e     ρ2
 ρ2σ   ρ2σ   ρσ    σ     ρ2    ρ     e

In the example above, note that each element appears exactly once in each column and row of the table. This is a basic property of Cayley tables, which follows from Proposition 6.16. We can exploit this to find Cayley tables for groups of small size.

Example 6.20. Let G be a group of order 2, e be its identity element, and g be the nonidentity element. Then its Cayley Table is

      e   g
 e    e   g
 g    g   e

Example 6.21. Let G be a group of order 3, e be its identity element and g be a nonidentity element. Let h be the third element. If g2 were equal to e, then consider what the Cayley Table would look like

      e   g   h
 e    e   g   h
 g    g   e   x
 h    h   y   z

There is no way to assign a value to either x or y which is consistent with the observation that the rows and columns must have each element appearing only once. Thus we must have g2 = h, and we can write the Cayley Table as follows.

      e    g    g2
 e    e    g    g2
 g    g    g2   e
 g2   g2   e    g

We have presented Cayley tables for groups of order 2 and 3, but there is a problem with the exposition. The table alone does not mean that there is a group with the structure given in the table. For example, if we have a group of order 3, then its Cayley table must look like the one given in the example above, but that is not enough to show that there is such a group. The problem is that we have not verified the axioms.

In particular, the axiom of associativity for a binary operation on a set G is time consuming to verify. For a group of order n, there are n3 expressions of the form (a ⋆ b) ⋆ c, and another n3 expressions of the form a ⋆ (b ⋆ c). To check associativity, we have to compare the two sequences of products, so one needs to make 2n3 calculations to verify the associativity axiom directly.
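As a rough illustration of this count (my own sketch, not part of the notes), the following Python function checks associativity of a finite binary operation given by its Cayley table, by comparing every product (a ⋆ b) ⋆ c with the corresponding a ⋆ (b ⋆ c).

    # A Cayley table is stored as a dict: table[(a, b)] is the product a*b.

    def is_associative(elements, table):
        """Brute-force comparison of the n^3 products (a*b)*c with the n^3 products a*(b*c)."""
        for a in elements:
            for b in elements:
                for c in elements:
                    if table[(table[(a, b)], c)] != table[(a, table[(b, c)])]:
                        return False
        return True

    # Example: the Cayley table of the group of order 3 constructed above.
    elements = ["e", "g", "h"]               # here h plays the role of g^2
    table = {("e", "e"): "e", ("e", "g"): "g", ("e", "h"): "h",
             ("g", "e"): "g", ("g", "g"): "h", ("g", "h"): "e",
             ("h", "e"): "h", ("h", "g"): "e", ("h", "h"): "g"}
    print(is_associative(elements, table))   # True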


On the other hand, it is possible to determine if the other two axioms are satisfied by examining the Cayley table, so checking associativity is the main problem. As a consequence, we will have to find other methods to check if a binary operation is associative. We already encountered this problem for the groups Z and Zn. We proved associativity of addition on P by verifying that it is a consequence of the Peano postulates, and we reduced the problem of verifying the associativity of addition on Zn to the associativity in Z. We did not give a proof of associativity in Z, because we did not give the construction of Z from P, but this can be done.

These remarks still are not enough to verify that there is a group whose Cayley table corresponds to the one which we gave for a group of order 3. However, it is enough to show that there is some group of order 3, because our argument shows that the table we gave applies to any group of order 3. We can exhibit such a group easily, as we show in the theorem below.

Theorem 6.22. The group given by addition on Zn for n ∈ P has order n.

Exercise 6.23. Prove the above theorem.

There is a second problem with the idea that there is only one group of order 3, and it is more substantial. In fact, there are many groups of order 3. If we take a set G = {x, y, z} of three elements, then we can give it the structure of a group of order 3 by identifying e = x, g = y and g2 = z. Since there is clearly more than one set of order 3, there is more than one group of order 3. Yet, in some sense, we would like to say that all groups of order 3 are essentially the same group. To make this idea precise, we introduce the notion of isomorphism.

Definition 6.24. Let G and G′ be two sets equipped with binary operations (which we will denote by juxtaposition). Then an isomorphism ϕ from G to G′ is a bijection ϕ : G → G′ which satisfies

ϕ(gh) = ϕ(g)ϕ(h), for all g, h ∈ G.

We can interpret ϕ as a relabeling function. The key idea is that it doesn’t matter whether you multiply two elements in G and then consider the image of their product, or multiply their images in G′, because you obtain the same result. We can also express the property of isomorphism in the form

gh = ϕ−1(ϕ(g)ϕ(h)).

From this point of view, to compute the product in G, first map the two elements to G′, compute their product, and then map back. We can use this formula to define a product on G given a product on G′ and a bijection between G and G′.

Notice that we did not require G or G′ to be a group to define an isomorphism. However, there is an important fact we can establish relating group structures and bijections.

Theorem 6.25. Let ϕ : G → G′ be an isomorphism between two sets which are equipped with binary operations. Then

• The product on G is associative if and only if the product on G′ is associative.
• There is an identity element in G if and only if there is one in G′.
• G is a group if and only if G′ is a group.


Proof. Suppose that x, y, z ∈ G′. Let g, h, k ∈ G be such that ϕ(g) = x, ϕ(h) = y and ϕ(k) = z. If the product on G is associative, then

(xy)z = (ϕ(g)ϕ(h))ϕ(k) = ϕ(gh)ϕ(k) = ϕ((gh)k) = ϕ(g(hk)) = ϕ(g)ϕ(hk)
= x(ϕ(h)ϕ(k)) = x(yz).

Suppose that g, h, k ∈ G and let x, y, z ∈ G′ be such that ϕ(g) = x, ϕ(h) = y and ϕ(k) = z. If the product on G′ is associative, then

(gh)k = ϕ−1(ϕ((gh)k)) = ϕ−1(ϕ(gh)ϕ(k)) = ϕ−1((ϕ(g)ϕ(h))z)
= ϕ−1((xy)z) = ϕ−1(x(yz)) = ϕ−1(ϕ(g)(ϕ(h)ϕ(k)))
= ϕ−1(ϕ(g)ϕ(hk)) = ϕ−1(ϕ(g(hk))) = g(hk).

This shows that the product on G is associative precisely when the product on G′ is associative.

Next, suppose that eg = ge = g for all g ∈ G. Let e′ = ϕ(e). Suppose that

x ∈ G′. Then x = ϕ(g) for some g ∈ G, so we have

e′x = ϕ(e)ϕ(g) = ϕ(eg) = ϕ(g) = x,

and similarly xe′ = x. Thus e′ is an identity element in G′. On the other hand, if e′ ∈ G′ satisfies e′x = xe′ = x for all x ∈ G′, then let e ∈ G be such that ϕ(e) = e′. Let g ∈ G. Then

eg = ϕ−1(ϕ(eg)) = ϕ−1(ϕ(e)ϕ(g)) = ϕ−1(e′ϕ(g)) = ϕ−1(ϕ(g)) = g,

and similarly ge = g.

Now finally, note that if either G or G′ is a group, then the product on both of them is associative and there are identity elements e ∈ G and ϕ(e) = e′ ∈ G′. Assume G is a group, and let x ∈ G′. Then there is some g ∈ G such that ϕ(g) = x and some h ∈ G such that gh = hg = e. Let y = ϕ(h). Then xy = ϕ(g)ϕ(h) = ϕ(gh) = ϕ(e) = e′ and similarly yx = e′. Thus G′ is a group. We leave for the reader the case of showing that if G′ is a group then G is a group. �

7. Subgroups

Definition 7.1. If H ⊆ G is a subset of a group G, then H is said to be a subgroup of G provided that H is a group under the same binary operation as in G. If H is a subgroup of G, we denote this fact by H ≤ G.

It is very important that we use the same binary operation. For example, Q is a group under addition, and Q∗, the set of nonzero elements of Q, is a group under multiplication. But it is not a subgroup of Q, because the binary operation is not the same.

Since a subgroup H of a group G must be a group, it cannot be empty, since it must have an identity element. In fact, if e is the identity in G and e′ is the identity in H, then if h ∈ H, we have e′h = h. But by Uniqueness of Identity in G, it follows that e′ = e. Thus the identity in H must be the same identity as in G. Similarly, the inverse of an element in H must coincide with the inverse of that element in G, by the Uniqueness of Inverse.

The theorem below gives a powerful criterion for determining whether a subsetof a group is a subgroup.


Theorem 7.2. Let H be a subset of a group G. Then H is a subgroup of G if and only if it satisfies the three properties below.

(1) H is not empty.
(2) If a, b ∈ H, then ab ∈ H. (H is closed under the group operation.)
(3) If a ∈ H, then a−1 ∈ H. (H is closed under inverses.)

Proof. By the second property, the binary operation is well-defined on H. Although we don’t state the condition of well-definedness of the binary operation as one of the three axioms of a group, it is implicitly required, because the definition of a group is based on the existence of a binary operation satisfying three requirements, so at least we must verify that the binary operation is well-defined.

The axiom of associativity is automatic, because associativity holds in G, so the product of elements in H must also satisfy associativity.

The existence of an identity in H is verified as follows. By the first property, there is some h ∈ H. By the third property, h−1 ∈ H. Thus by the second property, e = hh−1 lies in H. Since e is the identity in G, it also is the identity in H.

Finally, the third property guarantees that h−1 ∈ H if h ∈ H. Moreover, h−1 is the inverse of h in H, since it is the inverse of h in G. �
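The criterion of Theorem 7.2 is easy to test mechanically for a finite subset of a finite group. Here is a short Python sketch (my own; it represents elements of S3 as tuples of images of 0, 1, 2 rather than 1, 2, 3) which checks the three properties.

    from itertools import permutations

    S3 = list(permutations(range(3)))        # elements of S_3: p[i] is the image of i

    def compose(p, q):
        """(p*q)(i) = p(q(i))."""
        return tuple(p[q[i]] for i in range(3))

    def inverse(p):
        inv = [0] * 3
        for i, image in enumerate(p):
            inv[image] = i
        return tuple(inv)

    def is_subgroup(H):
        """Theorem 7.2: nonempty, closed under products, closed under inverses."""
        if not H:
            return False
        return (all(compose(a, b) in H for a in H for b in H)
                and all(inverse(a) in H for a in H))

    H = {(0, 1, 2), (1, 0, 2)}     # {e, (0 1)}: a subgroup of order 2
    K = {(0, 1, 2), (1, 2, 0)}     # {e, (0 1 2)}: not closed under products
    print(is_subgroup(H), is_subgroup(K))    # True False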

Students often make the mistake of trying to show that H is a subgroup of G by proving that an identity exists in H or an inverse exists in H. This is not the idea. Existence of the identity and existence of the inverse of an element in H are already guaranteed. It is the location of the identity and inverse we are concerned with. You want to show the identity lies in H and the inverse of an element in H lies in H, not that these elements exist.

Another common mistake is students proving that the product in H is associative. We already know that, so such a “proof” is irrelevant to showing that H is a subgroup of G. The key is to prove that the three properties in the theorem hold!

Corollary 7.3. Suppose that H is a finite subset of a group G and it satisfies the first two properties in the theorem. Then H is a subgroup of G.

Proof. Since the first two properties hold, we need only show that the third holds. Let h ∈ H. Now the set P is infinite, and hk ∈ H for any k ∈ P by a straightforward induction argument, so since H is finite there must be some m < n ∈ P such that hm = hn. Then e = hmh−m = hnh−m = hn−m. Let k = n − m. If k = 1, then h = e, so h = h−1 and h−1 ∈ H. Otherwise k > 1, so k − 1 ∈ P. Now e = hhk−1, so h−1 = hk−1 ∈ H. �

Corollary 7.4. Let G be a finite group. Then if g ∈ G, g−1 = gk for some k ∈ P.

Proof. Let H = {gk | k ∈ P}. Then H is not empty, since g ∈ H. Suppose that x, y ∈ H. Then x = gk and y = gℓ for some k, ℓ ∈ P. Then xy = gk+ℓ ∈ H. Thus H satisfies the first two properties in the theorem, and since G is finite, so is H. Thus by the corollary above, H is a subgroup of G. But this implies g−1 = gk for some k ∈ P. �
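Concretely, in a finite group the powers of g eventually return to the identity, and the power just before that is g−1. Here is a minimal Python sketch of this observation (mine, using Z10∗ = {1, 3, 7, 9} under multiplication modulo 10 as the example group).

    def inverse_as_power(g, op, e):
        """Return (k, g^k) with g^k = g^(-1), found by running through the powers of g."""
        power, k = g, 1
        while op(power, g) != e:     # stop when g^(k+1) = e, so that g^k = g^(-1)
            power = op(power, g)
            k += 1
        return k, power

    op = lambda a, b: (a * b) % 10
    print(inverse_as_power(3, op, 1))   # (3, 7): 3^3 = 27 = 7 (mod 10) and 3 * 7 = 1 (mod 10)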

Theorem 7.5. Let H, K ≤ G. Then H ∩ K ≤ G. More generally, let Λ be a set and suppose that we have a collection of subgroups Hλ of G for λ ∈ Λ. Define ∩λ∈Λ Hλ = {g ∈ G | g ∈ Hλ for all λ ∈ Λ}. Then ∩λ∈Λ Hλ ≤ G.


Proof. We prove the more general statement, since it includes the case of the intersection of two subgroups. Let H = ∩λ∈Λ Hλ. First, since Hλ is a subgroup for all λ ∈ Λ, we must have e ∈ Hλ for all λ ∈ Λ. Thus e ∈ H and H is not empty. Now suppose that a, b ∈ H. Then a, b ∈ Hλ for all λ, so ab ∈ Hλ for all λ. Thus ab ∈ H. Finally, a−1 ∈ Hλ for all λ, so it also follows that a−1 ∈ H. Thus H ≤ G. �

Theorem 7.6. Let H ≤ G and suppose that g ∈ G. Then the set gHg−1 = {ghg−1 | h ∈ H} is a subgroup of G. Such a subgroup is called a conjugate of H (by g).

Proof. e ∈ H, so e = geg−1 ∈ gHg−1. Thus gHg−1 ≠ ∅. Suppose that x, y ∈ gHg−1. Then x = gag−1 and y = gbg−1 for some a, b ∈ H. Thus xy = gag−1gbg−1 = gabg−1 ∈ gHg−1, since ab ∈ H. Finally

x−1 = (gag−1)−1 = (g−1)−1a−1g−1 = ga−1g−1 ∈ gHg−1,

since a−1 ∈ H. �

Definition 7.7. Let G be a group and aλ ∈ G for λ ∈ Λ. Then ⟨aλ, λ ∈ Λ⟩, the intersection of all subgroups H such that aλ ∈ H for all λ ∈ Λ, is called the subgroup generated by the aλ.

Exercise 7.8. Show that ⟨aλ, λ ∈ Λ⟩ is actually a subgroup of G.

Exercise 7.9. Let G = S3. Show that the only proper, nontrivial subgroups of G are ⟨(1, 2)⟩, ⟨(1, 3)⟩, ⟨(2, 3)⟩ and ⟨(1, 2, 3)⟩. Thus there are precisely 6 subgroups of S3.

Definition 7.10. Let G be a group. Then the center of G, denoted Z(G) is thesubset of all g ∈ G such that gx = xg for all x ∈ G.

Exercise 7.11. Prove that the center of G is a subgroup of G.

Exercise 7.12. Find Z(S3).

Definition 7.13. Let G be a group and S ⊆ G. Then the centralizer of S in G is the set CG(S) = {g ∈ G | gs = sg for all s ∈ S}. When the context is clear, we also denote the centralizer of S by C(S), and when S = {a} is a singleton, we usually denote C({a}) more compactly as C(a).

Note that C(G) = Z(G) is just the center of G. Also, note that C(e) = G, since every element of G commutes with e.

Proposition 7.14. Let S ⊆ G. Then C(S) is a subgroup of G; i.e., C(S) ≤ G.

Proof. First note that e ∈ C(S), since e commutes with any element of G, so it certainly commutes with every element in S. Thus C(S) ≠ ∅. Suppose that g, h ∈ C(S). If s ∈ S, then gs = sg and hs = sh, so

(gh)s = g(hs) = g(sh) = (gs)h = (sg)h = s(gh),

so gh ∈ C(S). Finally, if g ∈ C(S) and s ∈ S, then

g−1s = g−1se = g−1sgg−1 = g−1gsg−1 = esg−1 = sg−1.

Thus g−1 commutes with s, so g−1 ∈ C(S). This shows that C(S) ≤ G. �


8. Cyclic Groups

Definition 8.1. Let G be a group and g ∈ G. The cyclic subgroup generated by g,denoted by ⟨g⟩, is the set {gk|k ∈ Z}.

For a group G with commutative operation +, the cyclic subgroup ⟨g⟩ is givenby ⟨g⟩ = {kg|k ∈ Z}. This is because kg is the analogue of gk for groups withoperation +.

Example 8.2. We show that ⟨g⟩ is actually a subgroup of G. First, it is not empty, since g = g1 ∈ ⟨g⟩. Next, suppose that x, y ∈ ⟨g⟩. Then x = gk and y = gℓ for some k, ℓ ∈ Z, so xy = gkgℓ = gk+ℓ ∈ ⟨g⟩. Finally, x−1 = g−k ∈ ⟨g⟩.

Theorem 8.3. Let G be a group and g ∈ G. Then the cyclic subgroup ⟨g⟩ generated by g is the intersection of all subgroups of G containing g. As a consequence, if H ≤ G and g ∈ H, then ⟨g⟩ ≤ H.

Proof. Let H be the collection of all subgroups containing g. Then ⟨g⟩ ∈ H, so ∩{H | H ∈ H} ⊆ ⟨g⟩. On the other hand, suppose that H ∈ H. We show by induction that gm ∈ H for all m ∈ P. Clearly, g1 = g ∈ H, since H ∈ H. Suppose that gm ∈ H. Then gm+1 = gmg ∈ H, since H is closed under products. Thus, by the principle of mathematical induction, gm ∈ H for all m ∈ P. Moreover, if m ∈ P, then g−m = (gm)−1 ∈ H, since H is closed under inverses. Finally, g0 = e ∈ H, since every subgroup contains the identity. Thus ⟨g⟩ ⊆ H. It follows from this that ⟨g⟩ ⊆ ∩{H | H ∈ H}. Thus equality holds. The second statement of the theorem follows immediately from the first. �

Example 8.4. Let G be a group. Then ⟨e⟩ = {e} is the cyclic subgroup generated by the identity element. It is called the trivial subgroup of G. The group G is also a subgroup of G, called the improper subgroup of G. Note that every group has these two subgroups, and they are distinct unless G is a one element group. All other subgroups of G are called proper, nontrivial subgroups.

Definition 8.5. A group G is called a cyclic group if there is an element g ∈ Gsuch that G = ⟨g⟩. An element g such that G = ⟨g⟩ is called a generator of thegroup G.

Example 8.6. The group Z is cyclic, because Z = ⟨1⟩. Moreover, the group Zn isalso cyclic, since Zn = ⟨1⟩.

Definition 8.7. Suppose that G and G′ are groups. If there is an isomorphism φ : G → G′, we say that G and G′ are isomorphic groups (or simply that G and G′ are isomorphic). We denote this by G ∼= G′.

Exercise 8.8. Show that ∼= is an equivalence relation.

Actually, the collection of all groups is not a set, but is something called a class in set theory. Since we require an equivalence relation to be a relation on a set, it is not technically correct that ∼= is an equivalence relation, since there is no set of all groups. But we can extend the notion of an equivalence relation to classes, and if we do so, then it is true that ∼= is an equivalence relation.

Definition 8.9. If X is a finite set, the order of X, denoted as o(X) or |X|, is thenumber of elements in X. Otherwise, we say that X has infinite order and denoteo(X) = ∞.


Theorem 8.10. Let G be a cyclic group. If G is infinite, then G ∼= Z. If G is notinfinite, then G ∼= Zn where o(G) = n.

Proof. Let G = ⟨g⟩. Suppose that o(G) = n < ∞. The sequence g = g1, . . . , gn+1 has n + 1 elements, so it must contain some duplicates, so there is some 1 ≤ k < ℓ ≤ n + 1 such that gk = gℓ. Then gℓ−k = e. Moreover, ℓ − k ≤ n + 1 − 1 = n. Since the set X = {x ∈ P | gx = e} includes ℓ − k, X is not empty and so has a least element m. Moreover, m ≤ n. We claim that G = {g1, . . . , gm}, from which it follows that m = n. It cannot happen that there are any duplicates in the set T = {g1, . . . , gm}, because if gk = gℓ for some 1 ≤ k < ℓ ≤ m, then gℓ−k = e, and 1 ≤ ℓ − k < m, which would contradict the minimality of m. On the other hand, we claim that gk ∈ T for all k ∈ Z. To see this, we first show this fact for k ∈ P. Let S = {k ∈ P | gk ∈ T}. Suppose that x ∈ S for all x < k. If 1 ≤ k ≤ m, then clearly gk ∈ T, so k ∈ S; so assume k > m. Then gk = gk−mgm = gk−me = gk−m. Now x = k − m < k, so by assumption x ∈ S, so gk = gx ∈ T. Thus k ∈ S. By the principle of strong induction, S = P.

Next, note that g0 = gm ∈ T. Finally, note that if k ∈ P, then mk − k = (m − 1)k ≥ 0. Moreover

gmk−k = (gm)kg−k = ekg−k = eg−k = g−k,

so g−k = gmk−k ∈ T . This shows that G = T , and thus m = n, since T has melements and G has n elements.

Define a map ϕ : Zn → G by ϕ(k) = gk. We have to show that this map is well defined. Suppose that k and ℓ represent the same element of Zn. Then ℓ = k + nr for some r ∈ Z, so gℓ = gk+nr = gk(gn)r = gk. Thus the right hand side in the definition of ϕ is independent of the choice of representative for k. Clearly, ϕ is a bijection, since it is surjective and o(Zn) = o(G).

We show that ϕ is an isomorphism. Now

ϕ(k + ℓ) = gk+ℓ = gkgℓ = ϕ(k)ϕ(ℓ).

Thus ϕ satisfies the required condition of an isomorphism.

Next, we consider the case when G is infinite. Define ϕ : Z → G by ϕ(k) = gk. By the definition of a cyclic group, ϕ is surjective. If ϕ were not injective, then there would have to be integers k and ℓ such that k < ℓ and gk = gℓ. But then gℓ−k = e, and it would follow that G is finite, by the kind of argument we gave above. Therefore ϕ is injective, so it is a bijection. To see ϕ is an isomorphism, note that

ϕ(k + ℓ) = gk+ℓ = gkgℓ = ϕ(k)ϕ(ℓ).

Thus ϕ meets all of the conditions to be an isomorphism. �

The theorem above shows that we do not encounter any new kinds of groups when considering cyclic groups. Note that if G is any group and g ∈ G, then the cyclic subgroup ⟨g⟩ is cyclic, so it must be either isomorphic to Z or to Zn for some n ∈ P. This means there is a short menu for what the structure of ⟨g⟩ could be, and this is a very useful fact.

Theorem 8.11. Let n ∈ P, m ∈ Z, and d = gcd(m, n). Then ⟨m⟩ = ⟨d⟩ in Zn, and this subgroup has order n/d. As a consequence, the number of subgroups of Zn is equal to the number of distinct positive divisors of n.


Proof. Let H = ⟨m⟩ and K = ⟨d⟩. We show that H ⊆ K and K ⊆ H, from which it follows that H = K. Express d = rm + sn. Then, in Zn, d = rm, so d ∈ ⟨m⟩. It follows that K ⊆ H, since any multiple of d is then a multiple of m in Zn. On the other hand, since m is a multiple of d, m ∈ K, and it follows that H ⊆ K.

To see that o(⟨d⟩) = n/d, note that if 1 ≤ k < n/d, then kd < n, so kd ≠ 0 in Zn. Moreover, if 1 ≤ k < ℓ ≤ n/d, we could not have kd = ℓd, because in that case we would have (ℓ − k)d = 0, and since 1 ≤ ℓ − k < n/d, this is impossible by the previous remark. It follows that o(⟨d⟩) = n/d. �

Corollary 8.12. Every subgroup of a cyclic group is cyclic.

Proof. First, note that the map ϕ : Z → Z0, given by ϕ(a) = a, is surjective, since every element in Zn for any n is of the form a for some a ∈ Z. Next note that a = b (mod 0) if and only if b − a = k · 0 = 0, that is, if and only if a = b. Thus ϕ is injective, since ϕ(a) = ϕ(b) only happens if a = b. Thus Z ∼= Z0. Now let H ≤ Zn. If H = ⟨0⟩, it is cyclic. Otherwise, let X = {x ∈ P | x ∈ H}. Since H ≠ ⟨0⟩, there must be some nonzero x ∈ Z such that x ∈ H. If x ∈ P then x ∈ X. If x < 0, then −x ∈ H, so −x ∈ X. Thus X ≠ ∅, so it must have a least element d. Suppose that m ∈ H. Express m = qd + r where 0 ≤ r < d. Since r = m − qd, r ∈ H. Now if r > 0, it follows that r ∈ X, which would contradict the minimality of d. Thus r = 0, so m ∈ ⟨d⟩, and H ⊆ ⟨d⟩. Since ⟨d⟩ ⊆ H, it follows that H = ⟨d⟩. Thus H is cyclic. �

Because of our remarks above that Z ∼= Z0, we see that the theorem holds both for Z and Zn for all n. Of course, since Z is not finite, the counting part of the theorem does not apply. However, we can easily see that ⟨k⟩ is an infinite cyclic group, so is isomorphic to Z, except when k = 0, in which case ⟨0⟩ = {0} ∼= Z1, the cyclic group with exactly one element.

The moral of all this discussion of cyclic groups is that they are not very interesting, since they are all isomorphic to groups of the form Zn, which we have already studied. Moreover, they are all abelian, and all subgroups of cyclic groups are cyclic.
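Theorem 8.11 is easy to experiment with. The following Python sketch (my own illustration, not from the notes) computes the cyclic subgroup ⟨m⟩ of Zn, checks that it equals ⟨gcd(m, n)⟩ and has order n/gcd(m, n), and lists the positive divisors of 45, which index the subgroups of Z45.

    from math import gcd

    def cyclic_subgroup(m, n):
        """The subgroup <m> of Z_n: all multiples of m, reduced modulo n."""
        return {(k * m) % n for k in range(n)}

    n = 45
    for m in range(n):
        d = gcd(m, n)   # note gcd(0, n) = n, and <0> = <n> is the trivial subgroup
        assert cyclic_subgroup(m, n) == cyclic_subgroup(d, n)   # <m> = <gcd(m, n)>
        assert len(cyclic_subgroup(m, n)) == n // d             # order n/d

    print([d for d in range(1, n + 1) if n % d == 0])   # [1, 3, 5, 9, 15, 45]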

Definition 8.13. Given a group G, its subgroup lattice is the collection H of all subgroups of G, partially ordered by inclusion. The lattice diagram corresponding to the subgroup lattice is the graph with one vertex for each subgroup, with edges corresponding to subgroup inclusion. These edges occur between subgroups K and H provided that K ≤ H and there is no subgroup X such that K ≤ X ≤ H except for the trivial cases when X = H or X = K. We also draw the graph so that if K ≤ H and K ≠ H, then the vertex corresponding to K lies lower in the graph than the vertex corresponding to H.

A little thought reveals that there is one top vertex, corresponding to the im-proper subgroup G, and one bottom vertex, corresponding to the trivial subgroup⟨e⟩. We give two examples below of lattice diagrams, for the groups S3 and Z45.

9. Morphisms of Groups

Definition 9.1. Let G, G′ be groups. Then a morphism of groups or homomorphism is a map ϕ : G → G′ which satisfies

ϕ(gh) = ϕ(g)ϕ(h).


[Figure 1. Subgroup Lattice for S3, where σ = (1, 2, 3) and τ = (1, 2). The vertices of the diagram are ⟨σ, τ⟩, ⟨σ⟩, ⟨τ⟩, ⟨στ⟩, ⟨σ2τ⟩, and ⟨e⟩.]

[Figure 2. Subgroup Lattice for Z45. The vertices of the diagram are ⟨1⟩, ⟨3⟩, ⟨5⟩, ⟨9⟩, ⟨15⟩, and ⟨0⟩.]

Example 9.2. If ϕ : G→ G′ is an isomorphism, then it is a homomorphism, sincethe morphism condition above is one of the two conditions of an isomorphism.

Example 9.3. If G, G′ are groups, then the trivial morphism from G to G′ is the map ϕ(g) = e for all g ∈ G. It is clearly a homomorphism, since ϕ(gh) = e = e · e = ϕ(g)ϕ(h) for any g, h ∈ G. Thus there is always at least one morphism between any two groups G and G′.

Example 9.4. The identity morphism 1G : G → G is the map 1G(g) = g for allg ∈ G. It is easy to show it is a morphism of groups.

Proposition 9.5. Let ϕ : G→ G′ be a homomorphism. Then

(1) ϕ(e) = e.
(2) ϕ(g−1) = ϕ(g)−1.
(3) ϕ(gm) = ϕ(g)m for any m ∈ Z.

Proof. First, note that ϕ(e) = ϕ(e · e) = ϕ(e)ϕ(e). By uniqueness of identity in G′, this forces ϕ(e) = e. Next, note that e = ϕ(e) = ϕ(gg−1) = ϕ(g)ϕ(g−1) for any g ∈ G. By uniqueness of inverse, it follows that ϕ(g−1) = ϕ(g)−1. Finally, we prove the last result by first showing it for m ∈ P by induction. Since ϕ(g1) = ϕ(g) = ϕ(g)1, it holds for m = 1. Suppose that ϕ(gm) = ϕ(g)m. Then

ϕ(gm+1) = ϕ(gmg) = ϕ(gm)ϕ(g) = ϕ(g)mϕ(g) = ϕ(g)m+1.

Thus the formula holds for all m ∈ P. Next ϕ(g0) = ϕ(e) = e = ϕ(g)0, so it holdsfor m = 0. Finally, if m ∈ P, then

ϕ(g−m) = ϕ((g−1)m) = ϕ(g−1)m = (ϕ(g)−1)m = ϕ(g)−m.

Thus the formula holds for all m ∈ Z. �

Theorem 9.6. Suppose that ϕ : G → G′ and ψ : G′ → G′′ are morphisms of groups. Then ψ ◦ ϕ : G → G′′ is also a morphism of groups.


Proof. Let g, h ∈ G. Then

(ψ ◦ ϕ)(gh) = ψ(ϕ(gh)) = ψ(ϕ(g)ϕ(h)) = ψ(ϕ(g))ψ(ϕ(h)) = (ψ ◦ ϕ)(g)(ψ ◦ ϕ)(h).�

Corollary 9.7. Let ϕ : G→ G′ and ψ : G′ → G′′ be isomorphisms. Then ψ ◦ ϕ isan isomorphism.

Proof. We already showed that the composition of two bijections is a bijection. By the theorem, the composition of two isomorphisms is a morphism. Since an isomorphism is just a bijective morphism, this shows that ψ ◦ ϕ is an isomorphism. �

Definition 9.8. Let ϕ : G → G′ be a homomorphism of groups. Then the kernel of ϕ, denoted ker(ϕ), is the subset of G given by

ker(ϕ) = {g ∈ G|ϕ(g) = e}.

The notion of a kernel of a morphism of groups is parallel to that of the kernel of a linear transformation. In fact, the kernel of a linear transformation is a special case of the kernel of a morphism, since a linear map λ between two vector spaces is a morphism of their underlying group operation of addition, and the kernel of λ, considered as a linear transformation, coincides with the kernel of the map λ, considered as a morphism of groups!

Proposition 9.9. The kernel of a morphism ϕ : G→ G′ is a subgroup of G.

Proof. Let ϕ : G → G′ be a homomorphism, and let H = ker(ϕ). Now e ∈ H since ϕ(e) = e, so H ≠ ∅. Suppose that g, h ∈ H. Then ϕ(gh) = ϕ(g)ϕ(h) = e · e = e, so gh ∈ H. Also ϕ(g−1) = ϕ(g)−1 = e−1 = e, so g−1 ∈ H. Thus H ≤ G. �

An interesting question is what kinds of subgroups of G turn up as kernels of morphisms. Since the kernel of the identity map 1G is just {e}, we see that the trivial subgroup is the kernel of a morphism. Moreover, if we consider the trivial morphism from G to G, it is easy to see that the improper subgroup of G is the kernel of a morphism. We will see later, in the section on normal subgroups, that the kinds of subgroups which are kernels of morphisms are quite special!

One of the most powerful results about kernels of morphisms is the following!

Theorem 9.10. Let ϕ : G → G′ be a group homomorphism. Then ϕ is injectiveif and only if ker(ϕ) = {e}.

Proof. Suppose ϕ is injective and g ∈ ker(ϕ). Then ϕ(g) = e = ϕ(e), so g = e. Thus ker(ϕ) = {e}. On the other hand, suppose that ker(ϕ) = {e} and ϕ(g) = ϕ(h). Then ϕ(gh−1) = ϕ(g)ϕ(h)−1 = e, so gh−1 = e and g = h. Thus ϕ is injective. �

What makes the theorem above so powerful is that it reduces the proof of injectivity to the consideration of when ϕ(g) = e, instead of having to study when ϕ(g) = ϕ(h) in general. In many cases, computation of the kernel of a morphism is easy, so determination of whether the morphism is injective is equally easy.

Example 9.11. Let ϕ : Z → Zn be the map which sends an integer a to its equivalence class in Zn. Since the class of a + b is the sum of the classes of a and b, we have ϕ(a + b) = ϕ(a) + ϕ(b), so ϕ is a homomorphism. Moreover, a ∈ ker(ϕ) if and only if ϕ(a) = 0, which happens precisely when a is a multiple of n, that is, when a ∈ ⟨n⟩. Thus we have identified ker(ϕ) = ⟨n⟩.
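As a small computational check of this example (my own sketch, not part of the notes), one can list the integers in a bounded range that the reduction map modulo n sends to 0, and observe that they are exactly the multiples of n.

    def kernel_of_reduction(n, bound=20):
        """Integers a with |a| <= bound which map to 0 under reduction modulo n."""
        phi = lambda a: a % n                   # the morphism of Example 9.11
        return [a for a in range(-bound, bound + 1) if phi(a) == 0]

    print(kernel_of_reduction(6))   # [-18, -12, -6, 0, 6, 12, 18]: elements of <6> in this range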


Theorem 9.12. Suppose that ϕ : G → G′ is a morphism of groups and H ≤ G.Then ϕ(H) ≤ G′.

Proof. Since e = ϕ(e) ∈ ϕ(H), ϕ(H) ≠ ∅. Let x, y ∈ ϕ(H). Then x = ϕ(g) and y = ϕ(h) for some g, h ∈ H. But then xy = ϕ(g)ϕ(h) = ϕ(gh) ∈ ϕ(H), since gh ∈ H. Finally, x−1 = ϕ(g)−1 = ϕ(g−1) ∈ ϕ(H), since g−1 ∈ H. �

Definition 9.13. If ϕ : X → Y is a map and S ⊆ Y, then ϕ−1(S) = {x ∈ X | ϕ(x) ∈ S} is called the inverse image of S under ϕ.

Theorem 9.14. Let ϕ : G → G′ be a group homomorphism and H ′ ≤ G′. Thenϕ−1(H ′) ≤ G.

Proof. Let H = ϕ−1(H′). Since ϕ(e) = e ∈ H′, e ∈ H. This shows that H ≠ ∅. Suppose that g, h ∈ H. Then ϕ(gh) = ϕ(g)ϕ(h) ∈ H′, since H′ is a subgroup and ϕ(g), ϕ(h) ∈ H′. Thus gh ∈ H. Moreover, ϕ(g−1) = ϕ(g)−1 ∈ H′, so g−1 ∈ H. Thus H ≤ G. �

Definition 9.15. Let G be a group. Then Aut(G), the set of all isomorphisms from G onto G, is called the group of automorphisms of G, or automorphism group of G.

Theorem 9.16. Aut(G) is a group under composition. In fact, Aut(G) ≤ SG;i.e., the automorphism group of G is a subgroup of the permutation group of G.

Proof. Since the second statement implies the first, we prove that Aut(G) ≤ SG. Since an automorphism is a bijection by definition, Aut(G) ⊆ SG. Moreover, composition is the group operation in SG. We already showed that the identity 1G is a morphism, so it is in Aut(G), which implies that Aut(G) ≠ ∅. Suppose that ϕ, ψ ∈ Aut(G). Then we already showed that ϕ ◦ ψ is a morphism, so it lies in Aut(G). Finally, we need to show that ϕ−1 ∈ Aut(G), which follows if we can show it is a morphism. But

ϕ−1(gh) = ϕ−1(ϕ(ϕ−1(g))ϕ(ϕ−1(h))) = ϕ−1(ϕ(ϕ−1(g)ϕ−1(h))) = ϕ−1(g)ϕ−1(h).

Thus ϕ−1 is a morphism, which shows it is in Aut(G). �

Theorem 9.17 (Cayley’s Theorem). The map ϕ : G → SG given by ϕ(g)(h) = gh is an injective morphism of groups. Thus G is isomorphic to a subgroup of SG. In particular, every group is isomorphic to a subgroup of a permutation group.

Proof. First, we need to show that if g ∈ G, then ϕ(g) ∈ SG. To see that ϕ(g) is injective, suppose ϕ(g)(h) = ϕ(g)(h′). Then gh = gh′, so by left cancelation h = h′. Thus ϕ(g) is injective. To see that ϕ(g) is surjective, suppose y ∈ G. Then ϕ(g)(g−1y) = gg−1y = y. Thus ϕ(g) is surjective, so it is bijective. This shows that ϕ(g) ∈ SG.

Next, suppose that g, g′ ∈ G. Then

ϕ(gg′)(x) = gg′x = ϕ(g)(g′x) = ϕ(g)(ϕ(g′)(x)) = (ϕ(g) ◦ ϕ(g′))(x).

Thus ϕ(gg′) = ϕ(g) ◦ ϕ(g′), which shows that ϕ is a morphism of groups. Suppose that g ∈ ker(ϕ). Then ϕ(g) = 1G, so

g = ge = ϕ(g)(e) = 1G(e) = e.

Thus ker(ϕ) = {e}, which shows that ϕ is injective. Finally, restrict the codomain of ϕ to the image ϕ(G). Then the induced map G → ϕ(G) is still injective, and is clearly surjective. This means that ϕ : G → ϕ(G) is an isomorphism. �


In older textbooks, it was pointed out that Cayley’s Theorem is interesting from a theoretical viewpoint, but is utterly useless as a calculation tool. However, times have changed. The computational algebra software Maple has a group theory package that can help analyze properties of groups, but only if they are given as groups of permutations, that is, as subgroups of some permutation group. Cayley’s Theorem can be used to express a finite group as a permutation group in a very concrete manner.
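Here is a minimal Python sketch of that idea (mine, not from the notes): given a finite group described by its elements and operation, each g is sent to the list of its left multiples, which records the permutation ϕ(g) of Cayley’s Theorem.

    def cayley_embedding(elements, op):
        """Send each g to the permutation h -> g*h, recorded as a tuple of images."""
        return {g: tuple(op(g, h) for h in elements) for g in elements}

    # Example: Z_4 under addition modulo 4.
    elements = [0, 1, 2, 3]
    op = lambda a, b: (a + b) % 4
    for g, perm in cayley_embedding(elements, op).items():
        print(g, perm)
    # Distinct elements give distinct permutations (the map is injective), and the
    # permutation attached to 1 is a 4-cycle, so the image is a cyclic subgroup of order 4.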

Example 9.18. Let G be a group and g ∈ G. Then the map cg : G → G given by cg(x) = gxg−1 is called conjugation by g. We claim that cg is an automorphism of G. To see this, first note that cg(xy) = gxyg−1 = gxg−1gyg−1 = cg(x)cg(y), so cg is a morphism. Thus, to show that cg is injective, we need only show that ker(cg) = {e}. Suppose that x ∈ ker(cg). Then e = cg(x), so gxg−1 = e, and multiplying both sides on the left by g−1 and on the right by g yields x = e. To see that cg is surjective, let y ∈ G. We compute cg(g−1yg) = gg−1ygg−1 = y. Thus cg is an isomorphism from G onto G, which is precisely what an automorphism of G is.

Theorem 9.19. Let ϕ : G → G′ be a surjective morphism of groups and K =ker(ϕ). Then the map H ′ 7→ ϕ−1(H ′) is an order preserving bijection between theset of subgroups of G′ and the set of subgroups of G which contain K.

Proof. First, note that if H = ϕ−1(H′) for some H′ ≤ G′, then if g ∈ ker(ϕ), we have ϕ(g) = e ∈ H′, which implies that g ∈ ϕ−1(H′). This shows that ker(ϕ) ⊆ ϕ−1(H′), which means that H′ 7→ ϕ−1(H′) is a well defined map between subgroups of G′ and subgroups of G containing ker(ϕ). Now suppose that H is a subgroup of G containing ker(ϕ). Then H′ = ϕ(H) is a subgroup of G′. We claim that ϕ−1(H′) = H. To see this, suppose that a ∈ ϕ−1(H′). Then ϕ(a) ∈ H′ = ϕ(H), so there is some h ∈ H such that ϕ(a) = ϕ(h). But then ϕ(ah−1) = e, so ah−1 ∈ H, since H contains the kernel of ϕ. From this we conclude that a = ah−1h ∈ H. Thus ϕ−1(H′) ⊆ H. However, if h ∈ H then ϕ(h) ∈ ϕ(H) = H′, so h ∈ ϕ−1(H′). Thus H ⊆ ϕ−1(H′), and we see that H = ϕ−1(H′). This shows that the map H′ 7→ ϕ−1(H′) is surjective onto the set of all subgroups of G containing ker(ϕ). Next, we show it is injective. Let H′, K′ be subgroups of G′ and suppose that H′ ≠ K′. Suppose that x ∈ H′ but x ∉ K′. Then there is some g ∈ G such that ϕ(g) = x, since ϕ is surjective. We have g ∈ ϕ−1(H′) but g ∉ ϕ−1(K′). Thus ϕ−1(H′) ≠ ϕ−1(K′). If instead there is some x ∈ K′ which does not lie in H′, we deduce by a similar argument that the two inverse images are not equal. This shows that the map is injective. �

Notice that in the proof above, we only needed that ϕ was surjective to showthat H ′ 7→ ϕ−1(H ′) is an injective map to the set of subgroups of G containingker(ϕ). When ϕ is not surjective, the proof above shows that the map is still welldefined and surjective.

10. Cosets

In this section we consider a fixed subgroup H of a group G.

Definition 10.1. Let H ≤ G. We define a relation ∼L, called left equivalence, by a ∼L b if and only if b−1a ∈ H. Similarly, we define right equivalence ∼R by a ∼R b if and only if ab−1 ∈ H.


Theorem 10.2. The relations left equivalence and right equivalence are equiva-lence relations.

Proof. We prove the result for left equivalence, which we will denote by ∼ instead of ∼L, and we leave the proof for right equivalence, which is similar, to the reader. First, note that a ∼ a because a−1a = e ∈ H. Next, suppose that a ∼ b. Then b−1a = h ∈ H. Thus a−1b = (b−1a)−1 = h−1 ∈ H. This shows that b ∼ a. Finally, suppose that a ∼ b and b ∼ c. Then b−1a = h ∈ H and c−1b = h′ ∈ H. It follows that c−1a = c−1bb−1a = h′h ∈ H. Thus a ∼ c. �

Definition 10.3. For the equivalence relation ∼L, we call the equivalence class of an element a the left coset of a. The left coset of a is often denoted as aH, although, for simplicity, we will usually just denote it as a. Similarly, the equivalence class of a under ∼R is called the right coset of a, and is often denoted as Ha. The set of left cosets of G is denoted by G/H. The number of elements of G/H is called the index of H in G, and is denoted by [G : H]; in other words, [G : H] = o(G/H).

In fact, there is some ambiguity in the notation, as a = {b ∈ G|a ∼L b} bydefinition, but aH = {ah|h ∈ H}. However, it is easy to see that these two sets arethe same, so the ambiguity is only apparent, not real.

Lemma 10.4. The map ϕ : a → b, given by ϕ(x) = ba−1x, is a well definedbijection of aH onto bH.

Proof. Let x ∈ a. Then x = ah for some h ∈ H, so ϕ(x) = ba−1ah = bh ∈ b. This shows that ϕ is well defined. Suppose that ϕ(x) = ϕ(y) for some x, y ∈ a. Then x = ah and y = ah′ for some h, h′ ∈ H. We compute that ϕ(x) = bh and ϕ(y) = bh′, so bh = bh′ and by left cancelation, we conclude that h = h′. This forces x = y. Thus ϕ is injective. Now let y ∈ b. Then y = bh for some h ∈ H. Let x = ah. Then x ∈ a and clearly ϕ(x) = y. Thus ϕ is surjective. This shows that ϕ is a bijection of a onto b. �

Theorem 10.5 (Lagrange’s Theorem). Suppose that o(G) < ∞ and H ≤ G. Then o(H) | o(G).

Proof. By the lemma, it follows that the number of elements in a is independent of a. Note that e = H, so we have o(H) = o(a) for all a ∈ G. Recall that the sets a for a ∈ G are either disjoint or coincide, and that G is the union of all such sets. Since G is finite, it follows that G is a finite union of disjoint sets, each of which has o(H) elements. If there are m distinct such sets, we obtain that o(G) = mo(H). It follows that o(H) | o(G). �

Corollary 10.6. If G is a finite group and g ∈ G, then o(g) | o(G).

Proof. Recall that the order of an element g is the least positive integer n such that gn = e. But by the classification of cyclic groups, we also know that ⟨g⟩ = {g1, . . . , gn}, so o(g) = o(⟨g⟩). Thus, by the theorem, o(g) | o(G). �

Example 10.7. We show that there are exactly two groups of order 4, up to isomorphism. Let G be a group of order 4. If there is an element g of order 4 in G, then G is cyclic, so G ∼= Z4. Otherwise, every nonidentity element in G has order 2, since by Corollary 10.6 the order of each element divides 4. Let e be the identity element in G. Then there are three nonidentity elements in G. Denote two such distinct elements by a and b. Then ab cannot equal a, e or b, so the fourth element is ab. Thus, we have the following Cayley Table for G.


      e    a    b    ab
 e    e    a    b    ab
 a    a    e    ab   b
 b    b    ab   e    a
 ab   ab   b    a    e

Notice that the Cayley Table is symmetric, so the group corresponding to the Cayley Table, if there is such a group, is abelian. However, to check that the Cayley Table above gives a group would require 128 calculations to check associativity. Thus, it is more convenient if we can find a group with this Cayley Table. The group described by this table is not cyclic, nor is it isomorphic to Sn for any n. Nevertheless, we can find a subgroup of S4 which has this Cayley Table. Let a = (1, 2) and b = (3, 4). Then a2 = b2 = e. Moreover, ab = ba. It follows from this fact that the set H = {a, b, ab, e} is a subgroup of S4 whose Cayley Table is given above.

The group with Cayley Table above has several names in the literature. It is called the Klein 4-group, after Felix Klein, 1849–1925. It is also isomorphic to the group Z2 × Z2, a group which we will understand when we discuss product groups. It is also isomorphic to the Dihedral Group D2, which is also denoted by some as D4. We will discuss Dihedral groups later.
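The counting behind Lagrange’s Theorem can also be watched directly. The sketch below (my own, using tuples for permutations of {0, 1, 2}) computes the left cosets of a subgroup of S3 and checks that they all have the same size, so that o(G) = [G : H] o(H).

    from itertools import permutations

    S3 = list(permutations(range(3)))

    def compose(p, q):
        return tuple(p[q[i]] for i in range(3))

    def left_cosets(G, H):
        """The distinct left cosets aH = {a*h : h in H}."""
        return {frozenset(compose(a, h) for h in H) for a in G}

    H = [(0, 1, 2), (1, 0, 2)]                     # the subgroup {e, (0 1)} of order 2
    cosets = left_cosets(S3, H)
    print(len(cosets))                             # 3, the index [S_3 : H] = 6/2
    print(all(len(c) == len(H) for c in cosets))   # True: every coset has o(H) elements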

11. Normal Subgroups and Quotient Groups

Proposition 11.1. Let K = ker(ϕ) be the kernel of a morphism ϕ : G→ G′. Thenif g ∈ G and x ∈ K, gxg−1 ∈ K. In other words, cg(K) = K for every g ∈ G,where cg is the conjugation by g morphism introduced in Example 9.18.

Proof. Let g ∈ G and x ∈ K. Then ϕ(x) = e. We compute ϕ(gxg−1) = ϕ(g)ϕ(x)ϕ(g)−1 = ϕ(g)eϕ(g)−1 = e. Thus gxg−1 ∈ K. To see the second statement, we note that cg(K) ⊆ K by the first statement. To see that equality holds, suppose y ∈ K and g ∈ G. Then x = g−1yg = cg−1(y) ∈ K. Moreover, cg(x) = y. Thus K ⊆ cg(K) and equality holds. �

From the proposition above, we see that kernels of morphisms have a special property among subgroups. This property turns out to be very important, leading to the definition below.

Definition 11.2. A subgroup H ≤ G is said to be normal in G if gxg−1 ∈ Hwhenever x ∈ H and g ∈ G. When H is normal in G, we denote this by H ▹G.

Since we defined a subgroup to be normal by requiring it to have a property which we already showed that kernels of morphisms satisfy, it follows that every kernel of a morphism is automatically a normal subgroup.

Theorem 11.3. Let H ≤ G. Then the following are equivalent.

(1) H ▹ G.
(2) gxg−1 ∈ H for all x ∈ H and g ∈ G.
(3) cg(H) ⊆ H for all g ∈ G.
(4) cg(H) = H for all g ∈ G.

Proof. The equivalence of conditions (1) and (2) is a matter of definition. That (2)is equivalent to (3) is straightforward. That (4) implies (3) is clear. To see that (3)implies (4), the same argument as in proposition 11.1 works. �


Example 11.4. We give an example of a subgroup which is not normal. Let H = ⟨(1, 2)⟩ = {(1, 2), e} be the subgroup of S3 generated by the transposition (1, 2). Then c(1,3)((1, 2)) = (1, 3)(1, 2)(1, 3) = (2, 3) ∉ H, so H is not normal in S3.
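The conjugation test for normality is easy to automate. The Python sketch below (mine, with permutations of {0, 1, 2} as tuples) confirms that the cyclic subgroup generated by a transposition is not normal in S3, while the cyclic subgroup generated by a 3-cycle is.

    from itertools import permutations

    S3 = list(permutations(range(3)))

    def compose(p, q):
        return tuple(p[q[i]] for i in range(3))

    def inverse(p):
        inv = [0] * 3
        for i, image in enumerate(p):
            inv[image] = i
        return tuple(inv)

    def is_normal(G, H):
        """H is normal in G when g*x*g^(-1) lies in H for every g in G and x in H."""
        return all(compose(compose(g, x), inverse(g)) in H for g in G for x in H)

    H = {(0, 1, 2), (1, 0, 2)}                 # generated by a transposition, order 2
    A3 = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}     # the identity and the two 3-cycles
    print(is_normal(S3, H))    # False, as in Example 11.4
    print(is_normal(S3, A3))   # True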

The fact that we chose a nonabelian group in order to find a subgroup whichwas not normal in the group is no accident.

Theorem 11.5. Every subgroup of an abelian group is normal in the group.

Proof. Let H ≤ G and suppose G is abelian. If x ∈ H and g ∈ G, then cg(x) =gxg−1 = xgg−1 = x ∈ H, so H ▹G. �

It is very important to notice that the theorem does not imply that an abeliansubgroup of a group is normal in the group. In fact, in Example 11.4, the subgroupH of S3 is abelian, because it is a cyclic subgroup, but it is not normal in S3.

We would like to explore the possibility of giving a group structure to the set ofleft cosets G/H of a subgroup H of G. Remember that we denote the left coset ofg by g. How would we define a · b? There seems to be only one natural definition:

a · b = ab.

The problem with this “definition” is that it is not obvious that it is well-defined, since this rule can be stated as “Take some element out of the first coset, some element out of the second coset, then multiply them and take the coset of the result”. Of course, the answer may depend on which elements we take from the two cosets. We encountered this problem when defining addition and multiplication on Zn and were able to resolve it. It turns out that in this situation, the problem may not be insurmountable. We will break the problem down into some steps and see that some condition on H is necessary in order for this definition to work.

Theorem 11.6. Suppose H ≤ G and that the multiplication of cosets on G/H given by a · b = ab is well-defined. Then

• G/H is a group under this multiplication, with identity e and inverse given by (a)−1 = a−1.
• The map π : G → G/H given by π(a) = a is a surjective morphism with kernel H.
• H is normal in G.
• If G is finite, then o(G/H) = o(G)/o(H).

Thus, in order for the product above to be well defined, a necessary condition isthat H ▹ G. Conversely, when H ▹ G, the product above is well defined, so thiscondition is sufficient as well.

Proof. Assume a · b = ab is well defined. Then

(a · b) · c = ab · c = abc = a · bc = a · (b · c),
a · e = ae = a = ea = e · a,
a · a−1 = aa−1 = e = a−1a = a−1 · a.

Thus G/H is a group. Now define π as above, and we compute

π(ab) = ab = a · b = π(a)π(b).


Thus π is a morphism of groups. We have π(a) = e precisely when a = e whichhappens if and only if a ∈ e = H. Thus ker(π) = H. Since kernels of morphismsare normal in their group, H ▹ G. The fact that o(G/H) = o(G)/o(H) doesn’tdepend on the group structure of G/H, but is simply a coset counting formula,which we already used in establishing Lagrange’s Theorem.

Now, let us show that when H ▹G, the product above is well defined. Supposethat a1 ∈ a and b1 ∈ b. Then a1 = ah and b1 = bh′ for some h, h′ ∈ H. We thencompute

a1b1 = ahbh′ = abb−1hbh′ ∈ ab,

since b−1hb ∈ H, so b−1hbh′ ∈ H. This shows that a1b1 = ab, and the product is well defined. �

Definition 11.7. If H ▹ G, then the group G/H, equipped with the induced product above, is called a quotient group or factor group of G.

Example 11.8. In any group G, the improper subgroup G and the trivial subgroup{e} are both normal in G. The group G/G has one element while the group G/{e}is isomorphic to G. Thus, we have identified the isomorphism classes of these twoquotient groups.

Example 11.9. Let G = S3. Then the complete list of subgroups of G is G, ⟨(1, 2)⟩, ⟨(1, 3)⟩, ⟨(2, 3)⟩, ⟨(1, 2, 3)⟩, and {e}. Of these subgroups, only G, ⟨(1, 2, 3)⟩, and {e} are normal in G.

Example 11.10. Let n ∈ Z, and H = ⟨n⟩ = nZ be the subgroup of Z generated byn. Then we claim that Z/nZ = Zn. This shows that Zn is a quotient group of Z.To see this claim, note that a = b (mod n) precisely when b − a ∈ ⟨n⟩. Thus theequivalence classes mod n are precisely the left cosets of H. Also, the formula foraddition of cosets is the same in both cases. This shows that every quotient groupof Z is cyclic.

Theorem 11.11. Let n ∈ P and H = ⟨d⟩ be the subgroup of Zn generated by apositive divisor d of n. Then Zn/H ∼= Zd. In particular, every quotient group ofZn is cyclic.

Proof. We already showed that o(H) = n/d. Moreover, it is easy to see that Zn/H is generated by the image of 1, so Zn/H is cyclic. To determine which cyclic group it is isomorphic to, we need only find o(Zn/H). But we have

o(Zn/H) = o(Zn)/o(H) = n/(n/d) = d.

This shows that Zn/H ∼= Zd. Since every subgroup of Zn is of the form H = ⟨d⟩, where d is a positive divisor of n, it follows that every quotient group of Zn is cyclic. �

Theorem 11.12. Let ϕ : G → G′ be a morphism of groups and let H ≤ ker(ϕ) be a normal subgroup of G. Then there is an induced map ϕ : G/H → G′, given by ϕ(a) = ϕ(a).

Proof. Let a1 ∈ a, so that a1 = ah. Then

ϕ(a1) = ϕ(ah) = ϕ(a)ϕ(h) = ϕ(a)e = ϕ(a).

This shows that ϕ is well defined. Next, we see that

ϕ(a · b) = ϕ(ab) = ϕ(ab) = ϕ(a)ϕ(b) = ϕ(a) · ϕ(b),


so ϕ is a morphism. �

Theorem 11.13 (First Isomorphism Theorem for Groups). Let ϕ : G → G′ be a morphism of groups and H = ker(ϕ). Then ϕ(G) ∼= G/H.

Proof. The map ϕ : G/H → G′ is well defined, and its image is ϕ(G). Thus weobtain a morphism G/H → ϕ(G), which is surjective, since if y ∈ ϕ(G), theny = ϕ(g) for some g ∈ G, so that y = ϕ(g). Moreover, this map is injective, sinceif ϕ(a) = e, then ϕ(a) = e, so that a ∈ H and thus a = e. Thus ker(ϕ) = {e}. �

Using the first isomorphism theorem, we see that every morphism ϕ : G → G′ factors as the composition of a surjective map, an isomorphism, and an injective map. The injective map is the inclusion ϕ(G) ↪→ G′. (We use the symbol ↪→ to indicate an injective morphism.) The surjective map is the map G → G/ ker(ϕ). The isomorphism is the map G/H ∼= ϕ(G) given in the First Isomorphism Theorem for groups. Thus ϕ is given by the composition

G→ G/ ker(ϕ) ∼= ϕ(G) ↪→ G′.

One consequence of all this gyration is that we can understand all morphismsif we understand surjective and injective morphisms. Moreover, to find the setof morphisms from G to G′, we should study the quotient groups of G and thesubgroups of G′. Every morphism is given by an isomorphism from one of thequotient groups of G to a subgroup of G′.

Example 11.14. We classify all morphisms from Z4 to S3 and all morphisms from S3 to Z4. We know that Z4 is abelian, so every subgroup is normal in Z4. There are precisely 3 subgroups of Z4, H0 = ⟨0⟩, H1 = ⟨1⟩, and H2 = ⟨2⟩, since there are only 3 positive divisors of 4. There are 6 subgroups of S3, K0 = ⟨e⟩, K1 = ⟨(1, 2)⟩, K2 = ⟨(1, 3)⟩, K3 = ⟨(2, 3)⟩, K4 = ⟨(1, 2, 3)⟩, and S3 itself. Only the subgroups K0, K4 and S3 are normal in S3.

To find all the morphisms from Z4 to S3, note that the quotient groups of Z4 are isomorphic to Z1, Z2 and Z4. The group Z1 is isomorphic to the subgroup K0 of S3, and there is exactly one isomorphism between these groups. Thus we obtain the map ϕ0 : Z4 → S3, given by ϕ0(a) = e for all a ∈ Z4, which is, of course, the trivial morphism. Next, the group Z2 is isomorphic to each of the subgroups K1, K2 and K3, and there is precisely one isomorphism between these groups. This gives three more morphisms ϕ1, ϕ2 and ϕ3. The first one is given by ϕ1(a) = (1, 2)a, the second by ϕ2(a) = (1, 3)a, and the third by ϕ3(a) = (2, 3)a. Finally, we note that none of the subgroups of S3 are isomorphic to Z4. Thus we have found exactly 4 morphisms from Z4 to S3.

To find all the morphisms from S3 to Z4, note that the quotients of S3 are S3/S3 ≅ Z1, S3/K4 ≅ Z2, and S3/K0 ≅ S3. The first quotient is isomorphic to the subgroup H0, and gives rise to the trivial morphism from S3 to Z4. The second quotient is isomorphic to H2, and gives rise to a morphism ϕ : S3 → Z4 given by ϕ(σ) = 2 if σ ∈ {(1, 2), (1, 3), (2, 3)}, and ϕ(σ) = 0 otherwise.

Finally, the third quotient group is not isomorphic to any subgroup of Z4. Thus there are exactly 2 morphisms S3 → Z4.

This example shows that even when the groups are small, the description of the morphisms between them is quite involved.
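As a sanity check on this example, here is a rough brute-force sketch in Python (not part of the notes; the tuple encoding of permutations and the helper names compose and power are ours) that counts both sets of morphisms by exhaustive search.

```python
from itertools import permutations, product

def compose(p, q):
    # permutations are tuples with p[i] = image of i; (p ∘ q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(len(p)))

S3 = list(permutations(range(3)))   # the 6 elements of S3, acting on {0, 1, 2}
identity = (0, 1, 2)

def power(p, k):
    result = identity
    for _ in range(k):
        result = compose(result, p)
    return result

# Morphisms Z4 -> S3 are determined by the image g of the generator 1,
# subject only to the requirement that g^4 = e.
homs_Z4_to_S3 = [g for g in S3 if power(g, 4) == identity]
print(len(homs_Z4_to_S3))   # 4, matching the count in Example 11.14

# Morphisms S3 -> Z4: brute force over all 4^6 set maps and keep those
# satisfying f(pq) = f(p) + f(q) (mod 4).
homs_S3_to_Z4 = []
for values in product(range(4), repeat=6):
    f = dict(zip(S3, values))
    if all((f[compose(p, q)] - f[p] - f[q]) % 4 == 0 for p in S3 for q in S3):
        homs_S3_to_Z4.append(f)
print(len(homs_S3_to_Z4))   # 2, matching the count in Example 11.14
```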


Theorem 11.15. Let H and K be two normal subgroups of G. Then H ∩K ▹G.

Exercise 11.16. Prove the above theorem.

Definition 11.17. Let H, K ≤ G. Then define the product of the two subgroups by HK = {hk | h ∈ H, k ∈ K}.

The product of two subgroups need not be a subgroup of G. To see this, consider H = ⟨(1, 2)⟩ and K = ⟨(1, 3)⟩ in S3. It is an easy exercise to show that HK is not a subgroup of S3. For example, by direct computation, we see that HK has 4 elements, while S3 has 6 elements, so by Lagrange's Theorem, it cannot be a subgroup.
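The direct computation mentioned above is small enough to verify by machine; the following sketch (the tuple encoding of permutations and the helper compose are ours) simply lists the elements of HK.

```python
from itertools import permutations

def compose(p, q):
    # permutations are tuples with p[i] = image of i; (p ∘ q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(len(p)))

e = (0, 1, 2)
t12 = (1, 0, 2)   # the transposition (1, 2), acting on {0, 1, 2}
t13 = (2, 1, 0)   # the transposition (1, 3)

H = {e, t12}
K = {e, t13}
HK = {compose(h, k) for h in H for k in K}
print(len(HK))   # 4, which does not divide 6, so HK is not a subgroup of S3
```

However, we do have the following important result.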

Theorem 11.18. Let H, K ≤ G and suppose that either H ▹ G or K ▹ G. Then HK is a subgroup of G.

Exercise 11.19. Prove the above theorem.

Theorem 11.20 (Second Isomorphism Theorem for groups). Let H, K ≤ G and suppose that K ▹ G. Then H ∩ K ▹ H, K ▹ HK, and

H/(H ∩ K) ≅ HK/K.

Proof. By Theorem 11.18, we know that HK ≤ G. Since c_g(K) = K for all g ∈ G, this also holds for all g ∈ HK. Thus K ▹ HK. If h ∈ H and x ∈ H ∩ K, then c_h(x) = hxh^{-1} ∈ H, and c_h(x) ∈ K since K ▹ G. Thus c_h(x) ∈ H ∩ K. This shows that H ∩ K ▹ H. Define ϕ : H → HK/K by ϕ(h) = h̄, the class of h in HK/K. This definition makes sense since H ⊆ HK: because e ∈ K, we have h = he ∈ HK for all h ∈ H. Suppose that y ∈ HK/K. Then y is the class of hk for some h ∈ H and k ∈ K. Since h and hk differ by the element k ∈ K, they determine the same class, so

ϕ(h) = h̄ = y.

This shows that ϕ is surjective. Let h ∈ ker(ϕ). Then h̄ = ē, so h ∈ K, and thus h ∈ H ∩ K. Since any element of H ∩ K is in the kernel of ϕ, it follows that ker(ϕ) = H ∩ K. Thus, by the first isomorphism theorem,

H/(H ∩ K) ≅ ϕ(H) = HK/K.

□
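For a concrete instance of the theorem, take G = S3, H = ⟨(1, 2)⟩, and K = A3 = ⟨(1, 2, 3)⟩, which is normal in S3. Then H ∩ K = {e} and HK = S3, so the theorem gives

H/(H ∩ K) ≅ H ≅ Z2 and HK/K = S3/A3,

and in particular S3/A3 ≅ Z2, consistent with o(S3/A3) = o(S3)/o(A3) = 6/3 = 2.

To prepare for the next isomorphism theorem, we need a definition.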

Definition 11.21. Let X ⊆ G and H ▹ G. Then by X/H we mean the subset {x̄ | x ∈ X} of G/H.

Proposition 11.22. If H ▹ G and H ≤ K, then K/H ≤ G/H. Moreover, if K ▹ G, then K/H ▹ G/H.

Proof. Since ē ∈ K/H, K/H is nonempty. Let ā, b̄ ∈ K/H. Then a = k₁h₁ and b = k₂h₂ for some k₁, k₂ ∈ K and h₁, h₂ ∈ H. Now

ab = k₁h₁k₂h₂ = (k₁k₂)(k₂^{-1}h₁k₂)h₂,

and (k₂^{-1}h₁k₂)h₂ ∈ H because H ▹ G, so ab and k₁k₂ determine the same class in G/H. But ā · b̄ is the class of ab, so ā · b̄ ∈ K/H, which shows that K/H is closed under multiplication. Moreover,

a^{-1} = h₁^{-1}k₁^{-1} = k₁^{-1}(k₁h₁^{-1}k₁^{-1}),

and k₁h₁^{-1}k₁^{-1} ∈ H, so (ā)^{-1}, which is the class of a^{-1}, is the class of k₁^{-1} and lies in K/H. Thus K/H ≤ G/H. Finally, suppose that K ▹ G. If ḡ ∈ G/H and k̄ ∈ K/H, then ḡk̄ḡ^{-1} is the class of gkg^{-1}, which lies in K, so ḡk̄ḡ^{-1} ∈ K/H. Thus K/H ▹ G/H. □

Theorem 11.23 (Third Isomorphism Theorem for groups). Let H ≤ K ≤ G and suppose that both H and K are normal subgroups of G, so that in particular, H ▹ K. Then

(G/H)/(K/H) ∼= G/K.


Proof. The tricky part of the proof is to give a good notation for the equivalence classes of elements of G in the two different quotients G/H and G/K. Let us denote the image of a ∈ G in G/H by ā and its image in G/K by ã. Define a map ϕ : G/H → G/K by ϕ(ā) = ã. To see that this is well defined, note that if π : G → G/K is the projection π(a) = ã, then H ⊆ ker(π), since H ⊆ K, so by Theorem 11.12 there is an induced map ϕ = π̄ with ϕ(ā) = π(a) = ã. Clearly ϕ is surjective. Now suppose that ā ∈ ker(ϕ). Then ã = ẽ, so a ∈ K, and therefore ā ∈ K/H. Conversely, if ā ∈ K/H, then a = kh for some k ∈ K and h ∈ H. Since H ⊆ K, it follows that a ∈ K, so that ϕ(ā) = ẽ. Thus ker(ϕ) = K/H, and the induced map ϕ̄ : (G/H)/(K/H) → G/K is surjective and injective, so it is an isomorphism. □

11.1. The Commutator Subgroup.

Definition 11.24. If G is a group and g, h ∈ G, then the commutator of g and h, denoted [g, h], is the element

[g, h] = ghg^{-1}h^{-1}.

The commutator subgroup of G, denoted G′ or [G, G], is the smallest subgroup of G containing all commutators.

It is not true in general that the set of commutators of a group G is a subgroup. Instead, we have the following characterization of [G, G].

Theorem 11.25. If a, b ∈ G, then [a, b]^{-1} = [b, a]. As a consequence, [G, G] = {∏_{i=1}^{n} [aᵢ, bᵢ] | aᵢ, bᵢ ∈ G, n ∈ P}.

Proof. To see the first statement, note that

[a, b][b, a] = aba−1b−1bab−1a−1 = e.

In general, if S ≠ ∅ is a subset of G, we have ⟨S⟩ = {∏_{i=1}^{n} xᵢ | xᵢ ∈ S or xᵢ^{-1} ∈ S, n ∈ P}. But in this case, since the set of commutators is closed under inverses, we have the simpler description above. □

Theorem 11.26. If a, b ∈ G, then c_g([a, b]) = [c_g(a), c_g(b)] for all g ∈ G. As a consequence, the commutator subgroup is a normal subgroup of G.

Theorem 11.27. Let ϕ : G → G′ be a morphism, and suppose that G′ is abelian. Then [G, G] ≤ ker(ϕ). Moreover, G/[G, G] is abelian.

Proof. First, note that

ϕ([a, b]) = ϕ(aba^{-1}b^{-1}) = ϕ(a)ϕ(b)ϕ(a)^{-1}ϕ(b)^{-1} = ϕ(a)ϕ(a)^{-1}ϕ(b)ϕ(b)^{-1} = e,

where the third equality uses the fact that G′ is abelian.

Thus [a, b] ∈ ker(ϕ) for every commutator. It follows that any product of commutators is in ker(ϕ), and since every element in the commutator subgroup is a product of commutators, we have [G, G] ≤ ker(ϕ).

Let π : G → G/[G, G] be the natural projection π(a) = ā. Then if ā, b̄ ∈ G/[G, G], the product ā b̄ (ā)^{-1}(b̄)^{-1} is the class of aba^{-1}b^{-1} = [a, b], which lies in [G, G], so

ā b̄ (ā)^{-1}(b̄)^{-1} = ē.

But this means that ā b̄ = b̄ ā, so G/[G, G] is abelian. □
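As a small computational illustration (this sketch is ours, not part of the notes; the tuple encoding and the closure loop are only one way to do it), one can verify by brute force that the commutator subgroup of S3 is A3.

```python
from itertools import permutations

def compose(p, q):
    # permutations are tuples with p[i] = image of i; (p ∘ q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    q = [0] * len(p)
    for i, j in enumerate(p):
        q[j] = i
    return tuple(q)

G = list(permutations(range(3)))   # S3
commutators = {compose(compose(a, b), compose(inverse(a), inverse(b)))
               for a in G for b in G}   # all [a, b] = a b a^{-1} b^{-1}

# close the set of commutators under multiplication to obtain [G, G]
derived = set(commutators)
changed = True
while changed:
    changed = False
    for x in list(derived):
        for y in list(derived):
            z = compose(x, y)
            if z not in derived:
                derived.add(z)
                changed = True

print(sorted(derived))   # the three even permutations, i.e. [S3, S3] = A3
print(len(derived))      # 3
```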


12. Dihedral Groups

The dihedral group Dn is often defined as the group of symmetries of the regular n-gon. By symmetries, we mean rotations and reflections which preserve the set of vertices and edges of the n-gon. There are several different ways of representing the dihedral group, and in this section, we will discuss several of them.

Figure 3. Hexagon centered at the origin with one vertex at (1, 0)

The picture above illustrates a regular hexagon, centered at the origin, with one vertex at the point (1, 0). The vertices are at the points (cos(kπ/3), sin(kπ/3)), for k = 0, . . . , 5. More generally, for a regular n-gon, we would have n vertices at the points (cos(2kπ/n), sin(2kπ/n)), for k = 0, . . . , n − 1. There are n rotations which preserve the n-gon, generated by the rotation ρ through the angle 2π/n, which is conventionally chosen to be a counterclockwise rotation. The n rotations are ρ, ρ^2, . . . , ρ^n, where ρ^n is the identity element, since a rotation by the angle 2π is considered as the identity.

For a hexagon, there are 6 reflections, three of them across the lines determined by pairs of midpoints of opposite edges, and three of them across the lines determined by pairs of opposite vertices. The same pattern holds for any n-gon when n is even, but for n odd, there is a different pattern. There are still n reflections, but each is given by a reflection across a line through a vertex and the midpoint of the opposite edge.

If we denote the reflection of the plane across the x-axis by σ, we note that for any n-gon, this reflection is one of the symmetries. Moreover, every reflection is of the form ρ^k σ, for k = 0, . . . , n − 1. Thus the complete set of symmetries of the n-gon is


{e, ρ, . . . , ρ^{n−1}, σ, ρσ, . . . , ρ^{n−1}σ}. This means that there are 2n symmetries of the n-gon.

We would like to show that the set Dn of symmetries of the n-gon forms a group under composition. We have already used composition to aid in the description of the symmetries, but we still need to show that any composition of symmetries is another symmetry. This is clear from the geometric point of view, since the composition of maps which preserve the vertices and edges also preserves the vertices and edges. However, we would like to understand this idea from an algebraic point of view as well.

Let us see what we can work out directly from the definitions of ρ and σ. Clearly, the elements σ_k = ρ^k σ are not rotations, because we have flipped the plane over with σ, and the rotation ρ^k does not flip it over again. In fact, it is not hard to see that the points ±(cos(kπ/n), sin(kπ/n)) are preserved under σ_k, so that σ_k preserves the line through these two points, and therefore must be the reflection across that line.

Suppose that ϕ is any element of the dihedral group. It is enough to know whether it is a rotation or a reflection and what it does to the vertex (1, 0). We use this fact to compute σρ. Since ρ takes (1, 0) to (cos(2π/n), sin(2π/n)), and σ has the effect of negating the y-coordinate,

(σρ)(1, 0) = (cos(2π/n), −sin(2π/n)) = (cos(2π/n), sin(−2π/n)).

However, we can directly compute that

(ρ^{n−1}σ)(1, 0) = ρ^{n−1}(1, 0) = (cos(2(n − 1)π/n), sin(2(n − 1)π/n)) = (cos(2π/n), sin(−2π/n)).

It follows that σρ = ρ^{n−1}σ, since both of these maps are reflections and they take (1, 0) to the same vertex.

All of the 2n symmetries of the n-gon are given by orthogonal transformations of the plane. The set of all orthogonal transformations of the plane is a group, denoted by O(2, R), or just O(2). It consists of rotations, which are given by matrices of the form

ρ_θ = [ cos(θ)  −sin(θ) ; sin(θ)  cos(θ) ],

and reflections, which are given by matrices of the form

σ_θ = [ cos(θ)  sin(θ) ; sin(θ)  −cos(θ) ].

Let σ = [ 1  0 ; 0  −1 ]. Then σ_θ = ρ_θ σ, by direct computation. Moreover, by direct computation, we obtain that σρ_θ = ρ_{−θ}σ.

Now, let ρ = ρ_θ for θ = 2π/n. It is easily computed that n is the least positive integer such that ρ^n = I, the identity matrix, and that the set of 2n elements Dn = {I, ρ, . . . , ρ^{n−1}, σ, ρσ, . . . , ρ^{n−1}σ} is closed under multiplication and inverses and is nonempty. Thus Dn is a subgroup of O(2).
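A quick numerical sanity check of these matrix identities is easy to run; the sketch below (ours, with n = 6 chosen arbitrarily) verifies the relation σρ_θ = ρ_{−θ}σ and counts the 2n distinct matrices.

```python
import numpy as np

def rho(theta):
    # rotation of the plane by the angle theta
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

sigma = np.array([[1.0, 0.0], [0.0, -1.0]])   # reflection across the x-axis

n = 6
theta = 2 * np.pi / n
r = rho(theta)

# check the relation sigma rho_theta = rho_{-theta} sigma numerically
print(np.allclose(sigma @ rho(theta), rho(-theta) @ sigma))   # True

# the 2n matrices r^k and r^k sigma, k = 0, ..., n-1, are pairwise distinct
elements = [np.linalg.matrix_power(r, k) for k in range(n)]
elements += [m @ sigma for m in elements]
distinct = sum(1 for i, a in enumerate(elements)
               if all(not np.allclose(a, b) for b in elements[:i]))
print(distinct)   # 2n = 12
```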

The last incarnation of the dihedral group that we will study is as operators on the complex plane. By an operator on the complex plane we mean a map C → C. Let ρ be the operator defined by ρ(z) = e^{2πi/n} z, in other words, multiplication by the complex number e^{2πi/n}, where e^{iθ} = cos(θ) + i sin(θ), a property of complex numbers that you can verify by expanding the power series for e^{iθ} and using the fact that i^2 = −1.

Let σ be the conjugation operator σ(z) = z̄. Then the dihedral group Dn is the subgroup of operators on the complex plane generated by ρ and σ. Since ρ and σ are invertible maps, with the inverse of ρ given by multiplication by e^{−2πi/n}, and σ being its own inverse, Dn is a subgroup of the permutations of C. Moreover, ρ^k


is multiplication by e^{2kπi/n}, so ρ^n = 1, the operator of multiplication by 1, which is the identity operator on C.

Next, we compute

(σ ◦ ρ)(z) = σ(ρ(z)) = σ(e^{2πi/n} z) = e^{−2πi/n} z̄ = ρ^{−1}(σ(z)) = (ρ^{−1} ◦ σ)(z),

since the conjugate of a product is the product of the conjugates and the conjugate of e^{2πi/n} is e^{−2πi/n}. From this we conclude that σρ = ρ^{−1}σ. It follows from this relation that Dn = {e, ρ, . . . , ρ^{n−1}, σ, ρσ, . . . , ρ^{n−1}σ}, and that these 2n elements are distinct.

No matter which of the three models of Dn one uses, we obtain that it has 2n elements, that ρ has order n, σ has order 2, and σρ = ρ^{−1}σ. In fact, these three properties are sufficient to compute all products in Dn, and therefore to construct the Cayley table of Dn.
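Here is a rough sketch of that computation in code (the encoding of ρ^i σ^j as a pair (i, j), the reduction rule, and the choice n = 4 are ours): it builds the entire Cayley table of Dn from the three properties alone.

```python
# Every element of D_n can be written in the normal form rho^i sigma^j with
# 0 <= i < n and j in {0, 1}. Using sigma rho = rho^{-1} sigma repeatedly gives
# (rho^i sigma^j)(rho^k sigma^l) = rho^{i + (-1)^j k (mod n)} sigma^{(j + l) mod 2}.

def dihedral_multiply(x, y, n):
    i, j = x
    k, l = y
    return ((i + (-1) ** j * k) % n, (j + l) % 2)

n = 4
elements = [(i, j) for j in (0, 1) for i in range(n)]
table = {(x, y): dihedral_multiply(x, y, n) for x in elements for y in elements}

# sanity checks against the three properties above
e, rho, sigma = (0, 0), (1, 0), (0, 1)
assert all(table[(e, x)] == x == table[(x, e)] for x in elements)   # identity
r = e
for _ in range(n):
    r = table[(r, rho)]
assert r == e                                             # rho^n = e
assert table[(sigma, rho)] == table[((n - 1, 0), sigma)]  # sigma rho = rho^{n-1} sigma
print(len(elements))   # 2n elements in the table
```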

For some low values of n, we obtain some isomorphisms between Dn and some other groups. D1 has exactly 2 elements, e and σ, so D1 ≃ Z2. D2 is isomorphic to the Klein 4-group. D3 ≃ S3, which can be verified by comparing Cayley tables. There is a morphism Dn → Sn, since every symmetry of the n-gon induces a permutation of the n vertices of the n-gon. When n ≥ 3, this map is injective. The isomorphism between D3 and S3 can be seen in this way. When n > 3, the morphism Dn → Sn is not surjective, as o(Dn) = 2n < n! = o(Sn) when n > 3.

The model for Dn as the symmetries of the n-gon has some difficulty when n is 1 or 2. To rescue the model, let us consider the n-gon to be defined as follows. Take the unit circle and mark n equally spaced points on it, in other words, the points (cos(2kπ/n), sin(2kπ/n)) for k = 0, . . . , n − 1. We will call these n points the vertices of the n-gon. Define the group Dn as the group of symmetries of the circle which preserve this set of n vertices. In this way, the 1-gon is a circle with one point marked on it, and the 2-gon is a circle with two diametrically opposite points marked on it.

Note that some mathematicians denote the group Dn as D2n. This kind of conflicting terminology is not unusual in mathematics and reflects traditions coming from different branches of mathematics.

13. Direct Products and Semidirect Products

Definition 13.1. Let G and H be groups. Then the set G × H, equipped with the binary operation (g, h)(g′, h′) = (gg′, hh′), is called the direct product of G and H. More generally, if G_λ is a collection of groups for λ ∈ Λ, then ∏_{λ∈Λ} G_λ, equipped with the binary operation (g_λ)(g′_λ) = (g_λ g′_λ), is called the direct product of the groups G_λ.

Theorem 13.2. The direct product of groups is a group.

Proof. We give the proof for two groups G and H. The proof for a collection of groups is similar. First, we check that the binary operation is associative.

(g, h)((g′, h′)(g′′, h′′)) = (g, h)(g′g′′, h′h′′) = (gg′g′′, hh′h′′) = (gg′, hh′)(g′′, h′′)

= ((g, h)(g′, h′))(g′′, h′′).

Next we check that (e, e) is the identity in G×H.

(e, e)(g, h) = (eg, eh) = (g, h) = (ge, he) = (g, h)(e, e).


Finally we show that (g, h)^{-1} = (g^{-1}, h^{-1}).

(g, h)(g^{-1}, h^{-1}) = (gg^{-1}, hh^{-1}) = (e, e) = (g^{-1}g, h^{-1}h) = (g^{-1}, h^{-1})(g, h). □

Theorem 13.3. G×H is abelian if and only if both G and H are abelian groups.

Exercise 13.4. Prove the above theorem.

Theorem 13.5. If (g, h) ∈ G × H, then (g, h)^k = (g^k, h^k) for all k ∈ Z.

Exercise 13.6. Prove the theorem above.

Example 13.7. Let G = Z2 and H = Z3. Then o(G × H) = o(G)o(H) = 6. We have found two groups of order 6, up to isomorphism, Z6 and S3. It turns out that these are the only two groups of order 6 up to isomorphism. However, we have not shown that, so let us see what we can determine. First, since both Z2 and Z3 are abelian, we know that G × H is abelian. Thus we can rule out the possibility that G × H is isomorphic to S3. On the other hand, if G × H ≅ Z6, it would have to have an element of order 6. This motivates us to study the orders of the elements in G × H. It is easy to calculate that k(1, 1) = (k, k), so this is (0, 0) precisely when k = 0 (mod 2) and k = 0 (mod 3). But this means that both 2 and 3 must divide k. Since 2 and 3 are relatively prime, this forces their product to divide k, so k is a multiple of 6. The least positive multiple of 6 is 6, so we see that (1, 1) has order 6. But that means that G × H is cyclic, and so it is isomorphic to Z6.

Theorem 13.8. Let g ∈ G and h ∈ H be elements of finite order. Then the order of (g, h) in G × H is the least common multiple of o(g) and o(h). On the other hand, if either g or h has infinite order, then so does (g, h).

Proof. Let m = o(g) and n = o(h), and let c = lcm(m, n). Now c = mx and c = ny for some x, y ∈ Z, so

(g, h)^c = (g^c, h^c) = (g^{mx}, h^{ny}) = ((g^m)^x, (h^n)^y) = (e^x, e^y) = (e, e).

This shows that o((g, h)) | c. On the other hand, if (g, h)^k = (e, e), then g^k = e and h^k = e, so m | k and n | k, which means that c | k, since c is the least common multiple of m and n. Since the order of the element is the least positive power k such that (g, h)^k = (e, e), it follows that c | o((g, h)). Thus the two are equal. When g or h has infinite order, it is impossible that (g, h)^k = (e, e) for any positive integer k, because in that case we would have g^k = e and h^k = e, which would imply that both of them have finite order. □

In the example above, we saw that Z2 × Z3∼= Z6. In fact, this is a consequence

of the more general result below.

Theorem 13.9. Zm × Zn∼= Zmn precisely when gcd(m,n) = 1.

Proof. Suppose that Zm × Zn ≅ Zmn. Then there is an element (a, b) of order mn. But the maximum possible order of any element in Zm × Zn is lcm(m, n) ≤ mn. It follows that lcm(m, n) = mn. Now lcm(m, n) = mn/gcd(m, n), so this forces gcd(m, n) = 1. On the other hand, when gcd(m, n) = 1, then o((1, 1)) = lcm(m, n) = mn, so Zm × Zn is a cyclic group of order mn, and thus it is isomorphic to Zmn. □
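A small brute-force check of this criterion (ours, not part of the notes) simply runs over all elements of Zm × Zn and looks for one of order mn.

```python
from math import gcd

def is_cyclic_product(m, n):
    # Z_m x Z_n is cyclic iff some element (a, b) has order m*n;
    # the order of (a, b) is lcm(m/gcd(a, m), n/gcd(b, n)).
    def order(a, mod):
        return mod // gcd(a, mod)
    def lcm(x, y):
        return x * y // gcd(x, y)
    return any(lcm(order(a, m), order(b, n)) == m * n
               for a in range(m) for b in range(n))

print(is_cyclic_product(2, 3))   # True:  gcd(2, 3) = 1, so Z2 x Z3 is cyclic of order 6
print(is_cyclic_product(2, 4))   # False: gcd(2, 4) = 2
```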


One problem with the notion of direct product is that in order for a group to literally be such a direct product, it needs to consist of ordered pairs. This is extremely unlikely, so at first glance, the notion of direct product does not seem to be so powerful. But the theorem above shows that the key idea is not being a direct product, but being isomorphic to one. Because we know that Zmn is isomorphic to Zm × Zn when m and n are relatively prime, it tells us a lot about the structure of that group. The following theorem gives a method of characterizing when a group is isomorphic to a direct product.

Theorem 13.10 (Fundamental Theorem on Direct Products). Let H and K be subgroups of a group G. Assume the following three conditions hold.

(1) H ∩ K = {e}.
(2) H ▹ G and K ▹ G.
(3) HK = G.

Then the map ϕ : H × K → G given by ϕ(h, k) = hk is an isomorphism between H × K and G. We can replace (2) by

(2′) hk = kh for all h ∈ H, k ∈ K.

Moreover, when o(G) <∞, we can replace (3) by

(3′) o(G) = o(H)o(K).

Proof. We first show that if (1) and (2) hold, then (2′) holds. Let h ∈ H and k ∈ K. Note that hk = kh if and only if hkh^{-1}k^{-1} = e. Now hkh^{-1}k^{-1} lies in K because hkh^{-1} is in K. Similarly, it lies in H because kh^{-1}k^{-1} is in H. Thus hkh^{-1}k^{-1} lies in H ∩ K, so we must have hkh^{-1}k^{-1} = e.

Next, we show that if (2′) and (3) hold, then (2) holds. Let g ∈ G. By (3), we must have g = hk for some h ∈ H and k ∈ K. Let x ∈ H. Then

gxg^{-1} = hkxk^{-1}h^{-1} = hkk^{-1}xh^{-1} = hxh^{-1} ∈ H.

Thus H ▹ G. By a similar argument, K ▹ G.

Suppose that (1) holds. Then we claim that ϕ is injective. Suppose that hk = h′k′ for some h′ ∈ H and k′ ∈ K. Then (h′)^{-1}h = k′k^{-1} ∈ H ∩ K, which implies that (h′)^{-1}h = k′k^{-1} = e, from which we deduce that h = h′ and k = k′. Thus ϕ is injective.

Suppose (1) and (3) hold, and that o(G) < ∞. Since (1) implies that ϕ is injective, and (3) implies that ϕ is surjective, we have that ϕ is a bijection, so o(G) = o(H) × o(K). Thus (3′) holds.

On the other hand, suppose (1) and (3′) hold and that o(G) < ∞. Since ϕ is injective, and the sets G and H × K have the same number of elements, ϕ must be surjective. Thus G = HK. Thus (3) holds.

Now, we have already seen that if (1) holds, ϕ is injective, and if (3) holds, ϕ is surjective. Thus it remains to show that ϕ is a morphism. If (2) holds (together with (1)), then we know that (2′) holds. Thus we will use (2′) to show that ϕ is a morphism. We have

ϕ((h, k)(h′, k′)) = ϕ((hh′, kk′)) = hh′kk′ = hkh′k′ = ϕ((h, k))ϕ((h′, k′)).

Thus ϕ is a morphism. □

Notice that in the above proof, if conditions (1) and (2′) hold, then ϕ is still an injective morphism. Thus, whether (3) holds or not, ϕ is an isomorphism of H × K onto its image HK, which is a subgroup of G. The same is true if (1) and (2) hold,


since, as we saw in the first step of the proof, conditions (1) and (2) already imply (2′). Condition (3) (or (3′) in the finite case) is only needed to guarantee that ϕ is surjective.

Since we are mainly concerned with finite groups, it is usually much easier to show that condition (3′) holds than condition (3). In fact, if we don't know what o(H) and o(K) are, we probably don't have enough information about the two subgroups to conclude anything.

Definition 13.11. If H and K are subgroups of G and the map ϕ : H × K → G given by ϕ(h, k) = hk is an isomorphism of groups, then we say that G is the internal direct product of H and K, and we also write G = H × K.

Notice that when we express an internal direct product in the form G = H × K, we don't mean that G consists of ordered pairs of elements of H and K. Thus there is some ambiguity about whether H × K means the internal direct product or the external direct product, which is given by these ordered pairs of elements. However, this ambiguity is not so much of a problem, as the two groups are isomorphic in a natural manner.

Example 13.12. Consider the group R∗ of nonzero real numbers (under multiplication). Let H = {±1} and K = R+, the subgroup of positive real numbers. Clearly H ∩ K = {1}. Moreover, every subgroup of an abelian group is normal, so condition (2) is satisfied. (Of course, condition (2′) is even more obvious.) Finally, every element of R∗ is in HK. Thus R∗ = H × K. We also express this as R∗ = Z2 × R+, since H ≅ Z2.

Example 13.13. We give another proof that Zmn ≅ Zm × Zn when gcd(m, n) = 1. We note that m has order n and n has order m in Zmn. Let H = ⟨m⟩ and K = ⟨n⟩, so that o(H) = n and o(K) = m. Suppose x ∈ H ∩ K. Since x is a member of both subgroups, o(x) divides both o(H) = n and o(K) = m, and since gcd(m, n) = 1, we get o(x) = 1, so x = 0. It follows that H ∩ K = {0}, so condition (1) holds. Condition (2) holds since Zmn is abelian. Condition (3′) holds since o(Zmn) = mn = o(H)o(K). Thus Zmn = H × K ≅ Zm × Zn.

Exercise 13.14. Show that D2n ≅ Dn × Z2 when n is odd. To do this, show that if we let r = ρ^2, then the subgroup H = ⟨r, σ⟩ ≅ Dn. Let K = ⟨ρ^n⟩. Show that K ≅ Z2. Then show that the hypotheses of the Fundamental Theorem on Direct Products are satisfied.

Exercise 13.15. Recall that GL(n, R) is the group of n × n invertible matrices with real coefficients, and SL(n, R) is the subgroup of matrices with determinant 1. Suppose that n is odd, so that det(λI) = λ^n takes every value in R∗ exactly once (in particular, det(−I) = −1). Let H = {λI | λ ∈ R∗} be the subgroup of nonzero scalar matrices, and let K = SL(n, R). Show that GL(n, R) = H × K.

Before the introduction of the notion of direct product, we did not know many groups. We have studied the groups Zn, Sn, Dn and some matrix groups. With the introduction of the direct product, we obtain a lot more groups. For example, every finite abelian group (actually every finitely generated abelian group) can be expressed as a direct product of cyclic groups. This statement is called the Fundamental Theorem of Finitely Generated Abelian Groups.

The proof of the Fundamental Theorem of Finitely Generated Abelian Groups is long, but the applications of the theorem are fairly straightforward.


The construction of groups by direct products is still not enough to classify all finite groups. We need a more powerful tool, called the semidirect product. Even this tool, which we introduce next, is not enough to classify finite groups, but it is a much more powerful tool than the direct product. The basic strategy in constructing finite groups is as follows. Suppose that you want to construct all groups of order n, and you know all groups of order less than n, up to isomorphism. Then we would like to be able to construct groups of order n from groups of smaller order. The first case we need to consider is the case when our group has no proper nontrivial normal subgroups.

Definition 13.16. Let G be a finite group. Then G is said to be a simple group if there are no proper nontrivial normal subgroups of G.

The first step in classifying groups is to determine all of the simple groups. The classification of simple groups turned out to be a very hard problem. Its solution was originally announced in the 1980s, but there were some problems with this solution. Then, in the 1990s, it was thought to have been completely solved, but again there were some problems with the proofs. According to Wikipedia, the complete classification of finite simple groups was completed in 2008. Nevertheless, the classification remains a difficult problem, and a complete proof has not yet been published.

The complete list of simple groups contains several families of simple groups, and 26 special cases, called sporadic groups. We already know one of the families, Zp for p prime, because the only subgroups of Zp are the trivial subgroup and the improper subgroup. This is because if 1 ≤ m < p, then m is a generator of Zp. The family of groups An, the subgroup of even permutations in Sn, is another family of simple groups for n ≥ 5. It was a very important discovery of Évariste Galois (1811–1832) that A5 is a simple group. This was the key fact in his proof that there can be no formula for the solutions of a general quintic polynomial in terms of radicals.

The largest sporadic group, called the Monster Group, was only constructed in 1982 (although its existence had been predicted in the 1970s). It has order

2^{46} · 3^{20} · 5^9 · 7^6 · 11^2 · 13^3 · 17 · 19 · 23 · 29 · 31 · 41 · 47 · 59 · 71 ≈ 8 × 10^{53}.
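The product above is easy to evaluate directly; the following short computation (ours) simply confirms the order of magnitude quoted.

```python
# Evaluate the order of the Monster group from its prime factorization.
order = (2**46 * 3**20 * 5**9 * 7**6 * 11**2 * 13**3
         * 17 * 19 * 23 * 29 * 31 * 41 * 47 * 59 * 71)
print(f"{order:.2e}")   # roughly 8.08e+53, matching the estimate above
```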

This group is so large that even calculating things like the product of two elements in the group is extremely complicated. A great deal of study of this monster group is still ongoing.

If α : K → Aut(H) is a morphism from K to the automorphism group of a group H, then it is typical to denote the automorphism α(k) by α_k.

Definition 13.17. Suppose that α : K → Aut(H) is a morphism between the group K and the automorphism group of the group H. Then the semidirect product of H and K determined by α, denoted by H ⋊_α K, is the set H × K equipped with the binary operation

(h, k)(h′, k′) = (hα_k(h′), kk′).

When the map α is implicit, we usually write H ⋊ K instead of H ⋊_α K.

Theorem 13.18. The semidirect product H ⋊_α K is a group under the binary operation introduced above.


Proof. To see that associativity holds, we compute

((h, k)(h′, k′))(h′′, k′′) = (hα_k(h′), kk′)(h′′, k′′) = (hα_k(h′)α_{kk′}(h′′), kk′k′′)
= (hα_k(h′)α_k(α_{k′}(h′′)), kk′k′′) = (hα_k(h′α_{k′}(h′′)), kk′k′′)
= (h, k)(h′α_{k′}(h′′), k′k′′) = (h, k)((h′, k′)(h′′, k′′)),

where the third equality uses that α is a morphism, so α_{kk′} = α_k ◦ α_{k′}, and the fourth uses that α_k is an automorphism of H.

It is natural to guess that the identity is (e, e), and we verify this by

(e, e)(h, k) = (eα_e(h), ek) = (α_e(h), k) = (1_H(h), k) = (h, k),
(h, k)(e, e) = (hα_k(e), ke) = (he, k) = (h, k).

It is not so obvious what (h, k)^{-1} should be, so let us solve for it. Suppose that (h, k)(x, y) = (e, e). Then

(e, e) = (h, k)(x, y) = (hα_k(x), ky).

Thus y = k^{-1} and α_k(x) = h^{-1}, so applying α_{k^{-1}} to both sides, we obtain that x = α_{k^{-1}}(h^{-1}). Thus (x, y) = (α_{k^{-1}}(h^{-1}), k^{-1}). We need to verify that (x, y)(h, k) = (e, e). But

(x, y)(h, k) = (α_{k^{-1}}(h^{-1}), k^{-1})(h, k) = (α_{k^{-1}}(h^{-1})α_{k^{-1}}(h), k^{-1}k)
= (α_{k^{-1}}(h^{-1}h), e) = (α_{k^{-1}}(e), e) = (e, e). □

Note that a direct product is a special case of a semidirect product, where the map α is the trivial morphism from K to Aut(H), because in that case we have

(h, k)(h′, k′) = (hα_k(h′), kk′) = (h·1_H(h′), kk′) = (hh′, kk′).
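To make the construction concrete, here is a rough sketch (assuming H = Z3, K = Z2, and α sending the nonidentity element of K to the inversion automorphism h ↦ −h of Z3; the function names are ours) that implements the binary operation and checks that Z3 ⋊ Z2 is a nonabelian group with 6 elements, hence isomorphic to S3.

```python
def alpha(k, h, n=3):
    # alpha_k(h): the identity automorphism for k = 0, inversion for k = 1
    return h % n if k == 0 else (-h) % n

def multiply(x, y, n=3):
    # the semidirect product operation (h, k)(h', k') = (h alpha_k(h'), k k')
    h, k = x
    h2, k2 = y
    return ((h + alpha(k, h2, n)) % n, (k + k2) % 2)

elements = [(h, k) for h in range(3) for k in range(2)]

# associativity holds for all 216 triples
assert all(multiply(multiply(a, b), c) == multiply(a, multiply(b, c))
           for a in elements for b in elements for c in elements)

# the operation is not commutative, so Z3 ⋊ Z2 is a nonabelian group of order 6
print(any(multiply(a, b) != multiply(b, a) for a in elements for b in elements))  # True
print(len(elements))   # 6
```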

As is the case for direct products, it is uncommon for a group to actually consist of ordered pairs, so there is little chance that a group fits the description of a semidirect product. However, what is more important is when a group is isomorphic to a semidirect product. The following theorem characterizes when G is isomorphic to a semidirect product.

Theorem 13.19. Suppose that H and K are subgroups of G satisfying

(1) H ∩ K = {e}.
(2) H ▹ G.
(3) HK = G.

Let α : K → Aut(H) be given by α_k(h) = khk^{-1}, the automorphism of H obtained by restricting conjugation by k to H. Then the map H ⋊_α K → G given by (h, k) ↦ hk is an isomorphism. If o(G) < ∞, then we may replace condition (3) by the condition

(3′) o(G) = o(H)o(K).

Proof. The fact that (3) is equivalent to (3′) when (1) holds is proved in the same way as it was for direct products. Let ϕ : H ⋊_α K → G be given by ϕ(h, k) = hk. Then

ϕ((h, k)(h′, k′)) = ϕ((hα_k(h′), kk′)) = ϕ((hkh′k^{-1}, kk′)) = hkh′k^{-1}kk′ = hkh′k′ = ϕ(h, k)ϕ(h′, k′).

Injectivity of ϕ follows from (1) and surjectivity from (3). □


Example 13.20. Suppose that n ≥ 2. Let H = An be the alternating subgroup of the permutation group Sn. We know that An ▹ Sn and that o(An) = n!/2. Let K = ⟨(12)⟩ = {(12), e} ≅ Z2. Since o(H)o(K) = n! = o(Sn) and H ∩ K = {e}, we see that Sn = An ⋊ Z2. Thus Sn is a semidirect product. Since An is simple for n ≥ 5, this gives a decomposition of Sn as a semidirect product of two simple groups.

Exercise 13.21. Show that Dn ≃ Zn ⋊ Z2.

Example 13.22. Let V be a vector space over a field k. Then GL(V) is the group of invertible linear transformations from V to V. Let v ∈ V and A ∈ GL(V). Then the map T_{A,v} : V → V given by T_{A,v}(x) = Ax + v is called an affine transformation of V. We show that the set Aff(V) of affine transformations of V is a group under composition. Since

(T_{A,v} ◦ T_{B,w})(x) = T_{A,v}(T_{B,w}(x)) = T_{A,v}(Bx + w) = A(Bx + w) + v = (AB)x + Aw + v = T_{AB,Aw+v}(x),

we see that the composition of two affine transformations is another affine transformation.

Clearly, the identity map T_{I,0} is an affine transformation, and

T_{A,v} ◦ T_{I,0} = T_{A,v} = T_{I,0} ◦ T_{A,v},

so e = T_{I,0} is the identity in Aff(V). Finally,

T_{A,v} ◦ T_{A^{-1},−A^{-1}v} = T_{I,0} = T_{A^{-1},−A^{-1}v} ◦ T_{A,v},

so T_{A,v}^{-1} = T_{A^{-1},−A^{-1}v}. Thus Aff(V) is a group.

Next, we let H = {T_{I,v} | v ∈ V}. Denote an element T_{I,v} by T_v and call it the translation by v. It is easy to see that T_v ◦ T_w = T_{v+w}, that T_v^{-1} = T_{−v}, and that e ∈ H, so H is a subgroup of Aff(V). It is also straightforward to see that the map T : V → H given by T(v) = T_v is an isomorphism, so H ≃ V. Next, we compute

T_{A,v} ◦ T_w ◦ T_{A,v}^{-1} = T_{A,v} ◦ T_w ◦ T_{A^{-1},−A^{-1}v} = T_{A,v} ◦ T_{A^{-1},w−A^{-1}v} = T_{I,Aw},

which shows that H ▹ Aff(V). Let K = {T_{A,0} | A ∈ GL(V)}. We have T_{A,0} ◦ T_{B,0} = T_{AB,0}, T_{A,0}^{-1} = T_{A^{-1},0}, and e ∈ K, which shows that K ≤ Aff(V).

Next, note that if T_{A,0} = T_v, then A = I and v = 0, so H ∩ K = {e}. Finally, note that T_{A,v} = T_v ◦ T_{A,0}, which shows that HK = Aff(V). Thus all three conditions of the theorem on semidirect products are satisfied and Aff(V) = H ⋊ K. Since H ≃ V and K ≃ GL(V), we express this fact in the form

Aff(V) = V ⋊ GL(V).
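As a numerical sanity check (the helper T, the random test data, and the choice V = R² are ours), the sketch below verifies the composition rule and the conjugation formula for translations.

```python
import numpy as np

def T(A, v):
    # the affine transformation T_{A,v}(x) = Ax + v
    return lambda x: A @ x + v

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 2)) + 2 * np.eye(2)   # generic, hence invertible
B = rng.standard_normal((2, 2)) + 2 * np.eye(2)
v, w, x = rng.standard_normal(2), rng.standard_normal(2), rng.standard_normal(2)

# (T_{A,v} ∘ T_{B,w})(x) = (AB)x + Aw + v
print(np.allclose(T(A, v)(T(B, w)(x)), T(A @ B, A @ w + v)(x)))   # True

# T_{A,v} ∘ T_{I,w} ∘ T_{A,v}^{-1} = T_{I,Aw}, so the translations form a normal subgroup
Ainv = np.linalg.inv(A)
conj = T(A, v)(T(np.eye(2), w)(T(Ainv, -Ainv @ v)(x)))
print(np.allclose(conj, x + A @ w))                               # True
```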

Department of Mathematics, University of Wisconsin-Eau Claire, Eau Claire, WI 54729 USA

E-mail address: [email protected]