
FOUNDATIONS & PROOF LECTURE NOTES

by Dr Lynne Walling

Note: You are expected to spend 3-4 hours per week working on this course outside of the lectures and tutorials. In this time you are expected to review the lecture notes, the comments on your homework, and the model solutions; work on your current homework assignment; and neatly rewrite your homework solutions for submission to your tutor.

In these notes, many proofs refer to previously proved results or previously stated assumptions by restating the results or assumptions; this is how I expect you to refer to these things when you take the exam in this course. However, you may find it useful in studying to annotate these notes with the proposition/theorem/corollary number or the page number containing the result or assumption being invoked.

References for the course:

• These notes: Transcription of Lynne Walling’s Lectures on Foundations & Proof.

• Larry Gerstein, Discrete Mathematics and Algebraic Structures, W.H. Freeman and Company, 1987.

• D.J. Velleman, How to Prove It: A Structured Approach, Cambridge University Press, 2006.

• P.J. Eccles, An Introduction to Mathematical Reasoning: Numbers, Sets and Functions, Cambridge University Press, 1997.

The course is organised into the following sections.

§1. Introduction: Sets and Functions (including notation; discussion of sets; Cartesian products; definition of a function; injective, surjective, and bijective functions; composition of functions; invertible functions; proof by contradiction)

§2. Truth tables, equivalences, and contrapositive (including notation used in truth tables; equivalence of propositions; the contrapositive of a proposition)

§3. Negations and contrapositives of propositions with quantifiers (including notation for a proposition dependent on a variable; an algorithmic approach for negating complex propositions; an equivalent definition of injective)

§4. Set operations (including union, intersection, difference of two sets, complement of a set; De Morgan’s Laws and similar propositions involving unions, intersections, differences, and complements of sets; indexed sets; inverse image of a set under a function; relations between inverse images)


§5. Partitioning sets, equivalence relations, and congruences (including relations on a set; definitions of reflexive, symmetric, and transitive relations; a correspondence between a partition of a set and an equivalence relation on that set; congruences)

§6. Algorithms, recursion, and mathematical induction (including the division algorithm in the integers; highest common factors; Euclid’s algorithm; the Chinese Remainder Theorem; using mathematical induction to prove relations on sets constructed using set operations)

§7. Strong induction and the Fundamental Theorem of Arithmetic (including the definition of a prime number; a proof that there are infinitely many prime numbers; an application of the Fundamental Theorem of Arithmetic to find all prime numbers p so that 5p + 9 is the square of an integer)

§8. Cardinality (including the definition of a countable set; statement of the Cantor-Schröder-Bernstein Theorem; basic results regarding the cardinality of subsets of the positive integers; proof that the Cartesian product Z+ × Z+ is countable; proof that the union of a countable number of pairwise disjoint countable sets is a countable set)

§9. Uncountable sets and power sets (including Cantor’s diagonalisation proof that the unit interval (0, 1) is uncountable; proof of Cantor’s Theorem that the cardinality of the power set of a set A is strictly larger than the cardinality of A)

§10. More proofs using contradiction, construction, and induction (including more practice problems; how to easily determine whether an integer is divisible by 9)


1. Introduction: Sets and Functions

Mathematics is pure language - the language of science. It is unique among languages in its ability to provide precise expression for every thought or concept that can be formulated in its terms... It is also an art - the most intellectual and classical of the arts. (Quote from A. Adler’s article “Mathematics and Creativity” in The World Treasury of Physics, Astronomy, and Mathematics.)

This course is heavily based on (1) definitions that are used to capture mathematical concepts, and on (2) using these definitions to solve mathematical problems. So we begin by defining some terms and introducing some notation we will use frequently.

A set is a collection considered as a unit. We are familiar with many sets, such as the set of integers, the set of rational numbers, and so on. In mathematics we use certain sets so often that we have abbreviated notation for them:

Z is the set of integers; so Z = {0, ±1, ±2, ±3, . . .}.

Q is the set of rational numbers; so Q = { a/b : a, b ∈ Z, b ≠ 0 }, meaning that Q is the set of all objects of the form a/b that meet the conditions that a, b ∈ Z and b ≠ 0 (recall that a, b ∈ Z means that a, b are elements of the set Z).

R is the set of real numbers.

C is the set of complex numbers, so C = { a + b√−1 : a, b ∈ R }.

{} is the empty set (i.e. the set with no elements), which is also denoted by ∅.

Note: Suppose X is a set. We cannot say “choose x ∈ X” unless we know X ≠ ∅; however, we can say “suppose x ∈ X”, even when we don’t know whether X is nonempty.

We write Z+ to denote the set of positive integers, Q+ the set of positive rational numbers, and R+ the set of positive real numbers. (Note: 0 is neither positive nor negative.) The notation

A = {x ∈ R : x > √2 }

means that A is the set of all real numbers x that meet the condition x > √2.

We write A ⊆ X when X is a set and A is a subset of X, meaning that every element of A is also an element of X. We write A ⊊ X when A is a proper subset of the set X, meaning that A is a subset of X but A is not equal to X. (The use of the notation A ⊂ X is not consistent throughout mathematical literature, so we will avoid using this notation.) We write A ⊈ B when A is not a subset of B.

Note that ∅ is the only subset of ∅.

Example: Z ⊆ Q ⊆ R ⊆ C.

Note: Suppose A, X are sets. Showing A = X is equivalent to showing A ⊆ X and X ⊆ A.

For A and B subsets of some set X, the notation A ∪ B denotes the union of A and B, meaning

A ∪ B = {x ∈ X : x ∈ A or x ∈ B }.

Similarly, for A and B subsets of some set X, the notation A ∩ B denotes the intersection of A and B, meaning

A ∩ B = {x ∈ X : x ∈ A and x ∈ B }.

Example: Suppose that A = {1, 2, 3} and B = {3, 4, 5}, which are subsets of Z. Then

A ∪B = {1, 2, 3, 4, 5} and A ∩B = {3}.
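For readers who like to experiment, the example above can be reproduced with Python's built-in set type. This is only an illustrative sketch; the variable names are ours and are not part of the notes.

```python
# Illustrative sketch: the sets A and B from the example above.
A = {1, 2, 3}
B = {3, 4, 5}

union = A | B          # elements that are in A or in B
intersection = A & B   # elements that are in A and in B

print(sorted(union))
print(sorted(intersection))
```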

The sets Z, Q, R and C are sets that have some useful properties. For instance, with X = Z or Q or R or C, and for any a, b, c ∈ X, we have a + b, −a, ab ∈ X, a + b = b + a, ab = ba, and c(a + b) = ca + cb. Further, for X = Q or R or C and a ∈ X with a ≠ 0, we have 1/a ∈ X. We also know that for a, b ∈ Z+, we have a ≤ ab, and a = ab only when b = 1. In addition, we know that for any a, b ∈ C, we have ab = 0 only when a = 0 or b = 0; this means that for a, b, c ∈ C with ab = ac and a ≠ 0, we have a(b − c) = 0 and hence b − c = 0 so b = c.

A notable property of R is that it is linearly ordered, meaning that for every x, y ∈ R, either x ≤ y or y ≤ x, and if x, y ∈ R with x ≤ y and y ≤ x then x = y. Note that any subset of R is also linearly ordered. We say a (nonempty) subset A of R is bounded above if there is some M ∈ R so that M ≥ a for all a ∈ A, and we say A is bounded below if there is some m ∈ R so that m ≤ a for all a ∈ A.

Proposition 1.1. Suppose A is a nonempty subset of Z.

(1) If A is bounded above then A contains a maximal element, meaning A is bounded above by an element of A.

(2) If A is bounded below then A contains a minimal element, meaning A is bounded below by an element of A.

Proof. (1) Suppose A is bounded above by N ∈ R. Choose any c ∈ A. Then there are finitely many integers between c and N, so there are finitely many a ∈ A so that c ≤ a ≤ N; A is bounded above by the largest of these elements a.

(2) Suppose A is bounded below by n ∈ R. Choose any c ∈ A. Then there are finitely many integers between n and c, so there are finitely many a ∈ A so that n ≤ a ≤ c; A is bounded below by the smallest of these elements a. □

Corollary 1.2. Any nonempty subset of Z+ has a minimal element.

Proof. If A ⊆ Z+ and A is nonempty, then A is a subset of Z that is bounded below by 1, and hence by the above proposition A has a minimal element. □

A cautionary tale regarding sets. Consider the following situation: “The barber is a man in town who shaves all those, and only those, men in town who do not shave themselves.” Who shaves the barber?

In 1901 Bertrand Russell presented a version of this paradox to the mathematical community; this resulted in widespread fear that the foundations of mathematics were “built on quicksand”. This paradox shows that a condition that contains an inherent contradiction does not determine a set.

There are many sources that discuss Russell’s Paradox (easily found by searching the internet); students are encouraged to peruse these.

In mathematics, we are very often concerned with functions (also called maps). Some functions model the behaviour of complex systems, while other functions allow us to compare two sets. We are accustomed to functions that are given by a formula, as when studying Calculus. For instance, you might have been given f(x) = x² and been instructed to graph this for −2 ≤ x ≤ 2. To do this, you would have plotted all points of the form (x, f(x)) (or equivalently, (x, x²)) for all x-values between −2 and 2. So the set {(x, f(x)) : −2 ≤ x ≤ 2 } is what you might have called the graph of f for −2 ≤ x ≤ 2. Here we develop a formal definition of a function.

Definition. Given sets X, Y, we define the Cartesian product of X and Y as

{(x, y) : x ∈ X, y ∈ Y },

and we denote this set by X × Y. (So X × Y is the set of all ordered pairs (x, y) that meet the conditions that x ∈ X and y ∈ Y.) Note that if X or Y is the empty set, then so is X × Y.

Example: R× R = {(x, y) : x, y ∈ R }. So R× R is the Cartesian plane.

Example: Let X = {1, 2, 3}, Y = {4, 5, 6}. Then

X × Y = {(1, 4), (1, 5), (1, 6), (2, 4), (2, 5), (2, 6), (3, 4), (3, 5), (3, 6)}.
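As an informal check of this example, the Cartesian product of two finite sets can be enumerated with `itertools.product` from the Python standard library. The names `cartesian` and `empty_product` are our own and serve only as an illustration.

```python
from itertools import product

X = [1, 2, 3]
Y = [4, 5, 6]

# All ordered pairs (x, y) with x in X and y in Y.
cartesian = set(product(X, Y))

# If either factor is the empty set, then so is the product.
empty_product = set(product(X, []))

print(sorted(cartesian))
print(empty_product)
```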

Definitions. With X, Y nonempty sets, a function f from X into Y is a set of ordered pairs f ⊆ X × Y with the property that for each element x ∈ X there is exactly one y ∈ Y so that (x, y) ∈ f. (Equivalently: With X, Y nonempty sets, a function f from X into Y is a set of ordered pairs f ⊆ X × Y with the property that for each element x ∈ X there is exactly one pair in f with first coordinate x.) When f is a function with (x, y) ∈ f, we write f(x) to denote y. (So a function f from X into Y pairs each element of X with exactly one element of Y, which we denote by f(x).) Thus using this notation, when f is a function from X to Y,

f = {(x, f(x)) : x ∈ X }.

We write f : X → Y to denote that f is a function from X into Y (so implicit in the notation f : X → Y is that X, Y are nonempty sets). Suppose f : X → Y. We say X is the domain of f and Y is the codomain of f. The range (or image) of f, denoted f(X), is the set

f(X) = {f(x) : x ∈ X },

i.e. the set of all values f(x) where x meets the condition that x ∈ X. Since f(x) ∈ Y for any x ∈ X, we also have

f(X) = {y ∈ Y : for some x ∈ X, f(x) = y }.

More generally, for any A ⊆ X,

f(A) = {f(x) : x ∈ A }.

Note: What we have defined here as the function f is what you may have previously called the graph of f.


Example: Let X = {x ∈ R : −2 ≤ x ≤ 2 }, Y = R, and

f = {(x, x²) : x ∈ R and −2 ≤ x ≤ 2 }.

So f is a function from X into R, and f(x) = x².

Example: Let X = {1, 2, 3}, Y = {4, 5, 6}. Let f = {(1, 4), (2, 5), (3, 4)}, g = {(1, 4), (1, 5), (3, 6)}. Then f is a function from X into Y, since for each x ∈ X, there is exactly one y ∈ Y so that (x, y) ∈ f. However, g is not a function from X into Y: We have 1 ∈ X, but there are two values of y ∈ Y (namely y = 4 and y = 5) so that (1, y) ∈ g; further, 2 ∈ X, but there is no value of y ∈ Y so that (2, y) ∈ g. We also have

f(X) = {f(1), f(2), f(3)} = {4, 5}.
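For finite sets, the definition of a function as a set of ordered pairs can be tested directly. The following is a minimal sketch; the helper names `is_function` and `image` are hypothetical and chosen only for this illustration.

```python
def is_function(pairs, X, Y):
    """Return True if `pairs` is a function from X into Y in the sense of
    the definition: a subset of X x Y in which every x in X occurs as a
    first coordinate exactly once."""
    if not all(x in X and y in Y for (x, y) in pairs):
        return False
    return all(sum(1 for (a, _) in pairs if a == x) == 1 for x in X)


def image(pairs, A):
    """The set f(A) = { f(x) : x in A } for a function given as pairs."""
    return {y for (x, y) in pairs if x in A}


X = {1, 2, 3}
Y = {4, 5, 6}
f = {(1, 4), (2, 5), (3, 4)}
g = {(1, 4), (1, 5), (3, 6)}

print(is_function(f, X, Y))   # f pairs each x with exactly one y
print(is_function(g, X, Y))   # g fails: 1 has two partners, 2 has none
print(image(f, X))
```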

Example: Define f : Z × Z → Z by f((m, n)) = n². (So the range of f is {n² : n ∈ Z }.) Let A = {(m, n) ∈ Z × Z : n = 2m }. So

A = {(m, 2m) : m ∈ Z }.

Then

f(A) = {f((m, n)) : (m, n) ∈ A }
     = {f((m, 2m)) : m ∈ Z }
     = {(2m)² : m ∈ Z }
     = {4m² : m ∈ Z }.

We often need to quantify objects in mathematics, meaning we need to distinguish between a condition always being met, or the existence of a case where a condition is met. Sometimes we also need to distinguish whether there is a unique case where a condition is met. For instance, suppose we have a function f : X → Y. This means that for every x ∈ X there exists a unique y ∈ Y so that (x, y) ∈ f. Notice that the order of the quantifying phrases is important:

“For every x ∈ X, there is a unique y ∈ Y so that (x, y) ∈ f” means that the choice of x determines the value of y (in this particular situation, we have y = f(x)). Contrastingly, “There exists a unique y ∈ Y so that for every x ∈ X, (x, y) ∈ f” means that there is a unique y ∈ Y so that for every x ∈ X, we have f(x) = y, meaning f is a constant function (as in the case f : R → R defined by f(x) = 5).

Notation: We use the symbol ∀ to denote “for all”, or equivalently, “for every”. We use the symbol ∃ to denote “there exists”, and we use ∃! to denote “there exists a unique”, or equivalently “there exists one and only one”.

Notes: (1) To show that something is unique, a standard technique is to show first that one such thing exists, and to show then that if another exists, it is equal to the first. For example, suppose X, Y are sets and f ⊆ X × Y. Then f is a function from X to Y if, ∀x ∈ X, ∃y ∈ Y so that (x, y) ∈ f, and ∀y′ ∈ Y, if (x, y′) ∈ f then y′ = y.

(2) When we write “Suppose c ∈ X” or “Choose c ∈ X” or “Take c ∈ X” without stating further assumptions on c, we mean that we are choosing c arbitrarily from X; thus anything we then conclude about c applies to every element of X.

(3) We will sometimes write “for c ∈ X” to mean “∀c ∈ X”, as the only condition being imposed on c is that it is in X. Somewhat similarly, we will sometimes write “for some c ∈ X” to mean “∃c ∈ X”.

Theorem 1.3. Suppose f : X → Y, g : X → Y. Then f = g if and only if ∀x ∈ X, f(x) = g(x).

Proof. First suppose that ∀x ∈ X, f(x) = g(x). Thus

f = {(x, f(x)) : x ∈ X } = {(x, g(x)) : x ∈ X } = g.

Now suppose that f = g. Thus (x, y) ∈ f if and only if (x, y) ∈ g. Take [arbitrary] x ∈ X, and then choose [the unique] y ∈ Y so that (x, y) ∈ f = g; thus y = f(x), and also y = g(x). Hence for every x ∈ X, we have f(x) = g(x). □

Definitions. We say a function f : X → Y is injective (or one-to-one, or an injection) if, ∀x1, x2 ∈ X, x1 ≠ x2 implies f(x1) ≠ f(x2). (Here ∀x1, x2 ∈ X means ∀x1 ∈ X, ∀x2 ∈ X.) We say a function f : X → Y is surjective (or onto, or a surjection) if, ∀y ∈ Y, ∃x ∈ X so that f(x) = y. (Thus f : X → Y is surjective if the range of f is Y.) A function is called bijective if it is both injective and surjective.

Note: We have defined (for example) a map f : X → Y to be injective if, ∀x1, x2 ∈ X, x1 ≠ x2 implies f(x1) ≠ f(x2). According to standard English usage, a definition is a precise statement of what a word or expression means. Thus saying that a map f : X → Y is injective is equivalent to saying that ∀x1, x2 ∈ X, x1 ≠ x2 implies f(x1) ≠ f(x2).

Example: Define f : Z+ → Z+ by f(x) = x². This function is injective but not surjective.

Example: Let R≥0 = {y ∈ R : y ≥ 0 }. Define g : R → R≥0 by g(x) = x²; this function is surjective, but not injective.

Example: Define h : R → R by h(x) = x³; this function is bijective.
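When the domain and codomain are finite, the definitions of injective and surjective can be checked exhaustively. The sketch below does this for the squaring map on two small finite sets of our own choosing; it illustrates the definitions but proves nothing about the infinite examples above. The helper names are hypothetical.

```python
def is_injective(f, X):
    """x1 != x2 implies f(x1) != f(x2), checked over all pairs from X."""
    return all(f(x1) != f(x2) for x1 in X for x2 in X if x1 != x2)


def is_surjective(f, X, Y):
    """Every y in Y equals f(x) for some x in X."""
    return all(any(f(x) == y for x in X) for y in Y)


def square(x):
    return x * x


# Squaring from {1, 2, 3} into {1, ..., 9}: injective, not surjective.
X1, Y1 = {1, 2, 3}, set(range(1, 10))
# Squaring from {-2, ..., 2} into {0, 1, 4}: surjective, not injective.
X2, Y2 = {-2, -1, 0, 1, 2}, {0, 1, 4}

print(is_injective(square, X1), is_surjective(square, X1, Y1))
print(is_injective(square, X2), is_surjective(square, X2, Y2))
```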

Warning: Do not confuse the definition of injective with the definition of a function. For example, consider f = {(y², y) : y ∈ Z } ⊆ Z × Z. So for each y ∈ Z, ∃!x ∈ Z so that (x, y) ∈ f (namely x = y²). But f is not a function, as, for example, (4, 2), (4, −2) ∈ f.

Example: Define f : R× R→ R× R by

f(m,n) = (m+ n,m− n).

We claim that f is surjective. We begin by choosing (u, v) ∈ R × R, the codomain of f.

[We want to find (m, n) ∈ R × R, the domain of f, so that f(m, n) = (u, v). So we need to find m, n ∈ R so that m + n = u and m − n = v. To have these equalities, we need m = u − n and m = v + n. To have these last two equalities, we need u − n = v + n, or equivalently, u − v = 2n, or equivalently, (u − v)/2 = n. If we have (u − v)/2 = n and m = u − n, then we have

\[ m = u - \frac{u - v}{2} = \frac{u + v}{2}. \]

Thus, having worked backwards to find m and n, we take these values for m and n and, with hope, can show that f(m, n) = (u, v).]

Having chosen (u, v) ∈ R × R, we set m = (u + v)/2, n = (u − v)/2. Thus (m, n) ∈ R × R, the domain of f. Then

\[ f(m, n) = (m + n,\ m - n) = \left( \frac{u + v}{2} + \frac{u - v}{2},\ \frac{u + v}{2} - \frac{u - v}{2} \right) = (u, v). \]

This shows that f is surjective.
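The preimage found by working backwards can be spot-checked numerically. The sketch below uses exact rational arithmetic from Python's `fractions` module, so it only samples rational points (u, v) and is no substitute for the proof; the name `preimage` is ours.

```python
from fractions import Fraction


def f(m, n):
    """The map f(m, n) = (m + n, m - n) from the example."""
    return (m + n, m - n)


def preimage(u, v):
    """The pair (m, n) = ((u + v)/2, (u - v)/2) found by working backwards."""
    return (Fraction(u + v) / 2, Fraction(u - v) / 2)


# A few sample targets (chosen arbitrarily for illustration).
for (u, v) in [(3, 4), (0, 0), (-5, 2), (Fraction(1, 3), 7)]:
    m, n = preimage(u, v)
    print((u, v), "<-", (m, n), f(m, n) == (u, v))
```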

Proposition 1.4. Suppose f : X → Y .

(a) f is injective if and only if ∀y ∈ f(X), ∃!x ∈ X so that f(x) = y.

(b) f is bijective if and only if ∀y ∈ Y, ∃!x ∈ X so that f(x) = y.

(So when f is bijective, f gives us a one-to-one correspondence between the elements of X and the elements of Y.)

Proof. (a) Suppose first that f is injective, and suppose y ∈ f(X). Thus ∃x ∈ X so that f(x) = y. Now suppose x′ ∈ X so that x′ ≠ x. Then since f is injective, f(x′) ≠ f(x) = y. Thus x is the only element of X so that f(x) = y; in other words, x is the unique element of X so that f(x) = y. In summary, for y ∈ f(X), ∃!x ∈ X so that f(x) = y.

Now suppose that ∀y ∈ f(X), ∃!x ∈ X so that f(x) = y. Suppose x1, x2 ∈ X so that x1 ≠ x2; let y1 = f(x1). By assumption, x1 is the only element of X that f maps to y1. Hence y1 ≠ f(x2), so f(x1) ≠ f(x2). Thus we have shown that for x1, x2 ∈ X with x1 ≠ x2, we have f(x1) ≠ f(x2), meaning that f is injective.

(b) Say f is bijective; then f(X) = Y, so by (a), ∀y ∈ Y, ∃!x ∈ X so that f(x) = y.

Now suppose that ∀y ∈ Y, ∃!x ∈ X so that f(x) = y. Then f is surjective, since ∀y ∈ Y, ∃x ∈ X so that f(x) = y. Therefore f(X) = Y, and by (a), f is injective. Hence f is bijective. □

Definitions. Suppose we have functions f : X → Y, g : Y → Z. We define the composition of g and f, denoted g ◦ f, by

(g ◦ f)(x) = g(f(x)) for any x ∈ X.

Since f assigns to x ∈ X exactly one value f(x) ∈ Y, and g assigns to f(x) ∈ Y exactly one value in Z, we have that g ◦ f is a function from X to Z, i.e. g ◦ f : X → Z. We say a function f : X → Y is invertible if there exists a function g : Y → X so that g ◦ f is the identity function on X (meaning that for all x ∈ X, (g ◦ f)(x) = x), and f ◦ g is the identity function on Y. Note that when g is an inverse for f, we also have that f is an inverse for g.

Example: Define f : R → R by f(x) = 2x + 3 and define g : R → R by g(x) = (x − 3)/2. Then for any x ∈ R,

\[ g \circ f(x) = g(f(x)) = \frac{f(x) - 3}{2} = \frac{(2x + 3) - 3}{2} = x, \]

and

\[ f \circ g(x) = f(g(x)) = 2g(x) + 3 = 2 \cdot \frac{x - 3}{2} + 3 = x. \]

Hence g ◦ f is the identity function on the domain of f, and f ◦ g is the identity function on the domain of g. So f is invertible with g as an inverse.
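Composition is easy to express in code as a function that returns a function. The sketch below spot-checks the two identities from this example at a few rational sample points (our own choice), using exact arithmetic; the helper `compose` is a hypothetical name, and a finite check is of course not a proof.

```python
from fractions import Fraction


def compose(g, f):
    """Return the composition g ∘ f, i.e. the map x -> g(f(x))."""
    return lambda x: g(f(x))


def f(x):
    return 2 * x + 3


def g(x):
    return (x - 3) / Fraction(2)


g_after_f = compose(g, f)
f_after_g = compose(f, g)

samples = [Fraction(-7, 3), 0, 1, Fraction(5, 2)]
print(all(g_after_f(x) == x for x in samples))
print(all(f_after_g(x) == x for x in samples))
```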

Proposition 1.5. Suppose f : X → Y, g : Y → Z, h : Z → W. Then h ◦ (g ◦ f) = (h ◦ g) ◦ f.

Proof. To show h ◦ (g ◦ f) = (h ◦ g) ◦ f, we need to show that for all x ∈ X, we have h ◦ (g ◦ f)(x) = (h ◦ g) ◦ f(x). So take x ∈ X; then

h ◦ (g ◦ f)(x) = h(g ◦ f(x)) = h(g(f(x)))

and

(h ◦ g) ◦ f(x) = h ◦ g(f(x)) = h(g(f(x))).

Thus h ◦ (g ◦ f) = (h ◦ g) ◦ f. □

Theorem 1.6. Suppose f : X → Y , g : Y → Z.

(a) If f and g are injective then so is g ◦ f.

(b) If f and g are surjective then so is g ◦ f.

Proof. We will prove (a) and leave (b) as an exercise.

Suppose f, g are injective, and suppose x1, x2 ∈ X so that x1 ≠ x2. Since f is injective, this means that f(x1) ≠ f(x2). Set y1 = f(x1), y2 = f(x2). Thus y1, y2 ∈ Y with y1 ≠ y2. Since g is injective, this means g(y1) ≠ g(y2). Substituting for y1, y2, this means

g ◦ f(x1) = g(f(x1)) = g(y1) ≠ g(y2) = g(f(x2)) = g ◦ f(x2).

Summarising, for any x1, x2 ∈ X, if x1 ≠ x2 then g ◦ f(x1) ≠ g ◦ f(x2). Hence g ◦ f is injective. □

Note: This theorem shows that if f : X → Y and g : Y → Z are both bijective, then g ◦ f : X → Z is also bijective.

As an exercise, one proves the following.

Theorem 1.7. Suppose f : X → Y, and g : Y → X, h : Y → X are inverses of f. Then g = h; that is, if f has an inverse then its inverse is unique.

Proof by contradiction: A proof by contradiction proceeds as follows. We want to prove a certain statement P is true. So instead, we assume that P is false, and we use this to deduce as true something we know to be false. Hence we conclude that it is impossible that P is false, and thus P must be true. We use this technique in part of the proof of the next theorem.

Theorem 1.8. Suppose f : X → Y. Then f is invertible if and only if f is bijective.

Proof. There are 2 statements we need to prove:

(1) f is invertible only if f is bijective, or equivalently, f is invertible implies f is bijective.

(2) f is invertible if f is bijective, or equivalently, f is bijective implies f is invertible.

To show (1): Suppose f is invertible, and let g : Y → X denote an inverse of f. Thus g ◦ f is the identity function on X, and f ◦ g is the identity function on Y.

To show that f is injective, suppose that x1, x2 ∈ X so that x1 ≠ x2. So

x1 = g ◦ f(x1) = g(f(x1)), and x2 = g ◦ f(x2) = g(f(x2)).

Since x1 ≠ x2, this gives g(f(x1)) ≠ g(f(x2)). [We now proceed to argue by contradiction: We want to deduce that f(x1) ≠ f(x2). So we show that if f(x1) = f(x2) then we obtain a contradiction to something we know to be true, and thus it is impossible to have f(x1) = f(x2).] For the sake of contradiction, suppose that f(x1) = f(x2). Then we must have g(f(x1)) = g(f(x2)). But this contradicts our deduction above that g(f(x1)) ≠ g(f(x2)). Hence it cannot be the case that f(x1) = f(x2), so we must have that f(x1) ≠ f(x2). This shows that (having assumed that f is invertible) if x1, x2 ∈ X with x1 ≠ x2 then f(x1) ≠ f(x2); that is, this shows that f is injective.

We now show that f is surjective [note that we are still assuming that f is invertible with inverse g]. We begin by choosing [arbitrary] y ∈ Y. [We need to find x ∈ X so that f(x) = y.] We know that f ◦ g is the identity function on Y, so

y = f ◦ g(y) = f(g(y)).

Set x = g(y). Thus x ∈ X, and

f(x) = f(g(y)) = y.

This shows that f is surjective.

Therefore we have shown that when f is invertible, then f is injective and surjective, i.e. f is bijective.

To show (2): Suppose f is bijective. [So we are assuming that f is injective and surjective.] Set

g = {(y, x) ∈ Y × X : (x, y) ∈ f }.

Since f is bijective, ∀y ∈ Y, ∃!x ∈ X so that (x, y) ∈ f; thus g is a function. So g : Y → X, and for any y ∈ Y, g(y) = x where f(x) = y. Now we need to show that g is an inverse of f. For this, first take any x ∈ X. Set y = f(x). Thus by the definition of g, g(y) = x, so (g ◦ f)(x) = x. As x was chosen arbitrarily from X, this shows g ◦ f is the identity function on X. Now choose any y ∈ Y. Since f is bijective, there is a unique x ∈ X with f(x) = y. Thus g(y) = x, and hence (f ◦ g)(y) = f(x) = y. Since y was chosen arbitrarily from Y, this shows f ◦ g is the identity function on Y. Hence when f is bijective, we have that f is invertible. This proves (2). □

Note: Suppose f : X → Y is bijective. In proving the above theorem, we found a “recipe” for defining f⁻¹ : Y → X:

For any y ∈ Y, f⁻¹(y) = x where x ∈ X so that f(x) = y.
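For a bijection between finite sets, this recipe amounts to swapping the coordinates of each ordered pair, exactly as in the construction of g in the proof. The sketch below illustrates this on a small bijection of our own invention; the helper names `inverse` and `apply` are hypothetical.

```python
def inverse(f):
    """Swap the coordinates of every pair: g = {(y, x) : (x, y) in f}."""
    return {(y, x) for (x, y) in f}


def apply(f, x):
    """Return the unique y with (x, y) in f (assumes f is a function)."""
    (y,) = [b for (a, b) in f if a == x]
    return y


X = {1, 2, 3}
Y = {"a", "b", "c"}
f = {(1, "b"), (2, "c"), (3, "a")}   # a bijection from X to Y (our example)
g = inverse(f)

# g ∘ f is the identity on X, and f ∘ g is the identity on Y.
print(all(apply(g, apply(f, x)) == x for x in X))
print(all(apply(f, apply(g, y)) == y for y in Y))
```

If f were not injective, the swapped set would contain two pairs with the same first coordinate, and `apply(g, y)` would raise an error for that y; this mirrors the role of bijectivity in the proof.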

Example: Suppose a, b, c, d ∈ R so that a < b and c < d. Let [a, b] denote the closed interval from a to b; that is,

[a, b] = {x ∈ R : a ≤ x ≤ b }.

We claim there is a bijection between [a, b] and [c, d]. Intuitively, the idea is that we stretch or shrink the interval [a, b] to be the same length as [c, d], and shift this. The map f1(x) = x − a will take [a, b] to [0, b − a], then f2(x) = x · (d − c)/(b − a) will take [0, b − a] to [0, d − c], and then f3(x) = c + x will take [0, d − c] to [c, d]. We set f = f3 ◦ f2 ◦ f1; thus we set

\[ f(x) = c + \frac{(x - a)(d - c)}{b - a}. \]

We want to show f : [a, b] → [c, d]. To do this, take x ∈ [a, b]; then 0 ≤ x − a ≤ b − a. We know a < b and c < d, so b − a > 0 and d − c > 0. Hence

\[ 0 \le \frac{(x - a)(d - c)}{b - a} \le d - c, \]

and then

\[ c \le c + \frac{(x - a)(d - c)}{b - a} \le d. \]

So we indeed have that f : [a, b] → [c, d]. (Warning: One may be tempted to argue by first assuming that c ≤ f(x) ≤ d, and then deducing that a ≤ x ≤ b, but what we need to show is that if x ∈ [a, b] then f(x) ∈ [c, d]. In one’s scratch work one might first assume that c ≤ f(x) ≤ d and then deduce that a ≤ x ≤ b, but then one must determine whether these steps can be reversed to obtain a proof of what is needed. More generally, to prove a statement of the form “If A then B”, it is incorrect to begin by assuming what is to be deduced.)

Now we want to show that f is bijective. So we could argue that f is injective and surjective. Using the definition of injective that we have given, it is awkward to show that f is injective; in §3 we will use a result of §2 to produce an equivalent definition of injective, using the “contrapositive” of the definition we have given. (The contrapositive of a statement of the form “If A holds then B holds” is “If B does not hold then A does not hold”; in §2 we will see that the contrapositive of a statement is equivalent to the statement.) In arguing that this particular function is surjective, we would actually produce the inverse of f, so here we will argue that f is bijective by finding g : [c, d] → [a, b] so that g ◦ f is the identity map on [a, b] and f ◦ g is the identity map on [c, d].

Using the strategy we used to construct f, reversing the roles of a and c and the roles of b and d, we define g(x) = a + (x − c)(b − a)/(d − c). [Alternatively, we could set y = f(x) and solve for x, finding that x = a + (y − c)(b − a)/(d − c), and then setting g(y) = a + (y − c)(b − a)/(d − c).] Then for x ∈ [c, d], we have c ≤ x ≤ d and hence a ≤ a + (x − c)(b − a)/(d − c) ≤ b; so g : [c, d] → [a, b]. Also, for x ∈ [a, b],

\[
\begin{aligned}
g \circ f(x) &= g(f(x)) \\
&= a + \frac{(f(x) - c)(b - a)}{d - c} \\
&= a + \left( c + \frac{(x - a)(d - c)}{b - a} - c \right) \frac{b - a}{d - c} \\
&= a + (x - a) \\
&= x.
\end{aligned}
\]

Similarly, for x ∈ [c, d],

\[
\begin{aligned}
f \circ g(x) &= f(g(x)) \\
&= c + \frac{(g(x) - a)(d - c)}{b - a} \\
&= c + \left( a + \frac{(x - c)(b - a)}{d - c} - a \right) \frac{d - c}{b - a} \\
&= c + (x - c) \\
&= x.
\end{aligned}
\]

Thus g : [c, d] → [a, b] is the inverse of f.
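The formulas for f and g can be tried out on a concrete pair of intervals. The sketch below picks [a, b] = [1, 3] and [c, d] = [10, 20] (values chosen by us purely for illustration) and checks a few rational sample points with exact arithmetic; the helper `make_maps` is a hypothetical name.

```python
from fractions import Fraction


def make_maps(a, b, c, d):
    """Return (f, g) with f : [a, b] -> [c, d] and g : [c, d] -> [a, b]
    given by the formulas in the example (requires a < b and c < d)."""
    if not (a < b and c < d):
        raise ValueError("need a < b and c < d")

    def f(x):
        return c + (x - a) * Fraction(d - c) / (b - a)

    def g(x):
        return a + (x - c) * Fraction(b - a) / (d - c)

    return f, g


f, g = make_maps(1, 3, 10, 20)

print(f(1), f(2), f(3))   # endpoints and midpoint of [1, 3]
print(all(g(f(x)) == x for x in [1, Fraction(3, 2), 2, 3]))
print(all(f(g(y)) == y for y in [10, 12, Fraction(35, 2), 20]))
```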

Note: Given the above definitions of f and g, it is necessary to ensure that f([a, b]) ⊆ [c, d] and that g([c, d]) ⊆ [a, b], else we cannot claim that f : [a, b] → [c, d] and g : [c, d] → [a, b], and knowing the domains and codomains of f and g is necessary to apply the preceding theorem. We could define f : [a, b] → R by f(x) = c + (x − a)(d − c)/(b − a) and g : [0, 2] → R by g(x) = a + (x − c)(b − a)/(d − c), and then proceed mechanically to argue that g ◦ f(x) = x, f ◦ g(x) = x; this will work because we could have extended the domains of f and g to R, but unless c = 0 and d = 2, this does not prove that there is a bijection between [a, b] and [0, 2].

In the exercises, one proves the following. (Part (a) of this proposition is an exercise for §3, and part (b) is an exercise for this section.)

Proposition 1.9. Suppose f : X → Y, g : Y → X so that g ◦ f is the identity map on X, meaning that for all x ∈ X, we have g ◦ f(x) = x.

(a) Suppose g is injective; then f ◦ g is the identity map on Y (and hence g = f⁻¹).

(b) Suppose f is surjective; then f ◦ g is the identity map on Y (and hence g = f⁻¹).

As an exercise, one also proves the following.

Theorem 1.10. Suppose f : X → Y and g : Y → Z are bijective (and hence we know g ◦ f is bijective). Then (g ◦ f)⁻¹ = f⁻¹ ◦ g⁻¹.

One also proves this useful result.

Proposition 1.11. Suppose f : X → Y is bijective, and A ⊆ X. Set B = {x ∈ X : x ∉ A }. (Standard notation for B is X ∖ A.) Then f(A) ∩ f(B) = ∅.

2. Truth tables, equivalences, and contrapositive

We use the word “statement” interchangeably with the word “sentence”, and we agree that a statement can be true or false or neither, but a statement cannot be simultaneously true and false. In a mathematical system, the true statements and false statements are the propositions of the system, and the label “true” or “false” associated with a given proposition is its truth value.

Notation: We use the symbol ¬ to mean “not”. We use the symbol ∧ to mean “and”. We use the symbol ∨ to mean “or”. (Note that we do not use ∨ to mean “exclusive or”; that is, P ∨ Q is true if P is true or if Q is true or if both P and Q are true.) We use the symbol =⇒ to mean “implies”. (So P =⇒ Q means that if P is true then Q is true.) We use the symbol ⇐⇒ to mean “if and only if”; so with P, Q propositions, P ⇐⇒ Q means that P =⇒ Q and Q =⇒ P. (So P ⇐⇒ Q means that P is true exactly when Q is true, and P is false exactly when Q is false.) When P ⇐⇒ Q, we say P and Q are equivalent.

Example: With x ∈ Z, we could have P representing the proposition “x ≥ 5” and Q representing the proposition “x ≤ 7”. Then P ∧ Q would represent the proposition “x ≥ 5 and x ≤ 7”.

When a proposition P is true, we sometimes express this by saying that P holds.

Example: Suppose P and Q represent propositions. P =⇒ Q is the proposition that P implies Q, or in other words, the proposition that if P is true then Q is true. To state this more emphatically, P =⇒ Q means that if P is true, then Q must be true. Note that P =⇒ Q allows for P and Q to both be true, or for P to be false and Q to be true, or for P and Q to both be false. However, P =⇒ Q does not allow for P to be true and Q to be false. (Initially, it can seem confusing that P =⇒ Q is true when P and Q are false. However, having P and Q false does not contradict that Q must be true if P is true.) We can represent this scenario using what is called a “truth table”, wherein we consider all possible combinations of the truth values of P and Q, and the consequent truth value of P =⇒ Q:

P   Q   [P =⇒ Q]

T   T   T
T   F   F
F   T   T
F   F   T

(The square brackets on the top line of the truth table are used simply to make it easier to distinguish the three propositions from each other.)

Note: We could prove the truth of the following propositions and theorems without using truth tables, but here we use truth tables to establish some fundamental and useful results in a rather painless way.

Example: Suppose P and Q represent propositions. P ∧ Q is true exactly when P and Q are both true. So the corresponding truth table is:

P   Q   [P ∧ Q]

T   T   T
T   F   F
F   T   F
F   F   F

Example: Suppose P and Q represent propositions. P ∨ Q is true exactly when P or Q is true. We do not use the word “or” to mean “exclusive or”, so P ∨ Q is true when P and Q are both true. So the corresponding truth table is:

P   Q   [P ∨ Q]

T   T   T
T   F   T
F   T   T
F   F   F

Example: Suppose P and Q represent propositions. The corresponding truth table for (¬P) ∨ Q is:

P Q [¬P ∨Q]


Example: Suppose P and Q represent propositions. The corresponding truth table for ¬Q =⇒ ¬P is:

P   Q   [¬Q =⇒ ¬P]


This truth table can be easier to determine by expanding it as follows.

P   Q   ¬P   ¬Q   [¬Q =⇒ ¬P]

T   T   F    F    T
T   F   F    T    F
F   T   T    F    T
F   F   T    T    T
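Truth tables such as the one above can also be generated mechanically. The following sketch enumerates all combinations of truth values with `itertools.product`, listing T before F as in these notes. The helper names `implies` and `truth_table` are our own, and `implies` simply encodes the rule stated earlier that P =⇒ Q fails only when P is true and Q is false.

```python
from itertools import product


def implies(p, q):
    """P => Q: not allowed only when P is true and Q is false."""
    return not (p and not q)


def truth_table(formula, n_vars):
    """List (values, result) rows, with T listed before F."""
    return [(vals, formula(*vals))
            for vals in product([True, False], repeat=n_vars)]


# Truth table for (not Q) => (not P).
contrapositive = truth_table(lambda p, q: implies(not q, not p), 2)

for vals, result in contrapositive:
    print(*("T" if v else "F" for v in (*vals, result)))
# prints the rows:  T T T / T F F / F T T / F F T
```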

Theorem 2.1. Suppose P,Q are propositions.

(a) P =⇒ Q is equivalent to ¬Q =⇒ ¬P.

(b) P =⇒ Q is equivalent to ¬P ∨ Q.

Proof. We prove (a) and leave (b) as an exercise.

P   Q   [P =⇒ Q]   [¬Q =⇒ ¬P]   [(P =⇒ Q) ⇐⇒ (¬Q =⇒ ¬P)]

T   T   T          T            T
T   F   F          F            T
F   T   T          T            T
F   F   T          T            T

So for all truth values of P and Q, (P =⇒ Q) and (¬Q =⇒ ¬P) have the same truth values. Hence [(P =⇒ Q) ⇐⇒ (¬Q =⇒ ¬P)]. □

Definitions. We call the proposition ¬Q =⇒ ¬P the contrapositive of the proposition P =⇒ Q. As seen above, the proposition P =⇒ Q is equivalent to its contrapositive. The proposition Q =⇒ P is called the converse of the proposition P =⇒ Q; as an exercise one shows that Q =⇒ P is not equivalent to P =⇒ Q.

We also have this easily proved result.

Proposition 2.2. Suppose P is a proposition. Then P ⇐⇒ ¬(¬P ).

Proof.

P   ¬P   ¬(¬P)

T   F    T
F   T    F

So the truth values of P and ¬(¬P) always agree, so the propositions P and ¬(¬P) are equivalent. □

The next proposition shows that ∧ and ∨ are “associative”, meaning that P ∧ Q ∧ R and P ∨ Q ∨ R are propositions that do not require parentheses.

Proposition 2.3. Suppose P,Q,R are propositions.

(a) (P ∧Q) ∧R ⇐⇒ P ∧ (Q ∧R).(b) (P ∨Q) ∨R ⇐⇒ P ∨ (Q ∨R).

Proof. We prove (a), and leave the proof of (b) as an exercise.

P  Q  R  (P ∧ Q)  [(P ∧ Q) ∧ R]
T  T  T     T           T
T  T  F     T           F
T  F  T     F           F
T  F  F     F           F
F  T  T     F           F
F  T  F     F           F
F  F  T     F           F
F  F  F     F           F

Also:

P  Q  R  (Q ∧ R)  [P ∧ (Q ∧ R)]
T  T  T     T           T
T  T  F     F           F
T  F  T     F           F
T  F  F     F           F
F  T  T     T           F
F  T  F     F           F
F  F  T     F           F
F  F  F     F           F

So for any truth values of P,Q,R, these truth tables show that [(P ∧ Q) ∧ R] ⇐⇒ [P ∧ (Q ∧ R)].

Note that one could combine the above truth tables into one (large) table, or just combine some of the information from these two truth tables into one truth table as follows:

P  Q  R  [(P ∧ Q) ∧ R]  [P ∧ (Q ∧ R)]  [(P ∧ Q) ∧ R] ⇐⇒ [P ∧ (Q ∧ R)]
T  T  T        T              T                      T
T  T  F        F              F                      T
T  F  T        F              F                      T
T  F  F        F              F                      T
F  T  T        F              F                      T
F  T  F        F              F                      T
F  F  T        F              F                      T
F  F  F        F              F                      T

□

The above proposition shows we can write P ∧ Q ∧ R and P ∨ Q ∨ R without there being confusion. As a trivial exercise, one can also show the following sometimes useful equivalences.

Proposition 2.4. Suppose P,Q,R are propositions. Then

P ∧ Q ∧ R ⇐⇒ (P ∧ Q) ∧ (P ∧ R), and P ∨ Q ∨ R ⇐⇒ (P ∨ Q) ∨ (P ∨ R).

Proposition 2.5. Suppose P,Q,R are propositions.

(a) P ∧ (Q ∨ R) ⇐⇒ (P ∧ Q) ∨ (P ∧ R).
(b) P ∨ (Q ∧ R) ⇐⇒ (P ∨ Q) ∧ (P ∨ R).

Proof. We prove (a) and leave the proof of (b) as an exercise.

P  Q  R  (Q ∨ R)  [P ∧ (Q ∨ R)]
T  T  T     T           T
T  T  F     T           T
T  F  T     T           T
T  F  F     F           F
F  T  T     T           F
F  T  F     T           F
F  F  T     T           F
F  F  F     F           F

Also:

P  Q  R  (P ∧ Q)  (P ∧ R)  [(P ∧ Q) ∨ (P ∧ R)]
T  T  T     T        T              T
T  T  F     T        F              T
T  F  T     F        T              T
T  F  F     F        F              F
F  T  T     F        F              F
F  T  F     F        F              F
F  F  T     F        F              F
F  F  F     F        F              F

So for any truth values of P,Q,R, these truth tables show that

[P ∧ (Q ∨ R)] ⇐⇒ [(P ∧ Q) ∨ (P ∧ R)].

□

Theorem 2.6. Suppose P,Q are propositions.

(a) ¬(P ∧ Q) ⇐⇒ ¬P ∨ ¬Q.
(b) ¬(P ∨ Q) ⇐⇒ ¬P ∧ ¬Q.
(c) ¬(P =⇒ Q) ⇐⇒ (P ∧ ¬Q).

Proof. Using a truth table, we prove (a) and leave (b) and (c) as exercises.

P  Q  ¬P  ¬Q  P ∧ Q  ¬(P ∧ Q)  ¬P ∨ ¬Q
T  T  F   F     T       F         F
T  F  F   T     F       T         T
F  T  T   F     F       T         T
F  F  T   T     F       T         T

Thus for any truth values of P,Q, the truth values of ¬(P ∧ Q) and ¬P ∨ ¬Q are the same. This proves (a). □
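The parts left as exercises in Proposition 2.5 and Theorem 2.6 can be sanity-checked by the same brute-force enumeration. The sketch below is only a check, not a substitute for writing out the tables; the helper `agree` and the dictionary `checks` are names chosen here for illustration.

```python
from itertools import product

def agree(f, g, nvars):
    """True when the Boolean functions f and g have identical truth tables."""
    return all(f(*row) == g(*row)
               for row in product([True, False], repeat=nvars))

checks = {
    "2.5(a)": agree(lambda p, q, r: p and (q or r),
                    lambda p, q, r: (p and q) or (p and r), 3),
    "2.5(b)": agree(lambda p, q, r: p or (q and r),
                    lambda p, q, r: (p or q) and (p or r), 3),
    "2.6(a)": agree(lambda p, q: not (p and q),
                    lambda p, q: (not p) or (not q), 2),
    "2.6(b)": agree(lambda p, q: not (p or q),
                    lambda p, q: (not p) and (not q), 2),
}

for name, ok in checks.items():
    print(name, ok)

# Reprint the two columns compared in the proof of Theorem 2.6(a).
def tf(value):
    return "T" if value else "F"

print("P Q not(P and Q) (not P) or (not Q)")
for p, q in product([True, False], repeat=2):
    print(tf(p), tf(q), tf(not (p and q)), tf((not p) or (not q)))
```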


Proposition 2.7. Suppose P,Q are propositions. Then [P ∨ Q] ⇐⇒ [¬P =⇒ Q].

Note: With P,Q,R propositions, P =⇒ Q ⇐⇒ R and P ⇐⇒ Q =⇒ R do not have clear meanings. As exercises, one shows that the statements P =⇒ (Q ⇐⇒ R) and (P =⇒ Q) ⇐⇒ R are not equivalent, and P ⇐⇒ (Q =⇒ R) and (P ⇐⇒ Q) =⇒ R are not equivalent. Note that this also means an assertion such as

P =⇒ Q ⇐⇒ R =⇒ S

has no clear meaning.

We give two proofs of the next theorem; one is a proof by contradiction, and the other is a proof by contrapositive.

Theorem 2.8. (Pigeonhole Principle) Let A be a set with n elements and B a set with m elements where m,n ∈ Z+ with m < n. Then there is no injection from A into B. (So if n pigeons fly into m pigeonholes, then at least one pigeonhole contains more than one pigeon.)


Proof. Proof 1: For the sake of contradiction, suppose g : A → B is injective. Enumerate the elements of A as a_1, a_2, . . . , a_n and the elements of B as b_1, b_2, . . . , b_m. Let C = {g(a_i) : i ∈ Z+, i ≤ n}. Thus C is a subset of B, and since g is injective, C is a set with n elements. But this means there is a subset of B containing more elements than are in B, which is impossible. Thus it cannot be possible to have an injective function g : A → B.

Proof 2: The statement of the theorem is equivalent to “Let A be a set with n elements and B a set with m elements where m,n ∈ Z+. If m < n then there is no injection from A into B.” The contrapositive of this statement is “Let A be a set with n elements and B a set with m elements where m,n ∈ Z+. If there is an injection from A into B then m ≥ n.” We will prove this latter statement. Suppose g : A → B is injective. Enumerate the elements of A as a_1, a_2, . . . , a_n and the elements of B as b_1, b_2, . . . , b_m. Let C = {g(a_i) : i ∈ Z+, i ≤ n}. Thus C is a subset of B, and since g is injective, C is a set with n elements. Thus B must have at least n elements, meaning m ≥ n.

□
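For small sets the Pigeonhole Principle can be confirmed by exhaustive search. The following Python sketch counts the injective functions from an n-element set to an m-element set; it is a check on small cases only, and the function name `count_injections` is a hypothetical helper introduced here.

```python
from itertools import product

def count_injections(n, m):
    """Count injective functions from an n-element set to an m-element set.

    A function g : {0, ..., n-1} -> {0, ..., m-1} is encoded as the tuple
    (g(0), ..., g(n-1)); it is injective exactly when its values are distinct.
    """
    return sum(1 for g in product(range(m), repeat=n) if len(set(g)) == n)

# Whenever m < n, the search finds no injection at all.
for m in range(1, 5):
    for n in range(m + 1, 6):
        assert count_injections(n, m) == 0

# By contrast, injections do exist when n <= m.
print(count_injections(2, 3))
```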

Sometimes one can prove a result by contrapositive using an argument that is almost identical to proving the result by contradiction (as above). However, there are occasions where this is not the case; we will see an example of this later in the course when we prove by contradiction that the interval (0, 1) ⊆ R is what we call “uncountable”.

3. Negations and contrapositives of propositions with quantifiers

Suppose P(x) is a proposition involving x (where x ∈ X, X some set). Suppose the proposition

∀x ∈ X, P(x)

is not true. Then there must be an exceptional x ∈ X so that P(x) does not hold. That is,

¬(∀x ∈ X, P (x)) =⇒ (∃x ∈ X so that ¬P (x)).

Conversely, suppose the proposition

∃x ∈ X so that ¬P (x)

is true. Then it is not the case that P (x) holds for all x ∈ X, meaning

(∃x ∈ X so that ¬P (x)) =⇒ ¬(∀x ∈ X, P (x)).

Thus

¬(∀x ∈ X, P(x)) ⇐⇒ (∃x ∈ X so that ¬P(x)).

This means we also have

¬(∃x ∈ X so that ¬P(x)) ⇐⇒ ¬(¬(∀x ∈ X, P(x)))

⇐⇒ (∀x ∈ X, P(x)).

(Recall that for a proposition R, ¬(¬R) is equivalent to R.) Letting Q(x) = ¬P(x), this gives us

¬(∃x ∈ X so that Q(x)) ⇐⇒ (∀x ∈ X, ¬Q(x)).


Note: We are inserting phrases like “such that” to make our sentences more readable without changing their meanings.
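When X is a finite set, the two negation rules above correspond directly to Python's built-in `all` and `any`. The sketch below tests both equivalences for a few sample predicates on a finite range of integers; the predicates and the wrappers `forall` and `exists` are illustrative assumptions, and a finite check of course does not prove the rules for infinite X.

```python
def forall(X, P):
    """Truth value of: for all x in X, P(x)."""
    return all(P(x) for x in X)

def exists(X, P):
    """Truth value of: there exists x in X so that P(x)."""
    return any(P(x) for x in X)

X = range(-5, 6)
predicates = {
    "x*x >= x": lambda x: x * x >= x,   # holds for every x in X
    "x > 0": lambda x: x > 0,           # holds for some x in X but not all
    "x > 10": lambda x: x > 10,         # holds for no x in X
}

results = {}
for name, P in predicates.items():
    # not (forall x, P(x))  <=>  exists x so that not P(x)
    rule_forall = (not forall(X, P)) == exists(X, lambda x: not P(x))
    # not (exists x, P(x))  <=>  forall x, not P(x)
    rule_exists = (not exists(X, P)) == forall(X, lambda x: not P(x))
    results[name] = (rule_forall, rule_exists)

for name, outcome in results.items():
    print(name, outcome)
```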

Example: Recall that by definition, f : X → Y is injective if and only if

∀x1 ∈ X, ∀x2 ∈ X, x1 ≠ x2 =⇒ f(x1) ≠ f(x2).

Let P(x1) be the proposition that ∀x2 ∈ X, x1 ≠ x2 =⇒ f(x1) ≠ f(x2). (So f is injective if and only if ∀x1 ∈ X, P(x1).) We know that

¬(∀x1 ∈ X, P(x1)) is equivalent to (∃x1 ∈ X so that ¬P(x1)).

Now let Q(x1, x2) be the proposition x1 ≠ x2 =⇒ f(x1) ≠ f(x2). Then

¬P(x1) ⇐⇒ ¬(∀x2 ∈ X, Q(x1, x2))

⇐⇒ ∃x2 ∈ X so that ¬Q(x1, x2).

Also, using results from §2, we have

¬Q(x1, x2) ⇐⇒ [x1 ≠ x2 ∧ ¬(f(x1) ≠ f(x2))]

⇐⇒ [x1 ≠ x2 ∧ f(x1) = f(x2)].

Summarising, f : X → Y is not injective if and only if

∃x1 ∈ X, ∃x2 ∈ X so that x1 ≠ x2 ∧ f(x1) = f(x2).

Example: Suppose f : X → Y. By definition, we know f is surjective if and only if

∀y ∈ Y, ∃x ∈ X so that f(x) = y.

Let P(y) be the proposition ∃x ∈ X so that f(x) = y. Thus

f is not surjective ⇐⇒ ¬(∀y ∈ Y, P(y))

⇐⇒ ∃y ∈ Y such that ¬P(y)

⇐⇒ ∃y ∈ Y so that ¬(∃x ∈ X so that f(x) = y)

⇐⇒ ∃y ∈ Y so that [∀x ∈ X, ¬(f(x) = y)]

⇐⇒ ∃y ∈ Y so that [∀x ∈ X, f(x) ≠ y].
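The negated statements describe exactly what one must exhibit to show that a map is not injective or not surjective: a pair x1 ≠ x2 with f(x1) = f(x2), or an element y that is missed by f. For finite X and Y these witnesses can be searched for directly, as in the following sketch; the function names and the sample map are assumptions made for illustration.

```python
def non_injective_witness(f, X):
    """Return (x1, x2) with x1 != x2 and f(x1) == f(x2), or None if f is injective on X."""
    for x1 in X:
        for x2 in X:
            if x1 != x2 and f(x1) == f(x2):
                return (x1, x2)
    return None

def non_surjective_witness(f, X, Y):
    """Return some y in Y with f(x) != y for all x in X, or None if f maps X onto Y."""
    for y in Y:
        if all(f(x) != y for x in X):
            return y
    return None

X = [-2, -1, 0, 1, 2]
Y = [0, 1, 2, 3, 4]

def square(x):
    return x * x

print(non_injective_witness(square, X))       # (-2, 2)
print(non_surjective_witness(square, X, Y))   # 2
```

A return value of None from either function means no witness exists, that is, the map is injective (respectively surjective) on the given finite sets.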

Example: For every n ∈ Z+, suppose a_n ∈ R. Consider the following proposition:

∃c ∈ R so that ∀ε > 0, ∃N ∈ Z+ so that ∀n ≥ N, |a_n − c| < ε.

We negate this proposition in a series of steps so that each consecutive pair of propositions is clearly equivalent:

¬[∃c ∈ R so that ∀ε > 0, ∃N ∈ Z+ so that ∀n ≥ N, |a_n − c| < ε]

⇐⇒ ∀c ∈ R, ¬[∀ε > 0, ∃N ∈ Z+ so that ∀n ≥ N, |a_n − c| < ε]

⇐⇒ ∀c ∈ R, ∃ε > 0 so that ¬[∃N ∈ Z+ so that ∀n ≥ N, |a_n − c| < ε]

⇐⇒ ∀c ∈ R, ∃ε > 0 so that ∀N ∈ Z+, ¬[∀n ≥ N, |a_n − c| < ε]

⇐⇒ ∀c ∈ R, ∃ε > 0 so that ∀N ∈ Z+, ∃n ≥ N so that ¬[|a_n − c| < ε]

⇐⇒ ∀c ∈ R, ∃ε > 0 so that ∀N ∈ Z+, ∃n ≥ N so that |a_n − c| ≥ ε.

Note: With P,Q propositions, the proposition “P =⇒ Q” is equivalent to the proposition “if P then Q”. When we have a complex proposition involving quantifiers and an implication, it can be important to know where the word “if” belongs. Here we consider an example of this.

Example: Suppose A ⊆ R with A ≠ ∅. For L ∈ R, we say L is an upper bound for A if, ∀a ∈ A, a ≤ L. We say L ∈ R is a least upper bound for A if (1) L is an upper bound for A, and (2) if M ∈ R is an upper bound for A, then L ≤ M. Let P(M) be the proposition that M is an upper bound for A (so P(M) means that ∀a ∈ A, a ≤ M). Thus L is a least upper bound for A if and only if [P(L) ∧ (∀M ∈ R, P(M) =⇒ (L ≤ M))]. Notice that the quantifier on a ∈ A is part of the proposition P(M).

How can L fail to be a least upper bound for A? This can happen if L is not an upper bound for A, or if there is an upper bound M for A with M < L. More formally, we have

L is not a least upper bound for A

⇐⇒ ¬[P(L) ∧ (∀M ∈ R, P(M) =⇒ L ≤ M)]

⇐⇒ ¬P(L) ∨ ¬(∀M ∈ R, P(M) =⇒ L ≤ M)

⇐⇒ ¬P(L) ∨ (∃M ∈ R so that ¬(P(M) =⇒ L ≤ M))

⇐⇒ ¬P(L) ∨ (∃M ∈ R so that P(M) ∧ ¬(L ≤ M))

⇐⇒ ¬P(L) ∨ (∃M ∈ R so that P(M) ∧ (L > M))

⇐⇒ [∃a ∈ A so that a > L] ∨ [∃M ∈ R so that (∀a ∈ A, a ≤ M) ∧ (L > M)],

consistent with the discussion above. However, if we were to proceed mechanically without thought, we might assert

L is not a least upper bound for A

⇐⇒ ¬[(∀a ∈ A, a ≤ L) ∧ (∀M ∈ R, ∀a ∈ A, a ≤ M =⇒ L ≤ M)]

⇐⇒ (∃a ∈ A so that a > L) ∨ (∃M ∈ R, ∃a ∈ A so that (a ≤ M) ∧ (L > M)),

but this last proposition is not equivalent to “L is not a least upper bound for A”. The problem is that we could interpret “∀a ∈ A, a ≤ M =⇒ L ≤ M” as “∀a ∈ A, if a ≤ M then L ≤ M,” or as “if, ∀a ∈ A, a ≤ M, then L ≤ M.” Some texts try to avoid this confusion by writing “a ≤ M ∀a ∈ A =⇒ L ≤ M,” which can only be interpreted as “if, a ≤ M ∀a ∈ A, then L ≤ M.”
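A small concrete case makes the difference between the two negations visible. In the sketch below, A = {1, 2} and L = 2, so L really is the least upper bound of A; the correct negation evaluates to False, while the mechanically obtained one evaluates to True. The finite list `candidates` is an assumption standing in for “M ∈ R”, which cannot be enumerated.

```python
A = {1, 2}
L = 2                      # L is the least upper bound of A
candidates = range(0, 5)   # finite stand-in for "M in R" (an assumption)

def is_upper_bound(M):
    """P(M): for all a in A, a <= M."""
    return all(a <= M for a in A)

# Correct negation: L is not an upper bound, or some upper bound M has M < L.
correct = (not is_upper_bound(L)) or any(
    is_upper_bound(M) and L > M for M in candidates)

# Mechanical negation: the quantifier on a has escaped from P(M).
mechanical = any(a > L for a in A) or any(
    a <= M and L > M for M in candidates for a in A)

print(correct, mechanical)
```

The mechanical version is satisfied by M = 1 and a = 1, even though M = 1 is not an upper bound for A.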

Example: Let [0, 1) = {x ∈ R : 0 ≤ x < 1} and (0, 1) = {x ∈ R : 0 < x < 1}. Define f : [0, 1) → (0, 1) by

f(x) = 1 − 1/(n+1)   if x = 1 − 1/n for some n ∈ Z+,
f(x) = x             otherwise.

To understand this definition, we need to understand the condition “otherwise”:

¬[x = 1 − 1/n for some n ∈ Z+] ⇐⇒ ¬[∃n ∈ Z+ so that x = 1 − 1/n]

⇐⇒ [∀n ∈ Z+, ¬(x = 1 − 1/n)]

⇐⇒ [∀n ∈ Z+, x ≠ 1 − 1/n].
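To see the two cases of this definition in action, here is a minimal Python sketch of f restricted to rational inputs, using exact arithmetic from the standard `fractions` module. The restriction to rationals and the test “1/(1 − x) is an integer” for deciding whether x = 1 − 1/n are implementation choices made here, not part of the notes.

```python
from fractions import Fraction

def f(x):
    """The map f : [0, 1) -> (0, 1) above, for rational x only (an assumption)."""
    x = Fraction(x)
    if not (0 <= x < 1):
        raise ValueError("x must lie in [0, 1)")
    # x = 1 - 1/n  if and only if  n = 1/(1 - x); since 0 < 1 - x <= 1,
    # this n is at least 1, so only integrality needs to be tested.
    n = 1 / (1 - x)
    if n.denominator == 1:
        return 1 - Fraction(1, n.numerator + 1)
    return x   # the "otherwise" case: x != 1 - 1/n for all n in Z+

print(f(0), f(Fraction(1, 2)), f(Fraction(1, 3)))
```

Here 0 = 1 − 1/1 and 1/2 = 1 − 1/2 fall into the first case, while 1/3 is left fixed.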

Contrapositives of propositions with quantifiers. Suppose P(x), Q(x) are propositions involving x ∈ X where X is some set. We have seen that P(x) =⇒ Q(x) is equivalent to its contrapositive: ¬Q(x) =⇒ ¬P(x). Hence

[∀x ∈ X, (P(x) =⇒ Q(x))] is equivalent to [∀x ∈ X, (¬Q(x) =⇒ ¬P(x))].

Similarly,

[∃x ∈ X, (P(x) =⇒ Q(x))] is equivalent to [∃x ∈ X, (¬Q(x) =⇒ ¬P(x))].

This analysis extends to implications with multiple quantifiers; in the next theorem we discuss such a situation.

Theorem 3.1. Suppose f : X → Y . The map f is injective if and only if

[∀x1, x2 ∈ X, f(x1) = f(x2) =⇒ x1 = x2].

Proof. By definition,

[f is injective] ⇐⇒ [∀x1, x2 ∈ X, (x1 ≠ x2 =⇒ f(x1) ≠ f(x2))].

The contrapositive of the statement

x1 ≠ x2 =⇒ f(x1) ≠ f(x2)

is

¬(f(x1) ≠ f(x2)) =⇒ ¬(x1 ≠ x2),

or equivalently,

f(x1) = f(x2) =⇒ x1 = x2.

Thus

∀x1, x2 ∈ X, x1 ≠ x2 =⇒ f(x1) ≠ f(x2)

is equivalent to

∀x1, x2 ∈ X, f(x1) = f(x2) =⇒ x1 = x2,

which proves the theorem. □

Note: With f : X → Y , some texts define f to be injective if:

∀x1, x2 ∈ X, f(x1) = f(x2) =⇒ x1 = x2.

Since the above statement is equivalent to the definition given in §1, either can be used as the definition of injective. The definition in §1 is meant to capture more obviously that a map f is injective when it maps distinct elements of the domain to distinct elements of the codomain, but the above equivalent statement is often easier to use when proving a map is injective.

4. Set operations

Throughout this section, we rely on basic results from §2. Suppose that A, B are subsets of some set X.

Recall: A ∪ B denotes the union of A and B, meaning

A ∪ B = {x ∈ X : x ∈ A or x ∈ B}.

So for x ∈ X, x ∈ A ∪ B if and only if x ∈ A ∨ x ∈ B.

A ∩ B denotes the intersection of A and B, meaning

A ∩ B = {x ∈ X : x ∈ A and x ∈ B}.

So for x ∈ X, x ∈ A ∩ B if and only if x ∈ A ∧ x ∈ B. When A ∩ B = ∅ we say A and B are disjoint.

A ∖ B denotes the difference of A and B, meaning

A ∖ B = {x ∈ X : x ∈ A and x ∉ B}.

So for x ∈ X, x ∈ A ∖ B if and only if x ∈ A ∧ x ∉ B.

A^c denotes the complement of A, meaning

A^c = {x ∈ X : x ∉ A}.

We have the following simple result.

Theorem 4.1. Let X be a set, and for x ∈ X, let P(x) be the proposition that x satisfies condition P, and let Q(x) be the proposition that x satisfies condition Q. Set

A = {x ∈ X : P(x)}, B = {x ∈ X : Q(x)}.

Then

A ∩ B = {x ∈ X : P(x) ∧ Q(x)} and A ∪ B = {x ∈ X : P(x) ∨ Q(x)}.

Proof. For x ∈ X, we have x ∈ A if and only if P(x); similarly, x ∈ B if and only if Q(x). Thus

A ∩ B = {x ∈ X : x ∈ A ∧ x ∈ B}
      = {x ∈ X : P(x) ∧ Q(x)}

and

A ∪ B = {x ∈ X : x ∈ A ∨ x ∈ B}
      = {x ∈ X : P(x) ∨ Q(x)}.

□

Proposition 4.2. Suppose A,B,C are subsets of a set X.

(a) A ∩ (B ∩ C) = (A ∩ B) ∩ C.
(b) A ∪ (B ∪ C) = (A ∪ B) ∪ C.

(Thus the set operations ∪ and ∩ are associative.)

Proof. We prove (a) and leave (b) as an exercise.

Suppose x ∈ X. Let P be the proposition x ∈ A, Q the proposition x ∈ B, and R the proposition x ∈ C. Recall that P ∧ (Q ∧ R) ⇐⇒ (P ∧ Q) ∧ R. Thus:

x ∈ A ∩ (B ∩ C) ⇐⇒ (x ∈ A) ∧ (x ∈ B ∩ C)

⇐⇒ (x ∈ A) ∧ (x ∈ B ∧ x ∈ C)

⇐⇒ P ∧ (Q ∧ R)

⇐⇒ (P ∧ Q) ∧ R

⇐⇒ (x ∈ A ∧ x ∈ B) ∧ x ∈ C

⇐⇒ (x ∈ A ∩ B) ∧ (x ∈ C)

⇐⇒ x ∈ (A ∩ B) ∩ C.

Thus the elements of X that are in A ∩ (B ∩ C) are exactly the elements of X that are in (A ∩ B) ∩ C, so A ∩ (B ∩ C) = (A ∩ B) ∩ C. □


Theorem 4.3. Let A,B,C be subsets of a set X.

(a) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).
(b) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).

Proof. We prove (a) and leave (b) as an exercise.

Suppose x ∈ X. Let P be the proposition x ∈ A, Q the proposition x ∈ B, and R the proposition x ∈ C. Recall that P ∧ (Q ∨ R) ⇐⇒ (P ∧ Q) ∨ (P ∧ R). Then:

x ∈ A ∩ (B ∪ C) ⇐⇒ x ∈ A ∧ x ∈ B ∪ C

⇐⇒ x ∈ A ∧ (x ∈ B ∨ x ∈ C)

⇐⇒ P ∧ (Q ∨ R)

⇐⇒ (P ∧ Q) ∨ (P ∧ R)

⇐⇒ (x ∈ A ∧ x ∈ B) ∨ (x ∈ A ∧ x ∈ C)

⇐⇒ (x ∈ A ∩ B) ∨ (x ∈ A ∩ C)

⇐⇒ x ∈ (A ∩ B) ∪ (A ∩ C).

Thus the elements of A ∩ (B ∪ C) are exactly the elements of (A ∩ B) ∪ (A ∩ C), and hence A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C). □

Proposition 4.4. Suppose A,B are subsets of a set X.

(a) A ∖ B = A ∩ B^c.
(b) (A ∖ B)^c = A^c ∪ B.

Proof. We prove (a) and leave (b) as an exercise.

Suppose x ∈ X; then we have

x ∈ A ∖ B ⇐⇒ x ∈ A ∧ x ∉ B

⇐⇒ x ∈ A ∧ x ∈ B^c

⇐⇒ x ∈ A ∩ B^c.

Thus the elements of X that are in A ∖ B are exactly the elements of X that are in A ∩ B^c, so A ∖ B = A ∩ B^c. □


Theorem 4.5. (De Morgan’s Laws) Suppose A,B,C are subsets of a set X.

(a) A ∖ (B ∪ C) = (A ∖ B) ∩ (A ∖ C).
(b) A ∖ (B ∩ C) = (A ∖ B) ∪ (A ∖ C).
(c) (A ∩ B)^c = A^c ∪ B^c. (Thus for x ∈ X, x ∉ A ∩ B ⇐⇒ x ∉ A ∨ x ∉ B.)
(d) (A ∪ B)^c = A^c ∩ B^c. (Thus for x ∈ X, x ∉ A ∪ B ⇐⇒ x ∉ A ∧ x ∉ B.)

Proof. We prove (a), (d) and leave (b), (c) as exercises.


(a) Suppose x ∈ X. As an easy exercise using truth tables, one shows that with P,Q,R propositions, P ∧ (Q ∧ R) ⇐⇒ (P ∧ Q) ∧ (P ∧ R). Thus:

x ∈ A ∖ (B ∪ C) ⇐⇒ (x ∈ A) ∧ (x ∉ B ∪ C)

⇐⇒ (x ∈ A) ∧ ¬(x ∈ B ∪ C)

⇐⇒ (x ∈ A) ∧ ¬(x ∈ B ∨ x ∈ C)

⇐⇒ (x ∈ A) ∧ (x ∉ B ∧ x ∉ C)

⇐⇒ (x ∈ A ∧ x ∉ B) ∧ (x ∈ A ∧ x ∉ C)

⇐⇒ (x ∈ A ∖ B) ∧ (x ∈ A ∖ C)

⇐⇒ x ∈ (A ∖ B) ∩ (A ∖ C).

Thus the elements of A ∖ (B ∪ C) and (A ∖ B) ∩ (A ∖ C) are the same, meaning A ∖ (B ∪ C) = (A ∖ B) ∩ (A ∖ C).

(d) Suppose x ∈ X. Thus:

x ∈ (A ∪ B)^c ⇐⇒ ¬(x ∈ A ∪ B)

⇐⇒ ¬(x ∈ A ∨ x ∈ B)

⇐⇒ ¬(x ∈ A) ∧ ¬(x ∈ B)

⇐⇒ x ∈ A^c ∧ x ∈ B^c

⇐⇒ x ∈ A^c ∩ B^c.

Since x ∈ (A ∪ B)^c if and only if x ∈ A^c ∩ B^c, we have (A ∪ B)^c = A^c ∩ B^c. (Note that we have also shown that x ∈ (A ∪ B)^c ⇐⇒ x ∈ A^c ∧ x ∈ B^c, so x ∉ A ∪ B ⇐⇒ x ∉ A ∧ x ∉ B.)

ALTERNATIVELY: Suppose x ∈ X. Then, using (a) applied to the sets X, A, B in place of A, B, C, we have

x ∈ (A ∪ B)^c ⇐⇒ x ∈ X ∖ (A ∪ B)

⇐⇒ x ∈ [(X ∖ A) ∩ (X ∖ B)]

⇐⇒ x ∈ A^c ∩ B^c.

Since x ∈ (A ∪ B)^c if and only if x ∈ A^c ∩ B^c, we have (A ∪ B)^c = A^c ∩ B^c. □
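The set identities of this section, including the parts left as exercises, can be spot-checked with Python's built-in set operations (`&` for ∩, `|` for ∪, `-` for difference). The sketch below runs through every triple of subsets of a three-element set X; the choice of X and the helper `comp` for the complement are assumptions made here, and passing this check is evidence rather than proof.

```python
from itertools import combinations, product

X = frozenset({0, 1, 2})
# All 8 subsets of X.
subsets = [frozenset(c)
           for k in range(len(X) + 1)
           for c in combinations(sorted(X), k)]

def comp(S):
    """Complement of S inside the ambient set X."""
    return X - S

all_hold = True
for A, B, C in product(subsets, repeat=3):
    all_hold = all_hold and all([
        A & (B | C) == (A & B) | (A & C),     # Theorem 4.3(a)
        A | (B & C) == (A | B) & (A | C),     # Theorem 4.3(b)
        A - B == A & comp(B),                 # Proposition 4.4(a)
        comp(A - B) == comp(A) | B,           # Proposition 4.4(b)
        A - (B | C) == (A - B) & (A - C),     # Theorem 4.5(a)
        A - (B & C) == (A - B) | (A - C),     # Theorem 4.5(b)
        comp(A & B) == comp(A) | comp(B),     # Theorem 4.5(c)
        comp(A | B) == comp(A) & comp(B),     # Theorem 4.5(d)
    ])

print(len(subsets), all_hold)
```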

Notation: It is often convenient to denote the elements of a set using indices, or subscripts. For example, suppose A is a set with 5 elements; we can denote these elements as a_1, a_2, a_3, a_4, a_5. Then we can write

A = {a_i : i ∈ I} where I = {1, 2, 3, 4, 5};

here I is called an indexing set. This notation is particularly useful when dealing with infinite sets. For instance, we will see that there are infinitely many primes within the set of integers; ordering the primes in increasing order, let p_i denote the ith prime where i ∈ Z+. Then

{p_i : i ∈ Z+}

denotes the set of all primes. Alternatively, we sometimes denote this set with the notation {p_i}_{i∈Z+}.

Let {A_i}_{i∈I} be a collection of subsets of a set X where I is an indexing set. Then we write ∪_{i∈I} A_i to denote the union of all the sets A_i, i ∈ I. That is,

∪_{i∈I} A_i = {x ∈ X : ∃i ∈ I so that x ∈ A_i}.

Somewhat similarly, we write ∩_{i∈I} A_i to denote the intersection of all the sets A_i, i ∈ I. That is,

∩_{i∈I} A_i = {x ∈ X : ∀i ∈ I, x ∈ A_i}.

Proposition 4.6. Let X be a set with subset A, and an indexed collection of subsets {B_i}_{i∈I}, where I is an indexing set. Then we have:

(a) A ∖ ∩_{i∈I} B_i = ∪_{i∈I} (A ∖ B_i).
(b) A ∖ ∪_{i∈I} B_i = ∩_{i∈I} (A ∖ B_i).

Proof. We prove (a) and leave (b) as an exercise.

We know x ∈ ∩_{i∈I} B_i if and only if ∀i ∈ I, x ∈ B_i. So ¬(x ∈ ∩_{i∈I} B_i) if and only if ∃i ∈ I so that x ∉ B_i.

Suppose x ∈ A ∖ ∩_{i∈I} B_i. Then x ∈ A, and for some i ∈ I, we have x ∉ B_i. So for some i ∈ I, x ∈ A ∖ B_i. Thus x ∈ ∪_{i∈I} (A ∖ B_i). This shows that A ∖ ∩_{i∈I} B_i ⊆ ∪_{i∈I} (A ∖ B_i).

Now suppose that x ∈ ∪_{i∈I} (A ∖ B_i). Thus for some i ∈ I, we have x ∈ A ∖ B_i. So for some i ∈ I, x ∈ A and x ∉ B_i. Since ∃i ∈ I so that x ∉ B_i, we have x ∉ ∩_{i∈I} B_i. Thus x ∈ A ∖ ∩_{i∈I} B_i. This shows that ∪_{i∈I} (A ∖ B_i) ⊆ A ∖ ∩_{i∈I} B_i. Together with the result of the preceding paragraph, we get A ∖ ∩_{i∈I} B_i = ∪_{i∈I} (A ∖ B_i). □
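An indexed family of sets can be modelled in Python as a dictionary from indices to sets, with the quantifiers in the definitions of union and intersection becoming loops over the family. The following sketch checks both parts of Proposition 4.6 on one made-up family; the sets X, A and B below are arbitrary sample data, and `big_union` and `big_intersection` are hypothetical helper names.

```python
X = set(range(12))
A = set(range(8))
# Indexed family {B_i : i in I} with indexing set I = {1, 2, 3}.
B = {1: {0, 1, 2, 3}, 2: {2, 3, 4, 5}, 3: {3, 5, 7, 9}}

def big_union(sets):
    """Elements of X lying in at least one of the given sets."""
    result = set()
    for S in sets:
        result |= S
    return result

def big_intersection(sets):
    """Elements of X lying in every one of the given sets."""
    result = set(X)   # start from the ambient set X
    for S in sets:
        result &= S
    return result

lhs_a = A - big_intersection(B.values())
rhs_a = big_union(A - B[i] for i in B)
lhs_b = A - big_union(B.values())
rhs_b = big_intersection(A - B[i] for i in B)

print(lhs_a == rhs_a, lhs_b == rhs_b)
```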

Theorem 4.7. Suppose f : X → Y and X = U ∪ V. Then f(X) = f(U) ∪ f(V). Further, if f is injective and U ∩ V = ∅, then f(U) ∩ f(V) = ∅.

Proof. Since U, V ⊆ X, clearly f(U), f(V) ⊆ f(X), so f(U) ∪ f(V) ⊆ f(X). On the other hand, take x ∈ X. Then x ∈ U or x ∈ V, so f(x) ∈ f(U) or f(x) ∈ f(V). Therefore f(x) ∈ f(U) ∪ f(V); as this holds for all x ∈ X, we have f(X) ⊆ f(U) ∪ f(V). Hence f(X) = f(U) ∪ f(V).

Now suppose f is injective and U ∩ V = ∅. For the sake of contradiction, suppose there is some y ∈ f(U) ∩ f(V). Thus there is some u ∈ U so that y = f(u), and there is some v ∈ V so that y = f(v). Hence f(u) = y = f(v). Since f is injective, we have u = v. Hence u ∈ U ∩ V [as u = v and v ∈ V], contradicting the assumption that U ∩ V = ∅. Thus there cannot be any y ∈ f(U) ∩ f(V), meaning f(U) ∩ f(V) = ∅. □

Definition. Suppose f : X → Y, V ⊆ Y. We define the inverse image of V under f as

f⁻¹(V) = {x ∈ X : f(x) ∈ V}.

Note that f⁻¹(∅) = ∅.

Warning: This notation does not mean f⁻¹ is necessarily a function!

Example: Say f : R × R → R is defined by f((x, y)) = 2x − 5y. Then, from linear algebra, the “kernel” of f is

f⁻¹({0}) = {(x, y) ∈ R × R : f((x, y)) ∈ {0}}
         = {(x, y) ∈ R × R : 2x − 5y = 0}
         = {(x, y) ∈ R × R : y = 2x/5}
         = {(x, 2x/5) : x ∈ R}.


Example: Suppose still that f : R × R → R is defined by f((x, y)) = 2x − 5y. Let V = (0, 1), an open interval in R. Then

f⁻¹(V) = {(x, y) ∈ R × R : f((x, y)) ∈ V}
       = {(x, y) ∈ R × R : 2x − 5y ∈ (0, 1)}
       = {(x, y) ∈ R × R : 0 < 2x − 5y < 1}
       = {(x, y) ∈ R × R : (5/2)y < x < (5/2)y + 1/2}.

So we can also describe f⁻¹(V) as

f⁻¹(V) = {((5/2)y + ε, y) : ε, y ∈ R, 0 < ε < 1/2}.

Example: Define g : R → R by g(x) = |x^3|. Take V = [4, ∞). Then

g⁻¹(V) = {x ∈ R : g(x) ∈ V}
       = {x ∈ R : |x^3| ∈ [4, ∞)}
       = {x ∈ R : x^3 ≥ 4 ∨ −x^3 ≥ 4}
       = {x ∈ R : x ≥ ∛4 ∨ x ≤ ∛(−4)}
       = (−∞, ∛(−4)] ∪ [∛4, ∞).

Theorem 4.8. Let f : X → Y, and let U ⊆ X, V ⊆ Y. Then we have:

(a) f(f⁻¹(V)) ⊆ V, and when f is surjective, f(f⁻¹(V)) = V.
(b) U ⊆ f⁻¹(f(U)), and when f is injective, U = f⁻¹(f(U)).

Proof. We prove (a) and leave (b) as an exercise.

If V = ∅, then f⁻¹(V) = ∅ and f(f⁻¹(V)) = ∅ = V. So suppose V ≠ ∅.

Choose y ∈ f(f⁻¹(V)). Thus y = f(w) for some w ∈ f⁻¹(V). By the definition of f⁻¹(V), we have f(w) ∈ V. Hence y = f(w) ∈ V. Since y was chosen arbitrarily from f(f⁻¹(V)), this shows that every element of f(f⁻¹(V)) lies in V, i.e. f(f⁻¹(V)) ⊆ V.

Now suppose f is surjective. We have already established that f(f⁻¹(V)) ⊆ V, so to show f(f⁻¹(V)) = V, we need to show V ⊆ f(f⁻¹(V)). Suppose v ∈ V. Since f is surjective, ∃x ∈ X so that f(x) = v. Thus f(x) ∈ V, so x ∈ f⁻¹(V). Hence v = f(x) ∈ f(f⁻¹(V)). Since v was chosen arbitrarily from V, this shows V ⊆ f(f⁻¹(V)). Together with f(f⁻¹(V)) ⊆ V, this shows that f(f⁻¹(V)) = V [under the assumption that f is surjective]. □

Theorem 4.9. Suppose f : X → Y and V1, V2 ⊆ Y. Then

f⁻¹(V1 ∩ V2) = f⁻¹(V1) ∩ f⁻¹(V2)

and

f⁻¹(V1 ∪ V2) = f⁻¹(V1) ∪ f⁻¹(V2).

Proof. We prove the first statement and leave the second as an exercise.

We have

f⁻¹(V1 ∩ V2) = {x ∈ X : f(x) ∈ V1 ∩ V2}
             = {x ∈ X : f(x) ∈ V1 ∧ f(x) ∈ V2}.

On the other hand,

f⁻¹(V1) ∩ f⁻¹(V2) = {x ∈ X : f(x) ∈ V1} ∩ {x ∈ X : f(x) ∈ V2}
                  = {x ∈ X : f(x) ∈ V1 ∧ f(x) ∈ V2}.

Therefore f⁻¹(V1 ∩ V2) = f⁻¹(V1) ∩ f⁻¹(V2).

Alternatively, one could present this argument as follows:

f⁻¹(V1 ∩ V2) = {x ∈ X : f(x) ∈ V1 ∩ V2}
             = {x ∈ X : f(x) ∈ V1 ∧ f(x) ∈ V2}
             = {x ∈ X : f(x) ∈ V1} ∩ {x ∈ X : f(x) ∈ V2}
             = f⁻¹(V1) ∩ f⁻¹(V2).

□
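For a map between finite sets, images and inverse images are easy to compute, which makes Theorems 4.8 and 4.9 concrete. In the sketch below the map is x ↦ x·x on a small set of integers, chosen because it is neither injective nor surjective, so both inclusions in Theorem 4.8 are strict; the sets and the helper names `image` and `preimage` are assumptions made for illustration.

```python
def image(f, U):
    """f(U) = {f(x) : x in U}."""
    return {f(x) for x in U}

def preimage(f, X, V):
    """Inverse image of V under f : X -> Y, i.e. {x in X : f(x) in V}."""
    return {x for x in X if f(x) in V}

X = set(range(-3, 4))
Y = set(range(0, 10))

def f(x):
    return x * x          # maps X into Y; neither injective nor surjective

V = {1, 2, 4}
U = {1, 2}
inner = image(f, preimage(f, X, V))      # f(f^{-1}(V))
outer = preimage(f, X, image(f, U))      # f^{-1}(f(U))
print(inner, inner <= V)                 # Theorem 4.8(a): a subset of V
print(outer, U <= outer)                 # Theorem 4.8(b): contains U

V1, V2 = {0, 1, 4}, {4, 9}
print(preimage(f, X, V1 & V2) == preimage(f, X, V1) & preimage(f, X, V2))
print(preimage(f, X, V1 | V2) == preimage(f, X, V1) | preimage(f, X, V2))
```

Here f(f⁻¹(V)) omits 2 because no element of X squares to 2, and f⁻¹(f(U)) picks up −1 and −2 because f is not injective.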

5. Partitioning sets, equivalence relations, and congruences

According to standard usage of English, partitioning a set means we break it into non-overlapping pieces. More precisely, we have the following.

Definition. A partition of a nonempty set X is a collection {A_i : i ∈ I} of nonempty subsets of X so that

(1) ∀x ∈ X, ∃i ∈ I so that x ∈ A_i;
(2) ∀x ∈ X, ∀i, j ∈ I, if x ∈ A_i ∧ x ∈ A_j then A_i = A_j.

(In some texts the subsets A_i are called blocks of the partition.)

Example: Let X = {1, 2, 3, 4, 5, 6}. Then

{{1}, {2, 3}, {4, 5, 6}}

is a partition of X. Another partition of X is

{{1, 2, 3}, {4, 6}, {5}}.

Partitions of sets are inextricably linked to “equivalence relations”; to define these, we first need some other definitions.

Definitions. A relation ∼ on a nonempty set X corresponds to a subset R∼ of X × X; we write x ∼ y when (x, y) ∈ R∼, and we say x is related to y. Given a relation ∼ on X, we say:

(1) ∼ is reflexive if: ∀x ∈ X, we have x ∼ x;
(2) ∼ is symmetric if: ∀x, y ∈ X, x ∼ y =⇒ y ∼ x;
(3) ∼ is transitive if: ∀x, y, z ∈ X, (x ∼ y ∧ y ∼ z) =⇒ x ∼ z.

A relation is an equivalence relation if it is reflexive, symmetric, and transitive.

Example: Let T be the set of all triangles in R × R. For t1, t2 ∈ T, consider the following relation: t1 ∼ t2 if t1 is similar to t2 (meaning there is a correspondence between the interior angles of t1 and the interior angles of t2 so that corresponding angles are equal). Then ∼ is an equivalence relation (check!).

Example: Let X = Z, and let R∼ = {(x, x) : x ∈ Z}. So ∀x ∈ Z, x ∼ x (so ∼ is reflexive). We claim that ∼ is an equivalence relation on Z: We already noted ∼ is reflexive. Suppose x, y ∈ Z so that x ∼ y. Thus (x, y) ∈ R∼, so x = y. Hence (y, x) = (x, x) ∈ R∼, so y ∼ x. Thus ∼ is symmetric. Suppose x, y, z ∈ Z so that x ∼ y and y ∼ z. Thus x = y, and y = z, so x = y = z. Hence (x, z) = (x, x) ∈ R∼, so x ∼ z. Thus ∼ is transitive. So ∼ is an equivalence relation.

Note: If ∼ is an equivalence relation on some nonempty set X, then we necessarily have

{(x, x) : x ∈ X} ⊆ R∼

since ∼ is reflexive.
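On a finite set, a relation can be stored literally as the set R∼ of ordered pairs, and the three defining properties then become exhaustive checks. The sketch below tests the diagonal relation from the example above (restricted to a finite range), a “same parity” relation, and the same relation with the diagonal removed; the latter two relations and all function names are illustrative assumptions.

```python
def is_reflexive(X, R):
    """For all x in X, (x, x) is in R."""
    return all((x, x) in R for x in X)

def is_symmetric(R):
    """Whenever (x, y) is in R, so is (y, x)."""
    return all((y, x) in R for (x, y) in R)

def is_transitive(R):
    """Whenever (x, y) and (y, z) are in R, so is (x, z)."""
    return all((x, z) in R
               for (x, y) in R
               for (w, z) in R if y == w)

def properties(X, R):
    return (is_reflexive(X, R), is_symmetric(R), is_transitive(R))

X = range(-3, 4)
diagonal = {(x, x) for x in X}
same_parity = {(x, y) for x in X for y in X if (x - y) % 2 == 0}
no_diagonal = same_parity - diagonal   # same parity, but x is never related to x

print(properties(X, diagonal))
print(properties(X, same_parity))
print(properties(X, no_diagonal))
```

The third relation is symmetric but neither reflexive nor transitive: from x ∼ y and y ∼ x transitivity would force x ∼ x.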

Example: Define a relation ∼ on Z by x ∼ y if x < y. So ∼ is not reflexive, as there are x ∈ Z so that ¬(x ∼ x); in particular, 1 ∈ Z and ¬(1 < 1) so ¬(1 ∼ 1). Also, ∼ is not symmetric, as there are x, y ∈ Z so that x ∼ y but ¬(y ∼ x); in particular, 2, 3 ∈ Z and 2 < 3 so 2 ∼ 3, but ¬(3 < 2) so ¬(3 ∼ 2). However, ∼ is transitive: Suppose x, y, z ∈ Z so that x ∼ y and y ∼ z. Thus x < y and y < z, so x < y < z. Hence x < z, so x ∼ z.

Definition. Suppose ∼ is an equivalence relation on a (nonempty) set X. For x ∈ X, we define

[x]∼ = {y ∈ X : y ∼ x},

and we call [x]∼ the equivalence class of x (relative to the relation ∼).

Proposition 5.1. Suppose ∼ is an equivalence relation on a (nonempty) set X. For any x, y ∈ X, [x]∼ ≠ [y]∼ if and only if [x]∼ ∩ [y]∼ = ∅.

Proof. To ease notation, for x ∈ X let us temporarily write [x] for [x]∼. Take x, y ∈ X. We need to prove

(1) [x] ≠ [y] =⇒ [x] ∩ [y] = ∅, and
(2) [x] ∩ [y] = ∅ =⇒ [x] ≠ [y].

To do this, we will prove the contrapositive of each statement:

(1) [x] ∩ [y] ≠ ∅ =⇒ [x] = [y], and
(2) [x] = [y] =⇒ [x] ∩ [y] ≠ ∅.

To prove (1): Suppose [x] ∩ [y] ≠ ∅. Thus there is some z ∈ [x] ∩ [y]. Hence z ∈ [x], so z ∼ x; similarly, z ∈ [y], so z ∼ y. Since ∼ is symmetric, we have x ∼ z; since ∼ is transitive, we have x ∼ y. Now choose w ∈ [x]; thus w ∼ x, and since x ∼ y and ∼ is transitive, w ∼ y. Hence w ∈ [y]; as this holds for all w ∈ [x], we have [x] ⊆ [y]. A virtually identical argument shows that for any w ∈ [y] we have w ∈ [x], so [y] ⊆ [x]. Hence [x] = [y].

To prove (2): Suppose [x] = [y]. We know x ∈ [x] as ∼ is reflexive and so x ∼ x. Hence x ∈ [x] = [x] ∩ [y], so [x] ∩ [y] ≠ ∅. □

Theorem 5.2. Suppose ∼ is an equivalence relation on a (nonempty) set X. Then

Π = {[x]∼ : x ∈ X}

is a partition of X.

Proof. To ease notation, for x ∈ X let us temporarily write [x] for [x]∼.

Take a ∈ X. Since ∼ is reflexive, a ∼ a, so a ∈ [a]; also [a] ∈ Π. Hence every element of X is in one of the sets in Π, and every set in Π is nonempty.

Now suppose that for a ∈ X, we have a ∈ [x] and a ∈ [y] where x, y ∈ X. Then [x] ∩ [y] ≠ ∅, so by the preceding proposition we have [x] = [y]. Thus Π is a partition of X. □

On the other hand, we have the following.

Theorem 5.3. Suppose Π = {A_i : i ∈ I} is a partition of a (nonempty) set X (so I is an indexing set). For x, y ∈ X, define x ∼ y if ∃i ∈ I so that x, y ∈ A_i. Then ∼ is an equivalence relation on X.

Proof. We first show ∼ is reflexive: Take x ∈ X. Since Π is a partition of X, there is some i ∈ I so that x ∈ A_i. Thus x ∼ x.

Next we show ∼ is symmetric: Suppose x, y ∈ X so that x ∼ y. Thus there is some i ∈ I so that x, y ∈ A_i. Hence y, x ∈ A_i, so y ∼ x.

Finally, we show ∼ is transitive: Suppose x, y, z ∈ X so that x ∼ y and y ∼ z. Thus there is some i ∈ I so that x, y ∈ A_i and some j ∈ I so that y, z ∈ A_j. Hence y ∈ A_i and y ∈ A_j; since Π is a partition, we must have A_i = A_j. Thus x, z ∈ A_i so x ∼ z.

This shows ∼ is an equivalence relation on X. �
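Theorems 5.2 and 5.3 together say that equivalence relations and partitions carry the same information. The following sketch illustrates the round trip on a finite set: it builds the equivalence classes of a sample relation, checks the two partition conditions, and then recovers the relation from the partition. The sample relation (x ∼ y when x − y is divisible by 3) and the names used are assumptions chosen for illustration.

```python
X = range(12)

def related(x, y):
    """Sample equivalence relation: x ~ y when x - y is divisible by 3."""
    return (x - y) % 3 == 0

# Theorem 5.2: the set of equivalence classes [x] = {y in X : y ~ x}.
classes = {frozenset(y for y in X if related(y, x)) for x in X}

# Partition condition (1): every element of X lies in some class.
covers = set().union(*classes) == set(X)
# Partition condition (2): two classes sharing an element are equal.
disjoint = all(A == B or not (A & B) for A in classes for B in classes)

# Theorem 5.3: define x ~ y when some block of the partition contains both.
def related_via_partition(x, y):
    return any(x in A and y in A for A in classes)

recovers = all(related(x, y) == related_via_partition(x, y)
               for x in X for y in X)

print(len(classes), covers, disjoint, recovers)
```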

Congruences. Here we present an explicit and fundamental example of an equivalence relation on Z. We begin with a familiar definition.

Definition. For x, y ∈ Z, we say x divides y if ∃z ∈ Z so that y = xz. We write x|y to denote “x divides y”. Similarly, we write x ∤ y to denote “x does not divide y”, meaning that ∀z ∈ Z, y ≠ xz.

Fix n ∈ Z+. We define a relation on Z as follows: For a, b ∈ Z, we write a ≡ b (mod n) if n|a − b. When a ≡ b (mod n), we say a is congruent to b modulo n. We leave it as an exercise to show that this relation is in fact an equivalence relation on Z. So when a, b ∈ Z and n ∈ Z+ with a ≡ b (mod n), then a and b are in the same congruence class modulo n.

This is a particularly interesting equivalence relation because of the following.

Theorem 5.4. Fix n ∈ Z+. Suppose a, b, c, d ∈ Z so that a ≡ c (mod n), b ≡ d (mod n). Then

a + b ≡ c + d (mod n), ab ≡ cd (mod n).

Proof. By assumption, we have n|a − c and n|b − d. Thus for some x, y ∈ Z, we have a − c = nx and b − d = ny. Hence

(a + b) − (c + d) = (a − c) + (b − d) = nx + ny = n(x + y).

Since x + y ∈ Z, this means n|(a + b) − (c + d), so a + b ≡ c + d (mod n). Also, since a = c + nx and b = d + ny, we have

ab = (c + nx)(d + ny) = cd + n(cy + dx + nxy)

and hence ab − cd = n(cy + dx + nxy). Since cy + dx + nxy ∈ Z, we have n|ab − cd, so ab ≡ cd (mod n). □

This result helps simplify many computations modulo a positive integer n.


Example: We compute 3^5 + 2^8 (mod 7) without working unnecessarily hard.

We have 3^2 ≡ 9 ≡ 2 (mod 7). So 3^4 = 3^2 · 3^2 ≡ 2 · 2 ≡ 4 (mod 7). Hence

3^5 ≡ 3^4 · 3 ≡ 12 ≡ 5 (mod 7).

Somewhat similarly, 2^3 ≡ 8 ≡ 1 (mod 7), so

2^6 ≡ 2^3 · 2^3 ≡ 1 · 1 ≡ 1 (mod 7).

So 2^8 ≡ 2^6 · 2^2 ≡ 1 · 4 ≡ 4 (mod 7). Hence

3^5 + 2^8 ≡ 5 + 4 ≡ 2 (mod 7).
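The computation above can be confirmed in Python, where `%` gives the remainder and the three-argument form `pow(base, exponent, modulus)` reduces modulo n as it goes, in the spirit of Theorem 5.4. The second half of the sketch tests Theorem 5.4 itself on a small range of integers; the ranges are arbitrary choices made here.

```python
n = 7

# Direct evaluation versus reducing each power modulo n first.
direct = (3**5 + 2**8) % n
stepwise = (pow(3, 5, n) + pow(2, 8, n)) % n
print(direct, stepwise)

# Theorem 5.4 on a small range: if a ≡ c and b ≡ d (mod n), then
# a + b ≡ c + d and ab ≡ cd (mod n).  Here c and d are obtained by
# shifting a and b by multiples of n.
theorem_holds = all(
    (a + b) % n == (c + d) % n and (a * b) % n == (c * d) % n
    for a in range(-10, 11)
    for b in range(-10, 11)
    for c in (a - n, a + 2 * n)
    for d in (b + n, b - 3 * n)
)
print(theorem_holds)
```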

6. Algorithms, recursion, and mathematical induction

An algorithm is a logical step-by-step procedure for solving a problem in a finite number of steps. Many algorithms are recursive, meaning that after one or more initial steps, a general method is given for determining each subsequent step on the basis of steps already taken.

As an example of a recursive algorithm, we discuss Euclid’s algorithm for finding the highest common factor of two nonzero integers.

First, recall that we have a “division algorithm” for a, b ∈ Z+:

Theorem 6.1. Suppose a, b ∈ Z+. Then ∃!q, r ∈ Z so that b = aq + r where 0 ≤ r < a.

Proof. Consider the set A = {u ∈ Z : au ≤ b}. Since 0 ∈ A, we know A is nonempty, and since b ≤ ab, A is bounded above; hence we can choose q to be the maximal element in A. [Thus q is the largest integer so that aq ≤ b.] Set r = b − aq. So b = aq + r with 0 ≤ r < a. [If r ≥ a, then we would have a(q + 1) ≤ b, contrary to our choice of q.] Note also that q, r are the unique integers so that b = aq + r with 0 ≤ r < a. To see this, suppose q′, r′ ∈ Z so that

b = aq′ + r′ with 0 ≤ r′ < a.

Thus aq + r = aq′ + r′, so a(q − q′) = r′ − r. Since 0 ≤ r < a and 0 ≤ r′ < a, we have −a < r′ − r < a. Note that 0 is the only integer strictly between −a and a that is divisible by a. Since q − q′ is an integer with a(q − q′) = r′ − r, we must have r′ − r = 0 and q − q′ = 0, meaning r′ = r and q′ = q. Hence there are unique q, r ∈ Z so that b = aq + r with 0 ≤ r < a. □

Note: An immediate consequence is that with n ∈ Z+, ∀b ∈ Z, ∃!r ∈ Z so that b ≡ r (mod n) with 0 ≤ r < n. Hence Z is partitioned into n congruence classes modulo n. For n ≥ 3, these congruence classes are

{a ∈ Z : a ≡ 0 (mod n)},
{a ∈ Z : a ≡ 1 (mod n)},
{a ∈ Z : a ≡ 2 (mod n)},
{a ∈ Z : a ≡ 3 (mod n)},
...
{a ∈ Z : a ≡ n − 1 (mod n)}.

Equivalently, for n ≥ 3 these congruence classes are

nZ, 1 + nZ, 2 + nZ, 3 + nZ, . . . , (n − 1) + nZ.


Proposition 6.2. Suppose a, b, c ∈ Z+. Then ∃!q, r ∈ Z so that b = aq + r with c ≤ r < a + c.

Remark: We can extend the division algorithm to show that for a, b ∈ Z with a ≠ 0, there exist unique q, r ∈ Z so that b = aq + r with 0 ≤ r < |a|.
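In Python, the extended division algorithm of the Remark can be sketched with the `%` and `//` operators. Some care is needed because Python's `%` takes the sign of the divisor, so for negative a one reduces modulo |a| first; the function name `divide` and this way of handling signs are choices made here, not part of the notes.

```python
def divide(b, a):
    """Return (q, r) with b == a*q + r and 0 <= r < |a|, for integers a != 0 and b."""
    if a == 0:
        raise ValueError("a must be nonzero")
    r = b % abs(a)        # with a positive modulus, Python gives 0 <= r < |a|
    q = (b - r) // a      # exact division, since a divides b - r
    return q, r

print(divide(17, 5))      # (3, 2)
print(divide(-17, 5))     # (-4, 3)
print(divide(17, -5))     # (-3, 2)
```

Note that for b = −17 and a = 5 the quotient is −4 rather than −3, because the remainder is required to be non-negative.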

Definitions. With a, b, c ∈ Z, c is a common divisor of a and b if c|a and c|b. Note that 1 is always a common divisor of a and b, and if a ≠ 0, no integer larger than |a| can be a common divisor of a and b. Also note that every x ∈ Z is a divisor of 0, as 0 = 0 · x. With a, b ∈ Z, a, b not both 0, we write hcf(a, b) (or equivalently, gcd(a, b)) to denote the highest common factor (or equivalently, greatest common divisor) of a and b, meaning hcf(a, b) is the largest common divisor of a and b.

(For a, b ∈ Z, not both 0, let C be the set of common divisors of a and b that are positive. So

C = {d ∈ Z+ : d|a and d|b}.

C ≠ ∅ since 1 ∈ C. Let M be the maximum of |a| and |b|. Then no integer larger than M is a common divisor of a and b, so C is bounded above by M. Thus C has a maximal element, and this is hcf(a, b), which is positive.)

When hcf(a, b) = 1, we say a, b are relatively prime.

Note that hcf(0, 0) does not exist, since every integer is a divisor of 0 (for x ∈ Z, 0 = x · 0 so x|0). For a ∈ Z with a ≠ 0, hcf(a, 0) = |a|.

As an exercise, one proves the following.

Proposition 6.3. Suppose a, b ∈ Z+ and c = hcf(a, b). Take x, y ∈ Z so that a = cx, b = cy. Then hcf(x, y) = 1.

Theorem 6.4. Take a, b ∈ Z so that a, b are not both 0, and let c = hcf(a, b). Then there exist s, t ∈ Z so that c = as + bt.

Proof. Let d be the minimum value in the set

A = {au + bv : (u, v ∈ Z) ∧ (au + bv > 0)}.

(One checks that this subset of Z is nonempty, and it is bounded below by 0, so it has a minimum value.) Take s, t ∈ Z so that d = as + bt. Note that c|d since c|a and c|b [so a = cx, b = cy for some x, y ∈ Z, and thus d = as + bt = c(xs + yt)]. Hence c ≤ d.

Take q, r ∈ Z so that a = dq + r with 0 ≤ r < d. So

r = a − dq = a − (as + bt)q = a(1 − sq) + b(−tq).

If r > 0 then r ∈ A with r < d, contrary to how we chose d. Hence we must have r = 0, which means d|a. A virtually identical argument shows that d|b, so d is a common divisor of a and b. As c = hcf(a, b), we have d ≤ c.

Hence c ≤ d and d ≤ c, which means c = d. So hcf(a, b) = c = d = as + bt. □


Remark: Suppose c = hcf(a, b) = as + bt where a, b, s, t ∈ Z with a, b not both 0. Thus ∃a′, b′ ∈ Z so that a = ca′, b = cb′. Then one can show that for any k ∈ Z, we have c = a(s + b′k) + b(t − a′k), and if s′, t′ ∈ Z so that as′ + bt′ = c then s′ = s + b′k, t′ = t − a′k for some k ∈ Z.


Proposition 6.5. Suppose a, b, c ∈ Z so that c ≠ 0, c|ab, and hcf(b, c) = 1. Then c|a.

Note that for a, b ∈ Z, a, b not both 0, the proof of Theorem 6.4 shows the existence of some s, t ∈ Z so that hcf(a, b) = as + bt, but it does not tell us the actual values of s and t. Euclid’s algorithm will produce such values s, t; to help us prove this, we need the following, whose proof is left as an exercise.

Proposition 6.6. Suppose a, b, x ∈ Z, with a, b not both 0. Then hcf(|a|, |b|) = hcf(a, b) = hcf(b, a + bx).

Euclid’s algorithm. Take a, b ∈ Z with b ≠ 0; we will first compute hcf(a, b) and then we will construct s, t ∈ Z so that hcf(a, b) = as + bt.

Step 1: Choose q1, r1 ∈ Z so that a = bq1 + r1 with 0 ≤ r1 < |b|. If r1 = 0 then we stop; otherwise we continue.

Step 2: Choose q2, r2 ∈ Z so that b = r1q2 + r2 with 0 ≤ r2 < r1. If r2 = 0 then we stop; otherwise we continue.

Step k (k ≥ 3): Choose qk, rk ∈ Z so that rk−2 = rk−1qk + rk with 0 ≤ rk < rk−1. If rk = 0 then we stop; otherwise we continue.

Notice that after k steps, we have |b| > r1 > r2 > · · · > rk ≥ 0. Thus after at most |b| steps, the algorithm must terminate.

If the algorithm terminates after 1 step, then hcf(a, b) = |b|, and we know

\[ |b| = \begin{cases} a \cdot 0 + b \cdot 1 & \text{if } b > 0, \\ a \cdot 0 + b \cdot (-1) & \text{if } b < 0. \end{cases} \]

So suppose the algorithm terminates after n steps where n > 1; we claim that rn−1 = hcf(a, b). To see this, first note that r1 = a − bq1, r2 = b − r1q2, and for 3 ≤ k < n, we have rk = rk−2 − rk−1qk. Then the preceding proposition tells us

hcf(a, b) = hcf(b, r1) = hcf(r1, r2) = · · · = hcf(rn−1, rn).

Since the algorithm terminates after n steps, this means rn−1 > 0 but rn = 0; hence hcf(rn−1, rn) = hcf(rn−1, 0) = rn−1.

To realise rn−1 as as + bt, we substitute, using the equalities that rk = rk−2 − rk−1qk for 3 ≤ k < n, r2 = b − r1q2, and r1 = a − bq1.

Example: We compute hcf(1451, 323) and find s, t ∈ Z so that hcf(1451, 323) = 1451s + 323t.

Step 1: 1451 = 323 · 4 + 159 (so q1 = 4, r1 = 159).
Step 2: 323 = 159 · 2 + 5 (so q2 = 2, r2 = 5).
Step 3: 159 = 5 · 31 + 4 (so q3 = 31, r3 = 4).
Step 4: 5 = 4 · 1 + 1 (so q4 = 1, r4 = 1).
Step 5: 4 = 1 · 4 + 0 (so q5 = 4, r5 = 0).

Hence hcf(1451, 323) = r4 = 1.

Solving the above equations for r4, r3, r2, r1 gives us:

1 = 5 − 4 · 1,
4 = 159 − 5 · 31,
5 = 323 − 159 · 2,
159 = 1451 − 323 · 4.

Thus

1 = 5 − (159 − 5 · 31) · 1
  = 5 · 32 − 159 · 1
  = (323 − 159 · 2) · 32 − 159 · 1
  = 323 · 32 − 159 · 65
  = 323 · 32 − (1451 − 323 · 4) · 65
  = 323 · 292 − 1451 · 65.

(So 1 = hcf(1451, 323) = 1451s + 323t where s = −65, t = 292.)
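The back-substitution above can be carried out mechanically. Here is a minimal Python sketch that tracks the coefficients of a and b at every division step; the name extended_hcf is our own, and working with |a| and |b| (justified by Proposition 6.6) is a simplification rather than a literal transcription of Steps 1, 2, . . . .

```python
def extended_hcf(a, b):
    """Return (c, s, t) with c = hcf(a, b) = a*s + b*t, for a, b not both 0."""
    if a == 0 and b == 0:
        raise ValueError("a and b must not both be 0")
    # Invariants: old_r = |a|*old_s + |b|*old_t and r = |a|*s + |b|*t.
    old_r, r = abs(a), abs(b)
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r                      # quotient of the current division step
        old_r, r = r, old_r - q * r         # new remainder
        old_s, s = s, old_s - q * s         # coefficient of |a|
        old_t, t = t, old_t - q * t         # coefficient of |b|
    # Undo the passage to absolute values.
    if a < 0:
        old_s = -old_s
    if b < 0:
        old_t = -old_t
    return old_r, old_s, old_t

print(extended_hcf(1451, 323))
```

On the example above, extended_hcf(1451, 323) returns (1, -65, 292), matching the values of s and t found by hand.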

Remark: As a later exercise, one shows that for x, y ∈ Z with x, y ≠ 0 and hcf(x, y) = 1, there are infinitely many ways to choose u, v ∈ Z so that xu + yv = 1. Recall that with c = hcf(a, b) we have a = cx, b = cy where x, y ∈ Z with hcf(x, y) = 1. Consequently for any a, b ∈ Z with a, b ≠ 0, there are infinitely many ways to choose s, t ∈ Z so that as + bt = hcf(a, b).

As an application of Euclid’s algorithm, we prove the following.

Theorem 6.7. (Chinese Remainder Theorem) Suppose m, n ∈ Z+ with hcf(m, n) = 1. For any a, b ∈ Z, there is some x ∈ Z so that

x ≡ a (mod m) and x ≡ b (mod n).

Further, for x′ ∈ Z, we have x′ ≡ a (mod m) and x′ ≡ b (mod n) if and only if x′ ≡ x (mod mn).

Proof. Since hcf(m,n) = 1, there exist s, t ∈ Z so that ms+ nt = 1. Thus

1 ≡ ms+ nt ≡ nt (mod m)

and

1 ≡ ms+ nt ≡ ms (mod n).

Take x = msb+ nta. Then

x ≡ nta ≡ 1 · a ≡ a (mod m)

and

x ≡ msb ≡ 1 · b ≡ b (mod n).

We leave it as an exercise to show that for x′ ∈ Z, we have

x′ ≡ a (mod m) and x′ ≡ b (mod n) if and only if x′ ≡ x (mod mn).

□
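The construction x = msb + nta in the proof is easy to carry out by machine. The Python sketch below is one way to do so; the helper names bezout and crt are hypothetical, and we reduce x modulo mn at the end only to return a small representative.

```python
def bezout(m, n):
    """Return (c, s, t) with c = hcf(m, n) = m*s + n*t, for integers m, n >= 0 not both 0."""
    if n == 0:
        return m, 1, 0
    c, s, t = bezout(n, m % n)
    return c, t, s - (m // n) * t

def crt(a, m, b, n):
    """Return x with x = a (mod m) and x = b (mod n), for m, n in Z+ with hcf(m, n) = 1."""
    c, s, t = bezout(m, n)
    if c != 1:
        raise ValueError("m and n must satisfy hcf(m, n) = 1")
    x = m * s * b + n * t * a   # the x chosen in the proof
    return x % (m * n)          # any x' = x (mod mn) works equally well

print(crt(2, 3, 3, 5))  # 8, since 8 = 2 (mod 3) and 8 = 3 (mod 5)
```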

Mathematical induction. Mathematical induction is a method of proof wherein we show the smallest instance of a given proposition is true, and from that deduce that each successive instance of the given proposition is true. Thus we establish a base case, or some base cases, then set up a recursive process to establish the succeeding cases.

More formally, suppose P(n) is the proposition that the integer n has property P. To prove that P(n) holds for all n ∈ Z+ using induction, we first prove P(1) holds (this is called the base case). Then we show that for any k ∈ Z+, P(k) ⇒ P(k + 1) (this is called the induction step); to do this, one supposes that P(k) holds (called the induction hypothesis), and then argues that this implies P(k + 1) must hold. Hence for any n ∈ Z with n > 1, this second step shows that P(1) ⇒ P(2), P(2) ⇒ P(3), . . ., P(n − 1) ⇒ P(n). Having established that P(1) holds, P(1) ⇒ P(2) shows that P(2) holds; then P(2) ⇒ P(3) shows that P(3) holds; and so on. A proof using induction to show that P(n) holds for all n ∈ Z+ is called a proof by induction on n.

Remarks:
(1) Proving P(k) ⇒ P(k + 1) for all k ∈ Z+ does not allow us to conclude P(n) holds for some n ∈ Z+ unless we have established that P(m) holds for some m ∈ Z+ with m < n.

(2) An induction argument gives us an algorithm, which we can only apply finitely many times. Hence if P(n) is a proposition that states that the integer n has property P where we begin by showing P(1) holds, the induction step P(k) ⇒ P(k + 1) does not allow us to conclude that property P(∞) holds. For example, consider the proposition P(n) that says “for any subset A of Z with n elements, A has a maximal element”; while this proposition is true for any subset A of Z where A has finitely many elements, this proposition clearly does not hold for A = Z+. (Curious students may want to read other sources about a proof technique called “transfinite induction”, which is beyond the scope of this course.)

(3) A proof by induction on n does not need to begin by establishing P(1). More generally, if we establish that P(n0) holds for some (fixed) n0 ∈ Z, and that P(k) ⇒ P(k + 1) for any k ∈ Z with k ≥ n0, then the principle of mathematical induction allows us to conclude that P(n) holds for all n ∈ Z with n ≥ n0.

We now present some examples of proofs by induction.

Proposition 6.8. For every n ∈ Z+,

\[ 1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2}. \]

Proof. For n ∈ Z+, let P(n) be the proposition that

\[ 1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2}. \]

(Base case:) We have 1 = 1(1 + 1)/2, so P(1) holds.

(Induction step:) Suppose k ≥ 1 and P(k) holds; recall that P(k) is the proposition that

\[ 1 + 2 + 3 + \cdots + k = \frac{k(k+1)}{2}. \]

[We need to deduce that P(k + 1) holds.] Then

\begin{align*}
1 + 2 + 3 + \cdots + k + (k+1) &= \frac{k(k+1)}{2} + (k+1) \\
&= \frac{k(k+1)}{2} + \frac{2(k+1)}{2} \\
&= \frac{k^2 + 3k + 2}{2} \\
&= \frac{(k+1)(k+2)}{2}.
\end{align*}

Hence P(k + 1) holds if P(k) holds, or equivalently, P(k) ⇒ P(k + 1).

By the principle of mathematical induction, this shows that for every n ∈ Z+, P(n) holds. □
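A numerical check is no substitute for the proof, but it is a quick way to catch an algebra slip in the induction step. The Python sketch below (function names are our own) tests the base case, the identity used in the induction step, and the formula itself for a range of values.

```python
def partial_sum(n):
    """Return 1 + 2 + ... + n by direct addition."""
    return sum(range(1, n + 1))

def closed_form(n):
    """Return n(n + 1)/2; n(n + 1) is always even, so // is exact."""
    return n * (n + 1) // 2

# Base case P(1).
assert partial_sum(1) == closed_form(1)

# The algebra of the induction step: k(k+1)/2 + (k+1) = (k+1)(k+2)/2.
for k in range(1, 1000):
    assert closed_form(k) + (k + 1) == closed_form(k + 1)

print(all(partial_sum(n) == closed_form(n) for n in range(1, 200)))
```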

Proposition 6.9. Let X be a set, and let A, B1, B2, . . . , Bn ⊆ X where n ∈ Z+. Then we have the following results.

(a) A ∪ (B1 ∩ B2 ∩ · · · ∩ Bn) = (A ∪ B1) ∩ (A ∪ B2) ∩ · · · ∩ (A ∪ Bn).
(b) A ∩ (B1 ∪ B2 ∪ · · · ∪ Bn) = (A ∩ B1) ∪ (A ∩ B2) ∪ · · · ∪ (A ∩ Bn).
(c) For n ≥ 2, (B1 ∩ B2 ∩ · · · ∩ Bn)^c = B1^c ∪ B2^c ∪ · · · ∪ Bn^c.
(d) For n ≥ 2, (B1 ∪ B2 ∪ · · · ∪ Bn)^c = B1^c ∩ B2^c ∩ · · · ∩ Bn^c.

Proof. We prove (a) and leave (b), (c), (d) as exercises.

(Base case:) First note that A ∪ B1 = A ∪ B1.

(Induction step:) Now suppose that k ≥ 1 and that A ∪ (B1 ∩ B2 ∩ · · · ∩ Bk) = (A ∪ B1) ∩ (A ∪ B2) ∩ · · · ∩ (A ∪ Bk). Let C = B1 ∩ B2 ∩ · · · ∩ Bk. Then

A ∪ (B1 ∩ B2 ∩ · · · ∩ Bk+1) = A ∪ (C ∩ Bk+1).

By Proposition 4.3, A ∪ (C ∩ Bk+1) = (A ∪ C) ∩ (A ∪ Bk+1). By our induction hypothesis,

A ∪ C = A ∪ (B1 ∩ B2 ∩ · · · ∩ Bk)
      = (A ∪ B1) ∩ (A ∪ B2) ∩ · · · ∩ (A ∪ Bk).

Hence

A ∪ (B1 ∩ B2 ∩ · · · ∩ Bk+1) = A ∪ (C ∩ Bk+1)
      = (A ∪ B1) ∩ (A ∪ B2) ∩ · · · ∩ (A ∪ Bk) ∩ (A ∪ Bk+1).

Thus by the principle of mathematical induction, (a) holds for all n ∈ Z+. □

7. Strong induction and the Fundamental Theorem of Arithmetic

An argument by strong induction proceeds as follows. Suppose P(n) is the proposition that the integer n has property P. Fix n0 ∈ Z. To prove that P(n) holds for all n ∈ Z with n ≥ n0 using strong induction, we first establish that P(n0) holds, and then we show that for k ∈ Z with k ≥ n0,

[P(n0) ∧ P(n0 + 1) ∧ · · · ∧ P(k)] ⇒ P(k + 1).

As an example, we will prove the Fundamental Theorem of Arithmetic, which states that for n ∈ Z with n > 1, n can be written uniquely as a product of primes.

Before we can prove the Fundamental Theorem of Arithmetic, we need to establish some other basic results.

Definition. We say an integer p is prime if p > 1, and the only positive divisors of p are 1 and p.

Remark: Later we will see that there are infinitely many primes.

Proposition 7.1. Suppose q1, . . . , qr ∈ Z where r ∈ Z with r ≥ 2, and suppose p is a prime so that p|q1 · · · qr. Then for some i ∈ Z with 1 ≤ i ≤ r, we have p|qi.

Proof. We proceed by induction on r.

[Base case] Suppose that p|q1q2. If p|q1 then we are done. So suppose p ∤ q1. Thus hcf(p, q1) = 1 [since the only positive divisors of p are 1 and p, and p is not a common factor of p and q1]. Hence by Proposition 6.5, p|q2.

[Induction step] Now suppose k ∈ Z with k ≥ 2, and suppose that if p|q1 · · · qk where q1, . . . , qk ∈ Z, then p|qi for some i ∈ Z with 1 ≤ i ≤ k [this is the induction hypothesis]. Suppose q1, . . . , qk, qk+1 ∈ Z with p|q1 · · · qkqk+1. Set t = q1 · · · qk. Thus p|tqk+1, so by the base case in this argument, p|t or p|qk+1. If p|t then our induction hypothesis tells us that p|qi for some i ∈ Z with 1 ≤ i ≤ k. If p ∤ t then p|qk+1. Hence p|qi for some i ∈ Z with 1 ≤ i ≤ k + 1.

Thus by the principle of mathematical induction, the proposition is proved. □

Theorem 7.2. (Fundamental Theorem of Arithmetic) For every n ∈ Z so that n > 1, we have n = p1p2 · · · pr for some primes p1, p2, . . . , pr with p1 ≤ p2 ≤ · · · ≤ pr. Further, if we also have n = q1q2 · · · qs for primes q1, q2, . . . , qs with q1 ≤ q2 ≤ · · · ≤ qs, we have r = s and pi = qi for all i ∈ Z with 1 ≤ i ≤ r.

Proof. We first argue by strong induction to show that each integer n > 1 is a product of primes.

First, we note that 2 is a prime, so 2 is a product of primes (where there is only one prime in this product).

Now suppose that k ∈ Z with k ≥ 2, and suppose that for all integers m ∈ Z with 2 ≤ m ≤ k, m is a product of primes. Consider the integer k + 1. If k + 1 is prime, then we are done. So suppose k + 1 is not prime; thus 1 and k + 1 are not the only positive integers dividing k + 1. Hence there is some a ∈ Z+ so that 1 < a < k + 1 with a|k + 1; this means there is some b ∈ Z+ so that ab = k + 1. Since 1 < a, we have b < ab, so b < k + 1. Also, 1 ≤ b; since a < k + 1 and ab = k + 1, we have 1 < b. Thus 2 ≤ a ≤ k and 2 ≤ b ≤ k, so by the induction hypothesis a and b are products of primes, and hence k + 1 = ab is as well. Hence by the principle of mathematical induction, every integer n > 1 is a product of primes.

Now we want to show that for any integer n > 1, there is a unique way to realise n as a product of primes. More precisely, we want to show that if p1p2 · · · pr = q1q2 · · · qs with p1, p2, . . . , pr, q1, q2, . . . , qs primes so that p1 ≤ p2 ≤ · · · ≤ pr and q1 ≤ q2 ≤ · · · ≤ qs, then r = s and pi = qi for all i ∈ Z with 1 ≤ i ≤ r. To prove this, we argue by induction on r ∈ Z+.

More formally, for r ∈ Z+, we let P(r) be the proposition that if p1 · · · pr = q1 · · · qs with p1 ≤ · · · ≤ pr primes and q1 ≤ · · · ≤ qs primes, we have r = s and pi = qi for all i ∈ Z with 1 ≤ i ≤ r.

Suppose first that p1 = q1 · · · qs where p1 is prime and q1 ≤ · · · ≤ qs are prime (here s ∈ Z+). Since p1 is prime and thus cannot be a product of two or more primes, it must be the case that s = 1 and p1 = q1. [This proves the base case for the induction argument, i.e. this shows P(1) holds.]

Now suppose that k ∈ Z+, and that whenever n = p1 · · · pk = q1 · · · qs with p1, . . . , pk, q1, . . . , qs primes and p1 ≤ · · · ≤ pk, q1 ≤ · · · ≤ qs, we have k = s and pi = qi for all i ∈ Z with 1 ≤ i ≤ k. [This is the induction hypothesis.] Suppose now that a = p1 · · · pkpk+1 = q1 · · · qt with p1, . . . , pk, pk+1, q1, . . . , qt primes and p1 ≤ · · · ≤ pk ≤ pk+1, q1 ≤ · · · ≤ qt. Note that since k ≥ 1, a is not prime, and hence t ≥ 2. Let p be the largest prime so that p|a. [Note that there are only finitely many primes dividing a since each such prime q satisfies 2 ≤ q ≤ a, and there are only finitely many integers between 2 and a.] Thus we have p ≥ pk+1. Also, since p|a, we have p|p1 · · · pkpk+1, hence p|pi for some i ∈ Z, 1 ≤ i ≤ k + 1. Since pi is prime, we must have p = pi. So p = pi ≤ pk+1. Also, by our choice of p, we have p ≥ pk+1; hence we must have p = pk+1.

A virtually identical argument shows that p = qt, and hence pk+1 = qt. Thus p1 · · · pk = q1 · · · qt−1. By the induction hypothesis, we have k = t − 1 and pi = qi for i ∈ Z with 1 ≤ i ≤ k. Therefore we have k + 1 = t and pi = qi for all i ∈ Z with 1 ≤ i ≤ k + 1.

Consequently, by the principle of mathematical induction, the factorisation of an integer n > 1 as a product of (nondecreasing) primes is unique. □
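The existence half of the proof is effectively a recursive procedure: either n is prime, or n = ab with 1 < a, b < n and we factor a and b separately. The Python sketch below follows that outline; the function name and the trial-division search for a are our own choices, made for readability rather than speed.

```python
def prime_factors(n):
    """Return the primes p1 <= p2 <= ... <= pr with n = p1 * p2 * ... * pr, for n > 1."""
    if n <= 1:
        raise ValueError("n must be an integer greater than 1")
    # Look for a divisor a with 1 < a < n, as in the existence argument.
    a = 2
    while a * a <= n:
        if n % a == 0:
            b = n // a
            # a and b are both smaller than n, so the strong induction
            # hypothesis (here: the recursive calls) applies to each.
            return sorted(prime_factors(a) + prime_factors(b))
        a += 1
    return [n]  # no such divisor exists, so n is prime

print(prime_factors(360))  # [2, 2, 2, 3, 3, 5]
```

Sorting the output gives the nondecreasing order p1 ≤ p2 ≤ · · · ≤ pr used in the statement of the theorem.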

Corollary 7.3. For x ∈ Q+, ∃! a, b ∈ Z+ so that hcf(a, b) = 1 and x = a/b.

Proof. Take x ∈ Q+. Thus ∃a, b ∈ Z, a, b ≠ 0, so that x = a/b. Since x > 0, we have x = |x| = |a|/|b|, so we have that x is a quotient of two elements of Z+. So suppose a, b > 0. Let c = hcf(a, b), and take a′, b′ ∈ Z+ so that a = ca′ and b = cb′. Thus hcf(a′, b′) = 1 and x = a′/b′. [This shows that there is at least one way to write any x ∈ Q+ as a/b where a, b ∈ Z+ with hcf(a, b) = 1.]

Now suppose x ∈ Q+ and a, b, c, d ∈ Z+ so that x = a/b = c/d with hcf(a, b) = 1 = hcf(c, d). Thus ad = bc. Suppose a = 1; then d = bc, so c|d. Since hcf(c, d) = 1 and c > 0, this means c = 1 and hence a = c and b = d. So suppose a > 1; then a = p1 · · · pr for some r ∈ Z+ and primes p1, . . . , pr. We now argue by induction on r to show that a = c and b = d. First, suppose a = p1 (p1 prime). We have p1|bc and hcf(a, b) = 1, so hcf(p1, b) = 1 and hence p1|c. Thus c = p1c′ for some c′ ∈ Z+. So d = bc′ and hence c′|d. Since hcf(c, d) = 1 and c′ is a positive factor of d, we must have c′ = 1 and hence a = p1 = c and b = d. Now suppose k ≥ 1 and whenever p1, . . . , pk are prime and b′, c′, d′ ∈ Z+ so that hcf(p1 · · · pk, b′) = 1 = hcf(c′, d′) with p1 · · · pkd′ = b′c′, we have p1 · · · pk = c′. Suppose a = p1 · · · pkpk+1 where p1, . . . , pk, pk+1 are prime. Thus pk+1|c, so c = pk+1c′ for some c′ ∈ Z+. Therefore p1 · · · pkd = bc′, so by the induction hypothesis, p1 · · · pk = c′. Hence a = c and b = d. □

Corollary 7.4. There are infinitely many prime numbers in Z+.


Proof. (Euclid’s proof) For the sake of contradiction, suppose there are only finitely many primes, and enumerate these as p1, p2, . . . , pm where m ∈ Z+ is the number of primes. Now set n = p1p2 · · · pm + 1. Clearly m ≥ 2, as 2 and 3 are prime. Hence n > 1. By the Fundamental Theorem of Arithmetic, n can be factored as a product of primes; let q be a prime dividing n. So n = qk for some k ∈ Z+ [k > 0 since n > 0 and q > 0]. Since there are only finitely many primes, we must have q = pj for some j ∈ Z+, 1 ≤ j ≤ m. Hence

n = pjk = p1p2 · · · pm + 1,

so

\[ 1 = n - p_1 p_2 \cdots p_m = p_j \left( k - \frac{p_1 p_2 \cdots p_m}{p_j} \right). \]

Hence the prime pj divides 1. Contradiction! Thus there cannot be a finite number of primes in Z+. □
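It can help to run Euclid's construction on a concrete list. In the Python sketch below we pretend, purely for illustration, that 2, 3, 5, 7, 11, 13 are all the primes; the helper name is our own. The number n = p1 · · · pm + 1 need not itself be prime, but every prime dividing it lies outside the list.

```python
from math import prod

def smallest_prime_factor(n):
    """Return the least d > 1 dividing n (necessarily prime), for n > 1."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n

primes = [2, 3, 5, 7, 11, 13]   # a hypothetical "complete" list of primes
n = prod(primes) + 1            # n = p1 * p2 * ... * pm + 1
q = smallest_prime_factor(n)
print(n, q, q in primes)        # 30031 59 False
```

Here 30031 = 59 · 509, so n is composite, yet its prime factor 59 is missing from the list, exactly as the proof predicts.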

Remark: One can use induction and the Fundamental Theorem of Arithmetic to prove the following generalisation of the Chinese Remainder Theorem: Suppose r ∈ Z with r ≥ 2, and m1, . . . , mr ∈ Z+ are pairwise relatively prime, meaning that hcf(mi, mj) = 1 for i, j ∈ Z with 1 ≤ i ≤ r, 1 ≤ j ≤ r and i ≠ j. For any a1, . . . , ar ∈ Z, there is some x ∈ Z so that x ≡ ai (mod mi) for all i ∈ Z with 1 ≤ i ≤ r. Further, with x as above and x′ ∈ Z, we have x′ ≡ ai (mod mi) for i = 1, 2, . . . , r if and only if x′ ≡ x (mod m1m2 · · · mr).

Application: We make use of the Fundamental Theorem of Arithmetic to find all primes p so that 5p + 9 = n² for some n ∈ Z+.

[Strategy: First, we suppose we have a prime p so that 5p + 9 = n² for some n ∈ Z+, and we deduce constraints on p. Then we consider all primes p subject to these constraints and determine for which of these p we have that 5p + 9 = n² for some n ∈ Z+.]

Suppose p is prime and n ∈ Z+ so that 5p + 9 = n². Then 5p = n² − 9 = (n + 3)(n − 3). By the Fundamental Theorem of Arithmetic, the only positive factors of 5p are 1, 5, p, 5p. Since n ∈ Z+, we know that n + 3 is positive, so n + 3 is one of 1, 5, p, 5p.

Suppose n + 3 = 1. Then n − 3 = −5, meaning 5p = (n + 3)(n − 3) = −5. But this implies p = −1, which is not prime. So we cannot have n + 3 = 1.

Suppose n + 3 = 5. Then n − 3 = −1, so 5p = (n + 3)(n − 3) = −5 and hence p = −1. But this is impossible [since −1 is not prime].

Suppose n + 3 = p. Thus n − 3 = p − 6, so 5p = p(p − 6). Hence 5 = p − 6, so p = 11, which is prime. [So n + 3 = p does not lead to a contradiction.]

Suppose n + 3 = 5p. Thus 5p = (n + 3)(n − 3) = 5p(n − 3). Hence n − 3 = 1, and so n = 4. Then 5p = (n + 3)(n − 3) = 7; but this is impossible, since 5 does not divide [the prime] 7.

This shows that if p is a prime so that 5p + 9 = n² for some n ∈ Z+ then p = 11. On the other hand, with p = 11, we have 5p + 9 = 55 + 9 = 64 = 8².

Hence p is a prime with 5p + 9 = n² for some n ∈ Z+ if and only if p = 11.
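A brute-force search over small primes is consistent with this conclusion. The Python sketch below is only a sanity check over a finite range (the bound 10 000 and the function names are our own choices); the argument above is what rules out all larger primes.

```python
from math import isqrt

def is_prime(p):
    """Trial-division primality test; adequate for small p."""
    if p < 2:
        return False
    return all(p % d != 0 for d in range(2, isqrt(p) + 1))

def primes_with_square(limit):
    """Return all primes p < limit for which 5p + 9 is a perfect square."""
    found = []
    for p in range(2, limit):
        value = 5 * p + 9
        if is_prime(p) and isqrt(value) ** 2 == value:
            found.append(p)
    return found

print(primes_with_square(10_000))  # [11]
```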


8. Cardinality

Definitions. We say that two nonempty sets A and B have the same cardinality if there is a bijective map f : A → B, and we write |A| = |B|. Note that (1) h : A → A defined by h(x) = x is bijective; (2) when f : A → B is bijective, f⁻¹ : B → A exists and is also bijective; and (3) if f : A → B and g : B → C are bijective, then so is g ◦ f : A → C. (We are tempted to say that having the same cardinality is an equivalence relation on “the set of all sets”; however, it is known that there is no injection from P(X) = {A : A ⊆ X} into X. So if X were “the set of all sets”, we would have P(X) ⊆ X, and the inclusion map ι : P(X) → X defined by ι(A) = A would contradict that there is no injection from P(X) into X. Thus “the set of all sets” does not exist.)

We say a set Y has at least as many elements as a set X if there is an injective map f : X → Y, and in this case we write |X| ≤ |Y|. If there is an injection from X into Y but no bijection between X and Y, we write |X| < |Y|. (Note that when X ⊆ Y and X ≠ ∅, the map ι : X → Y defined by ι(x) = x is injective.)

Let A be a set. When A = ∅, we set |A| = 0. When n ∈ Z+ and f : {1, 2, . . . , n} → A is bijective, we write |A| = n and we say A has n elements. Further, we can enumerate the elements of A as a1, a2, . . . , an where ai = f(i), and since f is injective, ai = aj if and only if i = j.

When |A| ∈ Z≥0, we say A is a finite set. When A is not a finite set we say A is an infinite set.

Suppose A is a finite set with |A| = n (n ∈ Z≥0) and B is a subset of A; we accept without proof that |B| = m where m ≤ n (m ∈ Z≥0), and that A = B if and only if m = n. We also accept without proof that Z+ is infinite. Note that we have assumed the following: Suppose B ⊆ A; if A is finite then B is finite. The contrapositive of this statement is: Suppose B ⊆ A; if B is infinite then A is infinite.

We will eventually see that |Z+| = |Z| = |Q+| = |Q|, but |Z+| < |R|.

Proposition 8.1. Suppose A, B ⊆ X where X is some set, where A, B are nonempty finite sets with A ∩ B = ∅. Then |A ∪ B| = |A| + |B|.

Proof. Let s, t ∈ Z+ so that |A| = s and |B| = t. Thus we can enumerate the elements of A as a1, a2, . . . , as where ai = aj only if i = j (here i, j are integers between 1 and s). Similarly, we can enumerate the elements of B as b1, b2, . . . , bt where bi = bj only if i = j (here i, j are integers between 1 and t). We also know that for any integers i, j with 1 ≤ i ≤ s and 1 ≤ j ≤ t, we have ai ≠ bj since A ∩ B = ∅.

Define f : {1, 2, . . . , s + t} → A ∪ B by

\[ f(n) = \begin{cases} a_n & \text{if } 1 \le n \le s, \\ b_{n-s} & \text{if } s < n \le s + t. \end{cases} \]

As an exercise, one shows that f is bijective. □

Definition. We say a set X is countable if there is a bijective function f : Z+ → X, or equivalently, if there is a bijective map g : X → Z+. (Note: Some texts say a set is countable if it is finite or if there is a bijective function f : Z+ → X, and when there is a bijective function f : Z+ → X, these texts say X is countably infinite.)

Note: Suppose X is a countable set; so by definition there is a bijective map f : Z+ → X. Thus we can enumerate the elements of X as x1, x2, x3, . . . where xi = f(i) for i ∈ Z+.

Example: The set of positive even integers is countable: Let A = {2x : x ∈ Z+}. Define f : Z+ → A by f(x) = 2x. To see f is injective, suppose x, y ∈ Z+ so that f(x) = f(y). Thus 2x = 2y, so x = y, showing that f is injective. To see f is surjective, take a ∈ A. Thus a = 2x for some x ∈ Z+, and hence a = 2x = f(x); so f is surjective. This shows that f is bijective, and hence A is countable. Similarly, the set of odd positive integers, {2x − 1 : x ∈ Z+}, can be shown to be countable.

Theorem 8.2. Suppose f : X → Y is injective and A ⊆ X. Then |A| = |f(A)|.

Proof. Let B = f(A). Define g : A → B by g(a) = f(a). By the definition of B, B = f(A) = g(A), so g is surjective. Suppose a, a′ ∈ A so that g(a) = g(a′). Then f(a) = f(a′), and since f is injective, this means a = a′. Hence g is injective. Thus g is bijective, so |A| = |g(A)|. We also know that g(A) = B = f(A), so |A| = |g(A)| = |f(A)|. □

The next theorem may seem intuitively obvious, but a proper proof is beyond the scope of this course.

Theorem 8.3.
(a) Every infinite set contains a countable subset.
(b) (Cantor-Schröder-Bernstein Theorem) If X, Y are sets with |X| ≤ |Y| and |Y| ≤ |X| then |X| = |Y|. That is, if X, Y are sets so that there exist injective functions g : X → Y and h : Y → X then there is a bijective function f : X → Y.

(In some texts, this theorem is called the Cantor-Bernstein Theorem or the Schröder-Bernstein Theorem; an interesting proof of this theorem due to Halmos can be found in the book by Pierre Grillet, which is available as an electronic book from the University of Bristol library.)

Corollary 8.4. Suppose X ⊆ Z+. Then X is finite or countable.

Proof. If X is finite then we are done. So suppose X is infinite. We have an injective map g : X → Z+ given by g(x) = x, so |X| ≤ |Z+|. On the other hand, we know X contains a countable subset A. Hence there is a bijective map h : Z+ → A. Define f : Z+ → X by f(n) = h(n). So f gives us an injective map from Z+ into X. Thus |Z+| ≤ |X|, and so by the Cantor-Bernstein Theorem, there is a bijective map from Z+ to X. Hence X is countable. □

The next result is very useful when proving a set is countable.

Corollary 8.5. Suppose X is an infinite set. Then X is countable if and only if there is an injective map f : X → Z+.

Proof. First suppose there is an injective map f : X → Z+. Thus |X| = |f(X)|, and f(X) is a subset of Z+. Since X is not finite and |X| = |f(X)|, f(X) is not finite. Hence (by the previous corollary), f(X) is countable, and hence X is countable.

Now suppose that X is countable. Thus there is a bijective g : Z+ → X. Since g is bijective, g⁻¹ exists. With f = g⁻¹, we have that f : X → Z+ is bijective and hence injective. □

As exercises, one proves the following.

Proposition 8.6. Suppose X is a countable set.

(a) Suppose A is a subset of X; then A is finite or countable.
(b) Suppose A is a subset of X. If A is finite then X ∖ A is countable.
(c) X contains a subset B so that B and X ∖ B are countable.
(d) Suppose f : C → X is injective; then C is finite or countable.

Theorem 8.7. Suppose A, B ⊆ X where X is some set; suppose A is a countable set and B is a nonempty, finite set with A ∩ B = ∅. Then A ∪ B is countable.

Proof. [The idea of this proof is that of the “Hilbert hotel”, where there is always room for another guest: The Hilbert hotel has countably many rooms, labeled 1, 2, 3, . . . (so for each number in Z+, there is a room with that number). One night, all the rooms are occupied, and another potential guest arrives at the hotel looking for a room. The manager says, no problem! Then the manager announces to the guests that every guest is to move to the next room (so the guests in room n move to room n + 1). Thus all the guests still have rooms, and room 1 has been made available to the new arrival.]

Since A is countable, there is an injective map f : A → Z+. B is finite, so we can list the distinct elements of B as b1, . . . , bm where m = |B| ∈ Z+. Define g : A ∪ B → Z+ by

\[ g(x) = \begin{cases} i & \text{if } x = b_i, \\ f(x) + m & \text{if } x \in A. \end{cases} \]

We claim that g is injective. To see this, take x, y ∈ A ∪ B so that x ≠ y. If x, y ∈ B then x = bi and y = bj for some i, j ∈ Z+ so that i ≤ m, j ≤ m with i ≠ j, and hence g(x) = i ≠ j = g(y). If x ∈ B and y ∈ A then x = bi for some i ∈ Z+ with i ≤ m, and hence g(x) = i < m + 1 ≤ f(y) + m = g(y). If x, y ∈ A, then since x ≠ y and f is injective, we have f(x) ≠ f(y) and so g(x) = f(x) + m ≠ f(y) + m = g(y). Therefore g is injective. Since A ⊆ A ∪ B and A is infinite, A ∪ B is infinite. Since g : A ∪ B → Z+ is injective and A ∪ B is infinite, A ∪ B is countable. □
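The map g is easy to experiment with in the simplest case A = Z+ with f the identity. In the Python sketch below, the labels standing in for the elements of B and the function name hotel_map are hypothetical; all that matters is that the labels are distinct and are not positive integers, so that A ∩ B = ∅.

```python
def hotel_map(new_guests):
    """Return g : A ∪ B -> Z+ for A = Z+ (f the identity) and B = new_guests."""
    m = len(new_guests)
    room_of = {b: i for i, b in enumerate(new_guests, start=1)}

    def g(x):
        if x in room_of:      # x = b_i is given room i
            return room_of[x]
        return x + m          # x in A moves from room f(x) = x to room x + m

    return g

g = hotel_map(["alice", "bob"])
print([g(x) for x in ["alice", "bob", 1, 2, 3]])  # [1, 2, 3, 4, 5]
```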

Theorem 8.8. Z+ × Z+ is countable.

Proof. We have that Z+ × Z+ is infinite, as {(x, 1) : x ∈ Z+} is an infinite subset of Z+ × Z+.

We arrange the elements of Z+ × Z+ in a grid:

(1, 1) (1, 2) (1, 3) (1, 4) · · ·
(2, 1) (2, 2) (2, 3) (2, 4) · · ·
(3, 1) (3, 2) (3, 3) (3, 4) · · ·
(4, 1) (4, 2) (4, 3) (4, 4) · · ·
  ⋮      ⋮      ⋮      ⋮

We order the elements of this grid along the cross-diagonals:

(1, 1); (1, 2), (2, 1); (1, 3), (2, 2), (3, 1); . . .

We will define f : Z+ × Z+ → Z+ so that f((1, 1)) = 1, f((1, 2)) = 2, f((2, 1)) = 3, f((1, 3)) = 4, f((2, 2)) = 5, f((3, 1)) = 6, etc. We now find a formula to define f.

The kth cross-diagonal contains the pairs (1, k), (2, k − 1), (3, k − 2), . . . , (k, 1). So this cross-diagonal has k pairs. Thus the number of pairs in the first k − 1 cross-diagonals is

\[ 1 + 2 + 3 + \cdots + (k-1) = \frac{(k-1)k}{2}. \]

Thus we define f : Z+ × Z+ → Z+ by

\[ f((i, k+1-i)) = \frac{(k-1)k}{2} + i. \]

To show f is injective, suppose x, y ∈ Z+ × Z+ so that f(x) = f(y). Thus ∃i, k ∈ Z+ so that i ≤ k and x = (i, k + 1 − i) (so x is on the kth cross-diagonal). Suppose first that y is also on the kth cross-diagonal; thus ∃j ∈ Z+ so that j ≤ k and y = (j, k + 1 − j). Then

\[ \frac{(k-1)k}{2} + i = f(x) = f(y) = \frac{(k-1)k}{2} + j. \]

Hence i = j and so x = y. Now suppose y is not on the kth cross-diagonal. So there exist j, m ∈ Z+ so that j ≤ m and y = (j, m + 1 − j). So y is on the mth cross-diagonal where m ≠ k; hence m > k or k > m. Without loss of generality, assume m > k. [If it is the case that k > m then we rename x as y and y as x.] Thus m = k + r for some r ∈ Z+. So

\[ f(x) = \frac{(k-1)k}{2} + i \le \frac{(k-1)k}{2} + k, \]

and

\begin{align*}
f(y) &= \frac{(k+r-1)(k+r)}{2} + j \\
&= \frac{(k-1)k}{2} + kr + \frac{(r-1)r}{2} + j \\
&\ge \frac{(k-1)k}{2} + k + 1
\end{align*}

(since kr ≥ k, r(r − 1) ≥ 0, and j ≥ 1). Therefore f(x) ≠ f(y), contradicting the assumption that f(x) = f(y).

Hence, if f(x) = f(y) then x and y are on the same cross-diagonal and x = y. This shows f is injective. Since Z+ × Z+ is infinite, Corollary 8.5 then shows that Z+ × Z+ is countable. □
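The formula for f translates directly into code once we note that the pair (i, j) lies on cross-diagonal k = i + j − 1. The Python sketch below (the function name is our own) reproduces the first six values listed in the proof.

```python
def cross_diagonal_index(i, j):
    """Position of (i, j) in the cross-diagonal ordering of Z+ x Z+.

    (i, j) lies on cross-diagonal k = i + j - 1, so that j = k + 1 - i.
    """
    k = i + j - 1
    return (k - 1) * k // 2 + i

first_six = [(1, 1), (1, 2), (2, 1), (1, 3), (2, 2), (3, 1)]
print([cross_diagonal_index(i, j) for i, j in first_six])  # [1, 2, 3, 4, 5, 6]
```

On the first K cross-diagonals the values taken are exactly 1, 2, . . . , K(K + 1)/2, each once, which is a finite shadow of the fact that this particular f is in fact bijective.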


Suppose X, Y are countable. Then certainly X × Y is infinite: Choose y0 ∈ Y. Define f : X × {y0} → X by f(x, y0) = x. One easily shows f is bijective, so |X × {y0}| = |X|, and hence X × {y0} is countable. As X × {y0} ⊆ X × Y, X × Y is infinite (as it contains an infinite subset).

As an exercise, one proves the following somewhat anti-intuitive result.

Corollary 8.9. Q+ and Q are countable.

Also as an exercise, one proves the following.

Proposition 8.10. Suppose X,Y are countable. Then:

(a) X × Y is countable.
(b) Suppose also X ∩ Y = ∅. Then X ∪ Y is countable.

Note: Since Z is an infinite subset of Q and Q is countable, Z must be countable.

Corollary 8.11. Let {An : n ∈ Z+} be a (countable) collection of countable sets that are pairwise disjoint. Then ∪n∈Z+ An is countable.

Proof. First note that since A1 is infinite and A1 ⊆ ∪n∈Z+ An, we know that ∪n∈Z+ An is also infinite. We have seen that if X is a countable set, Y is an infinite set, and ∃ an injective map g : Y → X, then Y is countable. Thus to prove that ∪n∈Z+ An is countable, we will prove there is an injective function g : ∪n∈Z+ An → Z+ × Z+.

For each n ∈ Z+, enumerate the elements of An as a_{n,1}, a_{n,2}, a_{n,3}, . . .. [Recall that since An is countable, there is a bijective function fn : Z+ → An; for k ∈ Z+, set a_{n,k} = fn(k).] Now define g : ∪n∈Z+ An → Z+ × Z+ by g(a_{m,k}) = (m, k). [Since the An are pairwise disjoint, g is actually a function.] To see g is injective, suppose g(a_{m,k}) = g(a_{s,t}) for some m, k, s, t ∈ Z+. Thus (m, k) = (s, t), so m = s, k = t, and hence a_{m,k} = a_{s,t}. Thus g is injective. So we have an injective function from ∪n∈Z+ An into a countable set; since ∪n∈Z+ An contains the infinite set A1 and hence is infinite, ∪n∈Z+ An is countable. □

9. Uncountable sets and power sets

Definition. A set X is called uncountable if it is infinite but not countable.

We want to show that R is uncountable. To do this we will show that the interval (0, 1) = {x ∈ R : 0 < x < 1} is uncountable; as an exercise one shows that there is a bijection between the interval (0, 1) and R.

We assume that every real number between 0 and 1 has a decimal expansion of the form

\[ 0.a_1 a_2 a_3 \cdots = \sum_{k \in \mathbb{Z}^+} a_k 10^{-k} \]

where ak ∈ {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} for each k ∈ Z+. Note that

\[ 0.999\cdots = \sum_{k \in \mathbb{Z}^+} 9 \cdot 10^{-k} = 9 \cdot \frac{1/10}{1 - 1/10} = 1 \]

(recall that \sum_{k \in \mathbb{Z}^+} 10^{-k} is a convergent geometric series). Consequently if there is some N ∈ Z+ so that aN ≠ 9 and an = 9 for all n ∈ Z+ with n > N, then 0.a1a2a3 · · · = 0.a1a2 · · · aN−1bN where bN = aN + 1. We will assume the result that for every α ∈ R with 0 < α < 1, there is a unique way to write α as 0.a1a2a3 · · · so that ak ∈ {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} for all k ∈ Z+ and ¬(∃N ∈ Z+ so that ∀n ∈ Z+, n > N ⇒ an = 9).

Theorem 9.1. The interval (0, 1) = {x ∈ R : 0 < x < 1 } is uncountable.

Proof. (Cantor’s diagonalisation argument) We know the interval (0, 1) is infinite, since f : Z+ → (0, 1) defined by f(k) = 10^{−k} is easily shown to be injective. For the sake of contradiction, suppose (0, 1) is countable. Thus we can enumerate the elements of (0, 1) as α1, α2, α3, . . .. Write each αk as a decimal expansion as described above:

\[ \alpha_k = 0.a_{k1} a_{k2} a_{k3} \cdots \]

where a_{ki} ∈ {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} and ¬(∃N ∈ Z+ so that ∀n ∈ Z+, n > N ⇒ a_{kn} = 9). For each k ∈ Z+, set

\[ b_k = \begin{cases} 1 & \text{if } a_{kk} \ne 1, \\ 2 & \text{if } a_{kk} = 1. \end{cases} \]

Set β = 0.b1b2b3 · · · . Thus β ∈ R with 0 < β < 1 and ¬(∃N ∈ Z+ so that ∀n ∈ Z+, n > N ⇒ bn = 9). Hence by assumption, β = αm for some m ∈ Z+. But bm ≠ a_{mm}, contradicting the uniqueness of the representation of β as a decimal expansion not ending in an infinite sequence of 9s. Thus the assumption that the interval (0, 1) is countable leads to a contradiction, so (0, 1) must be uncountable. □
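A finite version of the diagonal construction can be run directly. In the Python sketch below, the four rows of digits are made-up sample data standing in for the first digits of α1, . . . , α4, and the function name is our own; the rule for bk is the one from the proof.

```python
def diagonal_digits(rows):
    """Given rows[k] = leading digits of the (k+1)-th listed number (a square array),
    return digits b_1, ..., b_n with b_k different from the k-th diagonal digit."""
    return [1 if row[k] != 1 else 2 for k, row in enumerate(rows)]

listed = [
    [1, 4, 1, 5],
    [3, 3, 3, 3],
    [5, 0, 1, 0],
    [9, 9, 8, 7],
]
beta = diagonal_digits(listed)
print(beta)  # [2, 1, 2, 1]
```

The resulting digit string differs from the kth row in its kth digit, so it cannot coincide with any row, however the list was chosen.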

Remark: Suppose we have m ∈ Z+ and a1, a2, . . . , am ∈ {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} (not all 0), and

\[ \alpha = 0.a_1 a_2 \cdots a_m a_1 a_2 \cdots a_m a_1 a_2 \cdots a_m \cdots = 0.\overline{a_1 a_2 \cdots a_m}. \]

Then α is a rational number: Let b = \sum_{k=1}^{m} a_k \cdot 10^{m-k}. Then b ∈ Z+ and

\[ \alpha = \sum_{n \in \mathbb{Z}^+} b \cdot 10^{-mn} = b \cdot \frac{10^{-m}}{1 - 10^{-m}} = \frac{b}{10^m - 1}. \]
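The closed form b/(10^m − 1) can be evaluated exactly with rational arithmetic. The Python sketch below uses the standard fractions module; the function name and the sample digit block are our own.

```python
from fractions import Fraction

def repeating_decimal(block):
    """Return 0.(block repeated forever) as the exact fraction b / (10^m - 1)."""
    m = len(block)
    # b = a_1 * 10^(m-1) + a_2 * 10^(m-2) + ... + a_m * 10^0
    b = sum(a * 10 ** (m - k) for k, a in enumerate(block, start=1))
    return Fraction(b, 10 ** m - 1)

print(repeating_decimal([1, 4, 2, 8, 5, 7]))  # 1/7
```

Here b = 142857 and 10^6 − 1 = 999999 = 7 · 142857, so the repeating decimal 0.142857142857 · · · equals 1/7.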


Note that the map g : (0, 1) → R given by g(x) = x is injective, so |(0, 1)| ≤ |R|. Since |Z+| < |(0, 1)|, we get |Z+| < |R|, meaning R is uncountable. In the following corollary, we show |(0, 1)| = |R|, which is another way to argue that R is uncountable.

Corollary 9.2. There is a bijection between the interval (0, 1) and R (and hence R is uncountable).

Definition. For A a set, we let

P(A) = {C : C ⊆ A}.

We call P(A) the power set of A.

Examples:
(a) P(∅) = {∅}, so |P(∅)| = 1.
(b) For any nonempty set X, we know ∅, X are distinct subsets of X, and hence |P(X)| ≥ 2.
(c) P({1, 2}) = {∅, {1}, {2}, {1, 2}}, so |P({1, 2})| = 4 = 2².
(d) P({1, 2, 3}) = {∅, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}}. So |P({1, 2, 3})| = 8 = 2³.

Theorem 9.3. Suppose A is a finite set with |A| = n for some n ∈ Z with n ≥ 0. Then |P(A)| = 2^n.

Remark: Suppose A is a finite set with n elements; enumerate these elements as a1, a2, . . . , an. Let Y = {(c1, c2, . . . , cn) : ci = 0 or 1 ∀i ∈ Z with 1 ≤ i ≤ n}. Define f : Y → P(A) by

f((c1, c2, . . . , cn)) = {ai ∈ A : i ∈ Z with 1 ≤ i ≤ n and ci = 1}.

Thus f((c1, c2, . . . , cn)) is a subset of A; one can show that f is bijective.
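The map f in this remark gives a practical way to list a power set: run through all 0/1 tuples and keep the elements whose entry is 1. The following Python sketch does this with itertools.product; the function name is our own, and frozenset is used only so that subsets can themselves be collected into a set.

```python
from itertools import product

def power_set(elements):
    """List the subsets of `elements` (distinct items a_1, ..., a_n), one for each
    0/1 tuple (c_1, ..., c_n): a_i is included exactly when c_i = 1."""
    subsets = []
    for choice in product((0, 1), repeat=len(elements)):
        subsets.append(frozenset(a for a, c in zip(elements, choice) if c == 1))
    return subsets

subsets = power_set([1, 2, 3])
print(len(subsets), len(set(subsets)))  # 8 8
```

That the 2³ = 8 tuples produce 8 distinct subsets reflects the injectivity of f in this small case.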

As exercises, one proves the following.

Proposition 9.4. Let A,B be sets.

(a) (A ⊆ B) ⇐⇒ (P(A) ⊆ P(B)).
(b) P(A) ∪ P(B) ⊆ P(A ∪ B).
(c) P(A) ∩ P(B) = P(A ∩ B).

Theorem 9.5. (Cantor’s Theorem) Let X be a set. Then |X| < |P(X)|.

Proof. When X = ∅, then we know |X| = 0 < 1 = |P(X)|. So suppose X ≠ ∅, and define f : X → P(X) by f(x) = {x}. We show f is injective: Suppose x1, x2 ∈ X so that f(x1) = f(x2). Thus {x1} = {x2}, and hence x1 = x2. Therefore f is injective, so |X| ≤ |P(X)|.

Now we want to show there is no bijection between X and P(X). For the sake of contradiction, suppose there is a bijection g : X → P(X). (So for each x ∈ X, g(x) is a subset of X.) Define A = {x ∈ X : x ∉ g(x)}. Then A is a subset of X, so A ∈ P(X). Also, since we have assumed g is bijective, there is some z ∈ X so that g(z) = A. By the definition of A, z ∈ A if and only if z ∉ g(z) = A. Thus we have a contradiction (namely that z ∈ A ⇐⇒ z ∉ A). Hence our assumption that there is a bijective function g : X → P(X) must be false. So |X| < |P(X)|. □
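For a small finite set one can check the key step exhaustively: whatever map g : X → P(X) we pick, the set A = {x ∈ X : x ∉ g(x)} is never one of the values of g. The Python sketch below does this for a three-element X; the function names are our own, and a map g is represented simply by the tuple of its values.

```python
from itertools import combinations, product

def all_subsets(xs):
    """Return every subset of xs as a frozenset."""
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def diagonal_set(xs, g):
    """Return A = {x in X : x not in g(x)} for a map g given as a dict."""
    return frozenset(x for x in xs if x not in g[x])

X = [0, 1, 2]
subsets = all_subsets(X)

# Try every one of the 8^3 = 512 maps g : X -> P(X).
missed = all(
    diagonal_set(X, dict(zip(X, images))) not in images
    for images in product(subsets, repeat=len(X))
)
print(missed)  # True
```

So no map from this X to P(X) is surjective, in line with |X| = 3 < 8 = |P(X)|.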


10. More proofs using contradiction, construction, andinduction

Proposition 10.1. For prime p ∈ Z, √p is irrational.

Proof. For the sake of contradiction, suppose √p ∈ Q. Thus √p = a/b where a, b ∈ Z with b ≠ 0; we can assume hcf(a, b) = 1. Then p = a^2/b^2, so pb^2 = a^2. Thus p | a^2, and since p is prime, p | a; hence a = pc for some c ∈ Z. This means we have pb^2 = (pc)^2 = p^2c^2, so b^2 = pc^2 and hence (again since p is prime) p | b. But then p | hcf(a, b), contradicting that hcf(a, b) = 1. Hence √p cannot be rational. □

Proposition 10.2. There are infinitely many primes in Z+; in fact, there are countably many primes.

Proof. Let X be the set of primes. Note that since X ⊆ Z+, we know X is either finite or countable.

For the sake of contradiction, suppose there are only finitely many primes in Z+; let t be the number of primes. We know there is at least one prime, namely 2, so t ≥ 1. Let p_1, ..., p_t be all the primes in Z+. Consider m = p_1 ··· p_t + 1. Since m ∈ Z with m > 1, by the Fundamental Theorem of Arithmetic we know there is some prime q ∈ Z+ so that q | m. So there is some m′ ∈ Z so that m = qm′, and hence 1 = qm′ − p_1 ··· p_t. Since we have assumed there are finitely many primes, we must have q = p_j for some j ∈ Z with 1 ≤ j ≤ t. Hence 1 = p_j m′ − p_1 ··· p_t, so p_j | 1. But since q = p_j is prime and thus p_j > 1, this is impossible. Thus there cannot be finitely many primes, and so X is countable. □
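The construction m = p_1 ··· p_t + 1 can be tried on a concrete finite list of primes. The following is a minimal sketch, assuming trial division is good enough for small inputs; the function names are hypothetical. Note that m itself need not be prime; the argument only needs some prime q dividing m.

```python
def smallest_prime_factor(m):
    """Return the least prime dividing m, for an integer m > 1 (trial division)."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m  # no divisor up to sqrt(m), so m is prime

def new_prime_from(primes):
    """Given a finite list of primes, return a prime q dividing p_1 * ... * p_t + 1."""
    m = 1
    for p in primes:
        m *= p
    m += 1
    return smallest_prime_factor(m)

primes = [2, 3, 5, 7, 11, 13]
q = new_prime_from(primes)
print(q, q in primes)  # 59 False, since 30031 = 59 * 509
```

The prime q found this way is never in the list it was built from, which is exactly the contradiction used in the proof.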

Given a, b, c ∈ Z, we can use Euclid’s algorithm to find all x, y ∈ Z so that ax + by = c. Before we prove the general theorem, let us consider a specific example.

Example: We want to construct all x, y ∈ Z so that 6x + 8y = 2.

First note that for x, y ∈ Z, we have 6x + 8y = 2 if and only if we have 3x + 4y = 1. Since hcf(3, 4) = 1, we know (by Euclid’s algorithm) that ∃s, t ∈ Z so that 3s + 4t = 1. (By inspection, we see that 3 · (−1) + 4 · 1 = 1, so in this case we don’t need to use Euclid’s algorithm to find s, t ∈ Z so that 3s + 4t = 1.) Now suppose we also have x, y ∈ Z so that 3x + 4y = 1. Hence 3s + 4t = 3x + 4y, so 3(s − x) = 4(y − t). Thus 3 | 4(y − t), and since hcf(3, 4) = 1, 3 | (y − t). Hence ∃k ∈ Z so that y − t = 3k, or equivalently, y = t + 3k. A virtually identical argument shows that 4 | (s − x), so ∃k′ ∈ Z so that x = s − 4k′. Therefore

3s + 4t = 3x + 4y = 3(s − 4k′) + 4(t + 3k),

hence 0 = −12k′ + 12k, or equivalently, k′ = k. In summary, we have shown that if s, t, x, y ∈ Z so that 3s + 4t = 1 = 3x + 4y, then ∃k ∈ Z so that x = s − 4k and y = t + 3k.

On the other hand, suppose 3s + 4t = 1 (which is the case when s = −1, t = 1). Take any k ∈ Z and set x = s − 4k, y = t + 3k. Then

3x + 4y = 3s + 4t = 1.

So 3x + 4y = 1 if and only if x = −1 − 4k, y = 1 + 3k for some k ∈ Z, and hence 6x + 8y = 2 if and only if x = −1 − 4k, y = 1 + 3k for some k ∈ Z.
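The conclusion of the example can be sanity-checked by brute force on a bounded range. This sketch is only a finite check, not a proof; the box size 20 and the function names are assumptions made for illustration.

```python
def solutions_in_box(bound):
    """All integer pairs (x, y) with |x|, |y| <= bound and 6x + 8y = 2."""
    return {(x, y)
            for x in range(-bound, bound + 1)
            for y in range(-bound, bound + 1)
            if 6 * x + 8 * y == 2}

def parametrised_in_box(bound):
    """Pairs (-1 - 4k, 1 + 3k), k an integer, that land in the same box."""
    pairs = set()
    for k in range(-bound, bound + 1):  # |k| <= bound covers every pair in the box
        x, y = -1 - 4 * k, 1 + 3 * k
        if abs(x) <= bound and abs(y) <= bound:
            pairs.add((x, y))
    return pairs

print(solutions_in_box(20) == parametrised_in_box(20))  # True
```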


More generally, we have the following proposition and corollary, which one proves as exercises.

Proposition 10.3. Fix a, b, c ∈ Z so that a, b ≠ 0. Let d = hcf(a, b). Take a′, b′ ∈ Z so that a = da′ and b = db′.

(a) If d ∤ c then there do not exist x, y ∈ Z so that ax + by = c.
(b) Suppose d | c. Then ∃s, t ∈ Z so that as + bt = c. Also, for x, y ∈ Z, we have ax + by = c if and only if ∃k ∈ Z so that x = s − b′k and y = t + a′k.
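One way to compute the s, t of Proposition 10.3(b) is the extended form of Euclid's algorithm. The sketch below is a standard iterative version and not necessarily the presentation used in lectures; the names extended_gcd, solve_linear and solution are hypothetical, and a, b are assumed nonzero as in the proposition.

```python
def extended_gcd(a, b):
    """Return (d, s, t) with d = hcf(a, b) >= 0 and a*s + b*t = d."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        # Each row keeps the invariant: remainder = a*(s entry) + b*(t entry).
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    if old_r < 0:  # normalise so that the hcf is positive
        old_r, old_s, old_t = -old_r, -old_s, -old_t
    return old_r, old_s, old_t

def solve_linear(a, b, c):
    """Return None if hcf(a, b) does not divide c; otherwise (s, t, a', b')
    with a*s + b*t = c, a = d*a' and b = d*b'."""
    d, s0, t0 = extended_gcd(a, b)
    if c % d != 0:
        return None
    scale = c // d
    return s0 * scale, t0 * scale, a // d, b // d

def solution(a, b, c, k):
    """The solution x = s - b'k, y = t + a'k (assumes hcf(a, b) divides c)."""
    s, t, a1, b1 = solve_linear(a, b, c)
    return s - b1 * k, t + a1 * k

for k in range(-2, 3):
    x, y = solution(6, 8, 2, k)
    print(k, x, y, 6 * x + 8 * y)  # the last column is always 2
```

The particular s, t returned need not match ones found by inspection; by part (b) any two choices differ by a shift in k.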

Corollary 10.4. Fix a, b, n ∈ Z so that n ≥ 1. There exists x ∈ Z so that ax ≡ b (mod n) if and only if hcf(a, n) | b.
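The criterion in Corollary 10.4 can be compared against a direct search over residues for small values. This is a finite check rather than a proof; the ranges and the function names below are assumptions chosen for illustration.

```python
from math import gcd

def has_solution(a, b, n):
    """Search x = 0, ..., n-1 for a solution of a*x ≡ b (mod n)."""
    return any((a * x - b) % n == 0 for x in range(n))

def criterion(a, b, n):
    """The test from Corollary 10.4: hcf(a, n) divides b."""
    return b % gcd(a, n) == 0

agree = all(has_solution(a, b, n) == criterion(a, b, n)
            for n in range(1, 13)
            for a in range(-12, 13)
            for b in range(-12, 13))
print(agree)  # True
```

Searching only x = 0, ..., n−1 suffices because ax (mod n) depends only on the residue of x modulo n.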

Proposition 10.5. Suppose m ∈ Z+ with m ≥ 2 and A_1, ..., A_m are nonempty, finite sets.

(a) Suppose A_1, ..., A_m are pairwise disjoint, meaning that for i, j ∈ Z+ with i, j ≤ m and i ≠ j, we have A_i ∩ A_j = ∅. Then

|A_1 ∪ ··· ∪ A_m| = |A_1| + ··· + |A_m|.

(b) |A_1 × ··· × A_m| = |A_1| ··· |A_m|.

Proof. We prove (b) and leave (a) as an exercise.

(b) We argue by induction on m.

[Base case.] Let |A_1| = s, |A_2| = t. Thus we know there exist bijections f : {1, 2, ..., s} → A_1 and g : {1, 2, ..., t} → A_2. For i, j ∈ Z+ with i ≤ s, j ≤ t, set a_i = f(i) and set b_j = g(j). Notice that since f and g are bijections, a_1, a_2, ..., a_s are distinct and b_1, b_2, ..., b_t are distinct.

We define h : {1, 2, ..., st} → A_1 × A_2 as follows. Take n ∈ {1, 2, ..., st}. Recall that as a consequence of the division algorithm for Z, ∃!q, r ∈ Z so that n = tq + r where 1 ≤ r ≤ t. Note that since n ≥ 1, we must have q ≥ 0, for if q < 0 then q ≤ −1 and n ≤ −t + r ≤ 0. Also note that q < s, else st + 1 ≤ tq + r = n ≤ st, which is impossible. Hence a_{q+1} ∈ A_1, and b_r ∈ A_2. We define

h(n) = (a_{q+1}, b_r) where q, r ∈ Z so that n = tq + r where 1 ≤ r ≤ t.

Since the conditions on q and r determine them uniquely, there is no ambiguity in the meaning of h(n), or in other words, h is well-defined.

We need to show that h is bijective. Suppose first that m, n ∈ {1, 2, ..., st} so that h(m) = h(n). Take the unique q, r, q′, r′ ∈ Z so that n = tq + r, m = tq′ + r′ where 1 ≤ r ≤ t, 1 ≤ r′ ≤ t. Then (a_{q′+1}, b_{r′}) = h(m) = h(n) = (a_{q+1}, b_r). Thus we have a_{q′+1} = a_{q+1} and b_{r′} = b_r; hence q′ + 1 = q + 1 (since a_1, a_2, ..., a_s are distinct) and r′ = r (since b_1, b_2, ..., b_t are distinct). Thus m = tq′ + r′ = tq + r = n, showing that h is injective.

Now take an arbitrary element (a_i, b_j) ∈ A_1 × A_2; thus 1 ≤ i ≤ s and 1 ≤ j ≤ t, so 1 ≤ t(i − 1) + j ≤ st. Hence with n = t(i − 1) + j, we have h(n) = (a_i, b_j), showing that h is surjective. Thus h is bijective. So |A_1 × A_2| = st = |A_1||A_2|.

[Induction step.] Suppose that k ∈ Z with k ≥ 2, and suppose that |A_1 × ··· × A_k| = |A_1| ··· |A_k|. Set A = A_1 × ··· × A_k. Thus

|A_1 × ··· × A_k × A_{k+1}| = |A × A_{k+1}|,

and by the base case, we know |A × A_{k+1}| = |A| · |A_{k+1}|. So using the induction hypothesis, we get

|A_1 × ··· × A_k × A_{k+1}| = |A| · |A_{k+1}| = |A_1| ··· |A_k| · |A_{k+1}|.

Hence by the principle of mathematical induction, (b) holds for all m ∈ Z with m ≥ 2. □
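The bijection h from the base case can be written out for small sets. The sketch below is an illustration only: the example sets and the function name h are assumptions, the lists are assumed to have distinct entries, and Python's 0-based indexing means a_{q+1} is stored at index q.

```python
def h(n, A1, A2):
    """Map n in {1, ..., s*t} to (a_{q+1}, b_r), where n = t*q + r and 1 <= r <= t."""
    t = len(A2)
    # divmod on n - 1 gives n - 1 = t*q + (r - 1) with 0 <= r - 1 < t,
    # which is the same as n = t*q + r with 1 <= r <= t.
    q, r_minus_1 = divmod(n - 1, t)
    return A1[q], A2[r_minus_1]

A1 = ["a1", "a2", "a3"]  # s = 3
A2 = ["b1", "b2"]        # t = 2
pairs = [h(n, A1, A2) for n in range(1, len(A1) * len(A2) + 1)]
print(pairs)
print(len(set(pairs)) == len(A1) * len(A2))  # True: h hits each pair exactly once
```

Listing the pairs shows h filling A_1 × A_2 row by row, which is why n = t(i − 1) + j is the preimage of (a_i, b_j).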


Proposition 10.6. The union of countably many nonempty, pairwise disjoint finite sets is countable.

Proposition 10.7. Suppose A and B are nonempty finite sets with |A| = |B|.

(a) Suppose f : A → B is injective. Then f is bijective.
(b) Suppose f : A → B is surjective. Then f is bijective.

Proof. Let n ∈ Z+ so that n = |A|. So |B| = n, and there are bijections g : {1, 2, ..., n} → A and h : {1, 2, ..., n} → B; for i ∈ {1, 2, ..., n}, set a_i = g(i), b_i = h(i). Since g is injective, this means a_1, ..., a_n are distinct; similarly, b_1, ..., b_n are distinct.

(a) Suppose f : A → B is injective. Then f(a_1), ..., f(a_n) are distinct, so |f(A)| = |A| = n. Since f(A) ⊆ B and |f(A)| = |B| = n < ∞, we must have f(A) = B. Thus f is surjective and hence bijective.

(b) Suppose f : A → B is surjective. For the sake of contradiction, suppose f is not injective. Thus ∃i, j ∈ {1, 2, ..., n} so that a_i ≠ a_j but f(a_i) = f(a_j). Since a_i ≠ a_j, we have i ≠ j. Thus

f(A) = {f(a_k) : k ∈ Z, 1 ≤ k ≤ n, k ≠ j}.

So |f(A)| ≤ n − 1 < n. But since f is surjective, we know f(A) = B and hence |f(A)| = |B| = n. This gives us a contradiction, so we must have that f is injective and hence bijective. □
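For a small n, Proposition 10.7 can be checked over every function between two n-element sets. This is a finite check and not a substitute for the proof; the choice n = 4 and the helper names are assumptions for illustration.

```python
from itertools import product

def is_injective(f):
    """f is a tuple (f(0), ..., f(n-1)); injective means no repeated values."""
    return len(set(f)) == len(f)

def is_surjective(f, codomain_size):
    """Surjective onto {0, ..., codomain_size - 1}."""
    return set(f) == set(range(codomain_size))

n = 4
# Every function {0, ..., n-1} -> {0, ..., n-1} appears once as a tuple of values.
agree = all(is_injective(f) == is_surjective(f, n)
            for f in product(range(n), repeat=n))
print(agree)  # True
```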

Finally, we offer a “party trick” based on the theory presented in these notes.

Take x ∈ Z+. Written as a decimal expansion, we write x as

a_m a_{m−1} ··· a_1 a_0

where a_i ∈ {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} for 0 ≤ i ≤ m. Thus

x = ∑_{i=0}^{m} a_i 10^i.

We know 10 ≡ 1 (mod 9). One uses induction to show that for all i ∈ Z+, we have 10^i ≡ 1 (mod 9). Hence, again using induction, one shows that

∑_{i=0}^{m} a_i 10^i ≡ ∑_{i=0}^{m} a_i (mod 9).

Recall that x is divisible by 9 if and only if x ≡ 0 (mod 9), so x is divisible by 9 if and only if the digits of x sum to a number divisible by 9.
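The divisibility test for 9 is easy to try numerically. The sketch below checks it for a range of integers; the upper limit 100000 and the function names are arbitrary assumptions.

```python
def digit_sum(x):
    """Sum of the decimal digits a_0 + a_1 + ... + a_m of a positive integer x."""
    return sum(int(ch) for ch in str(x))

def divisible_by_9(x):
    """The party trick: test the digit sum instead of x itself."""
    return digit_sum(x) % 9 == 0

# The stronger statement x ≡ (digit sum of x) (mod 9) holds as well.
print(all(x % 9 == digit_sum(x) % 9 for x in range(1, 100000)))        # True
print(all((x % 9 == 0) == divisible_by_9(x) for x in range(1, 100000)))  # True
```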


One can devise a similar party trick to test for divisibility by 11. In this case one uses that for i ∈ Z, i ≥ 0, 10^i ≡ 1 (mod 11) when i is even, and 10^i ≡ −1 (mod 11) when i is odd. [So what is the party trick?]
