
1. Sets

Set theory is the foundation of probability and statistics, as it is for almost every branch of mathematics.

Sets and subsets

First, a set is simply a collection of objects; the objects are referred to as elements of the set. The statement that x is an element of set S is written x∈S, and the negation that x is not an element of S is written as x∉S. By definition, a set is completely determined by its elements; thus sets A and B are equal if they have the same elements:

A=B if and only if x∈A⟺x∈B

If A and B are sets then A is a subset of B if every element of A is also an element of B:

A⊆B if and only if x∈A⟹x∈B

Concepts in set theory are often illustrated with small, schematic sketches known as Venn diagrams, named for John Venn. The Venn diagram in the picture below illustrates the subset relation.

A⊆B

Membership is a primitive, undefined concept in naive set theory, the type we are studying here. However, the following construction, known as Russell's paradox, after the mathematician and philosopher Bertrand Russell, shows that we cannot be too cavalier in the construction of sets.

Let R be the set of all sets A such that A∉A. Then R∈R if and only if R∉R.

Proof:

The contradiction follows from the definition of R: If R∈R, then by definition, R∉R. If R∉R, then by definition, R∈R. The net result, of course, is that R is not a well-defined set.

Usually, the sets under discussion in a particular context are all subsets of a well-defined, specified set S, often called a universal set. The use of a universal set prevents the type of problem that arises in Russell's paradox. That is, if S is a given set and p(x) is a predicate on S (that is, a valid mathematical statement that is either true or false for each x∈S), then {x∈S:p(x)} is a valid subset of S. Defining a set in this way is known as predicate form. The other basic way to define a set is simply by listing its elements; this method is known as list form.

In contrast to a universal set, the empty set, denoted ∅, is the set with no elements.

∅⊆A for every set A.

Proof:


∅⊆A means that x∈∅⟹x∈A. Since the premise is false, the implication is true.

Suppose that A, B and C are subsets of a set S. The subset relation is a partial order on the collection of subsets of S. That is,

1. A⊆A (the reflexive property).

2. If A⊆B and B⊆A then A=B (the anti-symmetric property).

3. If A⊆B and B⊆C then A⊆C (the transitive property).

If A⊆B and A≠B, we sometimes write A⊂B, and A is said to be a strict subset of B. If ∅⊂A⊂B, then A is called a proper subset of B. If S is a set, then the set of all subsets of S is known as the power set of S and is denoted P(S).
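These definitions translate directly into Python, which can be handy for experimenting; a minimal sketch (the sets A, B below are illustrative, not part of the lecture) that checks the subset relations and lists a power set:

```python
from itertools import chain, combinations

A = {1, 2}
B = {1, 2, 3}

print(A <= B)   # True: A is a subset of B
print(A < B)    # True: A is a strict subset of B

def power_set(s):
    """Return all subsets of s as frozensets."""
    items = list(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

print(power_set({1, 2, 3}))   # 2^3 = 8 subsets, including the empty set
```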

Special Sets

The following special sets are used throughout this text. Defining them will also give us practice using list and predicate form.

R, the set of real numbers, and the universal set for the other subsets in this list.

N={0,1,2,…}, the set of natural numbers

N+={1,2,3,…}, the set of positive integers

Z={…,−2,−1,0,1,2,…}, the set of integers

Q={x∈R:x=m/n for some m∈Z and n∈N+}, the set of rational numbers

A={x∈R:p(x)=0 for some nonzero polynomial p with integer coefficients}, the set of algebraic numbers.

Note that N+⊂N⊂Z⊂Q⊂A⊂R. We will also occasionally need the set of complex numbers C={x+iy:x,y∈R} where i is the imaginary unit.

Set Operations

We are now ready to review the basic operations of set theory. For the following definitions, suppose that A and B are subsets of a universal set, which we will denote by S.

The union of A and B is the set obtained by combining the elements of A and B.

A∪B={x∈S:x∈A or x∈B}

The intersection of A and B is the set of elements common to both A and B:

A∩B={x∈S:x∈A and x∈B}

If the intersection of sets A and B is empty, then A and B are said to be disjoint. That is, A and B are disjoint if A∩B=∅


The set difference of B and A is the set of elements that are in B but not in A:

B∖A={x∈S:x∈B and x∉A}

Sometimes (particularly in older works and particularly when A⊆B), the notation B−A is used instead of B∖A. When A⊆B, B−A is known as proper set difference.

The complement of A is the set of elements that are not in A:

Ac={x∈S:x∉A}

Note that union, intersection, and difference are binary set operations, while complement is a unary set operation.
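The four operations correspond directly to Python's built-in set operators; a minimal sketch with an assumed universal set S={1,…,10} and illustrative subsets A, B:

```python
S = set(range(1, 11))     # assumed universal set {1, ..., 10}
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7}

print(A | B)              # union:        {1, 2, 3, 4, 5, 6, 7}
print(A & B)              # intersection: {4, 5}
print(B - A)              # set difference B \ A: {6, 7}
print(S - A)              # complement of A relative to S
print(A & B == set())     # False: A and B are not disjoint
```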

In the Venn diagram applet, select each of the following and note the shaded area in the diagram.

1. A

2. B

3. Ac

4. Bc

5. A∪B

6. A∩B

Basic Rules

In the following problems, A, B, and C are subsets of a universal set S. The proofs are straightforward, and just use the definitions and basic logic. Try the proofs yourself before reading the ones in the text.

A∩B⊆A⊆A∪B.

The identity laws:

1. A∪∅=A

2. A∩S=A

The idempotent laws:

1. A∪A=A

2. A∩A=A

The complement laws:

1. A∪Ac=S

2. A∩Ac=∅

The double complement law: (Ac)c=A


The commutative laws:

1. A∪B=B∪A

2. A∩B=B∩A

Proof:

These results follow from the commutativity of the or and and logical operators.

The associative laws:

1. A∪(B∪C)=(A∪B)∪C

2. A∩(B∩C)=(A∩B)∩C

Proof:

These results follow from the associativity of the or and and logical operators.

Thus, we can write A∪B∪C without ambiguity. Note that x is an element of this set if and only if x is an element of at least one of the three given sets. Similarly, we can write A∩B∩C without ambiguity. Note that x is an element of this set if and only if x is an element of all three of the given sets.

The distributive laws:

1. A∩(B∪C)=(A∩B)∪(A∩C)

2. A∪(B∩C)=(A∪B)∩(A∪C)

Proof: 1. x∈A∩(B∪C) if and only if x∈A and x∈B∪C, if and only if x∈A and either x∈B or x∈C, if and only if x∈A and x∈B, or x∈A and x∈C, if and only if x∈A∩B or x∈A∩C, if and only if x∈(A∩B)∪(A∩C).

2. The proof is exactly the same as (a), but with or and and interchanged.

DeMorgan's laws (named after Augustus De Morgan):

1. (A∪B)c=Ac∩Bc

2. (A∩B)c=Ac∪Bc.

Proof: 1. x∈(A∪B)c if and only if x∉A∪B, if and only if x∉A and x∉B, if and only if x∈Ac and x∈Bc, if and only if x∈Ac∩Bc.

2. x∈(A∩B)c if and only if x∉A∩B, if and only if x∉A or x∉B, if and only if x∈Ac or x∈Bc, if and only if x∈Ac∪Bc.
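These identities are easy to spot-check by brute force; a minimal Python sketch (illustrative only) that verifies both De Morgan's laws over every pair of subsets of a small universal set:

```python
from itertools import chain, combinations

S = {1, 2, 3, 4}
subsets = [set(c) for c in chain.from_iterable(
    combinations(S, r) for r in range(len(S) + 1))]

# Check (A ∪ B)^c = A^c ∩ B^c and (A ∩ B)^c = A^c ∪ B^c for all pairs
assert all((S - (A | B)) == (S - A) & (S - B) and
           (S - (A & B)) == (S - A) | (S - B)
           for A in subsets for B in subsets)
print("De Morgan's laws hold for all pairs of subsets of S")
```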

The following statements are equivalent:

1. A⊆B

2. Bc⊆Ac


3. A∪B=B

4. A∩B=A

5. A∖B=∅

Proof:

1. Recall that A⊆B means that x∈A⟹x∈B.

2. Bc⊆Ac means that x∉B⟹x∉A. This is the contrapositive of (a) and hence is equivalent to (a).

3. If A⊆B then clearly A∪B=B. Conversely suppose A∪B=B. If x∈A then x∈A∪B so x∈B. Hence A⊆B.

4. If A⊆B then clearly A∩B=A. Conversely suppose A∩B=A. If x∈A then x∈A∩B and so x∈B. Hence A⊆B.

5. Suppose A⊆B. If x∈A then x∈B and so by definition, x∉A∖B. If x∉A then again by definition, x∉A∖B. Thus A∖B=∅. Conversely suppose that A∖B=∅. If x∈A then x∉A∖B so x∈B. Thus A⊆B.

In addition to the special sets mentioned earlier, we also have

R∖Q, the set of irrational numbers R∖A, the set of transcendental numbers

Since Q⊂A⊂R it follows that R∖A⊂R∖Q, that is, every transcendental number is also irrational.

Set difference can be expressed in terms of complement and intersection. All of the other set operations (complement, union, and intersection) can be expressed in terms of difference.

Results for set difference:

1. B∖A=B∩Ac

2. Ac=S∖A

3. A∩B=A∖(A∖B)

4. A∪B=S∖{(S∖A)∖[(S∖A)∖(S∖B)]}

Proof: 1. This is clear from the definition: B∖A=B∩Ac={x∈S:x∈B and x∉A}.

2. This follows from (a) with B=S.

3. Using (a), DeMorgan's law, and the distributive law, the right side is

A∩(A∩Bc)c=A∩(Ac∪B)=(A∩Ac)∪(A∩B)=∅∪(A∩B)=A∩B

4. Using (a), (b), DeMorgan's law, and the distributive law, the right side is

[Ac∩(Ac∩B)c]c=A∪(Ac∩B)=(A∪Ac)∩(A∪B)=S∩(A∪B)=A∪B


So in principle, we could do all of set theory using the one operation of set difference. But as (c) and (d) suggest, the results would be hideous.

(A∪B)∖(A∩B)=(A∖B)∪(B∖A).

Proof:

A direct proof is simple, but for practice let's give a proof using set algebra, in particular, DeMorgan's law, and the distributive law:

(A∪B)∖(A∩B)=(A∪B)∩(A∩B)c=(A∪B)∩(Ac∪Bc)=(A∩Ac)∪(B∩Ac)∪(A∩Bc)∪(B∩Bc)=∅∪(B∖A)∪(A∖B)∪∅=(A∖B)∪(B∖A)

The set in the previous result is called the symmetric difference of A and B, and is sometimes denoted A△B. The elements of this set belong to one but not both of the given sets. Thus, the symmetric difference corresponds to exclusive or in the same way that union corresponds to inclusive or. That is, x∈A∪B if and only if x∈A or x∈B (or both); x∈A△B if and only if x∈A or x∈B, but not both. On the other hand, the complement of the symmetric difference consists of the elements that belong to both or neither of the given sets:

(A△B)c=(A∩B)∪(Ac∩Bc)=(Ac∪B)∩(Bc∪A)

Proof:

Again, a direct proof is simple, but let's give an algebraic proof for practice:

(A△B)c=[(A∪B)∩(A∩B)c]c=(A∪B)c∪(A∩B)=(Ac∩Bc)∪(A∩B)=(Ac∪A)∩(Ac∪B)∩(Bc∪A)∩(Bc∪B)=S∩(Ac∪B)∩(Bc∪A)∩S=(Ac∪B)∩(Bc∪A)
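Python's ^ operator computes symmetric differences directly, so both of the last two results can be spot-checked numerically; a minimal sketch with illustrative sets A, B:

```python
S = set(range(10))
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

# A △ B in two equivalent forms
assert A ^ B == (A | B) - (A & B) == (A - B) | (B - A)
# complement of the symmetric difference
assert S - (A ^ B) == (A & B) | ((S - A) & (S - B))
print(A ^ B)    # {1, 2, 5, 6}: elements in exactly one of A, B
```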

There are 16 different (in general) sets that can be constructed from two given sets A and B.

Proof:

S is the union of 4 pairwise disjoint sets: A∩B, A∩Bc, Ac∩B, and Ac∩Bc. If A and B are in general position, these 4 sets are distinct. Every set that can be constructed from A and B is a union of some (perhaps none, perhaps all) of these 4 sets. There are 2⁴=16 sub-collections of the 4 sets.
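The counting argument can be checked by brute force; the sketch below (with illustrative sets A, B in general position) forms the union of every sub-collection of the four atoms and confirms that 16 distinct sets result:

```python
from itertools import chain, combinations

S = set(range(8))
A = {0, 1, 2, 3}
B = {2, 3, 4, 5}

atoms = [A & B, A - B, B - A, S - (A | B)]   # the four pairwise disjoint pieces
unions = set()
for sub in chain.from_iterable(combinations(atoms, r) for r in range(5)):
    unions.add(frozenset(set().union(*sub)))  # empty sub-collection gives ∅
print(len(unions))    # 16
```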

Open the Venn diagram applet. This applet lists the 16 sets that can be constructed from given sets A and B using the set operations.

1. Select each of the four subsets in the proof of the last exercise: A∩B, A∩Bc, Ac∩B, and Ac∩Bc. Note that these are disjoint and their union is S.

2. Select each of the other 12 sets and show how each is a union of some of the sets in (a).

General Operations

The operations of union and intersection can easily be extended to a finite or even an infinite collection of sets.

Definitions


Suppose that 𝒜 is a nonempty collection of subsets of a universal set S. In some cases, the subsets in 𝒜 may be naturally indexed by a nonempty index set I, so that 𝒜={Ai:i∈I}. (In a technical sense, any collection of subsets can be indexed.)

The union of the collection of sets 𝒜 is the set obtained by combining the elements of the sets in 𝒜:

⋃𝒜={x∈S:x∈A for some A∈𝒜}

If 𝒜={Ai:i∈I}, so that the collection of sets is indexed, then we use the more natural notation:

⋃i∈IAi={x∈S:x∈Ai for some i∈I}

The intersection of the collection of sets 𝒜 is the set of elements common to all of the sets in 𝒜:

⋂𝒜={x∈S:x∈A for all A∈𝒜}

If 𝒜={Ai:i∈I}, so that the collection of sets is indexed, then we use the more natural notation:

⋂i∈IAi={x∈S:x∈Ai for all i∈I}

Often the index set is an integer interval of N. In such cases, an even more natural notation is to use the upper and lower limits of the index set. For example, if the collection is {Ai:i∈N+} then we would write ⋃∞i=1Ai for the union and ⋂∞i=1Ai for the intersection. Similarly, if the collection is {Ai:i∈{1,2,…,n}} for some n∈N+, we would write ⋃ni=1Ai for the union and ⋂ni=1Ai for the intersection.
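For a finite indexed collection, these unions and intersections can be computed by folding the binary operations; a minimal Python sketch with the illustrative family Ai={1,…,i}:

```python
from functools import reduce

A = {i: set(range(1, i + 1)) for i in range(1, 6)}   # A_i = {1, ..., i}, i = 1..5

union_all = reduce(set.union, A.values())            # ⋃ A_i = {1, 2, 3, 4, 5}
inter_all = reduce(set.intersection, A.values())     # ⋂ A_i = {1}
print(union_all, inter_all)
```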

A collection of sets 𝒜 is pairwise disjoint if the intersection of any two sets in the collection is empty: A∩B=∅ for every A,B∈𝒜 with A≠B. The collection of sets 𝒜 is said to partition a set B if the collection 𝒜 is pairwise disjoint and ⋃𝒜=B.

Partitions are intimately related to equivalence relations.

Basic Rules

In the following problems, 𝒜={Ai:i∈I} is a collection of subsets of a universal set S, indexed by a nonempty set I, and B is a subset of S.

The general distributive laws:

1. (⋃i∈IAi)∩B=⋃i∈I(Ai∩B)

2. (⋂i∈IAi)∪B=⋂i∈I(Ai∪B)

Restate the laws in the notation where the collection 𝒜 is not indexed.

Proof: 1. x is an element of either set if and only if x∈B and x∈Ai for some i∈I.

2. x is an element of either set if and only if x∈B or x∈Ai for every i∈I.

(⋃𝒜)∩B=⋃{A∩B:A∈𝒜}, (⋂𝒜)∪B=⋂{A∪B:A∈𝒜}


The general De Morgan's laws:

1. (⋃i∈IAi)c=⋂i∈IAci

2. (⋂i∈IAi)c=⋃i∈IAci

Restate the laws in the notation where the collection 𝒜 is not indexed.

Proof: 1. x∈(⋃i∈IAi)c if and only if x∉⋃i∈IAi, if and only if x∉Ai for every i∈I, if and only if x∈Aci for every i∈I, if and only if x∈⋂i∈IAci.

2. x∈(⋂i∈IAi)c if and only if x∉⋂i∈IAi, if and only if x∉Ai for some i∈I, if and only if x∈Aci for some i∈I, if and only if x∈⋃i∈IAci.

(⋃𝒜)c=⋂{Ac:A∈𝒜}, (⋂𝒜)c=⋃{Ac:A∈𝒜}

Suppose that the collection 𝒜 partitions S. For any subset B, the collection {A∩B:A∈𝒜} partitions B.

Proof:

Suppose 𝒜={Ai:i∈I} where I is an index set. If i,j∈I with i≠j then (Ai∩B)∩(Aj∩B)=(Ai∩Aj)∩B=∅∩B=∅, so the collection {Ai∩B:i∈I} is pairwise disjoint. Moreover, by the distributive law,

⋃i∈I(Ai∩B)=(⋃i∈IAi)∩B=S∩B=B

A partition of S induces a partition of B

Suppose that {Ai:i∈N+} is a collection of subsets of a universal set S.

1. ⋂∞n=1⋃∞k=nAk={x∈S:x∈Ak for infinitely many k∈N+}

2. ⋃∞n=1⋂∞k=nAk={x∈S:x∈Ak for all but finitely many k∈N+}

Proof: 1. Note that x∈⋂∞n=1⋃∞k=nAk if and only if for every n∈N+ there exists k≥n such that x∈Ak. In turn, this occurs if and only if x∈Ak for infinitely many k∈N+.

2. Note that x∈⋃∞n=1⋂∞k=nAk if and only if there exists n∈N+ such that x∈Ak for every k≥n. In turn, this occurs if and only if x∈Ak for all but finitely many k∈N+.

The sets in the previous result turn out to be important in the study of probability.

Product sets

Definitions


Product sets are sets of sequences. The defining property of a sequence, of course, is that order as well as membership is important.

Let us start with ordered pairs. In this case, the defining property is that (a,b)=(c,d) if and only if a=c and b=d. Interestingly, the structure of an ordered pair can be defined just using set theory. The construction in the result below is due to Kazimierz Kuratowski.

Define (a,b)={{a},{a,b}}. This definition captures the defining property of an ordered pair.

Proof:

Suppose that (a,b)=(c,d) so that {{a},{a,b}}={{c},{c,d}}. In the case that a=b note that (a,b)={{a}}. Thus we must have {c}={c,d}={a} and hence c=d=a, and in particular, a=c and b=d. In the case that a≠b, we must have {c}={a} and hence c=a. But we cannot have {c,d}={a} because then (c,d)={{a}} and hence {a,b}={a}, which would force a=b, a contradiction. Thus we must have {c,d}={a,b}. Since c=a and a≠b we must have d=b. The converse is trivial: if a=b and c=d then {a}={c} and {a,b}={c,d} so (a,b)=(c,d).

For ordered triples, the defining property is (a,b,c)=(d,e,f) if and only if a=d, b=e, and c=f. Ordered triples can be defined in terms of ordered pairs, which via the last result, uses only set theory.

Define (a,b,c)=(a,(b,c)). This definition captures the defining property of an ordered triple.

Proof:

Suppose (a,b,c)=(d,e,f). Then (a,(b,c))=(d,(e,f)). Hence by the definition of an ordered pair, we must have a=d and (b,c)=(e,f). Using the definition again we have b=e and c=f. Conversely, if a=d, b=e, and c=f, then (b,c)=(e,f) and hence (a,(b,c))=(d,(e,f)). Thus (a,b,c)=(d,e,f).

All of this is just to show how complicated structures can be built from simpler ones, and ultimately from set theory. But enough of that! More generally, two ordered sequences of the same size (finite or infinite) are the same if and only if their corresponding coordinates agree. Thus for n∈N+, the definition for n-tuples is (x1,x2,…,xn)=(y1,y2,…,yn) if and only if xi=yi for all i∈{1,2,…,n}. For infinite sequences, (x1,x2,…)=(y1,y2,…) if and only if xi=yi for all i∈N+.

Suppose now that we have a sequence of n sets, (S1,S2,…,Sn), where n∈N+. The Cartesian product of the sets is defined as follows:

S1×S2×⋯×Sn={(x1,x2,…,xn):xi∈Si for i∈{1,2,…,n}}

Cartesian products are named for René Descartes. If Si=S for each i, then the Cartesian product set can be written compactly as Sn, a Cartesian power. In particular, recall that R denotes the set of real numbers so that Rn is n-dimensional Euclidean space, named after Euclid, of course. The elements of {0,1}n are called bit strings of length n. As the name suggests, we sometimes represent elements of this product set as strings rather than sequences (that is, we omit the parentheses and commas). Since the coordinates just take two values, there is no risk of confusion.
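Finite Cartesian products and Cartesian powers are easy to enumerate; a minimal Python sketch using itertools.product (the sets S1, S2 are illustrative):

```python
from itertools import product

S1, S2 = {1, 2}, {'a', 'b', 'c'}
print(list(product(S1, S2)))        # S1 × S2: 2 * 3 = 6 ordered pairs

# {0,1}^3 written as bit strings rather than tuples
bits = [''.join(b) for b in product('01', repeat=3)]
print(bits)                         # '000', '001', ..., '111'
```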

Suppose that we have an infinite sequence of sets (S1,S2,…). The Cartesian product of the sets is defined by


S1×S2×⋯={(x1,x2,…):xi∈Si for each i∈{1,2,…}}

When Si=S for i∈N+, the Cartesian product set is sometimes written as a Cartesian power as S∞ or as SN+. An explanation for the last notation is given in the next section on functions. Also, notation similar to that of general union and intersection is often used for Cartesian product, with × as the operator. Thus

×ni=1Si=S1×S2×⋯×Sn,×∞i=1Si=S1×S2×⋯

Rules for Product Sets

We will now see how the set operations relate to the Cartesian product operation. Suppose that S and T are sets and that A⊆S, B⊆S and C⊆T, D⊆T. The sets in the exercises below are subsets of S×T.

The most important rules that relate Cartesian product with union, intersection, and difference are the distributive rules:

1. A×(C∪D)=(A×C)∪(A×D)

2. (A∪B)×C=(A×C)∪(B×C)

3. A×(C∩D)=(A×C)∩(A×D)

4. (A∩B)×C=(A×C)∩(B×C)

5. A×(C∖D)=(A×C)∖(A×D)

6. (A∖B)×C=(A×C)∖(B×C)

Proof: 1. (x,y)∈A×(C∪D) if and only if x∈A and y∈C∪D, if and only if x∈A and either y∈C or y∈D, if and only if x∈A and y∈C, or x∈A and y∈D, if and only if (x,y)∈A×C or (x,y)∈A×D, if and only if (x,y)∈(A×C)∪(A×D).

2. Similar to (a), but with the roles of the coordinates reversed.

3. (x,y)∈A×(C∩D) if and only if x∈A and y∈C∩D if and only if x∈A and y∈C and y∈D if and only if (x,y)∈A×C and (x,y)∈A×D if and only if (x,y)∈(A×C)∩(A×D).

4. Similar to (c) but with the roles of the coordinates reversed.

5. (x,y)∈A×(C∖D) if and only if x∈A and y∈C∖D if and only if x∈A and y∈C and y∉D if and only if (x,y)∈A×C and (x,y)∉A×D if and only if (x,y)∈(A×C)∖(A×D).

6. Similar to (e) but with the roles of the coordinates reversed.

In general, the product of unions is larger than the corresponding union of products.

1. (A∪B)×(C∪D)=(A×C)∪(A×D)∪(B×C)∪(B×D)

2. (A×C)∪(B×D)⊆(A∪B)×(C∪D)

Proof:


1. (x,y)∈(A∪B)×(C∪D) if and only if x∈A∪B and y∈C∪D if and only if at least one of the following is true: x∈A and y∈C, x∈A and y∈D, x∈B and y∈C, x∈B and y∈D if and only if (x,y)∈(A×C)∪(A×D)∪(B×C)∪(B×D)

2. This follows from (a). Note that the left side can be a proper subset of the right side: if x∈A∖B and y∈D∖C, for example, then (x,y) is in (A∪B)×(C∪D) but not in (A×C)∪(B×D).

On the other hand, the product of intersections is the same as the corresponding intersection of products.

(A×C)∩(B×D)=(A∩B)×(C∩D)

Proof:

(x,y)∈(A×C)∩(B×D) if and only if (x,y)∈A×C and (x,y)∈B×D if and only if x∈A and y∈C and x∈B and y∈D if and only if x∈A∩B and y∈C∩D if and only if (x,y)∈(A∩B)×(C∩D).

In general, the product of differences is smaller than the corresponding difference of products.

1. (A∖B)×(C∖D)=[(A×C)∖(A×D)]∖[(B×C)∖(B×D)]

2. (A∖B)×(C∖D)⊆(A×C)∖(B×D)

Proof: 1. (x,y)∈(A∖B)×(C∖D) if and only if x∈A∖B and y∈C∖D, if and only if x∈A and x∉B and y∈C and y∉D. On the other hand, (x,y)∈[(A×C)∖(A×D)]∖[(B×C)∖(B×D)] if and only if (x,y)∈(A×C)∖(A×D) and (x,y)∉(B×C)∖(B×D). The first statement means that x∈A and y∈C and y∉D. The second statement is the negation of x∈B and y∈C and y∉D. The two statements both hold if and only if x∈A and x∉B and y∈C and y∉D.

2. As in (a), (x,y)∈(A∖B)×(C∖D) if and only if x∈A and x∉B and y∈C and y∉D. This implies that (x,y)∈A×C and (x,y)∉(B×D).

Projections and Cross Sections

Suppose again that S and T are sets, and that C⊆S×T. The cross section of C at x∈S is

Cx={y∈T:(x,y)∈C}

Similarly the cross section of C at y∈T is

Cy={x∈S:(x,y)∈C}

Note that Cx⊆T for x∈S and Cy⊆S for y∈T. We define the projections of C onto T and onto S by

CT={y∈T:(x,y)∈C for some x∈S},CS={x∈S:(x,y)∈C for some y∈T}

The projections are the unions of the appropriate cross sections.

1. CT=⋃x∈SCx

2. CS=⋃y∈TCy
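For a subset C of a finite product S×T stored as a set of ordered pairs, cross sections and projections are one-line computations; a minimal sketch with an illustrative C:

```python
S = {1, 2, 3}
T = {'a', 'b'}
C = {(1, 'a'), (1, 'b'), (3, 'a')}        # a subset of S × T

def cross_section_x(C, x):
    return {y for (u, y) in C if u == x}  # the cross section of C at x, a subset of T

def cross_section_y(C, y):
    return {x for (x, v) in C if v == y}  # the cross section of C at y, a subset of S

proj_T = {y for (x, y) in C}              # projection of C onto T
proj_S = {x for (x, y) in C}              # projection of C onto S
print(cross_section_x(C, 1), proj_T, proj_S)
```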

Cross sections are preserved under the set operations. We state the result for cross sections at x∈S. By symmetry, of course, analogous results hold for cross sections at y∈T.

Suppose that C,D⊆S×T. Then for x∈S,


1. (C∪D)x=Cx∪Dx

2. (C∩D)x=Cx∩Dx

3. (C∖D)x=Cx∖Dx

Proof: 1. y∈(C∪D)x if and only if (x,y)∈C∪D, if and only if (x,y)∈C or (x,y)∈D, if and only if y∈Cx or y∈Dx.

2. The proof is just like (a), with and replacing or.

3. The proof is just like (a), with and not replacing or.

For projections, the results are a bit more complicated. We give the results for projections onto T; naturally the results for projections onto S are analogous.

Suppose again that C,D⊆S×T. Then

1. (C∪D)T=CT∪DT

2. (C∩D)T⊆CT∩DT

3. (CT)c⊆(Cc)T

Proof: 1. Suppose that y∈(C∪D)T. Then there exists x∈S such that (x,y)∈C∪D. Hence (x,y)∈C so y∈CT, or (x,y)∈D so y∈DT. In either case, y∈CT∪DT. Conversely, suppose that y∈CT∪DT. Then y∈CT or y∈DT. If y∈CT then there exists x∈S such that (x,y)∈C. But then (x,y)∈C∪D so y∈(C∪D)T. Similarly if y∈DT then y∈(C∪D)T.

2. Suppose that y∈(C∩D)T. Then there exists x∈S such that (x,y)∈C∩D. Hence (x,y)∈C so y∈CT and (x,y)∈D so y∈DT. Therefore y∈CT∩DT.

3. Suppose that y∈(CT)c. Then y∉CT, so for every x∈S, (x,y)∉C. Fix x0∈S. Then (x0,y)∉C so (x0,y)∈Cc and therefore y∈(Cc)T.

It's easy to see that equality does not hold in general in parts (b) and (c). In part (b) for example, suppose that A1,A2⊆S are nonempty and disjoint and B⊆T is nonempty. Let C=A1×B and D=A2×B. Then C∩D=∅ so (C∩D)T=∅. But CT=DT=B. In part (c) for example, suppose that A is a nonempty proper subset of S and B is a nonempty proper subset of T. Let C=A×B. Then CT=B so (CT)c=Bc. On the other hand, Cc=(Ac×B)∪(A×Bc), so (Cc)T=T.

Computational Exercises

Subsets of R

The universal set is [0,∞). Let A=[0,5] and B=(3,7). Express each of the following in terms of intervals:

1. A∩B

2. A∪B

3. A∖B


4. B∖A

5. Ac

Answer: 1. (3,5]

2. [0,7]

3. [0,3]

4. (5,7)

5. (5,∞)

The universal set is N. Let A={n∈N:n is even} and let B={n∈N:n≤9}. Give each of the following:

1. A∩B in list form

2. A∪B in predicate form

3. A∖B in list form

4. B∖A in list form

5. Ac in predicate form

6. Bc in list form

Answer: 1. {0,2,4,6,8}

2. {n∈N:n is even or n≤9}

3. {10,12,14,…}

4. {1,3,5,7,9}

5. {n∈N:n is odd}

6. {10,11,12,…}

Coins and Dice

Let S={1,2,3,4}×{1,2,3,4,5,6}. This is the set of outcomes when a 4-sided die and a 6-sided die are tossed. Further let A={(x,y)∈S:x=2} and B={(x,y)∈S:x+y=7}. Give each of the following sets in list form:

1. A

2. B

3. A∩B

4. A∪B


5. A∖B

6. B∖A

Answer: 1. {(2,1),(2,2),(2,3),(2,4),(2,5),(2,6)}

2. {(1,6),(2,5),(3,4),(4,3)}

3. {(2,5)}

4. {(2,1),(2,2),(2,3),(2,4),(2,5),(2,6),(1,6),(3,4),(4,3)}

5. {(2,1),(2,2),(2,3),(2,4),(2,6)}

6. {(1,6),(3,4),(4,3)}

Let S={0,1}3. This is the set of outcomes when a coin is tossed 3 times (0 denotes tails and 1 denotes heads). Further let A={(x1,x2,x3)∈S:x2=1} and B={(x1,x2,x3)∈S:x1+x2+x3=2}. Give each of the following sets in list form, using bit-string notation:

1. S

2. A

3. B

4. Ac

5. Bc

6. A∩B

7. A∪B

8. A∖B

9. B∖A

Answer: 1. {000,100,010,001,110,101,011,111}

2. {010,110,011,111}

3. {110,011,101}

4. {000,100,001,101}

5. {000,100,010,001,111}

6. {110,011}

7. {010,110,011,111,101}

8. {010,111}

9. {101}


Let S={0,1}2. This is the set of outcomes when a coin is tossed twice (0 denotes tails and 1 denotes heads). Give P(S) in list form.

Answer:

{∅,{00},{01},{10},{11},{00,01},{00,10},{00,11},{01,10},{01,11},{10,11},{00,01,10},{00,01,11},{00,10,11},{01,10,11},{00,01,10,11}}

Cards

A standard card deck can be modeled by the Cartesian product set

D={1,2,3,4,5,6,7,8,9,10,j,q,k}×{♣,♢,♡,♠}

where the first coordinate encodes the denomination or kind (ace, 2-10, jack, queen, king) and where the second coordinate encodes the suit (clubs, diamonds, hearts, spades). Sometimes we represent a card as a string rather than an ordered pair (for example q♡). For the problems in this subsection, the card deck D is the universal set.

Let H denote the set of hearts and F the set of face cards. Find each of the following:

1. H∩F

2. H∖F

3. F∖H

4. H△F

Answer: 1. {j♡,q♡,k♡}

2. {1♡,2♡,3♡,4♡,5♡,6♡,7♡,8♡,9♡,10♡}

3. {j♠,q♠,k♠,j♢,q♢,k♢,j♣,q♣,k♣}

4. {1♡,2♡,3♡,4♡,5♡,6♡,7♡,8♡,9♡,10♡,j♠,q♠,k♠,j♢,q♢,k♢,j♣,q♣,k♣}

A bridge hand is a subset of D with 13 cards. Often bridge hands are described by giving the cross sections by suit. Thus, suppose that N is a bridge hand, held by a player named North, with

N♣={2,5,q},N♢={1,5,8,q,k},N♡={8,10,j,q},N♠={1}

Find each of the following:

1. The nonempty cross sections of N by denomination.

2. The projection of N onto the set of suits.

3. The projection of N onto the set of denominations

Answer: 1. N1={♢,♠}, N2={♣}, N5={♣,♢}, N8={♢,♡}, N10={♡}, Nj={♡}, Nq={♣,♢,♡}, Nk={♢}

2. {♣,♢,♡,♠}


3. {1,2,5,8,10,j,q,k}

By contrast, it is usually more useful to describe a poker hand by giving the cross sections by denomination. In the usual version of draw poker, a hand is a subset of D with 5 cards. Thus, suppose that B is a poker hand, held by a player named Bill, with

B1={♣,♠},B8={♣,♠},Bq={♡}

Find each of the following:

1. The nonempty cross sections of B by suit.

2. The projection of B onto the set of suits.

3. The projection of B onto the set of denominations

Answer: 1. B♣={1,8}, B♡={q}, B♠={1,8}

2. {♣,♡,♠}

3. {1,8,q}

The poker hand in the last exercise is known as a dead man's hand. Legend has it that Wild Bill Hickock held this hand at the time of his murder in 1876.

General unions and intersections

For the problems in this subsection, the universal set is R.

Let An=[0,1−1/n] for n∈N+. Find

1. ⋂∞n=1An

2. ⋃∞n=1An

3. ⋂∞n=1Acn

4. ⋃∞n=1Acn

Answer: 1. {0}

2. [0,1)

3. (−∞,0)∪[1,∞)

4. R−{0}

Let An=(2−1/n,5+1/n) for n∈N+. Find

1. ⋂∞n=1An

2. ⋃∞n=1An


3. ⋂∞n=1Acn

4. ⋃∞n=1Acn

Answer: 1. [2,5]

2. (1,6)

3. (−∞,1]∪[6,∞)

4. (−∞,2)∪(5,∞)

Subsets of R2

Let T be the closed triangular region in R2 with vertices (0,0), (1,0), and (1,1). Find each of the following:

1. The cross section Tx for x∈R

2. The cross section Ty for y∈R

3. The projection of T onto the horizontal axis

4. The projection of T onto the vertical axis

Answer: 1. Tx=[0,x] for x∈[0,1], Tx=∅ otherwise

2. Ty=[y,1] for y∈[0,1], Ty=∅ otherwise

3. [0,1]

4. [0,1]


2. Functions

Functions play a central role in probability and statistics, as they do in every other branch of mathematics. For the most part, the proofs in this section are straightforward, so be sure to try them yourself before reading the ones in the text.

Definitions and Properties

Suppose that S and T are sets. Technically, a function f from S into T is a subset of the product set S×T with the property that for each element x∈S, there exists a unique element y∈T such that (x,y)∈f. We then write y=f(x). Less formally, a function f from S into T is a rule (or procedure or algorithm) that assigns to each x∈S a unique element f(x)∈T. The definition of a function as a set of ordered pairs is due to Kazimierz Kuratowski. When f is a function from S into T we write f:S→T.

A function f from S into T

The set S is the domain of f and the set T is the range space or co-domain of f. The actual range of f is the set of function values:

range(f)={y∈T:y=f(x) for some x∈S}

So range(f)⊆T. If range(f)=T, then f is said to map S onto T (instead of just into T). Thus, if f maps S onto T then for each y∈T there exists x∈S such that f(x)=y. By definition, any function maps its domain onto its range.

A function f:S→T is said to be one-to-one if distinct elements in the domain are mapped to distinct elements in the range. That is, if u,v∈S and u≠v then f(u)≠f(v). Equivalently, if f(u)=f(v) then u=v.

Inverse functions

If f is a one-to-one function from S onto T, we can define the inverse of f as the function f−1 from T onto S given by

f−1(y)=x⟺f(x)=y;x∈S,y∈T

If you like to think of a function as a set of ordered pairs, then f−1={(y,x)∈T×S:(x,y)∈f}. The fact that f is one-to-one and onto ensures that f−1 is a valid function from T onto S. Sets S and T are in one-to-one correspondence if there exists a one-to-one function from S onto T. One-to-one correspondence plays an essential role in the study of cardinality.

Restrictions

Suppose that f:S→T and that A⊆S. The function fA:A→T defined by the rule fA(x)=f(x) for x∈A is called, appropriately enough, the restriction of f to A. As a set of ordered pairs, note that fA={(x,y)∈f:x∈A}.

Composition

Suppose that g:R→S and f:S→T. The composition of f with g is the function f∘g:R→T defined by

(f∘g)(x)=f(g(x)),x∈R

Composition is associative:

Suppose that h:R→S, g:S→T, and f:T→U. Then

f∘(g∘h)=(f∘g)∘h

Proof:

Note that both functions map R into U. Using the definition of composition, the value of both functions at x∈R is f(g(h(x))).

Thus we can write f∘g∘h without ambiguity. On the other hand, composition is not commutative. Indeed depending on the domains and co-domains, f∘g might be defined when g∘f is not. Even when both are defined, they may have different domains and co-domains, and so of course cannot be the same function. Even when both are defined and have the same domains and co-domains, the two compositions will not be the same in general. Examples of all of these cases are given in the computational exercises below.
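As a quick illustration of composition as an operation on functions (a minimal Python sketch, not part of the exercises):

```python
def compose(f, g):
    """Return the function x -> f(g(x))."""
    return lambda x: f(g(x))

f = lambda x: x ** 2
g = lambda x: x + 1

print(compose(f, g)(3))   # f(g(3)) = (3 + 1)^2 = 16
print(compose(g, f)(3))   # g(f(3)) = 3^2 + 1 = 10, so f∘g ≠ g∘f here
```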

Suppose that g:R→S and f:S→T.

1. If f and g are one-to-one then f∘g is one-to-one.

2. If f and g are onto then f∘g is onto.

Proof: 1. Suppose that u,v∈R and (f∘g)(u)=(f∘g)(v). Then f(g(u))=f(g(v)). Since f is one-to-one, g(u)=g(v). Since g is one-to-one, u=v.

2. Suppose that z∈T. Since f is onto, there exists y∈S with f(y)=z. Since g is onto, there exists x∈R with g(x)=y. Then (f∘g)(x)=f(g(x))=f(y)=z.

The identity function on a set S is the function IS from S onto S defined by IS(x)=x for x∈S.

The identity function acts like an identity with respect to the operation of composition. If f:S→T then

1. f∘IS=f

2. IT∘f=f

Proof:


1. Note that f∘IS:S→T. For x∈S, (f∘IS)(x)=f(IS(x))=f(x).

2. Note that IT∘f:S→T. For x∈S, (IT∘f)(x)=IT(f(x))=f(x).

The inverse of a function is really the inverse with respect to composition. Suppose that f is a one-to-one function from S onto T. Then

1. f−1∘f=IS

2. f∘f−1=IT

Proof: 1. Note that f−1∘f:S→S. For x∈S, (f−1∘f)(x)=f−1(f(x))=x.

2. Note that f∘f−1:T→T. For y∈T, (f∘f−1)(y)=f(f−1(y))=y.

An element x∈Sn can be thought of as a function from {1,2,…,n} into S. Similarly, an element x∈S∞ can be thought of as a function from N+ into S. For such a sequence x, of course, we usually write xi instead of x(i). More generally, if S and T are sets, then the set of all functions from S into T is denoted by TS. In particular, as we noted in the last section, S∞ is also (and more accurately) written as SN+.

Suppose that g is a one-to-one function from R onto S and that f is a one-to-one function from S onto T. Then (f∘g)−1=g−1∘f−1.

Proof:

Note that (f∘g)−1:T→S and g−1∘f−1:T→S. For y∈T, let x=(f∘g)−1(y). Then (f∘g)(x)=y so that f(g(x))=y and hence g(x)=f−1(y) and finally x=g−1(f−1(y)).

Inverse Images

Suppose that f:S→T. If A⊆T, the inverse image of A under f is the subset of S consisting of those elements that map into A:

f−1(A)={x∈S:f(x)∈A}

The inverse image of A under f

Technically, the inverse images define a new function from P(T) into P(S). We use the same notation as for the inverse function, which is defined when f is one-to-one and onto. These are very different functions, but usually no confusion results. The following important theorem shows that inverse images preserve all set operations.
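For a function with a small finite domain, inverse images can be computed by brute force; the sketch below (with an illustrative f) also spot-checks the first two parts of the theorem that follows:

```python
S = range(-3, 4)                       # domain {-3, ..., 3}
f = lambda x: x * x                    # illustrative function into {0, 1, 4, 9}

def preimage(f, S, A):
    """Inverse image f^{-1}(A) = {x in S : f(x) in A}."""
    return {x for x in S if f(x) in A}

A, B = {1, 4}, {4, 9}
assert preimage(f, S, A | B) == preimage(f, S, A) | preimage(f, S, B)
assert preimage(f, S, A & B) == preimage(f, S, A) & preimage(f, S, B)
print(preimage(f, S, A))               # {-2, -1, 1, 2}
```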

Suppose that f:S→T, and that A⊆T and B⊆T. Then

1. f−1(A∪B)=f−1(A)∪f−1(B)


2. f−1(A∩B)=f−1(A)∩f−1(B)

3. f−1(A∖B)=f−1(A)∖f−1(B)

4. If A⊆B then f−1(A)⊆f−1(B)

5. If A and B are disjoint, so are f−1(A) and f−1(B)

Proof: 1. x∈f−1(A∪B) if and only if f(x)∈A∪B, if and only if f(x)∈A or f(x)∈B, if and only if x∈f−1(A) or x∈f−1(B), if and only if x∈f−1(A)∪f−1(B).

2. The proof is the same as (a), with intersection replacing union and with and replacing or throughout.

3. The proof is the same as (a), with set difference replacing union and with and not replacing or throughout.

4. Suppose A⊆B. If x∈f−1(A) then f(x)∈A and hence f(x)∈B, so x∈f−1(B).

5. If A and B are disjoint, then from (b), f−1(A)∩f−1(B)=f−1(A∩B)=f−1(∅)=∅.

The result in part (a) holds for arbitrary unions, and the result in part (b) holds for arbitrary intersections. No new ideas are involved; only the notation is more complicated.

Suppose that {Ai:i∈I} is a collection of subsets of T, where I is a nonempty index set. Then

1. f−1(⋃i∈IAi)=⋃i∈If−1(Ai)

2. f−1(⋂i∈IAi)=⋂i∈If−1(Ai)

Proof: 1. x∈f−1(⋃i∈IAi) if and only if f(x)∈⋃i∈IAi, if and only if f(x)∈Ai for some i∈I, if and only if x∈f−1(Ai) for some i∈I, if and only if x∈⋃i∈If−1(Ai).

2. The proof is the same as (a), with intersection replacing union and with for every replacing for some.

Forward Images

Suppose again that f:S→T. If A⊆S, the forward image of A under f is the subset of T given by

f(A)={f(x):x∈A}

The forward image of A under f

Technically, the forward images define a new function from P(S) into P(T), but we use the same symbol f for this new function as for the underlying function from S into T that we started with. Again, the two functions are very different, but usually no confusion results.


It might seem that forward images are more natural than inverse images, but in fact, the inverse images are much more important than the forward ones (at least in probability and measure theory). Fortunately, the inverse images are nicer as well: unlike the inverse images, the forward images do not preserve all of the set operations.

Suppose that f:S→T, and that A⊆S and B⊆S. Then

1. f(A∪B)=f(A)∪f(B).

2. f(A∩B)⊆f(A)∩f(B). Equality holds if f is one-to-one.

3. f(A)∖f(B)⊆f(A∖B). Equality holds if f is one-to-one.

4. If A⊆B then f(A)⊆f(B).

Proof: 1. Suppose y∈f(A∪B). Then y=f(x) for some x∈A∪B. If x∈A then y∈f(A) and if x∈B then y∈f(B). In both cases y∈f(A)∪f(B). Conversely suppose y∈f(A)∪f(B). If y∈f(A) then y=f(x) for some x∈A. But then x∈A∪B so y∈f(A∪B). Similarly, if y∈f(B) then y=f(x) for some x∈B. But then x∈A∪B so y∈f(A∪B).

2. If y∈f(A∩B) then y=f(x) for some x∈A∩B. But then x∈A so y∈f(A) and x∈B so y∈f(B) and hence y∈f(A)∩f(B). Conversely, suppose that y∈f(A)∩f(B). Then y∈f(A) and y∈f(B), so there exists x∈A with f(x)=y and there exists u∈B with f(u)=y. At this point, we can go no further. But if f is one-to-one, then u=x and hence x∈A and x∈B. Thus x∈A∩B so y∈f(A∩B)

3. Suppose y∈f(A)∖f(B). Then y∈f(A) and y∉f(B). Hence y=f(x) for some x∈A and y≠f(u) for every u∈B. Thus, x∉B so x∈A∖B and hence y∈f(A∖B). Conversely, suppose y∈f(A∖B). Then y=f(x) for some x∈A∖B. Hence x∈A so y∈f(A). Again, the proof breaks down at this point. However, if f is one-to-one and f(u)=y for some u∈B, then u=x so x∈B, a contradiction. Hence f(u)≠y for every u∈B so y∉f(B). Thus y∈f(A∖B).

4. Suppose A⊆B. If y∈f(A) then y=f(x) for some x∈A. But then x∈B so y∈f(B).

The result in part (a) holds for arbitrary unions, and the result in part (b) holds for arbitrary intersections. No new ideas are involved; only the notation is more complicated.

Suppose that {Ai:i∈I} is a collection of subsets of S, where I is a nonempty index set. Then

1. f(⋃i∈IAi)=⋃i∈If(Ai).

2. f(⋂i∈IAi)⊆⋂i∈If(Ai). Equality holds if f is one-to-one.

Proof: 1. y∈f(⋃i∈IAi) if and only if y=f(x) for some x∈⋃i∈IAi, if and only if y=f(x) for some x∈Ai and some i∈I, if and only if y∈f(Ai) for some i∈I, if and only if y∈⋃i∈If(Ai).

2. If y∈f(⋂i∈IAi) then y=f(x) for some x∈⋂i∈IAi. Hence x∈Ai for every i∈I so y∈f(Ai) for every i∈I and thus y∈⋂i∈If(Ai). Conversely, suppose that y∈⋂i∈If(Ai). Then y∈f(Ai) for every i∈I. Hence for every i∈I there exists xi∈Ai with y=f(xi). If f is one-to-one, xi=xj for all i,j∈I. Call the common value x. Then x∈Ai for every i∈I so x∈⋂i∈IAi and therefore y∈f(⋂i∈IAi).


Suppose again that f:S→T. As noted earlier, the forward images of f define a function from P(S) into P(T) and the inverse images define a function from P(T) into P(S). One might hope that these functions are inverses of one-another, but alas no.

Suppose that f:S→T.

1. A⊆f−1[f(A)] for A⊆S. Equality holds if f is one-to-one.

2. f[f−1(B)]⊆B for B⊆T. Equality holds if f is onto.

Proof: 1. If x∈A then f(x)∈f(A) and hence x∈f−1[f(A)]. Conversely suppose that x∈f−1[f(A)]. Then f(x)∈f(A) so f(x)=f(u) for some u∈A. At this point we can go no further. But if f is one-to-one, then u=x and hence x∈A.

2. Suppose y∈f[f−1(B)]. Then y=f(x) for some x∈f−1(B). But then y=f(x)∈B. Conversely suppose that f is onto and y∈B. There exists x∈S with f(x)=y. Hence x∈f−1(B) and so y∈f[f−1(B)].

Spaces of Real Functions

Let S be a set and let V denote the set of all functions from S into R. The usual arithmetic operations on members of V are defined pointwise. That is, if f∈V, g∈V, and c∈R, then f+g, f−g, fg, and cf are defined as follows for all x∈S.

(f+g)(x)=f(x)+g(x)

(f−g)(x)=f(x)−g(x)

(fg)(x)=f(x)g(x)

(cf)(x)=cf(x)

If g(x)≠0 for every x∈S, then f/g defined by (f/g)(x)=f(x)/g(x) for x∈S is also in V. The zero function 0, of course, is defined by 0(x)=0 for all x∈S. Each of these functions is also in V.

A fact that is very important in probability as well as other branches of analysis is that V with addition and scalar multiplication as defined above is a vector space.

(V,+,⋅) is a vector space over R. That is, for all f,g,h∈V and a,b∈R

1. f+g=g+f (commutative rule)

2. f+(g+h)=(f+g)+h (associative rule)

3. a(f+g)=af+ag (distributive rule)

4. (a+b)f=af+bf (distributive rule)

5. f+0=f (existence of zero vector)

6. f+(−f)=0 (existence of additive inverses)

7. 1⋅f=f (unity rule)


Proof:

Each of these properties follows from the corresponding property in R.

Indicator Functions

Suppose that A is a subset of a universal set S. The indicator function of A is the function 1A:S→{0,1} defined as follows:

1A(x)=1 if x∈A, and 1A(x)=0 if x∉A

Thus, the indicator function of A simply indicates whether or not x∈A for each x∈S.

Conversely, any function f:S→{0,1} is the indicator function of the set A=f−1{1}={x∈S:f(x)=1}

Thus, there is a one-to-one correspondence between P(S), the power set of S, and the collection of indicator functions {0,1}S. The next exercise shows how the set algebra of subsets corresponds to the arithmetic algebra of the indicator functions.

Suppose that A and B are subsets of S. Then

1. 1A∩B=1A1B=min{1A,1B}

2. 1A∪B=1−(1−1A)(1−1B)=max{1A,1B}

3. 1Ac=1−1A

4. 1A∖B=1A(1−1B)

5. A⊆B if and only if 1A≤1B

Proof: 1. Note that both functions on the right just take the values 0 and 1. Moreover, 1A(x)1B(x)=min{1A(x),1B(x)}=1 if and only if x∈A and x∈B.

2. Note that both functions on the right just take the values 0 and 1. Moreover, 1−(1−1A(x))(1−1B(x))=max{1A(x),1B(x)}=1 if and only if x∈A or x∈B.

3. Note that 1−1A just takes the values 0 and 1. Moreover, 1−1A(x)=1 if and only if x∉A.

4. Note that 1A∖B=1A∩Bc=1A1Bc=1A(1−1B) by parts (a) and (c)

5. Since both functions just take the values 0 and 1, note that 1A≤1B if and only if 1A(x)=1 implies 1B(x)=1. But in turn, this is equivalent to A⊆B.
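A quick numerical spot-check of parts (a), (b), and (c), with illustrative sets A, B in a small universal set (a sketch only):

```python
S = set(range(10))
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

ind = lambda A: (lambda x: 1 if x in A else 0)   # the indicator function 1_A

for x in S:
    assert ind(A & B)(x) == ind(A)(x) * ind(B)(x)                       # part (a)
    assert ind(A | B)(x) == 1 - (1 - ind(A)(x)) * (1 - ind(B)(x))       # part (b)
    assert ind(S - A)(x) == 1 - ind(A)(x)                               # part (c)
print("indicator identities verified on S")
```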

The result in part (a) extends to arbitrary intersections and the result in part (b) extends to arbitrary unions.

Suppose that {Ai:i∈I} is a collection of subsets of S, where I is a nonempty index set. Then

1. 1⋂i∈IAi=∏i∈I1Ai=min{1Ai:i∈I}

2. 1⋃i∈IAi=1−∏i∈I(1−1Ai)=max{1Ai:i∈I}

Proof:


In general, a product over an infinite index set may not make sense. But if all of the factors are either 0 or 1, as they are here, then we can simply define the product to be 1 if all of the factors are 1, and 0 otherwise.

1. The functions in the middle and on the right just take the values 0 and 1. Moreover, both take the value 1 at x∈S if and only if x∈Ai for every i∈I.

2. The functions in the middle and on the right just take the values 0 and 1. Moreover, both take the value 1 at x∈S if and only if x∈Ai for some i∈I.

Functions into Product Spaces

Suppose that S and T are sets and that f:S→Tn where n∈{2,3,…}. For i∈{1,2,…,n}, we define the ith coordinate function fi:S→T by the rule that fi(x) is the ith coordinate of f(x) for each x∈S. Conversely, a sequence of functions (f1,f2,…,fn), each mapping S into T, defines a function f:S→Tn by the rule f(x)=(f1(x),f2(x),…,fn(x)) for x∈S.

Operators

Suppose that S is a set. A function f:S→S is sometimes called a unary operator on S. As the name suggests, f operates on an element x∈S to produce a new element f(x)∈S. Similarly, a function g:S×S→S is sometimes called a binary operator on S. As this name suggests, g operates on a pair of elements (x,y)∈S×S to produce a new element g(x,y)∈S.

The arithmetic operators are quintessential examples:

minus(x)=−x is a unary operator on R.

sum(x,y)=x+y is a binary operator on R.

product(x,y)=xy is a binary operator on R.

difference(x,y)=x−y is a binary operator on R.

reciprocal(x)=1/x is a unary operator on R∖{0}.

For a fixed universal set S, the set operators on P(S) were studied in the section on Sets:

complement(A)=Ac is a unary operator.

union(A,B)=A∪B is a binary operator.

intersect(A,B)=A∩B is a binary operator.

difference(A,B)=A∖B is a binary operator.

As these examples illustrate, a binary operator is often written as xfy rather than f(x,y). Still, it is useful to know that operators are simply functions of a special type.

Suppose now that f is a unary operator on a set S, g is a binary operator on S, and that A⊆S. Then A is said to be closed under f if x∈A implies f(x)∈A. Thus, f restricted to A is a unary operator on A. Similarly, A is said to be closed under g if (x,y)∈A×A implies g(x,y)∈A. Thus, g restricted to A×A is a binary operator on A. Returning to our basic examples, note that


N is closed under sum and product, but not under minus and difference.

Z is closed under sum, product, minus, and difference.

Q is closed under sum, product, minus, and difference.

Q∖{0} is closed under reciprocal.

Many properties that you are familiar with for special operators (such as the arithmetic and set operators) can now be formulated generally. Suppose that f and g are binary operators on S. In the following definitions, x, y, and z are arbitrary elements of S.

f is commutative if f(x,y)=f(y,x), that is, xfy=yfx.

f is associative if f(x,f(y,z))=f(f(x,y),z), that is, xf(yfz)=(xfy)fz.

g distributes over f (on the left) if g(x,f(y,z))=f(g(x,y),g(x,z)), that is, xg(yfz)=(xgy)f(xgz)

The Axiom of Choice

Suppose that 𝒮 is a collection of nonempty subsets of a set S. The axiom of choice states that there exists a function f:𝒮→S with the property that f(A)∈A for each A∈𝒮. The function f is known as a choice function.

Stripped of most of the mathematical jargon, the idea is very simple. Since each set A∈𝒮 is nonempty, we can select an element of A; we will call the element we select f(A) and thus our selections define a function. In fact, you may wonder why we need an axiom at all. The problem is that we have not given a rule (or procedure or algorithm) for selecting the elements of the sets in the collection. Indeed, we may not know enough about the sets in the collection to define a specific rule, so in such a case, the axiom of choice simply guarantees the existence of a choice function. Some mathematicians, known as constructivists, do not accept the axiom of choice, and insist on well-defined rules for constructing functions.

A nice consequence of the axiom of choice is a type of duality between one-to-one functions and onto functions.

Suppose that f is a function from a set S onto a set T. There exists a one-to-one function g from T into S.

Proof.

For each y∈T, the set f−1{y} is non-empty, since f is onto. By the axiom of choice, we can select an element g(y) from f−1{y} for each y∈T. The resulting function g is one-to-one.

Suppose that f is a one-to-one function from a set S into a set T. There exists a function g from T onto S.

Proof.

Fix a special element x0∈S. If y∈range(f), there exists a unique x∈S with f(x)=y. Define g(y)=x. If y∉range(f), define g(y)=x0. The function g is onto.

Computational Exercises

Some Elementary Functions

Each of the following rules defines a function from R into R.


f(x)=x2

g(x)=sin(x)

h(x)=⌊x⌋

u(x)=ex/(1+ex)

Find the range of the function and determine if the function is one-to-one in each of the following cases:

1. f

2. g

3. h

4. u

Answer: 1. Range [0,∞). Not one-to-one.

2. Range [−1,1]. Not one-to-one.

3. Range Z. Not one-to-one.

4. Range (0,1). One-to-one.

Find the following inverse images:

1. f−1[4,9]

2. g−1{0}

3. h−1{2,3,4}

Answer: 1. [−3,−2]∪[2,3]

2. {nπ:n∈Z}

3. [2,5)

The function u is one-to-one. Find (that is, give the domain and rule for) the inverse function u−1.

Answer:

u−1(p)=ln(p/(1−p)) for p∈(0,1)

Give the rule and find the range for each of the following functions:

1. f∘g

2. g∘f

3. h∘g∘f

Answer:


1. (f∘g)(x)=sin2(x). Range [0,1]

2. (g∘f)(x)=sin(x2). Range [−1,1]

3. (h∘g∘f)(x)=⌊sin(x2)⌋. Range {−1,0,1}

Note that f∘g and g∘f are well-defined functions from R into R, but f∘g≠g∘f.

Dice

Let S={1,2,3,4,5,6}2. This is the set of possible outcomes when a pair of standard dice are thrown. Let f, g, u, and v be the functions from S into Z defined by the following rules:

f(x,y)=x+y

g(x,y)=y−x

u(x,y)=min{x,y}

v(x,y)=max{x,y}

In addition, let F and U be the functions defined by F=(f,g) and U=(u,v).

Find the range of each of the following functions:

1. f

2. g

3. u

4. v

5. U

Answer: 1. {2,3,4,…,12}

2. {−5,−4,…,4,5}

3. {1,2,3,4,5,6}

4. {1,2,3,4,5,6}

5. {(i,j)∈{1,2,3,4,5,6}2:i≤j}

Give each of the following inverse images in list form:

1. f−1{6}

2. u−1{3}

3. v−1{4}

4. U−1{(3,4)}

Answer:


1. {(1,5),(2,4),(3,3),(4,2),(5,1)}

2. {(3,3),(3,4),(4,3),(3,5),(5,3),(3,6),(6,3)}

3. {(1,4),(4,1),(2,4),(4,2),(3,4),(4,3),(4,4)}

4. {(3,4),(4,3)}

Find each of the following compositions:

1. f∘U

2. g∘U

3. u∘F

4. v∘F

5. F∘U

6. U∘F

Answer: 1. f∘U=f

2. g∘U=|g|

3. u∘F=g

4. v∘F=f

5. F∘U=(f,|g|)

6. U∘F=(g,f)

Note that while f∘U is well-defined, U∘f is not. Note also that f∘U=f even though U is not the identity function on S.

Bit Strings

Let n∈N+ and let S={0,1}n and T={0,1,…,n}. Recall that the elements of S are bit strings of length n, and could represent the possible outcomes of n tosses of a coin (where 1 means heads and 0 means tails). Let f:S→T be the function defined by f(x1,x2,…,xn)=∑ni=1xi. Note that f(x) is just the number of 1s in the bit string x. Let g:T→S be the function defined by g(k)=xk where xk denotes the bit string with k 1s followed by n−k 0s.

Find each of the following

1. f∘g

2. g∘f

Answer: 1. f∘g:T→T and (f∘g)(k)=k.


2. g∘f:S→S and (g∘f)(x)=xk where k=f(x)=∑ni=1xi. In words, (g∘f)(x) is the bit string with the same number of 1s as x, but rearranged so that all the 1s come first.

In the previous exercise, note that f∘g and g∘f are both well-defined, but have different domains, and so of course are not the same. Note also that f∘g is the identity function on T, but f is not the inverse of g. Indeed f is not one-to-one, and so does not have an inverse. However, f restricted to {xk:k∈T} (the range of g) is one-to-one and is the inverse of g.
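A minimal Python sketch of f and g for n=4, confirming that f∘g is the identity on T while g∘f merely moves the 1s of a bit string to the front:

```python
from itertools import product

n = 4
S = [''.join(b) for b in product('01', repeat=n)]   # bit strings of length n
T = range(n + 1)

f = lambda x: sum(int(c) for c in x)                # number of 1s in x
g = lambda k: '1' * k + '0' * (n - k)               # k 1s followed by n-k 0s

assert all(f(g(k)) == k for k in T)                 # f∘g is the identity on T
print(g(f('1010')))                                 # '1100': same number of 1s, moved to the front
```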

Let n=4. Give f−1({k}) in list form for each k∈T.

Answer: 1. f−1({0})={0000}

2. f−1({1})={1000,0100,0010,0001}

3. f−1({2})={1100,1010,1001,0110,0101,0011}

4. f−1({3})={1110,1101,1011,0111}

5. f−1({4})={1111}

Again let n=4. Let A={1000,1010} and B={1000,1100}. Give each of the following in list form:

1. f(A)

2. f(B)

3. f(A∩B)

4. f(A)∩f(B)

5. f−1(f(A))

Answer: 1. {1,2}

2. {1,2}

3. {1}

4. {1,2}

5. {1000,0100,0010,0001,1100,1010,1001,0110,0101,0011}

In the previous exercise, note that f(A∩B)⊂f(A)∩f(B) and A⊂f−1(f(A)).

Indicator Functions

Suppose that A and B are subsets of a universal set S. Express, in terms of 1A and 1B, the indicator function of each of the 14 non-trivial sets that can be constructed from A and B. Use the Venn diagram applet to help.

Answer: 1. 1A

2. 1B


3. 1Ac=1−1A

4. 1Bc=1−1B

5. 1A∩B=1A1B

6. 1A∪B=1A+1B−1A1B

7. 1A∩Bc=1A−1A1B

8. 1B∩Ac=1B−1A1B

9. 1A∪Bc=1−1B+1A1B

10. 1B∪Ac=1−1A+1A1B

11. 1Ac∩Bc=1−1A−1B+1A1B

12. 1Ac∪Bc=1−1A1B

13. 1(A∩Bc)∪(B∩Ac)=1A+1B−21A1B

14. 1(A∩B)∪(Ac∩Bc)=1−1A−1B+21A1B

Suppose that A, B, and C are subsets of a universal set S. Give the indicator function of each of the following, in terms of 1A, 1B, and 1C in sum-product form:

1. D={x∈S:x is an element of exactly one of the given sets}

2. E={x∈S:x is an element of exactly two of the given sets}

Answer: 1. 1D=1A+1B+1C−2(1A1B+1A1C+1B1C)+31A1B1C

2. 1E=1A1B+1A1C+1B1C−31A1B1C

Operators

Recall the standard arithmetic operators on R discussed above.

We all know that sum is commutative and associative, and that product distributes over sum.

1. Is difference commutative?

2. Is difference associative?

3. Does product distribute over difference?

4. Does sum distribute over product?

Answer: 1. No. x−y≠y−x

2. No. x−(y−z)≠(x−y)−z

3. Yes. x(y−z)=(xy)−(xz)


4. No. x+(yz)≠(x+y)(x+z)

3. Relations

Definitions

Suppose that S and T are sets. A relation from S to T is a subset of the product set S×T. As the name suggests, a relation R from S into T is supposed to define a relationship between the elements of S and the elements of T, and so we usually use the more suggestive notation xRy when (x,y)∈R. The domain of R is the set of first coordinates and the range of R is the set of second coordinates.

domain(R)={x∈S:(x,y)∈R for some y∈T}

range(R)={y∈T:(x,y)∈R for some x∈S}

Recall that these are projections; the domain of R is the projection of R onto S and the range of R is the projection of R onto T. A relation from a set S to itself is said to be a relation on S.

Basic Examples

Suppose that S is a set and recall that P(S) denotes the power set of S, the collection of all subsets of S. The membership relation ∈ from S to P(S) is perhaps the most important and basic relationship in mathematics. Indeed, for us, it's a primitive (undefined) relationship—given x and A, we assume that we understand the meaning of the statement x∈A.

Another basic primitive relation is the equality relation = on a given set of objects S. That is, given two objects x and y, we assume that we understand the meaning of the statement x=y.

Other basic relations that you have seen are

1. The subset relation ⊆ on P(S).

2. The order relation ≤ on R

These two belong to a special class of relations known as partial orders that we will study in the next section.

Note that a function f from S into T is a special type of relation. To compare the two types of notation (relation and function), note that xfy means that y=f(x).

Constructions

Set Operations

Since a relation is just a set of ordered pairs, the set operations can be used to build new relations from existing ones. Specifically, if Q and R are relations from S to T, then so are Q∪R, Q∩R, Q∖R. If R is a relation from S to T and Q⊆R, then Q is a relation from S to T.


Suppose that Q and R are relations from S to T.

1. x(Q∪R)y if and only if xQy or xRy.

2. x(Q∩R)y if and only if xQy and xRy.

3. x(Q∖R)y if and only if xQy but not xRy.

4. If Q⊆R then xQy implies xRy.

Restrictions

If R is a relation on S and A⊆S then RA=R∩(A×A) is a relation on A, called the restriction of R to A.

Inverse

If R is a relation from S to T, the inverse of R is the relation from T to S defined by

yR−1x if and only if xRy

Equivalently, R−1={(y,x):(x,y)∈R}. Note that any function f from S into T has an inverse relation, but only when f is one-to-one is the inverse relation also a function (the inverse function).

Composition

Suppose that Q is a relation from S to T and that R is a relation from T to U. The composition Q∘R is the relation from S to U defined as follows: for x∈S and z∈U, x(Q∘R)z if and only if there exists y∈T such that xQy and yRz. Note that the notation is inconsistent with the notation used for composition of functions, essentially because relations are read from left to right, while functions are read from right to left. Hopefully, the inconsistency will not cause confusion, since we will always use function notation for functions.

Basic Properties

The important classes of relations that we will study in the next couple of sections are characterized by certain basic properties. Here are the definitions:

Suppose that R is a relation on S.

1. R is reflexive if xRx for all x∈S.

2. R is irreflexive if no x∈S satisfies xRx.

3. R is symmetric if xRy implies yRx for all x,y∈S.

4. R is anti-symmetric if xRy and yRx implies x=y for all x,y∈S.

5. R is transitive if xRy and yRz implies xRz for all x,y,z∈S.
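For a relation on a small finite set, stored as a set of ordered pairs, each of these properties can be checked mechanically; a minimal sketch using the order relation ≤ on {1,2,3} as an illustration:

```python
S = {1, 2, 3}
R = {(1, 1), (2, 2), (3, 3), (1, 2), (2, 3), (1, 3)}   # the order relation ≤ on S

reflexive     = all((x, x) in R for x in S)
symmetric     = all((y, x) in R for (x, y) in R)
antisymmetric = all(x == y for (x, y) in R if (y, x) in R)
transitive    = all((x, z) in R for (x, y) in R for (w, z) in R if y == w)

print(reflexive, symmetric, antisymmetric, transitive)   # True False True True
```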

The proofs of the following results are straightforward, so be sure to try them yourself before reading the ones in the text.

A relation R on S is reflexive if and only if the equality relation = on S is a subset of R.


Proof:

This follows from the definitions. R is reflexive if and only if (x,x)∈R for all x∈S.

A relation R on S is symmetric if and only if R−1=R.

Proof:

Suppose that R is symmetric. If (x,y)∈R then (y,x)∈R and hence (x,y)∈R−1. If (x,y)∈R−1 then (y,x)∈R and hence (x,y)∈R. Thus R=R−1. Conversely, suppose R=R−1. If (x,y)∈R then (x,y)∈R−1 and hence (y,x)∈R.

A relation R on S is transitive if and only if R∘R⊆R.

Proof:

Suppose that R is transitive. If (x,z)∈R∘R then there exists y∈S such that (x,y)∈R and (y,z)∈R. But then (x,z)∈R by transitivity. Hence R∘R⊆R. Conversely, suppose that R∘R⊆R. If (x,y)∈R and (y,z)∈R then (x,z)∈R∘R and hence (x,z)∈R. Hence R is transitive.

A relation R on S is antisymmetric if and only if R∩R−1 is a subset of the equality relation = on S.

Proof:

Restated, this result is that R is antisymmetric if and only if (x,y)∈R∩R−1 implies x=y. Thus suppose that R is antisymmetric. If (x,y)∈R∩R−1 then (x,y)∈R and (x,y)∈R−1. But then (y,x)∈R so by antisymmetry, x=y. Conversely suppose that (x,y)∈R∩R−1 implies x=y. If (x,y)∈R and (y,x)∈R then (x,y)∈R−1 and hence (x,y)∈R∩R−1. Thus x=y so R is antisymmetric.

Suppose that Q and R are relations on S. For each property below, if both Q and R have the property, then so does Q∩R.

1. reflexive

2. symmetric

3. transitive

Proof: 1. Suppose that Q and R are reflexive. Then (x,x)∈Q and (x,x)∈R for each x∈S and hence (x,x)∈Q∩R for each x∈S. Thus Q∩R is reflexive.

2. Suppose that Q and R are symmetric. If (x,y)∈Q∩R then (x,y)∈Q and (x,y)∈R. Hence (y,x)∈Q and (y,x)∈R so (y,x)∈Q∩R. Hence Q∩R is symmetric.

3. Suppose that Q and R are transitive. If (x,y)∈Q∩R and (y,z)∈Q∩R then (x,y)∈Q, (x,y)∈R, (y,z)∈Q, and (y,z)∈R. Hence (x,z)∈Q and (x,z)∈R so (x,z)∈Q∩R. Hence Q∩R is transitive.

Suppose that R is a relation on a set S.

1. Give an explicit definition for the property R is not reflexive.


2. Give an explicit definition for the property R is not irreflexive.

3. Are any of the properties R is reflexive, R is not reflexive, R is irreflexive, R is not irreflexive equivalent?

Answer: 1. R is not reflexive if and only if there exists x∈S such that (x,x)∉R.

2. R is not irreflexive if and only if there exists x∈S such that (x,x)∈R.

3. No, none of them are equivalent.

Suppose that R is a relation on a set S.

1. Give an explicit definition for the property R is not symmetric.
2. Give an explicit definition for the property R is not antisymmetric.

3. Are any of the properties R is symmetric, R is not symmetric, R is antisymmetric, R is not antisymmetric equivalent?

Answer:
1. R is not symmetric if and only if there exist x,y∈S such that (x,y)∈R and (y,x)∉R.
2. R is not antisymmetric if and only if there exist distinct x,y∈S such that (x,y)∈R and (y,x)∈R.

3. No, none of them are equivalent.

Computational Exercises

Let R be the relation defined on R by xRy if and only if sin(x)=sin(y). Determine if R has each of the following properties:

1. reflexive
2. symmetric

3. transitive

4. irreflexive

5. antisymmetric

Answer:
1. yes
2. yes

3. yes

4. no

5. no

The relation R in the previous exercise is a member of an important class of relations known as equivalence relations.

Let R be the relation defined on R by xRy if and only if x2+y2≤1. Determine if R has each of the following properties:


1. reflexive
2. symmetric

3. transitive

4. irreflexive

5. antisymmetric

Answer:
1. no
2. yes

3. no

4. no

5. no

4. Partial Orders

Basic Theory

Definitions

A relation ⪯ on S that is reflexive, anti-symmetric, and transitive is said to be a partial order on S, and the pair (S,⪯) is called a partially ordered set. Thus for all x, y, z∈S:

1. x⪯x, the reflexive property
2. If x⪯y and y⪯x then x=y, the antisymmetric property

3. If x⪯y and y⪯z then x⪯z, the transitive property

As the name and notation suggest, a partial order is a type of ordering of the elements of S. Partial orders occur naturally in many areas of mathematics, including probability. A partial order on a set naturally gives rise to several other relations on the set.

Suppose that ⪯ is a partial order on a set S. The relations ⪰, ≺, ≻, ⊥, and ∥ are defined as follows:

1. x⪰y if and only if y⪯x.
2. x≺y if and only if x⪯y and x≠y.

3. x≻y if and only if y≺x.

4. x⊥y if and only if x⪯y or y⪯x.

5. x∥y if and only if neither x⪯y nor y⪯x.


Note that ⪰ is the inverse of ⪯, and ≻ is the inverse of ≺. Note also that x⪯y if and only if either x≺y or x=y, so the relation ≺ completely determines the relation ⪯. Finally, note that x⊥y means that x and y are related in the partial order, while x∥y means that x and y are unrelated in the partial order. Thus, the relations ⊥ and ∥ are complements of each other, as sets of ordered pairs. A total or linear order is a partial order in which there are no unrelated elements.

A partial order ⪯ on S is a total order or linear order if for every x, y∈S, either x⪯y or y⪯x.

Suppose that ⪯1 and ⪯2 are partial orders on a set S. Then ⪯1 is a sub-order of ⪯2, or equivalently ⪯2 is an extension of ⪯1, if and only if x⪯1y implies x⪯2y for x, y∈S.

Thus if ⪯1 is a suborder of ⪯2, then as sets of ordered pairs, ⪯1 is a subset of ⪯2. We need one more relation that arises naturally from a partial order.

Suppose that ⪯ is a partial order on a set S. For x, y∈S, y is said to cover x if x≺y but no element z∈S satisfies x≺z≺y.

If S is finite, the covering relation completely determines the partial order, by virtue of the transitive property. The covering graph or Hasse graph is the directed graph with vertex set S and directed edge set E, where (x,y)∈E if and only if y covers x. Thus, x≺y if and only if there is a directed path in the graph from x to y. Hasse graphs are named for the German mathematician Helmut Hasse. The graphs are often drawn with the edges directed upward. In this way, the directions can be inferred without having to actually draw arrows.
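Since a finite partial order is determined by its covering relation, the Hasse edges can be computed by brute force: y covers x when x≺y and no z∈S lies strictly between them. Below is a minimal Python sketch under that definition (the helper name covering_edges is hypothetical), using the divisibility order that appears in the exercises later in this section:

```python
# Covering (Hasse) edges of a finite partial order, given the set S and a
# "less than or equal" predicate leq.
def covering_edges(S, leq):
    lt = lambda x, y: leq(x, y) and x != y          # the strict order
    return {(x, y) for x in S for y in S
            if lt(x, y) and not any(lt(x, z) and lt(z, y) for z in S)}

S = {2, 3, 4, 6, 12}
divides = lambda m, n: n % m == 0
print(sorted(covering_edges(S, divides)))
# [(2, 4), (2, 6), (3, 6), (4, 12), (6, 12)]
```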

Basic Examples

Of course, the ordinary order ≤ is a total order on the set of real numbers R. The subset partial order is one of the most important in probability theory:

Suppose that S is a set. The subset relation ⊆ is a partial order on P(S), the power set of S.

Proof:

We proved this result in the section on sets. To review, recall that for A, B∈P(S), A⊆B means that x∈A implies x∈B. Also A=B means that x∈A if and only if x∈B. Thus

1. A⊆A
2. A⊆B and B⊆A if and only if A=B

3. A⊆B and B⊆C imply A⊆C

Let ∣ denote the division relation on the set of positive integers N+. That is, m∣n if and only if there exists k∈N+ such that n=km. Then

1. ∣ is a partial order on N+.
2. ∣ is a sub-order of the ordinary order ≤.

Proof:


1. Clearly n∣n for n∈N+, since n=1⋅n, so ∣ is reflexive. Suppose m∣n and n∣m, where m, n∈N+. Then there exist j, k∈N+ such that n=km and m=jn. Substituting gives n=jkn, and hence j=k=1. Thus m=n so ∣ is antisymmetric. Finally, suppose m∣n and n∣p, where m, n, p∈N+. Then there exist j, k∈N+ such that n=jm and p=kn. Substituting gives p=jkm, so m∣p. Thus ∣ is transitive.

2. If m, n∈N+ and m∣n, then there exists k∈N+ such that n=km. Since k≥1, m≤n.

Basic Properties

The proofs of the following basic properties are straightforward. Be sure to try them yourself before reading the ones in the text.

The inverse of a partial order is also a partial order.

Proof:

Clearly the reflexive, antisymmetric and transitive properties hold for ⪰.

If ⪯ is a partial order on S and A is a subset of S, then the restriction of ⪯ to A is a partial order on A.

Proof:

The reflexive, antisymmetric, and transitive properties given above hold for all x, y, z∈S and hence hold for all x, y, z∈A.

The following theorem characterizes relations that correspond to strict order.

Let S be a set. A relation ⪯ is a partial order on S if and only if ≺ is transitive and irreflexive.

Proof:

Suppose that ⪯ is a partial order on S. Recall that ≺ is defined by x≺y if and only if x⪯y and x≠y. If x≺y and y≺z then x⪯y and y⪯z, and so x⪯z. On the other hand, if x=z then x⪯y and y⪯x so x=y, a contradiction. Hence x≠z and so x≺z. Therefore ≺ is transitive. If x≺y then x≠y by definition, so ≺ is irreflexive.

Conversely, suppose that ≺ is a transitive and irreflexive relation on S. Recall that ⪯ is defined by x⪯y if and only if x≺y or x=y. By definition then, ⪯ is reflexive: x⪯x for every x∈S. Next, suppose that x⪯y and y⪯x. If x≺y and y≺x then x≺x by the transitive property of ≺. But this is a contradiction by the irreflexive property, so we must have x=y. Thus ⪯ is antisymmetric. Suppose x⪯y and y⪯z. There are four cases:

1. If x≺y and y≺z then x≺z by the transitive property of ≺.
2. If x=y and y≺z then x≺z by substitution.

3. If x≺y and y=z then x≺z by substitution.

4. If x=y and y=z then x=z by the transitive property of =.

In all cases we have x⪯z so ⪯ is transitive. Hence ⪯ is a partial order on S.


Monotone Sets and Functions

Partial orders form a natural setting for increasing and decreasing sets and functions. Here are the definitions:

Suppose that ⪯ is a partial order on a set S and that A⊆S. Then

1. A is increasing if x∈A and x⪯y imply y∈A.
2. A is decreasing if y∈A and x⪯y imply x∈A.

Suppose that S is a set with partial order ⪯S, T is a set with partial order ⪯T, and that f:S→T. Then

1. f is increasing if and only if x⪯Sy implies f(x)⪯Tf(y).
2. f is decreasing if and only if x⪯Sy implies f(x)⪰Tf(y).

3. f is strictly increasing if and only if x≺Sy implies f(x)≺Tf(y).

4. f is strictly decreasing if and only if x≺Sy implies f(x)≻Tf(y).

Recall the definition of the indicator function 1A associated with a subset A of a universal set S: For x∈S, 1A(x)=1 if x∈A and 1A(x)=0 if x∉A.

Suppose that ⪯ is a partial order on a set S and that A⊆S. Then

1. A is increasing if and only if 1A is increasing.
2. A is decreasing if and only if 1A is decreasing.

Proof:
1. A is increasing if and only if x∈A and x⪯y imply y∈A, if and only if 1A(x)=1 and x⪯y imply 1A(y)=1, if and only if 1A is increasing.
2. A is decreasing if and only if y∈A and x⪯y imply x∈A, if and only if 1A(y)=1 and x⪯y imply 1A(x)=1, if and only if 1A is decreasing.

Isomorphism

Two partially ordered sets (S,⪯S) and (T,⪯T) are said to be isomorphic if there exists a one-to-one function f from S onto T such that x⪯Sy if and only if f(x)⪯Tf(y), for all x, y∈S. The function f is an isomorphism.

Generally, a mathematical system consists of a set and various structures defined in terms of the set, such as relations, operators, or a collection of subsets. Loosely speaking, two mathematical systems of the same type are isomorphic if there exists a one-to-one function from one of the sets onto the other that preserves the structures, and again, the function is called an isomorphism. The basic idea is that isomorphic systems are mathematically identical, except for superficial matters of appearance. The word isomorphism is from the Greek and means equal shape.

Suppose that the partially ordered sets (S,⪯S) and (T,⪯T) are isomorphic, and that f:S→T is an isomorphism. Then f and f−1 are strictly increasing.

Proof:


We need to show that for x, y∈S, x≺Sy if and only if f(x)≺Tf(y). If x≺Sy then by definition, f(x)⪯Tf(y). But if f(x)=f(y) then x=y since f is one-to-one. This is a contradiction, so f(x)≺Tf(y). Similarly, if f(x)≺Tf(y) then by definition, x⪯Sy. But if x=y then f(x)=f(y), a contradiction. Hence x≺Sy.

In a sense, the subset partial order is universal—every partially ordered set is isomorphic to (S,⊆) for some collection of sets S.

Suppose that ⪯ is a partial order on a set S. Then there exists S⊆P(S) such that (S,⪯) is isomorphic to (S,⊆).

Proof:

For each x∈S, let Ax={u∈S:u⪯x}, and then let S={Ax:x∈S}, so that S⊆P(S). We will show that the function x↦Ax from S onto S is one-to-one, and satisfies

x⪯y⟺Ax⊆Ay

First, suppose that x, y∈S and Ax=Ay. Then x∈Ax so x∈Ay and hence x⪯y. Similarly, y∈Ay so y∈Ax and hence y⪯x. Thus x=y, so the mapping is one-to-one. Next, suppose that x⪯y. If u∈Ax then u⪯x so u⪯y by the transitive property, and hence u∈Ay. Thus Ax⊆Ay. Conversely, suppose Ax⊆Ay. As before, x∈Ax, so x∈Ay and hence x⪯y.
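A quick computational check of the construction in the proof (a sketch only, using the divisibility order on a small set as the running example): each x is mapped to its down-set Ax={u∈S:u⪯x}, and comparison in the order becomes subset comparison.

```python
# Down-set embedding: x maps to A_x = {u in S : u ⪯ x}; then x ⪯ y iff A_x ⊆ A_y.
S = {1, 2, 3, 6}
leq = lambda u, x: x % u == 0        # here u ⪯ x means u divides x

down_set = {x: frozenset(u for u in S if leq(u, x)) for x in S}

# Verify the isomorphism property on every pair.
assert all((down_set[x] <= down_set[y]) == leq(x, y) for x in S for y in S)
print(down_set[6])                   # frozenset({1, 2, 3, 6})
```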

Extremal Elements

Various types of extremal elements play important roles in partially ordered sets. Here are the definitions:

Suppose that ⪯ is a partial order on a set S and that A⊆S.

1. An element a∈A is the minimum element of A if and only if a⪯x for every x∈A.
2. An element a∈A is a minimal element of A if and only if no x∈A satisfies x≺a.

3. An element b∈A is the maximum element of A if and only if b⪰x for every x∈A.

4. An element b∈A is a maximal element of A if and only if no x∈A satisfies x≻b.

In general, a set can have several maximal and minimal elements (or none). On the other hand,

The minimum and maximum elements of A, if they exist, are unique. They are denoted min(A) and max(A), respectively.

Proof:

Suppose that a, b are minimum elements of A. Since a, b∈A we have a⪯b and b⪯a, so a=b by the antisymmetric property. The proof for the maximum element is analogous.

Minimal, maximal, minimum, and maximum elements of a set must belong to that set. The following definitions relate to upper and lower bounds of a set, which do not have to belong to the set.


Suppose again that ⪯ is a partial order on a set S and that A⊆S. Then

1. An element u∈S is a lower bound for A if and only if u⪯x for every x∈A.
2. An element v∈S is an upper bound for A if and only if v⪰x for every x∈A.

3. The greatest lower bound or infimum of A, if it exists, is the maximum of the set of lower bounds of A.

4. The least upper bound or supremum of A, if it exists, is the minimum of the set of upper bounds of A.

By the uniqueness result above, the greatest lower bound of A is unique, if it exists. It is denoted glb(A) or inf(A). Similarly, the least upper bound of A is unique, if it exists, and is denoted lub(A) or sup(A). Note that every element of S is a lower bound and an upper bound for ∅, since the conditions in the definition hold vacuously.

The symbols ∧ and ∨ are also used for infimum and supremum, respectively, so ⋀A=inf(A) and ⋁A=sup(A) if they exist. In particular, for x, y∈S, operator notation is more commonly used, so x∧y=inf{x,y} and x∨y=sup{x,y}, if these elements exist. If x∧y and x∨y exist for every x, y∈S, then (S,⪯) is called a lattice.

For the subset partial order, the inf and sup operators correspond to intersection and union, respectively:

Let S be a set and consider the subset partial order ⊆ on P(S), the power set of S. Let A be a nonempty subset of P(S), that is, a nonempty collection of subsets of S. Then

1. inf(A)=⋂A
2. sup(A)=⋃A

Proof:
1. First, ⋂A⊆A for every A∈A and hence ⋂A is a lower bound of A. If B is a lower bound of A then B⊆A for every A∈A and hence B⊆⋂A. Therefore ⋂A is the greatest lower bound.
2. First, A⊆⋃A for every A∈A and hence ⋃A is an upper bound of A. If B is an upper bound of A then A⊆B for every A∈A and hence ⋃A⊆B. Therefore ⋃A is the least upper bound.

In particular, A∧B=A∩B and A∨B=A∪B, so (P(S),⊆) is a lattice.

Consider the division partial order ∣ on the set of positive integers N+ and let A be a nonempty subset of N+.

1. inf(A) is the greatest common divisor of A, usually denoted gcd(A) in this context.
2. If A is infinite then sup(A) does not exist. If A is finite then sup(A) is the least common multiple of A, usually denoted lcm(A) in this context.
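As a quick numerical illustration of the last result (assuming Python 3.9+, where math.gcd and math.lcm accept multiple arguments), the infimum and supremum of a finite set in the divisibility order are its gcd and lcm:

```python
import math

A = {2, 3, 4, 6, 12}
print(math.gcd(*A))   # 1  = inf(A) in the divisibility order
print(math.lcm(*A))   # 12 = sup(A) in the divisibility order
```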

Suppose that S is a set and that f:S→S. An element z∈S is said to be a fixed point of f if f(z)=z.

The following result explores a basic fixed point theorem for a partially ordered set. The theorem is important in the study of cardinality.

Suppose that ⪯ is a partial order on a set S with the property that sup(A) exists for every A⊆S. If f:S→S is increasing, then f has a fixed point.


Proof.

Let A={x∈S:x⪯f(x)} and let z=sup(A). If x∈A then x⪯z, so x⪯f(x)⪯f(z). Hence f(z) is an upper bound of A, so z⪯f(z). But then f(z)⪯f(f(z)), so f(z)∈A. Hence f(z)⪯z. Therefore f(z)=z.

Note that the hypotheses of the theorem require that sup(∅)=min(S) exists. The set A={x∈S:x⪯f(x)} is nonempty since min(S)∈A.

If ⪯ is a total order on a set S with the property that every nonempty subset of S has a minimum element, then S is said to be well ordered by ⪯. One of the most important examples is N+, which is well ordered by the ordinary order ≤. On the other hand, the well ordering principle, which is equivalent to the axiom of choice, states that every nonempty set can be well ordered.

Orders on Product Spaces

Suppose that S and T are sets with partial orders ⪯S and ⪯T respectively. Define the relation ⪯ on S×T by (x,y)⪯(z,w) if and only if x⪯Sz and y⪯Tw.

1. The relation ⪯ is a partial order on S×T, called, appropriately enough, the product order.
2. Suppose that (S,⪯S)=(T,⪯T). If S has at least 2 elements, then ⪯ is not a total order on S2.

Proof:
1. If (x,y)∈S×T then x⪯Sx and y⪯Ty and therefore (x,y)⪯(x,y). Hence ⪯ is reflexive. Suppose that (u,v), (x,y)∈S×T and that (u,v)⪯(x,y) and (x,y)⪯(u,v). Then u⪯Sx and x⪯Su, and v⪯Ty and y⪯Tv. Hence u=x and v=y, so (u,v)=(x,y). Hence ⪯ is antisymmetric. Finally, suppose that (u,v), (x,y), (z,w)∈S×T, and that (u,v)⪯(x,y) and (x,y)⪯(z,w). Then u⪯Sx and x⪯Sz, and v⪯Ty and y⪯Tw. Hence u⪯Sz and v⪯Tw, and therefore (u,v)⪯(z,w). Hence ⪯ is transitive.

2. Suppose that (S,⪯S)=(T,⪯T), and that x, y∈S are distinct. If (x,y)⪯(y,x) then x⪯Sy and y⪯Sx and hence x=y, a contradiction. Similarly (y,x)⪯(x,y) leads to the same contradiction x=y. Hence (x,y) and (y,x) are unrelated in the partial order ⪯.

The product order on R2. The region shaded red is the set of points ⪰(x,y); the region shaded blue is the set of points ⪯(x,y); the region shaded white is the set of points that are not comparable with (x,y).


Product order extends in a straightforward way to the Cartesian product of a finite or an infinite sequence of partially ordered spaces. For example, suppose that Si is a set with partial order ⪯i for each i∈{1,2,…,n}, where n∈N+. Let S=S1×S2×⋯×Sn. The product order ⪯ on S is defined as follows: for x=(x1,x2,…,xn)∈S and y=(y1,y2,…,yn)∈S, x⪯y if and only if xi⪯iyi for each i∈{1,2,…,n}. Recalling that sequences are just functions defined on a particular type of index set, the product order can be generalized even further:
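Here is a minimal sketch of the product order on n-tuples of real numbers (the helper name product_leq is not from the text); note that this coordinatewise comparison is not the same as Python's built-in tuple comparison, which is lexicographic:

```python
# Product (coordinatewise) order: x ⪯ y iff x_i <= y_i for every coordinate i.
def product_leq(x, y):
    return all(xi <= yi for xi, yi in zip(x, y))

print(product_leq((1, 2), (3, 4)))   # True
print(product_leq((1, 4), (3, 2)))   # False: not comparable ...
print(product_leq((3, 2), (1, 4)))   # False: ... in either direction
```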

Suppose that Si is a set with partial order ⪯i for each i in a nonempty index set I. Let

S={f:f is a function on I such that f(i)∈Si for each i∈I}
Define the relation ⪯ on S by f⪯g if and only if f(i)⪯ig(i) for each i∈I. Then ⪯ is a partial order on S, known again as the product order.
Proof:

In spite of the abstraction, the proof is perfectly straightforward.

1. If f∈S, then f(i)⪯if(i) for every i∈I, and hence f⪯f. Thus ⪯ is reflexive.
2. Suppose that f, g∈S and that f⪯g and g⪯f. Then f(i)⪯ig(i) and g(i)⪯if(i) for each i∈I. Hence f(i)=g(i) for each i∈I and so f=g. Thus ⪯ is antisymmetric.

3. Suppose that f, g, h∈S and that f⪯g and g⪯h. Then f(i)⪯ig(i) and g(i)⪯ih(i) for each i∈I. Hence f(i)⪯ih(i) for each i∈I, so f⪯h. Thus ⪯ is transitive.

Note that no assumptions are made on the index set I, other than it be nonempty. In particular, no order is necessary on I. The next result gives a very different type of order on a product space.

Suppose again that S and T are sets with partial orders ⪯S and ⪯T respectively. Define the relation ⪯ on S×T by (x,y)⪯(z,w) if and only if either x≺Sz, or x=z and y⪯Tw.

1. The relation ⪯ is a partial order on S×T, called the lexicographic order or dictionary order.
2. If ⪯S and ⪯T are total orders on S and T, respectively, then ⪯ is a total order on S×T.

Proof:
1. If (x,y)∈S×T, then x=x and y⪯Ty. Hence (x,y)⪯(x,y) and so ⪯ is reflexive. Suppose that (u,v), (x,y)∈S×T, and that (u,v)⪯(x,y) and (x,y)⪯(u,v). There are four cases, but only one can actually happen.

o If u≺Sx and x≺Su then u≺Su. This is a contradiction since ≺S is irreflexive, and so this case cannot happen.

o If u≺Sx, x=u, and y⪯Tv, then u≺Su. This case also cannot happen.

o If u=x, v⪯Ty, and x≺Su, then u≺Su. This case also cannot happen.

o If u=x, v⪯Ty, x=u, and y⪯Tv, then u=x and v=y, so (u,v)=(x,y).

Hence ⪯ is antisymmetric. Finally, suppose that (u,v), (x,y), (z,w)∈S×T and that (u,v)⪯(x,y) and (x,y)⪯(z,w). Again there are four cases:

o If u≺Sx and x≺Sz then u≺Sz, and hence (u,v)⪯(z,w).


o If u≺Sx, x=z, and y⪯Tw then u≺Sz and hence (u,v)⪯(z,w).

o If u=x, v⪯Ty, and x≺Sz then u≺Sz and hence (u,v)⪯(z,w).

o If u=x, v⪯Ty, x=z, and y⪯Tw, then u=z and v⪯Tw, and hence (u,v)⪯(z,w).

Therefore ⪯ is transitive.

2. Suppose that ⪯S and ⪯T are total orders. Let (u,v), (x,y)∈S×T. Then either u⪯Sx or x⪯Su and either v⪯Ty or y⪯Tv. Again there are four cases

o If u≺Sx then (u,v)⪯(x,y)

o If x≺Su then (x,y)⪯(u,v)

o If u=x and v⪯Ty then (u,v)⪯(x,y)

o If u=x and y⪯Tv then (x,y)⪯(u,v)

Hence ⪯ is a total order.

The lexicographic order on R2. The region shaded red is the set of points ⪰(x,y); the region shaded blue is the set of points ⪯(x,y); the region shaded white is the set of points that are not comparable with (x,y).

As with the product order, the lexicographic order can be generalized to a collection of partially ordered spaces. However, we need the index set to be totally ordered.

Suppose that Si is a set with partial order ⪯i for each i in a nonempty index set I. Suppose also that ≤ is a total order on I. Let

S={f:f is a function on I such that f(i)∈Si for each i∈I}
Define the relation ≺ on S as follows: f≺g if and only if there exists j∈I such that f(i)=g(i) if i<j and f(j)≺jg(j), and as usual let f⪯g mean that f≺g or f=g. Then

1. ⪯ is a partial order on S, known again as the lexicographic order.
2. If ⪯i is a total order for each i∈I, and I is well ordered by ≤, then ⪯ is a total order on S.

Proof:


1. By the result above on strict orders, we need to show that ≺ is irreflexive and transitive. Thus, suppose that f, g∈S and that f≺g and g≺f. Then there exists j∈I such that f(i)=g(i) if i<j and f(j)≺jg(j). Similarly there exists k∈I such that g(i)=f(i) if i<k and g(k)≺kf(k). Since I is totally ordered under ≤, either j<k, k<j, or j=k. If j<k then we have g(j)=f(j) and f(j)≺jg(j), a contradiction. If k<j then we have f(k)=g(k) and g(k)≺kf(k), another contradiction. If j=k, we have f(j)≺jg(j) and g(j)≺jf(j), another contradiction. Hence ≺ is irreflexive. Next, suppose that f, g, h∈S and that f≺g and g≺h. Then there exists j∈I such that f(i)=g(i) if i<j and f(j)≺jg(j). Similarly, there exists k∈I such that g(i)=h(i) if i<k and g(k)≺kh(k). Again, since I is totally ordered, either j<k or k<j or j=k. If j<k, then f(i)=g(i)=h(i) if i<j and f(j)≺jg(j)=h(j). If k<j, then f(i)=g(i)=h(i) if i<k and f(k)=g(k)≺kh(k). If j=k, then f(i)=g(i)=h(i) if i<j and f(j)≺jg(j)≺jh(j). In all cases, f≺h so ≺ is transitive.

2. Suppose now that ⪯i is a total order on Si for each i∈I and that I is well ordered by ≤. Let f, g∈S with f≠g. Let J={i∈I:f(i)≠g(i)}. Then J≠∅ by assumption, and hence has a minimum element j. If i<j then i∉J and hence f(i)=g(i). On the other hand, f(j)≠g(j) since j∈J and therefore, since ⪯j is totally ordered, we must have either f(j)≺jg(j) or g(j)≺jf(j). In the first case, f≺g and in the second case g≺f. Hence ⪯ is totally ordered.

The term lexicographic comes from the way that we order words alphabetically: We look at the first letter; if these are different, we know how to order the words. If the first letters are the same, we look at the second letter; if these are different, we know how to order the words. We continue in this way until we find letters that are different, and we can order the words. In fact, the lexicographic order is sometimes referred to as the first difference order. Note also that if Si is a set and ⪯i a total order on Si for i∈I, then by the well ordering principle, there exists a well ordering ≤ of I, and hence there exists a lexicographic total order on the product space S. As a mathematical structure, the lexicographic order is not as obscure as you might think.

(R,≤) is isomorphic to the lexicographic product of (Z,≤) with ([0,1),≤), where ≤ is the ordinary order for real numbers.

Proof:

Every x∈R can be uniquely expressed in the form x=n+t where n=⌊x⌋∈Z is the integer part and t=x−n∈[0,1) is the remainder. Thus x↦(n,t) is a one-to-one function from R onto Z×[0,1). For example, 5.3 maps to (5,0.3), while −6.7 maps to (−7,0.3). Suppose that x=m+s, y=n+t∈R, where of course m, n∈Z are the integer parts of x and y, respectively, and s, t∈[0,1) are the corresponding remainders. Then x<y if and only if m<n or m=n and s<t. Again, to illustrate with real numbers, we can tell that 5.3<7.8 just by comparing the integer parts: 5<7. We can ignore the remainders. On the other hand, to see that 6.4<6.7 we need to compare the remainders: 0.4<0.7 since the integer parts are the same.
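Since Python compares tuples lexicographically, the isomorphism in this proof can be checked numerically: map x to the pair (⌊x⌋, x−⌊x⌋) and compare pairs. A rough sketch (the helper name int_frac is hypothetical, and the remainders are subject to ordinary floating-point error):

```python
import math

def int_frac(x):
    """Map a real number to (integer part, remainder), as in the proof."""
    n = math.floor(x)
    return (n, x - n)

print(int_frac(5.3))                   # (5, 0.2999...)  -- remainder ≈ 0.3
print(int_frac(-6.7))                  # (-7, 0.2999...) -- remainder ≈ 0.3

# Tuples compare lexicographically, matching the usual order on R:
print(int_frac(5.3) < int_frac(7.8))   # True, decided by the integer parts
print(int_frac(6.4) < int_frac(6.7))   # True, decided by the remainders
```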

Limits of Sequences of Real Numbers

Suppose that (a1,a2,…) is a sequence of real numbers.

The sequence inf{an,an+1,…} is increasing in n∈N+.

Since the sequence of infimums in the last result is increasing, the limit exists in R∪{−∞,∞}, and is called the limit inferior of the original sequence:


lim infn→∞an=limn→∞inf{an,an+1,…}

The sequence sup{an,an+1,…} is decreasing in n∈N+.

Since the sequence of supremums in the last result is decreasing, the limit exists in R∪{−∞,∞}, and is called the limit superior of the original sequence:

lim supn→∞an=limn→∞sup{an,an+1,…}
Note that lim infn→∞an≤lim supn→∞an, and equality holds if and only if limn→∞an exists (and is the common value).
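A rough numerical illustration only (not a proof): for the sequence an=(−1)n(1+1/n), whose limit inferior is −1 and limit superior is 1, the truncated tail infimums and supremums can be inspected directly. The horizon N and the variable names below are arbitrary choices for this sketch.

```python
# Approximate tail infimums/supremums of a_n = (-1)^n * (1 + 1/n), truncating
# each tail at a finite horizon N (an approximation of the infinite tails).
N = 10_000
a = [(-1) ** n * (1 + 1 / n) for n in range(1, N + 1)]

for n in (1, 10, 100):
    tail = a[n - 1:]                       # a_n, a_{n+1}, ..., a_N
    print(n, min(tail), max(tail))
# n=1:   min ≈ -2.0,    max = 1.5
# n=10:  min ≈ -1.0909, max = 1.1
# n=100: min ≈ -1.0099, max = 1.01
# The tail infimums increase toward -1 and the tail supremums decrease toward 1,
# so liminf a_n = -1, limsup a_n = 1, and lim a_n does not exist.
```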

Computational Exercises

Let S={2,3,4,6,12}.

1. Sketch the Hasse graph corresponding to the ordinary order ≤ on S.
2. Sketch the Hasse graph corresponding to the division partial order ∣ on S.

Answer:
1. For the ordinary order ≤, the Hasse graph is the chain 2→3→4→6→12 (each element covers the previous one).
2. For the division order ∣, the covering edges are 2→4, 2→6, 3→6, 4→12, and 6→12.

Consider the ordinary order ≤ on the set of real numbers R, and let A=[a,b) where a<b. Find each of the following that exist:

1. The set of minimal elements of A
2. The set of maximal elements of A

3. min(A)

4. max(A)

5. The set of lower bounds of A

6. The set of upper bounds of A

7. inf(A)


8. sup(A)

Answer:
1. {a}
2. ∅
3. a

4. Does not exist

5. (−∞,a]

6. [b,∞)

7. a

8. b

Again consider the division partial order ∣ on the set of positive integers N+ and let A={2,3,4,6,12}. Find each of the following that exist:

1. The set of minimal elements of A
2. The set of maximal elements of A

3. min(A)

4. max(A)

5. The set of lower bounds of A

6. The set of upper bounds of A

7. inf(A)

8. sup(A).

Answer:
1. {2,3}
2. {12}

3. Does not exist

4. 12

5. {1}

6. {12,24,36,…}

7. 1

8. 12

Let S={a,b,c}.


1. Give P(S) in list form.
2. Describe the Hasse graph of (P(S),⊆).

Answer:
1. P(S)={∅,{a},{b},{c},{a,b},{a,c},{b,c},S}
2. For A∈P(S) and x∈S∖A, there is a directed edge from A to A∪{x}.

Note that the Hasse graph of ⊇ looks the same as the graph of ⊆, except for the labels on the vertices. This symmetry is a consequence of the complement relationship: the map A↦Ac reverses the subset order.

Let S={a,b,c,d}.

1. Give P(S) in list form.
2. Describe the Hasse graph of (P(S),⊆).

Answer:
1. P(S)={∅,{a},{b},{c},{d},{a,b},{a,c},{a,d},{b,c},{b,d},{c,d},{a,b,c},{a,b,d},{a,c,d},{b,c,d},S}
2. For A∈P(S) and x∈S∖A, there is a directed edge from A to A∪{x}.

Note again that the Hasse graph of ⊇ looks the same as the graph of ⊆, except for the labels on the vertices. As before, this symmetry is a consequence of the complement relationship.

Suppose that A and B are subsets of a universal set S. Let A denote the collection of the 16 subsets of S that can be constructed from A and B using the set operations. Show that (A,⊆) is isomorphic to the partially ordered set in the previous exercise.

Proof:

Let a=A∩B, b=A∩Bc, c=Ac∩B, d=Ac∩Bc. Our basic assumption is that A and B are in general position, so that a, b, c, d are distinct and nonempty. Note also that {a,b,c,d} partitions S. Now, map each subset U of {a,b,c,d} to ⋃U. This function is an isomorphism from P({a,b,c,d}) onto A. That is, for subsets U and V of {a,b,c,d}, U⊆V if and only if ⋃U⊆⋃V.


5. Equivalence Relations

Basic Theory

Definitions

A relation ≈ on a nonempty set S that is reflexive, symmetric, and transitive is said to be an equivalence relation on S. Thus, for all x,y,z∈S,

1. x≈x, the reflexive property.
2. If x≈y then y≈x, the symmetric property.

3. If x≈y and y≈z then x≈z, the transitive property.

As the name and notation suggest, an equivalence relation is intended to define a type of equivalence among the elements of S. Like partial orders, equivalence relations occur naturally in most areas of mathematics, including probability.

Suppose that ≈ is an equivalence relation on S. The equivalence class of an element x∈S is the set of all elements that are equivalent to x, and is denoted

[x]={y∈S:y≈x}

Results

The most important result is that an equivalence relation on a set S defines a partition of S, by means of the equivalence classes.

Suppose that ≈ is an equivalence relation on a set S.

1. If x≈y then [x]=[y].
2. If x and y are not equivalent then [x]∩[y]=∅.

3. The collection of (distinct) equivalence classes is a partition of S into nonempty sets.

Proof: 1. Suppose that x≈y. If u∈[x] then u≈x and hence u≈y by the transitive property. Hence u∈[y].

Similarly, if u∈[y] then u≈y. But y≈x by the symmetric property, and hence u≈x by the transitive property. Hence u∈[x].


2. Suppose that x and y are not equivalent. If u∈[x]∩[y], then u∈[x] and u∈[y], so u≈x and u≈y. But then x≈u by the symmetric property, and then x≈y by the transitive property. This is a contradiction, so [x]∩[y]=∅.

3. From (a) and (b), the (distinct) equivalence classes are disjoint. If x∈S, then x≈x by the reflexive property, and hence x∈[x]. Therefore ⋃x∈S[x]=S.

Sometimes the set S of equivalence classes is denoted S/≈. The idea is that the equivalence classes are new objects obtained by identifying elements in S that are equivalent. Conversely, every partition of a set defines an equivalence relation on the set.

Suppose that S is a collection of nonempty sets that partition a given set S. Define the relation ≈ on S by x≈y if and only if x∈A and y∈A for some A∈S.

1. ≈ is an equivalence relation.
2. S is the set of equivalence classes.

Proof:
1. If x∈S, then x∈A for some A∈S, since S partitions S. Hence x≈x, and so the reflexive property holds.

Next, ≈ is trivially symmetric by definition. Finally, suppose that x≈y and y≈z. Then x,y∈A for some A∈S and y,z∈B for some B∈S. But then y∈A∩B. The sets in S are disjoint, so A=B. Hence x,z∈A, so x≈z. Thus ≈ is transitive.

2. If x∈S, then x∈A for a unique A∈S, and then by definition, [x]=A.

A partition of S. Any two points in the same partition set are equivalent.

Sometimes the equivalence relation ≈ associated with a given partition S is denoted S/S. The idea, of course, is that elements in the same set of the partition are equivalent.

The process of forming a partition from an equivalence relation, and the process of forming an equivalence relation from a partition are inverses of each other.

1. If we start with an equivalence relation ≈ on S, form the associated partition, and then construct the equivalence relation associated with the partition, then we end up with the original equivalence relation. In modular notation, S/(S/≈)=≈.

2. If we start with a partition S of S, form the associated equivalence relation, and then form the partition associated with the equivalence relation, then we end up with the original partition. In modular notation, S/(S/S)=S.

Suppose that S is a nonempty set. The most basic equivalence relation on S is the equality relation =. In this case [x]={x} for each x∈S. At the other extreme is the trivial relation ≈ defined by x≈y for all x,y∈S. In this case S is the only equivalence class.


Every function f defines an equivalence relation on its domain, known as the equivalence relation associated with f. Moreover, the equivalence classes have a simple description in terms of the inverse images of f.

Suppose that f:S→T. Define the relation ≈ on S by x≈y if and only if f(x)=f(y).

1. The relation ≈ is an equivalence relation on S.
2. The set of equivalence classes is S={f−1{t}:t∈range(f)}.

3. The function F:S→T defined by F([x])=f(x) is well defined and is one-to-one.

Proof:
1. If x∈S then trivially f(x)=f(x), so x≈x. Hence ≈ is reflexive. If x≈y then f(x)=f(y) so trivially f(y)=f(x) and hence y≈x. Thus ≈ is symmetric. If x≈y and y≈z then f(x)=f(y) and f(y)=f(z), so trivially f(x)=f(z) and so x≈z. Hence ≈ is transitive.

2. Recall that t∈range(f) if and only if f(x)=t for some x∈S. Then by definition, [x]=f−1{t}={y∈S:f(y)=t}={y∈S:f(y)=f(x)}

3. From (1), [x]=[y] if and only if x≈y if and only if f(x)=f(y). This shows both that F is well defined, and that F is one-to-one.
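Here is a minimal Python sketch of this construction on a finite set (the helper name classes_of is hypothetical): grouping the elements of S by their image under f produces exactly the nonempty inverse images, that is, the equivalence classes.

```python
from collections import defaultdict

# Equivalence classes associated with f on a finite set S: x ≈ y iff f(x) == f(y).
def classes_of(S, f):
    groups = defaultdict(set)
    for x in S:
        groups[f(x)].add(x)          # each group is f^{-1}{t} for t = f(x)
    return list(groups.values())

S = range(-3, 4)
print(classes_of(S, lambda x: x * x))
# [{-3, 3}, {-2, 2}, {-1, 1}, {0}]  (each class has the form {x, -x})
```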

The equivalence relation on S associated with f:S→T

Suppose again that f:S→T.

1. If f is one-to-one then the equivalence relation associated with f is the equality relation, and hence [x]={x} for each x∈S.

2. If f is a constant function then the equivalence relation associated with f is the trivial relation, and hence S is the only equivalence class.

Proof:
1. If f is one-to-one, then x≈y if and only if f(x)=f(y) if and only if x=y.
2. If f is constant on S then f(x)=f(y) and hence x≈y for all x,y∈S.

Equivalence relations associated with functions are universal: every equivalence relation is of this form:

Suppose that ≈ is an equivalence relation on a set S. Define f:S→P(S) by f(x)=[x]. Then ≈ is the equivalence relation associated with f.

Proof:

From (1), x≈y if and only if [x]=[y] if and only if f(x)=f(y).


Suppose that ≈ and ≅ are equivalence relations on a set S. Let ≡ denote the intersection of ≈ and ≅ (thought of as subsets of S×S). Equivalently, x≡y if and only if x≈y and x≅y.

1. ≡ is an equivalence relation on S.
2. [x]≡=[x]≈∩[x]≅.

Suppose that we have a relation that is reflexive and transitive, but fails to be a partial order because it's not anti-symmetric. The relation and its inverse naturally lead to an equivalence relation, and then in turn, the original relation defines a true partial order on the equivalence classes. This is a common construction, and the details are given in the next theorem.

Suppose that ⪯ is a relation on a set S that is reflexive and transitive. Define the relation ≈ on S by x≈y if and only if x⪯y and y⪯x.

1. ≈ is an equivalence relation on S.
2. If A and B are equivalence classes and x⪯y for some x∈A and y∈B, then u⪯v for all u∈A and v∈B.

3. Define the relation ⪯ on the collection of equivalence classes S by A⪯B if and only if x⪯y for some (and hence all) x∈A and y∈B. Then ⪯ is a partial order on S.

Proof:
1. If x∈S then x⪯x since ⪯ is reflexive. Hence x≈x, so ≈ is reflexive. Clearly ≈ is symmetric by the symmetry of the definition. Suppose that x≈y and y≈z. Then x⪯y, y⪯z, z⪯y and y⪯x. Hence x⪯z and z⪯x since ⪯ is transitive. Therefore x≈z so ≈ is transitive.

2. Suppose that A and B are equivalence classes of ≈ and that x⪯y for some x∈A and y∈B. If u∈A and v∈B, then x≈u and y≈v. Therefore u⪯x and y⪯v. By transitivity, u⪯v.

3. Suppose that A∈S. If x,y∈A then x≈y and hence x⪯y. Therefore A⪯A and so ⪯ is reflexive. Next suppose that A,B∈S and that A⪯B and B⪯A. If x∈A and y∈B then x⪯y and y⪯x. Hence x≈y so A=B. Therefore ⪯ is antisymmetric. Finally, suppose that A,B,C∈S and that A⪯B and B⪯C. Note that B≠∅ so let y∈B. If x∈A,z∈C then x⪯y and y⪯z. Hence x⪯z and therefore A⪯C. So ⪯ is transitive.

A prime example of the construction in the previous theorem occurs when we have a function whose range space is partially ordered. We can construct a partial order on the equivalence classes in the domain that are associated with the function.

Suppose that S and T are sets and that ⪯T is a partial order on T. Suppose also that f:S→T. Define the relation ⪯S on S by x⪯Sy if and only if f(x)⪯Tf(y).

1. ⪯S is reflexive and transitive.
2. The equivalence relation on S constructed in the previous theorem is the equivalence relation associated with f.

3. ⪯S can be extended to a partial order on the equivalence classes corresponding to f.

Proof:


1. If x∈S then f(x)⪯Tf(x) since ⪯T is reflexive, and hence x⪯Sx. Thus ⪯S is reflexive. Suppose that x,y,z∈S and that x⪯Sy and y⪯Sz. Then f(x)⪯Tf(y) and f(y)⪯Tf(z). Hence f(x)⪯Tf(z) since ⪯T is transitive. Thus ⪯S is transitive.

2. For the equivalence relation ≈ on S constructed in (8), x≈y if and only if x⪯Sy and y⪯Sx if and only if f(x)⪯Tf(y) and f(y)⪯Tf(x) if and only if f(x)=f(y), since ⪯T is antisymmetric. Thus ≈ is the equivalence relation associated with f.

3. This follows immediately from (8) and parts (a) and (b). If u,v∈range(f), then f−1({u})⪯Sf−1({v}) if and only if u⪯Tv.

Examples and Applications

Simple functions

Give the equivalence classes explicitly for the functions from R into R defined below:

1. f(x)=x2.
2. g(x)=⌊x⌋.

3. h(x)=sin(x).

Answer:
1. [x]={x,−x}
2. [x]=[⌊x⌋,⌊x⌋+1)

3. [x]={x+2nπ:n∈Z}∪{(2n+1)π−x:n∈Z}

Calculus

Suppose that I is a fixed interval of R, and that S is the set of differentiable functions from I into R. Consider the equivalence relation associated with the derivative operator D on S, so that D(f)=f′. For f∈S, give a simple description of [f].

Answer:

[f]={f+c:c∈R}

Congruence

Recall the division relation ∣ from N+ to Z: For d∈N+ and n∈Z, d∣n means that n=kd for some k∈Z. In words, d divides n or equivalently n is a multiple of d. In the previous section, we showed that ∣ is a partial order on N+.

Now fix d∈N+. Define the relation ≡d on Z by m≡dn if and only if d∣(n−m). The relation ≡d is known as congruence modulo d. Also, let rd:Z→{0,1,…,d−1} be defined so that rd(n) is the remainder when n is divided by d. Specifically, recall that by the Euclidean division theorem, named for Euclid of course, n∈Z can be written uniquely in the form n=kd+q where k∈Z and q∈{0,1,…,d−1}, and then rd(n)=q.


Congruence modulo d.

1. ≡d is the equivalence relation associated with the function rd.
2. There are d distinct equivalence classes, given by [q]d={q+kd:k∈Z} for q∈{0,1,…,d−1}.

Proof:
1. Recall that for the equivalence relation associated with rd, integers m and n are equivalent if and only if rd(m)=rd(n). By the division theorem, m=jd+p and n=kd+q, where j,k∈Z and p,q∈{0,1,…,d−1}, and these representations are unique. Thus n−m=(k−j)d+(q−p), and so m≡dn if and only if d∣(n−m) if and only if p=q if and only if rd(m)=rd(n).

2. Recall that the equivalence classes are rd−1{q} for q∈range(rd)={0,1,…,d−1}. By the division theorem, rd−1{q}={kd+q:k∈Z}.

Explicitly give the equivalence classes for ≡4, congruence mod 4.

Answer:
[0]4={0,4,8,12,…}∪{−4,−8,−12,−16,…}
[1]4={1,5,9,13,…}∪{−3,−7,−11,−15,…}

[2]4={2,6,10,14,…}∪{−2,−6,−10,−14,…}

[3]4={3,7,11,15,…}∪{−1,−5,−9,−13,…}
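A quick computational cross-check using Python's remainder operator, which agrees with rd on negative integers as well; only a finite window of each (infinite) class is shown:

```python
# A finite window of each congruence class mod 4: n belongs to [n % 4]_4.
d = 4
window = range(-8, 9)
for q in range(d):
    print(q, [n for n in window if n % d == q])
# 0 [-8, -4, 0, 4, 8]
# 1 [-7, -3, 1, 5]
# 2 [-6, -2, 2, 6]
# 3 [-5, -1, 3, 7]
```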

Linear Algebra

Linear algebra provides several examples of important and interesting equivalence relations. To set the stage, let Rm×n denote the set of m×n matrices with real entries, for m,n∈N+.

First recall that the following are row operations on a matrix:

1. Multiply a row by a non-zero real number.
2. Interchange two rows.

3. Add a multiple of a row to another row.

Row operations are essential for inverting matrices and solving systems of linear equations. If A∈Rm×n and B∈Rm×n, then A and B are said to be row equivalent if A can be transformed into B by a finite sequence of row operations.

Row equivalence is an equivalence relation on Rm×n.

Proof.

If A∈Rm×n, then A is row equivalent to itself: we can simply do nothing, or if you prefer, we can multiply the first row of A by 1. For symmetry, the key is that each row operation can be reversed by another row operation: multiplying a row by c≠0 is reversed by multiplying the same row of the resulting matrix by 1/c. Interchanging two rows is reversed by interchanging the same two rows of the resulting matrix. Adding c times row i to row j is reversed by adding −c times row i to row j in the resulting matrix. Thus, if we can transform A into B by a finite sequence of row operations, then we can transform B into A by applying the reversed row operations in the reverse order. Transitivity is clear: If we can transform A into B by a sequence of row operations and B into C by another sequence of row operations, then we can transform A into C by putting the two sequences together.

Next recall that n×n matrices A and B are said to be similar if there exists an invertible n×n matrix P such that P−1AP=B. Similarity is very important in the study of linear transformations, change of basis, and the theory of eigenvalues and eigenvectors.

Similarity is an equivalence relation on Rn×n.

Proof:

If A∈Rn×n then A=I−1AI, where I is the n×n identity matrix, so A is similar to itself. Suppose that A,B∈Rn×n and that A is similar to B so that B=P−1AP for some invertible P∈Rn×n. Then A=PBP−1=(P−1)−1BP−1 so B is similar to A. Finally, suppose that A,B,C∈Rn×n and that A is similar to B and that B is similar to C. Then B=P−1AP and C=Q−1BQ for some invertible P,Q∈Rn×n. Then C=Q−1P−1APQ=(PQ)−1A(PQ), so A is similar to C.

Next recall that for A∈Rm×n, the transpose of A is the matrix AT∈Rn×m with the property that the (i,j) entry of A is the (j,i) entry of AT, for i∈{1,2,…,m} and j∈{1,2,…,n}. Simply stated, AT is the matrix whose rows are the columns of A. For the theorem that follows, we need to remember that (AB)T=BTAT for A∈Rm×n and B∈Rn×k, and (AT)−1=(A−1)T if A∈Rn×n is invertible.

Matrices A,B∈Rn×n are said to be congruent (not to be confused with congruent integers above) if there exists an invertible P∈Rn×n such that B=PTAP. Congruence is important in the study of orthogonal matrices and change of basis.

Congruence is an equivalence relation on Rn×n

Proof:

If A∈Rn×n then A=ITAI, where again I is the n×n identity matrix, so A is congruent to itself. Suppose that A,B∈Rn×n and that A is congruent to B so that B=PTAP for some invertible P∈Rn×n. Then A=(PT)−1BP−1=(P−1)TBP−1 so B is congruent to A. Finally, suppose that A,B,C∈Rn×n and that A is congruent to B and that B is congruent to C. Then B=PTAP and C=QTBQ for some invertible P,Q∈Rn×n. Then C=QTPTAPQ=(PQ)TA(PQ), so A is congruent to C.

Number Systems

Equivalence relations play an important role in the construction of complex mathematical structures from simpler ones. Often the objects in the new structure are equivalence classes of objects constructed from the simpler structures, modulo an equivalence relation that captures the essential properties of the new objects.

The construction of number systems is a prime example of this general idea. The next exercise explores the construction of rational numbers from integers.

Define a relation ≈ on Z×N+ by (j,k)≈(m,n) if and only if jn=km.


1. ≈ is an equivalence relation.
2. Define m/n=[(m,n)], the equivalence class generated by (m,n), for m∈Z and n∈N+. This definition captures the essential properties of the rational numbers.

Proof:
1. For (m,n)∈Z×N+, mn=nm of course, so (m,n)≈(m,n). Hence ≈ is reflexive. If (j,k), (m,n)∈Z×N+ and (j,k)≈(m,n), then jn=km so trivially mk=nj, and hence (m,n)≈(j,k). Thus ≈ is symmetric. Finally, suppose that (j,k),(m,n),(p,q)∈Z×N+ and that (j,k)≈(m,n) and (m,n)≈(p,q). Then jn=km and mq=np, so jnp=kmp and hence jmq=kmp. If m≠0, canceling m gives jq=kp. If m=0, then jn=km=0 and np=mq=0, so j=0 and p=0 (since n, q>0), and again jq=kp. Hence (j,k)≈(p,q) so ≈ is transitive.

2. Suppose that j/k and m/n are rational numbers in the usual, informal sense, where j,m∈Z and k,n∈N+. Then j/k=m/n if and only if jn=km if and only if (j,k)≈(m,n), so it makes sense to define m/n as the equivalence class generated by (m,n). Addition and multiplication are defined in the usual way: if (j,k),(m,n)∈Z×N+ then

j/k+m/n=(jn+mk)/(kn),  (j/k)⋅(m/n)=(jm)/(kn)

The definitions are consistent; that is, they do not depend on the particular representatives of the equivalence classes.
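Here is a minimal sketch of the equivalence in part (a), cross-checked against Python's Fraction type, which implements exactly this identification of pairs with equal cross products (the helper name equiv is hypothetical):

```python
from fractions import Fraction

# (j, k) ≈ (m, n) iff j*n == k*m, for j, m in Z and k, n in N+.
def equiv(p, q):
    (j, k), (m, n) = p, q
    return j * n == k * m

print(equiv((1, 2), (3, 6)))             # True: both pairs represent 1/2
print(equiv((1, 2), (2, 3)))             # False
print(Fraction(1, 2) == Fraction(3, 6))  # True: Fraction makes the same identification
print(Fraction(1, 2) + Fraction(1, 3))   # 5/6, matching the addition rule above
```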


6. Cardinality

Basic Theory

Definitions

Suppose that S is a non-empty collection of sets. We define a relation ≈ on S by A≈B if and only if there exists a one-to-one function f from A onto B. The relation ≈ is an equivalence relation on S. That is, for all A,B,C∈S,

1. A≈A, the reflexive property
2. If A≈B then B≈A, the symmetric property

3. If A≈B and B≈C  then A≈C, the transitive property

Proof:
1. The identity function IA on A, given by IA(x)=x for x∈A, maps A one-to-one onto A. Hence A≈A.
2. If A≈B then there exists a one-to-one function f from A onto B. But then f−1 is a one-to-one function from B onto A, so B≈A.

3. Suppose that A≈B and B≈C. Then there exists a one-to-one function f from A onto B and a one-to-one function g from B onto C. But then g∘f is a one-to-one function from A onto C, so A≈C.

A one-to-one function f from A onto B is sometimes called a bijection. Thus if A≈B then A and B are in one-to-one correspondence and are said to have the same cardinality. Thus, the equivalence classes under this equivalence relation capture the notion of having the same number of elements.

Let N0=∅, and for k∈N+, let Nk={0,1,…,k−1}. As always, N={0,1,2,…} is the set of all natural numbers.


Suppose that A is a set.

1. A is finite if A≈Nk for some k∈N, in which case k is the cardinality of A, and we write #(A)=k.
2. A is infinite if A is not finite.

3. A is countably infinite if A≈N.

4. A is countable if A is finite or countably infinite.

5. A is uncountable if A is not countable.

In part (a), think of Nk as a reference set with k elements; any other set with k elements must be equivalent to this one. We will study the cardinality of finite sets in the next two sections on Counting Measure and Combinatorial Structures. In this section, we will concentrate primarily on infinite sets. In part (d), a countable set is one that can be enumerated or counted by putting the elements into one-to-one correspondence with Nk for some k∈N or with all of N. An uncountable set is one that cannot be so counted. Countable sets play a special role in probability theory, as in many other branches of mathematics. A priori, it's not clear that there are uncountable sets, but we will soon see examples.

Preliminary Examples

If S is a set, recall that P(S) denotes the power set of S (the set of all subsets of S). If A and B are sets, then AB is the set of all functions from B into A. In particular, {0,1}S denotes the set of functions from S into {0,1}.

If S is a set then P(S)≈{0,1}S.

Proof:

The mapping that takes a set A∈P(S) into its indicator function 1A∈{0,1}S is one-to-one and onto. Specifically, if A,B∈P(S) and 1A=1B, then A=B, so the mapping is one-to-one. On the other hand, if f∈{0,1}S then f=1A where A={x∈S:f(x)=1}. Hence the mapping is onto.

Next are some examples of countably infinite sets.

The following sets are countably infinite:

1. The set of even natural numbers E={0,2,4,…}
2. The set of integers Z

Proof:
1. The function f:N→E given by f(n)=2n is one-to-one and onto.
2. The function g:N→Z given by g(n)=n/2 if n is even and g(n)=−(n+1)/2 if n is odd, is one-to-one and onto.
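The bijection g used for Z above is easy to write out and check for a few values; a minimal sketch:

```python
# g: N -> Z from the proof: g(n) = n/2 for even n and -(n+1)/2 for odd n.
def g(n):
    return n // 2 if n % 2 == 0 else -(n + 1) // 2

print([g(n) for n in range(9)])   # [0, -1, 1, -2, 2, -3, 3, -4, 4]
# The even naturals map onto 0, 1, 2, ... and the odd naturals onto -1, -2, -3, ...
```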

At one level, it might seem that E has only half as many elements as N while Z has twice as many elements as N. But as the previous result shows, that point of view is incorrect: N, E, and Z all have the same cardinality (and are countably infinite). The next example shows that there are indeed uncountable sets.

If A is a set with at least two elements then S=AN, the set of all functions from N into A, is uncountable.


Proof:

The proof is by contradiction, and uses a nice trick known as the diagonalization method. Suppose that S is countably infinite (it's clearly not finite), so that the elements of S can be enumerated: S={f0,f1,f2,…}. Let a and b denote distinct elements of A and define g:N→A by g(n)=b if fn(n)=a and g(n)=a if fn(n)≠a. Note that g≠fn for each n∈N, so g∉S. This contradicts the fact that S is the set of all functions from N into A.

Subsets of Infinite Sets

Surely a set must be at least as large as any of its subsets, in terms of cardinality. On the other hand, by the example above, the set of natural numbers N, the set of even natural numbers E and the set of integers Z all have exactly the same cardinality, even though E⊂N⊂Z. In this subsection, we will explore some interesting and somewhat paradoxical results that relate to subsets of infinite sets. Along the way, we will see that the countable infinity is the smallest of the infinities.

If S is an infinite set then S has a countable infinite subset.

Proof:

Select a0∈S. It's possible to do this since S is infinite and therefore nonempty. Inductively, having chosen {a0,a1,…,ak−1}⊆S, select ak∈S∖{a0,a1,…,ak−1}. Again, it's possible to do this since S is not finite. Manifestly, {a0,a1,…} is a countably infinite subset of S.

A set S is infinite if and only if S is equivalent to a proper subset of S.

Proof:

If S is finite, then S is not equivalent to a proper subset by the pigeonhole principle. If S is infinite, then S has countably infinite subset {a0,a1,a2,…} by the previous result. Define the function f:S→S by f(an)=a2n and f(x)=x for x∈S∖{a0,a1,a2,…}. Then f maps S one-to-one onto S∖{a1,a3,a5,…}.

When S was infinite in the proof of the previous result, not only did we map S one-to-one onto a proper subset, we actually threw away a countably infinite subset and still maintained equivalence. Similarly, we can add a countably infinite set to an infinite set S without changing the cardinality.

If S is an infinite set and B is a countable set, then S≈S∪B.

Proof:

Consider the most extreme case where B is countably infinite and disjoint from S. Then S has a countably infinite subset A={a0,a1,a2,…} by a result above, and B={b0,b1,b2,…}. Define the function f:S→S∪B by f(an)=an/2 if n is even, f(an)=b(n−1)/2 if n is odd, and f(x)=x if x∈S∖{a0,a1,a2,…}. Then f maps S one-to-one onto S∪B.

In particular, if S is uncountable and B is countable then S∪B and S∖B have the same cardinality as S, and in particular are uncountable.


In terms of the dichotomies finite-infinite and countable-uncountable, a set is indeed at least as large as a subset. First we need a preliminary result.

If S is countably infinite and A⊆S then A is countably infinite.

Proof:

It suffices to show that if A is an infinite subset of S then A is countably infinite. Since S is countably infinite, it can be enumerated: S={x0,x1,x2,…}. Let ni be the ith smallest index such that xni∈A. Then A={xn0,xn1,xn2,…} and hence is countably infinite.

Suppose that A⊆B.

1. If B is finite then A is finite.
2. If A is infinite then B is infinite.

3. If B is countable then A is countable.

4. If A is uncountable then B is uncountable.

Proof:
1. This is clear from the definition of a finite set.
2. This is the contrapositive of (a).

3. If A is finite, then A is countable. If A is infinite, then B is infinite by (b) and hence is countably infinite. But then A is countably infinite by the previous result.

4. This is the contrapositive of (c).

Comparisons by one-to-one and onto functions

We will look deeper at the general question of when one set is at least as big as another, in the sense of cardinality. Not surprisingly, this will eventually lead to a partial order on the cardinality equivalence classes.

First note that if there exists a function that maps a set A one-to-one into a set B, then in a sense, there is a copy of A contained in B. Hence B should be at least as large as A.

Suppose that f:A→B is one-to-one.

1. If B is finite then A is finite.
2. If A is infinite then B is infinite.

3. If B is countable then A is countable.

4. If A is uncountable then B is uncountable.

Proof:

Note that f maps A one-to-one onto f(A). Hence A≈f(A) and f(A)⊆B. The results now follow from the previous comparison result:


1. If B is finite then f(A) is finite and hence A is finite.
2. If A is infinite then f(A) is infinite and hence B is infinite.

3. If B is countable then f(A) is countable and hence A is countable.

4. If A is uncountable then f(A) is uncountable and hence B is uncountable.

On the other hand, if there exists a function that maps a set A onto a set B, then in a sense, there is a copy of B contained in A. Hence A should be at least as large as B.

Suppose that f:A→B is onto.

1. If A is finite then B is finite.
2. If B is infinite then A is infinite.

3. If A is countable then B is countable.

4. If B is uncountable then A is uncountable.

Proof:

For each y∈B, select a specific x∈A with f(x)=y (if you are persnickety, you may need to invoke the axiom of choice). Let C be the set of chosen points. Then f maps C one-to-one onto B, so C≈B and C⊆A. The results now follow from the comparison result above:

1. If A is finite then C is finite and hence B is finite.
2. If B is infinite then C is infinite and hence A is infinite.

3. If A is countable then C is countable and hence B is countable.

4. If B is uncountable then C is uncountable and hence A is uncountable.

The previous exercise also could be proved from the one before, since if there exists a function f mapping A onto B, then there exists a function g mapping B one-to-one into A. This duality is proven in the discussion of the axiom of choice. A simple and useful corollary of the previous two theorems is that if B is a given countably infinite set, then a set A is countable if and only if there exists a one-to-one function f from A into B, if and only if there exists a function g from B onto A.

If Ai is a countable set for each i in a countable index set I, then ⋃i∈IAi is countable.

Proof:

Consider the most extreme case in which the index set I is countably infinite. Since Ai is countable, there exists a function fi that maps N onto Ai for each i∈N. Let M={2i3j:(i,j)∈N×N}. Note that the points in M are distinct, that is, 2i3j≠2m3n if (i,j),(m,n)∈N×N and (i,j)≠(m,n). Hence M is infinite, and since M⊂N, M is countably infinite. The function f given by f(2i3j)=fi(j) maps M onto ⋃i∈IAi, and hence this last set is countable.

If A and B are countable then A×B is countable.


Proof:

There exists a function f that maps N onto A, and there exists a function g that maps N onto B. Again, let M={2i3j:(i,j)∈N×N} and recall that M is countably infinite. Define h:M→A×B by h(2i3j)=(f(i),g(j)). Then h maps M onto A×B and hence this last set is countable.
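The pairing (i,j)↦2i3j used in both of the last two proofs is one-to-one by uniqueness of prime factorization; here is a quick empirical check on a finite grid (a sketch only, with an arbitrary grid size):

```python
# Check that (i, j) -> 2**i * 3**j produces distinct values on a finite grid of N x N.
encode = lambda i, j: 2 ** i * 3 ** j

codes = {encode(i, j) for i in range(50) for j in range(50)}
print(len(codes) == 50 * 50)   # True: all 2500 codes are distinct
print(encode(3, 2))            # 72 = 2^3 * 3^2
```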

The last result could also be proven from the one before, by noting that

A×B=⋃a∈A{a}×B

Both proofs work because the set M is essentially a copy of N×N, embedded inside of N. The last theorem generalizes to the statement that a finite product of countable sets is still countable. But, from the result above on sets of functions, a product of infinitely many sets (with at least 2 elements each) will be uncountable.

The set of rational numbers Q is countably infinite.

Proof:

The sets Z and N+ are countably infinite and hence the set Z×N+ is countably infinite. The function f:Z×N+→Q given by f(m,n)=m/n is onto.

A real number is algebraic if it is the root of a polynomial function (of degree 1 or more) with integer coefficients. Rational numbers are algebraic, as are rational roots of rational numbers (when defined). Moreover, the algebraic numbers are closed under addition, multiplication, and division. A real number is transcendental if it's not algebraic. The numbers e and π are transcendental, but we don't know very many other transcendental numbers by name. However, as we will see, most (in the sense of cardinality) real numbers are transcendental.

The set of algebraic numbers A is countably infinite.

Proof:

Let Z0=Z∖{0} and let Zn=Zn−1×Z0 for n∈N+. The set Zn is countably infinite for each n. Let C=⋃∞n=1Zn. Think of C as the set of coefficients and note that C is countably infinite. Let P denote the set of polynomials of degree 1 or more, with integer coefficients. The function (a0,a1,…,an)↦a0+a1x+⋯+anxn maps C onto P, and hence P is countable. For p∈P, let Ap denote the set of roots of p. A polynomial of degree n in P has at most n roots, by the fundamental theorem of algebra, so in particular Ap is finite for each p∈P. Finally, note that A=⋃p∈PAp and so A is countable. Of course N⊂A, so A is countably infinite.

Now let's look at some uncountable sets.

The interval [0,1) is uncountable.

Proof:

Recall that {0,1}N+ is the set of all functions from N+ into {0,1}, which in this case, can be thought of as infinite sequences or bit strings:

{0,1}N+={x=(x1,x2,…):xn∈{0,1} for all n∈N+}


By (4), this set is uncountable. Let N={x∈{0,1}N+:xn=1 for all but finitely many n}, the set of bit strings that eventually terminate in all 1s. Note that N=⋃∞n=1Nn where Nn={x∈{0,1}N+:xk=1 for all k≥n}. Clearly Nn is finite for all n∈N+, so N is countable, and therefore S={0,1}N+∖N is uncountable. In fact, S≈{0,1}N+. The function

x↦∑n=1∞ xn/2n

maps S one-to-one onto [0,1). In words every number in [0,1) has a unique binary expansion in the form of a sequence in S. Hence [0,1)≈S and in particular, is uncountable. The reason for eliminating the bit strings that terminate in 1s is to ensure uniqueness, so that the mapping is one-to-one. The bit string x1x2⋯xk0111⋯ corresponds to the same number in [0,1) as the bit string x1x2⋯xk1000⋯.

The following sets have the same cardinality, and in particular all are uncountable:

1. R, the set of real numbers.
2. Any interval I of R, as long as the interval is not empty or a single point.

3. R∖Q, the set of irrational numbers.

4. R∖A, the set of transcendental numbers.

5. P(N), the power set of N.

Proof:
1. The mapping x↦(2x−1)/(x(1−x)) maps (0,1) one-to-one onto R, so (0,1)≈R. But (0,1)=[0,1)∖{0}, so [0,1)≈(0,1)≈R, and all of these sets are uncountable by the previous result.
2. Suppose a,b∈R and a<b. The mapping x↦a+(b−a)x maps (0,1) one-to-one onto (a,b) and hence (a,b)≈(0,1)≈R. Also, [a,b)=(a,b)∪{a}, (a,b]=(a,b)∪{b}, and [a,b]=(a,b)∪{a,b}, so (a,b)≈[a,b)≈(a,b]≈[a,b]≈R. The function x↦ex maps R one-to-one onto (0,∞), so (0,∞)≈R. For a∈R, the function x↦a+x maps (0,∞) one-to-one onto (a,∞) and the mapping x↦a−x maps (0,∞) one-to-one onto (−∞,a) so (a,∞)≈(−∞,a)≈(0,∞)≈R. Next, [a,∞)=(a,∞)∪{a} and (−∞,a]=(−∞,a)∪{a}, so [a,∞)≈(−∞,a]≈R.

3. Q is countably infinite, so R∖Q≈R.

4. Similarly, A is countably infinite, so R∖A≈R.

5. If S is countably infinite, then by the previous result and (a), P(S)≈P(N+)≈{0,1}N+≈[0,1).

The Cardinality Partial Order

Suppose that S is a nonempty collection of sets. We define the relation ⪯ on S by A⪯B if and only if there exists a one-to-one function f from A into B, if and only if there exists a function g from B onto A. In light of the previous subsection, A⪯B should capture the notion that B is at least as big as A, in the sense of cardinality.

The relation ⪯ is reflexive and transitive.

Proof:


For A ∈ S, the identity function I_A : A → A given by I_A(x) = x is one-to-one (and also onto), so A ⪯ A. Suppose that A, B, C ∈ S and that A ⪯ B and B ⪯ C. Then there exist one-to-one functions f : A → B and g : B → C. But then g∘f : A → C is one-to-one, so A ⪯ C.

Thus, we can use the construction in the section on Equivalence Relations to first define an equivalence relation on S, and then extend ⪯ to a true partial order on the collection of equivalence classes. The only question that remains is whether the equivalence relation we obtain in this way is the same as the one that we have been using in our study of cardinality. Rephrased, the question is this: If there exists a one-to-one function from A into B and a one-to-one function from B into A, does there necessarily exist a one-to-one function from A onto B? Fortunately, the answer is yes; the result is known as the Schröder-Bernstein Theorem, named for Ernst Schröder and Felix Bernstein.

If A⪯B and B⪯A then A≈B.

Proof:

Set inclusion ⊆ is a partial order on P(A) (the power set of A) with the property that every subcollection of P(A) has a supremum (namely the union of the subcollection). Suppose that f maps A one-to-one into B and g maps B one-to-one into A. Define the function h:P(A)→P(A) by h(U)=A∖g[B∖f(U)] for U⊆A. Then h is increasing:

U ⊆ V ⟹ f(U) ⊆ f(V) ⟹ B∖f(V) ⊆ B∖f(U) ⟹ g[B∖f(V)] ⊆ g[B∖f(U)] ⟹ A∖g[B∖f(U)] ⊆ A∖g[B∖f(V)], that is, h(U) ⊆ h(V)

From the fixed point theorem for partially ordered sets, there exists U ⊆ A such that h(U) = U. Hence U = A∖g[B∖f(U)] and therefore A∖U = g[B∖f(U)]. Now define F : A → B by F(x) = f(x) if x ∈ U and F(x) = g^{−1}(x) if x ∈ A∖U.

Note that f maps U one-to-one onto f(U), and g maps B∖f(U) one-to-one onto A∖U, so F is well defined.

Next we show that F is one-to-one. Suppose that x_1, x_2 ∈ A and F(x_1) = F(x_2). If x_1, x_2 ∈ U then f(x_1) = f(x_2), so x_1 = x_2 since f is one-to-one. If x_1, x_2 ∈ A∖U then g^{−1}(x_1) = g^{−1}(x_2), so x_1 = x_2 since g^{−1} is one-to-one. If x_1 ∈ U and x_2 ∈ A∖U, then F(x_1) = f(x_1) ∈ f(U) while F(x_2) = g^{−1}(x_2) ∈ B∖f(U), so F(x_1) = F(x_2) is impossible.

Finally we show that F is onto. Let y ∈ B. If y ∈ f(U) then y = f(x) for some x ∈ U, so F(x) = y. If y ∈ B∖f(U) then x = g(y) ∈ A∖U, so F(x) = g^{−1}(x) = y.

We will write A ≺ B if A ⪯ B but A ≈ B does not hold. That is, there exists a one-to-one function from A into B, but there does not exist a function from A onto B. Note that ≺ would have its usual meaning if applied to the equivalence classes: [A] ≺ [B] if and only if [A] ⪯ [B] but [A] ≠ [B]. Intuitively, of course, A ≺ B means that B is strictly larger than A, in the sense of cardinality.

A≺B in each of the following cases:

1. A and B are finite and #(A) < #(B).
2. A is finite and B is countably infinite.
3. A is countably infinite and B is uncountable.

We close our discussion with the observation that for any set, there is always a larger set.

If S is a set then S≺P(S).

Proof:

First, it's trivial to map S one-to-one into P(S); just map x to {x}. Suppose now that f maps S onto P(S) and let R = {x ∈ S : x ∉ f(x)}. Since f is onto, there exists t ∈ S such that f(t) = R. Note that t ∈ f(t) if and only if t ∉ f(t), a contradiction. Hence there is no function from S onto P(S), and therefore S ≺ P(S).

The proof that a set cannot be mapped onto its power set is similar to the Russell paradox, named for Bertrand Russell.

The continuum hypothesis is the statement that there is no set whose cardinality is strictly between that of N and R. The continuum hypothesis actually started out as the continuum conjecture, until it was shown to be consistent with the usual axioms of set theory (by Kurt Gödel in 1940) and independent of those axioms, in the sense that its negation is also consistent (by Paul Cohen in 1963).

If S is uncountable, then there exists A ⊆ S such that A and A^c are uncountable.

Proof:

There is an easy proof, assuming the continuum hypothesis. Under this hypothesis, if S is uncountable then [0,1) ⪯ S. Hence there exists a one-to-one function f : [0,1) → S. Let A = f([0, 1/2)). Then A is uncountable, and since f([1/2, 1)) ⊆ A^c, A^c is uncountable.

There is a more complicated proof just using the axiom of choice.

7. Counting Measure

Basic Theory

Suppose that S is a finite set. For A⊆S, the cardinality of A is the number of elements in A, and is denoted #(A). The function # on P(S) is called counting measure. Counting measure plays a fundamental role in discrete probability structures, and particularly those that involve sampling from a finite set. The set S is typically very large, hence efficient counting methods are essential. The first combinatorial problem is attributed to the Greek mathematician Xenocrates.

In many cases, a set of objects can be counted by establishing a one-to-one correspondence between the given set and some other set. Naturally, the two sets have the same number of elements, but for some reason, the second set may be easier to count.

The Addition Rule

The addition rule of combinatorics is simply the additivity axiom of counting measure.

If {A_1, A_2, …, A_n} is a collection of pairwise disjoint subsets of S then

#(⋃_{i=1}^n A_i) = ∑_{i=1}^n #(A_i)

The addition rule

Simple Counting Rules

The counting rules in this subsection are simple consequences of the addition rule. Be sure to try the proofs yourself before reading the ones in the text.

#(A^c) = #(S) − #(A). This is the complement rule.

Proof:

Note that A and A^c are disjoint and their union is S. Hence #(A) + #(A^c) = #(S).

The complement rule

#(B∖A)=#(B)−#(A∩B). This is the difference rule.

Proof:

Note that A∩B and B∖A are disjoint and their union is B. Hence #(A∩B)+#(B∖A)=#(B).

If A⊆B then #(B∖A)=#(B)−#(A).

Proof:

This follows from the difference rule, since A∩B=A.

If A⊆B then #(A)≤#(B).

Proof:

This follows from the previous result: #(B)=#(A)+#(B∖A)≥#(A).

Thus, # is an increasing function, relative to the subset partial order ⊆ on P(S), and the ordinary order ≤ on N.
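These rules are easy to check on concrete sets. The snippet below is a quick illustration using Python's built-in set type; the particular sets are arbitrary.

```python
S = set(range(1, 11))      # universal set with 10 elements
A = {1, 2, 3, 4}
B = {3, 4, 5, 6, 7}

# Complement rule: #(A^c) = #(S) - #(A)
assert len(S - A) == len(S) - len(A)

# Difference rule: #(B \ A) = #(B) - #(A ∩ B)
assert len(B - A) == len(B) - len(A & B)

# Increasing property: A ∩ B ⊆ B, so #(A ∩ B) <= #(B)
assert (A & B) <= B and len(A & B) <= len(B)
```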

Inequalities

This subsection gives two inequalities that are useful for obtaining bounds on the number of elements in a set. The first is Boole's inequality (named after George Boole) which gives an upper bound on the cardinality of a union.

If {A_1, A_2, …, A_n} is a finite collection of subsets of S then

#(⋃_{i=1}^n A_i) ≤ ∑_{i=1}^n #(A_i)

Proof:

Let B_1 = A_1 and B_i = A_i ∖ (A_1 ∪ ⋯ ∪ A_{i−1}) for i ∈ {2, 3, …, n}. Note that {B_1, B_2, …, B_n} is a pairwise disjoint collection and has the same union as {A_1, A_2, …, A_n}. From the increasing property, #(B_i) ≤ #(A_i) for each i ∈ {1, 2, …, n}. Hence by the addition rule,

#(⋃_{i=1}^n A_i) = #(⋃_{i=1}^n B_i) = ∑_{i=1}^n #(B_i) ≤ ∑_{i=1}^n #(A_i)

Intuitively, Boole's inequality holds because parts of the union have been counted more than once in the expression on the right. The second inequality is Bonferroni's inequality (named after Carlo Bonferroni), which gives a lower bound on the cardinality of an intersection.

If {A_1, A_2, …, A_n} is a finite collection of subsets of S then

#(⋂_{i=1}^n A_i) ≥ #(S) − ∑_{i=1}^n [#(S) − #(A_i)]

Proof:

Using the complement rule, Boole's inequality, and De Morgan's law,

#(⋂_{i=1}^n A_i) = #(S) − #(⋃_{i=1}^n A_i^c) ≥ #(S) − ∑_{i=1}^n #(A_i^c) = #(S) − ∑_{i=1}^n [#(S) − #(A_i)]
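Both inequalities are easy to test numerically. The check below is a small illustration with randomly generated subsets of a 50-element universal set; the particular sizes are arbitrary.

```python
import random

random.seed(1)
S = set(range(50))
A = [set(random.sample(sorted(S), 45)) for _ in range(3)]   # three random 45-element subsets

union = set().union(*A)
inter = S.intersection(*A)

boole_bound = sum(len(Ai) for Ai in A)                         # upper bound for #(union)
bonferroni_bound = len(S) - sum(len(S) - len(Ai) for Ai in A)  # lower bound for #(intersection)

assert len(union) <= boole_bound
assert len(inter) >= bonferroni_bound
print(len(union), boole_bound, len(inter), bonferroni_bound)
```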

The Inclusion-Exclusion Formula

The inclusion-exclusion formula gives the cardinality of a union of sets in terms of the cardinality of the various intersections of the sets. The formula is useful because intersections are often easier to count. We start with the special cases of two sets and three sets. As usual, we assume that the sets are subsets of a finite universal set S.

If A and B are subsets of S then #(A∪B)=#(A)+#(B)−#(A∩B).

Proof:

Note first that A∪B=A∪(B∖A) and the latter two sets are disjoint. From the addition rule and the difference rule,

#(A∪B)=#(A)+#(B∖A)=#(A)+#(B)−#(A∩B)

The inclusion-exclusion theorem for two sets

If A, B, C are subsets of S then #(A∪B∪C)=#(A)+#(B)+#(C)−#(A∩B)−#(A∩C)−#(B∩C)+#(A∩B∩C).

Proof:

Note that A∪B∪C=(A∪B)∪[C∖(A∪B)] and that A∪B and C∖(A∪B) are disjoint. Using the addition rule and the difference rule,

#(A∪B∪C)=#(A∪B)+#[C∖(A∪B)]=#(A∪B)+#(C)−#[C∩(A∪B)]=#(A∪B)+#(C)−#[(A∩C)∪(B∩C)]

Now using the inclusion-exclusion rule for two sets (twice) we have #(A∪B∪C)=#(A)+#(B)−#(A∩B)+#(C)−#(A∩C)−#(B∩C)+#(A∩B∩C)

The inclusion-exclusion theorem for three sets

The inclusion-exclusion rule for two sets and for three sets can be generalized to a union of n sets; the generalization is known as the (general) inclusion-exclusion formula.

Suppose that {A_i : i ∈ I} is a collection of subsets of S where I is an index set with #(I) = n. Then

#(⋃_{i∈I} A_i) = ∑_{k=1}^n (−1)^{k−1} ∑_{J⊆I, #(J)=k} #(⋂_{j∈J} A_j)

Proof:

The proof is by induction on n. The formula holds for n = 2 sets by (7) and for n = 3 sets by (8). Suppose the formula holds for n ∈ N+, and suppose that {A_1, A_2, …, A_n, A_{n+1}} is a collection of n + 1 subsets of S. Then

⋃_{i=1}^{n+1} A_i = (⋃_{i=1}^n A_i) ∪ [A_{n+1} ∖ (⋃_{i=1}^n A_i)]

and the two sets connected by the central union are disjoint. Using the addition rule and the difference rule,

#(⋃_{i=1}^{n+1} A_i) = #(⋃_{i=1}^n A_i) + #(A_{n+1}) − #[A_{n+1} ∩ (⋃_{i=1}^n A_i)] = #(⋃_{i=1}^n A_i) + #(A_{n+1}) − #[⋃_{i=1}^n (A_i ∩ A_{n+1})]

By the induction hypothesis, the formula holds for the two unions of n sets in the last expression. The result then follows by simplification.

The general Bonferroni inequalities, named again for Carlo Bonferroni, state that if the sum on the right is truncated after k terms (k < n), then the truncated sum is an upper bound for the cardinality of the union if k is odd (so that the last term has a positive sign) and a lower bound for the cardinality of the union if k is even (so that the last term has a negative sign).
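The general formula can be verified by brute force on small examples, summing over all nonempty index sets J, as in the following sketch (the particular sets are arbitrary).

```python
from itertools import combinations

def inclusion_exclusion(sets):
    """Cardinality of the union of `sets`, computed from the inclusion-exclusion formula."""
    n = len(sets)
    total = 0
    for k in range(1, n + 1):
        sign = (-1) ** (k - 1)
        for J in combinations(range(n), k):
            total += sign * len(set.intersection(*(sets[j] for j in J)))
    return total

A = [{1, 2, 3, 4}, {3, 4, 5}, {1, 5, 6, 7}, {2, 7}]
assert inclusion_exclusion(A) == len(set().union(*A))   # both equal 7
```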

The Multiplication Rule

The multiplication rule of combinatorics is based on the formulation of a procedure (or algorithm) that generates the objects to be counted. Specifically, suppose that the procedure consists of k steps, performed sequentially, and that for each j ∈ {1, 2, …, k}, step j can be performed in n_j ways, regardless of the choices made on the previous steps. Then the number of ways to perform the entire algorithm (and hence the number of objects) is n_1 n_2 ⋯ n_k.

The key to a successful application of the multiplication rule to a counting problem is the clear formulation of an algorithm that generates the objects being counted, so that each object is generated once and only once. That is, we must neither over-count nor under-count. It's also important to notice that the set of choices available at step j may well depend on the previous steps; the assumption is only that the number of choices available does not depend on the previous steps.

The first two results below give equivalent formulations of the multiplication principle.

Suppose that S is a set of sequences of length k, and that we denote a generic element of S by (x_1, x_2, …, x_k). Suppose that for each j ∈ {1, 2, …, k}, x_j has n_j different values, regardless of the values of the previous coordinates. Then #(S) = n_1 n_2 ⋯ n_k.

Proof:

A procedure that generates the sequences in S consists of k steps. Step j is to select the jth coordinate.

Suppose that T is an ordered tree with depth k and that each vertex at level i − 1 has n_i children for i ∈ {1, 2, …, k}. Then the number of endpoints of the tree is n_1 n_2 ⋯ n_k.

Proof:

Each endpoint of the tree is uniquely associated with the path from the root vertex to the endpoint. Each such path is a sequence of length k, in which there are n_j values for coordinate j for each j ∈ {1, 2, …, k}. Hence the result follows from the previous result on sequences.
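The multiplication rule is easy to check by brute force when the objects can be listed. The sketch below (illustration only) counts sequences of length 3 over three symbols in which consecutive symbols must differ: the set of allowed values at each step depends on the previous choice, but the number of allowed values (3, then 2, then 2) does not, which is exactly the situation the rule covers.

```python
from itertools import product

# Sequences (x1, x2, x3) over the symbols a, b, c in which consecutive
# symbols must differ: 3 choices for x1, then 2 for x2, then 2 for x3,
# regardless of the earlier choices (though the allowed *sets* differ).
symbols = "abc"
seqs = [s for s in product(symbols, repeat=3) if s[0] != s[1] and s[1] != s[2]]
assert len(seqs) == 3 * 2 * 2   # multiplication rule: 12
```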

Product Sets

If S_i is a set with n_i elements for i ∈ {1, 2, …, k} then

#(S_1 × S_2 × ⋯ × S_k) = n_1 n_2 ⋯ n_k

Proof:

This is a corollary of the result above on sequences.

If S is a set with n elements, then S^k has n^k elements.

Proof:

This is a corollary of the previous result on product sets.

In the previous result, note that the elements of S^k can be thought of as ordered samples of size k that can be chosen with replacement from a population of n objects. Elements of {0,1}^n are sometimes called bit strings of length n. Thus, there are 2^n bit strings of length n.

Functions

The number of functions from a set A of m elements into a set B of n elements is n^m.

Proof:

An algorithm for constructing a function f : A → B is to choose the value of f(x) ∈ B for each x ∈ A. There are n choices for each of the m elements in the domain.

Recall that the set of functions from a set A into a set B (regardless of whether the sets are finite or infinite) is denoted B^A. The result in the previous exercise is motivation for this notation. Note also that if S is a set with n elements, then the elements in the Cartesian power set S^k can be thought of as functions from {1, 2, …, k} into S. Thus, the result above on Cartesian powers can be thought of as a corollary of the previous result on functions.
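For small sets the count n^m can be confirmed by listing the functions explicitly: a function from A into B is just a choice of an image in B for each element of A. A short sketch, with arbitrary small sets:

```python
from itertools import product

A = ["x", "y", "z"]   # m = 3
B = [0, 1, 2, 3]      # n = 4

# A function f: A -> B is a choice of values (f(x), f(y), f(z)).
functions = [dict(zip(A, values)) for values in product(B, repeat=len(A))]
assert len(functions) == len(B) ** len(A)   # n^m = 4^3 = 64
```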

Subsets

If S is a set with n elements then there are 2^n subsets of S.

Proof from the multiplication principle:

An algorithm for constructing A⊆S, is to decide whether x∈A or x∉A for each x∈S. There are 2 choices for each of the n elements of S.

Proof using indicator functions:

Recall that there is a one-to-one correspondence between subsets of S and indicator functions on S. An indicator function is simply a function from S into {0,1}, and the number of such functions is 2^n by the result above on functions.
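The correspondence between subsets and indicator functions also gives a convenient way to generate all the subsets, as in the sketch below (illustrative only).

```python
from itertools import product

S = ["a", "b", "c", "d"]

# Each bit string (b_1, ..., b_n) picks out the subset of elements whose bit is 1.
subsets = [{x for x, bit in zip(S, bits) if bit}
           for bits in product([0, 1], repeat=len(S))]
assert len(subsets) == 2 ** len(S)   # 2^4 = 16 subsets
```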

Suppose that {A_1, A_2, …, A_k} is a collection of k subsets of a set S. There are 2^{2^k} different sets (in general) that can be constructed from the k given sets, using the operations of union, intersection, and complement. These sets form the algebra generated by the given sets.

Proof:

First note that there are 2^k pairwise disjoint sets of the form B_1 ∩ B_2 ∩ ⋯ ∩ B_k where B_i = A_i or B_i = A_i^c for each i. Next, note that every set that can be constructed from {A_1, A_2, …, A_k} is a union of some (perhaps all, perhaps none) of these intersection sets.
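The construction in the proof can be carried out directly: form the 2^k atoms and then take all possible unions. The sketch below assumes the k given sets are in general position (all atoms nonempty and distinct), so that the count 2^(2^k) is attained; the particular sets are arbitrary.

```python
from itertools import product, combinations

S = set(range(16))
A = [{x for x in S if x & 1}, {x for x in S if x & 2}]   # k = 2 sets in general position

# The 2^k atoms: B_1 ∩ ... ∩ B_k with B_i = A_i or its complement.
atoms = []
for choice in product([True, False], repeat=len(A)):
    atom = set(S)
    for Ai, keep in zip(A, choice):
        atom &= Ai if keep else S - Ai
    atoms.append(frozenset(atom))

# Every member of the algebra is a union of some collection of atoms.
algebra = set()
for r in range(len(atoms) + 1):
    for combo in combinations(atoms, r):
        algebra.add(frozenset().union(*combo))

assert len(algebra) == 2 ** (2 ** len(A))   # 2^(2^2) = 16 distinct sets
```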

Open the Venn diagram applet.

1. Select each of the 4 disjoint sets A∩B, A∩B^c, A^c∩B, A^c∩B^c.
2. Select each of the 12 other subsets of S. Note how each is a union of some of the sets in part 1.

Suppose that S is a set with n elements and that A is a subset of S with k elements. The number of subsets of S that contain A is 2^{n−k}.

Proof:

Note that a subset B of S that contains A can be written uniquely in the form B = A ∪ C where C ⊆ A^c. Since A^c has n − k elements, there are 2^{n−k} subsets of A^c, and hence 2^{n−k} subsets of S that contain A.

Computational Exercises

A license number consists of two letters (uppercase) followed by five digits. How many different license numbers are there?

Answer:

67600000

Suppose that a Personal Identification Number (PIN) is a four-symbol code word in which each entry is either a letter (uppercase) or a digit. How many PINs are there?

Answer:

1679616

In the board game Clue, Mr. Boddy has been murdered. There are 6 suspects, 6 possible weapons, and 9 possible rooms for the murder.

1. The game includes a card for each suspect, each weapon, and each room. How many cards are there?

2. The outcome of the game is a sequence consisting of a suspect, a weapon, and a room (for example, Colonel Mustard with the knife in the billiard room). How many outcomes are there?

3. Once the three cards that constitute the outcome have been randomly chosen, the remaining cards are dealt to the players. Suppose that you are dealt 5 cards. In trying to guess the outcome, what hand of cards would be best?

Answer:
1. 21 cards
2. 324 outcomes

3. The best hand would be the 5 remaining weapons or the 5 remaining suspects.

An experiment consists of rolling a fair die, drawing a card from a standard deck, and tossing a fair coin. How many outcomes are there?

Answer:

624

A fair die is rolled 5 times and the sequence of scores recorded. How many outcomes are there?

Answer:

7776

Suppose that 10 persons are chosen and their birthdays recorded (assume 365 possible birthdays). How many possible outcomes are there?

Answer:

41969002243198805166015625
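The answers above are direct applications of the multiplication rule; for a quick sanity check, a few of them are recomputed below (assuming 26 letters, 10 digits, and 365 possible birthdays).

```python
print(26 ** 2 * 10 ** 5)   # license numbers: 67,600,000
print(36 ** 4)             # PINs, 26 letters + 10 digits per symbol: 1,679,616
print(6 * 52 * 2)          # die, card, coin: 624
print(6 ** 5)              # five rolls of a die: 7,776
print(365 ** 10)           # ten birthdays: 41,969,002,243,198,805,166,015,625
```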

In the card game Set, each card has 4 properties: number (one, two, or three), shape (diamond, oval, or squiggle), color (red, blue, or green), and shading (solid, open, or striped). The deck has one card of each (number, shape, color, shading) configuration. A set in the game is a collection of three cards such that, for each property, the cards are either all the same or all different.

1. How many cards are in a deck?
2. How many sets are there?

Answer:
1. 81
2. 1080

A coin is tossed 10 times and the sequence of scores recorded. How many sequences are there?

Answer:

1024

A string of lights has 20 bulbs, each of which may be good or defective. How many configurations are there?

Answer:

1048576

The die-coin experiment consists of rolling a die and then tossing a coin the number of times shown on the die. The sequence of coin results is recorded.

1. How many outcomes are there?
2. How many outcomes are there with all heads?

3. How many outcomes are there with exactly one head?

Answer:
1. 126
2. 6

3. 21
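These counts can also be obtained by listing the outcomes explicitly, as in the following brute-force check.

```python
from itertools import product

# An outcome is a die score n together with a sequence of n coin results.
outcomes = [(n, seq) for n in range(1, 7) for seq in product("HT", repeat=n)]

print(len(outcomes))                                          # 126 outcomes
print(sum(1 for n, seq in outcomes if set(seq) == {"H"}))     # 6 with all heads
print(sum(1 for n, seq in outcomes if seq.count("H") == 1))   # 21 with exactly one head
```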

Run the die-coin experiment 100 times and observe the outcomes.

Suppose that a sandwich at a restaurant consists of bread, meat, cheese, and various toppings. There are 4 different types of bread (select one), 3 different types of meat (select one), 5 different types of cheese (select one), and 10 different toppings (select any). How many sandwiches are there?

Answer:

61440

Consider a deck of cards as a set D with 52 elements.

1. How many subsets of D are there?
2. How many functions are there from D into the set {1,2,3,4}?

Answer:
1. 4503599627370496
2. 20282409603651670423947251286016

The Galton Board

The Galton Board, named after Francis Galton, is a triangular array of pegs. Galton, apparently too modest to name the device after himself, called it a quincunx from the Latin word for five twelfths (go figure). The rows are numbered, from the top down, by (0,1,…). Row n has n+1 pegs that are labeled, from left to right by (0,1,…,n). Thus, a peg can be uniquely identified by an ordered pair (n,k) where n is the row number and k is the peg number in that row.

A ball is dropped onto the top peg (0,0) of the Galton board. In general, when the ball hits peg (n,k), it either bounces to the left to peg (n+1,k) or to the right to peg (n+1,k+1). The sequence of pegs that the ball hits is a path in the Galton board.

There is a one-to-one correspondence between each pair of the following three collections:

1. Bit strings of length n.
2. Paths in the Galton board from (0,0) to any peg in row n.

3. Subsets of a set with n elements.

Thus, each of these collections has 2^n elements.
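The correspondence can be made explicit: read a 1 in position i as "bounce right at row i−1" and also as "i belongs to the subset". A small sketch (illustration only), using the bit string from the applet exercise below:

```python
bits = [0, 1, 1, 1, 0, 0, 1, 0, 1, 0]   # the bit string 0111001010, of length n = 10

# Corresponding subset of {1, 2, ..., n}: the positions that hold a 1.
subset = {i + 1 for i, b in enumerate(bits) if b == 1}

# Corresponding path: start at peg (0, 0); a 1 means bounce right
# (peg number increases by 1), a 0 means bounce left (it stays the same).
path, k = [(0, 0)], 0
for n, b in enumerate(bits, start=1):
    k += b
    path.append((n, k))

print(subset)      # {2, 3, 4, 7, 9}
print(path[-1])    # (10, 5): the string has five 1s
```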

Open the Galton board applet.

1. Move the ball from (0,0) to (10,6) along a path of your choice. Note the corresponding bit string and subset.

2. Generate the bit string 0111001010. Note the corresponding subset and path.

3. Generate the subset {2,4,5,9,10}. Note the corresponding bit string and path.

4. Generate all paths from (0,0) to (4,2). How many paths are there?

Answer: 4. 6