

1 | Vector Spaces

1.1 The Algebra of Matrices over a Field

Definition. By a field F , we mean a non-empty set of elements with two laws of combination,

which we call an addition + and a multiplication · satisfying:

(F1) To every pair of elements a, b ∈ F there is associated a unique element, called their sum, which we denote by a + b.
(F2) Addition is associative: (a + b) + c = a + (b + c).
(F3) Addition is commutative: a + b = b + a.
(F4) There exists an element, which we denote by 0, such that a + 0 = a for all a ∈ F.
(F5) For each a ∈ F there exists an element, which we denote by −a, such that a + (−a) = 0.
(F6) To every pair of elements a, b ∈ F there is associated a unique element, called their product, which we denote by ab, or a · b.
(F7) Multiplication is associative: (ab)c = a(bc).
(F8) Multiplication is commutative: ab = ba.
(F9) There exists an element different from 0, which we denote by 1, such that a · 1 = a for all a ∈ F.
(F10) For each a ∈ F, a ≠ 0, there exists an element, which we denote by a⁻¹, such that a · a⁻¹ = 1.
(F11) Multiplication is distributive with respect to addition: (a + b)c = ac + bc.

Remark. Note that in a field F , 0 + 0 = 0.

We write Q for the set of rational numbers, R for the set of real numbers and C for the set of complex numbers. These sets are fields. A rigorous definition and treatment of fields can be found in any abstract algebra course, including 2301337 Abstract Algebra I. The definition of a field was presented once in Linear Algebra I. In this course, F always denotes any of Q, R, C or another field. Its members are called scalars. However, almost nothing essential is lost if we assume that F is the real field R or the complex field C.

Example 1.1.1. A non-empty subset F of C such that for any x, y ∈ F, x − y ∈ F and xy ∈ F, and for any non-zero z ∈ F, 1/z ∈ F, is also a field. It is called a subfield of C. For example, Q(i) = {a + bi : a, b ∈ Q} is a subfield of C.

Example 1.1.2. Let p be a prime and Fp = {0, 1, . . . , p − 1}. For a and b in Fp, we define

a + b = the remainder when we divide a + b by p, and
ab = the remainder when we divide ab by p.

Then (Fp, +, ·) is a finite field of p elements. Note that if p = 2, we have 1 + 1 = 0.
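This arithmetic is easy to experiment with. Below is a minimal sketch (assuming Python; the helper names add, mul and inverse are ours, not from the text) that finds the multiplicative inverses required by axiom (F10) for p = 7.

```python
# A minimal sketch of arithmetic in the finite field F_p of Example 1.1.2.
p = 7

def add(a, b):
    # Sum in F_p: the remainder of a + b on division by p.
    return (a + b) % p

def mul(a, b):
    # Product in F_p: the remainder of ab on division by p.
    return (a * b) % p

def inverse(a):
    # (F10): every nonzero a has an a^{-1} with a * a^{-1} = 1.
    # Found here by brute force; p being prime guarantees it exists.
    return next(x for x in range(1, p) if mul(a, x) == 1)

print([inverse(a) for a in range(1, p)])   # [1, 4, 5, 2, 3, 6]

p = 2
print(add(1, 1))                           # 0, i.e., 1 + 1 = 0 in F_2
```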


Definition. Let F be a field. An m × n (m by n) matrix A with m rows and n columns with entries over F is a rectangular array of the form

A = [a11 · · · a1j · · · a1n]
    [ ⋮         ⋮         ⋮ ]
    [ai1 · · · aij · · · ain]
    [ ⋮         ⋮         ⋮ ]
    [am1 · · · amj · · · amn],

where aij ∈ F for all i ∈ {1, 2, . . . , m} and j ∈ {1, 2, . . . , n}. We write Mm,n(F) for the set of m × n matrices with entries in F and we write Mn(F) for Mn,n(F), the set of square matrices of order n.

Remark. As a shortcut, we often use the notation A = [aij] to denote the matrix A with entries aij. Notice that when we refer to the matrix we put brackets, as in "[aij]", and when we refer to a specific entry we do not use the surrounding brackets, as in "aij".

Definition. Two m × n matrices A = [aij ] and B = [bij ] are equal if aij = bij for all i ∈{1, 2, . . . ,m} and j ∈ {1, 2, . . . , n}.

Definition. The m × n zero matrix 0m×n ∈ Mm,n(F) is the matrix with 0F's everywhere,

0m×n = [0 0 0 · · · 0]
       [0 0 0 · · · 0]
       [⋮ ⋮ ⋮     ⋮]
       [0 0 0 · · · 0].

When m = n we write 0n as an abbreviation for 0n×n.

The n × n identity matrix In ∈ Mn(F) is the matrix with 1's on the main diagonal and 0's everywhere else,

In = [1 0 0 · · · 0]
     [0 1 0 · · · 0]
     [⋮ ⋮ ⋮     ⋮]
     [0 0 0 · · · 1].

Definition. Let A = [aij] and B = [bij] be m × n matrices and let r ∈ F be a scalar. The matrix A + rB is the matrix C ∈ Mm,n(F) with entries C = [cij] where

cij = aij + rbij.

Theorem 1.1.1. Let A, B and C be matrices of the same size, and let r and s be scalars in F. Then
(a) A + B = B + A                  (e) r0 = 0 and 0A = 0
(b) (A + B) + C = A + (B + C)      (f) 1A = A
(c) A + 0 = A                      (g) (r + s)A = rA + sA
(d) r(A + B) = rA + rB             (h) r(sA) = (rs)A = (sr)A = s(rA)


Definition. Let A be an m × n matrix with columns ~a1, ~a2, . . . , ~an and let ~x be a column vector in Fn. The product of A and ~x, denoted by A~x, is the linear combination of the columns of A using the corresponding entries in ~x as weights. That is,

A~x = [~a1 ~a2 · · · ~an] [x1]
                          [x2]
                          [ ⋮]
                          [xn]  := x1~a1 + x2~a2 + · · · + xn~an.

If B is an n × p matrix with columns ~b1, ~b2, . . . , ~bp, then the product of A and B, denoted by AB, is the m × p matrix with columns A~b1, A~b2, . . . , A~bp. In other words,

AB = A[~b1 ~b2 · · · ~bp] := [A~b1 A~b2 · · · A~bp].

The above definition of AB is good for theoretical work. When A and B have small sizes, the following method is more efficient when working by hand. Let A = [aij] ∈ Mm,n(F) and B = [bij] ∈ Mn,p(F). Then the matrix product AB is defined as the matrix C = [cij] ∈ Mm,p(F) with entries

cij = Σ_{l=1}^{n} ail blj = ai1b1j + ai2b2j + · · · + ainbnj,

that is, the (i, j) entry of C is obtained by multiplying the entries of the ith row of A with the corresponding entries of the jth column of B and summing the results.

If A is a square matrix of order n, then we write Aᵏ for the k-fold product AA · · · A (k copies).
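The two descriptions of AB (column by column, and entry by entry) give the same matrix, which is easy to confirm numerically. A minimal sketch, assuming Python with NumPy:

```python
import numpy as np

A = np.array([[1, 2, 0],
              [3, -1, 4]])        # 2 x 3
B = np.array([[2, 1],
              [0, 1],
              [5, -2]])           # 3 x 2

# Column description: the jth column of AB is A times the jth column of B.
AB_cols = np.column_stack([A @ B[:, j] for j in range(B.shape[1])])

# Entry description: c_ij is the sum of a_il * b_lj over l.
AB_entries = np.array([[sum(A[i, l] * B[l, j] for l in range(A.shape[1]))
                        for j in range(B.shape[1])]
                       for i in range(A.shape[0])])

assert (AB_cols == AB_entries).all() and (AB_cols == A @ B).all()
print(A @ B)   # [[ 2  3]
               #  [26 -6]]
```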

Theorem 1.1.2. Let A be m × n and let B and C have sizes for which the indicated sums and products are defined.

(a) A(B + C) = AB + AC and (B + C)A = BA + CA
(b) r(AB) = (rA)B = A(rB) for any scalar r
(c) A0n×k = 0m×k and 0k×mA = 0k×n
(d) ImA = A = AIn
(e) A(BC) = (AB)C

Remarks. The properties above are analogous to properties of real numbers. But NOT ALL real number properties correspond to matrix properties.

1. It is not the case that AB always equals BA.
2. Even if A ≠ 0 and AB = AC, B may not equal C. (For cancellation, A must have an inverse!)
3. It is possible for AB = 0 even if A ≠ 0 and B ≠ 0. E.g.,

[1 0] [0 0]   [0 0]
[0 0] [1 0] = [0 0].
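These failures are easy to witness. A short check, assuming NumPy, using the pair of matrices from remark 3 (which also fails to commute, illustrating remark 1):

```python
import numpy as np

A = np.array([[1, 0],
              [0, 0]])
B = np.array([[0, 0],
              [1, 0]])

print(A @ B)   # [[0 0], [0 0]]: AB = 0 although A != 0 and B != 0
print(B @ A)   # [[0 0], [1 0]]: in particular AB != BA
```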


Definition. The transpose of an m × n matrix A is the n × m matrix obtained from A by interchanging the rows and columns. We denote the transpose of A by Aᵀ. That is, if A = [aij]m×n, then Aᵀ = [bji]n×m where bji = aij for all i, j. Moreover, if ~x is the column vector with entries x1, x2, . . . , xm, then

~xᵀ = [x1 x2 · · · xm],

and so if A = [~a1 ~a2 · · · ~an], then

Aᵀ = [~a1ᵀ]
     [~a2ᵀ]
     [ ⋮  ]
     [~anᵀ].

Theorem 1.1.3. Let A and B denote matrices whose sizes are appropriate for the following sums and products.
(a) (Aᵀ)ᵀ = A                      (c) (rA)ᵀ = rAᵀ for any scalar r
(b) (A + B)ᵀ = Aᵀ + Bᵀ             (d) (AB)ᵀ = BᵀAᵀ

1.2 Axioms of a Vector Space

Definition. A vector space V over a field F is a nonempty set of elements called vectors, with two laws of combination, called vector addition (or addition) and scalar multiplication, satisfying the following conditions.

(A1) ∀~u,~v ∈ V, ~u + ~v ∈ V.
(A2) ∀~u,~v ∈ V, ~u + ~v = ~v + ~u.
(A3) ∀~u,~v, ~w ∈ V, ~u + (~v + ~w) = (~u + ~v) + ~w.
(A4) ∃~0 ∈ V, ∀~u ∈ V, ~u + ~0 = ~u = ~0 + ~u.
(A5) ∀~u ∈ V, ∃~u′ ∈ V, ~u + ~u′ = ~0 = ~u′ + ~u.
(SM1) ∀a ∈ F, ∀~u ∈ V, a~u ∈ V.
(SM2) ∀a ∈ F, ∀~u,~v ∈ V, a(~u + ~v) = a~u + a~v.
(SM3) ∀a, b ∈ F, ∀~u ∈ V, (a + b)~u = a~u + b~u.
(SM4) ∀a, b ∈ F, ∀~u ∈ V, (ab)~u = a(b~u).
(SM5) ∀~u ∈ V, 1~u = ~u (1 ∈ F).

We call ~0 the zero vector and ~u′ the negative of ~u.

Theorem 1.2.1. Let V be a vector space over a field F . Then

1. (Cancellation) ∀~u,~v, ~w ∈ V, ~u+ ~w = ~v + ~w ⇒ ~u = ~v and

∀~u,~v, ~w ∈ V, ~w + ~u = ~w + ~v ⇒ ~u = ~v.

2. The zero vector and the negative of ~u are unique. We shall denote the negative of ~u by −~u.

3. ∀~v ∈ V,−(−~v) = ~v.

4. ∀~v ∈ V, 0~v = ~0.

5. ∀a ∈ F, a~0 = ~0.

6. ∀a ∈ F, ∀~v ∈ V, (−a)~v = −(a~v) = a(−~v). In particular, (−1)~v = −(1~v) = −~v.

7. ∀a ∈ F, ∀~v ∈ V, a~v = ~0⇒ (a = 0 ∨ ~v = ~0).

Examples 1.2.1. 1. For any field F and n ≥ 1, we have Fn is a vector space over F where

(x1, . . . , xn) + (y1, . . . , yn) = (x1 + y1, . . . , xn + yn)

and

a(x1, . . . , xn) = (ax1, . . . , axn)

for all (x1, . . . , xn), (y1, . . . , yn) ∈ Fn and a ∈ F.


2. Let m, n ∈ N, let F be a field and let Mm,n(F) be the set of all m × n matrices over F. Then Mm,n(F) is a vector space over F under the usual addition and scalar multiplication of matrices.

3. [The space of functions from a set to a field] Let S be a nonempty set and F a field. Let FS = {f | f : S → F}. Then FS is a vector space over F by defining f + g and cf for functions f, g ∈ FS and a scalar c ∈ F as follows:

(f + g)(t) = f(t) + g(t) and (cf)(t) = cf(t)

for all t ∈ S. The zero function from S into F is the zero vector of FS and the negative of f ∈ FS is −f defined by (−f)(t) = −f(t) for all t ∈ S.

4. [The sequence space] Let FN = {(xn) : (xn) is a sequence in F}. Then FN is a vector

space over F under the usual addition and scalar multiplication of sequences. That is, for

sequences (an) and (bn) in FN and a scalar c ∈ F ,

(an) + (bn) = (an + bn) and c(an) = (c an).

Its zero is the zero sequence (zn) where zn = 0 for all n and the negative of (an) is the

sequence (bn) given by bn = −an for all n.

5. Let n be a non-negative integer and Fn[x] be the set of polynomials over F of degree at most n. That is,

Fn[x] = {a0 + a1x + a2x² + · · · + anxⁿ : ai ∈ F for all i ∈ {0, 1, 2, . . . , n}}.

We define the addition and scalar multiplication by

p(x) + q(x) = (a0 + b0) + (a1 + b1)x + (a2 + b2)x² + · · · + (an + bn)xⁿ
and c(p(x)) = (ca0) + (ca1)x + (ca2)x² + · · · + (can)xⁿ

for all polynomials p(x) = a0 + a1x + a2x² + · · · + anxⁿ and q(x) = b0 + b1x + b2x² + · · · + bnxⁿ in Fn[x] and c ∈ F. Then Fn[x] is a vector space over F. Observe that for each positive integer n, we have Fn−1[x] ⊂ Fn[x].

6. [The space of polynomials over a field] Let F[x] be the set of all polynomials over F. That is,

F[x] = {a0 + a1x + a2x² + · · · + anxⁿ : n ≥ 0 and ai ∈ F for all i ∈ {0, 1, 2, . . . , n}}.

Then F[x] = ⋃_{n≥0} Fn[x]. If we use the addition and scalar multiplication defined for Fn[x], then F[x] is a vector space over F. The zero polynomial 0(x) = 0 + 0x + 0x² + · · · is its zero vector and for f(x) = c0 + c1x + · · · + cnxⁿ ∈ F[x], the negative of f(x) is (−f)(x) = (−c0) + (−c1)x + · · · + (−cn)xⁿ.

Theorem 1.2.2. Let (V1, +1, ·1), (V2, +2, ·2), . . . , (Vn, +n, ·n) be vector spaces over a field F and let V = V1 × V2 × · · · × Vn. For (~v1, ~v2, . . . , ~vn), (~w1, ~w2, . . . , ~wn) ∈ V and c ∈ F, we define the addition and scalar multiplication on V by

(~v1, ~v2, . . . , ~vn) + (~w1, ~w2, . . . , ~wn) = (~v1 +1 ~w1, ~v2 +2 ~w2, . . . , ~vn +n ~wn)
and c(~v1, ~v2, . . . , ~vn) = (c ·1 ~v1, c ·2 ~v2, . . . , c ·n ~vn).

Then V is a vector space over F with the zero vector ~0 = (~01, ~02, . . . , ~0n), and the negative of (~v1, ~v2, . . . , ~vn) is (−~v1, −~v2, . . . , −~vn). V is called the direct product of V1, V2, . . . , Vn.


1.3 Subspaces

Definition. Let V be a vector space over a field F . A subspace of V is a subset of V which

is itself a vector space over F with the operations of vector addition and scalar multiplication

of V .

Theorem 1.3.1. Let W be a nonempty subset of V . Then the following statements are equivalent.

(i) W is a subspace of V .

(ii) ∀~u,~v ∈W, ∀c ∈ F, ~u+ ~v ∈W and c~u ∈W .

(iii) ∀~u,~v ∈W, ∀c, d ∈ F, c~u+ d~v ∈W .

(iv) ∀~u,~v ∈W, ∀c ∈ F, c~u+ ~v ∈W .

Examples 1.3.1. 1. For any vector space V over a field F, we have that {~0V} and V are subspaces of V, called trivial subspaces.

2. For a non-negative integer n, we have that Fn[x] is a subspace of F[x].

3. Let α ∈ F and Vα = {(x1, x2) : x1 = αx2}. Then Vα is a subspace of F².

4. Let Bd(R) = {(an) ∈ RN : (an) is a bounded sequence},
   C(R) = {(an) ∈ RN : (an) is a convergent sequence} and
   C0(R) = {(an) ∈ RN : an → 0 as n → ∞}.
   Then Bd(R), C(R) and C0(R) are subspaces of RN.

5. Let C0(−∞, ∞) = {f ∈ RR : f is continuous on (−∞, ∞)}. Then C0(−∞, ∞) is a subspace of RR.

6. Let W = {f : R→ R | f ′′ = f}. Then W is a subspace of RR.

7. Let W1 = {p(x) ∈ F[x] : p(1) = 0} and W2 = {p(x) ∈ F[x] : p(0) = 1}. Then W1 is a subspace of F[x] but W2 is not.

8. Let A ∈ Mm,n(F ). Then NulA = {~x ∈ Fn : A~x = ~0m} is a subspace of Fn, called the null

space of A.

Theorem 1.3.2. Let V be a vector space over a field F . The intersection of any collection of

subspaces of V is a subspace of V .

Definition. For non-empty subsets S1, S2, . . . , Sn of V , we define

S1 + S2 + · · ·+ Sn =n∑

i=1

Si = {x1 + x2 + · · ·+ xn : x1 ∈ S1, x2 ∈ S2, . . . , xn ∈ Sn}.

Theorem 1.3.3. If W1, . . . ,Wn are subspaces of V , then W1 + · · ·+Wn is a subspace of V .

Remark. W1 +W2 is the smallest subspace of V containing W1 and W2, i.e., any subspace con-

taining W1 and W2 must contain W1 +W2.

Definition. Let V be a vector space over a field F .

A vector ~v is said to be a linear combination of ~v1, . . . , ~vn ∈ V if

∃a1, . . . , an ∈ F,~v = a1~v1 + · · ·+ an~vn.

Definition. Let S ⊆ V . The subspace of V spanned by S is defined to be the intersection of

all subspaces of V containing S. We denote this subspace by Span S.

For ~v1, . . . , ~vp ∈ V , we call Span{~v1, . . . , ~vp} the subspace of V spanned by ~v1, . . . , ~vp.


Since ∅ ⊂ {~0V}, which is the smallest of all subspaces of V, we have Span ∅ = {~0V}. Moreover, if W is a subspace of V, then Span W = W. In particular, Span(Span S) = Span S.

Remark. Let S be a non-empty subset of V and let W be a subspace of V containing S. Note that for c1, . . . , cm ∈ F and ~v1, . . . , ~vm ∈ S, we have ~v1, . . . , ~vm ∈ W and so

c1~v1 + · · · + cm~vm ∈ W.

Thus, Y := {c1~v1 + · · · + cm~vm : c1, . . . , cm ∈ F and ~v1, . . . , ~vm ∈ S for some m ∈ N} ⊆ W for all subspaces W of V containing S. Hence, Y ⊆ Span S.

Theorem 1.3.4. Span S is the smallest subspace of V containing S. That is, any subspace of V containing S must also contain Span S. Moreover, Span ∅ = {~0} and

Span S = {c1~v1 + · · · + cm~vm : c1, . . . , cm ∈ F and ~v1, . . . , ~vm ∈ S for some m ∈ N} if S ≠ ∅.

In particular,

Span{~v1, . . . , ~vp} = {c1~v1 + · · · + cp~vp : c1, . . . , cp ∈ F}.

Definition. Let A = [~a1 ~a2 · · · ~an] be an m × n matrix over a field F. Then ~ai ∈ Fm for all i = 1, 2, . . . , n and Span{~a1, ~a2, . . . , ~an} is a subspace of Fm, called the column space of A. We denote this space by Col A.

By Theorem 1.3.4, we have

ColA = {c1~a1 + c2~a2 + · · ·+ cn~an : c1, c2, . . . , cn ∈ F}.

Definition. Let V and W be vector spaces over a field F . A function T : V →W is said to be a

linear transformation if the following conditions are satisfied:

(i) ∀~u,~v ∈ V, T (~u+ ~v) = T (~u) + T (~v) and

(ii) ∀~u ∈ V, ∀c ∈ F, T (c~u) = cT (~u).

Theorem 1.3.5. Let V and W be vector spaces over a field F and T : V → W a linear transfor-

mation. Then T (~0V ) = ~0W and ∀~v ∈ V, T (−~v) = −T (~v).

Theorem 1.3.6. The following statements are equivalent.

(i) T is a linear transformation.
(ii) ∀~u,~v ∈ V, ∀c ∈ F, T(~u + ~v) = T(~u) + T(~v) ∧ T(c~u) = cT(~u).
(iii) ∀~u,~v ∈ V, ∀c, d ∈ F, T(c~u + d~v) = cT(~u) + dT(~v).
(iv) ∀~u,~v ∈ V, ∀c ∈ F, T(c~u + ~v) = cT(~u) + T(~v).

Definition. Let V and W be vector spaces over a field F and T : V → W a linear transforma-

tion. Recall that the image or range of T is given by

imT = rangeT = {~w ∈W : ∃~v ∈ V, T (~v) = ~w} = {T (~v) : ~v ∈ V }.

The kernel of T is defined by

kerT = {~v ∈ V : T (~v) = ~0W } = T−1({~0W }).

Theorem 1.3.7. The kernel of T is a subspace of V and the image of T is a subspace of W .


Example 1.3.2. Let A = [~a1 ~a2 · · · ~an] be an m × n matrix over a field F.

Then the matrix transformation T : Fn → Fm given by

T (~x) = A~x

is a linear transformation. Its kernel is

NulA = {~x ∈ Fn : A~x = ~0m}

the null space of A, and its image is

imT = {A~x : ~x ∈ Fn} = {x1~a1 + · · ·+ xn~an : x1, . . . , xn ∈ F} = ColA,

which is the column space of A.

Remark. Since the image of T : ~x ↦ A~x is the column space of A,

T is onto ⇔ im T = Fm ⇔ Col A = Fm.

If Col A = Fm, we say that the columns of A span Fm.

Example 1.3.3. Let T : R[x]→ R be defined by

T (p(x)) = p(1)

for all p(x) ∈ R[x]. Show that T is an onto linear transformation and find its kernel.

Example 1.3.4. Let V be the space of differentiable functions on (−∞,∞) with continuous

derivative. Define a function T : V → C0(−∞,∞) by

T (f(x)) = f ′(x)

for all f ∈ V . Show that T is an onto linear transformation and find its kernel.

Definition. Let V be a vector space over a field F. Vectors ~u1, ~u2, . . . , ~un in V are linearly independent if

∀c1, c2, . . . , cn ∈ F, c1~u1 + c2~u2 + · · · + cn~un = ~0 ⇒ c1 = c2 = · · · = cn = 0.

If there is a linear combination c1~u1 + c2~u2 + · · · + cn~un = ~0 with the scalars c1, c2, . . . , cn not all zero, we say that ~u1, ~u2, . . . , ~un are linearly dependent.

Example 1.3.5. Determine whether the set of vectors

{(1, 1, 1), (0, 1, 1), (0, 0, 1)}

is dependent or independent in R3.

Example 1.3.6. Determine whether the set of vectors

~u1 = [2 2 1]    ~u2 = [0 0 1]    ~u3 = [1 1 1]
      [0 0 1],         [0 0 1],         [0 0 1]

is dependent or independent in M2,3(R).
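Questions like Examples 1.3.5 and 1.3.6 can be settled mechanically by stacking the vectors as the rows of a matrix (flattening each 2 × 3 matrix into a row of R⁶) and comparing the rank with the number of vectors. A sketch assuming Python with SymPy:

```python
from sympy import Matrix

# Example 1.3.5: the rows are the three vectors of R^3.
V = Matrix([[1, 1, 1],
            [0, 1, 1],
            [0, 0, 1]])
print(V.rank())   # 3 = number of vectors: linearly independent

# Example 1.3.6: each 2x3 matrix flattened into a row of R^6.
U = Matrix([[2, 2, 1, 0, 0, 1],
            [0, 0, 1, 0, 0, 1],
            [1, 1, 1, 0, 0, 1]])
print(U.rank())   # 2 < 3: u1, u2, u3 are dependent (u1 = 2*u3 - u2)
```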

Remarks. 1. The empty set is linearly independent.

2. If ~0V is in S, then S is linearly dependent.

3. The singleton {~0V } is linearly dependent and {~u} is linearly independent unless ~u = ~0V .


Theorem 1.3.8. Let V be a vector space over a field F and S1 ⊆ S2 ⊆ V . Then

1. SpanS1 ⊆ SpanS2.

2. If S1 is linearly dependent, then S2 is linearly dependent.

3. If S2 is linearly independent, then S1 is linearly independent.

Example 1.3.7. Consider the space of continuous functions C0[−1, 1]. Determine whether the functions 1, x, x² are dependent or independent.

Remark. Observe that the question of dependence and independence of sets of functions is related to the interval over which the space is defined. Consider the same interval [−1, 1] with the functions f, g and h defined as follows:

f(x) = 1 for −1 ≤ x ≤ 1,

g(x) = { 0 if −1 ≤ x ≤ 0,        h(x) = { 0  if −1 ≤ x ≤ 0,
       { x if 0 ≤ x ≤ 1,                { x² if 0 ≤ x ≤ 1.

These functions are linearly independent. However, if we restrict these same functions to the

interval [−1, 0], then they are dependent because

0 · f(x) + 1 · g(x) + 0 · h(x) = 0

for −1 ≤ x ≤ 0.

Theorem 1.3.9. Let T : V →W be a linear transformation. Then T is 1-1⇔ kerT = {~0V }.

1.4 Bases and Dimensions

Definition. Let V be a vector space over F . A subset B ⊂ V is a basis for V if B is linearly

independent and Span B = V .

Theorem 1.4.1. Let V be a vector space over a field F and B = {~v1, . . . , ~vn} ⊆ V linearly

independent.

1. If ~v ∈ SpanB, then there exist unique c1, . . . , cn ∈ F such that

~v = c1~v1 + · · ·+ cn~vn.

2. If B is a basis for V , then every vector in V can be expressed uniquely as a linear combination

of ~v1, . . . , ~vn.

3. Let W be a vector space over a field F and ~w1, . . . , ~wn ∈W (not necessarily distinct). If B is a

basis for V , then there is a unique linear transformation from V to W such that T (~vi) = ~wi

for all i ∈ {1, . . . , n}.

Examples 1.4.1. 1. Find a linear transformation T that satisfies the following conditions
(i) T : C → R2[x] with T(1 − i) = 2x² and T(1 + i) = 1 − x,
(ii) T : R2[x] → R² with T(1) = (2, 1), T(1 − x) = (0, 1) and T(x + x²) = (1, 1).

2. Let T : R1[x] → R³ be a linear transformation with

T(2 − x) = (1, −1, 1) and T(1 + x) = (0, 1, −1).

Find T(−1 + 2x).


Lemma 1.4.2. 1. If ~u, ~v1, . . . , ~vn ∈ S and ~u = c1~v1 + · · · + cn~vn, then Span S = Span(S ∖ {~u}).
2. If S is a linearly independent subset of V and ~u ∉ Span S, then S ∪ {~u} is linearly independent.

Theorem 1.4.3. Let V be a vector space over F.

1. If B is a linearly independent subset of V which is maximal with respect to the property of being linearly independent (i.e., every set S with B ⊊ S ⊆ V is linearly dependent), then B is a basis of V.

2. If B is a spanning set for V which is minimal with respect to the property of spanning (i.e., every S ⊊ B satisfies Span S ⊊ V), then B is a basis of V.

Theorem 1.4.4. [Replacement Theorem] Let V be a vector space that is spanned by a set G containing exactly n vectors. Let L be a linearly independent subset of V with m vectors. Then

1. m ≤ n,

2. there exists a subset H of G with n − m vectors such that L ∪ H spans V.

Example 1.4.2. Extend {(1, 1, 1)} to a basis of R3.

Corollary 1.4.5. If a vector space V has a finite spanning set {~v1, . . . , ~vn}, then

1. {~v1, . . . , ~vn} has a subset which is a basis,

2. any linearly independent set in V can be extended to a basis,

3. V has a basis,

4. any two bases have the same finite number of elements, necessarily ≤ n.

Definition. If a vector space V has a finite spanning set, then we say that V is finite-

dimensional, and the number of elements in a basis is called the dimension of V , written

dimV . If V has no finite spanning set, we say that V is infinite-dimensional.

Examples 1.4.3. 1. The vector space {~0} has dimension zero with basis ∅.

2. The vector space Fn, n ≥ 1, is of dimension n with standard basis {(1, 0, . . . , 0), (0, 1, . . . , 0), . . . , (0, 0, . . . , 1)}. Similarly, Mm,n(F) is of dimension mn where m, n ∈ N.

3. The vector space Fn[x] is of dimension n + 1 with standard basis {1, x, x², . . . , xⁿ}.

4. The vector spaces FN and F[x] are infinite-dimensional. A basis for F[x] is {1, x, x², . . . }.

5. If we consider C as a vector space over C, it has dimension one with basis {1}. But if we consider C as a vector space over R, it has dimension two with basis {1, i}.

Remark. The above corollary is valid for a "finite" dimensional vector space. For a general (finite/infinite dimensional) vector space V, consider L = {L ⊆ V : L is linearly independent}. Then ∅ ∈ L. Partially order L by ⊆. We now show that every chain in L has an upper bound. Let C be a chain in L and consider ⋃C. Let ~v1, . . . , ~vn ∈ ⋃C and c1, . . . , cn ∈ F be such that c1~v1 + · · · + cn~vn = ~0V. Suppose ~vi ∈ Li for some Li ∈ C for all i ∈ {1, . . . , n}. Since C is a chain, we may suppose that L1 ⊆ . . . ⊆ Ln. Thus, ~v1, . . . , ~vn are in Ln, which is a linearly independent set. This implies c1 = · · · = cn = 0. Hence, ⋃C is a linearly independent set, so ⋃C is in L. By Zorn's lemma ("If a partially ordered set P has the property that every chain (i.e., totally ordered subset) has an upper bound in P, then the set P contains at least one maximal element."), L contains a maximal element, say B. This is a maximal linearly independent subset of V. By Theorem 1.4.3 (1), B is a basis for V. Hence, every vector space has a basis. Note that a basis for FN exists in this way and is not constructible explicitly.

Corollary 1.4.6. If V is a finite-dimensional vector space with dimV = n, then any spanning

set of n elements is a basis of V , and any linearly independent set of n elements is a basis of V .

Consequently, if W is an n-dimensional subspace of V , then W = V .


Corollary 1.4.7. If V is a finite-dimensional vector space and U is a proper subspace of V, then U is finite-dimensional and dim U < dim V.

Theorem 1.4.8. If W1 and W2 are finite dimensional subspaces of a vector space V over a field F ,

then W1 +W2 is finite dimensional and

dim(W1 +W2) = dimW1 + dimW2 − dim(W1 ∩W2).

Example 1.4.4. Consider two subspaces of R⁵,

W1 = {(a, a − b, b, a + b, 0) ∈ R⁵ : a, b ∈ R} and W2 = {(c, d, 0, e, d − e) ∈ R⁵ : c, d, e ∈ R}.

Find bases for W1, W2 and W1 ∩ W2. Determine the dimension of W1 + W2.
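For a machine check of Theorem 1.4.8 on this example, span W1 and W2 by the coefficient vectors of their parameters and compare ranks. A sketch assuming SymPy:

```python
from sympy import Matrix

# Rows span W1 (coefficients of a and b) and W2 (coefficients of c, d, e).
W1 = Matrix([[1, 1, 0, 1, 0],     # a = 1, b = 0
             [0, -1, 1, 1, 0]])   # a = 0, b = 1
W2 = Matrix([[1, 0, 0, 0, 0],     # c = 1
             [0, 1, 0, 0, 1],     # d = 1
             [0, 0, 0, 1, -1]])   # e = 1

dim_W1, dim_W2 = W1.rank(), W2.rank()     # 2 and 3
dim_sum = Matrix.vstack(W1, W2).rank()    # dim(W1 + W2)
dim_int = dim_W1 + dim_W2 - dim_sum       # Theorem 1.4.8
print(dim_W1, dim_W2, dim_sum, dim_int)   # 2 3 4 1
```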

Definition. Let V and W be vector spaces over a field and T : V →W a linear transformation.

If V is finite dimensional, the rank of T , denoted by rankT , is dim(imT ) and the nullity of T ,

denoted by nullity T , is dim(kerT ).

Theorem 1.4.9. Let V and W be vector spaces over a field F and T : V → W a linear transfor-

mation. If V is finite dimensional, then

rankT + nullity T = dimV.

Theorem 1.4.10. Let V and W be finite dimensional with dim V = dim W and let T : V → W be a linear transformation. Then T is one-to-one ⇔ T is onto.

Corollary 1.4.11. If V is finite dimensional, S and T are linear transformations from V to V , and

T ◦ S is the identity map, then T = S−1.

From Theorem 1.4.1, we know that the representation of a given vector ~v ∈ V in terms of a given basis is unique.

Definition. Let V be an n-dimensional vector space over a field F with ordered basis B = {~v1, . . . , ~vn}. Then ∀~v ∈ V, ∃!(c1, . . . , cn) ∈ Fn, ~v = c1~v1 + · · · + cn~vn. The vector

[~v]B = [c1]
        [ ⋮] ∈ Fn
        [cn]

is called the coordinate vector of ~v relative to the ordered basis B.

Theorem 1.4.12. For ~v, ~w ∈ V and c ∈ F , we have [~v + ~w]B = [~v]B + [~w]B and [c~v]B = c[~v]B.

Definition. A one-to-one linear transformation from V onto W is called an isomorphism. If

there exists an isomorphism from V onto W , then we say that V is isomorphic to W and we

write V ∼= W .


Note that ∼= is an equivalence relation.

Theorem 1.4.13. Let V be an n-dimensional vector space over F .

If B is a basis for V , then the map ~v 7→ [~v]B is an isomorphism from V onto Fn. Hence, V ∼= Fn.

Therefore, the theory of finite-dimensional vector spaces can be studied from column vectors

and matrices which we shall pursue in the next chapter.

Corollary 1.4.14. If V and W are finite dimensional, then dimV = dimW ⇔ V ∼= W .

Exercises for Chapter 1.

1. Let V = R⁺ be the set of all positive real numbers. Define a vector addition and a scalar multiplication on V as

   v ⊕ w = vw and α ⊙ v = v^α

   for all positive real numbers v and w, and α ∈ R. Show that (V, ⊕, ⊙) is a vector space over R.

2. Let V be a vector space over a field F. For c ∈ F and ~v ∈ V, if c~v = ~v, prove that c = 1 or ~v = ~0V.

3. Which of the following are subspaces of M2(R)?
   (a) {A ∈ M2(R) : det A = 0}   (b) {A ∈ M2(R) : A = Aᵀ}
   (c) {A ∈ M2(R) : A = −Aᵀ}   (d) {A ∈ M2(R) : A² = A}

4. Which of the following are subspaces of RN?
   (a) All sequences like (1, 0, 1, 0, . . . ) that include infinitely many zeros.
   (b) {(an) ∈ RN : ∃n0 ∈ N, ∀j ≥ n0, aj = 0}.
   (c) All decreasing sequences: aj+1 ≤ aj for all j ∈ N.
   (d) All arithmetic sequences: {(an) ∈ RN : ∃a, d ∈ R, ∀n ∈ N, an = a + (n − 1)d}.
   (e) All geometric sequences: {(an) ∈ RN : ∃a, r ∈ R, ∀n ∈ N, r ≠ 0 ∧ an = ar^(n−1)}.

5. Which of the following are subspaces of V = C0[0, 1]?
   (a) {f ∈ V : f(0) = 0}   (b) {f ∈ V : ∀x ∈ [0, 1], f(x) ≥ 0}
   (c) All increasing functions: ∀x, y ∈ [0, 1], x < y ⇒ f(x) ≤ f(y).

6. Let V and W be vector spaces over a field F and T : V → W a linear transformation.
   (a) If V1 is a subspace of V, then T(V1) = {T(~x) : ~x ∈ V1} is a subspace of W.
   (b) If W1 is a subspace of W, then T⁻¹(W1) = {~x ∈ V : T(~x) ∈ W1} is a subspace of V.

7. If L, M and N are three subspaces of a vector space V such that M ⊆ L, then show that

   L ∩ (M + N) = (L ∩ M) + (L ∩ N) = M + (L ∩ N).

   Also give an example in which the result fails to hold when M ⊈ L. (Hint. Consider Vα of F².)

8. Let S1 and S2 be subsets of a vector space V. Prove that Span(S1 ∪ S2) = Span S1 + Span S2.

9. If ~v1, ~v2, ~v3 ∈ V such that ~v1 + ~v2 + ~v3 = ~0, prove that Span{~v1, ~v2} = Span{~v2, ~v3}.

10. Let S = {~v1, . . . , ~vn} and c1, . . . , cn ∈ F ∖ {0}. Prove that:
    (a) Span S = Span{c1~v1, . . . , cn~vn}
    (b) S is linearly independent ⇔ {c1~v1, . . . , cn~vn} is linearly independent.

11. If {~y, ~v1, . . . , ~vn} is linearly independent, show that {~y + ~v1, . . . , ~y + ~vn} is also linearly independent.

12. Determine (with reason or counterexample) whether the following statements are TRUE or FALSE.
    (a) If W1 and W2 are subspaces of V, then W1 ∪ W2 is a subspace of V.
    (b) If {~v1, ~v2, ~v3} is a basis of R³, then {~v1, ~v1 + ~v2, ~v1 + ~v2 + ~v3} is a basis of R³.

13. Determine whether the following subsets are linearly independent.
    (a) {(1, i, −1), (1 + i, 0, 1 − i), (i, −1, −i)} in C³   (b) {x, sin x, cos x} in C0(R)

14. Let V be a vector space over a field F and let ~v1, ~v2, . . . , ~vn be vectors in V. If ~w ∈ Span{~v1, ~v2, . . . , ~vn} ∖ Span{~v2, . . . , ~vn}, then ~v1 ∈ Span{~w, ~v2, . . . , ~vn} ∖ Span{~v2, . . . , ~vn}.

15. Prove that if U and V are finite dimensional vector spaces, then dim(U × V) = dim U + dim V.

16. Find a basis and the dimension of the following subspaces of M2(R).
    (a) {A ∈ M2(R) : A = Aᵀ}   (b) {A ∈ M2(R) : A = −Aᵀ}
    (c) {A ∈ M2(R) : ∀B ∈ M2(R), AB = BA}

17. Let B ∈ M2(R) and W = {A ∈ M2(R) : AB = BA}. Prove that W is a subspace of M2(R) and dim W ≥ 2.

18. Find a basis for the subspace W = {p(x) ∈ R3[x] : p(2) = 0} and extend it to a basis for R3[x].


19. Let W1 = Span{(1, 0, 2), (1, −2, 2)} and W2 = Span{(1, 1, 0), (0, 1, −1)} in R³. Find dim(W1 ∩ W2) and dim(W1 + W2).

20. If T : V → W is a linear transformation and B is a basis for V, prove that Span T(B) = im T.

21. Let T : R2[x] → R3[x] be given by T(p(x)) = xp(x).
    (a) Prove that T is a linear transformation and determine its rank and nullity.
    (b) Does T⁻¹ exist? Explain.

22. Suppose that U and V are subspaces of R¹³, with dim U = 7 and dim V = 8.
    (a) What are the smallest and largest possible dimensions of U ∩ V? Explain.
    (b) What are the smallest and largest possible dimensions of U + V? Explain.

23. If V and W are finite-dimensional vector spaces such that dim V > dim W, then there is no one-to-one linear transformation T : V → W.

24. Let U and W be subspaces of a vector space V. If dim V = 3, dim U = dim W = 2 and U ≠ W, prove that dim(U ∩ W) = 1.

25. Let U and W be subspaces of a vector space V such that U ∩ W = {~0}. Assume that ~u1, ~u2 are linearly independent in U and ~w1, ~w2, ~w3 are linearly independent in W.
    (a) Prove that {~u1, ~u2, ~w1, ~w2, ~w3} is a linearly independent set in V.
    (b) If dim V = 5, show that dim U = 2 and dim W = 3.


2 | Inner Product Spaces

2.1 Inner Products

We shall need the following properties of complex numbers.

Proposition 2.1.1. Let z = a + bi where a, b ∈ R.

1. Re z = a (real part) and Im z = b (imaginary part).

2. The conjugate is z̄ = a − bi, and the absolute value is |z| = √(a² + b²). Moreover, zz̄ = |z|².

3. The conjugate of z̄ is z again, and |z| = 0 ⇔ a = b = 0.

4. If z, w ∈ C, then the conjugate of z + w is z̄ + w̄ and the conjugate of zw is z̄w̄.

Definition. Let F = R or C and let V be a vector space over F. An inner product or scalar product on V is a function from V × V to F, denoted by 〈·, ·〉, with the following properties:

(IN1) ∀~u,~v, ~w ∈ V, 〈~u + ~v, ~w〉 = 〈~u, ~w〉 + 〈~v, ~w〉.
(IN2) ∀~u,~v ∈ V, ∀c ∈ F, 〈c~u,~v〉 = c〈~u,~v〉.
(IN3) ∀~u,~v ∈ V, 〈~u,~v〉 is the complex conjugate of 〈~v, ~u〉.
(IN4) ∀~u ∈ V, 〈~u, ~u〉 ≥ 0 and [〈~u, ~u〉 = 0 ⇒ ~u = ~0].

A vector space over F in which an inner product is defined is called an inner product space.

Remarks. 1. For all ~u,~v ∈ V , 〈~0, ~u〉 = 0 = 〈~u,~0〉 and 〈~u,~v〉 = 0⇔ 〈~v, ~u〉 = 0.

2. If F = R, then (IN3) reads ∀~u,~v ∈ V, 〈~u,~v〉 = 〈~v, ~u〉.

Example 2.1.1. Consider the complex vector space Cn of n-tuples of complex numbers. Let ~u = (u1, u2, . . . , un) and ~v = (v1, v2, . . . , vn). We define

〈~u,~v〉 = u1v̄1 + u2v̄2 + · · · + unv̄n.

Show that this is an inner product.

Remark. If we consider, on the other hand, Rn the space of n-tuples of real numbers, we have a

real-valued scalar product 〈~u,~v〉 = u1v1 + u2v2 + . . .+ unvn and the verification of the properties

is exactly like Example 2.1.1, where all conjugation symbols are removed.
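Either version is easy to check numerically. A minimal sketch, assuming Python with NumPy (the helper ip is ours):

```python
import numpy as np

def ip(u, v):
    # <u, v> = u1*conj(v1) + ... + un*conj(vn), as in Example 2.1.1.
    return np.sum(u * np.conj(v))

u = np.array([1 + 1j, 2j, 3])
v = np.array([2, 1 - 1j, 1j])

# (IN3): <u, v> is the complex conjugate of <v, u>.
assert np.isclose(ip(u, v), np.conj(ip(v, u)))
# (IN4): <u, u> = |u1|^2 + ... + |un|^2 is real and non-negative.
print(ip(u, u))   # (15+0j)
```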

Example 2.1.2. Consider V = C0[a, b], the vector space of real-valued continuous functions defined on the interval [a, b]. Let

〈f, g〉 = ∫_a^b f(x)g(x) dx.

Show that this defines an inner product.

We can add to the list of properties of the scalar product by proving some theorems, assuming

of course that we are dealing with a complex vector space with a scalar product.


Theorem 2.1.2. 1. ∀~u,~v, ~w ∈ V, 〈~u,~v + ~w〉 = 〈~u,~v〉 + 〈~u, ~w〉.
2. ∀~u,~v ∈ V, ∀c ∈ F, 〈~u, c~v〉 = c̄〈~u,~v〉.
3. (∀~u ∈ V, 〈~u,~v〉 = 0) ⇒ ~v = ~0.
4. (∀~u ∈ V, 〈~u,~v〉 = 〈~u, ~w〉) ⇒ ~v = ~w. In fact, if 〈~v − ~w,~v〉 = 〈~v − ~w, ~w〉, then ~v = ~w.

Remark. Let c1, c2 ∈ F and ~u,~v ∈ V. Then

〈c1~u + c2~v, c1~u + c2~v〉 = c1c̄1〈~u, ~u〉 + c1c̄2〈~u,~v〉 + c2c̄1〈~v, ~u〉 + c2c̄2〈~v,~v〉.

Moreover, if 〈~u,~v〉 = 0, then 〈~v, ~u〉 = 0, so

〈c1~u + c2~v, c1~u + c2~v〉 = c1c̄1〈~u, ~u〉 + c2c̄2〈~v,~v〉 = |c1|²〈~u, ~u〉 + |c2|²〈~v,~v〉.

The quantity 〈~u, ~u〉 is non-negative and is zero if and only if ~u = ~0. Therefore, we associate with it the square of the length of the vector.

Definition. For ~v ∈ V, we define the length or norm of ~v to be ‖~v‖ = √〈~v,~v〉.

Some of the properties of the norm are given by the next theorem.

Theorem 2.1.3. If V is an inner product space over F, then the norm ‖ · ‖ has the following properties:

1. ∀~u ∈ V, ‖~u‖ ≥ 0 and ‖~u‖ = 0 ⇔ ~u = ~0.
2. ∀~u ∈ V, ∀a ∈ F, ‖a~u‖ = |a|‖~u‖.
3. ∀~u,~v ∈ V, |〈~u,~v〉| ≤ ‖~u‖‖~v‖ (the Cauchy-Schwarz inequality).
4. ∀~u,~v ∈ V, ‖~u + ~v‖ ≤ ‖~u‖ + ‖~v‖ (the triangle inequality).

Example 2.1.3. Let f be a real-valued continuous function defined on the interval [a, b]. Prove that

|∫_a^b f(x) dx| ≤ (b − a)M, where M = max_{x∈[a,b]} |f(x)|.

2.2 Orthonormal Bases

Definition. Let V be an inner product space over F. Two nonzero vectors ~u and ~v are orthogonal if 〈~u,~v〉 = 0. A vector ~u is a unit vector if ‖~u‖ = 1.

Definition. A subset S of V is called an orthogonal set if ∀~u,~v ∈ S, ~u ≠ ~v ⇒ ~u and ~v are orthogonal. Moreover, S is called an orthonormal set if S is orthogonal and ∀~v ∈ S, ‖~v‖ = 1.

Example 2.2.1. 1. The standard basis of Fn, n ∈ N, is an orthonormal set.

2. Let V = C0[0, 2π] with inner product 〈f, g〉 = ∫_0^{2π} f(x)g(x) dx. Then

S = { 1/√(2π), (1/√π) cos x, (1/√π) sin x, (1/√π) cos 2x, (1/√π) sin 2x, . . . }

is an orthonormal set.
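A few of these inner products can be verified exactly with a computer algebra system. A sketch assuming Python with SymPy:

```python
import sympy as sp

x = sp.symbols('x')
ip = lambda f, g: sp.integrate(f * g, (x, 0, 2 * sp.pi))

e0 = 1 / sp.sqrt(2 * sp.pi)
c1 = sp.cos(x) / sp.sqrt(sp.pi)
s1 = sp.sin(x) / sp.sqrt(sp.pi)

print(ip(e0, e0), ip(c1, c1), ip(s1, s1))   # 1 1 1: each is a unit vector
print(ip(e0, c1), ip(e0, s1), ip(c1, s1))   # 0 0 0: pairwise orthogonal
```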


Let V be an inner product space.

Lemma 2.2.1. Let S = {~v1, . . . , ~vn} be an orthogonal set.

1. ∀α1, . . . , αn ∈ F, ∀k ∈ {1, . . . , n}, 〈Σ_{i=1}^{n} αi~vi, ~vk〉 = αk‖~vk‖².

2. ∀~v ∈ Span S, ~v = Σ_{i=1}^{n} (〈~v,~vi〉/‖~vi‖²)~vi.

Theorem 2.2.2. If S is an orthogonal set, then S is linearly independent.

Theorem 2.2.3. [Gram-Schmidt Process] Let ~v1, ~v2, . . . , ~vn ∈ V be linearly independent. Then

∀m ∈ {1, . . . , n}, ∃~w1, . . . , ~wm ∈ V such that {~w1, . . . , ~wm} is an orthogonal set and it is a basis

for Span{~v1, . . . , ~vm}.

Proof. We prove this theorem by induction on m ≥ 1.

If m = 1, {~v1} is an orthogonal set. Choose ~w1 = ~v1. Then Span{~w1} = Span{~v1}. Let k ∈ {1, 2, . . . , n − 1} and assume that there exist ~w1, . . . , ~wk ∈ V such that {~w1, . . . , ~wk} is an orthogonal set and Span{~w1, . . . , ~wk} = Span{~v1, . . . , ~vk}. Choose

~wk+1 = ~vk+1 − Σ_{i=1}^{k} (〈~vk+1, ~wi〉/‖~wi‖²) ~wi.    (2.2.1)

We have to show that:

(1) {~w1, . . . , ~wk, ~wk+1} is an orthogonal set. By the induction hypothesis, {~w1, . . . , ~wk} is an orthogonal set, so it suffices to show that ~wk+1 is orthogonal to ~wj for all j ∈ {1, . . . , k}. Let j ∈ {1, . . . , k}. Then

〈~wk+1, ~wj〉 = 〈~vk+1 − Σ_{i=1}^{k} (〈~vk+1, ~wi〉/‖~wi‖²) ~wi, ~wj〉
             = 〈~vk+1, ~wj〉 − Σ_{i=1}^{k} 〈(〈~vk+1, ~wi〉/‖~wi‖²) ~wi, ~wj〉
             = 〈~vk+1, ~wj〉 − 〈~vk+1, ~wj〉 = 0.

(2) Span{~w1, . . . , ~wk, ~wk+1} = Span{~v1, . . . , ~vk, ~vk+1}. Again, by the induction hypothesis,

Span{~w1, . . . , ~wk} = Span{~v1, . . . , ~vk}.

From Eq. (2.2.1), we have

~wk+1 ∈ Span{~w1, . . . , ~wk, ~vk+1} = Span{~v1, . . . , ~vk, ~vk+1}.

Then Span{~w1, . . . , ~wk, ~wk+1} ⊆ Span{~v1, . . . , ~vk, ~vk+1}. For the reverse, we note that

~vk+1 = ~wk+1 + Σ_{i=1}^{k} (〈~vk+1, ~wi〉/‖~wi‖²) ~wi ∈ Span{~w1, . . . , ~wk, ~wk+1}.

Since an orthogonal set is linearly independent, {~w1, . . . , ~wm} is a basis for Span{~v1, . . . , ~vm}.

Corollary 2.2.4. If V is a finite dimensional inner product space, then V has an orthonormal

basis.


Proof. Let B = {~v1, . . . , ~vm} be a basis for V. Then B is linearly independent. By the Gram-Schmidt Process, we can construct an orthogonal subset {~w1, . . . , ~wm} of V which is a basis for Span{~v1, . . . , ~vm} = V. Hence, {~w1, . . . , ~wm} is an orthogonal basis for V, and we can normalize each vector to obtain an orthonormal basis as desired.

Example 2.2.2. Let H = Span{(1, 2i, 0), (2i, 6, −3)} ⊂ C³. Find an orthonormal basis for H.

Example 2.2.3. Let V be the space of continuous functions on [0, 1] and H = Span{1, 3√x, 10x}, a 3-dimensional subspace of V. Use the Gram-Schmidt process to find an orthogonal basis for H.
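Below is a minimal sketch of the Gram-Schmidt process of Theorem 2.2.3 (assuming Python with NumPy), run on the two vectors of Example 2.2.2 with the complex inner product of Example 2.1.1; the projection step is Eq. (2.2.1).

```python
import numpy as np

def ip(u, v):
    # Complex inner product <u, v> = sum of u_i * conj(v_i).
    return np.sum(u * np.conj(v))

def gram_schmidt(vectors):
    ws = []
    for v in vectors:
        # Eq. (2.2.1): subtract the projections of v on the earlier w's.
        w = v - sum(ip(v, u) / ip(u, u) * u for u in ws)
        ws.append(w)
    # Normalize the orthogonal basis to get an orthonormal one.
    return [w / np.sqrt(ip(w, w).real) for w in ws]

v1 = np.array([1, 2j, 0], dtype=complex)
v2 = np.array([2j, 6, -3], dtype=complex)
u1, u2 = gram_schmidt([v1, v2])

print(np.round(ip(u1, u2), 12))            # 0j: orthogonal
print(ip(u1, u1).real, ip(u2, u2).real)    # 1.0 1.0: unit vectors
```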

2.3 Orthogonal Complements

Definition. Let V be an inner product space over F . For S ⊆ V , the orthogonal complement

of S is the set S⊥, read “S perp”, defined by

S⊥ = {~v ∈ V : 〈~v, ~u〉 = 0 for all ~u ∈ S}.

Remark. ∅⊥ = V = {~0}⊥, V ⊥ = {~0} and S⊥ = (Span S)⊥.

Theorem 2.3.1. For any subset S of V , S⊥ is a subspace of V .

Lemma 2.3.2. Let S = {~v1, . . . , ~vn} be a set of distinct nonzero vectors. If S is an orthogonal set, then

~v − Σ_{i=1}^{n} (〈~v,~vi〉/‖~vi‖²)~vi ∈ S⊥ for all ~v ∈ V.

Theorem 2.3.3. [Bessel's inequality] Let S = {~v1, . . . , ~vn} be a set of distinct nonzero vectors. If S is an orthogonal set, then for all ~v ∈ V,

Σ_{i=1}^{n} |〈~v,~vi〉|²/‖~vi‖² ≤ ‖~v‖²

and equality holds if and only if ~v ∈ Span S.

Let W1 and W2 be subspaces of a vector space V. We know that W1 + W2 is a subspace of V. If V = W1 + W2, we say that V is a sum of W1 and W2. The sum is direct, denoted by W1 ⊕ W2, if W1 ∩ W2 = {~0V}. That is,

V = W1 ⊕ W2 ⇔ [(1) V = W1 + W2 and (2) W1 ∩ W2 = {~0V}].

Theorem 2.3.4. V = W1 ⊕ W2 ⇔ every vector ~v ∈ V can be expressed uniquely as ~v = ~w1 + ~w2 with ~w1 ∈ W1 and ~w2 ∈ W2.

Theorem 2.3.5. [Orthogonal Decomposition Theorem] Let W be a finite dimensional subspace of an inner product space V. Then

1. V = W ⊕ W⊥. In other words, every ~v in V decomposes uniquely as ~v = ~y + ~z with ~y ∈ W and ~z ∈ W⊥.

2. dim W + dim W⊥ = dim V.


Exercises for Chapter 2.

1. Let Vn = {A ∈ Mn(R) : A = Aᵀ} be the vector space of all n × n symmetric matrices over R, and define the product of two matrices A and B by

   〈A, B〉 = tr(AB),

   where tr denotes the trace of a matrix.
   (a) Show that this is an inner product on Vn.
   (b) Obtain an orthonormal basis for the subspace H of V2 spanned by

       [1 0]       [0 0]
       [0 1]  and  [0 2].

2. Find an orthonormal basis for R2[x] with respect to the inner product

   〈p(x), q(x)〉 = ∫_0^1 p(x)q(x) dx.

3. Let W = {y(x) ∈ RR : y′′ + 4y = 0}. Then W is a real vector space generated by {cos 2x, sin 2x}. Define an inner product 〈y, z〉 = ∫_0^π y(x)z(x) dx for all y, z ∈ W. Find an orthonormal basis for W.

4. Let V and W be two vector spaces and T a one-to-one linear transformation from V into W. If W is an inner product space with inner product (·, ·), prove that the function 〈·, ·〉 : V × V → F defined by

   〈~u,~v〉 = (T(~u), T(~v))

   for all ~u,~v ∈ V is an inner product on V.

5. Let V be an inner product space over F. Prove the following statements.
   (a) If F = R, then ∀~u,~v ∈ V, 〈~u,~v〉 = (1/4)‖~u + ~v‖² − (1/4)‖~u − ~v‖².
   (b) If F = C, then ∀~u,~v ∈ V, 〈~u,~v〉 = (1/4)‖~u + ~v‖² − (1/4)‖~u − ~v‖² + (i/4)‖~u + i~v‖² − (i/4)‖~u − i~v‖².
   (c) ∀~u,~v ∈ V, ‖~u + ~v‖² + ‖~u − ~v‖² = 2‖~u‖² + 2‖~v‖².
   (a) and (b) are called the polarization identity and (c) is called the parallelogram law.

6. Show that |‖~u‖ − ‖~v‖| ≤ ‖~u − ~v‖ for all ~u,~v ∈ V.

7. From the Cauchy-Schwarz inequality, |〈~u,~v〉| ≤ ‖~u‖‖~v‖, prove that equality holds if and only if ~u and ~v are linearly dependent.

8. By choosing a suitable vector ~b in the Cauchy-Schwarz inequality, prove that

   (a1 + · · · + an)² ≤ n(a1² + · · · + an²).

   When does equality hold?

9. Consider V = C0[a, b]. Let f ∈ V. Prove that

   ∫_a^b |f(x)|² dx ≤ (∫_a^b |f(x)| dx)^{1/2} (∫_a^b |f(x)|³ dx)^{1/2}.

10. Prove that the finite sequence a0, a1, . . . , an of positive real numbers is a geometric progression if and only if

    (a0a1 + a1a2 + · · · + an−1an)² = (a0² + a1² + · · · + an−1²)(a1² + a2² + · · · + an²).

11. Let P(x) be a polynomial with positive real coefficients. Prove that

    P(a)P(b) ≥ P(√(ab))²

    for all a, b ≥ 0.

12. Let V be an n-dimensional inner product space and m < n. If {~v1, . . . , ~vm} is an orthonormal set, then there exist ~vm+1, . . . , ~vn ∈ V such that {~v1, . . . , ~vn} is an orthonormal basis for V.

13. Prove the following statements.
    (a) ∀S1, S2 ⊆ V, S1 ⊆ S2 ⇒ S1⊥ ⊇ S2⊥.
    (b) ∀S ⊆ V, (Span S)⊥ = S⊥.
    (c) For S ⊆ V, if ~u ∈ S and ~v ∈ S⊥, then ‖~u + ~v‖² = ‖~u‖² + ‖~v‖².
    (d) For ~v1, . . . , ~vn ∈ V, {~v1}⊥ ∩ · · · ∩ {~vn}⊥ = (Span{~v1, . . . , ~vn})⊥.

14. Construct an orthonormal basis for the subspace H = {(1, −i, i)}⊥ of C³.

15. Let W be a subspace of an inner product space V over F. If ~v ∈ V satisfies

    〈~v, ~w〉 + 〈~w,~v〉 ≤ 〈~w, ~w〉 for all ~w ∈ W,

    show that 〈~v, ~w〉 = 0 for all ~w ∈ W.


16. Consider the inner product space C0[−1, 1]. Suppose that f and g are continuous on [−1, 1] and ‖f − g‖ ≤ 5. Let

    u1(x) = 1/√2 and u2(x) = √(3/2) x for x ∈ [−1, 1].

    Write

    aj = ∫_{−1}^{1} uj(x)f(x) dx and bj = ∫_{−1}^{1} uj(x)g(x) dx

    for j = 1, 2. Show that |a1 − b1|² + |a2 − b2|² ≤ 25. (Hint. Use Bessel's inequality.)

for j = 1, 2. Show that |a1 − b1|2 + |a2 − b2|2 ≤ 25. (Hint. Use Bessel’s inequality.)17. If V is a finite dimensional inner product space and W is a subspace of V , prove that (W⊥)⊥ = W .18. If {~v1, ~v2} is a basis for V , show that V = Span{~v1} ⊕ Span{~v2}.19. Consider the subspace Vα, α ∈ R, of R2. Prove that if α 6= β, then R2 = Vα ⊕ Vβ .20. Let V = RR be the space of all functions from R to R. Let

Ve = {f ∈ V : ∀x ∈ R, f(−x) = f(x)} and Vo = {f ∈ V : ∀x ∈ R, f(−x) = −f(x)},

the sets of all even and odd functions, respectively. Prove the following statements.(a) Ve and Vo are subspaces of V . (b) V = Ve ⊕ Vo.

21. Let S be a set of vectors in a finite dimensional inner product space V. Suppose that "〈~u,~v〉 = 0 for all ~u ∈ S implies ~v = ~0". Show that V = Span S.

22. Let RN be the sequence space of real numbers. Let V = {(an) ∈ RN : only finitely many ai ≠ 0}.
    (a) Prove that V is a subspace of RN.
    (b) Given (an), (bn) ∈ V, define

        〈(an), (bn)〉 = Σ_{n=1}^{∞} an bn.

        (Note that this makes sense since only finitely many ai and bi are nonzero.) Show that this defines an inner product on V.
    (c) Let U = {(an) ∈ V : Σ_{n=1}^{∞} an = 0}. Show that U is a subspace of V such that U⊥ = {~0}, U + U⊥ ≠ V and U ≠ U⊥⊥.


3 | Matrices

3.1 Solutions of Linear Systems

Definition. For any system of m linear equations in n unknowns with coefficients over a field F,

a11x1 + a12x2 + · · · + a1nxn = b1
a21x1 + a22x2 + · · · + a2nxn = b2
                ⋮
am1x1 + am2x2 + · · · + amnxn = bm,

we can use the matrix notation

A~x = ~b,

where

A = [a11 a12 · · · a1n]        [x1]        [b1]
    [a21 a22 · · · a2n]   ~x = [x2]   ~b = [b2]
    [ ⋮   ⋮         ⋮ ],       [ ⋮],       [ ⋮]
    [am1 am2 · · · amn]        [xn]        [bm],

considered as matrices over F. In this case, we usually call A the coefficient matrix of the system. It is clear that A~x = ~b has a solution ⇔ ~b ∈ Col A. If all b1, . . . , bm are equal to 0, the linear system is said to be homogeneous. Note that all solutions of a homogeneous system form the null space of A.

There is another matrix which plays an important role in the study of linear systems. This is the augmented matrix, which is formed by inserting ~b as a new last column into the coefficient matrix. In other words, the augmented matrix is

[A : ~b] = [a11 a12 · · · a1n b1]
           [a21 a22 · · · a2n b2]
           [ ⋮   ⋮         ⋮  ⋮]
           [am1 am2 · · · amn bm].

Remark. A homogeneous linear system

a11x1 + a12x2 + · · · + a1nxn = 0
a21x1 + a22x2 + · · · + a2nxn = 0
                ⋮
am1x1 + am2x2 + · · · + amnxn = 0

always has a trivial solution, namely the solution obtained by letting all xj = 0. Other nonzero solutions (if any) are called nontrivial solutions.

Definition. The rank of a matrix A is the dimension of the column space of A.


Remark. If A is an m×n matrix, then rankA ≤ n and rankA is the maximum number of linearly

independent columns of A by Corollary 1.4.5.

Theorem 3.1.1. Let A be an m × n matrix over a field F.

1. The homogeneous system A~x = ~0m has only the trivial solution ~x = ~0n ⇔ the columns of A are linearly independent ⇔ rank A = n.

2. If rank A < n, then the homogeneous linear system has a nontrivial solution in Fn.

For an m × n matrix A over a field F, recall that the matrix transformation

T : ~x ↦ A~x

is a linear transformation from Fn to Fm. Its kernel is Nul A and its image is Col A.

Definition. The dimension of NulA is called the nullity of A, denoted by nullityA.

By Theorem 1.4.9, we have:

Corollary 3.1.2. Let A be an m× n matrix over a field F . Then

rankA+ nullityA = n = the number of columns of A.
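A quick illustration of Corollary 3.1.2 on a concrete matrix, assuming SymPy (nullspace() returns a basis of Nul A, one vector per free variable):

```python
from sympy import Matrix

A = Matrix([[1, -3, 4, 7],
            [0, 1, 2, 2],
            [0, 0, 0, 0]])

rank = A.rank()
nullity = len(A.nullspace())
print(rank, nullity, A.cols)   # 2 2 4: rank A + nullity A = n
```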

Examples 3.1.1. Consider the following augmented matrices. Write down their general solutions (if any).

1. [1 −3 4 7]
   [0  1 2 2]
   [0  0 1 5]

2. [1 −3 7 0]      [1 −3 7  1]
   [0  1 4 0]      [0  1 4  0]
   [0  0 0 0]      [0  0 0 −1]

3. [1 0  3 0]      [1 0  3 5]
   [0 1 −2 0]      [0 1 −2 1]

4. [1 −4 −2 0  3 −5]
   [0  1  0 0 −1 −1]
   [0  0  0 1  0  0]
   [0  0  0 0  0  0]

Theorem 3.1.3. Let A be an m × n matrix over a field F and ~b ∈ Fm.

1. A~x = ~b has a solution ⇔ ~b ∈ Col A ⇔ rank[A : ~b] = rank A.

2. If ~z ∈ Fn is a solution of A~x = ~b, then

~z = ~y + ~yp,

where ~y is a solution of the homogeneous system A~x = ~0m and A~yp = ~b.

Hence, the solution set of A~x = ~b is empty or given by

~yp + {~y ∈ Fn : A~y = ~0m},

where ~yp is a solution of A~x = ~b, called a particular solution.

Corollary 3.1.4. Let A be an m × n matrix over a field F and ~b ∈ Fm.

1. If A~x = ~b has a unique solution, then A~x = ~0m has a unique solution and rank A = n.

2. If A~x = ~0m has a nontrivial solution, then A~x = ~b has no solution or more than one solution.


3.2 Inverse of a Matrix and Elementary Matrices

Definition. The main part of the algorithms used for solving simultaneous linear systems with

coefficients in F is called elementary row operations. It makes repeatedly used of three

operations on the linear system or on its augmented matrix, each of which preserves the set of

solutions because its inverse is an operation of the same kind:

1. (Interchange, Rij) Interchange the ith row and the jth row.

2. (Scaling, cRi) Multiply the ith row by a nonzero scalar c.3. (Replacement, Ri + cRj) Replace the ith row by the sum of it and a scalar c multiple of

the jth row.

The elementary column operations are defined in a similar way.

Remark. The elementary row operations are reversible as follows.

Operation        Reverse
Rij              Rij
cRi, c ≠ 0       (1/c)Ri
Ri + cRj         Ri − cRj

Definition. Two linear systems are said to be equivalent if they have the same set of solutions.

Theorem 3.2.1. Suppose that a sequence of elementary operations is performed on a linear system.

Then the resulting system has the same set of solutions as the original, so the two linear systems

are equivalent.

Proof. It is clear from the way we do the row reductions that if c1, c2, . . . , cn satisfy the original system, then they also satisfy the reduced system. Since the elementary row operations are reversible, if we start with the reduced system, the original system can be recovered. Now, it is clear that any solution of the reduced system is also a solution of the original system.

Definition. A rectangular matrix is in echelon form (or row-echelon form) if it has the following three properties:

1. All nonzero rows are above any rows of all zeros.
2. Each leading entry of a row is in a column to the right of the leading entry of the row above it.
3. All entries in a column below a leading entry are zero.

If a matrix in echelon form satisfies the following additional conditions, then it is in reduced echelon form (or reduced row-echelon form):

4. The leading entry in each nonzero row is 1, called the leading 1.
5. Each leading 1 is the only nonzero entry in its column.

An echelon matrix (respectively, reduced echelon matrix) is one that is in echelon form (respectively, reduced echelon form).

Theorem 3.2.2. Every matrix can be brought to a reduced echelon matrix by a finite sequence of

elementary row operations.

Proof. This can be done by an algorithm, called the Gaussian Algorithm.

1. If the matrix consists entirely of zeros, stop; it is already in row-echelon form.
2. Otherwise, find the first column from the left containing a nonzero entry (call it a), and move the row containing that entry to the top position.
3. Now multiply the new top row by 1/a to create a leading 1.
4. By subtracting multiples of that row from rows below it, make each entry below the leading 1 zero.

This completes the first row, and all further row operations are carried out on the remaining rows.

5. Repeat steps 1–4 on the matrix consisting of the remaining rows.

The process stops when either no rows remain at Step 5 or the remaining rows consist entirely of zeros. Observe that the Gaussian algorithm is recursive.
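As a complement to the proof, here is a minimal sketch of the Gaussian Algorithm (assuming Python, with exact arithmetic via fractions.Fraction); it returns a row-echelon form with leading 1's.

```python
from fractions import Fraction

def gaussian(rows):
    """Steps 1-5 above: bring a matrix to row-echelon form."""
    A = [[Fraction(x) for x in row] for row in rows]
    m, n = len(A), len(A[0])
    top = 0                                   # first row not yet finished
    for col in range(n):
        # Step 2: find a row at or below `top` with a nonzero entry here.
        pivot = next((r for r in range(top, m) if A[r][col] != 0), None)
        if pivot is None:
            continue                          # nothing to do in this column
        A[top], A[pivot] = A[pivot], A[top]   # move that row to the top
        a = A[top][col]
        A[top] = [x / a for x in A[top]]      # Step 3: create a leading 1
        for r in range(top + 1, m):           # Step 4: clear entries below
            c = A[r][col]
            A[r] = [x - c * y for x, y in zip(A[r], A[top])]
        top += 1                              # Step 5: recurse on the rest
    return A

print(gaussian([[2, 3, 1], [1, 2, 1]]))
# [[1, 3/2, 1/2], [0, 1, 1]] (entries printed as Fractions)
```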

Definition. A pivot position in a matrix A is a location in A that corresponds to a leading entry

in an echelon form of A. A pivot column is a column of A that contains a pivot position.

Definition. Let A be an n × n matrix. We say that A is invertible or nonsingular and has the

n× n matrix B as inverse if AB = BA = In.

If B and C are n× n matrices with AB = In and CA = In, then the associativity of multipli-

cation implies that

B = InB = (CA)B = C(AB) = CIn = C.

Hence an inverse for A is unique if it exists and we write A−1 for this inverse.

Theorem 3.2.3. Suppose A and B are invertible matrices of the same size. Then the following

results hold:

(a) A−1 is invertible and (A−1)−1 = A, i.e., A is the inverse of A−1.

(b) AB is invertible and (AB)−1 = B−1A−1.

(c) AT is invertible and (AT )−1 = (A−1)T .

Theorem 3.2.4. [Invertible Matrix Theorem]

The following statements are equivalent for an n× n matrix A.

(i) A is invertible.

(ii) The homogeneous system A~x = ~0 has only the trivial solution ~x = ~0n.

(iii) A can be carried to the identity matrix In by elementary row operations.

(iv) The system A~x = ~b has at least one solution for any vector ~b ∈ Fn.

(v) There is an n× n matrix C such that AC = In.

Corollary 3.2.5. If A and C are square matrices such that AC = I, then also CA = I. In

particular, A and C are invertible, C = A−1 and A = C−1.

Corollary 3.2.6. An n× n matrix A is invertible if and only if rankA = n.

Definition. An elementary matrix is one that is obtained by performing a single elementary

row operation on an identity matrix.

Example 3.2.1. Consider the following elementary matrices:

E_1 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad E_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}, \quad E_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 3 & 0 & 1 \end{bmatrix}.

Let A = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}. Compute the products E_1A, E_2A and E_3A.


Theorem 3.2.7. If an elementary row operation is performed on an m×n matrix A, the resulting

matrix can be written as EA, where the m ×m matrix E is created by performing the same row

operation on Im.

Remark. Elementary matrices are invertible because row operations are reversible. To find the inverse of an elementary matrix E, determine the elementary row operation needed to transform E back into I and apply this operation to I to obtain the inverse.

Corollary 3.2.8. An elementary matrix is invertible. Moreover,

1. If I \xrightarrow{R_{ij}} E_1, then I \xrightarrow{R_{ij}} E_1^{-1}.

2. If c ≠ 0 and I \xrightarrow{cR_i} E_2, then I \xrightarrow{(1/c)R_i} E_2^{-1}.

3. If c ∈ F and I \xrightarrow{R_i + cR_j} E_3, then I \xrightarrow{R_i - cR_j} E_3^{-1}.

Example 3.2.2. Find the inverses of the elementary matrices given in Example 3.2.1.

Theorem 3.2.9. Suppose A is an m× n matrix and A→ B by elementary row operations.

1. B = UA for some m×m invertible matrix U .

2. U can be computed by [A : Im]→ [B : U ] using the operations carrying A→ B.

3. U = E_k E_{k−1} \cdots E_2 E_1, where E_1, E_2, . . . , E_{k−1}, E_k are the elementary matrices corresponding (in order) to the elementary row operations carrying A → B.

Example 3.2.3. If A = \begin{bmatrix} 2 & 3 & 1 \\ 1 & 2 & 1 \end{bmatrix}, express the reduced row-echelon form R of A as R = UA, where U is invertible.

Theorem 3.2.10. A square matrix is invertible if and only if it is a product of elementary matrices.

Remark. From the above theorem, we obtain an algorithm to find A−1 if A is invertible. Namely,

we start with the block matrix [A : I] and row reduce it until we reach the final reduced echelon

form [I : U ] (because A is row equivalent to I by Theorem 3.2.4). Then we have U = A−1.
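A computational sketch of this algorithm (Python with the sympy library; an illustration, not part of the original notes), applied to the matrix of Example 3.2.4 below:

from sympy import Matrix, eye

A = Matrix([[-2, 3], [1, 0]])
aug = A.row_join(eye(2))        # form the block matrix [A : I]
R, _ = aug.rref()               # row reduce to the reduced echelon form [I : U]
U = R[:, 2:]                    # then U = A^{-1}
print(U)                        # Matrix([[0, 1], [1/3, 2/3]])
print(A * U == eye(2))          # True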

Example 3.2.4. Express A = \begin{bmatrix} -2 & 3 \\ 1 & 0 \end{bmatrix} as a product of elementary matrices.

3.3 More on Ranks

Definition. Let A be an m × n matrix over F.

The column space, Col A, of A is the subspace of F^m spanned by the columns of A.

The row space, Row A, of A is the subspace of F^n spanned by the rows of A.

Note that Col A = Row A^T.

Lemma 3.3.1. Let V be a vector space over a field F . Let ~v1, . . . , ~vn be in V .

1. Span{~v1, . . . , ~vn} = Span{~v1, . . . , c~vi, . . . , ~vn} for all i ∈ {1, . . . , n} and nonzero c ∈ F.

2. Span{~v1, . . . , ~vn} = Span{~v1, . . . , ~vi + c~vj , . . . , ~vj , . . . , ~vn} for all i ≠ j and c ∈ F.

Lemma 3.3.2. Let A and B denote m× n matrices.

1. If A→ B by elementary row operations, then RowA = RowB.

2. If A→ B by elementary column operations, then ColA = ColB.


If A is any matrix, we can carry A → R by elementary row operations where R is a row-

echelon matrix. Hence, RowA = RowR by Lemma 3.3.2.

Lemma 3.3.3. If R is a row-echelon matrix, then

1. The nonzero rows of R form a basis for RowR.

2. The columns of R containing leading ones form a basis for ColR.

Theorem 3.3.4. Let A denote any m× n matrix of rank r. Then

dimColA = r = dimRowA.

Moreover, if A is carried to a row-echelon matrix R by row operations, then

1. The r nonzero rows of R form a basis for RowA.

2. If the pivot positions lie in columns j1, j2, . . . , jr of R, then columns j1, j2, . . . , jr of A are a

basis of ColA. That is, the pivot columns of A form a basis for ColA.
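The following sketch (Python with sympy; an illustration only) extracts such bases, here for the matrix of Exercise 25 at the end of this chapter:

from sympy import Matrix

A = Matrix([[1, 2, 0, 2, 1],
            [-1, -2, 1, 1, 0],
            [1, 2, -3, -7, -2]])
R, pivots = A.rref()                                  # reduced row-echelon form and pivot columns
print(pivots)                                         # (0, 2), so rank A = 2
row_basis = [R.row(i) for i in range(len(pivots))]    # nonzero rows of R: a basis of Row A
col_basis = [A.col(j) for j in pivots]                # pivot columns of A: a basis of Col A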

Corollary 3.3.5. 1. If A is any matrix, then rankA = rankAT .

2. If A is an m× n matrix, then rankA ≤ m and rankA ≤ n.

3. rankA = rankUA = rankAV whenever U and V are invertible.

Corollary 3.3.6. Let A,B,U and V be matrices of sizes for which the indicated products are

defined.

1. Col(AV ) ⊆ ColA, with equality if V is (square and) invertible.

2. Row(UA) ⊆ RowA, with equality if U is (square and) invertible.

3. rankAB ≤ rankA and rankAB ≤ rankB.

Let A be an m × n matrix of rank r, and let R be the reduced row-echelon form of A. Theorem 3.2.9 shows that R = UA where U is invertible, and that U can be found by [A : I_m] → [R : U]. The matrix R has r leading ones (since rank A = r) so, as R is reduced, the n × m matrix R^T contains each row of I_r in the first r columns. Thus, row operations will carry

R^T → \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{n×m}.

Hence, Theorem 3.2.9 (again) shows that \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{n×m} = U_1 R^T where U_1 is an n × n invertible matrix. Writing V = U_1^T, we obtain

UAV = RV = R U_1^T = (U_1 R^T)^T = \left( \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{n×m} \right)^T = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{m×n}.

Moreover, the matrix U_1 = V^T can be computed by [R^T : I_n] → \left[ \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{n×m} : V^T \right]. This proves

Theorem 3.3.7. Let A be an m × n matrix of rank r. There exist invertible matrices U and V of size m × m and n × n, respectively, such that

UAV = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{m×n},

called the Smith normal form of A.

Moreover, if R is a reduced row-echelon form of A, then:

1. U can be computed by [A : I_m] → [R : U].

2. V can be computed by [R^T : I_n] → \left[ \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{n×m} : V^T \right].


Example 3.3.1. Given A = \begin{bmatrix} 1 & -1 & 1 & 2 \\ 2 & -2 & 1 & -1 \\ -1 & 1 & 0 & 3 \end{bmatrix}, find invertible matrices U and V such that UAV is in the Smith normal form.
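A computational sketch of the two-step recipe of Theorem 3.3.7 applied to this example (Python with sympy; an illustration, not part of the original notes):

from sympy import Matrix, eye

A = Matrix([[1, -1, 1, 2], [2, -2, 1, -1], [-1, 1, 0, 3]])
m, n = A.shape

# Step 1: U from [A : I_m] -> [R : U].
R_aug, _ = A.row_join(eye(m)).rref()
R, U = R_aug[:, :n], R_aug[:, n:]

# Step 2: V^T from [R^T : I_n] -> [[I_r 0; 0 0] : V^T].
S_aug, _ = R.T.row_join(eye(n)).rref()
V = S_aug[:, m:].T

print(U * A * V)   # Matrix([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0]]), the Smith normal form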

Theorem 3.3.8. [Uniqueness of the reduced row-echelon form]

If a matrix A is carried to reduced row-echelon matrices R and S by row operations, then R = S.

Proof. Observe first that UR = S for some invertible matrix U (by Theorem 3.2.9 there exist invertible matrices P and Q such that R = PA and S = QA; take U = QP^{-1}). We show that R = S by induction on the number m of rows of A. The case m = 1 is trivial because we can perform only scaling. If ~r_j and ~s_j denote the jth column of R and of S, respectively, the fact that UR = S gives

U~r_j = ~s_j for each j. (3.3.1)

Since U is invertible, this shows that R and S have the same zero columns. Hence, by passing to the matrices obtained by deleting the zero columns from R and S, we may assume that R and S have no zero columns.

But then the first column of R and of S is the first column of I_m because they are reduced row-echelon, so (3.3.1) forces the first column of U to be the first column of I_m. Now, write U, R and S in block form as follows:

U = \begin{bmatrix} 1 & X \\ 0 & V \end{bmatrix}, \quad R = \begin{bmatrix} 1 & Y \\ 0 & R' \end{bmatrix} \quad \text{and} \quad S = \begin{bmatrix} 1 & Z \\ 0 & S' \end{bmatrix}.

Since UR = S, block multiplication gives VR' = S'; since V is invertible (because U is invertible) and both R' and S' are reduced row-echelon, we obtain R' = S' by the induction hypothesis. Thus, R and S have the same number (say r) of leading 1's, and so both have m − r zero rows.

In fact, R and S have leading ones in the same columns, say r of them. Applying (3.3.1) to these columns shows that the first r columns of U are the first r columns of I_m. Hence, we can write U, R and S in block form as follows:

U = \begin{bmatrix} I_r & M \\ 0 & W \end{bmatrix}, \quad R = \begin{bmatrix} R_1 & R_2 \\ 0 & 0 \end{bmatrix} \quad \text{and} \quad S = \begin{bmatrix} S_1 & S_2 \\ 0 & 0 \end{bmatrix},

where R_1 and S_1 are r × r. Then block multiplication gives UR = R, and since UR = S, we conclude S = R. This completes the proof.

3.4 Permutations and Determinants

Definition. Let n ∈ N. A permutation σ on the set {1, 2, . . . , n} is a one-to-one mapping of the set onto itself or, equivalently, a rearrangement of the numbers 1, 2, . . . , n. Such a permutation σ is denoted by

σ = \begin{pmatrix} 1 & 2 & \dots & n \\ j_1 & j_2 & \dots & j_n \end{pmatrix} \quad \text{or} \quad σ = j_1 j_2 \dots j_n, \quad \text{where } j_i = σ(i).

The set of all such permutations is denoted by Sn, and the number of such permutations is n!.

Example 3.4.1. S2 = {12, 21} and S3 = {123, 132, 213, 231, 312, 321}.

Remark. If σ ∈ Sn, then the inverse mapping σ−1 ∈ Sn; and if σ, τ ∈ Sn, then the composition

mapping σ ◦ τ ∈ Sn. Also, the identity mapping ε = σ ◦ σ−1 = 123 . . . n ∈ Sn.


Definition. For a permutation σ in Sn, let

Iσ = {(i, k) : i, k ∈ {1, 2, . . . , n}, i < k and σ(i) > σ(k)}.

We say that σ is an even permutation⇔ |Iσ| is even, and an odd permutation⇔ |Iσ| is odd.

We then define the sign or parity of σ, written sgnσ, by

sgn σ = \begin{cases} 1 & \text{if } σ \text{ is even}, \\ −1 & \text{if } σ \text{ is odd}. \end{cases}

Thus, sgn σ ∈ {−1, 1} for all σ ∈ S_n.
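Computationally, sgn σ can be read off directly from the inversion count |I_σ| (a plain Python sketch; the helper name sgn is ours):

def sgn(p):
    """Sign of a permutation given in one-line notation, via the inversion count |I_sigma|."""
    inversions = sum(1 for i in range(len(p)) for k in range(i + 1, len(p)) if p[i] > p[k])
    return 1 if inversions % 2 == 0 else -1

print(sgn((2, 1, 3, 4)))       # sigma = 2134 in S_4: one inversion, so sgn = -1
print(sgn((2, 1, 5, 4, 3)))    # tau = 21543 in S_5: four inversions, so sgn = 1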

Example 3.4.2. Let σ = 2134 in S4 and τ = 21543 in S5.

1. Find σ−1 and τ−1.

2. Compute sgnσ and sgn τ .

Theorem 3.4.1. Let n ≥ 2 and let g be the polynomial given by

g = g(x_1, x_2, . . . , x_n) = \prod_{i<j} (x_i − x_j).

For σ ∈ S_n, define the polynomial

σ(g) = \prod_{i<j} (x_{σ(i)} − x_{σ(j)}).

Then

σ(g) = \begin{cases} g & \text{if } σ \text{ is even}, \\ −g & \text{if } σ \text{ is odd}. \end{cases}

That is, σ(g) = (sgn σ)g.

Theorem 3.4.2. Let σ, τ ∈ Sn. Then

sgn(τ ◦ σ) = (sgn τ)(sgnσ).

Thus, the product of two even or two odd permutations is even, and the product of an odd and an

even permutation is odd.

Let A = [a_ij] be a square matrix of size n × n.

Consider a product of n elements of A such that one and only one element comes from each row and one and only one element comes from each column. Such a product can be written in the form

a_{1j_1} a_{2j_2} \cdots a_{nj_n},

that is, where the factors come from successive rows, and so the first subscripts are in the natural order 1, 2, . . . , n. Now since the factors come from different columns, the sequence of second subscripts forms a permutation σ = j_1 j_2 . . . j_n in S_n. Conversely, each permutation in S_n determines a product of the above form. Thus the matrix A contains n! such products.

Definition. The determinant of A = [a_ij], denoted by det A or |A|, is the sum of all the above n! products, where each such product is multiplied by sgn σ. That is,

|A| = \sum_{σ∈S_n} (sgn σ) a_{1j_1} a_{2j_2} \cdots a_{nj_n} = \sum_{σ∈S_n} (sgn σ) a_{1σ(1)} a_{2σ(2)} \cdots a_{nσ(n)}.
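The definition can be rendered directly in Python (an inefficient sketch with n! terms, for illustration only):

from itertools import permutations

def det_perm(A):
    """|A| = sum over sigma in S_n of sgn(sigma) * a_{1,sigma(1)} ... a_{n,sigma(n)}."""
    def sgn(p):   # sign via the inversion count, as in the definition above
        inv = sum(1 for i in range(len(p)) for k in range(i + 1, len(p)) if p[i] > p[k])
        return -1 if inv % 2 else 1
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        term = sgn(p)
        for i in range(n):
            term *= A[i][p[i]]      # one factor from each row and each column
        total += term
    return total

print(det_perm([[1, 2], [3, 4]]))                      # -2
print(det_perm([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))    # -3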


Lemma 3.4.3. Let A = [a_ij] be an n × n matrix and σ ∈ S_n.

1. sgn σ^{-1} = sgn σ.

2. {(i, σ^{-1}(i)) : i ∈ {1, 2, . . . , n}} = {(σ(i), i) : i ∈ {1, 2, . . . , n}}.

3. a_{σ(1),1} a_{σ(2),2} \cdots a_{σ(n),n} = a_{1,σ^{-1}(1)} a_{2,σ^{-1}(2)} \cdots a_{n,σ^{-1}(n)}.

Theorem 3.4.4. The determinant of a matrix A and its transpose are equal. That is, |A| = |AT |.

Remark. By this theorem, any theorem about the determinant of a matrix A that concerns the

rows of A will have an analogous theorem concerning the columns of A.

Lemma 3.4.5. For k < l, τ = \begin{pmatrix} 1 & 2 & \dots & k & \dots & l & \dots & n \\ 1 & 2 & \dots & l & \dots & k & \dots & n \end{pmatrix} is an odd permutation in S_n.

Proof. Note that I_τ = {(k, j) : j ∈ {k + 1, k + 2, . . . , l}} ∪ {(i, l) : i ∈ {k + 1, k + 2, . . . , l − 1}}. Then |I_τ| = (l − k) + (l − k − 1) = 2(l − k) − 1 is odd. Thus, τ is an odd permutation and so sgn τ = −1.

Theorem 3.4.6. If A→ B by interchanging two rows (columns) of A, then |B| = −|A|.

Theorem 3.4.7. Let A be a square matrix of size n × n.

(a) If A has a row (column) of zeros, then |A| = 0.

(b) If σ ≠ 12 . . . n, then ∃i ∈ {1, 2, . . . , n}, i > σ(i).

(c) If A is triangular, i.e., A has zeros above or below the diagonal, then |A| is the product of the diagonal elements. In particular, |I| = 1.

Theorem 3.4.8. If A has two identical rows (columns), then |A| = 0.

Proof. Assume that the kth and lth rows are identical with k < l. That is, a_{kj} = a_{lj} for all j ∈ {1, . . . , n}. In particular, for any σ ∈ S_n, a_{kσ(l)} = a_{lσ(l)} and a_{kσ(k)} = a_{lσ(k)}.

Let τ = \begin{pmatrix} 1 & 2 & \dots & k & \dots & l & \dots & n \\ 1 & 2 & \dots & l & \dots & k & \dots & n \end{pmatrix}.

Then sgn τ = −1 and σ(τ(j)) = σ(j) for all j ∈ {1, . . . , n} \ {k, l}. Also,

sgn(στ) = (sgn σ)(sgn τ) = − sgn σ.

As σ runs through all even permutations, στ runs through all odd permutations, and vice versa. Thus

|A| = \sum_{σ∈S_n} (sgn σ) a_{1σ(1)} a_{2σ(2)} \cdots a_{kσ(k)} \cdots a_{lσ(l)} \cdots a_{nσ(n)}
    = \sum_{σ \text{ even}} \big( (sgn σ) a_{1σ(1)} \cdots a_{kσ(k)} \cdots a_{lσ(l)} \cdots a_{nσ(n)} + (sgn(στ)) a_{1στ(1)} \cdots a_{kστ(k)} \cdots a_{lστ(l)} \cdots a_{nστ(n)} \big)
    = \sum_{σ \text{ even}} \big( (sgn σ) a_{1σ(1)} \cdots a_{kσ(k)} \cdots a_{lσ(l)} \cdots a_{nσ(n)} − (sgn σ) a_{1σ(1)} \cdots a_{kσ(l)} \cdots a_{lσ(k)} \cdots a_{nσ(n)} \big)
    = \sum_{σ \text{ even}} \big( (sgn σ) a_{1σ(1)} \cdots a_{kσ(k)} \cdots a_{lσ(l)} \cdots a_{nσ(n)} − (sgn σ) a_{1σ(1)} \cdots a_{lσ(l)} \cdots a_{kσ(k)} \cdots a_{nσ(n)} \big)
    = 0.


Hence, we have the theorem.

Theorem 3.4.9. If A→ B by multiplying a row (column) of A by a scalar c ∈ F , then |B| = c|A|.

Remark. If A is an n × n matrix, then |cA| = c^n |A|.

Theorem 3.4.10. If A→ B by adding a multiple of a row (column) of A to another row (column)

of A, then |B| = |A|.

Corollary 3.4.11. If E is an elementary matrix, then det E ≠ 0.

Lemma 3.4.12. Let E be an elementary matrix. Then |EA| = |E||A| for any matrix A. In

particular, if E1, E2 . . . , Es are elementary matrices, then

|E1E2 . . . Es| = |E1||E2| . . . |Es|.

Theorem 3.4.13. Let A be a square matrix. Then, A is invertible ⇔ det A ≠ 0.

Theorem 3.4.14. The determinant of a product of two matrices A and B is the product of their

determinants; that is |AB| = |A||B|.

Definition. Consider an n-square matrix A = [a_ij]. Let M_ij(A) denote the (n − 1)-square submatrix of A obtained by deleting its ith row and jth column. The determinant |M_ij(A)| is called the minor of the element a_ij of A, and we define the cofactor of a_ij, denoted by C_ij(A), to be the “signed” minor:

C_ij(A) = (−1)^{i+j} |M_ij(A)|.

Recall that

|A| = \sum_{σ∈S_n} (sgn σ) a_{1σ(1)} a_{2σ(2)} \cdots a_{nσ(n)} = a_{ij} \widehat{C}_{ij}(A) + (terms which do not contain a_{ij} as a factor),

where \widehat{C}_{ij}(A) denotes the coefficient of a_{ij} in this expansion.

Lemma 3.4.15. \widehat{C}_{ij}(A) = C_{ij}(A) for all i, j ∈ {1, . . . , n}.

Granting this lemma, we observe that, grouping the terms of |A| by the entry of the ith row they contain,

|A| = \sum_{j=1}^{n} a_{ij} \widehat{C}_{ij}(A) = \sum_{j=1}^{n} a_{ij} C_{ij}(A).

Therefore, we have shown:

Theorem 3.4.16. [Laplace] The determinant of a square matrix A = [a_ij] is equal to the sum of the products obtained by multiplying the elements of any row (column) by their respective cofactors:

|A| = a_{i1} C_{i1}(A) + a_{i2} C_{i2}(A) + \cdots + a_{in} C_{in}(A) = \sum_{j=1}^{n} a_{ij} C_{ij}(A),

|A| = a_{1j} C_{1j}(A) + a_{2j} C_{2j}(A) + \cdots + a_{nj} C_{nj}(A) = \sum_{i=1}^{n} a_{ij} C_{ij}(A)

for all i, j ∈ {1, 2, . . . , n}.
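A recursive sketch of the expansion along the first row (plain Python; an illustration only):

def det_cofactor(A):
    """Laplace expansion along the first row: |A| = sum_j a_{1j} C_{1j}(A)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]    # M_{1j}(A): delete row 1 and column j
        total += (-1) ** j * A[0][j] * det_cofactor(minor)  # (-1)**j is the sign (-1)^{1+j}
    return total

print(det_cofactor([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))     # -3, as in the previous sketch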


Remark. The above formulas for |A| are called the Laplace expansions of the determinant of A by the ith row and the jth column. Together with the elementary row operations, they offer a method of simplifying the computation of |A|.

Next we proceed to prove the lemma.

Proof of Lemma 3.4.15. Note that for any matrix B,

|B| = \sum_{σ∈S_n} (sgn σ) b_{1σ(1)} b_{2σ(2)} \cdots b_{n−1,σ(n−1)} b_{nσ(n)}
    = b_{nn} \sum_{σ∈S_n, σ(n)=n} (sgn σ) b_{1σ(1)} b_{2σ(2)} \cdots b_{n−1,σ(n−1)} + (other terms which do not contain b_{nn}).

Thus,

\widehat{C}_{nn}(B) = \sum_{σ∈S_n, σ(n)=n} (sgn σ) b_{1σ(1)} b_{2σ(2)} \cdots b_{n−1,σ(n−1)}
                    = \sum_{τ∈S_{n−1}} (sgn τ) b_{1τ(1)} b_{2τ(2)} \cdots b_{n−1,τ(n−1)}
                    = \begin{vmatrix} b_{11} & b_{12} & \dots & b_{1,n−1} \\ b_{21} & b_{22} & \dots & b_{2,n−1} \\ \vdots & & & \vdots \\ b_{n−1,1} & b_{n−1,2} & \dots & b_{n−1,n−1} \end{vmatrix}

= the determinant of the matrix obtained from deleting the nth row and nth column of B.

Write

A = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1j} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2j} & \dots & a_{2n} \\ \vdots & & & \vdots & & \vdots \\ a_{i1} & a_{i2} & \dots & a_{ij} & \dots & a_{in} \\ \vdots & & & \vdots & & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nj} & \dots & a_{nn} \end{bmatrix}.

To compute \widehat{C}_{ij}(A), we carry A to A' by moving the ith row to the bottom with n − i interchanges of adjacent rows, and the jth column to the right with n − j interchanges of adjacent columns:

A' = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1,j−1} & a_{1,j+1} & \dots & a_{1n} & a_{1j} \\ \vdots & & & & & & \vdots & \vdots \\ a_{i−1,1} & a_{i−1,2} & \dots & a_{i−1,j−1} & a_{i−1,j+1} & \dots & a_{i−1,n} & a_{i−1,j} \\ a_{i+1,1} & a_{i+1,2} & \dots & a_{i+1,j−1} & a_{i+1,j+1} & \dots & a_{i+1,n} & a_{i+1,j} \\ \vdots & & & & & & \vdots & \vdots \\ a_{n1} & a_{n2} & \dots & a_{n,j−1} & a_{n,j+1} & \dots & a_{nn} & a_{nj} \\ a_{i1} & a_{i2} & \dots & a_{i,j−1} & a_{i,j+1} & \dots & a_{in} & a_{ij} \end{bmatrix}.

Hence,

|A'| = (−1)^{(n−i)+(n−j)} |A| = (−1)^{−i−j} |A|, that is, |A| = (−1)^{i+j} |A'|.

Comparing the coefficients of a_{ij} on both sides (note that a_{ij} sits in position (n, n) of A'), we get

a_{ij} \widehat{C}_{ij}(A) + (other terms) = (−1)^{i+j} a_{ij} \widehat{C}_{nn}(A') + (other terms).


Therefore,

\widehat{C}_{ij}(A) = (−1)^{i+j} \widehat{C}_{nn}(A')
                    = (−1)^{i+j} (the determinant of the matrix obtained from deleting the nth row and the nth column of A')
                    = (−1)^{i+j} |M_{ij}(A)|
                    = C_{ij}(A).

This completes the lemma.

Definition. Let A = [aij ] be an n × n matrix and let Cij(A) denote the cofactor of aij . The

classical adjoint of A, denoted by adjA, is the transpose of the matrix of the cofactors of A,

namely,

adjA = [Cij(A)]T .

We say “classical adjoint” here instead of simply “adjoint” because the term “adjoint” will be

used for an entirely different concept.

Theorem 3.4.17. Let A be a square matrix. Then

A(adj A) = (adj A)A = |A| I,

where I is the identity matrix. Thus, if |A| ≠ 0, then

A^{-1} = \frac{1}{|A|} (adj A).
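A quick check of this identity (Python with sympy, whose adjugate method computes the classical adjoint; an illustration only):

from sympy import Matrix, eye

A = Matrix([[3, 1, -1], [2, 2, -1], [2, 2, 0]])
adjA = A.adjugate()                      # the classical adjoint [C_ij(A)]^T
print(A * adjA == A.det() * eye(3))      # True: A (adj A) = |A| I
print(A.inv() == adjA / A.det())         # True, since |A| = 4 != 0 here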

For any n × n matrix A and any ~b ∈ F^n, let A_j(~b) be the matrix obtained from A by replacing the jth column by the vector ~b, that is,

A_j(~b) = \begin{bmatrix} ~a_1 & \cdots & ~b & \cdots & ~a_n \end{bmatrix} \quad (\text{with } ~b \text{ in the jth column}), \quad \text{for all } j = 1, 2, . . . , n.

Theorem 3.4.18. [Cramer’s rule] Let A be an invertible n × n matrix. For any ~b ∈ F^n, the unique solution ~x of A~x = ~b has entries given by

x_j = \frac{|A_j(~b)|}{|A|}, \quad j = 1, 2, . . . , n.
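A sketch of Cramer's rule (Python with sympy; the helper name cramer is ours, and this is an illustration rather than an efficient method):

from sympy import Matrix

def cramer(A, b):
    """Solve A x = b by Cramer's rule; assumes A is square and invertible."""
    d = A.det()
    x = []
    for j in range(A.rows):
        Aj = A.copy()
        Aj[:, j] = b                  # A_j(b): replace the j-th column of A by b
        x.append(Aj.det() / d)
    return Matrix(x)

A = Matrix([[2, 3], [1, 2]])
b = Matrix([5, 3])
print(cramer(A, b))                   # Matrix([[1], [1]])
print(cramer(A, b) == A.solve(b))     # True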

Exercises for Chapter 3.

1. The following matrices are echelon forms of coefficient matrices of linear systems. Which has a unique solution? Why?

(a) \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad (b) \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}

2. Find the general solution to the linear system

x_1 + 2x_2 + x_3 − 2x_4 = 5
2x_1 + 4x_2 + x_3 + x_4 = 9
3x_1 + 6x_2 + 2x_3 − x_4 = 14


3. Consider the linear system with parameter a

(2a − 1)x + ay − (a + 1)z = 1
ax + y − 2z = 1
2x + (3 − a)y + (2a − 6)z = 1

Determine, with proof, for which a this system has (a) no solution, (b) a unique solution, (c) more than one solution.

4. Consider the linear system

x + 2y + z = 3
ay + 5z = 10
2x + 7y + az = b

(a) Find those values of a for which the system has a unique solution.
(b) Find those pairs of values (a, b) for which the system has more than one solution.

5. If A~x = ~b has more than one solution, why is it impossible for A~x = ~c (new right-hand side) to have only one solution? Could A~x = ~c have no solution?

6. Let A~x = ~0 be a homogeneous system of n linear equations in n unknowns and let Q be an invertible n × n matrix. Show that A~x = ~0 has a nontrivial solution if and only if (QA)~x = ~0 has a nontrivial solution.

7. If A~x = ~b has two distinct solutions ~p and ~q, find two distinct solutions to A~x = ~0.

8. Under what conditions on b_1 and b_2 (if any) is A~x = ~b consistent (has a solution)?

A = \begin{bmatrix} 1 & 2 & 0 & 3 \\ 2 & 4 & 0 & 7 \end{bmatrix} \quad \text{and} \quad ~b = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}.

9. Find the number c so that (if possible) the rank of A is (a) 1, (b) 2, (c) 3, where

A = \begin{bmatrix} 6 & 4 & 2 \\ −3 & −2 & −1 \\ 9 & 6 & c \end{bmatrix}.

10. Suppose A = \begin{bmatrix} 1 & 2 & 1 & b \\ 2 & a & 1 & 8 \\ * & * & * & * \end{bmatrix} has the reduced echelon form R = \begin{bmatrix} 1 & 2 & 0 & 3 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix}.

(a) Find a and b. (b) Solve A~x = ~0.

11. Let A be an m × n matrix for which

A~x = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} has no solutions and A~x = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} has a unique solution.

(a) Give all possible information about m and n and the rank of A.
(b) Find all solutions of A~x = ~0 and explain your answer.

12. Let A be a 3 × 4 matrix for which

A~x = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} has no solutions and A~x = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} has more than one solution.

(a) Give all possible values for rank A.
(b) Do we always have more than one solution for A~x = ~0? Explain your answer.
(c) Is it possible to have a vector ~b such that A~x = ~b has a unique solution? Why?

13. Find the value for c in the following n by n inverse:

if A = \begin{bmatrix} n & −1 & \dots & −1 \\ −1 & n & \dots & −1 \\ \vdots & & \ddots & \vdots \\ −1 & −1 & \dots & n \end{bmatrix} then A^{-1} = \frac{1}{n+1} \begin{bmatrix} c & 1 & \dots & 1 \\ 1 & c & \dots & 1 \\ \vdots & & \ddots & \vdots \\ 1 & 1 & \dots & c \end{bmatrix}.

14. If E is an elementary matrix, prove that E^T is an elementary matrix.


15. Let E_1, E_2 and E_3 denote, respectively, the elementary row operations

“Interchange rows R_1 and R_2”, “Multiply R_3 by 5”, “Replace R_2 by −3R_1 + R_2”.

(a) Find the corresponding 3-square elementary matrices E_1, E_2 and E_3.
(b) Find the inverses of the matrices E_1, E_2 and E_3.

16. Let A be a 3 × 3 invertible matrix. Construct B by replacing R_3 of the matrix A by R_3 − 4R_1. How do we find B^{-1} from A^{-1}? Explain.

17. Let A be a 3 × 3 matrix and B = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}. Consider the augmented matrix C = [A : B]. After row reducing C, we get the matrix

\begin{bmatrix} 1 & 0 & 1 & 2 & −3 & −4 \\ 0 & 1 & 0 & −1 & 2 & 2 \\ 0 & 0 & −1 & 0 & 0 & 1 \end{bmatrix}.

Compute A^{-1}.

18. (a) For which values of the parameter c is A = \begin{bmatrix} −2 & 1 & c \\ 0 & −1 & 1 \\ 1 & 2 & 0 \end{bmatrix} invertible?

(b) For which values of e is the matrix A = \begin{bmatrix} 5 & e & e \\ e & e & e \\ 1 & 2 & e \end{bmatrix} not invertible?

19. Let A = \begin{bmatrix} a & b & b \\ a & a & b \\ a & a & a \end{bmatrix}. If a ≠ 0 and a ≠ b, prove that A is invertible and find A^{-1} in terms of a and b.

20. Show that if A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ a & b & c \end{bmatrix} is an elementary matrix, then at least one entry in the third row must be zero.

21. In each case find an elementary matrix E such that B = EA.

(a) A = \begin{bmatrix} 2 & 1 \\ 3 & −1 \end{bmatrix}, B = \begin{bmatrix} 3 & −1 \\ 2 & 1 \end{bmatrix} \quad (b) A = \begin{bmatrix} 2 & 1 \\ 3 & −1 \end{bmatrix}, B = \begin{bmatrix} −1 & −3 \\ 3 & −1 \end{bmatrix}

22. In each case find an invertible matrix U such that UA = B, and express U as a product of elementary matrices.

(a) A = \begin{bmatrix} 2 & 1 & 3 \\ −1 & 1 & 2 \end{bmatrix}, B = \begin{bmatrix} 1 & −1 & −2 \\ 3 & 0 & 1 \end{bmatrix} \quad (b) A = \begin{bmatrix} 2 & −1 & 0 \\ 1 & 1 & 1 \end{bmatrix}, B = \begin{bmatrix} 3 & 0 & 1 \\ 2 & −1 & 0 \end{bmatrix}

23. In each case find invertible matrices U and V such that UAV is in the Smith normal form.

(a) A = \begin{bmatrix} 1 & 1 & −1 \\ −2 & −2 & 4 \end{bmatrix} \quad (b) A = \begin{bmatrix} 3 & 2 \\ 2 & 1 \end{bmatrix} \quad (c) A = \begin{bmatrix} 1 & −2 \\ −2 & 4 \end{bmatrix} \quad (d) A = \begin{bmatrix} 1 & −1 & 2 & 1 \\ 2 & −1 & 0 & 3 \\ 0 & 1 & −4 & 1 \end{bmatrix}

24. Let F be a field and A = [a_ij] ∈ M_n(F). Define the trace of A to be the sum of the diagonal elements, that is,

tr A = \sum_{i=1}^{n} a_{ii}.

(a) Show that the trace is a linear transformation from M_n(F) onto F.
(b) If A and B are in M_n(F), then tr(AB) = tr(BA).
(c) If B is invertible, then tr(B^{-1}AB) = tr A.
(d) Prove that there are no square real matrices A and B such that AB − BA = I_n.

25. Let A = \begin{bmatrix} 1 & 2 & 0 & 2 & 1 \\ −1 & −2 & 1 & 1 & 0 \\ 1 & 2 & −3 & −7 & −2 \end{bmatrix}.

(a) Find bases for Col A and Nul A. (b) Find rank A and nullity A.

26. Determine the rank and nullity of A = \begin{bmatrix} 1 & 2 & 0 & 3 \\ 0 & 2 & 2 & 2 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 4 \end{bmatrix}.

27. If A is an n × n matrix such that A^2 = A and rank A = n, prove that A = I_n.


28. Let A be a 5 × 7 matrix with rank 4.
(a) What is the dimension of the solution space of A~x = ~0?
(b) Does A~x = ~b have a solution for all ~b ∈ R^5? Explain.

29. Let A be a square matrix such that A^k = 0 for some positive integer k. Prove that I + A is invertible.

30. Let A and B be m × n and n × m matrices, respectively. If m > n, show that AB is not invertible.

31. Let A and B be m × n and n × m matrices, respectively. Show that AB = 0_{m×m} ⇔ Col B ⊆ Nul A.

32. Determine the sign of all permutations in S_4 and expand the determinant

\begin{vmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{vmatrix}

by using permutations and their signs explicitly.

33. Determine the sign of the following permutations in S_5.
(a) 12354 (b) 12534 (c) 15243 (d) 54321

34. Show that if two rows (columns) of A are proportional, i.e., R_k = cR_l for some k < l, then |A| = 0.

35. Let A = [a_ij] be a square matrix of order n and σ ∈ S_n. If A_σ = [a_{σ(i),j}], show that |A_σ| = (sgn σ)|A|.

36. Prove that if n is odd, 1 + 1 ≠ 0 and A is a square matrix of order n with A = −A^T, then A is not invertible.

37. After the indicated row operations on a 3 × 3 matrix A with det A = −540, matrices A_1, A_2, . . . , A_5 are successively obtained:

A \xrightarrow{R_1+3R_2} A_1 \xrightarrow{R_{23}} A_2 \xrightarrow{3R_2−R_1} A_3 \xrightarrow{R_1−3R_2} A_4 \xrightarrow{2R_1} A_5.

Determine the values of |A_1|, |A_2|, |A_3|, |A_4| and |A_5|, respectively.

38. If A is an invertible square matrix of order n > 1, show that det(adj A) = (det A)^{n−1}. What is det(adj A) if A is not invertible? Prove your answer.

39. Let A, B, C be 3 × 3 matrices with det A = 3, det B^3 = −8, det C = 2. Compute
(a) det(ABC) (b) det(5AC^T) (c) det(A^3 B^{-3} C^{-1}) (d) det[B^{-1}(adj C)].

40. Show that adj A^T = (adj A)^T.

41. Show that if A is invertible and n > 2, then adj(adj A) = (det A)^{n−2} A.

42. If A and B are invertible, show that

adj(AB) = (adj B)(adj A) and adj(BAB^{-1}) = B(adj A)B^{-1}.

43. Prove that if A is an invertible upper triangular matrix (all entries lying below the diagonal are zero), then adj A and A^{-1} are upper triangular.

44. Suppose the real-valued functions f_1(x), f_2(x), . . . , f_k(x) are all defined and differentiable k − 1 times on the interval [a, b]. The Wronskian of the set of functions is defined on this interval to be the determinant

W(x) = \begin{vmatrix} f_1(x) & f_2(x) & \cdots & f_k(x) \\ f'_1(x) & f'_2(x) & \cdots & f'_k(x) \\ f''_1(x) & f''_2(x) & \cdots & f''_k(x) \\ \vdots & \vdots & \ddots & \vdots \\ f_1^{(k−1)}(x) & f_2^{(k−1)}(x) & \cdots & f_k^{(k−1)}(x) \end{vmatrix}.

Prove that a set of real-valued functions {f_1(x), f_2(x), . . . , f_k(x)}, differentiable k − 1 times on the interval [a, b], is linearly independent if W(x_0) ≠ 0 at some point x_0 in the interval.

45. Consider the interval [−1, 1] and the two functions defined by

f(x) = \begin{cases} 0 & \text{if } −1 ≤ x ≤ 0, \\ x^2 & \text{if } 0 ≤ x ≤ 1 \end{cases} \quad \text{and} \quad g(x) = \begin{cases} x^2 & \text{if } −1 ≤ x ≤ 0, \\ 0 & \text{if } 0 ≤ x ≤ 1. \end{cases}

These functions are both differentiable. Show that f and g are linearly independent but W(x) = 0 for all x ∈ [−1, 1]. This provides an example to prove that the converse of the previous problem does not hold.

46. (a) Show that the functions 1, x, x^2, . . . , x^k are linearly independent in the function space C^0[0, 1].
(b) Show that the functions sin x, sin 2x, sin 3x, . . . , sin kx are linearly independent in the function space C^0[0, 2π]. (Hint. Use the Wronskian.)


47. Use induction to show that

\begin{vmatrix} 1 & 1 & 1 & \cdots & 1 & 1 \\ 1 & 0 & 0 & \cdots & 0 & 0 \\ 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 1 & 0 \end{vmatrix} = (−1)^{n+1}.

48. (a) Let x_1, x_2 and x_3 be numbers. Show that

V_2 = \begin{vmatrix} 1 & x_1 \\ 1 & x_2 \end{vmatrix} = x_2 − x_1 \quad \text{and} \quad V_3 = \begin{vmatrix} 1 & x_1 & x_1^2 \\ 1 & x_2 & x_2^2 \\ 1 & x_3 & x_3^2 \end{vmatrix} = (x_2 − x_1)(x_3 − x_1)(x_3 − x_2).

(b) If x_1, x_2, . . . , x_n are numbers, then show by induction that

V_n = \begin{vmatrix} 1 & x_1 & \dots & x_1^{n−1} \\ 1 & x_2 & \dots & x_2^{n−1} \\ \vdots & \vdots & & \vdots \\ 1 & x_n & \dots & x_n^{n−1} \end{vmatrix} = \prod_{i<j} (x_j − x_i).

This determinant is called the Vandermonde determinant. (Hint. To do the induction easily, multiply each column by x_1 and subtract it from the next column on the right, starting from the right-hand side. We shall find that V_n = (x_n − x_1) \cdots (x_2 − x_1) V_{n−1}.)


4 | Linear Transformations

4.1 Linear Functionals

Definition. Let V and W be two vector spaces over F . We write L(V,W ) for the set of all linear

transformations from V to W , that is,

L(V,W ) = {T : V →W |T is a linear transformation}.

Then L(V, W) is a vector space over F with the operations defined, for S, T ∈ L(V, W) and c ∈ F, by

(S + T)(~v) = S(~v) + T(~v) and (cT)(~v) = c T(~v)

for all ~v ∈ V. Note that the zero function is its zero vector and (−T)(~v) = −T(~v) for all ~v ∈ V.

Remark. By Theorem 1.4.1, for a given basis B = {~v1, ~v2, . . . , ~vn} of an n-dimensional vector space V, there exists a unique linear transformation T : V → W such that T(~vi) = ~wi ∈ W for all i ∈ {1, 2, . . . , n}. Then for S, T ∈ L(V, W), (S(~vi) = T(~vi) for all i ∈ {1, 2, . . . , n}) ⇒ S = T. Hence, to show that two linear transformations are identical, it suffices to check that they agree on some basis of V.

Theorem 4.1.1. Let B = {~v1, . . . , ~vn} be a basis for V and let C = {~w1, . . . , ~wm} be a basis for W. For each i ∈ {1, . . . , n} and j ∈ {1, . . . , m}, we define

T_{ij}(~v_k) = \begin{cases} ~w_j & \text{if } i = k, \\ ~0_W & \text{if } i ≠ k, \end{cases}

for all k ∈ {1, . . . , n}. By Theorem 1.4.1, T_{ij} ∈ L(V, W) for all i, j. Then

{T_{ij} : i ∈ {1, . . . , n} and j ∈ {1, . . . , m}}

is a basis for L(V, W). Hence, if dim V = n and dim W = m, then dim L(V, W) = mn.

Definition. Let V be a vector space over a field F .

A linear transformation from V to F is also called a linear functional. Let

V ∗ = L(V, F ) and V ∗∗ = (V ∗)∗(= L(V ∗, F ) = L(L(V, F ), F )).

By Theorem 4.1.1, we have that if V is finite dimensional, then

dimV = dimV ∗ = dimV ∗∗

and thus, by Corollary 1.4.14, V ∼= V ∗ ∼= V ∗∗.


Definition. The space V ∗ is called the dual space of V and V ∗∗ is called the double dual of V .

Examples 4.1.1. The following functions are linear functionals.

1. T : C^0[0, 1] → R given by T(f) = \int_0^1 f(x) dx.

2. T : F[x] → F given by T(p(x)) = p(1).

Remarks. 1. For f ∈ V^*,
(a) f ≠ 0 ⇒ im f = F;
(b) if V is finite dimensional and f ≠ 0, then nullity f = (dim V) − 1.

2. For ~v ∈ V, if f(~v) = 0 for all f ∈ V^*, then ~v = ~0.

Theorem 4.1.2. Let dim V = n and let B = {~v1, . . . , ~vn} be a basis of V. For each i ∈ {1, . . . , n}, let f_i ∈ V^* be such that

f_i(~v_j) = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i ≠ j. \end{cases}

Then the following statements hold.

1. {f_1, . . . , f_n} is a basis of V^*, which is called the dual basis of B.

2. ∀f ∈ V^*, f = \sum_{i=1}^{n} f(~v_i) f_i = f(~v_1)f_1 + . . . + f(~v_n)f_n.

3. ∀~v ∈ V, ~v = \sum_{i=1}^{n} f_i(~v) ~v_i = f_1(~v)~v_1 + . . . + f_n(~v)~v_n.

For ~v ∈ V, define L~v : V ∗ → F by L~v(f) = f(~v) for all f ∈ V ∗.

Then L~v ∈ V ∗∗ for all ~v ∈ V . Hence, {L~v : ~v ∈ V } ⊆ V ∗∗.

Theorem 4.1.3. 1. The map θ : ~v 7→ L~v is a 1-1 linear transformation from V into V ∗∗.

2. If V is finite-dimensional, then

(a) the map θ : ~v 7→ L~v is an isomorphism of V onto V ∗∗

(b) ∀L ∈ V ∗∗, ∃!~v ∈ V, L = L~v.

Corollary 4.1.4. If V is finite dimensional, then each basis of V ∗ is the dual of some basis of V .

Example 4.1.2. Consider V = R_2[x], the vector space of all polynomials of degree at most 2 over R. Let t_1, t_2, t_3 be three distinct real numbers and let f_i(p(x)) = p(t_i) for all p(x) ∈ R_2[x] and i = 1, 2, 3.

Show that {f_1, f_2, f_3} is a basis of V^* and find a basis of V such that {f_1, f_2, f_3} is its dual basis.
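For concreteness (an illustration with the hypothetical sample points t_1, t_2, t_3 = 0, 1, 2, which are not specified in the example), the basis of V dual to {f_1, f_2, f_3} consists of the Lagrange interpolation polynomials p_i with p_i(t_j) = 1 if i = j and 0 otherwise; sympy can produce them:

from sympy import symbols, interpolate, expand

x = symbols('x')
ts = [0, 1, 2]                                       # assumed sample points
basis = []
for ti in ts:
    points = [(t, 1 if t == ti else 0) for t in ts]  # p_i(t_j) = delta_ij
    basis.append(expand(interpolate(points, x)))
print(basis)    # [x**2/2 - 3*x/2 + 1, -x**2 + 2*x, x**2/2 - x/2]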

Let V be an inner product space over a field F = R or C.

1. ∀~w ∈ V , the map ~v 7→ (~v, ~w) is a linear functional on V .

2. The maps ~v 7→ (~v, ~w1) and ~v 7→ (~v, ~w2) are identical⇔ ~w1 = ~w2.

Theorem 4.1.5. Let V be a finite dimensional inner product space and f ∈ V ∗.

Then ∃!~w ∈ V, f(~v) = (~v, ~w) for all ~v ∈ V .

Hence, V ∗ = {f~w : ~w ∈ V } where f~w(~v) = (~v, ~w) for all ~v ∈ V .


4.2 Quotient Spaces and Isomorphism Theorem

Let V be a vector space over a field F and let W be any subspace of V . For ~v ∈ V , define

~v +W = {~v + ~w : ~w ∈W}

which is called a coset of W . Then

(1) ∀~v1, ~v2 ∈ V,~v1 +W = ~v2 +W ⇔ ~v1 − ~v2 ∈W ,

(2) ∀~v1, ~v2 ∈ V, (~v1 +W ) ∩ (~v2 +W ) = ∅ or ~v1 +W = ~v2 +W and

(3) ∀~v1, ~v2 ∈ V, (~v1 +W ) + (~v2 +W ) = (~v1 + ~v2) +W .

For c ∈ F and ~v ∈ V , define c(~v +W ) = c~v +W .

Definition. Let V/W = {~v + W : ~v ∈ V}. It is a vector space over F with respect to the operations

(~v1 + W) + (~v2 + W) = (~v1 + ~v2) + W and c(~v1 + W) = c~v1 + W,

where ~0 + W is the zero vector of V/W and −(~v + W) = (−~v) + W for all ~v ∈ V.

The vector space V/W is called the quotient space of V by W .

Theorem 4.2.1. 1. There is a linear transformation π from V onto V/W given by

π : ~v 7→ ~v +W for all ~v ∈ V .

Its kernel is equal to W . This map π is called the canonical projection from V onto V/W .

2. If V is a finite dimensional vector space and W is a subspace of V , then V/W is finite

dimensional and dim(V/W ) = dimV − dimW .

Theorem 4.2.2. [Isomorphism Theorem] Let V and W be two vector spaces over a field F and

T : V →W a linear transformation. Then

V/(kerT ) ∼= imT.

Example 4.2.1. Let A be an m × n matrix and TA : Rn → Rm given by TA(~x) = A~x. Then we

have

Rn/(NulA) ∼= ColA.

Moreover, if ~b ∈ ColA, then A~x = ~b has a solution, say ~yp. Theorem 4.2.2 also gives the corre-

spondence

~yp +NulA←→ ~b.

This is Theorem 3.1.3 (2).

4.3 Matrix Representations

Definition. Let V be an n-dimensional vector space over a field F with an ordered basis B = {~v1, ~v2, . . . , ~vn}. Then ∀~v ∈ V, ∃!(c_1, . . . , c_n) ∈ F^n such that

~v = c_1~v_1 + c_2~v_2 + . . . + c_n~v_n, \quad \text{and} \quad [~v]_B = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix} ∈ F^n

is called the coordinate vector of ~v relative to the ordered basis B.


Example 4.3.1. Let B = {(1, 1, 0, 0), (1, 0, 1, 0), (1, 1, 1, 0), (0, 0, 0, 2i)} be an ordered basis for C4.

Find [(2,−16, 3,−i)]B.

We recall Theorems 1.4.12 and 1.4.13 as follows.

Theorem 4.3.1. Let V be an n-dimensional vector space over F and B a basis for V .

1. For ~v, ~w ∈ V and c ∈ F , we have [~v + ~w]B = [~v]B + [~w]B and [c~v]B = c[~v]B.

2. The map ~v 7→ [~v]B is an isomorphism from V onto Fn.

This also implies ∀~u,~v ∈ V, [~u]B = [~v]B ⇔ ~u = ~v.

Theorem 4.3.2. Let T : V → W be a linear transformation where dim V = n and dim W = m, and let B = {~v1, . . . , ~vn} and C = {~w1, . . . , ~wm} be ordered bases of V and W, respectively. Then for each j ∈ {1, . . . , n}, we have

T(~v_j) = d_{1j} ~w_1 + d_{2j} ~w_2 + · · · + d_{mj} ~w_m.

Hence, there exists a unique m × n matrix over the field F given by

[T]^C_B = \begin{bmatrix} [T(~v_1)]_C & [T(~v_2)]_C & \cdots & [T(~v_n)]_C \end{bmatrix} = \begin{bmatrix} d_{11} & d_{12} & \dots & d_{1n} \\ d_{21} & d_{22} & \dots & d_{2n} \\ \vdots & \vdots & & \vdots \\ d_{m1} & d_{m2} & \dots & d_{mn} \end{bmatrix}.

Furthermore, φ : T 7→ [T]^C_B is an isomorphism of L(V, W) onto M_{m,n}(F).

Definition. The matrix [T]^C_B is called the matrix for T relative to the ordered bases B and C. If V = W and B = C, then we write [T]_B for [T]^B_B. In addition, if T : F^n → F^n is a linear transformation and B is the standard basis for F^n, we call [T]_B the standard matrix for T.

Note that for ~v ∈ V,

~v = c_1~v_1 + c_2~v_2 + · · · + c_n~v_n,

so that

T(~v) = c_1 T(~v_1) + c_2 T(~v_2) + · · · + c_n T(~v_n).

Thus,

[T(~v)]_C = c_1 [T(~v_1)]_C + c_2 [T(~v_2)]_C + · · · + c_n [T(~v_n)]_C = \begin{bmatrix} [T(~v_1)]_C & [T(~v_2)]_C & \cdots & [T(~v_n)]_C \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}.

That is,

[T(~v)]_C = [T]^C_B [~v]_B for all ~v ∈ V.

We summarize this in the following diagram:

~v ──T──→ T(~v)
 │             │
 ↓             ↓
[~v]_B ──[T]^C_B──→ [T(~v)]_C = [T]^C_B [~v]_B


Example 4.3.2. 1. Let B = {1 + x, x} be an ordered basis for R_1[x] and C = {1 + x, x, x^2 − 1, x^3} an ordered basis for R_3[x]. Let T : R_1[x] → R_3[x] be the linear transformation defined by

T(a + bx) = x^2(a + bx) for all a, b ∈ R.

Find [T]^C_B.

2. Suppose T : M_{22}(R) → R^3 is a linear transformation with [T]^C_B = \begin{bmatrix} 1 & −1 & 0 & 0 \\ 0 & 1 & −1 & 0 \\ 0 & 0 & 1 & −1 \end{bmatrix}, where

B = \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right\} \quad \text{and} \quad C = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}.

Compute T\left( \begin{bmatrix} a & b \\ c & d \end{bmatrix} \right).
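A computational check of part 1 (Python with sympy; an illustration in which polynomials are identified with their coefficient columns relative to the monomial basis {1, x, x^2, x^3}):

from sympy import Matrix

# Columns of Cmat are the basis C = {1+x, x, x^2-1, x^3} as coefficient vectors:
Cmat = Matrix([[1, 0, -1, 0],
               [1, 1,  0, 0],
               [0, 0,  1, 0],
               [0, 0,  0, 1]])
# Images T(1+x) = x^2 + x^3 and T(x) = x^3 of the basis B = {1+x, x}, as columns:
TB = Matrix([[0, 0],
             [0, 0],
             [1, 0],
             [1, 1]])
# Solve Cmat * X = TB, so each column of X is the coordinate vector [T(v_j)]_C:
print(Cmat.solve(TB))    # [T]^C_B = Matrix([[1, 0], [-1, 0], [1, 0], [1, 1]])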

Theorem 4.3.3. Let V, W and Z be finite-dimensional vector spaces over a field F and let B, C and D be ordered bases of V, W and Z, respectively. If S : V → W and T : W → Z are linear transformations, then

[T ◦ S]^D_B = [T]^D_C [S]^C_B.

Moreover, if V = W = Z and B = C = D, then [T ◦ S]_B = [T]_B [S]_B.

Corollary 4.3.4. Let V be a finite-dimensional vector space, B an ordered basis and T : V → V a linear transformation. Then

1. T is an isomorphism ⇒ [T]_B is invertible and [T]_B^{-1} = [T^{-1}]_B.

2. [T]_B is invertible ⇒ T is an isomorphism and [T^{-1}]_B = [T]_B^{-1}.

Theorem 4.3.5. Let T : V → W be a linear transformation where dim V = n and dim W = m. If B and C are any ordered bases for V and W, respectively, then rank T = rank [T]^C_B.

Example 4.3.3. Define T : R2[x] → R3 by T (a + bx + cx2) = (a − 2b, 3c − 2a, 3c − 4b) for all

a, b, c ∈ R. Compute rankT .

4.4 Change of Bases

Definition. Let V be an n-dimensional vector space over a field F with an ordered basis B = {~v1, . . . , ~vn}. If B′ = {~v′1, . . . , ~v′n} is another ordered basis for V, we define the transition or change of coordinates matrix from B to B′ by P_{B→B′} = [I]^{B′}_B.

Theorem 4.4.1. Let B,B′ and B′′ be bases for V . Then

1. ∀~v ∈ V, [~v]B′ = PB→B′ [~v]B,

2. PB→B = In,

3. PB→B′ is invertible and (PB→B′)−1 = PB′→B,

4. PB→B′′ = PB′→B′′PB→B′ .

Example 4.4.1. Let B = \left\{ \begin{bmatrix} −1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ −1 \end{bmatrix} \right\} and B′ = \left\{ \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 2 \\ −1 \end{bmatrix} \right\} be ordered bases for R^2.

(a) Find P_{B→B′}.

(b) If ~v = \begin{bmatrix} 0 \\ −1 \end{bmatrix}, find [~v]_B and [~v]_{B′}.
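A sketch of this computation (Python with sympy; an illustration using the relation [~v]_{B′} = P_{B→B′} [~v]_B of Theorem 4.4.1 above):

from sympy import Matrix

B  = Matrix([[-1, 1], [0, -1]])    # columns are the vectors of B
Bp = Matrix([[1, 2], [1, -1]])     # columns are the vectors of B'
P  = Bp.solve(B)                   # P_{B->B'} solves Bp * P = B
v  = Matrix([0, -1])
vB = B.solve(v)                    # [v]_B = Matrix([[1], [1]])
print(P)                           # Matrix([[-1/3, -1/3], [-1/3, 2/3]])
print(P * vB)                      # [v]_{B'} = Matrix([[-2/3], [1/3]])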


Definition. A linear transformation from V to V is called a linear operator on V .

Theorem 4.4.2. Let B and B′ be two bases for a finite dimensional vector space V. If T : V → V is a linear operator, then

[T]_{B′} = [I]^{B′}_B [T]_B [I]^B_{B′} = (P_{B→B′}) [T]_B (P_{B→B′})^{-1}.

Example 4.4.2. Let T : R^3 → R^3 be the linear transformation with standard matrix

\begin{bmatrix} 2 & 1 & 0 \\ 6 & 1 & −1 \\ 0 & 0 & 1 \end{bmatrix}. \quad \text{Find } [T]_{B′} \text{ where } B′ = \left\{ \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ −3 \\ 0 \end{bmatrix}, \begin{bmatrix} −1 \\ 1 \\ −6 \end{bmatrix} \right\}.

From the above theorem, we have

det[T ]B = det[T ]B′ and rank[T ]B = rank[T ]B′

for any two bases B and B′ for V .

Definition. If T : V → V is a linear operator, we define the determinant of T by

detT = det[T ]B for some basis B for V .

Definition. For n × n matrices A and B, we say that A is similar to B, A ∼ B, if there exists

an invertible matrix P ∈Mn(F ) such that B = P−1AP .

Remarks. 1. ∼ is an equivalence relation on M_n(F).

2. If A ∼ B, then A^T ∼ B^T, A^k ∼ B^k for all k ∈ N, and A^{-1} ∼ B^{-1} (if the inverses exist).

3. [T]_B and [T]_{B′} are similar for any two bases B and B′ of V.

Definition. The trace of an n× n matrix A is the sum of the diagonal elements.

Theorem 4.4.3. Let A and B be similar matrices. Then

1. detA = detB,

2. rankA = rankB,

3. trA = trB.

Exercises for Chapter 4.

1. If T : V → W is an isomorphism and B is a basis for V, prove that T(B) is a basis for W.

2. Let T : V → V be a linear transformation. Suppose that there exists a ~v ∈ V such that T(T(~v)) ≠ ~0 and T(T(T(~v))) = ~0. Prove that {~v, T(~v), T(T(~v))} is linearly independent.

3. Let S, T ∈ L(V, W) and c ∈ F. Prove that:
(a) ker S ∩ ker T ⊆ ker(S + cT) (b) im(S + T) ⊆ im S + im T.

4. Let E be a linear transformation on a vector space V such that E ◦ E = E. Prove that the following statements hold.
(a) ∀~v ∈ V, ~v ∈ im E ⇔ E(~v) = ~v (b) ∀~v ∈ V, ~v − E(~v) ∈ ker E (c) V = ker E ⊕ im E.

5. Let f, g ∈ V^*. If ker f ⊆ ker g, prove that g = cf for some c ∈ F.

6. Let V be an n-dimensional vector space over F. If f, g ∈ V^* are linearly independent, find dim(ker f ∩ ker g).

7. If V and W are finite dimensional vector spaces which are isomorphic, prove that V^* ∼= W^*.

8. Let B = {(1, 0, −1), (1, 1, 1), (2, 2, 0)} be a basis for R^3. Find the dual basis of B.


9. Consider V = R_1[x]. Let f_1 : V → R and f_2 : V → R be defined by

f_1(p(x)) = \int_0^1 p(x) dx \quad \text{and} \quad f_2(p(x)) = \int_0^2 p(x) dx.

Clearly, f_1, f_2 ∈ V^*. Prove that {f_1, f_2} is a basis for V^* and find a basis of V such that {f_1, f_2} is its dual basis.

10. (a) Let W be a subspace of a finite dimensional vector space V. If B = {x_1, . . . , x_m} is a basis for W and {x_1, . . . , x_m, x_{m+1}, . . . , x_n} is a basis of V, show that {x_{m+1} + W, . . . , x_n + W} is a basis for V/W.
(b) Let H = Span{(1, 1, −1)}. Determine a basis for R^3/H.

11. Let W_1 and W_2 be two subspaces of a vector space V. Define T : W_1 + W_2 → W_2/(W_1 ∩ W_2) by T(~w_1 + ~w_2) = ~w_2 + (W_1 ∩ W_2) for all ~w_1 ∈ W_1 and ~w_2 ∈ W_2.
(a) Prove that T is well defined and is an onto linear transformation.
(b) Prove that ker T = W_1.
(c) Conclude by Theorem 4.2.2 that (W_1 + W_2)/W_1 ∼= W_2/(W_1 ∩ W_2). This is a generalization of Theorem 1.4.8.

12. If W_1 and W_2 are subspaces of V with W_1 ⊆ W_2, define T : V/W_1 → V/W_2 by T(~v + W_1) = ~v + W_2 for all ~v ∈ V.
(a) Prove that T is well defined and is an onto linear transformation.
(b) Prove that ker T = W_2/W_1.
(c) Conclude by Theorem 4.2.2 that (V/W_1)/(W_2/W_1) ∼= V/W_2.

13. Let U , V and W be finite dimensional vector spaces over a field F . Let S : U → V and T : V → Wbe linear transformations such that T ◦ S is the zero map. Show that

dim(W/ imT )− dim(kerT/ imS) + nullityS = dimW − dimV + dimU.

14. Let V and W be finite dimensional vector spaces over a field F . Let U be a subspace of V andT : V →W a linear transformation.(a) Prove that dim(V/U) ≥ dim(T (V )/T (U)).(b) If T is 1-1, prove also that the inequality in (a) becomes an equality.

15. For S ⊆ V, let A(S) = {f ∈ V^* : f(~v) = 0 for all ~v ∈ S}. It is called the annihilator of S. Prove that
(a) A(S) is a subspace of V^*;
(b) if S_1 ⊆ S_2, then A(S_1) ⊇ A(S_2);
(c) if V is finite dimensional and W is a subspace of V, then V^*/A(W) ∼= W^*.

16. Prove that ∀S, T ∈ L(V, V), S ◦ T ∈ L(V, V).

17. Let T : V → W be a linear transformation where dim V = dim W = n. Prove that the following statements are equivalent.
(i) T is an isomorphism.
(ii) [T]^C_B is invertible for all ordered bases B and C of V and W, respectively.
(iii) [T]^C_B is invertible for some pair of ordered bases B and C of V and W, respectively.

18. Suppose the linear transformation T : R^2 → R^2 is given by

T(1, 1) = (2, 3) and T(−1, 1) = (4, 5).

Find the standard matrix for T.

19. Let B = {1, x, x^2} be an ordered basis for R_2[x] and C = {(1, 0), (1, −1)} an ordered basis for R^2. Find [T]^C_B if T : R_2[x] → R^2 is the linear transformation defined by T(a + bx + cx^2) = (a + c, 2b) for all a, b, c ∈ R.

20. Let B = {sin t, cos t} and B′ = {sin t + 2 cos t, sin t − cos t} be ordered bases for H = Span B = Span B′, a subspace of C^1(−∞, ∞). Let D : H → H be defined by D(f) = f′′ for all f ∈ H. Find [D]_B and [D]_{B′}.

21. If T : V → W is an isomorphism, prove that ([T]^C_B)^{-1} = [T^{-1}]^B_C for all ordered bases B and C of V and W, respectively.

22. Let T : R_2[x] → R^3 be the linear transformation defined by T(a + bx + cx^2) = (a − c, b, a − 2c) for all a, b, c ∈ R. If B = {1, x, x^2} and C = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}, find [T]^C_B and the formula for T^{-1}.

23. Let T : R_n[x] → R_n[x] be the linear transformation defined by

T(p(x)) = p(x) + x p′(x),

where p′(x) is the derivative of p(x). Show that T is an isomorphism by finding [T]_B where B = {1, x, x^2, . . . , x^n}.


24. Let α be a real number. Define a linear transformation T_α : M_2(R) → M_2(R) by

T_α(A) = A + α A^T for all A ∈ M_2(R).

If B = \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & −1 \\ 1 & 0 \end{bmatrix} \right\}, find [T_α]_B and conclude that T_α is invertible if α^2 ≠ 1.

25. Let T : R^2 → R^2 be the linear transformation defined by T(x, y) = (−y, x) for all x, y ∈ R, and let A be its standard matrix. Prove that
(a) ∀c ∈ R, (A − cI_2) is invertible;
(b) if B is an ordered basis for R^2 and [T]_B = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, then bc ≠ 0.


5 | Structure Theorems

5.1 Eigenvalues and Eigenvectors

We first recall some numerical examples.

Example 5.1.1. Diagonalize A = \begin{bmatrix} 1 & 0 \\ −1 & 2 \end{bmatrix}. That is, find an invertible matrix P and a diagonal matrix D (if any) such that A = PDP^{-1}.

That is, find an invertible matrix P and a diagonal matrix D (if any) such that A = PDP−1.

Example 5.1.2. Let A = \begin{bmatrix} 3 & 1 & −1 \\ 2 & 2 & −1 \\ 2 & 2 & 0 \end{bmatrix} and let T(~x) = A~x be the corresponding matrix transformation on R^3. Find a basis B (if any) such that [T]_B is a diagonal matrix. Given: det(A − λI) = −(λ − 1)(λ − 2)^2.

Definition. Let V be a vector space over a field F and T ∈ L(V, V). A scalar λ ∈ F is called an eigenvalue or characteristic value of T if there exists a nonzero vector ~v ∈ V such that T(~v) = λ~v. If λ is an eigenvalue of T, then a nonzero vector ~v ∈ V such that T(~v) = λ~v is called an eigenvector or characteristic vector of T associated with the characteristic value λ. We have that

E_λ(T) = {~v ∈ V : T(~v) = λ~v} = {~v ∈ V : (T − λI)(~v) = ~0_V} = ker(T − λI)

is a subspace of V, called the eigenspace or characteristic space of T associated with λ.

Remark. λ is an eigenvalue of T ⇔ ker(T − λI) ≠ {~0_V} ⇔ T − λI is not 1-1.

For matrix theory, we restrict ourselves to the case where V is n-dimensional. Then L(V, V) ∼= M_n(F) via T 7→ [T]_B for a fixed basis B of V. Hence, we may work entirely in M_n(F).

Definition. Let A ∈ M_n(F). The matrix transformation T_A : F^n → F^n is given by

T_A(~x) = A~x

for all ~x ∈ F^n. An eigenvalue of T_A is called an eigenvalue of A and an eigenspace of T_A is called an eigenspace of A. In other words,

E_λ(A) = {~x ∈ F^n : A~x = λ~x} = {~x ∈ F^n : (A − λI_n)~x = ~0_n} = Nul(A − λI_n).

Then

λ is an eigenvalue of A ⇔ ker(T_A − λI) ≠ {~0_n}
⇔ Nul(A − λI_n) ≠ {~0_n}
⇔ A − λI_n is not invertible
⇔ det(A − λI_n) = 0.


Definition. The polynomial cA(x) = det(xIn−A) is called the characteristic polynomial of A.

Thus we have proved

Theorem 5.1.1. For A ∈Mn(F ), λ is an eigenvalue of A⇔ det(A− λIn) = 0, i.e., λ is a root of

the characteristic polynomial of A.

Since an eigenvalue of an n × n matrix A is a root of c_A(x) = det(xI_n − A), which has degree n, and a polynomial of degree n over a field F has at most n roots in F, A has at most n eigenvalues.

Theorem 5.1.2. An n× n matrix has at most n eigenvalues.

Remark. If A is similar to B, say B = P^{-1}AP, then det A = det B and

det(B − λI_n) = det(P^{-1}AP − λP^{-1}I_nP) = det(P^{-1}(A − λI_n)P) = det(A − λI_n).

Therefore, we have the following result.

Theorem 5.1.3. If A and B are similar n×n matrices, then A and B have the same characteristic

polynomial and eigenvalues (with same multiplicities).

Example 5.1.3. The matrices

A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \quad \text{and} \quad I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}

have the same determinant, trace, characteristic polynomial and eigenvalues, but they are not similar because PIP^{-1} = I for any invertible matrix P.

Definition. A diagonal matrix D is a square matrix such that all the entries off the main diagonal are zero, that is, D is of the form

D = \begin{bmatrix} λ_1 & 0 & \dots & 0 \\ 0 & λ_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & λ_n \end{bmatrix} = diag(λ_1, λ_2, . . . , λ_n),

where λ_1, λ_2, . . . , λ_n ∈ F (not necessarily distinct).

Definition. An n× n matrix A over F is said to be diagonalizable if A is similar to a diagonal

matrix, that is, there are an invertible matrix P and a diagonal matrix D such that P−1AP = D.

In this case, we say that P diagonalizes A.

Definition. Let V be a finite dimensional vector space and T ∈ L(V, V ) a linear operator. We

say that T is diagonalizable if there exists a basis B for V such that [T ]B is a diagonal matrix.

Theorem 5.1.4. Let A be an n × n matrix.

1. A is diagonalizable ⇔ A has eigenvectors ~v_1, . . . , ~v_n such that P = \begin{bmatrix} ~v_1 & \cdots & ~v_n \end{bmatrix} is invertible.

2. When this is the case, P^{-1}AP = diag(λ_1, λ_2, . . . , λ_n), where, for each i, λ_i is the eigenvalue of A corresponding to ~v_i.

Page 47: 1 | Vector Spacespioneer.netserv.chula.ac.th/~myotsana/MATH336LinearII.pdf · 2015-04-10 · 1 | Vector Spaces 1.1 The Algebra of Matrices over a Field Definition. By a field F,

5.1. Eigenvalues and Eigenvectors 47

Proof. Let P = \begin{bmatrix} ~v_1 & ~v_2 & \dots & ~v_n \end{bmatrix} and D = diag(λ_1, λ_2, . . . , λ_n). Then AP = PD becomes

A \begin{bmatrix} ~v_1 & ~v_2 & \cdots & ~v_n \end{bmatrix} = \begin{bmatrix} ~v_1 & ~v_2 & \cdots & ~v_n \end{bmatrix} \begin{bmatrix} λ_1 & 0 & \cdots & 0 \\ 0 & λ_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & λ_n \end{bmatrix},

that is,

\begin{bmatrix} A~v_1 & A~v_2 & \cdots & A~v_n \end{bmatrix} = \begin{bmatrix} λ_1~v_1 & λ_2~v_2 & \cdots & λ_n~v_n \end{bmatrix}.

Comparing columns shows that A~v_i = λ_i~v_i for each i, so

P^{-1}AP = D ⇔ P is invertible and A~v_i = λ_i~v_i for all i ∈ {1, . . . , n}.

The results follow.

Theorem 5.1.5. Let ~v1, . . . , ~vm be eigenvectors corresponding to distinct eigenvalues λ1, . . . , λm of

an n× n matrix A. Then {~v1, . . . , ~vm} is linearly independent.

Proof. We use induction on the number m of eigenvectors.

If m = 1, then {~v_1} is linearly independent because ~v_1 ≠ ~0.

Let k ≥ 1 and suppose the theorem is true for any k eigenvectors. Let ~v_1, . . . , ~v_{k+1} be eigenvectors corresponding to distinct eigenvalues λ_1, . . . , λ_{k+1} of A. Let c_1, . . . , c_{k+1} ∈ F be such that

c_1~v_1 + c_2~v_2 + · · · + c_{k+1}~v_{k+1} = ~0. (5.1.1)

Since A~v_i = λ_i~v_i for all i, multiplying both sides by A gives

c_1λ_1~v_1 + c_2λ_2~v_2 + · · · + c_{k+1}λ_{k+1}~v_{k+1} = ~0. (5.1.2)

Subtracting λ_1 times (5.1.1) from (5.1.2), we have

c_2(λ_2 − λ_1)~v_2 + · · · + c_{k+1}(λ_{k+1} − λ_1)~v_{k+1} = ~0.

Since ~v_2, . . . , ~v_{k+1} are k eigenvectors corresponding to distinct eigenvalues, they are linearly independent by the induction hypothesis, so

c_2(λ_2 − λ_1) = · · · = c_{k+1}(λ_{k+1} − λ_1) = 0.

However, λ_1, . . . , λ_{k+1} are distinct, hence we get

c_2 = · · · = c_{k+1} = 0.

This implies c_1~v_1 = ~0, so c_1 = 0 because ~v_1 ≠ ~0. Therefore, {~v_1, . . . , ~v_{k+1}} is linearly independent.

Corollary 5.1.6. If A is an n × n matrix with n distinct eigenvalues, then A is diagonalizable.

Proof. Let ~v_1, . . . , ~v_n be eigenvectors corresponding to the n distinct eigenvalues λ_1, . . . , λ_n of A. Then they are linearly independent, so P = \begin{bmatrix} ~v_1 & \dots & ~v_n \end{bmatrix} is invertible and P^{-1}AP = diag(λ_1, λ_2, . . . , λ_n).
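A sketch of diagonalization for the matrix of Example 5.1.1 (Python with sympy; an illustration only):

from sympy import Matrix

A = Matrix([[1, 0], [-1, 2]])
P, D = A.diagonalize()          # columns of P are eigenvectors, D is diagonal
print(P)                        # Matrix([[1, 0], [1, 1]])
print(D)                        # Matrix([[1, 0], [0, 2]])
print(P * D * P.inv() == A)     # True: A = P D P^{-1}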


Lemma 5.1.7. Let {~v_1, . . . , ~v_k} be a linearly independent set of eigenvectors of an n × n matrix A, extend it to a basis of F^n, and let

P = \begin{bmatrix} ~v_1 & \dots & ~v_k & ~v_{k+1} & \dots & ~v_n \end{bmatrix},

which is invertible. If λ_1, . . . , λ_k are the (not necessarily distinct) eigenvalues of A corresponding to ~v_1, . . . , ~v_k, respectively, then P^{-1}AP has block form

P^{-1}AP = \begin{bmatrix} diag(λ_1, . . . , λ_k) & B \\ 0 & C \end{bmatrix},

where B has size k × (n − k) and C has size (n − k) × (n − k).

Definition. An eigenvalue λ of a square matrix A is said to have multiplicity m if it occurs m times as a root of the characteristic polynomial c_A(x). In other words,

c_A(x) = (x − λ)^m g(x)

for some polynomial g(x) such that g(λ) ≠ 0.

Lemma 5.1.8. Let λ be an eigenvalue of multiplicity m of a square matrix A. Then nullity(A − λI) = dim E_λ(A) ≤ m.

Proof. Assume that dim E_λ(A) = d with basis {~v_1, . . . , ~v_d}. By Lemma 5.1.7, there exists an invertible n × n matrix P such that

P^{-1}AP = \begin{bmatrix} λI_d & B \\ 0 & C \end{bmatrix} = M,

where I_d is the d × d identity matrix. Since M and A are similar,

c_A(x) = c_M(x) = det(xI_n − M) = \begin{vmatrix} (x − λ)I_d & −B \\ 0 & xI_{n−d} − C \end{vmatrix} = (det((x − λ)I_d))(det(xI_{n−d} − C)) = (x − λ)^d c_C(x).

Hence, d ≤ m because m is the highest power of (x − λ) in c_A(x).

Theorem 5.1.9. Let λ_1, . . . , λ_k be all the distinct eigenvalues of an n × n matrix A. For each i ∈ {1, . . . , k}, let m_i denote the multiplicity of λ_i and write d_i = nullity(A − λ_iI_n). Then 1 ≤ d_i ≤ m_i for all i, n = m_1 + · · · + m_k and

c_A(x) = (x − λ_1)^{m_1} · · · (x − λ_k)^{m_k}.

Moreover, the following statements are equivalent.

(i) A is diagonalizable.

(ii) d_i = nullity(A − λ_iI_n) = dim E_{λ_i}(A) = m_i for all i.

(iii) n = d_1 + · · · + d_k.


5.2 Annihilating Polynomials

Let A be an n × n matrix over a field F. Since dim M_n(F) = n^2, the set {I_n, A, A^2, . . . , A^{n^2}} is linearly dependent. Then there exist c_0, c_1, . . . , c_{n^2} in F, not all zero, such that

c_0I_n + c_1A + c_2A^2 + · · · + c_{n^2}A^{n^2} = 0.

Let f(x) be the polynomial over F defined by f(x) = c_0 + c_1x + c_2x^2 + · · · + c_{n^2}x^{n^2}. Then f(x) ≠ 0 and f(A) = 0.

Let g(x) = α^{-1}f(x) where α is the leading coefficient of f(x). Then g(x) is monic (leading coefficient = 1) and g(A) = 0. Thus there exists a polynomial p(x) over F such that

(a) p(A) = 0,
(b) p(x) is monic, and
(c) for every nonzero polynomial q(x), q(A) = 0 ⇒ deg p(x) ≤ deg q(x).

Such a p(x) is unique (proof!) and it is called the minimal polynomial of A. Note that if k(x) ∈ F[x] and k(A) = 0, then p(x) | k(x).

Remark. If A and B in Mn(F ) are similar, then they have the same minimal polynomial.

Recall that the characteristic polynomial of A is given by

cA(x) = det(xIn −A).

Theorem 5.2.1. The characteristic polynomial and minimal polynomial for A have the same roots.

Remark. Although the minimal polynomial and the characteristic polynomial have the same

roots, they may not be the same.

Example 5.2.1. The characteristic polynomial for A = \begin{bmatrix} 5 & −6 & −6 \\ −1 & 4 & 2 \\ 3 & −6 & −4 \end{bmatrix} is (x − 1)(x − 2)^2 while

(A − I)(A − 2I) = 0,

so the minimal polynomial of A is (x − 1)(x − 2). Notice that A is diagonalizable. In general, we have:

Theorem 5.2.2. If an n× n matrix A is diagonalizable with distinct eigenvalues λ1, . . . , λk, then

(x− λ1) . . . (x− λk) is the minimal polynomial for A.
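One can verify the annihilation claimed in Example 5.2.1 directly (Python with sympy; an illustration only):

from sympy import Matrix, eye, zeros

A = Matrix([[5, -6, -6], [-1, 4, 2], [3, -6, -4]])
I = eye(3)
print((A - I) * (A - 2*I) == zeros(3, 3))    # True: (x-1)(x-2) annihilates A
print(A.charpoly().as_expr())                # lambda**3 - 5*lambda**2 + 8*lambda - 4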

Theorem 5.2.3. [Cayley-Hamilton] If f(x) is the characteristic polynomial of a matrix A, then

f(A) = 0.

Proof. Write f(x) = det(xI_n − A) = x^n + a_{n−1}x^{n−1} + · · · + a_1x + a_0. Let B = xI_n − A. Since each entry of adj B is, up to sign, the determinant of an (n − 1) × (n − 1) submatrix of B, and each entry of B is a polynomial in x of degree at most 1,

C_{ij}(B) = b^{(n−1)}_{ij} x^{n−1} + b^{(n−2)}_{ij} x^{n−2} + · · · + b^{(1)}_{ij} x + b^{(0)}_{ij}

for all i, j ∈ {1, . . . , n}. Thus

adj B = [C_{ij}(B)]^T_{n×n} = [b^{(n−1)}_{ij} x^{n−1} + b^{(n−2)}_{ij} x^{n−2} + · · · + b^{(1)}_{ij} x + b^{(0)}_{ij}]^T_{n×n} = B_{n−1}x^{n−1} + B_{n−2}x^{n−2} + · · · + B_1x + B_0


where B_i ∈ M_n(F). Recall that

(det B)I_n = B(adj B) = B(B_{n−1}x^{n−1} + B_{n−2}x^{n−2} + · · · + B_1x + B_0).

Then

(x^n + a_{n−1}x^{n−1} + · · · + a_1x + a_0)I_n = (xI_n − A)(B_{n−1}x^{n−1} + B_{n−2}x^{n−2} + · · · + B_1x + B_0)
= B_{n−1}x^n + B_{n−2}x^{n−1} + · · · + B_1x^2 + B_0x − AB_{n−1}x^{n−1} − AB_{n−2}x^{n−2} − · · · − AB_1x − AB_0.

Comparing coefficients gives

I = B_{n−1}
a_{n−1}I = B_{n−2} − AB_{n−1}
a_{n−2}I = B_{n−3} − AB_{n−2}
...
a_1I = B_0 − AB_1
a_0I = −AB_0.

Multiplying these equations by A^n, A^{n−1}, . . . , A, I, respectively, and adding, we obtain

A^n + a_{n−1}A^{n−1} + · · · + a_1A + a_0I = A^nB_{n−1} + A^{n−1}(B_{n−2} − AB_{n−1}) + A^{n−2}(B_{n−3} − AB_{n−2}) + · · · + A(B_0 − AB_1) − AB_0 = 0,

as desired.

Example 5.2.2. Determine the minimal polynomial of A = \begin{bmatrix} 3 & 1 & −1 \\ 2 & 2 & −1 \\ 2 & 2 & 0 \end{bmatrix}.

Some consequences of the Cayley-Hamilton Theorem are as follows.

Corollary 5.2.4. The minimal polynomial of A divides its characteristic polynomial.

Recall that

0 is an eigenvalue of A⇔ 0 = det(A− 0I) = detA⇔ A is not invertible.

Corollary 5.2.5. If f(x) = a_0 + a_1x + · · · + a_{n−1}x^{n−1} + x^n is the characteristic polynomial of an invertible matrix A, then a_0 ≠ 0 and

A^{-1} = −\frac{1}{a_0}(a_1I + a_2A + · · · + A^{n−1}).
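A sketch verifying this formula (Python with sympy; an illustration only, using the matrix of Example 5.2.2):

from sympy import Matrix, eye

A = Matrix([[3, 1, -1], [2, 2, -1], [2, 2, 0]])
coeffs = A.charpoly().all_coeffs()[::-1]      # [a_0, a_1, ..., a_{n-1}, 1]
a0, a1, a2 = coeffs[0], coeffs[1], coeffs[2]
Ainv = -(a1 * eye(3) + a2 * A + A**2) / a0
print(Ainv == A.inv())                        # True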


5.3 Symmetric and Hermitian Matrices

Definition. Let F = R or C and let A = [a_ij] be a matrix over F. The matrix A is said to be symmetric if A = A^T. We define A^H = [\overline{a_{ij}}]^T, the conjugate transpose of A, read “A Hermitian.” We say that A is Hermitian or self-adjoint if A = A^H.

Notice that symmetric and Hermitian matrices are square matrices, and the two notions coincide if F = R.

Example 5.3.1. Let A = \begin{bmatrix} 3 & 1 \\ 1 & −2 \end{bmatrix} and B = \begin{bmatrix} −1 & 2 + 3i \\ 2 − 3i & 2 \end{bmatrix}.

Then A is symmetric and both of them are Hermitian.

Theorem 5.3.1. If A is a Hermitian matrix, then

(1) ~x^H A ~x is real for all ~x ∈ C^n, and (2) the eigenvalues of A are real.

That is, if A is Hermitian, then all roots of c_A(x) are real.

Example 5.3.2. For vectors ~x and ~y in C^n, we define (~x, ~y) = ~x^H ~y. Then (·, ·) is an inner product on C^n, so that

‖~x‖^2 = ~x^H ~x = |x_1|^2 + · · · + |x_n|^2 for all ~x = (x_1, . . . , x_n) ∈ C^n.

Theorem 5.3.2. Two eigenvectors corresponding to different eigenvalues of a Hermitian matrix

are orthogonal to one another.

Definition. For F = R or C and U ∈ M_n(F), U is called unitary if U^H U = I_n = U U^H. If F = R, a unitary matrix satisfies U^T U = I_n = U U^T and may be called an orthonormal (or orthogonal) matrix.

Theorem 5.3.3. Let U ∈ M_n(C) be a unitary matrix. For the inner product defined in Example 5.3.2, we have

    (U~x, U~y) = (~x, ~y) for all ~x, ~y ∈ C^n, so ‖U~x‖ = ‖~x‖ for all ~x ∈ C^n.

Corollary 5.3.4. If U = [~u_1 ~u_2 . . . ~u_n] ∈ M_n(C) is a unitary matrix, then for all j, k ∈ {1, 2, . . . , n} we have

    (~u_j, ~u_k) = 1 if j = k, and 0 if j ≠ k.

Remark. The converse of Corollary 5.3.4 is also true and its proof is left as an exercise.

Example 5.3.3. The matrices

    U_1 = (1/√2) [ 1  i ]
                 [ i  1 ]

and

    U_2 = [ cos t  −sin t ]
          [ sin t   cos t ]

are unitary matrices.
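Such claims are easy to confirm numerically; here is a one-line check, assuming NumPy is available, that U_1 satisfies U^H U = I:

    import numpy as np

    U1 = np.array([[1, 1j],
                   [1j, 1]]) / np.sqrt(2)
    print(np.allclose(U1.conj().T @ U1, np.eye(2)))   # True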

Theorem 5.3.5. Every eigenvalue of a unitary matrix U has absolute value one, i.e., |λ| = 1. Moreover, eigenvectors corresponding to different eigenvalues are orthogonal to each other.

We are going to explore some very remarkable facts about Hermitian and real symmetric matrices. These matrices are diagonalizable, and moreover the diagonalization can be accomplished by a unitary matrix P. This means that P^{−1}AP = P^H AP is diagonal. In this situation, we say that the matrix A is unitarily or orthogonally diagonalizable. Orthogonal and unitary diagonalizations are particularly attractive since inverting P is essentially free and error-free: P^H = P^{−1}.


Theorem 5.3.6. If a real matrix A is orthogonally diagonalizable with an orthonormal matrix P, that is, P^T AP is a diagonal matrix, then A is symmetric.

Remark. The converse of Theorem 5.3.6 is also true. In addition, we prove a stronger result.

Theorem 5.3.7. [Principal Axes Theorem] Every Hermitian matrix is unitarily diagonalizable.

In addition, every real symmetric matrix is orthogonally diagonalizable.

Proof. We shall show this statement by induction on n. It is clear for n = 1.

Assume that n > 1 and every (n − 1) × (n − 1) Hermitian matrix is unitarily diagonalizable. Consider an n × n Hermitian matrix A. Let λ_1 be a real eigenvalue of A with unit eigenvector ~v, so A~v = λ_1~v and ‖~v‖ = 1. Let W = {~v}^⊥ with orthonormal basis {~z_1, . . . , ~z_{n−1}}. Thus R = [~v ~z_1 . . . ~z_{n−1}] is an n × n unitary matrix. Observe that

    B = R^H A R = [ λ_1  0  · · ·  0 ]
                  [ 0               ]
                  [ ⋮        C      ]
                  [ 0               ]

where C is of size (n − 1) × (n − 1), and B^H = (R^H A R)^H = B. Hence B is Hermitian and so is C.

Since C is an (n − 1) × (n − 1) Hermitian matrix, by the induction hypothesis there is an (n − 1) × (n − 1) unitary matrix Q such that Q^H C Q = diag(λ_2, . . . , λ_n). Let

    P = [ 1  0  · · ·  0 ]
        [ 0               ]
        [ ⋮        Q      ]
        [ 0               ]   (n × n).

Then P is an n × n unitary matrix and P^H B P = P^H R^H A R P = (RP)^H A (RP). Choose U = RP. Then

    U^H = (RP)^H = P^H R^H = P^{−1} R^{−1} = (RP)^{−1} = U^{−1}

and

    U^H A U = P^H B P = diag(λ_1, λ_2, . . . , λ_n).

Hence, A is unitarily diagonalizable.

Example 5.3.4. Diagonalize the Hermitian matrix

    A = [ 1       1 − i ]
        [ 1 + i   0     ].

Example 5.3.5. Orthogonally diagonalize the symmetric matrix

    A = [ 1  2  0 ]
        [ 2  4  0 ]
        [ 0  0  5 ].

(Given λ = 0, 5, 5.)
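For a numerical version of Example 5.3.5 (assuming NumPy is available), np.linalg.eigh accepts symmetric or Hermitian input and returns the eigenvalues together with an orthonormal matrix P of eigenvectors, so P^T A P is diagonal:

    import numpy as np

    A = np.array([[1.0, 2.0, 0.0],
                  [2.0, 4.0, 0.0],
                  [0.0, 0.0, 5.0]])

    eigenvalues, P = np.linalg.eigh(A)                      # columns of P are orthonormal
    print(np.round(eigenvalues, 10))                        # [0. 5. 5.]
    print(np.allclose(P.T @ A @ P, np.diag(eigenvalues)))   # True
    print(np.allclose(P.T @ P, np.eye(3)))                  # P^T P = I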

Definition. A square matrix A is normal if AHA = AAH .

Clearly, every Hermitian matrix is normal.


Theorem 5.3.8. A matrix is unitarily diagonalizable if and only if it is normal.

Proof. It is a consequence of the Schur Triangularization Theorem, which is beyond the scope of this course.

Real versus Complex

    R^n                                           C^n
    (x_1, . . . , x_n) ∈ R^n                      (x_1, . . . , x_n) ∈ C^n
    length: ‖~x‖^2 = x_1^2 + · · · + x_n^2        ‖~x‖^2 = |x_1|^2 + · · · + |x_n|^2
    transpose: (A^T)_ij = A_ji                    conjugate transpose: (A^H)_ij = Ā_ji
    (AB)^T = B^T A^T                              (AB)^H = B^H A^H
    ~x · ~y = ~x^T ~y = x_1y_1 + · · · + x_ny_n   ~x · ~y = ~x^H ~y = x̄_1y_1 + · · · + x̄_ny_n
    orthogonality: ~x^T ~y = 0                    ~x^H ~y = 0
    orthonormal: P^T P = I_n = P P^T              unitary: U^H U = I_n = U U^H
    symmetric matrix: A^T = A                     Hermitian matrix: A^H = A
    A = PDP^{−1} = PDP^T (real D)                 A = UDU^{−1} = UDU^H (real D)
    orthogonally diagonalizable                   unitarily diagonalizable

5.4 Jordan Forms

Theorem 5.1.9 gives necessary and sufficient conditions for an n× n matrix to be diagonalizable,

namely that it should have n independent eigenvectors. We have also seen square matrices which

are not diagonalizable. In this section, we discuss the so-called Jordan canonical form, a form

of matrix to which every square matrix is similar.

Definition. Let A be an n × n matrix. Let λ be an eigenvalue of A with nullity(A − λI_n) = ℓ. Assume that λ is of multiplicity m. Then 1 ≤ ℓ ≤ m.

If m = 1, then ℓ = m = 1.

If m > 1 and ℓ < m, then λ is said to be defective and the number m − ℓ > 0 of missing eigenvector(s) is called the defect of λ.

Note that if A has a defective eigenvalue, then A is not diagonalizable.

Definition. The generalized eigenspace G_λ corresponding to an eigenvalue λ of A consists of all vectors ~v such that (A − λI)^k ~v = ~0 for some k ∈ N, that is,

    G_λ(A) = {~v ∈ F^n : (A − λI)^k ~v = ~0 for some k ∈ N} = ⋃_{k∈N} Nul (A − λI)^k.

Definition. A length r chain of generalized eigenvectors based on the eigenvector ~v for λ is a set {~v = ~v_1, ~v_2, . . . , ~v_r} of r linearly independent generalized eigenvectors such that

    (A − λI)~v_r = ~v_{r−1},
    (A − λI)~v_{r−1} = ~v_{r−2},
    ...
    (A − λI)~v_2 = ~v_1.

Since ~v_1 is an eigenvector, (A − λI)~v_1 = ~0. It follows that

    (A − λI)^r ~v_r = ~0.


We may denote the action of the matrix A − λI on the string of vectors by

    ~v_r −→ ~v_{r−1} −→ · · · −→ ~v_2 −→ ~v_1 −→ ~0.

Now let W be the subspace of G_λ spanned by {~v_1, . . . , ~v_r}. Any vector ~x in W has a representation of the form

    ~x = c_1~v_1 + c_2~v_2 + · · · + c_r~v_r

and

    A~x = c_1(A~v_1) + c_2(A~v_2) + · · · + c_r(A~v_r)
        = c_1(λ~v_1) + c_2(λ~v_2 + ~v_1) + · · · + c_r(λ~v_r + ~v_{r−1})
        = (λc_1 + c_2)~v_1 + · · · + (λc_{r−1} + c_r)~v_{r−1} + λc_r~v_r.

Thus A~x is also in W. If B = {~v_1, . . . , ~v_r} is a basis for W, then

    [A~x]_B = [ λc_1 + c_2, λc_2 + c_3, . . . , λc_{r−1} + c_r, λc_r ]^T = J [~x]_B

where

    J = J(λ; r) = [ λ  1           ]
                  [    λ  1        ]
                  [       ⋱  ⋱     ]
                  [          λ  1  ]
                  [             λ  ]   (r × r, blank entries zero)

is called the Jordan block of size r corresponding to λ.

Example 5.4.1. Let

    A = [ 1   1 ]
        [ −1  3 ].

Find the generalized eigenspaces of A.

Example 5.4.2. Let

    A_1 = [ 0   1   2  ]            A_2 = [ −1  1   0  ]
          [ −5  −3  −7 ]    and           [ 0   −1  0  ]
          [ 1   0   0  ]                  [ 0   1   −1 ].

Then A_1 and A_2 have the same characteristic polynomial (x + 1)^3. Find
(1) the minimal polynomials of A_1 and A_2, and
(2) the generalized eigenspaces of A_1 and A_2.
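A sketch of part (1), assuming SymPy is available: since the characteristic polynomial is (x + 1)^3, the minimal polynomial is (x + 1)^e where e is the smallest exponent with nullity((A + I)^e) = 3, so it suffices to watch how the nullity grows.

    from sympy import Matrix, eye

    A1 = Matrix([[0, 1, 2], [-5, -3, -7], [1, 0, 0]])
    A2 = Matrix([[-1, 1, 0], [0, -1, 0], [0, 1, -1]])

    for name, A in (("A1", A1), ("A2", A2)):
        B = A + eye(3)                    # A - (-1)I
        for k in (1, 2, 3):
            nullity = 3 - (B**k).rank()
            print(name, k, nullity)
        # the smallest k with nullity 3 is the exponent of (x + 1)
        # in the minimal polynomial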

Theorem 5.4.1. If an n × n matrix A has t linearly independent eigenvectors, then it is similar to a matrix J, that is, in Jordan form, with t square blocks on the diagonal:

    Jordan form:  J = M^{−1}AM = diag(J_1, . . . , J_t).

Each block has one eigenvector, one eigenvalue, and 1's just above the diagonal:

    Jordan block:  J_i = J(λ_i; r_i) = [ λ_i  1            ]
                                       [      λ_i  ⋱       ]
                                       [           ⋱   1   ]
                                       [              λ_i  ]   (r_i × r_i).

The same λ_i will appear in several blocks if it has several independent eigenvectors. Moreover, M consists of n generalized eigenvectors which are linearly independent.


Remark. Theorem 5.4.1 says that every n × n matrix A has n linearly independent generalized eigenvectors. These n generalized eigenvectors may be arranged in chains, with the sum of the lengths of the chains associated with a given eigenvalue λ equal to the multiplicity of λ. But the structure of these chains depends on the defect of λ, and can be quite complicated. For instance, an eigenvalue of multiplicity four can correspond to

• Four length 1 chains (defect 0);

• Two length 1 chains and a length 2 chain (defect 1);

• Two length 2 chains (defect 2);

• A length 1 chain and a length 3 chain (defect 2);

• A length 4 chain (defect 3).

Observe that, in each of these cases, the length of the longest chain is at most d + 1, where d is the defect of the eigenvalue. Consequently, once we have found all the ordinary eigenvectors corresponding to a multiple eigenvalue λ, and therefore know the defect d of λ, we can begin with the equation

    (A − λI)^{d+1} ~u = ~0                                (5.4.1)

to start building the chains of generalized eigenvectors corresponding to λ.

Algorithm: Begin with a nonzero solution ~u_1 of Eq. (5.4.1) and successively multiply by the matrix A − λI until the zero vector is obtained. If

    (A − λI)~u_1 = ~u_2 ≠ ~0
    (A − λI)~u_2 = ~u_3 ≠ ~0
    ...
    (A − λI)~u_{k−1} = ~u_k ≠ ~0

but (A − λI)~u_k = ~0, then we get the string of k generalized eigenvectors

    ~u_1 −→ ~u_2 −→ · · · −→ ~u_k.

Example 5.4.3. Let

    A = [ 0   0   1   0  ]
        [ 0   0   0   1  ]
        [ −2  2   −3  1  ]
        [ 2   −2  1   −3 ]

with characteristic polynomial x(x + 2)^3. Find the chains of generalized eigenvectors corresponding to each eigenvalue and the Jordan form of A.
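The algorithm above can be scripted. Here is a sketch, assuming SymPy is available, applied to the matrix of Example 5.4.3 with λ = −2 (multiplicity 3): pick ~u_1 in Nul(A − λI)^{d+1} that is not already in Nul(A − λI)^d, then multiply by A − λI until the zero vector appears.

    from sympy import Matrix, eye

    A = Matrix([[0, 0, 1, 0],
                [0, 0, 0, 1],
                [-2, 2, -3, 1],
                [2, -2, 1, -3]])
    lam = -2
    B = A - lam * eye(4)

    ell = 4 - B.rank()                 # number of independent eigenvectors for λ
    d = 3 - ell                        # defect: the multiplicity of λ = -2 is 3

    # choose ~u1 in Nul B^{d+1} but outside Nul B^d, so the chain has full length
    u1 = next(v for v in (B**(d + 1)).nullspace()
              if (B**d) * v != Matrix.zeros(4, 1))

    chain = [u1]
    while B * chain[-1] != Matrix.zeros(4, 1):
        chain.append(B * chain[-1])
    print(len(chain))      # a chain ~u1 -> ~u2 -> ... ending at an eigenvector
    print(chain)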

Example 5.4.4. Let

    A = [ 8  0  0  0 ]
        [ 0  8  0  3 ]
        [ 4  0  8  0 ]
        [ 0  0  0  8 ].

Find the minimal polynomial of A, the chain(s) of generalized eigenvectors, and the Jordan form of A.

Example 5.4.5. Write down the Jordan form of the following matrices.

    (1) [ 0  0  1  1 ]    (2) [ 3  5  0  0 ]    (3) [ 3  0  0  0 ]
        [ 0  0  1  1 ]        [ 0  3  6  0 ]        [ 0  3  5  0 ]
        [ 0  0  1  1 ]        [ 0  0  4  7 ]        [ 0  0  4  6 ]
        [ 0  0  0  0 ]        [ 0  0  0  4 ]        [ 0  0  0  4 ]


Let N(r) = J(0; r) denote the r × r matrix that has 1's immediately above the diagonal and zeros elsewhere. For example,

    N(2) = [ 0  1 ]    N(3) = [ 0  1  0 ]    N(4) = [ 0  1  0  0 ]
           [ 0  0 ]           [ 0  0  1 ]           [ 0  0  1  0 ]
                              [ 0  0  0 ]           [ 0  0  0  1 ]
                                                    [ 0  0  0  0 ]

etc. Then J(λ; r) = λI + N(r), or in abbreviated form, J = λI + N.

Suppose that f(x) is a polynomial of degree s. Then the Taylor expansion around a point c from calculus gives us

    f(c + x) = f(c) + f′(c)x + (f″(c)/2!)x^2 + · · · + (f^(s)(c)/s!)x^s,

where f′, f″, . . . , f^(s) represent successive derivatives of f. In terms of the matrices I and N, we have

    f(J) = f(λI + N) = f(λ)I + f′(λ)N + (f″(λ)/2!)N^2 + · · · + (f^(s)(λ)/s!)N^s

                     [ f(λ)  f′(λ)  f″(λ)/2!  · · ·  f^(r−1)(λ)/(r−1)! ]
                     [       f(λ)   f′(λ)     ⋱      ⋮                 ]
                   = [              f(λ)      ⋱      f″(λ)/2!          ]
                     [                        ⋱      f′(λ)             ]
                     [                               f(λ)              ]   (r × r)

because the entries of N^k that are k steps above the diagonal are 1's and all the other entries are zeros.

Example 5.4.6. Compute J(λ; 4)^2, J(λ; 3)^10 and J(λ; 2)^s.
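A spot check of the formula, assuming SymPy is available: for f(x) = x^s and a 2 × 2 block, the Taylor rule predicts J(λ; 2)^s = [[λ^s, sλ^{s−1}], [0, λ^s]], which we can compare against a directly computed power.

    from sympy import Matrix, simplify, symbols

    lam, s = symbols('lambda s', positive=True)

    # entries predicted by the Taylor formula for f(x) = x**s:
    J_pow = Matrix([[lam**s, s * lam**(s - 1)],
                    [0, lam**s]])

    J = Matrix([[lam, 1],
                [0, lam]])
    # sanity check at the concrete exponent s = 10:
    assert simplify(J**10 - J_pow.subs(s, 10)) == Matrix.zeros(2, 2)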

Remark. If J = diag(J_1, . . . , J_t) is in Jordan form, then J^s = diag(J_1^s, . . . , J_t^s).

Example 5.4.7. Compute J^s for

    J = [ 2  1  0 ]
        [ 0  2  0 ]
        [ 0  0  3 ].

Example 5.4.8. Given a square matrix A, use the Jordan form of A to determine its minimal polynomial.

Solution. Let J be the Jordan form of A, say A = MJM^{−1}. Since f(A) = Mf(J)M^{−1}, f(A) = 0 if and only if f(J) = 0. Also, if J(λ; r) is a Jordan block of J, then f(J(λ; r)) is the corresponding diagonal block of f(J). We must thus find a polynomial f such that f(J(λ; r)) = 0 for every Jordan block J(λ; r) of J.

But we derived a formula for f(J(λ; r)), and it equals the zero matrix if and only if f(λ), f′(λ), . . . , f^(r−1)(λ) are all zero. Thus f(x) and its first r − 1 derivatives must vanish at x = λ; in other words, (x − λ)^r must be a factor of f(x).


Let λ_1, . . . , λ_k be the distinct eigenvalues of A and m_i the “maximum size” of the Jordan blocks corresponding to the eigenvalue λ_i. Hence we obtain that

    f(x) = (x − λ_1)^{m_1} · · · (x − λ_k)^{m_k}

is the minimal polynomial of A.
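SymPy can carry out the whole procedure, assuming it is available: jordan_form returns M and J with A = MJM^{−1}, and the minimal polynomial is then read off the largest block for each eigenvalue, as just described. Revisiting the matrix of Example 5.2.2:

    from sympy import Matrix, eye

    A = Matrix([[3, 1, -1],
                [2, 2, -1],
                [2, 2, 0]])

    M, J = A.jordan_form()        # A = M J M^{-1}
    print(J)                      # expected blocks: J(1; 1) and J(2; 2)

    # largest blocks: size 1 for λ = 1, size 2 for λ = 2, so the minimal
    # polynomial should be (x - 1)(x - 2)^2; confirm that it kills A:
    print((A - eye(3)) * (A - 2 * eye(3))**2)   # zero matrix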

Example 5.4.9. Find the minimal polynomial of the following matrices.

    (1) [ 2  0  0  ]    (2) [ 2  1  0  ]    (3) [ 2  1  0  ]
        [ 0  2  0  ]        [ 0  2  0  ]        [ 0  1  0  ]
        [ 0  0  −1 ]        [ 0  0  −1 ]        [ 0  0  −1 ]

    (4) [ 3  1                ]        (5) [ 3  1                ]
        [    3  1             ]            [    3                ]
        [       3             ]            [       3  1          ]
        [          3          ]            [          3          ]
        [             8  1    ]            [             8  1    ]
        [                8    ]            [                8    ]
        [                   5 ]            [                   5 ]

(blank entries are zero)

Exercises for Chapter 5.

1. Let A = [ 0  1  0 ]
           [ 0  0  1 ]
           [ a  b  c ].
   Find a, b, c so that det(A − λI_3) = 9λ − λ^3.

2. Let T : V → V be a linear operator. A subspace U of V is T-invariant if T(U) ⊆ U, i.e., ∀~u ∈ U, T(~u) ∈ U.
   (a) Show that ker T and im T are T-invariant.
   (b) If U and W are T-invariant, prove that U ∩ W and U + W are also T-invariant.
   (c) Show that the eigenspace E_λ(T) is T-invariant.

3. Show that A and A^T have the same eigenvalues.

4. Show that if λ_1, . . . , λ_k are eigenvalues of A, then λ_1^m, . . . , λ_k^m are eigenvalues of A^m for all m ≥ 1. Moreover, each eigenvector of A is an eigenvector of A^m.

5. Let A and B be n × n matrices over a field F. If I − AB is invertible, prove that I − BA is invertible and (I − BA)^{−1} = I + B(I − AB)^{−1}A.

6. Show that if A and B are the same size, then AB and BA have the same eigenvalues.

7. Determine all 2 × 2 diagonalizable matrices A with nonzero repeated eigenvalue a, a.

8. Let V be the space of all real-valued continuous functions. Define T : V → V by

       (Tf)(x) = ∫_0^x f(t) dt.

   Show that T has no eigenvalues.

9. Prove that if A is invertible and diagonalizable, then A^{−1} is diagonalizable.

10. Let V = Span{1, sin 2t, sin^2 t} and define T : V → V by T(f) = f″. Find all eigenvalues and eigenspaces of T. Is T diagonalizable? Explain.

11. Let A = [a_ij] be an n × n matrix such that for each i = 1, 2, . . . , n, we have

        ∑_{j=1}^{n} a_ij = 0.

    Show that 0 is an eigenvalue of A.

12. Let A be an n × n matrix with characteristic polynomial (x − λ_1)^{d_1} · · · (x − λ_k)^{d_k}. Show that tr A = d_1λ_1 + · · · + d_kλ_k.

13. Let A be a 2 × 2 matrix. Prove that the characteristic polynomial of A is given by x^2 − (tr A)x + det A.

14. If A and B are 2 × 2 matrices with determinant one, prove that

        tr(AB) − (tr A)(tr B) + tr(AB^{−1}) = 0.


15. Find the 2 × 2 matrices with real entries that satisfy the equation

        X^3 − 3X^2 = [ −2  −2 ]
                     [ −2  −2 ].

    (Hint: Apply the Cayley-Hamilton Theorem.)

16. Let A = [ 0  0  c ]
            [ 1  0  b ]
            [ 0  1  a ].
    Prove that the minimal polynomial of A and the characteristic polynomial of A are the same.

17. A 3 × 3 matrix A has the characteristic polynomial x(x − 1)(x + 2). What is the characteristic polynomial of A^2?

18. Let V = M_n(F) be the vector space of n × n matrices over a field F. Let A be an n × n matrix and let T_A be the linear operator on V defined by T_A(B) = AB. Show that the minimal polynomial for T_A is the minimal polynomial for A.

19. Let U be an n × n real orthonormal matrix. Prove that
    (a) |tr(U)| ≤ n, and (b) det(U^2 − I_n) = 0 if n is odd.

20. If U = [~u_1 ~u_2 . . . ~u_n] with (~u_j, ~u_k) = 1 if j = k and 0 if j ≠ k, prove that U is unitary.

21. Let A be an n × n symmetric matrix with distinct eigenvalues λ_1, . . . , λ_k. Prove that

        (A − λ_1I_n) · · · (A − λ_kI_n) = 0.

22. Unitarily diagonalize the following matrices.

    (a) [ 2   1 ]    (b) [ 3   i ]    (c) [ 0   1  0  ]    (d) [ 0  1  0 ]    (e) [ 2   i  i ]
        [ −1  2 ]        [ −i  0 ]        [ −1  0  0  ]        [ 1  0  0 ]        [ −i  1  0 ]
                                          [ 0   0  −1 ]        [ 0  0  1 ]        [ −i  0  1 ]

23. Show that every unitarily diagonalizable matrix is normal.

24. Suppose that A is real symmetric and orthonormal. Prove that the only possible eigenvalues of A are ±1.

25. Show that if a real matrix A is skew-symmetric (i.e., A^T = −A), then iA is Hermitian.

26. Prove that if A is unitarily diagonalizable, then so is A^H.

27. Let A be any square real matrix. Show that the eigenvalues of A^T A are all non-negative.

28. Show that the generalized eigenspace G_λ corresponding to an eigenvalue λ of an n × n matrix A is a subspace of F^n.

29. Suppose the characteristic polynomial of a 4 × 4 matrix A is (x − 1)^2(x + 1)^2.
    (a) Prove that A^{−1} = 2A − A^3. (b) Write down all possible Jordan form(s) of A.

30. Let J = J(λ; r) be an r × r Jordan block with λ on its diagonal. Show that J has only one linearly independent eigenvector corresponding to λ.

31. If J is in Jordan form with k Jordan blocks on the diagonal, prove that J has exactly k linearly independent eigenvectors.

32. These Jordan matrices have eigenvalues 0, 0, 0, 0:

        J = [ 0  1  0  0 ]            K = [ 0  1  0  0 ]
            [ 0  0  0  0 ]                [ 0  0  1  0 ]
            [ 0  0  0  1 ]    and         [ 0  0  0  0 ]
            [ 0  0  0  0 ]                [ 0  0  0  0 ].

    For any matrix M, compare JM with MK. If they are equal, show that M is not invertible. Conclude that J and K are not similar.

33. Suppose that a square matrix A has two eigenvalues λ = 2, 5, and n_p(λ) = nullity(A − λI)^p, p ∈ N, are as follows: n_1(2) = 2, n_2(2) = 4, n_p(2) = 5 for p ≥ 3, and n_1(5) = 1, n_p(5) = 2 for p ≥ 2. Write down the Jordan form of A.

34. Let J = J(0; 5) be the 5 × 5 Jordan block with λ = 0. Find J^2, count its eigenvectors, and write its Jordan form.

35. How many possible Jordan forms are there for a 6 × 6 matrix with characteristic polynomial (x − 1)^2(x + 2)^4?


36. Let A = [ 2  a  b ]
            [ 0  2  c ]
            [ 0  0  1 ]  ∈ M_3(R).

    (a) Prove that A is diagonalizable if and only if a = 0.
    (b) Find the minimal polynomial of A when (i) a = 0, (ii) a ≠ 0.

37. Let V = {h(x, y) = ax^2 + bxy + cy^2 + dx + ey + f : a, b, c, d, e, f ∈ R} be a subspace of the space of polynomials in two variables x and y over R. Then B = {x^2, xy, y^2, x, y, 1} is a basis for V. Define T : V → V by

        (T(h))(x, y) = ∂/∂y ( ∫ h(x, y) dx ).

    (a) Prove that T is a linear transformation and find A = [T]_B.
    (b) Compute the characteristic polynomial and the minimal polynomial of A.
    (c) Find the Jordan form of A.

38. True or False:

    (a) [ 3  0 ]  and  [ 3  1 ]  are similar.
        [ 0  4 ]       [ 0  4 ]

    (b) [ 3  0 ]  and  [ 3  1 ]  are similar.
        [ 0  3 ]       [ 0  3 ]

39. Show that

        [ a  1  0 ]       [ b  0  0 ]
        [ 0  a  0 ]  and  [ 0  a  1 ]
        [ 0  0  b ]       [ 0  0  a ]

    are similar.

40. Write down the Jordan form for the following matrices and find its minimal polynomial.

    (a) [ −2  1  ]       (b) [ −1  0   1  ]       (c) [ 1   0   0  ]
        [ −1  −4 ]           [ 0   −1  1  ]           [ −2  −2  −3 ]
                             [ 1   −1  −1 ]           [ 2   3   4  ]

    (d) [ 2   0  0 ]     (e) [ 3  1  −1 ]          (f) [ −2  17  4 ]
        [ −7  9  7 ]         [ 2  2  −1 ]              [ −1  6   1 ]
        [ 0   0  2 ]         [ 2  2   0 ]              [ 0   1   2 ]

    (g) [ −3  5   −5 ]   (h) [ 5   −1  1 ]
        [ 3   −1  3  ]       [ 1   3   0 ]
        [ 8   −8  10 ]       [ −3  2   1 ]

    (i) [ 1  −4   0   −2 ]   (j) [ 2  1  0  1 ]
        [ 0  1    0   0  ]       [ 0  2  1  0 ]
        [ 6  −12  −1  −6 ]       [ 0  0  2  1 ]
        [ 0  −4   0   −1 ]       [ 0  0  0  2 ]

    (k) [ −1  −4  0  0 ]     (l) [ 1  3   7    0 ]
        [ 1   3   0  0 ]         [ 0  −1  −4   0 ]
        [ 1   2   1  0 ]         [ 0  1   3    0 ]
        [ 0   1   0  1 ]         [ 0  −6  −14  1 ]

    Eigenvalues: (b) −1, −1, −1 (c) 1, 1, 1 (d) 2, 2, 9 (e) 1, 2, 2 (f) 2, 2, 2 (g) 2, 2, 2 (h) 3, 3, 3 (i) −1, −1, 1, 1 (k) 1, 1, 1, 1 (l) 1, 1, 1, 1.