
Notes for Axler’s Linear Algebra Done Right

Christopher Eur

July 13, 2014

Contents

1 Vector Spaces
  1.1 Complex numbers
  1.2 Definition of Vector Space
  1.3 Properties of Vector Spaces
  1.4 Subspaces
  1.5 Sums and Direct Sums

2 Finite-Dimensional Vector Spaces
  2.1 Span and Linear Independence
  2.2 Bases
  2.3 Dimension

3 Linear Maps
  3.1 Definitions and Examples
  3.2 Null Spaces and Ranges
  3.3 The Matrix of a Linear Map
  3.4 Invertibility

4 Polynomials

5 Eigenvalues and Eigenvectors
  5.1 Invariant Subspaces
  5.2 Polynomials Applied to Operators
  5.3 Upper-Triangular Matrices
  5.4 Diagonal Matrices
  5.5 Invariant Subspaces on Real Vector Spaces

6 Inner-Product Spaces
  6.1 Inner Products
  6.2 Norms
  6.3 Orthonormal Bases
  6.4 Orthogonal Projections and Minimization Problems
  6.5 Linear Functionals and Adjoints

7 Operators on Inner-Product Spaces
  7.1 Self-Adjoint and Normal Operators
  7.2 The Spectral Theorem
  7.3 Normal Operators on Real Inner-Product Spaces
  7.4 Positive Operators
  7.5 Isometries
  7.6 Polar and Singular-Value Decompositions

8 Operators on Complex Vector Spaces
  8.1 Generalized Eigenvectors
  8.2 The Characteristic Polynomial
  8.3 Decomposition of an Operator
  8.4 Square Roots
  8.5 The Minimal Polynomial
  8.6 Jordan Form

9 Operators on Real Vector Spaces


1 Vector Spaces

1.1 Complex numbers

Definition: a group 〈G, ∗〉 is a set G with a binary operation ∗ such that:

1. Closure: a ∗ b ∈ G, ∀a, b ∈ G

2. Associativity: a ∗ (b ∗ c) = (a ∗ b) ∗ c, ∀a, b, c ∈ G

3. Identity: ∃e ∈ G such that e ∗ a = a ∗ e = a, ∀a ∈ G

4. Inverse: ∀a ∈ G, ∃b ∈ G such that a ∗ b = b ∗ a = e

Note: some authors take closure to be implied by the term binary operation. Also, a group is called Abelian if commutativity holds (i.e. a ∗ b = b ∗ a, ∀a, b ∈ G).

Examples: the set of integers with ∗ as +; the set {−1, 1} with ∗ as ordinary multiplication; Zm (integers modulo m) with addition; permutation groups

Definition: a field 〈F, +, ·〉 is a set F with binary operations + and · such that:

1. 〈F, +〉 and 〈F − {0}, ·〉 are Abelian groups.

2. Distributivity holds: a(b+ c) = ab+ ac,∀a, b, c ∈ F

Examples : R and C; Zp where p is prime

Notation : F denotes a field, which in this book is either R or C.

1.2 Definition of Vector Space

Definition: A vector space over F is a set V with an addition on V and a scalar multiplication on V such that:

1. 〈V, +〉 is an Abelian group.

2. Scalar multiplication:

(a) 1v = v, ∀v ∈ V
(b) (ab)v = a(bv), ∀v ∈ V, ∀a, b ∈ F

3. Distributive properties: a(u + v) = au + av and (a + b)u = au + bu, ∀a, b ∈ F, ∀u, v ∈ V

Examples: Euclidean space F^n = {(x1, x2, . . . , xn) : x1, . . . , xn ∈ F}; polynomials with coefficients in F, denoted P(F).


1.3 Properties of Vector Spaces

Note: The following two propositions follow immediately from 〈V, +〉 being an Abelian group.
Proposition: A vector space has a unique additive identity.
Proposition: Every element in a vector space has a unique additive inverse.

Note: Distributivity is critical in the following two propositions.
Proposition: 0v = 0 and a0 = 0 for every v ∈ V, a ∈ F.
Proposition: (−1)v = −v, ∀v ∈ V

1.4 Subspaces

Definition : A subset U of V is a subspace of V if:

1. additive identity : 0 ∈ U

2. closed under addition: u, v ∈ U ⇒ u+ v ∈ U

3. closed under scalar multiplication: a ∈ F and u ∈ U ⇒ au ∈ U

Examples: Let V be F^3; this example is also used as a nonexample in later topics. The following are some subspaces of V.

1. U1 = {(x, y, 0) : x, y ∈ F}

2. U2 = {(0, z, z) : z ∈ F}

3. U3 = {(0, 0, w) : w ∈ F}

1.5 Sums and Direct Sums

Definition: Let U1, . . . , Um be subspaces of V. The sum of U1, . . . , Um, denoted U1 + · · · + Um, is the set of all possible sums of elements of U1, . . . , Um. More precisely,

U1 + · · ·+ Um = {u1 + · · ·+ um : u1 ∈ U1, . . . , um ∈ Um}

Definition : V is the direct sum of subspaces U1, U2, . . . , Um, denoted V = U1 ⊕ · · · ⊕ Um, if eachelement of V can be written uniquely as a sum u1 + · · ·+ um where each uj ∈ Uj .

Proposition: Let U1, . . . , Um be subspaces of V. Then V = U1 ⊕ · · · ⊕ Um if and only if

1. V = U1 + · · ·+ Um and

2. if u1 + · · · + um = 0 with each uj ∈ Uj, then uj = 0 for j = 1, . . . , m

Proposition : Suppose U and W are subspaces of V . Then V = U ⊕W if and only if V = U +Wand U ∩W = {0}.
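Example: with U1, U2, U3 as in 1.4 above, F^3 = U1 + U2 + U3, but this sum is not direct: condition 2 fails, since (0, 0, 0) = (0, 1, 0) + (0, −1, −1) + (0, 0, 1) writes 0 as a sum of nonzero vectors from U1, U2, U3. By contrast, F^3 = U1 ⊕ U3.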


2 Finite-Dimensional Vector Spaces

2.1 Span and Linear Independence

Definition: A linear combination of a list (v1, . . . , vm) of vectors in V is a vector of the form

a1v1 + · · ·+ amvm where a1, . . . , am ∈ F

Definition: The span of (v1, . . . , vm) is the set of all linear combinations of (v1, . . . , vm). In other words,

span(v1, . . . , vm) = {a1v1 + · · ·+ amvm : a1, . . . , am ∈ F}

Definition: A vector space is called finite dimensional if some list of vectors in it spans the space.

Definition : A list (v1, . . . , vm) of vectors in V is linearly independent if

a1v1 + · · ·+ amvm = 0 implies a1 = · · · = am = 0

Definition: a list (v1, . . . , vm) is linearly dependent if it is not linearly independent; in other words, ∃a1, . . . , am, not all zero, such that a1v1 + · · · + amvm = 0.

Note: The following two are VERY important.

Linear Dependence Lemma: If (v1, . . . , vm) is linearly dependent in V and v1 ≠ 0, then ∃j ∈ {2, . . . , m} such that:

1. vj ∈ span(v1, . . . , vj−1)

2. span(v1, . . . , vm) = span(v1, . . . , vj−1, vj+1, . . . , vm)

Theorem: Let S and L be finite subsets of a vector space V, where S spans V and L is linearly independent. Then |S| ≥ |L|.

(Note: Artin’s approach provides a much cleaner proof than Axler’s.)

Corollary : Suppose that for each positive integer m there exists a linearly independent list ofm vectors in V . Then V is infinite dimensional.

2.2 Bases

Definition: A basis B of V is a list of vectors in V that spans V and is linearly independent.
Examples:

• ((1, 0, . . . , 0), (0, 1, . . . , 0), . . . , (0, . . . , 0, 1)) is the standard basis of F^n

• (1, z, . . . , z^m) is a basis of P_m(F)

Proposition: A list (v1, . . . , vn) of vectors in V is a basis of V if and only if every v ∈ V can be written uniquely in the form

v = a1v1 + · · ·+ anvn where a1, . . . , an ∈ F


Theorem : Every spanning list in a vector space can be reduced to a basis of the vector space.

Corollary : Every finite-dimensional vector space has a basis.

Theorem : Every linearly independent list of vectors in a finite-dimensional vector space canbe extended to a basis of the vector space.

Proposition: Let U be a subspace of a finite-dimensional vector space V. There exists a subspace W of V such that V = U ⊕ W.

2.3 Dimension

Theorem : Any two bases of a finite-dimensional vector space have the same length.

Definition: The dimension of a finite-dimensional vector space V, denoted dim V, is the length of any basis of the vector space V.

Proposition: If V is finite dimensional and U is a subspace of V, then dim U ≤ dim V.

Proposition : If V is finite dimensional, then:

• every spanning list of vectors in V with length dimV is a basis of V .

• every linearly independent list of vectors in V with length dimV is a basis of V .

Theorem : If U1 and U2 are subspaces of a finite dimensional vector space, then

dim(U1 + U2) = dimU1 + dimU2 − dim(U1 ∩ U2)
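Example: if U1 and U2 are distinct two-dimensional subspaces of F^3, then U1 + U2 = F^3 (the sum properly contains a plane, so has dimension 3), and hence dim(U1 ∩ U2) = 2 + 2 − 3 = 1: two distinct planes through the origin meet in a line.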

Theorem: Suppose V is finite dimensional and U1, . . . , Um are subspaces of V. Then TFAE:

1. V = U1 + · · ·+ Um and dimV = dimU1 + · · ·+ dimUm

2. V = U1 ⊕ · · · ⊕ Um

3 Linear Maps

3.1 Definitions and Examples

Definition: A linear map, or a linear transformation, from V to W is a function T : V → W with the following properties:

(a) additivity : T (v + u) = Tu+ Tv,∀u, v ∈ V

(b) homogeneity : T (av) = a(Tv),∀a ∈ F and v ∈ V

Notation: Tv means the same thing as T(v). The set of all linear maps from V to W is denoted L(V, W).

Examples


• Zero: 0 ∈ L(V,W ) where 0v = 0,∀v.

• Identity I ∈ L(V, V ) where Iv = v.

• Differentiation: T ∈ L(P(F),P(F)) where Tp = p′.

• Integration: T ∈ L(P(R), R) where Tp = ∫_0^1 p(x) dx.

• Multiplication by x2: T ∈ L(P(F),P(F)) where T (p(x)) = x2p(x)

• Backward shift : T ∈ L(F∞,F∞) where T (x1, x2, x3, . . .) = (x2, x3, . . .)

• from F^n to F^m: T ∈ L(F^3, F^2) where T(x, y, z) = (2x − y + 3z, 7x + 5y − 6z). More generally, T ∈ L(F^n, F^m) where

T (x1, . . . , xn) = (a1,1x1 + · · ·+ a1,nxn, . . . , am,1x1 + · · ·+ am,nxn)

L(V,W ) as a vector space

1. addition: (S + T )v = Sv + Tv

2. scalar multiplication: (aT )v = a(Tv)

∗ Product (not needed for the vector-space structure): ST = S ◦ T (composition of functions). This makes the set of operators L(V, V) a ring.

3.2 Null Spaces and Ranges

Definition: the null space of T, or its kernel, is the subset of V consisting of those vectors that T maps to 0. That is:

null T = kerT = {v ∈ V : Tv = 0}

Proposition: If T ∈ L(V,W ), then null T is a subspace of V .

Proposition: Let T ∈ L(V,W ). Then T is injective if and only if null T = {0}.

Definition: the range of T, or image, is the subset of W consisting of those vectors that are of the form Tv for some v ∈ V. That is:

range T = {Tv : v ∈ V } = {w ∈W : ∃v ∈ V such that Tv = w}

Proposition: If T ∈ L(V,W ), then range T is a subspace of W .

Definition: A linear map T : V →W is called surjective if range T = W .

Theorem (Rank-nullity): If V is finite dimensional and T ∈ L(V,W ), then range T is a finite-dimensional subspace of W and

dimV = dim null T + dim range T
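A quick numeric illustration of the theorem (a minimal NumPy sketch; the matrix A below is an arbitrary made-up example, not from the text):

```python
import numpy as np

# T : F^4 -> F^3 given by the (hypothetical) matrix A; its third row is
# the sum of the first two, so dim range T = 2.
A = np.array([[1., 2., 0., 1.],
              [0., 1., 1., 0.],
              [1., 3., 1., 1.]])

rank = np.linalg.matrix_rank(A)   # dim range T
nullity = A.shape[1] - rank       # dim null T, forced by the theorem
print(rank, nullity)              # 2 2, and 2 + 2 = 4 = dim V
```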


Corollary: Let V and W be finite-dimensional vector spaces. If dim V > dim W, then no linear map from V to W is injective. If dim V < dim W, then no linear map from V to W is surjective.

Corollary: A homogeneous system of linear equations in which there are more variables than equations must have nonzero solutions. An inhomogeneous system of linear equations in which there are more equations than variables has no solution for some choice of constant terms.

3.3 The Matrix of a Linear Map

Definition: The matrix of T : V → W with respect to the bases (v1, . . . , vn) and (w1, . . . , wm) is {ai,j} (1 ≤ i ≤ m, 1 ≤ j ≤ n) such that

Tvj = a1,jw1 + · · ·+ am,jwm

In other words:

    M(T, (v1, . . . , vn), (w1, . . . , wm)) =
        [ a1,1 · · · a1,j · · · a1,n ]
        [  ⋮         ⋮         ⋮    ]
        [ am,1 · · · am,j · · · am,n ]

where the j-th column (indexed by vj) lists the coefficients of Tvj with respect to (w1, . . . , wm).

We define addition, scalar multiplication, and product of matrices such that

1. M(T + S) =M(T ) +M(S),

2. M(cT ) = cM(T )

3. M(TS) =M(T )M(S)

hold true (assuming that the operations make sense).

Addition and scalar multiplication are trivial; we only discuss the product:

Definition: Let S : U → V and T : V → W be linear maps where l, n, m are the dimensions of U, V, W respectively. Suppose

    M(T) =
        [ a1,1 · · · a1,n ]
        [  ⋮          ⋮   ]
        [ am,1 · · · am,n ]

    and M(S) =
        [ b1,1 · · · b1,l ]
        [  ⋮          ⋮   ]
        [ bn,1 · · · bn,l ]

The product of the matrices M(T) and M(S) is the m-by-l matrix {ci,j} (1 ≤ i ≤ m, 1 ≤ j ≤ l) where

    ci,j = Σ_{k=1}^{n} ai,k bk,j
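The entry formula is easy to sanity-check numerically. A minimal NumPy sketch (the dimensions and random matrices below are arbitrary assumptions) comparing a direct implementation of ci,j against NumPy's built-in product:

```python
import numpy as np

m, n, l = 3, 4, 2
rng = np.random.default_rng(0)
MT = rng.standard_normal((m, n))   # stands in for M(T), T : V -> W
MS = rng.standard_normal((n, l))   # stands in for M(S), S : U -> V

# Direct implementation of c_{i,j} = sum_k a_{i,k} b_{k,j}
C = np.zeros((m, l))
for i in range(m):
    for j in range(l):
        C[i, j] = sum(MT[i, k] * MS[k, j] for k in range(n))

assert np.allclose(C, MT @ MS)     # matches M(T)M(S), i.e. M(TS)
```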


Definition: Let (v1, . . . , vn) be a basis of V. The matrix of v, denoted M(v), is the n-by-1 matrix:

    M(v, (v1, . . . , vn)) =
        [ b1 ]
        [ ⋮  ]
        [ bn ]

where b1, . . . , bn ∈ F are unique scalars such that v = b1v1 + · · ·+ bnvn.

Proposition: Suppose T ∈ L(V, W), (v1, . . . , vn) is a basis of V, and (w1, . . . , wm) is a basis of W. Then

M(Tv) = M(T)M(v) for every v ∈ V

3.4 Invertibility

Definition: A linear map is called invertible if it has an inverse. For a linear map T : V → W, a linear map S : W → V satisfying ST = I and TS = I is the inverse of T.

Proposition: A linear map is invertible if and only if it is injective and surjective.

Definition: Two vector spaces are isomorphic if there is an invertible linear map from one vector space onto the other one.

Notation: Mat(m, n, F) denotes the set of m × n matrices with entries in F. Also, L(V) = L(V, V); an element of L(V) is called an operator.

Theorem: Two finite-dimensional vector spaces are isomorphic if and only if they have the same dimension.

Proposition: Suppose that (v1, . . . , vn) is a basis of V and (w1, . . . , wm) is a basis of W. Then M (the matrix map defined above, with respect to these bases) is an invertible linear map from L(V, W) onto Mat(m, n, F).

Corollary: If V and W are finite dimensional, then

dimL(V,W ) = (dimV )(dimW )

Proposition: Suppose V is finite dimensional. If T ∈ L(V ), then TFAE:

1. T is invertible

2. T is injective

3. T is surjective

4 Polynomials

Skipped


5 Eigenvalues and Eigenvectors

5.1 Invariant Subspaces

Definition: For T ∈ L(V), a subspace U of V is invariant under T if u ∈ U implies Tu ∈ U, i.e. if T|U is an operator on U.

Examples: {0}, null T , and range T .

Definition: A scalar λ ∈ F is an eigenvalue of T ∈ L(V) if there exists a nonzero vector u ∈ V such that

Tu = λu

Note: Tu = λu ⇔ (T − λI)u = 0, and so λ is an eigenvalue of T if and only if T − λI is not injective; for operators on a finite-dimensional space this is equivalent to T − λI being not invertible, or not surjective.

Definition: Suppose T ∈ L(V) and λ ∈ F is an eigenvalue of T. Then, a vector u ∈ V is an eigenvector of T (corresponding to λ) if Tu = λu.

Note: Again, since Tu = λu ⇔ (T − λI)u = 0, the set of eigenvectors of T corresponding to λ equals null(T − λI), a subspace of V.

Theorem: Let T ∈ L(V). Suppose λ1, . . . , λm are distinct eigenvalues of T and v1, . . . , vm are corresponding nonzero eigenvectors. Then (v1, . . . , vm) is linearly independent.

Corollary: Each operator on V has at most dimV distinct eigenvalues.

5.2 Polynomials Applied to Operators

Notation: T^m = T · · · T (m times). Also, p ↦ p(T) denotes the linear map from P(F) to L(V) given by “plugging in” T for the variable of the polynomial p.

Note: (pq)(T ) = p(T )q(T ) = q(T )p(T )

5.3 Upper-Triangular Matrices

Theorem: Every operator on a finite-dimensional, nonzero, complex vector space has an eigenvalue.
Question: is the proof showing that any nonzero vector in a complex vector space is an eigenvector??

Definitions: the diagonal of a square matrix consists of the entries along the straight line from the upper left corner to the bottom right corner. A matrix is called upper triangular if all the entries below the diagonal equal 0.

Proposition: Suppose T ∈ L(V ) and (v1, . . . , vn) is a basis of V . Then TFAE:

(a) the matrix of T with respect to (v1, . . . , vn) is upper triangular

(b) Tvk ∈ span(v1, . . . , vk) for each k = 1, . . . , n

(c) span(v1, . . . , vk) is invariant under T for each k = 1, . . . , n


Theorem: Suppose V is a complex vector space and T ∈ L(V). Then T has an upper-triangular matrix with respect to some basis of V.

Proposition: Suppose T ∈ L(V) has an upper-triangular matrix with respect to some basis of V. Then T is invertible if and only if all the entries on the diagonal of that upper-triangular matrix are nonzero.

Proposition: Suppose T ∈ L(V) has an upper-triangular matrix with respect to some basis of V. Then the eigenvalues of T consist precisely of the entries on the diagonal of that upper-triangular matrix.

5.4 Diagonal Matrices

Definition: A diagonal matrix is a square matrix that is 0 everywhere except possibly along the diagonal.

Proposition: If T ∈ L(V) has dim V distinct eigenvalues, then T has a diagonal matrix with respect to some basis of V.

Proposition: Suppose T ∈ L(V). Let λ1, . . . , λm denote the distinct eigenvalues of T. Then TFAE:

(a) T has a diagonal matrix with respect to some basis of V

(b) V has a basis consisting of eigenvectors of T

(c) there exist one-dimensional subspaces U1, . . . , Un of V, each invariant under T, such that V = U1 ⊕ · · · ⊕ Un

(d) V = null(T − λ1I)⊕ · · ·⊕ null(T − λmI)

(e) dimV = dim null(T − λ1I) + · · ·+ dim null(T − λmI)
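A minimal NumPy sketch of the first proposition above (the matrix A, with dim V = 2 distinct eigenvalues, is an arbitrary example): np.linalg.eig returns an eigenvector basis, which is condition (b):

```python
import numpy as np

A = np.array([[2., 1.],
              [0., 3.]])                 # distinct eigenvalues 2 and 3
w, P = np.linalg.eig(A)                  # columns of P: an eigenvector basis
assert np.allclose(A, P @ np.diag(w) @ np.linalg.inv(P))   # A is diagonalized
```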

5.5 Invariant Subspaces on Real Vector Spaces

Theorem: Every operator on a finite-dimensional, nonzero, real vector space has an invariant subspace of dimension 1 or 2.

Definition: Let V = U ⊕ W, i.e. each vector v ∈ V can be written uniquely as v = u + w where u ∈ U, w ∈ W. The projection onto U with null space W, denoted PU,W ∈ L(V), is defined by

PU,W v = u

Note: v = PU,W v + PW,U v; range PU,W = U and null PU,W = W

Theorem: Every operator on an odd-dimensional real vector space has an eigenvalue.


6 Inner-Product Spaces

6.1 Inner Products

Definition: An inner product is a function V × V → F, denoted 〈v, u〉, with the following properties:

1. Positive-definiteness: 〈v, v〉 ≥ 0,∀v ∈ V and 〈v, v〉 = 0⇔ v = 0

2. Linearity in first slot: for each fixed w ∈ V, the function that takes v to 〈v, w〉 is a linear map from V to F.

3. Conjugate symmetry: 〈v, w〉 equals the complex conjugate of 〈w, v〉

Note: The definition implies additivity in the second slot and conjugate homogeneity in the second slot (i.e. 〈v, au〉 = ā〈v, u〉)

Examples: the dot product on R^n; the Euclidean inner product on F^n, given by 〈(w1, . . . , wn), (z1, . . . , zn)〉 = w1z̄1 + · · · + wnz̄n; on P_m(F), 〈p, q〉 = ∫_0^1 p(x)q̄(x) dx

6.2 Norms

Definition: The norm of v ∈ V, denoted ‖v‖, is defined as ‖v‖ = √〈v, v〉

Note: ‖av‖ = |a| ‖v‖

Definition: Two vectors v, u ∈ V are orthogonal if 〈v, u〉 = 0

Theorem (Pythagoras): If v, u are orthogonal vectors in V , then

‖v + u‖2 = ‖v‖2 + ‖u‖2

Remark (orthogonal decomposition): given u, v ∈ V with v ≠ 0, write u as a scalar multiple of v plus a vector orthogonal to v:

    u = (〈u, v〉 / ‖v‖^2) v + (u − (〈u, v〉 / ‖v‖^2) v)

Cauchy-Schwarz Inequality: If u, v ∈ V , then |〈u, v〉| ≤ ‖u‖‖v‖

Triangle Inequality: If u, v ∈ V , then ‖u+ v‖ ≤ ‖u‖+ ‖v‖

Parallelogram Equality: If u, v ∈ V , then ‖u+ v‖2 + ‖u− v‖2 = 2(‖u‖2 + ‖v‖2)

6.3 Orthonormal Bases

Definition: a list of vectors is orthonormal if the vectors in it are pairwise orthogonal and eachhas norm 1.


Proposition: If (e1, . . . , em) is an orthonormal list of vectors in V , then

‖a1e1 + · · · + amem‖^2 = |a1|^2 + · · · + |am|^2

Corollary: Every orthonormal list of vectors is linearly independent.

Theorem: Suppose (e1, . . . , en) is an orthonormal basis of V. Then ∀v ∈ V,

v = 〈v, e1〉e1 + · · · + 〈v, en〉en and ‖v‖^2 = |〈v, e1〉|^2 + · · · + |〈v, en〉|^2

Gram-Schmidt: If (v1, . . . , vm) is a linearly independent list of vectors in V, then there exists an orthonormal list (e1, . . . , em) of vectors in V such that span(v1, . . . , vm) = span(e1, . . . , em)

Note: Define

    ej = (vj − 〈vj, e1〉e1 − · · · − 〈vj, ej−1〉ej−1) / ‖vj − 〈vj, e1〉e1 − · · · − 〈vj, ej−1〉ej−1‖
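A minimal NumPy sketch of the procedure (the input vectors are an arbitrary assumption); np.vdot conjugates its first argument, matching the convention 〈v, e〉 above:

```python
import numpy as np

def gram_schmidt(vs):
    # Orthonormalize a linearly independent list of vectors (the rows of vs).
    es = []
    for v in vs:
        w = v - sum(np.vdot(e, v) * e for e in es)  # subtract projections onto earlier e's
        es.append(w / np.linalg.norm(w))            # normalize
    return np.array(es)

vs = np.array([[1., 1., 0.],
               [1., 0., 1.]])
es = gram_schmidt(vs)
assert np.allclose(es @ es.T, np.eye(2))            # pairwise orthogonal, each of norm 1
```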

Corollary: Every finite-dimensional inner-product space has an orthonormal basis.
Corollary: Every orthonormal list of vectors in V can be extended to an orthonormal basis of V.

Corollary: Suppose T ∈ L(V). If T has an upper-triangular matrix with respect to some basis of V, then T has an upper-triangular matrix with respect to some orthonormal basis of V.

Corollary (Schur’s theorem): Suppose V is a complex vector space and T ∈ L(V). Then T has an upper-triangular matrix with respect to some orthonormal basis of V.

6.4 Orthogonal Projections and Minimization Problems

Definition: Let U be a subset of V. Then the orthogonal complement of U, denoted U⊥, is the set of all vectors in V that are orthogonal to every vector in U:

U⊥ = {v ∈ V : 〈v, u〉 = 0 ∀u ∈ U}

Theorem: If U is a subspace of V , then V = U ⊕ U⊥

Corollary: If U is a subspace of V , then U = (U⊥)⊥

Definition: Let U be a subspace of V. The orthogonal projection of V onto U, denoted PU, is PU,U⊥. (I.e., if v = u + w for unique u ∈ U, w ∈ U⊥, then PU v = u.)

Proposition: for U a subspace of V,

• range PU = U, null PU = U⊥; PU^2 = PU

• v − PUv ∈ U⊥ ∀v ∈ V

• ‖PUv‖ ≤ ‖v‖

• If (e1, . . . , em) is an orthonormal basis of U , then PUv = 〈v, e1〉e1 + · · ·+ 〈v, em〉em


Proposition: Suppose U is a subspace of V and v ∈ V . Then

‖v − PUv‖ ≤ ‖v − u‖

for every u ∈ U. Furthermore, if u ∈ U and the inequality above is an equality, then u = PU v.

Note: We can use the above result for minimization problems; for example, to approximate sin x we can take V to be a space of functions and U a space of polynomials of degree at most 6. Note that U must be finite-dimensional while V can be infinite-dimensional.
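A minimal NumPy sketch of the proposition (U is taken to be the xy-plane in R^3 and the vectors are arbitrary choices): with the columns of E an orthonormal basis of U, the matrix of PU is E Eᵀ, and PU v beats any other u ∈ U:

```python
import numpy as np

E = np.array([[1., 0.],
              [0., 1.],
              [0., 0.]])               # orthonormal basis of U = the xy-plane
P = E @ E.T                            # matrix of P_U
v = np.array([1., 2., 3.])
assert np.allclose(P @ P, P)           # P_U^2 = P_U
u = E @ np.array([0.3, -1.7])          # an arbitrary vector in U
assert np.linalg.norm(v - P @ v) <= np.linalg.norm(v - u)
```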

6.5 Linear Functionals and Adjoints

Definition: A linear functional on V is a linear map ϕ : V → F

Theorem: Suppose ϕ is a linear functional on V. Then there is a unique vector v ∈ V such that ϕ(u) = 〈u, v〉 for every u ∈ V

Definition: Let T ∈ L(V, W) (with V, W finite-dimensional, nonzero inner-product spaces). Fix w ∈ W and consider the linear functional ϕ : v ↦ 〈Tv, w〉. The adjoint of T, denoted T∗, is the linear map in L(W, V) where T∗w is the unique vector in V such that ϕ(v) is given by taking the inner product with T∗w.

In other words, T∗w is the unique vector in V such that 〈Tv, w〉 = 〈v, T∗w〉

Proposition: The function T ↦ T∗ has the following properties:

• (S + T )∗ = S∗ + T ∗

• (aT)∗ = āT∗

• (T ∗)∗ = T

• I∗ = I

• (ST )∗ = T ∗S∗

Proposition: Suppose T ∈ L(V,W ). Then

1. null T ∗ = (range T )⊥

2. range T ∗ = (null T )⊥

3. null T = (range T ∗)⊥

4. range T = (null T ∗)⊥

Proposition: Suppose T ∈ L(V, W). If (e1, . . . , en) is an orthonormal basis of V and (f1, . . . , fm) is an orthonormal basis of W, then

M(T ∗, (f1, . . . , fm), (e1, . . . , en))

is the conjugate transpose of M(T, (e1, . . . , en), (f1, . . . , fm))


7 Operators on Inner-Product Spaces

7.1 Self-Adjoint and Normal Operators

Definition: An operator T ∈ L(V ) is self-adjoint or Hermitian if T = T ∗.

Proposition: Every eigenvalue of a self-adjoint operator is real.

Proposition: If V is a complex inner-product space and T is an operator on V such that 〈Tv, v〉 = 0 for every v ∈ V, then T = 0

Corollary: Let V be a complex inner-product space and let T ∈ L(V). Then T is self-adjoint if and only if 〈Tv, v〉 ∈ R for every v ∈ V

Proposition: If T is a self-adjoint operator on V such that 〈Tv, v〉 = 0 for every v ∈ V , then T = 0

Definition: An operator on an inner-product space is normal if it commutes with its adjoint; i.e. TT∗ = T∗T

Proposition: An operator T ∈ L(V) is normal if and only if ‖Tv‖ = ‖T∗v‖ for all v ∈ V.
Note: this implies that null T = null T∗ for every normal operator T.

Corollary: Suppose T ∈ L(V) is normal. If v ∈ V is an eigenvector of T with eigenvalue λ ∈ F, then v is also an eigenvector of T∗ with eigenvalue λ̄.

Corollary: If T ∈ L(V) is normal, then eigenvectors of T corresponding to distinct eigenvalues are orthogonal.

7.2 The Spectral Theorem

Complex Spectral Theorem: Suppose that V is a complex inner-product space and T ∈ L(V). Then V has an orthonormal basis consisting of eigenvectors of T if and only if T is normal.

Lemma: Suppose T ∈ L(V) is self-adjoint. If α, β ∈ R are such that α^2 < 4β, then T^2 + αT + βI is invertible.

Lemma: Suppose T ∈ L(V ) is self-adjoint. Then T has an eigenvalue.

Real Spectral Theorem: Suppose that V is a real inner-product space and T ∈ L(V). Then V has an orthonormal basis consisting of eigenvectors of T if and only if T is self-adjoint.

Corollary: Suppose that T ∈ L(V) is self-adjoint (or that F = C and T is normal). Let λ1, . . . , λm denote the distinct eigenvalues of T. Then

V = null(T − λ1I)⊕ · · · ⊕ null(T − λmI)

Furthermore, each vector in each null(T − λjI) is orthogonal to all vectors in the other subspaces of this decomposition.
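A minimal NumPy sketch of the real spectral theorem (the symmetric matrix is an arbitrary example); np.linalg.eigh returns exactly such an orthonormal eigenvector basis:

```python
import numpy as np

A = np.array([[2., 1.],
              [1., 2.]])               # self-adjoint (symmetric)
w, Q = np.linalg.eigh(A)               # eigenvalues w; columns of Q orthonormal
assert np.allclose(Q.T @ Q, np.eye(2))           # orthonormal eigenvector basis
assert np.allclose(Q @ np.diag(w) @ Q.T, A)      # diagonalizes A
```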


7.3 Normal Operators on Real Inner-Product Spaces

Lemma: Suppose V is a two-dimensional real inner-product space and T ∈ L(V). Then TFAE:

(a) T is normal but not self-adjoint

(b) the matrix of T with respect to every orthonormal basis of V has the form

    [ a  −b ]
    [ b   a ]

    with b ≠ 0

(c) the matrix of T with respect to some orthonormal basis of V has the form

    [ a  −b ]
    [ b   a ]

    with b > 0

Proposition: Suppose T ∈ L(V) is normal and U is a subspace of V that is invariant under T. Then

1. U⊥ is invariant under T

2. U is invariant under T ∗

3. (T |U )∗ = T ∗|U

4. T |U is a normal operator on U

5. T |U⊥ is a normal operator on U⊥

Definition: A block diagonal matrix is a square matrix of the form

    [ A1        0 ]
    [     ⋱       ]
    [ 0        Am ]

where A1, . . . , Am are square matrices lying along the diagonal and all other entries of the matrix equal 0.

Note: If A and B are block diagonal matrices of the form

    A = [ A1        0 ]          B = [ B1        0 ]
        [     ⋱       ]              [     ⋱       ]
        [ 0        Am ]              [ 0        Bm ]

where Aj has the same size as Bj for j = 1, . . . , m, then AB is a block diagonal matrix of the form

    AB = [ A1B1          0 ]
         [       ⋱         ]
         [ 0          AmBm ]


Theorem: Suppose that V is a real inner-product space and T ∈ L(V). Then T is normal if and only if there is an orthonormal basis of V with respect to which T has a block diagonal matrix where each block is a 1-by-1 matrix or a 2-by-2 matrix of the form

    [ a  −b ]
    [ b   a ]

with b > 0

7.4 Positive Operators

Definition: An operator T ∈ L(V) is called positive (semidefinite) if T is self-adjoint and 〈Tv, v〉 ≥ 0 for all v ∈ V. Note that if V is a complex vector space, then the self-adjoint condition can be dropped.

Definition: A square root of an operator T is an operator S such that S^2 = T

Theorem: Let T ∈ L(V ). Then TFAE:

(a) T is positive

(b) T is self-adjoint and all the eigenvalues of T are nonnegative

(c) T has a positive square root

(d) T has a self-adjoint square root

(e) there exists an operator S ∈ L(V ) such that T = S∗S

Proposition: Every positive operator on V has a unique positive square root.
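A minimal sketch of how the square root comes out of the spectral theorem (the positive matrix T below is an arbitrary example): diagonalize, take square roots of the nonnegative eigenvalues, and reassemble:

```python
import numpy as np

T = np.array([[5., 2.],
              [2., 2.]])               # symmetric with eigenvalues 1 and 6, so positive
w, Q = np.linalg.eigh(T)
S = Q @ np.diag(np.sqrt(w)) @ Q.T      # the positive square root of T
assert np.allclose(S @ S, T)
```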

7.5 Isometries

Definition: An operator S ∈ L(V) is an isometry if ‖Sv‖ = ‖v‖ for all v ∈ V.
Note: An isometry on a real inner-product space is called an orthogonal operator; on a complex inner-product space, a unitary operator.

Theorem: Suppose S ∈ L(V ). Then TFAE:

(a) S is an isometry

(b) 〈Sv, Su〉 = 〈v, u〉 for all u, v ∈ V

(c) S∗S = I

(d) (Se1, . . . , Sen) is orthonormal whenever (e1, . . . , en) is an orthonormal list of vectors in V

(e) there exists an orthonormal basis (e1, . . . , en) of V such that (Se1, . . . , Sen) is orthonormal

(f) S∗ is an isometry


(g) 〈S∗v, S∗u〉 = 〈v, u〉 for all u, v ∈ V

(h) SS∗ = I

(i) (S∗e1, . . . , S∗en) is orthonormal whenever (e1, . . . , en) is an orthonormal list of vectors in V

(j) there exists an orthonormal basis (e1, . . . , en) of V such that (S∗e1, . . . , S∗en) is orthonormal

Theorem: Suppose V is a complex inner-product space and S ∈ L(V). Then S is an isometry if and only if there is an orthonormal basis of V consisting of eigenvectors of S all of whose corresponding eigenvalues have absolute value 1.

Theorem: Suppose that V is a real inner-product space and S ∈ L(V). Then S is an isometry if and only if there is an orthonormal basis of V with respect to which S has a block diagonal matrix where each block on the diagonal is a 1-by-1 matrix containing 1 or −1, or a 2-by-2 matrix of the form

    [ cos θ  −sin θ ]
    [ sin θ   cos θ ]

with θ ∈ (0, π)

7.6 Polar and Singular-Value Decompositions

Polar Decomposition: If T ∈ L(V ), then there exists an isometry S ∈ L(V ) such that

T = S√(T∗T)

Note: we can thus write each operator on V as the product of two operators, each of which comes from a class that we understand well: an isometry (normal) and a positive operator (self-adjoint).
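A minimal NumPy sketch of the polar decomposition, computed via the SVD (the matrix T is an arbitrary example): if T = UΣV∗, then √(T∗T) = VΣV∗ and S = UV∗ is an isometry:

```python
import numpy as np

T = np.array([[1., 2.],
              [0., 3.]])
U, s, Vh = np.linalg.svd(T)            # T = U diag(s) Vh
R = Vh.T @ np.diag(s) @ Vh             # sqrt(T*T), a positive operator
S = U @ Vh                             # an isometry (orthogonal, since T is real)
assert np.allclose(S @ R, T)           # T = S sqrt(T*T)
assert np.allclose(S.T @ S, np.eye(2))
```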

Definition: Suppose T ∈ L(V). The singular values of T are the eigenvalues of √(T∗T), with each eigenvalue λ repeated dim null(√(T∗T) − λI) times.

Note: Since √(T∗T) is positive (hence self-adjoint), there are always dim V singular values for T ∈ L(V).

Singular-Value Decomposition: Suppose T ∈ L(V) has singular values s1, . . . , sn. Then there exist orthonormal bases (e1, . . . , en) and (f1, . . . , fn) of V such that

Tv = s1〈v, e1〉f1 + · · ·+ sn〈v, en〉fn

for every v ∈ V

Notes: This implies that every operator on V has a diagonal matrix with respect to some orthonormal bases of V, provided that we are permitted to use two different bases rather than a single basis. That is,

    M(T, (e1, . . . , en), (f1, . . . , fn)) =
        [ s1        0 ]
        [     ⋱       ]
        [ 0        sn ]
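A minimal NumPy sketch of the statement (T and v are arbitrary examples): np.linalg.svd supplies the two orthonormal bases, with ej the rows of Vh and fj the columns of U:

```python
import numpy as np

T = np.array([[1., 0.],
              [1., 1.]])
U, s, Vh = np.linalg.svd(T)            # T = U diag(s) Vh
v = np.array([2., -1.])
Tv = sum(s[j] * np.dot(Vh[j], v) * U[:, j] for j in range(2))
assert np.allclose(Tv, T @ v)          # Tv = s1 <v,e1> f1 + s2 <v,e2> f2
```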


8 Operators on Complex Vector Spaces

8.1 Generalized Eigenvectors

Definition: Suppose T ∈ L(V) and λ is an eigenvalue of T. A generalized eigenvector of T corresponding to λ is a vector v ∈ V such that

(T − λI)jv = 0

for some positive integer j.

Example: (z1, z2, 0) is a generalized eigenvector of T : (z1, z2, z3) ↦ (z2, 0, z3) corresponding to the eigenvalue 0. Furthermore, note that C^3 = {(z1, z2, 0) : z1, z2 ∈ C} ⊕ {(0, 0, z3) : z3 ∈ C}

Remark: Note that for T ∈ L(V ),

{0} = null T^0 ⊂ null T^1 ⊂ · · · ⊂ null T^k ⊂ null T^(k+1) ⊂ · · ·

Proposition: If T ∈ L(V) and m is a nonnegative integer such that null T^m = null T^(m+1), then

null T^0 ⊂ null T^1 ⊂ · · · ⊂ null T^m = null T^(m+1) = null T^(m+2) = · · ·

Proposition: If T ∈ L(V ), then

null T^(dim V) = null T^(dim V + 1) = null T^(dim V + 2) = · · ·

Corollary: Suppose T ∈ L(V) and λ is an eigenvalue of T. Then the set of generalized eigenvectors of T corresponding to λ equals null((T − λI)^(dim V))

Definition: An operator is nilpotent if some power of it equals 0. An example is differentiation on P_m(R)

Corollary: Suppose N ∈ L(V) is nilpotent. Then N^(dim V) = 0

Remark: Note that for T ∈ L(V ),

V = range T^0 ⊃ range T^1 ⊃ · · · ⊃ range T^k ⊃ range T^(k+1) ⊃ · · ·

Proposition: If T ∈ L(V ), then

range T^(dim V) = range T^(dim V + 1) = range T^(dim V + 2) = · · ·

8.2 The Characteristic Polynomial

Theorem: Let T ∈ L(V) and λ ∈ F. Then for every basis of V with respect to which T has an upper-triangular matrix, λ appears on the diagonal of the matrix of T precisely dim null((T − λI)^(dim V)) times.


Definition: The multiplicity of an eigenvalue λ of T is defined to be the dimension of the subspace of generalized eigenvectors corresponding to λ, i.e. dim null((T − λI)^(dim V))

Proposition: If V is a complex vector space and T ∈ L(V), then the sum of the multiplicities of all the eigenvalues of T equals dim V.

Definition: Suppose V is a complex vector space and T ∈ L(V). Let λ1, . . . , λm denote the distinct eigenvalues of T. Let dj denote the multiplicity of λj as an eigenvalue of T. The polynomial (z − λ1)^(d1) · · · (z − λm)^(dm) is the characteristic polynomial of T.

Alternatively, if

    M(T) = [ λ1        ∗ ]
           [     ⋱       ]
           [ 0        λn ]

is upper triangular, then the characteristic polynomial of T is (z − λ1) · · · (z − λn).

Cayley-Hamilton Theorem: Suppose that V is a complex vector space and T ∈ L(V). Let q denote the characteristic polynomial of T. Then q(T) = 0.
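A numeric check of the theorem (a minimal NumPy sketch; the matrix A is an arbitrary example): np.poly(A) returns the coefficients of the characteristic polynomial, which we evaluate at A with matrix powers via Horner's rule:

```python
import numpy as np

A = np.array([[1., 2.],
              [3., 4.]])
coeffs = np.poly(A)                # [1, -5, -2], i.e. q(z) = z^2 - 5z - 2
qA = np.zeros_like(A)
for c in coeffs:                   # Horner: qA <- qA @ A + c I
    qA = qA @ A + c * np.eye(2)
assert np.allclose(qA, 0)          # q(T) = 0
```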

8.3 Decomposition of an Operator

Proposition: If T ∈ L(V ) and p ∈ P(F), then null p(T ) is invariant under T .

Theorem: Suppose V is a complex vector space and T ∈ L(V). Let λ1, . . . , λm be the distinct eigenvalues of T, and let U1, . . . , Um be the corresponding subspaces of generalized eigenvectors (Uj = null((T − λjI)^(dim V))). Then

1. V = U1 ⊕ · · · ⊕ Um

2. each Uj is invariant under T

3. each (T − λjI)|Uj is nilpotent

Corollary: Suppose V is a complex vector space and T ∈ L(V). Then there is a basis of V consisting of generalized eigenvectors of T.

Lemma: Suppose N is a nilpotent operator on V. Then there is a basis of V with respect to which the matrix of N has the form

    [ 0        ∗ ]
    [     ⋱      ]
    [ 0        0 ]

(all entries on and below the diagonal are 0).

Theorem: Suppose V is a complex vector space and T ∈ L(V). Let λ1, . . . , λm be the distinct eigenvalues of T. Then there is a basis of V with respect to which T has a block diagonal matrix of the form

    [ A1        0 ]
    [     ⋱       ]
    [ 0        Am ]


where each Aj is an upper-triangular matrix of the form

    Aj = [ λj        ∗ ]
         [     ⋱       ]
         [ 0        λj ]

8.4 Square Roots

Lemma: Suppose N ∈ L(V ) is nilpotent. Then I +N has a square root.

Theorem: Suppose V is a complex vector space. If T ∈ L(V ) is invertible, then T has a square root.

Remark: If V is a complex vector space and T ∈ L(V) is invertible, then T has a kth root for every positive integer k.

8.5 The Minimal Polynomial

Definition: The minimal polynomial of T is the monic polynomial p ∈ P(F) of smallest degree such that p(T) = 0

Note: by Cayley-Hamilton, we know that in a complex vector space the degree is at most dimV

Theorem: Let T ∈ L(V) and let q ∈ P(F). Then q(T) = 0 if and only if the minimal polynomial of T divides q.

Theorem: Let T ∈ L(V). Then the roots of the minimal polynomial of T are precisely the eigenvalues of T.

8.6 Jordan Form

Notation: Suppose N ∈ L(V) is nilpotent. For each nonzero vector v ∈ V, let m(v) denote the largest nonnegative integer such that N^(m(v))v ≠ 0

Lemma: If N ∈ L(V ) is nilpotent, then there exist vectors v1, . . . , vk ∈ V such that

1. (v1, Nv1, . . . , N^(m(v1))v1, . . . , vk, Nvk, . . . , N^(m(vk))vk) is a basis of V

2. (N^(m(v1))v1, . . . , N^(m(vk))vk) is a basis of null N.

Definition: A basis of V is called a Jordan basis for T if with respect to this basis T has a block diagonal matrix

    [ A1        0 ]
    [     ⋱       ]
    [ 0        Am ]


where each Aj is an upper-triangular matrix of the form

    Aj = [ λj  1           0 ]
         [     ⋱   ⋱         ]
         [          ⋱    1   ]
         [ 0            λj   ]

with each λj an eigenvalue of T.

Theorem: Suppose V is a complex vector space. If T ∈ L(V), then there is a basis of V that is a Jordan basis for T.

9 Operators on Real Vector Spaces
