
Lecture Notes 1: Matrix Algebra
Part D: Similar Matrices and Diagonalization

Peter J. Hammond

Minor revision: 2020 September 26th

University of Warwick, EC9A0 Maths for Economists


Outline

Eigenvalues and Eigenvectors
  • Real Case
  • The Complex Case
  • Linear Independence of Eigenvectors

Diagonalizing a General Matrix
  • Similar Matrices

Properties of Adjoint and Symmetric Matrices
  • A Self-Adjoint Matrix has only Real Eigenvalues

Diagonalizing a Symmetric Matrix
  • Orthogonal Matrices
  • Orthogonal Projections
  • Rayleigh Quotient
  • The Spectral Theorem

Quadratic Forms and Their Definiteness
  • Quadratic Forms
  • The Eigenvalue Test of Definiteness
  • Sylvester’s Criterion for Definiteness


Definitions in the Real Case

Definition. Consider any n × n matrix A.

The scalar λ ∈ R is an eigenvalue of A just in case the equation Ax = λx has a non-zero solution.

In this case the solution x ∈ R^n \ {0} is an eigenvector, and the pair (λ, x) is an eigenpair.

The spectrum of the matrix A is the set S_A of its eigenvalues.

Let S_A^R denote the subset of its real eigenvalues, and let S_A^C denote the subset of its complex eigenvalues, which satisfies S_A^C = S_A \ S_A^R.


Summary of Main Properties

We will be demonstrating the following properties:

1. S_A^R ⊆ S_A and #S_A ≤ n.

2. The number #S_A^C of complex eigenvalues is even, and the members of S_A^C come in complex conjugate pairs λ ± µi.

3. S_A^R = ∅ is possible in case n is even, but not if n is odd.

4. In case A is symmetric, one has S_A^C = ∅ and S_A^R = S_A.


The Eigenspace

Given any eigenvalue λ, let E_λ := {x ∈ R^n \ {0} | Ax = λx} denote the associated set of eigenvectors.

Given any two eigenvectors x, y ∈ E_λ and any two scalars α, β ∈ R, note that

A(αx + βy) = αAx + βAy = αλx + βλy = λ(αx + βy)

Hence the linear combination αx + βy, unless it is 0, is also an eigenvector in E_λ.

It follows that the set E_λ ∪ {0} is a linear subspace of R^n, which we call the eigenspace associated with the eigenvalue λ.


Powers of a Matrix

Theorem. Suppose that (λ, x) is an eigenpair of the n × n matrix A. Then A^m x = λ^m x for all m ∈ N.

Proof. By definition, Ax = λx. Premultiplying each side of this equation by the matrix A gives

A²x = A(Ax) = A(λx) = λ(Ax) = λ(λx) = λ²x

As the induction hypothesis, suppose that A^{m−1} x = λ^{m−1} x for some m ∈ {2, 3, . . .}. Premultiplying each side of this last equation by the matrix A gives

A^m x = A(A^{m−1} x) = A(λ^{m−1} x) = λ^{m−1}(Ax) = λ^{m−1}(λx) = λ^m x

This completes the proof by induction on m.
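As a quick numerical check of this theorem, here is a NumPy sketch; the matrix A and the eigenpair (3, (1, 1)) are illustrative choices, not taken from the notes:

```python
import numpy as np

# If (lam, x) is an eigenpair of A, then A^m x = lam^m x for every m.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam = 3.0
x = np.array([1.0, 1.0])                 # A x = (3, 3) = 3 x, so (3, x) is an eigenpair

m = 5
lhs = np.linalg.matrix_power(A, m) @ x   # A^5 x
rhs = lam**m * x                         # 3^5 x
assert np.allclose(lhs, rhs)
```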

Characteristic Equation

The equation Ax = λx holds for x ≠ 0 if and only if x ≠ 0 solves (A − λI)x = 0.

This holds iff the matrix A − λI is singular, which holds iff λ is a characteristic root — i.e., it solves the characteristic equation |A − λI| = 0.

Equivalently, λ is a zero of the polynomial |A − λI| of degree n.

Suppose that |A − λI| = 0 has k distinct real roots λ1, λ2, . . . , λk, whose multiplicities are respectively m1, m2, . . . , mk, and that these account for all the roots. This means that

|A − λI| = (−1)^n (λ − λ1)^{m1} (λ − λ2)^{m2} · · · (λ − λk)^{mk} = (−1)^n ∏_{j=1}^{k} (λ − λj)^{mj}

The polynomial has degree m1 + m2 + · · · + mk, which equals n.

This implies that k ≤ n, so there can be at most n distinct real eigenvalues.

Eigenvalues of a 2 × 2 Matrix

Consider the 2 × 2 matrix A = ( a11 a12 ; a21 a22 ) (rows separated by semicolons).

The characteristic equation for its eigenvalues is

|A − λI| = | a11 − λ  a12 ; a21  a22 − λ | = 0

Evaluating the determinant gives the equation

0 = (a11 − λ)(a22 − λ) − a12 a21
  = λ² − (a11 + a22)λ + (a11 a22 − a12 a21)
  = λ² − (tr A)λ + |A| = (λ − λ1)(λ − λ2)

where the two roots λ1 and λ2 of the quadratic equation have:

• a sum λ1 + λ2 equal to the trace tr A of A (the sum of its diagonal elements);
• a product λ1 · λ2 equal to the determinant |A| of A.

Let Λ denote the diagonal matrix diag(λ1, λ2) whose diagonal elements are the eigenvalues. Note that tr A = tr Λ and |A| = |Λ|.
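The trace and determinant relations are easy to confirm numerically (a NumPy sketch; the matrix is an arbitrary example, not one from the notes):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# The eigenvalues are the roots of lambda^2 - (tr A) lambda + |A| = 0.
eigvals = np.linalg.eigvals(A)
assert np.isclose(eigvals.sum(), np.trace(A))         # lambda1 + lambda2 = tr A
assert np.isclose(eigvals.prod(), np.linalg.det(A))   # lambda1 * lambda2 = |A|
```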

The Case of a Diagonal Matrix, I

For the diagonal matrix D = diag(d1, d2, . . . , dn), the characteristic equation |D − λI| = 0 takes the degenerate form ∏_{k=1}^{n} (dk − λ) = 0.

So the spectrum S_D equals {d1, d2, . . . , dn}, the set of diagonal elements. #S_D could be any number between 1 and n.

The ith component of the vector equation Dx = d_k x takes the form d_i x_i = d_k x_i, which has a non-trivial solution if and only if d_i = d_k.

The kth vector e_k = (δ_jk)_{j=1}^{n} of the canonical orthonormal basis of R^n always solves the equation Dx = d_k x, and so is an eigenvector associated with the eigenvalue d_k.

The Case of a Diagonal Matrix, II

Apart from non-zero multiples of e_k, there are other eigenvectors associated with d_k only if a different element d_i of the diagonal also equals d_k.

In fact, the eigenspace associated with each eigenvalue d_k equals the space spanned by the set {e_i | d_i = d_k} of canonical basis vectors.

Example. In case D = diag(1, 1, 0) the spectrum is {0, 1}, with:

• the one-dimensional eigenspace E_0 = {x3 (0, 0, 1)⊤ | x3 ∈ R};
• the two-dimensional eigenspace E_1 = {x1 (1, 0, 0)⊤ + x2 (0, 1, 0)⊤ | (x1, x2) ∈ R²}.

Characterizing 2 × 2 Orthogonal Matrices

By definition, an orthogonal matrix P satisfies P⊤P = PP⊤ = I.

In the 2 × 2 case when P = (p_ij)_{2×2}, the matrix PP⊤ equals

( p11 p12 ; p21 p22 )( p11 p21 ; p12 p22 ) = ( (p11)² + (p12)²  p11 p21 + p12 p22 ; p21 p11 + p22 p12  (p21)² + (p22)² )

Here we use trigonometric identities and put p11 = cos θ, p22 = cos η, p12 = − sin θ, p21 = sin η. This makes PP⊤ equal

( cos θ  − sin θ ; sin η  cos η )( cos θ  sin η ; − sin θ  cos η ) = ( 1  sin(η − θ) ; sin(η − θ)  1 )

This equals the identity matrix ( 1 0 ; 0 1 ) if and only if sin(η − θ) = 0. Taking η = θ makes P equal to the rotation matrix

Rθ := ( cos θ  − sin θ ; sin θ  cos θ )

(The other solution, η = θ + π, makes P a reflection rather than a rotation.)

Rotations Illustrated

Illustrating Pθ := Rθ (1, 0)⊤ = ( cos θ  − sin θ ; sin θ  cos θ )(1, 0)⊤ = (cos θ, sin θ)⊤.

Also Pθ+½π = Rθ (0, 1)⊤ = ( cos θ  − sin θ ; sin θ  cos θ )(0, 1)⊤ = (− sin θ, cos θ)⊤.

Rotations in Polar Coordinates

Recall that a 2-dimensional rotation matrix takes the form

Rθ := ( cos θ  − sin θ ; sin θ  cos θ )

for θ ∈ R, which is the angle of rotation measured in radians.

The rotation Rθ transforms any vector x = (x1, x2) ∈ R² to

Rθ x = ( cos θ  − sin θ ; sin θ  cos θ )(x1, x2)⊤ = (x1 cos θ − x2 sin θ, x1 sin θ + x2 cos θ)⊤

Introduce polar coordinates (r, η), where x = (x1, x2) = r(cos η, sin η). Then

Rθ x = r (cos η cos θ − sin η sin θ, cos η sin θ + sin η cos θ)⊤ = r (cos(η + θ), sin(η + θ))⊤

This makes it easy to verify that Rθ+2kπ = Rθ for all θ ∈ R and k ∈ Z, and that Rθ Rη = Rη Rθ = Rθ+η for all θ, η ∈ R.
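The composition rule Rθ Rη = Rθ+η can be confirmed numerically (a NumPy sketch; the angles are arbitrary):

```python
import numpy as np

def rotation(theta):
    """The 2-D rotation matrix R_theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s,  c]])

theta, eta = 0.7, 1.9
# Composing rotations adds the angles, in either order.
assert np.allclose(rotation(theta) @ rotation(eta), rotation(theta + eta))
assert np.allclose(rotation(eta) @ rotation(theta), rotation(theta + eta))
# A full turn changes nothing: R_{theta + 2 pi} = R_theta.
assert np.allclose(rotation(theta + 2 * np.pi), rotation(theta))
```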

Does a Rotation Matrix Have Real Eigenvalues?

The characteristic equation |Rθ − λI| = 0 takes the form

0 = | cos θ − λ  − sin θ ; sin θ  cos θ − λ | = (cos θ − λ)² + sin² θ = 1 − 2λ cos θ + λ²

1. A degenerate case occurs when θ = kπ for some k ∈ Z, so that sin θ = 0 and cos θ = ±1. Then Rθ reduces to ±I2, whose only eigenvalue is the real number ±1.

2. Otherwise sin θ ≠ 0, and the real matrix Rθ has no real eigenvalues. Indeed, the characteristic equation then has the two distinct complex conjugate roots λ = cos θ ± i sin θ = e^{±iθ}.

The associated eigenspaces must both consist of complex eigenvectors.

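Again numerically (a NumPy sketch with an arbitrary angle): for θ not a multiple of π, the eigenvalues of Rθ are the complex conjugate pair e^{±iθ}.

```python
import numpy as np

theta = 1.0                                # not a multiple of pi
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

eigvals = np.linalg.eigvals(R)             # complex, even though R is real
expected = np.array([np.exp(-1j * theta), np.exp(1j * theta)])
assert np.allclose(np.sort_complex(eigvals), np.sort_complex(expected))
```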

Complex Eigenvalues

To consider complex eigenvalues properly, we need to leave R^n and consider instead the linear space C^n, whose elements are n-vectors with complex coordinates.

That is, we consider a linear space whose field of scalars is the plane C of complex numbers, rather than the line R of real numbers.

Suppose A is any n × n matrix whose elements may be real or complex.

The complex scalar λ ∈ C is an eigenvalue just in case the equation Ax = λx has a non-zero solution, in which case that solution x ∈ C^n \ {0} is an eigenvector.


Fundamental Theorem of Algebra

Theorem. Let P(λ) = λ^n + ∑_{k=0}^{n−1} p_k λ^k be a polynomial function of λ of degree n in the complex plane C. Then there exists at least one root λ ∈ C such that P(λ) = 0.

Corollary. The polynomial P(λ) can be factorized as the product P_n(λ) ≡ ∏_{r=1}^{n} (λ − λ_r) of exactly n linear terms.

Proof. The proof will be by induction on n.

When n = 1 one has P1(λ) = λ + p0, whose only root is λ = −p0.

Suppose the result is true when n = m − 1.

By the fundamental theorem of algebra, there exists λ̄ ∈ C such that P_m(λ̄) = 0. Polynomial division then gives P_m(λ) ≡ P_{m−1}(λ)(λ − λ̄), etc.


Characteristic Roots as Eigenvalues

Theorem. Every n × n matrix A ∈ C^{n×n} with complex elements has exactly n eigenvalues (real or complex), corresponding to the roots, counting multiple roots, of the characteristic equation |A − λI| = 0.

Proof. The characteristic equation can be written in the form P_n(λ) = 0, where P_n(λ) ≡ |λI − A| is a polynomial of degree n.

By the fundamental theorem of algebra, together with its corollary, the polynomial |λI − A| equals the product ∏_{r=1}^{n} (λ − λ_r) of n linear terms.

For any of these roots λ_r, the matrix A − λ_r I is singular. So there exists x ≠ 0 such that (A − λ_r I)x = 0, or Ax = λ_r x, implying that λ_r is an eigenvalue.


Linear Independence of Eigenvectors

The following theorem tells us that eigenvectors associated with distinct eigenvalues must be linearly independent.

Theorem. Let {λ_k}_{k=1}^{m} = {λ1, λ2, . . . , λm} be any collection of m ≤ n distinct eigenvalues. Then any corresponding set {x_k}_{k=1}^{m} of associated eigenvectors must be linearly independent.

The proof will be by induction on m.

Because x1 ≠ 0, the set {x1} is linearly independent. So the result is evidently true when m = 1.

As the induction hypothesis, suppose the result holds for m − 1.


Completing the Proof by Induction, I

Suppose that one solution of the equation Ax = λ_m x, which may be zero, is the linear combination x = ∑_{k=1}^{m−1} α_k x_k of the preceding m − 1 eigenvectors. Hence

Ax = λ_m x = ∑_{k=1}^{m−1} α_k λ_m x_k

Then the hypothesis that {(λ_k, x_k)}_{k=1}^{m−1} is a collection of eigenpairs implies that this x satisfies

Ax = ∑_{k=1}^{m−1} α_k A x_k = ∑_{k=1}^{m−1} α_k λ_k x_k

Subtracting this equation from the prior equation gives

0 = ∑_{k=1}^{m−1} α_k (λ_m − λ_k) x_k


Completing the Proof by Induction, II

So we have

0 = ∑_{k=1}^{m−1} α_k (λ_m − λ_k) x_k

The induction hypothesis is that the set {x_k}_{k=1}^{m−1} of distinct eigenvectors is linearly independent, implying that

α_k (λ_m − λ_k) x_k = 0 for k = 1, . . . , m − 1

But we are assuming that λ_m ∉ {λ_k}_{k=1}^{m−1}, so λ_m − λ_k ≠ 0 for k = 1, . . . , m − 1.

It follows that α_k = 0 for k = 1, . . . , m − 1.

We have proved that if x = ∑_{k=1}^{m−1} α_k x_k solves Ax = λ_m x, then x = 0, so x is not an eigenvector.

This completes the proof by induction that no eigenvector x ∈ E_{λ_m} can be a linear combination of the eigenvectors x_k ∈ E_{λ_k} (k = 1, . . . , m − 1).


Similar Matrices

Definition. The two n × n matrices A and B are similar just in case there exists an invertible n × n matrix S such that the following three equivalent statements all hold:

B = S⁻¹AS ⇐⇒ SB = AS ⇐⇒ A = SBS⁻¹

in which case we write A ∼ B.


Similarity is an Equivalence Relation

Theorem. The similarity relation is an equivalence relation — i.e., ∼ is:

• reflexive: A ∼ A;
• symmetric: A ∼ B ⇐⇒ B ∼ A;
• transitive: A ∼ B & B ∼ C =⇒ A ∼ C.

Proof. The proofs that ∼ is reflexive and symmetric are elementary.

Suppose that A ∼ B and B ∼ C. By definition, there exist invertible matrices S and T such that B = S⁻¹AS and C = T⁻¹BT.

Define U := ST, which is invertible with U⁻¹ = T⁻¹S⁻¹.

Then C = T⁻¹(S⁻¹AS)T = (T⁻¹S⁻¹)A(ST) = U⁻¹AU. So A ∼ C.


Similar Matrices Have Identical Spectra

Theorem. If A ∼ B then S_A = S_B.

Proof. Suppose that A = SBS⁻¹ and that (λ, x) is an eigenpair of A. Then x ≠ 0 solves Ax = SBS⁻¹x = λx.

Premultiplying each side of the equation SBS⁻¹x = λx by S⁻¹, it follows that y := S⁻¹x solves By = λy.

Moreover, because S⁻¹ has the inverse S, the equation S⁻¹x = y would have only the trivial solution x = Sy = 0 in case y = 0. Hence y ≠ 0, implying that (λ, y) is an eigenpair of B.

A symmetric argument shows that if (λ, y) is an eigenpair of B = S⁻¹AS, then (λ, Sy) is an eigenpair of A.

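A numerical check that similarity preserves the spectrum (a NumPy sketch with randomly generated A and S, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
S = rng.standard_normal((4, 4))       # almost surely invertible
B = np.linalg.inv(S) @ A @ S          # B = S^{-1} A S, so A ~ B

# A and B have the same eigenvalues (sorted here to compare as multisets).
eig_A = np.sort_complex(np.linalg.eigvals(A))
eig_B = np.sort_complex(np.linalg.eigvals(B))
assert np.allclose(eig_A, eig_B)
```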

Diagonalization

Definition. An n × n matrix A is diagonalizable just in case it is similar to a diagonal matrix Λ = diag(λ1, λ2, . . . , λn).

Theorem. Given any diagonalizable n × n matrix A:

1. The columns of any matrix S that diagonalizes A must consist of n linearly independent eigenvectors of A.

2. The matrix A is diagonalizable if and only if it has a set of n linearly independent eigenvectors.

3. The matrix A and its diagonalization Λ = S⁻¹AS have the same set of eigenvalues.


Proof of Part 1

Suppose that AS = SΛ, where A = (a_ij)_{n×n}, S = (s_ij)_{n×n}, and Λ = diag(λ1, λ2, . . . , λn).

Then for each i, k ∈ {1, 2, . . . , n}, equating the elements in row i and column k of the equal matrices AS and SΛ implies that

∑_{j=1}^{n} a_ij s_jk = ∑_{j=1}^{n} s_ij δ_jk λ_k = s_ik λ_k

It follows that A s_k = λ_k s_k, where s_k = (s_ik)_{i=1}^{n} denotes the kth column of the matrix S.

Because S must be invertible:

• each column s_k must be non-zero, and so an eigenvector of A;
• the set of all these n columns must be linearly independent.


Proofs of Parts 2 and 3

Proof of Part 2: By Part 1, if the diagonalizing matrix S exists, its columns must form a set of n linearly independent eigenvectors of the matrix A.

Conversely, suppose that A does have a set {x1, x2, . . . , xn} of n linearly independent eigenvectors, with Ax_k = λ_k x_k for k = 1, 2, . . . , n.

Now define S as the n × n matrix whose kth column is the eigenvector x_k, for each k = 1, 2, . . . , n. Then it is easy to check that AS = SΛ, where Λ = diag(λ1, λ2, . . . , λn).

Proof of Part 3: This follows from the general property that similar matrices have the same spectrum of eigenvalues.

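Parts 1 and 2 can be illustrated numerically (a NumPy sketch; the matrix is an arbitrary example with distinct eigenvalues):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# np.linalg.eig returns the eigenvalues and a matrix S whose columns are
# corresponding eigenvectors; with distinct eigenvalues these columns are
# linearly independent, so S is invertible.
lam, S = np.linalg.eig(A)
assert np.linalg.matrix_rank(S) == 2

Lam = np.linalg.inv(S) @ A @ S           # S^{-1} A S ...
assert np.allclose(Lam, np.diag(lam))    # ... is the diagonal matrix of eigenvalues
```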

Complex Conjugates and Adjoint Matrices

Recall that any complex number c ∈ C can be expressed as a + ib, with a ∈ R as the real part and b ∈ R as the imaginary part.

The complex conjugate of c is c̄ = a − ib.

Note that c c̄ = c̄ c = (a + ib)(a − ib) = a² + b² = |c|², where |c| is the modulus of c.

Any m × n complex matrix C = (c_jk)_{m×n} ∈ C^{m×n} can be written as A + iB, where A and B are real m × n matrices.

The adjoint of the m × n complex matrix C = A + iB is the n × m complex matrix C* := (A − iB)⊤ = A⊤ − iB⊤.

This is the transpose of the matrix A − iB whose elements are the complex conjugates c̄_jk of the corresponding elements of C. That is, each element of C* is given by c*_jk = a_kj − i b_kj.

In the case of a real matrix A, whose imaginary part is 0, its adjoint is simply the transpose A⊤.


Self-Adjoint and Symmetric Matrices

An n × n complex matrix C = A + iB is self-adjoint just in case C* = C, which holds if and only if A⊤ − iB⊤ = A + iB, and so if and only if:

• the real part A is symmetric;
• the imaginary part B is anti-symmetric, in the sense that B⊤ = −B.

Of course, a real matrix is self-adjoint if and only if it is symmetric.

Theorem. Any eigenvalue of a self-adjoint complex matrix is a real scalar.

Corollary. Any eigenvalue of a symmetric real matrix is a real scalar.


Proof that Eigenvalues of a Self-Adjoint Matrix are Real

Suppose that the scalar λ ∈ C and vector x ∈ C^n together satisfy the eigenvalue equation Ax = λx for any A ∈ C^{n×n}.

Taking adjoints throughout, one has x*A* = λ̄ x*.

By the associative law of complex matrix multiplication, one has x*Ax = x*(Ax) = x*(λx) = λ(x*x), as well as x*A*x = (x*A*)x = (λ̄ x*)x = λ̄(x*x).

In case A is self-adjoint, so that A* = A, subtracting the second equation from the first gives

x*Ax − x*A*x = x*(A − A*)x = 0 = (λ − λ̄)(x*x)

But in case x is an eigenvector, one has x = (x_i)_{i=1}^{n} ∈ C^n \ {0}, and so x*x = ∑_{i=1}^{n} |x_i|² > 0.

Because 0 = (λ − λ̄) x*x, it follows that the eigenvalue λ satisfies λ − λ̄ = 0, implying that λ is real.

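A numerical illustration (a NumPy sketch; the self-adjoint matrix below is an arbitrary example, not one from the notes):

```python
import numpy as np

# Self-adjoint: real part symmetric, imaginary part anti-symmetric.
C = np.array([[2.0 + 0j, 1.0 - 3j],
              [1.0 + 3j, 5.0 + 0j]])
assert np.allclose(C, C.conj().T)        # C* = C

eigvals = np.linalg.eigvals(C)
# Despite the complex entries, every eigenvalue is real.
assert np.allclose(eigvals.imag, 0.0)
```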

Orthogonal and Orthonormal Sets of Vectors

Recall our earlier definition:

Definition. A set of k vectors {x1, x2, . . . , xk} ⊂ R^n is said to be:

• pairwise orthogonal just in case x_i · x_j = 0 whenever j ≠ i;
• orthonormal just in case, in addition, each ‖x_i‖ = 1 — i.e., all k elements of the set are vectors of unit length.

Equivalently, the set of k vectors {x1, x2, . . . , xk} ⊂ R^n is orthonormal just in case x_i · x_j = δ_ij for all pairs i, j ∈ {1, 2, . . . , k}.


Orthogonal Matrices: Recall

Definition. An n × n matrix is orthogonal just in case its n columns (or rows) form an orthonormal set.

Theorem. Given any n × n matrix P, the following are equivalent:

1. P is orthogonal;
2. PP⊤ = P⊤P = I;
3. P⁻¹ = P⊤;
4. P⊤ is orthogonal.


The Complex Case: Self-Adjoint and Unitary Matrices

We briefly consider matrices with complex elements.

Recall that the adjoint A* of an m × n matrix A is the matrix formed from the transpose A⊤ by taking the complex conjugate of each element.

The appropriate extension to complex numbers of:

• a symmetric matrix satisfying A⊤ = A is a self-adjoint matrix satisfying A* = A;
• an orthogonal matrix satisfying P⁻¹ = P⊤ is a unitary matrix satisfying U⁻¹ = U*.


Orthogonal Projection Matrices

Definition. An n × n matrix P is an orthogonal projection if P² = P and u⊤v = 0 whenever Pv = 0 and u = Px for some x ∈ R^n.

Theorem. Suppose that the n × m matrix X has full rank m < n. Let L ⊂ R^n be the linear subspace spanned by the m linearly independent columns of X. Define the n × n matrix P := X(X⊤X)⁻¹X⊤. Then:

1. The matrix P is a symmetric orthogonal projection onto L.

2. The matrix I − P is a symmetric orthogonal projection onto the orthogonal complement L⊥ of L.

3. For each vector y ∈ R^n, its orthogonal projection onto L is the unique vector v = Py ∈ L that minimizes the distance ‖y − v‖ between y and L — i.e., ‖y − v‖ ≤ ‖y − u‖ for all u ∈ L.


Orthogonal Complements

Definition. A subset L ⊆ R^n is a linear subspace just in case λx + µy ∈ L for every pair of vectors x, y in L and every pair of scalars λ, µ in R.

Definition. Given any linear subspace L of R^n, its orthogonal complement L⊥ is the set of all vectors y ∈ R^n such that x · y = 0 for all x ∈ L.


Two Examples

Example. Suppose that L is the space spanned by any finite subset {e_i | i ∈ I} of the canonical basis {e_i | i = 1, 2, . . . , n} of R^n. Then L⊥ is the space spanned by the complementary set {e_i | i ∉ I} of canonical basis vectors.

Example. Any c ≠ 0 in R^n generates the straight line L(c) = {λc | λ ∈ R}, which is a one-dimensional linear subspace of R^n. Its orthogonal complement L(c)⊥ = {x | c · x = 0} is an (n − 1)-dimensional subspace: the unique hyperplane in R^n that has c as a normal.


Proof of Part 1

First note that if X is an n × m matrix, then X⊤X is m × m. Then, provided that (X⊤X)⁻¹ exists, so does the n × n matrix P = X(X⊤X)⁻¹X⊤.

Because of the rules for the transposes of products and inverses, the definition P := X(X⊤X)⁻¹X⊤ implies that P⊤ = P, and also

P² = X(X⊤X)⁻¹X⊤X(X⊤X)⁻¹X⊤ = X(X⊤X)⁻¹X⊤ = P

Moreover, if Pv = 0 and u = Px for some x ∈ R^n, then

u · v = u⊤v = x⊤P⊤v = x⊤Pv = 0

Finally, for every y ∈ R^n, the vector Py equals Xb, where b = (X⊤X)⁻¹X⊤y. Hence Py ∈ L.


Proof of Part 2

Evidently (I − P)⊤ = I − P⊤ = I − P, and

(I − P)² = I − 2P + P² = I − 2P + P = I − P

Hence I − P is a projection. This projection is also orthogonal because, if (I − P)v = 0 and u = (I − P)x for some x ∈ R^n, then

u · v = u⊤v = x⊤(I − P)⊤v = x⊤(I − P)v = 0

Next, suppose that v = Xb ∈ L and that y = (I − P)x belongs to the range of I − P. Because PX = X(X⊤X)⁻¹X⊤X = X, one has

y · v = y⊤v = x⊤(I − P)⊤Xb = x⊤Xb − x⊤PXb = x⊤Xb − x⊤Xb = 0

Hence y ∈ L⊥.


Proof of Part 3

For any vector v = Xb ∈ L and any y ∈ R^n, because y⊤Xb and b⊤X⊤y are equal scalars, one has

‖y − v‖² = (y − Xb)⊤(y − Xb) = y⊤y − 2y⊤Xb + b⊤X⊤Xb

Now define b̂ := (X⊤X)⁻¹X⊤y (which is the OLS estimator of b in the linear regression equation y = Xb + e) and v̂ := Xb̂ = Py.

Because P⊤P = P⊤ = P = P², one has

‖y − v‖² = y⊤y − 2y⊤Xb + b⊤X⊤Xb
         = (b − b̂)⊤X⊤X(b − b̂) + y⊤y − b̂⊤X⊤Xb̂
         = ‖v − v̂‖² + y⊤y − y⊤P⊤Py = ‖v − v̂‖² + y⊤y − y⊤Py

On the other hand, given that v̂ = Py, one also has

‖y − v̂‖² = y⊤y − 2y⊤v̂ + v̂⊤v̂ = y⊤y − 2y⊤Py + y⊤P⊤Py = y⊤y − y⊤Py

So ‖y − v‖² − ‖y − v̂‖² = ‖v − v̂‖² ≥ 0, with equality iff v = v̂.
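The three parts of the theorem can be checked numerically (a NumPy sketch; X and y are arbitrary random data):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((6, 2))           # full column rank: m = 2 < n = 6
y = rng.standard_normal(6)

P = X @ np.linalg.inv(X.T @ X) @ X.T      # P = X (X'X)^{-1} X'
assert np.allclose(P, P.T)                # symmetric
assert np.allclose(P @ P, P)              # idempotent: P^2 = P

v_hat = P @ y                             # orthogonal projection of y onto L
# The residual y - v_hat lies in the orthogonal complement of L ...
assert np.allclose(X.T @ (y - v_hat), 0.0)
# ... and v_hat = X b_hat, where b_hat is the OLS coefficient vector.
b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(v_hat, X @ b_hat)
```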


A Trick Function for Generating Eigenvalues

Given a symmetric n × n matrix A, for all x ≠ 0 define the Rayleigh quotient function

R^n \ {0} ∋ x ↦ f(x) := (x⊤Ax)/(x⊤x) = ∑_{i=1}^{n} ∑_{j=1}^{n} x_i a_ij x_j / ∑_{i=1}^{n} (x_i)²

It is homogeneous of degree zero, and left undefined at x = 0.

Its partial derivative w.r.t. any component x_h of the vector x is

∂f/∂x_h = [2/(x⊤x)²] [ (∑_{j=1}^{n} a_hj x_j)(x⊤x) − (x⊤Ax) x_h ]

At any critical point x ≠ 0 where ∂f/∂x_h = 0 for all h, one therefore has (x⊤x)Ax = (x⊤Ax)x. Hence Ax = λx, where λ = (x⊤Ax)/(x⊤x) = f(x).

That is, a stationary point x ≠ 0 must be an eigenvector, with the corresponding function value f(x) as the associated eigenvalue.

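Numerically (a NumPy sketch with an illustrative symmetric matrix, not one from the notes):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])               # symmetric, with eigenvalues 1 and 3

def f(x):
    """Rayleigh quotient x'Ax / x'x."""
    return (x @ A @ x) / (x @ x)

# At an eigenvector, the quotient equals the associated eigenvalue ...
assert np.isclose(f(np.array([1.0,  1.0])), 3.0)
assert np.isclose(f(np.array([1.0, -1.0])), 1.0)
# ... and every value of f lies between the extreme eigenvalues.
x = np.array([0.3, -2.0])
assert 1.0 <= f(x) <= 3.0
```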

More Properties of the Rayleigh Quotient

Using the Rayleigh quotient

R^n \ {0} ∋ x ↦ f(x) := (x⊤Ax)/(x⊤x) = ∑_{i=1}^{n} ∑_{j=1}^{n} x_i a_ij x_j / ∑_{i=1}^{n} (x_i)²

one can state and prove the following lemma.

Lemma. Every symmetric n × n matrix A:

1. has a maximum eigenvalue λ*, with λ* real, and any associated eigenvector x* as a maximum point of f;

2. has a minimum eigenvalue λ_*, with λ_* real, and any associated eigenvector x_* as a minimum point of f;

3. satisfies A = λI if and only if λ* = λ_* = λ.


Proof of Parts 1 and 2

The unit sphere S^{n−1} is a closed and bounded subset of R^n. Moreover, the Rayleigh quotient function f is continuous when restricted to S^{n−1}.

By the extreme value theorem, f restricted to S^{n−1} must have:

• a maximum value λ* attained at some point x*;
• a minimum value λ_* attained at some point x_*.

Because f is homogeneous of degree zero, these are the maximum and minimum values of f over the whole domain R^n \ {0}.

In particular, f must be critical at any maximum point x*, as well as at any minimum point x_*. But critical points must be eigenvectors.

This proves Parts 1 and 2 of the lemma. Part 3 is left as an exercise.


A Minor Lemma

Lemma. Let A be a symmetric n × n matrix. Suppose that λ and µ are distinct eigenvalues, with corresponding eigenvectors x and y.

Then x and y are orthogonal — that is, x · y = 0.

Proof. Suppose that the non-zero vectors x and y satisfy Ax = λx and Ay = µy.

Because A is symmetric, one has

λx>y = (Ax)>y = x>A>y = x>Ay = µx>y

In case λ ≠ µ, it follows that 0 = x>y = x · y. □
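The lemma is easy to check numerically; the sketch below (an illustrative addition, using an arbitrary 2 × 2 example) verifies both eigenpairs and the orthogonality x · y = 0.

```python
# A = [[2, 1], [1, 2]] is symmetric with distinct eigenvalues 3 and 1.
A = [[2, 1], [1, 2]]

def matvec(A, x):
    """Matrix-vector product Ax."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

x = [1, 1]    # eigenvector for lambda = 3
y = [1, -1]   # eigenvector for mu = 1

assert matvec(A, x) == [3, 3]    # Ax = 3x
assert matvec(A, y) == [1, -1]   # Ay = 1*y

# Distinct eigenvalues => orthogonal eigenvectors.
assert sum(xi * yi for xi, yi in zip(x, y)) == 0
```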


A Useful Lemma

Lemma. Let A be a symmetric n × n matrix. Suppose that there are m < n eigenvectors {uk}_{k=1}^m which form an orthonormal set of column n-vectors, as well as the columns of an n × m matrix U.

Then there is at least one more eigenvector x ≠ 0 that satisfies U>x = 0 — i.e., it is orthogonal to each of the m eigenvectors uk.


Constructive Proof, Part 1

For each eigenvector uk, let λk be the associated eigenvalue, so that Auk = λk uk for k = 1, 2, . . . , m.

Then the n × m matrix U satisfies AU = UΛ, where Λ is the m × m diagonal matrix diag(λk)_{k=1}^m.

Also, because the eigenvectors {uk}_{k=1}^m form an orthonormal set, one has U>U = Im.

Hence U>AU = U>UΛ = Λ.

Also, transposing AU = UΛ and using the symmetry of both A and Λ gives U>A = ΛU>.


Constructive Proof, Part 2

Consider now the n × n matrix Â := (In − UU>)A(In − UU>).

Then Â is symmetric because both A and UU> are symmetric.

Note that, because AU = UΛ and so U>A = ΛU>, one has

Â = A − UU>A − AUU> + UU>AUU> = A − UΛU> − UΛU> + UΛU> = A − UΛU>

The symmetric matrix Â has at least one real eigenvalue λ.

Let x ≠ 0 be an associated real eigenvector, which must satisfy

Âx = (I − UU>)A(I − UU>)x = (A − UΛU>)x = λx

Pre-multiplying each side of the last equality by the m × n matrix U> shows that

λU>x = U>Ax − U>UΛU>x = ΛU>x − ΛU>x = 0m


Constructive Proof, Part 3

There are now two cases.

Consider first the generic case when the symmetric matrix Â = (I − UU>)A(I − UU>) has at least one eigenvalue λ ≠ 0.

Then there is a corresponding eigenvector x ≠ 0 of Â that satisfies λU>x = 0m, and so U>x = 0m.

But then the earlier equation

Âx = (I − UU>)A(I − UU>)x = (A − UΛU>)x = λx

implies that

Ax = (Â + UΛU>)x = Âx = λx

Hence x is an eigenvector of A as well as of Â.


Constructive Proof, Part 4

The remaining exceptional case occurs when the only eigenvalue of the symmetric matrix Â = A − UΛU> is λ = 0.

By part 3 of the lemma on the Rayleigh quotient, this implies that Â = 0, and so A = UΛU>.

Then any vector x ≠ 0 satisfying U>x = 0 must satisfy Ax = UΛU>x = 0.

This implies that x is an eigenvector of A associated with the eigenvalue λ = 0.

In both cases there is an eigenvector x of A, associated with the common eigenvalue λ of both A and Â, that satisfies U>x = 0m.

This completes the proof. □


Spectral Theorem

Theorem. Given any symmetric n × n matrix A:

1. the set of all its eigenvectors spans the whole of Rn;

2. there exists an orthogonal matrix P that diagonalizes A, in the sense that P>AP = P−1AP is a diagonal matrix Λ whose elements are the eigenvalues of A, all real.

When A is allowed to be self-adjoint rather than symmetric, so the eigenvectors may be complex, the corresponding result is:

Given any self-adjoint n × n matrix A:

1. the set of all its eigenvectors spans the whole of Cn;

2. there exists a unitary matrix U that diagonalizes A, in the sense that U*AU = U−1AU is a diagonal matrix Λ whose elements are the eigenvalues of A, all real.

We give a proof for the case when A is symmetric.


Proof of Spectral Theorem, Part 1

The symmetric matrix A has at least one eigenvalue λ, which must be real.

The associated eigenvector x, normalized so that x>x = 1, forms an orthonormal set {u1} of just one vector.

As the induction hypothesis, suppose that there are m < n eigenvectors {uk}_{k=1}^m which form an orthonormal set — i.e., u>j uk = δjk for all j, k = 1, 2, . . . , m.

We have just proved that this hypothesis holds for m = 1.

The “useful lemma” shows that, if the hypothesis holds for any m = 1, 2, . . . , n − 1, then it holds for m + 1.

So the result follows for m = n by induction.

In particular, when m = n, there exists an orthonormal setof n eigenvectors, which must then span the whole of Rn.


Proof of Spectral Theorem, Part 2

Also, by the previous result, we can take P to be an orthogonal matrix whose columns are an orthonormal set of n eigenvectors.

Then AP = PΛ.

So premultiplying by P> = P−1 gives P>AP = P−1AP = Λ. □
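To illustrate the conclusion numerically (an added sketch with an arbitrary 2 × 2 example), the normalized eigenvectors of A = [[2, 1], [1, 2]] form an orthogonal matrix P with P>AP = diag(3, 1):

```python
from math import sqrt

A = [[2.0, 1.0], [1.0, 2.0]]
s = 1.0 / sqrt(2.0)
# Columns of P are the orthonormal eigenvectors (1,1)/sqrt(2) and (1,-1)/sqrt(2).
P = [[s, s], [s, -s]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

L = matmul(transpose(P), matmul(A, P))  # P'AP, should be diag(3, 1)

assert abs(L[0][0] - 3.0) < 1e-12 and abs(L[1][1] - 1.0) < 1e-12
assert abs(L[0][1]) < 1e-12 and abs(L[1][0]) < 1e-12  # off-diagonal entries vanish
```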



Definition of Quadratic Form

Definition. A quadratic form on the n-dimensional Euclidean space Rn is a mapping

Rn ∋ x ↦ q(x) = x>Qx = ∑_{i=1}^n ∑_{j=1}^n xi qij xj ∈ R

where Q is a symmetric n × n matrix.

The quadratic form x>Qx is diagonal just in case the matrix Q is diagonal, with Q = Λ = diag(λ1, λ2, . . . , λn).

In this case x>Qx reduces to x>Λx = ∑_{i=1}^n λi (xi)².


The Hessian Matrix of a Quadratic Form

Example

1. Given the quadratic form q(x, y) = (x, y) [ a b ; c d ] (x, y)>, show that, even if the matrix [ a b ; c d ] is not symmetric, its Hessian matrix of second-order partial derivatives is the symmetric matrix

[ q″xx q″xy ; q″yx q″yy ] = [ 2a b + c ; b + c 2d ]

2. Given the quadratic form defined for x ∈ Rn by q(x) = x>Ax, show that, even if A is not symmetric, its Hessian matrix (∂²q/∂xi∂xj)_{n×n} of second-order partial derivatives is the symmetric matrix A + A>.
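The first part of the example can be confirmed by finite differences; the sketch below (an illustrative addition, with arbitrary non-symmetric coefficients a, b, c, d) recovers the Hessian entries 2a, b + c, and 2d of q(x, y) = ax² + (b + c)xy + dy².

```python
a, b, c, d = 1.0, 4.0, 2.0, 3.0   # arbitrary, non-symmetric since b != c

def q(x, y):
    # (x, y) [[a, b], [c, d]] (x, y)'
    return a * x * x + b * x * y + c * y * x + d * y * y

h = 1e-3
x0, y0 = 0.3, -0.7

# Central second differences (exact for quadratics, up to rounding).
qxx = (q(x0 + h, y0) - 2 * q(x0, y0) + q(x0 - h, y0)) / (h * h)
qyy = (q(x0, y0 + h) - 2 * q(x0, y0) + q(x0, y0 - h)) / (h * h)
qxy = (q(x0 + h, y0 + h) - q(x0 + h, y0 - h)
       - q(x0 - h, y0 + h) + q(x0 - h, y0 - h)) / (4 * h * h)

assert abs(qxx - 2 * a) < 1e-6        # q''xx = 2a
assert abs(qyy - 2 * d) < 1e-6        # q''yy = 2d
assert abs(qxy - (b + c)) < 1e-6      # q''xy = b + c
```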


Symmetry Loses No Generality

Requiring Q in x>Qx to be symmetric loses no generality.

This is because, given a general non-symmetric n × n matrix A, the scalar x>Ax equals its own transpose, so repeated transposition implies that

x>Ax = (x>Ax)> = ½ [x>Ax + (x>Ax)>] = ½ x>(A + A>)x

Hence x>Ax = x>A>x = x>Qx, where Q is the symmetrized matrix ½(A + A>).

Note that Q is indeed symmetric because

Q> = ½(A + A>)> = ½ [A> + (A>)>] = ½(A> + A) = Q
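This symmetrization can be verified directly; the sketch below (an added illustration with an arbitrary non-symmetric 2 × 2 matrix) checks that x>Ax = x>Qx for Q = ½(A + A>).

```python
A = [[1.0, 4.0], [2.0, 3.0]]                      # non-symmetric
Q = [[(A[i][j] + A[j][i]) / 2 for j in range(2)]  # Q = (A + A')/2
     for i in range(2)]

def qform(M, x):
    """x'Mx for a 2x2 matrix M."""
    return sum(x[i] * M[i][j] * x[j] for i in range(2) for j in range(2))

assert Q == [[1.0, 3.0], [3.0, 3.0]]              # symmetrized matrix
for x in [(1.0, 0.0), (0.0, 1.0), (1.5, -0.5)]:
    assert abs(qform(A, x) - qform(Q, x)) < 1e-12  # same quadratic form
```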


Definiteness of a Quadratic Form

When x = 0, then x>Qx = 0. Otherwise:

Definition. The quadratic form Rn ∋ x ↦ x>Qx ∈ R is:

positive definite just in case x>Qx > 0 for all x ∈ Rn \ {0};

negative definite just in case x>Qx < 0 for all x ∈ Rn \ {0};

positive semi-definite just in case x>Qx ≥ 0 for all x ∈ Rn;

negative semi-definite just in case x>Qx ≤ 0 for all x ∈ Rn;

indefinite just in case there exist x+ and x− in Rn such that (x+)>Qx+ > 0 and (x−)>Qx− < 0.


Definiteness of a Diagonal Quadratic Form

Theorem. The diagonal quadratic form ∑_{i=1}^n λi (xi)² is:

positive definite if and only if λi > 0 for i = 1, 2, . . . , n;

negative definite if and only if λi < 0 for i = 1, 2, . . . , n;

positive semi-definite if and only if λi ≥ 0 for i = 1, 2, . . . , n;

negative semi-definite if and only if λi ≤ 0 for i = 1, 2, . . . , n;

indefinite if and only if there exist i, j ∈ {1, 2, . . . , n} such that λi > 0 and λj < 0.

Proof. The proof is left as an exercise.

The result is obvious if n = 1, and straightforward if n = 2.

Working out these two cases first suggests the proof for n > 2.



Diagonalizing Quadratic Forms

Consider a quadratic form Rn ∋ x ↦ x>Qx ∈ R where, without losing generality, we assume that the n × n matrix Q is symmetric.

By the spectral theorem for symmetric matrices, there exists a matrix P that diagonalizes Q, meaning that P−1QP is a diagonal matrix that we denote by Λ.

Moreover, P can be made orthogonal, meaning that P−1 = P>.

Given any x ≠ 0, because P−1 exists, we can define y = P−1x.

This implies that x = Py, where y ≠ 0 because (P−1)−1 = P.

Then x>Qx = y>P>QPy = y>Λy, so the diagonalization leads to a diagonal quadratic form.
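As a concrete check (an added sketch, reusing the arbitrary example Q = [[2, 1], [1, 2]] with eigenvalues 3 and 1), the change of variables y = P>x turns x>Qx into the diagonal form 3y1² + y2²:

```python
from math import sqrt

Q = [[2.0, 1.0], [1.0, 2.0]]   # symmetric; eigenvalues 3 and 1
s = 1.0 / sqrt(2.0)

def qform(M, v):
    return sum(v[i] * M[i][j] * v[j] for i in range(2) for j in range(2))

x = (0.7, -1.3)                              # arbitrary test point
# y = P'x, where the columns of P are (1,1)/sqrt(2) and (1,-1)/sqrt(2)
y = (s * (x[0] + x[1]), s * (x[0] - x[1]))

# x'Qx equals the diagonal quadratic form 3*y1^2 + 1*y2^2
assert abs(qform(Q, x) - (3.0 * y[0] ** 2 + 1.0 * y[1] ** 2)) < 1e-12
```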


The Eigenvalue Test of Definiteness

A standard result says that Q and its diagonalization P−1QP have the same set of eigenvalues.

From the theorem on the definiteness of a diagonal quadratic form, it follows that:

Theorem. The quadratic form x>Qx is:

positive definite if and only if all the eigenvalues of Q are positive;

negative definite if and only if all the eigenvalues of Q are negative;

positive semi-definite if and only if all the eigenvalues of Q are non-negative;

negative semi-definite if and only if all the eigenvalues of Q are non-positive;

indefinite if and only if Q has both positive and negative eigenvalues.
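For a 2 × 2 symmetric matrix [[a, h], [h, b]], the two eigenvalues are ((a + b) ± √((a − b)² + 4h²))/2, so the eigenvalue test is easy to apply; the sketch below (an added illustration with two arbitrary examples) classifies each matrix.

```python
from math import sqrt

def eig2(a, h, b):
    """Eigenvalues (smallest, largest) of the symmetric matrix [[a, h], [h, b]]."""
    t = a + b
    d = sqrt((a - b) ** 2 + 4 * h * h)
    return ((t - d) / 2, (t + d) / 2)

# [[2, 1], [1, 2]]: eigenvalues 1 and 3, both positive => positive definite.
lo, hi = eig2(2.0, 1.0, 2.0)
assert lo > 0 and hi > 0

# [[1, 2], [2, 1]]: eigenvalues -1 and 3 => indefinite.
lo2, hi2 = eig2(1.0, 2.0, 1.0)
assert lo2 < 0 < hi2
```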


Concavity or Convexity of a Quadratic Form

Theorem. As a function of x, the quadratic form x>Qx is:

strictly convex if and only if it is positive definite;

strictly concave if and only if it is negative definite;

convex if and only if it is positive semi-definite;

concave if and only if it is negative semi-definite.

Finally, x>Qx is neither concave nor convex if and only if it is indefinite.



The Case of a Quadratic Form in Two Variables

The general quadratic form in 2 variables is

(x, y) [ a h ; h b ] (x, y)> = ax² + 2hxy + by²

If it is positive definite, it is positive whenever x ≠ 0 and y = 0, implying that a > 0.

If a > 0, completing the square implies that ax² + 2hxy + by² = a(x + hy/a)² + (b − h²/a)y².

Given that a > 0, this is positive definite if and only if b > h²/a, or if and only if ab − h² = det [ a h ; h b ] > 0.

For the case of 2 variables, this proves Sylvester’s Criterion: the real symmetric matrix A is positive definite if and only if all the leading principal minors of A are positive.
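The completed-square identity above is an algebraic identity in x and y; the sketch below (an added check with arbitrary coefficients satisfying a > 0 and ab − h² > 0) confirms it at a few test points, and that the form stays positive there.

```python
def quad(a, h, b, x, y):
    return a * x * x + 2 * h * x * y + b * y * y

def completed(a, h, b, x, y):
    # a(x + hy/a)^2 + (b - h^2/a) y^2, valid whenever a != 0
    return a * (x + h * y / a) ** 2 + (b - h * h / a) * y * y

a, h, b = 3.0, 1.0, 2.0   # a > 0 and ab - h^2 = 5 > 0, so positive definite
for x, y in [(1.0, 0.0), (0.0, 1.0), (0.4, -1.1), (-2.0, 3.0)]:
    assert abs(quad(a, h, b, x, y) - completed(a, h, b, x, y)) < 1e-9
    assert quad(a, h, b, x, y) > 0   # positive at these nonzero points
```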


The Case of a Diagonal Quadratic Form

The general diagonal quadratic form in n variables is x>Λx, where x is an n-vector and Λ is the n × n diagonal matrix diag(λ1, . . . , λn).

Then the quadratic form x>Λx = ∑_{i=1}^n λi (xi)²:

1. is positive definite if and only if λi > 0 for i = 1, 2, . . . , n.

This is true if and only if the k-fold product ∏_{i=1}^k λi is positive for k = 1, 2, . . . , n.

But ∏_{i=1}^k λi = |diag(λ1, . . . , λk)| is the leading principal minor of order k for the diagonal matrix Λ = diag(λ1, . . . , λn).

2. is positive semi-definite if and only if λi ≥ 0 for i = 1, 2, . . . , n.

This is true if and only if the product ∏_{i∈K} λi is non-negative for every non-empty K ⊆ Nn = {1, 2, . . . , n}.

But each product ∏_{i∈K} λi is a principal minor for the diagonal matrix Λ = diag(λ1, . . . , λn).


Sylvester’s Criterion: General Statement

Theorem (Sylvester’s criterion). Given the symmetric matrix A, for each k = 1, . . . , n, and for each non-empty subset K ⊆ Nn with k = #K: let Dk denote the leading principal minor of order k, and let ∆K_k denote an arbitrary principal minor of order k (the one determined by K).

Then the quadratic form x>Ax is:

positive definite ⇐⇒ Dk > 0 for all k = 1, . . . , n

positive semi-definite ⇐⇒ ∆K_k ≥ 0 for all minors ∆K_k of any order k

negative definite ⇐⇒ (−1)^k Dk > 0 for all k = 1, . . . , n

negative semi-definite ⇐⇒ (−1)^k ∆K_k ≥ 0 for all minors ∆K_k of any order k

Note that the necessary and sufficient conditions for A to be negative (semi-)definite are exactly those for −A to be positive (semi-)definite.


Necessity of Sylvester’s Criterion

Necessity for a Positive (Semi-)Definite Matrix.

Given any non-empty subset K ⊆ Nn with k = #K, let AK denote the k × k matrix (aij)_{(i,j)∈K×K}.

Then, given any n-vector x, let xK denote (xj)_{j∈K} and let x−K denote (xj)_{j∉K}.

In case A is positive definite, whenever xK ≠ 0 and x−K = 0, then x>Ax = (xK)>AK xK > 0, so AK is positive definite.

Because the determinant of a symmetric matrix is the product of its eigenvalues, all of which are positive when the matrix is positive definite, it follows that |AK| > 0.

This holds in particular when K = Nk, in which case |AK| = Dk.

In case A is positive semi-definite, the same argument shows that AK is positive semi-definite, and so ∆K_k = |AK| ≥ 0.
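The necessity argument is easy to see numerically; the sketch below (an added illustration, reusing the arbitrary positive definite matrix [[2, 1], [1, 2]] with eigenvalues 1 and 3) checks that both leading principal minors are positive, and that D2 equals the product of the eigenvalues.

```python
A = [[2, 1], [1, 2]]   # positive definite: eigenvalues 1 and 3

D1 = A[0][0]                                 # leading principal minor of order 1
D2 = A[0][0] * A[1][1] - A[0][1] * A[1][0]   # det A = leading minor of order 2

assert D1 > 0 and D2 > 0    # Sylvester: both leading principal minors positive
assert D2 == 1 * 3          # determinant = product of eigenvalues
```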


Sufficiency of Sylvester’s Criterion

Apart from the 2 × 2 and diagonal cases, the proof of sufficiency is much more challenging!


An Improved (?) Eigenvalue Test of Definiteness

When restricted to the unit sphere Sn−1, the Rayleigh quotient Rn \ {0} ∋ x ↦ f(x) = (x>Qx)/(x>x) ∈ R reduces to the quadratic form x>Qx, whose:

1. maximum value is the largest eigenvalue λ^* of Q;

2. minimum value is the smallest eigenvalue λ_* of Q.

It follows that:

Theorem. The quadratic form x>Qx is:

positive definite iff λ_* > 0;

negative definite iff λ^* < 0;

positive semi-definite (but not positive definite) iff λ_* = 0;

negative semi-definite (but not negative definite) iff λ^* = 0;

indefinite if and only if λ^* > 0 > λ_*;

identically zero if and only if λ^* = λ_* = 0.


Envoi

See you, at least online, on Tuesday September 29th.

Enjoy the next five days of lectures with my colleague Pablo Beker!
