Chapter 6 The Jordan Canonical Form - Queen's U

Chapter 6

The Jordan Canonical Form

6.1 Introduction

The importance of the Jordan canonical form became evident in the last chapter, where it frequently served as a theoretical tool for deriving practical procedures for calculating matrix polynomials.

In this chapter we shall take a closer look at the Jordan canonical form of a given matrix $A$. In particular, we shall be interested in the following questions:

• how to determine its structure;

• how to calculate $P$ such that $P^{-1}AP$ is a Jordan matrix.

As we learned in the previous chapter in connection with the diagonalization theorem (cf. section 5.4), the eigenvalues and eigenvectors of $A$ yield important clues for determining the shape of the Jordan canonical form. It is not difficult to see that for 2 × 2 and 3 × 3 matrices the knowledge of the eigenvalues and eigenvectors of $A$ alone suffices to determine the Jordan canonical form $J$ of $A$, but for larger matrices this is no longer true. However, by generalizing the notion of eigenvectors, we can determine $J$ from this additional information. Thus we shall:

• study some basic properties of eigenvalues and eigenvectors in section 6.2;

• learn how to find J and P when m ≤ 3 (section 6.3);

• define and study generalized eigenvectors and learn how to determine $J$ (section 6.4);

• learn a general algorithm for determining P in section 6.5.

In addition, we shall also look at some applications of the Jordan canonical form, such as a proof of the Cayley-Hamilton theorem (cf. section 6.6). Other applications will follow in later chapters.


6.2 Algebraic and geometric multiplicities of eigenvalues

As we shall see, much (but not all) of the structure of the Jordan canonical form $J$ of a matrix $A$ can be read off from the algebraic and geometric multiplicities of the eigenvalues of $A$, which we now define.

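Definition. Let $A$ be an $m \times m$ matrix and $\lambda \in \mathbb{C}$. Then

$m_A(\lambda) = \operatorname{mult}_\lambda(\operatorname{ch}_A)$, the multiplicity of $\lambda$ as a root of $\operatorname{ch}_A(t)$ (cf. chapter 3),

is called the algebraic multiplicity of $\lambda$ in $A$;

$\nu_A(\lambda) = \dim_{\mathbb{C}} E_A(\lambda)$ is called the geometric multiplicity of $\lambda$ in $A$.

Here, as before (cf. section 5.4),

$E_A(\lambda) = \{\vec v \in \mathbb{C}^m : A\vec v = \lambda\vec v\} = \operatorname{Nullsp}(A - \lambda I)$ denotes the $\lambda$-eigenspace of $A$.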

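Remarks. 1) Note that the above definition does not require $\lambda$ to be an eigenvalue of $A$. Thus by definition:

$$\lambda \text{ is an eigenvalue of } A \iff \nu_A(\lambda) \ge 1 \iff m_A(\lambda) \ge 1.$$

2) We shall see later (in Theorem 6.4) that we always have $\nu_A(\lambda) \le m_A(\lambda)$.

3) By linear algebra: $\nu_A(\lambda) \overset{\text{defn}}{=} \dim \operatorname{Nullsp}(A - \lambda I) = m - \operatorname{rank}(A - \lambda I)$.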

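Example 6.1. Find the algebraic and geometric multiplicities of (the eigenvalues of) the matrices

$$A = \begin{pmatrix} 1 & 1 & 2 \\ 0 & 1 & 2 \\ 0 & 0 & 3 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} 1 & 0 & 2 \\ 0 & 1 & 2 \\ 0 & 0 & 3 \end{pmatrix}.$$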

Solution. Since $A$ and $B$ are both upper triangular and have the same diagonal entries 1, 1, 3, we see that

$$\operatorname{ch}_A(t) = \operatorname{ch}_B(t) = (t-1)^2(t-3).$$

Thus, both matrices have $\lambda_1 = 1$ and $\lambda_2 = 3$ as their eigenvalues, with algebraic multiplicities

$$m_A(1) = m_B(1) = 2 \quad\text{and}\quad m_A(3) = m_B(3) = 1.$$

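To calculate the geometric multiplicities, we have to determine the ranks of $A - \lambda_i I$ and $B - \lambda_i I$ for $i = 1, 2$. Now

$$A - I = \begin{pmatrix} 0 & 1 & 2 \\ 0 & 0 & 2 \\ 0 & 0 & 2 \end{pmatrix},\quad B - I = \begin{pmatrix} 0 & 0 & 2 \\ 0 & 0 & 2 \\ 0 & 0 & 2 \end{pmatrix},\quad A - 3I = \begin{pmatrix} -2 & 1 & 2 \\ 0 & -2 & 2 \\ 0 & 0 & 0 \end{pmatrix},\quad B - 3I = \begin{pmatrix} -2 & 0 & 2 \\ 0 & -2 & 2 \\ 0 & 0 & 0 \end{pmatrix}.$$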


Thus, since $A - I$ clearly has 2 linearly independent column vectors, we see that $\operatorname{rank}(A - I) = 2$, and so $\nu_A(1) = 3 - \operatorname{rank}(A - I) = 3 - 2 = 1$. Similarly, $\operatorname{rank}(B - I) = 1$ and so $\nu_B(1) = 3 - \operatorname{rank}(B - I) = 3 - 1 = 2$.

Furthermore, since $A - 3I$ and $B - 3I$ both have rank 2, it follows that $\nu_A(3) = \nu_B(3) = 3 - 2 = 1$. Thus, the geometric multiplicities of the eigenvalues of $A$ and $B$ are

$$\nu_A(1) = 1,\quad \nu_B(1) = 2 \quad\text{and}\quad \nu_A(3) = \nu_B(3) = 1.$$
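The rank computations of Example 6.1 can also be checked mechanically. The following is a minimal sketch in plain Python, using exact rational arithmetic and the identity $\nu_A(\lambda) = m - \operatorname{rank}(A - \lambda I)$ from Remark 3; the helper names `rank` and `geometric_multiplicity` are illustrative choices and do not come from the text.

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix (list of rows), by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0  # number of pivots found so far
    for c in range(len(rows[0])):
        piv = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue  # no pivot in this column
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def geometric_multiplicity(M, lam):
    """nu_M(lam) = m - rank(M - lam*I)."""
    m = len(M)
    shifted = [[M[i][j] - (lam if i == j else 0) for j in range(m)] for i in range(m)]
    return m - rank(shifted)

A = [[1, 1, 2], [0, 1, 2], [0, 0, 3]]
B = [[1, 0, 2], [0, 1, 2], [0, 0, 3]]
for name, M in (("A", A), ("B", B)):
    print(name, [geometric_multiplicity(M, lam) for lam in (1, 3)])
# A [1, 1]
# B [2, 1]
```

Note that the function also accepts a $\lambda$ which is not an eigenvalue, in which case it returns 0, in agreement with Remark 1.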

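Example 6.2. Consider the following three Jordan matrices:

$$J_1 = \begin{pmatrix} 5 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 5 \end{pmatrix},\quad J_2 = \begin{pmatrix} 5 & 1 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 5 \end{pmatrix},\quad J_3 = \begin{pmatrix} 5 & 1 & 0 \\ 0 & 5 & 1 \\ 0 & 0 & 5 \end{pmatrix}.$$

Then their algebraic and geometric multiplicities are given in the following table:

$$\begin{array}{c|ccc}
i & 1 & 2 & 3 \\ \hline
\operatorname{ch}_{J_i}(t) & (t-5)^3 & (t-5)^3 & (t-5)^3 \\
m_{J_i}(5) & 3 & 3 & 3 \\
\nu_{J_i}(5) & 3 & 2 & 1 \\
E_{J_i}(5) & \langle \vec e_1, \vec e_2, \vec e_3 \rangle & \langle \vec e_1, \vec e_3 \rangle & \langle \vec e_1 \rangle
\end{array}$$

Here $\vec e_1 = (1, 0, 0)^t$, $\vec e_2 = (0, 1, 0)^t$, $\vec e_3 = (0, 0, 1)^t$ denote the standard basis vectors of $\mathbb{C}^3$ and $\langle \ldots \rangle$ denotes the span (= set of all linear combinations) of the vectors.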

Verification of table: To check the first two rows of the table we note that

$$\operatorname{ch}_{J_i}(t) = (-1)^3 \det(J_i - tI) = -\det\begin{pmatrix} 5-t & * & * \\ 0 & 5-t & * \\ 0 & 0 & 5-t \end{pmatrix} = -(5-t)^3 = (t-5)^3.$$

Thus, for all three matrices $\lambda_1 = 5$ is the only eigenvalue and its algebraic multiplicity is $m_{J_i}(5) = 3$ (= the exponent of $(t-5)$ in $\operatorname{ch}_{J_i}(t)$).

To compute $\nu_{J_i}(5)$, it is enough to find $\operatorname{rank}(J_i - 5I)$ = the number of non-zero rows of the associated row echelon form. Here we need to consider the three cases separately:

1) Since $J_1 - 5I = 0$ and $\operatorname{rank}(0) = 0$, we have $\nu_{J_1}(5) = 3 - \operatorname{rank}(J_1 - 5I) = 3 - 0 = 3$.

2) Next, $J_2 - 5I = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$, which is in row echelon form. Thus $\operatorname{rank}(J_2 - 5I) = 1$, and hence $\nu_{J_2}(5) = 3 - \operatorname{rank}(J_2 - 5I) = 3 - 1 = 2$.

3) Similarly, $J_3 - 5I = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}$, which is again in row echelon form. Thus $\nu_{J_3}(5) = 3 - \operatorname{rank}(J_3 - 5I) = 3 - 2 = 1$.

Finally, the indicated basis of $E_{J_i}(5)$ is obtained by using back-substitution.


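Example 6.3. Let

$$J = J(\lambda, k) = \begin{pmatrix} \lambda & 1 & \cdots & 0 \\ 0 & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & 1 \\ 0 & \cdots & 0 & \lambda \end{pmatrix}$$

be a Jordan block of size $k$. Then:

$\operatorname{ch}_J(t) = (t - \lambda)^k$ (since $J$ is upper triangular),

$E_J(\lambda) = \{c(1, 0, \ldots, 0)^t : c \in \mathbb{C}\}$ (cf. Example 5.7 of chapter 5),

$m_J(\lambda) = \operatorname{mult}_\lambda(\operatorname{ch}_J) = k$,

$\nu_J(\lambda) = \dim_{\mathbb{C}} E_J(\lambda) = 1$.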

The above example shows us how to quickly find the algebraic and geometric multiplicities of Jordan blocks. To extend this to Jordan matrices, i.e. to matrices of the form

$$J = \operatorname{Diag}(J_1, J_2, \ldots, J_r) = \begin{pmatrix} J_1 & 0 & \cdots & 0 \\ 0 & J_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & J_r \end{pmatrix},$$

where the $J_i = J(\lambda_i, m_i)$ are Jordan blocks, we shall use the following result.

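Theorem 6.1 (Sum Formula). If $A = \operatorname{Diag}(B, C) = \begin{pmatrix} B & 0 \\ 0 & C \end{pmatrix}$, then the algebraic and geometric multiplicities of the eigenvalues of $A$ are the sums of the corresponding multiplicities of those of $B$ and $C$. In other words, for any $\lambda \in \mathbb{C}$ we have

$$m_A(\lambda) = m_B(\lambda) + m_C(\lambda) \quad\text{and}\quad \nu_A(\lambda) = \nu_B(\lambda) + \nu_C(\lambda). \tag{1}$$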

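Example 6.4. As in Example 6.2, let $J_2 = \operatorname{Diag}(B, C)$ with $B = J(5, 2)$ and $C = J(5, 1)$. Then

$$m_{J_2}(5) \overset{(1)}{=} m_B(5) + m_C(5) \overset{\text{Ex. 6.3}}{=} 2 + 1 = 3,$$

$$\nu_{J_2}(5) \overset{(1)}{=} \nu_B(5) + \nu_C(5) \overset{\text{Ex. 6.3}}{=} 1 + 1 = 2.$$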

This example generalizes as follows:

Corollary. If $J = \operatorname{Diag}(J_{11}, J_{12}, \ldots, J_{ij}, \ldots)$ is a Jordan matrix with Jordan blocks $J_{ij} = J(\lambda_i, k_{ij})$ and $\lambda \in \mathbb{C}$, then

$\nu_J(\lambda)$ = the number of Jordan blocks $J_{ij}$ with eigenvalue $\lambda_i = \lambda$,

$m_J(\lambda)$ = the sum of the sizes $k_{ij}$ of the Jordan blocks $J_{ij}$ with eigenvalue $\lambda_i = \lambda$.

Proof. By Theorem 6.1 we have $\nu_J(\lambda) = \sum_{i,j} \nu_{J_{ij}}(\lambda)$. Now by Example 6.3 we know that $\nu_{J_{ij}}(\lambda) = 1$ if $J_{ij}$ has eigenvalue $\lambda_i = \lambda$ and $\nu_{J_{ij}}(\lambda) = 0$ otherwise, so the assertion for $\nu_J(\lambda)$ follows. The formula for $m_J(\lambda)$ is proved similarly.
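The counting rule of the Corollary is easy to express in code. The sketch below assumes a Jordan matrix is described simply by a list of pairs `(eigenvalue, size)`, one per Jordan block; this representation and the function name `multiplicities` are illustrative assumptions, not notation from the text.

```python
def multiplicities(blocks, lam):
    """Return (m_J(lam), nu_J(lam)) for J = Diag of the blocks J(mu, k) in `blocks`."""
    sizes = [k for (mu, k) in blocks if mu == lam]
    # algebraic multiplicity = sum of block sizes, geometric = number of blocks
    return sum(sizes), len(sizes)

J2 = [(5, 2), (5, 1)]            # Example 6.4: Diag(J(5, 2), J(5, 1))
print(multiplicities(J2, 5))     # (3, 2)
print(multiplicities(J2, 7))     # (0, 0), since 7 is not an eigenvalue
```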


Theorem 6.1 is, in fact, a special case of a much more precise result. To state it in a convenient form, it is useful to introduce the following notation.

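Notation. (a) If $\vec v = (v_1, \ldots, v_n)^t \in \mathbb{C}^n$ and $\vec w = (w_1, \ldots, w_m)^t \in \mathbb{C}^m$, then the vector

$$\vec v \oplus \vec w := (v_1, \ldots, v_n, w_1, \ldots, w_m)^t \in \mathbb{C}^{n+m}$$

is called the direct sum of $\vec v$ and $\vec w$.

(b) If $V \subset \mathbb{C}^n$ and $W \subset \mathbb{C}^m$ are subspaces, then the direct sum of $V$ and $W$ is the subspace

$$V \oplus W = \{\vec v \oplus \vec w \in \mathbb{C}^{n+m} : \vec v \in V, \vec w \in W\}.$$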

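Remarks. 1) If $\vec v_1, \ldots, \vec v_r$ is a basis of $V \subset \mathbb{C}^n$ and $\vec w_1, \ldots, \vec w_s$ is one of $W \subset \mathbb{C}^m$, then $\vec v_1 \oplus \vec 0_m, \ldots, \vec v_r \oplus \vec 0_m, \vec 0_n \oplus \vec w_1, \ldots, \vec 0_n \oplus \vec w_s$ is a basis of $V \oplus W$. (Here, $\vec 0_m = (0, \ldots, 0)^t$ is the zero vector with $m$ entries.) Thus

$$\dim(V \oplus W) = \dim V + \dim W. \tag{2}$$

2) If $A$ is an $a \times m$ matrix and $B$ is a $b \times n$ matrix, then for every $\vec v \in \mathbb{C}^m$ and $\vec w \in \mathbb{C}^n$ we have

$$\operatorname{Diag}(A, B)(\vec v \oplus \vec w) = (A\vec v) \oplus (B\vec w). \tag{3}$$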

Example 6.5. Let $V = \{c_1(1, 2)^t + c_2(3, 4)^t : c_1, c_2 \in \mathbb{C}\}$ and $W = \{c_1'(1, 2, 1)^t + c_2'(3, 4, 1)^t : c_1', c_2' \in \mathbb{C}\}$. Verify the addition rule (2) for $V$ and $W$.

Solution. If $\vec v \in V$, then $\vec v = c_1(1, 2)^t + c_2(3, 4)^t = (c_1, 2c_1)^t + (3c_2, 4c_2)^t = (c_1 + 3c_2, 2c_1 + 4c_2)^t$, and similarly each $\vec w \in W$ has the form $\vec w = (c_1' + 3c_2', 2c_1' + 4c_2', c_1' + c_2')^t$. Thus

$$\begin{aligned}
\vec v \oplus \vec w &= (c_1 + 3c_2, 2c_1 + 4c_2)^t \oplus (c_1' + 3c_2', 2c_1' + 4c_2', c_1' + c_2')^t \\
&= (c_1 + 3c_2, 2c_1 + 4c_2, c_1' + 3c_2', 2c_1' + 4c_2', c_1' + c_2')^t \\
&= c_1(1, 2, 0, 0, 0)^t + c_2(3, 4, 0, 0, 0)^t + c_1'(0, 0, 1, 2, 1)^t + c_2'(0, 0, 3, 4, 1)^t,
\end{aligned}$$

and so $V \oplus W = \{c_1(1, 2, 0, 0, 0)^t + c_2(3, 4, 0, 0, 0)^t + c_1'(0, 0, 1, 2, 1)^t + c_2'(0, 0, 3, 4, 1)^t : c_1, c_2, c_1', c_2' \in \mathbb{C}\}$. Since these four spanning vectors are linearly independent, $\dim(V \oplus W) = 4 = 2 + 2 = \dim V + \dim W$.

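Example 6.6. Verify the multiplication rule (3) for $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$ and $B = \begin{pmatrix} 5 & 2 \\ 6 & 1 \end{pmatrix}$.

Solution. Write $\vec v = (v_1, v_2)^t$ and $\vec w = (w_1, w_2)^t$. Then

$$\operatorname{Diag}(A, B)(\vec v \oplus \vec w) = \begin{pmatrix} 1 & 2 & 0 & 0 \\ 3 & 4 & 0 & 0 \\ 0 & 0 & 5 & 2 \\ 0 & 0 & 6 & 1 \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ w_1 \\ w_2 \end{pmatrix} = \begin{pmatrix} v_1 + 2v_2 \\ 3v_1 + 4v_2 \\ 5w_1 + 2w_2 \\ 6w_1 + w_2 \end{pmatrix}$$

$$= \begin{pmatrix} v_1 + 2v_2 \\ 3v_1 + 4v_2 \end{pmatrix} \oplus \begin{pmatrix} 5w_1 + 2w_2 \\ 6w_1 + w_2 \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} \oplus \begin{pmatrix} 5 & 2 \\ 6 & 1 \end{pmatrix} \begin{pmatrix} w_1 \\ w_2 \end{pmatrix} = A\vec v \oplus B\vec w.$$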
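Rule (3) can be spot-checked numerically for the matrices of Example 6.6. In the sketch below, vectors are Python lists, so that list concatenation plays the role of $\vec v \oplus \vec w$; the helper names `block_diag` and `matvec` and the particular test vectors are assumptions made for illustration only.

```python
def block_diag(A, B):
    """Diag(A, B) for matrices given as lists of rows."""
    a_cols, b_cols = len(A[0]), len(B[0])
    return [row + [0] * b_cols for row in A] + [[0] * a_cols + row for row in B]

def matvec(M, v):
    """Matrix-vector product M v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

A = [[1, 2], [3, 4]]
B = [[5, 2], [6, 1]]
v, w = [7, -1], [2, 3]                     # arbitrary test vectors

lhs = matvec(block_diag(A, B), v + w)      # Diag(A, B)(v (+) w)
rhs = matvec(A, v) + matvec(B, w)          # (A v) (+) (B w)
print(lhs == rhs)                          # True
```

A check for one pair of vectors is of course not a proof; the general statement follows from the computation in the solution above.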


We are now ready to state and prove the following refinement of Theorem 6.1.

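Theorem 6.2. If $A = \operatorname{Diag}(B, C)$, then

$$\operatorname{ch}_A(t) = \operatorname{ch}_B(t) \cdot \operatorname{ch}_C(t) \quad\text{and}\quad E_A(\lambda) = E_B(\lambda) \oplus E_C(\lambda). \tag{4}$$

Proof. (a) Since the determinant of a block diagonal matrix is the product of the determinants of the blocks, we obtain

$$\operatorname{ch}_A(t) = \det(tI - A) = \det(\operatorname{Diag}(tI - B, tI - C)) = \det(tI - B)\det(tI - C) = \operatorname{ch}_B(t)\operatorname{ch}_C(t).$$

(b) Suppose that $B$ is an $m \times m$ matrix and $C$ an $n \times n$ matrix. Now each $\vec u \in \mathbb{C}^{m+n}$ can be written uniquely as $\vec u = \vec v \oplus \vec w$ with $\vec v \in \mathbb{C}^m$ and $\vec w \in \mathbb{C}^n$, so $(A - \lambda I)\vec u = \operatorname{Diag}(B - \lambda I, C - \lambda I)(\vec v \oplus \vec w) = (B - \lambda I)\vec v \oplus (C - \lambda I)\vec w$ by (3). Thus,

$$\vec u \in E_A(\lambda) \iff (A - \lambda I)\vec u = \vec 0 \iff (B - \lambda I)\vec v = \vec 0 \text{ and } (C - \lambda I)\vec w = \vec 0 \iff \vec v \in E_B(\lambda) \text{ and } \vec w \in E_C(\lambda) \iff \vec u = \vec v \oplus \vec w \in E_B(\lambda) \oplus E_C(\lambda),$$

and so $E_A(\lambda) = E_B(\lambda) \oplus E_C(\lambda)$, as claimed.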

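Remark. If $A = \begin{pmatrix} B & 0 \\ X & C \end{pmatrix}$ or $A = \begin{pmatrix} B & X \\ 0 & C \end{pmatrix}$, then it is still true that $\operatorname{ch}_A(t) = \operatorname{ch}_B(t)\operatorname{ch}_C(t)$. However, in this case the second formula of (4) no longer holds (in general).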

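Proof of Theorem 6.1. It is easy to see that Theorem 6.1 follows from Theorem 6.2. Indeed,

$$m_A(\lambda) = \operatorname{mult}_\lambda(\operatorname{ch}_A) \overset{(4)}{=} \operatorname{mult}_\lambda(\operatorname{ch}_B \operatorname{ch}_C) = \operatorname{mult}_\lambda(\operatorname{ch}_B) + \operatorname{mult}_\lambda(\operatorname{ch}_C) = m_B(\lambda) + m_C(\lambda),$$

which proves the first equality of Theorem 6.1.

For the second we observe that

$$\nu_A(\lambda) = \dim E_A(\lambda) \overset{(4)}{=} \dim(E_B(\lambda) \oplus E_C(\lambda)) \overset{(2)}{=} \dim E_B(\lambda) + \dim E_C(\lambda) = \nu_B(\lambda) + \nu_C(\lambda).$$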

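Example 6.7. Find $\operatorname{ch}_A(t)$ and $E_A(\lambda)$ when $A = \begin{pmatrix} 2 & 1 & 0 & 0 \\ 1 & 2 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 2 \end{pmatrix}$.

Solution. Since $A = \operatorname{Diag}(B, C)$, where $B = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$ and $C = \begin{pmatrix} 0 & 1 \\ -1 & 2 \end{pmatrix}$, it is enough by Theorem 6.2 to work out the eigenspaces and characteristic polynomials for $B$ and $C$.

(a) $\operatorname{ch}_B(t) = (t - 2)^2 - 1 = t^2 - 4t + 3 = (t - 1)(t - 3)$;

$E_B(1) = \{c(1, -1)^t : c \in \mathbb{C}\}$, $E_B(3) = \{c(1, 1)^t : c \in \mathbb{C}\}$, and $E_B(\lambda) = \{\vec 0\}$ for $\lambda \ne 1, 3$.

(b) $\operatorname{ch}_C(t) = -t(2 - t) + 1 = (t - 1)^2$;

$E_C(1) = \{c(1, 1)^t : c \in \mathbb{C}\}$, and $E_C(\lambda) = \{\vec 0\}$ for $\lambda \ne 1$.

(c) Thus: $\operatorname{ch}_A(t) \overset{(4)}{=} \operatorname{ch}_B(t) \cdot \operatorname{ch}_C(t) = (t - 1)(t - 3) \cdot (t - 1)^2 = (t - 1)^3(t - 3)$;

$$\begin{aligned}
E_A(1) &\overset{(4)}{=} E_B(1) \oplus E_C(1) = \{c_1(1, -1)^t\} \oplus \{c_2(1, 1)^t\} = \{c_1(1, -1, 0, 0)^t + c_2(0, 0, 1, 1)^t : c_1, c_2 \in \mathbb{C}\}, \\
E_A(3) &= E_B(3) \oplus E_C(3) = \{c_1(1, 1)^t\} \oplus \{\vec 0\} \quad\text{(note: 3 is not an eigenvalue of } C\text{)} \\
&= \{c_1(1, 1, 0, 0)^t : c_1 \in \mathbb{C}\}, \\
E_A(\lambda) &= E_B(\lambda) \oplus E_C(\lambda) = \{\vec 0\} \oplus \{\vec 0\} = \{\vec 0\}, \quad\text{if } \lambda \ne 1, 3.
\end{aligned}$$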


Another important property of algebraic and geometric multiplicities is the following.

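Theorem 6.3 (Invariance Property). If $B = P^{-1}AP$, then

$$\operatorname{ch}_B(t) = \operatorname{ch}_A(t) \quad\text{and}\quad E_A(\lambda) = PE_B(\lambda) := \{P\vec v : \vec v \in E_B(\lambda)\}. \tag{5}$$

In particular, we have

$$m_B(\lambda) = m_A(\lambda) \quad\text{and}\quad \nu_B(\lambda) = \nu_A(\lambda).$$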

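Example 6.8. Find the characteristic polynomial and the eigenspaces of

$$A = PBP^{-1} = \begin{pmatrix} 4 & 0 & -2 \\ 1 & 2 & -1 \\ 2 & 0 & 0 \end{pmatrix}, \quad\text{where } B = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix} \text{ and } P = \begin{pmatrix} 2 & 1 & 1 \\ 1 & 1 & 1 \\ 2 & 0 & 1 \end{pmatrix}.$$

Solution. First note that $B = \operatorname{Diag}(J_1, J_2)$, where $J_1 = J(2, 2)$ and $J_2 = J(2, 1)$. Thus

$$\operatorname{ch}_A(t) \overset{(5)}{=} \operatorname{ch}_B(t) \overset{(4)}{=} \operatorname{ch}_{J_1}(t)\operatorname{ch}_{J_2}(t) \overset{\text{Ex. 6.3}}{=} (t - 2)^2(t - 2) = (t - 2)^3.$$

Thus, $\lambda_1 = 2$ is the only eigenvalue of $A$ (and of $B$).

Moreover,

$$E_B(2) \overset{(4)}{=} \{c_1(1, 0)^t : c_1 \in \mathbb{C}\} \oplus \{c_2(1) : c_2 \in \mathbb{C}\} = \{c_1(1, 0, 0)^t + c_2(0, 0, 1)^t : c_1, c_2 \in \mathbb{C}\},$$

and hence, since $P(1, 0, 0)^t$ is the 1st column of $P$ and $P(0, 0, 1)^t$ is the 3rd column of $P$,

$$E_A(2) \overset{(5)}{=} \{c_1 P(1, 0, 0)^t + c_2 P(0, 0, 1)^t : c_1, c_2 \in \mathbb{C}\} = \{c_1(2, 1, 2)^t + c_2(1, 1, 1)^t : c_1, c_2 \in \mathbb{C}\}.$$

Check:

$$A\begin{pmatrix} 2 \\ 1 \\ 2 \end{pmatrix} = (PBP^{-1})\left(P\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}\right) = PB\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = P\left(2\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}\right) = 2\begin{pmatrix} 2 \\ 1 \\ 2 \end{pmatrix} \;\Rightarrow\; (2, 1, 2)^t \in E_A(2).$$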

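Proof of Theorem 6.3. (a) Recall that for any two $n \times n$ matrices $X$ and $Y$ we have $\det(XY) = \det(X) \cdot \det(Y)$, and hence also $\det(P^{-1}XP) = \det(P)^{-1}\det(X)\det(P) = \det(X)$. Applying this to $X = A - tI$ yields

$$\operatorname{ch}_{P^{-1}AP}(t) = (-1)^n\det(P^{-1}AP - tI) = (-1)^n\det(P^{-1}(A - tI)P) = (-1)^n\det(A - tI) = \operatorname{ch}_A(t).$$

(b) We have:

$$\begin{aligned}
\vec v \in E_A(\lambda) &\iff A\vec v = \lambda\vec v \iff PBP^{-1}\vec v = \lambda\vec v \\
&\iff BP^{-1}\vec v = P^{-1}\lambda\vec v \iff B(P^{-1}\vec v) = \lambda(P^{-1}\vec v) \\
&\iff P^{-1}\vec v \in E_B(\lambda) \iff \vec v \in PE_B(\lambda).
\end{aligned}$$

We have thus shown that $\vec v \in E_A(\lambda) \iff \vec v \in PE_B(\lambda)$, which means that $E_A(\lambda) = PE_B(\lambda)$. The last two assertions follow from (5) by taking multiplicities of roots and dimensions, respectively (note that $\vec v \mapsto P\vec v$ is an invertible linear map and hence preserves dimensions).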


Corollary. If $P^{-1}AP = J$ is a Jordan matrix, then

$\nu_A(\lambda)$ = the number of Jordan blocks of $J$ with eigenvalue $\lambda$,

$m_A(\lambda)$ = the sum of the sizes of the Jordan blocks of $J$ with eigenvalue $\lambda$.

Proof. Combine Theorem 6.3 with the Corollary of Theorem 6.1:

$$\nu_A(\lambda) \overset{\text{Th. 6.3}}{=} \nu_J(\lambda) \overset{\text{Cor. of Th. 6.1}}{=} \#\{\text{Jordan blocks } J_{ij} \text{ with eigenvalue } \lambda\},$$

and the assertion about $m_A(\lambda)$ is proved similarly.

Remark. The above corollary represents a fundamental step towards computing the structure of the Jordan canonical form $J$ associated to $A$: we see that the algebraic and geometric multiplicities reveal the number of Jordan blocks and the sum of their sizes.

If $A$ has size at most 3 × 3, then the above rules already determine $J$, as we shall see in more detail in the next section. However, for matrices of size 4 × 4 or larger this is no longer true, as the following example illustrates.

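Example 6.9. The two Jordan matrices

$$J_1 = \begin{pmatrix} 2 & 1 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 2 & 1 \\ 0 & 0 & 0 & 2 \end{pmatrix} \quad\text{and}\quad J_2 = \begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 2 & 1 & 0 \\ 0 & 0 & 2 & 1 \\ 0 & 0 & 0 & 2 \end{pmatrix}$$

clearly have the same number of Jordan blocks (so $\nu_{J_1}(2) = \nu_{J_2}(2) = 2$), and the sum of their sizes is also the same (so $m_{J_1}(2) = m_{J_2}(2) = 4$), but the Jordan matrices are not the same (even if we rearrange the blocks): $J_1$ has two blocks of size 2, whereas $J_2$ has blocks of sizes 1 and 3. Thus, the algebraic and geometric multiplicities alone cannot distinguish between these Jordan forms.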

Theorem 6.4 (Jordan Canonical Form). Every square matrix $A$ is similar to a Jordan matrix. In other words, there is an invertible matrix $P$ such that

$$P^{-1}AP = J = \operatorname{Diag}(J_{11}, \ldots, J_{ij}, \ldots)$$

is a block diagonal matrix consisting of Jordan blocks $J_{ij} = J(\lambda_i, k_{ij})$. Moreover:

1) The $\lambda_1, \lambda_2, \ldots, \lambda_s$ are the (distinct) eigenvalues of $A$.

2) The number of Jordan blocks $J_{i1}, J_{i2}, \ldots$ with eigenvalue $\lambda_i$ equals the geometric multiplicity $\nu_A(\lambda_i) = \nu_i$ (so that this list ends with $J_{i,\nu_i}$).

3) The sum of the sizes $k_{ij}$ of the blocks $J_{i1}, J_{i2}, \ldots, J_{i,\nu_i}$ with eigenvalue $\lambda_i$ equals the algebraic multiplicity $m_i = m_A(\lambda_i)$:

$$k_{i1} + k_{i2} + \cdots + k_{i\nu_i} = m_i; \tag{6}$$

in particular: $\nu_i \le m_i$.

4) The $J_{ij}$'s are uniquely determined by $A$ up to order.


Remarks. 1) In the above statement of the Jordan canonical form the following fact is implicitly used:

If $J$ and $J'$ are two Jordan matrices which have the same lists of Jordan blocks but in a different order, then $J$ and $J'$ are similar, i.e. there is an invertible matrix $P$ such that $J' = P^{-1}JP$.

[To see why this is true, consider an $m \times m$ block diagonal matrix $\operatorname{Diag}(A, B)$ consisting of two blocks $A$ and $B$ of size $k \times k$ and $(m - k) \times (m - k)$, respectively. Then we have

$$P_k^{-1}\operatorname{Diag}(A, B)P_k = \operatorname{Diag}(B, A),$$

where $P_k = (\vec e_{k+1} | \vec e_{k+2} | \ldots | \vec e_m | \vec e_1 | \ldots | \vec e_k)$, and, as usual, $\vec e_i = (0, \ldots, 0, 1, 0, \ldots, 0)^t \in \mathbb{C}^m$ denotes the vector with a 1 in the $i$-th position. From this the above statement about the Jordan matrices follows readily. Note that the same argument also yields the corresponding statement for arbitrary block diagonal matrices.]

As a result of this fact, we can always choose the matrix $P$ in Theorem 6.4 in such a way that, after fixing an ordering $\lambda_1, \ldots, \lambda_s$ of the eigenvalues, the Jordan matrix has the form $J = \operatorname{Diag}(J_{11}, \ldots, J_{ij}, \ldots)$, where the Jordan blocks $J_{ij} = J(\lambda_i, k_{ij})$ are ordered in decreasing size (for each eigenvalue $\lambda_i$), i.e. we have

$$k_{i1} \ge k_{i2} \ge \ldots \ge k_{i\nu_i};$$

such a Jordan matrix will be said to be in standard form. For example, $J = \operatorname{Diag}(J(1, 2), J(1, 1), J(2, 2), J(-1, 3), J(-1, 2))$ is in standard form but $\operatorname{Diag}(J(2, 2), J(2, 3))$ is not.

2) Conversely, suppose $J$ and $J'$ are two Jordan matrices which are similar. Then the lists of Jordan blocks of $J$ and $J'$ are the same (up to order), as we shall see later (cf. Theorem 6.5, Corollary).

Corollary 1. Two $m \times m$ matrices $A$ and $B$ are similar (that is, $B = P^{-1}AP$ for some $P$) if and only if they have the same Jordan canonical form $J$ (up to order).

Proof. Let $J$ and $J'$ be the Jordan canonical forms of $A$ and $B$, respectively. Since $A$ is similar to $J$ and $B$ is similar to $J'$, it follows that $A$ and $B$ are similar if and only if $J$ and $J'$ are similar. Now by the above remark, $J$ and $J'$ are similar if and only if they are identical up to the order of their blocks, and so the assertion follows.

Corollary 2. A matrix $A$ is diagonable if and only if the algebraic and geometric multiplicities of all its eigenvalues $\lambda_i$ are the same:

$$\nu_A(\lambda_1) = m_A(\lambda_1), \ldots, \nu_A(\lambda_s) = m_A(\lambda_s).$$

Proof. First note that if $A$ is diagonable, then the associated diagonal matrix $P^{-1}AP$ is the Jordan canonical form of $A$, and conversely, if the JCF of $A$ is a diagonal matrix, then $A$ is clearly diagonable. Thus:

$$\begin{aligned}
A \text{ is diagonable} &\iff \text{its associated Jordan form is a diagonal matrix} \\
&\iff \text{all Jordan blocks have size } 1 \times 1 \\
&\iff k_{ij} = 1 \text{ for all } i, j \\
&\iff \nu_A(\lambda_i) = m_A(\lambda_i),\ 1 \le i \le s, \text{ by (6).}
\end{aligned}$$


Remark. By using terminology which will be studied in more detail in the next chapter, the above Corollary 2 can be rephrased more elegantly as follows:

Corollary 2′. A matrix $A$ is diagonable if and only if all its eigenvalues are regular.

Here, an eigenvalue $\lambda$ of $A$ is called regular if its algebraic and geometric multiplicities coincide, i.e. if $m_A(\lambda) = \nu_A(\lambda)$.

Indeed, the above proof (or equation (6)) shows more precisely that

Corollary 3. An eigenvalue $\lambda$ of $A$ is regular if and only if all its Jordan blocks (in the associated Jordan canonical form $J$ of $A$) have size 1 × 1.

Exercises 6.2.

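1. Find all the eigenvalues, their algebraic and geometric multiplicities and their associated eigenspaces of the matrix $A$ when:

(a) $A = \begin{pmatrix} 2 & 1 & 0 & 0 \\ -1 & 4 & 0 & 0 \\ -1 & 1 & 2 & 1 \\ -1 & 1 & -1 & 4 \end{pmatrix}$; (b) $A = PBP^{-1}$, where

$$P = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 3 & 2 & 1 & 0 & -1 & -2 & -3 \\ 9 & 4 & 1 & 0 & 1 & 4 & 9 \\ 27 & 8 & 1 & 0 & -1 & -8 & -27 \\ 81 & 16 & 1 & 0 & 1 & 16 & 81 \\ 243 & 32 & 1 & 0 & -1 & -32 & -243 \\ 729 & 64 & 1 & 0 & 1 & 64 & 729 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} 1 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 2 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 2 \end{pmatrix}.$$

Hint: In (b), you shouldn't have to do any calculations.

2. Write down two 4 × 4 Jordan matrices which are not similar and yet have the same eigenvalues and the same algebraic and geometric multiplicities. Justify your answer.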

3. Find a Jordan matrix $J$ such that

$$PJP^{-1} = \begin{pmatrix} 4 & 1 & -1 \\ -2 & 1 & 1 \\ 2 & 1 & 1 \end{pmatrix}$$

for some invertible matrix $P$. [Do not find $P$!]

4. Find all the Jordan matrices $J$ (in standard form) with $\operatorname{ch}_J(t) = (t - 1)^2(t - 2)^4$.


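5. Consider the Jordan matrices

$$J_1 = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad\text{and}\quad J_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}.$$

(a) Which of these is in standard form?

(b) Find a matrix $P$ such that $J_1 = P^{-1}J_2P$.

(c) Find a matrix $Q$ such that $Q^{-1}JQ$ is in standard form, where

$$J = \operatorname{Diag}(J_1, J_2) = \begin{pmatrix} J_1 & 0 \\ 0 & J_2 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}.$$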

6. (a) Suppose that $A_1$ and $A_2$ are two square matrices with characteristic polynomial

$$\operatorname{ch}_{A_j}(t) = (t - \lambda_1)^{m_{1j}}(t - \lambda_2)^{m_{2j}} \cdots (t - \lambda_s)^{m_{sj}},$$

where $m_{ij} \ge 0$ for $1 \le i \le s$ and $j = 1, 2$. Let $E^{A_j}_{ik}$ denote the $ik$-th constituent matrix of $A_j$ for $1 \le i \le s$ and $0 \le k \le m_{ij} - 1$, and put $E^{A_j}_{ik} = 0$ if $k \ge m_{ij}$. Show that the constituent matrices $E^A_{ik}$ of $A = \operatorname{Diag}(A_1, A_2)$ are given by

$$E^A_{ik} = \operatorname{Diag}(E^{A_1}_{ik}, E^{A_2}_{ik}), \quad 1 \le i \le s,\ 1 \le k \le m_{i1} + m_{i2} - 1.$$

(b) Use this formula to find the constituent matrices of the Jordan matrices

(i) $J = \operatorname{Diag}(J(-1, 2), J(1, 2))$ and (ii) $J' = \operatorname{Diag}(J(-1, 2), J(-1, 3))$.

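7. Let $a_0, a_1, \ldots, a_{n-1} \in \mathbb{C}$ and consider the matrix

$$A = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & & & 0 & 1 \\ a_0 & a_1 & \cdots & a_{n-2} & a_{n-1} \end{pmatrix}.$$

(a) Show that the geometric multiplicity of every eigenvalue $\lambda$ of $A$ is $\nu_A(\lambda) = 1$.

(b) Show that $A$ is diagonable if and only if $\operatorname{ch}_A(t)$ has $n$ distinct roots. [Recall from section 5.9 that $\operatorname{ch}_A(t) = t^n - a_{n-1}t^{n-1} - \ldots - a_1t - a_0$, but you don't need this here.]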


6.3 How to find $P$ such that $P^{-1}AP = J$ (for $m \le 3$)

Before explaining the general procedure of finding the Jordan canonical form $J$ (and the associated matrix $P$) of an $m \times m$ matrix $A$, let us first look at the special case that $m \le 3$.

The advantage of the case $m \le 3$ is that the algebraic and geometric multiplicities suffice for finding the Jordan canonical form; this is no longer true if $m \ge 4$; cf. Example 6.9. Nevertheless, in calculating the associated matrix $P$, we are naturally led to a method which can be generalized to larger matrices, as will become evident in the next sections. This method consists of looking at the so-called generalized eigenvectors, which we will need here only in special cases. The basic idea is the following.

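Basic Idea: In order to find $P = (\vec v_1 | \ldots | \vec v_m)$ such that $P^{-1}AP = J$, write this equation as

$$AP = PJ.$$

By using the identities

$$AP = (A\vec v_1 | \cdots | A\vec v_m),$$

$$P(a_1, \ldots, a_m)^t = a_1\vec v_1 + \cdots + a_m\vec v_m,$$

the equation $AP = PJ$ translates into a set of (vector) equations for the $\vec v_i$'s which we can solve: the $j$-th column of $PJ$ is the linear combination of the $\vec v_i$'s whose coefficients are the entries of the $j$-th column of $J$. The following examples show how this method works.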

Example 6.10. If $A = \begin{pmatrix} 0 & 1 \\ -1 & -2 \end{pmatrix}$, find $P$ such that $J = P^{-1}AP$ is a Jordan matrix.

Solution. The procedure naturally divides into two steps.

Step 1. Find the Jordan canonical form $J$ of $A$.

(i) The characteristic polynomial of $A$ is

$$\operatorname{ch}_A(t) = (-1)^2\det\begin{pmatrix} -t & 1 \\ -1 & -2 - t \end{pmatrix} = t(t + 2) + 1 = (t + 1)^2,$$

and so $\lambda_1 = -1$ is the only eigenvalue; it has algebraic multiplicity $m_1 = m_A(\lambda_1) = 2$. Thus, the sum of the sizes of the Jordan blocks of $J$ is $m_1 = 2$.

(ii) Since

$$A + I = \begin{pmatrix} 1 & 1 \\ -1 & -1 \end{pmatrix} \to \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix},$$

it follows that the $\lambda_1$-eigenspace is $E_A(-1) = \{c(1, -1)^t : c \in \mathbb{C}\}$; in particular, $\nu_1 = 1$. Thus, we have 1 Jordan block.

(iii) By combining (i) and (ii) we can conclude:

$m_1 = 2$ and $\nu_1 = 1$ $\Rightarrow$ 1 Jordan block of size 2 (with eigenvalue $\lambda_1 = -1$)

$\Rightarrow$ $J = \begin{pmatrix} -1 & 1 \\ 0 & -1 \end{pmatrix}$ is the associated Jordan canonical form.


Step 2. Find $P$ such that $P^{-1}AP = J$ or, equivalently, such that $AP = PJ$.

Write $P = (\vec v_1 | \vec v_2)$, with $\vec v_1, \vec v_2 \in \mathbb{C}^2$. Since

$$AP = (A\vec v_1 | A\vec v_2),$$

$$PJ = (\vec v_1 | \vec v_2)\begin{pmatrix} -1 & 1 \\ 0 & -1 \end{pmatrix} = (-\vec v_1 | \vec v_1 - \vec v_2),$$

we want to choose $\vec v_1, \vec v_2$ in such a way that

1) $A\vec v_1 = -\vec v_1$,

2) $A\vec v_2 = \vec v_1 - \vec v_2$,

(together, 1) and 2) say that $AP = PJ$), and

3) $\vec v_1, \vec v_2$ are linearly independent ($\iff$ $P$ is invertible).

Observations: (a) The equations 1) and 2) can also be written in the form

1′) $(A + I)\vec v_1 = \vec 0$,

2′) $(A + I)\vec v_2 = \vec v_1$.

(b) These equations imply that $(A + I)^2\vec v_2 \overset{2')}{=} (A + I)\vec v_1 \overset{1')}{=} \vec 0$, i.e.

4) $(A + I)^2\vec v_2 = \vec 0$.

(c) Conversely, if we pick $\vec v_2$ such that 4) holds and define $\vec v_1 := (A + I)\vec v_2$, then both 1) and 2) hold.

(d) However: we have to pick $\vec v_2$ carefully such that condition 3) holds. It turns out that condition 3) will hold if (and only if) we take

3′) $\vec v_2 \notin E_A(-1)$.

[Indeed, suppose that $\vec v_2 \notin E_A(-1)$ (and satisfies 4)); then $\vec v_1 := (A + I)\vec v_2 \ne \vec 0$. Now if $c_1\vec v_1 + c_2\vec v_2 = \vec 0$, then applying $A + I$ yields $\vec 0 = (A + I)(c_1\vec v_1 + c_2\vec v_2) = c_2\vec v_1$ (because $(A + I)\vec v_1 = \vec 0$ by 1′)), so $c_2 = 0$. But then $c_1\vec v_1 = \vec 0$, so $c_1 = 0$, and hence $\vec v_1$ and $\vec v_2$ are linearly independent.]

These observations lead to the following strategy:

Pick $\vec v_2 \notin E_A(-1)$ such that $(A + I)^2\vec v_2 = \vec 0$,

put $\vec v_1 = (A + I)\vec v_2$.

Then: $P = (\vec v_1 | \vec v_2)$ is invertible and $P^{-1}AP = \begin{pmatrix} -1 & 1 \\ 0 & -1 \end{pmatrix}$.

Let us apply this strategy here. Since $(A + I)^2 = 0$ (either by direct computation or by using the Cayley-Hamilton Theorem: $(A + I)^2 = \operatorname{ch}_A(A) = 0$), it follows that every $\vec v \in \mathbb{C}^2$ satisfies 4). Thus, pick any $\vec v \notin E_A(-1) = \{c(1, -1)^t\}$; take, for example, $\vec v_2 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$.


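Then $\vec v_1 = (A + I)\vec v_2 = \begin{pmatrix} 1 & 1 \\ -1 & -1 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$,

so $P = (\vec v_1 | \vec v_2) = \begin{pmatrix} 1 & 1 \\ -1 & 0 \end{pmatrix}$ is the desired matrix.

Check:

$$P^{-1}AP = \begin{pmatrix} 0 & -1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ -1 & -2 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ -1 & -1 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} -1 & 1 \\ 0 & -1 \end{pmatrix} = J.$$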

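Remark. Note that the above procedure depends on picking a vector $\vec v_2$ in the space

$$E^2_A(\lambda) := \{\vec v \in \mathbb{C}^m : (A - \lambda I)^2\vec v = \vec 0\}$$

(here with $\lambda = -1$, so that $A - \lambda I = A + I$); such a vector is called a generalized eigenvector (of order ≤ 2).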

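Example 6.11. If $A = \begin{pmatrix} 1 & 2 & -1 \\ 0 & 2 & 0 \\ 1 & -2 & 3 \end{pmatrix}$, find $P$ such that $P^{-1}AP$ is a Jordan matrix.

Solution. We shall follow the steps of the previous example.

Step 1. Find the associated Jordan canonical form $J$.

(i) Expanding the determinant along the 2nd row yields

$$\operatorname{ch}_A(t) = (-1)^3(2 - t)\det\begin{pmatrix} 1 - t & -1 \\ 1 & 3 - t \end{pmatrix} = (t - 2)[(1 - t)(3 - t) + 1] = (t - 2)^3.$$

Thus, the only eigenvalue is $\lambda_1 = 2$; its algebraic multiplicity is $m_1 = 3$.

(ii) By row reduction we get

$$A - 2I = \begin{pmatrix} -1 & 2 & -1 \\ 0 & 0 & 0 \\ 1 & -2 & 1 \end{pmatrix} \to \begin{pmatrix} 1 & -2 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$

Thus, the 2-eigenspace is $E_A(2) = \{c_1(2, 1, 0)^t + c_2(-1, 0, 1)^t : c_1, c_2 \in \mathbb{C}\}$, and hence $\nu_1 = 2$. Therefore, $J$ has 2 Jordan blocks.

(iii) From (i) and (ii) we conclude that the two blocks have sizes 2 and 1, so $J = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix}$.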

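Step 2. Find $P$ such that $AP = PJ$.

Write $P = (\vec v_1 | \vec v_2 | \vec v_3)$, where $\vec v_1, \vec v_2, \vec v_3 \in \mathbb{C}^3$. Then we want to choose the $\vec v_i$'s such that $AP = PJ$, i.e. such that $(A\vec v_1 | A\vec v_2 | A\vec v_3) = (2\vec v_1 | \vec v_1 + 2\vec v_2 | 2\vec v_3)$. Thus we want:

$$\begin{aligned}
A\vec v_1 &= 2\vec v_1 &&\iff (A - 2I)\vec v_1 = \vec 0, \\
A\vec v_2 &= \vec v_1 + 2\vec v_2 &&\iff (A - 2I)\vec v_2 = \vec v_1, \\
A\vec v_3 &= 2\vec v_3 &&\iff (A - 2I)\vec v_3 = \vec 0.
\end{aligned}$$

In addition, we need to pick the $\vec v_i$'s such that $\vec v_1$, $\vec v_2$ and $\vec v_3$ are linearly independent. Following the same line of thought as in the previous example, this leads to

Strategy: pick $\vec v_2 \in E^2_A(2)$, not in $E_A(2)$;

define $\vec v_1 := (A - 2I)\vec v_2$;

pick $\vec v_3 \in E_A(2)$, linearly independent from $\vec v_1, \vec v_2$.

Now $E_A(2) = \{c_1(2, 1, 0)^t + c_2(1, 0, -1)^t\}$ (cf. step 1),

$E^2_A(2) = \mathbb{C}^3$ (since $(A - 2I)^2 = 0$).

Thus, take $\vec v_2 = (1, 0, 0)^t$

$\Rightarrow$ $\vec v_1 = (A - 2I)\vec v_2 = (-1, 0, 1)^t$,

and take $\vec v_3 = (2, 1, 0)^t \in E_A(2)$

$\Rightarrow$ $P = (\vec v_1 | \vec v_2 | \vec v_3) = \begin{pmatrix} -1 & 1 & 2 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}$.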

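Check:

$$P^{-1}AP = \begin{pmatrix} 0 & 0 & 1 \\ 1 & -2 & 1 \\ 0 & 1 & 0 \end{pmatrix}\begin{pmatrix} 1 & 2 & -1 \\ 0 & 2 & 0 \\ 1 & -2 & 3 \end{pmatrix}\begin{pmatrix} -1 & 1 & 2 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & -2 & 3 \\ 2 & -4 & 2 \\ 0 & 2 & 0 \end{pmatrix}\begin{pmatrix} -1 & 1 & 2 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix} = J;$$

in the above, $P^{-1}$ was computed by row reducing $(P | I) \to (I | P^{-1})$:

$$\left(\begin{array}{ccc|ccc} -1 & 1 & 2 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 & 0 & 1 \end{array}\right) \to \left(\begin{array}{ccc|ccc} 1 & 0 & 0 & 0 & 0 & 1 \\ 0 & 1 & 2 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 1 & 0 \end{array}\right) \to \left(\begin{array}{ccc|ccc} 1 & 0 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 & -2 & 1 \\ 0 & 0 & 1 & 0 & 1 & 0 \end{array}\right).$$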
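The row reduction $(P | I) \to (I | P^{-1})$ used in this check can be sketched in code as follows. This is a minimal Gauss-Jordan routine in plain Python with exact fractions; the function name `inverse` is an illustrative choice, and the routine assumes that $P$ is invertible (otherwise the pivot search fails with an exception).

```python
from fractions import Fraction

def inverse(P):
    """Invert a square matrix by row reducing (P | I) to (I | P^-1)."""
    n = len(P)
    # Build the augmented matrix (P | I) with exact rational entries.
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(P)]
    for c in range(n):
        # Find a row at or below the diagonal with a non-zero entry in column c.
        piv = next(i for i in range(c, n) if aug[i][c] != 0)
        aug[c], aug[piv] = aug[piv], aug[c]
        pivot = aug[c][c]
        aug[c] = [x / pivot for x in aug[c]]          # scale pivot row to get a leading 1
        for i in range(n):
            if i != c:
                f = aug[i][c]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[c])]
    return [row[n:] for row in aug]                   # right half is P^-1

P = [[-1, 1, 2], [0, 0, 1], [1, 0, 0]]               # P from Example 6.11
P_inv = inverse(P)
print([[int(x) for x in row] for row in P_inv])
# [[0, 0, 1], [1, -2, 1], [0, 1, 0]]
```

The elementary steps chosen by the routine may differ from those displayed above, but the resulting inverse is the same.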

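Example 6.12. Find $P$ such that $P^{-1}AP$ is a Jordan matrix when $A = \begin{pmatrix} 2 & 2 & 1 \\ 0 & 3 & 1 \\ 0 & -1 & 1 \end{pmatrix}$.

Step 1. Find the Jordan canonical form $J$.

(i) By expanding the determinant along the first column, we get

$$\operatorname{ch}_A(t) = (t - 2)\det\begin{pmatrix} 3 - t & 1 \\ -1 & 1 - t \end{pmatrix} = (t - 2)[(3 - t)(1 - t) + 1] = (t - 2)^3,$$

so $\lambda_1 = 2$ and $m_1 = 3$.

(ii) Since

$$A - 2I = \begin{pmatrix} 0 & 2 & 1 \\ 0 & 1 & 1 \\ 0 & -1 & -1 \end{pmatrix} \to \begin{pmatrix} 0 & 2 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix},$$

we see that the 2-eigenspace is $E_A(2) = \{c(1, 0, 0)^t : c \in \mathbb{C}\}$. Thus, $\nu_1 = 1$ and so $J$ consists of 1 Jordan block:

$$J = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{pmatrix}.$$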

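Step 2. Find $P$ such that $AP = PJ$.

Again, write $P = (\vec v_1 | \vec v_2 | \vec v_3)$, and choose the $\vec v_i$'s such that $AP = PJ$, i.e. such that

$$(A\vec v_1 | A\vec v_2 | A\vec v_3) = (2\vec v_1 | \vec v_1 + 2\vec v_2 | \vec v_2 + 2\vec v_3).$$

Thus we want:

$$\begin{aligned}
A\vec v_1 &= 2\vec v_1 &&\iff (A - 2I)\vec v_1 = \vec 0, \\
A\vec v_2 &= \vec v_1 + 2\vec v_2 &&\iff (A - 2I)\vec v_2 = \vec v_1, \\
A\vec v_3 &= \vec v_2 + 2\vec v_3 &&\iff (A - 2I)\vec v_3 = \vec v_2.
\end{aligned}$$

Extending the reasoning of Example 6.10, we see that all these conditions are satisfied if we pick $\vec v_3$ such that

$$\vec v_3 \in E^3_A(2) := \{\vec v \in \mathbb{C}^3 : (A - 2I)^3\vec v = \vec 0\},$$

and then define $\vec v_2 = (A - 2I)\vec v_3$ and $\vec v_1 = (A - 2I)\vec v_2$. In addition, we need to pick the $\vec v_i$'s to be linearly independent, and this means that we must require that $\vec v_3 \notin E^2_A(2)$. We thus have the following

Strategy: pick $\vec v_3 \in E^3_A(2)$, not in $E^2_A(2)$,

define $\vec v_2 := (A - 2I)\vec v_3$,

define $\vec v_1 := (A - 2I)\vec v_2$.

For this, we first need to compute the generalized eigenspaces $E^2_A(2)$ and $E^3_A(2)$. Since

$$(A - 2I)^2 = \begin{pmatrix} 0 & 2 & 1 \\ 0 & 1 & 1 \\ 0 & -1 & -1 \end{pmatrix}\begin{pmatrix} 0 & 2 & 1 \\ 0 & 1 & 1 \\ 0 & -1 & -1 \end{pmatrix} = \begin{pmatrix} 0 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix},$$

it follows that $E^2_A(2) = \{c_1(0, 1, -1)^t + c_2(1, 0, 0)^t : c_1, c_2 \in \mathbb{C}\}$. Moreover, $E^3_A(2) = \mathbb{C}^3$ since $(A - 2I)^3 = 0$, as can be seen either by a direct computation or by applying the Cayley-Hamilton Theorem: $(A - 2I)^3 = \operatorname{ch}_A(A) = 0$. Thus

$$\begin{aligned}
E_A(2) &= \operatorname{Nullsp}(A - 2I) = \{c(1, 0, 0)^t\}, \\
E^2_A(2) &= \operatorname{Nullsp}((A - 2I)^2) = \{c_1(0, 1, -1)^t + c_2(1, 0, 0)^t\}, \\
E^3_A(2) &= \operatorname{Nullsp}((A - 2I)^3) = \mathbb{C}^3.
\end{aligned}$$

Take $\vec v_3 = (0, 0, 1)^t \in \mathbb{C}^3$; note that $\vec v_3 \notin E^2_A(2)$. Then $\vec v_2 = (A - 2I)\vec v_3 = (1, 1, -1)^t \in E^2_A(2)$ and $\vec v_1 = (A - 2I)\vec v_2 = (1, 0, 0)^t \in E_A(2)$. Thus

$$P = (\vec v_1 | \vec v_2 | \vec v_3) = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & -1 & 1 \end{pmatrix} \quad\text{satisfies: } P^{-1}AP = J.$$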


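Check:

$$P^{-1}AP = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}\begin{pmatrix} 2 & 2 & 1 \\ 0 & 3 & 1 \\ 0 & -1 & 1 \end{pmatrix}\begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & -1 & 1 \end{pmatrix} = \begin{pmatrix} 2 & -1 & 0 \\ 0 & 3 & 1 \\ 0 & 2 & 2 \end{pmatrix}\begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & -1 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{pmatrix} = J,$$

where (as in the previous example) we have computed $P^{-1}$ by row reduction:

$$(P | I) = \left(\begin{array}{ccc|ccc} 1 & 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 1 & 0 \\ 0 & -1 & 1 & 0 & 0 & 1 \end{array}\right) \to \left(\begin{array}{ccc|ccc} 1 & 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 1 & 1 \end{array}\right) \to \left(\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & -1 & 0 \\ 0 & 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 1 & 1 \end{array}\right).$$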
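The strategy of Examples 6.10 and 6.12 (start from a generalized eigenvector and apply $A - \lambda I$ repeatedly) can be sketched as follows. All function names here are illustrative assumptions. The sketch only builds the chain and verifies $AP = PJ$; it does not itself test that the starting vector lies outside $E^{k-1}_A(\lambda)$, which is what guarantees that $P$ is invertible.

```python
def matvec(M, v):
    """Matrix-vector product M v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(X, Y):
    """Matrix product X Y for matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def jordan_chain(A, lam, v, k):
    """Return [v_1, ..., v_k] with v_k = v and v_i = (A - lam*I) v_{i+1}."""
    m = len(A)
    B = [[A[i][j] - (lam if i == j else 0) for j in range(m)] for i in range(m)]
    chain = [v]
    for _ in range(k - 1):
        chain.append(matvec(B, chain[-1]))
    return chain[::-1]

def columns_to_matrix(cols):
    """Matrix (as list of rows) whose columns are the given vectors."""
    return [list(row) for row in zip(*cols)]

def jordan_block(lam, k):
    """The Jordan block J(lam, k)."""
    return [[lam if i == j else (1 if j == i + 1 else 0) for j in range(k)]
            for i in range(k)]

A = [[2, 2, 1], [0, 3, 1], [0, -1, 1]]                 # Example 6.12
P = columns_to_matrix(jordan_chain(A, 2, [0, 0, 1], 3))
print(P)                                               # [[1, 1, 0], [0, 1, 0], [0, -1, 1]]
print(matmul(A, P) == matmul(P, jordan_block(2, 3)))   # True
```

Applied to the matrix of Example 6.10 with $\lambda = -1$, starting vector $(1, 0)^t$ and $k = 2$, the same function reproduces the matrix $P$ found there.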

Remark. For matrices of size ≥ 4 a similar method would also work once we could complete step 1, i.e. predict the matrix $J$. As long as the geometric multiplicity of every eigenvalue satisfies $\nu_A(\lambda) \le 3$, the above method generalizes without much change, but not when some $\nu_A(\lambda) > 3$. However, we shall see presently how to do this in general!

Exercises 6.3.

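1. Find an invertible matrix $P$ such that $P^{-1}AP$ is in Jordan canonical form, where

(a) $A = \begin{pmatrix} 1 & -2 \\ 2 & 5 \end{pmatrix}$ (b) $A = \begin{pmatrix} 2 & 1 & -1 \\ 0 & 1 & 0 \\ 1 & 1 & 0 \end{pmatrix}$.

Also, find the Jordan canonical form of $A$ in each case.

2. Find a matrix $P$ such that $P^{-1}AP$ is in Jordan canonical form when

(a) $A = \begin{pmatrix} -2 & 1 & -1 \\ -6 & 4 & -1 \\ 8 & -2 & 4 \end{pmatrix}$ (b) $A = \begin{pmatrix} -2 & 0 & -2 \\ -6 & 2 & -3 \\ 8 & 0 & 6 \end{pmatrix}$.

3. (a) Suppose $B$ is a 3 × 3 matrix such that $B^3 = 0$, and there exists $\vec v \in \mathbb{C}^3$ such that $B^2\vec v \ne \vec 0$. Show that $\vec v, B\vec v, B^2\vec v$ are linearly independent and that we have

$$P^{-1}BP = J(0, 3) = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix} \quad\text{if } P = (B^2\vec v | B\vec v | \vec v).$$

(b) Let $A$ be a matrix with characteristic polynomial $\operatorname{ch}_A(t) = (t - \lambda)^3$. Suppose there exists a vector $\vec v \in \mathbb{C}^3$ such that $B^2\vec v \ne \vec 0$, where $B = A - \lambda I$. Show that $P = (B^2\vec v | B\vec v | \vec v)$ is invertible and that $P^{-1}AP = J(\lambda, 3)$.

(c) More generally, suppose $A$ is an $m \times m$ matrix with characteristic polynomial $\operatorname{ch}_A(t) = (t - \lambda)^m$ and that there exists a vector $\vec v \in \mathbb{C}^m$ such that $B^{m-1}\vec v \ne \vec 0$, where $B = A - \lambda I$. Show that $P = (B^{m-1}\vec v | B^{m-2}\vec v | \ldots | B\vec v | \vec v)$ is invertible and that $P^{-1}AP = J(\lambda, m)$.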


6.4 Generalized Eigenvectors and the JCF

While for $m \le 3$ the algebraic and geometric multiplicities of the eigenvalues of $A$ determine the Jordan canonical form (JCF), this is no longer true for $m \ge 4$, as Example 6.9 shows. For this reason we need to look at generalized eigenspaces.

Definition. Let $A$ be an $m \times m$ matrix and $\lambda \in \mathbb{C}$. If $p \ge 1$ is an integer, then the $p$-th generalized eigenspace of $A$ with respect to $\lambda$ is the subspace

$$E^p_A(\lambda) = \operatorname{Nullsp}((A - \lambda I)^p) = \{\vec v \in \mathbb{C}^m : (A - \lambda I)^p\vec v = \vec 0\}.$$

Its dimension

$$\nu^p_A(\lambda) := \dim E^p_A(\lambda) = m - \operatorname{rank}((A - \lambda I)^p)$$

is called the $p$-th geometric multiplicity of $\lambda$ in $A$, and any vector $\vec v \in E^p_A(\lambda)$ is called a generalized $\lambda$-eigenvector of $A$ of order ≤ $p$.

Remark. The generalized eigenspaces fit into an increasing sequence of subspaces

$$\{0\} \subset E^1_A(\lambda) = E_A(\lambda) \subset E^2_A(\lambda) \subset \cdots \subset E^p_A(\lambda) \subset \ldots \subset \mathbb{C}^m$$

(where $E_A(\lambda)$ is the usual eigenspace), for if $\vec v \in E^p_A(\lambda)$, then $B^p\vec v = \vec 0$, where $B = A - \lambda I$, and hence also $B^{p+1}\vec v = B(B^p\vec v) = B\vec 0 = \vec 0$, i.e. $\vec v \in E^{p+1}_A(\lambda)$. Thus, the generalized geometric multiplicities satisfy the inequalities

$$0 \le \nu^1_A(\lambda) = \nu_A(\lambda) \le \nu^2_A(\lambda) \le \ldots \le \nu^p_A(\lambda) \le \ldots \le m.$$

Notation. We denote the sequence of generalized geometric multiplicities by

$$\nu^*_A(\lambda) = (\nu^1_A(\lambda), \nu^2_A(\lambda), \ldots, \nu^p_A(\lambda), \ldots).$$

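Example 6.13. If $J = J(\lambda, k)$ is a Jordan block of size $k$, then for $p < k$ the matrix $(J - \lambda I)^p$ has 1's in the positions $(i, i + p)$, $1 \le i \le k - p$, and 0's elsewhere; in particular, its first $p$ columns are zero and the first 1 in the first row is in column $p + 1$. Hence

$$E^p_J(\lambda) = \operatorname{Nullsp}((J - \lambda I)^p) = \{c_1\vec e_1 + \cdots + c_p\vec e_p\},$$

whereas $E^p_J(\lambda) = \operatorname{Nullsp}(0) = \mathbb{C}^k$ if $p \ge k$. Thus, for all $p \ge 1$ we have

$$\nu^p_J(\lambda) = \min(p, k) = \begin{cases} p & \text{if } p \le k, \\ k & \text{if } p \ge k, \end{cases} \tag{7}$$

and hence

$$\nu^*_J(\lambda) = (1, 2, 3, \ldots, k - 1, k, k, \ldots).$$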
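Formula (7) can be confirmed numerically for a given block size. The sketch below builds the nilpotent matrix $N = J(\lambda, k) - \lambda I$ (which does not depend on $\lambda$), computes the ranks of its powers with exact arithmetic, and returns the sequence $\nu^p_J(\lambda) = k - \operatorname{rank}(N^p)$; the helper names are illustrative and not from the text.

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix (list of rows), by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        piv = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def matmul(X, Y):
    """Matrix product X Y for matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def nu_sequence_of_block(k, pmax):
    """[nu^1, ..., nu^pmax] for a Jordan block of size k."""
    N = [[1 if j == i + 1 else 0 for j in range(k)] for i in range(k)]  # J(lam, k) - lam*I
    power, seq = N, []
    for _ in range(pmax):
        seq.append(k - rank(power))   # nu^p = k - rank(N^p)
        power = matmul(power, N)
    return seq

print(nu_sequence_of_block(4, 6))     # [1, 2, 3, 4, 4, 4]
```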


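Theorem 6.5 (Properties of generalized eigenvectors). Let $\lambda \in \mathbb{C}$ and $p \ge 1$.

(a) If $A = \operatorname{Diag}(B, C)$, then $E^p_A(\lambda) = E^p_B(\lambda) \oplus E^p_C(\lambda)$ and hence $\nu^p_A(\lambda) = \nu^p_B(\lambda) + \nu^p_C(\lambda)$.

(b) If $B = P^{-1}AP$, then $E^p_A(\lambda) = PE^p_B(\lambda)$; in particular, $\nu^p_B(\lambda) = \nu^p_A(\lambda)$.

(c) If $A$ is similar to a Jordan matrix $J = \operatorname{Diag}(\ldots, J(\lambda_i, k_{ij}), \ldots)$, then

$$\nu^p_A(\lambda) - \nu^{p-1}_A(\lambda) = \#(\text{Jordan blocks } J(\lambda_i, k_{ij}) \text{ of } J \text{ with } \lambda_i = \lambda \text{ and } k_{ij} \ge p), \tag{8}$$

where for $p = 1$ we put $\nu^0_A(\lambda) = 0$.

(d) If $\nu^{p+1}_A(\lambda) = \nu^p_A(\lambda)$, then $\nu^{p+q}_A(\lambda) = \nu^p_A(\lambda)$ for all $q \ge 1$.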

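Proof. (a) Since $(A - \lambda I)^p = \operatorname{Diag}(B - \lambda I, C - \lambda I)^p = \operatorname{Diag}((B - \lambda I)^p, (C - \lambda I)^p)$, we have (cf. the proof of Theorem 6.2)

$$\begin{aligned}
E^p_A(\lambda) = \operatorname{Nullsp}((A - \lambda I)^p) &= \operatorname{Nullsp}(\operatorname{Diag}((B - \lambda I)^p, (C - \lambda I)^p)) \\
&= \operatorname{Nullsp}((B - \lambda I)^p) \oplus \operatorname{Nullsp}((C - \lambda I)^p) = E^p_B(\lambda) \oplus E^p_C(\lambda).
\end{aligned}$$

This proves the first statement of (a) and the second follows from the first by taking dimensions.

(b) We have $(B - \lambda I)^p = P^{-1}(A - \lambda I)^pP$, and so

$$\begin{aligned}
\vec v \in E^p_A(\lambda) &\iff (A - \lambda I)^p\vec v = \vec 0 \\
&\iff P^{-1}(A - \lambda I)^pPP^{-1}\vec v = \vec 0 \\
&\iff (B - \lambda I)^pP^{-1}\vec v = \vec 0 \\
&\iff P^{-1}\vec v \in E^p_B(\lambda),
\end{aligned}$$

which means that $E^p_A(\lambda) = PE^p_B(\lambda)$. Taking dimensions yields $\nu^p_A(\lambda) = \nu^p_B(\lambda)$.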

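(c) Suppose first that $A = J(\lambda, m)$ is a Jordan block. Then by Example 6.13 we have

$$\nu_A^p(\lambda) - \nu_A^{p-1}(\lambda) = \min(p, m) - \min(p - 1, m) = \begin{cases} 1 & \text{if } p \le m, \\ 0 & \text{if } p > m. \end{cases} \tag{9}$$

Thus, the formula (8) is true for Jordan blocks.

Next, suppose that $A = J = \operatorname{Diag}(J_{11}, \ldots, J_{ij}, \ldots)$ is a Jordan matrix, where $J_{ij} = J(\lambda_i, k_{ij})$. Then by (a) and (9) we have

$$\nu_J^p(\lambda_i) - \nu_J^{p-1}(\lambda_i) \overset{(a)}{=} \sum_j \left(\nu_{J_{ij}}^p(\lambda_i) - \nu_{J_{ij}}^{p-1}(\lambda_i)\right) \overset{(9)}{=} \sum_{j:\ k_{ij} \ge p} 1,$$

which proves formula (8) for Jordan matrices.

Finally, if $A = PJP^{-1}$, then by (b) we have $\nu_A^p(\lambda) - \nu_A^{p-1}(\lambda) = \nu_J^p(\lambda) - \nu_J^{p-1}(\lambda)$, and so formula (8) follows from what was just proved.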

(d) We first note that if $A$ is similar to a Jordan matrix $J$, then the assertion is clear by (c). Indeed, if $\nu_A^p(\lambda) = \nu_A^{p+1}(\lambda)$, then it follows from (c) that $J$ has no Jordan blocks of size $\ge p + 1$, and hence also none of size $\ge p + q$, which means that $\nu_A^{p+q}(\lambda) = \nu_A^{p+q-1}(\lambda)$.


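Now although every matrix $A$ is indeed similar to a Jordan matrix (Jordan’s theorem), we do not want to use this fact here, and so we give a direct proof of (d). This proof is based on the following formula, which is also interesting in itself:

$$\nu_A^{p+1}(\lambda) - \nu_A(\lambda) = \dim(\operatorname{Im}(A - \lambda I) \cap E_A^p(\lambda)), \tag{10}$$

in which $\operatorname{Im}(B) = \{B\vec{v} : \vec{v} \in \mathbb{C}^m\}$ denotes (as usual) the image space of a matrix $B$ (also called the range or column space of $B$).

From this formula (10) the assertion follows immediately. Write $\nu^p = \nu_A^p(\lambda)$, $\nu = \nu_A(\lambda)$, $E^p = E_A^p(\lambda)$ and $B = A - \lambda I$. If $\nu^{p+1} = \nu^p$, then also $E^{p+1} = E^p$ (because $E^p \subset E^{p+1}$ and both have the same dimension), and hence

$$\nu^{p+2} - \nu^{p+1} = (\nu^{p+2} - \nu) - (\nu^{p+1} - \nu) \overset{(10)}{=} \dim(\operatorname{Im}(B) \cap E^{p+1}) - \dim(\operatorname{Im}(B) \cap E^p) = 0.$$

Thus $\nu^{p+2} = \nu^{p+1}$, and so the claim follows by induction.

It thus remains to verify (10). For this, put $B = A - \lambda I$. Then we have

$$BE_A^{p+1}(\lambda) = \operatorname{Im}(B) \cap E_A^p(\lambda). \tag{11}$$

Indeed, if $\vec{w} \in BE_A^{p+1}(\lambda)$, i.e. $\vec{w} = B\vec{v}$ with $\vec{v} \in E_A^{p+1}(\lambda)$, then $B^p\vec{w} = B^p(B\vec{v}) = B^{p+1}\vec{v} = \vec{0}$, and so $\vec{w} \in \operatorname{Im}(B) \cap E_A^p(\lambda)$. Conversely, if $\vec{w} = B\vec{v} \in \operatorname{Im}(B) \cap E_A^p(\lambda)$, then $B^{p+1}\vec{v} = B^p\vec{w} = \vec{0}$, so $\vec{v} \in E_A^{p+1}(\lambda)$ and $\vec{w} = B\vec{v} \in BE_A^{p+1}(\lambda)$. Thus, equality holds in (11).

Taking dimensions in (11) yields

$$\dim(\operatorname{Im}(B) \cap E_A^p(\lambda)) = \dim(BE_A^{p+1}(\lambda)) = \dim(E_A^{p+1}(\lambda)) - \dim(\operatorname{Nullsp}(B) \cap E_A^{p+1}(\lambda)),$$

where the latter equality follows from the general fact (the rank-nullity theorem) that for any subspace $V$ we have $\dim BV = \dim V - \dim(V \cap \operatorname{Nullsp}(B))$. Now since here $\operatorname{Nullsp}(B) = E_A(\lambda) \subset E_A^{p+1}(\lambda)$, it follows that $\dim(\operatorname{Nullsp}(B) \cap E_A^{p+1}(\lambda)) = \dim E_A(\lambda) = \nu_A(\lambda)$, and so (10) follows since $\nu_A^{p+1}(\lambda) = \dim E_A^{p+1}(\lambda)$ by definition.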
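As a sanity check, formula (10) can be tested on a concrete matrix. The sketch below reads (10) in the form that the proof above establishes, namely $\nu_A^{p+1}(\lambda) - \nu_A(\lambda) = \dim(\operatorname{Im}(B) \cap E_A^p(\lambda))$, and uses $B = A - 7I$ for the matrix $A$ of Example 6.15. The helpers `rref`, `rank`, `nullspace` and `matmul` are illustrative names, not part of the text; the intersection dimension is computed via $\dim(U \cap W) = \dim U + \dim W - \dim(U + W)$.

```python
from fractions import Fraction

def rref(M):
    """Return (reduced rows, pivot columns), using exact arithmetic."""
    rows = [[Fraction(x) for x in row] for row in M]
    pivots, r = [], 0
    for c in range(len(rows[0])):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        lead = rows[r][c]
        rows[r] = [x / lead for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    return rows, pivots

def rank(M):
    return len(rref(M)[1])

def nullspace(M):
    """Basis of Nullsp(M): one vector per free column."""
    rows, pivots = rref(M)
    n = len(M[0])
    basis = []
    for free in range(n):
        if free in pivots:
            continue
        v = [Fraction(0)] * n
        v[free] = Fraction(1)
        for r, pc in enumerate(pivots):
            v[pc] = -rows[r][free]
        basis.append(v)
    return basis

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

# B = A - 7I for the matrix A of Example 6.15
B = [[0, 0, 0, 1, 0],
     [0, 0, 0, 0, 0],
     [1, 0, 0, 0, 0],
     [0, 0, 0, 0, 0],
     [0, 1, 0, 0, 0]]
n = len(B)
image_spanning = [list(col) for col in zip(*B)]   # the columns of B span Im(B)
nu_1 = n - rank(B)

checks, power = [], B
for p in range(1, 4):
    E_p = nullspace(power)                        # basis of E^p
    next_power = matmul(power, B)
    lhs = (n - rank(next_power)) - nu_1           # nu^{p+1} - nu
    dim_sum = rank(image_spanning + E_p)          # dim(Im(B) + E^p)
    rhs = rank(B) + len(E_p) - dim_sum            # dim(Im(B) ∩ E^p)
    checks.append((lhs, rhs))
    power = next_power

print(checks)  # [(2, 2), (3, 3), (3, 3)]
```

Both sides agree for $p = 1, 2, 3$ in this example.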

Remark. By part (d) we see that the generalized geometric multiplicities $\nu_A^p(\lambda)$ exhibit the following growth pattern:

$$\nu_A^1(\lambda) < \cdots < \nu_A^p(\lambda) = \nu_A^{p+1}(\lambda) = \cdots = \nu_A^{p+q}(\lambda) = \cdots.$$

This raises the question: what is the value to which the geometric multiplicities stabilize? Now it turns out (but this is more difficult to prove) that this stabilizing value is precisely the algebraic multiplicity:

$$\nu_A^p(\lambda) = \nu_A^{p+1}(\lambda) \iff \nu_A^p(\lambda) = m_A(\lambda).$$

Corollary. The numbers $\nu_A^p(\lambda)$ determine the Jordan canonical form $J$ of $A$ by taking second differences. More precisely, the number $n_A^p(\lambda)$ of Jordan blocks of $J$ of type $J(\lambda, p)$ is given by the formula

$$n_A^p(\lambda) = \Delta^2\nu_A^p(\lambda) := (\nu_A^p(\lambda) - \nu_A^{p-1}(\lambda)) - (\nu_A^{p+1}(\lambda) - \nu_A^p(\lambda)). \tag{12}$$

In particular, two Jordan matrices $J$ and $J'$ are similar if and only if $\nu_J^p(\lambda) = \nu_{J'}^p(\lambda)$ for all $p \ge 1$ and $\lambda \in \mathbb{C}$.


Proof. By equation (8) we have $(\nu_A^p(\lambda) - \nu_A^{p-1}(\lambda)) - (\nu_A^{p+1}(\lambda) - \nu_A^p(\lambda)) = \#(\text{Jordan blocks of size} \ge p) - \#(\text{Jordan blocks of size} \ge p + 1) = n_A^p(\lambda)$, which is (12).

If $J$ and $J'$ are similar, then by Theorem 6.5(b) we have that $\nu_J^p(\lambda) = \nu_{J'}^p(\lambda)$ for all $p \ge 1$ and $\lambda \in \mathbb{C}$. Conversely, if all these numbers are equal, then it follows from (12) that $J$ and $J'$ have exactly the same number of Jordan blocks of each type and hence are equal up to order. By the remark after Theorem 6.4 we know that then $J$ and $J'$ are similar.
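Formula (12) translates directly into a few lines of code. The following is a minimal sketch, not the author's implementation: `block_counts` is a hypothetical helper name, and it assumes that the input list starts at $\nu^1$ and is long enough to have stabilized (its last two entries are equal), so that by Theorem 6.5(d) the sequence may be extended by its last value.

```python
def block_counts(nu):
    """Map p -> number of Jordan blocks of size p, from nu = [nu^1, nu^2, ...].

    Assumes nu has already stabilized (nu[-1] == nu[-2]).
    """
    padded = [0] + list(nu) + [nu[-1]]      # nu^0 = 0, plus one more stabilized value
    counts = {}
    for p in range(1, len(nu) + 1):
        # second difference: (nu^p - nu^{p-1}) - (nu^{p+1} - nu^p)
        n_p = (padded[p] - padded[p - 1]) - (padded[p + 1] - padded[p])
        if n_p:
            counts[p] = n_p
    return counts

counts = block_counts([2, 4, 5, 5, 5])
print(counts)  # {2: 1, 3: 1}
```

For the sequence $(2, 4, 5, 5, 5)$ this reports one block of size 2 and one block of size 3.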

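Example 6.14. Determine the Jordan blocks of a Jordan matrix $A$ with characteristic polynomial $\operatorname{ch}_A(t) = (t - 7)^5$ and generalized geometric multiplicities

$$\nu_A^*(7) = (2, 4, 5, 5, 5, \ldots).$$

Solution. By the above corollary, we have to determine the second differences $\Delta^2\nu_A^*$, which can be calculated by using the following scheme (each entry of a difference row is the difference of the two neighbouring entries in the row above it):

$$\begin{array}{l|cccccc} p & 0 & 1 & 2 & 3 & 4 & 5 \\ \hline \nu_A^* & 0 & 2 & 4 & 5 & 5 & 5 \\ \Delta\nu_A^* & & 2 & 2 & 1 & 0 & 0 \\ \Delta^2\nu_A^* & & 0 & 1 & 1 & 0 & \end{array}$$

Note that we added an extra column for $p = 0$ in the above table in order to be able to compute the first column of $\Delta^2\nu_A^*$.

Conclusion. $A$ has:

0 blocks of size $1 \times 1$,
1 block of size $2 \times 2$,
1 block of size $3 \times 3$,
0 blocks of size $4 \times 4$, etc.

Thus:

$$A = \operatorname{Diag}(J(7, 3), J(7, 2)) = \begin{pmatrix} 7 & 1 & 0 & 0 & 0 \\ 0 & 7 & 1 & 0 & 0 \\ 0 & 0 & 7 & 0 & 0 \\ 0 & 0 & 0 & 7 & 1 \\ 0 & 0 & 0 & 0 & 7 \end{pmatrix} \quad \text{(up to order)}.$$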

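Remark. In place of the above calculation scheme, we could have used instead the following longer but more detailed analysis:

$$\begin{aligned} \nu_A^4 - \nu_A^3 &= 5 - 5 = 0 &&\Rightarrow\ \text{0 blocks of size} \ge 4 \\ \nu_A^3 - \nu_A^2 &= 5 - 4 = 1 &&\Rightarrow\ \text{1 block of size} \ge 3 \\ \nu_A^2 - \nu_A^1 &= 4 - 2 = 2 &&\Rightarrow\ \text{2 blocks of size} \ge 2 \\ \nu_A^1 - \nu_A^0 &= 2 - 0 = 2 &&\Rightarrow\ \text{2 blocks of size} \ge 1 \end{aligned}$$

Comparing consecutive lines gives: 1 block of size 3 (lines 1 and 2), 1 block of size 2 (lines 2 and 3), and 0 blocks of size 1 (lines 3 and 4).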


Example 6.15. Find the Jordan canonical form of

$$A = \begin{pmatrix} 7 & 0 & 0 & 1 & 0 \\ 0 & 7 & 0 & 0 & 0 \\ 1 & 0 & 7 & 0 & 0 \\ 0 & 0 & 0 & 7 & 0 \\ 0 & 1 & 0 & 0 & 7 \end{pmatrix}.$$

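Solution. Step 1: Find the characteristic polynomial of $A$:

$$\begin{aligned} \operatorname{ch}_A(t) &= (-1)^5 \det\begin{pmatrix} 7-t & 0 & 0 & 1 & 0 \\ 0 & 7-t & 0 & 0 & 0 \\ 1 & 0 & 7-t & 0 & 0 \\ 0 & 0 & 0 & 7-t & 0 \\ 0 & 1 & 0 & 0 & 7-t \end{pmatrix} = (t-7)\det\begin{pmatrix} 7-t & 0 & 0 & 1 \\ 0 & 7-t & 0 & 0 \\ 1 & 0 & 7-t & 0 \\ 0 & 0 & 0 & 7-t \end{pmatrix} \\ &= -(t-7)^2 \det\begin{pmatrix} 7-t & 0 & 0 \\ 0 & 7-t & 0 \\ 1 & 0 & 7-t \end{pmatrix} = (t-7)^5. \end{aligned}$$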

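Step 2: Calculate the generalized geometric multiplicities:

Put $B = A - 7I$. Then

$$B = \begin{pmatrix} 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \end{pmatrix}, \quad B^2 = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}, \quad B^3 = 0,$$

which have rank 3, 1, and 0, respectively. Thus, since $\nu_A^p(7) = 5 - \operatorname{rank}(B^p)$, we see that

$$\nu_A^*(7) = (2, 4, 5, 5, \ldots).$$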

Step 3: Find the Jordan blocks by the method of second differences:

Since the Jordan canonical form $J$ of $A$ has $\operatorname{ch}_J(t) = (t-7)^5$ and its generalized geometric multiplicities are $\nu_J^*(7) = (2, 4, 5, 5, \ldots)$, we can conclude by Example 6.14 that

$$J = \operatorname{Diag}(J(7, 3), J(7, 2)) = \begin{pmatrix} 7 & 1 & 0 & 0 & 0 \\ 0 & 7 & 1 & 0 & 0 \\ 0 & 0 & 7 & 0 & 0 \\ 0 & 0 & 0 & 7 & 1 \\ 0 & 0 & 0 & 0 & 7 \end{pmatrix} \quad \text{(up to order)}.$$
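Steps 2 and 3 of this example can be reproduced mechanically. The sketch below is only an illustration under stated assumptions: the helper names are invented here, exact rational arithmetic is used to avoid rounding, and the block sizes are read off from the first differences $\nu^p - \nu^{p-1}$ (the number of blocks of size $\ge p$) rather than from second differences.

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix (list of rows), by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

A = [[7, 0, 0, 1, 0],
     [0, 7, 0, 0, 0],
     [1, 0, 7, 0, 0],
     [0, 0, 0, 7, 0],
     [0, 1, 0, 0, 7]]
lam, n = 7, len(A)
B = [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]

# Ranks of B, B^2, ... until two consecutive ranks agree (cf. Theorem 6.5(d)).
ranks, power = [], B
while len(ranks) < 2 or ranks[-1] != ranks[-2]:
    ranks.append(rank(power))
    power = matmul(power, B)

nus = [n - r for r in ranks]                           # nu^1, nu^2, ...
diffs = [b - a for a, b in zip([0] + nus, nus)]        # number of blocks of size >= p
sizes = [sum(1 for d in diffs if d > b) for b in range(diffs[0])]

print(ranks)  # [3, 1, 0, 0]
print(nus)    # [2, 4, 5, 5]
print(sizes)  # [3, 2]
```

The block sizes 3 and 2 match $J = \operatorname{Diag}(J(7, 3), J(7, 2))$.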

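Example 6.16. Find the Jordan canonical form of

$$A = \begin{pmatrix} 1 & 0 & 6 & 2 & 0 & 2 \\ 1 & 2 & -1 & 0 & 0 & 0 \\ 0 & 0 & 4 & 1 & 0 & 1 \\ 1 & 0 & -2 & 2 & 0 & 0 \\ -1 & 0 & 4 & 0 & 2 & 1 \\ -1 & 0 & -2 & -2 & 0 & 0 \end{pmatrix}.$$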


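Solution. Step 1: Compute the characteristic polynomial of $A$.

Expanding the following determinants successively along the 2nd column, the 4th column and the 3rd row, we get

$$\begin{aligned} \operatorname{ch}_A(t) &= (-1)^6 \det\begin{pmatrix} 1-t & 0 & 6 & 2 & 0 & 2 \\ 1 & 2-t & -1 & 0 & 0 & 0 \\ 0 & 0 & 4-t & 1 & 0 & 1 \\ 1 & 0 & -2 & 2-t & 0 & 0 \\ -1 & 0 & 4 & 0 & 2-t & 1 \\ -1 & 0 & -2 & -2 & 0 & -t \end{pmatrix} \\ &= (2-t)\det\begin{pmatrix} 1-t & 6 & 2 & 0 & 2 \\ 0 & 4-t & 1 & 0 & 1 \\ 1 & -2 & 2-t & 0 & 0 \\ -1 & 4 & 0 & 2-t & 1 \\ -1 & -2 & -2 & 0 & -t \end{pmatrix} = (t-2)^2 \det\begin{pmatrix} 1-t & 6 & 2 & 2 \\ 0 & 4-t & 1 & 1 \\ 1 & -2 & 2-t & 0 \\ -1 & -2 & -2 & -t \end{pmatrix} \\ &= (t-2)^2 \left[ 1 \cdot \det\begin{pmatrix} 6 & 2 & 2 \\ 4-t & 1 & 1 \\ -2 & -2 & -t \end{pmatrix} - (-2)\det\begin{pmatrix} 1-t & 2 & 2 \\ 0 & 1 & 1 \\ -1 & -2 & -t \end{pmatrix} + (2-t)\det\begin{pmatrix} 1-t & 6 & 2 \\ 0 & 4-t & 1 \\ -1 & -2 & -t \end{pmatrix} \right] \\ &= (t-2)^2\left[(-4 + 6t - 2t^2) + 2(1-t)(2-t) + (2-t)(4 - 8t + 5t^2 - t^3)\right] \\ &= (t-2)^2\left[(t-2)\left((2-2t) + 2(t-1) + (t^3 - 5t^2 + 8t - 4)\right)\right] \\ &= (t-2)^3\left[t^3 - 5t^2 + 8t - 4\right] = (t-1)(t-2)^5. \end{aligned}$$

Thus, we have two eigenvalues: $\lambda_1 = 1$ and $\lambda_2 = 2$ with algebraic multiplicities $m_A(\lambda_1) = 1$ and $m_A(\lambda_2) = 5$, respectively.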
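A hand computation of this size is worth double-checking. The sketch below does so with a technique that is different from the cofactor expansion used in the text: the Faddeev-LeVerrier recursion, which produces the coefficients of $\det(tI - A)$ from traces of matrix products. The function names are illustrative and the code is a check, not part of the original solution.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def char_poly(A):
    """Coefficients [1, c_{n-1}, ..., c_0] of det(tI - A) (Faddeev-LeVerrier)."""
    n = len(A)
    M = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # M_1 = I
    coeffs = [Fraction(1)]
    for k in range(1, n + 1):
        AM = matmul(A, M)
        c = -sum(AM[i][i] for i in range(n)) / k        # c_{n-k} = -tr(A M_k) / k
        coeffs.append(c)
        M = [[AM[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]
    return coeffs

def poly_mul(p, q):
    """Product of two polynomials given by coefficient lists (highest power first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

A = [[ 1, 0,  6,  2, 0, 2],
     [ 1, 2, -1,  0, 0, 0],
     [ 0, 0,  4,  1, 0, 1],
     [ 1, 0, -2,  2, 0, 0],
     [-1, 0,  4,  0, 2, 1],
     [-1, 0, -2, -2, 0, 0]]

expected = [1, -1]                           # t - 1
for _ in range(5):
    expected = poly_mul(expected, [1, -2])   # times (t - 2), five times

coeffs = char_poly(A)
print([int(c) for c in coeffs])  # [1, -11, 50, -120, 160, -112, 32]
print(coeffs == expected)        # True
```

The coefficients agree with the expansion of $(t-1)(t-2)^5$.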

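Step 2: Find the $p$-th geometric multiplicities:

(i) For $\lambda_1 = 1$:

Put

$$B_1 = A - I = \begin{pmatrix} 0 & 0 & 6 & 2 & 0 & 2 \\ 1 & 1 & -1 & 0 & 0 & 0 \\ 0 & 0 & 3 & 1 & 0 & 1 \\ 1 & 0 & -2 & 1 & 0 & 0 \\ -1 & 0 & 4 & 0 & 1 & 1 \\ -1 & 0 & -2 & -2 & 0 & -1 \end{pmatrix} \to \begin{pmatrix} 1 & 0 & -2 & 1 & 0 & 0 \\ 0 & 1 & 1 & -1 & 0 & 0 \\ 0 & 0 & 3 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}.$$

Thus, $\nu_A(\lambda_1) = n - \operatorname{rk}(B_1) = 6 - 5 = 1$. Since also $m_A(\lambda_1) = 1$, and always $\nu^p(\lambda_1) \le m_A(\lambda_1)$ (cf. the remark after Theorem 6.5), we see that $\nu^p(\lambda_1) = 1$ for all $p \ge 1$. (Alternatively, we could have computed $B_1^2$ and noticed that $\operatorname{rank}(B_1^2) = \operatorname{rank}(B_1)$.)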


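(ii) For $\lambda_2 = 2$:

Put

$$B_2 = A - 2I = \begin{pmatrix} -1 & 0 & 6 & 2 & 0 & 2 \\ 1 & 0 & -1 & 0 & 0 & 0 \\ 0 & 0 & 2 & 1 & 0 & 1 \\ 1 & 0 & -2 & 0 & 0 & 0 \\ -1 & 0 & 4 & 0 & 0 & 1 \\ -1 & 0 & -2 & -2 & 0 & -2 \end{pmatrix} \to \begin{pmatrix} 1 & 0 & -1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}.$$

Thus $\nu_A(\lambda_2) = n - \operatorname{rk}(B_2) = 6 - 4 = 2$. Furthermore, since

$$B_2^2 = \begin{pmatrix} 1 & 0 & -2 & 0 & 0 & 0 \\ -1 & 0 & 4 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ -1 & 0 & 2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & -2 & 0 & 0 & 0 \end{pmatrix} \to \begin{pmatrix} 1 & 0 & -2 & 0 & 0 & 0 \\ 0 & 0 & 2 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix},$$

$$B_2^3 = \begin{pmatrix} -1 & 0 & 2 & 0 & 0 & 0 \\ 1 & 0 & -2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & -2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ -1 & 0 & 2 & 0 & 0 & 0 \end{pmatrix} \to \begin{pmatrix} 1 & 0 & -2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix},$$

it follows that $\nu_A^2(2) = n - \operatorname{rk}(B_2^2) = 6 - 2 = 4$, and $\nu_A^3(2) = n - \operatorname{rk}(B_2^3) = 6 - 1 = 5$. We can stop here because $\nu_A^3(2) = 5 = m_A(2)$, and hence $\nu_A^p(2) = 5$ for all $p \ge 3$.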

Step 3: Find the number of Jordan blocks via the method of second differences.

By step 2 we have:

$$\nu_A^*(1) = (1, 1, 1, 1, \ldots), \qquad \nu_A^*(2) = (2, 4, 5, 5, \ldots).$$

Thus, since $\nu_A(1) = m_A(1) = 1$, we have 1 block $J(1, 1)$. Moreover, since the values of $\nu_A^*(2)$ are identical to those of the $\nu_A^*(7)$ of Example 6.14, it follows that we also have the same number of blocks (by taking second differences).

Thus, $J$ has

1 block of size 1 with eigenvalue $\lambda_1 = 1$,
1 block of size 2 with eigenvalue $\lambda_2 = 2$,
1 block of size 3 with eigenvalue $\lambda_2 = 2$,

and hence the Jordan canonical form $J$ of $A$ is

$$J = \operatorname{Diag}(J(1, 1), J(2, 2), J(2, 3)) = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 2 & 1 & 0 & 0 & 0 \\ 0 & 0 & 2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 2 & 1 & 0 \\ 0 & 0 & 0 & 0 & 2 & 1 \\ 0 & 0 & 0 & 0 & 0 & 2 \end{pmatrix} \quad \text{(up to order)}.$$


Exercises 6.4.

1. Find the generalized geometric multiplicities of the following Jordan matrices:

(a) J = Diag(J(1, 2), J(1, 3));

(b) J = Diag(J(2, 1), J(2, 2), J(2, 3));

(c) J = Diag(J(3, 3), J(3, 3), J(3, 3));

(d) J = Diag(J(1, 3), J(3, 2), J(3, 3)).

2. Let J be a Jordan matrix (in standard form) and suppose that its characteristic polynomial has the form chJ(t) = (t + 1)^m. Find J if its sequence of generalized geometric multiplicities is

(a) ν∗J(−1) = (3, 6, 6, 6, . . .);

(b) ν∗J(−1) = (3, 5, 6, 6, 6, . . .);

(c) ν∗J(−1) = (3, 6, 7, 7, 7, . . .);

(d) ν∗J(−1) = (3, 6, 7, 8, 9, 9, 9, . . .).

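3. Find the Jordan canonical form $J$ of the following matrices and justify your result.

$$\text{(a)}\ A = \begin{pmatrix} -2 & 1 & 0 & -1 & 1 \\ 0 & -2 & 1 & -1 & 1 \\ 0 & 0 & -2 & 0 & 1 \\ 0 & 0 & 0 & -2 & 1 \\ 0 & 0 & 0 & 0 & -2 \end{pmatrix}; \qquad \text{(b)}\ B = \begin{pmatrix} -2 & 0 & 0 & -1 & 1 \\ 0 & -2 & 0 & -1 & 1 \\ 0 & 0 & -2 & 0 & 1 \\ 0 & 0 & 0 & -2 & 1 \\ 0 & 0 & 0 & 0 & -2 \end{pmatrix}.$$

[Do not find $P$ such that $P^{-1}AP = J$ or $P^{-1}BP = J$.]

4. Find the Jordan canonical form of the matrix

$$A = \begin{pmatrix} -1 & 1 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & -1 & 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & -1 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & -1 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & -1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & -1 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}.$$

Be sure to justify your result by suitable computations, but do not find $P$ such that $P^{-1}AP$ is a Jordan matrix.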


6.5 A procedure for finding P such that P−1AP = J

In the previous section we proved that the generalized geometric multiplicities $\nu_A^p(\lambda) = \dim E_A^p(\lambda)$ determine the Jordan canonical form $J$ of a matrix $A$. Here we shall see that the generalized eigenspaces $E_A^p(\lambda)$ can be used to find a matrix $P$ such that $P^{-1}AP = J$; this partly generalizes the method which we learned in section 6.3.

Procedure for finding P such that P−1AP = J

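I. Compute and factor the characteristic polynomial of the $n \times n$ matrix $A$:

$$\operatorname{ch}_A(t) = (t - \lambda_1)^{m_1}(t - \lambda_2)^{m_2} \cdots (t - \lambda_s)^{m_s}.$$

II. For each eigenvalue $\lambda_i$, $1 \le i \le s$, find a basis $\mathcal{B}_i$ of $E_A^*(\lambda_i) := E_A^{m_i}(\lambda_i)$ as follows:

1. Compute $\nu_A^j(\lambda_i) = n - \operatorname{rank}((A - \lambda_i I)^j)$ for $j = 1, \ldots, m_i$, and find the minimal $k = k_i$ such that $\nu_A^k(\lambda_i) = m_i$.

[It is usually a good idea to find the generalized eigenspaces $E_A^j(\lambda_i)$ as well.]

2. Build up a basis $\mathcal{B}_i$ of $E_A^k(\lambda_i) = E_A^{m_i}(\lambda_i) = E_A^*(\lambda_i)$ as follows:

i. Pick a generalized eigenvector $\vec{v} \in E_A^k(\lambda_i)$ of degree $k$ (so $\vec{v} \notin E_A^{k-1}(\lambda_i)$); start the list $\mathcal{B}_i$ with

$$\mathcal{B}_i = \{\vec{w}_{i1} := \vec{v},\ \vec{w}_{i2} := (A - \lambda_i I)\vec{w}_{i1},\ \ldots,\ \vec{w}_{ik} := (A - \lambda_i I)\vec{w}_{i,k-1}\}.$$

ii. If we already have $m_i$ vectors in the list, $\mathcal{B}_i$ is the desired basis and we are done with the eigenvalue $\lambda_i$; otherwise, proceed to the next step.

iii. Determine the largest $\ell$ such that $E_A^{\ell}(\lambda_i) \not\subset E_A^{\ell-1}(\lambda_i) + \operatorname{span}(\mathcal{B}_i)$, and pick a generalized eigenvector $\vec{u} \in E_A^{\ell}(\lambda_i)$ such that $\vec{u} \notin E_A^{\ell-1}(\lambda_i) + \operatorname{span}(\mathcal{B}_i)$. (In other words, pick a generalized eigenvector $\vec{u}$ of highest possible degree $\ell$ such that $\vec{u}$ is linearly independent of the vectors already in the list $\mathcal{B}_i$, together with those of $E_A^{\ell-1}(\lambda_i)$.)

iv. Add the vectors $\vec{u}, (A - \lambda_i I)\vec{u}, \ldots, (A - \lambda_i I)^{\ell-1}\vec{u}$ at the end of the list $\mathcal{B}_i$. Go back to step ii.

III. Each (ordered) list $\mathcal{B}_i$ now has the form $\mathcal{B}_i = \{\vec{w}_{i1}, \vec{w}_{i2}, \ldots, \vec{w}_{im_i}\}$. Assemble these lists as the column vectors of the matrix $P$ by reversing the order in each list $\mathcal{B}_i$; thus,

$$P = (\underbrace{\vec{w}_{1m_1} \mid \vec{w}_{1,m_1-1} \mid \ldots \mid \vec{w}_{11}}_{\mathcal{B}_1} \mid \underbrace{\vec{w}_{2m_2} \mid \vec{w}_{2,m_2-1} \mid \ldots \mid \vec{w}_{21}}_{\mathcal{B}_2} \mid \ldots \mid \underbrace{\vec{w}_{sm_s} \mid \ldots \mid \vec{w}_{s1}}_{\mathcal{B}_s}).$$

Then: $J = P^{-1}AP$ is the Jordan Canonical Form of $A$, and the Jordan blocks with the same eigenvalue are arranged in order of increasing block size, i.e. $J$ is in reverse standard form (cf. p. 281).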
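The bookkeeping in steps II.2 and III (build each chain, append it to the list, reverse the list) can be sketched as follows for a single eigenvalue. This is only an illustration: `chain` and `assemble_P` are hypothetical helper names, and the generators $\vec{v}, \vec{u}, \ldots$ are assumed to have been chosen by hand as in steps i and iii, which the code does not automate. The test data is the matrix $A = \operatorname{Diag}(J(3, 2), J(3, 3))$, for which the generators $\vec{e}_5$ and $\vec{e}_2$ are a valid choice.

```python
def matvec(B, v):
    return [sum(b * x for b, x in zip(row, v)) for row in B]

def chain(B, v):
    """The Jordan chain v, Bv, B^2 v, ... up to (excluding) the first zero vector."""
    out = []
    while any(v):
        if len(out) == len(B):
            raise ValueError("v is not a generalized eigenvector")
        out.append(v)
        v = matvec(B, v)
    return out

def assemble_P(B, generators):
    """Steps II.2 and III for one eigenvalue.

    B is A - lambda*I; generators are the hand-picked vectors v, u, ...
    in the order in which the procedure selects them.
    """
    basis = [w for g in generators for w in chain(B, g)]   # the ordered list B_i
    columns = basis[::-1]                                  # reverse the list
    return [list(row) for row in zip(*columns)]            # columns -> rows of P

# B = A - 3I for A = Diag(J(3, 2), J(3, 3))
B = [[0, 1, 0, 0, 0],
     [0, 0, 0, 0, 0],
     [0, 0, 0, 1, 0],
     [0, 0, 0, 0, 1],
     [0, 0, 0, 0, 0]]
e5 = [0, 0, 0, 0, 1]
e2 = [0, 1, 0, 0, 0]

P = assemble_P(B, [e5, e2])
print(P == [[int(i == j) for j in range(5)] for i in range(5)])  # True
```

Since $A$ is already in reverse standard form, the assembled $P$ is the identity matrix.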


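Example 6.17. Verify the algorithm for

$$A = \operatorname{Diag}(J(3, 2), J(3, 3)) = \begin{pmatrix} 3 & 1 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 & 0 \\ 0 & 0 & 3 & 1 & 0 \\ 0 & 0 & 0 & 3 & 1 \\ 0 & 0 & 0 & 0 & 3 \end{pmatrix}.$$

Solution. Since this matrix is already in (reverse standard) Jordan Canonical Form, we know that we can take $P = I$. Thus, we expect the algorithm to assemble the identity matrix $I$.

I. Clearly, $\operatorname{ch}_A(t) = (t - 3)^5$, so $\lambda_1 = 3$ and $m_1 = n = 5$.

II. 1. The generalized eigenspaces are:

$$E_A(3) = \operatorname{Nullsp}(A - 3I) = \operatorname{Nullsp}\begin{pmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} = \{c_1\vec{e}_1 + c_2\vec{e}_3\},$$

$$E_A^2(3) = \operatorname{Nullsp}((A - 3I)^2) = \operatorname{Nullsp}\begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} = \{c_1\vec{e}_1 + c_2\vec{e}_2 + c_3\vec{e}_3 + c_4\vec{e}_4\},$$

$$E_A^3(3) = \operatorname{Nullsp}((A - 3I)^3) = \operatorname{Nullsp}(0) = \mathbb{C}^5.$$

Thus, $\nu_A(3) = 2 < \nu_A^2(3) = 4 < \nu_A^3(3) = 5 = m_1$, and so $k = 3$.

2. We now construct the basis $\mathcal{B}_1$ of $E_A^3(3) = \mathbb{C}^5$ as follows.

i. Pick $\vec{v} \in \mathbb{C}^5$ of exact degree 3, i.e., $\vec{v} \notin E_A^2(3)$. For example, take $\vec{v} = \vec{e}_5$. Then: $\vec{w}_{11} = \vec{e}_5$, $\vec{w}_{12} = (A - 3I)\vec{e}_5 = \vec{e}_4$, $\vec{w}_{13} = (A - 3I)\vec{e}_4 = \vec{e}_3$.

ii. At this point the list is $\mathcal{B}_1 = \{\vec{e}_5, \vec{e}_4, \vec{e}_3\}$, which consists of only $3 < 5$ elements, so we continue with step iii.

iii. Since $E_A^2(3) + \operatorname{span}(\mathcal{B}_1) = E_A^3(3)$, we cannot take $\ell = 3$. Thus, try $\ell = 2$, and look for $\vec{u} \in E_A^2(3) = \{c_1\vec{e}_1 + c_2\vec{e}_2 + c_3\vec{e}_3 + c_4\vec{e}_4\}$ such that $\vec{u} \notin E_A(3) + \operatorname{span}(\mathcal{B}_1) = \{c_1\vec{e}_1 + c_3\vec{e}_3 + c_4\vec{e}_4 + c_5\vec{e}_5\}$. Clearly, we can take $\vec{u} = \vec{e}_2$ (and hence $\ell = 2$).

iv. Thus, we add $\vec{u} = \vec{e}_2$ and $(A - 3I)\vec{u} = \vec{e}_1$ at the end of $\mathcal{B}_1$ to get $\mathcal{B}_1 = \{\vec{e}_5, \vec{e}_4, \vec{e}_3, \vec{e}_2, \vec{e}_1\}$. Since we now have $5 = m_1$ vectors in $\mathcal{B}_1$, we have constructed the desired basis of $E_A^3(3) = \mathbb{C}^5$.

III. Assembling $P$ from $\mathcal{B}_1$ (in reverse order) yields $P = (\vec{e}_1 \mid \ldots \mid \vec{e}_5) = I$, the $5 \times 5$ identity matrix. Thus, $P^{-1}AP = A = J$ is in Jordan Canonical Form.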


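Example 6.18. If $A$ is the matrix of Example 6.16, find $P$ such that $J = P^{-1}AP$ is the Jordan canonical form of $A$.

Solution. We follow the steps of the algorithm.

Step I. Compute and factor the characteristic polynomial:

By Example 6.16 we know that the characteristic polynomial is $\operatorname{ch}_A(t) = (t - 1)(t - 2)^5$, so we have 2 eigenvalues: $\lambda_1 = 1$ and $\lambda_2 = 2$.

Step II. For each $i$ find a basis $\mathcal{B}_i$ of $E_A^{m_i}(\lambda_i)$:

a) For $\lambda_1 = 1$:

1. Again, from Example 6.16 we know that $\nu_A(1) = 1$; moreover, by the reduced matrix given there we obtain

$$E_A(1) = \langle (1, -1, 0, -1, 0, 1)^t \rangle.$$

2. Thus, $\mathcal{B}_1 = \{\vec{w}_{11}\}$, where $\vec{w}_{11} := (1, -1, 0, -1, 0, 1)^t$.

b) For $\lambda_2 = 2$:

1. From Example 6.16 we know that $\nu_A^*(2) = (2, 4, 5, 5, \ldots)$, so the smallest $k$ such that $\nu_A^k(2) = m_2$ is $k = 3$. Moreover, from the row reduced matrices of Example 6.16 we obtain:

$E_A(2) = \langle \vec{u}_{11}, \vec{u}_{12} \rangle$, where $\vec{u}_{11} = (0, 1, 0, 0, 0, 0)^t$, $\vec{u}_{12} = (0, 0, 0, 0, 1, 0)^t$;

$E_A^2(2) = \langle \vec{u}_{11}, \vec{u}_{12}, \vec{u}_{21}, \vec{u}_{22} \rangle$, where $\vec{u}_{21} = (0, 0, 0, -1, 0, 1)^t$, $\vec{u}_{22} = (2, 0, 1, -2, 0, 0)^t$;

$E_A^3(2) = \langle \vec{u}_{11}, \vec{u}_{12}, \vec{u}_{31}, \vec{u}_{32}, \vec{u}_{33} \rangle$, where $\vec{u}_{31} = (0, 0, 0, 1, 0, 0)^t$, $\vec{u}_{32} = (0, 0, 0, 0, 0, 1)^t$, $\vec{u}_{33} = (2, 0, 1, 0, 0, 0)^t$.

Note that there are two relations among the $\vec{u}_{ij}$'s: $\vec{u}_{21} = -\vec{u}_{31} + \vec{u}_{32}$ and $\vec{u}_{22} = \vec{u}_{33} - 2\vec{u}_{31}$.

2. i. Clearly, $\vec{v} = \vec{u}_{31} = (0, 0, 0, 1, 0, 0)^t \in E_A^3(2)$ but $\vec{v} \notin E_A^2(2)$, so $\vec{v}$ has degree 3. Thus, we can start the list $\mathcal{B}_2$ with $\vec{v}$ as a generator. (We could also take instead $\vec{v} = \vec{u}_{32}$ or $\vec{u}_{33}$ or most linear combinations of these vectors.) Thus, put $\vec{w}_{21} = \vec{v}$, $\vec{w}_{22} = (A - 2I)\vec{w}_{21} = (2, 0, 1, 0, 0, -2)^t = \vec{u}_{22} - 2\vec{u}_{21}$, $\vec{w}_{23} = (A - 2I)\vec{w}_{22} = (0, 1, 0, 0, 0, 0)^t = \vec{u}_{11}$. (Note that $(A - 2I)\vec{w}_{23} = \vec{0}$, as expected since $k = 3$.) We thus have the list

$$\mathcal{B}_{21} = \{\vec{w}_{21}, \vec{w}_{22}, \vec{w}_{23}\}.$$

ii. Since $\#\mathcal{B}_{21} = 3 < m_2 = 5$, we continue with the next step.

iii. Clearly $E_A^2(2) + \operatorname{span}(\mathcal{B}_{21}) = \langle \vec{u}_{11}, \vec{u}_{12}, \vec{u}_{21}, \vec{u}_{22}, \vec{u}_{31}, \vec{u}_{22} - 2\vec{u}_{21} \rangle = E_A^3(2)$, so $\ell < 3$. However, $\vec{u}_{21} \notin E_A(2) + \operatorname{span}(\mathcal{B}_{21}) = \langle \vec{u}_{11}, \vec{u}_{12}, \vec{u}_{31}, \vec{u}_{22} - 2\vec{u}_{21} \rangle$ but $\vec{u}_{21} \in E_A^2(2)$, so $\ell = 2$. Moreover, we can take $\vec{w}_{24} := \vec{u}_{21}$ as the next generator. Thus $\vec{w}_{25} := (A - 2I)\vec{w}_{24} = (0, 0, 0, 0, 1, 0)^t = \vec{u}_{12}$ is the next element in the list:

$$\mathcal{B}_{22} = \{\vec{w}_{24}, \vec{w}_{25}\}.$$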


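ii. We have now constructed the list $\mathcal{B}_2 = \mathcal{B}_{21} \cup \mathcal{B}_{22} = \{\vec{w}_{21}, \ldots, \vec{w}_{25}\}$. Since $\#\mathcal{B}_2 = 5 = m_A(2)$, we are done with step II.

Step III. In step II we constructed the lists $\mathcal{B}_1 = \{\vec{w}_{11}\}$ and $\mathcal{B}_2 = \mathcal{B}_{21} \cup \mathcal{B}_{22} = \{\vec{w}_{21}, \ldots, \vec{w}_{25}\}$. Assembling the elements of each list in reverse order yields

$$P = (\vec{w}_{11} \mid \vec{w}_{25} \mid \vec{w}_{24} \mid \ldots \mid \vec{w}_{21}) = \begin{pmatrix} 1 & 0 & 0 & 0 & 2 & 0 \\ -1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ -1 & 0 & -1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & -2 & 0 \end{pmatrix}.$$

Thus, $P$ is the matrix which transforms $A$ to its Jordan canonical form; i.e. $P$ is such that $J = P^{-1}AP$ is a Jordan matrix.

Check:

$$\begin{aligned} P^{-1}AP &= \begin{pmatrix} 1 & 0 & -2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ -1 & 0 & 4 & 0 & 0 & 1 \\ 1 & 1 & -2 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 2 & 1 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 6 & 2 & 0 & 2 \\ 1 & 2 & -1 & 0 & 0 & 0 \\ 0 & 0 & 4 & 1 & 0 & 1 \\ 1 & 0 & -2 & 2 & 0 & 0 \\ -1 & 0 & 4 & 0 & 2 & 1 \\ -1 & 0 & -2 & -2 & 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 & 2 & 0 \\ -1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ -1 & 0 & -1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & -2 & 0 \end{pmatrix} \\ &= \begin{pmatrix} 1 & 0 & -2 & 0 & 0 & 0 \\ -1 & 0 & 4 & 0 & 2 & 1 \\ -2 & 0 & 8 & 0 & 0 & 2 \\ 2 & 2 & -3 & 0 & 0 & 0 \\ 0 & 0 & 4 & 1 & 0 & 1 \\ 0 & 0 & 4 & 2 & 0 & 2 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 & 2 & 0 \\ -1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ -1 & 0 & -1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & -2 & 0 \end{pmatrix} \\ &= \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 2 & 1 & 0 & 0 & 0 \\ 0 & 0 & 2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 2 & 1 & 0 \\ 0 & 0 & 0 & 0 & 2 & 1 \\ 0 & 0 & 0 & 0 & 0 & 2 \end{pmatrix}. \end{aligned}$$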
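The check can also be done without ever forming a product with $P^{-1}$: since $P$ is invertible, $P^{-1}AP = J$ is equivalent to $AP = PJ$. The following sketch verifies this in integer arithmetic, and additionally confirms that the matrix displayed as $P^{-1}$ above really is the inverse of $P$. The variable names are chosen here for illustration.

```python
def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

A = [[ 1, 0,  6,  2, 0, 2],
     [ 1, 2, -1,  0, 0, 0],
     [ 0, 0,  4,  1, 0, 1],
     [ 1, 0, -2,  2, 0, 0],
     [-1, 0,  4,  0, 2, 1],
     [-1, 0, -2, -2, 0, 0]]

P = [[ 1, 0,  0, 0,  2, 0],
     [-1, 0,  0, 1,  0, 0],
     [ 0, 0,  0, 0,  1, 0],
     [-1, 0, -1, 0,  0, 1],
     [ 0, 1,  0, 0,  0, 0],
     [ 1, 0,  1, 0, -2, 0]]

P_inv = [[ 1, 0, -2, 0, 0, 0],
         [ 0, 0,  0, 0, 1, 0],
         [-1, 0,  4, 0, 0, 1],
         [ 1, 1, -2, 0, 0, 0],
         [ 0, 0,  1, 0, 0, 0],
         [ 0, 0,  2, 1, 0, 1]]

J = [[1, 0, 0, 0, 0, 0],
     [0, 2, 1, 0, 0, 0],
     [0, 0, 2, 0, 0, 0],
     [0, 0, 0, 2, 1, 0],
     [0, 0, 0, 0, 2, 1],
     [0, 0, 0, 0, 0, 2]]

identity = [[int(i == j) for j in range(6)] for i in range(6)]

print(matmul(P_inv, P) == identity)       # True
print(matmul(A, P) == matmul(P, J))       # True
```

Column by column, $AP = PJ$ says that $A\vec{w}_{11} = \vec{w}_{11}$ and that each chain vector satisfies $A\vec{w} = 2\vec{w} + (A - 2I)\vec{w}$, which is exactly how the lists were built.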

Exercises 6.5.

1. Find a matrix $P$ such that $P^{-1}AP$ is in Jordan canonical form for each of the matrices $A$ of Problem 3 of Exercises 6.4.

2. If $A$ is as in Problem 4 of Exercises 6.4, find $P$ such that $P^{-1}AP$ is in Jordan canonical form.

3. Find a matrix $P$ such that $P^{-1}AP$ is in Jordan canonical form where

$$A = \begin{pmatrix} -1 & 1 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & -1 & 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & -1 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & -1 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & -1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \end{pmatrix}.$$


6.6 A Proof of the Cayley–Hamilton Theorem

As was promised in the introduction, we want to use Jordan’s theorem (Theorem 6.4) to prove:¹

Theorem 6.6 (Cayley-Hamilton). For any square matrix $A$ we have $\operatorname{ch}_A(A) = 0$.

Proof. We present here a proof that is typical of all the proofs using the Jordan canonical form, in that they all follow the same pattern:

Step 1. Prove the assertion for Jordan blocks.

Step 2. Prove the statement for Jordan matrices (using step 1).

Step 3. Deduce from step 2 and Jordan’s theorem that the assertion is true for a general matrix.

We now apply this strategy to proving the Cayley-Hamilton theorem.

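Step 1. Let $A = J(\lambda, k)$ be a Jordan block.

Then clearly $\operatorname{ch}_A(t) = (t - \lambda)^k$, so

$$\operatorname{ch}_A(A) = (J(\lambda, k) - \lambda I)^k = J(0, k)^k = 0$$

because for any $p \le k$ the matrix $J(0, k)^p$ has 1's on its $p$-th superdiagonal (so the 1 in the top row sits in column $p + 1$) and 0's elsewhere (cf. Example 6.13):

$$J(0, k)^p = \begin{pmatrix} 0 & \cdots & 0 & 1 & & 0 \\ \vdots & & & \ddots & \ddots & \\ \vdots & & & & \ddots & 1 \\ \vdots & & & & & 0 \\ \vdots & & & & & \vdots \\ 0 & \cdots & \cdots & \cdots & \cdots & 0 \end{pmatrix}.$$

Thus, the Cayley-Hamilton theorem holds for Jordan blocks.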

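Step 2. Let $A = \operatorname{Diag}(J_{11}, \ldots, J_{ij}, \ldots)$ be a Jordan matrix.

Put $c(t) = \operatorname{ch}_A(t)$. Since $A$ is a diagonal block matrix, we have

$$c(t) = \operatorname{ch}_{J_{11}}(t) \cdots \operatorname{ch}_{J_{ij}}(t) \cdots,$$

so in particular for each $i, j$ we have $c(t) = g_{ij}(t)\operatorname{ch}_{J_{ij}}(t)$ for some polynomial $g_{ij}(t)$.

Thus, $c(J_{ij}) = g_{ij}(J_{ij})\operatorname{ch}_{J_{ij}}(J_{ij}) = 0$ since by step 1 we have $\operatorname{ch}_{J_{ij}}(J_{ij}) = 0$. Thus

$$c(A) = c(\operatorname{Diag}(J_{11}, \ldots, J_{ij}, \ldots)) = \operatorname{Diag}(c(J_{11}), \ldots, c(J_{ij}), \ldots) = \operatorname{Diag}(0, \ldots, 0, \ldots) = 0,$$

and so the statement holds for Jordan matrices.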

¹ In the appendix we shall give a direct proof of the Cayley-Hamilton Theorem; cf. Theorem 6.7 and the remark following it.


Step 3. Let $A$ be an arbitrary matrix.

By Jordan’s theorem, there is a matrix $P$ such that $J = P^{-1}AP$ is a Jordan matrix. Then $\operatorname{ch}_A(t) = \operatorname{ch}_J(t)$ and so, using step 2, we obtain

$$\operatorname{ch}_A(A) = \operatorname{ch}_J(PJP^{-1}) = P\underbrace{\operatorname{ch}_J(J)}_{0}P^{-1} = 0.$$

Thus, the Cayley-Hamilton theorem holds for an arbitrary matrix $A$.
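For a concrete instance of the theorem, take the matrix $A$ of Example 6.16, whose characteristic polynomial is $(t - 1)(t - 2)^5$. The sketch below (illustrative helper names, integer arithmetic only) evaluates $\operatorname{ch}_A(A) = (A - I)(A - 2I)^5$ and confirms that it is the zero matrix.

```python
def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def shifted(A, lam):
    """Return A - lam*I."""
    n = len(A)
    return [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]

A = [[ 1, 0,  6,  2, 0, 2],
     [ 1, 2, -1,  0, 0, 0],
     [ 0, 0,  4,  1, 0, 1],
     [ 1, 0, -2,  2, 0, 0],
     [-1, 0,  4,  0, 2, 1],
     [-1, 0, -2, -2, 0, 0]]

# ch_A(A) = (A - I)(A - 2I)^5
result = shifted(A, 1)
for _ in range(5):
    result = matmul(result, shifted(A, 2))

is_zero = all(x == 0 for row in result for x in row)
print(is_zero)  # True
```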

As was mentioned in the above proof, the basic strategy used in the proof applies to many other situations as well. For example:

Example 6.19. Find all $m \times m$ matrices $A$ such that $A^2 = I$.

Solution. We follow the above strategy.

Step 1. Find all Jordan blocks $J = J(\lambda, m)$ satisfying $J^2 = I$.

By the explicit formula for powers of Jordan blocks (cf. Theorem 5.7), this can only happen if $m = 1$. Furthermore, in that case we must have that $\lambda = \pm 1$, i.e. either $\lambda = 1$ or $\lambda = -1$.

Step 2. Find all Jordan matrices $J = \operatorname{Diag}(J_{11}, \ldots, J_{ij}, \ldots)$ satisfying $J^2 = I$.

Since $J^2 = I_m$ implies that $J_{ij}^2 = I_{k_{ij}}$, we obtain from step 1 that $J$ is a diagonal matrix with $\pm 1$ along the diagonal. (Conversely, every matrix $J$ of this form satisfies $J^2 = I$.)

Step 3. General case: $A = PJP^{-1}$ is any matrix such that $A^2 = I$.

Then we also have $J^2 = P^{-1}A^2P = I$, so by step 2 $A$ is similar to a diagonal matrix $J = \operatorname{Diag}(\pm 1, \ldots, \pm 1)$. Conversely, any matrix $A$ of this form satisfies $A^2 = I$.

Conclusion. A matrix $A$ satisfies $A^2 = I$ if and only if it is similar to a diagonal matrix of the form $\operatorname{Diag}(\pm 1, \ldots, \pm 1)$.
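Both directions of this conclusion can be illustrated on $2 \times 2$ matrices. In the sketch below the matrices `P`, `P_inv` and `J` are arbitrary choices made for the illustration (they do not come from the text): conjugating $\operatorname{Diag}(1, -1)$ gives a non-diagonal matrix $A$ with $A^2 = I$, whereas the Jordan block $J(1, 2)$, which has size larger than 1, does not square to $I$.

```python
def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

I2 = [[1, 0], [0, 1]]

# An arbitrary invertible P with its inverse, and J = Diag(1, -1).
P     = [[1, 1], [0, 1]]
P_inv = [[1, -1], [0, 1]]
J     = [[1, 0], [0, -1]]

A = matmul(matmul(P, J), P_inv)
print(A)                     # [[1, -2], [0, -1]]
print(matmul(A, A) == I2)    # True

# A Jordan block of size 2 with eigenvalue 1 fails the equation.
block = [[1, 1], [0, 1]]
print(matmul(block, block))  # [[1, 2], [0, 1]]
```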

Exercises 6.6.

1. Let $A$ be a matrix with characteristic polynomial

$$\operatorname{ch}_A(t) = (t - \lambda_1)^{m_1}(t - \lambda_2)^{m_2} \cdots (t - \lambda_s)^{m_s}.$$

Prove that for any polynomial $f(t) \in \mathbb{C}[t]$, the characteristic polynomial of $f(A)$ is

$$\operatorname{ch}_{f(A)}(t) = (t - f(\lambda_1))^{m_1}(t - f(\lambda_2))^{m_2} \cdots (t - f(\lambda_s))^{m_s}.$$

Hint: First prove it in the case that $A$ is a Jordan block, then for a general Jordan matrix, and then use Jordan’s theorem.

2. Find all $m \times m$ matrices $A$ satisfying $A^2 = A$. (A matrix satisfying this equation is called an idempotent matrix.)
