
Lecture Notes on Numerical Linear Algebra

Dongwoo Sheen

Department of Mathematics, Seoul National University, Seoul 151-747

March 27, 2002


© 2001 D. Sheen <http://www.nasc.snu.ac.kr/> All Rights Reserved


Contents

1 Introduction
  1.1 Difficulties in Computation in Numerical Linear Algebra Problems
  1.2 Examples

2 The Gaussian Elimination
  2.1 Gaussian elimination - an example
  2.2 LU-decomposition with partial pivoting
    2.2.1 Examples needing partial pivoting
    2.2.2 Gaussian Elimination with partial pivoting
    2.2.3 Doolittle's algorithm for LU-decomposition
    2.2.4 Direct triangular decompositions
    2.2.5 Gauss-Jordan Algorithm
    2.2.6 LDM^T-decomposition
    2.2.7 Banded matrices
    2.2.8 Gaussian elimination of a banded matrix with partial pivoting

3 Matrix norms and condition number
  3.1 Norm
  3.2 Matrix norm
  3.3 Condition Number
    3.3.1 The concept of a condition number κ(A)

4 Householder transformation
  4.1 Motivation
  4.2 Householder transformation

5 Singular value decomposition (SVD) and Least Squares Problems
  5.1 Singular Value Decomposition (SVD)
  5.2 Schur Decomposition
  5.3 Least squares problems
  5.4 Moore-Penrose pseudo-inverse A† of A ∈ M(m,n)

6 Eigenvalues and eigenvectors
  6.1 Cayley-Hamilton theorem
  6.2 Gerschgorin theorem
  6.3 The power method (to compute eigenvalues and eigenvectors)
    6.3.1 Convergence analysis
    6.3.2 Inverse iteration

7 Iterative methods for linear systems
  7.1 (Gauss-)Jacobi method and Seidel method
    7.1.1 Jacobi method
    7.1.2 Seidel method
  7.2 Richardson iterative method


Chapter 1

Introduction

1.1 Difficulties in Computation in Numerical Linear Algebra Problems

Numerical methods for solving large-scale linear algebraic problems are needed to solve various partial/ordinary differential equations by the finite element method, the finite difference method, the spectral method, etc. Such methods are also essential in the solution of optimization problems.

Difficulties in the practical computation of large linear systems arise from the following observations:

• the computational cost is usually very high;

• accuracy may be lost when computing with a fixed number of digits;

• a method may not be applicable to different classes of problems.

The main questions about numerical methods for linear systems are:

• How fast is the numerical method in the sense of operation counts (flops: 1 flop = 1 multiplication + 1 addition)?

• What is the accuracy? Can a priori and a posteriori estimates be given?

• What is the range of problems covered by the method?


1.2 Examples

We begin with a short illustration of operation counting for the computation of det(A) for a given n × n matrix A. From the formula in elementary linear algebra,

\det(A) = \sum_{\sigma} \operatorname{sign}(\sigma)\, a_{1\sigma_1} \cdots a_{n\sigma_n}, (1.1)

where the summation is taken over all permutations σ : {1, …, n} → {1, …, n} and the signature function is defined as usual:

\operatorname{sign}(\sigma) = \begin{cases} 1, & \sigma \text{ is an even permutation}, \\ -1, & \sigma \text{ is an odd permutation}. \end{cases} (1.2)

Moreover, the determinant of A can be computed using cofactors in the following manner:

\det(A) = \sum_{k=1}^{n} (-1)^{1+k} a_{1k} \det(A_{1k}), (1.3)

where A_{1k} is the (n−1) × (n−1) submatrix obtained by deleting the 1st row and k-th column from A, i.e. the minor associated with the (1,k)-cofactor of A. The determinants of the A_{1k} can then be computed by formula (1.3) recursively, until the submatrices become 1 × 1. Therefore

flops(det(A)) = n · flops(det of an (n−1) × (n−1) submatrix)
             = n(n−1) · flops(det of an (n−2) × (n−2) submatrix)
             = ··· = n! .

Imagine how huge the number of operations would be if this elementary rule were used to calculate the determinant of a 1,000,000 × 1,000,000 matrix; matrices of such size occur frequently in actual computation.

Another example is the computation of the inverse of a nonsingular matrix A. Cramer's rule implies

A^{-1} = \frac{1}{\det(A)} \left( (-1)^{j+k} \det(A_{jk}) \right),

where A_{jk} denotes the minor of A obtained by deleting the j-th row and k-th column. If one uses the above idea to compute the determinants, the total flop count is n! + n²·(n−1)! = (n+1)!, which is impossible in practical computing.

Later in this chapter we will see that the flop count is reduced substantially by decomposing the matrix A into a product of lower and upper triangular matrices, which results from the Gaussian elimination procedure.
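To make the growth concrete, the following is a small illustrative sketch in Python/NumPy (added here, not part of the original notes): it evaluates det(A) by the recursive cofactor expansion (1.3) while counting multiplications, and compares the count with n!. The function name and the counting scheme are only for illustration.

    import math
    import numpy as np

    def det_cofactor(A, counter):
        # expansion along the first row, as in (1.3); counter[0] counts multiplications
        n = A.shape[0]
        if n == 1:
            return A[0, 0]
        d = 0.0
        for k in range(n):
            minor = np.delete(np.delete(A, 0, axis=0), k, axis=1)   # the submatrix A_{1k}
            d += (-1) ** k * A[0, k] * det_cofactor(minor, counter)
            counter[0] += 1          # one multiplication a_{1k} * det(A_{1k})
        return d

    n = 8
    A = np.random.rand(n, n)
    cnt = [0]
    d = det_cofactor(A, cnt)
    print(abs(d - np.linalg.det(A)))                               # agrees up to rounding
    print(cnt[0], "multiplications; n! =", math.factorial(n))      # both grow like n!

Already for n = 8 the recursion needs tens of thousands of multiplications, whereas the LU-based approach of this chapter needs on the order of n³/3.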


Chapter 2

The Gaussian Elimination

2.1 Gaussian elimination - an example

In this chapter we consider the linear system

Ax = b, (2.1)

where A is a given n × n matrix and b a given vector in C^n. As a simple example, we consider the following A and b:

A = \begin{pmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{pmatrix}, \qquad b = \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}. (2.2)

Let A^{(1)} denote the augmented matrix obtained by gluing the vector b to A as the fourth column, so that

A^{(1)} = \begin{pmatrix} 2 & -1 & 0 & 1 \\ -1 & 2 & -1 & 2 \\ 0 & -1 & 2 & 1 \end{pmatrix}. (2.3)

The Gaussian elimination procedure is then to reduce the matrix A^{(1)} to an upper triangular matrix by elementary row operations, which can be interpreted as follows.

Let G_1 = I − m_1 e_1^t, with

m_1 = (0, m_{21}, m_{31})^t = \frac{1}{a^{(1)}_{11}} (0, a^{(1)}_{21}, a^{(1)}_{31})^t = (0, -\tfrac12, 0)^t.


Then the first step of Gaussian elimination can be understood as multiplying the matrices G_1 and A^{(1)} to obtain A^{(2)} as follows:

G_1 A^{(1)} = \begin{pmatrix} 1 & 0 & 0 \\ \tfrac12 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 2 & -1 & 0 & 1 \\ -1 & 2 & -1 & 2 \\ 0 & -1 & 2 & 1 \end{pmatrix} = \begin{pmatrix} 2 & -1 & 0 & 1 \\ 0 & \tfrac32 & -1 & \tfrac52 \\ 0 & -1 & 2 & 1 \end{pmatrix} =: A^{(2)}. (2.4)

Next, let G_2 = I − m_2 e_2^t, with

m_2 = (0, 0, m_{32})^t = \frac{1}{a^{(2)}_{22}} (0, 0, a^{(2)}_{32})^t = (0, 0, -\tfrac23)^t.

Then the second step of Gaussian elimination can be understood as multiplying the matrices G_2 and A^{(2)} to obtain A^{(3)} as follows:

G_2 A^{(2)} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & \tfrac23 & 1 \end{pmatrix} \begin{pmatrix} 2 & -1 & 0 & 1 \\ 0 & \tfrac32 & -1 & \tfrac52 \\ 0 & -1 & 2 & 1 \end{pmatrix} = \begin{pmatrix} 2 & -1 & 0 & 1 \\ 0 & \tfrac32 & -1 & \tfrac52 \\ 0 & 0 & \tfrac43 & \tfrac83 \end{pmatrix} =: A^{(3)}. (2.5)

Definition 2.1. A matrix A is called lower (respectively, upper) triangular if a_{jk} = 0 for all j < k (respectively, j > k). Moreover, if all the diagonal elements of such a matrix are equal to 1, the matrix is said to be unit lower (or upper) triangular:

L = \begin{pmatrix} 1 & & 0 \\ & \ddots & \\ * & & 1 \end{pmatrix} \quad \text{and} \quad U = \begin{pmatrix} u_{11} & & * \\ & \ddots & \\ 0 & & u_{nn} \end{pmatrix}.

Proposition 2.1. If A and B are (unit) lower triangular, then so is AB.

Definition 2.2. G_k = I − g e_k^T is called a Gaussian transformation matrix with Gauss vector g = (0, …, 0, g_{k+1}, …, g_n)^t, where e_k is the k-th standard unit vector:

G_k = \begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & & 1 & & \\ & & -g_{k+1} & 1 & \\ & & \vdots & & \ddots \\ & & -g_n & & & 1 \end{pmatrix}.


Note that the matrix A(3) is upper triangular.

Summarizing the above procedure, one can write

A^{(3)} = G_2 A^{(2)} = G_2 G_1 A^{(1)}, (2.6)

from which one has the following decomposition of A^{(1)} as a product of a unit lower triangular matrix L and an upper triangular matrix U:

A^{(1)} = LU = \left[ G_1^{-1} G_2^{-1} \right] A^{(3)}, (2.7)

where G_j^{-1} is the inverse of G_j.

Exercise 2.1. It is immediate to see that

G_j^{-1} = I + m_j e_j^t, \qquad j = 1, 2. (2.8)

Exercise 2.2. Check that L = G_1^{-1} G_2^{-1} is a lower triangular matrix with all diagonal entries equal to 1. Moreover, show that

L = \begin{pmatrix} 1 & 0 & 0 \\ m_{21} & 1 & 0 \\ m_{31} & m_{32} & 1 \end{pmatrix}.

Exercise 2.3. Verify that the above procedure, together with Exercises 2.1 and 2.2, is valid for any n × n matrix, as long as a^{(j)}_{jj} ≠ 0 for j = 1, …, n−1.

Exercise 2.4. Let L and U be (unit) lower and upper triangular matrices.

• Show that L^{-1} and U^{-1} are also lower and upper triangular matrices, provided l_{jj} ≠ 0 and u_{jj} ≠ 0 for all j.

• How many flops are needed to calculate L^{-1} and U^{-1}?

• If A = LU, then A^{-1} = U^{-1}L^{-1}. Assuming that L^{-1} and U^{-1} have already been computed (as above), how many flops are required to compute U^{-1}L^{-1}?

From (2.5) it follows that

2x_1 - x_2 = 1, \qquad \tfrac32 x_2 - x_3 = \tfrac52, \qquad \tfrac43 x_3 = \tfrac83. (2.9)

Thus the solution is easily obtained by solving the equations (2.9) in backward order:

x_3 = 2, \quad x_2 = 3, \quad x_1 = 2.
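As a check, here is a short illustrative sketch in Python/NumPy (added here; the notes' own algorithms are given in Fortran-style pseudocode) that carries out the elimination steps (2.4)-(2.5) on the augmented matrix and then the back substitution of (2.9) for the example (2.2).

    import numpy as np

    A = np.array([[2., -1., 0.], [-1., 2., -1.], [0., -1., 2.]])
    b = np.array([1., 2., 1.])
    Ab = np.hstack([A, b[:, None]])        # augmented matrix A^(1)

    n = 3
    for j in range(n - 1):                 # forward elimination (no pivoting needed here)
        for i in range(j + 1, n):
            m = Ab[i, j] / Ab[j, j]        # multiplier
            Ab[i, :] -= m * Ab[j, :]

    x = np.zeros(n)                        # back substitution on the triangular system (2.9)
    for j in range(n - 1, -1, -1):
        x[j] = (Ab[j, n] - Ab[j, j+1:n] @ x[j+1:n]) / Ab[j, j]

    print(Ab)      # the upper triangular augmented matrix A^(3)
    print(x)       # [2., 3., 2.]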


2.2 LU-decomposition with partial pivoting

2.2.1 Examples needing partial pivoting

Why do we need pivoting? (This part will be rewritten.)

Example 2.1. Work in base β = 10 with t = 3 digits (decimal, rounding after the 3rd digit). The system

\begin{pmatrix} 0.001 & 1.00 \\ 1.00 & 2.00 \end{pmatrix} X = \begin{pmatrix} 1.00 \\ 3.00 \end{pmatrix}, \qquad X = \begin{pmatrix} 1.002 \\ 0.998 \end{pmatrix},

has the tiny pivot a_{11} = 0.001; eliminating with it produces a large multiplier and a loss of accuracy in 3-digit arithmetic, which row interchange avoids.

Example 2.2. The matrix

\begin{pmatrix} 0 & 1.00 \\ 2.00 & 0 \end{pmatrix} (2.10)

cannot be eliminated at all without a row exchange, since the (1,1) pivot is zero.

2.2.2 Gaussian Elimination with partial pivoting

In the previous subsection the Gaussian elimination procedure is interpreted as multiplying by the Gauss

transformation matrix Gj to the left of A(j) at j-th step for j = 1, · · · , n− 1 so that the resulting upper

triangular matrix U = Gn−1 · · ·G2G1A is obtained. However, Exmaples 2.1 and 2.2 show that one needs

to swap the j-th with js-th row where |ajsj | ≥ |akj | for k ≥ j. This step is called partial pivoting and

amounts to multiply to the left of A(j) by the permutation matrix Pj given by

Pj =

1. . .

1

1...

. . .

1

.

The j-th step is then complete by multiplying Gj to the resulting matrix PjA(j). Therefore the whole

procedure of Gaussian elimination with partial pivoting corresponds to obtaining the following upper

triangular matrix

Gn−1Pn−1Gn−2Pn−2 · · ·G1P1A = U,

which is stated as in the following algorithm in which the j-th column below the diagonal at j-th step is

filled with the multiplier vectors used for Gk.


Algorithm 2.1. gausselim.f90 (with partial pivoting)

DO j = 1, n-1
   ! find the pivot row: the largest |a(i,j)| for i >= j
   ksave = MAXLOC(ABS(a(j:n,j)))
   k = ksave(1) + j - 1
   ! swap rows j and k of a, and entries j and k of b
   swap a(j,:) and a(k,:)
   swap b(j) and b(k)
   ! eliminate the entries below the pivot
   DO i = j+1, n
      m = a(i,j)/a(j,j)
      a(i,:) = a(i,:) - m*a(j,:)
      b(i) = b(i) - m*b(j)
   ENDDO
ENDDO

Denote

P = P_{n−1} P_{n−2} ··· P_1 \quad \text{and} \quad L = P \left[ G_{n−1} P_{n−1} G_{n−2} P_{n−2} ··· G_1 P_1 \right]^{-1}. (2.11)

Lemma 2.1. The matrix L defined by (2.11) is a unit lower triangular matrix.

Proof. Exercise.

We thus have the LU-decomposition of A in the form

PA = LU, (2.12)

where

where

L =

1 0 0 · · · 0

l21 1 0 · · · 0

l31 l32 1 · · · 0

· · · · · · ·

· · · · · · ·

· · · · · · ·

ln1 ln2 ln3 · · · 1

, U =

u11 u12 u13 · · · u1n

0 u22 u23 · · · u2n

0 0 u33 · · · u3n

· · · · · · ·

· · · · · · ·

· · · · · · ·

0 0 0 · · · unn

, (2.13a)

Indeed, we have the following theorem.

Theorem 2.1. An n × n invertible matrix A has an LU-decomposition PA = LU, where L is a unit lower triangular matrix, U is an upper triangular matrix, and P = P_{n−1} ··· P_1 with P_j the j-th row-exchange permutation. The j-th column of L below the diagonal is a permuted version of the Gauss vector of G_j by P_{j+1}, …, P_{n−1}: if G_j = I − l^{(j)} e_j^T with Gauss vector l^{(j)} = (0, …, 0, l_{j+1}, …, l_n)^t (the first j entries being zero), then L(j+1:n, j) = \tilde l(j+1:n), where \tilde l = P_{n−1} ··· P_{j+1} l^{(j)}.


Proof. Set \tilde G_{n−1} = G_{n−1} and \tilde G_j = P_{n−1} ··· P_{j+1} G_j P_{j+1} ··· P_{n−1} for j = n−2, …, 1. Then

\tilde G_{n−1} ··· \tilde G_1 PA = G_{n−1}(P_{n−1} G_{n−2} P_{n−1}) ··· (P_{n−1} ··· P_2 G_1 P_2 ··· P_{n−1}) P_{n−1} ··· P_1 A
   = (G_{n−1} P_{n−1})(G_{n−2} P_{n−2})(G_{n−3} P_{n−3}) ··· (G_1 P_1) A = U.

Since P_j is a permutation that swaps the j-th row with a row of index at least j, we have P_j(1:j−1, 1:j−1) = I_{j−1}, which implies that e_j^t P_{j+1} ··· P_{n−1} = e_j^t. Thus

\tilde G_j = P_{n−1} ··· P_{j+1}(I − l^{(j)} e_j^t) P_{j+1} ··· P_{n−1} = I − (P_{n−1} ··· P_{j+1} l^{(j)}) e_j^t,

which shows that \tilde G_j is a Gauss transformation with Gauss vector g^{(j)} = P_{n−1} ··· P_{j+1} l^{(j)}. Notice that L = (\tilde G_{n−1} ··· \tilde G_1)^{-1}, a product of inverses of Gauss transformations and hence unit lower triangular. Therefore PA = LU.
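The following Python/NumPy sketch (an added illustration, not the notes' Fortran) implements Gaussian elimination with partial pivoting in the spirit of Algorithm 2.1, accumulating P, L and U, and checks the relation PA = LU of Theorem 2.1. The function name is hypothetical.

    import numpy as np

    def lu_partial_pivoting(A):
        """Return P, L, U with PA = LU; L unit lower triangular, U upper triangular."""
        A = A.astype(float).copy()
        n = A.shape[0]
        piv = np.arange(n)
        for j in range(n - 1):
            k = j + np.argmax(np.abs(A[j:, j]))      # pivot row
            A[[j, k], :] = A[[k, j], :]              # swap rows (stored multipliers move too)
            piv[[j, k]] = piv[[k, j]]
            for i in range(j + 1, n):
                A[i, j] /= A[j, j]                   # multiplier stored in the strict lower part
                A[i, j+1:] -= A[i, j] * A[j, j+1:]
        L = np.tril(A, -1) + np.eye(n)
        U = np.triu(A)
        P = np.eye(n)[piv]
        return P, L, U

    A = np.array([[2., -1., 0.], [-1., 2., -1.], [0., -1., 2.]])
    P, L, U = lu_partial_pivoting(A)
    print(np.allclose(P @ A, L @ U))   # True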

2.2.3 Doolittle’s algorithm for LU-decomposition

A = LU = \begin{pmatrix} 1 & & & 0 \\ l_{21} & 1 & & \\ \vdots & \vdots & \ddots & \\ l_{n1} & l_{n2} & \cdots & 1 \end{pmatrix} \begin{pmatrix} u_{11} & u_{12} & \cdots & u_{1n} \\ & u_{22} & \cdots & u_{2n} \\ & & \ddots & \vdots \\ & & & u_{nn} \end{pmatrix}.

The LU-decomposition corresponds to solving for the n² unknowns l_{jk} and u_{jk} from the n² equations

a_{jk} = \sum_{m=1}^{\min\{j,k\}} l_{jm} u_{mk}, \qquad j, k = 1, …, n.


2.2.4 Direct triangular decompositions

A Gaussian elimination scheme computes a sequence of lower and upper triangular matrices. Instead of computing this sequence, one can compute the entries of the lower and upper triangular matrices directly. We introduce two such algorithms: Doolittle's and Crout's. These algorithms are essentially identical to Gaussian elimination except for the ordering of the computation. Gaussian elimination accumulates the inner products through intermediate results, while the direct triangular decompositions compute them without storing intermediate results, and therefore the latter schemes are slightly more precise in keeping double-precision calculations consistent. In this sense, as long as no pivoting is required, the use of direct triangular decompositions has an advantage, and on actual hand calculators the direct decomposition algorithms have been implemented.

Algorithm 2.2. Doolittle's algorithm for the LU-decomposition PA = LU (assume P = I); L is unit lower triangular, U is upper triangular, and a_{jk} = \sum_{m=1}^{\min\{j,k\}} l_{jm} u_{mk}.

DO j = 1, n
   DO k = j, n
      ! a(j,k) = sum_{m=1}^{j-1} l(j,m)*u(m,k) + u(j,k)   since l(j,j) = 1
      u(j,k) = a(j,k) - dot_product( l(j,1:j-1), u(1:j-1,k) )
   ENDDO
   DO k = j+1, n
      ! a(k,j) = sum_{m=1}^{j-1} l(k,m)*u(m,j) + l(k,j)*u(j,j)
      l(k,j) = ( a(k,j) - dot_product( l(k,1:j-1), u(1:j-1,j) ) ) / u(j,j)
   ENDDO
ENDDO


Algorithm 2.3. Crout's algorithm for the LU-decomposition PA = LU (assume P = I); L is lower triangular, U is unit upper triangular, and a_{jk} = \sum_{m=1}^{\min\{j,k\}} l_{jm} u_{mk}.

DO j = 1, n
   DO k = j, n
      ! a(k,j) = sum_{m=1}^{j-1} l(k,m)*u(m,j) + l(k,j)   since u(j,j) = 1
      l(k,j) = a(k,j) - dot_product( l(k,1:j-1), u(1:j-1,j) )
   ENDDO
   DO k = j+1, n
      ! a(j,k) = sum_{m=1}^{j-1} l(j,m)*u(m,k) + l(j,j)*u(j,k)
      u(j,k) = ( a(j,k) - dot_product( l(j,1:j-1), u(1:j-1,k) ) ) / l(j,j)
   ENDDO
ENDDO
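For concreteness, here is an added Python/NumPy sketch of Doolittle's direct decomposition (Algorithm 2.2, assuming no pivoting is needed); the function name is chosen only for this illustration.

    import numpy as np

    def doolittle(A):
        """A = L U with L unit lower triangular and U upper triangular (no pivoting)."""
        n = A.shape[0]
        L = np.eye(n)
        U = np.zeros((n, n))
        for j in range(n):
            for k in range(j, n):                      # row j of U
                U[j, k] = A[j, k] - L[j, :j] @ U[:j, k]
            for k in range(j + 1, n):                  # column j of L
                L[k, j] = (A[k, j] - L[k, :j] @ U[:j, j]) / U[j, j]
        return L, U

    A = np.array([[2., -1., 0.], [-1., 2., -1.], [0., -1., 2.]])
    L, U = doolittle(A)
    print(np.allclose(L @ U, A))   # True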

Total number of flops:

\sum_{j=1}^{n} \left[ (j-1)(n-j+1) + j(n-j) \right]
= \sum_{j=1}^{n} \left( nj - j^2 + j - n + j - 1 + nj - j^2 \right)
= \sum_{j=1}^{n} \left[ -2j^2 + 2(n+1)j - n - 1 \right]
= -2\,\frac{n(n+1)(2n+1)}{6} + 2(n+1)\,\frac{n(n+1)}{2} - (n+1)n
= n(n+1)\left( -\frac{2n}{3} - \frac13 + n + 1 - 1 \right)
= n(n+1)\left( \frac{n}{3} - \frac13 \right) \approx \frac{n^3}{3}.

2.2.5 Gauss-Jordan Algorithm

Solving Ax^{(j)} = b^{(j)}, j = 1, …, m, costs about \frac{n^3}{3} - \frac{n}{3} + m n^2 flops.

• If m ≫ n, having A^{-1} available would be helpful.

• In data-fitting problems some entry A^{-1}_{jk} gives information on the dependence of the output on the input data. In such cases an actual computation of A^{-1} is desirable.

PA = LU ⇒ A^{-1}: assume that P = I and write I = AA^{-1}. Then the n column vectors x_j of A^{-1} satisfy Ax_j = e_j, j = 1, …, n. This amounts to solving a linear system n times:

\frac{n^3}{3} - \frac{n}{3} + n \cdot n^2 \cong \frac{4}{3} n^3 \text{ flops.}


Regarding x and y as general variables, inverting the mapping x ↦ Ax = y is equivalent to finding the inverse of A. Consider

a_{11}x_1 + a_{12}x_2 + ··· + a_{1n}x_n = y_1,
⋮
a_{n1}x_1 + a_{n2}x_2 + ··· + a_{nn}x_n = y_n.

The first step exchanges the variable x_1 with y_{p_1}, where p_1 is chosen such that |a_{p_1 1}| = max_{k≥1}|a_{k1}|, and swaps the p_1-th and 1st rows. Then solve the resulting first equation for x_1 and substitute this expression for x_1 into all the other equations. Write the resulting equations in the form

a′_{11}y_1 + a′_{12}x_2 + ··· + a′_{1n}x_n = x_1,
a′_{21}y_1 + a′_{22}x_2 + ··· + a′_{2n}x_n = y_2,
⋮
a′_{n1}y_1 + a′_{n2}x_2 + ··· + a′_{nn}x_n = y_n.

Repeat this at the next step, with partial pivoting in the second column: solve the resulting second equation for x_2 and substitute x_2 into all the other equations. Write the resulting equations in the form

a″_{11}y_1 + a″_{12}y_2 + ··· + a″_{1n}x_n = x_1,
a″_{21}y_1 + a″_{22}y_2 + ··· + a″_{2n}x_n = x_2,
⋮
a″_{n1}y_1 + a″_{n2}y_2 + ··· + a″_{nn}x_n = y_n.

Repeat these n steps. We then obtain A^{-1} as the final matrix a^{(n)}_{jk} of the following algorithm, multiplied by P, the matrix of the partial pivoting.

Algorithm 2.4. Gauss-Jordan Algorithm

• do j = 1, n

  – Find p such that |a_{pj}| = max_{k≥j}|a_{kj}|

  – Swap the p-th and j-th rows

  – a′_{jj} = 1/a_{jj}

  – for all i ≠ j and k ≠ j:

    ∗ a′_{jk} = −a_{jk}/a_{jj}

    ∗ a′_{ij} = a_{ij}/a_{jj}

    ∗ a′_{ik} = a_{ik} − a_{ij}a_{jk}/a_{jj}

  – end

• enddo

To find the inverse A^{-1} of A, assume that we have the LU-decomposition PA = LU. Let X = A^{-1}; writing X = [x_1, …, x_n] ∈ R^{n×n},

Ax_j = e_j, \qquad j = 1, …, n, (2.14)
PAx_j = Pe_j, \qquad j = 1, …, n. (2.15)

Setting Ux_j = y_j for j = 1, …, n, solving Ly_j = Pe_j by forward elimination and then Ux_j = y_j by back substitution requires

\frac{n^3}{3} + n\left( \frac{n(n-1)}{2} + \frac{n(n+1)}{2} \right) \simeq \frac{4}{3} n^3 \text{ flops.}
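The same idea in Python/SciPy (an added illustration): factor PA = LU once with scipy.linalg.lu_factor and then solve Ax_j = e_j for each column of A^{-1} with lu_solve, which reuses the single factorization.

    import numpy as np
    from scipy.linalg import lu_factor, lu_solve

    A = np.array([[2., -1., 0.], [-1., 2., -1.], [0., -1., 2.]])
    n = A.shape[0]

    lu, piv = lu_factor(A)                 # one LU factorization (~n^3/3 flops)
    X = np.empty((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = 1.0
        X[:, j] = lu_solve((lu, piv), e)   # forward elimination + back substitution

    print(np.allclose(A @ X, np.eye(n)))   # True: X = A^{-1}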


2.2.6 LDM^T-decomposition

Theorem 2.2 (LDM^T-decomposition). Suppose that all the leading principal submatrices of A are nonsingular. Then there exist unit lower triangular matrices L and M and a diagonal matrix D such that A = LDM^T. (Assume that P = I.)

Proof. A = LU. Let D = diag{u_{11}, …, u_{nn}}. Since A is nonsingular, u_{jj} ≠ 0 for j = 1, …, n; thus D^{-1} = diag{u_{11}^{-1}, …, u_{nn}^{-1}}. Let M^T = D^{-1}U. Then M^T is unit upper triangular, so M is unit lower triangular. Hence A = LU = LDD^{-1}U = LDM^T.


Algorithm 2.5. A = LDM^T, D = diag{d_1, …, d_n}.

DO k = 1, n
   ! a(k,k) = sum_{p=1}^{k-1} l(k,p)*d(p)*m(k,p) + d(k)   since l(k,k) = m(k,k) = 1
   d(k) = a(k,k) - sum_{p=1}^{k-1} l(k,p)*d(p)*m(k,p)
   DO i = k+1, n
      ! a(i,k) = sum_{p=1}^{k-1} l(i,p)*d(p)*m(k,p) + l(i,k)*d(k)
      l(i,k) = ( a(i,k) - sum_{p=1}^{k-1} l(i,p)*d(p)*m(k,p) ) / d(k)
      ! a(k,i) = sum_{p=1}^{k-1} l(k,p)*d(p)*m(i,p) + d(k)*m(i,k)
      m(i,k) = ( a(k,i) - sum_{p=1}^{k-1} l(k,p)*d(p)*m(i,p) ) / d(k)
   ENDDO
ENDDO

The above algorithm requires 2n³/3 flops. Notice that the quantities

r_p = d_p m_{kp}, \qquad w_p = l_{kp} d_p, \qquad p = 1, …, k−1,

are independent of the inner loop index i. Computing them before executing the inner loop yields the following algorithm with n³/3 flops:


Algorithm 2.6. A = LDM^T (reduced version; the factors overwrite A).

DO k = 1, n
   DO p = 1, k-1
      r(p) = d(p)*a(p,k)
      w(p) = a(k,p)*d(p)
   ENDDO
   d(k) = a(k,k) - dot_product( a(k,1:k-1), r(1:k-1) )
   DO i = k+1, n
      a(i,k) = ( a(i,k) - dot_product( a(i,1:k-1), r(1:k-1) ) ) / d(k)
      a(k,i) = ( a(k,i) - dot_product( w(1:k-1), a(1:k-1,i) ) ) / d(k)
   ENDDO
ENDDO


Proposition 2.2 (LDL^T-decomposition of a symmetric matrix). Suppose that all the leading principal submatrices of A are nonsingular. For a nonsingular symmetric matrix A there exist a unit lower triangular matrix L and a diagonal matrix D such that A = LDL^T.

Proof. We start with A = LDM^T (L, M unit lower triangular). Then

M^{-1} A M^{-T} = M^{-1} L D M^T M^{-T} = M^{-1} L D,

and the left-hand side is symmetric while the right-hand side is lower triangular; hence M^{-1}LD, and therefore M^{-1}L, is diagonal. Moreover, M^{-1}L is unit lower triangular, which implies M^{-1}L = I. Hence L = M.

Gaussian elimination of an n × n matrix (the A = LU decomposition) needs n³/3 − n/3 flops.

Definition 2.3. An n × n real matrix A is called positive-definite if x^T A x > 0 for all x ∈ R^n, x ≠ 0. In the complex case, an n × n complex matrix A is positive-definite if x^* A x > 0 for all x ∈ C^n, x ≠ 0 (here x^* = \bar{x}^T).

Proposition 2.3. Let A be a positive-definite matrix. Then all the diagonal entries of A are positive and all the eigenvalues of A are positive.

Theorem 2.3. If A ∈ R^{n×n} is positive-definite, then A = LDM^T with positive diagonal entries of D.

Proof. Since x^T A x > 0 for all x ∈ R^n \ {0}, all leading principal submatrices of A are positive-definite, hence nonsingular. Then A has a decomposition A = LDM^T by the previous theorem. Set S = DM^T L^{-T} = L^{-1} A L^{-T}; S is positive-definite, so its diagonal entries are positive. Since M^T L^{-T} is unit upper triangular, the diagonal entries of S = DM^T L^{-T} are exactly the d_j; hence D has positive diagonal entries.

Theorem 2.4 (Cholesky decomposition). Let A be a symmetric, positive-definite n × n matrix. Then there exists a lower triangular matrix G with positive diagonal entries such that A = GG^T.

Proof. A = LDM^T; since A is symmetric, L = M by Proposition 2.2, and D has positive diagonal entries by Theorem 2.3. Hence A = LDL^T = LD^{1/2} D^{1/2} L^T = (LD^{1/2})(LD^{1/2})^T =: GG^T.

Algorithm 2.7. Cholesky decomposition (G overwrites the lower triangle of A; a_{ik} = \sum_{p=1}^{k} g_{ip} g_{kp} for i ≥ k).

DO k = 1, n
   a(k,k) = sqrt( a(k,k) - dot_product( a(k,1:k-1), a(k,1:k-1) ) )              ! = g(k,k)
   DO i = k+1, n
      a(i,k) = ( a(i,k) - dot_product( a(i,1:k-1), a(k,1:k-1) ) ) / a(k,k)      ! = g(i,k)
   ENDDO
ENDDO
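An added Python/NumPy sketch of Algorithm 2.7 (the function name is illustrative), compared with numpy's built-in Cholesky factor:

    import numpy as np

    def cholesky_lower(A):
        """Return lower triangular G with A = G G^T, for symmetric positive-definite A."""
        n = A.shape[0]
        G = np.zeros((n, n))
        for k in range(n):
            G[k, k] = np.sqrt(A[k, k] - G[k, :k] @ G[k, :k])
            for i in range(k + 1, n):
                G[i, k] = (A[i, k] - G[i, :k] @ G[k, :k]) / G[k, k]
        return G

    A = np.array([[2., -1., 0.], [-1., 2., -1.], [0., -1., 2.]])   # symmetric positive-definite
    G = cholesky_lower(A)
    print(np.allclose(G @ G.T, A))                  # True
    print(np.allclose(G, np.linalg.cholesky(A)))    # same factor (positive diagonal)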


2.2.7 Banded matrices

Definition 2.4. A = (a_{ij}) has upper bandwidth q if a_{ij} = 0 for j > i + q, and lower bandwidth p if a_{ij} = 0 for i > j + p. A has lower and upper bandwidths p and q if a_{ij} = 0 for j > i + q or i > j + p; the possibly nonzero entries of row i are then a_{i,i−p}, …, a_{ii}, …, a_{i,i+q}.

Example 2.3 (Finite difference or finite element method). The boundary value problem

−u″(x) = f(x), \quad x ∈ (a,b), \qquad u(a) = u(b) = 0,

leads to the matrix

\begin{pmatrix} 2 & -1 & & & \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ & & & -1 & 2 \end{pmatrix},

which has lower bandwidth 1 and upper bandwidth 1, and is therefore called a tridiagonal matrix.

Example 2.4 (The finite difference method). The boundary value problem

−∆u(x) = f(x), \quad x ∈ (0,1)^2, \qquad u(x) = g(x), \quad x ∈ ∂(0,1)^2, (2.16)

discretized on an N × N interior grid, leads to the N² × N² block tridiagonal matrix

A = \begin{pmatrix} T & -I_N & & \\ -I_N & T & \ddots & \\ & \ddots & \ddots & -I_N \\ & & -I_N & T \end{pmatrix}, \qquad T = \begin{pmatrix} 4 & -1 & & \\ -1 & 4 & \ddots & \\ & \ddots & \ddots & -1 \\ & & -1 & 4 \end{pmatrix} \in M(N,N).

Its upper bandwidth is N and its lower bandwidth is N.

Theorem 2.5 (LU-decomposition of banded matrices). Let A have the LU-decomposition A = LU and lower and upper bandwidths p and q. Then L has lower bandwidth p and U has upper bandwidth q.

Proof. Write

A = \begin{pmatrix} \alpha & \vec w^t \\ \vec v & B \end{pmatrix}

with α ∈ R, \vec v, \vec w ∈ R^{n−1}, and B an (n−1) × (n−1) matrix.

One elimination step gives

A = \begin{pmatrix} 1 & \vec 0 \\ \vec v/\alpha & I_{n-1} \end{pmatrix} \begin{pmatrix} \alpha & \vec w^t \\ \vec 0 & B - \vec v \vec w^t/\alpha \end{pmatrix}
  = \begin{pmatrix} 1 & \vec 0 \\ \vec v/\alpha & I_{n-1} \end{pmatrix} \begin{pmatrix} 1 & \vec 0 \\ \vec 0 & B - \vec v \vec w^t/\alpha \end{pmatrix} \begin{pmatrix} \alpha & \vec w^t \\ \vec 0 & I_{n-1} \end{pmatrix}.

Note that B − \vec v \vec w^t/\alpha has lower and upper bandwidths p and q, since only the first p entries of \vec v and the first q entries of \vec w are nonzero. By the induction hypothesis we may assume that B − \vec v \vec w^t/\alpha has an LU-decomposition B − \vec v \vec w^t/\alpha = L_1 U_1, where L_1 has lower bandwidth p and U_1 has upper bandwidth q. Then

A = \underbrace{\begin{pmatrix} 1 & \vec 0 \\ \vec v/\alpha & I_{n-1} \end{pmatrix} \begin{pmatrix} 1 & \vec 0 \\ \vec 0 & L_1 \end{pmatrix}}_{\text{lower bandwidth } p} \; \underbrace{\begin{pmatrix} 1 & \vec 0 \\ \vec 0 & U_1 \end{pmatrix} \begin{pmatrix} \alpha & \vec w^t \\ \vec 0 & I_{n-1} \end{pmatrix}}_{\text{upper bandwidth } q}. (2.17)

Algorithm 2.8. LU-decomposition of a banded matrix without pivoting.

DO j = 1, n-1
   ml = min(j+p, n)
   a(j+1:ml, j) = a(j+1:ml, j) / a(j,j)
   DO k = j+1, ml
      mu = min(j+q, n)
      a(k, j+1:mu) = a(k, j+1:mu) - a(k,j)*a(j, j+1:mu)
   ENDDO
ENDDO

Algorithm 2.9. Forward elimination.

DO j = 1, n
   k = max(1, j-p)
   b(j) = b(j) - dot_product( a(j, k:j-1), b(k:j-1) )
ENDDO

Algorithm 2.10. Back substitution.

DO j = n, 1, -1
   k = min(j+q, n)
   b(j) = ( b(j) - dot_product( a(j, j+1:k), b(j+1:k) ) ) / a(j,j)
ENDDO

Check that the LU-decomposition requires

npq - \tfrac12 pq^2 - \tfrac{p^3}{6} + pn \text{ flops if } p \le q, \qquad npq - \tfrac12 p^2q - \tfrac{q^3}{6} + qn \text{ flops if } p > q;

forward elimination: np - \tfrac{p^2}{2} flops; back substitution: n(q+1) - \tfrac{q^2}{2} flops. Thus for an N² × N² matrix with bandwidth N (n = N², p = q = N),

O\left( N^4 - \tfrac12 N^3 - \tfrac16 N^3 + N^3 + N^3 - \tfrac12 N^2 + N^2(N+1) + \tfrac12 N^2 \right) = O(N^4).

Page 24: Lecture Notes on Numerical Linear Algebra - CIMATangeluh/webpage_ANI/Libros/SheenLectures.pdf · 1.1 Difficulties in Computation in Numerical Linear Algebra Prob-lems Numerical methods

20 CHAPTER 2. THE GAUSSIAN ELIMINATION

Algorithm 2.11. Cholesky decomposition of a symmetric, positive-definite matrix of bandwidth p.

DO k = 1, n
   ml = max(k-p, 1)
   a(k,k) = sqrt( a(k,k) - dot_product( a(k, ml:k-1), a(k, ml:k-1) ) )
   DO j = k+1, min(n, k+p)
      a(j,k) = ( a(j,k) - dot_product( a(j, ml:k-1), a(k, ml:k-1) ) ) / a(k,k)
   ENDDO
ENDDO

Here, with m = max(1, k−p), (2.18a)

a_{kk} = \sum_{p=m}^{k} g_{kp} g_{kp}, (2.18b)

a_{jk} = \sum_{p=m}^{\min(j,k)} g_{jp} g_{kp}. (2.18c)

2.2.8 Gaussian elimination of a banded matrix with partial pivoting.

Theorem 2.6. Suppose A is an invertible n × n matrix with lower and upper bandwidths p and q. For j = 1, 2, …, n−1, set G_j = I − g^{(j)} e_j^t, and let P_1, P_2, …, P_{n−1} be permutation matrices such that U = G_{n−1}P_{n−1}G_{n−2}P_{n−2} ··· G_1P_1 A is upper triangular. Then U has upper bandwidth p + q, and g^{(j)}_i = 0 if i ≤ j or i > j + p.

Proof. Let P = P_{n−1}P_{n−2} ··· P_1 and PA = LU. Write P^t = [e_{s_1}, …, e_{s_n}], where {s_1, …, s_n} is a permutation of {1, 2, …, n}. If s_i > i + p, then the leading i × i principal submatrix of PA is singular, since (PA)_{ij} = a_{s_i,j} = 0 for j = 1, …, s_i − p − 1 and s_i − p − 1 ≥ i. This means that U is singular and hence A is singular. From this contradiction, s_i ≤ i + p for i = 1, …, n. Thus PA has upper bandwidth p + q. By Theorem 2.5 (without partial pivoting), L has lower bandwidth p and U has upper bandwidth p + q. The fact that g^{(j)}_i = 0 for i ≤ j or i > j + p follows from the observation that only the elements (j+1, j), …, (j+p, j) of G_{j−1}P_{j−1} ··· G_1P_1 A need to be zeroed by G_j.

Recall that L = P [G_{n−1}P_{n−1} ··· G_1P_1]^{-1} has all its diagonal entries equal to 1.

Remark 2.1. If a nonsingular matrix has an LU-decomposition A = LU (without pivoting), then all of its leading principal submatrices are nonsingular.

Algorithm 2.12.


DO j = 1, n-1
   ! pivot search restricted to the band: rows j, ..., min(j+p, n)
   ksave = maxloc( abs( a(j:min(j+p,n), j) ) )
   k = ksave(1) + j - 1
   swap a(j, j:min(j+p+q,n)) and a(k, j:min(j+p+q,n))
   DO i = j+1, min(j+p, n)
      m = a(i,j)/a(j,j)
      a(i, j+1:min(j+p+q,n)) = a(i, j+1:min(j+p+q,n)) - m*a(j, j+1:min(j+p+q,n))
   ENDDO
ENDDO
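As an added illustration of why band structure pays off, the following Python/SciPy sketch solves the tridiagonal system of Example 2.3 with scipy.linalg.solve_banded, which stores and factors only the p + q + 1 diagonals; the data here are made up for the demonstration.

    import numpy as np
    from scipy.linalg import solve_banded

    n = 1000
    # tridiagonal matrix of Example 2.3 in banded storage: superdiagonal,
    # main diagonal and subdiagonal (p = q = 1)
    ab = np.zeros((3, n))
    ab[0, 1:] = -1.0       # superdiagonal
    ab[1, :]  =  2.0       # main diagonal
    ab[2, :-1] = -1.0      # subdiagonal
    b = np.ones(n)

    x = solve_banded((1, 1), ab, b)

    # check against a dense solve
    A = (np.diag(2.0*np.ones(n)) + np.diag(-1.0*np.ones(n-1), 1)
         + np.diag(-1.0*np.ones(n-1), -1))
    print(np.allclose(A @ x, b))   # True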


Chapter 3

Matrix norms and condition number

3.1 Norm

Definition 3.1. Let F = R or F = C. A norm ‖·‖ on a vector space X over F is a nonnegative real-valued function ‖·‖ : X → R_+ such that

1. ‖x‖ ≥ 0 for all x ∈ X,

2. ‖αx‖ = |α| ‖x‖ for all α ∈ F and all x ∈ X,

3. ‖x + y‖ ≤ ‖x‖ + ‖y‖ for all x, y ∈ X,

4. ‖x‖ = 0 iff x = 0.

If only 1, 2, 3 are satisfied, ‖·‖ is called a seminorm.

We will implicitly assume that all the vector spaces are over the field F = R or F = C.

Example 3.1 (Hölder norm or p-norm).

‖x‖_p := \left( \sum_{j=1}^{N} |x_j|^p \right)^{1/p}, \quad 1 ≤ p < ∞,

‖x‖_∞ := \max_{1≤j≤N} |x_j| \quad (\text{the } ∞\text{-norm, also called the sup-norm}).

Hölder inequality (p ≥ 1, q ≥ 1, 1/p + 1/q = 1; for p = q = 2 it is the Cauchy-Schwarz inequality):

|x^* y| = \left| \sum_{j=1}^{N} \bar{x}_j y_j \right| ≤ ‖x‖_p ‖y‖_q \quad \text{for all } x, y ∈ C^N.


Example 3.2. Unit circles in R² with respect to the p-norm: {x ∈ R²; ‖x‖_p = 1}. (Graphs for p = 1, 2, ∞ are needed here.)

Definition 3.2. A linear transformation Q is called unitary (orthogonal in the real case) if Q^*Q = I = QQ^*.

Remark 3.1. If Q is unitary, then ‖Qx‖_2 = ‖x‖_2 for all x. Indeed,

‖Qx‖_2^2 = (Qx)^*(Qx) = x^*Q^*Qx = x^*x = ‖x‖_2^2.

Definition 3.3. Two norms ‖·‖_α and ‖·‖_β on a vector space X are called equivalent if there exist positive constants c_1 and c_2 such that

c_1 ‖x‖_β ≤ ‖x‖_α ≤ c_2 ‖x‖_β \quad \text{for all } x ∈ X.

Exercise 3.1. For all x ∈ C^N, show that

‖x‖_2 ≤ ‖x‖_1 ≤ \sqrt{N}\, ‖x‖_2,
‖x‖_∞ ≤ ‖x‖_2 ≤ \sqrt{N}\, ‖x‖_∞,
‖x‖_∞ ≤ ‖x‖_1 ≤ N ‖x‖_∞.

Also describe when equality holds.

Definition 3.4. A function f : X → R is called uniformly continuous if for every ε > 0 there exists δ(ε) > 0 such that ‖x − y‖ < δ implies |f(x) − f(y)| < ε.

Example 3.3. f(x) = 1/x on (0,1] is continuous but not uniformly continuous: given ε > 0 and any δ > 0, we can find x, y ∈ (0,1] with |x − y| < δ and |f(x) − f(y)| > ε. Indeed, take y = x + δ/2 ≤ 1; then

\left| \frac1x - \frac1y \right| = \frac{|x - y|}{xy} = \frac{\delta/2}{xy},

which tends to ∞ as x → 0⁺ and hence exceeds ε for x small enough.


Table 3.1: Example 3.3

Function        Domain         Continuous   Uniformly continuous
f(x) = x        x ∈ R          yes          yes
f(x) = 1/x      x ∈ (0, ∞)     yes          no

Remark 3.2. A continuous function on a compact set is uniformly continuous.

Lemma 3.1. A norm ‖·‖ on C^N is uniformly continuous with respect to ‖·‖_∞.

Proof. Let ε > 0 be given. For all x, y ∈ C^N, the triangle inequality gives | ‖x‖ − ‖y‖ | ≤ ‖x − y‖, and

‖x − y‖ = ‖(x_1 − y_1)e_1 + (x_2 − y_2)e_2 + ··· + (x_N − y_N)e_N‖
        ≤ \sum_{j=1}^{N} |x_j − y_j| ‖e_j‖
        ≤ \max_j |x_j − y_j| \sum_{j=1}^{N} ‖e_j‖
        = ‖x − y‖_∞ \sum_{j=1}^{N} ‖e_j‖.

Thus, choosing δ = ε/c with c = \sum_{j=1}^{N} ‖e_j‖, we have for all x, y,

‖x − y‖_∞ < δ \implies | ‖x‖ − ‖y‖ | ≤ ‖x − y‖ < ε.

Theorem 3.1. Any two norms on C^N are equivalent. Indeed, any two norms on a finite-dimensional vector space are equivalent.

Proof. We show that any norm ‖·‖ is equivalent to ‖·‖_∞, that is, there exist c_1 > 0 and c_2 > 0 such that

c_1 ‖x‖_∞ ≤ ‖x‖ ≤ c_2 ‖x‖_∞ \quad \text{for all } x ∈ C^N.

For this, set S = {x ∈ C^N : ‖x‖_∞ = 1}; then S is a compact set. Since ‖·‖ is (uniformly) continuous by the previous lemma, c_1 := \min_{x∈S} ‖x‖ and c_2 := \max_{x∈S} ‖x‖ exist. Since the zero vector is not in S, c_1 > 0. For any y ≠ 0 we have ‖ y/‖y‖_∞ ‖_∞ = 1, so y/‖y‖_∞ ∈ S and therefore

c_1 ≤ \left\| \frac{y}{‖y‖_∞} \right\| ≤ c_2 \quad \text{for all } y ∈ C^N,\ y ≠ 0.

This implies that c_1 ‖y‖_∞ ≤ ‖y‖ ≤ c_2 ‖y‖_∞. Thus ‖·‖ and ‖·‖_∞ are equivalent.

3.2 Matrix norm

Let M(m,n) be the set of all m × n matrices over C (or R).

Definition 3.5. A matrix norm ‖·‖ : M(m,n) → R_+ is a nonnegative real-valued function such that

1. ‖A‖ ≥ 0 for all A ∈ M(m,n),

2. ‖αA‖ = |α| ‖A‖ for all α ∈ C and all A ∈ M(m,n),

3. ‖A + B‖ ≤ ‖A‖ + ‖B‖ for all A, B ∈ M(m,n),

4. ‖A‖ = 0 if and only if A = 0.

Example 3.4. Let ‖A‖_∆ be defined by ‖A‖_∆ = \max_{j,k} |a_{jk}|. Then ‖·‖_∆ defines a norm. But for

A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \qquad A^2 = \begin{pmatrix} 2 & 2 \\ 2 & 2 \end{pmatrix},

we have ‖A²‖_∆ = 2 while ‖A‖_∆ = 1, so ‖A²‖_∆ > ‖A‖_∆ ‖A‖_∆; this norm is not submultiplicative.

Definition 3.6. A matrix norm on M(n,n) is submultiplicative if ‖AB‖ ≤ ‖A‖ ‖B‖ for all A, B ∈ M(n,n).

Definition 3.7. A matrix norm ‖·‖ on M(m,n) is said to be consistent with respect to the vector norms ‖·‖_α on C^n and ‖·‖_β on C^m if ‖Ax‖_β ≤ ‖A‖ ‖x‖_α for all x ∈ C^n and A ∈ M(m,n).

Definition 3.8. The matrix norm ‖·‖ subordinate to the vector norms ‖·‖_α on C^n and ‖·‖_β on C^m is defined by

‖A‖_{α,β} = \sup_{x≠0,\, x∈C^n} \frac{‖Ax‖_β}{‖x‖_α} = \sup_{‖x‖_α=1,\, x∈C^n} ‖Ax‖_β.


Note that a subordinate norm is consistent with respect to the associated norms.

Example 3.5 (Matrix norms).

1. Frobenius norm (F-norm, Schur norm):

‖A‖_F = \left( \sum_{j=1}^{m} \sum_{k=1}^{n} |a_{jk}|^2 \right)^{1/2}.

2. p-norms (p ≥ 1), subordinate to the vector ‖·‖_p norms:

‖A‖_p = \sup_{x≠0} \frac{‖Ax‖_p}{‖x‖_p}.

Exercise 3.2. Let A = (a_{jk}) be an m × n matrix over C.

1. The 1-norm = the column norm: show that ‖A‖_1 = \max_{k=1,…,n} \sum_{j=1}^{m} |a_{jk}|; that is, the 1-norm is the maximum of the 1-norms of the columns of A.

2. The ∞-norm (= the maximum norm = the sup norm) = the row norm: show that ‖A‖_∞ = \max_{j=1,…,m} \sum_{k=1}^{n} |a_{jk}|; that is, the ∞-norm is the maximum of the 1-norms of the rows of A.

3. Show that ‖AB‖_p ≤ ‖A‖_p ‖B‖_p for all 1 ≤ p ≤ ∞ (here B is a matrix with n rows).

4. Suppose A = diag(µ_1, µ_2, …, µ_k), k = min{m,n}. Show that ‖A‖_p = \max_{j=1,…,k} |µ_j|.

5. If u ∈ C^m and v ∈ C^n, show that ‖uv^*‖_2 = ‖u‖_2 ‖v‖_2 and ‖uv^*‖_∞ = ‖u‖_∞ ‖v‖_1. What can be said of ‖uv^*‖_1?

6. Let y ∈ C^m and 0 ≠ x ∈ C^n. Show that X = (y − Ax)x^*/(x^*x) is the solution with the smallest 2-norm satisfying (A + X)x = y.


Proposition 3.1.

1. The subordinate matrix norm is the smallest norm consistent with respect to the vector norms ‖·‖_α and ‖·‖_β.

2. Subordinate matrix norms are submultiplicative.

3. Let the m × m matrix Q and the n × n matrix Z be unitary. Then ‖QAZ‖_2 = ‖A‖_2 and ‖QAZ‖_F = ‖A‖_F.

Proof of the proposition. The first statement was proved in the last class.

Let A and B be m × n and n × p matrices. Then

‖AB‖_{α,β} = \sup_{0≠x∈C^p} \frac{‖ABx‖_β}{‖x‖_α}
           = \sup_{0≠x∈C^p,\, 0≠Bx∈C^n} \frac{‖ABx‖_β}{‖Bx‖_γ} \cdot \frac{‖Bx‖_γ}{‖x‖_α}
           ≤ \sup_{0≠x∈C^p} \frac{‖Bx‖_γ}{‖x‖_α} \cdot \sup_{0≠y∈C^n} \frac{‖Ay‖_β}{‖y‖_γ}
           = ‖B‖_{α,γ} ‖A‖_{γ,β}.

This proves the second statement.

For the third statement,

‖QAZ‖_2 = \sup_{x≠0,\, x∈C^n} \frac{‖QAZx‖_2}{‖x‖_2}
        = \sup_{x≠0} \frac{\{(QAZx)^*(QAZx)\}^{1/2}}{‖x‖_2}
        = \sup_{x≠0} \frac{(x^*Z^*A^*Q^*QAZx)^{1/2}}{‖x‖_2}
        = \sup_{Zx≠0} \frac{‖A(Zx)‖_2}{‖Zx‖_2} = ‖A‖_2.

For the Frobenius norm, suppose first Z = I. Then, denoting by a_k the k-th column of A,

‖QA‖_F^2 = \sum_{j,k} \left| \sum_{l} q_{jl} a_{lk} \right|^2 = \sum_{k=1}^{n} ‖Q a_k‖_2^2 = \sum_{k=1}^{n} ‖a_k‖_2^2 \quad (\text{since } Q \text{ is unitary})
         = ‖A‖_F^2.


Next suppose Q = I. Then from the above it follows that ‖Z^*A^*‖_F = ‖A^*‖_F = ‖A‖_F, i.e. ‖AZ‖_F = ‖A‖_F. Therefore

‖QAZ‖_F^2 = ‖AZ‖_F^2 = ‖A‖_F^2.

From now on, we shall assume that matrix norms are consistent and submultiplicative.

3.3 Condition Number

3.3.1 The concept of a condition number κ(A)

Condition numbers are important in the numerical analysis of a linear system. They provide a measure of the conditioning of the system, that is, of how strongly the error in the solution depends on the relative changes in the data and the matrix.

We want to bound the relative error ‖∆x‖/‖x‖ in terms of ‖∆b‖/‖b‖. First consider Ax = b and A(x + ∆x) = b + ∆b; here b + ∆b is the perturbed data and x + ∆x is its associated solution. We have A∆x = ∆b, so that ∆x = A^{-1}∆b. Since ‖b‖ ≤ ‖A‖ ‖x‖, we obtain

\frac{‖∆x‖}{‖x‖} ≤ \frac{‖A‖ ‖A^{-1}∆b‖}{‖b‖} ≤ ‖A‖ ‖A^{-1}‖ \frac{‖∆b‖}{‖b‖} = κ(A) \frac{‖∆b‖}{‖b‖},

where κ(A) = ‖A‖ ‖A^{-1}‖. Thus the relative change in x is bounded by the relative change in b multiplied by κ(A).

Definition 3.9. For an invertible A ∈ M(n,n), the condition number κ(A) is defined by κ(A) = ‖A‖ ‖A^{-1}‖. Note that the condition number depends on the choice of the norm ‖·‖.

Remark 3.3. κ(A) ≥ 1, since 1 = ‖I‖ = ‖AA^{-1}‖ ≤ ‖A‖ ‖A^{-1}‖ = κ(A).

Summary.

1. κ(A) measures the sensitivity of the relative error in the solution to a change in the right-hand side b: ‖∆x‖/‖x‖ ≤ κ(A) ‖∆b‖/‖b‖.

2. ‖∆x‖ ≤ ‖A^{-1}‖ ‖∆b‖. If \bar x is an approximate solution to Ax = b, \bar x = x + ∆x, then with the residual r(\bar x) := b − A\bar x (note that r(x) = 0) we have ‖∆x‖ ≤ ‖A^{-1}‖ ‖r(\bar x)‖, since r(\bar x) = b − A(x + ∆x) = −A∆x.
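An added Python/NumPy illustration of this summary: for an ill-conditioned matrix (here a Hilbert matrix, chosen only for the demonstration) a tiny relative perturbation of b can produce a relative change in x close to the bound κ(A)·‖∆b‖/‖b‖; in any case the bound always holds.

    import numpy as np
    from scipy.linalg import hilbert

    n = 8
    A = hilbert(n)                      # notoriously ill-conditioned
    x_true = np.ones(n)
    b = A @ x_true

    kappa = np.linalg.cond(A, 2)        # kappa(A) in the 2-norm
    db = 1e-10 * np.random.randn(n)     # small perturbation of the data
    x_pert = np.linalg.solve(A, b + db)

    rel_x = np.linalg.norm(x_pert - x_true) / np.linalg.norm(x_true)
    rel_b = np.linalg.norm(db) / np.linalg.norm(b)
    print(kappa)                        # very large
    print(rel_x, "<=", kappa * rel_b)   # the bound of the summary holds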


Lemma 3.2. If F ∈ M(n,n) with ‖F‖ < 1, then I + F is invertible and ‖(I + F)^{-1}‖ ≤ 1/(1 − ‖F‖).

Proof. For all x ∈ C^n, x ≠ 0,

‖(I + F)x‖ = ‖x + Fx‖ ≥ ‖x‖ − ‖Fx‖ ≥ ‖x‖ − ‖F‖ ‖x‖ = ‖x‖ (1 − ‖F‖) > 0.

Thus I + F is invertible. Moreover,

1 = ‖I‖ = ‖(I + F)^{-1}(I + F)‖ = ‖(I + F)^{-1} + (I + F)^{-1}F‖
  ≥ ‖(I + F)^{-1}‖ − ‖(I + F)^{-1}F‖
  ≥ ‖(I + F)^{-1}‖ − ‖(I + F)^{-1}‖ ‖F‖
  = ‖(I + F)^{-1}‖ (1 − ‖F‖).

Consequently,

‖(I + F)^{-1}‖ ≤ \frac{1}{1 − ‖F‖}.


Theorem 3.2. Let A ∈ M(n,n) be invertible and B = A(I + F) with ‖F‖ < 1. Consider Ax = b and B(x + ∆x) = b. Then

\frac{‖∆x‖}{‖x‖} ≤ \frac{‖F‖}{1 − ‖F‖}.

Moreover,

\frac{‖∆x‖}{‖x‖} ≤ \frac{κ(A)\, ‖B−A‖/‖A‖}{1 − κ(A)\, ‖B−A‖/‖A‖}.

Proof. B^{-1} = (I + F)^{-1}A^{-1} exists by Lemma 3.2. Since x = A^{-1}b,

∆x = B^{-1}b − x = (B^{-1} − A^{-1})b = B^{-1}(A − B)A^{-1}b.

We have

\frac{‖∆x‖}{‖x‖} = \frac{‖B^{-1}(A − B)A^{-1}b‖}{‖A^{-1}b‖} ≤ ‖B^{-1}(A − B)‖
  = ‖(I + F)^{-1}A^{-1}(A − B)‖ = ‖(I + F)^{-1}A^{-1}(−AF)‖
  ≤ ‖(I + F)^{-1}‖ ‖F‖ ≤ \frac{‖F‖}{1 − ‖F‖}.

Since ‖A^{-1}(B − A)‖ = ‖F‖ and t ↦ t/(1 − t) is increasing on [0,1),

\frac{‖∆x‖}{‖x‖} ≤ \frac{‖A^{-1}(B − A)‖}{1 − ‖A^{-1}(B − A)‖} ≤ \frac{‖A^{-1}‖ ‖B − A‖}{1 − ‖A^{-1}‖ ‖B − A‖} = \frac{κ(A)\, ‖B−A‖/‖A‖}{1 − κ(A)\, ‖B−A‖/‖A‖}.

Now, assuming in the previous theorem that B^{-1} exists, set C = (I + F)^{-1} = B^{-1}A, i.e. F = A^{-1}B − I. If ‖F‖ ≪ 1, B^{-1} can be regarded as an approximate inverse of A. By Lemma 3.2, ‖B^{-1}A‖ ≤ 1/(1 − ‖F‖). Interchanging A and B, ‖A^{-1}B‖ ≤ 1/(1 − ‖B^{-1}A − I‖). Since A^{-1} = A^{-1}(BB^{-1}) = (A^{-1}B)B^{-1},

‖A^{-1}‖ ≤ ‖A^{-1}B‖ ‖B^{-1}‖ ≤ \frac{‖B^{-1}‖}{1 − ‖B^{-1}A − I‖}.

For Ax = b, assume that \bar x is an approximation to x and that B^{-1} is available. Writing r(\bar x) := b − A\bar x = Ax − A\bar x = A(x − \bar x),

‖∆x‖ = ‖x − \bar x‖ ≤ ‖A^{-1}‖ ‖r(\bar x)‖ ≤ \frac{‖B^{-1}‖}{1 − ‖B^{-1}A − I‖} ‖r(\bar x)‖.


Theorem 3.3. Suppose that Ax = b and (A + ∆A)(x + ∆x) = b + ∆b. If ‖∆A‖ < 1/‖A^{-1}‖, then

\frac{‖∆x‖}{‖x‖} ≤ \frac{κ(A)}{1 − κ(A)\,‖∆A‖/‖A‖} \left\{ \frac{‖∆A‖}{‖A‖} + \frac{‖∆b‖}{‖b‖} \right\}.

Proof. ‖A^{-1}∆A‖ ≤ ‖A^{-1}‖ ‖∆A‖ < 1. By Lemma 3.2, I + A^{-1}∆A is invertible and

‖(I + A^{-1}∆A)^{-1}‖ ≤ \frac{1}{1 − ‖A^{-1}∆A‖} ≤ \frac{1}{1 − ‖A^{-1}‖ ‖∆A‖}.

Now

A^{-1}(b + ∆b) = A^{-1}(A + ∆A)(x + ∆x) = (I + A^{-1}∆A)(x + ∆x) = (I + A^{-1}∆A)∆x + x + A^{-1}∆A\, x,

so, cancelling x = A^{-1}b on both sides, ∆x = (I + A^{-1}∆A)^{-1} A^{-1}(∆b − ∆A\, x). Hence

\frac{‖∆x‖}{‖x‖} ≤ \frac{‖(I + A^{-1}∆A)^{-1} \{A^{-1}(∆b − ∆A\, x)\}‖}{‖x‖}
  ≤ ‖(I + A^{-1}∆A)^{-1}‖ ‖A^{-1}‖ \frac{‖∆b − ∆A\, x‖}{‖x‖}
  ≤ \frac{‖A^{-1}‖}{1 − ‖A^{-1}‖ ‖∆A‖} \left\{ \frac{‖∆b‖}{‖x‖} + ‖∆A‖ \right\}
  = \frac{κ(A)}{1 − κ(A)\,‖∆A‖/‖A‖} \left\{ \frac{‖∆b‖}{‖A‖ ‖x‖} + \frac{‖∆A‖}{‖A‖} \right\}
  ≤ \frac{κ(A)}{1 − κ(A)\,‖∆A‖/‖A‖} \left\{ \frac{‖∆A‖}{‖A‖} + \frac{‖∆b‖}{‖b‖} \right\},

using ‖b‖ ≤ ‖A‖ ‖x‖ in the last step.


Chapter 4

Householder transformation

4.1 Motivation

Recall that each step in Gaussian elimination multiplies by the matrices G_j and P_j so that the resulting matrix U = G_{n−1}P_{n−1}G_{n−2}P_{n−2}···G_1P_1 A is upper triangular. As before let A^{(0)} = A and A^{(j)} = G_jP_jA^{(j−1)}. If ε^{(j)} is a bound on the round-off error arising from multiplying A^{(j−1)} by G_jP_j, then its effect on the relative error of the final solution x is estimated by κ(A^{(j)})ε^{(j)}. Therefore

\frac{‖∆x‖}{‖x‖} ≤ \sum_{j=0}^{n} κ(A^{(j)})\, ε^{(j)},

where ε^{(0)} denotes the errors in the initial data A and b.

Lemma 4.1. Let ‖·‖ be the 2-norm. Then for any n × n matrix A and any unitary matrix U, κ(UA) = κ(A).

Proof. κ(UA) = ‖UA‖_2 ‖(UA)^{-1}‖_2 = ‖A‖_2 ‖A^{-1}U^*‖_2 = ‖A‖_2 ‖A^{-1}‖_2 = κ(A).

This motivates us to look for an alternative: multiply A^{(j−1)} on the left by a matrix H^{(j)} to obtain A^{(j)}, recursively, so that the resulting matrix

R := A^{(n−1)} = H^{(n−1)}H^{(n−2)} ··· H^{(1)}A^{(0)} (4.1)

is easy to invert. For instance, if A^{(n−1)} is a triangular matrix, it is as easy to invert as in Gaussian elimination. Notice that it follows from (4.1) that

A = \left[ H^{(n−1)}H^{(n−2)} ··· H^{(1)} \right]^{-1} A^{(n−1)} =: H^{-1}R. (4.2)

Having in mind that a unitary matrix is also as easily inverted as a triangular matrix, and that a product of unitary matrices is again unitary, one tries to choose each H^{(j)} to be unitary. But the problem is how to obtain such unitary matrices systematically. The next section presents the original idea of Householder.

4.2 Householder transformation

Let H_v : x ∈ R^n ↦ H_vx ∈ R^n be the reflection of x across the hyperplane orthogonal to the vector v. Then H_vx − x is parallel to v and

x − H_vx = \frac{2\, v \cdot x}{v \cdot v}\, v.

Therefore

H_vx = \left( I − \frac{2vv^T}{v^Tv} \right) x.

This transformation can be extended to vector spaces over the complex field.

Definition 4.1. Given v ∈ C^n, H_v = I − 2vv^*/(v^*v) is called the Householder transformation (or Householder matrix) with respect to v.

Remark 4.1.

• H_v is Hermitian: H_v^* = H_v.

• H_v is involutory: H_v² = I.

• H_v is unitary.

Proof. H_v^*H_v = \left(I − \frac{2vv^*}{v^*v}\right)^*\left(I − \frac{2vv^*}{v^*v}\right) = I − \frac{4vv^*}{v^*v} + \frac{4v(v^*v)v^*}{(v^*v)^2} = I.

Thus H_v : C^n → C^n, x ↦ H_vx = (I − 2vv^*/(v^*v))x.

(1) Given x ≠ 0, x ∈ C^n, construct a v ∈ C^n such that H_vx = αe_1 for some α ∈ C. For this, notice that

H_vx = \left(I − \frac{2vv^*}{v^*v}\right)x = x − 2\frac{v^*x}{v^*v}\, v = αe_1.

This means v ∈ span{x, e_1}, i.e. v = β′x + βe_1 for some β, β′ ∈ C.


Since we are only interested in the direction of v, we may assume β′ = 1. Thus we start with v = x + βe_1. We then have

v^*x = (x^* + \bar β e_1^t)x = ‖x‖_2^2 + \bar β x_1 ∈ C,
v^*v = (x^* + \bar β e_1^t)(x + βe_1) = ‖x‖_2^2 + |β|^2 + β\bar x_1 + \bar β x_1.

Notice that β\bar x_1 + \bar β x_1 = 2\operatorname{Re}(β\bar x_1) = 2\operatorname{Re}(\bar β x_1). Then

H_vx = x − 2\frac{v^*x}{v^*v}(x + βe_1) = \left(1 − 2\frac{v^*x}{v^*v}\right)x − 2β\frac{v^*x}{v^*v}e_1 = αe_1

implies that

1 − 2\frac{v^*x}{v^*v} = 0,

from which

‖x‖_2^2 + |β|^2 + β\bar x_1 + \bar β x_1 − 2(‖x‖_2^2 + \bar β x_1) = 0, \quad \text{i.e.} \quad −‖x‖_2^2 + |β|^2 + β\bar x_1 − \bar β x_1 = 0.

Here −‖x‖_2^2 + |β|^2 is real while β\bar x_1 − \bar β x_1 is purely imaginary, so both must vanish. Hence

β\bar x_1 = \bar β x_1 ∈ R, (4.3)

and |β|^2 = ‖x‖_2^2, i.e. β = e^{iσ}‖x‖_2 for some σ ∈ R. Thus choose v of the form

v = x + e^{iσ}‖x‖_2 e_1.


In particular, we choose σ such that

‖v‖_2 ≥ ‖x‖_2. (4.4)

If x_1 = |x_1|e^{iω}, then (4.3) implies that

β\bar x_1 = e^{i(σ−ω)}|x_1|\,‖x‖_2 \quad \text{is real, hence} \quad = ±|x_1|\,‖x‖_2,

and thus

β = ±\frac{x_1}{|x_1|}‖x‖_2, \qquad x_1 ≠ 0.

The condition (4.4) is then equivalent to

‖v‖_2 = ‖x + βe_1‖_2 = \left\| x ± \frac{x_1}{|x_1|}‖x‖_2 e_1 \right\|_2 ≥ ‖x‖_2.

Thus we finally choose β with the plus sign:

β = \frac{x_1}{|x_1|}‖x‖_2.

With this choice of β,

v^*x = ‖x‖_2^2 + \frac{\bar x_1}{|x_1|}‖x‖_2\, x_1 = ‖x‖_2(‖x‖_2 + |x_1|),
v^*v = ‖x‖_2^2 + ‖x‖_2^2 + 2‖x‖_2|x_1| = 2‖x‖_2(‖x‖_2 + |x_1|),

so that 2v^*x/(v^*v) = 1 and

H_vx = −2β\frac{v^*x}{v^*v}e_1 = −2β \cdot \frac12 \cdot e_1 = −\frac{x_1}{|x_1|}‖x‖_2 e_1.

(2) Householder matrices can be used to make any contiguous block of vector components zero. Given x = (x_1, …, x_n)^t, let

v = \left( 0, …, 0,\; x_k + \frac{x_k}{|x_k|}\sqrt{|x_k|^2 + ··· + |x_j|^2},\; x_{k+1}, …, x_j,\; 0, …, 0 \right)^t.

Then, with α = \sqrt{|x_k|^2 + ··· + |x_j|^2},

H_vx = \left(I − \frac{2vv^*}{v^*v}\right)x = \left( x_1, …, x_{k−1},\; −\frac{x_k}{|x_k|}α,\; 0, …, 0,\; x_{j+1}, …, x_n \right)^t.
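A small added Python/NumPy sketch (real case only) of the construction above: given x, form v = x + sign(x_1)‖x‖_2 e_1 and check that H_vx has all components after the first equal to zero. The function names are illustrative.

    import numpy as np

    def householder_vector(x):
        """Return v so that H_v x = -(sign(x1) ||x||_2) e_1 (real case)."""
        v = x.astype(float).copy()
        sign = 1.0 if x[0] >= 0 else -1.0        # convention x_1/|x_1|, taken as 1 if x_1 = 0
        v[0] += sign * np.linalg.norm(x)
        return v

    def apply_householder(v, x):
        return x - 2.0 * (v @ x) / (v @ v) * v   # H_v x = (I - 2 v v^T / v^T v) x

    x = np.array([3., 1., 4., 1., 5.])
    v = householder_vector(x)
    y = apply_householder(v, x)
    print(y)                                      # approximately (-||x||_2, 0, 0, 0, 0)
    print(np.allclose(abs(y[0]), np.linalg.norm(x)), np.allclose(y[1:], 0.0))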


A matrix A ∈ M(n,n) can be reduced step by step using unitary Householder matrices H_j defined as follows:

A^{(0)} := A
do j = 1, n−1
   A^{(j)} = H_j A^{(j−1)}
enddo
⟹ A^{(n−1)} = H_{n−1}H_{n−2} ··· H_1 A^{(0)}.

First, let H_1 = H_{v^{(1)}} be determined by H_1 a^{(0)}_1 = α_1 e_1, where a^{(j)}_k denotes the k-th column of A^{(j)}. Here

A^{(0)} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix},
\qquad
A^{(1)} = H_1 A^{(0)} = \begin{pmatrix} a^{(1)}_{11} & a^{(1)}_{12} & \cdots & a^{(1)}_{1n} \\ 0 & a^{(1)}_{22} & \cdots & a^{(1)}_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & a^{(1)}_{n2} & \cdots & a^{(1)}_{nn} \end{pmatrix}.

Next find v^{(2)} such that

H_{v^{(2)}} \begin{pmatrix} a^{(1)}_{22} \\ \vdots \\ a^{(1)}_{n2} \end{pmatrix} = \begin{pmatrix} \alpha_2 \\ 0 \\ \vdots \\ 0 \end{pmatrix},
\qquad
H_2 := \begin{pmatrix} 1 & 0 \\ 0 & H_{v^{(2)}} \end{pmatrix},

so that

A^{(2)} = H_2 A^{(1)} = \begin{pmatrix} a^{(1)}_{11} & a^{(1)}_{12} & \cdots & a^{(1)}_{1n} \\ 0 & a^{(2)}_{22} & \cdots & a^{(2)}_{2n} \\ \vdots & 0 & \ddots & \vdots \\ 0 & 0 & \cdots & a^{(2)}_{nn} \end{pmatrix}.


Therefore A^{(n−1)} = H_{n−1}A^{(n−2)} = H_{n−1}H_{n−2} ··· H_1 A^{(0)} = R. Notice that R is an upper triangular matrix. In the real case, with v = x + ‖x‖_2 e_1 (using the convention x_1/|x_1| = 1 if x_1 = 0),

H_vx = \left(I − \frac{2vv^*}{v^*v}\right)x = x − 2\frac{(x^*x + ‖x‖_2 x_1)}{(x^* + ‖x‖_2 e_1^t)(x + ‖x‖_2 e_1)}(x + ‖x‖_2 e_1)
     = x − 2\frac{‖x‖_2^2 + ‖x‖_2 x_1}{2(‖x‖_2^2 + ‖x‖_2 x_1)}(x + ‖x‖_2 e_1) = −‖x‖_2 e_1.

Then (H_{n−1}H_{n−2} ··· H_1)^{-1}R = A^{(0)} = A, i.e.

QR = A \quad (\text{QR-decomposition}).

Suppose that after j−1 steps A^{(j−1)} has the form

A^{(j−1)} = \begin{pmatrix} a^{(j−1)}_{11} & \cdots & a^{(j−1)}_{1,j−1} & a^{(j−1)}_{1j} & \cdots & a^{(j−1)}_{1n} \\ & \ddots & \vdots & \vdots & & \vdots \\ 0 & & a^{(j−1)}_{j−1,j−1} & a^{(j−1)}_{j−1,j} & \cdots & a^{(j−1)}_{j−1,n} \\ & & & a^{(j−1)}_{jj} & \cdots & a^{(j−1)}_{jn} \\ & & & \vdots & & \vdots \\ & & & a^{(j−1)}_{nj} & \cdots & a^{(j−1)}_{nn} \end{pmatrix}.

Determine a Householder matrix \bar H_j such that

\bar H_j \begin{pmatrix} a^{(j−1)}_{jj} \\ \vdots \\ a^{(j−1)}_{nj} \end{pmatrix} = \alpha_j \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \quad \text{for some } \alpha_j.

Then set

H_j = \begin{pmatrix} I_{j−1} & 0 \\ 0 & \bar H_j \end{pmatrix},

so that A^{(j)} = H_j A^{(j−1)}. Hence

A^{(n−1)} = H_{n−1}A^{(n−2)} = ··· = \text{upper triangular matrix} = U = R.

Summarizing, H_{n−1}H_{n−2} ··· H_1 A = R and A = H^*R = QR.
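The following Python/NumPy sketch (an added illustration, real case) assembles the whole procedure: at step j a Householder matrix built from the trailing part of column j zeroes the entries below the diagonal, and the product of the H_j yields A = QR. It is checked only in the sense that Q is orthogonal, R is upper triangular, and QR reproduces A (the individual factors are unique only up to signs).

    import numpy as np

    def householder_qr(A):
        """A = Q R with Q orthogonal and R upper triangular (real case)."""
        A = A.astype(float)
        m, n = A.shape
        R = A.copy()
        Q = np.eye(m)
        for j in range(min(m - 1, n)):
            x = R[j:, j]
            v = x.copy()
            v[0] += (1.0 if x[0] >= 0 else -1.0) * np.linalg.norm(x)
            if v @ v == 0.0:                      # column already zero below the diagonal
                continue
            Hj = np.eye(m - j) - 2.0 * np.outer(v, v) / (v @ v)
            H = np.eye(m)
            H[j:, j:] = Hj                        # H_j = diag(I_{j-1}, \bar H_j)
            R = H @ R
            Q = Q @ H                             # Q = H_1 H_2 ... (each H_j is symmetric)
        return Q, R

    A = np.array([[2., -1., 0.], [-1., 2., -1.], [0., -1., 2.]])
    Q, R = householder_qr(A)
    print(np.allclose(Q @ R, A), np.allclose(Q.T @ Q, np.eye(3)),
          np.allclose(np.tril(R, -1), 0.0))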

Givens rotation. If A ∈ M(2,2) is orthogonal, i.e. A^*A = I, then A is either a Givens rotation or a Householder matrix.


1) If ad − bc = 1, then

\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} \cos θ & -\sin θ \\ \sin θ & \cos θ \end{pmatrix},

which is a Givens rotation.

2) If ad − bc = −1, then

\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} \cos θ & \sin θ \\ \sin θ & -\cos θ \end{pmatrix},

which is a Householder transformation: with v = (\sin\tfrac{θ}{2}, \cos\tfrac{θ}{2})^t,

H_v = I − \frac{2vv^*}{v^*v} = \begin{pmatrix} 1 − 2\sin^2\tfrac{θ}{2} & -2\sin\tfrac{θ}{2}\cos\tfrac{θ}{2} \\ -2\sin\tfrac{θ}{2}\cos\tfrac{θ}{2} & 1 − 2\cos^2\tfrac{θ}{2} \end{pmatrix} = \begin{pmatrix} \cos(−θ) & \sin(−θ) \\ \sin(−θ) & -\cos(−θ) \end{pmatrix}.

For n × n matrices, with c = \cos θ and s = \sin θ, the Givens rotation G(j,k,θ) is the identity matrix except for the entries

G(j,k,θ)_{jj} = c, \quad G(j,k,θ)_{jk} = s, \quad G(j,k,θ)_{kj} = −s, \quad G(j,k,θ)_{kk} = c,

while the reflection H(j,k,θ) is the identity except for

H(j,k,θ)_{jj} = c, \quad H(j,k,θ)_{jk} = s, \quad H(j,k,θ)_{kj} = s, \quad H(j,k,θ)_{kk} = −c.

The Givens rotation G(j,k,θ) : x ↦ y = G(j,k,θ)x is useful to annihilate a specific component. Since

y_j = c x_j + s x_k, \quad y_k = −s x_j + c x_k, \quad y_l = x_l \ (l ≠ j,k),

in order to make y_k = 0,

c = x_j/\sqrt{x_j^2 + x_k^2}, \qquad s = x_k/\sqrt{x_j^2 + x_k^2}

will do.

Givens rotations are used to zero a specific entry.

• Algorithm: Givens(j, k, θ, A, m, n), A ∈ M(m,n)

    v(:) = a(j,:)
    w(:) = a(k,:)
    a(j,:) =  cos θ * v(:) + sin θ * w(:)
    a(k,:) = -sin θ * v(:) + cos θ * w(:)

Properties of Givens rotations

1. A Givens rotation is a rank-two perturbation of the identity.

2. Since G(j,k,θ)^{-1} = G(j,k,θ)^t = G(j,k,−θ), Givens rotations are orthogonal:

G(j,k,θ)^t G(j,k,θ) = G(j,k,θ) G(j,k,θ)^t = I.

Let Q = G_1 ··· G_N, where each G_j is a Givens rotation; then Q is orthogonal.


Chapter 5

Singular value decomposition (SVD) and Least Squares Problems

5.1 Singular Value Decomposition (SVD)

Theorem 5.1. For A ∈ M(m,n), there exist unitary matrices U = [u_1, …, u_m] ∈ M(m,m) and V = [v_1, …, v_n] ∈ M(n,n) such that U^*AV = diag(σ_1, …, σ_p) =: Σ, p = min(m,n), where σ_1 ≥ σ_2 ≥ ··· ≥ σ_p.

Proposition 5.1. A = UU^*AVV^* = UΣV^*.

1. From AV = UΣ one has Av_j = σ_j u_j, j = 1, …, p. If U = V, this equation is an eigenvalue problem; hence the σ_j are also called generalized eigenvalues.

2. From U^*A = ΣV^* one has A^*u_j = σ_j v_j, j = 1, …, p.

σ_j is the j-th singular value of A, and u_j, v_j are the j-th left and right singular vectors of A. In particular,

σ_1 = ‖A‖_2 = \sup_{x≠0} \frac{‖Ax‖_2}{‖x‖_2} = \sup_{‖x‖_2=1} ‖Ax‖_2. (5.1)

If we set E = {y ∈ R^m : y = Ax, ‖x‖_2 = 1}, the unit sphere is mapped onto an ellipsoid E whose semi-axes have lengths equal to the singular values. (Figure needed.)


Example 5.1. A = \begin{pmatrix} 0.96 & 1.72 \\ 2.28 & 0.96 \end{pmatrix} has the SVD

A = UΣV^* = \begin{pmatrix} 0.6 & -0.8 \\ 0.8 & 0.6 \end{pmatrix} \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0.8 & 0.6 \\ 0.6 & -0.8 \end{pmatrix}^*. (5.2)

u_1 = (0.6, 0.8)^t, u_2 = (−0.8, 0.6)^t.

Proof of SVD. Let σ_1 = ‖A‖_2 = \max_{‖v‖_2=1} ‖Av‖_2. Then there exists v_1 ∈ C^n with ‖v_1‖_2 = 1 and ‖Av_1‖_2 = σ_1. Put y = Av_1 and u_1 = y/‖y‖_2 = Av_1/σ_1, so that Av_1 = σ_1u_1.

Next choose U_1 ∈ M(m, m−1) and V_1 ∈ M(n, n−1) such that U = [u_1, U_1] ∈ M(m,m) and V = [v_1, V_1] ∈ M(n,n) are unitary. Then

A_1 := U^*AV = \begin{pmatrix} u_1^* \\ U_1^* \end{pmatrix} \begin{pmatrix} Av_1 & AV_1 \end{pmatrix} = \begin{pmatrix} u_1^*Av_1 & u_1^*AV_1 \\ U_1^*Av_1 & U_1^*AV_1 \end{pmatrix} = \begin{pmatrix} σ_1 & u_1^*AV_1 \\ 0 & U_1^*AV_1 \end{pmatrix}.

Claim: u_1^*AV_1 = 0. Indeed, observe that

\left\| A_1 \begin{pmatrix} σ_1 \\ (u_1^*AV_1)^* \end{pmatrix} \right\|_2 = \left\| \begin{pmatrix} σ_1^2 + (u_1^*AV_1)(u_1^*AV_1)^* \\ (U_1^*AV_1)(u_1^*AV_1)^* \end{pmatrix} \right\|_2 ≥ σ_1^2 + ‖u_1^*AV_1‖_2^2,

which implies that

‖A_1‖_2 = \sup_{x≠0} \frac{‖A_1x‖_2}{‖x‖_2} ≥ \frac{σ_1^2 + ‖u_1^*AV_1‖_2^2}{‖(σ_1, u_1^*AV_1)‖_2} = \left( σ_1^2 + ‖u_1^*AV_1‖_2^2 \right)^{1/2}.

Since ‖A_1‖_2 = ‖U^*AV‖_2 = ‖A‖_2 = σ_1, we conclude that u_1^*AV_1 = 0. Therefore

A_1 = U^*AV = \begin{pmatrix} σ_1 & 0 \\ 0 & U_1^*AV_1 \end{pmatrix}.

Observe that ‖U_1^*AV_1‖_2 ≤ ‖A‖_2 = σ_1. By an induction argument,

U^*AV = \operatorname{diag}(σ_1, σ_2, …, σ_p) \quad (\text{bordered by zero rows or columns when } m ≠ n). (5.3)


Corollary 5.1. If A = UΣV^* is an SVD of A with σ_1 ≥ σ_2 ≥ ··· ≥ σ_r > σ_{r+1} = ··· = σ_p = 0, the following properties hold:

1. A = \sum_{j=1}^{r} σ_j u_j v_j^* = U_rΣ_rV_r^*, where U_r = [u_1, …, u_r], V_r = [v_1, …, v_r], Σ_r = diag{σ_1, …, σ_r}.

2. rank(A) = r.

3. N(A) = span{v_{r+1}, …, v_n}.

4. R(A) = span{u_1, …, u_r}.

5. ‖A‖_F = \sqrt{σ_1^2 + ··· + σ_r^2}.

6. ‖A‖_2 = σ_1.

7. σ_j = \sqrt{λ_j(A^*A)}, j = 1, …, p, where λ_j(A^*A) is the j-th largest eigenvalue of A^*A.

Proof.

1. Writing x = \sum_k α_k v_k (V = [v_1, …, v_n] is unitary and span{v_1, …, v_n} = C^n),

Range(A) = {Ax | x ∈ C^n} = \left\{ \sum_{j=1}^{r} σ_j u_j v_j^* x \,\middle|\, x ∈ C^n \right\} = \left\{ \sum_{j=1}^{r} σ_j α_j u_j \,\middle|\, α_j ∈ C \right\} = span{u_1, …, u_r}.

2. Null(A) = {x ∈ C^n | Ax = 0} = \left\{ x \,\middle|\, \sum_{j=1}^{r} σ_j (v_j^*x) u_j = 0 \right\} = \left\{ x = \sum_{k=r+1}^{n} α_k v_k \right\} = span{v_{r+1}, …, v_n}.

5. ‖A‖_F = ‖UΣV^*‖_F = ‖Σ‖_F = \sqrt{σ_1^2 + ··· + σ_r^2}, since the unitary factors U and V^* preserve the F-norm.
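An added Python/NumPy check of Theorem 5.1 and Corollary 5.1 on the matrix of Example 5.1: numpy.linalg.svd returns U, the singular values and V^*, and the claimed identities for the 2-norm, the Frobenius norm and the rank can be verified directly.

    import numpy as np

    A = np.array([[0.96, 1.72], [2.28, 0.96]])
    U, s, Vh = np.linalg.svd(A)              # A = U @ diag(s) @ Vh

    print(s)                                  # [3., 1.], as in Example 5.1
    print(np.allclose(U @ np.diag(s) @ Vh, A))
    print(np.isclose(np.linalg.norm(A, 2), s[0]))                       # ||A||_2 = sigma_1
    print(np.isclose(np.linalg.norm(A, 'fro'), np.sqrt(np.sum(s**2))))  # ||A||_F
    print(np.linalg.matrix_rank(A), np.sum(s > 1e-12))                  # rank = # nonzero sigma_j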

5.2 Schur Decomposition

For A ∈ M(n,n) there exists a unitary matrix U such that

U^{-1}AU = U^*AU = \begin{pmatrix} λ_1 & & * \\ & \ddots & \\ 0 & & λ_n \end{pmatrix} = T, \quad \text{an upper triangular matrix.}

In other words, A and T are unitarily equivalent.

Proposition 5.2. Every Hermitian matrix is unitarily similar to a diagonal matrix with real entries.

Proof. Let U^*AU = T be a Schur decomposition of A. Then

T = U^*AU = U^*A^*U = (U^*AU)^* = T^*. (5.4)

Since T is upper triangular and T = T^*, T is diagonal with real entries.

Thus, if U^{-1}AU = Λ = diag{λ_1, …, λ_n}, then AU = UΛ, i.e. Au_j = λ_j u_j, j = 1, …, n.

For a general m × n matrix A, the singular value decomposition A = UΣV^* implies that

A^*A = (UΣV^*)^*(UΣV^*) = VΣ^*U^*UΣV^* = V(Σ^*Σ)V^*,

and therefore A^*AV = V(Σ^*Σ), i.e. A^*Av_j = σ_j^2 v_j.

5.3 Least squares problems

    Ax = b,  A ∈ C^{m×n},  b ∈ C^m.    (5.5)

If m > n (m equations in n unknowns), usually there does not exist an exact solution; thus we seek x_opt such that φ(x_opt) = min_{x∈Cⁿ} φ(x), where φ(x) = ‖Ax − b‖.


Example 5.2. Let

    A = (1, 1, 1)ᵗ,  b = (b1, b2, b3)ᵗ,  b1 ≤ b2 ≤ b3.

1. In the ‖·‖1-norm,

    ‖Ax − b‖1 = |x − b1| + |x − b2| + |x − b3|,

which is minimized at x = b2 (the median).

2. In the ‖·‖∞-norm (mini-max problem),

    min_x ‖Ax − b‖∞ = min_x [ max_{j=1,2,3} |x − bj| ],

which is minimized at x = (b1 + b3)/2; only the end points b1 and b3 matter.

3. In the ‖·‖2-norm (least squares problem),

    min_x ‖Ax − b‖2² = min_x Σ_{j=1}^3 (x − bj)²;

differentiating in x gives x = (b1 + b2 + b3)/3 (the mean).

From now on, we consider only the least squares problem. Then

    φ(x)² = ‖Ax − b‖2² = (Ax − b)*(Ax − b) = x*A*Ax − x*A*b − b*Ax + ‖b‖2²,

and setting ∇(φ²)(x) = 2[A*Ax − A*b] = 0 we obtain the normal equation

    A*(Ax − b) = 0,    (5.6)

which has a unique solution if and only if rank(A) = n.¹

Set χ = {x ∈ Cⁿ : ‖Ax − b‖2 = min_{y∈Cⁿ} ‖Ay − b‖2}. Then the following hold:

¹rank(BA) ≤ rank(A).


1. x ∈ χ ⟺ A*(Ax − b) = 0.

2. χ is a convex set: if x, y ∈ χ and α ∈ [0, 1], then (1 − α)x + αy ∈ χ.

3. χ has a unique element x_LS with minimal 2-norm.

4. χ = {x_LS} ⟺ rank(A) = n. Denote ρ_LS = ‖Ax_LS − b‖2.

Remark:

◦ Optimization problems: [linear, nonlinear], [unconstrained, constrained].

◦ Linear programming = constrained linear optimization.

Algorithm 5.1. (Find x_LS and ρ_LS) Given A ∈ M(m, n) with rank(A) = n and b ∈ C^m:

1   C = A*A                              (C ∈ M(n, n))
2   d = A*b
    Compute the Cholesky decomposition
3   C = GG*                              (G ∈ M(n, n) lower triangular)
4   Solve for y :  Gy = d
5   Solve for x :  G*x = y

(The number of flops) ≈ (n²/2)(m + n/3).
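A direct transcription of Algorithm 5.1 (a sketch using NumPy/SciPy; the test matrix and right-hand side are made up for illustration):

```python
import numpy as np
from scipy.linalg import solve_triangular

def least_squares_normal_equations(A, b):
    """Solve min ||Ax - b||_2 via the normal equations and Cholesky (Algorithm 5.1)."""
    C = A.conj().T @ A            # C = A*A
    d = A.conj().T @ b            # d = A*b
    G = np.linalg.cholesky(C)     # C = G G*, G lower triangular
    y = solve_triangular(G, d, lower=True)              # solve G y = d
    x = solve_triangular(G.conj().T, y, lower=False)     # solve G* x = y
    rho = np.linalg.norm(A @ x - b)                      # residual norm rho_LS
    return x, rho

# made-up data
A = np.array([[1., 0.], [1., 1.], [1., 2.]])
b = np.array([1., 2., 2.])
print(least_squares_normal_equations(A, b))
```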

Theorem 5.2. Let A = UΣV* be an SVD of A with

    rank(A) = r,    (5.7)
    A = Σ_{j=1}^r σj uj vj*.    (5.8)

Then

    x_LS = Σ_{j=1}^r (uj*b/σj) vj,    (5.9)

    ρ_LS = √( Σ_{j=r+1}^m |uj*b|² ).    (5.10)

Proof. Since U*AV = Σ is diagonal, writing α = V*x,

    ‖Ax − b‖2² = ‖U*(Ax − b)‖2²
               = ‖U*AV α − U*b‖2²
               = ‖Σα − U*b‖2²
               = Σ_{j=1}^r |σj αj − uj*b|² + Σ_{j=r+1}^m |uj*b|²,

which attains its minimum when αj = uj*b/σj, j = 1, · · · , r; this gives equations (5.9) and (5.10).
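Formulas (5.9)-(5.10) translate directly into code (a sketch; the rank r is detected with a simple tolerance on the singular values, and the data is made up):

```python
import numpy as np

def least_squares_svd(A, b, tol=1e-12):
    """Solve min ||Ax - b||_2 via the SVD (Theorem 5.2)."""
    U, s, Vh = np.linalg.svd(A, full_matrices=True)
    r = int(np.sum(s > tol * s[0]))           # numerical rank
    c = U.conj().T @ b                        # c_j = u_j^* b
    x = Vh[:r].conj().T @ (c[:r] / s[:r])     # x_LS = sum_{j<=r} (u_j^* b / sigma_j) v_j
    rho = np.linalg.norm(c[r:])               # rho_LS = sqrt(sum_{j>r} |u_j^* b|^2)
    return x, rho

A = np.array([[1., 0.], [1., 1.], [1., 2.]])
b = np.array([1., 2., 2.])
print(least_squares_svd(A, b))
```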


5.4 Moore-Penrose pseudo inverse A† of A ∈ M(m, n)

The Moore-Penrose pseudo inverse A† ∈ M(n, m) of A ∈ M(m, n) is the unique matrix χ ∈ M(n, m) such that

1. AχA = A,

2. χAχ = χ,

3. (Aχ)* = Aχ ∈ M(m, m),

4. (χA)* = χA ∈ M(n, n).

If A = UΣV* is an SVD of A, then A† = V Σ†U*, where

    Σ† = diag{1/σ1, 1/σ2, . . . , 1/σr, 0, . . . , 0} ∈ M(n, m).    (5.11)
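This construction is essentially what numpy.linalg.pinv computes; a minimal hand-rolled version (a sketch, assuming a tolerance-based cut-off for the nonzero singular values) is:

```python
import numpy as np

def pinv_via_svd(A, tol=1e-12):
    """Moore-Penrose pseudoinverse A† = V Σ† U* built from the SVD."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    s_inv = np.array([1.0 / x if x > tol * s[0] else 0.0 for x in s])  # invert only the nonzero σ_j
    return Vh.conj().T @ np.diag(s_inv) @ U.conj().T

A = np.array([[1., 0.], [1., 1.], [1., 2.]])
print(np.allclose(pinv_via_svd(A), np.linalg.pinv(A)))   # True
```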

Properties 5.1.

1. (A†)† = A.

2. (A†)* = (A*)†.

3. ‖Ax − y‖2 ≥ ‖AA†y − y‖2 for all x ∈ Cⁿ.

4. If ‖Ax − y‖2 = ‖AA†y − y‖2 and x ≠ A†y, then ‖x‖2 > ‖A†y‖2; that is, A†y = x_LS, the minimum-norm least squares solution.

5. Let P : Cⁿ → N(A)⊥ and P̄ : C^m → R(A) be the orthogonal projections onto N(A)⊥ and R(A), respectively. Then A†A = P and AA† = P̄.

Proof. We have

    P = P* = P²,  Px = 0 if and only if x ∈ N(A),
    P̄ = P̄* = P̄²,  P̄x = 0 if and only if x ∈ R(A)⊥.

For each y ∈ R(A) there exists a unique x_y ∈ N(A)⊥ such that Ax_y = y. This defines a well-defined mapping f : R(A) → Cⁿ such that

    Af(y) = y,  f(y) ∈ N(A)⊥,  ∀y ∈ R(A).

Indeed, if y ∈ R(A), then there is an x such that y = Ax; thus

    y = A(Px + (I − P)x) = APx + A(I − P)x = APx,  since (I − P)x ∈ N(A),

with Px ∈ N(A)⊥. Hence if x1, x2 ∈ N(A)⊥ with Ax1 = Ax2 = y, we have

    x1 − x2 ∈ N(A) ∩ N(A)⊥ = {0},

and thus x1 = x2.

Theorem 5.3. Let A = UΣV* be an SVD of A ∈ M(m, n) with r = rank(A). Then, for k < r,

    min_{rank(B)=k, B∈M(m,n)} ‖A − B‖2 = ‖A − Σ_{j=1}^k σj uj vj*‖2 = σ_{k+1}.    (5.12)

The proof uses the following theorem.

Theorem 5.4 (Dimension theorem). For B : Cⁿ → C^m,

    n = rank(B) + nullity(B) = dim(R(B)) + dim(N(B)).    (5.13)

Proof of Theorem 5.3. Let B ∈ M(m, n) with rank(B) = k, and let x1, . . . , x_{n−k} be an orthonormal basis of N(B). Counting dimensions, span{x1, . . . , x_{n−k}} ∩ span{v1, . . . , v_{k+1}} ≠ {0}; thus there exists z ∈ Cⁿ with ‖z‖2 = 1, Bz = 0 and z = Σ_{j=1}^{k+1} βj vj. Hence Az = Σ_{j=1}^{k+1} βj Avj = Σ_{j=1}^{k+1} βj (Σ_{l=1}^r σl ul vl*) vj = Σ_{j=1}^{k+1} βj σj uj. Then

    ‖A − B‖2² ≥ ‖(A − B)z‖2² = ‖Az‖2²
             = ‖Σ_{j=1}^{k+1} βj σj uj‖2²
             = Σ_{j=1}^{k+1} σj² |βj|²            (the uj's are orthonormal)
             ≥ σ_{k+1}² Σ_{j=1}^{k+1} |βj|²
             = σ_{k+1}²,

since Σ_{j=1}^{k+1} |βj|² = ‖z‖2² = 1. Therefore

    ‖A − B‖2 ≥ σ_{k+1}.    (5.14)

Conversely, B = Σ_{j=1}^k σj uj vj* has rank k and ‖A − B‖2 = ‖Σ_{j=k+1}^r σj uj vj*‖2 = σ_{k+1}, so the minimum is attained. This proves the theorem.

Consider

    min_{x∈Rⁿ} ‖Ax − b‖2,  A ∈ M(m, n),  b ∈ R^m,    (5.15)
    rank(A) = n,  m ≥ n.    (5.16)

Recall the Householder matrices (equation 4.3),

    Hj = I − 2 vj vjᵗ / (vjᵗ vj),

chosen so that

    Hn Hn−1 · · · H1 A = QᵀA = [ R ; 0 ],  R ∈ M(n, n) upper triangular.

Set

    A(0) = A,    (5.17a)
    b(0) = b.    (5.17b)

Algorithm 5.2.

1   DO j=1,n
2     A(j) = Hj A(j−1)
3     b(j) = Hj b(j−1)
4   ENDDO

    A(n) = [ R ; 0 ],  R ∈ M(n, n),  0 ∈ M(m − n, n),    (5.18a)
    b(n) = [ b1 ; b2 ],  b1 ∈ Rⁿ,  b2 ∈ R^{m−n}.    (5.18b)

Since Qᵀ = Hn · · · H1 is orthogonal,

    ‖Ax − b‖2 = ‖Qᵀ(Ax − b)‖2
              = ‖ [R ; 0] x − [b1 ; b2] ‖2
              = ‖ [Rx − b1 ; −b2] ‖2
              = √( ‖Rx − b1‖2² + ‖b2‖2² ),

which is minimized by

    x_LS = R⁻¹ b1,    (5.19)
    ρ_LS = ‖b2‖2.    (5.20)

Here R is invertible since rank(A) = rank(R) = n.
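A sketch of the same computation in NumPy (np.linalg.qr in 'complete' mode plays the role of the accumulated Householder reflections; the data is made up):

```python
import numpy as np
from scipy.linalg import solve_triangular

def least_squares_qr(A, b):
    """Solve min ||Ax - b||_2 via a QR decomposition, as in Algorithm 5.2."""
    m, n = A.shape
    Q, R = np.linalg.qr(A, mode='complete')     # A = Q [R; 0], Q is m x m orthogonal
    c = Q.T @ b                                 # c = Q^T b = (b1, b2)
    x = solve_triangular(R[:n, :], c[:n], lower=False)   # x_LS = R^{-1} b1
    rho = np.linalg.norm(c[n:])                          # rho_LS = ||b2||_2
    return x, rho

A = np.array([[1., 0.], [1., 1.], [1., 2.]])
b = np.array([1., 2., 2.])
print(least_squares_qr(A, b))
```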

    QᵀA = [ R ; 0 ],  A = Q [ R ; 0 ].

The QR-decomposition of A can also be obtained by Gram-Schmidt orthogonalization. Let A = [a1, . . . , an] with {a1, . . . , an} linearly independent, and let Q = [q1, . . . , qm] be orthogonal. Then

    [a1, . . . , an] = [q1, . . . , qm] [ R ; 0 ],  R = (rjk) ∈ M(n, n) upper triangular.    (5.21)

From qjᵗ ak = rjk we get ak = Σ_{j=1}^k (qjᵗ ak) qj, so span{a1, . . . , ak} ⊂ span{q1, . . . , qk}; since both spaces have the same dimension, span{a1, . . . , ak} = span{q1, . . . , qk}. In particular, for A : Rⁿ → R^m,

    R(A) = span{q1, . . . , qn},    (5.22)
    R(A)⊥ = span{q_{n+1}, . . . , qm}.    (5.23)

Moreover,

    ak = Σ_{j=1}^{k−1} rjk qj + rkk qk,  so that  qk = ( ak − Σ_{j=1}^{k−1} rjk qj ) / rkk.    (5.24a)

Algorithm 5.3. (Gram-Schmidt orthogonalization)

1   DO k=1,n
2     sjk = qjᵗ ak,  j = 1, . . . , k − 1
3     zk = ak − Σ_{j=1}^{k−1} sjk qj
4     rkk = ‖zk‖2
5     qk = zk / rkk
6     rjk = sjk,  j = 1, . . . , k − 1
7   ENDDO
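A direct Python transcription of Algorithm 5.3 (a sketch of classical Gram-Schmidt; in floating point the modified variant is usually preferred for stability):

```python
import numpy as np

def gram_schmidt_qr(A):
    """Thin QR factorization A = Q R by classical Gram-Schmidt (Algorithm 5.3)."""
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for k in range(n):
        z = A[:, k].copy()
        for j in range(k):
            R[j, k] = Q[:, j] @ A[:, k]     # s_jk = q_j^t a_k
            z -= R[j, k] * Q[:, j]          # z_k = a_k - sum_j s_jk q_j
        R[k, k] = np.linalg.norm(z)         # r_kk = ||z_k||_2
        Q[:, k] = z / R[k, k]               # q_k = z_k / r_kk
    return Q, R

A = np.array([[1., 0.], [1., 1.], [1., 2.]])
Q, R = gram_schmidt_qr(A)
print(np.allclose(Q @ R, A), np.allclose(Q.T @ Q, np.eye(2)))   # True True
```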


Chapter 6

Eigenvalues and eigenvectors

Right eigenvector x and eigenvalue λ:  Ax = λx, x ≠ 0.

Left eigenvector y and eigenvalue λ:  y*A = λy*, y ≠ 0.

Rayleigh quotient:  λ = x*Ax / (x*x). If x is an eigenvector, λ is the associated eigenvalue.

Characteristic polynomial:  P_A(λ) = det(A − λI) = (λ1 − λ) · · · (λn − λ), where λ1, . . . , λn are the eigenvalues of A. In particular, P_A(0) = det(A) = Π_{j=1}^n λj.

Trace of A:  Tr(A) = Σ_{j=1}^n λj.

Spectrum of A:  σ(A) is the set of all eigenvalues of A.

Spectral radius:  ρ(A) = max_{λ∈σ(A)} |λ|. Note that σ(A*) = { λ̄ : λ ∈ σ(A) }: indeed, if λ ∈ σ(A), then 0 = det(A − λI), hence also 0 = det[(A − λI)*] = det(A* − λ̄I), so λ̄ ∈ σ(A*).

6.1 Cayley-Hamilton theorem

P_A(A) = 0. Write P_A(λ) = (λ1 − λ)^{m1} · · · (λk − λ)^{mk} with Σ_{j=1}^k mj = n, where λ1, . . . , λk are the distinct roots of det(A − λI) = 0, i.e. the eigenvalues, and mj is the algebraic multiplicity of λj.

The dimension of the eigenspace corresponding to λj, the "geometric multiplicity of λj", satisfies: geometric multiplicity of λj ≤ mj.

Theorem 6.1. For any consistent matrix norm ‖·‖,

    ρ(A) ≤ ‖A‖,  ∀A ∈ M(n, n).    (6.1)


Proof Let Ax = λx, x 6= 0. Then |λ| ‖x‖ = ‖λx‖ = ‖Ax‖ ≤ ‖A‖ ‖x‖. Therefore, |λ| ≤ ‖A‖ and

ρ(A) ≤ ‖A‖.

Theorem 6.2. Let A ∈ M(n, n) and ε > 0 be given. Then there exists a consistent matrix norm ‖·‖ = ‖·‖_{A,ε} such that ρ(A) ≤ ‖A‖_{A,ε} ≤ ρ(A) + ε.

Proof. Let TAT⁻¹ = J be the Jordan canonical form, J = diag{J1, J2, . . . , Jk}, where each block Jk is bidiagonal with λk on the diagonal and 1 on the superdiagonal. Let Dε = diag{1, ε, ε², . . . , ε^{n−1}} and consider the transformation

    J ⟼ Dε⁻¹ J Dε,

which replaces every superdiagonal 1 of J by ε. Hence

    ‖Dε⁻¹ J Dε‖∞ = ‖Dε⁻¹ T A T⁻¹ Dε‖∞ ≤ ρ(A) + ε.

In fact, if S is a nonsingular matrix, then x ↦ ‖Sx‖_p defines a new vector norm ‖·‖_(p), whose induced matrix norm is

    ‖A‖_(p) = max_{x≠0} ‖Ax‖_(p)/‖x‖_(p) = max_{x≠0} ‖SAx‖_p/‖Sx‖_p = max_{y≠0} ‖SAS⁻¹y‖_p/‖y‖_p.

Choosing S = Dε⁻¹T (so S⁻¹ = T⁻¹Dε), i.e. the vector norm p(x) = ‖Dε⁻¹Tx‖∞, we obtain

    ‖A‖_(p) = ‖Dε⁻¹TA(Dε⁻¹T)⁻¹‖∞ = ‖Dε⁻¹ J Dε‖∞ ≤ ρ(A) + ε,

while ρ(A) ≤ ‖A‖_(p) holds by Theorem 6.1.

Theorem 6.3. Let A ∈ M(n, n) and ‖·‖ be a consistent norm. Then

    lim_{k→∞} ‖A^k‖^{1/k} = ρ(A).    (6.2)

Theorem 6.4. Let A ∈ M(n, n). Then

    lim_{k→∞} A^k = 0 ⟺ ρ(A) < 1.    (6.3)

Proof. Assume ρ(A) < 1 and choose ε > 0 with ρ(A) + ε < 1. By Theorem 6.2 there exists a consistent matrix norm ‖·‖ such that ‖A‖ ≤ ρ(A) + ε < 1. Then ‖A^k‖ ≤ ‖A‖^k ≤ (ρ(A) + ε)^k → 0, thus lim_{k→∞} ‖A^k‖ = 0 and therefore lim_{k→∞} A^k = 0.

Next, assume that lim_{k→∞} A^k = 0. Let Ax = λx with x ≠ 0; then A^k x = λ^k x, so 0 = (lim_{k→∞} A^k)x = lim_{k→∞} λ^k x. Since x ≠ 0, lim_{k→∞} λ^k = 0, i.e. |λ| < 1. ∴ ρ(A) < 1.


Remark 6.1. In the setting of Theorem 6.4, i.e. ρ(A) < 1, and for a consistent norm with ‖A‖ < 1,

    1/(1 + ‖A‖) ≤ ‖(I − A)⁻¹‖ ≤ 1/(1 − ‖A‖).    (6.4)

Proof. To see that I − A is invertible, note (I − A)(I + A + A² + · · · + A^k) = I − A^{k+1}; letting k → ∞ and using Theorem 6.4, (I − A) Σ_{k=0}^∞ A^k = I. Also (Σ_{k=0}^∞ A^k)(I − A) = I. Therefore

    (I − A)⁻¹ = Σ_{k=0}^∞ A^k.    (6.5)

For the lower bound, 1 = ‖I‖ = ‖(I − A)(I − A)⁻¹‖ ≤ ‖I − A‖ ‖(I − A)⁻¹‖ ≤ (1 + ‖A‖) ‖(I − A)⁻¹‖, so

    1/(1 + ‖A‖) ≤ ‖(I − A)⁻¹‖.    (6.6)

Next, (I − A)⁻¹ = I + A(I − A)⁻¹ (multiply both sides by I − A to check), so ‖(I − A)⁻¹‖ ≤ ‖I‖ + ‖A‖ ‖(I − A)⁻¹‖, and since 1 − ‖A‖ > 0,

    ‖(I − A)⁻¹‖ ≤ 1/(1 − ‖A‖).    (6.7)

Theorem 6.5. For A ∈ M(n, n), let

    H = (A + A*)/2    (the Hermitian part of A),    (6.8)
    S = (A − A*)/(2i)    (the skew-Hermitian part of A divided by i, so that A = H + iS).    (6.9)

Then for every λ ∈ σ(A),

    λ_min(H) ≤ Re(λ) ≤ λ_max(H),    (6.10)
    λ_min(S) ≤ Im(λ) ≤ λ_max(S).    (6.11)

Proof. Let Ax = λx with ‖x‖2 = 1. Then λ = x*Ax = x*((A + A*)/2)x + x*((A − A*)/2)x = x*Hx + i x*Sx. Since H and S are Hermitian, they are unitarily similar to real diagonal matrices:

    H = MΣM*,    (6.12)
    S = NΣ′N*,    (6.13)

where Σ and Σ′ are real diagonal matrices of eigenvalues. Hence λ = x*MΣM*x + i x*NΣ′N*x = y*Σy + i z*Σ′z with y = M*x, z = N*x, and y*y = x*MM*x = x*x = 1, z*z = x*NN*x = x*x = 1. Thus

    Re(λ) = y*Σy,    (6.14)
    Im(λ) = z*Σ′z,    (6.15)

and since y and z are unit vectors, these quadratic forms lie between the smallest and largest diagonal entries of Σ and Σ′, respectively. This shows equations (6.10) and (6.11).


6.2 Gerschgorin theorem

Define the (row) Gerschgorin discs

    Rj := { z ∈ C : |z − ajj| ≤ Σ_{k=1, k≠j}^n |ajk| }.    (6.16)

Theorem 6.6. σ(A) ⊂ ⋃_{j=1}^n Rj.

Proof. Let D = diag{a11, a22, . . . , ann} and write A = D + E. Let λ ∈ σ(A). If λ = ajj for some j, there is nothing to prove; therefore assume λ ≠ ajj for all j, and set B := A − λI = (D − λI) + E. Since λ ∈ σ(A), there exists x ≠ 0 such that Bx = 0, i.e. (D − λI)x + Ex = 0, so x = −(D − λI)⁻¹Ex and hence ‖x‖∞ ≤ ‖(D − λI)⁻¹E‖∞ ‖x‖∞. Therefore

    1 ≤ ‖(D − λI)⁻¹E‖∞ = max_i Σ_{k=1}^n |eik/(aii − λ)| = Σ_{k=1, k≠j}^n |ajk/(ajj − λ)|  for the maximizing row j,

i.e.

    |ajj − λ| ≤ Σ_{k=1, k≠j}^n |ajk|,    (6.17)

which means λ ∈ Rj.

Example 6.1. Consider the n × n tridiagonal matrix

    A = tridiag(−1, 2, −1),

with 2 on the diagonal and −1 on the sub- and superdiagonals. Every Gerschgorin disc is centered at 2 with radius at most 2, so σ(A) ⊂ {z ∈ C : |z − 2| ≤ 2}.
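A small sketch that computes the row discs of (6.16) for this matrix and checks that every eigenvalue lies in at least one disc (NumPy; the size n is chosen arbitrarily):

```python
import numpy as np

n = 6
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)      # tridiag(-1, 2, -1)

centers = np.diag(A)
radii = np.sum(np.abs(A), axis=1) - np.abs(centers)        # row sums without the diagonal

eigs = np.linalg.eigvals(A)
print(all(any(abs(lam - c) <= r + 1e-12 for c, r in zip(centers, radii)) for lam in eigs))
```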

Similarly, using the columns,

    Cj := { z ∈ C : |z − ajj| ≤ Σ_{k=1, k≠j}^n |akj| }.

Theorem 6.7.

    σ(A) ⊂ ( ⋃_j Rj ) ∩ ( ⋃_j Cj ).    (6.18)

Proof The same as theorem 6.6 by using ‖·‖1 instead of ‖·‖∞.

Theorem 6.8. If ( ⋃_{j=1}^m Rj ) ∩ ( ⋃_{j=m+1}^n Rj ) = ∅, then ⋃_{j=1}^m Rj contains exactly m eigenvalues of A, each eigenvalue being counted according to its algebraic multiplicity. The remaining eigenvalues are in ⋃_{j=m+1}^n Rj.


Proof. Write A = AD + R, where AD = diag{a11, . . . , ann}. For t ∈ [0, 1] set At = AD + tR, so that A0 = AD and A1 = A. The eigenvalues of At are continuous functions of t, and the Gerschgorin discs of At,

    Rj(t) = { z : |z − ajj| ≤ t Σ_{k≠j} |ajk| } ⊂ Rj,    (6.19)

shrink towards the centers as t decreases. For t = 0 there are exactly m eigenvalues of A0 in ⋃_{j=1}^m Rj and n − m in ⋃_{j=m+1}^n Rj (taking the multiplicities into account). Since for every t ∈ [0, 1] all eigenvalues of At must lie in these discs, and the two unions are disjoint, it follows by continuity that m eigenvalues of A are in ⋃_{j=1}^m Rj and the remaining n − m in ⋃_{j=m+1}^n Rj.

6.3 The power method (to compute eigenvalues and eigenvectors)

Let A ∈ M(n, n) be diagonalizable: X⁻¹AX = Λ with X = [x1, . . . , xn], Λ = diag{λ1, . . . , λn}, and

    |λ1| > |λ2| ≥ · · · ≥ |λn| ≥ 0,

so that λ1 has algebraic multiplicity 1. The power method approximates λ1 and its eigenvector x1.

Algorithm 6.1.

1   Choose an initial guess q(0) ∈ Cⁿ with ‖q(0)‖2 = 1.
2   DO k=1, until convergence
3     z(k) = A q(k−1)
4     q(k) = z(k)/‖z(k)‖2              (k → ∞, q(k) → x1)
5     ν(k) = (q(k))* A q(k)            (k → ∞, ν(k) → λ1)
6   ENDDO
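A sketch of Algorithm 6.1 in NumPy (a fixed number of iterations replaces the convergence test, and the test matrix is made up):

```python
import numpy as np

def power_method(A, q0, num_iter=100):
    """Power method: approximate the dominant eigenpair (Algorithm 6.1)."""
    q = q0 / np.linalg.norm(q0)
    for _ in range(num_iter):
        z = A @ q                        # z(k) = A q(k-1)
        q = z / np.linalg.norm(z)        # q(k) = z(k) / ||z(k)||_2
        nu = q.conj() @ A @ q            # Rayleigh quotient nu(k)
    return nu, q

A = np.array([[2., 1.], [1., 3.]])
nu, q = power_method(A, np.array([1., 0.]))
print(nu)   # ≈ 3.618, the dominant eigenvalue of A
```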

6.3.1 Convergence analysis

Note that

    q(k) = A^k q(0) / ‖A^k q(0)‖2.    (6.20)

Let q(0) = Σ_{j=1}^n αj xj, αj ∈ C (possible since A is diagonalizable), with Axj = λj xj for all j. Then

    A^k q(0) = Σ_{j=1}^n αj λj^k xj
             = α1 λ1^k [ x1 + Σ_{j=2}^n (αj/α1)(λj/λ1)^k xj ] =: α1 λ1^k q̃(k).    (6.21)

Claim. If α1 ≠ 0, there exists C > 0 such that ‖q̃(k) − x1‖2 ≤ C (|λ2|/|λ1|)^k, k ≥ 1.


Proof. Without loss of generality, we may assume that ‖xj‖2 = 1, j = 1, . . . , n. Then

    ‖q̃(k) − x1‖2 = ‖ Σ_{j=2}^n (αj/α1)(λj/λ1)^k xj ‖2
                 ≤ Σ_{j=2}^n |αj/α1| |λj/λ1|^k
                 ≤ |λ2/λ1|^k Σ_{j=2}^n |αj/α1|,

and the last sum, call it C, is independent of k. Thus we see that

    q̃(k) → x1  as  k → ∞.    (6.22)

Writing q̃(k) = βk q(k) with βk = ‖A^k q(0)‖2/(α1 λ1^k), and using that the Rayleigh quotient is invariant under scaling of its argument, we conclude from the above that

    ν(k) = (q(k))* A q(k) / ((q(k))* q(k)) = (q̃(k))* A q̃(k) / ((q̃(k))* q̃(k)) → x1*Ax1 = λ1.    (6.23)

Exercise 6.1. Let

    A = [ −261   209   −49 ]
        [ −530   422   −98 ]
        [ −800   631  −144 ],      v(0) = (1, 0, 0)ᵗ.

Apply the power method to find the largest eigenvalue and the corresponding eigenvector.

6.3.2 Inverse iteration

Given µ ∉ σ(A), find an approximation of the λ ∈ σ(A) that is closest to µ; µ is called a shift.

Idea: apply the power method to M_µ⁻¹ = (A − µI)⁻¹.

Note that σ(M_µ⁻¹) = {(λj − µ)⁻¹ : λj ∈ σ(A)}. Indeed, Ax = λx ⟺ (A − µI)x = (λ − µ)x ⟺ (λ − µ)⁻¹x = (A − µI)⁻¹x = M_µ⁻¹x. Assume that there exists an index m such that |λm − µ| < |λj − µ| for all j ≠ m, so that ξm = (λm − µ)⁻¹ is the eigenvalue of M_µ⁻¹ with largest modulus. (If µ = 0, λm is the eigenvalue of A with smallest modulus.) Then λm = 1/ξm + µ.

1 DO j=1, max j

2 z(j) = (A− µI)−1q(j−1)

3 q(j) = z(j)/∥∥z(j)

∥∥2

4 σ(j) = (a(j))∗Aa(j)/(q(j))∗q(j)

5 ENDDO
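A sketch of inverse iteration (for simplicity the shifted system is solved densely at every step; a real implementation would factor A − µI once and reuse the factorization):

```python
import numpy as np

def inverse_iteration(A, mu, q0, num_iter=50):
    """Inverse iteration with shift mu (Algorithm 6.2)."""
    n = A.shape[0]
    M = A - mu * np.eye(n)
    q = q0 / np.linalg.norm(q0)
    for _ in range(num_iter):
        z = np.linalg.solve(M, q)        # z(j) = (A - mu I)^{-1} q(j-1)
        q = z / np.linalg.norm(z)
        sigma = q.conj() @ A @ q         # Rayleigh quotient -> eigenvalue closest to mu
    return sigma, q

A = np.array([[2., 1.], [1., 3.]])
print(inverse_iteration(A, 1.0, np.array([1., 1.]))[0])   # ≈ 1.382, the eigenvalue closest to 1
```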


Chapter 7

Iterative methods for linear systems

7.1 (Gauss-) Jacobi method and Seidel method

7.1.1 Jacobi method

P = D = diag{a11, a22, . . . , ann} (7.1)

N = −(A−D) (7.2)

A = P −N or N = P −A. To solve Ax = b, Px = Nx + b.

x = P−1(Nx + b) = P−1((P −A)x + b) = x + P−1(b−Ax). (7.3)

The Jacobi method is the iterative method with initial approximation x(0).

Algorithm 7.1.

1 DO j=1, max iter

2   x(j) = x(j−1) + P⁻¹(b − Ax(j−1))        [equivalently, Dx(j) + (L + U)x(j−1) = b]

3 ENDDO

A = L + D + U, where L is the strictly lower triangular part of A, D its diagonal part, and U its strictly upper triangular part.

r(j) = b−Ax(j) ; jth residual.
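A sketch of the Jacobi iteration of Algorithm 7.1 (NumPy; the stopping test on the residual and the test system are illustrative choices):

```python
import numpy as np

def jacobi(A, b, x0, tol=1e-10, max_iter=1000):
    """Jacobi iteration x(j) = x(j-1) + D^{-1}(b - A x(j-1))."""
    D = np.diag(A)
    x = x0.copy()
    for _ in range(max_iter):
        r = b - A @ x                    # r(j) = b - A x(j)
        if np.linalg.norm(r, np.inf) < tol:
            break
        x = x + r / D                    # P = D, so P^{-1} r is a componentwise division
    return x

n = 5
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
print(jacobi(A, b, np.zeros(n)))
```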

7.1.2 Seidel method

P = D + L (7.4)

N = −(A− P ) = −U (7.5)



Algorithm 7.2.

1   x(0) is an initial approximation
2   DO j=1, max iter
3     DO k=1, n
4       x_k(j) = ( b_k − Σ_{l=1}^{k−1} a_kl x_l(j) − Σ_{l=k+1}^n a_kl x_l(j−1) ) / a_kk        [(D + L)x(j) + Ux(j−1) = b]
5     ENDDO
6   ENDDO

x(j) = (I − P⁻¹A)x(j−1) + P⁻¹b is called the jth iterate. One may instead take x(j) = [ω(I − P⁻¹A) + (1 − ω)I]x(j−1) + ωP⁻¹b: if ω = 1 this is the original scheme, if ω = 0 the iteration is stationary (x(j) = x(j−1)), and otherwise it is a relaxed scheme.
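A sketch of the Gauss-Seidel sweep of Algorithm 7.2 with a relaxation parameter ω included (ω = 1 recovers the plain Seidel method; the test system is made up):

```python
import numpy as np

def sor(A, b, x0, omega=1.0, tol=1e-10, max_iter=1000):
    """Gauss-Seidel / SOR sweeps for Ax = b."""
    n = len(b)
    x = x0.copy()
    for _ in range(max_iter):
        for k in range(n):
            s = A[k, :k] @ x[:k] + A[k, k+1:] @ x[k+1:]   # uses already-updated entries
            x_gs = (b[k] - s) / A[k, k]                   # Seidel value for x_k
            x[k] = omega * x_gs + (1 - omega) * x[k]      # relaxation
        if np.linalg.norm(b - A @ x, np.inf) < tol:
            break
    return x

n = 5
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
print(sor(A, b, np.zeros(n), omega=1.5))
```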

Exercise 7.1 (Seidel over-relaxation (SOR) method). Let

    A = tridiag(−1, 2, −1) ∈ M(n, n),  n = 100, 1000, 10000,  ω = 1.7, 1,  x_j(0) = (−1)^j.

Check the change of the residual using ‖·‖∞.

7.2 Richardson iterative method

For some easily invertible matrix P, iterate

    x(j) = x(j−1) + αj P⁻¹ r(j−1)    (jth iterative step).    (7.6)

    Jacobi method                          αj = 1 ∀j    P = D
    Seidel method                          αj = 1 ∀j    P = D + L
    JOR (Jacobi over-relaxation) method    αj = ω ∀j    P = D
    SOR method                             αj = ω ∀j    P = D + L

If αj = α is constant for all j, the scheme is called a "stationary Richardson method"; otherwise it is called a "nonstationary Richardson method."

Theorem 7.1. The stationary Richardson method is convergent iff

    2α Re λ / (α²|λ|²) > 1,  ∀λ ∈ σ(P⁻¹A).    (7.7)


Proof.

    ρ(I − αP⁻¹A) < 1
    ⟺ (1 − α Re λ)² + (α Im λ)² < 1,  ∀λ ∈ σ(P⁻¹A)
    ⟺ α²|λ|² < 2α Re λ,  ∀λ ∈ σ(P⁻¹A)
    ⟺ 2α Re λ / (α²|λ|²) > 1,  ∀λ ∈ σ(P⁻¹A).

Now, subtracting the fixed-point identity from the iteration,

    x(j) = x(j−1) + αP⁻¹(b − Ax(j−1))
    x    = x      + αP⁻¹(b − Ax),

we obtain, with e(j) = x(j) − x,

    e(j) = e(j−1) − αP⁻¹A e(j−1) = (I − αP⁻¹A) e(j−1) = (I − αP⁻¹A)^j e(0),

so ‖e(j)‖ ≤ ‖I − αP⁻¹A‖^j ‖e(0)‖, and by Theorem 6.4 the iteration converges for every e(0) iff ρ(I − αP⁻¹A) < 1.

Theorem 7.2. Assume that P⁻¹A has positive real eigenvalues 0 < λ1 ≤ · · · ≤ λN. Then the stationary Richardson method converges iff

    0 < α < 2/λN.    (7.8)

Moreover, ρ(I − αP⁻¹A) is minimized by α = 2/(λ1 + λN), with minimum value (λN − λ1)/(λN + λ1).

Proof. The eigenvalues of I − αP⁻¹A are 1 − αλj, j = 1, . . . , N. In order to have |1 − αλj| < 1 for all j (with α > 0), we need

    α < 2/λj,  ∀j.

Thus if 0 < α < 2/λN, the method converges.

ρ(I − αP⁻¹A) is minimized when 1 − αλ1 = −(1 − αλN), i.e. α = 2/(λ1 + λN). In this case,

    ρ(I − αP⁻¹A) = 1 − (2/(λ1 + λN)) λ1 = (λN − λ1)/(λN + λ1).
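A sketch of the stationary Richardson iteration with the optimal α of Theorem 7.2 (here P = I, and the eigenvalue extremes are computed directly for a small made-up SPD matrix):

```python
import numpy as np

def richardson(A, b, x0, alpha, num_iter=500):
    """Stationary Richardson iteration x(j) = x(j-1) + alpha*(b - A x(j-1)), with P = I."""
    x = x0.copy()
    for _ in range(num_iter):
        x = x + alpha * (b - A @ x)
    return x

n = 5
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # symmetric positive definite
b = np.ones(n)
lam = np.linalg.eigvalsh(A)
alpha_opt = 2.0 / (lam[0] + lam[-1])                   # alpha = 2/(lambda_1 + lambda_N)
x = richardson(A, b, np.zeros(n), alpha_opt)
print(np.linalg.norm(b - A @ x))                       # small residual
```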

Theorem 7.3. If A is strictly diagonally dominant (that is, |ajj| > Σ_{k≠j} |ajk| for all j), then the Jacobi method converges.

Proof P = D, αj = α = 1.

    I − P⁻¹A = I − D⁻¹A,

which has zero diagonal and off-diagonal entries −ajk/ajj, so that

    ‖I − P⁻¹A‖∞ = max_j Σ_{k≠j} |ajk/ajj| < 1.

Thus the Jacobi method converges.


Theorem 7.4. The Seidel method converges if ‖I − D⁻¹A‖∞ < 1, in particular if A is strictly diagonally dominant.

Proof. The Seidel method gives the kth-step error componentwise as

    e_j(k) = − Σ_{l=1}^{j−1} (a_jl/a_jj) e_l(k) − Σ_{l=j+1}^n (a_jl/a_jj) e_l(k−1),  j = 1, . . . , n.

Denote r = ‖I − D⁻¹A‖∞ < 1. We show by induction on the components that

    ‖e(k)‖∞ ≤ r ‖e(k−1)‖∞  for all k.    (7.9)

For the jth component of e(k), we start with j = 1:

    |e_1(k)| = | − Σ_{l=2}^n (a_1l/a_11) e_l(k−1) | ≤ Σ_{l=2}^n |a_1l/a_11| ‖e(k−1)‖∞ ≤ r ‖e(k−1)‖∞.

Assume that |e_j(k)| ≤ r ‖e(k−1)‖∞ for j = 1, . . . , l − 1. Then

    |e_l(k)| ≤ Σ_{m=1}^{l−1} |a_lm/a_ll| |e_m(k)| + Σ_{m=l+1}^n |a_lm/a_ll| |e_m(k−1)|
            ≤ ( Σ_{m=1}^{l−1} |a_lm/a_ll| ) r ‖e(k−1)‖∞ + ( Σ_{m=l+1}^n |a_lm/a_ll| ) ‖e(k−1)‖∞
            ≤ ( Σ_{m≠l} |a_lm/a_ll| ) ‖e(k−1)‖∞            (since r < 1)
            ≤ ‖I − D⁻¹A‖∞ ‖e(k−1)‖∞.

In other words,

    ‖e(k)‖∞ ≤ r ‖e(k−1)‖∞.    (7.10)

Therefore the Seidel method is convergent.

If |ajj| ≥ Σ_{l≠j} |ajl| for all j and |akk| > Σ_{l≠k} |akl| for at least one k, then the Jacobi or Seidel method is convergent.

Theorem 7.5. Let A be a Hermitian matrix with positive diagonal entries. Then the Seidel method is convergent ⟺ A is positive definite.

Theorem 7.6. Let A be a symmetric matrix with positive diagonal entries. Then the SOR method is convergent iff A is positive definite and ω ∈ (0, 2).