Lecture 2, INF-MAT 4350, 2009: Sections 7.1-7.6. LU, Symmetric LU, Positive (Semi)Definite, Cholesky, Semi-Cholesky
Tom Lyche and Michael Floater
Centre of Mathematics for Applications, Department of Informatics,
University of Oslo
August 27, 2009
Triangular Matrices (from week 1)
Recall:
- The product of two upper (lower) triangular matrices is upper (lower) triangular.
- A triangular matrix is nonsingular if and only if all diagonal elements are nonzero.
- The inverse of a nonsingular upper (lower) triangular matrix is upper (lower) triangular.
- A matrix is unit triangular if it is triangular with 1's on the diagonal.
- The product of two unit upper (lower) triangular matrices is unit upper (lower) triangular.
- A unit upper (lower) triangular matrix is invertible and the inverse is unit upper (lower) triangular.
LU Factorization
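- We say that $A = LR$ is an LU factorization of $A \in \mathbb{R}^{n,n}$ if $L \in \mathbb{R}^{n,n}$ is lower (left) triangular and $R \in \mathbb{R}^{n,n}$ is upper (right) triangular. In addition we will assume that $L$ is unit triangular.
- Example:
$$A = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -1/2 & 1 \end{bmatrix} \begin{bmatrix} 2 & -1 \\ 0 & 3/2 \end{bmatrix}$$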
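The factors in the example can be reproduced with a few lines of MATLAB. This is a minimal sketch, not taken from the slides: it eliminates below the diagonal without row interchanges and assumes every pivot `R(k,k)` it meets is nonzero.

```matlab
% Sketch: LU factorization without row interchanges (L unit lower triangular).
% Assumption: all pivots R(k,k) are nonzero; there is no pivoting safeguard.
A = [2 -1; -1 2];
n = size(A,1);
L = eye(n);
R = A;
for k = 1:n-1
    for i = k+1:n
        L(i,k) = R(i,k)/R(k,k);                  % multiplier for row i
        R(i,k:n) = R(i,k:n) - L(i,k)*R(k,k:n);   % eliminate entry (i,k)
    end
end
disp(L), disp(R)
```

For this $A$ the loop produces the $L$ and $R$ shown in the example.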
Example
Not every matrix has an LU factorization.
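- An LU factorization of $A = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}$ must satisfy the equations
$$\begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ l_1 & 1 \end{bmatrix} \begin{bmatrix} r_1 & r_3 \\ 0 & r_2 \end{bmatrix}$$
for the unknowns $l_1$ in $L$ and $r_1, r_2, r_3$ in $R$.
- Multiplying out, we get the equations
$$\begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} r_1 & r_3 \\ l_1 r_1 & l_1 r_3 + r_2 \end{bmatrix}.$$
- Comparing $(1,1)$-elements we see that $r_1 = 0$.
- This makes it impossible to satisfy the condition $1 = l_1 r_1$ for the $(2,1)$ element. We conclude that $A$ has no LU factorization.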
Submatrices
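- $A \in \mathbb{C}^{n,n}$.
- $r = [r_1, \dots, r_k]$ for some $1 \le r_1 < \cdots < r_k \le n$.
- Principal: $B = A(r, r)$, $b_{i,j} = a_{r_i, r_j}$.
- Leading principal: $B = A_k := A(1:k, 1:k)$.
- The determinant of a (leading) principal submatrix is called a (leading) principal minor.
- The principal submatrices of $A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}$ are
$$[1],\ [5],\ [9],\ \begin{bmatrix} 1 & 2 \\ 4 & 5 \end{bmatrix},\ \begin{bmatrix} 1 & 3 \\ 7 & 9 \end{bmatrix},\ \begin{bmatrix} 5 & 6 \\ 8 & 9 \end{bmatrix},\ A.$$
- The leading principal submatrices are
$$[1],\ \begin{bmatrix} 1 & 2 \\ 4 & 5 \end{bmatrix},\ A.$$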
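The leading principal minors of the $3 \times 3$ example are easy to compute numerically. The following sketch only uses the built-in `det`; the variable name `minors` is a choice made here for illustration.

```matlab
% Sketch: leading principal minors det(A(1:k,1:k)) for k = 1,...,n.
A = [1 2 3; 4 5 6; 7 8 9];
n = size(A,1);
minors = zeros(1,n);
for k = 1:n
    minors(k) = det(A(1:k,1:k));   % leading principal minor of order k
end
disp(minors)
```

The first two minors are $1$ and $-3$, so $A_1$ and $A_2$ are nonsingular. The last one is zero up to rounding, since $A$ itself is singular.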
Existence and Uniqueness of LU
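Theorem. Suppose the leading principal submatrices $A_k$ of $A \in \mathbb{C}^{n,n}$ are nonsingular for $k = 1, \dots, n-1$. Then $A$ has a unique LU factorization.
Example:
$$\begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}$$
Here $A_1 = [1]$ is nonsingular, so the theorem applies even though $A$ itself is singular.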
Proof
- Proof by induction on $n$.
- $n = 1$: $[a_{11}] = [1][a_{11}]$.
- Suppose that $A_{n-1}$ has a unique LU factorization $A_{n-1} = L_{n-1} R_{n-1}$, and that $A_1, \dots, A_{n-1}$ are nonsingular.
- Since $A_{n-1}$ is nonsingular it follows that $L_{n-1}$ and $R_{n-1}$ are nonsingular.
- But then, with $v := L_{n-1}^{-1} b$,
$$A = \begin{bmatrix} A_{n-1} & b \\ c^T & a_{nn} \end{bmatrix} = \begin{bmatrix} L_{n-1} & 0 \\ c^T R_{n-1}^{-1} & 1 \end{bmatrix} \begin{bmatrix} R_{n-1} & v \\ 0 & a_{nn} - c^T R_{n-1}^{-1} v \end{bmatrix} = LR$$
is an LU factorization of $A$.
- Since $L_{n-1}$ and $R_{n-1}$ are nonsingular the block $(2,1)$ entry in $L$ is uniquely given, and then $r_{nn}$ is also determined uniquely from the construction. Thus the LU factorization is unique.
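As an illustration, the induction step can be traced by hand on the $2 \times 2$ matrix from the LU example, with $n = 2$. The values below follow directly from the formulas in the proof.

```latex
\[
A = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}, \qquad
A_1 = [2] = [1][2] = L_1 R_1, \qquad b = -1, \quad c = -1, \quad a_{22} = 2,
\]
\[
v = L_1^{-1} b = -1, \qquad
c^T R_1^{-1} = -\tfrac{1}{2}, \qquad
r_{22} = a_{22} - c^T R_1^{-1} v = 2 - \left(-\tfrac{1}{2}\right)(-1) = \tfrac{3}{2}.
\]
```

This gives back $L = \begin{bmatrix} 1 & 0 \\ -1/2 & 1 \end{bmatrix}$ and $R = \begin{bmatrix} 2 & -1 \\ 0 & 3/2 \end{bmatrix}$.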
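- Using block multiplication one can show:
Lemma. Suppose $A = LR$ is the LU factorization of $A \in \mathbb{R}^{n,n}$. For $k = 1, \dots, n$ let $A_k, L_k, R_k$ be the leading principal submatrices of $A, L, R$, respectively. Then $A_k = L_k R_k$ is the LU factorization of $A_k$ for $k = 1, \dots, n$.
- Example:
$$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 4 & 1 & 0 \\ 7 & 2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 \\ 0 & -3 & -6 \\ 0 & 0 & 0 \end{bmatrix} = LR.$$
- $A_1 = [1] = [1][1] = L_1 R_1$.
- $A_2 = \begin{bmatrix} 1 & 2 \\ 4 & 5 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 4 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 0 & -3 \end{bmatrix} = L_2 R_2$.
- $R(3,3) = 0$ and $A$ is singular.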
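The lemma can be checked numerically on this example. The sketch below hard-codes the factors from the example and compares each leading block. All entries are small integers, so the products are exact in floating point and `isequal` is safe here.

```matlab
% Sketch: verify A_k = L_k * R_k for every leading principal submatrix.
A = [1 2 3; 4 5 6; 7 8 9];
L = [1 0 0; 4 1 0; 7 2 1];
R = [1 2 3; 0 -3 -6; 0 0 0];
ok = true;
for k = 1:3
    Ak = A(1:k,1:k);
    ok = ok && isequal(L(1:k,1:k)*R(1:k,1:k), Ak);   % leading blocks multiply to A_k
end
disp(ok)
```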
A Converse
Theorem. Suppose $A \in \mathbb{C}^{n,n}$ has an LU factorization. If $A$ is nonsingular then the leading principal submatrices $A_k$ are nonsingular for $k = 1, \dots, n-1$ and the LU factorization is unique.
- Proof: Suppose $A$ is nonsingular with the LU factorization $A = LR$.
- Since $A$ is nonsingular it follows that $L$ and $R$ are nonsingular.
- By the Lemma we have $A_k = L_k R_k$.
- $L_k$ is unit lower triangular and therefore nonsingular.
- $R_k$ is nonsingular since its diagonal entries are among the nonzero diagonal entries of $R$.
- But then $A_k$ is nonsingular for all $k$. Uniqueness then follows from the existence and uniqueness theorem.
- Remark: The LU factorization of a singular matrix need not be unique. For the zero matrix any unit lower triangular matrix can be used as $L$ in an LU factorization.
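In the $2 \times 2$ case the remark reads as follows, with the free parameter written $l$ here:

```latex
\[
\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 \\ l & 1 \end{bmatrix}
\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\qquad \text{for every } l \in \mathbb{R}.
\]
```

Each choice of $l$ gives a different unit lower triangular $L$, so the factorization is not unique.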
Symmetric LU Factorization
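- For a symmetric matrix the LU factorization can be written in a special form:
$$A = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -1/2 & 1 \end{bmatrix} \begin{bmatrix} 2 & -1 \\ 0 & 3/2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -1/2 & 1 \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & 3/2 \end{bmatrix} \begin{bmatrix} 1 & -1/2 \\ 0 & 1 \end{bmatrix}$$
- In the last product the first and last matrices are transposes of each other.
- $A = LDL^T$: symmetric LU factorization.
- $A = LR$ where $R = DL^T$.
Definition. Suppose $A \in \mathbb{R}^{n,n}$. A factorization $A = LDL^T$, where $L$ is unit lower triangular and $D$ is diagonal, is called a symmetric LU factorization.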
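For the $2 \times 2$ example, $D$ can be read off from the diagonal of $R$. The sketch below assumes the LU factors are already known and only checks the relations $R = DL^T$ and $A = LDL^T$.

```matlab
% Sketch: obtain D from the LU factors of a symmetric matrix and check A = L*D*L'.
A = [2 -1; -1 2];
L = [1 0; -1/2 1];        % unit lower triangular factor from A = L*R
R = [2 -1; 0 3/2];        % upper triangular factor
D = diag(diag(R));        % diagonal matrix holding the pivots
err = norm(A - L*D*L');   % zero up to rounding if A = L*D*L'
disp(D), disp(err)
```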
LDLT Characterization
Theorem. Suppose $A \in \mathbb{R}^{n,n}$ is nonsingular. Then $A$ has a symmetric LU factorization $A = LDL^T$ if and only if $A = A^T$ and $A_k$ is nonsingular for $k = 1, \dots, n-1$. The symmetric LU factorization is unique.
Block LU Factorization
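Suppose $A \in \mathbb{R}^{n,n}$ is a block matrix of the form
$$A := \begin{bmatrix} A_{11} & \cdots & A_{1m} \\ \vdots & & \vdots \\ A_{m1} & \cdots & A_{mm} \end{bmatrix}, \tag{1}$$
where each (diagonal) block $A_{ii}$ is square. We call the factorization
$$A = LR = \begin{bmatrix} I & & & \\ L_{21} & I & & \\ \vdots & & \ddots & \\ L_{m1} & \cdots & L_{m,m-1} & I \end{bmatrix} \begin{bmatrix} R_{11} & R_{12} & \cdots & R_{1m} \\ & R_{22} & \cdots & R_{2m} \\ & & \ddots & \vdots \\ & & & R_{mm} \end{bmatrix} \tag{2}$$
a block LU factorization of $A$. Here the $i$th diagonal blocks $I$ in $L$ and $R_{ii}$ in $R$ have the same order as $A_{ii}$.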
Block LU
The results for elementwise LU factorization carry over to block LU factorization as follows.
Theorem. Suppose $A \in \mathbb{R}^{n,n}$ is a block matrix of the form (1), and the leading principal block submatrices
$$A_k := \begin{bmatrix} A_{11} & \cdots & A_{1k} \\ \vdots & & \vdots \\ A_{k1} & \cdots & A_{kk} \end{bmatrix}$$
are nonsingular for $k = 1, \dots, m-1$. Then $A$ has a unique block LU factorization (2). Conversely, if $A$ is nonsingular and has a block LU factorization then $A_k$ is nonsingular for $k = 1, \dots, m-1$.
Why Block LU?
- The number of flops for the block LU factorization is the same as for the ordinary LU factorization.
- An advantage of the block method is that it combines many of the operations into matrix operations.
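A block LU factorization with $m = 2$ blocks can be sketched in a few matrix operations. The test matrix and the block size `p` below are arbitrary choices made for illustration. The only requirement used is that the leading block `A11` is nonsingular.

```matlab
% Sketch: 2-by-2 block LU factorization A = L*R with identity diagonal blocks in L.
% Assumption: the leading block A11 is nonsingular.
A = [4 1 2 0; 1 3 0 1; 2 0 5 1; 0 1 1 4];
n = size(A,1);
p = 2;                                   % order of the first diagonal block
A11 = A(1:p,1:p);      A12 = A(1:p,p+1:n);
A21 = A(p+1:n,1:p);    A22 = A(p+1:n,p+1:n);
L21 = A21/A11;                           % solves L21*A11 = A21
R22 = A22 - L21*A12;                     % remaining (2,2) block of R
L = [eye(p) zeros(p,n-p); L21 eye(n-p)];
R = [A11 A12; zeros(n-p,p) R22];
disp(norm(A - L*R))
```

The work is done by one block solve and one matrix product, which is the kind of matrix-level operation the second bullet refers to.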
The PLU Factorization
- A nonsingular matrix $A \in \mathbb{R}^{n,n}$ has an LU factorization if and only if the leading principal submatrices $A_k$ are nonsingular for $k = 1, \dots, n-1$.
- This condition seems fairly restrictive.
- However, for a nonsingular matrix $A$ there is always a permutation of the rows such that the permuted matrix has an LU factorization.
- We obtain a factorization of the form $P^T A = LR$ or equivalently $A = PLR$, where $P$ is a permutation matrix, $L$ is unit lower triangular, and $R$ is upper triangular. We call this a PLU factorization of $A$.
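MATLAB's built-in `lu` computes such a factorization with partial pivoting. With three outputs it returns a permutation matrix satisfying `Pm*A = L*R`, so the $P$ of the slides corresponds to its transpose. The sketch applies it to the matrix that has no LU factorization without row interchanges. The name `Pm` is chosen here for illustration.

```matlab
% Sketch: PLU factorization via the built-in lu (partial pivoting).
A = [0 1; 1 1];
[L, R, Pm] = lu(A);       % built-in convention: Pm*A = L*R
P = Pm';                  % slide convention: P'*A = L*R, i.e. A = P*L*R
disp(norm(A - P*L*R))
```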
Positive (Semi)Definite Matrices
Suppose $A \in \mathbb{R}^{n,n}$ is a square matrix. The function $f : \mathbb{R}^n \to \mathbb{R}$ given by
$$f(x) = x^T A x = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j$$
is called a quadratic form. We say that $A$ is
(i) positive definite if $x^T A x > 0$ for all nonzero $x \in \mathbb{R}^n$.
(ii) positive semidefinite if $x^T A x \ge 0$ for all $x \in \mathbb{R}^n$.
(iii) negative (semi)definite if $-A$ is positive (semi)definite.
(iv) symmetric positive (semi)definite if $A$ is symmetric in addition to being positive (semi)definite.
(v) symmetric negative (semi)definite if $A$ is symmetric in addition to being negative (semi)definite.
Observations
- A matrix is positive definite if it is positive semidefinite and in addition
$$x^T A x = 0 \Rightarrow x = 0. \tag{3}$$
- A positive definite matrix must be nonsingular. Indeed, if $Ax = 0$ for some $x \in \mathbb{R}^n$ then $x^T A x = 0$, which by (3) implies that $x = 0$.
- $\begin{bmatrix} 3 & 2 \\ 1 & 2 \end{bmatrix}$ is positive definite.
- The zero matrix is symmetric positive semidefinite, while the unit matrix is symmetric positive definite.
- The second derivative matrix $T = \operatorname{tridiag}(-1, 2, -1) \in \mathbb{R}^{n,n}$ is symmetric positive definite.
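One way to check the nonsymmetric example numerically is through the symmetric part $S = (A + A^T)/2$, since $x^T A x = x^T S x$ for every real $x$. The sketch relies on the standard fact that a symmetric matrix is positive definite exactly when all its eigenvalues are positive. The test vector `x` is an arbitrary choice.

```matlab
% Sketch: test positive definiteness of a nonsymmetric matrix via its symmetric part.
A = [3 2; 1 2];
S = (A + A')/2;           % symmetric part; x'*A*x equals x'*S*x for real x
lambda = eig(S);          % real eigenvalues because S is symmetric
disp(lambda')
x = [1; -1];
disp(x'*A*x)              % one sample value of the quadratic form
```

Both eigenvalues of $S$ are positive, which agrees with the claim that this matrix is positive definite.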
Useful Results
Theorem. Let $m, n$ be positive integers. If $A \in \mathbb{R}^{n,n}$ is positive semidefinite and $X \in \mathbb{R}^{n,m}$ then $B := X^T A X \in \mathbb{R}^{m,m}$ is positive semidefinite. If in addition $A$ is positive definite and $X$ has linearly independent columns then $B$ is positive definite.
Proof. Let $y \in \mathbb{R}^m$ and set $x := Xy$. Then $y^T B y = x^T A x \ge 0$. If $A$ is positive definite and $X$ has linearly independent columns then $x$ is nonzero if $y$ is nonzero and $y^T B y = x^T A x > 0$.
Taking $A := I$ and $X := A$ we obtain
Corollary. Let $m, n$ be positive integers. If $A \in \mathbb{R}^{m,n}$ then $A^T A$ is positive semidefinite. If in addition $A$ has linearly independent columns then $A^T A$ is positive definite.
More Useful Results
Theorem. Any principal submatrix of a positive (semi)definite matrix is positive (semi)definite.
Proof. Suppose the submatrix $B$ is defined by the rows and columns $r_1, \dots, r_k$ of $A$. Then $B = X^T A X$, where $X = [e_{r_1}, \dots, e_{r_k}] \in \mathbb{R}^{n,k}$, and $B$ is positive (semi)definite by Theorem 7.
If $A$ is positive definite then the leading principal submatrices are nonsingular and we obtain:
Corollary. A positive definite matrix has a unique LU factorization.
What about the Eigenvalues?
Theorem. A positive (semi)definite matrix $A$ has positive (nonnegative) eigenvalues. Conversely, if $A$ has positive (nonnegative) eigenvalues and orthonormal eigenvectors then it is positive (semi)definite.
Proof.
- Consider the positive definite case.
- $Ax = \lambda x$ with $x \ne 0$ $\Rightarrow$ $\lambda = \dfrac{x^T A x}{x^T x} > 0$.
- Suppose conversely that $A \in \mathbb{R}^{n,n}$ has eigenpairs $(\lambda_j, u_j)$, $j = 1, \dots, n$, where the eigenvalues are positive and the eigenvectors satisfy $u_i^T u_j = \delta_{ij}$, $i, j = 1, \dots, n$.
- Let $U := [u_1, \dots, u_n] \in \mathbb{R}^{n,n}$ and $D := \operatorname{diag}(\lambda_1, \dots, \lambda_n)$.
- $AU = UD$ and $U^T U = I$ $\Rightarrow$ $U^T A U = D$.
- Let $x \in \mathbb{R}^n$ be nonzero and define $c \in \mathbb{R}^n$ by $Uc = x$.
- $x^T A x = (Uc)^T A U c = c^T U^T A U c = c^T D c = \sum_{j=1}^{n} \lambda_j c_j^2 > 0$.
- The positive semidefinite case is similar.
What about the Determinant?
Theorem. If $A$ is positive (semi)definite then $\det(A) > 0$ ($\det(A) \ge 0$).
Proof. Since the determinant of a matrix is equal to the product of its eigenvalues this follows from the previous theorem.
The Symmetric Case
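Lemma. If $A$ is symmetric positive semidefinite then for all $i, j$
1. $|a_{ij}| \le (a_{ii} + a_{jj})/2$,
2. $|a_{ij}| \le \sqrt{a_{ii} a_{jj}}$.
Proof. For all $i, j$ and $\alpha, \beta \in \mathbb{R}$:
- $0 \le (\alpha e_i + \beta e_j)^T A (\alpha e_i + \beta e_j) = \alpha^2 a_{ii} + \beta^2 a_{jj} + 2\alpha\beta a_{ij}$.
- $\alpha = 1$, $\beta = \pm 1$ $\Longrightarrow$ $a_{ii} + a_{jj} \pm 2a_{ij} \ge 0$ $\Longrightarrow$ 1.
- 2. follows trivially from 1. if $a_{ii} = a_{jj} = 0$.
- Suppose one of them, say $a_{ii}$, is positive.
- Taking $\alpha = -a_{ij}$, $\beta = a_{ii}$ we find $0 \le a_{ij}^2 a_{ii} + a_{ii}^2 a_{jj} - 2a_{ij}^2 a_{ii} = a_{ii}(a_{ii} a_{jj} - a_{ij}^2)$.
- But then $a_{ii} a_{jj} - a_{ij}^2 \ge 0$ and 2. follows.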
A Consequence
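- If $A$ is symmetric positive semidefinite and one diagonal element is zero, say $a_{ii} = 0$, then all elements in row $i$ and column $i$ must also be zero.
- For since $|a_{ij}| \le \sqrt{a_{ii} a_{jj}}$ we have $a_{ij} = 0$ for all $j$, and by symmetry $a_{ji} = 0$ for all $j$.
- In particular, if $A \in \mathbb{R}^{n,n}$ is symmetric positive semidefinite and $a_{11} = 0$ then $A$ has the form
$$\begin{bmatrix} 0 & 0^T \\ 0 & B \end{bmatrix}, \quad B \in \mathbb{R}^{n-1,n-1}.$$
- Consider
$$A_1 = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}, \quad A_2 = \begin{bmatrix} 1 & 2 \\ 2 & 2 \end{bmatrix}, \quad A_3 = \begin{bmatrix} -2 & 1 \\ 1 & 2 \end{bmatrix}.$$
None of them is symmetric positive semidefinite.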
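The reason each of the three matrices fails can be read off from the results above. The following summary is a short worked check using the lemma and its consequence:

```latex
\begin{align*}
A_1 &: \ a_{11} = 0 \text{ but } a_{12} = 1 \ne 0, \text{ so row 1 is not zero;} \\
A_2 &: \ |a_{12}| = 2 > \sqrt{a_{11} a_{22}} = \sqrt{2}; \\
A_3 &: \ e_1^T A_3 e_1 = a_{11} = -2 < 0.
\end{align*}
```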
Cholesky Factorization
Definition
1. A factorization $A = R^T R$ where $R$ is upper triangular with positive diagonal elements is called a Cholesky factorization.
2. A factorization $A = R^T R$ where $R$ is upper triangular with nonnegative diagonal elements is called a semi-Cholesky factorization.
Theorem. Let $A \in \mathbb{R}^{n,n}$.
1. $A$ has a Cholesky factorization if and only if it is symmetric positive definite.
2. $A$ has a semi-Cholesky factorization if and only if it is symmetric positive semidefinite.
Proof Outline Positive Semidefinite Case
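- If $A = R^T R$ is a semi-Cholesky factorization then $A$ is symmetric positive semidefinite.
- Suppose $A \in \mathbb{R}^{n,n}$ is symmetric positive semidefinite.
- We use induction and partition $A$ as
$$A = \begin{bmatrix} \alpha & v^T \\ v & B \end{bmatrix}, \quad \alpha \in \mathbb{R},\ v \in \mathbb{R}^{n-1},\ B \in \mathbb{R}^{n-1,n-1}.$$
- $\alpha = a_{11} = e_1^T A e_1 \ge 0$.
- If $\alpha = 0$ then $v = 0$.
- The principal submatrix $B$ is positive semidefinite.
- By induction $B$ has a semi-Cholesky factorization $B = R_1^T R_1$.
- In the case $\alpha = 0$, $R = \begin{bmatrix} 0 & 0^T \\ 0 & R_1 \end{bmatrix}$ gives a semi-Cholesky factorization $A = R^T R$ of $A$.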
Proof Continued
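- $A = \begin{bmatrix} \alpha & v^T \\ v & B \end{bmatrix}$.
- Case $\alpha > 0$, $\beta := \sqrt{\alpha}$:
- $C := B - vv^T/\alpha$ is symmetric positive semidefinite.
- By induction $C$ has a semi-Cholesky factorization $C = R_1^T R_1$.
- $R := \begin{bmatrix} \beta & v^T/\beta \\ 0 & R_1 \end{bmatrix}$ gives a semi-Cholesky factorization $A = R^T R$ of $A$.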
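As a worked instance of this step, take the symmetric positive definite matrix $A = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}$, so that $\alpha = 2$, $v = -1$ and $B = 2$ are all scalars:

```latex
\[
\beta = \sqrt{2}, \qquad
C = B - \frac{v v^T}{\alpha} = 2 - \frac{1}{2} = \frac{3}{2}, \qquad
R_1 = \sqrt{3/2},
\]
\[
R = \begin{bmatrix} \sqrt{2} & -1/\sqrt{2} \\ 0 & \sqrt{3/2} \end{bmatrix},
\qquad
R^T R = \begin{bmatrix} 2 & -1 \\ -1 & \tfrac{1}{2} + \tfrac{3}{2} \end{bmatrix} = A.
\]
```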
Criteria Symmetric Positive Semidefinite Case
Theorem. The following are equivalent for a symmetric matrix $A \in \mathbb{R}^{n,n}$.
1. $A$ is positive semidefinite.
2. $A$ has only nonnegative eigenvalues.
3. $A = B^T B$ for some $B \in \mathbb{R}^{n,n}$.
4. All principal minors are nonnegative.
Criteria Symmetric Positive Definite Case
Theorem. The following are equivalent for a symmetric matrix $A \in \mathbb{R}^{n,n}$.
1. $A$ is positive definite.
2. $A$ has only positive eigenvalues.
3. All leading principal minors are positive.
4. $A = B^T B$ for a nonsingular $B \in \mathbb{R}^{n,n}$.
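The criteria can be checked numerically on the second derivative matrix $T = \operatorname{tridiag}(-1, 2, -1)$. The sketch below uses $n = 4$, an arbitrary choice, and the built-ins `eig`, `det` and `chol`. The second output of `chol` is zero when the factorization succeeds.

```matlab
% Sketch: check the positive definite criteria for T = tridiag(-1,2,-1), n = 4.
n = 4;
T = 2*eye(n) - diag(ones(n-1,1),1) - diag(ones(n-1,1),-1);
lambda = eig(T);                     % criterion 2: all eigenvalues positive
minors = zeros(1,n);
for k = 1:n
    minors(k) = det(T(1:k,1:k));     % criterion 3: leading principal minors
end
[R, flag] = chol(T);                 % criterion 4: T = R'*R; flag == 0 on success
disp(lambda'), disp(minors), disp(flag)
```

For this matrix the leading principal minors are $2, 3, 4, 5$.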
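Banded Case
Recall that a matrix $A$ has bandwidth $d \ge 0$ if $a_{ij} = 0$ for $|i - j| > d$. (Semi-)Cholesky factorization preserves bandwidth.
Corollary. The Cholesky factor $R := \begin{bmatrix} \beta & v^T/\beta \\ 0 & R_1 \end{bmatrix}$ has the same bandwidth as $A$.
Proof.
- Suppose $A = \begin{bmatrix} \alpha & v^T \\ v & B \end{bmatrix} \in \mathbb{R}^{n,n}$ has bandwidth $d \ge 0$.
- Then $v^T = [u^T, 0^T]$, where $u \in \mathbb{R}^d$.
- $vv^T = \begin{bmatrix} u \\ 0 \end{bmatrix} \begin{bmatrix} u^T & 0^T \end{bmatrix} = \begin{bmatrix} uu^T & 0 \\ 0 & 0 \end{bmatrix}$.
- $C = B - vv^T/\alpha$ differs from $B$ only in the upper $d \times d$ corner.
- $C$ has the same bandwidth as $B$ and $A$.
- By induction on $n$, $C = R_1^T R_1$, where $R_1$ has the same bandwidth as $C$.
- But then $R$ has the same bandwidth as $A$.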
Towards an Algorithm
- Since $A$ is symmetric we only need to use the upper part of $A$.
- The first row of $R$ is $[\beta, v^T/\beta]$ if $\alpha > 0$ and zero if $\alpha = 0$.
- We store the first row of $R$ in the first row of $A$ and the upper part of $C = B - vv^T/\alpha$ in the upper part of $A(2:n, 2:n)$.
The first row of $R$ and the upper part of $C$ can be computed as follows.

    if A(1,1) > 0
        A(1,1) = sqrt(A(1,1));
        A(1,2:n) = A(1,2:n)/A(1,1);
        for i = 2:n
            A(i,i:n) = A(i,i:n) - A(1,i)*A(1,i:n);
        end
    end                                                  (4)
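Cholesky and Semi-Cholesky
[bandcholesky]

    function R=bandcholesky(A,d)
    % (Semi-)Cholesky factorization A = R'*R of a symmetric positive
    % (semi)definite matrix A with bandwidth d, using outer products.
    n=length(A);
    for k=1:n
        kp=min(n,k+d);              % last column in the band; used in both branches
        if A(k,k)>0
            A(k,k)=sqrt(A(k,k));
            A(k,k+1:kp)=A(k,k+1:kp)/A(k,k);
            for i=k+1:kp
                A(i,i:kp)=A(i,i:kp)-A(k,i)*A(k,i:kp);
            end
        else
            A(k,k:kp)=zeros(1,kp-k+1);   % zero pivot: force row k of R to zero
        end
    end
    R=triu(A);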
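To see the outer-product steps in action, the same loop can be run inline on $T = \operatorname{tridiag}(-1, 2, -1)$ and compared with MATLAB's built-in `chol`. This is a sketch for checking purposes: the sizes `n = 5` and `d = 1` are arbitrary choices, and `chol` is used only as a reference.

```matlab
% Sketch: outer-product Cholesky on a tridiagonal matrix, compared with chol.
n = 5; d = 1;
T = 2*eye(n) - diag(ones(n-1,1),1) - diag(ones(n-1,1),-1);
A = T;
for k = 1:n
    kp = min(n,k+d);                            % last column inside the band
    if A(k,k) > 0
        A(k,k) = sqrt(A(k,k));
        A(k,k+1:kp) = A(k,k+1:kp)/A(k,k);       % row k of R
        for i = k+1:kp
            A(i,i:kp) = A(i,i:kp) - A(k,i)*A(k,i:kp);   % update upper part of C
        end
    else
        A(k,k:kp) = 0;                          % zero pivot: zero row in R
    end
end
R = triu(A);
disp(norm(R'*R - T))
disp(norm(R - chol(T)))
```

Both norms are zero up to rounding, and $R$ has nonzeros only on the diagonal and the first superdiagonal, as the bandwidth corollary predicts.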
Comments
- We overwrite the upper triangle of $A$ with the elements of $R$.
- Row $k$ of $R$ is zero for those $k$ where $r_{kk} = 0$.
- We reduce round-off noise by forcing those rows to be zero.
- There are many versions of Cholesky factorization; see the book by Golub and Van Loan.
- The algorithm is based on outer products $vv^T$.
- An advantage of this formulation is that it can be extended to positive semidefinite matrices.
Banded Forward Substitution
[bandforwardsolve] Solves the lower triangular system $R^T y = b$. $R$ is upper triangular and banded with $r_{kj} = 0$ for $j - k > d$.

    function y=bandforwardsolve(R,b,d)
    n=length(b); y=b(:);
    for k=1:n
        km=max(1,k-d);
        y(k)=(y(k)-R(km:k-1,k)'*y(km:k-1))/R(k,k);
    end
Banded Backward Substitution
[bandbacksolve] Solves the upper triangular system $Rx = y$. $R$ is upper triangular and banded with $r_{kj} = 0$ for $j - k > d$.

    function x=bandbacksolve(R,y,d)
    n=length(y); x=y(:);
    for k=n:-1:1
        kp=min(n,k+d);
        x(k)=(x(k)-R(k,k+1:kp)*x(k+1:kp))/R(k,k);
    end
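The two substitutions combine with the factorization to solve $Ax = b$: factor $A = R^T R$, solve $R^T y = b$, then $Rx = y$. The sketch below shows this pipeline with built-ins standing in for the routines above. It uses `chol` for the factor and the backslash operator for the two triangular solves. The matrix size and right-hand side are arbitrary choices.

```matlab
% Sketch: solve T*x = b via Cholesky and two triangular solves.
n = 5;
T = 2*eye(n) - diag(ones(n-1,1),1) - diag(ones(n-1,1),-1);
b = ones(n,1);
R = chol(T);      % upper triangular with T = R'*R
y = R'\b;         % forward substitution:  R'*y = b
x = R\y;          % backward substitution: R*x  = y
disp(norm(T*x - b))
```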
Number of Flops, Discussion
- Full matrix: about $n^3/3$ flops, i.e. $O(n^3)$.
- This is half of what is needed for Gaussian elimination.
- Banded matrix with bandwidth $d$: $O(nd^2)$ flops.
- Cholesky factorization is restricted to symmetric positive (semi)definite matrices.
- There are many versions of Cholesky factorization, tuned to different machine architectures.
- Symmetric LU factorization can be used for many symmetric matrices that are not positive definite.