
Understanding the QR algorithm, Part X

David S. Watkins

Department of Mathematics

Washington State University

Glasgow 2009 – p. 1

1. Understanding the QR algorithm, SIAM Rev., 1982
2. Fundamentals of Matrix Computations, Wiley, 1991
3. Some perspectives on the eigenvalue problem, 1993
4. QR-like algorithms: an overview of convergence theory and practice, AMS proceedings, 1996
5. QR-like algorithms for eigenvalue problems, JCAM, 2000
7. The Matrix Eigenvalue Problem: GR and Krylov Subspace Methods, SIAM, 2007
8. The QR algorithm revisited, SIAM Rev., 2008


Some names associated with the QR algorithm (short list)

Rutishauser
Kublanovskaya
Francis

Implicitly Shifted QR algorithm
How should we understand it? . . . view it? . . . teach it to our students?


The Standard Approach . . .
. . . dating from the work of Francis

Start with the basic algorithm . . .

$A = QR$, $RQ = \hat{A}$, repeat!

This is simple, appealing, does not require much preparation, but . . .
. . . it is far removed from versions of the QR algorithm that are actually used.
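The basic iteration above can be sketched in a few lines. This is a minimal illustration, not production code: it assumes a small real nonsingular matrix stored as a list of rows, and the helper names `qr_gram_schmidt`, `matmul`, and `basic_qr_algorithm` are invented here. The QR factorization is done by modified Gram-Schmidt purely for brevity.

```python
import math

def qr_gram_schmidt(A):
    """QR factorization of a square nonsingular matrix (modified Gram-Schmidt)."""
    n = len(A)
    q = []                                   # orthonormal columns q_1, ..., q_n
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = [A[i][j] for i in range(n)]      # j-th column of A
        for i in range(j):
            R[i][j] = sum(q[i][k] * v[k] for k in range(n))
            v = [v[k] - R[i][j] * q[i][k] for k in range(n)]
        R[j][j] = math.sqrt(sum(x * x for x in v))
        q.append([x / R[j][j] for x in v])
    Q = [[q[j][i] for j in range(n)] for i in range(n)]
    return Q, R

def matmul(X, Y):
    """Plain dense matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def basic_qr_algorithm(A, steps):
    """Repeat: factor A = QR, then replace A by RQ."""
    for _ in range(steps):
        Q, R = qr_gram_schmidt(A)
        A = matmul(R, Q)
    return A

# Illustrative symmetric test matrix with eigenvalues (5 +/- sqrt(5)) / 2.
A_final = basic_qr_algorithm([[2.0, 1.0], [1.0, 3.0]], 50)
```

For this test matrix the iterates approach a diagonal matrix with the eigenvalues on the diagonal, largest first.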


Refinements

shifts of origin
reduction to Hessenberg form
implicit shift technique (Francis)
double shift QR
multiple shift QR
implicit-Q theorem vs. Krylov subspaces

Introducing Krylov subspaces improves understanding, allows more general results, and prepares students for Krylov subspace methods.


The Implicitly Shifted QR Iteration

matrix is in upper Hessenberg form
pick some shifts $\rho_1, \dots, \rho_m$ ($m = 1, 2, 4, 6$)
$p(A) = (A - \rho_1 I) \cdots (A - \rho_m I)$ (expensive!)
compute $p(A)e_1$ (cheap!)
Build unitary $Q_0$ with $q_1 = \alpha\, p(A)e_1$.
Perform similarity transform $A \to Q_0^* A Q_0$.
Hessenberg form is disturbed.
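To see why $p(A)e_1$ is cheap while $p(A)$ is not, here is a small sketch. It assumes a real upper Hessenberg matrix stored as a list of rows; the function name `p_of_A_times_e1` and the test matrix are made up for illustration. Each factor $(A - \rho I)$ adds at most one new nonzero entry to the vector, so only the leading $(m+1) \times m$ corner of the matrix is ever touched.

```python
def p_of_A_times_e1(H, shifts):
    """Return p(H) e_1 for p(z) = (z - rho_1) ... (z - rho_m), H upper Hessenberg."""
    n = len(H)
    x = [0.0] * n
    x[0] = 1.0
    nonzero = 1                      # entries x[nonzero:] are known to be zero
    for rho in shifts:
        rows = min(nonzero + 1, n)   # Hessenberg form: one more row fills in
        y = [0.0] * n
        for i in range(rows):
            y[i] = sum(H[i][k] * x[k] for k in range(nonzero)) - rho * x[i]
        x, nonzero = y, rows
    return x

H = [[4.0, 1.0, 2.0, 3.0],
     [1.0, 3.0, 1.0, 2.0],
     [0.0, 2.0, 2.0, 1.0],
     [0.0, 0.0, 3.0, 1.0]]
x = p_of_A_times_e1(H, [1.0, 2.0])   # -> [7.0, 4.0, 2.0, 0.0]
```

With $m = 2$ shifts only the first three entries of the result are nonzero, regardless of $n$.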


An Upper Hessenberg Matrix

[Figure: nonzero pattern of an upper Hessenberg matrix]

After the Transformation ($Q_0^* A Q_0$)

[Figure: nonzero pattern after the transformation, with a bulge below the subdiagonal]

Now return the matrix to Hessenberg form.
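As a concrete illustration of the pictures above, here is a sketch of the nonzero patterns for an assumed case $n = 6$, $m = 2$ (sizes chosen only for illustration). With two shifts, $Q_0$ acts on rows and columns 1 through 3, so the transformed matrix picks up a bulge, marked $+$, in positions $(3,1)$, $(4,1)$, and $(4,2)$.

```latex
A =
\begin{bmatrix}
\times & \times & \times & \times & \times & \times \\
\times & \times & \times & \times & \times & \times \\
       & \times & \times & \times & \times & \times \\
       &        & \times & \times & \times & \times \\
       &        &        & \times & \times & \times \\
       &        &        &        & \times & \times
\end{bmatrix}
\quad\longrightarrow\quad
Q_0^* A Q_0 =
\begin{bmatrix}
\times & \times & \times & \times & \times & \times \\
\times & \times & \times & \times & \times & \times \\
+      & \times & \times & \times & \times & \times \\
+      & +      & \times & \times & \times & \times \\
       &        &        & \times & \times & \times \\
       &        &        &        & \times & \times
\end{bmatrix}
```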


Chasing the Bulge

[Figure: the bulge is chased down the subdiagonal]

Done

[Figure: the matrix is back in upper Hessenberg form]

The implicit QR step is complete!


Summary of Implicit QR Iteration

Pick some shifts.
Compute $p(A)e_1$. ($p$ determined by shifts)
Build $Q_0$ with first column $q_1 = \alpha\, p(A)e_1$.
Make a bulge. ($A \to Q_0^* A Q_0$)
Chase the bulge. (return to Hessenberg form)

$\hat{A} = Q^* A Q$
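The steps above can be sketched for the simplest case, a single real shift ($m = 1$) on a real upper Hessenberg matrix, using plane rotations. This is a minimal sketch under those assumptions; the names `rotate` and `implicit_qr_step`, the test matrix, and the use of the last diagonal entry as the shift are choices made for this illustration and are not taken from the slides.

```python
import math

def rotate(A, i, c, s):
    """Similarity transform A <- G^T A G, where G rotates the (i, i+1) plane
    and has (c, s) as the relevant part of its first column."""
    n = len(A)
    for k in range(n):                       # rows i and i+1  (G^T A)
        a, b = A[i][k], A[i + 1][k]
        A[i][k], A[i + 1][k] = c * a + s * b, -s * a + c * b
    for k in range(n):                       # columns i and i+1  (A G)
        a, b = A[k][i], A[k][i + 1]
        A[k][i], A[k][i + 1] = c * a + s * b, -s * a + c * b

def implicit_qr_step(A, rho):
    """One implicit single-shift QR step on an upper Hessenberg matrix."""
    n = len(A)
    A = [row[:] for row in A]
    # p(A) e_1 = (A - rho I) e_1 has only two nonzero entries.
    x0, x1 = A[0][0] - rho, A[1][0]
    r = math.hypot(x0, x1)
    if r == 0.0:
        return A
    rotate(A, 0, x0 / r, x1 / r)             # make a bulge at position (3, 1)
    for k in range(n - 2):                   # chase the bulge
        a, b = A[k + 1][k], A[k + 2][k]
        r = math.hypot(a, b)
        if r > 0.0:
            rotate(A, k + 1, a / r, b / r)
            A[k + 2][k] = 0.0                # annihilated by the rotation
    return A

# Usage: shift with the current bottom-right entry at every step.
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 4.0]]
for _ in range(10):
    A = implicit_qr_step(A, A[-1][-1])
```

For this symmetric example the last subdiagonal entry shrinks to roundoff level within a few steps, and the bottom-right entry settles at the eigenvalue $3 + \sqrt{3}$. No QR decomposition is formed at any point.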


Question

This differs a lot from the basic QR step.

$A = QR$, $RQ = \hat{A}$

Can we carve a reasonable pedagogical path that leads directly to the implicitly-shifted QR algorithm, bypassing the basic QR algorithm entirely?

That’s what we are going to do today.


Ingredients

subspace iteration (power method)
Krylov subspaces and subspace iteration
(unitary) similarity transformation (change of coordinate system)
Hessenberg form and Krylov subspaces (instead of implicit-Q theorem)

No Magic Shortcut!


Power Method, Subspace Iteration

$v, Av, A^2v, A^3v, \dots$
convergence rate $|\lambda_2/\lambda_1|$

$S, AS, A^2S, A^3S, \dots$
subspaces of dimension $j$ (convergence rate $|\lambda_{j+1}/\lambda_j|$)

Substitute $p(A)$ for $A$ (shifts, multiple steps)

$S, p(A)S, p(A)^2S, p(A)^3S, \dots$
convergence rate $|p(\lambda_{j+1})/p(\lambda_j)|$
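The effect of replacing $A$ by $p(A)$ can be seen in a small experiment. This sketch assumes a real symmetric matrix and a single fixed shift, so $p(z) = z - \rho$; the function names, the test matrix, and the shift value 1.3 are illustrative choices only.

```python
import math

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def normalize(v):
    nrm = math.sqrt(sum(x * x for x in v))
    return [x / nrm for x in v]

def power_method(A, v, shift=0.0, steps=10):
    """Iterate v <- (A - shift*I) v with normalization.
    Returns the final vector and its Rayleigh quotient v^T A v."""
    v = normalize(v)
    for _ in range(steps):
        w = matvec(A, v)
        v = normalize([wi - shift * vi for wi, vi in zip(w, v)])
    return v, sum(x * y for x, y in zip(v, matvec(A, v)))

A = [[2.0, 1.0], [1.0, 3.0]]             # eigenvalues (5 +/- sqrt(5)) / 2
lam1 = (5 + math.sqrt(5)) / 2
_, plain = power_method(A, [1.0, 0.0], shift=0.0, steps=5)
_, shifted = power_method(A, [1.0, 0.0], shift=1.3, steps=5)
err_plain = abs(plain - lam1)            # governed by |lambda_2 / lambda_1|
err_shifted = abs(shifted - lam1)        # governed by |p(lambda_2) / p(lambda_1)|
```

Here $|\lambda_2/\lambda_1| \approx 0.38$ while $|p(\lambda_2)/p(\lambda_1)| \approx 0.035$, so after the same five steps the shifted iteration is far more accurate.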


Krylov Subspaces . . .
. . . and Subspace Iteration

Def: $K_j(A, q) = \operatorname{span}\{q, Aq, A^2q, \dots, A^{j-1}q\}$
$j = 1, 2, 3, \dots$ (nested subspaces)

$K_j(A, q)$ are “determined by $q$”.

$p(A)K_j(A, q) = K_j(A, p(A)q)$
. . . because $p(A)A = Ap(A)$

Conclusion: Power method induces nested subspace iterations on Krylov subspaces.
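Written out, the commutation argument is a two-line computation. The only facts used are that $p(A)$ maps a span onto the span of the images and that $p(A)$ commutes with every power of $A$.

```latex
\begin{aligned}
p(A)\,K_j(A, q)
  &= \operatorname{span}\{\,p(A)q,\; p(A)Aq,\; \dots,\; p(A)A^{j-1}q\,\} \\
  &= \operatorname{span}\{\,p(A)q,\; A\,p(A)q,\; \dots,\; A^{j-1}p(A)q\,\} \\
  &= K_j(A,\, p(A)q).
\end{aligned}
```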


power method: $p(A)^k q$

nested subspace iterations:
$p(A)^k K_j(A, q) = K_j(A, p(A)^k q)$, $j = 1, 2, 3, \dots$

convergence rates:
$|p(\lambda_{j+1})/p(\lambda_j)|$, $j = 1, 2, 3, \dots$


(Unitary) Similarity Transforms

$A \to Q^*AQ$
preserves eigenvalues
transforms eigenvectors in a simple way ($w \to Q^*w$)
is a change of coordinate system ($v \to Q^*v$)

triangular form (eigenvalues)
relationship of invariant subspaces to triangular form


Subspace Iteration with change of coordinate system

take $S = \operatorname{span}\{e_1, \dots, e_j\}$
$p(A)S = \operatorname{span}\{p(A)e_1, \dots, p(A)e_j\} = \operatorname{span}\{q_1, \dots, q_j\}$ (orthonormal)
build unitary $Q = [q_1 \cdots q_j \cdots]$
change coordinate system: $\hat{A} = Q^*AQ$
$q_k \to Q^*q_k = e_k$
$\operatorname{span}\{q_1, \dots, q_j\} \to \operatorname{span}\{e_1, \dots, e_j\}$
ready for next iteration

This version of subspace iteration . . .

. . . holds the subspace fixed while the matrix changes.
. . . moving toward a matrix under which $\operatorname{span}\{e_1, \dots, e_j\}$ is invariant.

$A \to \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix}$ ($A_{11}$ is $j \times j$.)
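Why invariance shows up as a zero block can be spelled out in one chain of equivalences. The symbol $A_{21}$ for the lower-left block and the entry notation $a_{ik}$ are introduced here for illustration.

```latex
\begin{aligned}
A\,\operatorname{span}\{e_1,\dots,e_j\} \subseteq \operatorname{span}\{e_1,\dots,e_j\}
  &\iff Ae_k \in \operatorname{span}\{e_1,\dots,e_j\} \quad (k = 1,\dots,j) \\
  &\iff a_{ik} = 0 \quad (i > j,\; k \le j) \\
  &\iff A_{21} = 0 .
\end{aligned}
```

The eigenvalues of $A$ are then those of $A_{11}$ together with those of $A_{22}$, so the problem splits into two smaller ones.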


Reduction to Hessenberg form

$A \to Q^*AQ = H$ (a similarity transformation)
can always be done (direct method, $O(n^3)$ flops)
brings us closer to triangular form
makes computations cheaper

First column $q_1$ can be chosen “arbitrarily”.
Example: $q_1 = \alpha\, p(A)e_1$
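One common way to carry out the reduction is with Householder reflectors. The sketch below assumes a real square matrix stored as a list of rows, and the function name `hessenberg` is invented here. In this plain version the reflectors leave the first coordinate alone, so $q_1 = e_1$; a different first column, such as $\alpha\, p(A)e_1$, can be obtained by first applying a similarity transformation whose first column is the desired vector and then running the same reduction.

```python
import math

def hessenberg(A):
    """Reduce a real square matrix to upper Hessenberg form by
    Householder similarity transformations (Q itself is not accumulated)."""
    n = len(A)
    H = [row[:] for row in A]
    for k in range(n - 2):
        # Reflector P = I - 2 v v^T / (v^T v) mapping H[k+1:, k] to alpha * e_1.
        x = [H[i][k] for i in range(k + 1, n)]
        alpha = -math.copysign(math.sqrt(sum(t * t for t in x)), x[0])
        v = x[:]
        v[0] -= alpha
        vnorm2 = sum(t * t for t in v)
        if vnorm2 == 0.0:
            continue                          # column is already reduced
        m = len(v)
        for j in range(n):                    # H <- P H  (rows k+1 .. n-1)
            f = 2.0 * sum(v[i] * H[k + 1 + i][j] for i in range(m)) / vnorm2
            for i in range(m):
                H[k + 1 + i][j] -= f * v[i]
        for i in range(n):                    # H <- H P  (columns k+1 .. n-1)
            f = 2.0 * sum(H[i][k + 1 + j] * v[j] for j in range(m)) / vnorm2
            for j in range(m):
                H[i][k + 1 + j] -= f * v[j]
    return H

A = [[4.0, 1.0, 2.0, 3.0],
     [3.0, 4.0, 1.0, 7.0],
     [8.0, 7.0, 9.0, 5.0],
     [5.0, 7.0, 8.0, 6.0]]
H = hessenberg(A)
```

Because every step is an orthogonal similarity transformation, quantities such as the trace and the Frobenius norm are preserved up to roundoff.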


Krylov Subspaces . . .
. . . and Hessenberg matrices . . .
. . . go hand in hand.

$A$ properly upper Hessenberg $\Longrightarrow$ $K_j(A, e_1) = \operatorname{span}\{e_1, \dots, e_j\}$.

More generally . . .


Krylov-Hessenberg Relationship

If $H = Q^*AQ$, and $H$ is properly upper Hessenberg, then for $j = 1, 2, 3, \dots$,

$\operatorname{span}\{q_1, \dots, q_j\} = K_j(A, q_1)$.

Proof (sketch): Induction on $j$. $AQ = QH$

$Aq_j = \sum_{i=1}^{n} q_i h_{ij} = \sum_{i=1}^{j} q_i h_{ij} + q_{j+1} h_{j+1,j}$

$q_{j+1} h_{j+1,j} = Aq_j - \sum_{i=1}^{j} q_i h_{ij}$

Proof by induction follows immediately.

This also gives the student a preview of the Arnoldi process, the most important Krylov subspace method.
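The recurrence in the proof, read as a recipe for computing $q_{j+1}$ from $q_1, \dots, q_j$, is the Arnoldi process. Below is a minimal sketch for real matrices, with orthogonalization by modified Gram-Schmidt; the function name `arnoldi`, the return layout, and the test matrix are assumptions made for this illustration.

```python
import math

def arnoldi(A, q1, m):
    """Run m steps of the Arnoldi process.
    Returns the list of orthonormal vectors q_1, ..., q_{m+1} and the
    (m+1) x m upper Hessenberg array H with A q_j = sum_i q_i h_ij."""
    n = len(A)
    nrm = math.sqrt(sum(x * x for x in q1))
    Q = [[x / nrm for x in q1]]
    H = [[0.0] * m for _ in range(m + 1)]
    for j in range(m):
        w = [sum(A[i][k] * Q[j][k] for k in range(n)) for i in range(n)]  # A q_j
        for i in range(j + 1):
            H[i][j] = sum(Q[i][k] * w[k] for k in range(n))               # h_ij
            w = [w[k] - H[i][j] * Q[i][k] for k in range(n)]
        H[j + 1][j] = math.sqrt(sum(x * x for x in w))                    # h_{j+1,j}
        if H[j + 1][j] == 0.0:
            break                             # the Krylov subspace is invariant
        Q.append([x / H[j + 1][j] for x in w])
    return Q, H

A = [[4.0, 1.0, 2.0, 3.0],
     [3.0, 4.0, 1.0, 7.0],
     [8.0, 7.0, 9.0, 5.0],
     [5.0, 7.0, 8.0, 6.0]]
Q, H = arnoldi(A, [1.0, 0.0, 0.0, 0.0], 3)
```

By construction the first $j$ vectors returned span $K_j(A, q_1)$, which is the relationship stated on the slide.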


and now, the Implicit QR Iteration

Work with Hessenberg form to get . . .
. . . efficiency.
. . . automatic nested subspace iterations.

Get some shifts $\rho_1, \dots, \rho_m$ to define $p$.
Compute $p(A)e_1$. (power method)
Transform $A$ to upper Hessenberg form:
$\hat{A} = Q^*AQ$
by a matrix $Q$ that has $q_1 = \alpha\, p(A)e_1$.


$\hat{A} = Q^*AQ$ where $q_1 = \alpha\, p(A)e_1$.

$q_1 \to Q^*q_1 = e_1$
power method with a change of coordinate system. Moreover . . .

$p(A)K_j(A, e_1) = K_j(A, p(A)e_1)$
i.e. $p(A)\operatorname{span}\{e_1, \dots, e_j\} = \operatorname{span}\{q_1, \dots, q_j\}$
subspace iteration with a change of coordinate system, $j = 1, 2, 3, \dots, n - 1$

convergence rates $|p(\lambda_{j+1})/p(\lambda_j)|$, $j = 1, 2, 3, \dots, n - 1$
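The whole argument can be collected into one chain, with the fact used at each step noted on the right. It assumes that both $A$ and $\hat{A} = Q^*AQ$ are properly upper Hessenberg and that $\alpha \neq 0$.

```latex
\begin{aligned}
p(A)\,\operatorname{span}\{e_1,\dots,e_j\}
  &= p(A)\,K_j(A, e_1)
     && \text{($A$ properly upper Hessenberg)} \\
  &= K_j(A,\, p(A)e_1)
     && \text{($p(A)A = Ap(A)$)} \\
  &= K_j(A,\, q_1)
     && \text{($q_1 = \alpha\, p(A)e_1$)} \\
  &= \operatorname{span}\{q_1,\dots,q_j\}
     && \text{(Krylov-Hessenberg relationship for $\hat{A} = Q^*AQ$)}
\end{aligned}
```

So a single implicit step performs one step of subspace iteration with $p(A)$ on every $\operatorname{span}\{e_1,\dots,e_j\}$ at once, followed by the change of coordinates that maps each $\operatorname{span}\{q_1,\dots,q_j\}$ back to $\operatorname{span}\{e_1,\dots,e_j\}$.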


Details

choice of shifts
We change the shifts at each step.
⇒ quadratic or cubic convergence

Other Questions
. . . how to get BLAS 3 speed?
. . . how to parallelize?


In Conclusion

A careful study of the power method and its extensions, similarity transformations, Hessenberg form, and Krylov subspaces leads directly to the implicitly shifted QR algorithm.

The basic, explicit QR algorithm is skipped.
The implicit-Q theorem is avoided.
Krylov subspaces are emphasized.
Krylov subspace methods are foreshadowed.


One Last Question

In the implicitly shifted QR algorithm the QR decomposition is nowhere to be seen.

Should the implicitly-shifted QR algorithm be given a different name? Some possibilities: . . .
. . . unitary bulge-chasing algorithm
. . . Hessenberg-Krylov nonstationary progressive nested subspace iteration
. . . Francis’s algorithm

Thank you for your attention.