


On the properties of Krylov subspaces in finite precision CG computations

Tomáš Gergelits∗ <[email protected]>   Zdeněk Strakoš∗,† <[email protected]>

∗Faculty of Mathematics and Physics, Charles University in Prague
†Institute of Computer Science, Academy of Sciences of the Czech Republic

Introduction

We address the question of the difference between Krylov subspaces generated by the CG method in finite precision arithmetic and their exact arithmetic counterparts. Since we study the behaviour of CG in practical computations, we concentrate on situations with a significant delay of convergence. We observe that, apart from the delay, the computed Krylov subspaces do not depart much from their exact arithmetic counterparts.

Krylov subspaces in practical computations

The need to compute a basis of the Krylov subspace

Kl(B, v) = span{v, Bv, . . . , B^{l−1}v}

lies at the heart of CG and other Krylov subspace methods. However, due to rounding errors, the subspace spanned by the practically computed basis may differ from its exact counterpart, and the following questions arise:

I What is the difference between the computed Krylov subspace K̄l(B, v) and its exact arithmetic counterpart Kl(B, v)?

I Can we find perturbations ∆B, δv such that Kl(B + ∆B, v + δv) = K̄l(B, v)?

I How sensitive are the Krylov subspaces Kl(B + ∆B, v + δv) to general small perturbations?

Results in the literature, e.g., [1, 2, 3], rely on the assumption of full dimensionality of the studied (computed or perturbed) Krylov subspaces.
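As a concrete illustration of the dimensionality issue raised above, here is a minimal NumPy sketch (the matrix B and vector v are arbitrary test data, not from the poster's experiments): even for a well-conditioned matrix, the numerical rank of the monomial Krylov basis can drop far below l in floating point.

```python
import numpy as np

def krylov_basis(B, v, l):
    """Monomial basis v, Bv, ..., B^(l-1)v of K_l(B, v), as columns."""
    cols = [v]
    for _ in range(l - 1):
        cols.append(B @ cols[-1])
    return np.column_stack(cols)

# Arbitrary SPD test data (an assumption for illustration only).
rng = np.random.default_rng(0)
B = np.diag(np.linspace(1.0, 100.0, 50))
v = rng.standard_normal(50)

K = krylov_basis(B, v, 20)
# In exact arithmetic K would have rank 20; in floating point the later
# columns align with the dominant eigenvector and the numerical rank drops.
rank = np.linalg.matrix_rank(K)
```

In practice one orthogonalizes on the fly (Arnoldi/Lanczos), but with short recurrences even that does not prevent the loss of linear independence discussed in the following sections.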

The CG method in practical computations

The CG method is the method of choice for solving linear systems

Ax = b,   A ∈ F^{N×N} HPD,   b ∈ F^N,   F is R or C.

I CG is a projection method which minimizes the energy norm of the error

xl ∈ x0 + Kl(A, r0),   rl ⊥ Kl(A, r0);   ‖x − xl‖A = min {‖x − y‖A : y ∈ x0 + Kl(A, r0)}.

I CG is computationally based on short recurrences.
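A minimal sketch of the short-recurrence CG iteration just described (Hestenes–Stiefel form; the ill-conditioned diagonal test matrix is an assumption, not the poster's data), which also records the residuals so their loss of mutual orthogonality can be measured:

```python
import numpy as np

def cg(A, b, iters):
    """Plain CG with short recurrences; returns the iterate and all residuals."""
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    residuals = [r.copy()]
    for _ in range(iters):
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)
        x = x + alpha * p
        r_new = r - alpha * Ap
        beta = (r_new @ r_new) / (r @ r)
        p = r_new + beta * p
        r = r_new
        residuals.append(r.copy())
    return x, residuals

# Ill-conditioned SPD test matrix (condition number ~1e8, an assumption).
n = 48
A = np.diag(np.logspace(0, 8, n))
b = np.ones(n) / np.sqrt(n)

x, residuals = cg(A, b, 40)
R = np.column_stack([v / np.linalg.norm(v) for v in residuals])
# In exact arithmetic the residuals are mutually orthogonal and this is ~0;
# in finite precision it typically grows far above machine precision.
orth_loss = np.linalg.norm(np.eye(R.shape[1]) - R.T @ R)
```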

Delay of convergence & rank deficiency

Using short recurrences in practical computation leads to the loss of global orthogonality of the computed residuals, and even to the loss of their linear independence. Consequently, the computed Krylov subspace spanned by these vectors can be rank-deficient, which causes a significant delay of convergence.

[Plots: relative A-norm of the error vs. iteration number for exact arithmetic and finite precision arithmetic, with the delay of convergence and the loss of orthogonality marked (left); rank of the computed Krylov subspace vs. iteration number for the FP and exact CG computations, showing the rank-deficiency (right).]

Idea of shift

Taking into account the phenomenon of delay of convergence, we should compare different iterations when comparing CG computations in finite precision and exact arithmetic. We relate:

k-th iteration of FP CG ⇐⇒ l-th iteration of exact CG,

where k − l corresponds to the delay of convergence, or equivalently to the rank-deficiency of the computed Krylov subspace. We want to study:

‖x − x̄k‖A × ‖x − xl‖A,   x̄k × xl,   K̄k(A, r0) × Kl(A, r0).
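One way to reproduce this comparison numerically is to emulate exact-arithmetic CG by explicitly reorthogonalizing each residual against all previous ones, a standard device in this literature. The sketch below (a diagonal test matrix with a graded spectrum of the kind often used in such experiments; all parameter choices are assumptions) makes the delay visible by comparing the A-norms of the error of the two runs at the same iteration index:

```python
import numpy as np

def graded_spd(n=32, lam1=0.001, lamn=100.0, gamma=0.65):
    """Diagonal SPD matrix with a graded spectrum (assumed test data)."""
    i = np.arange(1, n + 1)
    lam = lam1 + (i - 1) / (n - 1) * (lamn - lam1) * gamma ** (n - i)
    return np.diag(lam)

def cg_energy_errors(A, b, x_star, iters, reorth=False):
    """A-norm of the error at each CG step; reorth=True emulates exact CG
    by orthogonalizing each residual against all previous ones."""
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    basis = []          # normalized residuals, used only when reorth=True
    errs = []
    for _ in range(iters):
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)
        x = x + alpha * p
        r_new = r - alpha * Ap
        if reorth:
            for q in basis:
                r_new = r_new - (q @ r_new) * q
        nrm = np.linalg.norm(r_new)
        if nrm == 0.0:
            break
        beta = (r_new @ r_new) / (r @ r)
        p = r_new + beta * p
        basis.append(r_new / nrm)
        r = r_new
        e = x_star - x
        errs.append(np.sqrt(e @ A @ e))
    return errs

A = graded_spd()
n = A.shape[0]
b = np.ones(n) / np.sqrt(n)
x_star = np.linalg.solve(A, b)

fp = cg_energy_errors(A, b, x_star, n)               # finite precision CG
ex = cg_energy_errors(A, b, x_star, n, reorth=True)  # "exact" CG proxy
# At the same iteration index the reorthogonalized run is far ahead;
# the horizontal gap between the two error curves is the delay.
```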

Correspondence between computed and exact approximation vectors

We see that ‖x̄k − xl‖A / ‖x − xl‖A ≪ 1, i.e., the distance between the exact and shifted FP approximations is small in comparison with the actual size of the error.

[Plots: energy norms ‖x − xl‖A, ‖x − x̄k‖A, ‖x̄k − xl‖A and the ratio ‖x̄k − xl‖A/‖x − xl‖A against l(k); schematic of the exact and finite precision trajectories from x0 toward the solution x of Ax = b in F^N, with the delay at the k-th step.]

Figure : The trajectory of approximations x̄k generated by FP CG computations follows closely the trajectory of the exact CG approximations xl, with a delay given by the rank-deficiency of the computed Krylov subspace.

Correspondence between computed and exact Krylov subspaces

The distance between subspaces is measured using principal angles ϑj and principal vectors pj, qj defined as

ϑj = min_{p∈Fj, ‖p‖=1} min_{q∈Gj, ‖q‖=1} arccos(p∗q) ≡ arccos(pj∗qj),

where

Fj ≡ K̄k(A, r0) ∩ {p1, . . . , pj−1}⊥,   Gj ≡ Kl(A, r0) ∩ {q1, . . . , qj−1}⊥.

We compute the principal angles and vectors via the SVD of the matrix U∗l Vl,

U∗l Vl = FΣG∗,

where Ul is an orthonormal basis of the l-dimensional restriction of K̄k(A, r0) and Vl is an orthonormal basis of Kl(A, r0). It holds that:

[p1, . . . , pl ] ≡ P = UlF , pj = Ul fj ,

[q1, . . . , ql ] ≡ Q = VlG , qj = Vlgj ,

diag(cos(ϑ1), . . . , cos(ϑl)) ≡ Σ.
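The computation just described can be sketched in a few lines of NumPy (the example subspaces here are synthetic, with one known angle of 30°, not the poster's Krylov subspaces):

```python
import numpy as np

def principal_angles_deg(X, Y):
    """Principal angles between span(X) and span(Y), in degrees,
    largest first, via orthonormal bases (QR) and the SVD of U*V."""
    U, _ = np.linalg.qr(X)
    V, _ = np.linalg.qr(Y)
    sigma = np.linalg.svd(U.T @ V, compute_uv=False)
    sigma = np.clip(sigma, -1.0, 1.0)   # guard against rounding above 1
    return np.degrees(np.arccos(sigma))[::-1]

# Synthetic check: span{e1, e2} against span{e1, e2 rotated by 30 degrees}.
e1, e2, e3 = np.eye(3)
X = np.column_stack([e1, e2])
Y = np.column_stack([e1, np.cos(np.pi / 6) * e2 + np.sin(np.pi / 6) * e3])
angles = principal_angles_deg(X, Y)   # largest angle ~30, smallest ~0
```

For complex data, U.T would be replaced by U.conj().T, matching the notation U∗l Vl above.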

[Plots: comparison of the principal angles ϑl, ϑl−1, ϑl−2, ϑl−3 (in degrees) of the subspaces K̄k and Kl against l(k), for the two experiments described in the caption.]

Figure : Left: In this numerical experiment, the largest principal angle is ≈ 1°, and thus the Krylov subspaces Kl(A, r0) and K̄k(A, r0) are nearly the same. Right: Things can be more complicated, as illustrated on an experiment with the matrix Bus 494 from the MatrixMarket database. The departure of the subspaces is, however, still only in a few directions.

Study of departure of Krylov subspaces

Influence of clustered eigenvalues

We have observed that the closeness of the generated Krylov subspaces depends on the possible presence of clustered eigenvalues. The tighter the cluster, the more severe the departure of the subspaces.

[Plots: largest principal angles ϑl, ϑl−1, ϑl−2, ϑl−3 (in degrees) of the subspaces K̄k and Kl against l(k), for the two cluster widths.]

Figure : Illustration of the influence of clustered eigenvalues. We plot the largest principal angles for two different settings: the five largest eigenvalues are clustered in an interval of length ∆ = 10−8 (left) and ∆ = 10−12 (right).
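A test setting of this kind can be generated along the following lines (a sketch; only the cluster width ∆ is taken from the experiment above, while the dimension, the cluster size, and the location of the remaining eigenvalues are assumed for illustration):

```python
import numpy as np

def clustered_spd(n, delta, n_cluster=5, lo=0.1, hi=100.0):
    """Diagonal SPD matrix: n - n_cluster well separated eigenvalues below hi,
    plus n_cluster eigenvalues clustered in [hi, hi + delta]."""
    separated = np.linspace(lo, hi, n - n_cluster, endpoint=False)
    cluster = hi + delta * np.linspace(0.0, 1.0, n_cluster)
    return np.diag(np.concatenate([separated, cluster]))

# The two settings compared above: cluster width 1e-8 vs 1e-12.
A_loose = clustered_spd(30, 1e-8)
A_tight = clustered_spd(30, 1e-12)
# Running FP CG on both (same right-hand side) and measuring principal
# angles against a reorthogonalized run exposes the effect: the tighter
# the cluster, the larger the departure of the subspaces.
```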

A similar phenomenon of the loss and recapture of correlation between Arnoldi vectors, computed by two implementations of the Arnoldi algorithm in finite precision arithmetic, was observed in [4].

Concluding remarks

I The trajectories of computed approximations are enclosed in a shrinking “cone”.

I Krylov subspaces are in general sensitive to small perturbations of the matrix A. The observed “stability” (or inertia) of the computed Krylov subspaces represents a phenomenon which needs further investigation.

I There is a principal difference between the analysis of short and long recurrences. Using short recurrences, we cannot guarantee that the computed basis is well conditioned and that the computed subspaces have full dimension.

Acknowledgement

This work is supported by the ERC-CZ project LL1202, by the GAČR grant 201/13-06684S and by the GAUK grant 695612.

Bibliography

[1] Carpraux, J.-F., Godunov, S. K. and Kuznetsov, S. V. (1996). Condition number of the Krylov bases and subspaces. Linear Algebra and its Applications 265, 1–28.

[2] Paige, C. C. and Van Dooren, P. (1999). Sensitivity analysis of the Lanczos reduction. Numerical Linear Algebra with Applications 6, 29–50.

[3] Stewart, G. W. (2002). Backward error bounds for approximate Krylov subspaces. Linear Algebra and its Applications 340, 81–86.

[4] Okša, G. and Rozložník, M. (2011). On numerical behavior of the Arnoldi algorithm in finite precision for matrices with close eigenvalues. In SNA’11: Seminar on Numerical Analysis 2011, Institute of Geonics AS CR, Ostrava, 88–91.

http://more.karlin.mff.cuni.cz