Chapter 12 Solving secular equationsdottmath/corsi2012/... · J.R. Bunch, C.P. Nielsen and D.C....

Preview:

Citation preview

Chapter 12Solving secular equations

Gerard MEURANT

January-February, 2012

1 Examples of Secular Equations

2 Secular equation solvers

3 Numerical experiments

Examples of secular equations

What are secular equations?The term “secular” comes from the latin “saecularis” which isrelated to “saeculum”, which means “century”So secular refers to something that is done or happens everycentury. It is also used to refer to something that is severalcenturies oldIt appeared in mathematics to denote equations related to themotion of planets and celestial mechanicsFor instance, it appears in the title of a 1829 paper ofA. L. Cauchy (1789–1857) “Sur l’equation a l’aide de laquelle ondetermine les inegalites seculaires des mouvements des planetes”(Oeuvres Completes (IIeme Serie), v 9 (1891), pp 174–195). Thereis also a paper by J. J. Sylvester (1814–1897) whose title is “Onthe equation to the secular inequalities in the planetary theory”(Phil. Mag., v 5 n 16 (1883), pp 267-269)

Eigenvalues of a Tridiagonal Matrix

We look for an eigenvalue λ and an eigenvector x =(y ζ

)Tof

Jk+1 where y is a vector of dimension k and ζ is a real number.Then

Jky + ηkζek = λy

ηkyk + αk+1ζ = λζ

where yk is the last component of y , αj , j = 1, . . . , k + 1 are thediagonal entries of Jk+1 and ηj , j = 1, . . . , k are the subdiagonalentriesBy eliminating the vector y from these two equations we have

(αk+1 − η2k(ek)T (Jk − λI )−1ek)ζ = λζ

αk+1 − η2kk∑

j=1

(ξj)2

θj − λ= λ

where ξj = z jk is the kth (i.e., last) component of the jth

eigenvector of Jkthe θj ’s are the eigenvalues of Jk , that is, the Ritz valuesToo obtain the eigenvalues of Jk+1 from those of Jk we have tosolve

f (λ) = λ− αk+1 + η2k

k∑j=1

ξ2jθj − λ

= 0

The secular function f has poles at the eigenvalues (Ritz values)

of Jk for λ = θj = θ(k)j , j = 1 . . . , k

−1 0 1 2 3 4 5 6 7−10

−8

−6

−4

−2

0

2

4

6

8

10

Example of secular function with k = 4

Modification by a Rank-One MatrixAssume that we know the eigenvalues of a matrix A and we wouldlike to compute the eigenvalues of a rank-one modification of AWe have

Ax = λx

where we know the eigenvalues λ and we want to compute µ suchthat

(A + ccT )y = µy

where c is a given vector (not orthogonal to an eigenvector of A)Clearly µ is not an eigenvalue of ATherefore A− µI is nonsingular and we obtain an equation for µ

y = −(A− µI )−1ccT y

Multiplying by cT to the left,

cT y = −cT (A− µI )−1ccT y

The secular equation is

1 + cT (A− µI )−1c = 0

Using the spectral decomposition of A = QΛQT with Qorthogonal and Λ diagonal and z = QT c

1 +n∑

j=1

(zj)2

λj − µ= 0

where λj are the eigenvalues of A

Constrained Eigenvalue ProblemWe wish to find a vector x of norm one which is the solution of

maxx

xTAx

satisfying the constraint cT x = 0 where c is a given vectorWe introduce a functional ϕ with two Lagrange multipliers λ andµ corresponding to the two constraints

ϕ(x , λ, µ) = xTAx − λ(xT x − 1) + 2µxT c

Computing the gradient of ϕ with respect to x , which must bezero at the solution,

Ax − λx + µc = 0

from which we have x = −µ(A− λI )−1cIf λ is not an eigenvalue of A and using cT x = 0 we have

cT (A− λI )−1c = 0

Using the spectral decomposition of A = QΛQT and d = QT c

f (λ) =n∑

j=1

d2j

λj − λ= 0

There are n − 1 solutions to the secular equationWhen we have the values of λ that are solutions, we use theconstraint xT x = 1 to remark that

xT x = µ2cT (A− λI )−2c = 1

Therefore,

µ2 =1

cT (A− λI )−2c

andx = −µ(A− λI )−1c

−1 0 1 2 3 4 5 6 7−10

−8

−6

−4

−2

0

2

4

6

8

10

Example of secular function

Eigenvalue Problem with a Quadratic ConstraintWe consider the problem

minx

xTAx − 2cT x

with the constraint xT x = α2

Introducing a Lagrange multiplier and using the stationary valuesof the Lagrange functional

(A− λI )x = c , xT x = α2

Let A = QΛQT

The Lagrange equations can be written as

ΛQT x − λQT x = QT c, xTQQT x = α2

Introducing y = QT x and d = QT c

Λy − λy = d , yT y = α2

Assume that all the eigenvalues λi of A are simpleIf λ is equal to one of the eigenvalues, say λj , we must have dj = 0For all i 6= j we have

yi =di

λi − λThen, there is a solution or not, whether we have∑

i 6=j

(di

λi − λj

)2

= α2

or notIf λ is not an eigenvalue of A, the inverse of A− λI exists and weobtain the secular equation

cT (A− λI )−2c = α2

which is written as

f (λ) =n∑

i=1

(di

λi − λ

)2

− α2 = 0

−1 0 1 2 3 4 5 6 7−10

−8

−6

−4

−2

0

2

4

6

8

10

Example of secular function

Secular equation solversConsider solving the equation

1 + ρ

n∑j=1

(cj)2

dj − λ= 0

with ρ > 0When we look for the solution in the interval ]di , di+1[, we make achange of variable λ = di + ρtDenoting δj = (dj − di )/ρ, the secular equation is

f (t) = 1 +i∑

j=1

c2j

δj − t+

n∑j=i+1

c2j

δj − t= 1 + ψ(t) + φ(t) = 0

Note that

ψ(t) =i−1∑j=1

c2j

δj − t−

c2i

t

since δi = 0The function ψ has poles δ1, . . . , δi−1, 0 with δj < 0,j = 1, . . . , i − 1

The solution is sought in the interval ]0, δi+1[ with δi+1 > 0

Let us denote δ = δi+1

In ]0, δ[ we have ψ(t) < 0 and φ(t) > 0

Assume we know all the poles of the secular function f and we areable to compute values of ψ and φ and their derivatives

BNS methods

Bunch, Nielsen and Sorensen interpolated ψ to first order by arational function p/(q − t) and φ by r + s/(δ − t)This is called osculatory interpolationThe parameters p, q, r , s are determined by matching the exactvalues of the function and the first derivative of ψ or φ at somegiven point t (to the right of the exact solution) where f has anegative value

q = t + ψ(t)/ψ′(t)

p = ψ(t)2/ψ′(t)

r = φ(t)− (δ − t)φ′(t)

s = (δ − t)2φ′(t)

For computing them we have

ψ′(t) =i∑

j=1

c2j

(δj − t)2

and

φ′(t) =n∑

j=i+1

c2j

(δj − t)2

Then the new iterate is obtained by solving the quadratic equation

1 +p

q − t+ r +

s

δ − t= 0

This equation has two roots, of which only one is of interestThis method is called BNS1

−2 −1 0 1 2 3 4 5−10

−8

−6

−4

−2

0

2

4

6

8

10

Functions ψ (solid) and φ (dots) as function of t

To obtain an efficient method it remains to find a good initial guess

Melman’s strategy is to look for a zero of an interpolant of thefunction tf (t)When seeking the ith root, this function is written as

tf (t) = t

(1−

c2i

t+

c2i+1

δ − t+ h(t)

)

The function h is written in two parts h = h1 + h2

h1(t) =i−1∑j=1

c2j

δj − t, h2(t) =

n∑j=i+2

c2j

δj − t.

The functions h1 and h2 do not have poles in the interval ofconcernThey are interpolated by rational functions of the form p/(q − t)The parameters are found by interpolating the function values atpoints 0 and δThis gives a function h = h1 + h2

The starting point is obtained by computing the zero (in theinterval ]0, δ[) of the function

t

(1−

c2i

t+

c2i+1

δ − t+ h(t)

)

This can be done with a standard zero finder

BNS2

Interpolating ψ by r + s/t and φ by p/(q− t), one obtains anothermethod called BNS2Then,

r = ψ(t)− (δ − t)ψ′(t),

s = (δ − t)2ψ′(t)

q = t + φ(t)/φ′(t)

p = φ(t)2/φ′(t)

MW

R. C. Li proposed to use r + s/(δ − t) to interpolate bothfunctions ψ and φ

In fact, we interpolate ψ, which has a pole at 0 by r + s/t and φ,which has a pole at δ by r + s/(δ − t)

This method is not monotonic, but has quadratic convergence

Gragg’s method

Gragg interpolates f at t to second order with the function

a + b/t + c/(δ − t)

This gives

c =(δ − t)3

δf ′ +

t(δ − t)3

2δf ′′

b =t3

[f ′′(δ − t)− 2f ′

]a = f − f ′′t

2(δ − t) + (2t − δ)f ′

Then we solve the quadratic equation

at2 − t(aδ − b + c)− bδ = 0

The convergence is cubic

Numerical experiments

1 + ρ

n∑j=1

(cj)2

dj − λ= 0

The dimension is n = 4 and d = [1, 1 + β, 3, 4]Defining vT = [γ, ω, 1, 1], we have cT = vT/‖v‖

β = 1, γ = 10−2, ω = 1

Method No. it. Root−1 f (Root)

BNS1 2 2.068910657999406 10−5 −4.2294 10−12

BNS2 2 2.068910658015177 10−5 8.0522 10−12

MW 2 2.068910657999406 10−5 −4.2294 10−12

GR 2 2.068910657999392 10−5 −4.2398 10−12

β = 1, γ = 1, ω = 10−2

Method No. it. Root−1 f (Root)

BNS1 3 2.539918603315181 10−1 −3.3307 10−16

BNS2 2 2.539918603315182 10−1 −1.1102 10−16

MW 3 2.539918603315181 10−1 −3.3307 10−16

GR 3 2.539918603315181 10−1 −3.3307 10−16

β = 10−2, γ = 1, ω = 1

Method No. it. Root−1 f (Root)

BNS1 2 4.939569815595898 10−3 7.8160 10−14

BNS2 2 4.939569815595901 10−3 1.4921 10−13

MW 2 4.939569815595898 10−3 7.8160 10−14

GR 2 4.939569815595873 10−3 −4.0501 10−13

J.R. Bunch, C.P. Nielsen and D.C. Sorensen,Rank-one modification of the symmetric eigenproblem,Numer. Math., v 31 (1978), pp 31–48

Ren-Cang Li, Solving secular equations stably andefficiently, Report UCB CSD-94-851, University of California,Berkeley (1994)

A. Melman, A unifying convergence analysis of second-ordermethods for secular equation, Math. Comput., v 66 n 217(1997), pp 333–344

A. Melman, A numerical comparison of methods for solvingsecular equations, J. Comput. Appl. Math., v 86 (1997),pp 237–249