

NEW ZEALAND JOURNAL OF MATHEMATICS Volume 29 (2000), 177-192

HOW LARGE IS THE EXPONENTIAL OF A BANDED MATRIX?

Arieh Iserles

(Received January 1999)

Abstract. Let $A$ be a banded matrix of bandwidth $s \geq 1$. The exponential of $A$ is usually dense. Yet, its elements decay very rapidly away from the diagonal and, in practical computation, can be set to zero away from some bandwidth $r \geq s$ which depends on the required accuracy threshold. In this paper we investigate this phenomenon and present sharp, computable bounds on the rate of decay.

1. Sparsity and Matrix Exponents

How large is a matrix exponential? And why should it matter in practical numerical calculations?

On the face of it, the first question is sheer nonsense. Let $A \in M_n(\mathbb{R})$, the set of $n \times n$ real matrices. By an assiduous choice of $A$, the entries of
\[
e^A = \sum_{m=0}^{\infty} \frac{1}{m!} A^m
\]
can be made as large (or as small) as we wish them to be [12]. In particular, even if $A$ is sparse, $e^A$ is likely to be a dense matrix. Yet, it is the contention of the present paper that this is a simplistic point of view and that, provided that $A$ is a banded matrix, $e^A$ is itself within an exceedingly small distance from a banded matrix.

In Figure 1 we have performed the following exercise. $10^5$ tridiagonal matrices $A_1, A_2, \ldots, A_{10^5} \in M_{50}(\mathbb{R})$ were chosen at random, with nonzero entries distributed uniformly in $[-1,1]$. The entries of the matrix $E$, plotted at the upper left corner of Figure 1, are the maxima across the relevant entries of $e^{A_k}$, $k = 1,2,\ldots,10^5$. (Throughout this paper, 'plotting a matrix' means that, using the MATLAB function mesh, we display a matrix as a three-dimensional surface: this allows us, at a glance, to take heed of the size of its elements.) It can be observed at once that the entries of $E$ decay rapidly away from the diagonal. This is confirmed in the figure at the upper right corner, which displays $\log_{10} |E_{k,l}|$, $k,l = 1,2,\ldots,50$. The decay away from the diagonal is evidently faster than exponential. Although $E$ is formally dense, its entries are very small away from a narrow band along the diagonal. Further affirmation is provided by the bottom two figures, which display the vector $\{E_{25+k,25-k} : k = -24,\ldots,24\}$ to an absolute (on the left) and base-10 logarithmic (on the right) scale. The reason for base-10 logarithms is that they demonstrate, at a single glance, the number of decimal digits. Thus, $|E_{1,49}| \approx 10^{-80}$

1991 AMS Mathematics Subject Classification: 22E60.


and $|E_{20,30}| \approx 10^{-8}$. Given that $E$ represents a (statistical) upper bound on the size of the exponential and bearing in mind that numerical calculations are invariably required to finite accuracy, almost always far coarser than $10^{-80}$, it follows that each $e^{A_k}$ can be approximated very well by a banded matrix!
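The experiment is easy to reproduce on a small scale. The following sketch (in Python rather than the paper's MATLAB; the function names are ours, and a plain truncated Taylor sum stands in for a production-quality matrix exponential) builds one random tridiagonal matrix and inspects a far-off-diagonal entry of its exponential:

```python
# Build a random 20x20 tridiagonal matrix with entries in [-1, 1] and
# approximate exp(A) by a truncated Taylor sum; then inspect how tiny the
# entries far away from the diagonal are, although exp(A) is formally dense.
import random

def taylor_expm(A, terms=60):
    """Approximate exp(A) by summing the Taylor series sum_m A^m / m!."""
    n = len(A)
    E = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    T = [row[:] for row in E]                                  # holds A^m / m!
    for m in range(1, terms):
        T = [[sum(T[i][k] * A[k][j] for k in range(n)) / m for j in range(n)]
             for i in range(n)]
        E = [[E[i][j] + T[i][j] for j in range(n)] for i in range(n)]
    return E

random.seed(0)
n = 20
A = [[random.uniform(-1.0, 1.0) if abs(i - j) <= 1 else 0.0
      for j in range(n)] for i in range(n)]
E = taylor_expm(A)
print(abs(E[0][n - 1]))   # many orders of magnitude below any practical accuracy
```

The corner entry is nonzero, yet so small that setting it to zero commits an error far below any realistic accuracy threshold.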

This is perhaps not as surprising as it may appear at first, since the $(k,l)$ entry of $e^A$ is governed by a competition between the growth in the entries of $A^m$ and the decay of $1/m!$. Since factorials ultimately overtake powers, one can expect decay to occur. Yet, the precise rate of decay is of interest and its magnitude is, we believe, surprising.

Figure 1. Exponentials of a tridiagonal matrix.

Lest there be an impression that this behaviour is specific to tridiagonal matrices, Figure 2 provides similar data for quindiagonal (i.e. with five nonzero diagonals) matrices. The decay away from the diagonal is less rapid than in the tridiagonal case, yet fast enough for our observations to retain their validity.

Of course, there is a great deal of difference between experiments with random matrices and rigorous mathematical statements. In this paper we prove that the behaviour indicated in Figures 1 and 2 is indicative of exponentials of all banded matrices. Moreover, tight upper bounds on the worst-possible rate of decay away from the diagonal can be established for every bandwidth. In other words, given an arbitrary matrix $A \in M_n(\mathbb{R})$ of bandwidth $s$ and such that $\rho = \max_{k,l=1,2,\ldots,n} |A_{k,l}|$, for every threshold $\varepsilon > 0$ we can determine (as a function of $s$ and $\rho$ but, interestingly enough, not of $n$) an integer $r$ such that $|E_{k,l}| < \varepsilon$ for all $|k-l| \geq r$, where $E = e^A$.

Why should all this matter, except for an intrinsic mathematical interest? Classical methods for the evaluation of a matrix exponential cannot take advantage of the above phenomenon. Such methods can typically be classified into three categories:


Figure 2. Exponentials of a quindiagonal matrix.

1. Rational approximants: The exponential is replaced by a rational function, $e^z \approx r(z) := p(z)/q(z)$, where $p \in \mathbb{P}_\alpha$, $q \in \mathbb{P}_\beta$, $q(0) = 1$ and $e^z - p(z)/q(z)$ is in some sense 'small'. A typical measure of smallness is the order of approximation at the origin, which results in Padé approximants and their modifications [2, 17]. Other criteria of smallness can be applied, e.g. minimisation with respect to the $L_\infty$ norm in some subset of the complex plane or interpolation away from the origin [14]. Thus, instead of evaluating $e^A$, one computes two matrix-valued polynomials, $p(A)$ and $q(A)$, and inverts the latter to obtain $r(A)$. Suppose that $A$ is banded. Then, using the approach of this paper, it is possible to show that $r(A)$ is near a banded matrix, although less near than in the exponential case. Yet, it is not clear at all how this behaviour can be exploited in the design and implementation of effective numerical methods for the exponentiation of banded matrices.

2. Krylov subspace methods: The underlying idea is to seek an approximant to $e^A v$, where $A \in M_n(\mathbb{R})$ and $v \in \mathbb{R}^n$, from the $r$-dimensional Krylov subspace $\mathcal{K}_{n,r} = \operatorname{Span}\{v, Av, \ldots, A^{r-1}v\}$ [11, 20]. It is possible to show that surprisingly good and robust approximants can be obtained for relatively modest values of $r$. It is not transparent at all how, for banded $A$, the nearness of $e^A$ to a banded matrix can be exploited for an improved Krylov-subspace approximation of the exponential. The striking efficiency of Krylov subspace methods hinges on the fact that $\mathcal{K}_{n,r}$ retains, in a well-understood sense, much of the spectral information of the matrix $A$, whilst cutting the dimension. This has no apparent connection to the size of the entries of $e^A$.


3. Methods of numerical linear algebra: The conceptually simplest means of evaluating $e^A$ is by spectral decomposition: if $A = VDV^{-1}$, where $D$ is diagonal, then $e^A = V e^D V^{-1}$. This, however, is not a viable technique for general matrices. In place of a spectral decomposition, one can factorize $A$ in a different form, e.g. into a Schur decomposition [8, 17]. Regardless of the merits of this approach, it is quite clear that the nearness of $e^A$ to a banded matrix is of no use whatsoever in this framework.

To recap, our observation does not help in making classical methods more effective. Intriguingly, this is not the case for a new generation of methods for the approximation of exponentials that have recently been introduced within the context of Lie-group methods. Let $G \subset M_n(\mathbb{R})$ be a matrix Lie group and $\mathfrak{g}$ the corresponding Lie algebra. (We refer the reader to [3] and [21] for a good exposition of Lie groups and Lie algebras.) Given $A \in \mathfrak{g}$, it is true that $e^A \in G$, and this forms a vital part of many recent numerical methods for differential equations evolving on Lie groups: Runge-Kutta/Munthe-Kaas schemes [18], Magnus expansions [15] and Fer expansions [22]. Not every exponential approximant takes a matrix from a Lie algebra to a Lie group [4]. In the case of quadratic Lie groups, e.g. $O_n(\mathbb{R})$, $U_n(\mathbb{C})$ and $Sp_n(\mathbb{R})$, it suffices to use diagonal Padé approximants, but it can be proved that the only analytic function $f$ such that $f(z) = 1 + z + \mathcal{O}(z^2)$ and $f : \mathfrak{sl}_n(\mathbb{R}) \to SL_n(\mathbb{R})$ for all $n \geq 2$ is the exponential function itself!

The goal of designing exponential approximants that take $A \in \mathfrak{g}$ into its Lie group has motivated recent interest in a new generation of algorithms [4, 6]. The common denominator is that $A$ is split in the form
\[
A = \sum_{k=1}^{s} A_k,
\]
where each $A_k \in \mathfrak{g}$, $k = 1,2,\ldots,s$, has an exponential that can be evaluated easily and exactly, and so that
\[
e^{tA_1} e^{tA_2} \cdots e^{tA_s} = e^{tA} + \mathcal{O}(t^{p+1})
\]
for a sufficiently large value of $p \geq 1$. Recently, it has been demonstrated that the above approach can be improved by letting
\[
A = \sum_{k=1}^{r} a_k Q_k,
\]
where $r = \dim \mathfrak{g}$ and $\{Q_1, Q_2, \ldots, Q_r\}$ is a basis of the algebra, and approximating
\[
e^{tA} \approx e^{g_1(t) Q_1} e^{g_2(t) Q_2} \cdots e^{g_r(t) Q_r},
\]
where $g_1, g_2, \ldots, g_r$ are polynomials [5]. The main design features of this method are the right choice of the basis and an exploitation of certain features of the underlying Lie algebra. An important byproduct is that, whenever it is known that $e^{tA}$ is near a banded matrix, it is possible to amend the method to produce a banded approximant. Moreover, this procedure brings the computational cost down radically and allows us to take full advantage of sparsity. We do not delve into the details of the method from [5], merely using it as an example of an application of the results in this paper.


In Section 2 we analyse in detail the case of a tridiagonal matrix, demonstrating that the decay of the entries of $e^A$ away from the diagonal is bounded from above by a modified Bessel function. The theory for matrices of general bandwidth is introduced in Section 3, where we use Fourier analysis to obtain upper bounds on the decay away from the diagonal. Finally, in Section 4, we briefly sketch our conclusions and examine how the results can be generalized from the exponential to other well-behaved functions.

2. Tridiagonal Matrices

Let $A \in M_n(\mathbb{R})$ be a tridiagonal matrix and
\[
\rho = \max_{k,l=1,2,\ldots,n} |A_{k,l}|.
\]
We denote the entries of $A^m$ by $A^m_{k,l}$, whence $A^0_{k,l} = \delta_{k,l}$ and
\[
A^{m+1}_{k,l} = \sum_{j=\max\{1,l-1\}}^{\min\{n,l+1\}} A^m_{k,j} A_{j,l}, \qquad k,l = 1,2,\ldots,n, \quad m \geq 0. \tag{2.1}
\]

Proposition 2.1. For every $m \geq 0$, $k = 1,2,\ldots,n$ and $|l| \leq m$, it is true that
\[
|A^m_{k,k+l}| \leq c_{m,l}\, \rho^m,
\]
where $c_{0,l} = \delta_{0,l}$ and
\[
c_{m+1,l} = c_{m,l-1} + c_{m,l} + c_{m,l+1}, \qquad |l| \leq m+1, \tag{2.2}
\]
while $c_{m+1,l} = 0$, $|l| \geq m+2$.

Proof. A trivial consequence of (2.1). Specifically, for $l = m+1$ we have
\[
A^{m+1}_{k,k+m+1} = A^m_{k,k+m} A_{k+m,k+m+1},
\]
hence we can let $c_{m+1,m+1} = c_{m,m}$. When $l = m$, we obtain
\[
A^{m+1}_{k,k+m} = A^m_{k,k+m-1} A_{k+m-1,k+m} + A^m_{k,k+m} A_{k+m,k+m}
\]
and we can take $c_{m+1,m} = c_{m,m-1} + c_{m,m}$. In the case $l = 0,1,\ldots,m-1$ the recursion leads to
\[
A^{m+1}_{k,k+l} = A^m_{k,k+l-1} A_{k+l-1,k+l} + A^m_{k,k+l} A_{k+l,k+l} + A^m_{k,k+l+1} A_{k+l+1,k+l}
\]
and again we can choose $c_{m+1,l}$ consistently with (2.2). An identical argument extends to negative values of $l$. □
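The recursion (2.2) is straightforward to tabulate; the following sketch (ours) does so and confirms that the $c_{m,l}$ are trinomial coefficients, so that each row sums to $3^m$:

```python
# Tabulate the coefficients c_{m,l} of Proposition 2.1 via the recursion
# c_{m+1,l} = c_{m,l-1} + c_{m,l} + c_{m,l+1}, with c_{0,l} = delta_{0,l}.
def c_table(M):
    """Return dict (m, l) -> c_{m,l} for 0 <= m <= M, |l| <= m."""
    c = {(0, 0): 1}
    for m in range(M):
        for l in range(-(m + 1), m + 2):
            c[(m + 1, l)] = (c.get((m, l - 1), 0) + c.get((m, l), 0)
                             + c.get((m, l + 1), 0))
    return c

c = c_table(6)
# Row m sums to 3^m, the value of C_m(z) = (z + 1 + 1/z)^m at z = 1:
print([sum(c.get((m, l), 0) for l in range(-m, m + 1)) for m in range(7)])
# → [1, 3, 9, 27, 81, 243, 729]
```

The rows are symmetric in $l$ and their central entries $1, 1, 3, 7, 19, \ldots$ are the central trinomial coefficients.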

Letting
\[
C_m(z) := \sum_{l=-m}^{m} c_{m,l} z^l, \qquad m \in \mathbb{Z}^+,
\]
we deduce at once that
\[
C_m(z) = (z + 1 + z^{-1})^m, \qquad m \in \mathbb{Z}^+. \tag{2.3}
\]


We are interested in bounding the size of the entries of $E := e^A$. To this end we note that, for every $k = 1,2,\ldots,n$ and $k+l = 1,2,\ldots,n$, it is true that
\[
|E_{k,k+l}| = \left| \sum_{m=0}^{\infty} \frac{1}{m!} A^m_{k,k+l} \right| \leq \sum_{m=0}^{\infty} \frac{\rho^m}{m!} c_{m,l},
\]
therefore $|E_{k,k+l}| \leq F_{|l|}(\rho)$, where
\[
F_r(\rho) := \sum_{m=0}^{\infty} \frac{\rho^m}{m!} c_{m,r}, \qquad r \in \mathbb{Z}^+.
\]

To investigate the size of the functions $F_r$, we expand (2.3) in a Laurent series. It follows easily that
\[
(z + 1 + z^{-1})^m = \sum_{k=0}^{m} \binom{m}{k} (z + z^{-1})^k = \sum_{k=0}^{m} \binom{m}{k} \sum_{i=0}^{k} \binom{k}{i} z^{2i-k}
\]
\[
= \sum_{k=0}^{\lfloor m/2 \rfloor} \binom{m}{2k} \sum_{i=0}^{2k} \binom{2k}{i} z^{2(i-k)} + \sum_{k=0}^{\lfloor (m-1)/2 \rfloor} \binom{m}{2k+1} \sum_{i=0}^{2k+1} \binom{2k+1}{i} z^{2(i-k)-1}
\]
\[
= \sum_{l=-\lfloor m/2 \rfloor}^{\lfloor m/2 \rfloor} \left[ \sum_{k=|l|}^{\lfloor m/2 \rfloor} \frac{m!}{(m-2k)!\,(k+l)!\,(k-l)!} \right] z^{2l}
+ \sum_{l=-\lfloor (m-1)/2 \rfloor}^{\lfloor (m+1)/2 \rfloor} \left[ \sum_{k=\max\{-l,l-1\}}^{\lfloor (m-1)/2 \rfloor} \frac{m!}{(m-2k-1)!\,(k+l)!\,(k-l+1)!} \right] z^{2l-1}.
\]
We concern ourselves with nonnegative indices $l$, since it is trivial to prove that $c_{m,-l} = c_{m,l}$. Therefore
\[
c_{m,2l} = \sum_{k=l}^{\lfloor m/2 \rfloor} \frac{m!}{(m-2k)!\,(k+l)!\,(k-l)!}, \qquad
c_{m,2l+1} = \sum_{k=l}^{\lfloor (m-1)/2 \rfloor} \frac{m!}{(m-2k-1)!\,(k+l+1)!\,(k-l)!}
\]
for all relevant values of $l$. We thus deduce that
\[
F_{2l}(\rho) = \sum_{m=2l}^{\infty} \rho^m \sum_{k=l}^{\lfloor m/2 \rfloor} \frac{1}{(m-2k)!\,(k+l)!\,(k-l)!}
= e^{\rho} \sum_{k=l}^{\infty} \frac{\rho^{2k}}{(k+l)!\,(k-l)!}
= e^{\rho} \sum_{k=0}^{\infty} \frac{\rho^{2(k+l)}}{k!\,(k+2l)!} = e^{\rho} I_{2l}(2\rho),
\]
where $I_\nu(z)$ is the modified Bessel function [1, 19]. Likewise,
\[
F_{2l+1}(\rho) = \sum_{m=2l+1}^{\infty} \rho^m \sum_{k=l}^{\lfloor (m-1)/2 \rfloor} \frac{1}{(m-2k-1)!\,(k+l+1)!\,(k-l)!}
= e^{\rho} \sum_{k=0}^{\infty} \frac{\rho^{2(k+l)+1}}{k!\,(k+2l+1)!} = e^{\rho} I_{2l+1}(2\rho).
\]
We thus deduce that
\[
F_r(\rho) = e^{\rho} I_r(2\rho), \qquad r \in \mathbb{Z}^+. \tag{2.4}
\]

The proof of the following theorem follows at once from (2.4).

Theorem 2.2. Let $A$ be tridiagonal, with the magnitude of its nonzero entries bounded by $\rho > 0$. Then
\[
|(e^A)_{k,l}| \leq e^{\rho} I_{|k-l|}(2\rho), \qquad k,l = 1,2,\ldots,n.
\]
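Theorem 2.2 is easy to test numerically. In the following sketch (ours) the Bessel function is summed from its Taylor series and the matrix exponential by a truncated Taylor sum:

```python
# Check that the entries of exp(A), for a random tridiagonal A with
# |A_{k,l}| <= rho, are dominated entrywise by e^rho * I_{|k-l|}(2*rho).
import math
import random

def bessel_I(nu, z, terms=80):
    """Modified Bessel function I_nu(z), nu a nonnegative integer."""
    return sum((z / 2.0) ** (2 * m + nu)
               / (math.factorial(m) * math.factorial(m + nu))
               for m in range(terms))

def taylor_expm(A, terms=60):
    """exp(A) via the truncated Taylor series (adequate for small ||A||)."""
    n = len(A)
    E = [[float(i == j) for j in range(n)] for i in range(n)]
    T = [row[:] for row in E]
    for m in range(1, terms):
        T = [[sum(T[i][k] * A[k][j] for k in range(n)) / m for j in range(n)]
             for i in range(n)]
        E = [[E[i][j] + T[i][j] for j in range(n)] for i in range(n)]
    return E

random.seed(1)
n, rho = 12, 0.8
A = [[random.uniform(-rho, rho) if abs(i - j) <= 1 else 0.0
      for j in range(n)] for i in range(n)]
E = taylor_expm(A)
ok = all(abs(E[k][l]) <= math.exp(rho) * bessel_I(abs(k - l), 2 * rho)
         for k in range(n) for l in range(n))
print(ok)   # the bound of Theorem 2.2 holds for every entry
```

Since the random entries typically fall well short of $\rho$, the bound is comfortably satisfied; equality requires the extremal Toeplitz case discussed next.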

An alternative interpretation of the bound (2.4) is that the worst possible case is attained for the bi-infinite Toeplitz matrix with the symbol $\rho(z + 1 + z^{-1})$, whence all the upper bounds become equalities and $E_{k,k+s} = e^{\rho} I_{|s|}(2\rho)$, $k,s \in \mathbb{Z}$.¹

Since
\[
I_\nu(z) = \sum_{m=0}^{\infty} \frac{1}{m!\,\Gamma(m+\nu+1)} \left( \frac{z}{2} \right)^{2m+\nu},
\]
it follows that
\[
I_r(2\rho) \approx \frac{\rho^r}{r!} \approx \frac{1}{\sqrt{2\pi r}} \left( \frac{e\rho}{r} \right)^r, \qquad r \gg 1.
\]
In other words, scaling by the magnitude of entries along the diagonal,
\[
\log \frac{F_r(\rho)}{F_0(\rho)} \approx r(\log\rho - \log r + 1) - \tfrac{1}{2}\log(2\pi r) - \log I_0(2\rho), \qquad r \gg 1,
\]
and the upper bound decays hyper-exponentially away from the diagonal. This illustrates the behaviour along the right column of Figure 1.

¹We refer the reader to [9] for the theory of Toeplitz matrices.


In Table 1 we have displayed the growth in the number of entries greater in magnitude than a threshold $\varepsilon > 0$ for tridiagonal and (to illustrate the next section) quindiagonal $200 \times 200$ matrices, for decreasing $\varepsilon$. We have considered both the worst case, a Toeplitz matrix, and the average case, a mean of 1000 randomly selected matrices. For each case we have listed two numbers: the overall number of such entries (out of 40000) and the minimal bandwidth within which they reside. It is evident that the rate of decay is very rapid: even in the case $\varepsilon = 10^{-12}$, not that far from IEEE machine epsilon, sparsity remains significant. It is important to bear in mind that, in the worst case, a Toeplitz matrix, although not identical to the bi-infinite Toeplitz matrix with the same symbol, has a spectrum exponentially near to that of $T_\infty$ [9]. This explains why our bounds for the decay in the size of the elements of $e^{T_\infty}$ remain remarkably sharp for finite-dimensional Toeplitz matrices. In general, let $A$ be a tridiagonal matrix whose entries are bounded by $\rho > 0$ and

                 Tridiagonal                       Quindiagonal
   ε        worst case    average case       worst case    average case
            nz     bd     nz       bd        nz     bd     nz       bd
  10^-2    2170     5      968.9   1.94     5218    13     1888.9   4.28
  10^-4    2944     7     1608.3   3.56     7058    18     3247.1   7.89
  10^-6    3710     9     2175.4   5.01     8494    22     4416.0  10.86
  10^-8    4468    11     2699.2   6.37     9898    26     5479.0  13.70
  10^-10   5218    13     3194.4   7.65    10930    29     6469.6  16.39
  10^-12   5960    15     3666.5   8.89    12278    33     7405.3  18.96

Table 1. Entries of $e^A$ larger in magnitude than a given threshold $\varepsilon$, and the bandwidth within which they reside, for tridiagonal and quindiagonal matrices; 'nz' and 'bd' stand for the number of such entries and their bandwidth, respectively.

choose a threshold $\varepsilon$. Suppose that $r$ is the least integer such that $e^{\rho} I_r(2\rho) < \varepsilon$. Then we might set all the elements of $e^A$ with $|k-l| \geq r$ equal to zero whilst committing an entrywise error less than $\varepsilon$. Note that this bound does not depend on the dimension $n$. The computation of such $r$ is assisted by the following observation about modified Bessel functions.

Proposition 2.3. Let $\psi$ be the Digamma function [1, p. 258]. For every $0 < \rho < 2e^{\psi(\nu+1)}$ the sequence $\{I_\eta(\rho)\}_{\eta \geq \nu}$ is strictly monotonically decreasing to zero as $\eta \to \infty$.

Proof. The derivative of the modified Bessel function $I_\nu$, where $\nu \in \mathbb{R}$, with respect to its order is
\[
\frac{\partial I_\nu(\rho)}{\partial \nu} = I_\nu(\rho) \log\frac{\rho}{2} - \left(\frac{\rho}{2}\right)^{\nu} \sum_{k=0}^{\infty} \frac{\psi(k+\nu+1)}{k!\,\Gamma(k+\nu+1)} \left(\frac{\rho}{2}\right)^{2k},
\]


where $\Gamma$ is the Gamma function [1, p. 377]. Substituting the series expansion of $I_\nu$, we have
\[
\frac{\partial I_\nu(\rho)}{\partial \nu}
= \left(\frac{\rho}{2}\right)^{\nu} \sum_{k=0}^{\infty} \frac{\log\frac{\rho}{2}}{k!\,\Gamma(k+\nu+1)} \left(\frac{\rho}{2}\right)^{2k}
- \left(\frac{\rho}{2}\right)^{\nu} \sum_{k=0}^{\infty} \frac{\psi(k+\nu+1)}{k!\,\Gamma(k+\nu+1)} \left(\frac{\rho}{2}\right)^{2k}
= -\left(\frac{\rho}{2}\right)^{\nu} \sum_{k=0}^{\infty} \left[\psi(k+\nu+1) - \log\frac{\rho}{2}\right] \frac{(\rho/2)^{2k}}{k!\,\Gamma(k+\nu+1)}.
\]
As long as $0 < \rho < 2e^{\psi(\nu+1)}$, we have $\psi(k+\nu+1) - \log\frac{\rho}{2} > 0$, $k \geq 0$, and $\partial I_\nu(\rho)/\partial\nu < 0$. This proves the proposition. □

The condition of the proposition is always fulfilled for sufficiently large $\nu$ since, according to the integral representation in [1, p. 259],
\[
\psi(t) = \frac{\Gamma'(t)}{\Gamma(t)} = -\gamma + \int_0^{\infty} \frac{e^{-\xi} - e^{-t\xi}}{1 - e^{-\xi}} \, d\xi
= -\gamma + 1 + \int_0^{\infty} \frac{e^{-2\xi} - e^{-t\xi}}{1 - e^{-\xi}} \, d\xi,
\]
where $\gamma \approx 0.57721$ is the Euler constant. Since the integrand is nonnegative for $t \geq 2$, we deduce that $\psi$ is positive in that regime. Moreover, $\psi(t) \sim \log t$ for $t \gg 1$, hence it becomes unbounded as $t \to \infty$ [1, p. 259].
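With the monotonicity just established, the least admissible $r$ for a given threshold can be found by a simple linear scan. A sketch (function names are ours), with the Bessel function summed from its Taylor series:

```python
# Find the least r with e^rho * I_r(2*rho) < eps: entries of exp(A) with
# |k - l| >= r may then be set to zero at an entrywise error below eps.
import math

def bessel_I(nu, z, terms=80):
    """Modified Bessel function I_nu(z), nu a nonnegative integer."""
    return sum((z / 2.0) ** (2 * m + nu)
               / (math.factorial(m) * math.factorial(m + nu))
               for m in range(terms))

def min_bandwidth(rho, eps):
    """Least r such that e^rho * I_r(2*rho) < eps."""
    r = 0
    while math.exp(rho) * bessel_I(r, 2.0 * rho) >= eps:
        r += 1
    return r

print([min_bandwidth(1.0, 10.0 ** (-d)) for d in (2, 4, 8, 12)])
# → [6, 8, 12, 16]: the bandwidth grows only slowly as the threshold
#   tightens, and is independent of the dimension n
```

Note that these values are consistent with the worst-case tridiagonal column of Table 1 (where 'bd' counts the band of the surviving entries, i.e. $r - 1$).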

3. General Banded Matrices

Let $A \in M_n(\mathbb{R})$ be a banded matrix of bandwidth $s \geq 1$: thus, $A_{k,l} = 0$ for $|k-l| \geq s+1$. We let
\[
\rho = \max_{k,l=1,2,\ldots,n} |A_{k,l}|.
\]

It is important to investigate how much of the analysis of the tridiagonal case $s = 1$ survives in the present setting. Proposition 2.1 is a case in point and it requires a trivial amendment. As before, we denote by $A^m_{k,l}$ the entries of the $m$th power of $A$, $m \geq 0$.

Proposition 3.1. It is true that
\[
|A^m_{k,k+l}| \leq c_{m,l}\, \rho^m, \qquad m \geq 0, \quad k = 1,2,\ldots,n, \quad |l| \leq sm,
\]
where $c_{0,l} = \delta_{0,l}$,
\[
c_{m+1,l} = \sum_{j=-s}^{s} c_{m,l+j}, \qquad |l| \leq s(m+1),
\]
and $c_{m+1,l} = 0$ for $|l| \geq s(m+1)+1$.

Proof. Follows similarly to Proposition 2.1, except that, in place of (2.1), we use the recursion
\[
A^{m+1}_{k,l} = \sum_{j=\max\{1,l-s\}}^{\min\{n,l+s\}} A^m_{k,j} A_{j,l}, \qquad k,l = 1,2,\ldots,n, \quad m \geq 0. \tag{3.5}
\]
□


Retaining the definition of $C_m$, it follows at once from (3.5) that
\[
C_m(z) = \sum_{l=-sm}^{sm} c_{m,l} z^l = \left( \sum_{l=-s}^{s} z^l \right)^m, \qquad m \in \mathbb{Z}^+.
\]
Letting again $E = e^A$, we deduce identically to Section 2 that
\[
|E_{k,l}| \leq F_{|k-l|}(\rho), \qquad k,l = 1,2,\ldots,n. \tag{3.6}
\]
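For general $s$ the coefficients $c_{m,l}$ are simply the coefficients of $(\sum_{l=-s}^{s} z^l)^m$, and can be generated by repeated polynomial multiplication; a sketch (ours):

```python
# Coefficients c_{m,l} of Proposition 3.1 as the coefficients of
# (z^{-s} + ... + 1 + ... + z^s)^m, via repeated polynomial multiplication.
def c_row(s, m):
    """Return the list [c_{m,l} : l = -s*m, ..., s*m], indexed by l + s*m."""
    row = [1]                      # C_0(z) = 1
    for _ in range(m):
        new = [0] * (len(row) + 2 * s)
        for i, v in enumerate(row):
            for j in range(2 * s + 1):
                new[i + j] += v    # multiply by z^{-s} + ... + z^s
        row = new
    return row

# s = 1 reproduces the trinomial coefficients of Section 2, and every row
# sums to (2s+1)^m, the value of C_m at z = 1:
print(c_row(1, 2))        # → [1, 2, 3, 2, 1]
print(sum(c_row(2, 3)))   # → 125
```
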

This is as far as the technique of Section 2 takes us. At least in the quindiagonal case $s = 2$ it is possible to express $F_r(\rho)$, with substantial effort, as a linear combination of modified Bessel functions,
\[
F_r(\rho) = e^{\rho} \sum_{j=-\infty}^{\infty} I_{|j|}(2\rho)\, I_{r+2j}(2\rho)
= e^{\rho} \left\{ I_0(2\rho) I_r(2\rho) + \sum_{j=1}^{\infty} I_j(2\rho) \left[ I_{r+2j}(2\rho) + I_{r-2j}(2\rho) \right] \right\},
\]
except that this expression is of limited transparency and utility. We need a different approach, and it is provided by Fourier analysis.

We observe that
\[
\frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-ir\theta} C_m(e^{i\theta}) \, d\theta
= \sum_{k=-sm}^{sm} c_{m,k} \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{i(k-r)\theta} \, d\theta = c_{m,r}.
\]
Thus, we deduce from (3.6) that for every $r \in \mathbb{Z}^+$
\[
F_r(\rho) = \sum_{m=r}^{\infty} \frac{\rho^m}{m!} \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-ir\theta} \left( \sum_{k=-s}^{s} e^{ik\theta} \right)^m d\theta
= \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-ir\theta} \sum_{m=r}^{\infty} \frac{\rho^m}{m!} \left( \sum_{k=-s}^{s} e^{ik\theta} \right)^m d\theta
\]
\[
= \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-ir\theta} \left[ \exp\left( \rho \sum_{k=-s}^{s} e^{ik\theta} \right) - \sum_{m=0}^{r-1} \frac{\rho^m}{m!} \left( \sum_{k=-s}^{s} e^{ik\theta} \right)^m \right] d\theta.
\]

In other words, for each $r \in \mathbb{Z}^+$ we are interested in the rate of decay of the $r$th Fourier coefficient of the function
\[
g_r(z) := \exp\left( \rho \sum_{k=-s}^{s} z^k \right) - \sum_{m=0}^{r-1} \frac{\rho^m}{m!} \left( \sum_{k=-s}^{s} z^k \right)^m.
\]

Note that the integral representation of $F_r$ can be somewhat simplified,
\[
F_r(\rho) = \frac{1}{\pi} \int_0^{\pi} \cos(r\theta) \left\{ \exp\left( \rho \left[ 1 + 2\sum_{k=1}^{s} \cos(k\theta) \right] \right) - \sum_{m=0}^{r-1} \frac{\rho^m}{m!} \left[ 1 + 2\sum_{k=1}^{s} \cos(k\theta) \right]^m \right\} d\theta.
\]


This can be readily used to produce an alternative proof of (2.4), by exploiting the identity
\[
I_r(z) = \frac{1}{\pi} \int_0^{\pi} e^{z\cos\theta} \cos(r\theta) \, d\theta
\]
[1, p. 376]. For more general analysis we require a classical estimate of the rate of decay of Fourier coefficients (or, equivalently, of the coefficients of Laurent series) of analytic functions, which can be found, for example, in [10, Vol. I, p. 221].

Theorem 3.2. Let $f$ be analytic in the annulus $\{z \in \mathbb{C} : \alpha < |z - z_0| < \beta\}$ and for $\sigma \in (\alpha,\beta)$ let
\[
\mu(\sigma) := \max_{-\pi < \theta \leq \pi} |f(z_0 + \sigma e^{i\theta})|.
\]
The coefficients $\{f_n\}_{n=-\infty}^{\infty}$ of the Laurent series of $f$ satisfy the inequality
\[
|f_n| \leq \frac{\mu(\sigma)}{\sigma^n}, \qquad n \in \mathbb{Z}. \tag{3.7}
\]
In our case, each $g_r$ is analytic in the punctured complex plane $\mathbb{C} \setminus \{0\}$, therefore $\alpha = 0$, $\beta = \infty$ and $z_0 = 0$. Moreover, for each $r \in \mathbb{Z}^+$,
\[
\mu(\sigma) = \mu_r(\sigma) = \max_{-\pi<\theta\leq\pi} |g_r(\sigma e^{i\theta})|
= \max_{-\pi<\theta\leq\pi} \left| \sum_{m=r}^{\infty} \frac{\rho^m}{m!} \left( \sum_{k=-s}^{s} \sigma^k e^{ik\theta} \right)^m \right|
= \sum_{m=r}^{\infty} \frac{\rho^m}{m!} \left( \sum_{k=-s}^{s} \sigma^k \right)^m = g_r(\sigma),
\]
the maximum being attained at $\theta = 0$ since all the coefficients in the expansion of $g_r$ are nonnegative.

Therefore, for every $\sigma > 1$ we obtain the upper bound
\[
|F_r(\rho)| \leq \sigma^{-r} g_r(\sigma) =: \varphi_r(\sigma), \qquad r \in \mathbb{Z}^+. \tag{3.8}
\]
The problem with the bound (3.8), though, is which $\sigma > 1$ to choose. Clearly, for different values of $r$ we can choose different values of $\sigma = \sigma_r$, to make the bound (3.8) as low as possible.

As before, the tridiagonal case provides us with a clue. Letting $s = 1$, we have
\[
\varphi_r(\sigma) = \sigma^{-r} \left\{ e^{\rho(\sigma+1+\sigma^{-1})} - \sum_{m=0}^{r-1} \frac{1}{m!} \left[ \rho(\sigma+1+\sigma^{-1}) \right]^m \right\}.
\]
Therefore,
\[
\varphi_r'(\sigma) = -\frac{r}{\sigma} \varphi_r(\sigma) + \frac{\rho}{\sigma} (1 - \sigma^{-2})\, \varphi_{r-1}(\sigma).
\]
Setting the derivative to zero we have
\[
\rho\, (1 - \sigma^{-2})\, \frac{\varphi_{r-1}(\sigma)}{\varphi_r(\sigma)} = r.
\]
Since
\[
\sigma \varphi_r(\sigma) - \varphi_{r-1}(\sigma) = -\frac{[\rho(\sigma+1+\sigma^{-1})]^{r-1}}{(r-1)!\,\sigma^{r-1}} \approx 0, \qquad r \gg 1,
\]
we have $\varphi_{r-1}(\sigma)/\varphi_r(\sigma) \approx \sigma$, therefore the optimal $\sigma$ approximately obeys the quadratic equation $\rho\sigma^2 - r\sigma - \rho = 0$,


with the positive root
\[
\sigma_r = \frac{r + \sqrt{r^2 + 4\rho^2}}{2\rho} \approx \frac{r}{\rho}, \qquad r \gg 1.
\]
Therefore, for sufficiently large $r$,
\[
\varphi_r(\sigma_r) \approx \left( \frac{\rho}{r} \right)^r \left[ e^{\rho + \sqrt{r^2+4\rho^2}} - \sum_{m=0}^{r-1} \frac{\left(\rho + \sqrt{r^2+4\rho^2}\right)^m}{m!} \right] \approx e^{\rho} \left( \frac{e\rho}{r} \right)^r,
\]
which, by Stirling's formula, matches the decay rate $e^{\rho}\rho^r/r!$ of the Bessel-function bound (2.4) up to a factor of $\mathcal{O}(\sqrt{r})$.

The above analysis remains virtually intact for general $s \geq 1$ and the equation for the optimal $\sigma$ is approximately
\[
\rho \sum_{k=-s}^{s} k\, \sigma^k = r.
\]
In general, this equation cannot be solved in closed form. Suppose, however, that $r \gg 1$ and let $\varepsilon = \rho/r$, whence $0 < \varepsilon \ll 1$. We seek a solution of the form
\[
\sigma = c\, \varepsilon^{-\beta} [1 + o(1)], \qquad \varepsilon \downarrow 0,
\]
where $c > 0$ and $\beta$ are unknown constants. Substituting into the equation, it is trivial to verify that $\beta = 1/s$ and $c = s^{-1/s}$, whence $[r/(\rho s)]^{1/s}$ is a very good approximation to the optimal value. After some easy algebra this leads to the upper bound
\[
|F_r(\rho)| \leq \left( \frac{\rho s}{r} \right)^{r/s} \left[ e^{r/s} - \sum_{m=0}^{r-1} \frac{(r/s)^m}{m!} \right]. \tag{3.9}
\]

Theorem 3.3. Let $E = e^A$, where $A$ is a banded matrix of bandwidth $s \geq 1$, and set $\rho = \max_{k,l=1,2,\ldots,n} |A_{k,l}|$. Then
\[
|E_{k,l}| \leq \left( \frac{\rho s}{|k-l|} \right)^{|k-l|/s} \left[ e^{|k-l|/s} - \sum_{m=0}^{|k-l|-1} \frac{(|k-l|/s)^m}{m!} \right], \qquad |k-l| \gg 1. \tag{3.10}
\]

It is easy to use the theorem to bound the bandwidth $r$ within which one can confine all the entries of $E$ that exceed $\varepsilon > 0$ in magnitude. However, in that case it is more reasonable (and computationally straightforward) to compute numerically the $\sigma_r$ that minimises $\varphi_r$ and choose an integer $r$ such that $\varphi_r(\sigma_r) < \varepsilon$.
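This numerical recipe can be sketched as follows (function names, the grid, and the cut-offs are ours): evaluate $\varphi_r(\sigma) = \sigma^{-r}\,g_r(\sigma)$ on a grid of $\sigma > 1$, take the minimum, and pick the least $r$ whose minimum drops below the threshold $\varepsilon$.

```python
# Minimise phi_r(sigma) = sigma^(-r) * [exp(rho*w) - sum_{m<r} (rho*w)^m/m!],
# where w = sum_{k=-s}^{s} sigma^k, over a grid of sigma > 1, and scan r.
import math

def phi(r, sigma, rho, s):
    w = sum(sigma ** k for k in range(-s, s + 1))
    x = rho * w
    if x > 700.0:              # guard against overflow far from the optimum
        return float("inf")
    tail = math.exp(x) - sum(x ** m / math.factorial(m) for m in range(r))
    return sigma ** (-r) * tail

def min_phi(r, rho, s, grid=2000, sigma_max=30.0):
    # crude grid minimisation over sigma in (1, sigma_max]
    return min(phi(r, 1.0 + i * (sigma_max - 1.0) / grid, rho, s)
               for i in range(1, grid + 1))

def bandwidth(rho, s, eps):
    # least r whose minimised bound drops below eps
    r = 1
    while min_phi(r, rho, s) >= eps:
        r += 1
    return r

b1, b2 = bandwidth(1.0, 1, 1e-8), bandwidth(1.0, 2, 1e-8)
print(b1, b2)   # decay is slower for the wider band, as in Figures 1 and 2
```

A grid suffices here because each $\varphi_r$ has a single interior minimum over the admissible $\sigma$; a more careful implementation would refine the grid near the minimiser.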

Figure 3 displays the functions $\varphi_r(\sigma)$ in a base-10 logarithmic scale. The minimum of each curve corresponds to the best value $\sigma_r$. It is evident from the figure that good estimates of the minimum are crucial to avoid an overly pessimistic estimate of the rate of decay. This, indeed, is a justification for numerical computation of the minimum, in preference to the estimate of Theorem 3.3.

4. Generalizations and Conclusions

Let
\[
h(z) = \sum_{m=0}^{\infty} h_m z^m
\]

Figure 3. The function $\log_{10} \varphi_r(\sigma)$ for a range of values of $r$ and different values of $\rho$.

be an arbitrary function, analytic in the disc $|z| < \gamma$. How fast, if at all, do the elements of $h(A)$ decay away from the diagonal when $A$ is a banded matrix?

In Figure 4 we have plotted $h(\rho T)$, where $T \in M_{100}(\mathbb{R})$ is the tridiagonal Toeplitz matrix with the symbol $z + 1 + z^{-1}$, while $h(z) = (1 - \frac{1}{2}z)^{-1}(1 + \frac{1}{2}z)$, for two different values of $\rho$. It is evident that, unlike in the exponential case, a very minor change in the size of the coefficients leads to radically different behaviour.

Figure 4. The matrix $h(A)$ for the [1/1] Padé approximant to the exponential, $h(z) = (1 - \frac{1}{2}z)^{-1}(1 + \frac{1}{2}z)$, and a tridiagonal Toeplitz matrix $A = \rho T \in M_{100}(\mathbb{R})$.


Retaining the notation of Section 3, we briefly comment on the validity of our results in this more general setting. Again, $A$ is of bandwidth $s \geq 1$ and its nonzero elements are bounded in magnitude by $\rho > 0$. We have
\[
\tilde{F}_r(\rho) = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-ir\theta}\, h\!\left( \rho \sum_{k=-s}^{s} e^{ik\theta} \right) d\theta,
\]
where $|\tilde{F}_{|k-l|}(\rho)|$ is a bound on the magnitude of the $(k,l)$ entry of $h(A)$. Using Theorem 3.2 we deduce that, for every $\sigma > 1$,

\[
|\tilde{F}_r(\rho)| \leq \sigma^{-r} \max_{-\pi<\theta\leq\pi} \left| \sum_{m=r}^{\infty} h_m \left( \rho \sum_{k=-s}^{s} \sigma^k e^{ik\theta} \right)^m \right|
\leq \sigma^{-r} \sum_{m=r}^{\infty} |h_m| \left( \rho \sum_{k=-s}^{s} \sigma^k \right)^m
= \sigma^{-r}\, \tilde{h}_r\!\left( \rho \sum_{k=-s}^{s} \sigma^k \right), \tag{4.11}
\]
where $\tilde{h}_r(y) := \sum_{m=r}^{\infty} |h_m| y^m$. Of course, (4.11) makes sense only as long as the argument of $\tilde{h}_r$ is within the disc of convergence. Since, e.g. by the Cauchy criterion, $\tilde{h}_r$ and $h$ share the same radius of convergence, we require that $\sigma > 1$ obeys
\[
\rho \sum_{k=-s}^{s} \sigma^k < \gamma. \tag{4.12}
\]
This means that we must restrict the range of $\rho$ to
\[
\rho < \frac{\gamma}{2s+1},
\]
otherwise (4.12) fails for every $\sigma > 1$. This explains the difference between the two matrices in Figure 4: one value of $\rho$ lies on the 'safe' side of this restriction, while the other is marginally too large.
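A small experiment (ours) illustrates the restriction just derived. For the [1/1] Padé approximant $h(z) = (1 - \frac{1}{2}z)^{-1}(1 + \frac{1}{2}z)$ we have $\gamma = 2$ and, for a tridiagonal matrix, $s = 1$, so the bound (4.11) is usable only when $\rho < 2/3$:

```python
# Minimise the bound sigma^{-r} * h_r~(rho*(sigma + 1 + 1/sigma)) over a grid
# of sigma > 1, where h_r~(y) = 2 (y/2)^r / (1 - y/2) for the [1/1] Pade
# approximant (gamma = 2, s = 1).  Returns None when (4.12) fails everywhere.
def pade_bound(rho, r, grid=4000):
    best = None
    for i in range(1, grid + 1):
        sigma = 1.0 + 3.0 * i / grid
        y = rho * (sigma + 1.0 + 1.0 / sigma)
        if y >= 2.0:               # outside the disc of convergence
            continue
        val = sigma ** (-r) * 2.0 * (y / 2.0) ** r / (1.0 - y / 2.0)
        if best is None or val < best:
            best = val
    return best

print(pade_bound(0.3, 10))   # small: off-diagonal decay is guaranteed
print(pade_bound(0.7, 10))   # None: rho > 2/3, no admissible sigma exists
```

For $\rho < 2/3$ the minimised bound decays geometrically in $r$; for $\rho \geq 2/3$ every $\sigma > 1$ violates (4.12) and no decay is guaranteed, consistently with Figure 4.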

To obtain the least upper bound in (4.11), we need to minimise $\sigma^{-r} \tilde{h}_r(\rho \sum_{k=-s}^{s} \sigma^k)$ over all $\sigma > 1$ that satisfy (4.12). Returning to the example from Figure 4, namely the [1/1] diagonal Padé approximant to the exponential, $h(z) = (1 - \frac{1}{2}z)^{-1}(1 + \frac{1}{2}z)$, we have $h_0 = 1$, $h_m = 2^{-m+1}$ for $m \in \mathbb{N}$ and
\[
\tilde{h}_r(y) = \frac{2 (\tfrac{1}{2}y)^r}{1 - \tfrac{1}{2}y}, \qquad 0 \leq y < 2,
\]
in the tridiagonal case $s = 1$. Letting $\xi = \frac{1}{2}\rho(\sigma + 1 + \sigma^{-1}) < 1$, we have $\sigma = \frac{1}{2}\left(2\xi/\rho - 1 + \sqrt{4\xi^2/\rho^2 - 4\xi/\rho - 3}\right)$, therefore we need to minimise
\[
\sigma^{-r}\, \tilde{h}_r(\rho(\sigma + 1 + \sigma^{-1})) = \frac{2\xi^r}{(1-\xi)\,\sigma^r}
\]
with respect to $\xi$. This can be done numerically.

To recap, we have demonstrated in this paper that the entries in the exponential of a banded matrix decay rapidly away from the diagonal. Moreover, by providing estimates on the rate of decay, we have shown that it is possible to predict, given any threshold $\varepsilon > 0$, the bandwidth outside of which the elements of $e^A$ are smaller than $\varepsilon$ in modulus. Similar analysis can be extended to other analytic functions. In the case of analytic functions with a finite radius of convergence, though, we need to


restrict the size of the entries of A, otherwise decay away from the diagonal is not assured.

Computer experiments demonstrate that a similar phenomenon takes place for matrices with more elaborate sparsity patterns. Upon exponentiation, having thrown away small entries, the surviving sparsity pattern, although degraded in comparison with that of the original matrix, is often substantial enough to be of interest in practical computation.

Acknowledgements. The author is grateful to Per Christian Moan for his helpful comments on an earlier version of the manuscript.

References

1. M. Abramowitz and I. Stegun, Handbook of Mathematical Functions, Dover, New York, 1965.

2. G.A. Baker, Essentials of Pade Approximants, Academic Press, New York, 1975.

3. R. Carter, G. Segal and I. Macdonald, Lectures on Lie Groups and Lie Algebras, LMS Student Texts, Cambridge University Press, Cambridge, 1995.

4. E. Celledoni and A. Iserles, Approximating the exponential from a Lie algebra to a Lie group, University of Cambridge, NA03, 1998.

5. E. Celledoni and A. Iserles, Numerical calculation of the matrix exponential based on Wei-Norman equations, University of Cambridge, 1999, (in preparation).

6. E. Celledoni, A. Iserles and S.P. Nørsett, Complexity of Lie-algebraic discretization methods, University of Cambridge, 1999, (in preparation).

7. K. Engø, A. Marthinsen and H. Munthe-Kaas, DiffMan: An object oriented MATLAB toolbox for solving differential equations on manifolds, University of Bergen, Bergen, Norway, 1997.

8. G.H. Golub and C.F. Van Loan, Matrix Computations, 2nd Edition, The Johns Hopkins Press, Baltimore, 1989.

9. U. Grenander and G. Szegő, Toeplitz Forms and Their Applications, Chelsea, New York, 1958.

10. P. Henrici, Applied and Computational Complex Analysis, Volume I, John Wiley and Sons, New York, 1974.

11. M. Hochbruck and Ch. Lubich, On Krylov subspace approximations to the matrix exponential operator, SIAM J. Numer. Anal. 34 (1997), 1911-1925.

12. R.A. Horn and C.R. Johnson, Matrix Analysis, Cambridge University Press, Cambridge, 1985.

13. R.A. Horn and C.R. Johnson, Topics in Matrix Analysis, Cambridge University Press, Cambridge, 1991.

14. A. Iserles and S.P. Nørsett, Order Stars, Chapman and Hall, London, 1991.

15. A. Iserles and S.P. Nørsett, On the solution of linear differential equations in Lie groups, Philosophical Trans. Royal Soc. A (1999), to appear.

16. W. Magnus, On the exponential solution of differential equations for a linear operator, Comm. Pure Appl. Maths VII (1954), 649-673.

17. C.B. Moler and C.F. Van Loan, Nineteen dubious ways to compute the exponential of a matrix, SIAM Review 20 (1978), 801-836.


18. H. Munthe-Kaas, Runge-Kutta methods on Lie groups, BIT, 38 (1998), 92-111.

19. E.D. Rainville, Special Functions, Macmillan, New York, 1960.

20. H. Tal-Ezer and R. Kosloff, An accurate and efficient scheme for propagating the time dependent Schrödinger equation, J. Chem. Phys. 81 (1984), 3967-3970.

21. V.S. Varadarajan, Lie Groups, Lie Algebras, and their Representations, Graduate Texts in Mathematics No. 102, Springer-Verlag, 1984.

22. A. Zanna, Collocation and relaxed collocation for the Fer and the Magnus expansions, Department of Applied Mathematics and Theoretical Physics, University of Cambridge, England, NA17, 1997.

Arieh Iserles
Department of Applied Mathematics and Theoretical Physics
University of Cambridge
Silver Street
Cambridge CB3 9EW
England
UNITED KINGDOM
[email protected]