The University of Manchester - Research Matters Nick ...higham/talks/fun17.pdfNC State, Tuesday April 25, 2017 Multivalued troubleLogarithmInverse Trig/HypAlgorithms Outline 1 The

Research Matters

February 25, 2009

Nick HighamDirector of Research

School of Mathematics

1 / 6

Challenges inMultivalued Matrix Functions

Nick HighamSchool of Mathematics

The University of Manchester

http://www.maths.manchester.ac.uk/~higham@nhigham, nickhigham.wordpress.com

Department of MathematicsNC State, Tuesday April 25, 2017

http://www.maths.manchester.ac.uk/~higham/

http://www.maths.manchester.ac.uk

http://www.man.ac.uk

http://www.maths.manchester.ac.uk/~higham/

https://twitter.com/nhigham

http://nickhigham.wordpress.com

Multivalued trouble Logarithm Inverse Trig/Hyp Algorithms

Outline

1 The Trouble with MultivaluedFunctions

2 The Matrix Logarithm and MatrixUnwinding Function

3 Matrix Inverse Trigonometric &Hyperbolic Functions

4 Algorithms

Nick Higham Matrix Functions 3 / 52


Multivalued (Inverse) Functions

log, acos, asin, acosh, asinh, . . .Branch cuts help define connected domain where fanalytic.Location of branch cuts is otherwise arbitrary.Define principal values (distinguished branches),including values on the branch cuts.



Issues for Numerical Computation

Branch cuts and choices of branch must be consistentbetween different inverse functions.Choices must be clearly documented.Precisely when do identities hold?

For matrices:

Existence.Uniqueness of principal values.Validity of identities: more than just the scalar case.Computation.



Issues for Numerical Computation

Branch cuts and choices of branch must be consistentbetween different inverse functions.Choices must be clearly documented.Precisely when do identities hold?

For matrices:

Existence.Uniqueness of principal values.Validity of identities: more than just the scalar case.Computation.



Quotes (1)

Corless et al. (2000):

Definitions of the elementary functionsare given in many textbooks andmathematical tables . . .often require a great deal of commonsense to interpret them, or . . .are blatantly self-inconsistent



Quotes (2)Penfield (1981):

One cannot find in the mathematics or computer-scienceliterature a definitive value for the principal value of thearcsin of 3.

Kahan (1987):

Principal Values have too often beenleft ambiguous on the slits, causingconfusion and controversy . . .

Comparing various definitions, andchoosing among them, is a tediousbusiness prone to error.



The Scalar Logarithm: Comm. ACM

J. R. Herndon (1961). Algorithm 48: Logarithm of acomplex number.

A. P. Relph (1962). Certification of Algorithm 48:Logarithm of a complex number.

M. L. Johnson and W. Sangren, W. (1962). Remark onAlgorithm 48: Logarithm of a complex number.

D. S. Collens (1964). Remark on remarks onAlgorithm 48: Logarithm of a complex number.

D. S. Collens (1964). Algorithm 243: Logarithm of acomplex number: Rewrite of Algorithm 48.


































One of my choices for Five Books.Winner of MAA Euler Book Prize, 2017.

http://fivebooks.com/interview/nicholas-higham-applied-mathematics/

https://nickhigham.wordpress.com/2016/05/02/five-books-on-applied-mathematics/


Logarithm of Product, Complex Case

Goes on to say must take appropriate branch for eachoccurrence of log.



Logarithm of Product, Complex Case

Goes on to say must take appropriate branch for eachoccurrence of log.



NIST Handbook (Olver et al. 2010)

Arcsin, Arccos are “general values” of inverse sine, inversecosine.


HP-15C Handbook (1982 and 1986)

acos formula “quite wrong” (Kahan, 1987)

HP-15C Handbook (1982 and 1986)

acos formula “quite wrong” (Kahan, 1987)


MATLAB and Symbolic Math Toolbox

In R2016a:>> z = -4; acosh(z), double(acosh(sym(z)))ans =

2.0634e+00 + 3.1416e+00ians =-2.0634e+00 + 3.1416e+00i

Fixed in R2016b.



Identities in Complex Variables

(1− z)1/2(1 + z)1/2 = (1− z2)1/2 for all z ,

(z − 1)1/2(z + 1)1/2 = (z2 − 1)1/2 for arg z ∈ (−π/2, π/2] .

Need to be very careful in simplifying expressions involvingmultivalued functions.

When is log ez = z?When is acos(cos z) = z?




(1− z)1/2(1 + z)1/2 = (1− z2)1/2 for all z ,

(z − 1)1/2(z + 1)1/2 = (z2 − 1)1/2 for arg z ∈ (−π/2, π/2] .






(1− z)1/2(1 + z)1/2 = (1− z2)1/2 for all z ,

(z − 1)1/2(z + 1)1/2 = (z2 − 1)1/2 for arg z ∈ (−π/2, π/2] .





Outline




4 Algorithms



Logarithm and Square Root

X is a logarithm of A ∈ Cn×n if eX = A. Write X = log A.

Branch cut (−∞,0].

Principal logarithm, log A: Imλ(log A) ∈ (−π, π].

Principal square root: X 2 = A and Reλ(X ) ≥ 0,(−r)1/2 = r 1/2i for r ≥ 0.



The Lambert W Function

Wk(a), k ∈ Z: solutions of wew = a.

−6π −4π −2π 0 2π 4π 6π

−6π

−4π

−2π

0

2π

4π

6π

W0

W1

W2

W−1

W−2

Fasi, H & Iannazzo:An Algorithm for theMatrix Lambert WFunction, SIMAX(2015).



Unwinding Number

Definition

U(z) = z − log ez

2πi.

Note:z = log ez + 2π iU(z).

Corless, Hare & Jeffrey (1996);Apostol (1974): special case.

ISO standard typesetting!


https://nickhigham.wordpress.com/2016/01/28/typesetting-mathematics-according-to-the-iso-standard/


Unwinding Number

Definition

U(z) = z − log ez

2πi.

Note:z = log ez + 2π iU(z).

Corless, Hare & Jeffrey (1996);Apostol (1974): special case.

ISO standard typesetting!


https://nickhigham.wordpress.com/2016/01/28/typesetting-mathematics-according-to-the-iso-standard/


Unwinding number is an integer

U(z) = z − log ez

2πi=

⌈Im z − π

2π

⌉.

When is log(ez) = z?U(z) = 0 iff Im z ∈ (−π, π].

Riemann surface oflogarithm



Matrix Unwinding FunctionH & Aprahamian (2014):

U(A) = A− log eA

2πi, A ∈ Cn×n.

Jordan formFor Z−1AZ = diag(Jk(λk)),

U(A) = Zdiag(U(λk)Imk )Z−1.

U(A) is diagonalizable.U(A) has integer ei’vals.U(A) is pure imaginary if A is real.



Matrix Unwinding FunctionH & Aprahamian (2014):

U(A) = A− log eA

2πi, A ∈ Cn×n.

Jordan formFor Z−1AZ = diag(Jk(λk)),

U(A) = Zdiag(U(λk)Imk )Z−1.

U(A) is diagonalizable.U(A) has integer ei’vals.U(A) is pure imaginary if A is real.



Logarithm of Matrix Product

Theorem (Aprahamian & H, 2014)

Let A,B ∈ Cn×n be nonsingular and AB = BA. Then

log(AB) = log A + log B − 2πiU(log A + log B).



Power of a Product

Theorem

Let A,B ∈ Cn×n be nonsingular and AB = BA. For α ∈ C,

(AB)α = AαBαe−2παiU(log A+log B).

Proof.

(AB)α = eα log(AB)

= eα(log A+log B−2πiU(log A+log B))

= AαBαe−2απiU(log A+log B).



The Square Root Relation Explained

(1− z2)1/2 = (1− z)1/2(1 + z)1/2 (−1)U(log(1−z)+log(1+z)) .

Wlog, Im z ≥ 0. Then

0 ≤arg(1 + z) ≤ π,

−π ≤arg(1− z) ≤ 0,

so

Im[log(1− z) + log(1 + z)

]∈ (−π, π]

⇒ U[log(1− z) + log(1 + z)

]= 0.

This is not true for z − 1 and z + 1!



Outline




4 Algorithms



Early Definition

W. H. Metzler, On the Roots of Matrices (1892):

“Proves” sin(asinA) = A.



Application (1)

American Journal of Agricultural Economics, 1977



Application (2)

In a 1954 paper on the energy equation of afree-electron model :



Toolbox of Matrix Functions

d2ydt2 + Ay = 0, y(0) = y0, y ′(0) = y ′0

has solution

y(t) = cos(√

At)y0 +(√

A)−1 sin(

√At)y ′0.

But [y ′

y

]= exp

([0 −tA

t In 0

])[y ′0y0

].

In software want to be able evaluate interesting f atmatrix args as well as scalar args.




d2ydt2 + Ay = 0, y(0) = y0, y ′(0) = y ′0

has solution

y(t) = cos(√

At)y0 +(√

A)−1 sin(

√At)y ′0.

But [y ′

y

]= exp

([0 −tA

t In 0

])[y ′0y0

].





d2ydt2 + Ay = 0, y(0) = y0, y ′(0) = y ′0

has solution

y(t) = cos(√

At)y0 +(√

A)−1 sin(

√At)y ′0.

But [y ′

y

]= exp

([0 −tA

t In 0

])[y ′0y0

].




Existing Software

MATLAB has built-in expm, logm, sqrtm, funm.We have written cosm, sinm, signm, powerm,lambertwm, unwindm, . . .Julia has expm, logm, sqrtm.NAG Library has 42+ f (A) codes.

H & Deadman, A Catalogue of Software for MatrixFunctions (2016).


http://eprints.ma.man.ac.uk/2450

http://eprints.ma.man.ac.uk/2450

https://nickhigham.wordpress.com/2016/03/10/updated-catalogue-of-software-for-matrix-functions/


Definitions

cos X =eiX + e−iX

2, sin X =

eiX − e−iX

2i,

cosh X = cos iX , sinh X = −i sin iX .

Concentrate on inverse cosine.Analogous results for inverse sine, inverse hyperboliccosine, inverse hyperbolic sine.

Joint work with Mary Aprahamian.



Inverse Cosine

A = cos X =eiX + e−iX

2,

implies(eiX − A)2 = A2 − I

oreiX = A +

√A2 − I.

But not all square roots give a solution!

Theorem

cos X = A has a solution if and only if A2 − I has a squareroot. All solutions are of the form X = −i Log(A +

√A2 − I)

for some square root and logarithm.



Inverse Cosine

A = cos X =eiX + e−iX

2,

implies(eiX − A)2 = A2 − I

oreiX = A +

√A2 − I.

But not all square roots give a solution!

Theorem

cos X = A has a solution if and only if A2 − I has a squareroot. All solutions are of the form X = −i Log(A +

√A2 − I)

for some square root and logarithm.



Problems

Pólya & Szegö, Problems and Theorems in Analysis II(1998): do

sin X =

[1 10 1

], sin X =

[−1 −10 −1

]have a solution?

Putnam Problem 1996-B4: does

sin X =

[1 19960 1

]have a solution?



Principal Inverse Cosine

acos

−1 1 0 π

Ω1 Ω2

acosΩ1

acosΩ2



Existence and Uniqueness

Theorem

If A ∈ Cn×n has no ei’vals ±1 there is a unique principalinverse cosine acosA, and it is a primary matrix function.



Log Formula

Lemma

If A ∈ Cn×n has no ei’vals ±1,

acosA = −i log(A + i(I − A2)1/2)

= −2i log

((I + A

2

)1/2

+ i(

I − A2

)1/2).

MATLAB docs define acos by first expression for scalars.Not obvious that RHS satisfies the conditions for acos!

Is the lemma an immediate consequence of the scalarcase?



Log Formula

Lemma

If A ∈ Cn×n has no ei’vals ±1,


= −2i log

((I + A

2

)1/2

+ i(

I − A2

)1/2).

MATLAB docs define acos by first expression for scalars.Not obvious that RHS satisfies the conditions for acos!

Is the lemma an immediate consequence of the scalarcase?



When Scalar Identity Implies Matrix Identity

Theorem (Horn & Johnson, 1991)

If f ∈ Cn−1 then f (A) = 0 for all A ∈ Cn×n iff f (z) = 0 for allz ∈ C.

cos2 A + sin2 A = I√

Not applicable for identities involving branch cuts.



Principal Inverse Coshine

acosh

−1 1 0

iπ

Ω1 Ω3

acoshΩ1

acoshΩ3

−iπ



acos and acosh

Abramowitz & Stegun: acosh z = ±i acos z .

Theorem

If A ∈ Cn×n has no ei’val −1 or on (0,1] then

acoshA = i sign(−iA) acosA.

Sign is based on the scalar map

z → sign(Re z) = ±1,sign(0) = 1,sign(y i) = sign(y) for 0 6= y ∈ R.



Roundtrip Relations

By definition, cos(acosA) = A.

What about acos(cos A) = A ?

Theorem

If A ∈ Cn×n has no ei’val with real part kπ, k ∈ Z, then

acos(cos A) =(A− 2πU(iA)

)sign

(A− 2πU(iA)

).

Corollary

acos(cosA) = A iff every e’val of A has real part in (0, π).



Roundtrip Relations



Theorem



)sign

(A− 2πU(iA)

).

Corollary




Roundtrip Relations



Theorem



)sign

(A− 2πU(iA)

).

Corollary




Outline




4 Algorithms



GNU Octave

thfm (“Trigonometric/hyperbolic functions of squarematrix”)

Does not return the principal value of acosh! Uses

acoshA = log(A + (A2 − I)1/2)

instead of

acoshA = log(A + (A− I)1/2(A + I)1/2).



Padé Approximation

Rational rkm(x) = pkm(x)/qkm(x) is a[k ,m] Padé approximant to f if pkm andqkm are polys of degree at most k and mand

f (x)− rkm(x) = O(xk+m+1).

Generally more efficient thantruncated Taylor series.Possible representations:

Ratio of polys.Continued fraction.Partial fraction.

Henri Padé1863–1953



Padé Approximation for acos

acos(I − A) = (2A)1/2∞∑

k=0

(2kk

)8k(2k + 1)

Ak , ρ(A) ≤ 2.

Use Padé approximants of f (x) = (2x)−1/2 acos(1− x).

Backward error hm(A) defined by

(2A)1/2rm(A) = acos(I − A + hm(A)

)satisfies

‖hm(A)‖‖A‖

≤∞∑`=0

|c`|‖A‖2m+`+1,

In fact, replace ‖A‖ by αp(A), where, with p(p−1) ≤ 2m+1,

αp(A) = max(‖Ap‖1/p, ‖Ap+1‖1/(p+1)) ≤ ‖A‖.




acos(I − A) = (2A)1/2∞∑

k=0

(2kk

)8k(2k + 1)

Ak , ρ(A) ≤ 2.

Use Padé approximants of f (x) = (2x)−1/2 acos(1− x).Backward error hm(A) defined by

(2A)1/2rm(A) = acos(I − A + hm(A)

)satisfies

‖hm(A)‖‖A‖

≤∞∑`=0

|c`|‖A‖2m+`+1,


αp(A) = max(‖Ap‖1/p, ‖Ap+1‖1/(p+1)) ≤ ‖A‖.




acos(I − A) = (2A)1/2∞∑

k=0

(2kk

)8k(2k + 1)

Ak , ρ(A) ≤ 2.

Use Padé approximants of f (x) = (2x)−1/2 acos(1− x).Backward error hm(A) defined by

(2A)1/2rm(A) = acos(I − A + hm(A)

)satisfies

‖hm(A)‖‖A‖

≤∞∑`=0

|c`|‖A‖2m+`+1,


αp(A) = max(‖Ap‖1/p, ‖Ap+1‖1/(p+1)) ≤ ‖A‖.



Reducing the Argument

Use acos X = 2 acos((1

2(I + X ))1/2)

to get argument near I.

Lemma

For any X0 ∈ Cn×n, the sequence defined by

Xk+1 =

(I + Xk

2

)1/2

satisfies limk→∞ Xk = I.

Not trivial to prove!Our proof: derive scalar result, then apply generalresult of Iannazzo (2007).



Choice of Padé Degree

Take enough square roots to get close to I.

Then balance cost of extra square roots with cost ofevaluating rm.

Use fact that

(I − Xk+1)(I + Xk+1) = I − X 2k+1 =

I − Xk

2

implies ‖I − Xk+1‖ ≈ ‖I − Xk‖/4.



Other Features

Initial Schur decomposition.

Compute square roots using Björck–Hammarling(1983) recurrence.

Use estimates of ‖Ak‖1 (alg of H & Tisseur (2000)).

Get the other functions from

asinA = (π/2)I − acosA,asinhA = i asin(−iA) = i

((π/2)I − acos(−iA)

),

acoshA = i sign(−iA) acosA.



How to Test an Algorithm for acosA?

Could check identities

Roundtrip relations.acosA + asinA = (π/2)I.sin(acosA) = (I − A2)1/2.

Deadman & H (2016) give relevant “fudge factors”.

Here, compute relative errors

‖X − X‖1

‖X‖1,

where X computed at high precision usingA = VDV−1 ⇒ f (A) = Vf (D)V−1 (AdvanPix Toolbox).



Comparison

Schur–Padé alg (Schur decomposition, square roots,Padé approximant).

The formula


computed with MATLAB logm and sqrtm, using asingle Schur decomposition. GNU Octave uses thisformula.



Experiment, n = 10For acos, compare new alg with log formula

2 4 6 8 10 12 14 16 18 20

10-15

10-10

10-5

100

Schur-Padé algorithm

Log formula



The Problem with the Log Formulas


A =

[0 b−b 0

], b = 1000, Λ(A) = ±1000i.

‖acosA− X‖1

‖acosA‖1≈

1.98× 10−9 log formula,3.68× 10−16 new Alg.

E’vals of A + i(I − A2)1/2 ≈ 5× 10−5 i,2000i.

Relative 1-norm condition number of acosA is 0.83.Instability of log formula.



Conclusions

First thorough treatment of inverse trigonometric andinverse hyperbolic matrix functions.

Existence and uniqueness results.Various scalar identities extended to matrix case.New roundtrip identities (new even in scalar case).New Schur–Padé algs—numerically stable.MATLAB codes on GitHub:https://github.com/higham/matrix-inv-trig-hyp

Mary Aprahamian & H (2016), Matrix InverseTrigonometric and Inverse Hyperbolic Functions:Theory and Algorithms, SIAM J. Matrix Anal. Appl.,37(4):1453–1477, 2016.


https://github.com/higham/matrix-inv-trig-hyp

https://doi.org/10.1137/16M1057577

https://doi.org/10.1137/16M1057577

https://doi.org/10.1137/16M1057577

https://nickhigham.wordpress.com/2016/02/02/principal-values-of-inverse-cosine-and-related-functions/


Future Directions

Theory and algs for non-primary functions, perhapslinked to an f (A(t)) application.Better understanding of conditioning of f (A).Matrix argument reduction.f (A)b problem.

ManchesterNLA group,Oct 2016


References I

A. H. Al-Mohy and N. J. Higham.Improved inverse scaling and squaring algorithms forthe matrix logarithm.SIAM J. Sci. Comput., 34(4):C153–C169, 2012.

T. M. Apostol.Mathematical Analysis.Addison-Wesley, Reading, MA, USA, second edition,1974.xvii+492 pp.


References II

M. Aprahamian and N. J. Higham.The matrix unwinding function, with an application tocomputing the matrix exponential.SIAM J. Matrix Anal. Appl., 35(1):88–109, 2014.

M. Aprahamian and N. J. Higham.Matrix inverse trigonometric and inverse hyperbolicfunctions: Theory and algorithms.SIAM J. Matrix Anal. Appl., 37(4):1453–1477, 2016.


References III

R. M. Corless, J. H. Davenport, D. J. Jeffrey, and S. M.Watt.“According to Abramowitz and Stegun” or arccothneedn’t be uncouth.ACM SIGSAM Bulletin, 34(2):58–65, 2000.

R. M. Corless and D. J. Jeffrey.The unwinding number.ACM SIGSAM Bulletin, 30(2):28–35, June 1996.

E. Deadman and N. J. Higham.Testing matrix function algorithms using identities.ACM Trans. Math. Software, 42(1):4:1–4:15, Jan. 2016.


References IV

M. Fasi, N. J. Higham, and B. Iannazzo.An algorithm for the matrix Lambert W function.SIAM J. Matrix Anal. Appl., 36(2):669–685, 2015.

N. J. Higham.Functions of Matrices: Theory and Computation.Society for Industrial and Applied Mathematics,Philadelphia, PA, USA, 2008.ISBN 978-0-898716-46-7.xx+425 pp.

N. J. Higham and A. H. Al-Mohy.Computing matrix functions.Acta Numerica, 19:159–208, 2010.


References V

N. J. Higham and E. Deadman.A catalogue of software for matrix functions. Version2.0.MIMS EPrint 2016.3, Manchester Institute forMathematical Sciences, The University of Manchester,UK, Jan. 2016.22 pp.Updated March 2016.

N. J. Higham and F. Tisseur.A block algorithm for matrix 1-norm estimation, with anapplication to 1-norm pseudospectra.SIAM J. Matrix Anal. Appl., 21(4):1185–1201, 2000.


References VI

R. A. Horn and C. R. Johnson.Topics in Matrix Analysis.Cambridge University Press, Cambridge, UK, 1991.ISBN 0-521-30587-X.viii+607 pp.

B. Iannazzo.Numerical Solution of Certain Nonlinear MatrixEquations.PhD thesis, Università degli Studi di Pisa, Pisa, Italy,2007.180 pp.


References VII

D. J. Jeffrey, D. E. G. Hare, and R. M. Corless.Unwinding the branches of the Lambert W function.Math. Scientist, 21:1–7, 1996.

W. Kahan.Branch cuts for complex elementary functions or muchado about nothing’s sign bit.In A. Iserles and M. J. D. Powell, editors, The State ofthe Art in Numerical Analysis, pages 165–211. OxfordUniversity Press, 1987.

W. H. Metzler.On the roots of matrices.Amer. J. Math., 14(4):326–377, 1892.


References VIII

Multiprecision Computing Toolbox.Advanpix, Tokyo.http://www.advanpix.com.

P. Penfield, Jr.Principal values and branch cuts in complex APL.SIGAPL APL Quote Quad, 12(1):248–256, Sept. 1981.


http://www.advanpix.com

References IX

G. Pólya and G. Szegö.Problems and Theorems in Analysis II. Theory ofFunctions. Zeros. Polynomials. Determinants. NumberTheory. Geometry.Springer-Verlag, New York, 1998.ISBN 3-540-63686-2.xi+392 pp.Reprint of the 1976 edition.

K. Ruedenberg.Free-electron network model for conjugated systems. V.Energies and electron distributions in the FE MO modeland in the LCAO MO model.J. Chem. Phys., 22(11):1878–1894, 1954.


References X

S. T. Yen and A. M. Jones.Household consumption of cheese: An inversehyperbolic sine double-hurdle model with dependenterrors.American Journal of Agricultural Economics, 79(1):246–251, 1997.


Documents

The University of Manchester - Research Matters Nick ...higham/talks/fun17.pdfNC State, Tuesday April 25, 2017 Multivalued troubleLogarithmInverse Trig/HypAlgorithms Outline 1 The