Si Max 070477

Embed Size (px)

Citation preview

  • 7/30/2019 Si Max 070477

    1/15

    Copyright by SIAM. Unauthorized reproduction of this article is prohibited.

    SIAM J. MATRIX ANAL. APPL. c 2008 Society for Industrial and Applied MathematicsVol. 30, No. 2, pp. 762776

    PERTURBATION BOUNDS FOR DETERMINANTS AND

    CHARACTERISTIC POLYNOMIALS

    ILSE C. F. IPSEN AND RIZWANA REHMAN

    Abstract. We derive absolute p erturbation bounds for the coefficients of the characteristicpolynomial of a n n complex matrix. The bounds consist of elementary symmetric functions ofsingular values, and suggest that coefficients of normal matrices are better conditioned with regardto absolute perturbations than those of general matrices. When the matrix is Hermitian positive-definite, the bounds can be expressed in terms of the coefficients themselves. We also improve absoluteand relative perturbation bounds for determinants. The basis for all bounds is an expansion of thedeterminant of a perturbed diagonal matrix.

    Key words. elementary symmetric functions, singular values, eigenvalues, condition number,determinant

    AMS subject classifications. 65F40, 65F15, 65F35, 65Z05, 15A15, 15A18

    DOI. 10.1137/070704770

    1. Introduction. The characteristic polynomial of a n n complex matrix A isdefined as

    det(I A) = n + c1n1 + + cn1 + cn,

    where in particular cn = (1)n det(A) and c1 = trace(A).

    The coefficients of the characteristic polynomial of a complex matrix are of cen-tral importance in a quantum physics application. There they supply informationabout thermodynamic properties of fermionic systems, which arise, for instance, inthe study of structure and evolution of neutron stars. These thermodynamic quanti-ties are calculated from partition functions. It turns out that the partition functionZk for k noninteracting fermions is given by Zk = (1)kck, where the matrix A is

    a function of the particle Hamiltonian operator [9]. Partition functions for systemsof interacting fermions require repeated calculation of noninteracting partition func-tions. The matrices A in these problems have fairly small dimension (n 1000) andno discernible structure.

    In order to assess the stability of numerical methods for computing the charac-teristic polynomial, though, we first need to know the conditioning of the ck and theirsensitivity to perturbations in the matrix A. To this end, we derive perturbationbounds for absolute normwise perturbations.

    Main results. The main idea behind our perturbation bounds is a determi-nant expansion of a perturbed diagonal matrix (Theorem 2.3). The expansion canbe extended to any square matrix via the SVD (Corollary 2.4). The resulting abso-lute perturbation bounds contain elementary symmetric functions of singular values.Below we present weaker, first-order versions of these bounds.

    Let 1 n 0 be the singular values of A, and let A + E be a n ncomplex matrix with E2 < 1 and characteristic polynomial det(I (A + E)) =

    Received by the editors October 8, 2007; accepted for publication (in revised form) by A. From-mer April 3, 2008; published electronically July 2, 2008.

    http://www.siam.org/journals/simax/30-2/70477.htmlDepartment of Mathematics, North Carolina State University, Raleigh, NC 27695-8205 (ipsen@

    ncsu.edu, http://www4.ncsu.edu/ipsen/, [email protected]).

    762

  • 7/30/2019 Si Max 070477

    2/15

    Copyright by SIAM. Unauthorized reproduction of this article is prohibited.

    BOUNDS FOR CHARACTERISTIC POLYNOMIALS 763

    n + c1n1 + + cn1 + cn. The extreme coefficients c1 and cn have the simplest

    bounds. The linearity of the trace implies

    |c1 c1| = | trace(A + E) trace(A)| = | trace(E)| nE2,

    so that coefficient c1 is well conditioned with regard to absolute perturbations if thematrix order n is not too large. The determinant satisfies to first order (Remark 2.9)

    |cn cn| = | det(A) det(A + E)| sn1E2 + O(E22),

    where sn1 is the (n 1)st elementary symmetric function in the singular values andhas the upper bound sn1 n1 . . . n1.

    The remaining coefficients ck satisfy to first order (Remark 3.4)

    |ck ck|

    n

    k

    s(k)k1E2 + O(E

    22), 2 k n,

    where s(k)k1 is the (k 1)st elementary symmetric function in the k largest singular

    values, and s(k)k1 k1 . . . k1. However, if the matrix is normal or Hermitian, then

    the bound improves to (Remark 3.6)|ck ck| (n k + 1)sk1E2 + O(E

    22), 2 k n,

    where sk1 is the (k 1)st function in all singular values. Also, since A is normal,i = |i|, where i are the eigenvalues. Since the binomial term

    nk

    is reduced to

    n k + 1, the coefficients of a normal matrix are likely to be better conditioned thanthose of a general matrix.

    When the matrix is Hermitian positive-definite, eigenvalues are equal to singularvalues, and the above bound can be written as (Corollary 3.7)

    |ck ck| (n k + 1)|ck1|E2 + O(E22), 2 k n.

    As a result, ck is well conditioned in the absolute sense if the magnitude of thepreceding coefficient |ck1| is not too large.

    Overview. Section 2 deals with determinants. We first derive expansions fordeterminants (section 2.1), and from them absolute perturbation bounds in termsof elementary symmetric functions of singular values (section 2.2), as well as rela-tive bounds for determinants (section 2.3), and local sensitivity results (section 2.4).Section 3 deals with coefficients ck of the characteristic polynomial. We derive ab-solute perturbation bounds for general matrices (section 3.1) and normal matrices(section 3.2), as well as normwise bounds (section 3.3).

    Notation. The matrix A is a n n complex matrix with singular values 1 n 0, and eigenvalues i, labelled so that |1| |n|. The two-norm isA2 = 1, and A is the conjugate transpose ofA. The matrix I = diag

    1 . . . 1

    is the identity matrix, with columns ei, i 1. We denote by Ai the principal sub-matrix of order n 1 that is obtained by removing row and column i of A, and by

    Ai1...ik the principal submatrix of order nk, obtained by removing rows and columnsi1 . . . ik.

    2. Determinants. We derive expansions and perturbation bounds for determi-nants. We start with expansions for determinants of perturbed matrices (section 2.1),and from them derive absolute perturbation bounds in terms of elementary symmetricfunctions of singular values (section 2.2), as well as relative bounds for determinants(section 2.3), and local sensitivity results (section 2.4).

  • 7/30/2019 Si Max 070477

    3/15

    Copyright by SIAM. Unauthorized reproduction of this article is prohibited.

    764 ILSE C. F. IPSEN AND RIZWANA REHMAN

    2.1. Expansions. We derive expansions for determinants of perturbed matricesin several steps, by considering perturbations that have only a single nonzero diagonalelement (Lemma 2.1), perturbations of diagonal matrices (Theorem 2.3), and at lastperturbations of general matrices (Corollary 2.4).

    Lemma 2.1. Let A be a n n complex matrix, a scalar, and Ai the principalsubmatrix of order n 1 obtained by deleting row and column i of A.

    If B = A + eiei , then det(B) = det(A) + det(Ai), 1 i n.

    Proof. This follows from a cofactor expansion [8, Theorem 2.3.1] along row i orcolumn i of B.

    The above expansion can be used to expand the determinant of a perturbed diag-onal matrix. Before deriving this expansion, we motivate its expression on matricesof order 2 and 3.

    Example 2.2. If

    D =

    1

    2

    , F =

    f11 f12f21 f22

    ,

    then det(D + F) = det(D) + det(F) + S1

    , where S1

    1

    f22

    + 2

    f11

    .If

    D =

    1 2

    3

    , F =

    f11 f12 f13f21 f22 f23

    f31 f32 f33

    ,

    then det(D + F) = det(D) + det(F) + S1 + S2, where

    S1 1 det

    f22 f23f32 f33

    + 2 det

    f11 f13f31 f33

    + 3 det

    f11 f12f21 f22

    ,

    and S2 12f33 + 13f22 + 23f11.These examples illustrate that the expansion of det(D + F) can be written as

    a sum, where each term consists of a product of k diagonal elements of D and thedeterminant of the complementary submatrix of order n k of F.

    To derive expansions for diagonal matrices of any order, we denote by Fi1...ik theprincipal submatrix of order n k obtained by deleting rows and columns i1 . . . ik ofthe n n matrix F.

    Theorem 2.3 (expansion for diagonal matrices). Let D and F ben n complexmatrices. If D = diag

    1 . . . n

    , then

    det(D + F) = det(D) + det(F) + S1 + + Sn1,

    where

    Sk

    1i1

  • 7/30/2019 Si Max 070477

    4/15

    Copyright by SIAM. Unauthorized reproduction of this article is prohibited.

    BOUNDS FOR CHARACTERISTIC POLYNOMIALS 765

    Proof. The proof is by induction over the matrix order n, and Example 2.2represents the induction basis. Assuming the statement is true for matrices of ordern 1, we show that it is also true for matrices of order n. Let

    D

    (j)

    diag

    0 . . . 0 j+1 . . . n

    be a diagonal matrix of order n with j leading zeros. Applying Lemma 2.1 to A D(1) + F and B A + 1e1e

    1 gives

    det(D + F) = 1 det(D1 + F1) + det(D(1) + F).

    We repeat this process on the second summand det(D(1) + F) to remove the diagonalelements j one by one; j 2. To this end, we apply Lemma 2.1 to A D

    (j) + Fand B A + jeje

    j , and denote by (D

    (j))j the matrix of order n 1 obtained by

    removing row and column j from D(j). This gives

    det(D(1) + F) =n1

    j=2

    j det(D(j))j + Fj+ n det(Fn) + det(F).

    Putting the above expression into the expansion for det(D + F) yields

    det(D + F) = det(F) + 1 det(D1 + F1) +n1j=2

    j det

    (D(j))j + Fj

    + n det(Fn).

    Since D1+ F1 and (D(j))j + Fj are matrices of order n 1, we can apply the induction

    hypothesis. To take advantage of the fact that the j 1 top diagonal elements of(D(j))j are zero, we define the following sums for matrices of order n 1,

    S(j)k

    j+1i1

  • 7/30/2019 Si Max 070477

    5/15

    Copyright by SIAM. Unauthorized reproduction of this article is prohibited.

    766 ILSE C. F. IPSEN AND RIZWANA REHMAN

    Corollary 2.4 (expansion for general matrices). LetA and E ben n complexmatrices, and F UEV. Then

    det(A + E) = det(A) + det(E) + S1 + + Sn1,

    where

    Sk det(U V)

    1i1

  • 7/30/2019 Si Max 070477

    6/15

    Copyright by SIAM. Unauthorized reproduction of this article is prohibited.

    BOUNDS FOR CHARACTERISTIC POLYNOMIALS 767

    Corollary 2.7 (general matrices). Let A and E be n n complex matrices.Then

    | det(A) det(A + E)| n

    i=1

    sniEi2.

    If rank(A) = r for some 1 r n 1, then

    | det(A + E)| Enr2

    ri=0

    sriEi2,

    where the sj are elementary symmetric functions in the r largest singular values ofA, 1 j r.

    The bounds hold with equality for E = U V with > 0, where A = UV is aSVD of A.

    Proof. Corollary 2.4 implies | det(A)det(A+E)| | det(E)|+ |S1|+ + |Sn1|.To bound |Sk| use the fact that | det(U V)| = 1 and i 0 to obtain

    |Sk| max1i1

  • 7/30/2019 Si Max 070477

    7/15

    Copyright by SIAM. Unauthorized reproduction of this article is prohibited.

    768 ILSE C. F. IPSEN AND RIZWANA REHMAN

    where sn1 n1 . . . n1. Hence we can view sn1 or n1 . . . n1 as first-ordercondition numbers for absolute perturbations in A.

    Example 2.10. The perturbation of a diagonally scaled Jordan block below illus-trates that the first-order bound in Remark 2.9 can hold with equality. Let

    A =

    0 1 0 . . . 0... 0 2

    . . ....

    .... . .

    . . . 00 0 n10 0 . . . . . . 0

    , E = ene1,

    where || 1 and i > 0, 1 i n 1. Then | det(A + E) det(A)| = 1 . . . n1.Since the singular values of A are 0 and i > 0, 1 i n 1, we obtain | det(A +E) det(A)| = sn1E2.

    Replacing the singular values in Corollary 2.7 by powers of A2 gives the simpler,but weaker bounds below.

    Corollary 2.11 (normwise bounds). Let A and E be n n complex matrices.Then

    | det(A + E) det(A)| n

    i=1

    n

    i

    Ani2 E

    i2

    = (A2 + E2)n An2 .

    If rank(A) = r for some 1 r n 1, then

    | det(A + E)| Enr2

    ri=0

    r

    i

    Ari2 E

    i2

    = Enr2 (A2 + E2)r.

    Proof. This follows from Corollary 2.7 and sni nniAni2 = niAni2 ,1 i n 1.

    A bound similar to the one in Corollary 2.11 was already derived in [1, section 20],[2, Problem I.6.11], [3, Theorem 4.7] for any p-norm, by taking Frechet derivatives ofwedge products. Below we give a basic proof from first principles for the two-norm.

    Theorem 2.12 (section 20 in [1], problem I.6.11 in [2], Theorem 4.7 in [3]). LetA and E be n n complex matrices. Then

    | det(A + E) det(A)| nE2 max{A2, A + E2}n1.

    Proof. We first show the statement for a diagonal matrix. That is, if D =diag

    1 . . . n

    is diagonal, then

    det(D + F) = det(D) + z, where |z| nF2 max{D2, D + F2}n1.

    The proof is by induction. For n = 2

    D =

    1

    2

    , F =

    f11 f12f21 f22

    ,

    and

    z det(D + F) det(D) = 1f22 + det

    f11 f12f21 2 + f22

    .

  • 7/30/2019 Si Max 070477

    8/15

    Copyright by SIAM. Unauthorized reproduction of this article is prohibited.

    BOUNDS FOR CHARACTERISTIC POLYNOMIALS 769

    Lemma 2.5 implies

    |z| F2D2 +

    f11f21

    2

    f12

    2 + f22

    2

    F2D2 + F2D + F2

    2F2 max{D2, D + F2}.

    This completes the induction basis. Assuming the statement is true for matrices oforder n 1, we show that it is also true for matrices of order n. As in the proof ofTheorem 2.3, let D(1) diag

    0 2 . . . n

    be the matrix obtained from D by

    replacing 1 with 0, and apply Lemma 2.1 to conclude

    det(D + F) = 1 det(D1 + F1) + det(D(1) + F).

    Since D1 + F1 is a matrix of order n 1, the induction hypothesis applies and givesdet(D1 + F1) = det(D1) + z1, where

    |z1| (n 1)F12 max{D12, D1 + F12}n2

    (n 1)F2 max{D2, D + F2}n2

    .

    Substitute the above expression into the expansion for det(D + F) to obtain

    z det(D + F) det(D) = 1z1 + det(D(1) + F),

    where |1z1| (n 1)F2 max{D2, D + F2}n1. Applying Lemma 2.5 to

    det(D(1) + F) yields

    det(D(1) + F) F e12

    ni=2

    (D + F)ei2 F2D + Fn12 .

    Therefore we have proved the theorem for diagonal matrices D.

    To prove the theorem for general matrices A, let A = UV

    be a SVD of A.Then det(A + E) = det(U V) det( + F), where F UEV. Since is diagonal,det( + F) = det() + z, where

    |z| nF2 max{2, + F2}n1 = nE2 max{A2, A + E2}

    n1.

    Hence det(A + E) det(A) = det(U V)z, and the result follows from | det(U V)|= 1.

    2.3. Relative perturbation bounds. We derive expansions for relative pertur-bations of determinants, as well as relative perturbation bounds that improve existingbounds.

    Theorem 2.13 (expansion). Let A and E be n n complex matrices. If A isnonsingular, then

    det(A + E) det(A)

    det(A)= det(A1E) + S1 + + Sn1,

    where

    Sk

    1i1

  • 7/30/2019 Si Max 070477

    9/15

    Copyright by SIAM. Unauthorized reproduction of this article is prohibited.

    770 ILSE C. F. IPSEN AND RIZWANA REHMAN

    Proof. Write det(A + E) = det(A)det(I + A1E) and apply Theorem 2.3 todet(I+ A1E).

    Corollary 2.14 (relative perturbation bound). Let A and E be n n complexmatrices. If A is nonsingular, then

    | det(A + E) det(A)|

    | det(A)|

    E2A2

    + 1

    n

    1,

    where A2A12.

    Proof. Apply Corollary 2.7 to

    | det(A + E) det(A)|

    | det(A)|= | det(I+ A1E) det(I)|,

    and bound A1E2 E2/A2.Remark 2.15. Corollary 2.14 is more general and tighter than the following bound

    from [4, (1.6)], [5, Problem 14.15]:

    | det(A + E) det(A)|

    | det(A)|

    nE2/A21 nE2/A2 ,

    which holds only for nE2/A2 < 1. This is true because of the following. Withq A12E2 = E2/A2 we can write the first term in the bound of Corol-lary 2.14 as

    (q+ 1)n =n

    i=0

    n

    i

    qi

    ni=0

    niqi i=0

    (nq)i.

    If nq < 1, then

    i=0(nq)i = 11nq , so that

    (q+ 1)n 1 1

    1 nq 1 =

    nq

    1 nq.

    This implies for the bound in Corollary 2.14

    E2A2

    + 1

    n 1

    n||E2/A21 nE2/A2

    ,

    where the last expression is the bound in [4, inequality (1.6)], [5, Problem 14.15].

    2.4. Local sensitivity. We derive a local condition number for determinantsfrom directional derivatives. The directional derivative for det(A) in the direction E

    is dk

    dxkdet(A + xE).

    Although we derive the expressions below from the expansion in Theorem 2.3, wecould have also used the expression for derivatives of A(x) in [7, equation (6.5.9)].

    Theorem 2.16. Let A and E be n n complex matrices, F UEV, and x areal scalar. Then

    det(A + xE) =n

    i=1

    Snixi + det(A),

    where

    S0 det(E), Sk det(U V)

    1i1

  • 7/30/2019 Si Max 070477

    10/15

    Copyright by SIAM. Unauthorized reproduction of this article is prohibited.

    BOUNDS FOR CHARACTERISTIC POLYNOMIALS 771

    and

    dk

    dxkdet(A + xE)|x=0 = k!Snk, 1 k n.

    Proof. If D = diag

    1 . . . n

    is a diagonal matrix, then Theorem 2.3 impliesdet(D+xF) = det(xF)+ S1+ +Sn1+det(D), where det(xF) = xn det(F) = xnS0and

    Sk =

    1i1

  • 7/30/2019 Si Max 070477

    11/15

    Copyright by SIAM. Unauthorized reproduction of this article is prohibited.

    772 ILSE C. F. IPSEN AND RIZWANA REHMAN

    The following example illustrates that products of singular values play an impor-tant role in the conditioning of the coefficients ck.

    Example 3.1 (companion matrices). The n n matrix

    A =

    1 2 . . . . . . n 0 . . . . . . 0

    0 . . .

    ......

    . . .. . .

    . . ....

    0 . . . 0 0

    , > 0,

    is a multiple of a companion matrix, and let E = e1

    . . .

    with > 0. Therespective coefficients of the characteristic polynomials of A and A + E are [5, section28.6]

    ci = ii1, ci = (i + )

    i1, 1 i n.

    Then |ci ci| =

    i1

    , 1 i n. The singular values of A are [5, section 28.6]

    21 =1

    2

    +

    2 4|n|2

    , 2n =

    1

    2

    2 4|n|2

    ,

    where 1+|1|2+ +|n|2, and i = , 2 i n1. Therefore the conditioning

    of the coefficients ck is determined by products of singular values.The products of singular values in our perturbation bounds are expressed in terms

    of elementary symmetric functions of only the largest singular values ofA.Definition 3.2 (elementary symmetric functions in the largest singular values).

    Let A be a n n matrix with singular values 1 . . . n. Denote by

    s(k)0 1, s

    (k)j

    1i1

  • 7/30/2019 Si Max 070477

    12/15

    Copyright by SIAM. Unauthorized reproduction of this article is prohibited.

    BOUNDS FOR CHARACTERISTIC POLYNOMIALS 773

    the matrices Ai1...ik + Ei1...ik are of order n k. Fix the indices i1, . . . , ik; set B Ai1...ik and F Ei1...ik ; and let 1 . . . nk be the singular values of B.Corollary 2.4 implies det(B + F) = det(B) + det(F) + S1 + + Snk1, where

    Sj

    = 1i1

  • 7/30/2019 Si Max 070477

    13/15

    Copyright by SIAM. Unauthorized reproduction of this article is prohibited.

    774 ILSE C. F. IPSEN AND RIZWANA REHMAN

    The bound holds with equality if E = I with > 0.Proof. Since A is normal, it has an eigenvalue decomposition A = VV, where

    = diag

    1 n

    is complex diagonal, |1| . . . |n|, and V is unitary.Set D I and F VEV , so that det(I (A + E)) = det(D + F).

    Theorem 2.3 implies det(D + F) = det(D) + det(F) + S1 + + Sn1. Substitutingdet(D) = n +

    nk=1 ck

    nk and det(D + F) = n +n

    k=1 cknk in the above

    expansion gives

    nk=1

    (ck ck)nk = det(F) + S1 + + Sn1.

    Thus ck ck is equal to the coefficient ofnk on the right-hand side, i.e., in det(F)+

    S1 + + Sn1. Since

    Snj

    1i1

  • 7/30/2019 Si Max 070477

    14/15

    Copyright by SIAM. Unauthorized reproduction of this article is prohibited.

    BOUNDS FOR CHARACTERISTIC POLYNOMIALS 775

    Proof. The coefficients ck are also elementary symmetric functions in the eigenval-ues [6, section 1.2], and the eigenvalue of a Hermitian positive-definite is equal to thesingular values. Thus ck = (1)

    ksk, and the result follows from Theorem 3.5.To first order, the conditioning of coefficient ck is determined by the magnitude of

    the preceding coefficient, |ck1|. As in Corollary 2.8, the matrix A+E in Corollary 3.7does not have to be Hermitian positive-definite, because E can be arbitrary. Below weillustrate that one cannot use the expression in Corollary 3.7 for indefinite matrices;that is, positive-definiteness of A is crucial for the expression in Corollary 3.7.

    Example 3.8. Corollary 3.7 is not valid for indefinite Hermitian matrices and inparticular matrices with zero trace.

    To see this, let

    A =

    , A =

    +

    ,

    where > 0 and > 0. The characteristic polynomials are

    det(I A) = 2 2, (I (A + E)) = 2 ( )2,

    so that c2 c2 = 2 2. However, |c2 c2| cannot be bounded in terms of c1, as

    required by Corollary 3.7, because c1 = 0.

    3.3. Normwise bounds. Replacing the singular value products by powers ofA2 gives the following simpler, but weaker bounds.

    Corollary 3.9 (normwise bounds). Let A and E be n n complex matrices.Then

    |ck ck| k

    n

    k

    ki=1

    k

    i

    Aki2 E

    i2,

    =

    n

    k

    (A2 + E2)k Ak

    , 1 k n.

    If rank(A) = r for some 1 r n 1, then

    |ck ck| k

    n

    k

    Ekr2

    ri=1

    k

    i

    Ari2 E

    i2,

    =

    n

    k

    Ekr2 ((A2 + E2)

    r Ar) , r + 1 k n.

    Proof. This follows from Theorem 3.3 and

    s(k)ki

    k

    k i

    Aki2 =

    k

    i

    Aki2 , 1 i k 1.

    A similar bound was was already derived in [1, section 20] and [2, Problem I.6.11]for any p-norm, by taking Frechet derivatives of wedge products. Below we give abasic proof from first principles for the two-norm.

    Theorem 3.10 (section 20 in [1], problem I.6.11 in [2]). Let A and E be n ncomplex matrices. Then

    |ck ck| k

    n

    k

    E2 max{A2, A + E2}

    k1, 1 k n.

  • 7/30/2019 Si Max 070477

    15/15

    776 ILSE C. F. IPSEN AND RIZWANA REHMAN

    Proof. As in the proof of Theorem 3.3, we use

    cnk = (1)nk

    1i1