
Journal of Computational Acoustics, Vol. 14, No. 4 (2006) 397–414 © IMACS

A STOPPING RULE FOR THE CONJUGATE GRADIENT REGULARIZATION METHOD APPLIED TO INVERSE PROBLEMS IN ACOUSTICS

    THOMAS DELILLO

Department of Mathematics and Statistics, Wichita State University, Wichita, KS 67260

    [email protected]

    TOMASZ HRYCAK

Department of Mathematics, University of Vienna, Nordbergstrasse 15, A-1090 Wien, Austria

    [email protected]

Received 18 July 2004
Revised 28 November 2005

We present a novel parameter choice strategy for the conjugate gradient regularization algorithm which does not assume a priori information about the magnitude of the measurement error. Our approach is to regularize within the Krylov subspaces associated with the normal equations. We implement conjugate gradient via the Lanczos bidiagonalization process with reorthogonalization, and then we construct regularized solutions using the SVD of a bidiagonal projection constructed by the Lanczos process. We compare our method with the one proposed by Hanke and Raus and illustrate its performance with numerical experiments, including detection of acoustic boundary vibrations.

Keywords: Ill-posed problems; inverse problems in acoustics; iterative regularization; conjugate gradient; parameter choice strategies; Helmholtz equation.

    AMS Subject Classifications: 65F22, 65F10, 65J20, 65J22

    1. Introduction

Discretization of ill-posed problems, which commonly arise in practical applications, leads to linear systems with large condition numbers. Since the right-hand side is normally endowed with noise from measurements, a straightforward computation does not render an accurate solution. From an engineering point of view, this problem requires a regularization technique to prevent amplification of the high frequency modes in the noise. A vast number of possibilities is presented in Ref. 15. Any such method constructs a sequence of approximations, called regularized solutions.

Several regularization algorithms are described in Refs. [3, 12, 15, 19]. These include the truncated singular value decomposition (TSVD) and the conjugate gradient method applied to the normal equations (CGNE). In the case of the truncated SVD, the regularization parameter is the number of singular vectors retained in the reconstruction. For the conjugate gradient, the parameter is the number of iterations. Specific applications in acoustics are considered in Ref. 23.

It is well known that the regularized solutions x_m of the linear system Ax = b initially approach the exact solution, and then drift away from it. In the realm of iterative techniques, according to the Morozov discrepancy principle, making the residual much smaller than the error leads to a deteriorated solution. It is thus paramount to determine an appropriate regularization parameter m.

There have been a number of attempts to find the optimal value of the regularization parameter for various regularization techniques; see Ref. 15 for a survey of those results. As observed in Ref. 18, in the case of the MINRES iterative algorithm, the optimal number of iterations is quite small and the classical asymptotic bounds for the error do not apply. A similar phenomenon takes place for the conjugate gradient, which makes the design of a stopping rule difficult.

In this paper, we present a heuristic rule for choosing a regularization parameter which does not assume any a priori knowledge about the magnitude of the error. We focus our attention on the method of conjugate gradient applied to the normal equations. Our interest in conjugate gradient is motivated by several important applications, including some inverse problems in acoustics (see Refs. 5–7), where the (complex) eigenvalues of the matrix A in question are scattered around the origin. On the other hand, the singular values are clustered, which foretells fast convergence if the normal equations are solved instead. One example of this phenomenon occurs when A is the single (or double) layer potential for the Helmholtz equation on two concentric spheres, see Ref. 6. In this case, the eigenspaces of A^*A are spanned by the spherical harmonics and have dimensions 1, 3, 5, ….

Given the effectiveness of the approximations from the Krylov subspaces K_n of A^*A, we attempt to find a regularized solution in K_n. Our method may be categorized according to the framework of Kilmer and O'Leary (Ref. 17), where approximate solutions are obtained by projection on Krylov subspaces followed by regularization. We implement conjugate gradient via the Lanczos bidiagonalization process with reorthogonalization. We then use the SVD of a bidiagonal projection generated by the Lanczos process to construct a new basis for K_n, which plays the role of the right singular vectors of A. We then apply the usual paradigm that the coefficients in this new basis decrease until the noise level is reached, which gives rise to a stopping strategy. This combination of the truncated SVD and conjugate gradient in the environment described above should be more effective than the traditional truncated SVD. This is because the singular vectors of A are constructed without any information about the right-hand side and, therefore, tend to provide a worse approximation to the solution; see similar observations in Ref. 11, Sec. 2, p. 1012.

We would like to emphasize that our results rely in a crucial way on reorthogonalization of the residual vectors of the normal equations, which is needed to eliminate "ghost" singular values.


While this method can be used on several problems, it is particularly effective for large-scale ones, where the singular values are well-grouped and noise levels are high (about 1–5%), as is the case for many inverse problems in acoustics. It has been extensively tested on realistic geometries in Refs. 4 and 7; here we illustrate it with simpler numerical experiments, including inverse problems of detection of acoustic boundary vibrations in two and three dimensions. We also include a comparison with a method proposed by Hanke and Raus in Ref. 13 and with the truncated SVD regularization.

The paper is organized as follows. Section 2 recapitulates fundamental results about the conjugate gradient and the truncated SVD methods. Section 3 describes our algorithm for selecting a regularization parameter. Section 4 presents the results of numerical experiments, including an inverse problem of detection of acoustic boundary vibrations. The last section contains our conclusions and suggestions for future work.

    2. Preliminaries

We briefly describe two classical regularization methods: the truncated singular value decomposition (TSVD) and the conjugate gradient applied to the normal equations (CGNE). We conclude this section with an outline of the parameter selection method proposed by Hanke and Raus, see Ref. 13.

    2.1. Truncated singular value decomposition

Let A = UΣV^* be the singular value decomposition of an n × n matrix A, with unitary matrices U = [u_1, …, u_n], V = [v_1, …, v_n], and a diagonal matrix Σ = diag(σ_1, …, σ_n), σ_1 ≥ σ_2 ≥ ⋯ ≥ σ_n ≥ 0. The u_i's and v_i's are the singular vectors and the σ_i's are the singular values of A. It is well known that the solution to the least squares problem

\min_x \|Ax - b\|_2   (1)

is given by

x_{LS} = \sum_{i=1}^{\operatorname{rank}(A)} \frac{u_i^* b}{\sigma_i} v_i.   (2)

The TSVD method truncates the solution after the first m terms, thus giving the regularized solution

x_m = \sum_{i=1}^{m} \frac{u_i^* b}{\sigma_i} v_i.   (3)

Naturally, the number m of retained terms is the regularization parameter in this case. One common way to estimate the optimal value of m is to look at the behavior of the coefficients u_i^* b/σ_i. The coefficients are expected to decay for smooth data, and to start growing when the noise level is reached. A simple stopping strategy is to truncate the expansion at the index corresponding to the coefficient with the smallest magnitude.
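To make the strategy concrete, here is a minimal MATLAB sketch (the function name is ours; A is assumed square, as above, and b is the noisy right-hand side):

function xm = tsvd_min_coeff(A, b)
% Sketch: TSVD regularization truncated at the smallest-magnitude coefficient.
[U, S, V] = svd(A);
c = (U' * b) ./ diag(S);    % coefficients u_i^* b / sigma_i of (2)
[~, m] = min(abs(c));       % index of the smallest coefficient
xm = V(:, 1:m) * c(1:m);    % truncated expansion (3)
end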


    2.2. Conjugate gradient algorithm for the normal equations

The conjugate gradient algorithm applied to the normal equations (CGNE),

A^* A x = A^* b,   (4)

at the mth step minimizes the A^*A-norm of the error,

\|x - x_m\|_{A^*A} = \|A(x - x_m)\|_2 = \|b - A x_m\|_2,   (5)

over the mth Krylov subspace,

K_m(A^*A, A^*b) = \operatorname{span}\{A^*b, (A^*A)A^*b, \ldots, (A^*A)^{m-1}A^*b\}.   (6)

There are several algebraically equivalent formulations of the conjugate gradient algorithm applied to the normal equations A^*Ax = A^*b. We are concerned here with the following implementation, see Ref. 15.

Algorithm

x_0 = 0
r_0 = b − A x_0
d_0 = A^* r_0
for m = 1, 2, …
    α_m = ‖A^* r_{m−1}‖_2^2 / ‖A d_{m−1}‖_2^2
    x_m = x_{m−1} + α_m d_{m−1}
    r_m = r_{m−1} − α_m A d_{m−1}
    β_m = ‖A^* r_m‖_2^2 / ‖A^* r_{m−1}‖_2^2
    d_m = A^* r_m + β_m d_{m−1}
end

At step m, conjugate gradient finds the least squares solution x_m over the Krylov space K_m(A^*A, A^*b). Therefore, there is a polynomial P_m of degree m − 1 such that

x_m = P_m(A^*A) A^* b.   (7)

It is important that P_m can be evaluated via a three-term recursion. The following formula is proved in Ref. 15:

P_m(t) = \left( -\alpha_m t + \frac{\alpha_m \beta_{m-1}}{\alpha_{m-1}} + 1 \right) P_{m-1}(t) - \frac{\alpha_m \beta_{m-1}}{\alpha_{m-1}} P_{m-2}(t) + \alpha_m,   (8)

where the coefficients α_m and β_m are as defined in the description of the CGNE algorithm. The initial conditions are P_{−1}(t) = P_0(t) = 0.

We also note that the conjugate gradient method finds the optimal solution in the first several iterations and then diverges rapidly toward the solution of the problem with noisy data. This regularizing behavior is due to the fact that conjugate gradient initially reduces the error in the direction of the dominant (low frequency) singular vectors v_i (A^*A v_i = σ_i^2 v_i), which are less corrupted by noise relative to the high frequency modes corresponding to small σ_i's. The rapid convergence–divergence behavior, known as semiconvergence, is often more pronounced for the conjugate gradient than for the TSVD, and the choice of the regularization parameter m (the "stopping rule") is thus crucial.

2.3. Conjugate gradient for the normal equations via Lanczos bidiagonalization

It is well known that the CGNE algorithm can be implemented via the Lanczos bidiagonalization process, see Ref. 10. This approach creates a lower bidiagonal matrix B_m and two rectangular matrices U_{m+1} and V_m with orthonormal columns such that

A V_m = U_{m+1} B_m.   (9)

The columns u_1, u_2, …, u_{m+1} of U_{m+1}, which are called the left Lanczos vectors, form an orthonormal basis of K_{m+1}(AA^*, b). Similarly, the columns v_1, v_2, …, v_m of V_m are called the right Lanczos vectors and form a basis for K_m(A^*A, A^*b). The Lanczos process is initialized by setting u_1 = b/‖b‖_2, β_0 = 0, and then, for m = 1, 2, …, one iterates

p_m = A^* u_m − β_{m−1} v_{m−1},
α_m = ‖p_m‖_2,
v_m = p_m/α_m,
q_{m+1} = A v_m − α_m u_m,
β_m = ‖q_{m+1}‖_2,
u_{m+1} = q_{m+1}/β_m.

The (m + 1) × m matrix B_m has the form

B_m = \begin{bmatrix}
\alpha_1 & & & \\
\beta_1 & \alpha_2 & & \\
 & \beta_2 & \ddots & \\
 & & \ddots & \alpha_m \\
 & & & \beta_m
\end{bmatrix}.   (10)

The least squares solution x_m can be expressed in terms of the singular values and singular vectors of B_m. Let

B_m = \tilde{U}_m \tilde{\Sigma}_m \tilde{V}_m^*

be the reduced singular value decomposition of B_m, i.e., \tilde{U}_m is an (m+1) × m matrix, while \tilde{\Sigma}_m and \tilde{V}_m are m × m. We have

\min_{x \in K_m(A^*A, A^*b)} \|b - Ax\|_2 = \min_{y \in \mathbb{R}^m} \|b - A V_m y\|_2 = \min_{y \in \mathbb{R}^m} \|b - U_{m+1} B_m y\|_2 = \min_{y \in \mathbb{R}^m} \|b - U_{m+1} \tilde{U}_m \tilde{\Sigma}_m \tilde{V}_m^* y\|_2.

If \tilde{\Sigma}_m is invertible, the above minimum is achieved at

y_m = \tilde{V}_m \tilde{\Sigma}_m^{-1} \tilde{U}_m^* U_{m+1}^* b,

and

x_m = V_m y_m = V_m \tilde{V}_m \tilde{\Sigma}_m^{-1} \tilde{U}_m^* U_{m+1}^* b   (11)

is the mth CGNE iterate. Moreover, since u_1 = b/‖b‖_2,

U_{m+1}^* b = \|b\|_2 \, [1, 0, \ldots, 0]^t

and

\tilde{U}_m^* U_{m+1}^* b = \|b\|_2 \, \tilde{U}_m(1,:)^*,   (12)

where \tilde{U}_m(1,:) denotes the first row of \tilde{U}_m. Combining (11) and (12), we obtain

x_m = \|b\|_2 \, V_m \tilde{V}_m \tilde{\Sigma}_m^{-1} \tilde{U}_m(1,:)^*.   (13)

    2.4. Heuristic parameter choice rules

In this subsection, we present a brief outline of the parameter selection method proposed by Hanke and Raus, see Refs. 12 and 13. In this case, the conjugate gradient method is applied to the normal equations A^*Ax = A^*b^δ, where b^δ is a vector of perturbed data with ‖b^δ − b‖_2 ≤ δ‖b‖_2. Hanke (see Ref. 12) derived the following error estimate:

\|x - x_m\|_2 \approx c \, |p_m'(0)|^{1/2} \, \|b^\delta - A x_m\|_2,   (14)

where c is a certain positive constant. The polynomials p_m are used in Hanke's presentation of Krylov methods with orthogonal residual polynomials. They can be expressed via the polynomials P_m introduced in Sec. 2.2, namely

p_m(t) = 1 - t P_m(t).   (15)

This immediately implies that p_m'(0) = −P_m(0) and

\|x - x_m\|_2 \approx c \, |P_m(0)|^{1/2} \, \|b^\delta - A x_m\|_2.   (16)

The proposed stopping rule is as follows: compute the sequence

\varphi_m = |P_m(0)|^{1/2} \, \|b^\delta - A x_m\|_2,   (17)

and find the value m = m_0 where the sequence {φ_m} has a global minimum. Choose x_{m_0} as the best approximation to x.

We would like to note that the recursion (8) gives us a simple way to compute the quantity φ_m in the course of the iterations, as the sketch below illustrates.
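A hedged MATLAB sketch of CGNE that carries the recursion (8) along at t = 0 and records φ_m; the function and variable names are ours, and M is an assumed iteration cap:

function [X, phi] = cgne_hanke_raus(A, bdelta, M)
% Sketch: CGNE with the Hanke-Raus indicator phi_m of (17); P_m(0) is
% evaluated via the three-term recursion (8) at t = 0.
x = zeros(size(A,2), 1);
r = bdelta;
d = A' * r;
rho = norm(d)^2;                        % ||A* r_{m-1}||_2^2
P2 = 0; P1 = 0;                         % P_{m-2}(0) and P_{m-1}(0)
alpha_old = 1; beta_old = 0;            % placeholders; unused when m = 1
X = zeros(size(A,2), M); phi = zeros(M, 1);
for m = 1:M
    Ad = A * d;
    alpha = rho / norm(Ad)^2;
    x = x + alpha * d;
    r = r - alpha * Ad;
    s = A' * r;
    beta = norm(s)^2 / rho;
    c = alpha * beta_old / alpha_old;   % recursion (8) at t = 0:
    P0 = (c + 1) * P1 - c * P2 + alpha; % P_m(0)
    phi(m) = sqrt(abs(P0)) * norm(r);   % |P_m(0)|^{1/2} ||bdelta - A x_m||_2
    P2 = P1; P1 = P0;
    alpha_old = alpha; beta_old = beta;
    d = s + beta * d;
    rho = norm(s)^2;
    X(:, m) = x;
end
end

One then selects m_0 as the global minimizer of φ_m, e.g., [~, m0] = min(phi).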

    3. A New Parameter Selection Strategy

    3.1. Description of the algorithm

We now proceed with a description of a new heuristic rule for choosing a regularization parameter in the case of conjugate gradient applied to the normal equations. Our method does not assume any a priori knowledge about the size of the error. We use the notation from Sec. 2.3.

Our approach is to seek regularized solutions within the Krylov subspaces of A^*A. We use the SVD of the bidiagonal matrix B_m generated by the Lanczos process to construct a new basis for K_m(A^*A, A^*b^δ), which plays the role of the right singular vectors of A. We then compute the expansion coefficients of x_m in this basis and apply the stopping method described in Sec. 2.1. Specifically, we choose m to minimize the magnitude of the mth coefficient of x_m. A formal description follows.

Given perturbed data b^δ with ‖b^δ − b‖_2 = δ‖b‖_2, we apply the conjugate gradient algorithm to the normal equations A^*Ax = A^*b^δ. As described in Sec. 2.3, we use the Lanczos bidiagonalization process to construct a lower bidiagonal matrix B_m and two rectangular matrices U_{m+1} and V_m with orthonormal columns such that

A V_m = U_{m+1} B_m,   (18)

and b^δ/‖b^δ‖_2 is the first column of U_{m+1}. Let

B_m = \tilde{U}_m \tilde{\Sigma}_m \tilde{V}_m^*   (19)

be the singular value decomposition of B_m. The columns u_1^m, u_2^m, …, u_m^m of U_{m+1}\tilde{U}_m form an orthonormal basis of the Krylov subspace K_m(AA^*, b^δ), while the columns v_1^m, v_2^m, …, v_m^m of V_m\tilde{V}_m form an orthonormal basis of the Krylov subspace K_m(A^*A, A^*b^δ). To describe our parameter selection rule, we compute the coefficients of the approximate solution x_m as represented in the basis v_1^m, v_2^m, …, v_m^m. According to formula (13),

x_m = \|b^\delta\|_2 \, V_m \tilde{V}_m \tilde{\Sigma}_m^{-1} \tilde{U}_m(1,:)^* = \|b^\delta\|_2 \sum_{i=1}^{m} \frac{\tilde{U}_m(1,i)}{\tilde{\Sigma}_m(i,i)} \, v_i^m.

Up to a constant factor, the magnitude of the mth coefficient is given by the formula

\psi_m = \frac{|\tilde{U}_m(1,m)|}{\tilde{\Sigma}_m(m,m)}.   (20)


We find the value m = m_0 where the sequence {ψ_m} has a global minimum and choose x_{m_0} as our approximation to x.

Corollary 8.6.3 of Ref. 10 shows that the singular values σ_k(B_m) of the matrices B_m interlace, and therefore the smallest singular values σ_m(B_m) = Σ̃_m(m,m) decrease as m increases. This helps explain our main observation: the quantities ψ_m = |Ũ_m(1,m)|/Σ̃_m(m,m) start growing as soon as the noise level is reached, since the coefficients in the numerators level off.

According to Theorem 3.3, p. 52 of Ref. 9, in the absence of knowledge of the noise level δ, no regularization method can succeed on all ill-posed problems. However, provided that the coefficients Ũ_m(1,m) go to zero fast enough, the sequence ψ_m has a local minimum, which points to a likely candidate for the optimal regularized solution.

As we have already indicated in the introduction, our numerical experiments show that the reorthogonalization of the residual vectors is crucial for the success of this approach. It is well known that without reorthogonalization, multiple copies of the singular values of Σ̃_m, so-called "ghost" singular values, will be computed. The occurrence of these ghost singular values and vectors introduces random jumps in the magnitudes of the expansion coefficients Ũ_m(1,i)/Σ̃_m(i,i) and interferes with our algorithm.

Taking this into account, we arrive at the following algorithm for computing the quantities ψ_m.

Algorithm

U(:,1) = b^δ/‖b^δ‖_2
V(:,0) = 0
B(1,0) = 0
for m = 1, 2, …
    p = A^* U(:,m) − B(m,m−1) V(:,m−1)
    p = p − V(:,1:m−1) (V(:,1:m−1)^* p)   (reorthogonalization)
    B(m,m) = ‖p‖_2
    V(:,m) = p/B(m,m)
    q = A V(:,m) − B(m,m) U(:,m)
    q = q − U(:,1:m) (U(:,1:m)^* q)   (reorthogonalization)
    B(m+1,m) = ‖q‖_2
    U(:,m+1) = q/B(m+1,m)
    [Ũ, Σ̃, Ṽ] = svd(B(1:m+1, 1:m))   (the reduced SVD)
    ψ_m = |Ũ(1,m)|/Σ̃(m,m)
    x_m = ‖b^δ‖_2 V(:,1:m) Ṽ Σ̃^{−1} Ũ(1,:)^*
end
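A runnable MATLAB transcription of this algorithm might look as follows; it is a sketch in which the function name, the iteration cap M, and the storage of all iterates are our choices:

function [X, psi] = lanczos_cgne_psi(A, bdelta, M)
% Sketch: Lanczos CGNE with full reorthogonalization and the indicator psi_m.
[N, n] = size(A);
U = zeros(N, M+1); V = zeros(n, M); B = zeros(M+1, M);
U(:, 1) = bdelta / norm(bdelta);
X = zeros(n, M); psi = zeros(M, 1);
for m = 1:M
    p = A' * U(:, m);
    if m > 1
        p = p - B(m, m-1) * V(:, m-1);
    end
    p = p - V(:, 1:m-1) * (V(:, 1:m-1)' * p);   % reorthogonalization
    B(m, m) = norm(p); V(:, m) = p / B(m, m);
    q = A * V(:, m) - B(m, m) * U(:, m);
    q = q - U(:, 1:m) * (U(:, 1:m)' * q);       % reorthogonalization
    B(m+1, m) = norm(q); U(:, m+1) = q / B(m+1, m);
    [Ut, St, Vt] = svd(B(1:m+1, 1:m), 0);       % reduced SVD
    psi(m) = abs(Ut(1, m)) / St(m, m);
    X(:, m) = norm(bdelta) * (V(:, 1:m) * (Vt * (St \ Ut(1, :)')));
end
end

The regularized solution is then X(:, m0), where [~, m0] = min(psi) locates the global minimum of ψ_m.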


    3.2. Complexity of the algorithm

In Appendix A, we explain why the operation count of computing the smallest singular value of an m × m bidiagonal matrix is O(m). The quantity |Ũ_m(1,m)| is also evaluated in the process. Thus the cost of the mth step is dominated by that of the orthogonalization, that is, O(mn). The combined operation count of the first M steps is O(M^2 n).

In common acoustical problems in three dimensions, the optimal m is small (between 10 and 20 in our experiments). This is the result of the singular values being multiple or at least clustered, which accelerates the convergence of CGNE. Additionally, the presence of noise further decreases the optimal value of m.

    4. Numerical Results

The algorithm described in Sec. 3 has been implemented in MATLAB and applied to three problems: (1) a two-dimensional acoustical inverse problem, (2) a three-dimensional acoustical inverse problem, and (3) a computation of the second derivative of a function of one variable.

The first problem deals with detection of acoustic boundary vibrations in two dimensions, see Ref. 5. We consider two concentric circles ‖x‖ = r_1 and ‖x‖ = r_2 with r_1 < r_2. We place a unit charge with wave number k at the point x_0 outside the outer circle and evaluate its potential on the inner one. Specifically, we evaluate u(x) = G(x, x_0) on the circle ‖x‖ = r_1, where G(x, y) is the free space Green's function for the Helmholtz equation,

G(x, y) = \frac{i}{4} H_0^{(1)}(k \|x - y\|).   (21)

The problem is to determine the (outward) normal velocity v on ‖x‖ = r_2, given the pressure measurements u^δ = u + e on ‖x‖ = r_1, where e is a microphone noise with ‖e‖_2 = δ‖u‖_2. We use the representation of u via the single-layer potential

u(x) = S\varphi(x) := \int_{\|y\| = r_2} G(x, y) \, \varphi(y) \, dy,   (22)

for ‖x‖ < r_2. The charge density φ is determined from the pressures u(x) on the circle ‖x‖ = r_1, and then the normal velocity on the circle ‖x‖ = r_2 is computed from the formula

v(x) = \frac{\partial u}{\partial \nu}(x) = K\varphi(x) := \frac{1}{2}\varphi(x) + \int_{\|y\| = r_2} \nabla_x G(x, y) \cdot \nu(x) \, \varphi(y) \, dy,   (23)

where ν(x) is the outward unit normal. The second problem is a three-dimensional counterpart of the first one. The description is identical, except that we use the free space Green's function for the Helmholtz equation in R^3,

G(x, y) = \frac{1}{4\pi} \frac{e^{ik\|x-y\|}}{\|x - y\|}.   (24)
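Both kernels are straightforward to evaluate; a small MATLAB sketch (the handle names G2 and G3 are ours; besselh(0,1,·) is MATLAB's Hankel function H_0^{(1)}):

G2 = @(x, y, k) (1i/4) * besselh(0, 1, k * norm(x - y));           % kernel (21)
G3 = @(x, y, k) exp(1i * k * norm(x - y)) / (4*pi * norm(x - y));  % kernel (24)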

In the planar case, the operators S and K are approximated by the n-point Nyström method with spectral accuracy, yielding matrices S_n and K_n. In three dimensions, we use a piecewise linear boundary element approximation based on Refs. 2 and 3. The CGNE method applied to the system S_n φ = u^δ generates a sequence of approximate charge densities φ_m (m = 1, 2, …). We then compute a sequence of regularized normal velocities v_m = K_n φ_m. The optimal number of iterations is estimated via the algorithm of Sec. 3.

Specific details are discussed in Examples 1 and 2 below. We show the relative errors of the normal velocities, as well as the quantities ψ_m defined in Sec. 3 and φ_m introduced by Hanke and Raus. Our relative errors satisfy

\|v - v_m\|_2 / \|v\|_2 \le \operatorname{cond}(K_n) \, \|\varphi - \varphi_m\|_2 / \|\varphi\|_2,   (25)

where v_m = K_n φ_m. In our examples, cond(K_n) = O(1).

Example 1. Two-dimensional case. We set r_1 = 0.9, r_2 = 1, and x_0 = (1.1, 0), and computed 40 iterations of CGNE. The exact pressures were perturbed with noise at the level δ = 0.01. Specifically, we computed

u^\delta = u + \delta \|u\|_2 \, \xi,   (26)

where the entries of the vector ξ were sampled independently from the uniform distribution on the interval (−1, 1), and then ξ was normalized to have norm 1. We used n = 500 points in the discretization of the operators S and K. The wave numbers were k = 1, k = 3, and k = 6. The results are presented in Fig. 1 and Table 1. The first column of the table contains the wave number k, followed by the relative error err_opt of the velocities v_m for the optimal regularization parameter m_opt, the error err_ψ corresponding to the regularization parameter m_ψ predicted by our new stopping rule, and the error err_φ corresponding to the regularization parameter m_φ predicted by the Hanke–Raus rule. We notice that the new method localizes the optimal regularization parameter much better than that of Hanke–Raus, and leads to almost optimal errors.

Fig. 1. Comparison of the new stopping rule ψ_m and the Hanke–Raus stopping rule φ_m in Example 1 for k = 1, δ = 0.01.


Table 1. Optimal errors for the Lanczos CGNE and those inferred via the stopping rules in Example 1 for δ = 0.01.

k   err_opt   m_opt   err_ψ   m_ψ   err_φ   m_φ
1   0.194     15      0.194   15    0.420   6
3   0.132     18      0.133   19    0.281   8
6   0.087     23      0.087   24    0.118   10

Table 2. Optimal errors for the Lanczos CGNE and those inferred via the stopping rule for δ = 0.01.

k   n      cond(S_n)    err_opt   m_opt   err_ψ   m_ψ   Time
1   66     5.7 × 10^1   0.21      7       0.22    10    1
1   258    3.9 × 10^2   0.14      5       0.16    7     125
1   1026   8.6 × 10^3   0.11      7       0.12    9     242
3   66     2.4 × 10^1   0.19      20      0.20    10    1
3   258    1.8 × 10^2   0.10      9       0.11    11    122
3   1026   4.1 × 10^3   0.07      11      0.07    14    241
6   66     7.8          0.31      7       0.34    9     1
6   258    8.9 × 10^1   0.12      9       0.13    18    122
6   1026   2.2 × 10^3   0.06      13      0.07    18    241


Example 2. Three-dimensional case. We set x_0 = (2, 0, 0) and kept all other parameters unchanged. The boundary element code was used with n = 66, n = 258, and n = 1026 elements. Table 2 indicates how the relative errors err_opt of the velocities v_m for the optimal regularization parameter m_opt compare with the errors err_ψ corresponding to the regularization parameter m_ψ predicted by our new stopping rule. The timings (in seconds) for the Lanczos CGNE with reorthogonalization are given in the last column. Our MATLAB code was run under Windows on a 950 MHz PC in this case. For n = 1026, 200 iterations were performed.

Figures 2–4 present the relative errors for the Lanczos CGNE and the TSVD for k = 1, 3, 6. We can easily notice that in the case of the TSVD, the optimal value of the regularization parameter (the number of leading singular vectors used in the expansion) strongly depends on the wave number k, while for the Lanczos CGNE the dependence is weak. Similar observations for CGNE were made in Refs. 6 and 7, where the method was also applied to other, more realistic large-scale problems for cylindrical geometries. We also notice that CGNE arrives at the optimal solution after about six to nine iterations, while the TSVD requires 30–80 basis vectors to approach a similar accuracy. As we explained in the introduction, this is a common occurrence when multiple or clustered singular values are present. Figures 5–7 compare the new stopping rule and that of Hanke and Raus on the same problems.


Fig. 2. Comparison of relative errors of the TSVD and the Lanczos CGNE for k = 1, n = 1026, δ = 0.01.

Fig. 3. Comparison of relative errors of the TSVD and the Lanczos CGNE for k = 3, n = 1026, δ = 0.01.

Additional examples of practical problems where this method has been applied include a Helmholtz–Kirchhoff system and a realistic large-scale regime of a Cessna test section (see Ref. 4).

Example 3. This example is borrowed from Refs. 14 and 15, and deals with a computation of the second derivative of a function of one variable by inverting a Fredholm operator. Our goal is to solve the integral equation

\int_0^1 G(x, y) \, f(y) \, dy = \frac{1}{6}(x^3 - x), \quad 0 < x < 1,   (27)


Fig. 4. Comparison of relative errors of the TSVD and the Lanczos CGNE for k = 6, n = 1026, δ = 0.01.

Fig. 5. Comparison of the new stopping rule ψ_m and the Hanke–Raus stopping rule φ_m for the Lanczos CGNE with k = 1, n = 1026, δ = 0.01.

where

G(x, y) = \begin{cases} x(y - 1) & \text{if } x < y, \\ y(x - 1) & \text{if } x \ge y, \end{cases}   (28)

is the Green's function for the second derivative operator d^2/dx^2. The exact solution is f(x) = x. We discretize G on n = 800 points by the Galerkin method using the MATLAB code deriv2 from Ref. 14, which results in a matrix (denoted by G_n) with condition number about 7.9 × 10^5.
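In MATLAB, this test problem can be set up with the Regularization Tools package of Ref. 14; a hedged sketch, assuming the package is on the path and that deriv2(n) returns the discretized operator, the right-hand side, and the exact solution:

n = 800;
[Gn, f, x_exact] = deriv2(n);   % Galerkin discretization of (27)-(28)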


Fig. 6. Comparison of the new stopping rule ψ_m and the Hanke–Raus stopping rule φ_m for the Lanczos CGNE with k = 3, n = 1026, δ = 0.01.

Fig. 7. Comparison of the new stopping rule ψ_m and the Hanke–Raus stopping rule φ_m for the Lanczos CGNE with k = 6, n = 1026, δ = 0.01.

Table 3 gives further indication of the reliability of our new stopping method for various noise levels (assuming the problem is sufficiently resolved) and for more slowly decaying singular values (σ_m = O(1/m^2) in this case). The data also indicates that x^δ → x_exact as δ → 0, as expected for regularization methods, see Ref. 9.

Table 3. Optimal errors for the Lanczos CGNE and those according to the stopping rule in Example 3.

δ       err_opt   m_opt   err_ψ   m_ψ
10^−1   0.263     6       0.304   4
10^−2   0.207     9       0.220   8
10^−3   0.151     15      0.151   15
10^−4   0.097     26      0.101   23
10^−5   0.067     45      0.068   42
10^−6   0.045     78      0.045   72
10^−7   0.023     140     0.025   122

Figure 8 depicts the L-curve formed by the CG iterates: we plot the norm of the regularized solution x_m versus that of the residual G_n x_m − f. The point corresponding to the iteration found by the stopping rule is the one marked with a circle. The figure indicates that our stopping rule solution approximately agrees with the one given by the L-curve method.

Fig. 8. The L-curve in Example 3 with δ = 10^−6 (axes: ‖G_n x_m − f‖_2 versus ‖x_m‖_2).

    5. Conclusions

We have tested a new Lanczos-based stopping rule and found it accurate and reliable in a variety of test problems. In several examples, it performs much better than the stopping rule of Hanke and Raus. Large systems arising from three-dimensional problems in acoustics can be handled successfully due to the clustering of the singular values, which makes CGNE efficient on such problems. The regularized solutions can be found in several iterations, with mild dependence on the wave number. The new method does not rely on any information about the magnitude of the errors in the given data.

As the size of applied computational tasks grows, iterative methods will become even more crucial for large, dense problems where the full SVD is impractical. Given the paucity of effective, noise-free stopping rules for iterative methods, we believe that our stopping method can be an important tool in the actual numerical use of iterative regularization algorithms.

We plan to further investigate applications of the method to problems of near-field acoustic holography (Refs. 21 and 22), including the HELS method (Refs. 16 and 20).

    Appendix A

In this appendix, we describe an iterative procedure with complexity O(m) for an approximation of the smallest singular value of an m × m bidiagonal matrix B_m. It requires two well-known ideas.

First, the SVD of B_m can be computed via the eigenvalue decomposition of the symmetric block matrix defined as

H = \begin{bmatrix} 0 & B_m^* \\ B_m & 0 \end{bmatrix}.   (A.1)

If B_m = UΣV^* is the SVD of B_m, then

H = Q \begin{bmatrix} \Sigma & 0 \\ 0 & -\Sigma \end{bmatrix} Q^*

is the eigenvalue decomposition of H, where the unitary matrix Q is given by

Q = \frac{1}{\sqrt{2}} \begin{bmatrix} V & V \\ U & -U \end{bmatrix}.

Thus the required smallest singular value of B_m is equal to the smallest positive eigenvalue of H.

The second idea is to apply inverse iteration (see, e.g., Ref. 8, p. 155) to find the smallest positive eigenvalue of H. Sometimes called the inverse power method, it amounts to computing several iterations of the matrix H^{−1} applied to a randomly chosen vector. The method has a linear convergence rate, so the number of iterations is O(1) and depends on the required precision; we expect three decimal places to be enough in most engineering applications.

Each application of H^{−1} requires the solution of one upper-triangular and one lower-triangular system. Since both matrices B_m and B_m^* are bidiagonal, this can be accomplished with one back and one forward substitution in O(m) operations. The total operation count for the smallest singular value computation is also O(m). A sketch of the whole procedure follows.
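A hedged MATLAB sketch (the function name and the fixed iteration count are ours; B is assumed square, lower bidiagonal, and stored as sparse so that the triangular solves below cost O(m)):

function sigma = smallest_singular_value(B, niter)
% Sketch: inverse power method on H of (A.1). One application of H^{-1}
% maps [x; y] to [B\y; B'\x], i.e., one forward and one back substitution.
m = size(B, 1);
x = randn(m, 1); y = randn(m, 1);
z = norm([x; y]); x = x / z; y = y / z;
for k = 1:niter
    a = B \ y;              % forward substitution, O(m)
    b = B' \ x;             % back substitution, O(m)
    z = norm([a; b]);       % growth factor; tends to 1/sigma_min, since the
                            % two eigenvalues of H nearest zero are +/- sigma_min
    x = a / z; y = b / z;
end
sigma = 1 / z;
end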


    Acknowledgments

This research was supported by the NSF under Cooperative Agreement EPS-9874732 and by the NSF grant ITR-0081270.

    References

1. K. E. Atkinson, The Numerical Solution of Integral Equations of the Second Kind (Cambridge University Press, 1997).

2. K. E. Atkinson, User's Guide to a Boundary Element Package for Solving Integral Equations on Piecewise Smooth Surfaces, Release No. 2 (University of Iowa, 1998).

3. A. Björck, E. Grimme and P. Van Dooren, An implicit shift bidiagonalization algorithm for ill-posed systems, BIT 34(4) (1994) 510–534.

4. T. DeLillo, T. Hrycak and V. Isakov, Theory and boundary element methods for nearfield acoustic holography, J. Comput. Acoust. 13 (2005) 163–185.

5. T. DeLillo, V. Isakov, N. Valdivia and L. Wang, The detection of the source of acoustical noise in two dimensions, SIAM J. Appl. Math. 61(6) (2001) 2104–2121.

6. T. DeLillo, V. Isakov, N. Valdivia and L. Wang, The detection of surface vibrations from interior acoustical pressure, Inverse Problems 19 (2003) 507–524.

7. T. K. DeLillo, T. Hrycak and N. Valdivia, Iterative regularization methods for inverse problems in acoustics, in Proc. 2002 ASME Int. Mech. Eng. Congress, New Orleans, LA, IMECE2002/NCA-3270.

8. J. Demmel, Applied Numerical Linear Algebra (Society for Industrial and Applied Mathematics, 1997).

9. H. W. Engl, M. Hanke and A. Neubauer, Regularization of Inverse Problems (Kluwer, Dordrecht, 1996).

10. G. H. Golub and C. Van Loan, Matrix Computations, 3rd edn., Johns Hopkins Studies in the Mathematical Sciences (Johns Hopkins University Press, Baltimore, MD, 1996).

11. M. Hanke, On Lanczos based methods for the regularization of discrete ill-posed problems, BIT 41 (2002) 1008–1018.

12. M. Hanke, Conjugate Gradient Type Methods for Ill-Posed Problems, Pitman Research Notes in Mathematics Series 327 (Longman Scientific and Technical, Essex, UK, 1995).

13. M. Hanke and T. Raus, A general heuristic for choosing the regularization parameter in ill-posed problems, SIAM J. Sci. Comput. 17(4) (1996) 956–972.

14. P. C. Hansen, Regularization Tools: A MATLAB package for analysis and solution of discrete ill-posed problems, Numer. Algor. 6 (1994) 1–35.

15. P. C. Hansen, Rank-Deficient and Discrete Ill-Posed Problems: Numerical Aspects of Linear Inversion (SIAM, 1998).

16. V. Isakov and S. Wu, On theory and application of the Helmholtz equation least squares method in inverse acoustics, Inverse Problems 18(4) (2002) 1147–1159.

17. M. Kilmer and D. O'Leary, Choosing regularization parameters in iterative methods for ill-posed problems, SIAM J. Matrix Anal. Appl. 22(4) (2001) 1204–1221.

18. M. Kilmer and G. W. Stewart, Iterative regularization and MINRES, SIAM J. Matrix Anal. Appl. 21(2) (1999) 613–628.

19. C. R. Vogel and J. G. Wade, Iterative SVD-based methods for ill-posed problems, SIAM J. Sci. Comput. 15 (1994) 736–754.

20. Z. Wang and S. Wu, A Helmholtz equation least-squares method for reconstructing the acoustic pressure field, J. Acoust. Soc. Am. 102(4) (1997) 2020–2032.


21. E. G. Williams, Fourier Acoustics (Academic, New York, 1999).

22. E. G. Williams, B. H. Houston, P. C. Herdic, S. T. Raveendra and B. Gardner, Interior near-field acoustical holography in flight, J. Acoust. Soc. Am. 108 (2000) 1451–1463.

23. E. G. Williams, Regularization methods for near-field acoustical holography, J. Acoust. Soc. Am. 110 (2001) 1976–1988.