IEEE TRANSACTIONS ON IMAGE PROCESSING, TO APPEAR

Regularization Parameter Selection for Nonlinear Iterative Image Restoration and MRI Reconstruction Using GCV and SURE-Based Methods

Sathish Ramani*, Member, IEEE, Zhihao Liu, Jeffrey Rosen, Jon-Fredrik Nielsen, and Jeffrey A. Fessler, Fellow, IEEE

Abstract—Regularized iterative reconstruction algorithms for imaging inverse problems require selection of appropriate regularization parameter values. We focus on the challenging problem of tuning regularization parameters for nonlinear algorithms for the case of additive (possibly complex) Gaussian noise. Generalized cross-validation (GCV) and (weighted) mean-squared error (MSE) approaches (based on Stein's Unbiased Risk Estimate—SURE) need the Jacobian matrix of the nonlinear reconstruction operator (representative of the iterative algorithm) with respect to the data. We derive the desired Jacobian matrix for two types of nonlinear iterative algorithms: a fast variant of the standard iterative reweighted least-squares method and the contemporary split-Bregman algorithm, both of which can accommodate a wide variety of analysis- and synthesis-type regularizers. The proposed approach iteratively computes two weighted SURE-type measures: Predicted-SURE and Projected-SURE (that require knowledge of noise variance σ²), and GCV (that does not need σ²) for these algorithms. We apply the methods to image restoration and to magnetic resonance image (MRI) reconstruction using total variation (TV) and an analysis-type ℓ1-regularization. We demonstrate through simulations and experiments with real data that minimizing Predicted-SURE and Projected-SURE consistently leads to near-MSE-optimal reconstructions. We also observed that minimizing GCV yields reconstruction results that are near-MSE-optimal for image restoration and slightly sub-optimal for MRI. Theoretical derivations in this work related to Jacobian matrix evaluations can be extended, in principle, to other types of regularizers and reconstruction algorithms.

EDICS: TEC-RST, COI-MRI

Index Terms—Regularization parameter, generalized cross-validation (GCV), Stein's unbiased risk estimate (SURE), image restoration, MRI reconstruction

I. INTRODUCTION

Inverse problems in imaging invariably need image reconstruction algorithms to recover an underlying unknown object of interest x from measured data y. Reconstruction algorithms typically depend on a set of parameters that need to be

Copyright (c) 2012 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to [email protected].

This work was supported by the National Institutes of Health under Grant P01 CA87634.

*Sathish Ramani, Zhihao Liu, Jeffrey Rosen, and Jeffrey A. Fessler are with the Department of Electrical Engineering and Computer Science, University of Michigan. Jon-Fredrik Nielsen is with the fMRI Laboratory, University of Michigan, Ann Arbor, MI, U.S.A. Email: {sramani, zhihao, jayar, jfnielse, fessler}@umich.edu.

adjusted properly for obtaining good image-quality. Choosing suitable parameter values is a nontrivial, application-dependent task and has motivated research on automated parameter selection based on quantitative measures [1]–[31]. Quantitative parameter optimization methods can be broadly classified as those based on the discrepancy principle [1], the L-curve method [2]–[5], generalized cross-validation (GCV) [6]–[17] and estimation of (weighted) mean-squared error (MSE) [18]–[30]. Recently, a new measure of image-quality (different from GCV and MSE) was introduced in [31] but its applicability has been demonstrated only for denoising applications [31].

In inverse problems, typically, image reconstruction is performed by minimizing a cost function composed of a data-fidelity term and (one or more) regularization terms. Image quality in such cases is governed by regularization parameters that control the bias-variance trade-off (or equivalently, the balance between image-smoothing and amplification of noise) in the reconstruction. Using the discrepancy principle requires minimizing the difference between the data-fidelity term and the noise variance [1] and can lead to over-smoothing [22]. In the L-curve method, parameters are chosen so as to maximize the curvature of an (L-shaped) parametric curve (constructed from the components of the cost function) [2]–[4]. This method can be computationally expensive and sensitive to curvature evaluation [5], [25]. GCV is a popular criterion used for parameter selection in a variety of inverse problems, especially for linear reconstruction algorithms [7]–[16]. The advantage of GCV is that it does not require knowledge of noise variance and is known to yield regularization parameters for linear algorithms that asymptotically minimize the true MSE [7]. Some extensions of GCV are also available for nonlinear algorithms [15]–[17] but they are computationally more involved (see Section III-A) than for linear algorithms.

MSE-estimation-based methods can be attractive alternatives to GCV since image quality is often quantified in terms of MSE in image reconstruction problems. For Gaussian noise, Stein's unbiased risk estimate (SURE) [18] provides a practical means of unbiasedly assessing MSE for denoising problems. Unlike GCV, SURE requires knowledge of noise statistics but is optimal even in the nonasymptotic regime. SURE has been successfully employed for optimally adjusting parameters of a variety of denoising algorithms [32]–[36]. For ill-posed inverse problems, it is not possible to estimate MSE (except in some special instances [22]–[25]) since y may only contain partial information about x [24, Sec. IV]. In such


cases, the principles underlying SURE may be extended to estimate weighted variants of MSE (e.g., by evaluating the error only on components of x that are accessible from y) [19], [25], [26]. Several weighted SURE-type approaches have been proposed and employed for (near) optimal parameter tuning in ill-posed inverse problems, e.g., linear restoration [19], nonlinear noniterative restoration [26], image reconstruction using sparse priors [29], [30], noniterative parallel magnetic resonance image (MRI) reconstruction [27], nonlinear restoration [25], [28] and nonlinear image upsampling [25] using iterative shrinkage-thresholding type algorithms that specifically apply to synthesis formulations [25], [28], [37], [38] of image reconstruction problems. Synthesis formulations preclude popular regularization criteria such as total variation (TV) and smooth edge-preserving regularizers (e.g., Huber [39], smoothed-Laplacian [40]) that belong to the class of analysis formulations. Bayesian methods [41]–[45] have been employed for parameter tuning in image restoration problems involving analysis-type quadratic regularizers [41], [42] and TV [43]–[45].

This paper focuses on computing the nonlinear version of GCV (denoted by NGCV) [16], [17] and weighted SURE-type measures [20], [24] for nonlinear iterative reconstruction algorithms that can tackle a variety of nonquadratic regularization criteria including synthesis- and analysis-type (e.g., TV) regularizers. Both NGCV and weighted SURE-measures require the Jacobian matrix of the reconstruction operator (representative of the iterative algorithm) evaluated with respect to the data [16], [17], [24] (see Section III). We derive the desired Jacobian matrix for two types of computationally efficient algorithms: the contemporary split-Bregman (SB) algorithm [46] and IRLS-MIL [47], [48] that uses the matrix inversion lemma (MIL) to accelerate standard iterative reweighted least squares (IRLS) [47], [48]. Our work can be interpreted as an extension of previous research [25]–[30] that focused on applying weighted SURE-type measures to inverse problems with noniterative algorithms [26], [27] and to iterative image reconstruction based on sparsity priors [29], [30] and synthesis formulations [25], [28].

In this paper, we compute Predicted-SURE [20], [21], Projected-SURE [24] and NGCV [16], [17] for nonlinear image restoration and MRI reconstruction (from partially sampled Cartesian k-space data) using TV and an analysis-type ℓ1-regularization. We also illustrate using simulations (for image restoration and MRI reconstruction) and experiments with real data (for MRI reconstruction) that both Predicted-SURE and Projected-SURE provide near-MSE-optimal selection of regularization parameters in these applications. We also observed that NGCV yielded near-MSE-optimal selections for image restoration and slightly sub-optimal parameter values for MRI reconstruction.

The paper is organized as follows. Section II describes the problem mathematically and presents our notation and mathematical requisites essential for theoretical derivations. Section III briefly reviews (N)GCV and weighted SURE-type measures. Section IV describes in detail the derivation of Jacobian matrices for the considered algorithms. We present experimental results for image restoration and MRI reconstruction in Section V and discuss reconstruction quality and memory / computational requirements of the considered algorithms in Section VI. Finally, we draw our conclusions in Section VII.

II. NOTATION AND PROBLEM DESCRIPTION

We use the following linear data model

    y = Ax + ξ,    (1)

that is appropriate for many imaging inverse problems including image restoration and MRI reconstruction from partially sampled Cartesian k-space data. In (1), y ∈ Ω^M is the observed data, A ∈ Ω^{M×N} is a known (rectangular) matrix (typically M ≤ N), and Ω is either R or C depending on the application. We assume x ∈ Ω^N is an unknown deterministic quantity. For image restoration, Ω = R, M = N and we assume that A is circulant, while for MRI with partial Cartesian k-space sampling,¹ Ω = C and A = MQ, where Q ∈ C^{N×N} is the orthonormal DFT matrix, M is the M × N downsampling matrix that satisfies MM⊤ = I_M and I_M is the identity matrix of size M.
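To make the data model concrete, the following NumPy sketch simulates (1) for the partial Cartesian MRI case A = MQ; the function and variable names are ours (not from the paper), and the orthonormal DFT Q is realized with FFTs:

```python
import numpy as np

def simulate_mri_data(x, mask, sigma, rng):
    """Simulate y = A x + xi of (1) with A = M Q: Q is the orthonormal
    2-D DFT and M keeps only the sampled k-space locations (mask == True)."""
    k_full = np.fft.fft2(x) / np.sqrt(x.size)          # Q x (orthonormal DFT)
    y_clean = k_full[mask]                             # M (Q x): downsampling
    # i.i.d. complex Gaussian noise with total variance sigma^2 per sample
    xi = (sigma / np.sqrt(2)) * (rng.standard_normal(y_clean.shape)
                                 + 1j * rng.standard_normal(y_clean.shape))
    return y_clean + xi

rng = np.random.default_rng(0)
x = np.ones((256, 256))                                # placeholder test image
mask = rng.random((256, 256)) < 0.4                    # hypothetical random Cartesian mask
y = simulate_mri_data(x, mask, sigma=0.1, rng=rng)
```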

Throughout the paper, (·)⊤ denotes the transpose of a real vector or matrix, (·)⋆ denotes the complex conjugate, (·)^H is the Hermitian-transpose, and (·)_R and (·)_I indicate the real and imaginary parts, respectively, of a complex vector or matrix. The m-th element of any vector y is denoted by y_m and the mn-th element of any matrix A is written as [A]_{mn}.

For simplicity, we model ξ ∈ Ω^M as an i.i.d. zero-mean Gaussian random vector with covariance matrix Λ = σ²I_M and probability density g_Ω(ξ). For Ω = R, g_R(ξ) = (2πσ²)^{−M/2} exp(−ξ⊤ξ/2σ²), while for Ω = C, we assume ξ is an i.i.d. complex Gaussian random vector (which is a reasonable model for MRI applications), so g_C(ξ) = (πσ²)^{−M} exp(−ξ^H ξ/σ²). SURE-type methods discussed in this paper (see Section III-B) can be readily extended to more general cases (such as ξ with non-zero mean and covariance Λ ≠ σ²I_M) using the generalized SURE (GSURE) methodology developed in [24].

Given data y, we obtain an estimate of the unknown image x by minimizing a cost function based on (1) composed of a data-fidelity term and some regularization that is designed using "smoothness" penalties or prior information about x:

    u_θ(y) = argmin_u { J(u) = (1/2)‖y − Au‖²₂ + Ψ(u) },    (2)

where ‖·‖₂ represents the Euclidean norm, Ψ represents a suitable regularizer that is (possibly nonsmooth, i.e., not differentiable everywhere) convex, and u_θ : Ω^M → Ω^N may be interpreted as a (possibly nonlinear) mapping or an algorithm, representative of the minimization in (2), that acts on y to yield the estimate u_θ(y). In practice, the mapping u_θ depends on one or more parameters θ that need to be set appropriately to obtain a meaningful estimate u_θ(y). In problems such as (2), typically, θ = λ is a scalar known as the regularization

¹ Partial k-space sampling on Cartesian grids is relevant for accelerating 3-D MR acquisition in practice, where undersampling is typically applied in the phase-encode plane [49].


parameter that plays a crucial role in balancing the data-fidelity and regularization terms: small λ-values can lead to noisy estimates while a large λ results in over-smoothing and loss of details. Quantitative criteria such as GCV [6], [16], [17] and (weighted) SURE-type measures [24], [25] can be used for tuning θ of a nonlinear u_θ, but they require the evaluation of the Jacobian matrix [16], [17], [24], J(u_θ, y) ∈ Ω^{N×M} for Ω = R, C (see Sections III-A and III-B), consisting of partial derivatives of the elements {u_{θ,n}(y)}_{n=1}^N of u_θ(y) with respect to {y_m}_{m=1}^M.

Definition 1: Let u_θ : R^M → R^N be differentiable (in the weak sense of distributions [50, Ch. 6]). The Jacobian matrix J(u_θ, y) ∈ R^{N×M} evaluated at y ∈ R^M is specified using its elements as

    [J(u_θ, y)]_{nm} = ∂u_{θ,n}(z)/∂z_m |_{z=y}.    (3)

Definition 2: Let u_θ : C^M → Ω^N (with Ω = R or C) be individually analytic [51] with respect to y_R and y_I (in the weak sense of distributions [50, Ch. 6]). The Jacobian matrices J(u_θ, y), J(u_θ, y⋆) ∈ C^{N×M} are specified using their respective elements as [51, Eq. 13], [52]

    [J(u_θ, y)]_{nm} = (1/2) ( ∂u_{θ,n}(z)/∂z_{Rm} − ı ∂u_{θ,n}(z)/∂z_{Im} ) |_{z=y},    (4)

    [J(u_θ, y⋆)]_{nm} = (1/2) ( ∂u_{θ,n}(z)/∂z_{Rm} + ı ∂u_{θ,n}(z)/∂z_{Im} ) |_{z=y}.    (5)

Remark 1: When u_θ : C^M → Ω^N is prescribed in terms of y and y⋆, J(u_θ, y) is evaluated treating y as a variable and y⋆ as a constant [52], [53]. Similarly, J(u_θ, y⋆) is evaluated treating y as a constant [52], [53].

For common (and some popular) instances of J in (2), u_θ satisfies the hypotheses in Definitions 1 and 2 and in turn allows the computation of GCV and weighted SURE-type measures for reliable tuning of θ as illustrated in our experiments.

III. GENERALIZED CROSS-VALIDATION AND WEIGHTED SURE-TYPE MEASURES

A. Generalized Cross-Validation (GCV)

GCV is based on the "leave-one-out" principle [7] that leads to a simple expression in the case of linear algorithms: for a generic linear mapping u_θ(y) = F_θ y, the GCV measure (denoted by LGCV) is given by [7]

    LGCV(θ) = M^{−1}‖(I_M − AF_θ)y‖²₂ / (1 − M^{−1} tr{AF_θ})².    (6)

For nonlinear estimators u_θ(y), we consider the following GCV measure (denoted by NGCV)

    NGCV(θ) = M^{−1}‖y − Au_θ(y)‖²₂ / (1 − M^{−1} R{tr{AJ(u_θ, y)}})²,    (7)

adapted from [17, Sec. 3] that was originally derived using the standard "leave-one-out" principle for nonlinear algorithms [16]. We take the real part, R{·}, in the denominator of (7) specifically for the case of Ω = C to avoid spurious complex entries while evaluating NGCV(θ) numerically.
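Given a reconstruction A u_θ(y) and a (Monte-Carlo) estimate of tr{AJ(u_θ, y)} (see Section IV-E), evaluating (7) is straightforward; here is a minimal sketch with names of our own choosing:

```python
import numpy as np

def ngcv(y, Au, trace_AJ):
    """NGCV (7): M^-1 ||y - A u(y)||^2 / (1 - M^-1 Re{tr{A J(u,y)}})^2.
    trace_AJ is (an estimate of) tr{A J(u_theta, y)}, e.g., t_NGCV in (57)."""
    M = y.size
    numerator = np.sum(np.abs(y - Au) ** 2) / M
    denominator = (1.0 - np.real(trace_AJ) / M) ** 2
    return numerator / denominator
```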

LGCV has been more widely used [7]–[11], [13], [14] (for linear algorithms) than NGCV (for nonlinear algorithms), perhaps because NGCV is computationally more involved than LGCV. Recently, Liao et al. proposed GCV-based automatic nonlinear restoration methods using alternating minimization in [54], [55]. Although their methods are nonlinear overall, they rely on linear sub-problems arising out of alternating minimization and employ LGCV for parameter tuning. In contrast, we propose to tackle NGCV (7) directly and demonstrate its use in nonlinear image restoration and MRI reconstruction.

B. Weighted SURE-Type Measures

In the context of image reconstruction, the mean-squared error (risk) measure,

    MSE(θ) = N^{−1}‖x − u_θ(y)‖²₂    (8)

is often used to assess image quality and is an attractive option for optimizing θ. However, MSE(θ) cannot be directly computed since the cross-term x^H u_θ(y) depends on the unknown x (‖x‖²₂ is an irrelevant constant independent of θ) and needs to be estimated in practice. For denoising applications, i.e., A = I_N in (1), the desired cross-term can be manipulated as x^H u_θ(y) = (y − ξ)^H u_θ(y) and the statistics of ξ may then be used to estimate ξ^H u_θ(y). In the Gaussian setting, ξ ∼ N(0, σ²I_N), Stein's result [18] (for Ω = R) can be used for this purpose and leads to E_ξ{ξ⊤u_θ(y)} = σ² E_ξ{tr{J(u_θ, y)}}, where E_ξ{·} represents expectation with respect to ξ. Replacing ξ⊤u_θ(y) in MSE(θ) with σ² tr{J(u_θ, y)} thus yields the so-called Stein's unbiased risk estimate (SURE) [18],

    SURE(θ) = N^{−1}‖y − u_θ(y)‖²₂ − σ² + 2σ²N^{−1} tr{J(u_θ, y)},    (9)

that is an unbiased estimator of MSE(θ), i.e., E_ξ{MSE(θ)} = E_ξ{SURE(θ)}. The accuracy of SURE(θ) generally increases with N (law of large numbers), so it is appealing for image-processing applications (where N is large, typically N ≥ 256²) [36]. Using SURE(θ) as a practical alternative to MSE(θ) requires (in addition to σ²) the evaluation of tr{J(u_θ, y)} that can be performed analytically for some special types of denoising algorithms [32]–[35] or numerically using the Monte-Carlo method in [36, Th. 2] for a general (iterative / noniterative) denoising algorithm u_θ.

For inverse problems modeled by (1), x^H u_θ(y) can be manipulated in terms of y (and ξ, and thus allows the estimation of MSE(θ) using statistics of ξ) only in some special instances, e.g., when u_θ(y) ∈ R{A^H}, the range space of A^H [24, Sec. IV],² or when A has full column rank [25, Sec. 4].² In many applications, A has a nontrivial null-space N{A}: information about x contained in N{A} is not accessible from y (and statistics of ξ) and it is impossible to estimate MSE(θ) [24] in such cases. An alternative is to compute the error using only the components of x that lie in the orthogonal complement of N{A}: N{A}⊥ = R{A^H} [24], [25]; these components

² If u_θ ∈ R{A^H}, we can write u_θ = A^H g_θ for some operator g_θ, so that x^H u_θ(y) = (y − ξ)^H g_θ(y) [23, Sec. 3.1]. Alternatively, if A has full column rank, then x^H u_θ(y) = (y − ξ)^H A(A^H A)^{−1} u_θ(y).


are in turn accessible from y (and ξ). Such an error measure corresponds to projecting³ the error (x − u_θ(y)) on to R{A^H} and is given by [24], [25]

    Projected-MSE(θ) = M^{−1}‖P(x − u_θ(y))‖²₂,    (10)

where P = A^H(AA^H)†A is the projection operator and (·)† represents pseudo-inverse.

Another quadratic error measure for inverse problems that is amenable to estimation (using statistics of ξ)³ is Predicted-MSE [20], [21] that corresponds to computing the error in the data-domain:

    Predicted-MSE(θ) = M^{−1}‖A(x − u_θ(y))‖²₂.    (11)

Both (10) and (11) can be interpreted as particular instances of the following general weighted form:

    WMSE(θ) = M^{−1}‖A(x − u_θ(y))‖²_W,    (12)

where ‖x‖²_W = x^H W x, and W is a Hermitian-symmetric, W = W^H, positive definite, W ≻ 0, weighting matrix. For (11), W = I_M and the overall weighting is provided by the eigenvalues of A^H A, while for (10) it is easy to see that W = W_inv = (AA^H)† since P^H P = P. For image restoration with circulant A, W_inv can be easily implemented using FFTs. For MRI with partial Cartesian k-space sampling, A = MQ (see Section II) leads to W_inv = (MQQ^H M^H)† = I_M, so Projected-MSE and Predicted-MSE are equivalent and correspond to evaluating squared-error at the sample locations in k-space.

Similar to SURE(θ), an estimator for WMSE(θ) can be derived under the Gaussian assumption as summarized in the following results.

Lemma 1: Let u_θ : Ω^M → Ω^N be differentiable (for Ω = R) or individually analytic (for Ω = C with respect to real and imaginary parts of its argument), respectively, in the weak sense of distributions [50, Ch. 6]. Then, for any deterministic T ∈ Ω^{M×N} satisfying E_ξ{|[T u_θ(y)]_m|} < ∞, m = 1 . . . M, we have that

    E_ξ{ξ^H T u_θ(y)} = σ² E_ξ{tr{T J(u_θ, y)}}.    (13)

The proof is very similar to those in [34, Lemma 1], [24, Th. 1] for Ω = R, while it constitutes a straightforward extension of [24, Th. 1] for Ω = C and is presented as supplementary material (due to page limits).⁴

Theorem 1: Let u_θ and T = WA satisfy the hypotheses in Lemma 1 for A ∈ Ω^{M×N} in (1) and for a Hermitian-symmetric, positive definite matrix W ∈ Ω^{M×M}. Then for Ω = R or C, the random variable

    WSURE(θ) = M^{−1}‖y − Au_θ(y)‖²_W − σ²M^{−1} tr{W} + 2σ²M^{−1} R{tr{WAJ(u_θ, y)}}    (14)

is an unbiased estimator of WMSE(θ) in (12), i.e., E_ξ{WMSE(θ)} = E_ξ{WSURE(θ)}. The proof is straightforward and uses (13) to estimate ξ^H WAu_θ(y) in WMSE(θ). Similar to SURE(θ), WSURE(θ) is independent of x and depends purely on the noise variance σ², the data and the reconstruction algorithm. The Monte-Carlo scheme [36, Th. 2] that uses numerical differentiation for a general nonlinear u_θ may be adapted to iteratively estimate tr{WAJ(u_θ, y)} in (14) for the case of Ω = R by considering WAu_θ instead of u_θ in [36, Eq. 14]. In this paper, we propose to evaluate J(u_θ, y) analytically for Ω = R and C. This process depends on the choice of the estimator u_θ, the regularization Ψ in (2) and the nature of the application (e.g., Ω = R for restoration and Ω = C for MRI), and therefore needs to be accomplished on a case-by-case basis.

³ Since P^H P = P, the cross-term x^H P^H P u_θ(y) in Projected-MSE(θ) (10) is nothing but x^H A^H(AA^H)†A u_θ(y) = (y − ξ)^H (AA^H)†A u_θ(y). For Predicted-MSE(θ) (11), we have that x^H A^H A u_θ(y) = (y − ξ)^H A u_θ(y).

⁴ Supplementary material containing a proof of Lemma 1 and additional illustrations for experimental results is available at http://tinyurl.com/supmat.
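For reference, (14) can be evaluated as follows once the trace term is available (e.g., from the Monte-Carlo recursions of Section IV-E); this is a sketch under our own naming, with W supplied as a linear operator:

```python
import numpy as np

def wsure(y, Au, W_apply, trace_W, trace_WAJ, sigma2):
    """WSURE (14): M^-1 ||y - A u||_W^2 - sigma^2 M^-1 tr{W}
                   + 2 sigma^2 M^-1 Re{tr{W A J(u,y)}}.
    W_apply(r) applies W to a residual vector; for Predicted-SURE (W = I_M),
    W_apply is the identity and trace_W = M."""
    M = y.size
    r = y - Au
    data_term = np.real(np.vdot(r, W_apply(r))) / M   # np.vdot conjugates r: r^H W r
    return data_term - sigma2 * trace_W / M + 2.0 * sigma2 * np.real(trace_WAJ) / M
```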

IV. EVALUATION OF THE JACOBIAN MATRIX J(u_θ, y)

For nonquadratic regularizers, there is no closed-form expression for the estimator u_θ in (2), so it is not possible to evaluate J(u_θ, y) in (14) directly. In this section, we show how to compute J(u_θ, y) recursively for two types of iterative algorithms used for minimizing J in (2). Henceforth, we leave implicit the dependence of u_θ(y) on y and drop the subscript θ when necessary, so that u represents either the estimator or the iteratively-reconstructed estimate depending on the context.

We focus on IRLS-MIL [47], [48] that is a fast variant of the standard iterative reweighted least-squares (IRLS) and the split-Bregman (SB) algorithm [46] that is based on variable splitting. Both algorithms are computationally efficient,⁵ and can be employed for image restoration and MRI reconstruction [46]–[48], [56]. Furthermore, they can accommodate a general class of regularization criteria of the form

    Ψ(Ru) = λ Σ_{l=1}^L Φ_l( Σ_{p=1}^P |[R_p u]_l|^q ),    (15)

where λ > 0 is the regularization parameter, Φ_l are potential functions, R ∈ R^{R×N} with R = [R⊤_1 · · · R⊤_P]⊤, R_p ∈ R^{L×N} are regularization operators (e.g., finite differences, frames, etc.) and R = PL. We consider the following convex instances of (15) that are popularly used for image restoration and MRI reconstruction:

• Analysis ℓ1-regularization (Φ_l(x) = x, q = 1):

    Ψ_ℓ1(Ru) = λ‖Ru‖₁ = λ Σ_{p=1}^P Σ_{l=1}^L |[R_p u]_l|,    (16)

• Total variation (TV) (Φ_l(x) = √x, q = 2):

    Ψ_TV(Ru) = λ Σ_{l=1}^L sqrt( Σ_{p=1}^P |[R_p u]_l|² ).    (17)
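As an illustration of (16)-(17), both regularizers can be evaluated from the P filtered images R_p u; the sketch below (our naming) uses horizontal and vertical periodic finite differences for R_p, matching the TV setup used in the experiments:

```python
import numpy as np

def finite_differences(u):
    """R_p u for p = 1, 2: horizontal/vertical differences, periodic boundaries."""
    return [np.roll(u, -1, axis=0) - u, np.roll(u, -1, axis=1) - u]

def psi_l1(Ru_list, lam):
    """Analysis l1-regularization (16): lam * sum_p sum_l |[R_p u]_l|."""
    return lam * sum(np.sum(np.abs(Rpu)) for Rpu in Ru_list)

def psi_tv(Ru_list, lam):
    """TV (17): lam * sum_l sqrt(sum_p |[R_p u]_l|^2), isotropic for P = 2."""
    return lam * np.sum(np.sqrt(sum(np.abs(Rpu) ** 2 for Rpu in Ru_list)))
```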

We derive J(u_θ, y) for image restoration with the IRLS-MIL algorithm (see Sections IV-A–IV-B) and MRI reconstruction with the SB algorithm (see Sections IV-C–IV-D). Derivations of J(u_θ, y) for other combinations (i.e., image restoration with the SB algorithm and MRI reconstruction with the IRLS-MIL algorithm) can be accomplished in a similar manner and are not considered for brevity.

⁵ IRLS-MIL has been demonstrated to converge faster than conventional methods (e.g., nonlinear conjugate gradient) [47], [48], while SB is more versatile and computationally efficient than fixed-point continuation and graph-cuts-based solvers [46].

The derivations in Sections IV-A–IV-D can also be extended, in principle, to other instances of (15) such as smooth convex edge-preserving regularizers for q = 1, e.g., Huber [39], Fair or smoothed-Laplacian [48], and synthesis forms, e.g., by considering a variable w to be estimated such that x = Sw and using ASw in (1) and Φ(w) in (15), for some potential function Φ and synthesis operator S [25], [28], [37], [38].

A. Image Restoration with IRLS-MIL Algorithm

IRLS-MIL uses matrix-splitting [57, pp. 46-50] and the matrix inversion lemma (MIL) for efficient preconditioning and fast solving of iteration-dependent linear systems arising in the standard IRLS scheme [47], [48]. We summarize the IRLS-MIL iterations below (detailed derivation of IRLS-MIL can be found in [47], [48]) for image restoration (Ω = R in (1)): for any matrix C such that C ≻ A⊤A, i.e.,

    z⊤(C − A⊤A)z > 0, ∀ z ∈ R^N,    (18)

at outer i-iteration of IRLS-MIL, we perform inner j-iterations involving the following two steps:

    u^(i+1,j+1) = b^(i+1,j) − C^{−1} R⊤ v^(i+1,j),    (19)

    solve G^(i) v^(i+1,j) = R b^(i+1,j) for v^(i+1,j),    (20)

where

    b^(i+1,j) = C^{−1}A⊤y + (I_N − C^{−1}A⊤A) u^(i+1,j),    (21)

G^(i) = Γ^(i) + R C^{−1} R⊤ and Γ^(i) = diag{γ^(i)} is a diagonal matrix constructed from γ^(i) ∈ R^R with γ^(i) = [γ^(i)⊤_1 · · · γ^(i)⊤_P]⊤. The l-th element of γ^(i)_p ∈ R^L for (15) is given by

    γ^(i)_{pl} = |[R_p u^(i+1,0)]_l|^{2−q} [ qλ Φ′_l( Σ_{s=1}^P |[R_s u^(i+1,0)]_l|^q ) ]^{−1},    (22)

where Φ′ denotes the first derivative of Φ. For instance,

    γ^(i)_{pl} = λ^{−1} |[R_p u^(i+1,0)]_l|,    (23)

for the ℓ1-regularization in (16), while for TV-regularization (17),

    γ^(i) = 1_P ⊗ γ̄^(i),    (24)

where 1_P = [1 · · · 1]⊤ is a P × 1 vector, ⊗ denotes the Kronecker product and the l-th element of γ̄^(i) ∈ R^L is given by γ̄^(i)_l = λ^{−1} sqrt( Σ_{s=1}^P |[R_s u^(i+1,0)]_l|² ).
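The weights (23)-(24) reduce to simple point-wise formulas; a minimal sketch (our naming) for both regularizers:

```python
import numpy as np

def irls_weights_l1(Ru_list, lam):
    """gamma_p of (23) for l1-regularization: lambda^-1 |[R_p u]_l|."""
    return [np.abs(Rpu) / lam for Rpu in Ru_list]

def irls_weights_tv(Ru_list, lam):
    """gamma of (24) for TV: gamma-bar_l = lambda^-1 sqrt(sum_s |[R_s u]_l|^2),
    replicated over p (the Kronecker product 1_P (x) gamma-bar)."""
    gbar = np.sqrt(sum(np.abs(Rpu) ** 2 for Rpu in Ru_list)) / lam
    return [gbar.copy() for _ in Ru_list]
```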

When R is composed of sparsifying operators (e.g., finite differences, wavelets, frames, etc.), R_p u^(i,0) tends to become sparse for sufficiently large i, so for practically appealing instances of Ψ (e.g., when 1 ≤ q < 2 and Φ is an edge-preserving potential function, or for nonsmooth instances such as those based on the ℓ1-norm or TV), Γ^(i) will become sparse too. So in the standard IRLS scheme (which utilizes [Γ^(i)]^{−1} rather than Γ^(i) [47, Eqs. 3-6]), a small positive additive constant is included in Γ^(i) for maintaining numerical stability of [Γ^(i)]^{−1}; this is often referred to as corner-rounding [47]. However, no such corner-rounding is required for the IRLS-MIL scheme [47], [48] as it only utilizes Γ^(i).

To solve (20), we apply a matrix-splitting strategy (similar to [47, Eq. 7]) to G^(i) that leads to an iterative scheme for (20) (with iteration index k) with the following update step:

    v^(i+1,j,k+1) = D^{−1}_{Γ^(i)} (R b^(i+1,j) + H_ρ v^(i+1,j,k)),    (25)

where D_{Γ^(i)} = Γ^(i) + ρI_R is an invertible diagonal matrix, H_ρ = ρI_R − R C^{−1}R⊤ and

    ρ > max eigval{R C^{−1}R⊤}    (26)

depends only on A (via C) and R and can therefore be precomputed. In practice, we perform K iterations of (25) and apply the final update v^(i+1,j,K) in place of v^(i+1,j) in (19). We prefer (25) over a PCG-type solver for (20) as (25) is linear in both b^(·) and v^(·) and decouples the shift-variant component Γ^(i) from the rest of the terms in G^(i): these features simplify the analytical derivation of J(u_θ, y) for IRLS-MIL as demonstrated next.
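Since D_{Γ^(i)} is diagonal, one iteration of (25) costs one application of R C^{−1}R⊤ plus point-wise operations; a sketch with operator arguments of our own choosing:

```python
import numpy as np

def inner_v_update(v, Rb, gamma, rho, RCinvRt_apply):
    """One iteration of (25): v <- D_Gamma^-1 (R b + H_rho v), with
    D_Gamma = Gamma + rho I (diagonal) and H_rho v = rho v - R C^-1 R^T v.
    RCinvRt_apply(v) applies R C^-1 R^T, e.g., via FFTs when C is circulant."""
    Hrho_v = rho * v - RCinvRt_apply(v)
    return (Rb + Hrho_v) / (gamma + rho)
```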

B. Jacobian Matrix Derivation for IRLS-MIL Algorithm

Since b^(·) and v^(·) are functions of y (via u^(·)), using linearity of (3), at the end of K iterations of (25), we get the Jacobian matrix update corresponding to u^(i+1,j+1) from (19) as

    J(u^(i+1,j+1), y) = J(b^(i+1,j), y) − C^{−1}R⊤ J(v^(i+1,j,K), y).    (27)

From (21), we get that J(b^(i+1,j), y) = C^{−1}A⊤ + (I_N − C^{−1}A⊤A) J(u^(i+1,j), y). To obtain J(v^(i+1,j,K), y), we derive a recursive update for J(v^(i+1,j,k+1), y) from (25) using a straightforward application of the product rule for Jacobian matrices and the fact that (25) involves only a diagonal matrix D^{−1}_{Γ^(i)}:

    J(v^(i+1,j,k+1), y) = D^{−1}_{Γ^(i)} ( R J(b^(i+1,j), y) + H_ρ J(v^(i+1,j,k), y) ) − D^{−2}_{Γ^(i)} D_v J(γ^(i), y),    (28)

where D_v = diag{R b^(i+1,j) + H_ρ v^(i+1,j,k)}. Using the chain rule for Jacobian matrices [52], we have that

    J(γ^(i), y) = J(γ^(i), u^(i+1,0)) J(u^(i+1,0), y),    (29)

where J(γ^(i), u^(i+1,0)) ∈ R^{R×N} constitutes derivatives of {γ^(i)_l}_{l=1}^R with respect to {u_n}_{n=1}^N, evaluated at u^(i+1,0), and can be computed readily analytically for the regularizers in (15) using (22). For the ℓ1-regularization⁶ Ψ_ℓ1 in (16),

    J(γ^(i), u^(i+1,0)) = λ^{−1} diag{τ^(i)} R,    (30)

⁶ The derivatives are interpreted in the weak sense of distributions [50, Ch. 6] whenever conventional differentiation does not apply.


where the elements of τ^(i) ∈ R^R are τ^(i)_l = sign([R u^(i+1,0)]_l). For TV-regularization Ψ_TV in (17), we get (using tedious, but elementary calculus) that

    J(γ^(i), u^(i+1,0)) = 1_P ⊗ ( λ^{−1} Σ_{p=1}^P diag{ω^(i)_p} R_p ),    (31)

where the elements of ω^(i)_p ∈ R^L are given by

    ω^(i)_{pl} = [R_p u^(i+1,0)]_l / sqrt( Σ_{s=1}^P |[R_s u^(i+1,0)]_l|² ) = [R_p u^(i+1,0)]_l / (λ γ̄^(i)_l),    (32)

with the assumption that ω^(i)_{pl} = 0 whenever [R_p u^(i+1,0)]_l = 0, p = 1 . . . P.

Thus, in addition to running (19), (25) for obtaining u^(i,j), we propose to run the sequence of iterations (27)-(28) using (29)-(32) for iteratively evaluating J(u^(i,j), y) (and thus, NGCV(θ) and WSURE(θ)) at any stage of IRLS-MIL.
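In practice (see Section IV-E) only Jacobian-vector products J(·, y)n are propagated, never full matrices. The following sketch mirrors one inner IRLS-MIL step together with its Jacobian recursions (27)-(28); all operator arguments and names are ours, supplied as callables (e.g., FFT-based for circulant C):

```python
import numpy as np

def irls_mil_step_with_jacobian(ju, jv, v, Rb, CinvAt_n, ImCAtA_apply,
                                R_apply, Rt_apply, Cinv_apply,
                                gamma, rho, Hrho_apply, jgamma, K):
    """Propagate ju = J(u,y)n and jv = J(v,y)n through K iterations of (25)/(28)
    and the update (19)/(27). CinvAt_n = C^-1 A^T n is precomputed;
    ImCAtA_apply(x) = (I - C^-1 A^T A) x; Hrho_apply(x) = rho x - R C^-1 R^T x;
    jgamma = J(gamma, y) n comes from (29)-(32)."""
    jb = CinvAt_n + ImCAtA_apply(ju)       # J(b,y)n from (21)
    D = gamma + rho                        # diagonal of D_Gamma = Gamma + rho I
    for _ in range(K):
        t = Rb + Hrho_apply(v)             # R b + H_rho v, the diagonal of D_v in (28)
        jv = (R_apply(jb) + Hrho_apply(jv)) / D - (t / D**2) * jgamma
        v = t / D                          # companion v-update (25)
    ju = jb - Cinv_apply(Rt_apply(jv))     # update (27), paired with (19)
    return ju, jv, v
```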

C. MRI Reconstruction with Split-Bregman Algorithm

The split-Bregman (SB) algorithm [46] for solving (2) is based on a variable splitting strategy [46], [49], [56], where an auxiliary variable v ∈ C^R is used to artificially introduce the constraint v = Ru and separate Ru from Ψ, leading to

    min_{u,v} (1/2)‖y − Au‖²₂ + Ψ(v) subject to v = Ru    (33)

that is equivalent to (2). The above constrained problem is then solved using the so-called Bregman-iterations [46, Eqs. (2.9)-(2.10)] that consist of alternating between the minimization of an augmented Lagrangian (AL) function,⁷ L(u, v) = (1/2)‖y − Au‖²₂ + Ψ(v) + (µ/2)‖v − Ru − η‖²₂, jointly with respect to (u, v) [46, Eq. (3.7)] and a simple update step [46, Eq. (3.8)] for a Lagrange-multiplier-like vector, η ∈ C^R, for the constraint in (33). The penalty parameter µ > 0 does not influence the final solution of (33) and (2), but governs the convergence speed of the Bregman-iterations [46]. In practice, the joint-minimization step is often replaced by alternating minimization [46, Sec. 3.1], i.e., L is minimized with respect to u and v one at a time, which decouples the minimization step and simplifies optimization. We summarize the SB algorithm below for solving (33) (and equivalently (2), for MRI reconstruction):

    u^(i+1) = B^{−1}_µ [A^H y + µR⊤(v^(i) − η^(i))],    (34)

    v^(i+1) = d_Ψ(ρ^(i)) = argmin_v { Ψ(v) + (µ/2)‖v − ρ^(i)‖²₂ },    (35)

    η^(i+1) = ρ^(i) − v^(i+1),    (36)

where B_µ = A^H A + µR⊤R. Step (35) corresponds to a denoising problem to which we associate a denoising operator d_Ψ : C^R → C^R that acts on

    ρ^(i) = R u^(i+1) + η^(i)    (37)

⁷ For (33), L is equivalent to the sum of the Bregman-distance [46] and a quadratic penalty term for the constraint in (33) up to irrelevant constants.

to yield v^(i+1). For a general Ψ such as (15), (35) can be solved iteratively, in which case d_Ψ is representative of the iterative scheme used for (35). However, for several special instances of Ψ [58, Sec. 4] including (16)-(17), one can solve (35) exactly and d_Ψ admits an analytical closed-form expression. Specifically for (16)-(17), it can be shown that (35) further decouples in terms of the elements {v_r}_{r=1}^R of v, i.e., d_Ψ is a point-wise operator such that v^(i+1)_r = d_Ψ(ρ^(i)_r), r = 1 . . . R.

Before proceeding, it is helpful to introduce β^(i) = 1_P ⊗ β̄^(i), where the l-th element of β̄^(i) ∈ R^L is given by β̄^(i)_l = sqrt( Σ_{p=1}^P |ρ^(i)_{(p−1)L+l}|² ). Then d_Ψ for (16)-(17) can be obtained as [46], [49]:

    d_ℓ1(ρ^(i)_r) = ρ^(i)_r (1 − λ|µρ^(i)_r|^{−1}) 1(|ρ^(i)_r| > λ/µ),    (38)

    d_TV(ρ^(i)_r) = ρ^(i)_r (1 − λ(µβ^(i)_r)^{−1}) 1(β^(i)_r > λ/µ),    (39)

where 1(·) is an indicator function that takes the value 1 when the condition in its argument is satisfied and is 0 otherwise.
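The operators (38)-(39) are standard soft-threshold/joint-shrinkage rules; a vectorized sketch (our naming; the small epsilon only guards divisions that are masked out anyway):

```python
import numpy as np

def d_l1(rho_vec, lam, mu):
    """Point-wise shrinkage (38) for the l1 case."""
    mag = np.abs(rho_vec)
    keep = mag > lam / mu
    return rho_vec * np.where(keep, 1.0 - lam / (mu * np.maximum(mag, 1e-30)), 0.0)

def d_tv(rho_vec, beta, lam, mu):
    """Joint shrinkage (39) for the TV case; beta = 1_P (x) beta-bar holds the
    magnitudes beta_r = sqrt(sum_p |rho_{(p-1)L+l}|^2)."""
    keep = beta > lam / mu
    return rho_vec * np.where(keep, 1.0 - lam / (mu * np.maximum(beta, 1e-30)), 0.0)
```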

D. Jacobian Matrix Derivation for Split-Bregman Algorithm

We note that u^(·), v^(·), η^(·) and ρ^(·) are implicit functions of y and y⋆. Therefore, we evaluate⁸ J(u^(i), y) using (34) and the linearity of (4) to get that

    J(u^(i+1), y) = B^{−1}_µ [A^H + µR⊤(J(v^(i), y) − J(η^(i), y))].    (40)

For the complex-valued case (Ω = C), we also need to evaluate J(u^(i+1), y⋆) as explained next. For J(u^(i+1), y⋆), we treat y as a constant in (34) (see Remark 1) so that

    J(u^(i+1), y⋆) = µB^{−1}_µ R⊤(J(v^(i), y⋆) − J(η^(i), y⋆)).    (41)

For brevity, henceforth we use z to represent either y or y⋆ as required. From (35)-(36), we have that

    J(v^(i+1), z) = J(d_Ψ(ρ^(i)), z),    (42)

    J(η^(i+1), z) = J(ρ^(i), z) − J(v^(i+1), z).    (43)

Using the chain rule for J(d_Ψ(ρ^(i)), z) [53, Th. 1], we get that

    J(d_Ψ(ρ^(i)), z) = J(d_Ψ, ρ^(i)) J(ρ^(i), z) + J(d_Ψ, ρ^(i)⋆) (J(ρ^(i), z⋆))⋆,    (44)

where J(ρ^(i), z) = R J(u^(i+1), z) + J(η^(i), z) from (37). Thus, due to (44), both J(u^(·), y) and J(u^(·), y⋆) are required as mentioned earlier.

For the case of ℓ1-regularization (16), d_ℓ1 in (38) depends only on ρ^(i)_r and ρ^(i)⋆_r, so J(d_ℓ1, ρ^(i)) and J(d_ℓ1, ρ^(i)⋆) become diagonal matrices:

    J(d_ℓ1, ρ^(i)) = D_{1,ρ^(i)} ( I_R − (λ/2µ) D^{−1}_{|ρ^(i)|} ),    (45)

    J(d_ℓ1, ρ^(i)⋆) = (λ/2µ) D_{1,ρ^(i)} D²_{ρ^(i)} D^{−3}_{|ρ^(i)|},    (46)

⁸ The derivatives are interpreted in the weak sense of distributions [50, Ch. 6] whenever conventional differentiation does not apply.


where

    D_{1,ρ^(i)} = diag{ 1(|ρ^(i)_r| > λ/µ) }_{r=1}^R,    (47)

    D_{ρ^(i)} = diag{ρ^(i)},    (48)

    D_{|ρ^(i)|} = diag{ |ρ^(i)_r| }_{r=1}^R.    (49)

For the case of TV-regularization (17), we apply the (tedious, but elementary) product rule to obtain

    J(d_TV, ρ^(i)) = D_{1,β^(i)} [ I_R − (λ/µ) D^{−1}_{β^(i)} + (λ/µ) D_{ρ^(i)} D^{−2}_{β^(i)} ( 1_P ⊗ J(β̄^(i), ρ^(i)) ) ],    (50)

    J(d_TV, ρ^(i)⋆) = (λ/µ) D_{1,β^(i)} D_{ρ^(i)} D^{−2}_{β^(i)} ( 1_P ⊗ (J(β̄^(i), ρ^(i)))⋆ ),    (51)

where

    J(β̄^(i), ρ^(i)) = (1/2) (1⊤_P ⊗ D^{−1}_{β̄^(i)}) D⋆_{ρ^(i)},    (52)

    D_{β^(i)} = diag{β^(i)},    (53)

    D_{1,β^(i)} = diag{ 1(β^(i)_r > λ/µ) }_{r=1}^R,    (54)

    D_{β̄^(i)} = diag{β̄^(i)}.    (55)

Thus, similar to the case of IRLS-MIL, we propose to run (34)-(36) for obtaining u^(i) and (40)-(55) for iteratively evaluating J(u^(i), y) (and thus, NGCV(θ) and WSURE(θ)) at any stage of the SB algorithm.

E. Monte-Carlo Trace Estimation

The Jacobian matrices J(·, ·) in Sections IV-B and IV-D have enormous sizes for typical reconstruction settings and cannot be stored and manipulated directly to compute the desired traces, tr{AJ(u_θ, y)} in (7) and tr{WAJ(u_θ, y)} in (14). So we use a Monte-Carlo method to estimate tr{AJ(u_θ, y)} and tr{WAJ(u_θ, y)} that is based on the following well-established identity [12], [59]–[61].

Proposition 1: Let n ∈ R^M be an i.i.d. zero-mean random vector with unit variance and T ∈ Ω^{M×M} (for Ω = R or C) be deterministic with respect to n. Then

    E_n{n⊤ T n} = tr{T}.    (56)

For practical applications, E_n{·} in (56) can be replaced by the sample mean, (1/N_r) Σ_{i=1}^{N_r} n⊤_i T n_i, with N_r independent realizations {n_i}_{i=1}^{N_r}. In image-processing applications where, typically, M is large and T has a sparse off-diagonal structure, t_T = n⊤ T n (corresponding to N_r = 1) provides a reliable estimate of tr{T} [11], [12], [25], [28]. To use this type of stochastic estimation for tr{AJ(u_θ, y)} and tr{WAJ(u_θ, y)}, we adopt the procedure applied in [25], [28]: we take products with n in (27)-(29), (40)-(44) and store and update vectors of the form J(u^(·), ·)n, J(v^(·), ·)n, J(η^(·), ·)n in the IRLS-MIL and SB algorithms, respectively. At any point during the course of the algorithms, the desired traces in (7) and (14) are stochastically approximated, respectively, as

    tr{AJ(u_θ, y)} ≈ t_NGCV = n⊤ A J(u^(·), y) n,    (57)

    tr{WAJ(u_θ, y)} ≈ t_WSURE = n⊤ W A J(u^(·), y) n.    (58)

To improve accuracy of (57)-(58), n can be designed to decrease the variance of t_NGCV and t_WSURE: it has been shown [59], [61] that the variance of a Monte-Carlo trace estimate (such as t_NGCV or t_WSURE) is lower for a binary random vector n_{±1}, whose elements are either +1 or −1 with probability 0.5, than for a Gaussian random vector n ∼ N(0, I_M) employed in [25], [28]. So in our experiments, we used one realization of n_{±1} in (57)-(58). Figs. 1, 2 present outlines for implementing the IRLS-MIL and SB algorithms with recursions for J(·, ·)n_{±1} to compute and monitor NGCV(θ) and WSURE(θ) as these algorithms evolve.
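Proposition 1 with a single ±1 probe is the estimator actually used in (57)-(58); a generic sketch (our naming), where T_apply would wrap the stored Jacobian-vector recursions:

```python
import numpy as np

def mc_trace(T_apply, M, rng, binary=True):
    """Single-probe Monte-Carlo trace estimate (56): t = n^T T(n).
    binary=True draws n with +/-1 entries (prob. 0.5 each), which has lower
    variance than a Gaussian probe [59], [61]."""
    if binary:
        n = 2.0 * rng.integers(0, 2, size=M) - 1.0
    else:
        n = rng.standard_normal(M)
    return n @ T_apply(n)

rng = np.random.default_rng(0)
print(mc_trace(lambda v: 3.0 * v, M=4096, rng=rng))   # estimates tr{3 I} = 12288
```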

F. Implementation of IRLS-MIL and Split-Bregman Algorithms

The convergence speed of IRLS-MIL (19), (25) depends primarily on the "proximity" of C to A⊤A while ensuring (18) [47], [48]. Ideally, we would like to choose the circulant matrix C_opt = Q^H diag{α_opt} Q, where Q is the DFT matrix and

    α_opt = argmin_{diag{α} ≻ QA⊤AQ^H} ||| diag{α} − QA⊤AQ^H |||

for some matrix norm ||| · |||, e.g., the Frobenius norm. However, α_opt can be both challenging and computationally expensive to obtain for a general A. For image restoration, typically, A⊤A ∈ R^{N×N} is circulant, so α_opt is simply the eigenvalues of A⊤A. In our experiments, we used C = C_ν = A⊤A + νI_N and implemented C^{−1}_ν using FFTs. The parameter ν > 0 was chosen to achieve a prescribed condition number of C_ν, κ(C_ν), that can be easily computed as a function of ν. In general, setting κ(C_ν) to a large value can lead to numerical instabilities in C^{−1}_ν and IRLS-MIL, while a small κ(C_ν) reduces convergence speed of IRLS-MIL [47], [48]. In our experiments, we found that ν leading to κ(C_ν) ∈ [20, 100] yielded good convergence speeds for a fixed number of outer (i.e., indexed by i) iterations of IRLS-MIL, so we simply set ν such that κ(C_ν) = 100.

For MRI reconstruction from partially sampled Cartesian k-space data, A^H A ∈ C^{N×N} is circulant [49]. We chose {R_p}_{p=1}^P in (16)-(17) to be shift-invariant with periodic boundary extensions so that R⊤R, and thus B_µ in (34), are circulant as well. Then we implemented B^{−1}_µ in (34) using FFTs. One way to select the penalty parameter µ for the SB algorithm is to minimize the condition number κ(B_µ) of B_µ: µ = µ_min = argmin_µ κ(B_µ) [46]. We found in our experiments that the empirical selection µ = µ_min × µ_factor with µ_factor ∈ [10^{−5}, 10^{−2}] yielded favorable convergence speeds of the SB algorithm for a fixed number of iterations compared to using µ_min, so we set µ_factor = 10^{−4} throughout.
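Both tuning rules reduce to scalar computations over FFT eigenvalues when A⊤A (or A^H A) and R⊤R are circulant; a sketch under that assumption (function names are ours):

```python
import numpy as np

def nu_for_condition(eigs_AtA, kappa_target=100.0):
    """Pick nu so that kappa(C_nu) = (emax + nu) / (emin + nu) = kappa_target,
    where eigs_AtA are the (FFT) eigenvalues of the circulant A^T A."""
    emax, emin = eigs_AtA.max(), eigs_AtA.min()
    return (emax - kappa_target * emin) / (kappa_target - 1.0)  # needs emax > kappa*emin

def mu_min_kappa(eigs_AHA, eigs_RtR, grid=np.logspace(-8, 2, 200)):
    """Grid search for mu minimizing kappa(B_mu), B_mu = A^H A + mu R^T R,
    with both matrices diagonalized by the same DFT."""
    kappas = [(eigs_AHA + m * eigs_RtR).max() / (eigs_AHA + m * eigs_RtR).min()
              for m in grid]
    return grid[int(np.argmin(kappas))]
```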

V. EXPERIMENTAL RESULTS

A. Experimental Setup

In all our experiments, we focused on tuning the regularization parameter λ in (16)-(17) for a fixed number of (outer)


1) Initialization: u^(0,0) = A⊤y, J(u^(0,0), y)n = A⊤n, i = 0
2) Repeat Steps 3-12 until Stop Criterion is met
3) If i = 0:
   u^(i+1,0) = u^(i,0), v^(i+1,0,0) = Ru^(i,0), J(u^(i+1,0), y)n = J(u^(i,0), y)n, J(v^(i+1,0,0), y)n = RJ(u^(i,0), y)n
   Else:
   u^(i+1,0) = u^(i,J), v^(i+1,0,0) = v^(i,J−1,K), J(u^(i+1,0), y)n = J(u^(i,J), y)n, J(v^(i+1,0,0), y)n = J(v^(i,J−1,K), y)n
4) Compute Γ^(i) using (22); set j = 0
5) Run J iterations of Steps 6-10
6) Compute b^(i+1,j) (21) and J(b^(i+1,j), y)n
7) If j > 0, set v^(i+1,j,0) = v^(i+1,j−1,K) and J(v^(i+1,j,0), y)n = J(v^(i+1,j−1,K), y)n
8) Run K iterations of (25) and (28) to get v^(i+1,j,K) and J(v^(i+1,j,K), y)n
9) Compute u^(i+1,j+1) (19) and J(u^(i+1,j+1), y)n (27)
10) Set j = j+1 and return to Step 5
11) Compute NGCV(θ) and / or WSURE(θ) at iteration i using (57)-(58), (7) and (14), respectively
12) Set i = i+1 and return to Step 2

Fig. 1. Iterative computation of WSURE(θ) and NGCV(θ) for image deblurring using the IRLS-MIL algorithm (with J iterations of (19)-(25) and K iterations of (25)). We use a pregenerated binary random vector n = n_{±1} for Monte-Carlo computation (57)-(58) of the required traces in (7) and (14), respectively. Vectors of the form J(·^(·), ·)n are stored and manipulated in place of actual matrices J(·^(·), ·).

1) Initialization: u^(0) = A^H y, v^(0) = Ru^(0), η^(0) = 0, i = 0, J(u^(0), y)n = A^H n, J(v^(0), y)n = RA^H n, J(η^(0), y)n = 0, J(u^(0), y⋆)n = 0, J(v^(0), y⋆)n = 0, J(η^(0), y⋆)n = 0
2) Repeat Steps 3-7 until Stop Criterion is met
3) Compute u^(i+1), J(u^(i+1), y)n, J(u^(i+1), y⋆)n, respectively, using (34), (40), (41)
4) Compute v^(i+1) using (35), (38)-(39) and J(v^(i+1), y)n, J(v^(i+1), y⋆)n, respectively, using (42), (44)-(55)
5) Compute η^(i+1), J(η^(i+1), y)n, J(η^(i+1), y⋆)n, respectively, using (36) and (43)
6) Compute NGCV(θ) and / or WSURE(θ) at iteration i using (57)-(58), (7) and (14), respectively
7) Set i = i+1 and return to Step 2

Fig. 2. Iterative computation of WSURE(θ) and NGCV(θ) for MRI reconstruction with the split-Bregman algorithm. We use a pregenerated binary random vector n = n_{±1} for Monte-Carlo computation (57)-(58) of the required traces in (7), (14), respectively. Vectors of the form J(·^(·), ·)n are stored and manipulated in place of actual matrices J(·^(·), ·).

iterations for both the IRLS-MIL and SB algorithms, although in principle, we can apply the greedy method⁹ of Giryes et al. [25, Sec. 5.2] to minimize WSURE and NGCV as functions of both the number of iterations and λ. For IRLS-MIL, we used J = K = 1 (J iterations of (19)-(25) and K iterations of (25), see Fig. 1) and set the maximum number of iterations (indexed by i) to 100 for both algorithms. We used 2 levels of the undecimated Haar wavelet transform (excluding the approximation level) for R in Ψ_ℓ1 (16) and horizontal and vertical finite differences for {R_p}_{p=1}^2 in Ψ_TV (17), all with periodic boundary extensions.

Both NGCV (7) and WSURE (14) require the evaluation of J(u_θ, y), therefore their computation costs are similar for a given reconstruction algorithm. We evaluated NGCV(λ) in (7) for image restoration and MRI reconstruction and the (oracle) MSE using (8). We assumed that σ² (the variance of noise in y) was known¹⁰ in all simulations to compute the following WSURE-based measures: Predicted-SURE(λ) with W = I_M in (14) and Projected-SURE(λ) with W = (AA^H)† in (14), which correspond to Predicted-MSE(λ) (11) and Projected-MSE(λ) (10), respectively. For image restoration, we computed W = (AA^H)† for Projected-SURE using FFTs.¹¹ For MRI reconstruction from partially sampled Cartesian k-space data, Predicted-SURE and Projected-SURE are equivalent (since W = I_M, see Section III-B) and correspond to evaluating the error at sample locations in the k-space.

⁹ The Jacobian matrix J(u^(·), y) is updated at every (outer) iteration of the IRLS-MIL and SB algorithms (see Figs. 1, 2), so (57)-(58) can be used to monitor NGCV(θ) and WSURE(θ), respectively, during the course of the algorithms as elucidated in Figs. 1, 2.

¹⁰ In practice, σ can be estimated fairly reliably using, e.g., the techniques proposed in [22, Sec. V].

¹¹ We set the eigenvalues of AA^H below a threshold of 10^{−5} to zero for numerical stability of (AA^H)†.

B. Results for Image Restoration

We performed three sets of experiments with simulated data corresponding to the setups (with standard blur kernels [37]) summarized in Table I. In each simulation, data was generated corresponding to a blur kernel and a prescribed BSNR (SNR of blurred and noisy data) [43]. IRLS-MIL was then applied for varying λ and the quality of the deblurred images was assessed by computing Projected-SURE(λ), Predicted-SURE(λ) and NGCV(λ). We also included the following GCV-measure adapted from [25, Eq. 11] in our tests:

    RLGCV(λ) = M^{−1}‖y − Au_λ(y)‖²₂ / (1 − M^{−1} n⊤_± A u_λ(n_±))²,    (59)

where n_± is the binary random vector specified in Section IV-E. RLGCV in (59) is a randomized version of LGCV in (6) that applies to linear algorithms but has been suggested for use with nonlinear algorithms as well in [25]. We minimized these measures over λ using golden-section search and calculated the improvement in SNR (ISNR) [43] of corresponding deblurred images (after minimizing the various measures).
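The measures above are minimized over λ by a scalar golden-section search; a self-contained sketch (ours), searching in log₁₀ λ as is customary for regularization parameters:

```python
import numpy as np

def golden_section_min(f, lam_lo, lam_hi, tol=1e-2):
    """Minimize f(lambda) over [lam_lo, lam_hi] via golden-section search
    on log10(lambda); f is, e.g., WSURE or NGCV as a function of lambda."""
    lo, hi = np.log10(lam_lo), np.log10(lam_hi)
    invphi = (np.sqrt(5.0) - 1.0) / 2.0
    c, d = hi - invphi * (hi - lo), lo + invphi * (hi - lo)
    fc, fd = f(10.0**c), f(10.0**d)
    while hi - lo > tol:
        if fc < fd:                      # minimum lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - invphi * (hi - lo)
            fc = f(10.0**c)
        else:                            # minimum lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + invphi * (hi - lo)
            fd = f(10.0**d)
    return 10.0 ** ((lo + hi) / 2.0)
```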

Tables II and III summarize the ISNR-results for Experiments IR-A and IR-B, respectively, for varying BSNR. Minimization of Projected-SURE(λ) yields deblurred images with ISNR (reasonably) close to the corresponding minimum-MSE (oracle) result in all cases. Surprisingly, data-domain predicted-type measures Predicted-SURE and NGCV, which


TABLE I
SETUP FOR IMAGE RESTORATION (IR) EXPERIMENTS

Experiment | Test image (256 × 256) | Blur                                 | Regularization
IR-A       | Cameraman              | Uniform 9 × 9                        | Ψ_TV
IR-B       | House                  | (1 + x₁² + x₂²)⁻¹, −7 ≤ x₁, x₂ ≤ 7   | Ψ_ℓ1
IR-C       | Cameraman              | Uniform (with varying sizes)         | Ψ_TV

TABLE II
ISNR† (IN dB) OF DEBLURRED IMAGES FOR EXPERIMENT IR-A AND VARYING BSNR

BSNR | σ²       | MSE (oracle) | Projected-SURE | Predicted-SURE | NGCV  | RLGCV in (59)
20   | 3.08e+01 | 3.85         | 3.73           | 3.84           | 3.84  | 2.45
30   | 3.08e+00 | 5.85         | 5.84           | 5.85           | 5.85  | 2.40
40   | 3.08e-01 | 8.50         | 8.50           | 8.49           | 8.49  | 2.41
50   | 3.08e-02 | 11.02        | 10.97          | 11.00          | 11.01 | 2.38

TABLE III
ISNR† (IN dB) OF DEBLURRED IMAGES FOR EXPERIMENT IR-B AND VARYING BSNR

BSNR | σ²       | MSE (oracle) | Projected-SURE | Predicted-SURE | NGCV  | RLGCV in (59)
20   | 1.65e+01 | 5.85         | 5.80           | 5.83           | 5.72  | 3.48
30   | 1.65e+00 | 8.49         | 8.49           | 8.49           | 8.49  | 2.94
40   | 1.65e-01 | 11.68        | 11.68          | 11.67          | 11.63 | 2.85
50   | 1.65e-02 | 16.00        | 15.76          | 15.76          | 15.76 | 2.85

TABLE IV
ISNR† (IN dB) OF DEBLURRED IMAGES FOR EXPERIMENT IR-C: UNIFORM BLUR OF VARYING SIZES AND BSNR = 40 dB

Blur size | σ²       | MSE (oracle) | Projected-SURE | Predicted-SURE | NGCV | RLGCV in (59)
5 × 5     | 3.36e-01 | 9.82         | 9.82           | 9.74           | 9.74 | 2.67
9 × 9     | 3.08e-01 | 8.50         | 8.50           | 8.48           | 8.48 | 2.41
15 × 15   | 2.78e-01 | 7.42         | 7.38           | 7.42           | 7.42 | 2.22
21 × 21   | 2.57e-01 | 6.86         | 6.78           | 6.82           | 6.82 | 2.33

† ISNR values within 0.1 dB of the oracle are indicated in bold in Tables II-IV.

[Figure 3 (two plots of ISNR in dB versus λ on a log₁₀ scale, comparing MSE-optimal, Projected-SURE-, Predicted-SURE-, NGCV- and RLGCV-based selections) appears here.]

Fig. 3. Plot of ISNR(λ) as a function of regularization parameter λ. Left: Experiment IR-A corresponding to third row of Table II; Right: Experiment IR-B corresponding to third row of Table III. The plots indicate that the λ's that minimize Projected-SURE, Predicted-SURE, NGCV and the (oracle) MSE are very close to each other. RLGCV-based selection (59) is far away from the oracle MSE-based selection and leads to over-smoothing and loss of details, e.g., see Fig. 4g.

are known to undersmooth linear deblurring algorithms [22], [24], also consistently yield ISNRs that are remarkably near the corresponding oracle-ISNRs. These observations are also substantiated by Fig. 3 where we plot ISNR(λ) versus λ for specific instances of Experiments IR-A and IR-B: ISNRs corresponding to the optima of Projected-SURE(λ), Predicted-SURE(λ) and NGCV(λ) are close to the oracle-ISNR. Accordingly, the deblurred images (corresponding to an instance of Experiment IR-A) obtained by minimizing Projected-SURE(λ) (Fig. 4d), Predicted-SURE(λ) (Fig. 4e) and NGCV(λ) (Fig. 4f) closely resemble the corresponding minimum-MSE result (Fig. 4c) in terms of visual appearance. We present additional illustrations (for Experiments IR-A and IR-B) that corroborate these inferences as supplementary material.⁴

To further investigate the potential of Predicted-SURE and NGCV, we generated y corresponding to uniform blur of varying sizes (for a fixed BSNR of 40 dB: Experiment IR-C) and minimized the various measures (using golden-section search) in each case. The ISNR-results summarized in Table IV for this experiment indicate that minimization of Predicted-SURE(λ) and NGCV(λ) (and also Projected-SURE(λ)) leads to deblurred images with ISNRs close to that of the corresponding MSE-optimal ones. We obtained similar promising results at


[Figure 4 (panels (a)-(g)) appears here.]

Fig. 4. Experiment IR-A corresponding to third row of Table II: Zoomed images of (a) Noisefree Cameraman; (b) Blurred and noisy data; and TV-deblurred images with regularization parameter λ selected to minimize (c) (oracle) MSE (8.50 dB); (d) Projected-SURE (8.50 dB); (e) Predicted-SURE (8.49 dB); (f) NGCV (8.49 dB); (g) RLGCV in (59) (2.41 dB). Projected-SURE-, Predicted-SURE- and NGCV-based results (d)-(f) visually resemble the oracle MSE-based result (c) very closely, while the RLGCV-based (59) result is considerably over-smoothed.

varying (BSNR = 20, 30 dB) levels of noise (results not shown). These observations suggest that Predicted-SURE and NGCV may be reasonable alternatives to Projected-SURE for tuning λ for nonlinear restoration.

In all image-restoration experiments, the RLGCV-measure (59) yielded λ-values that were larger (by at least an order of magnitude, e.g., see Fig. 3) than the corresponding oracle-optimum λ, leading to over-smoothing and loss of details (e.g., see Fig. 4g), and thus reduced ISNR (see the RLGCV column in Tables II-IV). These results are perhaps due to the fact that RLGCV (based on LGCV in (6)) is primarily designed for linear algorithms and is therefore unable to cope with the nonlinearity of (2) for the strongly nonquadratic regularizers in (16)-(17). On the contrary, NGCV (7), which is specifically designed to handle nonlinear algorithms [16], [17], provides a reliable means of selecting λ for nonlinear restoration. We do not show results for RLGCV hereafter.

C. Results for MRI Reconstruction

We conducted experiments with both synthetic and real MR data (setups summarized in Table V) for MRI reconstruction.

In the synthetic case, we considered two test images (of size 256 × 256): the Shepp-Logan phantom (Experiment MRI-A) and a noisefree T2-weighted MR image (Experiment MRI-B, see Fig. 6a) from the Brainweb database [62]. Partial sampling of k-space was simulated by applying a sampling mask (confined to a Cartesian grid)¹² on the Fourier transform (FFT) of the test images. We considered two types of masks corresponding to a near uniform (less than Nyquist rate) but random sampling of k-space with an 8 × 8 fully sampled¹³ central portion (see Fig. 6b) and radial patterns that densely sample the center¹³ but sparsely sample the outer k-space (e.g., see Fig. 7b). Complex (i.i.d., zero-mean) Gaussian noise of appropriate variance was added at sample locations to simulate noisy data of prescribed SNR in Experiments MRI-A and MRI-B.

¹² Cartesian undersampling is more appropriate for 3-D MRI in practice and is applied here retrospectively for 2-D MRI for illustration purposes.

¹³ Partial k-space sampling schemes typically involve dense sampling of the central portion (as that contains most of the signal energy) and undersampling of outer portions of k-space [49], respectively.


TABLE V
SETUP FOR EXPERIMENTS WITH SIMULATED AND REAL MR DATA

Experiment | Test image / Real MR data (256 × 256) | Retrospective (Cartesian) undersampling | Regularization
MRI-A | Shepp-Logan phantom | Radial (30 lines, 89% undersampling) | ΨTV
MRI-B | Noisefree T2-weighted MR image | Random (60% undersampling) | Ψℓ1
MRI-C | Real GE-phantom dataset | Radial (with varying number of lines) | ΨTV

TABLE VI
PSNR† (IN dB) OF MRI RECONSTRUCTIONS FOR EXPERIMENT MRI-A AND VARYING DATA SNR

Data SNR (in dB) | σ² | MSE (oracle) | Predicted-SURE | NGCV
30 | 2.69e+01 | 13.69 | 13.66 | 12.72
40 | 2.69e+00 | 22.28 | 22.21 | 21.68
50 | 2.69e-01 | 31.90 | 31.86 | 30.74
60 | 2.69e-02 | 42.33 | 42.33 | 42.12

TABLE VII
PSNR† (IN dB) OF MRI RECONSTRUCTIONS FOR EXPERIMENT MRI-B AND VARYING DATA SNR

Data SNR (in dB) | σ² | MSE (oracle) | Predicted-SURE | NGCV
30 | 1.33e+01 | 7.77 | 7.33 | 7.33
40 | 1.33e+00 | 10.58 | 10.53 | 10.38
50 | 1.33e-01 | 11.62 | 11.62 | 11.58
60 | 1.33e-02 | 11.83 | 11.81 | 11.83

TABLE VIII
PSNR† (IN dB) OF MRI RECONSTRUCTIONS FOR EXPERIMENT MRI-C AND VARYING UNDERSAMPLING RATES

Number of radial lines | % undersampling | MSE (oracle) | Predicted-SURE | NGCV
20 | 93 | 26.19 | 26.16 | 25.82
30 | 89 | 30.08 | 30.06 | 29.42
40 | 85 | 31.71 | 31.69 | 31.09
50 | 82 | 33.03 | 33.03 | 32.16
60 | 78 | 33.65 | 33.56 | 32.85

† PSNR values within 0.1 dB of the oracle are indicated in bold in Tables VI-VIII.

[Fig. 5: panels (a) and (b) plot PSNR (in dB) versus λ (log10 scale), with markers for the MSE-optimal, Predicted-SURE-based, and NGCV-based selections.]

Fig. 5. Plot of PSNR(λ) as a function of the regularization parameter λ: (a) Experiment MRI-B corresponding to the second row of Table VII; (b) Experiment MRI-C corresponding to the fourth row of Table VIII. The plots indicate that the λ's that minimize Predicted-SURE and the (oracle) MSE are very close to each other and lead to almost identical PSNRs. The NGCV-based selection is away from the MSE-based selection in both plots: in case (a) it still yields a reconstruction (Fig. 6f) that is agreeably close to the oracle (Fig. 6d) in terms of PSNR and visual quality, but in case (b) it leads to a slight reduction in PSNR, and correspondingly the reconstruction (Fig. 7f) exhibits slightly more artifacts at the center and around the object.

sets of fully-sampled 2-D data (256 × 256) of a GE phantom using a GE 3T scanner (gradient-echo sequence with flip angle = 35°, repetition time = 200 ms, echo time = 7 ms, field of view (FOV) = 15 cm, and voxel size = 0.6 × 0.6 mm²). These fully-sampled datasets were used to reconstruct (using iFFT) 2-D images that were then averaged to obtain a reference image that served as the true "unknown" x (see Fig. 7a) for computing the oracle MSE (8). We separately acquired 2-D data from a dummy scan (with the same scan settings) in which no RF field was applied. We used this dummy data to estimate σ² by the empirical variance. We retrospectively undersampled data from one of the 10 sets by applying radial sampling patterns (confined to a Cartesian grid)^13 with a varying number of spokes in Experiment MRI-C.
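A minimal sketch of this empirical variance estimate, assuming dummy_kspace holds the complex samples from the no-RF dummy scan (the function name is ours):

```python
import numpy as np

def estimate_sigma2(dummy_kspace):
    # Empirical variance of the complex noise: sigma^2 = E{|xi|^2},
    # estimated from noise-only (no-RF) dummy-scan samples.
    d = np.asarray(dummy_kspace).ravel()
    d = d - d.mean()               # remove any residual offset
    return float(np.mean(np.abs(d) ** 2))
```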

We ran the SB algorithm and minimized Predicted-SURE(λ) and NGCV(λ) using golden-section search for each instance of Experiments MRI-A, MRI-B, and MRI-C, respectively. Tables VI-VIII present the PSNR (computed as 20 log₁₀(√N max{x} / ‖x − u_λ(y)‖₂)) of the reconstructions obtained after minimizing Predicted-SURE(λ) and NGCV(λ).
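In code, this PSNR definition reads as below (a direct transcription of the stated formula; we assume the reference x is real-valued or used via its magnitude):

```python
import numpy as np

def psnr_db(x, u):
    # PSNR = 20*log10( sqrt(N) * max(x) / ||x - u_lambda(y)||_2 )
    x = np.abs(np.asarray(x)).ravel()   # reference image (magnitude)
    u = np.abs(np.asarray(u)).ravel()   # reconstruction u_lambda(y)
    return 20.0 * np.log10(np.sqrt(x.size) * x.max()
                           / np.linalg.norm(x - u))
```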


Fig. 6. Simulations corresponding to Experiment MRI-B and the second row of Table VII: (a) Noisefree T2-weighted MR test image; (b) Retrospective random undersampling (black dots indicate sample locations on a Cartesian grid, 60% undersampling); (c) Magnitude of the zero-filled iFFT reconstruction from undersampled data (−2.88 dB); and magnitude of reconstructions obtained (using analysis ℓ1-regularization with 2 levels of undecimated Haar wavelets) with regularization parameter λ selected to minimize (d) (oracle) MSE (10.58 dB); (e) Predicted-SURE (10.53 dB); (f) NGCV (10.38 dB). Regularized reconstructions (d)-(f) have reduced noise and artifacts compared to the zero-filled iFFT reconstruction (c). Both the Predicted-SURE-based and NGCV-based results (e) and (f) closely resemble the oracle MSE-based result (d) in this experiment.

In almost all experiments, NGCV-based selections resulted in worse PSNRs than those corresponding to the Predicted-SURE selections. This is also corroborated by Fig. 5, where we plot PSNR(λ) as a function of λ for specific instances of Experiments MRI-B and MRI-C. NGCV-based selections are away (approximately by an order of magnitude) from both the Predicted-SURE and oracle selections. As the PSNR profile in Fig. 5a exhibits a plateau^14 over a large range of λ-values, the NGCV-based reconstruction in Fig. 6f is visually similar to the corresponding minimum-MSE reconstruction in Fig. 6d. However, this is not the case with Fig. 5b, and correspondingly the NGCV-based reconstruction in Fig. 7f exhibits slightly more artifacts at the center and around the object's periphery compared to the Predicted-SURE-based (Fig. 7e) and minimum-MSE (Fig. 7d) reconstructions. These results indicate that NGCV may not be as consistently robust for MRI reconstruction from partially sampled Cartesian data (which is a severely ill-posed problem where A^H A has many zero eigenvalues) as for image restoration (where only a few eigenvalues of A^H A are zero, especially for the blurs considered in Section V-B).

On the other hand, Predicted-SURE-based tuning consistently yields PSNRs close to the corresponding (minimum-MSE) oracle PSNRs, as seen from Tables VII-VIII and Fig. 5.

^14 This is perhaps because the problem is less ill-posed in Experiment MRI-B, as k-space is sampled in a nearly uniform (but random) fashion (see Fig. 6b), compared to the other setups, MRI-A and MRI-C, which use radial sampling (e.g., see Fig. 7b), where the corners of k-space are sparsely sampled.

Predicted-SURE also leads to reconstructions (see Figs. 6e, 7e) that are visually similar to the respective minimum-MSE reconstructions (see Figs. 6d, 7d). These results demonstrate the potential of Predicted-SURE for selecting λ in MRI reconstruction.

VI. DISCUSSION

A. Reconstruction Quality

Reconstruction quality in inverse problems of the form (1)-(2) is mainly governed by (a) the cost criterion J in (2), and (b) the choice of the associated regularization parameter(s). In this work, we have only addressed the latter aspect, i.e., (b), for specific (but popular) regularizers such as TV and those based on the ℓ1-norm. As we achieve near-MSE-optimal tuning of the regularization parameter for these regularizers, our TV-based image restoration results are comparable to those in [43], [45], [54]. It should be noted that this optimality (achieved by considering (b) alone) is only over the set of solutions prescribed by the minimization problem (2) for a given regularizer. It is possible to further improve quality by considering more sophisticated regularizers, e.g., higher-degree total variation [63], Hessian-based [64] and nonlocal regularization [65]. Extending the applicability of our current parameter selection techniques to these advanced regularizers requires more investigation and is a possible direction for future research.


Fig. 7. Experiment MRI-C with real MR GE-phantom data (corresponding to the fourth row of Table VIII): (a) Magnitude of the reference image reconstructed (using iFFT) and averaged over 10 fully-sampled acquisitions; (b) Retrospective sampling along radial lines (50 lines on a Cartesian grid with 82% undersampling, black lines indicate sample locations); (c) Magnitude of the zero-filled iFFT reconstruction from undersampled data (22.51 dB); and magnitude of TV-regularized reconstructions with regularization parameter λ selected to minimize (d) (oracle) MSE (33.03 dB); (e) Predicted-SURE (33.03 dB); (f) NGCV (32.30 dB). Regularized reconstructions (d)-(f) have reduced artifacts compared to the zero-filled iFFT reconstruction (c). The Predicted-SURE-based result (e) closely resembles the oracle MSE-based result (d), while the NGCV-based result (f) exhibits slightly more artifacts at the center and around the object's periphery.

B. IRLS-MIL and Split-Bregman Algorithms

Both the IRLS-MIL and split-Bregman (SB) algorithms can tackle general minimization problems of the form (2) with arbitrary convex regularizers. However, the inner steps of the SB algorithm (34)-(35) may not admit exact updates for a general A and/or a general regularizer Ψ such as (15) or those in [63], [64]. In such cases, iterative schemes may be needed for the updates in (34)-(35), and correspondingly, the evaluation of J(u(·), y) has to be performed on a case-by-case basis depending on the type of iterative scheme used for (34)-(35). In this respect, IRLS-MIL is slightly more general: as it is based on the standard gradient-descent IRLS scheme [47], [48], it may be more amenable to tackling sophisticated regularizers [63], [64] and/or a data model involving a more general^15 A.

C. Memory and Computation Requirements

Evaluating reconstruction quality through quantitative measures generally involves additional memory and computational requirements [31]. In our case, it is clear from (27)-(32) that storing and manipulating J(u(·), y)_n, J(v(·), y)_n, J(γ(·), y)_n and evaluating NGCV(λ), Predicted-SURE(λ) and Projected-SURE(λ) for one instance of the parameter vector λ demand memory and computational load similar to the IRLS-MIL iterations (19), (25) themselves.

^15 The matrix C needs to be chosen in accordance with (18), but since it depends only on A (and A^H), it can be predetermined for a given problem.

These requirements are also comparable to those of the iterative risk estimation techniques in [25], [28] and the Monte-Carlo divergence estimator in [36, Th. 2] (which needs two algorithm evaluations for one instance of λ). The complex-valued case (Ω = C) demands even more memory and computation (compared to the real-valued case Ω = R), as one has to tackle Jacobian matrices evaluated with respect to both y and y⋆. This additional requirement is purely a consequence of complex calculus. In general, the exact amount of storage and computation necessary for evaluating NGCV and WSURE depends on how the reconstruction algorithm is implemented.
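As a sketch of the kind of Monte-Carlo divergence estimator referred to above (the perturbation trick of [36, Th. 2], in the spirit of Hutchinson's trace estimator [59]), the trace of the Jacobian can be probed with a random ±1 vector and a finite difference; here recon is a stand-in for the full iterative reconstruction mapping u_λ(·), and eps is an assumed perturbation size:

```python
import numpy as np

def mc_trace_jacobian(recon, y, eps=1e-3, rng=np.random.default_rng(1)):
    # tr{J u(y)} ~ b^T (u(y + eps*b) - u(y)) / eps, with probe b in {+1,-1}^M.
    # One probe requires exactly two evaluations of the algorithm.
    b = rng.integers(0, 2, size=y.shape).astype(float) * 2.0 - 1.0
    return float(np.real(np.vdot(b, recon(y + eps * b) - recon(y))) / eps)
```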

Furthermore, in our experiments, we optimize NGCV(λ), Predicted-SURE(λ) and Projected-SURE(λ) using golden-section search, which necessitates multiple evaluations of these performance measures for several instances of λ. To save computation time, it is desirable to optimize λ simultaneously during reconstruction. Designing such a scheme is not straightforward when the reconstruction problem is posed as (2), since intermittently changing λ affects the cost function J and alters the original problem (2).
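A sketch of this tuning loop, under the unimodality assumption implicit in golden-section search; measure(λ) stands for NGCV(λ), Predicted-SURE(λ), or Projected-SURE(λ), each of which runs the iterative algorithm internally:

```python
from scipy import optimize

def select_lambda(measure, log10_bracket=(-6.0, 0.0)):
    # Golden-section search over log10(lambda); each function evaluation
    # runs the reconstruction algorithm and computes the chosen measure.
    res = optimize.minimize_scalar(lambda t: measure(10.0 ** t),
                                   bracket=log10_bracket, method='golden')
    return 10.0 ** res.x
```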

To avoid this difficulty, image reconstruction can be formulated as a penalty problem using variable splitting and penalty techniques [66]. Alternating minimization can then be employed to decouple the original penalty problem into simpler linear and nonlinear subproblems [66]. The advantage of this approach is that it provides the option of optimizing parameters based on the subproblems, which can be achieved relatively easily.


Liao et al. [54], [55] demonstrated the practicability of this approach for TV-based image restoration, but they optimized regularization parameters only based on the linear subproblems (using LGCV) and used continuation techniques to adjust other parameters associated with the nonlinear subproblems. Since the techniques developed in this paper can handle nonlinear algorithms, they may be adapted to optimize parameters (e.g., using NGCV) associated with the nonlinear subproblems in the penalty formulation. As part of future work, we plan to investigate the penalty approach for biomedical image reconstruction with simultaneous optimization of penalty parameters.

VII. SUMMARY & CONCLUSIONS

Proper selection of the regularization parameter (λ) is an important part of regularized methods for inverse problems. GCV and (weighted) MSE estimation based on the principle of Stein's Unbiased Risk Estimate (SURE) [18] (in the Gaussian setting) can be used for selecting λ, but they require the trace of a linear transformation of the Jacobian matrix, J(u_θ, y), associated with the nonlinear (possibly iterative) reconstruction algorithm represented by the mapping u_θ. We derived recursions for J(u_θ, y) for two types of nonlinear iterative algorithms: the iterative reweighted least squares method with matrix inversion lemma (IRLS-MIL) [47] and the variable-splitting-based split-Bregman (SB) algorithm [46], both of which are capable of handling (synthesis-type and) a variety of analysis-type regularizers.

We estimated the desired trace for nonlinear image restoration and MRI reconstruction (from partially sampled Cartesian k-space data) by applying a Monte-Carlo procedure similar to that in [25], [28]. We implemented IRLS-MIL and SB along with computation of NGCV(λ), Predicted-SURE(λ), and Projected-SURE(λ) for total variation and analysis ℓ1-regularization. Through simulations, we showed for image restoration that selecting λ by minimizing NGCV(λ), Predicted-SURE(λ), and Projected-SURE(λ) consistently yielded reconstructions that were close to the corresponding minimum-MSE reconstructions both in terms of visual quality and SNR improvement. For MRI (with partial Cartesian k-space sampling), we conducted experiments with both synthetic and real phantom data and found that NGCV-based reconstructions were slightly sub-optimal in terms of SNR improvement, while minimizing Predicted-SURE(λ) (equivalent to Projected-SURE(λ) in this case) consistently yielded near-MSE-optimal reconstructions both in terms of SNR improvement and visual quality. These results indicate the feasibility of applying GCV- and weighted-SURE-based selection of λ for iterative nonlinear reconstruction using analysis-type regularizers. The philosophy underlying the theoretical developments in this work can also be extended, in principle, to handle other regularizers, reconstruction algorithms, and inverse problems involving Gaussian noise.

REFERENCES

[1] W. C. Karl, "Regularization in image restoration and reconstruction," in Handbook of Image & Video Processing, A. Bovik, Ed., pp. 183–202. Elsevier, 2nd edition, 2005.
[2] P. C. Hansen, "Analysis of discrete ill-posed problems by means of the L-curve," SIAM Review, vol. 34, no. 4, pp. 561–580, 1992.
[3] P. C. Hansen and D. P. O'Leary, "The use of the L-curve in the regularization of discrete ill-posed problems," SIAM J. Sci. Comput., vol. 14, no. 6, pp. 1487–1503, 1993.
[4] T. Reginska, "A regularization parameter in discrete ill-posed problems," SIAM J. Sci. Comput., vol. 17, no. 3, pp. 740–749, 1996.
[5] C. R. Vogel, "Non-convergence of the L-curve regularization parameter selection method," Inverse Prob., vol. 12, no. 4, pp. 535–47, Aug. 1996.
[6] G. Wahba, Spline Models for Observational Data, SIAM, Philadelphia, 1990.
[7] P. Craven and G. Wahba, "Smoothing noisy data with spline functions," Numer. Math., vol. 31, pp. 377–403, 1979.
[8] G. H. Golub, M. Heath, and G. Wahba, "Generalized cross-validation as a method for choosing a good ridge parameter," Technometrics, vol. 21, no. 2, pp. 215–223, May 1979.
[9] S. J. Reeves and R. M. Mersereau, "Optimal estimation of the regularization parameter and stabilizing functional for regularized image restoration," Opt. Engg., vol. 29, no. 5, pp. 446–54, 1990.
[10] S. J. Reeves, "Optimal space-varying regularization in iterative image restoration," IEEE Trans. Im. Proc., vol. 3, no. 3, pp. 319–24, 1994.
[11] D. A. Girard, "The fast Monte-Carlo cross-validation and CL procedures: Comments, new results and application to image recovery problems," Computation. Stat., vol. 10, pp. 205–231, 1995.
[12] D. A. Girard, "A fast Monte-Carlo cross-validation procedure for large least squares problems with noisy data," Numerische Mathematik, vol. 56, no. 1, pp. 1–23, Jan. 1989.
[13] J. D. Carew, G. Wahba, X. Xie, E. V. Nordheim, and M. E. Meyerand, "Optimal spline smoothing of fMRI time series by generalized cross-validation," NeuroImage, vol. 18, pp. 950–61, 2003.
[14] S. Sourbron, R. Luypaert, P. V. Schuerbeek, M. Dujardin, and T. Stadnik, "Choice of the regularization parameter for perfusion quantification with MRI," Phys. Med. Biol., vol. 49, pp. 3307–24, 2004.
[15] F. O'Sullivan and G. Wahba, "A cross validated Bayesian retrieval algorithm for nonlinear remote sensing experiments," J. Comp. Phys., vol. 59, no. 3, pp. 441–55, July 1985.
[16] L. N. Deshpande and D. A. Girard, "Fast computation of cross-validated robust splines and other non-linear smoothing splines," in Curves and Surfaces, Le Mehaute, Laurent, and Schumaker, Eds., pp. 143–8. Academic, Boston, 1991.
[17] D. A. Girard, "The fast Monte-Carlo cross-validation and CL procedures: Comments, new results and application to image recovery problems - Rejoinder," Computation. Stat., vol. 10, pp. 251–258, 1995.
[18] C. Stein, "Estimation of the mean of a multivariate normal distribution," Ann. Stat., vol. 9, no. 6, pp. 1135–51, Nov. 1981.
[19] L. Desbat and D. A. Girard, "The "Minimum Reconstruction Error" choice of regularization parameters: some more efficient methods and their application to deconvolution problems," SIAM J. Sci. Comput., vol. 16, no. 6, pp. 1387–1403, Nov. 1995.
[20] J. A. Rice, "Choice of smoothing parameter in deconvolution problems," Contemporary Mathematics, vol. 59, pp. 137–51, 1986.
[21] A. M. Thompson, J. C. Brown, J. W. Kay, and D. M. Titterington, "A study of methods of choosing the smoothing parameter in image restoration by regularization," IEEE Trans. Patt. Anal. Mach. Intell., vol. 13, no. 4, pp. 326–339, 1991.
[22] N. P. Galatsanos and A. K. Katsaggelos, "Methods for choosing the regularization parameter and estimating the noise variance in image restoration and their relation," IEEE Trans. Im. Proc., vol. 1, no. 3, pp. 322–336, 1992.
[23] S. Ramani, C. Vonesch, and M. Unser, "Deconvolution of 3D fluorescence micrographs with automatic risk minimization," in Proc. IEEE Intl. Symp. Biomed. Imag., pp. 732–5, 2008.
[24] Y. C. Eldar, "Generalized SURE for exponential families: applications to regularization," IEEE Trans. Sig. Proc., vol. 57, no. 2, pp. 471–81, Feb. 2009.
[25] R. Giryes, M. Elad, and Y. C. Eldar, "The projected GSURE for automatic parameter tuning in iterative shrinkage methods," Applied and Computational Harmonic Analysis, vol. 30, no. 3, pp. 407–22, May 2011.
[26] J-C. Pesquet, A. Benazza-Benyahia, and C. Chaux, "A SURE approach for digital signal/image deconvolution problems," IEEE Trans. Sig. Proc., vol. 57, no. 12, pp. 4616–32, Dec. 2009.
[27] A. Marin, C. Chaux, J-C. Pesquet, and P. Ciuciu, "Image reconstruction from multiple sensors using Stein's principle. Application to parallel MRI," in Proc. IEEE Intl. Symp. Biomed. Imag., pp. 465–468, 2011.
[28] C. Vonesch, S. Ramani, and M. Unser, "Recursive risk estimation for non-linear image deconvolution with a wavelet-domain sparsity constraint," Proc. IEEE Intl. Conf. Img. Proc., pp. 665–8, 2008.
[29] M. Ting, R. Raich, and A. O. Hero, "Sparse image reconstruction using sparse priors," Proc. IEEE Intl. Conf. Img. Proc., pp. 1261–4, Oct. 2006.
[30] O. Batu and M. Cetin, "Parameter selection in sparsity-driven SAR imaging," IEEE Trans. Aero. Elec. Sys., vol. 47, no. 4, pp. 3040–50, 2011.
[31] X. Zhu and P. Milanfar, "Automatic parameter selection for denoising algorithms using a no-reference measure of image content," IEEE Trans. Im. Proc., vol. 19, no. 12, pp. 3116–32, Dec. 2010.
[32] X.-P. Zhang and M. D. Desai, "Adaptive denoising based on SURE risk," IEEE Sig. Proc. Lett., vol. 5, no. 10, pp. 265–267, 1998.
[33] F. Luisier, T. Blu, and M. Unser, "A new SURE approach to image denoising: Interscale orthonormal wavelet thresholding," IEEE Trans. Im. Proc., vol. 16, no. 3, pp. 593–606, 2007.
[34] T. Blu and F. Luisier, "The SURE-LET approach to image denoising," IEEE Trans. Im. Proc., vol. 16, no. 11, pp. 2778–86, 2007.
[35] D. Van De Ville and M. Kocher, "Nonlocal means with dimensionality reduction and SURE-based parameter selection," IEEE Trans. Im. Proc., vol. 20, no. 9, pp. 2683–90, 2011.
[36] S. Ramani, T. Blu, and M. Unser, "Monte-Carlo SURE: A black-box optimization of regularization parameters for general denoising algorithms," IEEE Trans. Im. Proc., vol. 17, no. 9, pp. 1540–54, Sept. 2008.
[37] M. A. T. Figueiredo and R. D. Nowak, "An EM algorithm for wavelet-based image restoration," IEEE Trans. Im. Proc., vol. 12, no. 8, pp. 906–16, Aug. 2003.
[38] M. Elad, P. Milanfar, and R. Rubinstein, "Analysis versus synthesis in signal priors," Inv. Prob., vol. 23, pp. 947–968, 2007.
[39] P. J. Huber, Robust Statistics, Wiley, New York, 1981.
[40] J. A. Fessler and S. D. Booth, "Conjugate-gradient preconditioning methods for shift-variant PET image reconstruction," IEEE Trans. Im. Proc., vol. 8, no. 5, pp. 688–99, May 1999.
[41] G. Archer and D. M. Titterington, "On some Bayesian/regularization methods for image restoration," IEEE Trans. Im. Proc., vol. 4, no. 7, pp. 989–95, 1995.
[42] R. Molina, A. K. Katsaggelos, and J. Mateos, "Bayesian and regularization methods for hyperparameter estimation in image restoration," IEEE Trans. Im. Proc., vol. 8, no. 2, pp. 231–46, 1999.
[43] J. P. Oliveira, J. M. Bioucas-Dias, and M. A. T. Figueiredo, "Adaptive total variation image deblurring: A majorization-minimization approach," Signal Processing, vol. 89, pp. 1683–93, 2009.
[44] S. D. Babacan, R. Molina, and A. K. Katsaggelos, "Variational Bayesian blind deconvolution using a total variation prior," IEEE Trans. Im. Proc., vol. 18, no. 1, pp. 12–26, 2009.
[45] S. D. Babacan, R. Molina, and A. K. Katsaggelos, "Parameter estimation in TV image restoration using variational distribution approximation," IEEE Trans. Im. Proc., vol. 17, no. 3, pp. 326–39, 2008.
[46] T. Goldstein and S. Osher, "The split Bregman method for L1-regularized problems," SIAM J. Imaging Sci., vol. 2, no. 2, pp. 323–43, 2009.
[47] S. Ramani and J. A. Fessler, "An accelerated iterative reweighted least squares algorithm for compressed sensing MRI," in Proc. IEEE Intl. Symp. Biomed. Imag., pp. 257–60, 2010.
[48] S. Ramani, J. Rosen, Z. Liu, and J. A. Fessler, "Iterative weighted risk estimation for nonlinear image restoration with analysis priors," Proc. SPIE Elec. Img., vol. 8296, pp. 82960N1–12, 2012.
[49] S. Ramani and J. A. Fessler, "Parallel MR image reconstruction using augmented Lagrangian methods," IEEE Trans. Med. Imag., vol. 30, no. 3, pp. 694–706, 2011.
[50] E. H. Lieb and M. Loss, Analysis, American Mathematical Society, Providence, RI, USA, 2nd, revised edition, 2001.
[51] D. H. Brandwood, "A complex gradient operator and its application in adaptive array theory," IEE Proceedings H: Microwaves, Optics and Antennas, vol. 130, no. 1, pp. 11–6, Feb. 1983.
[52] M. Brookes, "The Matrix Reference Manual," [online], http://www.ee.imperial.ac.uk/hp/staff/dmb/matrix/calculus.html, 2011.
[53] A. Hjorungnes and D. Gesbert, "Complex-valued matrix differentiation: techniques and key results," IEEE Trans. Sig. Proc., vol. 55, no. 6, pp. 2740–6, 2007.
[54] H. Liao, F. Li, and M. K. Ng, "Selection of regularization parameter in total variation image restoration," J. Opt. Soc. Am. A, vol. 26, no. 11, pp. 2311–20, 2009.
[55] H. Liao and M. K. Ng, "Blind deconvolution using generalized cross-validation approach to regularization parameter estimation," IEEE Trans. Im. Proc., vol. 20, no. 3, pp. 670–80, 2011.
[56] M. V. Afonso, J. M. Bioucas-Dias, and M. A. T. Figueiredo, "Fast image recovery using variable splitting and constrained optimization," IEEE Trans. Im. Proc., vol. 19, no. 9, pp. 2345–56, Sept. 2010.
[57] W. Hackbusch, Iterative Solution of Large Sparse Systems of Equations, Springer-Verlag, New York, 1994.
[58] C. Chaux, P. L. Combettes, J-C. Pesquet, and V. R. Wajs, "A variational formulation for frame-based inverse problems," Inverse Prob., vol. 23, no. 4, pp. 1495–518, Aug. 2007.
[59] M. F. Hutchinson, "A stochastic estimator for the trace of the influence matrix for Laplacian smoothing splines," Comm. in Statistics - Simulation and Computation, vol. 19, no. 2, pp. 433–50, 1990.
[60] Z. Bai, M. Fahey, and G. Golub, "Some large-scale matrix computation problems," J. Comput. Appl. Math., vol. 74, pp. 71–89, 1996.
[61] S. Dong and K. Liu, "Stochastic estimation with Z2 noise," Phys. Lett. B, vol. 328, pp. 130–6, 1994.
[62] "Brainweb: Simulated MRI Volumes for Normal Brain," McConnell Brain Imaging Centre, http://www.bic.mni.mcgill.ca/brainweb/.
[63] Y. Hu and M. Jacob, "Higher degree total variation (HDTV) regularization for image recovery," IEEE Trans. Im. Proc., DOI: 10.1109/TIP.2012.2183143, 2012, in press.
[64] S. Lefkimmiatis, A. Bourquard, and M. Unser, "Hessian-based norm regularization for image restoration with biomedical applications," IEEE Trans. Im. Proc., vol. 21, no. 3, pp. 983–95, 2012.
[65] A. Elmoataz, O. Lezoray, and S. Bougleux, "Nonlocal discrete regularization on weighted graphs: A framework for image and manifold processing," IEEE Trans. Im. Proc., vol. 17, no. 7, pp. 1047–60, 2008.
[66] Y. Wang, J. Yang, W. Yin, and Y. Zhang, "A new alternating minimization algorithm for total variation image reconstruction," SIAM J. Img. Sci., vol. 1, no. 3, pp. 248–72, 2008.

Sathish Ramani (S'08, M'09) received the M.Sc. degree in Physics from Sri Sathya Sai Institute of Higher Learning (SSSIHL), Puttaparthy, Andhra Pradesh, India, in 2000, the M.Sc. degree in Electrical Engineering from the Indian Institute of Science (IISc), Bangalore, India, in 2004, and the Ph.D. in Electrical Engineering from Ecole Polytechnique Federale de Lausanne (EPFL), Switzerland, in 2009. He received a Best Student Paper Award at the IEEE International Conference on Acoustics, Speech and Signal Processing in 2006. He was a Swiss National Science Foundation Post-Doctoral Fellow in the Systems Group, Electrical Engineering and Computer Science Department, University of Michigan, USA, from 2009 to 2010 and continues to work there as a post-doctoral research scholar. His research interests are in splines, interpolation, variational methods in image processing, and optimization algorithms for biomedical imaging applications.

Zhihao Liu was born in Hunan, China, on July 7, 1990. He is currently a senior-year undergraduate student in Electrical Engineering, EECS Department of the University of Michigan, Ann Arbor. Prior to that, he spent two years of undergraduate studies in the Science and Integrated Class, Hongshen Honors School, Chongqing University, China, from 2008 to 2010. He joined the University of Michigan in September 2010 and plans to continue with graduate studies after completing his B.S. degree. His research interests are generally in signal and image processing, especially with respect to medical applications.


Jeffrey Rosen is an undergraduate student at the University of Michigan (UM), Ann Arbor, pursuing a B.S.E. degree in electrical engineering (expected graduation April 2012). He will join the Computer Science Engineering graduate program, also at UM, Ann Arbor, in Fall 2012. His research interests include computer architecture, particularly multicore and multiprocessor programmability.

Jon-Fredrik Nielsen received a PhD in physics from Ohio State University in 2001, and was a post-doctoral fellow at the University of Southern California before joining the functional MRI laboratory at the University of Michigan as research faculty. His research interests include rapid steady-state MRI sequences, functional MRI, and MR image reconstruction.

Jeffrey A. Fessler (F'06) received the BSEE degree from Purdue University in 1985, the MSEE degree from Stanford University in 1986, and the M.S. degree in Statistics from Stanford University in 1989. From 1985 to 1988 he was a National Science Foundation Graduate Fellow at Stanford, where he earned a Ph.D. in electrical engineering in 1990. He has worked at the University of Michigan since then. From 1991 to 1992 he was a Department of Energy Alexander Hollaender Post-Doctoral Fellow in the Division of Nuclear Medicine. From 1993 to 1995 he was an Assistant Professor in Nuclear Medicine and the Bioengineering Program. He is now a Professor in the Departments of Electrical Engineering and Computer Science, Radiology, and Biomedical Engineering. He became a Fellow of the IEEE in 2006, for contributions to the theory and practice of image reconstruction. He received the Francois Erbsmann award for his IPMI93 presentation. He serves as an associate editor for IEEE Transactions on Medical Imaging and is a past associate editor for the IEEE Transactions on Image Processing and the IEEE Signal Processing Letters. He was co-chair of the 1997 SPIE conference on Image Reconstruction and Restoration, technical program co-chair of the 2002 IEEE International Symposium on Biomedical Imaging (ISBI), and general chair of ISBI 2007. His research interests are in statistical aspects of imaging problems, and he has supervised doctoral research in PET, SPECT, X-ray CT, MRI, and optical imaging problems.


Regularization Parameter Selection for Nonlinear Iterative Image Restoration and MRI Reconstruction Using GCV and SURE-Based Methods—Supplementary Material

Sathish Ramani*, Member, IEEE, Zhihao Liu, Jeffrey Rosen, Jon-Fredrik Nielsen, and Jeffrey A. Fessler, Fellow, IEEE

University of Michigan, Ann Arbor, MI, USA.

We present a proof for Lemma 1 and show additional illustrations that we could not accommodate in the paper due to space constraints. References to equations, figures, etc., are within this material only unless specified otherwise.

I. LEMMA 1 IN THE PAPER

Lemma 1: Let f_λ : Ω^M → Ω^N be differentiable (for Ω = R) or individually analytic (for Ω = C, with respect to the real and imaginary parts of its argument), respectively, in the weak sense. Then, for any deterministic T ∈ Ω^{M×N} satisfying E_ξ{|[T f_λ(y)]_m|} < ∞, m = 1 ... M, we have for Ω = R or C that

E_ξ{ξ^H T f_λ(y)} = σ² E_ξ{tr{T J_Ω{f_λ}(y)}}.    (1)

Proof: The proof for Ω = R is very similar to those in [1, Th. 1], [2, Lemma 1], so we focus on the case Ω = C. The density function g_C(ξ) = (πσ²)^{−M} exp(−ξ^H ξ / σ²) is analytic with respect to ξ and ξ⋆ individually and satisfies the identity

ξ^H g_C(ξ) = −(σ²/2) [∇_{ξ_R} − ι ∇_{ξ_I}] g_C(ξ),    (2)

where ∇_{ξ_R} and ∇_{ξ_I} denote 1 × M gradient operators consisting of partial derivatives with respect to {ξ_{Rm}}_{m=1}^{M} and {ξ_{Im}}_{m=1}^{M}, respectively.


[Fig. 1: two panels plotting Predicted-MSE and Predicted-SURE (left axis, in dB) and Projected-MSE and Projected-SURE (right axis, in dB) versus λ (log10 scale).]

Fig. 1. Plot of Predicted-MSE(λ), Predicted-SURE(λ), Projected-MSE(λ) and Projected-SURE(λ) as functions of the regularization parameter λ. Left: Experiment IR-A corresponding to the third row of Table II in the paper; Right: Experiment IR-B corresponding to the third row of Table III in the paper. The plots indicate that the SURE curves closely capture the trends of the respective MSE curves, and their minima (indicated by '*') are close to that of the (oracle) MSE indicated by the solid vertical line.

Expanding the LHS of (1) and using (2), we get

E_ξ{ξ^H T f_λ(y)} = ∫ g_C(ξ) ξ^H T f_λ(y) dξ_R dξ_I = −(σ²/2) ∫ [(∇_{ξ_R} − ι ∇_{ξ_I}) g_C(ξ)] T f_λ(y) dξ_R dξ_I.    (3)

Integrating by parts the term involving ∇_{ξ_R} in (3) and using the fact that lim_{|ξ_{Rm}|→∞} g_Ω(ξ) [T f_λ(y)]_m = 0 when E_ξ{|[T f_λ(y)]_m|} < +∞ [1], [2], we get

∫ [∇_{ξ_R} g_C(ξ)] T f_λ(y) dξ_R dξ_I = Σ_{m=1}^{M} Σ_{n=1}^{N} ∫ (∂g_C(ξ)/∂ξ_{Rm}) T_{mn} f_{λ,n}(y) dξ_R dξ_I
    = −Σ_{m=1}^{M} Σ_{n=1}^{N} ∫ g_C(ξ) T_{mn} (∂f_{λ,n}(y)/∂ξ_{Rm}) dξ_R dξ_I
    = −Σ_{m=1}^{M} Σ_{n=1}^{N} ∫ g_C(ξ) T_{mn} (∂f_{λ,n}(y)/∂y_{Rm}) dξ_R dξ_I,    (4)

where we have set ∂/∂ξ_{Rm} = ∂/∂y_{Rm} since Ax (Eq. (1) in the paper) is a deterministic constant.

Going through a similar derivation for the integral involving −ι ∇_{ξ_I} and using ∂/∂ξ_{Im} = ∂/∂y_{Im}, we get

−ι ∫ [∇_{ξ_I} g_C(ξ)] T f_λ(y) dξ_R dξ_I = ι Σ_{m=1}^{M} Σ_{n=1}^{N} ∫ g_C(ξ) T_{mn} (∂f_{λ,n}(y)/∂y_{Im}) dξ_R dξ_I.    (5)

Combining (3)-(5) with the definition of J_Ω{f_λ}(y) (Eq. (4) in the paper) yields the desired result (1).
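As a quick numerical sanity check of (1) (ours, not part of the paper), consider the linear analytic mapping f_λ(y) = W y, for which the complex Jacobian is J_C{f_λ} = W; both sides of (1) should then equal σ² tr{T W}:

```python
import numpy as np

rng = np.random.default_rng(2)
M = N = 16
K = 50000                                    # Monte-Carlo trials
W = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))
T = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
Ax = rng.standard_normal(M) + 1j * rng.standard_normal(M)   # deterministic mean
sigma2 = 0.5

# xi ~ CN(0, sigma^2 I): each complex entry has total variance sigma^2.
xi = np.sqrt(sigma2 / 2) * (rng.standard_normal((K, M))
                            + 1j * rng.standard_normal((K, M)))
y = Ax + xi
Tf = (y @ W.T) @ T.T                              # rows hold T f(y) per trial
lhs = np.mean(np.sum(np.conj(xi) * Tf, axis=1))   # E{ xi^H T f(y) }
rhs = sigma2 * np.trace(T @ W)                    # sigma^2 tr{ T J_C f }
print(abs(lhs - rhs) / abs(rhs))                  # small, up to Monte-Carlo error
```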


Fig. 2. Experiment IR-B corresponding to the third row of Table III in the paper: zoomed images of (a) noisefree House; (b) blurred and noisy data; and deblurred images obtained (using analysis ℓ1-regularization with 2 levels of undecimated Haar wavelets) with regularization parameter λ selected to minimize (c) (oracle) MSE (11.68 dB); (d) Projected-SURE (11.68 dB); (e) Predicted-SURE (11.63 dB); (f) NGCV (11.63 dB); (g) LGCV (2.87 dB). The Projected-SURE-, Predicted-SURE- and NGCV-based results (d)-(f) visually resemble the oracle MSE-based result (c) very closely, while the LGCV-based result is considerably over-smoothed.

II. ADDITIONAL ILLUSTRATIONS FOR IMAGE RESTORATION

Fig. 1 plots Projected-MSE(λ), Projected-SURE(λ), Predicted-MSE(λ) and Predicted-SURE(λ) versus λ for specific instances of Experiments IR-A and IR-B (corresponding to the Cameraman and House test images, respectively; see Table I in the paper). The Projected-SURE and Predicted-SURE curves accurately capture the trends of the Projected-MSE and Predicted-MSE curves and also exhibit minima (indicated by *) close to that of the MSE (solid vertical line). The deblurred images (corresponding to an instance of Experiment IR-B) obtained by minimizing Projected-SURE(λ) (Fig. 2d), Predicted-SURE(λ) (Fig. 2e) and NGCV(λ) (Fig. 2f) closely resemble the corresponding minimum-MSE result (Fig. 2c) in terms of visual appearance, while that obtained by minimizing LGCV(λ) is over-smoothed (see Fig. 2g). These results are also consistent with what is reported in the paper (e.g., see Figs. 3 and 4 in the paper).


[Fig. 3: panel (a) plots Predicted-MSE and Predicted-SURE (in dB) versus λ (log10 scale); panel (b) plots PSNR (in dB) versus λ (log10 scale) with MSE-optimal, Predicted-SURE-based and NGCV-based markers.]

Fig. 3. Experiment MRI-A (with the Shepp-Logan phantom) corresponding to the second row of Table VI in the paper: (a) Plot of Predicted-MSE and Predicted-SURE as functions of the regularization parameter λ indicates that Predicted-SURE closely captures the trend of Predicted-MSE, and their minima (indicated by '*') are close to that of the (oracle) MSE indicated by the solid vertical line; (b) Plot of PSNR(λ) versus λ indicates that the λ's that minimize Predicted-SURE and the (oracle) MSE are very close to each other and lead to almost identical PSNRs and reconstructions (see Fig. 5). The NGCV-based selection is away from the MSE-based selection but still yields a reconstruction (Fig. 5f) that is agreeably close to the oracle (Fig. 5d) in terms of visual quality in this instance of Experiment MRI-A.

[Fig. 4: panels (a) and (b) plot Predicted-MSE and Predicted-SURE (in dB) versus λ (log10 scale).]

Fig. 4. Plot of Predicted-MSE and Predicted-SURE as functions of the regularization parameter λ: (a) Experiment MRI-B with synthetic data corresponding to the second row of Table VII in the paper; (b) Experiment MRI-C with real MR data corresponding to the fourth row of Table VIII in the paper. The plots indicate that Predicted-SURE closely captures the trend of Predicted-MSE, and their minima (indicated by '*') are close to that of the (oracle) MSE indicated by the solid vertical line.


III. ADDITIONAL ILLUSTRATIONS FOR MRI RECONSTRUCTION

We evaluated Predicted-MSE(λ) and Predicted-SURE(λ) as functions of λ and plotted them in Figs. 3a and 4 for specific instances of Experiments MRI-A (with synthetic data corresponding to the noisefree Shepp-Logan phantom), MRI-B (with synthetic data corresponding to a noisefree T2-weighted MR image) and MRI-C (with real GE phantom data) in the paper.


Fig. 5. Experiment MRI-A with the Shepp-Logan phantom (corresponding to the second row of Table VI): (a) Noisefree Shepp-Logan phantom (256×256); (b) Retrospective sampling along radial lines (30 lines on a Cartesian grid with 89% undersampling, black lines indicate sample locations); (c) Magnitude of the zero-filled iFFT reconstruction from undersampled data; and magnitude of TV-regularized reconstructions with regularization parameter λ selected to minimize (d) (oracle) MSE (22.28 dB); (e) Predicted-SURE (22.21 dB); (f) NGCV (21.68 dB). Regularized reconstructions (d)-(f) have reduced artifacts compared to the zero-filled iFFT reconstruction (c). The Predicted-SURE-based and NGCV-based results (e) and (f), respectively, closely resemble the oracle MSE-based result (d) in this experiment.

Predicted-SURE not only captures the trend of Predicted-MSE in Figs. 3a and 4, but also exhibits minima (indicated by *) close to that of the MSE (solid vertical line).

We also plot PSNR(λ) versus λ in Fig. 3b and present reconstructions in Fig. 5 for the instance of Experiment MRI-A considered in Fig. 3a: although the NGCV selection is slightly away from the (oracle) MSE selection in Fig. 3b, the corresponding reconstruction (Fig. 5f) is visually similar to the MSE-based one (Fig. 5d) in this case. The Predicted-SURE selection is close to the (oracle) MSE selection in Fig. 3b and therefore naturally leads to a reconstruction (Fig. 5e) that resembles the MSE-based one (Fig. 5d).

These illustrations point to (the sub-optimality of NGCV and) the accuracy of Predicted-SURE(λ) for MRI reconstruction from partially sampled Cartesian k-space data, and are consistent with the results portrayed in the paper.


REFERENCES

[1] Y. C. Eldar, "Generalized SURE for exponential families: applications to regularization," IEEE Trans. Sig. Proc., vol. 57, no. 2, pp. 471–81, Feb. 2009.
[2] T. Blu and F. Luisier, "The SURE-LET approach to image denoising," IEEE Trans. Im. Proc., vol. 16, no. 11, pp. 2778–86, 2007.