Cahier du GERAD G-2019-72

CONSTRAINT-PRECONDITIONED KRYLOV SOLVERS FOR REGULARIZED SADDLE-POINT SYSTEMS

DANIELA DI SERAFINO* AND DOMINIQUE ORBAN†

Abstract. We consider the iterative solution of regularized saddle-point systems. When the leading block is symmetric and positive semi-definite on an appropriate subspace, Dollar, Gould, Schilders, and Wathen (2006) describe how to apply the conjugate gradient (CG) method coupled with a constraint preconditioner, a choice that has proved to be effective in optimization applications. We investigate the design of constraint-preconditioned variants of other Krylov methods for regularized systems by focusing on the underlying basis-generation process. We build upon principles laid out by Gould, Orban, and Rees (2014) to provide general guidelines that allow us to specialize any Krylov method to regularized saddle-point systems. In particular, we obtain constraint-preconditioned variants of Lanczos and Arnoldi-based methods, including the Lanczos version of CG, MINRES, SYMMLQ, GMRES(m) and DQGMRES. We also provide MATLAB implementations in hopes that they are useful as a basis for the development of more sophisticated software. Finally, we illustrate the numerical behavior of constraint-preconditioned Krylov solvers using symmetric and nonsymmetric systems arising from constrained optimization.

Key words. Regularized saddle-point systems, constraint preconditioners, Lanczos and Arnoldi procedures, Krylov solvers.

AMS subject classifications. 65F08, 65F10, 65F50, 90C20.

1. Introduction. We consider the iterative solution of the regularized saddle-point system

(1)  \begin{bmatrix} A & B^T \\ B & -C \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} b \\ 0 \end{bmatrix},

where A ∈ ℝ^{n×n} may be nonsymmetric, C ∈ ℝ^{m×m} is nonzero and symmetric, and B ∈ ℝ^{m×n}. We denote by K the matrix of (1). There is no loss of generality in assuming that the last m entries of the right-hand side of (1) are zero, as discussed later.

A constraint preconditioner for (1) has the form

(2)  P = \begin{bmatrix} G & B^T \\ B & -C \end{bmatrix},

where G is an approximation to A such that (2) is nonsingular. When A is symmetric and has appropriate additional properties, a constraint preconditioner allows the application of CG even though K and P are indefinite (Dollar et al., 2006).
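As a concrete illustration, the following NumPy sketch assembles K and P for a small system (the random A and B, the identity C, and the diagonal G are illustrative choices only, not taken from the paper) and checks a property exploited throughout: applying P^{-1} to a residual of the form [r ; 0] returns a direction [s_x ; s_y] satisfying Bs_x − Cs_y = 0.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 2
A = rng.standard_normal((n, n))
A = A + A.T + n * np.eye(n)        # illustrative symmetric leading block
B = rng.standard_normal((m, n))
C = np.eye(m)                      # nonzero symmetric regularization block
G = np.diag(np.diag(A))            # cheap illustrative approximation to A

K = np.block([[A, B.T], [B, -C]])  # matrix of (1)
P = np.block([[G, B.T], [B, -C]])  # constraint preconditioner (2)
assert abs(np.linalg.det(K)) > 1e-8 and abs(np.linalg.det(P)) > 1e-8

# Applying P^{-1} to a residual [r; 0] yields [sx; sy] with B sx - C sy = 0,
# which is exactly the second block row of the system P [sx; sy] = [r; 0].
r = rng.standard_normal(n)
s = np.linalg.solve(P, np.concatenate([r, np.zeros(m)]))
sx, sy = s[:n], s[n:]
assert np.abs(B @ sx - C @ sy).max() < 1e-10
```

This second-block-row identity is what keeps preconditioned directions on the constraint manifold, a fact used repeatedly in the derivations below.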

We are interested in the design of constraint-preconditioned versions of additional Krylov methods for (1), including methods that can be used when A is nonsymmetric. We extend the work of Gould et al. (2014) on projected and constraint-preconditioned Krylov methods for saddle-point systems with C = 0 by exploiting a suitable reformulation of (1) suggested by Dollar et al. (2006). We develop constraint-preconditioned

*Dipartimento di Matematica e Fisica, Università degli Studi della Campania “L. Vanvitelli”, Caserta, Italy. E-mail: [email protected]. Research partially supported by GERAD during a visit of this author in 2016, and by Gruppo Nazionale per il Calcolo Scientifico – Istituto Nazionale di Alta Matematica (GNCS–INdAM), Italy.

†GERAD and Department of Mathematics and Industrial Engineering, École Polytechnique, Montréal, QC, Canada. E-mail: [email protected]. Research partially supported by an NSERC Discovery Grant.


variants of the Lanczos and Arnoldi basis-generation processes, and use them to derive variants of Krylov solvers based on those processes. More generally, we provide guidelines that can also be exploited to obtain constraint-preconditioned versions of other Krylov methods not considered in this paper. Finally, we distribute MATLAB implementations of the constraint-preconditioned methods discussed here as templates for the development of more sophisticated numerical software.

Systems of type (1) arise in interior-point methods for constrained optimization in the presence of inequality constraints or when regularization is used (Benzi, Golub, and Liesen, 2005; D'Apuzzo, De Simone, and di Serafino, 2010; Friedlander and Orban, 2012). They also appear in Lagrangian approaches for variational problems with equality constraints when the constraints are relaxed or a penalty term is applied (Pestana and Wathen, 2015). In the above cases, A is usually symmetric, but may also be nonsymmetric (see Section 7), and often has additional properties, e.g., it accounts for local convexity of the optimization problem. Regularized saddle-point systems with nonsymmetric A arise also from the stabilized finite-element discretization of Oseen problems obtained by linearization, through Picard's method, of the steady-state Navier-Stokes equations governing the flow of a Newtonian incompressible viscous fluid (Benzi et al., 2005).

Constraint preconditioners have widely demonstrated their effectiveness on saddle-point systems, especially when the leading block is symmetric and enjoys additional properties, such as being positive definite; much work has been carried out to develop, analyze and approximate constraint preconditioners in this case, see, e.g., (Benzi et al., 2005; D'Apuzzo et al., 2010; Gould et al., 2014; De Simone, di Serafino, and Morini, 2018) and the references therein.

The rest of this paper is organized as follows. Section 2 provides preliminary results used in the sequel. In Section 3, we describe the constraint-preconditioned Lanczos process and, in Section 4, we present variants of Krylov solvers based on it. In Section 5, we describe the constraint-preconditioned Arnoldi process and associated Krylov methods. In Section 6, we discuss implementation issues and provide details on the MATLAB codes. In Section 7, we illustrate the numerical behavior of some constraint-preconditioned solvers on regularized saddle-point systems, with symmetric and nonsymmetric matrices, from constrained optimization. We conclude in Section 8.

Notation. Uppercase Latin letters (A, B, ...), lowercase Latin letters (a, b, ...), and lowercase Greek letters (α, β, ...) denote matrices, vectors and scalars, respectively. The Euclidean norm is denoted ‖·‖. If S = S^T is a positive definite matrix, the S-norm is defined as ‖u‖²_S = u^T S u. All vectors are column vectors. For brevity, we use the MATLAB-like notation [v ; w] to represent the vector [v^T w^T]^T.

2. Preliminaries. We assume throughout that K is nonsingular, which implies

(3)  Null(A) ∩ Null(B) = {0}  and  Null(B^T) ∩ Null(C) = {0}.

In general the converse is not true. A counterexample consists in taking

A = \begin{bmatrix} 1 & -1 & 0 \\ 0 & 0 & 0 \\ 1 & 0 & 1 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad
C = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.

Benzi et al. (2005) and D'Apuzzo et al. (2010) give additional conditions that guarantee nonsingularity of K. Note however that we do not require B to have full rank or C to be positive (semi-)definite.
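The counterexample above is easy to verify numerically. The rank tests below express the trivial-intersection conditions (3) through the column ranks of stacked matrices: Null(A) ∩ Null(B) = {0} if and only if [A ; B] has full column rank, and similarly for the second condition.

```python
import numpy as np

A = np.array([[1., -1., 0.],
              [0.,  0., 0.],
              [1.,  0., 1.]])
B = np.array([[1., 0., 0.],
              [0., 0., 1.]])
C = np.eye(2)

# Conditions (3): the nullspace intersections are trivial iff the stacked
# matrices have full column rank.
assert np.linalg.matrix_rank(np.vstack([A, B])) == 3    # Null(A) ∩ Null(B) = {0}
assert np.linalg.matrix_rank(np.vstack([B.T, C])) == 2  # Null(B^T) ∩ Null(C) = {0}

# ... and yet K is singular: its second row is identically zero.
K = np.block([[A, B.T], [B, -C]])
assert np.linalg.matrix_rank(K) < K.shape[0]
```

The singularity is visible by inspection here: the second row of A and the second row of B^T are both zero, so K has a zero row.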


In order to develop constraint-preconditioned Krylov methods for (1), we specialize the basis-generation processes underlying those methods. We focus on the Lanczos (1950) and Arnoldi (1951) processes, which compute orthonormal bases of Krylov spaces associated with symmetric and general matrices, respectively. For reference, the standard Lanczos process is stated as Algorithm 4 in Appendix A. It is straightforward to apply our arguments to the Lanczos (1950) biorthogonalization process and its transpose-free variants (Brezinski and Redivo-Zaglia, 1998; Chan, de Pillis, and van der Vorst, 1998). We implicitly assume that A = A^T when considering the Lanczos process.

Following Dollar et al. (2006), we reformulate (1) as follows. Assume that rank(C) = p and C has been decomposed as¹

(4)  C = EFE^T,

where F ∈ ℝ^{p×p} is symmetric and nonsingular and E ∈ ℝ^{m×p}. Then, by using the auxiliary variable

(5)  w = -FE^T y,

equation (1) may be written

(6)  \begin{bmatrix} A & & B^T \\ & F^{-1} & E^T \\ B & E & \end{bmatrix} \begin{bmatrix} x \\ w \\ y \end{bmatrix} = \begin{bmatrix} b \\ 0 \\ 0 \end{bmatrix},

which has a standard symmetric saddle-point form

(7)  \begin{bmatrix} M & N^T \\ N & \end{bmatrix} \begin{bmatrix} g \\ y \end{bmatrix} = \begin{bmatrix} b_0 \\ 0 \end{bmatrix}, \quad
M = \begin{bmatrix} A & \\ & F^{-1} \end{bmatrix}, \quad
N = \begin{bmatrix} B & E \end{bmatrix}, \quad
g = \begin{bmatrix} x \\ w \end{bmatrix}, \quad
b_0 = \begin{bmatrix} b \\ 0 \end{bmatrix}.

The principles laid out by Gould et al. (2014) may now be applied to (6). Note that (6) is nonsingular if and only if (1) is nonsingular, and therefore N must have full rank. Because g ∈ Null(N), there exists \hat{d} ∈ ℝ^{n+p-m} such that

(8)  g = \begin{bmatrix} x \\ w \end{bmatrix} = Z\hat{d} = \begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix}\hat{d},

where the columns of Z form a basis of Null(N). The restriction of (6) to Null(N) is

(9)  \widehat{M}\hat{x} = \hat{b},

where

(10a)  \widehat{M} = Z^T M Z = Z_1^T A Z_1 + Z_2^T F^{-1} Z_2,

(10b)  \begin{bmatrix} x \\ w \end{bmatrix} = \begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix}\hat{x}, \quad \hat{b} = \begin{bmatrix} Z_1^T & Z_2^T \end{bmatrix}\begin{bmatrix} b \\ 0 \end{bmatrix} = Z_1^T b.

In a Krylov method for (9), it is appropriate to use a preconditioner of the form

(11)  \hat{P} = Z_1^T G Z_1 + Z_2^T F^{-1} Z_2.

If G is suitable, the preconditioned method can be reformulated entirely in terms of full-space quantities (Gould, Hribar, and Nocedal, 2001; Dollar et al., 2006; Gould et al., 2014). Following (Gould et al., 2014, Assumption 2.2), we require the following assumption.

¹Note that (4) will only be used for the purpose of deriving computational processes and need not be computed in practice.


Assumption 2.1. The matrix \begin{bmatrix} G & \\ & F^{-1} \end{bmatrix} is symmetric and positive definite on Null(N).

A consequence of Assumption 2.1 is that (11) is symmetric and positive definite. We enforce Assumption 2.1 throughout this paper to guarantee that Krylov methods for (9) give rise to corresponding full-space methods for (1). However, at least in principle, Assumption 2.1 is not always necessary, e.g., in Krylov methods based on the Arnoldi process.

The application of the preconditioner \hat{P}, i.e., \check{u} = \hat{P}^{-1}\hat{u}, can be written as

(12)  \begin{bmatrix} u_x \\ u_w \end{bmatrix} = P_G \begin{bmatrix} \bar{u}_x \\ \bar{u}_w \end{bmatrix}, \quad
P_G = Z\hat{P}^{-1}Z^T, \quad
\begin{bmatrix} u_x \\ u_w \end{bmatrix} = Z\check{u}, \quad
Z^T \begin{bmatrix} \bar{u}_x \\ \bar{u}_w \end{bmatrix} = \hat{u}.

Furthermore,

P_G \begin{bmatrix} G & \\ & F^{-1} \end{bmatrix}

is an oblique projector into Null(N). Let \hat{L} be the lower triangular Cholesky factor of \hat{P} and let

(13)  \hat{\mathcal{K}} = \mathcal{K}\left( \hat{L}^{-1}\widehat{M}\hat{L}^{-T}, \; \hat{L}^{-1}(\hat{b} - \widehat{M}\hat{x}_0) \right)

be the Krylov space generated by the preconditioned reduced operator \hat{L}^{-1}\widehat{M}\hat{L}^{-T} and initial vector \hat{L}^{-1}(\hat{b} - \widehat{M}\hat{x}_0), where \hat{b} is given in (10) and x_0 = Z_1\hat{x}_0, with Z_1 defined in (8). The computation of (12) can be obtained by solving

(14)  \begin{bmatrix} G & & B^T \\ & F^{-1} & E^T \\ B & E & \end{bmatrix} \begin{bmatrix} u_x \\ u_w \\ z \end{bmatrix} = \begin{bmatrix} \bar{u}_x \\ \bar{u}_w \\ 0 \end{bmatrix},

see, e.g., Gould et al. (2001), so that P_G can be expressed as

P_G = \begin{bmatrix} I & 0 & 0 \\ 0 & I & 0 \end{bmatrix} \begin{bmatrix} G & & B^T \\ & F^{-1} & E^T \\ B & E & \end{bmatrix}^{-1} \begin{bmatrix} I & 0 \\ 0 & I \\ 0 & 0 \end{bmatrix}.

We now apply Principles 2.1 and 2.2 of Gould et al. (2014) to the standard Lanczos basis-generation process for \hat{\mathcal{K}}, and obtain the projected Lanczos process outlined in Algorithm 1.

In Algorithm 1, the notation ‖\bar{u}‖_{[P]} represents a measure of the deviation of \bar{u} = [\bar{u}_x ; \bar{u}_w] from Null(N) (Gould et al., 2014, Section 3). More precisely,

(15)  ‖\bar{u}‖²_{[P]} := u_x^T \bar{u}_x + u_w^T \bar{u}_w,

where u = [u_x ; u_w] is defined by (14). Note that ‖\bar{u}‖_{[P]} is actually a seminorm and vanishes if and only if [\bar{u}_x ; \bar{u}_w] is orthogonal to Null(N).


Algorithm 1 Projected Lanczos Process
1: choose [x_0 ; w_0] such that Bx_0 + Ew_0 = 0    ▷ initial guess
2: v_{0,x} = 0, v_{0,w} = −w_0    ▷ initial Lanczos vector
3: u_{0,x} = b − Ax_0, u_{0,w} = −F^{-1}w_0    ▷ u_0 = b_0 − Mg_0
4: [u_{1,x} ; u_{1,w} ; z_1] ← solution of (14) with right-hand side [u_{0,x} ; u_{0,w} ; 0]
5: v_{1,x} = u_{1,x}, v_{1,w} = u_{1,w}    ▷ v_1 = P_G u_0
6: β_1 = (v_{1,x}^T u_{0,x} + v_{1,w}^T u_{0,w})^{1/2}    ▷ β_1 = (v_1^T u_0)^{1/2}
7: if β_1 ≠ 0 then
8:   v_{1,x} = v_{1,x}/β_1, v_{1,w} = v_{1,w}/β_1    ▷ ‖v_1‖_{[P]} = 1
9: end if
10: k = 1
11: while β_k ≠ 0 do
12:   u_{k,x} = Av_{k,x}, u_{k,w} = F^{-1}v_{k,w}    ▷ u_k = Mv_k
13:   α_k = v_{k,x}^T u_{k,x} + v_{k,w}^T u_{k,w}    ▷ α_k = v_k^T u_k
14:   [u_{k+1,x} ; u_{k+1,w} ; z_{k+1}] ← solution of (14) with right-hand side [u_{k,x} ; u_{k,w} ; 0]
15:   v_{k+1,x} = u_{k+1,x} − α_k v_{k,x} − β_k v_{k−1,x}    ▷ v_{k+1} = u_{k+1} − α_k v_k − β_k v_{k−1}
16:   v_{k+1,w} = u_{k+1,w} − α_k v_{k,w} − β_k v_{k−1,w}
17:   β_{k+1} = (v_{k+1,x}^T u_{k,x} + v_{k+1,w}^T u_{k,w})^{1/2}    ▷ β_{k+1} = (v_{k+1}^T u_k)^{1/2}
18:   if β_{k+1} ≠ 0 then
19:     v_{k+1,x} = v_{k+1,x}/β_{k+1}, v_{k+1,w} = v_{k+1,w}/β_{k+1}    ▷ ‖v_{k+1}‖_{[P]} = 1
20:   end if
21:   k = k + 1
22: end while

Conceptually, the Lanczos process corresponding to Algorithm 1 can be summarized as

\begin{bmatrix} A & & B^T \\ & F^{-1} & E^T \\ B & E & \end{bmatrix} \begin{bmatrix} V_{k,x} \\ V_{k,w} \\ Z_k \end{bmatrix} = \begin{bmatrix} G & & B^T \\ & F^{-1} & E^T \\ B & E & \end{bmatrix} \left( \begin{bmatrix} V_{k,x} \\ V_{k,w} \\ Z_k \end{bmatrix} T_k + \beta_{k+1} \begin{bmatrix} v_{k+1,x} \\ v_{k+1,w} \\ z_{k+1} \end{bmatrix} e_k^T \right),

where

V_{k,x} = \begin{bmatrix} v_{1,x} & \dots & v_{k,x} \end{bmatrix}, \quad V_{k,w} = \begin{bmatrix} v_{1,w} & \dots & v_{k,w} \end{bmatrix}, \quad Z_k = \begin{bmatrix} z_1 & \dots & z_k \end{bmatrix},

and T_k is the usual Lanczos tridiagonal matrix. Provided that [x_0 ; w_0] ∈ Null(N), (Gould et al., 2014, Theorem 2.2) guarantees that Algorithm 1 is well defined and equivalent to Algorithm 5 in Appendix A applied to (9)–(10) with preconditioner (11). In Algorithm 1 and subsequent algorithms, we use the symbol "←" to assign to the vector on the left of the arrow the result of the external procedure on the right of the arrow.

In the next sections we show how the projected basis-generation procedures can be further reformulated by referring to the original system (1), thus avoiding the use of E and F and the factorization (4).

3. Constraint-Preconditioned Lanczos Process. If we define \bar{p}_k = u_{k,x} for all k ≥ 1, and

(16)  t_k = EFu_{k,w}, \quad k = 0, 1, \dots,


then (14) at line 14 of Algorithm 1 can be written as

(17)  \begin{bmatrix} G & B^T \\ B & -C \end{bmatrix} \begin{bmatrix} \bar{p}_{k+1} \\ z_{k+1} \end{bmatrix} = \begin{bmatrix} u_{k,x} \\ -t_k \end{bmatrix}.

Assumption 2.1 holds when the sum of the number of negative eigenvalues of the matrix of (17) and of C is m (Dollar et al., 2006, Theorem 2.1), which may be verified if an inertia-revealing symmetric indefinite factorization is used to solve (17), such as that of Duff (2004).
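A minimal numerical check of this inertia condition follows, using dense symmetric eigenvalue counts in place of an inertia-revealing factorization (the choices of B, C and G are illustrative ones for which Assumption 2.1 holds):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 5, 2
B = rng.standard_normal((m, n))
C = np.eye(m)    # SPD regularization block: no negative eigenvalues
G = np.eye(n)    # SPD approximation of the leading block

P17 = np.block([[G, B.T], [B, -C]])            # matrix of (17)
neg_P17 = int((np.linalg.eigvalsh(P17) < 0).sum())
neg_C = int((np.linalg.eigvalsh(C) < 0).sum())

# Inertia test corresponding to (Dollar et al., 2006, Theorem 2.1):
assert neg_P17 + neg_C == m
```

With G and C both positive definite, the matrix of (17) has exactly m negative eigenvalues (its Schur complement −C − BG^{-1}B^T is negative definite) and C contributes none, so the sum equals m.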

Unfortunately, (17) still appears to depend on F via (16). We now reformulate Algorithm 1 in terms of full-space quantities. Define the initial guess

(18)  w_0 = -FE^T q_0,

where q_0 ∈ ℝ^m is arbitrary (e.g., q_0 = 0). Line 3 of Algorithm 1 and (16) yield

u_{0,x} = r_0 = b - Ax_0, \quad u_{0,w} = E^T q_0, \quad t_0 = Cq_0.

From here on, let us denote p_k = v_{k,x}. At lines 4–5 of Algorithm 1, we compute \bar{p}_1 = p_1 and z_1 from (14), which yields, in particular, u_{1,w} = FE^T(q_0 - z_1). If we define

s_1 = q_0 - z_1, \quad q_1 = s_1,

lines 5–6 of Algorithm 1 take the form

(19a)  p_1 = \bar{p}_1,
(19b)  v_{1,w} = FE^T q_1,
(19c)  β_1 = (p_1^T u_{0,x} + q_1^T Cq_0)^{1/2} = (p_1^T u_{0,x} + q_1^T t_0)^{1/2}.

We then normalize by dividing p_1 and q_1 by β_1. Lines 12–13 of Algorithm 1 and (16) give

(20a)  u_{1,w} = E^T q_1,
(20b)  t_1 = Cq_1,
(20c)  α_1 = p_1^T u_{1,x} + q_1^T t_1 = p_1^T Ap_1 + q_1^T Cq_1.

We now compute p_2 and v_{2,w} from lines 15–16 of Algorithm 1 with k = 1, i.e., we compute \bar{p}_2 and z_2 from (17), and note that u_{2,w} = FE^T(q_1 - z_2). Thus, by setting

s_2 = q_1 - z_2, \quad q_2 = s_2 - α_1 q_1 - β_1 q_0,

we obtain from lines 2 and 15–16 of Algorithm 1 together with (19b), (20a) and (20b):

p_2 = \bar{p}_2 - α_1 p_1 - β_1 p_0,
v_{2,w} = FE^T s_2 - α_1 FE^T q_1 - β_1 FE^T q_0 = FE^T q_2,
β_2 = (p_2^T Ap_1 + q_2^T Cq_1)^{1/2} = (p_2^T u_{1,x} + q_2^T t_1)^{1/2}.

Then, according to line 19, p_2 must be divided by β_2, and we do the same with q_2. An induction argument shows that for all k ≥ 1

u_{k,w} = E^T q_k,
t_k = Cq_k,
α_k = p_k^T u_{k,x} + q_k^T t_k = p_k^T Ap_k + q_k^T Cq_k,


where q_k has been normalized by β_k. Furthermore, letting

s_{k+1} = q_k - z_{k+1}, \quad q_{k+1} = s_{k+1} - α_k q_k - β_k q_{k-1},

we obtain

p_{k+1} = \bar{p}_{k+1} - α_k p_k - β_k p_{k-1},
v_{k+1,w} = FE^T q_{k+1},
β_{k+1} = (p_{k+1}^T Ap_k + q_{k+1}^T Cq_k)^{1/2} = (p_{k+1}^T u_{k,x} + q_{k+1}^T t_k)^{1/2}.

We divide p_{k+1} and q_{k+1} by β_{k+1} to obtain the vectors to be used at the next iteration. Thus, if we rename u_{k,x} as u_k, we obtain Algorithm 2.

Algorithm 2 Constraint-Preconditioned Lanczos Process
1: choose [x_0 ; q_0] such that Bx_0 − Cq_0 = 0    ▷ initial guess
2: p_0 = 0    ▷ initial Lanczos vector
3: u_0 = b − Ax_0, t_0 = Cq_0
4: [\bar{p}_1 ; z_1] ← solution of (17) with right-hand side [u_0 ; −t_0]
5: p_1 = \bar{p}_1
6: s_1 = q_0 − z_1, q_1 = s_1
7: β_1 = (p_1^T u_0 + q_1^T t_0)^{1/2}
8: if β_1 ≠ 0 then
9:   p_1 = p_1/β_1, q_1 = q_1/β_1
10: end if
11: k = 1
12: while β_k ≠ 0 do
13:   u_k = Ap_k, t_k = Cq_k
14:   α_k = p_k^T u_k + q_k^T t_k = p_k^T Ap_k + q_k^T Cq_k
15:   [\bar{p}_{k+1} ; z_{k+1}] ← solution of (17) with right-hand side [u_k ; −t_k]
16:   p_{k+1} = \bar{p}_{k+1} − α_k p_k − β_k p_{k−1}
17:   s_{k+1} = q_k − z_{k+1}, q_{k+1} = s_{k+1} − α_k q_k − β_k q_{k−1}
18:   β_{k+1} = (p_{k+1}^T u_k + q_{k+1}^T t_k)^{1/2} = (p_{k+1}^T Ap_k + q_{k+1}^T Cq_k)^{1/2}
19:   if β_{k+1} ≠ 0 then
20:     p_{k+1} = p_{k+1}/β_{k+1}, q_{k+1} = q_{k+1}/β_{k+1}
21:   end if
22:   k = k + 1
23: end while

The above transformations can be condensed in the following principle, which summarizes the conversion of a projected process into a constraint-preconditioned process.

Principle 1.
1. Basis vectors v_{k+1,x} are unchanged;
2. Basis vectors v_{k+1,w} have the form FE^T q_{k+1}, where q_{k+1} is defined by

s_{k+1} = q_k − z_{k+1},
q_1 = s_1,
q_{k+1} = s_{k+1} − α_k q_k − β_k q_{k−1}  (k ≥ 1),

and where z_{k+1} results from the solution of (17);


3. Inner products of the form v_{i,w}^T u_{j,w} become q_i^T Cq_j = q_i^T t_j.

Theorem 1 summarizes the equivalence between the two formulations.

Theorem 1. Let E and F be as defined in (4) and G chosen to satisfy Assumption 2.1. Let q_0 ∈ ℝ^m be arbitrary. Then, Algorithm 1 with starting guesses x_0 ∈ ℝ^n and w_0 = −FE^T q_0, such that Bx_0 + Ew_0 = 0, is equivalent to Algorithm 2 with starting guesses x_0 and q_0. In particular, for all k, the vectors v_{k,x} and v_{k,w}, and the scalars α_k and β_k in Algorithm 1 are equal to the vectors p_k and FE^T q_k, and to the scalars α_k and β_k in Algorithm 2, respectively.

Note that Algorithm 2 does not contain references to E and F. The variable s_k is used only to improve readability. Assumption 2.1 guarantees that Algorithm 2 is well posed because it is equivalent to Algorithm 1, which, in turn, is equivalent to the standard Lanczos process for building an orthonormal basis of (13). The main advantages of Algorithm 2 are that it works directly with the formulation (1) and it only requires storage for three vectors of size n + m ([p_k ; q_k], [u_k ; t_k], and [\bar{p}_k ; z_k]), as opposed to the same number of vectors of size n + p + m for Algorithm 1.

We call Algorithm 2 the Constraint-Preconditioned Lanczos (CP-Lanczos) process because of its similarity to a Lanczos process for building an orthonormal basis of a Krylov space associated with the preconditioned operator P^{-1}M, even though the latter appears nonsymmetric.

4. Constraint-Preconditioned Lanczos-Based Krylov Solvers. We may exploit Theorem 1 and use Algorithm 2 to derive a constraint-preconditioned version of any Krylov method based on the Lanczos process. To this aim, we must understand how the update of the k-th iterate [x_k ; w_k] in a Krylov method based on Algorithm 1 translates into the update of the k-th iterate [x_k ; y_k] in the version of that Krylov method based on Algorithm 2. In the following, the former and the latter version of the Krylov method are referred to as projected-Krylov (P-Krylov) and constraint-preconditioned-Krylov (CP-Krylov), respectively.

Because the initial guess g_0 = [x_0 ; w_0] of P-Krylov applied to (6) must lie in Null(N), CP-Krylov must be initialized with [x_0 ; y_0] such that

(21)  Bx_0 − Cy_0 = 0.

Our first result states a property of Algorithm 2 that follows from a specific q0.

Lemma 1. Let Algorithm 2 be initialized with x_0 ∈ ℝ^n and q_0 ∈ Null(C). Then, for all k ≥ 0,

(22)  Bp_k + Cq_k = 0.

Proof. We proceed by induction. For k = 0, (22) holds because p_0 = 0 and q_0 ∈ Null(C). For k = 1, p_1 = \bar{p}_1, q_1 = q_0 − z_1 = −z_1, and (17) and our assumption that q_0 ∈ Null(C) yield

Bp_1 + Cq_1 = B\bar{p}_1 − Cz_1 = −t_0 = −Cq_0 = 0.

Assume (22) holds for any index j ≤ k. Lines 16–17 of Algorithm 2, (16), (17), and our induction assumption imply that

Bp_{k+1} + Cq_{k+1} = B\bar{p}_{k+1} + C(q_k − z_{k+1}) − α_k(Bp_k + Cq_k) − β_k(Bp_{k−1} + Cq_{k−1})
= B\bar{p}_{k+1} + C(q_k − z_{k+1})
= B\bar{p}_{k+1} − Cz_{k+1} + t_k = 0,


which establishes (22).

An interesting property of the CP-Lanczos process is that it is equivalent to formally applying the standard Lanczos process to system (1) with preconditioner (2), where by "formal application" we mean that the Lanczos process is applied blindly as if P were positive definite. Such formal application is stated as Algorithm 6 in Appendix A. The equivalence with Algorithm 2 is stated in the next result, which parallels (Gould et al., 2014, Theorem 2.2).

Theorem 2. Let Algorithm 2 be initialized with x_0 ∈ ℝ^n such that Bx_0 = 0 and q_0 = 0 ∈ ℝ^m, and Algorithm 6 be initialized with the same x_0 and y_0 ∈ ℝ^m such that (21) is satisfied. Then, for all k ≥ 0, v_{k,x} = p_k and v_{k,y} = −q_k, where [v_{k,x} ; v_{k,y}] is the k-th Lanczos vector generated in Algorithm 6, and p_k and q_k are the k-th Lanczos vectors generated in Algorithm 2. In addition, the scalars α_k and β_k computed at each iteration are the same in both algorithms.

Proof. We proceed by induction. The result holds for k = 0 because [v_{0,x} ; v_{0,y}] = [0 ; 0] = [p_0 ; −q_0]. With q_0 = 0, Algorithm 2 initializes u_0 = b − Ax_0 and t_0 = 0. Because (21) is satisfied, Algorithm 6 initializes r_{0,x} = u_0 − B^T y_0 and r_{0,y} = 0. Thus, [v_{1,x} ; v_{1,y}] solves (17) with right-hand side [u_0 − B^T y_0 ; 0]. By (Gould et al., 2014, Theorem 2.1, item 2), [v_{1,x} ; v_{1,y}] equivalently solves (17) with right-hand side [u_0 ; 0], and therefore, [v_{1,x} ; v_{1,y}] at line 4 of Algorithm 6 is equal to [\bar{p}_1 ; z_1]. Lines 5–6 of Algorithm 2 subsequently set p_1 = \bar{p}_1 = v_{1,x} and q_1 = s_1 = q_0 − z_1 = −v_{1,y}.

With q_0 = 0, line 7 of Algorithm 2 computes β_1 = (p_1^T u_0)^{1/2}. We take the inner product of the second row of (17) with z_1 = −q_1 and note that t_0 = 0, and obtain z_1^T Bp_1 = z_1^T Cz_1 = q_1^T Cq_1. Similarly, we take the inner product of the first row of (17) with p_1 and substitute z_1^T Bp_1 to obtain p_1^T u_0 = p_1^T Gp_1 + q_1^T Cq_1, so that β_1 is the same as that computed at line 5 of Algorithm 6. We have established that the result also holds for k = 1.

At a general iteration k, Algorithm 2 sets u_k = Ap_k, t_k = Cq_k and computes α_k = p_k^T u_k + q_k^T t_k = p_k^T Ap_k + q_k^T Cq_k. By Lemma 1, q_k^T Bp_k + q_k^T Cq_k = 0, so that α_k = p_k^T Ap_k − 2q_k^T Bp_k − q_k^T Cq_k. Under the recurrence assumption that v_{k,x} = p_k and v_{k,y} = −q_k, this expression of α_k is the same as that computed at line 12 of Algorithm 6.

At line 15 of Algorithm 2, we compute [\bar{p}_{k+1} ; z_{k+1}] from (17), or, equivalently, as the solution to

\begin{bmatrix} G & B^T \\ B & -C \end{bmatrix} \begin{bmatrix} \bar{p}_{k+1} \\ z_{k+1} − q_k \end{bmatrix} = \begin{bmatrix} Ap_k \\ 0 \end{bmatrix}.

In view of Lemma 1, our recurrence assumption, and (Gould et al., 2014, Theorem 2.1, item 2), line 13 of Algorithm 6 computes [v_{k+1,x} ; v_{k+1,y}] as the solution to the same system as above. Therefore, at that point in each algorithm v_{k+1,x} = \bar{p}_{k+1} and v_{k+1,y} = z_{k+1} − q_k = −s_{k+1}. The vector updates at lines 16–17 of Algorithm 2 together with those at line 14 of Algorithm 6 show that v_{k+1,x} = p_{k+1} and v_{k+1,y} = −q_{k+1}. Our recurrence assumption and Lemma 1 yield Bv_{k,x} − Cv_{k,y} = 0 and Bv_{k+1,x} − Cv_{k+1,y} = 0. Finally, Algorithm 6 sets

β²_{k+1} = v_{k+1,x}^T u_{k,x} + v_{k+1,y}^T u_{k,y}
= v_{k+1,x}^T Av_{k,x} + (Bv_{k+1,x} − Cv_{k+1,y})^T v_{k,y} + v_{k+1,y}^T Bv_{k,x}
= v_{k+1,x}^T Av_{k,x} + v_{k+1,y}^T Cv_{k,y},


which is the same value computed in Algorithm 2.

Theorem 2 shows that Algorithm 2 may be summarized as

\begin{bmatrix} A & B^T \\ B & -C \end{bmatrix} \begin{bmatrix} P_k \\ -Q_k \end{bmatrix} = \begin{bmatrix} G & B^T \\ B & -C \end{bmatrix} \left( \begin{bmatrix} P_k \\ -Q_k \end{bmatrix} T_k + \beta_{k+1} \begin{bmatrix} p_{k+1} \\ -q_{k+1} \end{bmatrix} e_k^T \right),

provided that Bx_0 = 0 and q_0 = 0, where T_k is the same as in Algorithm 1, and

P_k = \begin{bmatrix} p_1 & \dots & p_k \end{bmatrix}, \quad Q_k = \begin{bmatrix} q_1 & \dots & q_k \end{bmatrix}.

A consequence of Theorem 2 is that any CP-Krylov method is formally equivalent to the corresponding standard Krylov method applied to system (1) with preconditioner (2).

Corollary 1. Let Algorithm 2 be initialized with x_0 ∈ ℝ^n such that Bx_0 = 0 and q_0 = 0 ∈ ℝ^m, and Algorithm 6 be initialized with the same x_0 and y_0 ∈ ℝ^m such that (21) is satisfied. The k-th approximate solution of (1) computed by any Lanczos-based CP-Krylov method coincides with the k-th approximate solution obtained by formally applying the standard version of the same method to (1) with preconditioner (2).

Although Corollary 1 states that standard Lanczos-based methods can be safely applied to (1) with preconditioner (2) and an appropriate starting point, Algorithm 2 reduces the computational effort by never requiring products with B or B^T. Only products with A and C are necessary. On the other hand, thanks to Theorem 2, specialized implementations of the standard Lanczos-based methods can be developed by exploiting the equalities Bp_k + Cq_k = 0 and Bx_k − Cy_k = 0, thus saving matrix-vector products. The computation involving s_{k+1} can be carried out, for example, as the update q_{k−1} = q_k − z_{k+1} − β_k q_{k−1} followed by q_{k+1} = q_{k−1} − α_k q_k, or s_{k+1} can overwrite z_{k+1}. Finally, once (2) has been factorized, storing B is no longer necessary, and this can be used to free memory if needed.

A consequence of Theorem 2 is a formal equivalence between the iterates generated by Lanczos-based methods applied by way of Algorithm 2 and Algorithm 6. This equivalence requires a re-interpretation of the optimality conditions associated with the Krylov method.

Consider, e.g., MINRES (Paige and Saunders, 1975). The residual associated with iterate [x_k ; w_k ; y_k] generated by P-MINRES, with w_k = −FE^T y_k, is

r_{P,k} = \begin{bmatrix} r_{P,k,x} \\ r_{P,k,w} \\ r_{P,k,y} \end{bmatrix} = \begin{bmatrix} b \\ 0 \\ 0 \end{bmatrix} − \begin{bmatrix} A & & B^T \\ & F^{-1} & E^T \\ B & E & \end{bmatrix} \begin{bmatrix} x_k \\ w_k \\ y_k \end{bmatrix} = \begin{bmatrix} b − Ax_k − B^T y_k \\ 0 \\ 0 \end{bmatrix},

where we used the fact that Bx_k + Ew_k = 0 for all k. This residual corresponds to the residual at iterate [x_k ; y_k] generated by CP-MINRES:

r_{CP,k} = \begin{bmatrix} b \\ 0 \end{bmatrix} − \begin{bmatrix} A & B^T \\ B & -C \end{bmatrix} \begin{bmatrix} x_k \\ y_k \end{bmatrix} = \begin{bmatrix} b − Ax_k − B^T y_k \\ 0 \end{bmatrix},

where we exploited the fact that Bx_k − Cy_k = 0 for all k, which comes from Bx_k + Ew_k = 0 and w_k = −FE^T y_k.

We may apply the arguments of Gould et al. (2014, Section 3) to conclude that P-MINRES, and hence CP-MINRES, minimizes the deviation of [r_{P,k,x} ; 0] from the


range space of N, i.e., as in (15),

(23)  ‖r_{P,k}‖²_{[P]} = (b − Ax_k − B^T y_k)^T h_k,

where

\begin{bmatrix} G & & B^T \\ & F^{-1} & E^T \\ B & E & \end{bmatrix} \begin{bmatrix} h_k \\ f_k \\ l_k \end{bmatrix} = \begin{bmatrix} b − Ax_k − B^T y_k \\ 0 \\ 0 \end{bmatrix}.

Because h_k ∈ Null(B), we also have

(24)  ‖r_{P,k}‖²_{[P]} = (b − Ax_k)^T h_k.

Equivalently, h_k may be computed from

\begin{bmatrix} G & B^T \\ B & -C \end{bmatrix} \begin{bmatrix} h_k \\ l_k \end{bmatrix} = \begin{bmatrix} b − Ax_k \\ 0 \end{bmatrix}.

Because of its residual-norm minimization property, CP-MINRES is appropriate to solve saddle-point systems in a linesearch inexact-Newton context, where we seek to reduce the residual of the Newton-like equations (1) in an appropriate space.
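The squared seminorm can be evaluated as sketched below. Here x_k and y_k are synthetic vectors satisfying Bx_k − Cy_k = 0 rather than actual CP-MINRES iterates, and all matrices are illustrative; the check verifies the identity (b − Ax_k)^T h_k = h_k^T Gh_k + l_k^T Cl_k, which follows from the two block rows of the system defining h_k and shows nonnegativity.

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 5, 2
A = rng.standard_normal((n, n))
A = A + A.T + n * np.eye(n)
B = rng.standard_normal((m, n))
C = np.eye(m)
G = np.eye(n)
b = rng.standard_normal(n)

# A synthetic iterate on the constraint manifold: pick yk, then xk with B xk = C yk.
yk = rng.standard_normal(m)
xk = np.linalg.lstsq(B, C @ yk, rcond=None)[0]
assert np.abs(B @ xk - C @ yk).max() < 1e-10

# hk and lk from the full-space system preceding this example.
P17 = np.block([[G, B.T], [B, -C]])
s = np.linalg.solve(P17, np.concatenate([b - A @ xk, np.zeros(m)]))
hk, lk = s[:n], s[n:]

seminorm2 = (b - A @ xk) @ hk                              # right-hand side of (24)
assert abs(seminorm2 - (hk @ G @ hk + lk @ C @ lk)) < 1e-8  # quadratic-form identity
assert seminorm2 >= -1e-10                                  # squared seminorm >= 0
```

The identity uses Gh_k + B^T l_k = b − Ax_k and Bh_k = Cl_k, so (b − Ax_k)^T h_k = h_k^T Gh_k + l_k^T Cl_k ≥ 0 when G is positive definite and C is positive semi-definite.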

The same reasoning applies to the constraint-preconditioned version of any Lanczos-based Krylov method. For example, Paige and Saunders (1975) derive the conjugate gradient method of Hestenes and Stiefel (1952) directly from the Lanczos process. The nullspace variant of the constraint-preconditioned version, Lanczos CP-CG, generates iterates \hat{x}_k so as to minimize the energy norm of the error, i.e.,

‖\hat{e}_k‖²_{\widehat{M}} = \hat{e}_k^T \widehat{M} \hat{e}_k,

where \hat{e}_k = \hat{x}_k − \hat{x}_*, and \hat{x}_* is the exact solution of (9). The definitions (10) yield

‖\hat{e}_k‖²_{\widehat{M}} = (\hat{x}_k − \hat{x}_*)^T Z^T M Z (\hat{x}_k − \hat{x}_*)
= (x_k − x_*)^T A (x_k − x_*) + (w_k − w_*)^T F^{-1} (w_k − w_*)
= (x_k − x_*)^T A (x_k − x_*) + (y_k − y_*)^T C (y_k − y_*),

where we used again the relationship w_k = −FE^T y_k between iterates of P-CG and CP-CG. For Lanczos CP-CG to be applicable, \widehat{M} must be positive definite, which occurs when the sum of the number of negative eigenvalues of K and C is m (Dollar et al., 2006, Theorem 2.1).

We can derive a "traditional" CP-CG implementation by applying the usual transformations to the Lanczos CP-CG. The result coincides with the implementation of Dollar et al. (2006), although the latter authors assume that B has full row rank for specific purposes. It is also equivalent to that of Cafieri, D'Apuzzo, De Simone, and di Serafino (2007) for (1) with positive definite C. The above suggests that CP-CG is appropriate to solve saddle-point systems in constrained optimization where (1) is used to minimize a quadratic model of a penalty function and sufficient decrease of this quadratic model is sought, such as in trust-region methods.

Our last example considers SYMMLQ (Paige and Saunders, 1975), which does not require $\hat{M}$ to be positive definite but, like CG, requires (1) to be consistent. Its


constraint-preconditioned version, CP-SYMMLQ, computes $[x_k ; y_k]$ so as to minimize the error in a norm defined by the preconditioner, i.e.,

\hat{e}_k^T \hat{P}^{-1} \hat{e}_k = (\hat{x}_k - \hat{x}_*)^T (Z^T P Z)^{-1} (\hat{x}_k - \hat{x}_*)
= (\hat{x}_k - \hat{x}_*)^T Z^T Z (Z^T P Z)^{-1} Z^T Z (\hat{x}_k - \hat{x}_*)
= \begin{bmatrix} x_k - x_* \\ w_k - w_* \end{bmatrix}^T P_G^{\dagger}
  \begin{bmatrix} x_k - x_* \\ w_k - w_* \end{bmatrix},

where we used similar identifications as above and assumed, without loss of generality, that Z has orthonormal columns. In other words, if we define

(25) \quad
\begin{bmatrix} G & & B^T \\ & F^{-1} & E^T \\ B & E & \end{bmatrix}
\begin{bmatrix} e_x \\ e_w \\ e \end{bmatrix}
=
\begin{bmatrix} x_k - x_* \\ w_k - w_* \\ 0 \end{bmatrix},

then

\hat{e}_k^T \hat{P}^{-1} \hat{e}_k = (x_k - x_*)^T e_x + (w_k - w_*)^T e_w = e_x^T G e_x + e_w^T F^{-1} e_w.

By (5) and (25), there exists a vector $e_y$ such that $e_w = -FE^T e_y$, and thus $e_w^T F^{-1} e_w = e_y^T C e_y$. The second block row of (25) premultiplied by $EF$ yields $EF(w_k - w_*) - Ce = -Ce_y$, so that (25) can be written as

\begin{bmatrix} G & B^T \\ B & -C \end{bmatrix}
\begin{bmatrix} e_x \\ e_y \end{bmatrix}
=
\begin{bmatrix} x_k - x_* \\ 0 \end{bmatrix}.

Finally, CP-SYMMLQ minimizes

\hat{e}_k^T \hat{P}^{-1} \hat{e}_k = e_x^T G e_x + e_y^T C e_y.
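The identity above can again be verified numerically from (25). The Python sketch below (illustrative; it makes the simplifying assumption $E = I$ and $F = C$, so that $C = EFE^T$) solves (25) with random right-hand sides standing in for $x_k - x_*$ and $w_k - w_*$, and checks that $(x_k - x_*)^T e_x + (w_k - w_*)^T e_w = e_x^T G e_x + e_w^T F^{-1} e_w$.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 6, 3
MG = rng.standard_normal((n, n)); G = MG @ MG.T + np.eye(n)
B = rng.standard_normal((m, n))
MF = rng.standard_normal((m, m)); F = MF @ MF.T + np.eye(m)
E = np.eye(m)                        # simplifying assumption: C = E F E' with E = I
Finv = np.linalg.inv(F)

rx = rng.standard_normal(n)          # stands in for xk - x*
rw = rng.standard_normal(m)          # stands in for wk - w*

# system (25): [G 0 B'; 0 F^{-1} E'; B E 0] [ex; ew; e] = [rx; rw; 0]
P3 = np.block([[G, np.zeros((n, m)), B.T],
               [np.zeros((m, n)), Finv, E.T],
               [B, E, np.zeros((m, m))]])
sol = np.linalg.solve(P3, np.concatenate([rx, rw, np.zeros(m)]))
ex, ew = sol[:n], sol[n:n + m]

lhs = rx @ ex + rw @ ew
rhs = ex @ G @ ex + ew @ Finv @ ew
assert abs(lhs - rhs) < 1e-8 * max(1.0, abs(lhs))
```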

5. Constraint-Preconditioned Arnoldi Process and Associated Krylov Solvers. A constraint-preconditioned version of the Arnoldi process can be derived by reasoning as in Section 3, obtaining Algorithm 3. The equivalence between the projected version (Algorithm 7 in Appendix A) and the constraint-preconditioned version is stated in Theorem 3, which is akin to Theorem 1.

Theorem 3. Let E and F be as defined in (4) and G chosen to satisfy Assumption 2.1. Let $q_0 \in \mathbb{R}^m$ be arbitrary. Then, Algorithm 7 in Appendix A with starting guesses $x_0 \in \mathbb{R}^n$ and $w_0 = -FE^T q_0$, such that $Bx_0 + Ew_0 = 0$, is equivalent to Algorithm 3 with starting guesses $x_0$ and $q_0$. In particular, for all k, the vectors $v_{k,x}$ and $v_{k,w}$ and the scalars $h_{i,k}$ in Algorithm 7 are equal to the vectors $p_k$ and $FE^T q_k$, and to the scalars $h_{i,k}$ in Algorithm 3, respectively.

As in the case of the Lanczos process, the CP-Arnoldi process is equivalent to applying the corresponding standard Arnoldi process to system (1) with preconditioner (2) (see Algorithm 8 in Appendix A), as stated in the next theorem.

Theorem 4. Let Algorithm 3 be initialized with $x_0 \in \mathbb{R}^n$ such that $Bx_0 = 0$, $q_0 = 0 \in \mathbb{R}^m$, and Algorithm 8 be initialized with the same $x_0$ and $y_0 \in \mathbb{R}^m$ such that (21) is satisfied. Then, for all $k \ge 0$, $v_{k,x} = p_k$ and $v_{k,y} = -q_k$, where $[v_{k,x} ; v_{k,y}]$ is the k-th Arnoldi vector generated in Algorithm 8, and $p_k$ and $q_k$ are the k-th Arnoldi vectors generated in Algorithm 3. In addition, the scalars


Algorithm 3 Constraint-Preconditioned Arnoldi Process
1: choose $[x_0 ; q_0]$ such that $Bx_0 - Cq_0 = 0$  ▷ initial guess
2: $p_0 = 0$  ▷ initial Arnoldi vector
3: $u_0 = b - Ax_0$, $t_0 = Cq_0$
4: $[\hat{p}_1 ; z_1] \leftarrow$ solution of (17) with right-hand side $[u_0 ; -t_0]$
5: $p_1 = \hat{p}_1$
6: $q_1 = q_0 - z_1$
7: $h_{1,0} = (p_1^T u_0 + q_1^T t_0)^{1/2}$
8: if $h_{1,0} \ne 0$ then
9:   $p_1 = p_1 / h_{1,0}$, $q_1 = q_1 / h_{1,0}$
10: end if
11: $k = 1$
12: while $h_{k,k-1} \ne 0$ do
13:   $u_k = Ap_k$, $t_k = Cq_k$
14:   $[\hat{p}_{k+1} ; z_{k+1}] \leftarrow$ solution of (17) with right-hand side $[u_k ; -t_k]$
15:   $p_{k+1} = \hat{p}_{k+1}$
16:   $q_{k+1} = q_k - z_{k+1}$
17:   for $i = 1, \dots, k$ do
18:     $h_{i,k} = p_i^T u_k + q_i^T t_k = p_i^T A p_k + q_i^T C q_k$
19:     $p_{k+1} = p_{k+1} - h_{i,k} p_i$
20:     $q_{k+1} = q_{k+1} - h_{i,k} q_i$
21:   end for
22:   $h_{k+1,k} = (p_{k+1}^T u_k + q_{k+1}^T t_k)^{1/2} = (p_{k+1}^T A p_k + q_{k+1}^T C q_k)^{1/2}$
23:   if $h_{k+1,k} \ne 0$ then
24:     $p_{k+1} = p_{k+1} / h_{k+1,k}$, $q_{k+1} = q_{k+1} / h_{k+1,k}$
25:   end if
26:   $k = k + 1$
27: end while
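The process can be prototyped in a few lines of dense linear algebra. The Python sketch below is illustrative only (the paper's implementation is the MATLAB library cpkrylov), and it assumes that the solves in (17) are with the constraint-preconditioner matrix $[G \; B^T; B \; -C]$ and right-hand side $[u ; -t]$. It then checks two properties implied by the recurrences: each pair satisfies $Bp_k + Cq_k = 0$, and the pairs are orthonormal in the semi-inner product induced by blkdiag(G, C).

```python
import numpy as np

def cp_arnoldi(A, B, C, G, b, x0, q0, kmax):
    """Dense sketch of the constraint-preconditioned Arnoldi process (Algorithm 3)."""
    n, m = A.shape[0], B.shape[0]
    P = np.block([[G, B.T], [B, -C]])      # assumed matrix of the solves in (17)

    def prec(u, t):
        # solve [G B'; B -C] [p; z] = [u; -t]
        s = np.linalg.solve(P, np.concatenate([u, -t]))
        return s[:n], s[n:]

    ps, qs = [], []
    q_last = q0
    u, t = b - A @ x0, C @ q0
    for k in range(kmax):
        p, z = prec(u, t)
        q = q_last - z
        for pi, qi in zip(ps, qs):         # orthogonalize against earlier vectors
            h = pi @ u + qi @ t
            p, q = p - h * pi, q - h * qi
        h2 = p @ u + q @ t                 # h_{k+1,k}^2; nonnegative for PSD G and C
        if h2 <= 1e-24:
            break                          # breakdown: Krylov space exhausted
        p, q = p / np.sqrt(h2), q / np.sqrt(h2)
        ps.append(p); qs.append(q)
        q_last = q
        u, t = A @ p, C @ q                # right-hand side of the next solve
    return ps, qs

rng = np.random.default_rng(3)
n, m = 8, 3
A = rng.standard_normal((n, n))            # possibly nonsymmetric
B = rng.standard_normal((m, n))
MC = rng.standard_normal((m, m)); C = MC @ MC.T + np.eye(m)
MG = rng.standard_normal((n, n)); G = MG @ MG.T + np.eye(n)
b = rng.standard_normal(n)
x0, q0 = np.zeros(n), np.zeros(m)          # B x0 - C q0 = 0 trivially
ps, qs = cp_arnoldi(A, B, C, G, b, x0, q0, 5)

# each pair lies on the constraint manifold: B p_k + C q_k = 0
for p, q in zip(ps, qs):
    assert np.linalg.norm(B @ p + C @ q) < 1e-8
# orthonormality in the blkdiag(G, C) semi-inner product
W = np.array([[ps[i] @ G @ ps[j] + qs[i] @ C @ qs[j]
               for j in range(len(ps))] for i in range(len(ps))])
assert np.allclose(W, np.eye(len(ps)), atol=1e-6)
```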

$h_{i,k}$ computed at each iteration are the same in both algorithms.

Theorem 4 allows us to develop a constraint-preconditioned variant of any Krylov method based on the Arnoldi process, using a starting guess satisfying (21). Such variants are equivalent to their standard counterparts preconditioned with (2), but are computationally cheaper, as in the case of Lanczos-based methods. Furthermore, CP-Krylov versions of optimal Arnoldi-based Krylov methods preserve the minimization properties of these methods in the sense explained in Section 4. For example, the constraint-preconditioned version of GMRES (Saad and Schultz, 1986) minimizes the norm of the deviation of the residual from Range(N) similarly to MINRES.

Obtaining constraint-preconditioned versions of GMRES(ℓ) and DQGMRES is straightforward, by restarting and truncating the CP-Arnoldi basis-generation process, respectively, as in the standard case (Saad, 2003). Note that DQGMRES with memory 2, i.e., with orthogonalization of each Arnoldi vector against the two previous vectors only, is equivalent to CP-MINRES in exact arithmetic when A is symmetric. In finite-precision arithmetic, DQGMRES with a larger memory may dampen the loss of orthogonality among the Lanczos vectors and act as a local reorthogonalization procedure, although we did not observe significant differences in Section 7.
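The only difference between the full and truncated processes is the orthogonalization loop: DQGMRES with memory mem orthogonalizes each new vector against the last mem basis vectors only. A minimal Python illustration of the windowed loop (on plain vectors with the Euclidean inner product, not the CP machinery):

```python
from collections import deque
import numpy as np

def truncated_orthonormalize(vectors, mem):
    """Gram-Schmidt where each vector is orthogonalized against the last `mem` only."""
    window = deque(maxlen=mem)        # discards the oldest vector automatically
    basis = []
    for v in vectors:
        w = v.copy()
        for u in window:
            w -= (u @ w) * u          # remove the component along u
        w /= np.linalg.norm(w)
        window.append(w)
        basis.append(w)
    return basis

rng = np.random.default_rng(4)
vs = [rng.standard_normal(6) for _ in range(5)]
basis = truncated_orthonormalize(vs, mem=2)
# each vector is orthogonal to its two predecessors, but not necessarily to older ones
for i in range(1, len(basis)):
    for j in range(max(0, i - 2), i):
        assert abs(basis[i] @ basis[j]) < 1e-10
```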

Dollar (2007, Theorems 4.1 and 4.3) establishes that if C is positive semi-definite of rank p, $P^{-1}K$ has an eigenvalue at 1 of multiplicity $2m - p$, while the remaining


$n - m + p$ eigenvalues are defined by a generalized eigenvalue problem. A remark after (Dollar, 2007, Theorem 4.1) states that Assumption 2.1 ensures that all eigenvalues are real. In addition, the dimension of the Krylov space is at most $\min(n - m + p + 2, n + m)$. Inspection reveals that Dollar's proofs of those results do not use the fact that A is symmetric; the results hold for general A. Loghin (2017) establishes similar results on the eigenvalues of non-regularized saddle-point matrices for general A and general G. Clustering eigenvalues accelerates convergence of nonsymmetric Krylov solvers in many practical cases, although the convergence behavior of such solvers is not fully characterized by the eigenvalues (Greenbaum, Pták, and Strakoš, 1996).
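The multiplicity result is easy to observe numerically: since $K - P$ = blkdiag(A − G, 0) has rank at most n, $P^{-1}K$ has at least m eigenvalues equal to 1, and with C positive definite (p = m) the predicted multiplicity $2m - p$ equals m. A quick dense Python check (illustrative choices of G and the data):

```python
import numpy as np

rng = np.random.default_rng(5)
n, m = 7, 3
A = rng.standard_normal((n, n))                               # nonsymmetric leading block
B = rng.standard_normal((m, n))
MC = rng.standard_normal((m, m)); C = MC @ MC.T + np.eye(m)   # SPD, so p = rank(C) = m
G = 2.0 * np.eye(n)                                           # a simple nonsingular choice

K = np.block([[A, B.T], [B, -C]])
P = np.block([[G, B.T], [B, -C]])
eigs = np.linalg.eigvals(np.linalg.solve(P, K))
n_at_one = int(np.sum(np.abs(eigs - 1.0) < 1e-6))
assert n_at_one >= m          # at least 2m - p = m unit eigenvalues when p = m
```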

6. Implementation Issues. We implemented the constraint-preconditioned variants of the Lanczos-CG, MINRES, SYMMLQ, GMRES(ℓ) and DQGMRES methods for (1) in a MATLAB library named cpkrylov. For completeness, we also included in the library an implementation of the CP-CG method in the form given by Dollar et al. (2006). We think that cpkrylov can be useful as a basis for the development of more sophisticated numerical software.

All solvers are accessed via a common interface exposed by the main driver reg_cpkrylov(), which performs pre-processing operations, calls the requested solver, performs post-processing operations, and returns solutions and statistics to the user. cpkrylov is freely available from github.com/optimizers/cpkrylov.

Because A is never required as an explicit matrix, we allow the user to supply it as an abstract linear operator as implemented in the Spot linear operator toolbox (www.cs.ubc.ca/labs/scl/spot). Spot allows us to use the familiar matrix notation with operators for which a representation as an explicit matrix is unavailable or inefficient. This affords the user flexibility in defining A while keeping the implementation of the various Krylov methods as readable as if A were a matrix.

Gould et al. (2014) observe that the numerical stability of projected Krylov solvers depends on keeping $[x_k ; w_k]$ in Null(N). While the iterates lie in the nullspace in exact arithmetic, $[x_k ; w_k]$ may have a non-negligible component in Range($N^T$) because of roundoff error. In turn, the stability of CP-Krylov solvers depends on how accurately $[x_k ; y_k]$ satisfies

Bx_k - Cy_k = 0.

Gould et al. (2001) suggest increasing the accuracy by applying iterative refinement after solving (17) with a direct method. In cpkrylov, the constraint preconditioner $P_G$ is implemented as a Spot operator P such that writing z = P*r, where z = [z1 ; z2] and r = [r1 ; r2], corresponds to solving

(26) \quad
\begin{bmatrix} G & B^T \\ B & -C \end{bmatrix}
\begin{bmatrix} z_1 \\ z_2 \end{bmatrix}
=
\begin{bmatrix} r_1 \\ r_2 \end{bmatrix},

and performing iterative refinement if requested by the user or if the residual norm of (26) exceeds a given tolerance. By default, the matrix of (26) is factorized by way of MATLAB's ldl(). Spot allows us to separate the implementation of the preconditioner from that of other phases of the solvers, so that future extensions to the former (e.g., the case where applying $P_G$ results from a different factorization) will not require changes to the latter. Currently, the implementation of $P_G$ is transparent to the user, who must only pass the matrices G, B and C to reg_cpkrylov().
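The refinement step is a one-line addition to the preconditioner solve. A Python sketch of such an operator follows (the MATLAB implementation uses a Spot operator and ldl(); here a dense solve stands in for the factorization, and the names are ours):

```python
import numpy as np

def make_cp_operator(G, B, C, refine=True, tol=1e-12):
    """Return a function applying the constraint preconditioner as in (26)."""
    P = np.block([[G, B.T], [B, -C]])      # stands in for an ldl() factorization

    def apply(r):
        z = np.linalg.solve(P, r)
        res = r - P @ z
        if refine and np.linalg.norm(res) > tol * np.linalg.norm(r):
            z += np.linalg.solve(P, res)   # one step of iterative refinement
        return z
    return apply

rng = np.random.default_rng(6)
n, m = 6, 3
MG = rng.standard_normal((n, n)); G = MG @ MG.T + np.eye(n)
B = rng.standard_normal((m, n))
MC = rng.standard_normal((m, m)); C = MC @ MC.T + np.eye(m)
P = np.block([[G, B.T], [B, -C]])
apply_P = make_cp_operator(G, B, C)
r = rng.standard_normal(n + m)
z = apply_P(r)
assert np.linalg.norm(P @ z - r) < 1e-8 * np.linalg.norm(r)
```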



All CP-Krylov solvers stop when

(27) \quad \|r_{P,k}\|_{[P]} \le \varepsilon_a + \|r_{P,0}\|_{[P]} \, \varepsilon_r,

where $\|r_{P,k}\|_{[P]}$ is defined in (23) (or, equivalently, in (24)), and $\varepsilon_a$ and $\varepsilon_r$ are tolerances given by the user (default values are also set in our implementations). Note that, for all the CP-Krylov solvers except CP-DQGMRES, $\|r_{P,k}\|_{[P]}$ is obtained as a byproduct of other computations performed in the algorithm. CP-DQGMRES computes an estimate of the residual norm only. A computationally cheap overestimate of the residual norm could be used in the stopping criterion, but this may unnecessarily increase the number of iterations (Saad and Wu, 1996, Section 3.1). A maximum number of iterations can also be specified for all solvers.

So far, we have considered the case where the last m entries of the right-hand side of (1) are zero. When the right-hand side has the general form $[b_1 ; b_2]$ with $b_2 \ne 0$, we can compute $\Delta x$ and $\Delta y$ such that

(28) \quad B\Delta x - C\Delta y = b_2,

by applying $P_G$ to $[0 ; b_2]$, and subsequently solve (1) with $b = b_1 - A\Delta x - B^T\Delta y$. The solution of the original system is $[x + \Delta x ; y + \Delta y]$. These pre- and post-processing steps are implemented in reg_cpkrylov().
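These steps can be verified directly: one application of $P_G$ shifts the right-hand side, and adding $[\Delta x ; \Delta y]$ back reproduces a solution of the original system. A Python sketch (dense solves stand in for the Krylov solver and the preconditioner):

```python
import numpy as np

rng = np.random.default_rng(7)
n, m = 6, 3
A = rng.standard_normal((n, n))
B = rng.standard_normal((m, n))
MC = rng.standard_normal((m, m)); C = MC @ MC.T + np.eye(m)
MG = rng.standard_normal((n, n)); G = MG @ MG.T + np.eye(n)
K = np.block([[A, B.T], [B, -C]])
PG = np.block([[G, B.T], [B, -C]])
b1, b2 = rng.standard_normal(n), rng.standard_normal(m)

# pre-processing: [dx; dy] = PG^{-1} [0; b2], so that B dx - C dy = b2, as in (28)
d = np.linalg.solve(PG, np.concatenate([np.zeros(n), b2]))
dx, dy = d[:n], d[n:]

# solve (1) with shifted right-hand side [b; 0], where b = b1 - A dx - B' dy
b = b1 - A @ dx - B.T @ dy
s = np.linalg.solve(K, np.concatenate([b, np.zeros(m)]))
x, y = s[:n] + dx, s[n:] + dy                  # post-processing: add the shift back

# the recombined vector solves the original system with right-hand side [b1; b2]
assert np.allclose(K @ np.concatenate([x, y]), np.concatenate([b1, b2]), atol=1e-8)
```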

7. Numerical Experiments. We report results obtained by applying some solvers from the cpkrylov library to regularized saddle-point systems included in the collection described in (Orban, 2015a,b). The systems of the collection were generated by running the primal-dual regularized interior-point (IP) method by Friedlander and Orban (2012) on sparse convex quadratic programming problems from CUTEst (Gould, Orban, and Toint, 2015).

At each iteration of the IP method, a Newton step is applied to suitably perturbed KKT conditions associated with a regularized quadratic problem of the form

\minimize_{x \in \mathbb{R}^n,\, r \in \mathbb{R}^m,\, s \in \mathbb{R}^p} \quad
\tfrac12 x^T Q x + c^T x + \tfrac12 \rho \|x - x_k\|^2 + \tfrac12 \rho \|s - s_k\|^2 + \tfrac12 \delta \|r + y_k\|^2

\text{subject to} \quad J_1 x + J_2 s + \delta r = b, \quad s \ge 0,

which includes linear inequality constraints, free variables, and nonnegativity bounds. The Newton step requires the solution of a linear system, which can be cast into the form (1) by an inexpensive elimination of variables. The corresponding matrix takes the form

K_2 = \begin{bmatrix} A & B^T \\ B & -C \end{bmatrix}
    = \begin{bmatrix} Q + \rho I & 0 & J_1^T \\ 0 & S^{-1}Z + \rho I & J_2^T \\ J_1 & J_2 & -\delta I \end{bmatrix},

where $S = \mathrm{diag}(s)$, $Z = \mathrm{diag}(z)$, with z the complementary variable of s, and the block A is symmetric positive definite. Recently, unreduced KKT systems have attracted the interest of researchers because of their better spectral properties, especially as the IP iterate approaches a solution of the optimization problem (Greif, Moulding, and Orban, 2014; Morini, Simoncini, and Tani, 2016). Among the possible unreduced matrices, we consider

K_3 = \begin{bmatrix} A & B^T \\ B & -C \end{bmatrix}
    = \begin{bmatrix} Q + \rho I & 0 & 0 & J_1^T \\ 0 & \rho I & -I & J_2^T \\ 0 & Z & S & 0 \\ J_1 & J_2 & 0 & -\delta I \end{bmatrix}.


                       K2                                     K3
Problem      size   dens %   cond               size   dens %   cond              IP iters
cvxqp1_s      550   7.3e-1   O(10^3 - 10^16)     750   5.0e-1   O(10^3 - 10^10)   0, 5, 10
cvxqp1_m     5500   7.4e-2   O(10^4 - 10^11)    7500   5.1e-2   O(10^4 - 10^11)   0, 5, 10
cvxqp1_l    55000   7.4e-3   O(10^5 - 10^13)   75000   5.1e-3   O(10^5 - 10^13)   0, 5, 10
cvxqp2_s      525   7.4e-1   O(10^3 - 10^9)      725   5.0e-1   O(10^3 - 10^7)    0, 5, 10
cvxqp2_m     5250   7.4e-1   O(10^4 - 10^9)     7250   5.1e-2   O(10^4 - 10^9)    0, 5, 10
cvxqp2_l    52500   7.5e-3   O(10^5 - 10^11)   72500   5.1e-3   O(10^5 - 10^11)   0, 5, 10
cvxqp3_s      575   7.2e-1   O(10^3 - 10^11)     775   5.0e-1   O(10^3 - 10^9)    0, 5, 10
cvxqp3_m     5750   7.3e-2   O(10^4 - 10^13)    7750   5.0e-2   O(10^4 - 10^12)   0, 5, 10
cvxqp3_l    57500   7.3e-3   O(10^5 - 10^13)   77500   5.0e-3   O(10^5 - 10^13)   0, 5, 10
mosarqp1     8900   3.4e-2   O(10 - 10^8)      12100   2.5e-2   O(10 - 10^4)      0, 5
mosarqp2     3900   9.6e-2   O(10 - 10^5)       5400   6.6e-2   O(10 - 10^3)      0, 5
stcqp1      22537   2.5e-2   O(10^3 - 10^13)   30731   1.6e-2   O(10^3 - 10^13)   0, 5, 10
stcqp2      22537   2.5e-2   O(10^3 - 10^7)    30731   1.6e-2   O(10^3 - 10^7)    0, 5

Table 1: Details on the saddle-point matrices used in the experiments.

which is a permutation of the matrix in the first system of (Orban, 2015a, page 3). All the linear systems from (Orban, 2015b) correspond to IP iterations 0, 5 and 10, where a larger iteration number yields a more ill-conditioned system. The differences between the matrices at the various iterations reside in the values of the vectors s and z, and of the regularization parameters $\rho$ and $\delta$. The latter were set to $\rho = \delta = 1$ at iteration 0, $\rho = \delta = 10^{-5}$ at iteration 5, and $\rho = \delta = 10^{-8}$ at iteration 10. On some CUTEst quadratic problems the IP method satisfied its stopping criterion before iteration 10 or even iteration 5; therefore, not all problems have three associated systems.

Here we show results concerning the CUTEst problems reported in the first column of Table 1. For three of them the IP method did not reach iteration 10; altogether, we considered 36 linear systems for each matrix type, i.e., $K_2$ or $K_3$. For each problem, we report in Table 1 the size of the matrices, their density (computed as the ratio between the number of nonzero entries and the total number of entries), the range of their condition numbers (obtained by using the condest() MATLAB function), and the associated IP iterations. We see that for six problems the condition number of $K_3$ varies in a smaller range than that of $K_2$. Note that we simply provide the condition number of $K_3$ as an indication of the difficulty of solving a system with this matrix and the accuracy that may be expected of a solution.

We ran CP-CG and CP-MINRES on the systems with matrix $K_2$, and CP-DQGMRES and CP-GMRES(ℓ) on the systems with matrix $K_3$. We set ℓ = 100 and used the same value for the memory parameter of CP-DQGMRES, i.e., the number of Arnoldi vectors to be stored in the truncated CP-Arnoldi process. We denote this solver CP-DQGMRES(100). The leading block G of the constraint preconditioner (2) was set to the diagonal of A. In (27) we set $\varepsilon_a = \varepsilon_r = 10^{-6}$; we also fixed a maximum number of 1,500 iterations. One iterative refinement step was performed during the application of the constraint preconditioner. The experiments were carried out on a 3.1 GHz Intel Core i7 processor with 16 GB of RAM, 8 MB of L3 cache and the macOS 10.14.6 operating system, using MATLAB R2018b. The execution times were measured in seconds, by using the tic and toc MATLAB commands.

In Figure 1 we compare CP-CG and CP-MINRES using Dolan and Moré (2002) performance profiles, with the number of iterations and the execution time as performance metrics. The execution time only includes the time for solving the linear system, since the time for building the preconditioner is the same for the two solvers.


For completeness, we provide a brief description of the performance profiles. Let us consider a set of algorithms $\{A_i \mid i = 1, \dots, n_A\}$ and a set of test problems $\{T_j \mid j = 1, \dots, n_T\}$. Let $S_{T_j,A_i} \ge 0$ be a statistic corresponding to the solution of $T_j$ by $A_i$, and suppose that the smaller the statistic the better the algorithm. Furthermore, let $S_{T_j} = \min\{S_{T_j,A_i} \mid i = 1, \dots, n_A\}$. The performance profile of algorithm $A_i$ is defined as

\pi_i(\chi) = \frac{\mathrm{size}\{T_j \mid S_{T_j,A_i}/S_{T_j} \le \chi\}}{\mathrm{size}\{T_j \mid j = 1, \dots, n_T\}}, \qquad \chi \ge 1,

where the ratio $S_{T_j,A_i}/S_{T_j}$ is set to $+\infty$ if $A_i$ fails in solving $T_j$. Thus $\pi_i(1)$ gives the percentage of problems for which $A_i$ is the best, while the percentage of problems that are successfully solved by $A_i$ is $\lim_{\chi \to +\infty} \pi_i(\chi)$.
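The definition translates directly into code. A small Python version follows (the function name and data layout are ours, not from cpkrylov):

```python
import numpy as np

def performance_profile(S, chi):
    """S[i, j] >= 0: statistic of algorithm i on problem j (np.inf marks a failure).
    Returns pi_i(chi) for each algorithm i."""
    S = np.asarray(S, dtype=float)
    best = S.min(axis=0)                 # S_{T_j} = min_i S_{T_j, A_i}
    ratios = S / best                    # failed runs keep ratio +inf
    return (ratios <= chi).mean(axis=1)  # fraction of problems within factor chi

# two algorithms on three problems; algorithm 1 fails on the last problem
S = [[2.0, 3.0, 5.0],
     [4.0, 3.0, np.inf]]
pi1 = performance_profile(S, 1.0)
assert np.allclose(pi1, [1.0, 1/3])      # alg 0 is best everywhere; alg 1 ties once
pi2 = performance_profile(S, 2.0)
assert np.allclose(pi2, [1.0, 2/3])      # within a factor 2, alg 1 covers 2 of 3
```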


Fig. 1. CP-CG vs. CP-MINRES on systems with matrix $K_2$: performance profiles in terms of iterations (left) and solve time (right).

We see that CP-MINRES performs slightly better than CP-CG in terms of both iterations and solve time. This agrees with the fact that CP-MINRES minimizes the residual seminorm used in the stopping criterion, while CP-CG does not, and confirms that CP-MINRES is appropriate to solve saddle-point systems in linesearch inexact-Newton contexts. Furthermore, both methods were able to solve all the systems with the required accuracy. A closer examination of the results reveals that CP-CG and CP-MINRES performed their largest number of iterations, 1,288 and 1,045, respectively, on cvxqp2_l at IP iteration 5. On all other systems, CP-CG and CP-MINRES performed at most 861 and 735 iterations, respectively.

In order to provide some details on the behavior of CP-CG and CP-MINRES, in Figure 2 we depict the histories of $\|r_{P,k}\|_{[P]}$ for both solvers applied to the systems corresponding to cvxqp1_l and stcqp1 at IP iterations 10 and 5, respectively. We see that CP-MINRES is more efficient than CP-CG at reducing the residual when the problem requires a larger number of iterations. We verified that this is a general behavior.

In Figure 3 we compare CP-GMRES(100) with CP-DQGMRES(100) in terms of iterations and solve time.

CP-GMRES(100) appears more efficient than CP-DQGMRES(100). Neither manages to solve mosarqp1 at IP iteration 5 within 1,500 iterations. CP-GMRES(100) solves cvxqp2_l at IP iteration 5 in 1,319 iterations and all other systems in fewer than 933 iterations. CP-DQGMRES(100) requires between 1,248 and 1,447 iterations



Fig. 2. Convergence histories of CP-CG and CP-MINRES for the systems arising in the application of the IP method to cvxqp1_l (left) and stcqp1 (right), at IP iterations 10 and 5, respectively.


Fig. 3. CP-GMRES(100) vs. CP-DQGMRES(100) on systems with matrix $K_3$: performance profiles in terms of iterations (left) and solve time (right).

on three systems and fewer than 793 on the rest.

CP-DQGMRES(100) and CP-GMRES(100) performed fewer than 100 iterations, and therefore were one and the same method in principle, on about 47% of the systems. On those 47% of systems, they performed the same number of iterations. CP-DQGMRES(100) performed fewer iterations than CP-GMRES(100) on 22%, and more iterations on 28% of the systems. Closer inspection of the results revealed that in the former case the numbers of iterations are generally close, while in the latter the number of iterations of CP-GMRES(100) is often substantially smaller. Examples of this behavior are given in Figure 4, showing the convergence histories for the systems corresponding to cvxqp3_l and cvxqp1_m at IP iterations 10 and 5, respectively.

The solve time profiles show that CP-DQGMRES(100) was generally slower than CP-GMRES(100), although the times are essentially the same when the two solvers performed fewer than 100 iterations. This behavior seems to indicate that the implementation of CP-DQGMRES can be improved.

In order to evaluate the effectiveness of our constraint preconditioner on a saddle-point matrix with nonsymmetric leading block, we also applied the standard MATLAB gmres() with restart equal to 100, without preconditioner, to the systems with



Fig. 4. Convergence histories of CP-DQGMRES(100) and CP-GMRES(100) for the systems arising in the application of the IP method to cvxqp3_l (left) and cvxqp1_m (right), at IP iterations 10 and 5, respectively.

matrix $K_3$. We set the tolerance for the stopping criterion equal to $10^{-6}$ and fixed a maximum number of 1,500 iterations. gmres() was able to reach the required accuracy in the residual only for 10 systems, all corresponding to IP iteration 0. Since the residual norm computed by the standard GMRES solver with no preconditioner is different from that computed by CP-GMRES, we also checked the accuracy of the computed solution, by using the 2-norm relative error with respect to the solution obtained by solving the saddle-point system with the \ (backslash) MATLAB operator. The error of the standard GMRES solver is larger than the error of CP-GMRES for 9 out of the 10 problems where gmres() satisfies the stopping criterion, and is much larger in all the other cases.

Finally, in Figure 5 we compare CP-MINRES applied to the systems with matrix $K_2$ with CP-GMRES(100) applied to the systems with matrix $K_3$, by using performance profiles of the number of iterations and the total execution time, i.e., the time for setting up the preconditioner plus the time for solving the system. CP-MINRES is always more efficient than CP-GMRES(100) in terms of both metrics on our test set.


Fig. 5. CP-MINRES vs. CP-GMRES(100): performance profiles in terms of iterations (left) and total time (right).


8. Discussion. We extended the approach of Gould et al. (2014) to saddle-point systems with regularization and provided principles from which to derive constraint-preconditioned iterative methods. The resulting methods are conceptually equivalent to standard iterative methods applied to a reduced system in a way that preserves their properties, including quantities that increase or decrease monotonically at each iteration. Specifically, we discussed constraint-preconditioned versions of the Lanczos-CG, MINRES, SYMMLQ, GMRES(ℓ) and DQGMRES methods, showing that they preserve the properties of the corresponding standard methods in a suitable reduced Krylov space. We illustrated our approach on methods based on the Lanczos and Arnoldi processes, but it applies equally to other processes, including those of Golub and Kahan (1965), Saunders, Simon, and Yip (1988), and the unsymmetric Lanczos (1952) bi-orthogonalization process. We also implemented these methods in a MATLAB library, named cpkrylov, which provides a basis for the development of more sophisticated numerical software.

An open question related to constraint preconditioners concerns the best way to reduce their computational cost. Inexact constraint preconditioners have been developed and analyzed, based on approximations of the Schur complement of the leading block of the constraint preconditioner or on other approximations (Lukšan and Vlček, 1998; Perugia and Simoncini, 2000; Durazzi and Ruggiero, 2003; Bergamaschi, Gondzio, Venturin, and Zilli, 2007; Sesana and Simoncini, 2013). Preconditioner updating techniques, producing inexact and exact constraint preconditioners, have also been proposed in order to reduce the cost of solving sequences of saddle-point systems (Bellavia, De Simone, di Serafino, and Morini, 2015, 2016; Fisher, Gratton, Gürol, Trémolet, and Vasseur, 2016; Bergamaschi, De Simone, di Serafino, and Martínez, 2018). It must be noted, however, that the inexact constraint preconditioners considered so far generally do not produce preconditioned vectors lying in the nullspace of N, which is a key issue to obtain CP-preconditioned methods for (1) equivalent to suitably preconditioned Krylov methods for (9). On the other hand, inexact preconditioners have proven effective in reducing the computational time for the solution of large-scale saddle-point systems. A further possibility for lowering the cost of constraint preconditioners is to apply them inexactly using an iterative method. Of course, preserving the property of obtaining preconditioned vectors lying in the nullspace of N remains a main issue. To the best of our knowledge, this approach has not yet been addressed in the literature.

References.

W. E. Arnoldi. The principle of minimized iterations in the solution of the matrix eigenvalue problem. Q. Appl. Math., 9:17–29, 1951. DOI: 10.1090/qam/42792.

S. Bellavia, V. De Simone, D. di Serafino, and B. Morini. Updating constraint preconditioners for KKT systems in quadratic programming via low-rank corrections. SIAM J. Optim., 25(3):1787–1808, 2015. DOI: 10.1137/130947155.

S. Bellavia, V. De Simone, D. di Serafino, and B. Morini. On the update of constraint preconditioners for regularized KKT systems. Comput. Optim. Appl., 65(2):339–360, 2016. DOI: 10.1007/s10589-016-9830-4.

M. Benzi, G. H. Golub, and J. Liesen. Numerical solution of saddle point problems. Acta Numer., 14:1–137, 2005. DOI: 10.1017/S0962492904000212.

L. Bergamaschi, J. Gondzio, M. Venturin, and G. Zilli. Inexact constraint preconditioners for linear systems arising in interior point methods. Comput. Optim. Appl., 36(2–3):137–147, 2007. DOI: 10.1007/s10589-006-9001-0.

L. Bergamaschi, V. De Simone, D. di Serafino, and A. Martínez. BFGS-like updates of constraint preconditioners for sequences of KKT linear systems in quadratic programming. Numer. Linear Algebra Appl., 25(5):e2144, 2018. DOI: 10.1002/nla.2144.


C. Brezinski and M. Redivo-Zaglia. Transpose-free Lanczos-type algorithms for nonsymmetric linear systems. Numer. Algor., 17(1–2):67–103, 1998. DOI: 10.1023/A:1012085428800.

S. Cafieri, M. D'Apuzzo, V. De Simone, and D. di Serafino. On the iterative solution of KKT systems in potential reduction software for large-scale quadratic problems. Comput. Optim. Appl., 38(1):27–45, 2007. DOI: 10.1007/s10589-007-9035-y.

T. F. Chan, L. de Pillis, and H. van der Vorst. Transpose-free formulations of Lanczos-type methods for nonsymmetric linear systems. Numer. Algor., 17(1–2):51–66, 1998. DOI: 10.1023/A:1011637511962.

M. D'Apuzzo, V. De Simone, and D. di Serafino. On mutual impact of numerical linear algebra and large-scale optimization with focus on interior point methods. Comput. Optim. Appl., 45(2):283–310, 2010. DOI: 10.1007/s10589-008-9226-1.

V. De Simone, D. di Serafino, and B. Morini. On preconditioner updates for sequences of saddle-point linear systems. Commun. Appl. Ind. Math., 9(1):35–41, 2018. DOI: 10.1515/caim-2018-0003.

E. D. Dolan and J. J. Moré. Benchmarking optimization software with performance profiles. Math. Program., 91(2):201–213, 2002. DOI: 10.1007/s101070100263.

H. S. Dollar. Constraint-style preconditioners for regularized saddle point problems. SIAM J. Matrix Anal. Appl., 29(2):672–684, 2007. DOI: 10.1137/050626168.

H. S. Dollar, N. I. M. Gould, W. H. A. Schilders, and A. J. Wathen. Implicit-factorization preconditioning and iterative solvers for regularized saddle-point systems. SIAM J. Matrix Anal. Appl., 28(1):170–189, 2006. DOI: 10.1137/05063427X.

I. S. Duff. MA57—a code for the solution of sparse symmetric definite and indefinite systems. ACM Trans. Math. Software, 30(2):118–144, 2004. DOI: 10.1145/992200.992202.

C. Durazzi and V. Ruggiero. Indefinitely preconditioned conjugate gradient method for large sparse equality and inequality constrained quadratic problems. Numer. Linear Algebra Appl., 10(8):673–688, 2003. DOI: 10.1002/nla.308.

M. Fisher, S. Gratton, S. Gürol, Y. Trémolet, and X. Vasseur. Low rank updates in preconditioning the saddle point systems arising from data assimilation problems. Optim. Method Softw., 33(1):45–69, 2016. DOI: 10.1080/10556788.2016.1264398.

M. P. Friedlander and D. Orban. A primal-dual regularized interior-point method for convex quadratic problems. Math. Program. Comp., 4(1):71–107, 2012. DOI: 10.1007/s12532-012-0035-2.

G. H. Golub and W. Kahan. Calculating the singular values and pseudo-inverse of a matrix. SIAM J. Numer. Anal., 2(2):205–224, 1965. DOI: 10.1137/0702016.

N. Gould, D. Orban, and Ph. Toint. CUTEst: a constrained and unconstrained testing environment with safe threads for mathematical optimization. Comput. Optim. Appl., 60(3):545–557, 2015. DOI: 10.1007/s10589-014-9687-3.

N. I. M. Gould, M. E. Hribar, and J. Nocedal. On the solution of equality constrained quadratic problems arising in optimization. SIAM J. Sci. Comput., 23(4):1375–1394, 2001. DOI: 10.1137/S1064827598345667.

N. I. M. Gould, D. Orban, and T. Rees. Projected Krylov methods for saddle-point systems. SIAM J. Matrix Anal. Appl., 35(4):1329–1343, 2014. DOI: 10.1137/130916394.

A. Greenbaum, V. Pták, and Z. Strakoš. Any nonincreasing convergence curve is possible for GMRES. SIAM J. Matrix Anal. Appl., 17(3):465–469, 1996. DOI: 10.1137/S0895479894275030.

C. Greif, E. Moulding, and D. Orban. Bounds on eigenvalues of matrices arising from interior-point methods. SIAM J. Optim., 24(1):49–83, 2014. DOI: 10.1137/120890600.

M. R. Hestenes and E. Stiefel. Methods of conjugate gradients for solving linear systems. J. Res. Natl. Bur. Stand., 49(6):409–436, 1952. DOI: 10.6028/jres.049.044.

C. Lanczos. An iteration method for the solution of the eigenvalue problem of linear differential and integral operators. J. Res. Natl. Bur. Stand., 45:225–280, 1950.

C. Lanczos. Solution of systems of linear equations by minimized iterations. J. Res. Natl. Bur. Stand., 49(1):33–53, 1952. DOI: 10.6028/jres.049.006.

D. Loghin. A note on constraint preconditioning. SIAM J. Matrix Anal. Appl., 38(4):1486–1495, 2017. DOI: 10.1137/16M1072036.

L. Lukšan and J. Vlček. Indefinitely preconditioned inexact Newton method for large sparse equality constrained nonlinear programming problems. Numer. Linear Algebra Appl., 5:219–247, 1998. DOI: 10.1002/(SICI)1099-1506(199805/06)5:3<219::AID-NLA134>3.0.CO;2-7.

B. Morini, V. Simoncini, and M. Tani. Spectral estimates for unreduced symmetric KKTsystems arising from interior point methods. Numer. Linear Algebra Appl., 23:776–800,2016. DOI: 10.1002/nla.2054.

D. Orban. A collection of linear systems arising from interior-point methods for quadratic optimization. Cahier du GERAD G-2015-117, GERAD, Montréal, QC, Canada, 2015a.

D. Orban. A collection of linear systems arising from interior-point methods for quadratic optimization. Online data set. DOI: 10.5281/zenodo.34130, 2015b.

C. C. Paige and M. A. Saunders. Solution of sparse indefinite systems of linear equations. SIAM J. Numer. Anal., 12(4):617–629, 1975. DOI: 10.1137/0712047.

I. Perugia and V. Simoncini. Block-diagonal and indefinite symmetric preconditioners for mixed finite element formulations. Numer. Linear Algebra Appl., 7(7–8):585–616, 2000. DOI: 10.1002/1099-1506(200010/12)7:7/8<585::AID-NLA214>3.0.CO;2-F.

J. Pestana and A. J. Wathen. Natural preconditioning and iterative methods for saddle point systems. SIAM Review, 57(1):51–71, 2015. DOI: 10.1137/130934921.

Y. Saad. Iterative methods for sparse linear systems. Society for Industrial and Applied Mathematics, Philadelphia, PA, second edition, 2003. DOI: 10.1137/1.9780898718003.

Y. Saad and M. H. Schultz. GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. and Statist. Comput., 7(3):856–869, 1986. DOI: 10.1137/0907058.

Y. Saad and K. Wu. DQGMRES: a direct quasi-minimal residual algorithm based on incomplete orthogonalization. Numer. Linear Algebra Appl., 3(4):329–343, 1996. DOI: 10/fwxn2n.

M. A. Saunders, H. D. Simon, and E. L. Yip. Two conjugate-gradient-type methods for unsymmetric linear equations. SIAM J. Numer. Anal., 25(4):927–940, 1988. DOI: 10.1137/0725052.

D. Sesana and V. Simoncini. Spectral analysis of inexact constraint preconditioning for symmetric saddle point matrices. Linear Algebra and its Applications, 438(6):2683–2700, 2013. DOI: 10.1016/j.laa.2012.11.022.


Appendix A. Standard Lanczos and Arnoldi Processes. For reference we state the standard Lanczos process in plain and preconditioned versions, and the full-space Lanczos process for (1) with preconditioner (2). We also state the projected and full-space Arnoldi processes.

Algorithm 4 Lanczos Process for Ax = b

1:  choose x_0
2:  v_0 = 0
3:  r_0 = b − Ax_0
4:  v_1 = r_0
5:  β_1 = (v_1^T r_0)^{1/2}
6:  if β_1 ≠ 0 then
7:      v_1 = v_1 / β_1        {‖v_1‖_2 = 1}
8:  end if
9:  k = 1
10: while β_k ≠ 0 do
11:     u_k = A v_k
12:     α_k = u_k^T v_k
13:     v_{k+1} = u_k − α_k v_k − β_k v_{k−1}
14:     β_{k+1} = (v_{k+1}^T u_k)^{1/2}
15:     if β_{k+1} ≠ 0 then
16:         v_{k+1} = v_{k+1} / β_{k+1}        {‖v_{k+1}‖_2 = 1}
17:     end if
18:     k = k + 1
19: end while
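As a concrete reading aid, Algorithm 4 transcribes almost line for line into NumPy. The sketch below is illustrative only (the implementations accompanying the paper are in MATLAB); it uses the fact that, in exact arithmetic, β_{k+1} = (v_{k+1}^T u_k)^{1/2} equals the 2-norm of the unnormalized v_{k+1}.

```python
import numpy as np

def lanczos(A, b, x0, maxiter=None):
    """Plain Lanczos process for A x = b (Algorithm 4), without
    reorthogonalization.  Returns the Lanczos vectors as columns of V
    and the scalars alpha_k, beta_k defining the tridiagonal T_k."""
    n = b.shape[0]
    maxiter = maxiter or n
    v_prev = np.zeros(n)
    r0 = b - A @ x0
    beta = np.linalg.norm(r0)          # beta_1 = (v_1^T r_0)^{1/2} with v_1 = r_0
    V, alphas, betas = [], [], [beta]
    if beta == 0:
        return np.zeros((n, 0)), np.array(alphas), np.array(betas)
    v = r0 / beta                      # now ||v_1||_2 = 1
    for _ in range(maxiter):
        V.append(v)
        u = A @ v                      # u_k = A v_k
        alpha = u @ v                  # alpha_k = u_k^T v_k
        alphas.append(alpha)
        w = u - alpha * v - beta * v_prev
        beta = np.linalg.norm(w)       # beta_{k+1}, equal to (v_{k+1}^T u_k)^{1/2} in exact arithmetic
        betas.append(beta)
        if beta == 0:                  # breakdown: the Krylov space is exhausted
            break
        v_prev, v = v, w / beta
    return np.column_stack(V), np.array(alphas), np.array(betas)
```

For symmetric A the returned vectors are orthonormal and V^T A V is tridiagonal with the α_k on its diagonal, up to rounding.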

Algorithm 5 Lanczos Process for Ax = b with Preconditioner J = J^T > 0

1:  choose x_0
2:  v_0 = 0
3:  r_0 = b − Ax_0
4:  solve J v_1 = r_0
5:  β_1 = (v_1^T r_0)^{1/2}
6:  if β_1 ≠ 0 then
7:      v_1 = v_1 / β_1        {‖v_1‖_J = 1}
8:  end if
9:  k = 1
10: while β_k ≠ 0 do
11:     u_k = A v_k
12:     α_k = u_k^T v_k
13:     solve J v_{k+1} = u_k
14:     v_{k+1} = v_{k+1} − α_k v_k − β_k v_{k−1}
15:     β_{k+1} = (v_{k+1}^T u_k)^{1/2}
16:     if β_{k+1} ≠ 0 then
17:         v_{k+1} = v_{k+1} / β_{k+1}        {‖v_{k+1}‖_J = 1}
18:     end if
19:     k = k + 1
20: end while
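Algorithm 5 admits the same transcription; `solve_J` below is a hypothetical callable applying J^{-1} for any symmetric positive definite J, and the resulting vectors are J-orthonormal rather than orthonormal:

```python
import numpy as np

def lanczos_precond(A, b, x0, solve_J, maxiter=None):
    """Preconditioned Lanczos process (Algorithm 5).  solve_J(r) must
    return the solution v of J v = r with J symmetric positive
    definite; the Lanczos vectors then satisfy ||v_k||_J = 1."""
    n = b.shape[0]
    maxiter = maxiter or n
    v_prev = np.zeros(n)
    r0 = b - A @ x0
    v = solve_J(r0)                    # J v_1 = r_0
    beta = np.sqrt(v @ r0)             # beta_1 = (v_1^T r_0)^{1/2}
    V, alphas, betas = [], [], [beta]
    if beta == 0:
        return np.zeros((n, 0)), np.array(alphas), np.array(betas)
    v = v / beta
    for _ in range(maxiter):
        V.append(v)
        u = A @ v                      # u_k = A v_k
        alpha = u @ v                  # alpha_k = u_k^T v_k
        alphas.append(alpha)
        w = solve_J(u) - alpha * v - beta * v_prev
        beta = np.sqrt(abs(w @ u))     # beta_{k+1} = (v_{k+1}^T u_k)^{1/2}
        betas.append(beta)
        if beta == 0:
            break
        v_prev, v = v, w / beta
    return np.column_stack(V), np.array(alphas), np.array(betas)
```

With J = I this reduces to Algorithm 4.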


Algorithm 6 Full-Space Lanczos Process for (1) with Preconditioner (2)

1:  choose [x_0 ; y_0] such that Bx_0 − Cy_0 = 0
2:  initialize [v_{0,x} ; v_{0,y}] = [0 ; 0]
3:  set
        [r_{0,x} ; r_{0,y}] = [b ; 0] − [A  B^T ; B  −C] [x_0 ; y_0] = [b − Ax_0 − B^T y_0 ; 0]
    {Bx_0 − Cy_0 = 0 implies r_{0,y} = 0}
4:  obtain v_1 as the solution of [G  B^T ; B  −C] [v_{1,x} ; v_{1,y}] = [r_{0,x} ; r_{0,y}]
5:  β_1 = (v_1^T r_0)^{1/2} = (v_{1,x}^T r_{0,x})^{1/2}
6:  if β_1 ≠ 0 then
7:      v_1 = v_1 / β_1
8:  end if
9:  k = 1
10: while β_k ≠ 0 do
11:     compute [u_{k,x} ; u_{k,y}] = [A  B^T ; B  −C] [v_{k,x} ; v_{k,y}]
12:     α_k = u_k^T v_k = v_{k,x}^T A v_{k,x} + 2 v_{k,x}^T B^T v_{k,y} − v_{k,y}^T C v_{k,y}
13:     obtain v_{k+1} as the solution of [G  B^T ; B  −C] [v_{k+1,x} ; v_{k+1,y}] = [u_{k,x} ; u_{k,y}]
14:     v_{k+1} = v_{k+1} − α_k v_k − β_k v_{k−1}
15:     β_{k+1} = (v_{k+1}^T u_k)^{1/2} = (v_{k+1,x}^T u_{k,x} + v_{k+1,y}^T u_{k,y})^{1/2}
16:     if β_{k+1} ≠ 0 then
17:         v_{k+1} = v_{k+1} / β_{k+1}
18:     end if
19:     k = k + 1
20: end while
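To make the full-space process concrete, the dense sketch below forms K and P explicitly and applies the constraint-preconditioner step with `np.linalg.solve`; an actual implementation would factorize P once and reuse the factors. All names are illustrative, and the generated vectors are orthonormal with respect to the (indefinite) bilinear form induced by P.

```python
import numpy as np

def cp_lanczos(A, B, C, G, b, x0, y0, maxiter=10):
    """Full-space Lanczos process for the regularized saddle-point
    system (1) with constraint preconditioner (2) (Algorithm 6).
    Dense illustration only: assumes B x0 - C y0 = 0 and that the
    preconditioner P = [G B'; B -C] is nonsingular."""
    n, m = A.shape[0], B.shape[0]
    K = np.block([[A, B.T], [B, -C]])
    P = np.block([[G, B.T], [B, -C]])
    r0 = np.concatenate([b - A @ x0 - B.T @ y0, np.zeros(m)])  # r_{0,y} = 0
    v = np.linalg.solve(P, r0)
    beta = np.sqrt(v[:n] @ r0[:n])     # beta_1 = (v_{1,x}^T r_{0,x})^{1/2}
    V, alphas, betas = [], [], [beta]
    if beta == 0:
        return np.zeros((n + m, 0)), np.array(alphas), np.array(betas)
    v = v / beta
    v_prev = np.zeros(n + m)
    for _ in range(maxiter):
        V.append(v)
        u = K @ v                      # u_k = K v_k
        alpha = u @ v                  # alpha_k = u_k^T v_k
        alphas.append(alpha)
        w = np.linalg.solve(P, u) - alpha * v - beta * v_prev
        beta = np.sqrt(abs(w @ u))     # beta_{k+1} = (v_{k+1}^T u_k)^{1/2}
        betas.append(beta)
        if beta == 0:
            break
        v_prev, v = v, w / beta
    return np.column_stack(V), np.array(alphas), np.array(betas)
```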


Algorithm 7 Projected Arnoldi Process

1:  choose [x_0 ; w_0] such that Bx_0 + Ew_0 = 0
2:  v_{0,x} = 0, v_{0,w} = −w_0
3:  u_{0,x} = b − Ax_0, u_{0,w} = −F^{−1} w_0
4:  [u_{1,x} ; u_{1,w} ; z_1] ← solution of (14) with right-hand side [u_{0,x} ; u_{0,w} ; 0]
5:  v_{1,x} = u_{1,x}, v_{1,w} = u_{1,w}        {v_1 = P_G u_0}
6:  h_{1,0} = (v_{1,x}^T u_{0,x} + v_{1,w}^T u_{0,w})^{1/2}
7:  if h_{1,0} ≠ 0 then
8:      v_{1,x} = v_{1,x}/h_{1,0}, v_{1,w} = v_{1,w}/h_{1,0}
9:  end if
10: k = 1
11: while h_{k,k−1} ≠ 0 do
12:     u_{k,x} = A v_{k,x}, u_{k,w} = F^{−1} v_{k,w}
13:     [u_{k+1,x} ; u_{k+1,w} ; z_{k+1}] ← solution of (14) with right-hand side [u_{k,x} ; u_{k,w} ; 0]
14:     v_{k+1,x} = u_{k+1,x}, v_{k+1,w} = u_{k+1,w}        {v_{k+1} = P_G u_k}
15:     for i = 1, …, k do
16:         h_{i,k} = v_{i,x}^T u_{k,x} + v_{i,w}^T u_{k,w}
17:         v_{k+1,x} = v_{k+1,x} − h_{i,k} v_{i,x}, v_{k+1,w} = v_{k+1,w} − h_{i,k} v_{i,w}
18:     end for
19:     h_{k+1,k} = (v_{k+1,x}^T u_{k,x} + v_{k+1,w}^T u_{k,w})^{1/2}
20:     if h_{k+1,k} ≠ 0 then
21:         v_{k+1,x} = v_{k+1,x}/h_{k+1,k}, v_{k+1,w} = v_{k+1,w}/h_{k+1,k}
22:     end if
23:     k = k + 1
24: end while
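A sketch of Algorithm 7 is possible only up to the solve with system (14), which is not restated in this appendix; `solve_proj` below is therefore a hypothetical stand-in that must return the projected pair (u_{k+1,x}, u_{k+1,w}) and discard the multiplier z_{k+1}, and `apply_Finv` likewise stands in for the application of F^{-1}.

```python
import numpy as np

def projected_arnoldi(A, apply_Finv, solve_proj, b, x0, w0, maxiter=10):
    """Projected Arnoldi process (Algorithm 7), sketched under the
    assumption that solve_proj performs the solve with system (14)
    of the paper and returns only the projected (x, w) components."""
    ux, uw = b - A @ x0, -apply_Finv(w0)          # u_{0,x}, u_{0,w}
    vx, vw = solve_proj(ux, uw)                   # v_1 = P_G u_0
    h10 = np.sqrt(vx @ ux + vw @ uw)
    if h10 == 0:
        return [], np.zeros((0, 0))
    Vx, Vw = [vx / h10], [vw / h10]
    H = np.zeros((maxiter + 1, maxiter))          # Hessenberg coefficients h_{i,k}
    for k in range(maxiter):
        ux, uw = A @ Vx[k], apply_Finv(Vw[k])     # u_{k,x}, u_{k,w}
        wx, ww = solve_proj(ux, uw)               # v_{k+1} = P_G u_k, then orthogonalize
        for i in range(k + 1):
            H[i, k] = Vx[i] @ ux + Vw[i] @ uw
            wx, ww = wx - H[i, k] * Vx[i], ww - H[i, k] * Vw[i]
        H[k + 1, k] = np.sqrt(abs(wx @ ux + ww @ uw))
        if H[k + 1, k] == 0:
            H = H[:k + 2, :k + 1]
            break
        Vx.append(wx / H[k + 1, k])
        Vw.append(ww / H[k + 1, k])
    return list(zip(Vx, Vw)), H
```

With an empty w block and `solve_proj` the identity, the sketch collapses to the standard Arnoldi process on A.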


Algorithm 8 Full-Space Arnoldi Process for (1) with Preconditioner (2)

1:  choose [x_0 ; y_0] such that Bx_0 − Cy_0 = 0
2:  initialize [v_{0,x} ; v_{0,y}] = [0 ; 0]
3:  set
        [r_{0,x} ; r_{0,y}] = [b ; 0] − [A  B^T ; B  −C] [x_0 ; y_0] = [b − Ax_0 − B^T y_0 ; 0]
    {Bx_0 − Cy_0 = 0 implies r_{0,y} = 0}
4:  obtain v_1 as the solution of [G  B^T ; B  −C] [v_{1,x} ; v_{1,y}] = [r_{0,x} ; r_{0,y}]
5:  h_{1,0} = (v_1^T r_0)^{1/2} = (v_{1,x}^T r_{0,x})^{1/2}
6:  if h_{1,0} ≠ 0 then
7:      v_1 = v_1 / h_{1,0}
8:  end if
9:  k = 1
10: while h_{k,k−1} ≠ 0 do
11:     compute [u_{k,x} ; u_{k,y}] = [A  B^T ; B  −C] [v_{k,x} ; v_{k,y}]
12:     obtain v_{k+1} as the solution of [G  B^T ; B  −C] [v_{k+1,x} ; v_{k+1,y}] = [u_{k,x} ; u_{k,y}]
13:     for i = 1, …, k do
14:         h_{i,k} = v_i^T u_k = v_{i,x}^T A v_{k,x} + 2 v_{i,x}^T B^T v_{k,y} − v_{i,y}^T C v_{k,y}
15:         v_{k+1} = v_{k+1} − h_{i,k} v_i
16:     end for
17:     h_{k+1,k} = (v_{k+1}^T u_k)^{1/2} = (v_{k+1,x}^T u_{k,x} + v_{k+1,y}^T u_{k,y})^{1/2}
18:     if h_{k+1,k} ≠ 0 then
19:         v_{k+1} = v_{k+1} / h_{k+1,k}
20:     end if
21:     k = k + 1
22: end while
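As with Algorithm 6, a dense sketch of Algorithm 8 makes the structure plain: the inner loop is Gram–Schmidt with coefficients h_{i,k} = v_i^T u_k, and by construction the computed quantities satisfy P^{-1} K V_k = V_{k+1} H_k with H_k the (k+1)-by-k Hessenberg matrix. The `np.linalg.solve` calls again stand in for a single factorization of P, and the names are illustrative only.

```python
import numpy as np

def cp_arnoldi(A, B, C, G, b, x0, y0, maxiter=10):
    """Full-space Arnoldi process for (1) with constraint
    preconditioner (2) (Algorithm 8); A may be nonsymmetric.
    Dense illustration; assumes B x0 - C y0 = 0 and P nonsingular."""
    n, m = A.shape[0], B.shape[0]
    K = np.block([[A, B.T], [B, -C]])
    P = np.block([[G, B.T], [B, -C]])
    r0 = np.concatenate([b - A @ x0 - B.T @ y0, np.zeros(m)])  # r_{0,y} = 0
    v = np.linalg.solve(P, r0)
    h10 = np.sqrt(v[:n] @ r0[:n])          # h_{1,0} = (v_{1,x}^T r_{0,x})^{1/2}
    if h10 == 0:
        return np.zeros((n + m, 0)), np.zeros((0, 0))
    V = [v / h10]
    H = np.zeros((maxiter + 1, maxiter))
    for k in range(maxiter):
        u = K @ V[k]                       # u_k = K v_k
        w = np.linalg.solve(P, u)          # preconditioner solve with P
        for i in range(k + 1):
            H[i, k] = V[i] @ u             # h_{i,k} = v_i^T u_k
            w = w - H[i, k] * V[i]
        H[k + 1, k] = np.sqrt(abs(w @ u))  # h_{k+1,k} = (v_{k+1}^T u_k)^{1/2}
        if H[k + 1, k] == 0:               # lucky breakdown
            H = H[:k + 2, :k + 1]
            break
        V.append(w / H[k + 1, k])
    return np.column_stack(V), H
```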
