
The splitting algorithms by Ryu and by Malitsky-Tam applied to normal cones of linear subspaces converge strongly to the projection onto the intersection

Heinz H. Bauschke*, Shambhavi Singh†, and Xianfu Wang‡

September 22, 2021

Abstract

Finding a zero of a sum of maximally monotone operators is a fundamental problem in modern optimization and nonsmooth analysis. Assuming that resolvents of the operators are available, this problem can be tackled with the Douglas-Rachford algorithm. However, when dealing with three or more operators, one must work in a product space with as many factors as there are operators. In groundbreaking recent work by Ryu and by Malitsky and Tam, it was shown that the number of factors can be reduced by one. These splitting methods guarantee weak convergence to some solution of the underlying sum problem; strong convergence holds in the presence of uniform monotonicity.

In this paper, we provide a case study when the operators involved are normal cone operators of subspaces and the solution set is thus the intersection of the subspaces. Even though these operators lack strict convexity, we show that striking conclusions are available in this case: strong (instead of weak) convergence, and the solution obtained is (not arbitrary but) the projection onto the intersection. Numerical experiments to illustrate our results are also provided.

2020 Mathematics Subject Classification: Primary 41A50, 49M27, 65K05, 47H05; Secondary 15A10, 47H09, 49M37, 90C25.

*Mathematics, University of British Columbia, Kelowna, B.C. V1V 1V7, Canada. E-mail:[email protected].

†Mathematics, University of British Columbia, Kelowna, B.C. V1V 1V7, Canada. E-mail:[email protected].

‡Mathematics, University of British Columbia, Kelowna, B.C. V1V 1V7, Canada. E-mail:[email protected].

arXiv:2109.11072v1 [math.OC] 22 Sep 2021

Keywords: best approximation, Hilbert space, intersection of subspaces, linear convergence, Malitsky-Tam splitting, maximally monotone operator, nonexpansive mapping, resolvent, Ryu splitting.

1 Introduction

Throughout the paper, we assume that

X is a real Hilbert space (1)

with inner product 〈·, ·〉 and induced norm ‖ · ‖. Let A1, . . . , An be maximally monotone operators on X. (See, e.g., [7] for background on maximally monotone operators.) One central problem in modern optimization and nonsmooth analysis asks to

find x ∈ X such that 0 ∈ (A1 + · · ·+ An)x. (2)

In general, solving (2) may be quite hard. Luckily, in many interesting cases, we have access to the firmly nonexpansive resolvents JAi := (Id + Ai)−1, which opens the door to employing splitting algorithms to solve (2). The most famous instance is the Douglas-Rachford algorithm [15], whose importance for this problem was brought to light in the seminal paper by Lions and Mercier [17]. However, the Douglas-Rachford algorithm requires that n = 2; if n ≥ 3, one may apply the Douglas-Rachford algorithm to a reformulation in the product space Xn [13, Section 2.2]. In recent breakthrough work by Ryu [20], it was shown that for n = 3 one may formulate an algorithm that works in X2 rather than X3. We will refer to this method as Ryu's algorithm. Very recently, Malitsky and Tam proposed in [18] an algorithm for a general n ≥ 3 that is different from Ryu's and that operates in Xn−1. (In a certain technical sense, no such algorithms exist in product spaces featuring fewer than n − 1 factors.) We will review these algorithms in Section 3 below. Both Ryu's and the Malitsky-Tam algorithm are known to produce some solution to (2) via a sequence that converges weakly. Strong convergence holds in the presence of uniform monotonicity.

The aim of this paper is to provide a case study for the situation when the maximally monotone operators Ai are normal cone operators of closed linear subspaces Ui of X. These operators are not even strictly monotone. Our main results show that the splitting algorithms by Ryu and by Malitsky-Tam actually produce a sequence that converges strongly and we are able to identify the limit to be the projection onto the intersection U1 ∩ · · · ∩ Un! The proofs of these results rely on the explicit identification of the fixed point set of the underlying Ryu and Malitsky-Tam operators. Moreover, a standard translation technique gives the same result for affine subspaces of X provided their intersection is nonempty.

The paper is organized as follows. In Section 2, we collect various auxiliary results for later use. The known convergence results on Ryu splitting and on Malitsky-Tam splitting are reviewed in Section 3. Our main results are presented in Section 4. Matrix representations of the various operators involved are provided in Section 5. These are useful for our numerical experiments in Section 6. Finally, we offer some concluding remarks in Section 7.

The notation employed in this paper is standard and follows largely [7]. When z = x + y and x ⊥ y, then we also write z = x ⊕ y to stress this fact. Analogously, for the Minkowski sum Z = X + Y, we write Z = X ⊕ Y as well as PZ = PX ⊕ PY if X ⊥ Y.

2 Auxiliary results

In this section, we collect useful properties of projection operators and results on iterating linear/affine nonexpansive operators. We start with projection operators.

2.1 Projections

Fact 2.1. Suppose U and V are nonempty closed convex subsets of X such that U ⊥ V. Then U ⊕ V is a nonempty closed subset of X and

PU⊕V = PU ⊕ PV (3)

Proof. See [7, Proposition 29.6]. □

Here is a well known illustration of Fact 2.1 which we will use repeatedly in the paper (sometimes without explicit mention).

Example 2.2. Suppose U is a closed linear subspace of X. Then

PU⊥ = Id−PU. (4)

Proof. The orthogonal complement V := U⊥ satisfies U ⊥ V and also U + V = X; thus PU+V = Id and the result follows. □

Fact 2.3 (Anderson-Duffin). Suppose that X is finite-dimensional and that U, V are two linear subspaces of X. Then

PU∩V = 2PU(PU + PV)†PV , (5)

where “†” denotes the Moore-Penrose inverse of a matrix.


Proof. See, e.g., [7, Corollary 25.38] or the original [1]. □

Corollary 2.4. Suppose that X is finite-dimensional and that U, V, W are three linear subspaces of X. Then

PU∩V∩W = 4PU(PU + PV)†PV(2PU(PU + PV)†PV + PW)†PW. (6)

Proof. Use Fact 2.3 to find PU∩V, and then use Fact 2.3 again on (U ∩ V, W). □
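As a quick illustration of Fact 2.3 and Corollary 2.4, here is a minimal Julia sketch (our own helper names, not from the paper) that computes the projection onto an intersection of two or three subspaces of Rd from their projection matrices:

```julia
using LinearAlgebra

# Projection onto U ∩ V via the Anderson-Duffin formula (5); PU, PV are projection matrices.
proj_intersection(PU, PV) = 2 * PU * pinv(PU + PV) * PV

# Projection onto U ∩ V ∩ W by applying Fact 2.3 twice, as in (6).
proj_intersection(PU, PV, PW) = proj_intersection(proj_intersection(PU, PV), PW)

# Example: two coordinate planes in R^3 intersecting in the x-axis.
PU = [1.0 0 0; 0 1 0; 0 0 0]        # projection onto the xy-plane
PV = [1.0 0 0; 0 0 0; 0 0 1]        # projection onto the xz-plane
PZ = proj_intersection(PU, PV)      # should equal the projection onto the x-axis
@assert norm(PZ - [1.0 0 0; 0 0 0; 0 0 0]) < 1e-12
```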

Corollary 2.5. Suppose that X is finite-dimensional and that U, V are two linear subspaces of X. Then

PU+V = Id − 2PU⊥(PU⊥ + PV⊥)†PV⊥ (7a)
= Id − 2(Id − PU)(2 Id − PU − PV)†(Id − PV). (7b)

Proof. Indeed, U + V = (U⊥ ∩ V⊥)⊥ and so PU+V = Id − PU⊥∩V⊥. Now apply Fact 2.3 to (U⊥, V⊥) followed by Example 2.2. □

Fact 2.6. Let Y be a real Hilbert space, and let A : X → Y be a continuous linear operator with closed range. Then

Pran A = AA†. (8)

Proof. See, e.g., [7, Proposition 3.30(ii)]. □
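In finite dimensions, Fact 2.6 is easy to check numerically; the short Julia sketch below (random test data, names ours) verifies that A A† is idempotent, self-adjoint, and fixes ran A:

```julia
using LinearAlgebra

# A quick finite-dimensional check of Fact 2.6: the orthogonal projection
# onto ran A is A * pinv(A).
A = randn(6, 3)
P = A * pinv(A)
@assert norm(P * P - P) < 1e-10     # idempotent
@assert norm(P' - P) < 1e-10        # self-adjoint
@assert norm(P * A - A) < 1e-10     # fixes ran A
```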

2.2 Linear (and affine) nonexpansive iterations

We now turn to results on iterating linear or affine nonexpansive operators.

Fact 2.7. Let L : X → X be linear and nonexpansive, and let x ∈ X. Then

Lkx → PFix L(x) ⇔ Lkx− Lk+1x → 0. (9)

Proof. See [3, Proposition 4], [4, Theorem 1.1], [8, Theorem 2.2], or [7, Proposition 5.28]. (The versions in [3] and [4] are much more general.) □

Fact 2.8. Let T : X → X be averaged nonexpansive with Fix T ≠ ∅. Then (∀x ∈ X) Tkx − Tk+1x → 0.

Proof. See Bruck and Reich's paper [12] or [7, Corollary 5.16(ii)]. □


Corollary 2.9. Let L : X → X be linear and averaged nonexpansive. Then

(∀x ∈ X) Lkx → PFix L(x). (10)

Proof. Because 0 ∈ Fix L, we have Fix L ≠ ∅. Now combine Fact 2.7 with Fact 2.8. □

Fact 2.10. Let L be a linear nonexpansive operator and let b ∈ X. Set T : X → X : x 7→ Lx + b and suppose that Fix T ≠ ∅. Then b ∈ ran(Id − L), and for every x ∈ X and a ∈ (Id − L)−1b, the following hold:

(i) b = a − La ∈ ran(Id − L).
(ii) Fix T = a + Fix L.
(iii) PFix T(x) = PFix L(x) + P(Fix L)⊥(a).
(iv) Tkx = Lk(x − a) + a.
(v) Lkx → PFix Lx ⇔ Tkx → PFix Tx.

Proof. See [9, Lemma 3.2 and Theorem 3.3]. □

Remark 2.11. Consider Fact 2.10 and its notation. If a ∈ (Id − L)−1b, then P(Fix L)⊥(a) ∈ (Id − L)−1b likewise, because b = (Id − L)a = (Id − L)(PFix L(a) + P(Fix L)⊥(a)) = (Id − L)P(Fix L)⊥(a); moreover, using [16, Lemma 3.2.1], we see that

(Id−L)†b = (Id−L)†(Id−L)a = P(ker(Id−L))⊥(a) = P(Fix L)⊥(a), (11)

where again "†" denotes the Moore-Penrose inverse of a continuous linear operator (with possibly nonclosed range). So given b ∈ X, we may concretely set

a = (Id−L)†b ∈ (Id−L)−1b; (12)

with this choice, (iii) turns into the even more pleasing identity

PFix T(x) = PFix L(x) + a. (13)
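In finite dimensions, Fact 2.10 and Remark 2.11 can be turned into a small computational recipe. The Julia sketch below (function name ours; L is assumed to be a nonexpansive matrix with Fix T ≠ ∅) computes a = (Id − L)†b and returns the map x 7→ PFix L(x) + a from (13):

```julia
using LinearAlgebra

# A finite-dimensional sketch of Fact 2.10 / Remark 2.11: for T = L + b with L a
# nonexpansive matrix and Fix T ≠ ∅, choose a = (Id − L)†b as in (12); then
# P_{Fix T}(x) = P_{Fix L}(x) + a as in (13).
function fixed_point_projector(L::AbstractMatrix, b::AbstractVector)
    A = I - L                    # Id − L
    a = pinv(A) * b              # the choice (12); requires b ∈ ran(Id − L)
    PFixL = I - pinv(A) * A      # projection onto ker(Id − L) = Fix L
    return x -> PFixL * x + a    # x ↦ P_{Fix T}(x), cf. (13)
end
```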

3 Known results on Ryu and on Malitsky-Tam splitting

In this section, we present the precise form of Ryu's and the Malitsky-Tam algorithms and review known convergence results.


3.1 Ryu splitting

We start with Ryu’s algorithm. In this subsection,

A, B, C are maximally monotone operators on X, (14)

with resolvents JA, JB, JC, respectively.

The problem of interest is to

find x ∈ X such that 0 ∈ (A + B + C)x, (15)

and we assume that (15) has a solution. The algorithm pioneered by Ryu [20] provides a method for finding a solution to (15). It proceeds as follows. (We will express vectors in product spaces both as column and as row vectors, depending on which version is more readable.) Set

M : X × X → X × X × X : (x, y) 7→ (JA(x), JB(JA(x) + y), JC(JA(x) − x + JB(JA(x) + y) − y)). (16)

Next, denote by Q1 : X × X × X → X : (x1, x2, x3) 7→ x1 and similarly for Q2 and Q3. We also set ∆ := {(x, x, x) ∈ X3 | x ∈ X}. We are now ready to introduce the Ryu operator

T := TRyu : X2 → X2 : z 7→ z + ((Q3 − Q1)Mz, (Q3 − Q2)Mz). (17)

Given a starting point (x0, y0) ∈ X × X, the basic form of Ryu splitting generates a governing sequence via

(∀k ∈N) (xk+1, yk+1) := (1− λ)(xk, yk) + λT(xk, yk). (18)
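For concreteness, here is a minimal Julia sketch of the iteration (16)–(18), assuming the resolvents JA, JB, JC are supplied as functions on X = Rd; the function name and defaults are ours:

```julia
# A minimal sketch of Ryu splitting (16)–(18); JA, JB, JC are the resolvents,
# given as functions on X = R^d, and λ ∈ (0, 1).
function ryu_splitting(JA, JB, JC, x0, y0; λ = 0.5, iters = 1000)
    x, y = copy(x0), copy(y0)
    for _ in 1:iters
        xA = JA(x)                  # Q1 M(x, y), cf. (16)
        xB = JB(xA + y)             # Q2 M(x, y)
        xC = JC(xA - x + xB - y)    # Q3 M(x, y)
        x += λ * (xC - xA)          # relaxed step (18)
        y += λ * (xC - xB)
    end
    return JA(x)                    # shadow point, cf. (24)
end
```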

The following result records the basic convergence properties due to Ryu [20], recently improved by Aragon-Artacho, Campoy, and Tam [2].

Fact 3.1 (Ryu and also Aragon-Artacho-Campoy-Tam). The operator TRyu is nonexpansive with

Fix TRyu = {(x, y) ∈ X × X | JA(x) = JB(JA(x) + y) = JC(RA(x) − y)} (19)

and

zer(A + B + C) = JA(Q1 Fix TRyu). (20)

Suppose that 0 < λ < 1 and consider the sequence generated by (18). Then there exists (x, y) ∈ X × X such that

(xk, yk)⇀ (x, y) ∈ Fix TRyu, (21)



M(xk, yk)⇀ M(x, y) ∈ ∆, (22)

and

((Q3 − Q1)M(xk, yk), (Q3 − Q2)M(xk, yk)) → (0, 0). (23)

In particular,

JA(xk) ⇀ JA(x) ∈ zer(A + B + C). (24)

Proof. See [20] and [2]. □

3.2 Malitsky-Tam splitting

We now turn to the Malitsky-Tam algorithm. In this subsection, let n ∈ {3, 4, . . .} and let A1, A2, . . . , An be maximally monotone operators on X. The problem of interest is to

find x ∈ X such that 0 ∈ (A1 + A2 + · · ·+ An)x, (25)

and we assume that (25) has a solution. The algorithm proposed by Malitsky and Tam [18] provides a method for finding a solution to (25). Now set

M : Xn−1 → Xn : (z1, . . . , zn−1) 7→ (x1, . . . , xn−1, xn), where (26a)

(∀i ∈ {1, . . . , n}) xi =
  JA1(z1), if i = 1;
  JAi(xi−1 + zi − zi−1), if 2 ≤ i ≤ n − 1;
  JAn(x1 + xn−1 − zn−1), if i = n. (26b)

As before, we denote by Q1 : Xn → X : (x1, . . . , xn−1, xn) 7→ x1 and similarly for Q2, . . . , Qn. We also set ∆ := {(x, . . . , x) ∈ Xn | x ∈ X}, the diagonal in Xn. We are now ready to introduce the Malitsky-Tam (MT) operator

T := TMT : Xn−1 → Xn−1 : z 7→ z + ((Q2 − Q1)Mz, (Q3 − Q2)Mz, . . . , (Qn − Qn−1)Mz). (27)



Given a starting point z0 ∈ Xn−1, the basic form of MT splitting generates a governing sequence via

(∀k ∈N) zk+1 := (1− λ)zk + λTzk. (28)
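Analogously, here is a minimal Julia sketch of the MT iteration (26)–(28) for n resolvents supplied as functions; the function name and defaults are ours:

```julia
# A minimal sketch of MT splitting (26)–(28); J is a vector of the n resolvents
# J[1], ..., J[n] as functions on X = R^d, and z collects the n−1 starting
# points z[1], ..., z[n-1].
function malitsky_tam(J, z; λ = 0.5, iters = 1000)
    n = length(J)
    z = copy(z)                            # do not overwrite the caller's data
    x = Vector{typeof(z[1])}(undef, n)
    for _ in 1:iters
        x[1] = J[1](z[1])                  # (26b), i = 1
        for i in 2:n-1
            x[i] = J[i](x[i-1] + z[i] - z[i-1])
        end
        x[n] = J[n](x[1] + x[n-1] - z[n-1])
        for i in 1:n-1                     # z ← (1−λ)z + λTz, cf. (27)–(28)
            z[i] = z[i] + λ * (x[i+1] - x[i])
        end
    end
    return J[1](z[1])                      # shadow point, cf. (34)
end
```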

The following result records the basic convergence.

Fact 3.2 (Malitsky-Tam). The operator TMT is nonexpansive with

Fix TMT = {z ∈ Xn−1 | Mz ∈ ∆} (29)

and

zer(A1 + · · · + An) = JA1(Q1 Fix TMT). (30)

Suppose that 0 < λ < 1 and consider the sequence generated by (28). Then there exists z ∈ Xn−1 such that

zk ⇀ z ∈ Fix TMT, (31)

Mzk ⇀ Mz ∈ ∆, (32)

and

(∀(i, j) ∈ {1, . . . , n}2) (Qi − Qj)Mzk → 0. (33)

In particular,

JA1(Q1zk) ⇀ JA1(Q1z) ∈ zer(A1 + · · · + An). (34)

Proof. See [18]. □

4 Main results

We are now ready to tackle our main results. We shall find useful descriptions of the fixed point sets of the Ryu and the Malitsky-Tam operators. These descriptions will allow us to deduce strong convergence of the iterates to the projection onto the intersection.

4.1 Ryu splitting

In this subsection, we assume that

U, V, W are closed linear subspaces of X. (35)

We set

A := NU, B := NV, C := NW. (36)


Then

Z := zer(A + B + C) = U ∩ V ∩ W. (37)

Using linearity of the projection operators, the operator M defined in (16) turns into

M : X × X → X × X × X : (x, y) 7→ (PUx, PV PUx + PVy, PW PUx + PW PV PUx − PW x + PW PVy − PWy), (38)

while the Ryu operator is still (see (17))

T := TRyu : X2 → X2 : z 7→ z + ((Q3 − Q1)Mz, (Q3 − Q2)Mz). (39)

We now determine the fixed point set of the Ryu operator.

Lemma 4.1. Let (x, y) ∈ X× X. Then

Fix T = (Z × {0}) ⊕ ((U⊥ × V⊥) ∩ (∆⊥ + ({0} × W⊥))), (40)

where ∆ = {(x, x) ∈ X × X | x ∈ X}. Consequently, setting

E = (U⊥ × V⊥) ∩ (∆⊥ + ({0} × W⊥)), (41)

we have

PFix T(x, y) = (PZx, 0) ⊕ PE(x, y) ∈ (PZx ⊕ U⊥) × V⊥. (42)

Proof. Note that (x, y) = (PW⊥y + (x − PW⊥y), PW⊥y + PWy) = (PW⊥y, PW⊥y) + (x − PW⊥y, PWy) ∈ ∆ + (X × W). Hence

X× X = ∆ + (X×W) is closed; (43)

consequently, by, e.g., [7, Corollary 15.35],

∆⊥ + ({0} ×W⊥) is closed. (44)

Next, using (19), we have the equivalences

(x, y) ∈ Fix TRyu (45a)
⇔ PUx = PV(PUx + y) = PW(RUx − y) (45b)
⇔ PUx ∈ Z ∧ y ∈ V⊥ ∧ PUx = PW(PUx − PU⊥x − y) (45c)
⇔ x ∈ Z + U⊥ ∧ y ∈ V⊥ ∧ PU⊥x + y ∈ W⊥. (45d)


Now define the linear operator

S : X× X → X : (x, y) 7→ x + y. (46)

Hence

Fix TRyu = {(x, y) ∈ (Z + U⊥) × V⊥ | PU⊥x + y ∈ W⊥} (47a)
= {(z + u⊥, v⊥) | z ∈ Z, u⊥ ∈ U⊥, v⊥ ∈ V⊥, u⊥ + v⊥ ∈ W⊥} (47b)
= (Z × {0}) ⊕ ((U⊥ × V⊥) ∩ S−1(W⊥)). (47c)

On the other hand, S−1(W⊥) = ({0} × W⊥) + ker S = ({0} × W⊥) + ∆⊥ is closed by (44). Altogether,

Fix TRyu = (Z × {0}) ⊕ ((U⊥ × V⊥) ∩ (({0} × W⊥) + ∆⊥)), (48)

i.e., (40) holds. Finally, (42) follows from Fact 2.1. □

We are now ready for the main convergence result on Ryu’s algorithm.

Theorem 4.2 (main result on Ryu splitting). Recall the definitions of M and T in (38) and (39). Given 0 < λ < 1 and (x0, y0) ∈ X × X, generate the sequence (xk, yk)k∈N via

(∀k ∈N) (xk+1, yk+1) = (1− λ)(xk, yk) + λT(xk, yk). (49)

Then

M(xk, yk) → (PZ(x0), PZ(x0), PZ(x0)); (50)

in particular,

PU(xk) → PZ(x0). (51)

Proof. Set Tλ := (1 − λ) Id + λT and observe that (xk, yk)k∈N = (Tkλ(x0, y0))k∈N. Hence, by Corollary 2.9 and (42),

(xk, yk)→ PFix Tλ(x0, y0) = PFix T(x0, y0) (52a)

= (PZx0, 0) + PE(x0, y0) ∈ (PZx0 ⊕U⊥)×V⊥, (52b)

where E is as in Lemma 4.1. Hence

Q1M(xk, yk) = PUxk → PU(PZx0) = PZx0. (53)

Now (23) yields

limk→∞ Q1M(xk, yk) = limk→∞ Q2M(xk, yk) = limk→∞ Q3M(xk, yk) = PZx0, (54)

i.e., (50) holds and we're done. □
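Theorem 4.2 is easy to test numerically. The following Julia sketch (not the authors' code; random subspaces generated as in Section 6) runs the relaxed Ryu iteration with the projection form (38)–(39) of M and T and compares the shadow PU xk with PZ x0 computed via Corollary 2.4:

```julia
using LinearAlgebra

# A sketch illustrating Theorem 4.2 in X = R^6: random 5-dimensional subspaces
# U, V, W, the relaxed Ryu iteration (49), and the limit P_Z x_0 from (51).
proj(B)       = B * pinv(B)                    # Fact 2.6
inter(P1, P2) = 2 * P1 * pinv(P1 + P2) * P2    # Fact 2.3 (Anderson-Duffin)

function run_ryu(PU, PV, PW, x0, y0; λ = 0.5, iters = 20_000)
    x, y = copy(x0), copy(y0)
    for _ in 1:iters
        xU = PU * x                            # Q1 M(x, y)
        xV = PV * (xU + y)                     # Q2 M(x, y)
        xW = PW * (xU - x + xV - y)            # Q3 M(x, y)
        x += λ * (xW - xU)                     # relaxed Ryu step (49)
        y += λ * (xW - xV)
    end
    return x, y
end

d = 6
PU, PV, PW = proj(randn(d, 5)), proj(randn(d, 5)), proj(randn(d, 5))
PZ = inter(inter(PU, PV), PW)                  # projection onto Z = U ∩ V ∩ W
x0, y0 = randn(d), randn(d)
xlim, _ = run_ryu(PU, PV, PW, x0, y0)
println(norm(PU * xlim - PZ * x0))             # should be tiny, illustrating (51)
```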



4.2 Malitsky-Tam splitting

Let n ∈ {3, 4, . . .}. In this subsection, we assume that U1, . . . , Un are closed linear subspaces of X. We set

(∀i ∈ {1, 2, . . . , n}) Ai := NUi and Pi := PUi . (55)

Then

Z := zer(A1 + · · · + An) = U1 ∩ · · · ∩ Un. (56)

The operator M defined in (26) turns into

M : Xn−1 → Xn : (z1, . . . , zn−1) 7→ (x1, . . . , xn−1, xn), where (57a)

(∀i ∈ {1, . . . , n}) xi =
  P1(z1), if i = 1;
  Pi(xi−1 + zi − zi−1), if 2 ≤ i ≤ n − 1;
  Pn(x1 + xn−1 − zn−1), if i = n. (57b)

and the MT operator remains (see (27))

T := TMT : Xn−1 → Xn−1 : z 7→ z + ((Q2 − Q1)Mz, (Q3 − Q2)Mz, . . . , (Qn − Qn−1)Mz). (58)

We now determine the fixed point set of the Malitsky-Tam operator.

Lemma 4.3. The fixed point set of the MT operator T = TMT is

Fix T = {(z, . . . , z) ∈ Xn−1 | z ∈ Z} ⊕ E, (59)

where

E := ran Ψ ∩ (Xn−2 × U⊥n) (60a)
⊆ U⊥1 × · · · × (U⊥1 + · · · + U⊥n−2) × ((U⊥1 + · · · + U⊥n−1) ∩ U⊥n) (60b)

and

Ψ : U⊥1 × · · · × U⊥n−1 → Xn−1 : (y1, . . . , yn−1) 7→ (y1, y1 + y2, . . . , y1 + y2 + · · · + yn−1) (61)

is the continuous linear partial sum operator, which has closed range.

Let z = (z1, . . . , zn−1) ∈ Xn−1, and set z̄ := (z1 + z2 + · · · + zn−1)/(n − 1). Then

PFix T z = (PZ z̄, . . . , PZ z̄) ⊕ PE z ∈ Xn−1 (62)

and hence

P1 Q1 PFix T z = PZ z̄. (63)

Proof. Assume temporarily that z ∈ Fix T and set (x1, . . . , xn) := Mz. Then x := x1 = · · · = xn and so x ∈ Z. Now P1z1 = x1 = x ∈ Z and thus

z1 ∈ x + U⊥1 ⊆ Z + U⊥1 . (64)

Next, x = x2 = P2(x1 + z2 − z1) = P2x + P2(z2 − z1) = x + P2(z2 − z1), which implies P2(z2 − z1) = 0 and so z2 − z1 ∈ U⊥2. It follows that

z2 ∈ z1 + U⊥2 . (65)

Similarly, by considering x3, . . . , xn−1, we obtain

z3 ∈ z2 + U⊥3 , . . . , zn−1 ∈ zn−2 + U⊥n−1. (66)

Finally, x = xn = Pn(x1 + xn−1 − zn−1) = Pn(x + x − zn−1) = 2x − Pnzn−1, which implies Pnzn−1 = x, i.e., zn−1 ∈ x + U⊥n. Combining with (66), we see that zn−1 satisfies

zn−1 ∈ (zn−2 + U⊥n−1) ∩ (P1z1 + U⊥n ). (67)

To sum up, our z ∈ Fix T must satisfy

z1 ∈ Z + U⊥1 (68a)

z2 ∈ z1 + U⊥2 (68b)
... (68c)

zn−2 ∈ zn−3 + U⊥n−2 (68d)

zn−1 ∈ (zn−2 + U⊥n−1) ∩ (P1z1 + U⊥n ). (68e)

We now show the converse. To this end, assume now that (z1, . . . , zn−1) satisfies (68). Note that Z⊥ = U⊥1 + · · · + U⊥n. Because z1 ∈ Z + U⊥1, there exists z ∈ Z and u⊥1 ∈ U⊥1 such that z1 = z ⊕ u⊥1. Hence x1 = P1z1 = P1z = z. Next, z2 ∈ z1 + U⊥2, say z2 = z1 + u⊥2 = z ⊕ (u⊥1 + u⊥2), where u⊥2 ∈ U⊥2. Then x2 = P2(x1 + z2 − z1) = P2(z + u⊥2) = P2z = z. Similarly, there exist also u⊥3 ∈ U⊥3, . . . , u⊥n−1 ∈ U⊥n−1 such that x3 = · · · = xn−1 = z and zi = z ⊕ (u⊥1 + · · · + u⊥i) for 2 ≤ i ≤ n − 1. Finally, we also have zn−1 = z ⊕ u⊥n for some u⊥n ∈ U⊥n. Thus xn = Pn(x1 + xn−1 − zn−1) = Pn(2z − (z + u⊥n)) = Pnz = z. Altogether, (z1, . . . , zn−1) ∈ Fix T. We have thus verified the description of Fix T announced in (59), using the convenient notation of the operator Ψ, which is easily seen to have closed range.

Next, we observe that

D := {(z, . . . , z) ∈ Xn−1 | z ∈ Z} = Zn−1 ∩ ∆, (69)

where ∆ is the diagonal in Xn−1, which has projection P∆(z1, . . . , zn−1) = (z̄, . . . , z̄) (see, e.g., [7, Proposition 26.4]). By convexity of Z, we clearly have P∆(Zn−1) ⊆ Zn−1. Because Zn−1 is a closed linear subspace of Xn−1, [14, Lemma 9.2] and (69) yield PD = PZn−1 P∆ and therefore

PD z = PZn−1 P∆ z = (PZ z̄, . . . , PZ z̄). (70)

Combining (59), Fact 2.1, (69), and (70) yields (62).

Finally, observe that Q1(PE z) ∈ U⊥1 by (60). Thus Q1(PFix T z) ∈ PZ z̄ + U⊥1 and therefore (63) follows. □

We are now ready for the main convergence result on the Malitsky-Tam algorithm.

Theorem 4.4 (main result on Malitsky-Tam splitting). Recall the definitions of M and T in (57) and (58). Given 0 < λ < 1 and z0 = (z0,1, . . . , z0,n−1) ∈ Xn−1, generate the sequence (zk)k∈N via

(∀k ∈N) zk+1 = (1− λ)zk + λTzk. (71)

Set

p := (z0,1 + · · · + z0,n−1)/(n − 1). (72)

Then there exists z ∈ Xn−1 such that

zk → z ∈ Fix T, (73)

and

Mzk → Mz = (PZ p, . . . , PZ p) ∈ Xn. (74)

In particular,

P1(Q1zk) = Q1Mzk → PZ(p) = (1/(n−1)) PZ(z0,1 + · · · + z0,n−1). (75)

Consequently, if x0 ∈ X and z0 = (x0, . . . , x0) ∈ Xn−1, then

P1Q1zk → PZx0. (76)


Proof. Set Tλ := (1 − λ) Id + λT and observe that (zk)k∈N = (Tkλ z0)k∈N. Hence, by Corollary 2.9 and Lemma 4.3,

zk → PFix Tλz0 = PFix Tz0 (77a)

= (PZ p, . . . , PZ p)⊕ PE(z0), (77b)

where E is as in Lemma 4.3. Hence, using also (63),

Q1Mzk = P1Q1zk (78a)
→ P1Q1((PZ p, . . . , PZ p) ⊕ PE(z0)) (78b)
= P1(PZ p + Q1(PE(z0))) (78c)
∈ P1(PZ p + U⊥1) (78d)
= {P1PZ p} (78e)
= {PZ p}, (78f)

i.e., Q1Mzk → PZ p. Now (33) yields QiMzk → PZ p for every i ∈ {1, . . . , n}. This yields (74) and (75).

Finally, the “Consequently” part is clear because when z0 has this special form, then p = x0. □

4.3 Extension to the consistent affine case

In this subsection, we comment on the behaviour of the splitting algorithms by Ryu and by Malitsky-Tam in the consistent affine case. To this end, we shall assume that V1, . . . , Vn are closed affine subspaces of X with nonempty intersection:

V := V1 ∩ V2 ∩ · · · ∩ Vn ≠ ∅. (79)

We re-pose the problem of finding a point in V as

find x ∈ X such that 0 ∈ (A1 + A2 + · · ·+ An)x, (80)

where each Ai = NVi. When we consider Ryu splitting, we also impose n = 3. Set Ui := Vi − Vi, which is the parallel space of Vi. Now let v ∈ V. Then Vi = v + Ui and hence JNVi = PVi = Pv+Ui satisfies Pv+Ui(x) = v + PUi(x − v) = PUix + PU⊥i(v) for every x ∈ X. Put differently, the resolvents from the affine problem are translations of the resolvents from the corresponding linear problem, which considers Ui instead of Vi.

The construction of the operator T ∈ {TRyu, TMT} now shows that it is a translation of the corresponding operator from the linear problem. And finally, Tλ = (1 − λ) Id + λT is a translation of the corresponding operator from the linear problem, which we denote by Lλ: Lλ = (1 − λ) Id + λL, where L is either the Ryu operator or the Malitsky-Tam operator of the parallel linear problem, and there exists b ∈ Xn−1 such that

Tλ(x) = Lλ(x) + b. (81)

By Fact 2.10 (applied in Xn−1), there exists a vector a ∈ Xn−1 such that

(∀k ∈ N) Tkλ x = a + Lkλ(x − a). (82)

In other words, the behaviour in the affine case is essentially the same as in the linear parallel case, appropriately shifted by the vector a. Moreover, because Lkλ → PFix L in the parallel linear setting, we deduce from Fact 2.10 that

Tkλ → PFix T. (83)

By (82), the rate of convergence in the affine case is identical to the rate of convergence in the parallel linear case. Thus, if (xk, yk)k∈N is the governing sequence generated by Ryu splitting, then

PV1 xk → PV(x0). (84)

And if zk = (zk,1, . . . , zk,n−1)k∈N is the sequence generated by Malitsky-Tam splitting, then

PV1 Q1zk → (1/(n−1)) PV(z0,1 + · · · + z0,n−1). (85)

To sum up this subsection, we note that in the consistent affine case, Ryu's and the Malitsky-Tam algorithm exhibit the same pleasant convergence behaviour as their linear parallel counterparts!

It is, however, presently quite unclear how these two algorithms behave when V = ∅.

5 Matrix representation

In this section, we assume that X is finite-dimensional, say

X = Rd. (86)

The two splitting algorithms considered in this paper are of the form

Tkλ → PFix T, where 0 < λ < 1 and Tλ = (1− λ) Id+λT. (87)

Note that T is a linear operator; hence, so is Tλ, and by [9, Corollary 2.8], the convergence in (87) is linear because X is finite-dimensional. What can be said about this rate? By [6, Theorem 2.12(ii) and Theorem 2.18], a (sharp) lower bound for the rate of linear convergence is the spectral radius of Tλ − PFix T, i.e.,

ρ(Tλ − PFix T) = max |{spectral values of Tλ − PFix T}|, (88)

while an upper bound is the operator norm

‖Tλ − PFix T‖. (89)

The lower bound is optimal and close to the true rate of convergence; see [6, Theorem 2.12(i)]. Both spectral radii and operator norms of matrices are available in programming languages such as Julia [11], which features strong numerical linear algebra capabilities. In order to compute these bounds for the linear rates, we must provide matrix representations for T (which immediately gives rise to one for Tλ) and for PFix T. In the previous sections, we casually switched back and forth between column and row vector representations for readability. In this section, we need to get the structure of the objects right. To visually stress this, we will use square brackets for vectors and matrices, separating the entries within a row by commas and the rows by semicolons.

For the remainder of this section, we fix three linear subspaces U, V, W of Rd, with intersection

Z = U ∩V ∩W. (90)

We assume that the matrices PU, PV, PW in Rd×d are available to us (and hence so are PU⊥, PV⊥, PW⊥ and PZ, via Example 2.2 and Corollary 2.4, respectively).

5.1 Ryu splitting

In this subsection, we consider Ryu splitting. First, the block matrix representation of the operator M occurring in Ryu splitting (see (38)) is

[PU, 0; PV PU, PV; PW PU + PW PV PU − PW, PW PV − PW] ∈ R3d×2d. (91)

Hence, using (39), we obtain the following matrix representation of the Ryu splitting operator T = TRyu:

T = [Id, 0; 0, Id] + [−Id, 0, Id; 0, −Id, Id] [PU, 0; PV PU, PV; PW PU + PW PV PU − PW, PW PV − PW] (92a)
= [Id − PU + PW PU + PW PV PU − PW, PW PV − PW; PW PU + PW PV PU − PW − PV PU, Id + PW PV − PV − PW] ∈ R2d×2d. (92b)
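The block matrices (91)–(92) are straightforward to assemble numerically. A Julia sketch (function name ours; PU, PV, PW assumed given as d×d matrices) is:

```julia
using LinearAlgebra

# Assemble the Ryu matrix T from (91)–(92a).
function ryu_matrix(PU, PV, PW)
    d   = size(PU, 1)
    Id  = Matrix{Float64}(I, d, d)
    O   = zeros(d, d)
    M31 = PW*PU + PW*PV*PU - PW              # bottom-left block of (91)
    M32 = PW*PV - PW                         # bottom-right block of (91)
    M   = [PU O; PV*PU PV; M31 M32]          # block matrix (91), of size 3d × 2d
    Q   = [-Id O Id; O -Id Id]               # encodes (Q3 − Q1, Q3 − Q2)
    return Matrix{Float64}(I, 2d, 2d) + Q * M    # T as in (92a)
end
```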

Next, we set, as in Lemma 4.1,

∆ = {[x, x]ᵀ ∈ R2d | x ∈ X}, (93a)
E = (U⊥ × V⊥) ∩ (∆⊥ + ({0} × W⊥)), (93b)

so that, by (42),

PFix T [x, y]ᵀ = [PZx, 0]ᵀ + PE [x, y]ᵀ. (94)

With the help of Corollary 2.4, we see that the first term, [PZx, 0]ᵀ, is obtained by applying the matrix

[PZ, 0; 0, 0] = [4PU(PU + PV)†PV(2PU(PU + PV)†PV + PW)†PW, 0; 0, 0] ∈ R2d×2d (95)

to [x, y]ᵀ. Let's turn to E, which is an intersection of two linear subspaces. The projector of the left linear subspace making up this intersection, U⊥ × V⊥, has the matrix representation

PU⊥×V⊥ = [Id − PU, 0; 0, Id − PV]. (96)

We now turn to the right linear subspace, ∆⊥ + ({0} × W⊥), which is a sum of two subspaces whose orthogonal complements are (∆⊥)⊥ = ∆ and ({0} × W⊥)⊥ = X × W, respectively. The projectors of the last two subspaces are

P∆ = (1/2)[Id, Id; Id, Id] and PX×W = [Id, 0; 0, PW], (97)

respectively. Thus, Corollary 2.5 yields

P∆⊥+({0}×W⊥) (98a)
= [Id, 0; 0, Id] − 2 · (1/2)[Id, Id; Id, Id] ((1/2)[Id, Id; Id, Id] + [Id, 0; 0, PW])† [Id, 0; 0, PW] (98b)
= [Id, 0; 0, Id] − 2 [Id, Id; Id, Id] [3 Id, Id; Id, Id + 2PW]† [Id, 0; 0, PW]. (98c)

To compute PE, where E is as in (93b), we combine (96) and (98) under the umbrella of Fact 2.3; the result does not seem to simplify, so we do not typeset it. Having PE, we simply add it to (95) to obtain PFix T because of (94).
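Numerically, however, PE and hence PFix T are easy to assemble from (93)–(98); a Julia sketch (function names ours) is:

```julia
using LinearAlgebra

# Assemble P_{Fix T} for Ryu splitting from (93)–(98); PU, PV, PW are d×d projections.
inter(P1, P2) = 2 * P1 * pinv(P1 + P2) * P2          # Fact 2.3 (Anderson-Duffin)

function ryu_fixed_point_projector(PU, PV, PW)
    d   = size(PU, 1)
    Id  = Matrix{Float64}(I, d, d)
    O   = zeros(d, d)
    I2d = Matrix{Float64}(I, 2d, 2d)
    PUp = Id - PU                                    # P_{U⊥}, Example 2.2
    PVp = Id - PV
    Pleft  = [PUp O; O PVp]                          # (96): projector onto U⊥ × V⊥
    Pdelta = [Id Id; Id Id] / 2                      # P_∆ from (97)
    PXW    = [Id O; O PW]                            # P_{X×W} from (97)
    Pright = I2d - 2 * Pdelta * pinv(Pdelta + PXW) * PXW   # (98)
    PE = inter(Pleft, Pright)                        # projector onto E in (93b)
    PZ = inter(inter(PU, PV), PW)                    # Corollary 2.4
    return [PZ O; O O] + PE                          # (94)–(95)
end
```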


5.2 Malitsky-Tam splitting

In this subsection, we turn to Malitsky-Tam splitting for the current setup; this corresponds to Section 4.2 with n = 3, where we identify (U1, U2, U3) with (U, V, W).

The block matrix representation of M from (57) is

[PU, 0; −PV(Id − PU), PV; PW(PU + PV PU − PV), −PW(Id − PV)] ∈ R3d×2d. (99)

Thus, using (58), we obtain the following matrix representation of the Malitsky-Tam splitting operator T = TMT:

T = [Id, 0; 0, Id] + [−Id, Id, 0; 0, −Id, Id] [PU, 0; −PV(Id − PU), PV; PW(PU + PV PU − PV), −PW(Id − PV)] (100a)
= [Id − PU − PV(Id − PU), PV; PV(Id − PU) + PW(PU + PV PU − PV), Id − PV − PW(Id − PV)] (100b)
= [(Id − PV)(Id − PU), PV; (Id − PW)PV(Id − PU) + PW PU, (Id − PW)(Id − PV)] ∈ R2d×2d. (100c)

Next, in view of (62), we have

PFix T = (1/2)[PZ, PZ; PZ, PZ] + PE, (101)

where (see (60) and (61))

E = ran Ψ ∩ (X × W⊥) (102)

and

Ψ : U⊥ × V⊥ → X2 : [y1, y2]ᵀ 7→ [y1, y1 + y2]ᵀ. (103)

We first note that

ran Ψ = ran([Id, 0; Id, Id] [PU⊥, 0; 0, PV⊥]) = ran [PU⊥, 0; PU⊥, PV⊥]. (104)

We thus obtain from Fact 2.6 that

Pran Ψ = [PU⊥, 0; PU⊥, PV⊥] [PU⊥, 0; PU⊥, PV⊥]†. (105)


On the other hand,

PX×W⊥ = [Id, 0; 0, PW⊥]. (106)

In view of (102) and Fact 2.3, we obtain

PE = 2Pran Ψ(Pran Ψ + PX×W⊥)†PX×W⊥. (107)

We could now use our formulas (105) and (106) for Pran Ψ and PX×W⊥ to obtain a more explicit formula for PE, but we refrain from doing so as the expressions become unwieldy. Finally, plugging the formula for PZ from Corollary 2.4 and the formula (107) for PE into (101) yields a formula for PFix T.
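Again, the numerical assembly is straightforward; a Julia sketch of (101)–(107) for PFix T (function name ours) is:

```julia
using LinearAlgebra

# Assemble P_{Fix T} for Malitsky-Tam splitting (n = 3) from (101)–(107);
# PU, PV, PW are the given d×d projections.
function mt_fixed_point_projector(PU, PV, PW)
    d   = size(PU, 1)
    Id  = Matrix{Float64}(I, d, d)
    O   = zeros(d, d)
    inter(P1, P2) = 2 * P1 * pinv(P1 + P2) * P2      # Fact 2.3
    PUp, PVp, PWp = Id - PU, Id - PV, Id - PW        # P_{U⊥}, P_{V⊥}, P_{W⊥}
    A       = [PUp O; PUp PVp]                       # matrix with ran A = ran Ψ, cf. (104)
    PranPsi = A * pinv(A)                            # (105), via Fact 2.6
    PXWp    = [Id O; O PWp]                          # (106): projector onto X × W⊥
    PE      = 2 * PranPsi * pinv(PranPsi + PXWp) * PXWp    # (107)
    PZ      = inter(inter(PU, PV), PW)               # Corollary 2.4
    return [PZ PZ; PZ PZ] / 2 + PE                   # (101)
end
```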

6 Numerical experiments

We now outline a few experiments conducted to observe the performance of the algorithms described in Section 5. Each instance of an experiment involves 3 subspaces Ui of dimension di for i ∈ {1, 2, 3} in X = Rd. By [19, equation (4.4.19) on page 205],

dim(U1 + U2) = d1 + d2 − dim(U1 ∩U2). (108)

Hence

dim(U1 ∩ U2) = d1 + d2 − dim(U1 + U2) ≥ d1 + d2 − d. (109)

Thus dim(U1 ∩U2) ≥ 1 whenever

d1 + d2 ≥ d + 1. (110)

Similarly,

dim(Z) ≥ dim(U1 ∩U2) + d3 − d ≥ d1 + d2 − d + d3 − d = d1 + d2 + d3 − 2d. (111)

Along with (110), a sensible choice for di satisfies

di ≥ 1 + ⌈2d/3⌉ (112)

because then d1 + d2 ≥ 2 + 2⌈2d/3⌉ ≥ 2 + 4d/3 > 2 + d. Hence d1 + d2 ≥ 3 + d and d1 + d2 + d3 ≥ 3 + 3⌈2d/3⌉ ≥ 3 + 2d. The smallest d that gives proper subspaces is d = 6, for which d1 = d2 = d3 = 5 satisfy the above conditions.

We now describe our set of 3 numerical experiments designed to observe different aspects of the algorithms.


Figure 1: Experiment 1: spectral radii and operator norms

6.1 Experiment 1: Bounds on the rates of linear convergence

As shown in Section 5, we have lower and upper bounds on the rate of linear convergence of the operator Tλ. We conduct this experiment to observe how these bounds change as we increase λ. To this end, we generate 1000 instances of sets of linear subspaces U1, U2, and U3. This can be done by randomly generating sets of 3 matrices B1, B2, B3 in R6×5. The range spaces of these matrices define the subspaces, which in turn gives us the projection onto Ui using [7, Proposition 3.30(ii)],

PUi = BiB†i . (113)

For each instance, algorithm, and λ ∈ {0.01 · k | k ∈ {1, 2, . . . , 99}}, we obtain the operators Tλ and PFix T as outlined in Section 5 and compute the spectral radius and operator norm of Tλ − PFix T. Figure 1 reports the average of the spectral radii and operator norms for each λ. While Ryu sees a decline in the lower bound for the rate of convergence as λ increases, MT sees a minimizer around λ = 0.9.
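For reference, one instance of this experiment can be reproduced along the following lines in Julia; the sketch reuses the ryu_matrix and ryu_fixed_point_projector helpers from the Section 5.1 sketches and is illustrative only, not the authors' actual experiment code:

```julia
using LinearAlgebra

# One instance of Experiment 1, schematically: random subspaces via (113) and
# the bounds (88)–(89) on a grid of λ (Ryu splitting shown).
spectral_radius(M) = maximum(abs, eigvals(M))

proj(B) = B * pinv(B)                                # (113): P_{U_i} = B_i B_i†
d = 6
PU, PV, PW = proj(randn(d, 5)), proj(randn(d, 5)), proj(randn(d, 5))

T     = ryu_matrix(PU, PV, PW)
PFixT = ryu_fixed_point_projector(PU, PV, PW)
for λ in 0.01:0.01:0.99
    Tλ    = (1 - λ) * I + λ * T
    lower = spectral_radius(Tλ - PFixT)              # lower bound (88)
    upper = opnorm(Tλ - PFixT)                       # upper bound (89)
    println("λ = $λ:  spectral radius = $lower,  operator norm = $upper")
end
```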


Figure 2: Experiment 2: number of iterations for the governing sequence

Figure 3: Experiment 2: number of iterations for the shadow sequence


6.2 Experiment 2: Number of iterations to achieve prescribed accuracy

Because we know the limit points of the governing as well as the shadow sequences, we investigate how changing λ affects the number of iterations required to approximate the limit to a given accuracy. For these experiments, we fix 100 instances of sets of subspaces {U1, U2, U3}. We also fix 100 different starting points in R6. For each instance of the subspaces, starting point z0, and λ ∈ {0.01 · k | k ∈ {1, 2, . . . , 99}}, we obtain the number of iterations (up to a maximum of 10^4 iterations) required to achieve ε = 10^−6 accuracy.

For the governing sequence, the limit PFix T z0 is used to determine the stopping condition. Figure 2 reports the median number of iterations required for each λ to achieve the given accuracy. For the shadow sequence, we compute the median number of iterations required to achieve ε = 10^−6 accuracy for the shadow sequence Mzk with respect to its limit (PZz0, PZz0, PZz0). Here M for Ryu and MT can be obtained from (91) and (99), respectively. See Figure 3 for results.

For both algorithms and both experiments, increasing values of λ result in a decreasing median number of iterations required. As is evident from the maximum number of iterations required for a fixed λ, the shadow sequence converges before the governing sequence for larger values of λ. One can also see that, for a fixed λ, Ryu requires fewer median iterations than MT for both the governing and the shadow sequence to achieve the same accuracy.

6.3 Experiment 3: Convergence plots of shadow sequences

In this experiment, we measure the distance of the shadow sequence from the limit point at each iteration to observe the approach of the iterates of the algorithm to the solution. We pick the λ for which the iterates converge fastest, which is λ = 0.99 for both algorithms based on Figure 3. Similar to the setup of the previous experiment, we fix 100 starting points and 100 sets of subspaces {U1, U2, U3}. We now run the algorithms for 150 iterations for each starting point and each set of subspaces. We measure ‖Mzk − (PZz0, PZz0, PZz0)‖ at each iteration. Figure 4 reports the median of ‖Mzk − (PZz0, PZz0, PZz0)‖ for each iteration k ∈ {1, . . . , 150}.

As can be seen in Figure 4, Ryu converges faster to the solution compared to MT. Both show faint “rippling” akin to that known to occur for the Douglas-Rachford algorithm.


Figure 4: Experiment 3: convergence plot of the shadow sequence

7 Conclusion

In this paper, we investigated the recent splitting methods by Ryu and by Malitsky-Tam in the context of normal cone operators for subspaces. We discovered that both algorithms find not just some solution but in fact the projection of the starting point onto the intersection of the subspaces. Moreover, convergence of the iterates is strong even in infinite-dimensional settings. Our numerical experiments illustrated that Ryu's method seems to converge faster, although Malitsky-Tam splitting is not limited in its applicability to just 3 subspaces.

Two natural avenues for future research are the following. Firstly, when X is finite-dimensional, we know that the convergence rate of the iterates is linear. While we illustrated this linear convergence numerically in this paper, it is open whether there are natural bounds for the linear rates in terms of some version of angle between the subspaces involved. For the prototypical Douglas-Rachford splitting framework, this was carried out in [5] in terms of the Friedrichs angle. Secondly, what can be said in the inconsistent affine case? Again, the Douglas-Rachford algorithm may serve as a guide to what the expected results and complications might be; see, e.g., [10].


References

[1] W.N. Anderson and R.J. Duffin, Series and parallel addition of matrices, Journal of Mathematical Analysis and Applications 26, 576–594, 1969. https://doi.org/10.1016/0022-247X(69)90200-5

[2] F.J. Aragon-Artacho, R. Campoy, and M.K. Tam, Strengthened splitting methods for computing resolvents, Computational Optimization and Applications, 2021. https://doi.org/10.1007/s10589-021-00291-6, preprint version: https://arxiv.org/abs/2011.01796v3

[3] J.B. Baillon, Quelques proprietes de convergence asymptotique pour les contractions impaires, Comptes rendus de l'Academie des Sciences 238, Aii, A587–A590, 1976.

[4] J.B. Baillon, R.E. Bruck, and S. Reich, On the asymptotic behavior of nonexpansive mappings and semigroups in Banach spaces, Houston Journal of Mathematics 4(1), 1–9, 1978. https://www.math.uh.edu/~hjm/restricted/archive/v004n1/0001BAILLON.pdf

[5] H.H. Bauschke, J.Y. Bello Cruz, T.T.A. Nghia, H.M. Phan, and X. Wang, The rate of linear convergence of the Douglas-Rachford algorithm for subspaces is the cosine of the Friedrichs angle, Journal of Approximation Theory 185, 63–79, 2014. https://doi.org/10.1016/j.jat.2014.06.002

[6] H.H. Bauschke, J.Y. Bello Cruz, T.T.A. Nghia, H.M. Phan, and X. Wang, Optimal rates of linear convergence of relaxed alternating projections and generalized Douglas-Rachford methods for two subspaces, Numerical Algorithms 73, 33–76, 2016. https://doi.org/10.1007/s11075-015-0085-4

[7] H.H. Bauschke and P.L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces, second edition, Springer, 2017. https://doi.org/10.1007/978-3-319-48311-5

[8] H.H. Bauschke, F. Deutsch, H. Hundal, and S.-H. Park, Accelerating the convergence of the method of alternating projections, Transactions of the AMS 355(9), 3433–3461, 2003. https://doi.org/10.1090/S0002-9947-03-03136-2

[9] H.H. Bauschke, B. Lukens, and W.M. Moursi, Affine nonexpansive operators, Attouch-Thera duality, and the Douglas-Rachford algorithm, Set-Valued and Variational Analysis 25, 481–505, 2017. https://doi.org/10.1007/s11228-016-0399-y

[10] H.H. Bauschke and W.M. Moursi, The Douglas-Rachford algorithm for two (not necessarily intersecting) affine subspaces, SIAM Journal on Optimization 26(2), 968–985, 2016. https://doi.org/10.1137/15M1016989

[11] J. Bezanson, A. Edelman, S. Karpinski, and V.B. Shah, Julia: a fresh approach to numerical computing, SIAM Review 59(1), 65–98, 2017. https://doi.org/10.1137/141000671

[12] R.E. Bruck and S. Reich, Nonexpansive projections and resolvents of accretive operators in Banach spaces, Houston Journal of Mathematics 3(4), 459–470, 1977. https://www.math.uh.edu/~hjm/restricted/archive/v003n4/0459BRUCK.pdf

[13] P.L. Combettes, Iterative construction of the resolvent of a sum of maximal monotone operators, Journal of Convex Analysis 16(4), 727–748, 2009. https://www.heldermann.de/JCA/JCA16/JCA163/jca16044.htm

[14] F. Deutsch, Best Approximation in Inner Product Spaces, Springer, 2001. https://doi.org/10.1007/978-1-4684-9298-9

[15] J. Douglas and H.H. Rachford, On the numerical solution of heat conduction problems in two and three space variables, Transactions of the AMS 82, 421–439, 1956. https://doi.org/10.1090/S0002-9947-1956-0084194-4

[16] C.W. Groetsch, Generalized Inverses of Linear Operators, Marcel Dekker, 1977.

[17] P.-L. Lions and B. Mercier, Splitting algorithms for the sum of two nonlinear operators, SIAM Journal on Numerical Analysis 16, 964–979, 1979. https://doi.org/10.1137/0716071

[18] Y. Malitsky and M.K. Tam, Resolvent splitting for sums of monotone operators with minimal lifting. https://arxiv.org/abs/2108.02897v1

[19] C.D. Meyer, Matrix Analysis and Applied Linear Algebra, SIAM, 2000.

[20] E.K. Ryu, Uniqueness of DRS as the 2 operator resolvent-splitting and impossibility of 3 operator resolvent-splitting, Mathematical Programming (Series A) 182, 233–273, 2020. https://doi.org/10.1007/s10107-019-01403-1
