
Modeling the 8-Queens Problem and Sudoku using an Algorithm based on

Projections onto Nonconvex Sets

by

Jason Schaad

B.Sc., The University of British Columbia, 2000

M.A., Gonzaga University, 2006

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

MASTER OF SCIENCE

in

The College of Graduate Studies

(Mathematics)

THE UNIVERSITY OF BRITISH COLUMBIA

(Okanagan)

August 2010

© Jason Schaad 2010


Abstract

We begin with some basic definitions and concepts from convex analysis and projection onto convex sets (POCS). We next introduce various algorithms for converging to the intersection of convex sets and review various results (Elser's Difference Map is equivalent to the Douglas-Rachford and Fienup's Hybrid Input-Output algorithms, which are both equivalent to the Hybrid Projection-Reflection algorithm). Our main contribution is two-fold. First, we show how to model the 8-queens problem and, following Elser, we model Sudoku as well. In both cases we use several very nonconvex sets and, while the theory for convex sets does not apply, so far the algorithm finds a solution. Second, we show that the operator governing the Douglas-Rachford iteration need not be a proximal map even when the two involved resolvents are; this refines an observation of Eckstein.


Table of Contents

Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii

Table of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii

List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v

Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . vi

1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

2 Background . . . 3
2.1 Vector Space . . . 3
2.2 Subspaces and Convex Sets . . . 4
2.3 The Inner Product and Inner Product Spaces . . . 5
2.4 Cauchy Sequences, Completeness and Hilbert spaces . . . 6
2.5 Product Space . . . 11
2.6 Lipschitz and Nonexpansivity . . . 13
2.7 Projections . . . 18
2.8 Properties of the Projector . . . 23
2.9 The Reflector . . . 24
2.10 Miscellaneous . . . 25

3 Projection Methods . . . 26
3.1 Elser's Difference Map . . . 26
3.2 Douglas-Rachford and Fienup HIO . . . 27
3.3 Fixed Point Results . . . 28
3.4 The Main Result . . . 29

4 Monotone Operators . . . 33
4.1 Definitions . . . 33
4.2 Eckstein's Observation . . . 43


5 8 Queens . . . 47
5.1 Queen's Role in Chess . . . 47
5.2 8 Queens History . . . 47
5.3 Solutions to 8 Queens . . . 48
5.4 Modeling 8 Queens . . . 50
5.5 Row Constraints . . . 51
5.6 Column Constraints . . . 51
5.7 Positive Slope Constraints . . . 52
5.8 Negative Slope Constraints . . . 53
5.9 Projection onto the Nearest Unit Vector . . . 54
5.10 Iterations . . . 55
5.11 8 Queens Example . . . 56

6 Sudoku . . . 60
6.1 What is Sudoku? . . . 60
6.2 Modeling Sudoku . . . 60
6.3 Constraints . . . 63

6.3.1 North-South Hallways . . . 63
6.3.2 East-West Hallways . . . 64
6.3.3 Meeting Rooms . . . 65
6.3.4 The Given Sudoku . . . 66

6.4 Iterations . . . 66
6.5 Sudoku Example . . . 67
6.6 Sudoku Distance . . . 74

7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

Appendices

A 8 Queens PHP Code . . . . . . . . . . . . . . . . . . . . . . . . 80

B Sudoku PHP Code . . . . . . . . . . . . . . . . . . . . . . . . . 93

C Eckstein Calculation Details . . . . . . . . . . . . . . . . . . . 109


List of Figures

2.1 Example of Convexity . . . . . . . . . . . . . . . . . . . . . . 5

5.1 Queen moves . . . 48
5.2 Initial Starting Position . . . 57
5.3 30 iterations . . . 57
5.4 60 iterations . . . 58
5.5 90 iterations . . . 58
5.6 Solved in 96 iterations . . . 59

6.1 A Sample Sudoku Puzzle . . . 61
6.2 A Sample Sudoku Solution . . . 62
6.3 9 Sudoku Elevator Shafts . . . 63
6.4 A Sudoku Hallway . . . 64
6.5 A Sudoku Meeting Room . . . 65
6.6 Solution to Escargot . . . 73
6.7 Distance versus number of iterations . . . 74

C.1 Calculation Details . . . . . . . . . . . . . . . . . . . . . . . . 110


Acknowledgements

First and foremost, I thank my supervisor Heinz Bauschke. His enthusiasm for mathematics is truly infectious. Without the support from my wife Yuki, the flexibility of my boss Jon Rever, and the help of Liangjin Yao, I would not have been able to complete this thesis; thank you. I also thank Shawn Wang, Yves Lucet, Yong Gao, and Pulkit Bansal for careful reading of my thesis.


Chapter 1

Introduction

In mid August 2007, I attended the Second Mathematical Programming Society International Conference on Continuous Optimization [ICC07], held at McMaster University in Hamilton, Ontario. I had the pleasure of attending a talk given by Dr. Veit Elser from Cornell University. Elser talked very passionately about a new algorithm he discovered while researching the hybrid input-output (HIO) algorithm of James R. Fienup, Ph.D., a University of Rochester optics researcher. I was intrigued by the variety of problems Elser was able to solve with his difference map.

A large number of problems can be formulated as follows: find an element of A ∩ B, with the understanding that finding elements of A and of B is easy [ERT07]. "Difficult problems can often be broken down into a collection of smaller, more tractable, subproblems." [GE08]. Elser gives a very easy to understand example. Suppose you want to unscramble the anagram "ilnoostu". We think about two sets, A and B. Set A is the set of 20,160 distinct permutations of the given 8 letters. Set B is the collection of eight-letter English words; the cardinality of set B is approximately 30,000. The intersection of these two sets contains our answer; in fact A ∩ B = {solution}.
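This intersection formulation is easy to make concrete. The sketch below is illustrative only: the mini word list standing in for set B is hypothetical (a real solver would load a full dictionary).

```python
# The A ∩ B view of Elser's anagram example.
# A: the distinct permutations of "ilnoostu"; B: eight-letter English words
# (here a tiny made-up stand-in for a ~30,000-word dictionary).
from itertools import permutations

letters = "ilnoostu"
A = {"".join(p) for p in permutations(letters)}  # 8!/2! = 20,160 distinct strings
B = {"solution", "solvable", "notebook"}         # hypothetical mini-dictionary

print(len(A))   # 20160
print(A & B)    # {'solution'}
```

The set A really does have 20,160 elements: 8! = 40,320 orderings, halved because the letter "o" repeats.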

The algorithm projects onto the sets A and B; these projections need to be computationally quick in order for the difference map to be successful.
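For two convex sets whose projectors are cheap, the whole scheme fits in a few lines. The sketch below is my own toy instance, not from the thesis: A is the x-axis and B is the line y = x in R², and the update x ← x + P_A(2P_B x − x) − P_B x is the Douglas-Rachford/difference-map step discussed in Chapter 3.

```python
# Douglas-Rachford / difference-map iteration on a toy convex pair in R^2:
#   A = x-axis, B = line y = x, so A ∩ B = {(0, 0)}.

def P_A(v):
    # project onto the x-axis
    return (v[0], 0.0)

def P_B(v):
    # project onto the line y = x
    s = (v[0] + v[1]) / 2.0
    return (s, s)

x = (5.0, -3.0)  # arbitrary starting point
for _ in range(60):
    pb = P_B(x)
    refl = (2.0 * pb[0] - x[0], 2.0 * pb[1] - x[1])   # reflect through B
    pa = P_A(refl)
    x = (x[0] + pa[0] - pb[0], x[1] + pa[1] - pb[1])  # x += P_A(2·P_B x − x) − P_B x

sol = P_B(x)  # the candidate solution is P_B of the governing sequence
print(sol)    # ≈ (0.0, 0.0), the unique intersection point
```

For convex sets like these the iteration provably converges; the surprise exploited in this thesis is that the same update often succeeds on nonconvex constraint sets.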

Convergence to a fixed point has been proved when our sets are convex [BCL02]. This of course is not the case with Sudoku, as we are working with positive integers from 1 to 9. But the Sudoku solver has solved every Sudoku problem we have thrown at it. In fact, Science News published an article about the Sudoku solver [Reh], and people from around the world tried to find a Sudoku puzzle that it could not solve, but to no avail. The server that hosted the PHP code to solve Sudoku was overloaded because of the interest; the Internet Service Provider emailed us, upset because the algorithm was using over 99 percent of the virtual server's processor, and we had to move the code to the University of British Columbia servers. We were hoping to find a Sudoku puzzle that did not work so that we could understand what makes that particular Sudoku special. There may be a theorem out there that shows our method always works on Sudoku. We leave that for


future work in this area.

Modeling Sudoku is complicated. Perhaps the greatest contribution of this thesis is providing the full details of modeling Sudoku in order for the difference map to solve it; it is very ingenious. I want to be very clear: we did not discover the Sudoku model, Dr. Veit Elser did. In fact, talking with him at the International Math Conference in Banff in 2009 [bir09], I asked him how he ever came up with such a clever model. Elser told me simply, "I was motivated". He knew his difference map could solve something as difficult as a Sudoku puzzle, so he was determined to show the world.

We wanted to try to model something simpler in order to better explain how to model Sudoku. Once we explain how to model 8-queens, the modeling of Sudoku is much easier to understand. The modeling of the 8-queens problem is a new application, and as far as I know Elser has not attempted it.

As mathematicians we want to know why it works. The study of projection operators reveals a strong connection to monotone operators. We revisit work by Eckstein, correct a computational error, and provide a much simpler example.

The remainder of this thesis is organized as follows. Chapter 2 is a review of well-known definitions and concepts from the broad area of convex analysis. We build the theory so we can define a Hilbert space and projections.

In Chapter 3 we consider an algorithm for converging to the intersection of convex sets and review various related results. We show that Elser's Difference Map is equivalent to the Douglas-Rachford and Fienup's Hybrid Input-Output algorithms.

In Chapter 4 we show that the operator governing the Lions-Mercier iteration need not be a proximal map even when the two involved resolvents are; this refines an observation of Eckstein.

Chapter 5 focusses on modeling the 8-queens problem, and Chapter 6concerns the Sudoku puzzle. Finally, Chapter 7 concludes this thesis.


Chapter 2

Background

This chapter is included so that the results of this thesis can be understood in one self-contained document. Readers can first familiarize themselves with the well-known facts presented here.

2.1 Vector Space

Definition 2.1 (Vector space). A vector space V over R is a nonempty set or collection of elements (called vectors) with two algebraic operations. These operations are called vector addition and multiplication of vectors by scalars. For all vectors x, y, z ∈ V and α, β ∈ R the following conditions are required to hold:

1. x + y, called the sum of x and y, is a vector in V and is the unique result of addition applied to x and y.

2. αx, called the scalar multiple of x by α, is a vector in V and is the unique result of scalar multiplication applied to x by α.

3. Addition is commutative, x+ y = y + x.

4. Addition is associative, x+ (y + z) = (x+ y) + z.

5. There is a zero vector 0 in V such that x+ 0 = x.

6. There is an additive inverse −x such that x+ (−x) = 0.

7. Scalar multiplication is distributive, α(x + y) = αx + αy, and (α + β)x = αx + βx.

8. Scalar multiplication is associative, α(βx) = (αβ)x.

9. Scalar multiplication by 1 is the identity operator, 1x = Idx = x.


2.2 Subspaces and Convex Sets

Definition 2.2 (subspace). A subset W of a vector space V is a subspace of V if W is a vector space itself with the same operations of addition and scalar multiplication.

We note that the zero vector, 0, is in every subspace. The smallest possible subspace is {0}, and the largest possible subspace is the vector space V itself. It is very tedious and time-consuming to check whether a subset of V is a subspace by applying the definition of a vector space. The following two facts are well-known, easier ways to verify whether W is indeed a subspace. The proofs are omitted, but can be found in many linear algebra textbooks.

Fact 2.3. [Mey00, page 162] Let V be a real vector space. Then W is a subspace of V if and only if W ⊆ V and αx + y ∈ W ∀x, y ∈ W, ∀α ∈ R.

Proposition 2.4. If W1 and W2 are subspaces of a real vector space V, then W1 ∩ W2 is a subspace of the vector space V.

Proof. By Fact 2.3 we need only show that W1 ∩ W2 ⊆ V and αx + y ∈ W1 ∩ W2 ∀x, y ∈ W1 ∩ W2, ∀α ∈ R. Pick x ∈ W1 ∩ W2; then x ∈ W1 ⇒ x ∈ V as W1 is a subspace. Now pick x, y ∈ W1 ∩ W2; this implies that

x ∈W1 and x ∈W2, (2.1)

y ∈W1 and y ∈W2. (2.2)

Since W1 and W2 are both subspaces,

αx+ y ∈W1 (2.3)

αx+ y ∈W2. (2.4)

We conclude that αx+ y ∈W1 ∩W2.

Definition 2.5 (Affine subspace). The set W is an affine subspace of V if there exists a vector v ∈ V and a subspace W0 of V such that W = v + W0 = {v + w : w ∈ W0}.

Definition 2.6 (Convex set). Let X be a vector space and let C ⊆ X. We say C is convex if

λC + (1− λ)C ⊆ C, ∀λ ∈ [0, 1]. (2.5)


Figure 2.1: Example of Convexity

A geometric characterization of this definition is that a set is convex if it contains all line segments whose end points are in the set. See Figure 2.1.

It is easy to see that subspaces are convex sets.

Definition 2.7 (Convex function). Let f : X → ]−∞,+∞]. We say f is a convex function if

f((1− λ)x+ λy) ≤ (1− λ)f(x) + λf(y), ∀x, y ∈ X,∀λ ∈ ]0, 1[ . (2.6)
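The inequality (2.6) can be spot-checked numerically; a sketch for the convex function f(x) = x² on a small grid of points and parameters (a sanity check, not a proof):

```python
# Check f((1-λ)x + λy) ≤ (1-λ)f(x) + λf(y) for f(x) = x² on sample points.
f = lambda t: t * t

pts = [-3.0, -1.0, 0.0, 0.5, 2.0, 4.0]
lams = [0.1, 0.25, 0.5, 0.75, 0.9]
ok = all(
    f((1 - lam) * x + lam * y) <= (1 - lam) * f(x) + lam * f(y) + 1e-12
    for x in pts for y in pts for lam in lams
)
print(ok)  # True
```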

Definition 2.8 (Domain). Let f : X → ]−∞,+∞]. The domain of a function is

dom f = {x | f(x) < +∞}. (2.7)

Definition 2.9 (Epigraph). Let f : X → ]−∞,+∞]. Then the epigraph of a function is the set of points lying on or above the graph, i.e.,

epi f = {(x, µ) |x ∈ dom f, µ ∈ R, µ ≥ f(x)}.

A function is convex if and only if its epigraph is a convex set; this is called a geometric characterization of a convex function.

2.3 The Inner Product and Inner Product Spaces

Definition 2.10 (Inner Product). An inner product on X is a mapping from X × X (where X is a vector space) to R. For every pair of vectors x and y there is an associated scalar 〈x, y〉, called the inner product of x and y, such that for all x, y, z ∈ X and for all α ∈ R we have:

1. 〈x+ y, z〉 = 〈x, z〉+ 〈y, z〉


2. 〈αx, y〉 = α〈x, y〉

3. 〈x, y〉 = 〈y, x〉

4. 〈x, x〉 ≥ 0

5. 〈x, x〉 = 0⇐⇒ x = 0

Definition 2.11. (Inner Product Space) An inner product space (sometimes called a pre-Hilbert space) is a vector space X with an inner product defined on X.

Definition 2.12. (Orthogonality) An element x of an inner product space X is said to be orthogonal to an element y ∈ X if 〈x, y〉 = 0.

Definition 2.13. (Norm) Let V be a vector space. The function ‖·‖ : V → R is called a norm if the following hold:

1. ‖x‖ ≥ 0 and ‖x‖ = 0 if and only if x = 0

2. ‖x+ y‖ ≤ ‖x‖+ ‖y‖, ∀x, y ∈ V

3. ‖αx‖ = |α|‖x‖ ∀α ∈ R, ∀x ∈ V

An inner product on X defines a norm on X which is given by

‖x‖ = √〈x, x〉 ≥ 0. (2.8)

Note that

‖x‖² = 〈x, x〉 ≥ 0. (2.9)

We also have a metric on X which is given by

d(x, y) = ‖x − y‖ = √〈x − y, x − y〉 ≥ 0. (2.10)
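As a minimal illustration, the induced norm (2.8) and metric (2.10) for the standard inner product on R^n:

```python
# Norm and metric induced by the standard inner product on R^n,
# written directly from (2.8) and (2.10).
import math

def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x))               # (2.8)

def dist(x, y):
    return norm([a - b for a, b in zip(x, y)])  # (2.10)

print(norm([3.0, 4.0]))              # 5.0
print(dist([1.0, 1.0], [4.0, 5.0]))  # 5.0
```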

2.4 Cauchy Sequences, Completeness and Hilbertspaces

Definition 2.14. (Cauchy Sequence) Let X be an inner product space. A sequence (xn) in X is said to be a Cauchy sequence if for every ε > 0 there is an N = N(ε) such that ‖xm − xn‖ < ε for every m, n > N.

Definition 2.15. (Completeness) The space X is said to be complete if every Cauchy sequence in X has a limit which is an element of X.


Definition 2.16. (Hilbert Space) A complete inner product space is said to be a Hilbert Space.

From now on we assume that X is a Hilbert Space.

Example 2.17 (Euclidean space R^n). The space R^n is a Hilbert Space. This is the primary Hilbert Space focussed on in this thesis. [Kre78, Example 1.5-1] shows us that R^n is complete. The inner product is defined by

〈x, y〉 = x1y1 + · · ·+ xnyn, (2.11)

where x = (x1, · · · , xn) and y = (y1, · · · , yn). Now from (2.11) we see that

‖x‖ = √〈x, x〉 = √(x1² + · · · + xn²). (2.12)

If n = 3, (2.11) gives us the usual dot product

〈x, y〉 = x · y = x1y1 + x2y2 + x3y3, (2.13)

where x = (x1, x2, x3) and y = (y1, y2, y3). The orthogonality

〈x, y〉 = x · y = 0 (2.14)

agrees with the concept of perpendicularity.

Example 2.18 (ℓ2). [Kre78, 1.2-3] Pronounced "little ell two" and sometimes written as ℓ2(I), this is the Hilbert space of sequences indexed by a countable set, usually N (but it could be Z). Each element x of the space ℓ2 is a sequence x = (xn) = (x1, x2, . . . , xn, . . .) with

‖(xn)‖ = √(∑_{n=1}^∞ xn²) < +∞. (2.15)

For two elements (sequences) x = (xn) ∈ ℓ2 and y = (yn) ∈ ℓ2, the inner product is defined by

〈x, y〉 = ∑_{n=1}^∞ 〈xn, yn〉. (2.16)
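As a concrete illustration (my own, not from [Kre78]), the sequence x = (1, 1/2, 1/3, . . .) lies in ℓ2 because ∑ 1/n² converges to π²/6; a finite truncation approximates the squared norm in (2.15):

```python
# Truncated squared ℓ2-norm of xn = 1/n; the full series sums to π²/6.
import math

N = 100_000
sq_norm = sum(1.0 / (n * n) for n in range(1, N + 1))
print(sq_norm)          # ≈ 1.64492, close to
print(math.pi**2 / 6)   # ≈ 1.64493
```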

Definition 2.19 (Bounded). Given a set A in X, A is said to be bounded if we have

sup_{x∈A} ‖x‖ < +∞.


Theorem 2.20. Every convergent sequence in a Hilbert space is a bounded Cauchy sequence.

Proof. Suppose xn −→ x. We first show (xn) is Cauchy. By the triangle inequality we have

‖xn − xm‖ = ‖xn − x+ x− xm‖ ≤ ‖xn − x‖+ ‖x− xm‖. (2.17)

We now let ε > 0. Then ∃N such that for n > N and m > N we have

‖xn − x‖ < ε/2, (2.18)

‖x − xm‖ < ε/2. (2.19)

Hence, for m, n > N,

‖xn − xm‖ ≤ ‖xn − x‖ + ‖x − xm‖ < ε/2 + ε/2 = ε, (2.20)

which proves the sequence is Cauchy. Now we show (xn) is bounded. Again we have xn −→ x. For ε = 1, we get N ∈ N such that

n > N =⇒ ‖xn − x‖ < 1. (2.21)

We claim

‖xn‖ − ‖x‖ ≤ ‖xn − x‖. (2.22)

By the triangle inequality

‖xn‖ = ‖−xn‖ = ‖−xn + x − x‖ = ‖(x − xn) + (−x)‖ ≤ ‖x‖ + ‖x − xn‖. (2.23)

Thus, for n > N ,

‖xn‖ < ‖x‖+ 1. (2.24)

If we choose M to be

M = max{‖x‖+ 1, ‖x1‖, ‖x2‖, ..., ‖xN‖} (2.25)

we get

‖xn‖ ≤M, ∀n. (2.26)

as desired.


Definition 2.21 (Distance to a set). The distance from a vector x ∈ X to a nonempty set A ⊆ X is defined as follows:

dA(x) := inf_{y∈A} ‖x − y‖.

Theorem 2.22 (Cauchy-Schwarz inequality). For any x and y in a Hilbert space X, we have that

|〈x, y〉| ≤ ‖x‖‖y‖.

Proof. We prove this with two cases.

Case 1: x = 0 or y = 0. Clear.

Case 2: x ≠ 0 and y ≠ 0. Then

0 ≤ 〈x/‖x‖ − y/‖y‖, x/‖x‖ − y/‖y‖〉 (2.27)

= ‖x/‖x‖ − y/‖y‖‖² = 〈x, x〉/‖x‖² − 2〈x, y〉/(‖x‖‖y‖) + 〈y, y〉/‖y‖² (2.28)

⇐⇒ ‖x‖²/‖x‖² − 2〈x, y〉/(‖x‖‖y‖) + ‖y‖²/‖y‖² ≥ 0 (2.29)

⇐⇒ 2 ≥ 2〈x, y〉/(‖x‖‖y‖) (2.30)

⇐⇒ ‖x‖‖y‖ ≥ 〈x, y〉 (2.31)

⇐⇒ 〈x, y〉 ≤ ‖x‖‖y‖. (2.32)

This is true for all x. Let x be replaced by −x. Now we have

‖−x‖‖y‖ ≥ 〈−x, y〉 (2.33)

⇐⇒ −〈x, y〉 ≤ ‖x‖‖y‖. (2.34)


Now

max{〈x, y〉, −〈x, y〉} = |〈x, y〉|. (2.35)

Combine (2.32) and (2.34) to get

|〈x, y〉| ≤ ‖x‖‖y‖, ∀x, y ∈ X (2.36)

as desired.
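A numeric spot-check of the inequality just proved, on a few fixed vectors in R³ (a sanity check only):

```python
# Check |〈x, y〉| ≤ ‖x‖‖y‖ for some sample vectors in R^3.
import math

def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x))

vecs = [(1.0, 2.0, 3.0), (-4.0, 0.0, 2.5), (0.0, 0.0, 0.0), (7.0, -1.0, 1.0)]
ok = all(abs(inner(x, y)) <= norm(x) * norm(y) + 1e-12 for x in vecs for y in vecs)
print(ok)  # True
```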

Theorem 2.23 (Triangle Inequality). For every x, y ∈ X, we have that

‖x+ y‖ ≤ ‖x‖+ ‖y‖.

Proof. We have

‖x+ y‖2 = 〈x+ y, x+ y〉 (2.37)

= 〈x, x+ y〉+ 〈y, x+ y〉 (2.38)

= 〈x, x〉+ 〈x, y〉+ 〈y, x〉+ 〈y, y〉 (2.39)

= ‖x‖2 + 2〈x, y〉+ ‖y‖2 (2.40)

≤ ‖x‖2 + 2|〈x, y〉|+ ‖y‖2. (2.41)

By the Cauchy-Schwarz inequality we have

‖x+ y‖2 ≤ ‖x‖2 + 2‖x‖‖y‖+ ‖y‖2 (2.42)

= (‖x‖+ ‖y‖)2. (2.43)

We take the square root on both sides to obtain

‖x+ y‖ ≤ ‖x‖+ ‖y‖ (2.44)

as desired.

Theorem 2.24 (Parallelogram law). For every x, y ∈ X

‖x+ y‖2 + ‖x− y‖2 = 2(‖x‖2 + ‖y‖2).

Proof. We have,

‖x+ y‖2 + ‖x− y‖2 = 〈x+ y, x+ y〉+ 〈x− y, x− y〉 (2.45)

= 2〈x, x〉+ 2〈y, y〉+ 2〈x, y〉 − 2〈x, y〉 (2.46)

= 2(‖x‖2 + ‖y‖2). (2.47)
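A quick numeric check of the parallelogram law for one pair of vectors in R²:

```python
# Verify ‖x+y‖² + ‖x−y‖² = 2(‖x‖² + ‖y‖²) for x = (3, −1), y = (2, 5).
def sq_norm(v):
    return sum(a * a for a in v)

x, y = (3.0, -1.0), (2.0, 5.0)
xpy = (x[0] + y[0], x[1] + y[1])
xmy = (x[0] - y[0], x[1] - y[1])
lhs = sq_norm(xpy) + sq_norm(xmy)
rhs = 2 * (sq_norm(x) + sq_norm(y))
print(lhs, rhs)  # 78.0 78.0
```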


2.5 Product Space

Definition 2.25 (Product Space). Let N ∈ N = {1, 2, 3, . . .}, and let Xi be Hilbert spaces, with inner products 〈·, ·〉i, for each i ∈ {1, . . . , N}. The corresponding product space H := X1 × X2 × . . . × XN is defined by

H = { x = (x1, x2, . . . , xN) | xi ∈ Xi }, (2.48)

〈x, y〉 = ∑_{i=1}^N 〈xi, yi〉i, ∀x = (xi), y = (yi) ∈ H. (2.49)

Then,

‖x‖ = √(∑_{i=1}^N ‖xi‖i²), ∀x = (xi) ∈ H. (2.50)

If Xi = X, ∀i ∈ {1, · · · , N}, we denote H by X^N.

In fact, the well known space R^N is actually a product space: we can write R^N as R × · · · × R.

Proposition 2.26. The Product Space is a Hilbert space.

Proof. We first prove that

〈x, y〉 = ∑_{i=1}^N 〈xi, yi〉i, ∀x = (xi), y = (yi) ∈ H, (2.51)

is an inner product. Let x = (xi), y = (yi), z = (zi) ∈ H, and α ∈ R. As described above, we need to show five properties in order to prove this is an inner product. Until the end of this proof, we abbreviate ∑_{i=1}^N by ∑.

1.

〈x + y, z〉 = 〈(xi) + (yi), (zi)〉 = 〈(xi + yi), (zi)〉 = ∑〈xi + yi, zi〉i. (2.52)

Since xi, yi, zi ∈ Xi and Xi is a Hilbert space, we can write

∑〈xi + yi, zi〉i = ∑(〈xi, zi〉i + 〈yi, zi〉i) = ∑〈xi, zi〉i + ∑〈yi, zi〉i = 〈x, z〉 + 〈y, z〉. (2.53)

2.

〈αx, y〉 = 〈α(xi), (yi)〉 = 〈(αxi), (yi)〉. (2.54)


Since xi, yi, zi ∈ Xi and Xi is a Hilbert space, we can write

〈αx, y〉 = ∑〈αxi, yi〉i = ∑α〈xi, yi〉i = α∑〈xi, yi〉i = α〈x, y〉. (2.55)

3.

〈x, y〉 = 〈(xi), (yi)〉 = ∑〈xi, yi〉i = ∑〈yi, xi〉i = 〈(yi), (xi)〉 = 〈y, x〉. (2.56)

4.

〈x, x〉 = 〈(xi), (xi)〉 = ∑〈xi, xi〉i = ∑‖xi‖i² ≥ 0. (2.57)

5. From above we know

∑‖xi‖i² ≥ 0. (2.58)

Then 〈x, x〉 = 0 = ∑‖xi‖i² ⇔ xi = 0 ∀i ⇔ x = (0, 0, . . .) = 0. Hence, 〈·, ·〉 as defined above is an inner product. It also induces the following norm:

‖x‖ = √〈x, x〉 = √(∑‖xi‖i²), ∀x = (xi) ∈ H. (2.59)–(2.60)

Finally, to show this is a Hilbert space, we need to show that it is complete. Let (xn) be a Cauchy sequence in H. Write xn = (xn,i) = (xn,1, xn,2, . . . , xn,N). Since (xn) is a Cauchy sequence we know that ∀ε > 0 ∃N1 such that

‖xn − xm‖ < ε, ∀n,m > N1 (2.61)

If we square both sides we get

‖xn − xm‖2 < ε2. (2.62)

Note that,

xn − xm (2.63)

= (xn,1, xn,2, . . . , xn,N )− (xm,1, xm,2, . . . , xm,N ) (2.64)

= (xn,1 − xm,1, xn,2 − xm,2, . . . , xn,N − xm,N ). (2.65)


Thus,

‖xn − xm‖² (2.66)
= ‖xn,1 − xm,1‖1² + ‖xn,2 − xm,2‖2² + . . . + ‖xn,N − xm,N‖N². (2.67)

By (2.62),

‖xn,1 − xm,1‖1² + ‖xn,2 − xm,2‖2² + . . . + ‖xn,N − xm,N‖N² < ε². (2.68)

So, this implies that

‖xn,i − xm,i‖i² < ε², ∀i. (2.69)

Thus,

‖xn,i − xm,i‖i < ε, ∀i, ∀n, m > N1, (2.70)

i.e., (xn,i) is a Cauchy sequence for every i, and (xn,i) is a sequence in Xi, which is complete. So ∃x0,i ∈ Xi such that xn,i −→ x0,i. Therefore,

xn −→ (x0,1, x0,2, . . . , x0,N ). (2.71)

We have shown that our product space is indeed complete and thus a Hilbert space.

Definition 2.27 (Diagonal space). Let N ∈ N, and let X be a Hilbert space. The diagonal space ∆ in X^N is the subspace

∆ = {(x, . . . , x) ∈ X^N | x ∈ X}. (2.72)

2.6 Lipschitz and Nonexpansivity

Definition 2.28 (Lipschitz). Let T : X → X. Then T is said to be α-Lipschitz continuous if

(∀x ∈ X)(∀y ∈ X) ‖Tx− Ty‖ ≤ α‖x− y‖.

Example 2.29 (The Identity). The Identity operator on X is 1-Lipschitz; this is very easy to show:

‖x− y‖ = ‖ Id(x)− Id(y)‖ ≤ 1 · ‖x− y‖.

Theorem 2.30. Let F1 : X → X be L1-Lipschitz and F2 : X → X be L2-Lipschitz. Then F1 + F2 is (L1 + L2)-Lipschitz.


Proof. We start with the definition of being Lipschitz continuous.

F1 is L1-Lipschitz ⇐⇒ ∀x ∀y ‖F1(x) − F1(y)‖ ≤ L1 · ‖x − y‖, (2.73)

F2 is L2-Lipschitz ⇐⇒ ∀x ∀y ‖F2(x) − F2(y)‖ ≤ L2 · ‖x − y‖. (2.74)

Hence,

‖(F1 + F2)(x) − (F1 + F2)(y)‖ = ‖(F1(x) + F2(x)) − (F1(y) + F2(y))‖ (2.75)

= ‖(F1(x) − F1(y)) + (F2(x) − F2(y))‖, (2.76)

which, by the triangle inequality, is

≤ ‖F1(x)− F1(y)‖+ ‖F2(x)− F2(y)‖. (2.77)

Now since F1 is L1-Lipschitz and F2 is L2-Lipschitz we get

≤ L1 ·‖x−y‖+L2 ·‖x−y‖ = ‖x−y‖·(L1 +L2) = (L1 +L2) ·‖x−y‖, (2.78)

as desired.

Definition 2.31 (Nonexpansivity). Let T : X → X. We say T is nonexpansive if

(∀x ∈ X)(∀y ∈ X) ‖Tx− Ty‖ ≤ ‖x− y‖.

Remark 2.32. We note that T being nonexpansive means it is 1-Lipschitz continuous. Therefore, the Identity is not only Lipschitz continuous but it is also nonexpansive.

Example 2.33 (The Right Shift Operator). The circular right shift operator R : X^n → X^n : (x1, . . . , xn) ↦ (xn, x1, . . . , xn−1) satisfies ‖Rx‖ = ‖x‖ so that, using linearity of R,

‖R(x− y)‖ = ‖Rx−Ry‖ = ‖x− y‖,

and hence R is nonexpansive.
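The norm preservation is easy to confirm numerically; here is a toy check in R⁴ (my own example data):

```python
# The circular right shift permutes coordinates, so it preserves the norm
# and satisfies ‖Rx − Ry‖ = ‖x − y‖.
import math

def R(x):
    # (x1, ..., xn) -> (xn, x1, ..., x_{n-1})
    return (x[-1],) + tuple(x[:-1])

def norm(x):
    return math.sqrt(sum(a * a for a in x))

x = (1.0, -2.0, 3.0, 0.5)
y = (0.0, 4.0, -1.0, 2.0)
d_shifted = norm(tuple(a - b for a, b in zip(R(x), R(y))))
d_plain = norm(tuple(a - b for a, b in zip(x, y)))
print(abs(norm(R(x)) - norm(x)) < 1e-12)  # True
print(abs(d_shifted - d_plain) < 1e-12)   # True
```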

Example 2.34 (Linear operator on ℓ2). Recall that an element x in ℓ2 is of the form x = (xn) = (x1, x2, . . .) such that

‖(xn)‖ = √(∑_{n=1}^∞ xn²) < +∞. (2.79)


We define a linear operator T : ℓ2 → ℓ2 by

Tx := (xn/n) = (x1, x2/2, x3/3, . . . , xn/n, . . .), ∀x = (xn) = (x1, x2, . . .) ∈ ℓ2. (2.80)

So we have (∀x = (xn) ∈ ℓ2)

‖Tx‖² = ∑_{i=1}^∞ xi²/i² ≤ ∑_{i=1}^∞ xi² = ‖x‖² (by (2.79)). (2.81)

Taking the square root on both sides,

‖Tx‖ ≤ ‖x‖, ∀x ∈ ℓ2. (2.82)

Now by linearity of T we have

‖Tx − Ty‖ = ‖T(x − y)‖ ≤ ‖x − y‖, ∀x, y ∈ ℓ2. (2.83)

Therefore, T is nonexpansive.

Definition 2.35 (Continuity). Let T : X → X. T is said to be continuous if for every sequence (xn) in X with xn → x, we have Txn → Tx.

Alternatively, given an arbitrary element x ∈ X, for all ε > 0 there exists δ > 0 such that ‖Tx − Ty‖ < ε whenever y ∈ X with ‖x − y‖ < δ.

Proposition 2.36. If F is L-Lipschitz, then F is continuous.

Proof. The result is clear when L = 0. So assume L > 0. Let c ∈ X. We will show that F is continuous at c. Given ε > 0, we need to find δ > 0 such that ‖F(x) − F(c)‖ < ε when ‖x − c‖ < δ. We now set δ := ε/L > 0. Then (∀x ∈ X),

‖x − c‖ < δ ⇒ ‖F(x) − F(c)‖ ≤ L‖x − c‖ < Lδ = ε. (2.84)

So ‖F(x) − F(c)‖ < ε as desired.

Theorem 2.37. The composition of nonexpansive operators is nonexpansive.

Proof. Let T and N be nonexpansive, and let x and y be in X. Then,

‖T (Nx)− T (Ny)‖ ≤ ‖Nx−Ny‖ ≤ ‖x− y‖. (2.85)


Definition 2.38 (Firmly nonexpansive). Let T : X → X. We say T is firmly nonexpansive if

(∀x ∈ X)(∀y ∈ X) ‖Tx − Ty‖² ≤ 〈x − y, Tx − Ty〉.

Theorem 2.39. Let T : X → X. Then the following statements are equivalent:

1. T is firmly nonexpansive.

2. ‖Tx − Ty‖² + ‖(x − Tx) − (y − Ty)‖² ≤ ‖x − y‖² ∀x, ∀y.

3. ‖(2Tx − x) − (2Ty − y)‖ ≤ ‖x − y‖ ∀x, ∀y.

4. Id−T is firmly nonexpansive

5. T = ½ N + ½ Id, where N is nonexpansive.

Proof. We rewrite the statements above by letting a = x − y, b = Tx − Ty. The following statements are equivalent:

1. ‖b‖² ≤ 〈a, b〉.

2. ‖b‖² + ‖a − b‖² ≤ ‖a‖².

3. ‖2b − a‖ ≤ ‖a‖.

4. Id−T is firmly nonexpansive.

5. T = ½ N + ½ Id, where N is nonexpansive.

2 ⇐⇒ ‖b‖² + ‖a − b‖² ≤ ‖a‖² (2.86)

⇐⇒ ‖b‖² + ‖a‖² − 2〈a, b〉 + ‖b‖² ≤ ‖a‖² (2.87)

⇐⇒ 2‖b‖² ≤ 2〈a, b〉 (2.88)

⇐⇒ ‖b‖² ≤ 〈a, b〉 (2.89)

⇐⇒ 1. (2.90)

3 ⇐⇒ ‖2b − a‖² ≤ ‖a‖² (2.91)

⇐⇒ 4‖b‖² − 4〈a, b〉 + ‖a‖² ≤ ‖a‖² (2.92)

⇐⇒ 4‖b‖² ≤ 4〈a, b〉 (2.93)

⇐⇒ ‖b‖² ≤ 〈a, b〉 (2.94)

⇐⇒ 1. (2.95)


4 ⇐⇒ ‖(Id − T)x − (Id − T)y‖² ≤ 〈x − y, (Id − T)x − (Id − T)y〉 (2.96)

⇐⇒ ‖a − b‖² ≤ 〈a, a − b〉 (2.97)

⇐⇒ ‖a‖² − 2〈a, b〉 + ‖b‖² ≤ 〈a, a〉 − 〈a, b〉 (2.98)

⇐⇒ ‖a‖² + ‖b‖² ≤ ‖a‖² − 〈a, b〉 + 2〈a, b〉 (2.99)

⇐⇒ ‖b‖² ≤ 〈a, b〉 (2.100)

⇐⇒ 1. (2.101)

Note that

T = ½ Id + ½ N ⇐⇒ N = 2T − Id. (2.102)

Hence,

3 ⇐⇒ N is nonexpansive (2.103)

⇐⇒ 5. (2.104)

The last equivalence is a tool to create new firmly nonexpansive mappings from nonexpansive mappings.
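A small numeric illustration of that tool (my own example, not from the thesis): take N to be rotation of R² by 90°, an isometry and hence nonexpansive, and form T = ½ Id + ½ N. The firm-nonexpansivity inequality of Definition 2.38 then holds at every sampled pair:

```python
# Build T = ½ Id + ½ N from a nonexpansive N (rotation by 90°) and check
# ‖Tx − Ty‖² ≤ 〈x − y, Tx − Ty〉 on a few sample pairs.

def N(v):
    # rotation by 90 degrees: an isometry, hence nonexpansive
    return (-v[1], v[0])

def T(v):
    n = N(v)
    return (0.5 * v[0] + 0.5 * n[0], 0.5 * v[1] + 0.5 * n[1])

def inner(u, w):
    return u[0] * w[0] + u[1] * w[1]

pairs = [((1.0, 2.0), (-3.0, 0.5)), ((0.0, 0.0), (4.0, -4.0)), ((2.5, 1.0), (1.0, 1.0))]
ok = True
for x, y in pairs:
    tx, ty = T(x), T(y)
    b = (tx[0] - ty[0], tx[1] - ty[1])  # Tx − Ty
    a = (x[0] - y[0], x[1] - y[1])      # x − y
    ok = ok and inner(b, b) <= inner(a, b) + 1e-12
print(ok)  # True
```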

Example 2.40. We define A on ℓ2 as follows:

A : (x1, x2, x3, . . .) ⟼ (x1, x2/2, x3/3, . . .). (2.105)

If we take A + Id we get

A + Id : (x1, x2, x3, . . .) ⟼ (2x1, (3/2)x2, (4/3)x3, . . .). (2.106)

We now define F to be the inverse,

F := (A + Id)⁻¹ : (ξ1, ξ2, ξ3, . . .) ⟼ ((1/2)ξ1, (2/3)ξ2, (3/4)ξ3, . . .). (2.107)

We will now show that F := (A + Id)⁻¹ is firmly nonexpansive. By (2.107), F is linear. We will show that

‖Fx‖² ≤ 〈Fx, x〉 ∀x, (2.108)


which, by linearity of F, is equivalent to the firm nonexpansivity of F. Let y = Fx. Then x = F⁻¹y = (A + Id)y and ‖Fx‖² = ‖y‖². Since 〈y, Ay〉 ≥ 0,

〈Fx, x〉 = 〈y, (A + Id)y〉 = 〈y, Ay〉 + ‖y‖² ≥ ‖y‖² = ‖Fx‖². (2.109)

We have shown that 〈Fx, x〉 ≥ ‖Fx‖² and therefore F is firmly nonexpansive.

Corollary 2.41. All firmly nonexpansive mappings are nonexpansive.

Proof. For T , a firmly nonexpansive mapping, we know that ∀x,∀y

‖T (x)− T (y)‖2 ≤ 〈T (x)− T (y), x− y〉 ≤ ‖Tx− Ty‖ · ‖x− y‖. (2.110)

We now consider two cases. Case 1: ‖T (x)− T (y)‖ = 0.

‖T (x)− T (y)‖ = 0 ≤ ‖x− y‖ (2.111)

as norms are always greater than or equal to zero. So we have

‖T (x)− T (y)‖ ≤ ‖x− y‖, (2.112)

therefore T is nonexpansive, as desired. Case 2: ‖T(x) − T(y)‖ > 0. Again we have

‖T (x)− T (y)‖2 ≤ ‖T (x)− T (y)‖ · ‖x− y‖. (2.113)

Divide both sides by ‖T (x)− T (y)‖ to obtain,

‖T (x)− T (y)‖ ≤ ‖x− y‖, (2.114)

as desired.

2.7 Projections

The main focus of this section is to provide an introduction to projections onto convex sets.

Definition 2.42. Let C ⊆ X, x ∈ X, and c∗ ∈ C. Then c∗ is the projection of x onto C if

(∀c ∈ C) ‖x− c∗‖ ≤ ‖x− c‖.


In other words, c∗ is a point of C closest to x.

The associated operator PC : X ⇒ C : x ↦ {c∗ ∈ C | c∗ is a projection of x onto C} is called the projection operator or projector. If PCx = {c∗} we also write c∗ = PCx rather than {c∗} = PCx, which is a slight abuse of notation.

Lemma 2.43. Let u and v be in X. Then the following holds:

〈u, v〉 ≤ 0⇐⇒ (∀α ∈ R+)‖u‖ ≤ ‖u− αv‖ ⇐⇒ (∀α ∈ [0, 1])‖u‖ ≤ ‖u− αv‖.

Proof. Observe that

(∀α ∈ R) ‖u− αv‖2 − ‖u‖2 = α(α‖v‖2 − 2〈u, v〉). (2.115)

Hence, the forward implications follow immediately. Conversely, if for every α ∈ ]0, 1], ‖u‖ ≤ ‖u − αv‖, then (2.115) implies that 〈u, v〉 ≤ α‖v‖²/2. As α ↓ 0, we obtain 〈u, v〉 ≤ 0.

Theorem 2.44 (The Projection Theorem). Let C be a nonempty closed convex subset of X. Then ∀x ∈ X there exists a unique c∗ ∈ C such that ‖x − c∗‖ = dC(x). Moreover, c∗ is characterized by

c∗ ∈ C (2.116)

and

(∀c ∈ C) 〈c− c∗, x− c∗〉 ≤ 0. (2.117)

Proof. Fix x ∈ X. Pick a sequence (cn) in C such that ‖x − cn‖ −→ dC(x). By the parallelogram law,

‖cn − cm‖² = 2‖cn − x‖² + 2‖cm − x‖² − ‖(cn + cm) − 2x‖² (2.118)

= 2‖cn − x‖² + 2‖cm − x‖² − 4‖(cn + cm)/2 − x‖² (2.119)

≤ 2‖cn − x‖² + 2‖cm − x‖² − 4(dC(x))², (2.120)

since (cn + cm)/2 ∈ C by convexity and hence dC(x) ≤ ‖x − (cn + cm)/2‖. Now, as n, m −→ ∞, the right-hand side


−→ 2(dC(x))² + 2(dC(x))² − 4(dC(x))² = 0. (2.121)

Therefore (cn) is Cauchy, and hence convergent, since X is a Hilbert space and thus complete:

cn −→ c∗ ∈ X. (2.122)

But (cn) lies in C and C is closed, so c∗ ∈ C as well. Now since cn −→ c∗, we also have x− cn −→ x− c∗ and thus ‖x− cn‖ −→ ‖x− c∗‖. On the other hand, ‖x− cn‖ → dC(x) by assumption. Since limits are unique, ‖x− c∗‖ = dC(x), so c∗ ∈ PCx. We have shown existence. Now we need to show uniqueness. Let c′ be another nearest point in PCx. We need only show that c∗ = c′. Set (c′n) = (c∗, c′, c∗, c′, . . .); then ‖x− c′n‖ = dC(x). By (2.119) we deduce that ‖c∗ − c′‖2 = 0, i.e., c∗ = c′.

Now for the characterization. Pick c ∈ C and set cα = αc+ (1− α)c∗ ∈ C, where α ∈ [0, 1]. Then, using Lemma 2.43 (with u = x− c∗ and v = c− c∗), we obtain

‖x− c∗‖ = dC(x) (2.123)

⇐⇒ (∀c ∈ C)(∀α ∈ [0, 1]) ‖x− c∗‖ ≤ ‖x− cα‖ = ‖(x− c∗)− α(c− c∗)‖ (2.124)

⇐⇒ (∀c ∈ C)〈x− c∗, c− c∗〉 ≤ 0. (2.125)

We now give a few examples of projections.

Example 2.45 (Hyperplane). (See [Deu01, page 99].) If a ∈ X \ {0}, β ∈ R and C = {x ∈ X | 〈a, x〉 = β}, then

PCx = x− ((〈a, x〉 − β)/‖a‖2) a. (2.126)

Proof. We shall use the characterization of the Projection Theorem. It is easily shown that the set C is closed and convex. Next we call our projection candidate c∗ and check that c∗ ∈ C:

〈a, x− ((〈a, x〉 − β)/‖a‖2) a〉 = 〈a, x〉 − ((〈a, x〉 − β)/‖a‖2)〈a, a〉 = β. (2.127)


Next we take c ∈ C, which means that 〈a, c〉 = β. Then, by simplifying and recalling that 〈a, c〉 = β and 〈a, a〉 = ‖a‖2,

〈c− (x− ((〈a, x〉 − β)/‖a‖2) a), ((〈a, x〉 − β)/‖a‖2) a〉 (2.128)

= ((〈a, x〉 − β)/‖a‖2)(〈c, a〉 − 〈x, a〉+ ((〈a, x〉 − β)/‖a‖2)〈a, a〉) (2.129)

= ((〈a, x〉 − β)/‖a‖2)(β − 〈x, a〉+ 〈a, x〉 − β) (2.130)

= ((〈a, x〉 − β)/‖a‖2)(0) (2.131)

= 0 (2.132)

≤ 0. (2.133)

Since the two conditions of the characterization are satisfied, our candidate c∗ is indeed the projection PCx.
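The closed form (2.126) is easy to try numerically. Below is a small Python sketch (our own illustration, not part of the thesis; the function names are ours):

```python
# Projection onto the hyperplane C = {x : <a, x> = beta}, following (2.126):
# P_C(x) = x - ((<a, x> - beta) / ||a||^2) a.

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def project_hyperplane(x, a, beta):
    t = (dot(a, x) - beta) / dot(a, a)   # scalar multiple of a to subtract
    return [xi - t * ai for xi, ai in zip(x, a)]

p = project_hyperplane([3.0, 1.0], [1.0, 1.0], 2.0)
print(p)                    # [2.0, 0.0]
print(dot([1.0, 1.0], p))   # 2.0, so p satisfies <a, p> = beta
```

Note that the projected point always satisfies the defining equation of C exactly, as the proof above checks.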

Example 2.46 (The unit ball). We provide the projection onto the unit ball, C := B(0; 1) = {x | ‖x‖ ≤ 1}:

PCx = x/max{‖x‖, 1}, ∀x ∈ X. (2.134)

Proof. C is closed and convex, so the projection is a singleton by Theorem 2.44. We denote by c∗ our projection candidate in (2.134). To prove this is the projection we need to show two things:

1. c∗ ∈ C

2. 〈c− c∗, x− c∗〉 ≤ 0 ∀c ∈ C, ∀x ∈ X

Case 1: Assume that x ∈ C. This means that x is in the unit ball, so max{‖x‖, 1} = 1 and hence c∗ = x.

1. Trivial by assumptions.

2. 〈c− c∗, x− c∗〉 = 〈c− x, x− x〉 = 〈c− c∗, 0〉 = 0 ≤ 0.

Case 2: Assume that x /∈ C.

1. x /∈ C implies ‖x‖ > 1 and c∗ = x/‖x‖ by definition. We also have that ‖c∗‖ = 1, so c∗ ∈ C.

2. For every c ∈ C we have ‖c‖ ≤ 1 by definition, and


〈c− c∗, x− c∗〉 = 〈c− x/‖x‖, x− x/‖x‖〉 (2.135)

= 〈c− x/‖x‖, (1− 1/‖x‖)x〉 = 〈c, (1− 1/‖x‖)x〉 − 〈x/‖x‖, (1− 1/‖x‖)x〉. (2.136)

Now since

((1− 1/‖x‖)/‖x‖)〈x, x〉 = ((1− 1/‖x‖)/‖x‖)‖x‖2 = ‖x‖(1− 1/‖x‖) = ‖x‖ − 1 > 0, (2.137)

we have that

〈c− c∗, x− c∗〉 (2.138)

= 〈c, (1− 1/‖x‖)x〉 − ‖x‖(1− 1/‖x‖) (2.139)

≤ ‖c‖‖x‖(1− 1/‖x‖)− ‖x‖(1− 1/‖x‖) (2.140)

by the Cauchy-Schwarz inequality, and this equals

(‖x‖ − 1)(‖c‖ − 1) ≤ 0, (2.141)

as ‖x‖ − 1 > 0 and ‖c‖ − 1 ≤ 0.
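Formula (2.134) translates directly into code. A minimal Python sketch (our own illustration; the function name is ours):

```python
# Projection onto the closed unit ball, following (2.134):
# P_C(x) = x / max(||x||, 1). Points inside the ball are left unchanged;
# points outside are radially scaled onto the boundary.
import math

def project_unit_ball(x):
    norm = math.sqrt(sum(xi * xi for xi in x))
    scale = max(norm, 1.0)
    return [xi / scale for xi in x]

print(project_unit_ball([0.3, 0.4]))  # [0.3, 0.4] -- already inside
print(project_unit_ball([3.0, 4.0]))  # [0.6, 0.8] -- scaled to norm 1
```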

Example 2.47 (The Product Space). We show that the projection onto the product set C := C1 × C2 × . . .× Cn is:

PC(x1, x2, . . . , xn) = (PC1(x1), PC2(x2), . . . , PCn(xn)). (2.142)

Proof. Clearly, PCi(xi) ∈ Ci. Then (PC1(x1), PC2(x2), . . . , PCn(xn)) ∈ C1 × C2 × · · · × Cn = C. Now for every c ∈ C, write c = (c1, c2, . . . , cn) with ci ∈ Ci. We will show that

〈c− (PC1(x1), . . . , PCn(xn)), (x1, x2, . . . , xn)− (PC1(x1), . . . , PCn(xn))〉 ≤ 0. (2.143)


Now,

〈(c1, . . . , cn)− (PC1(x1), . . . , PCn(xn)), (x1, . . . , xn)− (PC1(x1), . . . , PCn(xn))〉 (2.144)

= 〈(c1 − PC1(x1), . . . , cn − PCn(xn)), (x1 − PC1(x1), . . . , xn − PCn(xn))〉(2.145)

= 〈c1 − PC1(x1), x1 − PC1(x1)〉+ . . .+ 〈cn − PCn(xn), xn − PCn(xn)〉 ≤ 0,(2.146)

because every term above is less than or equal to 0 by Theorem 2.44.

Example 2.48 (The Diagonal Set). We show that the projection onto the diagonal set C := {(x, x, . . . , x) ∈ Xn} is:

PC(x1, x2, . . . , xn) = (x̄, x̄, . . . , x̄), (2.147)

where x̄ = (1/n)∑xi.

Proof. C is a closed and convex set. Clearly, (x̄, . . . , x̄) ∈ C. Let c ∈ C, say c = (x, x, . . . , x) for some x ∈ X. Then

〈c− (x̄, . . . , x̄), (x1, x2, . . . , xn)− (x̄, x̄, . . . , x̄)〉 (2.148)

= 〈(x, x, . . . , x)− (x̄, . . . , x̄), (x1, x2, . . . , xn)− (x̄, x̄, . . . , x̄)〉 (2.149)

= 〈(x− x̄, . . . , x− x̄), (x1 − x̄, . . . , xn − x̄)〉 (2.150)

= 〈x− x̄, x1 − x̄〉+ . . .+ 〈x− x̄, xn − x̄〉 (2.151)

= 〈x− x̄, (x1 − x̄) + . . .+ (xn − x̄)〉 (2.152)

= 〈x− x̄, (x1 + . . .+ xn)− nx̄〉 (2.153)

= 〈x− x̄, nx̄− nx̄〉 (2.154)

= 〈x− x̄, 0〉 (2.155)

= 0 (2.156)

≤ 0. (2.157)

The conclusion follows from the Projection theorem.
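The projection (2.147) is just coordinate-wise averaging, as the following small Python sketch (our own illustration) shows:

```python
# Projection onto the diagonal set C = {(x, ..., x)}, following (2.147):
# every coordinate is replaced by the average (1/n) * sum(x_i).

def project_diagonal(xs):
    avg = sum(xs) / len(xs)
    return [avg] * len(xs)

print(project_diagonal([1.0, 2.0, 6.0]))  # [3.0, 3.0, 3.0]
```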

2.8 Properties of the Projector

Theorem 2.49 (Projections are firmly nonexpansive). Suppose C is a nonempty closed convex set in the Hilbert space X. Then the projection PC onto C is firmly nonexpansive:

‖PCx− PCy‖2 ≤ 〈x− y, PCx− PCy〉 (2.158)

for all x, y ∈ X.

Proof. By (2.117), we have that

‖PCx− PCy‖2 = 〈PCx− PCy, PCx− PCy〉

≤ 〈PCx− PCy, PCx− PCy〉+ 〈x− PCx, PCx− PCy〉 (2.159)

= 〈x− PCy, PCx− PCy〉 (2.160)

≤ 〈x− PCy, PCx− PCy〉+ 〈PCy − y, PCx− PCy〉 (2.161)

= 〈x− y, PCx− PCy〉.

So PC is firmly nonexpansive as desired.

Corollary 2.50. Suppose C is a nonempty closed convex set in the Hilbertspace X. Then PC is nonexpansive.

Proof. PC is firmly nonexpansive and therefore, by Corollary 2.41, nonexpansive.

Remark 2.51. Since PC is nonexpansive, it is 1-Lipschitz and hence continuous.

Remark 2.52. Since projection operators are nonexpansive, the composition of projection operators is nonexpansive.

2.9 The Reflector

Definition 2.53. Let B be a subset of X. The reflector is defined asRB := 2PB − Id.

Proposition 2.54. Let B ⊆ X be a nonempty closed convex set. Then thereflector RB is nonexpansive.

Proof. Recall Theorem 2.39 and Theorem 2.49: PB = (2PB − Id + Id)/2 = (RB + Id)/2 is firmly nonexpansive. This means that RB is nonexpansive if and only if PB is firmly nonexpansive. The good news is that all projections are firmly nonexpansive by Theorem 2.49.

Proposition 2.55. The composition of reflectors is nonexpansive.


Proof. We proved in Theorem 2.37 that the composition of nonexpansive operators is nonexpansive. Since each reflector is nonexpansive by Proposition 2.54, we are done.

2.10 Miscellaneous

Theorem 2.56. We claim that if xn → x, then (xn) is bounded.

Proof. For ε = 1, ∃N such that ∀n ≥ N we have

‖xn − x‖ < ε = 1, (2.162)

by the definition of a limit. By the triangle inequality,

‖xn‖ − ‖x‖ ≤ ‖xn − x‖ ≤ 1 ∀n ≥ N. (2.163)

Thus,

‖xn‖ ≤ ‖x‖+ 1, ∀n ≥ N. (2.164)

Next we define M to be max{‖x‖+ 1, ‖x1‖, ‖x2‖, ..., ‖xN−1‖}. Thus

‖xn‖ ≤M, ∀n. (2.165)

Theorem 2.57. We claim that if (xn) is bounded and B is a nonempty closed convex set, then (PB(xn)) is also bounded.

Proof. Fix y ∈ X. Since PB is firmly nonexpansive by Theorem 2.49, and thus nonexpansive by Corollary 2.41, we have

‖PB(xn)− PB(y)‖ ≤ ‖xn − y‖ ≤ ‖xn‖+ ‖y‖. (2.166)

Since (xn) is bounded, we have

‖xn‖+ ‖y‖ ≤M + ‖y‖ (2.167)

for some M > 0. By the triangle inequality,

‖PB(xn)‖ − ‖PB(y)‖ ≤ ‖PB(xn)− PB(y)‖ ≤M + ‖y‖. (2.168)

Therefore, ‖PB(xn)‖ ≤ ‖PBy‖+M + ‖y‖, and (PB(xn)) is bounded.

Definition 2.58 (Weakly convergent sequence). We say xn ⇀ x ⇐⇒ 〈xn, z〉 −→ 〈x, z〉, ∀z ∈ X.

Definition 2.59 (Weak cluster point). A vector z is a weak cluster pointof a sequence (xn) if there is a subsequence (xkn) of (xn) such that xkn ⇀ z.


Chapter 3

Projection Methods

In this chapter we show that Elser's Difference Map algorithm is equivalent to the Douglas-Rachford and Fienup Hybrid Input-Output (HIO) algorithms. Throughout this chapter, A and B are nonempty, fixed, closed, convex sets in a real Hilbert space X, and PA, PB, RA, and RB are the associated projections and reflectors onto the sets A and B.

3.1 Elser’s Difference Map

Definition 3.1. Elser’s Difference Map [ERT07, page 419] is defined by

D(x) = x+ β[PA ◦ gB(x)− PB ◦ fA(x)] (3.1)

where

fA(x) = PA(x)− (PA(x)− x)/β (3.2)

and

gB(x) = PB(x) + (PB(x)− x)/β, (3.3)

where β is a real parameter that is not equal to zero.

Proposition 3.2. Elser's Difference Map, for β = 1, can be written in the form

D = PA(2PB − Id) + (Id−PB). (3.4)

This is the same as Fienup’s HIO [Fie82, page 2763] and the Douglas-Rachford algorithm [LM79].

Proof. The Difference Map, when β = 1, becomes

D(x) = x+ [PA ◦ gB(x)− PB ◦ fA(x)] (3.5)

where

fA(x) = PA(x)− (PA(x)− x) = x (3.6)

and

gB(x) = PB(x) + (PB(x)− x) = 2PB(x)− x. (3.7)

We now plug into (3.5), giving:

D(x) = x+ PA(2PB(x)− x)− PB(x). (3.8)

Hence

D = PA(2PB − Id) + (Id−PB), (3.9)

as desired.

Lemma 3.3. If A is a closed subspace (in which case PA is linear by [Deu01, 5.13(1), page 79]), then we can write D = PA(2PB − Id) + (Id−PB) in a more symmetric fashion:

D = PAPB + (Id−PA)(Id−PB). (3.10)

Proof. Indeed,

D = PA(2PB − Id) + (Id−PB) (3.11)

= 2PAPB − PA + Id−PB (2PAPB = PAPB + PAPB) (3.12)

= PAPB + PAPB − PA + Id−PB (3.13)

= PAPB + PA(PB − Id) + (Id−PB) (3.14)

= PAPB − PA(Id−PB) + (Id−PB) (3.15)

= PAPB + (−PA + Id)(Id−PB) (3.16)

= PAPB + (Id−PA)(Id−PB) (3.17)

as desired.

3.2 Douglas-Rachford and Fienup HIO

Proposition 3.4. Fienup's Hybrid Input-Output algorithm and the Douglas-Rachford algorithm are described by

xn+1 = (PA(2PB − Id) + (Id−PB))(xn). (3.18)

For brevity, we set

T = PA(2PB − Id) + (Id−PB). (3.19)

Then

T = (1/2)(RARB + Id), (3.20)

where RA = 2PA − Id and RB = 2PB − Id.


Proof. Indeed,

T = PA(2PB − Id) + (Id−PB) (3.21)

= PARB + Id−PB (2PB − Id replaced with RB) (3.22)

= (1/2)(2PARB + 2 Id−2PB) (factor out 1/2) (3.23)

= (1/2)(2PARB + Id + Id−2PB) (write 2 Id as Id + Id) (3.24)

= (1/2)(2PARB + Id−RB) (2PB − Id replaced with RB) (3.25)

= (1/2)(2PARB −RB + Id) (rearrange) (3.26)

= (1/2)((2PA − Id)RB + Id) (factor out RB) (3.27)

= (1/2)(RARB + Id), (3.28)

as desired.
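The identity just proved can be sanity-checked numerically. The sketch below (our own illustration; the two sets and all function names are ours) evaluates both forms of T for concrete projectors in R2, with A the x-axis and B the diagonal:

```python
# Numerical check of Proposition 3.4 for two concrete linear projectors in R^2:
# we verify that P_A(2P_B - Id) + (Id - P_B) agrees with (R_A R_B + Id)/2.

def P_A(v):                      # projection onto the x-axis
    return (v[0], 0.0)

def P_B(v):                      # projection onto the diagonal {(t, t)}
    m = (v[0] + v[1]) / 2.0
    return (m, m)

def reflect(P, v):               # R = 2P - Id
    p = P(v)
    return (2*p[0] - v[0], 2*p[1] - v[1])

def T_form1(v):                  # P_A(2P_B - Id) + (Id - P_B)
    p = P_A(reflect(P_B, v))
    q = P_B(v)
    return (p[0] + v[0] - q[0], p[1] + v[1] - q[1])

def T_form2(v):                  # (R_A R_B + Id)/2
    r = reflect(P_A, reflect(P_B, v))
    return ((r[0] + v[0]) / 2.0, (r[1] + v[1]) / 2.0)

for v in [(1.0, 2.0), (-3.0, 0.5), (0.0, 0.0)]:
    assert T_form1(v) == T_form2(v)
print("both forms agree")
```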

3.3 Fixed Point Results

Theorem 3.5. Set T = PA(2PB − Id) + (Id−PB). Then

PB(FixT ) = A ∩B ⊆ FixT (3.29)

Proof. Fix x ∈ X, and write x = b+ q, where b = PB(x) and hence q = x− b. Then

x = T (x) definition of a fixed point (3.30)

⇐⇒ x = PA(2PB − Id)(x) + (Id−PB)(x) (3.31)

⇐⇒ b+ q = PA(2b− (b+ q)) + b+ q − b substitution (3.32)

⇐⇒ b+ q = PA(2b− b− q) + q simplify (3.33)

⇐⇒ b = PA(b− q) recall b = PB(x) (3.34)

⇐⇒ PB(x) = PA(PB(x)− (x− PB(x))) substitution (3.35)

⇐⇒ PB(x) = PA(PB(x)− x+ PB(x)) simplify (3.36)

⇐⇒ PB(x) = PA(2PB(x)− x) simplify (3.37)

⇐⇒ PB(x) = PA(RB(x)). (3.38)

Altogether this means that x = T(x) ⇐⇒ PB(x) = PA(RB(x)); here PB(x) ∈ B, and in this case PB(x) = PA(RB(x)) ∈ A. So when x ∈ FixT , we have PB(x) ∈ B ∩A, and


since x = T (x) and x was an arbitrary fixed point we have

PB(FixT ) ⊆ A ∩B. (3.39)

We now prove that A ∩B ⊆ FixT . For every x ∈ A ∩B we have PA(x) = x and PB(x) = x, so

T (x) = PA(2PB − Id)(x) + (Id−PB)(x) expand (3.40)

= PA(2PB(x)− x) + (x− PB(x)) (3.41)

= PA(2x− x) + (x− x) (3.42)

= PA(x) (3.43)

= x. (3.44)

Hence

A ∩B ⊆ FixT. (3.45)

Now apply PB to (3.45) to deduce that

A ∩B = PB(A ∩B) ⊆ PB(FixT ). (3.46)

Combining (3.39) and (3.46), we see that

PB(FixT ) = A ∩B. (3.47)

3.4 The Main Result

We next present the main convergence result. We show what happens in a finite-dimensional Hilbert space and then give an example showing why the proof fails when we try to extend it to infinite dimensions.

Throughout the remainder of this chapter, we set (see Proposition 3.4)

T = (1/2)(RARB + Id) = PA(2PB − Id) + (Id−PB). (3.48)

Fact 3.6 (Opial). [Opi67] Suppose that T is a firmly nonexpansive mapping from X to X with FixT ≠ ∅. Then for every x ∈ X, the sequence (Tnx) weakly converges to some point in FixT .


Theorem 3.7. Suppose that A ∩B ≠ ∅. Then T is firmly nonexpansive and the sequence (xn) generated by

xn+1 = (1/2)(RARB + Id)(xn) = T(xn) (3.49)

converges weakly to some point x ∈ FixT , and PB(x) ∈ A ∩B. Moreover, the sequence (PB(xn)) is bounded and every weak cluster point of (PB(xn)) lies in A ∩B.

Proof. FixT contains A ∩B ≠ ∅ by Theorem 3.5. The sequence

(xn) = (Tn(x0)) (3.50)

converges weakly to some fixed point x of T by Fact 3.6. Since (xn) is bounded, it follows that (PB(xn)) is bounded by Theorem 2.57. RARB is nonexpansive by Proposition 2.54 and Theorem 2.37. This implies that T = (Id +RARB)/2 is firmly nonexpansive by Theorem 2.39. Thus,

‖T (x)−T (y)‖2+‖(Id−T )(x)−(Id−T )(y)‖2 ≤ ‖x−y‖2 ∀x, y ∈ X. (3.51)

Apply this with x = xn and y = x, the fixed point above. Then we get

‖xn − x‖2 ≥ ‖T (xn)− T (x)‖2 + ‖xn − T (xn)− x+ T (x)‖2 (3.52)

= ‖xn+1 − x‖2 + ‖xn − xn+1 − x+ x‖2 (3.53)

= ‖xn+1 − x‖2 + ‖xn − xn+1‖2 (3.54)

So

∑_{n=1}^{m} (‖xn − x‖2 − ‖xn+1 − x‖2) ≥ ∑_{n=1}^{m} ‖xn − xn+1‖2. (3.55)

The summation on the left telescopes to

‖x1 − x‖2 − ‖xm+1 − x‖2, (3.56)

which is less than or equal to ‖x1 − x‖2 < +∞. Hence, for every m,

∑_{n=1}^{m} ‖xn − xn+1‖2 ≤ ‖x1 − x‖2 < +∞; (3.57)

since ‖x1 − x‖2 is a constant, we have

∑_{n=1}^{∞} ‖xn − xn+1‖2 ≤ ‖x1 − x‖2 < +∞. (3.58)


Hence ‖xn − xn+1‖2 → 0, thus ‖xn − xn+1‖ → 0, i.e.,

xn − xn+1 → 0. (3.59)

It follows that

xn − xn+1 = xn − PA(2PB − Id)xn − (Id−PB)xn (3.60)

= PBxn − PA(2PB − Id)xn → 0. (3.61)

In infinite dimensions, it is a fact that if (PBxn)n∈N is bounded then it has weak cluster points, by [Deu01, page 205, 9.12] and Definition 2.59; say

PBxkn ⇀ z. (3.62)

Now (PBxn)n∈N lies in B, and B is weakly closed by [Rud91, page 66, Theorem 3.12, corollaries (a)] (B is weakly closed since B is closed and convex), so its weak limit z ∈ B. We have

PBxn = PBxn − PA(2PB − Id)xn + PA(2PB − Id)xn, (3.63)

recall that PBxn − PA(2PB − Id)xn → 0. Thus

PA(2PB − Id)xkn ⇀ z. (3.64)

Since A is also weakly closed, we have that z ∈ A. Altogether, z ∈ A ∩B.

The statement of Theorem 3.7 is even simpler in finite dimensions.

Theorem 3.8. Suppose that X is finite-dimensional and A ∩B ≠ ∅. Then the sequence (xn) generated by

xn+1 = Txn, where T := (1/2)(RARB + Id), (3.65)

converges to some point x ∈ FixT , and PB(x) ∈ A ∩B. Moreover,

PB(xn)→ PB(x) ∈ A ∩B. (3.66)

Proof. Note that T is firmly nonexpansive, as shown in the proof of Theorem 3.7, so by Fact 3.6 the iterates converge weakly; in finite dimensions, weak convergence coincides with norm convergence. So

xn → x = Tx =⇒ PBxn → PBx (3.67)

as PB is continuous. We know PBx ∈ PB(FixT ) = A ∩B by Theorem 3.5.
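Theorem 3.8 can be watched in action on a toy problem. The following Python sketch (our own illustration; the two lines are our choice, not from the thesis) iterates T with A the x-axis and B the diagonal in R2, whose intersection is {(0, 0)}:

```python
# A tiny Douglas-Rachford run in R^2: iterate x_{n+1} = T(x_n) with
# T = (R_A R_B + Id)/2 and watch the shadow sequence P_B(x_n) approach A ∩ B.

def P_A(v):                      # projection onto the x-axis
    return (v[0], 0.0)

def P_B(v):                      # projection onto the diagonal {(t, t)}
    m = (v[0] + v[1]) / 2.0
    return (m, m)

def reflect(P, v):               # R = 2P - Id
    p = P(v)
    return (2*p[0] - v[0], 2*p[1] - v[1])

def T(v):
    r = reflect(P_A, reflect(P_B, v))
    return ((r[0] + v[0]) / 2.0, (r[1] + v[1]) / 2.0)

x = (5.0, -3.0)
for _ in range(60):
    x = T(x)
shadow = P_B(x)
print(shadow)   # very close to (0.0, 0.0), the unique point of A ∩ B
```

Here T is linear with spectral radius below 1, so the iterates contract to the fixed point geometrically; for general convex sets only the weak convergence of Theorem 3.7 is guaranteed.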


Remark 3.9. The proof of Theorem 3.8 does not work in infinite dimensions because PB may fail to be weakly continuous.

Zarantonello [Zar71, page 245] showed that xn ⇀ x does not imply PBxn ⇀ PBx. (3.68)

Let B be the unit ball in ℓ2. Then let

xn = e1 + en ⇀ e1 = PBe1, as e1 ∈ B. (3.69)

It is a fact that if we are outside the unit ball the closed-form projection is x/‖x‖, by Example 2.46. So if n ≥ 2, then

PBxn = (e1 + en)/‖e1 + en‖ = (e1 + en)/√2 ⇀ e1/√2. (3.70)

So on one hand we have

xn ⇀ e1, (3.71)

but on the other we have

PBxn ⇀ e1/√2 ≠ e1 = PBe1. (3.72)

Altogether, PB is not weakly continuous.


Chapter 4

Monotone Operators

Throughout this chapter, X is a finite-dimensional real Hilbert space with inner product 〈·, ·〉 and induced norm ‖ · ‖.

C and D are two closed convex sets in X such that C ∩D ≠ ∅.

Recall that if A : X → X is linear, then A is automatically continuous (see, e.g., [Kre78, Page 94]).

4.1 Definitions

We first introduce some fundamental definitions. The arrow "→" is used for a single-valued mapping, whereas "⇒" denotes a set-valued mapping, i.e. X → P(X), the power set of X.

Definition 4.1. Let T : X ⇒ X. We say T is monotone if

〈x∗ − y∗, x− y〉 ≥ 0, (4.1)

whenever x∗ ∈ T (x), y∗ ∈ T (y).

Proposition 4.2. Let A : X → X be linear. Then A is monotone if and only if

〈x,Ax〉 ≥ 0 ∀x ∈ X. (4.2)

Proof. “⇒” For every x ∈ X, by monotonicity and linearity of A we have

〈Ax, x〉 = 〈Ax−A0, x− 0〉 ≥ 0. (4.3)

(4.3) holds since A0 = 0 (as A is linear).

"⇐": For every x, y ∈ X, since 〈x,Ax〉 ≥ 0, we have

〈Ax−Ay, x− y〉 = 〈A(x− y), x− y〉 ≥ 0. (4.4)


Definition 4.3 (Adjoint operator). Let A : X → X be continuous and linear. We say A∗ is the adjoint operator of A if

〈x,Ay〉 = 〈A∗x, y〉 ∀x, y ∈ X. (4.5)

Remark 4.4. We know A∗ exists and is unique, see [Kre78, Theorem 3.9-2].

Remark 4.5. We know A∗∗ = A, see [Deu01, Lemma 8.29 (3)].

Remark 4.6. If A ∈ Rn×n then A∗ = AT , see [Mey00, Page 84].

Definition 4.7 (Symmetric). Let A : X → X be linear. We say that A issymmetric if A∗ = A.

Definition 4.8 (Symmetric Part). Let A : X → X be linear. We say A+ is the symmetric part of A, defined as follows:

A+ = (A+A∗)/2. (4.6)

Proposition 4.9. Let A : X → X be linear. Then A+ is symmetric.

Proof. A∗+ = ((A+A∗)/2)∗ = (A∗ +A∗∗)/2 = (A∗ +A)/2 = A+.

Definition 4.10 (Antisymmetric). Let A : X → X be linear. We say thatA is antisymmetric if A∗ = −A.

Definition 4.11 (Antisymmetric Part). Let A : X → X be linear. We say A0 is the antisymmetric part of A, defined as follows:

A0 = (A−A∗)/2. (4.7)

Proposition 4.12. Let A : X → X be linear. Then A0 is antisymmetric.

Proof. A∗0 = ((A−A∗)/2)∗ = (A∗ −A∗∗)/2 = (A∗ −A)/2 = −(A−A∗)/2 = −A0.

Lemma 4.13. Let A : X → X be linear. Then A is monotone ⇔ A∗ is monotone.

Proof. By Proposition 4.2 we have that,

A is monotone ⇐⇒ 〈x,Ax〉 ≥ 0 ∀x (4.8)

⇐⇒ 〈Ax, x〉 ≥ 0 ∀x (4.9)

⇐⇒ 〈x,A∗x〉 ≥ 0 ∀x (4.10)

⇐⇒ A∗ is monotone. (4.11)


Definition 4.14. Let A : X → X be linear. We set

qA : X → R : x ↦ (1/2)〈x,Ax〉. (4.12)

Proposition 4.15. Let A : X → X be linear. Then ∇qA = A+.

Proof. For x ∈ X and h ∈ X \ {0}, we have

(qA(x+ h)− qA(x)− 〈A+x, h〉)/‖h‖ (4.13)

= (qA(x) + qA(h) + (1/2)〈x,Ah〉+ (1/2)〈Ax, h〉 − qA(x)− 〈A+x, h〉)/‖h‖ (4.14)

= (qA(h) + (1/2)〈A∗x, h〉+ (1/2)〈Ax, h〉 − 〈A+x, h〉)/‖h‖ (4.15)

= (qA(h) + 〈A+x, h〉 − 〈A+x, h〉)/‖h‖ (4.16)

= qA(h)/‖h‖ (4.17)

= (1/2)〈h,Ah〉/‖h‖. (4.18)

Since A is continuous, we have, for M = ‖A‖,

‖Ah‖ ≤M‖h‖, ∀h ∈ X. (4.19)

Then, by Cauchy-Schwarz we get

0 ≤ |(qA(x+ h)− qA(x)− 〈A+x, h〉)/‖h‖| (4.20)

= |(1/2)〈h,Ah〉/‖h‖| (4.21)

≤ ((1/2)‖h‖ ·M · ‖h‖)/‖h‖ = (1/2)M‖h‖ → 0 as ‖h‖ → 0+. (4.22)

By the Squeeze theorem we get

lim_{‖h‖→0+} (qA(x+ h)− qA(x)− 〈A+x, h〉)/‖h‖ = 0, (4.23)

i.e., ∇qA = A+. So we have proven that qA is Fréchet differentiable at x with ∇qA(x) = A+x.


Proposition 4.16. Let A : X → X be linear. Then A is monotone if and only if qA ≥ 0.

Proof. See Proposition 4.2.

Proposition 4.17. Let A : X → X be linear. Then qA = qA+.

Proof. Let x ∈ X. Then

2qA+(x) = 〈A+x, x〉 = 〈((A∗ +A)/2)x, x〉 = 〈(A∗x)/2, x〉+ 〈(Ax)/2, x〉 = 〈x/2, Ax〉+ 〈(Ax)/2, x〉 = 〈Ax, x〉 = 2qA(x).

Proposition 4.18. Let A : X → X be linear. Then A is monotone if and only if A+ is monotone.

Proof. By Proposition 4.2 we have that

A is monotone ⇐⇒ qA ≥ 0 by Proposition 4.16 (4.24)

⇐⇒ qA+ ≥ 0 by Proposition 4.17 (4.25)

⇐⇒ A+ is monotone by Proposition 4.16 . (4.26)

Definition 4.19. Let T : X ⇒ X be monotone. We call T maximal monotone if for every (y, y∗) /∈ graT there exists (x, x∗) ∈ graT with 〈x− y, x∗ − y∗〉 < 0.

Fact 4.20. Let A : X ⇒ X be maximal monotone and (x0, x∗0) ∈ X ×X. Let Ã : X ⇒ X be such that gra Ã = graA− (x0, x∗0) (i.e., a rigid translation of graA). Then Ã is maximal monotone.

Proof. Follows directly from Definition 4.19.

Lemma 4.21. Every continuous monotone operator A : X → X is maximal monotone.

Proof. [RW98, Example 12.7].


Definition 4.22 (Subdifferential). Let f : X → ]−∞,+∞] and let x ∈ X be such that f(x) < +∞. Then the subdifferential of f at x is

∂f(x) = {s ∈ X | f(y) ≥ f(x) + 〈s, y − x〉, ∀y ∈ X}. (4.27)

The elements in ∂f(x) are called subgradients of f at x.

See [Roc97, Part V] for basic properties of the subdifferential operator.

Fact 4.23. ∂f(x) = ∇f(x) if f is Gateaux differentiable, see [Zal02, Theo-rem 2.4.4(i)].

Example 4.24. Let X = R and suppose f(x) = |x|. Then

∂f(x) = {−1} if x < 0; [−1,+1] if x = 0; {1} if x > 0. (4.28)

Proof. Let x < 0 and s ∈ ∂f(x). We have

〈s, y − x〉 ≤ f(y)− f(x) = |y| − |x|. (4.29)

Let z ∈ X and y = x+ tz, where t > 0. By (4.29),

〈s, tz〉 ≤ |x+ tz| − |x| = |x+ tz| − (−x). (4.30)

Since x < 0, x+ tz < 0 when t is small. Thus by (4.30),

〈s, tz〉 ≤ −(x+ tz)− (−x) = −tz (4.31)

so

〈s, z〉 ≤ −z. (4.32)

Plugging in z = 1 and z = −1 gives s ≤ −1 and s ≥ −1, so s = −1. This shows that when x < 0 the subdifferential is the set {−1}. Now let x = 0 and s ∈ ∂f(x) = ∂f(0). We have

〈s, y − 0〉 ≤ f(y)− f(0) = |y| ∀y ⇐⇒ 〈s, y〉 ≤ |y| ∀y (4.33)

⇐⇒ 〈s, y〉 ≤ |y| ∀y ≠ 0 (4.34)

⇐⇒ 〈s, y/|y|〉 ≤ 1 ∀y ≠ 0 (4.35)

⇐⇒ max{〈s, 1〉, 〈s,−1〉} ≤ 1 (4.36)

⇐⇒ max{s,−s} ≤ 1 (4.37)

⇐⇒ |s| ≤ 1 (4.38)

⇐⇒ s ∈ [−1, 1]. (4.39)


This shows that when x = 0 the subdifferential is the set [−1, 1]. Finally, let x > 0 and s ∈ ∂f(x). We have

〈s, y − x〉 ≤ f(y)− f(x) = |y| − |x|. (4.40)

Let z ∈ X and y = x+ tz, where t > 0. By (4.40),

〈s, tz〉 ≤ |x+ tz| − |x| = |x+ tz| − x. (4.41)

Since x > 0, x+ tz > 0 when t is small. Thus by (4.41),

〈s, tz〉 ≤ (x+ tz)− x = tz, (4.42)

so

〈s, z〉 ≤ z. (4.43)

Plugging in z = 1 and z = −1 gives s ≤ 1 and s ≥ 1, so s = 1. This shows that when x > 0 the subdifferential is the set {1}.
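The computed subdifferential (4.28) can be sanity-checked against Definition 4.22 at sample points. The following small Python sketch (our own illustration, not part of the thesis) does exactly that:

```python
# Checking the subgradient inequality f(y) >= f(x) + s*(y - x) for f(x) = |x|,
# with subgradients taken from (4.28). A spot check at sample points, not a proof.

def subdiff_abs(x):
    if x < 0:
        return [-1.0]
    if x > 0:
        return [1.0]
    return [-1.0, 0.0, 1.0]   # a few sample subgradients from [-1, 1]

for x in [-2.0, 0.0, 3.0]:
    for s in subdiff_abs(x):
        for y in [-5.0, -0.5, 0.0, 1.0, 4.0]:
            assert abs(y) >= abs(x) + s * (y - x)
print("subgradient inequality holds at all sampled points")
```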

Fact 4.25 (Rockafellar). Let f : X → ]−∞,+∞] be proper, lower semicontinuous, and convex. Then ∂f is maximal monotone.

Proof. See [Sim98, Page 113] or [Roc70].

Definition 4.26 (Positive-semidefinite matrix). The matrix A ∈ Rn×n is positive semidefinite, written A ⪰ 0, if A is symmetric, i.e. A = AT, and

〈x,Ax〉 ≥ 0 ∀x. (4.44)

Fact 4.27. In the 2× 2 case it can be checked that

( a b ; b c ) ⪰ 0 ⇐⇒ a ≥ 0, c ≥ 0, ac− b2 ≥ 0, (4.45)

see [Mey00, Page 566].

Definition 4.28 (Resolvent). Let M be maximal monotone and JM = (Id +M)−1. Then JM is called the resolvent of M.

Fact 4.29 (Minty). Let T : X → X. Then

T is firmly nonexpansive ⇐⇒ T = (Id +M)−1, where M is maximal monotone, (4.46)

see [EB92, Theorem 2].


The above fact can also be found in [Min62]. We can see from the above that we have another way to characterize firmly nonexpansive mappings.

Definition 4.30 (Indicator Function). Let S ⊆ X; then the indicator function is defined by

x ↦ ιS(x) = 0 if x ∈ S, and ∞ otherwise.

Definition 4.31 (Proximal Mapping). Let T : X → X be firmly nonexpansive, so that T = (Id +M)−1, where M is maximal monotone, see Fact 4.29. If M = ∂f for some function f that is convex, lower semicontinuous, and proper, then T is called a proximal mapping.

Remark 4.32 (Projection). Let T : X → X be firmly nonexpansive, say T = (Id +M)−1, where M is maximal monotone, see Fact 4.29. If M = ∂ιC, the subdifferential of the indicator function, then T = PC.

Projection mappings are a subset of proximal mappings, and proximal mappings are a subset of firmly nonexpansive mappings.

Lemma 4.33. Every proximal mapping is a subdifferential operator.

Proof. Let T be a proximal mapping. Then by definition, the sum rule [Zal02, Theorem 2.8.7(iii)], and [Roc97, Theorem 23.8],

T = (Id +∂f)−1 (4.47)

= (∇(1/2)‖ · ‖2 + ∂f)−1 (4.48)

= (∂((1/2)‖ · ‖2 + f))−1 (4.49)

= ∂((1/2)‖ · ‖2 + f)∗, (4.50)

where g∗(x) = supy∈X(〈x, y〉 − g(y)) is the Fenchel conjugate of g; see [Roc97, Chapter 12] for basic properties. This clearly shows that the proximal mapping is a subdifferential.

Lemma 4.34. The projection operator PC is the gradient of (1/2)‖ · ‖2 − (1/2)d2C.

Proof. We use [RW98, Theorem 2.26] throughout this proof. If we set f to be the indicator function and λ = 1, then by [RW98, Definition 1.22] we get,


when x ∈ X,

e1ιC(x) = inf_{ω∈X} ιC(ω) + (1/2)‖ω − x‖2 (4.51)

= inf_{ω∈C} (1/2)‖ω − x‖2 (4.52)

= (1/2)d2C(x), (4.53)

where e1g(x) = inf_{ω∈X} g(ω) + (1/2)‖ω − x‖2 is the Moreau envelope of g and P1g(x) is the corresponding set of minimizers. Again by [RW98, Definition 1.22] we get,

P1ιC(x) = argmin_{ω∈X} ιC(ω) + (1/2)‖ω − x‖2 (4.54)

= argmin_{ω∈C} (1/2)‖ω − x‖2 (4.55)

= {ω ∈ C | (1/2)‖ω − x‖2 = (1/2)d2C(x)} (4.56)

= PC(x). (4.57)

Now by [RW98, Theorem 2.26b] we have,

∇e1ιC = Id−PC. (4.58)

By rearranging, then using (4.53), we have

PC = Id−∇e1ιC (4.59)

= Id−∇((1/2)d2C) (4.60)

= ∇((1/2)‖ · ‖2)−∇((1/2)d2C) (4.61)

= ∇((1/2)‖ · ‖2 − (1/2)d2C), (4.62)

which clearly shows that PC is in fact a gradient, as desired.

Proposition 4.35. Let B : X → X be linear. Then B = ∂g ⇐⇒ B = B∗, in which case B = ∇qB and g = qB + c, where c ∈ R.

Proof. "⇐": B = B∗ =⇒ B = B+, and by Proposition 4.15, B = B+ = ∇qB. "⇒": B = ∇g =⇒ ∇2g = B is symmetric [MN88, Theorem 4, Page 120].


Lemma 4.36. Let T be linear and a proximal mapping. Then T = T ∗.

Proof. By assumption, there exists a lower semicontinuous convex function f such that

T = (Id +∂f)−1. (4.63)

By Proposition 4.35, and the fact that Id is symmetric,

Id = ∇qId. (4.64)

By (4.63),

T = (∇qId + ∂f)−1 = (∂(f + qId))−1 (4.65)

by [Roc97, Theorem 23.8]. Let

g = (f + qId)∗; (4.66)

then

∂g = (∂(f + qId))−1 (4.67)

by [Roc97, Corollary 23.5.2]. By (4.65), we have

T = ∂g. (4.68)

Then by Proposition 4.35, T = T ∗.

Below is an alternative proof provided by Liangjin Yao. We first note that if

T = (Id +∂f)−1 (4.69)

then

T−1 = Id +∂f (4.70)

and

T−1 − Id = ∂f. (4.71)

Since T−1 and Id are linear relations, so is T−1 − Id, and hence gra(T−1 − Id) is a linear subspace. Then

(∂f)∗ = ∂f (4.72)

by [BWY09, Lemma 4.1]. Now

T = (Id +∂f)−1 (4.73)

and

T ∗ = [(Id +∂f)−1]∗ (4.74)

= [(Id +∂f)∗]−1 by [Cro98, Proposition III 1.3b] (4.75)

= [(Id)∗ + (∂f)∗]−1 by [Bor83, Theorem 7.4] (4.76)

= (Id +∂f)−1 by (4.72) (4.77)

= T. (4.78)

Lemma 4.37. Let P = ( a b ; c d ) be such that P is a proximal mapping. Then b = c.

Proof. By Lemma 4.36, P = P ∗, hence b = c.

Lemma 4.38. (See also [RW98, Example 12.7].) Let A : X → X be continuous and monotone. Then A is maximal monotone.

Proof. Let (x, x∗) ∈ X ×X be such that

〈x− y, x∗ −Ay〉 ≥ 0, ∀y ∈ X. (4.79)

It suffices to show that x∗ = Ax. Let z ∈ X and t > 0. Let y = x+ tz ∈ X. Then

〈x− (x+ tz), x∗ −A(x+ tz)〉 ≥ 0 (4.80)

⇐⇒ 〈−tz, x∗ −A(x+ tz)〉 ≥ 0 (4.81)

⇐⇒ 〈A(x+ tz)− x∗, tz〉 ≥ 0 divide both sides by t (4.82)

⇐⇒ 〈A(x+ tz)− x∗, z〉 ≥ 0. (4.83)

Let t→ 0+. Since A is continuous,

〈Ax− x∗, z〉 ≥ 0, ∀z ∈ X. (4.84)

Set z = x∗ −Ax. Then (4.84) yields,

0 ≥ ‖Ax− x∗‖2 ⇐⇒ 〈Ax− x∗, x∗ −Ax〉 ≥ 0. (4.85)

So, ‖Ax− x∗‖ = 0, i.e., x∗ = Ax.

Corollary 4.39. Let A : X → X be linear. Then A is maximal monotoneif and only if ∀x 〈x,Ax〉 ≥ 0.

Proof. We have ∀x 〈x,Ax〉 ≥ 0 ⇐⇒ A is monotone, by Proposition 4.2. Now since X is finite-dimensional, A is continuous by [Kre78, Page 94]. Therefore, since A is monotone and continuous, it is maximal monotone by Lemma 4.38.


4.2 Eckstein’s Observation

Recall that RC and RD are the reflectors of given subsets C and D.

Definition 4.40. Following Eckstein in his thesis, for any maximal monotone operators A and B, and λ > 0, we define

Gλ,A,B = JλA(2JλB − Id) + (Id−JλB). (4.86)

Remark 4.41. It is important to note that JλA = (Id +λA)−1 is the resolvent of λA by Definition 4.28, as A is maximal monotone.

Fact 4.42. (See [Eck89, Corollary 4.21].) Let G−1λ,A,B = Sλ,A,B + Id; then Gλ,A,B is firmly nonexpansive.

Definition 4.43 (Splitting Operator). (See [Eck89, Page 137].)

Sλ,A,B = G−1λ,A,B − Id . (4.87)

In [Eck89, Corollary 4.21] we see a fairly complicated proof of Fact 4.42. However, just like we proved that T is firmly nonexpansive in Theorem 3.7, it may be proved that Gλ,A,B is firmly nonexpansive by replacing projections with appropriate resolvents. See also [Com04] and [EF98] for further information on the Douglas-Rachford and related methods involving (firmly) nonexpansive operators.

Eckstein was the first to observe the following [Eck89, Proposition 4.10].

Proposition 4.44. There exist maximal monotone operators A and B on R2 and a constant λ > 0 such that A and B are both subdifferentials of convex functions, but Sλ,A,B is not the subdifferential of any convex function.

Proof. Let

A = ( 2 1 ; 1 2 ) (4.88)

B = ( 2 0 ; 0 1 ) (4.89)

λ = 1/3. (4.90)

By Fact 4.27, A ⪰ 0 and B ⪰ 0. Then, by Proposition 4.35,

A = ∂f, where f(x) = (1/2)xT ( 2 1 ; 1 2 ) x, (4.91)

B = ∂g, where g(x) = (1/2)xT ( 2 0 ; 0 1 ) x. (4.92)

By direct calculation,

JλA = (Id +λA)−1 = ( 5/8 −1/8 ; −1/8 5/8 ) (4.93)

JλB = (Id +λB)−1 = ( 3/5 0 ; 0 3/4 ). (4.94)

Now Eckstein writes

Gλ,A,B = JλA(2JλB − Id) + (Id−JλB) = ( 1/2 1/16 ; 1/10 11/16 ) (4.95)

and

Sλ,A,B = G−1λ,A,B − Id = ( 28/27 −5/27 ; −8/27 13/27 ). (4.96)

Since Gλ,A,B is not symmetric, it cannot be a proximal mapping: indeed, G is firmly nonexpansive by Fact 4.42, and if G were a proximal map, it would be symmetric by Lemma 4.37, but it is not.

Eckstein made a numerical error in (4.95) and, as a result, in (4.96). The matrices in fact are

Gλ,A,B = JλA(2JλB − Id) + (Id−JλB) = ( 21/40 −1/16 ; −1/40 9/16 ) (4.97)

and

Sλ,A,B = G−1λ,A,B − Id = ( 43/47 10/47 ; 4/47 37/47 ), (4.98)

however, Eckstein’s conclusion remains valid (see appendix for the calcula-tion details). In Example 4.46, we provide an example which isn’t prone tonumerical errors as it is much simpler.

Lemma 4.45. Let T : X → X be linear. If

T = (Id +RDRC)/2, (4.99)

then

RDRC is symmetric ⇐⇒ T is symmetric. (4.100)


Proof.

RDRC is symmetric ⇐⇒ (RDRC)∗ = RDRC (4.101)

⇐⇒ Id +(RDRC)∗ = Id +RDRC (4.102)

⇐⇒ (Id +RDRC)∗ = Id +RDRC (4.103)

⇐⇒ (Id +RDRC)∗/2 = (Id +RDRC)/2 (4.104)

⇐⇒ T ∗ = T. (4.105)

Example 4.46. Set

T = (Id +RDRC)/2. (4.106)

Then, by Lemma 4.36 and Lemma 4.45: if T is a proximal mapping then T is symmetric ⇐⇒ RDRC is symmetric.

Consider

C = {(x, 0) | x ∈ R}, the x-axis in R2, (4.107)

D = {(x, x) | x ∈ R}, the diagonal in R2. (4.108)

We have

PC = ( 1 0 ; 0 0 ), PD = ( 1/2 1/2 ; 1/2 1/2 ), Id = ( 1 0 ; 0 1 ). (4.109)

Now the reflectors onto the sets C and D are as follows:

RC = 2PC − Id = 2( 1 0 ; 0 0 )− ( 1 0 ; 0 1 ) = ( 1 0 ; 0 −1 ) (4.110)

RD = 2PD − Id = 2( 1/2 1/2 ; 1/2 1/2 )− ( 1 0 ; 0 1 ) = ( 0 1 ; 1 0 ). (4.111)

Furthermore,

RDRC = ( 0 1 ; 1 0 )( 1 0 ; 0 −1 ) = ( 0 −1 ; 1 0 ), (4.112)

which is clearly not symmetric! Therefore, T is a firmly nonexpansive mapping that is not a proximal mapping, even though it was constructed from two proximal mappings.


Now

T = (Id +RDRC)/2 = (( 1 0 ; 0 1 ) + ( 0 −1 ; 1 0 ))/2 = ( 1/2 −1/2 ; 1/2 1/2 ) (4.113)

and

T = (Id +RDRC)/2 = (Id +M)−1, (4.114)

where M is maximal monotone, and hence

M = T−1 − Id. (4.115)

We calculate T−1 directly from T :

T−1 = ( 1 1 ; −1 1 ). (4.116)

And finally M :

M = T−1 − Id = ( 0 1 ; −1 0 ). (4.117)

This is clearly not symmetric and does not come from a subdifferentialoperator, by Proposition 4.35 applied to M .

Eckstein in his thesis pointed out that the T in Douglas-Rachford (aka the Difference Map or Lions-Mercier) is not necessarily a proximal map even if it is constructed from two proximal mappings. (However, his example is more complicated and has a numerical error, compared to the simpler example we just showed.)

This shows that you truly need monotone operator theory to understand Douglas-Rachford or the Difference Map, even if you started with projection operators, since the resolvent is not a proximal mapping (the monotone operator part is not a subdifferential operator).
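The computations of Example 4.46 can be verified mechanically. The sketch below (our own check; the 2×2 matrix helpers are ours) reproduces (4.112) and (4.117):

```python
# Verifying Example 4.46: with P_C, P_D from (4.109), R_D R_C is the rotation
# matrix in (4.112), and M = T^{-1} - Id is the antisymmetric matrix in (4.117).

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

I = [[1.0, 0.0], [0.0, 1.0]]
PC = [[1.0, 0.0], [0.0, 0.0]]
PD = [[0.5, 0.5], [0.5, 0.5]]

RC = [[2 * PC[i][j] - I[i][j] for j in range(2)] for i in range(2)]
RD = [[2 * PD[i][j] - I[i][j] for j in range(2)] for i in range(2)]
RDRC = matmul(RD, RC)
print(RDRC)  # [[0.0, -1.0], [1.0, 0.0]] -- not symmetric

T = [[(I[i][j] + RDRC[i][j]) / 2 for j in range(2)] for i in range(2)]
d = T[0][0] * T[1][1] - T[0][1] * T[1][0]
Tinv = [[T[1][1] / d, -T[0][1] / d], [-T[1][0] / d, T[0][0] / d]]
M = [[Tinv[i][j] - I[i][j] for j in range(2)] for i in range(2)]
print(M)     # [[0.0, 1.0], [-1.0, 0.0]] -- antisymmetric, not a subdifferential
```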


Chapter 5

8 Queens

The eight queens puzzle is the problem of putting eight chess queens on an 8× 8 chessboard such that none of them is able to capture any other using the standard chess queen's moves. The queens must be placed in such a way that no two queens would be able to attack each other. Thus, a solution requires that no two queens share the same row, column, or diagonal. The eight queens puzzle is an example of the more general n queens puzzle of placing n queens on an n× n chessboard, where solutions exist only for n = 1 or n ≥ 4. This is a new application of Elser's Difference Map. From a complexity point of view, the n queens puzzle is easy; in fact, it can be solved in linear time, see [AY89] for details. Our interest in it stems from a modeling perspective; it will make the modeling of Sudoku in the next chapter easier to understand.

5.1 Queen’s Role in Chess

The queen can be moved any number of unoccupied squares in a straight line vertically, horizontally, or diagonally, thus combining the moves of the rook and the bishop. The queen captures by occupying the square on which an opponent's piece sits; see Figure 5.1 (from [Que]).

5.2 8 Queens History

The puzzle was originally proposed in 1850 by Franz Nauck. Nauck also extended the puzzle to the n queens problem (on an n × n chessboard, how can n queens be placed?) [RBC74, page 166]. In 1874, S. Gunther proposed a method of finding solutions by using determinants, and J.W.L. Glaisher refined this approach [RBC74, pages 166-167].

This puzzle even appeared in the popular early 1990s computer game The 7th Guest.


Figure 5.1: Queen moves

5.3 Solutions to 8 Queens

The eight queens puzzle has 92 distinct solutions. If solutions that differ only by symmetry operations (rotations and reflections) of the board are counted as one, the puzzle has 12 unique (or fundamental) solutions [RBC74, page 177].


12 Unique solutions to the 8 Queens problem

[Six chess diagrams (chess-font residue, not reproducible in plain text): the first six of the 12 fundamental solutions, each with exactly one queen per row and per column and no two queens sharing a diagonal.]


[Six chess diagrams: the remaining six of the 12 fundamental solutions.]

5.4 Modeling 8 Queens

We begin by thinking of the 8 queens as an 8 × 8 binary matrix. We represent a queen on a square with a 1 and a blank square with a 0. Our underlying


Hilbert space is X = R^{8×8}, which we identify with R^{64}.

5.5 Row Constraints

Each row can only have one queen in it: in each row we are allowed exactly one 1, and the rest are zeros. We call this constraint C1,

C1 = {x ∈ R^{8×8} | every row of x is a standard unit vector}. (5.1)

The row constraint ensures that each row has only one queen in it.

[Chess diagram: eight queens, one per row (all in column a), satisfying the row constraint C1.]

5.6 Column Constraints

Each column can only have one queen in it: in each column we are allowed exactly one 1, and the rest are zeros. We call this constraint C2,

C2 = {x ∈ R^{8×8} | every column of x is a standard unit vector}. (5.2)

The column constraint ensures that each column has only one queen in it.


[Chess diagram: eight queens, one per column (all in row 8), satisfying the column constraint C2.]

5.7 Positive Slope Constraints

Every positively sloped diagonal must have at most one queen in it: in each positively sloped diagonal we are allowed at most one 1. We call this constraint C3. We can think of these as diagonals, or bishop moves. There are 15 positively sloped diagonals.

C3 = {x ∈ R^{8×8} | every positive diagonal of x is a standard unit vector or the 0 vector}. (5.3)

The positive slope constraint ensures that each positive diagonal has at most one queen in it.


[Chess diagram illustrating the positive-slope diagonal constraint C3.]

5.8 Negative Slope Constraints

As with the positive slope constraint, each negatively sloped diagonal must have at most one queen in it; again we are allowed at most one 1. We call this constraint C4. There are 15 negatively sloped diagonals.

C4 = {x ∈ R^{8×8} | every negative diagonal of x is a standard unit vector or the 0 vector}. (5.4)

The negative slope constraint ensures that each negative diagonal has at most one queen in it.

[Chess diagram illustrating the negative-slope diagonal constraint C4.]


5.9 Projection onto the Nearest Unit Vector

Denote the standard unit vectors in R^n by e_i. Let C = {e1, . . . , en}, a very nonconvex set in X. Then

PCx = {ei | xi = max{x1, . . . , xn}}. (5.5)

Proof. The set C has finitely many elements, so PCx ≠ ∅. Let i ∈ {1, . . . , n}. Then

ei ∈ PCx ⇐⇒ ‖x − ei‖ ≤ ‖x − ej‖ ∀j (5.6)

⇐⇒ ‖x − ei‖² ≤ ‖x − ej‖² ∀j (5.7)

⇐⇒ ‖x‖² − 2⟨x, ei⟩ + ‖ei‖² ≤ ‖x‖² − 2⟨x, ej⟩ + ‖ej‖² ∀j (5.8)

⇐⇒ −2⟨x, ei⟩ ≤ −2⟨x, ej⟩ ∀j (5.9)

⇐⇒ ⟨x, ej⟩ ≤ ⟨x, ei⟩ ∀j (5.10)

⇐⇒ xj ≤ xi ∀j. (5.11)

This projection is very easy to compute, especially for a computer, as it picks the biggest component, sets it to 1, and sets the others to zero. All the constraints can be represented via unit vectors. For the diagonals we compare the projection with the zero vector and choose the nearer of the two. In general we pick the biggest component of our vector, set it to 1 and the others to 0. In the case of a tie we pick the lowest component.
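The projection just described can be sketched in Python (an illustrative sketch, not the thesis's PHP implementation; the function name and the allow_zero flag are ours, the latter anticipating the diagonal constraints, which also admit the zero vector):

```python
import numpy as np

def nearest_unit_vector(x, allow_zero=False):
    """Project x onto the standard unit vectors {e_1, ..., e_n}:
    pick the biggest component, set it to 1 and the others to 0.
    np.argmax breaks ties by the lowest index, matching the text.
    With allow_zero=True, the zero vector is returned when it is
    nearer, i.e. when the biggest component is below 1/2."""
    x = np.asarray(x, dtype=float)
    i = int(np.argmax(x))
    if allow_zero and x[i] < 0.5:
        return np.zeros_like(x)
    p = np.zeros_like(x)
    p[i] = 1.0
    return p
```

The allow_zero test follows from the proof above: the zero vector is strictly nearer than every e_i exactly when max x_i < 1/2, since ‖x − e_i‖² = ‖x‖² − 2x_i + 1.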

The constraints C1,C2,C3, and C4 can be applied to the 8 queens matrixas projections.

.3 .8 .3 .2 .4 .5 .5 .1        0 1 0 0 0 0 0 0
.1 .1 .1 .1 .1 .1 .1 .1        1 0 0 0 0 0 0 0
.2 .7 .4 .2 .7 .8 .5 .9        0 0 0 0 0 0 0 1
.9 .5 .5 .5 .5 .5 .4 .5   ↦   1 0 0 0 0 0 0 0
.4 .2  0  0  0  0 .2 .3        1 0 0 0 0 0 0 0
 1 .1 .1 .3 .2 .6 .5 .6        1 0 0 0 0 0 0 0
−6 .8 .1 .2 .3 .4 .7 .9        0 0 0 0 0 0 0 1
.3 .4 .7 .4 .2  1 .8  2        0 0 0 0 0 0 0 1

The matrix on the right is PC1x where x is the matrix on the left.

The negative slope constraint is exactly the same as the positive constraint except in the negative direction. If we rotate the chessboard 90


degrees in a counterclockwise direction, apply the positive slope constraint, and then rotate 90 degrees clockwise back, we get the negative slope constraint. In fact, this is how the algorithm is implemented.
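The rotation trick can be sketched with NumPy (a sketch, not the thesis's actual PHP implementation; project_positive stands for whatever routine projects every positively sloped diagonal):

```python
import numpy as np

def project_negative_slope(x, project_positive):
    """Project onto the negative-slope diagonal constraint by reusing
    the positive-slope projection: rotate the board 90 degrees
    counterclockwise, project, then rotate 90 degrees clockwise back."""
    rotated = np.rot90(x)                   # 90 degrees counterclockwise
    projected = project_positive(rotated)   # reuse the C3 projection
    return np.rot90(projected, k=-1)        # 90 degrees clockwise back
```

The rotation maps each negatively sloped diagonal onto a positively sloped one, so the same diagonal-projection code serves both constraints.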

.3  0 .3 .2 .4 .5 .5 .1        1 0 1 0 0 0 0 0
 0 .1 .1 .1 .1 .1 .1 .1        0 0 0 0 0 0 0 0
.2 .7 .4 .2 .7 .8 .7 .9        0 0 0 0 1 0 1 1
.9 .5 .5 .5 .5 .5 .4 .5   ↦   1 1 0 0 0 0 0 0
.4 .2  0  0  0  0 .2  0        0 0 0 0 0 0 0 0
 1 .1 .1 .3 .2 .6  0 .6        1 0 0 0 0 1 0 0
−6 .9 .1 .2 .3  0 .7 .9        0 1 0 0 0 0 0 1
.3 .4 .7 .4  0  1 .8  0        0 0 0 0 0 1 0 0

The matrix on the right is PC3x where x is the matrix on the left.

5.10 Iterations

Recall that Elser’s Difference Map with β = 1, Lions-Mercier and DouglasRachford algorithms can be defined as

T = (Id + RARB)/2, (5.12)

where RA and RB are the reflectors onto their respective sets. The solution to our 8 Queens problem is in C1 ∩ C2 ∩ C3 ∩ C4. We have 4 constraints, but our algorithm can only handle two sets. To address this we use the product space and the diagonal space previously defined in Definitions 2.25 and 2.27 respectively. We will work in the Hilbert space

X × X × X × X, recall X = R^{8×8}. (5.13)

We also define

C = C1 × C2 × C3 × C4 = {(c1, c2, c3, c4) | ci ∈ Ci} ⊆ X^4. (5.14)

Now the diagonal is

∆ = {(x1, x2, x3, x4) | x1 = x2 = x3 = x4 ∈ X}. (5.15)

Then

∆ ∩ C = {(c1, c2, c3, c4) | c1 = c2 = c3 = c4, ci ∈ Ci} (5.16)


and get that

x ∈ C1 ∩ C2 ∩ C3 ∩ C4 ⇐⇒ (x, x, x, x) ∈ C ∩∆. (5.17)

Our solutions lie in the intersection of the 4 constraints. The reflectors used in the Difference Map are two times the projection minus the identity. In our model we have two projections: the projection onto C (our constraints) and the projection onto the diagonal. The projection onto the diagonal is described by Example 2.48:

P∆(x1, x2, x3, x4) = (y, y, y, y) where y = (x1 + x2 + x3 + x4)/4. (5.18)

The projection onto the constraints is described by Example 2.47:

PC = PC1×C2×C3×C4(x1, x2, x3, x4) = (PC1x1, PC2x2, PC3x3, PC4x4). (5.19)

Finally, we define the reflectors and use the Difference Map,

R∆ := 2P∆ − Id (5.20)

RC := 2PC − Id . (5.21)

The iterations are described as

xn+1 = ((Id + RCR∆)/2) xn. (5.22)

We round one component of P∆xn to a 0-1 matrix and call the result yn. Then we monitor ∑_{i=1}^{4} ‖yn − PCi yn‖², which is 0 if and only if yn is a solution.
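One product-space iteration can be sketched generically in Python (a minimal illustration, not the thesis code; the constraint projections are passed in as functions, and the usage below demonstrates the step on two simple convex sets rather than the queens constraints):

```python
import numpy as np

def dr_step(xs, projections):
    """One Douglas-Rachford step x -> ((Id + R_C R_Delta)/2)(x) on the
    product space X^m, with one projection supplied per constraint set."""
    xs = [np.asarray(x, dtype=float) for x in xs]
    y = sum(xs) / len(xs)                  # P_Delta x = (y, ..., y)
    rd = [2.0 * y - x for x in xs]         # R_Delta = 2 P_Delta - Id
    rc = [2.0 * P(z) - z for P, z in zip(projections, rd)]  # R_C componentwise
    return [(x + r) / 2.0 for x, r in zip(xs, rc)]          # average with Id
```

For the 8 queens one would pass the four projections onto C1, ..., C4; as a sanity check, with the two convex sets {x : x2 = 0} and {x : x1 = 0} in R², the shadow sequence P∆xn converges to their intersection, the origin.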

5.11 8 Queens Example

In this example we start by placing the 8 queens in the top left corner, a ridiculous starting position. As we see in Figure 5.3, after 30 iterations the board is more spread out: we have a queen in every row, and most diagonal constraints are satisfied. Notice that there are more than 8 queens on the board at this time. This is because the display looks at the entries in each cell and rounds them: if an entry is greater than or equal to 0.5, we place a queen on that square; otherwise it is left empty.
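The display rule just described can be sketched as follows (an illustrative helper, not the thesis's PHP):

```python
def round_board(x):
    """Round a real 8x8 iterate to a 0-1 board for display:
    entries >= 0.5 become queens (1), everything else 0."""
    return [[1 if v >= 0.5 else 0 for v in row] for row in x]
```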

Finally, after 96 iterations, we see in Figure 5.6 that we have a solution.


Figure 5.2: Initial Starting Position

Figure 5.3: 30 iterations


Figure 5.4: 60 iterations

Figure 5.5: 90 iterations


Figure 5.6: Solved in 96 iterations


Chapter 6

Sudoku

We have proved some results about projections and convergence when our sets are convex. There is no comparable theory when our sets are nonconvex; in fact, convergence is known to fail. The method of alternating projections can fail very quickly for nonconvex sets: it gets stuck between two points and bounces back and forth endlessly. Sudoku uses the integers from 1 to 9, which form a very nonconvex set. However, when we apply Elser's Difference Map to our very nonconvex Sudoku constraint sets, it converges (very quickly even), which I find amazing. From a complexity point of view, Sudoku is NP-complete; see [Epp05] for details. Below are the full details on how to model Sudoku so as to solve it using Elser's Difference Map.

6.1 What is Sudoku?

Sudoku literally means "number single". It was derived from the Japanese phrase "Suuji wa dokushin ni kagiru", which translates as "the numbers must occur only once" or "the numbers must be single". Sudoku is a logic-based number placement puzzle. The objective of Sudoku is to fill a 9 × 9 grid so that each column, each row, and each of the nine 3 × 3 boxes contains the digits from 1 to 9, each exactly once. The game is a type of Latin Square game with the additional constraint of the 3 × 3 boxes. Because of this, Leonhard Euler is often incorrectly cited as the father of Sudoku. Sudoku was actually invented by an American architect named Howard Garns in 1979; he called it "Number Place". Sudoku was given its name in 1986 and is now published and trademarked by Nikoli of Japan [BT09, Pages 6-10], [Eri10]. A sample Sudoku puzzle and its solution are given below in Figures 6.1 and 6.2 respectively (pictures from [Wik]).

6.2 Modeling Sudoku

We begin by thinking of the Sudoku as a 9 × 9 integer matrix. Think of each cell as an elevator shaft, where the integer in the cell is the floor. If


Figure 6.1: A Sample Sudoku Puzzle


Figure 6.2: A Sample Sudoku Solution


the floor is, say, 3, we have 0, 0, 1, 0, 0, 0, 0, 0, 0 on the floors of the elevator shaft. If we do this for every cell, we add another dimension to our 9 × 9 matrix: we now have a 9 × 9 × 9 cube! We set all the blank entries to 1, but the starting position does not matter. Let Q be a cube in X described by

Q(i, j, k), where {i, j, k} ⊆ {1, . . . , 9}. (6.1)

So an elevator shaft is Q(i, j, :) (i.e., the 3rd component is free, using Matlab notation). We aim to place unit vectors into each elevator shaft. Our space is X = R^{9×9×9}, which we identify with R^{729}. We now describe our constraints, which are similar to those in the 8 Queens problem.
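The encoding of a 9 × 9 grid into the 9 × 9 × 9 cube can be sketched as follows (function name and 0-based NumPy indexing are ours, not the thesis's):

```python
import numpy as np

def grid_to_cube(grid):
    """Encode a 9x9 integer grid (0 for blank) as a 9x9x9 cube.
    A cell containing digit k becomes the k-th unit vector in its
    elevator shaft; blank shafts are filled with all ones, matching
    the remark that the starting position does not matter."""
    Q = np.ones((9, 9, 9))
    for i in range(9):
        for j in range(9):
            k = grid[i][j]
            if k:
                Q[i, j, :] = 0.0
                Q[i, j, k - 1] = 1.0
    return Q
```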

Figure 6.3: 9 Sudoku Elevator Shafts

6.3 Constraints

6.3.1 North-South Hallways

No number from 1 to 9 may be repeated within a column. Let's call these North-South hallways. In each hallway we are only allowed at


most one 1, and the rest are zeros. There are 81 North-South hallways, 9 per floor. Therefore, our constraint becomes

C1 = {Q ∈ R^{9×9×9} | Q(i, :, k) is a standard unit vector ∀i, k}. (6.2)

Figure 6.4: A Sudoku Hallway

6.3.2 East-West Hallways

No number from 1 to 9 may be repeated within a row. Let's call these East-West hallways. In each hallway we are only allowed at most one 1, and the rest are zeros. There are 81 East-West hallways, 9 per floor. Therefore, our constraint becomes

C2 = {Q ∈ R^{9×9×9} | Q(:, j, k) is a standard unit vector ∀j, k}. (6.3)


6.3.3 Meeting Rooms

Every 3 × 3 square on each floor must not repeat any of the numbers 1 to 9; we call these meeting rooms. In each meeting room we are allowed at most one 1, and the rest are zeros. There are 81 meeting rooms, 9 per floor. We define the following:

E = {A ∈ R^{3×3} | all entries are zero except one entry, which equals 1}, (6.4)

J = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}}, (6.5)

and, for every Q ∈ R^{9×9×9}, I1 ∈ J, I2 ∈ J, and k ∈ {1, . . . , 9}, set

M_{I1,I2,k}(Q) = {A ∈ R^{3×3} | A_{i mod 3, j mod 3} = Q(i, j, k) ∀i ∈ I1, ∀j ∈ I2}. (6.6)

Therefore, our constraint set is

C3 = {Q ∈ R^{9×9×9} | ∀k, ∀I1 ∈ J, ∀I2 ∈ J, M_{I1,I2,k}(Q) ∈ E}. (6.7)

Figure 6.5: A Sudoku Meeting Room


As you can see from the picture, the meeting room is not a typical unit vector. We can think of it as a fancy unit vector, written as a 3 × 3 matrix instead of the usual 1 × 9 vector we are used to.

6.3.4 The Given Sudoku

Constraint 4 is the given Sudoku problem; we call this constraint C4. It is the set of all cubes Q ∈ R^{9×9×9} such that if the (i, j) entry is prescribed and equal to γij, then Q(i, j, :) equals the γij-th unit vector; otherwise, Q(i, j, :) is any unit vector.
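The projection onto C4 can be sketched as follows (a sketch with hypothetical names and 0-based indexing, not the thesis implementation): prescribed shafts are forced to the clue's unit vector, and free shafts go to their nearest unit vector, ties to the lowest index.

```python
import numpy as np

def project_given(Q, clues):
    """Project the cube Q onto the given-Sudoku constraint C4.
    `clues` maps (i, j) -> prescribed digit gamma_ij in 1..9; every
    other elevator shaft is sent to its nearest unit vector."""
    P = np.zeros_like(Q, dtype=float)
    for i in range(9):
        for j in range(9):
            k = clues[(i, j)] - 1 if (i, j) in clues else int(np.argmax(Q[i, j, :]))
            P[i, j, k] = 1.0
    return P
```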

6.4 Iterations

Recall that Elser’s Difference Map with β = 1, Lions-Mercier and DouglasRachford algorithms can be defined as

T = (Id + RARB)/2, (6.8)

where RA and RB are the reflectors onto their respective sets. The solution to our Sudoku problem is in C1 ∩ C2 ∩ C3 ∩ C4. We have 4 constraints, but our algorithm can only handle two sets. As with 8 Queens, we use the product space and the diagonal, which we have previously defined. We work in the Hilbert product space

X × X × X × X, recall X = R^{9×9×9}. (6.9)

We also define

C = C1 × C2 × C3 × C4 = {(c1, c2, c3, c4) | ci ∈ Ci} ⊆ X^4, (6.10)

and the Diagonal,

∆ = {(x1, x2, x3, x4) | x1 = x2 = x3 = x4 ∈ X}. (6.11)

Then

∆ ∩ C = {(c1, c2, c3, c4) | c1 = c2 = c3 = c4, ci ∈ Ci} (6.12)

and hence

x ∈ C1 ∩ C2 ∩ C3 ∩ C4 ⇐⇒ (x, x, x, x) ∈ C ∩∆. (6.13)

Our unique solution (assuming a well-posed Sudoku puzzle) lies in the intersection of the 4 constraints. The reflectors used in the Difference Map


are two times the projection minus the identity. In our model we have two projections: the projection onto C (our constraints) and the projection onto the diagonal. The projection onto the diagonal is described by Example 2.48:

P∆(x1, x2, x3, x4) = (y, y, y, y) where y = (x1 + x2 + x3 + x4)/4. (6.14)

The projection onto the constraints is described by Example 2.47:

PC = PC1×C2×C3×C4(x1, x2, x3, x4) = (PC1x1, PC2x2, PC3x3, PC4x4). (6.15)

Finally, we define the reflectors and use the Difference Map,

R∆ := 2P∆ − Id (6.16)

RC := 2PC − Id . (6.17)

The iterations are described as

xn+1 = ((Id + RCR∆)/2) xn. (6.18)

We now apply exactly the same algorithm described in detail for the 8 Queens example. We pick a cube from our diagonal and convert the cube into a matrix by projecting each elevator shaft onto its nearest unit vector. Convergence is monitored similarly.

It should be noted that the iterations live in our cube space, and in order to view them as the 9 × 9 matrix that is a Sudoku, we need to flatten the cube. That is, we take the 9 × 9 × 9 binary cube and convert it to a 9 × 9 integer matrix by converting each elevator shaft containing a standard unit vector to its corresponding integer; see Figure 6.3.
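The flattening step can be sketched as follows (an assumed NumPy encoding with 0-based indexing, not the thesis's code):

```python
import numpy as np

def flatten_cube(Q):
    """Flatten a 9x9x9 0-1 cube to a 9x9 integer grid: the position
    of the 1 in each elevator shaft Q(i, j, :) becomes the digit."""
    return np.argmax(Q, axis=2) + 1  # digits 1..9
```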

6.5 Sudoku Example

Arto Inkala, a Finnish mathematician, is an enthusiastic puzzle and problem solver. He completed his doctoral thesis in applied mathematics in 2001, and he creates very difficult Sudoku puzzles. Inkala claims that his "mathematic background has been great help when creating the most difficult Sudokus by [using] genetic algorithms, different optimizing methods and artificial intelligence". On his website (www.aisudoku.com), Inkala shares 10 of his difficult Sudoku puzzles. The Finnish puzzle maker calls the most difficult Sudoku puzzle ever "AI Escargot", because it looks like


a snail. The AI comes from Inkala's initials. He says solving it is like an intellectual culinary pleasure.

We use the difference map method described above to solve all 10 of Inkala's Sudoku puzzles. An online implementation of the Sudoku solver can be found at www.schaad.ca, the author's website. We see that AI Escargot was solved in 6627 iterations, whereas AI labyrinth took over 12000 iterations. Even though Inkala claims that AI Escargot is the world's hardest Sudoku puzzle, our algorithm solves it faster than some other Sudokus created by Inkala; it managed to solve AI honeypot in only 208 iterations! The Sudoku problems and pictures in the left column are from www.aisudoku.com, whereas the Sudoku solutions in the right column are original work.

AI escargot solved in 6627 iterations


AI killer application solved in 882 iterations

AI lucky diamond solved in 276 iterations


AI worm hole solved in 4998 iterations

AI labyrinth solved in 12025 iterations


AI circles solved in 2410 iterations

AI squadron solved in 3252 iterations


AI honeypot solved in 208 iterations

AI tweezers solved in 688 iterations


AI broken brick solved in 1598 iterations

In the following figure, we show how the online solver solves AI Escargot. It takes 6627 iterations, and we take snapshots every 1000 iterations until we come to a solution.

Figure 6.6: Solution to Escargot


6.6 Sudoku Distance

The next figure is very interesting. At each iteration, we compute a distance from the solution: our algorithm displays the current iterate as a 9 × 9 matrix, which represents an attempt at a Sudoku puzzle solution. If we let the current iterate be the matrix A and the solution to the Sudoku puzzle be B, then we can compute the difference, which we call the matrix C. The difference is computed entrywise, i.e., cij = aij − bij, where i, j ∈ {1, 2, . . . , 9}. We calculate the distance as follows:

‖C‖ = √(∑_{i,j} c_{ij}²). (6.19)

Figure 6.7: Distance versus number of iterations

The graph is not monotone. It shows that Elser's Difference Map sometimes gets close to a solution but needs to move farther away before actually finding it. There are a few times, around iterations 1100, 1200, 4800, and 6300, where we get fairly close to a solution but then drift farther away before finding one. Also notice how quickly we converge to a solution at the end.


Chapter 7

Conclusion

We have shown that Elser's Difference Map with β = 1 is equivalent to the Douglas-Rachford and Fienup's Hybrid Input-Output algorithms. We showed how to model the 8-queens problem and, following Elser, we modeled Sudoku as well. We hope that the detailed exposition in this thesis will help readers to understand the modeling of feasibility problems such as Sudoku and 8-queens.

We used several very nonconvex sets and applied an algorithm designed for convex sets, yet it converged to a solution. There is no theory for the nonconvex case, although we have demonstrated that the algorithm seems to work. Future research might include discovering convergence proofs for the nonconvex case, starting with simple examples in the Euclidean plane or building on the paper [LLM09] by Lewis, Luke and Malick.

Finally, we showed that the operator governing the Douglas-Rachford iteration need not be a proximal map even when the two involved resolvents are; this refines an observation of Eckstein and underlines the need for monotone operator theory.


Bibliography

[AY89] Bruce Abramson and Moti Yung. Divide and conquer under global constraints: A solution to the n-queens problem. Journal of Parallel and Distributed Computing, 6(3):649–662, 1989. Available from: http://www.sciencedirect.com/science/article/B6WKJ-4BRJHWS-13/2/44424d257adca8566925a5ca897293e9. → pages 47

[BCL02] Heinz H. Bauschke, Patrick L. Combettes, and D. Russell Luke. Phase retrieval, error reduction algorithm, and Fienup variants: a view from convex optimization. J. Opt. Soc. Amer. A, 19(7):1334–1345, 2002. Available from: http://dx.doi.org/10.1364/JOSAA.19.001334. → pages 1

[bir09] Interdisciplinary workshop on fixed-point algorithms for inverse problems in science and engineering, November 2009. Available from: http://www.birs.ca/birspages.php?task=displayevent&event_id=09w5006. → pages 2

[Bor83] J. M. Borwein. Adjoint process duality. Math. Oper. Res., 8(3):403–434, 1983. Available from: http://dx.doi.org/10.1287/moor.8.3.403. → pages 42

[BT09] Seymour S. Block and Santiago A. Tavares. Before Sudoku. Oxford University Press, Oxford, 2009. The world of magic squares. → pages 60

[BWY09] Heinz H. Bauschke, Xianfu Wang, and Liangjin Yao. On Borwein-Wiersma decompositions of monotone linear relations, 2009. Available from: http://www.citebase.org/abstract?id=oai:arXiv.org:0912.2772. → pages 41

[Com04] Patrick L. Combettes. Solving monotone inclusions via compositions of nonexpansive averaged operators. Optimization, 53(5-6):475–504, 2004. Available from: http://dx.doi.org/10.1080/02331930412331327157. → pages 43


[Cro98] Ronald Cross. Multivalued linear operators, volume 213 of Monographs and Textbooks in Pure and Applied Mathematics. Marcel Dekker Inc., New York, 1998. → pages 42

[Deu01] Frank Deutsch. Best approximation in inner product spaces. CMS Books in Mathematics/Ouvrages de Mathematiques de la SMC, 7. Springer-Verlag, New York, 2001. → pages 20, 27, 31, 34

[EB92] Jonathan Eckstein and Dimitri P. Bertsekas. On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Programming, 55(3, Ser. A):293–318, 1992. Available from: http://dx.doi.org/10.1007/BF01581204. → pages 38

[Eck89] Jonathan Eckstein. Splitting Methods for Monotone Operators with Applications to Parallel Optimization. PhD dissertation, Massachusetts Institute of Technology, Cambridge, Massachusetts, June 1989. → pages 43

[EF98] Jonathan Eckstein and Michael C. Ferris. Operator-splitting methods for monotone affine variational inequalities, with a parallel application to optimal control. INFORMS J. Comput., 10(2):218–235, 1998. → pages 43

[Epp05] David Eppstein. Nonrepetitive paths and cycles in graphs with application to Sudoku. CoRR, abs/cs/0507053, 2005. → pages 60

[Eri10] Martin Erickson. Pearls of discrete mathematics. Discrete Mathematics and its Applications (Boca Raton). CRC Press, Boca Raton, FL, 2010. → pages 60

[ERT07] V. Elser, I. Rankenburg, and P. Thibault. Searching with iterated maps. Proc. Natl. Acad. Sci. USA, 104(2):418–423, 2007. Available from: http://dx.doi.org/10.1073/pnas.0606359104. → pages 1, 26

[Fie82] J. R. Fienup. Phase retrieval algorithms: a comparison. Appl. Opt., 21(15):2758–2769, 1982. Available from: http://ao.osa.org/abstract.cfm?URI=ao-21-15-2758. → pages 26

[GE08] Simon Gravel and Veit Elser. Divide and concur: A general approach to constraint satisfaction. Phys. Rev. E, 78(3):036706, Sep 2008. → pages 1


[ICC07] Second mathematical programming society international conference on continuous optimization, August 2007. Available from: http://iccopt-mopta.mcmaster.ca/. → pages 1

[Kre78] Erwin Kreyszig. Introductory functional analysis with applications. John Wiley & Sons, New York-London-Sydney, 1978. → pages 7, 33, 34, 42

[LLM09] A. S. Lewis, D. R. Luke, and J. Malick. Local linear convergence for alternating and averaged nonconvex projections. Found. Comput. Math., 9(4):485–513, 2009. Available from: http://dx.doi.org/10.1007/s10208-008-9036-y. → pages 75

[LM79] P.-L. Lions and B. Mercier. Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal., 16(6):964–979, 1979. Available from: http://dx.doi.org/10.1137/0716071. → pages 26

[Mey00] Carl Meyer. Matrix analysis and applied linear algebra. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2000. With 1 CD-ROM (Windows, Macintosh and UNIX) and a solutions manual (iv+171 pp.). → pages 4, 34, 38

[Min62] George J. Minty. Monotone (nonlinear) operators in Hilbert space. Duke Math. J., 29:341–346, 1962. → pages 39

[MN88] Jan R. Magnus and Heinz Neudecker. Matrix differential calculus with applications in statistics and econometrics. Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics. John Wiley & Sons Ltd., Chichester, 1988. → pages 40

[Opi67] Zdzisław Opial. Weak convergence of the sequence of successive approximations for nonexpansive mappings. Bull. Amer. Math. Soc., 73:591–597, 1967. → pages 29

[Que] Queen moves. Available from: http://www.chessdryad.com/education/magictheater/queens/index.htm. → pages 47

[RBC74] W. W. Rouse Ball and H. S. M. Coxeter. Mathematical recreations & essays. University of Toronto Press, Toronto, Ont., twelfth edition, 1974. → pages 47, 48


[Reh] Julie Rehmeyer. The Sudoku solution. Science News. Available from: http://sciencenews.org/view/generic/id/39529/title/The_Sudoku_solution. → pages 1

[Roc70] R. T. Rockafellar. On the maximal monotonicity of subdifferential mappings. Pacific J. Math., 33:209–216, 1970. → pages 38

[Roc97] R. Tyrrell Rockafellar. Convex analysis. Princeton Landmarks in Mathematics. Princeton University Press, Princeton, NJ, 1997. Reprint of the 1970 original, Princeton Paperbacks. → pages 37, 39, 41

[Rud91] Walter Rudin. Functional analysis. International Series in Pure and Applied Mathematics. McGraw-Hill Inc., New York, second edition, 1991. → pages 31

[RW98] R. Tyrrell Rockafellar and Roger J-B. Wets. Variational analysis, volume 317 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, Berlin, 1998. → pages 36, 39, 40, 42

[Sim98] Stephen Simons. Minimax and monotonicity, volume 1693 of Lecture Notes in Mathematics. Springer-Verlag, Berlin, 1998. → pages 38

[Wik] Sudoku. Available from: http://en.wikipedia.org/wiki/Sudoku. → pages 60

[Zal02] C. Zalinescu. Convex analysis in general vector spaces. World Scientific Publishing Co. Inc., River Edge, NJ, 2002. → pages 37, 39

[Zar71] Eduardo H. Zarantonello. Projections on convex sets in Hilbert space and spectral theory. I. Projections on convex sets. In Contributions to nonlinear functional analysis (Proc. Sympos., Math. Res. Center, Univ. Wisconsin, Madison, Wis., 1971), pages 237–341. Academic Press, New York, 1971. → pages 32


Appendix A

8 Queens PHP Code

<?php

// This code is prototype and is provided as is under the GNU General Public License
// Author: Jason Schaad, August 11, 2010

$n = 8;

function checkqueens($matrix) {

    // Round the matrix
    for ($x = 1; $x <= 8; $x++) {
        for ($y = 1; $y <= 8; $y++) {
            if ($matrix[$x][$y] >= 0.5) {
                $matrix[$x][$y] = 1;
            }
        }
    }

    // PrintMatrix($matrix);
    $total = 0;

    // HORIZONTAL
    for ($x = 1; $x <= 8; $x++) {
        $sum = 0;
        for ($y = 1; $y <= 8; $y++) {
            $sum += $matrix[$x][$y];
        }
        if ($sum == 1) {
            $total++;
        }
    }

    // VERTICAL


    for ($x = 1; $x <= 8; $x++) {
        $sum = 0;
        for ($y = 1; $y <= 8; $y++) {
            $sum += $matrix[$y][$x];
        }
        if ($sum == 1) {
            $total++;
        }
    }

    $check = 1;

    // POSITIVE
    if (($matrix[1][1]) > 1) { $check = -1; }
    if (($matrix[2][1] + $matrix[1][2]) > 1) { $check = -1; }
    if (($matrix[3][1] + $matrix[2][2] + $matrix[1][3]) > 1) { $check = -1; }
    if (($matrix[4][1] + $matrix[3][2] + $matrix[2][3] + $matrix[1][4]) > 1) { $check = -1; }
    if (($matrix[5][1] + $matrix[4][2] + $matrix[3][3] + $matrix[2][4] + $matrix[1][5]) > 1) { $check = -1; }
    if (($matrix[6][1] + $matrix[5][2] + $matrix[4][3] + $matrix[3][4] + $matrix[2][5] + $matrix[1][6]) > 1) { $check = -1; }
    if (($matrix[7][1] + $matrix[6][2] + $matrix[5][3] + $matrix[4][4] + $matrix[3][5] + $matrix[2][6] + $matrix[1][7]) > 1) { $check = -1; }
    if (($matrix[8][1] + $matrix[7][2] + $matrix[6][3] + $matrix[5][4] + $matrix[4][5] + $matrix[3][6] + $matrix[2][7] + $matrix[1][8]) > 1) { $check = -1; }
    if (($matrix[8][2] + $matrix[7][3] + $matrix[6][4] + $matrix[5][5] + $matrix[4][6] + $matrix[3][7] + $matrix[2][8]) > 1) { $check = -1; }
    if (($matrix[8][3] + $matrix[7][4] + $matrix[6][5] + $matrix[5][6] + $matrix[4][7] + $matrix[3][8]) > 1) { $check = -1; }
    if (($matrix[8][4] + $matrix[7][5] + $matrix[6][6] + $matrix[5][7] + $matrix[4][8]) > 1) { $check = -1; }
    if (($matrix[8][5] + $matrix[7][6] + $matrix[6][7] + $matrix[5][8]) > 1) { $check = -1; }
    if (($matrix[8][6] + $matrix[7][7] + $matrix[6][8]) > 1) { $check = -1; }
    if (($matrix[8][7] + $matrix[7][8]) > 1) { $check = -1; }
    if (($matrix[8][8]) > 1) { $check = -1; }

    // NEGATIVE
    if (($matrix[8][1]) > 1) { $check = -1; }
    if (($matrix[7][1] + $matrix[8][2]) > 1) { $check = -1; }
    if (($matrix[6][1] + $matrix[7][2] + $matrix[8][3]) > 1) { $check = -1; }
    if (($matrix[5][1] + $matrix[6][2] + $matrix[7][3] + $matrix[8][4]) > 1) { $check = -1; }
    if (($matrix[4][1] + $matrix[5][2] + $matrix[6][3] + $matrix[7][4] + $matrix[8][5]) > 1) { $check = -1; }


if (($matrix[3][1] + $matrix[4][2] + $matrix[5][3] + $matrix[6][4] +
     $matrix[7][5] + $matrix[8][6]) > 1) { $check = -1; }
if (($matrix[2][1] + $matrix[3][2] + $matrix[4][3] + $matrix[5][4] +
     $matrix[6][5] + $matrix[7][6] + $matrix[8][7]) > 1) { $check = -1; }
if (($matrix[1][1] + $matrix[2][2] + $matrix[3][3] + $matrix[4][4] +
     $matrix[5][5] + $matrix[6][6] + $matrix[7][7] + $matrix[8][8]) > 1) { $check = -1; }
if (($matrix[1][2] + $matrix[2][3] + $matrix[3][4] + $matrix[4][5] +
     $matrix[5][6] + $matrix[6][7] + $matrix[7][8]) > 1) { $check = -1; }
if (($matrix[1][3] + $matrix[2][4] + $matrix[3][5] + $matrix[4][6] +
     $matrix[5][7] + $matrix[6][8]) > 1) { $check = -1; }
if (($matrix[1][4] + $matrix[2][5] + $matrix[3][6] + $matrix[4][7] +
     $matrix[5][8]) > 1) { $check = -1; }
if (($matrix[1][5] + $matrix[2][6] + $matrix[3][7] + $matrix[4][8]) > 1) { $check = -1; }
if (($matrix[1][6] + $matrix[2][7] + $matrix[3][8]) > 1) { $check = -1; }
if (($matrix[1][7] + $matrix[2][8]) > 1) { $check = -1; }
if (($matrix[1][8]) > 1) { $check = -1; }

if (($total == 16) && ($check == 1)) {
    return 1;
} else {
    return 0;
}
}
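The long run of if-statements above hard-codes, for each of the thirty diagonals, the constraint that a diagonal may carry at most one queen (rows and columns are checked earlier in the function). The same feasibility test can be written generically; the following Python sketch is an illustration only (the helper name is ours, and indices are 0-based rather than the 1-based PHP arrays):

```python
# Sketch (not the thesis code): the 8-queens feasibility test that
# checkqueens() performs with explicit sums, written with loops.
# `board` is an 8x8 list of 0/1 entries, indexed from 0 here.

def is_queens_solution(board):
    n = len(board)
    # every row and every column must contain exactly one queen
    for i in range(n):
        if sum(board[i]) != 1:
            return False
        if sum(board[r][i] for r in range(n)) != 1:
            return False
    # every diagonal (constant r+c or constant r-c) holds at most one queen
    for d in range(2 * n - 1):
        if sum(board[r][c] for r in range(n) for c in range(n)
               if r + c == d) > 1:
            return False
        if sum(board[r][c] for r in range(n) for c in range(n)
               if r - c == d - (n - 1)) > 1:
            return False
    return True
```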

function chessround($x, $y) {
    if ($y == 1) {
        if ($x >= 0.5) {
            return "<img height=50 width=50 src=3.png>";
        } else {
            return "<img height=50 width=50 src=2.png>";
        }
    } else if ($y == 0) {
        if ($x >= 0.5) {
            return "<img height=50 width=50 src=4.png>";
        } else {
            return "<img height=50 width=50 src=1.png>";
        }
    }
}


function PrintChessMatrix($matrix) {
    echo "<table border=1>";
    $n = sizeof($matrix);
    for ($x = 1; $x <= $n; $x++) {
        echo "<tr>";
        for ($y = 1; $y <= $n; $y++) {
            if (($x % 2) == 1) {
                if (($y % 2) == 1) {
                    echo "<td bgcolor=#999999>" . chessround($matrix[$x][$y], 1) . "</td>";
                } else {
                    echo "<td bgcolor=#FFFFFF>" . chessround($matrix[$x][$y], 0) . "</td>";
                }
            } else {
                if (($y % 2) == 0) {
                    echo "<td bgcolor=#999999>" . chessround($matrix[$x][$y], 1) . "</td>";
                } else {
                    echo "<td bgcolor=#FFFFFF>" . chessround($matrix[$x][$y], 0) . "</td>";
                }
            }
        }
        echo "</tr>";
    }
    echo "</table>";
} //end function

function PrintMatrix($matrix) {
    echo "<table border=1>";
    $n = sizeof($matrix);
    for ($x = 1; $x <= $n; $x++) {
        echo "<tr>";
        for ($y = 1; $y <= $n; $y++) {
            if (($x % 2) == 1) {
                if (($y % 2) == 1) {
                    echo "<td bgcolor=#999999>" . $matrix[$x][$y] . "</td>";
                } else {
                    echo "<td bgcolor=#FFFFFF>" . $matrix[$x][$y] . "</td>";
                }
            } else {
                if (($y % 2) == 0) {
                    echo "<td bgcolor=#999999>" . $matrix[$x][$y] . "</td>";
                } else {
                    echo "<td bgcolor=#FFFFFF>" . $matrix[$x][$y] . "</td>";
                }
            }
        }
        echo "</tr>";
    }
    echo "</table>";
} //end function

function makeunitcol($matrix, $colnumber, $entrytomakezero) {
    $n = sizeof($matrix);
    for ($x = 1; $x <= $n; $x++) {
        $matrix[$x][$colnumber] = 0;
    }
    if ($entrytomakezero > 0) {
        $matrix[round($entrytomakezero)][$colnumber] = 1;
    }
    return $matrix;
}

function makeunitrow($matrix, $rownumber, $entrytomakezero) {
    $n = sizeof($matrix);
    for ($y = 1; $y <= $n; $y++) {
        $matrix[$rownumber][$y] = 0;
    }
    if ($entrytomakezero > 0) {
        $matrix[$rownumber][round($entrytomakezero)] = 1;
    }
    return $matrix;
}

function columnproj($matrix) {
    $n = sizeof($matrix);
    for ($y = 1; $y <= $n; $y++) {
        $imax = 1;
        $entrymax = $matrix[1][$y];
        for ($x = 1; $x <= $n; $x++) {
            if ($matrix[$x][$y] > $entrymax) {
                $imax = $x;
                $entrymax = $matrix[$x][$y];
            }
        }
        $matrix = makeunitcol($matrix, $y, $imax);
    }
    return $matrix;
} //end function

function rowproj($matrix) {
    $n = sizeof($matrix);
    for ($x = 1; $x <= $n; $x++) {
        $imax = 1;
        $entrymax = $matrix[$x][1];
        for ($y = 1; $y <= $n; $y++) {
            if ($matrix[$x][$y] > $entrymax) {
                $imax = $y;
                $entrymax = $matrix[$x][$y];
            }
        }
        $matrix = makeunitrow($matrix, $x, $imax);
    }
    return $matrix;
} //end function

function postiveslopeproj($matrix) {

    $v1[1] = $matrix[1][1];
    $v2[1] = $matrix[2][1]; $v2[2] = $matrix[1][2];
    $v3[1] = $matrix[3][1]; $v3[2] = $matrix[2][2]; $v3[3] = $matrix[1][3];
    $v4[1] = $matrix[4][1]; $v4[2] = $matrix[3][2]; $v4[3] = $matrix[2][3];
    $v4[4] = $matrix[1][4];
    $v5[1] = $matrix[5][1]; $v5[2] = $matrix[4][2]; $v5[3] = $matrix[3][3];
    $v5[4] = $matrix[2][4]; $v5[5] = $matrix[1][5];
    $v6[1] = $matrix[6][1]; $v6[2] = $matrix[5][2]; $v6[3] = $matrix[4][3];
    $v6[4] = $matrix[3][4]; $v6[5] = $matrix[2][5]; $v6[6] = $matrix[1][6];
    $v7[1] = $matrix[7][1]; $v7[2] = $matrix[6][2]; $v7[3] = $matrix[5][3];
    $v7[4] = $matrix[4][4]; $v7[5] = $matrix[3][5]; $v7[6] = $matrix[2][6];
    $v7[7] = $matrix[1][7];
    $v8[1] = $matrix[8][1]; $v8[2] = $matrix[7][2]; $v8[3] = $matrix[6][3];
    $v8[4] = $matrix[5][4]; $v8[5] = $matrix[4][5]; $v8[6] = $matrix[3][6];
    $v8[7] = $matrix[2][7]; $v8[8] = $matrix[1][8];
    $v9[1] = $matrix[8][2]; $v9[2] = $matrix[7][3]; $v9[3] = $matrix[6][4];
    $v9[4] = $matrix[5][5]; $v9[5] = $matrix[4][6]; $v9[6] = $matrix[3][7];
    $v9[7] = $matrix[2][8];
    $v10[1] = $matrix[8][3]; $v10[2] = $matrix[7][4]; $v10[3] = $matrix[6][5];
    $v10[4] = $matrix[5][6]; $v10[5] = $matrix[4][7]; $v10[6] = $matrix[3][8];
    $v11[1] = $matrix[8][4]; $v11[2] = $matrix[7][5]; $v11[3] = $matrix[6][6];
    $v11[4] = $matrix[5][7]; $v11[5] = $matrix[4][8];
    $v12[1] = $matrix[8][5]; $v12[2] = $matrix[7][6]; $v12[3] = $matrix[6][7];
    $v12[4] = $matrix[5][8];
    $v13[1] = $matrix[8][6]; $v13[2] = $matrix[7][7]; $v13[3] = $matrix[6][8];
    $v14[1] = $matrix[8][7]; $v14[2] = $matrix[7][8];
    $v15[1] = $matrix[8][8];

    if ($v1[1] >= 0.5) {
        $matrix[1][1] = 1;
    } else {
        $matrix[1][1] = 0;
    }

    $diagnames = array('v2', 'v3', 'v4', 'v5', 'v6', 'v7', 'v8', 'v9', 'v10', 'v11',
        'v12', 'v13', 'v14');
    $unitnames = array('w2', 'w3', 'w4', 'w5', 'w6', 'w7', 'w8', 'w9', 'w10', 'w11',
        'w12', 'w13', 'w14');

    for ($outsidecounter = 2; $outsidecounter <= 8; $outsidecounter++) {
        $leftsum = 0;
        $rightsum = 0;
        ${$unitnames[$outsidecounter - 2]} =
            unitproj(${$diagnames[$outsidecounter - 2]}, $outsidecounter);
        for ($insidecounter = 1; $insidecounter <= $outsidecounter; $insidecounter++) {
            $leftsum = $leftsum + pow(${$diagnames[$outsidecounter - 2]}[$insidecounter], 2);
            $rightsum = $rightsum + pow(${$diagnames[$outsidecounter - 2]}[$insidecounter] -
                ${$unitnames[$outsidecounter - 2]}[$insidecounter], 2);
        }
        // echo " LS-> " . $leftsum . " <-RS-> " . $rightsum . "<- ";
        if (sqrt($leftsum) < sqrt($rightsum)) {
            // echo " All Zeros ";
            for ($num1 = $outsidecounter; $num1 >= 1; $num1--) {
                for ($num2 = 1; $num2 <= $outsidecounter; $num2++) {
                    if (($num1 + $num2) == ($outsidecounter + 1)) {
                        $matrix[$num1][$num2] = 0;
                    }
                }
            }
        } else {
            // echo " Unit Vector ";
            $unitvectorcounter = 0;
            for ($num1 = $outsidecounter; $num1 >= 1; $num1--) {
                for ($num2 = 1; $num2 <= $outsidecounter; $num2++) {
                    if (($num1 + $num2) == ($outsidecounter + 1)) {
                        $unitvectorcounter++;
                        $matrix[$num1][$num2] =
                            ${$unitnames[$outsidecounter - 2]}[$unitvectorcounter];
                    }
                }
            }
        }
    }


    // ****************** NOW WE do the remaining 7 diagonals ******************
    if ($v15[1] >= 0.5) {
        $matrix[8][8] = 1;
    } else {
        $matrix[8][8] = 0;
    }

    for ($outsidecounter = 14; $outsidecounter >= 9; $outsidecounter--) {
        $leftsum = 0;
        $rightsum = 0;
        ${$unitnames[$outsidecounter - 2]} =
            unitproj(${$diagnames[$outsidecounter - 2]}, (16 - $outsidecounter));
        for ($insidecounter = 1; $insidecounter <= (16 - $outsidecounter); $insidecounter++) {
            $leftsum = $leftsum + pow(${$diagnames[$outsidecounter - 2]}[$insidecounter], 2);
            $rightsum = $rightsum + pow(${$diagnames[$outsidecounter - 2]}[$insidecounter] -
                ${$unitnames[$outsidecounter - 2]}[$insidecounter], 2);
        }

        if (sqrt($leftsum) < sqrt($rightsum)) {
            // echo " All Zeros ";
            for ($num1 = 1; $num1 <= 8; $num1++) {
                for ($num2 = 1; $num2 <= 8; $num2++) {
                    if (($num1 + $num2) == ($outsidecounter + 1)) {
                        $matrix[$num2][$num1] = 0;
                    }
                }
            }
        } else {
            // echo " Unit Vector ";
            $unitvectorcounter = 0;
            for ($num1 = 1; $num1 <= 8; $num1++) {
                for ($num2 = 1; $num2 <= 8; $num2++) {
                    if (($num1 + $num2) == ($outsidecounter + 1)) {
                        $unitvectorcounter++;
                        $matrix[$num2][$num1] =
                            ${$unitnames[$outsidecounter - 2]}[$unitvectorcounter];
                    }
                }
            }
        }
    }

    return $matrix;
} //end function


function clockwiserotation($matrix) {
    for ($x = 1; $x <= 8; $x++) {
        for ($y = 1; $y <= 8; $y++) {
            $newmatrix[$x][$y] = $matrix[9 - $y][$x];
        }
    }
    return $newmatrix;
} //end function

function counterclockwiserotation($matrix) {
    for ($x = 1; $x <= 8; $x++) {
        for ($y = 1; $y <= 8; $y++) {
            $newmatrix[9 - $y][$x] = $matrix[$x][$y];
        }
    }
    return $newmatrix;
} //end function

function unitproj($v, $size) {
    $imax = 1;
    $entrymax = $v[1];
    for ($i = 2; $i <= $size; $i++) {
        if ($v[$i] > $entrymax) {
            $imax = $i;
            $entrymax = $v[$i];
        }
    }
    return makeunit($imax, $size);
}

function makeunit($i, $size) {
    for ($x = 1; $x <= $size; $x++) {
        $v[$x] = 0;
    }
    if ($i > 0) {
        $v[round($i)] = 1;
    }
    return $v;
}
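unitproj and makeunit together implement the basic nonconvex projection used throughout this appendix: a vector is sent to the standard unit vector with a 1 at its largest component, which is indeed its nearest point in the set of standard unit vectors. A minimal Python sketch of the same operation (illustration only; the first maximum wins on ties, matching the strict `>` comparison above):

```python
# Sketch of the projection that unitproj()/makeunit() implement:
# project a vector onto the (nonconvex) set of standard unit vectors
# by placing a 1 at the index of the largest component.

def unit_proj(v):
    imax = max(range(len(v)), key=lambda i: v[i])  # first index of the maximum
    return [1.0 if i == imax else 0.0 for i in range(len(v))]
```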

function addcubes($M1, $M2) {
    for ($i = 1; $i <= 8; $i++) {
        for ($j = 1; $j <= 8; $j++) {
            $S[$i][$j] = $M1[$i][$j] + $M2[$i][$j];
        }
    }
    return $S;
}

function scalmultcube($alpha, $T) {
    for ($i = 1; $i <= 8; $i++) {
        for ($j = 1; $j <= 8; $j++) {
            $S[$i][$j] = $alpha * $T[$i][$j];
        }
    }
    return $S;
} // end function

function diagproj($T1, $T2, $T3, $T4) {
    $T = addcubes($T1, $T2);
    $T = addcubes($T, $T3);
    $T = addcubes($T, $T4);
    $T = scalmultcube(0.25, $T);
    return $T;
}

function new1($T1, $T2, $T3, $T4) {
    $PD = diagproj($T1, $T2, $T3, $T4);
    $S1 = scalmultcube(2, $PD);
    $S2 = scalmultcube(-1, $T1);
    $S3 = addcubes($S1, $S2);
    $S4 = columnproj($S3);
    $S5 = addcubes($T1, scalmultcube(-1, $PD));
    $S = addcubes($S4, $S5);
    return $S;
}

function new2($T1, $T2, $T3, $T4) {
    $PD = diagproj($T1, $T2, $T3, $T4);
    $S1 = scalmultcube(2, $PD);
    $S2 = scalmultcube(-1, $T2);
    $S3 = addcubes($S1, $S2);
    $S4 = rowproj($S3);
    $S5 = addcubes($T2, scalmultcube(-1, $PD));
    $S = addcubes($S4, $S5);
    return $S;
}

function new3($T1, $T2, $T3, $T4) {
    $PD = diagproj($T1, $T2, $T3, $T4);
    $S1 = scalmultcube(2, $PD);
    $S2 = scalmultcube(-1, $T3);
    $S3 = addcubes($S1, $S2);
    $S4 = postiveslopeproj($S3);
    $S5 = addcubes($T3, scalmultcube(-1, $PD));
    $S = addcubes($S4, $S5);
    return $S;
}

function new4($T1, $T2, $T3, $T4) {
    global $origS;
    $PD = diagproj($T1, $T2, $T3, $T4);
    $S1 = scalmultcube(2, $PD);
    $S2 = scalmultcube(-1, $T4);
    $S3 = addcubes($S1, $S2);
    $S4 = counterclockwiserotation(postiveslopeproj(clockwiserotation($S3)));
    $S5 = addcubes($T4, scalmultcube(-1, $PD));
    $S = addcubes($S4, $S5);
    return $S;
}
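The functions new1 through new4 are the four coordinates of the Douglas-Rachford iteration on the product space: with $PD the average of the four copies (the projection onto the diagonal) and P the constraint projection attached to that copy, each computes P(2*$PD - T) + T - $PD. A generic Python sketch of this update, with the projections passed in as functions (illustration only, not the thesis code):

```python
# Sketch (not the thesis code) of the product-space Douglas-Rachford
# update that new1()..new4() implement: with pd the average of the
# copies and proj the projection for a copy,
#   T  <-  proj(2*pd - T) + T - pd.
# Here each "copy" is a flat list of floats.

def dr_step(copies, projections):
    n = len(copies[0])
    # the diagonal projection: componentwise average of all copies
    pd = [sum(t[i] for t in copies) / len(copies) for i in range(n)]
    new_copies = []
    for t, proj in zip(copies, projections):
        reflected = [2 * pd[i] - t[i] for i in range(n)]
        step = proj(reflected)
        new_copies.append([step[i] + t[i] - pd[i] for i in range(n)])
    return new_copies
```

A point on the diagonal whose value lies in every constraint set is a fixed point of this map, which is why the loop below monitors the average of the four copies.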

$origMatrix[1][1] = $_POST['11']; $origMatrix[1][2] = $_POST['12'];
$origMatrix[1][3] = $_POST['13']; $origMatrix[1][4] = $_POST['14'];
$origMatrix[1][5] = $_POST['15']; $origMatrix[1][6] = $_POST['16'];
$origMatrix[1][7] = $_POST['17']; $origMatrix[1][8] = $_POST['18'];

$origMatrix[2][1] = $_POST['21']; $origMatrix[2][2] = $_POST['22'];
$origMatrix[2][3] = $_POST['23']; $origMatrix[2][4] = $_POST['24'];
$origMatrix[2][5] = $_POST['25']; $origMatrix[2][6] = $_POST['26'];
$origMatrix[2][7] = $_POST['27']; $origMatrix[2][8] = $_POST['28'];

$origMatrix[3][1] = $_POST['31']; $origMatrix[3][2] = $_POST['32'];
$origMatrix[3][3] = $_POST['33']; $origMatrix[3][4] = $_POST['34'];
$origMatrix[3][5] = $_POST['35']; $origMatrix[3][6] = $_POST['36'];
$origMatrix[3][7] = $_POST['37']; $origMatrix[3][8] = $_POST['38'];

$origMatrix[4][1] = $_POST['41']; $origMatrix[4][2] = $_POST['42'];
$origMatrix[4][3] = $_POST['43']; $origMatrix[4][4] = $_POST['44'];
$origMatrix[4][5] = $_POST['45']; $origMatrix[4][6] = $_POST['46'];
$origMatrix[4][7] = $_POST['47']; $origMatrix[4][8] = $_POST['48'];

$origMatrix[5][1] = $_POST['51']; $origMatrix[5][2] = $_POST['52'];
$origMatrix[5][3] = $_POST['53']; $origMatrix[5][4] = $_POST['54'];
$origMatrix[5][5] = $_POST['55']; $origMatrix[5][6] = $_POST['56'];
$origMatrix[5][7] = $_POST['57']; $origMatrix[5][8] = $_POST['58'];

$origMatrix[6][1] = $_POST['61']; $origMatrix[6][2] = $_POST['62'];
$origMatrix[6][3] = $_POST['63']; $origMatrix[6][4] = $_POST['64'];
$origMatrix[6][5] = $_POST['65']; $origMatrix[6][6] = $_POST['66'];
$origMatrix[6][7] = $_POST['67']; $origMatrix[6][8] = $_POST['68'];

$origMatrix[7][1] = $_POST['71']; $origMatrix[7][2] = $_POST['72'];
$origMatrix[7][3] = $_POST['73']; $origMatrix[7][4] = $_POST['74'];
$origMatrix[7][5] = $_POST['75']; $origMatrix[7][6] = $_POST['76'];
$origMatrix[7][7] = $_POST['77']; $origMatrix[7][8] = $_POST['78'];

$origMatrix[8][1] = $_POST['81']; $origMatrix[8][2] = $_POST['82'];
$origMatrix[8][3] = $_POST['83']; $origMatrix[8][4] = $_POST['84'];
$origMatrix[8][5] = $_POST['85']; $origMatrix[8][6] = $_POST['86'];
$origMatrix[8][7] = $_POST['87']; $origMatrix[8][8] = $_POST['88'];

$iteration = 20000;

echo "<table border=2 cellpadding=5 cellspacing=5>";
echo "<tr><th>Iteration</th><th>Chess Board</th><th>";
echo "Current Matrix";
echo "</th>";
echo "</tr>";
echo "<tr><td>0</td><td>";
PrintChessMatrix($origMatrix);
echo "</td><td>";
PrintMatrix($origMatrix);
echo "</td>";

$T1old = $origMatrix;
$T2old = $origMatrix;
$T3old = $origMatrix;
$T4old = $origMatrix;
$Tadaold = diagproj($T1old, $T2old, $T3old, $T4old);

$solved = 0;
$i = 0;

while (!$solved) {
    $T1 = new1($T1old, $T2old, $T3old, $T4old);
    $T2 = new2($T1old, $T2old, $T3old, $T4old);
    $T3 = new3($T1old, $T2old, $T3old, $T4old);
    $T4 = new4($T1old, $T2old, $T3old, $T4old);
    $Tada = diagproj($T1, $T2, $T3, $T4);

    if (checkqueens($Tada)) {
        $solved = 1;
    }

    echo "<tr><td>$i</td><td>";
    PrintChessMatrix($Tada);
    echo "</td><td>";
    PrintMatrix($Tada);
    echo "</td>";
    echo "</tr>";
    $i++;

    $T1old = $T1;
    $T2old = $T2;
    $T3old = $T3;
    $T4old = $T4;
}
echo "<big>Success! Solved in " . ($i - 1) . " iterations.</big>";
?>


Appendix B

Sudoku PHP Code

We first have sudokufunctions.php, then our sudoku code.
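Following Elser, the Sudoku is encoded below as a 9x9x9 cube of 0-1 entries, where a 1 in position (i, j, k) means that cell (i, j) holds digit k; the functions that follow project this cube onto the row, column, meeting-room, and given-clue constraint sets. A small Python sketch of the encoding and of the flattening that makearray performs (illustration only; 0-based indices and hypothetical helper names):

```python
# Sketch (not the thesis code) of the 9x9x9 cube encoding used below:
# cube[i][j][k] == 1 means cell (i, j) of the Sudoku holds digit k + 1
# (0-based here; the PHP code indexes from 1).

def to_cube(grid):
    # grid: 9x9 list with entries 1..9 (0 for an empty cell)
    return [[[1 if grid[i][j] == k + 1 else 0 for k in range(9)]
             for j in range(9)] for i in range(9)]

def to_grid(cube):
    # inverse of to_cube for fully determined cubes, like makearray():
    # each cell takes the digit of its largest entry
    return [[1 + max(range(9), key=lambda k: cube[i][j][k])
             for j in range(9)] for i in range(9)]
```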

<?php

// This code is prototype and is provided as is under the GNU General Public License
// Author: Jason Schaad August 11, 2010

function makeunit($i) {
    // returns a vector (array)
    // $i should be a float
    $v[1] = 0; $v[2] = 0; $v[3] = 0; $v[4] = 0; $v[5] = 0;
    $v[6] = 0; $v[7] = 0; $v[8] = 0; $v[9] = 0;
    // easier way to initialize an array
    if ($i > 0) {
        $v[round($i)] = 1;
    }
    return $v;
}

function unitproj($v) {
    // input $v is a vector
    // finds biggest component and makes 1 and the rest 0
    // e.g. [0.4, .6, .7, .8, 0, 0, 0, 0, 0] = [0, 0, 0, 1, 0, 0, 0, 0, 0]
    // returns a vector (array)
    // relies on makeunit
    $imax = 1;
    $entrymax = $v[1];
    for ($i = 2; $i <= 9; $i++) {
        if ($v[$i] > $entrymax) {
            $imax = $i;
            $entrymax = $v[$i];
        }
    }
    return makeunit($imax);
}

function addcubes($T1, $T2) {
    // add 2 9x9x9 cubes
    // returns the "sum cube"
    for ($i = 1; $i <= 9; $i++) {
        for ($j = 1; $j <= 9; $j++) {
            for ($k = 1; $k <= 9; $k++) {
                $S[$i][$j][$k] = $T1[$i][$j][$k] + $T2[$i][$j][$k];
            }
        }
    }
    return $S;
}

function printcube($cube) {
    // this function prints the cube to the screen
    // not included in unit tests as it is never used in my actual sudoku code
    // it was used for debugging when I was creating the code
    for ($x = 1; $x <= 9; $x++) {
        for ($y = 1; $y <= 9; $y++) {
            for ($z = 1; $z <= 9; $z++) {
                echo "(" . $x . "," . $y . "," . $z . ")=" . $cube[$x][$y][$z] . "<br>";
            }
        }
    }
}

function printarray($square) {
    // this function prints a 2 dimensional array to the screen (our "flattened" Sudoku)
    // there is no need for a unit test for this code as it outputs to the screen only
    for ($x = 1; $x <= 9; $x++) {
        for ($y = 1; $y <= 9; $y++) {
            echo $square[$x][$y] . " ";
        }
        echo "<br />";
    }
}

function columnproj($Tcube) {
    // finds the biggest entry in every column of our sudoku cube
    // and makes it into a unit vector - returns a cube
    for ($j = 1; $j <= 9; $j++) {
        for ($k = 1; $k <= 9; $k++) {
            for ($i = 1; $i <= 9; $i++) {
                $tempv[$i] = $Tcube[$i][$j][$k];
            }
            $tempw = unitproj($tempv);
            for ($i = 1; $i <= 9; $i++) {
                $cprojTcube[$i][$j][$k] = $tempw[$i];
            }
        }
    }
    return $cprojTcube;
}

function rowproj($Tcube) {
    // finds the biggest entry in every row of our sudoku cube
    // and makes it into a unit vector which it returns
    for ($i = 1; $i <= 9; $i++) {
        for ($k = 1; $k <= 9; $k++) {
            for ($j = 1; $j <= 9; $j++) {
                $tempv[$j] = $Tcube[$i][$j][$k];
            }
            $tempw = unitproj($tempv);
            for ($j = 1; $j <= 9; $j++) {
                $rprojTcube[$i][$j][$k] = $tempw[$j];
            }
        }
    }
    return $rprojTcube;
}

function meetproj($Tcube) {
    // This function calculates a "unit vector" based on the
    // meeting rooms (the 3 x 3 rooms); it uses the meetproj2 function
    // to do a lot of the work (loops are good.. and so is re-using your code)
    // initialize mprojTcube to have all 0's
    for ($i = 1; $i <= 9; $i++) {
        for ($j = 1; $j <= 9; $j++) {
            for ($k = 1; $k <= 9; $k++) {
                $mprojTcube[$i][$j][$k] = 0;
            }
        }
    }

    // 1-1 subcube
    $mprojTcube = meetproj2($mprojTcube, $Tcube, 1, 1);
    // 1-2 subcube
    $mprojTcube = meetproj2($mprojTcube, $Tcube, 4, 1);
    // 1-3 subcube
    $mprojTcube = meetproj2($mprojTcube, $Tcube, 7, 1);
    // 2-1 subcube
    $mprojTcube = meetproj2($mprojTcube, $Tcube, 1, 4);
    // 2-2 subcube
    $mprojTcube = meetproj2($mprojTcube, $Tcube, 4, 4);
    // 2-3 subcube
    $mprojTcube = meetproj2($mprojTcube, $Tcube, 7, 4);
    // 3-1 subcube
    $mprojTcube = meetproj2($mprojTcube, $Tcube, 1, 7);
    // 3-2 subcube
    $mprojTcube = meetproj2($mprojTcube, $Tcube, 4, 7);
    // 3-3 subcube
    $mprojTcube = meetproj2($mprojTcube, $Tcube, 7, 7);

    return $mprojTcube;
}

function meetproj2($mprojTcube, $Tcube, $index1, $index2) {
    // This function calculates a "unit vector" based on the
    // meeting rooms (the 3 x 3 rooms)
    // this is coded using loops and the variables $index1 and $index2
    // help get the indices correct for the funny "unit vector"
    for ($k = 1; $k <= 9; $k++) {
        for ($j = 0; $j < 3; $j++) {
            for ($i = 0; $i < 3; $i++) {
                $tempv[1 + $i + 3 * $j] = $Tcube[$index2 + $j][$index1 + $i][$k];
            }
        }
        $tempw = unitproj($tempv);
        for ($j = 0; $j < 3; $j++) {
            for ($i = 0; $i < 3; $i++) {
                $mprojTcube[$index2 + $j][$index1 + $i][$k] = $tempw[1 + $i + 3 * $j];
            }
        }
    }
    return $mprojTcube;
}
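meetproj2 linearizes one 3x3 meeting room with the index map 1 + $i + 3*$j (row-major within the room), applies unitproj to the resulting 9-vector, and scatters the unit vector back. The same step for a single digit layer, sketched in Python (illustration only; 0-based indices and a hypothetical name):

```python
# Sketch of what meetproj2() does for one 3x3 room and one digit layer:
# gather the nine entries row-major, replace them by the unit vector at
# the largest entry, and scatter them back (0-based indices here).

def block_proj(layer, r0, c0):
    # layer: 9x9 list of floats; (r0, c0): top-left corner of the room
    flat = [layer[r0 + j][c0 + i] for j in range(3) for i in range(3)]
    imax = max(range(9), key=lambda t: flat[t])
    unit = [1.0 if t == imax else 0.0 for t in range(9)]
    out = [row[:] for row in layer]
    for j in range(3):
        for i in range(3):
            out[r0 + j][c0 + i] = unit[i + 3 * j]
    return out
```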

function scalmultcube($alpha, $T) {
    // Takes as input a cube and a scalar and multiplies them - output is a cube
    for ($i = 1; $i <= 9; $i++) {
        for ($j = 1; $j <= 9; $j++) {
            for ($k = 1; $k <= 9; $k++) {
                $S[$i][$j][$k] = $alpha * $T[$i][$j][$k];
            }
        }
    }
    return $S;
} // end function


function thesame($T1, $T2) {
    // compares two arrays - if they are the same return true, otherwise false
    $counter = 0;
    for ($i = 1; $i <= 9; $i++) {
        for ($j = 1; $j <= 9; $j++) {
            if ($T1[$i][$j] != $T2[$i][$j]) {
                $counter++;
            }
        }
    }
    if ($counter > 0) {
        return FALSE;
    } else {
        return TRUE;
    }
}

function givenproj($Tcube, $S) {
    // this function uses the original sudoku that was given to calculate
    // a projection (our given Sudoku constraint
    // - which is just a 9 by 9 array not a cube). Returns a projected cube.
    $gprojTcube = $Tcube;
    for ($i = 1; $i <= 9; $i++) {
        for ($j = 1; $j <= 9; $j++) {
            if ($S[$i][$j] > 0) {
                $tempw = makeunit($S[$i][$j]);
                for ($k = 1; $k <= 9; $k++) {
                    $gprojTcube[$i][$j][$k] = $tempw[$k];
                }
            }
            if ($S[$i][$j] == 0) {
                for ($k = 1; $k <= 9; $k++) {
                    $tempv[$k] = $Tcube[$i][$j][$k];
                }
                $tempw = unitproj($tempv);
                for ($k = 1; $k <= 9; $k++) {
                    $gprojTcube[$i][$j][$k] = $tempw[$k];
                }
            }
        }
    }
    return $gprojTcube;
} // end function

function makearray($Tcube) {
    // flattens a cube (9x9x9) into an array (9x9) so we can print it to the screen
    for ($i = 1; $i <= 9; $i++) {
        for ($j = 1; $j <= 9; $j++) {
            $temp = 1;
            $tempval = $Tcube[$i][$j][1];
            for ($k = 2; $k <= 9; $k++) {
                if ($Tcube[$i][$j][$k] > $tempval) {
                    $temp = $k;
                    $tempval = $Tcube[$i][$j][$k];
                }
            }
            $A[$i][$j] = $temp;
        }
    }
    return $A;
}

function stuckinloop($i) {
    // no need to test this function, it is too simple.
    if ($i > 100000) {
        throw new Exception('We are stuck in a loop with over 100000 iterations.');
    }
}

function diagproj($T1, $T2, $T3, $T4) {
    // takes four cubes and computes the diagonal projection
    $T = addcubes($T1, $T2);
    $T = addcubes($T, $T3);
    $T = addcubes($T, $T4);
    $T = scalmultcube(0.25, $T);
    return $T;
}

function new1($T1, $T2, $T3, $T4) {
    // creates a new cube with the column projection
    $PD = diagproj($T1, $T2, $T3, $T4);
    $S1 = scalmultcube(2, $PD);
    $S2 = scalmultcube(-1, $T1);
    $S3 = addcubes($S1, $S2);
    $S4 = columnproj($S3);
    $S5 = addcubes($T1, scalmultcube(-1, $PD));
    $S = addcubes($S4, $S5);
    return $S;
}

function new2($T1, $T2, $T3, $T4) {
    // creates a new cube with the row projection
    $PD = diagproj($T1, $T2, $T3, $T4);
    $S1 = scalmultcube(2, $PD);
    $S2 = scalmultcube(-1, $T2);
    $S3 = addcubes($S1, $S2);
    $S4 = rowproj($S3);
    $S5 = addcubes($T2, scalmultcube(-1, $PD));
    $S = addcubes($S4, $S5);
    return $S;
}

function new3($T1, $T2, $T3, $T4) {
    // creates a new cube with the meet projection
    $PD = diagproj($T1, $T2, $T3, $T4);
    $S1 = scalmultcube(2, $PD);
    $S2 = scalmultcube(-1, $T3);
    $S3 = addcubes($S1, $S2);
    $S4 = meetproj($S3);
    $S5 = addcubes($T3, scalmultcube(-1, $PD));
    $S = addcubes($S4, $S5);
    return $S;
}

function new4($T1, $T2, $T3, $T4) {
    // creates a new cube with the given projection
    global $origS;
    $PD = diagproj($T1, $T2, $T3, $T4);
    $S1 = scalmultcube(2, $PD);
    $S2 = scalmultcube(-1, $T4);
    $S3 = addcubes($S1, $S2);
    $S4 = givenproj($S3, $origS);
    $S5 = addcubes($T4, scalmultcube(-1, $PD));
    $S = addcubes($S4, $S5);
    return $S;
}

f unc t i on meetproj22 ( $Tcube ) {// This f unc t i on c a l c u l a t e s a ” un i t v e c t o r ” based on the// meeting rooms ( the 3 x 3 rooms )// I coded i t d i f f e r e n t l y ( us ing more l oops ) ,// but i t made i t much harder to read . We can see where an// ac t ua l meeting room i s l o c a t e d in our cube t h i s way !// 1−1 subcubefor ( $k=1; $k<=9; $k++) {

$tempv [ 1 ] = $Tcube [ 1 ] [ 1 ] [ $k ] ;$tempv [ 2 ] = $Tcube [ 1 ] [ 2 ] [ $k ] ;$tempv [ 3 ] = $Tcube [ 1 ] [ 3 ] [ $k ] ;$tempv [ 4 ] = $Tcube [ 2 ] [ 1 ] [ $k ] ;$tempv [ 5 ] = $Tcube [ 2 ] [ 2 ] [ $k ] ;$tempv [ 6 ] = $Tcube [ 2 ] [ 3 ] [ $k ] ;$tempv [ 7 ] = $Tcube [ 3 ] [ 1 ] [ $k ] ;

99

Page 106: Ubc 2010 Fall Schaad Jason

Appendix B. Sudoku PHP Code

$tempv [ 8 ] = $Tcube [ 3 ] [ 2 ] [ $k ] ;$tempv [ 9 ] = $Tcube [ 3 ] [ 3 ] [ $k ] ;$tempw = un i tp r o j ( $tempv ) ;$mprojTcube [ 1 ] [ 1 ] [ $k ] = $tempw [ 1 ] ;$mprojTcube [ 1 ] [ 2 ] [ $k ] = $tempw [ 2 ] ;$mprojTcube [ 1 ] [ 3 ] [ $k ] = $tempw [ 3 ] ;$mprojTcube [ 2 ] [ 1 ] [ $k ] = $tempw [ 4 ] ;$mprojTcube [ 2 ] [ 2 ] [ $k ] = $tempw [ 5 ] ;$mprojTcube [ 2 ] [ 3 ] [ $k ] = $tempw [ 6 ] ;$mprojTcube [ 3 ] [ 1 ] [ $k ] = $tempw [ 7 ] ;$mprojTcube [ 3 ] [ 2 ] [ $k ] = $tempw [ 8 ] ;$mprojTcube [ 3 ] [ 3 ] [ $k ] = $tempw [ 9 ] ;

}// 1−2 subcubefor ( $k=1; $k<=9; $k++) {

$tempv [ 1 ] = $Tcube [ 1 ] [ 4 ] [ $k ] ;$tempv [ 2 ] = $Tcube [ 1 ] [ 5 ] [ $k ] ;$tempv [ 3 ] = $Tcube [ 1 ] [ 6 ] [ $k ] ;$tempv [ 4 ] = $Tcube [ 2 ] [ 4 ] [ $k ] ;$tempv [ 5 ] = $Tcube [ 2 ] [ 5 ] [ $k ] ;$tempv [ 6 ] = $Tcube [ 2 ] [ 6 ] [ $k ] ;$tempv [ 7 ] = $Tcube [ 3 ] [ 4 ] [ $k ] ;$tempv [ 8 ] = $Tcube [ 3 ] [ 5 ] [ $k ] ;$tempv [ 9 ] = $Tcube [ 3 ] [ 6 ] [ $k ] ;$tempw = un i tp r o j ( $tempv ) ;$mprojTcube [ 1 ] [ 4 ] [ $k ] = $tempw [ 1 ] ;$mprojTcube [ 1 ] [ 5 ] [ $k ] = $tempw [ 2 ] ;$mprojTcube [ 1 ] [ 6 ] [ $k ] = $tempw [ 3 ] ;$mprojTcube [ 2 ] [ 4 ] [ $k ] = $tempw [ 4 ] ;$mprojTcube [ 2 ] [ 5 ] [ $k ] = $tempw [ 5 ] ;$mprojTcube [ 2 ] [ 6 ] [ $k ] = $tempw [ 6 ] ;$mprojTcube [ 3 ] [ 4 ] [ $k ] = $tempw [ 7 ] ;$mprojTcube [ 3 ] [ 5 ] [ $k ] = $tempw [ 8 ] ;$mprojTcube [ 3 ] [ 6 ] [ $k ] = $tempw [ 9 ] ;

}// 1−3 subcubefor ( $k=1; $k<=9; $k++) {

$tempv [ 1 ] = $Tcube [ 1 ] [ 7 ] [ $k ] ;$tempv [ 2 ] = $Tcube [ 1 ] [ 8 ] [ $k ] ;$tempv [ 3 ] = $Tcube [ 1 ] [ 9 ] [ $k ] ;$tempv [ 4 ] = $Tcube [ 2 ] [ 7 ] [ $k ] ;$tempv [ 5 ] = $Tcube [ 2 ] [ 8 ] [ $k ] ;$tempv [ 6 ] = $Tcube [ 2 ] [ 9 ] [ $k ] ;$tempv [ 7 ] = $Tcube [ 3 ] [ 7 ] [ $k ] ;$tempv [ 8 ] = $Tcube [ 3 ] [ 8 ] [ $k ] ;$tempv [ 9 ] = $Tcube [ 3 ] [ 9 ] [ $k ] ;$tempw = un i tp r o j ( $tempv ) ;$mprojTcube [ 1 ] [ 7 ] [ $k ] = $tempw [ 1 ] ;$mprojTcube [ 1 ] [ 8 ] [ $k ] = $tempw [ 2 ] ;

100

Page 107: Ubc 2010 Fall Schaad Jason

Appendix B. Sudoku PHP Code

$mprojTcube [ 1 ] [ 9 ] [ $k ] = $tempw [ 3 ] ;$mprojTcube [ 2 ] [ 7 ] [ $k ] = $tempw [ 4 ] ;$mprojTcube [ 2 ] [ 8 ] [ $k ] = $tempw [ 5 ] ;$mprojTcube [ 2 ] [ 9 ] [ $k ] = $tempw [ 6 ] ;$mprojTcube [ 3 ] [ 7 ] [ $k ] = $tempw [ 7 ] ;$mprojTcube [ 3 ] [ 8 ] [ $k ] = $tempw [ 8 ] ;$mprojTcube [ 3 ] [ 9 ] [ $k ] = $tempw [ 9 ] ;

}// 2−1 subcubefor ( $k=1; $k<=9; $k++) {

$tempv [ 1 ] = $Tcube [ 4 ] [ 1 ] [ $k ] ;$tempv [ 2 ] = $Tcube [ 4 ] [ 2 ] [ $k ] ;$tempv [ 3 ] = $Tcube [ 4 ] [ 3 ] [ $k ] ;$tempv [ 4 ] = $Tcube [ 5 ] [ 1 ] [ $k ] ;$tempv [ 5 ] = $Tcube [ 5 ] [ 2 ] [ $k ] ;$tempv [ 6 ] = $Tcube [ 5 ] [ 3 ] [ $k ] ;$tempv [ 7 ] = $Tcube [ 6 ] [ 1 ] [ $k ] ;$tempv [ 8 ] = $Tcube [ 6 ] [ 2 ] [ $k ] ;$tempv [ 9 ] = $Tcube [ 6 ] [ 3 ] [ $k ] ;$tempw = un i tp r o j ( $tempv ) ;$mprojTcube [ 4 ] [ 1 ] [ $k ] = $tempw [ 1 ] ;$mprojTcube [ 4 ] [ 2 ] [ $k ] = $tempw [ 2 ] ;$mprojTcube [ 4 ] [ 3 ] [ $k ] = $tempw [ 3 ] ;$mprojTcube [ 5 ] [ 1 ] [ $k ] = $tempw [ 4 ] ;$mprojTcube [ 5 ] [ 2 ] [ $k ] = $tempw [ 5 ] ;$mprojTcube [ 5 ] [ 3 ] [ $k ] = $tempw [ 6 ] ;$mprojTcube [ 6 ] [ 1 ] [ $k ] = $tempw [ 7 ] ;$mprojTcube [ 6 ] [ 2 ] [ $k ] = $tempw [ 8 ] ;$mprojTcube [ 6 ] [ 3 ] [ $k ] = $tempw [ 9 ] ;

}// 2−2 subcubefor ( $k=1; $k<=9; $k++) {

$tempv [ 1 ] = $Tcube [ 4 ] [ 4 ] [ $k ] ;$tempv [ 2 ] = $Tcube [ 4 ] [ 5 ] [ $k ] ;$tempv [ 3 ] = $Tcube [ 4 ] [ 6 ] [ $k ] ;$tempv [ 4 ] = $Tcube [ 5 ] [ 4 ] [ $k ] ;$tempv [ 5 ] = $Tcube [ 5 ] [ 5 ] [ $k ] ;$tempv [ 6 ] = $Tcube [ 5 ] [ 6 ] [ $k ] ;$tempv [ 7 ] = $Tcube [ 6 ] [ 4 ] [ $k ] ;$tempv [ 8 ] = $Tcube [ 6 ] [ 5 ] [ $k ] ;$tempv [ 9 ] = $Tcube [ 6 ] [ 6 ] [ $k ] ;$tempw = un i tp r o j ( $tempv ) ;$mprojTcube [ 4 ] [ 4 ] [ $k ] = $tempw [ 1 ] ;$mprojTcube [ 4 ] [ 5 ] [ $k ] = $tempw [ 2 ] ;$mprojTcube [ 4 ] [ 6 ] [ $k ] = $tempw [ 3 ] ;$mprojTcube [ 5 ] [ 4 ] [ $k ] = $tempw [ 4 ] ;$mprojTcube [ 5 ] [ 5 ] [ $k ] = $tempw [ 5 ] ;$mprojTcube [ 5 ] [ 6 ] [ $k ] = $tempw [ 6 ] ;$mprojTcube [ 6 ] [ 4 ] [ $k ] = $tempw [ 7 ] ;

101

Page 108: Ubc 2010 Fall Schaad Jason

Appendix B. Sudoku PHP Code

$mprojTcube [ 6 ] [ 5 ] [ $k ] = $tempw [ 8 ] ;$mprojTcube [ 6 ] [ 6 ] [ $k ] = $tempw [ 9 ] ;

}// 2−3 subcubefor ( $k=1; $k<=9; $k++) {

$tempv [ 1 ] = $Tcube [ 4 ] [ 7 ] [ $k ] ;$tempv [ 2 ] = $Tcube [ 4 ] [ 8 ] [ $k ] ;$tempv [ 3 ] = $Tcube [ 4 ] [ 9 ] [ $k ] ;$tempv [ 4 ] = $Tcube [ 5 ] [ 7 ] [ $k ] ;$tempv [ 5 ] = $Tcube [ 5 ] [ 8 ] [ $k ] ;$tempv [ 6 ] = $Tcube [ 5 ] [ 9 ] [ $k ] ;$tempv [ 7 ] = $Tcube [ 6 ] [ 7 ] [ $k ] ;$tempv [ 8 ] = $Tcube [ 6 ] [ 8 ] [ $k ] ;$tempv [ 9 ] = $Tcube [ 6 ] [ 9 ] [ $k ] ;$tempw = un i tp r o j ( $tempv ) ;$mprojTcube [ 4 ] [ 7 ] [ $k ] = $tempw [ 1 ] ;$mprojTcube [ 4 ] [ 8 ] [ $k ] = $tempw [ 2 ] ;$mprojTcube [ 4 ] [ 9 ] [ $k ] = $tempw [ 3 ] ;$mprojTcube [ 5 ] [ 7 ] [ $k ] = $tempw [ 4 ] ;$mprojTcube [ 5 ] [ 8 ] [ $k ] = $tempw [ 5 ] ;$mprojTcube [ 5 ] [ 9 ] [ $k ] = $tempw [ 6 ] ;$mprojTcube [ 6 ] [ 7 ] [ $k ] = $tempw [ 7 ] ;$mprojTcube [ 6 ] [ 8 ] [ $k ] = $tempw [ 8 ] ;$mprojTcube [ 6 ] [ 9 ] [ $k ] = $tempw [ 9 ] ;

}// 3−1 subcubefor ( $k=1; $k<=9; $k++) {

$tempv [ 1 ] = $Tcube [ 7 ] [ 1 ] [ $k ] ;$tempv [ 2 ] = $Tcube [ 7 ] [ 2 ] [ $k ] ;$tempv [ 3 ] = $Tcube [ 7 ] [ 3 ] [ $k ] ;$tempv [ 4 ] = $Tcube [ 8 ] [ 1 ] [ $k ] ;$tempv [ 5 ] = $Tcube [ 8 ] [ 2 ] [ $k ] ;$tempv [ 6 ] = $Tcube [ 8 ] [ 3 ] [ $k ] ;$tempv [ 7 ] = $Tcube [ 9 ] [ 1 ] [ $k ] ;$tempv [ 8 ] = $Tcube [ 9 ] [ 2 ] [ $k ] ;$tempv [ 9 ] = $Tcube [ 9 ] [ 3 ] [ $k ] ;$tempw = un i tp r o j ( $tempv ) ;$mprojTcube [ 7 ] [ 1 ] [ $k ] = $tempw [ 1 ] ;$mprojTcube [ 7 ] [ 2 ] [ $k ] = $tempw [ 2 ] ;$mprojTcube [ 7 ] [ 3 ] [ $k ] = $tempw [ 3 ] ;$mprojTcube [ 8 ] [ 1 ] [ $k ] = $tempw [ 4 ] ;$mprojTcube [ 8 ] [ 2 ] [ $k ] = $tempw [ 5 ] ;$mprojTcube [ 8 ] [ 3 ] [ $k ] = $tempw [ 6 ] ;$mprojTcube [ 9 ] [ 1 ] [ $k ] = $tempw [ 7 ] ;$mprojTcube [ 9 ] [ 2 ] [ $k ] = $tempw [ 8 ] ;$mprojTcube [ 9 ] [ 3 ] [ $k ] = $tempw [ 9 ] ;

}
// 3-2 subcube
for ($k=1; $k<=9; $k++) {
$tempv[1] = $Tcube[7][4][$k];
$tempv[2] = $Tcube[7][5][$k];
$tempv[3] = $Tcube[7][6][$k];
$tempv[4] = $Tcube[8][4][$k];
$tempv[5] = $Tcube[8][5][$k];
$tempv[6] = $Tcube[8][6][$k];
$tempv[7] = $Tcube[9][4][$k];
$tempv[8] = $Tcube[9][5][$k];
$tempv[9] = $Tcube[9][6][$k];
$tempw = unitproj($tempv);
$mprojTcube[7][4][$k] = $tempw[1];
$mprojTcube[7][5][$k] = $tempw[2];
$mprojTcube[7][6][$k] = $tempw[3];
$mprojTcube[8][4][$k] = $tempw[4];
$mprojTcube[8][5][$k] = $tempw[5];
$mprojTcube[8][6][$k] = $tempw[6];
$mprojTcube[9][4][$k] = $tempw[7];
$mprojTcube[9][5][$k] = $tempw[8];
$mprojTcube[9][6][$k] = $tempw[9];

}
// 3-3 subcube
for ($k=1; $k<=9; $k++) {
$tempv[1] = $Tcube[7][7][$k];
$tempv[2] = $Tcube[7][8][$k];
$tempv[3] = $Tcube[7][9][$k];
$tempv[4] = $Tcube[8][7][$k];
$tempv[5] = $Tcube[8][8][$k];
$tempv[6] = $Tcube[8][9][$k];
$tempv[7] = $Tcube[9][7][$k];
$tempv[8] = $Tcube[9][8][$k];
$tempv[9] = $Tcube[9][9][$k];
$tempw = unitproj($tempv);
$mprojTcube[7][7][$k] = $tempw[1];
$mprojTcube[7][8][$k] = $tempw[2];
$mprojTcube[7][9][$k] = $tempw[3];
$mprojTcube[8][7][$k] = $tempw[4];
$mprojTcube[8][8][$k] = $tempw[5];
$mprojTcube[8][9][$k] = $tempw[6];
$mprojTcube[9][7][$k] = $tempw[7];
$mprojTcube[9][8][$k] = $tempw[8];
$mprojTcube[9][9][$k] = $tempw[9];

}
return $mprojTcube;

}
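The subcube projections above (and the row/column projections earlier in this appendix) all delegate to the helper `unitproj`, whose definition appears earlier in the listing. As an illustrative aid only, here is a minimal sketch of what such a nearest-unit-vector projection can look like: map a 1-indexed 9-vector to the standard basis vector whose position holds the largest component. The name `unitproj_sketch` is hypothetical and the tie-breaking rule (first maximum wins) is an assumption; this is not the thesis's exact definition.

```php
<?php
// Hypothetical sketch of a nearest-unit-vector projection (not the
// thesis's actual unitproj): return the standard basis vector e_j,
// where j is the index of the largest component of $v.
// $v is a 1-indexed array of 9 numbers; ties go to the first maximum.
function unitproj_sketch($v) {
    $best = 1;
    for ($z = 2; $z <= 9; $z++) {
        if ($v[$z] > $v[$best]) {
            $best = $z;
        }
    }
    $w = array();
    for ($z = 1; $z <= 9; $z++) {
        $w[$z] = ($z == $best) ? 1 : 0;
    }
    return $w;
}
```

This is the natural projection onto the (nonconvex) set of standard unit vectors: the closest 0/1 vector with a single 1 is the one whose 1 sits at the largest entry of the input.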

function checkMeetingRoom($sudoku, $index_x, $index_y) {
// $sudoku is the 9x9 array; $index_x and $index_y are the starting points
// [1][1] [1][2] [1][3]
// [2][1] [2][2] [2][3]
// [3][1] [3][2] [3][3]

//$one = array(1=>1,1,1,2,2,2,3,3,3);
//$two = array(1=>1,2,3,1,2,3,1,2,3);

$one = array(1=>$index_y, $index_y, $index_y, ($index_y+1), ($index_y+1), ($index_y+1), ($index_y+2), ($index_y+2), ($index_y+2));

$two = array(1=>($index_x), ($index_x+1), ($index_x+2), ($index_x), ($index_x+1), ($index_x+2), ($index_x), ($index_x+1), ($index_x+2));

// print_r($one);
// print_r($two);

for ($i=1; $i<=9; $i++) {
  for ($j=1; $j<=9; $j++) {
    if ($i != $j) {
      if ($sudoku[$one[$i]][$two[$i]] == $sudoku[$one[$j]][$two[$j]]) {
        return 0;
      }
    }
  }
}
return 1;
}

function isValidSudoku($original, $sudoku) {
// checks to see if the sudoku is valid - returns true if valid, false otherwise
// input is the original constraints and the sudoku to check

// Check if we meet the ROW constraints (1 to 9 in every row)
// first check for duplicates
for ($k=1; $k<=9; $k++) {
  for ($i=1; $i<=9; $i++) {
    for ($j=1; $j<=9; $j++) {
      if ($i != $j) {
        if ($sudoku[$k][$i] == $sudoku[$k][$j]) {
          // echo "Failed row constraint<br />";
          return 0;
        }
      }
    }
  }
}

// Check if we meet the COLUMN constraints (1 to 9 in every column)
// first check for duplicates
for ($k=1; $k<=9; $k++) {
  for ($i=1; $i<=9; $i++) {
    for ($j=1; $j<=9; $j++) {
      if ($i != $j) {
        if ($sudoku[$i][$k] == $sudoku[$j][$k]) {
          // echo "Failed column constraint<br />";
          return 0;
        }
      }
    }
  }
}

// Check MEETING ROOM constraints (1 to 9 in every meeting room)
if (!checkMeetingRoom($sudoku, 1, 1)) { return 0; }
if (!checkMeetingRoom($sudoku, 4, 1)) { return 0; }
if (!checkMeetingRoom($sudoku, 7, 1)) { return 0; }
if (!checkMeetingRoom($sudoku, 1, 4)) { return 0; }
if (!checkMeetingRoom($sudoku, 4, 4)) { return 0; }
if (!checkMeetingRoom($sudoku, 7, 4)) { return 0; }
if (!checkMeetingRoom($sudoku, 1, 7)) { return 0; }
if (!checkMeetingRoom($sudoku, 4, 7)) { return 0; }
if (!checkMeetingRoom($sudoku, 7, 7)) { return 0; }

// Lastly we check to make sure we have the given constraints
for ($i=1; $i<=9; $i++) {
  for ($j=1; $j<=9; $j++) {
    $new_array[$i][$j] = $original[$i][$j] * $sudoku[$i][$j];
    $new_array[$i][$j] = sqrt($new_array[$i][$j]);
  }
}

// printarray($new_array);
// echo "<br />";
// printarray($original);
if (!thesame($new_array, $original)) {
  return 0;
}

return 1;
}


<?php

set_time_limit(0);

// Report simple running errors
error_reporting(E_ERROR | E_WARNING | E_PARSE);

// Verbose mode (shows all details)
$details = $_POST['details'];

if ($details != 1) {
  $details = 0;
}

// Pick up the form variables
for ($x=1; $x<=9; $x++) {
  for ($y=1; $y<=9; $y++) {
    $origS[$x][$y] = $_POST[$x . $y];
  }
}

// Display the initial Sudoku puzzle we will solve
// We also initialize a GLOBAL variable called origS
// which is our original Sudoku (which is a constraint)
echo "<table border=1>";
for ($x=1; $x<=9; $x++) {
  echo "<tr>";
  for ($y=1; $y<=9; $y++) {
    echo "<td>";
    if ($origS[$x][$y] == "") {
      echo "&nbsp";
    }
    else {
      echo $origS[$x][$y];
    }
    echo "</td>";
  }
  echo "</tr>";
}

echo "</table>";

// include the functions to be used in sudoku
include('sudokufunctions.php');

// initialize the Scube 9x9x9 array to all zeros
for ($x=1; $x<=9; $x++) {
  for ($y=1; $y<=9; $y++) {
    for ($z=1; $z<=9; $z++) {
      $Scube[$x][$y][$z] = 0;
    }
  }
}

// Get a starting position based on the original sudoku given
// (the commented-out line is for when we start at random positions - for research)
for ($x=1; $x<=9; $x++) {
  for ($y=1; $y<=9; $y++) {
    $tempv = makeunit($origS[$x][$y]);
    for ($z=1; $z<=9; $z++) {
      $Scube[$x][$y][$z] = $tempv[$z];
      //$Scube[$x][$y][$z] = rand(-10000, 10000);
    }
  }
}

echo "<br /><br />";

// We now start the iteration process

// $Scube is our starting position as defined above
$T1old = $Scube;
$T2old = $Scube;
$T3old = $Scube;
$T4old = $Scube;
$Tadaold = diagproj($T1old, $T2old, $T3old, $T4old);

$zerosinarow = 0;
$i = 1;
echo "working";
while (!isValidSudoku($origS, $mTada)) {

  // the stopping criteria is when there is no change
  // in the cubes for 10 iterations
  // we clone 4 cubes and make the diagonal
  $T1 = new1($T1old, $T2old, $T3old, $T4old);
  $T2 = new2($T1old, $T2old, $T3old, $T4old);
  $T3 = new3($T1old, $T2old, $T3old, $T4old);
  $T4 = new4($T1old, $T2old, $T3old, $T4old);
  $Tada = diagproj($T1, $T2, $T3, $T4);
  $mTada = makearray($Tada);
  if ($details == 1) {
    echo "Current Counter " . $i, "<br>";
    printarray($mTada);
  }
  else {
    echo ".";
  }

  $T1old = $T1;
  $T2old = $T2;
  $T3old = $T3;
  $T4old = $T4;
  $mTadaold = $mTada;
  $mTadaoldmatrix = $mTadamatrix;
  $i++;

  try {
    stuckinloop($i);
  } catch (Exception $e) {
    $badsudoku = print_r($origS, true);
    error_log($e->getMessage() . "\n", 3, "my-errors.log");
    error_log($badsudoku . "\n", 3, "my-errors.log");
    echo "Too many iterations - we are stuck.. maybe a bad Sudoku?";
    return;
  }
}

if ($details == 0) {
  echo "<br>Solved in " . $i . " iterations:<br>";
  printarray($mTada);
}

?>


Appendix C

Eckstein Calculation Details

In Chapter 4, we showed that Eckstein made a numerical error in (4.95) and, as a result, in (4.96). The matrices are in fact

\[
G_{\lambda,A,B} = J_{\lambda A}(2J_{\lambda B} - \operatorname{Id}) + (\operatorname{Id} - J_{\lambda B})
= \begin{pmatrix} \tfrac{21}{40} & -\tfrac{1}{16} \\ -\tfrac{1}{40} & \tfrac{9}{16} \end{pmatrix} \tag{C.1}
\]

and

\[
S_{\lambda,A,B} = G_{\lambda,A,B}^{-1} - \operatorname{Id}
= \begin{pmatrix} \tfrac{43}{47} & \tfrac{10}{47} \\ \tfrac{4}{47} & \tfrac{37}{47} \end{pmatrix}. \tag{C.2}
\]

Below are the details of the calculation.
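The arithmetic taking (C.1) to (C.2) can be verified directly with the standard formula for the inverse of a $2 \times 2$ matrix:

\[
\det G_{\lambda,A,B} = \frac{21}{40}\cdot\frac{9}{16} - \left(-\frac{1}{16}\right)\left(-\frac{1}{40}\right)
= \frac{189}{640} - \frac{1}{640} = \frac{188}{640} = \frac{47}{160},
\]

so

\[
G_{\lambda,A,B}^{-1} = \frac{160}{47}\begin{pmatrix} \tfrac{9}{16} & \tfrac{1}{16} \\ \tfrac{1}{40} & \tfrac{21}{40} \end{pmatrix}
= \begin{pmatrix} \tfrac{90}{47} & \tfrac{10}{47} \\ \tfrac{4}{47} & \tfrac{84}{47} \end{pmatrix},
\]

and subtracting the identity gives

\[
S_{\lambda,A,B} = G_{\lambda,A,B}^{-1} - \operatorname{Id}
= \begin{pmatrix} \tfrac{43}{47} & \tfrac{10}{47} \\ \tfrac{4}{47} & \tfrac{37}{47} \end{pmatrix},
\]

in agreement with (C.2).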


Figure C.1: Calculation Details
