sudoku

The Mathematics of SudokuTom Davis

[email protected]://www.geometer.org/mathcircles

(Preliminary)November 15, 2010

1 Introduction

Sudoku is a (sometimes addictive) puzzle presented on a square grid that is usually9 × 9, but issometimes16× 16 or other sizes. We will consider here only the9× 9 case, although most of whatfollows can be extended to larger puzzles. Sudoku puzzles can be found in many daily newspapers,and there are thousands of references to it on the internet. Most puzzles are ranked as to difficulty,but the rankings vary from puzzle designer to puzzle designer.

Sudoku is an abbreviation for a Japanese phrase meaning “thedigits must remain single,” and it wasin Japan that the puzzle first became popular. The puzzle is also known as “Number Place.” Sudoku(although it was not originally called that) was apparentlyinvented by Howard Garns in 1979. Itwas first published by Dell Magazines (which continues to do so) but now is available in hundredsof publications.

i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 9

4 89 4 6 75 6 1 4

2 1 6 55 8 7 9 4 1

7 8 6 93 4 5 9

6 3 7 24 1

Figure 1: An easy sudoku puzzle

At the time of publication of this article, Sudokuis very popular, but it is of course difficult topredict whether it will remain so. It does havemany features of puzzles that remain popular:puzzles are available of all degrees of difficulty,the rules are very simple, your ability to solvethem improves with time, and it is the sort ofpuzzle where the person solving it makes con-tinuous progress toward a solution, as is the casewith crossword puzzles.

The original grid has some of the squares filledwith the digits from1 to9 and the goal is to com-plete the grid so that every row, column and out-lined 3 × 3 sub-grid contains each of the digitsexactly once. A valid puzzle admits exactly onesolution.

Figure 1 is a relatively easy sudoku puzzle. Ifyou have never tried to solve one, attempt thisone (using a pencil!) before you continue, andsee what strategies you can find. It will probablytake more time than you think, but you will get much better with practice. The solution appears insection 20.

Sudoku is mathematically interesting in a variety of ways. Both simple and intricate logic can be

1

applied to solve a puzzle, it can be viewed as a graph coloringproblem and it certainly has someinteresting combinatorial aspects.

We will begin by examining some logical and mathematical approaches to solving sudoku puzzlesbeginning with the most obvious and continuing on to more andmore sophisticated techniques (see,for example, multi-coloring, described in section 9.2). Later in this article we will look at a few ofthe more mathematical aspects of sudoku.

A large literature on sudoku exists on the internet with a fairly standardized terminology, which wewill use here:

• A “square” refers to one of the81 boxes in the sudoku grid, each of which is to be filledeventually with a digit from1 to 9.

• A “block” refers to a3 × 3 sub-grid of the main puzzle in which all of the numbers mustappear exactly once in a solution. We will refer to a block by its columns and rows. Thusblockghi456 includes the squaresg4, g5, g6, h4, h5, h6, i4, i5 andi6. (See figure 1.)

• A “candidate” is a number that could possibly go into a squarein the grid. Each candidatethat we can eliminate from a square brings us closer to a solution.

• Many arguments apply equally well to a row, column or block, and to keep from having towrite “row, column or block” over and over, we may refer to it as a “virtual line.” A typicaluse of “virtual line” might be this: “If you know the values of8 of the9 squares in a virtualline, you can always deduce the value of the missing one.” In the9 × 9 sudoku puzzles thereare27 such virtual lines.

• Sometimes you would like to talk about all of the squares thatcannot contain the same numberas a given square since they share a row or column or block. These are sometimes called the“buddies” of that square. For example, you might say something like, “If two buddies of asquare have only the same two possible candidates, then you can eliminate those as candidatesfor the square.” Each square has 20 buddies.

2 Obvious Strategies

Strategies in this section are mathematically obvious, although searching for them in a puzzle maysometimes be difficult, simply because there are a lot of things to look for. Most puzzles rankedas “easy” and even some ranked “intermediate” can be completely solved using only techniquesdiscussed in this section. The methods are presented roughly in order of increasing difficulty for ahuman. For a computer, a completely different approach is often simpler.

2.1 Unique Missing Candidate

If eight of the nine elements in any virtual line (row, columnor block) are already determined, thefinal element has to be the one that is missing. When most of thesquares are already filled in thistechnique is heavily used. Similarly: If eight of the nine values are impossible for a given square,that square’s value must be the ninth.

2

i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 92

7 9

2

9 1 3 825 6

7 95

9 4 57 9

5 4 6 7 9 7 9 13

9 23

7 8 92 3

7 8 9

2

8 9

2 3

7 8

24 5 67 9

25 6

7 9

24 5 67 9

1 35

9

15 6

7 8

35

7 8 9

61 2

8

2

7 8

1 25

7

1 25

7

25

7 8 4 92 35

7

42

52

67 9 3

26

7 9 8 7 11 2

7 8 3 91 24 57

1 25

7

24 57 8

25 5

7 61 2 3

8 9

1 258 9

2 34

8

1 25 6

7 9

1 25 6

7 9

25 6

7 9

1 25

9

158

258 9

1 2

9 72

81 2

59

25

9 6 3 41 2

8 9 62

8

1 25

9 4 3 71

58

258 9

i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 92

7 9 9 1 3 825 6

7 95

9 4 57 9

5 4 6 7 9 7 9 13

9 23

7 8 92 3

7 8 9 8 9

3

7 8

24 5 67 9

25 6

7 9

24 5 67 9

1 35

9

15 68

35

7 8 9

61

8 7 8

1 25

7

1 25

7

25

7 8 4 92 35

4 2 5 69 3 6

9 8 7 11

7 8 3 91 24 57

1 25

7

24 57 8

25 5 6

1 3

8 9

158 9

34

8

1 25 6

7 9

1 25 6

7 9

25 6

7 9

1 25

9

158

258 9

1

9 7 2 81

59

59 6 3 4

1

8 9 6 8

1 25

9 4 3 71

58

258 9

Figure 2: Candidate Elimination and Naked Singles

2.2 Naked Singles

For any given sudoku position, imagine listing all the possible candidates from1 to 9 in each unfilledsquare. Next, for every squareS whose value isv, erasev as a possible candidate in every squarethat is a buddy ofS. The remaining values in each square are candidates for thatsquare. Whenthis is done, if only a single candidatev remains in squareS, we can assign the valuev to S. Thissituation is referred to as a “naked single”.

In the example on the left in figure 2 the larger numbers in the squares represent determined values.All other squares contain a list of possible candidates, where the elimination in the previous para-graph has been performed. In this example, the puzzle contains three naked singles ate2 andh3(where a2 must be inserted), and ate8 (where a7 must be inserted).

Notice that once you have assigned these values to the three squares, other naked singles will appear.For example, as soon as the2 is inserted ath3, you can eliminate the2’s as candidates inh3’sbuddies, and when this is done,i3 will become a naked single that must be filled with8. The positionon the right side of figure 2 shows the same puzzle after the three squares have been assigned valuesand the obvious candidates have been eliminated from the buddies of those squares, exposing twonew naked singles.

2.3 Hidden Singles

Sometimes there are cells whose values are easily assigned,but a simple elimination of candidatesas described in the last section does not make it obvious. If you reexamine the situation on the leftside of figure 2, there is a hidden single in squareg2 whose value must be5. Although at first glancethere are five possible candiates forg2 (1, 2, 5, 8 and9), if you look in column2 it is the uniquesquare that can contain a5. (The squareg2 is also a hidden single in the blockghi123.) Thus5 canbe placed in squareg2. The5 in squareg2 is “hidden” in the sense that without further examination,it appears that there are5 possible candidates for that square.

To find hidden singles look in every virtual line for a candidate that appears in only one of the

3

squares making up that virtual line. When that occurs, you’ve found a hidden single, and you canimmediately assign that candidate to the square.

To check your understanding, make sure you see why there is another hidden single in squared9 offigure 2.

The techniques in this section immediately assign a value toa square. Most puzzles that are ranked“easy” and many that are ranked “intermediate” can be completely solved using only these methods.

The remainder of the methods that we will consider usually donot directly allow you to fill in asquare. Instead, they allow you to eliminate candidates from certain squares. When all but one ofthe candidates have been eliminated, the square’s value is determined.

3 Locked Candidates

Locked candidates are forced to be within a certain part of a row, column or block.

i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 9

1 4 5 8 6 7 2 4 5 9 33

57

35

7 9 8 1 4 6 57 2

247 6

2

7 9 5 3 8 47 1

34 5

9

34 5 6

24 5

2 3

7 1 8 59

4 5 2 1 4 5 9 8 7 3 63

4 57 9 8

35

7 12 3

62

4 52

4 5 59

25 1 4 3 8 9

25 6 7

63

57

2 35

7

2

7 4 1 925 8

8 92

7

2

7 6 5 3 1 4

Figure 3: Locked Candidates

Sometimes you can find a block where the onlypossible positions for a candidate are in one rowor column within that block. Since the blockmust contain the candidate, the candidate mustappear in that row or column within the block.This means that you can eliminate the candidateas a possibility in the intersection of that row orcolumn with other blocks.

A similar situation can occur when a numbermissing from a row or column can occur onlywithin one of the blocks that intersect that rowor column. Thus the candidate must lie on theintersection of the row/column and block andhence cannot be a candidate in any of the othersquares that make up the block.

Both of these situations are illustrated in fig-ure 3. The blockdef789 must contain a2, andthe only places this can occur are in squaresf7andf8: both in rowf . Therefore2 cannot be acandidate in any other squares in rowf , includ-

ing squaref5 (sof5 must contain a3). Similarly, the2 in block ghi456 must lie in column4 so2cannot be a candidate in any other squares of that column, includingd4.

Finally, the5 that must occur in column9 has to fall within the blockdef789 so 5 cannot be acandidate in any of the other squares in blockdef789, includingf7 andf8.

4 Naked and Hidden Pairs, Triplets, Quads, . . .

These are similar to naked singles, discussed in section 2.2, except that instead of having only onecandidate in a cell, you have the same two candidates in two cells (or, in the case of naked triplets,the same three candidates in three cells, et cetera).

4

A naked pair, triplet or quad must be in the same virtual line.A naked triplet’s three values mustbe the only values that occur in three squares (and similarly, a naked quad’s four values must be theonly ones occurring in four squares). When this occurs, those n squares must contain all and onlythosen values, wheren = 2, 3 or 4. Those values can be eliminated as candidates from any othersquare in that virtual line.

a

1 2 3 4 5 6 7 8 9

12

7

2 34 5

34 6 8 9

2

7

2 34 67

8 3 9 71 1

42 1 2

Figure 4: A Naked Pair

Figure 4 shows how to use a naked pair. Insquaresa2 anda8 the only candidates that ap-pear are a2 and a7. That means that7 must bein one, and2 in the other. But then the2 and7cannot appear in any of the other squares in thatrow, so2 can be eliminated as a candidate ina3and both2 and7 can be eliminated as candidatesin a9.

For a naked pair, both squares must have exactly the same two candidates, but for naked tripletsor quads, the only requirement is that the three (or four) values be theonly values appearing inthose squares in some virtual line. For example, if three entries in a row admit the following setsof candidates:{1, 3}, {3, 7} and{1, 7} then it is impossible for a1, 3 or 7 to appear in any othersquare of that row.

a

1 2 3 4 5 6 7 8 9

51

7 214

8

1 34

8 9 63

7

1 3

8 3 9 71 1

42 1 2

Figure 5: A Naked Triple

Figure 5 contains a naked triple. In rowasquaresa2, a8 anda9 contain the naked tripleconsisting of the numbers1, 3 and7. Thus thosenumbers must appear in those squares in someorder. For that reason,1 and3 can be eliminatedas candidates from squaresa4 anda5.

Hidden pairs, triples and quads are related tonaked pairs, triples and quads in the same way that hidden singles are related to naked singles.In figure 6 consider rowi. The only squares in rowi in which the values1, 4 and8 appear are insquaresi1, i5 andi6. Therefore we can eliminate candidates2 and6 from squarei1 and candidate3 from i5.

a

1 2 3 4 5 6 7 8 9

8 4 1 9 52 3 2 3

1 24 6

3

9

26 5

1 3

84

8

2

7

2

7 9

36

Figure 6: A Hidden Triple

Remember, of course, that although the threeexamples above illustrate the naked and hiddensets in a row, these sets can appear in any virtualline: a row, column, or block. There is also noreason that there could not be a naked or hiddenquintet, sextet, and so on. These wouldn’t makemuch sense in the9 × 9 grid, but they might on

a16 × 16 or larger sudoku grid.

There is also the possibility of something called a remote naked pair, but we will discuss that later,in section 10.

5 Tuleja’s (or Mr. T’s) Theorem

In the Summer of 2008 Alan Lipp communicated to me a very interesting observation made by hisfriend, Greg Tuleja, about Sudoku that can sometimes be veryuseful in solving a puzzle. Since manypeople have trouble pronouncing “Tuleja”, Alan called it “Mr. T’s Theorem.” I’ll call it “Tuleja’s

5

Theorem” here.

This is different from all the other strategies in this paperin that the theorem does not usually yieldinformation directly that allows us to immediately fill in a square or to eliminate a candidate, butrather provides an observation that allows us to do so after asmall (or sometimes large) number oflogical conclusions. In this section we will illustrate andprove Tuleja’s Theorem, and will then gothrough a series of examples where deductions from the theorem allow us to advance our solution.It was hard to know exactly where to place this section in the article, since some applications aretrivial and some deep.

i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 9

1 2 3 4 5 6 7 8 96 4 5 7 9 8 3 2 19 8 7 3 1 2 5 4 68 5 9 6 7 3 4 1 24 1 6 2 8 5 9 3 77 3 2 9 4 1 8 6 55 6 4 8 2 9 1 7 33 7 1 5 6 4 2 9 82 9 8 1 3 7 6 5 4

Figure 7: Tuleja’s Theorem

Consider the completed Sudoku puzzle illus-trated in Figure 7. The theorem applies to anyrow of blocks or any column of blocks and statesthat there are only two situations that can occur.

The first is illustrated in the row of blocksabc123, abc345 and abc789. In this case thenumbers1, 2 and3 occur in row1 of the firstblock, row 3 of the second, and row2 of thethird. If this situation occurs where the samethree numbers appear in different rows of allthree blocks, then all the other sets of three num-bers in rows will also repeat. So in this case,we’ve got 645, 456 and 546 as rows in thoseblocks, and also987, 798 and789. The orderof the numbers within the blocks is arbitrary. Inthe example, the same thing occurs in the rowof blocks labeled withg, h andi. If you reflectthis puzzle across the main diagonal, it is obvi-ous that the same thing can occur in columns.

If a set of three numbers is repeated in rows ofany two blocks it is clear that the same set must occur in the third block in the other row. Thenhowever an additional row is filled in, it forces those same numbers to appear as sets of three inrows in the other blocks. In the example, suppose that, in theabc blocks of figure 7, the123 rowsare filled in but everything else is blank. When we choose645 in theb123 row, it is clear that thosenumbers then must go in thea456 and thus in thec789 rows. That leaves the numbers9, 8 and7 tofill the remaining rows.

c

b

a

1 2 3 4 5 6 7 8 9

1 2 31 2 33 1 2

Figure 8: Proof of Tuleja’s Theorem

That’s pretty simple, but Tuleja noticed thatthere isonly one other thing that can happen.Look at the blocks in thedef row. Instead ofall three numbers being repeated, there are threepairs of two numbers that repeat:85, 41 and73in this example. It will always happen that ifthe first situation does not occur, there will bethree sets of two numbers that sit in the rows (orcolumns, obviously) as in the example.

In the column of three blocks under123, thosepairs are69, 48 and35. As an exercise, make sure you can find the numbers that do thisin thecolumns of blocks headed by456 and by789.

6

Since the exact numbers don’t matter, we’ll use nice sets like123 to make the proof easier to follow.We will also state the proof in terms of rows of blocks, but thesame reasoning can be used forcolumns of blocks.

Suppose123 appears in the top row of the first block. If the123 appears in another row of anotherblock then all three sets of numbers repeat in all the blocks’columns. If not, the1, 2 and3 must besplit between two rows of the next block. If the split is not three and zero, it has to be two and one,so one of the other rows in the second block contains two of thethree numbers, say1 and2. Thenthe12 pair has to be in the remaining row of the third block. Figure 8illustrates what might occurin this situation. In the figure, the numbers1, 2 and3 are placed arbitrarily in the rows, but oncethe rows for1 and2 are determined, the rows where the3 must go are fixed as in the figure. Thereare no vertical divider lines within the blocks to emphasizethe fact that the horizonal order of theentries makes no difference in these arguments.

c

b

a

1 2 3 4 5 6 7 8 9

1 2 31 2 4 33 1 2 5 c

b

a

1 2 3 4 5 6 7 8 9

1 2 3 5 45 1 2 4 34 3 1 2 5

Figure 9: Proof of Tuleja’s TheoremNext, we observe that the number completing the12 set in the other rows cannot be3, and it cannotbe the same in both rows or it would force that same number in the top left row, which is alreadyfull. Let us assume that the numbers filling the rows are4 and5 as in the left side of Figure 9. This,of course, forces rows for4 and5 in the other blocks as illustrated in the example on the rightof thesame figure.

From that, it is clear that whatever pair goes intob23 also has to go intoa89, and that, in turn, forcesthe numbers that go into the remaining slots. This completesthe proof of Tuleja’s Theorem.

Notice also that the other three numbers, the ones that are not in the repeated pairs, have to cyclethrough the remaining slots. In this article I will refer to those unmatched numbers as “floaters”. Inthe examples in Figure 9, the floaters are3, 4 and5.

5.1 Applications of Tuleja’s Theroem

In the examples that follow, in most cases, at least, other standard techniques can be used to fill innumbers or eliminate candidates. The examples simply illustrate how Tuleja’s theorem might beused. Generally it is most useful when a fair number of squares have been filled since some data isrequired to figure out which numbers form the pairs and which numbers are the floaters.

We will begin with a simple example, illustrated on the left in figure 10.

Consider the second column of blocks. In that column,73 is a pair and it is obvious that we haveonly pairs and not triples repeating. The number6 is obviously a floater (since it is matched with73in the center block), making41 a pair. That means there must be a4 in eitherh5 or i5, but it cannotgo inh5, so a4 must go in the circledi5. There is a faster way to get this, of course: the4 in column6 forces a4 in column1 of the central block, forcing a4 in h5 or i5.

The example of the right of figure 10 illustrates another use of the theorem.

This time we consider the column of blocks under789. Since we have the numbers615 in column8 of the top block and we have a75 in column7 of the middle block, we’ve got sets of pairs and not

7

i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 9

8 3 61 4 7 3

3 9 7 2 1 8 58 7 9

7 6 5 8 35 3 6 7

5 4 8 6 1 7 3 9 23 6 1 2 4 7 5

3 6 1 8 i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 9

9 1 2 4 54 8 9 1

5 1 9 6 44 3 6 7 93 5 8 9 7 4 x

7 9 4 1 2 5 3 x5 9 43 1 4 58 4 5 1 3

Figure 10: Examples 1 and 2: Tuleja’s Theorem

triples in that column.

One of the pairs must be either61, 65 or 15 from the615 set, and those two numbers must go intothe squares marked with an “x”. Since there is already a5 in the third block under column9, thateliminates65 and15, so the two “x” slots contain a1 and a6. Thus a6 goes in the circled squareand a1 in the other.

When you first try to apply Tuleja’s theorem, you will probably find that it is easy to make mistakes.A good way to practice is to do easy Sudoku puzzles on the internet or other place where youhave easy access to the answers and as you work them, search for applications of Tuleja’s theorembefore you try anything else. When you think you’ve got one, check the answer to make sure yourreasoning was correct. When your answer is wrong, review your reasoning to learn where you madeyour mistake.

The next two examples come from the same Sudoku puzzle. In thepuzzle on the left in Figure 11consider the lower row of blocks.92 is a pair, so4 and7 are floaters. In the middle lower block wehave563, so56, 53 or 63 must be a pair. But the4 in the lower left block tells us that6 is part ofthe pair, so it is either56 or 63. Since there is a5 in row i, the pair must be63, so a3 belongs in thecircled positioni1.

After placing the3 in i1, we now have the configuration shown on the right in Figure 11.We aremissing8 and9 in the center block, so positionx must contain an8 or a 9. In the center row ofblocks,89 cannot be a pair since8 and9 go in two rows in the center block. In the center row of thecenter block, we can conclude that the pair is one of:38, 39, 48 or 49. It can’t be38 or 39 becauseof the right block in the central row of blocks. It also can’t be 49 since there is a9 in positionc2.Thus there is an8 in the circledd2 square.

The final example displayed in Figure 12. Consider the top rowof blocks.

Since there is a2 in column9 there must be a2 in one of the squares marked with anx. Thereforeone of the squares markedy must also contain a2. Because we then have21 in two block rows,21must be a pair, and7 a floater. (Clearly there are only pairs and not triples in therow since we havea721 and a73 in different block rows.)

8

i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 9

5 8 4 6 7 97 5 9

9 7 2 3 1 51 4 6 5 3

3 43 1 2 7 8 9

7 5 6 3 9 2 44 1 8

4 6 9 7 2 5 i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 9

5 8 4 6 7 97 5 9

9 7 2 3 1 51 4 6 5 3

3 x 43 1 2 7 8 9

7 5 6 3 9 2 44 1 8

3 4 6 9 7 2 5

Figure 11: Examples 3 and 4: Tuleja’s Theorem

i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 9

x xy y

8 3 61 4 7 34 3 9 7 2 1 8 5 6

8 7 5 97 6 2 5 8 3

5 3 6 75 4 8 6 1 7 3 9 23 6 1 2 4 7 5

5 4 3 6 1 8

Figure 12: Example 5: Tuleja’s Theorem

Since7 is a floater and appears with a3 in therightmost block,3 is not a floater. The439 inthe leftmost block tells us that the pair is either34 or 39. Since3 appears with6 in the top rowof the middle block,6 is a floater, so85 is a pairand the only place the5 can go is in the circledsquare,a2.

(Also note that since34 or 39 is a pair, and thereis a4 elsewhere in the middle block, thata5 mustbe9.)

6 X-Wings and Swordfish

An X-wing configuration occurs when the samecandidate occurs exactly twice in two rows andin the same columns of those two rows. (Youcan swap the words “rows” and “columns” in theprevious sentence.) In the configuration on theleft in figure 13 the candidate3 occurs exactlytwice in rowsc andh and in those two rows, itappears in columns2 and7. It does not matterthat the candidate3 occurs in other places in thepuzzle.

The squares where the X-wing candidate (3, in this case) can be placed form a rectangle, so a pairof opposite corners of that rectangle must contain the candidate. In the example, this means thatthe3’s are either in bothc2 andh7 or they are in bothc7 andh2. Perhaps the fact that connectingthe possible pairs would form an ‘X’, like the X-wing fightersin the Star Wars movies, gives this

9

i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 91 2

6 4 8 7 936

1 2 35

1 2 35

2 3

169 5

1 3

8 236 7

1 34

34

92

9

2 3

9 7 5 4 12 3

9 6 83 8 5 2 1 9 4 7 67 6 2 3 5 4 8 9 14 1 9 6 7 8

2 3 2 3

58 7 6 4 3 5

1 2

9

1 2 2

91

59

3

9 41

9 6 23

5 8 71 2

59

2 3

9

1 3 1

9 8 7 63

4 53

4 i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 91 2

7 8

1 3

7 8 9 58

3

7 61 3

5 41 2 3

5

24

8

3

8

34

858 9 1 8 9 7 6

2 35

1

7 6 53

7 4 21 3

9 83 9 6

147 8 5

14 2

1

7147 5 2

1 347 6

1 347 9 8

1 3

7147 8

1

7 8

14

8 2 91 347

1 34 5

15

7 66 4

1

8

1

7 8 9 51

7 8 9 2 31

7 9

59

1 3

8 71 34

8 9 21 34

8 9 61

514 5

9

59 2

1 3

63

7

1 347 9 8

15

7

14 57 9

Figure 13: X-Wing (left) and Swordfish

strategy its name.

In any case, since one pair of two corners must both contain the candidate, no other squares in thecolumns or rows that contain the corners of the rectangle cancontain that candidate. In the example,we can thus conclude that3 cannot be a candidate in squaresa7, f7 or i2.

A swordfish is like an X-wing except that there must be three rows with the three candidates appear-ing in only three columns. The candidate need not appear in all three positions in all three columns.The situation is similar to the difference between naked pairs and naked triples: for a naked pair,the two candidates must appear in both positons, but for a naked triple, all that is required is thatthe three candidates appear in only three positions; some positions may only have two of the threecandidates.

The candidate must occur three times, once in each row, but since the occurrences in those rows arerestricted to exactly three columns, all the columns must beused as well. The reasoning is similarto that used for the X-wing: once you find a swordfish configuration, the candidate cannot appear inany other squares of the three columns and rows. Of course youcan again swap the words “rows”and “columns” in the description above.

A sample swordfish configuration appears on the right in figure13. In this case, the candidate is7,and the columns that form the swordfish are2, 5 and8. In these columns the value7 appears onlyin rowsa, f andi. One7 must appear in each of these rows and each of the columns, so noothersquares in those rows and columns can contain a7. Thus the candidate7 can be eliminated fromsquaresa1, f1, f6, i6 andi9.

Of course there is nothing special about a swordfish configuration; “super-swordfish” with 4, 5, or6 candidates might be possible. They are rare but not particularly difficult to spot. The “super-swordfish” with4 rows and4 columns is sometimes called a “jellyfish”. If you are playingon astandard9 × 9 grid, the most complex situation you would need to look for would be a jellyfish,since if there were a5 × 5 super-swordfish, there would have to be in addition a4 × 4 or smallerswordfish in the remaining rows or columns. It’s too bad that there’s no real need for the5 × 5super-swordfish, since the usual name in the internet literature is so nice: it is called a “squirmbag”.

10

But let us see why this is so, for a particular situation. It will be clear how the argument can beextended to others.

Suppose that the candidate1 has been assigned to two squares. Then there are7 rows and columnsin which a1 hasnot been assigned. If we find a (4-row) jellyfish, we would like to show that theremust be a3-column (or simpler) swordfish. Assume that in each of rowsw, x, y andz the number1is a possible candidate only in columnsα, β, γ andδ. It may not be a candidate in all those columns,but in those four rows, it will never be outside those columns.

But that means that in theother three columns, the candidate1 will be missing from the rowsw, x,y andz, so it must appear only in the other three rows. That means there is at most a (3-column)swordfish.

Obviously there is nothing special about the7, 4 and3 in the argument above. If there aren availablerows and columns, and you find ak-row “swordfish”, there must be an(n− k)-column “swordfish.”

7 The XY-Wing and XYZ-Wing

Sometimes a square has exactly two candidates and we are logically led to the same conclusion nomatter which of the two we assume to be the correct one. An “XY-wing” represents such a situation.This is a sort of “guess and check” strategy, but it only looksahead one step so it is easily done by ahuman.

f

e

d

c

b

a

1 2 3 4 5 6

XY YZ

XZ *

Figure 14: XY-Wing

In the configuration in figure 14, suppose that there are two possible can-didates in squaresb2, b5 ande2. In the figure, the candidates are justcalled X, Y and Z. Consider the contents of squareb2. If X belongs inb2,then there must be a Z ine2 and therefore Z cannot be a candidate ine5.But the other possibility is thatb2 contains a Y. In this case,b5 must be Zand again,e5 cannot be Z. Thus no matter which of the two values goesin b2, we can deduce that Z is impossible in squaree5.

Now consider the configuration on the left in figure 15. If either X or Ybelongs in squarea2, Z cannot be a candidate in the three squares indi-cated by asterisks. Similarly, in the configuration in the center in the samefigure, Z cannot be a candidate in two more squares indicated by asterisks.

c

b

a

1 2 3 4 5 6

XY XZ

YZ * * *

+c

b

a

1 2 3 4 5 6

* XY * XZ

YZ

→c

b

a

1 2 3 4 5 6

* XY * XZ

YZ * * *

Figure 15: XY-Wing

Obviously, the two configurations on the left in figure 15 can be combined to make the configurationon the right in the same figure where Z can be eliminated as a candidate in any of the squares markedwith an asterisk.

An example of an XY-wing in an actual puzzle appears in figure 17. Notice that in squaresd8 andf7 (both in the same block,def789) and in squared1 we have candidates{8, 9}, {3, 9} and{8, 3},respectively. No matter which of the two values appear ind8, a 3 must appear in eitherd1 or f7.Because of this, we can eliminate3 as a candidate from squaresd7, f1 andf2.

The XYZ-wing is a slight variation on the XY-wing. If you can find a square that contains exactlythe candidates X, Y and Z and it has two buddies, one of which has only candidates X and Z while

11

the other has only candidates Y and Z, then any square that is buddies withall three of those squarescannot admit the candidate Z. On the left side of figure 16 thissituation occurs, and candidate Z canbe eliminated from squaresb2 andb3.

i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 9

2 8 9 4 57 1 5

7 6 37 6 4 5 3 2 9 8

15

14

1 4 5 3 8 6 57

49

57 9 2

3

8 1 2 7 4 58 9

4 58

34

9 8 9 69 4

7 6 2 48 3 1 7 8 5

358

35

74 57 1 8 9 6

3

9 2 47

63

57

57 8 9

34

847 8 2

15

7

1

73

5 9 1 63

7 2 57 4 8

4 2 7 8 5 1 7 8 6 3 9 i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 914

14 8 6

7 5 9 21

67 3

1 25

9 61 2

59 3 8

2

7

15 4

15

725 7 3 4 1

26 8 9 5 6

6 51 2

9 7 8 43

7 8

1 3

9

1 2 3

7

1

7 9

82

9 7 53

9 1 62 3

43

1

9 4 2 69

67

15

9 81

57 9

71 2 3 1 2

936

36

8 4 51

68

15

9 81

59

16 7 4

1 3

9

1 36 2

14

9

1 34

9 61

8 2 5 71 3 1

8 9

Figure 17: XY-Wing Example (left), XY-Chain (right)

8 XY-Chains

c

b

a

1 2 3 4 5 6

YZ

XYZ * * XZ

Figure 16: XYZ-Wing

There is another way to look at the XY-wing. We can think of theexamplein figure 14 as a sort of chain frome2 to b2 to b5. If the value ine2 is notZ, then it is X. Sinceb2 is a buddy ofe2 this would forceb2 to be Y, andsinceb5 is a buddy ofb2, thenb5 would be forced to be Z. The reasoningcan be reversed if we assumeb5 is not Z and we can conclude thate2must be Z. Thus exactly one ofb2 or b5 must be Z and the other is not.Any squares that are buddies of bothe2 andb5 (only e5 in the example infigure 14) cannot possibly be Z.

The interesting observation we can make from this is that there is no need for such a chain to be onlytwo steps long: it can be as long as we want, as long as the same candidate appears at both ends.When this occurs, we can eliminate that candidate from any ofthe squares that are buddies of bothsquares that are the endpoints of such a chain. We will call these XY-chains.

Consider the situation on the right in figure 17. Look at the following chain of squares linked inexactly the same way that the three squares in an XY-wing are linked: i8− e8− e2− e5− g5− h4.Each of the squares is a buddy of the next; each square contains only two possible candidates, andfinally, those two candidates match with one of the two candidates of the squares on either side ofit in the chain. Finally, the left-over candidate (1 in this example) is the same in squaresi8 andh4.By stepping through the chain we can conclude that ifi8 is not1 thenh4 is, and ifh4 is not1 theni8 is. Thus eitheri8 or h4 must be1, so squares that are buddies of bothi8 andh4 cannot be1 andwe can eliminate1 as a candidate from squaresh7, h8 andi4.

We can also note that naked pairs (discussed in section 4) area simpler version of this idea, butwhile an XY-wing is an XY-chain with three links, a naked pairis a chain with only two links. Butthe naked pair that contains a1 and a2 in each square is like two of these chains: one with1 at the

12

endpoints and one with2 at the endpoints, so all1’s and2’s can be eliminated from squares that arebuddies of the two that make up the square. Similar reasoningcan be applied to see that a nakedtriple like 12, 23 and31 can be thought of as three different XY-chains where each pair is a differentset of endpoints.

9 Coloring and Multi-Coloring

Coloring and multi-coloring are techniques that eliminatecandidates based on logical chains ofdeduction. The coloring method, is simple enough that it canbe done by hand.

9.1 Simple Coloring

Consider the example in figure 18 where we consider a few squares that admit the candidate1. Let’sassume for now that these are the only possible locations forthe candidate1 in the puzzle. Certainvirtual lines contain exactly two places where the candidate1 can go: rowsb andi, columns3 and6, and blockdef123. In each of these virtual lines, exactly one of the possible squares can contain a1 and once it is selected, the other cannot.

But this creates a sort of “chain” iff1 contains1, thene3 must not, and sincee3 must not,b3 must,sob6 must not,i6 must, andi9 must not. If, on the other hand,f1 does not contain a one, the sameseries of virtual line interactions will force an alternating set of conclusions and every square in thechain will be forced to have the opposite value.

i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 9

1+ 1–

1–1+

1+ 1–

Figure 18: Simple Coloring

In the figure we’ve marked the squares with+and− according to the assumption thatf1 doescontain a1, but of course it may be the case thatf1 does not contain1, then all the+ and− signswould be interchanged. Rather than using the“+” and “−” characters that could imply pres-ence or absence of a value it is better simply toimagine coloring each square in the chain blackor white, and either all the black squares have a1 and all the white squares do not, or vice-versa.

In most situations, not all of the squares in a puz-zle that admit a candidate can be colored: onlythose squares where the candidate appears ex-actly twice in some virtual line can be part of achain. If there are three candidates in a row, forexample, and one of them is colored, we cannotimmediately assign colors to the two others inthat row although we may be able to do so later,based on other links in the chain.

Suppose now that for some candidate you havediscovered such a chain1 and have colored it in this alternating manner.

It may be that there are additional squares where the candidate could possibly occur that do not

1For astute readers, it may not really be a chain, but it could be a tree, or even have loops, as long as the black/whitealternation is preserved.

13

happen to lie in the colored chain. In figure 18, suppose square f1 is colored black and so squarei9 must be colored white. Consider the squaref9 that lies at the intersection off1’s row andi9’scolumn. Sincef1 andi9 have opposite colors, exactly one of themwill contain a1, and thereforeit is impossible for the squaref9 to contain a1, so1 can be eliminated as a possible candidate insquaref9.

i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 9

8 6 71 2 1 2 3

5 9 41 3 2 3

5

4 1 32

67

25 6

7 8 57 9

25 6

7

5 2 914 67

1 36

7

147

1 3

7 8

1 3

7 8

36

71

7 4 2 51

7 63

8

3

8 99 3 8

1 2

7 41 2

7

15

7 6 57

6 57

15 8 9 3 2

1

7 41 2 3

7 91

61 24 67

1 26

7

1 247

3

7 5 82

7 8 5 6 32

67

25

7 9 4 11 3

75

7 4 9 81

57 6 2

3

7i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 9

WW B

BB WW

Figure 19: Simple ColoringThis is probably easier to see with the concrete example displayed on the left in figure 19 where weconsider1 as a possible candidate. In rowd, d1 andd5 are the only occurrences of candidate1, sowe colord5 black andd1 white. Butd1 andf3 are the only possibilities for1 in block def123, sosinced1 is white,f3 is black. By similar reasoning, sincef3 is black,g3 andf8 are white. Sincef8is white,e7 is black, and sincee7 is black,c7 is white. That’s a pretty complicated chain, but here’swhat we’ve got: black:{d5, f3, e7} white: {d1, g3, f8, c7}. A grid that displays just the coloredsquares appears on the right in figure 19.

i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 9

1 5 48 9 4

8 3 7 6 29

3

8 2 7 1 63

8 5 43

48 7 6 5 4

8 23

8 1 936 1 5 8 9 4 2 7

36

23

8 9 6 7 5 43

8 16

8 4 7 3 2 1 5 9 68

7 63

8 4 5 9 1 23

8

5 23

4 1 6 8 93

4 74

8 9 1 2 3 7 6 48 5 i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 9

B WB W

B B

W BB W

W B

B W

Figure 20: Simple Coloring: Chain ConflictSquarec5 is at the intersection ofc7’s row andd5’s column, butc7 is white andd5 is black, so1

14

cannot be a candidate in squarec5. Similarly, squareg5 is in the same row asg3 and same columnasd5 which are white and black, respectively, so1 also cannot be a candidate ing5.

Note that after making eliminations like this, it may be possible to extend the coloring to additionalsquares although that is not the case in figure 19.

There is nothing special about a row-column intersection. Any time two oppositely-colored squares“intersect” via virtual lines of any sort in another square,the candidate can be eliminated as a pos-sibility in that square. Sometimes a candidate can be assigned immediately to a square on the basisof coloring. Suppose that two squares of a chain are the same color, but lie in the same virtual line.If squares of that color contained the candidate, then two squares in the same virtual line wouldcontain it which is impossible, so the candidate can be immediately assigned to all squares of theother color. This situation is shown in figure 20. In that figure, the board on the left is colored forcandidate8 as shown on the right. Note that the black squares conflict in afew places: in column1, row c, and in blockabc123. This means that the candidate8 can be eliminated from every squarecolored black.

9.2 Multi-coloring

Sometimes a position can be colored for a particular candidate and multiple coloring chains exist,but none of them are usable to eliminate that candidate from other squares. If there are multiplechains, it is worth looking for a multi-coloring situation.

f

e

d

c

b

a

1 2 3 4 5 61A

1 1 1

1a

1A

b1 1

B1 1

a1 1

Figure 21: Multi-coloring

Consider the puzzle in figure 21. Assume that in the parts ofthe puzzle that are not shown there are no other places that thecandidate1 can occur. When this diagram is colored, thereare two coloring chains. Instead of using words like “black”and “white” we will use letters, likeA, B, C, a, b andc wheretheA anda represent opposite colors, as do theB andb, C

andc and so on. In figure 21 rowsa andc and in column3there are only two possible locations for candidate1.

When this grid is colored, it will look something like this:squaresc1 andf3 have colora and squarec3 has colorA.Squaree2 has colorb and squaree5 has colorB. Note that thecolors assigned are arbitrary. All that matters is that squaresc1 andf3 have the same color that is the opposite ofc3 and thate2 ande5 have opposite colorsthat are different from the other assigned colors. Note alsothat none of the other squares with1 asa candidate can be colored, since all are in virtual lines with more than two squares that potentiallycould contain the candidate1.

If we consider the color “a” as standing for the sentence: “Every square containing thecolor “a”contains a1,” and so on, then we can write little logical expressions indicating the relationshipsamong the various colors when they are interpreted as sentences. The obvious ones are of the form:“a = ¬ A” or “ A = ¬ a” (where the logical symbol “¬” means “not” and the symbol “=” means “islogically equivalent to”). In other words, ifa is true thenA is false, and vice-versa.

In this section, we will be performing what is known as boolean algebra2 on expressions involving“sentences” such asa, A, b, B and so on.

Although the values of non-opposite colors do not necessarily have anything to do with each other,

2See [9] in the references for a text on boolean algebra.

15

in figure 21, the paira andb, for example, are linked, since they occur in the same block.If a is true,thenb cannot be, and vice-versa, but it may be true that botha andb are false. We will express thisrelationship as “a⊼b” and read it as “a excludesb”. Obviously, if a⊼b thenb⊼a3.

Another way to think ofa⊼b is as “If a is true then so isB.” and at the same time, “Ifb is true thenso isA.” If a⊼b then at least one ofa or b must be false. Equivalently, ifa⊼b is true then at leastone ofA or B must be true. This means that any square that is a buddy of two squares coloredAandB must not allow the candidate since one of the two squares coloredA or B must contain thecandidate. In figure 21, this means that1 cannot be a candidate in squarea5.

To condense all of the above into a single statement, we know that if a⊼b for some candidate thenany square that is a buddy of bothA andB cannot contain that candidate.

Let us begin with a simple example of multicoloring displayed in figure 22. On the left is thecomplete situation, and on the right is a simplified version where only squares having the number6as a possible candidate are displayed.

i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 936

1 34 6

1 3

9 2 5 67 8 4

7

8 5 91

6 7 4 21

6 37 4 6 2

1 36

1 3

8 914 6 5

4 7 6 8 5 3 1 9 22 9

1 3 14 6 7

34 5 8

51 3

8 214 9

36

74 67

4 67

369 8 4

3

7

3

9 2 5 67 1

369

36 7 5

34

9 1 8 2 4 6

1 2 5 47 8 6 4

7 3 9 i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 9

6 6 6B

6C 6c

6 6c 6

6b 6 6a

6A 6a

6 6 6A

Figure 22: Simple Multi-coloring Example: Coloring on Right

In row b and column4 there are only two squares that admit candidate6, so we have colored allthose squares withC andc. In the same way, the two squares in column7 are colored withB andb,while A anda are used to color four squares that share, in pairs, rowg, column9 and blockghi789.

In this example, we note thata⊼b because instances of them lie in squaresf7 andf9 which arebuddies. Because of this, any square that is a simultaneous buddy of a square coloredA and of onecoloredB cannot allow6 as a candidate. In the figure, squarea1 is a buddy of bothg1 anda7, so6cannot be a candidate in squarea1, so we can see in full puzzle on the left that3 can be assigned tosquarea1.

In the figure,B⊼c (since they lie in the blockabc789) andc⊼a (since they lie in column8) as well,but there are no squares that are simultaneous buddies of squares coloredb andC or of C andA sowe cannot use those exclusion relationships to help solve this puzzle.

Next, we will look at a multi-coloring situation that is quite complex because much more can bedone with a multi-coloring. In complex situations, there may be many independent color chains with

3The truth table fora⊼b is equivalent to the “nand” (“A nand B” is the same as “not(A and B)”) logical operator that isheavily used in computer hardware logic designs and is sometimes represented by the symbol “⊼”.

16

colorsA anda, B andb, C andc, and so on. When that occurs, we need to look for consequencesofthe following inference:

If a⊼b andB⊼c thena⊼c.

It’s not hard to see why: Ifa is true,b is not, soB is true, and the second exclusion implies thatc isnot. The reasoning is trivially reversed to show that ifc is true thena is not, so we obtaina⊼c.

Thus to do multi-coloring for a particular candidate, proceed as follows:

• Construct all possible color chains for the diagram.

• Find all exclusionary relationships from pairs of colored squares that are buddies.

• Take the collection of relationships and complete it to its transitive closure using the idea thatif (a⊼b andB⊼c) thena⊼c.

• For every exclusionary pair in the transitive closure, find buddies of squares colored withcolors opposite to those in the pair, and eliminate the candidate as a possibilty from all ofthem.

i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 9

4 8 9 9 6 9 2 7 52 5 7 9 6

9 7 6 9 9 4 35 2 9 8 4 9 6 9 3

9 6 8 3 5 2 7 4 9

3 9 9 9 1 6 8 5 28 3 5 9 9 1 6 76 1 9 8 3 9 47 9 9 6 3 1 5 9 9

i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 9

9A

9E

9

9a

9b

9

9 9 9A

9A

9a

9D

9 9e

9B

9b

9C

9c

9d

9 9 9A

Figure 23: Complex Multi-coloring Example: Coloring on Right

As another multi-coloring example, see figure 23 where only the presence of squares that admit thecandidate9 are marked (all, of course, must admit other candidates). Onthe left is the complete gridand on the right is a simplified version where only the squaresadmitting candidate9 are shown, andall of the color chains are displayed. It is an excellent exercise to look at the diagram on the right tomake certain that you understand exactly how all the color chains are constructed.

The next step in the application of multi-color is to find all the exclusionary pairs, and the initial listis displayed in table 1. Note that the “⊼” operation is commutative, so if you thinka⊼b should be inthe list and it is not, be sure to look forb⊼a as well.

From these initial exclusions, a number of others can be deduced. For example, froma⊼b andA⊼d

we can conclude thatb⊼d. Note that to make this implication, we are implicitly usingthe fact thata⊼b andb⊼a are equivalent. Similarly, sinceA⊼E andD⊼e we can conlude thatA⊼D. See if youcan discover others before reading on.

17

Table 1: Direct Exclusions

A⊼E a⊼b D⊼e A⊼d A⊼C

A⊼c b⊼E A⊼D C⊼d

In fact, if we make all such deductions, and then all deductions from those, and so on, there aretwelve additional exclusions that we find, and they are displayed in table 2

Table 2: Derived Exclusions

b⊼d b⊼C b⊼c b⊼D C⊼e

A⊼A A⊼b b⊼e b⊼b A⊼e

E⊼b A⊼D

For most of them, we need to look for generalized intersections of the opposites of the exclusionaryvalues. For example, sinceA⊼e and there is an “a” in c1 and anE in a4, then9 cannot be a candidatein squaresc5 or c6. Also, since we’ve gotA⊼A andb⊼b we can conclude thata andB are true sinceif a statement’s truth implies its own falsity, the statement must be false.

10 Remote Naked Pairs

f

e

d

c

b

a

1 2 3 4 5 61 2 1 2

1 2

*1 2

Figure 24: RemoteNaked Pairs

This technique is related to naked pairs, simple coloring, and XY-chains.Sometimes there will be a series of squares with the same two candidates,and only those two candidates, that form a chain in the same way that weconsidered chains of single candidates in section 9.

In figure 24 we have naked pairsa3 anda6, a6 andc5, andc5 andd5, whereeach pair shares a row, column or block. By an alternating coloring argument,it should be clear thata3 andd5 are effectively a naked pair: one of the twomust contain the candidate1 and the other,2. Thus the square marked withan asterisk,d3, cannot contain either a1 or a2, and they can be eliminated aspossible candidates for squared3.

11 Unique Solution Constraints

If you know that the puzzle has a unique solution, which any reasonable puzzle should, sometimesthat information can eliminate some candidates. For example, let’s examine the example in figure 25.

In row c, columns4 and6, the only possible candidates are1 and2. But in rowg, columns4 and6,the candidates are1, 2 and8. We claim that8 must appear ing4 or g6. If it does not, then the fourcorners of the squarec4, c6, g4 andg6 will all have exactly the same two candidates,1 and2, so wecould assign the value1 to either pair of opposite corners, and both must yield validsolutions. Ifthere is a unique solution, this cannot occur, so one ofg4 or g6 must contain the value8. But if that’sthe case, squarei4 cannot be8, so the candidate8 can be eliminated from squarei4. In addition,since eitherg4 or g6 must be8, g8 cannot be8 since it is in the same row as the other two.

18

i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 9

63

8 1 7 5 9 43

8 22 9 5

34 6 8

34

36 1 7

43

8 71 2 3

61 2

68 9 5

3

9

5 6 48

34

8 9 13

48 2 7

3

9

1 23

9 5 7 63

9 4 83

9 7 48

48 9 2

34

8 1 6 53

9 52

69

1 2

8

36

1 2

8 73

8 9 48 4

36

36 9 7 5 2 1

7 12

9

2

8 4 53

8 9

3

8 9 6

Figure 25: Uniqueness Constraint

In the same figure, a similar situation appears inanother place. See if you can find it. Hint: it iscolumn-oriented instead of row-oriented.

Let’s go back and see exactly what is going on,and from that, we’ll be able to find a numberof techniques that are based on the same generalidea. Figure 26 shows a basic illegal block. Any-thing at all can occur in the squares that are notcircled, but note that an assignment of a2 or a7 to any of the circled squares forces the valuesof the others in an alternating pattern. But anyof the squares can be assigned a2 or a7 and theresulting pattern will be legal. This means thereare two valid solutions to the puzzle.

This means that if some assignment causes an il-legal block to be formed, that assignment is im-possible, and we can use that fact to eliminatecertain possibilities, as we did in the example infigure 25. Note that the four corners must notonly form a rectangle, but they must be arranged

so that two pairs of adjacent corners must lie within the sameblocks. If the four corners lie in fourdifferent blocks, then constraints from those different blocks can force the values one way or theother.

Now let us examine some variations of this theme. In the rest of the examples in this section, we’llassume nothing about the empty squares. They may have valuesassigned to them or may be empty.On the left in figure 27 we see something that is almost the sameas what we saw in figure 26 andthe only thing that makes it legal is the presence of the possibility of a 3 in squareb1. If it is not a3,then we would have the illegal block, so there must be a3 in squareb1. Note that if, in the figure,squareb1 had contained the possibilities2, 3, 4, and7, at least the two possibilities2 and7 couldstill be eliminated as candidates, so only a3 or a4 could be entered in that square.

c

b

a

1 2 3 4 5 6

2

7

2

72

7

2

7

Figure 26: An illegal block

The example in the middle of figure 27 is similar to the originalexample in this section except that the additional number occurs intwo different blocks instead of one. As before, at least one of thosesquares must contain the number (3 in this case), so the value3can be eliminated from any of the other squares in that row (rowb, in this case), but not in either of the blocks, since the one that isforced to be3 might be in the other block.

The example on the right in figure 27 illustrates another sortofdeduction that could be made. We know that at least one ofb1 andc1 must contain a number otherthan a2 or a7, but we don’t know which one. If we think of the combination ofthe two squares asa sort of unit, wedo know that this unit will contain either a3 or a4. This two-square unit, togetherwith squarea3 (which has3 and4 as its unique possibilities) means that no other square in the blockabc123 can contain a3 or a4. If the 34 square had been ina1 we could in addition eliminate3 and4 as candidates from any of the other squares in column1 outside the first block.

Note that we can have both a3 and a4 in either or both squaresb1 andc1 in this example on theright. As long as both occur, the argument holds. Also note that if the3’s and4’s appeared in rowband the entries in rowc were both27, and the34 square was in rowb we could eliminate any other

19

c

b

a

1 2 3 4 5 6

2 3

7

2

72

7

2

7c

b

a

1 2 3 4 5 6

2 3

7

2 3

72

7

2

7c

b

a

1 2 3 4 5 63

4

2 3

7

2

72

47

2

7

Figure 27: Uniqueness Considerations

3’s and4’s in that row.

12 Forcing Chains

i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 91 2

6 4 8 7 936

1 2 35

1 2 35

2 3

169 5

1 3

8 236 7

1 34

34

92

9

2 3

9 7 5 4 12 3

9 6 83 8 5 2 1 9 4 7 67 6 2 3 5 4 8 9 14 1 9 6 7 8

2 3 2 3

58 7 6 4 3 5

1 2

9

1 2 2

91

59

3

9 41

9 6 23

5 8 71 2

59

2 3

9

1 3 1

9 8 7 63

4 53

4

Figure 28: Forcing Chains

This method is almost like guessing, but it is aform of guessing that is not too hard for a hu-man to do. There are various types of forcingchains, but the easiest to understand works onlywith cells that contain two candidates.

The idea is this: for each of the two-candidatecells, tentatively set the value of that cell to thefirst value and see if that forces any other two-candidate cells to take on a value. If so, findadditional two-candidate cells whose values areforced and so on until there are no more forcingmoves. Then repeat the same operation assum-ing that the original cell had the other value.

If, after making all possible forced moves withone assumption and with the other, there existsa cell that is forced to the same value, no matterwhat, then that must be the value for that cell.

As an example, consider figure 28, and let’s be-gin with cellb3 which can contain either a1 or a3. If b3 = 1, theni3 = 3, soh2 = 9, soh4 = 1.On the other hand, ifb3 = 3 theni3 = 1 so i4 = 9 soh4 = 1. In other words, it doesn’t matterwhich value we assume thatb3 takes; either assumption leads to the conclusion thath4 = 1, so wecan go ahead and assign1 to cellh4.

Note that XY-wings that we considered in section 7 are basically very short forcing chains.

13 Guessing

The methods above will solve almost every sudoku puzzle thatyou will find in newspapers, and infact, you will probably hardly ever need to use anything as complex as multi-coloring to solve suchpuzzles. But there exist puzzles that do have a unique solution, but cannot be solved using all themethods above.

20

i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 91 2

5 61 2 1 2

5 68

1 36

8 71 2 3

6 9 41 2 3

6

1 26 7

1 24 6

8

1 36

8 91 2 34 6

2 36

8

1 36 5

31 24

9

1 24 6

8 9

16

8

1 24 5

26

8 71 2

6

25

9 8 7 42 35

2 369 1

35 6

9

2 369

4 6 31

59 8

1 2

9

25

75

9

2

7 91 2

59

1 2

9

1 25

9

1 35 6

9

1 2 35 7

2 34 5 6 8

2 34 6

9

81 2 34

9

1 24 6

9 71 34 5

1 34

9

34 5 6

1 35 6

9

1 34 6

9

71 34

9

14 6

9

1 35

9

1 34 5

1 34

9

34 5 6 2 8

1

9 514

9 2 6 83

47

1 3

9

1 347 9

Figure 29: A Very Hard Sudoku Puzzle

One method that will always work, althoughfrom time to time it needs to be applied recur-sively, is simply making a guess and examiningthe consequences of the guess. In a situation thatseems impossible, choose a square that has morethan one possible candidate, remember the situ-ation, make a guess at the value for that squareand solve the resulting puzzle. If you can solveit, great—you’re done. If that puzzle cannot besolved, then the guess you made must be incor-rect. It can be eliminated as a candidate for thatsquare. Now you can return to the saved puzzleand try to solve it with one candidate eliminated.

Obviously, when you try to solve the puzzle af-ter having made a guess, you may arrive at an-other situation where another guess is required,in which case a second level of guess must bemade, and so on. But since the method alwayseventually eliminates candidates, you must ar-rive at the solution, if there is one. In computerscience, this technique is known as a recursive search. Figure 29 is an example of such a puzzle thatcannot be cracked with any of the methods discussed so far except for guessing. The solution to thispuzzle can be found in section 20.

Guessing is a direct logical approach in the sense that we assume something is true, such as “squarei3 is a 3” and then we follow the consequences of that to see if it results in a contradiction. Apurely formal approach can also be taken. For every one of the81 squares, there is a set of18statements about the square that can be either true or false.These have the form “squarex has thevaluei” or “squarex does not have the valuei”, wherei is a number between1 and9. There are18 × 81 = 1, 458 of these assertions.

From the initial configuration, we know some are true and someare false, and it is possible to makelogical deductions from them. If we know that squarex is 4 then we know that, fori 6= 4 thestatements “x is i” are false, and the statements “x is noti” are true. Similarly, we can make logicalconclusions about the buddies ofx if we know that squarex has a certain value. If no buddy ofx canhave a certain value, thenx must have that value, and there are a few other similar rules of inferencewe can use to assign truth values to the1, 458 propositions. In principle, a person can search the listover and over to see if any of these rules of inference can be applied, and if so, apply them to assignyet more truth values. Repeating this over and over will always succeed in solving the puzzle.

The method described above does not involve guessing, and works directly forward using onlylogical consequences, but it is not a reasonable way for a human to solve the puzzle. Computerprograms are great at this sort of analysis, but they may needto apply thousands of such inferencerules to take each step forward.

14 Equivalent Puzzles

There is no reason that the numbers1 through9 need to be used for a sudoku problem. We never doany arithmetic with them: they simply represent9 different symbols and solving the puzzle consists

21

of trying to place these symbols in a grid subject to various constraints.

In fact, the construction of a valid completed sudoku grid isequivalent to a graph-theoretic coloringproblem in the following sense. Imagine that every one of the81 squares is a vertex in a graph, andthere is an edge connecting every pair of vertices whose squares are buddies. Each vertex will beconnected to20 other vertices, so the sudoku graph will consist of81 · 20/2 = 810 edges. Findinga valid sudoku grid amounts to finding a way to color the vertices of the graph with nine differentcolors such that no two adjacent vertices share the same color.

Since the symbols do not matter, we could use the lettersA throughI or any other set of ninedistinct symbols to represent what is essentially the same sudoku puzzle. If we take a valid grid andexchange the numbers1 and2, this is also essentially the same puzzle. In fact, any permutation ofthe values1 through9 will also yield an equivalent puzzle, so there are9! = 362, 880 versions ofevery puzzle available simply by rearranging the digits.

If you were trying to calculate how many grids there are, a good approach would be to assume thatthe top row consists of the numbers1 through9 in order, to count the number of grids of that typethere are, and then to multiply that result by9! = 362, 880.

i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 9

5 4 32

4 68

14 6

8 71 24

8 9 48

47 8 9 1 3 4 5

8

24 5 6

24 5 4 5

7 8

2 6 47 8

48 9

14 5

8 9

14 5

9

1 34 5

8

34 5 4 5

7 83

47 9 8

24 57

4 67 9

34 67 9

4 69

2 34 5

2 34 5 6 1

1 347 9

1 347

47 5 2 8

34

34 6 4 6

9

61 34 5

24 5 4

9

1 34

9

14

9

2 34 5

8 7 4 58 9

347 8

34 57

4 5 67 8

4 67 8 9

4 5 67 8 9

4 5 69

4 5 1 214

14 5 9

24 6 4 5 6 3 7 8 4 5 6

47 8 2 4 5 6

7 8 1 4 5 67 8

4 5 6 9 4 5 6 3 i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 9

5 6 12 3

68

2 34 6 7

24 6

8 92

6

26

7 9 4 125 6 5 6

8 3 5 68

25 6

7

8 32

67

269

24 5 6

94 5 6

9

1 24 5 6

15 6

25 6

71

67 9 2 5 6

7 8

36

7 9

1 36

7 9

369

15 68

1 35 68 4

14 67 9

14 67

67 5 8 2

16

1 36

369

314 5 6 5 6

869

14 6

94 6

9

1 25 68 7

25 6

91 2

67

15 6

7

2 35 6

7

2 36

7 9

2 35 6

7 9

35 6

95 6 4 8

4 6 4 5 6 936

8

35 6 1 7 2

35 6

26

7 82 35 6

7 42 35 6

7

35 6 9

35 6 1 i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 91 2 3

8 9

1

9

1 2 3

9

1

7 8

1 3

7

1 3

4 5 61 3

8 9

14 5

9 6 21 34 57

1 34 5

1

7 8 9

1 3

7 8 9

1

7 8

714 5

1 2 34

14 6

8 91 34 5 6

1

8

1 2 3

8

1 2

81

69 2 8

14 6

9

14 5 7 3

14 6

14 5

4 31 1

6 81 2

5 61

5 67

1 26

7 95

169 7

14 6

9

1 34

1 2 34 6

16

8

1 24 6

8

1 24

81

69

14 67 9 5 3

147 8 2

14 67 9

147

1 2

81 24

147 6 9

15

7

147 3

1 369

14 67 9

1 34

9 5 214

16

7 8 9

14 67 8 9

147 8

Figure 30: Essentially Equivalent Puzzles

In addition to simply rearranging the numbers, there are other things you could do to a puzzle thatwould effectively leave it the same. For example, you could exchange any two columns (or rows)of numbers, as long as the columns (or rows) pass through the same blocks. You can exchange anycolumn (or row) of blocks with another column (or row) of blocks. Finally, you can rotate the entriesin a grid by any number of quarter-turns, or you could mirror the grid across a diagonal.

Figure 30 shows some examples. If the puzzle on the left is theoriginal one, the one in the centershows what is obtained with a trivial rearrangement of the digits 1 through9 (the entries1, 2, . . . ,9 are replaced in the center version by4, 8, 1, 6, 5, 3, 7, 2 and9, respectively). The version on theright is also equivalent, but it is very difficult to see how itis related to the puzzle on the left.

One obvious mathematical question is then, how many equivalent puzzles are there of each sudokugrid in the sense above?

Another interesting mathematical question arises, and that is the following: given two puzzles thatare equivalent in the sense above, and given a sequence of steps toward the solution of one that areselected from among those explained in earlier chapters, will those same steps work to solve theother puzzle. In other words, if there is a swordfish positionin one, will we arrive at a differentswordfish in the other? The answer is yes, but how would you go about proving it?

Notice that the puzzle on the left (and in the center) in figure30 is symmetric in the sense that if youmark the squares where clues appear, they remain the same if the puzzle is rotated by 180 degreesabout the center. Other versions of symmetric puzzles couldbe obtained by mirroring the clue

22

squares horizontally or vertically. Most published puzzles have this form. This doesn’t necessarilymake them easier or harder, but it makes them look aesthetically better, in the same way that mostcrossword puzzles published in the United States are also symmetric.

Another interesting question is this: given a symmetric puzzle, how many equivalent versions of itare there?

15 Counting Sudoku Grids

A sudoku grid is a special case of a9 × 9 latin square4 with the additional constraint barring dupli-cates in the blocks. There are a lot of9 × 9 latin squares:

5, 524, 751, 496, 156, 892, 842, 531, 225, 600.

Bertram Felgenhauer (see reference [1]) has counted the number of unique sudoku grids using acomputer, and his result has been verified by a number of otherpeople, and that number turns out tobe much smaller, but also huge:

6, 670, 903, 752, 021, 072, 936, 960 = 220385171277042679711.

The number above includes all permutations of the numbers1 through9 in each valid grid, so if wedivide it by9! we obtain:

18, 383, 222, 420, 692, 992 = 21334277042679711,

which will be the number of inequivalent grids.

16 Magic Sudoku Grids

A latin square has all the digits in each row and column. A “magic square” is a latin square whereeach diagonal also contains all the digits. Is there such a thing as a magic sudoku grid? The answeris yes, and if we assume that the main diagonal contains the digits in a fixed order, there are4, 752of them. All 4, 752 of the grids can be completed, and all of them in multiple ways. The puzzlepresented in figure 31 is a standard sudoku puzzle, except that it is easier since it requires that eachof the diagonals contains all the digits from1 to 9.

17 Minimal Sudoku Puzzles

What is the minimal number of locations that must be filled in an otherwise empty grid that willguarantee that there is a unique solution? As of the time thispaper was written, the answer to thatquestion is still unknown, but examples exist of puzzles that have only17 locations filled and thathave unique solutions. Figure 32 shows such a puzzle on the left. Although this puzzle contains theminimum amount of information in terms of initial clues, it is not, in fact, a difficult puzzle. Thepuzzle to the right in the same figure contains18 clues, and is symmetric. This is the smallest knownsize for a symmetric sudoku puzzle.

4A latin square is a grid where the only constraint is that there be no duplicate entries in any row or any column.

23

i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 9

4 7 29 8

1 66 5 2 7 8

8 97 8 2 3

8 12 9 4 68

Figure 31: Magic Sudoku Puzzle

i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 9

6 19 1 6 8

72

3 74 6

7 38 1

5 i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 9

8 3 26 1

56 31 7

2 86

8 25 3 9

Figure 32: Minimal Puzzles

18 Constructing Puzzles and Measuring Their Difficulty

The difficulty of a sudoku puzzle has very little to do with thenumber of clues given initially.Usually, the difficulty ratings are given to indicate how hard it would be for a human to solve thepuzzle. A computer program to solve sudoku puzzles is almosttrivial to write: it merely needs tocheck if the current situation is solved, and if not, make a guess in one of the squares that is not yetfilled, remembering the situation before the guess. If that guess leads to a solution, great; otherwise,restore the grid to the state before the guess was made and make another guess.

The problem with the guessing scheme is that the stack of guesses may get to be twenty or thirtydeep and it is impossible for a human to keep track of this, buttrivial for a computer. A much moretypical method to evaluate the difficulty of a puzzle is relative to the sorts of solution techniques thatwere presented in the earlier sections of this article.

In this article, the techniques were introduced in an order that roughly corresponds to their difficultyfor a human. It is easy to look at a row, column or block and see if there is just one missing numberand if so, fill it in, et cetera.

24

So to test the difficulty of a problem, a reasonable method might be this. Try, in order of increasingdifficulty, the various techniques presented in this article. As soon as one succeeds, make that move,and return to the beginning of the list of techniques. As the solution proceeds, keep track of thenumber of times each technique was used. At the end, you’ll have a list of counts, and the moretimes difficult techniques (like swordfish, coloring, or multi-coloring) were used, the more difficultthe puzzle was.

The rankings seen in newspapers generally require that the first couple of rankings (say beginningand intermediate) don’t use any technique other than those that yield a value to assign to a square oneach move. In other words, they require only obvious candidates, naked and hidden singles to solve.

Published puzzles almost never require guessing and backtracking, but the methods used to assign adegree of difficulty vary from puzzle-maker to puzzle-maker.

With a computer, it is easy to generate sudoku puzzles. Firstfind a valid solution, which can bedone easily by assigning a few random numbers to a grid and finding any solution. Next startremoving numbers (or pairs of numbers, if a symmetric puzzleis desired) and try to solve theresulting puzzle. If it has a unique solution, remove more numbers and continue. If not, replacethe previously-removed numbers and try again until a sufficient number of squares are empty. Thepuzzle’s difficulty can then be determined using the techniques described above. The entire processwill take only a fraction of a second, so one would not need to wait long to obtain a puzzle of anydesired degree of difficulty.

19 References

At the time of writing this article, the following are good resources for sudoku on the internet and inbooks:

[1] http://www.afjarvis.staff.shef.ac.uk/sudoku/: Felgenhauer’s paper countingpossible sudoku grids.

[2] http://www.geometer.org/puzzles: You can download the source code for the au-thor’s program that solves sudoku puzzles and can generate the graphics used in this article.

[3] http://www.websudoku.com/: This page by Gideon Greenspan and Rachel Lee gen-erates sudoku games of varying degrees of difficulty and allows you to solve the problemonline.

[4] http://angusj.com/sudoku/: From this page you can download a program writtenby Angus Johnson that runs under Windows that will help you construct and solve sudokuproblems. In addition, the page points to a step-by-step guide for solving sudoku, similar towhat appears in this document.

[5] http://www.sadmansoftware.com/sudoku/index.html: This site by Simon Arm-strong points to some nice descriptions of solution techniques, most of which are discussed inthis article.

[6] http://www.setbb.com/phpbb/index.php: This page is a forum for people who wantto solve and construct sudoku puzzles as well as for people who want to write computerprograms to solve sudoku automatically.

25

[7] http://www.madoverlord.com/projects/sudoku.t: A computer program calledSudoku Susser by Robert Woodhead for the Mac, Windows and Linux that will solve al-most any puzzle using logic alone. The distribution comes with great documentation as well,that describes many of the techniques presented here and others besides.

[8] Robin Wilson,How to solve sudoku: A step-by-step guide, The Infinite Ideas Company,Limited, Oxford, 2005.

[9] Philip Dwinger,Introduction to Boolean Algebras, Physica-Verlag, 1971.

20 Solutions

i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 9

6 2 4 8 7 1 9 5 31 9 3 4 6 5 8 7 27 5 8 3 9 2 6 1 42 1 9 6 4 3 5 8 75 8 6 7 2 9 3 4 14 3 7 1 5 8 2 6 93 4 5 2 1 6 7 9 88 6 1 9 3 7 4 2 59 7 2 5 8 4 1 3 6 i

h

g

f

e

d

c

b

a

1 2 3 4 5 6 7 8 9

2 1 5 8 7 6 9 4 36 7 8 3 9 4 2 1 53 4 9 1 2 5 8 7 65 8 7 4 3 2 1 6 94 6 3 9 8 1 7 5 21 9 2 6 5 7 3 8 48 2 6 7 4 3 5 9 17 3 4 5 1 9 6 2 89 5 1 2 6 8 4 3 7

Figure 33: Solutions to Puzzles

26

Documents

sudoku