Cipher


Pencil and Paper Systems

The most obvious way, perhaps, of taking text and concealing it from the prying eyes of those who don't know your

secret is by replacing each letter by something else, like this:

ABCDE FGH IJKL MNO PQRST UVW XYZ

----- --- ---- --- ----- --- ---

$7+Q@ ?)/ 2X3: !8J 9%6*& 15= (;4

which turns: Please send me some money

into: 9:@$*@ *@8Q !@ *J!@ !J8@;
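The table above can be applied mechanically. The following is a minimal Python sketch of that lookup; the names PLAIN, CIPHER, encipher and decipher are chosen here for illustration, and passing spaces through unchanged is an assumption taken from the example message.

```python
# Plain alphabet and the symbols written beneath it in the table above.
PLAIN = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
CIPHER = "$7+Q@?)/2X3:!8J9%6*&15=(;4"

ENCIPHER = dict(zip(PLAIN, CIPHER))
DECIPHER = dict(zip(CIPHER, PLAIN))

def encipher(text):
    # Letters are upper-cased and replaced; anything else (spaces) passes through.
    return "".join(ENCIPHER.get(ch, ch) for ch in text.upper())

def decipher(text):
    # Each cipher symbol maps back to exactly one plaintext letter.
    return "".join(DECIPHER.get(ch, ch) for ch in text)

if __name__ == "__main__":
    secret = encipher("Please send me some money")
    print(secret)
    print(decipher(secret))
```

Because every plaintext letter always receives the same symbol, the frequency pattern of the language survives intact, which is what the cryptanalysis later in this section exploits.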

This is called substitution, and ciphers based on this principle date back to ancient times.

For example, the diagram on the left illustrates several cipher alphabets

used by the ancient Hebrews. Three of them are based on arrangements of

the alphabet according to a definite pattern, and these patterns can be

illustrated in terms of the 26-letter alphabet used by the English language

by showing what the equivalent substitutions are in that alphabet:

Atbash:

A B C D E F G H I J K L M

-------------------------

Z Y X W V U T S R Q P O N

Albam:

A B C D E F G H I J K L M

-------------------------

N O P Q R S T U V W X Y Z

Atbah:

A B C D J K L M E S T U V

------- ------- - -------

I H G F R Q P O N Z Y X W

Note that all three of these are reciprocal, in that if one letter becomes

another letter, then that other letter becomes the original letter in turn.
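The reciprocal property can be checked directly. This is a small sketch in Python, with the three alphabets typed in from the tables above; the helper names (paired_alphabet, apply, is_reciprocal) are hypothetical.

```python
def paired_alphabet(top, bottom):
    """Build a mapping in which each top-row letter swaps with the one below it."""
    mapping = {}
    for a, b in zip(top, bottom):
        mapping[a] = b
        mapping[b] = a
    return mapping

ATBASH = paired_alphabet("ABCDEFGHIJKLM", "ZYXWVUTSRQPON")
ALBAM = paired_alphabet("ABCDEFGHIJKLM", "NOPQRSTUVWXYZ")
ATBAH = paired_alphabet("ABCDJKLMESTUV", "IHGFRQPONZYXW")

def apply(mapping, text):
    # Characters outside the alphabet are left alone.
    return "".join(mapping.get(ch, ch) for ch in text.upper())

def is_reciprocal(mapping):
    # Applying the substitution twice must return every letter to itself.
    return all(mapping[mapping[ch]] == ch for ch in mapping)

if __name__ == "__main__":
    for name, alphabet in (("Atbash", ATBASH), ("Albam", ALBAM), ("Atbah", ATBAH)):
        print(name, len(alphabet), is_reciprocal(alphabet), apply(alphabet, "CIPHER"))
```

A practical consequence of reciprocity is that the same procedure both enciphers and deciphers.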

The illustration also contains other information. The numerical value of

each letter is given below the name of the letter, and the original Hebrew

form of the name of the letter is also shown to the right. Also, Cryptic

Script B, an alphabet used in the writing of part of the Dead Sea Scrolls is

shown (albeit imperfectly; the symbol for Shin is only known to be used for

one of the two values of that letter, as indicated by dots, and an additional

special-purpose character is not shown.)

The other method of concealing a message is called transposition, which was also used in ancient times, at least by

the Spartans with the scytale, a baton around which a leather belt could be wound, so that a message could be

written on the belt, crossing from one loop to the next, so that it could only be read while the belt was so wound.


In transposition, instead of replacing letters with something else, the letters of a message are moved around, so that

they aren't written down in order.

Cryptanalyzing the Simple Substitution Cipher

Methods of Transposition

Improving Substitution
  o Homophones and Nomenclators
  o Polygraphic Ciphers and Fractionation
      Playfair and its Relatives
      The Bifid, the Trifid, and the Straddling Checkerboard
      Fractionated Morse, and Other Oddities
      The VIC Cipher
      Two Trigraphic Ciphers, and a Heptagraphic One
  o Polyalphabetic Substitution

Code Books

Fun With Playing Cards

Conclusions

Cryptanalyzing the Simple Substitution Cipher

This page is not complete. It is placed here now to reserve space, to allow other changes to this

section to take place.

Here is a short message, enciphered only by replacing each of its letters by a different letter on a consistent basis:

MGSVR WWJXS VPTRY SSOEF YYTMQ SVSYM MTPTR XYMGS RVRFJ

NFVGX TYFWF EIFUS AXJJQ SJSNM QPMGS TJOTF IMLSS TYSJO

SLQSL LPLTF OYSHM MRSVO FP

How would one go about trying to read it?

The first step that would occur to many people would be to make use of the fact that some letters are more common

than others in English. E is the most common letter, and letters like J, Q, X, and Z are quite rare.

And so, we count the letters in our message. This produces the following table of frequencies:

 A  E  F  G  H  I  J  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y
 1  2  9  4  1  2  7  5 10  2  5  5  4  6 18  9  1  6  3  4  8
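A count like this is tedious by hand but trivial by machine. Here is a minimal sketch using Python's collections.Counter; the function name letter_counts is made up for this illustration.

```python
from collections import Counter

CIPHERTEXT = """
MGSVR WWJXS VPTRY SSOEF YYTMQ SVSYM MTPTR XYMGS RVRFJ
NFVGX TYFWF EIFUS AXJJQ SJSNM QPMGS TJOTF IMLSS TYSJO
SLQSL LPLTF OYSHM MRSVO FP
"""

def letter_counts(text):
    # Ignore spaces and line breaks; count letters only.
    return Counter(ch for ch in text.upper() if ch.isalpha())

if __name__ == "__main__":
    counts = letter_counts(CIPHERTEXT)
    total = sum(counts.values())
    # Most frequent letters first, with their share of the message.
    for letter, n in counts.most_common():
        print(f"{letter} {n:3d} {100 * n / total:5.2f}%")
```

Printing percentages alongside the raw counts makes the comparison with a table of language frequencies more direct, although a message this short will only roughly follow those figures.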

In comparison, a frequency count I had my computer perform on a sample of literary text produced these

frequencies:

A 443747 8.03 H 331686 6.00 O 420966 7.62 V 54921 0.99

B 88298 1.60 I 382552 6.92 P 102205 1.85 W 114048 2.06

C 152187 2.75 J 7112 0.13 Q 5841 0.11 X 12081 0.22

D 225040 4.07 K 33872 0.61 R 330126 5.97 Y 95514 1.73

E 711756 12.88 L 220858 4.00 S 351389 6.36 Z 3519 0.06

F 139985 2.53 M 141726 2.56 T 514613 9.31

G 103279 1.87 N 383526 6.94 U 156536 2.83


Arranged in order of frequency, for clarity, they become:

E 12.88 H 6.00 F 2.53 K 0.61

T 9.31 R 5.97 W 2.06 X 0.22

A 8.03 D 4.07 G 1.87 J 0.13

O 7.62 L 4.00 P 1.85 Q 0.11

N 6.94 U 2.83 Y 1.73 Z 0.06

I 6.92 C 2.75 B 1.60

S 6.36 M 2.56 V 0.99

Comparing these frequencies to those of the message:

18: S 7: J 3: W

10: M 6: R V 2: E I N

9: F T 5: L O P 1: A H U

8: Y 4: G Q X

it might be tempting to start by aligning like frequencies wherever possible:

Cipher: S M Y J W

---------

Plain: e t n i f

to begin deciphering the message like this:

MGSVR WWJXS VPTRY SSOEF YYTMQ SVSYM MTPTR XYMGS RVRFJ
t-e-- ffi-e ----n ee--- nn-t- e-ent t---- -nt-e ----i

NFVGX TYFWF EIFUS AXJJQ SJSNM QPMGS TJOTF IMLSS TYSJO
----- -n-f- ----e --ii- eie-t --t-e -i--- -t-ee -nei-

SLQSL LPLTF OYSHM MRSVO FP
e--e- ----- -ne-t t-e-- --

Here, it looks like we've been luckier than we have a right to expect. With frequencies of 6.94 and 6.92 for N and I

respectively, it isn't hard to imagine that I might be more common than N, instead of N being more common than I,

in the text of a particular message.

The combination t-e occurs three times from MGS, and once each from MQS, MLS, and MRS, so it seems reasonable to think that G stands for h. e-ent might be event, and -ffi-e might be office, although it is actually hard to take seriously that W necessarily stands for f.

To make a good start on breaking a simple substitution, however, single-letter frequencies are not enough. They

might work for picking out the letters E and T in most cases, but more information is available that can serve as a

better guide.

We've seen that N and I have frequencies of 6.94 and 6.92 respectively. This is a very small difference. But one is a

consonant, and the other is a vowel. So we might expect them to behave differently. And they do.


Methods of Transposition

After looking at ciphers which can replace the letters of one's message by completely different letters, a cipher that

can't change any letters at all seems weak.

And, if your message might mention, or might not mention, someone with, say, a Q or an X in his name, then a

transposition cipher will indeed give that away, although one could solve that by adding some garbage to the end of

your message before encrypting it. But transposition ciphers can be secure in themselves, and as well, transposition

methods are useful to know, since they can be mixed with substitution methods for a more secure cipher.

The best known method of transposition, simple columnar transposition, works like this:

Using a key word or phrase, such as CONVENIENCE, assign a number to each letter in the word using this rule: the

numbers are assigned starting with 1, and they are assigned first by alphabetical order, and second, where the same

letter appears twice, by position in the word.

Then, write in the message under the keyword, writing across - but take out columns, in numerical order, reading

down. Thus:

 C  O  N  V  E  N  I  E  N  C  E
 1 10  7 11  3  8  6  4  9  2  5
---------------------------------
 H  E  R  E  I  S  A  S  E  C  R
 E  T  M  E  S  S  A  G  E  E  N
 C  I  P  H  E  R  E  D  B  Y  T
 R  A  N  S  P  O  S  I  T  I  O
 N

produces

HECRN CEYI ISEP SGDI RNTO AAES RMPN

SSRO EEBT ETIA EEHS

Of course, it wouldn't be left with the spaces showing the columns that were used.
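The numbering rule and the column read-out can be sketched in a few lines of Python. The function names key_order and columnar_encrypt are invented for this example, and stripping everything but letters is an assumption.

```python
def key_order(keyword):
    """Number the key letters 1..n alphabetically, equal letters left to right."""
    ranked = sorted(range(len(keyword)), key=lambda i: (keyword[i], i))
    numbers = [0] * len(keyword)
    for rank, pos in enumerate(ranked, start=1):
        numbers[pos] = rank
    return numbers

def columnar_encrypt(plaintext, keyword):
    """Return the columns in the order they are taken out."""
    letters = [ch for ch in plaintext.upper() if ch.isalpha()]
    width = len(keyword)
    numbers = key_order(keyword)
    columns = []
    for rank in range(1, width + 1):
        col = numbers.index(rank)
        # Every width-th letter, starting at this column, reads down the column.
        columns.append("".join(letters[col::width]))
    return columns

if __name__ == "__main__":
    print(key_order("CONVENIENCE"))
    print(" ".join(columnar_encrypt(
        "Here is a secret message enciphered by transposition", "CONVENIENCE")))
```

Returning the columns as a list keeps the structure visible for study; joining them without spaces gives the message as it would actually be sent.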

Decoding is harder - to read a message scrambled this way, you first have to count the letters to determine, in this

case, that there are 45 letters, and so the first column will have five letters in it, and the other ones four, so that you

know when to stop when filling the letters in vertically to read them out horizontally.
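The counting step can be made explicit. In this Python sketch (function names are hypothetical, and key_order is repeated so the block stands alone), the message length is divided by the key length to find how many columns are one letter longer than the rest.

```python
def key_order(keyword):
    """Number the key letters 1..n alphabetically, equal letters left to right."""
    ranked = sorted(range(len(keyword)), key=lambda i: (keyword[i], i))
    numbers = [0] * len(keyword)
    for rank, pos in enumerate(ranked, start=1):
        numbers[pos] = rank
    return numbers

def columnar_decrypt(ciphertext, keyword):
    letters = [ch for ch in ciphertext.upper() if ch.isalpha()]
    width = len(keyword)
    # 45 letters under an 11-letter key: 4 full rows, 1 letter left over.
    full_rows, extra = divmod(len(letters), width)
    numbers = key_order(keyword)
    columns = [None] * width
    start = 0
    for rank in range(1, width + 1):
        col = numbers.index(rank)
        # The leftmost `extra` columns each hold one more letter.
        height = full_rows + (1 if col < extra else 0)
        columns[col] = letters[start:start + height]
        start += height
    # Read the refilled grid across, row by row.
    out = []
    for r in range(full_rows + (1 if extra else 0)):
        for col in range(width):
            if r < len(columns[col]):
                out.append(columns[col][r])
    return "".join(out)

if __name__ == "__main__":
    print(columnar_decrypt(
        "HECRN CEYI ISEP SGDI RNTO AAES RMPN SSRO EEBT ETIA EEHS", "CONVENIENCE"))
```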

Since the text being transposed is split into nearly regular divisions of almost equal length, even double columnar

transposition can be broken without recourse to multiple anagramming: the use of several messages of the same

length, enciphered in the same key, to recover the transposition by matching together columns of letters that form

reasonable letter pairs.

Another method of transposition, which appeared in a book by General Luigi Sacco, is a variant of columnar

transposition that produces a different cipher:

 C  O  N  V  E  N  I  E  N  C  E
 1 10  7 11  3  8  6  4  9  2  5
---------------------------------
 H
 E  R  E  I  S  A  S  E  C  R
 E  T  M  E  S
 S  A  G  E  E  N  C  I
 P  H  E  R  E  D  B  Y  T  R  A
 N  S  P  O  S  I  T
 I  O  N

produces

HEESPNI RR SSEES EIY A SCBT

EMGEPN ANDI CT RTAHSO IEERO


Here, the first row is filled in only up to the column with the key number 1; the second row is filled in only up to the column with the key number 2; and so on. Of course, one still stops when one runs out of plaintext; in this example, the plaintext happens to run out exactly at the end of the seventh row, under the key number 7. This method has the advantage of dividing the text being transposed in a more irregular fashion than ordinary columnar transposition.
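The row-filling rule of this variant can be sketched as follows in Python. The function names are hypothetical, and one detail is an assumption not covered by the description: if the message needs more rows than the key has letters, this sketch simply starts again from key number 1.

```python
def key_order(keyword):
    """Number the key letters 1..n alphabetically, equal letters left to right."""
    ranked = sorted(range(len(keyword)), key=lambda i: (keyword[i], i))
    numbers = [0] * len(keyword)
    for rank, pos in enumerate(ranked, start=1):
        numbers[pos] = rank
    return numbers

def sacco_rows(plaintext, keyword):
    letters = [ch for ch in plaintext.upper() if ch.isalpha()]
    numbers = key_order(keyword)
    width = len(keyword)
    rows, start, rank = [], 0, 1
    while start < len(letters):
        # Row `rank` runs up to and including the column keyed `rank`.
        length = numbers.index(rank) + 1
        rows.append(letters[start:start + length])
        start += length
        rank = rank % width + 1   # assumption: wrap around after the last key number
    return rows

def sacco_encrypt(plaintext, keyword):
    numbers = key_order(keyword)
    rows = sacco_rows(plaintext, keyword)
    columns = []
    for rank in range(1, len(keyword) + 1):
        col = numbers.index(rank)
        # Rows that stop short of this column contribute nothing to it.
        columns.append("".join(row[col] for row in rows if col < len(row)))
    return columns

if __name__ == "__main__":
    print(" ".join(sacco_encrypt(
        "Here is a secret message enciphered by transposition", "CONVENIENCE")))
```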

Various methods of modifying ordinary columnar transposition slightly to make it irregular have been used from

time to time.

For example, during World War I, the French army used a transposition in which diagonal lines of letters were read

off before the rest of the diagram. Also, several countries have used columnar transpositions in which several

positions in the grid were blanked out and not used.

The method of transposition used on the digits produced by a straddling checkerboard in the VIC cipher can be

illustrated here with the alphabet; first, knowing the number of letters to be encrypted, the area to be filled is laid

out, and then the triangular areas on the right to be filled with plaintext last are marked out:

2 4 3 1 5

-----------

a b c d e

f g h i U

j k l V W

m n X Y Z

o p q r s

t

Here, reading the columns off in key order, the alphabet becomes DIVYR AFJMOT CHLXQ BGKNP EUWZS

Another interesting form of transposition is the "turning grille", used by

Germany during the First World War.

A square grille, divided into a grid of squares, one-quarter of which are punched with holes, is placed over a sheet of

paper. The message is written on the paper through the holes, and then the grille is rotated by 90 degrees, and then

the message continues to be written, as the grille is turned through all four possible positions.

The trick to designing such a grille is to divide the grille into quarters, numbering the squares in each quarter so that

as the grille is rotated, corresponding squares have the same number. Then, choose one square with each number for

punching.

In World War I, the Germans used turning grilles with an odd number of squares on each side as well as with an even number; to do this, they marked the square in the centre, which was always punched, with a black border to indicate it was only to be used in one of the grille's positions.

Example of a turning grille and its use:

Grid numbering:

 1  2  3  4  5 16 11  6  1
 6  7  8  9 10 17 12  7  2
11 12 13 14 15 18 13  8  3
16 17 18 19 20 19 14  9  4
 5 10 15 20  X 20 15 10  5
 4  9 14 19 20 19 18 17 16
 3  8 13 18 15 14 13 12 11
 2  7 12 17 10  9  8  7  6
 1  6 11 16  5  4  3  2  1


Layout:

O - - O - O - - -      1  4 16
- - O - - - - - O      8  2
- O - - - - O - -     12 13
- - O - O - - O -     18 20  9
- - - - X - O - O      X 15  5
- - - O - - - O -     19 17
O - - - - O - - -      3 14
- O - - O - - - O      7 10  6
- - O - - - - - -     11

Filling-in:

In each stage, the paper is shown on the left (capitals are the letters just written, dots are squares still empty) and the grille in its current position on the right.

first position

T . . H . I . . .     O - - O - O - - -
. . S . . . . . I     - - O - - - - - O
. S . . . . A . .     - O - - - - O - -
. . M . E . . S .     - - O - O - - O -
. . . . S . A . G     - - - - O - O - O
. . . E . . . T .     - - - O - - - O -
H . . . . A . . .     O - - - - O - - -
. T . . I . . . A     - O - - O - - - O
. . M . . . . . .     - - O - - - - - -

(this is a message that I am)

second position

t E . h N i . C .     - O - - O - - O -
. . s R . Y . . i     - - - O - O - - -
. s P . T . a . .     - - O - O - - - -
I . m . e . N s .     O - - - - - O - -
. . . G s . a W g     - - - O . - - O -
I . . e . T . t .     O - - - - O - - -
h H . A . a . . T     - O - O - - - - O
. t U . i . . R a     - - O - - - - O -
N . m . . . I . .     O - - - - - O - -

(encrypting with a turni)

third position

t e . h n i N c .     - - - - - - O - -
G . s r G y . R i     O - - - O - - O -
. s p I t . a . L     - - - O - - - - O
i L m . e E n s .     - O - - - O - - -
T . O g s . a w g     O - O - . - - - -
i P . e R t O t .     - O - - O - O - -
h h V a . a . I t     - - O - - - - O -
D t u . i . E r a     O - - - - - O - -
n . m T . H i . I     - - - O - O - - O

(ng grille to provide thi)

fourth position

t e S h n i n c I     - - O - - - - - O
g L s r g y L r i     - O - - - - O - -
U s p i t S a T l     O - - - - O - O -
i l m R e e n s A     - - - O - - - - O
t T o g s I a w g     - O - - . O - - -
i p V e r t o t E     - - O - - - - - O
h h v a E a X i t     - - - - O - O - -
d t u A i M e r a     - - - O - O - - -
n P m t L h i E i     - O - - O - - O -

(s illustrative example)

to produce the encrypted result:

TESHN INCIG LSRGY LRIUS PITSA TLILM

REENS ATTOG SIAWG IPVER TOTEH HVAEA

XITDT UAIME RANPM TLHIE I
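The procedure of the worked example can be sketched in Python as below. The hole positions are copied from the layout above; the direction of rotation (a quarter-turn counterclockwise each time) is inferred from the second-position diagram rather than stated in the description, and the names HOLES, rotate and grille_encrypt are invented here.

```python
SIZE = 9
CENTRE = (SIZE // 2, SIZE // 2)

# Punched squares in the first position, as (row, column) counted from 0.
HOLES = [
    (0, 0), (0, 3), (0, 5),
    (1, 2), (1, 8),
    (2, 1), (2, 6),
    (3, 2), (3, 4), (3, 7),
    (4, 4), (4, 6), (4, 8),
    (5, 3), (5, 7),
    (6, 0), (6, 5),
    (7, 1), (7, 4), (7, 8),
    (8, 2),
]

def rotate(holes):
    """Quarter-turn counterclockwise (assumed direction, inferred from the example)."""
    return [(SIZE - 1 - c, r) for r, c in holes]

def grille_encrypt(plaintext):
    letters = [ch for ch in plaintext.upper() if ch.isalpha()]
    if len(letters) != SIZE * SIZE:
        raise ValueError("this sketch expects exactly 81 letters")
    grid = [[None] * SIZE for _ in range(SIZE)]
    feed = iter(letters)
    holes = HOLES
    for position in range(4):
        # Write through the holes left to right, top to bottom.
        for r, c in sorted(holes):
            if (r, c) == CENTRE and position > 0:
                continue   # the centre square is used in the first position only
            grid[r][c] = next(feed)
        holes = rotate(holes)
    # With the grille removed, read the paper row by row.
    return "".join("".join(row) for row in grid)

if __name__ == "__main__":
    text = ("this is a message that I am encrypting with a turning grille "
            "to provide this illustrative example")
    result = grille_encrypt(text)
    print(" ".join(result[i:i + 5] for i in range(0, len(result), 5)))
```

A grille is valid only if the four positions between them uncover every square exactly once; the numbering scheme described earlier guarantees this, and the sketch would leave a None in the grid (and fail on the final join) if it did not hold.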

There are two important uses of transposition which are connected with substitution ciphers.

Transposition can be used to generate a scrambled order of the letters in the alphabet for use as a substitution

alphabet.

Transposition forms part of a fractionation cipher, where letters are divided into parts, then the parts are put back

together in a different order, belonging to different letters.


Improving Substitution

A cipher based on the use of a secret alphabet is not very secure; such ciphers are presented as puzzles in crossword

puzzle magazines. To achieve security, something better is required.

Today, even to people not acquainted with cryptography, a number of possibilities suggest themselves. Originally,

though, the new ideas came one at a time, separated by hundreds or thousands of years.

The basic ways to improve on simple substitution are the following:

Instead of using just 26 substitutes, make the problem harder by using a bigger substitution. This is divided into several cases:

o Use several substitutes for each letter (homophonic substitution)

o Replace every two letters, or every three letters, by something else that stands for that combination of two letters or three letters (polygraphic substitution)

o Replace common combinations of letters, or words, or phrases, by their own substitutes (nomenclators and codes)

Instead of using the same set of substitutes all the time, change from one secret alphabet to another as you encipher a message (polyalphabetic substitution).

Another way of improving on simple substitution is less obvious. Today, text is often converted from the letters,

punctuation marks, digits, and other symbols you find on a typewriter to the binary bits of ASCII. Before that, other

representations of text were used to substitute for the printed word, such as Morse code. The ancient Greeks used the

Polybius square for signalling, by means of which each letter was represented by two groups of from one to five

signal fires.

If a letter can be broken up into smaller pieces for purposes of signalling, those smaller pieces can also be used in a

cipher. For example, one can take the letters of a message apart into smaller pieces, transpose the smaller pieces, and

then put the pieces back together again into letters.

This is called fractionation, and is closely related to polygraphic substitution for two reasons; one is that both deal

with different sized units - parts of letters and letters, or letters and pairs of letters - and the other is that fractionation

is sometimes used as a method of polygraphic substitution.

Homophones and Nomenclators

Polygraphic Ciphers and Fractionation
  o Playfair and its Relatives
  o The Bifid, the Trifid, and the Straddling Checkerboard
  o Fractionated Morse, and Other Oddities
  o The VIC Cipher
  o Two Trigraphic Ciphers, and a Heptagraphic One

Polyalphabetic Substitution

Homophones and Nomenclators

One of the earliest methods used to create ciphers stronger than simple substitution was to create cipher tables which

had more than one substitute for each letter, and which had additional substitutes for names that would be

commonly used. Because of the significance given to proper names, these systems were called nomenclators.


Some of the early nomenclators were fairly unsophisticated; the substitutes for the letter B might be the letter M or

the digit 4, written in several distinctive styles - and then the substitutes for C might be the letter N or the digit 5,

again written in distinctive styles. Thus, a cryptanalyst willing to try a simple guess would only need to solve a

Caesar cipher - a simple substitution where the alphabet is merely displaced instead of being thoroughly scrambled -

instead of facing the full problem of finding substitutes for the full set of symbols individually.

One ingenious modern method of producing a homophonic cipher, called the Grandpré cipher, involves choosing

ten ten-letter words, which can be ordered so that their first letters form an eleventh ten-letter word, and which

collectively include all 26 letters of the alphabet.

For example (although this particular set of words happens to lack H, P, and W):

0 1 2 3 4 5 6 7 8 9

0 S T R A T I F I E D

1 U N D E R S T O O D

2 B A R K E N T I N E

3 M A J O R I T I E S

4 A S T R O L O G E R

5 R E E X A M I N E D

6 I N V E S T M E N T

7 N E G A T I V E L Y

8 E F F E R V E S C E

9 S Q U E E Z I N G S

The advantage it has, over a more routine type of homophonic table, for example:

          0,3,8  4,7   9    1    5    2    6
1,2,7,8     E     T    A    O    I    N    S
0,4,5       H     R    D    L    U    B    C
3,9         M     F    G    J    K    P    Q
6           V     W    X    Y    Z

is that the multiple substitutes for each letter are not closely related.
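To see how the word square yields homophones, here is a minimal Python sketch. It assumes each substitute is written as row digit followed by column digit (the order is not stated above), and the names build_homophones, encipher and decipher are hypothetical. It also lists any letters the square cannot encipher.

```python
import random

WORDS = [
    "STRATIFIED", "UNDERSTOOD", "BARKENTINE", "MAJORITIES", "ASTROLOGER",
    "REEXAMINED", "INVESTMENT", "NEGATIVELY", "EFFERVESCE", "SQUEEZINGS",
]

def build_homophones(words):
    """Map each letter to every row/column digit pair where it occurs."""
    table = {}
    for row, word in enumerate(words):
        for col, letter in enumerate(word):
            table.setdefault(letter, []).append(f"{row}{col}")
    return table

HOMOPHONES = build_homophones(WORDS)

def encipher(text, rng=random):
    out = []
    for ch in text.upper():
        if not ch.isalpha():
            continue
        if ch not in HOMOPHONES:
            raise ValueError(f"no substitute for {ch!r} in this square")
        # Pick one of the available substitutes at random.
        out.append(rng.choice(HOMOPHONES[ch]))
    return " ".join(out)

def decipher(groups):
    return "".join(WORDS[int(g[0])][int(g[1])] for g in groups.split())

if __name__ == "__main__":
    print("".join(word[0] for word in WORDS))            # the eleventh word
    print(sorted(set("ABCDEFGHIJKLMNOPQRSTUVWXYZ") - set(HOMOPHONES)))
    secret = encipher("secret message")
    print(secret, "->", decipher(secret))
```

Choosing the substitute with a random draw, rather than in rotation, matters for the reason given further on: predictable use of homophones is the main weakness of such systems.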

The book The American Black Chamber, by Herbert Osborn Yardley, illustrated a cipher wheel used by the

Mexican Army which could be set up to produce a homophonic cipher with a key that could be easily changed.

Changed from a wheel to a slide, it would look like this:

 A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
15 16 17 18 19 20 21 22 23 24 25 26 01 02 03 04 05 06 07 08 09 10 11 12 13 14
43 44 45 46 47 48 49 50 51 52 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42
61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 53 54 55 56 57 58 59 60
92 93 94 95 96 97 98 99 00             79 80 81 82 83 84 85 86 87 88 89 90 91

having four movable disks, one containing the two-digit pairs from 01 to 26, the second the pairs from 27 to 52, the

third the pairs from 53 to 78, the fourth the pairs from 79 to 99, followed by 00 and four blank, unused spaces. The

key consisted of the four two-digit pairs aligned under the letter A, and the possible substitutes for any letter were

the four (or possibly three) two-digit pairs aligned under it. Obviously, the system would have been more secure had

the alphabet and the sequence of digit pairs been mixed.
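The slide can be modelled as four rotating lists. In this Python sketch the names DISKS, homophones and letter_for are invented for illustration; the key is given as the four pairs standing under A, as described above.

```python
import string

ALPHABET = string.ascii_uppercase

# Each disk has 26 positions; the fourth ends with 00 and four blanks.
DISKS = [
    [f"{n:02d}" for n in range(1, 27)],
    [f"{n:02d}" for n in range(27, 53)],
    [f"{n:02d}" for n in range(53, 79)],
    [f"{n:02d}" for n in range(79, 100)] + ["00"] + [None] * 4,
]

def homophones(letter, key):
    """Digit pairs aligned under `letter` when the pairs in `key` sit under A."""
    offset = ALPHABET.index(letter.upper())
    result = []
    for disk, under_a in zip(DISKS, key):
        pair = disk[(disk.index(under_a) + offset) % 26]
        if pair is not None:          # a blank space gives no substitute
            result.append(pair)
    return result

def letter_for(pair, key):
    """Recover the plaintext letter standing above a given digit pair."""
    for disk, under_a in zip(DISKS, key):
        if pair in disk:
            return ALPHABET[(disk.index(pair) - disk.index(under_a)) % 26]
    raise ValueError(f"unknown pair {pair!r}")

if __name__ == "__main__":
    key = ("15", "43", "61", "92")    # the setting shown in the slide above
    for letter in "AJNZ":
        print(letter, homophones(letter, key))
```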

The most important weakness of a homophonic system is that the person using it can become lazy, and use the same

substitute for a letter over and over, or use the substitutes in rotation, rather than using them randomly.


Also, as many homophonic systems are devised by amateurs, they can have defects of one kind or another. Helen

Fouché Gaines in Elementary Cryptanalysis notes that Givierge, author of the Cours de Cryptographie, described a

homophonic system of the following kind:

This is a type of straddling checkerboard, and we will meet a more elegant form of it later in

the section on fractionation. The word straddling refers to the fact that while most letters

have a two-letter group as their substitute, consisting of the letters indicating their row and

column (which may, incidentally, be taken in either order, as the alphabet has been split in

half for this purpose), five less-frequent letters represent each other. Thus, the presence of

occasional one-letter symbols is intended to complicate the problem for the cryptanalyst,

making it difficult for him to find out where the letter pairs that make up most of the message

begin and end.

Although this cipher has many nice features, it does have a number of defects. Since the

letters that have only one letter as their substitute are, essentially, in a separate table, why use only a 25-letter

alphabet? Of course, in French, the letter W is so little used as almost not to be part of the alphabet. But there are

other defects.

Although a group can begin with a letter from either half of the alphabet, the second letter always

has to be from the other half.

Also, the second letter of a two-letter group can't be one of the five letters that represent each other, although since the first letter already indicates that there is a two-letter group, that would not cause confusion.

Hence, this cipher omits a large number of two-letter substitutes which it could be making use of. An improved

design could be the following:

Here, six mid-frequency letters have single-letter substitutes, but these

substitutes are drawn from other letters in the alphabet.

The rest of the alphabet is divided into two halves, but once a letter is chosen

to indicate either a row or a column, the other co-ordinate of the plaintext

letter is chosen from a set made from the entire alphabet. Hence, if a letter on

the left begins a two-letter group, it is ended with a letter below; if a letter on

the top begins a two-letter group, it is ended with a letter on the right.

Thus, the plaintext letter R can become ED, EO, OO, IQ, ZQ, or II.

As noted previously, the basic concepts of cryptography were slow to

emerge. David Kahn's book The Codebreakers illustrates the earliest known

example of a cipher with homophones, from the year 1401. It looked like this:

a b c d e f g h i k l m n o p q r s t u x y z

---------------------------------------------

Z y x D t s r q p o n l m k j h g f e d c b a

2 4 8 F

3 H 9 T

+ J L ~

where the capital letters stand for various special symbols (Z indicates a reverse script lowercase z, F indicates an ff

ligature, and J indicates the astrological symbol for Jupiter, for example).

(Diagram for the improved design described above, with its six single-letter substitutes:)

b|M   f|T   g|N   p|R   v|P   y|Q

      E J G F D
      O K U H S
      V W
     -----------
IZ  | r q l m x |  DFOQTX
AL  | e u w h n |  CJLNPWY
BY  | c i k z a |  AEGKUV
CX  | o j t s d |  BHIMRSZ
     -----------
      E A C F G
      I B H J L
      P D O K M
      Q T R N V
      S U Z X W
      Y

(Diagram for the system described by Givierge:)

      E J G F D
      O K M H S
      P R W
     -----------
IT  | a b c d e
AL  | f g h i j
BQ  | k l m n o
CN  | p q r s t
      u v x y z
     -----------
      V X Y Z U


To modern eyes, what is particularly striking about this cipher is that, even though the step of improving on simple

substitution by using multiple equivalents was taken, the basic cipher alphabet itself is not thoroughly mixed, but

instead varies only slightly from a simple reversed alphabet.

Incidentally, the British publisher Hodder and Stoughton has an extensive series of books on various subjects called "Teach Yourself Books": particularly noteworthy in this series are the instructional books for foreign languages, which in the case of some languages are the only readily available introductory books in print in English. (These are the books that used to have yellow covers, but which changed to light blue covers some years back.) The book Codes and Ciphers by Frank Higenbottam in this series, while a general introduction to the subject, is distinguished by its uniquely extensive coverage of the topic of breaking messages enciphered using nomenclators.

Polygraphic Ciphers and Fractionation

Instead of arbitrarily choosing a list of common words or syllables to give cipher equivalents for, one might be able

to achieve the same increased security another way, by enciphering several letters at once using some simple system

that handles all possible combinations of two letters, or three letters.

Of course, one could just use a random list of all 676 possible combinations of two letters, and this would give the

maximum possible security for a system that handles two letters at a time.

Or, one could even follow the lead of Giovanni Battista della Porta, and use a table giving different symbols for

every pair of letters:

This is a redrafting of the table of 400 symbols for the digraphs of a 20-character alphabet given by Porta in his De Furtivis Literarum Notis. In the original, there are a few typographical errors, leading to some duplicate symbols:


ZI is a duplicate of ZO, VM is a duplicate of LL, NG is a duplicate of NB. The replacements for the first two were

obvious, that for NG is somewhat arbitrary.

Naturally, since Porta was expressing the idea of a digraphic cipher in print for the first time, he did so in a way that

seems unsophisticated by modern standards.

The columns in Porta's diagram all contain characters related in shape. This makes it easier to look up a symbol, but

also gives away information. One way to retain the advantage of easily finding a symbol, but without giving away

information to the cryptanalyst, is illustrated in the diagram above: have similar symbols arranged along the

diagonals of the diagram, and use mixed alphabets along the edges. While a digraphic symbol cipher is something

that isn't too practical, similar techniques have been used for small code charts to make them practical and secure.

Systematic methods of enciphering several letters at once, without simply using a very large table, will be outlined

in what follows. Fractionation lends itself to many complicated and bizarre developments, a few of which will be

illustrated there.

Hopefully, all the examples that will be contained in the following pages will prove a starting point from which you

can let your imagination run wild.

Fractionation, although a powerful technique, has seldom been used in paper-and-pencil ciphers, because it is too

complicated and prone to error. Two schemes that actually were used, the ADFGX or ADFGVX cipher used by

Germany in the First World War, and the VIC cipher used by Reino Hayhanen while engaged in espionage in the

United States, involved substituting multiple symbols for each letter, and transposing the letters or digits so

obtained, but did not attempt to then reconstitute the symbols back into letters.

Representing letters by five symbols from a set of two, or three symbols from a set of three, has tended to be used

mostly for steganography, as proposed by Bacon and Trithemius. (That is, in the former case, if one does not count

the use of the 5-level code for teletypewriters.)


The Hagelin B-21 and its relatives also involved fractionation, combined with polyalphabeticity but without

transposition.

Playfair and its Relatives

The Bifid, the Trifid, and the Straddling Checkerboard

Fractionated Morse, and Other Oddities

The VIC Cipher

Two Trigraphic Ciphers, and a Heptagraphic One

A Table of Powers, useful in finding ideas or opportunities to perform fractionation.

Playfair and its Relatives

Since 26 by 26 tables are awkward and bulky, and certainly impossible to memorize, various systematic methods

were developed to encipher more than one letter at a time.

Playfair

The most famous polygraphic system is, of course, the Playfair cipher, which works as follows: given a 5 by 5 square, containing a jumbled alphabet, such as:

T X V H R
L K M U P
N Z O J E
C G W Y A
F B S D I

doing without one letter by some rule: e.g. if Q is omitted, as here, use KW to stand for QU; or treat I and J, or U and V, as one letter.

Then, a pair of letters is converted to a ciphertext pair using one of three possible rules, whichever one applies:

if the two letters are neither in the same row nor in the same column, replace each letter by the one that is in its own row, but in the column of the other plaintext letter. Examples: TI becomes RF, TW becomes VC, KA becomes PG, UB becomes KD, WX becomes GV

T------>R      T-->V H R
| K M U |      | K | U P
| Z O J |      | Z | J E
| G W Y |      C<--W Y A
F<------I      F B S D I


if the two letters are both in the same column, replace each one by the one below it, wrapping around if necessary.

Examples: VW becomes MS, TN becomes LC, TL becomes LN, TF becomes LT, KB becomes ZX

T X V H R

L K M U P

N Z | J E

C G W Y A

F B S D I

if the two letters are both in the same row, replace each one by the one to the right of it, wrapping around if necessary. Examples: TH becomes XR, KP becomes ML, NZ becomes ZO.

T X---H R

L K M U P

N Z O J E

C G W Y A

F B S D I

Double letters aren't allowed within a single digraph, and must be split up by inserting a letter used as a null (for

example, an X) between them.

If the Playfair cipher is used on a computer, perhaps in combination with other ciphers, it might be more convenient

to make a rule for double letters, such as using the letter that is both below and to the right of the plaintext letter, and

also doubling it. Then, EE would become CC.
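The three rules, and the optional rule for double letters, can be sketched in Python using the square whose rows read TXVHR, LKMUP, NZOJE, CGWYA, FBSDI. The function names are invented for this illustration; splitting a message into pairs and inserting nulls is left out.

```python
SQUARE = ["TXVHR", "LKMUP", "NZOJE", "CGWYA", "FBSDI"]
POSITION = {ch: (r, c) for r, row in enumerate(SQUARE) for c, ch in enumerate(row)}

def transform_pair(a, b, step=1):
    """Apply the Playfair rules to one digraph; step=1 enciphers, step=-1 deciphers."""
    if a == b:
        raise ValueError("double letters are not allowed within a digraph")
    (ra, ca), (rb, cb) = POSITION[a], POSITION[b]
    if ra == rb:
        # Same row: move each letter one place along the row, wrapping around.
        return SQUARE[ra][(ca + step) % 5] + SQUARE[rb][(cb + step) % 5]
    if ca == cb:
        # Same column: move each letter one place along the column, wrapping around.
        return SQUARE[(ra + step) % 5][ca] + SQUARE[(rb + step) % 5][cb]
    # Otherwise: keep the row, take the column of the other letter.
    return SQUARE[ra][cb] + SQUARE[rb][ca]

def encipher_double(a):
    """Optional computer-oriented rule: the letter below and to the right, doubled."""
    r, c = POSITION[a]
    return SQUARE[(r + 1) % 5][(c + 1) % 5] * 2

if __name__ == "__main__":
    for pair in ("TI", "VW", "TH"):
        enciphered = transform_pair(*pair)
        print(pair, "->", enciphered, "->", transform_pair(*enciphered, step=-1))
    print("EE ->", encipher_double("E"))
```

Note that the rectangle rule is its own inverse, while the row and column rules are undone by stepping left or up instead of right or down.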


The Four-Square and Two-Square Ciphers

Playfair has inspired some related bigraphic ciphers that, on the one hand, improve security by involving multiple,

unrelated alphabets, but on the other hand, are simpler in that they use fewer rules than Playfair.

In the Four-Square cipher, two squares are used to find the two plaintext letters, and two others are used to find the

two ciphertext letters:

D W X Y M | E P T O L D W X Y M | E P T O L

R J E K I | C V I Y Z R J E K I | C V I Y Z

U V H P S | R M A G B U V H---------->M A G B

A L B Z N | F W J H S A L | Z N | F | J H S

G C O F T | U N D X K G C | F T | U | D X K

-----------|----------- ----|------|----|------

J T B U E | V I M A G J T | U E | V | M A G

Z H N D X | S W P O H Z H N


The Bifid, the Trifid, and the Straddling Checkerboard

The Bifid and Trifid ciphers

The first of the three rules for Playfair encipherment changes one two-letter group, or digraph, to another by

exchanging column co-ordinates. This suggests using row and column co-ordinates in a more general fashion. Let's

take the 5 by 5 square above, but number the rows and the columns, like this:

1 2 3 4 5

---------

1) T X V H R

2) L K M U P

3) N Z O J E

4) C G W Y A

5) F B S D I

Then, another method of encipherment would be as follows: Divide a message into groups of letters of a fixed

length, say five letters, and write the row and then the column co-ordinate of each letter beneath it, like this:

THISI SMYSE CRETM ESSAG E

11555 52453 41312 35544 3

14535 33435 15513 53352 5

and then, going across within each group, read the numbers in order, and turn them, in pairs, into letters: that is, read

11555 14535 52453 33435... and turn them into the letters corresponding to 11, 55, 51, 45, 35, 52, and so on.

11 55 51 45 35   52 45 33 34 35   41 31 21 55 13   35 54 45 33 52   35
T  I  F  A  E    B  A  O  J  E    C  N  L  I  V    E  D  A  O  B    E

This is the Bifid cipher of Delastelle, and the general principle of this form of cipher is called seriation. This is one

of the most secure pencil-and-paper ciphers that is still used by hobbyists as a puzzle. It isn't hard to make this kind

of cipher just a little bit more complicated, and thereby obtain one that is genuinely secure. It belongs to the class of

cipher methods known as fractionation, where letters are divided into smaller pieces, or "fractions". Just as two

symbols from 1 to 5 give 25 letters, three symbols from 1 to 3 give 27 letters; and five binary bits provide a 32-

character alphabet.
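The Bifid procedure described above can be sketched in Python as follows, using the numbered square and groups of five. The function names are hypothetical, and letters not in the square (here Q) are simply dropped, which is an assumption of this sketch.

```python
SQUARE = ["TXVHR", "LKMUP", "NZOJE", "CGWYA", "FBSDI"]
# Row and column co-ordinates, each counted from 1 as in the numbered square.
COORDS = {ch: (r + 1, c + 1) for r, row in enumerate(SQUARE) for c, ch in enumerate(row)}

def bifid_encrypt(plaintext, period=5):
    letters = [ch for ch in plaintext.upper() if ch in COORDS]
    out = []
    for start in range(0, len(letters), period):
        group = letters[start:start + period]
        # All the row digits of the group, followed by all its column digits.
        stream = [COORDS[ch][0] for ch in group] + [COORDS[ch][1] for ch in group]
        # Take the digits in pairs and turn each pair back into a letter.
        for i in range(0, len(stream), 2):
            out.append(SQUARE[stream[i] - 1][stream[i + 1] - 1])
    return "".join(out)

def bifid_decrypt(ciphertext, period=5):
    letters = [ch for ch in ciphertext.upper() if ch in COORDS]
    out = []
    for start in range(0, len(letters), period):
        group = letters[start:start + period]
        stream = [digit for ch in group for digit in COORDS[ch]]
        half = len(group)
        # The first half of the digits are rows, the second half columns.
        for r, c in zip(stream[:half], stream[half:]):
            out.append(SQUARE[r - 1][c - 1])
    return "".join(out)

if __name__ == "__main__":
    secret = bifid_encrypt("THIS IS MY SECRET MESSAGE")
    print(secret)
    print(bifid_decrypt(secret))
```

The group length acts as a second, minor key element: each ciphertext letter depends on two plaintext letters within its group, and changing the period changes which two.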

The Trifid, also due to Delastelle, is the analogous cipher using a 27-letter alphabet represented by three symbols

from 1 to 3:

W 111 M 121 Z 131 N 211 O 221 L 231 C 311 T 321 U 331

A 112 & 122 Y 132 E 212 V 222 P 232 X 312 J 322 G 332

K 113 B 123 H 133 Q 213 R 223 S 233 I 313 F 323 D 333

to encipher a message by seriation like this:

THISISM YSECRET MESSAGE

3132321 1223223 1222132

2313132 3311212 2133131

1333331 2321321 1233222


which again is read off horizontally after being written in vertically, yielding a cipher message like this:

313 232 123 131 321 333 331

I P B Z T D U

122 322 333 112 122 321 321

& J D A & T T

122 213 221 331 311 233 222

& Q O U C S V
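The same seriation can be sketched for the Trifid in Python. The alphabet string below lists the 27 characters of the table above in order of their codes from 111 to 333; the names LETTERS, trifid_encrypt and trifid_decrypt are invented for this example.

```python
from itertools import product

# Characters in order of their three-digit codes: W=111, A=112, K=113, M=121, ...
LETTERS = "WAKM&BZYHNEQOVRLPSCXITJFUGD"
CODES = ["".join(digits) for digits in product("123", repeat=3)]   # 111 ... 333
TO_CODE = dict(zip(LETTERS, CODES))
TO_LETTER = dict(zip(CODES, LETTERS))

def trifid_encrypt(plaintext, period=7):
    letters = [ch for ch in plaintext.upper() if ch in TO_CODE]
    out = []
    for start in range(0, len(letters), period):
        group = letters[start:start + period]
        # Digits are written vertically under each letter, then read across:
        # all first digits, then all second digits, then all third digits.
        stream = "".join(TO_CODE[ch][i] for i in range(3) for ch in group)
        out.extend(TO_LETTER[stream[i:i + 3]] for i in range(0, len(stream), 3))
    return "".join(out)

def trifid_decrypt(ciphertext, period=7):
    letters = [ch for ch in ciphertext.upper() if ch in TO_CODE]
    out = []
    for start in range(0, len(letters), period):
        group = letters[start:start + period]
        stream = "".join(TO_CODE[ch] for ch in group)
        n = len(group)
        # Digit i of each third of the stream belongs to plaintext letter i.
        out.extend(TO_LETTER[stream[i] + stream[n + i] + stream[2 * n + i]]
                   for i in range(n))
    return "".join(out)

if __name__ == "__main__":
    secret = trifid_encrypt("THIS IS MY SECRET MESSAGE")
    print(secret)
    print(trifid_decrypt(secret))
```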

Representing the letters as combinations of two groups of from one to five signal fires was originally proposed by

Polybius in his Histories, as being a general method of communications, unlike ones he noted as being previously

used that depended on a small list of pre-arranged messages. It looked like this:

with the letters of the Greek alphabet placed on five numbered tablets, and each letter being numbered on each

tablet.

Other forms of dividing a character into smaller pieces, such as ASCII or Baudot, or Morse Code (to be seen below)

were also developed to allow communications over various types of channel, as were the signal flags used by ships,

to use an example of a different type.

The Straddling Checkerboard

Some ciphers actually used by Soviet spies used a square like this:

  9 8 2 7 0 1 6 4 3 5
  -------------------
  A T   O N E   S I R
2 B C D F G H J K L M
6 P Q U V W X Y Z . /

Eight of the most common letters are translated to a single digit. The two digits not used in this way begin two-digit

combinations that stand for the remaining letters. This is an example of a variable length code with the prefix

property. When it is possible to tell, from the digits one has already seen of a symbol, whether or not one needs to

include the next digit in the symbol, then spaces between the digits of a symbol are not needed, and this is what is

known as the prefix property.

At one time, telephone numbers in North America had this property, because the middle digit of an area code was always 0 or 1, and the first three digits of a regular telephone number, also known as the exchange, never had 0 or 1 as the middle digit. Therefore, it was possible to dial 1 plus the seven-digit number to make a long-distance call within one's own area code, since the first three digits could not possibly be an area code. However, the growing need for telephone numbers made it necessary to abandon this rule in January 1995, and therefore, when dialing a long-distance call within one's own area code, it is now necessary to dial the area code as well. This method of dealing with the increased demand for telephone numbers had the advantage that the number of digits in a telephone number did not have to be increased. This avoided problems with computer data-processing systems that allocated only the fixed minimum amount of space for a telephone number, and it limited the amount of alteration needed for older telephone equipment.

In Britain, on the other hand, it was necessary to lengthen every telephone number by one digit, and this was done

by inserting the digit 1 in every number in the second position on a day called "phONE day", April 16, 1995,

although permissive dialing remained in effect until April 22, 2000. A number of other countries also modified their

systems of telephone numbers during roughly the same period; Australia began in 1996, and Finland changed over

its phone system on October 12, 1996.

Since in Morse code, a dot is the letter E, and a dash is the letter T, but other Morse code symbols also begin with a

dot or a dash, Morse code is a variable-length code that does not have the prefix property, and so spaces are required

between letters in Morse code.

Of course, the second digit of a two-digit combination could also have stood, by itself, for another letter; but because

when you start from the beginning and move forwards, there is no chance of confusion, this is a workable and usable

system.

Thus, the message SENDMONEY would become 4 1 0 22 25 7 0 1 66, or, rather, 41022 25701 66 because spaces to

show where the letters begin are not needed; the first digit representing a letter determines if its substitute is one or

two digits long.
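As a rough illustration of how the prefix property lets the digits run together without separators, here is a small Python sketch of the checkerboard shown above. The variable and function names are mine, and the layout constants simply transcribe the table.

```python
TOP = "9827016435"        # column digits, left to right
SINGLE = "AT ONE SIR"     # one-digit letters; the blanks fall under 2 and 6
ROWS = {"2": "BCDFGHJKLM", "6": "PQUVWXYZ./"}

ENCODE = {}
for col, digit in enumerate(TOP):
    if SINGLE[col] != " ":
        ENCODE[SINGLE[col]] = digit
    for prefix, letters in ROWS.items():
        ENCODE[letters[col]] = prefix + digit
DECODE = {digits: letter for letter, digits in ENCODE.items()}


def encode(text):
    return "".join(ENCODE[ch] for ch in text)


def decode(digits):
    out, i = [], 0
    while i < len(digits):
        # The first digit alone tells us whether the symbol is 1 or 2 digits.
        width = 2 if digits[i] in ROWS else 1
        out.append(DECODE[digits[i:i + width]])
        i += width
    return "".join(out)


print(encode("SENDMONEY"))
print(decode(encode("SENDMONEY")))
```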

More complicated codes that work this way, using only the two binary digits 0 and 1, are used as a form of data

compression. The most famous variable-length prefix-property binary codes are the Huffman codes; but this term

only applies to such a code when symbols were assigned in it by a specific algorithm, which has been proven to be

optimal, within the limitations of only considering single-symbol frequencies, and only using this kind of code:

arithmetic coding, which doesn't work in whole bits, can be more efficient. Before Huffman's proof, codes of that

nature assigned in a different fashion, which are known as Shannon-Fano codes, were the best known.

In one case, the VIC cipher used by Reino Hayhanen (the message in that cipher on microfilm, inside a hollow

nickel, was the background to the page introducing this section) the digits produced by a straddling checkerboard

were then subjected to a form of columnar transposition which was varied by selecting triangular areas to be filled

with plaintext last.

In other cases, after the message was converted to digits, an encipherment similar to the Vigenère, to be described in the next section, was performed. Vigenère is a form of addition, and for most people addition is easier to do on digits, without special equipment, than on letters.

Fractionated Morse, and Other Oddities

Fractionated Morse

Morse code uses variable-length symbols made up of dots and dashes, but unlike a straddling checkerboard, the

length of a symbol is not determined by the dots and dashes within it. Instead, spaces are also needed to mark off the

symbols from each other.

But fractionation is still possible using Morse code as a basis. Elementary Cryptanalysis, by H. F. Gaines, gives a cipher, called a "mutilation" cipher, devised by M. E. Ohaver, the author of an early series of magazine columns on cryptanalysis which was of value to her in the writing of that book. It works like this:

Split the message in Morse code into two parts; the string of dots and dashes, and a series of numbers giving the

number of dots or dashes in the representation of each letter. Then, take the numbers, divide them into groups of n,

and reverse the order of the numbers in each group. Using the now transposed numbers as a guide, turn the string of

dots and dashes back into letters.

A table of Morse code follows (and, while I'm at it, I may as well include Japanese Morse, having the data available

from one of my old USENET posts):

Symbol  Morse   Hex  Kana

E       .       CD   he
I       ..      DE   [A]
S       ...     D7   ra
H       ....    C7   nu
V       ...-    B8   ku
U       ..-     B3   u
F       ..-.    C1   ti
(1)     ..--    C9   no
A       .-      B2   i
R       .-.     C5   na
L       .-..    B6   ka
(2)     .-.-    DB   ro
W       .--     D4   ya
P       .--.    C2   tu
J       .---    A6   wo
T       -       D1   mu
N       -.      C0   ta
D       -..     CE   ho
B       -...    CA   ha
X       -..-    CF   ma
K       -.-     DC   wa
C       -.-.    C6   ni
Y       -.--    B9   ke
M       --      D6   yo
G       --.     D8   ri
Z       --..    CC   hu
Q       --.-    C8   ne
O       ---     DA   re
(3)     ---.    BF   so
(4)     ----    BA   ko

5       .....        5
4       ....-        4
(5)     ...-.
3       ...--        3
(6)     ..-..   C4   to
Inter   ..-.-   D0   mi
        ..--.   DF   [B]
2       ..---        2
Wait    .-...   B5   o
(9)     .-..-        yi
+       .-.-.   DD   n
        .-.--   C3   te
        .--..        ye
(a)     .--.-   B0   -
(b)     .---.   BE   se
1       .----        1

6       -....        6
=       -...-   D2   me
/       -..-.   D3   mo
        -..--   D5   yu
(c)     -.-..   B7   ki
Start   -.-.-   BB   sa
(       -.--.   D9   ru
        -.---   B4   e
7       --...        7
(e)     --..-   CB   hi
        --.-.   BC   shi
(f)     --.--   B1   a
8       ---..        8
        ---.-   BD   su
9       ----.        9
0       -----        0

These notes represent two special marks in Japanese:

[A] double stroke following kana (nigori),

[B] small circle following kana (han-nigori).

These notes represent accented letters in European languages or Turkish:

(1) u umlaut (2) a umlaut, cedilla (3) o umlaut or other accent (4) ch, s cedilla (5) s hat (6) e primary accent (usually

acute, grave in Italian) (9) e accent grave (a) a accent (b) j hat (c) c cedilla or accent (e) z accent grave (f) n tilde

To remove ambiguities, the Japanese syllables are preceded by the hex code, in the version of 8-bit ASCII that includes kana, of the kana symbol represented. The symbols whose phonetic values I give as yi and ye, which have no code in that character set, are the obsolete kana ヰ and ヱ respectively.

Here is a graphic, giving all the kana used in Japanese Morse:

Since this system requires that the ciphertext letters must be able to represent all combinations of from one to four

dots or dashes, four extra symbols, used in Morse for accented letters in some languages other than English, need to

be included in the cipher alphabet.
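Ohaver's procedure, as described before the Morse table, can be sketched as follows. This is an illustration only: the characters "1" to "4" are hypothetical stand-ins for the four accented-letter symbols (numbered as in the notes to the table), and reversing a short final group in the same way as a full one is my assumption, since the description does not say how a leftover group is handled.

```python
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..",
    # Hypothetical stand-ins for the four extra symbols (1) to (4).
    "1": "..--", "2": ".-.-", "3": "---.", "4": "----",
}
FROM_MORSE = {code: ch for ch, code in MORSE.items()}


def mutilate(text, n):
    """Reverse the symbol lengths in groups of n and re-split the dot/dash string."""
    symbols = [MORSE[ch] for ch in text]
    stream = "".join(symbols)              # dots and dashes with no spacing
    lengths = [len(s) for s in symbols]    # the "spacing" information
    shuffled = []
    for i in range(0, len(lengths), n):
        shuffled.extend(reversed(lengths[i:i + n]))
    out, pos = [], 0
    for length in shuffled:
        out.append(FROM_MORSE[stream[pos:pos + length]])
        pos += length
    return "".join(out)


print(mutilate("SEND", 2))
print(mutilate(mutilate("SEND", 2), 2))
```

Because reversing each group twice restores the original lengths, the same function serves for decipherment under these assumptions.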

While the original system, having only the group length as a key, may not have been all that secure, the basic

concept is clever and original. The character lengths could as easily have been transposed by means of a double

columnar transposition, and the dots and dashes could be translated to 0s and 1s, and enciphered by any applicable

method, even DES.

While I consider Ohaver's "mutilation" cipher very interesting, for the principle which it illustrates, the term

Fractionated Morse is normally used for a less elegant, but more secure, system, in which possible combinations of

three symbols from the set of dot, dash, and x, the latter standing for the space between letters, are represented by

letters. Note that combinations with two consecutive "x"s are not required, so the ciphertext uses a 22-letter

alphabet.

The letters will vary in frequency, and since two adjacent letters that would produce two consecutive "x"s do not

occur, redundancy still remains in subtle forms as well.
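The figure of 22 can be checked by brute force. The short sketch below simply enumerates all three-symbol combinations of dot, dash and x and discards those containing two consecutive x's; nothing in it goes beyond the counting argument in the text.

```python
from itertools import product

# All 27 three-symbol combinations of dot, dash and the letter separator x.
trigrams = ["".join(t) for t in product(".-x", repeat=3)]

# Two separators in a row never occur within a message, so drop those.
usable = [t for t in trigrams if "xx" not in t]

print(len(trigrams), len(usable))
```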

Mixed Fractionation for the Computer

Also, fractionation can be done in a mixed fashion.

25 times 27 is one less than 26 times 26, so one could encipher bigrams (except 1 that is ignored) into objects consisting of two symbols from a five-character alphabet and three symbols from a three-character

alphabet, and then seriate the two kinds of symbols separately, also using two tables, one 125 entries long,

and one 81 entries long, for substitution on them;

32 equals 27 plus 5, and 128 equals 125 plus 3, so there are two different ways to encipher a binary bitstream as a mix of symbols from a three-character and a five-character alphabet;

26 to the 10th power is very slightly larger than 2 to the 47th power; this is noted in a section in the last chapter dealing with ways of preparing a binary message for transmission as text (known as "armor"), but

even this could be made use of in an elaborate fractionation scheme.
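The numerical coincidences these schemes rely on are easy to confirm. The following lines are only a check of the arithmetic quoted above, using Python's exact integers.

```python
# Bigrams versus two base-5 symbols plus three base-3 symbols.
bigrams = 26 * 26
mixed = 5 ** 2 * 3 ** 3
print(bigrams, mixed)                 # the two counts differ by exactly one

# Two ways to split a power of two between base-3 and base-5 groups.
print(2 ** 5 == 3 ** 3 + 5, 2 ** 7 == 5 ** 3 + 3)

# Ten letters carry very slightly more than 47 bits.
ratio = 26 ** 10 / 2 ** 47
print(ratio)
```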


Because there are convenient ways to convert both letters and bits to a mix of symbols from a 3-element set and

from a 5-element set, as well as an efficient way to convert from bits to letters, intriguing possibilities suggest

themselves. An elaborate fractionation scheme combining the threads mentioned here together is described later.

Enciphering Digits

One interesting way to produce a mixed fractionation scheme comes from the fact that the square of any triangular

number is the same as the sum of the cubes of the consecutive numbers which, when added, produced that triangular

number!
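A quick numerical check of this identity, together with the case that matters here: since 10 = 1 + 2 + 3 + 4, the 100 cells of a 10 by 10 table split into 1 + 8 + 27 + 64 cells. The helper name `triangular` is mine.

```python
def triangular(n):
    """The n-th triangular number 1 + 2 + ... + n."""
    return n * (n + 1) // 2


# The identity holds for every n tried here.
for n in range(1, 50):
    assert sum(k ** 3 for k in range(1, n + 1)) == triangular(n) ** 2

# The case used for enciphering digit pairs: 10 is the 4th triangular number.
cubes = [k ** 3 for k in range(1, 5)]
print(cubes, sum(cubes), triangular(4) ** 2)
```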

Making use of that fact, and since 10 is a triangular number, one can construct a table like this:

0 1 2 3 4 5 6 7 8 9

0 AAA AAB AAC * -+- -++ aca acb acc acd

1 ABA ABB ABC --- +-- +-+ ada adb adc add

2 ACA ACB ACC --+ ++- +++ baa bab bac bad

3 BAA BAB BAC CAA CAB CAC bba bbb bbc bbd

4 BBA BBB BBC CBA CBB CBC bca bcb bcc bcd

5 BCA BCB BCC CCA CCB CCC bda bdb bdc bdd

6 aaa aba daa dab dac dad caa cab cac cad

7 aab abb dba dbb dbc dbd cba cbb cbc cbd

8 aac abc dca dcb dcc dcd cca ccb ccc ccd

9 aad abd dda ddb ddc ddd cda cdb cdc cdd

As 1 cubed is just 1, and 2 cubed is 8, these symbols make up only a very small part of the square table above, and

thus this part of the table is seldom used. One way to deal with that is to change the table, so that those 9 spaces are

instead filled by two symbols from the ABC set of symbols.

0 1 2 3 4 5 6 7 8 9

0 AAA AAB AAC AA AB AC aca acb acc acd

1 ABA ABB ABC BA BB BC ada adb adc add

2 ACA ACB ACC CA CB CC baa bab bac bad

3 BAA BAB BAC CAA CAB CAC bba bbb bbc bbd

4 BBA BBB BBC CBA CBB CBC bca bcb bcc bcd

5 BCA BCB BCC CCA CCB CCC bda bdb bdc bdd

6 aaa aba daa dab dac dad caa cab cac cad

7 aab abb dba dbb dbc dbd cba cbb cbc cbd

8 aac abc dca dcb dcc dcd cca ccb ccc ccd

9 aad abd dda ddb ddc ddd cda cdb cdc cdd

Using the straddling checkerboard that we saw above,

  9 8 2 7 0 1 6 4 3 5
  -------------------
  A T   O N E   S I R
2 B C D F G H J K L M
6 P Q U V W X Y Z . /

we can encipher a sample message in this scheme, just seriating across the whole message for simplicity (in practice,

one would want to do other things):

TH EREISAP AC K AG EW AITING F ORY OU ATTH ESTATION message

821151349699282492016093830202775667629882114898370 straddling checkerboard

d A B C c c b C d A a d d A A c b c d c d A b c b fractionation

c B C A d d a B d A a d c A A b d a a d c B c d b encoding

a B B B a d c   a B a b b C C b a b a c a B c c b


For this example, I won't worry about enciphering the last digit of the message. Padding, or another encipherment

step might take care of that.

Now, I seriate the symbols from the ABC and the abcd sets independently, retaining the type of each pair of digits,

and thus the symbols are rearranged as follows, leading to enciphered digits:

d A B B c a c B d C a c d B C d b a c d d A a a a seriation

a B C B d d b A d A c b c B A b b b d a c A a b d

a C A A a c d   c C d c a A B b c c c d c B b b b

621250409618791394350978824034733881986584017071170 reconversion

One could also leave all the "d" symbols in their place, and seriate only the "abc" and "ABC" symbols as though

case did not matter - but then convert either to capital or small letters so that the type of each two digit group is kept

the same. (Some care, of course, must be taken when developing a variation so that decipherment remains possible.)

That would produce the following result:

d C A A b b b C d B c d d A B c a c d b d A c c b seriation

a A A B d d a A d A b d c B A a d c a d a C c d b

b B A C a d b   a A c c c A A b b b c c c C b b b

633400125659272392307894841030671787645864228797370 reconversion

and this method has its strengths, but also its weaknesses (mainly because the "d"s remain fixed).

Giant Playfair

Another technique I once described involved first using the straddling checkerboard to encipher a message as digits, and then using Playfair to encipher the digits. But instead of applying the Playfair technique to a 5 by 5 square of letters, one uses a 10 by 10 square containing digit pairs, like the following:

68 71 07 49 76 42 54 77 21 82

02 09 98 65 70 55 17 01 50 91

33 35 30 08 62 22 97 44 06 57

64 18 78 58 96 34 11 56 52 38

95 26 86 20 27 37 93 05 14 85

63 29 39 61 87 10 88 32 00 80

31 81 16 83 24 99 67 72 13 53

89 94 47 40 25 73 04 59 84 19

03 75 28 60 12 41 43 48 66 74

79 46 36 45 51 69 23 15 92 70

Thus, the four digits 2076 would encipher to 2749 with this square.
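A sketch of this idea in Python follows. Two things in it are assumptions of mine rather than anything stated above. First, the square as printed shows 70 twice and never shows 90, so the sketch takes the final entry to be 90. Second, the same-row and same-column rules (step right, step down) are the ordinary Playfair rules; a repeated pair such as 0202 is simply treated as a same-row case.

```python
# 10 x 10 square of digit pairs from the text. ASSUMPTION: the last entry,
# printed as a second 70, is taken to be the otherwise missing 90.
SQUARE = """
68 71 07 49 76 42 54 77 21 82
02 09 98 65 70 55 17 01 50 91
33 35 30 08 62 22 97 44 06 57
64 18 78 58 96 34 11 56 52 38
95 26 86 20 27 37 93 05 14 85
63 29 39 61 87 10 88 32 00 80
31 81 16 83 24 99 67 72 13 53
89 94 47 40 25 73 04 59 84 19
03 75 28 60 12 41 43 48 66 74
79 46 36 45 51 69 23 15 92 90
""".split()
POS = {pair: divmod(i, 10) for i, pair in enumerate(SQUARE)}


def cell(row, col):
    """Entry at (row, col), wrapping around the edges of the square."""
    return SQUARE[(row % 10) * 10 + col % 10]


def giant_playfair(digits, step=1):
    """Encipher a digit string whose length is a multiple of 4; step=-1 deciphers."""
    pairs = [digits[i:i + 2] for i in range(0, len(digits), 2)]
    out = []
    for a, b in zip(pairs[0::2], pairs[1::2]):
        (ra, ca), (rb, cb) = POS[a], POS[b]
        if ra == rb:        # same row (or the same pair twice): step sideways
            out += [cell(ra, ca + step), cell(rb, cb + step)]
        elif ca == cb:      # same column: step vertically
            out += [cell(ra + step, ca), cell(rb + step, cb)]
        else:               # rectangle: keep the row, take the other's column
            out += [cell(ra, cb), cell(rb, ca)]
    return "".join(out)


print(giant_playfair("2076"))
```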

The VIC Cipher

The VIC cipher is an intricate cipher issued by the Soviet Union to at least one of its spies. It is of interest because it seems highly secure, despite being a pencil-and-paper cipher. A message in this cipher was found on a piece of microfilm inside a hollowed-out nickel by a newspaper boy in 1953. The workings of the cipher were explained by Hayhanen to FBI agents shortly after his defection to the United States in 1957.

David Kahn described that cipher briefly in an article in Scientific American, and in full detail in a talk at the 1960

annual convention of the American Cryptogram Association which was later reprinted in his book Kahn on Codes.


The VIC cipher, which I will demonstrate here adapted to the sending of English-language messages, begins with an

involved procedure to produce ten pseudorandom digits. The agent must have memorized six digits (which were in

the form of a date), and the first 20 letters of a key phrase (which was the beginning of a popular song) and must

think of five random digits for use as a message indicator.

Let the date be July 4, 1776, to give the digits 741776. (Actually, the Russians used their customary form of dates,

with the month second.) And let the random indicator group be 77651.

The first step is to perform digit by digit subtraction (without carries) of the first five digits of the date from the indicator group:

        77651
    (-) 74177
    ---------
        03584

The second step is to take the 20-letter keyphrase, and turn it into 20 digits by dividing it into two halves, and within each half, assigning 1 to the letter earliest in the alphabet, and so on, treating 0 as the last number, and assigning digits in order to identical letters. Thus, if our keyphrase is "I dream of Jeannie with t", that step proceeds:

    I D R E A M O F J E   A N N I E W I T H T
    6 2 0 3 1 8 9 5 7 4   1 6 7 4 2 0 5 8 3 9

The result of the first step is then expanded to ten digits through a process called chain addition. This is a decimal analog of the way a linear-feedback shift register works: starting with a group of a certain number of digits (in this case five, and later we will do the same thing with a group of ten digits), add the first two digits in the group together, take only the last digit of the result and append it to the end of the group, then ignore the first digit, and repeat the process.

The 10 digit result is then added, digit by digit, ignoring carries, to the first 10 digits produced from the keyphrase to produce a ten-digit result, as follows:

        6 2 0 3 1 8 9 5 7 4
    (+) 0 3 5 8 4 3 8 3 2 7
    -----------------------
        6 5 5 1 5 1 7 8 9 1

And these 10 digits are then encoded by encoding 1 as the first of the 10 digits produced from the second half of the keyphrase, 2 as the second, up to 0 as the tenth.

    using code: 1 2 3 4 5 6 7 8 9 0
                1 6 7 4 2 0 5 8 3 9

                6 5 5 1 5 1 7 8 9 1
    becomes     0 2 2 1 2 1 5 8 3 1

This ten digit number is used by chain addition to generate 50 pseudorandom digits for use in encipherment:

    0 2 2 1 2 1 5 8 3 1
    ---------------------
    2 4 3 3 3 6 3 1 4 3
    6 7 6 6 9 9 4 5 7 9
    3 3 2 5 8 3 9 2 6 2
    6 5 7 3 1 2 1 8 8 8
    1 2 0 4 3 3 9 6 6 9

The last row of these digits (which will still be used again) is used like the letters in a keyword for transposition to produce a permutation of the digits 1 through 9 (with 0 last again):

    1 2 0 4 3 3 9 6 6 9
    ---------------------
    1 2 0 5 3 4 8 6 7 9

and those digits are used as the top row of numbers for a straddling checkerboard:

      1 2 0 5 3 4 8 6 7 9
      -------------------
      A T   O N E   S I R
      -------------------
    0 B C D F G H J K L M
    8 P Q U V W X Y Z . /


One detail omitted is that the checkerboard actually used had the letters in the bottom part written in vertical

columns with some columns left until the end. That doesn't work as well in an English example, as there are only

two left-over spaces after the alphabet.
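The key-generation steps described above can be traced with a short Python sketch. It follows the worked example (date 741776, indicator 77651, keyphrase "I dream of Jeannie with t"); the function names `sequence`, `chain_add` and `zero_last` are mine, introduced only for illustration.

```python
def sequence(items):
    """Rank items 1, 2, ..., 9, 0 in sorted order, ties taken left to right."""
    order = sorted(range(len(items)), key=lambda i: (items[i], i))
    ranks = [0] * len(items)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank % 10          # the tenth rank is written as 0
    return ranks


def chain_add(seed, count):
    """Return count new digits, each the sum mod 10 of two adjacent earlier digits."""
    digits = list(seed)
    for i in range(count):
        digits.append((digits[i] + digits[i + 1]) % 10)
    return digits[len(seed):]


def zero_last(digits):
    """Sort keys under which the digit 0 ranks after 9."""
    return [(d - 1) % 10 for d in digits]


indicator = [7, 7, 6, 5, 1]
date = [7, 4, 1, 7, 7, 6]
phrase = "IDREAMOFJEANNIEWITHT"

# Step 1: indicator minus the first five date digits, without borrowing.
step1 = [(a - b) % 10 for a, b in zip(indicator, date)]
# Step 2: number each half of the keyphrase.
first_half, second_half = sequence(phrase[:10]), sequence(phrase[10:])
# Expand step 1 to ten digits, add to the first half, recode via the second half.
expanded = step1 + chain_add(step1, 5)
summed = [(a + b) % 10 for a, b in zip(first_half, expanded)]
seed = [second_half[(d - 1) % 10] for d in summed]
# Fifty pseudorandom digits in five rows; the last row orders the checkerboard.
block = chain_add(seed, 50)
rows = [block[i:i + 10] for i in range(0, 50, 10)]
board_top = sequence(zero_last(rows[-1]))

print("".join(map(str, seed)))
print("".join(map(str, board_top)))
```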

With the straddling checkerboard in place, we can begin enciphering a message.

Let our message be:

We are pleased to hear of your success in establishing your false identity. You will be sent some money to cover

expenses within a month.

Converting this to numbers, we proceed:

W EAREP L EASED TOH EAROF Y OU RSU C C ESSINESTAB L ISH ING

834194810741640025044195058858096800202466734621010776047303

Y OU RF AL SEID ENTITY Y OU W IL L B ESENTSOM EM ONEY TOC O

88580905107647004327288885808370707014643265094095348825025

V EREX P ENSESW ITH INAM ONTH

854948481436468372047310953204

For the sake of our example, we will give our agent a small personal number of 8. This number is used to work out

the widths of the two transposition tableaux used to transpose the numbers obtained above. The last two unequal

digits, which in this case are the last two digits (6 and 9) of the last row of the 50 numbers generated above, are

added to the personal number with the result that the two transpositions will involve 8+6, or 14, and 8+9, or 17,

columns respectively.

The keys for those two transpositions are taken by reading out the 50 numbers by columns, using the 10 digits used to generate them as a transposition key. Again, 0 is last, so given the table above:

    0 2 2 1 2 1 5 8 3 1
    ---------------------
    2 4 3 3 3 6 3 1 4 3
    6 7 6 6 9 9 4 5 7 9
    3 3 2 5 8 3 9 2 6 2
    6 5 7 3 1 2 1 8 8 8
    1 2 0 4 3 3 9 6 6 9

we read out the digits in order:

    36534 69323 39289 47352 36270 39813 4

stopping when we have the 31 digits we need.

Our first transposition uses the first 14 digits as the key of a conventional simple columnar transposition:

    36534693233928
    --------------
    83419481074164
    00250441950588
    58096800202466
    73462101077604
    73038858090510
    76470043272888
    85808370707014
    64326509409534
    88250258549484
    81436468372047
    3109532049

Our message filled ten rows of 14 digits, plus one extra row of 9 digits, so it is 149 digits long. At this initial stage, one null digit (the final 9 in the last row of the tableau) is appended to the message, making it 150 digits long, so that it will fill a whole number of 5-digit groups.

Thus, with the null digit added, it gives us the intermediate form of the message:

09200274534 6860181384 80577786883 15963702539 11018309880

75079700479 4027027992 90628086065 42040483240 30833654811

44818035243 4864084447 84005470562 1546580540
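The derivation of the two transposition keys and the first, ordinary columnar transposition can be sketched as below. The 50 digits are copied from the table above. Reading "the last two unequal digits" as the final digit together with the nearest earlier digit that differs from it is my interpretation, and the function names are illustrative.

```python
SEED = "0221215831"       # the ten digits that generated the block
ROWS = ["2433363143", "6766994579", "3325839262", "6573121888", "1204339669"]


def column_order(key):
    """Column indices in read-out order: digits 1..9 then 0, ties left to right."""
    return sorted(range(len(key)), key=lambda i: ((int(key[i]) - 1) % 10, i))


def transposition_keys(personal_number):
    last = ROWS[-1]
    b = int(last[-1])
    # ASSUMPTION: the other digit is the nearest earlier one that differs from b.
    a = next(int(d) for d in reversed(last) if int(d) != b)
    width1, width2 = personal_number + a, personal_number + b
    # Read the block by columns, in the order given by the generating digits.
    stream = "".join(row[c] for c in column_order(SEED) for row in ROWS)
    return stream[:width1], stream[width1:width1 + width2]


def columnar(digits, key):
    """Simple columnar transposition: write in rows, read columns in key order."""
    return "".join(digits[c::len(key)] for c in column_order(key))


key1, key2 = transposition_keys(8)
message = ("834194810741640025044195058858096800202466734621010776047303"
           "88580905107647004327288885808370707014643265094095348825025"
           "854948481436468372047310953204")
intermediate = columnar(message + "9", key1)   # "9" is the null digit
print(key1, key2)
print(intermediate)
```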

The fact that our message is 150 digits long was important to note, since the next step in the encipherment, although

it is also a columnar transposition, includes an extra complexity to make the transposition irregular, and so it is

necessary to lay out in advance the space that will be used in that transposition.

The remaining 17 digits of the 31 we read out above, 9 47352 36270 39813 4, are the key for this second

transposition. The numbers, in addition to indicating the order in which the columns are to be read out, indicate

where triangular areas start which will be filled in last.

The first triangular area starts at the top of the column which will be read out first, and extends to the end of the first

row. It continues in the next row, starting one column later, and so on until it includes only the digit in the last

column. Then, after one space, the second triangular area starts, this time in the column which will be read out

second.

Since we know that our message is 150 digits long, we know that it will fill 8 rows of 17 digits, with 14 digits in the

final row. This lets us fill in the transposition block, first avoiding the triangular areas:

94735236270398134

-----------------

09200274534686

018138480577786

8831596370253911

01830988075079700

47940

270279

9290628

08606542

040483240

and then with them filled in as well:

94735236270398134

-----------------

09200274534686308

01813848057778633

88315963702539116

01830988075079700

47940548114481803

27027952434864084

92906284478400547

08606542056215465

04048324080540

from which the fully encrypted message can be read out:

36178054 289959253 507014400 011342004 746845842 675048425

03100846 918177284 83603475 035007668 483882424 283890960

350713758 689914050 008042900 873786014 472544860


The last digit, 6, in the date shows that the indicator group is to be inserted in the final message as the sixth group

from the end, so the message in the form in which it will be transmitted becomes:

36178 05428 99592 53507 01440 00113 42004 74684 58426 75048

42503 10084 69181 77284 83603 47503 50076 68483 88242 42838

90960 35071 37586 89914 05000 77651 80429 00873 78601 44725

44860
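The second transposition, with its triangular areas, and the insertion of the indicator group can be sketched as follows. This reproduces the worked example; reading "after one space" as one completely filled row between triangles is my interpretation of the description, and the function names are illustrative.

```python
def column_order(key):
    """Column indices in read-out order: digits 1..9 then 0, ties left to right."""
    return sorted(range(len(key)), key=lambda i: ((int(key[i]) - 1) % 10, i))


def fill_order(key, total):
    """Cells in the order they receive digits: ordinary cells first, triangles last."""
    width = len(key)
    rows = -(-total // width)                  # ceiling division
    late, row = set(), 0
    for start in column_order(key):            # one triangle per column, in key order
        if row >= rows:
            break
        col = start
        while col < width and row < rows:
            late.update((row, c) for c in range(col, width))
            row, col = row + 1, col + 1
        row += 1                               # ASSUMPTION: one full row in between
    cells = [(r, c) for r in range(rows) for c in range(width)
             if r * width + c < total]
    return ([cell for cell in cells if cell not in late] +
            [cell for cell in cells if cell in late]), rows


def disrupted_transposition(digits, key):
    order, rows = fill_order(key, len(digits))
    grid = dict(zip(order, digits))
    return "".join(grid[(r, c)] for c in column_order(key)
                   for r in range(rows) if (r, c) in grid)


def insert_indicator(digits, indicator, from_end):
    """Split into 5-digit groups and place the indicator from_end groups from the end."""
    groups = [digits[i:i + 5] for i in range(0, len(digits), 5)]
    groups.insert(len(groups) - from_end + 1, indicator)
    return " ".join(groups)


intermediate = ("09200274534" "6860181384" "80577786883" "15963702539"
                "11018309880" "75079700479" "4027027992" "90628086065"
                "42040483240" "30833654811" "44818035243" "4864084447"
                "84005470562" "1546580540")
cipher = disrupted_transposition(intermediate, "94735236270398134")
print(insert_indicator(cipher, "77651", 6))
```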

Two Trigraphic Ciphers, and a Heptagraphic One

Playfair for Three

Based on the Playfair cipher, I once thought of a way to make a cipher that worked on groups of three letters.

Using a square, as with Playfair:

T X V H R

L K M U P

N Z O J E

C G W Y A

F B S D I

encipher with the following rules:

If all three letters are the same, replace them by three repetitions of the letter diagonally below and to the

right of that letter. Thus: MMM becomes JJJ in the square above. Below, and to the right, are always

interpreted cyclically, so DDD becomes RRR, and PPP becomes NNN, and even III becomes TTT.

If two of the letters are the same, encipher the two letters as if they were part of a digraph to be enciphered

with Playfair.

o If the two letters are in the same row, replace each one with the letter to its right. Thus: PKK

becomes LMM, NON becomes ZJZ.

o If the two letters are in the same column, replace each one with the letter below it. Thus: HUH

becomes UJU, ZZB becomes GGX.

o If the two letters are neither in the same row nor the same column, replace each letter with the

letter that is in its own row, but in the column of the other letter. Thus: BOO becomes SZZ, MIM

becomes PSP.

When all three letters are different, follow these rules:

o If all three letters are in the same row, replace each one with the letter to its right. Thus, CYG

becomes GAW, ZEN becomes ONZ.

o If all three letters are in the same column, replace each one with the letter below it. Thus, MOW

becomes OWS, KGB becomes ZBX.

o If two letters are in the same row, and one of those two is in the same column as the third letter, the three letters occupy three corners of a rectangle. Replace the letter at the middle corner (the one that shares a row with one letter and a column with the other) with the letter at the fourth corner, which is in the same column as the letter with which it shares a row, and in the same row as the letter with which it shares a column. Replace the two other letters by each other. Thus, YUK becomes KGY, WVY becomes HYV, POE becomes OPM, GAP becomes PKG.

o If two letters are in the same column, but the third letter is neither in that column nor in the same

row as either of those two letters, replace each letter by the letter which is in its own row, but in

the other column used by the three letters. Thus, TCO becomes VWN, TAN becomes RCE, HUG

becomes XKY.

o If two letters are in the same row, but the third letter is neither in that row nor in the same column

as either of those two letters, replace each letter by the letter which is in its own column, but in the

other one of the two rows used by the letters. Thus, NED becomes FIJ, GAS becomes BIW, LOP

becomes NME.

o If no two letters share either a row or column, each letter is replaced by the letter in its own row,

but in the column of the next letter of the trigram, the first letter being the 'next letter' for the last

one. Thus, TOY becomes VJC, LOB becomes MZF, GET becomes ANX.

Note that since a trigram with repeated letters always enciphers to a trigram with repeated letters, one could use a

separate square for each of the three possibilities, or even just use an arbitrary substitution alphabet for the case of

three identical letters.
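The rules above translate fairly directly into code. The following Python sketch is one possible rendering, checked against the examples given with each rule; the helper names are mine, and it handles encipherment only.

```python
SQUARE = ["TXVHR", "LKMUP", "NZOJE", "CGWYA", "FBSDI"]
POS = {ch: (r, c) for r, row in enumerate(SQUARE) for c, ch in enumerate(row)}


def at(row, col):
    """Letter at (row, col), wrapping cyclically."""
    return SQUARE[row % 5][col % 5]


def digraph(a, b):
    """Ordinary Playfair on two different letters."""
    (ra, ca), (rb, cb) = POS[a], POS[b]
    if ra == rb:
        return at(ra, ca + 1), at(rb, cb + 1)
    if ca == cb:
        return at(ra + 1, ca), at(rb + 1, cb)
    return at(ra, cb), at(rb, ca)


def trigraph(tri):
    letters = list(tri)
    distinct = set(letters)
    if len(distinct) == 1:                    # all three letters the same
        r, c = POS[letters[0]]
        return at(r + 1, c + 1) * 3
    if len(distinct) == 2:                    # two letters the same
        x, y = distinct
        sub = dict(zip((x, y), digraph(x, y)))
        return "".join(sub[ch] for ch in letters)
    rows = [POS[ch][0] for ch in letters]
    cols = [POS[ch][1] for ch in letters]
    cells = list(zip(rows, cols))
    if len(set(rows)) == 1:                   # all in one row
        return "".join(at(r, c + 1) for r, c in cells)
    if len(set(cols)) == 1:                   # all in one column
        return "".join(at(r + 1, c) for r, c in cells)
    if len(set(rows)) == 2 and len(set(cols)) == 2:   # three corners of a rectangle
        corner = next(i for i in range(3)
                      if rows.count(rows[i]) == 2 and cols.count(cols[i]) == 2)
        p, q = [i for i in range(3) if i != corner]
        out = letters[:]
        out[p], out[q] = letters[q], letters[p]
        out[corner] = at(next(r for r in rows if rows.count(r) == 1),
                         next(c for c in cols if cols.count(c) == 1))
        return "".join(out)
    if len(set(cols)) == 2:                   # two share a column, third is free
        c1, c2 = set(cols)
        return "".join(at(r, c2 if c == c1 else c1) for r, c in cells)
    if len(set(rows)) == 2:                   # two share a row, third is free
        r1, r2 = set(rows)
        return "".join(at(r2 if r == r1 else r1, c) for r, c in cells)
    # Nothing shared: own row, column of the next letter (cyclically).
    return "".join(at(rows[i], cols[(i + 1) % 3]) for i in range(3))


for sample in ("MMM", "PKK", "CYG", "YUK", "TCO", "NED", "TOY"):
    print(sample, trigraph(sample))
```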

Trigraphic from Fractionation

If one uses a substitution where each letter of a 27-letter alphabet is replaced by three digits from 1 to 3, then the

obvious method of constructing a trigraphic cipher from this is to write the equivalents of the three letters in by

columns and take them out by rows; thus, with the alphabet

W 111 M 121 Z 131 N 211 O 221 L 231 C 311 T 321 U 331

A 112 & 122 Y 132 E 212 V 222 P 232 X 312 J 322 G 332

K 113 B 123 H 133 Q 213 R 223 S 233 I 313 F 323 D 333

we encipher like this:

T H E

3 1 2 X

2 3 1 L

1 3 2 Y

For 26 Letters

But how can we adapt these two ciphers to a 26-letter alphabet?

Let's imagine that we want to have a method that doesn't require, as the original Playfair did, inserting a letter like X

into the plaintext when a double letter occurs; we want something that can be applied mechanically to any arbitrary

input text. This would make it suitable for use as a step in encryption performed by a computer.

For the cipher derived from Playfair, the structure of the rules provides a clue. When the extra letter turns up, ignore

it for encryption, but place it in the ciphertext without alteration, and treat two remaining letters, if they are different,

as in regular Playfair, and a single remaining letter (or two identical remaining letters) as if they were three identical

letters.

How, though, can we possibly make the cipher which requires a 27-letter alphabet work with only 26 letters?

First, choose a substitution table such that the unused letter, &, is represented by the code 333.


U 111 F 121 P 131 Q 211 D 221 W 231 B 311 R 321 G 331

X 112 J 122 I 132 C 212 A 222 K 232 Y 312 T 322 V 332

H 113 M 123 O 133 S 213 Z 223 L 233 E 313 N 323 & 333

Now then, any combination that does not contain an ampersand, but which produces a combination that does contain

one, will have produced a combination that, when enciphered again, doesn't contain an ampersand.

L O G S & G

2 1 3 S 2 3 3 L

3 3 3 & 1 3 3 O

3 3 1 G 3 3 1 G

That appears to be a trivial consequence of the fact that this cipher is reciprocal.

Since an ampersand is represented by the code 333, however, that means that whether or not a square produces an

ampersand depends only on the positions of the 3s in that square; the other two digits, 1 and 2, are irrelevant. Thus,

we can do better than leaving trigrams which encipher to combinations including an ampersand unenciphered.

Between the two encipherments, we can apply a substitution to the letters of the first result, as long as that

substitution leaves the 3s unchanged. Since this substitution operates perpendicular to the plaintext and the

ciphertext, the cipher still mixes the letters of the trigram together in this case.

Such a substitution might look like this:

111 212    113 123    131 232    311 321    133 233
112 122    123 223    132 231    312 311    233 133
121 211    213 113    231 131    321 322    313 313
122 221    223 213    232 132    322 312    323 323
211 121                                     331 332
212 222                                     332 331
221 111
222 112                                     333 333

With such a substitution, our encipherment would become:

L O G H & V

2 1 3 S 213 -> 113 (H) 1 3 3 O

3 3 3 & 333 -> 333 (&) 1 3 3 O

3 3 1 G 331 -> 332 (V) 3 3 2 V
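The two-pass procedure can be sketched in Python as follows, using the 26-letter table and the sample substitution shown above. The function names are mine, and the sketch covers encipherment only.

```python
TABLE = {
    "U": "111", "F": "121", "P": "131", "Q": "211", "D": "221", "W": "231",
    "B": "311", "R": "321", "G": "331", "X": "112", "J": "122", "I": "132",
    "C": "212", "A": "222", "K": "232", "Y": "312", "T": "322", "V": "332",
    "H": "113", "M": "123", "O": "133", "S": "213", "Z": "223", "L": "233",
    "E": "313", "N": "323", "&": "333",
}
LETTER = {code: ch for ch, code in TABLE.items()}

# The sample substitution from the text; it never moves a 3.
SUBST = {
    "111": "212", "112": "122", "121": "211", "122": "221",
    "211": "121", "212": "222", "221": "111", "222": "112",
    "113": "123", "123": "223", "213": "113", "223": "213",
    "131": "232", "132": "231", "231": "131", "232": "132",
    "311": "321", "312": "311", "321": "322", "322": "312",
    "133": "233", "233": "133", "313": "313", "323": "323",
    "331": "332", "332": "331", "333": "333",
}


def transpose(trigram):
    """Write the three codes in by columns and take them out by rows."""
    codes = [TABLE[ch] for ch in trigram]
    return "".join(LETTER["".join(code[row] for code in codes)]
                   for row in range(3))


def encipher(trigram):
    first = transpose(trigram)
    if "&" not in first:
        return first
    # An ampersand appeared: substitute, then transpose a second time.
    swapped = "".join(LETTER[SUBST[TABLE[ch]]] for ch in first)
    return transpose(swapped)


print(encipher("LOG"))
print(encipher("THE"))
```

Because the substitution leaves every 3 in place, the second transposition cannot produce an ampersand from ampersand-free plaintext, so the output always stays within the 26 letters.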

Heptagraphic encryption

Having now obtained two trigraphic ciphers which both operate on trigrams of the 26-letter alphabet, but which operate on different principles, one is immediately tempted to combine them to create a cipher which will be much stronger than either one alone.

One way to do this is simply to apply both in sequence. However, inspired by recently encountering the

polymorphic block ciphers of Kostadin Bajalcaliev, I have thought of a more elaborate way of doing this.

Let us encipher a block of seven letters at a time. Three letters are enciphered trigraphically by one of the two

systems given above, and the next three are enciphered using the other system. The seventh letter is used to indicate

which system is used.

Then, the letters are rearranged according to the permutation from 1 2 3 4 5 6 7 to 4 7 1 2 3 5 6

and the process is repeated.


Since it seems wasteful to leave a letter unenciphered just to use it as the source of one bit of information, that letter

could also be used to choose between twelve possibilities: there could be three different sets of tables for one of the

block ciphers, and four different sets of tables for the other. For the one based on Playfair, it is not even necessary

that the omitted letter be the same for each set of tables.

Polyalphabetic Substitution

The idea of using substitution ciphers that change during the course of a message was a very important step

forwards in cryptography. David Kahn's book, The Codebreakers, gives a full account of the origins of this idea

during the Italian Renaissance.

The earliest form of polyalphabetic cipher was developed by Leon Battista Alberti by 1467. His system involved

writing the ciphertext in small letters, and using capital letters as symbols, called indicators, to indicate when the

substitution changes, now and then through a message. The plaintext alphabet on his cipher disk was in order, and

included the digits 1 through 4 for forming codewords from a small vocabulary.

Subsequently, more modern forms were devised, which change the substitution for each letter:

A progressive-key system, where keys are used one after the other in normal order. This was first published

posthumously, in a book by Johannes Trithemius that appeared in 1518. The key ABCD...Z was used with

regular alphabets in the form depicted therein.

A keyword indicating the alphabets to use in turn. Although this system is what is called the Vigenère, it originated with Giovan Batista Belaso in 1553. In 1563, Giovanni Battista Porta added the use of mixed alphabets to this system.

The autokey system, where a key starts the choice of alphabet, but the message itself determines the alphabets to use for later parts of the message. Although an unusable form of this was first proposed by Girolamo Cardano, it was Blaise de Vigenère who proposed the modern form of the autokey cipher in 1585.
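The three forms listed above can be compared side by side in a short Python sketch. It assumes regular (unmixed) alphabets, with each key letter A to Z standing for a shift of 0 to 25, and it takes the autokey to be a plaintext autokey started by a short primer; those choices and the function names are mine.

```python
A = ord("A")


def shift(p, k):
    """Encipher plaintext letter p in the alphabet labelled by key letter k."""
    return chr((ord(p) - A + ord(k) - A) % 26 + A)


def progressive(plaintext):
    """Progressive key: the alphabets A, B, C, ... are used one after the other."""
    return "".join(shift(p, chr(A + i % 26)) for i, p in enumerate(plaintext))


def vigenere(plaintext, keyword):
    """Keyword system: the letters of the keyword pick the alphabets in turn."""
    return "".join(shift(p, keyword[i % len(keyword)])
                   for i, p in enumerate(plaintext))


def autokey(plaintext, primer):
    """Autokey: after the primer, the message itself supplies the key."""
    key = primer + plaintext
    return "".join(shift(p, k) for p, k in zip(plaintext, key))


print(progressive("ATTACKATDAWN"))
print(vigenere("ATTACKATDAWN", "LEMON"))
print(autokey("ATTACKATDAWN", "Q"))
```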

The following compact table provides 26 alphabets, each labelled with a letter of the alphabet:

B C D E F   ZABCDEFGHIJKLMNOPQRSTUVWXY
G H I J K   UVWXYZABCDEFGHIJKLMNOPQRST
L M N O P   PQRSTUVWXYZABCDEFGHIJKLMNO
Q R S T U   KLMNOPQRSTUVWXYZABCDEFGHIJ
V W X Y Z   FGHIJKLMNOPQRSTUVWXYZABCDE

B G L Q V   ABCDEFGHIJKLMNOPQRSTUVWXYZ
C H M R W   BCDEFGHIJKLMNOPQRSTUVWXYZA
D I N S X   CDEFGHIJKLMNOPQRSTUVWXYZAB
E J O T Y   DEFGHIJKLMNOPQRSTUVWXYZABC
F K P U Z   EFGHIJKLMNOPQRSTUVWXYZABCD

The A alphabet isn't shown, since in that alphabet, every letter stands for itself, and so, if nothing is done,
nothing need be looked up in the table. For any other alphabet, use the letter indicating the alphabet to find a
row among the top five, and a row among the bottom five; using those two rows, the upper row stands for
plaintext, the lower for cipher.

Thus, for alphabet Q, the top row begins KLMNO... and the bottom row begins ABCDE..., and so K becomes A,
Q becomes G, and A becomes Q in that alphabet.

If you think of A as standing for zero, B for 1, up to Z for 25, this particular set of alphabets is nothing more than the
modulo 26 addition of the plaintext and the key to obtain the ciphertext. Circular disks or sliding scales can be used
to carry out the addition. This, perhaps, can be more easily seen if we exhibit the Vigenère tableau in full,
accompanied by the table for modulo-26 addition:


 |ABCDEFGHIJKLMNOPQRSTUVWXYZ
-+--------------------------
A|ABCDEFGHIJKLMNOPQRSTUVWXYZ
B|BCDEFGHIJKLMNOPQRSTUVWXYZA
C|CDEFGHIJKLMNOPQRSTUVWXYZAB
D|DEFGHIJKLMNOPQRSTUVWXYZABC
E|EFGHIJKLMNOPQRSTUVWXYZABCD
F|FGHIJKLMNOPQRSTUVWXYZABCDE
G|GHIJKLMNOPQRSTUVWXYZABCDEF
H|HIJKLMNOPQRSTUVWXYZABCDEFG
I|IJKLMNOPQRSTUVWXYZABCDEFGH
J|JKLMNOPQRSTUVWXYZABCDEFGHI
K|KLMNOPQRSTUVWXYZABCDEFGHIJ
L|LMNOPQRSTUVWXYZABCDEFGHIJK
M|MNOPQRSTUVWXYZABCDEFGHIJKL
N|NOPQRSTUVWXYZABCDEFGHIJKLM
O|OPQRSTUVWXYZABCDEFGHIJKLMN
P|PQRSTUVWXYZABCDEFGHIJKLMNO
Q|QRSTUVWXYZABCDEFGHIJKLMNOP
R|RSTUVWXYZABCDEFGHIJKLMNOPQ
S|STUVWXYZABCDEFGHIJKLMNOPQR
T|TUVWXYZABCDEFGHIJKLMNOPQRS
U|UVWXYZABCDEFGHIJKLMNOPQRST
V|VWXYZABCDEFGHIJKLMNOPQRSTU
W|WXYZABCDEFGHIJKLMNOPQRSTUV
X|XYZABCDEFGHIJKLMNOPQRSTUVW
Y|YZABCDEFGHIJKLMNOPQRSTUVWX
Z|ZABCDEFGHIJKLMNOPQRSTUVWXY

  | 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
--+-----------------------------------------------------------------------------
 0| 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
 1| 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25  0
 2| 2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25  0  1
 3| 3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25  0  1  2
 4| 4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25  0  1  2  3
 5| 5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25  0  1  2  3  4
 6| 6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25  0  1  2  3  4  5
 7| 7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25  0  1  2  3  4  5  6
 8| 8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25  0  1  2  3  4  5  6  7
 9| 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25  0  1  2  3  4  5  6  7  8
10|10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25  0  1  2  3  4  5  6  7  8  9
11|11 12 13 14 15 16 17 18 19 20 21 22 23 24 25  0  1  2  3  4  5  6  7  8  9 10
12|12 13 14 15 16 17 18 19 20 21 22 23 24 25  0  1  2  3  4  5  6  7  8  9 10 11
13|13 14 15 16 17 18 19 20 21 22 23 24 25  0  1  2  3  4  5  6  7  8  9 10 11 12
14|14 15 16 17 18 19 20 21 22 23 24 25  0  1  2  3  4  5  6  7  8  9 10 11 12 13
15|15 16 17 18 19 20 21 22 23 24 25  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14
16|16 17 18 19 20 21 22 23 24 25  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
17|17 18 19 20 21 22 23 24 25  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
18|18 19 20 21 22 23 24 25  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
19|19 20 21 22 23 24 25  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
20|20 21 22 23 24 25  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19
21|21 22 23 24 25  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
22|22 23 24 25  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21
23|23 24 25  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22
24|24 25  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
25|25  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24

And, of course, instead of the modulo-26 addition table for our 26-letter alphabet, a

Vigenère table can be constructed for any alphabet. Thus, modulo-24 addition would

be used for the Greek alphabet, modulo-32 addition for the Russian alphabet, or, as

shown in the picture at left, modulo-22 addition for the Hebrew alphabet. Here, the

table is written from right to left, in the same direction as normally used for Hebrew

writing.
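As a minimal sketch of this letter addition for an arbitrary alphabet (the function names and the GREEK constant are illustrative assumptions, not part of the original text), the alphabet can simply be passed in as a string whose length is the modulus:

```python
import string

LATIN = string.ascii_uppercase                 # 26 letters, modulo-26 addition
GREEK = "ΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣΤΥΦΧΨΩ"             # 24 letters, modulo-24 addition


def add_letters(plain, key, alphabet=LATIN):
    """Encipher one letter: position of plain + position of key, modulo the alphabet size."""
    n = len(alphabet)
    return alphabet[(alphabet.index(plain) + alphabet.index(key)) % n]


def subtract_letters(cipher, key, alphabet=LATIN):
    """Decipher one letter: position of cipher - position of key, modulo the alphabet size."""
    n = len(alphabet)
    return alphabet[(alphabet.index(cipher) - alphabet.index(key)) % n]


if __name__ == "__main__":
    # The alphabet-Q examples given for the compact table.
    for plain in "KQA":
        print(plain, "->", add_letters(plain, "Q"))
```

The same two functions serve any alphabet, since only the length of the string changes the modulus.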


The message "Wish you were here" can be encrypted by the three possible methods, using SIAMESE as the

keyword:

Straight keyword:

Message: WISHYOUWEREHERE

Key: SIAMESESIAMESES

Cipher: OQSTCGYOMRQLWVW

Progressive key:


Key: SIAMESETJBNFTFU

Cipher: OQSTCGYPNSRMXWY

Autokey:


Key: SIAMESEWISHYOUW

Cipher: OQSTCGYSMJLFSLA

For the progressive key, the keyword, followed by the keyword advanced one position at a time through the

alphabet, is used. Just using ABCDEF... as the key would not have been unique enough to serve as a real cipher.
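The three ways of extending the keyword can be sketched as follows. This is only an illustration under the A = 0 ... Z = 25 convention; the function names are assumptions made for the example.

```python
import string
from itertools import cycle, islice

ALPHABET = string.ascii_uppercase


def shift(letter, amount):
    """Advance a letter by `amount` positions, wrapping around modulo 26."""
    return ALPHABET[(ALPHABET.index(letter) + amount) % 26]


def straight_key(keyword, length):
    """Repeat the keyword unchanged until it covers the message."""
    return "".join(islice(cycle(keyword), length))


def progressive_key(keyword, length):
    """Repeat the keyword, advancing every letter by one more position on each repetition."""
    n = len(keyword)
    return "".join(shift(keyword[i % n], i // n) for i in range(length))


def autokey(keyword, message):
    """Use the keyword first, then the message itself as the rest of the key."""
    return (keyword + message)[:len(message)]


def encipher(message, key):
    """Add each key letter to the matching message letter, modulo 26."""
    return "".join(shift(p, ALPHABET.index(k)) for p, k in zip(message, key))


if __name__ == "__main__":
    message = "WISHYOUWEREHERE"
    for key in (straight_key("SIAMESE", len(message)),
                progressive_key("SIAMESE", len(message)),
                autokey("SIAMESE", message)):
        print(key, encipher(message, key))
```

Run on the message and keyword above, the sketch reproduces the three key and cipher lines of the worked example.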

The table shown here can be thought of as a table for the addition of letters, which is equivalent to addition modulo

26, where A stands for 0, B stands for 1, continuing on to Z, which would stand for 25.

The plain keyword system can be solved by the Kasiski method; look for repeated sequences of letters in a message,

and count the number of letters between them. From this, it is easy to spot common factors, and determine the length

of the keyword used. This lets one sort the letters into the ones enciphered with the same alphabet. If normal

alphabets are used, looking at the profile of the frequency count makes solution trivial.
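The first steps of the Kasiski method can be sketched like this. The function names, the default sequence length of three, and the cutoff of 20 for candidate periods are assumptions chosen for the illustration.

```python
from collections import Counter, defaultdict


def repeat_distances(ciphertext, length=3):
    """Return the distances between successive occurrences of each repeated sequence."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - length + 1):
        positions[ciphertext[i:i + length]].append(i)
    distances = []
    for starts in positions.values():
        distances.extend(b - a for a, b in zip(starts, starts[1:]))
    return distances


def factor_counts(distances, max_period=20):
    """Count how many of the distances each candidate period divides evenly."""
    counts = Counter()
    for d in distances:
        for f in range(2, max_period + 1):
            if d % f == 0:
                counts[f] += 1
    return counts


def split_by_period(ciphertext, period):
    """Sort the letters into groups enciphered with the same alphabet."""
    return [ciphertext[i::period] for i in range(period)]
```

Candidate periods that divide most of the distances are the likely keyword lengths; `split_by_period` then gives the separate groups on which frequency counts are made.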

For the other two methods, elementary cryptanalysis only allows solution for normal (or at least known) alphabets.

The progressive key case can be made to yield its period if one looks not for repeated letters, but for repeated

distances in the alphabet between adjacent letters; this subtracts out the slow movement of the keyword through the

alphabet. The autokey can basically be solved by brute-force trial on the length of its starting keyword. Of course,

these systems can still be solved with mixed alphabets, but more advanced methods are needed, involving statistics

or multiple messages with the same key.
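The remark about repeated distances for the progressive key can be checked with a small sketch. It assumes normal alphabets and a keyword that advances by one position per repetition; the names and the sample text are hypothetical.

```python
import string

ALPHABET = string.ascii_uppercase


def progressive_encipher(message, keyword):
    """Vigenere addition with a keyword that advances one position on each repetition."""
    n = len(keyword)
    out = []
    for i, p in enumerate(message):
        k = ALPHABET.index(keyword[i % n]) + i // n
        out.append(ALPHABET[(ALPHABET.index(p) + k) % 26])
    return "".join(out)


def adjacent_differences(text):
    """Distance in the alphabet, modulo 26, from each letter to the next one."""
    values = [ALPHABET.index(c) for c in text]
    return [(b - a) % 26 for a, b in zip(values, values[1:])]


if __name__ == "__main__":
    # ATTACK occurs twice, nine letters apart (a multiple of the keyword length 3).
    cipher = progressive_encipher("ATTACKXXXATTACK", "KEY")
    print(cipher)
    print(adjacent_differences(cipher))
```

The two occurrences of the repeated word are enciphered differently, because the keyword has moved on, but the differences between adjacent cipher letters repeat, so a Kasiski-style search on the differences can still reveal the period.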

In addition to using mixed alphabets for greater security, there are other systems of historical importance.

The Gronsfeld, which added a numeric key to the plaintext, meant that there were only ten possible equivalents for

each letter, but was easier to do by hand without a table or slide or disk. The Porta system used a smaller table; the

first half of the alphabet was stationary while the second half moved, and equivalents for letters in each half of the

alphabet were found in the other half.
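Both systems can be sketched in a few lines. The Porta function below assumes the usual arrangement for the 26-letter alphabet, in which key letters are taken in pairs (AB, CD, ...) and each pair moves the second half of the alphabet one further position; the function names are illustrative.

```python
import string

ALPHABET = string.ascii_uppercase


def gronsfeld(message, digits):
    """Add a repeating numeric key (a string of digits 0-9) to the message, modulo 26."""
    return "".join(
        ALPHABET[(ALPHABET.index(p) + int(digits[i % len(digits)])) % 26]
        for i, p in enumerate(message)
    )


def porta_letter(letter, key_letter):
    """Porta: letters A-M map into N-Z and back, with the shift set by the key pair."""
    k = ALPHABET.index(key_letter) // 2      # AB -> 0, CD -> 1, ..., YZ -> 12
    i = ALPHABET.index(letter)
    if i < 13:                               # first half goes to the second half
        return ALPHABET[13 + (i + k) % 13]
    return ALPHABET[(i - 13 - k) % 13]       # second half goes back to the first


if __name__ == "__main__":
    print(gronsfeld("ABC", "123"))
    print(sorted({gronsfeld("E", str(d)) for d in range(10)}))
    print(sorted({porta_letter("E", k) for k in ALPHABET}))
```

The two sets printed at the end show the restriction directly: a given plaintext letter has only ten possible equivalents under the Gronsfeld and only thirteen under the Porta.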

The table for the Porta system (converted to the modern 26-letter alphabet) is as follows:

Key|ABCDEFGHIJKLM
---+-------------
AB |NOPQRSTUVWXYZ
CD |OPQRSTUVWXYZN
EF |PQRSTUVWXYZNO
GH |QRSTUVWXYZNOP
IJ |RSTUVWXYZNOPQ
KL |STUVWXYZNOPQR
MN |TUVWXYZNOPQRS
OP |UVWXYZNOPQRST
QR |VWXYZNOPQRSTU
ST |WXYZNOPQRSTUV
UV |XYZNOPQRSTUVW
WX |YZNOPQRSTUVWX
YZ |ZNOPQRSTUVWXY

The Gronsfeld, and, even more easily, the Porta, can be attacked through the weakness that they allow only some
letters, but not others, as equivalents for any given plaintext letter.

In attempting to devise a cipher that, like the Gronsfeld, lends itself to mental arithmetic, I used (for the English
alphabet) the method of representing numbers as letters that was used by the ancient Hebrews and the ancient Greeks:

A 1 J 10 S 100

B 2 K 20 T 200

C 3 L 30 U 300

D 4 M 40 V 400

E 5 N 50 W 500

F 6 O 60 X 600

G 7 P 70 Y 700

H 8 Q 80 Z 800

I 9 R 90

Then, the rule for encipherment is this:

a) If the plaintext and key letters are in the same column, they are added:

B (2) + F (6) = H (8)

L (30) + J (10) = M (40)

b) If the plaintext and key letters are in two different columns, their nonzero digits are added, and the letter with
that sum as its nonzero digit is taken from the third column, the one which contains neither key nor plaintext:

D (4) + L (30) = Y (700)

W (500) + K (20) = G (7)

If we had a 27-letter alphabet, we would only have to add that when the sum is greater than 9, subtract 9 (in the

appropriate digit place):

M (40) + Q (80) = L (30)

For the 26-letter alphabet, it's easy to modify rule (a): if the two letters

are in the third column, subtract 800 instead of 900.

U (300) + Y (700) = T (200)

Rule (b) is modified in this way: always subtract 9 when the sum is greater than 9, even when the result falls in
the third column; if the plaintext letter and the key letter produce 900 as the result, use instead the letter that
would be produced by enciphering a letter with the value 900 with the key letter. Since there is no letter with that
value, when 900 is produced by deciphering, decipher 900 with the key to get the true plaintext letter.
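The rules for the 26-letter alphabet can be sketched as follows. This is an interpretation of the rules as stated above, with hypothetical function names; each letter is treated as a column (units, tens, hundreds) and a nonzero digit.

```python
import string

ALPHABET = string.ascii_uppercase


def to_column_digit(letter):
    """A-I are units 1-9 (column 0), J-R tens 1-9 (column 1), S-Z hundreds 1-8 (column 2)."""
    i = ALPHABET.index(letter)
    return i // 9, i % 9 + 1


def from_column_digit(column, digit):
    return ALPHABET[column * 9 + digit - 1]


def encipher(plain, key):
    pc, pd = to_column_digit(plain)
    kc, kd = to_column_digit(key)
    total = pd + kd
    if pc == kc:
        # Rule (a): same column; the third column wraps at 800 instead of 900.
        limit = 8 if pc == 2 else 9
        if total > limit:
            total -= limit
        return from_column_digit(pc, total)
    # Rule (b): different columns; the result goes in the remaining column.
    if total > 9:
        total -= 9
    third = 3 - pc - kc
    if third == 2 and total == 9:
        # 900 has no letter: encipher the value 900 with the key instead.
        # That gives 9 + kd - 9 = kd, placed in the column holding neither
        # 900 nor the key, which is the plaintext letter's own column.
        return from_column_digit(pc, kd)
    return from_column_digit(third, total)


if __name__ == "__main__":
    for p, k in ("BF", "LJ", "DL", "WK", "MQ", "UY"):
        print(p, "+", k, "=", encipher(p, k))
```

The six pairs in the usage lines are the worked examples given with the rules.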

This produces the following table, with the key letter at the left of each row:

  Plaintext
  ABCDEFGHI JKLMNOPQR STUVWXYZ
  ----------------------------
A|BCDEFGHIA TUVWXYZJS KLMNOPQR
B|CDEFGHIAB UVWXYZKST LMNOPQRJ
C|DEFGHIABC VWXYZLSTU MNOPQRJK
D|EFGHIABCD WXYZMSTUV NOPQRJKL
E|FGHIABCDE XYZNSTUVW OPQRJKLM
F|GHIABCDEF YZOSTUVWX PQRJKLMN
G|HIABCDEFG ZPSTUVWXY QRJKLMNO
H|IABCDEFGH QSTUVWXYZ RJKLMNOP
I|ABCDEFGHI STUVWXYZR JKLMNOPQ
J|TUVWXYZAS KLMNOPQRJ BCDEFGHI
K|UVWXYZBST LMNOPQRJK CDEFGHIA
L|VWXYZCSTU MNOPQRJKL DEFGHIAB
M|WXYZDSTUV NOPQRJKLM EFGHIABC
N|XYZESTUVW OPQRJKLMN FGHIABCD
O|YZFSTUVWX PQRJKLMNO GHIABCDE
P|ZGSTUVWXY QRJKLMNOP HIABCDEF
Q|HSTUVWXYZ RJKLMNOPQ IABCDEFG
R|STUVWXYZI JKLMNOPQR ABCDEFGH
S|KLMNOPQRJ BCDEFGHIA TUVWXYZS
T|LMNOPQRJK CDEFGHIAB UVWXYZST
U|MNOPQRJKL DEFGHIABC VWXYZSTU
V|NOPQRJKLM EFGHIABCD WXYZSTUV
W|OPQRJKLMN FGHIABCDE XYZSTUVW
X|PQRJKLMNO GHIABCDEF YZSTUVWX
Y|QRJKLMNOP HIABCDEFG ZSTUVWXY
Z|RJKLMNOPQ IABCDEFGH STUVWXYZ

This table is slightly imperfect. For each of the first eighteen letters in the alphabet, when they occur in the
plaintext, there is one letter that no key letter will cause to be its ciphertext equivalent, and there is another
letter that will be that plaintext letter's ciphertext equivalent for two different key letters. However, although
imperfect, it is less so than the Gronsfeld cipher, and so the system might be of some use (although just converting
to digits with a straddling checkerboard achieves the same goal, of simplifying applying a key, without any
imperfections, and considerably more simply).

It is, however, more important to recognize the names of two other systems. If Vigenère can be thought of as
plaintext + key = cipher, Beaufort amounts to key - plaintext = cipher. Since it follows that key - cipher =
plaintext, Beaufort, like Porta, is reciprocal: exactly the same steps will both encipher and decipher. Variant
Beaufort is plaintext - key = cipher, and is the same as deciphering for Vigenère.
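In terms of modulo-26 arithmetic on single letters, the three systems differ only in the sign of each term. The sketch below uses illustrative function names and the A = 0 ... Z = 25 convention.

```python
import string

ALPHABET = string.ascii_uppercase


def _combine(a, b, sign_a, sign_b):
    """Return the letter at sign_a * a + sign_b * b, modulo 26."""
    return ALPHABET[(sign_a * ALPHABET.index(a) + sign_b * ALPHABET.index(b)) % 26]


def vigenere(plain, key):
    return _combine(plain, key, 1, 1)        # plaintext + key = cipher


def beaufort(plain, key):
    return _combine(plain, key, -1, 1)       # key - plaintext = cipher


def variant_beaufort(plain, key):
    return _combine(plain, key, 1, -1)       # plaintext - key = cipher


if __name__ == "__main__":
    # Beaufort applied twice with the same key restores the letter.
    print(all(beaufort(beaufort(p, k), k) == p for p in ALPHABET for k in ALPHABET))
    # Variant Beaufort undoes Vigenere.
    print(all(variant_beaufort(vigenere(p, k), k) == p for p in ALPHABET for k in ALPHABET))
```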

Slides and disks are often used for the Vigenère and other polyalphabetic ciphers, particularly mixed-alphabet
Vigenère.

Helen Fouché Gaines' Elementary Cryptanalysis gives a classification of mixed alphabet slides into four types:

Type 1: Mixed plaintext alphabet, plain cipher alphabet.

Type 2: Plain plaintext alphabet, mixed cipher alphabet.

Type 3: The same mixed alphabet for plain and cipher.

Type 4: Different mixed plain and cipher alphabets.

I would like to extend this classification slightly to make it comprehensive:

Type 0: Plain plaintext and cipher alphabet.

Type 0a: Plain plaintext alphabet, reversed cipher alphabet.

Type 1: Mixed plaintext alphabet, plain cipher alphabet.

Type 2: Plain plaintext alphabet, mixed cipher alphabet.

Type 3: The same mixed alphabet for plain and cipher.

Type 3a: Mixed plaintext alphabet, the same alphabet in reverse for cipher.

Type 4: Different mixed plain and cipher alphabets.

A slide of type 0a produces a reciprocal cipher, and can be used for Beaufort. The mechanical equivalent of such a

slide is an element of a Hagelin machine.
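A slide can be modelled as two strings and an offset. The sketch below rests on an assumed convention (the offset is the number of positions the cipher alphabet is moved against the plaintext alphabet), and the helper names and the keyword are hypothetical.

```python
import string

PLAIN = string.ascii_uppercase


def keyword_alphabet(keyword):
    """A simple mixed alphabet: the keyword's letters without repeats, then the rest in order."""
    return "".join(dict.fromkeys(keyword + PLAIN))


def slide_encipher(letter, offset, plain_alphabet, cipher_alphabet):
    """Find the letter on the plaintext strip and read the cipher strip `offset` places along."""
    i = plain_alphabet.index(letter)
    return cipher_alphabet[(i + offset) % len(cipher_alphabet)]


if __name__ == "__main__":
    mixed = keyword_alphabet("SIAMESE")
    slides = {
        "0": (PLAIN, PLAIN),
        "0a": (PLAIN, PLAIN[::-1]),
        "3": (mixed, mixed),
        "3a": (mixed, mixed[::-1]),
    }
    for name, (p_alpha, c_alpha) in slides.items():
        reciprocal = all(
            slide_encipher(slide_encipher(x, s, p_alpha, c_alpha), s, p_alpha, c_alpha) == x
            for x in PLAIN for s in range(26)
        )
        print("Type", name, "reciprocal:", reciprocal)
```

Under this convention the reversed-alphabet slides (types 0a and 3a) decipher with the same setting that enciphered, while the Type 0 slide is ordinary Vigenère addition.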

The Type 1 slide is more easily cryptanalyzed than the Type 2 or above slides. Once the period of the cipher has
been discovered by the Kasiski method (looking for repeated digrams, trigrams and longer sequences, noting the
distances between them, and looking for a factor common to most of the distances, giving greater weight to longer
repetitions), the letters can be sorted into the different alphabets, and the frequency counts can be lined up, since
they are displaced along the cipher slide, which in this case has the known regular alphabet along it.

Even in the mixed-alphabet case, once the period is found, letter frequencies and bigram frequencies can be used to
read the message. For a frequent letter, noting whether only a few letters or a wide variety of letters appear before
or after it helps to identify whether it is a vowel or a consonant, or to determine exactly which letter it is.

When some alphabets are partly reconstructed, if you know that the alphabets have been produced by a slide, even

one with two mixed alphabets, there are certain logical inferences that you may be able to make that w
