Index Coincidence

Embed Size (px)

Citation preview

  • 7/30/2019 Index Coincidence

    1/101

    IA CRYPTOGRAPHIC SERIES

    THE INDEX OF COINCIDENCEAND ITS APPLICATIONS

    IN CRYPTANALYSIS

    byWilliam F. Friedman

    FromAegeon Pork Press

  • 7/30/2019 Index Coincidence

    2/101

    THE INDEX OF COINCIDENCEAND ITS APPLICATIONS

    IN CRYPTANALYSIS

    byWilliam F. Friedman

    M i D D L E B U R Y COLLEGE LIBRAS*

  • 7/30/2019 Index Coincidence

    3/101

    1987 AEGEAN PARK PRESS

    ISBN: 0-89412-137-5ISBN: 0-89412-138-3

    (soft cover)

    (library bound)

    AEGEAN PARK PRESSP.O. Box 2837Laguna Hills, California 92654(714) 586-8811

    Manufactured in the United States of America

  • 7/30/2019 Index Coincidence

    4/101

    TABLE OF CONTENTS

    Introduction

    Part IThe Vogel Quintuple Disk

    Part II

    The Schneider Cipher 65

    A Special Problem 89

  • 7/30/2019 Index Coincidence

    5/101

    THE INDEX OF COINCIDENCE AND ITS APPLICATIONS IN C R Y P TA N A LY S ISINTRODUCTION

    Frequency tables in the analysis and solution of ciphers have commonly been employedto make assumptions of plain-text equivalents for the cipher letters constituting a message.The significance of the various phases of the curves themselves, i.e., the crests and troughsand their relative positions in such frequency tables, has been recognized to some extent, butlargely only in connection with the determination of two more or less preliminary points intheir analysis: (1) whether the frequency distribution approximates that of a substitutioncipher involving only one alphabet or more than one alphabet; (2) whether this approximationcorresponds to that of a standard alphabet, direct or reversed, or that of a mixed alphabet.

    It will be shown in this paper that the frequency tables of certain types ofciphers havedefinite characteristics of a mathematical or rather statistical nature, approaching more or lessclosely those of ordinary statistical curves. These characteristics may be used in the solutionof such ciphers to the exclusion of any analysis of the frequencies of individual letters consti-tuting the tables or curves, and without any assumptions whatever of plain-text values Jo r thecipher letters.

    It is true that cipher systems admitting of such treatment are not very commonly encoun-tered. But inasmuch as such systems are always of a complex nature, which the ordinarymethods of cryptanalysis would find rather baffling, a description of a purely mathematicalanalysis that may be applied to other cases similar to the ones herein described may be con-sidered valuable. In fact, it is possible that the principles to be set forth may find con-siderably wider application in other phases of cryptanalysis than is apparent at this time.

    Two examples ofsuch a treatment will be given in detail: One dealing with a substitutioncipher wherein a series of messages employing as many as 125 random mixed secondary alpha-bets can be solved without assuming a plain-text value for a single cipher letter; the other amultiple alphabet, combined substitution-transposition cipher, solved from a single messageof fair length.

    (1)

  • 7/30/2019 Index Coincidence

    6/101

    P A R T ITHE VOGEL QUINTUPLE DISK CIPHER '

    This cipher system involves the use of J i v e superimposed disks bearing dissimilar randommixed alphabets. These disks are mounted upon a circular base plate, the periphery of whichis divided into 26 segments; one of these is marked "Plain", indicating the segment in linewith which the successive letters of the plain text as found on the five revolving disks are tobe brought for encipherment. The remaining 25 segments of the base plate bear the numbersfrom 1 to 25 in a mixed sequence, which we have called the "Numerical key." This key may ,however, consist of less than 25 numb ers, in which case one or more of the segments of the baseplate will remain blank. The numbers constituting the key are written on the base plate in uclockwise direction beginning immediately at the right of the plain segment (fig. 1 ) .

    M E T H O D O F E N C I P H E H M E N TIn the accompanying example illustrating the details of encipherment it will be seen that the

    numerical key consists of 21 numbers, leaving blank, therefore, the four segments immediatelypreceding the plain segment.2 A ssum in g a series of messages, let us suppose the first threebegin as follows:

    1. Prepare for bombardment at Harvey . . .2. Enemy attack on Hunterstown . . .3. Second Field Artillery Brigade . . .

    Revolving the five cipher disks successively, and thus bringing the first 5 letters of message1, P R E P A , in line with the plain segment, reading from the outer disk inward in the order 1-2-3-4-5, the cipher letters for this first set of 5 plain-text letters are then taken in th e same order fromthe segments of the disks directly in line wi th tha t segment ofthe base plate that bears the numb er 1.In this case it is the eighteenth segment after the plain, in a clockwise direction, a n d , as shownin figure 1, the equ iv a l en t cipher letters for this group are M E K J R . The second set ofl i v e plain-text letters of message 1, R E F O R , are then in a similar manner set in line under the plain segment,and their equivalent cipher letters ar e taken from the segment immediately following segment1 of the numerical key, in a clockwise directio n, vi z, segment 13. The cipher lette rs in this caseare V Z Q W H . The third group ofletters in message 1 finds its cipher equivalents a t segment

    1While on duty in the Code and Cipher Section of the I n t e l l i g e n c e Division of th e G e n e r al S t u f f , G . H . Q . ,A.E.F., Lt. Col. F. Moorman, Chief of Section, turned over to the writer for s t u d y a cipher system t o g e t h e r

    with a series of 26 test messages submitted by Mr. E. J. Vogel, Chief Clerk, w h o ha d taken considerable interestin cryptography and had, as a result of his studies, d e v i se d the system presented for e x a m i n a t i o n . Th e w r i t e rworked upon the cipher during his leisure moments, but the problem involved considerable labor and so l u t i o nw as not completed before being relieved from duty at t h a t statio n. The main p r i n c i p l e s for solut ion, h o w e v e r ,were established and only the detailed work remained to be completed. A f t e r m i interval of more than a year,while Director of the Cipher Department of the R i v e r b an k Laboratories, G e n e v a , 1 1 1 . , the writer t u r n e d hi sattention once more to this cipher and succeeded in completely solving th e problem by carrying out tho s eprinciples to their logical conclusion. It is recommended that the reader prepare a duplica te oft h e set of disks in o r d e r tha t he may m o r e readi lyfollow the v a r i o u s steps in the analysis.

    (1!)

  • 7/30/2019 Index Coincidence

    7/101

    18; the f o u rt h , at segment 8. The fifth group of plain-text letters, however, will take itscipher equivalents from the first segment to the right of the plain segment, inasmuch as thesegments immediately following segment 8 are blank. Plain-text letters T A T H A , therefore, willbe enciphered on segment, 9, becoming XONJE. This method is continued in like manner through-o u t message 1. Jf message 1 contains more than 21 groups of5 letters, the twenty-second groupwill take its cipher equivalents on segment 1 again; the twenty-third.on segment 13, and so on.

    F I G U R E 1

    Proceeding now to message 2, the first 5 plain-text letters, E N E M Y , are set up under the plainsegment an d the cipher letters ar e t a k e n fro m segment 2, becoming LTVBM. The second groupof 5 letters is enciphered on segment 1 , the third group on segment 13, and so on, throughoutth e message. The first 5 let ter s of message 3, SECON, are enciphered on segment 3, becomingY A P A C . The first group of cipher letters in n message is always to be taken from the segmentbearing the number which coincides with the serial or accession number of the message in theday's activity.

    It will be seen, ther efor e, th at no mat ter wh at repetitions occur in the plain-text beginningsof messages, the cipher letters will give no evidences ofsuch repetitions, for each message has adifferent start ing point, determined, ns said before, by the serial number of the message in the

  • 7/30/2019 Index Coincidence

    8/101

    4

    day's activity. This automatic prevention of initial repetitions in the cipher text is true, how-ever, only of a number ofmessages equal to the length of the numerical key; for, with a number ofmessages greater than the length of the key, the initial segments from which the cipher lettersare to be taken begin to repeat. In this case, message 22 would necessarily have the same initialpoint as message 1; message 23, the same as message 2, and so on. Messages 43, 64, 85, . . . ,would all begin with segment 1; messages 44, 65, 86, . . . with segment 2, and so on.

    The secrecy of the system is dependent upon a frequent change of alphabets and a still morefrequent change of numerical key, since it must be assumed, as with all cipher systems or devices,that sooner or later the general method of encipherment will become known to the enemy.The only reliance, therefore, for the safety of the messages must be placed in keeping the specificalphabets and the numerical key for a given series of messages from the enemy.

    P R I N C I P L E S O F S O L U T I O NThe following messages 'are assumed to have been intercepted within one day and therefore

    to be in the same alphabets and key:M E S S A G E No. 1

    MLVXK QNXVD GIRIE IMNEE FEXVP HPVZR UKSEK MVQCI VXSFWGVART YBZKJ WVUPV XZCBD BDOLS GHINZ LJCTE KSLPY VPBYDWTRJK BDDFA ANJXE XGHED ERYVP YPWDJ DFTJV ZHTWB WXTMFOZDOJ

    MESSAGENo. 2ULJCY GXAEU DTEIL UZBRW GJZSS QLUOX PTFOO NWSHD BPTJOHQRYY YAXRZ KTEMP UAYMK ISRDZ VUVKW HXAYD YAGSM CURBZLBXOV EBBPI BMLCB UMAXF ZSLXV QFXUE MPZMK MQZZT KMURWEJVB MESSAGE No. 3YLMKW CBGSF VGABP HOZFV QQNSQ NQLQL DIGXM XCWAI QFJOQTYDEL MBMJB SEPSO DHREM ELKIP KXNMW QYBIH BHFDC GLYWCYGMMP EEXZH UBBSB SBONG URQKW YRAYU NYUCS LNEMV VNSXNWGVME MXPDF WGTZE KRLGU ZJFZJ W

    MESSAGENo. 4UFHUJ LTMKJ PONFG RIUGG OZGWS UBNMW WGILB JNXTD BPREXMMWHB OBFVO TFGSJ SLXEH RTZMI LLUUX FIFWC PGSBA KRCASXWKQV SLDKS NTESD QVQBN RZDMB JQLYH LTMXB BSVWL ILKYUNMFEB HB

    MESSAGENo.5VJZCQ KZJJK DCVQD KSSXY TSUXE GRROH PXKZF ZKFMS VGDWUXLTBW EXHEF AWQWF ZESMX GWCEM JPNVB GRJGB IBWOH YMOAPIZYNX GXMSB OZZGK FVURN NJFGQ DTPLV STOID DWVLR TXTBHWNWIE YJXXW BKOJQ FOUHO T

    J These are the actual messages submitted by Mr. Vogel.

  • 7/30/2019 Index Coincidence

    9/101

    MESSAGE No. 6

    XGORF GCHAX DUEIQ XAOWK BKBUH SKWWW XNEYZIWTVU NPCLU KQDIG YLMNC KYJJF SDSTU DCBUKVGQSW OGHPI XCYIO SUAEU BQAPY RMDMW FWGSQLNGQT IPJRI HUMAD DZUTW BFMO

    MESSAGE No. 7UFUCL HJYDY ZTHHA NWJWJ IFMZI VZIJE QODUTDKXXR TOSVL MNHNO RZRTX QIPFN HOUNL FGUVORTDCO LNTTM MXMXR TUIIG ZOJCU BTXJK KGUEJIKGSL APA MESSAGE No. 8NWWVY UMHVB RPQHF XOHQN IPATI CFMZT DIMQIVTEGJ IAFEO BMUUT VPSKO HUYNA VPRXS SCUZBGXHRP OIHWF LBTKF QESIO YXVNK AWDAA EQLVJKLXHR NCVNY XSQMC XVXJC TJVSC UJEGZ TFONC

    MESSAGE No. 9NYMPE YZYRZ AWLPP IMPBB VWAXZ QSVFG ITZMTNNCMP SOJKI GDPQZ IIPMV ZOCBO UCKXR SEPEMNDEUH YUEIP RFOHI QLQHG IJFRT UTNQC JEGAFYKXXQ VRQUW MESSAGE No. 10TEPDA XXHHC FYMFK QRBJV YVJID JBNXF JBLXUUFRXL WELQJ QJKFW RLSSF BQJWR BZKYN EAUWPCQRVW ZVXXH NZHCW SVEYH NEANW G

    MESSAGE No. 1 1SEYBZ MGSOZ CMPSQ BASFH VFSCG CHKSB ZOPRZLGRXO XXCKK MVQGJ XYUOC FFFVJ OZCJE AQSJYKEXYA VWONX GTCWT TGLOI IWORT JJVQE HYNK

    MESSAGE No. 12OMTXX WUXZE YEHOJ ALJCO EPLPJ RBCVX WARWXEUJG ROHUP OFSGQ PONLW RAIBP KACIB GMYSBWKZTE KZYIZ ZXFJO HZIVG ESGOX YYCEO B

    MESSAGENo. 13QNKIT FZHNC GJBSA JIQBX PFTAO RJLUD IKPKILJUFZ UBSMD VNMNZ UWVYQ DJPFY KETMN CLEQSJBCPG RNNNH BSYZR STGBV E

    JAKUK BEQMGOQUVA WHSXEHYMRQ LPUVN

    MZPRX SWNFCTPOWI VHNCQMSJBS ESVWJ

    IRUJI NSBWRJHCDE WUHJVMYYTH GSEYAHNPM

    KZNFC DHSUAOTZNM AMHWXSQRWO DAJMT

    KSOXR KNHJKAYSKO UWJDX

    CZEVH OCADCWZCMO TOXVM

    DYBDQ FTZDAVKHEU XSPFG

    TNUWU EVGHUDQXNA GEYHC

  • 7/30/2019 Index Coincidence

    10/101

    6

    ZJRKK CLYZK RINOKYVWFL DXZSK ZPWNIKKGEF FJWWR ZXROXZMKMI D

    OSULC WZAFW, SQAUCQQDCX CBIMT DGIASUUHTF FOQOK KNRPIXDCAS BEHQV OPWEY

    VHXVD RSRYI PVNWQRRQDM STTAL GKPPODIPSG ZUFVH OZIGB

    UMMOL HVVEQ GWTJIRAABX ZCVWR UUJEGCJEDW AKHZA SDZIE

    MESSAGENo. 14NBTAK NOBBI WRZRX DQTAR AEKMY MOXLTJUULI TPUJR TPQGH RJZCQ XCNHU NOKMIPZUGG HMYNB DUHQZ BJDFA FJAYU RKHKB

    MESSAGE No. 15

    WPRBI WCFUO JKQHYBAQRK RUHHF JYUJGPLLXQ DRDEM JMCFBUTISE NZFCU BXI

    MESSAGE No. 16QEDMJ LTKFN RMDGTPZTCU BNCEE HAWMBTQXND HMHEW VHTLT

    MESSAGE No. 17PQTNS AEFZH TOFQHNMJHQ CHZUM TTSIUHIHWG LBUFZ DVPUB

    OIWVX KSMSX MBXOZDYNMA CWULT PKWEEWLHVX FIFZR STVJV

    DNMBM JDGVP KJMNDKRGNU ZXCWQ VEZTC

    OOXNQ NPDDT QVSJR

    GHELW HUMFH LZHNHDCMXO NAYEY GQHID

    MESSAGE No. 18

    LALNH QUDUA ZBZUD VSJFE MEHXW EUWZT OKWNO OOSIL TASEGOVQVX PMKKW BQRBI VGDGJ JDAHW RZDJA WQAXB FBRLA AHJEPPUMEU HJQJP ZSPPQ VZWDL HECDA LPJJS ZOJYB MO

    MESSAGE No. 19

    YTVUL TWEVD MBHKV IHTPI GNXBQ XAUAQ OUFVO GSMKB BAKIGYRNAF BKIJC ZJSRN WBQHM UYJPT GHCCB RLNVH OLDQA ZZDCVUWMNZ OPRFG RDONY RCZAM ZYNVQ WFONZ CITES IRWER GKETAYUSUK TFECD BMQVB VBWVV VWCZP TWCTJ FHFEH VNDCO MZVLKYUJPZ BHLDY VVPMD KFHPB VCYU

    MESSAGE No. 20

    OBHOK UWRON AJDFH FRQMI ULOTG XXIEV HMAKV PVMAV OITKDLDQIW UYVWI JEJRQ MCUZP KGUDN QSPBF TQVZP IZJTU RBUFZJUSIZ DCIGX QESJD LIZVM AOPME YNEXI HOXJK KUYHK AORUYLVD

    MESSAGE No. 21

    OMXOW LTWEW KFJHN ZMSKK GXFDL, YLGPT YYYNQ CYYMA ZWRADHGWBI HHCGM FDGMN XIQDN NPLYQ NPJNZ OWFVV KMGVH KHCUCSTNVZ CGXHV LZZXX SVTKG POEAC OJYQU MEULH KHYDD ETPDNQYZVU HIMGG RBHEW CUSO

  • 7/30/2019 Index Coincidence

    11/101

    MESSAGE No.SFDFMYJOUN

    CXEUFJJMIV

    AQMQWAEGQG

    UKCTBJFCEU

    FFONYHTZPB

    BETEX

    QFPMO

    GVFZHYRNEIZSLRK

    TFSOQ

    JTPSY

    BWIVZQFXUVXKESC

    MESSAGE No.LHKGPXUWVELCSMIDZAUO

    KUKNYYZAJE

    YVSWWW

    WJROYNFDEX

    DQNAP

    XZHPK

    GDZGA

    KFPAO

    JJIHUJBZET

    FIQTRMESSAGE No.

    OMOOK

    YJODO

    WFKKZ

    OKWHRXIRIXPXXZL

    MPXQIKDCBX

    XFRWL

    PIQMNUEQTUFHTEG

    ALWNKVPIWAINNJI

    MESSAGE No.FPETJ

    YXDKRGWILMFJYAZ

    CDVNYGWMQLFGCVS

    YQNVO

    LMKQUFQMEJNCZGB

    WSMSS

    CDALXKOBMSHUIRTCCMFY

    FIYGRZURANXUWAIKVKRA

    MESSAGE No.

    XFYKUVDCYOZAECE

    TRJTG

    VWRIHESOWP

    DLZCS

    HQWVZYQXGD

    DJBCW

    VEASF

    XRLCH

    NDBFZGYZLIXQDYBPZJCMMPMHGMJQECEMVZLXZUAB

    SECCE

    IHSAMTZCWL

    IOKZO

    KNDMFTRURCFDBPB

    LSFXW

    DURHGGGJBFGAMFQGLHMHCIJTIRHCMJAUHKAEPLQU

    GBVIJGBIWLSEBMAANHJOIJKIRSGMFCULGXCZFTIQBUILZRAQBJ

    MSTZZ

    XJZKI

    TKNDA

    VEKYGSRUEUUZHRQUFRNA

    TQRGTWIZWKOTJEP

    ZUAPV

    IWORYDDQYB

    JNSSI

    22

    IAONRJNPXG

    CLRILCJCFI23

    LUVBUXYHALKQQIE24

    HZIKQNFWOH

    ETVUP25

    HQMIPULZFE

    DGAMLY

    26

    LGPKS

    YOOHV

    RUWKEULXFUAPKQPJODEE

    IIZYYVZXCL

    QRYXB

    OISIERCMUSHCHXX

    TWSZJSXXSZ

    LHIGGBN

    ZRUDA

    DGECA

    HHYCA

    XQNUYGWXXYQHTJY

    FAONRYDLOT

    QBTFK

    CWPSK

    IAUNO

    RIYITXSINCKAXUF

    XHRXEHTAVV

    YFYOE

    VHMWLRESKE

    UXWQX

    ZMZKW

    OJUGA

    HVRUCFVDSQ

    XXHRFCDPYMGHEBF

    GQZNG

    EKBMG

    OP

    QCAVJ

    UZMJMVYIGT

    BSNWHLPSJJ

    BDWVG

    MVKNTCACVC

    KIOZGOTTTO

    LSIJULPXGC

    PAUJD

    WDPDB

    PVUGD

    HOGXWQEHQRRJCSH

    MOAFT

    BTMYKPFTYF

    DFTFO

    MGETD

    MFIACSETNP

    FLUVK

    LTTXNUTNIEMZNGB

    ZWXXQ

    LNICDDLRZVRFTHL

    SYBMC

    RLETISMQVBBYGMQ

    QCVQ

    It may be of advantage to begin the elucidation of{he principles of solution by t r a n s l a t i n gthis cipher into terms of the sliding of primary alphabets ag ain st one another with the con-sequent production of a multiplicity ofsecondary alphabets. For example, by using ordinarysliding alphabets such as are commonly used in cryptanalysis, we may produce the sameresults as are given by the set of concentric disks. Let us use the alphabets of the illus-trative disks, mounted upon sliding strips in pairs, and let us slide each pair of alphabets 8letters apart. Thus, if we consider the upper one ofeach pair of alphabets in figure 2 as theplain-text alphabet and begin each alphabet arbitrarily with the letter A, w e have the f o l l o w i n g :

  • 7/30/2019 Index Coincidence

    12/101

    [Plain text[C iph er

    F I G U R E 2A U F Q Z E R H Y G W J O I D M N C T X S B L P K V A U F Q Z E R H Y G W J O I D

    A U F Q Z E R H Y G W J O I D M N C T X S B L P K V

    fP lain textCipher

    A O X G F Z M Y L P U K E T D J V S W I R B N H C Q A O X G F Z M Y L P U K E T DA O X G F Z M Y L P U K E T D J V S W I R B N H C Q

    [Plain text

    Cipher

    A W J B Q I H V K P U F O G T N E D S Z X C M L R Y A W J B Q I H V K P U F O G TA f f J B Q I H V K P U F O G T N E D S Z X C M L R Y oo

    [Plain text

    Cipher

    A C V X Z f f M T F N U I O S E H J R D L G K Q B P Y A C V X Z W M T F N U I O S EA C V X Z W M T F N U I O S E H J R D L G K Q B P Y

    fPlain text5|Cipheer

    A E T J U D M V Z N W H X O G Y K F R L Q B P C I S A E T J U D M V Z N W H X O GA E T J U D M V Z N f f H X O G Y K F R L Q B P C I S

  • 7/30/2019 Index Coincidence

    13/101

    Note now that the first set of 5 plain-text letters, P R E P A , yields the same set of 5 cipherletters, M E K J R , that we found on page 2 by using the disks. The only thing which these fivepairs of independent sliding alphabets have in common in figure 2 is the fact that each pair hasbeen slid apart the same number of letters, viz, 8; if we consider the upper alphabet in each pairas the stationary alphabet, then the lower one has been shifted 8 intervals to the right, or 18intervals to the left, of the upper alphabet. This corresponds to the position of number 1 infigure 1, for the latter number occupies the eighteenth segment to the right of the plain segment,or the eighth to the left. The relative positions of the numbers in the numerical key, therefore,correspond to the numbers of intervals the primary alphabets in the form ofsliding strips wouldhave to be displaced in order to produce the same results as the disks.

    Now the sliding against itself of a primary sequence containing 26 letters will give rise toa series of 25 secondary cipher alphabets;' likewise, each primary concentric sequence willgive rise to a series of 25 secondary alphabets. If the numerical key consists of 25 numbers,all these secondary alphabets will be employed; if it consists of less than 25 numbers, then acorrespondingly decreased number ofsecondaries will be employed.

    Since each primary sequence can give rise to a setof 25 secondaries, the total number ofpossible secondary alphabets in the whole system is 126; but if the numerical key consists ofless than 25 numbers, then the total number ofsecondaries will be less than 125 by exact multi-ples of 5, since the absence of one or more numbers from the key affects all five primary concen-tric sequences. For example, if the key consists of 21 numbers, then there will be involved21X5, or 105 secondary alphabets. In a message of exactly 105 letters, then, each letter willbe enciphered by a different secondary alphabet. If the message contains more than 105 letters,then all the letters after the 105th will be enciphered by the same secondary alphabets as atthe beginning of the message and in the same sequence.

    In the explanation of the method ofencipherment it was made clear that the substitutionproceeds in a regular manner, taking successive groups of 5 letters; the cipher equivalents aretaken from the successive segments, proceeding in a clockwise direction from any given initialsegment. It follows, therefore, that in a single long message wherein the complete enciphermentrequires the passing through ofthis sequence ofsegments more than one time, there exist periodicor cyclic phenomena of a type found in various ciphers, due to the presence of a definite or regu-larcycle. In this case, the length of this cycle in terms of groups of 5 letters corresponds exactlywith the length of the numerical key; its length in terms of individual letters is five times thelength of the key. For the sake of clarity, we shall refer to this cycle when stated in terms ofletters as the period. Thus, with a key of 21 numbers, the length of the cycle is 2 1 groups, andthe length of the period is 105 letters. If a message consists of 315 letters, for example, theletters would pass through three complete cycles; the 1st, 106th, and 211th letters would beenciphered in exactly similar positions, and therefore by exactly the same secondary alphabet.The 2d, 107th, and 212th letters would likewise be enciphered by the same secondary alphabet,but of course not the same as the preceding secondary alphabet. With a key of 23 numbers,the length of the cycle is 23 groups, the length of the period, 115 letters; the 1st, 116th, and231st letters would be enciphered by the same secondary alphabet; the 2d, 117th and 232dletters by a different secondary, and so on. If we represent the length of the period by n,then the 1st, (n + l)th, ( 2 n - f - l ) t h , (3n-f l)th, . . . letters fall in the same secondary alphabet;the 2d, (n-f-2)th , (2n+2)th, (3n+2)th, . . . letters fall in another secondary alphabet; andso on. If a message be longer than the period, therefore, it will follow that the 1st, 2d,3d, . . . nth secondary alphabets must contain repetitions of cipher letters, representing

    1The twenty-sixth secondary alphabet coincides with the normal alphabet, since each plain-text letter

    would be represented by itself in that secondary alphabet.

  • 7/30/2019 Index Coincidence

    14/101

    10

    repetitions of plain-text letters, for these secondary alphabets, are after all only single mixedcipher alphabets, and the repetition of high-frequency letters in ordinary pla in text is a necessarycharacteristic of all alphabetica l languages. Such repetitions will be evidenced by repetitionsin the cipher text at n, 2n, 3n intervals, and they may be used to determine the length of th eperiod. Exactly how this is done will presently be demonstrated.

    But the determination of the length of the period is only a slight step forward in the analysis.It is true that it will give us the length of the numerical key, but that is all. What we mu st knownext is the sequence of numbers, or rather, the relative positions of the numbers in this key.

    We may ascertain this by fur ther scrutiny of the theoretical and actual results of the methodof encipherment. It is often the cae with various ciphers that the method of encipherment isexcellent in principle, and will yield practically indecipherable messages when th e messages ar cvery few in number, but the weaknesses in the method are quickly disclosed when it is used forregular traffic such as that necessary in military cryptography, where many messages arc to besent each day in the same key. In the cipher under examination, the weakness is introduced bythe fact tha t the initial segment for each message of the day's activity is determined by the serialnumber of the message.1 Now there are as many initial segments for each numerical key as thereare numbers in that key. Once the starting point is determined, all the messages pass through thesame cycle; diff eren t messages merely begin at differen t points in the cycle. Now, since thenumbers applying to these starting points constitute the sequence of numbers in the key, thesuccessive initial segments constitute a series or sequence which, when properly reconstructed;will give us the sequence of numbers in the key.

    After the numerical key has been reconstructed, we are yet a long way from solution, for w^are still confronted by the more complex problem of reconstructing, or solving, the cipheralphabets.

    We have so far analyzed the solution of the problem into the following three steps or phases;1. The determination of the length of the period.2. The reconstruction of the numerical key.3. The reconstruction of the cipher alphabets.

    Let us proceed, therefore, to perform each step.

    1. The determination of the length of the per io d . I t was explained above how the ciphersystem will result in the production of repetitions in the cipher text at definite intervals dependentupon the length of the period. The first, (n.-f-l)th, ( 2 n - f - l ) t h , . . .letters fall in the same second-ary alphabet; the second, (n + 2 ) t h , (2n + 2)th, . . . letters fall in another secondary alphabet;and so on. If there are repetitions in the plain text at n intervals apart, t here will be correspond-ing repetitions in the cipher text. There would be involved here only a slightly modi fied caseof the ordinary process of factoring the intervals between repetitions in the cipher text, as appliedin the solution of typical periodic multiple-alphabet ciphers. Thus, in this case, if it hap p e ns thatthe first, second, and third letters of a message, and also the (n+l)th, (n + 2 ) t h , and (n + 3 ) t h ,are the letters THE, then there must be a repetition of the initial trigraph of the cipher t e x t ,representing THE, at a distance ofn letters. Bu t in a cipher involving BO man y alphabets as thisone, th e repetition of trigraphs and polygraphs would natural ly be rather i n f r e q u e n t , except in uvery long message.

    However, the paucity of trigraphs and polygraphs, and even of digraphs, need n ot pro veto be a great obstacle, for the repetitions of individual letters may be used with groat accuracyfor the same purpose, viz, the determ inat ion of the length of the period. The m e t ho d is based

    1 However, were th e initial segments determined in some o t h e r m a n n e r, the final results would be t h e m ) me , andthe cipher could be so l v e d by a slight modification of method. Even if the init ial segments w e r e s u b j e c t to nolaw, the cipher c o u ld still be solved by the m e t h o d h e r e in a ft e r set f o r t h , with s o m e m o d i f i c a t i o n s .

  • 7/30/2019 Index Coincidence

    15/101

    XFYKU NDBFZFDBPB SEBMAUFRNA APKQPHTAW OTTTORLETI DJBCWIQKZO EPLQU

    u a o i tKNDMF GBVIJ TKNDASRUEU RUWKE RIYITKAXUF CACVC LNICDRFTHL HQWVZ XZUABIHSAM RHCMJ RAQBJXJZKI JNSSI HCHXX

    F I O D H E 3. M et sage 26 transcribed upon the attumption of a period of 100 Uttersa O J 5 4 0 4 4 O M < ) 0 7 0 7 5 8 0 8 5 g O S 1 0

    LGPKS CWPSK BSNHH LTTXN VDCYO GYZLI TRURC GBIHL VEKYG YOOHV IAUNO LPSJJ UTNIE ZAECE XQDYBDffVG MZNGB TRJTG PZJCM LSFXW ANHJO UZHRQ ULXFU XSINC MVKNT ZWXXQ VWRIH MPMHG DURHG IJKIESOWP MJQEC GGJBF SGUFC TQRGT JODEE XHRXE KIOZG DLRZV DLZCS EMVZL GAMFQ ULGXC WIZWK IIZYGLHMH ZFTIQ OTJEP VZXCL YFYOE LSIJU SYBMC YQXGD SECCE CIJTI BUILZ ZUAPV QRYXB VHMWL LPXGIWORY OISIE RESKE PAUJD SMQVB VEASF TZCWL AUHKA MSTZZ DDQYB RCMUS UXffQX WDPDB BYGMQ XRLCZMZKW PVUGD QCVQ

    Number of coincidences W W tf U W Ml If U [ H I III/

  • 7/30/2019 Index Coincidence

    16/101

    11

    upon the construction of what we have called a "Table of coincidence", which will show usmathematically the most probable length of the period. We may as well use the text ofo u rseries of messages to illustrate the process.

    If we assume the numerical key to consist of 20 numbers, then the length ofthe period wouldbe 10 0 letters. Let us write the longest message of our seriesviz, message 2 6 i n exactly super-imposed lines c o n t a i n i n g 100 letters each, and then make a count of the recurrences, or moreaccurately, the coincidences, ' of letters within the individual columns thus formed.

    Note the repetition of the letter F in the second column. This fact is indicated by placinga checkmark in the tabulation ofcoincidences. Where a letter appears three times within thesame c o l u mn (B, in column 8), three check marks are recorded, for there we have a coincidencebetween the first and second, second and third, and first and third occurrences. Where a letterappears four times in the same column, six check marks are recorded. The number of coinci-dences for each case corresponds to the number of combinations of two things that can be madefrom a total ofn things.2

    We note that on the assumption of a period of 100 letters there is a total of 39 coincidences.Now, if the period is really 100 letters in length, then the repetitions of letters within columnsare not mere coincidences brought about by chance superimposition of identical letters but areactual recurrences in the restricted sense of being the resultants of the encipherment ofsimilarplain-text letters by the same secondary alphabet. But there is no way of determining fromthis single tabulation whether the assumption of a period of 100 letters is correct or not, andtherefore we do not know whether the repetitions in this case are recurrences or coincidences.This we can determine, however, by a comparison of tabulations of coincidences made uponvarious assumptions of length of period. Theoretically, the correct assumption should yield ahigher total of coincidences than the incorrect assumptions, because the recurrence of high-frequency plain-text letters (in English, E, T, 0, A, N, I, R, S, H, D) is to be expected, and thenumber of such causally produced repetitions should certainly be greater than the numberof repetitions produced by mere chance in the superimposition.*

    Let us proceed, therefore, to make a table of coincidence for the various probable lengths ofperiod, first transcribing message26 into lines corresponding to hypotheses of 105, 110, 115, 120,and 125 letters.4 Before doing so, however, we find it necessary to introduce a few remarksupon the desirability of using a slight correction factor for this table.

    1 We draw the distinction between recurrences and coincidences on the g r o u n d s that the former term sho u l d ,and will here be used to indicate repetitions ofletters in the cipher text causally related to each other by beinge n c ip h e n n c n t s ofidentical plain-text letters by identical alphabets; whereas the latter term indicates r e p e t i t i o n snot causally related to each other in this manner but simply the result of chance. A repetition may therefore beeither a recurrence or a coincidence. The process of factoring in ordinary multiple-alphabet ciphers of the periodictype has for its purpose the separation and classification of repetitions into the two kinds. Until proved otherwise,all r e p e t i t io n s must be considered coincidences.1

    The formula is: C = n ( n 1 ) / 2 . Thus, when n = 5, the numbe r of coincidences is 10; when n =6, thenumber is 15.3 Since this paper was written, a further study of the concept of coincidences has made it possible to p r e d i c t ,w i t h a fair degree of accur acy, just how many coincidences should be expected for correct and incorrect assump-t ions. The mathematical and statistical analyses are given in detail in W . F. Friedman, Analysis of a Mechanico-Rlectrical Cryptograph, Section VI; S. K u l l b a c k , Statis tical Methods in Cryptanalysis, Section VII (TechnicalP u b l ic a t io n s , S. I. S., 1934). I t, results from these mathematical studies that the ratio of the n u m b e r ofactualcoincidences to t h e total number ofpossible coincidences is .038 for an incorrect case and .066 for a correct one.This knowledge eliminates th e necessity for t a b u l a t i o n s corresponding to every possible case and gives a reliablemeans of d e t e r m i n i n g the correct assumption as soon as it is made. (See N o t e s 1 and 3 on pages 13 and 14respectively.)'The message need not be written out more than once if long strips ofcross-section paper are used, writinga line on each strip. Each line should contain 125 letters, and the various strips can then be arranged to bringthe proper letters into supcrimposition according to each hypothesis in turn.

  • 7/30/2019 Index Coincidence

    17/101

    12

    For really accurate comparison, the totals of coincidence obtained for the various hypothesesshould be corrected in order to make proper allowance for the differences in totals due solelyto the v a r i a t i o n in the number of letters in each column when transcribed according to eachh y p o t h e s i s . 1 From a c r y p t o g r a p h i c point of view, a total of 100 coincidences in an arrangementwhere there are G letters in each column represents a slightly greater degree ofcoincidence thanin an arrangement of the same message also yielding 100 coincidences, where there are 7 lettersin most of the columns; there is less opportunity for coincidences to be produced in the formercase. We should, therefore, reduce all the totals of coincidence to some common basis. Thereasoning we have followed in the establishment of a correction factor to be applied is as follows:M t ^ s . i^ t . 2 0 c o n t a i n s exactly 539 letters. W h e n transcribed into lines of 100, 105, . . . , 125letters, the columns in each of these five set-ups have the following number of letters:

    T A B L E I

    39 c o l u m n s of 0 l o i t e r s and 61 c o l u m n s of 5 letters.14 columns of G letters and 91 columns of 5 letters.90 c o l u m n s of f ) l o i t e r s u n d 11 columns of 4 letters.70 c o l u m n s i ; f .

    r> K ' t t o r s und 36 columns of 4 letters.59 c r h m i n s of 5 leHors and 61 columns of 4 letters.

    39 columns of 5 letters and 86 columns of 4 letters-

    Period(letters)

    100105110

    115120126

    Assuming that perfect coincidence can occur in each column (all letters identical), then ina c o l u m n of G letters we can have 6X5/2=15 coincidences; in a column of 5 letters, 5X4/2=10coincidences; and in a column of 4 letters, 4X3/2 = 6 coincidences.

    If now we find the total number of chances for coincidences for each of the arrangementsgiven in table I, we have the following:

    T A B L E I I

    Period(letters)100

    1051101 1512 012 5

    C o nd i t i o ns

    39 c o l u m n s of 15 chances each, 61 c o l u mn s of 10 chances each..14 columns of 15 chances each, 91 columns of 10 chances

    each.. _.99 columns of 10 chances each, 11 columns of 6 chances each

    79 columns of 10 chances each, 36 columns of 6 chances each59 columns of 10 chances euch, 61 columns of 6 chances each39 c o l u m n s of 10 chances each, 86 c o l u m n s of 6 chances each

    Totalchances

    1, 195

    1, 1201 0561, 006

    956906

    Choosing for our basis of comparison the hypothesis of a period of 100 letters, the variousproportions of chances for coincidences for each of the remaining hypotheses will constitutecorrection factors to be applied in each case. They are as follows:

    T A B L E I I IPeriod

    (letters)

    100

    105110116120125

    C h a nce s forcoincidence1, 1951, 1201,0561,006

    956906

    Correctionfactor

    1.001.071. 131. 191.251.32

    1 See f o o t n o t e 3 on the preceding page. This correction factor is unnecessary if the number of actual coin-cidences is reduced to a p e r c e n t ag e basis, in terms of the total possible number ofcoincidences.

  • 7/30/2019 Index Coincidence

    18/101

    13

    We are now ready to establish the tables of coincidence for the various hypotheses. Spaceforbids the actual demonstration of the several arrangements of message 26 to correspondto the various hypothetical key lengthsthat sh o wn in figure 3 is typical of them all. Weshall give only the final result in table IV.

    T A B U B I VPeriod(Tetters)

    100

    105

    110

    115120

    125

    Coincidence* on each hypothesis

    t H J M M I M I n u r t u n u m inuwwtHiwwwwwwtHiwmwnumwwiiWWlHtWMWWMWttUmWWWWWWWWIWWWWWWIII

    Total

    39

    45

    47

    60

    3633

    Correctionfactor

    1.001.071. 13

    1. 191.251.32

    Correctedtotal

    39.048.253 . 171.445.043.6

    There seems to be no doubt but that the period of 115 letters is correct. The cycle, there-fore, consists of115-f-5=23 groups, and the numerical key contains 23 numbers. This meansthat the two final segments bear no numbers, and are therefore blanksegments.

    2. The reconstruction of the numerical key.Having ascertained the length of the period,and thus the length of the numerical key, the next step is to reconstruct the sequence ofnumbers constituting the key. A s stated before, this process is made possible in this caseby the method ofencipherment which is such that all the messages of the day's activity gothrough exactly the same cycle, but the successive messages begin at different initial pointsin this cycle, and these points coincide with the relative positions of the numbers making upthe sequence of numbers in the key.

    We do not know the absolute position of any numbers in the numerical key, hut we mayproceed first to find their relative positions, regarding the key in the n a tu r e of a continuouscycle or chain. Later we may find the absolute positions of the numbe r s in this cycle, i.e., weshall have reconstructed the numerical key itself.

    Now, as stated before, all messages proceed through the same cycle; it is only the initialpoints for the messages wlu'ch are different. Hence, if we can determine the relative positionsin which messages 1, 2, and 3 should be superimposed in order to make all three messagescoincide as regards the portion of the cycle through which they pass simultaneously, we shallthus have determined the relative positions of the numbers 1, 2, and 3 in the cycle. Forexample, if we should find that the first group of message 2 belongs under the twelfth groupofmessage 1, and the first group of message 3 under the sixth group of message 2, we wouldconclude that the relative positions of these numbers in the cycle are these:

    1 2 3 4 5 6 7 8 9 1 0 1 1 1 2 3 4 5

    1 2 . . . . 31 A s a verification of Note 3, page 11 a b o v e , the percentages of the actual number of coincidences to the

    total possible number has been calculated:Period Percentage100 0.033106 . 040110 044115 .060120.126.

    . 038

    .036

  • 7/30/2019 Index Coincidence

    19/101

    14

    How are these relative positions for messages 1, 2, and 3 to be determined? Clearly, wemay use the same basis for this determination as for that concerned with the length of thep e r i o dv i z , a table of coincidences. For, when messages 1 and 2 are correctly superimposedwe should get a higher degree of coincidence between the letters of the superimposed columnsthan when they are not correctly superim posed . The reasons are the same as in the precedingcase: The successive groups of cipher letters in one message represent encipherments by thesame secondary alphabets as apply in the other message; hence, repetitions of plain-text letterswithin columns will result in repetitions of cipher letters within those columns . We can there-fore determine the correct superimposition by experiment and the recording of coincidences.

    Having found that the k ey contains 23 numbers, it is obvious that message 24 has thesame starting point in the cycle as message 1; we may proceed at once to combine them bydirect superim position . We do likewise with messages 2 and 25, and 3 and 26. The purposeof this stej) is merely to afford greater accuracy through the increased number of letters withwhich we shall have to deal in finding the correct relative superimposition of these three setsof messages.

    Since message 26 contains the greatest amount of text, we may regard it' as our base an dtry to find the relative position of messages 1 and 24 with respect to it. We now place messages1 and 24 beneath message 26, beginning the first group of the former beneath the second groupof the latter1 . A tabulation of the number of coincidences in each column is then made. Mes-sages 1 and 24 are again placed beneath message 26, beginning the first group of the formerbeneath the third group of the latter and again the total number of coincidences is ascertained.In other words, messages 1 and 24 are moved successively 1, 2, 3 . . . 22 intervals 2 to the rightof message 26, and a table of coincidences is constructed . The greatest total number of coin-cidences, as s h o w n in table V, is given when messages 1 and 24 are placed three intervals to theright of message 26. This means , then, that the n umbers 1 and 3 occupy these relative positions:

    12 3 1

    T A B L E V 3Intervals 1 2 ~ ~ 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22C oincidences 59 49 86 40 53 39 55 60 59 47 55 45 61 64 53 54 46 58 50 48 52 40

    S h o w i n g v a r i o u s totals of c o i n c i d e n c e when messages 26 and 1 and 24 are superimposed at differenti n t e r v al s c o r r e sp o n d i n g to the successive hypotheses of relative position.Messages 2 and 25 are next taken for experiment and a similar table of coincidences is made

    for the various superimpositions with message 26, omitting, of course, the three-interval trialsince the position corresponding to that test we found to be occupied by the number 1. As soonas th e re l a t i ve position of eacli number is found, the subsequent trials corresponding to thosenumbe r s may be omitted . The labor is, of course, somewhat tedious bu t may be done by

    1 It was t h o u g h t unnecessary to use both messages 3 and 26 as a base, since the la tter a l o n e seemed suffic ient lylong to give reliable information.2

    An interval in this case is equal to a g r o u p of 5 letters, because the enciphernicnt proceeds in segments of5 letters each. We need try only the first 22 in t e r va l s since the cycle consists of only 23 positions, un d messages 1and 24 cannot have the same beginning point as messages 3 and 20.3 W i t h regard to th e m a t h e m a t i c a l n o t i o n discussed i n N o t e 3 o n p . 1 1 , t h e f o l l o w i n g r e m a r k i s p e r t i n e n t :The total number ofcoincidences possible when messages 1 and 24 are placed three intervals to the right ofmessage 20 is 1253. The nu mb er ofac t u al coincidences, 8(5, w h e n divided by 1253 g i v e s .068. Since the expect edresult for a correct assumption is .060, it is at once evident that this assumption is correct and it is consequentlyunnecessary to consider any further cases.

  • 7/30/2019 Index Coincidence

    20/101

    F I O C R B 'nf correct tuperimpori t ion of M E S S A G E S 1 A N D 24

    1MLVXKYPITOJ

    24|MOOKI E T V U P

    2!"""iQFXUE25

    FPETJDGAMLIYLMKW1 YRAYU

    26

    XFYKURUWKEMJQECYFYOE

    AUHKA

    QNXVDDFTJVOKWHRQHTJYGXAEUHPZHKCDVNYQBTFKCBGSF

    NYUCSNDBFZRIYITGGJBFLSIJUMSTZZ

    GIRIEZHTWBMPXQIOP

    DTEILMQZZTLMKQUVYIGTVGABP

    LNEMVKNDMFBDffVGSGMFCSYBMCDDQYB

    IUNEEWXTMFPIQMN

    UZBRWKMURWCDALXFLUVKHOZFVVNSXNGBVIJMZNGBTQRGTYQXGDRCMUS

    FEXVPOZDOJALWNK

    GJZSSEJVBFIYGRFJYAZ

    QQNSQWGVMETKNDATRJTGJODEE

    SECCE

    UXWQX

    HPVZR

    HZIKQ

    QLUOXHQMIPYQNVONQLQLMXPDFLGPKSPZJCMXHRXECIJTIffDPDB

    UKSEKXQNUY

    PTFOO

    FAONRWSMSSDIGXMWGTZECBPSKLSFXWKIOZGBUILZBYGMQ

    MVQCIGQZNG

    NWSHD

    QCAVJCCMFYXCWAIKRLGUBSNWHANHJODLRZVZUAPV

    XRLCH

    VXSFWDFTFO

    BPTJO

    MFIACKVKRAQFJOQZJFZJLTTXNUZHRQDLZCSQRYXBIOKZO

    GVART

    YJODO

    HQRYY

    YXDKRY

    TYDELWVDCYOULXFUEMVZLVHMWLEPLQU

    M E S S A G E S 2 AND 25PTFOO NWSHD BPTJO HQRYY YAXRZ KTEMP UAYMK ISRDZ VUVKW HXAYD YAGSM CURBZ LBXOV EBBPI BMLCB UMAXF ZS

    GBMQL FQMEJ KOBMS ZURAN ULZFE YDLOT UZMJM SE7NP GWILM FGCVS NCZGB HUIRT XUMEB8AOE6 3 AND 26MBMJB SEPSO DHREM ELKIP KXNMW QYBIHGYZLI TRURC GBIWL VEKYG YOOHV IAUNOXSINC MVKNT ZWXXQ VWRIH MPUHG DURHGGAMFQ ULGXC WIZWK IIZYY HTAW OTTTOLPXGC RLETI DJBCK IHSAM RHCMJ RAQBJXJZKI JNSSI HCHXX ZMZKW PVUGD QCVQ

    BHFDC GLYWC YGMMP EEXZH UBBSB SBONG ULPSJJ UTNIE ZAECE XQDYB FDBPB SEBMA SIJKIR UFRNA APKQP KAXUF CACVC LNICD ERFTHL HQtVZ XZUAB GLHMH ZFTIQ OTJEP VIWORY OISIE RES KE PAUJD SMQVB VEASF T

  • 7/30/2019 Index Coincidence

    21/101

    1

    1

    15

    clerks. This process is continued insimilar manner for all the remaining messages. The data formessages 2 and 25 show that they belong five intervals to the right of messages 1 and 24, and therelative positions of the numbers 1,2, and 3 in the cycle are therefore these:

    2 3 1

    1 2 3 4 52

    The data for this determination of position for all messages are given in table VI.TABLE VI. Data for determination of the position ofthe numbers in cycle

    M E S S A G E 2 0 U S E D A s A B A S K{Position.... 2 3 4 5 6 7 8 8 10 11 12 13 14 15 16 17 18 19 20 21 22 23I N umber of coincidences 59 49 86 40 53 39 65 60 59 47 65 45 61 64 53 54 46 68 50 48 52 40{Position 2 3 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23[ N u m b e r of coincidences 45 49 58 67 62 61 99

    .{Position 2 3 5 6 7 8 10 11 12 13 14 15 16 17 18 19 20 21 22 23I N umber ofcoincidences 34 35 38 36 16 33 27 23 19 24 28 24 24 21 19 25 20 42 35 32{Position 2 3 5 6 7 8 10 11 12 13 14 16 16 17 18 19 20 22 2361 Number of coincidences 44 31 33 31 33 29 26 22 16 23 49{Position. 2 3 5 6 7 8 10 11 12 13 16 16 17 18 19 20 22 23I N umber ofcoincidences _ 25 33 37 25 40 31 20 11 66 -

    -{Position 2 3 5 6 7 8 10 11 13 15 16 17 18 19 20 22 237[Number ofcoincidences.,._. 23 33 26 27 25 20 24 31 31 16 19 28 28 28 46[Position 2 3 5 6 7 8 10 11 13 15 16 17 18 19 22 23I N umber of coincidences 36 27 53{Position .... 2 3 6 7 , 8 10 11 13 15 16 17 18 19 22 23I N umber of coincidences 27 32 26 17 21 21 30 17 13 21 27 58{Position _ 2 . 3 6 7 8 10 11 13 16 16 17 19 22 23{ N u m be r ofcoincidences... 18 18 15 13 23 19 14 18 13 20 10 15 18 52{Position 2 3 6 7 8 10 11 13 15 16 17 19 22J N umber of coincidences 16 28 10 15 14 16 16 26 17 28 20 31 19

    uj{Position 3 13 16 19I N umber ofcoincidences 21 30 22 1612jPosition 2 .3 6 7 8 10 11 15 16 17 19 22( N u m b e r of coincidences., 18 27 56^{Position....-- 2 3 7 8 10 11 15 16 17 19 22[ N u m b er of coincidences 20 1915 18 18 19 16 38

    {Position 2 3 7 8 10 ll 15 17 19 22[ N u m b e r of coincidences '29 17 32 30 18 20 23 33 24 42{Position r 2 3 7 81011161719[Number of coincidences 38 32 37 27 24 '69

    1Secondary test, using messages 1 and. 24 plus 2 and 25 as the base.

  • 7/30/2019 Index Coincidence

    22/101

    16T A B L E V I . Data for determination of th e position of the numbers in cycleC ontinued

    M E S S A G E 26 U S E D A s A B A S E( P o s i t i o n 2 3 7 8 10 15 17 19[ N u m b e r of coincidences 24 23 47[Position 2 3 8 10 15 17 19I Number ofcoincidences 15 44 30

    18 [Po sit ion 2 8 10 15 17 19[ N u m b e r of coincidences 48 31 26[Position 8 10 15 17 19[ N u m b e r ofcoincidences 28 37 66[Position 8 10 17 19\ N u m b e r o f c o i n c id e n c e s 65 14

    21[Position - 10 17 19J N u m b e r of coincidences 26 33 47[Position 10 17[ N u m b e r of coincidences 48 22

    23[Position 17( N u m b e r of coincidences 50

    When in any trial the total of coincidences for a certain position stands out prominentlyfrom the preceding ones, subsequent trials for the message concerned are omitted.

    The final result of carrying out this work for all the messages tried against message 26 isthat the following cycle is established:

    1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 233 18 17 1 8 12 16 20 2 22 15 6 11 5 19 13 23 9 21 7 4 14 10

    This reconstructed cycle represents, as stated before, the relative, not the absolute, posi-

    tions of the numbers in the key, because there is as yet no indication as to what number occupiesany given segment on the base disk. Furthermore, we must remember that there is a breakof two intervals somewhere within this cycle, representing the two blank segments on the basedisk. This fact may cause some difficulty later on but we shall find a way of overcoming it.In the meantime, we may content ourselves with the cycle as established and proceed to an analy-sis of the results of its reconstruction.

    The immediate result is to enable us to superimpose all the messages of the day's activity,as shown in figure 5. We may begin with message 1, or with any other message, but for thesake of convenience in analysis, we may as well transcribe them in regular order.

    The letters of each of these 11 5 columns belong to a corresponding number of secondaryalphabets, all different, but all single, mixed, substitution alphabets. Individual frequency tablesare made, therefore, and are s h o w n in table VII. The tables are given in groups of five, labeledA, B, C, D, and E, corresponding to the five primary alphabets of the system. The groups ofalphabets are given in their proper cyclic sequence, so that each set of five alphabets is accom-panied by a number which identifies its position in the sequence ofsegments. Thus, we may referto any of these secondary alphabets by number and letter,aa for example, 5B, meaning the secondsingle alphabet under segment 5. The A alphabets all apply to the outermost primary alphabet,or alphabet 1 ; the B alphabets to primary alphabet 2, and so on. We are now ready to attempt ananalysis of the cipher text with the object of solving these secondary alphabets and reconstructingthe primaries.

  • 7/30/2019 Index Coincidence

    23/101

    11. MLVXK

    YPWDJ

    LBXOV

    HOZFVVNSXN

    WGILB

    GWCEM

    DCBUK

    MZPRX8.

    YXVNKNNCMP

    YVJID

    OZCJE

    HZIVGVNMNZWRZRX

    D

    CWULT

    OZIGBHWEQDCMXOZBZUDHO

    WBQHMMZVLK

    DCIGXZWRAD

    QEHQR

    YZAJE24. OMOOK

    ETVUPGWILH

    GBVIJUZNG8TQRGTYQXGDRCHUS

    8 1210 11QNXVD GIRIE

    DFTJV ZHTWB

    EBBPI BHLCBQQNSQ NQLQLWGVME HXPDFJNXTD BPREX

    JPNVB GRJGBOQUVA WHSXESWNFC DKXXRNWWVY UMHVBAWDAA EQLVJSOJKI GDPQZ

    JBNXF JBLXU

    AQSJY WZCMO12. OHTXX

    ESGOX YYCEO

    UWVYQ DJPFYDQTAR AEKMY

    PKWEE UUHTF16.

    TQXND HMHEWGWTJI PQTNSNAYEY GQHIDVSJFE MEHXW

    UYJPT CHCCBYUJPZ BHLDY

    QESJD LIZVMHGWBI HHCGM

    CXEUF UKCTB

    NFDEX GDZGAOKWHR MPXQIQHTJY OPFGCVS NCZGB

    TKNDA LGPKSTRJTG PZJCMJODEE XHRXESECCE CIJTIUXWQX WDPDB

    l e 2030 29IHNEE FEXVP

    WXTIIF OZDOJ2.

    UMAXF ZSLXVDIGXM XCWAIWGTZE KRLGUMMWHB OBFVOIBWOH YMOAPVGQSW OGHPI

    TOSVL HNHNORPQHF XOHQNMYYTH GSEYAIIPHV ZOCBO

    KSOXR KNHJK

    TOXVM KEXYAWUXZE YEHOJB

    KETHN CLEQSMOXLT YVWFL

    FOQOK KNRPIVHXVD RSRYIVHTLTAEFZH TOFQH

    EUWZT OKWNORLNVH OLDQAWPMD KFHPB

    20. OBHOKAOPME YNEXIFDGMN XIQDN

    BETEX ZSLRK

    JBZET XYHALPIQMN ALWNK

    25.

    HUIRT XUWAICWPSK BSNWHLSFXW ANHJOKIOZG DLRZVBUILZ ZUAPVBYGMQ XRLCH

    2 2230 UHPVZR UKSEK

    ULJCY GXAEUQFXUE HPZMKQFJOQ TYDELZJFZJ WTFGSJ SLXEH

    IZYNX GXMSB

    XCYIO SUAEU

    R2RTX QIPFNIPATI CFJIZTKLXHR NCVNYUCKXR SEPEM

    UFRXL WELQJ

    VWONX GTCWTALJCO EPLPJDQXNA GEYHCDXZSK ZPWNI

    15.

    PLLXQ DRDEMPVNWQ QEDMJOOXNQ NPDDT

    OOSIL TASEG

    ZZDCV UWMNZ

    VCYUUWRON AJDFHHOXJK KUYHKNPLYQ NPJNZ

    22. SFDFM

    QFXUV CLRIL

    DGECA CDPYM

    HZIKQ XQNUYFPETJ CDVNYDGAML QBTFK

    LTTXN VDCYOUZHRQ ULXFUDLZCS EMVZLQRYXB VHMWLIOKZO EPLQU

    i s e40 45MVQCI VXSFW

    DTEIL UZBRWHQZZT KMURW

    MBMJB SEPSO

    RTZMI LLUUX

    OZZGK FVURN6. XGORF

    BQAPY RMDMW

    HOUNL FGUVODIMQI IRUJIXSQMC XVXJCOTZNM AMHWX

    QJKFW RLSSF11 .

    TGLOI IWORTRBCVX WARWJBCPG RNNNHJUULI TPUJROSULC WZAFWJMCFB WLHVXLTKFN RMDGT

    QVSJR RAABX

    OVQVX PMKKW

    OPRFC RDONY

    FRQMI ULOTGAORUY LVD

    OWFW KMGVHAQMQW FFONY

    LHIGG FVDSQ

    BTMYK LCSMIGQZNG DFTFOLMKQU CDALXVYIGT FLUVKGYZLI TRURCXSINC MVKNTGAMFQ ULGXCLPXGC RLETIXJZKI JNSSI

    F11 6M U

    GVART YBZKJ

    GJZSS QLUOXEJVBDHREM ELKIP

    FIFWC PGSBA5. VJZCQ

    NJFGQ DTPLVGCHAX DUEIQFWGSQ HYMRQ

    TPOWI VHNCQNSBWR VTEGJTJVSC UJEGZNDEUH YUEIP

    BQJWR BZKYNSEYBZ MGSOZ

    JJVQE HYNKDYBDQ FTZDA

    BSYZR STGBV

    TPQGH RJZCQ

    SQAUC WPRBIFIFZR STVJVDNMBM JDGVP

    ZCVWR UUJEG

    BQRBI VGDGJ19.

    RCZAM ZYNVQ

    XXIEV HMAKVKHCUC STNVZGVFZH JTPSYRJCSH JJMIV

    YVSWW DQNAP

    YJODO XIRIXFIYGR HQMIPFJYAZ YQNVO

    GBIWL VEKYGZWXXQ VWRIHWIZWK IIZYYDJBCW IHSAMHCHXX ZMZKW

    IODRB 51* 130 W

    WVUPV XZCBD

    PTFOO NWSHD

    KXNMW QYBIHKRCAS XWKQVKZJJK DCVQDSTOID DWVLRXAOWK BKBUHLPUVN LNGQT

    RTDCO LNTTMIAFEO BMUUTTFONC HNPBRFOHI QLQHG

    EAUWP AYSKOCMPSQ BASFH

    XEUJG ROHUP13. QNKIT

    E

    XCNHU NOKMIWCFUO JKQHYXDCAS BEHQVKJMND RRQDM

    NMJHQ CHZUM

    JDAHW RZDIAYTVUL TWEVDWFONZ CTTES

    PVMAV OITKDCGXHV LZZXXIAONR TWSZJJFCEU QFPMO

    23.

    KFPAO FIQTRKDCBX UEQTUFAONR QCAVJWSMSS CCMFYYOOHV IAUNOMPMHG DURHGHTAVV OTTTORHCMJ RAQBJPVUGD QCVQ

    23 n nBDOLS GHIHZ

    BPTJO HQRYY

    BHFDC GLYWC

    SLDKS NTESDKSSXY TSUXETXTBH WNWIESKWWW XNEYZIPJRI HUMAD

    HXMXR TUIIGVPSKO HUYNA

    9. NYMPEIJFRT UTNQC

    UWJDX CQRVWVFSCG CHKSB

    OFSGQ PONLWFZHNC GJBSA

    KKGEF FJWWROIWVX KSMSXOPWEY UTISESTTAL GKPPO

    TTSRU GHELW

    WQAXB FBRLAMBHKV IHTPIIRWER GKETA

    LDQIW UYVWI21.

    SVTKG POEAC

    OJUGA HOGXWTFSOQ XKESCLHKGP KUKNYKQQIE HHYCAVPIWA NFWOHMFIAC YXDKRKVKRA YLPSJJ UTNIEIJKIR UFRNARFTHL HQWVZIWORY OISIE

    21 7M 85LJCTE KSLPY

    YAXRZ KTEMP

    YGMMP EEXZH4.

    QVQBN RZDMBGRROH PXKZFYJXXW BKOJQJAKUK BEQMGDZUTW BFMO

    7. UFUCLZOJCU BTXJKVPRXS SCUZBYZYRZ AWLPPJEGAF SQRWO

    ZVXXH NZHCWZOPRZ CZEVH

    RAIBP KACIBJIQBX PFTAO

    ZXROX PZUGGMBXOZ QQDCX

    NZFCU BXIPZTCU BNCEE

    HUMFH LZHNH

    AHJEP PUMEUGNXBQ XAUAQYUSUK TFECD

    JEJRQ MCUZPOMXOW LTWEW

    OJYQU MEULHYJOUN AEGQGCJCFI BNWJROY XZHPKGHEBF PFTYFGffXXY EKBMGGWMQL FQMEJ

    ZAECE XQDYB

    APKQP KAXUFXZUAB GLHMHRESKE PAUJD

    4n

    VPBYD

    UAYMK

    UBBSBUFHUJJQLYHZKFMSFOUHOIWTVU

    HJYDY

    KGUEJJHCDE

    IMPBBDAJMTSVEYHOCADC

    GMYSB

    RJLUD14.

    HMYNBCBIMTHAWMB

    CJEDW

    HJQJP

    OUFVO

    BMQVB

    KGUDNKFJHNKHYDDHTZPB

    JJIHUDZAUO

    MGETDKOBMSFDBPB

    CACVC

    ZFTIQSMQVB

    II

    14MWTRJK

    ISRDZ

    SBONG

    LTMKJLTMXBVGDWUT

    NPCLU

    ZTHHAMSJBSWUHJVVWAXZYKXXQ

    10.

    NEANWLGRXO

    VKHEUIKPKIZJRKKDUHQZDGIAS

    KRGNU

    AKHZA

    ZSPPQ

    GSMKBVBWW

    QSPBFZMSKKETPDNYRNEI

    LUVBU

    W

    WFKKZ

    ZURAN

    SEBMA

    LNICDOTJEP

    VEASF

    10 to oBDDFA

    VUVKW3.

    URQKWPONFGBSVWLXLTBW

    KQDIG

    NWJWJESVWJGXHRPQSVFGVRQUWTEPDA

    G

    XXCKK

    XSPFG

    TNUWUCLYZK

    BJDFABAQRK

    ZXCWQ

    SDZIE

    VZWDLBAKIGVWCZP

    TQVZPGXFDL

    QYZVUBWIVZ

    ZRUDA

    PXXZL

    ULZFE

    26.

    SRUEU

    ESOWPVZXCL

    TZCWL

    3 101ANJXE

    HXAYDYLMKWYRAYURIUGGILKYUEXHEF

    YLMNC

    IFMZIIKGSLOIHWFITZMTXXHHC

    MVQGJ

    WKZTEEVGHURINOKFJAYURUHHF

    VEZTC

    HIHWG18.

    HECDA

    YRNAF

    TWCTJ

    IZJTUYLGPT

    HIMGGJNPXGXXHRF

    XFRWL

    YDLOT

    XFYKURUWKE

    MJQEC

    YFYOE

    AUHKA

    18 no XGHED E

    YAGSM CCBGSF V

    NYUCS LOZGWS UNMFEB KAWQWF ZKYJJF S

    VZIJE SQ

    APALBTKF QKZNFC DHFYMFK Q

    XYUOC F

    KZYIZ ZLJUFZ UNBTAK NRKHKB ZJYUJG D

    DIPSG ZU17. UM

    LBUFZ DVLALNH QULPJJS Z

    BKIJC ZJFHFEH VN

    RBUFZ JUYYYNQ CYRBHEW CU

    SXXSZ HV

    MOAFT XUFHTEG IUZMJM S

    NDBFZ KN

    RIYIT BDGGJBF SLSIJU SYMSTZZ DD

  • 7/30/2019 Index Coincidence

    24/101

    17

    I

    The first thought that comes to one is that these individual mixed alphabets may be solvedupon the basis offrequency alone, as is commonly done with such frequency distributions. Forexample, we might assume the most frequently occurring letter in each alphabet to be the equiva-lent of plain-text letter E, the next, of T, and so on; then substitute the A ssumed values in the textand try to build up words. But each of these alphabets contains only an average of 36 letters, sothat hardly any assumption would carry a considerable degree of certainty. This is especiallythe case in English text where the letter E does not always stand out prominently as the mostfrequently used letter in small amounts of text. Were an analysis of this kind absolutely necessaryto solution, it is doubtful whether this particular set ofmessages could be solved except after along period of patient labor. But it will be shown now that such an analysis is in fact not essen-tial, because we may be able to effect a direct reconstruction of the five primary alphabets, whichwill not only lead to the solution of all these messages, but will also give us every one of the possible125 secondary alphabets of the entire system.

    T A B L E V I I

    1

    A

    B

    C

    D

    E

    F

    OHI

    JKLM

    NO

    P

    QK8

    T

    T JV

    WX

    Y

    Z

    A

    /

    / / / /

    /

    I I I

    I I I

    1W1I I I

    111I I

    I I I

    mH

    B

    I l l lI I I !11

    11I I I

    I I1I I11I I

    I l l l1Mil

    O

    I1I I I

    1mii niiiiH IiiMl1I I

    I I

    D

    /

    I I

    1M1I I

    I I

    I l l lI I1I I

    1I I

    I l l l1I I1

    E

    I I I

    I l l l

    I I

    1

    I I

    M

    I I I11I I111I I

    I I

    I I I

    1

    8

    A

    B

    C

    D

    E

    F

    OH

    I

    JK

    LM

    N

    O

    PQ

    B

    8

    T

    ITV

    W

    X

    Y

    Z

    A

    I I

    1I I

    I I

    111I l l l

    I I I

    I I

    1I l l l

    I I I

    I I I

    I I I

    111

    B

    /

    I I

    I I

    I I

    I I I

    1

    I I I

    I I

    I I

    1M1I I

    1M

    I I1

    C

    // /I I I111M

    M

    I I

    I I I1I I

    mI I I1

    D

    I I111I l l lI I

    1M1111/ ///

    /I I1m

    ii

    E

    1 1 11/

    I l l l

    MI I1

    I l l l

    I I

    I I111I I I

    I l l l1

    C3

  • 7/30/2019 Index Coincidence

    25/101

    18

    TABLE VIIContinued12

    ABCDEPOHI

    J

    KLMN

    0

    P

    QK8

    TX TVWXTZ

    A

    /////////

    I t UI I

    1I II I I1I II I

    I I I

    I I I111

    B

    ///////

    /m ii nii iIlllH Im iiiiii i

    c

    n u

    n ui nin u

    n u

    i nii ii ii n

    D

    / / // / // / /// / / /

    / /

    /

    / //

    / / /

    / / /

    / / //I H l ' l

    E

    /

    mm

    ii ni ii iiii nnii iinni ni

    16

    ABODBF

    OHI

    J

    XLMK TOP

    QB

    8

    TT TVWXYZ

    A

    ///////

    /

    //

    ////////

    ////

    /

    //

    ///

    ///////

    B

    //

    ////

    ////////

    ////

    W1I I

    I l l l111I I

    c

    /

    / /// /

    //

    ////// //// / /

    /n u

    i nn uii

    D

    ///

    //

    ///MillI I

    1I I1Il ll

    Illl

    E

    /

    //////////////

    //

    ///

    ///

    /

    /

    I H l1I I11

  • 7/30/2019 Index Coincidence

    26/101

    19VIIContinued

    20

    ABODXFOHIJKLMNOPQK8TT TVWXYZ

    A

    II11111

    M

    1M l

    1

    MlI I I !Illl

    B

    II1II I1111Illl

    1MI I I

    I IM1I I1

    1

    O

    11I II I II I

    Mllll1I I II II I

    11I I I111M lIll lI I

    D

    Illl11111

    II

    II II I IIlllIlll1

    II1II

    II I1

    B

    II I1

    1II I

    Ml1lltlI I

    I IMI I

    11I I I

    9

    ABCDEFOHXJ

    KLMNOPQB8TV

    VWXYZ

    A

    /

    m iii ni niii

    HHm iniMnin

    B

    II I

    MI I

    1M

    Il llIll l1111

    I I1I

    o

    I I

    1I I1111I I II In

    iii niii

    MlIl llI I

    B

    m i

    iniiim inii im ii niM

    I I I

    E

    II1

    1

    1II I

    II

    II I

    II

    II I

    MilI I I1I I

    i ni

  • 7/30/2019 Index Coincidence

    27/101

    20

    TABLEVIIContinued22

    ABODEF0

    HI

    J

    X

    LMK0

    PaB

    8

    TT C TVWXTZ

    A

    /

    ////////

    ////

    /

    /

    ///

    ///

    //////////////

    /

    B

    //////////II

    1111II I1

    Ml111

    I I

    1I I1

    oI I

    I Im i

    ii nI I I !1I I I

    1I I1I I I1I I

    I I1

    D

    /w t i i im ininMin

    /

    in

    nn

    E

    //

    /

    II

    1IIIIlllIlll

    Ill l11

    I I I

    Il ll

    I I II I

    16

    ABODEF

    aHI

    3

    K

    LMIT

    0

    P

    QR8

    TUVWXYZ

    A

    ////

    //

    /////

    ///

    ///////

    MlI II I

    11

    I I I

    B

    ////

    /////

    //

    ////////////M1I I I1I I1

    o

    /

    // /

    //

    // /

    // /

    /I t t i

    Ill lI I1I I I

    1Mil

    D

    /

    mm iinii ni nm iini nii nii

    E

    //W

    I I I

    m m

    nni/

    iinii1 !i tn

  • 7/30/2019 Index Coincidence

    28/101

    21

    TABLE VIIContinued6

    AB0DBFOHIJ

    XLMNOPQB8TT TVWXTZ

    A

    /

    //

    I H 1I I1I I

    I I I

    11Mil1I II I II II II I

    B

    /

    /II1II

    II

    mnn u niii iMliii i

    c

    Illl1Il ll1I II I

    I I

    1M1till1Mill1

    D

    /

    1 1 11

    III11IIIl ll

    MilIl llI I1Ill l11

    E

    III

    II

    1II

    Il ll

    1

    1III

    11II I

    MilMI I

    11

    ABODEF0

    HI

    JKLMNOPQB8

    TUVWXYZ

    A

    III

    Il ll1MM111

    I I I

    I II II I I

    11I II I

    B

    /Il ll11

    II

    Il llM i n iinH I

    ni nnii

    c

    I II I II I

    1I l l l1I II I11n

    inim iim ii n

    D

    III

    Il ll1II

    II

    III

    11Il ll

    II I

    MillI I

    I I I

    E

    Il ll

    1

    t i nn

    ini nim iMliiiin

    n

  • 7/30/2019 Index Coincidence

    29/101

    22

    TABLE VIIContinued6

    ABODEFGHI

    JKL

    MNOP

    QB8

    TT TVWXT

    Z

    A

    /

    /////

    /////////

    /

    //////

    //Ml111 1I I

    B

    /

    //

    ///////////

    //

    //

    /

    ///

    MilI I I

    1I I I1

    c

    1I Im ii iii ni nM

    I I

    I I II I I

    11

    Ml

    D

    II

    II I

    II I11II I

    Mil1I I I !1I I

    11I I I !

    Il l

    E

    //

    ////

    ///

    ///

    MMl

    M1I II I

    I I I

    19

    ABCDEF

    OHI

    J

    KL

    MNOP

    QB8TT JVWXYZ

    A

    //

    ///

    /////n u tiiii ni niim im in

    B

    M

    I II I I1M111

    I I

    1I I

    11M

    I I I

    I I

    c

    //

    M1I I I

    I I

    / I / II IMillI I

    M11

    D

    //////

    //

    /Mil1I I

    I IM11

    I I

    I II II I

    E

    ////

    //

    /////

    /

    M1I IIII I I

    I IMI Ii1

  • 7/30/2019 Index Coincidence

    30/101

    23

    TABLB VIIContinued

    18

    ABODBFOHI

    JKLMNOfQB8TT TVWXYZ

    A

    /n nni niiiii nnnm in ninin

    B

    II I

    Illl

    II11

    II

    II

    11IlllII

    1II1m

    ni n

    c

    /II1111II

    II I

    1

    I Im iin nm ini nn

    D

    II

    1'1

    II

    Illl

    II I

    II

    1II I1

    m

    n nn nn

    ii

    E

    /

    m

    ni nii ni nn ninii nin

    in

    23

    ABO

    PEFOHX

    J

    KLMKOP

    QB

    S

    TtrVwXyz

    A

    II I

    1

    IlllII I

    II I

    Illl

    1IlllII I1II I1

    B

    /

    I I

    m

    I I1I I I1 11miI I1II I

    I I I1I I

    1

    c

    /

    /

    I I1I II II II I I

    1I I

    I I

    mnmin n

    D

    ,///II

    II I

    II I1IlllII

    Illl111

    Illl

    1II

    H i

    E

    ////

    lit11II111II

    II1II I

    II

    II111II

    II

    III

  • 7/30/2019 Index Coincidence

    31/101

    24

    TABLE VIIContinued

    9

    ABODEFaHijKLMNOPaB8

    TX TVWXYZ

    A

    II

    II

    MlW1I I

    I I I1I I

    I Imini t

    B

    /

    II

    mini nini ni nnIlllIlll1

    I I

    c

    /

    /n u tii nn

    H IH I

    im iiiiim iH I

    r >//

    /

    // //

    /// /

    // ///I I I1miini nnH

    E

    Ml1Ill lI I

    Ml

    11I I

    1I I

    Il ll1I I

    I I I

    31

    ABODEP

    aHijKI.MNOP

    QB

    8

    TITVW

    XYZ

    A

    //

    //

    n uim iiiiniiniii

    MlM

    B

    /////

    III

    1II1Ml

    11I II I

    1I I

    I Ili1M

    c

    I I

    I I111

    I I I

    I I

    I I I

    1II IIl llI III I

    m i li t

    D

    II

    MIl ll

    1I I

    11

    M

    I I I

    Ill l

    I I

    I I I

    Il ll

    E

    /

    III

    II

    III1II1II

    IlllII

    1Ill l

    ill

    II

    II

    Illl

  • 7/30/2019 Index Coincidence

    32/101

    25

    TABLE VIIContinued7

    AB0r >Ey0HXJKX .MN0

    PQ&8TT TVWXTZ

    A

    IIn u nii iitIlllI I

    I I1m iiiniii n

    B

    Illl

    II

    Ill lM

    I I1I I

    Illl

    1I I II1I I

    Ml

    c

    /I II I II I I

    1I ll l11I I

    I I I111I Imili

    D

    II

    Illl

    Illl

    11

    II I

    1miii niiiinm i

    E

    m

    nii nm imininH Ininii

    4

    ABCDEF0

    HI

    J

    KX .MNOPaB8TTTVWXYZ

    A

    /II I

    II

    II1mnH Im

    ii iin

    i nin

    B

    IlllII11II I

    III

    II

    W/

    N UI I1111111

    c

    I IIll lI I

    I I II I

    1I II I

    I I

    tI I I

    I II I I

    1mi

    D

    /m iii niim iinniH Im

    H I

    E

    m i n inm iin

    ninH Iiini inii

  • 7/30/2019 Index Coincidence

    33/101

    26

    TABLE VIIContinued14

    ABCDEPGHI

    JKX .MNOPaB

    8

    TT TVWXT

    Z

    A

    /

    ///

    /

    //

    /

    I H III I11

    I I1MIlllI IW

    B

    //

    ///////

    /

    /// /

    /

    /

    /

    //

    MMlIll l

    1

    a

    I I I111

    1W

    I II I1I I I11Il ll

    n uiiii

    D

    //////////

    //

    //m iiiH I

    iiiiim ii

    E

    III

    III

    1II1II1III

    II1II

    II

    I H II I1

    Il ll

    10

    ABCDEPOHI

    J

    XLMNOP

    OB8TtrVWXYZ

    A

    miiH

    i niin

    H

    nt i nnmn

    n

    B

    //

    ///

    /

    ///

    //

    ///// /n uii nmii n

    O

    I l l lI I I

    1111111

    I II I I

    1I I Imii iii n

    D

    //

    /// //m i lH I

    i n

    ninMilIll l

    E

    /// /

    II

    M

    I II I I

    Ml

    Ill l1

    I I I

    Il ll

    1

  • 7/30/2019 Index Coincidence

    34/101

    27

    TABLE VIIContinued3

    ABC

    D

    E

    FOH

    ZJX

    LMN

    OpQ&

    8T

    T TVW

    X

    Y

    Z

    A

    //

    I I

    1I l l ln ui

    niI l l l

    111I l l lM i l

    3

    1I I

    I l l l

    MI I

    I I

    I l l l

    I I

    I I1I I II I1I l l l

    1

    c

    I I I

    I I

    I I IM i lI I11I l l lI I

    1I I1

    11I I

    I I I

    D

    /

    /

    I I

    I I I

    I I I

    I l l l

    J1I I I111I l l l

    I I I

    I I

    I l l l1

    E

    I I

    1 1 1 11I l l l

    MI l l l

    1I I1I I

    I I IMl1

    18

    A

    B

    C

    D

    E

    FO

    HIJ

    X

    LMN

    O

    P

    Q

    B

    B

    T

    VVW

    X

    Y

    Z

    A

    I I

    111I I I

    1

    1I I I

    Ml

    I II l l l1

    I l l l111I I

    I I

    B

    I I

    Ml

    1

    1I II II I I

    I I

    11I I

    I I

    11Ml

    M

    oI I1

    I II I I

    I I I

    I I I

    I I I

    1I I111

    I l l l

    Ml

    1I I I

    D

    11I

    f H JM i l

    I I

    Ml

    I I I

    I I1

    I l l l

    I I

    1

    E

    I I

    I I I

    11MI I I

    I I

    I I

    I I

    1I I I

    I I11

    M i l

    17

    AB

    0

    D

    E

    V

    0

    H

    1

    JX

    LM

    A

    /

    I I I

    I l l l

    11I I

    1111

    B

    ///

    //////////

    /

    //

    C

    /

    ///

    / / / //

    I I I

    11I I

    D

    I I I

    I

    1

    I I

    I I I

    M i l l !

    E

    / / / /

    ///I I I111I

    1I I I

    11

    17

    N

    0

    P

    Q

    B

    8

    T

    T JV

    WX

    Y

    Z

    A

    /

    I l l l

    I l l l

    I I I

    I I

    1Ml

    B

    ////

    I I I

    I I

    MlI I

    1I I I

    c

    // /

    ////M i l l1

    I I

    I I

    D

    /I I

    1

    1MM

    I I

    E

    /

    //////

    /

    /

    I I

    I1I I

  • 7/30/2019 Index Coincidence

    35/101

    28

    The method which we are about to demonstrate is based upon the fact that the segmentsfrom which the cipher groups are taken follow one another from a given initial point in aregular succession, uninterrupted in this case except for a break of three segments representingthe two blank segments of the key plus one blank which is always present representing theplain segment. To explain the principle of this method in detail, attention is directed to thefact that, as a result of the system ofencipherment, the series of successive cipher equivalentsfor any given plain-text letter in any one of the five primary alphabets coincides with thesequence of letters in that alphabet. The series will coincide with the complete alphabet exceptfor the omission of 1, 2, 3, or more letters depending upon the number of blank segments.For example, turn to figure 1 and note that, in alphabet 1, the sequence of letters beginningwith A i s as follows: A U F Q Z E R H Y G W J . . . .

    Now it is patent that if we place letter A of the first primary alphabet in the plain segment ,its series of successive cipherequivalents coincides with the sequence of letters succeeding A in thesame alphabet, viz, U F Q Z E R H Y G W J O I D M N C . . . .

    If we place another letter of the same primary alphabetfor example Zin the plainsegment, its series of successive cipher equivalents constitutes exactly the same sequence,except with a different initial point, viz, E R H Y G W J O I D M N C . . . . In otherwords the successive cipher equivalents for these 2 plain-text letters come from one and the samecycle or sequence. Now, the same is true with respect to every other letter of alphabet 1, andalso of the other primary alphabets. O f course, the sequence is different for each primaryalphabet.

    Since this cycle or sequence of letters is the same for all the letters of each primary alphabet,only the series of successive cipher equivalents for one letter of each primary alphabet is necessaryin order to effect a complete reconstruction of that alphabet. In other words, if we can selectwith accuracy the cipher equivalent for one andonly one plain-text letter in each of the successive115 secondary alphabets, we can then arrange these equivalents into five sequences of letterswhich will coincide with the live primary alphabets, thus resulting in their reconstruction.The reconstructed sequences will be complete except for the omission of one or more lettersrepresenting the blank segments. If the numerical key consists of 23 numbers, three letterswill be missing from each sequence. These letters will be known, of course, but their relativepositions in the omitted section will have to be found later.

    Obviously, the letter which will lend itself best to such a procedure is E, for it is the mostfrequently occurring letter in English text. If, therefore, by a careful study of the individualfrequency tables applying to the columns of the superimposed messages, we can select the cipherequivalent of only the letter E with certainty in the successive secondary alphabets, we shallat once have the sequences of letters in the five primary alphabets and the solution of theproblem will be at hand. For example, if in a hypothetical sequence of these alphabets weselect the letters K, N, Q, and V, respectively, as the four successive cipher equivalents of E,then this will mean that in primary alphabet 1 there is a sequence . . . K N Q V . . .,providing a break in the numerical key does not exist between the m em b er s of the sequenceof key numbers applying to the segments concerned. Continuing this process, ultimately thefive primary alphabets can be completely reconstructed. But we must remember always thatthis process is dependent upon the correct assumptions for the cipher equivalent of E in eachof the 11 5 secondary alphabets, or columns of cipher text.

    Let us attempt such a reconstruction. Turning to the series of secondary alphabets givenin table VII, we try to find in each alphabet the letter which undoubtedly represents plain-textletter E. At the very start we encounter difficulties. In alphabet 1A, the letters M and Y areof equal frequency. There is no way of telling which letter represents E , so that we shall have

  • 7/30/2019 Index Coincidence

    36/101

    29

    to consider both M and Y as possibilities. In alphabet 8A again we have difficulties, for bothJ and Q have the same frequency. It begins to look like a very doubtful procedure. As wego further along, the difficulties in selecting the representative of E increase rather than decreaseand the cryptanalyst becomes lost in a multiplicity of possibilities. Evidently this method,as the preceding one, while theoretically correct, is practically out of the question becauseof the limited size of each frequency table. In fact, it is doubtful whether we can select therepresentative of E with certainty in any one of the A alphabets,1 and certainly, if we cannotdo this with the letter that theoretically occurs the most frequently, we cannot do it withany other letter.

    It was at this point, when apparently a blank wall confronted the writer, and there seemedlittle hope of solution, that he evolved the method which finally resulted in solution, and whichembodied such new principles that he was led to describe them in this paper. This methodhad recourse to some simple mathematics, easy of comprehension and apph'cation when theunderlying principles have been grasped.

    First, let us make what we have termed a "consolidated frequency" table for all of thesecondary alphabets applying to the first, or A, primary alphabet. This is done by collectingthe data contained in the individual frequency tables shown in table VII into one large table,taking only the data applying to the letters of primary alphabet 1. This larger table is shownbelow (table VIII).

    1 It was found later that the cipher equivalent of E has the greatest frequency in only 3 out of the 23alphabets. In o ne alphabet E did not occur at all, and in six cases it occurred only two times. It will be ofinterest to the reader to study these tables for the information they contain with regard to the extreme degreesofvariation from the normal that small frequency tables can exhibit.

  • 7/30/2019 Index Coincidence

    37/101

    T A B L B VIII.Consolidated frequency table for alphabet 1

    Cipherletter

    ABCDKp< jHijKLMNOP

    QBSTtrVwXYz

    Segment

    1

    1

    4

    1

    3

    3

    1

    513

    11

    1

    2

    3

    52

    S

    2

    1221

    11

    4

    321

    4

    33311

    1

    12

    1

    32

    21

    52

    1

    23122

    3

    3111

    18

    2411

    12

    1

    31

    31

    3

    1

    2

    214

    3

    20

    2111

    11

    5

    1

    6

    1

    1

    644

    2

    1

    6

    1

    33

    11

    12242

    1

    52

    1

    2

    22

    1

    41

    3

    4

    113

    3

    423221

    1

    i:

    22

    2

    1

    31

    3

    43

    6

    22

    1

    13

    6

    1

    1

    1

    5

    21231

    1

    71

    23222

    11

    3

    41

    551

    11

    3

    22

    3

    1

    1

    22

    5

    1

    311

    423

    1

    11

    1

    3

    2

    61

    1

    22

    IB

    2

    21

    12261

    1

    1

    3

    31i

    442

    13

    142

    3

    1

    111

    3

    22

    64121

    2

    23

    3

    1

    5

    4

    33

    4

    143131

    l

    2

    2651

    2

    31

    2

    2

    122

    21

    2

    11

    61

    4

    11

    12112

    1

    1165

    7

    271

    211

    4221

    611

    211

    3

    4

    1

    32

    21

    5235

    1

    2

    1

    2

    31

    2

    14

    1

    21

    1

    2

    15121

    1

    21

    54

    25

    10

    61

    2

    3

    1

    1

    22

    2425

    3

    2

    1

    2

    22

    451

    2

    1

    4

    11

    147

    18

    2111

    31

    1362

    41

    41

    1

    1

    22

    17

    1

    341

    1

    21

    11

    11

    4

    4

    32

    1

    6

    Total fre-quency

    22372799203040

    3529274034

    3128352230383231373928383634

    841

    Num-ber ofsegmentsoccupied

    1413161713171415121415141615141112161417151614171212

    Averagefrequencypersegment

    1.582,851.692.301.641.762 . 8 62. 3 42. 421.932 . 6 72. 431.941.862.502.002.502. 3 72.281.822.472.452.002 . 2 43.002.83

    841Average frequency per cipher le t te r =- 2g- =3 2. 4 occurrences.

  • 7/30/2019 Index Coincidence

    38/101

    31

    This consolidated frequency table is of a rather peculiar n a tu r e . Each column gives thefrequency of the cipher letters in a particular segment and there are 23 such columns, corre-sponding to the 23 segments of the numerical- key. The numbers of the columns are determinedby, and coincide with,.the sequence of numbers in the cycle as given on page 16, viz, 1,8, 12, 1 6,etc. Each row gives the frequency of a particular cipher letter in the successive segments andsince the columns succeed one another in the cyclic sequence, it follows that the frequenciesin the successive segments on a line with any given cipher letter form a definite sequence oj

    frequencies. There being 26 cipher letters, there are 26 such rows or sequences of frequencies.The total frequency for each cipher letter is g i ve n in the column labeled as such and theaverage frequency for all cipher letters is then found to be 8 4 1 -r -2 6 = 32 . 4 occurrences. Thenumber of different segments in which the cipher letter applying to any given line occurs isindicated in the next column; and the average frequency per segment for each cipher letteris given in the last column.

    Before we can proceed it will be advisable to establish certain principles which will enableus to follow the subsequent reasoning more easily. We shall make use of alphabet 1 shownin table VIII, calling attention to the fact that the same principles apply to the other fourprimary alphabets. In order to make the illustration comparable in all its details with thereal situation in the test problem, let us make the numerical key 23 numbers in length byadding numbers 22 and 23 at the end of the key shown in figure 1 .Let us see what successive plain-text letters the cipher letters A, B, and C represent in thesequence ofsegments.

    0. Plain textA K P L B S X T C N M D I O J W G Y H R E Z QBS X T C N M D I O J W G Y H R E Z Q F U A V KC M D I O J W G Y H R E Z Q F U A V K P L B S

    It "will be noted that the successive plain-text letters which cipher letters A , B, and C repre-sent constitute almost exactly the same sequence in the three lines. This follows from thenature of the cipher system itself, and the cause of it has already been pointed out. In the Bline there is a section not present in the A line, consisting of the letters F U A ; in ihe A line, thesection not present in the C line consists of the letters X T C ; and in the C line, the section notpresent in the B line consists of the letters PLB. This is due to the interruption in the numericalkey; the section omitted will consist of 3 sequent letters in each case, but these letters will bed ifferent for every cipher letter.

    Let us now accompany the sequence of the plain-text letters opposite each of the lettersA i B, and C, with a sequence of frequencies corresponding to their normal theoretical frequen-cies * for English text.

    1 These theoretical frequencies are given by Hitt on the basis of 200 letters of plain text. See Hitt, Parker.Manual for the Solution of Military Ciphers, 1918, p. 0.

    .3S

  • 7/30/2019 Index Coincidence

    39/101

    t 17 4 14 10 7 21 11 IdV K P L B S X T C

    32

    F I G U R E 7A

    12I

    6 JO 1U

    N M Da 6 is 2 i w is a 23 aO J W G Y H K E Z Q

    8 X T C N M D J W O Y H B, E Q F T J A VK

    N M D O J W Q Y H B E Z Q F T 7 A V K P I . B

    Now, since the sequences of plain-text letters represented by these sequences of frequenciesare the same, it follows that we can so arrange the latter as to make the successive individualfrequencies coincide; and if we make due allowance for the break in the sequences caused bythe omitted sections of 3 letters, the three

    sequences shouldcoincide exactly. Thus:

  • 7/30/2019 Index Coincidence

    40/101

    33

    FlCUKK 8A

    V K P L B S X T C N M D I O J W G Y H R E Z Q t J V K P L B S X T C N

    6 X T O N M D I O J W O Y H R E Z Q F t T A V K [ W ] S X T C N5* 5z^-3z"^ = 3:5>: ^^^StStSz "^Sz rt 5r 5r--3f

    M D I O J W O Y H B E Z Q F t T A V K P L B S [ M==2:3 ^^^S^SrSz ^ 3z =: "~ " 5^ &

    In order to make the sequences coincide, we displaced the B sequence five intervals tothe right of the A sequence, and the C sequence four intervals to the right of the B sequence.Let us reverse the order of these letters, A, B, and C, and space them in accordance with thenumber of intervals which each sequence of frequencies has been shifted relative to the others.Thus: 4 3 3 1 8 6 4 3 3 1

    C ... B . . . . A

    Refer now to the illustrative cipher alphabet in figure 1 and note that this corresponds to theorder of these letters A, B, and C in this primary alphabet. We have determined the order ofthese letters in our alphabet merely by correctly superimposing or shifting the three sequences offrequencies relative to one another so as to make the individual frequencies coincide.

    Now had we not known what letters these individual frequencies in each sequence of fre-quencies represented but had merely been given the sequence of frequencies themselves, it wouldstill have been just as easy to find the correct relative positions of the three sequences from acomparison of the positions of high and low points in each sequence of frequencies. In otherwords, we do not need to know wha t letters the indiv idual frequencies in each sequence offrequenciesrepresent; it is still possible to determine the relative positions (in the primary alphabet) of theletters applying to each sequence in the cipher alphabet by a study of the positions of the high andlow points in each sequence offrequencies. No analysis whatever of the individual frequenciesis necessary, the entire frequency table being treated as an ordinary statistical curve. This, inits final analysis, is the meaning of the proposition stated in the opening paragraph of this

  • 7/30/2019 Index Coincidence

    41/101

    34

    paper.1 It thus follows that the five alphabets of our problem may be reconstructed, withouta knowledge of what letter any individual frequency in the sequences of frequencies (as sh o wn intable VIII) represents, by an analysis of these frequency tables considered as true statisticalcurves.Let us return now to the test messages. Table VIII represents a set of 26 sequences of fre-quencies similar in origin to those for A, B, and C in the illustrative alphabet above. We couldsuperimpose these sequences in the same manner and as easily as we did in the messages them-selves were it not for two circumstances: First, we know that there is an interruption of threeblanks in the cycle wliich we have reconstructed but do not know where these blanks must beinserted. Consequently, some allowance must be made for the blank segments in each sequenceof frequencies. Secondly, the individual frequencies in each sequence offrequencies in our prob-lem do not exactly correspond to the theoretical frequencies of the plain-text letters to whichthey apply but only correspond approximately to the theoretical. In some cases this approxima-tion is far from close because of the paucity of text, and this will make the determination ofthe correct relative positions of two sequences a much more difficult process than was thecase with the illustrative sequences above.

    We are, therefore, confronted with the problem of superimposing the sequences of frequenciescorrectly without a knowledge of these two factors, and this we shall accomplish by a slight modi-fication of method and a recourse to some simple mathematics.

    First, as to the modification of method due to our ignorance of the exact location of thebreak of three intervals in the numerical key: this consists in superimposing sequences, not tofind the relative positions ofany pair ofsequences but to find such sequences as are one and onlyone interval apart; i.e., sequences wliich represent a relative displacement of only one interval.The reason for this step is now to be explained.

    Let us consider the sequence of theoretical frequencies corresponding to the cipher letter Aand the letter which immediately follows it in the illustrative alphabet, viz, U, arranging the twosequences as though we had only reconstructed the cycle and had not as yet determined thenumerical key. Let us begin both sequences with segment 9, the first segment in the key.

    1 The ordinary frequency table applying to a plain text or a cipher alphabet does not correspond to theordinary frequency distribution ofstatisticalwork. In the latter, the position of the points along one ofthe axesof the graph and their extension along the other axis are either causally related, or the curve treats of data which,being subject to the operation of the laws of probability, form the normal, or Quetelet, curve of error. In theformer, the positions and extensions of the coordinates are not related in any way unless one considers the arbi-trary order of the letters of the alphabet as constituting a cause. The positions of the coordinates in a crypto-graphic curve were determined many centuries ago when the English language was first evolved.

    But the sequences of frequencies in table VIII are not similar in origin to the ordinary plain-text or cipheralphabet frequency tablesof cryptographic work. They are, in fact, closelyrelated to certain frequency distribu-tions ofstatistical data because the position and extensions of the coordinates are absolutely determin ed by a causeother than the arbitrary order of the letters of the English alphabet. These two characteristics of the curves ofa aeries ofsecozidary alphabets may be varied at will by changing the sequence ofletters in the prim ary alphabet.An y set of frequency distributions applying to a series of secondary alphabets derived from a variable primaryalphabet may be treated in the same mathematical manner as these will be treated in the subsequent pages.

  • 7/30/2019 Index Coincidence

    42/101

    35

    FldDMl. 0A

    9 17 4 14 10 7 21 11 1 (I 20 ID 12 3 8 1.1 2 1 13 18 8 22 23V K P L B S X T C N M D I O J W G Y H R E Z Q 1 8 17 4]VKP 1 4 10 7 21 11 1 L B 8 . . .Krtrt

    17 4 14 10 7 21 11 18 90 18 12 3 6 14 2 1 IS 18 8 22 23A V X P L B S X T C N M D I O J W