Proceedings of the 1st Conference on Historical Cryptology, pages 29–38, Uppsala, Sweden, 18-20 June, 2018

Solving Classical Ciphers with CrypTool 2

Nils Kopal
Applied Information Security – University of Kassel
Pfannkuchstr. 1, 34121 Kassel, Germany
[email protected]

Abstract

The difficulty of solving classical ciphers varies between very easy and very hard. For example, monoalphabetic substitution ciphers can be solved easily by hand. More complex ciphers, like the polyalphabetic Vigenère cipher, are harder to solve, and solving them by hand takes much more time. Machine ciphers, like the Enigma rotor machine, are nearly impossible to solve by hand alone. To support researchers, cryptanalysts, and historians analyzing ciphers, the open-source software CrypTool 2 (CT2) was implemented. It contains a broad set of tools and methods to automate the cryptanalysis of different (classical and modern) ciphers. In this paper, we present a step-by-step approach for analyzing classical ciphers and breaking them with the help of the tools in CT2. The primary goals of this paper are: (1) introduce historians and non-computer scientists to classical encryption, (2) give an introduction to CT2, enabling them to break ciphers on their own, and (3) present our future plans for CT2 with respect to (automatic) cryptanalysis of classical ciphers. This paper does not describe the analysis methods used in detail, but gives the corresponding references.

1 Motivation

There are several historical documents containing text enciphered with different encryption algorithms. Such books can be found, for instance, in the secret archives of the Vatican. Often, historians who find such encrypted books during their research are not able to decipher them and reveal the plaintext. Nevertheless, these books can contain secret information of high interest to historians. In such cases, cryptanalysts and image-processing experts are needed to support deciphering the books, thus enabling the historians to continue their research.

The ciphers used in historical books (from early antiquity through the Middle Ages to early modern times) include simple monoalphabetic substitution and transposition ciphers, codebooks, and homophone ciphers.

With the open-source tool CrypTool 2 (CT2) (Kopal et al., 2014), historians and cryptanalysts have a powerful tool for the analysis as well as for the (automatic) decryption of encrypted texts. Throughout this paper, we present how CT2 can be used to actually break real-world ciphertexts.

The rest of this paper is structured as follows: The next section gives a short introduction to classical ciphers, as well as an overview of cryptanalysis. In Section 3, we briefly introduce CT2. In Section 4, we present a general step-by-step approach for the analysis of ciphers with CT2. Section 5 shows real example analyses done with the help of CT2. Section 6 gives an overview of cryptanalysis components (for classical ciphers) already implemented in CT2 as well as components planned for the future. Finally, Section 7 summarizes the paper.

2 Foundations of Classical Ciphers and Cryptanalysis

After a brief introduction to classical ciphers, we discuss the cryptanalysis of classical ciphers.

2.1 Classical Ciphers

Ciphers encrypt plaintext into ciphertext based on a set of rules, i.e. the encryption algorithm, and a secret key known only to the sender and the intended receiver of a message.

Classical ciphers, as well as ciphers in general, can be divided into two main classes: substitution ciphers and transposition ciphers. A substitution cipher replaces letters or groups of letters of the plaintext alphabet with letters based on a ciphertext alphabet. Transposition ciphers do not change the letters themselves but their position in the text, i.e. plaintext alphabet and ciphertext alphabet are equal. There also exist ciphers that combine both substitution and transposition to create a composed cipher, e.g. the ADFGVX cipher (Lasry et al., 2017).

Substitution ciphers can be further divided into monoalphabetic and polyalphabetic ciphers (Forsyth and Safavi-Naini, 1993). With monoalphabetic ciphers, only one ciphertext alphabet exists. Thus, every plaintext letter is always replaced with the same letter of the ciphertext alphabet. If a plaintext letter can be replaced by one of several symbols of the ciphertext alphabet, the cipher is a homophone substitution cipher (Dhavare et al., 2013). If there is more than one ciphertext alphabet and the alphabet is switched after each encrypted letter, the substitution is a polyalphabetic substitution, e.g. the Vigenère cipher (Schrödel, 2008). Substitution may also be based not on single letters but on multiple letters, e.g. the Playfair cipher (Cowan, 2008). Historically, codebooks and nomenclatures were used for military and diplomatic communication. With a nomenclature, not only letters but also complete words were substituted. Codebooks contained substitutions for nearly all words of a language.

Transposition ciphers change the position of each letter in the plaintext according to a pattern derived from a key. The most widely used transposition cipher is the columnar transposition cipher (Lasry et al., 2016c). Here, the plaintext is written row by row into a grid of columns. Then, the columns are reordered based on the lexicographical order of a keyword written above the columns. Finally, the ciphertext is read out of the transposed text column-wise. Decryption is done the same way but in reverse order.
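The columnar transposition described above can be sketched in a few lines of Python. This is a minimal illustration and not CT2's implementation; the function names are hypothetical, the last grid row is left incomplete rather than padded, and repeated keyword letters are ordered from left to right, all of which are assumptions of this sketch.

```python
def column_order(keyword: str) -> list[int]:
    """Column indices sorted by keyword letter; ties keep left-to-right order."""
    return sorted(range(len(keyword)), key=lambda i: (keyword[i], i))


def columnar_encrypt(plaintext: str, keyword: str) -> str:
    """Write the text row-wise under the keyword, read columns in keyword order."""
    n_cols = len(keyword)
    # plaintext[col::n_cols] is exactly the content of grid column `col`.
    return "".join(plaintext[col::n_cols] for col in column_order(keyword))


def columnar_decrypt(ciphertext: str, keyword: str) -> str:
    """Rebuild the grid columns from the ciphertext, then read it row by row."""
    n_cols = len(keyword)
    n_rows, n_long = divmod(len(ciphertext), n_cols)
    columns = {}
    pos = 0
    for col in column_order(keyword):
        # The first n_long columns of the grid hold one extra letter.
        length = n_rows + (1 if col < n_long else 0)
        columns[col] = ciphertext[pos:pos + length]
        pos += length
    out = []
    for row in range(n_rows + 1):
        for col in range(n_cols):
            if row < len(columns[col]):
                out.append(columns[col][row])
    return "".join(out)
```

For example, `columnar_encrypt("WEAREDISCOVEREDFLEEATONCE", "ZEBRAS")` reads the columns in the order A, B, E, R, S, Z of the keyword letters.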

Composed ciphers execute different cipher types consecutively to strengthen the encryption. One famous composed cipher is ADFGVX. Here, in the first step, each plaintext character is substituted by a bigram consisting only of the 6 letters A, D, F, G, V, and X. After that, the intermediate ciphertext is encrypted with a columnar transposition cipher. ADFGVX was used by the Germans during World War I. It introduced a new concept called fractionation. With fractionation, the ciphertext representation of a plaintext symbol (here a bigram) is subsequently split into two separate symbols, making cryptanalysis even harder.

Ciphers based on codebooks were often super-enciphered: first, the words were substituted, for instance with numbers. Then, the resulting ciphertexts were additionally encrypted by changing them according to special rules.

Many encrypted historical books that survived history are available. Most of them are encrypted either with simple monoalphabetic substitutions or with homophone substitutions. For some books the type of cipher is unknown.

Many encrypted historical messages are encrypted with simple substitution ciphers, homophone substitution ciphers, polyalphabetic substitution ciphers, nomenclatures, or codebooks. Transposition ciphers were also used, but not as much as substitution ciphers, since transpositions are more complex with respect to the encryption procedures. Additionally, performing a transposition cipher is more prone to errors. In modern times, transposition was used by the IRA (Mahon and Gillogly, 2008) and during World War II by the Germans and the British.

In World War II, rotor cipher machines like the German Enigma (Gillogly, 1995), which perform polyalphabetic encryption, were introduced and widely used.

2.2 Cryptanalysis

Cryptanalysis is the science and art of breaking ciphers without knowledge of the key used. Today, cryptanalysis is used to evaluate the security of modern encryption algorithms and protocols.

We divide the cryptanalysis of classical ciphers into two different approaches: the classical paper-based cryptanalysis and the modern computer-based cryptanalysis. In this paper we focus on modern computer-based cryptanalysis, which can be done with CT2.

Substitution ciphers can be broken with the help of language and text statistics. Since every letter in a language, and thus in the plaintext alphabet of a cipher, has its own characteristic frequency, this frequency can be used to guess and identify putative plaintext letters. With monoalphabetic substitutions, plaintext and ciphertext frequencies are identical, but the letters differ. For example, if an 'E' is substituted by an 'X', then 'X' has the same frequency in the ciphertext as 'E' has in the plaintext. Thus, an algorithm to break a substitution cipher aims at recovering the original letter distribution.
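The claim that a monoalphabetic substitution keeps the frequencies and only relabels the letters can be checked with a short Python sketch. The helper names and the sample text are illustrative assumptions, not part of CT2.

```python
from collections import Counter
import random
import string

ALPHABET = string.ascii_uppercase


def substitute(text: str, key: str) -> str:
    """Monoalphabetic substitution: key[i] replaces the i-th alphabet letter."""
    return text.upper().translate(str.maketrans(ALPHABET, key))


def frequency_profile(text: str) -> list[int]:
    """Letter counts sorted in descending order, ignoring which letter owns which count."""
    counts = Counter(c for c in text.upper() if c in ALPHABET)
    return sorted(counts.values(), reverse=True)


# A random permutation of the alphabet serves as the secret key.
rng = random.Random(0)
key = "".join(rng.sample(ALPHABET, 26))
plain = "WHEN IN THE COURSE OF HUMAN EVENTS IT BECOMES NECESSARY"
cipher = substitute(plain, key)

# Same multiset of counts before and after encryption.
assert frequency_profile(plain) == frequency_profile(cipher)
```

Because only the labels change, matching the most frequent ciphertext letters against the most frequent letters of the assumed language gives a first guess at the key.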

Homophone substitutions as well as polyalphabetic substitutions flatten the distribution of letters, aiming to destroy the possibility of breaking the cipher with statistics. Nevertheless, with enough ciphertext and sophisticated algorithms, e.g. hill climbing and simulated annealing, it is still possible to break them.

Transposition ciphers can also be attacked with the help of statistics. Since transposition ciphers do not change the letters, the frequencies of the unigrams in plaintext and ciphertext are exactly the same. Thus, to break transposition ciphers, text statistics of higher orders (bigrams, trigrams, tetragrams, or n-grams in general) are used. Besides that, similar sophisticated algorithms, e.g. hill climbing and simulated annealing, are used to break transposition ciphers.

For breaking a classical cipher, it is useful to know the language of the plaintext. It is possible to break a cipher using a “wrong” language, but the correct one yields a higher chance of success. For cryptanalysis, most of the algorithms implemented in CT2 support a set of multiple languages, e.g. English, German, French, Spanish, Italian, Latin, and Greek. In many cases, the language of an encrypted book is known to the cryptanalyst or can be guessed from its (historical) context.

To identify the type of the cipher, whether it is a substitution cipher or a transposition cipher, cryptanalysts use the Index of Coincidence (IC) (Friedman, 1987). The IC, invented by William Friedman, is the probability that two letters drawn randomly from a text are identical. For English texts the IC is about 6.6% and for German texts about 7.8%. Simple monoalphabetic encryption, where a single letter is replaced by another letter, does not change the IC of the text. The same applies to all transposition ciphers, since these do not change the letter frequencies. Polyalphabetic substitution aims at changing the letter distribution of a text toward the uniform distribution. Thus, the IC is about 1/26 ≈ 3.8% (where 26 is the size of the ciphertext alphabet and all letters are used equally often). Homophone substitution also aims at changing the letter distribution of a text toward the uniform distribution, but here the IC is about 1/n, where n is the number of different symbols in the text.
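The IC defined above can be computed directly from letter counts: for a text of N letters in which letter i occurs k_i times, the probability of drawing two identical letters without replacement is the sum of k_i (k_i − 1) divided by N (N − 1). The following Python sketch assumes a 26-letter Latin alphabet and ignores all other characters; the function name is illustrative.

```python
from collections import Counter
import string


def index_of_coincidence(text: str) -> float:
    """Probability that two letters drawn at random from `text` are identical."""
    letters = [c for c in text.upper() if c in string.ascii_uppercase]
    n = len(letters)
    if n < 2:
        raise ValueError("need at least two letters")
    counts = Counter(letters)
    # Number of identical ordered pairs divided by the number of all ordered pairs.
    return sum(k * (k - 1) for k in counts.values()) / (n * (n - 1))
```

A text in which all 26 letters occur equally often yields a value close to 1/26, while natural-language text yields a clearly higher value.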

Thus, an IC close to 6.6% indicates that we have either a plaintext, a monoalphabetically substituted text, or a transposed text, and that the language is probably English. On the other hand, an IC close to 3.8% indicates that we have a polyalphabetically encrypted text. Clearly, the IC is more accurate for long ciphertexts. Homophone ciphers can be identified by counting the number of different letters or symbols used. If the number is above the expected alphabet size, it is probably a homophone substitution.
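The decision rules of the last two paragraphs can be written down as a small heuristic. The following Python sketch is only an illustration: the function name, the labels, and in particular the tolerance of 0.008 around each reference value are assumptions made here, not thresholds taken from CT2.

```python
def guess_cipher_class(ic: float, distinct_symbols: int,
                       alphabet_size: int = 26,
                       language_ic: float = 0.066,
                       tolerance: float = 0.008) -> str:
    """Rough first guess at the cipher class from the IC and the symbol count."""
    # More symbols than the alphabet has letters hints at homophones.
    if distinct_symbols > alphabet_size:
        return "homophone"
    # IC near the language value: letter frequencies were not flattened.
    if abs(ic - language_ic) <= tolerance:
        return "plaintext, monoalphabetic, or transposition"
    # IC near 1/alphabet_size: letter frequencies are close to uniform.
    if abs(ic - 1 / alphabet_size) <= tolerance:
        return "polyalphabetic"
    return "undetermined"
```

For short texts the IC fluctuates strongly, so such a guess should be treated as a starting point for further tests rather than as a verdict.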

The state of the art for breaking classical ciphers is search metaheuristics (Lasry, 2018). Because with classical ciphers a “better guessed key” often yields a “better decryption” of a ciphertext, such algorithms are able to “improve” a key to come close to the correct key and often finally reveal the correct key. “Better” in this context means that the putative plaintext obtained by decrypting a given ciphertext is rated higher by a so-called cost or fitness function. An example of such a function is the aforementioned IC, which comes close to a value indicating natural language when the key comes closer to the original one. A common and very successfully used search metaheuristic is hill climbing. A hill climbing algorithm first randomly guesses a putative “start key”. Then, it rates the key using a cost function. After that, it tries to “improve” the key by randomly changing elements of the key. With the Vigenère cipher, for example, it would change one letter of the keyword, e.g. the first. After changing the letter, it again computes the cost function. If the result is higher than for the previous key, the new key is accepted. Otherwise, the new key is discarded and another modified one is tested. The algorithm performs these steps until no modified key can be found that yields a higher cost value, i.e. the hill (= local maximum) of the fitness score is reached. Most of our classical cryptanalytic implementations in CT2 are based on such a hill climbing approach.
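The hill climbing loop described above can be sketched for a Vigenère cipher with a known key length. This is a simplified illustration, not the CT2 Vigenère Analyzer: the fitness function is a unigram log-likelihood based on approximate English letter frequencies (an assumption of this sketch, chosen to keep it self-contained), texts must consist of the letters A to Z only, and all names are hypothetical.

```python
import math
import random
import string

A = string.ascii_uppercase
# Approximate English letter frequencies in percent (assumed values).
ENGLISH = dict(zip(A, [8.17, 1.49, 2.78, 4.25, 12.70, 2.23, 2.02, 6.09, 6.97,
                       0.15, 0.77, 4.03, 2.41, 6.75, 7.51, 1.93, 0.10, 5.99,
                       6.33, 9.06, 2.76, 0.98, 2.36, 0.15, 1.97, 0.07]))
LOG_P = {c: math.log(p / 100) for c, p in ENGLISH.items()}


def vigenere(text: str, key: str, sign: int) -> str:
    """Shift each letter by the matching key letter (sign=+1 encrypts, -1 decrypts)."""
    return "".join(A[(A.index(c) + sign * A.index(key[i % len(key)])) % 26]
                   for i, c in enumerate(text))


def vigenere_encrypt(plaintext: str, key: str) -> str:
    return vigenere(plaintext, key, +1)


def vigenere_decrypt(ciphertext: str, key: str) -> str:
    return vigenere(ciphertext, key, -1)


def fitness(text: str) -> float:
    """Higher is better: log-likelihood of the text under English letter frequencies."""
    return sum(LOG_P[c] for c in text)


def hill_climb(ciphertext: str, key_length: int, rng: random.Random) -> str:
    # Start from a random key and rate it.
    key = [rng.choice(A) for _ in range(key_length)]
    best = fitness(vigenere_decrypt(ciphertext, "".join(key)))
    improved = True
    while improved:
        improved = False
        # Try all single-letter changes of the key in random order.
        moves = [(i, c) for i in range(key_length) for c in A]
        rng.shuffle(moves)
        for i, c in moves:
            candidate = key.copy()
            candidate[i] = c
            score = fitness(vigenere_decrypt(ciphertext, "".join(candidate)))
            if score > best:  # accept only strict improvements
                key, best, improved = candidate, score, True
                break
    # No single-letter change improves the score: a local maximum is reached.
    return "".join(key)
```

A call such as `hill_climb(ciphertext, 5, random.Random(1))` returns the key at which the climb stopped. With this simple fitness function each key letter can be improved independently; richer n-gram scores generally give better decryptions on short texts but may leave the search stuck in local maxima, which is why practical solvers often restart from several random keys.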

3 An Introduction to CrypTool 2

CrypTool 2 (CT2) is an open-source tool for e-learning cryptology. The CrypTool community aims to integrate into CT2 the best known and most powerful algorithms to automatically break (classical and modern) ciphers. Additionally, our goal is to make CT2 a tool that can be used by everyone who needs to break a classical cipher. Another well-known Windows analyzer for classical ciphers is CryptoCrack (Pilcrow, 2018).

CT2 consists of a set of six main components: the Startcenter, the Wizard, the WorkspaceManager, the Online Help, the templates, and the CrypCloud, which we present in the following.

The Startcenter is the first screen that appears when CT2 starts. From here, a user can reach every other component by clicking an icon.

The Wizard is intended for CT2 users who are not yet very familiar with cryptography or cryptanalysis. Users select step by step what they want to do. At each step, the Wizard displays a small set of choices.

The WorkspaceManager is the heart of CT2, since it enables the user to create arbitrary cascades of ciphers and cryptanalysis methods using graphical icons (components) that can be connected. To create a cascade, the user may drag and drop components (ciphers, analysis methods, and tools) onto the so-called workspace. After that, the components have to be connected using the connectors of each component. This is done by dragging connection lines between the outputs and inputs (both shown as small triangles) using the mouse. Data in CT2 can be of different types, e.g. text, numbers, or binary data. The type of data is indicated by a unique color. A simple rule is that connections between the same colors are always possible. Connections between different colors (data types) may also be possible, but then the data has to be converted. CT2 can do this automatically in many cases, but sometimes special data converters are needed.

Figure 1 shows a sample workspace containing a so-called Caesar cipher (very simple monoalphabetic substitution) component, a TextInput component enabling the user to enter text, and a TextOutput component displaying the final encrypted text. The connectors are the small colored triangles. The connections are the lines between the triangles. The colors of the connectors and connections indicate the data types (here text). To execute the flow, the user has to start it by hitting the Play button in the top menu of CT2. Currently, CT2 contains more than 160 different components for encryption, decryption, cryptanalysis, etc. Many components that can be put onto the workspace have a special visualization that can be viewed by double-clicking on the component. Figure 2 shows such a maximized visualization of a standard component.

Figure 1: CT2 Workspace with Caesar Cipher
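For readers who prefer text to diagrams, the data flow of Figure 1 (text input, Caesar cipher, text output) corresponds to the following Python sketch. It only mirrors the idea of the workspace and is not how the CT2 component is implemented; the function name and the default handling of non-letters are assumptions.

```python
import string


def caesar(text: str, shift: int) -> str:
    """Shift every Latin letter by `shift` positions; other characters are kept."""
    upper = string.ascii_uppercase
    k = shift % 26
    shifted = upper[k:] + upper[:k]
    table = str.maketrans(upper + upper.lower(), shifted + shifted.lower())
    return text.translate(table)


# "TextInput" -> "Caesar" -> "TextOutput"
plaintext = "Hello, World!"
ciphertext = caesar(plaintext, 3)
print(ciphertext)
```

Decryption uses the same function with the negated shift, e.g. `caesar(ciphertext, -3)`.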

CT2 contains an extensive Online Help describing each component. When the user presses F1 on a selected component of the WorkspaceManager, CT2 automatically opens the online help of the corresponding component.

CT2 also contains a large set of more than 200 so-called Templates. A template shows how to create a specific cipher or a cryptanalytic scenario using the graphical programming language and is ready to use. The Startcenter contains a search field that enables the user to search for specific templates using keywords.

Finally, the CrypCloud (Kopal, 2018) is a cloud framework built into CT2. We developed it as a real-world prototype for evaluating distribution algorithms for distributed cryptanalysis using a multitude of computers.

4 A Step-by-Step Approach for Analyzing Classical Ciphers in CrypTool 2

In this section, we show a step-by-step approach for analyzing classical ciphers in CT2. The first step is to make the cipher processable for CT2, so we create a digital transcription of the ciphertext. Then, we identify the type of the cipher. The third step finally breaks the cipher with CT2.

4.1 Create a Transcription

There are two ways to create a transcription of a ciphertext for CT2. The first method is to manually assign a letter to each ciphertext symbol outside of CT2, e.g. with Windows Notepad. The transcription is saved as a simple text file. This file can be loaded into CT2 using the FileInput component and then be processed further.

Figure 2: CT2 Transcriptor Component with Marked Symbols for Transcription

The second method to create a transcription uses the CT2 component Transcriptor (Figure 2). With the Transcriptor, a user can load a picture, e.g. a scan of a document. Then, the user can assign letters to the scanned symbols by marking them. Finally, the Transcriptor is able to output the complete transcription. It supports the user in two different ways: (1) it automatically guesses which symbol the user has just marked by showing the most likely symbols, and (2) it can be set to semi-automatic mode. In semi-automatic mode, it automatically marks all other symbols that are similar to the one just marked by the user.

The DECODE project (Megyesi et al., 2017) already hosts a large set of transcriptions of encrypted historical books done by experts. Within 2018 there will be an interface either to call CT2 from the DECODE website or to download DECODE records from within CT2.

4.2 Identify the Cipher

After creating the transcription of the cipher, it is now possible to analyze its characteristics. A first analysis would be a text frequency analysis. For that, CT2 contains a Frequency Test component. It can be configured to show the unigram distribution, the bigram distribution, etc.

Figure 3: Frequency Test Component Showing Distribution of Plaintext

Figure 4: Frequency Test Component Showing Distribution of Ciphertext

In Figure 3 we show the distribution of a plaintext (“The Declaration of Independence” of the US). It can easily be seen that the text follows the letter distribution of the English language, i.e. 'E' is the most frequent letter, and the letters 'X', 'Q', and 'Z' are very rare. In Figure 4 we show the distribution of a ciphertext (“The Declaration of Independence” of the US, encrypted with a Vigenère cipher). Here, all letters are more or less equally distributed, showing the cryptanalyst that it is possibly a polyalphabetic substitution cipher.
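The kind of statistic shown in Figures 3 and 4 can be reproduced outside CT2 with a short Python sketch. The function below is an illustrative stand-in for the Frequency Test component, not its implementation; it assumes a Latin alphabet, drops all non-letters before counting, and therefore lets n-grams span word boundaries.

```python
from collections import Counter
import string


def ngram_frequencies(text: str, n: int = 1) -> dict[str, float]:
    """Relative frequencies of all overlapping n-grams, most frequent first."""
    letters = "".join(c for c in text.upper() if c in string.ascii_uppercase)
    grams = [letters[i:i + n] for i in range(len(letters) - n + 1)]
    counts = Counter(grams)
    total = len(grams)
    return {gram: k / total for gram, k in counts.most_common()}
```

Calling `ngram_frequencies(text)` gives the unigram distribution and `ngram_frequencies(text, 2)` the bigram distribution; a strongly peaked unigram result suggests plaintext, monoalphabetic substitution, or transposition, while a flat one suggests a polyalphabetic cipher.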

Another component that helps to analyze and identify a cipher is the Friedman Test, invented by William Friedman. With this test, the key length (number of letters of a keyword or phrase) of a polyalphabetic cipher can be estimated.

In Figure 5 we show the result of the Friedman test performed on plaintext (“The Declaration of Independence” of the US). It shows that the given text is possibly plaintext or a monoalphabetic substitution. The text could also be transposed, since a transposition does not change the letter distribution. In Figure 6 we show the result of the Friedman test performed on ciphertext (“The Declaration of Independence” of the US, encrypted with a Vigenère cipher). It shows that the given text is possibly ciphertext and polyalphabetic. Additionally, it shows that the estimated key length is about 9. The component needs a provided IC (IC provided), which is used as a reference value for the analyzed IC (IC analyzed).

Figure 5: Friedman Test Component Showing Result of English Plaintext

Figure 6: Friedman Test Component Showing Result of Ciphertext
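The key-length estimate of the Friedman test can be sketched with the common textbook formula, which relates the analyzed IC of the ciphertext to the provided IC of the language and the IC of uniformly random text. Whether CT2 uses exactly this formula and these constants is an assumption of this sketch; the function and parameter names are illustrative.

```python
def friedman_key_length(n: int, ic_analyzed: float,
                        ic_provided: float = 0.0667,
                        ic_random: float = 1 / 26) -> float:
    """Textbook Friedman estimate of the keyword length of a polyalphabetic cipher.

    n            number of letters in the ciphertext
    ic_analyzed  IC measured on the ciphertext
    ic_provided  reference IC of the assumed plaintext language
    ic_random    IC of uniformly distributed letters (1/26 for the Latin alphabet)
    """
    numerator = (ic_provided - ic_random) * n
    denominator = (n - 1) * ic_analyzed - ic_random * n + ic_provided
    if denominator <= 0:
        # The measured IC is at or below the random level: no finite estimate.
        return float("inf")
    return numerator / denominator
```

When the analyzed IC is very close to 1/26, the denominator approaches zero and the estimate becomes extremely large, so for short ciphertexts the result should be read as "long or unknown key" rather than as a usable number.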

4.3 Break the Cipher

After identifying the cipher type, the cipher can now be broken with the help of different cryptanalysis components. CT2 contains components for the automatic breaking of the monoalphabetic substitution cipher, the Vigenère cipher, and the columnar transposition cipher.

In Figure 7 we show the Vigenère Analyzer component, which automatically solved a Vigenère cipher (“The Declaration of Independence” of the US, encrypted with a Vigenère cipher). The solver automatically tested every key length between 5 and 20 using hill climbing. Only about ten seconds are needed for the component to automatically break the cipher. The decrypted text is automatically output by the component and can be displayed by a TextOutput component.

Figure 7: Vigenère Analyzer Solving a Cipher

All automatic cryptanalysis components have the same style of user interface. Besides start and end time, the elapsed time of the analysis is shown. Furthermore, some components estimate the remaining time of the automatic analysis.

5 Example Cryptanalysis of Original Classical Ciphers

In this section we present two different real-world classical ciphers that can be broken with CT2.

5.1 Message in a Bottle Sent to General Pemberton in the US Civil War

The following message was sent in a bottle by a Confederate commander on the 4th of July 1863 to General Pemberton in Vicksburg. It was broken by the retired CIA codebreaker David Gaddy in 2010 (Daily Mail Reporter, 2010). We use this message (221 letters) as our first real-world example for breaking classical ciphers with CT2.

In the first step, to automatically analyze the ciphertext, we had to create a transcription as shown in Section 4.1. We could have used the Transcriptor component or done it manually. Since the letters are written inconsistently, the scanned image has only a low resolution, and the message contains ink spots, we did it manually. We show the result of the transcription of the ciphertext in Figure 9.

Figure 8: Encrypted Message in a Bottle Sent by General Johnston

Figure 9: Transcription of Encrypted Message in a Bottle

Now, we could analyze the text to identify the type of the cipher. First, we created a letter frequency analysis (see Figure 10).

The distribution of letters indicated that the message is not encrypted with a monoalphabetic substitution and is probably not transposed. Based on the more or less equal distribution of the letters, we assumed that the message is encrypted with a polyalphabetic cipher. To further strengthen our assumption, we applied the Friedman test and calculated the IC. In Figure 11 we show the results of the computation of the IC and the Friedman test.

The IC of 0.03834 indicated that the message is possibly encrypted with a polyalphabetic cipher. The key length estimated by the Friedman analysis is ≈ 5730, which is impossible for a text of only 221 letters. Thus, the message is either encrypted with a running key cipher, meaning the key is as long as the message, or the Friedman test simply fails because of the short length of the message. Since we know that the Vigenère cipher was often used in the Civil War, we assumed the message could be encrypted with the Vigenère cipher. Other possibilities would be a codebook or a homophone cipher.

In the last step, we tried to break the cipher. Since we assumed it to be a Vigenère cipher, we used the Vigenère Analyzer component to break it.

We automatically tested all key lengths between 1 and 20. Figure 12 shows the final result of the Vigenère Analyzer component. The component displays a toplist of “best” decryptions based on a cost function that rates the quality of the decrypted texts. The higher the cost value (sum of n-gram probabilities of the English language), the higher the place in the toplist. Furthermore, the component shows the keyword or pass phrase used. With “MANCHESTERBLUFF” (15 letters), the message can be broken. The analysis run took 5 seconds on a standard desktop computer with 2.4 GHz. We present the final plaintext in Figure 13.

Figure 10: Letter Frequency Analysis of Encrypted Message in a Bottle

Figure 11: Friedman Test and IC of Encrypted Message in a Bottle

Figure 12: Breaking the Encrypted Message in a Bottle with the Vigenère Analyzer

Figure 13: Message in a Bottle – Revealed Plaintext by Vigenère Analyzer

5.2 Borg Cipher – Encrypted Book from the 17th Century

The Borg cipher is a 408-page manuscript, probably from the 17th century. The manuscript is located at the Biblioteca Apostolica Vaticana (Aldarrab et al., 2018). It is written using special ciphertext symbols. Figure 14 shows a small part of the Borg cipher. We use the book as our second real-world example for breaking classical ciphers with CT2. The cipher was already broken by Aldarrab et al. (2018).

Figure 14: Picture Taken from the Borg Cipher

We took the complete transcription of the book from Aldarrab et al. (2018).

First, we performed a frequency analysis of the text, shown in Figure 15.

Then, we applied the Friedman test to the ciphertext and computed the IC (see the result in Figure 16). Both indicated that the Borg cipher is encrypted using a monoalphabetic substitution.

Figure 15: Letter Frequency Analysis of the Borg Cipher

Figure 16: Friedman Test of the Borg Cipher

Thus, we finally used the Monoalphabetic Substitution Analyzer component of CT2 to break the cipher, see Figure 17. We tested different languages to be used by the analyzer. Latin produced the best results, since the original text is Latin. The analysis run took 8 seconds.

Figure 17: Breaking the Borg Cipher with the Monoalphabetic Substitution Analyzer

We present the first part of the decrypted Borg cipher in Figure 18.

Figure 18: Borg Cipher – Revealed Plaintext by Monoalphabetic Substitution Analyzer

6 Current Cryptanalysis Components in CrypTool 2 and Open Tasks

CT2 contains a set of different components for the automated cryptanalysis of classical ciphers. In Table 1 we show an overview of the components for the cryptanalysis of classical ciphers. Entries marked green refer to components which we have already implemented. Entries marked yellow refer to components that are not implemented yet. The monoalphabetic substitution, the columnar transposition, the Vigenère cipher, and the Enigma machine are already breakable using CT2. We plan to implement homophone cipher analysis, codebook cipher analysis, grille analysis (a special transposition cipher), Playfair (a special substitution), strip and cylinder cipher analyzers, an ADFGVX analyzer, analyzers for the Hagelin machine (Lasry et al., 2016a; Lasry et al., 2016b) (e.g. the M209 cipher machine), and a Hill cipher analyzer.

The currently implemented analysis tools, like frequency analysis, Transcriptor, and Friedman test, were shown in Section 4 and used in Section 5. For the future, we also plan to implement a “Cipher Detector” component which is able to automatically detect the type of cipher used (with a high probability).

7 Conclusion

In this paper, we gave a brief introduction to the e-learning program CrypTool 2 (CT2) and how it can be used to automatically cryptanalyze classical ciphers. First, we gave an introduction to classical ciphers as well as to their cryptanalysis. Then, we briefly presented CT2 and its usage. After that, we showed an approach consisting of three steps (transcription, identification, and analysis) for breaking classical ciphers using CT2. We showed two classical real-world ciphers (“Message in a Bottle Sent to General Pemberton in the US Civil War” and “Borg Cipher – Encrypted Book from the 17th Century”) and described step by step how we broke them with components already implemented in CT2. Then, we gave an overview of methods for the automatic cryptanalysis already implemented in CT2 as well as an overview of cryptanalytic components that we plan to implement.

Table 1: Cryptanalysis Components for Classical Ciphers in CT2 – Overview

CT2 is a project that has now been running for nearly 10 years. During this time, we extended CT2 with state-of-the-art methods for the cryptanalysis of classical as well as modern ciphers. CT2 also makes it possible to cryptanalyze ciphers by connecting different CT2 instances over the Internet (CrypCloud). In the future, we plan to make CT2 easier to use and more user-friendly, so that non-computer scientists can more easily use it for breaking their classical ciphers. There are still many open tasks besides the implementation of cryptanalytic methods. We plan to extend the existing components with a large set of additional languages (e.g. Latin, Greek, and Hebrew). Since many of the historical encrypted books are written in these languages, historians and cryptanalysts will benefit from the newly added languages. Furthermore, we will extend existing cryptanalytic components to be more robust and more general with respect to the alphabets used. Currently, the monoalphabetic substitution analyzer needs (for the transcription) a specific input alphabet consisting of Latin letters. By the end of 2018, all kinds of symbols a computer can process will be supported (e.g. UTF-8 characters). Furthermore, new kinds of classical ciphers and cryptanalytic methods will be added. Examples are grilles and codebooks, which were used extensively in history.
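Until arbitrary symbols are supported directly, a ciphertext written in non-Latin symbols can be mapped onto Latin letters before it is passed to an analyzer that expects them. The following Python sketch shows one way to do this and is not part of CT2: the function name `transcribe_symbols` and its `keep` parameter are hypothetical. It assumes that every cipher symbol is a single Unicode code point and that the ciphertext uses at most 26 distinct symbols.

```python
import string


def transcribe_symbols(ciphertext: str, keep: str = " \n"):
    """Replace each distinct symbol by a Latin letter, in order of first use.

    Characters listed in `keep` (by default space and newline) are copied
    unchanged. Returns the transcribed text and the symbol-to-letter mapping.
    """
    labels = string.ascii_uppercase
    mapping = {}
    out = []
    for symbol in ciphertext:
        if symbol in keep:
            out.append(symbol)
            continue
        if symbol not in mapping:
            if len(mapping) >= len(labels):
                raise ValueError("more than 26 distinct symbols in ciphertext")
            mapping[symbol] = labels[len(mapping)]
        out.append(mapping[symbol])
    return "".join(out), mapping
```

The letters assigned here are arbitrary labels without cryptographic meaning. The returned mapping allows a recovered substitution key to be translated back to the original symbols.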

The CT2 team highly welcomes suggestions, wishes, and ideas from historians, cryptanalysts, and everybody else for additional ciphers and automated cryptanalysis methods that should be included in CT2 in the future. The list shown in Table 1 is open for new entries proposed by anyone. Since CT2 is open-source software, we welcome everyone to contribute to the CT2 project (programmers, testers, etc.). Finally, everyone interested in CT2 may download the software for free from https://www.cryptool.org/.

References

Nada Aldarrab, Kevin Knight, and Beata Megyesi. 2018. The Borg.lat.898 Cipher. http://stp.lingfil.uu.se/~bea/borg/.

Michael J Cowan. 2008. Breaking short playfair ciphers with the simulated annealing algorithm. Cryptologia, 32(1):71–83.

Daily Mail Reporter. 2010. CIA codebreaker reveals 147-year-old Civil War message about the Confederate army’s desperation. https://dailym.ai/2JkVFCu.

Amrapali Dhavare, Richard M Low, and Mark Stamp. 2013. Efficient cryptanalysis of homophonic substitution ciphers. Cryptologia, 37(3):250–281.

William S Forsyth and Reihaneh Safavi-Naini. 1993. Automated cryptanalysis of substitution ciphers. Cryptologia, 17(4):407–418.

William Frederick Friedman. 1987. The index of coincidence and its applications in cryptanalysis. Aegean Park Press California.

James J Gillogly. 1995. Ciphertext-Only Cryptanalysis of Enigma. Cryptologia, 19(4):405–413.

Nils Kopal, Olga Kieselmann, Arno Wacker, and Bernhard Esslinger. 2014. CrypTool 2.0. Datenschutz und Datensicherheit-DuD, 38(10):701–708.

Nils Kopal. 2018. Secure Volunteer Computing for Distributed Cryptanalysis. http://www.upress.uni-kassel.de/katalog/abstract.php?978-3-7376-0426-0.

George Lasry, Nils Kopal, and Arno Wacker. 2016a. Automated Known-Plaintext Cryptanalysis of Short Hagelin M-209 Messages. Cryptologia, 40(1):49–69.

George Lasry, Nils Kopal, and Arno Wacker. 2016b. Ciphertext-only cryptanalysis of Hagelin M-209 pins and lugs. Cryptologia, 40(2):141–176.

George Lasry, Nils Kopal, and Arno Wacker. 2016c. Cryptanalysis of columnar transposition cipher with long keys. Cryptologia, 40(4):374–398.

George Lasry, Ingo Niebel, Nils Kopal, and Arno Wacker. 2017. Deciphering ADFGVX messages from the Eastern Front of World War I. Cryptologia, 41(2):101–136.

George Lasry. 2018. A Methodology for the Cryptanalysis of Classical Ciphers with Search Metaheuristics. kassel university press GmbH.

Thomas G Mahon and James Gillogly. 2008. Decoding the IRA. Mercier Press Ltd.

Beata Megyesi, Kevin Knight, and Nada Aldarrab. 2017. DECODE – Automatic Decryption of Historical Manuscripts. http://stp.lingfil.uu.se/~bea/decode/.

Phil Pilcrow. 2018. CryptoCrack. http://www.cryptoprograms.com/.

Tobias Schrödel. 2008. Breaking Short Vigenère Ciphers. Cryptologia, 32(4):334–347.
