
Polyalphabetic CIPHERS Linguistics 484. Summary The idea How to recognize: index of coincidence How many alphabets: Kasiski


Polyalphabetic CIPHERS

Linguistics 484

Summary

The idea

How to recognize: index of coincidence

How many alphabets: Kasiski

The idea

Remove the invariant that a plaintext letter always maps to the same cryptotext letter.

Smooth out the frequency distribution, removing clues.

Monoalphabetic

[Diagram: Plaintext → Cryptosystem → Ciphertext]

Polyalphabetic

[Diagram, built up over four slides: Plaintext → Cryptosystem → Ciphertext, with boxes labeled A, B, C]

Polyalphabetic system

Cryptosystem with several components.

Systematic way of moving from one cryptosystem to the next.

Vigenère (simplified)

Component ciphers are shift ciphers, using the so-called Direct Standard Alphabet.

You use each alphabet for one character, then move on.

Vigenère (simplified)

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

B C D E F G H I J K L M N O P Q R S T U V W X Y Z A

C D E F G H I J K L M N O P Q R S T U V W X Y Z A B

. .

Z A B C D E F G H I J K L M N O P Q R S T U V W X Y

Vigenère (simplified)

You and your friend agree on a single-letter key, say ‘S’.

Encrypt the first letter with the ‘S’ alphabet, the second with the ‘T’ alphabet, and so on.
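The single-letter-key scheme can be sketched in a few lines of Python. This is only an illustration under some assumptions: the function name `encrypt_progressive` is made up here, and letters are upper-cased while spaces and punctuation pass through without using up an alphabet.

```python
import string

ALPHABET = string.ascii_uppercase


def encrypt_progressive(plaintext: str, start: str) -> str:
    """Simplified Vigenère: use the `start` alphabet, then the next one, and so on."""
    shift = ALPHABET.index(start.upper())
    out = []
    for ch in plaintext.upper():
        if ch not in ALPHABET:
            out.append(ch)  # assumption: non-letters are copied unchanged
            continue
        # Row `shift` of the table: the plaintext letter moves `shift` places along.
        out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        shift = (shift + 1) % 26  # move on to the next alphabet
    return "".join(out)


print(encrypt_progressive("BOOK", "S"))
```

Because the shift grows by exactly one per letter, anyone who works out a single alphabet knows all the others too.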

Vigenère (simplified), key="S"

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

. .

S T U V W X Y Z A B C D E F G H I J K L M N O P Q R

T U V W X Y Z A B C D E F G H I J K L M N O P Q R S

U V W X Y Z A B C D E F G H I J K L M N O P Q R S T

V W X Y Z A B C D E F G H I J K L M N O P Q R S T U

. .

BOOK → THIF

Polyalphabetic system

Cryptosystem with several components.

Systematic way of moving from one cryptosystem to the next.

But there are two weaknesses in simplified Vigenère:

Direct standard alphabets. Breaking one character gives you the whole alphabet.

The pattern of movement is too obvious.

Full Vigenère

Use a keyword to control the jump between alphabets.

The pattern of movement is no longer as obvious.
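A minimal Python sketch of keyword-controlled Vigenère follows. The name `vigenere` and the `decrypt` flag are choices made for this illustration, and it assumes the keyword contains only the letters A-Z. Non-letters in the text are copied through without consuming a keyword letter.

```python
import string

ALPHABET = string.ascii_uppercase


def vigenere(text: str, keyword: str, decrypt: bool = False) -> str:
    """Shift each letter by the current keyword letter, cycling through the keyword."""
    shifts = [ALPHABET.index(k) for k in keyword.upper()]  # assumes A-Z only
    sign = -1 if decrypt else 1
    out = []
    used = 0  # number of keyword letters consumed so far
    for ch in text.upper():
        if ch not in ALPHABET:
            out.append(ch)  # assumption: spaces and punctuation are left alone
            continue
        shift = sign * shifts[used % len(shifts)]
        out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        used += 1
    return "".join(out)


ciphertext = vigenere("THE ATOMIC ENERGY", "SYMBOL")
assert vigenere(ciphertext, "SYMBOL", decrypt=True) == "THE ATOMIC ENERGY"
```

Deciphering is the same walk through the keyword with the shifts subtracted instead of added.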

Vigenère key="SYMBOL"

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

S T U V W X Y Z A B C D E F G H I J K L M N O P Q R

Y Z A B C D E F G H I J K L M N O P Q R S T U V W X

M N O P Q R S T U V W X Y Z A B C D E F G H I J K L

B C D E F G H I J K L M N O P Q R S T U V W X Y Z A

O P Q R S T U V W X Y Z A B C D E F G H I J K L M N

L M N O P Q R S T U V W X Y Z A B C D E F G H I J K

THE ATOMIC ENERGY → L..

Exercise

Encipher THE ATOMIC ENERGY with the keyword SYMBOL

Decipher AVYUL HWLEE UCZLL LTYVI YOFJI ZSLNI knowing that the keyword is HOUSE

Vigenère key="HOUSE"

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

H I J K L M N O P Q R S T U V W X Y Z A B C D E F G

O P Q R S T U V W X Y Z A B C D E F G H I J K L M N

U V W X Y Z A B C D E F G H I J K L M N O P Q R S T

S T U V W X Y Z A B C D E F G H I J K L M N O P Q R

E F G H I J K L M N O P Q R S T U V W X Y Z A B C D

AVYUL → TH...

Breaking Vigenère

How many alphabets?

Index of coincidence

Babbage-Kasiski examination

Once you know how many alphabets there are, use frequency analysis as for regular shift ciphers.

Index of coincidence

Based on arguments about probability.

Intuition: measure the roughness of the frequency distribution.

Mathematical details follow.

Roughness of distributions

The smoothest distribution has each letter happening 1/26th of the time.

The roughest has one letter happening 100% of the time.

Normal English has some unevenness, so it is less smooth than totally uniform.

Index of coincidence

Get a count f[letter] for each letter: the number of times it occurs in the text.

Multiply f[letter]*(f[letter]-1) to get the number of coincidences involving that letter. Add the results for all letters together.

Divide by N(N-1), the number of coincidences you would get if all N letters of the text were the same.

Index of coincidence

IC = sum(f[letter]*(f[letter]-1)) / (N*(N-1))
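The formula translates almost directly into Python. This is a sketch, not a reference implementation: the function name is made up here, only the letters A-Z are counted, and the counts are taken with `collections.Counter`.

```python
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """IC = sum(f*(f-1)) / (N*(N-1)), where f is each letter's count."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    n = len(letters)
    if n < 2:
        raise ValueError("need at least two letters to form a pair")
    counts = Counter(letters)
    coincidences = sum(f * (f - 1) for f in counts.values())
    return coincidences / (n * (n - 1))


print(index_of_coincidence("AAAA"))  # roughest case: every pair matches
print(index_of_coincidence("ABCD"))  # no letter repeats, so no pair matches
```

On real ciphertext the value is only meaningful for reasonably long messages, since short texts give noisy counts.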

Index of coincidence

IC has a value of 0.038 (about 1/26) if the letters are evenly distributed, which is what you get if the polyalphabetic cipher uses very many alphabets.

It has a value of about 0.066 for English text, for monoalphabetic encryptions of English text, and for other texts with a similarly uneven letter distribution.

Number of alphabets    IC

1                      0.066

2                      0.052

5                      0.044

10                     0.041

large                  0.038
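One way to see where numbers like these come from, as a rough sketch rather than the derivation behind the table: in a long text enciphered with d alphabets, two letters picked at random fall under the same alphabet about 1/d of the time and then coincide at the English rate; otherwise they coincide at the uniform rate. The function names below are made up, and the long-text approximation is an assumption.

```python
ENGLISH_IC = 0.066  # value given above for English text
UNIFORM_IC = 0.038  # value given above for evenly distributed letters


def expected_ic(num_alphabets: int) -> float:
    """Approximate IC of a long text enciphered with `num_alphabets` alphabets."""
    same = 1 / num_alphabets  # chance both letters used the same alphabet
    return same * ENGLISH_IC + (1 - same) * UNIFORM_IC


def estimate_alphabets(ic: float) -> float:
    """Invert the approximation: rough number of alphabets for a measured IC."""
    return (ENGLISH_IC - UNIFORM_IC) / (ic - UNIFORM_IC)


for d in (1, 2, 5, 10):
    print(d, round(expected_ic(d), 3))
```

The estimate gets unreliable quickly as the IC approaches 0.038, which is why the Kasiski method is a useful second opinion.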

Idea to quantify roughness

Count the number of times a pair of letters drawn at random from the text happen to be the same.

For the roughest possible distribution, we always get the same letter, so a text of length N has N(N-1) repeats (counting ordered pairs).

For the smoothest possible distribution, we get far fewer.

Babbage-Kasiski method

Explained well in Code Book, pp. 67-72.

Key idea: what does it mean if we find a sequence of repeated characters in a message that has been encoded using a repeated keyword?

Babbage-Kasiski method

Key idea: what does it mean if we find a sequence of four or more repeated characters in a message that has been encoded using a repeated keyword?

Most likely: a sequence of four or more English characters in the plaintext has been encoded twice starting from the same place in the repeating keyword.

Less likely: it’s an accident, and some other arrangement of English letters gives rise to a repeat by chance.

Repeats

OK, if a repeat is due to the fact that the same thing is encoded twice in the same way, then the keyword must be used a whole number of times to get from one to the other.

So, keep track of the spacing between repeats.

Repeats

So, keep track of the spacing between repeats.

Nearly every repeat will have a spacing that divides evenly by the length of the keyword.

So, break the spacings into factors and look for something that (almost) always turns up.

Repeats

So, break the spacings into factors and look for something that (almost) always turns up.

sequence    spacing    factors

EFIQ        95         5 19

PSDLP       5          5

WCXYM       20         2 4 5 10 20

ETRL        120        2 3 4 5 6 8 10...
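The bookkeeping in this table is easy to automate. The sketch below is one possible way to do it, with made-up function names. It assumes the ciphertext has already been stripped to letters, looks only at repeats of a fixed length (four by default), and considers candidate keyword lengths from 2 to 20, a cap chosen for this illustration.

```python
from collections import Counter, defaultdict


def repeat_spacings(ciphertext: str, length: int = 4) -> dict:
    """Map each repeated sequence to the gaps between its successive positions."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - length + 1):
        positions[ciphertext[i:i + length]].append(i)
    return {
        seq: [b - a for a, b in zip(where, where[1:])]
        for seq, where in positions.items()
        if len(where) > 1
    }


def factors(spacing: int, max_len: int = 20) -> list:
    """Candidate keyword lengths (2..max_len) that divide the spacing evenly."""
    return [d for d in range(2, max_len + 1) if spacing % d == 0]


def tally_factors(spacings: dict, max_len: int = 20) -> Counter:
    """Count how often each candidate length divides an observed spacing."""
    votes = Counter()
    for gaps in spacings.values():
        for gap in gaps:
            votes.update(factors(gap, max_len))
    return votes


# The spacings from the table above:
table = {"EFIQ": [95], "PSDLP": [5], "WCXYM": [20], "ETRL": [120]}
print(tally_factors(table).most_common(1))  # the length dividing the most spacings
```

For the four spacings in the table, 5 is the only candidate that divides every one of them.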

Breaking Vigenère

Once you know how many alphabets there are, use frequency analysis as for regular shift ciphers.

If there are five different alphabets, tally up characters 1, 6, 11, ... into one table, 2, 7, 12, ... into a second, 3, 8, 13, ... into the third, and so on up to the fifth.

The results will show the characteristic frequency pattern of a shifted alphabet (high A and E close to each other, low J and K next to each other, X, Y, Z all low, etc.)

Breaking Vigenère

Once you know how many alphabets there are, use frequency analysis as for regular shift ciphers.

The results will show the characteristic frequency pattern of a shifted alphabet (high A and E close to each other, low J and K next to each other, X, Y, Z all low, etc.)

See if the keyword is sensible. It might be an English word. Then plug in letters and check whether the message works out.
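The tallying step can be sketched as follows. The function names are made up, and `guess_key_letter` uses a deliberately crude shortcut that is not in the slides: it assumes the most common letter in each column stands for plaintext E. Matching the whole frequency pattern by eye, as described above, is more reliable on short texts. The sketch also assumes the text is longer than the keyword, so no column is empty.

```python
from collections import Counter


def split_columns(ciphertext: str, num_alphabets: int) -> list[str]:
    """Characters 1, 1+d, 1+2d, ... go in the first column, and so on."""
    letters = [ch for ch in ciphertext.upper() if "A" <= ch <= "Z"]
    return ["".join(letters[i::num_alphabets]) for i in range(num_alphabets)]


def guess_key_letter(column: str) -> str:
    """Crude guess (an assumption): the column's most common letter is plaintext E."""
    most_common = Counter(column).most_common(1)[0][0]
    shift = (ord(most_common) - ord("E")) % 26
    return chr(ord("A") + shift)


def guess_keyword(ciphertext: str, num_alphabets: int) -> str:
    """Apply the crude guess to every column to propose a keyword."""
    columns = split_columns(ciphertext, num_alphabets)
    return "".join(guess_key_letter(col) for col in columns)


print(split_columns("ABCDEFGHIJK", 5))
```

A proposed keyword that reads like an English word is a good sign; one that is off by a letter or two can often be repaired by hand and checked against the message.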