Polyalphabetic Ciphers
Linguistics 484
Summary
The idea
How to recognize: index of coincidence
How many alphabets: Kasiski
The idea
Remove the invariant that a plaintext letter always maps to the same cryptotext letter.
Smooth out the frequency distribution, removing clues.
Monoalphabetic
[Diagram: Plaintext → Cryptosystem → Ciphertext]
Polyalphabetic
[Diagram, shown in several animation steps: Plaintext → Cryptosystem → Ciphertext, with the labels AA, BB and CC]
Polyalphabetic system
Cryptosystem with several components.
Systematic way of moving from one cryptosystem to the next.
Vigenère (simplified)
Component ciphers are shift ciphers, using the so-called Direct Standard Alphabet.
You use each alphabet for one character, then move on.
Vigenère (simplified)
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
B C D E F G H I J K L M N O P Q R S T U V W X Y Z A
C D E F G H I J K L M N O P Q R S T U V W X Y Z A B
. . .
Z A B C D E F G H I J K L M N O P Q R S T U V W X Y
Vigenère (simplified)
You and your friend agree on a single-letter key, say ‘S’.
Encrypt the first letter with the ‘S’ alphabet, the second with the ‘T’ alphabet, and so on.
Vigenère (simplified), key = "S"
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
. . .
S T U V W X Y Z A B C D E F G H I J K L M N O P Q R
T U V W X Y Z A B C D E F G H I J K L M N O P Q R S
U V W X Y Z A B C D E F G H I J K L M N O P Q R S T
V W X Y Z A B C D E F G H I J K L M N O P Q R S T U
. . .
BOOK → THIF (B in the S alphabet, O in T, O in U, K in V)
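The single-letter scheme can be sketched in a few lines of Python. This is only an illustration: the function name `progressive_shift` is hypothetical, and dropping spaces and punctuation is an assumption the slides do not state.

```python
import string

ALPHABET = string.ascii_uppercase  # the direct standard alphabet A..Z


def progressive_shift(plaintext, start_key):
    """Encrypt with the alphabet named by start_key, then the next one, and so on.

    Hypothetical helper for illustration; non-letters are dropped (an assumption).
    """
    start = ALPHABET.index(start_key.upper())
    letters = [ch for ch in plaintext.upper() if ch in ALPHABET]
    # Letter number i (counting from 0) is shifted by start + i places.
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + start + i) % 26]
        for i, ch in enumerate(letters)
    )


print(progressive_shift("ATTACK", "S"))  # prints SMNVYH
```

The two repeated T's in ATTACK come out as different letters (M and N), which is the smoothing effect the scheme is after.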
Polyalphabetic system
Cryptosystem with several components.
Systematic way of moving from one cryptosystem to the next.
But there are two weaknesses in simplified Vigenère:
Direct standard alphabets: breaking one character gives away the whole alphabet.
The pattern of movement is too obvious.
Full Vigenère
Use a keyword to control the jump between alphabets.
The pattern of movement is no longer as obvious.
Vigenère, key = "SYMBOL"
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
S T U V W X Y Z A B C D E F G H I J K L M N O P Q R
Y Z A B C D E F G H I J K L M N O P Q R S T U V W X
M N O P Q R S T U V W X Y Z A B C D E F G H I J K L
B C D E F G H I J K L M N O P Q R S T U V W X Y Z A
O P Q R S T U V W X Y Z A B C D E F G H I J K L M N
L M N O P Q R S T U V W X Y Z A B C D E F G H I J K
THE ATOMIC ENERGY → L...
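A keyword-driven version might look like the sketch below. The function name `vigenere` and its `decrypt` flag are made up for illustration. It assumes, as the five-letter ciphertext groups in these slides suggest, that spaces are ignored and the keyword advances once per letter.

```python
import string

ALPHABET = string.ascii_uppercase


def vigenere(text, keyword, decrypt=False):
    """Shift each letter by the keyword letter above it (hypothetical helper).

    Non-letters are dropped; the keyword must contain only the letters A-Z.
    """
    letters = [ch for ch in text.upper() if ch in ALPHABET]
    out = []
    for i, ch in enumerate(letters):
        shift = ALPHABET.index(keyword[i % len(keyword)].upper())
        if decrypt:
            shift = -shift  # deciphering undoes the same shift
        out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
    return "".join(out)


ciphertext = vigenere("THE ATOMIC ENERGY", "SYMBOL")
print(ciphertext[0])                                  # prints L
print(vigenere(ciphertext, "SYMBOL", decrypt=True))   # prints THEATOMICENERGY
```

Deciphering reuses the same loop with the shift negated, so one function covers both directions.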
Exercise
Encipher THE ATOMIC ENERGY with the keyword SYMBOL.
Decipher AVYUL HWLEE UCZLL LTYVI YOFJI ZSLNI, knowing that the keyword is HOUSE.
Vigenère, key = "HOUSE"
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
H I J K L M N O P Q R S T U V W X Y Z A B C D E F G
O P Q R S T U V W X Y Z A B C D E F G H I J K L M N
U V W X Y Z A B C D E F G H I J K L M N O P Q R S T
S T U V W X Y Z A B C D E F G H I J K L M N O P Q R
E F G H I J K L M N O P Q R S T U V W X Y Z A B C D
AVYUL → TH...
Breaking Vigenère
How many alphabets?
Index of coincidence
Babbage-Kasiski examination
Once you know how many alphabets there are, use frequency analysis as for regular shift ciphers.
Index of coincidence
Based on arguments about probability.
Intuition: measure the roughness of the frequency distribution.
Mathematical details follow.
Roughness of distributions
The smoothest distribution has each letter occurring 1/26 of the time.
The roughest has one letter occurring 100% of the time.
Normal English has some unevenness; it is less smooth than a totally uniform distribution.
Index of coincidence
Get a frequency count f[letter] for each letter.
Multiply f[letter]*(f[letter]-1) to get the number of coincidences involving that letter. Add the results for all letters together.
Divide by N(N-1), the number of coincidences you would expect if all N letters of the text were the same.
Index of coincidence
$$\mathrm{IC} = \frac{\sum_{\text{letter}} f[\text{letter}]\,\bigl(f[\text{letter}] - 1\bigr)}{N(N-1)}$$
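As a rough sketch, the formula translates directly into Python. The function name `index_of_coincidence` is chosen here for illustration, and skipping anything that is not a letter A-Z is an assumption.

```python
import string
from collections import Counter


def index_of_coincidence(text):
    """Return sum(f*(f-1)) / (N*(N-1)) over the letters A..Z in text."""
    letters = [ch for ch in text.upper() if ch in string.ascii_uppercase]
    n = len(letters)
    if n < 2:
        raise ValueError("need at least two letters to form a pair")
    counts = Counter(letters)  # f[letter] for each letter that occurs
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))


print(index_of_coincidence("AAAA"))                  # prints 1.0
print(index_of_coincidence(string.ascii_uppercase))  # prints 0.0
```

The two printed values are the extremes: a text made of one repeated letter scores 1, and a text in which no letter repeats scores 0. Real ciphertexts fall in between.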
Index of coincidence
IC has a value of about 0.038 (that is, 1/26) if the letters are evenly distributed, which is what you get if the polyalphabetic cipher uses a great many alphabets.
It has a value of about 0.066 for English text, for monoalphabetic encryptions of English text, and for many other things.
Number of alphabets    IC
1                      0.066
2                      0.052
5                      0.044
10                     0.041
large                  0.038
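The table values are consistent with a simple approximation that the slides do not spell out, so treat it as an assumption. In a long text enciphered with k equally used alphabets, a random pair of letters falls in the same alphabet about 1/k of the time (IC about 0.066) and in different alphabets otherwise (IC about 0.038). A sketch, with hypothetical function names:

```python
IC_ENGLISH = 0.066  # value quoted for English / monoalphabetic text
IC_UNIFORM = 0.038  # value quoted for evenly distributed letters


def expected_ic(num_alphabets):
    """Large-text approximation: 0.038 + (0.066 - 0.038) / k."""
    return IC_UNIFORM + (IC_ENGLISH - IC_UNIFORM) / num_alphabets


def estimate_num_alphabets(measured_ic):
    """Invert the approximation; only a rough guide for real ciphertexts."""
    if measured_ic <= IC_UNIFORM:
        return float("inf")  # indistinguishable from "many alphabets"
    return (IC_ENGLISH - IC_UNIFORM) / (measured_ic - IC_UNIFORM)


for k in (1, 2, 5, 10):
    print(k, round(expected_ic(k), 3))
```

Because the curve flattens quickly, the estimate is mainly useful for small numbers of alphabets. Beyond about five, the expected values are too close together to separate reliably.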
Idea to quantify roughness
Count the number of times a pair of letters drawn at random from the text happen to be the same.
For the roughest possible distribution, we always get the same letter, so a text of length N has N(N-1) repeats.
For the smoothest possible distribution, we get far fewer.
Babbage-Kasiski method
Explained well in Code Book, pp. 67-72.
Key idea: what does it mean if we find a repeated sequence of four or more characters in a message that has been encoded using a repeated keyword?
Most likely: a sequence of four or more English characters in the plaintext has been encoded twice, starting from the same place in the repeating keyword.
Less likely: it’s an accident; some other arrangement of English letters gives rise to a repeat by chance.
Repeats
If a repeat arises because the same thing is encoded twice in the same way, then the keyword must be used a whole number of times to get from one occurrence to the other.
So, keep track of the spacing between repeats.
Nearly every repeat will have a spacing that divides evenly by the length of the keyword.
So, break the spacings into factors and look for something that (almost) always turns up.
Repeats
Sequence    Spacing    Factors
EFIQ        95         5, 19
PSDLP       5          5
WCXYM       20         2, 4, 5, 10, 20
ETRL        120        2, 3, 4, 5, 6, 8, 10, ...
Breaking Vigenère
Once you know how many alphabets there are, use frequency analysis as for regular shift ciphers.
If there are five different alphabets, tally up characters 1, 6, 11, ... into one table, 2, 7, 12, ... into a second, 3, 8, 13, ... into a third, and so on up to the fifth.
Each table will show the characteristic frequency pattern of a shifted alphabet (high A and E close to each other, low J and K next to each other, X, Y, Z all low, etc.).
See if the keyword is sensible; it might be an English word. Then plug in the letters and check whether the message works out.
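The tallying step can be sketched as follows. Instead of matching the frequency pattern by eye, this version scores every possible shift with a chi-squared comparison against approximate English letter frequencies. That scoring method is a substitute technique, not something the slides prescribe. The frequency figures are rounded textbook values, and the function names are hypothetical.

```python
import string

ALPHABET = string.ascii_uppercase

# Approximate English letter frequencies in percent, A..Z (sources vary slightly).
ENGLISH = [8.17, 1.49, 2.78, 4.25, 12.70, 2.23, 2.02, 6.09, 6.97, 0.15,
           0.77, 4.03, 2.41, 6.75, 7.51, 1.93, 0.10, 5.99, 6.33, 9.06,
           2.76, 0.98, 2.36, 0.15, 1.97, 0.07]


def best_shift(column):
    """Return the shift (0..25) under which the column looks most like English."""
    counts = [column.count(ch) for ch in ALPHABET]
    n = len(column)

    def chi_squared(shift):
        # Plaintext letter i would appear in the column as letter (i + shift) mod 26.
        return sum(
            (counts[(i + shift) % 26] - n * ENGLISH[i] / 100) ** 2
            / (n * ENGLISH[i] / 100)
            for i in range(26)
        )

    return min(range(26), key=chi_squared)


def recover_keyword(ciphertext, num_alphabets):
    """Split into columns (characters 1, 6, 11, ... etc.) and solve each as a shift cipher."""
    letters = "".join(ch for ch in ciphertext.upper() if ch in ALPHABET)
    if len(letters) < num_alphabets:
        raise ValueError("ciphertext is shorter than the number of alphabets")
    columns = [letters[j::num_alphabets] for j in range(num_alphabets)]
    return "".join(ALPHABET[best_shift(col)] for col in columns)
```

Calling `recover_keyword(ciphertext, 5)` returns a candidate five-letter keyword. With only a handful of letters per column the guess is unreliable, so short messages still need the final sanity check of reading the candidate keyword and the resulting plaintext.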