Cryptographic Hash Functions
CS432

Overview
• Hash Functions
• Hash Algorithms:
  • MD5 (Message Digest)
  • SHA-1 (Secure Hash Algorithm)
Hash functions
• A hash function computes a fixed-length value from a variable-length source.
• Examples: checksums in communication protocols, indices in databases.
• It is more convenient to handle a hash of a document than the document itself.
Cryptographic Hash Functions (answers.com)
• In cryptography, a cryptographic hash function is a hash function with certain additional security properties that make it suitable for use as a primitive in various information security applications, such as authentication and message integrity.
• A hash function takes a long string (or 'message') of any length as input and produces a fixed-length string as output, sometimes termed a message digest or a digital fingerprint.
Hash functions, definition
• A hash function is a function $f : \{0,1\}^* \to \{0,1\}^n$.
• The size of the output, $n$, is a property of the function. Common values are 128, 160 and 256.
• Informally, a transformation of a message of arbitrary length into a fixed-length number is called a hash function.
• Alternate names for the output are fingerprint or digest.
• Commonly used hash functions are MD5, SHA and SHA-1.
Simple Examples
• f(m) = first 70 bits of m
• f(m) = last 80 bits of m
• f(m) = XOR of the bytes of m
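The third example can be sketched in a few lines of Python. The function name `xor_hash` is only illustrative. The sketch shows that such a function does map any input to a fixed-length (8-bit) value, but that collisions are trivial to construct, which is why it is not a cryptographic hash.

```python
from functools import reduce

def xor_hash(message: bytes) -> int:
    """Return the XOR of all bytes of the message (an 8-bit 'hash')."""
    return reduce(lambda acc, byte: acc ^ byte, message, 0)

# XOR ignores byte order, so any reordering of the message collides.
print(xor_hash(b"pay 10") == xor_hash(b"pay 01"))
```

Because reordering the bytes never changes the result, an attacker can find a second message with the same hash without any search at all.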
Properties of a Good Hash Function
Let H be a hash function.
• One-way: given x, it is infeasible to compute a v such that H(v) = x.
• Collision-free: it is infeasible to find x1 and x2 such that H(x1) = H(x2) and x1 ≠ x2.
Applications of Hash Functions
Hash functions are used for:
• message and file integrity
• secure login
• fingerprints of keys
• authentication
• digital signatures
Required Properties of Hash Functions
• Preimage resistant: given h, it should be hard to find any m such that h = hash(m).
• Second preimage resistant: given an input m1, it should be hard to find another input m2 (not equal to m1) such that hash(m1) = hash(m2). This property is implied by collision resistance.
• Collision resistant: it should be hard to find any two different messages m1 and m2 such that hash(m1) = hash(m2).
• Due to a possible birthday attack, collision resistance requires the hash function output to be at least twice as large as what is required for preimage resistance.
Birthday Attack
• A birthday attack is a type of cryptographic attack that exploits the mathematics behind the birthday paradox, making use of a space-time tradeoff.
• Specifically, if a function f(x) yields any of H different outputs with equal probability and H is sufficiently large, then after evaluating the function for about $1.25\sqrt{H}$ different arguments we expect to obtain a pair of different arguments x1 and x2 with f(x1) = f(x2), known as a collision.
Birthday Paradox
• In probability theory, the birthday paradox states that given a group of 23 (or more) randomly chosen people, the probability is more than 50% that some pair of them will have the same birthday.
• For 57 or more people, the probability is greater than 99%, although it cannot be exactly 100% unless there are at least 367 people. [1]
• Calculating this probability (and related ones) is the birthday problem.
• The mathematics behind it has been used to devise a well-known cryptographic attack named the birthday attack.
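The figures quoted for 23 and 57 people can be checked with a short calculation. The sketch below assumes 365 equally likely birthdays (no leap day). The function name and the `days` parameter are illustrative.

```python
def birthday_collision_probability(people: int, days: int = 365) -> float:
    """Probability that at least two of `people` share a birthday."""
    p_all_distinct = 1.0
    for i in range(people):
        # The (i + 1)-th person must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct

for n in (22, 23, 57):
    print(n, round(birthday_collision_probability(n), 4))
```

The probability crosses 50% between 22 and 23 people and exceeds 99% at 57, matching the slide.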
Birthday Attack
• Any function $H : \{0,1\}^* \to \{0,1\}^n$ must have infinitely many collisions.
• It requires $O(2^{n/2})$ evaluations of H to find two messages m and m' that have a collision, H(m) = H(m').
• This means n must be reasonably large; otherwise H cannot be collision resistant.
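To see the $2^{n/2}$ behaviour in practice, one can attack a deliberately small hash. The sketch below is an illustration, not part of the original slides. It assumes the first n bits of SHA-256 (via Python's `hashlib`) as a stand-in for an n-bit hash, and uses decimal counter strings as the messages.

```python
import hashlib

def truncated_hash(message: bytes, n_bits: int) -> int:
    """First n_bits of SHA-256, standing in for a small n-bit hash."""
    digest = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return digest >> (256 - n_bits)

def find_collision(n_bits: int):
    """Hash distinct counter messages until two share a truncated hash."""
    seen = {}
    counter = 0
    while True:
        message = str(counter).encode()
        h = truncated_hash(message, n_bits)
        if h in seen:
            return seen[h], message, counter + 1
        seen[h] = message
        counter += 1

m1, m2, evaluations = find_collision(24)
print(m1, m2, evaluations)
```

For n = 24 the search typically ends after a few thousand evaluations, on the order of $2^{12}$, rather than anything close to $2^{24}$. The table of seen hashes is the memory cost of the space-time tradeoff.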
Attacks
• Suppose a hash function H produces n-bit values.
• Compose a document good doc and about $2^{n/2+1}$ semantically equivalent versions.
• Similarly, compose a bad doc and about $2^{n/2+1}$ semantically equivalent versions.
• With probability ½ or more there will be a version of the good doc and a version of the bad doc that have the same hash value.
Hash Algorithms
• We will consider two main examples:
  • The message digest algorithm MD5 by Ron Rivest, with 128-bit hash values.
  • The secure hash algorithm SHA-1. It was developed by NSA and standardized by NIST. This algorithm uses 160-bit hash values encoded in 5 x 32-bit words.
• Other variations of SHA include SHA-256, SHA-384 and SHA-512.
• Collisions in SHA-1 can be found in about $2^{63}$ attempts.
• A collision in MD5 can be found in 8 hours using a notebook PC...
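The digest sizes above can be observed with Python's standard `hashlib` module. The message `b"abc"` is an arbitrary choice. Some hardened Python builds disable MD5, in which case the first iteration would raise an error.

```python
import hashlib

message = b"abc"
for name in ("md5", "sha1", "sha256"):
    h = hashlib.new(name, message)
    # digest_size is in bytes; multiply by 8 for the size in bits.
    print(name, h.digest_size * 8, h.hexdigest())
```

The reported sizes are 128, 160 and 256 bits respectively, regardless of the length of the input.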
MD5: Message Digest Algorithm
• Its compression function processes message blocks of 512 bits and produces a hash of length 128 bits.
• A message of arbitrary length is padded to a length k ≡ 448 (mod 512).
• A 64-bit string describing the length of the message is added. The message length is now a multiple of 512.
• The hashing is done block-by-block.
Step 1: Append padding bits
• The message is padded so that its bit length is ≡ 448 (mod 512), i.e., the length of the padded message is 64 bits less than an integer multiple of 512 bits.
• Padding is always added, even if the message is already of the desired length (1 to 512 bits).
• Padding bits: 1000…0 (a single 1-bit followed by the necessary number of 0-bits).
Step 2: Append length
• 64-bit length: contains the length of the original message modulo $2^{64}$.
• The expanded message is $Y_0, Y_1, \ldots, Y_{L-1}$; the total length is L × 512 bits.
• The expanded message can be thought of as a multiple of 16 32-bit words.
• Let M[0 … N-1] denote the words of the resulting message, where N = L × 16.
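Steps 1 and 2 can be sketched as follows. This is a minimal illustration that assumes byte-aligned messages (so the leading 1-bit plus seven 0-bits is the byte `0x80`) and the little-endian length encoding that MD5 uses. The function name `md5_pad` is hypothetical.

```python
import struct

def md5_pad(message: bytes) -> bytes:
    """Pad a byte string the way MD5 does before block processing."""
    bit_length = (8 * len(message)) % 2**64        # length modulo 2^64
    padded = message + b"\x80"                      # a single 1-bit, then 0-bits
    padded += b"\x00" * ((56 - len(padded)) % 64)   # reach 448 bits mod 512
    return padded + struct.pack("<Q", bit_length)   # 64-bit little-endian length

for n in (0, 3, 55, 56, 64):
    print(n, len(md5_pad(b"a" * n)) // 64, "block(s)")
```

A 55-byte message still fits in one block, while a 56-byte message needs a second block, because padding is always added.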
MD5 Algorithm Architecture
Initialization Vector
• A buffer containing four 32-bit words A, B, C, D is used to compute the hash value.
• Initializations are:
  A = 01 23 45 67
  B = 89 ab cd ef
  C = fe dc ba 98
  D = 76 54 32 10
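The byte strings above list each word with its low-order byte first. Read as little-endian 32-bit integers, they become the constants 0x67452301, 0xEFCDAB89, 0x98BADCFE and 0x10325476 used in the usual pseudocode for MD5. A short check, using Python's `struct` module:

```python
import struct

# The four initial words, written as bytes in the order shown on the slide.
iv_bytes = bytes.fromhex("0123456789abcdeffedcba9876543210")

# "<4I" reads four unsigned 32-bit integers in little-endian byte order.
a, b, c, d = struct.unpack("<4I", iv_bytes)
print([hex(word) for word in (a, b, c, d)])
```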
MD5 processing of a single 512-bit block
(MD5 compression function)
A Typical MD5 Single Step
MD5 functions
The procedure uses four boolean functions that operate bitwise on 32-bit words:
• F(X,Y,Z) = (X ∧ Y) ∨ (¬X ∧ Z)
• G(X,Y,Z) = (X ∧ Z) ∨ (Y ∧ ¬Z)
• H(X,Y,Z) = X ⊕ Y ⊕ Z
• I(X,Y,Z) = Y ⊕ (X ∨ ¬Z)

Truth table
x y z | F G H I
0 0 0 | 0 0 0 1
0 0 1 | 1 0 1 0
0 1 0 | 0 1 1 0
0 1 1 | 1 0 0 1
1 0 0 | 0 0 1 1
1 0 1 | 0 1 0 1
1 1 0 | 1 1 0 0
1 1 1 | 1 1 1 0
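The four functions translate directly into bitwise operators. The sketch below defines them on 32-bit words (masking is needed because Python integers are unbounded) and regenerates the truth table by looking only at the lowest bit. The names `MASK` and `truth_table` are illustrative.

```python
from itertools import product

MASK = 0xFFFFFFFF  # keep results within 32 bits

def F(x, y, z): return ((x & y) | (~x & z)) & MASK
def G(x, y, z): return ((x & z) | (y & ~z)) & MASK
def H(x, y, z): return (x ^ y ^ z) & MASK
def I(x, y, z): return (y ^ (x | ~z)) & MASK

def truth_table():
    """Rows (x, y, z, F, G, H, I) for single-bit inputs."""
    return [
        (x, y, z) + tuple(f(x, y, z) & 1 for f in (F, G, H, I))
        for x, y, z in product((0, 1), repeat=3)
    ]

for row in truth_table():
    print(*row)
```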
What is X[k]?
• The array of 32-bit words X[0..15] holds the value of the current 512-bit input block being processed.
• Within a round, each of the 16 words X[i] is used exactly once, during one step.
• The order in which these words are used varies from round to round.
• In the first round, the words are used in their original order.
• For rounds 2 through 4, the following permutations are used:
  • ρ2(i) = (1 + 5i) mod 16
  • ρ3(i) = (5 + 3i) mod 16
  • ρ4(i) = 7i mod 16
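A quick way to convince oneself that each formula really visits every word exactly once per round is to list the resulting index order. The function names below are illustrative.

```python
def rho2(i): return (1 + 5 * i) % 16
def rho3(i): return (5 + 3 * i) % 16
def rho4(i): return (7 * i) % 16

def word_order(rho):
    """Order in which X[0..15] is consumed during one round."""
    return [rho(i) for i in range(16)]

for name, rho in (("round 2", rho2), ("round 3", rho3), ("round 4", rho4)):
    order = word_order(rho)
    print(name, order, "permutation:", sorted(order) == list(range(16)))
```

Each multiplier (5, 3, 7) is odd and therefore coprime to 16, which is why every index from 0 to 15 appears exactly once.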
T[i]
• T is constructed from the sine function: $T[i] = \lfloor 2^{32} \times |\sin(i)| \rfloor$, where i is in radians.
Typical Values for T[i]
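The table entries can be regenerated from the formula. The sketch below indexes T from 1 to 64, as on the slide, and relies on double-precision `math.sin`, which is accurate enough for these 32-bit constants.

```python
import math

# T[i] = integer part of 2^32 * |sin(i)|, for i = 1..64 (i in radians).
T = {i: int(2**32 * abs(math.sin(i))) for i in range(1, 65)}

for i in (1, 2, 3, 4, 64):
    print(f"T[{i}] = {T[i]:08x}")
```

The first value, T[1], comes out as d76aa478, and the last, T[64], as eb86d391.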
Circular Left Shift (CLS)
• <<< s: circular left shift (rotation) of the 32-bit argument by s bits.
• Values of s:
  Round 1: 7 12 17 22
  Round 2: 5 9 14 20
  Round 3: 4 11 16 23
  Round 4: 6 10 15 21
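A circular left shift differs from an ordinary shift in that the bits pushed out on the left re-enter on the right. A minimal sketch for 32-bit words follows. The name `rotl` is illustrative, and the mask is needed because Python integers do not wrap.

```python
def rotl(x: int, s: int) -> int:
    """Rotate the 32-bit word x left by s bits (0 <= s < 32)."""
    return ((x << s) | (x >> (32 - s))) & 0xFFFFFFFF

print(hex(rotl(0x12345678, 8)))   # the top byte 0x12 wraps around to the bottom
print(hex(rotl(0x80000000, 1)))   # the top bit wraps around to bit 0
```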
//Note: All variables are unsigned 32 bits and wrap modulo 2^32 when calculating
var int[64] r, T

//r specifies the per-round shift amounts
r[ 0..15] := {7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22}
r[16..31] := {5, 9, 14, 20, 5, 9, 14, 20, 5, 9, 14, 20, 5, 9, 14, 20}
r[32..47] := {4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23}
r[48..63] := {6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21}

//Use binary integer part of the sines of integers as constants:
for i from 0 to 63
    T[i] := floor(abs(sin(i + 1)) × (2 pow 32))

//Initialize variables:
var int h0 := 0x67452301
var int h1 := 0xEFCDAB89
var int h2 := 0x98BADCFE
var int h3 := 0x10325476

//Pre-processing:
append "1" bit to message
append "0" bits until message length in bits ≡ 448 (mod 512)
append bit (bit, not byte) length of unpadded message as 64-bit little-endian integer to message

//Process the message in successive 512-bit chunks:
for each 512-bit chunk of message
    break chunk into sixteen 32-bit little-endian words X[i], 0 ≤ i ≤ 15

    //Initialize hash value for this chunk:
    var int a := h0
    var int b := h1
    var int c := h2
    var int d := h3

    //Main loop:
    for i from 0 to 63
        if 0 ≤ i ≤ 15 then
            g := (b and c) or ((not b) and d)
            p := i
        else if 16 ≤ i ≤ 31
            g := (d and b) or ((not d) and c)
            p := (5×i + 1) mod 16
        else if 32 ≤ i ≤ 47
            g := b xor c xor d
            p := (3×i + 5) mod 16
        else if 48 ≤ i ≤ 63
            g := c xor (b or (not d))
            p := (7×i) mod 16

        temp := d
        d := c
        c := b
        b := b + leftrotate((a + g + T[i] + X[p]), r[i])
        a := temp

    //Add this chunk's hash to result so far:
    h0 := h0 + a
    h1 := h1 + b
    h2 := h2 + c
    h3 := h3 + d

var int digest := h0 append h1 append h2 append h3 //(expressed as little-endian)

//leftrotate function definition
leftrotate (x, c)
    return (x << c) or (x >> (32-c));
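The pseudocode above maps almost line for line onto Python. The following is a teaching sketch under a few assumptions: messages are byte strings, the function name `md5_sketch` is made up for this example, and explicit masking stands in for 32-bit wraparound. It is not meant as a replacement for a library implementation, and MD5 itself should not be used where collision resistance matters.

```python
import math
import struct

MASK = 0xFFFFFFFF
# Per-step rotation amounts: four values per round, each group repeated four times.
R = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
# T[i] = floor(2^32 * |sin(i + 1)|), indexed from 0 as in the pseudocode.
T = [int(abs(math.sin(i + 1)) * 2**32) & MASK for i in range(64)]

def rotl(x, s):
    return ((x << s) | (x >> (32 - s))) & MASK

def md5_sketch(message: bytes) -> str:
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476]
    # Pre-processing: 1-bit, 0-bits up to 448 mod 512, then 64-bit length.
    data = message + b"\x80"
    data += b"\x00" * ((56 - len(data)) % 64)
    data += struct.pack("<Q", (8 * len(message)) % 2**64)
    for offset in range(0, len(data), 64):
        x = struct.unpack("<16I", data[offset:offset + 64])
        a, b, c, d = h
        for i in range(64):
            if i < 16:
                g, p = (b & c) | (~b & d), i
            elif i < 32:
                g, p = (d & b) | (~d & c), (5 * i + 1) % 16
            elif i < 48:
                g, p = b ^ c ^ d, (3 * i + 5) % 16
            else:
                g, p = c ^ (b | ~d), (7 * i) % 16
            g &= MASK  # "not" yields negative Python ints; fold back to 32 bits
            new_b = (b + rotl((a + g + T[i] + x[p]) & MASK, R[i])) & MASK
            a, b, c, d = d, new_b, b, c
        # Add this chunk's result to the running hash.
        h = [(old + new) & MASK for old, new in zip(h, (a, b, c, d))]
    return struct.pack("<4I", *h).hex()

print(md5_sketch(b"abc"))
```

The output can be compared against `hashlib.md5(b"abc").hexdigest()` from the standard library as a sanity check.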
Secure Hash Algorithm (SHA)
• Developed by NIST (National Institute of Standards and Technology)
• Published as FIPS PUB 180 in 1993
• A revised version was issued as FIPS PUB 180-1, generally referred to as SHA-1
• Input: a message with a maximum length of less than $2^{64}$ bits
• Output: 160-bit message digest
• 32-bit word units, 512-bit blocks
• 4 rounds × 20 steps per block
• Closely models MD4
• Slower, stronger than MD5
SHA Algorithm
• The overall structure and logic is similar to MD5.
• Step 1: Append padding bits
• Step 2: Append length
• Step 3: Initialize MD buffer
  • A 160-bit buffer (five 32-bit registers A, B, C, D, E) is used to hold intermediate and final results of the hash function.
  • A, B, C, D, E are initialized to the following values:
    • A, B, C, D = same as in MD5, E = C3D2E1F0
    • Stored in big-endian format (most significant byte of a word in the low-address byte position)
    • E.g. word E: C3 D2 E1 F0 (low address … high address)
SHA Algorithm
• Step 4: Process message in 512-bit (16-word) blocks
  • The heart of the algorithm is called a compression function.
  • It consists of 4 rounds of processing of 20 steps each.
  • The 4 rounds have a similar structure, but each uses a different primitive logical function, referred to as f1, f2, f3, and f4.
  • Each round takes as input the current 512-bit block ($Y_q$) and the 160-bit buffer value ABCDE, and updates the contents of the buffer.
  • Each round also uses the additive constants $K_t$, where 0 ≤ t ≤ 79 indicates one of the 80 steps across 4 rounds.
  • In fact only 4 constants are used:

    Step Number     Hexadecimal        Integer Part of
    0 ≤ t ≤ 19      Kt = 5A827999      [2^30 × √2]
    20 ≤ t ≤ 39     Kt = 6ED9EBA1      [2^30 × √3]
    40 ≤ t ≤ 59     Kt = 8F1BBCDC      [2^30 × √5]
    60 ≤ t ≤ 79     Kt = CA62C1D6      [2^30 × √10]

  • The output of the 4th round (80th step) is added to $CV_q$ to produce $CV_{q+1}$.
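The four constants can be reproduced from the square roots in the last column. The sketch below uses exact integer arithmetic: since 2^30 × √n equals √(n × 2^60), `math.isqrt` (Python 3.8 or later) gives the integer part with no floating-point rounding. The helper name `sha1_constant` is illustrative.

```python
import math

def sha1_constant(n: int) -> int:
    """Integer part of 2^30 * sqrt(n), computed as isqrt(n * 2^60)."""
    return math.isqrt(n << 60)

for steps, n in (("0-19", 2), ("20-39", 3), ("40-59", 5), ("60-79", 10)):
    print(f"steps {steps}: K = {sha1_constant(n):08X}")
```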
SHA-1 processing of a single 512-bit block (SHA-1 compression function)
Elementary SHA operation (single step)
• Each primitive function takes three 32-bit words as input and produces a 32-bit word output.
• Each function performs a set of bitwise logical operations.

Step             Function Name        Function Value
(0 ≤ t ≤ 19)     f1 = f(t,B,C,D)      (B ∧ C) ∨ (¬B ∧ D)
(20 ≤ t ≤ 39)    f2 = f(t,B,C,D)      B ⊕ C ⊕ D
(40 ≤ t ≤ 59)    f3 = f(t,B,C,D)      (B ∧ C) ∨ (B ∧ D) ∨ (C ∧ D)
(60 ≤ t ≤ 79)    f4 = f(t,B,C,D)      B ⊕ C ⊕ D

Truth table
B C D | f1 f2 f3 f4
0 0 0 |  0  0  0  0
0 0 1 |  1  1  0  1
0 1 0 |  0  1  0  1
0 1 1 |  1  0  1  0
1 0 0 |  0  1  0  1
1 0 1 |  0  0  1  0
1 1 0 |  1  0  1  0
1 1 1 |  1  1  1  1
Note: All variables are unsigned 32 bits and wrap modulo 2^32 when calculating

Initialize variables:
h0 := 0x67452301
h1 := 0xEFCDAB89
h2 := 0x98BADCFE
h3 := 0x10325476
h4 := 0xC3D2E1F0

Pre-processing:
append the bit '1' to the message
append k bits '0', where k is the minimum number >= 0 such that the resulting message length (in bits) is congruent to 448 (mod 512)
append length of message (before pre-processing), in bits, as 64-bit big-endian integer

Process the message in successive 512-bit chunks:
break message into 512-bit chunks
for each chunk
    break chunk into sixteen 32-bit big-endian words w[i], 0 ≤ i ≤ 15

    Extend the sixteen 32-bit words into eighty 32-bit words:
    for i from 16 to 79
        w[i] := (w[i-3] xor w[i-8] xor w[i-14] xor w[i-16]) leftrotate 1

    Initialize hash value for this chunk:
    a := h0
    b := h1
    c := h2
    d := h3
    e := h4

    Main loop:
    for i from 0 to 79
        if 0 ≤ i ≤ 19 then
            f := (b and c) or ((not b) and d)
            k := 0x5A827999
        else if 20 ≤ i ≤ 39
            f := b xor c xor d
            k := 0x6ED9EBA1
        else if 40 ≤ i ≤ 59
            f := (b and c) or (b and d) or (c and d)
            k := 0x8F1BBCDC
        else if 60 ≤ i ≤ 79
            f := b xor c xor d
            k := 0xCA62C1D6

        temp := (a leftrotate 5) + f + e + k + w[i]
        e := d
        d := c
        c := b leftrotate 30
        b := a
        a := temp

    Add this chunk's hash to result so far:
    h0 := h0 + a
    h1 := h1 + b
    h2 := h2 + c
    h3 := h3 + d
    h4 := h4 + e

Produce the final hash value (big-endian):
digest = hash = h0 append h1 append h2 append h3 append h4
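As with MD5, the SHA-1 pseudocode can be turned into a short Python sketch. The assumptions are the same: byte-string input, an illustrative function name (`sha1_sketch`), and explicit masking to emulate 32-bit arithmetic. It is intended for study only.

```python
import struct

MASK = 0xFFFFFFFF

def rotl(x, s):
    return ((x << s) | (x >> (32 - s))) & MASK

def sha1_sketch(message: bytes) -> str:
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]
    # Pre-processing: same padding rule as MD5, but a big-endian length field.
    data = message + b"\x80"
    data += b"\x00" * ((56 - len(data)) % 64)
    data += struct.pack(">Q", (8 * len(message)) % 2**64)
    for offset in range(0, len(data), 64):
        w = list(struct.unpack(">16I", data[offset:offset + 64]))
        # Extend the sixteen words into eighty.
        for i in range(16, 80):
            w.append(rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))
        a, b, c, d, e = h
        for i in range(80):
            if i < 20:
                f, k = (b & c) | (~b & d), 0x5A827999
            elif i < 40:
                f, k = b ^ c ^ d, 0x6ED9EBA1
            elif i < 60:
                f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
            else:
                f, k = b ^ c ^ d, 0xCA62C1D6
            temp = (rotl(a, 5) + f + e + k + w[i]) & MASK
            a, b, c, d, e = temp, a, rotl(b, 30), c, d
        # Add this chunk's result to the running hash.
        h = [(old + new) & MASK for old, new in zip(h, (a, b, c, d, e))]
    return struct.pack(">5I", *h).hex()

print(sha1_sketch(b"abc"))
```

The result can be cross-checked against `hashlib.sha1(b"abc").hexdigest()` from the standard library.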
The following equivalent expressions may be used to compute f in the main loop above:
(0 ≤ i ≤ 19):  f := d xor (b and (c xor d))          (alternative)
(40 ≤ i ≤ 59): f := (b and c) or (d and (b or c))    (alternative 1)
(40 ≤ i ≤ 59): f := (b and c) or (d and (b xor c))   (alternative 2)
(40 ≤ i ≤ 59): f := (b and c) + (d and (b xor c))    (alternative 3)
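These equivalences are easy to check mechanically. The sketch below compares each alternative with the original expression on random 32-bit words, using a fixed seed so the run is repeatable. The function names are illustrative. Alternative 3 works because `b and c` and `d and (b xor c)` never have a 1 in the same bit position, so the addition produces no carries.

```python
import random

MASK = 0xFFFFFFFF

def f_0_19(b, c, d): return ((b & c) | (~b & d)) & MASK
def f_0_19_alt(b, c, d): return d ^ (b & (c ^ d))

def f_40_59(b, c, d): return (b & c) | (b & d) | (c & d)
def f_40_59_alt1(b, c, d): return (b & c) | (d & (b | c))
def f_40_59_alt2(b, c, d): return (b & c) | (d & (b ^ c))
def f_40_59_alt3(b, c, d): return ((b & c) + (d & (b ^ c))) & MASK

def all_agree(trials: int = 1000, seed: int = 0) -> bool:
    """Compare every alternative with its original on random 32-bit words."""
    rng = random.Random(seed)
    for _ in range(trials):
        b, c, d = (rng.getrandbits(32) for _ in range(3))
        if f_0_19(b, c, d) != f_0_19_alt(b, c, d):
            return False
        expected = f_40_59(b, c, d)
        for alt in (f_40_59_alt1, f_40_59_alt2, f_40_59_alt3):
            if alt(b, c, d) != expected:
                return False
    return True

print(all_agree())
```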