Cryptographic Hash Functions (CS432)


Overview

• Hash Functions
• Hash Algorithms:
  • MD5 (Message Digest)
  • SHA-1 (Secure Hash Algorithm)

Hash functions

A hash function computes a fixed-length value from a variable-length source.

Examples:
• Checksums in communication protocols
• Indices in databases

It is more convenient to handle a hash of a document than the document itself.

In cryptography, a cryptographic hash function is a hash function with certain additional security properties that make it suitable for use as a primitive in various information security applications, such as authentication and message integrity.

A hash function takes a long string (or 'message') of any length as input and produces a fixed-length string as output, sometimes termed a message digest or a digital fingerprint.

Hash functions, definition

A hash function is a function $f: \{0,1\}^* \to \{0,1\}^n$.

The size of the output, $n$, is a property of the function. Common values are 128, 160 and 256.

Informally, a transformation of a message of arbitrary length into a fixed-length number is called a hash function.

Alternate names for the output are fingerprint or digest.

Commonly used hash functions are MD5, SHA and SHA-1.

Simple Examples

• f(m) = first 70 bits of m
• f(m) = last 80 bits of m
• f(m) = XOR of the bytes of m
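These toy functions satisfy the definition of a hash function but offer no security, which a short sketch can make concrete. The names `xor_hash` and `last_bits_hash` are hypothetical, and the sketch assumes byte-aligned messages (so it uses the 80-bit example rather than the 70-bit one).

```python
from functools import reduce


def xor_hash(message: bytes) -> int:
    """Toy 8-bit hash: XOR of all bytes of the message (0 for empty input)."""
    return reduce(lambda acc, byte: acc ^ byte, message, 0)


def last_bits_hash(message: bytes, n_bytes: int = 10) -> bytes:
    """Toy hash: the last n_bytes bytes (80 bits by default), zero-padded on the left."""
    return message[-n_bytes:].rjust(n_bytes, b"\x00")


# Reordering bytes never changes an XOR, so collisions are trivial to construct.
print(xor_hash(b"pay 10") == xor_hash(b"pay 01"))
# Messages sharing a suffix collide under a truncation "hash".
print(last_bits_hash(b"Alice pays Bob $100.00") == last_bits_hash(b"Carol pays Bob $100.00"))
```

Both comparisons succeed, so neither function is collision-free in the sense required of a cryptographic hash.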

Properties of a Good Hash Function

Let $H$ be a hash function.

• One-way: given $x$, it is infeasible to compute a $v$ such that $H(v) = x$.
• Collision-free: it is infeasible to find $x_1$ and $x_2$ such that $H(x_1) = H(x_2)$ and $x_1 \ne x_2$.

Applications of Hash Functions

Hash functions are used for:
• message and file integrity
• secure login
• fingerprints of keys
• authentication
• digital signatures

Required Properties of Hash Functions

• Preimage resistant: given $h$, it should be hard to find any $m$ such that $h = \mathrm{hash}(m)$.
• Second preimage resistant: given an input $m_1$, it should be hard to find another input $m_2$ (not equal to $m_1$) such that $\mathrm{hash}(m_1) = \mathrm{hash}(m_2)$. This property is implied by collision resistance.
• Collision resistant: it should be hard to find any two different messages $m_1$ and $m_2$ such that $\mathrm{hash}(m_1) = \mathrm{hash}(m_2)$.

Due to a possible birthday attack, this means the hash function output must be at least twice as large as what is required for preimage resistance.

Birthday Attack

A birthday attack is a type of cryptographic attack that exploits the mathematics behind the birthday paradox, making use of a space-time tradeoff. Specifically, if a function $f(x)$ yields any of $H$ different outputs with equal probability and $H$ is sufficiently large, then after evaluating the function for about $1.25\sqrt{H}$ different arguments we expect to obtain a pair of different arguments $x_1$ and $x_2$ with $f(x_1) = f(x_2)$, known as a collision.

The birthday paradox: given a group of 23 (or more) randomly chosen people, the probability is more than 50% that some pair of them will have the same birthday.

For 57 or more people, the probability is greater than 99%, although it cannot be exactly 100% unless there are at least 367 people. [1]

Calculating this probability (and related ones) is the birthday problem. The mathematics behind it has been used to devise a well-known cryptographic attack named the birthday attack.
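The 23-person and 57-person figures can be checked directly. The following is a minimal sketch, assuming birthdays are independent and uniform over 365 days (leap days ignored); the function name is hypothetical.

```python
def birthday_collision_probability(people: int, days: int = 365) -> float:
    """Probability that at least two of `people` share a birthday,
    assuming birthdays are uniform and independent over `days` days."""
    if people > days:
        return 1.0  # pigeonhole principle: a shared birthday is certain
    p_all_distinct = 1.0
    for k in range(people):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


print(f"{birthday_collision_probability(23):.3f}")  # 0.507
print(f"{birthday_collision_probability(57):.3f}")  # 0.990
```

The same computation with `days` set to the number of possible hash values gives the collision probability after a given number of hash evaluations.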

Birthday Attack

Any function $H: \{0,1\}^* \to \{0,1\}^n$ must have infinitely many collisions.

It requires $O(2^{n/2})$ evaluations of $H$ to find two messages $m$ and $m'$ that collide, i.e. $H(m) = H(m')$.

This means $n$ must be reasonably large; otherwise $H$ cannot be collision resistant.
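The $2^{n/2}$ bound can be observed on a deliberately small hash. The sketch below is an illustration under assumptions: it truncates SHA-256 to 24 bits to stand in for a short-output $H$, and the names `truncated_hash` and `find_collision` are hypothetical.

```python
import hashlib
from itertools import count


def truncated_hash(message: bytes, n_bits: int) -> int:
    """First n_bits of SHA-256(message) as an integer (stand-in for a small H)."""
    digest = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return digest >> (256 - n_bits)


def find_collision(n_bits: int):
    """Hash distinct messages until two share a digest; return (m1, m2, tries)."""
    seen = {}  # digest -> first message that produced it
    for i in count():
        message = str(i).encode()
        h = truncated_hash(message, n_bits)
        if h in seen:
            return seen[h], message, i + 1
        seen[h] = message


m1, m2, tries = find_collision(24)
print(m1, m2, tries)
```

For a 24-bit output, a collision is expected after a few thousand evaluations (on the order of $2^{12}$), far fewer than the $2^{24}$ possible outputs.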

Attacks

Suppose a hash function $H$ produces $n$-bit values.

Compose a document, good doc, and about $2^{n/2+1}$ semantically equivalent versions of it. Do the same for a second document, bad doc.

With probability ½ or more there will be a version of the good doc and a version of the bad doc that have the same hash value.

Hash Algorithms

We will consider two main examples:
• The message digest algorithm MD5 by Ron Rivest, with 128-bit hash values.
• The secure hash algorithm SHA-1. It was developed by NSA and standardized by NIST. This algorithm uses 160-bit hash values encoded in 5 × 32-bit words.

Other variations of SHA include SHA-256, SHA-384, SHA-512.

Collisions in SHA-1 can be found in $2^{63}$ attempts.
A collision in MD5 can be found in 8 hours using a notebook PC...

MD5: Message Digest Algorithm

It compresses message blocks of 512 bits into a hash of length 128 bits.

A message of arbitrary length is padded to a length of $k$ bits, where $k \equiv 448 \pmod{512}$.

A 64-bit string describing the length of the message is added. The message length is now a multiple of 512.

The hashing is done block-by-block.

Step 1: Append padding bits:

• The message is padded so that its bit length is congruent to 448 mod 512 (i.e., the length of the padded message is 64 bits less than an integer multiple of 512 bits)
• Padding is always added, even if the message is already of the desired length (so 1 to 512 bits are added)
• Padding bits: 1000…0 (a single 1-bit followed by the necessary number of 0-bits)

Step 2: Append length:

• 64-bit length: contains the length of the original message modulo $2^{64}$
• The expanded message is $Y_0, Y_1, \ldots, Y_{L-1}$; the total length is $L \times 512$ bits
• The expanded message can be thought of as a multiple of 16 32-bit words
• Let M[0 … N-1] denote the words of the resulting message, where $N = L \times 16$
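The two pre-processing steps can be sketched in a few lines. This is an illustration under the assumption that the message is a whole number of bytes; the function name `md5_pad` is hypothetical.

```python
def md5_pad(message: bytes) -> bytes:
    """Return the message with MD5 padding and the 64-bit length field appended."""
    bit_length = (8 * len(message)) % (1 << 64)    # original length modulo 2^64
    padded = message + b"\x80"                      # a single 1-bit followed by seven 0-bits
    padded += b"\x00" * ((56 - len(padded)) % 64)   # 0-bits until length is 448 mod 512 bits
    return padded + bit_length.to_bytes(8, "little")  # MD5 stores the length little-endian


for size in (0, 55, 56, 64):
    padded = md5_pad(b"a" * size)
    print(size, len(padded) // 64)  # message size in bytes, number of 512-bit blocks L
```

A 55-byte message still fits in one block, while a 56-byte message already needs a second block, because at least one padding bit and the 64-bit length must always be appended.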

MD5 Algorithm Architecture

Initialization Vector

A buffer containing four 32-bit words A, B, C, D is used to compute the hash value.

Initializations (bytes listed low-order first) are:

A = 01 23 45 67
B = 89 ab cd ef
C = fe dc ba 98
D = 76 54 32 10

MD5 processing of a single 512-bit block

(MD5 compression function)

A Typical MD5 Single Step

The procedure uses four boolean functions that operate bitwise on 32-bit words:

$F(X,Y,Z) = (X \land Y) \lor (\lnot X \land Z)$
$G(X,Y,Z) = (X \land Z) \lor (Y \land \lnot Z)$
$H(X,Y,Z) = X \oplus Y \oplus Z$
$I(X,Y,Z) = Y \oplus (X \lor \lnot Z)$

Truth table of the MD5 functions:

x y z | F G H I
0 0 0 | 0 0 0 1
0 0 1 | 1 0 1 0
0 1 0 | 0 1 1 0
0 1 1 | 1 0 0 1
1 0 0 | 0 0 1 1
1 0 1 | 0 1 0 1
1 1 0 | 1 1 0 0
1 1 1 | 1 1 1 0
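The four round functions and their truth table can be reproduced with ordinary bitwise operators. This is a minimal sketch; the constant `MASK` and the helper `truth_table` are names introduced here for illustration.

```python
from itertools import product

MASK = 0xFFFFFFFF  # Python integers are unbounded, so results are cut to 32 bits


def F(x, y, z): return ((x & y) | (~x & z)) & MASK
def G(x, y, z): return ((x & z) | (y & ~z)) & MASK
def H(x, y, z): return (x ^ y ^ z) & MASK
def I(x, y, z): return (y ^ (x | ~z)) & MASK


def truth_table():
    """Rows (x, y, z, F, G, H, I) for single-bit inputs, in binary counting order."""
    rows = []
    for x, y, z in product((0, 1), repeat=3):
        # Only the lowest bit matters for single-bit inputs.
        rows.append((x, y, z) + tuple(fn(x, y, z) & 1 for fn in (F, G, H, I)))
    return rows


for row in truth_table():
    print(*row)
```

Because the functions act on each bit position independently, the eight-row table fully determines their behaviour on 32-bit words.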

What is X[k]?

• The array of 32-bit words X[0..15] holds the value of the current 512-bit input block being processed
• Within a round, each of the 16 words of X[i] is used exactly once, during one step
• The order in which these words are used varies from round to round
• In the first round, the words are used in their original order
• For rounds 2 through 4, the following permutations are used:
  • $\rho_2(i) = (1 + 5i) \bmod 16$
  • $\rho_3(i) = (5 + 3i) \bmod 16$
  • $\rho_4(i) = 7i \bmod 16$
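The word order for each round follows directly from these formulas. The sketch below lists the indices and confirms that every round touches each of the 16 words exactly once; the function name `word_order` is hypothetical.

```python
def word_order(round_number: int) -> list:
    """Index into X used at each of the 16 steps of the given round (1 to 4)."""
    formulas = {
        1: lambda i: i,                  # round 1: original order
        2: lambda i: (1 + 5 * i) % 16,
        3: lambda i: (5 + 3 * i) % 16,
        4: lambda i: (7 * i) % 16,
    }
    return [formulas[round_number](i) for i in range(16)]


for r in range(1, 5):
    order = word_order(r)
    print(r, order, sorted(order) == list(range(16)))
```

Each multiplier (5, 3, 7) is odd and therefore coprime to 16, which is why every formula yields a permutation of 0..15.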

T[i]

T is constructed from the sine function:

$T[i] = \lfloor 2^{32} \times |\sin(i)| \rfloor$, where $i$ is in radians and $1 \le i \le 64$.

Typical Values for T[i]
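A few values of T can be computed directly from the definition. This is a minimal sketch that relies on double-precision `math.sin`; the function name `md5_constant` is hypothetical.

```python
import math


def md5_constant(i: int) -> int:
    """T[i] for 1 <= i <= 64: integer part of 2^32 * |sin(i)|, with i in radians."""
    return math.floor(abs(math.sin(i)) * 2**32)


for i in (1, 2, 64):
    print(f"T[{i}] = {md5_constant(i):08x}")
```

Since $|\sin(i)| < 1$ for every integer $i \ge 1$, each constant fits in a 32-bit word.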

Circular Left Shift (CLS)

<<< s: circular left shift (rotation) of the 32-bit argument by s bits

Values of s:

Round 1: 7 12 17 22
Round 2: 5 9 14 20
Round 3: 4 11 16 23
Round 4: 6 10 15 21
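A rotation differs from a plain shift in that bits leaving the top of the word re-enter at the bottom. The sketch below shows one way to write it, together with a lookup of the shift amount for a given step, assuming steps are numbered 0 to 63 and the four values of each round repeat cyclically; `rotate_left`, `SHIFTS` and `shift_for_step` are names introduced here.

```python
MASK = 0xFFFFFFFF

# Shift amounts per round; each row is reused four times within its round.
SHIFTS = {
    1: (7, 12, 17, 22),
    2: (5, 9, 14, 20),
    3: (4, 11, 16, 23),
    4: (6, 10, 15, 21),
}


def rotate_left(x: int, s: int) -> int:
    """Circular left shift of the 32-bit word x by s bits (0 < s < 32)."""
    x &= MASK
    return ((x << s) | (x >> (32 - s))) & MASK


def shift_for_step(step: int) -> int:
    """Shift amount s used at step 0..63 (16 steps per round)."""
    return SHIFTS[step // 16 + 1][step % 4]


print(hex(rotate_left(0x80000001, 1)))  # 0x3
print(shift_for_step(17))               # 9
```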

//Note: All variables are unsigned 32 bits and wrap modulo 2^32 when calculating

var int[64] r, T

//r specifies the per-round shift amounts
r[ 0..15] := {7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22}
r[16..31] := {5, 9, 14, 20, 5, 9, 14, 20, 5, 9, 14, 20, 5, 9, 14, 20}
r[32..47] := {4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23}
r[48..63] := {6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21}

//Use binary integer part of the sines of integers as constants:
for i from 0 to 63
    T[i] := floor(abs(sin(i + 1)) × (2 pow 32))

//Initialize variables:
var int h0 := 0x67452301
var int h1 := 0xEFCDAB89
var int h2 := 0x98BADCFE
var int h3 := 0x10325476

//Pre-processing:
append "1" bit to message
append "0" bits until message length in bits ≡ 448 (mod 512)
append bit (bit, not byte) length of unpadded message as 64-bit little-endian integer to message

//Process the message in successive 512-bit chunks:
for each 512-bit chunk of message
    break chunk into sixteen 32-bit little-endian words X[i], 0 ≤ i ≤ 15

    //Initialize hash value for this chunk:
    var int a := h0
    var int b := h1
    var int c := h2
    var int d := h3

    //Main loop:
    for i from 0 to 63
        if 0 ≤ i ≤ 15 then
            g := (b and c) or ((not b) and d)
            p := i
        else if 16 ≤ i ≤ 31
            g := (d and b) or ((not d) and c)
            p := (5×i + 1) mod 16
        else if 32 ≤ i ≤ 47
            g := b xor c xor d
            p := (3×i + 5) mod 16
        else if 48 ≤ i ≤ 63
            g := c xor (b or (not d))
            p := (7×i) mod 16

        temp := d
        d := c
        c := b
        b := b + leftrotate((a + g + T[i] + X[p]), r[i])
        a := temp

    //Add this chunk's hash to result so far:
    h0 := h0 + a
    h1 := h1 + b
    h2 := h2 + c
    h3 := h3 + d

var int digest := h0 append h1 append h2 append h3 //(expressed as little-endian)

//leftrotate function definition
leftrotate (x, c)
    return (x << c) or (x >> (32-c));
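The pseudocode translates almost line for line into Python. The following is a teaching sketch under stated assumptions (byte-aligned input, the whole message held in memory), not a replacement for a vetted library; `md5_hex` is a name introduced here.

```python
import math
import struct

MASK = 0xFFFFFFFF
SHIFTS = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
T = [math.floor(abs(math.sin(i + 1)) * 2**32) for i in range(64)]


def leftrotate(x, c):
    return ((x << c) | (x >> (32 - c))) & MASK


def md5_hex(message: bytes) -> str:
    h0, h1, h2, h3 = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476
    # Pre-processing: a 1-bit, 0-bits up to 448 mod 512, then the 64-bit little-endian length.
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack("<Q", (8 * len(message)) & 0xFFFFFFFFFFFFFFFF)
    for offset in range(0, len(padded), 64):
        X = struct.unpack("<16I", padded[offset:offset + 64])  # sixteen little-endian words
        a, b, c, d = h0, h1, h2, h3
        for i in range(64):
            if i < 16:
                g, p = (b & c) | (~b & d), i
            elif i < 32:
                g, p = (d & b) | (~d & c), (5 * i + 1) % 16
            elif i < 48:
                g, p = b ^ c ^ d, (3 * i + 5) % 16
            else:
                g, p = c ^ (b | ~d), (7 * i) % 16
            # Masking makes Python's unbounded integers wrap modulo 2^32.
            rotated = leftrotate((a + g + T[i] + X[p]) & MASK, SHIFTS[i])
            a, d, c, b = d, c, b, (b + rotated) & MASK
        h0, h1, h2, h3 = (h0 + a) & MASK, (h1 + b) & MASK, (h2 + c) & MASK, (h3 + d) & MASK
    return struct.pack("<4I", h0, h1, h2, h3).hex()


print(md5_hex(b"abc"))  # 900150983cd24fb0d6963f7d28e17f72
```

The printed value matches the published MD5 test vector for "abc", which is a quick check that padding, byte order and word permutation are all handled as described.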

Secure Hash Algorithm (SHA)

• Developed by NIST (National Institute of Standards and Technology)
• Published as FIPS PUB 180 in 1993
• A revised version was issued as FIPS PUB 180-1, generally referred to as SHA-1
• Input: a message with a maximum length of less than $2^{64}$ bits
• Output: 160-bit message digest
• 32-bit word units, 512-bit blocks
• 4 rounds of 20 steps each per block
• Closely models MD4
• Slower, stronger than MD5

SHA Algorithm

The overall structure and logic is similar to MD5.

Step 1: Append padding bits
Step 2: Append length
Step 3: Initialize MD buffer

• A 160-bit buffer (five 32-bit registers A, B, C, D, E) is used to hold intermediate and final results of the hash function
• A, B, C, D, E are initialized to the following values:
  • A, B, C, D = same as in MD5, E = C3D2E1F0
  • Stored in big-endian format (most significant byte of a word first)

SHA Algorithm

Step 4: Process message in 512-bit (16-word) blocks

• The heart of the algorithm is called a compression function
• It consists of 4 rounds of processing of 20 steps each
• The 4 rounds have a similar structure, but each uses a different primitive logical function, referred to as $f_1$, $f_2$, $f_3$, and $f_4$
• Each round takes as input the current 512-bit block ($Y_q$) and the 160-bit buffer value ABCDE, and updates the contents of the buffer
• Each round also uses the additive constants $K_t$, where $0 \le t \le 79$ indicates one of the 80 steps across 4 rounds
• In fact only 4 constants are used:

Step Number        Hexadecimal        Integer Part of
0 ≤ t ≤ 19         Kt = 5A827999      $[2^{30} \times \sqrt{2}]$
20 ≤ t ≤ 39        Kt = 6ED9EBA1      $[2^{30} \times \sqrt{3}]$
40 ≤ t ≤ 59        Kt = 8F1BBCDC      $[2^{30} \times \sqrt{5}]$
60 ≤ t ≤ 79        Kt = CA62C1D6      $[2^{30} \times \sqrt{10}]$

• The output of the 4th round (80th step) is added to the chaining variable $CV_q$ to produce $CV_{q+1}$
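The four constants can be recomputed from the square roots. The sketch below avoids floating point by using the identity that the integer part of $2^{30}\sqrt{n}$ equals the integer square root of $n \cdot 2^{60}$; the function name `sha1_constant` is hypothetical.

```python
import math


def sha1_constant(n: int) -> int:
    """Integer part of 2^30 * sqrt(n), computed exactly as isqrt(n * 2^60)."""
    return math.isqrt(n << 60)


for n in (2, 3, 5, 10):
    print(n, f"{sha1_constant(n):08X}")
```

Deriving constants from simple irrational numbers in this way is meant to show that they were not chosen to hide a weakness.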

SHA-1 processing of a single 512-bit block (SHA-1 compression function)

Elementary SHA operation (single step)

• Each primitive function takes three 32-bit words as input and produces a 32-bit word output
• Each function performs a set of bitwise logical operations

Step             Function Name         Function Value
0 ≤ t ≤ 19       f1 = f(t,B,C,D)       $(B \land C) \lor (\lnot B \land D)$
20 ≤ t ≤ 39      f2 = f(t,B,C,D)       $B \oplus C \oplus D$
40 ≤ t ≤ 59      f3 = f(t,B,C,D)       $(B \land C) \lor (B \land D) \lor (C \land D)$
60 ≤ t ≤ 79      f4 = f(t,B,C,D)       $B \oplus C \oplus D$

Truth table:

B C D | f1 f2 f3 f4
0 0 0 | 0  0  0  0
0 0 1 | 1  1  0  1
0 1 0 | 0  1  0  1
0 1 1 | 1  0  1  0
1 0 0 | 0  1  0  1
1 0 1 | 0  0  1  0
1 1 0 | 1  0  1  0
1 1 1 | 1  1  1  1

Note: All variables are unsigned 32 bits and wrap modulo 2^32 when calculating

Initialize variables:
h0 := 0x67452301
h1 := 0xEFCDAB89
h2 := 0x98BADCFE
h3 := 0x10325476
h4 := 0xC3D2E1F0

Pre-processing:
append the bit '1' to the message
append k bits '0', where k is the minimum number >= 0 such that the resulting message length (in bits) is congruent to 448 (mod 512)
append length of message (before pre-processing), in bits, as 64-bit big-endian integer

Process the message in successive 512-bit chunks:
break message into 512-bit chunks
for each chunk
    break chunk into sixteen 32-bit big-endian words w[i], 0 ≤ i ≤ 15

    Extend the sixteen 32-bit words into eighty 32-bit words:
    for i from 16 to 79
        w[i] := (w[i-3] xor w[i-8] xor w[i-14] xor w[i-16]) leftrotate 1

    Initialize hash value for this chunk:
    a := h0
    b := h1
    c := h2
    d := h3
    e := h4

    Main loop:
    for i from 0 to 79
        if 0 ≤ i ≤ 19 then
            f := (b and c) or ((not b) and d)
            k := 0x5A827999
        else if 20 ≤ i ≤ 39
            f := b xor c xor d
            k := 0x6ED9EBA1
        else if 40 ≤ i ≤ 59
            f := (b and c) or (b and d) or (c and d)
            k := 0x8F1BBCDC
        else if 60 ≤ i ≤ 79
            f := b xor c xor d
            k := 0xCA62C1D6

        temp := (a leftrotate 5) + f + e + k + w[i]
        e := d
        d := c
        c := b leftrotate 30
        b := a
        a := temp

    Add this chunk's hash to result so far:
    h0 := h0 + a
    h1 := h1 + b
    h2 := h2 + c
    h3 := h3 + d
    h4 := h4 + e

Produce the final hash value (big-endian):
digest = hash = h0 append h1 append h2 append h3 append h4

The following equivalent expressions may be used to compute f in the main loop above:
(0 ≤ i ≤ 19): f := d xor (b and (c xor d)) (alternative)
(40 ≤ i ≤ 59): f := (b and c) or (d and (b or c)) (alternative 1)
(40 ≤ i ≤ 59): f := (b and c) or (d and (b xor c)) (alternative 2)
(40 ≤ i ≤ 59): f := (b and c) + (d and (b xor c)) (alternative 3)
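The SHA-1 pseudocode can be turned into a short runnable sketch, and the alternative expressions for f can be checked exhaustively on single bits. This is an illustration under assumptions (byte-aligned input, whole message in memory), not production code; `sha1_hex` and `alternatives_agree` are names introduced here.

```python
import struct
from itertools import product

MASK = 0xFFFFFFFF


def leftrotate(x, c):
    return ((x << c) | (x >> (32 - c))) & MASK


def sha1_hex(message: bytes) -> str:
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]
    # Pre-processing: a 1-bit, 0-bits up to 448 mod 512, then the 64-bit big-endian length.
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack(">Q", 8 * len(message))
    for offset in range(0, len(padded), 64):
        w = list(struct.unpack(">16I", padded[offset:offset + 64]))  # big-endian words
        for i in range(16, 80):
            w.append(leftrotate(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))
        a, b, c, d, e = h
        for i in range(80):
            if i < 20:
                f, k = (b & c) | (~b & d), 0x5A827999
            elif i < 40:
                f, k = b ^ c ^ d, 0x6ED9EBA1
            elif i < 60:
                f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
            else:
                f, k = b ^ c ^ d, 0xCA62C1D6
            temp = (leftrotate(a, 5) + f + e + k + w[i]) & MASK
            a, b, c, d, e = temp, a, leftrotate(b, 30), c, d
        h = [(x + y) & MASK for x, y in zip(h, (a, b, c, d, e))]
    return struct.pack(">5I", *h).hex()


def alternatives_agree() -> bool:
    """Check the alternative forms of f against the originals for all single-bit inputs."""
    for b, c, d in product((0, 1), repeat=3):
        choose = (b & c) | ((b ^ 1) & d)
        majority = (b & c) | (b & d) | (c & d)
        alternatives = [(b & c) | (d & (b | c)), (b & c) | (d & (b ^ c)), (b & c) + (d & (b ^ c))]
        if choose != d ^ (b & (c ^ d)) or any(alt != majority for alt in alternatives):
            return False
    return True


print(sha1_hex(b"abc"))  # a9993e364706816aba3e25717850c26c9cd0d89d
print(alternatives_agree())
```

The single-bit check also covers the additive form (alternative 3): the two terms `b and c` and `d and (b xor c)` can never both be 1 in the same bit position, so the addition produces no carries and behaves like a bitwise OR.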