Source: cse.iitkgp.ac.in/~debdeep/courses_iitkgp/Crypto/slides/Hash.pdf


Cryptographic Hash Functions

Debdeep Mukhopadhyay, IIT Kharagpur

Data Integrity

• Cryptographic Hash Function: Provides assurance of data integrity

• Let h be a hash function and x some data.

• The hash creates a fingerprint of the data, often referred to as the message digest.

• Typically, x is a large binary string.

• The digest is a fairly short binary string, say 160 bits.


Applications

• Say y=h(x), and y is stored in some secure place.

• If x is altered to, say, x’, and if we assume that h(x)≠h(x’), then the alteration of the message is readily caught by checking that y≠y’, where y’=h(x’).

• Used in digital signature schemes

• Used for message authentication codes (MAC)
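The integrity check described above can be sketched in a few lines of Python. This is a minimal illustration, assuming SHA-256 from the standard `hashlib` module stands in for h (the slides do not fix a particular function); the names `digest` and `is_unaltered` are hypothetical.

```python
import hashlib

def digest(data: bytes) -> str:
    """Return y = h(x) as a hex string (SHA-256 is an assumed choice of h)."""
    return hashlib.sha256(data).hexdigest()

def is_unaltered(data: bytes, stored_digest: str) -> bool:
    """Recompute y' = h(x') and compare it with the stored y."""
    return digest(data) == stored_digest

x = b"transfer 100 rupees to Bob"
y = digest(x)                          # y is kept in a secure place
x_altered = b"transfer 900 rupees to Bob"

print(is_unaltered(x, y))              # original data passes the check
print(is_unaltered(x_altered, y))      # altered data is caught
```

The check is only as trustworthy as the storage of y: if an attacker can replace both x and y, the comparison proves nothing.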

Application: Data Integrity


Application: Digital Signatures

A Keyed Hash Function

• Suppose we also have a key in the computation of the hash function.

• y=h_K(x), and the key is kept secret.
– Alice and Bob share K
– Alice computes y for x, using K, and sends it to Bob.
– Bob receives x’ and computes the hash value.
– If the hashes match, the message is unaltered.
– Note that here y is not required to be kept secret. Why?
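A minimal sketch of the exchange between Alice and Bob, assuming HMAC-SHA256 from Python's standard `hmac` module plays the role of the keyed hash h_K (an assumption; the slides leave h_K abstract). The helper names `make_tag` and `verify` are hypothetical.

```python
import hashlib
import hmac

def make_tag(key: bytes, message: bytes) -> bytes:
    """Alice's side: y = h_K(x), here instantiated with HMAC-SHA256."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, message: bytes, received_tag: bytes) -> bool:
    """Bob's side: recompute the keyed hash of what he received and compare."""
    return hmac.compare_digest(make_tag(key, message), received_tag)

K = b"key shared by Alice and Bob"
x = b"meet at noon"
y = make_tag(K, x)                      # sent alongside x, in the clear

print(verify(K, x, y))                  # unaltered message
print(verify(K, b"meet at nine", y))    # altered message is rejected
```

One way to read the closing question: computing a matching y for a modified x’ requires K, so an adversary who sees y still cannot produce a valid pair.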


What is a Cryptographic Hash Family?

• Note: X could be a finite or infinite set, but Y is always finite

• If |X|=N, |Y|=M, then $|\mathcal{F}^{X,Y}| = M^N$ (the cardinality of the set of all functions from X to Y)

• Any hash family $\mathcal{F} \subseteq \mathcal{F}^{X,Y}$ is called an (N,M) hash family.

Security of Hash Functions

• There are three important properties which a hash function must satisfy.

• The properties are required for the security of the applications.
– Preimage
– Second Preimage
– Collision

• We define them one by one.


Preimage

• Problem: given a hash function h: X → Y and an element y ∈ Y, find x ∈ X such that h(x)=y.

• If Preimage can be solved, then (x,y) is a valid pair.

• A hash function for which Preimage cannot be efficiently solved is said to be preimage resistant.

Second Preimage

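• Problem: given a hash function h: X → Y and an element x ∈ X, find x’ ∈ X such that x’≠x and h(x’)=h(x).

• If this problem is solved, then the pair (x’,h(x)) is valid.

• If it cannot be done efficiently, then the hash is second preimage resistant.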


Collision

• Problem: given a hash function h: X → Y, find x, x’ ∈ X such that x’≠x and h(x’)=h(x).

• Note that if this is solved, then if (x,y) is a valid pair, so is (x’,y).

• If it is not efficiently solvable, the hash function is called collision resistant.

The Random Oracle Model

• Captures the concept of an ideal hash function

• If a hash function h is ideal, then the only way to obtain the hash of a given value is by actually computing it, i.e., this holds even if many previous values are known.


A Non-Ideal Hash Function

• Consider a hash function $h: \mathbb{Z}_n \times \mathbb{Z}_n \to \mathbb{Z}_n$ which is a linear function, say
– $h(x,y) = ax + by \bmod n$, where $a, b \in \mathbb{Z}_n$ and n≥2 is a positive integer
– Suppose $h(x_1,y_1) = ax_1 + by_1 \bmod n$ and $h(x_2,y_2) = ax_2 + by_2 \bmod n$. Then
$h(rx_1 + sx_2 \bmod n,\ ry_1 + sy_2 \bmod n) = r\,h(x_1,y_1) + s\,h(x_2,y_2) \bmod n$

• Thus we can compute the hash of another value apart from $(x_1,y_1)$ and $(x_2,y_2)$ without actually computing the hash value.

• We are computing the new hash value from pre-computed values.

• Note that we do not even require the knowledge of a and b.

• This is not how an ideal hash function behaves according to the RO model.
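The linearity argument can be checked numerically. The sketch below uses arbitrary illustrative values for n, a and b (assumptions, not taken from the slides); the function `predict` is hypothetical and deliberately never touches a or b.

```python
n = 101
a, b = 37, 58          # secret coefficients, unknown to the attacker

def h(x: int, y: int) -> int:
    """The linear (non-ideal) hash h(x, y) = ax + by mod n."""
    return (a * x + b * y) % n

def predict(r: int, s: int, h1: int, h2: int) -> int:
    """Hash of (r*x1 + s*x2, r*y1 + s*y2) computed only from h1 and h2."""
    return (r * h1 + s * h2) % n

x1, y1 = 3, 7
x2, y2 = 10, 20
r, s = 5, 9

h1, h2 = h(x1, y1), h(x2, y2)                    # two known hash values
predicted = predict(r, s, h1, h2)
actual = h((r * x1 + s * x2) % n, (r * y1 + s * y2) % n)
print(predicted == actual)
```

An ideal hash in the random oracle model admits no such shortcut: knowing h1 and h2 would say nothing about the hash of a third point.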

What is an Oracle?

• It is not an algorithm

• Neither is it a formula

• Imagine it to be a giant book of random numbers, where each page is a value x and the number written on that page is h(x)


An Independence Theorem

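• For a randomly chosen h and any x that has not yet been queried, Pr[h(x)=y]=1/M for every y ∈ Y, whatever values have already been obtained.

• Note that the above is a conditional probability

• It states that the knowledge of the previously computed values does not give any advantage to the future computations of h(x)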

• This assumption in the RO model will be used in the complexity proofs that follow.

Algorithms in the RO model

• These algorithms are applicable to all hash functions, since the algorithms are not dependent on the details of the hashing method.

• These algorithms are randomized, in the sense that they make random choices

• In particular they can fail, but if they succeed they are correct: Las Vegas Algorithms


Algorithms in the RO model

• Worst case success probability ε: for every problem instance, the randomized algorithm returns a correct answer with probability at least ε

• Average case success probability ε: the probability that the algorithm returns a correct answer, averaged over all problem instances, is at least ε

• The average success probability is averaged over all possible random choices of $h \in \mathcal{F}^{X,Y}$, and all possible random choices of x ∈ X and/or y ∈ Y, if x and/or y are specified as a part of the problem instance.

Algorithm Find-Preimage


Algorithm Find-Second Preimage

Algorithm FindCollision
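The three generic algorithms named on these slides can be sketched as follows. This is a hedged illustration assuming their usual textbook form: each one queries the oracle on Q chosen inputs and either returns a correct answer or reports failure. The toy oracle `h` (SHA-256 truncated to 12 bits) and all function names are assumptions made for the example.

```python
import hashlib

M_BITS = 12   # toy digest size, so |Y| = M = 2**12

def h(x: int) -> int:
    """Toy stand-in for the random oracle: SHA-256 truncated to M_BITS bits."""
    d = hashlib.sha256(x.to_bytes(8, "big")).digest()
    return int.from_bytes(d[:2], "big") >> (16 - M_BITS)

def find_preimage(h, y, candidates):
    """Try each candidate; return some x with h(x) == y, or None on failure."""
    for x in candidates:
        if h(x) == y:
            return x
    return None

def find_second_preimage(h, x, candidates):
    """Return x0 != x with h(x0) == h(x), or None on failure."""
    y = h(x)
    for x0 in candidates:
        if x0 != x and h(x0) == y:
            return x0
    return None

def find_collision(h, candidates):
    """Return distinct (x, x') with equal hashes; candidates must be distinct."""
    seen = {}
    for x in candidates:
        y = h(x)
        if y in seen:
            return seen[y], x
        seen[y] = x
    return None

print(find_collision(h, range(5000)) is not None)   # 5000 > M forces a collision
```

All three are Las Vegas algorithms in the sense used above: a returned answer is always correct, and only the chance of getting an answer depends on Q and M.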


Relating Q and ε

• So, if we hash a little over sqrt(M) values, we have a 50% chance of a collision

• Thus our algorithm is a (1/2, O(sqrt(M)))-algorithm
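The relation between Q and ε can be explored numerically. The sketch below assumes the standard birthday analysis: Q distinct random queries into M digests collide with probability 1 − ∏(1 − i/M) for i = 1..Q−1, which leads to the approximation Q ≈ sqrt(2M ln(1/(1−ε))). The function names are hypothetical.

```python
import math

def collision_probability(Q: int, M: int) -> float:
    """Exact probability that Q uniform random digests out of M contain a repeat."""
    p_none = 1.0
    for i in range(1, Q):
        p_none *= 1 - i / M
    return 1 - p_none

def queries_needed(eps: float, M: int) -> float:
    """Approximate Q for success probability eps: sqrt(2 M ln(1/(1-eps)))."""
    return math.sqrt(2 * M * math.log(1 / (1 - eps)))

M = 365                                   # the classical birthday setting
print(queries_needed(0.5, M))             # about 1.17 * sqrt(M)
print(collision_probability(23, M))       # just above 1/2
```

For ε = 1/2 the constant is sqrt(2 ln 2) ≈ 1.17, which is where the "little over sqrt(M)" figure comes from.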

Comparison of Security Criteria

• Solving Collision is easier than solving Preimage or 2nd Preimage

• Can we reduce one problem to the other?

• We shall study two reductions:
– Collision to 2nd Preimage
– Collision to Preimage


Proof Method

• Assume that Preimage can be solved using a randomized algorithm

• Show that then the Collision can be solved.

• CollisionHardness << PreimageHardness

• Resistance against Collision => PreimageResistance

The first reduction

• Oracle-2nd-Preimage is an (ε,q) algorithm.

• Since it is a Las Vegas algorithm, if it gives an answer it will be correct. Thus, x≠x’ and h(x)=h(x’), so a collision is also found.

• Thus Collision-to-second-preimage is also an (ε,q) Las Vegas algorithm.
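A minimal sketch of this reduction, assuming it takes the usual form: pick x uniformly at random, hand it to the second-preimage oracle, and output the pair if the oracle answers. The brute-force oracle, the toy hash `h(x) = x mod 7` and the small domain are assumptions made only so the example runs.

```python
import random

def brute_force_second_preimage(h, x, domain):
    """Toy stand-in for Oracle-2nd-Preimage: exhaustive search over a small domain."""
    for candidate in domain:
        if candidate != x and h(candidate) == h(x):
            return candidate
    return None

def collision_to_second_preimage(h, domain, oracle, rng):
    x = rng.choice(domain)          # choose x uniformly at random
    x_prime = oracle(h, x)
    if x_prime is None:
        return None                 # the oracle failed, so the reduction fails
    return x, x_prime               # x != x_prime and h(x) == h(x_prime)

def h(x: int) -> int:
    return x % 7

domain = list(range(20))
oracle = lambda h, x: brute_force_second_preimage(h, x, domain)
pair = collision_to_second_preimage(h, domain, oracle, random.Random(0))
print(pair)
```

The reduction makes exactly one oracle call and succeeds whenever the oracle does, which is why the success probability and query count carry over unchanged.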


The second reduction

• Assume that Oracle-Preimage is a (1,Q) Las Vegas algorithm

• We will make some weak assumptions on the size of X and Y, |X|≥2|Y|

Reduction

• Proof discussed in class.
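A hedged sketch of the second reduction, assuming the usual textbook form: choose x at random, ask the preimage oracle for a preimage x’ of h(x), and succeed when x’≠x. The deterministic oracle below (it always returns the smallest preimage) and the toy hash are assumptions chosen so the success rate can be counted exactly.

```python
import random

def smallest_preimage(h, y, domain):
    """Toy stand-in for Oracle-Preimage: always succeeds on values in the image of h."""
    return min(x for x in domain if h(x) == y)

def collision_to_preimage(h, domain, oracle, rng):
    x = rng.choice(domain)          # choose x uniformly at random
    x_prime = oracle(h, h(x))       # some preimage of y = h(x)
    if x_prime != x:
        return x, x_prime           # a collision
    return None                     # the oracle handed back x itself

def h(x: int) -> int:
    return x % 7

domain = list(range(20))            # |X| = 20 >= 2|Y| = 14
oracle = lambda h, y: smallest_preimage(h, y, domain)

# Exact success rate over all choices of x: fails only when x is the
# smallest element of its class, which happens for 7 of the 20 values.
wins = sum(oracle(h, h(x)) != x for x in domain)
success_rate = wins / len(domain)
print(success_rate)                 # 0.65
```

The count matches the intuition behind the |X|≥2|Y| assumption: at most |Y| inputs can be the oracle's own answer, so at least half of the inputs yield a collision.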


Construction of Iterated Hash Functions

• Extending a compression function to a hash function with an infinite domain

• A hash function created in this fashion is called an iterated hash function

• Consider hash functions whose inputs and outputs are bit strings

• |x|: length of a bit string x

• x||y: concatenation of strings x and y

Outline of the construction

• Given compress: $\{0,1\}^{m+t} \to \{0,1\}^m$, t≥1

• Preprocessing:
– an input string x, where |x|≥m+t+1
– an output string y, such that |y|≡0 (mod t)
– $y = y_1 || y_2 || y_3 || \dots || y_r$, where $|y_i| = t$ for 1≤i≤r


Optional Output Transformation

• g: $\{0,1\}^m \to \{0,1\}^l$

• Define $h(x) = g(z_r)$, where g is a public function

• Sometimes, $h(x) = z_r$

Processing

• $z_0 = IV$ (public value, called the Initialization Vector, |IV|=m)

• $z_1 = \mathrm{compress}(z_0 || y_1)$

• $z_2 = \mathrm{compress}(z_1 || y_2)$

• ……

• $z_r = \mathrm{compress}(z_{r-1} || y_r)$


A typical preprocessing

• y=x||pad(x)
– pad(x) is a padding function
– it generally has the value of |x|, padded to the left with additional zeros (so that |y| is a multiple of t)

• Note that the preprocessing step has to be injective
– |y|=rt ≥ |x|

Merkle Damgård Construction

• Uses compress: $\{0,1\}^{m+t} \to \{0,1\}^m$, which is collision resistant, to construct a collision resistant hash function h: $\{0,1\}^* \to \{0,1\}^m$
– The construction yields a proof for this result.

• Typically, we take |x|≥m+t+1 (perhaps because we wish to keep the message length more than double that of the hash value)


The Preprocessing

• $x = x_1 || x_2 || \dots || x_k$,
– where $|x_1| = |x_2| = \dots = |x_{k-1}| = t-1$ and $|x_k| = t-1-d$, where 0≤d≤t-2
– Thus, with n=|x|,
$k = \frac{n+d}{t-1} = \left\lceil \frac{n}{t-1} \right\rceil$

The Algorithm

• The step that sets $y_{k+1}$ to the binary representation of d is known as the MD strengthening

• Note that $y_{k+1}$ is also padded to the left with zeros so that $|y_{k+1}| = t-1$

• The MD strengthening helps to make the pre-processing step injective
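The preprocessing and iteration can be sketched on bit strings as follows. This is an illustration under stated assumptions: it follows the usual textbook form of the construction for t≥2 (first input $0^{m+1} || y_1$, later inputs $g_i || 1 || y_{i+1}$), uses toy parameters m=16 and t=8, and replaces the collision-resistant compression function with truncated SHA-256. All names are hypothetical.

```python
import hashlib

M, T = 16, 8   # toy parameters: compress maps m + t = 24 bits to m = 16 bits

def compress(bits: str) -> str:
    """Toy compression function {0,1}^(m+t) -> {0,1}^m (truncated SHA-256)."""
    assert len(bits) == M + T
    d = hashlib.sha256(bits.encode()).digest()
    return format(int.from_bytes(d, "big"), "0256b")[:M]

def preprocess(x: str) -> list:
    """Split x into k blocks of t-1 bits, zero-pad the last one, then append d."""
    n = len(x)
    k = -(-n // (T - 1))                    # k = ceil(n / (t-1))
    d = k * (T - 1) - n                     # number of padding zeros, 0 <= d <= t-2
    padded = x + "0" * d
    blocks = [padded[i:i + T - 1] for i in range(0, len(padded), T - 1)]
    blocks.append(format(d, "b").zfill(T - 1))   # MD strengthening: y_{k+1}
    return blocks

def md_hash(x: str) -> str:
    y = preprocess(x)
    g = compress("0" * (M + 1) + y[0])      # z_1 = 0^(m+1) || y_1
    for block in y[1:]:
        g = compress(g + "1" + block)       # z_{i+1} = g_i || 1 || y_{i+1}
    return g

print(preprocess("1"))    # ['1000000', '0000110']
print(preprocess("10"))   # ['1000000', '0000101']
```

The two printed examples share the same padded first block; only the appended value of d keeps them apart, which is the injectivity that the strengthening step provides.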


A Picture is better than thousand words

The Proof

• CompressCollision-res => HashCollision-res

• not(HashCollision-res) => not(CompressCollision-res)

• If you can find a collision in the hash function efficiently, then you can find a collision in the compression function efficiently.


When t=1

• Here the encoding f is done in a special way.
– f(0)=0, f(1)=01

• The encoding is injective

• There do not exist two strings x≠x’ and a string z such that y(x)=z||y(x’); that is, no encoding is a postfix of another encoding.
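Both properties of the encoding can be checked by brute force on short inputs. The sketch assumes the usual textbook definition $y(x) = 11 || f(x_1) || \dots || f(x_n)$; the leading 11 is an assumption, since the slide only gives f. The function names are hypothetical.

```python
from itertools import product

def f(bit: str) -> str:
    """Bitwise encoding from the slide: f(0) = 0, f(1) = 01."""
    return "0" if bit == "0" else "01"

def encode(x: str) -> str:
    """y(x) = 11 || f(x_1) || ... || f(x_n); the 11 prefix is an assumption."""
    return "11" + "".join(f(b) for b in x)

# All bit strings of length 1 to 8 and their encodings.
words = ["".join(p) for n in range(1, 9) for p in product("01", repeat=n)]
codes = [encode(w) for w in words]

injective = len(set(codes)) == len(codes)
postfix_free = not any(a != b and a.endswith(b) for a in codes for b in codes)
print(injective, postfix_free)
```

The reason this works is that the pattern 11 can only occur at the very start of an encoding, so one encoding can never sit at the end of a different one.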

Theorems


Attacks: Is an Iterated Hash Ideal?

• We shall discuss some attacks against schemes that use Merkle-Damgård based hashing

• The pitfall lies in treating the hash as a black box

• We know that a double data type represents a real number, but only to a limited precision.
– The conclusion is that we have to know the limits of an abstraction well.

Attacks: Is an Iterated Hash Ideal?

• In our design of hash functions (for aiding the proofs) we have assumed that the hash function is ideal.
– One important requirement was that the only way to learn the hash of a value is by actually computing it!
– This is violated in the Merkle-Damgård construction.


Commitment Scheme

• Consider an auction, where the parties submit their bids

• The lowest bid gets the deal

• If we wish to develop a digital method without having a central trusted authority, we can try to use a one-way function

• The function binds the bidder to a value (the commitment), while the commitment leaks no information about the original bid

Commitment Scheme

• Hash functions are used for this

• Security argument:
– If f is one-way, recovering x from f(x) is hard
– If f is collision-resistant, the commitment can only be opened to x; otherwise the cheater is creating a collision

• But an attacker can hash all possible bid values and compare the results with the commitment

• A cheater can also simply copy another bidder's commitment.
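The exhaustive-search weakness is easy to demonstrate when the set of plausible bids is small. This is a minimal sketch, assuming the commitment is simply SHA-256 of the bid written in decimal (an illustrative choice, not something the slides specify); `commit` and `brute_force` are hypothetical names.

```python
import hashlib

def commit(bid: int) -> str:
    """Naive commitment: the hash of the bid itself, with nothing else mixed in."""
    return hashlib.sha256(str(bid).encode()).hexdigest()

def brute_force(commitment: str, max_bid: int):
    """Hash every plausible bid and compare against the published commitment."""
    for guess in range(max_bid + 1):
        if commit(guess) == commitment:
            return guess
    return None

published = commit(4250)                 # what the honest bidder announces
print(brute_force(published, 10_000))    # the attacker recovers the bid: 4250
```

One-wayness does not help here: it only makes inversion hard for inputs drawn from a large space, and a bid usually is not.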


MAC Construction

• Message Authentication Code (MAC) is a keyed hash function.

• Used to verify the integrity and authentication of information.

• Alice sends M||H_K(M).

• Bob collects it and checks the validity of the pair.

• This prevents an adversary from tampering with the message (integrity) and from making Bob believe that a message came from Alice when it did not (authentication).

MAC Construction

• MAC’s security is thus based on the property, that the attacker can query the function on inputs of his own choice – but is not able to evaluate the function on any other input with non-negligible probability.

• It is easy to make such a function, when we have an ideal hash function

• But a problem arises when we replace the ideal function by an iterated hash, like SHA-1.


The Attack

• Key length: 80 bits

• The adversary intercepts a message M of length 256 bits with a valid 160-bit tag t:
t=SHA1(k||M)=C(k||M||10…0||0…0101010000)

• To forge, construct M’, which includes M, 112 bits of padding, the 64-bit encoding of 336, followed by arbitrary text T:
M’=M||10…0||0…0101010000||T

The Forging of MAC

• Note that:
SHA1(k||M’)=C(C(k||M||10…0||0…0101010000), T||padding + length)=C(t, T||padding + length)

• In other words, we may compute a valid tag on M’ by applying the compression function to t, T and some padding, all of which are known!

• Thus the naive MAC construction based on an iterated hash function is totally broken if the length of a previous message is known (thus violating the RO assumption)!
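The forgery can be reproduced against Python's own SHA-1. The sketch below is an illustration under stated assumptions: it re-implements the SHA-1 compression function C so that the intercepted tag can be loaded as the starting state, and it checks the result against `hashlib.sha1`. The names `sha1_compress`, `sha1_padding` and `extend`, and the sample key and message, are hypothetical.

```python
import hashlib
import struct

def _rotl(x: int, n: int) -> int:
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def sha1_compress(state, block: bytes):
    """SHA-1 compression function C: 160-bit state, 512-bit block -> 160-bit state."""
    w = list(struct.unpack(">16I", block))
    for i in range(16, 80):
        w.append(_rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))
    a, b, c, d, e = state
    for i in range(80):
        if i < 20:
            f, k = (b & c) | (~b & d), 0x5A827999
        elif i < 40:
            f, k = b ^ c ^ d, 0x6ED9EBA1
        elif i < 60:
            f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
        else:
            f, k = b ^ c ^ d, 0xCA62C1D6
        temp = (_rotl(a, 5) + f + e + k + w[i]) & 0xFFFFFFFF
        a, b, c, d, e = temp, a, _rotl(b, 30), c, d
    return tuple((s + v) & 0xFFFFFFFF for s, v in zip(state, (a, b, c, d, e)))

def sha1_padding(length_bytes: int) -> bytes:
    """The bits 10...0 followed by the 64-bit message length, as SHA-1 appends them."""
    zeros = (55 - length_bytes) % 64
    return b"\x80" + b"\x00" * zeros + struct.pack(">Q", length_bytes * 8)

def extend(tag: bytes, known_length: int, suffix: bytes):
    """From t = SHA1(k||M) and |k||M| alone, return (glue, SHA1(k||M||glue||suffix))."""
    state = struct.unpack(">5I", tag)            # resume hashing from the tag
    glue = sha1_padding(known_length)
    total = known_length + len(glue) + len(suffix)
    tail = suffix + sha1_padding(total)
    for i in range(0, len(tail), 64):
        state = sha1_compress(state, tail[i:i + 64])
    return glue, struct.pack(">5I", *state)

key = b"K" * 10                                  # 80-bit key, unknown to the attacker
M = b"A" * 32                                    # 256-bit intercepted message
t = hashlib.sha1(key + M).digest()               # intercepted tag

T = b" and pay Eve"
glue, forged_tag = extend(t, len(key) + len(M), T)
print(forged_tag == hashlib.sha1(key + M + glue + T).digest())
```

Note that `extend` never receives the key: it needs only the tag, the total length of k||M, and the text to append, which matches the claim on the slide.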


Security of cascaded hash functions: Joux’s Attack

• A generic collision-finding algorithm takes time of order $2^{n/2}$, where n is the length of the hash output

• Suppose we have two functions G,H: $\{0,1\}^* \to \{0,1\}^n$, each having an ideal security of $2^{n/2}$

• Can we construct a collision-resistant hash with ideal security $2^n$?
– define F(M)=G(M)||H(M)

• This indeed works if both the hashes are ideal; we claim that it fails if one of the hash functions is iterated.

• Assume that one of the functions, say G, is based on the MD construction
– let C be its compression function
– let us find a collision on C; we can do that in time of order $2^{n/2}$. Let the messages be $M_1^{(0)}$ and $M_1^{(1)}$
– Continuing this k times, in time $O(k\,2^{n/2})$ we have:
$C(h_i, M_{i+1}^{(0)}) = C(h_i, M_{i+1}^{(1)}) = h_{i+1}$, where $1 \le i \le k$


• We thus now have a treasure of collisions.

• Any message that has the form
$M_1^{(b_1)} || M_2^{(b_2)} || \dots || M_k^{(b_k)}$, where $b_1, \dots, b_k \in \{0,1\}$,
collides.

• There are thus $2^k$ such messages, many times more than what one would have found in time $k\,2^{n/2}$ had G been ideal!

• By the birthday paradox, even if H is ideal there is a high probability that there is a collision among these $2^k = 2^{n/2}$ messages (set k=n/2).

• Thus we have a collision of the hash F in time $O(n\,2^{n/2})$.
– This is less than $O(2^n)$, thus proving that we do not get the security of a hash of 2n bits
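The multicollision step can be tried out on a toy iterated hash. This sketch assumes a 16-bit chaining value (so each block collision costs only a few hundred evaluations) and uses truncated SHA-256 as the compression function C; the parameters and names are illustrative assumptions.

```python
import hashlib
from itertools import product

N_BYTES = 2   # toy chaining value: n = 16 bits

def C(h: bytes, block: bytes) -> bytes:
    """Toy compression function (truncated SHA-256), standing in for C."""
    return hashlib.sha256(h + block).digest()[:N_BYTES]

def find_block_collision(h: bytes):
    """Birthday search for two distinct blocks that collide under C(h, .)."""
    seen = {}
    i = 0
    while True:
        block = i.to_bytes(4, "big")
        out = C(h, block)
        if out in seen:
            return seen[out], block, out
        seen[out] = block
        i += 1

def multicollision(iv: bytes, k: int):
    """Chain k block collisions, giving 2**k messages with one common digest."""
    pairs, h = [], iv
    for _ in range(k):
        m0, m1, h = find_block_collision(h)
        pairs.append((m0, m1))
    return pairs, h

def md(iv: bytes, blocks) -> bytes:
    """Plain iteration of C over the blocks (no padding, for simplicity)."""
    h = iv
    for b in blocks:
        h = C(h, b)
    return h

iv = bytes(N_BYTES)
pairs, final = multicollision(iv, 4)
messages = list(product(*pairs))             # every choice of b_1, ..., b_k
digests = {md(iv, m) for m in messages}
print(len(messages), len(digests))           # 16 messages, 1 digest
```

The work grows linearly in k while the number of colliding messages doubles with each step, which is the imbalance the cascade attack exploits.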

Meaningful Collisions

• Theoreticians said this does not work in practice.

• As the colliding string is almost always meaningless, and hence detectable.

• But we shall see that this attack is very much practical.

• First a demo…


Meaningful Collisions

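• Consider a message $M = M_1 || M_2 || \dots || M_k$

• Create a collision $C(h_j, N) = C(h_j, N')$, where $h_j$ is the chaining value entering the jth block
– thus two messages that differ only in the jth block will collide

• $M = M_1 || M_2 || \dots || M_{j-1} || N || M_{j+1} || \dots || M_k$

• $M' = M_1 || M_2 || \dots || M_{j-1} || N' || M_{j+1} || \dots || M_k$

• Thus, although N and N’ may be complete gibberish, they are now part of a longer text, which may be carefully constructed to accommodate them!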

Message Authentication Codes

• Keyed hash functions

• One common way would be to make the IV secret.

• Consider, for simplicity, a hash which does not have the pre-processing step and the final output transformation.

• Given x and h_k(x) (the MAC), we have to construct another valid pair. Can we do that?


MAC

• Consider x||x’, where x and x’ are each t bits long.

• h_k(x||x’)=compress(h_k(x)||x’)
– which can always be computed, even though the key is secret!
– this attack can also be extended to cases where padding is required and there is a pre-processing step (refer to Stinson)
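A minimal sketch of this forgery, assuming a toy iterated hash with a secret IV, 64-bit blocks, no padding and no output transformation, as on the slide. The compression function is truncated SHA-256 and every name here is an illustrative assumption.

```python
import hashlib

T_BYTES = 8   # block length t = 64 bits
M_BYTES = 8   # chaining value length m = 64 bits

def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function (truncated SHA-256)."""
    return hashlib.sha256(state + block).digest()[:M_BYTES]

def mac(key: bytes, message: bytes) -> bytes:
    """h_k(x): iterate compress starting from the secret IV = key."""
    assert len(key) == M_BYTES and len(message) % T_BYTES == 0
    state = key
    for i in range(0, len(message), T_BYTES):
        state = compress(state, message[i:i + T_BYTES])
    return state

def forge(x: bytes, tag: bytes, x_prime: bytes):
    """Attacker: extend a known (x, h_k(x)) pair by one block without the key."""
    return x + x_prime, compress(tag, x_prime)

key = b"8bytekey"                      # secret, never given to forge()
x = b"pay Bob "
tag = mac(key, x)

forged_message, forged_tag = forge(x, tag, b"and Eve!")
print(mac(key, forged_message) == forged_tag)
```

The known tag is exactly the internal state after processing x, so continuing the iteration is all the attacker has to do.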

What is security of MAC?

• The attacker is allowed to request q valid MACs on $x_1, x_2, \dots, x_q$

• Thus he obtains the list: $((x_1,y_1),(x_2,y_2),\dots,(x_q,y_q))$

• If he is able to output a valid pair (x,y), where x is not among the q values queried for, then we say there is a forgery.

• If the probability of a forgery is ε, then we say he is an (ε,q)-forger.


Nested MAC (NMAC)

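• Suppose that $(X, Y, \mathcal{K}, \mathcal{G})$ and $(Y, Z, \mathcal{L}, \mathcal{H})$ are hash families.

• We compose them to make $(X, Z, \mathcal{M}, \mathcal{G} \circ \mathcal{H})$, in which $\mathcal{M} = \mathcal{K} \times \mathcal{L}$ and $\mathcal{G} \circ \mathcal{H} = \{ g \circ h : g \in \mathcal{G}, h \in \mathcal{H} \}$, where $(g \circ h)_{(K,L)}(x) = h_L(g_K(x))$ for all $x \in X$.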

A Result

• The nested MAC is secure provided that the following two conditions hold:
– H is a secure MAC, given a fixed unknown key.
– G is collision-resistant, given a fixed unknown key.


Adversaries

• Three kinds of adversaries:
– forger for the nested MAC (big MAC attack)
– forger for the little MAC (small MAC attack)
– collision finder for the hash, when the key is secret (unknown key collision attack)

Theorem

• Result Proved in the class…


CBC-MAC

Endomorphic BlockCipher

Each is of block length t

Attack on CBC-MAC

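• Let $q \approx 1.17 \times 2^{t/2}$ be an integer.

• Choose q distinct bit-strings of length t: $x_1^1, \dots, x_1^q$

• Choose q random bit-strings of length t: $x_2^1, \dots, x_2^q$

• Let $x_3, \dots, x_n$ be fixed bit-strings of length t.

• Construct $x^i = x_1^i || x_2^i || \dots || x_n^i$, where $x_k^i = x_k$ for $1 \le i \le q$ and $3 \le k \le n$. Note that $x^i \ne x^j$ if $i \ne j$, because $x_1^i \ne x_1^j$.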


Attack on CBC-MAC

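• The attacker now queries the MACs of the q values $x^i$.

• Due to the birthday paradox, there is a collision among them with probability about 1/2.

• Let $y_n^i = y_n^j$, where $y_k^i$ denotes the kth intermediate value in the CBC-MAC computation for $x^i$. This happens if and only if $y_2^i = y_2^j$, which happens if and only if $y_1^i \oplus x_2^i = y_1^j \oplus x_2^j$.

Attack on CBC-MAC

• Let $x_\delta$ be a non-zero bit string of length t.

• Define:
$v = x_1^i || (x_2^i \oplus x_\delta) || x_3^i || \dots || x_n^i$
and
$w = x_1^j || (x_2^j \oplus x_\delta) || x_3^j || \dots || x_n^j$

• The attacker now requests the MAC of v.

• The MAC of w is the same as the MAC of v.

• So, he publishes w together with the MAC of v as a valid pair.

• Thus, we have a $(1/2, O(2^{t/2}))$-forger.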
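The attack can be run end to end on a toy cipher. This is a sketch under loud assumptions: the block cipher is a 4-round Feistel network on 16-bit blocks built from SHA-256 (a permutation, but in no way secure), CBC-MAC uses a zero IV, and the attacker keeps querying until a collision appears instead of stopping at q queries. All names are hypothetical.

```python
import hashlib
import random

T_BYTES = 2   # toy block length t = 16 bits

def toy_encrypt(key: bytes, block: bytes) -> bytes:
    """4-round Feistel network on 16-bit blocks (a permutation, NOT secure)."""
    left, right = block[0], block[1]
    for rnd in range(4):
        f = hashlib.sha256(key + bytes([rnd, right])).digest()[0]
        left, right = right, left ^ f
    return bytes([left, right])

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(a, b))

def cbc_mac(key: bytes, blocks) -> bytes:
    """y_0 = 0...0, y_i = e_K(y_{i-1} XOR x_i); the MAC is the last y_i."""
    y = bytes(T_BYTES)
    for x in blocks:
        y = toy_encrypt(key, xor(y, x))
    return y

def forge(mac_oracle, rest, rng):
    """Query messages x1 || x2 || rest until two MACs collide, then build v and w."""
    seen = {}
    for i in range(2 ** 16):
        x1 = i.to_bytes(T_BYTES, "big")                      # distinct first blocks
        x2 = bytes(rng.randrange(256) for _ in range(T_BYTES))  # random second blocks
        tag = mac_oracle([x1, x2] + rest)
        if tag in seen:
            j1, j2 = seen[tag]
            delta = b"\x00\x01"                              # any non-zero x_delta
            v = [x1, xor(x2, delta)] + rest
            w = [j1, xor(j2, delta)] + rest
            return v, w
        seen[tag] = (x1, x2)
    return None

key = b"toy key"                                             # hidden behind the oracle
oracle = lambda blocks: cbc_mac(key, blocks)
v, w = forge(oracle, [b"\x12\x34", b"\xab\xcd"], random.Random(0))

tag_v = oracle(v)                        # one more legitimate query
print(cbc_mac(key, w) == tag_v)          # (w, tag_v) is a forgery: w was never queried
```

With t = 16 a collision is expected after roughly 1.17 × 2^8 ≈ 300 queries, which is why the toy block length was chosen so small.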