Lecture 12-cs648-2013 Randomized Algorithms

Randomized Algorithms (CS648)

Lecture 12: Hashing - II

RECAP OF LAST LECTURE

Problem Definition
• U = {0, 1, …, m-1}, called the universe
• S ⊆ U and s = |S|
• Examples: typically s is far smaller than m

Aim: Given a set S ⊆ U, build a data structure storing S such that we can answer in O(1) time:

“Does i ∈ S?” for any given i ∈ U.

Hashing
• Hash table T: an array of size n.
• Hash function h : U → {0, 1, …, n-1}.
Answering a query “Does i ∈ S?”:
1. k ← h(i);
2. Search the list stored at T[k].

Properties of h:
• computable in O(1) time.
• Space required by h: O(1).
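The query procedure above can be sketched in a few lines. This is only an illustration: the class name ChainedHashTable, the table size n = 8 and the choice h(i) = i mod n are assumptions made for the example, not part of the lecture's construction.

```python
class ChainedHashTable:
    """Hash table T of size n; each cell T[k] holds a list of elements."""

    def __init__(self, elements, n, h):
        self.n = n
        self.h = h                      # hash function mapping keys to 0..n-1
        self.T = [[] for _ in range(n)]
        for i in elements:
            self.T[h(i)].append(i)      # store i in the list at T[h(i)]

    def contains(self, i):
        k = self.h(i)                   # step 1: k <- h(i)
        return i in self.T[k]           # step 2: search the list stored at T[k]


n = 8                                   # illustrative table size
table = ChainedHashTable([3, 11, 20, 42], n, lambda i: i % n)
```

Here 3 and 11 land in the same cell, so a query for either of them scans a list of length two; the search time is governed by the number of colliding elements.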

[Figure: the hash table T with cells 0, 1, …, and the elements of S stored in it.]

How many bits are needed to encode h?

Collision
Definition: Two elements i, j ∈ U are said to collide under hash function h if h(i) = h(j).

Worst case time complexity of searching an item i: the number of elements in S colliding with i.

Universal Hash Family

Definition: A collection H of hash functions is said to be universal if there exists a constant c such that for any i, j ∈ U with i ≠ j,

P(h(i) = h(j)) ≤ c/n, where h is chosen uniformly at random from H.

This definition appears strange in the beginning! But we shall soon see that there is a very natural way to arrive at this definition.

Perfect hashing using O(s^2) space

Let H be a universal hash family. Let X be the number of collisions for S when h is chosen uniformly at random from H.
Question: What is E[X]?
Lemma 1: E[X] ≤ (c/n) · s(s-1)/2.
Lemma 2: For n = c·s^2, there will be no collision with probability at least 1/2.
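One way to see both lemmas is via linearity of expectation followed by Markov's inequality. The derivation below is a sketch that assumes the universality bound P(h(i) = h(j)) ≤ c/n and the table size n = c s^2; the indicator variables X_{ij} are introduced here only for the argument.

```latex
\begin{align*}
X &= \sum_{\{i,j\} \subseteq S,\; i \neq j} X_{ij},
   \qquad X_{ij} = \begin{cases} 1 & \text{if } h(i) = h(j) \\ 0 & \text{otherwise} \end{cases} \\
\mathbf{E}[X] &= \sum_{\{i,j\} \subseteq S,\; i \neq j} \mathbf{P}\big(h(i) = h(j)\big)
   \;\le\; \binom{s}{2} \cdot \frac{c}{n}
   \;=\; \frac{c}{n} \cdot \frac{s(s-1)}{2} \\
n = c s^2 :\quad \mathbf{E}[X] &\le \frac{s-1}{2s} < \frac{1}{2},
   \qquad \mathbf{P}(X \ge 1) \;\le\; \mathbf{E}[X] \;<\; \frac{1}{2}
\end{align*}
```

So with probability more than 1/2 a random h from H has no collision on S.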

Algorithm 1: Perfect hashing for S
Fix n ← c·s^2;
Repeat
  1. Pick h uniformly at random from H;
  2. t ← the number of collisions for S under h.
Until t = 0.
Build the hash table T.

Theorem: A perfect hash function can be computed for S in expected O(s^2) time.
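The repeat-until loop can be sketched as follows. The sketch assumes the family h_x(i) = (ix mod p) mod n with c = 2, a table of size c·s^2, and integer keys in the range 0..P-1; the prime P = 2^31 - 1 and all function names are choices made for this illustration.

```python
import random

P = 2_147_483_647  # the prime 2^31 - 1; keys are assumed to lie in [0, P)


def random_hash(n, p=P):
    """Pick x uniformly from {1, ..., p-1}; return h_x(i) = (i*x mod p) mod n."""
    x = random.randrange(1, p)
    return lambda i: ((i * x) % p) % n


def count_collisions(S, h):
    """Number of unordered pairs of S mapped to the same cell by h."""
    counts = {}
    for i in S:
        counts[h(i)] = counts.get(h(i), 0) + 1
    return sum(k * (k - 1) // 2 for k in counts.values())


def build_perfect_table(S, c=2):
    """Retry random hash functions until S is hashed without collisions."""
    n = max(1, c * len(S) ** 2)          # table size c * s^2
    while True:
        h = random_hash(n)
        if count_collisions(S, h) == 0:  # Until t = 0
            break
    T = [None] * n
    for i in S:
        T[h(i)] = i
    return T, h


def lookup(T, h, i):
    """Worst case O(1): a single probe into T."""
    return T[h(i)] == i
```

Each attempt succeeds with probability at least 1/2, so the expected number of attempts is at most two, and each attempt costs O(s) time plus O(s^2) to allocate the table.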

HASHING WITH OPTIMAL SPACE AND WORST CASE O(1) SEARCH TIME

Optimal space hashing with worst case O(1) search time

H: a universal hash family. X: number of collisions for S when h is chosen uniformly at random from H.
Lemma 1: E[X] ≤ (c/n) · s(s-1)/2.

Question: What is E[X] when n = cs?

Answer: At most (s-1)/2.

Lemma 1: E[X] ≤ (s-1)/2 when n = cs.

Algorithm:
Fix n ← cs;
Repeat
  1. Pick h uniformly at random from H;
  2. t ← number of collisions for S under h;
Until t ≤ s;
Build the hash table T; // primary hash table

For each 0 ≤ k < n:
  If size of list T[k] > 1:
    1. Build a perfect hash table for list T[k];
    2. Make T[k] point to this hash table;


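H: a universal hash family. X: number of collisions for S when h is chosen uniformly at random from H.
Lemma 1: E[X] ≤ (s-1)/2 when n = cs.

Let s_k: number of elements in T[k].
Extra space required: Σ_k O(s_k^2) = c Σ_k s_k^2 = c Σ_k s_k(s_k - 1) + c Σ_k s_k

[Figure: the primary hash table T, with cells 0, 1, 2, …, whose lists are replaced by secondary hash tables.]

Is there any relation between X and the s_k's? Yes: X = Σ_k s_k(s_k - 1)/2 and Σ_k s_k = s, so the extra space is 2cX + cs, which is O(s) whenever X = O(s).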
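The relation between the number of collisions and the list sizes can be checked numerically. The list sizes below are made up for illustration, and c = 2 is an assumed constant; the point is only that c·Σ s_k^2 equals 2c·X + c·s when X counts colliding pairs.

```python
def pair_collisions(sizes):
    """X: number of colliding pairs, given the list sizes s_k."""
    return sum(k * (k - 1) // 2 for k in sizes)


def secondary_space(sizes, c=2):
    """Total size of secondary tables when list k gets a table of size c * s_k^2."""
    return sum(c * k * k for k in sizes)


sizes = [0, 3, 1, 0, 2, 4]      # hypothetical list sizes s_k
s = sum(sizes)                  # total number of elements
X = pair_collisions(sizes)
space = secondary_space(sizes)  # equals 2*c*X + c*s with c = 2
```

Bounding X by O(s) in the primary table therefore bounds the total secondary space by O(s) as well.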

Theorem: A given set S of s elements can be preprocessed in expected O(s) time to build a data structure (2-level hash table) of O(s) size such that any search query can be answered in worst case O(1) time.
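A compact sketch of the 2-level structure follows. It assumes the family h_x(i) = (ix mod p) mod n with c = 2, integer keys in 0..P-1 with P = 2^31 - 1, and a primary retry condition of at most s collisions; the names are illustrative. For simplicity it builds a small secondary table for every cell, including cells whose list has size 0 or 1.

```python
import random

P = 2_147_483_647  # the prime 2^31 - 1; keys are assumed to lie in [0, P)


def random_hash(n):
    """h_x(i) = (i*x mod P) mod n for x uniform in {1, ..., P-1}."""
    x = random.randrange(1, P)
    return lambda i: ((i * x) % P) % n


def buckets(S, h, n):
    B = [[] for _ in range(n)]
    for i in S:
        B[h(i)].append(i)
    return B


def collisions(B):
    return sum(len(b) * (len(b) - 1) // 2 for b in B)


def build_two_level(S, c=2):
    s = len(S)
    n = max(1, c * s)
    # Primary table: retry until the number of collisions is at most s.
    while True:
        h = random_hash(n)
        B = buckets(S, h, n)
        if collisions(B) <= s:
            break
    # Secondary tables: a collision-free table of size c * s_k^2 per cell.
    secondary = []
    for b in B:
        nk = max(1, c * len(b) ** 2)
        while True:
            hk = random_hash(nk)
            Bk = buckets(b, hk, nk)
            if collisions(Bk) == 0:
                break
        Tk = [cell[0] if cell else None for cell in Bk]
        secondary.append((hk, Tk))
    return h, secondary


def lookup(structure, i):
    """Two hash evaluations and one comparison: worst case O(1)."""
    h, secondary = structure
    hk, Tk = secondary[h(i)]
    return Tk[hk(i)] == i
```

With these choices the secondary tables occupy at most 8s cells in total: at most 2c·s + c·s from the collision bound, plus one cell for each of the at most cs empty lists.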

WHY SUCH A DEFINITION FOR UNIVERSAL HASH FAMILY ?

Why does hashing work so well in Practice ?

A simple hash function: h(i) = i mod n.
• h works so well in practice because the set S is usually a uniformly random subset of U. As a result, the elements of S spread evenly over T and few of them collide.
• It is easy to fool this hash function so that it takes O(s) search time.
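To see how such a function can be fooled, take the simple hash function to be h(i) = i mod n and feed it only multiples of n. The values n = 10 and the seven keys below are arbitrary choices for illustration.

```python
n = 10
S = [k * n for k in range(1, 8)]   # adversarial input: every key is a multiple of n

T = [[] for _ in range(n)]
for i in S:
    T[i % n].append(i)             # h(i) = i mod n sends every key to cell 0

longest = max(len(chain) for chain in T)
```

All s keys end up in one list, so a search has to scan s elements.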

This makes us think: “Can we achieve expected O(1) search time for any given set S?”

A similar question led from Quick Sort to Randomized Quick Sort.

Universal Hash Family

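A simple hash function: h(i) = i mod n.

Definition: A collection H of hash functions is said to be universal if there exists a constant c such that for any i, j ∈ U with i ≠ j,

P(h(i) = h(j)) ≤ c/n, where h is chosen uniformly at random from H.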

A SIMPLE AND COMPACT UNIVERSAL HASH FAMILY

The starting point

The simple hash function: h(i) = i mod n.

Problem: Two elements i, j in U are bound to collide if n divides |i - j|.

Is there some operation which, when applied to any i, j, distributes |i - j| uniformly at random over a range of integers?

mod operation
a: a non-negative integer; n: a positive integer; a mod n ∈ {0, 1, …, n-1}.

Question: How is |a mod n - b mod n| related to |a - b| mod n?
Consider some examples:
• |55 mod 31 - 43 mod 31| = 12 and |55 - 43| mod 31 = 12
• |91 mod 31 - 102 mod 31| = 20 and |91 - 102| mod 31 = 11

Answer: Let Δ = |a - b| mod n. Then |a mod n - b mod n| ∈ {Δ, n - Δ}.
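The two examples, and the general claim that |a mod n - b mod n| is either Δ or n - Δ for Δ = |a - b| mod n, can be checked by brute force. The helper names and the ranges tested are choices made for this sketch.

```python
def mod_gap(a, b, n):
    """|a mod n - b mod n|"""
    return abs(a % n - b % n)


def delta(a, b, n):
    """|a - b| mod n"""
    return abs(a - b) % n


examples = [(55, 43, 31), (91, 102, 31)]
results = [(mod_gap(a, b, n), delta(a, b, n)) for a, b, n in examples]
# results == [(12, 12), (20, 11)]; note that 20 = 31 - 11


def claim_holds(limit, n):
    """Check the claim for all 0 <= a, b < limit."""
    return all(
        mod_gap(a, b, n) in (delta(a, b, n), n - delta(a, b, n))
        for a in range(limit)
        for b in range(limit)
    )
```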

mod operation
p: a prime number; [p-1]: {1, 2, …, p-1}
Consider any α ∈ [p-1].
Question: What can we say about the set {αi mod p | i ∈ [p-1]}?
Example: p = 7, α = 3.

i           1 2 3 4 5 6
3i mod 7    3 6 2 5 1 4
4i mod 7    4 1 5 2 6 3

Fact: {αi mod p | i ∈ [p-1]} = [p-1] for all α ∈ [p-1].
Proof: Suppose αi mod p = αj mod p for some i ≠ j in [p-1]. Then p divides α(i - j), so p divides α or p divides (i - j). Not possible, since p is prime, 1 ≤ α ≤ p-1 and 0 < |i - j| < p.
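A quick check of this fact for p = 7, together with a composite modulus where it fails. The function name and the composite example (modulus 6, multiplier 2) are illustrative.

```python
def multiples_mod_p(alpha, p):
    """The list alpha*i mod p for i = 1, ..., p-1."""
    return [(alpha * i) % p for i in range(1, p)]


p = 7
row3 = multiples_mod_p(3, p)   # [3, 6, 2, 5, 1, 4]

# For a prime p, every alpha in {1, ..., p-1} permutes {1, ..., p-1}.
is_permutation = all(
    sorted(multiples_mod_p(a, p)) == list(range(1, p)) for a in range(1, p)
)

# For a composite modulus the property can fail: 2*3 mod 6 = 0.
fails_for_composite = sorted(multiples_mod_p(2, 6)) != list(range(1, 6))
```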

mod operation
p: a prime number; [p-1]: {1, 2, …, p-1}
Consider any i ∈ [p-1].
Fact: {ix mod p | x ∈ [p-1]} = [p-1] for all i ∈ [p-1].

Question: If x is chosen uniformly at random from [p-1], then what can we say about ix mod p?
Answer: It is distributed uniformly at random over [p-1].

Can you now see that the above answer plays the key role in formulating the hash function?

Good fact: An element i is mapped to a uniformly random element of {1, …, p-1}.

Slightly bad fact: Once element i is mapped to a location, the mapping of j is no longer random.

So it is not clear whether |ix mod p - jx mod p| is distributed uniformly at random over {0, …, p-1}.
So let us look at this difference a bit more closely…

[Figure: the elements 1, 2, …, i, …, i+Δ, with i mapped to ix mod p.]

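Probability of collision between i and j

Let h_x(i) = (ix mod p) mod n.

i and j will collide under h_x if |ix mod p - jx mod p| is divisible by n.

Question: What is the relation between |ix mod p - jx mod p| and |i - j|x mod p?

Answer: |ix mod p - jx mod p| is either (|i - j|x mod p) or p - (|i - j|x mod p).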

Let h_x(i) = (ix mod p) mod n.
Lemma: If i and j collide under h_x, then either (|i - j|x mod p) is divisible by n or p - (|i - j|x mod p) is divisible by n.

{|i - j|x mod p | x ∈ [p-1]} = {1, …, p-1}

Let x be chosen uniformly at random from [p-1].
Probability of collision between i and j
  ≤ P((|i - j|x mod p) is divisible by n or p - (|i - j|x mod p) is divisible by n)
  ≤ 2 P((|i - j|x mod p) is divisible by n)
  = 2 ⌊(p-1)/n⌋ / (p-1)
  ≤ 2/n

Students must realize that the condition in the lemma is a necessary condition and not a sufficient condition for collision. To get an idea, study the example given at the last slide of this lecture.

Theorem: Let h_x(i) = (ix mod p) mod n. Then H = {h_x | x ∈ [p-1]} is universal (with c = 2).
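For small parameters the bound can be verified exhaustively by trying every x. The sketch below assumes the family h_x(i) = (ix mod p) mod n with x ranging over {1, …, p-1} and a universe {0, …, m-1} with m ≤ p; the values p = 101, n = 10 and m = 50 are arbitrary.

```python
from fractions import Fraction


def collision_probability(i, j, p, n):
    """Exact P(h_x(i) = h_x(j)) for x uniform in {1, ..., p-1}."""
    hits = sum(1 for x in range(1, p) if (i * x % p) % n == (j * x % p) % n)
    return Fraction(hits, p - 1)


def worst_case(p, n, m):
    """Largest collision probability over all pairs i < j in {0, ..., m-1}."""
    return max(
        collision_probability(i, j, p, n)
        for i in range(m)
        for j in range(i + 1, m)
    )


p, n, m = 101, 10, 50          # illustrative parameters; p is prime and m <= p
worst = worst_case(p, n, m)    # never exceeds Fraction(2, n)
```

Exact fractions are used instead of floats so that the comparison with 2/n is not affected by rounding.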

Example

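p = 7. Observe that |i - j| = 1.
Question: How many collisions are there between i and j?
Answer: two (for x = 3, 4). For one of these two values of x, (|i - j|x mod p) is divisible by n; for the other, p - (|i - j|x mod p) is divisible by n.

Answer: No collisions! (although the condition of the lemma holds for some x here.)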

Table storing ix mod p for p = 7 (rows: x = 1, …, 6; columns: i = 1, …, 6)

     i:   1 2 3 4 5 6
x = 1:    1 2 3 4 5 6
x = 2:    2 4 6 1 3 5
x = 3:    3 6 2 5 1 4
x = 4:    4 1 5 2 6 3
x = 5:    5 3 1 6 4 2
x = 6:    6 5 4 3 2 1
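The table can be regenerated and used to see why the divisibility condition is necessary but not sufficient for a collision. The choices n = 3 and the pairs (1, 2) and (2, 3) below are assumptions made for this sketch; they are not necessarily the values used on the slide.

```python
p = 7
# table[x-1][i-1] = i*x mod p, for x, i = 1, ..., 6
table = [[(i * x) % p for i in range(1, p)] for x in range(1, p)]


def colliding_x(i, j, n, p=7):
    """All x for which i and j collide under h_x(i) = (i*x mod p) mod n."""
    return [x for x in range(1, p) if (i * x % p) % n == (j * x % p) % n]


def lemma_condition_x(i, j, n, p=7):
    """All x for which d or p - d is divisible by n, with d = |i-j|*x mod p."""
    out = []
    for x in range(1, p):
        d = (abs(i - j) * x) % p
        if d % n == 0 or (p - d) % n == 0:
            out.append(x)
    return out


n = 3                                      # hypothetical table size
pair_a = colliding_x(1, 2, n)              # [3, 4]: two collisions
pair_b = colliding_x(2, 3, n)              # []: no collisions
condition_b = lemma_condition_x(2, 3, n)   # [1, 3, 4, 6]: condition still holds
```

For the pair (2, 3) the condition holds for four values of x, yet none of them produces a collision, which is why the bound in the analysis is an upper bound and not an exact probability.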

Homework:

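Let h_{x,y}(i) = ((ix + y) mod p) mod n. Then prove that H = {h_{x,y} | x ∈ [p-1], y ∈ {0, 1, …, p-1}} is universal. In particular, show that for any i, j ∈ U with i ≠ j, the probability of collision is at most 1/n.

Hence it is slightly better than the hash family discussed just now.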
