Randomized Algorithms
CS648

Lecture 6
• Reviewing the last 3 lectures
• Application of Fingerprinting Techniques
  • 1-dimensional Pattern matching
• Preparation for the next lecture

Randomized algorithms discussed till now

• Randomized algorithm for Approximate Median: randomly select a sample
• Randomized Quick Sort: randomly permute the array
• Freivalds' algorithm for Matrix Product Verification: randomly select a vector
• Randomized algorithm for Equality of two files: randomly select a prime number

Randomized Algorithms

How does one go about designing a randomized algorithm?

Some random idea is required to design a randomized algorithm.

• Start from an idea based on insight into the problem.
• It is difficult or impossible to exploit the idea deterministically.
• Use randomization to materialize the idea.
• The result is a randomized algorithm.

RANDOMIZED QUICK SORT

Randomized Quick Sort

[Figure: the elements of A arranged in increasing order of values, positions 1 to $n$, with the positions $n/4$ and $3n/4$ and a pivot marked.]

Observation: Many elements of A are good pivots. Is it possible to select one good pivot efficiently?

(Not possible deterministically.)

We select the pivot element uniformly at random.

A randomly selected element is a good pivot with probability $1/2$, since half of the elements have rank between $n/4$ and $3n/4$.
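The random pivot choice can be sketched as follows. This is a minimal illustration, not the lecture's own code: the function names `randomized_quicksort` and `is_good_pivot` are hypothetical, and "good pivot" is assumed to mean a rank strictly above $n/4$ and at most $3n/4$.

```python
import random


def randomized_quicksort(a, rng=random):
    """Return a sorted copy of `a`, choosing every pivot uniformly at random."""
    if len(a) <= 1:
        return list(a)
    pivot = rng.choice(a)  # uniformly random pivot
    smaller = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    larger = [x for x in a if x > pivot]
    return randomized_quicksort(smaller, rng) + equal + randomized_quicksort(larger, rng)


def is_good_pivot(rank, n):
    """Assumed definition: a 1-based rank is 'good' if n/4 < rank <= 3n/4."""
    return n / 4 < rank <= 3 * n / 4
```

With this definition exactly half of the ranks are good when $n$ is a multiple of 4, which matches the probability $1/2$ for a uniformly random pivot.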

RANDOMIZED ALGORITHM FOR APPROXIMATE MEDIAN

Randomized Algorithm for Approximate Median

A random sample captures the essence of the original population.

Idea: Is it possible to select a small subset of elements whose median approximates the median of the whole set?

(Not possible deterministically.)

The median of a uniformly random sample will be an approximate median.
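A minimal sketch of the sampling idea, assuming the sample is drawn uniformly with replacement. The function name `approximate_median` and the parameter `sample_size` are hypothetical, and how large the sample must be for a given accuracy is not addressed here.

```python
import random


def approximate_median(values, sample_size, rng=random):
    """Return the median of a uniformly random sample drawn with replacement."""
    sample = sorted(rng.choice(values) for _ in range(sample_size))
    return sample[len(sample) // 2]
```

For example, `approximate_median(list(range(1, 1001)), 301)` sorts only 301 sampled values instead of all 1000 elements.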

FREIVALDS' TECHNIQUE APPLICATION

MATRIX PRODUCT VERIFICATION

Freivalds' Algorithm

[Figure: checking $A \times B \overset{?}{=} C$ using a vector $x$ (labels in the figure: $A$, $B$, $C$, $x$, $y$, $z$).]

Freivalds' Algorithm: The key idea

Fact: An equation $ax + b = 0$ with $a \neq 0$ has a unique solution, depending upon $a$ and $b$ only.

Problem: Suppose you do not know the values of $a$ and $b$. Your aim is to select a value for $x$ which does not satisfy the corresponding equation.

Idea: Consider any two different values $\{u, v\}$. Surely the equation is not satisfied for at least one of $\{u, v\}$. Can we select that value deterministically?

Randomization used to exploit the idea: select a value for $x$ uniformly at random out of $\{u, v\}$. The selected value fails to satisfy the equation with probability at least $1/2$.

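Freivalds' Algorithm (Analyzing error probability)

Let $D = A \cdot B - C$, and let $x = (x_1, \dots, x_n)$ be the random vector, each $x_l$ chosen uniformly from two values (for example $\{0, 1\}$). If $A \cdot B \neq C$, then $D$ has a nonzero entry, say $d_{ij}$, and the algorithm errs only if $Dx = 0$. Row $i$ of $Dx = 0$ reads

$$d_{i1} x_1 + \dots + d_{ij} x_j + \dots + d_{in} x_n = 0.$$

Fixing the values of all $x_l$ with $l \neq j$ arbitrarily leaves an equation in $x_j$ alone, with a unique solution. At most one of the two candidate values of $x_j$ satisfies it, so the error probability is at most $1/2$.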
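The verification can be sketched in a few lines. This is an illustrative sketch under assumptions: matrices are plain lists of rows, the random vector has entries from $\{0, 1\}$, and the names `mat_vec`, `freivalds_check` and the parameter `rounds` are hypothetical. Repeating the test `rounds` times drives the error probability down to at most $2^{-\text{rounds}}$.

```python
import random


def mat_vec(M, x):
    """Multiply matrix M (a list of rows) by the vector x."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]


def freivalds_check(A, B, C, rounds=20, rng=random):
    """Return False if A*B != C is detected, True if every round passes."""
    n = len(C[0])  # number of columns of C (and of B)
    for _ in range(rounds):
        x = [rng.randint(0, 1) for _ in range(n)]
        # Compare A(Bx) with Cx: two matrix-vector products instead of A*B.
        if mat_vec(A, mat_vec(B, x)) != mat_vec(C, x):
            return False  # certainly A*B != C
    return True  # A*B == C, or every round was unlucky
```

Each round costs $O(n^2)$ arithmetic operations, because only matrix-vector products are computed.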

FINGERPRINTING APPLICATION

CRYPTOGRAPHY

Aim: To determine whether File A is identical to File B by communicating the fewest bits.

[Figure: File A and File B.]

How many primes are less than $n$?

n          Primes less than n
100        25
1000       168
10000      1229
100000     9592
1000000    78498
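The counts in the table can be reproduced with a sieve of Eratosthenes. This is a small sketch for checking the numbers; the function name `count_primes_below` is hypothetical.

```python
def count_primes_below(limit):
    """Count the primes strictly less than `limit` with a sieve of Eratosthenes."""
    if limit < 3:
        return 0
    is_prime = bytearray([1]) * limit  # is_prime[i] == 1 means "i may be prime"
    is_prime[0] = is_prime[1] = 0
    for i in range(2, int(limit ** 0.5) + 1):
        if is_prime[i]:
            # Cross out the multiples of i, starting at i*i.
            is_prime[i * i :: i] = bytearray(len(range(i * i, limit, i)))
    return sum(is_prime)
```

Calling it with 100, 1000, 10000, 100000 and 1000000 gives the five rows of the table.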

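Key idea from prime numbers

[Figure: two number lines. On the line from 1 to $2^n$, a number $d$ has fewer than $n$ prime factors. The other line shows the prime numbers in the range $[1, 4n^2 \log n]$.]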

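Visualize a file as a binary number

File A = $a_1 a_2 \dots a_n$ and File B = $b_1 b_2 \dots b_n$, viewed as the numbers

$$a = \sum_{i=1}^{n} a_i\, 2^{n-i}, \qquad b = \sum_{i=1}^{n} b_i\, 2^{n-i}.$$

Overview of Protocol:
• Let $p$ be a prime number selected uniformly at random from $[1, 4n^2 \log n]$.
• If $a \bmod p = b \bmod p$ then conclude A = B, else conclude A ≠ B.
• An error occurs if $p$ is one of the prime factors of $|a - b|$.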
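The protocol can be sketched as below. Several details are assumptions made for illustration: both files have the same length, the prime is drawn from $[2, 4n^2 \lceil \log_2 n \rceil]$ with $n$ the length in bits, primality is tested by trial division, and the names `is_prime`, `random_prime` and `fingerprints_match` are hypothetical. In a real exchange one party would send only $p$ and its fingerprint.

```python
import math
import random


def is_prime(q):
    """Trial-division primality test, adequate for the small ranges used here."""
    if q < 2:
        return False
    i = 2
    while i * i <= q:
        if q % i == 0:
            return False
        i += 1
    return True


def random_prime(upper, rng=random):
    """Uniformly random prime in [2, upper], by rejection sampling."""
    while True:
        q = rng.randint(2, upper)
        if is_prime(q):
            return q


def fingerprints_match(file_a, file_b, rng=random):
    """Compare two equal-length byte strings via residues modulo a random prime."""
    n = 8 * max(len(file_a), len(file_b), 1)  # length in bits
    upper = 4 * n * n * max(1, math.ceil(math.log2(n)))  # assumed range for p
    p = random_prime(upper, rng)
    a = int.from_bytes(file_a, "big")  # file A as a binary number
    b = int.from_bytes(file_b, "big")  # file B as a binary number
    return a % p == b % p
```

Identical files always produce equal fingerprints, so the only possible error is declaring two different files equal.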

FINGERPRINTING APPLICATION 3

PATTERN MATCHING

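Text $T[1..n]$: 100101100110001101111010101110101010111010000101
Pattern $P[1..m]$: 011110101011101

Pattern $P$ is said to appear in Text $T$ at location $i$ if $P[j] = T[i+j-1]$ for all $1 \le j \le m$. In this example, $P$ appears in $T$ at location 17.

Problem: Given a Text $T$ and a pattern $P$, does $P$ appear anywhere in $T$?

Deterministic Algorithm
• Trivial algorithm: $O(mn)$ time
• Knuth-Morris-Pratt algorithm: $O(m + n)$ time

Randomized Monte Carlo Algorithm
• $O(m + n)$ time, with an error probability that can be made polynomially small in $n$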
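The trivial $O(mn)$ algorithm simply compares the pattern with every window of the text. A minimal sketch, assuming 1-based locations as in the example; the function name `naive_match_locations` is hypothetical.

```python
def naive_match_locations(text, pattern):
    """Return every 1-based location at which `pattern` appears in `text`."""
    n, m = len(text), len(pattern)
    # Each window comparison costs O(m), and there are n - m + 1 windows.
    return [i + 1 for i in range(n - m + 1) if text[i:i + m] == pattern]
```

On the text and pattern of the example this returns `[17]`.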

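Motivation
• Simplicity, real-time implementation, streaming environment
• Extension to 2 dimensions: an $O(n^2)$ time algorithm for locating an $m \times m$ pattern in an $n \times n$ text
• Converting the Monte Carlo algorithm to a Las Vegas algorithm

[Figure: an $m \times m$ binary pattern inside an $n \times n$ text.]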

RANDOMIZED ALGORITHM FOR FINGERPRINTING

Checking if $P$ appears in Text $T$ at location $k$

Text $T[1..n]$, Pattern $P[1..m]$

[Figure: the pattern 0111101110110101 aligned at location $k$ of the text 100101100110001101111010101010101010111010000101.]

Observation: An $O(m)$ time algorithm is obvious.

Question: How to do this task in $O(1)$ time?
Answer: Have a fingerprint.

Question: What properties should the fingerprint possess?
• Small size
• Efficiently computable

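Checking if $P$ appears in Text $T$ at location $k$

View the pattern, and the window of the text at location $k$, as $m$-bit binary numbers:

$$x = \sum_{j=1}^{m} P[j]\, 2^{m-j}, \qquad y_k = \sum_{j=1}^{m} T[k+j-1]\, 2^{m-j}.$$

Let $p$ be a prime number selected uniformly at random from $[1, t]$, and define $F(P) = x \bmod p$ and $F_k = y_k \bmod p$.

If $F_k = F(P)$ then conclude that $P$ appears at $k$. An error occurs if $p$ is one of the prime factors of $|x - y_k|$.
• Error probability at location $k$ is at most $m / \pi(t)$, where $\pi(t)$ is the number of primes in $[1, t]$.
• The fingerprint has size $O(\log t)$ bits.

Small size, but not efficiently computable: computing $F_k$ from scratch takes $O(m)$ time for each $k$.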

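Checking if $P$ appears in Text $T$ at location $k$

Here $y_k$ is the window $T[k..k+m-1]$ read as an $m$-bit binary number, and $F_k = y_k \bmod p$ for the chosen prime $p$.

Question: Any relation between $y_k$ and $y_{k+1}$?

$$y_{k+1} = 2\,\bigl(y_k - 2^{m-1}\, T[k]\bigr) + T[k+m]$$

Question: Any relation between $F_k$ and $F_{k+1}$?

$$\begin{aligned}
F_{k+1} &= y_{k+1} \bmod p \\
        &= \bigl(2 y_k - 2^{m}\, T[k] + T[k+m]\bigr) \bmod p \\
        &= \bigl(2 (y_k \bmod p) - (2^{m} \bmod p)\, T[k] + T[k+m]\bigr) \bmod p \\
        &= \bigl(2 F_k - (2^{m} \bmod p)\, T[k] + T[k+m]\bigr) \bmod p
\end{aligned}$$

The value $2^{m} \bmod p < p$ is computed once, so every update involves only numbers of $O(\log p)$ bits.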

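Fingerprint function: how good is it?

Text $T[1..n]$, Pattern $P[1..m]$

$F(P) = x \bmod p$ and $F_k = y_k \bmod p$, where $x$ and $y_k$ are the pattern and the text window at location $k$ read as $m$-bit binary numbers, and $p$ is a prime selected uniformly at random from $[1, t]$.

Lemma: The fingerprint function
• Occupies $O(\log t)$ bits.
• Computing $F_{k+1}$ from $F_k$ takes $O(\log t)$ bit operations.
• Error probability for any particular location is at most $m / \pi(t)$, where $\pi(t)$ is the number of primes in $[1, t]$.

Question: What is the error probability of the algorithm?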
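The constant-time update of the fingerprint when the window slides by one position can be sketched as follows. The text is assumed to be a string of '0' and '1' characters with 1-based locations, and the names `window_value`, `next_fingerprint` and `top` (standing for $2^m \bmod p$) are hypothetical.

```python
def window_value(bits, k, m):
    """The m bits starting at 1-based location k, read as a binary number."""
    return int(bits[k - 1:k - 1 + m], 2)


def next_fingerprint(f_k, bits, k, m, p, top):
    """Fingerprint of the window at k+1 from the one at k; top = 2^m mod p."""
    out_bit = int(bits[k - 1])     # leading bit T[k] leaves the window
    in_bit = int(bits[k - 1 + m])  # bit T[k+m] enters the window
    return (2 * f_k - top * out_bit + in_bit) % p
```

Python's `%` returns a non-negative result for a positive modulus, so the subtraction needs no extra correction.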

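Bounding the error probability of the algorithm

$E$: event that the algorithm fails
$E_k$: event that the fingerprint shows a false match at a fixed location $k$

Can you see some relation between $E$ and the $E_k$'s?

$$E = \bigcup_{k} E_k, \qquad P(E) \le \sum_{k} P(E_k) \le n \cdot \frac{m}{\pi(t)},$$

since the bound $P(E_k) \le m / \pi(t)$ is the same for each $k$ and there are at most $n$ locations. Here the prime is selected from $[1, t]$ and $\pi(t)$ is the number of primes in $[1, t]$.

Question: How large should $t$ be to ensure that $P(E)$ is small?
Answer: A value of $t$ polynomial in $n$ suffices. For example, any $t$ with $\pi(t) > n^2 m$ gives $P(E) < 1/n$.
Fingerprint size: $O(\log n)$ bits.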

Final result

Theorem: There is a Monte Carlo randomized algorithm for detecting any match of $P[1..m]$ in $T[1..n]$ that:
• Fails with an error probability that is polynomially small in $n$.
• Performs $O(m + n)$ operations involving $O(\log n)$-bit numbers.

On the word-RAM model of computation, an operation involving $O(\log n)$-bit numbers takes $O(1)$ time, so the time complexity of the algorithm is $O(m + n)$.

Homework: It is possible to convert the above algorithm to Las Vegas. Spend some time thinking over it (we shall discuss it in some class).
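Putting the pieces together, a Monte Carlo matcher might look like the sketch below. It is an illustration under assumptions, not the lecture's exact algorithm: the range bound `upper` defaults to $n^2 m$ as a hypothetical choice, primes are found by trial division, locations are 1-based, and all function names are made up for this example.

```python
import random


def is_prime(q):
    """Trial-division primality test, adequate for a sketch."""
    if q < 2:
        return False
    i = 2
    while i * i <= q:
        if q % i == 0:
            return False
        i += 1
    return True


def random_prime(upper, rng=random):
    """Uniformly random prime in [2, upper], by rejection sampling."""
    while True:
        q = rng.randint(2, upper)
        if is_prime(q):
            return q


def fingerprint_match_locations(text, pattern, upper=None, rng=random):
    """Report 1-based locations whose fingerprint equals the pattern's."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    if upper is None:
        upper = max(16, n * n * m)  # assumed range bound for the prime
    p = random_prime(upper, rng)
    top = pow(2, m, p)  # 2^m mod p, computed once
    fp_pattern = fp_window = 0
    for j in range(m):  # fingerprints of P and of the first window
        fp_pattern = (2 * fp_pattern + int(pattern[j])) % p
        fp_window = (2 * fp_window + int(text[j])) % p
    hits = []
    for k in range(1, n - m + 2):
        if fp_window == fp_pattern:
            hits.append(k)  # may be a false match, never a missed one
        if k + m <= n:  # slide the window from location k to k + 1
            fp_window = (2 * fp_window - top * int(text[k - 1])
                         + int(text[k + m - 1])) % p
    return hits
```

Every true occurrence is always reported, because equal windows have equal fingerprints. Only false matches are possible, which is what makes the algorithm Monte Carlo.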

Probability tool (union theorem)

Suppose there is an event $E$ defined over a probability space $(\Omega, P)$.
Aim: to get an upper bound on $P(E)$.

If it is difficult to calculate $P(E)$, try to express $E$ as a union of events $E_1, \dots, E_n$ (usually similar or the same) such that
• it is easy to calculate $P(E_i)$.

Then you may bound $P(E)$ using the following inequality:

$$P(E) = P\Bigl(\bigcup_{i=1}^{n} E_i\Bigr) \le \sum_{i=1}^{n} P(E_i)$$

APPLICATIONS OF THE UNION THEOREM


Balls into Bins

Ball-bin Experiment: There are $m$ balls and $n$ bins. Each ball selects its bin uniformly at random, independently of the other balls, and falls into it.

Used in:
• Hashing
• Load balancing in distributed environment

[Figure: bins numbered 1, 2, 3, …, i, …, n and balls numbered 1, 2, 3, 4, 5, …, m−1, m.]

Theorem: For the case when $m = n$, with very high probability every bin has $O(\log n)$ balls.

(The proof requires the Union theorem and elementary probability. We shall discuss it in the next class. Spend some time to prove it on your own.)
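The experiment is easy to simulate, which can help build intuition before attempting the proof. This is a small sketch; the function name `max_load` is hypothetical, and a simulation only illustrates the claim, it does not prove it.

```python
import random
from collections import Counter


def max_load(num_balls, num_bins, rng=random):
    """Throw each ball into a uniformly random bin and return the fullest bin's load."""
    loads = Counter(rng.randrange(num_bins) for _ in range(num_balls))
    return max(loads.values(), default=0)
```

Running `max_load(n, n)` for growing values of `n` shows how slowly the maximum load grows compared with `n`.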

Randomized Quick Sort

Theorem: The probability that Randomized Quick Sort performs more than $c\, n \log n$ comparisons, for a suitable constant $c$, is polynomially small in $n$.

Tools needed:
1. Union theorem
2. A bound on the probability of getting too few HEADS during a sequence of tosses of a fair coin

(The proof requires the Union theorem and elementary probability. We shall discuss it in the next class. Spend some time to prove it on your own.)
