37
Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Embed Size (px)

Citation preview

Page 1: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Randomized Algorithms for Selection and Sorting

Prepared by

John Reif, Ph.D.

Analysis of Algorithms

Page 2: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Randomized Algorithms for Selection and Sorting

a) Randomized Samplingb) Selection by Randomized Samplingc) Sorting by Randomized Splitting:

Quicksort and Multisample Sorts

Page 3: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Readings

• Main Reading Selections:• CLR, Chapters 9

Page 4: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Comparison Problems

• Input set X of N distinct keys total ordering < over X

• Problems1) For each key x X

rank(x,X) = |{x’ X| x’ < x}| + 1

2) For each index i {1, …, N)select (i,X) = the key x X where i = rank(x,X)

3) Sort (X) = (x1, x2, …, xn)where xi = select (i,X)

Page 5: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Randomized Comparison Tree Model

Page 6: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Algorithm Samplerank s(x,X)

beginLet S be a random sample of X-{x}

of size soutput 1+ N/s [rank(x,S)-1]

end

Page 7: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Algorithm Samplerank s(x,X) (cont’d)

• Lemma 1– The expected value of

samplerank s(x,X) is rank (x,X)

Page 8: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Algorithm Samplerank s(x,X) (cont’d)

Let k = rank(x,X)

For a random y εX,

k-1 Prob (y< x) =

Nk-1

Hence E (rank(x,S)) = s 1N

Solving for k, we get

Nrank (x,X) = k = 1+ E[rank(x,S)-1] = E (samplerank(x,X))

s

proof

Page 9: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

S is Random Sample of X of Size s

Page 10: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

More Precise Bounds on Randomized Sampling (cont’d)

Page 11: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

More Precise Bounds on Randomized Sampling

• Let S be a random sampling of X

• Let ri = rank(select(i,S),X)

-αi

2

N N Prob |r -i | > c log N N

s+1 s

Lemma

Page 12: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

More Precise Bounds on Randomized Sampling (cont’d)

• ProofWe can bound ri by a Beta

distribution, implying

• Weak bounds follow from Chebychev inequality

• The Tighter bounds follow from Chernoff Bounds

i

2i 2

N (r ) = i

s+1i (s-i+1)

(r ) N(s+1) (s+2)

mean

Var

Page 13: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Subdivision by Random Sampling

• Let S be a random sample of X of size s

• Let k1, k2, …, ks be the elements of S in sorted order

• These elements subdivide X into

1 1

2 1 2

3 2 3

s+1

s+1 subsets x = {x ε X | x k }

X = {x ε X | k x k }

X = {x ε X | k x k }

X = {x ε X | x k}

Page 14: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Subdivision by Random Sampling (cont’d)

• How even are these subdivisions?

Page 15: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Subdivision by Random Sampling (cont’d)

• Lemma 3If random sample S in X is of size s and X is of size N,

then S divides X into subsets each of

-α(N-1) α (N) with prob 1 - N

ssize ln

Page 16: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Subdivision by Random Sampling (cont’d)

Proof• The number of (s+1) partitions of X is

• The number of partitions of X with one block of size v is

sN-1 (N-1)~

s s!

sN-v-1 (N-v-1)~

s s!

Page 17: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Subdivision by Random Sampling (cont’d)

• So the probability of a random (s+1) partition having a block size v is

-1

-1

N-v-1

s N-v-1 1 N-1~ 1 for Y=

N-1 N-1 v

s

1 (N-1) 1 ~ e = e N if v = N

s

1 since 1 e

s s

sY svsY

NY

Y

Y

lnY

Y

Page 18: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Randomized Algorithms for Selection

• “canonical selection algorithm”• Algorithm can select (i,X) input set X of N keys

index i {1, …, N}[0] if N=1 then output X[1] select a bracket B of X, so that

select (i,X) B with high prob. [2] Let i1 be the number of keys less

than any element of B[3] output can select (i-i1, B)

B found by random sampling

Page 19: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Hoar’s Selection Algorithm

• Algorithm Hselect (i,X) where 1 i N

beginif X = {x} then output x else choose a random splitter k X let B = {x X|x < k}if |B| i then output Hselect(i,B) else output Hselect(i-|B|, X-B)

end

Page 20: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Hoar’s Selection Algorithm (cont’d)

• Sequential time bound T(i,N) has mean

i N

j=1 j=i+1

1T (i, N) = N + T (i-j, N-j)+ T (i-j)

N

= 2 N + min (i, N-i) + o(N)

Page 21: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Hoar’s Selection Algorithm (cont’d)• Random splitter k X Hselect (i,X) has two

cases

Page 22: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Hoar’s Selection Algorithm (cont’d)

• Inefficient: each recursive call requires N comparisons, but only

reduces average problem size to ½ N

Page 23: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Improved Randomized Selection

• By Floyd and Rivest• Algorithm FRselect(i,X)begin

if X = {x} then output x else choose k1, k2 X such that k1< k2 let r1 = rank(k1, X), r2 = rank(k2, X)

if r1 > i then FRselect(i, {x X|x < k1})

else if r2 > i then FRselect(i-r1, {x X|k1x k2})

else FRselect(i-r2, {x X|x > k2})

end

Page 24: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Choosing k1, k2

• We must choose k1, k2 so that with high likelihood,

k1 select(i,X) k2

Page 25: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Choosing k1, k2 (cont’d)

• Choose random sample S X size s• Define

1

2

(s+1)k = select i ,

(N+1)

(s+1)k = select i ,

(N+1)

where = d s log N , d = constant

S

S

Page 26: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Choosing k1, k2 (cont’d)

• Lemma 2 implies

-α1

-α2

1 1

2 2

P rob (r > i) < N

and

P rob (r > i) < N

where r = rank (k , X)

r = rank (k , X)

Page 27: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Expected Time Bound

1 1

2 1 2

1 2 1 2 1

-

T(i,N) N + 2T(-,s)

+ Prob (r > i) T(i, r )

+ Prob (i > r ) T(i - r , N - r )

+ Prob (r i r ) T(i - r , r - r )

N+1N+2T(-,s)+2N N+T i, 2

s+1

N + min (i, N-i

2

3

) + o(N)

3 if we set and s = N log N = o(N)

Page 28: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Small Cost of Recursions

• Note• With prob 1 - 2N- each recursive

call costs only O(s) = o(N) rather than N in previous algorithm

Page 29: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Randomized Sorting Algorithms

• “canonical sorting algorithm”• Algorithm cansort(X)begin

if x={x} then output X elsechoose a random sample S of X of size sSort SS subdivides X into s+1 subsets

X1, X2, …, Xs+1

output cansort (X1) cansort (X2)… cansort(Xs+1)

end

Page 30: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Using Random Sampling to Aid Sorting

• Problem:– Must subdivide X into subsets of nearly

equal size to minimize number of comparisons

– Solution: random sampling!

Page 31: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Hoar’s Randomized Sorting Algorithm

• Uses sample size s=1• Algorithm quicksort(X)begin

if |X|=1 then output X elsechoose a random splitter k X

output quicksort ({x X|x<k})· (k)· quicksort ({x X|x>k})

end

Page 32: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Expected Time Cost of Hoar’s Sort

• Inefficient:• Better to divide problem size by ½

with high likelihood!

1

1T(N) N-1 + (T(i-1) T(N-i))

N

2 N log N

N

i

Page 33: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Improved Sort using

• Better choice of splitter is k = sample selects (N/2, N)

• Algorithm samplesorts (X)begin

if |X|=1 then output X choose a random subset S of X size

s = N/log N k select (S/2,S) cost time o(N)

output samplesorts ({x X|x<k})· (k) ·

samplesorts ({x X|x>k})end

Page 34: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Randomized Approximant of Mean

• By Lemma 2, rank(k,X) is very nearly the mean:

-αNProb(| rank(k,X) - | d α N log N) < N

2

Page 35: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Improved Expected Time Bounds

• Is optimal for comparison trees!

-1T(N) 2T(N ) + N T(N) N + o(N) +N - 1

log (N!)

Page 36: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Open Problems in Selection and Sorting

1) Improve Randomized Algorithms to exactly match lower bounds on number of comparisons

2) Can we de randomize these algorithms – i.e., give deterministic algorithms with the same bounds?

Page 37: Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Randomized Algorithms for Selection and Sorting

Prepared by

John Reif, Ph.D.

Analysis of Algorithms