32
CS 5633 – Spring 2012 Order Statistics Carola Wenk Slides courtesy of Charles Leiserson with small 2/9/12 CS 5633 Analysis of Algorithms 1 changes by Carola Wenk

Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

  • Upload
    others

  • View
    5

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

CS 5633 – Spring 2012

Order StatisticsCarola Wenk

Slides courtesy of Charles Leiserson with small

2/9/12 CS 5633 Analysis of Algorithms 1

ychanges by Carola Wenk

Page 2: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Order statisticsOrder statisticsSelect the ith smallest of n elements (the element with rank i).• i = 1: minimum;;• i = n: maximum;• i = ⎣(n+1)/2⎦ or ⎡(n+1)/2⎤: median.( ) ( )

Naive algorithm: Sort and index ith element.W t i ti Θ( l + 1)Worst-case running time = Θ(n log n + 1)

= Θ(n log n),i t h t ( t i k t)

2/9/12 CS 5633 Analysis of Algorithms 2

using merge sort or heapsort (not quicksort).

Page 3: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Randomized divide-and-l ithconquer algorithm

RAND-SELECT(A, p, q, i) i-th smallest of A[ p . . q] ( p q ) [ p q]if p = q then return A[p]r ← RAND-PARTITION(A, p, q)k ← + 1 k k(A[ ])k ← r – p + 1 k = rank(A[r])if i = k then return A[r]if i < kif i < k

then return RAND-SELECT(A, p, r – 1, i )else return RAND-SELECT(A, r + 1, q, i – k )

≤ A[r] ≥ A[r]k

2/9/12 CS 5633 Analysis of Algorithms 3

rp q

Page 4: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

ExampleExample

Select the i = 7th smallest:

i = 76 10 13 5 8 3 2 11

Select the i = 7th smallest:

pivot

P i ik = 42 5 3 6 8 13 10 11

Partition:

Select the 7 – 4 = 3rd smallest recursively.

2/9/12 CS 5633 Analysis of Algorithms 4

Select the 7 4 3rd smallest recursively.

Page 5: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Intuition for analysisIntuition for analysis(All our analyses today assume that all elements

Lucky:

( y yare distinct.)

for RAND-PARTITIONLucky:

101log 9/10 == nnCASE 3

T(n) = T(9n/10) + dn= Θ(n) CASE 3 Θ(n)

Unlucky:T(n) = T(n – 1) + dn arithmetic seriesT(n) T(n 1) + dn

= Θ(n2)arithmetic series

Worse than sorting!2/9/12 CS 5633 Analysis of Algorithms 5

Worse than sorting!

Page 6: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Analysis of expected timeAnalysis of expected time

The analysis follows that of randomized

Let T(n) = the random variable for the running

The analysis follows that of randomized quicksort, but it’s a little different.Let T(n) = the random variable for the running time of RAND-SELECT on an input of size n, assuming random numbers are independent.assuming random numbers are independent.For k = 0, 1, …, n–1, define the indicator random variablerandom variable

Xk = 1 if PARTITION generates a k : n–k–1 split,0 otherwise

2/9/12 CS 5633 Analysis of Algorithms 6

k 0 otherwise.

Page 7: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Analysis (continued)Analysis (continued)To obtain an upper bound, assume that the i th element

T(max{0, n–1}) + dn if 0 : n–1 split,

ppalways falls in the larger side of the partition:

T(n) =T(max{1, n–2}) + dn if 1 : n–2 split,

M( { 1 0}) d if 1 0 liT(max{n–1, 0}) + dn if n–1 : 0 split,

( )∑−

+−−=1

})1(max{n

dnknkTX ( )∑=

+−−=0

})1,(max{k

k dnknkTX.

( )∑−

+≤1

)(2n

k dnkTX

2/9/12 CS 5633 Analysis of Algorithms 7

( )⎣ ⎦∑

= 2/)(

nkk

Page 8: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Calculating expectationCalculating expectation

( )⎥⎤

⎢⎡

+= ∑−1

)(2)]([n

k dnkTXEnTE

Take expectations of both sides.

( )⎣ ⎦

⎥⎦

⎢⎣

+∑= 2/

)(2)]([nk

k dnkTXEnTE

Take expectations of both sides.

2/9/12 CS 5633 Analysis of Algorithms 8

Page 9: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Calculating expectationCalculating expectation

( )∑−

⎥⎤

⎢⎡

+=1

)(2)]([n

k dnkTXEnTE ( )⎣ ⎦

( )[ ]∑

∑−

=

+=

⎥⎦

⎢⎣

+

1

2/

)(2

)(2)]([

n

nkk

dnkTXE

dnkTXEnTE

Linearity of expectation.

( )[ ]⎣ ⎦∑

=

+=2/

)(2nk

k dnkTXE

y p

2/9/12 CS 5633 Analysis of Algorithms 9

Page 10: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Calculating expectationCalculating expectation

( )∑−

⎥⎤

⎢⎡

+=1

)(2)]([n

k dnkTXEnTE ( )⎣ ⎦

( )[ ]∑

∑−

=

+=

⎥⎦

⎢⎣

+

1

2/

)(2

)(2)]([

n

nkk

dnkTXE

dnkTXEnTE

( )[ ]⎣ ⎦

[ ] [ ]∑

∑−

=

+⋅=

+=

1

2/

)(2

)(2

n

nkk

dnkTEXE

dnkTXE

Independence of Xk from other random

[ ] [ ]⎣ ⎦∑

=

+⋅=2/

)(2nk

k dnkTEXE

p kchoices.

2/9/12 CS 5633 Analysis of Algorithms 10

Page 11: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Calculating expectationCalculating expectation

( )∑−

⎥⎤

⎢⎡

+=1

)(2)]([n

k dnkTXEnTE ( )⎣ ⎦

( )[ ]∑

∑−

=

+=

⎥⎦

⎢⎣

+

1

2/

)(2

)(2)]([

n

nkk

dnkTXE

dnkTXEnTE

( )[ ]⎣ ⎦

[ ] [ ]∑

∑−

=

+⋅=

+=

1

2/

)(2

)(2

n

nkk

dnkTEXE

dnkTXE

[ ] [ ]⎣ ⎦

[ ] ∑∑

∑−−

=

+=

+⋅=

11

2/

2)(2

)(2

nn

nkk

dnkTE

dnkTEXE

Linearity of expectation; E[Xk] = 1/n .

[ ]⎣ ⎦ ⎣ ⎦

∑∑==

+=2/2/

)(nknk

dnn

kTEn

2/9/12 CS 5633 Analysis of Algorithms 11

Linearity of expectation; E[Xk] 1/n .

Page 12: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Calculating expectationCalculating expectation

( )dnkTXEnTEn

k ⎥⎤

⎢⎡

+= ∑−1

)(2)]([ ( )⎣ ⎦

( )[ ]dnkTXE

dnkTXEnTE

n

nkk

+=

⎥⎦

⎢⎣

+

∑−

=

1

2/

)(2

)(2)]([

( )[ ]⎣ ⎦

[ ] [ ]dnkTEXE

dnkTXE

n

nkk

+⋅=

+=

∑−

=

1

2/

)(2

)(2

[ ] [ ]⎣ ⎦

[ ] dnkTE

dnkTEXE

nn

nkk

+=

+⋅=

∑∑

∑−−

=

11

2/

2)(2

)(2

[ ]⎣ ⎦ ⎣ ⎦

[ ] dnkTE

dnn

kTEn

n

nknk

+=

+=

∑∑−

==

1

2/2/

)(2

)(

2/9/12 CS 5633 Analysis of Algorithms 12

[ ]⎣ ⎦

dnkTEn nk

+= ∑= 2/

)(

Page 13: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Hairy recurrenceHairy recurrence(But not quite as hairy as the quicksort one.)

[ ]⎣ ⎦

dnkTEn

nTEn

k+= ∑

−1

2/)(2)]([

⎣ ⎦n nk = 2/

Prove: E[T(n)] ≤ cn for constant c > 0 .• The constant c can be chosen large enough

so that E[T(n)] ≤ cn for the base cases.

Use fact: 21

83nk

n∑−

≤ (exercise).

2/9/12 CS 5633 Analysis of Algorithms 13

⎣ ⎦2/8

nk∑

=( )

Page 14: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Substitution methodSubstitution method[ ] dncknTE

n

+≤ ∑−12)([ ]

⎣ ⎦n nk∑

= 2/

Substitute inductive hypothesis.

2/9/12 CS 5633 Analysis of Algorithms 14

Page 15: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Substitution methodSubstitution method[ ] dncknTE

n

+≤ ∑−12)([ ]

⎣ ⎦

dnnc

n nk

+⎟⎞

⎜⎛≤

∑=

2

2/

32 dnnn

+⎟⎠

⎜⎝

≤8

Use fact.

2/9/12 CS 5633 Analysis of Algorithms 15

Page 16: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Substitution methodSubstitution method[ ] +≤ ∑

dncknTEn2)(

1

[ ]⎣ ⎦

+⎟⎞

⎜⎛≤

∑=

dnnc

n nk

32 2

2/

⎟⎞

⎜⎛

+⎟⎠

⎜⎝

dcn

dnnn 8

⎟⎠⎞

⎜⎝⎛ −−= dncncn

4

Express as desired – residual.

2/9/12 CS 5633 Analysis of Algorithms 16

Page 17: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Substitution methodSubstitution method[ ] dncknTE

n

+≤ ∑−2)(1

[ ]⎣ ⎦

dnnc

n nk

+⎟⎞

⎜⎛≤

∑=

32 2

2/

dcn

dnnn

⎟⎞

⎜⎛

+⎟⎠

⎜⎝

≤8

cn

dncncn

⎟⎠⎞

⎜⎝⎛ −−=

4cn≤

if c ≥ 4d .,

2/9/12 CS 5633 Analysis of Algorithms 17

Page 18: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Summary of randomized d i i l iorder-statistic selection

• Works fast: linear expected time• Works fast: linear expected time.• Excellent algorithm in practice.• But the worst case is very bad: Θ(n2)• But, the worst case is very bad: Θ(n ).

Q. Is there an algorithm that runs in linear i i h ?time in the worst case?

A. Yes, due to Blum, Floyd, Pratt, Rivest, d T j [1973]

IDEA: Generate a good pivot recursively.

and Tarjan [1973].

2/9/12 CS 5633 Analysis of Algorithms 18

g p y

Page 19: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Worst-case linear-time order i istatistics

SELECT(i, n)1. Divide the n elements into groups of 5. Find

the median of each 5-element group by rote.2 Recursively SELECT the median x of the ⎣n/5⎦2. Recursively SELECT the median x of the ⎣n/5⎦

group medians to be the pivot.3. Partition around the pivot x. Let k = rank(x).

if i = k then return xelseif i < k

then recursively SELECT the ith

p ( )4.

Same as RANDthen recursively SELECT the ith

smallest element in the lower partelse recursively SELECT the (i–k)th

RAND-SELECT

2/9/12 CS 5633 Analysis of Algorithms 19

smallest element in the upper part

Page 20: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Choosing the pivotChoosing the pivot

2/9/12 CS 5633 Analysis of Algorithms 20

Page 21: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Choosing the pivotChoosing the pivot

1 Divide the n elements into groups of 51. Divide the n elements into groups of 5.

2/9/12 CS 5633 Analysis of Algorithms 21

Page 22: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Choosing the pivotChoosing the pivot

lesser1 Divide the n elements into groups of 5 Find lesser1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote.

2/9/12 CS 5633 Analysis of Algorithms 22

greater

Page 23: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Choosing the pivotChoosing the pivot

x

lesser1 Divide the n elements into groups of 5 Find lesser1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote.

2. Recursively SELECT the median x of the ⎣n/5⎦

2/9/12 CS 5633 Analysis of Algorithms 23

greatery

group medians to be the pivot.

Page 24: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Developing the recurrenceDeveloping the recurrenceSELECT(i, n)T(n) ( , )1. Divide the n elements into groups of 5. Find

the median of each 5-element group by rote.2 R i l S h di f h ⎣ /5⎦

( )

Θ(n)2. Recursively SELECT the median x of the ⎣n/5⎦

group medians to be the pivot.3 Partition around the pivot x Let k = rank(x)

T(n/5)Θ(n)

if i = k then return xelseif i < k

h i l S h i h

3. Partition around the pivot x. Let k rank(x).4.

Θ(n)

then recursively SELECT the ith smallest element in the lower part

else recursively SELECT the (i–k)th T( )?

2/9/12 CS 5633 Analysis of Algorithms 24

y ( )smallest element in the upper part

Page 25: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Analysis (Assume all elements are distinct )Analysis (Assume all elements are distinct.)

x

lesserAt least half the group medians are ≤ x which lesserAt least half the group medians are ≤ x, which is at least ⎣ ⎣n/5⎦ /2⎦ = ⎣n/10⎦ group medians.

2/9/12 CS 5633 Analysis of Algorithms 25

greater

Page 26: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Analysis (Assume all elements are distinct )Analysis (Assume all elements are distinct.)

x

lesserAt least half the group medians are ≤ x which lesserAt least half the group medians are ≤ x, which is at least ⎣ ⎣n/5⎦ /2⎦ = ⎣n/10⎦ group medians.• Therefore, at least 3 ⎣n/10⎦ elements are ≤ x.

2/9/12 CS 5633 Analysis of Algorithms 26

greater

Page 27: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Analysis (Assume all elements are distinct )Analysis (Assume all elements are distinct.)

x

lesserAt least half the group medians are ≤ x which lesserAt least half the group medians are ≤ x, which is at least ⎣ ⎣n/5⎦ /2⎦ = ⎣n/10⎦ group medians.• Therefore, at least 3 ⎣n/10⎦ elements are ≤ x.

2/9/12 CS 5633 Analysis of Algorithms 27

greater• Similarly, at least 3 ⎣n/10⎦ elements are ≥ x.

Page 28: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Analysis (Assume all elements are distinct )Analysis (Assume all elements are distinct.)

Need “at most” for worst-case runtime

• At least 3 ⎣n/10⎦ elements are ≤ x⇒ at most n-3 ⎣n/10⎦ elements are ≥ x⇒ at most n 3 ⎣n/10⎦ elements are ≥ x

• At least 3 ⎣n/10⎦ elements are ≥ x⇒ at most n-3 ⎣n/10⎦ elements are ≤ x⇒ at most n 3 ⎣n/10⎦ elements are ≤ x

• The recursive call to SELECT in Step 4 is executed recursively on n-3 ⎣n/10⎦ elementsexecuted recursively on n-3 ⎣n/10⎦ elements.

2/9/12 CS 5633 Analysis of Algorithms 28

Page 29: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Analysis (Assume all elements are distinct )Analysis (Assume all elements are distinct.)

• Use fact that ⎣a/b⎦ ≥ ((a-(b-1))/b (page 51)• n 3 ⎣n/10⎦ ≤ n 3·(n 9)/10 = (10n 3n +27)/10• n-3 ⎣n/10⎦ ≤ n-3·(n-9)/10 = (10n -3n +27)/10

≤ 7n/10 + 3Th i ll t SELECT i St 4 i• The recursive call to SELECT in Step 4 is executed recursively on at most 7n/10+3elementselements.

2/9/12 CS 5633 Analysis of Algorithms 29

Page 30: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Developing the recurrenceDeveloping the recurrenceSELECT(i, n)T(n) ( , )1. Divide the n elements into groups of 5. Find

the median of each 5-element group by rote.2 R i l S h di f h ⎣ /5⎦

( )

Θ(n)2. Recursively SELECT the median x of the ⎣n/5⎦

group medians to be the pivot.3 Partition around the pivot x Let k = rank(x)

T(n/5)Θ(n)

if i = k then return xelseif i < k

h i l S h i h

3. Partition around the pivot x. Let k rank(x).4.

Θ(n)

then recursively SELECT the ith smallest element in the lower part

else recursively SELECT the (i–k)th T(7n/10

+3)

2/9/12 CS 5633 Analysis of Algorithms 30

y ( )smallest element in the upper part

Page 31: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

Solving the recurrenceSolving the recurrencednnTnTnT +⎟

⎠⎞

⎜⎝⎛ ++⎟

⎠⎞

⎜⎝⎛= 3

107

51)(

for Θ(n)

⎠⎝⎠⎝ 105)(

)337()31()( +−++−≤ dnncncnTSubstitution:

3109

)10

()5

()(

+−≤ dnccnT(n) ≤ c(n - 3)

101)3(

10

+−−= dncnncTechnical trick. This shows that T(n)∈ O(n)

if c is chosen large enough e g c=10d)3(

10−≤ nc ,

2/9/12 CS 5633 Analysis of Algorithms 31

if c is chosen large enough, e.g., c=10d

Page 32: Order Statisticscarola/teaching/cs5633/spring12/...Order statisticsOrder statistics Select the ith smallest of n elements (the element with rank i). • i = 1: minimum; • i = n:

ConclusionsConclusions• Since the work at each level of recursion is

basically a constant fraction (9/10) smaller, the work per level is a geometric series dominated by the linear work at the root.

• In practice, this algorithm runs slowly, because the constant in front of n is large.

• The randomized algorithm is far more gpractical.

Exercise: Try to divide into groups of 3 or 72/9/12 CS 5633 Analysis of Algorithms 32

Exercise: Try to divide into groups of 3 or 7.