Upload
sebastian-wild
View
1.183
Download
2
Tags:
Embed Size (px)
DESCRIPTION
I gave this talk at the 24th International Meeting on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (AofA 2013) on Menorca (Spain). A paper covering the analyses of this talk (and some more!) has been submitted. Also, in the talk, I refer to the previous speaker at the conference, my advisor Markus Nebel - corresponding results can be found in an earlier talk of mine: slideshare.net/sebawild/average-case-analysis-of-java-7s-dual-pivot-quicksort Check my website for preprints of papers and my other talks: wwwagak.cs.uni-kl.de/sebastian-wild.html
Citation preview
Quickselect underYaroslavskiy’s Dual-Pivoting Algorithm
Sebastian Wild Markus E. Nebel Hosam [email protected]
Computer Science Department
28 May 2013AofA 2013Menorca
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 1 / 17
Selection
Consider the following problem
The Selection Problem
Given an unsorted array A of n data elements,find the m-th smallest element.
Applications
estimate distribution of data collection in statisticsworld rankings in sports
Algorithms
trivial solution: sort(A); return A[m]; O(n logn)linear worst case possible: Median-of-Medians (Blum et al.)but rather slow in practice (compared to sorting!)Hoare 1961: Quicksort idea usable for selection
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 2 / 17
From Quicksort to Quickselect
Sort whole array by Quicksort
Markus Nebel just showed:Yaroslavskiy’s partitioning can speed up Quicksort
Does success of dual partitioning in sorting carry over to selection?
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 3 / 17
From Quicksort to Quickselect
Can prune many branches
Markus Nebel just showed:Yaroslavskiy’s partitioning can speed up Quicksort
Does success of dual partitioning in sorting carry over to selection?
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 3 / 17
From Quicksort to Quickselect
Can prune many branches Dual Pivot Quicksort
Markus Nebel just showed:Yaroslavskiy’s partitioning can speed up Quicksort
Does success of dual partitioning in sorting carry over to selection?
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 3 / 17
From Quicksort to Quickselect
Can prune many branches Prune even more branches
Markus Nebel just showed:Yaroslavskiy’s partitioning can speed up Quicksort
Does success of dual partitioning in sorting carry over to selection?
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 3 / 17
From Quicksort to Quickselect
Can prune many branches Prune even more branches
Markus Nebel just showed:Yaroslavskiy’s partitioning can speed up Quicksort
Does success of dual partitioning in sorting carry over to selection?
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 3 / 17
Notation
Analyse (random) number of data comparisons C(m)n
to select from random permutation
P < Q: (random) ranks of the two pivots
subarray sizes determined by P and Q:
left P rightQmiddle
J1 := P − 1 J2 := Q− P − 1 J3 := n−Q
selection continues recursively in
left subarray, if m < P
middle subarray, if P < m < Q
right subarray, if Q < m
or terminates if m = P or m = Q
corresponding events
E1 := {m < P}
E2 := {P < m < Q}
E3 := {Q < m}
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 4 / 17
Notation
Analyse (random) number of data comparisons C(m)n
to select from random permutation
P < Q: (random) ranks of the two pivots
subarray sizes determined by P and Q:
left P rightQmiddle
J1 := P − 1 J2 := Q− P − 1 J3 := n−Q
selection continues recursively in
left subarray, if m < P
middle subarray, if P < m < Q
right subarray, if Q < m
or terminates if m = P or m = Q
corresponding events
E1 := {m < P}
E2 := {P < m < Q}
E3 := {Q < m}
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 4 / 17
Notation
Analyse (random) number of data comparisons C(m)n
to select from random permutation
P < Q: (random) ranks of the two pivots
subarray sizes determined by P and Q:
left P rightQmiddle
J1 := P − 1 J2 := Q− P − 1 J3 := n−Q
selection continues recursively in
left subarray, if m < P
middle subarray, if P < m < Q
right subarray, if Q < m
or terminates if m = P or m = Q
corresponding events
E1 := {m < P}
E2 := {P < m < Q}
E3 := {Q < m}
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 4 / 17
Quickselect Recurrence
Yaroslavskiy’s partitioning preserves randomness
can set up recurrence for the distribution of C(m)n :
C(m)n
D= Tn +
C(m)J1
if m lies in left subarray
C(m−P)J2
if m lies in middle subarray
C(m−Q)J3
if m lies in right subarray
= Tn + 1E1 · C(m)J1
+ 1E2 · C(m−P)J2
+ 1E3 · C(m−Q)J3
Tn is the (random) number of comparisons during partitioning
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 5 / 17
Getting Rid of m
Complication: second parameter m in recurrence
C(m)n
D= Tn + 1E1 · C
(m)J1
+ 1E2 · C(m−P)J2
+ 1E3 · C(m−Q)J3
consider simpler quantities
1 Random Rank: m =MnD= Uniform {1, . . . , n}
abbreviate Cn := C(Mn)n
← focus in talk
CnD= Tn+ 1E1
CJ1 + 1E2CJ2 + 1E3
CJ3
2 Extreme Rank: m = 1abbreviate Cn := C
(1)n
CnD= Tn + CP−1
explicit solution for expectations possible (generating functions)But: Requires tedious computations
More elegant shortcut to asymptotics of E[Cn] and E[Cn]?Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 6 / 17
Getting Rid of m
Complication: second parameter m in recurrence
C(m)n
D= Tn + 1E1 · C
(m)J1
+ 1E2 · C(m−P)J2
+ 1E3 · C(m−Q)J3
consider simpler quantities
1 Random Rank: m =MnD= Uniform {1, . . . , n}
abbreviate Cn := C(Mn)n
← focus in talk
CnD= Tn+ 1E1
CJ1 + 1E2CJ2 + 1E3
CJ3
2 Extreme Rank: m = 1abbreviate Cn := C
(1)n
CnD= Tn + CP−1
explicit solution for expectations possible (generating functions)But: Requires tedious computations
More elegant shortcut to asymptotics of E[Cn] and E[Cn]?Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 6 / 17
Getting Rid of m
Complication: second parameter m in recurrence
C(m)n
D= Tn + 1E1 · C
(m)J1
+ 1E2 · C(m−P)J2
+ 1E3 · C(m−Q)J3
consider simpler quantities
1 Random Rank: m =MnD= Uniform {1, . . . , n}
abbreviate Cn := C(Mn)n
← focus in talk
CnD= Tn+ 1E1
CJ1 + 1E2CJ2 + 1E3
CJ3
2 Extreme Rank: m = 1abbreviate Cn := C
(1)n
CnD= Tn + CP−1
explicit solution for expectations possible (generating functions)But: Requires tedious computations
More elegant shortcut to asymptotics of E[Cn] and E[Cn]?Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 6 / 17
Convergence
Idea: Computation easy if we can drop index n
i. e. when recurrence “in equilibrium”
For that, we need stochastic convergence of Cn
0 100 200 300 400 500 600
99.9% massn = 10
n = 20
n = 30
n = 40
Cn
But:E[Cn] grows with n
high mass region moves with n
}Cn does not converge
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 7 / 17
Convergence
Idea: Computation easy if we can drop index n
i. e. when recurrence “in equilibrium”
Try normalized variables C∗n :=
Cn
ninstead:
0 2 4 6 8 10 12 14 16 18
99.9% massn = 10
n = 20
n = 30
n = 40
Cn/n
E[C∗n] seems constant
range of high mass looks bounded
}C∗n might converge
Works because of property of Quickselect: E[Cn] = O(√
Var(Cn))
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 7 / 17
The Contraction Method
Rewrite recurrence: (Write as sum)
CnD= Tn + 1E1CJ1 + 1E2CJ2 + 1E3CJ3
Contraction Method:Convergence of coefficients + contraction conditionimplies convergence to fixpoint solution.
have to show: (all convergences in L1-norm)
1 T∗n → T∗
2 1Ei
Jin→ A∗i
. . . upcoming
3∑3i=1 E
[|A∗i |
]< 1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
The Contraction Method
Rewrite recurrence: (Divide by n)
CnD= Tn +
3∑i=1
1EiCJi
Contraction Method:Convergence of coefficients + contraction conditionimplies convergence to fixpoint solution.
have to show: (all convergences in L1-norm)
1 T∗n → T∗
2 1Ei
Jin→ A∗i
. . . upcoming
3∑3i=1 E
[|A∗i |
]< 1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
The Contraction Method
Rewrite recurrence: (Rearrange)
Cn
n
D=
Tn
n+
3∑i=1
1EiCJin
· JiJi
Contraction Method:Convergence of coefficients + contraction conditionimplies convergence to fixpoint solution.
have to show: (all convergences in L1-norm)
1 T∗n → T∗
2 1Ei
Jin→ A∗i
. . . upcoming
3∑3i=1 E
[|A∗i |
]< 1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
The Contraction Method
Rewrite recurrence: (use normalized variables)
Cn
n
D=
Tn
n+
3∑i=1
1Ei
Jin·CJiJi
Contraction Method:Convergence of coefficients + contraction conditionimplies convergence to fixpoint solution.
have to show: (all convergences in L1-norm)
1 T∗n → T∗
2 1Ei
Jin→ A∗i
. . . upcoming
3∑3i=1 E
[|A∗i |
]< 1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
The Contraction Method
Recurrence for C∗n
C∗n
D= T∗n +
3∑i=1
1Ei
Jin· C∗
Jiwith T∗n :=
Tn
n
Contraction Method:Convergence of coefficients + contraction conditionimplies convergence to fixpoint solution.
have to show: (all convergences in L1-norm)
1 T∗n → T∗
2 1Ei
Jin→ A∗i
. . . upcoming
3∑3i=1 E
[|A∗i |
]< 1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
The Contraction Method
Recurrence for C∗n
C∗nD= T∗n +
3∑i=1
1Ei
Jin︸ ︷︷ ︸ · C∗Ji
↓ ↓
T∗ A∗i
Contraction Method:Convergence of coefficients + contraction conditionimplies convergence to fixpoint solution.
have to show: (all convergences in L1-norm)
1 T∗n → T∗
2 1Ei
Jin→ A∗i
. . . upcoming
3∑3i=1 E
[|A∗i |
]< 1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
The Contraction Method
Recurrence for C∗n
C∗nD= T∗n +
3∑i=1
1Ei
Jin︸ ︷︷ ︸ · C∗Ji
↓ ↓3∑i=1
E[|A∗i |
]< 1
T∗ A∗i
Contraction Method:Convergence of coefficients + contraction conditionimplies convergence to fixpoint solution.
have to show: (all convergences in L1-norm)
1 T∗n → T∗
2 1Ei
Jin→ A∗i
. . . upcoming
3∑3i=1 E
[|A∗i |
]< 1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
The Contraction Method
Recurrence for C∗n
C∗nD= T∗n +
3∑i=1
1Ei
Jin︸ ︷︷ ︸ · C∗Ji
↓ ↓ ↓ ↓3∑i=1
E[|A∗i |
]< 1
C∗D= T∗ +
3∑i=1
A∗i · C∗
Contraction Method:Convergence of coefficients + contraction conditionimplies convergence to fixpoint solution.
have to show: (all convergences in L1-norm)
1 T∗n → T∗
2 1Ei
Jin→ A∗i
. . . upcoming
3∑3i=1 E
[|A∗i |
]< 1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
The Contraction Method
“Equilibrium” equation for C∗
C∗D= T∗ +
3∑i=1
A∗i · C∗
Contraction Method:Convergence of coefficients + contraction conditionimplies convergence to fixpoint solution.
have to show: (all convergences in L1-norm)
1 T∗n → T∗
2 1Ei
Jin→ A∗i
. . . upcoming
3∑3i=1 E
[|A∗i |
]< 1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
The Contraction Method
“Equilibrium” equation for C∗
(Take expectation on both sides)
C∗D= T∗ +
3∑i=1
A∗i · C∗
Contraction Method:Convergence of coefficients + contraction conditionimplies convergence to fixpoint solution.
have to show: (all convergences in L1-norm)
1 T∗n → T∗
2 1Ei
Jin→ A∗i
. . . upcoming
3∑3i=1 E
[|A∗i |
]< 1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
The Contraction Method
“Equilibrium” equation for C∗
(and solve for E C∗)
E C∗ = E T∗+3∑i=1
EA∗i · E C∗
Contraction Method:Convergence of coefficients + contraction conditionimplies convergence to fixpoint solution.
have to show: (all convergences in L1-norm)
1 T∗n → T∗
2 1Ei
Jin→ A∗i
. . . upcoming
3∑3i=1 E
[|A∗i |
]< 1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
The Contraction Method
“Equilibrium” equation for C∗
(and solve for E C∗)
E C∗ =E T∗
1−∑3i=1 EA∗i
Contraction Method:Convergence of coefficients + contraction conditionimplies convergence to fixpoint solution.
have to show: (all convergences in L1-norm)
1 T∗n → T∗
2 1Ei
Jin→ A∗i
. . . upcoming
3∑3i=1 E
[|A∗i |
]< 1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
The Contraction Method
“Equilibrium” equation for C∗
(and solve for E C∗)
E C∗ =E T∗
1−∑3i=1 EA∗i
Contraction Method:Convergence of coefficients + contraction conditionimplies convergence to fixpoint solution.
have to show: (all convergences in L1-norm)
1 T∗n → T∗
2 1Ei
Jin→ A∗i . . . upcoming
3∑3i=1 E
[|A∗i |
]< 1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
Probabilistic Model
Need “convenient” probabilistic model:same probability space for finite n and limit case
Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)0 1 first two = pivots values
Our equivalent model:
1 Draw spacings S1, S2, S3 on (0, 1) with S D= Dirichlet(1, 1, 1)
2 Draw subarray sizes J1, J2, J3 with J D= Multinomial(n− 2;S1, S2,S3)
3 Continue recursively
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
Probabilistic Model
Need “convenient” probabilistic model:same probability space for finite n and limit case
Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)0 1 first two = pivots values
Our equivalent model:
1 Draw spacings S1, S2, S3 on (0, 1) with S D= Dirichlet(1, 1, 1)
2 Draw subarray sizes J1, J2, J3 with J D= Multinomial(n− 2;S1, S2,S3)
3 Continue recursively
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
Probabilistic Model
Need “convenient” probabilistic model:same probability space for finite n and limit case
Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)0 1U1 U2 first two = pivots values
Our equivalent model:
1 Draw spacings S1, S2, S3 on (0, 1) with S D= Dirichlet(1, 1, 1)
2 Draw subarray sizes J1, J2, J3 with J D= Multinomial(n− 2;S1, S2,S3)
3 Continue recursively
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
Probabilistic Model
Need “convenient” probabilistic model:same probability space for finite n and limit case
Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)0 1U1 U2 first two = pivots values
Our equivalent model:
1 Draw spacings S1, S2, S3 on (0, 1) with S D= Dirichlet(1, 1, 1)
2 Draw subarray sizes J1, J2, J3 with J D= Multinomial(n− 2;S1, S2,S3)
3 Continue recursively
0 1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
Probabilistic Model
Need “convenient” probabilistic model:same probability space for finite n and limit case
Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)0 1U1 U2 first two = pivots values
Our equivalent model:
1 Draw spacings S1, S2, S3 on (0, 1) with S D= Dirichlet(1, 1, 1)
2 Draw subarray sizes J1, J2, J3 with J D= Multinomial(n− 2;S1, S2,S3)
3 Continue recursively
0 1
S1 S2 S3
induces pivot values: U1 = S1, U2 = S1 + S20
0.51 0 0.5 1
0
0.5
1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
Probabilistic Model
Need “convenient” probabilistic model:same probability space for finite n and limit case
Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)0 1U1 U2 first two = pivots values
Our equivalent model:
1 Draw spacings S1, S2, S3 on (0, 1) with S D= Dirichlet(1, 1, 1)
2 Draw subarray sizes J1, J2, J3 with J D= Multinomial(n− 2;S1, S2,S3)
3 Continue recursively
0 1
S1 S2 S3
U1 U2
induces pivot values: U1 = S1, U2 = S1 + S20
0.51 0 0.5 1
0
0.5
1
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
Probabilistic Model
Need “convenient” probabilistic model:same probability space for finite n and limit case
Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)0 1U1 U2 first two = pivots values
Our equivalent model:
1 Draw spacings S1, S2, S3 on (0, 1) with S D= Dirichlet(1, 1, 1)
2 Draw subarray sizes J1, J2, J3 with J D= Multinomial(n− 2;S1, S2,S3)
3 Continue recursively
0 1
S1 S2 S3
U1 U2J1 J2 J3
S1 = Pr(small) , S2 = Pr(medium) , S3 = Pr(large)
n− 2 elements to draw conditional on S
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
Probabilistic Model
Need “convenient” probabilistic model:same probability space for finite n and limit case
Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)0 1U1 U2 first two = pivots values
Our equivalent model:
1 Draw spacings S1, S2, S3 on (0, 1) with S D= Dirichlet(1, 1, 1)
2 Draw subarray sizes J1, J2, J3 with J D= Multinomial(n− 2;S1, S2,S3)
3 Continue recursively
0 1
S1 S2 S3
U1 U2J1 J2 J3
S1 = Pr(small) , S2 = Pr(medium) , S3 = Pr(large)
n− 2 elements to draw conditional on S
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
Probabilistic Model
Need “convenient” probabilistic model:same probability space for finite n and limit case
Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)0 1U1 U2 first two = pivots values
Our equivalent model:
1 Draw spacings S1, S2, S3 on (0, 1) with S D= Dirichlet(1, 1, 1)
2 Draw subarray sizes J1, J2, J3 with J D= Multinomial(n− 2;S1, S2,S3)
3 Continue recursively
0 1
S1 S2 S3
U1 U2J1 J2 J3
S1 = Pr(small) , S2 = Pr(medium) , S3 = Pr(large)
n− 2 elements to draw conditional on S
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
Probabilistic Model
Need “convenient” probabilistic model:same probability space for finite n and limit case
Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)0 1U1 U2 first two = pivots values
Our equivalent model:
1 Draw spacings S1, S2, S3 on (0, 1) with S D= Dirichlet(1, 1, 1)
2 Draw subarray sizes J1, J2, J3 with J D= Multinomial(n− 2;S1, S2,S3)
3 Continue recursively
0 1
S1 S2 S3
U1 U2J1 J2 J3
S1 = Pr(small) , S2 = Pr(medium) , S3 = Pr(large)
n− 2 elements to draw conditional on S
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
Probabilistic Model
Need “convenient” probabilistic model:same probability space for finite n and limit case
Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)0 1U1 U2 first two = pivots values
Our equivalent model:
1 Draw spacings S1, S2, S3 on (0, 1) with S D= Dirichlet(1, 1, 1)
2 Draw subarray sizes J1, J2, J3 with J D= Multinomial(n− 2;S1, S2,S3)
3 Continue recursively
0 1U1U1 U2 U2
Continue of sub-interval with smaller n
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
Probabilistic Model
Need “convenient” probabilistic model:same probability space for finite n and limit case
Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1)0 1U1 U2 first two = pivots values
Our equivalent model:
1 Draw spacings S1, S2, S3 on (0, 1) with S D= Dirichlet(1, 1, 1)
2 Draw subarray sizes J1, J2, J3 with J D= Multinomial(n− 2;S1, S2,S3)
3 Continue recursively
0 1U1U1 U2 U2
Continue of sub-interval with smaller n
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
Distribution of Tn — Repetition1 T∗n → T∗
Invariant:< p `
→> qg
←p 6 ◦ 6 q k
→
k’s range K G
Tn = number of comparisons in first partitioning step
∼ n one for every element
+ J2 second for all medium elements
+ l@K second for large elements at C ′k
+ s@ G second for small elements at C ′g
Markus Nebel just showed:
l@KD= Hypergeometric(n− 2; J3 , J1 + J2 )
}conditional on J
s@ GD= Hypergeometric(n− 2; J1 , J3 )
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 10 / 17
Distribution of Tn — Repetition1 T∗n → T∗
Invariant:< p `
→> qg
←p 6 ◦ 6 q k
→
k’s range K G
Tn = number of comparisons in first partitioning step
∼ n one for every element
+ J2 second for all medium elements
+ l@K second for large elements at C ′k
+ s@ G second for small elements at C ′g
Markus Nebel just showed:
l@KD= Hypergeometric(n− 2; J3 , J1 + J2 )
}conditional on J
s@ GD= Hypergeometric(n− 2; J1 , J3 )
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 10 / 17
Distribution of Tn — Repetition1 T∗n → T∗
Invariant:< p `
→> qg
←p 6 ◦ 6 q k
→
k’s range K G
Tn = number of comparisons in first partitioning step
∼ n one for every element
+ J2 second for all medium elements
+ l@K second for large elements at C ′k
+ s@ G second for small elements at C ′g
Markus Nebel just showed:
l@KD= Hypergeometric(n− 2; J3 , J1 + J2 )
}conditional on J
s@ GD= Hypergeometric(n− 2; J1 , J3 )
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 10 / 17
Convergence of T ∗n1 T∗n → T∗
T∗n =Tn
n= 1+
J2
n+l@K
n+s@ G
n
J2
n→ S2 by law of large numbers, since E[Ji | S] = n · Si
Multinomial and Hypergeometric concentrated around mean
l@K
n→ E
[l@K
n
∣∣∣∣ S]
by Chernoff bound arguments
= E[E[l@K | J
]n
∣∣∣∣ S]
= E[J3 ( J1 + J2 )
n(n− 2)
∣∣∣∣ S]
∼ S3 · (S1 + S2)s@ G
n→ S1 · S3 similar
T∗n → T∗ = 1+ S2 + S3(2S1 + S2)
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 11 / 17
Convergence of T ∗n1 T∗n → T∗
T∗n =Tn
n= 1+
J2
n+l@K
n+s@ G
n
J2
n→ S2 by law of large numbers, since E[Ji | S] = n · Si
Multinomial and Hypergeometric concentrated around mean
l@K
n→ E
[l@K
n
∣∣∣∣ S]
by Chernoff bound arguments
= E[E[l@K | J
]n
∣∣∣∣ S]
= E[J3 ( J1 + J2 )
n(n− 2)
∣∣∣∣ S]
∼ S3 · (S1 + S2)s@ G
n→ S1 · S3 similar
T∗n → T∗ = 1+ S2 + S3(2S1 + S2)
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 11 / 17
Convergence of T ∗n1 T∗n → T∗
T∗n =Tn
n= 1+
J2
n+l@K
n+s@ G
n
J2
n→ S2 by law of large numbers, since E[Ji | S] = n · Si
Multinomial and Hypergeometric concentrated around mean
l@K
n→ E
[l@K
n
∣∣∣∣ S]
by Chernoff bound arguments
= E[E[l@K | J
]n
∣∣∣∣ S]
= E[J3 ( J1 + J2 )
n(n− 2)
∣∣∣∣ S]
∼ S3 · (S1 + S2)s@ G
n→ S1 · S3 similar
T∗n → T∗ = 1+ S2 + S3(2S1 + S2)
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 11 / 17
Convergence of Coefficients & Contraction Condition
2 1EiJin → A∗i
standard computation:
1E1
J1
n→ 1F1 · S1 with F1 = {V < S1}
i = 2, 3 similar
3∑3i=1 E
[|A∗i |
]< 1
E[1F1S1
]= 16
by symmetry:3∑i=1
E[|A∗i |
]= 12 < 1
We (finally) proved convergence C∗n→ C∗.
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 12 / 17
Convergence of Coefficients & Contraction Condition
2 1EiJin → A∗i
standard computation:
1E1
J1
n→ 1F1 · S1 with F1 = {V < S1}
i = 2, 3 similar
3∑3i=1 E
[|A∗i |
]< 1
E[1F1S1
]= 16
by symmetry:3∑i=1
E[|A∗i |
]= 12 < 1
We (finally) proved convergence C∗n→ C∗.
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 12 / 17
Convergence of Coefficients & Contraction Condition
2 1EiJin → A∗i
standard computation:
1E1
J1
n→ 1F1 · S1 with F1 = {V < S1}
i = 2, 3 similar
3∑3i=1 E
[|A∗i |
]< 1
E[1F1S1
]= 16
by symmetry:3∑i=1
E[|A∗i |
]= 12 < 1
We (finally) proved convergence C∗n→ C∗.
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 12 / 17
Results — Expectations
Computing expectations and plugging in . . .
E T∗ = 1912
Recall:
E C∗ =ET∗
1−∑3
i=1 EA∗i
∑3i=1 EA∗i =
12
} E C∗ = 19
6 = 3.16
L1 convergence E Cn ∼ 3.16n
similarly: E Cn ∼ 2.375n
Comparisons Swaps
0
1
2
3
+5.55%
+100%
random rank
Classic Quickselect
Yaroslavskiy Select
Comparisons Swaps
0
1
2
+18.8%
+125%
extreme rank
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 13 / 17
Results — Expectations
Computing expectations and plugging in . . .
E T∗ = 1912
Recall:
E C∗ =ET∗
1−∑3
i=1 EA∗i
∑3i=1 EA∗i =
12
} E C∗ = 19
6 = 3.16
L1 convergence E Cn ∼ 3.16n
similarly: E Cn ∼ 2.375n
Comparisons Swaps
0
1
2
3
+5.55%
+100%
random rank
Classic Quickselect
Yaroslavskiy Select
Comparisons Swaps
0
1
2
+18.8%
+125%
extreme rank
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 13 / 17
Results — Expectations
Computing expectations and plugging in . . .
E T∗ = 1912
Recall:
E C∗ =ET∗
1−∑3
i=1 EA∗i
∑3i=1 EA∗i =
12
} E C∗ = 19
6 = 3.16
L1 convergence E Cn ∼ 3.16n
similarly: E Cn ∼ 2.375n
Comparisons Swaps
0
1
2
3
+5.55%
+100%
random rank
Classic Quickselect
Yaroslavskiy Select
Comparisons Swaps
0
1
2
+18.8%
+125%
extreme rank
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 13 / 17
Results — Expectations
Computing expectations and plugging in . . .
E T∗ = 1912
Recall:
E C∗ =ET∗
1−∑3
i=1 EA∗i
∑3i=1 EA∗i =
12
} E C∗ = 19
6 = 3.16
L1 convergence E Cn ∼ 3.16n
similarly: E Cn ∼ 2.375n
Comparisons Swaps
0
1
2
3
+5.55%
+100%
random rank
Classic Quickselect
Yaroslavskiy Select
Comparisons Swaps
0
1
2
+18.8%
+125%
extreme rank
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 13 / 17
Pivot Sampling
successful optimization of Quicksort/-select: median of threeHere: generalized pivot selection scheme
1 Choose random sample of k elements2 Sort the sample3 Select pivots s. t. in sorted sample
t1 smaller ,
t2 between and
t3 larger than pivots
Example withk = 8 and t = (3, 2, 1):
U1 U2
t1 t2 t3
Our stochastic model nicely generalizes to pivot sampling:
only the distribution of spacings S changes!
S D= Dirichlet(t1 + 1, t2 + 1, t3 + 1) density: f(s) =
st11 s
t22 s
t33
B
contraction conditions remain valid
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 14 / 17
Pivot Sampling
successful optimization of Quicksort/-select: median of threeHere: generalized pivot selection scheme
1 Choose random sample of k elements2 Sort the sample3 Select pivots s. t. in sorted sample
t1 smaller ,
t2 between and
t3 larger than pivots
Example withk = 8 and t = (3, 2, 1):
U1 U2
t1 t2 t3
Our stochastic model nicely generalizes to pivot sampling:
only the distribution of spacings S changes!
S D= Dirichlet(t1 + 1, t2 + 1, t3 + 1) density: f(s) =
st11 s
t22 s
t33
B
contraction conditions remain valid
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 14 / 17
Results — Pivot Sampling
General result: Cn ∼1+ t2+1
k+1 +(2(t1+1)+(t2+1))(t3+1)
(k+1)2
1 −∑3i=1
(ti+1)2
(k+1)2
· n
inconvenient for interpretation . . .
Limiting case for large sample:ti
k→ αi as k→∞
M= take exact quantiles as pivots
Cn ∼1+ α2 + (2α1 + α2)α3
1 −∑3i=1α
2i
· n
0 0.2 0.4 0.6 0.8 1
0
0.2
0.4
0.6
0.8
1
α1
α2
optimal choice is not symmetric
classic Quickselect with median of 2t+ 1:2.5n for t = 1 and 2.3n for t = 2
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 15 / 17
Results — Pivot Sampling
General result: Cn ∼1+ t2+1
k+1 +(2(t1+1)+(t2+1))(t3+1)
(k+1)2
1 −∑3i=1
(ti+1)2
(k+1)2
· n
inconvenient for interpretation . . .
Limiting case for large sample:ti
k→ αi as k→∞
M= take exact quantiles as pivots
Cn ∼1+ α2 + (2α1 + α2)α3
1 −∑3i=1α
2i
· n
0 0.2 0.4 0.6 0.8 1
0
0.2
0.4
0.6
0.8
1
α1
α2
optimal choice is not symmetric
classic Quickselect with median of 2t+ 1:2.5n for t = 1 and 2.3n for t = 2
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 15 / 17
Results — Pivot Sampling
General result: Cn ∼1+ t2+1
k+1 +(2(t1+1)+(t2+1))(t3+1)
(k+1)2
1 −∑3i=1
(ti+1)2
(k+1)2
· n
inconvenient for interpretation . . .
Limiting case for large sample:ti
k→ αi as k→∞
M= take exact quantiles as pivots
Cn ∼1+ α2 + (2α1 + α2)α3
1 −∑3i=1α
2i
· n
0 0.2 0.4 0.6 0.8 1
0
0.2
0.4
0.6
0.8
1
α1
α2
optimal choice is not symmetric
classic Quickselect with median of 2t+ 1:2.5n for t = 1 and 2.3n for t = 2
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 15 / 17
Results — Pivot Sampling
General result: Cn ∼1+ t2+1
k+1 +(2(t1+1)+(t2+1))(t3+1)
(k+1)2
1 −∑3i=1
(ti+1)2
(k+1)2
· n
inconvenient for interpretation . . .
Limiting case for large sample:ti
k→ αi as k→∞
M= take exact quantiles as pivots
Cn ∼1+ α2 + (2α1 + α2)α3
1 −∑3i=1α
2i
· n
0 0.2 0.4 0.6 0.8 1
0
0.2
0.4
0.6
0.8
1
2.4654
2.5
α1
α2
optimal choice is not symmetric
classic Quickselect with median of 2t+ 1:2.5n for t = 1 and 2.3n for t = 2
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 15 / 17
Conclusion
Practitioners’ Lesson:Yaroslavskiy’s dual-pivoting not helpful for selection
Analyzers’ Lesson:expected costs of Quickselect derivable by contraction methodpivot sampling included naturally
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 16 / 17
References & Further Reading
M. Blum, R. W. Floyd, V. Pratt, R. L. Rivest and R. E. Tarjan.Time bounds for selection.J. of Comp. and System Sc., 7(4):448–461, 1973.
C. A. R. Hoare.Algorithm 65: Find.Communications of the ACM, 4(7):321–322, July 1961.
H.-K. Hwang and T.-H. Tsai.Quickselect and Dickman function.Combinatorics, Probability and Computing, 11, 2000.
Hosam M. Mahmoud.Sorting: A distribution theory.John Wiley & Sons, Hoboken, NJ, USA, 2000.
H. M. Mahmoud, R. Modarres and R. T. Smythe.Analysis of quickselect : an algorithm for order statistics.Informatique théorique et appl., 29(4):255–276, 1995.
C. Martínez, D. Panario and A. Viola.Adaptive sampling strategies for quickselects.ACM Transactions on Algorithms, 6(3):1–45, June 2010.
C. Martínez and S. Roura.Optimal Sampling Strategies in Quicksort and Quickselect.SIAM Journal on Computing, 31(3):683, 2001.
R. Neininger.On a multivariate contraction method for randomrecursive structures with applications to Quicksort.Random Structures & Alg., 19(3-4):498–524, 2001.
R. Neininger and L. Rüschendorf.A General Limit Theorem for Recursive Algorithms andCombinatorial Structures.The Annals of Applied Probability, 14(1):378–418, 2004.
A. Panholzer and H. Prodinger.A generating functions approach for the analysis ofgrand averages for multiple QUICKSELECT.Random Structures & Alg., 13(3-4):189–209, 1998.
U. Rösler and L. Rüschendorf.The contraction method for recursive algorithms.Algorithmica, 29(1):3–33, 2001.
S. Wild.Java 7’s Dual Pivot Quicksort.Master thesis, University of Kaiserslautern, 2012.
S. Wild, M. E. Nebel and H. Mahmoud.Analysis of Quickselect underYaroslavskiy’s Dual-Pivoting Algorithm.Algorithmica, in preparation.
S. Wild, M. E. Nebel and R. Neininger.Average Case and Distributional Analysis ofJava 7’s Dual Pivot Quicksort.ACM Transactions on Algorithms, submitted.
V. Yaroslavskiy.Dual-Pivot Quicksort.2009.
Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 17 / 17