34
Average-Case and Distributional Analysis of Java 7’s Dual Pivot Quicksort Markus E. Nebel based on joint work with Ralph Neininger and Sebastian Wild AofA 2013 Menorca, Spain Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 1 / 22

Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Average-Case and Distributional Analysis of Java 7’sDual Pivot Quicksort

Markus E. Nebel

based on joint work with Ralph Neininger and Sebastian Wild

AofA 2013Menorca, Spain

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 1 / 22

Page 2: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Sorting Algorithms in Practice

Many inventionsby algorithms comunity

vs. Few methodssuccessful in practice

C

C++

Java 6

Quicksort+Mergesort variant as stable sort

.NET

Haskell

Python Timsort

Sorting methods listed on Wikipedia Sorting methods of standard libraries for random access data

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 2 / 22

Page 3: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Sorting Algorithms in Practice

Many inventionsby algorithms comunity

vs. Few methodssuccessful in practice

C

C++

Java 6

Quicksort+Mergesort variant as stable sort

.NET

Haskell

Python Timsort

Sorting methods listed on Wikipedia Sorting methods of standard libraries for random access data

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 2 / 22

Page 4: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

History of Quicksort in Practice

1961,62 Hoare: first publication, average case analysis

1969 Singleton: median-of-three & Insertionsort on small subarrays

1975-78 Sedgewick: detailled analysis of many optimizations

1993 Bentley, McIlroy: Engineering a Sort Function

1997 Musser: O(n log n) worst case by bounded recursion depth

; Basic algorithm settled since 1961; latest tweaks from 1990’s.Since then: Almost identical in all programming libraries!

Until 2009: Java 7 switches to a new dual pivot Quicksort!

Sept. 2009 Vladimir Yaroslavskiy announced algorithm on Java core library mailinglist ; July 2011 public release of Java 7 with Yaroslavskiy’s Quicksort.

1961 1969 1975 ’78 1993 1997’62 ’77

today

2009

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 3 / 22

Page 5: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

History of Quicksort in Practice

1961,62 Hoare: first publication, average case analysis

1969 Singleton: median-of-three & Insertionsort on small subarrays

1975-78 Sedgewick: detailled analysis of many optimizations

1993 Bentley, McIlroy: Engineering a Sort Function

1997 Musser: O(n log n) worst case by bounded recursion depth

; Basic algorithm settled since 1961; latest tweaks from 1990’s.Since then: Almost identical in all programming libraries!

Until 2009: Java 7 switches to a new dual pivot Quicksort!

Sept. 2009 Vladimir Yaroslavskiy announced algorithm on Java core library mailinglist ; July 2011 public release of Java 7 with Yaroslavskiy’s Quicksort.

1961 1969 1975 ’78 1993 1997’62 ’77

today

2009

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 3 / 22

Page 6: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

History of Quicksort in Practice

1961,62 Hoare: first publication, average case analysis

1969 Singleton: median-of-three & Insertionsort on small subarrays

1975-78 Sedgewick: detailled analysis of many optimizations

1993 Bentley, McIlroy: Engineering a Sort Function

1997 Musser: O(n log n) worst case by bounded recursion depth

; Basic algorithm settled since 1961; latest tweaks from 1990’s.Since then: Almost identical in all programming libraries!

Until 2009: Java 7 switches to a new dual pivot Quicksort!

Sept. 2009 Vladimir Yaroslavskiy announced algorithm on Java core library mailinglist ; July 2011 public release of Java 7 with Yaroslavskiy’s Quicksort.

1961 1969 1975 ’78 1993 1997’62 ’77

today2009

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 3 / 22

Page 7: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Running Time Experiments

Why switch to new, unknown algorithm?

Because it is faster!

0 0.5 1 1.5 2

·106

7

8

9

n

time

10−

6·n

lnn

Java 6 Library

Normalized Java runtimes (in ms).Average and standard deviation of1000 random permutations per size.

remains true for basic variants of algorithms: vs. !

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 4 / 22

Page 8: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Running Time Experiments

Why switch to new, unknown algorithm? Because it is faster!

0 0.5 1 1.5 2

·106

7

8

9

n

time

10−

6·n

lnn

Java 6 Library

Java 7 Library

Normalized Java runtimes (in ms).Average and standard deviation of1000 random permutations per size.

remains true for basic variants of algorithms: vs. !

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 4 / 22

Page 9: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Running Time Experiments

Why switch to new, unknown algorithm? Because it is faster!

0 0.5 1 1.5 2

·106

7

8

9

n

time

10−

6·n

lnn

Java 6 Library

Java 7 Library

Classic Quicksort

Yaroslavskiy

Normalized Java runtimes (in ms).Average and standard deviation of1000 random permutations per size.

remains true for basic variants of algorithms: vs. !

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 4 / 22

Page 10: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Dual Pivot Quicksort

High Level Algorithm:

1 Partition array arround two pivots p 6 q.2 Sort 3 subarrays recursively.

How to do partitioning?

1 For each element x, determine its class

small for x < pmedium for p < x < q

large for q < x

by comparing x to p and/or q

2 Arrange elements according to classes p q

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 5 / 22

Page 11: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Dual Pivot Quicksort

High Level Algorithm:

1 Partition array arround two pivots p 6 q.2 Sort 3 subarrays recursively.

How to do partitioning?

1 For each element x, determine its class

small for x < pmedium for p < x < q

large for q < x

by comparing x to p and/or q

2 Arrange elements according to classes p q

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 5 / 22

Page 12: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Dual Pivot Quicksort – Previous Work

Robert Sedgewick, 1975

in-place dual pivot Quicksort implementation

more comparisons and swaps than classic Quicksort

Pascal Hennequin, 1991

comparisons for list-based Quicksort with r pivots

r = 2 ; same #comparisons as classic Quicksortin one partitioning step: 5

3 comparisons per element

r > 2 ; very small savings, but complicated partitioning

; Using two pivots does not pay, and ...

... no theoretical explanation for impressive speedup.

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 6 / 22

Page 13: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Dual Pivot Quicksort – Previous Work

Robert Sedgewick, 1975

in-place dual pivot Quicksort implementation

more comparisons and swaps than classic Quicksort

Pascal Hennequin, 1991

comparisons for list-based Quicksort with r pivots

r = 2 ; same #comparisons as classic Quicksortin one partitioning step: 5

3 comparisons per element

r > 2 ; very small savings, but complicated partitioning

; Using two pivots does not pay, and ...

... no theoretical explanation for impressive speedup.

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 6 / 22

Page 14: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Dual Pivot Quicksort – Comparisons

How many comparisons to determine classes ( small , medium or large ) ?

Assume, we first compare with p.; small elements need 1, others 2 comparisons

on average: 13 of all elements are small

; 13 · 1 + 2

3 · 2 = 53 comparisons per element

if inputs are uniform random permutations,classes of x and y are independent

; Any partitioning method needs at least53(n − 2) ∼ 20

12n comparisons on average?

No! (Stay tuned . . . )

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 7 / 22

Page 15: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Dual Pivot Quicksort – Comparisons

How many comparisons to determine classes ( small , medium or large ) ?

Assume, we first compare with p.; small elements need 1, others 2 comparisons

on average: 13 of all elements are small

; 13 · 1 + 2

3 · 2 = 53 comparisons per element

if inputs are uniform random permutations,classes of x and y are independent

; Any partitioning method needs at least53(n − 2) ∼ 20

12n comparisons on average?

No! (Stay tuned . . . )

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 7 / 22

Page 16: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Beating the “Lower Bound”

∼ 2012n comparisons only needed,

if there is one comparison location (giving rise to fixed order like firstcompare with p), then checks for x and y independent

But: Can have several comparison locations!

Here: Assume two locations C1 and C2 s. t.

C1 first compares with p.

C2 first compares with q.

C1 executed often, iff p is large.

C2 executed often, iff q is small.

; C1 executed ofteniff many small elementsiff good chance that C1 needs only one comparison

(C2 similar)

; less comparisons than 53 per elements on average

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 8 / 22

Page 17: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Yaroslavskiy’s Quicksort

5 while k 6 g6 Ck if A[k] < p7 Swap A[k] and A[`] ; ` := `+ 18 elseC′

k if A[k] > q9 Cg while A[g] > q and k < g do g := g − 1 end while

10 Swap A[k] and A[g] ; g := g − 111 C′

g if A[k] < p12 Swap A[k] and A[`] ; ` := `+ 113 end if14 end if15 k := k + 116 end while

2 comparison locations

Ck handles pointer kCg handles pointer g

Ck first checks < pC ′k if needed > q

Cg first checks > qC ′g if needed < p

Invariant:

< p `→

> qg←

p 6 ◦ 6 q k→

?

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 9 / 22

Page 18: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Analysis of Yaroslavskiy’s Algorithm

In this talk:leading term asymptotics of comparisons(we have results for swaps and Java bytecodes too)distribution and correlation of costseffect of pivot sampling

Cn expected #comparisons to sort random permutation of {1, . . . ,n}

Cn satisfies recurrence relation

Cn = cn + 2n(n−1)

∑16p<q6n

(Cp−1 + Cq−p−1 + Cn−q

),

with cn expected #comparisons in first partitioning step

recurrence solvable by standard methods

; linear cn ∼ a · n yields Cn ∼ 65a · n ln n.

; need to compute cnMarkus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 10 / 22

Page 19: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Analysis of Yaroslavskiy’s Algorithm

first comparison for all elements (at Ck or Cg ); ∼ n comparisons

second comparison for some elements at C′k resp. C′

g. . . but how often are C ′k resp. C ′g reached?

C ′k : all non- small elements reached by pointer k.

C ′g : all non- large elements reached by pointer g.

second comparison for medium elements not avoidable; ∼ 1

3n comparisons in expectation

; it remains to count:large elements reached by k andsmall elements reached by g.

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 11 / 22

Page 20: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Analysis of Yaroslavskiy’s Algorithm

Second comparisons for small and large elements?Depends on location!

C ′k ; l @K: number of large elements at positions K.

C ′g ; s @G: number of small elements at positions G.

1 Recall invariant: < p `→

> qg←

p 6 ◦ 6 q k→

?

; k and g cross at (rank of) q

p q

positions K = {2, . . . , q − 1} G = {q, . . . ,n − 1}

l @K = 3 s @G = 2

; |K| ∼ q, |G| ∼ n − q

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 12 / 22

Page 21: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Distribution of l @K and s @G

Assume p and q are fixed.

2 How many small and large elements?

#small =∣∣{1, . . . , p − 1}

∣∣ = p − 1

#large =∣∣{q + 1, . . . ,n}

∣∣ = n − q

; |K|, |G|, #small and #large are constant (for given p and q).

But: l @K and s @G are random even for fixed p and q.

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 13 / 22

Page 22: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Distribution of l @K and s @G

Assume p and q are fixed.

2 How many small and large elements?

#small =∣∣{1, . . . , p − 1}

∣∣ = p − 1

#large =∣∣{q + 1, . . . ,n}

∣∣ = n − q

; |K|, |G|, #small and #large are constant (for given p and q).

But: l @K and s @G are random even for fixed p and q.

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 13 / 22

Page 23: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Distribution of l @K and s @G

3 Conditional Distribution of l @K

We draw positions of large elements at random.

n − 2 positions (≡ urn with n − 2 balls)

draw #large positions without replacement (≡ number of draws is #large)

|K| positions contribute to l @K (≡ maximum number of successes is |K|)

; l @KD= Hypergeometric (#large, |K|, n − 2)

E [l @K | p, q] =#large · |K|

n − 2 ∼?? , 2

(n − q) · qn − 2

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 14 / 22

Page 24: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Distribution of l @K and s @G

3 Conditional Distribution of l @K

We draw positions of large elements at random.

n − 2 positions (≡ urn with n − 2 balls)

draw #large positions without replacement (≡ number of draws is #large)

|K| positions contribute to l @K (≡ maximum number of successes is |K|)

; l @KD= Hypergeometric (#large, |K|, n − 2)

E [l @K | p, q] =#large · |K|

n − 2 ∼?? , 2

(n − q) · qn − 2

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 14 / 22

Page 25: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Analysis of Yaroslavskiy’s Algorithm

law of total expectation:

E [l @K] =∑

16p<q6nPr[pivots (p, q)] · (n − q) q−2

n−2 ∼ 16n

Similarly: E [s @G] ∼ 112n.

Summing up contributions:

cn ∼ n first comparisons

+ 13n medium elements

+ 16n large elements at C ′k

+ 112n small elements at C ′g

= 1912 n

Recall: “lower bound” was 2012n.

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 15 / 22

Page 26: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Distribution of costs

The contraction method can be used to show

TheoremFor the number Cn of key comparisons used by Yaroslavskiy’s Quicksort when operatingon a uniformly at random distributed permutation we have

Cn − E[Cn ]

n → C ∗, (n →∞),

where the convergence is in distribution and with second moments. The distribution ofC ∗ is determined as the unique fixed point, subject to E[X ] = 0 and E[X2] <∞, of

X D= 1 + (D1 + D2)(D2 + 2D3) +

3∑j=1

(DjX (j) + 19

10 Dj ln Dj),

where (D1,D2,D3), X (1), X (2) and X (3) are independent and X (j) has the samedistribution as X for j ∈ {1, 2, 3}. Moreover, we have, as n →∞,

Var(Cn) ∼ σ2C n2 with σ2

C = 2231360 − 361

600π2 = 0.25901. . .

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 16 / 22

Page 27: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Distribution of costs

Exact distributionCnn ,

andCn − E[Cn ]

n , n = 5 . . . 25.

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 17 / 22

Page 28: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Covariance between comparisons and swaps

TheoremFor the number Cn of key comparisons and the number Sn of swaps used byYaroslavskiy’s algorithm on a random permutation, we have for n →∞

Cov(Cn ,Sn) ∼ σC,S n2 with σC,S = 2815 − 19

100π2 = −0.00855817. . .

The correlation coefficient of Cn and Sn is consequently

ρ =Cov(Cn ,Sn)√

Var(Cn)√

Var(Sn)≈ −0.0512112. . .

Remark: Note the changed behavior compared to classic quicksort wherea strong negative correlation (−0.86404) is observed!

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 18 / 22

Page 29: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Pivot Sampling

0 0.2 0.4 0.6 0.8 1

0

0.2

0.4

0.6

0.8

1

α1

α2

α1 (α2) the percentage of small (medium) elements.

Black (white) dot showsoptimal (symmetric) choicefor the pivots (exact orderstatistics , k →∞);

dashed black (dotted white)lines represent“equi-cost-ant” to optimum(equidistant fromsymmetric) pivot choices;

a smaller sample togetherwith optimal choice canbeat symmetric choice forlarger sample (in number ofJava bytecodes).

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 19 / 22

Page 30: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Pivot Sampling

0 0.2 0.4 0.6 0.8 1

0

0.2

0.4

0.6

0.8

1

α1

α2

α1 (α2) the percentage of small (medium) elements.

Black (white) dot showsoptimal (symmetric) choicefor the pivots (exact orderstatistics , k →∞);

dashed black (dotted white)lines represent“equi-cost-ant” to optimum(equidistant fromsymmetric) pivot choices;

a smaller sample togetherwith optimal choice canbeat symmetric choice forlarger sample (in number ofJava bytecodes).

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 19 / 22

Page 31: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Discussion

– we did not really succeed

Comparisons:

Yaroslavskiy needs ∼ 65 ·

1912 n ln n = 1.9 n ln n on average.

Classic Quicksort needs ∼ 2 n ln n comparisons!

Swaps:

∼ 0.6 n ln n swaps for Yaroslavskiy’s algorithm vs.

∼ 0.3 n ln n swaps for classic Quicksort

Bytecodes:∼ 21.7n ln(n) − 3.56319n Java bytecodes for Yaroslavskiy’s algorithmvs.

∼ 18n ln(n) + 6.21488n Java bytecodes for classic Quicksort

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 20 / 22

Page 32: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Discussion – we did not really succeed

Comparisons:

Yaroslavskiy needs ∼ 65 ·

1912 n ln n = 1.9 n ln n on average.

Classic Quicksort needs ∼ 2 n ln n comparisons!

Swaps:

∼ 0.6 n ln n swaps for Yaroslavskiy’s algorithm vs.

∼ 0.3 n ln n swaps for classic Quicksort

Bytecodes:∼ 21.7n ln(n) − 3.56319n Java bytecodes for Yaroslavskiy’s algorithmvs.

∼ 18n ln(n) + 6.21488n Java bytecodes for classic Quicksort

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 20 / 22

Page 33: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Conclusion

Yaroslavskiy’s quicksort is a perfect textbook example to demonstrate

how well methods from AofA are developed;

the depth of results obtainable (precise expectations, distributions,covariances, ...) by those methods;

how AofA can guide engineering of an algorithm (pivot sampling,switch to insertionsort, ...).

However, our sophisticated machinery fails to explain the practicalefficiency of Yaroslavskiy’s algorithms; (presumably) it would beimportant to get access to

branch mispredictions and/or

cache misses.

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 21 / 22

Page 34: Average-Case and Distributional Analysis of Java 7’s Dual Pivot …aofa2013.lsi.upc.edu/slides/Nebel.pdf · Python Timsort Sorting methods listed on Wikipedia Sorting methods of

Many thanks for your attention!

Markus E. Nebel Java 7’s Dual Pivot Quicksort 2013/18/5 22 / 22