
Page 1: Sorting Algorithms (cs.vu.nl, tcs/ds/lecture7.pdf)

Sorting Algorithms

We have already seen:
- Selection-sort
- Insertion-sort
- Heap-sort

We will see:
- Bubble-sort
- Merge-sort
- Quick-sort

We will show that:
- O(n · log n) is optimal for comparison-based sorting.

Page 2:

Bubble-Sort

The basic idea of bubble-sort is as follows:
- exchange neighboring elements that are in the wrong order
- stop when no elements were exchanged

bubbleSort(A):
  n = length(A)
  swapped = true
  while swapped == true do
    swapped = false
    for i = 0 to n − 2 do
      if A[i] > A[i + 1] then
        swap(A[i], A[i + 1])
        swapped = true
    done
    n = n − 1
  done
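As a concrete sketch of this pseudocode (the function name `bubble_sort` and the plain-list interface are my own, not from the slides), a Python version might look like this:

```python
def bubble_sort(a):
    """Sort list a in place by repeatedly swapping out-of-order neighbors.

    Stops as soon as a full pass performs no swap, which gives the
    O(n) best case on an already sorted list.
    """
    n = len(a)
    swapped = True
    while swapped:
        swapped = False
        for i in range(n - 1):           # compare neighbors a[i], a[i+1]
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
        n -= 1                           # the largest element has bubbled to the end
    return a

print(bubble_sort([5, 4, 3, 7, 1]))      # [1, 3, 4, 5, 7]
```

Shrinking `n` after each pass mirrors the pseudocode: each pass moves the largest remaining element to the end, so the sorted suffix never needs to be re-scanned.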

Page 3:

Bubble-Sort: Example

5 4 3 7 1
4 5 3 7 1   (swap 5, 4)
4 3 5 7 1   (swap 5, 3)
4 3 5 1 7   (swap 7, 1; 7 is sorted)
3 4 5 1 7   (swap 4, 3)
3 4 1 5 7   (swap 5, 1; 5 7 is sorted)
3 1 4 5 7   (swap 4, 1)
1 3 4 5 7   (swap 3, 1; fully sorted)

(the slides highlight the unsorted part and the sorted part, which grows from the right)

Page 4:

Bubble-Sort: Properties

Time complexity:
- worst-case: (n − 1) + ... + 1 = ((n − 1)² + (n − 1)) / 2 ∈ O(n²)
  (caused by sorting a reverse-sorted list)
- best-case: O(n)

Bubble-sort is:
- slow
- in-place

Page 5:

Divide-and-Conquer

Divide-and-Conquer is a general algorithm design paradigm:
- Divide: divide the input S into two disjoint subsets S1, S2
- Recur: recursively solve the subproblems S1, S2
- Conquer: combine the solutions for S1, S2 into a solution for S

(the base cases of the recursion are problems of size 0 or 1)

Example: merge-sort

7 2 | 9 4 ↦ 2 4 7 9
  7 | 2 ↦ 2 7
    7 ↦ 7    2 ↦ 2
  9 | 4 ↦ 4 9
    9 ↦ 9    4 ↦ 4

- | indicates the splitting point
- ↦ indicates merging of the sub-solutions

Page 6:

Merge Sort

Merge-sort of a list S with n elements works as follows:
- Divide: divide S into two lists S1, S2 of ≈ n/2 elements each
- Recur: recursively sort S1, S2
- Conquer: merge S1 and S2 into a sorted version of S

Algorithm mergeSort(S, C):
  Input: a list S of n elements and a comparator C
  Output: the list S sorted according to C

  if size(S) > 1 then
    (S1, S2) = partition S into sizes ⌊n/2⌋ and ⌈n/2⌉
    mergeSort(S1, C)
    mergeSort(S2, C)
    S = merge(S1, S2, C)

Page 7:

Merging two Sorted Sequences

Algorithm merge(A, B, C):
  Input: sorted lists A, B
  Output: a sorted list containing the elements of A and B

  S = empty list
  while ¬A.isEmpty() and ¬B.isEmpty() do
    if A.first().element < B.first().element then
      S.insertLast(A.remove(A.first()))
    else
      S.insertLast(B.remove(B.first()))
  done
  while ¬A.isEmpty() do S.insertLast(A.remove(A.first()))
  while ¬B.isEmpty() do S.insertLast(B.remove(B.first()))

Performance:
- Merging two sorted lists of length about n/2 takes O(n) time.
  (for singly linked lists, doubly linked lists, and arrays)
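A minimal Python sketch of merge and mergeSort on plain lists (my own names; the comparator C is replaced by Python's built-in < ordering):

```python
def merge(a, b):
    """Merge two sorted lists into one sorted list in O(len(a) + len(b)) time."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])                 # drain whichever input still has elements
    out.extend(b[j:])
    return out

def merge_sort(s):
    """Split into halves of size floor(n/2) and ceil(n/2), recur, then merge."""
    if len(s) <= 1:
        return s
    mid = len(s) // 2
    return merge(merge_sort(s[:mid]), merge_sort(s[mid:]))

print(merge_sort([7, 2, 9, 4]))       # [2, 4, 7, 9]
```

Taking the `else` branch on equal elements keeps the left copy first, which is what makes merge-sort stable.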

Page 8:

Merge-sort: Example

Divide (split):

a n e x | a m p l e
a n | e x,   a m | p l e
a | n,   e | x,   a | m,   p | l e
l | e

Conquer (merge):

a n,   e x,   a m,   e l
a e n x,   e l p
a e l m p
a a e e l m n p x

Page 9:

Merge-Sort Tree

An execution of merge-sort can be displayed as a binary tree:
- each node represents a recursive call and stores:
  - the unsorted sequence before execution, and its partition
  - the sorted sequence after execution
- leaves are calls on subsequences of size 0 or 1

a n e x | a m p l e ↦ a a e e l m n p x
  a n | e x ↦ a e n x
    a | n ↦ a n
      a ↦ a    n ↦ n
    e | x ↦ e x
      e ↦ e    x ↦ x
  a m | p l e ↦ a e l m p
    a | m ↦ a m
      a ↦ a    m ↦ m
    p | l e ↦ e l p
      p ↦ p    l | e ↦ e l
        l ↦ l    e ↦ e

Page 10:

Merge-Sort: Example Execution

7 1 2 9 | 6 5 3 8 ↦ 1 2 3 5 6 7 8 9
  7 1 | 2 9 ↦ 1 2 7 9
    7 | 1 ↦ 1 7
      7 ↦ 7    1 ↦ 1
    2 | 9 ↦ 2 9
      2 ↦ 2    9 ↦ 9
  6 5 | 3 8 ↦ 3 5 6 8
    6 | 5 ↦ 5 6
      6 ↦ 6    5 ↦ 5
    3 | 8 ↦ 3 8
      3 ↦ 3    8 ↦ 8

Finished merge-sort tree.

Page 11:

Merge-Sort: Running Time

The height h of the merge-sort tree is O(log₂ n):
- each recursive call splits the sequence in half

The total work of all nodes at depth i is O(n):
- partitioning and merging of 2^i sequences of size n/2^i each
- 2^(i+1) ≤ n recursive calls

depth   nodes   size
0       1       n
1       2       n/2
...     ...     ...
i       2^i     n/2^i

Thus the worst-case running time is O(n · log₂ n).

Page 12:

Quick-Sort

Quick-sort of a list S with n elements works as follows:
- Divide: pick a random element x (the pivot) from S and split S into:
  - L: the elements less than x
  - E: the elements equal to x
  - G: the elements greater than x

  7 2 1 9 6 5 3 8   (pivot x = 5)

- Recur: recursively sort L and G

  L = 2 1 3,  E = 5,  G = 7 9 6 8

- Conquer: join L, E, and G

  1 2 3 5 6 7 8 9

Page 13:

Quick-Sort: The Partitioning

The partitioning runs in O(n) time:
- we traverse S and compare every element y with x
- depending on the comparison, we insert y into L, E, or G

Algorithm partition(S, p):
  Input: a list S of n elements, and the position p of the pivot
  Output: lists L, E, G of the elements less than, equal to, or greater than the pivot

  L, E, G = empty lists
  x = S.elementAtRank(p)
  while ¬S.isEmpty() do
    y = S.remove(S.first())
    if y < x then L.insertLast(y)
    if y == x then E.insertLast(y)
    if y > x then G.insertLast(y)
  done
  return L, E, G
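Following this partition step, a complete list-based quick-sort with a random pivot can be sketched in Python (the function names are my own; the pivot is chosen by value rather than by rank, which is an equivalent simplification):

```python
import random

def partition(s, x):
    """Split s into the elements less than, equal to, and greater than pivot x."""
    less, equal, greater = [], [], []
    for y in s:
        if y < x:
            less.append(y)
        elif y == x:
            equal.append(y)
        else:
            greater.append(y)
    return less, equal, greater

def quick_sort(s):
    """Divide with a random pivot, recur on L and G, then join L, E, G."""
    if len(s) <= 1:
        return s
    less, equal, greater = partition(s, random.choice(s))
    return quick_sort(less) + equal + quick_sort(greater)

print(quick_sort([7, 2, 1, 9, 6, 5, 3, 8]))   # [1, 2, 3, 5, 6, 7, 8, 9]
```

Putting all duplicates of the pivot into E means neither recursive call ever sees the pivot again, so the recursion always terminates even with many equal elements.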

Page 14:

Quick-Sort Tree

An execution of quick-sort can be displayed as a binary tree:
- each node represents a recursive call and stores:
  - the unsorted sequence before execution, and its pivot
  - the sorted sequence after execution
- leaves are calls on subsequences of size 0 or 1

1 6 2 9 4 0 ↦ 0 1 2 4 6 9   (pivot 4)
  1 2 0 ↦ 0 1 2             (pivot 1)
    0 ↦ 0    2 ↦ 2
  6 9 ↦ 6 9                 (pivot 6)
    9 ↦ 9

Page 15:

Quick-Sort: Example

8 2 9 3 1 5 7 6 4 ↦ 1 2 3 4 5 6 7 8 9   (pivot 6)
  2 3 1 5 4 ↦ 1 2 3 4 5                 (pivot 1)
    2 3 5 4 ↦ 2 3 4 5                   (pivot 3)
      2 ↦ 2    5 4 ↦ 4 5                (pivot 5)
                 4 ↦ 4
  8 9 7 ↦ 7 8 9                         (pivot 8)
    7 ↦ 7    9 ↦ 9

Page 16:

Quick-Sort: Worst-Case Running Time

The worst-case running time occurs when:
- the pivot is always the minimal or maximal element
- then one of L and G has size n − 1, the other size 0

Then the running time is O(n²):

n + (n − 1) + (n − 2) + ... + 1 ∈ O(n²)

(the quick-sort tree degenerates into a path with subproblem sizes n, n − 1, n − 2, ..., 1, each paired with an empty subproblem)

Page 17:

Quick-Sort: Average Running Time

Consider a recursive call on a list of size s:
- Good call: both L and G have size smaller than (3/4) · s
- Bad call: one of L and G has size at least (3/4) · s

For example, for s = 16 the pivot ranks split as:

1 2 3 4 | 5 6 7 8 9 10 11 12 | 13 14 15 16
bad calls   good calls          bad calls

A good call has probability 1/2:
- half of the possible pivots give rise to good calls

Page 18:

Quick-Sort: Average Running Time

For a node at depth i, we expect (on average):
- i/2 of its ancestors correspond to good calls
- the size of its input sequence is ≤ (3/4)^(i/2) · n

As a consequence:
- for a node at depth 2 · log_{4/3}(n), the expected input size is 1
- the expected height of the quick-sort tree is O(log n)

(the slide shows the quick-sort tree: expected height O(log n), with O(n) total work at each depth)

The amount of work at each depth i is O(n).
Thus the expected (average) running time is O(n · log n).
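The O(log n) expected height can also be checked empirically. The experiment below is my own addition (not from the slides); it records the recursion depth of a randomized quick-sort on shuffled inputs:

```python
import random

def quicksort_depth(s):
    """Recursion depth of randomized quick-sort (L/E/G scheme) on list s."""
    if len(s) <= 1:
        return 0
    x = random.choice(s)
    less = [y for y in s if y < x]
    greater = [y for y in s if y > x]
    return 1 + max(quicksort_depth(less), quicksort_depth(greater))

random.seed(7)
for n in (1_000, 10_000, 100_000):
    data = list(range(n))
    random.shuffle(data)
    # the observed depth stays within a small constant multiple of log2(n)
    print(n, quicksort_depth(data))
```

Growing n by a factor of 10 should only add a roughly constant amount to the observed depth, matching the logarithmic expected height.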

Page 19:

In-Place Quick-Sort

Quick-sort can be implemented in-place (but is then not stable):

Algorithm inPlaceQuickSort(A, l, r):
  Input: list A, indices l and r
  Output: list A where the elements from index l to r are sorted

  if l ≥ r then return
  p = A[r]   (take the rightmost element as pivot)
  l′ = l and r′ = r
  while l′ ≤ r′ do
    while l′ ≤ r′ and A[l′] ≤ p do l′ = l′ + 1   (find an element > p)
    while l′ ≤ r′ and A[r′] ≥ p do r′ = r′ − 1   (find an element < p)
    if l′ < r′ then swap(A[l′], A[r′])   (swap < p with > p)
  done
  if l′ > r then l′ = r   (pivot is the maximum; it stays in place)
  swap(A[r], A[l′])   (put the pivot into the right place)
  inPlaceQuickSort(A, l, l′ − 1)   (sort the left part)
  inPlaceQuickSort(A, l′ + 1, r)   (sort the right part)

It is considered in-place although the recursion needs O(log n) space.
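The scheme above can be transcribed into Python as follows (a sketch with my own names, including a guard for the case where every element is at most the pivot and the left scan runs past r):

```python
def in_place_quick_sort(a, l, r):
    """Sort a[l..r] in place; the rightmost element a[r] is the pivot.

    In-place but not stable; the recursion uses O(log n) stack space
    on average.
    """
    if l >= r:
        return
    p = a[r]
    lo, hi = l, r
    while lo <= hi:
        while lo <= hi and a[lo] <= p:   # find an element > p
            lo += 1
        while lo <= hi and a[hi] >= p:   # find an element < p
            hi -= 1
        if lo < hi:
            a[lo], a[hi] = a[hi], a[lo]  # swap > p with < p
    lo = min(lo, r)                      # guard: pivot is the maximum
    a[r], a[lo] = a[lo], a[r]            # put the pivot into its final place
    in_place_quick_sort(a, l, lo - 1)    # sort the left part
    in_place_quick_sort(a, lo + 1, r)    # sort the right part

a = [5, 8, 3, 7, 1, 6]                   # the slide's example
in_place_quick_sort(a, 0, len(a) - 1)
print(a)                                 # [1, 3, 5, 6, 7, 8]
```

Note that taking the rightmost element as pivot (instead of a random one) makes the O(n²) worst case occur on already sorted input.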

Page 20:

In-Place Quick-Sort: Example

5 8 3 7 1 6   (pivot 6)
5 1 3 7 8 6   (swapped 8 and 1)
5 1 3 6 8 7   (6 in its final place)
1 5 3 6 8 7   (recurse left, pivot 3; swapped 5 and 1)
1 3 5 6 8 7   (3 in its final place)
1 3 5 6 8 7   (left part sorted)
1 3 5 6 8 7   (recurse right, pivot 7)
1 3 5 6 7 8   (7 in its final place)
1 3 5 6 7 8   (sorted)

(the slides highlight the pivot, the unsorted part, and the growing sorted part)

Page 21:

Sorting: Lower Bound

Many sorting algorithms are comparison-based:
- they sort by comparing pairs of objects (each comparison xi < xj? has a yes- and a no-outcome)
- Examples: selection-sort, insertion-sort, bubble-sort, heap-sort, merge-sort, quick-sort, ...

No comparison-based sorting algorithm can be faster than

Ω(n · log n)

time (worst-case).

We will prove this lower bound on the next slides...

Page 22:

Sorting: Lower Bound (Decision Tree)

We will only count comparisons (sufficient for a lower bound):
- Assume the input is a permutation of the numbers 1, 2, ..., n.
- Every execution corresponds to a path in the decision tree:

xa < xb?
  yes: xc < xd?
    yes: xg < xh? ...
    no:  xi < xj? ...
  no:  xe < xf?
    yes: xk < xl? ...
    no:  xm < xo? ...

- The algorithm itself may not have this tree structure, but this is the maximal information gained by the algorithm.

Page 23:

Sorting: Lower Bound (Leaves)

Every leaf corresponds to exactly one input permutation:
- applying the same swapping steps to two different input permutations, e.g. ...,6,...,7,... and ...,7,...,6,..., yields different results (not both results can be sorted)

(the slide shows the same decision tree as on the previous page)

Page 24:

Sorting: Lower Bound (Height of the Decision Tree)

The height of the tree is a lower bound on the running time:
- There are n! = n · (n − 1) · ... · 1 permutations of 1, 2, ..., n.
- So the tree has at least n! leaves, and thus its height is at least log₂(n!).

(the slide shows the decision tree with its n! leaves and height log₂(n!))

Page 25:

Sorting: Lower Bound

Hence any comparison-based sorting algorithm takes at least

log₂(n!) ≥ log₂((n/2)^(n/2)) = (n/2) · log₂(n/2) ∈ Ω(n · log n)

time in the worst case.
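The key inequality log₂(n!) ≥ (n/2) · log₂(n/2) holds because n! contains at least n/2 factors that are each ≥ n/2. It can be sanity-checked numerically (my own addition, not from the slides):

```python
import math

# Check log2(n!) >= (n/2) * log2(n/2) for a few sizes.
# math.log2 handles arbitrarily large Python ints, so the factorial is exact.
for n in (4, 16, 256, 4096):
    log_fact = math.log2(math.factorial(n))
    bound = (n / 2) * math.log2(n / 2)
    assert log_fact >= bound
    print(n, round(log_fact, 1), round(bound, 1))
```

The gap between the two sides is only a constant factor: by Stirling's approximation, log₂(n!) is about n · log₂(n) − n · log₂(e).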

Page 26:

Summary of Comparison-Based Sorting Algorithms

Algorithm        Time                     Notes
selection-sort   O(n²)                    slow (but good for small lists); in-place, stable
insertion-sort   O(n²)                    slow; in-place, stable; good for online sorting and nearly sorted lists
bubble-sort      O(n²)                    slow; in-place, stable
heap-sort        O(n · log₂ n)            in-place, not stable, fast; good for large inputs (1K – 1M)
merge-sort       O(n · log₂ n)            fast, stable, usually not in-place; sequential data access; good for large inputs (> 1M)
quick-sort       O(n · log₂ n) expected   in-place, randomized, not stable; fastest, good for huge inputs

Quick-sort usually performs fastest, although its worst case is O(n²).

Page 27:

Sorting: Comparison of Runtime

Algorithm        25,000 sorted   100,000 sorted   25,000 not sorted   100,000 not sorted
selection-sort   1.1             19.4             1.1                 19.5
insertion-sort   0               0                1.1                 19.6
bubble-sort      0               0                5.5                 89.8

Algorithm        5 million sorted   20 million sorted   5 million not sorted   20 million not sorted
insertion-sort   0.03               0.13                timeout                timeout
heap-sort        3.6                15.6                8.3                    42.2
merge-sort       2.5                10.5                3.7                    16.1
quick-sort       0.5                2.2                2.0                    8.7

- Source: Gumm, Sommer: Einführung in die Informatik.