Sorting

Pseudocode of Insertion Sort

Insertion Sort

To sort array A[0..n-1], sort A[0..n-2] recursively and then insert A[n-1] in its proper place among the sorted A[0..n-2]

• Usually implemented bottom up (nonrecursively)

Example: Sort 6, 4, 1, 8, 5

6 | 4 1 8 5
4 6 | 1 8 5
1 4 6 | 8 5
1 4 6 8 | 5
1 4 5 6 8
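
A minimal Python sketch of the bottom-up (nonrecursive) insertion sort described above; the function name is illustrative, not from the slides:

def insertion_sort(a):
    # scan left to right; insert each a[i] into its proper place
    # in the already-sorted prefix a[0..i-1]
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:   # shift larger elements one slot right
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key

a = [6, 4, 1, 8, 5]
insertion_sort(a)                      # a becomes [1, 4, 5, 6, 8]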

Analysis of Insertion Sort

• Time efficiency:

  Cworst(n) = n(n-1)/2 ∈ Θ(n²)

  Cavg(n) ≈ n²/4 ∈ Θ(n²)

  Cbest(n) = n - 1 ∈ Θ(n)   (also fast on almost-sorted arrays)

• Space efficiency: in-place

• Stability: yes

• Best elementary sorting algorithm overall

• Variation: binary insertion sort, which finds the insertion point by binary search (see the sketch below)
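
A sketch of the binary-search variation, assuming only the standard bisect module; it reduces the number of comparisons but not the number of element moves:

import bisect

def binary_insertion_sort(a):
    for i in range(1, len(a)):
        key = a[i]
        # find the insertion point in the sorted prefix a[0..i-1] by binary search
        pos = bisect.bisect_right(a, key, 0, i)
        a[pos + 1:i + 1] = a[pos:i]    # shift larger elements one slot right
        a[pos] = key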

Merge Sort

• Divide: Divide the n-element array to be sorted into two subarrays of n/2 elements each.

• Conquer: Sort the two subarrays recursively using merge sort.

• Combine: Merge the two sorted subarrays to produce the sorted array.

Mergesort

• Split array A[0..n-1] into two about equal halves and make copies of each half in arrays B and C
• Sort arrays B and C recursively
• Merge sorted arrays B and C into array A as follows:

– Repeat the following until no elements remain in one of the arrays:

• compare the first elements in the remaining unprocessed portions of the arrays

• copy the smaller of the two into A, while incrementing the index indicating the unprocessed portion of that array

– Once all elements in one of the arrays are processed, copy the remaining unprocessed elements from the other array into A.

Mergesort Example

8 3 2 9 7 1 5 4
8 3 2 9 | 7 1 5 4
8 3 | 2 9 | 7 1 | 5 4
8 | 3 | 2 | 9 | 7 | 1 | 5 | 4
3 8 | 2 9 | 1 7 | 4 5
2 3 8 9 | 1 4 5 7
1 2 3 4 5 7 8 9

Pseudocode of Mergesort
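
A Python sketch following the description above; the merge helper is sketched under the next heading, and the names are illustrative:

def mergesort(a):
    if len(a) <= 1:
        return
    mid = len(a) // 2
    b, c = a[:mid], a[mid:]   # copies of the two halves
    mergesort(b)              # sort the halves recursively
    mergesort(c)
    merge(b, c, a)            # merge the sorted halves back into a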

Pseudocode of Merge
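
A matching sketch of the merging step described earlier:

def merge(b, c, a):
    # b and c are sorted; merge them into a (len(a) == len(b) + len(c))
    i = j = k = 0
    while i < len(b) and j < len(c):
        if b[i] <= c[j]:        # copy the smaller of the two front elements
            a[k] = b[i]; i += 1
        else:
            a[k] = c[j]; j += 1
        k += 1
    # one array is exhausted; copy the remaining elements of the other
    a[k:] = b[i:] if i < len(b) else c[j:]

a = [8, 3, 2, 9, 7, 1, 5, 4]
mergesort(a)                    # a becomes [1, 2, 3, 4, 5, 7, 8, 9]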

Analysis of Mergesort

• All cases have the same efficiency: Θ(n log n)

• Number of comparisons in the worst case is close to the theoretical minimum for comparison-based sorting: log2 n! ≈ n log2 n - 1.44n

Quick Sort

• Divide: The array A[l..r] is partitioned into two nonempty subarrays A[l..m] and A[m+1..r] such that each element of A[l..m] is less than or equal to each element of A[m+1..r]. The index m is computed as part of the partitioning process.

• Conquer: The two subarrays are sorted in place by recursive calls to quicksort.

• Combine: Since the subarrays are sorted in place, no work is needed to combine them.

Quicksort

• Select a pivot (partitioning element) – here, the first element
• Rearrange the list so that all the elements in the first s positions are smaller than or equal to the pivot and all the elements in the remaining n-s positions are larger than or equal to the pivot
• Exchange the pivot with the last element in the first subarray — the pivot is now in its final position

• Sort the two subarrays recursively

[Diagram: after partitioning, the elements A[i] ≤ p occupy the positions before the pivot p and the elements A[i] ≥ p the positions after it.]

Quicksort Example

5 3 1 9 8 2 4 7

Quick Sort Algorithm

Quicksort(A[l..r])
if l < r
    m ← Partition(A[l..r])    // m is a split position
    Quicksort(A[l..m])
    Quicksort(A[m+1..r])

Partitioning Algorithm
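
A hedged Python sketch of a Hoare-style partition consistent with the pseudocode above: it returns a split position m such that every element of a[l..m] is ≤ every element of a[m+1..r]. Note that, unlike the outline above, this variant does not necessarily place the pivot in its final position.

def partition(a, l, r):
    pivot = a[l]                       # use the first element as the pivot
    i, j = l - 1, r + 1
    while True:
        j -= 1
        while a[j] > pivot:            # scan from the right for an element <= pivot
            j -= 1
        i += 1
        while a[i] < pivot:            # scan from the left for an element >= pivot
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]    # exchange the out-of-place pair
        else:
            return j                   # split position: a[l..j] <= a[j+1..r]

def quicksort(a, l, r):
    if l < r:
        m = partition(a, l, r)
        quicksort(a, l, m)
        quicksort(a, m + 1, r)

a = [5, 3, 1, 9, 8, 2, 4, 7]
quicksort(a, 0, len(a) - 1)            # a becomes [1, 2, 3, 4, 5, 7, 8, 9]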

Analysis of Quicksort

• Best case: split in the middle — Θ(n log n)
• Worst case: sorted array! — Θ(n²)
• Average case: random arrays — Θ(n log n)
• Improvements:
  – better pivot selection: median-of-three partitioning
  – switch to insertion sort on small subfiles
  – elimination of recursion
  These combine to 20-25% improvement

• Considered the method of choice for internal sorting of large files (n ≥ 10000)

Heap and Heap Sort

Definition: A heap is a binary tree with the following conditions:

• it is essentially complete: all its levels are full, except possibly the last level, where only some rightmost leaves may be missing

• the key at each node is ≥ the keys at its children

Example:

[Figure: three small binary trees; the first is a heap, the other two are not.]

Note: a heap’s elements are ordered top down (along any path down from its root), but they are not ordered left to right.

Some Important Properties of a Heap

• Given n, there exists a unique binary tree with n nodes that is essentially complete, with height h = ⌊log2 n⌋

• The root contains the largest key

• The subtree rooted at any node of a heap is also a heap

• A heap can be represented as an array

Heap’s Array Representation

Store the heap’s elements in an array (whose elements are indexed, for convenience, from 1 to n) in top-down, left-to-right order.

• Left child of node j is at 2j
• Right child of node j is at 2j+1
• Parent of node j is at ⌊j/2⌋
• Parental nodes are represented in the first ⌊n/2⌋ locations

Example:

[Figure: the heap with root 9, children 5 and 3, where node 5 has children 1 and 4 and node 3 has child 2, alongside its array representation:]

index: 1 2 3 4 5 6
value: 9 5 3 1 4 2
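
A minimal sketch of the index arithmetic above, using this example heap (1-indexed, position 0 unused; the helper names are illustrative):

def left(j):   return 2 * j
def right(j):  return 2 * j + 1
def parent(j): return j // 2

heap = [None, 9, 5, 3, 1, 4, 2]        # indices 1..6 hold the keys
assert heap[left(1)] == 5 and heap[right(1)] == 3
assert heap[parent(6)] == 3            # the node at index 6 has its parent at index 3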

Heap Construction (bottom-up)

Step 0: Initialize the structure with keys in the order given

Step 1: Starting with the last (rightmost) parental node, fix the heap rooted at it, if it doesn’t satisfy the heap condition: keep exchanging it with its largest child until the heap condition holds

Step 2: Repeat Step 1 for the preceding parental node

Example of Heap Construction

Construct a heap for the list 2, 9, 7, 6, 5, 8:

2 9 7 6 5 8   (fix the subtree rooted at 7: exchange 7 with its larger child 8)
2 9 8 6 5 7   (the subtree rooted at 9 already satisfies the heap condition)
2 9 8 6 5 7   (fix the subtree rooted at 2: exchange 2 with its larger child 9, ...)
9 2 8 6 5 7   (... then exchange 2 with its larger child 6)
9 6 8 2 5 7   (heap construction is complete)

Bottom-up heap construction algorithm
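
A hedged Python sketch of the bottom-up construction, using the same 1-indexed array convention as above (names are illustrative):

def sift_down(a, i, n):
    # fix the heap rooted at index i, assuming its subtrees already are heaps;
    # the heap occupies a[1..n], a[0] is unused
    while 2 * i <= n:
        j = 2 * i                          # left child
        if j < n and a[j + 1] > a[j]:      # pick the larger of the two children
            j += 1
        if a[i] >= a[j]:                   # heap condition holds: stop
            break
        a[i], a[j] = a[j], a[i]            # exchange with the larger child
        i = j

def build_heap(a, n):
    # fix the heaps rooted at the parental nodes, from the last one back to the root
    for i in range(n // 2, 0, -1):
        sift_down(a, i, n)

a = [None, 2, 9, 7, 6, 5, 8]
build_heap(a, 6)                           # a[1:] becomes [9, 6, 8, 2, 5, 7]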

Heap sort Algorithm:

1. Build heap

2. Remove root – exchange with last (rightmost) leaf

3. Fix up heap (excluding last leaf)

Repeat 2, 3 until heap contains just one node.

Root deletion

The root of a heap can be deleted and the heap fixed up as follows:

• exchange the root with the last leaf

• compare the new root (formerly the leaf) with each of its children and, if one of them is larger than the root, exchange it with the larger of the two.

• continue the comparison/exchange with the children of the new root until it reaches a level of the tree where it is larger than or equal to both its children (or it becomes a leaf)
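
Putting the steps together, a short sketch of heapsort; it reuses sift_down and build_heap from the construction sketch above:

def heap_sort(a, n):
    build_heap(a, n)                       # 1. build heap
    for last in range(n, 1, -1):
        a[1], a[last] = a[last], a[1]      # 2. exchange root with last (rightmost) leaf
        sift_down(a, 1, last - 1)          # 3. fix up heap, excluding the last leaf

a = [None, 2, 9, 7, 6, 5, 8]
heap_sort(a, 6)                            # a[1:] becomes [2, 5, 6, 7, 8, 9]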

Example of Sorting by Heapsort

Sort the list 2, 9, 7, 6, 5, 8 by heapsort

Stage 1 (heap construction)    Stage 2 (root/max removal)
2 9 7 6 5 8                    9 6 8 2 5 7
2 9 8 6 5 7                    7 6 8 2 5 | 9
2 9 8 6 5 7                    8 6 7 2 5 | 9
9 2 8 6 5 7                    5 6 7 2 | 8 9
9 6 8 2 5 7                    7 6 5 2 | 8 9
                               2 6 5 | 7 8 9
                               6 2 5 | 7 8 9
                               5 2 | 6 7 8 9
                               5 2 | 6 7 8 9
                               2 | 5 6 7 8 9


Analysis of Heapsort

Recall the algorithm:

1. Build heap: Θ(n)

2. Remove root – exchange with last (rightmost) leaf

3. Fix up heap (excluding last leaf): Θ(log n)

Repeat 2, 3 until heap contains just one node: n – 1 times.

Total: Θ(n) + Θ(n log n) = Θ(n log n)

• Note: this is the worst case. The average case is also Θ(n log n).

Priority queues

• A priority queue is the ADT of an ordered set with the operations:
  – find element with highest priority
  – delete element with highest priority
  – insert element with assigned priority

• Heaps are very good for implementing priority queues
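
For example, Python’s standard heapq module (a min-heap) can serve as the heap; one common convention, sketched here, is to negate priorities so that the highest priority surfaces first:

import heapq

pq = []
for priority, item in [(3, "c"), (9, "a"), (7, "b")]:
    heapq.heappush(pq, (-priority, item))   # insert element with assigned priority

highest = pq[0][1]                          # find element with highest priority ("a")
removed = heapq.heappop(pq)[1]              # delete element with highest priority ("a")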


Insertion of a New Element into a Heap

• Insert the new element at the last position in the heap
• Compare it with its parent and, if it violates the heap condition, exchange them
• Continue comparing the new element with nodes up the tree until the heap condition is satisfied

Example: Insert key 10

Efficiency: O(log n)

9 6 8 2 5 7 10   (10 > 8: exchange 10 with its parent 8)
9 6 10 2 5 7 8   (10 > 9: exchange 10 with its parent 9)
10 6 9 2 5 7 8   (the heap condition is now satisfied)
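
A hedged sketch of the insertion procedure in Python, again with the 1-indexed convention (a[0] unused; the name is illustrative):

def heap_insert(a, key):
    a.append(key)                           # insert at the last position in the heap
    j = len(a) - 1
    while j > 1 and a[j // 2] < a[j]:       # parent violates the heap condition?
        a[j // 2], a[j] = a[j], a[j // 2]   # exchange with the parent
        j //= 2

a = [None, 9, 6, 8, 2, 5, 7]
heap_insert(a, 10)                          # a[1:] becomes [10, 6, 9, 2, 5, 7, 8]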

Bottom-up vs. Top-down heap construction

• Top down: Heaps can be constructed by successively inserting elements into an (initially) empty heap

• Bottom-up: Put everything in and then fix it


Radix Sort

• Based on examining digits in some base-b numeric representation of items (or keys)
• Least significant digit radix sort
  – Processes digits from right to left
  – Used in early punched-card sorting machines
• Create groupings of items with same value in specified digit
  – Collect in order and create grouping with next significant digit


Radix Sort

• Sort each digit (or field) separately.
• Start with the least-significant digit.
• Radix sort must invoke a stable sort.

RADIX-SORT(A, d)
for i ← 1 to d
    do use a stable sort to sort array A on digit i
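
A hedged Python sketch of least-significant-digit radix sort on non-negative integers, using bucket collection as the stable per-digit sort (names are illustrative):

def radix_sort(a, d):
    # sort on d decimal digits, least significant digit first
    for i in range(d):
        a = stable_sort_on_digit(a, i)
    return a

def stable_sort_on_digit(a, i):
    # stable distribution into 10 buckets keyed on digit i, then collection in order
    buckets = [[] for _ in range(10)]
    for x in a:
        buckets[(x // 10 ** i) % 10].append(x)
    return [x for bucket in buckets for x in bucket]

radix_sort([543, 23, 805, 120], 3)          # returns [23, 120, 543, 805]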


Radix Sort in Action


Running Time of Radix Sort

• use counting sort as the invoked stable sort, if the range of digits is not large

• if digit range is 1..k, then each pass takes Θ(n+k) time

• there are d passes, for a total of Θ(d(n+k))
• if k = O(n), time is Θ(dn)
• when d is constant, we have Θ(n), linear!

Another example

Radix Sort Example

data[ ]: 123 234 345 456 543 987 654 23 76 934 765 452 857 356 805 294 490 780 120 200 73

Pass 1 (units digit), buckets[ ]:
0: 490 780 120 200
1:
2: 452
3: 123 543 23 73
4: 234 654 934 294
5: 345 765 805
6: 456 76 356
7: 987 857
8:
9:

Another example (Cont.)

data[ ]: 490 780 120 200 452 123 543 23 73 234 654 934 294 345 765 805 456 76 356 987 857

Pass 2 (tens digit), buckets[ ]:
0: 200 805
1:
2: 120 123 23
3: 234 934
4: 345 543
5: 452 654 456 356 857
6: 765
7: 73 76
8: 780 987
9: 490 294

Another example (Cont.)

data[ ]: 200 805 120 123 23 234 934 345 543 452 654 456 356 857 765 73 76 780 987 490 294

Pass 3 (hundreds digit), buckets[ ]:
0: 23 73 76
1: 120 123
2: 200 234 294
3: 345 356
4: 452 456 490
5: 543
6: 654
7: 765 780
8: 805 857
9: 934 987

data[ ]: 23 73 76 120 123 200 234 294 345 356 452 456 490 543 654 765 780 805 857 934 987
