Sorting preparation for searching

Sorting preparation for searching. Overview levels of performance categories of algorithms Java class Arrays



Sorting

preparation for searching


Overview

levels of performance
categories of algorithms
Java class Arrays


Performance

‘human’ sorting algorithms: O(n²)
proven best performance for sorting random data: O(n log n)
special conditions: O(n)


Categories of algorithms

interchange - move items from unsorted to sorted subset (selection, insertion, heapsort)

divide and conquer – sort subsets and combine (shellsort, mergesort, quicksort, radix sort)

distribution counting


Sorting in Java
binary search (Arrays class)

static int binarySearch(Object[] a, Object key)
Searches the specified array for the specified object using the binary search algorithm.

static <T> int binarySearch(T[] a, T key, Comparator<? super T> c)
Searches the specified array for the specified object using the binary search algorithm.

before Java 1.5: static int binarySearch(Object[] a, Object key, Comparator c)
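A minimal sketch of how sorting prepares an array for these methods. The array has to be sorted first, with the same ordering that the search uses; otherwise the result of binarySearch is undefined. The class name, method names, and sample data are illustrative only.

```java
import java.util.Arrays;
import java.util.Comparator;

class BinarySearchSketch {
    // Sorts a copy of the input and returns the index of key in the sorted copy,
    // or a negative value if key is absent (as Arrays.binarySearch reports it).
    static int sortedIndexOf(String[] words, String key) {
        String[] sorted = words.clone();
        Arrays.sort(sorted);                       // natural (alphabetical) ordering
        return Arrays.binarySearch(sorted, key);
    }

    // Same idea with an explicit Comparator; sort and search must use the same one.
    static int sortedIndexOf(String[] words, String key, Comparator<? super String> c) {
        String[] sorted = words.clone();
        Arrays.sort(sorted, c);
        return Arrays.binarySearch(sorted, key, c);
    }

    public static void main(String[] args) {
        String[] words = {"pear", "apple", "fig", "kiwi"};
        System.out.println(sortedIndexOf(words, "kiwi"));   // position in sorted order
        System.out.println(sortedIndexOf(words, "plum"));   // negative: not found
        System.out.println(sortedIndexOf(words, "kiwi", Comparator.reverseOrder()));
    }
}
```

A negative return value encodes the insertion point as -(insertion point) - 1, so a caller can still tell where a missing key would belong.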


Sorting – modified mergesort*

static void sort(Object[] a)
Sorts the specified array of objects into ascending order, according to the natural ordering of its elements.

public static <T> void sort(T[] a, Comparator<? super T> c)
Sorts the specified array of objects according to the order induced by the specified comparator.

static void sort(Object[] a, int fromIndex, int toIndex)
Sorts the specified range of the specified array of objects into ascending order, according to the natural ordering of its elements.

public static <T> void sort(T[] a, int fromIndex, int toIndex, Comparator<? super T> c)
Sorts the specified range of the specified array of objects according to the order induced by the specified comparator.

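*primitive type arrays are sorted by quicksort (since Java 7, object arrays are sorted with TimSort, a mergesort variant, and primitive arrays with a dual-pivot quicksort)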
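A short sketch of the comparator and range variants listed above. The class name, helper methods, and data are hypothetical; only Arrays.sort and Comparator.comparingInt are library calls.

```java
import java.util.Arrays;
import java.util.Comparator;

class ArraysSortSketch {
    // Returns a copy sorted by string length (shortest first) using a Comparator.
    static String[] sortedByLength(String[] words) {
        String[] copy = words.clone();
        Arrays.sort(copy, Comparator.comparingInt(String::length));
        return copy;
    }

    // Returns a copy in which only the range [fromIndex, toIndex) is sorted.
    static Integer[] sortedRange(Integer[] values, int fromIndex, int toIndex) {
        Integer[] copy = values.clone();
        Arrays.sort(copy, fromIndex, toIndex);     // toIndex is exclusive
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(
                sortedByLength(new String[] {"pear", "fig", "banana"})));
        System.out.println(Arrays.toString(
                sortedRange(new Integer[] {9, 7, 5, 3, 1}, 1, 4)));
    }
}
```

Elements outside the given range stay where they were, which is why the first and last values in the second call do not move.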


Sorting

Quadratic performance

(thanks to Lorrie Fava-Lindon)


Selection sort

elements are selected in order and placed in their final sorted positions

ith pass selects the ith largest element in the array, and places it in the (n-i)th position of the array

an “interchange sort”; i.e., based on swapping


Selection sort

public static void selectionsort(int[] data, int first, int n)
{
    int i, j, temp;
    int big;  // index of largest value in data[first…first + i]

    for (i = n-1; i > 0; i--)
    {
        big = first;
        for (j = first+1; j <= first+i; j++)
            if (data[big] < data[j])
                big = j;
        temp = data[first+i];
        data[first+i] = data[big];
        data[big] = temp;
    }
}


Analysis of selection sort

best, average, and worst case time

Θ(n²) comparisons, Θ(n) swaps

Θ(n²) time: quadratic.
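The counts can be checked with an instrumented version of the sort. This is a sketch, not the slide's code: it sorts the whole array (first = 0) and the counters are added for illustration.

```java
class SelectionSortCounts {
    // Selection sort on the whole array that reports {comparisons, swaps}.
    static long[] sortAndCount(int[] data) {
        long comparisons = 0, swaps = 0;
        for (int i = data.length - 1; i > 0; i--) {
            int big = 0;                        // index of largest value in data[0..i]
            for (int j = 1; j <= i; j++) {
                comparisons++;
                if (data[big] < data[j]) big = j;
            }
            int temp = data[i];                 // one swap per pass, even if big == i
            data[i] = data[big];
            data[big] = temp;
            swaps++;
        }
        return new long[] {comparisons, swaps};
    }

    public static void main(String[] args) {
        long[] counts = sortAndCount(new int[] {5, 2, 9, 1, 7, 3});
        System.out.println(counts[0] + " comparisons, " + counts[1] + " swaps");
    }
}
```

For n elements this performs n(n-1)/2 comparisons and n-1 swaps regardless of the initial order, which is why best, average, and worst case coincide.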


Advantages of Selection sort

can be done “in-place”—no need for a second array

minimizes number of swaps


Insertion sort

begin with “sorted” list of one element
sequentially insert new elements into sorted list
size of sorted list increases, size of unsorted list decreases


Insertion sort

public static void insertionsort(int[] data, int first, int n)
{
    int i, j;
    int entry;  // value being inserted into the sorted part

    for (i = 1; i < n; i++)
    {
        entry = data[first + i];
        for (j = first+i; (j > first) && (data[j-1] > entry); j--)
            data[j] = data[j-1];
        data[j] = entry;
    }
}


Analysis of Insertion sort

worst case: elements initially in reverse of sorted order. Θ(n²) comparisons, swaps
average case: same analysis as worst case
best case: elements initially in sorted order. No swaps, Θ(n) comparisons
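A sketch that makes the best and worst cases concrete by counting work. The counters and class name are additions for illustration; a "move" here is one element shifted one position to the right, which is what the analysis above calls a swap.

```java
class InsertionSortCounts {
    // Insertion sort on the whole array that reports {comparisons, moves}.
    static long[] sortAndCount(int[] data) {
        long comparisons = 0, moves = 0;
        for (int i = 1; i < data.length; i++) {
            int entry = data[i];
            int j = i;
            while (j > 0) {
                comparisons++;
                if (data[j - 1] <= entry) break;   // found the insertion point
                data[j] = data[j - 1];             // shift the larger element right
                moves++;
                j--;
            }
            data[j] = entry;
        }
        return new long[] {comparisons, moves};
    }

    public static void main(String[] args) {
        long[] best = sortAndCount(new int[] {1, 2, 3, 4, 5, 6});
        long[] worst = sortAndCount(new int[] {6, 5, 4, 3, 2, 1});
        System.out.println("sorted input:   " + best[0] + " comparisons, " + best[1] + " moves");
        System.out.println("reversed input: " + worst[0] + " comparisons, " + worst[1] + " moves");
    }
}
```

On sorted input each element needs a single comparison (n-1 in total); on reversed input element i is compared with and shifted past all i elements before it, giving n(n-1)/2 of each.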


Advantages of Insertion sort

can be done “in-place”
if data is in “nearly sorted” order, runs in Θ(n) time


Shellsort

improved insertion sort: better than O(n²)

(really a divide-and-conquer strategy)


Shellsort

insertion sort:
- most moves of data are a single step

shellsort:
- long moves in first loop, then shorter moves in later loops
- uses same basic logic as insertion


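Insertion pseudocode

a: array of integers, length n

for (i = 1; i < n; i++)
    temp = a[i]
    j = i
    while ( j > 0 && a[j-1] > temp )
        a[j] = a[j-1]
        j--
    a[j] = temp

Example: 14 16 21 27 33 39 42 43 30 32 11 56 26 19 42 10

Inserting a[8]: temp = 30; 33 39 42 43 each move one position to the right, and 30 goes into the gap.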


Shellsort

many small ‘parallel’ insertion sorts
reduce number of parallel sorts, round by round
last round is pure insertion sort (but data is ‘almost’ sorted)


Shellsort pseudocode

// calculate h: the first, biggest “step size”
h = 1
while ( h <= n )
    h = h*3 + 1

// repeat, reducing h by approx 2/3 each time
while ( h > 1 )
    h = h / 3
    for (i = h; i < n; i++)
        temp = a[i]
        j = i
        while ( j >= h && a[j-h] > temp )
            a[j] = a[j-h]
            j = j-h
        a[j] = temp


Shellsort example

input:   A S O R T I N G E X A M P L E
h = 13:  A E O R T I N G E X A M P L S
h = 4:   A E A G E I N M P L O R T X S
h = 1:   A A E E G I L M N O P R S T X
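The Shellsort pseudocode can be sketched in Java as follows. The class name and the snapshot list are additions for illustration; the step sizes follow the h = 3h + 1 sequence from the pseudocode, and the program records the array after each round so the output can be compared with the example.

```java
import java.util.ArrayList;
import java.util.List;

class ShellsortSketch {
    // Sorts a in place and returns a snapshot of the array after each round.
    static List<String> sort(char[] a) {
        List<String> rounds = new ArrayList<>();
        int n = a.length;
        int h = 1;
        while (h <= n) h = h * 3 + 1;       // first value of the sequence beyond n
        while (h > 1) {
            h = h / 3;                      // integer division: 40 -> 13 -> 4 -> 1
            for (int i = h; i < n; i++) {   // insertion sort with step size h
                char temp = a[i];
                int j = i;
                while (j >= h && a[j - h] > temp) {
                    a[j] = a[j - h];
                    j -= h;
                }
                a[j] = temp;
            }
            rounds.add("h=" + h + ": " + new String(a));
        }
        return rounds;
    }

    public static void main(String[] args) {
        for (String round : sort("ASORTINGEXAMPLE".toCharArray())) {
            System.out.println(round);
        }
    }
}
```

With 15 elements the rounds use h = 13, 4, 1, and the final round is an ordinary insertion sort on data that is already nearly in order.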


Shellsort performance

formal analysis unsolved (as far as I know)

estimates: O(n^(3/2)), possibly better
depends on sequence of h values, which should be relatively prime


Sorting

Heap sort

(thanks to Lorrie Fava-Lindon)


Heap (recall)

Node of heap contains: (element, priority)

Storage rules:
1. Element contained by a node has a priority ≥ priorities of node’s children.
2. Tree is a complete binary tree. (Every level is full except possibly the last, whose leaves occupy the leftmost positions.)


Array representation of complete binary tree (recall)

a[0] stores data from root node
a[1] data from left child of root
a[2] data from right child of root
Left child of a[i] located at a[2i+1]
Right child of a[i] located at a[2i+2]
Parent of a[i] located at a[(i-1)/2]*

*(integer division; for sorting, the root is usually stored at a[0], not at a[1])
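The index arithmetic can be written as small helpers. This is a sketch with hypothetical names; it uses each stored value as its own priority and assumes the root is at index 0.

```java
class HeapIndexSketch {
    static int leftChild(int i)  { return 2 * i + 1; }
    static int rightChild(int i) { return 2 * i + 2; }
    static int parent(int i)     { return (i - 1) / 2; }   // integer division; for i > 0

    // True if every node's value is >= the values of its children (max-heap order).
    static boolean isHeap(int[] a) {
        for (int i = 1; i < a.length; i++) {
            if (a[parent(i)] < a[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isHeap(new int[] {9, 7, 8, 3, 5, 6}));
        System.out.println(isHeap(new int[] {1, 2, 3}));
    }
}
```

Checking each node against its parent covers every parent-child pair once, so no separate pass over left and right children is needed.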


Phases of Heapsort

0) Interpret array as binary tree
1) Convert the tree into a heap
2) Extract elements from heap, placing them into sorted position in the array


Overview of Heapsort - 1

Two stage process
• First, heapify the array: “rearrange the values in the array so that the corresponding complete binary tree is a heap.”
• Largest element now at the root position, the first location in the array.


Overview of Heapsort - 2

Two stage process
• Second, repeat:

• Swap elements in first and last locations of heap. Now, largest element in last position—its correct position in sorted order.

• Element in root out of place. Reheapify downward.

• Heap shrinks by one, sorted sequence increases by 1.

• Next largest element now at root position.
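The two stages can be sketched in Java as follows. The class and method names are illustrative, the values double as priorities, and the heapify stage works bottom-up from the last parent; other ways of building the initial heap would fit the same outline.

```java
import java.util.Arrays;

class HeapsortSketch {
    // Reheapify downward: move a[root] down until neither child is larger.
    // Only a[0..size-1] is treated as part of the heap.
    static void siftDown(int[] a, int root, int size) {
        while (true) {
            int largest = root;
            int left = 2 * root + 1, right = 2 * root + 2;
            if (left < size && a[left] > a[largest]) largest = left;
            if (right < size && a[right] > a[largest]) largest = right;
            if (largest == root) return;
            int temp = a[root]; a[root] = a[largest]; a[largest] = temp;
            root = largest;
        }
    }

    static void heapsort(int[] a) {
        int n = a.length;
        // Stage 1: heapify, working from the last parent up to the root.
        for (int i = n / 2 - 1; i >= 0; i--) siftDown(a, i, n);
        // Stage 2: swap root with the last heap element, shrink the heap, reheapify.
        for (int last = n - 1; last > 0; last--) {
            int temp = a[0]; a[0] = a[last]; a[last] = temp;
            siftDown(a, 0, last);
        }
    }

    public static void main(String[] args) {
        int[] data = {14, 3, 27, 8, 42, 1, 19};
        heapsort(data);
        System.out.println(Arrays.toString(data));
    }
}
```

The sorted sequence grows from the end of the array while the heap shrinks at the front, so no second array is needed.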


Analysis of Heapsort

time to build initial heap: O(n log n) (Θ(n) with the bottom-up construction)
time to remove the elements from heap, and place in sorted array: Θ(n log n)
overall time: Θ(n log n), average and worst cases


Advantages of Heapsort

in-place (doesn’t require temporary array)
asymptotic analysis same as Mergesort, average case of Quicksort
on average takes twice as long as Quicksort


Constructing heap in Θ(n) time

a binary tree with only one node satisfies the properties of a heap
interpret array as binary tree, and consider leaves as heaps; i.e., second half of array
from midpoint of array down to root, sift each element down into the heap formed by its subtrees
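A sketch of this bottom-up construction, with a swap counter added to illustrate the linear bound. The names are hypothetical and the values serve as priorities.

```java
class BuildHeapSketch {
    // Builds a max-heap in place and returns the number of swaps performed.
    static int buildHeap(int[] a) {
        int n = a.length, swaps = 0;
        // Leaves a[n/2..n-1] are already one-node heaps; start at the last parent.
        for (int start = n / 2 - 1; start >= 0; start--) {
            int i = start;
            while (true) {                      // sift a[i] down into its subtree
                int largest = i, left = 2 * i + 1, right = 2 * i + 2;
                if (left < n && a[left] > a[largest]) largest = left;
                if (right < n && a[right] > a[largest]) largest = right;
                if (largest == i) break;
                int temp = a[i]; a[i] = a[largest]; a[largest] = temp;
                swaps++;
                i = largest;
            }
        }
        return swaps;
    }

    public static void main(String[] args) {
        int[] a = new int[15];
        for (int i = 0; i < a.length; i++) a[i] = i + 1;   // ascending input
        System.out.println(buildHeap(a) + " swaps for " + a.length + " elements");
    }
}
```

Each element can sink at most as far as the height of its node, and most nodes sit near the bottom of the tree, so the total number of swaps stays below n.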