albert-frith
LECTURE 26: BUCKET SORT & RADIX SORT
CSC 213 – Large Scale Programming
Today’s Goals
Review discussion of merge sort and quick sort
  How do they work & why divide-and-conquer?
  Are they the fastest possible sorts?
Another way to sort data presented
  How can we sort data with a single simple value?
  What are the limits on using buckets to sort our data?
  If we want more buckets, can we expand these limits?
  How does radix sort work? How long does it need?
Quick Sort v. Merge Sort
Quick Sort
  Divide data around pivot
    Want pivot to be near middle
    All comparisons occur here
  Conquer with recursion
    Does not need extra space
  Merge usually done already
    Data already sorted!
Merge Sort
  Divide data blindly in half
    Always gets even split
    No comparisons performed!
  Conquer with recursion
    Needs* to use other arrays
  Merge combines solutions
    Compares from (sorted) halves
Complexity of Sorting
With n! external nodes, binary tree’s minimum height (time) is log(n!), which is O(n log n)
[Figure: decision tree whose internal nodes are comparisons such as xi < xj ?, xa < xb ?, xc < xd ?, with n! external nodes]
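The step from log(n!) to n log n is not spelled out on the slide. One standard way to see it, offered here as a sketch rather than the lecture's own derivation, is to bound the sum of logarithms from both sides:

```latex
\log(n!) = \sum_{k=1}^{n} \log k \le n \log n,
\qquad
\log(n!) \ge \sum_{k=\lceil n/2 \rceil}^{n} \log k \ge \frac{n}{2}\,\log\frac{n}{2}
```

Both bounds grow like n log n, so any sort that works only by comparing elements needs on the order of n log n comparisons in the worst case.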
Bucket-Sort
Buckets, B, is an array of Sequences
Sorts Collection, C, in two phases:
1. Remove each element v from C & add to B[v]
2. Move elements from each bucket back to C
Bucket-Sort Algorithm
Algorithm bucketSort(Sequence<Integer> C)
  B = new Sequence[10]   // & instantiate each Sequence
  // Phase 1
  for each element v in C
    B[v].addLast(v)   // Assumes each number in C between 0 & 9
  endfor
  // Phase 2
  loc = 0
  for each Sequence b in B
    for each element v in b
      C.set(loc, v)
      loc += 1
    endfor
  endfor
  return C
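The pseudocode above can be sketched in Java roughly as follows. This is a minimal illustration, not the course's implementation: it assumes java.util.List stands in for the Sequence type, and the class name BucketSortSketch and the range check are additions for the example.

```java
import java.util.ArrayList;
import java.util.List;

class BucketSortSketch {
    // Sorts values in the range 0..9 in place, following the two phases above.
    static List<Integer> bucketSort(List<Integer> c) {
        // One bucket per legal value; each bucket is its own sequence.
        List<List<Integer>> buckets = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            buckets.add(new ArrayList<>());
        }
        // Phase 1: add each element to the bucket named by its value.
        for (int v : c) {
            if (v < 0 || v > 9) {
                throw new IllegalArgumentException("value out of range 0..9: " + v);
            }
            buckets.get(v).add(v);
        }
        // Phase 2: move elements from each bucket back to c, in bucket order.
        int loc = 0;
        for (List<Integer> b : buckets) {
            for (int v : b) {
                c.set(loc, v);
                loc += 1;
            }
        }
        return c;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>(List.of(7, 1, 3, 7, 0, 9, 3));
        System.out.println(bucketSort(data)); // prints [0, 1, 3, 3, 7, 7, 9]
    }
}
```

No element is ever compared with another element; the value alone decides where it lands.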
Bucket Sort Properties
For this to work, values must be legal indices
  Non-negative integer indices needed to access arrays
Sorting occurs without comparing objects
Stable sort describes any sort of this type
  Preserves relative ordering of objects with same value
  (BUBBLE-SORT & MERGE-SORT are other stable sorts)
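Stability is easiest to see when elements carry more than their key. The sketch below is illustrative only: the class name StableBucketDemo and the convention that each string starts with a digit key are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

class StableBucketDemo {
    // Bucket-sorts strings by their leading digit ('0'..'9').
    static List<String> sortByLeadingDigit(List<String> items) {
        List<List<String>> buckets = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            buckets.add(new ArrayList<>());
        }
        for (String s : items) {
            int key = s.charAt(0) - '0'; // assumes the first character is a digit
            buckets.get(key).add(s);     // appending keeps arrival order within a bucket
        }
        List<String> out = new ArrayList<>();
        for (List<String> b : buckets) {
            out.addAll(b);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> items = List.of("3-first", "1-first", "3-second", "1-second", "2-only");
        // prints [1-first, 1-second, 2-only, 3-first, 3-second]
        System.out.println(sortByLeadingDigit(items));
    }
}
```

Items with the same key come out in the order they went in, because each bucket is filled by appending and emptied front to back.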
Bucket Sort Extensions
Use Comparator for BUCKET-SORT
  Get index for v using compare(v, null)
Comparator for booleans could return
  0 when v is false
  1 when v is true
Comparator for US states could return
  Annual per capita consumption of Jello
  Consumption of Jello overall, in cubic feet
  State’s ranking by population
Bucket Sort Extensions
State’s ranking by population
1 California
2 Texas
3 New York
4 Florida
5 Illinois
6 Pennsylvania
7 Ohio
8 Michigan
9 Georgia
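The compare(v, null) idea can be sketched in Java as below. This is one possible reading of the slide, not a confirmed course API: the generic bucketSort helper, its numBuckets parameter, and the BOOLEAN_INDEX constant are hypothetical names introduced for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class ComparatorBucketSort {
    // Hypothetical helper: uses index.compare(v, null) as the bucket number for v.
    static <T> List<T> bucketSort(List<T> items, Comparator<T> index, int numBuckets) {
        List<List<T>> buckets = new ArrayList<>();
        for (int i = 0; i < numBuckets; i++) {
            buckets.add(new ArrayList<>());
        }
        for (T v : items) {
            buckets.get(index.compare(v, null)).add(v);
        }
        List<T> out = new ArrayList<>();
        for (List<T> b : buckets) {
            out.addAll(b);
        }
        return out;
    }

    // Ignores its second argument; returns 0 when v is false and 1 when v is true.
    static final Comparator<Boolean> BOOLEAN_INDEX = (v, ignored) -> v ? 1 : 0;

    public static void main(String[] args) {
        List<Boolean> flags = List.of(true, false, true, false, false);
        // prints [false, false, false, true, true]
        System.out.println(bucketSort(flags, BOOLEAN_INDEX, 2));
    }
}
```

A Comparator for states ranked by population would work the same way, returning the rank (shifted to start at 0) as the bucket number.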
Bucket Sort Extensions
Extended BUCKET-SORT works with many types
  Limited set of data needed for this to work
  Need way to enumerate values of the set
  ("enumerate" is a subtle hint)
d-Tuples
Combination of d values such as (k1, k2, …, kd)
  ki is ith dimension of the tuple
A point (x, y, z) is a 3-tuple
  x is 1st dimension’s value
  Value of 2nd dimension is y
  z is 3rd dimension’s value
Lexicographic Order
Assume a & b are both d-tuples
  a = (a1, a2, …, ad)
  b = (b1, b2, …, bd)
Can say a < b if and only if
  a1 < b1 OR
  a1 = b1 && (a2, …, ad) < (b2, …, bd)
Order these 2-tuples using previous definition
  (3 4) (7 8) (3 2) (1 4) (4 8)
  Result: (1 4) (3 2) (3 4) (4 8) (7 8)
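The recursive definition above unrolls into a simple loop over dimensions. The sketch below assumes tuples are stored as int[] arrays of equal length; LexOrderDemo, lessThan, and sort are names chosen for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LexOrderDemo {
    // Returns true when a < b lexicographically; assumes a and b have the same length d.
    static boolean lessThan(int[] a, int[] b) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] < b[i]) return true;  // first differing dimension decides
            if (a[i] > b[i]) return false;
            // a[i] == b[i]: continue with the remaining dimensions
        }
        return false; // all dimensions equal, so a is not less than b
    }

    // Returns a sorted copy of the tuples using the library sort and lessThan.
    static List<int[]> sort(List<int[]> tuples) {
        List<int[]> out = new ArrayList<>(tuples);
        out.sort((a, b) -> lessThan(a, b) ? -1 : (lessThan(b, a) ? 1 : 0));
        return out;
    }

    public static void main(String[] args) {
        List<int[]> tuples = List.of(
            new int[]{3, 4}, new int[]{7, 8}, new int[]{3, 2},
            new int[]{1, 4}, new int[]{4, 8});
        // prints [[1, 4], [3, 2], [3, 4], [4, 8], [7, 8]]
        System.out.println(Arrays.deepToString(sort(tuples).toArray()));
    }
}
```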
Radix-Sort
Very fast sort for data expressed as d-tuple
  Cheats to win; faster than sorting’s lower bound
Sort performed using d calls to bucket sort
  Sorts least to most important dimension of tuple
Luckily lots of data are d-tuples
  String is d-tuple of char
  “L E T T E R S”
  “L I N G E R S”
Digits of an int can be used for sorting, also
  1 0 0 1 3 7 2 9
  1 0 0 9 2 2 1 0
Radix-Sort For Integers
Represent int as a d-tuple of digits:
  62₁₀ = 111110₂
  04₁₀ = 000100₂
Decimal digits need 10 buckets to use for sorting
Ordering using their bits needs 2 buckets
O(d∙n) time needed to run RADIX-SORT
  d is length of longest element in input
  In most cases value of d is constant (d = 31 for int)
  Radix sort takes O(n) time, ignoring constant
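Treating an int as a tuple means being able to pull out one digit at a time. A small sketch of both views follows; the class and method names are hypothetical, and the decimal version assumes a non-negative value and a position small enough that the power of ten fits in an int.

```java
class DigitExtract {
    // Returns bit number `bit` of v (bit 0 is least significant); the result is 0 or 1.
    static int bitOf(int v, int bit) {
        return (v >> bit) & 1;
    }

    // Returns decimal digit number `pos` of a non-negative v (pos 0 is the ones digit).
    // Assumes pos <= 9 so that the divisor does not overflow an int.
    static int decimalDigitOf(int v, int pos) {
        int divisor = 1;
        for (int i = 0; i < pos; i++) {
            divisor *= 10;
        }
        return (v / divisor) % 10;
    }

    public static void main(String[] args) {
        // Print the six low bits of 62, most significant first.
        StringBuilder bits = new StringBuilder();
        for (int bit = 5; bit >= 0; bit--) {
            bits.append(bitOf(62, bit));
        }
        System.out.println(bits); // prints 111110
        System.out.println(decimalDigitOf(62, 1) + "" + decimalDigitOf(62, 0)); // prints 62
    }
}
```

The bit view needs only 2 buckets per pass but more passes; the decimal view needs 10 buckets and fewer passes.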
Radix-Sort In Action
List of 4-bit integers sorted using RADIX-SORT

Input   After bit 0   After bit 1   After bit 2   After bit 3
1001    0010          1001          1001          0001
0010    1110          1101          0001          0010
1101    1001          0001          0010          1001
0001    1101          0010          1101          1101
1110    0001          1110          1110          1110
Radix-Sort
Algorithm radixSort(Sequence<Integer> C)
  // Works from least to most significant value
  for bit = 0 to 30
    C = bucketSort(C, bit)   // Sort C using the specified bit
  endfor
  return C

What is big-Oh complexity for Radix-Sort?
  Call in loop uses each element twice: O(n)
  Loop repeats once per digit to complete sort: * O(1)
  Total: O(n)
Radix-Sort
What is big-Oh complexity if the number of digits is not constant?
  Call in loop uses each element twice: O(n)
  Loop repeats once per digit: O(log n) times (?)
  Total: O(n log n)
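The radixSort pseudocode can be sketched in Java as below, using a two-bucket stable pass per bit. This is an illustration under stated assumptions: java.util.List replaces Sequence, inputs are non-negative ints (so bits 0 to 30 suffice), and the class name RadixSortSketch is made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

class RadixSortSketch {
    // Stable two-bucket sort of c on a single bit of each value.
    static List<Integer> bucketSort(List<Integer> c, int bit) {
        List<Integer> zeros = new ArrayList<>();
        List<Integer> ones = new ArrayList<>();
        for (int v : c) {
            if (((v >> bit) & 1) == 0) {
                zeros.add(v);
            } else {
                ones.add(v);
            }
        }
        zeros.addAll(ones); // bucket 0 first, then bucket 1
        return zeros;
    }

    // Works from least to most significant bit; assumes non-negative values.
    static List<Integer> radixSort(List<Integer> c) {
        for (int bit = 0; bit <= 30; bit++) {
            c = bucketSort(c, bit);
        }
        return c;
    }

    public static void main(String[] args) {
        // The 4-bit values 1001, 0010, 1101, 0001, 1110.
        List<Integer> data = List.of(0b1001, 0b0010, 0b1101, 0b0001, 0b1110);
        System.out.println(radixSort(data)); // prints [1, 2, 9, 13, 14]
    }
}
```

Each pass touches every element a constant number of times, so the total work is the number of passes times n. The sort is correct only because each pass is stable: ties on the current bit keep the order established by the less significant bits.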
For Next Lecture
Start thinking about test cases for program #2
  Friday is next deadline when these must be submitted
  Spend time on this: tests & design save coding
Next weekly assignment available tomorrow
  As is usual, this will be due next Tuesday
Reading on Graph ADT for Wednesday
  Note: these have nothing to do with bar charts
  What are mathematical graphs?
  Why are they the basis of everything in CS?