49
1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

  • View
    214

  • Download
    0

Embed Size (px)

Citation preview

Page 1: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

1

Foundations of Software DesignFall 2002Marti Hearst

Lecture 20: Sorting

Page 2: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

2

Sorting

www.naturalbrands.com

Page 3: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

3

Sorting

Putting collections of things in order– Numerical order– Alphabetical order– An order defined by the domain of use

• E.g. color, yumminess …

http://www.hugkiss.com/platinum/candy.shtml

Page 4: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

4

Reminder: Binary Search

• Search for something in a SORTED list• Intuitively:

– Look in the middle for “fred”. – If not there, look to left middle if Fred is less than the name

at that position, else look to the right of middle – Choosing the middle is a bit tricky because the bounds of

the array change with each recursive call. – O(log n)

• BinarySearch(“fred”, A, start, end)if (fred == A[middle]) return “fred”if (fred < A[middle])

BinarySearch(“fred”,A, start, middle)

elseBinarySearch(“fred”,A,(middle+1), end)

Page 5: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

5

Stable Sort

• A concept that applies to sorting in general• What to do when two items have equal keys?• A sorting algorithm is stable if

– Two items A and B, both with the same key K, end up in the same order before sorting as after sorting

• Example: sorting in excel– Given name, salary, department– Fred, Sally, and Dekai: dept == shoes, salary == 50K– Juan and Donna: dept == shoes, salary == 70K– In a stable sort:

• If you first sort by salary, then by dept, the order of Fred, Sally, and Dekai doesn’t change relative to each other

• Same goes for Juan and Donna

Page 6: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

6

Stable Sort

Sort by salary

Order of names within category doesn’t change

Page 7: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

7

Monotonicity• mon·o·tone (m n -t n )

– A succession of sounds or words uttered in a single tone of voice

• mon·o·ton·ic (m n -t n k) Mathematics. Designating sequences, the successive members of which either consistently increase or decrease but do not oscillate in relative value.

• A function is monotonically increasing if whenever a>b then f(a)>=f(b).

• A function is strictly increasing if whenever a>b then f(a)>f(b). – Similar for decreasing

• Nonmonotonic is the opposite

Page 8: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

8Slide copyright Pearson Education 2002

Sorting an Array of Integers

An array of six integers that we want to sort from smallest to largest

0

10

20

30

40

50

60

70

[1] [2] [3] [4] [5] [6][0] [1] [2] [3] [4] [5]

Page 9: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

9Slide copyright Pearson Education 2002

0

10

20

30

40

50

60

70

[1] [2] [3] [4] [5] [6]

The SelectionSort Algorithm

Start by finding the smallest entry.

[0] [1] [2] [3] [4] [5]

Page 10: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

10Slide copyright Pearson Education 2002

0

10

20

30

40

50

60

70

[1] [2] [3] [4] [5] [6]

• Start by finding the smallest entry.

• Swap the smallest entry with the first entry.

[0] [1] [2] [3] [4] [5]

Page 11: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

11Slide copyright Pearson Education 2002

0

10

20

30

40

50

60

70

[1] [2] [3] [4] [5] [6]

• Start by finding the smallest entry.

• Swap the smallest entry with the first entry.

[0] [1] [2] [3] [4] [5]

Page 12: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

12Slide copyright Pearson Education 2002

0

10

20

30

40

50

60

70

[1] [2] [3] [4] [5] [6]

• Part of the array is now sorted.

Sorted side Unsorted side

[0] [1] [2] [3] [4] [5]

Page 13: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

13Slide copyright Pearson Education 2002

• Find the smallest element in the unsorted side.

Sorted side Unsorted side

[0] [1] [2] [3] [4] [5]

Page 14: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

14Slide copyright Pearson Education 2002

• Find the smallest element in the unsorted side.

• Swap with the front of the unsorted side.

Sorted side Unsorted side

[0] [1] [2] [3] [4] [5]

Page 15: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

15Slide copyright Pearson Education 2002

• We have increased the size of the sorted side by one element.

Sorted side Unsorted side

[0] [1] [2] [3] [4] [5]

Page 16: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

16Slide copyright Pearson Education 2002

• The process continues...

Sorted side Unsorted side

Smallestfrom

unsorted

Smallestfrom

unsorted

[0] [1] [2] [3] [4] [5]

Page 17: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

17Slide copyright Pearson Education 2002

• The process continues...

Sorted side Unsorted side

[0] [1] [2] [3] [4] [5]

Swap

with

front

Swap

with

front

Page 18: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

18Slide copyright Pearson Education 2002

• The process continues...

Sorted side Unsorted sideSorted side

is bigger

Sorted sideis bigger

[0] [1] [2] [3] [4] [5]

Page 19: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

19Slide copyright Pearson Education 2002

• The process keeps adding one more number to the sorted side.

• The sorted side has the smallest numbers, arranged from small to large.

Sorted side Unsorted side

[0] [1] [2] [3] [4] [5]

Page 20: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

20Slide copyright Pearson Education 2002

• We can stop when the unsorted side has just one number, since that number must be the largest number.

[0] [1] [2] [3] [4] [5]

Sorted side Unsorted side

Page 21: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

21Slide copyright Pearson Education 2002

• The array is now sorted.

• We repeatedly selected the smallest element, and moved this element to the front of the unsorted side.

[0] [1] [2] [3] [4] [5]

Page 22: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

22

Selection Sort Analysis

• Running time– Do n times:

• Find the smallest item (this is O(n))• Swap

– O(n2) worst case

Page 23: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

23Slide copyright Pearson Education 2002

0

10

20

30

40

50

60

70

[1] [2] [3] [4] [5] [6]

The Insertion Sort Algorithm

Also views the array as having a sorted side and an unsorted side.

[0] [1] [2] [3] [4] [5]

Page 24: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

24Slide copyright Pearson Education 2002

0

10

20

30

40

50

60

70

[1] [2] [3] [4] [5] [6]

The sorted side starts with just the first element, which is not necessarily the smallest element.

[0] [1] [2] [3] [4] [5]

Sorted side Unsorted side

Page 25: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

25Slide copyright Pearson Education 2002

0

10

20

30

40

50

60

70

[1] [2] [3] [4] [5] [6]

The sorted side grows by taking the front element from the unsorted side...

[0] [1] [2] [3] [4] [5]

Sorted side Unsorted side

Page 26: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

26Slide copyright Pearson Education 2002

0

10

20

30

40

50

60

70

[1] [2] [3] [4] [5] [6]

...and inserting it in the place that keeps the sorted side arranged from small to large.

[0] [1] [2] [3] [4] [5]

Sorted side Unsorted side

Page 27: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

27Slide copyright Pearson Education 2002

0

10

20

30

40

50

60

70

[1] [2] [3] [4] [5] [6]

In this example, the new element goes in front of the element that was already in the sorted side.

[0] [1] [2] [3] [4] [5]

Sorted side Unsorted side

Page 28: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

28Slide copyright Pearson Education 2002

0

10

20

30

40

50

60

70

[1] [2] [3] [4] [5] [6]

Sometimes we are lucky and the new inserted item doesn't need to move at all.

[0] [1] [2] [3] [4] [5]

Sorted side Unsorted side

Page 29: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

29Slide copyright Pearson Education 2002

0

10

20

30

40

50

60

70

[1] [2] [3] [4] [5] [6]

Sometimes we are lucky twice in a row.

[0] [1] [2] [3] [4] [5]

Sorted side Unsorted side

Page 30: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

30Slide copyright Pearson Education 2002

0

10

20

30

40

50

60

70

[1] [2] [3] [4] [5] [6]

How to Insert One Element

Copy the new element to a separate location.

[0] [1] [2] [3] [4] [5]

Sorted side Unsorted side

Page 31: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

31Slide copyright Pearson Education 2002

0

10

20

30

40

50

60

70

[1] [2] [3] [4] [5] [6]

Shift elements in the sorted side, creating an open space for the new element.

[0] [1] [2] [3] [4] [5]

Page 32: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

32Slide copyright Pearson Education 2002

0

10

20

30

40

50

60

70

[1] [2] [3] [4] [5] [6]

Shift elements in the sorted side, creating an open space for the new element.

[0] [1] [2] [3] [4] [5]

Page 33: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

33Slide copyright Pearson Education 2002

0

10

20

30

40

50

60

70

[1] [2] [3] [4] [5] [6]

Continue shifting elements...

[0] [1] [2] [3] [4] [5]

Page 34: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

34Slide copyright Pearson Education 2002

0

10

20

30

40

50

60

70

[1] [2] [3] [4] [5] [6]

Continue shifting elements...

[0] [1] [2] [3] [4] [5]

Page 35: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

35Slide copyright Pearson Education 2002

0

10

20

30

40

50

60

70

[1] [2] [3] [4] [5] [6]

...until you reach the location for the new element.

[0] [1] [2] [3] [4] [5]

Page 36: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

36Slide copyright Pearson Education 2002

0

10

20

30

40

50

60

70

[1] [2] [3] [4] [5] [6]

Copy the new element back into the array, at the correct location.

[0] [1] [2] [3] [4] [5]

Sorted side Unsorted side

Page 37: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

37Slide copyright Pearson Education 2002

• The last element must also be inserted. Start by copying it...

[0] [1] [2] [3] [4] [5]

Sorted side Unsorted side

Page 38: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

38

Insertion Sort Analysis

• Useful for adding to an already (nearly) sorted list.• Go down the list and insert item into the list in its proper

order with respect to items already inserted. • Run time:

– If target location near the end of the list, time n– If target near the beginning of the list, have to shift n items

down– Do this n times– O(n2) worst case– However: can speed up with binary search– Linear time best case

Page 39: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

39

Divide-and-Conquer Strategy

• Usually recursive.• Divide:

– If the input size is smaller than some threshold, solve the problem directly using a simple method

– Otherwise, divide the input into subsets

• Recurse:– Recursively solve the problem on the subsets

• Conquer:– Merge the solutions for the subproblems into a

solution for the larger problem

Page 40: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

40

MergeSort

• Divide– If S has at least 2 elements

• Remove all elements from S• Put half in S1, half in S2

– Else • Done sorting this part

• Recurse– Recursively MergeSort S1 and S2

• Conquer– Merge S1 and S2 into a sorted sequence– Put this into S

Page 41: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

41Illustration from Goodrich & Tamassia

1

2

3

4

Page 42: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

42Illustration from Goodrich & Tamassia

5

6

7

8

Just did a merge step

Page 43: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

43Illustration from Goodrich & Tamassia

9

10

11

12

Page 44: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

44Illustration from Goodrich & Tamassia

13

14

15

16

About to do another merge Now merge the lists with 2 items

Page 45: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

45Illustration from Goodrich & Tamassia

17

18

19

20

Skipping details on this part

Page 46: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

46Illustration from Goodrich & Tamassia

21

22Final Product

Final Merge Step

Page 47: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

47

MergeSort in Action

• Also helpful to view animations– http://www.cs.toronto.edu/~neto/teaching/238/16/mergesort.html

• The difficult part to analyze is the conquer step in which two sublists are merged

• The intuition:– The two sublists are in increasing order.– Want to create a new list containing the contents of both.– Move along both sublists, keeping track of which is the smallest

current element in each sublist.– Say the smallest is the element currently pointed at in sublist A

• Put that element from A at the end of the new list• Increment the index pointing to list A

– Continue until the indices of both sublists are at the ends of those sublists.

• Easier to use Vectors than Arrays!

Page 48: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

48

Merging Two Sorted Listsmerge (S1, S2) {

S = nullwhile S1 not empty && S2 not empty {if S1.first <= S2.firstS.append(S1.first)S1.remove(S1.first)elseS.append(S2.first)S2.remove(S2.first)} // append any leftoversif S1 not emptyS.append(S1)else if S2 not emptyS.append(S2)

return S}

Page 49: 1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 20: Sorting

49

MergeSort Running Time

• Intuition:– You do the same thing to all n elements of the array

at each level of the diagram– The diagram has log(n) levels– Thus the average and worst case running times are

O(n log(n))