More about costs: cost of “ensureCapacity”, cost of ArraySet, Binary Search 2014-T2 Lecture 12 School of Engineering and Computer Science, Victoria University of Wellington COMP 103 Marcus Frean



RECAP
  Analysing Algorithm Costs: “Big O” notation

TODAY
  ArrayList costs:
    add at end (ensure capacity): a bit tricky, as we need to find the “amortised” cost
    summary
  ArraySet costs:
    get, set, contains
  Binary search: the “findIndex” method of ArraySet
    logarithmic cost
    summary

Announcements: Assignment #4 released, due next Monday 3pm


ArrayList: add at end
Cost of add(value):
  what’s the key step?
  worst case:
  average case:

public void add(E item) {
    ensureCapacity();
    data[count++] = item;
}

private void ensureCapacity() {
    if (count < data.length) return;    // still room: nothing to do
    E[] newArray = (E[]) (new Object[data.length * 2]);    // double the size
    for (int i = 0; i < count; i++)     // copy every existing item across
        newArray[i] = data[i];
    data = newArray;
}


ArrayList: amortised cost

“Amortised” cost: total cost of adding n items, divided by n:

  first 10:   cost = 1 each    total = 10
  11th:       cost = 10+1      total = 21
  12-20:      cost = 1 each    total = 30
  21st:       cost = 20+1      total = 51
  22-40:      cost = 1 each    total = 70
  41st:       cost = 40+1      total = 111
  42-80:      cost = 1 each    total = 150
  :
  - n:                         total =

Amortised cost (per item) =
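One way to fill in the general pattern behind the table is the short derivation below. It is a sketch under the slide's cost model (1 step to store an item, 1 step per item copied), and the symbols are assumptions introduced here: c is the initial capacity (10 in the table), k is the number of doublings that have happened by the time n items are stored, and T(n) is the total cost.

```latex
\begin{align*}
T(n) &= \underbrace{n}_{\text{one store per item}}
        + \underbrace{c + 2c + 4c + \dots + 2^{k-1}c}_{\text{items copied at each doubling}} \\
     &= n + (2^{k} - 1)\,c \\
     &< n + 2n = 3n
        \qquad \text{since the last copy moved } 2^{k-1}c < n \text{ items} \\[4pt]
\frac{T(n)}{n} &< 3
        \quad\Longrightarrow\quad \text{amortised cost per add is } O(1)
\end{align*}
```

As a check against the table: with c = 10 and n = 80 there have been k = 3 doublings, so T(80) = 80 + 7 × 10 = 150.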


ArrayList costs: Summary

  get             O(1)
  set             O(1)
  remove          O(n)
  add (at i)      O(n)   (worst and average)
  add (at end)    O(1)   (average)
                  O(n)   (worst)
                  O(1)   (amortised average)

To think about:

What would the amortised cost be if the array size is increased by a fixed amount (say 10) each time?
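One way to explore that question is to count the steps directly. The sketch below is not from the lecture: the class name `GrowthPolicies` and the method `totalCost` are hypothetical, and the cost model (1 step per item stored, 1 step per item copied) is the one assumed on the amortised-cost slide. It takes the growth rule as a parameter so that doubling and "add 10" can be compared side by side.

```java
import java.util.function.IntUnaryOperator;

// Hypothetical experiment, not part of the lecture code.
class GrowthPolicies {

    // Total cost of n adds to an array that starts at initialCapacity and
    // grows according to `grow` (old capacity -> new capacity) when full.
    // Assumed cost model: 1 per item stored, plus 1 per item copied.
    static long totalCost(int n, int initialCapacity, IntUnaryOperator grow) {
        long total = 0;
        int capacity = initialCapacity;
        for (int count = 0; count < n; count++) {
            if (count == capacity) {                  // full: copy all existing items
                total += count;
                capacity = grow.applyAsInt(capacity);
            }
            total += 1;                               // store the new item
        }
        return total;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1_000, 10_000, 100_000}) {
            long doubling = totalCost(n, 10, c -> c * 2);
            long plusTen = totalCost(n, 10, c -> c + 10);
            System.out.println(n + " adds: per-item cost with doubling = "
                    + (double) doubling / n
                    + ", with +10 each time = " + (double) plusTen / n);
        }
    }
}
```

Running it for a few values of n shows how the per-item figure behaves under each rule as n grows, which is the comparison the question is asking for.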


What about ArraySet?

Order is not significant
  ⇒ can add a new item anywhere. Where?
      At end: O(1), but also searching for duplicates: O(n)
  ⇒ can reorder when removing an item. How?
      Replace by last element: O(1), but also searching in set: O(n)

Duplicates not allowed.
  ⇒ must check if item already present before adding


ArraySet algorithms (pseudocode)

Add(value)
    if not contains(value),
        place value at end (doubling array if necessary)
        increment size

Remove(value)
    search through array
        if value equals item
            replace item by item at end
            decrement size
            return

Contains(value)
    search through array,
        if value equals item
            return true
    return false

Costs?
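The pseudocode above could be sketched in Java roughly as follows. This is an illustration only: the class name `SimpleArraySet`, the fixed `String` element type, the starting capacity of 4 and the `boolean` return values are all assumptions made here, not the assignment's required interface. Items are assumed to be non-null.

```java
import java.util.Arrays;

// Illustrative sketch of an unsorted array-backed set (names are hypothetical).
class SimpleArraySet {
    private String[] data = new String[4];   // assumed starting capacity
    private int count = 0;

    // Linear search through the used part of the array: O(n).
    private int indexOf(String value) {
        for (int i = 0; i < count; i++) {
            if (data[i].equals(value)) return i;
        }
        return -1;
    }

    boolean contains(String value) {
        return indexOf(value) >= 0;
    }

    // O(n) to check for a duplicate, then amortised O(1) to place at the end.
    boolean add(String value) {
        if (contains(value)) return false;
        if (count == data.length) {
            data = Arrays.copyOf(data, data.length * 2);   // doubling
        }
        data[count++] = value;
        return true;
    }

    // O(n) to find the item, then O(1) to overwrite it with the last item.
    boolean remove(String value) {
        int i = indexOf(value);
        if (i < 0) return false;
        data[i] = data[count - 1];
        data[count - 1] = null;
        count--;
        return true;
    }

    int size() {
        return count;
    }
}
```

In each of the three operations the only loop is the linear search in `indexOf`.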


ArraySet costs
Costs: contains, add, remove: O(n)

All the cost is in the search! How can we speed up the search?


Hand up if you find “Gnu”

Dog Fish Cat Fox Eel Ant Bee Hen Gnu Doe Oryx Fox Fish

• Are there any duplicates in that list?
• How many?

Ant Bee Cat Doe Dog Eel Fish Fish Fox Fox Gnu Hen Oryx

Moral: lots of operations get easier if your array is sorted.


Hand up if you find “constructs”

‘In most cases I don't believe that the disjunction between the preferred ideal way that intellectuals reflect and the modal operation of human cognition is much of an issue. Intellectuals, or those who fancy themselves as such, might struggle with issues of ontology. But I do not believe that this is particularly on the radar of the typical individual whose concerns are more prosaic, the basic material and emotional comforts and securities of life. Confusions only emerge when institutions and systems aim to span the full gamut of conventional cognition. For example, in politics or religion, where intellectuals build systems which are very relevant to the lives of most humans. Because of the general obscurity of intellectual constructs to the "average Joe" there is a large body of literature which exists to make abstruse concepts "relevant" in everyday terms to everyday folk.’


Hand up if you find “constructs”

['a', 'abstruse', 'aim', 'an', 'and', 'and', 'and', 'and', 'are', 'are', 'as', 'average', 'basic', 'Because', 'believe', 'believe', 'between', 'body', 'build', 'But', 'cases', 'cognition', 'cognition', 'comforts', 'concepts', 'concerns', 'Confusions', 'constructs', 'conventional', 'disjunction', 'do', 'dont', 'emerge', 'emotional', 'everyday', 'everyday', 'example', 'exists', 'fancy', 'folk', 'For', 'full', 'gamut', 'general', 'human', 'humans', 'I', 'I', 'ideal', 'In', 'in', 'in', 'individual', 'institutions', 'intellectual', 'intellectuals', 'Intellectuals', 'intellectuals', 'is', 'is', 'is', 'issue', 'issues', 'Joe', 'large', 'life', 'literature', 'lives', 'make', 'material', 'might', 'modal', 'more', 'most', 'most', 'much', 'not', 'obscurity', 'of', 'of', 'of', 'of', 'of', 'of', 'of', 'of', 'of', 'of', 'on', 'only', 'ontology', 'operation', 'or', 'or', 'particularly', 'politics', 'preferred', 'prosaic', 'radar', 'reflect', 'relevant', 'relevant', 'religion', 'securities', 'span', 'struggle', 'such', 'systems', 'systems', 'terms', 'that', 'that', 'that', 'the', 'the', 'the', 'the', 'the', 'the', 'the', 'the', 'the', 'the', 'themselves', 'there', 'this', 'those', 'to', 'to', 'to', 'to', 'to', 'typical', 'very', 'way', 'when', 'where', 'which', 'which', 'who', 'whose', 'with']


Making ArraySet faster
All the cost is in the searching:

Searching for “Gnu”

  index:  0    1    2    3    4    5    6    7    8
  data:   Bee  Dog  Ant  Fox  Hen  Gnu  Eel  Cat
  count:  8

but if sorted…

  index:  0    1    2    3    4    5    6    7    8
  data:   Ant  Bee  Cat  Dog  Eel  Fox  Gnu  Hen
  count:  8


Making ArraySet faster. Binary Search: Finding “Gnu”

If the items are sorted (“ordered”), then we can search fast.

Look in the middle:
  if item is middle item ⇒ return
  if item is before middle item ⇒ look in left half
  if item is after middle item ⇒ look in right half

  index:  0    1    2    3    4    5    6    7    8
  data:   Ant  Bee  Cat  Dog  Eel  Fox  Gnu  Pig
  count:  8
  (markers on the slide: low, mid, hi)


Binary Search
This code returns the index of where the item ought to be, whether or not it is present (given this index, “contains” is trivial).

private int findIndex(Object item) {
    Comparable<E> value = (Comparable<E>) item;
    int low = 0;              // min possible index of item
    int high = count;         // max possible index of item
    while (low < high) {
        int mid = (low + high) / 2;
        if (value.compareTo(data[mid]) > 0)
            low = mid + 1;    // item should be in [mid+1..high]
        else
            high = mid;       // item should be in [low..mid]
    }
    return low;
}

nb. this is just a “helper” method within ArraySet
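To try the method outside the class, here is a standalone sketch of the same loop. The wrapper class `BinarySearchDemo`, the `String` element type, and passing `data` and `count` as parameters are assumptions made so the example runs on its own. The `contains` method shows one way the "trivial" check could look.

```java
// Standalone sketch of the findIndex idea (wrapper class is hypothetical).
class BinarySearchDemo {

    // Index where item is, or where it would have to go to keep
    // data[0..count-1] sorted. Assumes that part of the array is sorted.
    static int findIndex(String[] data, int count, String item) {
        int low = 0;          // min possible index of item
        int high = count;     // max possible index of item
        while (low < high) {
            int mid = (low + high) / 2;
            if (item.compareTo(data[mid]) > 0) {
                low = mid + 1;    // item belongs somewhere after mid
            } else {
                high = mid;       // item belongs at mid or before it
            }
        }
        return low;
    }

    // Present only if the slot findIndex points at actually holds the item.
    static boolean contains(String[] data, int count, String item) {
        int i = findIndex(data, count, item);
        return i < count && data[i].equals(item);
    }

    public static void main(String[] args) {
        String[] animals = {"Ant", "Bee", "Cat", "Dog", "Eel", "Fox", "Gnu", "Pig"};
        System.out.println(findIndex(animals, 8, "Gnu"));   // 6
        System.out.println(findIndex(animals, 8, "Hen"));   // 7: where Hen would go
        System.out.println(contains(animals, 8, "Hen"));    // false
    }
}
```

Note the `i < count` test in `contains`: for an item larger than everything in the set, `findIndex` returns `count`, which is one past the last used slot.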


Binary Search: Cost
What is the cost of searching if there are n items in set?
key step = ?

  Iteration    Size of range
  1            n
  2
  :
  k            1


log2(n):

The number of times you can divide a set of n things in half.
log2(1000) ≈ 10, log2(1,000,000) ≈ 20, log2(1,000,000,000) ≈ 30

Every time you double n, you add one step to the cost!
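The figures above can be checked by simply counting halvings. The class and method names below are hypothetical. The sketch rounds up when halving an odd-sized range, which is an assumption about how to treat the leftover item.

```java
// Hypothetical helper for checking the log2 figures by counting.
class Halvings {

    // How many times a range of n items can be cut in half
    // (rounding up) before only one item is left.
    static int halvings(long n) {
        int steps = 0;
        while (n > 1) {
            n = (n + 1) / 2;
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(halvings(1_000L));           // 10
        System.out.println(halvings(1_000_000L));       // 20
        System.out.println(halvings(1_000_000_000L));   // 30
        System.out.println(halvings(2_000L));           // 11: doubling n adds one step
    }
}
```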

Logarithms often arise in analysing algorithms, especially “Divide and Conquer” algorithms:

  [Diagram: Problem → Solve, Solve → Solution]


Summary: ArraySet with Binary Search

ArraySet: unordered
  All cost in the searching: O(n)
    contains:  O(n)       // simple, linear search
    add:       O(n)       // cost of searching to see if there’s a duplicate
    remove:    O(n)       // cost of searching for the item to remove

SortedArraySet: with Binary Search
  Binary Search is fast: O(log n)
    contains:  O(log n)   // uses binary search
    add:       O(n)       // cost of keeping it sorted
    remove:    O(n)       // cost of keeping it sorted

All the cost is in keeping it sorted!!!!
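To see where the O(n) in add and remove comes from, here is a sketch of a sorted set of Strings. It is not the lecture's SortedArraySet: the class name `SortedStringSet` is hypothetical, and the library call `Arrays.binarySearch` stands in for the findIndex helper (when the key is absent it returns -(insertion point) - 1). The shifting is done with `System.arraycopy`.

```java
import java.util.Arrays;

// Illustrative sketch only; Arrays.binarySearch is swapped in for findIndex.
class SortedStringSet {
    private String[] data = new String[4];   // assumed starting capacity
    private int count = 0;

    // O(log n): binary search over the used part of the array.
    boolean contains(String item) {
        return Arrays.binarySearch(data, 0, count, item) >= 0;
    }

    // Finding the position is O(log n), but making room is O(n).
    boolean add(String item) {
        int pos = Arrays.binarySearch(data, 0, count, item);
        if (pos >= 0) return false;               // already present
        int index = -(pos + 1);                   // where the item must go
        if (count == data.length) {
            data = Arrays.copyOf(data, data.length * 2);
        }
        // Shift everything from index onwards one place to the right.
        System.arraycopy(data, index, data, index + 1, count - index);
        data[index] = item;
        count++;
        return true;
    }

    // Finding the item is O(log n), but closing the gap is O(n).
    boolean remove(String item) {
        int index = Arrays.binarySearch(data, 0, count, item);
        if (index < 0) return false;
        // Shift everything after index one place to the left.
        System.arraycopy(data, index + 1, data, index, count - index - 1);
        data[--count] = null;
        return true;
    }

    // Copy of the items in sorted order.
    String[] toArray() {
        return Arrays.copyOf(data, count);
    }
}
```

Unlike the unsorted version, remove cannot just drop the last item into the gap, because that would break the ordering that binary search relies on.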


Making SortedArraySet fast

If you have to call add() and/or remove() on many items, then SortedArraySet is no better than ArraySet:
  Both O(n)
  Either we... pay to search
  Or we... pay to keep it in order

If you only have to construct the set once, and then make many calls to contains(), then SortedArraySet is much better than ArraySet.
  SortedArraySet contains() is O(log n)
  To find 1-in-a-billion, if sorted, takes about the time that 1-in-30 would, unsorted.

But how do you construct the set fast? A separate constructor.


Alternative Constructor
Sort the items all at once

public SortedArraySet(Collection<E> col) {
    // Make space
    count = col.size();
    data = (E[]) new Object[count];

    // Put items from collection into the data array.
    int i = 0;
    for (E item : col)
        data[i++] = item;

    // Sort the data array.
    Arrays.sort(data);
}

So… how do you sort? Next lecture we will investigate… 


CRUCIAL NOTE! Our assignments are “lagged” behind the lecture content.

This week’s lectures talk about SortedArraySet, and binary search, and sorting etc, BUT...

This week’s assignment is about (plain old) ArraySet. In the assignment, forget about these new-fangled and more efficient ideas: you’re to implement the “vanilla” version of ArraySet.
