Last Class


Summary of Implementations

Interface    Hash table   Array           Tree       Linked list   Hash table + Linked list
Set          HashSet                      TreeSet                  LinkedHashSet
SortedSet                                 TreeSet
List                      ArrayList                  LinkedList
Queue                     PriorityQueue              LinkedList
Map          HashMap                      TreeMap                  LinkedHashMap
SortedMap                                 TreeMap

[Figure: interface hierarchy. Set, List, and Queue extend Collection; SortedSet extends Set; SortedMap extends Map.]

Coming up

Array-based implementations for List and Queue

–ArrayList

–ArrayQueue

–ArrayDeque

–DualArrayDeque

Lists versus Arrays

Arrays

–a[i] and a[i] = x

–size is specified at time of creation and can't grow

–removing element i requires shifting a[i+1],a[i+2],...,a[a.length-1]

Lists

–get(i) and set(i,x)

–add(x) adds elements to the list

–add(i,x) inserts an element into the list

–remove(i) removes an element

Using arrays to implement List

• The ArrayList class implements a list as an array

    public class ArrayList<T> extends AbstractList<T> {
        T[] a;  // data goes in here
        int n;  // the number of elements in the list
        ...
    }

• How?

–Uses an array a, called a backing array

–An integer n keeps track of the number of elements

–At all times, n ≤ a.length

Using arrays to implement List

• List element i is stored in a[i]

    public T get(int i) {
        if (i < 0 || i > n - 1) throw new IndexOutOfBoundsException();
        return a[i];
    }

    public T set(int i, T x) {
        if (i < 0 || i > n - 1) throw new IndexOutOfBoundsException();
        T y = a[i];  // remember the old value so it can be returned
        a[i] = x;
        return y;
    }

Appending an element

• To append an element x

–grow a first if necessary

–store x in a[n] and increment n

    public boolean add(T x) {
        if (n + 1 > a.length) resize();  // increase length of a
        a[n++] = x;
        return true;
    }

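Inserting an element

• To insert x at index i

–grow a if necessary

–shift a[i],...,a[n-1] one position to the right

–store x in a[i] and increment n

    public void add(int i, T x) {
        if (n + 1 > a.length) resize();
        for (int j = n; j > i; j--)
            a[j] = a[j-1];  // shift right, starting from the end
        a[i] = x;
        n++;
    }

[Figure: add(1,x) on the list a, b, c, d, e gives a, x, b, c, d, e.]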

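Removing an element

• To remove element i

–shift a[i+1],...,a[n-1] one position to the left

–decrement n

–shrink a if desired

    public T remove(int i) {
        T x = a[i];
        for (int j = i; j < n-1; j++)
            a[j] = a[j+1];  // shift left, starting at index i
        n--;
        if (a.length >= 3*n) resize();
        return x;
    }

[Figure: remove(1) on the list a, x, b, c, d, e gives a, b, c, d, e.]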

Growing the array a - first try

• To grow a

–allocate a larger array b

–copy everything into b

    protected void resize() {
        T[] b = makeArray(n+1);  // one more slot than we have elements
        for (int i = 0; i < n; i++) {
            b[i] = a[i];
        }
        a = b;
    }

Growing the array a - first try

    List<Integer> l = new MyArrayList<Integer>();
    for (int i = 0; i < n; i++) {
        l.add(new Integer(i));
        ...

• Increasing a.length by 1 at each step causes a lot of copying

–when i=1, 1 element is copied from a to b

–when i=2, 2 elements are copied from a to b

–when i=3, 3 elements are copied from a to b

–...

–when i=n-1, n-1 elements are copied from a to b

Growing the array a - first try

How many elements are copied from one array into another during a sequence of n add operations on an empty MyArrayList?

1 + 2 + 3 + ... + (n-1)

[Figure: the terms of the sum drawn inside a rectangle with sides n and n-1.]

Arithmetic series: 1 + 2 + 3 + ... + (n-1) = n(n-1)/2

Result

Theorem (Incrementing a.length): During a sequence of n add operations on an empty MyArrayList, exactly n(n-1)/2 elements are copied from one array into another.
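
The count in this theorem can be checked empirically. The following is a minimal sketch, not the lecture's MyArrayList: the class name IncrementingGrowth and the method copiesForAppends are hypothetical, and the backing array is assumed to start with capacity 1.

```java
// Hypothetical helper: counts element copies when the backing array
// grows by exactly one slot each time it is full.
class IncrementingGrowth {
    // Returns the number of elements copied during n appends to an empty list.
    static long copiesForAppends(int n) {
        Object[] a = new Object[1]; // assumption: initial capacity 1
        int size = 0;
        long copies = 0;
        for (int i = 0; i < n; i++) {
            if (size + 1 > a.length) {
                Object[] b = new Object[size + 1]; // grow by one slot
                for (int k = 0; k < size; k++) {
                    b[k] = a[k];
                    copies++;
                }
                a = b;
            }
            a[size++] = i;
        }
        return copies;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10, 100, 1000}) {
            System.out.println(n + " appends: " + copiesForAppends(n)
                    + " copies, n(n-1)/2 = " + ((long) n * (n - 1) / 2));
        }
    }
}
```

For n = 10 the two printed numbers are both 45, matching 1 + 2 + ... + 9.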

Growing the array a – second try

• Grow the array faster, so that we have to copy less often

    protected void resize() {
        T[] b = makeArray(2*n);  // twice as many slots as elements
        for (int i = 0; i < n; i++) {
            b[i] = a[i];
        }
        a = b;
    }

• When adding n elements into an empty MyArrayList we get

–an array of length 1 that gets copied into

–an array of length 2 that gets copied into

–an array of length 4 that gets copied into

–an array of length 8 that gets copied into

–...

–an array of length $2^{r-1} < n$ that gets copied into

–an array of length $2^r < 2n$

Growing the array a – second try

How many elements are copied during a sequence of n add operations?

Geometric Series

• How much is $1 + 2 + 4 + 8 + \dots + 2^{r-1}$?

–Claim: $1 + 2 + 4 + 8 + \dots + 2^{r-1} < 2^r$

–Proof (by picture): Dividing by $2^r$, we need to show

    • $1/2^r + 1/2^{r-1} + \dots + 1/8 + 1/4 + 1/2 < 1$, that is,

    • $1/2 + 1/4 + 1/8 + \dots + 1/2^{r-1} + 1/2^r < 1$

[Figure: a bar of length 1 filled in steps by 1/2, then 1/4, 1/8, 1/16, ..., $1/2^r$. After each step the unfilled part equals the piece just added, so every partial sum is less than 1.]

Geometric Series

• Recall: $\sum_{j=0}^{r-1} i^j = 1 + i + i^2 + \dots + i^{r-1} = (i^r - 1)/(i - 1)$

• Substituting $i = 2$: $\sum_{j=0}^{r-1} 2^j = 1 + 2 + 2^2 + \dots + 2^{r-1} = (2^r - 1)/(2 - 1) = 2^r - 1$

• Recall that $2^r < 2n$

• The number of elements copied during n add operations on an empty MyArrayList is

–$1 + 2 + 4 + \dots + 2^{r-1} < 2^r < 2n$

Doubling works well

Theorem (Doubling a.length): During a sequence of n add operations on an empty MyArrayList, a total of at most 2n elements are copied from one array to another.

• We save a lot by using doubling

–$O(n)$ copy operations versus $O(n^2)$ copy operations
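
The doubling bound can be checked the same way. This is a minimal sketch under stated assumptions: the class name DoublingGrowth and the method copiesForAppends are hypothetical, and the backing array is assumed to start with capacity 1, as in the sequence of lengths 1, 2, 4, 8, ... used in the analysis.

```java
// Hypothetical helper: counts element copies when a full backing array
// is replaced by one of twice the current number of elements.
class DoublingGrowth {
    // Returns the number of elements copied during n appends to an empty list.
    static long copiesForAppends(int n) {
        Object[] a = new Object[1]; // assumption: initial capacity 1
        int size = 0;
        long copies = 0;
        for (int i = 0; i < n; i++) {
            if (size + 1 > a.length) {
                Object[] b = new Object[2 * size]; // double the capacity
                System.arraycopy(a, 0, b, 0, size);
                copies += size;
                a = b;
            }
            a[size++] = i;
        }
        return copies;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10, 100, 1000}) {
            System.out.println(n + " appends: " + copiesForAppends(n)
                    + " copies, bound 2n = " + (2 * n));
        }
    }
}
```

For n = 10 the resizes copy 1 + 2 + 4 + 8 = 15 elements, which is below the bound of 20.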

Doubling versus Incrementing

n            Incrementing: n(n-1)/2    Doubling: 2n
10           45                        20
100          4 950                     200
1 000        499 500                   2 000
10 000       49 995 000                20 000
100 000      4 999 950 000             200 000
1 000 000    499 999 500 000           2 000 000

Shrinking

• When n << a.length, a lot of space is wasted

• Each time an element is removed, we check whether to resize

–if n < a.length/3 then we resize to 2*n

Amortized analysis of grow() and shrink()

How good are grow() and shrink() when we have both add() and remove() operations?

• How many elements are copied from one array to another during a sequence of m add and remove operations?

• Answer: It depends on the exact sequence of operations

–We want an upper bound that holds for any sequence of m add() and remove() operations

Amortized analysis of grow()

• Suppose grow() is now reallocating array a

–n = a.length elements are being copied from a to b

–How many add() operations occurred since the last time a was reallocated?

–Right after that reallocation a was half full, so at least a.length/2 add() operations have occurred since then

–The number of copies caused by grow() is at most twice the number of add() operations

[Figure: the array a then (half full, just after the last reallocation) and now (full).]

Amortized analysis of shrink()

• Suppose shrink() is now reallocating array a

–n < a.length/3 elements are being copied

–How many remove() operations occurred since the last time a was reallocated?

–At least (a.length/2) - (a.length/3) = a.length/6 remove() operations have occurred since then

–The number of copies caused by shrink() is at most twice the number of remove() operations

[Figure: the array a then (half full, just after the last reallocation) and now (less than one third full).]

Recap

• The total number of array elements copied by grow() is at most twice the number of add() operations

• The total number of array elements copied by shrink() is at most twice the number of remove() operations

• If we perform a total of m add() and remove() operations then the total number of array elements copied by both grow() and shrink() is at most 2m

Summary Theorem

• Theorem: Starting with an empty MyArrayList, a sequence of m add() and remove() operations results in a total of at most 2m elements being copied from one array to another by grow() and shrink().

Stacks

• Corollary (Stack Theorem): Starting with an empty MyArrayList, a sequence of m add(x) and remove(size()-1) operations takes O(m) time.
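
The 2m bound can be observed on a small array-backed stack. This is a minimal sketch under stated assumptions, not the lecture's MyArrayList: the class CountingArrayStack and its copies counter are hypothetical, the array starts with capacity 1, growing happens when the array is full, and shrinking happens when n < a.length/3, with both reallocating to capacity 2n.

```java
// Hypothetical array-backed stack that counts the elements copied by resizing.
class CountingArrayStack<T> {
    private T[] a;   // backing array
    private int n;   // number of elements stored
    long copies;     // elements copied between arrays so far

    @SuppressWarnings("unchecked")
    CountingArrayStack() {
        a = (T[]) new Object[1]; // assumption: initial capacity 1
    }

    int size() {
        return n;
    }

    void push(T x) {
        if (n + 1 > a.length) resize(); // grow: the array is full
        a[n++] = x;
    }

    T pop() {
        if (n == 0) throw new IllegalStateException("empty stack");
        T x = a[--n];
        a[n] = null;                             // drop the stale reference
        if (n > 0 && n < a.length / 3) resize(); // shrink: mostly empty
        return x;
    }

    // Reallocates to capacity 2n and records the n copies.
    @SuppressWarnings("unchecked")
    private void resize() {
        T[] b = (T[]) new Object[2 * n];
        System.arraycopy(a, 0, b, 0, n);
        copies += n;
        a = b;
    }

    public static void main(String[] args) {
        CountingArrayStack<Integer> s = new CountingArrayStack<>();
        int m = 0;
        for (int i = 0; i < 1000; i++) { s.push(i); m++; }
        for (int i = 0; i < 900; i++) { s.pop(); m++; }
        System.out.println(m + " operations, " + s.copies
                + " copies, bound 2m = " + (2 * m));
    }
}
```

Any other mix of push and pop calls can be substituted in main; the counter should stay at or below twice the number of operations.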

Practical Considerations

• Array-based lists do a lot of copying and moving of data

–A for loop is not the best way to do this

–The fastest methods use machine parallelism and special machine instructions to speed up copying and moving of blocks of array data

–In Java, we can use System.arraycopy(a, ia, b, ib, n), which copies n elements from a, starting at index ia, into b, starting at index ib

System.arraycopy (examples)

    protected void grow() {
        T[] b = f.newArray(a.length*2);
        System.arraycopy(a, 0, b, 0, n);
        a = b;
    }

    public void add(int i, T x) {
        if (n + 1 > a.length) grow();
        System.arraycopy(a, i, a, i+1, n-i);  // shift a[i],...,a[n-1] right
        a[i] = x;
        n++;
    }

System.arraycopy (examples)

    protected void shrink() {
        if (n > 0 && n < a.length / 3) {
            T[] b = f.newArray(n*2);
            System.arraycopy(a, 0, b, 0, n);
            a = b;
        }
    }

    public T remove(int i) {
        T x = a[i];
        System.arraycopy(a, i+1, a, i, n-i-1);  // shift a[i+1],...,a[n-1] left
        n--;
        shrink();
        return x;
    }
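
The add(i,x) and remove(i) examples copy within a single array, so source and destination overlap; System.arraycopy is specified to behave as if the source range were first copied to a temporary array, so both shifts are safe. A minimal sketch on a plain int array follows; the class ArrayCopyDemo and its helper methods are hypothetical and do no resizing or bounds checking.

```java
import java.util.Arrays;

// Hypothetical demo of shifting blocks of an array with System.arraycopy.
class ArrayCopyDemo {
    // Inserts x at index i among the first n slots of a; assumes n < a.length.
    static void insert(int[] a, int n, int i, int x) {
        System.arraycopy(a, i, a, i + 1, n - i); // shift a[i..n-1] one slot right
        a[i] = x;
    }

    // Removes and returns the element at index i among the first n slots of a.
    static int remove(int[] a, int n, int i) {
        int x = a[i];
        System.arraycopy(a, i + 1, a, i, n - i - 1); // shift a[i+1..n-1] one slot left
        return x;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4, 5, 0}; // five elements plus one free slot
        int n = 5;
        insert(a, n, 1, 9);
        n++;
        System.out.println(Arrays.toString(a)); // [1, 9, 2, 3, 4, 5]
        int removed = remove(a, n, 1);
        n--;
        System.out.println(removed + " " + Arrays.toString(Arrays.copyOf(a, n))); // 9 [1, 2, 3, 4, 5]
    }
}
```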

Summary

• MyArrayList (JCF's ArrayList)

–A list implemented as an array that grows and shrinks

–Copying done by grow() and shrink() is proportional to the number of add() and remove() operations

–m add/remove operations require at most 2m copy operations

–Fast get(i) and set(i,x) for any value of i

–Fast remove(i) and add(i,x) when i ~ size()

–Shifting data is costly when i << size()

–Useful as a stack

Summary

• MyArrayList is a bit wasteful of space

–It might use an array of length 2n to store n elements of data

• Not suitable for real-time applications (even as a stack)

–Even though operations take constant time on average [m operations take O(m) time], some operations [those that reallocate a] take a long time.

• Works well as a stack, but not fast for

–add(i,x) or remove(i) where i is small (near the front)

–Too much shifting of data

Next

Queue

First in First out (FIFO)

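ArrayQueue

• A queue would be easy to implement if we had an infinite array

[Figure: an unbounded array holding the queue elements a, b, c, d, e, f in positions j through j + (n-1).]

    public boolean offer(T x) {
        a[j+n] = x;  // store x just past the tail
        n++;
        return true;
    }

    public T poll() {
        T x = a[j];  // take the element at the head
        j++;
        n--;
        return x;
    }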

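Circular Array

• We don't have infinite arrays

–But we do have arrays that can grow

• Use modular arithmetic to simulate an infinite array

–wrap around when we get to the end of the array

• Grow the array if the queue gets bigger than the array

[Figure: a circular array holding the queue elements a, ..., f, with the head at index j and the tail at index (j+n-1) % a.length; the elements wrap around the end of the array.]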

Modular Arithmetic

• "Clock arithmetic"

–8 + 5 ≡ 1 (mod 12)

–(8 + 5) % 12 = 1

    • 8 + 5 = 13

    • 13 - 12 = 1

• % is the integer remainder operator

–if x, y > 0 then (x % y) ϵ {0,...,y-1}

ArrayQueue

    public class ArrayQueue<T> extends AbstractQueue<T> {
        T[] a;  // the backing (circular) array
        int j;  // index of the head of the queue
        int n;  // number of elements in the queue
        ...
    }

• Represents a queue as an array a, and integers j and n

–j ϵ {0,...,a.length-1} points to the head of the queue

–n is the number of elements stored in the queue

–elements are stored at a[j], a[(j+1)%a.length], a[(j+2)%a.length], ..., a[(j+n-1)%a.length]

ArrayQueue - offer(x) [add(x)]

• offer(x) [add(x)]

–increase length of a if necessary

–store x at a[(j+n)%a.length]

–increment n

    public boolean offer(T x) {
        if (n + 1 > a.length) grow();
        a[(j+n) % a.length] = x;
        n++;
        return true;
    }

ArrayQueue - poll() [remove()]

• poll(), remove()

–Return value in a[j]

–increment j (mod a.length) and decrement n

    public T poll() {
        T x = null;
        if (n > 0) {
            x = a[j];
            j = (j + 1) % a.length;
            n--;
            shrink();
        }
        return x;
    }

    public T peek() {  // like poll(), but leaves the element in the queue
        T x = null;
        if (n > 0) {
            x = a[j];
        }
        return x;
    }

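Growing

• grow() and shrink() are a bit trickier than before

–the elements may wrap around the end of a, so they are copied in queue order to the start of the new array and j is reset to 0

    protected void grow() {
        T[] b = f.newArray(a.length * 2);
        for (int k = 0; k < n; k++)
            b[k] = a[(j+k) % a.length];
        a = b;
        j = 0;
    }

[Figure: a queue that wraps around the end of a is copied into an array of twice the length, with its head moved to index 0.]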

Shrinking

    protected void shrink() {
        if (n > 0 && n <= a.length / 4) {
            T[] b = f.newArray(n * 2);
            for (int k = 0; k < n; k++)
                b[k] = a[(j+k) % a.length];
            a = b;
            j = 0;
        }
    }

Summary Theorem

• Theorem:

–An ArrayQueue can perform a sequence of m offer(), add(), poll(), and remove() operations in O(m) time.

–If an upper bound on the size of the queue is known in advance, then we can eliminate the need for grow() and shrink()

• Theorem:

–A bounded ArrayQueue can perform each of offer(), add(), poll(), and remove() in constant time per operation.
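
The pieces of the ArrayQueue fit together as in the following minimal sketch. It is written under stated assumptions: the class name SimpleArrayQueue is hypothetical, the array is allocated with an unchecked cast rather than the slides' f.newArray factory, the initial capacity is 1, and grow() and shrink() are merged into one resize() that reallocates to capacity 2n.

```java
// Hypothetical circular-array queue following the offer/poll logic above.
class SimpleArrayQueue<T> {
    private T[] a;   // circular backing array
    private int j;   // index of the head of the queue
    private int n;   // number of elements in the queue

    @SuppressWarnings("unchecked")
    SimpleArrayQueue() {
        a = (T[]) new Object[1]; // assumption: initial capacity 1
    }

    int size() {
        return n;
    }

    boolean offer(T x) {
        if (n + 1 > a.length) resize();   // grow when full
        a[(j + n) % a.length] = x;        // slot just past the tail, with wrap-around
        n++;
        return true;
    }

    T poll() {
        if (n == 0) return null;
        T x = a[j];
        a[j] = null;                      // drop the stale reference
        j = (j + 1) % a.length;           // advance the head, with wrap-around
        n--;
        if (n > 0 && n <= a.length / 4) resize(); // shrink when mostly empty
        return x;
    }

    // Copies the elements in queue order into a new array of length 2n.
    @SuppressWarnings("unchecked")
    private void resize() {
        T[] b = (T[]) new Object[2 * n];
        for (int k = 0; k < n; k++) {
            b[k] = a[(j + k) % a.length];
        }
        a = b;
        j = 0;
    }

    public static void main(String[] args) {
        SimpleArrayQueue<String> q = new SimpleArrayQueue<>();
        for (String s : new String[] {"a", "b", "c", "d"}) q.offer(s);
        System.out.println(q.poll() + q.poll()); // ab
        q.offer("e");
        q.offer("f"); // this element is stored at a wrapped-around index
        StringBuilder rest = new StringBuilder();
        while (q.size() > 0) rest.append(q.poll());
        System.out.println(rest); // cdef
    }
}
```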

ArrayDeque

• An ArrayDeque uses modular arithmetic to implement the List interface.

• Why?

–This allows modifications to be fast if they are

    • close to the end of the list: shift right and increment n

    • close to the beginning of the list: shift left, decrement j, and increment n

[Figure: the list elements a, ..., f stored in array positions j through j+n-1.]

ArrayDeque get(i) and set(i,x)

• Get and set are easy (bounds-checking omitted)

    public T get(int i) {
        return a[(j+i)%a.length];
    }

    public T set(int i, T x) {
        T y = a[(j+i)%a.length];
        a[(j+i)%a.length] = x;
        return y;
    }

ArrayDeque add(i,x)

• Decide whether it's better to

–shift elements 0,...,i-1 left; or

–shift elements i,...,size()-1 right

[Figure: in a list of six elements a, ..., f, add(2,x) shifts a and b one position left, so the head moves to j-1; add(4,x) shifts e and f one position right, so the tail moves to j+n.]

    public void add(int i, T x) {
        if (n+1 > a.length) grow();
        if (i < n/2) {  // shift elements 0,...,i-1 left by one position
            j = (j == 0) ? a.length - 1 : j - 1;
            for (int k = 0; k <= i-1; k++)
                a[(j+k)%a.length] = a[(j+k+1)%a.length];
        } else {        // shift elements i,...,n-1 right by one position
            for (int k = n; k > i; k--)
                a[(j+k)%a.length] = a[(j+k-1)%a.length];
        }
        a[(j+i)%a.length] = x;
        n++;
    }

ArrayDeque remove(i)

• remove(i) is similar

–if (i < size()/2) then shift elements 0,...,i-1 right

–else shift elements i+1,...,size()-1 left

[Figure: in a list of six elements a, ..., f, remove(2) shifts a and b one position right, so the head moves to j+1; remove(4) shifts f one position left, so the tail moves to j+n-2.]

    public T remove(int i) {
        T x = a[(j+i)%a.length];
        if (i < n/2) {  // shift elements 0,...,i-1 right by one position
            for (int k = i; k > 0; k--)
                a[(j+k)%a.length] = a[(j+k-1)%a.length];
            j = (j + 1) % a.length;
        } else {        // shift elements i+1,...,n-1 left by one position
            for (int k = i; k < n-1; k++)
                a[(j+k)%a.length] = a[(j+k+1)%a.length];
        }
        n--;
        shrink();
        return x;
    }

ArrayDeque Summary

• Theorem: An ArrayDeque supports the operations

–get(i) and set(i,x) in constant time per operation

–add(i,x) and remove(i) in O(1 + min{i, size()-i}) amortized time per operation
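
A compact, self-contained version of the ArrayDeque operations is sketched below so the shifting logic can be run end to end. The assumptions are: the class name SimpleArrayDeque is hypothetical, the array is allocated with an unchecked cast instead of the slides' f.newArray factory, bounds checks are added, and a single resize() (to capacity 2n) stands in for grow() and shrink().

```java
// Hypothetical circular-array list supporting fast changes near both ends.
class SimpleArrayDeque<T> {
    private T[] a;   // circular backing array
    private int j;   // array index of list element 0
    private int n;   // number of elements

    @SuppressWarnings("unchecked")
    SimpleArrayDeque() {
        a = (T[]) new Object[1]; // assumption: initial capacity 1
    }

    int size() {
        return n;
    }

    T get(int i) {
        if (i < 0 || i >= n) throw new IndexOutOfBoundsException();
        return a[(j + i) % a.length];
    }

    void add(int i, T x) {
        if (i < 0 || i > n) throw new IndexOutOfBoundsException();
        if (n + 1 > a.length) resize();
        if (i < n / 2) { // shift elements 0,...,i-1 left by one position
            j = (j == 0) ? a.length - 1 : j - 1;
            for (int k = 0; k < i; k++)
                a[(j + k) % a.length] = a[(j + k + 1) % a.length];
        } else {         // shift elements i,...,n-1 right by one position
            for (int k = n; k > i; k--)
                a[(j + k) % a.length] = a[(j + k - 1) % a.length];
        }
        a[(j + i) % a.length] = x;
        n++;
    }

    T remove(int i) {
        if (i < 0 || i >= n) throw new IndexOutOfBoundsException();
        T x = a[(j + i) % a.length];
        if (i < n / 2) { // shift elements 0,...,i-1 right by one position
            for (int k = i; k > 0; k--)
                a[(j + k) % a.length] = a[(j + k - 1) % a.length];
            j = (j + 1) % a.length;
        } else {         // shift elements i+1,...,n-1 left by one position
            for (int k = i; k < n - 1; k++)
                a[(j + k) % a.length] = a[(j + k + 1) % a.length];
        }
        n--;
        if (n > 0 && 3 * n < a.length) resize();
        return x;
    }

    // Copies the elements in list order into a new array of length 2n.
    @SuppressWarnings("unchecked")
    private void resize() {
        T[] b = (T[]) new Object[2 * n];
        for (int k = 0; k < n; k++)
            b[k] = a[(j + k) % a.length];
        a = b;
        j = 0;
    }

    public static void main(String[] args) {
        SimpleArrayDeque<String> d = new SimpleArrayDeque<>();
        for (String s : new String[] {"a", "b", "c", "d", "e", "f"})
            d.add(d.size(), s);
        d.add(2, "x");  // shifts a and b left
        d.remove(4);    // removes d
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < d.size(); i++) out.append(d.get(i));
        System.out.println(out); // abxcef
    }
}
```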

Practical Considerations

• The % operator can be problematic

–it is fairly slow on most architectures

    • +, -, *, &, |, and ^ are all faster

–it doesn't handle negative values the way we expect

    • -1 % 12 = -1 [we want 11]

    • -15 % 12 = -3 [we want 9]

• We can replace % with branching

–(j + k) mod m is equivalent to (j+k >= m) ? j+k-m : j+k

–(j - k) mod m is equivalent to (j-k < 0) ? j-k+m : j-k

    • valid for j ϵ {0,...,m-1} and k ϵ {0,...,m}

Practical Considerations

• We can do better still if m (a.length) is a power of 2

–In this case (j+k) mod m can be computed as (j+k) & (m-1)

    • works for any values of k and j (even negative)

    • & is much faster than %

    • we can even store m-1 (= a.length-1) separately so we don't have to recompute it for every operation

• But this only works when a.length is a power of 2

–The grow() method always doubles a.length

–A modification to the shrink() method is needed

Example

    m = 64                           00001000000
    m-1 = 63                         00000111111
    x = 69 = 64+5                    00001000101
    x & (m-1) = 5                    00000000101
    y = 197 = 128+64+5 = 3*64+5      00011000101
    y & (m-1) = 5                    00000000101
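
The alternatives to % can be compared side by side. The following is a minimal sketch: the class ModDemo and the helpers addWrap, subWrap, and maskWrap are hypothetical names; Math.floorMod is the standard library method that returns a non-negative result for a positive modulus and is shown only as a reference point.

```java
// Hypothetical helpers comparing %, branching, and bit masking.
class ModDemo {
    // (j + k) mod m by branching; assumes 0 <= j < m and 0 <= k <= m.
    static int addWrap(int j, int k, int m) {
        int s = j + k;
        return (s >= m) ? s - m : s;
    }

    // (j - k) mod m by branching; assumes 0 <= j < m and 0 <= k <= m.
    static int subWrap(int j, int k, int m) {
        int d = j - k;
        return (d < 0) ? d + m : d;
    }

    // x mod m by masking; assumes m is a power of 2.
    static int maskWrap(int x, int m) {
        return x & (m - 1);
    }

    public static void main(String[] args) {
        System.out.println(-1 % 12);                // -1
        System.out.println(Math.floorMod(-1, 12));  // 11
        System.out.println(-15 % 12);               // -3
        System.out.println(Math.floorMod(-15, 12)); // 9
        System.out.println(subWrap(0, 1, 12));      // 11
        System.out.println(maskWrap(69, 64));       // 5
        System.out.println(maskWrap(197, 64));      // 5
        System.out.println(maskWrap(-1, 64));       // 63
    }
}
```

The masking version gives the non-negative result even for negative inputs because Java integers use two's complement representation.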

DualArrayDeque

• A DualArrayDeque is a data structure that turns two stacks into a deque.

• Main idea: Glue two stacks together back-to-back

[Figure: elements 0, 1, 2, 3, 4, 5 split between a front stack and a back stack; each stack supports push/pop at its outer end.]

    public class DualArrayDeque<T> extends AbstractList<T> {
        List<T> front;
        List<T> back;
        ...
    }

Ordering elements

• The back stack stores elements in the same order they occur in the deque.

• The front stack stores elements in reverse order

[Figure: the deque 0, 1, 2, 3, 4, 5 is stored as front = 2, 1, 0 (reversed) and back = 3, 4, 5; push/pop happens at the end of each stack.]

DualArrayDeque – size()

• The size of a DualArrayDeque is just the sum of the sizes of its two stacks.

    public int size() {
        return front.size() + back.size();
    }

[Figure: with front = 2, 1, 0 and back = 3, 4, 5, size() is front.size() + back.size().]

DualArrayDeque – get(i)

• For get(i) we need to determine if element i is stored in front or back.

    public T get(int i) {
        if (i < front.size()) {
            return front.get(front.size()-i-1);  // front is stored in reverse
        } else {
            return back.get(i-front.size());
        }
    }

DualArrayDeque – set(i,x)

• The set(i,x) method is similar

    public T set(int i, T x) {
        if (i < front.size()) {
            return front.set(front.size()-i-1, x);
        } else {
            return back.set(i-front.size(), x);
        }
    }

DualArrayDeque – add(i,x)

• The add(i,x) method is also similar.

    public void add(int i, T x) {
        if (i < front.size()) {
            front.add(front.size()-i, x);
        } else {
            back.add(i-front.size(), x);
        }
        balance();
    }

• Observe:

–i = 0 → front.add(front.size(), x) → fast (push front)

–i = size() → back.add(back.size(), x) → fast (push back)

DualArrayDeque – remove(i)

• The remove(i) method is similar

–fast when i = 0 (pop front) or i = size()-1 (pop back)

    public T remove(int i) {
        T x;
        if (i < front.size()) {
            x = front.remove(front.size()-i-1);
        } else {
            x = back.remove(i-front.size());
        }
        balance();
        return x;
    }

DualArrayDeque

• This seems too easy

–What happens when we try to use this as a queue?

    List<Integer> q = new DualArrayDeque<Integer>(Integer.class);
    ... // some code that fills q up
    while (true) {
        q.add(x);
        q.remove(0);
    }

    • add(x) always appends to back

    • eventually remove(0) will empty front

    • subsequent calls will translate to back.remove(0), which is SLOW!

[Figure: remove(0) pops from front = 2, 1, 0 while add(x) pushes onto back = 3, 4, 5.]

• This is why we call balance()

–If 3*front.size() < back.size() or 3*back.size() < front.size()

    • rebalance: spread elements evenly between front and back

DualArrayDeque – balance()

• A little trickier than it looks

–when moving elements between front and back we have to reverse their order

[Figure: balance() moves elements from the larger stack to the smaller one, reversing them on the way.]

    protected void balance() {
        int n = size();
        if (3*front.size() < back.size()) {
            int s = n/2 - front.size();  // number of elements to move from back to front
            List<T> l1 = newStack();
            List<T> l2 = newStack();
            l1.addAll(back.subList(0,s));
            Collections.reverse(l1);
            l1.addAll(front);
            l2.addAll(back.subList(s, back.size()));
            front = l1;
            back = l2;
        } else if (3*back.size() < front.size()) {
            ... // code is similar
        }
    }
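
To see the two stacks and balance() working together, here is a minimal runnable sketch. It rests on several assumptions: the class name SimpleDualArrayDeque is hypothetical, java.util.ArrayList plays the role of the stacks (the slides' newStack() factory is not used), and the branch that the slides elide as "code is similar" is filled in by symmetry as one possible version.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical deque built from two array-based stacks glued back-to-back.
class SimpleDualArrayDeque<T> {
    private List<T> front = new ArrayList<>(); // first part of the list, in reverse order
    private List<T> back = new ArrayList<>();  // rest of the list, in order

    int size() {
        return front.size() + back.size();
    }

    T get(int i) {
        return (i < front.size())
                ? front.get(front.size() - i - 1)
                : back.get(i - front.size());
    }

    void add(int i, T x) {
        if (i < front.size()) {
            front.add(front.size() - i, x);
        } else {
            back.add(i - front.size(), x);
        }
        balance();
    }

    T remove(int i) {
        T x = (i < front.size())
                ? front.remove(front.size() - i - 1)
                : back.remove(i - front.size());
        balance();
        return x;
    }

    // Rebuilds both stacks with about n/2 elements each when one is
    // more than three times the size of the other.
    private void balance() {
        int n = size();
        if (3 * front.size() < back.size()) {
            int s = n / 2 - front.size(); // elements to move from back to front
            List<T> l1 = new ArrayList<>(back.subList(0, s));
            Collections.reverse(l1);
            l1.addAll(front);
            List<T> l2 = new ArrayList<>(back.subList(s, back.size()));
            front = l1;
            back = l2;
        } else if (3 * back.size() < front.size()) {
            int s = front.size() - n / 2; // elements to move from front to back
            List<T> l1 = new ArrayList<>(front.subList(s, front.size()));
            List<T> l2 = new ArrayList<>(front.subList(0, s));
            Collections.reverse(l2);
            l2.addAll(back);
            front = l1;
            back = l2;
        }
    }

    public static void main(String[] args) {
        SimpleDualArrayDeque<Integer> q = new SimpleDualArrayDeque<>();
        for (int x = 0; x < 6; x++) q.add(q.size(), x); // use as a queue: add at the back
        StringBuilder out = new StringBuilder();
        while (q.size() > 0) out.append(q.remove(0));   // remove from the front
        System.out.println(out); // 012345
    }
}
```

Using ArrayList for both stacks keeps the sketch short; any list with fast operations at its end would serve equally well.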

DualArrayDeque - analysis

• size(), get(i), and set(i,x) each take constant time

• add(i,x) and remove(i) take time

–O(1 + min{i, size()-i}) + time for balance()

• in the worst case, balance() moves size() elements

–takes O(size()) time

• hopefully this doesn't happen too often

Amortized analysis of balance()

• Suppose balance() is performing rebalancing right now

–Consider the situation right after the last time balance() did some rebalancing.

[Figure: then, the stacks have sizes f0 and b0; now, they have sizes f1 and b1.]

–then: f0 = b0

–now: 3f1 = b1 [approximately]

• Claim:

–front has gotten a lot smaller

    • lots of remove() operations.

–or back has gotten a lot bigger

    • lots of add() operations.

Amortized analysis of balance()

• Claim:

–f0-f1 ≥ α(f1+b1) or b1-b0 ≥ α(f1+b1), for some constant α > 0.

• The total work done by balance() is O(f1+b1)

• The total number of add/remove operations since the last rebalance is at least α(f1+b1)

• The total work done by balance() is proportional to the number of add/remove operations

Summary of DualArrayDeque

• Theorem: Starting with an initially empty DualArrayDeque and performing a sequence of m add/remove operations,

–add(i,x) and remove(i) take time O(1 + min{i, size()-i}) + time for balance()

–the total time taken by balance() is O(m)

• Theorem: Starting with an initially empty DualArrayDeque, any sequence of m pushFront, pushBack, popFront, and popBack operations takes a total of O(m) time

Proof of Claim

[Figure: then, the stacks have sizes f0 and b0 with f0 = b0; now, they have sizes f1 and b1 with 3f1 = b1, that is, f1 = b1/3.]

• Claim: Let w = f1+b1. Then f0-f1 ≥ αw or b1-b0 ≥ αw

• Assume f0-f1 < αw [that is, f0 - αw < f1]. Then

–b1 = 3f1 > 2f1 + f0 - αw = 2f1 + b0 - αw

–b1 - b0 > 2f1 - αw

    = f1 + b1/3 - αw

    > f1/3 + b1/3 - αw

    = w/3 - αw

    = αw [for α = 1/6]

Alternate proof (potential function)

• Define the surplus

–s = |front.size() - back.size()|

• Observe that, just after rebalancing,

–s0 = |f0 - b0| = 0

• Just before the next rebalancing

–s1 = |f1 - b1| = 2f1 = (f1 + b1)/2 [using 3f1 = b1]

• Each add/remove operation increases s by at most 1

• Therefore, the number of add/remove operations since the last rebalancing is at least s1 - s0 = (f1+b1)/2

Summary

• ArrayList: Array-based implementation of a stack

–grow() and shrink()

• ArrayDeque: Array-based implementation of a deque

–grow() and shrink()

–modular arithmetic (circular array)

• DualArrayDeque: Implementation of a deque as two stacks

–balance()

–can use any kind of stack (ArrayList for example)

Pros and Cons

• All these structures offer constant-time access

–get(i) and set(i,x) run in constant time

• Not suitable for real-time systems

–Some individual operations can be very slow

    • grow(), shrink(), balance()

–Unless the maximum size is known in advance

• These can waste a lot of space

–The array a can store as few as a.length/3 elements

–a often stores only a.length/2 elements

Coming Up…

• An array-based stack implementation that is

–real-time [in some languages]

–only uses O(sqrt(size())) space beyond what is needed to store the data

• Using these in a DualArrayDeque gives

–a deque implementation that uses only O(sqrt(size())) space beyond what is needed to store the data