Application of Data Structures


  • 8/3/2019 Application of Data Structures


    Christopher Moh 2005

    Application of Data Structures


    Overview

    • Priority queue structures
        • Heaps
        • Application: Dijkstra's algorithm
    • Cumulative sum data structures on intervals
    • Augmenting data structures with extra info to solve questions


    Priority Queue (PQ) Structures

    • Stores elements in a list by comparing a key field
        • Often has other satellite data
        • For example, when sorting pixels by their R value, we consider R as the key field and G, B as satellite data
    • Priority queues allow us to sort elements by their key field.


    Common PQ operations

    • Create(): creates an empty priority queue
    • Find_Min(): returns the smallest element (by key field)
    • Insert(x): inserts element x (with predefined key field)
    • Delete(x): deletes position x from the queue
    • Change(x, k): changes the key field of position x to k


    Optional PQ operations

    • Union(a, b): combines two PQs a and b
    • Search(k): returns the position of the element in the heap with key value k


    Considerations when implementing a PQ in competition

    • How complicated is it?
        • Is the code likely to be buggy?
    • How fast does it need to be?
        • Does a constant factor also come into the equation?
    • Do I need to store extra data to do a Search?
        • Throughout this presentation, we assume that extra data exists which allows a search in O(1) time. The handling of this extra data is assumed and not covered.


    Linear Array

    • Unsorted array
        • Create, Insert, Change in O(1) time
        • Find_min, Delete in O(n) time
    • Sorted array
        • Create, Find_min in O(1) time
        • Insert, Delete, Change in O(n + log n) = O(n) time


    Binary Heaps

    • The most common structure implemented in a competition setting
        • Efficient for most applications
        • Easy to implement
    • A heap is a structure where the value of a node is less than or equal to the value of each of its children
    • A binary heap is a heap where each node has at most 2 children.


    Array implementation

    • Consider a heap of size nheap in an array BHeap[1..nheap]
      (define BHeap[nheap+1 .. (nheap*2)+1] to be INFINITY for practical reasons)
        • The children of BHeap[x] are BHeap[x*2] and BHeap[x*2+1]
        • The parent of BHeap[x] is BHeap[x/2] (integer division)
        • This gives a nearly uniform binary heap in which the number of levels is O(log n)
    • Some properties with respect to key values: BHeap[x] >= BHeap[x/2], BHeap[x] <= BHeap[x*2], BHeap[x] <= BHeap[x*2+1]
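The index arithmetic above can be sketched in a few lines. This is only an illustration: the helper names `parent`, `children` and `is_min_heap` and the sample data are assumptions, not part of the slides. Index 0 is left unused so that the 1-based formulas carry over unchanged.

```python
# 1-based array heap as on the slide; index 0 is an unused placeholder.
def parent(x):
    return x // 2

def children(x):
    return 2 * x, 2 * x + 1

def is_min_heap(bheap, nheap):
    """Check BHeap[x] >= BHeap[x // 2] for every non-root position x."""
    return all(bheap[x] >= bheap[parent(x)] for x in range(2, nheap + 1))

bheap = [None, 1, 3, 2, 7, 4, 5]   # hypothetical example data
nheap = 6
print(children(2), parent(5), is_min_heap(bheap, nheap))
```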


    PQ Operations on a BHeap

    • We define BTree(x) to be the binary tree rooted at BHeap[x]
    • We define Heapify(x) to be an operation that does the following:
        • Assume: BTree(x*2) and BTree(x*2+1) are binary heaps, but BTree(x) is not necessarily a binary heap
        • Produce: BTree(x) is a binary heap
        • Details of Heapify follow in later slides; for now, we assume Heapify is O(log n)
    • For the rest of the presentation, we assume the variable n refers to nheap


    Operations on a BHeap

    • Create is trivial: O(1) time
    • Find_min:
        1. Return BHeap[1]
      O(1) time
    • Insert (element with key value x):
        1. nheap++
        2. BHeap[nheap] = x
        3. T = nheap
        4. While (T != 1 && BHeap[T] < BHeap[T/2])
            1. Swap (BHeap[T], BHeap[T/2])
            2. T = T / 2
      O(log n) time, as the number of levels is O(log n)
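A minimal Python sketch of the Insert steps, assuming the heap lives in a Python list whose index 0 is an unused placeholder, so that `len(bheap) - 1` plays the role of nheap. The function name `insert` and the sample keys are illustrative only.

```python
def insert(bheap, key):
    """Insert key into a 1-based min-heap stored in a Python list."""
    bheap.append(key)                 # nheap++ and BHeap[nheap] = key
    t = len(bheap) - 1
    # Bubble up while the new key is smaller than its parent.
    while t != 1 and bheap[t] < bheap[t // 2]:
        bheap[t], bheap[t // 2] = bheap[t // 2], bheap[t]
        t //= 2

heap = [None]
for key in [5, 3, 8, 1]:
    insert(heap, key)
print(heap[1])   # Find_min is simply position 1
```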


    Operations on a BHeap

    • ChangeDown (position x, new key value k)
        • Assume: k < existing BHeap[x]
        1. BHeap[x] = k
        2. T = x
        3. While (T != 1 && BHeap[T] < BHeap[T/2])
            1. Swap (BHeap[T], BHeap[T/2])
            2. T = T / 2
        • Complexity: O(log n)
    • This procedure is known as bubbling up the heap


    Operations on a BHeap

    • ChangeUp (position x, new key value k)
        • Assume: k > existing BHeap[x]
        1. BHeap[x] = k
        2. Heapify(x)
        • O(log n), as the complexity of Heapify is O(log n)


    Operations on a BHeap

    • Delete (position x on the heap)
        1. BHeap[x] = BHeap[nheap]
        2. nheap--
        3. Heapify(x)
        4. T = x
        5. While (T != 1 && BHeap[T] < BHeap[T/2])
            1. Swap (BHeap[T], BHeap[T/2])
            2. T = T / 2
        • Complexity is O(log n)
    • Why must I do both Heapify and bubble up?
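One way to see why both repairs are needed is to run Delete on a heap where the last element comes from a different subtree than position x. The sketch below is an assumption-laden illustration (list-based storage with index 0 unused, explicit bounds checks instead of INFINITY sentinels, hypothetical sample data): the replacement key 4 is smaller than its new parent 10, so Heapify does nothing and only the bubble-up restores the heap.

```python
def sift_down(bheap, x):
    """Heapify(x): push bheap[x] down until neither child is smaller."""
    n = len(bheap) - 1
    while True:
        k = x
        for c in (2 * x, 2 * x + 1):
            if c <= n and bheap[c] < bheap[k]:
                k = c
        if k == x:
            return
        bheap[x], bheap[k] = bheap[k], bheap[x]
        x = k

def sift_up(bheap, t):
    """Bubble bheap[t] up while it is smaller than its parent."""
    while t != 1 and bheap[t] < bheap[t // 2]:
        bheap[t], bheap[t // 2] = bheap[t // 2], bheap[t]
        t //= 2

def delete(bheap, x):
    """Remove position x from a 1-based min-heap (bheap[0] is unused)."""
    last = bheap.pop()            # take the last element, nheap--
    if x == len(bheap):           # x was the last position: nothing to repair
        return
    bheap[x] = last
    sift_down(bheap, x)           # needed when the moved key is too large
    sift_up(bheap, x)             # needed when the moved key is too small

heap = [None, 1, 10, 2, 11, 12, 3, 4]   # hypothetical example data
delete(heap, 4)                          # removes the key 11
print(heap[1:])
```

At most one of the two repairs actually moves the key, so the cost stays O(log n).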


    Operations on a BHeap

    • Heapify (position x on the heap)
        1. T = min(BHeap[x], BHeap[x*2], BHeap[x*2+1])
        2. If (T == BHeap[x]) return;
        3. K = position where BHeap[K] = T
        4. Swap (BHeap[x], BHeap[K])
        5. Heapify(K)
    • O(log n), as the maximum number of levels in the heap is O(log n) and Heapify goes through each level at most once
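The recursive Heapify can be written almost line for line if the array is padded with INFINITY sentinels, as suggested on the array implementation slide. The sketch below assumes that padding (positions nheap+1 through 2*nheap+1) and uses `math.inf` as INFINITY; the sample keys are hypothetical.

```python
import math

INFINITY = math.inf

def heapify(bheap, x):
    """Recursive Heapify on a 1-based array padded with INFINITY sentinels."""
    t = min(bheap[x], bheap[2 * x], bheap[2 * x + 1])
    if t == bheap[x]:
        return
    k = 2 * x if bheap[2 * x] == t else 2 * x + 1   # position holding t
    bheap[x], bheap[k] = bheap[k], bheap[x]
    heapify(bheap, k)

keys = [9, 2, 3, 4, 5]            # only the root is out of place
nheap = len(keys)
bheap = [None] + keys + [INFINITY] * (nheap + 1)
heapify(bheap, 1)
print(bheap[1:nheap + 1])
```

The sentinels remove the need for bounds checks: a missing child compares as INFINITY and is never chosen.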


    BHeap Operations: Summary

    • Create, Find_min in O(1) time
    • Change (includes both ChangeUp and ChangeDown), Insert, and Delete are O(log n) time
    • Union operations are how long?
        • By insertion: O(n log n) union
        • By Heapify: O(n) union


    Corollary: Heapsort

    • We can convert an unsorted array to a heap using Heapify (why does this work?):
        1. For (i = n/2; i >= 1; i--)
            1. Heapify(i)
    • We can then return a sorted list (list initially empty):
        1. For (i = 1; i <= n; i++)
            1. Append Find_min to the list
            2. Delete (position 1)
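A sketch of heapsort along these lines, assuming that the second loop repeatedly takes Find_min and deletes position 1. The bottom-up loop works because positions above n/2 are leaves, which are already heaps, so every Heapify(i) call finds both subtrees in heap order. Function names and test data are illustrative.

```python
def heapify(a, x, n):
    """Sift a[x] down within the 1-based heap a[1..n]."""
    while True:
        k = x
        for c in (2 * x, 2 * x + 1):
            if c <= n and a[c] < a[k]:
                k = c
        if k == x:
            return
        a[x], a[k] = a[k], a[x]
        x = k

def heapsort(values):
    a = [None] + list(values)         # index 0 is an unused placeholder
    n = len(values)
    for i in range(n // 2, 0, -1):    # build the heap bottom-up
        heapify(a, i, n)
    out = []
    while n >= 1:
        out.append(a[1])              # Find_min
        a[1] = a[n]                   # Delete(position 1)
        n -= 1
        heapify(a, 1, n)
    return out

print(heapsort([5, 1, 4, 2, 3]))
```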


    Binomial Trees

    • Define the binomial tree B(k) as follows:
        • B(0) is a single node
        • B(n), n != 0, is formed by merging two B(n-1) trees in the following way: the root of the B(n) tree is the root of one of the B(n-1) trees, and the (new) leftmost child of this root is the root of the other B(n-1) tree.
    • Within the tree, the heap property holds, i.e. the key field of any node is less than or equal to the key fields of all its children.


    Properties of Binomial Trees

    • The number of nodes in B(k) is exactly 2^k.
    • The height of B(k), counted in levels, is exactly (k + 1)
    • For any tree B(k):
        • The root of B(k) has exactly k children
        • If we take the children of the root of B(k) from left to right, they form the roots of a B(k-1), B(k-2), ..., B(0) tree in that order


    Binomial Heaps

    • Binomial heaps are a forest of binomial trees with the following properties:
        • All the binomial trees are of different sizes
        • The binomial trees are ordered (from left to right) by increasing size
    • Since the size of B(k) is 2^k, the binomial tree B(k) exists in a binomial heap of n nodes iff the bit representing 2^k is 1 in the binary representation of n
        • For example: 13 (decimal) = 1101 (binary), so the binomial heap with 13 nodes consists of the binomial trees B(0), B(2), and B(3).
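The correspondence between the set bits of n and the trees in the heap can be checked directly. The function name `binomial_tree_orders` is an illustrative assumption.

```python
def binomial_tree_orders(n):
    """Orders k of the trees B(k) present in a binomial heap of n nodes."""
    return [k for k in range(n.bit_length()) if (n >> k) & 1]

print(binomial_tree_orders(13))   # 13 = 1101 in binary
```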


    Binomial Heap Implementation

    • Each node will store the following data:
        • Key field
        • Pointers (if non-existent, points to NIL) to
            • Parent
            • Next sibling (ordered left to right; a sibling must have the same parent); for roots of binomial trees, next sibling points to the root of the next binomial tree
            • Leftmost child
        • Number of children, in the field degree
        • Any other data that might be useful for the program
    • The binomial heap is represented by a head pointer that points to the root of the smallest binomial tree (which is the leftmost binomial tree)


    Operations on Binomial Trees

    • Link (h1, h2)
        • Links two binomial trees with roots h1 and h2 of the same order k to form a new binomial tree of order (k+1)
        • We assume h1->key < h2->key, which implies that h1 is the root of the new tree
        1. T = h1->leftchild
        2. h1->leftchild = h2
        3. h2->parent = h1
        4. h2->next_sibling = T
        5. h1->degree = h1->degree + 1
        • O(1) time
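A minimal sketch of Link using the node fields from the implementation slide. The `Node` class is an assumed Python rendering of that record, and the degree increment is included on the assumption that the degree field must stay in sync with the number of children.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.parent = None
        self.leftchild = None
        self.next_sibling = None
        self.degree = 0

def link(h1, h2):
    """Link two trees of equal order; h1 (the smaller key) becomes the root."""
    if h1.degree != h2.degree or h1.key > h2.key:
        raise ValueError("link expects equal orders and h1.key <= h2.key")
    h2.next_sibling = h1.leftchild   # old leftmost child becomes h2's sibling
    h2.parent = h1
    h1.leftchild = h2                # h2 is the new leftmost child
    h1.degree += 1
    return h1

# Build a B(2) from four single nodes (hypothetical keys).
a = link(Node(1), Node(5))
b = link(Node(2), Node(7))
root = link(a, b)
print(root.key, root.degree, root.leftchild.key)
```

After the last call, the children of the root from left to right are a B(1) rooted at 2 and a B(0) holding 5, matching the property that the children form B(k-1), ..., B(0).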


    Operations on binomial heaps

    • Create: create a new binomial heap with one node (key field set)
        • Set Parent, Leftchild, Next sibling to NIL
        • O(1) time
    • Find_min
        1. X = head, min = INFINITY
        2. While (X != NIL)
            1. If (X->key < min) min = X->key
            2. X = X->next_sibling
        3. Return min
        • O(log n) time, as there are at most log n binomial trees (log n bits)


    More Operations

    • Merge (h1, h2, L)
        • Given binomial heaps with head pointers h1 and h2, create a list L of all the binomial trees of h1 U h2 arranged in ascending order of size
        • For any order k, there may be zero, one, or two binomial trees of order k in this list.


    More Operations

    • Merge (h1, h2, L)
        • Assume that NIL is a node of infinitely large order, so that a non-NIL root is always chosen before NIL
        1. L = empty
        2. While (h1 != NIL || h2 != NIL)
            1. If (h1->degree < h2->degree)
                1. Append the (binomial) tree with root h1 to L
                2. h1 = h1->next_sibling
            2. Else
                1. Apply the above steps to h2 instead


    More Operations

    • Union (h1, h2)
        • The fundamental operation involving binomial heaps
        • Takes two binomial heaps with head pointers h1 and h2 and creates a new binomial heap of the union of h1 and h2


    More Operations

    • Union (h1, h2)
        1. Start with an empty binomial heap
        2. Merge (h1, h2, L)
        3. Go by increasing k in the list L until L is empty
            1. If there are exactly one or exactly three (how can this happen?) binomial trees of order k in L, append one binomial tree of order k to the binomial heap and remove that tree from L
            2. If there are two trees of order k, remove both trees, use Link to form a tree of order (k+1) and prepend this tree to L
    • Union is O(log n)
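The Merge and Union steps can be sketched as below. To keep the example short it departs from the pointer layout of the slides: a heap is a Python list of roots sorted by degree, and each node keeps a `children` list instead of leftchild and next_sibling pointers. These representation choices, and the helper names `insert` and `find_min`, are assumptions made for illustration.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.degree = 0
        self.children = []   # leftmost child first (stands in for sibling pointers)

def link(h1, h2):
    """Link two trees of the same order; the smaller key becomes the root."""
    if h2.key < h1.key:
        h1, h2 = h2, h1
    h1.children.insert(0, h2)
    h1.degree += 1
    return h1

def merge(h1, h2):
    """Merge two root lists into one list in ascending order of degree."""
    out, i, j = [], 0, 0
    while i < len(h1) or j < len(h2):
        if j == len(h2) or (i < len(h1) and h1[i].degree < h2[j].degree):
            out.append(h1[i]); i += 1
        else:
            out.append(h2[j]); j += 1
    return out

def union(h1, h2):
    pending = merge(h1, h2)
    heap = []
    while pending:
        k = pending[0].degree
        same = 0                         # number of trees of order k at the front
        while same < len(pending) and pending[same].degree == k:
            same += 1
        if same % 2 == 1:                # one or three trees: keep one of them
            heap.append(pending.pop(0))
        else:                            # two trees: link and prepend the result
            a, b = pending.pop(0), pending.pop(0)
            pending.insert(0, link(a, b))
    return heap

def insert(heap, key):
    return union(heap, [Node(key)])

def find_min(heap):
    return min(root.key for root in heap)

heap = []
for key in [7, 3, 9, 1, 8, 2, 6, 4, 5, 10, 13, 12, 11]:
    heap = insert(heap, key)
print([root.degree for root in heap], find_min(heap))
```

Three trees of order k appear when both inputs contain a B(k) and the previous step carried a freshly linked B(k) into the list, much like a carry in binary addition.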


    More Operations

    • Inserting a new node with key field set
        • Create a new binomial heap with that one node
        • Union (existing heap with head h, new heap)
        • O(log n) time
    • ChangeDown (node at position x, new value)
        • Decreasing the key value of a node
        • Same idea as binary heap: bubble up the binomial tree containing this node (exchange only key fields and satellite data! What's the complexity if you physically change the node?)
        • O(log n) time


    More Operations

    • Delete (node at position x)
        • Deleting position x from the heap
        1. ChangeDown(x, -INFINITY)
            • Now x is at the root of its binomial tree
            • Suppose that the binomial tree is of order k
            • Recall that the children of the root of the binomial tree, from right to left, are binomial trees of order 0, 1, 2, 3, 4, ..., k-1
        2. Form a new binomial heap with the children of the root of this binomial tree as the roots in the new binomial heap
        3. Remove the original binomial tree from the original binomial heap
        4. Union (original heap, new heap)
        • O(log n) complexity


    More Operations

    • ChangeUp (node at position X, new value)
        1. Delete (X)
        2. Insert (new value)
        • O(log n) time


    Summary: Binomial Heaps

    • Create in O(1) time
    • Union, Find_min, Delete, Insert, and Change operations take O(log n) time
    • In general, because they are more complicated, in competition it is far more prudent (saves time coding and debugging) to use a binary heap instead
        • Unless there are MANY Union operations


    Application of heaps: Dijkstra

    • The following describes how Dijkstra's algorithm can be coded with a binary heap
    • Initializing phase:
        1. Let n be the number of nodes
        2. Create a heap of size n, all key fields initialized to INFINITY
        3. Change_val (s, 0) where s is the source node


    Running of Dijkstra's algorithm

    1. While (heap is not empty)
        1. X = node corresponding to the find_min value
        2. Delete (position of X in heap = 1)
        3. For all nodes k that are adjacent to X
            1. If (cost[X] + distance[X][k] < cost[k])
                1. ChangeDown (position of k in heap, cost[X] + distance[X][k])
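The two phases can be sketched with a binary heap of node ids keyed by `cost`, plus a `pos` array that records where each node currently sits in the heap. That array is one possible form of the extra data assumed earlier for O(1) search. The adjacency-list input format, the function name `dijkstra`, the 0-based node ids and the sample graph are all assumptions, and edge weights are taken to be non-negative.

```python
import math

def dijkstra(n, adj, s):
    """Shortest distances from s; adj[u] is a list of (v, weight) pairs."""
    cost = [math.inf] * n
    cost[s] = 0
    heap = [None] + list(range(n))   # 1-based heap of node ids, keyed by cost
    pos = list(range(1, n + 1))      # pos[v] = position of v in heap, 0 once deleted

    def swap(i, j):
        heap[i], heap[j] = heap[j], heap[i]
        pos[heap[i]], pos[heap[j]] = i, j

    def bubble_up(t):
        while t != 1 and cost[heap[t]] < cost[heap[t // 2]]:
            swap(t, t // 2)
            t //= 2

    def heapify(x):
        size = len(heap) - 1
        while True:
            k = x
            for c in (2 * x, 2 * x + 1):
                if c <= size and cost[heap[c]] < cost[heap[k]]:
                    k = c
            if k == x:
                return
            swap(x, k)
            x = k

    bubble_up(pos[s])                # initial change of the source key to 0
    while len(heap) > 1:
        x = heap[1]                  # Find_min
        swap(1, len(heap) - 1)       # Delete(position 1)
        heap.pop()
        pos[x] = 0
        if len(heap) > 1:
            heapify(1)
        if cost[x] == math.inf:
            break                    # everything left is unreachable
        for k, w in adj[x]:
            if pos[k] and cost[x] + w < cost[k]:
                cost[k] = cost[x] + w
                bubble_up(pos[k])    # ChangeDown(position of k, new cost)
    return cost

# Hypothetical graph: node 4 is unreachable from node 0.
adj = [[(1, 4), (2, 1)], [(3, 1)], [(1, 2), (3, 5)], [], []]
print(dijkstra(5, adj, 0))
```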


    Analysis of running time

    • At most n nodes are deleted: O(n log n)
    • Let m be the number of edges. Each edge is relaxed at most once: O(m log n)
    • Total running time: O([m+n] log n)
    • This is faster than using a basic array list, which takes O(n^2), unless the graph is very dense, in which case m is about O(n^2), which leads to a running time of O(n^2 log n)


    Cumulative Sum on Intervals

    • Problem: We have a line that runs from x coordinate 1 to x coordinate N. At x coordinate X [X an integer between 1 and N], there is g(X) gold. Given an interval [a,b], how much gold is there between a and b?
    • How efficiently can this be done if we dynamically change the amount of gold and the interval [a,b] keeps changing?


    Cumulative Sum Array

    • Let us define C(0) = 0, and C(x) = C(x-1) + g(x), where g(x) is the amount of gold at position x
    • C(x) then defines the total amount of gold from position 1 to position x
    • The amount of gold in interval [a,b] is simply C(b) - C(a-1)
        • For any change in a or b, we can perform the update in O(1) time
    • However, if we change g(x), we will have to change C(x), C(x+1), C(x+2), ..., C(N)
        • Any change in gold results in an update in O(N) time
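A short sketch of the cumulative sum array, assuming g is stored in a Python list with an unused placeholder at index 0 so that positions run from 1 to N. The function names and gold amounts are illustrative.

```python
def build_cumulative(g):
    """g[1..N] holds the gold (g[0] is unused); returns C[0..N]."""
    c = [0] * len(g)
    for x in range(1, len(g)):
        c[x] = c[x - 1] + g[x]
    return c

def gold_in_interval(c, a, b):
    return c[b] - c[a - 1]

g = [0, 3, 1, 4, 1, 5]                 # hypothetical gold at positions 1..5
c = build_cumulative(g)
print(gold_in_interval(c, 2, 4))       # g(2) + g(3) + g(4)
```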


    Cumulative Sum Tree

    • We can use the binary representation of any number to come up with a cumulative sum tree
    • For example, let us take 13 (decimal) = 1101 (binary)
    • The cumulative sum g(1) + g(2) + ... + g(13) can be represented as the sum of:
        • g(1) + g(2) + ... + g(8) [ 8 elements ]
        • g(9) + g(10) + ... + g(12) [ 4 elements ]
        • g(13) [ 1 element ]
    • Notice that the number of elements in each case represents a bit that is 1 in the binary representation of the number


    Cumulative Sum Tree

    • Another example: C(19)
    • 19 (decimal) is 10011 (binary)
    • C(19) is the sum of the following:
        • g(1) + g(2) + ... + g(16) [ 16 elements ]
        • g(17) + g(18) [ 2 elements ]
        • g(19) [ 1 element ]


    Cumulative Sum Tree

    • Let us define C2(x) to be the sum g(x) + g(x-1) + ... + g(p + 1), where p is the number with the same binary representation as x except that the least significant set bit of x (the rightmost bit of x that is 1) is 0
    • Examples of x and the corresponding p:
        • x = 6 [110], p = 4 [100]
        • x = 13 [1101], p = 12 [1100]
        • x = 16 [10000], p = 0 [00000]


    Cumulative Sum Tree

    • If we want to find the cumulative sum C(x) = g(1) + g(2) + ... + g(x), we can trace through the values of C2 using the binary representation of x
    • Examples:
        • C(13) = C2(8) + C2(8+4) + C2(8+4+1)
        • C(16) = C2(16)
        • C(21) = C2(16) + C2(16+4) + C2(16+4+1)
        • C(99) = C2(64) + C2(64+32) + C2(64+32+2) + C2(64+32+2+1)
    • This allows us to find C(x) in O(log x) time
        • Hence the amount of gold in interval [a,b] = C(b) - C(a-1) can be found in O(log N) time, which implies updates of a and b can be done in O(log N)
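The definition of C2 and the query can be sketched as follows. Here C2 is built straight from its definition, which is slow but easy to check, and `x & (x - 1)` is used to clear the lowest set bit of x. The function names and the choice g(x) = x are assumptions for illustration.

```python
def build_c2(g):
    """C2[x] = g[p+1] + ... + g[x], where p is x with its lowest set bit cleared."""
    n = len(g) - 1                      # g[0] is an unused placeholder
    c2 = [0] * (n + 1)
    for x in range(1, n + 1):
        p = x & (x - 1)
        c2[x] = sum(g[p + 1 : x + 1])
    return c2

def cumulative(c2, x):
    """C(x) = g(1) + ... + g(x), using one C2 entry per set bit of x."""
    total = 0
    while x > 0:
        total += c2[x]
        x &= x - 1                      # drop the lowest set bit: 13 -> 12 -> 8 -> 0
    return total

g = [0] + list(range(1, 22))            # hypothetical data: g(x) = x for x = 1..21
c2 = build_c2(g)
print(cumulative(c2, 13), cumulative(c2, 9) - cumulative(c2, 4))
```

The loop visits the same C2 entries as the slide's examples, only from the last term to the first.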


    Cumulative Sum Tree

    • What happens when we change g(x)?
        • If g(x) is changed, we only need to update C2(y) where C2(y) covers g(x)
    • We can go through all necessary C2(y) in the following way:
        1. While (x <= N)
            1. Update C2(x)
            2. x = x + (least significant set bit of x)


    Cumulative Sum Tree

    • Examples [binary representation in brackets]
        • Change to g(5) [ 101 ]: update C2(5), C2(6), C2(8), C2(16) and all C2(power of 2 > 16)
        • Change to g(13) [ 1101 ]: update C2(13), C2(14), C2(16), and all C2(power of 2 > 16)
        • Change to g(35) [ 100011 ]: update C2(35), C2(36), C2(40), C2(48), C2(64), and all C2(power of 2 > 64)
    • We can implement a cumulative sum tree very simply, by using a linear array to store the values of C2.
    • Can we extend a cumulative sum tree to 2 or more dimensions?
        • See IOI 2001 Day 1 Question 1
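The update sequences in these examples are what one gets by repeatedly adding the lowest set bit of x, which is the rule assumed in the sketch below (`x & -x` isolates that bit). The class and function names are illustrative, and positions are 1-based.

```python
def covering_positions(x, n):
    """Positions y (1-based) whose C2(y) covers g(x), for a tree over 1..n."""
    out = []
    while x <= n:
        out.append(x)
        x += x & -x                  # add the lowest set bit of x
    return out

class CumulativeSumTree:
    """C2 stored in a plain linear array, as suggested on the slide."""

    def __init__(self, n):
        self.n = n
        self.c2 = [0] * (n + 1)      # index 0 is unused

    def add(self, x, delta):
        """Change g(x) by delta."""
        for y in covering_positions(x, self.n):
            self.c2[y] += delta

    def cumulative(self, x):
        total = 0
        while x > 0:
            total += self.c2[x]
            x -= x & -x
        return total

print(covering_positions(5, 16))
tree = CumulativeSumTree(16)
tree.add(5, 7)
tree.add(13, 2)
print(tree.cumulative(12), tree.cumulative(16))
```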


    Sum of Intervals Tree

    • Another way to solve the question is to use a Sum of Intervals binary tree
    • Each node in the tree is represented by (L, R), and the value of (L,R) is g(L) + g(L+1) + ... + g(R)
        • The root of the tree has L = 1 and R = N
        • Every leaf has L = R
        • Every non-leaf has children (L, [L+R]/2) [left child] and ([L+R]/2+1, R) [right child]
    • The number of nodes in the tree is less than 2*N, i.e. O(N) [ why? ]
    • In an implementation, every node should have pointers to its children and its parent


    Sum of Intervals Tree

    • How to find C(x) = g(1) + g(2) + ... + g(x)?
    • We trace from the root downwards
        1. L = 1, R = N, C = 0
        2. While (L != R)
            1. M = (L + R) / 2
            2. If (M < x)
                1. C += value of the left child (L,M)
                2. Set L and R to the right child of the current node
            3. Else
                1. Set L and R to the left child of the current node
        3. C += value at (L,R) [ or (L,L) or (R,R) as L = R ]
    • Time complexity: O(log N)
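A sketch of the downward trace, under the reading that when the midpoint M lies left of x the whole left child (L,M) is added and the search continues in the right child, and otherwise the search continues in the left child. For brevity the tree is stored as a dictionary keyed by (L, R) rather than with child pointers; that storage choice and the function names are assumptions.

```python
def build(g, lo, hi, value):
    """Fill value[(lo, hi)] = g[lo] + ... + g[hi] for every node under (lo, hi)."""
    if lo == hi:
        value[(lo, hi)] = g[lo]
    else:
        mid = (lo + hi) // 2
        value[(lo, hi)] = build(g, lo, mid, value) + build(g, mid + 1, hi, value)
    return value[(lo, hi)]

def cumulative(value, n, x):
    """C(x) = g(1) + ... + g(x) for 1 <= x <= n."""
    lo, hi, c = 1, n, 0
    while lo != hi:
        mid = (lo + hi) // 2
        if mid < x:
            c += value[(lo, mid)]    # the whole left child lies within [1, x]
            lo = mid + 1             # continue in the right child
        else:
            hi = mid                 # continue in the left child
    return c + value[(lo, hi)]       # the leaf (x, x)

g = [0, 3, 1, 4, 1, 5, 9, 2]         # hypothetical gold at positions 1..7
value = {}
build(g, 1, 7, value)
print([cumulative(value, 7, x) for x in range(1, 8)])
```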


    Sum of Intervals Tree

    • What happens when g(x) is changed?
    • Trace from (x,x) upwards to the root
        1. Let L = R = x
        2. While (L,R) is not the root
            1. Update the value of (L,R)
            2. Set (L,R) to the parent of (L,R)
        3. Update the root
    • Complexity of O(log N)
    • Hence all updates of interval [a,b] and g(x) can be done in O(log N) time
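The upward trace can be sketched with a parent map standing in for the parent pointers. As before, nodes are identified by their (L, R) pair and kept in dictionaries; this layout and the names `build` and `change` are assumptions made for the example.

```python
def build(g, lo, hi, value, parent):
    """Fill value[(lo, hi)] and record each child's parent."""
    if lo == hi:
        value[(lo, hi)] = g[lo]
    else:
        mid = (lo + hi) // 2
        for child in ((lo, mid), (mid + 1, hi)):
            parent[child] = (lo, hi)
        value[(lo, hi)] = (build(g, lo, mid, value, parent)
                           + build(g, mid + 1, hi, value, parent))
    return value[(lo, hi)]

def change(value, parent, x, new_gold):
    """Set g(x) = new_gold and repair every node on the path to the root."""
    delta = new_gold - value[(x, x)]
    node = (x, x)
    while node is not None:
        value[node] += delta
        node = parent.get(node)      # the root has no parent, so the loop ends there

g = [0, 3, 1, 4, 1, 5]               # hypothetical gold at positions 1..5
value, parent = {}, {}
build(g, 1, 5, value, parent)
change(value, parent, 3, 10)
print(value[(1, 5)], value[(1, 3)], value[(4, 5)])
```

Only the nodes whose interval contains x are touched, one per level of the tree.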


    Augmenting Data Structures

    • It is often useful to change the data structure in some way, by adding additional data in each node or changing what each node represents.
    • This allows us to use the same data structure to solve other problems
        • For example, we can use so-called interval trees to solve not just cumulative sum problems
        • We can use properties of elements in the interval (L,R) that are related to L and R.


    Other data structures

    Balanced (and unbalanced) binary trees

    Red-Black trees

    2-3-4 trees

    Splay trees

    Suffix Trees

    Fibonacci Heaps