• Binary search trees
• Heaps
• Single source shortest paths
Prepared by- Jatinder Paul, Shraddha Rumade
Binary Trees
A binary tree is a tree that is either empty, or one in which every node has no children, has only a left child, has only a right child, or has both a left and a right child.
In other words, in a binary tree each node can have zero, one, or two children.
A complete binary tree is a special case of a binary tree in which all the levels, except perhaps the last, are full; on the last level, any missing nodes are to the right of all the nodes that are present.
Fig. A complete binary tree
Binary Search Trees
Binary Search Tree (BST) is a prominent data structure, used in many systems programming applications for representing and managing dynamic sets.
Assuming k represents the key of a given node, a BST is a binary tree with the following property:
all nodes in the left subtree of the node have keys smaller than k, and
all nodes in the right subtree of the node have keys larger than k.
BSTs can be used to build:
Dictionaries
Priority Queues
Binary Search Trees (Contd.)
Heap elements are stored in an array, since a heap is filled level by level. The levels of a BST, in contrast, fill dynamically and unpredictably, so we use linked nodes to represent it.
A max-heap gives fast access only to its maximum element, whereas a BST is useful for finding both the minimum and the maximum element.
The highest-valued element in a BST can be found by traversing from the root to the right all along, until a node with no right link is found (we can call that the rightmost element in the BST).
The lowest-valued element in a BST can be found by traversing from the root to the left all along, until a node with no left link is found (we can call that the leftmost element in the BST).
In other words, the leftmost node holds the minimum value and the rightmost node holds the maximum value.
If the tree is balanced, finding the minimum or maximum element takes O(log n) time.
A BST maintains sorted order in the presence of insertions and deletions.
Inorder Traversal
The inorder traversal of a binary tree is defined as follows: traverse the left subtree, visit the root, then traverse the right subtree.
Inorder-Traversal(x)
1. if x ≠ NIL
2. then Inorder-Traversal(left[x])
3. print key[x]
4. Inorder-Traversal(right[x])
The above algorithm takes linear time. The inorder procedure is called recursively twice for each element (once for its left child and once for its right child), and the element is visited between the two calls. Therefore, the traversal time is Θ(n).
Sorting Using BST
Inorder traversal of a binary search tree always gives a sorted sequence of the values. This is a direct consequence of the BST property.
Given a set of unordered elements, the following method can be used to sort the elements:
construct a binary search tree whose keys are those elements, and then perform an inorder traversal of this tree.
If the data is already stored in a binary search tree, only the traversal is needed.
BSTSort(A)
1. for each element in the array do
2. Insert element in the BST // each insertion takes O(log n) time in a balanced tree
3. Inorder-Traversal(root) // takes O(n) time
The total running time of BSTSort(A) is O(n log n), the same as HeapSort.
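BSTSort can be sketched as follows; this is an illustrative version under the same assumptions as above (a simple unbalanced BST built by repeated insertion), with names of my own choosing.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Standard BST insertion; duplicates go to the right."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def inorder(root, out):
    """Left subtree, root, right subtree -- appends keys in sorted order."""
    if root is not None:
        inorder(root.left, out)
        out.append(root.key)
        inorder(root.right, out)

def bst_sort(a):
    root = None
    for x in a:
        root = insert(root, x)  # O(log n) per insert on a balanced tree
    out = []
    inorder(root, out)          # O(n)
    return out

print(bst_sort([2, 3, 8, 4, 5, 7, 6, 1, 2.5, 2.4]))
# expected: [1, 2, 2.4, 2.5, 3, 4, 5, 6, 7, 8]
```

Note that on already-sorted input this simple version degenerates to a chain and O(n^2) total time; the O(n log n) bound assumes the tree stays balanced.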
Example: Sorting Using BST
Input sequence: 2 3 8 4 5 7 6 1 2.5 2.4
Step 1: Create a binary search tree from the input sequence by inserting the elements in the given order (2 becomes the root).
[Figure, repeated over three slides: the resulting BST, with the inorder traversal proceeding node by node.]
Step 2: Perform Inorder-Traversal, which visits the keys in ascending order.
Sorted array: 1 2 2.4 2.5 3 4 5 6 7 8
Binary Search Trees (cont.)
Search is straightforward in a BST. Start with the root and keep moving left or right using the BST property. If the key we are seeking is present, this search procedure will lead us to the key. If the key is not present, we end up in a null link.
The running time of the search operation is O(h), where
h = log n for a balanced binary tree, and
h = n in the worst case, for an unbalanced tree that resembles a linear chain of n nodes.
For example, a balanced binary search tree with n = 1,000 elements needs about 10 comparisons for a search; with n = 1,000,000 it needs about 20. However, if the binary search tree is unbalanced and elongated, a search operation takes longer.
Insertion in a BST is also a straightforward operation. If we need to insert an element x, we first search for x. If x is present, there is nothing to do. If x is not present, the search ends in a null link, and it is at the position of this null link that x is inserted.
Red-black trees are a variation of binary search trees to ensure that the tree is balanced. Height is O (log n), where n is the number of nodes.
Heaps
Definition: A heap is a complete binary tree with the condition that every node (except the root) must have a value less than or equal to that of its parent.
A binary tree of height h is complete iff it is empty, or
its left sub-tree is complete of height h-1 and its right sub-tree is completely full of height h-2, or
its left sub-tree is completely full of height h-1 and its right sub-tree is complete of height h-1.
A complete tree is filled from the left: all the leaves are on the same level or on two adjacent ones, and all nodes at the lowest level are as far to the left as possible. (Figure 1)
Heap property:
The value of every parent node is greater than or equal to the values of its child nodes, i.e. for every node i
key(parent(i)) ≥ key(i)
Why do we need heaps? Heaps maintain a set of numbers in a dynamic setting. The basic data structures we have used so far have drawbacks:
Arrays - searching takes O(n) time.
Sorted arrays - fast searching, but expensive insertion.
Linked lists - searching is complex; extracting the min or max node is not a trivial operation.
In a max-heap, insertion takes O(log n) time, the maximum can be read in O(1) time and extracted in O(log n) time, while finding the minimum requires scanning the roughly n/2 leaves, i.e. O(n) time.
Heap representation: A heap is represented as an array A with two attributes:
length[A] - number of elements in the array,
heap-size[A] - number of elements in the heap, with heap-size[A] ≤ length[A].
In an array A, the root of the heap resides in A[1]. For a node with index i:
index of its parent: PARENT(i) = ⌊i/2⌋
index of its left child: LEFT_CHILD(i) = 2i
index of its right child: RIGHT_CHILD(i) = 2i + 1
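The index arithmetic above is small enough to state directly in code. This sketch follows the slides' 1-based convention; note that Python lists are 0-based, so a 0-based variant is noted in a comment.

```python
def parent(i):
    """1-based heap index of the parent of node i: floor(i/2)."""
    return i // 2

def left_child(i):
    """1-based index of the left child of node i."""
    return 2 * i

def right_child(i):
    """1-based index of the right child of node i."""
    return 2 * i + 1

# 0-based equivalents would be: parent (i - 1) // 2,
# children 2*i + 1 and 2*i + 2.
```

A common trick, used in the examples below, is to store the heap in a Python list with index 0 unused so the 1-based formulas apply unchanged.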
Index: 1  2  3  4  5  6  7  8  9  10
Value: 16 11 9  10 5  6  8  1  2  4
Array representation of the heap in Figure 1
Minimum-Maximum nodes in a Heap
The height of a heap with n elements is h = ⌊log n⌋.
The minimum number of elements occurs when the heap has just one node at the lowest level. The levels above the lowest level then form a completely full binary tree of height h-1 with 2^h - 1 nodes. Hence the minimum number of nodes possible in a heap of height h is 2^h.
Clearly a heap of height h has the maximum number of elements when its lowest level is completely filled. In this case the heap is a completely full binary tree of height h and hence has 2^(h+1) - 1 nodes.
[Figure: heap levels containing 2^0, 2^1, 2^2, ..., 2^h nodes]
Heap Algorithms
HEAPIFY
HEAPIFY is an important subroutine for maintaining the heap property. Given a node i in the heap with children l and r, each sub-tree rooted at l and r is assumed to be a heap, but the sub-tree rooted at i may violate the heap property [key(i) < key(l) OR key(i) < key(r)]. HEAPIFY lets the value of the parent node "float" down so that the sub-tree rooted at i satisfies the heap property.
Algorithm: HEAPIFY(A, i)
1. l ← LEFT_CHILD(i)
2. r ← RIGHT_CHILD(i)
3. if l ≤ heap_size[A] and A[l] > A[i]
4. then largest ← l
5. else largest ← i
6. if r ≤ heap_size[A] and A[r] > A[largest]
7. then largest ← r
8. if largest ≠ i
9. then exchange A[i] ↔ A[largest]
10. HEAPIFY(A, largest)
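The pseudocode above can be sketched in Python on a 1-based array (index 0 left unused). As an assumed simplification, heap_size is passed as a parameter rather than stored as an attribute of A.

```python
def heapify(A, i, heap_size):
    """Float A[i] down until the subtree rooted at i is a max-heap.

    Assumes the subtrees rooted at the children of i already satisfy
    the heap property. A[0] is unused (1-based indexing).
    """
    l, r = 2 * i, 2 * i + 1
    largest = l if l <= heap_size and A[l] > A[i] else i
    if r <= heap_size and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]  # exchange with larger child
        heapify(A, largest, heap_size)
```

For instance, with A = [None, 16, 1, 13, 10, 5, 9, 3, 2, 8, 4], calling heapify(A, 2, 10) floats the value 1 down past 10 and 8, leaving [None, 16, 10, 13, 8, 5, 9, 3, 2, 1, 4].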
Heapify Example
[Figure, repeated over four slides: HEAPIFY is applied at the node holding 1 in the array A = 16 1 13 10 5 9 3 2 8 4, which violates the heap property. The value 1 is first exchanged with its larger child 10, then with 8, giving A = 16 10 13 8 5 9 3 2 1 4.]
Running Time of Heapify
Fixing the relation between i (a node), l (the left child of i) and r (the right child of i) takes Θ(1) time. Let the sub-heap rooted at i have n elements. The number of elements in the subtree rooted at l or r is, in the worst case, 2n/3, i.e. when the last level is half full.
Mathematically,
T(n) ≤ T(2n/3) + Θ(1)
Applying the Master Theorem (Case 2), we can solve the above to
T(n) = O(log n)
Alternatively, in the worst case the algorithm walks down a heap of height h = ⌊log n⌋. Thus the running time of the algorithm is O(log n).
Algorithm to build a Heap
This procedure builds a heap out of the array by repeated calls to the HEAPIFY algorithm.
BUILD_HEAP(A)
1. heap_size[A] ← length[A]
2. for i ← ⌊length[A]/2⌋ downto 1 do
3. HEAPIFY(A, i)
The elements at indices after ⌊length[A]/2⌋ up to n are leaf nodes, hence in line 2 we apply HEAPIFY only to the nodes from ⌊length[A]/2⌋ down to 1.
For the example heap of 10 elements above, ⌊length[A]/2⌋ = 5, so the 6th node onwards are all leaf nodes.
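A runnable sketch of BUILD_HEAP, again on a 1-based array with index 0 unused; heapify is repeated here so the example is self-contained.

```python
def heapify(A, i, heap_size):
    """Float A[i] down to restore the max-heap property (1-based array)."""
    l, r = 2 * i, 2 * i + 1
    largest = l if l <= heap_size and A[l] > A[i] else i
    if r <= heap_size and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        heapify(A, largest, heap_size)

def build_heap(A):
    """Turn A[1..n] into a max-heap, skipping the leaves."""
    heap_size = len(A) - 1                 # index 0 is unused
    for i in range(heap_size // 2, 0, -1): # nodes after n/2 are leaves
        heapify(A, i, heap_size)
```

For example, build_heap on [None, 4, 1, 3, 2, 16, 9, 10, 14, 8, 7] brings 16 to the root.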
Running Time of Build_Heap
We can picture the heap by levels: with the root at level 0, level i contains 2^i nodes, and HEAPIFY called on a node at level i does work proportional to the h - i levels below it, where h = log n.
Total work done = ∑_{i=0}^{log n} 2^i · (log n - i)
Substituting j = log n - i, we get
Total work done = ∑_{j=0}^{log n} 2^(log n - j) · j = ∑_{j=0}^{log n} (2^log n / 2^j) · j = n ∑_{j=0}^{log n} j / 2^j = O(n)
since the series ∑ j/2^j converges to a constant. Thus the running time of Build_Heap is O(n).
HeapSort
HEAPSORT(A)
1. BUILD_HEAP(A)
2. for i ← length[A] downto 2
3. do exchange A[1] ↔ A[i]
4. heap-size[A] ← heap-size[A] - 1
5. HEAPIFY(A, 1)
Running time of HEAPSORT: the call to BUILD_HEAP takes O(n) time and each of the n-1 calls to HEAPIFY takes O(log n) time. Thus the HEAPSORT procedure takes O(n log n) time.
Why doesn't HeapSort take O(n) time, as Build_Heap does?
Consider the Build_Heap algorithm: a node is pushed down, and since most nodes lie in the lower part of the heap, near the leaves, the work performed per node decreases rapidly; summing over the levels gives O(n). In HeapSort, by contrast, each iteration moves a leaf value to the root and pushes it back down nearly the full height of the heap, so the shrinking lower part does not reduce the number of operations below Θ(log n) per extraction.
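The HEAPSORT pseudocode above can be sketched as follows (1-based array, index 0 unused; heapify is repeated so the example runs on its own).

```python
def heapify(A, i, heap_size):
    """Float A[i] down to restore the max-heap property (1-based array)."""
    l, r = 2 * i, 2 * i + 1
    largest = l if l <= heap_size and A[l] > A[i] else i
    if r <= heap_size and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        heapify(A, largest, heap_size)

def heapsort(A):
    heap_size = len(A) - 1
    for i in range(heap_size // 2, 0, -1):  # BUILD_HEAP, O(n)
        heapify(A, i, heap_size)
    for i in range(heap_size, 1, -1):       # n - 1 extractions, O(log n) each
        A[1], A[i] = A[i], A[1]             # move current max to the end
        heap_size -= 1
        heapify(A, 1, heap_size)

A = [None, 2, 3, 8, 4, 5, 7, 6, 1]
heapsort(A)
print(A[1:])
# expected: [1, 2, 3, 4, 5, 6, 7, 8]
```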
HEAP-EXTRACT-MAX
HEAP-EXTRACT-MAX removes and returns the maximum element of the heap, i.e. the root.
HEAP-EXTRACT-MAX(A)
1. if heap-size[A] < 1
2. then error "heap underflow"
3. max ← A[1]
4. A[1] ← A[heap-size[A]]
5. heap-size[A] ← heap-size[A] - 1
6. HEAPIFY(A, 1)
7. return max
Line 4 takes the last element in the heap and places it at the root, after which HEAPIFY is applied. The running time of HEAP-EXTRACT-MAX is O(log n), since it performs only a constant amount of work on top of the O(log n) time for HEAPIFY.
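A sketch of the procedure on a 1-based array; as an assumed simplification the heap size is passed in and the new size returned alongside the maximum, rather than being stored in A.

```python
def heapify(A, i, heap_size):
    """Float A[i] down to restore the max-heap property (1-based array)."""
    l, r = 2 * i, 2 * i + 1
    largest = l if l <= heap_size and A[l] > A[i] else i
    if r <= heap_size and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        heapify(A, largest, heap_size)

def heap_extract_max(A, heap_size):
    """Remove and return the root of a max-heap; O(log n)."""
    if heap_size < 1:
        raise IndexError("heap underflow")
    max_key = A[1]
    A[1] = A[heap_size]        # move the last element to the root
    heap_size -= 1
    heapify(A, 1, heap_size)   # restore the heap property
    return max_key, heap_size
```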
Comparison between the running times of Heap algorithms
Algorithm Time Complexity
Heapify O( log n )
Build_Heap O (n)
Extract_Max O ( log n )
Delete_Heap O( log n )
Find_Max O (1)
Insert O(log n )
Single Source Shortest Paths
Let G be a weighted directed graph where a weight w(u, v) is associated with each edge (u, v) in E. These weights represent the cost of traversing the edge. A path from vertex u to vertex v is a sequence of one or more edges
<(v1, v2), (v2, v3), ..., (vn-1, vn)> in E[G], where u = v1 and v = vn.
[Figure: a weighted directed graph with vertices 1-6]
The cost (or length, or weight) of the path P is the sum of the weights of the edges in the sequence.
The shortest-path weight from a vertex u Є V to a vertex v Є V in the weighted graph is the minimum cost over all paths from u to v. If no path from u to v exists, then the weight of the shortest path is ∞.
Single source shortest path problem
Until now we have used the BFS traversal algorithm to find shortest paths, but that handles only the special case where path length is measured in links, i.e. all weights are 1. Now we consider graphs with arbitrary link weights.
Problem: Given a weighted graph G, find a shortest path from a given vertex to each other vertex in G.
One solution to this problem is Dijkstra's algorithm, which is a greedy algorithm. It turns out that one can find the shortest paths from a given source to all vertices in the graph in the same time; hence this problem is called the single-source shortest paths problem.
Dijkstra’s Single Source Shortest Paths Algorithm
DIJKSTRA (G, w, s)
1 INITIALIZE-SINGLE-SOURCE(G, s)
2 S ← ø
3 Q ← V[G]
4 while Q ≠ ø
5 do u ← EXTRACT-MIN(Q)
6 S ← S U {u}
7 for each vertex v Є Adj[u] do
8 if d[v] > d[u] + w (u, v)
9 then d[v] ← d[u] + w (u, v)
10 π [v] ← u
The algorithm repeatedly selects the vertex u Є V - S with the minimum shortest-path estimate, adds u to S, and relaxes all edges leaving u.
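A sketch of the algorithm using a binary heap (Python's heapq module) as the priority queue Q. The adjacency-list representation below (a dict mapping each vertex to (neighbor, weight) pairs) is an assumption for this example, not the slides' notation.

```python
import heapq

def dijkstra(graph, s):
    """Single-source shortest paths from s; returns (distances, predecessors)."""
    d = {v: float("inf") for v in graph}   # INITIALIZE-SINGLE-SOURCE
    pi = {v: None for v in graph}
    d[s] = 0
    q = [(0, s)]                           # priority queue keyed on d[v]
    done = set()                           # the set S of finished vertices
    while q:
        du, u = heapq.heappop(q)           # EXTRACT-MIN
        if u in done:
            continue                       # stale entry; skip it
        done.add(u)
        for v, w in graph[u]:              # relax each edge leaving u
            if d[v] > d[u] + w:
                d[v] = d[u] + w
                pi[v] = u
                heapq.heappush(q, (d[v], v))
    return d, pi
```

Instead of decreasing a key in place, this version pushes a fresh entry and discards stale ones on extraction, a common idiom with heapq; the asymptotic bound O((m + n) log n) is unchanged.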
Dijkstra's Algorithm: Example
[Figure, repeated over three slides: a weighted graph on seven numbered nodes; the goal is to find the shortest path from node 1 to node 5. The fringe set at each step:]
Step  Fringe Set
1     2, 7
2     3, 7
3     3, 6
4     4, 5, 6
5     4, 6
6     6
Shortest path from 1 to 5 is 1-7-3-5, with the minimum weight of 9.01.
Dijkstra’s Algorithm (cont.)
The graph formed by the edges (π[v], v) selected by the algorithm is also the shortest-path spanning tree of the graph.
Running time of Dijkstra’s Algorithm
For a graph G(V, E) with |V| = n and |E| = m, the running time of Dijkstra's algorithm is O((m + n) log n). It is thus slower than the O(m + n) search algorithms viewed previously.
Note: The Fringe sets created at each iteration can be stored in a heap.