
Rossella Lau Lecture 9, DCO20105, Semester A,2005-6

DCO20105 Data structures and algorithms

Lecture 9: More on BST

Removal from a BST

Some advanced balanced search trees: AVL tree, 2-3-4 tree, Red-Black tree

-- By Rossella Lau


Re-visit on BST

A BST is a binary tree in which, at every node, all the values in the left sub-tree are less than the node's value and all the values in the right sub-tree are greater

It supports O(log n) execution time for both search and insert in optimal cases, when the BST has high density

The worst-case execution time is O(n) when the BST is sparse, e.g. when it degenerates into a chain of single-child nodes
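The difference between the dense and the sparse case can be seen by measuring the height of a BST built from the same keys in two different orders. The following is a minimal sketch; `Node`, `insert`, `height` and `heightAfterInserting` are hypothetical names for illustration and are not the lecture's `BSTree<T>` class.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical minimal node type for illustration; not the lecture's BSTNode<T>.
struct Node {
    int item;
    Node *left = nullptr;
    Node *right = nullptr;
    explicit Node(int v) : item(v) {}
};

// Standard BST insert; a key that is already present is ignored.
void insert(Node *&root, int v) {
    Node **link = &root;
    while (*link) {
        if (v == (*link)->item) return;
        link = (v < (*link)->item) ? &(*link)->left : &(*link)->right;
    }
    *link = new Node(v);
}

// Height counted in nodes on the longest root-to-leaf path (empty tree = 0).
int height(const Node *n) {
    return n ? 1 + std::max(height(n->left), height(n->right)) : 0;
}

void destroy(Node *n) {
    if (!n) return;
    destroy(n->left);
    destroy(n->right);
    delete n;
}

// Builds a BST by inserting the keys in the given order and returns its height.
int heightAfterInserting(const std::vector<int> &keys) {
    Node *root = nullptr;
    for (int k : keys) insert(root, k);
    int h = height(root);
    destroy(root);
    return h;
}
```

Inserting 1 to 7 in sorted order gives a chain of height 7, so a search may visit every node; inserting the same keys in the order 4, 2, 6, 1, 3, 5, 7 gives a full tree of height 3.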


Some facts of a BST

The in-order traversal sequence of a binary search tree is in sorted order

Inserting into a BST can therefore be treated as a tree sort method, which is another O(n log n) sort algorithm in optimal cases

The minimum value of a BST is in the left-most node, which has no left child but is not necessarily a leaf

    BSTNode<T> *cur = root;   // assume size() >= 1
    while (cur->left) cur = cur->left;
    return cur->item;

The maximum value of a BST is in the right-most node, which has no right child
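The tree sort idea above can be sketched as follows: insert every key, then read the keys back with an in-order traversal. This is an illustrative sketch under assumptions; `Node`, `insert`, `inorder` and `treeSort` are hypothetical names, and equal keys are sent to the right sub-tree so that duplicates survive the sort.

```cpp
#include <vector>

// Hypothetical minimal node type for illustration; not the lecture's BSTNode<T>.
struct Node {
    int item;
    Node *left = nullptr;
    Node *right = nullptr;
    explicit Node(int v) : item(v) {}
};

// Insert v below root; equal keys go to the right so duplicates are kept.
void insert(Node *&root, int v) {
    Node **link = &root;
    while (*link)
        link = (v < (*link)->item) ? &(*link)->left : &(*link)->right;
    *link = new Node(v);
}

// In-order traversal: left sub-tree, then the node, then the right sub-tree.
void inorder(const Node *n, std::vector<int> &out) {
    if (!n) return;
    inorder(n->left, out);
    out.push_back(n->item);
    inorder(n->right, out);
}

void destroy(Node *n) {
    if (!n) return;
    destroy(n->left);
    destroy(n->right);
    delete n;
}

// Tree sort: build a BST from the keys, then collect them in order.
std::vector<int> treeSort(const std::vector<int> &keys) {
    Node *root = nullptr;
    for (int k : keys) insert(root, k);
    std::vector<int> sorted;
    inorder(root, sorted);
    destroy(root);
    return sorted;
}
```

Each insert costs O(log n) only while the tree stays dense; on already sorted input this sketch degrades to O(n) per insert.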


BST removal

Removing a node from a BST must leave a tree that is still a BST
• A node cannot have three children
• left sub-tree < root < right sub-tree

The situation of the node (or sub-tree) to be removed should be considered
• A leaf
• A node with a single child
• A full node, which has two children


Deletion of an item which is a leaf

Delete 22:

When the item is found, delete it!

[Figure: the BST holding 22, 28, 35, 40, 50, 75, 87, 90 and 95, before and after deleting the leaf 22]


The algorithm for deletion of a leaf

    bool BSTree<T>::remove (T const & target) {
        // contentAt refers to the link (parent's pointer) that holds the node
        BSTNode<T> *& contentAt (find (target));
        if (! contentAt ) return false;

        BSTNode<T> *forDelete (contentAt);
        if (contentAt->isLeaf()) contentAt = 0;   // only the leaf case is handled here

        forDelete->left = forDelete->right = 0;
        delete forDelete;

        countNodes--;
        return true;
    }

    bool BSTNode<T>::isLeaf(void) { return !left && !right; }


Deletion of an item which has one child

Delete 75:

When the item is found, put its only child at its place

[Figure: the BST holding 28, 35, 40, 50, 75, 87, 90 and 95, before and after deleting 75; its only child 90 takes its place]


The algorithm for deletion of a single child node

    bool BSTree<T>::remove (T const & target) {
        BSTNode<T> *& contentAt (find (target));
        if (! contentAt ) return false;

        BSTNode<T> *forDelete (contentAt);
        if (!contentAt->isLeaf() &&     // with single
            !contentAt->isFull() )      // child
            contentAt = contentAt->left ? contentAt->left : contentAt->right;

        forDelete->left = forDelete->right = 0;
        delete forDelete;
        countNodes--;
        return true;
    }

    bool BSTNode<T>::isFull(void) { return left && right; }


Deletion of an item which has two children

Delete 50:

Theory: the inorder successor of a node with two children has no left child, and its inorder predecessor has no right child, so each of them has at most one child

When the item is found at node n, replace n's data with that of n's inorder successor s or predecessor p; the deletion then goes to s or p -- s or p is either a leaf or a node with a single child!

[Figure: the BST holding 28, 35, 40, 50, 87, 90 and 95 before deleting 50, and the two alternative results, one for each choice of replacement]


The algorithm for deletion of an internal node

    bool BSTree<T>::remove (T const & target) {
        BSTNode<T> *& contentAt (find (target));
        if (! contentAt ) return false;

        BSTNode<T> *& forDelete (prepareRemoval (contentAt));
        BSTNode<T> *realDelete (forDelete);

        …… // deletion of a leaf or a single child's parent
    }

    BSTNode<T> *& BSTree<T>::prepareRemoval (BSTNode<T> *& contentAt) {
        if (contentAt->isFull()) {   // two children: move the target down to its successor
            BSTNode<T> *& succ (successor (contentAt));
            swap (succ->getItem(), contentAt->getItem());
            return succ;
        }
        else
            return contentAt;
    }


The algorithm for finding an inorder successor

    BSTNode<T> *& BSTree<T>::successor (BSTNode<T> const *p) {
        // Assume that the input node (p) has two children
        BSTNode<T> *it (const_cast<BSTNode<T>*> (p));
        if (it->right->left) {        // successor is the left-most node
            it = it->right;           // of the right sub-tree
            while (it->left->left) it = it->left;
            return it->left;          // the link that holds the successor
        }
        else                          // successor is the right child
            return it->right;
    }
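The three removal cases can be put together in one self-contained sketch. It follows the same idea as the slides (work on the link that holds the node, and push a two-children deletion down to the in-order successor), but it is not the lecture's `BSTree<T>` code: `Node`, `insert`, `findLink`, `successorLink`, `removeKey` and `collect` are hypothetical names, and the keys are plain `int` for brevity.

```cpp
#include <utility>
#include <vector>

// Hypothetical minimal node type for illustration; not the lecture's BSTNode<T>.
struct Node {
    int item;
    Node *left = nullptr;
    Node *right = nullptr;
    explicit Node(int v) : item(v) {}
};

void insert(Node *&root, int v) {
    Node **link = &root;
    while (*link) {
        if (v == (*link)->item) return;              // ignore duplicates
        link = (v < (*link)->item) ? &(*link)->left : &(*link)->right;
    }
    *link = new Node(v);
}

// Returns the link that holds the target, or the null link where it would be.
Node *&findLink(Node *&root, int target) {
    Node **link = &root;
    while (*link && (*link)->item != target)
        link = (target < (*link)->item) ? &(*link)->left : &(*link)->right;
    return *link;
}

// Assumes node has a right child: returns the link to the left-most node
// of the right sub-tree, i.e. the in-order successor.
Node *&successorLink(Node *&node) {
    Node **link = &node->right;
    while ((*link)->left) link = &(*link)->left;
    return *link;
}

bool removeKey(Node *&root, int target) {
    Node *&found = findLink(root, target);
    if (!found) return false;

    Node **link = &found;
    if (found->left && found->right) {               // two children
        link = &successorLink(found);
        std::swap((*link)->item, found->item);       // target moves down to the successor
    }
    // *link now has at most one child: splice it out (a leaf is replaced by null).
    Node *victim = *link;
    *link = victim->left ? victim->left : victim->right;
    delete victim;
    return true;
}

// In-order traversal, used here only to inspect the result.
void collect(const Node *n, std::vector<int> &out) {
    if (!n) return;
    collect(n->left, out);
    out.push_back(n->item);
    collect(n->right, out);
}
```

Inserting 50, 35, 75, 28, 40, 90, 22, 87, 95 in that order and then removing 22, 75 and 50 exercises the leaf, single-child and two-children cases in turn. Using the predecessor instead of the successor would be an equally valid design choice.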


Notes on const_cast

C++ supports the following type cast operators:

const_cast: to cast away the constant attribute
• In the previous example, p is passed as a pointer pointing to a constant object.
• However, the function walks p's children with a non-const pointer (it = it->left;) and returns a modifiable reference to a child link; the compiler does not allow a non-const pointer to be initialized from p directly
• To allow it to traverse p's children, const_cast is needed to cast away the constant attribute of p

static_cast: the new way to do the former type cast
• Former way: doubleResult = (double) intA / intB;
• C++: doubleResult = static_cast<double> (intA) / intB;

The other two, which are not encouraged:
• dynamic_cast, reinterpret_cast
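A small sketch of the two encouraged casts follows. The names `ratioTruncated`, `ratioExact`, `Counter` and `mutableValue` are hypothetical and exist only for this illustration.

```cpp
// Without a cast both operands are int, so the division truncates
// before the result is converted to double.
double ratioTruncated(int a, int b) { return a / b; }

// static_cast converts one operand first, so the division is done in double.
double ratioExact(int a, int b) { return static_cast<double>(a) / b; }

struct Counter {
    int value;
};

// const_cast removes the const qualifier from the pointed-to type.
// Writing through the result is only valid when the object it points to
// was not declared const in the first place; otherwise the behaviour is undefined.
int &mutableValue(const Counter *c) {
    return const_cast<Counter *>(c)->value;
}
```

With a = 7 and b = 2, the truncated version gives 3 and the static_cast version gives 3.5.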


Exercises on BST removal

BST removal: Ford’s exercise: 10:26: delete 30, 80, 25; 10:32

Other BST removal related functions
• find a predecessor


Complexity for remove()

The main logic of remove() is still find(). However, it requires a function successor() to search for an in-order successor. successor() only walks down from the node that was found, so its complexity is less than or equal to that of find(); therefore, the big O function of remove() is still the same as that of find()

remove() is similar to find() and has the same complexity as find(): O(log n) in optimal cases and O(n) in the worst case


Balanced Binary Tree

To solve the problem of a "linear" BST and maintain an optimal complexity, the problem becomes how to maintain a balanced binary tree

A balanced binary search tree of this kind is also called an AVL tree
• It was invented by two Soviet mathematicians: Adel'son-Vel'skii and Landis

First, the height of a tree is defined as its maximum depth, i.e. the length of the longest path from the root down to a leaf

Then, a balanced binary tree is a binary tree in which the heights of the two sub-trees of every node never differ by more than 1.
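The definition above can be checked mechanically. The sketch below computes the height of every sub-tree bottom-up and reports a violation as soon as two sibling sub-trees differ in height by more than 1. `Node`, `checkedHeight` and `isHeightBalanced` are hypothetical names, and height is counted in nodes (an empty tree has height 0), which does not change the differences being compared.

```cpp
#include <algorithm>
#include <cstdlib>

// Hypothetical minimal node type for illustration; not the lecture's BSTNode<T>.
struct Node {
    int item;
    Node *left = nullptr;
    Node *right = nullptr;
    explicit Node(int v) : item(v) {}
};

// Returns the height of the sub-tree rooted at n, or -1 if some node in it
// has sub-trees whose heights differ by more than 1.
int checkedHeight(const Node *n) {
    if (!n) return 0;
    int l = checkedHeight(n->left);
    if (l < 0) return -1;
    int r = checkedHeight(n->right);
    if (r < 0) return -1;
    if (std::abs(l - r) > 1) return -1;
    return 1 + std::max(l, r);
}

bool isHeightBalanced(const Node *root) {
    return checkedHeight(root) >= 0;
}
```

Each node is visited once, so the check is O(n); an AVL implementation avoids this full pass by storing the height in every node.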


Examples of AVL BST and non-AVL BST

[Figure: an AVL tree with nodes A to H, and a non-AVL tree with nodes J to T]


Efficiency concerns on an AVL BST

There are efficient algorithms to maintain a binary tree as an AVL tree
• After a node is inserted into or removed from an AVL tree, each rebalancing rotation takes O(1) (without searching); restoring the AVL property takes at most O(log n), since only nodes on the search path can become unbalanced
• Ford: Supplementary on the book web site
• Goodrich et al.: Chapter 9
• Collins: Chapter 9

It requires more information in each node: the height of the node

With an AVL BST, the search process on a BST is always optimal
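The basic rebalancing step is a rotation, which relinks a constant number of pointers and refreshes two stored heights. The following is a sketch under assumptions, not the algorithm from the cited books: `AvlNode`, `heightOf`, `updateHeight`, `rotateLeft` and `rotateRight` are hypothetical names, and heights are counted in nodes.

```cpp
#include <algorithm>

// Hypothetical AVL node that stores its own height (a single node has height 1).
struct AvlNode {
    int item;
    int height = 1;
    AvlNode *left = nullptr;
    AvlNode *right = nullptr;
    explicit AvlNode(int v) : item(v) {}
};

int heightOf(const AvlNode *n) { return n ? n->height : 0; }

void updateHeight(AvlNode *n) {
    n->height = 1 + std::max(heightOf(n->left), heightOf(n->right));
}

// Left rotation: the right child becomes the root of this sub-tree.
// Assumes root->right is not null. The BST ordering is preserved.
void rotateLeft(AvlNode *&root) {
    AvlNode *pivot = root->right;
    root->right = pivot->left;
    pivot->left = root;
    updateHeight(root);    // the old root is now the lower node, so update it first
    updateHeight(pivot);
    root = pivot;
}

// Right rotation: mirror image of rotateLeft. Assumes root->left is not null.
void rotateRight(AvlNode *&root) {
    AvlNode *pivot = root->left;
    root->left = pivot->right;
    pivot->right = root;
    updateHeight(root);
    updateHeight(pivot);
    root = pivot;
}
```

Applied to the chain 1, 2, 3 (each node the right child of the previous one), a left rotation at the root makes 2 the root with children 1 and 3. A complete AVL insert or remove would also walk back up the search path, update heights, and choose between single and double rotations; that part is left out here.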


B-Tree

A node storing only one item is not efficient, especially considering that I/O is based on “blocks” and a block usually stores about 512 bytes

A B-Tree is an extension of a balanced binary tree
• A B-tree of order n allows a node to have up to n children and to store up to n-1 items

Searching a B-tree costs only as many block I/Os as the tree has levels, when each node is treated as one I/O block; within a node, the items are stored in a vector, to which binary search can be applied


A sample B-Tree of Order 5

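Block A (root): • 367 •
    Block B (left of 367): • 103 • 218 •
        Block D: 17 87
        Block E: 119 165 198
        Block F: 245 272 330
    Block C (right of 367): • 492 • 661 • 815 • 912 •
        Block G: 408 435
        Block H: 524 602
        Block I: 686 770 799
        Block J: 832 871
        Block K: 956 968 975 991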


Searching on a B-Tree

Search for 832

1. Get block A and do a linear or binary search on the key values; 832 > 367, so go to block C along the right pointer of 367

2. Get block C; 832 is between 815 and 912, so go to block J along the pointer between 815 and 912

3. Get block J and search for 832: found!

Search for 65

Get block A, then B, and then D; 65 does not exist in D: not found!
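The search steps above can be sketched in code. Each node keeps its keys in a sorted vector, a binary search inside the node picks either the key itself or the child pointer to follow, and a counter records how many blocks were read. `BTreeNode` and `search` are hypothetical names for this illustration, and real block I/O is replaced by following in-memory pointers.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical in-memory stand-in for one B-tree block.
struct BTreeNode {
    std::vector<int> keys;              // sorted key values
    std::vector<BTreeNode *> children;  // empty for a leaf, else keys.size() + 1 pointers
};

// Returns true if target is in the tree; blocksRead receives the number of
// nodes (blocks) visited on the way.
bool search(const BTreeNode *node, int target, int &blocksRead) {
    blocksRead = 0;
    while (node) {
        ++blocksRead;
        // Binary search for the first key that is not less than target.
        auto it = std::lower_bound(node->keys.begin(), node->keys.end(), target);
        if (it != node->keys.end() && *it == target) return true;
        if (node->children.empty()) return false;            // leaf: nowhere to go
        node = node->children[it - node->keys.begin()];      // pointer left of *it
    }
    return false;
}
```

On the sample tree of order 5, searching for 832 reads blocks A, C and J, and searching for 65 reads A, B and D before failing, so both searches cost three block reads.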


2-3-4 Tree

A special case of a B-Tree is the 2-3-4 tree, a B-tree of order 4, in which a node can have up to four children and stores up to 3 items

Ford’s slides: Chapter 12: 10-15

Ford’s exercises: Chapter 12: 26(b)
• Draw the 2-3-4 tree built when you insert the keys from E A S Y Q U T I O N into an initially empty tree.


Red-Black Tree

Implementing a B-Tree is complicated; implementing a 2-3-4 tree is easier but still complicated

Using a Red-black tree to implement (represent) a 2-3-4 tree is easier

A Red-black tree is a binary search tree in which
• The root is BLACK
• A RED parent never has a RED child
• Every path from the root to an empty sub-tree has the same number of BLACK nodes

It is close to a balanced tree and easier to construct

Ford’s slides: 12:16-17; exercises: 12:26(c)
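The three rules listed above can be verified with one recursive pass. The sketch below is only a property checker, not a red-black insertion algorithm; `Color`, `RbNode`, `blackHeight` and `satisfiesRedBlackRules` are hypothetical names, and the ordering of the keys is not checked.

```cpp
enum class Color { Red, Black };

// Hypothetical red-black node for illustration.
struct RbNode {
    int item;
    Color color;
    RbNode *left = nullptr;
    RbNode *right = nullptr;
    RbNode(int v, Color c) : item(v), color(c) {}
};

bool isRed(const RbNode *n) { return n && n->color == Color::Red; }

// Returns the number of BLACK nodes on every path from n down to an empty
// sub-tree, or -1 if the paths disagree or a RED node has a RED child.
int blackHeight(const RbNode *n) {
    if (!n) return 0;
    if (isRed(n) && (isRed(n->left) || isRed(n->right))) return -1;
    int l = blackHeight(n->left);
    int r = blackHeight(n->right);
    if (l < 0 || r < 0 || l != r) return -1;
    return l + (n->color == Color::Black ? 1 : 0);
}

// Checks the three rules from the slide: BLACK root, no RED-RED parent and
// child, and equal BLACK counts on all paths. An empty tree passes.
bool satisfiesRedBlackRules(const RbNode *root) {
    if (!root) return true;
    return root->color == Color::Black && blackHeight(root) >= 0;
}
```

A BLACK root with two RED children passes the check; it corresponds to a single 2-3-4 node holding three items.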


Summary

• Construction of a BST is also a sorting method, which is O(n log n) in optimal cases
• The in-order successor/predecessor of a node with two children must be either a leaf or a node with a single child
• Erasing a node from a BST can therefore be reduced to two cases: deleting a leaf and deleting a node with a single child
• To avoid the worst case of a BST, constructing a BST should assure that it is a balanced BST (AVL)
• An extension of a BST is the B-Tree, and a special case of it is the 2-3-4 tree
• Using a Red-Black tree to implement/represent a 2-3-4 tree greatly reduces the complexity


Reference

Ford: 10.5-6, 12.6-7

Data Structures and Algorithms in C++ by Michael T. Goodrich, Roberto Tamassia, David M. Mount: Chapter 9

-- END --