Fall 2007 CS 225 1 Self-Balancing Search Trees Chapter 9

Fall 2007 CS 225 1

Self-Balancing Search Trees

Chapter 9

Fall 2007 CS 225 2

Why Balance is Important

• Searching an unbalanced search tree can take O(n) time in the worst case

Fall 2007 CS 225 3

Rotation

• To achieve self-adjusting capability, we need an operation on a binary tree that will change the relative heights of left and right subtrees but preserve the binary search tree property

• Algorithm for right rotation
– Remember value of root.left (temp = root.left)
– Set root.left to value of temp.right
– Set temp.right to root
– Set root to temp

• What is the algorithm for left rotation?
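The right-rotation steps above can be sketched in Python as follows. The `Node` class and the `inorder` helper are assumptions made for illustration and do not come from the slides. The left rotation shown is simply the mirror image of the right rotation, offered as one plausible answer to the question above.

```python
class Node:
    """Minimal binary tree node (hypothetical helper, not from the slides)."""
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def rotate_right(root):
    """Right rotation; returns the new root of the subtree."""
    temp = root.left          # remember root.left
    root.left = temp.right    # set root.left to temp.right
    temp.right = root         # set temp.right to root
    return temp               # the caller sets root to temp


def rotate_left(root):
    """Mirror image of rotate_right (swap every left and right)."""
    temp = root.right
    root.right = temp.left
    temp.left = root
    return temp


def inorder(node):
    """In-order traversal, used to confirm the BST ordering is preserved."""
    if node is None:
        return []
    return inorder(node.left) + [node.data] + inorder(node.right)


# A left-heavy chain 3 -> 2 -> 1 is rotated so that 2 becomes the root.
tree = Node(3, left=Node(2, left=Node(1)))
tree = rotate_right(tree)
```

Because Python has no reference parameters, each function returns the new subtree root and the caller reassigns it, which plays the role of "set root to temp".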

Fall 2007 CS 225 4

Rotation Illustrated

Fall 2007 CS 225 5

Internal Representation

Fall 2007 CS 225 6

Representation after Rotation

Fall 2007 CS 225 7

Unbalanced Trees

• The height of a tree is the number of nodes in the longest path from the root to a leaf node

• Measure the imbalance of a tree as the difference in height between the two subtrees

• Actual heights of the left and right subtrees are unimportant; only the relative difference matters when balancing
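A minimal sketch of these two definitions, assuming a hypothetical `Node` class and taking the difference as right height minus left height (the sign convention is an assumption at this point in the slides):

```python
class Node:
    """Minimal binary tree node (hypothetical helper, not from the slides)."""
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def height(node):
    """Number of nodes on the longest path from node down to a leaf.

    An empty tree is given height 0.
    """
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def imbalance(node):
    """Height of the right subtree minus height of the left subtree."""
    return height(node.right) - height(node.left)


chain = Node(3, left=Node(2, left=Node(1)))    # every node has only a left child
bushy = Node(2, left=Node(1), right=Node(3))   # perfectly balanced

chain_height, chain_imbalance = height(chain), imbalance(chain)
bushy_height, bushy_imbalance = height(bushy), imbalance(bushy)
```

Recomputing heights on every call costs O(n); it is done here only to keep the definitions visible.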

Fall 2007 CS 225 8

4 Types of Unbalanced Trees

• Left-Left
– Left child is higher than right child
– Left child of left child is higher than right child of left child

• Left-Right
– Left child is higher than right child
– Right child of left child is higher than left child of left child

• Right-Right
– Right child is higher than left child
– Right child of right child is higher than left child of right child

• Right-Left
– Right child is higher than left child
– Left child of right child is higher than right child of right child
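The four cases can be told apart with two height comparisons. The sketch below assumes a hypothetical `Node` class and a recursive `height` helper; returning `None` when neither pattern applies is a choice made for this example only.

```python
class Node:
    """Minimal binary tree node (hypothetical helper, not from the slides)."""
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def height(node):
    """Number of nodes on the longest path down to a leaf (0 if empty)."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def classify(root):
    """Name the imbalance pattern at root, or None if none of the four fits."""
    left_h, right_h = height(root.left), height(root.right)
    if left_h > right_h:                       # left child is higher
        child = root.left
        if height(child.left) > height(child.right):
            return "Left-Left"
        if height(child.right) > height(child.left):
            return "Left-Right"
    elif right_h > left_h:                     # right child is higher
        child = root.right
        if height(child.right) > height(child.left):
            return "Right-Right"
        if height(child.left) > height(child.right):
            return "Right-Left"
    return None


left_left = Node(3, left=Node(2, left=Node(1)))
left_right = Node(3, left=Node(1, right=Node(2)))
right_right = Node(1, right=Node(2, right=Node(3)))
right_left = Node(1, right=Node(3, left=Node(2)))
```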

Fall 2007 CS 225 9

AVL Tree

• Each node contains a number that represents the difference in height between the two subtrees (hR - hL)

• As items are added to or removed from the tree, the balance of each subtree from the insertion or removal point up to the root is updated

• Rotation is used to bring a tree back into balance when the magnitude of the difference is greater than 1

Fall 2007 CS 225 10

Balancing a Left-Left Tree

• A left-left tree is a tree in which the root and the left subtree of the root are both left-heavy

• A right rotation around the root is required

Fall 2007 CS 225 11

Balancing a Left-Right Tree

• Root is left-heavy but the left subtree of the root is right-heavy

• A simple right rotation cannot fix this

• Need both left and right rotations

Fall 2007 CS 225 12

To Balance Unbalanced Trees

• Left-Left (parent balance is -2, left child balance is -1)
– Rotate right around parent

• Left-Right (parent balance -2, left child balance +1)
– Rotate left around child
– Rotate right around parent

• Right-Right (parent balance +2, right child balance +1)
– Rotate left around parent

• Right-Left (parent balance +2, right child balance -1)
– Rotate right around child
– Rotate left around parent
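The four rules above can be sketched as a single `rebalance` function. This is an illustration under simplifying assumptions: the `Node` class is hypothetical, and the balance (hR - hL) is recomputed from heights on demand, whereas an AVL tree as described earlier stores it in each node.

```python
class Node:
    """Minimal binary tree node (hypothetical helper, not from the slides)."""
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def height(node):
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def balance(node):
    """hR - hL, recomputed here instead of being stored in the node."""
    return height(node.right) - height(node.left)


def rotate_right(root):
    temp = root.left
    root.left = temp.right
    temp.right = root
    return temp


def rotate_left(root):
    temp = root.right
    root.right = temp.left
    temp.left = root
    return temp


def rebalance(parent):
    """Apply the rotations listed above; returns the new subtree root."""
    b = balance(parent)
    if b == -2:
        if balance(parent.left) == 1:             # Left-Right
            parent.left = rotate_left(parent.left)
        return rotate_right(parent)               # Left-Left, or second step
    if b == 2:
        if balance(parent.right) == -1:           # Right-Left
            parent.right = rotate_right(parent.right)
        return rotate_left(parent)                # Right-Right, or second step
    return parent                                 # already within -1..+1


left_right = rebalance(Node(3, left=Node(1, right=Node(2))))
right_left = rebalance(Node(1, right=Node(3, left=Node(2))))
```

A child balance of 0 at a parent with balance -2 or +2 is not among the listed cases; this sketch falls through to a single rotation there.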

Fall 2007 CS 225 13

Red-Black Trees

• Rudolf Bayer developed the red-black tree as a special case of his B-tree

• A node is either red or black

• The root is always black

• A red node always has black children

• The number of black nodes in any path from the root to a leaf is the same
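These properties can be checked mechanically. The sketch below assumes a hypothetical `RBNode` class with a `color` field, and it treats every empty subtree as the end of a path when counting black nodes, which is a common convention rather than something the slides state.

```python
RED, BLACK = "red", "black"


class RBNode:
    """Minimal red-black node (hypothetical helper, not from the slides)."""
    def __init__(self, data, color, left=None, right=None):
        self.data = data
        self.color = color
        self.left = left
        self.right = right


def black_height(node):
    """Black nodes on every path below and including node, or -1 if invalid."""
    if node is None:
        return 0
    if node.color == RED:
        # A red node must have black children (empty children are fine).
        for child in (node.left, node.right):
            if child is not None and child.color == RED:
                return -1
    left = black_height(node.left)
    right = black_height(node.right)
    if left == -1 or right == -1 or left != right:
        return -1
    return left + (1 if node.color == BLACK else 0)


def is_red_black(root):
    """True if the root is black and the color properties hold everywhere."""
    if root is None:
        return True
    return root.color == BLACK and black_height(root) != -1


valid = RBNode(2, BLACK, RBNode(1, RED), RBNode(3, RED))
red_red = RBNode(3, BLACK, RBNode(2, RED, RBNode(1, RED)))
```

The check covers only the color rules; it does not verify the binary search tree ordering.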

Fall 2007 CS 225 14

Insertion into a Red-Black Tree

• Follows same recursive search process used for all binary search trees to reach the insertion point

• When a leaf is found, the new item is inserted and initially given the color red

• If the parent is black, we are done; otherwise, there is some rearranging to do

Fall 2007 CS 225 15

Insertion into a Red-Black Tree

• Find location for new node as in the BST

• Color a new node red to start with

• Several possible cases
– red parent has red sibling: recolor
– red parent with no red sibling: do one or two rotations depending on location of child
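The case analysis above can be sketched as a function that only names the required action for a newly inserted red node. Several things here are assumptions for illustration: the `RBNode` class, the `parent` links, the `link` helper, and the reading of "location of child" as whether the new node lies on the same side of its parent as the parent lies of the grandparent. The recoloring and rotations themselves are not implemented.

```python
RED, BLACK = "red", "black"


class RBNode:
    """Red-black node with a parent link (hypothetical helper)."""
    def __init__(self, data, color):
        self.data = data
        self.color = color
        self.left = None
        self.right = None
        self.parent = None


def link(parent, side, child):
    """Attach child as parent's 'left' or 'right' child and return it."""
    setattr(parent, side, child)
    child.parent = parent
    return child


def fixup_action(node):
    """Name the step needed after inserting the red node `node`."""
    parent = node.parent
    if parent is None or parent.color == BLACK:
        return "done"
    # The root is always black, so a red parent must itself have a parent.
    grand = parent.parent
    sibling = grand.right if parent is grand.left else grand.left
    if sibling is not None and sibling.color == RED:
        return "recolor"
    # Red parent with no red sibling: outer child needs one rotation,
    # inner child needs two.
    same_side = (node is parent.left) == (parent is grand.left)
    return "one rotation" if same_side else "two rotations"


grand = RBNode(10, BLACK)
parent = link(grand, "left", RBNode(5, RED))
new_node = link(parent, "left", RBNode(1, RED))
action = fixup_action(new_node)
```

After a recoloring step the grandparent may become red, so a full implementation would repeat the check further up the tree.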

Fall 2007 CS 225 16

Insertion into a Red-Black Tree

Fall 2007 CS 225 17

Insertion

Fall 2007 CS 225 18

Algorithm for Red-Black Tree Insertion

Fall 2007 CS 225 19

Non-Binary Trees

• Nodes can have more than two children

• Nodes with multiple children store multiple values

Fall 2007 CS 225 20

2-3 Trees

• A 2-3 tree is named for the number of possible children from each node

• Made up of nodes designated as either 2-nodes or 3-nodes

• A 2-node is the same as a binary search tree node

• A 3-node contains two data fields, ordered so that the first is less than the second, and references to three children
– One child contains values less than the first data field
– One child contains values between the two data fields
– One child contains values greater than the second data field

• A 2-3 tree has the property that all of its leaves are at the lowest level
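The node layout described above leads directly to a search routine. The following is a sketch, assuming a hypothetical `Node23` class that stores its data fields in a sorted list `keys` and its children in a list `children` (empty for a leaf); neither name comes from the slides.

```python
class Node23:
    """2-node (one key, two children) or 3-node (two keys, three children)."""
    def __init__(self, keys, children=None):
        self.keys = keys                  # one or two values, ascending
        self.children = children or []    # empty list for a leaf


def search(node, target):
    """Return True if target is stored in the 2-3 tree rooted at node."""
    if node is None:
        return False
    if target in node.keys:
        return True
    if not node.children:                 # reached a leaf without a match
        return False
    # Pick the child whose value range contains the target.
    i = 0
    while i < len(node.keys) and target > node.keys[i]:
        i += 1
    return search(node.children[i], target)


# Root is a 3-node; all three leaves sit on the same level.
root = Node23([7, 11], [Node23([3, 5]), Node23([9]), Node23([13, 15])])
found_nine = search(root, 9)
found_ten = search(root, 10)
```

For a 2-node the loop makes one comparison, so the routine reduces to ordinary binary search tree lookup.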

Fall 2007 CS 225 21

Searching a 2-3 Tree

Fall 2007 CS 225 22

Searching a 2-3 Tree

Fall 2007 CS 225 23

Algorithm for Insertion into a 2-3 Tree

Fall 2007 CS 225 24

Inserting into a 2-3 Tree

• Trees are built bottom-up instead of top-down

Fall 2007 CS 225 25

Inserting into a 2-3 Tree

Fall 2007 CS 225 26

Removal from a 2-3 Tree

• Removing an item from a 2-3 tree is the reverse of the insertion process

• If the item to be removed is in a leaf, simply delete it

• If not in a leaf, remove it by swapping it with its inorder predecessor in a leaf node and deleting it from the leaf node

Fall 2007 CS 225 27

Removal from a 2-3 Tree

Fall 2007 CS 225 28

Removal from a 2-3 Tree (continued)

Fall 2007 CS 225 29

2-3-4 and B-Trees

• The 2-3 tree was the inspiration for the more general B-tree, which allows up to n children per node

• The B-tree was designed for building indexes to very large databases stored on a hard disk

• The 2-3-4 tree is a specialization of the B-tree: it is basically a B-tree with n equal to 4

• A Red-Black tree can be considered a 2-3-4 tree in a binary-tree format
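One way to see the last point is to merge every black node with its red children into a single multi-value node. The sketch below does this under stated assumptions: `RBNode` is a hypothetical class, the input is a valid red-black tree with a black root, and each 2-3-4 node is represented as a plain `(keys, children)` tuple chosen for this example.

```python
RED, BLACK = "red", "black"


class RBNode:
    """Minimal red-black node (hypothetical helper, not from the slides)."""
    def __init__(self, data, color, left=None, right=None):
        self.data = data
        self.color = color
        self.left = left
        self.right = right


def to_234(black):
    """Convert the subtree under a black node into (keys, children) form."""
    if black is None:
        return None
    keys, children = [], []

    def visit(node):
        if node is not None and node.color == RED:
            # A red child is absorbed into the same 2-3-4 node.
            visit(node.left)
            keys.append(node.data)
            visit(node.right)
        else:
            # A black child (or empty link) starts a separate 2-3-4 node.
            children.append(to_234(node))

    visit(black.left)
    keys.append(black.data)
    visit(black.right)
    return keys, children


# Black 2 with two red children collapses into one 4-node holding 1, 2, 3.
four_node = to_234(RBNode(2, BLACK, RBNode(1, RED), RBNode(3, RED)))

# Black 4 with one red child 2 becomes a 3-node holding 2 and 4.
three_node = to_234(
    RBNode(4, BLACK,
           RBNode(2, RED, RBNode(1, BLACK), RBNode(3, BLACK)),
           RBNode(6, BLACK))
)
```

A black node with no red children maps to a 2-node, one red child gives a 3-node, and two red children give a 4-node.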

Fall 2007 CS 225 30

2-3-4 Trees

• Expand on the idea of 2-3 trees by adding the 4-node

• Addition of this third item simplifies the insertion logic

Fall 2007 CS 225 31

Algorithm for Insertion into a 2-3-4 Tree

Fall 2007 CS 225 32

B-Trees

• A B-tree extends the idea behind the 2-3 and 2-3-4 trees by allowing a maximum of CAP data items in each node

• The order of a B-tree is defined as the maximum number of children for a node

• B-trees were developed to store indexes to databases on disk storage

Fall 2007 CS 225 33

B-Trees