robert-short
More Trees: Multiway Trees and 2-4 Trees
Motivation of Multi-way Trees
Main memory vs. disk
◦ Assumptions so far:
  ◦ We have assumed that we can store an entire data structure in the main memory of a computer.
◦ What if we have more data than can fit in main memory?
  ◦ Then the data structure must reside on disk.
◦ The rules of the game change, because the Big-Oh model doesn’t apply when operations do not all cost the same.
Motivation of Multi-way Trees
Main memory vs. disk
◦ Disk accesses are incredibly expensive.
◦ 1 disk access is worth about 4,000,000 instructions.
  ◦ (See the book for derivation)
◦ So we’re willing to do lots of calculations just to save disk accesses.
Motivation of Multi-way Trees
For example:
◦ Suppose we want to access the driving records of the citizens of Florida.
  ◦ 10 million items.
  ◦ Assume the data doesn’t fit in main memory.
  ◦ Assume that in 1 sec we can execute 25 million instructions or perform 6 disk accesses.
The unbalanced binary tree would be a disaster.
◦ In the worst case, it has linear depth and could require 10 million disk accesses.
An AVL tree
◦ In the typical case, it has a depth close to log N (base 2): log 10 million ≈ 24 disk accesses, requiring 4 sec.
The point is…
◦ Reduce the # of disk accesses to a very small constant, such as 3 or 4.
◦ And we are willing to write complicated code to do this, because in comparison machine instructions are essentially free.
  ◦ As long as we’re not ridiculous.
We cannot go below log N using a BST
◦ Even an AVL
Solution??
◦ More branching, less height.
◦ Multiway Tree
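The arithmetic behind these numbers can be reproduced with a short sketch. The 10 million records, 25 million instructions per second, and 6 disk accesses per second come from the example above. The branching factor of 200 is a hypothetical value chosen only for illustration, and the depth estimate assumes a perfectly balanced tree with one disk access per level.

```python
import math

N = 10_000_000             # records (from the example)
INSTR_PER_SEC = 25_000_000 # instructions per second (from the example)
DISK_PER_SEC = 6           # disk accesses per second (from the example)

def balanced_depth(n, branching):
    """Approximate depth of a balanced tree: ceil(log_branching(n))."""
    return math.ceil(math.log(n) / math.log(branching))

def seconds(accesses):
    """Time spent if every level visited costs one disk access."""
    return accesses / DISK_PER_SEC

# How many instructions fit in the time of one disk access.
instr_per_access = INSTR_PER_SEC / DISK_PER_SEC

worst_bst = N                       # degenerate BST: linear depth
avl = balanced_depth(N, 2)          # binary branching
multiway = balanced_depth(N, 200)   # hypothetical order M = 200

print(f"1 disk access ~ {instr_per_access:,.0f} instructions")
print(f"unbalanced BST: {worst_bst} accesses, {seconds(worst_bst) / 86400:.1f} days")
print(f"AVL tree:       {avl} accesses, {seconds(avl):.1f} sec")
print(f"multiway tree:  {multiway} accesses, {seconds(multiway):.2f} sec")
```

With these assumptions, more branching brings the count down from about 24 disk accesses to about 4, which is the "very small constant" the slide asks for.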
Multiway Tree (or B-tree in book)
Multiway tree is similar to BST
◦ In a BST
  ◦ We need 1 key to decide which of 2 branches to take.
◦ In a Multiway tree
  ◦ We need M-1 keys to decide which branch to take, where M is the order of the Multiway tree.
  ◦ We also need to balance it,
  ◦ Otherwise, like a BST, it could degenerate into a linked list.
◦ Here is a Multiway tree of Order 5:
Specifications of a Multiway Tree
If a node contains k items (I1, I2, ..., Ik), it will contain k+1 subtrees, specifically subtrees S1 through Sk+1.
[Figure: example tree]
Root: 10 20 30
  Child 1: 3 7, with leaves (1) (3 4 5) (7 8)
  Child 2: 15, with leaves (10 12) (15 17)
  Child 3: 22 28, with leaves (20 21) (22 26) (28 29)
  Child 4: 40 50 60, with leaves (30) (40 45) (55) (65 70)
[Figure: a node with items I1 I2 I3 and subtrees S1 S2 S3 ... Sk+1]
◦ All values in S1 < I1.
◦ All values in S2 < I2, but ≥ I1.
◦ All values in S3 < I3, but ≥ I2.
◦ All values in Sk+1 ≥ Ik.
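The subtree rules above translate directly into a search routine. The following is a minimal sketch, not the book's code: the `Node` class and `search` function are hypothetical names, the tree is one reading of the slide's figure, and it assumes that a value equal to an item Ii lives in the subtree to that item's right (the "≥" in the rules), with the actual values stored in the leaves.

```python
import bisect

class Node:
    """Hypothetical node: sorted items plus k+1 subtrees (none for a leaf)."""
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = children or []

def search(node, x):
    """Follow the subtree rules down to a leaf, then look for x there."""
    while node.children:
        # bisect_right counts the items <= x, which is exactly the index
        # of the subtree S(i+1) holding values >= Ii and < I(i+1).
        i = bisect.bisect_right(node.keys, x)
        node = node.children[i]
    return x in node.keys

# The example tree, as read from the figure (an assumption).
tree = Node([10, 20, 30], [
    Node([3, 7], [Node([1]), Node([3, 4, 5]), Node([7, 8])]),
    Node([15], [Node([10, 12]), Node([15, 17])]),
    Node([22, 28], [Node([20, 21]), Node([22, 26]), Node([28, 29])]),
    Node([40, 50, 60], [Node([30]), Node([40, 45]), Node([55]), Node([65, 70])]),
])

print(search(tree, 30))  # 30 equals I3 in the root, so it is found in S4
print(search(tree, 9))   # no leaf holds 9
```

Each loop iteration corresponds to one node visited, so on disk the cost is one access per level of the tree.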
2-4 Trees
◦ Specific type of multiway tree.
1) Every node must have between 2 and 4 children. (Thus each internal node must store between 1 and 3 keys.)
2) All external nodes (the null children of leaf nodes) have the same depth, and so do all leaf nodes.
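Example of a 2-4 tree
[Figure: example 2-4 tree]
Root: 10 20 30
  Child 1: 3 7, with leaves (1) (3 4 5) (7 8)
  Child 2: 15, with leaves (10 12) (15 17)
  Child 3: 22 28, with leaves (20 21) (22 26) (28 29)
  Child 4: 40 50 60, with leaves (30) (40 45) (55) (65 70)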
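The two 2-4 tree rules can be checked mechanically. This is a minimal sketch under stated assumptions: `Node` and `is_2_4_tree` are hypothetical names, a leaf is a node with no children, and the check also requires that a node with c children stores c-1 keys.

```python
class Node:
    """Hypothetical node: 1 to 3 keys and, if internal, 2 to 4 children."""
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = children or []

def is_2_4_tree(root):
    """Rule 1: every internal node has 2..4 children.
    Rule 2: all leaves sit at the same depth."""
    leaf_depths = set()

    def visit(node, depth):
        if not node.children:                      # leaf
            leaf_depths.add(depth)
            return 1 <= len(node.keys) <= 3
        if not 2 <= len(node.children) <= 4:       # rule 1
            return False
        if len(node.keys) != len(node.children) - 1:
            return False
        return all(visit(c, depth + 1) for c in node.children)

    return visit(root, 0) and len(leaf_depths) == 1   # rule 2

# A small valid tree: every leaf is at depth 1.
good = Node([10, 20], [Node([3, 7]), Node([13, 15, 17]), Node([22])])

# An invalid tree: one leaf at depth 1, two leaves at depth 2.
bad = Node([10], [Node([5]), Node([20], [Node([15]), Node([25])])])

print(is_2_4_tree(good), is_2_4_tree(bad))
```

The second tree shows why simply hanging new values off a deeper level is not allowed: rule 1 still holds, but rule 2 fails.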
Insert
Insert 4 into the 2-4 tree below.
◦ Compare 4 to vals in root node.
  ◦ 4 < 10, so 4 goes in the subtree to the left of 10.
  ◦ 4 > 3 and 4 < 7, so 4 goes in the subtree to the right of 3 and left of 7.
[Figure: 2-4 tree with root (10 20 30) and children (3 7), (15), (22 28), (40 50 60). The leaves under (3 7) are (1) (5) (8). After the insert, the leaf (5) becomes (4 5).]
Problems with Insert
What if the node that a value gets inserted into is full?
◦ We could just insert the value into a new level of the tree.
◦ BUT then not ALL of the external nodes will have the same depth after this.
Insert 18 into the following tree:
[Figure: root (10 20) with children (3 7), (13 15 17), (22)]
[Figure: after adding 18, root (10 20) with children (3 7), (13 15 17 18), (22)]
The node (13 15 17 18) has too many values.
You can send one of the values to the parent (the book’s convention is to send the 3rd value).
[Figure: root (10 17 20) with children (3 7), (13 15), (18), (22)]
Moving 17 forces you to “split” the other three values.
Other Problems with Insert
In the last example, the parent node was able to “accept” the 17.
◦ What if the parent root node becomes full?
Insert 12 into the 2-4 Tree below:
[Figure: root (10 20 30) with children (5), (11 14 17), (25), (32 37)]
[Figure: after adding 12, root (10 20 30) with children (5), (11 12 14 17), (25), (32 37)]
Using the rule from before, let’s move the 3rd value, 14, up.
[Figure: with 14 moved up, root (10 14 20 30) with children (5), (11 12), (17), (25), (32 37)]
Now the root has too many values AND subtrees!
We can just repeat the process and make a new root.
[Figure: new root (20) with children (10 14) and (30). The children of (10 14) are (5), (11 12), (17). The children of (30) are (25), (32 37).]
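The insert walkthrough above can be sketched as follows. This is an illustrative sketch, not the book's implementation: `Node`, `insert`, and `_insert` are hypothetical names, duplicates are not handled, and the split follows the convention stated on the slides of sending the 3rd value to the parent.

```python
import bisect

class Node:
    """Hypothetical 2-4 tree node; an empty children list means a leaf."""
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []

def insert(root, x):
    """Insert x and return the (possibly new) root."""
    split = _insert(root, x)
    if split is None:
        return root
    mid, right = split
    # The old root overflowed: its 3rd value becomes a new root.
    return Node([mid], [root, right])

def _insert(node, x):
    """Insert x below node. If node ends up with 4 keys, split it and
    return (value sent to the parent, new right sibling); else None."""
    i = bisect.bisect_left(node.keys, x)
    if not node.children:
        node.keys.insert(i, x)                  # leaf: just add the value
    else:
        split = _insert(node.children[i], x)    # descend into subtree i
        if split is None:
            return None
        mid, right = split
        node.keys.insert(i, mid)                # parent "accepts" the value
        node.children.insert(i + 1, right)
    if len(node.keys) <= 3:
        return None
    # Too many values: send the 3rd one up and split the rest.
    mid = node.keys[2]
    right = Node(node.keys[3:], node.children[3:])
    node.keys = node.keys[:2]
    node.children = node.children[:3]
    return mid, right

# The tree from the example: inserting 12 splits a leaf, then the root.
root = Node([10, 20, 30],
            [Node([5]), Node([11, 14, 17]), Node([25]), Node([32, 37])])
root = insert(root, 12)
print(root.keys, [c.keys for c in root.children])
```

Because a split only ever adds a level at the root, every leaf stays at the same depth, which is what inserting into a new level at the bottom would have broken.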
Deletion from a 2-4 Tree
Delete a non-leaf value
◦ Replace that value with the largest value in its left subtree
◦ Or the smallest value in its right subtree.
◦ Then delete that replacement value from the leaf it came from.
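Finding the replacement value is a simple walk down one edge of a subtree. A minimal sketch, with hypothetical names `Node`, `largest`, and `smallest`, and a tree that assumes the final result of the insert example (root 20 over the nodes 10 14 and 30):

```python
class Node:
    """Hypothetical 2-4 tree node; an empty children list means a leaf."""
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = children or []

def largest(node):
    """Largest value in a subtree: keep taking the rightmost child."""
    while node.children:
        node = node.children[-1]
    return node.keys[-1]

def smallest(node):
    """Smallest value in a subtree: keep taking the leftmost child."""
    while node.children:
        node = node.children[0]
    return node.keys[0]

root = Node([20], [
    Node([10, 14], [Node([5]), Node([11, 12]), Node([17])]),
    Node([30], [Node([25]), Node([32, 37])]),
])

# Candidates for replacing the non-leaf value 20:
print(largest(root.children[0]))   # largest value in its left subtree
print(smallest(root.children[1]))  # smallest value in its right subtree
```

Either candidate always sits in a leaf, so deleting a non-leaf value reduces to deleting a value from a leaf.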
Deletion from a 2-4 Tree
Delete a value from a leaf node
◦ In the standard case:
  ◦ A value can simply be removed from a leaf node that stores more than one value.
  ◦ Requires no structural change.
[Figure: Delete 5. Before: root (10 20 30) with leftmost leaf (5 7). After: the same tree with leftmost leaf (7); the root and the other leaves are unchanged.]
Deletion from a 2-4 Tree
BUT what if the leaf node has ONLY one value?
◦ If you get rid of it, then it would violate the 2-4 property that all leaf nodes MUST be at the same depth of the tree.
Break up into 2 cases:
1) An adjacent sibling has more than one value stored in its node.
2) An adjacent sibling does NOT have more than one value stored in its node, and a fusion operation MUST be performed.
Deletion from a 2-4 Tree
Case 1:
◦ An adjacent sibling has more than one value stored in its node.
◦ Consider deleting 5 from the following tree:
  ◦ Take the 10 from the parent to replace the 5,
  ◦ And then simply replace the 10 in the parent with the smallest value in its right subtree (12).
  ◦ This is okay, since there is more than one value in that sibling node.
[Figure: Delete 5. Before: root (10 20 30) with leaves (5) (12 17) (23 27) (35). After: root (12 20 30) with leaves (10) (17) (23 27) (35).]
Deletion from a 2-4 Tree
Case 2:
◦ An adjacent sibling does NOT have more than one value stored in its node, so a fusion operation MUST be performed.
◦ The fusion operation is a little more difficult, since it may result in needing another fusion at a parent node.
Delete 5:
[Figure: root (10 20 30) with leaves (5) (15) (25) (35). Deleting 5 leaves an empty leaf: ( ) (15) (25) (35).]
Fuse empty node with 15:
[Figure: root (10 20 30) with leaves (15) (25) (35).]
We have 3 child nodes when we should have 4. Thus we can drop a value from the parent; we will drop 10.
Drop one parent value into the fused node:
[Figure: Before: root (10 20 30) with leaves (15) (25) (35). After: root (20 30) with leaves (10 15) (25) (35).]
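The three leaf-deletion situations (standard case, Case 1, Case 2) can be sketched together. This is a simplified illustration, not the book's code: `Node` and `delete_from_leaf` are hypothetical names, the function only handles leaves directly under the given parent, and it does not continue the repair upward if the fusion leaves the parent without values.

```python
class Node:
    """Hypothetical 2-4 tree node; an empty children list means a leaf."""
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = children or []

def delete_from_leaf(parent, i, x):
    """Remove x from the leaf parent.children[i], repairing an empty leaf."""
    kids = parent.children
    leaf = kids[i]
    leaf.keys.remove(x)
    if leaf.keys:
        return                      # standard case: no structural change
    if i + 1 < len(kids) and len(kids[i + 1].keys) > 1:
        # Case 1 (right sibling can spare a value): the parent value
        # drops into the leaf, the sibling's smallest value moves up.
        leaf.keys.append(parent.keys[i])
        parent.keys[i] = kids[i + 1].keys.pop(0)
    elif i > 0 and len(kids[i - 1].keys) > 1:
        # Case 1 (left sibling can spare a value): mirror image.
        leaf.keys.insert(0, parent.keys[i - 1])
        parent.keys[i - 1] = kids[i - 1].keys.pop()
    else:
        # Case 2: fuse the empty leaf with a sibling and drop the
        # parent value between them into the fused node.
        j = i if i + 1 < len(kids) else i - 1
        left, right = kids[j], kids[j + 1]
        left.keys = left.keys + [parent.keys.pop(j)] + right.keys
        del kids[j + 1]

# Case 1 example from the slides: delete 5.
case1 = Node([10, 20, 30],
             [Node([5]), Node([12, 17]), Node([23, 27]), Node([35])])
delete_from_leaf(case1, 0, 5)
print(case1.keys, [c.keys for c in case1.children])

# Case 2 example from the slides: delete 5, then fuse.
case2 = Node([10, 20, 30],
             [Node([5]), Node([15]), Node([25]), Node([35])])
delete_from_leaf(case2, 0, 5)
print(case2.keys, [c.keys for c in case2.children])
```

In Case 2 the parent loses one value and one child, so a parent that held only one value would itself become empty; that is the situation in which another fusion is needed one level up.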
Examples on the Board