44
B-Trees CMSC 420: Lecture 9

B-Trees - Carnegie Mellon School of Computer Scienceckingsf/bioinfo-lectures/btrees.pdf · B-trees • A B-tree of order b is an a,b-tree with b = 2a-1-In other words, we choose the

Embed Size (px)

Citation preview

B-TreesCMSC 420: Lecture 9

Another way to achieve “balance”

• Height of a perfect binary tree of n nodes is O(log n).

• Idea: Force the tree to be perfect.

- Problem: can’t have an arbitrary # of nodes.

- Perfect binary trees only have 2h-1 nodes

• So: relax the condition that the search tree be binary.

• As we’ll see, this lets you have any number of nodes while keeping the leaves all at the same depth.

Global balance instead of the local balance of AVL trees.

2,3 Trees

• All leaves are at the same level.

• Each internal node has either 2 or 3 children.

• If it has:

- 2 children => it has 1 key

- 3 children => it has 2 keys

67 75

keys < 67

keys 68...74

keys > 75

24

keys < 24

keys > 24

2-node: 3-node:

2,3 Tree Find (multiway searching)

50

11 45 49 55 60

62

7312 19

17 42

25

Find 19

Standard BST-type walk down the tree. At each node have to examine each key stored there.

2,3 Tree Find (multiway searching)

50

11 45 49 55 60

62

7312 19

17 42

25

Find 19

Standard BST-type walk down the tree. At each node have to examine each key stored there.

2,3 Tree Find (multiway searching)

50

11 45 49 55 60

62

7312 19

17 42

25Find 19

Standard BST-type walk down the tree. At each node have to examine each key stored there.

2,3 Tree Find (multiway searching)

50

11 45 49 55 60

62

7312 19

17 42

25Find 19

Find 62

Standard BST-type walk down the tree. At each node have to examine each key stored there.

2,3 Tree Find (multiway searching)

50

11 45 49 55 60

62

7312 19

17 42

25Find 19

Find 62

Standard BST-type walk down the tree. At each node have to examine each key stored there.

2,3 Tree Insertion

73

11 69 70 76 80

94

9817 25

22 68

2,3 Tree Insertion

73

11 69 70 76

102

80

94

9817 25

22 68

2,3 Tree Insertion

73

11 69 70 76

102

80

94

9817 25

22 68

2,3 Tree Insertion

73

11 69 70 76 10280

94

9817 25

22 68

2,3 Tree Insertion

73

11 69 70 76

13

10280

94

9817 25

22 68

2,3 Tree Insertion

73

11 69 70 76

13

10280

94

9817 25

22 68

2,3 Tree Insertion

73

11 69 70 76 13 102

Overflow!

80

94

9817 25

22 68

2,3 Tree Insertion

73

11 69 70 76 13 102

Overflow!

Try Key Rotation: Look for left or right sibling with some space, move a parent key into it, and a child key into the parent

80

94

9817 25

22 68

2,3 Tree Insertion

73

11 69 70 76 13 102

Overflow!

Try Key Rotation: Look for left or right sibling with some space, move a parent key into it, and a child key into the parent

80

94

9817 25

22 68

2,3 Tree Insertion

73

11 69 70 76 13 102

Overflow!

Try Key Rotation: Look for left or right sibling with some space, move a parent key into it, and a child key into the parent

80

94

9817 2522

68

2,3 Tree Insertion

73

11 69 70 76 13 102

Overflow!

Try Key Rotation: Look for left or right sibling with some space, move a parent key into it, and a child key into the parent

80

94

98

17

2522

68

2,3 Tree Insertion – When key rotation fails

11 69 70 76 13 10280

94

9822 25

17 68

27 73

2,3 Tree Insertion – When key rotation fails

11 69 70 76 13 10280

94

9822 25

17 6827

73

2,3 Tree Insertion – When key rotation fails

11 69 70 76 13 10280

94

9822 25

17 68

27

73

2,3 Tree Insertion – When key rotation fails

11 69 70 76 13 10280

94

9822 25

17 68

27

Overflow!If both siblings are filled, you have to split the node.

73

2,3 Tree Insertion – When key rotation fails

11 69 70 76 13 10280

94

98

17 68

2722 27

25

Overflow!If both siblings are filled, you have to split the node.

73

2,3 Tree Insertion – When key rotation fails

11 69 70 76 13 10280

94

9822 27

17 6825Overflow!

73

May have to recursively split nodes, working back to the root.

2,3 Tree Insertion – When key rotation fails

11 69 70 76 13 10280

94

9822 27

17 68

25

Overflow!

73

May have to recursively split nodes, working back to the root.

2,3 Tree Insertion – When key rotation fails

11 69 70 76 13 10280

94

9822 27

25

Overflow!17 68

73

May have to recursively split nodes, working back to the root.

2,3 Tree Insertion – Another Splitting Example

11 69 70 76 80

94

9822 27

17 68

7325

10213

85

2,3 Tree Insertion – Another Splitting Example

11 69 70 76 80

94

9822 27

17 68

7325

10213

85

2,3 Tree Insertion – Another Splitting Example

11 69 70 76 80

94

9822 27

17 68

7325

10213 85

2,3 Tree Insertion – Another Splitting Example

11 69 70

94

9822 27

17 68

7325

10213 8576 85

80

2,3 Tree Insertion – Splitting at the root

11 69 7022 27

17 68

7325

98 10213 76

9480

85

87

2,3 Tree Insertion – Splitting at the root

11 69 7022 27

17 68

7325

98 10213 76

9480

85 87

2,3 Tree Insertion – Splitting at the root

11 69 7022 27

17 68

7325

98 10213 76

9480

85 87

105

2,3 Tree Insertion – Splitting at the root

11 69 7022 27

17 68

7325

98 10213 76

9480

85 87 105

2,3 Tree Insertion – Splitting at the root

11 69 7022 27

17 68

7325

13 76

9480

85 87 10598 105

102

2,3 Tree Insertion – Splitting at the root

11 69 7022 27

17 68

7325

13 76 85 87 10598 105

80 102

94

2,3 Tree Insertion – Splitting at the root

11 69 7022 27

17 68

7325

13 76 85 87 98 105

80 102

94

2,3 Tree Insertion – Splitting at the root

11 69 7022 27

17 68

13 76 85 87 98 105

80 102

73

25 94

(b+1)/2 -1 keys(b+1)/2 -1 keys

From 2,3-Trees to a,b-trees

• An a,b-tree is a generalization of a 2,3-tree, where each node (except the root) has between a and b children.

• Root can have between 2 and b children.

• We require that:

- a ≥ 2 (can’t allow internal nodes to have 1 child)

- b ≥ 2a - 1 (need enough children to make split work)

b keys1 key

Overflow!(b+1)/2 children ≥ a = a

b+1 children

(b+1)/2 children ≥ a = a

a,b Insertions & Deletions

A B C D E D EA B

C

C D

B

A A B C D

split

merge

C D E

B

A D EA B

Ckey rotation

Deletion Details

• Try to borrow a key from a sibling if you have one that has an extra (≥ 1 more than the minimum)

• Otherwise, you have a sibling with exactly the minimum number a-1 of keys.

• Since you are underflowing, you must have one less than the minimum number of keys = a-2.

• Therefore, merging with your sibling produces a node with a-1 + a-2 = 2a-3 keys.

• This is one less than the maximum (2a-2 keys), so we have room to bring down the key that split us from our sibling.

B-trees

• A B-tree of order b is an a,b-tree with b = 2a-1

- In other words, we choose the largest allowed a.

• Want to have large b if bringing a node into memory is slow (say reading a disc block), but scanning the node once in memory is fast.

• b is usually chosen to match characteristics of the device.

• Ex. B-tree of order 1023 has a = 512.

- If this B-tree stores n = 10 million records, its height no more than O(loga n) ≈ 2.58. So only around 3 blocks need to be read from disk.

Each node (page) is at least 50% full.

What if b is very large?

• Need to be able to find which subtree to traverse.

• Could linearly search through keys – technically constant time if b is a constant, but may be time consuming.

• Solution: Store a balanced tree (AVL or splay) at each node so that you can search for keys efficiently.