Database design and Implementation 05.btree

  • Upload
    sushmsn

  • View
    233

  • Download
    0

Embed Size (px)

Citation preview

  • 8/8/2019 Database design and Implementation 05.btree

    1/151

    Database Systems Implementation, Bongki Moon 1

    Introduction to IndexingB+-Tree Indexes

    Ramakrishnan & Gehrke: Chap. 8.2, 10.3-10.8

    Database Systems Implementation, Bongki Moon 2

    Overview

    Index classification Primay/Secondary, Clustered/Non-clustered

    Sparse/Dense, Single-attribute/Composite

    B+-Tree

    Structure, Search etc.

    Insert and Delete algorithms

    Duplicate keys, variable-length keys, bulk loading

  • 8/8/2019 Database design and Implementation 05.btree

    2/15

    2

    Database Systems Implementation, Bongki Moon 3

    Basics

    To speed up selections on the search key fields.

    Any subset of the fields of a relation can be thesearch key for an index on the relation.

    Typically stored as a separate file (or relation).

    Typically much smaller than base relations.

    Database Systems Implementation, Bongki Moon 4

    Index Classification

    Primary vs. secondary: If search key containsprimary key, then called primary index.

    Primary index value is unique (e.g., SSN). Secondary index allows duplicate key values (e.g.

    Names, GPA).

    Unique index: Search key is or contains a candidatekey.

  • 8/8/2019 Database design and Implementation 05.btree

    3/15

    3

    Database Systems Implementation, Bongki Moon 5

    Index Classification

    Clustered vs. Non-clustered: If order of data records is the same as, or`close to, order of keys, then called clustered index.

    How many clustered and non-clustered indexes can a relation have?

    Index entries

    Data entries

    direct search for

    (Index File)

    (Data file)

    Data Records

    data entries

    Data entries

    Data Records

    CLUSTERED UNCLUSTERED

    Database Systems Implementation, Bongki Moon 6

    Index Classification

    Dense vs. Sparse: Ifthere is 1-to-1 mappingbetween key values (in

    index) and data records,then the index is dense. Can a clustered index be

    dense or sparse?

    Can a non-clusteredindex be dense or sparse?

    Ashby, 25, 3000

    Smith, 44, 3000

    Ashby

    Cass

    Smith

    22

    25

    30

    40

    44

    44

    50

    Sparse Indexon

    Name Data File

    Dense Indexon

    Age

    33

    Bristow, 30, 2007

    Basu, 33, 4003

    Cass, 50, 5004

    Tracy, 44, 5004

    Daniels, 22, 6003

    Jones, 40, 6003

  • 8/8/2019 Database design and Implementation 05.btree

    4/15

    4

    Database Systems Implementation, Bongki Moon 7

    Benefits and Costs

    Cost of retrieving data records through index variesgreatly based ontypes of indexes. (E.g.) For an exact-match query with a unique/non-unique, clustered/non-

    clustered, dense/sparse index, how many pages need to be retrieved from theindex and the base relation?

    (E.g.) How about a range query?

    Some may be more costly than others To build or maintain a clustered index, the records in the base relation must

    be sorted and be in sorted order.

    For dynamic data, it will be costly to maintain the sorted order.

    To reduce the cost of dynamic insertions, Keep some free space on each page of the base relation for future insertions.

    Use overflow pages (with links) for more future insertions. (Thus, order of datarecords is `close to, but not identical to, the sort order.)

    Ex) Cost of indexed scan: clustered vs. non-cluster. Which is better?

    Database Systems Implementation, Bongki Moon 8

    Composite Search Keys

    Search on a combination ofcolumns. Equality query:

    age=20 and sal =75

    Range query: age =20; or age=20 and sal > 10

    Keys in index should be in sortedorder by search key to supportrange queries. But, how for multiple columns? Certain queries may benefit from a

    particular order. (E.g.) age-then-sal, sal-then-age

    Asymmetric vs. Symmetric

    sue 13 75

    bob

    cal

    joe 12

    10

    20

    8011

    12

    name age sal

    12,20

    12,10

    11,80

    13,75

    20,12

    10,12

    75,13

    80,11

    11

    12

    12

    13

    10

    20

    75

    80

    Data recordssorted by name

    Data entries in indexsorted by

    Data entriessorted by

    Examples of composite keyindexes using lexicographic order.

  • 8/8/2019 Database design and Implementation 05.btree

    5/15

    5

    Database Systems Implementation, Bongki Moon 9

    B+-Tree: Introduction

    Most widely used.

    Support both range and equality searchesefficiently.

    Dynamic index Adjusts gracefully under insertions and deletions.

    Keeps tree height-balanced. The cost of exact-match/insertion/deletion is log FN.

    F is the fanout, and N is the # leaf nodes.

    Database Systems Implementation, Bongki Moon 10

    B+-Tree: Structural Characteristics

    Root node has at least two children.

    Minimum 50% occupancy (except for root). For a B+- treeof order d,

    An internal node hasd/2

    m

    d children. A leaf node has d/2 m d-1 pointers to data pages.

    Index Entries

    Data Entries

    ("Sequence set")

    (Direct search)<

  • 8/8/2019 Database design and Implementation 05.btree

    6/15

    6

    Database Systems Implementation, Bongki Moon 11

    B+-Tree: Node Structures

    Internal nodes

    N : the number of valid entries.

    Ki : key values (K1 < K2 < < Kd-1).

    Pi : tree pointers to internal or leaf nodes.

    Leaf nodes

    Ppred, Pnext : pointers to neighboring leaf nodes.

    Pi : data pointers to a record or a block.

    N P1 K1 P2 K2 Pd-1 Kd-1 Pd...

    P2K2P1K1PpredN ... PnextPd-1Kd-1

    Database Systems Implementation, Bongki Moon 12

    Example B+-Tree (of order 5)

    Search begins at root, and key comparisonsdirect it to a leaf.

    (E.g.) Search for 5, 15, or all data entries > 24 ...Root

    17 24 30

    2* 3* 5* 7* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39*

    13

  • 8/8/2019 Database design and Implementation 05.btree

    7/15

    7

    Database Systems Implementation, Bongki Moon 13

    Inserting a Key into a B+-Tree

    Find a correct leaf node L.

    Put a key entry into L. If L has enough space, done!

    Else, must split L (into L and a new node L2) Redistribute entries evenly, copy up the middle key.

    Insert a new index entry pointing to L2 into the parent of L.

    Node splitting can happen recursively To split a non-leaf node, redistribute entries evenly, but

    push up the middle key. (Contrast with leaf splits.)

    Splits grow tree wider or taller Splitting the root node makes the tree one level taller.

    Database Systems Implementation, Bongki Moon 14

    Inserting 8* into Example B+-Tree

    2* 3* 5* 7* 8*

    5

    Note that 5 is copied up and

    continues to appear in the leaf.

    5 24 30

    17

    13

    Note that 17 is pushed up andappears once in the index.

    Root

    17 24 30

    2* 3* 5* 7* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39*

    13

  • 8/8/2019 Database design and Implementation 05.btree

    8/15

    8

    Database Systems Implementation, Bongki Moon 15

    Example B+-Tree After Inserting 8*

    Notice that root was split, leading to increase in height.

    In this example, we can avoid split by re-distributingentries; however, this is usually not done in practice.

    2* 3*

    Root

    17

    24 30

    14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39*

    135

    7*5* 8*

    Database Systems Implementation, Bongki Moon 16

    Deleting a Key from a B+-Tree

    Start at the root, find a leaf L where the entry belongs.

    Remove the entry. If L is at least half-full, done!

    If L has less thand/2

    entries (pointers),

    Try to re-distribute, borrowing from sibling (adjacent node withsame parent as L).

    If re-distribution fails, merge L and a sibling.

    If merge occurred, must delete an entry (pointing to L orsibling) from the parent of L.

    Merge could propagate to the root, decreasing the height.

  • 8/8/2019 Database design and Implementation 05.btree

    9/15

    9

    Database Systems Implementation, Bongki Moon 17

    Example: Deleting 19* and 20*

    Deleting 20* is done with re-distribution (by moving key 24). The key 24 is removed from the parent. The new middle key 27 is copied up.

    2* 3*

    Root

    17

    30

    14*16* 33*34*38*39*

    135

    7*5* 8* 22*24*

    27

    27*29*

    2* 3*

    Root17

    24 30

    14*16* 19*20*22* 24* 27*29* 33*34*38*39*

    135

    7*5* 8*

    Database Systems Implementation, Bongki Moon 18

    ... And Then Deleting 24*

    Two leaf nodes merge : Key 27 is removed from the parent.

    2* 3*

    Root

    17

    30

    14*16* 33*34*38*39*

    135

    7*5* 8* 22*24*

    27

    27*29*

    2* 3*

    Root

    17

    14*16*

    135

    7*5* 8* 22* 27*

    30

    33* 34*29* 38* 39*

  • 8/8/2019 Database design and Implementation 05.btree

    10/15

    10

    Database Systems Implementation, Bongki Moon 19

    Example B+-Tree After Deleting 24*

    2* 3* 7* 14* 16* 22* 27* 29* 33* 34* 38* 39*5* 8*

    Root30135 17

    2* 3*

    Root

    17

    14*16*

    135

    7*5* 8* 22* 27*

    30

    33* 34*29* 38* 39*

    Two internal nodes merge : Key 17 ispulled down from the root.

    Database Systems Implementation, Bongki Moon 20

    Example: Deleting 24*

    Root

    135 17 20

    22

    30

    14*16* 17*18* 20* 33*34*38*39*22*27*29*21*7*5* 8*3*2*

    A different scenario: Deleting 24* causes two leaf nodes merged. Then, this causes an underflow in the internal node.

    Root

    135 17 20

    22

    27

    14*16* 17*18* 20* 27*29*22*24*21*7*5* 8*3*2* 34*38*39*

    30

    33*

  • 8/8/2019 Database design and Implementation 05.btree

    11/15

    11

    Database Systems Implementation, Bongki Moon 21

    Example of Non-leaf Re-distributionRoot

    135 17 20

    22

    30

    14*16* 17*18* 20* 33*34*38*39*22*27*29*21*7*5* 8*3*2*

    14*16* 33*34*38*39*22*27*29*17*18* 20*21*7*5* 8*2* 3*

    Root

    135

    17

    3020 22

    Entries are re-distributed by pushing through the entries in the parent node. Pull down 22 from the root, and push up 20 to the root. We can stop here. Its ok to do once more. Pull down 20 from the root, and push up 17 to the root.

    Database Systems Implementation, Bongki Moon 22

    Summary of B+-tree Operations

    Insert Split a leaf node; copy up the middle key.

    Split an internal node;push up the middle key.

    Delete Merge leaf nodes; remove the middle key from the parent. Merge internal nodes;pull down the middle key from the parent.

    Redistribute leaf nodes; remove the middle key from the parent, andcopy up a new middle key.

    Redistribute internal nodes;pull down the middle key from theparent to one child node, andpush up a new middle key from theother child node.

  • 8/8/2019 Database design and Implementation 05.btree

    12/15

    12

    Database Systems Implementation, Bongki Moon 23

    Other Issues of B+-tree

    1. Duplicate Key Values

    2. Prefix Key Compression

    Variable-length Keys such as strings

    3. Bulk-Loading

    4. Choosing an Optimal Node Size

    Database Systems Implementation, Bongki Moon 24

    Duplicate Key Values

    Three alternatives for Leaf Page Layout:1. Multiple pairs

    2. One-key and multiple-pointers

    variable-length records, variable fanout.3. Another level of indirection

    additional overhead for dereferencing and disk accesses.

    Page Overflows: several leaf pages may contain entrieswith the same key value, if the 1st or 2nd layout option is selected.

    The 3rd layout option may be a better choice for this.

    How about AM Layer of MiniRel project?

  • 8/8/2019 Database design and Implementation 05.btree

    13/15

    13

    Database Systems Implementation, Bongki Moon 25

    Variable-Length Keys

    Longer keys may reduce the fan-out and grow the indextaller. (The taller the tree, the more disk accesses.)

    Prefix-key compression is done to increase fan-out.

    Key values in index entries only `direct traffic; can oftencompress them. (E.g.) If we have adjacent index entries with search key values

    Dannon Yogurt, David Smith and Devarakonda Murthy, we canabbreviate them to Dan, Dav and Dev. What if there is a data entry Davey Jones? (Then, we can only

    compress David Smith to Davi)

    Insert/delete must be suitably modified.

    What if the keys (or strings) are longer than a page?

    Database Systems Implementation, Bongki Moon 26

    Bulk Loading of a B+-Tree

    For a large collection of records, building a B+-tree byrepeatedly inserting records is very slow. Does not give sequential storage of leaves.

    Bulk-Loading:

    Start with an empty root node and sorted entries Insert a leaf node at a time into the right-most slot of the root or

    parent node.

    3* 4* 6* 9* 10* 11* 12* 13* 20*22* 23* 31* 35* 36* 38* 41* 44*

    Sorted pages of data entries; not yet in B+ treeRoot

  • 8/8/2019 Database design and Implementation 05.btree

    14/15

    14

    Database Systems Implementation, Bongki Moon 27

    B+-Trees in Practice

    Typical order: 200. Typical fill-factor: 67%. Block: 8KB, key: 36 Bytes, Pointer: 4 Bytes.

    average fanout = 133

    Typical capacities: Height 3: 1333 = 2,352,637 records

    Height 4: 1334 = 312,900,700 records

    Can often hold top levels in buffer pool: Level 1 = 1 page = 8 Kbytes

    Level 2 = 133 pages = 1 Mbytes

    Level 3 = 17,689 pages = 133 MBytes

    Database Systems Implementation, Bongki Moon 28

    Optimal Size of B-tree Nodes

    Problem Small pages make B-tree taller and inefficient.

    Large pages increase transfer time.

    Idea: Find a Page Size that maximizes the benefit-to-cost

    ratio. Benefit = log2F

    Because F(fanout) Page size; Height 1/log2F

    AccessCost = disk_latency + PageSize/TransferRate (ignoringcache effects).

    From Table 6 in Gray&Graefe 5 Minute Rule paper, 8/16/32 Kbytes are near optimal, when one entry is 20 bytes

    long.

  • 8/8/2019 Database design and Implementation 05.btree

    15/15

    Database Systems Implementation, Bongki Moon 29

    Summary

    B+-tree is a dynamic structure. Inserts/deletes leave tree height-balanced; log F N cost.

    Adjusts to growth gracefully.

    High fanout (F) means depth rarely more than 3 or 4.

    Typically, 67% occupancy on average.

    We will discuss B-tree locking soon.

    Most widely used index in database management systemsbecause of its versatility. One of the most optimized

    components of a DBMS. Discussions

    B-Tree vs. B+-Tree

    BST vs. B+-Tree