Data Storage and Indexing

Outline
• Data Storage
  – storing data: disks
  – buffer manager
  – representing data
  – external sorting
• Indexing
  – types of indexes
  – B+ trees
  – hash tables (maybe)

Physical Addresses
• Each block and each record has a physical address that consists of:
  – The host
  – The disk
  – The cylinder number
  – The track number
  – The block within the track
  – For records: an offset in the block
    • sometimes this is in the block's header

Logical Addresses
• Logical address: a string of bytes (10–16)
• More flexible: can move blocks/records around
• But we need a translation table:

  Logical address | Physical address
  L1              | P1
  L2              | P2
  L3              | P3

Main Memory Address
• When the block is read into main memory, it receives a main memory address
• The buffer manager has another translation table:

  Memory address | Logical address
  M1             | L1
  M2             | L2
  M3             | L3
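As an illustration of how the two translation tables compose, resolving a main-memory address back to a disk location takes one lookup in each table. This is a hypothetical sketch (the dictionary names are invented, and real systems store these as compact structures, not Python dicts):

```python
# Hypothetical sketch of the two translation tables, chained together.
logical_to_physical = {"L1": "P1", "L2": "P2", "L3": "P3"}
memory_to_logical = {"M1": "L1", "M2": "L2", "M3": "L3"}

def physical_of(memory_addr):
    """Resolve a main-memory address back to the block's physical address."""
    logical = memory_to_logical[memory_addr]   # buffer manager's table
    return logical_to_physical[logical]        # logical-address table

print(physical_of("M2"))  # P2
```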


The I/O Model of Computation

• In main memory algorithms – we care about CPU time

• In databases time is dominated by I/O cost

• Assumption: cost is given only by I/O

• Consequence: need to redesign certain algorithms

• Will illustrate here with sorting

Sorting
• Illustrates the difference in algorithm design when your data is not in main memory:
  – Problem: sort 1 GB of data with 1 MB of RAM
• Arises in many places in database systems:
  – Data requested in sorted order (ORDER BY)
  – Needed for grouping operations
  – First step in the sort-merge join algorithm
  – Duplicate removal
  – Bulk loading of B+-tree indexes

2-Way Merge-Sort: Requires 3 Buffers
• Pass 1: Read a page, sort it, write it.
  – only one buffer page is used
• Pass 2, 3, …, etc.:
  – three buffer pages used

[Figure: two input buffers (INPUT 1, INPUT 2) and one OUTPUT buffer in main memory; pages stream in from disk, are merged, and are written back to disk]

Two-Way External Merge Sort
• Each pass we read + write each page in the file
• N pages in the file => the number of passes is ⌈log2 N⌉ + 1
• So the total cost is: 2N(⌈log2 N⌉ + 1)
• Improvement: start with larger runs
• Sort 1 GB with 1 MB memory in 10 passes

[Figure: the input file 3,4 | 6,2 | 9,4 | 8,7 | 5,6 | 3,1 | 2 becomes sorted 1-page runs in pass 0, then 2-page, 4-page, and 8-page runs in passes 1, 2, and 3]
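The pass structure above can be sketched in a few lines of Python. This is an illustration only: "pages" are small lists and everything stays in memory, so you can follow the run-merging pattern without real disk I/O.

```python
def merge(run1, run2):
    """Merge two sorted runs (models 2 input buffers + 1 output buffer)."""
    out, i, j = [], 0, 0
    while i < len(run1) and j < len(run2):
        if run1[i] <= run2[j]:
            out.append(run1[i]); i += 1
        else:
            out.append(run2[j]); j += 1
    return out + run1[i:] + run2[j:]

def two_way_external_sort(pages):
    runs = [sorted(p) for p in pages]   # pass 0: sort each page -> 1-page runs
    passes = 1
    while len(runs) > 1:                # passes 1, 2, ...: merge pairs of runs
        runs = [merge(runs[i], runs[i + 1]) if i + 1 < len(runs) else runs[i]
                for i in range(0, len(runs), 2)]
        passes += 1
    return runs[0], passes

# The slide's example file, as seven 2-record pages:
pages = [[3, 4], [6, 2], [9, 4], [8, 7], [5, 6], [3, 1], [2]]
result, passes = two_way_external_sort(pages)
print(passes)   # 4 = ceil(log2 7) + 1
```

With N = 7 pages this takes ⌈log2 7⌉ + 1 = 4 passes, matching the cost formula above.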

Can We Do Better?
• We have more main memory
• Should use it to improve performance


Cost Model for Our Analysis

• B: Block size

• M: Size of main memory

• N: Number of records in the file

• R: Size of one record


Indexes


Outline

• Types of indexes

• B+ trees

• Hash tables (maybe)

Indexes
• An index on a file speeds up selections on the search key field(s)
• Search key = any subset of the fields of a relation
  – The search key is not the same as a key (a minimal set of fields that uniquely identifies a record in a relation)
• Entries in an index: (k, r), where:
  – k = the key
  – r = the record OR record id OR record ids

Index Classification
• Clustered/unclustered
  – Clustered = records sorted in the key order
  – Unclustered = no
• Dense/sparse
  – Dense = each record has an entry in the index
  – Sparse = only some records have an entry
• Primary/secondary
  – Primary = on the primary key
  – Secondary = on any other key
  – Some books interpret these differently
• B+ tree / Hash table / …

Clustered Index
• File is sorted on the index attribute
• Dense index: sequence of (key, pointer) pairs

[Figure: index entries 10, 20, 30, …, 80, each pointing to the data record with that key; the records are stored in the same order]

Clustered Index
• Sparse index: one key per data block

[Figure: index entries 10, 30, 50, 70, 90, 110, 130, 150, each pointing to the data block that starts with that key; the blocks shown hold records 10–80]

Clustered Index with Duplicate Keys
• Dense index: point to the first record with that key

[Figure: data records 10, 10, 10, 20, 20, 20, 30, 40; each distinct index key points to the first record with that key]

Clustered Index with Duplicate Keys
• Sparse index: pointer to the lowest search key in each block
• Search for 20: the index entries are 10, 10, 20, 30, so the search must also examine the entry 10 before it — the first 20 sits in a block whose lowest key is 10

[Figure: index entries 10, 10, 20, 30 point to the data blocks (10,10), (10,20), (20,20), (30,40)]

Clustered Index with Duplicate Keys
• Better: pointer to the lowest new search key in each block
• Search for 20: follow the index entry 20 directly

[Figure: index entries 10, 20, 30, 40 point to the data blocks (10,10), (10,20), (30,30), (40,50); each entry is the lowest key that first appears in its block]

Unclustered Indexes
• To index attributes other than the primary key
• Always dense (why?)

[Figure: dense index entries 10, 10, 20, 20, 20, 30, 30, 30 point into an unsorted file holding the records 20, 30, 30, 20, 10, 20, 10, 30]

Summary: Clustered vs. Unclustered Index

[Figure: in a clustered index the data entries (index file) point to data records stored in the same order; in an unclustered index the pointers cross into a data file whose records are in unrelated order]

Composite Search Keys
• Composite search keys: search on a combination of fields
  – Equality query: every field value is equal to a constant. E.g., w.r.t. a <sal, age> index:
    • age = 20 and sal = 75
  – Range query: some field value is not a constant. E.g.:
    • age = 20; or age = 20 and sal > 10

[Figure: examples of composite-key indexes using lexicographic order. Data records (name, age, sal), sorted by name: (bob, 12, 10), (cal, 11, 80), (joe, 12, 20), (sue, 13, 75). Data entries in a <sal, age> index: 10,12; 20,12; 75,13; 80,11. In an <age, sal> index: 11,80; 12,10; 12,20; 13,75. In <age>: 11, 12, 12, 13. In <sal>: 10, 20, 75, 80]

B+ Trees
• Search trees
• Idea in B trees:
  – make 1 node = 1 block
• Idea in B+ trees:
  – make the leaves into a linked list (range queries are easier)

B+ Trees Basics
• Parameter d = the degree
• Each node has >= d and <= 2d keys (except the root)
• Each leaf has >= d and <= 2d keys

[Figure: an interior node with keys 30, 120, 240 separates subtrees for keys k < 30, 30 <= k < 120, 120 <= k < 240, and 240 <= k; a leaf with keys 40, 50, 60 points to its records and to the next leaf]

B+ Tree Example (d = 2)

[Figure: root 80; interior nodes (20, 60) and (100, 120, 140); leaves (10,15,18), (20,30,40,50), (60,65), (80,85,90), each key pointing to its data record]

B+ Tree Design
• How large should d be?
• Example:
  – Key size = 4 bytes
  – Pointer size = 8 bytes
  – Block size = 4096 bytes
• 2d × 4 + (2d + 1) × 8 <= 4096
• d = 170
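The arithmetic can be checked directly from the constraint above (2d keys and 2d + 1 pointers must fit in one block):

```python
key, ptr, block = 4, 8, 4096   # bytes, as in the example above

# 2d*key + (2d+1)*ptr <= block  =>  d <= (block - ptr) / (2*key + 2*ptr)
d = (block - ptr) // (2 * key + 2 * ptr)
print(d)  # 170
```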

Searching a B+ Tree
• Exact key values:
  – Start at the root
  – Proceed down to the leaf
• Range queries:
  – As above
  – Then sequential traversal of the leaves

SELECT name FROM people WHERE age = 25

SELECT name FROM people WHERE 20 <= age AND age <= 30
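Descending the tree is a binary search over each node's keys. A minimal sketch, assuming the node layout shown earlier (keys equal to a separator go right), with the function name invented for illustration:

```python
from bisect import bisect_right

def child_index(keys, k):
    """Child to follow in an interior node: child i covers keys[i-1] <= k < keys[i]."""
    return bisect_right(keys, k)

# Descend the example tree (d = 2) for WHERE age = 25:
print(child_index([80], 25))      # 0: go to the left interior node
print(child_index([20, 60], 25))  # 1: the leaf (20, 30, 40, 50)
```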

B+ Trees in Practice
• Typical order: 100. Typical fill-factor: 67%.
  – average fanout = 133
• Typical capacities:
  – Height 4: 133^4 = 312,900,700 records
  – Height 3: 133^3 = 2,352,637 records
• Can often hold the top levels in the buffer pool:
  – Level 1 = 1 page = 8 KB
  – Level 2 = 133 pages = 1 MB
  – Level 3 = 17,689 pages = 133 MB

Insertion in a B+ Tree
Insert (K, P)
• Find the leaf where K belongs; insert it
• If there is no overflow (2d keys or less), halt
• If there is overflow (2d + 1 keys), split the node and insert into the parent:

  [Figure: a node with keys K1 K2 K3 K4 K5 and pointers P0 … P5 splits into (K1, K2 with P0, P1, P2) and (K4, K5 with P3, P4, P5), and (K3, pointer to the new node) goes to the parent]

• If the node is a leaf, keep K3 in the right node too
• When the root splits, the new root has only 1 key
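For d = 2, the leaf-split rule above looks like this. A sketch over keys only (pointers omitted); the function name is invented:

```python
def split_leaf(keys):
    """Split an overflowing leaf of 2d+1 keys; the middle key goes to the
    parent but also stays in the right node (the B+ tree leaf rule)."""
    d = len(keys) // 2
    left, right = keys[:d], keys[d:]
    return left, right, right[0]   # (left node, right node, key for parent)

print(split_leaf([20, 25, 30, 40, 50]))  # ([20, 25], [30, 40, 50], 30)
```

This matches the insert-25 example on the following slides, where the key 30 is sent up to the parent. For an interior-node split, the middle key would move up without being kept in the right node.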

Insertion in a B+ Tree
• Insert K = 19

[Figure: the example tree (d = 2): root 80; interior nodes (20, 60) and (100, 120, 140); leaves (10,15,18), (20,30,40,50), (60,65), (80,85,90)]

Insertion in a B+ Tree
• After insertion

[Figure: 19 has been added to the first leaf, which is now (10, 15, 18, 19); no overflow, since a leaf may hold up to 2d = 4 keys]

Insertion in a B+ Tree
• Now insert 25

[Figure: same tree, with leaves (10,15,18,19), (20,30,40,50), (60,65), (80,85,90)]

Insertion in a B+ Tree
• After insertion

[Figure: 25 has gone into the second leaf, which now holds (20, 25, 30, 40, 50) — five keys]

Insertion in a B+ Tree
• But now we have to split!

[Figure: the leaf (20, 25, 30, 40, 50) has 2d + 1 = 5 keys, one too many]

Insertion in a B+ Tree
• After the split

[Figure: the overfull leaf splits into (20, 25) and (30, 40, 50); the key 30 is inserted into the parent, which becomes (20, 30, 60)]

Deletion from a B+ Tree
• Delete 30

[Figure: root 80; interior nodes (20, 30, 60) and (100, 120, 140); leaves (10,15,18,19), (20,25), (30,40,50), (60,65), (80,85,90)]

Deletion from a B+ Tree
• After deleting 30

[Figure: the leaf (30, 40, 50) becomes (40, 50); the parent key 30 may be changed to 40, or not — either way it still separates the leaves correctly]

Deletion from a B+ Tree
• Now delete 25

[Figure: leaves (10,15,18,19), (20,25), (40,50), (60,65), (80,85,90)]

Deletion from a B+ Tree
• After deleting 25: the leaf (20) has fewer than d = 2 keys, so we need to rebalance
• Rotate

[Figure: 19 is borrowed from the sibling leaf (10, 15, 18, 19), giving leaves (10, 15, 18) and (19, 20); the parent key 20 becomes 19]

Deletion from a B+ Tree
• Now delete 40

[Figure: root 80; interior nodes (19, 30, 60) and (100, 120, 140); leaves (10,15,18), (19,20), (40,50), (60,65), (80,85,90)]

Deletion from a B+ Tree
• After deleting 40: the leaf (50) underflows
• Rotation is not possible (the sibling (19, 20) has only d keys)
• Need to merge nodes

Deletion from a B+ Tree
• Final tree

[Figure: the leaves (19, 20) and (50) have been merged into (19, 20, 50), and the separator 30 removed from the parent, which is now (19, 60)]

Hash Tables
• Secondary storage hash tables are much like main memory ones
• Recall the basics:
  – There are n buckets
  – A hash function h(k) maps a key k to {0, 1, …, n − 1}
  – Store in bucket h(k) a pointer to the record with key k
• Secondary storage: bucket = block; use overflow blocks when needed

Hash Table Example
• Assume 1 bucket (block) stores 2 keys + pointers
• h(e) = 0
• h(b) = h(f) = 1
• h(g) = 2
• h(a) = h(c) = 3

[Figure: bucket 0 holds e; bucket 1 holds b, f; bucket 2 holds g; bucket 3 holds a, c]

Searching in a Hash Table
• Search for a:
  – Compute h(a) = 3
  – Read bucket 3
  – 1 disk access

[Figure: same table; bucket 3 holds a, c]

Insertion in a Hash Table
• Place the key in the right bucket, if there is space
• E.g. h(d) = 2

[Figure: d is stored in bucket 2, next to g]

Insertion in a Hash Table
• Create an overflow block if there is no space
• E.g. h(k) = 1
• More overflow blocks may be needed

[Figure: bucket 1 already holds b and f, so k goes into an overflow block chained to bucket 1]
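Insertion with overflow chaining can be sketched as follows. An illustration only: buckets are modeled as lists of fixed-capacity blocks, and the names are invented:

```python
CAP = 2                                   # keys per block, as in the example
buckets = [[[]] for _ in range(4)]        # 4 buckets, one (empty) block each

def insert(h, key):
    """Insert key into bucket h, adding an overflow block if all are full."""
    for block in buckets[h]:
        if len(block) < CAP:
            block.append(key)
            return
    buckets[h].append([key])              # new overflow block

for key in ["b", "f", "k"]:               # h(b) = h(f) = h(k) = 1
    insert(1, key)
print(buckets[1])  # [['b', 'f'], ['k']] -- k landed in an overflow block
```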

Hash Table Performance
• Excellent, if there are no overflow blocks
• Degrades considerably when the number of keys exceeds the number of buckets (i.e. many overflow blocks)

Extensible Hash Table
• Allows the hash table to grow, to avoid performance degradation
• Assume a hash function h that returns numbers in {0, …, 2^k − 1}
• Start with n = 2^i << 2^k; only look at the first i (most significant) bits

Extensible Hash Table
• E.g. i = 1, n = 2, k = 4
• Note: we only look at the first bit (0 or 1)

[Figure: directory with i = 1 and entries 0, 1; the bucket for prefix 0 holds 0(010), the bucket for prefix 1 holds 1(011); each bucket has local depth 1]

Insertion in Extensible Hash Table
• Insert 1110

[Figure: 1(110) joins 1(011) in the bucket for prefix 1]

Insertion in Extensible Hash Table
• Now insert 1010
• Need to extend the table, split blocks
• i becomes 2

[Figure: the bucket for prefix 1 already holds 1(011) and 1(110), so 1(010) does not fit]

Insertion in Extensible Hash Table
• After the split caused by 1010

[Figure: directory entries 00, 01, 10, 11 with i = 2; the bucket 0(010) keeps local depth 1 and is shared by entries 00 and 01; the buckets {10(11), 10(10)} and {11(10)} have local depth 2]

Insertion in Extensible Hash Table
• Now insert 0000, then 0101
• Need to split the block

[Figure: the depth-1 bucket now holds 0(010), 0(000), 0(101) — one key too many]

Insertion in Extensible Hash Table
• After splitting the block

[Figure: buckets {00(10), 00(00)} and {01(01)} now have local depth 2, like {10(11), 10(10)} and {11(10)}; the directory (i = 2) has entries 00, 01, 10, 11]
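The walkthrough above can be condensed into a small sketch (bucket capacity 2, 4-bit hash values). The class and method names are invented, and corner cases such as a bucket that must split repeatedly are handled only by the recursive retry; this is a minimal sketch, not a production design:

```python
K, CAP = 4, 2          # hash width in bits; keys per bucket (block)

class Bucket:
    def __init__(self, depth):
        self.depth = depth          # local depth: prefix bits this bucket uses
        self.keys = []

class ExtensibleHash:
    def __init__(self):
        self.i = 1                              # global depth
        self.dir = [Bucket(1), Bucket(1)]       # directory, indexed by prefix

    def _slot(self, h):
        return h >> (K - self.i)                # first i bits of h

    def insert(self, h):
        b = self.dir[self._slot(h)]
        if len(b.keys) < CAP:
            b.keys.append(h)
            return
        if b.depth == self.i:                   # directory must double first
            self.dir = [x for x in self.dir for _ in (0, 1)]
            self.i += 1
        b.depth += 1                            # split the full bucket
        new = Bucket(b.depth)
        for s in range(len(self.dir)):          # rewire half of b's dir slots
            if self.dir[s] is b and (s >> (self.i - b.depth)) & 1:
                self.dir[s] = new
        for key in list(b.keys):                # redistribute the old keys
            if self.dir[self._slot(key)] is new:
                b.keys.remove(key)
                new.keys.append(key)
        self.insert(h)                          # retry; may split again

t = ExtensibleHash()
for h in [0b0010, 0b1011, 0b1110, 0b1010]:      # the slides' running example
    t.insert(h)
print(t.i)  # 2: inserting 1010 forced a split and doubled the directory
```

After these inserts the table matches the slides: 0(010) alone at local depth 1, {10(11), 10(10)} and {11(10)} at local depth 2.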

Performance of an Extensible Hash Table
• No overflow blocks: access is always one read
• BUT:
  – Extensions can be costly and disruptive
  – After an extension, the table may no longer fit in memory

Linear Hash Table
• Idea: extend the table by only one bucket at a time
• Problem: n is no longer a power of 2
• Let i be such that 2^(i−1) < n <= 2^i (i.e. i = ⌈log2 n⌉)
• After computing h(k), use the last i bits:
  – If the last i bits represent a number >= n, change the msb of those bits from 1 to 0 (to get a number < n)
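A sketch of this addressing rule (the function name is invented), using n = 3 and i = 2 as in the example that follows:

```python
def bucket(h, n, i):
    """Bucket number for hash value h with n buckets, using the last i bits."""
    m = h & ((1 << i) - 1)         # last i bits of h
    if m >= n:                     # no such bucket yet:
        m &= ~(1 << (i - 1))       # change the msb of those bits from 1 to 0
    return m

print(bucket(0b0100, 3, 2))  # 0: last bits 00
print(bucket(0b0111, 3, 2))  # 1: last bits 11 = 3 >= n, msb flipped -> 01
```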

Linear Hash Table Example
• n = 3

[Figure: i = 2; buckets 00, 01, 10; bucket 00 holds (01)00 and (11)00, bucket 10 holds (10)10; the key (01)11 has last bits 11 = 3 >= n, so the bit is flipped (BIT FLIP) and it goes to bucket 01]

Linear Hash Table Example
• Insert (10)00: bucket 00 is full, so overflow blocks are used…

[Figure: bucket 00 holds (01)00 and (11)00 with (10)00 in an overflow block; bucket 01 holds (01)11; bucket 10 holds (10)10]

Linear Hash Tables
• Extension: independent of overflow blocks
• Extend n := n + 1 when the average number of records per block exceeds (say) 80%

Linear Hash Table Extension
• From n = 3 to n = 4
• Only need to touch one block (which one?)

[Figure: the new bucket is 11; only bucket 01 must be examined, since keys with last bits 11 were stored there via the bit flip — (01)11 moves to the new bucket 11, while buckets 00 and 10 are untouched]

Linear Hash Table Extension
• From n = 3 to n = 4: finished
• Extension from n = 4 to n = 5 uses a new bit (i becomes 3)
• Need to touch every single block (why?)

[Figure: the finished n = 4 table — bucket 00: (01)00, (11)00; bucket 10: (10)10; bucket 11: (01)11]
