Announcements Exam Friday Project: Steps 2.1-24 Due today


Announcements

• Exam Friday

• Project: Steps 2.1-24 – Due today

Practice Mapping

[E-R diagram. Entities: Employee (empId, socSecNo, empName, jobTitle, salary), Project (projName, startDate, endDate, budget), Department (deptNo, deptName, mgrName). Relationships: Assign (hours), WorksIn.]

More Practice

[E-R diagram. Entities: Customer, Car, Estimate, Job, Mechanic, Part. Relationships: owns, given, has, worksOn, needs, performed, used, spent, each with cardinality labels 1 and M. Other labels: RepairNeeded, RepairDone, Labor, qty.]

Physical Storage

Lecture 10

Storage Media

[Figure 13.1 (a) A single-sided disk with read/write hardware. (b) A disk pack with read/write hardware. Labels: track, arm, actuator, actuator movement, read/write head, spindle, disk rotation, cylinder of tracks (imaginary).]

Secondary Storage Device

• Used because databases are too large to store in main memory

• Permanent loss of data arises less frequently

• Cost of storage is much lower

Storage Media

Disk storage terminology:

• disk pack

• cylinder

• track

• sector (physical)

• block or page (logical)

(2048 B is a standard block for a UNIX DB, 4096 B is a standard block for an IBM mainframe DB)

Blocking of Records

• Data arranged in files

• Transfer data in fixed-size blocks

– System reads multiple logical records into the buffer (blocking factor)

• Unblocked records

Hdr1 Rec1 Hdr2 Rec2 Hdr3 Rec3 Hdr4 Rec4

• Blocked records (blocking factor = 3)

Hdr1 Rec1 Rec2 Rec3 Hdr2 Rec4 Rec5 Rec6
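The blocked layout above can be sketched in a few lines of Python. This is only an illustration: the function name `block_records` and the use of lists to stand in for disk blocks are assumptions, not part of the lecture.

```python
def block_records(records, blocking_factor):
    """Group logical records into blocks of at most `blocking_factor` records."""
    if blocking_factor < 1:
        raise ValueError("blocking factor must be at least 1")
    return [
        records[start:start + blocking_factor]
        for start in range(0, len(records), blocking_factor)
    ]


# Six logical records, blocking factor 3, as on the slide.
records = [f"Rec{i}" for i in range(1, 7)]
blocks = block_records(records, 3)

# One header per block instead of one header per record.
for number, block in enumerate(blocks, start=1):
    print(f"Hdr{number}", *block)
```

With a blocking factor of 1 the same function reproduces the unblocked layout, one record per header.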

Record Format

• Fixed-length records – assumes all logical records are the same length

– Spanned records

• Retrieving a record can require multiple block reads

Rec1 Rec2 Rec3-start Rec3-rest Rec4 Rec5 Rec6-start

– Unspanned records

• Wastes space

Rec1 Rec2 Rec3 Rec4
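The trade-off between spanned and unspanned records can be made concrete with a little arithmetic. The sketch below uses hypothetical sizes (1000-byte blocks, 300-byte records, 10 records) chosen only for illustration, and it ignores block headers and any per-record overhead.

```python
import math


def unspanned_blocks(block_size, record_size, num_records):
    """Blocks needed when a record may not cross a block boundary."""
    per_block = block_size // record_size  # whole records only
    return math.ceil(num_records / per_block)


def spanned_blocks(block_size, record_size, num_records):
    """Blocks needed when records are packed back to back across blocks."""
    return math.ceil(num_records * record_size / block_size)


# Hypothetical sizes, not taken from the lecture.
BLOCK, RECORD, COUNT = 1000, 300, 10

print(unspanned_blocks(BLOCK, RECORD, COUNT))  # 4 blocks, 100 bytes unused in each full block
print(spanned_blocks(BLOCK, RECORD, COUNT))    # 3 blocks, but some records straddle two blocks
```

Unspanned storage uses one more block here because the last 100 bytes of each block cannot hold a whole record; spanned storage saves that space at the cost of a second read for any record that straddles a boundary.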

Record Format

• Variable-length records

– Impossible to add data to a record without relocating it

– When deleting, either

• move all subsequent records up one slot, or

• mark the record as deleted and ignore it when reading (its space is made available for insertion)

– Only shorter records can be stored in the freed space

– Prime area (fixed-length records) and overflow area accessed with a pointer

Application

• A disk block is 2048B

• A record is 450B

• There are 10,000 records

1. What is the blocking factor?

2. What is the number of blocks needed to store the entire table?
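One way to work the exercise is sketched below. It assumes unspanned records and no block header or other per-block overhead, neither of which the slide states; the function names are illustrative.

```python
import math


def blocking_factor(block_size, record_size):
    """Whole records that fit in one block (unspanned, no header overhead)."""
    return block_size // record_size


def blocks_needed(block_size, record_size, num_records):
    """Blocks required to hold every record at the given blocking factor."""
    return math.ceil(num_records / blocking_factor(block_size, record_size))


BLOCK_SIZE = 2048     # bytes, from the slide
RECORD_SIZE = 450     # bytes, from the slide
NUM_RECORDS = 10_000  # from the slide

bf = blocking_factor(BLOCK_SIZE, RECORD_SIZE)
print(bf)                                                  # 4
print(BLOCK_SIZE - bf * RECORD_SIZE)                       # 248 bytes unused per block
print(blocks_needed(BLOCK_SIZE, RECORD_SIZE, NUM_RECORDS)) # 2500
```

Under these assumptions the blocking factor is 4 (4 × 450 = 1800 bytes, leaving 248 unused), so 10,000 records need 10,000 / 4 = 2500 blocks.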

File organization

• File organization is described in terms of how the records are arranged.

• Sequential or ordered

– Reading records in key order is very efficient

– Inserts and deletes are expensive

• Heap or unsorted

– Efficient insertion, but slow search and deletion

• Hashed

– Fast access on certain search conditions

– Efficient inserts and deletes

Data structures

• B+ Trees

– An efficient and flexible hierarchical index that provides both sequential and direct access to records

– Index has 2 parts

• Index set

• Sequence set – bottom level of the index (the leaf nodes)

– All key values arranged in a sequence with a pointer from each key value

Example B+ Tree

Root: [100 200]

Index set: [15 60] [120 150] [230 270]

Sequence set (leaves), left to right:

[1 8] [15 25 30] [60 75 80]

[100 115] [120 145] [150 165]

[200 215] [230 240] [270 300]
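A direct-access lookup in the example tree can be sketched as follows. The sketch assumes the key values in the example form a three-level tree (root 100, 200; index nodes 15 60, 120 150, 230 270; nine leaves) and that a key equal to an index value lives in the subtree to its right. The dictionary representation and the helper names `leaf`, `node`, and `search` are illustrative, not from the lecture.

```python
from bisect import bisect_right


def leaf(*keys):
    """A sequence-set node: keys only (record pointers omitted)."""
    return {"keys": list(keys), "children": None}


def node(keys, children):
    """An index-set node: len(children) == len(keys) + 1."""
    return {"keys": list(keys), "children": children}


tree = node([100, 200], [
    node([15, 60], [leaf(1, 8), leaf(15, 25, 30), leaf(60, 75, 80)]),
    node([120, 150], [leaf(100, 115), leaf(120, 145), leaf(150, 165)]),
    node([230, 270], [leaf(200, 215), leaf(230, 240), leaf(270, 300)]),
])


def search(root, key):
    """Return (found, nodes_read) for a direct-access lookup of `key`."""
    current = root
    nodes_read = 1
    while current["children"] is not None:
        # Follow the pointer for the range that contains `key`.
        current = current["children"][bisect_right(current["keys"], key)]
        nodes_read += 1
    return key in current["keys"], nodes_read


print(search(tree, 145))  # (True, 3)
print(search(tree, 50))   # (False, 3)
```

Every lookup reads exactly three nodes, one per level, whether or not the key is present; that is the practical consequence of the tree being balanced.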

Rules for Constructing a B+ Tree

• If the root is not a leaf, it must have at least two children

• If the tree is of order n, each interior node (that is, all nodes except the root and leaf nodes) must have between n/2 and n occupied pointers (and children). If n/2 is not an integer, round up to determine the minimum number of pointers.

Rules for Constructing a B+ Tree

• The number of key values contained in a non-leaf node is 1 less than the number of pointers

• If the tree has order n, the number of occupied key values in a leaf node must be between (n-1)/2 and n-1. If (n-1)/2 is not an integer, round up to determine the minimum number of occupied key values.

• The tree must be balanced, that is, every path from the root node to a leaf must have the same length.
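The occupancy rules above reduce to two small calculations. The sketch below simply evaluates them for a given order n; the function names are illustrative.

```python
import math


def interior_pointer_bounds(order):
    """(min, max) occupied pointers in an interior node of a tree of this order."""
    return math.ceil(order / 2), order


def leaf_key_bounds(order):
    """(min, max) occupied key values in a leaf node of a tree of this order."""
    return math.ceil((order - 1) / 2), order - 1


for n in (4, 5):
    print(n, interior_pointer_bounds(n), leaf_key_bounds(n))
# 4 (2, 4) (2, 3)
# 5 (3, 5) (2, 4)
```

For order 4, for instance, an interior node holds 2 to 4 pointers (and so 1 to 3 key values), and a leaf holds 2 or 3 key values.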

Storage Capacity

• Number of records that can be stored in a B+ tree

– n^(d-1) (n-1), where n is the order (pointers per node) and d is the number of levels

• Each node in a tree is a block

– How many records if 20 pointers per node and 3 levels?
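The capacity question can be checked with a one-line function. The sketch assumes every node is completely full, which is what the capacity formula describes; the name `bplus_capacity` is illustrative.

```python
def bplus_capacity(pointers_per_node, levels):
    """Maximum records indexed by a full B+ tree: n^(d-1) * (n-1)."""
    leaves = pointers_per_node ** (levels - 1)  # each level fans out by n
    keys_per_leaf = pointers_per_node - 1       # a leaf holds at most n-1 keys
    return leaves * keys_per_leaf


print(bplus_capacity(20, 3))  # 7600
```

With 20 pointers per node and 3 levels there are 20² = 400 leaves, each holding up to 19 key values, for 400 × 19 = 7600 records.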