allison-wood
Practice Mapping
[ER diagram. Entities: Employee (empId, socSecNo, empName, jobTitle, salary); Project (projName, startDate, endDate, budget); Department (deptNo, mgrName, deptName). Relationships: Assign (hours); WorksIn.]
More Practice
[ER diagram. Entities: Customer, Car, Estimate, Job, Mechanic, Part. Relationships: owns, given, has, worksOn, needs, performed, used, spent, with cardinality labels 1 and M. Also shown: RepairDone, RepairNeeded, Labor, qty.]
Storage Media
Figure 13.1 (a) A single-sided disk with read/write hardware. (b) A disk pack with read/write hardware.
Figure labels: track, arm, actuator, actuator movement, read/write head, spindle, disk rotation, cylinder of tracks (imaginary).
Secondary Storage Device
• Used because databases are too large to store in main memory
• Permanent loss of data arises less frequently
• Cost of storage is much lower
Storage Media
Disk storage terminology:
• disk pack
• cylinder
• track
• sector (physical)
• block or page (logical)
(2048 B is a standard block for a UNIX DB; 4096 B is a standard block for an IBM mainframe DB)
Blocking of Records
• Data is arranged in files
• Data is transferred in fixed-size blocks
– The system reads multiple logical records into the buffer at once (blocking factor = number of records per block)
• Unblocked Records
Hdr1 Rec1 Hdr2 Rec2 Hdr3 Rec3 Hdr4 Rec4
• Blocked Records
Hdr1 Rec1 Rec2 Rec3 Hdr2 Rec4 Rec5 Rec6
Blocking factor = 3
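The two layouts above can be reproduced with a small sketch. The function name `block_records` and the "HdrN" labels are illustrative choices, not part of the slides; the sketch only assumes that every block starts with one header followed by up to `blocking_factor` records.

```python
def block_records(records, blocking_factor):
    """Group logical records into blocks, one header per block (illustrative)."""
    blocks = []
    for start in range(0, len(records), blocking_factor):
        group = records[start:start + blocking_factor]
        header = f"Hdr{len(blocks) + 1}"
        blocks.append([header] + group)
    return blocks

records = [f"Rec{i}" for i in range(1, 7)]

# Blocking factor 1 behaves like unblocked records: one header per record.
unblocked = block_records(records[:4], 1)

# Blocking factor 3: one header is shared by three records.
blocked = block_records(records, 3)

print(unblocked)
print(blocked)
```

With a larger blocking factor, fewer headers and fewer transfers are needed for the same number of records.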
Record Format
• Fixed-length records – assumes all logical records are the same length
– Spanned records: a record may continue into the next block
• Retrieving a record may require multiple reads
Rec1 Rec2 Rec3-start | Rec3-rest Rec4 Rec5 Rec6-start
– Unspanned records: every record lies entirely within one block
• Wastes space
Rec1 Rec2 Rec3 Rec4
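The trade-off between spanned and unspanned records can be checked with a little arithmetic. This is a minimal sketch: the function names and the sizes (40-byte records, 100-byte blocks) are made up for illustration, and block headers are ignored.

```python
def unspanned_blocks(num_records, record_size, block_size):
    """Blocks needed when no record may cross a block boundary."""
    per_block = block_size // record_size           # whole records per block
    blocks = -(-num_records // per_block)           # ceiling division
    wasted_per_block = block_size - per_block * record_size
    return blocks, wasted_per_block

def spanned_blocks(num_records, record_size, block_size):
    """Blocks needed when records are packed end to end."""
    total_bytes = num_records * record_size
    return -(-total_bytes // block_size)            # ceiling division

def blocks_touched(record_index, record_size, block_size):
    """How many blocks must be read to fetch one spanned record (0-based index)."""
    first = (record_index * record_size) // block_size
    last = ((record_index + 1) * record_size - 1) // block_size
    return last - first + 1

print(unspanned_blocks(10, 40, 100))   # (5, 20): 5 blocks, 20 bytes unused in each
print(spanned_blocks(10, 40, 100))     # 4
print(blocks_touched(2, 40, 100))      # 2: the third record straddles two blocks
```

Spanning saves a block here, but reading the third record costs two block reads instead of one.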
Record Format
• Variable-length records
– Impossible to add data to a record without relocating it
– When deleting, either
• move all subsequent records up one slot, or
• mark the record as deleted and ignore it when reading (its space is made available for insertion)
– Only shorter records can be stored in the freed space
– Prime area (fixed-length records) and overflow area accessed with a pointer
Application
• A disk block is 2048B
• A record is 450B
• There are 10,000 records
1. What is the blocking factor?
2. How many blocks are needed to store the entire table?
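One way to work the exercise, sketched under two assumptions that the slide does not state: records are unspanned (no record crosses a block boundary) and block headers take no space.

```python
block_size = 2048      # bytes per disk block
record_size = 450      # bytes per record
num_records = 10_000

# 1. Blocking factor: whole records that fit in one block.
blocking_factor = block_size // record_size            # 2048 // 450 = 4

# 2. Blocks needed: records divided by blocking factor, rounded up.
blocks_needed = -(-num_records // blocking_factor)     # ceil(10000 / 4) = 2500

# Space left unused in each block under the unspanned assumption.
unused_per_block = block_size - blocking_factor * record_size   # 2048 - 1800 = 248

print(blocking_factor, blocks_needed, unused_per_block)
```

Under these assumptions the blocking factor is 4 and the table occupies 2500 blocks, with 248 bytes unused in each block.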
File organization
• File organization is described in terms of how the records are arranged.
• Sequential or ordered
– Reading records in order of the key is very efficient
– Inserts and deletes are expensive
• Heap or unsorted
– Efficient insertion, but slow search and deletion
• Hashed
– Fast access on certain search conditions
– Efficient inserts and deletes
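The three organizations can be imitated in memory to see where the costs come from. This is only an analogy: Python lists and a dict stand in for disk files, and all function names are hypothetical.

```python
from bisect import bisect_left, insort

# Heap (unsorted) file: append on insert, scan on search.
heap_file = []

def heap_insert(key, value):
    heap_file.append((key, value))        # cheap: goes at the end

def heap_search(key):
    for k, v in heap_file:                # slow: may read every record
        if k == key:
            return v
    return None

# Sequential (ordered) file: records kept sorted by key.
sorted_file = []

def sorted_insert(key, value):
    insort(sorted_file, (key, value))     # expensive: later records shift

def sorted_search(key):
    i = bisect_left(sorted_file, (key,))  # binary search on the key
    if i < len(sorted_file) and sorted_file[i][0] == key:
        return sorted_file[i][1]
    return None

def sorted_scan():
    return [k for k, _ in sorted_file]    # already in key order

# Hashed file: the key's hash locates the record directly,
# but only for equality searches on that key.
hashed_file = {}

def hashed_insert(key, value):
    hashed_file[key] = value

def hashed_search(key):
    return hashed_file.get(key)

for key, name in [(30, "c"), (10, "a"), (20, "b")]:
    heap_insert(key, name)
    sorted_insert(key, name)
    hashed_insert(key, name)

print(heap_search(20), sorted_scan(), hashed_search(30))
```

A range query such as "all keys between 10 and 20" is easy on the sorted list but would need a full scan of the hashed structure, which is the sense in which hashing is fast only on certain search conditions.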
Data structures
• B+ Trees
– An efficient and flexible hierarchical index that provides both sequential and direct access to records
– Index has 2 parts
• Index set
• Sequence set – bottom level of the index (the leaf nodes)
– All key values are arranged in a sequence with a pointer from each key value
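Example B+ Tree
Root: (100 200)
Index nodes: (15 60) (120 150) (230 270)
Leaves under (15 60): (1 8) (15 25 30) (60 75 80)
Leaves under (120 150): (100 115) (120 145) (150 165)
Leaves under (230 270): (200 215) (230 240) (270 300)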
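A search through a tree of this shape can be sketched as follows. The nested structure is one reading of the example figure (root 100 200, three index nodes, nine leaves), and it assumes the convention that a key equal to a separator is found in the subtree to the right of that separator; `find_leaf` and `contains` are illustrative names.

```python
from bisect import bisect_right

# Index set: each node is a tuple (separator_keys, children).
# Sequence set: each leaf is a plain sorted list of key values.
tree = (
    [100, 200],
    [
        ([15, 60], [[1, 8], [15, 25, 30], [60, 75, 80]]),
        ([120, 150], [[100, 115], [120, 145], [150, 165]]),
        ([230, 270], [[200, 215], [230, 240], [270, 300]]),
    ],
)

def find_leaf(node, key):
    """Direct access: follow one pointer per level down to a leaf."""
    while isinstance(node, tuple):
        keys, children = node
        node = children[bisect_right(keys, key)]
    return node

def contains(node, key):
    return key in find_leaf(node, key)

def all_keys(node):
    """Sequential access: visit the leaves left to right."""
    if isinstance(node, tuple):
        return [k for child in node[1] for k in all_keys(child)]
    return list(node)

print(find_leaf(tree, 145))   # [120, 145]
print(contains(tree, 99))     # False
print(all_keys(tree)[:5])     # [1, 8, 15, 25, 30]
```

Every lookup visits exactly three nodes, because all leaves sit at the same depth.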
Rules for Constructing a B+ Tree
• If the root is not a leaf, it must have at least two children
• If the tree is of order n, each interior node (that is, every node except the root and the leaf nodes) must have between n/2 and n occupied pointers (and children). If n/2 is not an integer, round up to determine the minimum number of pointers.
• The number of key values contained in a non-leaf node is 1 less than the number of pointers
• If the tree has order n, the number of occupied key values in a leaf node must be between (n-1)/2 and n-1. If (n-1)/2 is not an integer, round up to determine the minimum number of occupied key values.
• The tree must be balanced; that is, every path from the root node to a leaf must have the same length.
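The occupancy rules above reduce to a few ceiling calculations. A minimal sketch, with `bplus_bounds` as a hypothetical helper name; the interior key range is derived from the rule that a non-leaf node holds one key fewer than it has pointers.

```python
import math

def bplus_bounds(n):
    """Occupancy limits (min, max) for a B+ tree of order n."""
    min_ptrs = math.ceil(n / 2)            # round up if n/2 is not an integer
    min_leaf_keys = math.ceil((n - 1) / 2)
    return {
        "interior_pointers": (min_ptrs, n),
        "interior_keys": (min_ptrs - 1, n - 1),   # keys = pointers - 1
        "leaf_keys": (min_leaf_keys, n - 1),
    }

print(bplus_bounds(4))
print(bplus_bounds(5))
```

For order 4, an interior node has 2 to 4 pointers and a leaf holds 2 or 3 key values; for order 5, an interior node has 3 to 5 pointers and a leaf holds 2 to 4 key values.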