42
1 CSE232A: Database System Principles Hardware

1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

Embed Size (px)

Citation preview

Page 1: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

1

CSE232A: Database System Principles

Hardware

Page 2: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

Data + Indexes

Database System Architecture

Query Processing Transaction ManagementSQL query

Parser

QueryRewriter

andOptimizer

ExecutionEngine

relational algebra

View definitions

Statistics & Catalogs &System Data

query executionplan

BufferManager

TransactionManager

Calls from Transactions (read,write)

ConcurrencyController

LockTable

RecoveryManager

Log

Hardware aspects of storing and

retrieving data

Page 3: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

3

Memory Hierarchy

• Cache memory– On-chip and L2– Caching outside control of DB

system

• RAM– Addressable space includes

virtual memory but DB systems avoid it

• Disk– Access speed & Transfer rate– Winchester, arrays,…

• Tertiary storage– Tapes, jukeboxes, DVDs C

ost

per

byte

Capacity

Acc

ess

Speed

Page 4: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

4

Storage Cost

10-9 10-6 10-3 10-0 103

access time (sec)

1015

1013

1011

109

107

105

103

cache

electronicmain

electronicsecondary

magneticopticaldisks

onlinetape

nearlinetape &opticaldisks

offlinetape

typi

cal c

apac

ity

(byt

es)

from Gray & Reuterupdated in 2002

Page 5: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

5

Storage Cost

10-9 10-6 10-3 10-0 103

access time (sec)

104

102

100

10-2

10-4

cache

electronicmain

electronicsecondary magnetic

opticaldisks

onlinetape

nearlinetape &opticaldisks

offlinetape

doll

ars/

MB

from Gray & Reuter

Page 6: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

6

Volatile Vs Non-Volatile Storage

• Persistence important for transaction atomicity and durability

• Even if database fits in main memory changes have to be written in non-volatile storage

• Hard disk• RAM disks w/ battery• Flash memory

Page 7: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

7

Cost of Disk Access

• How many blocks were accessed ?• Clustered/consecutive ?

Page 8: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

8

Moore’s Law: Different Rates of Improvement

Lead to Reconsiderations

• Processor speed• Main memory

bit/$• Disk bit/$• RAM access speed• Disk access speed• Disk transfer rate

Dis

k Tra

nsf

er

Rate

Dis

k A

ccess

Tim

e

Clustered/sequential access-based algorithms

become relativelybetter

Page 9: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

9

Moore’s Law: Same Phenomenon Applies to

RAM

RA

M T

ransf

er

Rate

RA

M A

ccess

Tim

e

Algorithms that accessmemory sequentiallyhave better constant

factors than algorithmsthat access randomly

Page 10: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

10

Moore’s Law: Different Rates of Improvement

Cach

e C

apaci

ty

RA

M C

apaci

ty

Dis

k A

ccess

Tim

e

Cost of “miss”increases

Page 11: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

11

Focus on: “Typical Disk”

Terms: Platter, Head, ActuatorCylinder, TrackSector (physical),Block (logical), Gap

Dis

kC

ontr

olle

r

BU

S

Page 12: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

12

Top View

Track

Gap

Sector

Block (typically multiple sectors)

Often different numbersof sectors per track

Page 13: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

13

“Typical” NumbersDiameter: 1 inch 15 inchesCylinders: 100 20000Surfaces: 1 (CDs) (Tracks/cyl) 2 (floppies)

5 (typical hd) 30

Sector Size:512B 50KCapacity: 360 KB (old floppy)

200 GB

Page 14: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

14

block xin memory

?

I wantblock X

Key performance metric: Time to fetch block

Time = Seek Time (locate track) +Rotational Delay (locate sector)+Transfer Time (fetch block) +Other (disk controller, …)

Page 15: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

15

Seek Delay

TrackWhereHead

is

TrackWhereHead must

go

Page 16: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

16

Rotational Delay

Head Here

Block I Want

Page 17: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

17

Seek Time

3 or 5x

x

1 N

Cylinders Traveled

Time

Few ms

Page 18: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

18

Average Random Seek Time

SEEKTIME (i j)

S =

N(N-1)

N N

i=1 j=1ji

“Typical” S: 10 ms 40 ms

Page 19: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

19

Average Rotational Delay

R = 1/2 revolution

“typical” R = 8.33 ms (7200 RPM)

Assume we have to start readingfrom start of first sector

Page 20: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

20

Transfer Rate: t

• “typical” t: 1 3 MB/second• transfer time: block size

t

Page 21: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

21

Other Delays

• CPU time to issue I/O• Contention for controller• Contention for bus, memory

“Typical” Value: 0

Page 22: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

22

Practice Problem

• Single surface• Rotation speed 7200rpm• 16,384 tracks• 128 sectors/track• 4096 bytes/sector• 4 sectors/block (16,384 bytes/block)• SEEKTIME (i j) = [1000 + (j-i)] μs• Neglect gaps• Calculate minimum, maximum, average

time to fetch one block

Page 23: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

23

Practice Problem: Minimum Time

• Head is at the start of the first sector of the block

• Just compute transfer time• 4 sectors cover 4/128 of a track• 1 full rotation takes

60/7200=8.33ms• Transfer time is 8.33 * 4 /128 =

0.26ms

Page 24: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

24

Practice Problem: Maximum Time

• Assume read must start at the first sector

• Head is at innermost, required track is the outermost

• Seek time = …• Head just missed the beginning• Rotational delay = …• Transfer time = …

Page 25: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

25

Practice problem: Average time

• Solve…

Page 26: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

26

• So far: Random Block Access• What about: Reading “Next” block?

Time to get = Block Size + Negligible

block t

- skip gap

- switch track- once in a while, next cylinder

Page 27: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

27

Rule of Random I/O: ExpensiveThumb Sequential I/O: Much less• Ex: 1 KB Block

»Random I/O: 20 ms.»Sequential I/O: 1 ms.

Page 28: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

28

Practice Problem cont’d: Sustained Bandwidth over

Track

• Assume required blocks are consecutive on single track

• What is the sustained bandwidth of fetching consecutive blocks?

• 128 sectors/track * 4KB/sector in 8.33ms/track full rotation = 512KB/8.33ms = 61.46KB/ms

Page 29: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

29

Suggested optimization

• Cluster data in consecutive blocks• Give an extra point to algorithms

that – exploit data clustering by avoiding

“random” accesses– Read/write consecutive blocks

Page 30: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

30

An Algorithm with Little Random Access: 2-Phase Merge

SortP K A D L E Z W J C R H

Main Memory: 4 blocks

SORT

A D K P

SORT

A D E K

READ

WRITE

WRITEMER

GE

Y F X I

P K A D L E Z W

L D K PP W ZA D K PA D E K L D K PP W Z

C F H I J R X Y

A D K PA C D F …

Improve by bringing max number of blocks in memory in Phase 2

Page 31: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

31

Cost for Writing similar to Reading

…. unless we want to verify! need to add (full) rotation + Block size

t

Page 32: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

32

• To Modify a Block?

To Modify Block:(a) Read Block(b) Modify in Memory(c) Write Block[(d) Verify?]

Page 33: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

33

Block Address:

• Physical Device• Cylinder #• Surface #• Sector

Once upon a time DBs had access to such – now it is the OS’s domain

Page 34: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

34

Optimizations (in controller or O.S.)

• Disk Scheduling Algorithms– e.g., elevator algorithm

• Pre-fetch• Arrays

Page 35: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

35

Double Buffering

Problem: Have a File» Sequence of Blocks B1, B2

Have a Program» Process B1» Process B2» Process B3

...

Page 36: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

36

Single Buffer Solution

(1) Read B1 Buffer(2) Process Data in Buffer(3) Read B2 Buffer(4) Process Data in Buffer ...

Page 37: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

37

Say P = time to process/blockR = time to read in 1 blockn = # blocks

Single buffer time = n(P+R)

Page 38: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

38

Double Buffering

Memory:

Disk: A B C D GE F

A B

done

process

AC

process

B

done

Page 39: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

39

Say P R

What is processing time?

P = Processing time/blockR = IO time/blockn = # blocks

• Double buffering time = R + nP

• Single buffering time = n(R+P)

Improvement much more dramatic ifconsequtive blocks: …

Page 40: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

40

Block Size Selection?

• Big Block Amortize I/O Cost

• Big Block Read in more useless stuff! and takes longer to read

Unfortunately...

Page 41: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

41

Trend

• memory prices drop and memory capacities increase,

• transfer rates increase• Disk access times do not increase that much

blocks get bigger ...

Page 42: 1 CSE232A: Database System Principles Hardware. Data + Indexes Database System Architecture Query ProcessingTransaction Management SQL query Parser Query

42

Summary

• Secondary storage, mainly disks• I/O times• I/Os should be avoided,

especially random ones…..

Summary