SECTION 13.3
Eilbroun Benjamin
CS 257 – Dr. TY Lin
SECONDARY STORAGE MANAGEMENT
Presentation Outline
13.3 Accelerating Access to Secondary Storage
13.3.1 The I/O Model of Computation
13.3.2 Organizing Data by Cylinders
13.3.3 Using Multiple Disks
13.3.4 Mirroring Disks
13.3.5 Disk Scheduling and the Elevator Algorithm
13.3.6 Prefetching and Large-Scale Buffering
13.3 Accelerating Access to Secondary Storage
Several approaches for accessing data in secondary storage more efficiently:
Place blocks that are accessed together on the same cylinder.
Divide the data among multiple disks.
Mirror disks.
Use disk-scheduling algorithms.
Prefetch blocks into main memory.
Scheduling Latency – added delay in accessing data caused by a disk scheduling algorithm.
Throughput – the number of disk accesses per second that the system can accommodate.
13.3.1 The I/O Model of Computation
The number of block accesses (disk I/O's) is a good approximation of an algorithm's running time and should be minimized.
Ex 13.3: Suppose we want an index on relation R that identifies the block on which the desired tuple appears, but not where on the block it resides. In the Megatron 747 (M747) example, it takes 11ms to read a 16K block. A standard microprocessor can execute millions of instructions in 11ms, making the time spent searching the block for the desired tuple negligible.
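As a rough illustration (not on the original slide), a short Python sketch of this comparison; the 100-MIPS processor speed is an assumed figure:

# Disk I/O dominates the in-memory search for a tuple.
block_read_ms = 11.0            # time to read one 16K block on the M747 (Ex 13.3)
instructions_per_ms = 100_000   # assumed: a 100-MIPS processor
instructions_during_io = block_read_ms * instructions_per_ms
# Scanning a 16K block takes at most a few thousand instructions,
# a tiny fraction of the ~1.1 million executable during the 11ms I/O.
print(f"{instructions_during_io:,.0f} instructions fit in one block read")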
13.3.2 Organizing Data by Cylinders
If we read all the blocks on a single track or cylinder consecutively, then we can neglect all but the first seek time and the first rotational latency.
Ex 13.4: We request 1024 blocks of the M747.
If the data is randomly distributed, the average latency per block is 10.76ms by Ex 13.2, making the total latency about 11s.
If all the blocks are stored consecutively on one cylinder:
6.46ms + 8.33ms * 16 ≈ 139.7ms
(1 average seek) + (time per rotation) * (# of rotations)
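A small Python sketch of the Ex 13.4 arithmetic, using the M747 figures quoted above (the 16 rotations come from the example's assumption that the 1024 blocks fill the cylinder's 16 tracks, one rotation per track):

blocks         = 1024
avg_access_ms  = 10.76   # average latency per randomly placed block (Ex 13.2)
avg_seek_ms    = 6.46    # one average seek
rotation_ms    = 8.33    # one full rotation
tracks_per_cyl = 16      # 1024 blocks fill 16 tracks, one rotation each

random_total_ms   = blocks * avg_access_ms                      # ≈ 11,018 ms ≈ 11 s
cylinder_total_ms = avg_seek_ms + rotation_ms * tracks_per_cyl  # ≈ 139.7 ms
print(f"random: {random_total_ms/1000:.1f} s   one cylinder: {cylinder_total_ms:.1f} ms")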
13.3.3 Using Multiple Disks
If we have n disks, read/write performance will increase by a factor of n.
Striping – distributing a relation's blocks across multiple disks in a round-robin pattern (Ri denotes the i-th block of relation R):
Data on disk 1: R1, R(1+n), R(1+2n), …
Data on disk 2: R2, R(2+n), R(2+2n), …
…
Data on disk n: Rn, R(n+n), R(n+2n), …
Ex 13.5: We request 1024 blocks with n = 4 disks.
6.46ms + 8.33ms * (16/4) ≈ 39.8ms
(1 average seek) + (time per rotation) * (# of rotations per disk)
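A small Python sketch of the striped layout and the Ex 13.5 timing; the modulo mapping of block numbers to disks is the usual round-robin reading of the pattern above:

n = 4                                  # number of disks (Ex 13.5)
def disk_for_block(i):                 # block i of R lives on disk i mod n
    return i % n

avg_seek_ms, rotation_ms, tracks = 6.46, 8.33, 16
striped_ms = avg_seek_ms + rotation_ms * (tracks / n)    # each disk reads 16/4 tracks
print([disk_for_block(i) for i in range(8)])             # [0, 1, 2, 3, 0, 1, 2, 3]
print(f"striped read of 1024 blocks: {striped_ms:.1f} ms")   # ≈ 39.8 ms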
13.3.4 Mirroring Disks
Mirroring Disks – having 2 or more disks hold identical copies of the data.
Benefit 1: If n disks are mirrors of each other, the system can survive crashes of up to n-1 of the disks.
Benefit 2: If we have n disks, read performance increases by a factor of n.
Performance increases further if, for each read, the controller selects the disk whose head is currently closest to the desired data block.
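A minimal Python sketch of that controller optimization; the head-position bookkeeping (head_positions) is an assumed detail, not from the slides:

# For each read, choose the mirror whose head is closest to the requested cylinder.
def choose_mirror(head_positions, cylinder):
    return min(range(len(head_positions)),
               key=lambda i: abs(head_positions[i] - cylinder))

heads = [8000, 40000, 60000]        # current head positions of three mirrored disks
print(choose_mirror(heads, 42000))  # 1 -> the disk whose head sits at 40000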
13.3.5 Disk Scheduling and the Elevator Problem
The disk controller runs this algorithm to decide which of several pending requests to process first.
Pseudo code:
requests[]                        // array of all non-processed data requests
upon receiving new data request:
    requests[].add(new request)
while (requests[] is not empty):
    move head to next location
    if (head location is at data in requests[]):
        retrieve data
        remove data from requests[]
    if (head reaches end):
        reverse head direction
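A runnable Python sketch of the pseudocode above; it reverses direction when no request remains further along the current sweep (a common simplification of checking for the physical end of the disk), and the helper name elevator_order is hypothetical:

def elevator_order(start, requests):
    pending = sorted(set(requests))
    head, direction, served = start, +1, []
    while pending:
        ahead = [c for c in pending if (c - head) * direction >= 0]
        if not ahead:                 # nothing left this way: reverse the head
            direction = -direction
            continue
        nxt = min(ahead, key=lambda c: abs(c - head))   # next request along the sweep
        served.append(nxt)
        pending.remove(nxt)
        head = nxt
    return served

print(elevator_order(30000, [8000, 24000, 56000, 16000]))
# [56000, 24000, 16000, 8000]  (sweep up from 30000, reverse, sweep down)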
13.3.5 Disk Scheduling and the Elevator Problem
(con't)
Events (times in ms):
  0     Head at starting point; requests for data at cylinders 8000, 24000, and 56000
  4.3   Get data at 8000
 10     Request data at 16000
 13.6   Get data at 24000
 20     Request data at 64000
 26.9   Get data at 56000
 30     Request data at 40000
 34.2   Get data at 64000
 45.5   Get data at 40000
 56.8   Get data at 16000
(Diagram: head position sweeping across cylinders 8000 through 64000.)

Resulting schedule under the elevator algorithm:
data       time
 8000 ..    4.3
24000 ..   13.6
56000 ..   26.9
64000 ..   34.2
40000 ..   45.5
16000 ..   56.8
13.3.5 Disk Scheduling and the Elevator Problem
(con’t)
Elevator Algorithm           FIFO Algorithm
data       time              data       time
 8000 ..    4.3               8000 ..    4.3
24000 ..   13.6              24000 ..   13.6
56000 ..   26.9              56000 ..   26.9
64000 ..   34.2              16000 ..   42.2
40000 ..   45.5              64000 ..   59.5
16000 ..   56.8              40000 ..   70.8
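As a cross-check of the two tables, a small Python simulation sketch; the cost model (head initially at cylinder 8000, a seek over d cylinders costing 1 + d/4000 ms, plus about 4.3 ms rotational latency and transfer per block) is inferred from the example's numbers rather than stated on the slides:

def seek_ms(d):                                  # assumed M747-style seek cost
    return 1 + d / 4000 if d else 0

LAT_XFER = 4.3                                   # rotational latency + transfer (ms)
ARRIVALS = [(0, 8000), (0, 24000), (0, 56000),   # (arrival time ms, cylinder)
            (10, 16000), (20, 64000), (30, 40000)]

def simulate(pick_next):
    head, t, pending, done = 8000, 0.0, [], []
    arrivals = list(ARRIVALS)
    while arrivals or pending:
        pending += [c for (a, c) in arrivals if a <= t]
        arrivals = [(a, c) for (a, c) in arrivals if a > t]
        if not pending:                          # idle until the next request arrives
            t = arrivals[0][0]
            continue
        c = pick_next(head, pending)
        pending.remove(c)
        t += seek_ms(abs(c - head)) + LAT_XFER
        head = c
        done.append((c, round(t, 1)))
    return done

def make_elevator_pick():                        # sweep, reversing when nothing is ahead
    direction = [+1]
    def pick(head, pending):
        ahead = [c for c in pending if (c - head) * direction[0] >= 0]
        if not ahead:
            direction[0] *= -1
            ahead = pending
        return min(ahead, key=lambda c: abs(c - head))
    return pick

print(simulate(make_elevator_pick()))              # 8000@4.3 ... 16000@56.8 (left table)
print(simulate(lambda head, pending: pending[0]))  # FIFO: ... 16000@42.2, 64000@59.5, 40000@70.8 (right table)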
13.3.6 Prefetching and Large-Scale Buffering
If, at the application level, we can predict the order in which blocks will be requested, we can load them into main memory before they are needed.
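A minimal Python sketch of application-level prefetching, assuming the block request order is known in advance; read_block is a hypothetical stand-in for the real block-fetch routine:

import queue, threading

def read_block(block_no):
    return bytes(16384)                       # placeholder for reading one 16K block

def prefetching_reader(block_order, depth=8):
    buf = queue.Queue(maxsize=depth)          # bounded buffer = large-scale buffering
    def producer():
        for b in block_order:
            buf.put(read_block(b))            # runs ahead of the consumer
        buf.put(None)                         # end-of-stream marker
    threading.Thread(target=producer, daemon=True).start()
    while (block := buf.get()) is not None:
        yield block                           # blocks are already in main memory

for blk in prefetching_reader(range(100)):
    pass                                      # process each prefetched block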
Questions