Storage-Aware Caching: Revisiting Caching for Heterogeneous Systems
Brian Forney, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau
Wisconsin Network Disks, University of Wisconsin at Madison


Page 1: Storage-Aware Caching: Revisiting Caching for Heterogeneous Systems


Storage-Aware Caching: Revisiting Caching for Heterogeneous Systems

Brian Forney, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau

Wisconsin Network Disks, University of Wisconsin at Madison

Page 2

Circa 1970

• Intense research period in policies
• Wide variety developed; many used today
  – Examples: Clock, LRU
• Simple storage environment
  – Focus: workload
  – Assumption: consistent retrieval cost

[Diagram: apps issue requests through an OS buffer cache governed by one policy]
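The cost-oblivious policies named above are simple enough to sketch in a few lines. A minimal LRU buffer cache in Python (illustrative, not from the talk):

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU buffer cache: on overflow, evict the least-recently-used page."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()          # key -> page data, oldest first

    def access(self, key, load):
        if key in self.pages:
            self.pages.move_to_end(key)     # hit: mark most recently used
            return self.pages[key]
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)  # miss on a full cache: evict LRU page
        self.pages[key] = load(key)         # fetch from storage
        return self.pages[key]
```

Clock approximates this behavior with a reference bit per page instead of a full recency ordering, which is cheaper to maintain.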

Page 3

Today

• Rich storage environment
  – Devices attached in many ways (LAN, WAN)
  – More devices
  – Increased device sophistication
• Mismatch: need to reevaluate

[Diagram: apps and buffer cache policy in front of devices attached over LAN and WAN]

Page 4

Problem illustration

• Uniform workload
• Two disks (one fast, one slow)
• LRU policy

• Slow disk is the bottleneck
• Problem: policy is oblivious
  – Does not filter requests to the slow disk well

Page 5

General solution

• Integrate workload and device performance
  – Balance work across devices
  – Work: cumulative delay
• Cannot throw out:
  – Existing non-cost-aware policy research
  – Existing caching software

Page 6

Our solution: Overview

• Generic partitioning framework
  – Old idea
  – One-to-one mapping: device ↔ partition
  – Each partition has a cost-oblivious policy
  – Adjust partition sizes
• Advantages
  – Aggregates performance information
  – Easily and quickly adapts to workload and device performance changes
  – Integrates well with existing software
• Key: how to pick the partition sizes
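The framework's shape is a per-device set of cost-oblivious partitions whose page budgets shift over time. A sketch in Python, using LRU inside each partition; the class and method names are illustrative, not the authors' code:

```python
from collections import OrderedDict

class PartitionedCache:
    """One partition per device, each running a cost-oblivious policy (LRU here);
    only the partition sizes are cost-aware."""
    def __init__(self, sizes):
        self.capacity = dict(sizes)                    # device -> page budget
        self.parts = {d: OrderedDict() for d in sizes}

    def access(self, device, key, load):
        part = self.parts[device]
        if key in part:
            part.move_to_end(key)                      # LRU hit within partition
            return part[key]
        while part and len(part) >= self.capacity[device]:
            part.popitem(last=False)                   # evict within this partition only
        part[key] = load(device, key)
        return part[key]

    def repartition(self, giver, taker, n):
        """Move n pages of budget from a page supplier to a page consumer."""
        n = min(n, self.capacity[giver])
        self.capacity[giver] -= n
        self.capacity[taker] += n
        while len(self.parts[giver]) > self.capacity[giver]:
            self.parts[giver].popitem(last=False)      # eager shrink of the giver
```

Because evictions never cross partition boundaries, a slow device cannot flush a fast device's pages; all cost awareness is concentrated in `repartition`.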

Page 7

Outline

• Motivation
• Solution overview
• Taxonomy
• Dynamic partitioning algorithm
• Evaluation
• Summary

Page 8

Partitioning algorithms

• Static
  – Pro: simple
  – Con: wasteful
• Dynamic
  – Adapts to workload
    • Hotspots
    • Access pattern changes
  – Handles device performance faults
• We used dynamic partitioning

Page 9

Our algorithm: Overview

1. Observe: determine per-device cumulative delay
2. Act: repartition the cache
3. Save & reset: clear last W requests

[Diagram: Observe → Act → Save & reset cycle]
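One iteration of the three-phase loop might look like the sketch below. The function name and the pick-one-supplier/one-consumer heuristic are assumptions for illustration; the talk's actual act phase categorizes partitions more carefully (next slides):

```python
def observe_act_reset(delays, window):
    """delays: device -> list of per-request delays for recent completed requests.
    Returns (supplier, consumer): the partitions to shrink and to grow."""
    # 1. Observe: per-device cumulative delay over the last `window` requests
    cumulative = {d: sum(reqs[-window:]) for d, reqs in delays.items()}
    # 2. Act: the slowest device is the likely bottleneck (page consumer);
    #    the fastest can give up pages (page supplier)
    consumer = max(cumulative, key=cumulative.get)
    supplier = min(cumulative, key=cumulative.get)
    # 3. Save & reset: clear the recorded window before the next round
    for reqs in delays.values():
        reqs.clear()
    return supplier, consumer
```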

Page 10

Algorithm: Observe

• Want an accurate view of system balance
• Record per-device cumulative delay for the last W completed disk requests
  – At the client
  – Includes network time
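The observation step amounts to a sliding window of per-request delays kept at the client. A minimal recorder (names are illustrative):

```python
from collections import deque

class DelayMonitor:
    """Client-side recorder: per-device cumulative delay over the last W
    completed requests. Delays are measured at the client, so they
    include network time as well as disk service time."""
    def __init__(self, window):
        self.window = window
        self.samples = {}                # device -> deque of recent delays

    def record(self, device, delay):
        q = self.samples.setdefault(device, deque(maxlen=self.window))
        q.append(delay)                  # deque drops the oldest sample itself

    def cumulative_delay(self, device):
        return sum(self.samples.get(device, ()))
```

A bounded `deque` keeps the bookkeeping O(1) per completed request.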

Page 11

Algorithm: Act

• Categorize each partition
  – Page consumers
    • Cumulative delay above threshold → possible bottleneck
  – Page suppliers
    • Cumulative delay below mean → lose pages without decreasing performance
  – Neither
• Always have page suppliers if there are page consumers

[Diagram: before and after repartitioning, pages move from supplier to consumer partitions]
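The categorization rule can be written directly. The threshold is taken as an input here because the talk does not give a formula for it:

```python
def classify(cumulative, threshold):
    """Label each partition from its cumulative delay.
    Above `threshold` -> page consumer (possible bottleneck);
    below the mean   -> page supplier (can lose pages safely);
    otherwise        -> neither."""
    mean = sum(cumulative.values()) / len(cumulative)
    labels = {}
    for dev, delay in cumulative.items():
        if delay > threshold:
            labels[dev] = 'consumer'
        elif delay < mean:
            labels[dev] = 'supplier'
        else:
            labels[dev] = 'neither'
    return labels
```

Since the mean sits below any above-threshold outlier, a consumer's excess delay guarantees some other device falls below the mean, which is why suppliers always exist when consumers do.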

Page 12

Page consumers

• How many pages? Depends on state:
  – Warming
    • Cumulative delay increasing
    • Aggressively add pages: reduce queueing
  – Warm
    • Cumulative delay constant
    • Conservatively add pages
  – Cooling
    • Cumulative delay decreasing
    • Do nothing; delay naturally decreases
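The three consumer states map onto a small decision function. The 4× multiplier and the 5% change tolerance below are invented for illustration; the talk specifies only aggressive vs. conservative vs. none:

```python
def pages_to_add(prev_delay, cur_delay, step, eps=0.05):
    """Pages granted to a page consumer, by state.
    Warming (delay rising):  add aggressively to cut queueing.
    Warm (roughly constant): add conservatively.
    Cooling (delay falling): add nothing; delay decreases on its own."""
    change = cur_delay - prev_delay
    if change > eps * prev_delay:        # warming
        return 4 * step
    if change < -eps * prev_delay:       # cooling
        return 0
    return step                          # warm
```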

Page 13

Dynamic partitioning

• Eager
  – Immediately change partition sizes
  – Pro: matches observation
  – Con: some pages temporarily unused
• Lazy
  – Change partition sizes on demand
  – Pro: easier to implement
  – Con: may cause overcorrection
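The eager/lazy distinction shows up most clearly in how a shrinking partition gives up pages. A sketch under assumed names, with partitions as LRU-ordered `OrderedDict`s:

```python
from collections import OrderedDict

def resize_eager(part, new_cap):
    """Eager: evict immediately so sizes match the observation.
    Freed pages may sit unused until the growing partition fills them."""
    while len(part) > new_cap:
        part.popitem(last=False)         # drop the LRU page now

def evict_lazy(parts, caps, needy):
    """Lazy: free a page only on demand, taking it from any partition
    currently over its target size. Simpler, but demand-driven eviction
    can overcorrect."""
    for dev, part in parts.items():
        if dev != needy and len(part) > caps[dev]:
            part.popitem(last=False)
            return True                  # one page freed for `needy`
    return False
```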

Page 14

Outline

• Motivation
• Solution overview
• Taxonomy
• Dynamic partitioning algorithm
• Evaluation
• Summary

Page 15

Evaluation methodology

• Simulator
• Workloads: synthetic and web
• Client: caching, RAID-0
• Network: LogGP, with endpoint contention
• Disks: 16 IBM 9LZX
  – First-order model: queueing, seek time & rotational time

Page 16

Evaluation methodology

• Introduced performance heterogeneity
  – Disk aging
• Used current technology trends
  – Seek and rotation: 10% decrease/year
  – Bandwidth: 40% increase/year
• Scenarios
  – Single disk degradation: single disk, multiple ages
  – Incremental upgrades: multiple disks, two ages
  – Fault injection
    • To understand dynamic device performance change and device sharing effects
• Talk shows only single disk degradation
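Applying the stated trends, an aged disk's parameters follow from dividing a current disk's parameters by the yearly improvement factor (the old disk falls behind by exactly that factor). This reproduces the IBM 9LZX aging table in the backup slides:

```python
def aged_disk(years, bw0=20.0, seek0=5.30, rot0=3.00):
    """Parameters of a `years`-old IBM 9LZX relative to a current disk.
    Bandwidth improves 40%/year; seek and rotation improve 10%/year."""
    bandwidth = bw0 / (1.40 ** years)    # MB/s
    seek = seek0 / (0.90 ** years)       # ms
    rotation = rot0 / (0.90 ** years)    # ms
    return bandwidth, seek, rotation
```

For example, a 1-year-old disk delivers 20.0 / 1.40 ≈ 14.3 MB/s, matching the table's first aged row.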

Page 17

Evaluated policies

• Cost-oblivious: LRU, Clock
• Storage-aware: Eager LRU, Lazy LRU, Lazy Clock (Clock-Lottery)
• Comparison: LANDLORD
  – Cost-aware, non-partitioned LRU
  – Same as web caching algorithm
  – Integration problems with modern OSes

Page 18

Synthetic

• Workload: read requests, sizes exponentially distributed around 34 KB, uniform load across disks

• A single slow disk greatly impacts performance
• Eager LRU, Lazy LRU, Lazy Clock, and LANDLORD remain robust as slow-disk performance degrades

[Charts: throughput (MB/s, 0–25) vs. age of slow disk (0–10 years). LRU-based panel: Eager LRU, LANDLORD, Lazy LRU, LRU, no cache. Clock-based panel: Clock, Lazy Clock, no cache.]

Page 19

Web

• Workload: 1-day image server trace from UC Berkeley, reads & writes

• Eager LRU and LANDLORD are the most robust

[Charts: throughput (MB/s, 0–180) vs. age of slow disk. Full panel (0–10 years): Eager LRU, LANDLORD, Lazy LRU, LRU, no cache. Zoomed panel (6–10 years): Eager LRU vs. Lazy LRU.]

Page 20

Summary

• Problem: mismatch between storage environment and cache policy
  – Current buffer cache policies lack device information
• Policies need to include storage environment information
• Our solution: generic partitioning framework
  – Aggregates performance information
  – Adapts quickly
  – Allows use of existing policies

Page 21

Questions?

More information at www.cs.wisc.edu/wind

Page 22

Page 23

Future work

• Implementation in Linux
• Cost-benefit algorithm
• Study of integration with prefetching and layout

Page 24

Problems with LANDLORD

• Does not mesh well with unified buffer caches (assumes LRU)
• LRU-based: not always desirable
  – Example: databases
• Suffers from a “memory effect”
  – Can be much slower to adapt

Page 25

Disk aging of an IBM 9LZX

Age (years) | Bandwidth (MB/s) | Avg. seek time (ms) | Avg. rotational delay (ms)
0           | 20.0             | 5.30                | 3.00
1           | 14.3             | 5.89                | 3.33
2           | 10.2             | 6.54                | 3.69
3           | 7.29             | 7.27                | 4.11
4           | 5.21             | 8.08                | 4.56
5           | 3.72             | 8.98                | 5.07
6           | 2.66             | 9.97                | 5.63
7           | 1.90             | 11.1                | 6.26
8           | 1.36             | 12.3                | 6.96
9           | 0.97             | 13.7                | 7.73
10          | 0.69             | 15.2                | 8.59

Page 26

Web without writes

• Workload: web workload with writes replaced by reads

• Eager LRU and LANDLORD are the most robust

[Charts: throughput (MB/s, 0–180) vs. age of slow disk. Full panel (0–10 years): Eager LRU, LANDLORD, Lazy LRU, LRU, no cache. Zoomed panel (6–10 years): Eager LRU vs. Lazy LRU.]

Page 27

Page 28

Problem illustration

[Diagram: applications above the OS; fast and slow disks below]

Page 29

Problem illustration

[Diagram: applications above the OS; fast and slow disks below]

Page 30

Lack of information

[Diagram: applications above the OS buffer cache; below it, drivers in front of the disks, with DFSes, NAS, & new devices interposed]

Page 31

Solution: overview

• Partition the cache
  – One partition per device
  – Cost-oblivious policies (e.g. LRU) in partitions
  – Aggregate device performance
• Dynamically reallocate pages

[Diagram: buffer cache split into per-device partitions managed by the policy]

Page 32

Forms of dynamic partitioning

• Eager
  – Change sizes

Page 33

Tomorrow

• New devices
• New paradigms
• Increasingly rich storage environment
• Mismatch: reevaluation needed

[Diagram: buffer cache policy in front of LAN, WAN, and unknown future devices]

Page 34

WiND project

• Wisconsin Network Disks
• Building manageable distributed storage
• Focus: local-area networked storage
• Issues similar in wide-area