14
File Caching with SSD Arrays Wei Yang 11/14/12 US ATLAS Distributed Facility Workshop University of California, Santa Cruz 1

File Caching with SSD Arrays

  • Upload
    cece

  • View
    49

  • Download
    1

Embed Size (px)

DESCRIPTION

File Caching with SSD Arrays. Wei Yang. Motivation. We are curious No immediate needs, but future needs Caching (only) analysis job inputs SSD has limited write cycles Other goals, see the last slide File level caching Conventional LFU/LRU algorithms - PowerPoint PPT Presentation

Citation preview

Page 1: File Caching with SSD Arrays

US ATLAS Distributed Facility Workshop University of California, Santa Cruz

1

File Caching with SSD Arrays

Wei Yang

11/14/12

Page 2: File Caching with SSD Arrays

US ATLAS Distributed Facility Workshop University of California, Santa Cruz

2

Motivation• We are curious

– No immediate needs, but future needs– Caching (only) analysis job inputs– SSD has limited write cycles – Other goals, see the last slide

• File level caching– Conventional LFU/LRU algorithms

• can not capture ATLAS analysis jobs data usage pattern (if there is such a pattern)

– Sub-file level caching would be great! But book keeping is hard– We search for caching algorithm

• Out-Bytes > In-Bytes under ATLAS workload • Use LRU, but based on days/weeks/months job usage pattern

11/14/12

Page 3: File Caching with SSD Arrays

US ATLAS Distributed Facility Workshop University of California, Santa Cruz

3

Analysis jobs visit SSD cache first

Cache miss!forward to HD storage

Day 1 ... Day N

File 0001 X1 Xn

File 0002 Y1 Yn

Fill the cache

Xrootd monitoring stream

Setup 1: Caching based on File Access Frequency

o A table records access Frequency of all fileso Rotate columns to maintain N days of records

11/14/12

Page 4: File Caching with SSD Arrays

US ATLAS Distributed Facility Workshop University of California, Santa Cruz

4

Analysis jobs visit SSD cache first

Cache miss!forward to HD storage

Fill the cache

Xrootd monitoring stream to

UCSD collector

Setup 2: Caching based on Historic File Access Info

o Record every file access as event like infoo save to ROOT files for later analysis

11/14/12

Page 5: File Caching with SSD Arrays

US ATLAS Distributed Facility Workshop University of California, Santa Cruz

5

Hardware of the SSD Box• Dell 610

– 8-core 2.4 Ghz– 24GB– Intel dual X520 10Gb NIC– LSI SAS 9200-8e (support TRIM)– RHEL 6 x86_64– Xrootd

• SSD Array– Dell MD1220– 12x OCZ Talos 960GB MLC SSDs, total ~11TB

• Non-raid to support TRIM. • Xrootd take care to gluing them together as a single space

11/14/12

Page 6: File Caching with SSD Arrays

US ATLAS Distributed Facility Workshop University of California, Santa Cruz

6

The box can deliverCan the caching algorithm deliver?

3-hour plot2012-11-05

6-month plot as of 2012-11-12

File Access Freq. Alg.Net data sink, not cache

Algorithm: Bytes-read/file size > 110% during the last 5 days, prioritized by this ratio and up to 200GB/hour

Sept 1

Lack of jobs

Cache brings in ~200GB/hour

11/14/12

Page 7: File Caching with SSD Arrays

US ATLAS Distributed Facility Workshop University of California, Santa Cruz

7

GB/hour from SSD + HDDGB/hour from SSDGB/hour to SSD

Lost monitoringdata from HDD

UCSD collectordead

Lack of jobsfor the last 4 days

Ceiling of 10Gb NIC

11/14/12

Page 8: File Caching with SSD Arrays

US ATLAS Distributed Facility Workshop University of California, Santa Cruz

8

Simulate the Cache with Historic Data

Cache size requiredfor day [-x, -1]

day –n -n+1 -1 0

Day 0:Size of all files read

Bytes read from SSD+HDD

Bytes read from SSD

Cache size required for [-x+1, 0] - = New data to cache

For a given caching algorithm, what do we want to learn from those historic data?

11/14/12

Page 9: File Caching with SSD Arrays

US ATLAS Distributed Facility Workshop University of California, Santa Cruz

9

Algorithm: every files during the last N days.11/14/12

Page 10: File Caching with SSD Arrays

US ATLAS Distributed Facility Workshop University of California, Santa Cruz

10

Algorithm: every files during the last N daysCache hit rate = Byte from SSD/Bytes from SSD+HDD

11/14/12

Page 11: File Caching with SSD Arrays

US ATLAS Distributed Facility Workshop University of California, Santa Cruz

11

Algorithm: Bytes-read/file size > 110% during the last 5 days11/14/12

Page 12: File Caching with SSD Arrays

US ATLAS Distributed Facility Workshop University of California, Santa Cruz

12

Analyzing the Historic Data

Try to find a way to identify data worth caching.

So far, not much success

Worth caching

11/14/12

Page 13: File Caching with SSD Arrays

US ATLAS Distributed Facility Workshop University of California, Santa Cruz

1311/14/12

Do the jobs tend to open the same file in a short time window?• If some, we may not have a chance to cache

File that worth caching• Access time (open) scatter over several hours – cacheable• But “scattering over several hours” doesn’t mean the file worth caching

Page 14: File Caching with SSD Arrays

US ATLAS Distributed Facility Workshop University of California, Santa Cruz

14

Next Step

• So far focusing on making it a good cache– More work to be done– Should also look at

• Asking Panda for input files lists of coming jobs• Possibility of sub-file level caching

• How much can the cache speed up analysis jobs?– All files are in SSD cache– Normal caching --- some files in SSD cache, some are

not

11/14/12