Transcript
Page 1: Shark: Scaling File Servers via Cooperative Caching

Shark: Scaling File Servers via Cooperative Caching

Siddhartha Annapureddy, Michael J. Freedman, David Mazières

New York University

Networked Systems Design and Implementation, 2005

Siddhartha Annapureddy Shark

Page 2: Shark: Scaling File Servers via Cooperative Caching

Motivation

Scenario – Want to test a new application on a hundred nodes

[Diagram: ten workstations]

Problem – Need to push software to all nodes and collect results

Page 3: Shark: Scaling File Servers via Cooperative Caching

Current Approaches

Distribute from a central server

[Diagram: a central server and three clients on Network A and Network B]

rsync, unison, scp

– Server’s network uplink saturates

I Operation takes a long time to finish

– Wastes bandwidth along bottleneck links

Page 4: Shark: Scaling File Servers via Cooperative Caching

Current Approaches

File distribution mechanisms

BitTorrent, Bullet

+ Scales by offloading burden from the server

– Client downloads from half-way across the world

Page 5: Shark: Scaling File Servers via Cooperative Caching

Inherent Problems with Copying Files

Users have to decide a priori what to ship

I Ship too much – Waste bandwidth, takes long time

I Ship too little – Hassle to work in a poor environment

Idle environments consume disk space

I Users are loath to clean up ⇒ Redistribution

I Need low cost solution for refetching files

Manual management of software versions

Illusion of having development environment

Programs fetched transparently on demand

Page 6: Shark: Scaling File Servers via Cooperative Caching

Networked File Systems

+ Know how to deploy these systems

+ Know how to administer such systems

+ Simple accountability mechanisms

E.g., NFS, AFS, SFS, etc.

Page 8: Shark: Scaling File Servers via Cooperative Caching

Problem: Scalability

[Plot: time to finish read (sec) vs. unique nodes, 40 MB read; series: SFS, Shark]

SFS: very slow, ≈ 775s

Shark: much better, ≈ 88s (≈ 9x better)

Page 9: Shark: Scaling File Servers via Cooperative Caching

Peer-to-Peer Filesystems

P2P Filesystems (Pond, Ivy)

+ Scalability

– Non-standard administrative models

– New filesystem semantics

– Haven’t been widely deployed yet

Goal – Best of Both Worlds

I Convenience of central servers

I Scalability of peer-to-peer

Page 10: Shark: Scaling File Servers via Cooperative Caching

Let’s Design this New Filesystem

[Diagram: Server and Client]

Vanilla SFS

I Central server model

Page 11: Shark: Scaling File Servers via Cooperative Caching

Scalability – Client-side Caching

[Diagram: Server, Client, Cache]

Large client cache à la AFS

I Scales by avoiding redundant data transfers

I Leases to ensure NFS-style cache consistency

I With multiple clients, must address bandwidth concerns
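
The lease idea above can be illustrated with a minimal sketch. It assumes a hypothetical `LeaseCache` class, an injectable clock, and a made-up lease duration; the slides do not describe Shark's actual lease protocol, so this only shows the general pattern of serving cached data while a lease is valid and revalidating afterwards.

```python
import time


class LeaseCache:
    """Toy client cache: an entry may be served only while its lease is valid."""

    def __init__(self, lease_seconds, clock=time.monotonic):
        self.lease_seconds = lease_seconds  # hypothetical lease duration
        self.clock = clock                  # injectable so the sketch is testable
        self.entries = {}                   # path -> (data, lease_expiry)

    def put(self, path, data):
        # Called after fetching from the server, which grants a fresh lease.
        self.entries[path] = (data, self.clock() + self.lease_seconds)

    def get(self, path):
        # Return cached data while the lease holds; None means "ask the server again".
        entry = self.entries.get(path)
        if entry is None:
            return None
        data, expiry = entry
        if self.clock() >= expiry:
            return None
        return data
```

A client would call `get` first and contact the server only when it returns `None`, which is how redundant transfers are avoided.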

Page 15: Shark: Scaling File Servers via Cooperative Caching

Scalability – Cooperative Caching

Clients fetch data from each other and offload burden from server

[Diagram: a server and five clients]

I Shark clients maintain a distributed index

I Fetch a file from multiple other clients in parallel

I LBFS-style chunks – Variable-sized blocks

I Chunks are better – Exploit commonalities across files

I Chunks preserved across file versions, concatenations

I Locality-awareness – Preferentially fetch chunks from nearby clients
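
To make the LBFS-style variable-sized chunks concrete, here is a rough sketch of content-defined chunking. It is not LBFS's or Shark's algorithm: LBFS uses Rabin fingerprints, whereas this sketch hashes a small sliding window with SHA-1 for simplicity, and the function name `split_chunks` and all parameters (window, mask, size limits) are illustrative assumptions.

```python
import hashlib


def split_chunks(data, window=16, mask=0x3F, min_size=32, max_size=1024):
    """Split bytes into variable-sized chunks at content-defined boundaries.

    A boundary is declared where a hash of the last `window` bytes has its
    low bits (selected by `mask`) equal to zero, subject to size limits.
    """
    chunks, start = [], 0
    for i in range(len(data)):
        size = i - start + 1
        if size < min_size:
            continue
        window_bytes = data[max(0, i - window + 1): i + 1]
        fingerprint = int.from_bytes(hashlib.sha1(window_bytes).digest()[:4], "big")
        if (fingerprint & mask) == 0 or size >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks
```

Because boundaries depend on local content rather than fixed offsets, an edit near the start of a file tends to leave later chunk boundaries, and therefore later chunks, unchanged.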

Page 17: Shark: Scaling File Servers via Cooperative Caching

Cross-filesystem Sharing

Global cooperative cache regardless of origin servers

[Diagram: ten clients and two servers, A and B]

I Two groups of clients accessing servers A and B

I Client groups share a large amount of software

I Such clients automatically form a global cache

Page 19: Shark: Scaling File Servers via Cooperative Caching

Application

Linux distribution with LiveCDs

[Diagram: Sharkcd clients running Knoppix and Slax, plus a Knoppix mirror]

I LiveCD – Run an entire OS without using hard disk

I But all your programs must fit on a CD-ROM

I Could download dynamically from a server, but that has scalability problems

I Knoppix and Slax users form global cache – Relieve servers

Page 23: Shark: Scaling File Servers via Cooperative Caching

Cooperative Caching – Roadmap

[Diagram: a server and five clients; current step: Download Chunks]

I Fetch metadata from the server

I Look up clients caching needed chunks in overlay

I Connect to multiple clients and download chunks in parallel

I Check integrity of fetched chunks

Clients are mutually distrustful
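
The four roadmap steps can be strung together in a small sketch. Everything here is hypothetical scaffolding: `lookup`, `download`, and `fetch_from_server` are caller-supplied stand-ins, SHA-256 stands in for the hash, peers are tried one after another rather than in parallel, and falling back to the server when no peer delivers a valid chunk is an assumption of the sketch.

```python
import hashlib


def read_file(tokens, lookup, download, fetch_from_server):
    """Assemble a file from peers, verifying each chunk against its token.

    tokens: chunk tokens obtained from the server, in file order.
    lookup(token) -> list of peer addresses claiming to cache the chunk.
    download(peer, token) -> bytes, or None if the peer has nothing.
    fetch_from_server(token) -> bytes, used when no peer delivers a valid chunk.
    """
    parts = []
    for token in tokens:
        chunk = None
        for peer in lookup(token):
            data = download(peer, token)
            # Peers are not trusted: accept data only if its hash matches.
            if data is not None and hashlib.sha256(data).digest() == token:
                chunk = data
                break
        if chunk is None:
            chunk = fetch_from_server(token)
        parts.append(chunk)
    return b"".join(parts)
```

A corrupt or malicious peer costs only a wasted download in this model, since its data fails the hash check and the next source is tried.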

Page 28: Shark: Scaling File Servers via Cooperative Caching

File Metadata

[Diagram: Client sends GET_TOKEN(fh, offset, count) to Server; Server returns metadata: chunk tokens]

I Possession of token implies server permissions to read

I Tokens are a shared secret between authorized clients

I Tokens can be used to check integrity of fetched data

Given a chunk B...

I Chunk token T_B = H(B)

I H is a collision-resistant hash function
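
A minimal sketch of the token definition and the integrity check it enables. SHA-256 is used here only as a stand-in for the collision-resistant hash H; the slides do not name the hash function, and the helper names are hypothetical.

```python
import hashlib
import hmac


def chunk_token(chunk: bytes) -> bytes:
    """T_B = H(B), with SHA-256 standing in for H."""
    return hashlib.sha256(chunk).digest()


def verify_chunk(chunk: bytes, token: bytes) -> bool:
    """Integrity check: recompute the hash of fetched data and compare to T_B."""
    return hmac.compare_digest(chunk_token(chunk), token)
```

Since the token is derived from the chunk contents alone, identical chunks get identical tokens no matter which file or server they came from.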

Page 29: Shark: Scaling File Servers via Cooperative Caching

Discovering Clients Caching Chunks

[Diagram: a server and five clients; current step: Discover Clients]

I For every chunk B, there’s an indexing key I_B

I I_B is used to index clients caching B

I Cannot set I_B = T_B, as T_B is a secret

I I_B = MAC(T_B, “Indexing Key”)
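
The indexing-key derivation can be sketched with a standard keyed MAC. HMAC-SHA-256 is an assumption standing in for the unspecified MAC, and `indexing_key` is a hypothetical helper name.

```python
import hashlib
import hmac


def indexing_key(token: bytes) -> bytes:
    """I_B = MAC(T_B, "Indexing Key"): keyed by the secret token, public to publish."""
    return hmac.new(token, b"Indexing Key", hashlib.sha256).digest()
```

Anyone holding T_B can compute I_B, but the MAC is one-way, so publishing I_B in the index does not reveal the token.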

Page 30: Shark: Scaling File Servers via Cooperative Caching

Locality Awareness

[Diagram: four clients across Network A and Network B]

I Overlay organized as clusters based on latency

I Indexing infrastructure preferentially returns sources in same cluster as the client

I Hence, chunks usually transferred from nearby clients
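
One way to picture the preference for same-cluster sources is a simple ranking. This is only an illustration: the tuple layout, the `pick_sources` name, and the use of a measured round-trip time as a tie-breaker are assumptions, not the indexing infrastructure's actual behaviour.

```python
def pick_sources(candidates, client_cluster, limit=3):
    """Rank candidate sources: same cluster first, then lowest latency.

    candidates: list of (address, cluster_id, rtt_ms) tuples (hypothetical layout).
    """
    ranked = sorted(candidates, key=lambda c: (c[1] != client_cluster, c[2]))
    return [address for address, _, _ in ranked[:limit]]
```

With this ordering a distant source is used only when the client's own cluster cannot supply enough sources.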

Page 31: Shark: Scaling File Servers via Cooperative Caching

Final Steps

Download Chunk

I Security issues discussed later

Register as a source

I Client now becomes a source for the downloaded chunk

I Client registers in distributed index – PUT(I_B, Addr)

Chunk Reconciliation

I Reuse connection to download more chunks

I Exchange mutually needed chunks without indexing overhead
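
Registration and reconciliation can be sketched with an in-memory stand-in for the distributed index. `ToyIndex` and `reconcile` are hypothetical names, and the set-intersection view of reconciliation is a simplification of whatever the two clients actually exchange over the reused connection.

```python
from collections import defaultdict


class ToyIndex:
    """In-memory stand-in for the distributed index: I_B -> set of addresses."""

    def __init__(self):
        self._sources = defaultdict(set)

    def put(self, index_key, addr):
        # PUT(I_B, Addr): announce that addr now caches the chunk.
        self._sources[index_key].add(addr)

    def get(self, index_key):
        return sorted(self._sources.get(index_key, ()))


def reconcile(my_needed, my_cached, peer_needed, peer_cached):
    """Return (chunks I can fetch from the peer, chunks the peer can fetch from me)."""
    return my_needed & peer_cached, peer_needed & my_cached
```

Once two clients are connected, `reconcile` identifies further chunks they can trade directly, which saves one index lookup per chunk.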

Page 33: Shark: Scaling File Servers via Cooperative Caching

Security Issues – Client Authentication

[Diagram: a server and five clients; labels: Traditional Authentication, Authenticator Derived From Token]

I Traditionally, server authenticated read requests using uids

I Challenge – How does a client know when to send chunks?

I Chunk token allows client to identify authorized clients

Page 34: Shark: Scaling File Servers via Cooperative Caching

Security Issues – Client Communication

I Client should be able to check integrity of downloaded chunk

I Client should not send chunks to other unauthorized clients

I An eavesdropper shouldn’t be able to obtain chunk contents

Page 38: Shark: Scaling File Servers via Cooperative Caching

Security Protocol

[Diagram: Receiver (C) and Source (P) exchange nonces R_C and R_P; the receiver sends I_B and Auth_C; the source returns E_K(B)]

I R_C, R_P – Random nonces to ensure freshness

I Auth_C – Authenticator to prove receiver has token

I Auth_C = MAC(T_B, “Auth C”, C, P, R_C, R_P)

I K – Key to encrypt chunk contents

I K = MAC(T_B, “Encryption”, C, P, R_C, R_P)
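
The two derivations above differ only in their label string, which a short sketch makes visible. HMAC-SHA-256 stands in for the unspecified MAC, and the length-prefixed field encoding is an assumption introduced here so that the concatenated fields are unambiguous; the slides do not specify a wire format.

```python
import hashlib
import hmac


def _mac(token, label, *fields):
    # Length-prefix each field so (b"ab", b"c") and (b"a", b"bc") differ.
    message = label + b"".join(len(f).to_bytes(4, "big") + f for f in fields)
    return hmac.new(token, message, hashlib.sha256).digest()


def authenticator(token, c, p, r_c, r_p):
    """Auth_C = MAC(T_B, "Auth C", C, P, R_C, R_P)."""
    return _mac(token, b"Auth C", c, p, r_c, r_p)


def encryption_key(token, c, p, r_c, r_p):
    """K = MAC(T_B, "Encryption", C, P, R_C, R_P)."""
    return _mac(token, b"Encryption", c, p, r_c, r_p)
```

Because both values are keyed by T_B and bound to fresh nonces, only a party that already holds the token can produce them, and values captured from an earlier session do not carry over.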

Page 39: Shark: Scaling File Servers via Cooperative Caching

Security Properties

[Diagram: same receiver-source exchange as on the Security Protocol slide]

Client can check integrity of downloaded chunk

I Client checks whether H(downloaded chunk) = T_B

Source should not send chunks to unauthorized clients

I Malicious clients cannot send correct Auth_C

Eavesdropper shouldn’t get chunk contents

I All communication encrypted with K
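
The source-side half of these properties can be sketched as a toy `Source` class that releases a chunk only when the requester's authenticator checks out. The class, its method names, the field encoding, and the use of SHA-256 and HMAC-SHA-256 are all assumptions; encryption under K is omitted, so the sketch returns the chunk in the clear.

```python
import hashlib
import hmac


def make_auth(token, c, p, r_c, r_p):
    # Same shape as Auth_C on the slide; the length-prefixed encoding is assumed.
    fields = b"".join(len(f).to_bytes(4, "big") + f for f in (c, p, r_c, r_p))
    return hmac.new(token, b"Auth C" + fields, hashlib.sha256).digest()


class Source:
    """Toy source: serves a cached chunk only to receivers that prove they hold T_B."""

    def __init__(self, name):
        self.name = name
        self.by_index = {}  # I_B -> (T_B, chunk)

    def cache(self, chunk):
        token = hashlib.sha256(chunk).digest()
        index_key = hmac.new(token, b"Indexing Key", hashlib.sha256).digest()
        self.by_index[index_key] = (token, chunk)
        return token, index_key

    def serve(self, index_key, auth, receiver, r_c, r_p):
        entry = self.by_index.get(index_key)
        if entry is None:
            return None
        token, chunk = entry
        expected = make_auth(token, receiver, self.name, r_c, r_p)
        if not hmac.compare_digest(auth, expected):
            return None  # requester could not prove possession of the token
        return chunk     # a real source would send E_K(B) instead
```

A requester that knows only I_B, for example from watching index lookups, cannot compute a valid authenticator, so `serve` returns nothing.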

Page 40: Shark: Scaling File Servers via Cooperative Caching

Security Properties

[Diagram: same receiver-source exchange as on the Security Protocol slide]

Privacy limitations for world-readable files

I Eavesdropper can track lookups of clients

I Eavesdropper hashes data, finds what exactly client downloads

For private files, solution described in paper

I Sacrifices cross-FS sharing for better privacy

Forward Secrecy not guaranteed

Page 41: Shark: Scaling File Servers via Cooperative Caching

Implementation

[Diagram: Server: sharksd, NFS v3 server. Each of two clients: Application, Syscall, NFS v3 client, sharkcd, corald]

I sharkcd – Incorporates source-receiver client functionality

I sharksd – Incorporates chunking mechanisms

I corald – A node in the indexing infrastructure

Page 42: Shark: Scaling File Servers via Cooperative Caching

Evaluation

I How does Shark compare with SFS? With NFS?

I How scalable is the server?

I How fair is Shark across clients?

I Which fetch order is better: random or sequential?

I What are the benefits of set reconciliation?

Page 43: Shark: Scaling File Servers via Cooperative Caching

Emulab – 100 Nodes on LAN

[Plot: percentage completed within time vs. time since initialization (sec), 40 MB read; series: Shark (rand, negotiation), NFS, SFS]

I Shark – 88s

I SFS – 775s (Shark ≈ 9x better), NFS – 350s (Shark ≈ 4x better)

I SFS less fair because of TCP backoffs
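
The quoted ratios follow directly from the completion times on this slide. A quick check, using a hypothetical `speedup` helper:

```python
def speedup(baseline_seconds, shark_seconds):
    """How many times faster Shark finished than the baseline."""
    return baseline_seconds / shark_seconds


sfs_vs_shark = round(speedup(775, 88), 1)  # 8.8, quoted as roughly 9x
nfs_vs_shark = round(speedup(350, 88), 1)  # 4.0, quoted as roughly 4x
```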

Page 44: Shark: Scaling File Servers via Cooperative Caching

PlanetLab – 185 Nodes

[Plot: percentage completed within time vs. time since initialization (sec), 40 MB read; series: Shark, SFS]

I Shark ≈ 7 min – 95th Percentile

I SFS ≈ 39 min – 95th Percentile (Shark ≈ 5x better)

I NFS – Triggered kernel panics in server

Page 45: Shark: Scaling File Servers via Cooperative Caching

Data pushed by Server

[Bar chart: normalized bandwidth pushed by the server; bar labels: SFS 7400, Shark 954]

I Shark vs SFS – 23 copies vs 185 copies (8x better)
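
The copy counts can be cross-checked against the bar labels, assuming the labels 7400 and 954 are megabytes sent by the server and each full copy is the 40 MB file from the experiment. Both the unit and the variable names are assumptions of this sketch.

```python
FILE_MB = 40            # file size used in the experiment
SFS_SERVER_MB = 7400    # bar label for SFS, assumed to be MB
SHARK_SERVER_MB = 954   # bar label for Shark, assumed to be MB


def copies_served(total_mb, file_mb=FILE_MB):
    """Express server traffic as a number of full copies of the file."""
    return total_mb / file_mb


sfs_copies = copies_served(SFS_SERVER_MB)      # 185.0, one copy per node
shark_copies = copies_served(SHARK_SERVER_MB)  # 23.85, quoted as 23 copies
reduction = SFS_SERVER_MB / SHARK_SERVER_MB    # about 7.8, quoted as 8x
```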

Page 46: Shark: Scaling File Servers via Cooperative Caching

Data served by Clients

[Plot: bandwidth transmitted (MB) per unique Shark proxy; PlanetLab hosts, Shark, 40 MB]

I Maximum contribution ≈ 3.5 copies

I Median contribution ≈ 1.5 copies

I Minimum contribution ≈ 0.75 copies

Page 47: Shark: Scaling File Servers via Cooperative Caching

Fetching Chunks – Order Matters

I In what order should we fetch chunks of a file?

I Natural choices – Random or Sequential

Intuitively, when many clients start simultaneously

I Random

  I All clients fetch independent chunks

  I More chunks become available in the cooperative cache

I Sequential

  I Better disk I/O scheduling on the server

  I Only the client that has downloaded the most chunks adds to the cache
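
The intuition about simultaneous starts can be illustrated with a toy model that ignores the network entirely and only counts how many distinct chunks exist across all clients after each has fetched a few. The function names, the seed, and the model itself are assumptions made for illustration; this is not the experiment reported in the talk.

```python
import random


def fetch_order(num_chunks, strategy, rng):
    """Return the order in which one client requests chunk indices."""
    order = list(range(num_chunks))
    if strategy == "random":
        rng.shuffle(order)
    return order


def distinct_after_rounds(num_clients, num_chunks, strategy, rounds, seed=0):
    """Count distinct chunks held across all clients after `rounds` fetches each."""
    rng = random.Random(seed)
    held = set()
    for _ in range(num_clients):
        held.update(fetch_order(num_chunks, strategy, rng)[:rounds])
    return len(held)
```

In this model, sequential clients all hold the same prefix of the file, so only the next chunk is ever new to the cooperative cache, while random clients quickly cover many different chunks.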

Page 48: Shark: Scaling File Servers via Cooperative Caching

Emulab – 100 Nodes on LAN

[Plot: percentage completed within time vs. time since initialization (sec), 40 MB read; series: Shark (rand), Shark (seq)]

I Random – 133s

I Sequential – 203s

I Random wins – 35% better

Page 49: Shark: Scaling File Servers via Cooperative Caching

Emulab – 100 Nodes on LAN

[Plot: percentage completed within time vs. time since initialization (sec), 40 MB read; series: Shark (rand), Shark (rand, negotiation)]

I Random + Reconciliation – 88s

I Random – 133s

I Reconciliation crucial – 34% improvement
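
Both percentages in the Emulab comparisons are consistent with a relative reduction in completion time, computed from the quoted times (203s sequential, 133s random, 88s random with reconciliation). Reading them this way is an assumption; the slides do not state the formula.

```latex
\[
\frac{203 - 133}{203} \approx 0.345 \quad (\text{random vs. sequential}),
\qquad
\frac{133 - 88}{133} \approx 0.338 \quad (\text{reconciliation vs. random alone})
\]
```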

Page 50: Shark: Scaling File Servers via Cooperative Caching

Conclusions

I Networked filesystems offer a convenient interface

I Current networked filesystems like NFS are not scalable

  I Forces users to resort to inconvenient tools like rsync

I Shark offers a filesystem that scales to hundreds of clients

  I Locality-aware cooperative cache

  I Supports cross-FS sharing enabling novel applications

Page 51: Shark: Scaling File Servers via Cooperative Caching

Questions

http://www.scs.cs.nyu.edu/shark
