51
Shark: Scaling File Servers via Cooperative Caching Siddhartha Annapureddy Michael J. Freedman David Mazi` eres New York University Networked Systems Design and Implementation, 2005 Siddhartha Annapureddy Shark

Shark: Scaling File Servers via Cooperative Caching

Embed Size (px)

DESCRIPTION

Network file systems offer a powerful, transparent interfacefor accessing remote data. Unfortunately, in current network file systems like NFS, clients fetch data from a central file server, inherently limiting the system's ability to scale to many clients. While recent distributed (peer-to-peer) systems have managed to eliminate this scalabilitybottleneck, they are often exceedingly complex and providenon-standard models for administration and accountability. We present Shark, a novel system that retains the best of both worlds— -- the scalability of distributed systemswith the simplicity of central servers.Shark is a distributed file system designed for large-scale,wide-area deployment, while also providing a drop-in replacement for local-area file systems. Shark introduces a novel cooperative-caching mechanism, in whichmutually-distrustful clients can exploit each others' filecaches to reduce load on an origin file server. Using a distributed index, Shark clients nd nearby copies of data, even when files originate from different servers. Performance results show that Shark can greatly reduce serverload and improve client latency for read-heavy workloads, both in the wide and local areas, while still remainingcompetitive for single clients in the local area. Thus, Shark enables modestly-provisioned file servers to scale to hundreds of read-mostly clients while retaining traditional usability, consistency, security, and accountability.

Citation preview

Page 1: Shark: Scaling File Servers via Cooperative Caching

Shark: Scaling File Servers via Cooperative

Caching

Siddhartha Annapureddy Michael J. Freedman David Mazieres

New York University

Networked Systems Design and Implementation, 2005

Siddhartha Annapureddy Shark

Page 2: Shark: Scaling File Servers via Cooperative Caching

Motivation

Scenario – Want to test a new application on a hundred nodes

Workstation

Workstation

Workstation

Workstation

WorkstationWorkstation

Workstation Workstation

WorkstationWorkstation

Problem – Need to push software to all nodes and collect results

Siddhartha Annapureddy Shark

Page 3: Shark: Scaling File Servers via Cooperative Caching

Current Approaches

Distribute from a central server

Client

Client

Client

Network A Network B

Server

rsync, unison, scp

– Server’s network uplink saturatesI Operation takes a long time to finish

– Wastes bandwidth along bottleneck links

Siddhartha Annapureddy Shark

Page 4: Shark: Scaling File Servers via Cooperative Caching

Current Approaches

File distribution mechanisms

BitTorrent, Bullet

+ Scales by offloading burden on server

– Client downloads from half-way across the world

Siddhartha Annapureddy Shark

Page 5: Shark: Scaling File Servers via Cooperative Caching

Inherent Problems with Copying Files

Users have to decide a priori what to ship

I Ship too much – Waste bandwidth, takes long time

I Ship too little – Hassle to work in a poor environment

Idle environments consume disk space

I Users are loath to cleanup ⇒ Redistribution

I Need low cost solution for refetching files

Manual management of software versions

Illusion of having development environmentPrograms fetched transparently on demand

Siddhartha Annapureddy Shark

Page 6: Shark: Scaling File Servers via Cooperative Caching

Networked File Systems

+ Know how to deploy these systems

+ Know how to administer such systems

+ Simple accountability mechanisms

Eg: NFS, AFS, SFS etc.

Siddhartha Annapureddy Shark

Page 7: Shark: Scaling File Servers via Cooperative Caching

Problem: Scalability

0

100

200

300

400

500

600

700

800

0 20 40 60 80 100

Tim

e to

fini

sh R

ead

(sec

)

Unique nodes

40 MB readSFS

Very slow ≈ 775s

Siddhartha Annapureddy Shark

Page 8: Shark: Scaling File Servers via Cooperative Caching

Problem: Scalability

0

100

200

300

400

500

600

700

800

0 20 40 60 80 100

Tim

e to

fini

sh R

ead

(sec

)

Unique nodes

40 MB readSFSShark

Very slow ≈ 775sMuch better !!! ≈ 88s (9x better)

Siddhartha Annapureddy Shark

Page 9: Shark: Scaling File Servers via Cooperative Caching

Peer-to-Peer Filesystems

P2P Filesystems (Pond, Ivy)

+ Scalability

– Non-standard administrative models

– New filesystem semantics

– Haven’t been widely deployed yet

Goal – Best of Both Worlds

I Convenience of central servers

I Scalability of peer-to-peer

Siddhartha Annapureddy Shark

Page 10: Shark: Scaling File Servers via Cooperative Caching

Let’s Design this New Filesystem

ServerClient

Vanilla SFS

I Central server model

Siddhartha Annapureddy Shark

Page 11: Shark: Scaling File Servers via Cooperative Caching

Scalability – Client-side Caching

ServerClientCache

Large client cache a la AFS

I Scales by avoiding redundant data transfers

I Leases to ensure NFS-style cache consistency

I With multiple clients, must address bandwidth concerns

Siddhartha Annapureddy Shark

Page 12: Shark: Scaling File Servers via Cooperative Caching

Scalability – Cooperative Caching

ServerClient

ClientClient

Client

Client

Siddhartha Annapureddy Shark

Page 13: Shark: Scaling File Servers via Cooperative Caching

Scalability – Cooperative Caching

Clients fetch data from each other and offload burden from server

ServerClient

ClientClient

Client

Client

I Shark clients maintain a distributed index

Siddhartha Annapureddy Shark

Page 14: Shark: Scaling File Servers via Cooperative Caching

Scalability – Cooperative Caching

Clients fetch data from each other and offload burden from server

Server

ClientClient

Client

Client

Client

I Shark clients maintain a distributed index

I Fetch a file from multiple other clients in parallel

Siddhartha Annapureddy Shark

Page 15: Shark: Scaling File Servers via Cooperative Caching

Scalability – Cooperative Caching

Clients fetch data from each other and offload burden from server

Server

ClientClient

Client

Client

Client

I Shark clients maintain a distributed index

I Fetch a file from multiple other clients in parallelI LBFS-style Chunks – Variable-sized blocksI Chunks are better – Exploit commonalities across filesI Chunks preserved across file versions, concatenations

Siddhartha Annapureddy Shark

Page 16: Shark: Scaling File Servers via Cooperative Caching

Scalability – Cooperative Caching

Clients fetch data from each other and offload burden from server

Server

ClientClient

Client

Client

Client

I Shark clients maintain a distributed index

I Fetch a file from multiple other clients in parallel

I Locality-awareness – Preferentially fetch chunks from nearbyclients

Siddhartha Annapureddy Shark

Page 17: Shark: Scaling File Servers via Cooperative Caching

Cross-filesystem Sharing

Global cooperative cache regardless of origin servers

Client

ClientClient

Client

ClientClient

ClientClient

Client

Client

Server ServerA B

I Two groups of clients accessing servers A and B

I Client groups share a large amount of software

I Such clients automatically form a global cache

Siddhartha Annapureddy Shark

Page 18: Shark: Scaling File Servers via Cooperative Caching

Cross-filesystem Sharing

Global cooperative cache regardless of origin servers

Client

ClientClient

Client

ClientClient

ClientClient

Client

Client

Server ServerA B

I Two groups of clients accessing servers A and B

I Client groups share a large amount of software

I Such clients automatically form a global cache

Siddhartha Annapureddy Shark

Page 19: Shark: Scaling File Servers via Cooperative Caching

Application

Linux distribution with LiveCDs

Sharkcd

Sharkcd

Sharkcd

Sharkcd

Knoppix

Knoppix Mirror

Knoppix

Knoppix

Slax

Slax

Slax

I LiveCD – Run an entire OS without using hard disk

I But all your programs must fit on a CD-ROM

I Download dynamically from server but scalability problems

I Knoppix and Slax users form global cache – Relieve servers

Siddhartha Annapureddy Shark

Page 20: Shark: Scaling File Servers via Cooperative Caching

Cooperative Caching – Roadmap

Client Client

Client

Server

Client

Client

Fetch Metadata

I Fetch metadata from the server

Siddhartha Annapureddy Shark

Page 21: Shark: Scaling File Servers via Cooperative Caching

Cooperative Caching – Roadmap

Client Client

Client

Server

Client

Client

Lookup Clients

I Fetch metadata from the server

I Look up clients caching needed chunks in overlay

Siddhartha Annapureddy Shark

Page 22: Shark: Scaling File Servers via Cooperative Caching

Cooperative Caching – Roadmap

Client Client

Client

Server

Client

Client

Download Chunks

I Fetch metadata from the server

I Look up clients caching needed chunks in overlay

I Connect to multiple clients and download chunks in parallel

Siddhartha Annapureddy Shark

Page 23: Shark: Scaling File Servers via Cooperative Caching

Cooperative Caching – Roadmap

Client Client

Client

Server

Client

Client

Download Chunks

I Fetch metadata from the server

I Look up clients caching needed chunks in overlay

I Connect to multiple clients and download chunks in parallel

I Check integrity of fetched chunks

Clients are mutually distrustful

Siddhartha Annapureddy Shark

Page 24: Shark: Scaling File Servers via Cooperative Caching

File Metadata

Client ServerMetadata:

(fh,offset,count)GET_TOKEN

Chunk tokens

Siddhartha Annapureddy Shark

Page 25: Shark: Scaling File Servers via Cooperative Caching

File Metadata

Client ServerMetadata:

(fh,offset,count)GET_TOKEN

Chunk tokens

I Possession of token implies server permissions to read

Siddhartha Annapureddy Shark

Page 26: Shark: Scaling File Servers via Cooperative Caching

File Metadata

Client ServerMetadata:

(fh,offset,count)GET_TOKEN

Chunk tokens

I Possession of token implies server permissions to read

I Tokens are a shared secret between authorized clients

Siddhartha Annapureddy Shark

Page 27: Shark: Scaling File Servers via Cooperative Caching

File Metadata

Client ServerMetadata:

(fh,offset,count)GET_TOKEN

Chunk tokens

I Possession of token implies server permissions to read

I Tokens are a shared secret between authorized clients

Given a chunk B...

I Chunk token TB = H(B)

I H is a collision resistant hash function

Siddhartha Annapureddy Shark

Page 28: Shark: Scaling File Servers via Cooperative Caching

File Metadata

Client ServerMetadata:

(fh,offset,count)GET_TOKEN

Chunk tokens

I Possession of token implies server permissions to read

I Tokens are a shared secret between authorized clients

I Tokens can be used to check integrity of fetched data

Given a chunk B...

I Chunk token TB = H(B)

I H is a collision resistant hash function

Siddhartha Annapureddy Shark

Page 29: Shark: Scaling File Servers via Cooperative Caching

Discovering Clients Caching Chunks

Client Client

Client

Server

Client

Client

Discover Clients

I For every chunk B, there’s indexing key IBI IB used to index clients caching B

I Cannot set IB = TB, as TB is a secretI IB = MAC (TB, “Indexing Key”)

Siddhartha Annapureddy Shark

Page 30: Shark: Scaling File Servers via Cooperative Caching

Locality Awareness

Client

Client

Client

Client

Network A Network B

I Overlay organized as clusters based on latency

I Indexing infrastructure preferentially returns sources in samecluster as the client

I Hence, chunks usually transferred from nearby clients

Siddhartha Annapureddy Shark

Page 31: Shark: Scaling File Servers via Cooperative Caching

Final Steps

Download Chunk

I Security issues discussed later

Register as a source

I Client now becomes a source for the downloaded chunk

I Client registers in distributed index – PUT(IB, Addr)

Chunk Reconciliation

I Reuse connection to download more chunks

I Exchange mutually needed chunks w/o indexing overhead

Siddhartha Annapureddy Shark

Page 32: Shark: Scaling File Servers via Cooperative Caching

Security Issues – Client Authentication

Client Client

Client

Server

Client

Client

TraditionalAuthentication

I Traditionally, server authenticated read requests using uids

Siddhartha Annapureddy Shark

Page 33: Shark: Scaling File Servers via Cooperative Caching

Security Issues – Client Authentication

Client Client

Client

Server

Client

Client

TraditionalAuthentication From Token

Authenticator Derived

I Traditionally, server authenticated read requests using uids

I Challenge – How does a client know when to send chunks?

I Chunk token allows client to identify authorized clients

Siddhartha Annapureddy Shark

Page 34: Shark: Scaling File Servers via Cooperative Caching

Security Issues – Client Communication

I Client should be able to check integrity of downloaded chunk

I Client should not send chunks to other unauthorized clients

I An eavesdropper shouldn’t be able to obtain chunk contents

Siddhartha Annapureddy Shark

Page 35: Shark: Scaling File Servers via Cooperative Caching

Security Protocol

Client

Receiver

Client

Source

Siddhartha Annapureddy Shark

Page 36: Shark: Scaling File Servers via Cooperative Caching

Security Protocol

Client

Receiver

Client

Source

RC

RP

I RC, RP – Random nonces to ensure freshness

Siddhartha Annapureddy Shark

Page 37: Shark: Scaling File Servers via Cooperative Caching

Security Protocol

Client

Receiver

Client

Source

RC

RP

IB, AuthC

EK(B)

I RC, RP – Random nonces to ensure freshness

I AuthC – Authenticator to prove receiver has tokenI AuthC = MAC (TB, “Auth C”, C,P,RC,RP)

Siddhartha Annapureddy Shark

Page 38: Shark: Scaling File Servers via Cooperative Caching

Security Protocol

Client

Receiver

Client

Source

RC

RP

IB, AuthC

EK(B)

I RC, RP – Random nonces to ensure freshness

I AuthC – Authenticator to prove receiver has tokenI AuthC = MAC (TB, “Auth C”, C,P,RC,RP)

I K – Key to encrypt chunk contentsI K = MAC (TB, “Encryption”, C,P,RC,RP)

Siddhartha Annapureddy Shark

Page 39: Shark: Scaling File Servers via Cooperative Caching

Security Properties

Client

Receiver

Client

Source

RC

RP

IB, AuthC

EK(B)

Client can check integrity of downloaded chunk

I Client checks H(Downloaded chunk)?= TB

Source should not send chunks to unauthorized clients

I Malicious clients cannot send correct AuthC

Eavesdropper shouldn’t get chunk contents

I All communication encrypted with K

Siddhartha Annapureddy Shark

Page 40: Shark: Scaling File Servers via Cooperative Caching

Security Properties

Client

Receiver

Client

Source

RC

RP

IB, AuthC

EK(B)

Privacy limitations for world-readable files

I Eavesdropper can track lookups of clients

I Eavesdropper hashes data, finds what exactly client downloads

For private files, solution described in paper

I Sacrifices cross-FS sharing for better privacy

Forward Secrecy not guaranteed

Siddhartha Annapureddy Shark

Page 41: Shark: Scaling File Servers via Cooperative Caching

Implementation

Sharksd

ServerNFS V3

Application

Syscall

Sharkcd

Corald

NFS V3 Client

Application

Syscall

Sharkcd

Corald

NFS V3Client

Server

Client Client

I sharkcd – Incorporates source-receiver client functionalityI sharksd – Incorporates chunking mechanismsI corald – A node in the indexing infrastructure

Siddhartha Annapureddy Shark

Page 42: Shark: Scaling File Servers via Cooperative Caching

Evaluation

I How does Shark compare with SFS? With NFS?

I How scalable is the server?

I How fair is Shark across clients?

I Which order is better? Random or Sequential

I What are the benefits of set reconciliation?

Siddhartha Annapureddy Shark

Page 43: Shark: Scaling File Servers via Cooperative Caching

Emulab – 100 Nodes on LAN

0

20

40

60

80

100

0 100 200 300 400 500 600 700 800

Per

cent

age

com

plet

ed w

ithin

tim

e

Time since initialization (sec)

40 MB readShark, rand, negotiationNFSSFS

I Shark – 88sI SFS – 775s (≈ 9x better), NFS – 350s (≈ 4x better)I SFS less fair because of TCP backoffs

Siddhartha Annapureddy Shark

Page 44: Shark: Scaling File Servers via Cooperative Caching

PlanetLab – 185 Nodes

0

20

40

60

80

100

0 500 1000 1500 2000 2500

Per

cent

age

com

plet

ed w

ithin

tim

e

Time since initialization (sec)

40 MB readSharkSFS

I Shark ≈ 7 min – 95th Percentile

I SFS ≈ 39 min – 95th Percentile (5x better)

I NFS – Triggered kernel panics in server

Siddhartha Annapureddy Shark

Page 45: Shark: Scaling File Servers via Cooperative Caching

Data pushed by Server

SFS Shark0.0

0.5

1.0

Nor

mal

ized

ban

dwid

th

7400

954

I Shark vs SFS – 23 copies vs 185 copies (8x better)

Siddhartha Annapureddy Shark

Page 46: Shark: Scaling File Servers via Cooperative Caching

Data served by Clients

0

20

40

60

80

100

120

140

160

0 10 20 30 40 50 60 70 80 90Ban

dwid

th tr

ansm

itted

(M

B)

Unique Shark proxies

PlanetLab hostsShark, 40MB

I Maximum contribution ≈ 3.5 copies

I Median contribution ≈ 1.5 copies

I Minimum contribution ≈ 0.75 copies

Siddhartha Annapureddy Shark

Page 47: Shark: Scaling File Servers via Cooperative Caching

Fetching Chunks – Order Matters

I In what order should we fetch chunks of a file?

I Natural choices – Random or Sequential

Intuitively, when many clients start simultaneously

I RandomI All clients fetch independent chunksI More chunks become available in the cooperative cache

I SequentialI Better disk I/O scheduling on the serverI Client that downloads most chunks alone adds to cache

Siddhartha Annapureddy Shark

Page 48: Shark: Scaling File Servers via Cooperative Caching

Emulab – 100 Nodes on LAN

0

20

40

60

80

100

0 100 200 300 400 500 600 700 800

Per

cent

age

com

plet

ed w

ithin

tim

e

Time since initialization (sec)

40 MB readShark, randShark, seq

I Random – 133sI Sequential – 203sI Random Wins !!! – 35% better

Siddhartha Annapureddy Shark

Page 49: Shark: Scaling File Servers via Cooperative Caching

Emulab – 100 Nodes on LAN

0

20

40

60

80

100

0 100 200 300 400 500 600 700 800

Per

cent

age

com

plet

ed w

ithin

tim

e

Time since initialization (sec)

40 MB readShark, randShark, rand, negotiation

I Random + Reconciliation – 88sI Random – 133sI Reconciliation crucial – 34% improvement

Siddhartha Annapureddy Shark

Page 50: Shark: Scaling File Servers via Cooperative Caching

Conclusions

I Networked filesystems offer a convenient interface

I Current networked filesystems like NFS are not scalableI Forces users to resort to inconvenient tools like rsync

I Shark offers a filesystem that scales to hundreds of clientsI Locality-aware cooperative cacheI Supports cross-FS sharing enabling novel applications

Siddhartha Annapureddy Shark

Page 51: Shark: Scaling File Servers via Cooperative Caching

Questions

http://www.scs.cs.nyu.edu/shark

Siddhartha Annapureddy Shark