30
NFS Shivaram Venkataraman CS 537, Spring 2020 welcome to the Penultimate future !

Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

  • Upload
    others

  • View
    7

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

NFS

Shivaram VenkataramanCS 537, Spring 2020

welcometo the

Penultimatefuture

!

Page 2: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

ADMINISTRIVIA

AEFIS feedbackOptional project Final exam details

No discussion this week!

>NoSUP

DATS

→ May I -354.

currently[→ Due wed 10pm to ? like this to be

→ Check Piazza

↳ canvas Quiz →randomization

↳us .ae#...r!+,..iu.

" ""-

Page 3: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

AGENDA / LEARNING OUTCOMES

How to design a distributed file system that can survive partial failures?

What are consistency properties for such designs?

-

↳ Distributedsystems

Page 4: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

RECAP

Page 5: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

Distributed File Systems

Local FS: processes on same machine access shared files

Network FS: processes on different machines access shared files in same way

GoalsTransparent accessFast + simple crash recoveryReasonable performance?

TCPupp tf RPC

f-www.an?a;eun?gtFeu-orkr-sI

→For server &

client failure

Page 6: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

NFS Architecture

FileServer

Client

Client

Client

Client

RPC

RPC

RPC

RPCLocal FS

desktop!qdird powerful

f mobile tht

la.tt laptop

vin #

a.txt

TI or

more

disks

Page 7: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

Strategy 1

Attempt: Wrap regular UNIX system calls using RPCopen() on client calls open() on serveropen() on server returns fd back to client

read(fd) on client calls read(fd) on serverread(fd) on server returns data back to client

int fd = open(“foo”, O_RDONLY);read(fd, buf, MAX);…read(fd, buf, MAX);

Server crash!

Naming!

vim

fotenftd

ii. FEE:÷÷.←

L) localFS

- fol store-irade run

- and offset

server comes back up ⇒ fp table'

is lost

offset is lost

Page 8: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

Strategy 2: put all info in requests

“Stateless” protocol: server maintains no state about clients

Need API change. One possibility:

pread(char *path, buf, size, offset);pwrite(char *path, buf, size, offset);

Specify path and offset each time. Server need not remember anything from clients.

Pros? Server can crash and reboot transparently to clientsCons? Too many path lookups.

i::÷:

:O o ¥¥÷÷÷÷÷⇒①¥:-

- -

-

Page 9: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

Strategy 3: file handles

fh = open(char *path);pread(fh, buf, size, offset);pwrite(fh, buf, size, offset);

File Handle = <volume ID, inode #, generation #>

Opaque to client (client should not interpret internals)

Oo"

ni.÷÷:read[ , .is?wEkxgd \ Drs

AwakeStateless .

We don't have

path traversal on every pread, p write

off; state

Page 10: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

file handle for 1 is

@ O message well known Grant )Tripe

- →

-- ①h@ Ishivaramlfooittt

fly cadre FHfit

* lhote /] Lookup ( fit ,

" have")

↳-

threw"

www.pqfac.yhone"

),

"Miami)←← :

Similarto path↳

Yer;D IT www.endearbd?h .

-

ismessage retro

%bdftwbreply the←

Page 11: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

Can NFS Protocol include Append?

fh = open(char *path);pread(fh, buf, size, offset);pwrite(fh, buf, size, offset);

append(fh, buf, size);

=

[ ]

Page 12: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

pwrite VS APPEND

AAAAAAAA

file

pwrite ABBAAAAA

file

pwrite ABBAAAAA

file

pwrite ABBAAAAA

file

pwrite(file, “BB”, 2, 2);

append(file, “BB”);

T =, offsetEntente

OD•

°

(retry )

a¥aFd Dia-91rem

Page 13: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

Idempotent Operations

Solution: Design API so no harm to executing function more than once

If f() is idempotent, then:f() has the same effect as f(); f(); … f(); f()

int fd = open(“foo”, O_RDONLY);read(fd, buf, MAX);write(fd, buf, MAX);…

Server crash!

client doesnt

know ifserver

crashed beforeI after

doing theoperation

Retrying functionsis only

- - sage if fis idempotent

---

- server-

T÷F"'Dno#

Page 14: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

What operations are Idempotent?Idempotent- any sort of read that doesn’t change anything- pwrite

Not idempotent- append

What about these?- mkdir- creat

- specified offset

NFSAPI ,

while

-not part of

pwrite is

= . ⇐ :::¥÷msee Mr

dir C"

Hoo"

)

Page 15: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

Write Buffers

Local FS

Client Server

NFS

write

write bufferwrite buffer

Server acknowledges write before write is pushed to disk;What happens if server crashes?

pwrite\ X,

"

t'-

, ,a r

- --

^

Can we sayOk before

datahits disk ?

Page 16: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

client:

write A to 0write B to 1write C to 2

Server Write Buffer Lost

server mem: A B C

server disk:

server acknowledges write before write is pushed to disk

blockO l Z

# • Dinge→ FACETACK

( ←crashes ? A C

twaitfretry Ihid serrer ggrnbage.aeis back middle

Page 17: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

Server Write Buffer Lost

server mem: Z

server disk: X B Z

Client:

write A to 0

write B to 1

write C to 2

write X to 0

write Y to 1

write Z to 2

Problem: No write failed, but disk state doesn’t match any point in time

Solutions?

←an

OOD←

X BC

←xyc

Page 18: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

Write Buffers

Local FS

Client Server

NFS

write

write buffer

Don’t use server write buffer. Problem: Slow?

Use persistent write buffer (more expensive)

Fife friendly

+

wittol'

is

a:*:c rite

I. \ f¥yncraft:Fa -¥-U a

DRAM

Page 19: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

QUIZ 31The only costs to worry about are network costs. Assume "small" messages takes S units of time, whereas a "bigger" message (e.g., size of a block=4KB) takes B units. If a message is larger than 4KB, it takes longer (2B for 8KB).

1. How long does it take to open a 100-block (400 KB) file called /a/b/c.txtfor the first time? (assume root directory file handle is already available)

2. How long does it take to read the whole file?

https://tinyurl.com/cs537-sp20-quiz31% AFS

( ):O ②

IT -

Tfchookup Cla)

seookpc.la/b ) hookup ( lelbleitxt

)=65

e- - -¥ c- c-25 Zs ZS

read ( th , by , 4413) looting#

-- -

r . '' '

@+By goo.

.

1005+1 B

# St 10013

Page 20: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

Cache Consistency

NFS can cache data in three places:- server memory- client disk- client memory

How to make sure all versions are in sync?

Page 21: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

Distributed Cache

Local FS

Client 1 Server

NFScache: Acache:

Client 2

NFScache:

write read openA" # T EAT read ⇐ f. Tread IT

at c- →AaB 0 A A

using ① If i.q we have a ① How do we

clientwt

write when do we detect that

⇒ Forging artful? gun. sent?

Page 22: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

Cache

Local FS

Client 1 Server

NFScache: Acache: B

Client 2

NFScache: A

write!

“Update Visibility” problem: server doesn’t have latest version

What happens if Client 2 (or any other client) reads data?

f T

Page 23: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

Cache

Local FS

Client 1 Server

NFScache: Bcache: B

Client 2

NFScache: A

flush

“Stale Cache” problem: client 2 doesn’t have latest version

What happens if Client 2 reads data?

↳Merwin dirt.Fae

before flushclient 2 reads file

O O

--

Page 24: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

Problem 1: Update Visibility

When client buffers a write, how can server (and other clients) see update?Client flushes cache entry to server

When should client perform flush? NFS solution: flush on fd close

Local FS

Client 1 Server

NFScache: Acache: B

write!

open- admit'T open

CB? ↳hello.cldose .org gunfire→

worst case

- - guarantee

Page 25: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

Problem 2: Stale Cache

Problem: Client 2 has stale copy of data; how can it get the latest?

NFS solution: – Clients recheck if cached copy is current before using data

Local FS

Server

cache: B

Client 2

NFScache: A

client 1 Ibfwrite CB) deflectscloseHd↳ O ¥

Page 26: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

Stale Cache Solution

Client cache records time when data block was fetched (t1)Before using data block, client does a STAT request to server

- get’s last modified timestamp for this file (t2) (not block…)- compare to cache timestamp- refetch data block if changed since timestamp (t2 > t1)

Local FS

Server

cache: B

Client 2

NFScache: A t1t2

Stqtlgettttrcell

gaffes fred careyc-

¥ O

D-

£7- timestamp

is

-

at file granularity-

Page 27: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

Measure then Build

NFS developers found stat accounted for 90% of server requests

Why?

Because clients frequently recheck cache

- - Tf

In common case scenario

1 client opensreads

stat ← ends )b- checkvalidity

Page 28: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

Reducing Stat Calls

Solution: cache results of stat callsPartial Solution:

Make stat cache entries expire after a given time (e.g., 3 seconds) (discard t2 at client 2)

What is the consequence?

Local FS

Server

cache: B

Client 2

NFScache: A

Never see updates on server!

AttributeCache

⇐ fED

µ¥o

Treat

fit alter code entryis

older than 3s

then call stat-

cache reads could be againstale for 3A !

Page 29: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

NFS Summary

NFS handles client and server crashes very well; robust APIs that are:

- stateless: servers don’t remember clients- idempotent: doing things twice never hurts

Caching and write buffering is harder, especially with crashes

Problems:

– Consistency model is odd (client may not see updates until 3s after file closed)– Scalability limitations as more clients call stat() on server

clients pullingstatl timestamps

from server

Page 30: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages

NEXT STEPS

Next class: Review, Looking forwardOptional project due WedAEFIS feedback