
CSS434 Distributed File Systems
Textbook Ch8, 13
Professor: Munehiro Fukuda



DFS Desirable Features

  • Transparency:
      • Access transparency: a single set of operations
      • Location transparency: uniform file name space
      • Mobility transparency: file mobility
      • Performance transparency: comparable to a centralized file system
  • Concurrency and synchronization: should complete concurrent access requests consistently.
      • Forward/backward validation
  • File caching and replication:
      • Caching: at client/server for scalability
      • Replication: at multiple servers for availability
  • Heterogeneity: should allow a variety of nodes to share files in different storage media and OS
      • Similarity between Unix and NTFS: stream-oriented files, a tree-structured system
      • Difference between Unix and NTFS: CR char included in NTFS, file naming
  • Fault tolerance: at-most-once or at-least-once semantics
  • Consistency: Unix one-copy update semantics, session semantics, etc.
  • Security: should protect files from network intruders.

Consistency Maintenance in Various Storage Systems

[Table: each storage system is rated on sharing, persistence, distributed cache/replicas, and consistency maintenance, with an example.]

Storage system | Example
Main memory | RAM
File system | UNIX file system
Distributed file system | Sun NFS
Web | Web server
Distributed shared memory | Ivy (Ch. 16)
Remote objects (RMI/ORB) | CORBA
Persistent object store | CORBA Persistent Object Service
Persistent distributed object store | PerDiS, Khazana

File Service Architecture

[Figure: on the client computer, application programs run on top of a client module (file caching); on the server computer, the flat file service and the directory service run (file caching/replication). Consistency maintenance spans the client and the server.]

DFS Services

  • Flat file service
      • File-accessing mechanism: deciding a place to manage remote files and a unit to transfer data (at server or client? file, block, or byte?)
      • File-sharing semantics: providing file update semantics similar to, but weaker than, Unix
      • File-caching mechanism: improving performance/scalability
      • File-replication mechanism: improving performance/availability
  • Directory service
      • Mapping between text file names and references to files (i.e., file IDs)

Flat File Service Operations

Read(FileId, i, n) -> Data (throws BadPosition)
    If 1 ≤ i ≤ Length(File): reads a sequence of up to n items from a file starting at item i and returns it in Data.

Write(FileId, i, Data) (throws BadPosition)
    If 1 ≤ i ≤ Length(File)+1: writes a sequence of Data to a file, starting at item i, extending the file if necessary.

Create() -> FileId
    Creates a new file of length 0 and delivers a UFID for it.

Delete(FileId)
    Removes the file from the file store.

GetAttributes(FileId) -> Attr
    Returns the file attributes for the file.

SetAttributes(FileId, Attr)
    Sets the file attributes (only those attributes that are not shaded in the referenced figure).
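The operations above can be illustrated with a small in-memory sketch. This is not the textbook's implementation: the class name FlatFileService, the use of integers as UFIDs, and the use of bytes as file items are assumptions made only for illustration.

```python
import itertools


class BadPosition(Exception):
    """Raised when the item index i is outside the allowed range."""


class FlatFileService:
    """In-memory stand-in for a flat file service (illustrative only)."""

    def __init__(self):
        self._files = {}                    # UFID -> bytearray of file items
        self._attrs = {}                    # UFID -> attribute dict
        self._next_id = itertools.count(1)  # assumed UFID generator

    def create(self):
        ufid = next(self._next_id)
        self._files[ufid] = bytearray()     # new file of length 0
        self._attrs[ufid] = {}
        return ufid

    def read(self, ufid, i, n):
        data = self._files[ufid]
        if not 1 <= i <= len(data):
            raise BadPosition(i)
        return bytes(data[i - 1:i - 1 + n])  # up to n items starting at item i

    def write(self, ufid, i, new):
        data = self._files[ufid]
        if not 1 <= i <= len(data) + 1:
            raise BadPosition(i)
        data[i - 1:i - 1 + len(new)] = new   # overwrite, extending if necessary

    def delete(self, ufid):
        del self._files[ufid]
        del self._attrs[ufid]

    def get_attributes(self, ufid):
        return dict(self._attrs[ufid])

    def set_attributes(self, ufid, attrs):
        self._attrs[ufid].update(attrs)
```

Note that positions are 1-based, as on the slide, and that there is no open or close: every request carries the UFID and the position, which is what lets a server stay stateless.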

Directory Service Operations

Lookup(Dir, Name) -> FileId (throws NotFound)
    Locates the text name in the directory and returns the relevant UFID. If Name is not in the directory, throws an exception.

AddName(Dir, Name, File) (throws NameDuplicate)
    If Name is not in the directory, adds (Name, File) to the directory and updates the file's attribute record. If Name is already in the directory, throws an exception.

UnName(Dir, Name) (throws NotFound)
    If Name is in the directory, the entry containing Name is removed from the directory. If Name is not in the directory, throws an exception.

GetNames(Dir, Pattern) -> NameSeq
    Returns all the text names in the directory that match the regular expression Pattern.

[Figure: directories on host1, host2, and host3 each call AddName(Dir, Name, File) to register Name1, Name2, and Name3 for the same file. The file's reference count is 3; if ref_count = 0, the file is deleted.]
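A minimal sketch of the directory operations and the reference count from the figure follows. The class layout, the `deleted` list standing in for a call to Delete, and the snake_case method names are assumptions for illustration only.

```python
import re


class NotFound(Exception):
    pass


class NameDuplicate(Exception):
    pass


class DirectoryService:
    def __init__(self):
        self.dirs = {}        # directory id -> {name: UFID}
        self.ref_count = {}   # UFID -> number of directory entries naming it
        self.deleted = []     # UFIDs whose count reached 0 (stand-in for Delete)

    def lookup(self, d, name):
        try:
            return self.dirs[d][name]
        except KeyError:
            raise NotFound(name) from None

    def add_name(self, d, name, ufid):
        entries = self.dirs.setdefault(d, {})
        if name in entries:
            raise NameDuplicate(name)
        entries[name] = ufid
        self.ref_count[ufid] = self.ref_count.get(ufid, 0) + 1

    def un_name(self, d, name):
        ufid = self.lookup(d, name)          # raises NotFound if absent
        del self.dirs[d][name]
        self.ref_count[ufid] -= 1
        if self.ref_count[ufid] == 0:
            self.deleted.append(ufid)        # no names left: file can be removed

    def get_names(self, d, pattern):
        names = self.dirs.get(d, {})
        return sorted(n for n in names if re.fullmatch(pattern, n))
```

Registering the same UFID under three names raises its count to 3, and the file only becomes eligible for deletion after the third UnName.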

File-Accessing Models

Accessing Remote Files

File access model | Merits | Demerits
Remote service model (at a server) | A simple implementation | Communication overhead
Data caching model (at a client that cached a file copy) | Reducing network traffic | Cache consistency problem

Unit of Data Transfer

Transfer level | Merits | Demerits
File | Simple, less communication overhead, and immune to server | A client required to have large storage space
Block (e.g., NFS) | A client not required to have large storage space | More network traffic/overhead
Byte | Flexibility maximized | Difficult cache management to handle the variable-length data
Record | Handling structured and indexed files | More network traffic; more overhead to reconstruct a file

File-Sharing Semantics

Define when modifications of the file data made by a user are observable by other users.

  1. Unix semantics
  2. Session semantics
  3. Immutable shared-files semantics
  4. Transaction-like semantics

File-Sharing Semantics: Unix Semantics (One-copy Update Semantics)

  • Absolute ordering: seen by all clients as if only a single copy existed and were updated immediately.
  • Network delays: weaker semantics are inevitable.

[Figure: a timeline t1 to t6 in which Clients A and B issue Append(c), Append(d), two reads, and Append(e); the file grows from "a b" to "a b c", "a b c d", and "a b c d e". With network delays, delayed operations return or act on older contents such as "a" or "a b".]

File-Sharing Semantics: Session Semantics

[Figure: Clients A, B, and C each Open(file) at the server and work on private copies until Close(file). Starting from "a b" on the server, one session appends c, d, and e and closes, leaving "a b c d e"; another session that opened "a b" appends x, y, and z and closes with "a b x y z"; a session that opened "a b c d e" appends m and holds "a b c d e m".]

File writes may overwrite previous updates. A file lock is needed to prevent such overwrites.
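The overwrite problem can be reproduced with a few lines. This is a toy model, assuming whole-file copies are fetched on open and written back on close; the class names Server and Session are hypothetical.

```python
class Server:
    def __init__(self, contents):
        self.contents = list(contents)


class Session:
    """One open..close session holding a private copy of the file."""

    def __init__(self, server):
        self.server = server
        self.copy = list(server.contents)        # Open(file): fetch a snapshot

    def append(self, item):
        self.copy.append(item)                   # visible only inside this session

    def close(self):
        self.server.contents = list(self.copy)   # Close(file): write back whole copy


server = Server("ab")
a = Session(server)          # both sessions start from "a b"
b = Session(server)
for item in "cde":
    b.append(item)
b.close()
after_b = "".join(server.contents)   # server now holds b's version
for item in "xyz":
    a.append(item)
a.close()
after_a = "".join(server.contents)   # a's close replaces b's updates
```

The second close wins: the appends of c, d, and e are lost even though neither session did anything wrong on its own, which is why a lock is needed around the open..close interval.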

File-Sharing Semantics: Session Semantics with File Lock

[Figure: Clients A and B both Open(file) and receive "a b" from the server. One client appends c and holds the lock; the other appends x, and the lockt test reports the conflict. At that point the user needs to choose: quit, steal, or proceed. When saving with ^x^s after the first client has closed "a b c", the user needs to choose: quit, save anyway, or type ^x^w to write to another file (file2, file3).]

File-Sharing Semantics: Transaction-Like Semantics (Concurrency Control)

[Figure: two panels, backward validation and forward validation. In each, Clients A, B, C, and D run the transactions R1 R2 W3 R4 W5, R1 R2 W6 R4 W7, R1 R2 W9 R4 W8, and R1 R2 R6 R8 W8 between Trans_start and Trans_end, followed by validation and commitment; one transaction goes through Trans_abort and Trans_restart.]

  • Backward validation: compare reads with former writes.
  • Forward validation: compare writes with later reads; at Trans_end, abort itself or the conflicting active transactions.

Which validation is better?
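The two checks can be sketched as set comparisons. The sketch assumes that a label such as R6 or W8 means a read or write of item 6 or 8 (an interpretation of the figure), and the Txn class and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Txn:
    name: str
    reads: set = field(default_factory=set)
    writes: set = field(default_factory=set)


def backward_validate(txn, committed_overlapping):
    """True if txn read nothing written by overlapping transactions that committed earlier."""
    return all(txn.reads.isdisjoint(other.writes) for other in committed_overlapping)


def forward_conflicts(txn, active):
    """Active transactions whose reads so far overlap the writes of txn."""
    return [other for other in active if not txn.writes.isdisjoint(other.reads)]


a = Txn("A", reads={1, 2, 4}, writes={3, 5})
b = Txn("B", reads={1, 2, 4}, writes={6, 7})
c = Txn("C", reads={1, 2, 4}, writes={8, 9})
d = Txn("D", reads={1, 2, 6, 8}, writes={8})

d_ok_backward = backward_validate(d, [a, b, c])    # D read items 6 and 8
c_blockers = [t.name for t in forward_conflicts(c, [d])]
```

Backward validation only needs the read set of the validating transaction, so a read-heavy transaction has more to compare; forward validation compares the (usually smaller) write set but has to cope with read sets of transactions that are still running.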

File-Sharing Semantics: Immutable Shared-Files Semantics

[Figure: the server holds Version 1.0. Clients A and B each create a tentative version based on 1.0. One tentative version becomes Version 1.1; the other then causes a version conflict, which is resolved by ignoring the conflict (Version 1.2), merging (Version 1.2), or aborting.]

The resolution depends on each file system. Aborting is simple (later, Client A can decide to overwrite the file with its tentative version based on 1.0 by changing the corresponding directory).

File-Caching Schemes: Cache Location

Location | Merits | Demerits
No caching | No modifications | Frequent disk access, busy network traffic
In server's main memory | One-time disk access, easy implementation, Unix-like file-sharing semantics | Busy network traffic
In client's disk | One-time network access, no size restriction | Cache consistency problem, file access semantics, frequent disk access, no diskless workstation
In client's main memory | Maximum performance, diskless workstation, scalability | Size restriction, cache consistency problem, file access semantics

[Figure: across the node boundary between client and server, the file resides on the server's disk, and copies may be kept in the server's main memory, the client's disk, and the client's main memory.]

File-Caching Schemes: Modification Propagation

  • Write-through scheme
      • Pros: Unix-like semantics and high reliability
      • Cons: poor write performance
  • Delayed-write scheme
      • Variants: write on cache displacement, periodic write, write on close
      • Pros:
          • Write accesses complete quickly.
          • Some writes may be superseded by following writes and never need to be sent.
          • Gathering all writes mitigates network overhead.
      • Cons:
          • Delaying write propagation results in fuzzier file-sharing semantics.

[Figure: two panels, each with Clients 1 and 2 holding main-memory copies of a file on the server's disk. Immediate write: a write W at Client 1 is sent to the server at once, producing the new file contents. Delayed write: the write W stays in Client 1's copy and reaches the server later.]
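The trade-off between the two schemes can be seen in a small sketch that counts the writes reaching the server. The classes FileServer and ClientCache and the explicit flush() call are assumptions for illustration; a real client would trigger the flush on close, on a timer, or on cache displacement.

```python
class FileServer:
    def __init__(self):
        self.blocks = {}
        self.writes_received = 0

    def write(self, key, value):
        self.blocks[key] = value
        self.writes_received += 1


class ClientCache:
    def __init__(self, server, write_through):
        self.server = server
        self.write_through = write_through
        self.cache = {}
        self.dirty = set()

    def write(self, key, value):
        self.cache[key] = value
        if self.write_through:
            self.server.write(key, value)   # propagate immediately
        else:
            self.dirty.add(key)             # remember for a later flush

    def flush(self):
        """Delayed write: send only the latest value of each dirty block."""
        for key in sorted(self.dirty):
            self.server.write(key, self.cache[key])
        self.dirty.clear()


wt_server, dw_server = FileServer(), FileServer()
wt = ClientCache(wt_server, write_through=True)
dw = ClientCache(dw_server, write_through=False)
for value in ("v1", "v2", "v3"):
    wt.write("block0", value)
    dw.write("block0", value)
visible_before_flush = "block0" in dw_server.blocks
dw.flush()
```

Three writes to the same block cost three messages with write-through and one with delayed write, but until the flush other clients cannot see the update at all.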

File-Caching Schemes: Cache Validation Schemes, Client-Initiated Approach

  • Checking before every access (Unix-like semantics but too slow)
  • Checking periodically (better performance but fuzzy file-sharing semantics)
  • Checking on file open (simple, suitable for session semantics)
  • Problem: high network traffic

[Figure: two panels, each with Clients 1 and 2 holding main-memory copies of a file on the server's disk. One panel pairs check before every access with write-through (delayed write?). The other pairs write-on-close with check-on-open (check-on-close?).]
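Check-on-open can be sketched with a version number per file. The version counter, the class names, and the `validations` counter are assumptions for illustration; real systems typically compare modification timestamps.

```python
class Server:
    def __init__(self):
        self.files = {}          # name -> (version, data)
        self.validations = 0     # number of validation requests received

    def store(self, name, data):
        version = self.files.get(name, (0, None))[0] + 1
        self.files[name] = (version, data)

    def version_of(self, name):
        self.validations += 1
        return self.files[name][0]

    def fetch(self, name):
        return self.files[name]


class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}          # name -> (version, data)

    def open(self, name):
        cached = self.cache.get(name)
        if cached is None or cached[0] != self.server.version_of(name):
            self.cache[name] = self.server.fetch(name)   # missing or stale copy
        return self.cache[name][1]

    def read(self, name):
        return self.cache[name][1]   # no check between open and close
```

A client that keeps a file open continues to read its cached copy even after the server copy changes; the new contents only appear at the next open, which matches session semantics.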

File-Caching Schemes: Cache Validation Schemes, Server-Initiated Approach

  • Keeping track of clients having a copy
  • Denying a new request, queuing it, and disabling caching
  • Notifying all clients of any update on the original file
  • Problems:
      • Violating the client-server model
      • Stateful servers
      • Check-on-open still needed for the second file opening

[Figure: Clients 1, 2, and 3 hold main-memory copies of a file on the server's disk. A write W reaches the server by write-through or delayed write(?); the server notifies (invalidates) the other copies and denies a new open from Client 4.]

Homework Assignment 4

  • Session semantics
  • Client-side/server-side caching
  • Server-initiated invalidation

Client 1 (/tmp, running emacs, chmod 600):

Name | Access | Owner | State
file1 | write | true | wOwn

Client 2 (/tmp, running emacs, chmod 400):

Name | Access | Owner | State
file1 | read | false | rShare

Server (cwd, holding file1 and file2):

Name | Readers | Owner | State
file1 | client2 | client1 | wShare
file2 | client3 | | rShare

Calls between clients and server: download( ) and upload( ) from a client to the server; invalidate( ) and writeback( ) from the server to each client.

File Access Improvements

  • Data sieving for a single client
      • Read a larger contiguous file portion
      • Extract actual file portions from it
  • Collective I/O for multiple clients
      • Read contiguous space, thereafter distribute subspaces to each client
      • Disk-directed I/O
      • Server-directed I/O
      • Two-phase I/O (client-directed)


Data Sieving

User’s request for non-contiguous file portions

Read a larger contiguous block into memory

Copy requested portions into user’s buffer

(from R. Thakur’s Data Sieving and Collective I/O in ROMIO, 1998)
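The three steps above can be sketched on an in-memory file. The function name sieved_read and the (offset, length) request format are assumptions for illustration and are not taken from ROMIO.

```python
import io


def sieved_read(f, requests):
    """requests: list of (offset, length) pairs; returns one bytes object per request."""
    if not requests:
        return []
    start = min(off for off, _ in requests)
    end = max(off + length for off, length in requests)
    f.seek(start)
    block = f.read(end - start)     # one large contiguous read covering all requests
    # Copy only the requested portions out of the block into the result.
    return [block[off - start:off - start + length] for off, length in requests]


data = io.BytesIO(b"0123456789abcdefghij")
pieces = sieved_read(data, [(2, 3), (10, 2), (15, 4)])
```

One read of 17 bytes replaces three small reads; the price is transferring the unwanted bytes in the holes, so the technique pays off when requests are many and close together.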

Two-Phase I/O

[Figure: in the first phase, processes P0, P1, P2, and P3 each read a contiguous portion of the file; in the second phase, they redistribute the data among P0 to P3.]
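A single-process simulation of the two phases follows. It assumes that each process ultimately wants a cyclic (round-robin) slice of the file, which is a common example but is not stated on the slide; the function name is hypothetical.

```python
def two_phase_read(file_items, nprocs):
    """Return, per process, the items it wants under a cyclic distribution."""
    n = len(file_items)
    chunk = -(-n // nprocs)   # ceiling division: size of each contiguous chunk
    # Phase 1: process p reads one contiguous chunk of the file.
    staged = [file_items[p * chunk:(p + 1) * chunk] for p in range(nprocs)]
    # Phase 2: processes exchange items so each ends up with its own pattern.
    wanted = [[] for _ in range(nprocs)]
    for p, items in enumerate(staged):
        for k, item in enumerate(items):
            global_index = p * chunk + k
            wanted[global_index % nprocs].append(item)
    return wanted


result = two_phase_read(list(range(8)), 4)
```

Each process issues one large contiguous read instead of many strided ones, and the scattered access pattern is handled by inter-process communication, which is usually much cheaper than disk access.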

File Stripes Transfer in a Hierarchy (from Fukuda/Miyauchi, Journal of Supercomputing)

[Figure: a commander (Id: 0) and a root sentinel (Id: 2) head a tree of sentinels (Ids 8, 9, 32, 33, 36, 37, 38, 39, 128, 129, 130, 131, 132, and 528). The GUI reads files, and file stripes travel down the tree as key/value pairs such as "128_inputFile1_1 contents", "528_inputFile1_7 contents", "528_inputFile2_7 contents", "32_inputFile1_0 contents", and "32_inputFile2_0 contents", each forwarded toward the sentinel named in its key (32, 128, or 528).]

DFS Example: Sun NFS

[Figure: Client A, Client B, and a server each run a VFS layer over a local FS. Each client adds an NFS client and the server an NFS server, and both sides communicate through RPC stubs. The server exports directories from its tree (/, usr, bin, org); the clients attach them as "shared" in their own trees (/, usr, bin and /, opt, bin) for their user processes.]

Sun NFS: Installation

  • Server:
      • Check if NFS is running: rpcinfo -p
      • Start NFS: /etc/rc.d/init.d/nfs start
      • Edit /etc/exports file: /dir/to/export client1(permissions), client2(…
      • Export dirs in /etc/exports: exportfs -a
      • Check exported directories: showmount -e
  • Client:
      • Import a server's directory: mount -o options server_name:/dir /my_dir
          • bg: keep retrying the import in the background upon a failure
          • intr: a process will be interrupted if its I/O request to the server dir is pending
          • soft: allow a client to time out the connection after a number of retries
          • rw/ro: normal r/w or read only
  • Underlying connections:

[Figure: the client contacts the portmapper for the NFS mount service port, mountd for permission, the portmapper again for port 2049, and then nfs over rpc.]

Sun NFS: Overview

  • Communication
      • RPC: a compound procedure (e.g., Lookup, Open, and Read)
  • Server status
      • Stateless: simple implementation in ver 3.
      • Stateful: allowing clients to cache files in ver 4.
          • RPC callback from a server to invalidate a client's cache
  • Synchronization
      • Session semantics
      • File locking in ver 4: lock, lockt, locku, and renew
          • Ex. Emacs: tests with lockt when modifying a buffer, locks the file with lock, and unlocks with locku after writing the buffer contents to the file.
      • Share reservation: specify how to share a file (with ro, wo, or r/w)

Sun NFS: Overview (Cont'd)

  • Caching
      • In client's memory
      • Session semantics
      • Revalidation of the client's cache upon re-opening the same file
      • Open delegation:
          • A server delegates an open decision to a writing client, which can then handle open requests from other clients on the same machine.
          • A server calls back the client when receiving an open request from another machine.
  • Fault tolerance
      • RPC failure: use a duplicate-request cache
      • File locking failure: provide a grace period during which clients reclaim locks previously granted and the server rebuilds its previous state.

Sun NFS: Duplicate Request Cache

[Figure: three client/server timelines for a request with XID = 1234 that the client retransmits. (1) The retransmission arrives before the transaction has completed: too soon, ignore. (2) The retransmission arrives right after the reply was sent: just replied, ignore. (3) The retransmission arrives after the transaction completed, and the server sends the reply again.]

Then, when does the server delete this cached result?
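The three cases in the figure can be sketched as a lookup keyed by XID. The `recent_window` threshold, the explicit `now` arguments, and the returned strings are assumptions for illustration; the sketch deliberately keeps entries forever and so leaves the slide's question about deletion open.

```python
class DuplicateRequestCache:
    def __init__(self, recent_window=1.0):
        self.recent_window = recent_window   # seconds; hypothetical threshold
        self.entries = {}                    # XID -> (reply or None, time replied)

    def on_request(self, xid, now):
        if xid not in self.entries:
            self.entries[xid] = (None, None)     # mark as in progress
            return "execute"
        reply, replied_at = self.entries[xid]
        if reply is None:
            return "ignore: still in progress"   # retransmission came too soon
        if now - replied_at < self.recent_window:
            return "ignore: just replied"        # the reply is probably in flight
        return ("resend", reply)                 # reply was likely lost; do not re-execute

    def on_complete(self, xid, reply, now):
        self.entries[xid] = (reply, now)


cache = DuplicateRequestCache()
first = cache.on_request(1234, now=0.0)
early = cache.on_request(1234, now=0.1)
cache.on_complete(1234, "ok", now=0.2)
soon = cache.on_request(1234, now=0.3)
late = cache.on_request(1234, now=5.0)
```

In none of the retransmission cases is the operation executed a second time, which is what protects non-idempotent operations from duplicate requests.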

DFS Example: Andrew File System

[Figure: workstations each run user programs and a Venus process on top of the UNIX kernel; servers each run a Vice process on top of the UNIX kernel; workstations and servers are connected by the network.]

AFS: File Name Space

[Figure: on both the client and the server, the Unix kernel (Unix FS) holds a tree rooted at / with usr, tmp, and bin, divided into local and shared parts and connected through symbolic links. On the client, a user process reaches shared files through the Venus process and its cache; on the server, the Vice process serves them.]

AFS: System Call Interception

[Figure: on a workstation, a user program issues UNIX file system calls to the UNIX kernel. The kernel serves local files through the UNIX file system on the local disk and passes non-local file operations to Venus.]

AFS: Implementation of File System Calls

Participants: user process, UNIX kernel, Venus, network, Vice.

open(FileName, mode)
  • UNIX kernel: If FileName refers to a file in shared file space, pass the request to Venus.
  • Venus: Check the list of files in the local cache. If not present or there is no valid callback promise, send a request for the file to the Vice server that is custodian of the volume containing the file.
  • Vice: Transfer a copy of the file and a callback promise to the workstation. Log the callback promise.
  • Venus: Place the copy of the file in the local file system, enter its local name in the local cache list, and return the local name to UNIX.
  • UNIX kernel: Open the local file and return the file descriptor to the application.

read(FileDescriptor, Buffer, length)
  • UNIX kernel: Perform a normal UNIX read operation on the local copy.

write(FileDescriptor, Buffer, length)
  • UNIX kernel: Perform a normal UNIX write operation on the local copy.

close(FileDescriptor)
  • UNIX kernel: Close the local copy and notify Venus that the file has been closed.
  • Venus: If the local copy has been changed, send a copy to the Vice server that is the custodian of the file.
  • Vice: Replace the file contents and send a callback to all other clients holding callback promises on the file.
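The open and close paths above can be sketched with two small classes. This is a toy model, not AFS code: the method names fetch, store, and callback, and the in-memory dictionaries, are assumptions made for illustration.

```python
class Vice:
    def __init__(self):
        self.files = {}       # name -> contents
        self.promises = {}    # name -> set of Venus instances holding a promise

    def fetch(self, name, venus):
        self.promises.setdefault(name, set()).add(venus)   # log the callback promise
        return self.files[name]

    def store(self, name, data, venus):
        self.files[name] = data                            # replace the file contents
        for other in self.promises.get(name, set()) - {venus}:
            other.callback(name)                           # call back all other holders
        self.promises[name] = {venus}


class Venus:
    def __init__(self, vice):
        self.vice = vice
        self.cache = {}       # name -> contents of the local copy
        self.valid = set()    # names with a valid callback promise
        self.dirty = set()    # names changed since the last store

    def open(self, name):
        if name not in self.cache or name not in self.valid:
            self.cache[name] = self.vice.fetch(name, self)
            self.valid.add(name)
        return self.cache[name]

    def write(self, name, data):
        self.cache[name] = data       # normal write on the local copy
        self.dirty.add(name)

    def close(self, name):
        if name in self.dirty:
            self.vice.store(name, self.cache[name], self)
            self.dirty.discard(name)

    def callback(self, name):
        self.valid.discard(name)      # promise cancelled; refetch on next open
```

While a callback promise is valid, an open is served entirely from the local cache with no message to Vice, which is the main reason this design scales better than checking on every open.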

DFS Example: XFS

[Figure: clients, metadata managers, and storage servers are connected by a LAN.]

Write path:
  1. Write requests
  2. Log them in a segment
  3. Fragment a segment and send the fragments to a stripe group of servers

Read path:
  1. Read request
  2. Query a manager
  3. Collaborative caching (read data from another client if possible)

DFS Example: Plan 9

[Figure: File server 1, File server 2, a computation server, and a network interface each export part of a name space (directories such as a, b, c, d1, d2, d3, x, y, and net). The client imports them into its own tree rooted at /. Annotations: union directory, remote execution, and network access to the Internet through the imported net directory.]

Paper Review by Students

  • Sun NFS
  • Andrew File System
  • XFS
  • Plan 9
  • LFS

Discussions

  • What file-sharing semantics is each system based on?
  • Which systems use server-side caching?
  • Which systems use client-side caching?
  • Which systems use client-initiated validation?
  • Which systems use server-initiated validation?

Non-Turn-In Exercises

Q1. In transaction-like semantics, a.k.a. concurrency control, compare the pros and cons of backward and forward transactions. In particular, consider the case where each transaction includes more read than write operations.

Backward transaction
  Pros:
  Cons:
Forward transaction
  Pros:

Q2. Answer the following five questions about file caching. When you are asked to show which systems use a given caching scheme, choose all applicable systems from NFS, AFS, xFS, and Plan9.

Q2-1. Why can file caching contribute to performance improvement? Give two reasons.
  Reason 1:
  Reason 2:

Q2-2. State one merit of using server-side caching. Which system uses server-side caching?
  Merit:
  System: Plan9 (Answer)

Q2-3. Client-side caching allows multiple clients to cache the same file. There are two schemes to validate the contents of a locally cached file (or invalidate the contents of the same file cached at remote clients): client-initiated and server-initiated validation. Does client-initiated validation require a file server to be stateful? Justify your answer. Also show which systems use client-initiated validation.
  Stateless or stateful?
  Reason:
  Systems: NFS, Plan9 (Answer)

Q2-4. Does server-initiated validation require a file server to be stateful? Justify your answer. Also show which systems use server-initiated validation.
  Stateless or stateful?
  Reason:
  Systems: AFS, xFS (Answer)