Transcript
Page 1: Self Stabilizing Distributed File System

Self Stabilizing Distributed File System

Shlomi Dolev and Ronen I. Kat

Department of Computer Science, Ben-Gurion University

Research Sponsored by IBM

Page 2: Self Stabilizing Distributed File System

DFS Motivation

• Performance

• Fault tolerance

• Placing files closer to users

Page 3: Self Stabilizing Distributed File System

Related Work

• File systems
  • NFS – network file system protocol
  • AFS – Andrew file system – CMU (1988)
  • Coda – CMU (1998)
  • Intermezzo – Peter J. Braam, CMU

• Peer to peer (2000)
  • Global storage: OceanStore – Berkeley
  • Serverless: Microsoft Farsite

Page 4: Self Stabilizing Distributed File System

Talk Overview

• Self-stabilization
• Design
• Algorithms
• File system implementation
• Future work

Page 5: Self Stabilizing Distributed File System

Self Stabilization

• Self healing
• Adaptiveness
• Automatic recovery
• Autonomic computing

Self Stabilization – Dijkstra 1974

Page 6: Self Stabilizing Distributed File System

Self Stabilization

A self-stabilizing system is a system that can automatically recover following the occurrence of (transient) faults.

The idea is to design a system that can be started in an arbitrary state and still converge to a desired behaviour.

See, e.g., Self-Stabilization, S. Dolev.
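As a concrete illustration of converging from an arbitrary state, here is a minimal Python sketch of Dijkstra's K-state token ring, the 1974 algorithm credited on the previous slide. It is not part of the file system presented in this talk. The function names and the random scheduler are illustrative assumptions, and the sketch assumes K is at least the number of machines.

```python
import random

def privileged(states):
    """Return the indices of the machines that currently hold a privilege."""
    n = len(states)
    result = []
    for i in range(n):
        if i == 0:
            # Machine 0 is privileged when it equals its left neighbour (the last machine).
            if states[0] == states[n - 1]:
                result.append(0)
        elif states[i] != states[i - 1]:
            # Every other machine is privileged when it differs from its left neighbour.
            result.append(i)
    return result

def step(states, k, rng):
    """Let one privileged machine, picked by the scheduler, make its move."""
    i = rng.choice(privileged(states))
    if i == 0:
        states[0] = (states[0] + 1) % k
    else:
        states[i] = states[i - 1]

def stabilize(states, k, rng, max_steps=10_000):
    """Run until exactly one machine is privileged; return the number of steps used."""
    for steps in range(max_steps):
        if len(privileged(states)) == 1:
            return steps
        step(states, k, rng)
    raise RuntimeError("did not stabilize within max_steps")
```

Calling `stabilize([3, 1, 4, 1, 5], 6, random.Random(0))` starts the ring in an arbitrary state. Once a single privilege remains, every later step keeps exactly one privilege circulating.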

Page 7: Self Stabilizing Distributed File System

Self Stabilization Motivation

• The combination and type of faults cannot be totally anticipated in on-going systems

• Any on-going system must be self stabilizing (or manually monitored)

• A self-stabilizing algorithm can recover from any arbitrary state reached due to the occurrence of faults

Page 8: Self Stabilizing Distributed File System

Design

Page 9: Self Stabilizing Distributed File System

Design

• Replication servers joined to a spanning tree

• A spanning tree is constructed
• File updates are propagated using a self-stabilizing β-synchronizer

Page 10: Self Stabilizing Distributed File System

Design (Cont'd)

• Clients join the replication tree and form a caching tree

• File leases
• Global locking

Page 11: Self Stabilizing Distributed File System

Algorithms – Self Stabilizing

• Electing a leader (leader election)
• Collecting connectivity information
• Optimising communication costs
• β-Synchronizer for file consistency

Page 12: Self Stabilizing Distributed File System

Leader Election

• A single leader coordinates construction

• If none exists, a server becomes a leader
• If more than one exists, one survives
• Messages are periodically broadcast

Page 13: Self Stabilizing Distributed File System

Leader Election Algorithm

• Every T1 do:
  • If (p = leader) then send-multicast(‘I’m a leader’)
  • Leader-exists = true

• Every T1+Td do:
  • If (not leader-exists) then leader = p
  • Leader-exists = false

• Upon arrival of message do:
  • If (p.volume=volume) then
    • If (p=leader) then leader = min(leader,sender)
    • Else leader = sender
  • Leader-exists = true
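The three rules above can be sketched as a small round-based simulation in Python. This is a reading of the slide, not the authors' implementation. The class and function names are hypothetical, and each round delivers all multicasts before the T1+Td timeout fires. The slide leaves the nesting of "Leader-exists = true" ambiguous, so the sketch assumes the flag is set only by a process that announces itself, or on a message for the same volume.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Process:
    pid: int
    volume: str
    leader: Optional[int] = None   # may hold an arbitrary (corrupted) value at start
    leader_exists: bool = False

    def announce(self) -> Optional[Tuple[int, str]]:
        """Every T1: a process that believes it is the leader multicasts that fact."""
        if self.leader == self.pid:
            self.leader_exists = True          # assumption: set only when announcing
            return (self.pid, self.volume)
        return None

    def on_message(self, sender: int, volume: str) -> None:
        """Upon arrival of an 'I'm a leader' message."""
        if volume != self.volume:
            return                              # other volumes elect their own leader
        if self.leader == self.pid:
            self.leader = min(self.leader, sender)
        else:
            self.leader = sender
        self.leader_exists = True

    def on_timeout(self) -> None:
        """Every T1+Td: take over if no leader was heard, then reset the flag."""
        if not self.leader_exists:
            self.leader = self.pid
        self.leader_exists = False

def run_round(processes: List[Process]) -> None:
    """One simulated period: announcements, delivery to all others, then timeouts."""
    messages = [m for m in (p.announce() for p in processes) if m is not None]
    for sender, volume in messages:
        for p in processes:
            if p.pid != sender:
                p.on_message(sender, volume)
    for p in processes:
        p.on_timeout()
```

Starting three processes with a bogus `leader` value and calling `run_round` a few times leaves all of them agreeing on one leader, which then keeps announcing itself every round.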

Page 14: Self Stabilizing Distributed File System

Algorithms – Self Stabilizing

• Electing a leader (leader election)
• Collecting connectivity information
• Optimising communication costs
• β-Synchronizer for file consistency

Page 15: Self Stabilizing Distributed File System

Induced Graph Example

Page 16: Self Stabilizing Distributed File System

Update Algorithm

• Collect routing tables from all neighbours in the induced graph

• Elect a manager (local leader) for the tree, a server with the minimal ID

• Build a distributed BFS spanning tree
• The algorithm converges
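The slide does not spell out the update algorithm, so the sketch below swaps in a textbook-style self-stabilizing BFS construction to show how a tree rooted at the minimal ID can emerge from arbitrary initial state. Each node repeatedly recomputes a `(root, dist, parent)` triple from its neighbours. The function names, the synchronous rounds, and the distance bound `n` used to flush out stale roots are assumptions of this sketch, and the graph is assumed to be connected.

```python
def bfs_round(graph, state):
    """One synchronous round: every node recomputes (root, dist, parent)
    from the previous states of its neighbours."""
    n = len(graph)
    new_state = {}
    for v, neighbours in graph.items():
        best = (v, 0, None)                    # a node can always be its own root
        for u in neighbours:
            root, dist, _ = state[u]
            cand = (root, dist + 1, u)
            # Distances outside 1..n-1 cannot belong to a real tree; dropping them
            # makes claims about non-existent roots die out.
            if 0 < cand[1] < n and (cand[0], cand[1]) < (best[0], best[1]):
                best = cand
        new_state[v] = best
    return new_state

def stabilize_tree(graph, state, rounds=None):
    """Run enough rounds for the tree to converge (3n is a generous bound here)."""
    if rounds is None:
        rounds = 3 * len(graph)
    for _ in range(rounds):
        state = bfs_round(graph, state)
    return state
```

`graph` maps each server ID to the list of its neighbours in the induced graph, and `state` may start with any values. After stabilization every node reports the minimal ID as root and its BFS distance to it.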

Page 17: Self Stabilizing Distributed File System

Algorithms – Self Stabilizing

• Electing a leader (leader election)
• Collecting connectivity information
• Optimising communication costs
• β-Synchronizer for file consistency

Page 18: Self Stabilizing Distributed File System

Optimising Communication Costs

• Goal: find the minimal radius that keeps connectivity

• Increase the radius by a factor of 2
• Run a 2nd instance of update with a smaller radius
• Search for the minimal radius using binary search
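The doubling and binary-search steps can be sketched as follows. Here `is_connected` is a hypothetical callback standing in for running the update algorithm with a given radius and reporting whether connectivity is kept. The sketch assumes an integer radius and that connectivity is monotone, so any radius above a working one also works.

```python
def minimal_radius(is_connected, start=1, limit=1 << 16):
    """Find the smallest radius r >= start for which is_connected(r) holds."""
    # Phase 1: double the radius until connectivity is reached.
    hi = start
    while not is_connected(hi):
        if hi >= limit:
            raise ValueError("no connecting radius found below the limit")
        hi *= 2
    # The previous doubling step (if any) is known to be too small.
    lo = hi // 2 if hi > start else start - 1
    # Phase 2: binary search in (lo, hi], keeping hi connected and lo not.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_connected(mid):
            hi = mid
        else:
            lo = mid
    return hi
```

With a predicate such as `lambda r: r >= 11`, the doubling phase probes 1, 2, 4, 8, 16 and the binary search then narrows the interval between 8 and 16 down to 11.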

Page 19: Self Stabilizing Distributed File System

Tree Structure

Page 20: Self Stabilizing Distributed File System

Caching Tree

• Extends the replication tree
• The update algorithm constructs both
• Servers execute two instances
• Caches execute one instance

Page 21: Self Stabilizing Distributed File System

Combined Spanning Tree

Page 22: Self Stabilizing Distributed File System

Algorithms – Self Stabilizing

• Electing a leader (leader election)
• Collecting connectivity information
• Optimising communication costs
• β-Synchronizer for file consistency

Page 23: Self Stabilizing Distributed File System

Synchronization Mechanism

• Provide reliable command and timing
• Propagate commands between servers
• Collect and distribute information
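One pulse of a tree-based synchronizer can be pictured as a broadcast down the spanning tree followed by a convergecast of replies back to the root, in the style of Awerbuch's β synchronizer. The sketch below shows only that single pulse and is not self-stabilizing. The `children` map, the `handlers` callbacks, and the function name are hypothetical.

```python
def pulse(children, node, command, handlers):
    """Send `command` down the subtree rooted at `node` and gather the replies
    of every server in that subtree on the way back up."""
    # The node executes the command locally ...
    collected = {node: handlers[node](command)}
    # ... forwards it to each child, and merges the replies of each subtree.
    for child in children.get(node, []):
        collected.update(pulse(children, child, command, handlers))
    return collected
```

The root learns that the pulse is complete when the call returns, and the collected dictionary is the information it can then distribute in the next pulse.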

Page 24: Self Stabilizing Distributed File System

Replication Consistency

• Verifies signatures
• Multiple signatures – a conflict
• Conflict resolution
• Broadcast resolved signature
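A minimal sketch of the signature check follows. The slides do not say how signatures are computed or how conflicts are resolved, so both are assumptions here: SHA-256 of the file content stands in for the signature, and `majority` is one hypothetical resolution policy among many.

```python
import hashlib
from collections import Counter

def signature(content: bytes) -> str:
    """Assumed signature: SHA-256 digest of the file content."""
    return hashlib.sha256(content).hexdigest()

def majority(replica_signatures):
    """Hypothetical policy: most common signature, ties broken by smallest digest."""
    counts = Counter(replica_signatures.values())
    return min(counts, key=lambda s: (-counts[s], s))

def check_replicas(replica_signatures, resolve=majority):
    """Compare the signatures reported by the replicas of one file.

    Returns (signature_to_broadcast, conflict_detected)."""
    if not replica_signatures:
        raise ValueError("no replica reported a signature")
    distinct = set(replica_signatures.values())
    if len(distinct) == 1:
        return distinct.pop(), False           # all replicas agree
    return resolve(replica_signatures), True   # multiple signatures: a conflict
```

`replica_signatures` maps a server name to the signature it holds. The returned signature is the one that would be broadcast to the replicas.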

Page 25: Self Stabilizing Distributed File System

Locking Table

• A (unified) global lock table
• Locks are requested
• Leader resolves multiple locks
• Locks are removed by cancelling the lock request
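The lock table can be sketched as a map from file to the set of outstanding requests, with the leader choosing one holder among them. The class name and the choice of the smallest requester ID as the winner are assumptions of this sketch; the slides do not state the leader's policy.

```python
class LockTable:
    """A single table of lock requests, as the leader would see it."""

    def __init__(self):
        self.requests = {}                     # path -> set of requester ids

    def request(self, path, requester):
        """A server or cache asks for the lock on `path`."""
        self.requests.setdefault(path, set()).add(requester)

    def cancel(self, path, requester):
        """A lock is released by cancelling the request that asked for it."""
        waiting = self.requests.get(path, set())
        waiting.discard(requester)
        if not waiting:
            self.requests.pop(path, None)

    def holder(self, path):
        """Leader's decision among competing requests (assumed: smallest id wins)."""
        waiting = self.requests.get(path)
        return min(waiting) if waiting else None
```

When two requesters ask for the same file, `holder` names one of them. After the winner cancels its request, the lock passes to the remaining requester.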

Page 26: Self Stabilizing Distributed File System

File System Implementation

Page 27: Self Stabilizing Distributed File System

Accessing a File

[Flowchart. Steps: Lock file, Get signature, Get a copy, Use local copy. Decisions (Yes/No): Cached?, Update?]

Page 28: Self Stabilizing Distributed File System

Closing a File

[Flowchart. Steps: Send new signature, Confirm signature. Decision (Yes/No): Update?]

Page 29: Self Stabilizing Distributed File System

Meta Access

• Globally processed
• Blocked until a lock is obtained

[Flowchart. Steps: Lock file, Execute command, Wait confirmation]

Page 30: Self Stabilizing Distributed File System

Linux Based bgRFS

[Architecture diagram. Components: Application (user level); Linux system calls; new implementation of the system calls open, close, lstat, mkdir, etc.; SyncDaemon (cache manager & server); up calls; network communication]

Page 31: Self Stabilizing Distributed File System

Future Work

• Kernel VFS module
• Communication improvements:
  – Reducing update messages
  – Using timers with β-synchronizer
• Performance enhancements
• Integrating disconnected operations
• Conflict resolution algorithms

Page 32: Self Stabilizing Distributed File System

Credits

Undergraduate Students:Amir Livneh [email protected] Granik [email protected] Lansky [email protected] Shmuel [email protected] Shish [email protected] Erlich [email protected] Chohen [email protected] Biran [email protected] Fridman [email protected] Bernard [email protected] Ferents [email protected] Feintuch [email protected] Shalev [email protected] Kraim [email protected] Hayuit

Faculty: Prof Shlomi Dolev [email protected]

Graduate Students: Ronen I. Kat [email protected]

System Engineer: Albina Budker [email protected]

Page 33: Self Stabilizing Distributed File System

Visit us at

www.cs.bgu.ac.il/~bgrfs

