Serverless Network File Systems
Overview by Joseph Thompson
Problem
• Centralized file systems fundamentally limit performance and availability
– All reads and writes go through the centralized server
– Increasing server performance is expensive
Purpose
• Better performance and scalability
• High availability via redundant data storage
Assumption
• SNFS is only appropriate among machines that communicate over a fast network and that trust each other to enforce security
– SNFS generates a significant amount of network traffic
– Security will be covered later
Components of SNFS
• Software RAID
• Log File System (LFS)
• Zebra
– Merges RAID and LFS in a distributed network
– Don’t miss my next presentation on Zebra!
• Multiprocessor Cache Consistency
– In this model, each processor is one client
Three Problems to Be Solved
• Distributed metadata that provides both cache consistency management and the flexibility to dynamically reconfigure client responsibilities
• Scalable way to subset storage servers for efficiency
• Scalable log cleaning
Metadata
• Manager Map
• IMap
• File Directories
• Stripe Group Map
Managers
• The manager of a file controls two sets of information about it
– Cache consistency state
– Disk location metadata
Manager Map
• Table that indicates which physical machines manage which groups of index numbers at any given time
• This table is globally replicated to all managers in the system
– The table is relatively small (tens of kilobytes per hundreds of clients)
– The table rarely changes
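A minimal sketch of the manager map described above: a small, globally replicated table assigning groups of index numbers to manager machines. The grouping scheme and all names here are illustrative assumptions, not details from the paper.

```python
# Hypothetical manager map: index numbers are partitioned into groups,
# and the (small, rarely changing) table maps each group to a manager.

NUM_GROUPS = 16  # illustrative number of index-number groups

# manager_map[group] -> manager machine id; replicated to every manager
manager_map = {g: f"manager-{g % 4}" for g in range(NUM_GROUPS)}

def manager_for(index_number: int) -> str:
    """Find the manager responsible for a file's index number."""
    group = index_number % NUM_GROUPS
    return manager_map[group]

print(manager_for(42))  # index 42 -> group 10 -> manager-2
```

Because the table is tiny, replicating it to every manager (and client) is cheap, and a lookup is a single local table read.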
IMap
• A file’s imap entry contains the log addresses of the file’s inode
– For scalability, imap entries are only distributed to the managers assigned to the file
File Directories
• Contains mappings from file names to index numbers
– Stored in the file itself
– Files created by a client are assigned to the manager on that machine (if there is one)
• Index Numbers
– Used to find the manager responsible for the file
Stripe Group Map Justification
• In a large RAID, even large log segments still create small-write inefficiencies
• While one client writes at its full network bandwidth to one stripe group, another client can do the same with another group
• A smaller segment size makes cleaning more efficient
• Stripe groups greatly improve availability
– Each group stores its own parity, which helps if there are multiple server failures in different groups
Stripe Group Implementation
• Group ID
• Group Members
• Current or Obsolete flag
– The Current/Obsolete field increases efficiency: the cleaner eventually moves all data to a current group, and the obsolete group is then removed
• Also globally replicated to each client
– Small and rarely changes
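The stripe group map fields listed above can be sketched as a simple record; the field names and group contents below are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class StripeGroup:
    group_id: int
    members: list   # storage server ids in this group
    current: bool   # False once the group is obsolete

# Globally replicated map; new writes go only to current groups,
# while the cleaner migrates live data out of obsolete ones.
stripe_group_map = [
    StripeGroup(0, ["ss0", "ss1", "ss2", "ss3"], current=True),
    StripeGroup(1, ["ss4", "ss5", "ss6"], current=False),
]

def writable_groups(groups):
    """Groups that may receive new segment writes."""
    return [g.group_id for g in groups if g.current]

print(writable_groups(stripe_group_map))  # [0]
```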
Cleaning
• Three main tasks
– Maintain utilization status
– Use that status to decide which segments to clean
– Write live blocks from the old segment to a new segment
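The three cleaning tasks can be sketched as follows; the segment table, block counts, and victim-selection policy (lowest utilization first) are illustrative assumptions.

```python
# Hypothetical cleaner state: segment id -> (live blocks, total blocks).
segments = {
    "seg-a": (2, 8),
    "seg-b": (7, 8),
    "seg-c": (4, 8),
}

def pick_victim(segs):
    """Task 2: choose the segment with the lowest utilization."""
    return min(segs, key=lambda s: segs[s][0] / segs[s][1])

def clean(segs, victim, new_segment):
    """Task 3: copy live blocks into a new segment, then free the victim."""
    live, _ = segs.pop(victim)
    new_live, total = segs.get(new_segment, (0, 8))
    segs[new_segment] = (new_live + live, total)

victim = pick_victim(segments)   # "seg-a": only 2/8 blocks live
clean(segments, victim, "seg-d")
```

Cleaning the least-utilized segment first frees the most space per block copied, which is the standard LFS cost argument.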
Distributed Utilization
• Assign the burden of maintaining each segment’s utilization status to the client that wrote the segment
• Each client stores utilization information in an s-file per stripe group it writes to; s-files are written like normal files and can be found by the stripe group’s leader
Distributed Cleaning
• A stripe group leader (dynamically appointed) initiates cleaning when the number of free segments drops below a threshold value or when the group is idle
• The leader accumulates the s-files for the group and can dynamically assign cleaners from different machines to clean subsections of the stripe group in an efficient manner
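The leader's trigger-and-assign behavior above can be sketched like this; the threshold value and the round-robin assignment of segments to cleaner machines are illustrative assumptions.

```python
# Hypothetical leader policy: start cleaning when free segments drop
# below a threshold, and spread dirty segments across cleaner machines.

FREE_THRESHOLD = 4  # illustrative threshold

def assign_cleaners(free_segments, dirty_segments, cleaners):
    """Return {cleaner: [segments to clean]}, or {} if no cleaning is needed."""
    if free_segments >= FREE_THRESHOLD:
        return {}  # enough free space; group stays idle
    work = {c: [] for c in cleaners}
    for i, seg in enumerate(dirty_segments):
        work[cleaners[i % len(cleaners)]].append(seg)
    return work

plan = assign_cleaners(2, ["s1", "s2", "s3", "s4", "s5"], ["m1", "m2"])
print(plan)  # {'m1': ['s1', 's3', 's5'], 'm2': ['s2', 's4']}
```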
Procedure to Read a Block
• Diagram Demystified!
Writing and Cache Consistency
• To write, a client must request a lock from the owning manager, which the manager can revoke at any time
• The manager invalidates its cache and updates its cache consistency information
• One implementation uses per-client caching lists to invalidate stale client caches and to forward read requests to clients with valid cached copies
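The lock-grant-and-invalidate flow above can be sketched as follows; the class shape and all names are illustrative assumptions, not the paper's interfaces.

```python
# Hypothetical manager state for one file's cache consistency:
# who holds the write lock, and which clients hold cached copies.

class Manager:
    def __init__(self):
        self.write_owner = {}  # file -> client holding the write lock
        self.cached_by = {}    # file -> set of clients with a cached copy

    def request_write(self, file, client):
        """Grant the write lock, revoking the old owner and
        invalidating every other client's cached copy."""
        old_owner = self.write_owner.get(file)
        invalidated = self.cached_by.get(file, set()) - {client}
        self.cached_by[file] = {client}
        self.write_owner[file] = client
        return old_owner, invalidated

m = Manager()
m.cached_by["f"] = {"c1", "c2"}
m.write_owner["f"] = "c1"
old, invalidated = m.request_write("f", "c2")
print(old, invalidated)  # c1 {'c1'}
```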
Recovery and Reconfiguration
• General Recovery Strategy
• Data Structure Recovery
• Storage Server Recovery
• Manager Recovery
• Cleaner Recovery
• Scalability of Recovery
General Recovery Strategy
• LFS keeps an append-only record, called the delta, of every file modification between log segment writes
• Uses checkpoint recovery and roll-forward
• Unless additional parity servers per stripe group are used, if multiple storage servers from a single stripe group are unreachable there can be no full recovery
Data Structure Recovery
• Layered dependence requires the recovery to start with the storage servers, then managers, then cleaner
Storage Server Recovery
• As we have seen with RAID architectures, recovering a single storage server is easy
• Once the initial recovery is done, we can use LFS’ delta feature to poll clients for their unwritten changes while rolling forward
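The single-server reconstruction mentioned above can be sketched with XOR parity, as in RAID: one lost block per stripe is rebuilt from the survivors, while a second loss in the same stripe cannot be (matching the single-parity limit noted earlier). The block values are illustrative.

```python
# XOR parity over a stripe: the parity block is the XOR of all data
# blocks, so any single missing block equals the XOR of the rest.

def parity(blocks):
    p = 0
    for b in blocks:
        p ^= b
    return p

data = [0b1010, 0b0110, 0b1100]  # blocks on three storage servers
p = parity(data)                 # block stored on the parity server

# Server holding data[1] fails; rebuild its block from survivors + parity.
rebuilt = parity([data[0], data[2], p])
print(rebuilt == data[1])  # True
```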
Manager Recovery
• Retrieves last known imaps from its last checkpoint written to a storage server
• During roll-forward, the manager gathers a consensus of manager map tables from clients to apply the appropriate changes to data block locations
Cleaner Recovery
• Since s-files are stored like normal files, they will be recovered from the respective storage server
• The cleaner must then go through a roll-forward phase, checking clients for summaries of their more recent modifications to those segments
• To avoid clients having to search their logs multiple times, this utilization information can be gathered during the manager recovery process
Scalability of Recovery
• The roll-forward step can generate O(N^2) messages per object, where N is the number of clients, managers, or storage servers
• As an optimization, each object need only contact the N lower-layer objects, and if randomization is used to reduce concurrent access to a single storage server, each manager can roll forward in parallel
Other Information Not Covered Here
• Details of xFS prototype and performance testing
• Follow-up research on the state of xFS since 1995, when this paper was written
Conclusion
• Paper is valuable
– Provides a creative use of new and old ideas to pioneer a new file system
• Problems
– Restrictions on the usability of this system in a non-secure environment
• Solutions
– P2P security solutions we discussed in class