Outline for Today’s Lecture Administrative: Objective: – NTFS – continued – Journaling FS – Distributed File Systems – Disconnected File Access – Energy Management

Outline for Today’s Lecture

Administrative:

Objective:
– NTFS – continued
– Journaling FS
– Distributed File Systems
– Disconnected File Access
– Energy Management

NTFS - continued

File Compression

(a) An example of a 48-block file being compressed to 32 blocks.
(b) The MFT record for the file after compression.

File Encryption

[Figure: operation of the encrypting file system; surviving labels: K retrieved, user's public key]

Comparisons

Data blocks
– FFS: clustering, cylinder grouping
– LFS: log, contiguous by temporal ordering
– NTFS: runs of contiguous blocks possible; immediate files

Directories
– FFS: directory nodes, cylinder grouping
– LFS: directory nodes, in log
– NTFS: they are MFT entries

Block indices
– FFS: inodes, at a specified location in the cylinder group
– LFS: inodes in log
– NTFS: in MFT entries for files

Journaling for Meta-data Ops

Metadata Operations

• Metadata operations modify the structure of the file system
– Creating, deleting, or renaming files, directories, or special files

• Data must be written to disk in such a way that the file system can be recovered to a consistent state after a system crash

General Rules of Ordering

1) Never point to a structure before it has been initialized (inode < direntry)

2) Never re-use a resource before nullifying all previous pointers to it

3) Never reset the old pointer to a live resource before the new pointer has been set (renaming)

Metadata Integrity

• FFS uses synchronous writes to guarantee the integrity of metadata
– Any operation modifying multiple pieces of metadata will write its data to disk in a specific order
– These writes will be blocking
• Guarantees integrity and durability of metadata updates

Deleting a file

[Figure: directory entries abc, def, ghi; i-nodes i-node-1, i-node-2, i-node-3]

Assume we want to delete file “def”

Deleting a file

[Figure: directory entries abc, def, ghi; only i-node-1 and i-node-3 remain]

Cannot delete i-node before directory entry “def”

Deleting a file

• Correct sequence is
1. Write to disk directory block containing deleted directory entry “def”
2. Write to disk i-node block containing deleted i-node
• Leaves the file system in a consistent state

Creating a file

[Figure: directory entries abc, ghi; i-nodes i-node-1, i-node-3]

Assume we want to create new file “tuv”

Creating a file

[Figure: directory entries abc, ghi, tuv; only i-node-1 and i-node-3 exist]

Cannot write directory entry “tuv” before i-node

Creating a file

• Correct sequence is
1. Write to disk i-node block containing new i-node
2. Write to disk directory block containing new directory entry
• Leaves the file system in a consistent state
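
The ordering rule for file creation can be illustrated with a toy model. This is only a sketch: the `disk` dictionary, `create_file`, `dangling_entries`, and the `crash_after` parameter are hypothetical names for illustration, not FFS code. It shows that a crash between the two writes leaves no directory entry pointing at an uninitialized i-node (at worst, an unreferenced i-node is left behind).

```python
# Toy "disk": one dict for the i-node table, one for a directory block.
# All names are hypothetical; this only illustrates write ordering.

def create_file(disk, name, inum, crash_after=None):
    """Create `name`: write the i-node first, then the directory entry.

    `crash_after` simulates a crash after that many synchronous writes.
    """
    steps = [
        ("inodes", inum, {"links": 1}),  # 1. write the initialized i-node
        ("dir", name, inum),             # 2. then the entry that points to it
    ]
    for done, (table, key, value) in enumerate(steps):
        if crash_after is not None and done >= crash_after:
            return                       # simulated crash: later writes are lost
        disk[table][key] = value


def dangling_entries(disk):
    """Directory entries that point at an i-node that does not exist."""
    return [n for n, i in disk["dir"].items() if i not in disk["inodes"]]


disk = {"inodes": {1: {"links": 1}, 3: {"links": 1}},
        "dir": {"abc": 1, "ghi": 3}}
create_file(disk, "tuv", 2, crash_after=1)  # crash between the two writes
print(dangling_entries(disk))
```

Reversing the two steps and crashing at the same point would leave “tuv” pointing at a missing i-node, which is the inconsistent state the slide warns about.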

Synchronous Updates

• Used by FFS to guarantee consistency of metadata:
– All metadata updates are done through blocking writes

• Increases the cost of metadata updates

• Can significantly impact the performance of whole file system

Journaling

• Journaling systems maintain an auxiliary log that records all meta-data operations

• Write-ahead logging ensures that the log is written to disk before any blocks containing data modified by the corresponding operations.
– After a crash, can replay the log to bring the file system to a consistent state
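
A minimal sketch of write-ahead logging and replay follows. The `JournaledStore` class and its record format (block name, new value) are assumptions made for illustration; real journals log finer-grained records and also handle checkpoints and log truncation.

```python
# Minimal write-ahead log sketch. The record format and names are assumptions.

class JournaledStore:
    def __init__(self):
        self.log = []    # stands in for the on-disk log (sequential appends)
        self.disk = {}   # stands in for on-disk metadata blocks
        self.cache = {}  # in-memory copies, lost on a crash

    def update(self, block, value):
        self.log.append((block, value))  # the log record is written first
        self.cache[block] = value        # the block itself is written back later

    def flush(self, block):
        self.disk[block] = self.cache[block]

    def crash(self):
        self.cache = {}                  # volatile state disappears

    def recover(self):
        for block, value in self.log:    # replay in log order; replay is idempotent
            self.disk[block] = value


store = JournaledStore()
store.update("dir:/", {"abc": 1, "tuv": 2})
store.update("inode:2", {"links": 1})
store.crash()      # neither block had been flushed
store.recover()    # both updates are reconstructed from the log
```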

Journaling

• Log writes are performed in addition to the regular writes

• Journaling systems incur log write overhead but
– Log writes can be performed efficiently because they are sequential
– Metadata blocks do not need to be written back after each update

Journaling

• Journaling systems can provide
– the same durability semantics as FFS if the log is forced to disk after each meta-data operation
– laxer semantics if log writes are buffered until entire buffers are full

Implementation with log as file

• Maintains a circular log in a pre-allocated file in the FFS (about 1% of file system size)

• Buffer manager uses a write-ahead logging protocol to ensure proper synchronization between regular file data and the log

• Buffer header of each modified block in cache identifies the first and last log entries describing an update to the block

Implementation with log as file

• System uses
– First item to decide which log entries can be purged from log
– Second item to ensure that all relevant log entries are written to disk before the block is flushed from the cache
• Maintains its log asynchronously
– Maintains file system integrity, but does not guarantee durability of updates
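
The role of the first and last log entries in each buffer header can be sketched as follows. The `Log` and `Buffer` classes, the use of list indices as log sequence numbers (LSNs), and the helper names are all hypothetical; the sketch only shows how `last` enforces the write-ahead rule and how `first` bounds what may be purged.

```python
# Sketch: each cached block remembers the first and last log sequence numbers
# (LSNs) of the records describing updates to it. All names are hypothetical.

class Log:
    def __init__(self):
        self.records = []  # every record appended so far (list index = LSN)
        self.flushed = 0   # records with LSN < flushed are on disk

    def append(self, rec):
        self.records.append(rec)
        return len(self.records) - 1

    def force(self, lsn):
        self.flushed = max(self.flushed, lsn + 1)


class Buffer:
    def __init__(self):
        self.first = None  # oldest log record still needed for this block
        self.last = None   # newest log record describing this block


def modify(log, buf, rec):
    lsn = log.append(rec)
    if buf.first is None:
        buf.first = lsn
    buf.last = lsn


def flush_block(log, buf):
    if buf.last is None:
        return                     # block is clean, nothing to do
    log.force(buf.last)            # write-ahead rule: log reaches disk first
    buf.first = buf.last = None    # block is now clean


def log_tail(buffers, log):
    """Oldest LSN that must be kept; everything before it can be purged."""
    firsts = [b.first for b in buffers if b.first is not None]
    return min(firsts) if firsts else len(log.records)


log, a, b = Log(), Buffer(), Buffer()
modify(log, a, "a1")
modify(log, b, "b1")
modify(log, a, "a2")
```

Flushing block `a` forces the log through LSN 2 and advances the purgeable tail from 0 to 1, because only block `b` still depends on an unflushed update.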

Data structures for log

Superblock - records log start

[Figure: cached buffer headers, each holding first and last pointers into the circular log file]

Recovery

• Superblock has address of last checkpoint
• First recover the log
• Then read the log from its logical end (backward pass) and undo all aborted operations
• Do forward pass and reapply all updates that have not yet been written to disk

Other Approaches

• Using non-volatile cache (Network Appliance)
– Ultimate solution: can keep data in cache forever
– Additional cost of NVRAM
• Simulating NVRAM with
– Uninterruptible power supplies
– Hardware-protected RAM (Rio): cache is marked read-only most of the time

Other Approaches

• Log-structured file systems
– Not always possible to write all related meta-data in a single disk transfer
– Sprite-LFS adds small log entries to the beginning of segments
– BSD-LFS makes segments temporary until all metadata necessary to ensure the recoverability of the file system are on disk.

Distributed File Systems

Distributed File Systems

• Naming
– Location transparency/independence
• Caching
– Consistency
• Replication
– Availability and updates

[Figure: clients and servers connected by a network]

Naming

• \\His\d\pictures\castle.jpg
– Not location transparent: both machine and drive are embedded in the name.
• NFS mounting
– Remote directory mounted over local directory in local naming hierarchy.
– /usr/m_pt/A
– No global view

[Figure: her local directory tree (usr, m_pt); his local directory tree (for_export with children A and B; usr, m_pt); her local tree after the mount, with A and B under usr/m_pt; his tree after mount on B]

Global Name Space

Example: Andrew File System

[Figure: root directory / with children afs, tmp, bin, lib. tmp, bin, and lib are local files; the subtree under afs holds shared files and looks identical to all clients.]

Hints

• A valuable distributed systems design technique that can be illustrated in naming.

• Definition: information that is not guaranteed to be correct. If it is, it can improve performance. If not, things will still work OK. Must be able to validate information.

• Example: Sprite prefix tables

Prefix Tables

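[Figure: a name tree rooted at / containing A, m_pt1, usr, B, and m_pt2, with subtrees colored by the server that stores them]

Prefix table:
/A/m_pt1 blue
/A/m_pt1/usr/B pink
/A/m_pt1/usr/m_pt2 pink

Example name: /A/m_pt1/usr/m_pt2/stuff.below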
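
A prefix-table lookup amounts to a longest-prefix match on the pathname. The sketch below uses the table entries from the slide; the function name `lookup_server` and the tuple it returns are assumptions. Because a table entry is only a hint, a real client would still have to cope with the chosen server rejecting the request and then refresh the entry; that validation step is not modeled here.

```python
def lookup_server(prefix_table, path):
    """Return (server, remaining path) for the longest matching prefix."""
    parts = path.strip("/").split("/")
    for n in range(len(parts), 0, -1):          # try the longest prefix first
        prefix = "/" + "/".join(parts[:n])
        if prefix in prefix_table:
            return prefix_table[prefix], "/".join(parts[n:])
    return None, path                           # no hint: fall back to a broadcast or root lookup


# Entries taken from the slide; "blue" and "pink" name the two servers.
prefix_table = {
    "/A/m_pt1": "blue",
    "/A/m_pt1/usr/B": "pink",
    "/A/m_pt1/usr/m_pt2": "pink",
}

print(lookup_server(prefix_table, "/A/m_pt1/usr/m_pt2/stuff.below"))
```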

VFS: the Filesystem Switch

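[Figure: user space sits above the syscall layer (file, uio, etc.); below it, the Virtual File System (VFS) dispatches to NFS (over the TCP/IP network protocol stack), FFS, LFS, and other file systems (*FS); device drivers are at the bottom]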

Sun Microsystems introduced the virtual file system framework in 1985 to accommodate the Network File System cleanly.

• VFS allows diverse specific file systems to coexist in a file tree, isolating all FS-dependencies in pluggable filesystem modules.

VFS was an internal kernel restructuring with no effect on the syscall interface.

Incorporates object-oriented concepts: a generic procedural interface with multiple implementations.

Other abstract interfaces in the kernel: device drivers, file objects, executable files, memory objects.

Vnodes

In the VFS framework, every file or directory in active use is represented by a vnode object in kernel memory.

[Figure: syscall layer above NFS and UFS vnodes, plus a list of free vnodes]

Active vnodes are reference-counted by the structures that hold pointers to them, e.g., the system open file table.

Each vnode has a standard file attributes struct.

Vnode operations are macros that vector to filesystem-specific procedures.

Generic vnode points at filesystem-specific struct (e.g., inode, rnode), seen only by the filesystem.

Each specific file system maintains a hash of its resident vnodes.

Vnode Operations and Attributes

directories only
vop_lookup (OUT vpp, name)
vop_create (OUT vpp, name, vattr)
vop_remove (vp, name)
vop_link (vp, name)
vop_rename (vp, name, tdvp, tvp, name)
vop_mkdir (OUT vpp, name, vattr)
vop_rmdir (vp, name)
vop_readdir (uio, cookie)
vop_symlink (OUT vpp, name, vattr, contents)
vop_readlink (uio)

files only
vop_getpages (page**, count, offset)
vop_putpages (page**, count, sync, offset)
vop_fsync ()

vnode/file attributes (vattr or fattr)
type (VREG, VDIR, VLNK, etc.)
mode (9+ bits of permissions)
nlink (hard link count)
owner user ID
owner group ID
filesystem ID
unique file ID
file size (bytes and blocks)
access time
modify time
generation number

generic operations
vop_getattr (vattr)
vop_setattr (vattr)
vhold()
vholdrele()

Pathname Traversal

• When a pathname is passed as an argument to a system call, the syscall layer must “convert it to a vnode”.
• Pathname traversal is a sequence of vop_lookup calls to descend the tree to the named file or directory.

open(“/tmp/zot”)
vp = get vnode for / (rootdir)
vp->vop_lookup(&cvp, “tmp”);
vp = cvp;
vp->vop_lookup(&cvp, “zot”);

Issues:
1. crossing mount points
2. obtaining root vnode (or current dir)
3. finding resident vnodes in memory
4. caching name->vnode translations
5. symbolic (soft) links
6. disk implementation of directories
7. locking/referencing to handle races with name create and delete operations
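
The traversal loop can be sketched as a runnable toy. The `Vnode` class and the `namei` function below are simplified stand-ins (the name `namei` is borrowed from Unix kernels; the code is not taken from one). The sketch deliberately ignores the listed issues such as mount points, symbolic links, name caching, and locking.

```python
class Vnode:
    """Toy vnode: a directory holds a name -> Vnode mapping."""

    def __init__(self, children=None):
        self.children = children  # None marks a regular file

    def vop_lookup(self, name):
        if self.children is None:
            raise NotADirectoryError(name)
        if name not in self.children:
            raise FileNotFoundError(name)
        return self.children[name]


def namei(root, path):
    """Convert a pathname to a vnode with one vop_lookup per component."""
    vp = root
    for component in path.split("/"):
        if component:             # skip empty parts from leading or doubled "/"
            vp = vp.vop_lookup(component)
    return vp


zot = Vnode()
root = Vnode({"tmp": Vnode({"zot": zot})})
found = namei(root, "/tmp/zot")
```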

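Example: Network File System (NFS)

[Figure: on the client, user programs call the syscall layer and then VFS, which dispatches to the local UFS or to the NFS client; the NFS client talks over the network to the NFS server, which calls through the server's VFS into its UFS]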

NFS Protocol

NFS is a network protocol layered above TCP/IP.
– Original implementations (and most today) use UDP datagram transport for low overhead.
• Maximum IP datagram size was increased to match FS block size, to allow send/receive of entire file blocks.
• Some newer implementations use TCP as a transport.

NFS protocol is a set of message formats and types.
• Client issues a request message for a service operation.
• Server performs requested operation and returns a reply message with status and (perhaps) requested data.

File Handles

Question: how does the client tell the server which file or directory the operation applies to?
– Similarly, how does the server return the result of a lookup?
• More generally, how to pass a pointer or an object reference as an argument/result of an RPC call?

In NFS, the reference is a file handle or fhandle, a 32-byte token/ticket whose value is determined by the server.
– Includes all information needed to identify the file/object on the server, and get a pointer to it quickly.

[Figure: file handle layout: volume ID | inode # | generation #]
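
One way to picture a file handle is as a fixed-size, opaque byte string that only the server interprets. In the sketch below the three fields come from the slide, but the 4-byte field widths, the zero padding, and the names `make_fhandle` and `resolve` are assumptions. The generation check illustrates the usual purpose of that field: rejecting a handle whose inode number has since been reused for a different file.

```python
import struct

FH_SIZE = 32                       # handle size stated on the slide
_FIELDS = struct.Struct(">III")    # volume ID, inode number, generation number (widths assumed)


def make_fhandle(volume, inode, generation):
    """Server side: build an opaque handle, zero-padded to FH_SIZE bytes."""
    return _FIELDS.pack(volume, inode, generation).ljust(FH_SIZE, b"\0")


def resolve(fhandle, inode_table):
    """Server side: map a handle back to a file, rejecting stale handles."""
    volume, inode, generation = _FIELDS.unpack(fhandle[:_FIELDS.size])
    entry = inode_table.get((volume, inode))
    if entry is None or entry["generation"] != generation:
        raise LookupError("stale file handle")
    return entry


inode_table = {(1, 42): {"generation": 7}}
fh = make_fhandle(1, 42, 7)
```

The client never parses the handle; it stores the 32 bytes and sends them back with each request.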

NFS: From Concept to Implementation

Now that we understand the basics, how do we make it work in a real system?

– How do we make it fast?

• Answer: caching, read-ahead, and write-behind.

– How do we make it reliable? What if a message is dropped? What if the server crashes?

• Answer: client retransmits request until it receives a response.

– How do we preserve file system semantics in the presence of failures and/or sharing by multiple clients?

• Answer: well, we don’t, at least not completely.

– What about security and access control?

Caching was “The Answer”

• Avoid the disk for as many file operations as possible.

• Cache acts as a filter for the requests seen by the disk; reads are served best.

• Delayed writeback will avoid going to disk at all for temp files.

[Figure: a process (Proc) accessing the file cache in memory]

Caching in Distributed F.S.

• Location of cache on client - disk or memory
• Update policy
– write through
– delayed writeback
– write-on-close
• Consistency
– Client does validity check, contacting server
– Server call-backs

[Figure: clients and servers connected by a network]

File Cache Consistency

Caching is a key technique in distributed systems.

The cache consistency problem: cached data may become stale if cached data is updated elsewhere in the network.

Solutions:
• Timestamp invalidation (NFS).
– Timestamp each cache entry, and periodically query the server: “has this file changed since time t?”; invalidate cache if stale.
• Callback invalidation (AFS).
– Request notification (callback) from the server if the file changes; invalidate cache on callback.
• Leases (NQ-NFS) [Gray&Cheriton89]
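
Timestamp invalidation can be sketched with a toy client and server. The `Server` and `Client` classes, the logical clock used as a modification time, and the choice to validate on every open are assumptions for illustration; a real NFS client validates only if its cached attributes are older than some interval.

```python
class Server:
    def __init__(self):
        self.files = {}   # name -> (mtime, data)
        self.clock = 0    # logical clock standing in for modification time

    def write(self, name, data):
        self.clock += 1
        self.files[name] = (self.clock, data)

    def getattr(self, name):
        return self.files[name][0]   # "has this file changed since time t?"

    def read(self, name):
        return self.files[name]


class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}   # name -> (mtime when fetched, data)

    def open_read(self, name):
        cached = self.cache.get(name)
        if cached is None or cached[0] != self.server.getattr(name):
            self.cache[name] = self.server.read(name)   # missing or stale: refetch
        return self.cache[name][1]


server = Server()
server.write("f", "v1")
client = Client(server)
first = client.open_read("f")
server.write("f", "v2")            # updated elsewhere in the network
second = client.open_read("f")     # timestamp mismatch forces a refetch
```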

Sun NFS Cache Consistency

• Server is stateless
• Requests are self-contained.
• Blocks are transferred and cached in memory.
• Timestamp of last known mod kept with cached file, compared with “true” timestamp at server on Open. (Good for an interval)
• Updates delayed but flushed before Close ends.

[Figure: clients and servers on a network; on open, the client compares its cached timestamp ti with the server's timestamp tj (ti == tj ?); updates flow to the server on write/close]

Cache Consistency for the Web

• Time-to-Live (TTL) fields - HTTP “expires” header

• Client polling - HTTP “if-modified-since” request headers
– polling frequency? possibly adaptive (e.g. based on age of object and assumed stability)

[Figure: clients on a LAN reach the Web server across the network through a proxy cache]

AFS Cache Consistency

• Server keeps state of all clients holding copies (copy set)
• Callbacks when cached data are about to become stale
• Large units (whole files or 64K portions)
• Updates propagated upon close
• Cache on local disk memory
• If client crashes, revalidation on recovery (lost callback possibility)

[Figure: clients c0, c1, c2 and servers on a network; the server records the copy set {c0, c1}; a close at one client triggers a callback to the other]

NQ-NFS Leases

In NQ-NFS, a client obtains a lease on the file that permits the client’s desired read/write activity.

“A lease is a ticket permitting an activity; the lease is valid until some expiration time.”

– A read-caching lease allows the client to cache clean data.
Guarantee: no other client is modifying the file.
– A write-caching lease allows the client to buffer modified data for the file.
Guarantee: no other client has the file cached.

Leases may be revoked by the server if another client requests a conflicting operation (server sends eviction notice).

Since leases expire, losing “state” of leases at server is OK.
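
The lease rules can be sketched as a small lease table. The `LeaseServer` class, its explicit `now` argument, and the lease duration are assumptions. The sketch simplifies one point: instead of sending an eviction notice, this toy server simply refuses a conflicting request until the existing leases expire, which is also why losing the table is harmless once the lease period has passed.

```python
class LeaseServer:
    def __init__(self, duration):
        self.duration = duration
        self.leases = {}   # file name -> list of (client, mode, expiry)

    def _live(self, name, now):
        """Drop expired leases and return the ones still valid at time `now`."""
        live = [l for l in self.leases.get(name, []) if l[2] > now]
        self.leases[name] = live
        return live

    def acquire(self, name, client, mode, now):
        """mode is "read" or "write"; returns the expiry time, or None on conflict."""
        for holder, held_mode, _ in self._live(name, now):
            if holder != client and "write" in (mode, held_mode):
                return None            # readers may share; a writer must be alone
        expiry = now + self.duration
        self.leases[name].append((client, mode, expiry))
        return expiry


srv = LeaseServer(duration=30)
r0 = srv.acquire("f", "c0", "read", now=0)     # read lease until t=30
r1 = srv.acquire("f", "c1", "read", now=5)     # shared read lease until t=35
w2 = srv.acquire("f", "c2", "write", now=10)   # conflicts with live read leases
w3 = srv.acquire("f", "c2", "write", now=40)   # all earlier leases have expired
```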

Coda – Using Caching to Handle Disconnected Access

• Single location-transparent UNIX FS.
• Scalability - coarse granularity (whole-file caching, volume management)
• First class (server) replication and client caching (second class replication)
• Optimistic replication & consistency maintenance.
• Designed for disconnected operation for mobile computing clients

Explicit First-class Replication

• File name maps to set of replicas, one of which will be used to satisfy request
– Goal: availability
• Update strategy
– Atomic updates - all or none
– Primary copy approach
– Voting schemes
– Optimistic, then detection of conflicts

Optimistic vs. Pessimistic

• Optimistic: high availability. Conflicting updates are the potential problem, requiring detection and resolution.
• Pessimistic: avoids conflicts by holding shared or exclusive locks.
– How to arrange when disconnection is involuntary?
– Leases [Gray, SOSP89] put a time bound on locks, but what about expiration?

Multiple Copy Schemes

• Primary Copy
– Of all copies, there is one which is “primary” to which updates are sent. Secondary copies eventually get updates. Reads can go to any copy. How to take over when primary gone?
• Voting
– An operation must acquire locks on some subset of copies (overlapping read or write quorums)
• Optimistic
– Act as though no conflicts; when possible, compare replicas for conflicts (version vector) and resolve.
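
Comparing replicas with version vectors can be sketched as below. Each vector maps a server name to the number of updates that replica has seen from that server; the function name `compare` and its string results are hypothetical. Two replicas conflict when each has seen an update the other has not.

```python
def compare(vv_a, vv_b):
    """Return "equal", "a_dominates", "b_dominates", or "conflict"."""
    servers = set(vv_a) | set(vv_b)
    a_ahead = any(vv_a.get(s, 0) > vv_b.get(s, 0) for s in servers)
    b_ahead = any(vv_b.get(s, 0) > vv_a.get(s, 0) for s in servers)
    if a_ahead and b_ahead:
        return "conflict"       # concurrent updates: needs resolution
    if a_ahead:
        return "a_dominates"    # b is simply out of date
    if b_ahead:
        return "b_dominates"    # a is simply out of date
    return "equal"


print(compare({"s1": 2, "s2": 1}, {"s1": 1, "s2": 1}))
print(compare({"s1": 2, "s2": 1}, {"s1": 1, "s2": 2}))
```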

“Committing” a Transaction

Begin Transaction
lots of reads and writes
Commit or Abort Transaction

“Committing” a Transaction

Begin Transaction
Withdraw $1000 from savings account
Deposit $1000 to checking account
Commit or Abort Transaction

Atomic Transactions

ACID property - data is recoverable.
• Atomicity - a transaction must be all-or-nothing.
• Consistency - a transaction takes system from one consistent state to another
• Isolation - No intermediate effects are visible to others - serializability
• Durability - the effects of a committed transaction are permanent

Implementation Mechanisms

• Stable storage

• Shadow blocks

• Logging

Stable Storage

We need to be able to trust something not to be corrupted or destroyed

• Mirrored disks. Always write disk 1, verify, then write disk 2.
– If crash, compare disks, disk 1 “wins”
– If bad checksum, use other disk block.

• Battery backed up RAM

Private Workspace

• Create a shadow data structure

• On commit, make the shadow the real one.
– One pointer change to exchange indices allows this to be atomic.

Logging

• Intentions list
• Do/undo log
• Log is written to stable storage. Rollback, if abort. Completion, if commit.

[Figure: log entries with old/new values: Savings $5K/$4K, Checking $100/$1100, then Commit]
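
A do/undo log for the savings-to-checking transfer can be sketched as follows. The function `run_transaction` and its record layout (item, old value, new value) are assumptions; the balances are the ones shown on the slide. The in-memory list stands in for a log on stable storage.

```python
def run_transaction(accounts, updates, commit):
    """Apply `updates` with a do/undo log; roll back unless `commit` is true."""
    log = []
    for name, new in updates:
        log.append((name, accounts[name], new))  # log (item, old, new) before the change
        accounts[name] = new
    if commit:
        log.append("commit")                     # completion: the changes stand
    else:
        for name, old, _ in reversed(log):       # rollback: undo in reverse order
            accounts[name] = old
    return log


accounts = {"savings": 5000, "checking": 100}
transfer = [("savings", 4000), ("checking", 1100)]

run_transaction(accounts, transfer, commit=False)
after_abort = dict(accounts)

log = run_transaction(accounts, transfer, commit=True)
after_commit = dict(accounts)
```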

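2-Phase Commit

Coordinator: Write “prepare” in log
Coordinator: Send “prepare”
Worker: Write “ready” in log
Worker: Send “ready”
Coordinator: Collect all responses
Coordinator: Write “commit” in log
Coordinator: Send “commit”
Worker: Write “commit” in log
Worker: Do commit
Worker: Send “done”
Coordinator: Collect all responses
Coordinator: Done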
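
The message sequence can be sketched with direct method calls standing in for network messages. The `Worker` class, the `two_phase_commit` function, and the abort path (a worker voting "abort") are assumptions added for illustration; the slide shows only the all-ready case. Timeouts and crash recovery, which are the hard part of the real protocol, are not modeled.

```python
class Worker:
    def __init__(self, can_commit=True):
        self.can_commit = can_commit
        self.log = []

    def prepare(self):
        vote = "ready" if self.can_commit else "abort"
        self.log.append(vote)      # the vote is logged before it is sent
        return vote

    def finish(self, decision):
        self.log.append(decision)  # log the outcome, then apply it
        return "done"


def two_phase_commit(workers):
    log = ["prepare"]
    votes = [w.prepare() for w in workers]                 # phase 1: collect votes
    decision = "commit" if all(v == "ready" for v in votes) else "abort"
    log.append(decision)                                   # the decision is logged before sending
    acks = [w.finish(decision) for w in workers]           # phase 2: announce the decision
    if all(a == "done" for a in acks):
        log.append("done")
    return decision, log


decision, coordinator_log = two_phase_commit([Worker(), Worker()])
reluctant = Worker(can_commit=False)
decision2, _ = two_phase_commit([Worker(), reluctant])
```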

Client-cache State Transitions

[Figure: state diagram with states hoarding, emulation, and reintegration; transition labels: disconnection, physical reconnection, logical reconnection]

Prefetching

• To avoid the access latency of moving the data in for that first cache miss.

• Prediction! “Guessing” what data will be needed in the future.
– It’s not for free: consequences of guessing wrong, overhead

Hoarding - Prefetching for Disconnected Information Access

• Caching for availability (not just latency)
• Cache misses, when operating disconnected, have no redeeming value. (Unlike in connected mode, they can’t be used as the triggering mechanism for filling the cache.)
• How to preload the cache for subsequent disconnection? Planned or unplanned.
• What does it mean for replacement?

Hoard Database

• Per-workstation, per-user set of pathnames with priority

• User can explicitly tailor HDB using scripts called hoard profiles

• Delimited observations of reference behavior (snapshot spying with bookends)

Coda Hoarding State

• Balancing act - caching for 2 purposes at once:
– performance of current accesses,
– availability of future disconnected access.
• Prioritized algorithm - Priority of object for retention in cache is f(hoard priority, recent usage).
• Hoard walking (periodically or on request) maintains equilibrium - no uncached object has higher priority than any of cached objects
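
The equilibrium condition can be sketched with a made-up priority function. The slide gives only f(hoard priority, recent usage); the weighted sum, the `alpha` weight, and the assumption that every object occupies one cache slot are illustrative choices, not Coda's actual formula.

```python
def priority(hoard_priority, recency, alpha=0.75):
    """Hypothetical f(): a weighted sum of hoard priority and recent usage."""
    return alpha * hoard_priority + (1 - alpha) * recency


def hoard_walk(objects, capacity):
    """objects: name -> (hoard_priority, recency). Returns the set to keep cached.

    Keeping the `capacity` highest-priority objects restores equilibrium:
    no uncached object outranks any cached one.
    """
    ranked = sorted(objects, key=lambda n: priority(*objects[n]), reverse=True)
    return set(ranked[:capacity])


objects = {"a": (100, 0), "b": (0, 10), "c": (50, 50)}
cached = hoard_walk(objects, capacity=2)
```

Here "a" is kept because the user hoarded it even though it has not been used recently, while "b" is evicted despite recent use.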

The Hoard Walk

• Hoard walk - phase 1 - reevaluate name bindings (e.g., any new children created by other clients?)

• Hoard walk - phase 2 - recalculate priorities in cache and in HDB, evict and fetch to restore equilibrium

Hierarchical Cache Mgt

• Ancestors of a cached object must be cached in order to resolve pathname.

• Directories with cached children are assigned infinite priority

Callbacks During Hoarding

• Traditional callbacks - invalidate object and refetch on demand

• With threat of disconnection
– Purge files and refetch on demand or hoard walk
– Directories - mark as stale and fix on reference or hoard walk, available until then just in case.

Emulation State

• Pseudo-server, subject to validation upon reconnection

• Cache management by priority
– modified objects assigned infinite priority
– freeing up disk space - compression, replacement to floppy, backout updates

• Replay log also occupies non-volatile storage (RVM - recoverable virtual memory)

Client-cache State Transitions with Weak Connectivity

[State diagram. States: hoarding, emulation, write disconnected. Transition labels: disconnection, physical reconnection, weak connection, strong connection, disconnection.]
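The three states and five transition labels can be written down as a small table-driven state machine. The pairing of each label with a source and destination state is an assumption here: it follows the usual description of Coda's weakly connected operation, since the arrows themselves are not recoverable from the slide text. The names `TRANSITIONS` and `step` are hypothetical.

```python
# (current state, event) -> next state; arrow assignments are assumed
TRANSITIONS = {
    ("hoarding", "disconnection"): "emulation",
    ("hoarding", "weak connection"): "write disconnected",
    ("emulation", "physical reconnection"): "write disconnected",
    ("write disconnected", "strong connection"): "hoarding",
    ("write disconnected", "disconnection"): "emulation",
}

def step(state, event):
    """Return the next client-cache state; unknown events leave it unchanged."""
    return TRANSITIONS.get((state, event), state)
```

Under this reading, a client that reconnects never jumps straight from emulation to hoarding: it passes through write disconnected and only returns to hoarding once the connection is strong.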


Cache Misses with Weak Connectivity

• At least now it’s possible to service misses, but it costs $$$ and it’s a foreground activity (noticeable impact). Maybe not.

• User patience threshold - estimated service time compared with what is acceptable

• Defer misses by adding to HDB and letting hoard walk deal with it

• User interaction during hoard walk.
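The patience-threshold test can be sketched as below. The slide only says that the estimated service time is compared with what is acceptable; estimating that time as object size divided by link bandwidth, and using a single fixed threshold, are simplifying assumptions, and `handle_miss` is a hypothetical name.

```python
def handle_miss(size_bytes, bandwidth_bps, patience_s):
    """Decide how to treat a cache miss on a weak connection.

    Returns "fetch" to service the miss in the foreground, or "defer"
    to add the object to the HDB and let a later hoard walk fetch it.
    """
    estimated_s = size_bytes * 8 / bandwidth_bps  # assumed service-time model
    return "fetch" if estimated_s <= patience_s else "defer"
```

Over a 9600 bit/s link with a 10 s threshold, a 10 KB file (about 8.3 s) is fetched immediately, while a 1 MB file (over 800 s) is deferred.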


File System Energy Management


Spin-down Disk Model

[State diagram. States: Not Spinning, Spinning up, Spinning & Ready, Spinning & Seek, Spinning & Access, Spinning down. Transition labels: Request; Trigger: request or predict; Predictive; Inactivity Timeout threshold*.]


Spin-down Disk Model

[State diagram. States: Not Spinning, Spinning up, Spinning & Ready, Spinning & Seek, Spinning & Access, Spinning down. Transition labels: Request; Trigger: request or predict; Predictive; Inactivity Timeout threshold*.]

[Timeline annotations: $T_{idle}$, $T_{out}$, $T_{down}$; power levels $P_{spin}$, $P_{down}$; ~1-3 s delay; $E_{transition} = P_{transition} \times T_{transition}$ for each transition.]


Reducing Energy Consumption

$Energy = \sum_{i \in \text{power states}} Power_i \times Time_i$

To reduce energy used for task:
– Reduce power cost of power state i through better technology.
– Reduce time spent in the higher cost power states.
– Amortize transition states (spinning up or down) if significant.

$P_{down} T_{down} + 2 E_{transition} + P_{spin} T_{out} < P_{spin} T_{idle}$

$T_{down} = T_{idle} - (T_{transition} + T_{out})$
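The inequality can be checked numerically for a given idle period. The sketch below is a direct transcription of the two formulas on this slide; the function name and argument names are hypothetical, and treating an idle period that ends before the disk has finished spinning down as "no saving" is an added assumption.

```python
def spin_down_saves_energy(t_idle, t_out, p_spin, p_down,
                           p_transition, t_transition):
    """True if spinning down after t_out seconds of inactivity uses less
    energy than staying spun up for the whole idle period t_idle."""
    e_transition = p_transition * t_transition
    t_down = t_idle - (t_transition + t_out)
    if t_down < 0:
        # Assumed: the next request arrives before spin-down completes.
        return False
    spun_down_cost = p_down * t_down + 2 * e_transition + p_spin * t_out
    return spun_down_cost < p_spin * t_idle
```

As a usage example, take the TravelStar figures from the Power Specs slide (1.8 W spinning, 0.25 W standby) and assume, for illustration only, that both transitions draw the 4.7 W startup power for 2 s each. With a 10 s timeout, a 60 s idle period costs 48.8 J with spin-down against 108 J without, so spinning down wins; a 15 s idle period costs 37.55 J against 27 J, so it does not.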


Power Specs

IBM Microdrive (1 inch)
• writing 300 mA (3.3 V), 1 W
• standby 65 mA (3.3 V), .2 W

IBM TravelStar (2.5 inch)
• read/write 2 W
• spinning 1.8 W
• low power idle .65 W
• standby .25 W
• sleep .1 W
• startup 4.7 W
• seek 2.3 W


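Spin-down Disk Model

[State diagram. States: Not Spinning, Spinning up, Spinning & Ready, Spinning & Seek, Spinning & Access, Spinning down. Transition labels: Request; Trigger: request or predict; Predictive. Power annotations on the states: .2 W, .65-1.8 W, 2 W, 2.3 W, 4.7 W.]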


Spin-Down Policies

• Fixed Thresholds
– $T_{out}$ = spin-down cost s.t. $2 E_{transition} = P_{spin} T_{out}$

• Adaptive Thresholds: $T_{out}$ = f (recent accesses)
– Exploit burstiness in $T_{idle}$

• Minimizing Bumps (user annoyance/latency)
– Predictive spin-ups

• Changing access patterns (making burstiness)
– Caching
– Prefetching
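The first two policies can be sketched as follows. `fixed_threshold` solves the fixed-threshold condition on this slide for the timeout. `adapt_threshold` is one hypothetical choice of f (recent accesses): the halving and doubling rule, the bounds, and all names are assumptions for illustration, not a policy taken from the lecture.

```python
def fixed_threshold(e_transition, p_spin):
    """Timeout at which the energy spent waiting equals the cost of
    one spin-down plus one spin-up: 2 * E_transition = P_spin * T_out."""
    return 2 * e_transition / p_spin

def adapt_threshold(t_out, last_idle, break_even, lo=1.0, hi=60.0):
    """Hypothetical adaptive rule driven by the most recent idle period."""
    if last_idle > t_out + break_even:
        t_out *= 0.5  # spin-down clearly paid off: spin down sooner next time
    elif last_idle > t_out:
        t_out *= 2.0  # spun down but was woken too soon: be more patient
    return min(hi, max(lo, t_out))
```

With a transition energy of 9.4 J and 1.8 W spinning power, the fixed threshold comes to roughly 10.4 s. The adaptive rule leaves the threshold alone whenever the idle period ended before the timeout fired, since no spin-down happened in that case.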


Disk Idle Times for File System Access Behavior

• File access patterns don’t offer long enough idle times to exploit disk spindown

• Larger buffer cache alone does not save energy without policy changes


Burstiness as a Goal

• Increase idle interval length to allow power state transitions

• Operate at max disk bandwidth when disk is active

• Decrease number of transitions


Modified Caching & Prefetching

“Bursty” system by Papathanasiou & Scott

• Hinted prefetching
– Syscall for apps to provide hints

• Periodic updates more bursty
– Longer window
– Writeback on close

• Coordinate access rates of multiple applications
– Run out of data together

• Anticipatory Spinup

• Danger of > congestion


Disk Power Results


Application Assistance

• Hints on prefetching

• Coop I/O [Weissel, Beutel, Bellosa]
– File read and write ops that specify willingness to wait for timeout length of time
   • If disk spun down, wait for other ops to be issued to be grouped together
   • If disk spinning, do immediately

• ECOSystem
– Negotiation based on Currentcy
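The Coop I/O rule described above (defer while the disk is spun down, run immediately while it is spinning) can be modelled with a small queue. This is a toy model under stated assumptions, not the actual Coop-I/O interface: the `CoopDisk` class, its methods, and the explicit `now` argument standing in for a clock are all hypothetical.

```python
class CoopDisk:
    """Toy model of deferrable I/O on a disk that may be spun down."""

    def __init__(self):
        self.spinning = False
        self.pending = []  # list of (operation, deadline)

    def submit(self, op, now, timeout):
        """Issue an op willing to wait up to `timeout` seconds.
        Returns the list of ops that run right now."""
        if self.spinning:
            return [op]  # disk is already up: do it immediately
        self.pending.append((op, now + timeout))
        return self.flush(now)

    def flush(self, now):
        """Once any deadline has passed, spin up and run the whole batch."""
        if self.pending and min(d for _, d in self.pending) <= now:
            self.spinning = True
            batch = [op for op, _ in self.pending]
            self.pending = []
            return batch
        return []
```

A read submitted at time 0 with a 5 s timeout and a write submitted at time 2 with a 10 s timeout are both held back; at time 5 the earliest deadline expires, the disk spins up once, and both run together.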