Computer Science 213 © 2006 Donald Acton
The “Plan”
• Build a model of
  – A computer system
  – How the CPU works
  – Memory
  – What the software sees
• Explore our model to see how to go from hardware to the view application software has
How is software organized to tie the hardware together?
[Figure: hardware organization of a typical system. The CPU (register file, PC, ALU, bus interface) connects over the system bus to the I/O bridge, which connects over the memory bus to main memory and over the I/O bus to the USB controller (mouse, keyboard), graphics adapter (display), disk controller (disk), and expansion slots for other devices such as network adapters. From Computer Systems: A Programmer’s Perspective]
Types of Software
There are 2 broad classes of software:
  – Operating, or systems, software
  – Application software
The operating system code is often referred to as the kernel
Layering
[Figure: layered view of a computer system. Software: application programs above the operating system. Hardware: processor, main memory, I/O devices. From Computer Systems: A Programmer’s Perspective]
Role of the OS
• Provide services commonly used by applications
• Arbitrate requests for resources
• Resources are anything managed by the OS:
  – Memory
  – Devices
  – CPU cycles
Role of the OS cont’d…
• Protect applications from each other
• Create a virtual machine to program to (i.e. hide the hardware details as much as possible from the application programmer)
Virtual Machine
• An idealized version of what a computer should look like to an application
• The OS realizes the virtual machine
• System calls are the access points to the virtual machine
• A collection of system calls for a particular class of services (e.g. file system manipulation) forms an Application Programming Interface (API)
Typical Virtual Machine Services
• Processes (abstraction of the processor)
  – System calls to create, manage, and manipulate processes
• Virtual memory (abstraction of physical memory)
  – System calls to allocate and free memory
• Files (abstraction of I/O devices)
  – System calls to create, open, close, read, and write devices
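As a rough illustration of the file services listed above, the sketch below uses the POSIX calls open(), write(), read(), close(), and unlink() to write a short message to a scratch file and read it back. The helper name write_then_read and the idea of a scratch file are assumptions made for this example, not part of the slides.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write msg to path, read it back into buf, and
 * return the number of bytes read, or -1 on any error. */
ssize_t write_then_read(const char *path, const char *msg,
                        char *buf, size_t buflen)
{
    size_t len = strlen(msg);

    /* open() with O_CREAT creates the file if it does not exist. */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    ssize_t written = write(fd, msg, len);
    close(fd);
    if (written != (ssize_t)len)
        return -1;

    /* Reopen the same file for reading and fetch the bytes back. */
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t got = read(fd, buf, buflen);
    close(fd);

    unlink(path);   /* remove the scratch file */
    return got;
}
```

The application never names a sector, a controller, or a bus; it only sees a file descriptor and a sequence of bytes.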
Layering - Again
[Figure: layered view of a computer system. Software: application programs above the operating system. Hardware: processor, main memory, I/O devices. From Computer Systems: A Programmer’s Perspective]
Virtual Machine Advantages
• Applications don’t have to know how the hardware really works – interacting with hardware is UGLY
• If new hardware is added only the virtual machine software needs to be updated
Disadvantages
• The extra software makes manipulating the hardware slower
• It may not be possible to access all the features of the hardware
System Calls
• They are like subroutine/function calls from the application to the OS
• Collections of system calls form an API
• Example APIs
  – win32 API (Microsoft Windows)
  – Unix sockets (networking)
  – POSIX (a collection of “UNIX” APIs)
POSIX
• Portable Operating System Interface
• Basically yet another attempt (mid 80s) to unify the collection of UNIX OSes
• POSIX sub-standards:
  – POSIX.1 – system interfaces and headers
  – POSIX.1b – realtime extensions
  – POSIX.1c – threads
  – POSIX.2 – shell and utilities
System Calls
[Figure: layers from top to bottom: application program; operating system; processor, main memory, I/O devices]
Another System Call View
[Figure: timeline of a system call. Control passes from application code to OS code, which interacts with the hardware, and then from OS code back to application code. Adapted from: Computer Systems: A Programmer’s Perspective]
Typical System Calls
• System calls are the only way to access shared resources or request a service
• Example Unix system calls:
  – fork() – create a new process that is an exact copy of the current process
  – execve() – replace the current running process image with a new one
  – exit() – causes the process to terminate after cleaning up
Example System Calls cont’d
  – creat() – create a new file
  – getrlimit() – get the process’s resource limits
  – mkdir() – make a directory
  – open() – open a file
  – stime() – set system time and date
  – wait() – wait for a child process to exit
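To make the process-related calls concrete, here is a minimal sketch in which a parent creates a child with fork(), the child terminates with _exit(), and the parent collects the exit status. The helper name run_child is an assumption for illustration, and waitpid() is used here in place of the plain wait() named in the list so that the parent waits for this specific child.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that terminates with the given
 * status, wait for it, and return the status the parent observed
 * (-1 on error). */
int run_child(int status)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;            /* fork failed */
    if (pid == 0)
        _exit(status);        /* child: terminate immediately */

    /* Parent: block until the child has exited. */
    int wstatus;
    if (waitpid(pid, &wstatus, 0) < 0)
        return -1;
    return WIFEXITED(wstatus) ? WEXITSTATUS(wstatus) : -1;
}
```

Calling run_child(7) should return 7 in the parent, since the child's exit status travels back through the kernel.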
Waiting for Hardware
• What should the OS do when it interacts with hardware and there is a delay?
• Hardware is fast, right, so shouldn’t the CPU just wait?
Consider a Disk Access
• Typical read of a disk takes 10ms
• How many instructions could a 3.2GHz P4 execute in that time?
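One way to work the question out is to count clock cycles. The sketch below does that with integer arithmetic; the function name cycles_elapsed is hypothetical, and treating one cycle as roughly one instruction is an assumption, not something the slides state.

```c
#include <stdint.h>

/* Clock cycles that elapse in `ms` milliseconds on a CPU running at
 * `hz` cycles per second. */
uint64_t cycles_elapsed(uint64_t hz, uint64_t ms)
{
    return hz / 1000 * ms;    /* cycles per millisecond times milliseconds */
}
```

With hz = 3,200,000,000 and ms = 10 this gives 32,000,000 cycles, so on the order of tens of millions of instructions would be lost if the CPU simply waited.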
How to Avoid Wasting Cycles
• Run another process
• This requires the OS to remember:
– All the processes and what they are doing so a new one can be selected
– What system call (i.e. process) is associated with a hardware request
– What hardware requests are outstanding so the requests can be checked
How Is this Done?
• Process – the OS abstraction of a running program (application)
• Context – all the state information needed to run a process
• Context Switch – suspending the currently running process and resuming another
Content of a Context
• PC – indicates where the process is executing
• Registers – capture the current state of the CPU
• Memory – defines the code and data the process is working with
• System Resources – assigned to process by OS
Resources
• Open files
• Devices
• Buffers to hold I/O data
• Held locks/semaphores
Performing Context Switch
[Figure: timeline of Process A and Process B alternating between user code and kernel code. A system call moves execution into kernel code, which switches to the other process; a later system call or interrupt enters kernel code again, followed by the return from the system call. Adapted from: Computer Systems: A Programmer’s Perspective]
Process Scheduling
• Scheduling is the act of “saving” the context of the running process and selecting a new process to run
• Can only happen when kernel is executing
• Kernel will execute when:
  1. Application makes a system call
  2. An interrupt occurs
Scheduling
[Figure: timeline in which execution alternates between processes A and B. The system perspective shows user and kernel execution; application A’s perspective shows A as either active or inactive. Adapted from: Computer Systems: A Programmer’s Perspective]
Scheduling Policies
• The rules the kernel uses to select a process to run
• Most common is Round Robin
  – Process is running
  – Clock interrupt occurs
  – Kernel stops current process and places it on end of queue of ready processes
  – Selects next process in queue to start
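The round-robin steps above can be sketched with a small circular queue of process IDs. Everything here (the struct, the fixed queue size, the function names) is a hypothetical illustration rather than how any particular kernel does it, and a real kernel would save and restore full contexts instead of passing integers around.

```c
#include <stddef.h>

#define MAX_READY 8

/* A fixed-size circular queue of process IDs waiting for the CPU. */
struct ready_queue {
    int pids[MAX_READY];
    size_t head;    /* index of the next process to run */
    size_t count;   /* number of queued processes */
};

/* Place pid at the end of the queue; returns 0 on success, -1 if full. */
int rq_push(struct ready_queue *q, int pid)
{
    if (q->count == MAX_READY)
        return -1;
    q->pids[(q->head + q->count) % MAX_READY] = pid;
    q->count++;
    return 0;
}

/* Remove and return the pid at the front, or -1 if the queue is empty. */
int rq_pop(struct ready_queue *q)
{
    if (q->count == 0)
        return -1;
    int pid = q->pids[q->head];
    q->head = (q->head + 1) % MAX_READY;
    q->count--;
    return pid;
}

/* Clock interrupt: the running process goes to the end of the ready
 * queue and the process at the front is selected to run next. */
int on_clock_tick(struct ready_queue *q, int running)
{
    if (q->count == 0)
        return running;       /* nothing else is ready */
    int next = rq_pop(q);
    rq_push(q, running);      /* cannot fail: the pop just freed a slot */
    return next;
}
```

With process 1 running and processes 2 and 3 queued, three successive ticks select 2, then 3, then 1 again.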
Process States
• Using the CPU
• Waiting for the CPU
• Waiting for the kernel to finish providing a service
• The process could be finished
• The process could be newly created
Process State Diagram
• Consider a snapshot of system at a given time
• States are all the legal states any process could be in
• Transitions are the arcs describing all the legal states a process could immediately transition to
Process State Diagram
[Figure: process state diagram with the states New, Ready, Running, Blocked, and Dead]
Role of main()
• main() is just a well known function
• When linking a program think of the linker as having an existing program that always needs to call main()
• You supply the main()
• The code before main() is setup code
• The code executed after main() returns performs cleanup
Calling main()
[Figure: process startup code calls the user-supplied main() code; after main() returns, process cleanup code runs]
Process Termination
• A process ends if it
  – Returns from main()
  – Calls exit() or _exit() explicitly
• Returning from main() results in exit() being called
Just what does the OS do?
• We have:
  – A layering model from the hardware to the application
  – Basic model of CPU and a computer system
• We know about applications
• We know how hardware signals the CPU
• Remaining issue:
  – How does the OS move information from the hardware to the application and back?
From Disks to Applications
• Disk hardware model
• OS view of a disk
  – File system organization on disk
• Application view of a file
  – OS supplied disk services
• Sharing files
• Transactions
Layering Yet Again!
[Figure: two layer stacks side by side. General layering structure: application programs, operating system, hardware. File system layering: application, Unix I/O, file system, disk drive]
Disk Construction
• Platters – the actual recording medium; there are two recording surfaces per platter
• Surface is divided into tracks
• Each track is divided into sectors
• A sector is the smallest amount of data that can be written/read to/from a disk
Surface Layout
[Figure: a disk surface around the spindle, divided into concentric tracks (track k highlighted); each track is divided into sectors separated by gaps. Adapted from: Computer Systems: A Programmer’s Perspective]
Platter View
[Figure: three platters (platter 0 to platter 2) on a common spindle, giving six surfaces (surface 0 to surface 5); cylinder k is marked across the surfaces. Adapted from: Computer Systems: A Programmer’s Perspective]
Head Crash
http://www.bolhuijo.com/gallery/diskdrive/aaa
Performance Measurement
• Seek time – the time to position the head over the appropriate track
• Rotational latency – the length of time it takes for the desired spot on the disk to rotate under the head
• Transfer time – the amount of time to read the data from the sector once the reading has begun
Calculating Average Access Time
7200 RPM, Tavg seek 10ms, 600 sectors/track
Tavg rotation
Tavg transfer
Taccess
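The three quantities can be computed from the given parameters. The sketch below assumes the usual textbook definitions: average rotational latency is half of one full rotation, and average transfer time is the time for one sector to pass under the head. The function names are hypothetical.

```c
/* Average rotational latency in ms: half of one full rotation. */
double avg_rotation_ms(double rpm)
{
    return 0.5 * (60.0 / rpm) * 1000.0;
}

/* Average time in ms for one sector to pass under the head. */
double avg_transfer_ms(double rpm, double sectors_per_track)
{
    return (60.0 / rpm) / sectors_per_track * 1000.0;
}

/* Average access time in ms: seek + rotation + transfer. */
double avg_access_ms(double seek_ms, double rpm, double sectors_per_track)
{
    return seek_ms + avg_rotation_ms(rpm)
                   + avg_transfer_ms(rpm, sectors_per_track);
}
```

For 7200 RPM, a 10 ms average seek, and 600 sectors/track this works out to roughly 4.17 ms of rotation, about 0.014 ms of transfer, and about 14.2 ms in total, so the seek and rotation terms dominate.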
Scheduling Disk Transfers
• FCFS – first come first served
• SCAN
• SSTF – shortest seek time first
Fairest: FCFS. Highest throughput: SSTF.
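To see why the policies trade fairness against throughput, the sketch below compares total head movement under FCFS and SSTF for one request queue. The function names, the 64-request limit, and the sample queue in the note that follows are assumptions chosen for illustration; SCAN is not modeled here.

```c
#include <stdlib.h>

/* Total head movement (in tracks) when requests are served in
 * arrival order. */
long fcfs_distance(int start, const int *req, size_t n)
{
    long total = 0;
    int pos = start;
    for (size_t i = 0; i < n; i++) {
        total += labs((long)req[i] - pos);
        pos = req[i];
    }
    return total;
}

/* Total head movement when the closest pending request is always
 * served next. Handles at most 64 requests; returns -1 otherwise. */
long sstf_distance(int start, const int *req, size_t n)
{
    int done[64] = {0};
    if (n > 64)
        return -1;

    long total = 0;
    int pos = start;
    for (size_t served = 0; served < n; served++) {
        size_t best = n;      /* n means "no candidate yet" */
        for (size_t i = 0; i < n; i++) {
            if (done[i])
                continue;
            if (best == n ||
                labs((long)req[i] - pos) < labs((long)req[best] - pos))
                best = i;
        }
        done[best] = 1;
        total += labs((long)req[best] - pos);
        pos = req[best];
    }
    return total;
}
```

For a head starting at track 53 and requests for tracks 98, 183, 37, 122, 14, 124, 65, 67, FCFS moves the head 640 tracks while SSTF moves it 236. SSTF does less seeking, but a request far from the head can be postponed repeatedly, which is the fairness cost.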
Data Transfer: (Non DMA)
[Figure: system diagram (CPU chip with register file, ALU, and bus interface; main memory; I/O bus with USB controller for mouse and keyboard, graphics adapter for monitor, and disk controller for disk) showing the data paths for a write (out) and a read (in) without DMA. Adapted from: Computer Systems: A Programmer’s Perspective]
Evolution to DMA
• Why should the CPU be involved in touching all the data?
  – Compared to CPU memory access speeds, going to the device memory is very time consuming
• Direct Memory Access (DMA) allows the device to access:
  – Main memory
  – Another device on the bus
• DMA can cause bus contention (more to come)
Controlling the Disk (DMA)
[Figure: system diagram (CPU chip with register file, ALU, and bus interface; main memory; I/O bus with USB controller, graphics adapter, and disk controller) showing a DMA disk transfer and the interrupt sent to the CPU. Adapted from: Computer Systems: A Programmer’s Perspective]
CPU – Disk Interaction
• CPU writes commands to instruct disk controller which sector to read or write
• Disk controller positions head and initiates transfer, typically using DMA
• DMA – Direct Memory Access allows disk controller to transfer data directly to or from main memory
• Controller signals CPU via interrupt (through PIC) that operation is complete
• If read, CPU can get result directly from main memory
Bus Contention
• While OS is waiting for one device to finish it could instruct additional devices to perform operations
• Could result in 2 or more devices attempting to access bus at the same time
Contention
[Figure: system diagram (CPU chip with register file, ALU, and bus interface; main memory; I/O bus with USB controller, graphics adapter, and disk controller) illustrating devices contending for the bus. Adapted from: Computer Systems: A Programmer’s Perspective]
Are All Sectors the Same?
[Figure: a disk surface around the spindle]
Things to Consider
• The circumference of the outer track is greater than the inner track therefore:
  – The disk is moving faster under the head
  – There is more surface area to a track
• Zoned bit encodings permit more sectors per track on outer tracks than inner tracks
• Some sectors are bad
• Modern disk drives abstract the track/head/sector information out and present an image of logical sectors
Disk Sectors/Track
From: http://www.storagereview.com/
[Figure: a disk with fixed sectors per track next to a disk using zoned-bit encoding with variable sectors per track]
Zones in the zoned-bit example:
  5 tracks, 16 sectors/track
  5 tracks, 14 sectors/track
  4 tracks, 12 sectors/track
  3 tracks, 11 sectors/track
  3 tracks, 9 sectors/track
Cheap vs Expensive Disks
How do cheap and expensive disks differ?
• People pay for performance
• Features that improve a disk’s performance or reliability cost more to incorporate into a disk
Disk failure
Affected by
  – Duty cycle (how much time it is involved in data transfer)
  – Temperature
  – Number of platters
  – Number of hours drive is on
  – Physical environment
• Mean time between failures
• Typical service life 3 – 5 years
Disk Life
• High initial failure rates
• Stable failure rate during the drive’s service life (3 – 5 years)
• Gradual increase in failures as components wear out
• Separate failure rates for power-on events
• Mean time between failures – average aggregate number of power-on hours of a collection of disks before observing a failure
More Disks More Headaches
• Network Appliance sells a compartment with 336 drives in it
• If the MTBF is 100,000 hours, how many drives would be expected to be replaced in a year?
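A back-of-the-envelope answer follows from the MTBF definition: the collection accumulates power-on hours, and one failure is expected per MTBF hours accumulated. The sketch assumes every drive is powered on all year (8,760 hours) and that the failure rate is constant; both are simplifying assumptions, and the function name is hypothetical.

```c
/* Expected failures per year for `drives` disks powered on all year,
 * assuming a constant rate of one failure per mtbf_hours of
 * accumulated power-on time. */
double expected_failures_per_year(double drives, double mtbf_hours)
{
    const double hours_per_year = 24.0 * 365.0;   /* 8760 */
    return drives * hours_per_year / mtbf_hours;
}
```

For 336 drives and an MTBF of 100,000 hours this gives about 29.4, so under these assumptions roughly 29 drives would be replaced per year.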
Onto the File System
• Works directly with the view of the disk provided by the controller
• Provides a device independent view of a disk
• Further levels of abstraction on file system are what the application sees
[Figure: file system layering: application, Unix I/O, file system, disk drive]
Purpose of a file system
• Provides an abstraction so that applications don’t access disk directly
• Hides complexity of different drive types
• Manages the device
• Provides structure for the organization of the data
• Protection
File System Properties
• Persistent – survive reboots
• Correct – reflect the operations performed
• Robust – must be recoverable after a crash
• Efficient – must make good use of disk space and must be fast
Unix File System Semantics
• A file is simply a sequence of m bytes
  – B_0, B_1, ..., B_k, ..., B_(m-1)
• Everything is a file
  – All I/O devices are represented as files
    • /dev/ad0s4e (/var disk partition)
    • /dev/ttyd0 (terminal)
    • /dev/rmt0 (tape drive)
  – Even the running kernel is represented as a file:
    • /dev/kmem
Unix File Types
• Regular files:
  – Binary/text files; they are all the same
• Directory file
  – It’s still a file but it contains names and locations of other files
• Special files
  – Block (disks), character (terminal)
• Interprocess communication
  – Named pipes
  – Sockets
Blocks – the basic unit
• A file is a sequence of bytes stored in a set of fixed sized blocks
• Block sizes are multiples of the sector size (e.g. 512, 1024, 4096, 8192)
• The sectors comprising a particular block are contiguous
• All blocks in a particular file system are the same size
• With respect to a file, the smallest amount of data the file system transfers to or from a disk is a block
• Blocks are the virtualization of a disk’s sectors
Needed File System Components and Services
• Must be able to:
  – Keep track of which blocks are free
  – Determine which blocks belong to a particular file
  – Maintain administrative data like file permissions, creation and modified times
  – Find the list of free blocks, root directory, and some “other stuff” when the system is started
The Primitive Pieces
• Super block – helps locate everything
• Inode – maintains the file’s administrative data, tracks the file’s data blocks
• Data blocks – multipurpose blocks used for
  – File’s data blocks
  – Indirect blocks
File System Data Structures
• Inode list – list of free inodes (inodes are just numbers)
• Inode map – given an inode can locate the disk block the inode is in
• Free block list
Unix Inode
[Figure: layout of a Unix inode and the blocks it references. Inode fields: type/mode, link count, file size, UID, GID, various modification, access and creation times, direct data blocks 0 to 9, single indirect block, double indirect block, triple indirect block, additional bookkeeping information. Each indirect block has entries 0 to 127; direct entries and single indirect entries lead to data blocks, while double and triple indirect entries lead through further indirect blocks. The indirect block is exactly the size of a block.]
(Some) Inode Parts
• Type/Mode
  – Type (regular file, directory, device, etc)
  – Permissions
    • User, group, other - 3 bits each (e.g. rwxr-x--x, rw-r-----)
• Link count
  – How many directory entries this file has
• UID
  – User ID number – identifies owner of file
• GID
  – Group ID number – identifies group of file
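The nine permission bits map directly onto the familiar rwx strings. The sketch below shows one way to do that mapping; the function name perm_string is hypothetical, and the bit layout assumed is the conventional one (user bits highest, other bits lowest).

```c
/* Render the nine permission bits of `mode` as an "rwxrwxrwx"-style
 * string in `out`, which must have room for 10 bytes. */
void perm_string(unsigned mode, char out[10])
{
    const char flags[] = "rwxrwxrwx";
    for (int i = 0; i < 9; i++) {
        /* Bit 8 is user-read, bit 0 is other-execute. */
        out[i] = (mode & (1u << (8 - i))) ? flags[i] : '-';
    }
    out[9] = '\0';
}
```

With this layout, octal mode 0751 renders as rwxr-x--x and 0640 as rw-r-----, matching the two examples on the slide.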
How big is the largest file that could be created in the original Unix filesystem?
Compute the number of blocks in the file
                   Blocks   Blocks
Direct Blocks      10       10
Single Indirect    128      128
Double Indirect    128^2
Triple Indirect    128^3
Total
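The blank entries can be filled in by summing the four terms. The sketch below assumes 10 direct pointers, 128 entries per indirect block, and 512-byte blocks, which are the figures used in the address-range calculations elsewhere in these slides; the macro and function names are hypothetical.

```c
#include <stdint.h>

#define BLOCK_SIZE      512u   /* bytes per block */
#define PTRS_PER_BLOCK  128u   /* entries in one indirect block */
#define NUM_DIRECT      10u    /* direct block pointers in the inode */

/* Largest number of data blocks one inode can reference. */
uint64_t max_file_blocks(void)
{
    uint64_t p = PTRS_PER_BLOCK;
    return NUM_DIRECT + p + p * p + p * p * p;
}

/* Largest file size in bytes. */
uint64_t max_file_bytes(void)
{
    return max_file_blocks() * BLOCK_SIZE;
}
```

Under these assumptions the total is 10 + 128 + 16,384 + 2,097,152 = 2,113,674 blocks, or 1,082,201,088 bytes, which is a little over 1 GB.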
Block Address Ranges
• Direct
  – 0 to 5,119 (10 x 512 – 1)
• Single indirect
  – 5,120 to 70,655 (5,120 + 128 x 512 – 1)
• Double indirect
  – 70,656 to 8,459,263 (70,656 + 128^2 x 512 – 1)
• Triple indirect
  – 8,459,264 to 1,082,201,087 (8,459,264 + 128^3 x 512 – 1)
Locate byte 1,000,000,033
• Convert to blocks
  – 1,000,000,033 / 512 = 1,953,125
  – Remainder of 33 is offset into the block
  – Based on result determine if direct access, single, double, or triple indirect
  – This case is triple indirect
  – Determine which of the 128^3 blocks it is
    • 1,953,125 – (10 + 128 + 128^2) = 1,936,603
Continuing the hunt
• 1,936,603 = 0x1D8CDB
     1    D    8    C    D    B
  0001 1101 1000 1100 1101 1011
• Observe that 128 is 2^7 so if we look at the above in binary and then group 7 bits at a time we can quickly get the indexes to the needed indirect blocks
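The whole lookup can be sketched as a function that classifies a byte offset and extracts the 7-bit indexes. The struct, enum, and function names are hypothetical; the constants (10 direct blocks, 128 entries per indirect block, 512-byte blocks) are the ones used in the calculation above.

```c
#include <stdint.h>

enum level { DIRECT, SINGLE, DOUBLE, TRIPLE, TOO_BIG };

struct location {
    enum level level;
    uint32_t idx[3];      /* indexes into successive blocks, outermost first */
    uint32_t offset;      /* byte offset within the data block */
};

/* Map a byte offset in a file to the pointer path that reaches it. */
struct location locate(uint64_t byte)
{
    const uint64_t P = 128, BS = 512, ND = 10;
    struct location loc = { TOO_BIG, {0, 0, 0}, (uint32_t)(byte % BS) };
    uint64_t blk = byte / BS;

    if (blk < ND) {                       /* one of the direct blocks */
        loc.level = DIRECT;
        loc.idx[0] = (uint32_t)blk;
        return loc;
    }
    blk -= ND;
    if (blk < P) {                        /* single indirect */
        loc.level = SINGLE;
        loc.idx[0] = (uint32_t)blk;
        return loc;
    }
    blk -= P;
    if (blk < P * P) {                    /* double indirect: two 7-bit groups */
        loc.level = DOUBLE;
        loc.idx[0] = (uint32_t)(blk >> 7);
        loc.idx[1] = (uint32_t)(blk & 127);
        return loc;
    }
    blk -= P * P;
    if (blk < P * P * P) {                /* triple indirect: three 7-bit groups */
        loc.level = TRIPLE;
        loc.idx[0] = (uint32_t)(blk >> 14);
        loc.idx[1] = (uint32_t)((blk >> 7) & 127);
        loc.idx[2] = (uint32_t)(blk & 127);
    }
    return loc;
}
```

For byte 1,000,000,033 this yields the triple indirect case with indexes 118, 25, and 91 and an offset of 33 within the data block, which matches grouping 0x1D8CDB seven bits at a time.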
FreeBSD 6.0 dinode.h
struct ufs1_dinode {
    u_int16_t    di_mode;        /* 0: IFMT, permissions; see below. */
    int16_t      di_nlink;       /* 2: File link count. */
    union {
        u_int16_t oldids[2];     /* 4: Ffs: old user and group ids. */
    } di_u;
    u_int64_t    di_size;        /* 8: File byte count. */
    int32_t      di_atime;       /* 16: Last access time. */
    int32_t      di_atimensec;   /* 20: Last access time. */
    int32_t      di_mtime;       /* 24: Last modified time. */
    int32_t      di_mtimensec;   /* 28: Last modified time. */
    int32_t      di_ctime;       /* 32: Last inode change time. */
    int32_t      di_ctimensec;   /* 36: Last inode change time. */
    ufs1_daddr_t di_db[NDADDR];  /* 40: Direct disk blocks. */
    ufs1_daddr_t di_ib[NIADDR];  /* 88: Indirect disk blocks. */
    u_int32_t    di_flags;       /* 100: Status flags (chflags). */
    int32_t      di_blocks;      /* 104: Blocks actually held. */
    int32_t      di_gen;         /* 108: Generation number. */
    u_int32_t    di_uid;         /* 112: File owner. */
    u_int32_t    di_gid;         /* 116: File group. */
    int32_t      di_spare[2];    /* 120: Reserved; currently unused */
};
Superblock
• Located in first available sector after boot sector
• Maintains global file system information such as:
  – Location of inode map and free inode list
  – Location of the inode that corresponds to the root directory
  – Location of list of free blocks
  – Block size being used
  – Sector size (anything except 512 is unusual)
  – Dirty flag
• Loss of the super block is a catastrophe
From the Factory
• As shipped the disk does not contain a file system (old disk drives didn’t even contain tracks and sectors)
• When disk is installed must:
  – Lay down tracks and sectors
  – Determine if all the sectors are good
  – Put down a superblock
  – Create an inode map and free inode list
  – Build the list of free blocks
Formatting
• Referred to as low level formatting in the Microsoft/PC world
• Laid down sectors and tracks (controller was external to drive)
• Wandering tracks
• Mapped out bad sectors
• Modern drives are shipped formatted and deal with the bad sectors for us
Bad Sectors
• Over time sectors may become unreliable
• Sectors may be unreliable initially due to manufacturing flaws
• Modern drives maintain tables mapping bad sectors to good sectors
• On older systems done manually or by running a special program (bad144)
• Example console error message:
  NOTICE: sdsk: Unrecoverable error reading SCSI disk 2 dev 1/64 (ha=0 id=1 lun=0) block=219102 Medium error: Unrecovered read error
Disklabel
• Defines Unix partitions on a disk
• A partition is almost like a logical disk within a (potentially logical) disk
• Partitions break the “disk” into logical functional regions
• Example:
  – Swap space
  – /var
  – /usr
         size    offset  fstype
  a:  2097152         0  4.2BSD
  b:  5120000   2097152  swap
  c: 61432560         0  unused  # "raw"
  d:  2560000   7217152  4.2BSD
  e:  8388608   9777152  4.2BSD
  f: 43266800  18165760  4.2BSD
newfs
• High level formatting to Microsoft/PC crowd
• Within a Unix disk partition it creates:
  – Superblock and duplicates
  – Inode list and map
  – Free block list
• Example newfs output
  /dev/ad0s2e: 4096.0MB (8388608 sectors) block size 16384, fragment size 2048
      using 23 cylinder groups of 183.77MB, 11761 blks, 23552 inodes.
  super-block backups at:
   160, 376512, … 8279904
Metadata
• Data about data
• File metadata:
  – Basically information in inode
  – Describes owner of data, file size, etc
• File system metadata:
  – Superblock information
  – Block size, inode map, block lists, etc
Directory File
• A directory is just a file that contains information to locate other files and directories
typedef struct dirent32 {
    ino32_t  d_ino;      /* "inode number" of entry */
    off32_t  d_off;      /* offset of disk directory entry */
    uint16_t d_reclen;   /* length of this record */
    char     d_name[1];  /* name of file */
} dirent32_t;
From: FreeBSD 5.2
Superblock + Inodes + data blocks and directories
[Figure: on-disk structures and how they reference each other: the superblock (inode map, free blocks, inode of ‘/’), the inode map, inodes, an indirect block, and data blocks; one of the entries shown is vmunix]
How it might look on disk
[Figure: a disk surface around the spindle showing where these structures might be placed]
Creating a file
• When a file is created it is empty
  – Locate a free inode
  – Fill in inode fields
  – Remove inode from free list
• Locate directory to add file to
  – Add a new directory entry which includes
    • Filename
    • Inode of file
Creating a File
[Figure: superblock (free inodes, inode map, free blocks, …), inode map, inodes, and data blocks. “Some Directory” contains the entries ., .., FileA, FileC, and NewFile; “Second Directory” contains ., .., and NewFile2. An inode’s link count is shown as 1 and then as 2.]
Steps in Writing a File Block
• If data goes into an existing block
  – Make changes to in-memory copy, write changed block
• Otherwise
  – Locate a free block, write data
  – Add the newly allocated block to inode or indirect block
  – May have to get a block to use as indirect block first
Writing a File Block
[Figure: superblock (free inodes, inode map, free blocks, …), inode map, inodes, data blocks, and an indirect block. “Some Directory” contains the entries ., .., FileA, FileC, and NewFile.]
Steps in Deleting a File
• Directory entry is removed and directory updated on disk
• Link count on file being removed decremented
• If link count is 0
  – Return disk blocks to free list
  – Return inode to free inode list
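The link count can be observed from user space. The sketch below uses the POSIX stat() call to read st_nlink; pairing it with link() and unlink() shows the count rising and falling as directory entries are added and removed. The helper names link_count and create_empty are hypothetical.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return the link count recorded in the inode of path, or -1 on error. */
long link_count(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    return (long)st.st_nlink;
}

/* Create an empty file at path; returns 0 on success, -1 on error. */
int create_empty(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;
    return close(fd);
}
```

A typical sequence: after create_empty("a") the count is 1; after link("a", "b") it is 2; after unlink("a") the file is still reachable as "b" with a count of 1; only after unlink("b") does the count reach 0, at which point the blocks and inode can be reclaimed.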
Deleting a File
[Figure: superblock (free inodes, inode map, free blocks, …), inode map, inodes, data blocks, and an indirect block. “Some Directory” contains the entries ., .., FileA, FileC, and NewFile. An inode’s link count is shown as 1 and then as 0.]
Order of Writing Meta Data
• If data is not written in the proper order, the file system could be inconsistent if the system crashes
• Examples:
  – Block claimed by two inodes
    • (block added to new inode before old inode updated)
  – Inode referenced from two directories when not a link
    • Inode freed and then immediately reused but directory of initial reference not updated
Meta Data Update Requirements
• Indirect blocks and inodes written synchronously upon deallocation
• Directory updates consisting of:
  – Deleting a file
  – Adding a file
  – Changing (renaming) a file
  are all done synchronously
• Causes significant file system slowdown
Ordering Operations (BAD)
[Figure: superblock (free inodes, inode map, free blocks, …), inode map, inodes, data blocks, an indirect block, and “Some Directory” (., .., FileA, FileC, NewFile), illustrating a bad ordering of metadata writes]
Ordering Operations (Good)
[Figure: superblock (free inodes, inode map, free blocks, …), inode map, inodes, data blocks, an indirect block, and “Some Directory” (., .., FileA, FileC, NewFile), illustrating a good ordering of metadata writes]
Checking a File System
• Need to determine
  – Which inodes are in use
  – Which blocks are in use
  – Which inodes and blocks are unaccounted for so that they can be recovered for use
Files - Summary
• The inode does not contain the file’s name
• A file can be associated with several different names
• When the inode’s reference count goes to 0 the resources used by the file can be reclaimed
• All the interesting information about a file (e.g. owner, permissions etc.) is contained in the inode
Actions to Read File Data
• To open a file and read data the system must:
  – Read the inode map
  – Read the inode for the file
  – Read 0 – 3 indirect blocks
  – Read the data block
• With this approach 3 – 6 reads are needed
Actions to Write File Data
• To open a file and write data the system must:
  – Read the inode map
  – Read the inode for the file
  – Read 0 – 3 indirect blocks
    • Allocate and insert 0 – 3 indirect blocks
  – Read and/or allocate the data block, change it, write it back
The Case for Caches
• Assume IDE disk with 16KB blocks
  – Transfer time for 16KB ≈ 16 x 10^-5 s
  – Average access time 10ms (really about 12.5ms)
• Time to read 16KB of data
  – 10ms * 3(6) = 30(60)ms
• Effective transfer rate
  – 16KB/30(60)ms = 533 KB/s (267KB/s)
• Manufacturer claimed transfer rate 150MB/s (SATA)
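The effective rate follows from dividing the block size by the total time spent on disk accesses. The sketch below ignores the transfer time itself, since it is tiny next to the access time; the function name is hypothetical.

```c
/* Effective transfer rate in KB/s when fetching one block costs
 * `reads` disk accesses of `access_ms` milliseconds each. */
double effective_kb_per_s(double block_kb, double access_ms, int reads)
{
    double total_s = access_ms * reads / 1000.0;
    return block_kb / total_s;
}
```

With a 16 KB block and 10 ms per access, 3 reads give about 533 KB/s and 6 reads give about 267 KB/s, which is several hundred times below the claimed 150 MB/s.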
Speeding things up
• Obvious problem is all the disk accesses
• Idea
  – Save (i.e. cache) copies of important data to avoid going to disk
  – Could cache:
    • Inode map
    • Inodes
    • Even individual disk blocks (indirect blocks would be an especially good candidate)
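As a rough sketch of the idea, the code below keeps copies of recently used disk blocks in a small direct-mapped table and counts how often it actually has to go to "disk". The structure, the slot count, and the stand-in fake_disk_read function are all assumptions made for illustration; a real file system cache would also handle writes, eviction policy, and consistency.

```c
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE  512
#define CACHE_SLOTS 16

struct slot {
    int valid;                          /* does this slot hold a block? */
    uint32_t blkno;                     /* which block it holds */
    unsigned char data[BLOCK_SIZE];
};

struct block_cache {
    struct slot slots[CACHE_SLOTS];
    unsigned long disk_reads;           /* times we had to go to "disk" */
};

/* Stand-in for a real disk read: fills the buffer with a pattern
 * derived from the block number. */
static void fake_disk_read(uint32_t blkno, unsigned char *buf)
{
    memset(buf, (int)(blkno & 0xff), BLOCK_SIZE);
}

/* Return the cached copy of block blkno, reading it from "disk" only
 * on a miss. Each block number maps to exactly one slot. */
const unsigned char *cache_get(struct block_cache *c, uint32_t blkno)
{
    struct slot *s = &c->slots[blkno % CACHE_SLOTS];
    if (!s->valid || s->blkno != blkno) {
        fake_disk_read(blkno, s->data);
        s->blkno = blkno;
        s->valid = 1;
        c->disk_reads++;
    }
    return s->data;
}
```

Starting from a zeroed cache, asking for block 5 twice costs one disk read; asking for block 21 afterwards evicts block 5 because both map to the same slot, so a later request for block 5 goes to disk again.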
Where are caches used?
[Figure: the memory hierarchy, from smaller, faster, and costlier (per byte) storage devices at the top to larger, slower, and cheaper (per byte) storage devices at the bottom. Adapted from: Computer Systems: A Programmer’s Perspective]
  L0: Registers. CPU registers hold words retrieved from cache memory.
  L1: On-chip L1 cache (SRAM). L1 cache holds cache lines retrieved from the L2 cache.
  L2: Off-chip L2 cache (SRAM). L2 cache holds cache lines retrieved from memory.
  L3: Main memory (DRAM). Main memory holds disk blocks retrieved from local disks.
  L4: Local secondary storage (local disks). Local disks hold files retrieved from disks on remote network servers.
  L5: Remote secondary storage (distributed file systems, Web servers).
The Cost of Caching
• It requires storage that could be used for something else
• If changes are made to the cached copy then the copy and entity being cached are inconsistent
• Keeping things consistent can be a complex task especially if there are multiple cached copies
Buffering vs Caching
• Buffering is used to deal with speed mismatches between a data producer and data consumer. The buffer contains the data.
• Caching is keeping a copy of data to speed up access to that data
• Some systems merge aspects of these two (e.g. a buffer of a changed inode can also be the inode’s cached copy)