FILES
• Files – data collection created by user processes
◦ Desirable properties: long-term existence, sharable between processes, structure (hierarchical)
• File system – provides means of storing data organized as files, as well as a collection of functions that can be performed on files
◦ Also maintains a set of attributes associated to files
◦ Typical operations: create, delete, open, close, read, write
◦ Objectives:
– Meet the data management needs of the user
– Guarantee that the data in the file are valid
– Minimize the potential for lost or destroyed data
– Optimize performance
– Provide I/O support for a variety of storage device types
– Provide a standardized set of I/O interface routines to user processes
– Provide I/O support as well as protection for multiple users
CS 409, FALL 2013 FILE MANAGEMENT/1
MINIMAL USER REQUIREMENTS
Each user:
• Should be able to create, delete, read, write and modify files
• May have controlled access to other users’ files
• May control what type of access is allowed to the files
• Should be able to restructure the files in a form appropriate to the problem
• Should be able to move data between files
• Should be able to back up and recover files in case of damage
• Should be able to access his or her files by name rather than by numeric identifier
FILE SYSTEM ORGANIZATION
• Device drivers: lowest level, communicate directly with the device
◦ Responsible for starting I/O operations on a device
◦ Processes the completion of an I/O request
◦ Considered to be part of the operating system
• Basic file system (or physical I/O level): primary interface with the environment outside the computer system (e.g., the disk)
◦ Deals with blocks of data that are exchanged with disk or tape systems
◦ Concerned with the placement of blocks on the secondary storage device
◦ Concerned with buffering blocks in main memory
• Basic I/O supervisor: responsible for file I/O initiation and termination
◦ Maintains control structures that deal with device I/O, scheduling, and file status
◦ Selects the device on which I/O is to be performed
◦ Concerned with scheduling disk and tape accesses to optimize performance
◦ Assigns I/O buffers and allocates secondary memory
• Logical I/O: enables users and processes to access records
◦ Provides general-purpose record I/O capability
◦ Maintains basic data about files
• Access method: the level of file system closest to the user
◦ Provides a standard interface between applications and the file systems and devices that hold the data
◦ Different access methods reflect different file structures and different ways of accessing and processing the data
FILE ORGANIZATION
Logical structuring of records. Common types:
• The pile: Data stored in order of arrival; a record consists of one burst of data
• The sequential file: Fixed format for records, which are stored sequentially
◦ Key field uniquely identifies the record
◦ Only organization that is easily stored on tape as well as disk
• Indexed sequential file: adds an index to support random access
◦ Greatly reduces the time required to access a single record
◦ Multiple levels of indexing can be used to provide greater efficiency in access
• Indexed file: records are accessed only through their indexes
◦ Variable-length records possible
◦ Exhaustive index contains one entry for every record in the main file
◦ Partial index contains entries to records where the field of interest exists
◦ Used mostly in applications where timeliness of information is critical
• Hashed files: direct access to fixed-length records via a hash function
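The indexed sequential organization can be illustrated with a small Python sketch. Everything named here is an assumption made for illustration and not part of the slides: the record layout, the index stride of 4, and the function name `lookup`. The main file is a list of records sorted by key, and a sparse index holds the key and position of every fourth record.

```python
from bisect import bisect_right

# Main file: records sorted by key field, as (key, payload). Hypothetical data.
records = [(k, f"record-{k}") for k in range(0, 100, 5)]

STRIDE = 4  # assumed: one index entry for every 4th record
index_pos = list(range(0, len(records), STRIDE))
index_keys = [records[i][0] for i in index_pos]

def lookup(key):
    """Find a record: search the index first, then scan sequentially."""
    slot = bisect_right(index_keys, key) - 1
    if slot < 0:
        return None  # key is smaller than every key in the file
    start = index_pos[slot]
    # Sequential scan restricted to the records covered by one index entry.
    for k, payload in records[start:start + STRIDE]:
        if k == key:
            return payload
    return None
```

The index narrows the search to at most `STRIDE` records, which is where the reduction in access time for a single record comes from.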
DIRECTORIES
• Special files that contain information about other files
• Operations: search, create file, delete file, list directory, update directory
• Tree-structured directories: the files in a directory may be directories themselves
◦ Advantages: efficient searching, grouping capabilities
◦ Current working directory (set with cd): files can be specified by an absolute path (relative to the root) or by a relative path (relative to the current working directory)
◦ Acyclic-graph directories: different names for a single file; new operations: link a new name to an existing file and unlink
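As a rough illustration of absolute versus relative paths, the following Python sketch resolves a path against a current working directory. The function name `resolve` and the example paths are hypothetical; the sketch works on path strings only and does not consult the file system, so symbolic links are ignored.

```python
import posixpath

def resolve(path, cwd):
    """Return the absolute form of path; relative paths are taken from cwd."""
    if not posixpath.isabs(path):
        path = posixpath.join(cwd, path)
    # normpath collapses "." and ".." components purely textually.
    return posixpath.normpath(path)
```

For example, `resolve("../Makefile", "/home/u/slides")` yields `/home/u/Makefile`, while an absolute path is returned unchanged apart from normalization.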
• File Directory Information
◦ Basic information: file name, file type (text, binary, directory, etc.), file organization (if supported)
◦ Address information: volume, starting address on disk, size (allocated and used)
◦ Access control information: owner, access information, permitted actions
◦ Usage information: date created, identity of creator, date last read, identity of last reader, date last modified, identity of last writer, date of last backup
◦ Current usage: information about current activity on the file, such as the process or processes that have the file open, whether it is locked by a process, and whether the file has been updated in main memory but not yet on disk
• Directory implementation
◦ Sequential file (easy to implement, time-consuming to use)
◦ Indexed file
◦ Hashed file
FILE SHARING
Two issues in a multi-user system:
• Access rights
◦ None
◦ Knowledge (can determine existence and owner)
◦ Execution
◦ Reading
◦ Appending (can add data)
◦ Updating (can also modify existing data)
◦ Deletion
◦ Changing protection (can change access rights for other users)
◦ Access rights usually established based on users or user classes
– Owner: usually the creator, full rights, may grant rights to others
– Specific user: specified individual users
– User groups: set of users who are not identified individually
– All users: all the users who have access to the file system
• Management of simultaneous access: file locking, see flock/fcntl and lockf
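A minimal sketch of advisory locking with `flock`, using Python's standard `fcntl` module (Unix only). The helper name `append_locked` is an assumption for illustration. The lock is advisory: it only coordinates processes that also call `flock` on the same file.

```python
import fcntl

def append_locked(path, line):
    """Append one line while holding an exclusive advisory lock on the file."""
    with open(path, "a") as f:
        # Blocks until no other process holds a conflicting lock.
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            f.write(line + "\n")
            f.flush()  # push the data out before releasing the lock
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

Readers that only need a consistent view could take `fcntl.LOCK_SH` instead, which allows several holders at once.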
UNIX FILE PROTECTION
• Each file has an owner and an associated group
◦ Groups and group membership are managed by a separate sub-system and are system-wide
• Three access rights: read, write (both appending and updating), execute
◦ Knowledge = execute (can access) and read (can list) rights to the containing directory
◦ Deletion = write rights to the containing directory
◦ Changing protection = chown, chgrp, chmod (owner or root only)
• One set of access rights (read, write, execute) for the owner, one for the group, and one for the others
                octal   r w x
owner access      7   = 1 1 1
group access      6   = 1 1 0
public access     1   = 0 0 1
chmod 761 /public/games   (7 for file owner, 6 for group, 1 for others)
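The mapping between an octal mode and the three groups of read, write, and execute bits can be sketched in Python as follows. The function name `rwx` is hypothetical; it renders only the nine permission bits, without the file-type character that `ls -l` puts in front.

```python
def rwx(mode):
    """Render a 9-bit permission mode such as 0o761 as 'rwxrw---x'."""
    out = []
    for shift in (6, 3, 0):          # owner, group, others
        bits = (mode >> shift) & 0o7
        out.append("r" if bits & 4 else "-")
        out.append("w" if bits & 2 else "-")
        out.append("x" if bits & 1 else "-")
    return "".join(out)
```

With the mode from the `chmod 761` example, `rwx(0o761)` gives `rwxrw---x`: full rights for the owner, read and write for the group, execute only for the others.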
UNIX DIRECTORY LISTING
< godel:409/slides > pwd
/Volumes/Home/Users/bruda/409/slides
< godel:409/slides > ls -lF
total 15860
-rw-r--r--   1 bruda staff    12104 Sep  5 13:38 00-intro-org.tex
-rw-r--r--   1 bruda staff    27458 Sep  9 23:51 02-overview.tex
-rw-r--r--   1 bruda staff    32725 Sep 15 21:16 03-processes.tex
-rw-r--r--   1 bruda staff    30018 Sep 29 18:30 04-threads.tex
-rw-r--r--   1 bruda staff    42454 Oct 16 21:24 05-synchronization.tex
-rw-r--r--   1 bruda staff    21130 Oct 23 22:30 06-deadlock.tex
-rw-r--r--   1 bruda staff    30993 Oct 29 12:54 07-memory.tex
-rw-r--r--   1 bruda staff    29915 Nov  6 14:04 08-scheduling.tex
-rw-r--r--   1 bruda staff    16003 Nov 11 19:40 09-io.tex
-rw-r--r--   1 bruda staff        8 Nov 13 19:58 10-file.aux
-rw-r--r--   1 bruda staff    22472 Nov 13 19:58 10-file.dvi
-rw-r--r--   1 bruda staff    11939 Nov 13 19:58 10-file.log
-rw-r--r--   1 bruda staff    10523 Nov 13 19:58 10-file.tex
lrwxr-xr-x   1 bruda staff       14 Jan 11  2010 Makefile -> ../../Makefile
-rw-r--r--   1 bruda staff  8613458 Nov 11 15:14 ch12.pdf
drwxr-xr-x  15 bruda staff      510 Nov 11 19:47 figs/
< godel:409/slides >
RECORD BLOCKING
• Records (user-level) are usually organized into blocks (OS-level)
◦ Originally because of disk organization (block = disk sector)
◦ But also central to storage optimization (disk cache, physical disk organization)
• Blocking schemes:
◦ Fixed-length = fixed-length records, fixed number of records per block
– May have internal fragmentation
◦ Variable-length spanned = variable-length records packed into blocks with no unused space (one record may span multiple blocks)
◦ Variable-length unspanned = variable-length records with no spanning
– Even more prone to internal fragmentation
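The space cost of the three blocking schemes can be compared with a little arithmetic. The Python sketch below is a simplification under stated assumptions: the function names are hypothetical, records are packed in arrival order, and the per-record length or pointer overhead that a real variable-length scheme needs is ignored.

```python
def fixed_blocking(block_size, record_size):
    """Records per block and wasted bytes per block for fixed-length blocking."""
    per_block = block_size // record_size
    wasted = block_size - per_block * record_size   # internal fragmentation
    return per_block, wasted

def unspanned_blocks(block_size, record_sizes):
    """Blocks used when variable-length records may not span blocks."""
    blocks, free = 0, 0
    for size in record_sizes:
        if size > block_size:
            raise ValueError("record larger than a block cannot be stored unspanned")
        if size > free:          # does not fit in the current block: start a new one
            blocks += 1
            free = block_size
        free -= size
    return blocks

def spanned_blocks(block_size, record_sizes):
    """Blocks used when records may span blocks (only the last block has slack)."""
    total = sum(record_sizes)
    return -(-total // block_size)   # ceiling division
```

With 512-byte blocks, 100-byte fixed records give 5 records per block and 12 wasted bytes, while three 300-byte records need 3 blocks unspanned but only 2 blocks spanned.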
FILE ALLOCATION
• On secondary storage, a file consists of a collection of blocks
◦ Inherent organization of the media or disk cache organization
• The operating system or file management system is responsible for allocating blocks to files
• Free space management is also an important task (influenced by the approach taken for file allocation)
• Space is allocated to a file as one or more portions (contiguous set of allocated blocks)
• File allocation table (FAT) = data structure used to keep track of the portions assigned to a file
• Allocation policies:
◦ Preallocation = allocates space for a (maximum) size for each file
– Maximum size difficult to establish, wasteful
◦ Dynamic allocation = allocates space to a file in portions, as needed
CONTIGUOUS FILE ALLOCATION
• Preallocation strategy
• Simple
• Random access
• Best from the point of view of an individual file
• But files cannot grow
• Used by several new file systems
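Random access is cheap under contiguous allocation because the disk block follows from simple arithmetic on the starting block. A minimal Python sketch, assuming a hypothetical function name and a default block size of 4096 bytes:

```python
def contiguous_block(start_block, length_blocks, byte_offset, block_size=4096):
    """Map a byte offset in a contiguous file to (disk block, offset in block)."""
    logical = byte_offset // block_size
    if logical >= length_blocks:
        raise IndexError("offset past the end of the allocated portion")
    return start_block + logical, byte_offset % block_size
```

The file allocation table entry only needs the starting block and the length, which is what the two first parameters stand for.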
CHAINED FILE ALLOCATION
• Block allocation, with each block containing a pointer to the next block
• The file allocation table needs just a single entry for each file
• No external fragmentation to worry about
• Simple
• Best for sequential files (no random access)
• Example: the FAT file system (DOS, OS/2)
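Why chained allocation suits sequential access can be seen in a short Python sketch: reaching logical block n means following n pointers. The names, the end-of-chain marker, and the example chain are assumptions for illustration; the next-block pointers are modelled as a dictionary rather than being read from disk blocks.

```python
END = -1  # assumed end-of-chain marker

# Hypothetical file occupying disk blocks 9 -> 16 -> 1 -> 10 -> 25.
example_chain = {9: 16, 16: 1, 1: 10, 10: 25, 25: END}

def chained_block(first_block, next_ptr, logical):
    """Follow next-block pointers from first_block to the given logical block."""
    block = first_block
    for _ in range(logical):       # one pointer (one disk read) per step
        block = next_ptr[block]
        if block == END:
            raise IndexError("logical block past end of file")
    return block
```

The cost grows linearly with the position in the file, in contrast with the constant-time arithmetic of contiguous allocation.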
INDEXED FILE ALLOCATION
• Need index table
• Random access
• Dynamic access without external fragmentation, but have overhead of index block
• Size of the file limited by the size of the index block
◦ To extend the maximum size we can use more index blocks or more indexing levels
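The lookup through an index block, and the way the index block bounds the file size, can be sketched in Python. The function names, the 4096-byte block, and the 4-byte pointer used in the example are assumptions, not values from the slides.

```python
def indexed_block(index_block, byte_offset, block_size=4096):
    """Map a byte offset to (disk block, offset in block) via the index block."""
    logical = byte_offset // block_size
    if logical >= len(index_block):
        raise IndexError("offset past the end of the file")
    return index_block[logical], byte_offset % block_size

def max_file_size(block_size, pointer_size, levels=1):
    """Largest file reachable through `levels` levels of full index blocks."""
    pointers = block_size // pointer_size   # pointers that fit in one index block
    return pointers ** levels * block_size
```

With 4096-byte blocks and 4-byte pointers, one index block addresses 1024 data blocks (4 MiB); a second indexing level raises this to 1024 × 1024 blocks (4 GiB).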
DOUBLY INDEXED FILE ALLOCATION
[Figure: doubly indexed allocation. An outer index points to index tables, which in turn point to the blocks of the file.]
COMBINED INDEXED ALLOCATION: THE EXT2 INODE
(4KB block size)
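A back-of-the-envelope calculation of how much data the combined scheme can address is sketched below in Python. The 4KB block size comes from the slide; the 12 direct pointers and the 4-byte block pointers are assumptions taken from the usual ext2 layout, and the function name is hypothetical. Other limits in a real implementation may cap the file size below this figure.

```python
def ext2_addressable_bytes(block_size=4096, pointer_size=4, direct=12):
    """Bytes reachable through direct, single, double and triple indirect pointers."""
    per_block = block_size // pointer_size          # pointers per index block
    blocks = direct + per_block + per_block ** 2 + per_block ** 3
    return blocks * block_size
```

Under these assumptions the triple indirect pointer dominates: it alone accounts for 1024³ blocks, so the total is slightly more than 4 TiB.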
FREE SPACE MANAGEMENT
• To perform file allocation, it is necessary to know which blocks are available; a disk allocation table is thus needed in addition to a file allocation table
• Methods:
◦ Bit tables: the disk allocation table is a bit vector with one bit for each block on the disk
– A 0 bit corresponds to a free block, a 1 bit to a block in use
– Works well with any allocation method, table as small as possible
◦ Chained free portions: the free portions are chained together in a linked list
– Suited for all allocation methods, negligible space overhead (no disk allocation table)
– But fragmentation, need to read a block before writing (i.e., allocating) it
◦ Indexing: treats free space as a file and uses indexed allocation on it
◦ Free block list: each block is assigned a number (24 or 32 bits), and a list of the numbers of the free blocks is kept in a special area on disk
– Part of the list is brought into memory for efficiency reasons
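The bit table method can be sketched as a small Python class. The class and method names are hypothetical, and the first-fit scan is just one possible allocation choice; the convention of 0 for free and 1 for in use follows the description above.

```python
class BitTable:
    """Disk allocation table with one bit per block (0 = free, 1 = in use)."""

    def __init__(self, n_blocks):
        self.n_blocks = n_blocks
        self.bits = bytearray((n_blocks + 7) // 8)   # all zero: every block free

    def in_use(self, block):
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def allocate(self):
        """Mark the lowest-numbered free block as in use and return its number."""
        for block in range(self.n_blocks):
            if not self.in_use(block):
                self.bits[block // 8] |= 1 << (block % 8)
                return block
        raise RuntimeError("no free blocks")

    def free(self, block):
        """Clear the bit of a block so that it can be allocated again."""
        self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF
```

The table costs one bit per block, so a disk of 10 blocks needs 2 bytes here; the same ratio is what keeps the table small on a real disk.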
VOLUMES AND MOUNTING
• The file system does not reside on a physical disk, but on a logical volume
◦ A disk may hold multiple volumes
◦ The sectors on a volume do not even need to be consecutive on the physical storage, or even on the same media!
• A volume (i.e., file system) must be mounted before it can be accessed
• A file system is mounted at a mount point = some directory in the set of already mounted file systems
UNIX FILE MANAGEMENT
• Several file types: regular, directory, special (contains no data, associated with a device), named pipe, link, symbolic link
• All types of Unix files are administered by the OS by means of inodes
◦ An inode (index node) is a control structure that contains the key information needed by the operating system for a particular file
◦ Several file names may be associated with a single inode
◦ An active inode is associated with exactly one file
◦ Each file is controlled by exactly one inode
• File allocation is indexed, with part of the index stored in the inode
◦ In all UNIX implementations the inode includes a number of direct pointers and three indirect pointers (single, double, triple)
• Directories are structured in a hierarchical tree
• Each directory can contain files and/or other directories
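That several file names may refer to a single inode can be observed from user space. The Python sketch below uses `os.link` and `os.stat` from the standard library; the function name and file names are hypothetical, and it assumes a file system that supports hard links.

```python
import os
import tempfile

def link_demo():
    """Create a file and a second name for it; return (same_inode, link_count)."""
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a.txt")
        b = os.path.join(d, "b.txt")
        with open(a, "w") as f:
            f.write("hello\n")
        os.link(a, b)                  # second directory entry for the same inode
        sa, sb = os.stat(a), os.stat(b)
        return sa.st_ino == sb.st_ino, sa.st_nlink
```

Both names report the same inode number, and the link count stored in the inode rises to 2; the file's data is released only when the last name is unlinked.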
UNIX VOLUMES
• A Unix volume is laid out with the following elements:
◦ Boot block: contains code required to boot the operating system
◦ Superblock: contains attributes and information about the file system
◦ Inode table: collection of inodes for each file
◦ Data block: storage space available for data files and subdirectories
THE LINUX VIRTUAL FILE SYSTEM (VFS)
• Presents a single, uniform file system interface (API and ABI) to user processes
• Defines a common file model that is capable of representing any conceivable file system’s general feature and behavior
• Assumes files are objects that share basic properties regardless of the target file system or the underlying processor hardware
• Allows the use of virtual file systems that behave like file systems but do not physically exist on disk
◦ Example: the /proc file system that presents an interface to all processes
PRIMARY OBJECT TYPES IN VFS
• Superblock object represents a specific mounted file system
• Inode object represents a specific file
• Dentry object represents a specific directory entry
• File object represents an open file associated with a process