NETW3005 File System Interface


NETW3005

File System Interface

Reading

• For this lecture, you should have read Chapter 10 (Sections 1-5) and Chapter 11 (Sections 1-4).

NETW3005 (Operating Systems) Lecture 09 - File Systems Interface 2

Last Lecture – virtual memory

• Demand paging

• Page replacement algorithms

• Frame allocation

• Thrashing


This Lecture

• What’s a file?

• File access methods

• Directory structure

• File system implementation

• Disk allocation methods


Storage Management: A Recap

• The last two lectures have been concerned with moving data into and out of main memory.

• Note: primary memory is only temporary storage.

• Storage management is also concerned with the issue of storing data on non-volatile devices.


Motivation for the File Concept

• For many purposes, a programmer doesn’t care about what medium data is stored in.

• All they care about is the data itself, and how to get at it.

• The question of how the data is stored can be left to the operating system.


Motivation for the File Concept

• The operating system provides a logical unit of storage for the user, called a file.

• The user refers to files.

• The operating system maps files onto regions of secondary storage.

• Files are really an artifact of the dialogue between the user and the O/S.


How should we define a file?

• What’s the point of having files?

– That’s what we’ve just answered.

• What does a file hold?

– A collection of related data, e.g. the sequence of lines in a program, the sequence of words in a text document.

• Where is the data stored?

– Secondary storage. (Probably more precise to say ‘not in main memory’.)

How should we define a file?

• What structure does a file have?

– Different files have different structures, e.g. text files are broken into units with line breaks.

• What can a user do with a file?

– Create, write, read, reposition, delete, truncate. (These are all system calls.)
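
The operations listed above can be tried out from Python, whose os module exposes thin wrappers around the corresponding system calls. This is only a sketch: the function name demo_file_operations and the file name example.txt are made up for illustration.

```python
import os
import tempfile

def demo_file_operations(path):
    """Exercise create, write, reposition, read, truncate and delete on one file."""
    # Create (and open) the file; os.open wraps the open() system call.
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, b"hello, file system")   # write
        os.lseek(fd, 7, os.SEEK_SET)          # reposition to byte 7
        tail = os.read(fd, 4)                 # read the next 4 bytes
        os.ftruncate(fd, 5)                   # truncate the file to 5 bytes
        os.lseek(fd, 0, os.SEEK_SET)          # reposition to the start
        head = os.read(fd, 100)               # read whatever is left
    finally:
        os.close(fd)
    os.unlink(path)                           # delete
    return tail, head, os.path.exists(path)

# Run the demo inside a throwaway directory (hypothetical file name).
with tempfile.TemporaryDirectory() as tmp:
    result = demo_file_operations(os.path.join(tmp, "example.txt"))
```

Here result holds the four bytes read after repositioning, the contents after truncation, and whether the file still exists after deletion.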


File attributes

• What information should the OS store about each file in the file system?

– File name and type.

– Location and size.

– Protection.

– Housekeeping information.

• Where is all this information kept?

– In a directory.
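
Some of these attributes can be inspected with Python's os.stat, which returns what the operating system records about a file. A small sketch follows; the helper name describe and the file name notes.txt are assumptions made for the example.

```python
import os
import stat
import tempfile

def describe(path):
    """Collect a few of the attributes the OS keeps for a file."""
    st = os.stat(path)
    return {
        "name": os.path.basename(path),               # file name
        "size_bytes": st.st_size,                     # size
        "is_regular_file": stat.S_ISREG(st.st_mode),  # a coarse notion of type
        "protection": stat.filemode(st.st_mode),      # e.g. '-rw-------'
        "last_modified": st.st_mtime,                 # housekeeping information
    }

# Hypothetical five-byte file in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "notes.txt")
    with open(path, "wb") as f:
        f.write(b"12345")
    info = describe(path)
```

The on-disk location is deliberately missing: the O/S does not normally expose it through this interface.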


File Operations

• To carry out an operation on a file, we have to know where it is.

• To avoid the overhead of searching every time, many systems require that a file is opened before using it.

• The system maintains an open file table which records the location of the file, and how it is currently being used.
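
A toy model of an open file table may help here. It is an illustration only, not how any particular operating system does it: the class name, the dictionary used as a directory and the idea of a single block number as the file's location are all assumptions.

```python
class OpenFileTable:
    """Toy open file table: the directory is searched once, at open time."""

    def __init__(self, directory):
        self.directory = directory   # file name -> location (a toy block number)
        self.entries = {}            # handle -> location and current position
        self.next_handle = 0
        self.searches = 0            # how many directory searches were needed

    def open(self, name):
        self.searches += 1
        location = self.directory[name]        # the only directory search
        handle = self.next_handle
        self.next_handle += 1
        self.entries[handle] = {"location": location, "position": 0}
        return handle

    def read(self, handle, nbytes):
        """Return (location, offset) for the read and advance the position."""
        entry = self.entries[handle]
        offset = entry["position"]
        entry["position"] += nbytes
        return entry["location"], offset

    def close(self, handle):
        del self.entries[handle]

table = OpenFileTable({"report.txt": 17})   # hypothetical directory contents
h = table.open("report.txt")
first = table.read(h, 10)
second = table.read(h, 10)
```

Both reads use the handle, so the directory is searched only once, however many operations follow.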


Memory-mapped files

• Opening a memory-mapped file causes a region of a process’s virtual memory to be associated with the file.

• Reads and writes to the file are implemented as reads and writes to this memory region.

• Closing the file causes the region of memory to be written back to the disk.
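
The same idea can be seen with Python's mmap module. In this sketch the file name mapped.bin and the variable names are made up; the file is mapped, changed through ordinary memory-style slicing, and then read back normally.

```python
import mmap
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mapped.bin")   # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"abcdefgh")

    with open(path, "r+b") as f:
        # Length 0 maps the whole file into the process's address space.
        with mmap.mmap(f.fileno(), 0) as region:
            first = region[0:3]      # a read from the file, done as a memory read
            region[0:3] = b"XYZ"     # a write to the file, done as a memory write
            region.flush()           # push the changes back to the file

    # Read the file back through the ordinary interface.
    with open(path, "rb") as f:
        contents = f.read()
```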


File types

• Since files can store many different types of data, some systems require the type of data in a file to be specified explicitly.

• Some common file types:

– executable programs,

– source code and text,

– application specific documents,

– images, etc.


Advantages/disadvantages

• Advantages.

– Knowing the file type limits the choice of applications that can process the file.

– The system can avoid attempting inapplicable operations, e.g. printing out a binary file.

• Disadvantages.

– Hard to deal with new file formats, e.g. encrypted files (which are binary, but not executable).


File structure

• File types can be used to indicate the internal structure of a file.

• Any operating system has to know about one file format: executable programs.

• An O/S can usually support a larger set of types.


Strategies

• A minimal strategy, e.g. UNIX. A file is just a sequence of 8-bit bytes.

• An intermediate strategy, e.g. Mac-OS. A file consists of a resource fork and a data fork.

• An extreme strategy, e.g. MS Windows. Every file has an associated type embedded in its file name extension.


File access methods - Sequential

• Most common method.

• A file pointer identifies a record within the file.

• It can be moved incrementally forwards (in read or write operations) or to the beginning (in rewinding).

• The hardware metaphor is a tape.


File access methods - Direct

• A file is viewed as a numbered sequence of records.

• Operations (e.g. read, write) can be carried out on any record in any order.

• The hardware metaphor is a disk.
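
The two access methods can be contrasted on a file of fixed-size records. This is a sketch under assumptions: records are 4 bytes long, and an in-memory io.BytesIO object stands in for a real file; the function names are made up.

```python
import io

RECORD_SIZE = 4  # assumed fixed record length, in bytes

def read_sequential(f):
    """Tape-style access: rewind, then read one record after another."""
    f.seek(0)                        # rewind to the beginning
    records = []
    while True:
        record = f.read(RECORD_SIZE)  # the file pointer advances by one record
        if not record:
            break
        records.append(record)
    return records

def read_direct(f, n):
    """Disk-style access: jump straight to record number n (counting from 0)."""
    f.seek(n * RECORD_SIZE)
    return f.read(RECORD_SIZE)

f = io.BytesIO(b"AAAABBBBCCCCDDDD")  # four toy records
all_records = read_sequential(f)
third = read_direct(f, 2)
```

Direct access needs no knowledge of the records in front of the one requested, only the record number.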


Organising groups of files

• Since the number of files in a system can be large, it makes sense to group them in various ways.

• Normally done in two levels.

– The file system is first divided into partitions. A partition can be thought of as a virtual disk.

– Each partition contains a directory of files that reside on it.

Organising groups of files

[Figure: discs divided into partitions (partition 1, partition 2), each partition with its own directory.]

A simple model of user directories

• As we saw earlier, a directory is a table, relating a file name to its attributes.

• Simplest method uses a single table.

• But there are problems:

– length and uniqueness of filenames,

– multiple users,

– searching large directories.


Multiple directories

• One directory per user.

• Have a master directory, which is a table of user directories.

• When a user process refers to a file, the operating system searches only the user’s directory.


Tree-structured directories

• It’s an easy extension to allow users to create new directories (aka folders in GUI-speak) in their own directories.

• To refer to an arbitrary file in the tree structure, we now need to specify a path from the root of the tree, e.g. /user2/directory1/directory2/file

[Figure: a directory tree rooted at /, with user directories user1, user2 and user3 on the first level; below them directory1 and a file, then a file and directory2, then a file.]
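
Following a path from the root can be sketched with nested dictionaries standing in for directories. The tree layout below is an assumption chosen so that the example path resolves; the function name resolve and the file contents are made up.

```python
def resolve(root, path):
    """Walk an absolute path, one component at a time, starting at the root."""
    if not path.startswith("/"):
        raise ValueError("expected an absolute path")
    node = root
    for name in path.split("/"):
        if not name:                  # skip the empty piece before the leading '/'
            continue
        if not isinstance(node, dict):
            raise NotADirectoryError(name)
        node = node[name]             # raises KeyError if the name is missing
    return node

# Directories are dicts; files are represented by their (toy) contents.
tree = {
    "user1": {},
    "user2": {
        "directory1": {
            "file": "contents A",
            "directory2": {"file": "contents B"},
        },
        "file": "contents C",
    },
    "user3": {},
}

found = resolve(tree, "/user2/directory1/directory2/file")
```

Each component costs one directory lookup, which is why long paths are slower to resolve than short ones.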

File access in a directory hierarchy

• It would be laborious to have to type in the path name for each file.

• Instead, most systems provide a notion of the current directory as the default one to search.

• For executable files, some systems allow a user to specify a search path – a list of directories to be searched in order.


File access in a directory hierarchy

• Relative path names can also be used – these are interpreted relative to the current directory.

• Partitions are often thought of as the first branches in the tree.

• The syntax for specifying partitions is sometimes the same as for directory names.
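
Both ideas, relative path names and a search path, can be sketched with Python's posixpath module. The helper names, the directory names and the set of existing files are assumptions made for the example.

```python
import posixpath

def absolute(current_dir, path):
    """Interpret a path relative to the current directory (absolute paths win)."""
    return posixpath.normpath(posixpath.join(current_dir, path))

def find_executable(name, search_path, existing_files):
    """Try each directory of the search path in order; return the first hit."""
    for directory in search_path:
        candidate = posixpath.join(directory, name)
        if candidate in existing_files:
            return candidate
    return None

here = "/user2/directory1"                       # hypothetical current directory
a = absolute(here, "directory2/file")
b = absolute(here, "../file")
c = absolute(here, "/user1/file")
hit = find_executable("ls", ["/usr/local/bin", "/bin"], {"/bin/ls"})
```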


Directories as graphs

• One type of data-sharing is to allow two users access to the same file/directory.

• If we implement this by having two directories point to the same file/directory then the resulting structure is a graph.

• Such a graph must be acyclic.


Shared directories

• Commonly implemented using links.

• A link is a pointer to an arbitrary file in the directory structure.

• In a symbolic link, the pointer is just a pathname.

• When a directory is searched and a link found, the O/S follows the pointer and uses the file pointed to.
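
A symbolic link can be created and followed from Python on systems that support them (on some platforms creating one needs extra privileges). The file names shared.txt and alias.txt are made up for this sketch.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "shared.txt")   # the real file
    link = os.path.join(tmp, "alias.txt")      # a second name for it
    with open(target, "w") as f:
        f.write("shared data")

    os.symlink(target, link)                   # the link stores a pathname
    stored_pointer = os.readlink(link)         # look at the stored pathname
    is_link = os.path.islink(link)

    # Opening the link makes the O/S follow the pointer to the target file.
    with open(link) as f:
        via_link = f.read()
```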


Mounting

• To make a file system available to processes, it must be mounted.

• The operating system is given the name of the device to be mounted and a directory from which the file system will be accessible (e.g. /user1).

• The files in the mounted system will then be available as if they were files in that directory (e.g. /user1/newfile).


Implementing files: a file system

• So far, we have been describing the logical structure of the file system (files, directories, partitions).

• The operating system has to map this logical structure onto a storage device (typically, a disk).

• This is done by the file organisation module.


Implementing files: a file system

• We have already seen that the smallest unit on a disk is a block.

• The file organisation module has to allocate blocks for the storage of files.

• A file is broken into logical blocks, to make the mapping to disk blocks easier to manage.
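
The arithmetic behind logical blocks is simple. The sketch below assumes a block size of 512 bytes; the function names are made up.

```python
BLOCK_SIZE = 512  # assumed block size in bytes

def locate(byte_offset, block_size=BLOCK_SIZE):
    """Return (logical block number, offset within that block) for a byte."""
    return divmod(byte_offset, block_size)

def blocks_needed(file_size, block_size=BLOCK_SIZE):
    """Number of blocks a file of the given size occupies (rounded up)."""
    return -(-file_size // block_size)

position = locate(1300)      # byte 1300 of the file
count = blocks_needed(1300)  # blocks needed for a 1300-byte file
```

The last block of a file is usually only partly used, which is the source of internal fragmentation.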


Disk allocation methods

• How should we set aside disk space for the files in a system to occupy?

• There are several allocation methods.

– Contiguous allocation

– Linked allocation

– File Allocation Table

– Indexed allocation


Contiguous Allocation

• Each file occupies a set of contiguous blocks on the disk.


Advantages?

• Sequential access?

– Good, because the next character to read is very close.

• Random access?

– Good, because you can find the block you want by counting blocks from the start of the file.
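
With contiguous allocation, finding a block is a single addition. A minimal sketch, assuming the directory entry records a start block and a length in blocks (the function name and the numbers are made up):

```python
def contiguous_block(start_block, length_blocks, logical_block):
    """Physical block holding a given logical block of a contiguous file."""
    if not 0 <= logical_block < length_blocks:
        raise IndexError("logical block outside the file")
    return start_block + logical_block

# Hypothetical file occupying blocks 14, 15 and 16.
physical = contiguous_block(start_block=14, length_blocks=3, logical_block=2)
```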


Disadvantages?

• External fragmentation.

• Notice that compaction is an option, but it needs to be done off-line.

• Internal fragmentation.

• Files can grow and shrink.
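
External fragmentation can be shown with a toy free list. The sketch uses first-fit, which is only one possible placement policy and is an assumption here, as are the hole sizes.

```python
def first_fit(holes, size):
    """Return the start of the first free hole with at least `size` blocks."""
    for start, length in holes:
        if length >= size:
            return start
    return None

# Free space as (start block, length) pairs: 9 free blocks in total.
holes = [(0, 3), (10, 2), (20, 4)]
total_free = sum(length for _, length in holes)

fits = first_fit(holes, 4)        # a 4-block file fits in the last hole
too_big = first_fit(holes, 5)     # a 5-block file fits nowhere
```

Nine blocks are free, yet a five-block file cannot be placed because no single hole is large enough; compaction would merge the holes.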


Linked Allocation

• Each file is a linked list of disk blocks.

• The directory contains a pointer to the first (and last) blocks of the file.

• Each block of the file contains a pointer to the next block.


Advantages/disadvantages

• Advantages?

– No external fragmentation.

– Files can be arbitrarily big; no need to pre-allocate.

• Disadvantages?

– Can only be used effectively for sequential files.

– Pointers take up some space.

– Internal fragmentation.

– Reliability.
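
The cost of direct access under linked allocation shows up as soon as we try to find the n-th block. In this sketch a dictionary stands in for the disk, mapping each block number to its data and the pointer stored in that block; the block numbers are made up.

```python
def nth_block(disk, first_block, n):
    """Follow the chain to logical block n; return (block number, pointer hops)."""
    block = first_block
    hops = 0
    for _ in range(n):
        _, block = disk[block]    # must read the block just to find the next one
        hops += 1
        if block is None:
            raise IndexError("file has fewer blocks than requested")
    return block, hops

# block number -> (data, next block); None marks the end of the file.
disk = {
    9: ("a", 16),
    16: ("b", 1),
    1: ("c", 10),
    10: ("d", 25),
    25: ("e", None),
}

result = nth_block(disk, first_block=9, n=3)
```

Reaching logical block 3 took three block reads before the wanted block could even be located, and one damaged pointer would cut off the rest of the file.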


File Allocation Table (FAT)

• A variant on the linked allocation scheme, used in MS-DOS and OS/2.

• A table is created at the beginning of each partition, with an entry for each block in the partition.

• The directory entry for a file specifies the block number for the first block of the file.

• The value of the FAT entry for the first block will identify the block number of the next block in the file.


Advantages/disadvantages

• Advantages?

– Same as linked. In addition, direct access is better supported, because chaining through the FAT is faster than chaining through a linked list.

• Disadvantages?

– Same as linked. Even more head seeks, in fact, unless the FAT is cached.
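
Chaining through a FAT can be sketched with a plain list, one entry per block. The marker values for "end of file" and "free" and the table contents are assumptions for the example, not the values any real FAT variant uses.

```python
END_OF_FILE = -1   # assumed marker for the last block of a file
FREE = -2          # assumed marker for an unallocated block

def file_blocks(fat, first_block):
    """Follow the chain in the table, starting from the directory's first block."""
    blocks = []
    block = first_block
    while block != END_OF_FILE:
        # Guard against a damaged table (chain into free space, or a loop).
        if block == FREE or len(blocks) >= len(fat):
            raise ValueError("corrupt FAT chain")
        blocks.append(block)
        block = fat[block]     # table lookup only; no data block is read
    return blocks

# fat[i] gives the block that follows block i in its file.
fat = [FREE, 4, END_OF_FILE, FREE, 6, FREE, 2, FREE]
chain = file_blocks(fat, first_block=1)
```

If the whole table is cached in memory, the chain can be followed without touching the data blocks at all.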


Indexed Allocation

• Each file has an index block, containing a table specifying the physical block for each logical block.

• The directory entry for a file contains the address of its index block.
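
Indexed allocation replaces the chain with a table lookup. In this sketch the directory, the index block number and the block numbers in the index are all made up for illustration.

```python
def physical_block(directory, disk, name, logical_block):
    """Directory -> index block -> physical block, in two lookups."""
    index_block_number = directory[name]     # from the directory entry
    index = disk[index_block_number]         # read the index block
    return index[logical_block]              # entry i = physical block of logical block i

directory = {"jeep": 19}                     # hypothetical file with index block 19
disk = {19: [9, 16, 1, 10, 25]}              # contents of the index block

block = physical_block(directory, disk, "jeep", 3)
```

Any logical block is found with one read of the index block, regardless of its position in the file.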


Advantages/disadvantages

• Advantages?

– Very easy direct access.

– No external fragmentation.

• Disadvantages?

– Wasted space (internal fragmentation, really).

– We have to allocate a large array for the index, because we don’t know how big the index needs to be.

– Also lots of head seeks, like linked allocation.


Extending a full index block

• UNIX’s index block (called an inode) combines direct indexing with multilevel indexing.

[Figure: a UNIX inode holding mode, owners, timestamps, size, block count, direct blocks, single indirect, double indirect and triple indirect pointers; the direct and indirect entries lead to data blocks.]
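
How far each level of indexing reaches can be worked out with a little arithmetic. The numbers below are assumptions, not a description of any particular UNIX: 12 direct pointers, 4096-byte blocks and 4-byte block pointers.

```python
DIRECT = 12                 # assumed number of direct pointers in the inode
BLOCK_SIZE = 4096           # assumed block size in bytes
POINTER_SIZE = 4            # assumed size of one block pointer in bytes
PER_BLOCK = BLOCK_SIZE // POINTER_SIZE   # pointers that fit in one index block

def indirection_level(logical_block):
    """0 = direct, 1 = single, 2 = double, 3 = triple indirect."""
    if logical_block < 0:
        raise ValueError("negative block number")
    if logical_block < DIRECT:
        return 0
    logical_block -= DIRECT
    for level in (1, 2, 3):
        capacity = PER_BLOCK ** level    # blocks reachable at this level
        if logical_block < capacity:
            return level
        logical_block -= capacity
    raise ValueError("beyond the reach of the triple indirect block")

# Largest file, in blocks, that this layout can describe.
MAX_BLOCKS = DIRECT + PER_BLOCK + PER_BLOCK ** 2 + PER_BLOCK ** 3
```

Small files are reached through the direct pointers alone, so they pay no extra index reads; only large files need the indirect blocks.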

Next Lecture

• I/O Systems

• Chapter 13 (Sections 1-4, 7)