
File System and Manipulation (UNIX)


Computer Engineering Department, Yarmouk University

10/12/2008

File System and Processing

CE 466 System Programming

Files

• A named collection of related information
• Consists of a sequence of bits, bytes, lines, or records
• For each file, the OS keeps the following information:
  • Name
  • Type
  • Location
  • Size
  • Important dates


File Access Methods

• Sequential Access

• Direct Access


File system

• A file system provides a logical view of the file (data/information)

File system

• Who maps the user's (logical) view of a file to the blocks where the file’s data is stored?
• If the OS did not, the user would have to!
• The OS provides a mapping between the logical and physical units of storage
• The OS provides a set of basic operations for file manipulation, such as create, open, read, write, ...

Layered File System

• The mapping is implemented using a layered approach
• Lower levels: device-specific I/O control
• Intermediate levels:
  • Basic file system, which issues commands to the appropriate device driver to read and write physical blocks (drive 1, track 2, ...)
  • File organization module, which translates logical block addresses to physical ones

Layered File System

• Each file’s logical blocks are numbered from 1 to N
• The corresponding physical blocks could have any addresses (this depends on the allocation policy)
• Upper level (logical file system):
  • Manages file system structures, including:
    • Directory structures, which give the file organization module the information it needs
    • File control block (FCB), which contains information about the file, including the location of its contents, ownership, and permissions

Directories

• A file system may be divided into partitions
• Each partition has a directory structure
• Directories store information such as the name, size, and location of a file
• Operations performed on directories include: search for a file, create a file, list a directory, traverse a file system

Directory Structure

• A collection of nodes that contains information about all files
• Both directory structures and files reside on disk

[Figure: a directory whose entries point to the files F1, F2, F3, F4, ..., Fn]

Directory Design Constraints

• Organize the directory for:
  • Efficiency: locating files quickly
  • Convenient naming for users
    • Two users may use the same file name
    • A file may have multiple names
  • Logical grouping of files (all games, all C++ code, ...)

Single-Level Directory

• A single directory for all users
• Naming problem
• Grouping problem

Two-Level Directory

• Separate directory for each user
• Different users may have the same file name
• Efficient searching
• Sharing might be hard


Multi-Level Directory

• Efficient searching

• Grouping Capability


General Graph Directory

Directory Implementation

• Linear list of file names with pointers to the data blocks
  • Simple to program
  • Time-consuming to execute (a lookup requires a linear search)
• Hash table: a linear list combined with a hash data structure
  • Decreases directory search time
  • Collisions: situations where two file names hash to the same location
  • Fixed size
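The hash-table approach can be sketched in a few lines of C. This is a minimal illustration, not the layout of any real file system: the table size `DIR_SLOTS`, the name limit `NAME_MAX_LEN`, the `dir_entry` structure, and the choice of linear probing to resolve collisions are all assumptions made for the example. Entries are never deleted here, which is what lets a lookup stop at the first free slot.

```c
#include <stddef.h>
#include <string.h>

#define DIR_SLOTS 8        /* fixed table size (hypothetical) */
#define NAME_MAX_LEN 32    /* longest stored name, including '\0' (hypothetical) */

/* One directory slot; an empty name marks a free slot. */
struct dir_entry {
    char name[NAME_MAX_LEN];
    unsigned first_block;  /* where the file's data starts */
};

/* Simple multiplicative string hash, reduced to a slot index. */
static unsigned hash_name(const char *name)
{
    unsigned h = 0;
    while (*name)
        h = h * 31u + (unsigned char)*name++;
    return h % DIR_SLOTS;
}

/* Insert a name; on a collision, probe the following slots in order.
 * Names longer than NAME_MAX_LEN - 1 characters are truncated.
 * Returns 0 on success, -1 if the fixed-size table is full. */
int dir_insert(struct dir_entry table[DIR_SLOTS], const char *name,
               unsigned first_block)
{
    unsigned start = hash_name(name);
    for (unsigned i = 0; i < DIR_SLOTS; i++) {
        struct dir_entry *e = &table[(start + i) % DIR_SLOTS];
        if (e->name[0] == '\0') {
            strncpy(e->name, name, NAME_MAX_LEN - 1);
            e->name[NAME_MAX_LEN - 1] = '\0';
            e->first_block = first_block;
            return 0;
        }
    }
    return -1;
}

/* Look a name up; returns NULL when it is not in the directory. */
const struct dir_entry *dir_lookup(const struct dir_entry table[DIR_SLOTS],
                                   const char *name)
{
    unsigned start = hash_name(name);
    for (unsigned i = 0; i < DIR_SLOTS; i++) {
        const struct dir_entry *e = &table[(start + i) % DIR_SLOTS];
        if (e->name[0] == '\0')
            return NULL;               /* free slot reached: name absent */
        if (strcmp(e->name, name) == 0)
            return e;
    }
    return NULL;                       /* table full and name absent */
}
```

A caller would zero the table first (for example with memset) so that every slot starts out free. The fixed size shows up directly: once all DIR_SLOTS entries are used, dir_insert() fails.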

File System Structures

• On-disk structures (nonvolatile):
  • Boot control block, which contains the information needed to boot the OS (the boot block in the UNIX file system, UFS)
  • Partition control block, which contains partition details such as the number of blocks, block size, and free-block count (the superblock in UFS)
  • A directory structure used to organize the files
  • An FCB, which contains file details (the inode in UFS)

File System Structures

• In-memory structures (volatile):
  • Used for file-system management and to improve performance
  • Partition table
  • Directory structures of recently accessed directories
  • System-wide open-file table, which includes a copy of the FCB of each open file
  • Per-process open-file table, which contains a pointer to the appropriate FCB in the system-wide open-file table

A Typical File Control Block (FCB)

• File data blocks
• File size
• File owner, group
• File dates (create, access, write)
• File permissions

Open-File Table

• Saves time, since there is no need to search for the file every time an I/O operation is performed
• When a file is opened, its information is added to the open-file table
• Subsequent operations are performed using the index into the table
• Information in the open-file table includes the file pointer, open count, ...

File System Operations (create a file)

• To create a new file:
  • The directory is read into memory
  • Knowing the directory structure, the logical file system creates an FCB
  • The directory structure is updated with the new FCB
  • The directory is written back to disk

File System Operations (open a file)

• Before a file can be used for I/O, it must be opened:
  • The open system call searches the system-wide open-file table to see if the file is already open
  • If the file is not already open:
    • The directory structure is searched
    • When the file is found, its FCB is copied into the system-wide open-file table

File System Operations (open a file)

• An entry is added to the per-process open-file table, with a pointer to the entry in the system-wide open-file table (a pointer to the current location in the file is also added)
• The open call returns a pointer (a file descriptor in UNIX) to the appropriate entry in the per-process file table
• All subsequent file operations are performed through this pointer

Allocation Methods

• Refers to how disk blocks are allocated for files:
  • Contiguous allocation
  • Linked allocation
  • Indexed allocation

Contiguous Allocation

• Each file occupies a set of contiguous blocks on the disk
• Simple: only the starting location and the length (number of blocks) are required
• Random access is supported, since any block's address can be computed from the starting location
• External fragmentation (the dynamic storage-allocation problem)
• Files are hard to grow
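Because only a starting location and a length are stored, translating a logical block number into a physical one is a single addition. The sketch below assumes 0-based logical block numbers and a hypothetical `file_extent` record; both are choices made for the example rather than anything taken from a real file system.

```c
/* Hypothetical per-file record for contiguous allocation. */
struct file_extent {
    unsigned long start;   /* first physical block of the file */
    unsigned long length;  /* number of blocks the file occupies */
};

/*
 * Translate a 0-based logical block number into a physical block number.
 * Returns -1 when the logical block lies outside the file.
 */
long contiguous_block(const struct file_extent *f, unsigned long logical)
{
    if (logical >= f->length)
        return -1;
    return (long)(f->start + logical);
}
```

For a file that starts at block 100 and is 5 blocks long, logical block 4 maps to physical block 104, and logical block 5 is rejected because it is past the end of the file.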

Modified Contiguous Allocation

• Known as extent-based systems
• Initially, an extent (a contiguous chunk of space) is allocated
• Another extent is allocated if the need arises
• A file’s blocks are recorded as a location and a block count, plus a link to the first block of the next extent

Linked Allocation

• Each file is a linked list of disk blocks that may be scattered anywhere on the disk
• Simple: only the starting address is needed, and no size declaration is necessary
• No external fragmentation
• Sequential access only (no random access)
• Each block uses 4 bytes to point to the next block (4 B out of a 512 B block is about 0.78% overhead)
• If one pointer is lost, the state of the file is not guaranteed

File-Allocation Table (FAT)

• Used by MS-DOS and OS/2
• A section of space at the beginning of each partition is set aside to contain the table
• The table has one entry for each disk block and is indexed by block number
• The directory entry contains the block number of the first block of the file
• The table entry for a block contains the block number of the next block in the file
• The table entry for the last block of the file holds a special end-of-file value
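Following a file through the table can be sketched as a short loop. This is an illustration under stated assumptions: the table is modeled as an in-memory array of 16-bit entries, and `FAT_EOF` is a hypothetical end-of-file marker chosen for the example, not the value used by any particular FAT variant.

```c
#include <stddef.h>

#define FAT_EOF 0xFFFFu   /* hypothetical end-of-file marker */

/*
 * Collect the block numbers of a file by following its chain in the table.
 * first is the block number stored in the directory entry.
 * Returns the number of blocks written to out, or -1 if the chain is
 * corrupt (block out of range, longer than max, or a cycle).
 */
long fat_chain(const unsigned short *fat, size_t fat_len,
               unsigned short first, unsigned short *out, size_t max)
{
    size_t n = 0;
    unsigned short block = first;

    while (block != FAT_EOF) {
        /* A valid chain can never hold more blocks than the table has. */
        if (block >= fat_len || n >= max || n >= fat_len)
            return -1;
        out[n++] = block;
        block = fat[block];   /* the entry names the next block */
    }
    return (long)n;
}
```

With fat[2] = 5, fat[5] = 3, and fat[3] = FAT_EOF, a file whose directory entry says it starts at block 2 occupies blocks 2, 5, and 3, in that order.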

Indexed Allocation

• Solves the problem of direct access
• Brings the pointers together into one location: the index block
• The index block is an array of disk-block addresses
• The ith entry in the index block points to the ith block of the file
• The directory contains the address of the index block

Indexed Allocation

• Each file has an index block
• If the file is small, only a few pointers are non-null (i.e., more space is wasted than with a linked list)
• If the file is big, one index block may not be sufficient to hold pointers to all of the file's blocks

Index blocks and big files

• Linked scheme
  • Use as many index blocks as needed to accommodate the file
  • The last pointer in the current index block points to the next index block

[Figure: Index Block 0 linked to Index Block 1]

Index blocks and big files

• Multilevel index
  • Use the first-level index block to point to a set of second-level index blocks
  • Second-level index blocks point to the file blocks
  • This approach could continue to a third or fourth level
  • A 4 KB block can store 1 K pointers of 4 B each, so two levels of index allow 1 M pointers, i.e., a file of up to 4 GB
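The 4 GB figure follows from multiplying the number of pointers per index block once per level and then by the block size. A small sketch of that arithmetic, with a hypothetical helper name and no claim to model any specific file system:

```c
#include <stdint.h>

/*
 * Largest file (in bytes) addressable with a pure multilevel index:
 * (pointers per block) ^ levels data blocks, each block_size bytes long.
 */
uint64_t max_file_bytes(uint64_t block_size, uint64_t pointer_size,
                        unsigned levels)
{
    uint64_t pointers_per_block = block_size / pointer_size;
    uint64_t data_blocks = 1;

    for (unsigned i = 0; i < levels; i++)
        data_blocks *= pointers_per_block;

    return data_blocks * block_size;
}
```

With 4096-byte blocks and 4-byte pointers, one level gives 1024 blocks (4 MB) and two levels give 1024 * 1024 blocks, which is 4 GB.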

Multilevel index

[Figure: an outer index pointing to index tables, which point to the blocks of the file]

Index blocks and big files

• Combined scheme
  • Use a combination of direct blocks and indirect blocks
  • The indirect blocks may use a multilevel index
  • UNIX uses direct blocks, single indirect blocks, and double and triple indirect blocks


Combined Scheme: UNIX (4K bytes per block)

Free-Space Management

• Bit vector (n blocks)
  • bit[i] = 1 ⇒ block[i] free
  • bit[i] = 0 ⇒ block[i] occupied
• Block number calculation for the first free block:
  (number of bits per word) * (number of 0-value words) + offset of first 1 bit
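The block number calculation can be written out directly. The sketch below assumes 32-bit words and that the least significant bit of each word stands for the lowest-numbered block in that word; both are assumptions for illustration, since the slides do not fix a bit order.

```c
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_WORD 32

/*
 * Find the first free block in a bit vector where 1 means free.
 * Skips the all-zero (fully occupied) words, then finds the first 1 bit:
 *   block = BITS_PER_WORD * (number of 0-value words) + offset of first 1 bit
 * Returns -1 when no block is free.
 */
long first_free_block(const uint32_t *map, size_t nwords)
{
    for (size_t w = 0; w < nwords; w++) {
        if (map[w] != 0) {
            unsigned offset = 0;
            /* Assumed order: bit 0 is the lowest-numbered block of the word. */
            while (((map[w] >> offset) & 1u) == 0)
                offset++;
            return (long)(w * BITS_PER_WORD + offset);
        }
    }
    return -1;
}
```

For the words {0, 0, 0x8}, two words are skipped and the first 1 bit is at offset 3, so the first free block is 32 * 2 + 3 = 67.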


Linked Free Space List on Disk

Standard ‘C’ File Read and Write

• Uses the standard C library's input and output functions
• Portable across operating systems
• Buffers read and write operations, making file operations faster and more efficient
• Based on the FILE data type (also called a stream), which holds:
  • The location in the file (where to read or write next)
  • Read and write buffers

Opening Files

• To work with a file, we must open it first, using fopen()
• Returns a FILE * on success or NULL on failure

FILE* f_read;
FILE* f_write;
FILE* f_readwrite;
FILE* f_append;

f_read = fopen("/home/choo/data.txt", "r");                /* read only */
f_write = fopen("logfile", "w");                           /* write, truncating the file */
f_readwrite = fopen("/usr/local/lib/db/users", "r+");      /* read and write */
f_append = fopen("/var/adm/messages", "a");                /* append */

Closing Files

• When done working with the file, we need to close it using fclose()
  • Returns 0 on success and EOF on failure
• Closing a file does the following:
  • Flushes unsaved changes to disk (to the OS disk cache)
  • Frees the file descriptor and other file resources

/* fclose() returns 0 on success, so a nonzero result means failure */
if (fclose(f_readwrite) != 0) {
    perror("Failed closing file '/usr/local/lib/db/users'");
    exit(1);
}

Reading from a File

int c;
char buf[201];

/* read a single character from the file */
c = fgetc(f_read);

/* read one line from the file (at most 200 characters plus the '\0') */
fgets(buf, sizeof(buf), f_read);

/* place the given character back into the file stream */
ungetc(c, f_read);

/* check whether a previous read hit the end of the file */
if (feof(f_read)) {
    printf("End of file reached\n");
}

/* read one block of 120 characters */
if (fread(buf, 120, 1, f_read) != 1) {
    perror("fread");
}

Writing to a File

int c;
char buf[201];

/* write the character 'a' to the given file */
c = 'a';
fputc(c, f_readwrite);

/* write the string "hello world" to the given file */
strcpy(buf, "hello world");
fputs(buf, f_readwrite);

/* write the string "hi there, mate" to standard output (the screen) */
fprintf(stdout, "hi there, mate\n");

/* write out any buffered writes to the given file stream */
fflush(stdout);

/* write the string "hello, great world. we feel fine!\n" once; */
/* the third parameter to fwrite() is the number of blocks to write, */
/* and every block must be strlen(buf) bytes of valid data */
strcpy(buf, "hello, great world. we feel fine!\n");
if (fwrite(buf, strlen(buf), 1, f_readwrite) != 1) {
    perror("fwrite");
}

Moving Read/Write Location

/* move the read/write pointer to the 30th byte of the file; */
/* the first position in the file is 0, not 1, so the offset is 29 */
fseek(f_read, 29L, SEEK_SET);

/* move the read/write pointer 25 characters forward from its current location */
fseek(f_read, 25L, SEEK_CUR);

/* remember the current read/write position, move to location 520 in the file, */
/* write the string "hello world", and move back to the previous location */
long old_position = ftell(f_readwrite);
if (old_position < 0) {
    perror("ftell");
    exit(1);
}
if (fseek(f_readwrite, 520L, SEEK_SET) < 0) {
    perror("fseek(f_readwrite, 520L, SEEK_SET)");
    exit(1);
}
fputs("hello world", f_readwrite);
if (fseek(f_readwrite, old_position, SEEK_SET) < 0) {
    perror("fseek(f_readwrite, old_position, SEEK_SET)");
    exit(1);
}
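The same remember, move, and restore pattern gives a common way to find the size of an open file. The helper below is a sketch; `stream_size` is a name made up for this example, and it assumes a seekable stream on a POSIX-like system, where seeking to SEEK_END is well supported.

```c
#include <stdio.h>

/*
 * Return the size in bytes of an open, seekable stream, or -1 on error.
 * The stream's read/write position is restored before returning.
 */
long stream_size(FILE *f)
{
    long old_position = ftell(f);       /* remember where we are */
    if (old_position < 0)
        return -1;

    if (fseek(f, 0L, SEEK_END) != 0)    /* jump to the end of the file */
        return -1;
    long size = ftell(f);               /* offset of the end = size */

    if (fseek(f, old_position, SEEK_SET) != 0)   /* go back */
        return -1;
    return size;
}
```

Because the position is restored, the call can be made in the middle of other reads or writes on the same stream.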

Moving Read/Write Location (other functions)

#include <stdio.h>

int fseek(FILE *stream, long offset, int whence);
long ftell(FILE *stream);
void rewind(FILE *stream);
int fgetpos(FILE *stream, fpos_t *pos);
int fsetpos(FILE *stream, const fpos_t *pos);

Accessing Files Using System Calls

• Accessing files is usually best done using the standard C library functions
• Sometimes lower-level operations are needed, such as checking file permissions
• Unix treats devices much like files, so the same calls are used to access both
• Sometimes called “low-level I/O”
• This form of I/O is UNBUFFERED, i.e., each read/write request goes directly to the OS to fetch or put a specific number of bytes from or to the disk (or device)

File Descriptors

• A file descriptor is the basic system object used to manipulate files
• A file descriptor is a non-negative integer used to access a memory area containing data about the open file
• Each open file in a process has a different descriptor, so a descriptor uniquely identifies an open file

Creating Files

• To create a file, use the creat() system call:

int creat(const char *file_name, mode_t mode);

Creating Files

Permission bits for the mode argument, from /usr/include/sys/stat.h:

#define S_IRWXU  0000700  /* -rwx------ */
#define S_IREAD  0000400  /* read permission, owner */
#define S_IRUSR  S_IREAD
#define S_IWRITE 0000200  /* write permission, owner */
#define S_IWUSR  S_IWRITE
#define S_IEXEC  0000100  /* execute/search permission, owner */
#define S_IXUSR  S_IEXEC
#define S_IRWXG  0000070  /* ----rwx--- */
#define S_IRGRP  0000040  /* read permission, group */
#define S_IWGRP  0000020  /* write permission, group */
#define S_IXGRP  0000010  /* execute/search permission, group */
#define S_IRWXO  0000007  /* -------rwx */
#define S_IROTH  0000004  /* read permission, other */
#define S_IWOTH  0000002  /* write permission, other */
#define S_IXOTH  0000001  /* execute/search permission, other */
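The permission bits are combined with bitwise OR and passed as the mode argument of creat(). A minimal sketch follows; the helper name `create_shared_readable` is made up for the example. Note that the process umask can clear some of the requested bits, so the resulting file may end up with fewer permissions than asked for.

```c
#include <fcntl.h>      /* creat() */
#include <sys/stat.h>   /* S_IRUSR and the other permission bits */
#include <sys/types.h>  /* mode_t */
#include <unistd.h>     /* close() */

/*
 * Create (or truncate) a file that its owner can read and write and that
 * group and others can only read: mode 0644, shown by ls as -rw-r--r--.
 * Returns 0 on success, -1 on failure.
 */
int create_shared_readable(const char *path)
{
    mode_t mode = S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH;
    int fd = creat(path, mode);

    if (fd == -1)
        return -1;
    return close(fd);   /* creat() leaves the file open for writing */
}
```

If the file already exists, creat() truncates it and leaves its existing permissions unchanged; the mode only applies to newly created files.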

Opening Files

• int open(const char *filename, int flags, mode_t perms);
• Returns a file descriptor, or -1 when it fails
• perms is used only when O_CREAT is among the flags
• Flags: O_APPEND, O_CREAT, O_EXCL, O_RDONLY, O_RDWR, O_WRONLY, and others; see the online man pages or reference manuals

int fd_read;
int fd_write;
int fd_readwrite;
int fd_append;

/* Open the file /etc/passwd in read-only mode. */
fd_read = open("/etc/passwd", O_RDONLY);
if (fd_read < 0) {
    perror("open");
    exit(1);
}

File Operations

• int close(int handle);
• ssize_t read(int handle, void *buffer, size_t length);
• ssize_t write(int handle, const void *buffer, size_t length);
• length specifies the number of bytes to be read from or written to the file
• sizeof is commonly used to compute the length
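Both read() and write() may transfer fewer bytes than requested, so code that uses them usually loops. The function below is a sketch of that pattern; `copy_fd` is a name chosen for this example, and the 4096-byte buffer size is arbitrary.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Copy everything from in_fd to out_fd.
 * Returns the number of bytes copied, or -1 on error.
 */
long copy_fd(int in_fd, int out_fd)
{
    char buf[4096];
    long total = 0;
    ssize_t n;

    /* read() returns 0 at end of file and -1 on error */
    while ((n = read(in_fd, buf, sizeof buf)) != 0) {
        if (n < 0) {
            if (errno == EINTR)
                continue;            /* interrupted by a signal: retry */
            return -1;
        }
        /* write() may accept only part of the buffer, so keep going */
        ssize_t done = 0;
        while (done < n) {
            ssize_t w = write(out_fd, buf + done, (size_t)(n - done));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            done += w;
        }
        total += n;
    }
    return total;
}
```

The descriptors can refer to files, pipes, or devices, since Unix exposes all of them through the same calls.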

Example

/* Program to read a list of floats from a binary file. */
/* The file starts with an int saying how many floats it holds; */
/* the floats follow after it. The file name is given on the command line. */
#include <stdio.h>
#include <stdlib.h>   /* exit() */
#include <fcntl.h>    /* open() */
#include <unistd.h>   /* read(), close() */

float bigbuff[1000];

int main(int argc, char **argv)
{
    int fd;
    int bytes_read;
    int file_length;

    if (argc < 2) {
        fprintf(stderr, "usage: %s datafile\n", argv[0]);
        exit(1);
    }
    if ((fd = open(argv[1], O_RDONLY)) == -1) {
        /* error: file not open */
        perror("Datafile");
        exit(1);
    }
    if ((bytes_read = read(fd, &file_length, sizeof(int))) == -1) {
        /* error reading file */
        perror("read");
        exit(1);
    }
    if (file_length > 999) {
        /* file too big */
        fprintf(stderr, "file too big\n");
        exit(1);
    }
    if ((bytes_read = read(fd, bigbuff, file_length * sizeof(float))) == -1) {
        /* error reading file */
        perror("read");
        exit(1);
    }
    close(fd);
    return 0;
}

fsync()

• int fsync(int fildes);
• Ensures that the file on the physical disk gets updated immediately

#include <unistd.h>   /* declaration of fsync() */
. . . .
if (fsync(fd) == -1) {
    perror("fsync");
}

access()

• int access(const char *path, int amode);
• The return value is zero if the requested access is possible and -1 otherwise
• The parameter amode can take the following four symbolic values:
  • R_OK: test for read permission
  • W_OK: test for write permission
  • X_OK: test for execute permission
  • F_OK: test for existence
• See also: stat() and chmod()
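Two small wrappers show how access() is typically called. The helper names are made up for this sketch. R_OK, W_OK, and X_OK may be combined with bitwise OR to test several permissions in one call.

```c
#include <unistd.h>

/* Returns 1 if the path exists (as far as this process can tell), else 0. */
int path_exists(const char *path)
{
    return access(path, F_OK) == 0;
}

/* Returns 1 if the calling process may both read and write the path, else 0. */
int can_read_write(const char *path)
{
    return access(path, R_OK | W_OK) == 0;
}
```

The answer describes the moment of the call only: the file's permissions can change before a later open(), so the result of open() still has to be checked.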

Other System Calls

• off_t lseek(int fildes, off_t offset, int whence);
  • Repositions the read/write file offset
• int rename(const char *old, const char *new);
• int unlink(const char *path);
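The three calls can be combined in a short sketch. Both helper names below are hypothetical: `file_size` uses lseek() to find the end of a file, and `archive_or_discard` deletes an empty file with unlink() or moves a non-empty one to a backup name with rename().

```c
#include <fcntl.h>      /* open() */
#include <stdio.h>      /* rename() */
#include <sys/types.h>  /* off_t */
#include <unistd.h>     /* lseek(), unlink(), close() */

/* Size of the file in bytes, or -1 if it cannot be opened or measured. */
off_t file_size(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    off_t size = lseek(fd, 0, SEEK_END);   /* offset of the end = size */
    close(fd);
    return size;
}

/*
 * Delete the file if it is empty; otherwise move it to backup_path.
 * Returns 0 on success, -1 on failure.
 */
int archive_or_discard(const char *path, const char *backup_path)
{
    off_t size = file_size(path);

    if (size == -1)
        return -1;
    if (size == 0)
        return unlink(path);
    return rename(path, backup_path);
}
```

On POSIX systems rename() replaces an existing file at the destination, so the backup name does not need to be removed first.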

Reading the Contents of a Directory

• The DIR structure is to directory reading what the FILE structure is to files
  • It contains information used by other calls to read the contents of the directory
• To read the contents of a directory, we first need to open it (which returns a pointer to a DIR structure)
• The data for each directory entry is returned in a dirent structure
• The only relevant field in this structure is d_name:
  • A null-terminated character array containing the name of the entry (a file or a directory)

Opening And Closing A Directory

#include <dirent.h>   /* DIR, struct dirent, opendir(), ... */

/* open the directory "/home/users" for reading */
DIR* dir = opendir("/home/users");
if (!dir) {
    perror("opendir");
    exit(1);
}

When we are done reading from a directory, we can close it using the closedir() function:

if (closedir(dir) == -1) {
    perror("closedir");
    exit(1);
}

Reading The Contents Of A Directory

/* this pointer refers to each directory entry in turn */
struct dirent* entry;

/* read the directory's contents and print out the name of each entry */
printf("Directory contents:\n");
while ((entry = readdir(dir)) != NULL) {
    printf("%s\n", entry->d_name);
}
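Putting opendir(), readdir(), and closedir() together, a complete helper might look like the following sketch. The function name `count_entries` is made up for the example; it skips the "." and ".." entries that readdir() normally reports for every directory.

```c
#include <dirent.h>   /* DIR, struct dirent, opendir(), readdir(), closedir() */
#include <string.h>   /* strcmp() */

/*
 * Count the entries in a directory, not counting "." and "..".
 * Returns the count, or -1 if the directory cannot be opened or closed.
 */
int count_entries(const char *path)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;              /* skip the self and parent links */
        count++;
    }

    if (closedir(dir) == -1)
        return -1;
    return count;
}
```

The count covers files and subdirectories alike, since d_name does not say which kind of entry it names; stat() on the full path is one way to tell them apart.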