03 Unix Files


  • 7/30/2019 03 Unix Files


    UNIX Files

    by Armin R. Mikler


    Overview

    Files in UNIX

    Directories and Paths

    User vs. System Mode

    UNIX I/O primitives

    A simple example

    The basic I/O system calls

    Buffered vs. un-buffered I/O

    File Locking

    Ownership and Permissions


    Files

    UNIX Input/Output operations are based on the concept of files.

    Files are an abstraction of specific I/O devices.

    A very small set of system calls provides the primitives that give direct access to the I/O facilities of the UNIX kernel.

    Most I/O operations rely on the use of these primitives.

    We must remember that the basic I/O primitives are system calls, executed by the kernel. What does that mean to us as programmers?


    User and System Space

    [Diagram: in user space, program code calls a library routine such as fread(), which in turn calls read(); the read() system call is executed by kernel code in kernel space.]


    Different types of files

    UNIX deals with two different classes of files:

      Special Files

      Regular Files

    Regular files are just ordinary data files on disk - something you have used all along when you studied programming!

    Special files are abstractions of devices. UNIX deals with devices as if they were regular files.

    The interface between the file system and the device is implemented through a device driver - a program that hides the details of the actual device.


    special files

    UNIX distinguishes two types of special files:

    Block Special Files represent a device with characteristics similar to a disk. The device driver transfers chunks or blocks of data between the operating system and the device.

    Character Special Files represent devices with characteristics similar to a keyboard. The device is abstracted by a stream of bytes that can only be accessed in sequential order.


    Access Primitives

    UNIX provides access to files and devices through a (very) small set of basic system calls (primitives):

    creat()

    open()

    close()

    read()

    write()

    ioctl()


    the open() call

    #include <sys/types.h>

    #include <sys/stat.h>

    #include <fcntl.h>

    int open(const char *path, int flags, [mode_t mode]);

    char *path: a string that contains the pathname (absolute or relative) of the file to be opened.

    int flags: specifies the method of access, i.e. O_RDONLY (read only), O_WRONLY (write only), or O_RDWR (read and write).

    mode_t mode: optional parameter used to set the access permissions upon file creation (only consulted together with O_CREAT).
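A minimal sketch of an open() call with error checking, not taken from the original slides. The helper name open_for_writing and the mode 0644 are assumptions chosen for illustration only.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: open `path` for writing, creating it if needed.
 * Returns a file descriptor, or -1 after reporting the error. */
int open_for_writing(const char *path)
{
    /* flags: write-only access, create if missing, truncate if present.
     * mode:  only consulted because O_CREAT is given (rw-r--r--). */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        perror(path);   /* open() sets errno on failure */
    return fd;
}
```

A caller would check the result against -1 before passing the descriptor to read(), write() or close().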


    read() and write()

    #include <unistd.h>

    ssize_t read(int filedes, void *buffer, size_t n);

    ssize_t write(int filedes, const void *buffer, size_t n);

    int filedes: file descriptor that has been obtained through an open() or creat() call.

    void *buffer: pointer to an array that will hold the data that is read, or that holds the data to be written.

    size_t n: the number of bytes that are to be read or written from/to the file.
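Both calls return the number of bytes actually transferred, which may be smaller than n. The following sketch shows one common way to deal with that; the helper names write_all and copy_fd and the 4096-byte buffer are assumptions, not part of the slides.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: write all n bytes, retrying after short writes. */
ssize_t write_all(int fd, const void *buffer, size_t n)
{
    const char *p = buffer;
    size_t left = n;
    while (left > 0) {
        ssize_t w = write(fd, p, left);
        if (w == -1) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        p += w;
        left -= (size_t)w;
    }
    return (ssize_t)n;
}

/* Hypothetical helper: copy everything from `in` to `out`.
 * Returns the number of bytes copied, or -1 on error. */
ssize_t copy_fd(int in, int out)
{
    char buf[4096];
    ssize_t nread, total = 0;
    /* read() returns 0 at end of file and -1 on error */
    while ((nread = read(in, buf, sizeof buf)) > 0) {
        if (write_all(out, buf, (size_t)nread) == -1)
            return -1;
        total += nread;
    }
    return nread == -1 ? -1 : total;
}
```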


    A close() call

    Although all open files are closed by the OS upon completion of the program, it is good programming style to clean up after you are done with any system resource.

    Please make it a habit to close all files that your program has used as soon as you don't need them anymore!

    #include <unistd.h>

    int close(int filedes);

    Remember, closing resources in a timely manner can improve system performance and prevent deadlocks from happening (more later).


    A rudimentary example:

    #include <fcntl.h>   /* controls file attributes */
    #include <unistd.h>  /* defines symbolic constants */

    int main(void)
    {
        int fd;          /* a file descriptor */
        ssize_t nread;   /* number of bytes read */
        char buf[1024];  /* data buffer */

        /* open the file "data" for reading */
        fd = open("data", O_RDONLY);
        if (fd == -1)
            return 1;    /* the file could not be opened */

        /* read in the data */
        nread = read(fd, buf, sizeof buf);

        /* close the file */
        close(fd);

        return nread == -1;  /* non-zero exit status if the read failed */
    }


    Directories and Paths

    At each point in time, every process has an associated working directory, which is used for pathname resolution.

    If the pathname does not start with a /, the path is assumed to start in the current directory.

    A pathname starting with ./ refers to the current directory.

    A pathname starting with ../ refers to the parent directory.

    These pathnames are referred to as relative pathnames.

    The current directory associated with your shell at login is referred to as the home directory.

    Question: What is the purpose of the search path?


    Some useful functions

    char *getcwd(char *buf, size_t size);

      returns the pathname of the current working directory.

    long sysconf(int name);

      returns values of system-wide limits such as clock ticks per second and the number of processes allowed per user.

    long pathconf(const char *path, int name);

    long fpathconf(int filedes, int name);

      these functions report limits that are associated with a particular file or directory, e.g., the maximum path length.
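A short sketch of how these functions might be called. The helper names, the 4096-byte buffer, and the fallback value are assumptions for illustration; the constants _SC_CLK_TCK, _SC_CHILD_MAX and _PC_PATH_MAX are the standard POSIX names for the limits mentioned above.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: longest relative pathname allowed below `dir`.
 * Returns -1 on error, or `fallback` if the system reports no limit. */
long max_path_length(const char *dir, long fallback)
{
    errno = 0;
    long limit = pathconf(dir, _PC_PATH_MAX);
    if (limit == -1)
        return errno == 0 ? fallback : -1;  /* errno unchanged: no limit */
    return limit;
}

/* Print the working directory and a few limits; returns 0 on success. */
int print_limits(void)
{
    char cwd[4096];     /* assumed to be large enough for this sketch */
    if (getcwd(cwd, sizeof cwd) == NULL) {
        perror("getcwd");
        return -1;
    }
    printf("working directory:      %s\n", cwd);
    printf("clock ticks per second: %ld\n", sysconf(_SC_CLK_TCK));
    printf("processes per user:     %ld\n", sysconf(_SC_CHILD_MAX));
    printf("max path length here:   %ld\n", max_path_length(".", 4096));
    return 0;
}
```

The printed values differ from system to system, which is exactly why they are queried at run time rather than hard-coded.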


    Navigating through Directories

    An important UNIX command is the find command:

      find path ... [operand_expression]

    Have a look at the manual pages - this command is rather complex!

    There are a number of library functions that are related to directory navigation:

    opendir()

    readdir()

    rewinddir()

    closedir()
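The directory functions listed above are typically used together in a loop. A minimal sketch, assuming a hypothetical helper named list_directory:

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: print the entries of `path` and return how many
 * there are (not counting "." and ".."), or -1 if it cannot be opened. */
int list_directory(const char *path)
{
    DIR *dir = opendir(path);
    if (dir == NULL) {
        perror(path);
        return -1;
    }
    int count = 0;
    struct dirent *entry;
    /* readdir() returns NULL once all entries have been delivered */
    while ((entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;
        printf("%s\n", entry->d_name);
        count++;
    }
    closedir(dir);
    return count;
}
```

Calling rewinddir(dir) before closedir() would restart the scan from the first entry.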


    the one who seeks shall find

    The OS remembers the current position of the read-write pointer to the file. The read-write pointer indicates which byte is the next to be read from (or written to) the file.

    The lseek() system call enables the user to change the position of the read-write pointer.

    off_t lseek(int filedes, off_t offset, int start_flag);

    off_t offset: number of bytes to move from the start position.

    int start_flag: indicates from where the offset is going to be applied.


    start_flag

    SEEK_SET (0): Measure the offset from the beginning of the file

    SEEK_CUR (1): Measure the offset from the current position

    SEEK_END (2): Measure the offset from the end of the file

    Example:

    newpos = lseek(fd, (off_t)-16, SEEK_END);

    sets the read-write pointer 16 bytes before the end of the file!
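The SEEK_END example can be wrapped into a small function that reads the tail of a file. This is only a sketch; the name read_tail is hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: read the last n bytes of the file `path` into buf.
 * Returns the number of bytes read, or -1 on error (for example when the
 * file is shorter than n bytes, so the seek would land before its start). */
ssize_t read_tail(const char *path, char *buf, size_t n)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    ssize_t nread = -1;
    /* move the read-write pointer n bytes before the end of the file */
    if (lseek(fd, -(off_t)n, SEEK_END) != (off_t)-1)
        nread = read(fd, buf, n);
    close(fd);
    return nread;
}
```

For a file containing the 26 lowercase letters, read_tail(path, buf, 16) fills buf with the letters k through z.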


    Buffered vs unbuffered I/O

    The system can execute in user mode or kernel mode!

    Memory is divided into user space and kernel space!

    What happens when we write to a file?

      the write call forces a context switch to the system. What does that involve?

      the system copies the specified number of bytes from user space into kernel space (into kernel buffers).

      the system wakes up the device driver to write these buffers to the physical device (if the file system is in synchronous mode).

      the system may select a new process to run.

      finally, control is returned to the process that executed the write call.

    Discuss the effects on the performance of your program!


    Un-buffered I/O

    Every read and write is executed by the kernel.

    Hence, every read and write will cause a context switch in order for the system routines to execute.

    Why do we suffer performance loss?

    How can we reduce the loss of performance?

    ==> We could try to move as much data as possible with each system call.

    How can we measure the performance?
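One possible way to measure the cost is to time the same amount of output written with different chunk sizes. The sketch below is an assumption about how such a measurement could look; the helper name time_writes and the use of clock_gettime() are illustrative choices, not the method prescribed by the slides.

```c
#define _XOPEN_SOURCE 700   /* for clock_gettime() under strict compilers */
#include <fcntl.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: write `total` bytes to `path` using write() calls
 * of `chunk` bytes each. Returns elapsed wall-clock seconds, or -1.0. */
double time_writes(const char *path, size_t total, size_t chunk)
{
    char buf[4096];
    if (chunk == 0 || chunk > sizeof buf)
        return -1.0;
    memset(buf, 'x', sizeof buf);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return -1.0;

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t done = 0; done < total; ) {
        size_t want = total - done < chunk ? total - done : chunk;
        ssize_t w = write(fd, buf, want);   /* one system call per chunk */
        if (w <= 0) {
            close(fd);
            return -1.0;
        }
        done += (size_t)w;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    close(fd);
    return (double)(t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Comparing time_writes("out.tmp", 1 << 20, 1) with time_writes("out.tmp", 1 << 20, 4096) gives a feel for the per-call overhead; the actual numbers depend on the system. The shell's time command is another option, since it reports user and system time separately.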


    Buffered I/O

    explicit versus implicit buffering:

    explicit - collect as many bytes as you can before writing to the file, and read more than a single byte at a time.

      However, use the basic UNIX I/O primitives.

      Careful!! Your program may behave differently on different systems.

      Here, the programmer is explicitly controlling the buffer size.

    implicit - use the stream facility provided by FILE *fd, fopen, fprintf, fflush, fclose, etc.

      A FILE structure contains a buffer (in user space) that is usually the size of the disk blocking factor (512 or 1024).
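The effect of implicit buffering can be observed by checking the size of the file on disk before and after fflush(). This is a sketch under the assumption that the stream is fully buffered, which is the usual default for regular files; the helper names are hypothetical.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Current size of `path` on disk, or -1 if stat() fails. */
long file_size(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? (long)st.st_size : -1;
}

/* Hypothetical demo: write a short string through a FILE stream and
 * report the on-disk size before and after fflush(). Returns 0 on success. */
int buffering_demo(const char *path, long *before, long *after)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    fputs("0123456789", fp);    /* lands in the user-space buffer */
    *before = file_size(path);  /* typically still 0: nothing written yet */
    fflush(fp);                 /* hands the buffer to write() */
    *after = file_size(path);
    fclose(fp);
    return 0;
}
```

With a fully buffered stream the ten bytes only reach the file at the fflush() call, so a second process reading the file in between would not see them.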


    File Locking

    Consider the following problem:

    Programs can obtain a unique integer by reading from a file. The file contains a single integer (at all times), which must be incremented by the program that executes a read. Since multiple programs can compete for the file (a unique integer), we must make sure that the file access is synchronized.

    HOW??

    What happens if we use buffered I/O?


    lockf() - File & Record Locking

    lockf() is a C library function for locking records of a file. Its prototype is

    int lockf(int fd, int func, long size);

    func parameters are:

    F_ULOCK: 0 (unlock a locked section)

    F_LOCK: 1 (lock a section)

    F_TLOCK: 2 (test and lock a section)

    F_TEST: 3 (test a section for locks)

    See the UNIX manual pages!!


    lockf() cont'd

    If we rewind the file before locking AND use a size of 0L as the corresponding size parameter, the entire file is locked.

    lseek(fd, 0L, SEEK_SET) can be used to rewind the file (fd) to the beginning.

    The lockf() function provides both the ability to lock and the ability to test if a lock is set.

    If we are trying to F_LOCK a region that has already been locked by another process, the calling process is put to sleep until the region becomes available.


    lockf() example

    consider the following code segment:

    ...

    if (lockf(fd, F_TEST, size) == 0) {
        rc = lockf(fd, F_LOCK, size);
        ...
    }

    NOTE: it is possible that right after the test has succeeded another process locks the file. Your process will then have to wait until the region becomes available.

    We could use rc = lockf(fd, F_TLOCK, size); to avoid this situation! F_TLOCK returns -1 immediately instead of waiting if the region is already locked.

    When would you use a non-blocking locking call? DISCUSS!
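One way to approach the unique-integer problem is to hold a lockf() lock across the whole read-increment-write sequence and to use the unbuffered primitives throughout. The sketch below is only one possible solution; the function name next_unique, the text format of the counter, and the treatment of an empty file as 0 are assumptions.

```c
#define _XOPEN_SOURCE 700   /* exposes lockf() under strict compilers */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: return the integer stored in `path` and store
 * the next value, holding a lock for the whole read-modify-write.
 * Returns -1 on error. A missing or empty file counts as 0. */
long next_unique(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd == -1)
        return -1;

    long value = -1;
    char text[32] = "";
    /* size 0 from offset 0 locks the entire file; blocks if it is held */
    if (lockf(fd, F_LOCK, 0) == 0) {
        ssize_t n = read(fd, text, sizeof text - 1);
        if (n >= 0) {
            value = strtol(text, NULL, 10);     /* "" parses as 0 */
            int len = snprintf(text, sizeof text, "%ld\n", value + 1);
            /* rewind and overwrite; unbuffered, so it reaches the file now */
            if (lseek(fd, 0, SEEK_SET) == -1 ||
                ftruncate(fd, 0) == -1 ||
                write(fd, text, (size_t)len) != len)
                value = -1;
        }
        lseek(fd, 0, SEEK_SET);     /* unlock from the start of the file */
        lockf(fd, F_ULOCK, 0);
    }
    close(fd);                      /* closing also drops the lock */
    return value;
}
```

Successive calls return 0, 1, 2 and so on. With buffered I/O the new value might still sit in a user-space buffer when the lock is released, which is why the sketch sticks to read() and write().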


    flock() - a 4.3BSD advisory lock

    flock() is a UNIX system call to apply or remove an advisory lock on an open file.

    The locking is only on an advisory basis (not absolute)! What does that mean?

    Prototype: int flock(int fd, int operation);

    The operations are:

    LOCK_SH: Shared lock

    LOCK_EX: Exclusive lock

    LOCK_UN: Unlock

    LOCK_NB: modifier to declare non-blocking, i.e. (LOCK_SH | LOCK_NB) or (LOCK_EX | LOCK_NB)


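    a flock() example

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/file.h>

    void my_lock(int fd)
    {
        /* block until an exclusive lock on fd is obtained */
        if (flock(fd, LOCK_EX) == -1)
        {
            printf("error locking file %d\n", fd);
            exit(-1);
        }
    }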
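The LOCK_NB modifier turns the blocking call into a try-lock. A minimal sketch, assuming a hypothetical helper open_locked and the return-value convention described in its comment:

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: open `path` and try to lock it exclusively without
 * blocking. Returns the locked descriptor, -2 if another open file
 * already holds the lock, or -1 on any other error. */
int open_locked(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd == -1)
        return -1;
    if (flock(fd, LOCK_EX | LOCK_NB) == 0)
        return fd;                      /* lock obtained */
    int busy = (errno == EWOULDBLOCK);  /* somebody else holds it */
    close(fd);
    return busy ? -2 : -1;
}
```

Because the lock is advisory, it only protects against other programs that also call flock() on the file; a program that simply calls write() is not stopped.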


    File Ownership and Permissions

    Every file in UNIX has an owner, a group, and a set of permissions.

    You can use the ls command to view the permissions set for a file:

    ls -l

    Trwxrwxrwx n owner group size date name

    The first field, T:

    - for an ordinary file

    d for a directory

    l for a symbolic link

    p for a FIFO special file


    permissions cont'd

    Trwxrwxrwx n owner group size date name

    The 3 sets of rwx represent the read, write and execute permission flags for the owner, the group, and others, respectively.

    n represents the number of links to this file or directory

    owner represents the current owner of this file

    group represents the group associated with this file

    size is the number of bytes in this file

    date consists of the date and time when the file was last modified

    name is the name of the file


    chmod

    OWNER GROUP OTHERS

    RWX RWX RWX

    4-2-1 4-2-1 4-2-1

    chmod 754 myfile

    OWNER: Read, Write and Execute (4+2+1 = 7)

    GROUP: Read and Execute (4+1 = 5)

    OTHERS: Read (4)


    The file creation mask - umask

    Upon creating a new file, the operating system will apply default permissions.

    The open() and creat() system calls take a mode argument (optional for open()), which allows for the specification of permissions for the file that is created.

    filedes = open("datafile", O_CREAT, 0644)

    However, this process is governed by a mask, which represents the bits that will always be turned off on a newly created file.

    Effectively, the above open() call is executed as:

    filedes = open("datafile", O_CREAT, (~mask) & mode);


    example

    Let's say the mask is set to 04+02+01 = 07

    The call filedes = open("datafile", O_CREAT, 0644) will create the file datafile with permissions 640. WHY??

    The question is: How can we determine the value of the file creation mask?

    The system provides a system call umask(), which can be used to change the default creation mask.

    A umask command is also available at the shell level, so that the default creation mask can be changed for every file that is created.
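The arithmetic (~mask) & mode can be checked against the real system by creating a file under a known mask and reading its permissions back. This is a sketch; the helper name created_mode is hypothetical, and it assumes a file system without default ACLs that would override the mask.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create `path` with `mode` while the creation mask
 * is `mask`, and return the permissions the file actually received
 * (or (mode_t)-1 on error). The previous mask is restored afterwards. */
mode_t created_mode(const char *path, mode_t mode, mode_t mask)
{
    mode_t result = (mode_t)-1;
    mode_t oldmask = umask(mask);
    unlink(path);                       /* make sure the file is new */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, mode);
    if (fd != -1) {
        struct stat st;
        if (fstat(fd, &st) == 0)
            result = st.st_mode & 0777; /* keep only the rwx bits */
        close(fd);
    }
    umask(oldmask);
    return result;
}
```

With mode 0644 and mask 07 the result is 0640: the mask clears all three bits for others, and 0644 only had the read bit set there.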


    umask()

    #include <sys/types.h>

    #include <sys/stat.h>

    mode_t umask(mode_t newmask);

    umask() returns the old mask value!!

    This is a good way of determining the default mask setting!

    (see homework)

    Example:

    mode_t oldmask;
    .....
    oldmask = umask(022); /* What does the mask of 022 accomplish ??? */
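Since umask() always installs a new mask, reading the current value means setting it twice. A minimal sketch with a hypothetical helper name:

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: report the current mask without changing it. */
mode_t current_umask(void)
{
    mode_t old = umask(0);  /* sets the mask to 0, returns the old one */
    umask(old);             /* put the old mask back */
    return old;
}
```

Between the two calls the mask is briefly 0, so this pattern is best kept away from code that creates files in another thread at the same time.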


    The UNIX file system!

    Each UNIX file has a description that is stored in a structure called an inode. An inode includes:

    file size

    location

    owner (uid)

    permissions

    access times

    etc.
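Much of the inode information is available to programs through the stat() system call. The sketch below prints a few of the fields; the helper name print_inode and the output format are assumptions.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: print some of the inode information of `path`.
 * Returns 0 on success, -1 if stat() fails. */
int print_inode(const char *path)
{
    struct stat st;
    if (stat(path, &st) == -1) {
        perror(path);
        return -1;
    }
    printf("inode number: %llu\n", (unsigned long long)st.st_ino);
    printf("size:         %lld bytes\n", (long long)st.st_size);
    printf("owner (uid):  %ld\n", (long)st.st_uid);
    printf("permissions:  %o\n", (unsigned)(st.st_mode & 07777));
    printf("hard links:   %lu\n", (unsigned long)st.st_nlink);
    printf("modified:     %lld (seconds since the epoch)\n",
           (long long)st.st_mtime);
    return 0;
}
```

The location of the data blocks is part of the inode as well, but stat() does not expose it.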


    Directories

    A UNIX directory is a file containing a correspondence between inodes and filenames.

    When referencing a file, the OS traverses the FS tree to find the inode/name in the appropriate directory.

    Once the OS has determined the inode number it can access the inode to get information about the file.


    Links

    A link is an association between a filename and an inode. We distinguish 2 types of links:

    hard links

    soft (or symbolic) links

    Directory entries are hard links as they directly link an inode to a filename.

    Symbolic links use the file (contents) as a pointer to another filename.


    More on links

    Each inode may be pointed to by a number of directory entries (hard links).

    Each inode keeps a counter, indicating how many hard links exist to that inode.

    When a hard link is removed via the rm or unlink command, the OS removes the corresponding link but does not free the inode and the corresponding data blocks until the link count is 0.
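The link counter can be watched from a program with link(), unlink() and stat(). The following sketch assumes a file system that supports hard links; the helper names and the counts[] convention are hypothetical.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: number of hard links to `path`, or -1 on error. */
long link_count(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? (long)st.st_nlink : -1;
}

/* Hypothetical demo: create `name`, add the hard link `alias`, then remove
 * `name`. counts[] receives the link count after each of the three steps.
 * Returns 0 on success, -1 on error. */
int link_demo(const char *name, const char *alias, long counts[3])
{
    unlink(name);
    unlink(alias);
    int fd = open(name, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd == -1)
        return -1;
    close(fd);
    counts[0] = link_count(name);       /* one directory entry */
    if (link(name, alias) == -1)
        return -1;
    counts[1] = link_count(name);       /* two entries, same inode */
    if (unlink(name) == -1)
        return -1;
    counts[2] = link_count(alias);      /* inode survives via alias */
    unlink(alias);                      /* count reaches 0: inode freed */
    return 0;
}
```

The three counts come out as 1, 2 and 1, matching the rule that the inode is only released once the last hard link is gone.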