Chapter Four: UNIX/Linux File Processing
Guide to UNIX Using Linux, Third Edition
Objectives
• Explain UNIX and Linux file processing
• Use basic file manipulation commands to create, delete, copy, and move files and directories
• Employ commands to combine, cut, paste, rearrange, and sort information in files
Objectives (continued)
• Create a script file
• Use the join command to link files using a common field
• Use the awk command to create a professional-looking report
UNIX and Linux File Processing
• Based on the approach that files should be treated as nothing more than character sequences
• Because you can directly access each character, you can perform a wide range of editing tasks, which makes file manipulation flexible
Reviewing UNIX/Linux File Types
• Regular files, also known as ordinary files
– Contain information that you create, maintain, and manipulate, and include ASCII and binary files
• Directories
– System files for maintaining file system structure
Reviewing UNIX/Linux File Types (continued)
• Special files
– Character special files relate to serial I/O devices
– Block special files relate to devices such as disks
Understanding File Structures
• Files can be structured in many ways depending on the kind of data they store
• UNIX/Linux stores data, such as letters and product records, as flat ASCII files
• Three kinds of regular files are
– Unstructured ASCII characters
– Unstructured ASCII records
– Unstructured ASCII trees
Processing Files
• UNIX/Linux processes commands by receiving input from a standard input device (e.g., keyboard) and sending output to a standard output device (e.g., monitor)
• System administrators and programmers refer to standard input as stdin and standard output as stdout
• When UNIX/Linux detects errors, it sends error messages to standard error (stderr), which defaults to the monitor
Using Input and Error Redirection
• You can use redirection operators to retrieve input from something other than the standard input device and send output to something other than the standard output device
• Examples of redirection
– Redirect the ls command output to a file, instead of to the monitor (or screen)
– Redirect a program that receives input from the keyboard to receive input from a file instead
– Redirect error messages to files, instead of to the screen, their default destination
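The three redirection examples above can be sketched as follows. This is a minimal illustration, assuming a Bourne-style shell such as bash; the scratch directory and the file names (fruit.txt, listing.txt, sorted.txt, errors.txt) are hypothetical and chosen only for the demonstration.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data for the input-redirection example.
printf 'pear\napple\nmango\n' > fruit.txt

# 1. Output redirection: ls writes to listing.txt instead of the screen.
ls > listing.txt

# 2. Input redirection: sort reads from fruit.txt instead of the keyboard.
sort < fruit.txt > sorted.txt

# 3. Error redirection: 2> sends stderr to errors.txt instead of the screen.
#    "|| true" only keeps a script going after the expected failure.
ls no_such_file 2> errors.txt || true
```

Here `>` redirects stdout, `<` redirects stdin, and `2>` redirects stderr.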
Manipulating Files
• When you manipulate files, you work with the files themselves, as well as their contents
• Create files using output redirection or the touch command
– cat command - concatenates text and, with output redirection, writes it to a new file
– > filename - output redirection without a command creates an empty file
– touch command - creates empty files
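A short sketch of the three file-creation methods listed above, assuming a Bourne-style shell such as bash. The file names are hypothetical, and the here-document (`<<'EOF'`) is an assumption used only to stand in for text you would normally type at the keyboard and end with Ctrl-D.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# cat with output redirection: text arriving on stdin is written to notes.txt.
cat > notes.txt <<'EOF'
first line
second line
EOF

# Redirection with no command creates an empty file
# (or empties an existing file of that name).
> empty1.txt

# touch creates an empty file if it does not already exist.
touch empty2.txt
```

Note that `> filename` overwrites an existing file, whereas touch leaves existing contents alone and only updates the time stamps.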
Manipulating Files (continued)
• Delete files when no longer needed
– rm command - permanently removes a file; rmdir removes an empty directory
– The -r option of the rm command will remove a directory and everything it contains
• Copy files as a means of backup or as a means to assist with new file creation
– cp command - copies the file(s) specified by the source path to the location specified by the destination path
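The copy and delete commands above might be used as in this sketch. The directory and file names (project, report.txt, report.bak) are hypothetical; everything happens inside a scratch directory because rm is permanent.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical directory tree with one file in it.
mkdir -p project/docs
printf 'draft\n' > project/docs/report.txt

# cp: copy the source file to the destination path as a backup.
cp project/docs/report.txt report.bak

# rm: permanently remove a single file.
rm project/docs/report.txt

# rm -r: remove a directory and everything it contains.
rm -r project
```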
Manipulating Files (continued)
• Move files from directory to directory
– mv command - moves a file from one directory to another, and can also rename it
• Finding a file helps you locate it in the directory structure
– find command - searches for the file that has the name you specify
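As a small illustration of moving and then locating a file, assuming hypothetical directories inbox and archive and a file named memo.txt:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir inbox archive
printf 'memo\n' > inbox/memo.txt

# mv: the file leaves inbox and appears in archive.
mv inbox/memo.txt archive/

# find: search from the current directory (.) downward
# for a file with the given name and print its path.
find . -name memo.txt
```

With this layout, find prints `./archive/memo.txt`.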
Manipulating Files (continued)
• Combining files using output redirection
– cat command - concatenate text of two different files via output redirection
– paste command - joins text of different files in side by side fashion
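The difference between the two commands is easiest to see on the same pair of files. In this sketch, names.txt and phones.txt are hypothetical two-line files.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'Alice\nBob\n' > names.txt
printf '555-0101\n555-0102\n' > phones.txt

# cat: the second file's lines follow the first file's lines.
cat names.txt phones.txt > stacked.txt

# paste: line 1 of each file is joined side by side, then line 2, and so on.
# Fields are separated by a tab by default.
paste names.txt phones.txt > side_by_side.txt
```

Here stacked.txt has four lines, while side_by_side.txt has two lines, each holding a name, a tab, and a phone number.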
Manipulating Files (continued)
Extracting fields of a file using output redirection: the cut command extracts specific columns or fields from a file, leaving the original file unchanged
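A brief sketch of cut selecting fields and character columns. The colon-delimited file users.txt and its contents are hypothetical sample data.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'jdoe:x:1001:Jane Doe\nrroe:x:1002:Richard Roe\n' > users.txt

# -d sets the field delimiter; -f picks fields 1 and 4.
cut -d: -f1,4 users.txt > names.txt

# -c picks character positions 1 through 4 of each line.
cut -c1-4 users.txt > first4.txt
```

The first line of names.txt is `jdoe:Jane Doe` and the first line of first4.txt is `jdoe`; users.txt itself is not modified.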
Manipulating Files (continued)
• Sorting the contents of a file
– sort command - sorts a file’s contents alphabetically or numerically
• The sort command offers many options
– You can sort the contents of a file and redirect the output to another file
– Utilizing a sort key provides the option of sorting on a field position within each line
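The following sketch sorts the same file two ways: alphabetically on the whole line, and numerically on a sort key. The file stock.txt and its contents are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'pear 4\napple 30\nmango 12\n' > stock.txt

# Default sort: alphabetical, with the result redirected to another file.
sort stock.txt > by_name.txt

# Sort key: -k 2,2 restricts the key to field 2; the n modifier
# compares that field numerically rather than as text.
sort -k 2,2n stock.txt > by_count.txt
```

by_name.txt begins with `apple 30`, while by_count.txt begins with `pear 4`.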
Creating Script Files
• UNIX/Linux users create shell script files to contain commands that can be run sequentially as a set, which automates command sequences and makes them reusable
• Users create script files in a text editor such as vi, then make the script executable using the chmod command with the x (execute) permission, for example chmod u+x scriptname
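A minimal sketch of creating and running a script file. The script name listfiles.sh and its contents are hypothetical, and a here-document stands in for typing the script in an editor such as vi; this assumes the scratch directory allows executing files.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Write a two-command script (normally done in an editor).
cat > listfiles.sh <<'EOF'
#!/bin/sh
# listfiles.sh: hypothetical example script
echo "Files in this directory:"
ls
EOF

# Give the file's owner execute (x) permission.
chmod u+x listfiles.sh

# Run the script; its commands execute in order.
./listfiles.sh > run_output.txt
```

Without the chmod step, `./listfiles.sh` would typically fail with a permission error, although `sh listfiles.sh` would still run it.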
Using the join Command on Two Files
• Sometimes you want to link the information in two files
• The join command is often used in relational database processing
• The join command associates information in two different files on the basis of a common field or key in those files
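A small sketch of join matching records on a common key. The files employees.txt and departments.txt are hypothetical; both use the first field (an ID) as the key and are already sorted on it, which join expects.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '101 Alice\n102 Bob\n103 Carol\n' > employees.txt
printf '101 Sales\n102 Support\n104 Finance\n' > departments.txt

# By default join matches on the first field of each file and
# prints the key followed by the remaining fields from both files.
join employees.txt departments.txt > joined.txt
```

Only keys present in both files appear in the result, so joined.txt contains `101 Alice Sales` and `102 Bob Support`; IDs 103 and 104 are dropped.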
A Brief Introduction to the Awk Program
• Awk, a pattern-scanning and processing language, helps to produce professional-looking reports
• Awk provides a powerful programming environment that can perform actions on files that are difficult to duplicate with a combination of other commands
A Brief Introduction to the Awk Program (continued)
• Awk checks to see if the input records in specified files satisfy a pattern
• If so, awk executes a specified action
• If no pattern is provided, awk applies the action to every record
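The pattern-and-action model above can be sketched with two short awk programs. The file stock.txt, its contents, and the threshold of 10 are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'apple 30\nmango 12\npear 4\n' > stock.txt

# Pattern plus action: for records whose second field exceeds 10,
# print the first field.
awk '$2 > 10 { print $1 }' stock.txt > well_stocked.txt

# Action with no pattern: applied to every record.
# printf pads the name to 10 characters and right-aligns the count.
awk '{ printf "%-10s %5d\n", $1, $2 }' stock.txt > report.txt
```

well_stocked.txt lists apple and mango, while report.txt holds all three records in aligned columns.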
Chapter Summary
• UNIX/Linux supports regular files, directories, and character and block special files
• File structures depend on data being stored
• UNIX/Linux receives input from the standard input device (keyboard, stdin) and sends output to the standard output device (monitor, stdout)
Chapter Summary (continued)
• touch updates a file’s time and date stamps and creates empty files
• rmdir removes empty directories
• cut extracts specific columns or fields from a file
• paste combines two or more files
• sort sorts a file’s contents
Chapter Summary (continued)
• To automate command processing, include commands in a script file
• join extracts data from two files sharing a common field and uses this field to join the two files
• Awk is a pattern-scanning and processing language useful for creating a formatted report with a professional look