Introduction to Unix (CA263) File Processing (continued) By Tariq Ibn Aziz


Introduction to Unix (CA263)

File Processing (continued)

By

Tariq Ibn Aziz


Guide to UNIX Using Linux, Third Edition 2

Objectives

• Use the pipe operator to redirect the output of one command to another command

• Use the cut, paste, and tr commands to select and transform data, and the grep command to search for a specified pattern in a file

• Use the uniq command to remove duplicate lines from a file

• Use the comm and diff commands to compare two files

• Use manipulation and transformation commands, which include sed, tr, and pr

• Design a new file-processing application by creating, testing, and running shell scripts


Advancing Your File-Processing Skills

• Selection commands focus on extracting specific information from files


Advancing Your File-Processing Skills (continued)


Using the Selection Commands

• Using the Pipe Operator

– The pipe operator (|) redirects the output of one command to the input of another

– The pipe operator can connect several commands on the same command line
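As a minimal sketch of several commands connected on one line, the pipeline below passes a fixed list of names through sort and then head. The names are sample data chosen for illustration, and the variable first_two is only there so the result can be inspected.

```shell
# Sample login names stand in for real command output (assumed data).
# printf writes the list, sort orders it, head keeps the first two lines.
first_two=$(printf '%s\n' dawn amin taziz root | sort | head -n 2)

# Show the result of the three-command pipeline
echo "$first_two"
```

Each | hands the standard output of the command on its left to the standard input of the command on its right, so no intermediate files are needed.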


Cut Command

• Extracts selected columns or fields from a data file or from the output of a command

– cut -cchars file

– cut -c5- data

– The second command extracts characters 5 through the end of each line of data and writes the results to standard output.


Cut Example

$ who
root     console Feb 24 08:54
taziz    tty02   Feb 24 12:55
dawn     tty08   Feb 24 09:15
amin     tty10   Feb 24 15:35

$ who | cut -c1-8
root
taziz
dawn
amin


Cut Example

$ who
root     console Feb 24 08:54
taziz    tty02   Feb 24 12:55
dawn     tty08   Feb 24 09:15
amin     tty10   Feb 24 15:35

$ who | cut -c1-8 | sort
amin
dawn
root
taziz


Cut Example

$ who
root     console Feb 24 08:54
taziz    tty02   Feb 24 12:55
dawn     tty08   Feb 24 09:15
amin     tty10   Feb 24 15:35

$ who | cut -c10-16
console
tty02
tty08
tty10


Cut Example

$ who
root     console Feb 24 08:54
taziz    tty02   Feb 24 12:55
dawn     tty08   Feb 24 09:15
amin     tty10   Feb 24 15:35

$ who | cut -c1-8,18-
root    Feb 24 08:54
taziz   Feb 24 12:55
dawn    Feb 24 09:15
amin    Feb 24 15:35


Cut Command: The -d and -f Options

• The -d and -f options are used with cut when the data is delimited by a particular character. The format of the cut command is as follows:

cut -ddchar -ffields file

• Here dchar is the character that delimits each field of the data, and fields specifies the fields to be extracted from file. Field numbers start at 1.


Cut Command: The -d and -f Options

cut -ddchar -ffields file

$ cat phonebook
tariq:905-3456:11 Driscoll Dr
kalim:205-3456:12 Driscoll Dr
imran:304-3456:13 Driscoll Dr
hasam:203-3456:14 Driscoll Dr
$ cut -d: -f1 phonebook
tariq
kalim
imran
hasam
$


Cut Command

• Fields are separated by tabs, so cutting by character position does not line up with the fields.

$ cat phonebook
Alice Chebba    596-2015
Barbara Swingle 205-9257
Jeff Goldberg   295-3378
Liz Stachiw     775-2298

$ cut -c1-15 phonebook
Alice Chebba    59
Barbara Swingle
Jeff Goldberg   2
Liz Stachiw     775
$


Cut Command

• When fields are separated by tabs, use the -f option of cut. Tab is the default delimiter, so -d is not needed.

$ cat phonebook
Alice Chebba    596-2015
Barbara Swingle 205-9257
Jeff Goldberg   295-3378
Liz Stachiw     775-2298

$ cut -f1 phonebook
Alice Chebba
Barbara Swingle
Jeff Goldberg
Liz Stachiw
$


Paste Command

• paste is the opposite of cut: cut breaks lines apart, and paste puts lines together.

$ cat names
Tubs
Emanuel
Lucy
Ralph
$ cat numbers
(307)542-5356
(212)954-3456
(212)MH6-9959
(212)BE0-7741

$ paste names numbers
Tubs    (307)542-5356
Emanuel (212)954-3456
Lucy    (212)MH6-9959
Ralph   (212)BE0-7741


Paste Command: The -d Option

• The -d option specifies the character that paste places between the joined lines, in place of the default tab.

$ paste -d'+' names numbers
Tubs+(307)542-5356
Emanuel+(212)954-3456
Lucy+(212)MH6-9959
Ralph+(212)BE0-7741


Paste Command: The -s Option

• The -s option tells paste to join the lines of a single file, separated by tabs, rather than joining corresponding lines from different files.

$ paste -s names
Tubs    Emanuel Lucy    Ralph

$ paste -d' ' -s - < names
Tubs Emanuel Lucy Ralph


tr Command

• The tr filter translates characters read from standard input and writes the result to standard output. The general form of the command is:

tr from-chars to-chars

$ cat intro
The UNIX operating system was pioneered by Ken
$ tr e x < intro
Thx UNIX opxrating systxm was pionxxrxd by Kxn


tr Command

• You can translate colons into tab characters to produce more readable output by simply adding a tr command to the end of the pipeline.

$ cut -d: -f1,6 /etc/passwd
root:/
cron:/
bin:/
uucp:/usr/spool/uucp
asg:/
$ cut -d: -f1,6 /etc/passwd | tr : '\t'
root    /
cron    /
bin     /
uucp    /usr/spool/uucp
asg     /


tr command

• Octal values of some ASCII characters

Character        Octal value
Bell             7
Backspace        10
Tab              11
Newline          12
Linefeed         12
Formfeed         14
Carriage Return  15
Escape           33

$ date | tr ' ' '\12'
Sun
Mar
10
19:13:46
EST
1985


tr command

$ cat intro
The UNIX operating system was pioneered by Ken

$ tr '[a-z]' '[A-Z]' < intro
THE UNIX OPERATING SYSTEM WAS PIONEERED BY KEN


tr Command: The -s Option

• You can use the -s option of tr to “squeeze” multiple consecutive occurrences of a character into a single character.

$ cat lotsofspaces
This    is   an  example    of a
file   that  contains   a   lot of    blank spaces.
$ tr -s ' ' ' ' < lotsofspaces
This is an example of a
file that contains a lot of blank spaces.


tr Command: The -d Option

• tr can also be used to delete single characters from a stream of input.

$ tr -d ' ' < intro
TheUNIXoperatingsystemwaspioneeredbyKen


Using the grep Command

• Used to search for a specific pattern in a file, such as a word or phrase

• grep’s options and wildcard support allow for powerful search operations

• You can increase grep’s usefulness by combining it with other commands, such as head or tail

Syntax: grep [-options] [pattern] [filename]

Useful options include:
-i ignores case
-l lists only file names
-c counts the number of matching lines instead of showing them
-r searches through files under all subdirectories


grep Example

$ grep shell ed.cmd
files, and is independent of the shell.
to the shell, just type in a q.


grep example

$ cat phone_book
Alice Chebba    596-2015
Barbara Swingle 598-9257
Jeff Goldberg   295-3378
Liz Stachiw     775-2298
Susan Goldberg  338-7776
Tony Iannino    386-1295
$ grep Susan phone_book
Susan Goldberg  338-7776


grep -v Option Example

• Print all lines that don’t contain Unix

$ cat intro
The Unix operating system was pioneered by Ken
Thompson. Main goal of Unix was to create Unix
environment for efficient program development
$ grep -v 'Unix' intro
environment for efficient program development


grep -i Option Example

• Print all lines that contain the pattern, ignoring the difference between uppercase and lowercase letters

$ cat intro
The unix operating system was pioneered by Ken
Thompson. Main goal of UNIX was to create UNIX
environment for efficient program development
$ grep -i 'Unix' intro
The unix operating system was pioneered by Ken
Thompson. Main goal of UNIX was to create UNIX


grep [tT] Pattern Example

• With [tT] in the pattern, grep matches either an uppercase or a lowercase letter T

$ cat intro
The unix operating system was pioneered by Ken
Thompson. Main goal of UNIX was to create UNIX
environment for efficient program development
$ grep '[tT]he' intro
The unix operating system was pioneered by Ken


grep -l Option Example

• grep with -l just gives you a list of the files that contain the specified pattern

$ cat intro
The unix operating system was pioneered by Ken
Thompson. Main goal of UNIX was to create UNIX
environment for efficient program development
$ grep -l 'Main goal' *
intro
$


grep -n Option Example

• grep with -n precedes each matching line with its line number in the file

$ cat intro
The unix operating system was pioneered by Ken
Thompson. Main goal of UNIX was to create UNIX
environment for efficient program development
$ grep -n 'environment' intro
3:environment for efficient program development
$
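The -c and -r options listed for grep, and the idea of combining grep with head, are not demonstrated in the examples. The sketch below tries them on small sample files; the directory and file names (books, old, contacts) are made up for illustration, and -r assumes a grep that supports it, as GNU and BSD grep do.

```shell
# Work in a scratch directory so no real files are touched
cd "$(mktemp -d)"
mkdir -p books/old

# Sample data (assumed), modeled on the phone_book file
printf '%s\n' 'Jeff Goldberg 295-3378' 'Liz Stachiw 775-2298' \
    'Susan Goldberg 338-7776' > books/phone_book
printf '%s\n' 'Tony Iannino 386-1295' > books/old/contacts

# -c prints how many lines match instead of the lines themselves
count=$(grep -c Goldberg books/phone_book)
echo "$count"

# -r descends into subdirectories; -l prints only the matching file names
found=$(grep -r -l Iannino books)
echo "$found"

# grep combined with head: keep only the first matching line
first=$(grep -i goldberg books/phone_book | head -n 1)
echo "$first"
```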


Using the uniq Command

• The uniq command removes duplicate lines from a file

• It compares only consecutive lines, so uniq requires sorted input

• uniq has an option that generates output containing one copy of each line that has a duplicate

Syntax: uniq [-option] [file1 > file2]

Useful options include:
-u outputs only the lines of the source file that are not duplicated
-d outputs only one copy of each line that has a duplicate, and does not show unique lines
-i ignores case
-c precedes each line with the number of times it occurs


uniq Command Example

$ cat parts
muffler
muffler
shocks
alternator
battery
battery
radiator
radiator
coil
spark plugs
spark plugs
coil

$ uniq parts > inventory
$ more inventory
muffler
shocks
alternator
battery
radiator
coil
spark plugs
coil
$

Why is coil listed twice?


uniq -u Command Example

$ cat parts
muffler
muffler
shocks
alternator
battery
battery
radiator
radiator
coil
spark plugs
spark plugs
coil

$ uniq -u parts > inventory
$ more inventory
shocks
alternator
coil
coil
$

Why is coil listed twice?

Answer: The two occurrences of coil are not on consecutive lines, so uniq treats each one as a unique line.
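Because uniq compares only consecutive lines, sorting the file first brings the two coil lines together. The following sketch rebuilds the parts file in a scratch directory and pipes it through sort before uniq; it also tries the -d and -c options from the option list.

```shell
# Work in a scratch directory and recreate the parts file from the example
cd "$(mktemp -d)"
printf '%s\n' muffler muffler shocks alternator battery battery \
    radiator radiator coil 'spark plugs' 'spark plugs' coil > parts

# Sorting first makes the duplicate coil lines adjacent, so uniq drops one
sort parts | uniq > inventory
cat inventory

# -d lists one copy of each line that occurs more than once
duplicated=$(sort parts | uniq -d)
echo "$duplicated"

# -c precedes each line with the number of times it occurs
sort parts | uniq -c
```

With sorted input, coil appears once in inventory and shows up in the -d list along with the other repeated parts.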


Using the comm Command

• Used to identify duplicate lines in sorted files

• Unlike uniq, it does not remove duplicates, and it works with two files rather than one

• It compares file1 and file2 line by line and produces three-column output

– Column one contains lines found only in file1

– Column two contains lines found only in file2

– Column three contains lines found in both files
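A small sketch of the three-column output, using two sorted sample files whose contents are invented for illustration. The -12 and -23 forms are standard comm options that suppress the numbered columns; they are not mentioned in the slides.

```shell
# Work in a scratch directory with two sorted sample files (assumed data)
cd "$(mktemp -d)"
printf '%s\n' alternator battery coil > file1
printf '%s\n' battery coil muffler > file2

# Full three-column output; columns two and three are indented with tabs
comm file1 file2

# Suppress columns 2 and 3: lines found only in file1
only_first=$(comm -23 file1 file2)
echo "$only_first"

# Suppress columns 1 and 2: lines found in both files
in_both=$(comm -12 file1 file2)
echo "$in_both"
```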


Using the diff Command

• Attempts to determine the minimal changes needed to convert file1 to file2

• The output displays the line(s) that differ

• Codes in the output indicate that in order for the files to match, specific lines must be added or deleted
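To make the output codes concrete, here is a minimal sketch with two three-line sample files (invented for illustration). In the default output format, d marks lines to delete from file1 and a marks lines to add from file2.

```shell
# Work in a scratch directory with two sample files (assumed data)
cd "$(mktemp -d)"
printf '%s\n' muffler shocks coil > file1
printf '%s\n' muffler coil battery > file2

# diff exits with status 1 when the files differ, so "|| true" keeps
# a script that stops on errors from aborting here
changes=$(diff file1 file2 || true)
echo "$changes"
# 2d1        line 2 of file1 must be deleted
# < shocks
# 3a3        after line 3 of file1, add line 3 of file2
# > battery
```

Lines starting with < come from file1 and lines starting with > come from file2.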


Using the pr Command to Format Your Output

• pr prints specified files on the standard output in paginated form

• By default, pr formats the specified files into single-column pages of 66 lines

• Each page has a five-line header containing the file name, its latest modification date, and current page, and a five-line trailer consisting of blank lines
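The default page layout can be checked with a short sample file, as in the sketch below. The file contents and the header text are invented for illustration; -h (custom header text) and -t (omit header and trailer) are standard pr options that the slides do not cover.

```shell
# Work in a scratch directory with a three-line sample file (assumed data)
cd "$(mktemp -d)"
printf '%s\n' muffler shocks coil > parts

# Default layout: header, body, and trailer are padded out to a full page
page_lines=$(pr parts | wc -l | tr -d ' ')
echo "$page_lines"

# -h replaces the file name in the page header with custom text
header_hits=$(pr -h 'Parts inventory' parts | grep -c 'Parts inventory')
echo "$header_hits"

# -t omits the header and trailer and prints only the body
pr -t parts
```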


Using a Shell Script to Implement the Application

• Shell scripts should contain:

– The commands to execute

– Comments to identify and explain the script so that users or programmers other than the author can understand how it works

• Use the pound (#) character to mark comments in a script file
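The sketch below shows what such a script might look like. The script name names_report, its sample data, and the variable names are hypothetical and are not part of the course material; the point is the mix of commands and # comments.

```shell
#!/bin/sh
# names_report: hypothetical example script
# Purpose: list the names in a colon-delimited phone book in sorted order

# Sample data so the sketch runs on its own (fields are name:phone:address)
phonebook=$(mktemp)
printf '%s\n' 'tariq:905-3456:11 Driscoll Dr' \
    'kalim:205-3456:12 Driscoll Dr' \
    'imran:304-3456:13 Driscoll Dr' > "$phonebook"

# Extract field 1 (the name) and sort the result
names=$(cut -d: -f1 "$phonebook" | sort)
echo "$names"

# Remove the temporary file
rm -f "$phonebook"
```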


Running a Shell Script

• You can run a shell script in virtually any shell that you have on your system

• The Bash shell accepts more variations in command structures than other shells

• Run the script by typing sh followed by the name of the script, or make the script executable and type ./ prior to the script name
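Both ways of running a script can be tried with a throwaway example. In this sketch the script name hello and its contents are made up for illustration.

```shell
# Work in a scratch directory
cd "$(mktemp -d)"

# Write a small script named hello (hypothetical name)
cat > hello <<'EOF'
#!/bin/sh
# hello: prints a greeting
echo "Hello from a shell script"
EOF

# Method 1: type sh followed by the name of the script
out_sh=$(sh hello)
echo "$out_sh"

# Method 2: make the script executable, then type ./ before its name
chmod +x hello
out_exec=$(./hello)
echo "$out_exec"
```

The first method works without execute permission because sh reads the file as input; the second requires chmod +x.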


Putting It All Together to Produce the Report

• An effective way to develop applications is to combine many small scripts in a larger script file

• Have the last script added to the larger script print a report indicating script functions and results
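One way to picture this is a larger script built from three small steps, with the reporting step added last. Everything in the sketch (file names, data, and the wording of the report) is assumed for illustration.

```shell
#!/bin/sh
# Hypothetical larger script assembled from three small steps
cd "$(mktemp -d)"

# Step 1 (originally its own small script): create the raw data
printf '%s\n' muffler muffler shocks coil battery coil > parts

# Step 2: sort the data and remove duplicate lines
sort parts | uniq > inventory

# Step 3 (added last): report what each step did and show the result
total=$(wc -l < parts | tr -d ' ')
unique=$(wc -l < inventory | tr -d ' ')
report="Step 1 read $total lines; step 2 kept $unique unique parts"
echo "$report"
cat inventory
```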