Unix Files, IO Plumbing and Filters

The file system and pathnames
Files with more than one link
Shell wildcards
Characters special to the shell
Pipes and IO redirection
Commands used with redirection
Some useful filters: awk, grep, sed etc.

The file system and pathnames

The filesystem is a set of conventions and a specific data design for organising files and folders (directories) on a hard disk, USB drive, CD or other media, e.g. NTFS and FAT (Windows) or ext3 (Linux).

Files on a filesystem include plain files, directory files (or folders), links (Unix/Linux only) and device files.

All modern OSs map files on a filesystem as an inverted tree.

Inverted tree with root dir as /

/ (root directory)
|-- dir1
|   |-- file3
|   `-- dir3
|       |-- file6
|       |-- file7
|       `-- file8
|-- file1
|-- file2
|-- dir2
|   |-- file4
|   `-- file5
`-- home
    |-- fred
    |   |-- cupboard
    |   |   |-- socks
    |   |   |-- vests
    |   |   `-- shirts
    |   `-- wardrobe
    `-- bill

Absolute pathnames start with a / and relative pathnames don't.
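
A minimal sketch of the difference, assuming a throwaway directory made with mktemp. The notes directory and todo file are hypothetical names invented for the illustration.

```shell
# Build a small throwaway tree (notes/todo are illustrative names only).
top=$(mktemp -d)
mkdir -p "$top/home/fred/notes" "$top/home/bill"
echo "buy socks" > "$top/home/fred/notes/todo"

# Absolute pathname: starts with / and works from any current directory.
abs_result=$(cat "$top/home/fred/notes/todo")

# Relative pathname: resolved starting from the current directory.
cd "$top/home/fred" || exit 1
rel_result=$(cat notes/todo)

# .. climbs one level, so the same file can be reached from a sibling directory.
cd "$top/home/bill" || exit 1
dotdot_result=$(cat ../fred/notes/todo)

echo "$abs_result / $rel_result / $dotdot_result"

cd / && rm -rf "$top"
```

All three commands read the same file; only the starting point of the name lookup differs.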

Some pathname examples

Differences between Windows and Unix

Pathname delimiters: Unix uses the forward slash / and Windows the backslash \

Windows has drive letters, e.g. a:

Unix trees mount filesystems onto empty directories.

So a:\reports\report.txt on Windows becomes /mnt/floppy/reports/report.txt on Unix, with the floppy mounted on /mnt/floppy.

Links

Multiple paths to the same file save space and keep every path up to date, because there is only one copy of the data.

Hard link: ln /sales/personnel/simon /bonuses/recipients/simon

Soft link: ln -s /usr/wizard/wands/magic/software/binaries /wiz

Hard links are only possible within the same filesystem; soft (symbolic) links are needed to link between mounted devices or partitions.
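
The behaviour of the two link types can be sketched as follows, assuming a scratch directory from mktemp. The file names simon_hard and simon_soft are hypothetical.

```shell
work=$(mktemp -d)
cd "$work" || exit 1
echo "bonus data" > simon

ln simon simon_hard      # hard link: a second name for the same inode
ln -s simon simon_soft   # soft link: a small file that stores the path "simon"

# Both hard-linked names report the same inode number.
inode_orig=$(ls -i simon | awk '{ print $1 }')
inode_hard=$(ls -i simon_hard | awk '{ print $1 }')

# Removing the original name leaves the data reachable through the hard link,
# but the soft link now dangles because its target path no longer exists.
rm simon
hard_content=$(cat simon_hard)
if [ -e simon_soft ]; then soft_state="ok"; else soft_state="dangling"; fi

echo "$inode_orig $inode_hard $hard_content $soft_state"

cd / && rm -rf "$work"
```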

Shell Wildcards

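Often loosely called "regular expressions", REs or regexes; strictly speaking, shell wildcards are glob patterns, a simpler notation than true regular expressions.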

Used in shell commands to match or "glob" multiple files.

Similar language used in Grep, Awk, Perl, Java, Python etc. to find/replace expressions in strings, e.g. tags in HTML

Shell Wildcards

Shell Wildcards can be combined, giving greater flexibility

REs and special characters

A . at the beginning of a filename must be matched explicitly, and / must always be matched explicitly.
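
A short sketch of wildcard matching, run in a scratch directory holding a handful of hypothetical files. LC_ALL=C is set only to make the order of the expanded names predictable.

```shell
LC_ALL=C; export LC_ALL
work=$(mktemp -d)
cd "$work" || exit 1
touch file1 file2 file10 notes.txt .hidden

star=$(echo *)             # every name that does not start with a dot
question=$(echo file?)     # "file" followed by exactly one character
bracket=$(echo file[12])   # "file" followed by one of the characters 1 or 2
combined=$(echo f*[0-9])   # wildcards combined: starts with f, ends in a digit
dotfiles=$(echo .h*)       # the leading dot has to be matched explicitly

echo "star:     $star"
echo "question: $question"
echo "bracket:  $bracket"
echo "combined: $combined"
echo "dotfiles: $dotfiles"

cd / && rm -rf "$work"
```

Note that * on its own does not list .hidden, which is why the last pattern spells out the leading dot.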

Avoid putting *,?$^()[]\/&<>|"'! and spaces in filenames.

If you can't avoid them, the backslash \ will usually escape the special character following it:

rm \$fred deletes the file called $fred
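
As an illustration, the sketch below creates and removes three awkwardly named files in a scratch directory. The quoting forms shown alongside the backslash are standard shell behaviour rather than something taken from the slides, and the file names other than $fred are hypothetical.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# Backslash, single quotes and double quotes each protect a special character.
touch \$fred 'my file' "a*b"
count_before=$(ls | wc -l | tr -d ' ')

rm \$fred      # the backslash stops the shell expanding $fred as a variable
rm 'my file'   # the quotes stop the space splitting the name into two words
rm "a*b"       # inside quotes the * is a literal character, not a wildcard

count_after=$(ls | wc -l | tr -d ' ')
echo "files before: $count_before, after: $count_after"

cd / && rm -rf "$work"
```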

Other special characters:

. at the start of a name hides the file from a plain ls listing (typically a system or configuration file)
~ at the start of a path means the home directory

Pipes and redirection

Redirections can be combined e.g:

grep xxx < f1 | awk '{ print $2 }' 2> f2 \
  | tee f3 | wc -l > f4

\ is used to escape the newline so a single command line can be typed using more than one line on the terminal.
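
To see what each redirection in that command line captures, here is a sketch that runs it against a hypothetical three-line f1 in a scratch directory.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# Hypothetical input: three lines, two of which contain "xxx".
printf 'xxx alpha\nyyy beta\nxxx gamma\n' > f1

# stdin comes from f1, awk's error messages go to f2,
# tee keeps a copy of the data in f3, and the line count lands in f4.
grep xxx < f1 | awk '{ print $2 }' 2> f2 \
  | tee f3 | wc -l > f4

f2_size=$(wc -c < f2 | tr -d ' ')   # awk reported no errors, so f2 is empty
f3_content=$(cat f3)                # second column of the matching lines
f4_count=$(tr -d ' ' < f4)          # number of lines that passed through tee

echo "f2 bytes: $f2_size"
echo "f3: $f3_content"
echo "f4: $f4_count"

cd / && rm -rf "$work"
```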

Commands often used with redirection

Some useful filters

A filter is a program which reads from stdin and writes to stdout. Unix has many useful programs which filter data by the contents of text lines, by row position or by column. We can also sort a file, reduce it to unique keys, or chop off its top or tail.

These filters will make our pipelines into a general purpose and easy to use text-processing toolkit, once a few examples have been mastered.
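
A filter needs nothing more than a loop from stdin to stdout. The sketch below defines a hypothetical filter, number_lines (not a standard command), and places it at the end of a pipeline.

```shell
# number_lines: hypothetical filter that prefixes each input line with its line number.
number_lines() {
    n=0
    while IFS= read -r line; do
        n=$((n + 1))
        printf '%d: %s\n' "$n" "$line"
    done
}

# Because it reads stdin and writes stdout, it slots into a pipeline like any other filter.
result=$(printf 'pear\napple\n' | sort | number_lines)
echo "$result"
```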

awk

Largely replaced as a scripting language by Perl, but still useful as a simple column filter.

Example 1:

cat file1 | awk '{ print $5, $2 }'

outputs columns 5 then 2 of file1, which must have spaces or tabs between columns. Use -F<char> to split the input on <char>, e.g. -F: for colon-separated columns. You might prefer cut for this job.
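
A small sketch of both cases, using made-up records (name, department, extension) rather than a real file:

```shell
# Hypothetical colon-separated records: name, department, extension.
data='fred:sales:4021
bill:stores:4377'

# -F: sets the input field separator; $3 and $1 are the third and first fields.
awk_out=$(printf '%s\n' "$data" | awk -F: '{ print $3, $1 }')

# Whitespace-separated input needs no -F at all.
ws_out=$(printf 'one two three four five\n' | awk '{ print $5, $2 }')

echo "$awk_out"
echo "$ws_out"
```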

awk example 2

Put an awk program of any complexity into a separate file.

Use the -f flag to give the filename on the shell command line. This awkprog file has 3 lines (with GNU ls -l, column 3 is the owner and column 5 the size in bytes):

BEGIN { total=0 }
{ if ( $3 == "usr19999" ) total=total+$5 }
END { print total/1024 }

Run awk program with command:

ls -l /scratch | awk -f awkprog

Reports the total size in kilobytes (bytes/1024) of the files belonging to usr19999 in the /scratch directory.
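
The sketch below runs an awk program file against a canned listing, so the result is predictable. It assumes the GNU ls -l column layout, where the owner is field 3 and the size in bytes is field 5; the listed files and the second user are invented.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# The three-line awk program, written to a file.
# Assumption: GNU ls -l layout, so the size in bytes is field 5.
cat > awkprog <<'EOF'
BEGIN { total=0 }
{ if ( $3 == "usr19999" ) total=total+$5 }
END { print total/1024 }
EOF

# Canned listing in ls -l layout (hypothetical files), standing in for ls -l /scratch.
cat > listing <<'EOF'
total 12
-rw-r--r-- 1 usr19999 staff 2048 Jan  5 10:00 a.dat
-rw-r--r-- 1 usr20000 staff 4096 Jan  5 10:01 b.dat
-rw-r--r-- 1 usr19999 staff 1024 Jan  5 10:02 c.dat
EOF

# 2048 + 1024 bytes belong to usr19999, i.e. 3 kilobytes.
kbytes=$(awk -f awkprog listing)
echo "$kbytes"

cd / && rm -rf "$work"
```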

cut

For awk's usual purpose as a column filter, you might prefer to learn cut(1) as an alternative. The following command extracts columns 1, 3, 6 and 7 from /etc/passwd, with colon (:) as input delimiter and space as output delimiter. Note that --output-delimiter is a GNU cut option.

cat /etc/passwd | cut -d: --output-delimiter=" " -f1,3,6,7

Input and output of cut command

Input sample from /etc/passwd

bert:x:1003:1003:Bert Trusted User,,,:/home/bert:/bin/bash
avahi-autoipd:x:111:111:Avahi autoip daemon,,,:/var/lib/avahi-autoipd:/bin/false
avahi:x:116:121:Avahi mDNS daemon,,,:/var/run/avahi-daemon:/bin/false
snort:x:117:125:Snort IDS:/var/log/snort:/bin/false
pulse:x:119:126:PulseAudio daemon,,,:/var/run/pulse:/bin/false

Output sample

bert 1003 /home/bert /bin/bash
avahi-autoipd 111 /var/lib/avahi-autoipd /bin/false
avahi 116 /var/run/avahi-daemon /bin/false
snort 117 /var/log/snort /bin/false
pulse 119 /var/run/pulse /bin/false
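
Where a cut without --output-delimiter is all that is available, the same columns can be produced in other ways. The sketch below shows two possible equivalents on two of the sample lines; neither comes from the original slides.

```shell
sample='bert:x:1003:1003:Bert Trusted User,,,:/home/bert:/bin/bash
snort:x:117:125:Snort IDS:/var/log/snort:/bin/false'

# cut keeps the colon between the selected fields; tr then turns each colon into a space.
with_cut=$(printf '%s\n' "$sample" | cut -d: -f1,3,6,7 | tr ':' ' ')

# awk splits on the colon and joins the printed fields with a space by default.
with_awk=$(printf '%s\n' "$sample" | awk -F: '{ print $1, $3, $6, $7 }')

echo "$with_cut"
echo "$with_awk"
```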

grep

grep extracts lines containing a particular pattern from a file, using regexes. These are similar but not identical to shell wildcards.

fgrep (grep -F) is faster and does not interpret REs

egrep (grep -E) interprets more REs than grep

grep examples
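
A minimal sketch of typical grep usage, run against a hypothetical six-line file called auth.log in a scratch directory:

```shell
work=$(mktemp -d)
cd "$work" || exit 1
printf 'root login ok\nfred login FAILED\nbill logout\nFred su root\nload 1.5\nload 125\n' > auth.log

plain=$(grep 'login' auth.log)                 # lines containing "login"
nocase=$(grep -ci 'fred' auth.log)             # -i ignores case, -c counts matching lines
inverted=$(grep -v 'lo' auth.log)              # -v keeps the lines that do NOT match
anchored=$(grep '^root' auth.log)              # ^ ties the pattern to the start of the line
extended=$(grep -E 'FAILED|logout' auth.log)   # -E (egrep) adds alternation with |
regex_count=$(grep -c '1.5' auth.log)          # . matches any character, so "125" matches too
fixed=$(grep -F '1.5' auth.log)                # -F (fgrep) treats the pattern as plain text

echo "$plain"
echo "$nocase"
echo "$inverted"
echo "$anchored"
echo "$extended"
echo "$regex_count"
echo "$fixed"

cd / && rm -rf "$work"
```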

sed

sed is used for automated "stream" editing. Like awk, it is a language best used for one-liners.

Another sed example

cat f1 | sed '1,$s/(//g' | sed '1,$s/)//g' > f2

copies all lines from f1 to f2, removing all ( and ) brackets. sed is used twice in the same pipeline and "/" is the search-replace delimiter.
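
The same edit can be tried on a single made-up line of text. The sketch also shows a one-pass variant using a bracket expression, which is an alternative formulation and not the form used on the slide; the 1,$ address can be dropped because sed applies a command to every line by default.

```shell
input='f(x) = (a + b) * c'

# Two passes, as in the pipeline above: first remove "(", then ")".
two_pass=$(printf '%s\n' "$input" | sed '1,$s/(//g' | sed '1,$s/)//g')

# One pass: [()] matches either bracket.
one_pass=$(printf '%s\n' "$input" | sed 's/[()]//g')

echo "$two_pass"
echo "$one_pass"
```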

sort

Has many options and parameters for ascending or descending, string or numeric, field and column sort key selection etc. See sort(1) for further details. Example:

cat /etc/passwd | sort -t: -k3,3n | more

Gives paged output of the passwd database, sorted numerically by user ID (the third field).
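
A sketch of key selection with the -k option, using three passwd-style lines. It contrasts a numeric sort on the third field with a string sort on the same field and a sort on the login name.

```shell
passwd_sample='pulse:x:119:126:PulseAudio daemon,,,:/var/run/pulse:/bin/false
bert:x:1003:1003:Bert Trusted User,,,:/home/bert:/bin/bash
avahi:x:116:121:Avahi mDNS daemon,,,:/var/run/avahi-daemon:/bin/false'

# -t: sets the field separator; -k3,3n sorts on field 3 only, numerically.
by_uid=$(printf '%s\n' "$passwd_sample" | sort -t: -k3,3n | cut -d: -f1)

# Without n the key is compared as text, so "1003" sorts before "116".
by_uid_text=$(printf '%s\n' "$passwd_sample" | sort -t: -k3,3 | cut -d: -f1)

# -k1,1 sorts on the login name.
by_name=$(printf '%s\n' "$passwd_sample" | sort -t: -k1,1 | cut -d: -f1)

echo "numeric UID: $(echo $by_uid)"
echo "text UID:    $(echo $by_uid_text)"
echo "login name:  $(echo $by_name)"
```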

uniq

Having sorted our input data we may find repeated lines. The following example outputs each distinct line only once:

cat file.txt | sort | uniq

The next example precedes each distinct output line with a count showing how many times it occurred:

cat file.txt | sort | uniq -c
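
The sketch below, on a made-up word list, shows why the sort comes first: uniq only collapses repeated lines that are adjacent. The final sort -rn and awk steps are additions that rank the counts and tidy the spacing.

```shell
words='apple
pear
apple
fig
apple
pear'

# Sorted input: each distinct line appears once.
distinct=$(printf '%s\n' "$words" | sort | uniq)

# Counts per line, highest first; awk normalises the padding uniq -c adds.
counted=$(printf '%s\n' "$words" | sort | uniq -c | sort -rn | awk '{ print $1, $2 }')

# Unsorted input: no repeats are adjacent, so uniq removes nothing.
unsorted_lines=$(printf '%s\n' "$words" | uniq | wc -l | tr -d ' ')

echo "$distinct"
echo "$counted"
echo "$unsorted_lines"
```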

head and tail

head copies the first part and tail copies the last part of a file or of stdin; see the head(1) and tail(1) manual pages.

Example:

head -n 5 index

displays the first 5 lines of the index file
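
A few more combinations, sketched on ten numbered lines generated with printf instead of a real index file:

```shell
numbers=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10)

first_three=$(printf '%s\n' "$numbers" | head -n 3)            # top of the input
last_two=$(printf '%s\n' "$numbers" | tail -n 2)               # end of the input
from_eighth=$(printf '%s\n' "$numbers" | tail -n +8)           # +8 means "starting at line 8"
fifth_only=$(printf '%s\n' "$numbers" | head -n 5 | tail -n 1) # chaining both picks one line

echo "first three: $(echo $first_three)"
echo "last two:    $(echo $last_two)"
echo "from eighth: $(echo $from_eighth)"
echo "fifth only:  $fifth_only"
```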

tr and colrm filters

tr is used to translate characters, e.g. to convert a text from lower to upper case:

richard@saturn:~/misc$ cat keller_security | tr a-z A-Z
"SECURITY IS MOSTLY A SUPERSTITION. IT DOES NOT EXIST IN NATURE... LIFE IS
EITHER A DARING ADVENTURE OR NOTHING."
-- HELEN KELLER

colrm is similar to cut but filters based only on character positions, and doesn't understand field delimiters.

richard@saturn:~/misc$ cat keller_security | colrm 36
"Security is mostly a superstition.
either a daring adventure or nothin
-- Helen Keller
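
Both filters can be tried on a single line of the quotation. Since colrm is not installed on every system, this sketch uses cut -c1-35 instead, which keeps the same first 35 characters that colrm 36 leaves behind; that substitution is an assumption of the example, not part of the original session.

```shell
line='either a daring adventure or nothing."'

# tr maps each character of the first set to the matching one in the second.
upper=$(printf '%s\n' "$line" | tr a-z A-Z)

# Keep character positions 1 to 35, dropping everything from column 36 onwards.
trimmed=$(printf '%s\n' "$line" | cut -c1-35)

echo "$upper"
echo "$trimmed"
```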

Conclusions

Redirection of input, output and error files, combined with some filter programs, allows us to manipulate the text data used for system configuration. We will be able to extract needles of security information from the haystacks of system log data.

Combining a few powerful one-liners into short scripts allows fully customised administration and security monitoring of open-source servers, routers and firewalls.
