
Scientific Programming in C XIII. Shell programming

Susi Lehtola

11 December 2012


Introduction

Often in scientific computing one needs to do simple tasks related to

• renaming of files

• file conversions

• unit conversions

These are often best done using shell programming and command-line tools.

Scientific Programming in C, fall 2012 Susi Lehtola Shell programming 2/22


sed

sed performs text filtering and transformation. Examples:

Replace "foo" with "bar" in file:

$ sed "s|foo|bar|g" file > file.new

Delete the first 10 lines of a file:

$ sed '1,10d' file > file.new

Delete the last line of a file:

$ sed '$d' file > file.new

In-place modification of the file with the -i argument (no output to stdout).
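
The three sed commands can be tried safely on a throwaway file. The following is a minimal sketch; the scratch directory and the file name demo.txt are made up for illustration and are not part of the course material.

```shell
# Scratch directory so that no real file is touched.
workdir=$(mktemp -d)

# Hypothetical input: 12 lines, the first one containing "foo" twice.
printf 'foo and foo\n' > "$workdir/demo.txt"
awk 'BEGIN { for (i = 2; i <= 12; i++) print i }' >> "$workdir/demo.txt"

# Replace every "foo" with "bar" (the g flag handles repeats on one line).
sed 's|foo|bar|g' "$workdir/demo.txt" > "$workdir/replaced.txt"

# Delete lines 1-10, leaving lines 11 and 12.
sed '1,10d' "$workdir/demo.txt" > "$workdir/tail.txt"

# Delete the last line, leaving lines 1-11.
sed '$d' "$workdir/demo.txt" > "$workdir/head.txt"

head -n 1 "$workdir/replaced.txt"
cat "$workdir/tail.txt"
```

Writing to a new file first, as here, makes it easy to inspect the result before overwriting anything.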


awk

awk is a language for processing text files.

The input is read line by line, and it is split into fields (i.e., words).

Awk programs are written as a series of pattern-action pairs

condition { action }

that are run for every line of input.
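
A minimal sketch of how several condition { action } pairs interact. The three-line input is hypothetical; every pair whose condition holds is run, in order, for each line.

```shell
# Hypothetical three-line input, piped to awk on stdin.
result=$(printf 'alpha 1\nbeta 20\ngamma 3\n' | awk '
    $2 > 2   { print $1 " is large" }   # only when the second field exceeds 2
    /^alpha/ { print "found alpha" }    # only on lines starting with "alpha"
             { print "seen: " $0 }      # no condition: runs on every line
')
echo "$result"
```

A pair without a condition applies to every line, which is why the third action prints three times.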


awk, cont’d

Hello world in awk

$ awk 'BEGIN { print "Hello world!" }'
Hello world!


awk, cont’d

There are also special BEGIN and END blocks that are run only once, at the start and at the end of the program, respectively.

Full program (comments in awk start with #):

BEGIN {
    # code that is run at the start
}

{
    # code that is run for every line of input
}

END {
    # code that is run at the end
}
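
As an illustration of the three blocks working together, the sketch below averages a column of numbers. The input values are made up, and the example assumes at least one input line (otherwise the division in END would fail).

```shell
# Hypothetical input: one number per line.
mean=$(printf '1.0\n2.0\n6.0\n' | awk '
    BEGIN { sum = 0 }                      # run once, before any input is read
          { sum += $1 }                    # run for every input line
    END   { printf("%.2f\n", sum / NR) }   # run once, after the last line
')
echo "$mean"    # 3.00
```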


awk, cont’d

Example: an xyz file

18
6-molecule water cluster
O   0.000000  0.000000  0.000000
H  -0.410000 -0.740000  0.530000
H   0.640000  0.510000  0.580000
O  -1.030000 -1.980000  1.280000
H  -1.220000 -2.940000  1.490000
H  -1.360000 -1.400000  2.030000
O   0.000000  0.270000 -2.800000
H  -0.100000  0.120000 -1.810000
H  -0.860000  0.630000 -3.160000
O   1.670000  1.030000  1.820000
H   2.160000  1.870000  1.610000
H   2.290000  0.380000  2.270000
O  -0.550000  4.130000  0.440000
H  -1.470000  3.850000  0.150000
H  -0.240000  4.900000 -0.120000
O  -0.580000  2.340000  3.290000
H  -0.750000  3.170000  2.750000
H   0.320000  1.970000  3.070000


awk, cont’d

Decompose the file

#!/usr/bin/awk -f
{
    printf("Line %i: \"%s\"\n", NR, $0);
    for (i = 1; i <= NF; i++) {
        printf("\tWord %i: \"%s\".\n", i, $i);
    }
}

Running gives

$ cat cluster.xyz | ./decompose.awk
Line 1: "18"
        Word 1: "18".
Line 2: "Water cluster, first 6 molecules."
        Word 1: "Water".
        Word 2: "cluster,".
        Word 3: "first".
        Word 4: "6".
        Word 5: "molecules.".
Line 3: "O 0.000000 0.000000 0.000000"
        Word 1: "O".
        Word 2: "0.000000".
        Word 3: "0.000000".
        Word 4: "0.000000".


awk, cont’d

• $0 contains the whole input line

• $1 is the first word on the line

• $2 is the second word on the line

• ...

• $NF is the last word on the line

Useful special variables in awk:

• NR is the current line number

• NF contains the number of fields on the current line

• In the END block, NR is the number of lines in the file (line number of last line)
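
The variables listed above can be seen in action with a short sketch. The two-line input is hypothetical; note that the value of NR printed in the END block is the total number of lines read.

```shell
# Hypothetical two-line input with a different number of words per line.
summary=$(printf 'O 0.0 0.0 0.0\nhello world\n' | awk '
    { printf("line %d has %d fields, first=%s last=%s\n", NR, NF, $1, $NF) }
    END { printf("total lines: %d\n", NR) }
')
echo "$summary"
```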


awk, cont’d

Extract the x coordinates from the file

$ cat cluster.xyz | awk '{ if (NR > 2) { print $2 } }'
0.000000
-0.410000
0.640000
-1.030000
-1.220000
-1.360000
0.000000
-0.100000
-0.860000
1.670000
2.160000
2.290000
-0.550000
-1.470000
-0.240000
-0.580000
-0.750000
0.320000


awk, cont’d

Find out the maximum and minimum coordinates

#!/usr/bin/awk -f
BEGIN {
    max[0] = max[1] = max[2] = -1e10;
    min[0] = min[1] = min[2] = 1e10;
}

{
    if (NR > 2) {
        for (i = 0; i < 3; i++) {
            if ($(i+2) < min[i]) {
                min[i] = $(i+2)
            };
            if ($(i+2) > max[i]) {
                max[i] = $(i+2)
            }
        }
    }
}

END {
    for (i = 0; i < 3; i++) {
        printf("%e ... %e\n", min[i], max[i])
    }
}


awk, cont’d

Running gives

$ cat cluster.xyz | ./minmax.awk
-1.470000e+00 ... 2.290000e+00
-2.940000e+00 ... 4.900000e+00
-3.160000e+00 ... 3.290000e+00
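
A variation on the same idea, offered only as a sketch: instead of starting from the sentinel values ±1e10, the extremes can be seeded from the first data line, so the result does not depend on the data staying within a fixed range. The four-line input is made up and only the x column is examined.

```shell
# Hypothetical data: two header lines followed by "label x y z" rows.
range=$(printf '2\ncomment\nO 0.5 1.0 -2.0\nH -1.5 3.0 4.0\n' | awk '
    NR == 3 { min = max = $2 + 0 }        # first data line seeds both extremes
    NR > 3  { if ($2 < min) min = $2 + 0
              if ($2 > max) max = $2 + 0 }
    END     { printf("%e ... %e\n", min, max) }
')
echo "$range"    # -1.500000e+00 ... 5.000000e-01
```

Adding 0 forces awk to store the field as a number rather than as a string.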


Bash

Bash (Bourne-Again SHell) is the default shell on most Linux systems, and it has quite nice scripting features.

For example:

$ for i in foo bar; do echo $i; done
foo
bar
$ for ((i=0; i<10; i++)); do echo "The value of i is $i."; done
The value of i is 0.
The value of i is 1.
The value of i is 2.
The value of i is 3.
The value of i is 4.
The value of i is 5.
The value of i is 6.
The value of i is 7.
The value of i is 8.
The value of i is 9.


Bash, cont’d

You can also loop over files:

$ for i in *.tex; do cp -a "$i" "$i.orig"; done

This will make backups of all the .tex files in the current directory. (*.tex is expanded to match all the .tex files in the directory, after which the for loop runs over the expansion.)

Advanced version:

$ for i in *.tex; do
    # Get modified date (--reference is a GNU date option)
    suffix=$(date --reference="$i" +%Y%m%d.%H%M.bak)
    # and backup the file
    cp -av "$i" "${i}-${suffix}"
done

This will suffix the backup with the time stamp of the original file.
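
A minimal sketch of the simple backup loop run in a scratch directory. The file names are hypothetical. The variable is quoted so that a name containing a space stays a single argument, and a guard skips the case where no .tex file exists (the pattern is then left unexpanded).

```shell
# Scratch directory with two hypothetical .tex files, one with a space in its name.
workdir=$(mktemp -d)
touch "$workdir/report.tex" "$workdir/my notes.tex"

for f in "$workdir"/*.tex; do
    # If nothing matches, the literal pattern is passed through; skip it.
    [ -e "$f" ] || continue
    # Quoting "$f" keeps a name with spaces as one argument to cp.
    cp -a -- "$f" "$f.orig"
done

ls "$workdir"
```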


Bash, cont’d

Let’s say you have a bunch of files with names file1, file2, ..., file199, file200, ..., file1098, file1099.

You want to rename these to file0001, file0002, ..., file1099. How do you do this?


Bash, cont’d

Solution: bash and awk.

$ for ((i=1; i<=1099; i++)); do
    # Convert number to contain leading zeros
    n=$(echo $i | awk '{ printf("%04i", $1) }')
    # Has file name changed?
    if [[ "$i" != "$n" ]]; then
        mv file${i} file${n}
    fi
done
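
The zero padding can also be done with the shell's own printf, without calling awk. The sketch below is an alternative to the solution above, not the method from the slides; it uses a scratch directory and only 12 hypothetical files so that it is safe to run.

```shell
# Scratch directory with hypothetical files file1 ... file12.
workdir=$(mktemp -d)
i=1
while [ "$i" -le 12 ]; do
    touch "$workdir/file$i"
    i=$((i + 1))
done

i=1
while [ "$i" -le 12 ]; do
    # printf pads the number with leading zeros to four digits.
    n=$(printf '%04d' "$i")
    mv "$workdir/file$i" "$workdir/file$n"
    i=$((i + 1))
done

ls "$workdir"
```

With four-digit padding every name in this range changes, so no check for an unchanged name is needed here.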


Bash, cont’d

Bash also has arrays.

$ arr=(This is an array.)
$ echo $arr
This
$ echo ${arr[0]}
This
$ echo ${arr[1]}
is
$ echo ${arr[2]}
an
$ echo ${arr[3]}
array.
$ echo ${arr[@]}
This is an array.
$ for ((i=0; i<${#arr[@]}; i++)); do
    echo "Element $i: \"${arr[i]}\"."
done
Element 0: "This".
Element 1: "is".
Element 2: "an".
Element 3: "array.".


Bash, cont’d

You can also loop directly over the elements in the array as

$ for i in ${arr[@]}; do echo $i; done
This
is
an
array.

since ${arr[@]} expands to the full array.
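
One caveat worth illustrating: when an element contains whitespace, the unquoted expansion is split into separate words, whereas "${arr[@]}" in double quotes yields exactly one word per element. A minimal sketch, assuming bash and a made-up array:

```shell
#!/usr/bin/env bash
# Hypothetical array whose second element contains a space.
arr=("first" "second element" "third")

# Unquoted: "second element" is split into two words.
unquoted=0
for i in ${arr[@]}; do unquoted=$((unquoted + 1)); done

# Quoted: each element is passed through as exactly one word.
quoted=0
for i in "${arr[@]}"; do quoted=$((quoted + 1)); done

echo "unquoted: $unquoted iterations, quoted: $quoted iterations"
```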


Bash, cont’d

For example, in conventional quantum chemistry one often needs to check the convergence with regard to the basis set. This can be nicely automated with bash arrays and brace expansion:

$ basis=({,aug-}cc-pV{D,T,Q,5,6}Z)
$ for i in ${basis[@]}; do echo $i; done
cc-pVDZ
cc-pVTZ
cc-pVQZ
cc-pV5Z
cc-pV6Z
aug-cc-pVDZ
aug-cc-pVTZ
aug-cc-pVQZ
aug-cc-pV5Z
aug-cc-pV6Z
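
A sketch of how such an array might drive a series of calculations. The output file names (water-<basis>.out) are invented for illustration, and the loop only prints them; a real script would call the quantum chemistry program at that point.

```shell
#!/usr/bin/env bash
# Brace expansion builds all combinations; the leftmost brace varies slowest.
basis=({,aug-}cc-pV{D,T,Q,5,6}Z)

echo "${#basis[@]} basis sets"    # 10 basis sets

# Hypothetical use: one made-up output file name per basis set.
for b in "${basis[@]}"; do
    echo "water-$b.out"
done
```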


Bash, cont’d

Read more on bash programming at

• http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO.html (beginners’ guide)

• http://tldp.org/LDP/abs/html/ (advanced level)
