Getting Started with Perl

(and Excel)

Biophysics 101

September 17, 2003

Griffin Weber

(With material from Jon Radoff and Ivan Ovcharenko)

What is a computer?

• Artificial brains?

• Smarter than you?

• Too complicated to understand?

What is a computer?

• A machine with lots of buttons

• The “3 minute popcorn” program:

TIME 300 START

TIME, START - commands

300 - parameter

What is a computer?

Input: 01100101

Output: 10101101

0 = off

1 = on

What is Perl?

• Perl is a computer language

• Easier to use than binary code

print 2+3;  ->  Perl Interpreter  ->  01100101

Where do I get Perl?

Course web site for download links and instructions:

http://www.courses.fas.harvard.edu/~bphys101/links/index.html

Windows: ActivePerl

MacOS: MacPerl

Unix/Linux (FAS): Already Installed

How do you use Perl?

• Create a new text file

• Type all the commands

• Save it as a *.pl file (hello.pl)

• Tell the computer to run your file

perl hello.pl

What does Perl look like?

#!/usr/local/bin/perl
print "Hello everyone\n";

First line (#!...) - mandatory!

print - draws something to the screen

Printing Output to the Screen

print "Hello \n";

print - the print command

" " - place output in quotes

\n - end-of-line character (like pressing 'Enter')

; - ends every command

Variables

2 + 2 = ?

$a = 2;
$b = 2;
$c = $a + $b;

$ - indicates a variable

a - the variable name

= - assigns a value to a variable

; - don't forget

Perl Calculator

$c = 4 - 2;     # subtract
$c = 2 * 2;     # multiply
$c = 2 / 2;     # divide
$c = 2 ** 4;    # power: 2^4 = 16

$c = 1.35 * 2 - log(3) / (0.12 + 1);

log - natural log
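Because log is the natural log, other bases need a small conversion. A minimal sketch, assuming the standard change-of-base rule; printf is not covered in the slides and is used here only to round the output:

```perl
#!/usr/local/bin/perl
# log() is the natural log, so divide by log(10) to get a base-10 log
$x = 1000;
$log10 = log($x) / log(10);

# ** also accepts fractional powers: 0.5 gives a square root
$root = 16 ** 0.5;

printf("log10(%d) = %.3f\n", $x, $log10);
printf("square root of 16 = %.1f\n", $root);
```

The rounding in printf hides the tiny floating-point error that the division can leave behind.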

Perl Calculator : Output

print " 2 + 2 = " . (2+2) . "\n";

" 2 + 2 = " and "\n" - strings

(2+2) - expression

. - concatenate

or…

$c = 2 + 2;
print " 2 + 2 = $c \n";

$c inside the quotes - print the value of $c

Output:

2 + 2 = 4
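The two styles can be compared side by side. This is a small sketch with made-up variable names ($p, $q, $line1, $line2); it builds the same line once by concatenation and once by placing variables inside the quotes:

```perl
#!/usr/local/bin/perl
$p = 2;
$q = 3;

# 1. Concatenation: the expression in parentheses is evaluated,
#    then joined to the strings with "."
$line1 = "$p + $q = " . ($p + $q) . "\n";

# 2. Interpolation: store the result first, then name the
#    variable inside the double quotes
$r = $p + $q;
$line2 = "$p + $q = $r\n";

print $line1;
print $line2;
```

Both print statements produce the same line. Note that arithmetic is not evaluated inside quotes; only variable names are replaced by their values.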

Datatypes

Perl Examples:

# DEFINING NUMBERS
$x = 4;
$x = 1.25;
$x = 2.345e-56;     # 2.345 * 10^-56, same as 2.345*10**-56

# DEFINING STRINGS
$x = "ACTGGTA";
$y = "Hello everyone \n";

# - comment line (Perl ignores)

Loops and Cycles : FOR Statement

# Output all the numbers from 1 to 100

for ($n=1; $n<=100; $n+=1) {
    print "$n \n";
}

for ( $n=1 ; $n<=100 ; $n+=1 ) { … }

1. Initialization - $n=1

2. Termination - $n<=100 (stop if this is not true)

3. Body (or block) of the loop - commands inside curly brackets

4. Increment - $n+=1 ($n = $n + 1)
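The same four parts can do more than print. A minimal sketch (the variable $sum is introduced here for illustration) that adds up the numbers from 1 to 100:

```perl
#!/usr/local/bin/perl
$sum = 0;

# 1. start at 1; 2. keep going while $n <= 100;
# 3. run the body; 4. add 1 to $n and test again
for ($n=1; $n<=100; $n+=1) {
    $sum = $sum + $n;
}

print "Sum of 1 to 100 is $sum\n";
print "After the loop, n is $n\n";
```

When the loop ends, $n is 101: the increment runs one last time, the termination test fails, and the body is skipped.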

FOR Loop Example

Source code:

for ($line=1; $line<=3; $line+=1) {
    # output $line 'A' symbols
    for ($n=1; $n<=$line; $n+=1) {
        print "A";
    }
    # end the line
    print "\n";
}

Output - Triangle of A's:

A
AA
AAA

Indent = Tab

Conditional Statements

$x = -1;

# check whether $x is positive or not
if ($x > 0) {
    print "x = $x is positive\n";
}

if ( some_expression ) { … }

1. If this is true, ...

2. ... then do this - commands inside curly brackets

Expressions

With numbers:

$x = 1;
$y = 2;

if ($x < $y) { }      # Less than
if ($x > $y) { }      # Greater than
if ($x <= $y) { }     # Less than or equal
if ($x >= $y) { }     # Greater than or equal
if ($x == $y) { }     # Equal
if ($x != $y) { }     # Not equal

With strings:

$a = "DNA";
$b = "RNA";

if ($a lt $b) { }     # Less than
if ($a gt $b) { }     # Greater than
if ($a le $b) { }     # Less than or equal
if ($a ge $b) { }     # Greater than or equal
if ($a eq $b) { }     # Equal
if ($a ne $b) { }     # Not equal
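Choosing the wrong family of operators changes the answer. A small sketch (variable names are made up for illustration) comparing the same two values both ways:

```perl
#!/usr/local/bin/perl
$p = "10";
$q = "9";

# numeric comparison: 10 is greater than 9
if ($p > $q) {
    $numeric = "$p > $q is true";
} else {
    $numeric = "$p > $q is false";
}

# string comparison goes character by character: "1" comes before "9"
if ($p gt $q) {
    $string = "$p gt $q is true";
} else {
    $string = "$p gt $q is false";
}

print "$numeric\n";
print "$string\n";
```

Use the symbol operators for numbers and the letter operators for text such as DNA sequences.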

Conditional Statements

Source code:

$x = -1;

# check whether $x is positive or not
if ($x > 0) {
    print "x = $x is positive\n";
}
if ($x < 0) {
    print "x = $x is negative\n";
}
if ($x == 0) {
    print "x is zero\n";
}

Output:

x = -1 is negative

Conditional Statements

The same thing:

$x = -1;

# check whether $x is positive or not
if ($x > 0) {
    print "x = $x is positive\n";
} elsif ($x < 0) {
    print "x = $x is negative\n";
} else {
    print "x is zero\n";
}

if ( some_expression ) { … } else { … }

if this is true, do this; otherwise, do this

Putting It Together

FOR & IF -- all the even numbers from 1 to 100:

for ($n=1; $n<=100; $n+=1) {
    if (($n % 2) == 0) {
        print "$n\n";
    }
}

Note: $a % $b -- Modulus (remainder when $a is divided by $b)

7 % 2 = 1     5 % 3 = 2     12 % 4 = 0
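The same FOR and IF pattern can count matches instead of printing them. A minimal sketch, with $count introduced here for illustration:

```perl
#!/usr/local/bin/perl
$count = 0;

for ($n=1; $n<=100; $n+=1) {
    # remainder 0 after dividing by 2 means $n is even
    if (($n % 2) == 0) {
        $count += 1;
    }
}

print "There are $count even numbers from 1 to 100\n";
```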

Data Structures : Arrays

# array of 5 numbers
@a = (7,3,4,-1,0);

# array of strings
@day = ("Mon", "Tue", "Wed", "Thu", "Fri");

@ - indicates an array

(…) - a list of values

$a[2] - an element of the array (count from 0)

Data Structures : Arrays

@a = (7,3,4,-1);

# change the value of an element
$a[1] = 5;

# print all the elements in the array @a
for ($i=0; $i<=$#a; $i+=1) {
    print "a[$i] = $a[$i] \n";
}

$#a - the index of the last element in the array @a

Output:

a[0] = 7
a[1] = 5
a[2] = 4
a[3] = -1
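Since $#a is the last index and counting starts at 0, the number of elements is $#a + 1. A short sketch (the names $sum, $count and $mean are chosen for illustration) that uses this to average an array:

```perl
#!/usr/local/bin/perl
@a = (7, 5, 4, -1);

# add up every element, from index 0 to the last index
$sum = 0;
for ($i=0; $i<=$#a; $i+=1) {
    $sum = $sum + $a[$i];
}

$count = $#a + 1;        # last index + 1 = number of elements
$mean = $sum / $count;

print "sum = $sum, count = $count, mean = $mean\n";
```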

Two Dimensional Arrays

# A 3x3 array of 9 numbers
@a = ([1,2,3],[4,5,6],[7,8,9]);

([…],[…],…,[…]) - an array of arrays

# change the value of an element
$a[1][2] = -1;

print " $a[2][0] \n";     # prints 7
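A two dimensional array is usually walked with one loop inside another: the outer loop picks the row, the inner loop picks the column. A minimal sketch, assuming every row has exactly 3 elements; the names @m and $text are made up for illustration:

```perl
#!/usr/local/bin/perl
@m = ([1,2,3],[4,5,6],[7,8,9]);
$m[1][2] = -1;

$text = "";
for ($row=0; $row<=$#m; $row+=1) {
    # each row is assumed to have 3 elements: columns 0, 1 and 2
    for ($col=0; $col<=2; $col+=1) {
        $text = $text . $m[$row][$col] . " ";
    }
    # end the line after each row
    $text = $text . "\n";
}

print $text;
```

The first index is the row and the second is the column, so the changed element shows up at the end of the middle row.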

Working with Files

Microsoft Word

Open

Edit

Save

Close

Perl Files

Open

Read or Write (one line at a time)

Close

Working with Files

How to open and close a file "data.txt" from a Perl program?

# open data.txt file for READING
open (FILE, "<data.txt");

FILE - filehandle. This name is used everywhere later in the program when we deal with this file.

< > - direction of file data flow

< - READ from a file

> - WRITE to a file

# close the file specified by the FILE filehandle
close (FILE);

Working with Files

Writing "Hello everyone" to the "tmp.txt" file:

#!/usr/local/bin/perl
open (FILE, ">tmp.txt");
print FILE "Hello everyone\n";
close (FILE);

Note: If tmp.txt already exists, it will be erased.

Working with Files

# open file data.txt for reading
open (FILE, "<data.txt");

# read file line by line and print it out to the screen
while ($line = <FILE>) {
    print "$line";
}

# close file
close(FILE);

<FILE> - reads the next line from the file specified by the filehandle FILE

The while loop is analogous to the for loop: the statements in its body are executed repeatedly until the condition in parentheses becomes false.

Working with Files

Example. Calculating a sum of numbers in the file data.txt:

#!/usr/local/bin/perl
$sum = 0;
open (FILE, "<data.txt");
while ($line = <FILE>) {
    chomp($line);
    $sum = $sum + $line;
}
close(FILE);
print "Sum of the numbers in data.txt file is $sum\n";

118232

Sum of the numbers in data.txt file is 44

chomp - removes the "\n" (newline) character from the end of the string
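Writing and reading can be combined in one program. The sketch below is self-contained: it writes three numbers to a scratch file, reads them back, and sums them. The file name sum_demo.txt and the handles OUT and IN are made up for illustration, and the "or die" checks are an addition not shown in the slides; they stop the program with a message if a file cannot be opened.

```perl
#!/usr/local/bin/perl
$file = "sum_demo.txt";      # hypothetical scratch file

# write three numbers, one per line
open (OUT, ">$file") or die "Cannot write $file: $!";
print OUT "1\n2\n3\n";
close (OUT);

# read the numbers back and add them up
$sum = 0;
open (IN, "<$file") or die "Cannot read $file: $!";
while ($line = <IN>) {
    chomp($line);
    $sum = $sum + $line;
}
close (IN);

unlink($file);               # delete the scratch file

print "Sum of the numbers in $file is $sum\n";
```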

More Strings

$DNA = "ACGTCG";

# length of the string -> number of characters inside
$seqLen = length ($DNA);          # $seqLen = 6

# extracting a part of a string
$seqPart = substr ($DNA, 2, 3);   # $seqPart = "GTC"

substr ( $string, $offset, $n) -- extracts $n characters from the string $string, starting at the position $offset (first position in a string is 0, not 1!)

Note: In Perl, a string is not an array of characters.
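Because a string is not an array, single characters are reached with substr rather than with an index in square brackets. A minimal sketch that walks a sequence one base at a time and counts G and C; the names $gc and $percent are introduced for illustration, and printf is used only to round the result:

```perl
#!/usr/local/bin/perl
$DNA = "ACGTCG";
$gc = 0;

for ($i=0; $i<length($DNA); $i+=1) {
    # take exactly one character, starting at position $i
    $base = substr($DNA, $i, 1);
    if ($base eq "G" or $base eq "C") {
        $gc += 1;
    }
}

$percent = 100 * $gc / length($DNA);
printf("GC content: %.1f%%\n", $percent);
```

The loop runs while $i is less than the length, because the last valid position is length minus 1.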

More Strings

$DNA = "ACGTCG";

# reverse the string
$seqRev = reverse ($DNA);         # $seqRev = "GCTGCA"

# substitute all 'C' symbols with 'T' symbols
$DNA =~ s/C/T/gi;                 # $DNA = "ATGTTG"

s - substitute (search & replace)

g - global (everywhere)

i - case insensitive

# count the number of substitutions (this also removes every 'G' from $DNA)
$numG = ($DNA =~ s/G//gi);        # $numG = 2

More Strings

$str = "I like biology.";

# replace "biology" with "computers"
$str =~ s/biology/computers/;     # $str = "I like computers."

# replace 'r' with 'n', 'e' with 'i', and 's' with 'g'
$str =~ tr/res/nig/;              # $str = "I liki computing."

Note: tr/// substitutes only symbols, while s/// substitutes strings
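Symbol-by-symbol replacement is a natural fit for DNA. The sketch below, with variable names chosen for illustration, combines reverse and tr/// to build a reverse complement, and uses tr/// with an empty replacement to count a base without changing the string:

```perl
#!/usr/local/bin/perl
$DNA = "ACGTCG";

# reverse the sequence, then swap each base for its partner
$revComp = reverse ($DNA);        # "GCTGCA"
$revComp =~ tr/ACGT/TGCA/;        # "CGACGT"

# with nothing between the last two slashes, tr only counts
$numG = ($DNA =~ tr/G//);         # 2, and $DNA is unchanged

print "Sequence:           $DNA\n";
print "Reverse complement: $revComp\n";
print "Number of G's:      $numG\n";
```

Unlike counting with s/G//g, counting with tr/G// leaves the original sequence intact.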

Functions (Subroutines)

A function is a program within a program.

# call the function
$x = min(5,3);
print "Smallest of 5 and 3 is: $x\n";

# Function min
sub min {                  # define the function
    ($a, $b) = @_;         # input parameters
    if ($a < $b) {
        $small = $a;
    } else {
        $small = $b;
    }
    return $small;         # return the answer
}
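In the min example, $a, $b and $small are shared with the rest of the program, so the function can overwrite variables of the same name elsewhere. One common way to avoid this is the "my" keyword, which is not covered in the slides. A sketch with a hypothetical function max3:

```perl
#!/usr/local/bin/perl
# Function max3: returns the largest of three numbers
sub max3 {
    my ($p, $q, $r) = @_;    # "my" makes these private to the function
    my $big = $p;
    if ($q > $big) { $big = $q; }
    if ($r > $big) { $big = $r; }
    return $big;
}

$big = "untouched";          # a different $big, outside the function
$result = max3(4, 9, 2);

print "Largest of 4, 9 and 2 is: $result\n";
print "Outer \$big is still: $big\n";
```

The $big inside max3 and the $big outside are separate variables, so calling the function does not disturb the rest of the program.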

Modules

Perl does not have functions for everything, but many useful functions have already been programmed by other people, who share their libraries of functions. These libraries are called modules.

use bignum;     # Work with large numbers
use CGI;        # Build interactive web pages
use BioPerl;    # Perform DNA sequence analysis
use GD;         # Create pictures
use DBI;        # Communicate with databases

http://cpan.org/ -- lots of Perl modules

use X; - place near the beginning of your program; tells Perl to use all the functions in module X

Bugs!

$x = 1
if ($x = 1) { $x = 2; }
x = 0;
$a = (1, 2, 3);
$y = 5/$x;

Bugs!

#!/usr/local/bin/perl
$x = 1
if ($x = 1) { $x = 2; }
x = 0;
$a = (1, 2, 3);
$y = 5/$x;

Bugs!

#!/usr/local/bin/perl
$x = 1;
if ($x = 1) { $x = 2; }
x = 0;
$a = (1, 2, 3);
$y = 5/$x;

Bugs!

#!/usr/local/bin/perl
$x = 1;
if ($x == 1) { $x = 2; }
x = 0;
$a = (1, 2, 3);
$y = 5/$x;

Bugs!

#!/usr/local/bin/perl
$x = 1;
if ($x == 1) { $x = 2; }
$x = 0;
$a = (1, 2, 3);
$y = 5/$x;

Bugs!

#!/usr/local/bin/perl
$x = 1;
if ($x == 1) { $x = 2; }
$x = 0;
@a = (1, 2, 3);
$y = 5/$x;

Bugs!

#!/usr/local/bin/perl
$x = 1;
if ($x == 1) { $x = 2; }
$x = 0;
@a = (1, 2, 3);
$y = 5/$x;     # divide by 0
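Perl can help find several of these bugs itself. The sketch below is one possible cleaned-up version, not the course's official fix: it turns on the standard "use strict" and "use warnings" pragmas (which require variables to be declared with "my" and typically report mistakes such as "=" inside a condition), and it checks for zero before dividing.

```perl
#!/usr/local/bin/perl
use strict;      # every variable must be declared with "my"
use warnings;    # report suspicious code while the program runs

my $x = 1;
if ($x == 1) { $x = 2; }
$x = 0;
my @a = (1, 2, 3);

# guard the division instead of letting the program crash
my $y;
if ($x != 0) {
    $y = 5 / $x;
} else {
    $y = "undefined (x is zero)";
}

print "y = $y\n";
```

The last bug cannot be caught before the program runs, because the value of $x is only known while it is running; that is why it needs an explicit test.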

Excel

• What is a worksheet?

• Enter data by hand

• Import data from a file

• Use Excel built-in functions

• Perform statistical tests

• Make graphs from data