Advanced Perl for Bioinformatics

Part 1 (2/23/06, 1-4pm)
• Module structure
• Module path
• Module export
• Object oriented programming

Part 2 (2/24/06, 1-4pm)
• Bioperl modules
• Sequence access
• Sequence manipulation
• Parsing BLAST records

Module and main program

Hello1.pm:
    package Hello1;
    sub greet { return "Hello, World!\n"; }
    1;

test1.pl:
    #!/usr/bin/perl
    use Hello1;
    print Hello1::greet();

Why use a module?
• Reusable by different programs.
• Keep your code well organized.
Module structure
    package Hello1;                            # declare a package; the file must be saved as Hello1.pm
    sub greet { return "Hello, World!\n"; }    # contents of the package: functions and variables
    1;                                         # return a true value at the end

Path to module
• Default path to look for modules: @INC
    perl -e 'print "$_\n" for @INC'
• If your module is placed under one of the paths in @INC, you can refer to it by its path relative to that directory. E.g. if @INC contains /usr/my/lib, and
  (1) your Mod.pm is /usr/my/lib/Mod.pm, you refer to your module with: use Mod;
  (2) your Mod.pm is /usr/my/lib/Mymod/Seq/Mod.pm, then you say: use Mymod::Seq::Mod;
• If your module is not placed under any directory in @INC, e.g. /some/dir/Mod.pm, then:
    use lib "/some/dir";    # adds the path to the beginning of @INC
    use Mod;
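
The lookup rule above can be tried without touching any system directory. This is a minimal sketch, assuming a hypothetical module Mymod::Seq::Mod written into a temporary directory. Because "use lib" takes effect at compile time and the temporary directory only exists at run time, the sketch uses the run-time equivalent (unshift onto @INC, then require).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Hypothetical layout: <tmp>/Mymod/Seq/Mod.pm (names chosen for illustration).
my $dir = tempdir(CLEANUP => 1);
make_path("$dir/Mymod/Seq");

open(my $fh, '>', "$dir/Mymod/Seq/Mod.pm") or die "Cannot write module: $!";
print $fh "package Mymod::Seq::Mod;\n";
print $fh "sub where { return __FILE__; }\n";
print $fh "1;\n";
close($fh);

# Run-time equivalent of: use lib "/some/dir";
unshift @INC, $dir;

# Perl turns Mymod::Seq::Mod into the relative path Mymod/Seq/Mod.pm
# and searches each directory in @INC for it, in order.
require Mymod::Seq::Mod;

print "Loaded from: ", Mymod::Seq::Mod::where(), "\n";
print "Recorded in %INC as: $INC{'Mymod/Seq/Mod.pm'}\n";
```

In a real program with a fixed directory, "use lib" followed by "use Mymod::Seq::Mod;" is the usual form.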
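Variable scope in module
• my $var --- accessible only within the module file (or the enclosing block)
• our $var --- package variable, accessible from outside as $Module::var
• $var --- same as "our $var" (allowed only without "use strict")
• use strict; --- forces every variable to be declared with 'my' or 'our' (or written with its full package name)
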
Hello2.pm:
    package Hello2;
    use strict;
    our $var1 = 1;
    my $var2 = 3;
    my $str = "Hello World!\n";
    sub greet { return $str; }
    1;

test2.pl:
    #!/usr/bin/perl
    use Hello2;
    print "var1= $Hello2::var1\n";    # prints "var1= 1"
    print "var2= $Hello2::var2\n";    # prints "var2= " because $var2 is a 'my' variable
    print Hello2::greet();
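
The difference between 'my' and 'our' can also be seen in a single file. This is a minimal sketch, assuming a hypothetical package named Scope; the surrounding block stands in for a separate .pm file, since a 'my' variable is limited to its enclosing file or block.

```perl
#!/usr/bin/perl
use strict;
use warnings;
no warnings 'once';    # $Scope::private is mentioned only once, on purpose

# The block stands in for a separate Scope.pm file (hypothetical name).
{
    package Scope;
    our $shared  = 1;                       # package variable: $Scope::shared
    my  $private = 3;                       # lexical: invisible outside this block
    sub get_private { return $private; }    # code inside the block can still read it
}

print "shared  = $Scope::shared\n";
print "private = ",
    (defined $Scope::private ? $Scope::private : "(not visible)"), "\n";
print "via sub = ", Scope::get_private(), "\n";
```

The 'my' variable can only be reached through a subroutine defined in the same scope, which is a common way to keep module data private.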
Export
Export functions and variables, so that they can be accessed without a package qualifier.

Hello3.pm:
    package Hello3;
    use strict;
    require Exporter;
    our @ISA = qw(Exporter);
    our @EXPORT_OK = qw(greet);
    our $var1 = 1;
    my $var2 = 3;
    my $str = "Hello World!\n";
    sub greet { return $str; }
    1;

test3.pl:
    #!/usr/bin/perl
    use Hello3 qw(greet);
    print "var1= $Hello3::var1\n";
    print "var2= $Hello3::var2\n";    # still empty: $var2 is a 'my' variable
    print greet();                    # no "Hello3::" qualifier needed

Hello3.pm, annotated:
    package Hello3;
    use strict;
    use Exporter;                  # need the functionality in Exporter.pm to do exporting
    our @ISA = qw(Exporter);       # inherit functions from the Exporter module rather than creating our own
    our @EXPORT_OK = qw(greet);    # export this subroutine upon request by another program
    our $var1 = 1;
    my $var2 = 3;
    my $str = "Hello World!\n";
    sub greet { return $str; }
    1;

test3.pl, annotated:
    #!/usr/bin/perl
    use Hello3 qw(greet);          # request "greet"
    print "var1= $Hello3::var1\n";
    print "var2= $Hello3::var2\n";
    print greet();

Hello4.pm:
    package Hello4;
    use strict;
    use Exporter;
    our @ISA = qw(Exporter);
    our @EXPORT_OK = qw(greet);
    our @EXPORT = qw(greet2);      # export this automatically
    our $var1 = 1;
    my $var2 = 3;
    my $str = "Hello World!";
    sub greet { return $str; }
    sub greet2 { return "Hi.\n"; }
    1;

test4.pl:
    #!/usr/bin/perl
    use Hello4 qw(greet);          # request "greet"
    use Hello4;                    # automatically imports whatever is in @EXPORT
    print "var1= $Hello4::var1\n";
    print "var2= $Hello4::var2\n";
    print greet();
    print greet2();
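
The reason test4.pl needs two "use" lines can be checked in one file. This is a minimal sketch, assuming a hypothetical package named Greeting with subroutines hi and hello; the BEGIN block stands in for a separate Greeting.pm so that the package is set up before the "use" lines are compiled.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The BEGIN block stands in for a separate Greeting.pm file (hypothetical name).
BEGIN {
    package Greeting;
    require Exporter;
    our @ISA       = qw(Exporter);
    our @EXPORT    = qw(hi);           # imported by a plain "use Greeting;"
    our @EXPORT_OK = qw(hello);        # imported only when asked for by name
    sub hi    { return "Hi.\n"; }
    sub hello { return "Hello World!\n"; }
    $INC{'Greeting.pm'} = __FILE__;    # mark the module as already loaded
}

use Greeting qw(hello);    # explicit list: imports hello() only, not hi()
BEGIN { print "hi() imported yet? ", (defined &main::hi ? "yes" : "no"), "\n"; }
use Greeting;              # empty list: imports everything in @EXPORT

print hello();
print hi();
```

When an explicit import list is given, the names in @EXPORT are not added automatically, so the second "use" line is what brings in hi().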
Exercise 1
• Create a module with functions that calculate the area and boundary (perimeter) of a rectangle. The width and length are to be supplied in your main program and passed into your module. Practice using @EXPORT and @EXPORT_OK.
Object Oriented Programming
• A package (or module) is a class.
• A reference to a hash becomes an object of this class once it is blessed into the class.
• The object contains member variables, which are stored in the hash.
• The object also has member functions (methods), which are the subroutines of the package.
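
The points above can be checked directly. This is a minimal sketch, assuming a hypothetical class named Counter; it shows that the object is an ordinary hash reference that has been tagged with a class name by bless.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util qw(reftype);

{
    package Counter;    # hypothetical class for illustration
    sub new {
        my ($class, $start) = @_;
        my $self = { count => $start };    # member variables live in a hash
        return bless($self, $class);       # the hash reference becomes an object
    }
    sub increment {
        my $self = shift;                  # the object arrives as the first argument
        return ++$self->{count};
    }
}

my $c = Counter->new(5);
print "Class:     ", ref($c), "\n";          # Counter
print "Storage:   ", reftype($c), "\n";      # HASH
print "After ++:  ", $c->increment, "\n";    # 6
print "Raw field: ", $c->{count}, "\n";      # 6
```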
Hello5.pm:
    package Hello5;
    use strict;
    sub new {
        my $class = shift;
        my $ref = {};
        bless( $ref, $class );
        return $ref;
    }
    sub greet {
        my ($ref, $str) = @_;    # the object itself is always the first argument
        return $str;
    }
    sub greet2 { return "Hi\n"; }
    1;

test5.pl:
    #!/usr/local/bin/perl
    use Hello5;
    my $h = Hello5->new;         # equivalent to: new Hello5
    print $h->greet("Good morning\n");
    print $h->greet2;

Rectangle.pm:
    package Rectangle;
    sub new {
        my ($class, $width, $length) = @_;
        my $hashref = { W => $width, L => $length };
        bless( $hashref, $class );
        return $hashref;
    }
    sub getArea {
        my $self = shift;
        return $self->{W} * $self->{L};
    }
    sub getBoundary {
        my $self = shift;
        return 2 * ($self->{W} + $self->{L});
    }
    1;

recttest.pl:
    #!/usr/bin/perl
    use Rectangle;
    my $w = 3;
    my $l = 4;
    my $rect = Rectangle->new($w, $l);    # equivalent to: new Rectangle($w, $l)
    my $area = $rect->getArea();
    print "Area = $area\n";
    my $boundary = $rect->getBoundary();
    print "Boundary = $boundary\n";

Exercise 2
• Create a class called “Cube”. It should have methods to calculate volume based on the cube’s width, length and height.
More Practice with Classes
• Sequence.pm: clean, wrap, reverse complement, shuffle, GC content, translate
• Main program: seq.pl
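
As an illustration of what such a class can look like, here is a minimal sketch covering three of the listed operations (clean, reverse complement, GC content). It is not the course's Sequence.pm; the method names seq, revcom and gc_content are assumptions chosen for this example, and only the unambiguous bases A, C, G, T are handled.

```perl
#!/usr/bin/perl
use strict;
use warnings;

{
    package Sequence;    # illustrative only; not the course's Sequence.pm

    # Constructor: "clean" the input by removing whitespace and digits,
    # then store it in upper case.
    sub new {
        my ($class, $seq) = @_;
        $seq =~ s/[\s\d]//g;
        return bless({ seq => uc $seq }, $class);
    }

    sub seq { my $self = shift; return $self->{seq}; }

    # Reverse complement: reverse the string, then swap A<->T and C<->G.
    sub revcom {
        my $self = shift;
        my $rc = reverse $self->{seq};
        $rc =~ tr/ACGT/TGCA/;
        return $rc;
    }

    # GC content as a percentage of the sequence length.
    sub gc_content {
        my $self = shift;
        my $len = length $self->{seq};
        return 0 unless $len;
        my $gc = ($self->{seq} =~ tr/GC//);    # tr counts matches without changing the string
        return 100 * $gc / $len;
    }
}

my $s = Sequence->new("1 atgcc gtaa\n");
print "Clean:  ", $s->seq, "\n";       # ATGCCGTAA
print "Revcom: ", $s->revcom, "\n";    # TTACGGCAT
printf "GC:     %.1f%%\n", $s->gc_content;    # 44.4%
```

A main program in the style of seq.pl would create one such object per input sequence and call the methods it needs.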
Bioperl
• A collection of Perl modules for bioinformatics.
• Facilitates sequence retrieval, sequence manipulation, and parsing the results of programs like blast and clustalw.
• http://bioperl.org for download and documentation.
• Each .pm file contains documentation on how to use that module.
• Usually installed under: /usr/local/lib/perl5/site_perl/5.8.0/Bio
Some Bioperl modules
• Bio::Perl, Bio::DB -- access sequence databases. E.g. seqret.pl
• Bio::Seq -- sequence and its annotation. E.g. seqio.pl
• Bio::SeqIO -- read sequences from a file, and write them to a file. E.g. seqio.pl
• Bio::Tools::SeqStats -- molecular weight, etc. E.g. seqmw.pl
• Bio::SearchIO -- parse blast results.
Accessing Remote Databases
    use Bio::Perl;
    $seqobj = get_sequence('swiss', "ROA1_HUMAN");
    write_sequence("roa1.fasta", 'fasta', $seqobj);

Databases can be: swiss, genbank, genpept, refseq, etc.
Bio::Seq
• Contains a sequence and its annotation.
• Methods: display_id, desc, seq, revcom, translate, etc.
The revcom and translate methods each create a new Bio::Seq object.

One way to create a Bio::Seq object:
    $seq = Bio::Seq->new( -seq              => 'actgtggcgtcaact',
                          -desc             => 'Sample Bio::Seq object',
                          -display_id       => 'something',
                          -accession_number => 'accnum',
                          -alphabet         => 'dna' );

Another way: read the sequence from a file via a Bio::SeqIO object.
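
The second route can be sketched as follows. This is a minimal example under stated assumptions: the FASTA record is made up, it is read from an in-memory filehandle (passed with -fh) so that the script needs no input file, and the script checks whether BioPerl is installed before using it. With a real file you would pass -file => 'name.fasta' instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# BioPerl is not part of core Perl, so the sketch checks for it first.
my $have_bioperl = eval { require Bio::SeqIO; 1 };
my @ids;

if ($have_bioperl) {
    # Made-up FASTA data held in a string and opened as an in-memory file.
    my $fasta = ">seq1 sample record\nATGGCGTCAACT\n";
    open(my $fh, '<', \$fasta) or die "Cannot open in-memory file: $!";

    my $in = Bio::SeqIO->new(-fh => $fh, -format => 'fasta');
    while (my $seq = $in->next_seq) {    # each record comes back as a sequence object
        push @ids, $seq->display_id;
        print "ID:      ", $seq->display_id, "\n";
        print "Desc:    ", $seq->desc, "\n";
        print "Revcom:  ", $seq->revcom->seq, "\n";      # revcom returns a new object
        print "Protein: ", $seq->translate->seq, "\n";   # so does translate
    }
}
else {
    print "BioPerl is not installed; skipping.\n";
}
```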
Parsing blast results
• Module: Bio::SearchIO

    use Bio::SearchIO;
    my $in = Bio::SearchIO->new(-format => 'blast', -file => 'report.bls');
    while ( my $result = $in->next_result ) {
        while ( my $hit = $result->next_hit ) {
            while ( my $hsp = $hit->next_hsp ) {
                # report HSPs longer than 100 with at least 75% identity
                if ( $hsp->length('total') > 100 ) {
                    if ( $hsp->percent_identity >= 75 ) {
                        print "Hit= ", $hit->name,
                              ",Length=", $hsp->length('total'),
                              ",Percent_id=", $hsp->percent_identity, "\n";
                    }
                }
            }
        }
    }

Example: blastparse.pl