Trinity College Dublin, The University of Dublin
A Brief Introduction to Scientific Programming with Python
Karsten Hokamp, PhD
TCD Bioinformatics Support Team
TCD, 26/08/2015
Overview
• Programming
• First Python script/program
• Why Python?
• Bioinformatics examples
• Additional resources
• Outlook
What is programming and why bother?
Data processing
Automation
Combination of programs for analysis pipelines
More control and flexibility
Better understanding of how programs work
Programming Concepts
Turn into a very meticulous problem solver
Break problems into small details
Keep it variable
Give very precise instructions
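The concepts above can be illustrated with a small sketch. It assumes a made-up DNA string and computes its GC content in small, explicit steps; the sequence and the variable names are illustrative and not taken from the course material.

```python
# Hypothetical example sequence; any DNA string could be used instead.
sequence = "ATGCGCGATTAACG"

# Step 1: count each base of interest separately.
g_count = sequence.count("G")
c_count = sequence.count("C")

# Step 2: take the length from the variable rather than hard-coding it.
length = len(sequence)

# Step 3: combine the pieces into the final result.
gc_fraction = (g_count + c_count) / length

print("GC content:", round(gc_fraction * 100, 1), "%")
```

Because the sequence is held in a variable, the same three steps work unchanged for any other input.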
Mac for Windows users
The main differences:
cmd instead of ctrl (e.g. cmd-C for copying)
right-click mouse: ctrl-click
# character: alt-3
switch between applications: cmd-tab
Spotlight (top right) for finding files/programs
Apple symbol (top left) for logging out
IDLE: Integrated DeveLopment Environment
open through Spotlight
Alternatively: open through Finder
interactive Python console
simple Python statement
user input
output
try a few simple numeric operations
repeat/combine previous commands by clicking into them and hitting return (use left/right arrows and delete to edit them)
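A few numeric operations of the kind one might try at the console are sketched below. The specific expressions are illustrative assumptions (the original screenshots are not reproduced here), and Python 3 behaviour is assumed.

```python
# Basic arithmetic, stored in variables so the results can be reused.
total = 3 + 4        # addition
product = 6 * 7      # multiplication
quotient = 10 / 4    # true division gives a float in Python 3
floored = 10 // 4    # floor division discards the remainder
remainder = 10 % 4   # modulo
power = 2 ** 10      # exponentiation

print(total, product, quotient, floored, remainder, power)
```

Typed line by line at the console, each expression would show its value immediately without the need for print.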
Console vs Editor
Console | Editor
interactive | requires extra click for running
great for trying out code | additional IDLE functionality
not suited for long scripts | suited for long scripts
no saving of code | allows saving code
IDLE: Writing Python Scripts
open a new file
write some code
run your code (shortcut: F5)
save file first
specify a file name
write more code; IDLE provides help
save and run: cmd-S then F5
make it personal
keep going
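A minimal sketch of the kind of script that could be typed into the editor and run with F5. The file name hello.py, the function greet and the names used are hypothetical; the original slides show their own example in screenshots that are not reproduced here.

```python
# hello.py (hypothetical file name)

def greet(name):
    """Return a personalised greeting for the given name."""
    return "Hello, " + name + "!"

# A first statement with fixed text.
print("Hello, world!")

# "Make it personal": store the name in a variable and reuse it.
# In an interactive session, input("Your name: ") could supply this value.
name = "Ada"
print(greet(name))

# "Keep going": repeat the greeting for several names.
for other in ["Grace", "Rosalind"]:
    print(greet(other))
```

Keeping the greeting in a function means the script can grow without repeating the same string handling.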
Python vs Perl
the equivalent in Perl
Python compared with Perl:
• fewer special characters
• indentation enforced
• more user-friendly functions
Why Python?
easy to learn | great for beginners
enforces clean coding | great for teachers
comes with IDE | avoids command-line usage
object-orientated | code reuse and recycling
very popular | many peers
Biopython | many bioinformatics modules
Simple Bioinformatics Example
built-in function 'len'
built-in function 'set'
built-in functions 'sorted' and 'set'
string method 'count'
string method 'upper'
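The built-in functions and string methods named above can be tried on a short DNA string. The sequence below is a made-up example, not the one used in the slides.

```python
# Hypothetical mixed-case DNA sequence for illustration.
dna = "atgcGGTAcca"

length = len(dna)                     # number of characters
upper_dna = dna.upper()               # normalise to upper case
unique_bases = set(upper_dna)         # distinct letters, in no particular order
ordered_bases = sorted(unique_bases)  # the same letters as a sorted list
g_total = upper_dna.count("G")        # occurrences of a single base

print(length)
print(upper_dna)
print(ordered_bases)
print(g_total)
```

Converting to upper case first matters here: count is case-sensitive, so counting "G" in the original string would miss the lower-case g.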
• Basic sequence manipulation
• Fetch records from databases
• Multiple sequence alignment (Clustal, Muscle)
• Sequence similarity search (Blast)
• Working with motifs: MEME, Jaspar, Transfac
• Phylogenetics
• Clustering
• Visualisation
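As a flavour of basic sequence manipulation, a reverse complement can be sketched in plain Python. This is not Biopython's implementation (Biopython's Seq objects provide their own reverse_complement method); the function name and input here are illustrative, and only unambiguous DNA letters are handled.

```python
# Translation table mapping each base to its complement (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(dna):
    """Return the reverse complement of an unambiguous DNA string."""
    return dna.translate(COMPLEMENT)[::-1]

print(reverse_complement("ATGCCG"))
```

Applying the function twice returns the original sequence, which is a quick sanity check for this kind of helper.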
Parsing GenBank records:
from Bio import SeqIO
record = SeqIO.read("AE014613.1.gb", "genbank")
record.description
'Salmonella enterica subsp. enterica serovar Typhi Ty2, complete genome.'
len(record.features)
9086
Parsing sequence records:
from Bio import SeqIO
for entry in SeqIO.parse("tlr4_protein.fa", "fasta"):
    # each entry is a protein record, so its length is in amino acids (aa)
    print(entry.description)
    print(len(entry), 'aa')
Output:
gi|765368240|gb|AJR32867.1| TLR4 [Gallus gallus]
843 aa
gi|111414439|gb|ABH09759.1| toll-like receptor 4 [Bos taurus]
841 aa
gi|6175873|gb|AAF05316.1|AF177765_1 toll-like receptor 4 [Homo sapiens]
839 aa
…
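To show roughly what a FASTA parser has to do, here is a simplified pure-Python sketch. It is a stand-in for illustration only, not how SeqIO.parse is implemented; the function name read_fasta and the two short records are made up.

```python
import io

def read_fasta(handle):
    """Yield (description, sequence) tuples from a FASTA-formatted handle."""
    description = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if description is not None:
                yield description, "".join(chunks)
            description = line[1:]
            chunks = []
        else:
            chunks.append(line)
    # Emit the final record after the loop ends.
    if description is not None:
        yield description, "".join(chunks)

# Made-up records held in memory instead of a file on disk.
example = io.StringIO(">seq1 first example\nMKT\nLLV\n>seq2 second example\nMAGW\n")
records = list(read_fasta(example))
for description, sequence in records:
    print(description, len(sequence))
```

For real work, a tested parser such as Biopython's handles the many edge cases this sketch ignores.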
Graphics:
Chromosomes colour-coded by GC content (Bioinformatics with Python Cookbook)
Coloured phylogenetic tree from Ebola sequences (Bioinformatics with Python Cookbook)
Additional Resources
https://store.continuum.io/cshop/anaconda/
Visualisations with Matplotlib
http://matplotlib.org/gallery.html
Scikit-learn – Machine Learning in Python
• Machine Learning: PCA of Iris data set
http://scikit-learn.org/stable/auto_examples/decomposition/plot_pca_iris.html
Python Help
Online courses
http://biopython.org/DIST/docs/tutorial/Tutorial.html
http://dowell.colorado.edu/education-python.html
http://www.pasteur.fr/formation/infobio/python
https://www.codecademy.com/tracks/python
http://anh.cs.luc.edu/python/hands-on/
https://www.coursera.org
Conclusions
• You have been briefly introduced to Python and IDLE.
• You have learnt about programming concepts.
• You have seen examples of what can be accomplished through Python.
• Topics of an extensive Python course:
  • Coding in Python – variables, scope, functions…
  • Bioinformatics with Biopython
  • Automated biological data analysis – your interests!
Thank You!
http://bioinf.gen.tcd.ie/workshops/python