Introduction
Definitions
Related Fields
The New Biology
Motivation and Background
Sources of Biological Data
Course Plan
Bioinformatics: Lecture 1
Muhammad Usman Ghani Khan
UET Lahore
Outline
1 Introduction
* Definitions
* Related Fields
* The New Biology
* Motivation and Background
2 Sources of Biological Data
3 Course Plan
Definitions
Over 43,000 definitions are available on the internet
Definition 1: Bioinformatics is the application of computer technology to the management and analysis of biological data [1]
Definition 2: Biologists doing stuff with computers?
Definition 3: The design, construction and use of software tools to generate, store, annotate, access and analyse data and information relating to Molecular Biology
* Here we consider the use of Bioinformatics tools rather than their design and construction
* Here we consider the access and analysis of data and information items rather than their generation, storage or annotation
[1] European Bioinformatics Institute (EBI)
Every application of computer science to biology
* Sequence analysis, image analysis, sample management, population modeling
Analysis of data coming from large-scale biological projects
* Genomes, transcriptomes, proteomes, metabolomes, etc.
Solving biological problems with computation?
Collecting, storing and analysing biological data?
Informatics - library science?
But: I do not think all biological computing is bioinformatics, e.g. mathematical modelling is not bioinformatics, even when connected with biology-related problems. In my opinion, bioinformatics has to do with management and the subsequent use of biological information, in particular genetic information. (Richard Durbin)
What is not bioinformatics?
* Biologically-inspired computation, e.g., genetic algorithms and neural networks
* However, applying neural networks to solve a biological problem could be called bioinformatics
* What about DNA computing?
Related Fields
Computational biology: application of computing to biology (broad definition)
* Often used interchangeably with bioinformatics
Biometry: the statistical analysis of biological data
Biophysics: an interdisciplinary field which applies techniques from the physical sciences to understanding biological structure and function [2]
Mathematical biology: tackles biological problems, but the methods it uses to tackle them need not be numerical and need not be implemented in software or hardware.
[2] British Biophysical Society
Computational biology and bioinformatics overlap: both use computational techniques to try to understand biological phenomena. Computational biology puts more emphasis on mathematical modelling to explain biological mechanisms, whereas bioinformatics has more to do with the storage and synthesis of experimental data (e.g. pattern recognition and data mining).
New Biology
Traditional Biology
Small team working on a specialized topic
Well defined experiment to answer precise questions
New high-throughput biology
Large international teams, with cutting-edge technology defining the project
Results are released raw to the scientific community, without any underlying hypothesis
Examples of High Throughput
Complete genome sequencing
Simultaneous expression analysis of thousands of genes (DNA microarrays, SAGE)
Large-scale sampling of the proteome
Protein-protein interaction analysis: large-scale two-hybrid screens (yeast, worm)
Large-scale 3D structure production (yeast)
Metabolism modeling
Biodiversity
Motivation
Rapid growth of biology-related data: an explosion of publicly available biological material
* Modern molecular biology, and especially genomics, has led to vast quantities of data: DNA/protein sequences, gene expression.
* This mainly consists of vast strings/matrices of letters/numbers, which in their raw form are not very interesting.
Management problem: how to handle this data?
* Analysis
* Understanding
* Presentation
Approaches:
* Computing techniques are very good at extracting useful patterns.
* Bioinformatics consists of methods for addressing these issues.
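As a rough illustration of how a raw string of letters can be turned into something more informative, the sketch below computes two simple summaries of a DNA sequence: its GC content and its most frequent short substrings (k-mers). The sequence and the function names `gc_content` and `kmer_counts` are made up for this example and are not part of the lecture material.

```python
from collections import Counter


def gc_content(seq):
    """Fraction of bases in seq that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(1 for base in seq if base in "GC") / len(seq)


def kmer_counts(seq, k):
    """Count every overlapping substring of length k in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


# Hypothetical DNA fragment, invented for illustration only.
fragment = "ATGCGCGATATATGCGC"

print(f"GC content: {gc_content(fragment):.2f}")
print("Most common 3-mers:", kmer_counts(fragment, 3).most_common(3))
```

Real analyses work on far longer sequences and use more refined statistics, but the idea is the same: summaries computed from the raw letters are what make patterns visible.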
In order to extract useful information, it is necessary to understand the biological principles involved.
In this course we will introduce some basic molecular biology/genomics and look at ways in which computers can be used to analyse the resulting data.
Sample Ultimate Problems
What is the role of a particular gene?
Does a particular gene help cause a disease?
How does a drug affect a cell?
Can we insert a gene into corn to protect it against diseases or pests?
Can we design a drug to accomplish a particular purpose?
Can we build a cell that eats pollution?
Why would a student choose this course?
To prepare for graduate study in Bioinformatics or Computational Biology.
To prepare for certain jobs in the pharmaceutical or biotechnology industries. The future is hard to predict, but there are jobs related to high-tech agriculture (new varieties of plants), industrial organisms, biofuels and pharmaceuticals (designer drugs).
So what data can we generate?
Biological data can be generated at many different levels
Genomics (DNA)
Transcriptomics (RNA)
Proteomics (proteins)
Metabolomics (small compounds)
Lipidomics (lipids)
Hundreds of omics have been catalogued
What does an omics dataset look like?
In most cases datasets have a similar structure
Each sample is characterised by a large number of variables (RNAs, proteins, lipids, etc.)
Each variable indicates (usually quantitatively) the presence of that element in the sample
Due to the high cost of most omics technologies, there are usually many more variables than samples
* Problems of over-fitting
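To see why having far more variables than samples invites over-fitting, consider the following minimal sketch. It assumes a purely random dataset (binary variables and binary labels with no real relationship between them) and a deliberately naive "model" that picks the single variable agreeing best with the training labels. The sizes, the seed and the function names are arbitrary choices for illustration, not taken from the lecture.

```python
import random


def best_single_variable(samples, labels):
    """Return (index, accuracy) of the variable that best matches the labels."""
    n_vars = len(samples[0])
    best_idx, best_acc = 0, -1.0
    for j in range(n_vars):
        hits = sum(1 for row, y in zip(samples, labels) if row[j] == y)
        acc = hits / len(labels)
        if acc > best_acc:
            best_idx, best_acc = j, acc
    return best_idx, best_acc


def simulate(n_train=6, n_test=200, n_vars=2000, seed=0):
    """Fit on a few random samples, then evaluate on fresh random samples."""
    rng = random.Random(seed)

    def draw(n):
        # Variables and labels are independent coin flips: there is no signal.
        X = [[rng.randint(0, 1) for _ in range(n_vars)] for _ in range(n)]
        y = [rng.randint(0, 1) for _ in range(n)]
        return X, y

    X_train, y_train = draw(n_train)
    X_test, y_test = draw(n_test)

    idx, train_acc = best_single_variable(X_train, y_train)
    test_hits = sum(1 for row, y in zip(X_test, y_test) if row[idx] == y)
    return train_acc, test_hits / n_test


train_acc, test_acc = simulate()
print(f"training accuracy: {train_acc:.2f}")
print(f"test accuracy:     {test_acc:.2f}")
```

With only six training samples and two thousand candidate variables, some variable almost always matches the training labels perfectly by chance, yet on new samples it performs about as well as a coin flip. Validation on held-out data is one common way to detect this situation.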
Research Areas
(Genome-scale) Sequence Analysis
* Sequence alignments, motif discovery, genome-wide association (to study diseases such as cancers)
Computational Evolutionary Biology
* Phylogenetics, evolution modeling
Analysis of Gene Regulation
* Gene expression analysis, alternative splicing, protein-DNA interactions, gene regulatory networks
Structural Biology
* Drug discovery, protein folding, protein-protein interactions
Synthetic Biology
High-throughput Image Analysis
Course Contents
Lecture 1
Introduction, Definitions.
Applications, Scope, Motivation.
Lecture 2
Introduction to molecular biology
Structure of DNA, RNA, Proteins
Announcement of term projects
Lecture 3
Bioinformatics databases: GenBank, EMBL, Prot, etc.
Practical demonstration of databanks and their structures.
Lecture 4
Database formats: FASTA, seq, Data
Quiz 1
Lecture 5
Sequence Alignment; Sequence Motifs; Gene Finding
Practical demonstration of BioJava/.NetBio tools for biology-related tasks
Lecture 6
Sequence Alignment (Part 2)
Computing with Biological Structures
Lecture 7
Phylogenetic Algorithms
Lecture 8
Mid-term break
Lecture 9
Microarray Data Analysis
Lecture 10
Term project presentations and discussion
Lecture 11
Comparative Genomics
Lecture 12
Proteomics
Lecture 13
Biological Ontologies; Biological Text Mining
Lecture 14
Genetic Networks
Lecture 15
Final Viva and term project submissions
Term Project Ideas
Architectures and data management techniques for the life sciences
Query processing and optimization for biological data
Biological data sharing and update propagation
Query formulation assistance for scientists
Modeling of life sciences data
Biomedical data integration issues in eScience
Laboratory information management systems in biology (including workflow systems)
Quality assurance in integrated data repositories
Biomedical metadata management (including provenance)
Mining integrated life sciences data and text resources
Standards for biomedical data integration and annotation
Scientific results arising from innovative data integration solutions
Exposing biomedical data for integration purposes (APIs, Linked Open Data, SPARQL endpoints)
Creation and use of clinical data repositories
Data integration in clinical and translational research
Integration of genotypic and phenotypic data
Challenges and opportunities with big data in the life sciences
Ethical, legal and social issues with biomedical data integration
Useful Books
Bryan Bergeron, M.D.: Bioinformatics Computing, Prentice Hall, 2002 (freely available on the internet).
Richard C. Deonier, Simon Tavare & Michael S. Waterman: Computational Genome Analysis: An Introduction, Springer, 2005
Some other helpful books
* Alberts et al.: Molecular Biology of the Cell
* Stryer: Biochemistry
* Baldi and Brunak: Bioinformatics: The Machine Learning Approach
* Durbin, Eddy, Krogh and Mitchison: Biological Sequence Analysis
* Kanehisa: Post-genome Informatics
* Lesk: Introduction to Bioinformatics
* Orengo, Jones and Thornton: Bioinformatics