Bioinformatics and Computational Biology


Page 1: Bioinformatics and  Computational Biology

Bioinformatics and Computational Biology

Page 2: Bioinformatics and  Computational Biology

• Bioinformatics: the collection and storage of biological information; it derives knowledge from computer analysis of biological data

• Computational biology: the development of algorithms and statistical models to analyze biological data

Page 3: Bioinformatics and  Computational Biology

Why is bioinformatics critical?

Few people are adequately trained in both biology and computer science

Genome sequencing, microarrays, etc. produce large amounts of data to be analyzed

Leads to important discoveries

Saves time and money

Page 4: Bioinformatics and  Computational Biology

Why is the relationship between computer science and biology essential?

Three main reasons:

First, massive amounts of data have to be stored, analyzed and made accessible

Second, the nature of the data is often such that a computational statistical method is necessary. This applies in particular to the information, encoded by the DNA, on the building plans of proteins and the spatial organization of their expression in the cell.

Third, there is a strong analogy between the DNA sequence and a computer program

Page 5: Bioinformatics and  Computational Biology

Key Areas/Scope of Bioinformatics

1. Organizing biological knowledge in databases

2. Analysing sequence data

3. Structural Bioinformatics

4. Pharmacological relevance (Population genetics)

Page 6: Bioinformatics and  Computational Biology

1. Organizing biological knowledge in databases

Organized DNA sequences: GenBank (NCBI), EMBL

Protein sequence databanks with structural and functional characteristics. For example, SWISS-PROT contains verified protein sequences and additional annotations describing the function of a protein

Literature databases: PubMed, MEDLINE

Page 7: Bioinformatics and  Computational Biology

2. Analysing sequence data

• Establish the correct order of sequence contigs

• Find the translation and transcription initiation sites, find promoter sites, define open reading frames (ORF)

• Find splice sites, introns, exons

• Translate the DNA sequence into a protein sequence

• Compare the DNA sequence to known protein sequences in order to verify exons etc. with homologous sequences

• Multiple sequence alignments

• Studying evolutionary aspects, by the construction of phylogenetic trees

• Determining active site residues, and residues specific for subfamilies

• Predicting protein–protein interactions

• Analysing single nucleotide polymorphisms to hunt for genetic sources of diseases
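Two of the sequence-analysis tasks above, defining open reading frames and translating DNA into protein, can be sketched in a few lines of Python. This is only an illustrative sketch under simplifying assumptions: it uses the standard genetic code, scans the forward strand only, treats an ORF as the stretch from an ATG to the next in-frame stop codon, and ignores introns. The function names `translate` and `find_orfs` and the `min_codons` parameter are made up for this example and do not come from the slides.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1); codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence codon by codon; '*' marks a stop codon."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    # Codons containing ambiguous letters (e.g. N) are reported as 'X'.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def find_orfs(dna: str, min_codons: int = 2) -> list[tuple[int, int]]:
    """Return (start, end) slices of ATG...stop ORFs in the three forward frames."""
    dna = dna.upper()
    stops = {"TAA", "TAG", "TGA"}
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the ORF
            elif start is not None and codon in stops:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))  # end index includes the stop codon
                start = None
    return orfs


if __name__ == "__main__":
    sequence = "CCATGGCCTAAGG"  # made-up toy sequence
    for start, end in find_orfs(sequence):
        print(start, end, translate(sequence[start:end]))
```

A real gene finder would also scan the reverse complement and model splice sites; this sketch only shows the basic idea behind reading frames.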

Page 8: Bioinformatics and  Computational Biology

3. Structural Bioinformatics

This branch of bioinformatics is concerned with computational approaches to predict and analyse the spatial structure of proteins and nucleic acids.

Starting from multiple sequence alignments, secondary structure and 3D structure can be predicted with an accuracy above 70 %.

Page 9: Bioinformatics and  Computational Biology

4. Pharmacological relevance

Drug targets in infectious organisms can be revealed by whole-genome comparisons of infectious and non-infectious organisms.

The analysis of single nucleotide polymorphisms reveals genes potentially responsible for genetic diseases.

Prediction and analysis of protein 3D structure is used to develop drugs and understand drug resistance.

Patient databases with genetic profiles, e.g. for cardiovascular diseases, diabetes, cancer, etc. may play an important role in the future for individual health care, by integrating a personal genetic profile (population genetics) into diagnosis.
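The first step of the single nucleotide polymorphism analysis mentioned above is spotting positions where a sample differs from a reference by one base. The sketch below assumes the two sequences have already been aligned to the same length (gaps written as `-`). The function name `snp_positions` is hypothetical, and linking a variant to a disease requires statistical analysis well beyond this comparison.

```python
def snp_positions(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """List (1-based position, reference base, sample base) where two aligned sequences differ."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be pre-aligned to the same length")
    return [
        (pos, ref, alt)
        for pos, (ref, alt) in enumerate(zip(reference.upper(), sample.upper()), start=1)
        # Skip alignment gaps: they are insertions/deletions, not single-base substitutions.
        if ref != alt and "-" not in (ref, alt)
    ]


if __name__ == "__main__":
    # Made-up toy sequences for illustration only.
    print(snp_positions("ACGTACGT", "ACGAACCT"))
```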

Page 10: Bioinformatics and  Computational Biology

Genomic Browsers

National Center for Biotechnology Information (NCBI) (http://ncbi.nlm.nih.gov)

Ensembl Genome Browser (http://www.ensembl.org)

UCSC Genome Browser (http://genome.ucsc.edu/)

WormBase (http://www.wormbase.org/)

AceDB (http://www.acedb.org/)

FlyBase (http://flybase.bio.indiana.edu/)

Page 11: Bioinformatics and  Computational Biology

Protein databases

• SWISS-PROT/TrEMBL: curated protein sequences http://www.expasy.ch/sprot

• InterPro: protein families and domains http://www.ebi.ac.uk/interpro

• EXProt: proteins with experimentally verified functions http://www.cmbi.nl/exprot

• Protein Information Resource (PIR) http://pir.georgetown.edu/

Page 12: Bioinformatics and  Computational Biology

NCBI

Page 13: Bioinformatics and  Computational Biology

Continued..

Page 14: Bioinformatics and  Computational Biology

NCBI text search of a protein

Page 15: Bioinformatics and  Computational Biology

Abstract finding by NCBI

Page 16: Bioinformatics and  Computational Biology

Nucleotide search of a typical gene

Page 17: Bioinformatics and  Computational Biology

Continued..

Page 18: Bioinformatics and  Computational Biology

FASTA format

Page 19: Bioinformatics and  Computational Biology

FASTA format is a text-based format for representing either nucleic acid sequences or protein sequences, in which nucleotides or amino acid residues are represented using single-letter codes.
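In a FASTA file each record starts with a description line beginning with `>`, followed by one or more lines of sequence. A minimal reader can be sketched as below. The function name `parse_fasta` and the example records are made up for illustration; production code would more likely use an established library such as Biopython.

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Parse FASTA-formatted text into a {description line: sequence} mapping."""
    records: dict[str, list[str]] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            header = line[1:].strip()  # description line opens a new record
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence may span several lines
    return {name: "".join(parts) for name, parts in records.items()}


# Hypothetical records: one nucleotide sequence and one protein sequence.
example = """>seq1 hypothetical example record
ATGGCC
TAA
>seq2 another made-up record
MKV
"""

if __name__ == "__main__":
    for name, sequence in parse_fasta(example).items():
        print(name, len(sequence), sequence)
```

The sketch keeps the whole description line as the key; splitting off the identifier (the text before the first space) is a common refinement.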