Bioinformatics. Bioengineering Summer Camp, June 2001. Yaron Turpaz, Andrew Binkowski, Jie Liang


Introduction to Bioinformatics


Page 1: Bioinformatics

Bioinformatics

Bioengineering Summer Camp

June 2001

Yaron Turpaz, Andrew Binkowski, Jie Liang

Page 2: Bioinformatics

Outline

• Introduction

– Basic molecular biology.

• What is Bioinformatics?

• UIC’s Bioinformatics group

– Research projects.

– Summer camp research project.

http://gila.engr.uic.edu/bioinformatics/

Page 3: Bioinformatics

DNA

Protein

Nucleotide sequence

Gene expression = Protein production

Page 4: Bioinformatics

Exons & Introns

STOP F R L

Page 5: Bioinformatics

Biological Diversity

Bacteria

Fruit Fly

Human

Yeast

Page 6: Bioinformatics

The Genetic Code

• From DNA sequence to protein sequence.

• 20 building blocks of a protein = 20 amino acids.
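The step from DNA sequence to protein sequence can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: it assumes the standard genetic code, a coding-strand DNA input read in frame from its first base, and a hypothetical helper name `translate`. The example sequence is made up for illustration.

```python
# Standard genetic code, stored compactly: codons are enumerated with each
# position cycling through the bases in T, C, A, G order ("*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate coding-strand DNA into one-letter amino acid codes.

    Reads three bases at a time from the first base and stops at the
    first stop codon (hypothetical helper for illustration only).
    """
    dna = dna.upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":  # stop codon: translation ends here
            break
        protein.append(amino_acid)
    return "".join(protein)


# Made-up example: ATG TTT CGT CTG TAA -> M, F, R, L, then STOP.
print(translate("ATGTTTCGTCTGTAA"))
# The 64 codons map onto exactly 20 distinct amino acids (plus stop).
print(len(set(AMINO_ACIDS) - {"*"}))
```

The table has 64 entries but only 20 distinct amino acids, which is why several codons encode the same building block.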

Page 7: Bioinformatics

From protein sequence to Structure

APRKFFVGGNWKMNGDKKSLGELIHTLNGAKLSADTEVVCGAPSIYLDFARQKLDAKIGVAAQNCYKVPKGAFTGEISPAMIKDIGAAWVILGHSERRHVFGESDELIGQKVAHALAEGLGVIACIGEKLDEREAGITEKVVFEQTKAIADNVKDWSKVVLAYEPVWAIGTGKTATPQQAQEVHEKLRGWLKSHVSDAVAQSTRIIYGGSVTGGNCKELASQHDVDGFLVGGASLKPEFVDIINAKH

=

Page 8: Bioinformatics

Bioinformatics


• Computational analysis of high-throughput biological data

– Whole genome sequencing.

– Global genomic expression & profiling.

– Functional genomics.

– Structural genomics/proteomics.

– Comparative genomics.

Page 9: Bioinformatics

Bioinformatics in BioE


• Interdisciplinary approach

– Computer science, Mathematics & Statistics.

– Molecular biology, Biochemistry & Medicine.

• Rapidly growing impact area of BioE:

– Boston U, UC Berkeley, UCSD, Rice, WashU, MIT, ...

Page 10: Bioinformatics

Research

• Structural genomics/proteomics

– Structural basis of functional motifs in protein families.

– The CAST server: http://cast.engr.uic.edu/

– Drug discovery.

• Functional genomics

– Collaboration with TIGR: http://www.tigr.org/

• Data mining of microbial DNA sequences for detection of foreign DNA.

• Whole genome comparative studies.

• Gene expression analysis

– Collaboration with a cancer biologist (Dr. Westbrook, School of Medicine).

• Molecular-based informatics methods to facilitate the diagnosis of cancer.

Page 11: Bioinformatics

Gene expression

Computational analysis of cDNA microarray expression profiles

~26,000 genes in one experiment

Page 12: Bioinformatics

Functional correlation in clusters

1. Cell division

2. cell signaling/cell communication

3. cell structure/motility

4. cell/organism defense

5. gene/protein expression

6. metabolism

7. unclassified

• Genes with similar expression patterns may participate in the same pathway or may be co-regulated.

• Clustering of expression patterns may reveal such relationships.
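As a rough illustration of clustering expression patterns, the sketch below groups genes whose profiles are strongly correlated. It is not the method used in the project described here: the Pearson correlation measure, the single-linkage grouping rule, the 0.9 threshold, the function names, and the four toy profiles are all assumptions made for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def cluster_by_correlation(profiles, threshold=0.9):
    """Group genes by single linkage on correlation.

    A gene joins (and merges) every existing cluster that contains at
    least one gene correlated with it at or above the threshold.
    """
    clusters = []
    for gene, profile in profiles.items():
        linked = [
            c for c in clusters
            if any(pearson(profile, profiles[other]) >= threshold for other in c)
        ]
        merged = [gene]
        for c in linked:
            merged.extend(c)
            clusters.remove(c)
        clusters.append(sorted(merged))
    return sorted(clusters)


# Toy, made-up profiles: expression level at four time points.
toy_profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.2, 5.9, 4.1, 2.0],
}
print(cluster_by_correlation(toy_profiles))
```

In the toy data, geneA and geneB rise together while geneC and geneD fall together, so the sketch places them in two separate clusters. Real microarray analyses with thousands of genes typically rely on dedicated clustering libraries and careful normalization.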

Page 13: Bioinformatics

Research – structure & function

How Proteins Interface with Other Molecules

• Analysis of protein topographic surfaces:

– Identifying protein function.

– Predicting binding specificity and affinity.

– Discovering functional similarity.

• Protein interaction with cosolvents

– Stabilization of protein solutions for longer shelf life.

– Molecular mechanism and optimization.

Page 14: Bioinformatics

Structural genomics/proteomics

• Atlas of Topographic Surfaces of All Known Protein Structures

– Automatic identification of binding pockets.

– Measurement of the size of surface binding pockets.

• Drug Discovery

– Quantifying ligand accessibility.

– Constructing a precise negative imprint, or cast, of the binding site.

Page 15: Bioinformatics

Geometry based approach for functional motifs

Discrete Flow

Voronoi Diagram and Delaunay Triangulation
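A Delaunay triangulation can be characterized by the empty-circumcircle property: a triangle belongs to it when no other input point lies inside the circle through its three corners. The sketch below applies that test by brute force to a handful of 2D points. It is only meant to make the definition concrete; it is not how alpha shapes or the CAST server are computed, it assumes points in general position with integer coordinates, and the function names are hypothetical.

```python
from itertools import combinations


def orientation(a, b, c):
    """Twice the signed area of triangle abc (positive if counterclockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_circumcircle(a, b, c, p):
    """True if p lies strictly inside the circle through a, b and c."""
    if orientation(a, b, c) < 0:  # the determinant test expects counterclockwise order
        b, c = c, b
    ax, ay = a[0] - p[0], a[1] - p[1]
    bx, by = b[0] - p[0], b[1] - p[1]
    cx, cy = c[0] - p[0], c[1] - p[1]
    det = (
        (ax * ax + ay * ay) * (bx * cy - by * cx)
        - (bx * bx + by * by) * (ax * cy - ay * cx)
        + (cx * cx + cy * cy) * (ax * by - ay * bx)
    )
    return det > 0


def delaunay_brute_force(points):
    """Return index triples whose circumcircle contains no other point.

    Checks every triple against every point, so it is only practical
    for a few dozen points.
    """
    triangles = []
    for i, j, k in combinations(range(len(points)), 3):
        a, b, c = points[i], points[j], points[k]
        if orientation(a, b, c) == 0:  # collinear points do not form a triangle
            continue
        others = (p for m, p in enumerate(points) if m not in (i, j, k))
        if not any(in_circumcircle(a, b, c, p) for p in others):
            triangles.append((i, j, k))
    return triangles


# Made-up example: four outer points and one interior point.
points = [(0, 0), (4, 0), (4, 3), (0, 4), (2, 2)]
print(delaunay_brute_force(points))
```

The Voronoi diagram is the dual structure: each Delaunay edge connects two points whose Voronoi cells share a boundary.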

Page 16: Bioinformatics

3D alpha shapes (HIV-1 protease)

Page 17: Bioinformatics

What is an algorithm?

• Precisely defined procedure for accomplishing a task:

– driving directions,

– furniture assembly instructions,

– computer programs.

• Built in hardware: fast

• Built in software

Page 18: Bioinformatics

Are computers fast enough?

• NP-complete problems, e.g. the traveling salesman problem:

20 cities: a few seconds

30 cities: a few hours

60 cities: a few decades

• Computer speed does not increase exponentially.
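To see why adding cities overwhelms any fixed gain in computer speed, the sketch below solves the traveling salesman problem by exhaustive search and counts how many distinct round trips exist for a given number of cities. It is an illustration only: exhaustive search is one naive approach, the function names and the four-city example are made up, and the running times quoted above depend on the algorithm and hardware, so the code does not attempt to reproduce them.

```python
from itertools import permutations
from math import dist, factorial


def tour_length(points, order):
    """Total length of the closed tour visiting points in the given order."""
    return sum(
        dist(points[order[i]], points[order[(i + 1) % len(order)]])
        for i in range(len(order))
    )


def brute_force_tsp(points):
    """Try every tour that starts at city 0; return (best_length, best_order)."""
    best = None
    for rest in permutations(range(1, len(points))):
        order = (0,) + rest
        length = tour_length(points, order)
        if best is None or length < best[0]:
            best = (length, order)
    return best


def count_tours(n):
    """Number of distinct round trips through n cities: (n - 1)! / 2."""
    return factorial(n - 1) // 2


# Made-up example: four cities on the corners of a unit square.
cities = [(0, 0), (0, 1), (1, 1), (1, 0)]
print(brute_force_tsp(cities))

# The number of candidate tours grows factorially with the number of cities.
for n in (5, 10, 20, 30, 60):
    print(n, count_tours(n))
```

Going from 20 to 30 cities multiplies the number of tours by far more than any realistic hardware speedup, which is the point of the slide.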

Page 19: Bioinformatics

Pockets in Ribonuclease A

Page 20: Bioinformatics

CASTp

A Server for Identification of Protein Pockets & Cavities

http://cast.engr.uic.edu/

Users of CASTp:

• Columbia, Harvard, Mayo Clinic, Princeton, Stanford, U Penn, SUNY Stony Brook, Texas A&M, UC Irvine, UBC, Virginia Tech, Yale, ...

• Abbott Lab, Pfizer, SmithKline Beecham, ...

• Agouron, Emisphere, Vertex, ...

• Kyoto U, Cambridge, European Molecular Biology Lab, INRA (France), Pasteur-Lille, Uppsala, Weizmann, ...

• Brazil, Czech, Korea, Turkey, ...

Page 21: Bioinformatics

CASTp Results

Calculations:

• Identifies all pockets and cavities.

• Measures the volume and area analytically.

• Reports the number, area, and circumcircles of the mouth openings for each pocket.

Files via email:

• a pocket and mouth information file,

• a list of pocket and mouth atoms,

• a script file for visualization using RasMol.


Page 22: Bioinformatics

THANK YOU