Computational Biology and Bioinformatics in Computer Science

Lenwood S. Heath
Department of Computer Science
2160J Torgersen Hall
Virginia Tech

Department Seminar Series
September 9, 2005

9/9//2005 Computational Biology and Bioinformatics 2

Overview

• Computational biology and bioinformatics (CBB)
  – What is it?
  – History at VT
  – Some biological terminology

• CBB faculty and projects

• Education in CBB
  – Bioinformatics option
  – GBCB

• Conclusion

Computational Biology and Bioinformatics (CBB)

• Computational biology — computational research inspired by biology

• Bioinformatics — application of computational research (computer science, mathematics, statistics) to advance basic and applied research in the life sciences

  – Agriculture
  – Basic biological science
  – Medicine

• Both ideally done within multidisciplinary collaborations

CBB History (Part I)

• Biological modeling (Tyson, Watson): > 20 years

• Computational biology, genome rearrangements (Heath): > 10 years

• Fralin Biotechnology sponsored faculty advisory committee centered on bioinformatics: 1998-2000
  – Biochemistry; biology; CALS; computer science (Heath, Watson); statistics; VetMed
  – Provost provided $1 million seed money
  – First VT bioinformatics hire (Gibas, biology, 1999)

CBB History (Part II)

• Outside initiative submitted to VT for a campus bioinformatics center — 1998

• Discussions of bioinformatics advisory committee contributed to a proposal to the Gilmore administration — 1999

• Governor Gilmore puts plans and money for bioinformatics center in budget — 1999-2000

• Virginia Bioinformatics Institute (VBI) established July, 2000; housed in CRC

Virginia Bioinformatics Institute (VBI)

• Established by the state in July, 2000; high visibility

• Applies computational and information technology in biological research

• Research faculty (currently, about 18) expertise includes
  – Biochemistry
  – Comparative Genomics
  – Computer Science
  – Drug Discovery
  – Human and Plant Pathogens
  – Mathematics
  – Physics
  – Simulation
  – Statistics

• More than $43 million funded research

CBB History (Part III)

• Bioinformatics course and curriculum development began with faculty subcommittee — 1999

• Courses supporting bioinformatics now in many life science and computational science departments, including:

  – Biology
  – Biochemistry
  – Computer Science
  – Plant Pathology, Physiology, and Weed Science (PPWS)
  – Mathematics
  – Statistics

Some Molecular Biology

• The encoded instruction set for an organism is kept in DNA molecules.

• Each DNA molecule contains 100s or 1000s of genes.

• A gene is transcribed to an mRNA molecule.

• An mRNA molecule is translated to a protein (molecule).
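
The two steps above (gene to mRNA, mRNA to protein) can be illustrated with a short Python sketch. It is not from the talk. It assumes the input is the coding (sense) strand of a gene, so that the mRNA is the same sequence with T replaced by U, and it uses the standard genetic code. The function names `transcribe` and `translate` are chosen here for illustration only.

```python
from itertools import product

# Standard genetic code, codons enumerated in UCAG order
# (first base varies slowest, third base fastest). "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand):
    """Return the mRNA for a DNA coding strand (assumption: T becomes U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Translate from the first AUG up to, but not including, a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTAA")   # "AUGGCCUAA"
protein = translate(mrna)        # "MA" (Met, Ala, then stop)
```

Real genes add complications this sketch ignores, such as introns, the choice of strand, and alternative reading frames.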

Elaborating Cellular Function

[Diagram: DNA → mRNA (transcription) → protein (translation, via the genetic code). Other labels in the diagram: reverse transcription, degradation, regulation. Callout: thousands of genes!]

Protein functions:
• Structure
• Catalyze chemical reactions
• Regulate transcription

Chromosomes

• Large molecules of DNA: 10^4 to 10^8 base pairs.

• Human chromosomes: 22 matched pairs plus X and Y.

• A gene is a subsequence of a chromosome that encodes a protein.

• Proteins associated with regulation are present in chromosomes.

• Every gene is present in every cell.

• Only a fraction of the genes are in use (“expressed”) at any time.

Genomics

Genomics: Discovery of genetic sequences and the ordering of those sequences into individual genes, into gene families, and into chromosomes. Identification of sequences that code for gene products/proteins and sequences that act as regulatory elements.

Functional Genomics

Functional Genomics: The biological role of individual genes, mechanisms underlying the regulation of their expression, and regulatory interactions among them.

Challenges for Computer Science

• Analyzing and synthesizing complex experimental data

• Representing and accessing vast quantities of information

• Pattern matching

• Data mining

• Gene discovery

• Function discovery

• Modeling the dynamics of cell function

CBB Faculty in CS

1. Chris Barrett (VBI, CS)

2. Vicky Choi

3. Roger Ehrich

4. Edward A. Fox

5. Lenny Heath

6. Madhav Marathe (VBI, CS)

7. T. M. Murali

8. Chris North

9. Alexey Onufriev

10. Naren Ramakrishnan

11. Adrian Sandu

12. Eunice Santos

13. João Setubal (VBI, CS)

14. Cliff Shaffer

15. Anil Vullikanti (VBI, CS)

16. Layne Watson

17. Liqing Zhang

Established CBB Faculty

• Layne Watson

• Lenny Heath

• Cliff Shaffer

• Naren Ramakrishnan

• Eunice Santos

Layne Watson

• Professor of Computer Science and Mathematics

• Expertise: algorithms; image processing; high performance computing; optimization; scientific computing

• Computational biology: has worked with John Tyson (biology) for over 20 years

• JigCell: cell-cycle modeling environment; with Tyson, Shaffer, Ramakrishnan, Pedro Mendes of VBI

• Expresso: microarray experimentation; with Heath, Ramakrishnan

Lenny Heath

• Professor of Computer Science

• Expertise: algorithms; theoretical computer science; graph theory

• Computational biology: worked in genome rearrangements 10 years ago

• Bioinformatics: concentration in past 5 years

• Expresso: microarray experimentation; with Ramakrishnan, Watson
  – Multimodal networks
  – Computational models of gene silencing

Cliff Shaffer

• Associate Professor of Computer Science

• Expertise: algorithms; problem solving environments; spatial data structures

• JigCell: cell-cycle modeling environment; with Ramakrishnan, Tyson, Watson

Naren Ramakrishnan

• Associate Professor of Computer Science

• Expertise: data mining; machine learning; problem solving environments

• JigCell: cell-cycle modeling problem solving environment; with Shaffer, Watson

• Expresso: microarray experimentation; with Heath, Watson
  – Proteus: inductive logic programming system for biological applications
  – Computational models of gene silencing

Eunice Santos

• Associate Professor of Computer Science

• Expertise: Algorithms; computational biology; computational complexity; parallel and distributed processing; scientific computing

• Relevant bioinformatics project: modeling progress of breast cancer

New CBB Faculty

• T. M. Murali (2003) CS bioinformatics hire

• Alexey Onufriev (2003) CS bioinformatics hire

• Adrian Sandu (2004) CS hire

• João Setubal (Early 2004) VBI and CS

• Vicky Choi (2004) CS bioinformatics hire

• Liqing Zhang (2004) CS bioinformatics hire

• Chris Barrett, Madhav Marathe (Fall 2004) VBI and CS

• Anil Vullikanti (Fall 2004) VBI and CS

• Yang Cao (January, 2006) CS bioinformatics hire

T. M. Murali

• Assistant Professor of Computer Science

• Hired in 2003 for bioinformatics group

• Expertise: algorithms; computational geometry; computational systems biology

• Projects:
  – Functional gene annotation
  – xMotif: find patterns of coexpression among subsets of genes
  – RankGene: rank genes according to predictive power for disease

Alexey Onufriev

• Assistant Professor of Computer Science

• Hired in 2003 for bioinformatics group

• Expertise: Computational and theoretical biophysics and chemistry; structural bioinformatics; numerical methods; scientific programming

• Projects:
  – Biomolecular electrostatics
  – Theory of cooperative ligand binding
  – Protein folding
  – Protein dynamics: how does myoglobin take up oxygen?
  – Computational models of gene silencing

Adrian Sandu

• Associate Professor of Computer Science

• Hired in 2003

• Expertise: Computational science; numerical methods; parallel computing; scientific and engineering applications

• Computational science:
  – New generation of air quality models
  – Computational tools for assimilation of atmospheric chemical and optical measurements into atmospheric chemical transport models

João Setubal

• Research Associate Professor at VBI

• Associate Professor of Computer Science

• Joined in early 2004

• Expertise: algorithms; computational biology; bacterial genomes

• Comparative genomics

Vicky Choi

• Assistant Professor of Computer Science

• Hired in 2004 for bioinformatics group

• Expertise: computational biology; algorithms

• Projects:
  – Algorithms for genome assembly
  – Protein docking
  – Biological pathways

Liqing Zhang

• Assistant Professor of Computer Science

• Hired in 2004 for bioinformatics group

• Expertise: evolutionary biology; bioinformatics

• Research interests:
  – Comparative evolutionary genomics
  – Functional genomics
  – Multi-scale models of bacterial evolution

Selected CBB Research Projects

• JigCell

• Expresso

• Multimodal Networks

• Computational Modeling of Gene Silencing

JigCell: A PSE for Eukaryotic Cell Cycle Controls

Marc Vass, Nick Allen, Jason Zwolak, Dan Moisa,

Clifford A. Shaffer, Layne T. Watson,

Naren Ramakrishnan, and John J. Tyson

Departments of Computer Science and Biology

Cell Cycle of Budding Yeast

[Figure: wiring diagram of the budding yeast cell cycle. Labeled components include Cln2, Cln3, Clb2, Clb5, Sic1, SCF, SBF, MBF, Mcm1, Swi5, Cdc20, Cdh1, APC, PPX, Esp1, Pds1, Net1, Cdc14, RENT, Cdc15, Tem1, Bub2, Lte1, Mad2, Bck2, and CDKs. Labeled events include growth, budding, DNA synthesis, unaligned chromosomes, and sister chromatid separation.]

JigCell Problem-Solving Environment

[Diagram: components of the JigCell problem-solving environment: experimental database, wiring diagram, differential equations, parameter values, analysis, simulation, visualization, automatic parameter estimation]
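
The path from differential equations and parameter values to simulation can be illustrated with a toy example. The sketch below is not JigCell code and is not a cell-cycle model from the talk. It assumes a single hypothetical protein with constant synthesis rate `k_syn` and first-order degradation rate `k_deg`, so that dx/dt = k_syn - k_deg * x, and it integrates that equation with the forward Euler method.

```python
def simulate(k_syn, k_deg, x0=0.0, dt=0.01, t_end=10.0):
    """Forward-Euler integration of dx/dt = k_syn - k_deg * x.

    k_syn and k_deg are hypothetical parameter values; the return value is
    the list of concentrations at times 0, dt, 2*dt, ..., t_end.
    """
    steps = int(round(t_end / dt))
    x = x0
    trajectory = [x]
    for _ in range(steps):
        x += dt * (k_syn - k_deg * x)  # one Euler step
        trajectory.append(x)
    return trajectory


# With these assumed parameters the concentration rises toward the
# steady state k_syn / k_deg = 4.0.
traj = simulate(k_syn=2.0, k_deg=0.5)
```

Changing the parameter values and comparing the resulting trajectories against measurements is, in miniature, the kind of loop that parameter estimation automates.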

Why do these calculations?

• Is the model “yeast-shaped”?

• Bioinformatics role: the model organizes experimental information.

• New science: prediction, insight

JigCell is part of the DARPA BioSPICE suite of software tools for computational cell biology.

Expresso: A Next Generation Software System for Microarray Experiment Management and Data Analysis

Scenarios for Effects of Abiotic Stress on Gene Expression in Plants

The Expresso Pipeline

Proteus — Data Mining with ILP

• ILP (inductive logic programming) — a data mining algorithm for inferring relationships or rules

• Proteus — efficient system for ILP in bioinformatics context

• Flexibly incorporates a priori biological knowledge (e.g., gene function) and experimental data (e.g., gene expression)

• Infers rules without explicit direction

Fusion — Chris North

• “Snap together” visualization environment

• Interactively linked data from multiple sources

• Data mining in the background

Sequence Analysis

• Evolution implies changes in genomic sequence through mutations and other mechanisms

• Genomic or protein sequences that are similar are called homologous

• Algorithms to detect homology provide access to evolutionary relationships and perhaps function conservation through genomic data.
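
As an illustration of how sequence similarity can be quantified, here is a minimal Python sketch of a global alignment score computed by dynamic programming (the Needleman-Wunsch recurrence). It is not from the talk. The scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for simplicity; practical homology search uses substitution matrices, more refined gap penalties, and heuristic tools such as BLAST.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best global alignment score of sequences a and b.

    Uses the Needleman-Wunsch recurrence, keeping only one previous row
    of the dynamic-programming table at a time.
    """
    # Row 0: aligning the empty prefix of a against prefixes of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of a against the empty string
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diagonal, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


identical = global_alignment_score("GATTACA", "GATTACA")   # 7
one_change = global_alignment_score("GATTACA", "GATTTCA")  # 5
```

A higher score means the two sequences can be lined up with fewer mismatches and gaps; whether a given score indicates a shared evolutionary origin is a separate statistical question.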

Networks in Bioinformatics

• Mathematical Model(s) for Biological Networks

• Representation: What biological entities and parameters to represent and at what level of granularity?

• Operations and Computations: What manipulations and transformations are supported?

• Presentation: How can biologists visualize and explore networks?

Reconciling Networks

Munnik and Meijer, FEBS Letters, 2001

Shinozaki and Yamaguchi-Shinozaki, Current Opinion in Plant Biology, 2000

Multimodal Networks

• Nodes and edges have flexible semantics to represent:

- Time

- Uncertainty

- Cellular decision making; process regulation

- Cell topology and compartmentalization

- Rate constants

- Phylogeny

• Hierarchical
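
One way to picture nodes and edges with flexible semantics is a graph whose elements carry a free-form kind and attribute dictionary, with nodes that can contain other nodes. The Python sketch below is an illustration only and is not the authors' implementation. The class names (`Node`, `Edge`, `MultimodalNetwork`), the attribute keys, and the choice to let an edge join several sources to several targets are all assumptions made here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    kind: str                                      # e.g. "gene", "protein", "compartment"
    attrs: dict = field(default_factory=dict)      # e.g. time point, uncertainty
    children: list = field(default_factory=list)   # names of nested nodes (hierarchy)


@dataclass
class Edge:
    sources: tuple                                 # one or more source node names
    targets: tuple                                 # one or more target node names
    kind: str                                      # e.g. "expression", "regulation"
    attrs: dict = field(default_factory=dict)      # e.g. rate constant, confidence


@dataclass
class MultimodalNetwork:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, node, parent=None):
        """Register a node, optionally nesting it under an existing parent."""
        self.nodes[node.name] = node
        if parent is not None:
            self.nodes[parent].children.append(node.name)
        return node

    def add_edge(self, edge):
        """Add an edge after checking that every endpoint is a known node."""
        missing = [n for n in edge.sources + edge.targets if n not in self.nodes]
        if missing:
            raise KeyError(f"unknown nodes: {missing}")
        self.edges.append(edge)
        return edge

    def edges_of_kind(self, kind):
        return [e for e in self.edges if e.kind == kind]


# Hypothetical example: a gene inside a compartment, expressed as a protein.
net = MultimodalNetwork()
net.add_node(Node("nucleus", "compartment"))
net.add_node(Node("geneA", "gene"), parent="nucleus")
net.add_node(Node("proteinA", "protein"))
net.add_edge(Edge(("geneA",), ("proteinA",), "expression",
                  {"rate_constant": 0.3, "confidence": 0.8}))
```

Keeping semantics in attribute dictionaries rather than in fixed fields means that time, uncertainty, or rate constants can be attached to any element without changing the classes.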

Using Multimodal Networks

• Help biologists find new biological knowledge

• Visualize and explore

• Generate hypotheses and experiments

• Predict regulatory phenomena

• Predict responses to stress

• Incorporate into Expresso as part of closing the loop

Computational Modeling of Gene Silencing (CMGS)

Lenwood S. Heath, Richard Helm, Alexey Onufriev,

Naren Ramakrishnan, and Malcolm Potts

Departments of Computer Science and Biochemistry

RNA Interference (RNAi)

CMGS System

Other CBB Research Projects

• Bacterial genomics (Setubal)

• xMotif (Murali)

• Plant Orthologs and Paralogs (POPS)
  – Heath, Murali, Setubal, Zhang, Ruth Grene (plant physiology)

• Protein structure and docking (Choi)

• Whole-genome functional annotation (Murali)

• Modeling biomolecular systems (Onufriev)

CBB Education at VT

• CS has been training CS graduate students in CBB since 2000

• Graduate bioinformatics option established in a number of participating departments — 2003

• Ph.D. program in Genetics, Bioinformatics, and Computational Biology (GBCB) — 2003

• First GBCB students arrived, Fall, 2003; now in third year

CBB Education in CS

• A key department of the Ph.D. program in Genetics, Bioinformatics, and Computational Biology (GBCB)

• Computation for the Life Sciences I, II

• Algorithms in Bioinformatics

• Systems Biology

• Structural Bioinformatics and Computational Biophysics

• Databases for Bioinformatics

Conclusions

• Important research area in department

• Close collaboration between life scientists and computational scientists from the beginning of CBB research at VT

• Educational approach insists on adequate multidisciplinary background

• Multidisciplinary collaborators work closely on a regular basis

• Contributions to biology or medicine are essential outcomes

Supported by: Next Generation Software

Information Technology Research

NSF