Course Review
• Name one important thing that you learnt from this course that you feel will be important to your research career
• Name one aspect you were hoping to learn that you did not
Pharm 201 Lecture 19, 2011 1
Some Thoughts on the Future of Biological Data
with Emphasis on Structural Bioinformatics
Philip E. Bourne
Dept. of Pharmacology
University of California San Diego
pbourne@ucsd.edu
Agenda
• What is structural genomics and what is its impact?
• Unsolved problems in structural bioinformatics
• New challenges related to structural bioinformatics
• The bigger picture
• The final
Structural Genomics: A Broad Working Definition
Structural genomics is the process of high-throughput determination of the 3-dimensional structures of biological macromolecules
SG - What is the Goal?
• The goal of the human genome project was clear cut. The goal of structural genomics is not so clear cut
• Phase I:
– Provision of enough structural templates to facilitate homology modeling of most proteins
– Structures of all proteins in a complete proteome
– Structural elucidation of a complete biological pathway
– Structural elucidation of a complete disease
Example Goals (Phase I)
“The hyperthermophilic bacterium Thermotoga maritima has been the target of choice for pipeline development and genome-wide fold coverage.”
“The SGPP consortium will determine and analyze the three-dimensional structures of a large number of proteins from major global pathogenic protozoa, Leishmania major, Trypanosoma brucei, Trypanosoma cruzi and Plasmodium falciparum.”
“It is aimed at determining structures of proteins and protein complexes directly relevant to human health and diseases.”
(Figure labels: 117; 1257; 70; Structural Genomics of Pathogenic Protozoa)
Growth in the Number of New Topologies per Year According To CATH
(Chart: Total Folds and New Folds per year, from Nov. 2011. Source: http://www.rcsb.org/pdb/statistics/contentGrowthChart.do?content=fold-cath)
SG Had Very Little Direct Impact on New Folds and Hence Homology Modeling
SG - What is the Goal? – Phase II
SG – Phase III – PSI:Biology
• The third phase of the PSI is called PSI:Biology and is intended to reflect the emphasis on the biological relevance of the work
http://en.wikipedia.org/wiki/Protein_Structure_Initiative
Implications of Phase III SG
• Fewer single domains, more complex structures
• More protein-protein complexes
• More protein-ligand complexes
• More membrane proteins
• Better models
• More hybrid structures
• More molecular machines
SG Accounts for 14% of Structures
From RCSB PDB Nov 2011
Crude Estimators of What We Know and How We Might Get Better - Basics
• Data accessibility (60%)
• Domain definitions (80%)
• Structure comparison (80%)
• Disorder predictors (70%)
• Structure classification (80%)
• Need more computer accessible information on function etc.
• Need fresh approaches
• Need a better understanding of the role of protein disorder, period
• More quantitative approaches
Crude Estimators of What We Know and How We Might Get Better
• Basic knowledge of macromolecular structure (50%)
• PPIs, protein-ligand interactions, ligand view (30%)
• Integrated view of structure as part of a biological continuum of data and associated knowledge (30%)
• Structure prediction from sequence (40%)
• Missing temporal view, alternative views
• Missing robust rules for molecular recognition
• Need better quantification
• Need more structures
Crude Estimators of What We Know and How We Might Get Better
• Inferring function from structure (40%)
• Macromolecular assemblies (40%)
• Docking (30%)
• Rational drug discovery (10%)
• Evolution (10%)
• A combination of improvements
• Hybrid methods
• Better scoring, flexible docking, allostery
• Polypharmacology, network pharmacology
• Accurate proteome coverage
http://itol.embl.de/
Natalie Dawson, unpublished
Example of What Could be Done in Evolution: Structural Domains and the Tree of Life
Example: Structural Mapping and Subsequent Insights from All Biochemical Pathways
Example: Better Understanding of Drug Receptor Interactions
• Tykerb – Breast cancer
• Gleevec – Leukemia, GI cancers
• Nexavar – Kidney and liver cancer
• Staurosporine – natural product – alkaloid – many uses, e.g., antifungal, antihypertensive
Collins and Workman 2006 Nature Chemical Biology 2 689-700
New Challenges
• Effective use of structural information in systems biology – e.g., structural PPIs
• Bridging the biological scales in an easily understood way
• New ways of visualizing and hence thinking about proteins
• Protein design/engineering
The Bigger Picture - Numbers
On the Future of Genomic Data. Science 11 February 2011: vol. 331 no. 6018 728-729
The Bigger Picture – Accuracy
Functional Misannotation
PLoS Comput Biol 2009 5(12): e1000605.
The Bigger Picture – Data Culture
• Data are not available
• Data are undervalued
• Data are stovepiped
• There is a long tail of data which are lost
• Institutional repositories are roach motels
• Data repositories will go like journals
Beyond Data: What is Wrong Today?
What is Wrong Today?
• Formal science communication:
– Occurs too slowly
– Reaches too few people
– Costs too much
– Ignores the data
– Is very hard to reproduce
• Is stuck in the era of the printing press – we need to move Beyond the PDF and use the power of the medium
https://sites.google.com/site/beyondthepdf/
http://www.force11.org
The Research Enterprise
(Diagram labels: Literature, Data, Methods)
The Current Reality
http://www.flickr.com/photos/51282757@N05/5585299226/lightbox/
From Data to Knowledge: Database, Knowledgebase, Wikis, Datapacks, Journals
(Figure labels: Data Only; Data + Some Annotation; Annotation; Data + Some Annotation + Some Integration; Data + Annotation)
PLoS iStructure
0. Full text of PLoS papers stored in a database
1. A link brings up figures from the paper
2. Clicking the paper figure retrieves data from the PDB, which is analyzed
3. A composite view of journal and database content results
4. The composite view has links to pertinent blocks of literature text and back to the PDB
My Dream
1. User reads a paper (one view of the info)
2. Clicks on a figure which can be analyzed
3. Clicking the figure gives a composite database + journal view
4. This takes you to yet more papers or databases
The Knowledge and Data Cycle
It Goes Beyond Data
• It's hard and embarrassing to reproduce your own work
• We have a working prototype using Wings
• I can feel the potential productivity gains
• My students are more doubtful
• It's been a lot of fun and will enable us to improve our processes regardless of the workflow system itself
Yes, The Workflow is Real
Problems with Publishing Workflows
• Workflows are not linear
• Workflow : paper is not 1:1
• Confidentiality
• Peer review
• Infrastructure
• Community acceptance
• Reward system
• No publisher seems willing to touch them
The Final
• Prepare a mini-grant research proposal with the following ingredients:
– Background and Significance
– Preliminary Results
– Proposed Research and Methods
– Expected Outcomes
• The theme is any aspect of the course where you would like to contribute new research ideas and potential outcomes
The Final
• Points (50) will be awarded for:
– Background and Significance – literature coverage, justification of the originality and potential importance of the contribution (20)
– Preliminary Results – anything you can actually accomplish to support the proposal, e.g., pseudocode, computations using existing tools, etc. (15)
– Proposed Research – the credibility and rigor of what you propose (10)
– Expected Outcomes (5)
• There is no length requirement, but I would anticipate ~10 single-spaced, 12pt pages to do the topic justice
• This should not relate to one of your previous assignments
• Feel free to email me to discuss ideas before starting