Upload
sage-base
View
276
Download
4
Tags:
Embed Size (px)
DESCRIPTION
Stephen Friend, Jan 21, 2012. Fanconi Anemia Research Fund, Houston, TX
Citation preview
Use of Bionetworks to Build Maps of Diseases Moving Beyond
Stephen Friend MD PhD
Sage Bionetworks (Non-Profit Organization) Seattle/ Beijing/ San Francisco
Fanconi Anemia Research Fund Annual Planning Meeting
January 21, 2012
Academia
Biotech
Industry
.
We still consider much clinical research as if we were “hunter gathers”- not sharing
Why consider the fourth paradigm- data intensive science?
thinking beyond the narrative, beyond pathways
advantages of an open innovation compute space
Where are the precompetitive goalposts?
it is more about how than what
Familiar but Incomplete
Reality: Overlapping Pathways
WHY NOT USE “DATA INTENSIVE” SCIENCE
TO BUILD BETTER DISEASE MAPS?
Equipment capable of generating massive amounts of data
“Data Intensive Science”- “Fourth Scientific Paradigm” For building: “Better Maps of Human Disease”
Open Information System
IT Interoperability
Evolving Models hosted in a Compute Space- Knowledge Expert
It is now possible to carry out comprehensive monitoring of many traits at the population level
Monitor disease and molecular traits in populations
Putative causal gene
Disease trait
what will it take to understand disease?
DNA RNA PROTEIN (dark matter)
MOVING BEYOND ALTERED COMPONENT LISTS
2002 Can one build a “causal” model?
******BYRM
Yeast segregants
Synthetic complete medium
Logorithm growth
Gene expression
Yeas
t seg
rega
nts
genotypes
Public databases
Protein-protein interations
Transcription factor binding
sites
Bayesian network
Protein Metabolite interations
Data integration via Bayesian Network
Courtesy of Dr. Jun Zhu
Preliminary Probabalistic Models- Rosetta /Schadt
Gene symbol Gene name Variance of OFPM explained by gene expression*
Mouse model
Source
Zfp90 Zinc finger protein 90 68% tg Constructed using BAC transgenics Gas7 Growth arrest specific 7 68% tg Constructed using BAC transgenics Gpx3 Glutathione peroxidase 3 61% tg Provided by Prof. Oleg
Mirochnitchenko (University of Medicine and Dentistry at New Jersey, NJ) [12]
Lactb Lactamase beta 52% tg Constructed using BAC transgenics Me1 Malic enzyme 1 52% ko Naturally occurring KO Gyk Glycerol kinase 46% ko Provided by Dr. Katrina Dipple
(UCLA) [13] Lpl Lipoprotein lipase 46% ko Provided by Dr. Ira Goldberg
(Columbia University, NY) [11] C3ar1 Complement component
3a receptor 1 46% ko Purchased from Deltagen, CA
Tgfbr2 Transforming growth factor beta receptor 2
39% ko Purchased from Deltagen, CA
Networks facilitate direct identification of genes that are causal for disease
Evolutionarily tolerated weak spots
Nat Genet (2005) 205:370
db/db mouse (p~10E(-30))
AVANDIA in db/db mouse
= up regulated = down regulated
Our ability to integrate compound data into our network analyses
db/db mouse (p~10E(-20) p~10E(-100))
"Genetics of gene expression surveyed in maize, mouse and man." Nature. (2003)
"Variations in DNA elucidate molecular networks that cause disease." Nature. (2008)
"Genetics of gene expression and its effect on disease." Nature. (2008)
"Validation of candidate causal genes for obesity that affect..." Nat Genet. (2009) ….. Plus 10 additional papers in Genome Research, PLoS Genetics, PLoS Comp.Biology, etc
"Identification of pathways for atherosclerosis." Circ Res. (2007)
"Mapping the genetic architecture of gene expression in human liver." PLoS Biol. (2008)
…… Plus 5 additional papers in Genome Res., Genomics, Mamm.Genome
"Integrating genotypic and expression data …for bone traits…" Nat Genet. (2005)
“..approach to identify candidate genes regulating BMD…" J Bone Miner Res. (2009)
"An integrative genomics approach to infer causal associations ...” Nat Genet. (2005)
"Increasing the power to detect causal associations… “PLoS Comput Biol. (2007)
"Integrating large-scale functional genomic data ..." Nat Genet. (2008)
…… Plus 3 additional papers in PLoS Genet., BMC Genet.
d
Metabolic Disease
CVD
Bone
Methods
Extensive Publications now Substantiating Scientific Approach Probabilistic Causal Bionetwork Models
• >80 Publications from Rosetta Genetics
50 network papers http://sagebase.org/research/resources.php
List of Influential Papers in Network Modeling
(INTEGRATED BIONETWORK)
Recognition that the benefits of bionetwork based molecular models of diseases are powerful but that they require significant resources
Appreciation that it will require decades of evolving representations as real complexity emerges and needs to be integrated with therapeutic interventions
Sage Mission
Sage Bionetworks is a non-profit organization with a vision to create a “commons” where integrative bionetworks are evolved by
contributor scientists with a shared vision to accelerate the elimination of human disease
Sagebase.org
Data Repository
Discovery Platform
Building Disease Maps
Commons Pilots
Sage Bionetworks Collaborators
Pharma Partners Merck, Pfizer, Takeda, Astra Zeneca,
Roche, Amgen
24
Foundations CHDI, Gates Foundation
Government NIH, LSDF
Academic Levy (Framingham) Rosengren (Lund) Krauss (CHORI)
Federation Ideker, Califarno, Butte, Schadt
RULES GOVERN
Engaging Communities of Interest
PLAT
FORM
NEW
MAP
S NEW MAPS
Disease Map and Tool Users- ( Scientists, Industry, Foundations, Regulators...)
PLATFORM Sage Platform and Infrastructure Builders-
( Academic Biotech and Industry IT Partners...)
RULES AND GOVERNANCE Data Sharing Barrier Breakers-
(Patients Advocates, Governance and Policy Makers, Funders...)
NEW TOOLS Data Tool and Disease Map Generators- (Global coherent data sets, Cytoscape,
Clinical Trialists, Industrial Trialists, CROs…)
PILOTS= PROJECTS FOR COMMONS Data Sharing Commons Pilots-
(Federation, CCSB, Inspire2Live....)
27
Example 1: Breast Cancer
Zhang B et al., manuscript
Bayesian Network
Survival Analysis
Coexpression Networks Module combination
Partition BN
4 Public Breast Cancer Datasets
NKI: van de Vijver et al. A gene-expression signature as a predictor of survival in breast cancer. N Engl J Med. 2002 Dec 19;347(25):1999-2009.
Wang Y et al. Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer. Lancet. 2005 Feb 19-25;365(9460):671-9.
Miller: Pawitan Y et al. Gene expression profiling spares early breast cancer patients from adjuvant therapy: derived and validated in two population-based cohorts. Breast Cancer Res. 2005;7(6):R953-64.
Christos: Sotiriou C et al.. Gene expression profiling in breast cancer: understanding the molecular basis of histologic grade to improve prognosis. J Natl Cancer Inst. 2006 Feb 15;98(4):262-72.
295 samples
286 samples
159 samples
189 samples
Generation of Co-expression & Bayesian Networks from published Breast Cancer Studies
Recovery of EGFR and Her2 oncoproteins downstream pathways by super modules
Comparison of Super-modules with EGFR and Her2 signaling and resistance pathways
31
Key Driver Analysis • Identify key regulators for a list of genes h and a network N • Check the enrichment of h in the downstream of each node in N • The nodes significantly enriched for h are the candidate drivers
32
A) Cell Cycle (blue)
C) Pre-mRNA Processing (brown)
B) Chromatin modification (black)
D) mRNA Processing (red)
Global driver
Global driver & RNAi validation
Signaling between Super Modules
(View Poster presented by Bin Zhang)
Clinical Trial Comparator Arm Partnership (CTCAP)
Description: Collate, Annotate, Curate and Host Clinical Trial Data with Genomic Information from the Comparator Arms of Industry and Foundation Sponsored Clinical Trials: Building a Site for Sharing Data and Models to evolve better Disease Maps.
Public-Private Partnership of leading pharmaceutical companies, clinical trial groups and researchers.
Neutral Conveners: Sage Bionetworks and Genetic Alliance [nonprofits].
Initiative to share existing trial data (molecular and clinical) from non-proprietary comparator and placebo arms to create powerful new tool for drug development.
Example 3: The Sage Federation
• Founding Lab Groups
– Seattle- Sage Bionetworks – New York- Columbia: Andrea Califano – Palo Alto- Stanford: Atul Butte – San Diego- UCSD: Trey Ideker – San Francisco: UCSF/Sage: Eric Schadt
– NEW LABS: Gary Nolan Stanford/ David Haussler UCSC
• Initial Projects – Aging – Diabetes – Warburg
• Goals: Share all datasets, tools, models Develop interoperability for human data
Federation’s Genome-wide Network and Modeling Approach
Califano group at Columbia Sage Bionetworks Butte group at Stanford
Genes Associated with Poor Prognosis are disproportionally found among the networks regulating the “glycolysis” Genes
Size of the node proportional to -log10 P value for recurrence free survival.
>5 fold enrichment of recurrence free prognostic genes with the Glycolysis BN module than random selection (p<1e-100)
P-Value<0.005
Inferred regulatory module for GGMSE Inferred regulatory module for Oxidative Phosphorylation and Sphingolipid
Metabolism genes
Why not share clinical /genomic data and model building in the ways currently used by the software industry (power of tracking workflows and versioning
Synapse as a Github for building models of disease
Evolution of a Software Project
Biology Tools Support Collaboration
Potential Supporting Technologies
Taverna
Addama
tranSMART
Watch What I Do, Not What I Say Reduce, Reuse, Recycle
Most of the People You Need to Work with Don’t Work with You
My Other Computer is Amazon
sage bionetworks synapse project
Arch2POCM
Restructuring the “Competitive” Phase of Drug Discovery
What is the described problem? • Regulatory hurdles too high? • Low hanging fruit picked? • Payers unwilling to pay? • Genome has not delivered? • Valley of death? • Companies not large enough to execute on strategy? • Internal research costs too high? • Clinical trials in developed countries too expensive?
In fact, all are true but none is the real problem
What is the real problem?
We need to rebuild the drug discovery process so that we better understand disease biology before testing proprietary compounds on sick patients
The solution – Arch2POCM 1. Create an Archipelago of clinicians and scientists from public
and private sectors to take projects from ideas to Proof of Clinical Mechanism (POCM)
2. Arch2POCM is a collaborative, data-sharing network of scientists, whose drug discovery objective is to use robust compounds against new targets to disentangle the complexity of human biology, not to create a medicine
3. Success? • A compound that provides proof of concept for a novel target-
allowing companies to use this common information to compete, with dramatic increased chances of success
• Culling targets with doomed mechanisms before multiple companies waste money exploring them - at $50M a pop
Why data sharing through to Phase IIb?
• Most rapidly reveals limitations and opportunities associated with the target
• Increases probability of success for internal proprietary programs
• Scientific decisions are not influenced by market considerations or biased internal thinking
• Target mechanism is only properly tested at Phase IIb
Why no IP on “Common Stream” compounds?
• Allows multiple groups to test diverse indications without funds from Arch2POCM- crowdsourcing drug discovery
• Broader and faster data dissemination
• Far fewer legal agreements to negotiate
• Generates “freedom to operate” on target because there are no patent thickets to wade through
• Efficient way to access world’s top scientists and doctors without hassle
Existing Team Ready to Execute
APRIL 20-‐21
SAGE BIONETWORKS
SAGE RESEARCH FEDERATION CTCAP CITIZEN LED PROJECTS SCIENTIST LED PROJECTS
Arch2POCM “MedXChange-‐Bridge”
WEBSITES/COMMONS CONGRESS
SYNAPSE CONSENTS
Jan 12-‐ APR 13
2012-13: Year of Learning from our Pilots
IMPACT ON PATIENTS
Actionable Disease Bionetwork Models
Open Medical Information Systems
Democratization of Science
Networked Science Approaches
OPPORTUNITIES FOR FANCONI’S COMMUNITY
Evolve Sharing of Data sets, Tools and Models
Joining Synapse Communities
Buiding your own “Federation Projects”
Paticipate in Sage Commons Congress April 20-21
Joining Arch2POCM
Change reward structures for sharing data (patients and academics)