eTRIKS: A Knowledge Management Platform for Translational Research
Ian Dix, AstraZeneca R&D
Yike Guo, Imperial College London
On Behalf of eTRIKS
Challenge of Drug Development: Complex Disease Phenotypes
How do we stratify these complex phenotypes?
WGS RNAseq Mass Spec
Imaging RT Sensing
Next Generation Platforms -> Data Explosion
Challenges in running internal biomarker programs
• Study population, design & data collection are defined by the clinical development program
  – Typically not optimised for biomarker discovery
• Cost of running sufficiently powered Longitudinal Observational Studies designed for biomarker discovery (and validation) is prohibitive
• Collaborate…
Industry
Academia
Public Private Consortium
Sharing costs enables bigger studies
Pros:
• Shared costs: increase in scale – breadth & depth
• Diverse specialisms enabling increased complexity and insight
• Multiple centres: improved recruitment
Cons:
• IP complexity
• Coordination
Example Translational Study Consortium
Stratified Medicine:
• GAUCHERITE Consortium
• Stop HCV
• MATURA
• RA-Map
• COPD-Map
22 CTMM research projects are active, involving a total of 119 partners and a research budget of 302.7 M€.
Translational Research Information Flow
eCRF software
Data
Biobank
Samples
Analytical Labs
Databases
External Analytics API
External Visualisation API
Collaboration and KM platform
Clinical Sites
LIMS / Sample Tracker
Ontology Management
Service
Analytical Workflows
Central Cloud-based Platform
Challenge 1: The Science Priority
Science
Infrastructure
Data Infrastructure costs are consistently underestimated
Who Pays?
Challenge 2: Data Collaboration
Sharing data securely across the individual organisations…
Org 1, Org 2, Org 3, Org 4, … Org n
What tools and standards?
Challenge 3: Fixed Time Line
The value of data is long lived, virtual organisations are not:
Project Consortium
Org 1, Org 2, Org 3, Org 4, … Org n
5 Years
Who stewards the data when the consortium ends?
So in IMI what are we doing about it?
Translational Research Information and Knowledge Management Service
2008 2009 2010 2011 2012 2013
IMI Initiates
EFPIA KM Group advises on need for KM in IMI
GSK/JnJ/ICL Pilot tranSMART
RDG accept Proposal for KM call
eTRIKS Call Published
EoI Selected
FPP Agreed
eTRIKS Initiates (Q4)
First Project Onboarded
Objectives
• Objectives:
  – Provision of a KM Service to support Private/Public Translational Research (TR) in IMI
  – Single access point to standardised, curated IMI TR study information, along with historic translational studies relevant to IMI projects
  – Establishing a common, open source, interoperable TR platform, based on open, agreed standards, across the IMI TR projects
  – Development of an active TR analytics & informatics community across IMI
• Budget: €23.79m for 5 years (Oct 2012 to Sept 2017)
• Members:
  – 10 Pharma, 3 Academic, 1 Standards, 2 Commercial Suppliers
The Consortium…
10 Pharma + 6 Partners
Business Logic
• Reduced risk of data loss post IMI
• Improved operational efficiency for TR PPPs
• Improved access to secure data generated in TR PPPs
• Improved access to relevant historic TR study data
• Increased innovation in analytics tools and applications
Work Packages
WP Number   WP Name                         WP Leads
WP1         Platform Deployment             CNRS/JPNV
WP2         Platform Development            Imperial/Sanofi/Pfizer
WP3         Data Standards                  Roche/IDBS/Merck/CDISC
WP4         Curation and Analysis           Luxembourg/Sanofi
WP5         Management and Sustainability   AstraZeneca/BioSci Consulting
WP6         Community and Outreach          Janssen/BioSci Consulting
WP7         Ethics                          GSK/CNRS/Bayer/Sanofi
Engagement and Governance
Demand
Execution
Progress Reports
IMI Client
Project
Demand 1
Demand 2
Demand 3
Platform Deployment
Platform Development
Data Standards
Curation and Analysis
Community and Outreach
Ethics
eTRIKS Resources
Decision
Delivery Packages
Deliveries
Progress Updates
Project Input
3-6 Month Cycle
Projects Engaging eTRIKS
Oncology Safety
Inflammation: RA-Map
Infection: ND4BB
Core Technology
Clinical Cohort Phenotypic Data
Demographics
Clinical Observations
Clinical Trial Outcomes
Adverse Events
High Content Biomarker Data
Gene Expression
Genotyping
Metabolomic
PK/PD Markers
Reference Data
Literature
Pathway Data
Gene Metadata
tranSMART
ETL
Mining
GPLv3
eTRIKS Platform
Collaboration Platform
• Collaboration
• Research process
• IP capture and management
• Secure access
Analytics Environment
• Access to analytics tools
• Open API for public and commercial software to plug in
TR Knowledge Hub
• Cloud Infrastructure
• Load procedures
• ‘Big Data’ storage
• Ontology management
Study Book / Visual Research
Access Management
Ontology Management
Scientific Data Architecture
Study Management
Global non-profit organization devoted to realizing the promise of translational biomedical research through development of the tranSMART knowledge management platform.
Goals
1. Establish and sustain tranSMART as the preferred data sharing and analytics platform for translational biomedical research.
2. Link academic, non-profit and corporate research communities for collaborative research facilitated by tranSMART.
3. Align and grow a vibrant developer network around the scientific goals of the tranSMART community.
4. Reduce barriers to entry through use of advanced technologies and an active marketplace.
Community: Large scale KM consortia, AMCs, NFPs, Pharma, Regulators, Biotech service suppliers
tranSMART Foundation
http://www.transmartfoundation.org
What have we done in the first 5 months?
Progress…
1. Becoming Functional…
2. Reinforcing the Foundation
• Working in partnership with TF to address core issues in tranSMART (2/6 FTEs)
• tranSMART 1.1 (June)
  – first stable PostgreSQL release (UMichigan) *
  – ETL procedures for PostgreSQL (Imperial/Recombinant) *
  – decoupling of i2b2 (TraIT/Imperial) *
  – plugin architecture for:
    • Ontology data (TraIT/Hyve)
    • Clinical data (Imperial)
    • Data types for higher dimensional data
    • API for CRC data
• tranSMART 1.2 (October), to be defined: unit tests, API for analysis in Dataset Explorer, API for high dimensional data, GUI improvements, search improvements…
* Complete, currently in testing
3. Public eTRIKS Server
• Aim:
  – eTRIKS server enabling access to public studies of interest to the eTRIKS community
• Progress:
  • Setup
    – tranSMART PostgreSQL dev environment in place
    – Training and awareness of existing ETL processes for tranSMART
  • Population of Search Tool with EBI Atlas fold change data
    – ETL pipeline built for populating the tranSMART Search tool
    – ~2000 human, primate, mouse and rat Atlas studies selected
    – Beta version May 2013, with QC/QA prior to production release in June/July
  • Population of Dataset Explorer Tool with subset of GEO & ArrayExpress data
    – Selection of UBIOPRED-relevant studies from GEO/ArrayExpress
    – Adaptation of Sanofi Dataset Explorer curation tool for PostgreSQL
    – Initial data sets being loaded now: production release June/July
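The extract, select, QC and load steps described for the Search tool can be illustrated with a minimal sketch. It is not the eTRIKS or tranSMART ETL code: the column names, the `fold_change` table, the study identifiers and the sample values are all hypothetical, and an in-memory SQLite database stands in for the PostgreSQL instance so that the example runs on its own.

```python
import csv
import io
import sqlite3

# Hypothetical input: one row per (study, organism, gene, fold change).
# Identifiers and values are made up for illustration only.
SAMPLE_TSV = """study_id\torganism\tgene\tfold_change
E-HYP-1\thuman\tIL13\t2.4
E-HYP-2\tmouse\tIl13\t-1.8
E-HYP-3\tzebrafish\til13\t3.1
E-HYP-4\trat\tIl13\tnot_a_number
"""

# Organism groups named on the slide; the exact labels are an assumption.
SELECTED_ORGANISMS = {"human", "primate", "mouse", "rat"}


def extract(text):
    """Parse tab-separated text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def transform(rows):
    """Keep selected organisms and rows whose fold change is numeric."""
    kept, rejected = [], []
    for row in rows:
        if row["organism"] not in SELECTED_ORGANISMS:
            rejected.append((row["study_id"], "organism not selected"))
            continue
        try:
            value = float(row["fold_change"])
        except ValueError:
            rejected.append((row["study_id"], "fold change is not numeric"))
            continue
        kept.append((row["study_id"], row["organism"], row["gene"], value))
    return kept, rejected


def load(conn, records):
    """Insert cleaned records into a hypothetical fold_change table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fold_change ("
        "study_id TEXT, organism TEXT, gene TEXT, fold_change REAL)"
    )
    conn.executemany("INSERT INTO fold_change VALUES (?, ?, ?, ?)", records)
    conn.commit()


def run_pipeline(text):
    """Run extract, transform and load; return the connection and QC rejects."""
    conn = sqlite3.connect(":memory:")  # stand-in for a PostgreSQL database
    kept, rejected = transform(extract(text))
    load(conn, kept)
    return conn, rejected


if __name__ == "__main__":
    conn, rejected = run_pipeline(SAMPLE_TSV)
    count = conn.execute("SELECT COUNT(*) FROM fold_change").fetchone()[0]
    print(count, "rows loaded;", len(rejected), "rows rejected")
```

Returning the rejected rows alongside the loaded data keeps a QC trail that could be reviewed before a production release, which is the role the QC/QA step plays on the slide.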
4. First Supported Project: UBIOPRED
Key Facts
• Identification of novel biomarkers of severe asthma
• 40 Partners (20 Academic, 10 SME, 10 EFPIA)
• Novel Cohort and Biobank of Severe Asthma Patients
• Cross-Sectional Comparative Study with Longitudinal Follow-up
• Profiling of Genomics, Proteomics, Lipidomics, Breathomics
• Matching with in-vivo and in-vitro models
04/07/23, R&D IT External Innovation
1,025 Patients 175,000 Samples 3,000,000 Data Points
• Systems Medicine approach to identify ‘handprint’ biomarkers across data
http://www.ubiopred.european-lung-foundation.org/
4. eTRIKS Support of UBIOPRED
Aim (to date): Stand up a secure UBIOPRED server, load clinical and first omic data sets
Progress:
• Dedicated U-BIOPRED server set up at ICL running PostgreSQL tranSMART (TM)
• tranSMART tutorials circulated to UBIOPRED project
• Anonymised clinical data loaded: 250 patients to date (baseline visit)
• Urine and sputum lipidomic data loaded (eicosanoid panel)
Next sprints:
• Animal model (house dust mite-induced asthma, mouse) data
• Robust curation methodology to be developed
• Research into Longitudinal Data Model (load & querying)
• Loading of omics data
Immediate Next Steps
• Onboard 2-3 further projects
• Scope out Animal Model requirements across projects
• Work with Foundation to deliver TM1.1/1.2
• Release the public eTRIKS server: 2000+ studies
Ideal Future State (IMI and beyond)
• Accessible Common Infrastructure
• Federation of searchable archives of translational study information
• Ability to transfer data securely between organisations within consortia
• Healthy ecosystem of commercial and NFP service providers supporting projects and institutions
• Large and diverse innovative analytics & visualisation toolbox
Medical Centres
Analytics Specialists
Disease Specialists
KM Support
CRO
Assay Specialists
Regulatory authorities
Patient organization
PPP
The new reality of Drug Research
1. Ensure the legacy of project data/results
2. Facilitate dataset integration
3. Increase operational efficiency
4. Establish a common set of standards
www.eTRIKS.org
LinkedIn Discussion Group: eTRIKS
Twitter: @etriks1