Challenges of Big Medical Data
Dr. Matthieu-P. Schapranow, Festival of Genomics, London, U.K.
Jan 19, 2016
■ Online: Visit we.analyzegenomes.com for latest research results, tools, and news
■ Offline: Read more about it, e.g. High-Performance In-Memory Genome Data Analysis: How In-Memory Database Technology Accelerates Personalized Medicine, In-Memory Data Management Research, Springer, ISBN: 978-3-319-03034-0, 2014
■ In Person: Join us for “Cebit” March 14-20, 2016 in Hanover, Germany.
Important things first: Where do you find additional information?
Schapranow, Festival of Genomics, Jan 19, 2016
What is the Hasso Plattner Institute, Potsdam, Germany?
■ Founded as a public-private partnership in 1998 in Potsdam near Berlin, Germany
■ Institute belongs to the University of Potsdam
■ Ranked 1st in CHE since 2009
■ 500 B.Sc. and M.Sc. students
■ 10 professors/chairs, 150 PhD students
■ Course of study: IT Systems Engineering
Hasso Plattner Institute Key Facts
Prof. Dr. h.c. Hasso Plattner
■ Research focuses on the technical aspects of enterprise software and design of complex applications
□ In-Memory Data Management for Enterprise Applications
□ Enterprise Application Programming Model
□ Scientific Data Management
□ Human-Centered Software Design and Engineering
■ Industry cooperations, e.g. SAP, Siemens, Audi, and EADS
■ Research cooperations, e.g. Stanford, MIT, and Berkeley
Hasso Plattner Institute Enterprise Platform and Integration Concepts Group
Partner of Stanford Center for Design Research
Partner of MIT in Supply Chain Innovation and CSAIL
Partner at UC Berkeley RAD / AMP Lab
Partner of SAP AG
■ Since 2009 Program Manager E-Health & Life Sciences
■ 2006-2014 Strategic Projects SAP HANA
■ Visiting Scientist at V.A., Boston, MA and Charité, Berlin
■ Software Engineer by training (PhD, M.Sc., B.Sc.)
With whom are you dealing?
Healthcare Interactions in the 21st Century
[Figure: direct and indirect interactions between Clinician, Patient, Researcher, Pharmaceutical Company, Healthcare Providers, Hospital, Research Center, Laboratory, and Patient Advocacy Group]
■ Patients
□ Individual anamnesis, family history, and background
□ Require fast access to individualized therapy
■ Clinicians
□ Identify root and extent of disease using laboratory tests
□ Evaluate therapy alternatives, adapt existing therapy
■ Researchers
□ Conduct laboratory work, e.g. analyze patient samples
□ Create new research findings and come up with treatment alternatives
The Setting Actors in Oncology
■ Motivation: Can we enable patients to:
□ Understand and monitor their diseases to document the impact on their lives,
□ Receive latest information about their (chronic) diseases,
□ Exchange cooperatively with physicians and patients to improve quality of life?
Our Motivation Make Precision Medicine Become Routine
Our Methodology Design Thinking
Desirability
■ Portfolio of integrated services for clinicians, researchers, and patients
■ Include latest treatment options, e.g. most effective therapies
Viability
■ Enable precision medicine also in far-off regions and developing countries
■ Involve worldwide experts (cost-saving)
■ Combine latest international data (publications, annotations, genome data)
Feasibility
■ HiSeq X Ten delivers 30x coverage whole genome of >18k humans / year
■ IMDB enables allele frequency determination of 12B records within <1s
■ Cloud-based data processing services reduce TCO
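The allele frequency determination mentioned above is, at its core, a count-and-divide aggregation over variant records. The following is a minimal sketch of that computation in plain Python, not the in-memory database implementation. The record layout `(sample_id, position, allele)`, the function name, and the toy data are assumptions made for illustration.

```python
from collections import Counter


def allele_frequencies(records):
    """Return {(position, allele): frequency} for each allele seen at a position.

    `records` is an iterable of hypothetical (sample_id, position, allele) tuples.
    """
    allele_counts = Counter()    # observations per (position, allele)
    position_totals = Counter()  # observations per position
    for _sample_id, position, allele in records:
        allele_counts[(position, allele)] += 1
        position_totals[position] += 1
    return {
        key: count / position_totals[key[0]]
        for key, count in allele_counts.items()
    }


# Hypothetical toy data: four samples at position 1001, two at position 2002.
records = [
    ("s1", 1001, "A"),
    ("s2", 1001, "G"),
    ("s3", 1001, "A"),
    ("s4", 1001, "A"),
    ("s1", 2002, "T"),
    ("s2", 2002, "T"),
]
freqs = allele_frequencies(records)
```

A database would run the same grouping as an aggregate query over the stored records; the sketch only shows what is being computed.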
Sequencing vs. Main Memory Costs
[Figure: Costs in USD (logarithmic axis, 0.001 to 10,000) by date (Jan 2002 to Jan 2014). Series: Main Memory Costs per Megabyte, Sequencing Costs per Megabase]
IT Challenges Distributed Heterogeneous Data Sources
■ Human genome/biological data: 600 GB per full genome, 15 PB+ in databases of leading institutes
■ Prescription data: 1.5B records from 10,000 doctors and 10M patients (100 GB)
■ Clinical trials: currently more than 30k recruiting on ClinicalTrials.gov
■ Human proteome: 160M data points (2.4 GB) per sample, >3 TB raw proteome data in ProteomicsDB
■ PubMed database: >23M articles
■ Hospital information systems: often more than 50 GB
■ Medical sensor data: scan of a single organ in 1 s creates 10 GB of raw data
■ Cancer patient records: >160k records at NCT
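To get a feel for the volumes involved, the figures from the slides can be combined in a short back-of-the-envelope calculation. This is a rough sketch that assumes every genome sequenced on one HiSeq X Ten (more than 18k per year, as stated under Feasibility) produces the 600 GB quoted here, and that 1 PB equals 1,000,000 GB. The variable names are illustrative.

```python
GB_PER_GENOME = 600        # from the slide: 600 GB per full genome
GENOMES_PER_YEAR = 18_000  # from the slide: >18k genomes per year
GB_PER_PB = 1_000_000      # decimal petabyte, an assumption of this sketch

raw_gb_per_year = GB_PER_GENOME * GENOMES_PER_YEAR
raw_pb_per_year = raw_gb_per_year / GB_PER_PB

print(f"{raw_gb_per_year:,} GB per year = {raw_pb_per_year} PB per year")
```

Under these assumptions a single sequencing installation produces roughly 10.8 PB of raw data per year, the same order of magnitude as the 15 PB+ held in the databases of leading institutes.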
■ Requirements
□ Real-time data analysis
□ Maintained software
■ Restrictions
□ Data privacy
□ Data locality
□ Volume of “big medical data”
■ Solution?
□ Federated In-Memory Database System vs. Cloud Computing
Software Requirements in Life Sciences
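The data privacy and data locality restrictions suggest a federated style of analysis in which patient-level records stay at each site and only aggregates are exchanged. The sketch below illustrates that general idea only and does not describe the actual system. The `Site` class, the `federated_counts` function, and the diagnosis codes are hypothetical.

```python
from collections import Counter


class Site:
    """Hypothetical site that keeps patient-level records local."""

    def __init__(self, name, diagnoses):
        self.name = name
        self._diagnoses = list(diagnoses)  # raw records never leave the site

    def local_counts(self):
        # Only this aggregate is shared with the coordinator.
        return Counter(self._diagnoses)


def federated_counts(sites):
    """Sum the per-site aggregates without collecting raw records centrally."""
    total = Counter()
    for site in sites:
        total.update(site.local_counts())
    return total


# Hypothetical toy data for two sites.
sites = [
    Site("hospital_a", ["C34", "C50", "C34"]),
    Site("hospital_b", ["C50", "C18"]),
]
totals = federated_counts(sites)
```

The design choice shown here is that computation moves to the data: each site answers with a small summary, so large or sensitive datasets do not have to be copied to a central cloud service.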
Cloud Trends?
Gartner's 2014 Hype Cycle for Emerging Technologies
we.analyzegenomes.com Real-time Analysis of Big Medical Data
In-Memory Database
Extensions for Life Sciences
Data Exchange, App Store
Access Control, Data Protection
Fair Use
Statistical Tools
Real-time Analysis
App-spanning User Profiles
Combined and Linked Data
Genome Data
Cellular Pathways
Genome Metadata
Research Publications
Pipeline and Analysis Models
Drugs and Interactions
Drug Response Analysis
Pathway Topology Analysis
Medical Knowledge Cockpit
Oncolyzer
Clinical Trial Recruitment
Cohort Analysis
...
Indexed Sources
Keep in contact with us!
Hasso Plattner Institute Enterprise Platform & Integration Concepts (EPIC)
Program Manager E-Health Dr. Matthieu-P. Schapranow
August-Bebel-Str. 88 14482 Potsdam, Germany
Dr. Matthieu-P. Schapranow [email protected] http://we.analyzegenomes.com/