Ferran Sanz – GRIB (IMIM-UPF)
Ferran Sanz
Research Programme on Biomedical Informatics (GRIB), Institut Hospital del Mar d’Investigacions Mèdiques (IMIM)
Universitat Pompeu Fabra, Barcelona
Biomedical big data: massive data integration for research
Clinical Data
Biomedical imaging
‘omics & Systems Biology
Drugs & other chemicals
Biomedical literature
Biomedical Big Data
HEALTH CARE PRACTICE
Biomedical Big Data: Clinical Data
Millions of EHRs that can be reused for research
Biomedical Big Data: Biomedical imaging
Worldwide estimated medical imaging data in 2020: 35 ZB
Source: S. Sarcar, GE Healthcare. http://es.slideshare.net/sarcar/data-explosion-in-medical-imaging
BIOMEDICAL RESEARCH
Biomedical Big Data: ‘omics & Systems Biology
In May 2015, the European Genome-phenome Archive (EGA) stored 1.8 PB of human ‘omics data
Biomedical Big Data: Drugs & other chemicals
ChEMBL: 11K targets; 1.5M compounds; 14M activities
Biomedical Big Data: Biomedical literature
More than 20 million scientific papers are referenced in PubMed®, and more than 700,000 are added every year
Biomedical Big Data
Health information in social media (Web 2.0) should not be forgotten
More than 80% of the available digital information is unstructured and is in multiple languages
INTEGRATIVE BIOINFORMATICS
Integration of heterogeneous biomedical information to gain a more complete and powerful view of diseases and therapeutics
Exploitation of the Biomedical Big Data in pharmacovigilance
EHR Db i, EHR Db ii, EHR Db iii, EHR Db iv
Data extraction and integration
Signal detection
Signal substantiation
In silico pharmacology
Pharmacoepidemiological analysis
Text mining
Standardization & terminology mapping
Bioinformatics
From Bauer-Mehren A, Bundschus M, Rautschka M, Mayer MA, Sanz F, Furlong LI. PLoS One 2011; 6(6): e20284
Knowledge discovery by information linkage
Workflows for chemo-bioinformatic signal substantiation
Drug used
Clinical adverse event
Proteins interacted
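The linkage idea behind these workflows can be sketched in a few lines of Python: a drug-event signal gains mechanistic support when a protein the drug interacts with is also associated with the adverse event. This is only an illustration. The identifiers, both lookup tables and the function name `substantiate_signal` are hypothetical placeholders, not data or code from the workflows themselves.

```python
from typing import Dict, Set

# Hypothetical lookup tables (illustrative placeholders, not curated data).
# Proteins each drug is known to interact with:
DRUG_TARGETS: Dict[str, Set[str]] = {
    "drug_A": {"P1", "P2", "P3"},
    "drug_B": {"P4"},
}
# Proteins associated with each clinical adverse event:
EVENT_PROTEINS: Dict[str, Set[str]] = {
    "event_X": {"P2", "P5"},
    "event_Y": {"P6"},
}


def substantiate_signal(drug: str, event: str) -> Set[str]:
    """Return the proteins that link a drug to an adverse event.

    An empty set means no shared protein was found in the lookup tables,
    which is absence of evidence, not evidence of absence.
    """
    targets = DRUG_TARGETS.get(drug, set())
    associated = EVENT_PROTEINS.get(event, set())
    return targets & associated


if __name__ == "__main__":
    print(substantiate_signal("drug_A", "event_X"))
```

A real workflow would presumably draw the two tables from integrated sources such as drug-target and gene-disease databases, which is where the standardization and terminology mapping steps matter.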
Information on the genetic basis of diseases
Data Silos
Different Standards
Large Volume
Need for resources that gather, standardize and integrate information on the genetic basis of diseases
• DisGeNET: a comprehensive resource on gene-disease associations (GDAs)
• Integrates information from publicly available databases and from the literature by text mining
• DisGeNET v4.0 (April 2016) contains 429,036 GDAs involving 17,381 genes and more than 15,000 diseases and phenotypes
• Freely available at: http://www.disgenet.org
DisGeNET version 4.0: Data sources
Curated: UniProt, CTD, ClinVar, Orphanet, GWAS Catalog
Predicted: RGD, MGD, CTD
Literature: GAD, LHGDN, BeFree (Bio-Entity Finder and Relation Extraction)
DisGeNET version 4.0: Statistics
Source Genes Diseases Associations
Curated 7,362 7,607 32,834
Predicted 2,743 2,064 10,264
Literature 16,141 11,447 403,925
All 17,381 15,093 429,036
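The "All" row is smaller than the sum of the three source rows because the same gene, disease or association can be reported by more than one source type. A minimal sketch of that de-duplicating merge follows. It uses invented toy associations and hypothetical helper names (`summarize`, `build_table`), not DisGeNET data or code.

```python
from typing import Dict, Set, Tuple

Association = Tuple[str, str]  # (gene, disease)

# Toy, invented associations grouped by source type (not DisGeNET data).
SOURCES: Dict[str, Set[Association]] = {
    "Curated": {("G1", "D1"), ("G2", "D1")},
    "Predicted": {("G2", "D2")},
    "Literature": {("G1", "D1"), ("G3", "D2")},
}


def summarize(assocs: Set[Association]) -> Tuple[int, int, int]:
    """Count unique genes, unique diseases and unique associations."""
    genes = {gene for gene, _ in assocs}
    diseases = {disease for _, disease in assocs}
    return len(genes), len(diseases), len(assocs)


def build_table(sources: Dict[str, Set[Association]]) -> Dict[str, Tuple[int, int, int]]:
    """Summarize each source type, then the union of all of them."""
    table = {name: summarize(assocs) for name, assocs in sources.items()}
    merged: Set[Association] = set().union(*sources.values())
    table["All"] = summarize(merged)  # duplicates across sources count once
    return table


TABLE = build_table(SOURCES)

if __name__ == "__main__":
    for name, (n_genes, n_diseases, n_assocs) in TABLE.items():
        print(f"{name:<10} {n_genes:>5} {n_diseases:>8} {n_assocs:>12}")
```

In the toy data the association ("G1", "D1") appears in both the curated and the literature sets, so the merged count is one less than the sum of the per-source counts.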
DisGeNET version 4.0: Tools
Web interface
Network Analysis
Semantic Web
Programmatic Access
R package
Federated queries
DisGeNET version 4.0: Top scoring genes for Wilson disease
Gene     Number of diseases   DisGeNET score   DSI     DPI     Number of PMIDs   Number of SNPs
ATP7B    57                   0.819            0.596   0.592   234               99
ANXA5    129                  0.2              0.505   0.741   1                 0
PRNP     205                  0.128            0.468   0.962   4                 1
CP       114                  0.126            0.532   0.704   26                0
LOX      141                  0.123            0.498   0.778   2                 0
LOXL2    48                   0.123            0.610   0.481   1                 0
APOE     729                  0.122            0.333   1       2                 0
TNF      1524                 0.120            0.247   1       2                 0
IL6      1260                 0.120            0.268   1       2                 0
NDUFB7   1                    0.120            1       0.148   1                 0
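DSI and DPI are DisGeNET's disease specificity and disease pleiotropy indices. The sketch below assumes the definitions given in the DisGeNET documentation: DSI = log2(Nd/NT) / log2(1/NT), where Nd is the number of diseases associated with the gene and NT the total number of diseases, and DPI = Ndc/NTc, where Ndc is the number of disease classes covered by the gene's diseases and NTc the total number of classes. The totals are left as parameters because the slides do not state the values used, and the function names are illustrative.

```python
import math


def disease_specificity_index(n_diseases: int, total_diseases: int) -> float:
    """DSI under the assumed definition: log2(Nd/NT) / log2(1/NT).

    Returns 1.0 for a gene linked to a single disease and approaches 0
    as the gene is linked to every disease in the resource.
    """
    if total_diseases < 2 or not 1 <= n_diseases <= total_diseases:
        raise ValueError("need 1 <= n_diseases <= total_diseases and total_diseases >= 2")
    return math.log2(n_diseases / total_diseases) / math.log2(1 / total_diseases)


def disease_pleiotropy_index(n_classes: int, total_classes: int) -> float:
    """DPI under the assumed definition: fraction of disease classes covered."""
    if total_classes < 1 or not 0 <= n_classes <= total_classes:
        raise ValueError("need 0 <= n_classes <= total_classes and total_classes >= 1")
    return n_classes / total_classes


if __name__ == "__main__":
    # Totals here are arbitrary illustration values, not DisGeNET's.
    print(disease_specificity_index(1, 1000))
    print(disease_pleiotropy_index(5, 20))
```

Under this definition a gene linked to exactly one disease always has a DSI of 1, whatever the total, which agrees with the NDUFB7 row (1 disease, DSI 1).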
DisGeNET version 4.0: Top scoring genes for Major Depression
Gene     Number of diseases   DisGeNET score   DSI     DPI     Number of PMIDs   Number of SNPs
SLC6A4   374                  0.236            0.411   0.852   157               5
TPH2     89                   0.211            0.548   0.667   26                1
HTR2A    222                  0.155            0.463   0.778   45                17
PCLO     20                   0.130            0.696   0.333   12                5
CRHR1    118                  0.127            0.531   0.778   11                11
CYP2D6   316                  0.127            0.428   0.852   11                2
FKBP5    78                   0.126            0.563   0.814   16                1
SP4      16                   0.125            0.739   0.296   3                 1
GRM7     32                   0.123            0.666   0.444   5                 1
GNAI3    7                    0.122            0.812   0.296   2                 1
Acknowledgements
Integrative Biomedical Informatics Group (http://grib.upf.edu):
• L.I. Furlong
• A. Bauer-Mehren (Roche)
• A. Bravo
• A. Gutiérrez
• J. Piñero
• N. Queralt
Thanks for your attention!