http://www.semoss.org
http://youtube.com/user/semossanalytics
http://twitter.com/semossanalytics
Agenda
Shark Tank Overview
SEMOSS
HHS Ignite Use Case
Demo
HHS Ignite is an “incubator for new ideas” run out of the HHS IDEA Lab.
NIAID SEB Innovation Challenge
HHS Ignite Innovation Program
Expansion
Jan ‘14
Jul ‘14 Aug ‘14 Sep ‘14
Nov ‘14
The Evolution of HHS Ignite
Phase 1: HHS Ignite boot camp
Phase 2: NIH Interviews & Pilot Development
Phase 3: HHS Ignite Shark Tank
Vincent Munster, PhD
Peter Jahrling, PhD
NIH SEMOSS team with HHS Deputy Secretary Bill Corr & HHS CTO Bryan Sivak
Responding to an informal request for innovation ideas from the NIH’s National Institute of Allergy and Infectious Diseases (NIAID), a small Deloitte team submitted a written proposal. At the client’s suggestion, the proposal was submitted and the team was selected to compete in the 2014 HHS IDEA Lab Ignite innovation tournament. During the 3-month pilot, the Deloitte team engaged 10 NIAID customers and created a functional proof-of-concept solution for 2 intramural scientists. The Deloitte/NIAID team successfully presented their pitch to a panel of relevant federal executives and the HHS CTO during the concluding Shark Tank on September 30, 2014.
SEMOSS Evolution
Solution History
SEMOSS is a result of several years of federal investment in federated, semantic web technology.
In 2010, the Deputy Chief Management Officer (DCMO) of the Department of Defense began experimenting with Semantic Web technology. The Military Health System (MHS), with help from Deloitte, created a graph-based toolset for multi-dimensional analysis of disparate data sources to determine investment sequencing for the MHS IT portfolio. After MHS presented its solution to DCMO, a joint investment was made to fund a similar tool utilizing the Semantic Web. The guiding principles of the tool were that it must be standards-based, allow integrating data from multiple sources, and adopt visualizations and analytics on an as-needed basis. This investment spawned SEMOSS.
Solution Evolution
2010-2011: Excel / Tableau
• Analysis: excessive time spent on data preparation; analysis and visualization constraints
• Data Sources: 1-2
• Stakeholders: 1
• Knowledge Exploration: minimal; repetitive visualizations
• Issues: focus on visualization; not malleable
2011-2012: Neo4J
• Analysis: limited to single database; answers modeled as graph traversals demonstrations
• Data Sources: 3-4
• Stakeholders: 2-3
• Knowledge Exploration: single dimensional; difficult to customize
• Issues: proprietary; long cycle times
2012-2013: SEMOSS
• Analysis: integrated knowledge analytics environment; transitive across databases; collaboration; answers modeled as reports
• Data Sources: no limit
• Stakeholders: no limit
• Knowledge Exploration: multi-dimensional; self service
• Issues: none, as product created to meet client needs
What Does Federated Analytics Mean?
Data
• Elastic data integration with more than 6 connectors, including Excel/CSV, NLP, RDBMS, Cloud Aware Data sources
• Context aware data that can link across databases
• W3C Standards: RDF, SPARQL
Viz.
• Rich library of visualizations
  • Parallel Coordinates
  • Excel style charting
  • Network Viz.
  • Heat-maps
• Extensibility to adopt any visualization
• Overlay visualizations to see overlaps
Analytics
• Graph Algorithms
• Optimization: Linear and Non-Linear algorithms
• Statistical algorithms
• Equation Solving
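As a rough illustration of "context aware data that can link across databases", the sketch below keeps two independent stores of subject-predicate-object triples (the shape RDF uses) and answers a question by following a shared identifier from one store into the other. It is a standard-library stand-in, not SEMOSS code: the store names, the predicates `associatedWith` and `mentions`, and every identifier are invented for illustration, and a real deployment would presumably express the same traversal in SPARQL.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object), as in RDF

# Two separate stores that are never merged. All identifiers are hypothetical.
GENE_STORE: List[Triple] = [
    ("gene:G1", "associatedWith", "disease:D1"),
    ("gene:G2", "associatedWith", "disease:D2"),
]
PUBLICATION_STORE: List[Triple] = [
    ("pub:P1", "mentions", "disease:D1"),
    ("pub:P2", "mentions", "disease:D1"),
    ("pub:P3", "mentions", "disease:D3"),
]


def match(
    store: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples that fit the pattern; None acts as a wildcard."""
    for triple in store:
        if all(want is None or want == got for want, got in zip((s, p, o), triple)):
            yield triple


def publications_for_gene(gene: str) -> List[str]:
    """Follow gene -> disease in one store, then disease -> publication in the other."""
    pubs = set()
    for _, _, disease in match(GENE_STORE, s=gene, p="associatedWith"):
        for pub, _, _ in match(PUBLICATION_STORE, p="mentions", o=disease):
            pubs.add(pub)
    return sorted(pubs)


print(publications_for_gene("gene:G1"))
```

The link works only because both stores use the same identifier for the disease; agreeing on shared identifiers is the part that standards such as RDF are meant to help with.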
Federate Data
Discover Insights
Perform Analysis
Visualize Decisions
Share Knowledge
Types of Visualizations Included in SEMOSS
HHS IGNITE INNOVATION PROGRAM USE CASE
Diverse Researchers across HHS
Dawei Lin, PhD, NIH: Computer Modeling
Vincent Munster, PhD, NIH: Infectious Diseases
Marie Parker, NIH: Research Initiatives
Susanna Visser, DrPH, CDC: ADHD
Common Research Goals
Data Access
Robust Analysis
Collaboration
Technology Barriers
Big Data
Inaccessibility
Isolated Analysis
Collaboration Barriers
Multiple Sources Integration Challenges
Dr. Munster’s Research
Vincent Munster, PhD, NIH: Infectious Diseases
Middle East Respiratory Syndrome (MERS)
The platform allows me to analyze and grasp large seemingly incomprehensible datasets.
- Vincent Munster, PhD
Dr. Munster’s Research Challenges
1) Diseases
2) Articles
3) Collaborators
Private Data
PubMed
FAERS
DisGeNet
PharmGKB
HGNC
Our Tested Solution
Use Case Metamodel
Data sources: CTD, PharmGKB, PubMed, DisGeNet, DrugBank, HGNC, PubChem, Private Datasets
Concepts: Gene, Publication, Author, Chemical, Disease, Drug, Drug Component, Researcher Datasets, Pathway, Molecular Function, Biological Process, Chromosome, Cell Component
• No single database has exhaustive information. Multiple connections ensure complete data.
• The data sources above reflect the information requested by our customer. This solution can be easily customized for any researcher.
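The point that no single database is exhaustive can be sketched with a small merge that records which source contributed each gene-disease link. This is a minimal illustration under stated assumptions: the source names (`source_a`, `source_b`, `source_c`), the gene and disease IDs, and the function `merge_with_provenance` are all hypothetical and do not reflect the actual contents of any database named on this slide.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

Pair = Tuple[str, str]  # (gene_id, disease_id)

# Toy extracts keyed by source; every ID here is made up.
SOURCES: Dict[str, Set[Pair]] = {
    "source_a": {("G1", "D1"), ("G2", "D2")},
    "source_b": {("G1", "D1"), ("G1", "D3")},
    "source_c": {("G3", "D2")},
}


def merge_with_provenance(sources: Dict[str, Set[Pair]]) -> Dict[Pair, Set[str]]:
    """Union all associations, remembering which sources reported each one."""
    merged: Dict[Pair, Set[str]] = defaultdict(set)
    for name, pairs in sources.items():
        for pair in pairs:
            merged[pair].add(name)
    return dict(merged)


merged = merge_with_provenance(SOURCES)
for pair in sorted(merged):
    print(pair, sorted(merged[pair]))
```

In this toy data the merged view holds four associations while each individual source holds at most two, and the provenance sets show where sources agree.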
HHS IGNITE DEMO
Appendix
SEMOSS Supplementing Insights
Data Sources
1. Private Research Data
2. Online Mendelian Inheritance in Man (OMIM)
3. PubMed
4. HUGO Gene Nomenclature Committee (HGNC)
5. DrugBank
6. Comparative Toxicogenomics Database (CTD)
7. Disease Gene Network (DisGeNet)
8. PubChem
9. PharmGKB
Relevant Data
1. Gene Expression
2. Chemical
3. Cellular Pathway
4. Molecular Function
5. Biological Process
6. Cytolocation
7. Cell Component
8. Gene Nomenclature
9. Disease
10. Publication
11. Author
Solution Benefits & Capabilities
Researcher Benefits
• Data Accuracy; ensure you are using validated, authoritative sources
• Time Efficiency; eliminate days spent reading publications and searching for data
• Single Platform; use centralized platform rather than multiple data locations
• Rapid Visualization & Analysis; to gain insight and accelerate research
• Scientific Collaboration; secure public/private cloud instance for collaboration
Solution Capabilities
• Big Data; navigate and distill relevant data seamlessly
• Extensible, Scalable Data Model; shared model of understanding
• Undirected Research; what questions do we ask public data that we do not have answers to?
• Broad Applicability; across many subject areas and data types
• Open Data Initiatives; federal public data initiatives with no data consumption tool
Solution Overview
SEMOSS maximizes HHS Open Data ROI by leveraging the vast networks of public and private life science data to promote insight and discovery.
Architecture layers: Federal Health Data Environment, Cloud Infrastructure, SEMOSS Platform, End Users
Scientific use case (Cancer Researcher): Which diseases are associated with my genes of interest?
SEMOSS solution data sources: PharmGKB, CTD, PubMed, DisGeNet, HGNC, MeSH, FAERS
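The researcher's question on this slide, which diseases are associated with my genes of interest, can be sketched as a lookup over gene-disease associations that ranks each disease by how many of the queried genes point to it. This is only an illustration: the association list, the IDs, and the function name `diseases_for_genes` are assumptions, and the ranking rule is one plausible choice rather than how SEMOSS scores results.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical gene-disease associations; IDs are placeholders, not real data.
ASSOCIATIONS: List[Tuple[str, str]] = [
    ("G1", "D1"),
    ("G2", "D1"),
    ("G2", "D2"),
    ("G3", "D3"),
]


def diseases_for_genes(
    genes_of_interest: Iterable[str],
    associations: Iterable[Tuple[str, str]],
) -> List[Tuple[str, List[str]]]:
    """Return (disease, supporting genes), most supporting genes first."""
    wanted = set(genes_of_interest)
    hits: Dict[str, Set[str]] = defaultdict(set)
    for gene, disease in associations:
        if gene in wanted:
            hits[disease].add(gene)
    # Sort by descending support, then by disease ID for a stable order.
    ranked = sorted(hits.items(), key=lambda item: (-len(item[1]), item[0]))
    return [(disease, sorted(genes)) for disease, genes in ranked]


result = diseases_for_genes(["G1", "G2"], ASSOCIATIONS)
for disease, genes in result:
    print(disease, genes)
```

Diseases linked only to genes outside the query, such as D3 here, drop out of the result entirely.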
Solution Demonstration
1) Diseases
2) Articles
3) Collaborators
Data Sources
1. Private Data
2. HGNC
3. OMIM
4. DisGeNet
5. CTD
6. PharmGKB
7. PubMed
8. FAERS
9. VAERS
SEMOSS Supplementing Insights
Identify Question
SEMOSS pre-packages more than eighty questions across domains that can be readily utilized. New questions can be modeled as reports.
Synthesize Meta Model
SEMOSS has more than ten different domain metamodels. New models can be created / extended to emulate mental models.
Find and Import Data
SEMOSS has industry data across healthcare, infrastructure, data and BPR that can be readily explored. Link Excel data or RDBMS to existing data for analysis.
Visual Analysis
SEMOSS allows automatic linking of data across databases and allows cross-database visualization. Users no longer need to import everything into a single database.
The Team
Prabhu Kapaleeswaran: Author, SEMOSS (MHS)
Joe Croghan: Project Supervisor (NIH)
Brock Smith: Project Lead (NIH)
Alexander Sherman: Technical SME (NIH)
Karthik Balakrishnan: Technical Lead (NIH)
LeeAnn Bailey, PhD: Science SME (FDA)
Alexandra Kwit: Science Lead (NIH)
Regina Cox: Data SME (CDC)
Special Thanks to…
Vincent Munster, PhD (NIH)
Mike Tartakovsky (NIH)
Alex Rosenthal (NIH)
Dawei Lin, PhD (NIH)
Peter Jahrling, PhD (NIH)
David Parrish (NIH)