Are Funders and Academic Institutions Approaches to Data Science Aligned? Philip E. Bourne PhD, FACMI Stephenson Chair of Data Science Director, Data Science Institute Professor of Biomedical Engineering [email protected] https://www.slideshare.net/pebourne 6/15/17 Dataverse 2017 1


Page 1: Are Funders and Academic Institutions Approaches to Data Science Aligned

Are Funders and Academic Institutions Approaches to Data Science Aligned?

Philip E. Bourne PhD, FACMI

Stephenson Chair of Data Science

Director, Data Science Institute

Professor of Biomedical Engineering

[email protected]

https://www.slideshare.net/pebourne

6/15/17 Dataverse 2017 1

Page 2: Are Funders and Academic Institutions Approaches to Data Science Aligned

My Bias in Addressing this Question

• Many years as a receiver of funds

• Active developer of public-private partnerships

• Chief Data Officer and final word on the BD2K project at the NIH for 3 years (funder view)

• DSI Director for 6 weeks (state and institutional view)


Page 3: Are Funders and Academic Institutions Approaches to Data Science Aligned

What Do I Mean by Data Science?

• Use of the ever increasing amount of open, complex, diverse digital data

• Finding ways to ask and then answer relevant questions by combining such diverse data sets

• Arriving at statistically significant conclusions not otherwise obtainable

• Sharing such findings in a useful way

• Translating such findings into actions that improve the human condition


Page 4: Are Funders and Academic Institutions Approaches to Data Science Aligned

Consider Some Current High Profile NIH Examples where Data Science is Being Applied

• Moonshot – Bringing together 5 petabytes of homogenized data within the Genomic Data Commons (GDC) to explore genotype-phenotype relationships

• MODs – Multiple high value, high cost genomic resources

• Human Microbiome Project – microbe characterization and analysis

• TOPMed – Genomic, proteomic, metabolomic, image and EHR data

• Precision Medicine – Building a platform to support data on >1M individuals with extensive and constantly updated health profiles

• ECHO – Effects of Environmental Exposures on Child Health and Development – Integration of child health and environmental data

• BRAIN – Temporal and spatial analysis of neural circuits

Page 5: Are Funders and Academic Institutions Approaches to Data Science Aligned

How is Data Science Being Applied?

• Moonshot – new ways to analyze genotype-phenotype associations

• MODs – new curation and integration tools

• Human Microbiome Project – new cloud based tools

• TOPMed – large scale storage and analysis; data harmonization

• Precision Medicine – security; analysis of sensor data; EHR integration

• ECHO – metadata descriptions of health and environmental data; application of geospatial methods

• BRAIN – methods for network analysis, visualization

All: Analytics, the Commons, FAIR, sustainability, workforce

Page 6: Are Funders and Academic Institutions Approaches to Data Science Aligned

Are Funders and Academic Institutions Approaches to Data Science Aligned?

A spoiler …

Yes… But both are so far behind the times it's scary

Page 7: Are Funders and Academic Institutions Approaches to Data Science Aligned

Lack of Alignment Can Come from Serving Different Masters/Mistresses: Academic Institutions

[Diagram labels: Top Down; Funders (Federal, Foundations, Philanthropy); State; General Public; Faculty & Staff; Students; Scholarly Communities]

Page 8: Are Funders and Academic Institutions Approaches to Data Science Aligned

Lack of Alignment Can Come from Serving Different Masters/Mistresses: Funders

[Diagram labels: Top Down; Congress; General Public; Scholarly Communities; Researchers]

Page 9: Are Funders and Academic Institutions Approaches to Data Science Aligned

Why Does Alignment Matter Now?

One extreme view is the 6D’s

Page 10: Are Funders and Academic Institutions Approaches to Data Science Aligned

How Significant?

One extreme is the 6D’s

[Figure: the 6D’s plotted against time: Digitization, Deception, Disruption, Demonetization, Dematerialization, Democratization]

Kodak example, in the order shown on the slide:

• Digital camera invented by Kodak but shelved

• Megapixels & quality improve slowly; Kodak slow to react

• Film market collapses; Kodak goes bankrupt

• Phones replace cameras

• Instagram, Flickr become the value proposition

• Digital media becomes bona fide form of communication

Page 11: Are Funders and Academic Institutions Approaches to Data Science Aligned

How Much Data?

• Big Data

– Total data from NIH-funded research currently estimated at 650 PB*

– 20 PB of that is in NCBI/NLM (3%) and it is expected to grow by 10 PB this year

• Dark Data

– Only 12% of data described in published papers is in recognized archives – 88% is dark data^

• Cost

– 2007-2014: NIH spent ~$1.2Bn extramurally on maintaining data archives

* In 2012 Library of Congress was 3 PB

^ http://www.ncbi.nlm.nih.gov/pubmed/26207759
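As a rough check on the proportions quoted on this slide, here is a minimal Python sketch. It only restates the slide's own numbers; the variable names are illustrative assumptions and not part of the original talk.

```python
# Figures as quoted on the slide, in petabytes (PB).
# Variable names are illustrative assumptions, not from the original talk.
total_nih_pb = 650            # estimated total data from NIH-funded research
ncbi_nlm_pb = 20              # portion held in NCBI/NLM
expected_growth_pb = 10       # expected NCBI/NLM growth this year
library_of_congress_pb = 3    # Library of Congress in 2012 (slide footnote)
archived_fraction = 0.12      # share of published data in recognized archives

# Derived proportions.
ncbi_share = ncbi_nlm_pb / total_nih_pb
ncbi_growth = expected_growth_pb / ncbi_nlm_pb
loc_multiple = total_nih_pb / library_of_congress_pb
dark_fraction = 1 - archived_fraction

print(f"NCBI/NLM share of total: {ncbi_share:.1%}")
print(f"Expected NCBI/NLM growth this year: {ncbi_growth:.0%}")
print(f"Total vs. 2012 Library of Congress: about {loc_multiple:.0f}x")
print(f"Dark data fraction: {dark_fraction:.0%}")
```

The NCBI/NLM share comes out close to the slide's rounded 3%, and the total is on the order of two hundred times the 2012 Library of Congress figure.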

Page 12: Are Funders and Academic Institutions Approaches to Data Science Aligned

Renaissance or not, all of us, including funders and institutions, feel that some changes are afoot


Page 13: Are Funders and Academic Institutions Approaches to Data Science Aligned

Interestingly, both are organizationally similar, which points to some form of shared misalignment

• Funder institutions

– Silos – Institutes/Divisions

• Resources flow directly to the silos

• Patchwork efforts to compensate

– Common fund (NIH)

• Academic Institutions

– Silos – Schools/Departments

• Resources flow directly to the silos (RCM)

• Patchwork efforts to compensate

– Joint appointments


Page 14: Are Funders and Academic Institutions Approaches to Data Science Aligned

In both environments data science transcends the traditional organizational structure, but it is not necessarily clear what to do with it

Page 15: Are Funders and Academic Institutions Approaches to Data Science Aligned

Approaches

• Funders

– Chief data officer

– Establish

• Programs

• Divisional data officers

• Institutions

– Chief data officer

– Establish

• Schools/Deans

• Departments/Chairs

• Divisions

• Centers/Directors

• Institutes/Directors


Page 16: Are Funders and Academic Institutions Approaches to Data Science Aligned

Motivations

• Funders

– Intramural

• Productivity/Cost-effectiveness

– Extramural

• Acceleration of research outcomes

• Reproducibility

• Governance including policy

• Workforce inc. diversity

• Ethics

• Stewardship

• Discovery

• Institutions

– Yet to significantly eat their own dog food

– Workforce development inc. diversity

– Research dollars to economic development

– Public private partnership

• More dollars

– Alumni

Page 17: Are Funders and Academic Institutions Approaches to Data Science Aligned

Example of what motivates funders …


Page 18: Are Funders and Academic Institutions Approaches to Data Science Aligned

Why a More Open Process?

Use case: Diffuse Intrinsic Pontine Gliomas (DIPG)

• Occur in 1:100,000 individuals

• Peak incidence 6-8 years of age

• Median survival 9-12 months

• Surgery is not an option

• Chemotherapy ineffective and radiotherapy only transitory

From Adam Resnick

Page 19: Are Funders and Academic Institutions Approaches to Data Science Aligned

Timeline of genomic studies in DIPG

• Landmark studies identify histone mutations as recurrent driver mutations in DIPG ~2012

• Almost 3 years later, in largely the same datasets, but partially expanded, the same two groups and 2 others identify ACVR1 mutations as a secondary, co-occurring mutation

From Adam Resnick

Page 20: Are Funders and Academic Institutions Approaches to Data Science Aligned

What do we need to do differently to reveal ACVR1?

• ACVR1 is a targetable kinase

• Inhibition of ACVR1 inhibited tumor progression in vitro

• ~300 DIPG patients a year

• ~60 are predicted to have ACVR1

• If only large scale data sets had been integrated with TCGA and/or rare disease data in 2012, ACVR1 mutations would have been identified

• 60 patients/year X 3 years = 180 children’s lives (who likely succumbed to the disease during that time) could have been impacted if only data were FAIR

From Adam Resnick
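The estimate on this slide can be retraced with a short Python sketch. It uses only the slide's own approximate figures (attributed to Adam Resnick); the variable names are hypothetical and chosen for illustration.

```python
# Approximate figures as quoted on the slide (attributed to Adam Resnick).
# Variable names are hypothetical, chosen only for illustration.
dipg_patients_per_year = 300     # approximate DIPG patients per year
acvr1_patients_per_year = 60     # of those, predicted to have ACVR1 mutations
delay_years = 3                  # gap between the ~2012 studies and the ACVR1 finding

# Share of DIPG patients predicted to carry an ACVR1 mutation.
acvr1_fraction = acvr1_patients_per_year / dipg_patients_per_year

# Children who could have been impacted during the delay.
children_affected = acvr1_patients_per_year * delay_years

print(f"Predicted ACVR1 share of DIPG patients: {acvr1_fraction:.0%}")
print(f"Children potentially impacted over {delay_years} years: {children_affected}")
```

This is simple multiplication on rounded inputs, so the result is an order-of-magnitude illustration rather than a precise count.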

Page 21: Are Funders and Academic Institutions Approaches to Data Science Aligned

Example of what motivates institutions …


Page 22: Are Funders and Academic Institutions Approaches to Data Science Aligned

The cynical view …

50 x $50,000 = $2.5M

Page 23: Are Funders and Academic Institutions Approaches to Data Science Aligned

The Google University


Page 24: Are Funders and Academic Institutions Approaches to Data Science Aligned

Both funders and institutions see the need to move from pipes to platforms…

In this regard Dataverse is ahead of the curve

https://blog.lexicata.com/wp-content/uploads/2015/03/platform-model-750x410.png

Page 25: Are Funders and Academic Institutions Approaches to Data Science Aligned

Example: NSF and NIH Approaches


Page 26: Are Funders and Academic Institutions Approaches to Data Science Aligned

If platforms are the answer we could ask the question…

Will biomedical research become more like Airbnb?


Bonazzi & Bourne 2017 PLOS Biology 15(4) e2001818

Page 27: Are Funders and Academic Institutions Approaches to Data Science Aligned

I am not crazy, hear me out

• Airbnb is a platform that supports a trusted relationship between consumer (renter) and supplier (host)

• The platform focuses on maximizing the exchange of services between supplier and consumer and maximizing the amount of trust associated with a given stakeholder

• It seems to be working:

– 60 million users searching 2 million listings in 192 countries

– Average of 500,000 stays per night

– Valuation of US $25bn

Bonazzi & Bourne 2017 PLOS Biology 15(4) e2001818

Page 28: Are Funders and Academic Institutions Approaches to Data Science Aligned

Is not biomedical research the same?


Page 29: Are Funders and Academic Institutions Approaches to Data Science Aligned

Why a comparison to Airbnb is not fair

• Airbnb was born digital

• The exchange of services on Airbnb is simple compared to what is required of a platform to support biomedical research

Nevertheless there is much to be learnt


Page 30: Are Funders and Academic Institutions Approaches to Data Science Aligned

Platforms – The situation today

Supplier / Consumer

• Paper Author / Paper Reader

• Data Provider / Data Consumer

• Employer / Employee

• Reagent Provider / Reagent Consumer

• Software Provider / Software Consumer

• Grant Writer / Grant Reviewer

• Educator / Student

Platforms shown: MS Project, Google Drive, Coursera, Researchgate, Academia.edu, Open Science Framework, Synapse, F1000, Rio

Page 31: Are Funders and Academic Institutions Approaches to Data Science Aligned

In summary, there is not currently a widely adopted single platform for the exchange of services in biomedical research. Either there is a platform per service or no platform at all…

Funders and the institutions they fund need to work more closely to implement platforms

Page 32: Are Funders and Academic Institutions Approaches to Data Science Aligned

Impediments to a biomedical platform

• Current work practices by all stakeholders

• Entrenched business models

• Size of the undertaking aka resources needed

• Trust

• Incentives to use the platform

http://www.forbes.com/sites/johnhall/2013/04/29/10-barriers-to-employee-innovation/#8bdbaa811133

Page 33: Are Funders and Academic Institutions Approaches to Data Science Aligned

Funders are pushing open data science…

Institutions are more resistant

Page 34: Are Funders and Academic Institutions Approaches to Data Science Aligned

NIH – a culture of sharing

[Timeline; years marked: 1999, 2003, 2004, 2007, 2008, 2012, 2014. Items in the order shown on the slide:]

• Research Tools Policy

• NIH Data Sharing Policy

• Model Organism Policy

• Genome-wide Association (GWAS) Policy

• NIH Public Access Policy (Publications)

• Big Data to Knowledge (BD2K) Initiative

• Genomic Data Sharing (GDS) Policy

• Modernization of NIH Clinical Trials

• White House Initiative (2013 “Holdren Memo”)

Page 35: Are Funders and Academic Institutions Approaches to Data Science Aligned

Driving sharing and innovation: Open Science Prize

NIH, Wellcome Trust, HHMI

https://www.openscienceprize.org

• An international scientific challenge competition to encourage and support the prototyping and development of services, tools, or platforms that enable utilization of open content

• 96 submissions received

• Solvers from 45 countries, spanning 5 continents

• Timeline

• May 2016: Phase 1 winners announced at Health DataPalooza

• Dec 1, 2016: Presentations and public voting

• Feb 2017: Overall winner announced

Page 36: Are Funders and Academic Institutions Approaches to Data Science Aligned

Institutions are going their own way

• More dependency on the state

• More dependency on philanthropy

• More dependency on foundations

• More dependency on public-private partnership

Page 37: Are Funders and Academic Institutions Approaches to Data Science Aligned

What We are Doing/Planning at One Institution

• Starting an open UVA initiative

• Focusing on practical training through Capstones

• Not owning anything; only working through collaboration

• Planning for the data village – an ecosystem in which students, faculty, staff, visitors, private sector reps, entrepreneurs live and work

Page 38: Are Funders and Academic Institutions Approaches to Data Science Aligned

So let me summarize

• Data may be the next Renaissance

• Both funders and academic institutions are slow to realize this – it's hard to break away from the old ways

• Result: a growing gap between what both should be doing and what they are doing

• More exemplars like Dataverse are needed that integrate aspects of the research lifecycle


Page 39: Are Funders and Academic Institutions Approaches to Data Science Aligned

Acknowledgements


The BD2K Team at NIH

My New Colleagues at UVA

The 150 folks who have passed through my laboratory

https://docs.google.com/spreadsheets/d/1QZ48UaKcwDl_iFCvBmJsT03FK-bMchdfuIHe9Oxc-rw/edit#gid=0