In the Future Will a Biological Database Really be Different than a Biological Journal? Philip E. Bourne PhD, iDASH, October 18, 2013

Is a Biological Database Really Different than a Biological Journal?


DESCRIPTION

Presentation on the changing face of scholarly communication and the interplay between data and the knowledge derived from that data.


Page 1: Is a Biological Database Really Different than a Biological Journal?

iDASH October 18, 2013 1

In the Future Will a Biological Database Really be Different than a Biological Journal?

Philip E. Bourne

Page 2: Is a Biological Database Really Different than a Biological Journal?

I am speaking to you today as someone who:

• Maintains a major biological database – the PDB – used by over 300,000 scientists per month

• Is the Founding Editor in Chief of PLOS Computational Biology

Page 3: Is a Biological Database Really Different than a Biological Journal?


A Question First Posed in August 2005

PLOS Comp Biol 2005 1(3): e34

Page 4: Is a Biological Database Really Different than a Biological Journal?

Here is one reason why the question is important…

Page 5: Is a Biological Database Really Different than a Biological Journal?

The Paper As Experiment

0. Full text of PLoS papers stored in a database
1. A link brings up figures from the paper
2. Clicking the paper figure retrieves data from the PDB which is analyzed
3. A composite view of journal and database content results
4. The composite view has links to pertinent blocks of literature text and back to the PDB

1. User clicks on thumbnail
2. Metadata and a webservices call provide a renderable image that can be annotated
3. Selecting a feature provides a database/literature mashup
4. That leads to new papers

PLoS Comp. Biol. 2005 1(3) e34
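The figure-to-database steps on this slide amount to joining a figure's metadata to database entries. A minimal Python sketch of that join follows. The `Figure` record, its field names, the placeholder DOI, and the `http://` scheme are illustrative assumptions and not part of the original talk. The URL pattern is the RCSB literature view address quoted on a later slide; it dates from 2013 and may no longer resolve, so the sketch only builds links and makes no network call.

```python
from dataclasses import dataclass, field
from urllib.parse import urlencode

# URL pattern quoted on a later slide of this talk (2013); the scheme is assumed.
LITERATURE_VIEW = "http://www.rcsb.org/pdb/explore/literature.do"


@dataclass
class Figure:
    """Hypothetical record for one figure stored with a paper's full text."""
    paper_doi: str
    label: str
    pdb_ids: list = field(default_factory=list)  # structures shown in the figure


def literature_view_url(pdb_id: str) -> str:
    """Build the database-side link for one PDB identifier."""
    return f"{LITERATURE_VIEW}?{urlencode({'structureId': pdb_id.upper()})}"


def composite_view(figure: Figure) -> dict:
    """Combine journal metadata and database links into one view."""
    return {
        "paper": figure.paper_doi,
        "figure": figure.label,
        "database_links": {
            pdb_id.upper(): literature_view_url(pdb_id)
            for pdb_id in figure.pdb_ids
        },
    }


# Placeholder DOI, used only for illustration.
example = Figure(paper_doi="10.0000/example", label="Figure 1", pdb_ids=["1tim"])
view = composite_view(example)
```

Here `view["database_links"]` maps each structure identifier to its database page, while `view["paper"]` keeps the link back to the literature, which is the two-way navigation the slide describes.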

Page 6: Is a Biological Database Really Different than a Biological Journal?

The answer 8 years ago, as it is now, is…

In principle there is no difference, but the way in which each is perceived is still very different…

Yet progress has been made and we will focus on what we can do to further accelerate change

Page 7: Is a Biological Database Really Different than a Biological Journal?


Why Bother?

Better integration of data and the knowledge derived from it can accelerate discovery and improve the comprehension and dissemination of science

Page 8: Is a Biological Database Really Different than a Biological Journal?

Let's take a step back...

What got me thinking this way?

Page 9: Is a Biological Database Really Different than a Biological Journal?

Data Are Becoming More Complex: Witness The World Wide Protein Data Bank

• The single worldwide repository for data on the structure of biological macromolecules

• Vital for drug discovery and the life sciences

• 43 years old

• Free to all

http://www.wwpdb.org

Page 10: Is a Biological Database Really Different than a Biological Journal?

The World Wide Protein Data Bank Places High Value on Data

• Paper not published unless data are deposited – strong data to literature correspondence

• Highly structured data conforming to an extensive ontology

• DOIs assigned to every structure

http://www.wwpdb.org

Page 11: Is a Biological Database Really Different than a Biological Journal?

The PLoS Corpus

• Established in 2000
• Identified as high quality publications
• Currently 8 journals with healthy growth
• Open Access – free to all
• PLOS ONE a huge success

Page 12: Is a Biological Database Really Different than a Biological Journal?

Similar Processes Lead to Similar Resources

Journal | Database
Author Submission via the Web | Depositor Submission via the Web
Syntax Checking | Syntax Checking
Review by Scientists & Editors | Review by Annotators
Corrections by Author | Corrections by Depositor
Publish – Web Accessible | Release – Web Accessible
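The parallel between the journal and database workflows on this slide can be made concrete by modelling both as one sequence of stages that differs only in wording. The following Python sketch is an illustration under that assumption: the stage wording comes from the slide, while the `Stage` class, the short `action` names, and the helper functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    action: str    # hypothetical short name for the shared step
    journal: str   # wording used on the journal side of the slide
    database: str  # wording used on the database side of the slide


# One pipeline, two vocabularies.
PIPELINE = [
    Stage("submit", "Author Submission via the Web", "Depositor Submission via the Web"),
    Stage("check", "Syntax Checking", "Syntax Checking"),
    Stage("review", "Review by Scientists & Editors", "Review by Annotators"),
    Stage("correct", "Corrections by Author", "Corrections by Depositor"),
    Stage("publish", "Publish – Web Accessible", "Release – Web Accessible"),
]


def shared_actions(pipeline):
    """Return the sequence of steps common to both workflows."""
    return [stage.action for stage in pipeline]


def differing_labels(pipeline):
    """Return the (journal, database) wording pairs that are not identical."""
    return [(s.journal, s.database) for s in pipeline if s.journal != s.database]


steps = shared_actions(PIPELINE)
differences = differing_labels(PIPELINE)
```

Every stage has a counterpart on the other side, and only the names of the actors and of the final step change, which is the slide's point about similar processes.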

Page 13: Is a Biological Database Really Different than a Biological Journal?

The scientific processes for handling data and publications are not that different, but the end products are perceived very differently

Page 14: Is a Biological Database Really Different than a Biological Journal?


Unfortunately the Metrics of Success Remain…

[Carole Goble]

Page 15: Is a Biological Database Really Different than a Biological Journal?

This makes no sense when you ask yourself the question: What is more valuable, a dataset used and cited by 100 scientists or a paper you wrote that only you cite?

Case in point…

Page 16: Is a Biological Database Really Different than a Biological Journal?


What can you do today to change the situation?

Page 17: Is a Biological Database Really Different than a Biological Journal?

Think Globally Act Locally

• Support emergent community commons/portals
• Be involved in the support and development of metadata standards
• Contribute to workflow development etc. to drive an open research lifecycle
• Educate your mentors on the importance of open science and scholarly communication
• Write software thinking of an App model

Page 18: Is a Biological Database Really Different than a Biological Journal?

Pressure Your Institutions to Play a Greater Role

• We need institutional data/knowledge sharing plans

• We need digital universities

• We need data/information scientists to be better recognized by institutions – it's not all about papers – this implies new metrics

Page 19: Is a Biological Database Really Different than a Biological Journal?


Committee on Academic Promotions

• What Counts
– Money
– Grants
– Papers
– Teaching
– Service

• What Does Not
– Sharing data
– Sharing software
– Open access
– Collaboration
– Patents
– Startups

Ten Simple Rules for Getting Ahead as a Computational Biologist in Academia 2011 PLOS Comp Biol 7(1) e1002001

Page 20: Is a Biological Database Really Different than a Biological Journal?

We Need to Bend the Traditional System

The Wikipedia Experiment – Topic Pages

Identify areas of Wikipedia that relate to the journal that are missing or stubs

Develop a Wikipedia page in the sandbox

Have a Topic Page Editor Review the page

Publish the copy of record with associated rewards

Release the living version into Wikipedia


Page 21: Is a Biological Database Really Different than a Biological Journal?

We Need Innovative Contributions to the Research Lifecycle

IDEAS – HYPOTHESES – EXPERIMENTS – DATA – ANALYSIS – COMPREHENSION – DISSEMINATION

Authoring Tools
Lab Notebooks
Data Capture
Software Repositories
Analysis Tools
Visualization
Scholarly Communication
Commercial & Public Tools
Git-like Resources
By Discipline
Data Journals
Discipline-Based Metadata Standards
Community Portals
Institutional Repositories
New Reward Systems
Commercial Repositories
Training


Page 23: Is a Biological Database Really Different than a Biological Journal?


www.rcsb.org/pdb/explore/literature.do?structureId=1TIM

Example Interoperability: The Database View

BMC Bioinformatics 2010 11:220

Page 24: Is a Biological Database Really Different than a Biological Journal?


This is asking a lot of us, but our job is being made easier by what is going on around us

Page 25: Is a Biological Database Really Different than a Biological Journal?


Open Access to Data and the Literature is no Longer a Curiosity, but Mainstream

Page 26: Is a Biological Database Really Different than a Biological Journal?

Conservative Bodies Are Recognizing Change

• Anyone, anything, anytime

• Publication access, data, models, source code, resources, transparent methods, standards, formats, identifiers, APIs, licenses, education, policies

• “accessible, intelligible, assessable, reusable”

http://royalsociety.org/policy/projects/science-public-enterprise/report/

[Carole Goble]

Page 27: Is a Biological Database Really Different than a Biological Journal?

Governments Are Recognizing Change

G8 Open Data Charter

http://opensource.com/government/13/7/open-data-charter-g8

Page 28: Is a Biological Database Really Different than a Biological Journal?


Funding Agencies are Changing

Page 29: Is a Biological Database Really Different than a Biological Journal?

Publishing is Changing

• Today:

• Approx 10,000 publishers

• Publishing approx 25,000 journals

• Which publish approx 1.5 million articles per year (almost 1 million of which appear in PubMed)

Page 30: Is a Biological Database Really Different than a Biological Journal?

Witness the ‘Open Access Mega Journal’

1. Very very large
– Publishing thousands of articles per year
– and benefiting from economies of scale

2. Open Access
– Because no one will pay a subscription fee for a journal that large (and growing that fast)
– and using an OA Business Model where each article pays for its own costs

3. (Preferably) without any ‘artificial’ constraints on its ability to grow
– For example, a desire to only publish ‘high impact’ papers

[Pete Binfield]

Page 31: Is a Biological Database Really Different than a Biological Journal?

[Figure: Publications by PLOS ONE per quarter since launch; vertical axis ticks run from 0 to 3,500]

[Pete Binfield]

Page 32: Is a Biological Database Really Different than a Biological Journal?

“Open Access Mega Journals” – One Name, Two Flavours

• ‘Clones’ of PLoS ONE (not selective)
– SAGE Open
– BMJ Open
– Scientific Reports (Nature)
– AIP Advances (Am Inst Physics)
– G3 (Genetics Soc of America)
– Biology Open (Company of Biologists)

• ‘Pseudo-Clones’ of PLoS ONE (probably selective)
– Physical Review X (Am Physical Society)
– Open Biology (Royal Society)
– Cell Reports (Elsevier, Cell Press)

[Pete Binfield]

Page 33: Is a Biological Database Really Different than a Biological Journal?

Attitudes are Changing

“An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment, [the complete data] and the complete set of instructions which generated the figures.” David Donoho, “Wavelab and Reproducible Research,” 1995

datasets, data collections, algorithms, configurations, tools and apps, codes, workflows, scripts, code libraries, services, system software infrastructure, compilers, hardware

Morin et al., Shining Light into Black Boxes, Science 13 April 2012: 336(6078) 159-160

Ince et al., The case for open computer programs, Nature 482, 2012

[Carole Goble]

Page 34: Is a Biological Database Really Different than a Biological Journal?

Flaws Are Becoming More Obvious

1. Ioannidis et al., 2009. Repeatability of published microarray gene expression analyses. Nature Genetics 41: 14
2. Science publishing: The trouble with retractions http://www.nature.com/news/2011/111005/full/478026a.html
3. Bjorn Brembs: Open Access and the looming crisis in science https://theconversation.com/open-access-and-the-looming-crisis-in-science-14950

Out of 18 microarray papers, results from 10 could not be reproduced

More retractions: >15X increase in last decade

At the current rate of increase, by 2045 as many papers will be retracted as published

[Carole Goble]

Page 35: Is a Biological Database Really Different than a Biological Journal?


Science is Being Deinstitutionalized


Daniel Hulshizer/Associated Press


Page 37: Is a Biological Database Really Different than a Biological Journal?


In Summary

• Question (2005): In the Future Will a Biological Database Really be Different than a Biological Journal?

• Answer:
– Less different than they were in 2005
– We still have a long way to go to improve science
– Change is accelerating
– What one does on a daily basis as a scholar is very different from when I was in graduate school and it will be very different again