Melissa Haendel, PhD
Oregon Health & Science University
Future of Research Communications and E-Scholarship
Enabling transparency and efficiency
in the research landscape
@force11rescomm @ontowonka
Impetus for change: Is our current method serving science?
47/50 major preclinical published cancer studies could not be replicated
“The scientific community assumes that the claims in a preclinical study can be taken at face value-that although there might be some errors in detail, the main message of the paper can be relied on and the data will, for the most part, stand the test of time. Unfortunately, this is not always the case.”
Begley and Ellis, 29 MARCH 2012 | VOL 483 | NATURE | 531
The scientific corpus is fragmented
~25 million articles total, each covering a fragment of the biomedical space
Each publisher owns a fragment of a particular field
The current process is inefficient and slow
Wiley
Elsevier
Macmillan
Oxford
Spinal Muscular Atrophy
Committee on Academic Promotions
What Counts
Money
Grants
Papers
Teaching
Service
What Does Not
Sharing data
Sharing software
Open access
Collaboration
Patents
Startups
Getting Ahead as a Computational Biologist in Academia, PLOS Comp Biol, doi:10.1371/journal.pcbi.1002001
Beyond the PDF Conference/unconference
where all stakeholders come together as equals to discuss issues
– Publishers
– Technologists
– Scholars
– Library scientists
– Humanists
– Policy makers
– Funders
Incubator for change
What would you do to change scholarly communication?
San Diego, Jan 2011 ... Amsterdam, March 2013 ... Oxford, 2015
http://www.force11.org/beyondthepdf2
FORCE11
Future of Research Communications and E-Scholarship: A grass roots effort to accelerate the pace and nature of scholarly communications and e-scholarship through technology, education and community
Why 11? We were born in 2011 in Dagstuhl, Germany
Principles laid out in the FORCE11 Manifesto
FORCE11 launched in July 2012
www.force11.org @
Promote community, cross-fertilization and interoperability
FORCE11 helps facilitate communications across disciplines and communities
Issues are not identical but we can learn from each other
Community platform
– Meetings
– Discussions
– Tools and resources
– Blogs
– Event calendar
– Community projects
Working groups
– Data Citation
– Resource identification initiative
– Attribution
– Data standards/Biosharing
Data Citation Working Group
FORCE11 provides a neutral space for bringing groups together
35 individuals representing > 20 organizations concerned with data citation
Conducted a review of current data citation recommendations from 4 different organizations
Arrived at consensus principles
http://www.force11.org/datacitation
Data Citation Principles
Consensus Data Citation principles ready for comment
Designed to be high level and easy to understand
1. Importance
2. Credit and Attribution
3. Evidence
4. Unique identifiers
5. Access
6. Persistence
7. Versioning
8. Interoperability and flexibility
Data Citation Implementation
https://www.force11.org/datacitationimplementation
https://peerj.com/preprints/697/
BioCADDIE Data Discovery Index
https://www.force11.org/group/biocaddie/cewg
Challenge: Working with Web Data
Often have inadequate descriptions so we don’t know what they are about or how they were constructed
Datasets change over time, but often don’t come with versioning information
May have been constructed using other data, but it’s not clear which version of data was used or whether these were modified
Data may be available in a variety of formats
There may be multiple copies of data from different providers, but it’s unclear if they are exact copies or derivatives
Version of standard or vocabulary used not indicated
Data registries are not synchronized and can contain conflicting information
W3C HCLS Dataset Description
Develop a guidance note for reusing existing vocabularies to describe datasets with RDF
– Mandatory, recommended, optional descriptors
– Identifiers
– Versioning
– Attribution
– Provenance
– Content summarization
Recommend vocabulary-linked attributes and value sets
Provide reference editor and validation
Metadata Model:
description – version – distribution
http://tiny.cc/hcls-datadesc
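The three-level metadata model above (description, version, distribution) could be sketched roughly as follows. This is only an illustration under stated assumptions: the class names, field names, and the example.org dataset are hypothetical and are not taken from the HCLS guidance note, which describes datasets in RDF rather than Python objects.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Distribution:
    """A concrete, downloadable form of one dataset version."""
    file_format: str      # media type, e.g. "text/turtle"
    download_url: str


@dataclass
class Version:
    """A specific release of the dataset."""
    version_id: str
    issued: str           # release date, ISO 8601
    distributions: List[Distribution] = field(default_factory=list)


@dataclass
class DatasetDescription:
    """Version-independent summary of the dataset."""
    title: str
    publisher: str
    versions: List[Version] = field(default_factory=list)

    def formats_for(self, version_id: str) -> List[str]:
        """Return the sorted file formats available for one version."""
        for version in self.versions:
            if version.version_id == version_id:
                return sorted(d.file_format for d in version.distributions)
        raise KeyError(f"unknown version: {version_id}")


# Hypothetical dataset, used only for illustration.
example = DatasetDescription(
    title="Example gene annotations",
    publisher="Example Lab",
    versions=[
        Version("1.0", "2014-01-15", [
            Distribution("text/csv", "https://example.org/genes/1.0/genes.csv"),
        ]),
        Version("2.0", "2015-01-10", [
            Distribution("text/turtle", "https://example.org/genes/2.0/genes.ttl"),
            Distribution("text/csv", "https://example.org/genes/2.0/genes.csv"),
        ]),
    ],
)

print(example.formats_for("2.0"))
```

Keeping versions and distributions as separate levels lets a user state exactly which release and which file format they used, which addresses several of the web data challenges listed earlier.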
Journal guidelines for methods are often poor and space is limited
“All companies from which materials were obtained should be listed.” - A well-known journal
Reproducibility depends, at a minimum, on using the same resources. But…
Only ~50% of resources were identifiable
Vasilevsky et al, 2013, PeerJ
There is no correlation between impact factor and resource identification
[Figure: fraction of resources identified (y-axis, 0.0 to 1.0) plotted against journal impact factor (x-axis, 0 to 40), by resource type: antibodies, cell lines, constructs, knockdown reagents, organisms]
http://www.force11.org/Resource_Identification_Initiative
Numerous endorsers https://www.force11.org/RII/SignUp
Implementation of the new standard http://biosharing.org/bsg-000532
RRIDs should be:
Machine Readable
Consistent across publishers and journals
Free to generate and access
Sample citation:
Polyclonal rabbit anti-MAPK3 antibody, Abgent, Cat# AP7251E, RRID:AB_2140114
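Because RRIDs are meant to be machine readable, a text-mining tool could pull them out of a methods section with a simple pattern. The sketch below is an assumption-laden illustration: the regular expression is inferred from the sample citation above (a letter prefix, an underscore, then an alphanumeric accession) and is not the official RRID syntax, and `find_rrids` is a hypothetical helper name.

```python
import re

# Assumed pattern, inferred from the sample citation only:
# "RRID:" followed by a letter prefix, an underscore, and an accession.
RRID_PATTERN = re.compile(r"RRID:\s*([A-Za-z]+_[A-Za-z0-9]+)")


def find_rrids(text):
    """Return every RRID-like identifier found in the text, in order."""
    return [match.group(1) for match in RRID_PATTERN.finditer(text)]


methods_sentence = (
    "Polyclonal rabbit anti-MAPK3 antibody, Abgent, "
    "Cat# AP7251E, RRID:AB_2140114"
)
print(find_rrids(methods_sentence))
```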
Publishing Workflow
1. Researcher submits a manuscript for publication
2. Editor or Publisher asks for inclusion of RRID
3. Author goes to Resource Identification Portal to locate RRID
4. RRID is included in Methods section and as a keyword
Example Scenario
Melissa creates mouse1
David creates mouse2
Layne performs RNAseq analysis on mouse1 and mouse2 to generate dataset3, which he subsequently curates and analyzes
Layne writes publication pmid:12345 about the results of his analysis
Layne explicitly credits Melissa as an author but not David.
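One way to make the gap in this scenario visible is to walk the chain of derivations from the publication back to every resource it depends on, then compare the creators found with the people actually credited. The sketch below is a deliberately simplified illustration of that idea; the dictionaries and function names are hypothetical and do not represent the model of any of the attribution efforts named in this talk.

```python
# What each product was derived from (taken from the scenario above).
derived_from = {
    "pmid:12345": ["dataset3"],
    "dataset3": ["mouse1", "mouse2"],
}

# Who created each product.
created_by = {
    "mouse1": "Melissa",
    "mouse2": "David",
    "dataset3": "Layne",
    "pmid:12345": "Layne",
}

# Who is explicitly credited on each product.
credited = {"pmid:12345": {"Layne", "Melissa"}}


def contributors(product):
    """Collect creators of the product and of everything it derives from."""
    people, seen, stack = set(), set(), [product]
    while stack:
        item = stack.pop()
        if item in seen:
            continue
        seen.add(item)
        if item in created_by:
            people.add(created_by[item])
        stack.extend(derived_from.get(item, []))
    return people


def uncredited(product):
    """Contributors found by traversal who are not explicitly credited."""
    return contributors(product) - credited.get(product, set())


print(sorted(uncredited("pmid:12345")))
```

Under these assumptions the traversal reaches mouse2 through dataset3, so David appears as a contributor even though the publication does not credit him.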
Attribution Working Group
https://www.force11.org/group/attributionwg
Project CredIT
VIVO-ISF ontology
PROV
the Becker model
Transitive credit
The Scholarly Contributions and Roles ontology
The goal is to catalyze rapid convergence on requirements, approaches, and practical implementation of a system for tracking contributions to any scholarly product.
The 1K Challenge
What would you do with £1k today to make research communication better, anticipating the increasing scale of people and machines?
Researchers DO need assistance:
– Finding and choosing data standards
– File versioning
– Applying metadata to facilitate data sharing
“Gummi Bear” themed data management exercise resonated well with students
Lack of awareness of services and expertise offered by the Library
OHSU Library is developing data services for researchers
http://laughingsquid.com/the-anatomy-of-a-gummy-bear-by-jason-freeny/
Conclusions and new directions
DOI:10.6083/M4QC0273
https://www.force11.org/force2015/1k-challenge-vote
Join the Force11: https://www.force11.org/
“Meta Makes My Machine Marvellous (5M)”
“Crowdreviewing: the sharing economy at its finest”
“Science bots”
“scientific articles are too expensive to publish and to read”
FORCE11 Vision
• Modern technologies enable vastly improved knowledge transfer and far wider impact; freed from the restrictions of paper, numerous advantages appear
• We see a future in which scientific information and scholarly communication more generally become part of a global, universal and explicit network of knowledge
• To enable this vision, we need to create and use new forms of scholarly publication that work with reusable scholarly artifacts
• To obtain the benefits that networked knowledge promises, we have to put in place reward systems that encourage scholars and researchers to participate and contribute
• To ensure that this exciting future can develop and be sustained, we have to support the rich, variegated, integrated and disparate knowledge offerings that new technologies enable
What is the 21st century equivalent of the library?