"Cherish old knowledge that you may acquire new" - The Analects of Confucius


"Cherish old knowledge that you may acquire new"

- The Analects of Confucius

http://datadryad.org, http://blog.datadryad.org, http://datadryad.org/wiki

dryad-users@nescent.org; Twitter: @datadryad

Todd Vision – DryadUK Sustainability workshop - 4/1/2011 - The British Library

Publishers, journals, researchers, funders

1. What is the value proposition?
2. What is the appropriate revenue model?
3. What is the role of funders?

Long tail of orphan data

[Figure: volume plotted against rank frequency of datatype. Specialized repositories (e.g. GenBank, PDB) sit at the head of the curve; orphan data make up the long tail. After B. Heidorn.]

Source: PARSE Insight survey report, http://www.parse-insight.eu/

Bumpus HC (1898) The Elimination of the Unfit as Illustrated by the Introduced Sparrow, Passer domesticus. A Fourth Contribution to the Study of Variation. pp. 209-226 in Biological Lectures from the Marine Biological Laboratory, Woods Hole, Mass.

Source: PARSE Insight survey report, http://www.parse-insight.eu/

Source: Publishing Research Consortium, http://publishingresearch.net

n=3824

Peer-to-peer sharing is problematic

• Wicherts et al. requested data from 141 articles in American Psychological Association journals.

• “6 months later, after … 400 emails, [sending] detailed descriptions of our study aims, approvals of our ethical committee, signed assurances not to share data with others, and even our full resumes…” only 27% of authors complied

Wicherts, J.M., Borsboom, D., Kats, J., & Molenaar, D. (2006). The poor availability of psychological research data for reanalysis. American Psychologist, 61, 726-728.

Benefits to data archiving

Modified from Beagrie et al. (2009) Keeping Research Data Safe 2

Direct
• Verification of published research
• Preserving accessibility to data
• Allowing reuse and repurposing of data
• Discoverability of data

Indirect (costs avoided)
• Redundant data collection
• Inefficient legacy data curation
• Burden of sharing-upon-request
• Opportunity cost of science not done

Near term
• Protection against personnel turnover
• Availability for review and validation

Long term
• Secure long-term stewardship
• Increased impact per publication

Private
• Increased citations
• New collaborations
• New research opportunities
• Fulfilling funding mandates

Public
• More efficient use of research dollars
• Public trust in science
• Educational opportunities
• Improved methodologies
• More informed policy

Brussels Declaration on STM Publishing

“Raw research data should be made freely available to all researchers. Publishers encourage the public posting of the raw data outputs of research. Sets or sub-sets of data that are submitted with a paper to a journal should wherever possible be made freely accessible to other scholars”

Signed by 46 publishers and 13 trade organizations, incl. Elsevier, Nature Publ. Group, Springer, Oxford U Press, Wiley-Blackwell.

• The End
  To make data archiving and reuse a standard function of scholarly communication.
• The Means
  Enable low-burden, inexpensive data archiving in conjunction with article publication.
  Ensure individuals receive direct benefits from data sharing.
  Reduce unnecessary barriers to data reuse.
  Empower journals, societies & publishers in shared governance.
  Plan for long-term preservation at the outset.

Dryad vs. Supplementary Online Materials

Dryad SOM

Discoverable: indexed and exposed to both web and bibliographic search engines ✔ ✗

Identifiable: Data DOIs within articles serve as permanent, resolvable identifiers ✔ ✔/✗

Attributable: reuse of data leads to article citations ✔ ✔

Permanent: preservation planning, including format migration ✔ ?

Curated: quality control of data submissions and indexing metadata ✔ ✔/✗

Ease of deposit: streamlined deposit, allowance for large and complex datasets ✔/✗ ✔/✗

Formatted for reuse: support for non-PDF file formats ✔ ✔/✗

Updatable: new versions of data files can be added, metadata can be enhanced ✔ ✗

Support for embargoes: can delay release of data in accordance with journal policy ✔ ?

Free reuse: no paywall, clear terms of reuse ✔ ?

Economy of scale: cost efficiency from shared infrastructure ✔ ✔/✗

Responsive to needs of individual journals ✔ ✔

Core business: aligned with organizational mission ✔ ✔/✗

How well do we understand the value proposition?

• For researchers
  Dryad increases the impact of, and citations to, published research. It preserves and makes available others’ data to verify published results, to refine methodologies, and for other forms of reuse. It frees researchers from being responsible for data preservation and access.

• For journals
  Dryad frees journals from the responsibility and costs of maintaining supplemental data in perpetuity, and allows publishers to increase the value of their journals to authors and readers.

• For funders
  Dryad provides a cost-effective mechanism to make research more accessible, and to leverage existing investments in order to enable new science.

Dryad as an organization

• International nonprofit, with multiple institutional hosts
• Governed by a Board of open size
  Each partner journal appoints one (voting) representative
  The full Board votes on all financial and governance matters
• Executive Committee
  Currently five members elected by the Board
  Responsible for repository policy, short-term strategic decisions
  Brings issues to full Board for discussion and vote
• Institutional oversight, advisory structure both TBD
• Next board meeting 7-9 July in Vancouver
  Transition from interim status
  Adopt initial governance model
  Adopt initial cost-recovery model

Timeline, 2007-2012 (events in the order shown on the slide)

• NSF/ESA Data Sharing and NESCent Small Science workshops
• Beginning negotiation of Joint Data Archiving Policy
• Journals/societies join NESCent & others to fund Dryad through NSF
• NSF funding for Dryad begins (lasts through Aug 2012)
• Repository went online
• First consortium board meeting
• Debut of integrated data submission
• Announcement of Joint Data Archiving Policy
• JISC funding begins
• Discussions with potential charter partners
• JDAP (and NSF DMP mandate) takes effect
• Transitional funding campaign
• Approval of cost-recovery plan and governance structure
• Cost-recovery begins
• Transitional funding begins

Projecting Dryad’s operating costs

• Activity-based cost model, from KRDS
• Includes
  Management & administrative support
  Storage and server hardware (incl. permanent storage)
  Personnel for system maintenance
  Curation and preservation
  A small amount of outreach and user support
• Does not include
  Facilities and other institutional costs (e.g. human resources)
  Repository innovation (grants, foundation support)
  Special projects (grants, foundation support)
• More detail in Beagrie, Eakin-Richards and Vision, iPres 2010

17 integrated journals

Curation

Revenue return

• Costs are recovered upfront, in order to
  allow free dissemination
  assure preservation
• Fees predominantly paid by journals, which may be
  passed on to authors
  subsidized by societies
  rolled into publisher costs/revenue
• Fees should be
  attractive: cost-effective relative to SOM
  fair: to all different types of journals
• Model will surely evolve
  Under control of consortium of partner journals

Plan                                                A Full                        B Associate                      C Author pays
Joining fee (waived for charter members)            $1000                         $1000                            NA
Annual fee from journal                             Prospective, $25/article (a)  Retrospective, $100/article (b)  0
Author charge at deposit                            0                             0                                $200
Length of contract                                  3 or 5 yrs                    3 or 5 yrs                       n/a
Legacy data deposits free?                          Y                             N                                N
Can move between plans A & B?                       Y                             Y                                N/A
Representative on Consortium Board                  Y                             Y                                N
Can vote on board and serve on executive committee  Y                             N                                N
Coordinated data deposit                            Y                             Y                                Y/N
Data DOI in published article                       Y                             Y                                Y/N
Branding of journal content                         Y                             Y                                N

(a) all peer-reviewed articles in prior yr
(b) articles with data deposited to Dryad
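To make the three plans easier to compare, the arithmetic in the fee schedule above can be sketched in a few lines of Python. This is only an illustration: the fee levels come from the slide, but the function names, the example journal (200 articles, 40 with deposited data), and the reading that Plan C's $200 is charged once per article with deposited data are assumptions.

```python
# Fee levels are taken from the plan table; the function and variable names
# are hypothetical and exist only for this illustration.
JOINING_FEE = 1000          # plans A and B, waived for charter members
PLAN_A_PER_ARTICLE = 25     # per peer-reviewed article published in the prior year
PLAN_B_PER_DEPOSIT = 100    # per article with data deposited to Dryad
PLAN_C_AUTHOR_CHARGE = 200  # paid by the author at deposit (assumed once per article)


def annual_cost(plan, articles_prior_year, articles_with_data):
    """Yearly amount paid under a plan (by the journal for A and B, by authors for C)."""
    if plan == "A":
        return PLAN_A_PER_ARTICLE * articles_prior_year
    if plan == "B":
        return PLAN_B_PER_DEPOSIT * articles_with_data
    if plan == "C":
        return PLAN_C_AUTHOR_CHARGE * articles_with_data
    raise ValueError(f"unknown plan: {plan!r}")


def first_year_cost(plan, articles_prior_year, articles_with_data, charter_member=False):
    """Annual cost plus the joining fee, which plan C and charter members do not pay."""
    joining = 0 if plan == "C" or charter_member else JOINING_FEE
    return joining + annual_cost(plan, articles_prior_year, articles_with_data)


# Hypothetical journal: 200 articles last year, 40 of them with deposited data.
for plan in ("A", "B", "C"):
    print(plan, annual_cost(plan, 200, 40), first_year_cost(plan, 200, 40))
```

Under these assumptions, plans A and B cost a journal the same when exactly a quarter of its articles deposit data ($25 × N = $100 × 0.25 × N); below that share Plan B is cheaper, above it Plan A is.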

Journal                              Society or publisher     Archiving requirement  Submission integration  Subscriber
American Naturalist                  ASN                      Y                      Y                       A
Evolution                            SSE                      Y                      Y                       A
Systematic Biology                   SSB                      Y                      Y                       A
Molecular Biology & Evolution        SMBE                     Y                      -                       A
Heredity                             The Genetics Soc.        Y                      In progress             A
Journal of Heredity                  AGA                      N                      Y                       A
Paleobiology / J. of Paleontology    Paleontology Society     Y                      In progress             A
Ecological Monographs                ESA                      Y                      In progress             A
Journal of Evolutionary Biology      ESEB                     Y                      Y                       A
Molecular Ecology / MER              Wiley-Blackwell          Y                      Y                       -
Biological J. Linnean Society        Linnean Soc. London      N                      Y                       -
Evolutionary Applications            Wiley-Blackwell          Y                      Y                       -
Integrative & Comparative Biology    SICB                     -                      In progress             -
BMC Ecology / BMC Evolution          Springer/BMC             Y?                     In progress             -
BMJ Open                             BMJ                      -                      In progress
PLoS Biology                         PLoS                     -                      In progress             -
Molecular Phylogenetics & Evolution  Elsevier                 -                      -                       -
Ecology Letters                      CNRS/W-B                 -                      -                       -
Journal of Ecology                   British Ecological Soc.  -                      -                       -

Issues with the subscription plan

• Are the differences in per-article costs appropriate?
  Plan B and Plan C are set based on incentives, not cost
  Should there be a Plan B at all?
  Should there be a greater safety buffer for Plan A?
• How to accommodate
  journals from developing countries?
  authors from non-partner journals who lack grant resources?
• Annual fee depends only on article volume
  Is this the most equitable arrangement?

Role for funders?

“this sort of open access archiving costs money and it is not clear who pays. Certainly research funding agencies seem very keen on the doing and not very keen on the paying.”

n=564

H. Piwowar (unpubl.)

Role for funders?

• Policy
  Strong archiving guidelines, with enforcement
  Endorsement of trusted repositories
• Funding
  Renewable infrastructure grants (supporting curation, maintenance, user support, business operations)
  Matching funds to repositories based on deposits or reuse
  Top-slicing to researchers
  Waiver funds for researchers

CC BY-NC-ND 2.0
