Presented at Research Data Access and Preservation (RDAP) 2014 as part of the panel "Developing and implementing institutional policies on research data: ownership, preservation, and compliance."
It’s a Real World: Developing Preservation Policy for Dryad
Ayoung Yoon (Dryad preservation working group, Doctoral Candidate at UNC-CH)
Sara Mannheimer (Former Dryad curator, Data management librarian at Montana State University)
Elena Feinstein, Jane Greenberg, Ryan Scherle, Dryad Digital Repository
March 26, 2014
Research Data Access & Preservation Summit (RDAP) 2014
Outline
• Introduction
• What is Dryad Digital Repository?
• Preservation policy development process
• Dryad preservation policy
• Lessons learned and open questions
• Conclusion
• Acknowledgement
Introduction
• “Data deluge”
• Journal policies and funding agency mandates
• Benefits of archiving and preserving research data:
  – Facilitates:
    • verification of research
    • accessibility and discoverability
    • opportunities for data reuse
    • increased citations
    • research visibility
  – Prevents:
    • redundant data collection
    • inefficient legacy data curation
    • burden of sharing-on-request
• Challenges of data archiving:
  – Wider variety of file formats than most digital archival materials
  – New versions as data sets are added to and updated
  – Security considerations
  – Large amounts of data
Benefits adapted from Beagrie N, Lavoie BF, Woollard M (2010) Keeping research data safe 2. HEFCE.
Why preservation policy?
• Preservation policy supports strategic planning for implementation
• Communicates to stakeholders
  – trustworthiness and commitment to preservation
• Not many data preservation policies exist. Some examples:
  – CERN: CMS data
  – Archaeology Data Service
  – NSIDC Data Management Policies
  – Odum Institute Preservation Policy
  – ICPSR
  – DataONE
Dryad Digital Repository
• A curated, general-purpose repository that makes the data underlying scientific and medical publications discoverable, freely reusable, and citable (http://datadryad.org/).
• Facilitates data availability, data sharing, and scholarly communication.
• Originally partnered with leading journals and scientific societies in evolutionary biology and ecology.
• Broad collecting policy: almost any data is accepted, as long as it is associated with a publication.
Common filetypes in Dryad
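[Bar chart: number of files per format, horizontal axis from 0 to 2500. Formats in the order shown on the chart: HTML, WAV, Phylip, R script, JPEG image, Newick tree file, RTF, XML, GZip archive, MS Word/OpenXML, Nexus, unknown format, FASTA, Zip archive, CSV, MS Excel/OpenXML, plain text.]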
Dryad and Preservation Needs
• Preservation is a major part of Dryad’s mission.
• Current preservation actions:
  – MD5 checksums
  – provenance metadata
  – informal encouragement of preferred formats
• Developing and implementing a formal preservation policy will:
  – guide current and future preservation practice
  – facilitate the long-term preservation of the repository’s digital assets
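The MD5 checksum action listed above can be illustrated with a minimal fixity check. This is a sketch only, not Dryad's actual implementation: the function names `md5_checksum` and `verify_fixity` are hypothetical, and it assumes the checksum recorded at deposit is available as a hexadecimal string.

```python
import hashlib


def md5_checksum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in chunks so that large data files do not
    have to fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, recorded_checksum):
    """Return True if the file still matches the checksum recorded at deposit.

    `recorded_checksum` is assumed to be a hex string (hypothetical
    storage format); comparison ignores case and surrounding whitespace.
    """
    return md5_checksum(path) == recorded_checksum.strip().lower()
```

A repository could compute the checksum at ingest, store it with the provenance metadata, and rerun `verify_fixity` on a schedule to detect silent corruption. MD5 is adequate for detecting accidental change but is not collision-resistant, so it should not be relied on against deliberate tampering.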
Policy Development Process
• 2012: An initial preservation plan (version 1.0)
• Feb 2013: Preservation Working Group formed
• May 2013: Version 2.0 presented to the Dryad Board of Directors
• July 2013: Version 2.0 revised in cooperation with Dryad staff
• Nov 2013:
  – Version 2.4 approved by Dryad Board of Directors
  – Preservation Working Group dissolved
  – Preservation Task Force formed
Preservation Policy
• Purpose
• Scope and content coverage
• Overview of preservation strategies
• Format support and levels of preservation
  – e.g. preferred formats and format support levels
• Implementing the strategy
  – e.g. integration of OAIS functional activities, pre-ingest and ingest, archival storage, authenticity and integrity, security, versioning, and withdrawal of collections
• Sustainability plans
  – e.g. technical sustainability, institutional and financial sustainability
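The idea of format support levels can be sketched as a simple lookup from file extension to level. Everything specific here is an assumption made for illustration: the level names (`preferred`, `accepted`, `unknown`), the extension mapping, and the function names are hypothetical and are not taken from Dryad's policy.

```python
from collections import Counter
from pathlib import Path

# Hypothetical mapping, for illustration only; a real policy would
# define its own levels and format lists.
SUPPORT_LEVELS = {
    ".csv": "preferred",
    ".txt": "preferred",
    ".xml": "preferred",
    ".xlsx": "accepted",
    ".docx": "accepted",
    ".zip": "accepted",
}


def support_level(filename):
    """Return the assumed support level for a file, based on its extension."""
    suffix = Path(filename).suffix.lower()
    return SUPPORT_LEVELS.get(suffix, "unknown")


def summarize_deposit(filenames):
    """Count how many files in a deposit fall under each support level."""
    return Counter(support_level(name) for name in filenames)
```

For example, `summarize_deposit(["counts.csv", "notes.docx", "tree.nwk"])` would report one file at each of the three levels, which a curator could use to decide where to encourage conversion to a preferred format.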
Lessons Learned and Open Questions
• A negotiation between what is ideal and what is realistic
  – Adopting the international standards, models, and best practices that exist for long-term preservation
    • Open Archival Information System (OAIS) reference model (ISO 14721:2003)
    • PREMIS (PREservation Metadata: Implementation Strategies)
  – Other standards and guidelines on audit and certification for building a trusted digital repository
    • Trustworthy Repositories Audit & Certification: Criteria and Checklist (TRAC) and Data Seal of Approval (DSA)
Lessons Learned and Open Questions
• Align with other internal and institutional policies
  – Follow Dryad’s internal policies: we looked primarily to Dryad’s Terms of Service document (https://datadryad.org/pages/policies), which includes policies on submission, content, payment, usage, and privacy
  – Comply with Dryad’s unofficial policies, which have yet to be finalized
    • A policy-in-progress: Dryad’s policy on versioning
  – Comply with policy from partner institutions
    • Dryad functions as a partnership between the University of North Carolina at Chapel Hill (UNC), Duke University (Duke), and North Carolina State University (NC State)
Lessons Learned and Open Questions
• Structuring the policy according to Dryad’s specific needs
  – Meeting specific organizational needs is fundamentally important and should be the first consideration in all work, as each organization has different goals, priorities, and capabilities.
  – Data depositors’ requirements: minimum requirements
    • balance “minimum effort” with having “enough” representation information
    • compensated by other factors
Conclusion
• Policy creation and planning are just first steps; implementation will require further consideration
• Future plans
  – Potential for implementing TRAC / DSA in the future
  – Divide policy and implementation into separate documents
  – New Task Force
Acknowledgement
• Thank you to the Dryad Preservation Working Group:
  – Co-chairs:
    • Jane Greenberg, professor, SILS/UNC-CH, director, MRC
    • Sara Mannheimer, former Dryad curator, Data Management Librarian, Montana State University
  – Members:
    • Alex Ball, Research Officer, UKOLN, University of Bath
    • Robin Dale, Director of Digital and Preservation Services, LYRASIS
    • Michael Day, Digital Curation Centre, UKOLN, University of Bath
    • Ruth Duerr, Principal Investigator/Project Manager, Data Management and Cyberinfrastructure Manager, Data Stewardship, National Snow & Ice Data Center
    • Elena Feinstein, former Dryad curator, Librarian for Chemistry and Biological Sciences, Duke University
    • Cal Lee, Associate Professor, SILS/UNC-CH
    • Robin Rice, Data Librarian, EDINA and Data Library, University of Edinburgh
    • Ayoung Yoon, SILS doctoral student
• This work was supported in part by the National Science Foundation (NSF), award number 1147166, “ABI Development: Dryad: scalable and sustainable infrastructure for the publication of data.”
Thank you!
Ayoung Yoon
Doctoral candidate, University of North Carolina at Chapel Hill
[email protected]
Sara Mannheimer
Data management librarian, Montana State University
[email protected]