The Open Data Pilot: practical implementation Sarah Jones Digital Curation Centre, University of Glasgow sarah.jones@glasgow.ac.uk Twitter: @sjDCC

The Open Data Pilot: practical implementation

Sarah Jones

Digital Curation Centre, University of Glasgow

sarah.jones@glasgow.ac.uk

Twitter: @sjDCC

How to make data openly available

OPEN RESEARCH DATA

Image CC-BY-NC-SA by Tom Magllery www.flickr.com/photos/lwr/13442910354

How can researchers make data open?

1. Choose the dataset(s) to share

• What can be made open? This step may need to be revisited if problems are encountered later.

2. Apply an open licence

• Determine what IP exists. Apply a suitable licence e.g. CC-BY

3. Make the data available

• Provide the data in a suitable format. Use repositories.

4. Make it discoverable

• Post on the web, get a unique ID, register in catalogues…

https://okfn.org

Licensing research data openly

This DCC guide outlines the pros and cons of each approach and gives practical advice on how to implement a data licence:

www.dcc.ac.uk/resources/how-guides/license-research-data

CREATIVE COMMONS LIMITATIONS

• NC (Non-Commercial): what counts as commercial?

• SA (Share Alike): reduces interoperability

• ND (No Derivatives): severely restricts use

Licences carrying these clauses are not open licences.

Horizon 2020 Open Access guidelines point to: [two licence logos joined by "or"; images not preserved]

EUDAT licensing tool

Researchers can answer a series of questions to determine which licence(s) are appropriate to use

http://ufal.github.io/lindat-license-selector

Metadata standards

• Metadata is basic descriptive information to help others identify and understand the structure of the data e.g. title, author...

• Documentation provides the wider context e.g. the methodology / workflow, software and any information needed to understand the data

• Relevant standards should be used for interoperability

www.dcc.ac.uk/resources/metadata-standards
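
As a minimal sketch of what "basic descriptive information" might look like in machine-readable form, the Python below builds a small record and reports which fields are still empty. The field names are borrowed loosely from Dublin Core element names (title, creator, date, rights, format, identifier); the REQUIRED_FIELDS list, the helper functions and the example values are assumptions made for illustration and are not prescribed by the slides or by any particular standard.

```python
import json

# Hypothetical minimum set of descriptive fields (names follow Dublin Core elements).
REQUIRED_FIELDS = ("title", "creator", "date", "rights", "format", "identifier")


def missing_fields(record):
    """Return the required fields that are absent or empty in the record."""
    return [f for f in REQUIRED_FIELDS if not str(record.get(f, "")).strip()]


def to_json(record):
    """Serialise the record so it can be stored alongside the data files."""
    return json.dumps(record, indent=2, sort_keys=True)


# Illustrative values only; not taken from a real dataset.
example_record = {
    "title": "Example survey responses",
    "creator": "A. Researcher",
    "date": "2015-01-01",
    "rights": "CC-BY",
    "format": "text/csv",
    "identifier": "",  # e.g. a DOI, once a repository has assigned one
}

if __name__ == "__main__":
    print(to_json(example_record))
    print("Still to fill in:", missing_fields(example_record))
```

A disciplinary standard from the DCC catalogue would normally replace this ad hoc field list; the point is only that structured, checkable metadata is cheap to produce early in a project.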

Data file formats

If researchers want their data to be re-used and sustainable in the long-term, they should opt for open, non-proprietary formats.

Type            | Recommended                                         | Avoid for data sharing
Tabular data    | CSV, TSV, SPSS portable                             | Excel
Text            | Plain text, HTML, RTF; PDF/A only if layout matters | Word
Media           | Container: MP4, Ogg; Codec: Theora, Dirac, FLAC     | Quicktime, H264
Images          | TIFF, JPEG2000, PNG                                 | GIF, JPG
Structured data | XML, RDF                                            | RDBMS

Further examples:

www.data-archive.ac.uk/create-manage/format/formats-table
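
The format recommendations above could be applied as a rough pre-deposit check, sketched below in Python. The mapping from file extensions to the formats named on the slide is an assumption (the slide lists format names, not extensions), and the function names are hypothetical. An extension cannot reveal which codec a media file uses or whether a PDF is PDF/A, so such files still need a manual look.

```python
from pathlib import Path

# Assumed extension mapping for the formats named in the table; adjust as needed.
RECOMMENDED = {
    ".csv", ".tsv", ".por",            # tabular data (.por = SPSS portable)
    ".txt", ".html", ".rtf",           # text
    ".mp4", ".ogg", ".flac",           # media containers / codecs
    ".tif", ".tiff", ".jp2", ".png",   # images
    ".xml", ".rdf",                    # structured data
}
AVOID = {".xls", ".xlsx", ".doc", ".docx", ".mov", ".gif", ".jpg", ".jpeg"}


def classify(filename):
    """Label a file 'recommended', 'avoid' or 'unknown' from its extension."""
    ext = Path(filename).suffix.lower()
    if ext in RECOMMENDED:
        return "recommended"
    if ext in AVOID:
        return "avoid"
    return "unknown"


def group_by_status(filenames):
    """Group file names by classification for a quick review list."""
    groups = {"recommended": [], "avoid": [], "unknown": []}
    for name in filenames:
        groups[classify(name)].append(name)
    return groups


if __name__ == "__main__":
    files = ["results.csv", "Results.XLSX", "notes.pdf", "scan.tiff"]
    for status, names in group_by_status(files).items():
        print(status, names)
```

Files reported as "avoid" are candidates for conversion before sharing; "unknown" simply means the table does not cover that extension.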

Data repositories

http://service.re3data.org/search

Zenodo

• OpenAIRE-CERN joint effort

• Multidisciplinary repository

• Multiple data types

– Publications

– Long tail of research data

• Citable data (DOI)

• Links funding, publications, data & software

www.zenodo.org

Plan for sharing from the outset

Many decisions taken early on in the project will affect whether the data can be made openly available.

Researchers should:

• Ensure consent agreements also include permission to archive and share data for reuse by others

• Seek permissions for more than just the primary project purpose if signing licences to reuse third-party data. It may not be possible to share derivative data that includes somebody else’s IP

• Explore the potential for openness when drafting agreements with commercial partners

REVIEWING DATA MANAGEMENT PLANS

What to look for in Data Management Plans

Image CC-BY-NC-SA by Ralf Appelt www.flickr.com/photos/adesigna/4090782772

Horizon 2020 templates

Annex 1 (by month 6)

The DMP should address the points below on a dataset by dataset basis:

• Data set reference and name

• Data set description

• Standards and metadata

• Data sharing

• Archiving and preservation (including storage and backup)

Annex 2 (mid-term & final review)

Scientific research data should be easily:

1. Discoverable

2. Accessible

3. Assessable and intelligible

4. Useable beyond the original purpose for which it was collected

5. Interoperable to specific quality standards

http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
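
The dataset-by-dataset template on this slide can be treated as a simple completeness checklist. The Python below is a sketch under that assumption: it represents a DMP as a dictionary keyed by dataset name and lists which of the five template points each dataset leaves blank. The data structure, the function names and the example entries are hypothetical; the Horizon 2020 guidelines do not define a machine-readable DMP format.

```python
# The five points the slide lists for each dataset.
TEMPLATE_POINTS = (
    "Data set reference and name",
    "Data set description",
    "Standards and metadata",
    "Data sharing",
    "Archiving and preservation",
)


def unaddressed_points(dataset_entry):
    """Return the template points that are missing or empty for one dataset."""
    return [p for p in TEMPLATE_POINTS if not str(dataset_entry.get(p, "")).strip()]


def review(dmp):
    """Map each dataset name to the template points it does not yet address."""
    return {name: unaddressed_points(entry) for name, entry in dmp.items()}


# Hypothetical plan with two datasets; the text is placeholder content.
example_dmp = {
    "Interview transcripts": {
        "Data set reference and name": "PROJ-INT-01",
        "Data set description": "Anonymised interview transcripts",
        "Standards and metadata": "Descriptive metadata, plain text files",
        "Data sharing": "Open after anonymisation, CC-BY",
        "Archiving and preservation": "Deposit in a repository",
    },
    "Sensor readings": {
        "Data set reference and name": "PROJ-SEN-01",
        "Data set description": "Hourly temperature readings",
        "Standards and metadata": "CSV with a column description file",
        "Data sharing": "",
    },
}

if __name__ == "__main__":
    for dataset, gaps in review(example_dmp).items():
        print(dataset, "->", gaps or "all points addressed")
```

A check like this only shows that something was written under each heading; judging whether the answer is appropriate and feasible remains a task for the reviewer.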

Common themes to cover

• Data Description

• Standards and Metadata (discoverable / usable / interoperable)

• Data Sharing (as open as possible, as closed as necessary)

• Archiving and preservation

Key things to check

• Is the plan appropriate?

– adopting relevant standards

– practices in line with norms for that field

– use of support services e.g. university storage, subject repositories…

• Does it seem feasible to implement?

• Has sufficient information been provided?

• Has advice been sought where needed?

• Are restrictions and costs properly justified?

Main judgement to make:

Has the researcher taken time to reflect on what to do?

There are no absolute right answers. You just want to be reassured that due consideration has been given and the approach seems reasonable.

Data Description

• Is it clear what data will be collected?

• Are appropriate file formats proposed?

• Has the reuse or integration of existing data been considered? (if appropriate)

• If third-party data will be reused, has sharing been considered in the licence agreements?

Standards and Metadata

• Will enough contextual information and structured metadata be provided to allow others to find, understand and reuse the data?

• Will the data be documented during the research? Has time been allocated to this?

• Will formal standards be used? (where available)

• Is information being captured & shared on the associated software and tools needed for reuse and reproducibility?

Data Sharing

• Is it clear which data will be shared and with whom?

– Are opportunities to share data openly maximised? e.g. by seeking consent to share, anonymising data…

– If data can’t be shared, are the reasons why explained?

• Will the data be easily accessible and openly licensed?

• If an embargo period is planned, is that in line with norms for that discipline?

• Will persistent IDs be assigned for discovery and citation?

Archiving and Preservation (incl. storage)

• Will the research data be deposited in a suitable community database, repository or archive?

• Are there any costs associated with preservation, and if so, how will these be covered?

• Will the data be stored and backed up appropriately during the research project? e.g. on managed university filestores rather than external hard drives

Reviewing DMPs

Useful guidelines

• ESRC guidance for peer-reviewers: www.esrc.ac.uk/_images/Data-Management-Plan-Guidance-for-peer-reviewers_tcm8-15569.pdf

• MRC guidelines: www.mrc.ac.uk/documents/pdf/data-management-plans-guidance-for-reviewers

• Johns Hopkins grant reviewers cribsheet: https://dmp.data.jhu.edu/resources/grant-reviewers-guide

How to assess DMPs (forthcoming guide)

DCC support on Data Management Plans

• Checklist on what to include

• How-to guide on developing a plan

• Webinars and training materials

• DMPonline tool

• Example DMPs

www.dcc.ac.uk/resources/data-management-plans

DMPonline

A web-based tool to help researchers write DMPs

Includes a template for Horizon 2020

https://dmponline.dcc.ac.uk

Example data management plans

• Technical appendix submitted to AHRC by Bristol Uni: http://data.blogs.ilrt.org/files/2014/02/data.bris-AHRC-example-Technical-Plan.pdf

• Rural Economy & Land Use (RELU) programme examples: http://relu.data-archive.ac.uk/data-sharing/planning/examples

• UCSD example DMPs (20+ scientific plans for NSF): http://libraries.ucsd.edu/services/data-curation/data-management/dmp-samples.html

• LSHTM guide and worked example for Wellcome Trust: www.lshtm.ac.uk/research/researchdataman/plan/wellcometrust_dmp.pdf

• Further examples: www.dcc.ac.uk/resources/data-management-plans/guidance-examples

Thanks for listening

DMP guidance, tools & resources:

www.dcc.ac.uk/resources/data-management-plans

Follow us on Twitter:

@digitalcuration and #ukdcc #DMPonline

Exercise: reviewing DMPs

In pairs or small groups:

1. Read through the example DMP or one you brought along (5 mins)

2. Discuss what you think about the example DMP (10 mins)

— Did you get a clear sense of what data will be created?

— Were particular standards and file formats named and explained?

— Is there enough information about how the data will be made available?

— Will the data be deposited in a repository for preservation?

3. Report back the main points from your discussion (5 mins)