dans.knaw.nl DANS is an institute of KNAW and NWO. Open Research Data in H2020. Marjan Grootveld, OpenAIRE webinar, 26 October 2016

OpenAIRE webinar on Open Research Data in H2020 (OAW2016)


dans.knaw.nl DANS is an institute of KNAW and NWO

Open Research Data in H2020

Marjan Grootveld OpenAIRE webinar, 26 October 2016


Who we are

Open Access Infrastructure for Research in Europe www.openaire.eu


DANS: Data Archiving and Networked Services

Institute of Dutch Academy and Research Funding Organisation (KNAW & NWO) since 2005

First predecessor dates back to 1964 (Steinmetz Foundation), Historical Data Archive 1989

Mission: promote and provide permanent access to digital research information


DataverseNL for short- and mid-term storage

EASY: certified long-term Electronic Archiving System for self-deposit

NARCIS: Gateway to scholarly information in the Netherlands

Research data in context


Contents

• Brief recap from recent OpenAIRE-EUDAT webinars
• The updated Guidelines for FAIR Data Management:
  • F, A, I, R
  • Costs, data security, ethical aspects, other RDM procedures
• Recommendations
• Links to EC and OpenAIRE information


Recent webinars

Introductory RDM webinar, Tony Ross-Hellauer & Sarah Jones, 26 May:
• Reasons to manage data
• How to manage and share data (+ how to respond to concerns about sharing)
• EUDAT & OpenAIRE services
Q&A document: https://b2drop.eudat.eu/s/0H6qRgwdwkAVFvD#pdfviewer

“How to write a DMP”, Sarah Jones & Marjan Grootveld, 7/14 July:
• What is a Data Management Plan and why to write it?
• Example DMPs in different domains, with lots of links!
• Lessons and guidance (e.g. storing =/= archiving; how to find a repository; file-naming conventions)

All recordings and slides are on https://eudat.eu/events/webinars

https://www.eudat.eu Research Data Services, Expertise & Technology


Recap: why manage data?

(Not for the research funder, but for life we make data management plans)

• Make your research easier
• Stop yourself drowning in irrelevant stuff
• Save data for later
• Avoid accusations of fraud or bad science
• Write a data paper, connect your nano publications
• Share your data for re-use & get them validated in real life
• Get credit for it


NON PECUNIAE INVESTIGATIONIS CURATORE SED VITAE FACIMUS PROGRAMMAS DATORUM PROCURATIONIS


Horizon 2020 infographic


Horizon 2020: Open Research Data Pilot

The use of a Data Management Plan (DMP) is required for projects participating in the Open Research Data Pilot, detailing what data the project will generate, whether and how they will be exploited or made accessible for verification and re-use, and how they will be curated and preserved.

http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf


Guidelines on FAIR DM v.3

Structure of the Guidelines:

1. Background: extension of the pilot
2. DMP general definition
3. Proposal, submission and evaluation
4. RDM plans during the project life cycle
5. Support
6. Annex 1: the DMP template
   1. Data summary
   2. FAIR data
   3. Allocation of resources
   4. Data security
   5. Ethical aspects
   6. Other issues
   7. Summary table “Fair DM at a glance”


http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf


What’s new?

• You should develop a DMP for your project.
• There is a single DMP template from start to finish.
• The DMP template is inspired by the FAIR principles: research data should be findable, accessible, interoperable and re-usable (without suggesting any specific technology, standard, or implementation solution).

Also explicit in the new guidelines:
• From 1-1-2017 the pilot will cover all thematic areas of Horizon 2020.
• Costs related to open access to research data are eligible for reimbursement during the duration of the project under the conditions defined in the Grant Agreement.


Good things that remain

Whether a (proposed) project participates in the ORD pilot or chooses to opt out does not affect the evaluation of that project: proposals will not be penalised for opting out.

Participating in the ORD pilot does not necessarily mean opening up all your research data: as open as possible, as closed as necessary.

The DMP is a living document. You are not required to provide detailed answers to all the questions in the first version of the DMP (due M6).

Deposit in a research data repository:
a. the data needed to validate the results presented in scientific publications, including the metadata;
b. any other data, including the metadata, as specified in the DMP;
c. plus for a-b the documentation and the tools that are needed to validate the results, e.g. specialised software or software code, algorithms and analysis protocols (when possible, these instruments themselves).


DMPonline

A web-based tool to help researchers write DMPs

Guidance from EUDAT and OpenAIRE being added https://dmponline.dcc.ac.uk

Choose your funder to get their specific template

Choose any additional optional guidance


§2 Making data FAIR

Findable
– Assign persistent IDs, provide rich metadata, register in a searchable resource, ...

Accessible
– Retrievable by their ID using a standard protocol, metadata remain accessible even if data aren’t...

Interoperable
– Use formal, broadly applicable languages, use standard vocabularies, qualified references...

Reusable
– Rich, accurate metadata, clear licences, provenance, use of community standards...


www.force11.org/group/fairgroup/fairprinciples and http://www.nature.com/articles/sdata201618
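As a small illustration of "retrievable by their ID using a standard protocol", the sketch below turns a DOI into an HTTPS resolver URL. It assumes the public doi.org resolver; the function name is illustrative, and the example DOI is assumed to be the one behind the Nature article link above. No network request is made.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"  # public DOI resolver (assumed here)


def doi_to_url(doi: str) -> str:
    """Return a resolver URL for a bare DOI such as '10.1038/sdata.2016.18'."""
    doi = doi.strip()
    if not doi.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    # Keep the slash between prefix and suffix; percent-encode unsafe characters.
    return DOI_RESOLVER + quote(doi, safe="/")


# DOI assumed to correspond to the FAIR principles article cited on this slide.
print(doi_to_url("10.1038/sdata.2016.18"))
```

The point of the sketch is that the identifier stays stable while the location behind it may change; only the resolver needs to know where the data currently live.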


EC in the Guidelines: “This template is not intended as a strict technical implementation of the FAIR principles, it is rather inspired by FAIR as a general concept.”

EC Infographic: http://ec.europa.eu/research/images/infographics/policy/open-data-2016-w920.png


Some F questions

2.1 Making data findable, including provisions for metadata

• Use metadata and specify standards for metadata creation (if any). If there are no standards in your discipline describe what type of metadata will be created and how.

• Search keywords
• Persistent and unique identifiers such as DOI
• File and folder naming conventions: see OpenAIRE-EUDAT July webinar
• Versioning of the datasets and clear version numbers
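One way to combine a naming convention with clear version numbers is to generate file names from a fixed pattern. The sketch below is only an example: the pattern (project, description, date, zero-padded version) and the function name are assumptions, not a convention prescribed by the Guidelines or the July webinar.

```python
import re
from datetime import date


def dataset_filename(project: str, description: str, created: date,
                     version: int, ext: str) -> str:
    """Build a name like 'project_description_YYYYMMDD_v01.ext'."""

    def slug(text: str) -> str:
        # Lower-case and replace runs of non-alphanumerics with a hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return (f"{slug(project)}_{slug(description)}_"
            f"{created:%Y%m%d}_v{version:02d}.{ext.lstrip('.')}")


print(dataset_filename("OAW2016", "Survey results", date(2016, 10, 26), 3, ".csv"))
# prints: oaw2016_survey-results_20161026_v03.csv
```

Putting the date in year-month-day order and zero-padding the version keeps alphabetical and chronological sorting aligned.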


Metadata and documentation

• Metadata and documentation are needed to find and understand research data.
• Think about what others would need in order to find, evaluate, understand, and reuse your data.
• Get others to check the metadata to improve quality.
• Use standards to enable interoperability.

http://rd-alliance.github.io/metadata-directory
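To make the idea of checking metadata concrete, here is a minimal sketch of a metadata record with a completeness check. The field names are hypothetical and only loosely modelled on the mandatory properties of the DataCite schema; all values are made-up placeholders, and a real project should follow the standard used in its discipline.

```python
import json

# Hypothetical field names, loosely modelled on DataCite's mandatory properties.
REQUIRED_FIELDS = ("identifier", "creators", "title", "publisher",
                   "publication_year", "resource_type")


def missing_fields(record: dict) -> list:
    """Return the required fields that are absent or empty in the record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


record = {
    "identifier": "10.1234/example-dataset",  # placeholder, not a registered DOI
    "creators": ["Example, Researcher"],
    "title": "Example survey dataset",
    "publisher": "Example Repository",
    "publication_year": 2016,
    "resource_type": "Dataset",
    "keywords": ["open research data", "Horizon 2020"],
}

problems = missing_fields(record)
if problems:
    print("Missing metadata:", ", ".join(problems))
else:
    print(json.dumps(record, indent=2))
```

An automated check like this complements, but does not replace, asking a colleague to read the metadata.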


Some A questions

2.2 Making data openly accessible:
• Explain which data can’t be shared openly, if any
• Specify how access will be provided in case of restrictions, e.g. through a data committee, a license, or arranged with the repository.
• Will methods or software tools needed to access the data (if any) be included or documented?
• Deposit the data and associated metadata, documentation and code preferably in certified repositories which support Open Access.

Data Seal of Approval, ICSU World Data System, nestor seal, ISO 16363


Where to find a repository?

More information: https://www.openaire.eu/opendatapilot-repository
Zenodo: http://www.zenodo.org
Re3data.org: http://www.re3data.org


File format considerations

There is no clear-cut definition of a “sustainable file format”. Each archive has its own expertise, related to its designated community. Examples:

http://dans.knaw.nl/en/deposit/information-about-depositing-data?set_language=en
http://researchdata.4tu.nl/en/publishing-research/data-description-and-formats/

4TU.ResearchData DANS

Level 1 Level 2 or 3 Preferred Accepted

audio .wav .ra, .mp3, .wma .wav, .flac .aiff, .mp3, .aac

chemistry NMR, ChemDoodle, … .pdb, .xyz

databases delimited flat file w/DDL .mdb, .dbf, .acdb .sql, .siard, .csv .mdb, .dbf, .hdf5 …

video .mp1, .mp2, .mp4, .mov … .mpg2, .mpg4, .avi, .mov .mkv


Interoperability

A440, which has a frequency of 440 Hz, is the musical note A above middle C and serves as a general tuning standard for musical pitch. Prior to the standardization on 440 Hz, many countries and organizations followed the Austrian government's 1885 recommendation of 435 Hz. In the period instrument movement, a consensus has arisen around a modern baroque pitch of 415 Hz (A♭ of A440), baroque for some special church music (Chorton pitch) at 466 Hz (A♯ of A440), and classical pitch at 430 Hz.
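The 415 Hz and 466 Hz figures are, to rounding, one semitone below and one semitone above 440 Hz. A minimal sketch of that arithmetic, assuming twelve-tone equal temperament (the function name is illustrative):

```python
def equal_tempered(reference_hz: float, semitones: int) -> float:
    """Frequency of the note `semitones` steps away from the reference pitch."""
    # Each equal-tempered semitone multiplies the frequency by 2 ** (1 / 12).
    return reference_hz * 2 ** (semitones / 12)


print(round(equal_tempered(440, -1)))  # prints: 415 (A-flat relative to A440)
print(round(equal_tempered(440, 1)))   # prints: 466 (A-sharp relative to A440)
```

Without an agreed reference pitch the same written note maps to different frequencies, which is the interoperability point of the example.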

In the aftermath of the French Revolution (1789), the traditional units of measure used in the Ancien Régime were replaced. The livre monetary unit was replaced by the decimal franc, and a new unit of length was introduced which became known as the metre. The metre gained adoption in continental Europe during the mid nineteenth century, particularly in scientific usage, and was officially established as an international measurement unit by the Metre Convention of 1875.

Before clocks were invented, people kept time using different instruments to observe the Sun’s zenith at noon. Towns and cities set clocks based on sunsets and sunrises. Time calculation became a serious problem for people travelling by train, sometimes hundreds of miles in a day. UTC is the World's Time Standard.

Medical classification is the process of transforming descriptions of medical diagnoses and procedures into universal medical code numbers. SNOMED Clinical Terms (SNOMED CT) is intended to provide a set of concepts and relationships that offers a common reference point for comparison and aggregation of data about the health care process. SNOMED-CT is designed to be managed by computer.


Some I questions

2.3 Making data interoperable
• Specify what data and metadata vocabularies, standards or methodologies you will follow to facilitate interoperability.
• Standard vocabulary to allow inter-disciplinary interoperability or a mapping from your vocabulary to more commonly used ontologies?


Some R questions

2.4 Increase data re-use (through clarifying licences)

• License the data to permit the widest reuse possible
• Specify a data embargo, if this is needed
• How long will the data remain reusable?
• Describe data quality assurance processes

Re-use over time


Licensing research data and software

The EUDAT licensing wizard helps you pick a licence for data & software: http://ufal.github.io/public-license-selector/

You should also license Open Access data, or waive rights.

Horizon 2020 Open Access guidelines point to:

or


Keep everything? For always?

When regenerating data is cheaper than archiving, don’t archive. Select what data you’ll need and want to retain.

10 years is often stated in data policies and academic codes, but data can be valuable for ages, in climatology, sociology, health sciences, astronomy, linguistics, … Look beyond minimal retention periods where relevant.

“The lifetime of software is generally not as long as that of data” (Daniel Katz et al., http://bit.ly/2eScCKp)

RDNL Selection criteria: http://www.researchdata.nl/en/services/data-management/selecting-research-data/
DCC How-to guide: http://www.dcc.ac.uk/resources/how-guides/appraise-select-data


§3 Allocation of resources

• What are the costs for making data FAIR in your project?
• Resources for long term preservation

Check the UK Data Service Costing model.

Rule of thumb: 5% of the project budget is spent on RDM. The High Level Expert Group on the European Open Science Cloud recommends that “well budgeted data stewardship plans should be made mandatory and we expect that on average about 5% of research expenditure should be spent on properly managing and stewarding data”.

UKDS model: http://www.data-archive.ac.uk/create-manage/planning-for-sharing/costing
HLEG report: http://ec.europa.eu/research/openscience/pdf/realising_the_european_open_science_cloud_2016.pdf#view=fit&pagemode=none p. 19
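The 5% rule of thumb translates into a first budget estimate with one multiplication. The sketch below is only a starting point: the function name and the example project budget are made up, and a real estimate should come from a costing model such as the UKDS one.

```python
from decimal import Decimal


def rdm_budget(project_budget_eur, share=Decimal("0.05")):
    """Estimate the RDM budget as a share (default 5%) of the project budget."""
    # Decimal avoids binary floating-point rounding on monetary amounts.
    return (Decimal(project_budget_eur) * share).quantize(Decimal("0.01"))


# Hypothetical project with a budget of EUR 2,000,000.
print(rdm_budget(2_000_000))  # prints: 100000.00
```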


§4-6

Data security
• Provisions for data recovery, secure storage, transfer of sensitive data?
• Safely stored in certified repositories for long term preservation and curation?

Ethical aspects
• Any ethical or legal issues that can impact data sharing?
• Informed consent for data sharing and long term preservation included in questionnaires dealing with personal data?

Which other national/funder/sectorial/departmental procedures for data management do you use (if any)?


Closing remarks

Image “Fishbone” CC BY-NC-ND 2.0 by https://www.flickr.com/photos/mrjnl/


Recommendations

• Think about the desired end result and plan for this.
• Involve all work packages and partners to get a coherent plan.
• “Sharing” means “outside the consortium”.
• Approach the DMP in whatever way best fits your project:
• EC template is intended as a service, not an obligation. Read the background information and the guidance, and use it as a checklist.
• More than one dataset? Describe generically what is possible and dataset-specific what is necessary.
• Focus effort on datasets you’ll create rather than reuse.


The EC Open Research Data pilot

Key sources of information
• Guidelines on Open Access to Scientific Publications and Research Data in Horizon 2020: http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf
• Guidelines on Data Management in Horizon 2020: http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
• Annotated model grant agreement, clause 29.3: http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/amga/h2020-amga_en.pdf
• New infographic summarising key policy points: http://ec.europa.eu/research/press/2016/pdf/opendata-infographic_072016.pdf
• Open Access and Data Management: http://ec.europa.eu/research/participants/docs/h2020-funding-guide/cross-cutting-issues/open-access-dissemination_en.htm


OpenAIRE support materials

• Briefing papers, factsheets, webinars, workshops, FAQs

• Information on:
  • Open Research Data Pilot
  • Creating a data management plan
  • Selecting a data repository
  • Personal data

https://www.openaire.eu/opendatapilot
https://www.openaire.eu/support


dans.knaw.nl DANS is an institute of KNAW and NWO

Thank you!

Acknowledgements: Thanks to Sarah Jones (DCC), OpenAIRE and EUDAT for slides.

[email protected] http://dans.knaw.nl/