leon-osinski
Research data management
Open Research Data pilot, data management (plans), FAIR data, data repositories, metadata
Everlasting project, General assembly
TU/e, 21-03-2017
[email protected], TU/e IEC/Library
Available under CC BY-SA license, which permits copying and redistributing the material in any medium or format & adapting the material for any purpose, provided the original author and source are credited & you distribute the adapted material under the same license as the original
Topics
1. Horizon 2020: Open Research Data pilot
2. Requirements Open Research Data Pilot
   + Data management plan and FAIR data
   + Depositing research data and 4TU.Centre for Research Data
3. Metadata
Horizon 2020: guiding principles for research data management
1. Scientific integrity
   + Traceability of research results, from the figure in a paper to the underlying raw data
2. Reuse
   + Build on previous results (data-driven science)
   + Encourage collaboration/avoid duplication of effort (collaborative science)
   + Innovation/progress to market
   + Involve citizens and society
Ideological shift: from trust to responsibility and accountability
Horizon 2020: the Open Research Data (ORD) pilot
“The ORD pilot aims to improve and maximize access to and re-use of research data generated by Horizon 2020 projects…”
“The ORD pilot applies primarily to the data needed to validate the results presented in scientific publications.”
“Good research data management is not a goal in itself, but rather the key conduit leading to knowledge discovery and innovation…”
Source: Research Data Netherlands / Marina Noordegraaf
ORD pilot: is participation mandatory?
Participation is the default option, but projects can opt out at any stage, including after signature of the agreement.
ORD pilot: is open access of data the goal?
The ORD pilot follows the principle “as open as possible, as closed as necessary”. “…the need to balance openness and protection of scientific information, commercialisation and Intellectual Property Rights (IPR), privacy concerns, security…” is recognized.
ORD pilot: which data should be made available?
The ORD pilot applies primarily to:
“the ‘underlying data’ (the data needed to validate the results presented in scientific publications), including the associated metadata (i.e. metadata describing the research data deposited).”
Other data can also be provided.
ORD pilot: are costs eligible for refund?
“Costs related to open access to research data (…) are eligible for reimbursement during the duration of the project…”
ORD pilot: good research data management
Besides reuse of research data, the ORD pilot also focuses on good data management. Good data management prepares for reuse; conversely, reuse implies data management.
“… participating in the ORD pilot does not necessarily mean opening up all your research data. Rather, the focus of the Pilot is on encouraging good data management as an essential element of research best practice.”
ORD pilot: FAIR principles
Good research data management is data management following the FAIR principles.
Research data should be Findable, Accessible, Interoperable and Reusable.
ORD pilot: requirements
The conditions set by Horizon 2020 with regard to research data management come down to two requirements:
1. Formulate a data management plan;
2. Deposit research data.
Data management plan
The data management plan provides information on the handling of research data during and after the end of the project, with the FAIR principles in mind:
+ The data collection (newly generated data versus pre-existing data, file formats, special tools needed, data size)
+ Data storage and back-up (storage media, safe and secure storage)
+ Data documentation (metadata)
+ Whether, how and what data will be shared/made open access during and after the project
+ Data preservation and archiving after the project
“Once a project has had its funding approved and has started, you must submit a first version of your DMP (as a deliverable) within the first six months of the project.”
Data management plan: FAIR principles
Findable: easy to find by both humans and computer systems
+ Data are assigned a DOI after research and described by rich metadata; naming conventions and versioning are used during research
Accessible: easy to be obtained by humans and computers
+ Access to data (who, where, how long), storage during and archiving after project, can data be made open access?
Interoperable: easy to be combined with other data sets by humans and computers
+ Data-exchange between researchers, institutions, machines; are standard metadata and vocabularies used, open data formats?
Reusable: easy to be used for future research and to be processed further by humans and using computational methods
+ Data quality and provenance; licenses added to data (who can use the data under which conditions)
It’s still unclear how to turn each of these components into reality!
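As one small, concrete illustration of the "naming conventions" and "versioning" listed under Findable, the sketch below builds and checks file names. The pattern `<project>_<description>_<YYYYMMDD>_v<NN>.<ext>` and the function names are assumptions made for this example only; neither Horizon 2020 nor 4TU.Centre for Research Data prescribes this particular convention.

```python
import re
from datetime import date

# Hypothetical convention: <project>_<description>_<YYYYMMDD>_v<NN>.<ext>
NAME_PATTERN = re.compile(
    r"^(?P<project>[a-z0-9]+)_(?P<description>[a-z0-9-]+)_"
    r"(?P<date>\d{8})_v(?P<version>\d{2})\.(?P<ext>[a-z0-9]+)$"
)


def build_name(project, description, day, version, ext):
    """Build a file name that follows the hypothetical convention."""
    return (
        f"{project.lower()}_{description.lower()}_"
        f"{day:%Y%m%d}_v{version:02d}.{ext.lower()}"
    )


def next_version(name):
    """Return the same file name with the version number increased by one."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"name does not follow the convention: {name}")
    parts = match.groupdict()
    new_version = int(parts["version"]) + 1
    return (
        f"{parts['project']}_{parts['description']}_"
        f"{parts['date']}_v{new_version:02d}.{parts['ext']}"
    )


# Example values are illustrative only.
name = build_name("everlasting", "cell-cycling", date(2017, 3, 21), 1, "csv")
print(name)
print(next_version(name))
```

Keeping the date and version in the file name itself means a data set can still be identified after it has been copied out of its original folder, which is the point of agreeing on a convention during the project rather than afterwards.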
Data management plan: templates
+ DMP template Horizon 2020 (via DMPOnline): recommended but voluntary
+ DMP template by 4TU.Centre for Research Data
+ Examples of H2020 DMPs: http://www.dcc.ac.uk/resources/data-management-plans/guidance-examples
Deposit research data
+ ‘Underlying data’ of a scholarly paper, including the associated metadata needed;
+ Preferably in a research data repository;
+ Take measures to enable others to access, exploit, reproduce and disseminate the deposited data;
+ Provide information via the chosen repository about the tools that are needed to validate the results.
Deposit research data: 4TU.Centre for Research Data #1
4TU.Centre for Research Data is for static data (‘frozen’ data sets, ‘milestone’ data sets) after the project has ended.
With 4TU.Centre for Research Data, data are made findable and accessible:
+ Data are assigned a DOI [findable]
+ Data can be linked to publications (DOI reservation is possible) [findable]
+ Data are assigned descriptive/discovery metadata [findable]
+ Data are assigned a user license (comparable to CC BY-NC; other user licenses are being developed) [reusable]
+ Data are open access (restricted access options are being developed) [accessible]
+ Data are archived/preserved for the long term [accessible]
+ Metadata can be harvested by Google etc. [findable]
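To make "descriptive/discovery metadata" more tangible, here is a minimal sketch of what such a record might contain and how its completeness could be checked before depositing. The field names, the required-field list and all values (including the DOI) are hypothetical placeholders chosen for illustration; they are not the actual deposit form or schema of 4TU.Centre for Research Data.

```python
import json

# Hypothetical set of required fields, for illustration only.
REQUIRED_FIELDS = (
    "identifier",
    "creators",
    "title",
    "publisher",
    "publication_year",
    "license",
)


def missing_fields(record):
    """Return the required fields that are absent or empty in the record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


record = {
    "identifier": "10.0000/placeholder",  # placeholder, not a real DOI
    "creators": ["Researcher, A."],
    "title": "Example battery measurement data set",
    "publisher": "4TU.Centre for Research Data",
    "publication_year": 2017,
    "license": "",  # left empty on purpose to show the check
}

print(json.dumps(record, indent=2))
print("Missing:", missing_fields(record))
```

A check like this only confirms that fields are filled in; whether the metadata are rich enough for others to find and reuse the data remains a judgment call.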
Deposit research data: 4TU.Centre for Research Data #2
4TU.Centre for Research Data
Research data management: Documentation and metadata #1
Enhancing the re-usability and interchangeability of your measurement data:
1. by adding a readme-file with data specific information on:
   + the size of the data set, what’s included and excluded;
   + the provenance of the data (how you collected the data and data manipulation steps);
   + the parameters/variables used (how each was measured), measurement units, codes/symbols used, etc.
2. by using the same metadata scheme
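The readme-file described in point 1 could be generated from a few structured inputs, so that every data set in a project documents the same items in the same order. The following is a minimal sketch; the function name `build_readme`, the layout of the text and all example values are assumptions for illustration, not a prescribed readme format.

```python
def build_readme(title, size, included, excluded, provenance, variables):
    """Return readme text covering size, scope, provenance and variables.

    `provenance` is a list of collection/manipulation steps in order.
    `variables` maps a variable name to a (unit, how_measured) tuple.
    """
    lines = [
        f"Data set: {title}",
        f"Size: {size}",
        f"Included: {included}",
        f"Excluded: {excluded}",
        "",
        "Provenance:",
    ]
    lines += [f"  {number}. {step}" for number, step in enumerate(provenance, start=1)]
    lines += ["", "Variables:"]
    for name, (unit, how_measured) in variables.items():
        lines.append(f"  {name} [{unit}]: {how_measured}")
    return "\n".join(lines) + "\n"


# All values below are hypothetical examples.
readme_text = build_readme(
    title="Example battery cycling measurements",
    size="3 CSV files, about 12 MB",
    included="charge and discharge cycles at room temperature",
    excluded="aborted test runs",
    provenance=[
        "Measured with a battery cycler in the lab",
        "Removed rows with sensor errors",
    ],
    variables={
        "voltage": ("V", "terminal voltage measured at the cell"),
        "current": ("A", "load current, positive when charging"),
        "time": ("s", "seconds since the start of the cycle"),
    },
)
print(readme_text)
```

The resulting text can be saved as a plain-text file next to the data files, so that it travels with the data set when it is deposited.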
Research data management: Documentation and metadata #2
Battery data set: https://ti.arc.nasa.gov/tech/dash/pcoe/prognostic-data-repository/#battery
Battery data metadata scheme: http://mcscience.com/home/mcscience-web-logos/experimentors/modenjay-experiment-platform/test-recipe-3-metadata/
Support
+ RDM desk: [email protected]
+ DMP support: Sjef Öllers: [email protected]; Leon Osinski: [email protected]
+ Website Data Coach: http://www.tue.nl/datacoach (with information on funder policies, soon RDM programme website)
URLs of mentioned and important webpages
1. Horizon 2020 participant portal online manual: open access and data management: http://ec.europa.eu/research/participants/docs/h2020-funding-guide/cross-cutting-issues/open-access-dissemination_en.htm
2. Horizon 2020 Guidelines on FAIR data management (version 3.0, 26-07-2016): http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
3. Horizon 2020 Guidelines on open access to scientific publications and research data (version 3.1, 25-08-2016): http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf
4. Expert group on turning FAIR data into reality: http://ec.europa.eu/transparency/regexpert/index.cfm?do=groupDetail.groupDetail&groupID=3464&NewSearch=1&NewSearch=1
5. Data management plan template Horizon 2020: https://dmponline.dcc.ac.uk/
6. Data management plan template 4TU.Centre for Research Data: http://researchdata.4tu.nl/en/planning-research/data-management-plan/
7. Examples of H2020 DMPs: http://www.dcc.ac.uk/resources/data-management-plans/guidance-examples
8. 4TU.Centre for Research Data: http://data.4tu.nl
9. Paper on FAIR data principles: http://dx.doi.org/10.3233/ISU-170824
10. Battery data set: https://ti.arc.nasa.gov/tech/dash/pcoe/prognostic-data-repository/#battery
11. Battery data metadata scheme: http://mcscience.com/home/mcscience-web-logos/experimentors/modenjay-experiment-platform/test-recipe-3-metadata/
12. TU/e Data Coach: http://www.tue.nl/datacoach