Twente Grants Week: Data management
Maarten van Bentum (Library & Archive)
Overview
1. Questions to be answered (storage, description and sharing/archiving)
2. Scientific integrity
3. Why data management
4. Data Management Plan (DMP)
5. Archiving data (data repositories/centres)
Questions to be answered
1. Where do you keep your research data?
2. Is there a backup? Where? How many copies?
3. How do you document/describe your data?
4. Who can access your data?
5. Do you share your data during or after research, for instance for reuse? If
not, why?
6. What will happen to your data after finishing your research?
Question 1-2: Where do you keep your data? Is there a backup? Where? How many copies? (2/3)
Storage options
1. UT central storage
p- or m-disk (ICTS):
http://www.utwente.nl/icts/diensten/catalogus/dataopslag_mw/storage/
2. Project, community or research institute storage
IGS Datalab: https://www.utwente.nl/igs/datalab/
3. Individual data storage (computer, dvd/cd, external hard disk,…)
4. Non-commercial cloud storage
Surfdrive: https://www.surfdrive.nl/en
DataverseNL: https://dataverse.nl/dvn/
5. Commercial cloud storage: Dropbox, OneDrive, …
Question 1-2: Where do you keep your data? Is there a backup? Where? How many copies? (1/3)
Criteria
Sustainability/reliability: backup frequency (offline / off-site?)
Dataset type: raw dataset, versions during processing and analysis, final
datasets
Size dataset: capacity, costs, data transfer
Legal or contractual regulations
Access: individual, community, open
DMP - Data storage and backup (3/3)
Backup
3 copies (original, external/local, external/remote)
Local vs. remote depends on recovery time needed
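The three-copy rule is only useful if the copies are actually identical to the original. A minimal sketch of such a check, assuming each copy is reachable as an ordinary file path (the function names `sha256_of` and `verify_copies` are illustrative and not part of any UT or ICTS tool):

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copies(original: Path, copies: list[Path]) -> dict[str, bool]:
    """Map each copy's path to True if it exists and matches the original."""
    reference = sha256_of(original)
    return {
        str(copy): copy.is_file() and sha256_of(copy) == reference
        for copy in copies
    }
```

Calling `verify_copies(Path("raw.csv"), [Path("/mnt/local/raw.csv"), Path("/mnt/remote/raw.csv")])` (hypothetical paths) would flag a missing or altered copy as `False`, which could be run after each backup.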
Question 3: How do you document/describe your data? (1/2)
Documentation during research of dynamic data sets (for yourself, fellow
researchers in the project and/or group)
Documentation after research of static data sets (for discovery, verification,
replication, and reuse)
Documentation: standard metadata schemes enhanced with specific descriptive
elements necessary for verification, replication, and reuse
See list: http://www.dcc.ac.uk/resources/metadata-standards/list
See also 3TU.Datacentrum Data description and formats
Question 3: How do you document/describe your data? (2/2)
Metadata 3TU.Datacentrum
Creator*: Main researcher(s) involved in producing the data
Contributor: Institution where the data was created or collected
Publisher*: Institution which submitted the work
Title*: Name or title by which a resource is known
Publication year*: The year when the data was or will be made publicly available
Date created: Date the resource itself was put together; date range or a single date
Description*: Concise description of the contents of the dataset
Subject: Subject, keyword, classification code, or key phrase describing the resource
Coverage temporal: Indicate the dates to which the data refer
Coverage spatial: Describe the geographic area to which the data refer
Identifier: A persistent identifier to a dataset
URL to publication: Include the web addresses for any publication
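A metadata record of this kind can be checked for completeness before deposit. The sketch below assumes that the fields marked with an asterisk above are the mandatory ones (the slide does not say so explicitly), and the snake_case field names and the function `missing_required` are illustrative, not the 3TU.Datacentrum submission format:

```python
from __future__ import annotations

# Fields marked with * on the slide; treating * as "required" is an assumption.
REQUIRED_FIELDS = (
    "creator",
    "publisher",
    "title",
    "publication_year",
    "description",
)


def missing_required(record: dict[str, object]) -> list[str]:
    """Return the required fields that are absent or empty in the record."""
    return [
        field
        for field in REQUIRED_FIELDS
        if not str(record.get(field) or "").strip()
    ]


# Hypothetical record, for illustration only.
example = {
    "creator": "A. Researcher",
    "title": "Example dataset",
    "publication_year": 2016,
    "description": "",
}
print(missing_required(example))
```

Here the example record lacks a publisher and has an empty description, so both are reported.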
Question 4: Who can access your data? (1/2)
Verifiability
3.1. Research must be replicable in order to verify its accuracy. The choice of research question, the research set-up, the choice of method and the references to sources used are accurately documented in a form that allows for verification of all steps in the research process.
3.2. The quality of data collection, data input, data storage and data processing is closely guarded. All steps taken must be properly reported and their execution must be properly monitored (lab journals, progress reports, documentation of arrangements and decisions, etc.).
3.3. Raw research data are stored for at least ten years. These data are made available to other academic practitioners upon request, unless legal provisions dictate otherwise.
3.4. Raw research data are archived in such a way that they can be consulted at all times and with a minimum expense of time and effort.
3.5. The source of all educational material, written as well as oral, is stated
(From: The Netherlands Code of Conduct for Academic Practice)
Question 4: Who can access your data? (2/2)
- UT data policy?
- Funder requirements?
- Requirements other parties? Contracts?
- Open Access required? Possible? Dutch Personal Data Protection Act (UT Data
Protection Officer)
Question 5: Do you share your data during or after research, for instance for reuse? If not, why?
Why share your data?
Replication / verification
Promote your research
Enable new discoveries (reuse)
"Open where possible, protected where needed"
See NWO policy http://www.nwo.nl/en/policies/open+science
After research: public, linked to publication(s) > 3TU.Datacentrum, DANS,
DataverseNL
Question 6: What will happen to your data after finishing your research?
Proper archiving:
Trusted data repositories (DANS, 3TU.Datacentrum)
Linked to publications
Open or restricted access (DANS)
Open: funder requirements
NWO data management pilot:
http://www.nwo.nl/en/policies/open+science/data+management
EC – Horizon 2020 data management pilot:
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
Restricted: Legal and contractual regulations (Dutch Personal Data
Protection Act, http://www.utwente.nl/az/gegevensbescherming/, in
Dutch)
Scientific integrity
1. Criteria: Fabrication, Falsification and Plagiarism (FFP)
2. Fabrication of data (Stapel, Schön)
3. Untraceable data (Poldermans)
Neglect of basic preservation of data
Neglect of data management
No proper mechanism for quality control: no data or instruments for
easy data reproduction means no possible check
See also: https://www.utwente.nl/en/organization/structure/management/good-management/
Why manage research data
Validate research results or verify data (e.g. The Netherlands Code of
Conduct for Academic Practice)
Use/Reuse research data (secondary user)
Obligation by the research funding body (EC and NWO)
Uniqueness of the data (e.g. innovative character of the research)
Value of the data (non-repeatable observations)
Importance of data / heritage (e.g. history of science)
Data Management Plan
Formal research project document about what and how data will be collected,
stored, described, and archived and how access, reuse and linking to
publications will be realised.
Responsibility
Description of data
Methodology data collection
Documentation: metadata (standards)
Quality assurance
Storage and backup
Policies for access and sharing and provisions for appropriate protection/privacy
Policies and provisions for reuse, redistribution
Plans for archiving and preservation of access
From: National Science Foundation and University of California
Data Management Plan
Information, templates and checklists
UT template: website RDM on Library & Archive
3TU.Datacentrum: template
DANS checklist
NWO form
Data repositories
Data centres:
3TU.Datacentrum
DANS
List of data repositories: Databib or Data repositories
Enhanced publication
Support and/or advice
Information specialist in your faculty
or
Maarten van Bentum (data librarian):
tel. 489 4474