RESEARCHER DEVELOPMENT PROGRAMME Research Data Management, March 2017. Bill Worthington, Research and Scholarly Communications Manager, [email protected]

University of Hertfordshire researcher development - research data management





Learning Aims and Outcomes

Aim: to develop your awareness of research data management and what it means to you

By the end of this session you will

✓ Know what RDM is

✓ Have an awareness of the requirements of the funding bodies

✓ Understand risks and benefits

✓ Have an awareness of good practice

✓ Have an awareness of UH facilities for RDM

✓ Be able to find relevant guides and necessary support to manage your data effectively

Context What is Research Data Management (RDM)?

• the business of managing and safeguarding your working data

• the selection, sharing, and preservation of the data which supports the verification and reproduction of your work

• (and a professional skill closely allied to the Open Access agenda, which contributes to the advancement of knowledge for the benefit of all)

Context Origins of RDM

• data loss, controversy, and a new emphasis on return on investment in research




Context Research Data Management

Research Data Management is a new skill, increasingly required of all professional researchers

UK, EU, US funders expect an increased return on investment from research and research data is seen as an under-exploited resource.

The message from all major funding bodies: look after data well and share it appropriately

Latest: 28 July 2016 - Concordat on Open Research Data launched by HEFCE, RCUK, Universities UK, and the Wellcome Trust: http://www.rcuk.ac.uk/media/news/160728/

Context Research Data Management – Really?

All researchers – do they mean me?

If you (or your Supervisor/Principal Investigator) take money from the public purse, then

– yes, this does mean you!

And: even unfunded research generates information, often in electronic form, that will be costly to replace, or will have consequences, if lost or mislaid

From the Digital Curation Centre: Summary of UK research funders’ expectations for the content of data management and sharing plans


So what? Institutional and personal cost of bad RDM

- the cost of re-creating lost data

- reputational and actual financial cost to the University

- reputational and career cost to you





So what? Benefits of good RDM

- wider, safer access to your working data

- facilitates collaboration and group working

- it will save you time and effort in the long run

- someone else will do some of it for you

- facilitates bidding in the new open access, open data culture

- your data will attract citations in its own right

- career and reputation enhanced

(RDM needy? Un-attributed image - citation lost!)

What is Data?

Activity What is data?

What is your definition of research data?

What data do you have?

(2 minutes each)

What is data? Everything





Images / Photos

but also… desktop documents, notebooks, the backs of envelopes… unstructured, badly organised, often hiding key metadata about other data


What is data? Perspectives and definitions

From the concordat: Research data are the evidence that underpins the answer to the research question, and can be used to validate findings regardless of its form (e.g. print, digital, or physical). These might be quantitative information or qualitative statements collected by researchers in the course of their work by experimentation, observation, modelling, interview or other methods, or information derived from existing evidence.

Career - any data that would require time and effort to replicate and might just be useful to return to

Policy - UH UPRs: Personal or Confidential Information (PCI) - person identifiable information and other confidential and commercially sensitive information – includes valuable data = research data

Personal - your own PCI: http://www.staffnet.herts.ac.uk/documents/ict/ict-sstems-training/Managing_Personal_and_Confidential_Information.pdf


Good Practice Journey

• Data management planning

• Safeguarding working data

• Publish open data

Data Management Planning Make a plan - a Data Management Plan (DMP)

A DMP is a living document about the stewardship of your working data, and the curation (or disposal) of that which needs to be preserved (or destroyed) at the conclusion of the work

Extent of your plan depends on your context –

• informal, for the benefit of the well organised individual

• structured generic format acceptable to RCUK, EU as a part of funding applications

• externally specified to satisfy rigorous demands, for example, a Standard Operating Procedure (SOP) in clinical work

Data Management Planning DMPonline from the Digital Curation Centre


• templates for all major UK funding bodies

• UH template

• DMPs are working, evolving documents

• Professionally compiled output in a variety of formats




Data Management Planning Elements of a Data Management Plan







Good practice File management strategy

A good file management strategy will pay off at some stage:

• Adopt a file naming practice, and stick to it

• X: > Research > Group name > ProjectName > 20170307-keyworded-filename.docx

• Use dates in filenames – easy to sort and search

• Consider printing to PDF from proprietary formats for future discoverability
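The naming pattern in the example path above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `dated_filename` and the rule of lowercasing keywords and joining them with hyphens are hypothetical choices, not a UH standard.

```python
import re
from datetime import date


def dated_filename(day: date, keywords: list[str], extension: str) -> str:
    """Build a name such as 20170307-keyworded-filename.docx (hypothetical helper)."""
    # YYYYMMDD puts files in date order when they are sorted by name.
    stamp = day.strftime("%Y%m%d")
    # Lowercase each keyword and turn runs of other characters into one hyphen.
    slugs = [re.sub(r"[^a-z0-9]+", "-", k.lower()).strip("-") for k in keywords]
    slugs = [s for s in slugs if s]
    if not slugs:
        raise ValueError("at least one keyword is needed")
    return f"{stamp}-{'-'.join(slugs)}.{extension.lstrip('.').lower()}"


print(dated_filename(date(2017, 3, 7), ["keyworded", "filename"], "docx"))
# prints 20170307-keyworded-filename.docx
```

Because the date comes first and is zero-padded, an ordinary alphabetical listing of a project folder is also a chronological one.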

Good practice Metadata is key

What contextual details are needed to make this data useful in the future?

• who is in this photograph?

• who was the photographer?

• when was it taken?

• where was it taken?

• why was it taken?

• how was it taken?

Without contextual metadata the data itself will very quickly become seriously devalued (useless even) and will never be discoverable and useful to anyone else
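One possible way to keep these contextual details with the data is a small "sidecar" file stored next to each data file. The sketch below is an assumption for illustration only: the field names, the `.json` sidecar convention, and the `write_sidecar` helper are hypothetical, and a funder or subject repository may require its own metadata schema instead.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical field names, one per question on the slide.
REQUIRED = ("who", "photographer", "when", "where", "why", "how")


def write_sidecar(data_file: Path, metadata: dict[str, str]) -> Path:
    """Save metadata as JSON next to data_file, e.g. photo.jpg -> photo.jpg.json."""
    # Refuse to write an incomplete record: every question needs an answer.
    missing = [key for key in REQUIRED if not metadata.get(key)]
    if missing:
        raise ValueError(f"missing metadata fields: {', '.join(missing)}")
    sidecar = data_file.with_name(data_file.name + ".json")
    sidecar.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return sidecar


if __name__ == "__main__":
    # Placeholder answers only; replace with the real details of your own data.
    example = {
        "who": "(names of the people shown)",
        "photographer": "(name of the photographer)",
        "when": "(date taken)",
        "where": "(location)",
        "why": "(purpose within the project)",
        "how": "(equipment and settings)",
    }
    with tempfile.TemporaryDirectory() as folder:
        print(write_sidecar(Path(folder) / "photo.jpg", example).name)
```

Keeping the record in a plain text format beside the file means the context travels with the data when it is copied, shared, or deposited.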

Good practice Plan to share – ethical approval

There is a pervasive myth that you cannot share sensitive data acquired during your research. This is not true – IF you plan ahead and design your research appropriately.

At the outset:

• seek permission for reuse in research

• get permission to publish anonymised data

• think about how to retain the impact of the data whilst making it anonymous

• if open access is not appropriate consider data deposit in a controlled access environment with a dataset metadata record in an open repository

• include your DMP in the ethical approval process

Practical steps to publish anonymised data, from a postgraduate researcher: https://github.com/peterhcharlton/RRest/files/386203/20160728.Data.Dialogue.pdf (slide 12 onward)

Activity Where is your data?

What media do you use to store, transport or share your working data?

(2 minutes each)

Common data storage practice, expedient… but risky USB sticks and unregulated cloud storage

USB sticks: fragile, easy to lose

Unregulated cloud storage: unfavourable terms & conditions

if you use these for data transport or backup - encrypt PCI and delete it when no longer needed

Common data storage practice, expedient… but risky Hard drive death and laptop theft

1 in 20 hard drives fail within 18 months, 1 in 8 within 4 years



7% of laptops are stolen or lost. This rate is higher in education and research

Good practice One message above all - Use networked storage

• enterprise storage: tiny chance of data loss, > 10000x safer than any local device

• U:drive and X:drive - secure, available anywhere in the world 24/7, more than enough space for most people

• R:drive – terabyte research group storage also available

• (90 day limit on restoration after accidental user deletion)

Good practice OneDrive within Office 365 now available to all @UH

• Effectively limitless storage via university Office365 account - www.office.com

• No VPN, excellent Web GUI, supports sharing (needs care)

• Some limitations – slower from inside UH, limited protection against user error


Good practice if in doubt consult the File Storage Guide


Good practice Why don’t we do it?

• “it is too fiddly, time consuming, or otherwise bothersome to do it right”

• “there isn’t enough space; I can’t work on it at home; my data will disappear into a central system that I have no control over; they will lose it, or give it away”

• “it belongs to me, I can look after it best”

None of this is true (anymore).

Good practice additional tools: Document management

Enterprise document management system available for project work

• use when a high standard of project or data governance is required

• versioning, file level audited access control, retention and disposal policies

• project based structure designed by UH researchers

• Drag and drop via web GUI, or mounted drive

• Available to all projects for group work

Good practice additional tools: use encryption

When working away from your ‘home’ environments – or for transporting or transferring data offsite - keep your data secure in an encrypted folder

VeraCrypt works on most operating systems. https://veracrypt.codeplex.com/wikipage?title=Downloads

• Cross-platform open source encryption that works with Windows, Mac and Linux

• Pack your files into an encrypted volume

• Send by email, shared drive, cloud storage

• Password access

Good practice Use ExchangeFile for data transfer, delivery to collaborators

• approved alternative to unregulated file sharing systems such as Dropbox, Gdrive, YouSendIt, and MailBigFile

• handles files too big for email

• recorded transfer

• auto-disposal


Good practice A bit of light relief


A data management horror story by Karen Hanson, Alisa Surkis and Karen Yacobucci. This is what shouldn't happen when a researcher makes a data sharing request!

Open Data Extend publication to include data








Supporting Data National/Subject/Archive

A selection of data, methods, algorithms, results, plots, and conclusions is included in research papers. Open Access to these research outputs is now required practice.

There is a corresponding demand for Open Data to support all published work.

RCUK Guidelines > expectations for researchers > … All papers must include … and, if applicable, a statement on how the underlying research materials – such as data, samples or models – can be accessed (since 2013): http://www.rcuk.ac.uk/documents/documents/rcukopenaccesspolicy-pdf/

Data Statement Author’s Manuscripts

Open Data Long term preservation – what to keep?

• Everything? – the long term cost of preservation prevents this

• Selection is important. Focus on the data that is required to validate your research

• Attach metadata descriptions and access mechanisms if appropriate, use future proofed file formats

• Deposit as required by your funding body in an appropriate national or international subject based repository

• Otherwise deposit in University of Hertfordshire Research Archive (UHRA) (which also holds the Author Accepted Versions of our research papers)

Open Data Use the new preservation infrastructure

subject repositories



Open Data Good example: Digital Humanities


This data has been prepared and described so as to make it FAIR –

Findable Accessible Interoperable Re-usable


RDM Resources

Closing Anecdotes

RDM resources and local help





http://www.herts.ac.uk/rdm - compiled from two JISC projects at UH (to be updated in 2017)

http://www.herts.ac.uk/research-data-toolkit - an RDM blog of the conduct of our JISC projects

[email protected] Research and Scholarly Communications team

http://www.dcc.ac.uk - the Digital Curation Centre, extensive world renowned resource

http://datalib.edina.ac.uk/mantra/ - brilliant online course in RDM

http://www.tubechop.com/watch/421197 - RDM explained by leading advocates

Closing Anecdotes ClimateGate

The ClimateGate controversy, in which the University of East Anglia (and the global effort on climate change) suffered huge reputational damage, started with an incident of data loss (theft), and was exacerbated by a reluctance to publish underlying data.



Closing Anecdotes It can happen to you

• a personal near miss with disaster

• my university laptop was stolen from the boot of my car

• I was saved by:

• the U:drive

• desktop encryption


