Research Lifecycles and RDM Marieke Guy Digital Curation Centre (DCC)

Page 1: Research Lifecycles and RDM

Research Lifecycles and RDM

Marieke Guy, Digital Curation Centre (DCC)

Page 2: Research Lifecycles and RDM

About this presentation

Managing data throughout the research lifecycle

• What is the research lifecycle?
• How do you manage data?
• What questions does managing data raise?

Page 3: Research Lifecycles and RDM

What is the research lifecycle?

• Research activity often takes place in stages which form a ‘lifecycle’

• Data is created at points during this lifecycle

• The data created has its own lifespan

“Data often have a longer lifespan than the research project that creates them. Researchers may continue to work on data after funding has ceased, follow-up projects may analyse or add to the data, and data may be re-used by other researchers.” UKDA

Page 4: Research Lifecycles and RDM

[Diagram: activities PLAN, CREATE DATA, ADD DOCUMENTATION; people RESEARCHERS, IT, DATA CENTRE]

A model to show the activities and people involved in managing data.

Example 1: DCC lifecycle model

Page 5: Research Lifecycles and RDM

[Diagram: Research Process. Stages shown: Write proposal; Start project; Acquire sample; Generate, Create, Collect; Process; Analyze; Interpret; Publish; Validate]

Example 2: Research360 lifecycle

Page 6: Research Lifecycles and RDM

Example 3: UK Data Archive

Page 7: Research Lifecycles and RDM

Example 4: UK Data Archive

Page 8: Research Lifecycles and RDM

Key ideas from the research lifecycle

Different research lifecycles suit different researchers

Research is a circular process

Certain stages are likely to be familiar to many researchers: conceptualisation/planning, creation, active use/documentation, publication etc.

Certain stages are likely to be familiar to fewer researchers: sharing, re-use etc.

Data may be created at many stages during the process (intervention points)

Data is likely to need management at many stages during the process

Page 9: Research Lifecycles and RDM

[Diagram: lifecycle stages]

1. Create

2. Active Use

3. Documentation

4. Publication & Deposit

5. Preservation & Re-Use

1. What data will you produce?

2. How will you organise the data?

3. Can you/others understand the data?

4. What data will be deposited and where?

5. Who will be interested in re-using the data?

Key Qs from the research lifecycle

Page 10: Research Lifecycles and RDM

“the active management and appraisal of data over the lifecycle of scholarly and scientific interest”

Data management is part of good research practice

What is data curation?

Manage

Share

Page 11: Research Lifecycles and RDM

Good data management is about making informed decisions

Page 12: Research Lifecycles and RDM

http://xkcd.com/949

Page 13: Research Lifecycles and RDM

How do you manage data?

Key questions to consider when:
- Creating data
- Documenting data
- Storing data
- Sharing data
- Preserving data
- Planning data management

Examples and pointers to support

Page 14: Research Lifecycles and RDM

Creating data: questions

What formats will you use?
- determined by the instruments / software you have to use
- common, widespread formats to enable reuse

How will you create your data?
- What methodologies and standards will you use?
- How will you address ethical concerns and protect participants?
- Will you control variations to provide quality assurance?
- What external data sets will you use? (See the BL Social Science Collection guide to Management and Business studies datasets)

Page 15: Research Lifecycles and RDM

Creating data: advice

Different formats are good for different things
- open, lossless formats are more sustainable e.g. rtf, xml, tif, wav
- proprietary and/or compressed formats are less preservable but are often in widespread use e.g. doc, jpg, mp3

You may choose one format for analysis then convert to a standard format for preservation / sharing

Excellent guidance on creating data & managing ethics in: www.data-archive.ac.uk/media/2894/managingsharing.pdf

Page 16: Research Lifecycles and RDM

File formats for long-term access

- Unencrypted
- Uncompressed
- Non-proprietary/patent-encumbered
- Open, documented standard
- Standard representation (ASCII, Unicode)

Type             Recommended                                           Avoid for data sharing
Tabular data     CSV, TSV, SPSS portable                               Excel
Text             Plain text, HTML, RTF; PDF/A only if layout matters   Word
Media            Container: MP4, Ogg; Codec: Theora, Dirac, FLAC       Quicktime, H264
Images           TIFF, JPEG2000, PNG                                   GIF, JPG
Structured data  XML, RDF                                              RDBMS

Further examples: http://www.data-archive.ac.uk/create-manage/format/formats-table
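
The format recommendations on this slide could be turned into a simple lookup for checking files before deposit. This is a minimal sketch only, not a DCC or UK Data Archive tool: the file extensions chosen for each format (for example `.por` for SPSS portable) and the function name `sharing_advice` are assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical mapping from file extension to the slide's advice.
# The extension choices (e.g. ".por" for SPSS portable) are assumptions.
RECOMMENDED = {".csv", ".tsv", ".por", ".txt", ".html", ".rtf",
               ".tif", ".tiff", ".jp2", ".png", ".xml", ".rdf",
               ".flac", ".ogg"}
AVOID = {".xls", ".xlsx", ".doc", ".docx", ".mov", ".gif", ".jpg", ".jpeg"}


def sharing_advice(filename: str) -> str:
    """Return 'recommended', 'avoid' or 'unknown' for a file name."""
    ext = Path(filename).suffix.lower()
    if ext in RECOMMENDED:
        return "recommended"
    if ext in AVOID:
        return "avoid"
    return "unknown"


if __name__ == "__main__":
    for name in ["survey.csv", "interviews.DOC", "scan.tif", "model.bin"]:
        print(name, "->", sharing_advice(name))
```

Formats whose suitability depends on context (PDF/A, or an MP4 container whose codec is unknown) are deliberately left out, so they come back as "unknown" and need a human decision.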

Page 17: Research Lifecycles and RDM

Documenting data: questions

What information do users need to understand the data?
- descriptions of all variables / fields and their values
- code labels, classification schema, abbreviations list
- information about the project and data creators
- tips on usage e.g. exceptions, quirks, questionable results

How will you capture this?

Are there standards you can use?
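
One lightweight way to capture variable descriptions and code labels is a data dictionary saved alongside the data. The sketch below is an illustration only: the dataset, the variable names and the `write_data_dictionary` helper are invented for the example and do not come from any standard.

```python
import csv
import io

# Hypothetical variables for an invented survey dataset.
VARIABLES = [
    {"name": "resp_id", "description": "Anonymised respondent identifier",
     "type": "integer", "values": ""},
    {"name": "age_band", "description": "Age group of respondent",
     "type": "code", "values": "1=18-34; 2=35-64; 3=65+; 9=not stated"},
]


def write_data_dictionary(variables, stream):
    """Write one CSV row per variable: name, description, type, values."""
    fields = ["name", "description", "type", "values"]
    writer = csv.DictWriter(stream, fieldnames=fields)
    writer.writeheader()
    writer.writerows(variables)


if __name__ == "__main__":
    buffer = io.StringIO()
    write_data_dictionary(VARIABLES, buffer)
    print(buffer.getvalue())
```

Writing the dictionary as CSV keeps the documentation in the same open, plain-text form recommended for the data itself.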

Page 18: Research Lifecycles and RDM

Dublin Core metadata example

Creator: Donald Cooper (Role=Photographer)
Subject: Shakespeare, William, 1564-1616, Antony and Cleopatra [LC]
Description: Vanessa Redgrave as Cleopatra
Date: 1973-08-09
Type: Image
Format: JPEG
Identifier: 4150 [catalogue no]
Source: negative no 235
Relation: Antony and Cleopatra: Thompson/73-8 (IsPartOf)
Coverage: Bankside Globe (Role=Spatial)
Rights: Donald Cooper

http://www.ahds.ac.uk/performingarts
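
A record like this one can be written out as XML using the Dublin Core element namespace. The following is a rough sketch with Python's standard library, assuming a plain `metadata` wrapper element (an arbitrary choice, not something the slide specifies) and omitting the qualifiers such as Role and IsPartOf; the function name `to_dublin_core_xml` is hypothetical.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)

# Field values copied from the slide; qualifiers (Role, IsPartOf) are omitted.
RECORD = [
    ("creator", "Donald Cooper"),
    ("subject", "Shakespeare, William, 1564-1616, Antony and Cleopatra"),
    ("description", "Vanessa Redgrave as Cleopatra"),
    ("date", "1973-08-09"),
    ("type", "Image"),
    ("format", "JPEG"),
    ("identifier", "4150"),
    ("source", "negative no 235"),
    ("relation", "Antony and Cleopatra: Thompson/73-8"),
    ("coverage", "Bankside Globe"),
    ("rights", "Donald Cooper"),
]


def to_dublin_core_xml(record):
    """Build a <metadata> element holding one dc:* child per field."""
    root = ET.Element("metadata")  # wrapper element name is an assumption
    for name, value in record:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


if __name__ == "__main__":
    print(ET.tostring(to_dublin_core_xml(RECORD), encoding="unicode"))
```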

Page 19: Research Lifecycles and RDM
Page 20: Research Lifecycles and RDM

Storing data: questions

What is available to you?

What facilities do you need?
- remote access
- file sharing with colleagues
- high levels of security

How will the data be backed up?

Page 21: Research Lifecycles and RDM

Storing data: advice

Speak to the Northampton IT Team for advice – TUNDRA2

Remember that all storage is fallible, so you need to back up
- keep 2+ copies on different types of media in different locations
- manage back-ups (migrate media, test integrity)

Choose appropriate methods to transfer / share data
- email, Dropbox, FTP, encrypted media, filestore, VREs...
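
Testing the integrity of a back-up, or of a file after transfer, usually comes down to comparing checksums of the two copies. The sketch below shows one way to do this with SHA-256 from Python's standard library; the helper names `sha256_of` and `copies_match` and the demo file names are illustrative assumptions, not part of any institutional tooling.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original: Path, backup: Path) -> bool:
    """True if the backup has the same content as the original."""
    return sha256_of(original) == sha256_of(backup)


if __name__ == "__main__":
    # Demo with two throwaway files holding identical content.
    with tempfile.TemporaryDirectory() as tmp:
        original = Path(tmp) / "data.csv"
        backup = Path(tmp) / "data_backup.csv"
        original.write_text("id,value\n1,42\n")
        backup.write_text("id,value\n1,42\n")
        print(copies_match(original, backup))
```

Storing the digest next to each copy also lets a later check detect silent corruption without needing the original to hand.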

Page 22: Research Lifecycles and RDM

Sharing data: questions

Does your funder expect you to share data?

Which data can be shared?

How will you share your data?

What do you get from sharing?
- citations, recognition...

Page 23: Research Lifecycles and RDM

Sharing data: advice

Where possible, make your data available via repositories, data centres and structured databases

http://datacite.org/repolist
http://databib.org/

Northampton Electronic Collection of Theses and Research (NECTAR)
http://nectar.northampton.ac.uk/

Page 24: Research Lifecycles and RDM

Preserving data: questions

Are you required to preserve (or destroy) your data?

How will you select what to keep?

Is there somewhere you can archive your data?

How can you support the reuse of your data?

Page 25: Research Lifecycles and RDM

Preserving data: advice

How to select and appraise research data:
www.dcc.ac.uk/resources/how-guides/appraise-select-research-data

How to license research data:
www.dcc.ac.uk/resources/how-guides/license-research-data

How to cite datasets and link to publications:
www.dcc.ac.uk/resources/how-guides/cite-datasets

Page 26: Research Lifecycles and RDM

Planning data management

What do you (and others) want to do with the data? Your decisions should bear this in mind and make it feasible.

Remember:

Data management is about making informed decisions

Talk to colleagues and support staff to see which option works best

Page 27: Research Lifecycles and RDM

Data Management and Sharing Plans

Funders typically want a short statement covering:

- What data will be created (format, types) and how?

- How will the data be documented and described?

- How will you manage ethics and Intellectual Property?

- What are the plans for data sharing and access?

- What is the strategy for long-term preservation?

DMP tool: https://dmponline.dcc.ac.uk/

How to write a DMP: www.dcc.ac.uk/resources/how-guides/develop-data-plan
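
Before submitting a plan, it can help to check that every topic funders typically ask about has an answer. The sketch below is a hypothetical checklist, not the DMPonline data model: the section headings paraphrase the five questions on this slide, and the `missing_sections` helper and example plan are invented for illustration.

```python
# Section headings paraphrase the five questions on the slide.
DMP_SECTIONS = [
    "Data creation (formats, types, methods)",
    "Documentation and description",
    "Ethics and Intellectual Property",
    "Data sharing and access",
    "Long-term preservation",
]


def missing_sections(plan: dict) -> list:
    """Return the sections that have no answer, or only whitespace."""
    return [s for s in DMP_SECTIONS if not plan.get(s, "").strip()]


if __name__ == "__main__":
    # Invented draft plan with two sections answered so far.
    draft = {
        "Data creation (formats, types, methods)": "Interview audio as FLAC",
        "Data sharing and access": "Deposit in institutional repository",
    }
    for section in missing_sections(draft):
        print("Still to write:", section)
```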

Page 28: Research Lifecycles and RDM

The research process at Cardiff

Take 10 minutes to think about one of the academic departments you work with

- How would you characterise the subject as a whole?
- What do you know about the social organisation of the discipline?
- What do you know about the data they create?
- Are you familiar with the research process in this department?
- When might they require your help?

Page 29: Research Lifecycles and RDM

Thanks - any questions?

Acknowledgements: Thanks to DCC staff, UK Data Archive and Research360 for slides