Research Data Management for the Humanities and Social Sciences

DESCRIPTION

Presentation given at internal University of Liverpool College of Humanities and Social Sciences event, 1 October 2014

Page 1: Research Data Management for the Humanities and Social Sciences

Research Data Management for the Humanities and Social Sciences

Martin Donnelly, Digital Curation Centre, University of Edinburgh

University of Liverpool, 1 October 2014

Jonathan Safran Foer, Tree of Codes (2010) – based on Bruno Schulz’s The Street of Crocodiles (1934)

Page 2: Research Data Management for the Humanities and Social Sciences

Overview

1. Introductions and definitions
- The Digital Curation Centre
- Research data management
- Drivers and benefits
- What do we mean by ‘data’, exactly?

2. Data in/and the Humanities and Social Sciences
- How the Humanities and Social Sciences differ
- Strengths and weaknesses
- Possible questions for discussion

3. Resources

Page 3: Research Data Management for the Humanities and Social Sciences

The Digital Curation Centre

The DCC (est. 2004) is…

- A UK centre of expertise in digital preservation, with a particular focus on research data management (RDM)

- Based across three sites: Universities of Edinburgh, Glasgow and Bath

- Working with a number of UK universities to identify gaps in RDM provision and raise capabilities across the sector

- Also involved in a variety of international collaborations

Page 4: Research Data Management for the Humanities and Social Sciences

Working with UK universities

Page 5: Research Data Management for the Humanities and Social Sciences

What is RDM? A definition…

“the active management and appraisal of data over the lifecycle of scholarly and scientific interest”

Page 6: Research Data Management for the Humanities and Social Sciences

What sort of activities?

- Planning and describing data-related work before it takes place

- Documenting your data so that others can find and understand it

- Storing it safely during the project

- Depositing it in a trusted archive at the end of the project

- Linking publications to the datasets that underpin them

Data management is a part of good research practice. (RCUK Policy and Code of Conduct on the Governance of Good Research Conduct)

Page 7: Research Data Management for the Humanities and Social Sciences

Drivers and benefits of RDM

TRANSPARENCY: The evidence that underpins research can be made open for anyone to scrutinise, so that others can attempt to replicate the findings.

EFFICIENCY: Data collection can be funded once, and used many times for a variety of purposes.

RISK MANAGEMENT: A pro-active approach to data management reduces the risk of inappropriate disclosure of sensitive data, whether commercial or personal.

PRESERVATION: Lots of data is unique, and can only be captured once. If lost, it can’t be replaced.

Page 8: Research Data Management for the Humanities and Social Sciences

Without intervention, data + time = no data

Vines et al. “examined the availability of data from 516 studies between 2 and 22 years old”

- The odds of a data set being reported as extant fell by 17% per year

- Broken e-mails and obsolete storage devices were the main obstacles to data sharing

- Policies mandating data archiving at publication are clearly needed

“The current system of leaving data with authors means that almost all of it is lost over time, unavailable for validation of the original results or to use for entirely new purposes” according to Timothy Vines, one of the researchers. This underscores the need for intentional management of data from all disciplines and opened our conversation on potential roles for librarians in this arena. (“80 Percent of Scientific Data Gone in 20 Years,” HNGN, Dec. 20, 2013, http://www.hngn.com/articles/20083/20131220/80-percent-of-scientific-data-gone-in-20-years.htm.)

Vines et al., The Availability of Research Data Declines Rapidly with Article Age, Current Biology (2014), http://dx.doi.org/10.1016/j.cub.2013.11.014
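
The 17% annual fall in odds reported by Vines et al. compounds quickly. As a rough illustration only (this is not the authors' statistical model), the Python sketch below assumes the decline is constant from year to year and shows how far the odds shrink relative to their starting value. The function names and the starting odds of 9:1 are hypothetical, chosen purely for the example.

```python
def relative_odds(years: float, annual_decline: float = 0.17) -> float:
    """Odds that a dataset is still extant after `years`, expressed as a
    fraction of the odds at year zero, assuming a constant annual decline."""
    return (1.0 - annual_decline) ** years


def odds_to_probability(odds: float) -> float:
    """Convert odds (p / (1 - p)) back into a probability p."""
    return odds / (1.0 + odds)


if __name__ == "__main__":
    # Hypothetical starting point: 9:1 odds (90%). Not a figure from the paper.
    starting_odds = 9.0
    for age in (2, 10, 20):
        odds = starting_odds * relative_odds(age)
        print(f"{age:>2} years: odds x{relative_odds(age):.3f}, "
              f"probability {odds_to_probability(odds):.0%}")
```

Under these assumptions the odds after 20 years are roughly 2% of their starting value, which is the arithmetic behind “data + time = no data”.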

Page 9: Research Data Management for the Humanities and Social Sciences

So what is ‘data’ exactly?

Definitions vary from discipline to discipline, and from funder to funder…

A science-centric definition: “The recorded factual material commonly accepted in the scientific community as necessary to validate research findings.” (US Office of Management and Budget, Circular A-110)

[Addendum: This policy applies to scientific collections, known in some disciplines as institutional collections, permanent collections, archival collections, museum collections, or voucher collections, which are assets with long-term scientific value. (US Office of Science and Technology Policy, Memorandum, 20 March 2014)]

And another from the visual arts: “Evidence which is used or created to generate new knowledge and interpretations. ‘Evidence’ may be intersubjective or subjective; physical or emotional; persistent or ephemeral; personal or public; explicit or tacit; and is consciously or unconsciously referenced by the researcher at some point during the course of their research.” (JISC “KAPTUR” project: see http://kaptur.wordpress.com/2013/01/23/what-is-visual-arts-research-data-revisited/)

Page 10: Research Data Management for the Humanities and Social Sciences

Scientific and other methods…

Page 11: Research Data Management for the Humanities and Social Sciences

What do funders have to say?

AHRC requires that significant electronic resources or datasets are made available in an accessible repository for at least three years after the end of the grant

Applicants submit a statement on data sharing in the relevant section of the Je-S form, and provide a two-page data management and sharing plan addressing 9 distinct themes

Datasets must be offered to the UK Data Archive on conclusion of the project

Page 12: Research Data Management for the Humanities and Social Sciences

Funders ii: Leverhulme

Page 13: Research Data Management for the Humanities and Social Sciences

HSS Schools and Departments

Page 14: Research Data Management for the Humanities and Social Sciences

Field notes etc

That’s the thing with fieldnotes, you never show them to anyone . . . as long as it works for you . . . it’s not like you have to show it to someone else and have them make sense of it, which is kind of a shame because it would be nice to have data in formats where people could, you know, sort of, archive that information. (2-16-120211)

In his study of fieldnote practices, Jean Jackson observes, “Many respondents point out that the highly personal nature of fieldnotes influences the extent of one’s willingness to share them: ‘Fieldnotes can reveal how worthless your work was, the lacunae, your linguistic incompetence, your not being made a blood brother, your childish temper.’” Anthropologist Simon Ottenberg describes his fieldnotes similarly, “[W]hen I was younger, I would have felt uncomfortable at the thought of someone else using my notes, whether I was alive or dead—they are so much a private thing, so much an aspect of personal field experience, so much a private language, so much part of my ego, my childhood [as an anthropologist], and my personal maturity.” This attachment and ambivalence may make researchers reluctant to turn over control of their notes to an archives—especially one that has the purpose of making materials available publically on the internet. This professional practice not only makes it very difficult to substantiate or verify ethnographers’ data, suggesting a need for ethnographers to develop more proactive and sensitive data-sharing procedures, but also creates a situation where ethnographic materials are saved, but with inadequate plans for preservation. Zeitlyn notes, “Paradoxically most anthropologists want neither to destroy their field material nor archive it.” So, too, with our study: almost no one had made plans for their data beyond the short term, let alone the final dispensation of their materials at the end of their career or after their death.

Andrew Asher and Lori M. Jahnke (2013) “Curating the Ethnographic Moment” Archive Journal, Issue 3, http://www.archivejournal.net/issue/3/archives-remixed/curating-the-ethnographic-moment/

Page 15: Research Data Management for the Humanities and Social Sciences

Strengths and weaknesses re. data in the Humanities and Social Sciences

- Qualitative and quantitative data is well-understood in the Social Sciences, less so in the Arts and Humanities

- But there’s nothing new about data re-use in the Humanities; it’s an integral part of the culture, and always has been…
  - Think Kristeva’s intertextuality, Barthes’ ‘galaxy of signifiers’, Shakespeare’s plots, Lanark’s assorted ‘plagiarisms’, Edwin Morgan’s ‘found’ newspaper poems, Marcel Duchamp, variations on a theme, collage and intermedia art, T.S. Eliot, sampling/hip-hop, etc etc (http://www.slideshare.net/martindonnelly/data-reuse-in-the-arts)

- However, it’s often more fraught than scientific data re-use in other
  - For starters, people don’t always think of their sources or influences as ‘data’, and the value and referencing systems are quite different
  - Furthermore, practice/praxis based research is pretty much the sole preserve of the Humanities, and research/production methods are not always rigorously methodical or linear…

Page 16: Research Data Management for the Humanities and Social Sciences

Issues re. data in the Humanities and Social Sciences

Some characteristics of HSS data are likely to require a different kind of handling from that afforded to other disciplines

Some of the ‘data’ will be quite personal, and may not be factual in nature. Furthermore, it may be quite valuable or precious to its creator. What matters most may not be the content itself, but rather the presentation, the arrangement, the quality of expression… This tends to be why Open Access embargoes are often longer in the Arts and Humanities than other areas

There may be a higher incidence of non-digital materials, and a blurring of boundaries between funded and unfunded work

Digital ‘data’ emerging from the Arts and Humanities is as likely to be an outcome of the creative research process as an input to a workflow. This is at odds with the scientific method, and with the language in which most existing RDM resources are described

The relevance of the word ‘data’ is not always apparent. (The term ‘research object’ is gaining in currency, incorporating data (numeric, written, audiovisual…), software code, workflows and methodologies, slides, logs, lab books, sketchbooks, notebooks, etc – basically anything that underpins or enriches the (written) outputs of research)

Page 17: Research Data Management for the Humanities and Social Sciences

Possible discussion points

How does the university wish to pitch its RDM provision?

- Need – what do we need to archive/manage? Is it evidence without which the research outcomes are in doubt?

- Want – do we want to archive/preserve materials for other reasons? Does preserving preparatory/developmental work provide a richer experience/understanding of the work and processes of production? How do we make a business case for this?

Some institutions take a compliance-driven approach to RDM, i.e. doing only what is required of them by funders / the law

Others see scholarly benefit in going beyond these minima, treating data as ‘special collections’

Ultimately it’s a question of appraisal…

Page 18: Research Data Management for the Humanities and Social Sciences

i. DCC resources

Publications
- The DCC publishes a series of themed Briefing Papers, How-To Guides and Case Studies, pitched at different audiences / levels of detail
  - http://www.dcc.ac.uk/resources/briefing-papers
  - http://www.dcc.ac.uk/resources/how-guides
  - http://www.dcc.ac.uk/resources/developing-rdm-services

Training
- e.g. DC101 courses and the Curation Reference Manual

Advice
- e.g. Disciplinary metadata, www.dcc.ac.uk/resources/metadata-standards

Tools
- DMPonline, CARDIO, Data Asset Framework, DRAMBORA

Events
- International Digital Curation Conference (next event is in London, Feb 2015)
- Research Data Management Forum (themed events – next one on Research Data and Repositories, Leicester, 18-19 Nov 2014)

Tailored support
- Via our Institutional Engagement programme

Page 19: Research Data Management for the Humanities and Social Sciences

ii. Other resources

Jisc services and resources
- RDM resources, www.jisc.ac.uk/guides/research-data-management
- EDINA and Mimas (national data centres)

JISCMRD projects – Phase 1 (2009-2011) and Phase 2 (2011-2013)
1) Research Data Management Infrastructure (RDMI)
2) Research Data Management Planning (RDMP)
3) Support and Tools
4) Citing, Linking, Integrating and Publishing Research Data (CLIP)
5) Research Data Management Training Materials
6) Enhancing DMPonline
7) Events

UK Data Archive at the University of Essex

Universities
- Good materials are available from Edinburgh, Cambridge, Oxford, Glasgow, Bristol, and many others

Page 20: Research Data Management for the Humanities and Social Sciences

Thank you

Image credits

Slide 2 (forest) – http://assets.worldwildlife.org/photos/934/images/hero_small/forest-overview-HI_115486.jpg?1345533675

This work is licensed under the Creative Commons Attribution 2.5 UK: Scotland License.

For more about DCC services see www.dcc.ac.uk or follow us on twitter @digitalcuration and #ukdcc

Martin Donnelly

Digital Curation Centre

University of Edinburgh

@mkdDCC