Page 1: Research data challenge presentation

10/11/2011 JISC Webinar slide 1

Meeting the Research Data Challenge

12th October 2011, 12.00 – 13.00

#jiscmrd #jiscwebinar

Page 2: Research data challenge presentation

Meeting the Research Data Challenge

Sarah Porter

Page 3: Research data challenge presentation

Presentation outline

The role of JISC in research support

The data challenge

Drivers for good management of research data

10/11/2011 Wellcome Collection Conference Centre slide 3

Page 4: Research data challenge presentation

Why JISC? JISC’s role in research support

National infrastructure services for research, such as the JANET network, data centres to host published resources, and repository infrastructure

– The provision complements that of other stakeholders:

– Research funders: Research Councils, Funding Councils

– Data hosts, e.g. National Data Centres

JISC supports universities and colleges to make effective and efficient use of technology, in research and in the management of research

Key themes:

Increasing the impact and visibility of research

Increasing research competitiveness

Management of research information

Collaboration with business and the community

Improved management of research data.

Page 5: Research data challenge presentation

The challenge of data

– Volume

– Diversity (what is data, anyway?)

– ‘Long tail’

– Drivers for data management not well understood: complex picture due to range of funders’ policies and other policies at multiple levels (European, UK, each research council, each institution)

– Good management practice not yet well understood

• so not embedded into research practice

– Institutional roles and responsibilities may be unclear

– Responsibility for meeting costs not yet established.

Page 6: Research data challenge presentation

Drivers to improve research data management

Considerations for research integrity

Research Funder Policies

Freedom of Information / Environmental Information Regulations

Benefits of data reuse and improved research data management (including Research Excellence Framework)

Page 7: Research data challenge presentation

Drivers: Research Integrity

UK Research Integrity Office Code of Practice for Research: Promoting good practice and preventing misconduct, September 2009

Data management planning is an essential part of research design [3.4.1.c; also 3.12.6]

Section 3.12 covers collection AND RETENTION of research data.

Organisations and researchers should ensure that research data relating to publications is available for discussion with other researchers, subject to any existing agreements on confidentiality. [3.12.1]

Organisations should have in place procedures, resources (including physical space) and administrative support to assist researchers in the accurate and efficient collection of data and its storage in a secure and accessible form. [3.12.5]

Due regard to privacy, confidentiality and ethical issues.

Research integrity requires addressing these issues in order to make data as ‘shareable’ as possible.

Page 8: Research data challenge presentation

Drivers: Funders’ Policies

Research funders’ policies form an important part of the research data ecology.

In common with international developments, requirements are becoming increasingly exacting.

Many policy statements reference the OECD Principles and Guidelines for Access to Research Data from Public Funding: http://www.oecd.org/dataoecd/9/61/38500813.pdf

NSF recently added the requirement of a data management plan to grant proposals: http://www.arl.org/rtl/eresearch/escien/nsf/index.shtml

Health Research Funders’ ‘Joint Statement of Purpose: Sharing research data to improve public health’: http://www.wellcome.ac.uk/About-us/Policy/Spotlight-issues/Data-sharing/Public-health-and-epidemiology/WTDV030690.htm

– making research data sets available to investigators beyond the original research team in a timely and responsible manner, subject to appropriate safeguards, will generate three key benefits: • faster progress in improving health; • better value for money; • higher quality science.

Page 9: Research data challenge presentation

Joint RCUK Policy

RCUK Common Principles on Data Policy: http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx

1. Publicly funded research data are a public good, produced in the public interest and therefore should be made as openly available as possible;

2. Institutional and project specific data management policies and plans should be in accordance with relevant standards and community best practice;

3. Sufficient metadata should be recorded and made openly available to enable other researchers to understand the research and re-use potential of the data;

4. Legal, ethical and commercial constraints on release of research data must be recognised;

5. Recognition and ‘reward’ for managing and sharing research data are essential, and so limited embargo periods on the release of data are acceptable;

6. All users of research data should acknowledge the sources of their data and abide by the terms and conditions under which they are accessed;

7. It is appropriate to use public funds to support the management and sharing of publicly-funded research data, but this should be done in as cost-effective and efficient way as possible.

Infrastructure implications to be inferred rather than directly stated?

Page 10: Research data challenge presentation

Drivers: Funders’ Policies

New MRC policies on research data management and sharing are being prepared, tested and refined; guidance produced as part of a JISC-funded project.

BBSRC Statement, April 2007, updated June 2010: http://www.bbsrc.ac.uk/web/FILES/Policies/data-sharing-policy.pdf

– Requires statement on data sharing.

New ESRC policy now in force: http://www.esrc.ac.uk/about-esrc/information/data-policy.aspx

– Introduces the requirement of a data management and sharing statement (Je-S) and a data management and sharing plan as part of the grant submission

Page 11: Research data challenge presentation

Drivers: Funders’ Policies (EPSRC)

Responsibility:

EPSRC has a Policy Framework stating expectations concerning the Management of and Access to EPSRC-funded Research Data. Places responsibility with institutions, departments and centres in receipt of EPSRC funding to show they can manage and preserve data to adequate standards.

Appropriate division of costs:

EPSRC believes that where research has been publicly-funded it is reasonable and appropriate to use public funds to also fund the associated data management costs. EPSRC therefore expects research organisations to make appropriate provision from within public research funding received, making use of both direct and indirect funding streams as appropriate.

http://www.epsrc.ac.uk/about/standards/researchdata/Pages/responsibility.aspx

Page 12: Research data challenge presentation

Drivers: Freedom of Information and Environmental Information requests

Research data can be subject to Freedom of Information / Environmental Information requests: UEA and Queen’s University Belfast cases.

Guidance available at JISC Q&A on ‘Freedom of Information and Research Data’: http://www.jisc.ac.uk/publications/programmerelated/2010/foiresearchdata.aspx

Indicative research on numbers of FoI requests for research data: a sample of 21 universities received a total of 40 FoI requests for research data from 2007-10.

– Wide variance in distribution: 12 universities received 0; one received 8; another received 9.

– All but six were from 2009 and 2010.

• Indicates a growing trend.
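
The arithmetic implied by these figures can be made explicit. The following is a minimal Python sketch that uses only the numbers quoted on this slide; the variable names are illustrative and not from the source.

```python
# Figures quoted on the slide; variable names are illustrative assumptions.
universities_sampled = 21
total_requests = 40
universities_with_none = 12
largest_counts = [8, 9]            # the two universities singled out on the slide
requests_outside_2009_2010 = 6     # "all but six were from 2009 and 2010"

# Universities and requests not covered by the zero group or the two largest recipients.
remaining_universities = universities_sampled - universities_with_none - len(largest_counts)
remaining_requests = total_requests - sum(largest_counts)

# Shares of the total number of requests.
share_two_largest = sum(largest_counts) / total_requests
share_2009_2010 = (total_requests - requests_outside_2009_2010) / total_requests

print(remaining_universities, remaining_requests)
print(f"{share_two_largest:.1%}", f"{share_2009_2010:.1%}")
```

On these figures, seven universities account for the remaining 23 requests, the two largest recipients received 42.5% of the total, and 85% of the requests were made in 2009 and 2010.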

Page 13: Research data challenge presentation

Driver: preparation for Research Excellence Framework submissions in 2013

Good data management practice improves data collection for institutions and reduces its burden

– Need to embed practices into key roles: researchers, research managers, administrators.

Demonstrate the contribution that research makes to the economy and society (impact)

Opening up data provides one level of increased opportunity for ‘citizen science’, etc.

Can be aided by research information management systems

JISC has funded universities to demonstrate the benefits of using the Common European Research Information Format (CERIF) to manage research information

– The cost of use is more than offset by efficiency savings.

Research management ‘Shared Service’ being developed for April 2012.

Page 14: Research data challenge presentation

Meeting the Research Data Challenge

Simon Hodson

Programme Manager, Managing Research Data, Digital Infrastructure Team

Wednesday 12 October 2011

JISC Webinar

Page 15: Research data challenge presentation

Responding to the drivers?

How can universities respond to these drivers?

What is JISC doing to help?

Page 16: Research data challenge presentation

Supporting the Research Data Lifecycle

[Diagram: the research data lifecycle, with stages Plan, Create, Use, Appraise, Publish, Discover, Reuse, Store, Annotate, Select, Discard, Describe, Identify, Hand Over?, Access]

Page 17: Research data challenge presentation

Supporting the Research Data Lifecycle

[Diagram: the research data lifecycle (Plan, Create, Use, Appraise, Publish, Discover, Reuse, Store, Annotate, Select, Discard, Describe, Identify, Hand Over?, Access), overlaid with the following areas of support:]

Guidance and Policy Development

Training and Information

Support for Data Management Planning

RDM Systems and Infrastructure

Publication and Citation Mechanisms

Page 18: Research data challenge presentation

Wednesday 12 October 2011

JISC Webinar

Meeting the Research Data Challenge

Advice and guidance.

Training materials.

Data management planning.

Research data management systems and infrastructure.

Making the case: recognition, rewards, benefits.

Page 19: Research data challenge presentation

DCC’s Data Management Roadshows

Regional Data Management Roadshows: http://www.dcc.ac.uk/events/data-management-roadshows

Next: Cambridge, 9-11 November: http://www.dcc.ac.uk/events/data-management-roadshows/dcc-roadshow-cambridge

Then: Cardiff, 14-16 December

Blog on Oxford Roadshow: http://www.dcc.ac.uk/news/review-dcc-roadshow-oxford

Page 20: Research data challenge presentation

Timeline for Institutional Development

Page 21: Research data challenge presentation

Institutional Research Data Management Policies

University of Edinburgh Research Data Management Policy: http://www.ed.ac.uk/schools-departments/information-services/about/policies-and-regulations/research-data-policy

University of Oxford Commitment to Research Data Management: http://www.ict.ox.ac.uk/odit/projects/datamanagement/

University of Hertfordshire: http://research-data-toolkit.herts.ac.uk/?p=11

See DCC on institutional data management policies: http://www.dcc.ac.uk/resources/policy-and-legal/institutional-data-policies

Page 22: Research data challenge presentation

Guidance Materials (JISCMRD Programme)

Sudamih Project: http://sudamih.oucs.ox.ac.uk/

Oxford Research Data Management Pages (EIDCSR Project): http://www.admin.ox.ac.uk/rdm/

Training Materials for Humanities Scholars – delivered as part of central Humanities Division IT training courses: http://sudamih.oucs.ox.ac.uk/documents.xml

Page 23: Research data challenge presentation

Guidance Materials (JISCMRD Programme)

Incremental Project, collaboration between Glasgow and Cambridge, concentrated on providing guidance and training materials at an institutional level; focus on arts and humanities, social sciences, archaeology, social anthropology: http://www.lib.cam.ac.uk/preservation/incremental/index.html

Cambridge Website: www.lib.cam.ac.uk/dataman/

Glasgow Website: www.gla.ac.uk/datamanagement/

Workshops and Seminars: http://www.lib.cam.ac.uk/preservation/incremental/seminars.html

– Series at CRASSH covering: ethics, FoI, IPR, new technologies.

– Series at Glasgow covering: performing arts and archaeology.

Interviews from Seminars:

– http://www.lib.cam.ac.uk/dataman/training.html#Interviews

– http://www.gla.ac.uk/services/datamanagement/training/videos/

Incremental Project Blog: http://incrementalproject.wordpress.com/

Page 24: Research data challenge presentation

DCC How-To Guides

DCC How-To Guides: http://www.dcc.ac.uk/resources/how-guides

– Appraise and select research data for curation

– How to license research data

– How to develop a data management and sharing plan

Further Guides in preparation.

Page 25: Research data challenge presentation

JISCMRD Training Projects

Need for subject-focussed research data management / curation training, integrated with PG studies

Five projects to design and pilot (reusable) discipline-focussed training units for postgraduate courses: http://www.jisc.ac.uk/whatwedo/programmes/mrd/rdmtrain.aspx

Health studies: http://www.northumbria.ac.uk/sd/academic/ceis/re/isrc/themes/rmarea/datum/

Creative arts: http://www.projectcairo.org/

Archaeology, social anthropology: http://www.lib.cam.ac.uk/preservation/datatrain/

Psychological sciences: http://www.dmtpsych.york.ac.uk/

Social sciences, geographical sciences, clinical psychology: http://bit.ly/RDMantra

DaMSSI Support Project: http://www.rin.ac.uk/our-work/researcher-development-and-skills/data-management-and-information-literacy

Page 27: Research data challenge presentation

ERIM Project

Data Management Planning for engineering and manufacturing research, IdMRC and UKOLN, Bath: http://www.ukoln.ac.uk/projects/erim/

Data very heterogeneous: data type, conditions of use etc.

Review of the State of the Art of the Digital Curation of Research Data.

Report on Understanding and Characterizing Engineering Research Data for its Better Management: included detailed Research Activity Information Development (RAID) modelling.

Draft IdMRC Projects Data Management Plan; Requirements for a RAID associative tool.

Principle: interventions should result in ‘a zero net resource requirement increase’; i.e. data management needs to be supported by appropriate tools, or balanced by immediate benefits. Role of data manager in research centres needs to be examined closely.

Page 28: Research data challenge presentation

DMP-ESRC Project

Led by UK Data Archive: http://www.data-archive.ac.uk/create-manage/projects/jisc-dmp

Study of data management practices in ESRC funded Centres and Programmes.

Data Management Recommendations for Research Centres and Programmes: http://www.data-archive.ac.uk/media/257765/ukdadatamanagementrecommendations_centresprogrammes.pdf

– Clear roles and responsibilities; RDM coordinator; Data Inventory; Data Management Resources Library.

– Recommendations and guidelines on anonymisation, security and backup etc.

Data Management Costing Tool: http://www.data-archive.ac.uk/media/257647/ukda_jiscdmcosting.pdf

Page 29: Research data challenge presentation

RDM Platforms and Infrastructure

FISHnet Project, freshwater biology: http://www.fishnetonline.org/

MaDAM Project, biomedical research in an institutional context: http://www.merc.ac.uk/?q=MaDAM

Page 30: Research data challenge presentation

JISC UMF Shared Services and Cloud Programme

Strand A: Shared IT Infrastructure: http://www.jisc.ac.uk/whatwedo/programmes/umf.aspx

JANET(UK) brokerage to create trusted cloud(s) for HE.

Pilot Cloud provided by Eduserv.

Augment the role of DCC (in part to deploy tools in the cloud).

‘Killer RDM Apps’ developed to be deployed as Software as a Service.

Page 31: Research data challenge presentation

RDM SaaS Applications

VIDaaS (Virtual Infrastructure for Database as a Service), University of Oxford: http://vidaas.oucs.ox.ac.uk/

DataFlow, University of Oxford: http://www.dataflow.ox.ac.uk/

Smart Research Framework, University of Southampton: http://www.mylabnotebook.ac.uk/

Biomedical Research Infrastructure (BRISSkit), University of Leicester

Page 32: Research data challenge presentation

Financial Savings

OXREP case study:

Estimated research savings during 2010 = 21%

Estimated data hosting savings during 2010 = 37% (just central VI, not cloud hosted)

Comparison of DaaS hosting costs:

Single physical server running 30 2GB database instances = £125

Oxford VM running on local VI with 100 2GB instances = £79

Oxford VM running on local VI with 100 8GB instances = £109

Eduserv VM running on VI with 500 8GB instances = £76-98

Amazon VM with 8GB instances = £660-744
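
To make the comparison easier to read, the listed figures can be expressed relative to the single physical server. The following is a minimal Python sketch under stated assumptions: the slide does not give the unit behind each cost (for example, per instance or per year), so only ratios between rows are computed, and the rows differ in instance size and count, so they are not strictly like-for-like. The dictionary keys and the helper name `ratio_range` are illustrative.

```python
# Costs in GBP as listed on the slide, stored as (low, high) pairs.
# Assumption: all rows use the same (unstated) unit, so ratios are meaningful.
costs = {
    "Single physical server, 30 x 2GB": (125, 125),
    "Oxford VM, local VI, 100 x 2GB": (79, 79),
    "Oxford VM, local VI, 100 x 8GB": (109, 109),
    "Eduserv VM, VI, 500 x 8GB": (76, 98),
    "Amazon VM, 8GB instances": (660, 744),
}

baseline_low, baseline_high = costs["Single physical server, 30 x 2GB"]

def ratio_range(option):
    """Return (lowest, highest) cost ratio of an option against the baseline."""
    low, high = costs[option]
    return low / baseline_high, high / baseline_low

for name in costs:
    lowest, highest = ratio_range(name)
    print(f"{name}: {lowest:.2f}x to {highest:.2f}x the single-server figure")
```

Read this way, the Oxford and Eduserv virtual-infrastructure options come in below the single-server figure, while the Amazon figure is more than five times higher.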

Page 34: Research data challenge presentation

Dryad: a repository for supporting research data

Joint declarations, Feb 2010, in American Naturalist, Evolution, the Journal of Evolutionary Biology, Molecular Ecology, Heredity, and other key journals in evolution and ecology: http://www.journals.uchicago.edu/doi/full/10.1086/650340

This journal requires, as a condition for publication, that data supporting the results in the paper should be archived in an appropriate public archive, such as GenBank, TreeBASE, Dryad, or the Knowledge Network for Biocomplexity.

Allows embargos of up to one year; allows exceptions for, e.g., sensitive information such as human subject data or the location of endangered species.

‘Data that have an established standard repository, such as DNA sequences, should continue to be archived in the appropriate repository, such as GenBank. For more idiosyncratic data, the data can be placed in a more flexible digital data library such as the National Science Foundation-sponsored Dryad archive at http://datadryad.org.’

Page 35: Research data challenge presentation

Dryad-UK: a repository for supporting research data

Dryad-UK

Expand the number of journals: BMJ Open, titles from PLoS and BioMed Central.

Prepare a business model for long term funding of the data repository: supported by payments from journals, in turn recouped from subscription or author-pays OA fees.

Benefits?

Benefits for researchers: indications that publishing data increases citation rates

– Piwowar HA, Day RS, Fridsma DB (2007) Sharing Detailed Research Data Is Associated with Increased Citation Rate. PLoS ONE 2(3): e308. doi:10.1371/journal.pone.0000308 (cancer microarray clinical trial publications).

– Piwowar ongoing work e.g. http://researchremix.wordpress.com/2011/02/18/early_results/ (citation, reuse of data from Gene Expression Omnibus).

Page 36: Research data challenge presentation

Incentives and Benefits

Research Data Management Forum, 2-3 November, University of Warwick: http://www.dcc.ac.uk/events/research-data-management-forum/rdmf7-incentivising-data-management-sharing

Making the Case for RDM, DCC Briefing Paper: http://www.dcc.ac.uk/resources/briefing-papers/making-case-rdm

Report on the Benefits from the Infrastructure Projects in the JISC Managing Research Data Programme: http://www.jisc.ac.uk/whatwedo/programmes/mrd/outputs/benefitsreport.aspx

Page 37: Research data challenge presentation

JISC Managing Research Data Programme

JISC Managing Research Data Programme, Outputs: http://www.jisc.ac.uk/whatwedo/programmes/mrd/outputs.aspx

Second JISC Managing Research Data Programme, Google Map of funded projects: http://maps.google.co.uk/maps/ms?msid=210493456856136057364.0004ab687f5a25636a285&msa=0

Call for Proposals on research data publications/citation and on training planned for the New Year.

Page 38: Research data challenge presentation

Questions
