10/11/2011 JISC Webinar slide 1
Meeting the Research Data Challenge
12th October 2011, 12.00 – 13.00
#jiscmrd #jiscwebinar
Meeting the Research Data Challenge
Sarah Porter
Presentation outline
The role of JISC in research support
The data challenge
Drivers for good management of research data
10/11/2011 Wellcome Collection Conference Centre slide 3
Why JISC? JISC’s role in research support
National infrastructure services for research such as the JANET network,
data centres to host published resources, repository infrastructure
–The provision complements that of other stakeholders
– Research funders - Research Councils, Funding Councils
– Data hosts e.g. National Data Centres
JISC supports universities and colleges
to make effective and efficient use of technology – in research and in the
management of research
Key themes:
Increasing the impact and visibility of research
Increasing research competitiveness
Management of research information
Collaboration with business and the community
Improved management of research data.
The challenge of data
–Volume
–Diversity (what is data, anyway?)
– ‘Long tail’
–Drivers for data management not well understood – complex picture due
to range of funders’ policies, other policies at multiple levels (European,
UK, each research council, each institution)
–Good management practice not yet well understood
• so not embedded into research practice
– Institutional roles and responsibilities may be unclear
–Responsibility for meeting costs not yet established.
Drivers to improve research data management
Considerations for research integrity
Research Funder Policies
Freedom of Information / Environmental Information
Regulations
Benefits of data reuse and improved research data
management (including Research Excellence Framework)
Drivers: Research Integrity
UK Research Integrity Office Code of Practice for Research: Promoting good practice and preventing misconduct, September 2009
Data management planning is an essential part of research design [3.4.1.c; also 3.12.6]
Section 3.12 covers collection AND RETENTION of research data.
Organisations and researchers should ensure that research data relating to publications is available for discussion with other researchers, subject to any existing agreements on confidentiality. [3.12.1]
Organisations should have in place procedures, resources (including physical space) and administrative support to assist researchers in the accurate and efficient collection of data and its storage in a secure and accessible form. [3.12.5]
Due regard to privacy, confidentiality and ethical issues.
Research integrity requires addressing these issues in order to make data as ‘shareable’ as possible.
Drivers: Funders’ Policies
Research funders’ policies form an important part of the research data ecology.
In common with international developments, requirements are becoming increasingly exacting.
Many policy statements reference the OECD Principles and Guidelines for Access to Research Data from Public Funding: http://www.oecd.org/dataoecd/9/61/38500813.pdf
NSF recently added the requirement of a data management plan to grant proposals: http://www.arl.org/rtl/eresearch/escien/nsf/index.shtml
Health Research Funders’ ‘Joint Statement of Purpose: Sharing research data to improve public health’: http://www.wellcome.ac.uk/About-us/Policy/Spotlight-issues/Data-sharing/Public-health-and-epidemiology/WTDV030690.htm
– making research data sets available to investigators beyond the original research team in a timely and responsible manner, subject to appropriate safeguards, will generate three key benefits:
• faster progress in improving health;
• better value for money;
• higher quality science.
Joint RCUK Policy
RCUK Common Principles on Data Policy: http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx
1. Publicly funded research data are a public good, produced in the public interest and therefore should be made as openly available as possible;
2. Institutional and project specific data management policies and plans should be in accordance with relevant standards and community best practice;
3. Sufficient metadata should be recorded and made openly available to enable other researchers to understand the research and re-use potential of the data;
4. Legal, ethical and commercial constraints on release of research data must be recognised;
5. Recognition and ‘reward’ for managing and sharing research data are essential, and so limited embargo periods on the release of data are acceptable;
6. All users of research data should acknowledge the sources of their data and abide by the terms and conditions under which they are accessed;
7. It is appropriate to use public funds to support the management and sharing of publicly-funded research data, but this should be done in as cost-effective and efficient way as possible.
Infrastructure implications to be inferred rather than directly stated?
Drivers: Funders’ Policies
New MRC policies on research data management and sharing being
prepared; tested and refined; guidance produced as part of a JISC funded
project.
BBSRC Statement, April 2007, updated June 2010:
http://www.bbsrc.ac.uk/web/FILES/Policies/data-sharing-policy.pdf
–Requires statement on data sharing.
New ESRC policy now in force: http://www.esrc.ac.uk/about-esrc/information/data-policy.aspx
– Introduces the requirement of a data management and sharing statement (J-eS)
and a data management and sharing plan as part of the grant submission
10/11/2011 slide 10
Drivers: Funders’ Policies (EPSRC)
Responsibility:
EPSRC has a Policy Framework stating expectations concerning the
Management of and Access to EPSRC-funded Research Data. Places
responsibility with institutions, departments and centres in receipt of
EPSRC funding to show they can manage and preserve data to
adequate standards.
Appropriate division of costs:
EPSRC believes that where research has been publicly-funded it is reasonable and
appropriate to use public funds to also fund the associated data management
costs. EPSRC therefore expects research organisations to make appropriate
provision from within public research funding received, making use of both direct
and indirect funding streams as appropriate.
http://www.epsrc.ac.uk/about/standards/researchdata/Pages/responsibility.aspx
Drivers: Freedom of Information and
Environmental Information requests
Research data can be subject to Freedom of Information / Environmental Information requests: UEA and Queen’s University Belfast cases.
Guidance available at JISC Q&A on ‘Freedom of Information and Research Data’: http://www.jisc.ac.uk/publications/programmerelated/2010/foiresearchdata.aspx
Indicative research on numbers of FoI requests for research data: a sample of 21 universities received a total of 40 FoI requests for research data from 2007-10.
–Wide variance in distribution: 12 universities received 0; 1 received 8; another received 9.
–All but six were from 2009 and 2010.
• Indicates a growing trend.
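The sample figures above can be checked with a little arithmetic. The following is only an illustrative sketch: the numbers are those quoted on the slide, the variable names are hypothetical, and nothing here comes from the underlying study.

```python
# Figures quoted on the slide; variable names are illustrative only.
universities_sampled = 21
total_requests = 40
universities_with_zero = 12
largest_two_counts = [8, 9]      # the two universities singled out on the slide
requests_before_2009 = 6         # "all but six were from 2009 and 2010"

# Universities and requests not individually itemised on the slide.
remaining_universities = (
    universities_sampled - universities_with_zero - len(largest_two_counts)
)
remaining_requests = total_requests - sum(largest_two_counts)

# Share of requests that arrived in 2009 and 2010.
share_2009_2010 = (total_requests - requests_before_2009) / total_requests

# Share of all requests received by the two most-requested universities.
share_top_two = sum(largest_two_counts) / total_requests

print(f"{remaining_universities} other universities shared {remaining_requests} requests")
print(f"{share_2009_2010:.0%} of requests date from 2009-10")
print(f"{share_top_two:.1%} of requests went to two universities")
```

On these figures, two institutions account for a little over two fifths of all requests, which is the 'wide variance' the slide refers to.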
Driver: preparation for Research Excellence
Framework submissions in 2013
Good data management practice improves data collection and reduces its burden
for institutions
–Need to embed practices into key roles – researchers, research
managers, administrators.
Demonstrate the contribution that research makes to the economy and
society (impact)
Opening up data provides one level of increased opportunity for ‘citizen
science’, etc.
Can be aided by research information management systems
JISC has funded universities to demonstrate the benefits of using the
Common European Research Information Format (CERIF) to manage
research information
–The cost of use is more than offset by efficiency savings.
Research management ‘Shared Service’ being developed for April 2012.
Meeting the Research Data Challenge
Simon Hodson
Programme Manager, Managing Research Data, Digital Infrastructure Team
Wednesday 12 October 2011
JISC Webinar
Responding to the drivers?
How can universities respond to these drivers?
What is JISC doing to help?
Supporting the Research Data Lifecycle
[Diagram: research data lifecycle, with the stages Plan, Create, Use, Appraise, Publish, Discover, Reuse, Store, Annotate, Select, Discard, Describe, Identify, Hand Over?, Access]
Guidance and Policy Development
Training and Information
Support for Data Management Planning
RDM Systems and Infrastructure
Publication and Citation Mechanisms
Meeting the Research Data Challenge
Advice and guidance.
Training materials.
Data management planning.
Research data management systems and infrastructure.
Making the case: recognition, rewards, benefits.
DCC’s Data Management Roadshows
Regional Data Management Roadshows: http://www.dcc.ac.uk/events/data-management-roadshows
Next: Cambridge, 9-11 November: http://www.dcc.ac.uk/events/data-management-roadshows/dcc-roadshow-cambridge
Then: Cardiff, 14-16 December
Blog on Oxford Roadshow: http://www.dcc.ac.uk/news/review-dcc-roadshow-oxford
Timeline for Institutional Development
Institutional Research Data Management Policies
University of Edinburgh Research Data Management Policy: http://www.ed.ac.uk/schools-departments/information-services/about/policies-and-regulations/research-data-policy
University of Oxford Commitment to Research Data Management: http://www.ict.ox.ac.uk/odit/projects/datamanagement/
University of Hertfordshire: http://research-data-toolkit.herts.ac.uk/?p=11
See DCC on institutional data management policies: http://www.dcc.ac.uk/resources/policy-and-legal/institutional-data-policies
Guidance Materials (JISCMRD Programme)
Sudamih Project: http://sudamih.oucs.ox.ac.uk/
Oxford Research Data Management Pages (EIDCSR Project): http://www.admin.ox.ac.uk/rdm/
Training Materials for Humanities Scholars – delivered as part of central Humanities Division IT training courses: http://sudamih.oucs.ox.ac.uk/documents.xml
Guidance Materials (JISCMRD Programme)
Incremental Project, collaboration between Glasgow and Cambridge, concentrated on providing guidance and training materials at an institutional level; focus on arts and humanities, social sciences, archaeology, social anthropology: http://www.lib.cam.ac.uk/preservation/incremental/index.html
Cambridge Website: www.lib.cam.ac.uk/dataman/
Glasgow Website: www.gla.ac.uk/datamanagement/
Workshops and Seminars: http://www.lib.cam.ac.uk/preservation/incremental/seminars.html
– Series at CRASSH covering: ethics, FoI, IPR, new technologies.
– Series at Glasgow covering: performing arts and archaeology.
Interviews from Seminars:
– http://www.lib.cam.ac.uk/dataman/training.html#Interviews
– http://www.gla.ac.uk/services/datamanagement/training/videos/
Incremental Project Blog: http://incrementalproject.wordpress.com/
DCC How-To Guides
DCC How-To Guides: http://www.dcc.ac.uk/resources/how-guides
– Appraise and select research data for curation
– How to license research data
– How to develop a data management and sharing plan
Further Guides in preparation.
JISCMRD Training Projects
Need for subject-focussed research data management / curation training, integrated with PG studies
Five projects to design and pilot (reusable) discipline-focussed training units for postgraduate courses: http://www.jisc.ac.uk/whatwedo/programmes/mrd/rdmtrain.aspx
Health studies: http://www.northumbria.ac.uk/sd/academic/ceis/re/isrc/themes/rmarea/datum/
Creative arts: http://www.projectcairo.org/
Archaeology, social anthropology: http://www.lib.cam.ac.uk/preservation/datatrain/
Psychological sciences: http://www.dmtpsych.york.ac.uk/
Social sciences, geographical sciences, clinical psychology: http://bit.ly/RDMantra
DaMSSI Support Project: http://www.rin.ac.uk/our-work/researcher-development-and-skills/data-management-and-information-literacy
Data Management Planning: DCC Guidance
DCC DMP Guidance:
– Overview of funder policies: http://www.dcc.ac.uk/resources/data-management-plans/funders-requirements
– DMP Checklist: http://www.dcc.ac.uk/webfm_send/431
DCC DMP Online tool: http://www.dcc.ac.uk/dmponline
ERIM Project
Data Management Planning for engineering and manufacturing research, IdMRC and UKOLN, Bath: http://www.ukoln.ac.uk/projects/erim/
Data very heterogeneous: data type, conditions of use etc.
Review of the State of the Art of the Digital Curation of Research Data.
Report on Understanding and Characterizing Engineering Research Data for its Better Management: included detailed Research Activity Information Development (RAID) modelling.
Draft IdMRC Projects Data Management Plan; Requirements for a RAID associative tool.
Principle: interventions should result in ‘a zero net resource requirement increase’; i.e. data management needs to be supported by appropriate tools, or balanced by immediate benefits. Role of data manager in research centres needs to be examined closely.
DMP-ESRC Project
Led by UK Data Archive: http://www.data-archive.ac.uk/create-manage/projects/jisc-dmp
Study of data management practices in ESRC funded Centres and Programmes.
Data Management Recommendations for Research Centres and Programmes: http://www.data-archive.ac.uk/media/257765/ukdadatamanagementrecommendations_centresprogrammes.pdf
– Clear roles and responsibilities; RDM coordinator; Data Inventory; Data Management Resources Library.
– Recommendations and guidelines on anonymisation, security and backup etc.
Data Management Costing Tool: http://www.data-archive.ac.uk/media/257647/ukda_jiscdmcosting.pdf
RDM Platforms and Infrastructure
FISHnet Project, freshwater biology: http://www.fishnetonline.org/
MaDAM Project, biomedical research in an institutional context: http://www.merc.ac.uk/?q=MaDAM
JISC UMF Shared Services and Cloud Programme
Strand A: Shared IT Infrastructure: http://www.jisc.ac.uk/whatwedo/programmes/umf.aspx
JANET(UK) brokerage to create trusted cloud(s) for HE.
Pilot Cloud provided by Eduserv.
Augment the role of DCC (in part to deploy tools in the cloud).
‘Killer RDM Apps’ developed to be deployed as Software as a Service.
RDM SaaS Applications
VIDaaS (Virtual Infrastructure for Database as a Service), University of Oxford: http://vidaas.oucs.ox.ac.uk/
DataFlow, University of Oxford: http://www.dataflow.ox.ac.uk/
Smart Research Framework, University of Southampton: http://www.mylabnotebook.ac.uk/
Biomedical Research Infrastructure (BRISSkit), University of Leicester
Financial Savings
OXREP case study:
– Estimated research savings during 2010 = 21%
– Estimated data hosting savings during 2010 = 37% (just central VI, not cloud hosted)
Comparison of DaaS hosting costs:
Single physical server running 30 2GB database instances = £125
Oxford VM running on local VI with 100 2GB instances = £79
Oxford VM running on local VI with 100 8GB instances = £109
Eduserv VM running on VI with 500 8GB instances = £76-98
Amazon VM with 8GB instances = £660-744
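One way to read the comparison above is to express each option relative to the single physical server. The sketch below is illustrative only: the slide does not state the unit or billing period behind each figure, the options differ in instance count and size so they are not strictly like for like, and taking the midpoint of a quoted range is an assumption made here for convenience.

```python
# Figures (GBP) as quoted on the slide; ranges are kept as (low, high) pairs.
# The unit/period behind each figure is not stated in the source.
hosting_costs = {
    "Single physical server, 30 x 2GB instances": (125, 125),
    "Oxford VM on local VI, 100 x 2GB instances": (79, 79),
    "Oxford VM on local VI, 100 x 8GB instances": (109, 109),
    "Eduserv VM on VI, 500 x 8GB instances": (76, 98),
    "Amazon VM, 8GB instances": (660, 744),
}


def midpoint(cost_range):
    """Midpoint of a (low, high) range; an assumption used only for ranking."""
    low, high = cost_range
    return (low + high) / 2


baseline = midpoint(hosting_costs["Single physical server, 30 x 2GB instances"])

# Cost of each option as a multiple of the physical-server figure.
relative = {name: midpoint(rng) / baseline for name, rng in hosting_costs.items()}

# Options ordered from cheapest to most expensive by midpoint.
ranked = sorted(hosting_costs, key=lambda name: midpoint(hosting_costs[name]))

for name in ranked:
    print(f"{name}: {relative[name]:.2f} x physical server")
```

Read this way, the local and Eduserv virtual infrastructure options all come in below the physical-server figure, while the Amazon figure is several times higher.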
Making the Case: recognition, rewards, benefits
Data Citation
– DCC How-To Guide on data citation (in preparation)
– DCC Briefing Paper on Data Citation and Linking: http://www.dcc.ac.uk/resources/briefing-papers/introduction-curation/data-citation-and-linking
– BL is a founding member of DataCite
– Currently have DataCite user group; will be extending this and working with JISCMRD Projects
Dryad: a repository for supporting research data
Joint declarations, Feb 2010, in American Naturalist, Evolution, the Journal of Evolutionary Biology, Molecular Ecology, Heredity, and other key journals in evolution and ecology: http://www.journals.uchicago.edu/doi/full/10.1086/650340
This journal requires, as a condition for publication, that data supporting the results in the paper should be archived in an appropriate public archive, such as GenBank, TreeBASE, Dryad, or the Knowledge Network for Biocomplexity.
Allows embargos of up to one year; allows exceptions for, e.g., sensitive information such as human subject data or the location of endangered species.
‘Data that have an established standard repository, such as DNA sequences, should continue to be archived in the appropriate repository, such as GenBank. For more idiosyncratic data, the data can be placed in a more flexible digital data library such as the National Science Foundation-sponsored Dryad archive at http://datadryad.org.’
Dryad-UK: a repository for supporting research data
Dryad-UK
Expand the number of journals: BMJ Open, titles from PLoS and BioMed Central.
Prepare a business model for long term funding of the data repository: supported by payments from journals, in turn recouped from subscription or author-pays OA fees.
Benefits?
Benefits for researchers: indications that publishing data increases citation rates
– Piwowar HA, Day RS, Fridsma DB (2007) Sharing Detailed Research Data Is Associated with Increased Citation Rate. PLoS ONE 2(3): e308. doi:10.1371/journal.pone.0000308 (cancer microarray clinical trial publications).
– Piwowar ongoing work e.g. http://researchremix.wordpress.com/2011/02/18/early_results/ (citation, reuse of data from Gene Expression Omnibus).
Incentives and Benefits
Research Data Management Forum, 2-3 November, University of Warwick: http://www.dcc.ac.uk/events/research-data-management-forum/rdmf7-incentivising-data-management-sharing
Making the Case for RDM, DCC Briefing Paper: http://www.dcc.ac.uk/resources/briefing-papers/making-case-rdm
Report on the Benefits from the Infrastructure Projects in the JISC Managing Research Data Programme: http://www.jisc.ac.uk/whatwedo/programmes/mrd/outputs/benefitsreport.aspx
JISC Managing Research Data Programme
JISC Managing Research Data Programme, Outputs: http://www.jisc.ac.uk/whatwedo/programmes/mrd/outputs.aspx
Second JISC Managing Research Data Programme, Google Map of funded projects: http://maps.google.co.uk/maps/ms?msid=210493456856136057364.0004ab687f5a25636a285&msa=0
Call for Proposals on research data publications/citation and on training planned for the New Year.
Questions