Digital | Curation | Centre Continuing Access to Research Data: The New Digital Curation Centre Peter Burnhill Director (Phase One) Funded by:

Digital | Curation | Centre

Continuing Access to Research Data: The New Digital Curation Centre

Peter Burnhill

Director (Phase One)

Funded by:

An Overview

• Personal Provenance

• Digital Curation Centre
  – what it is (and is not)
  – who’s involved

• Curating the Future

• We’re all Curators Now

Personal Provenance: A Data Person

1970s Economics undergrad; research administration; Statistics postgrad; medical (bioassay/ultrasound) screening

1980s Survey methodologist; Scot. Edu. Data Archive; social sciences; set up University Data Library, 1984; merged into Computing Services; geographic information systems and online database access: co-director, ESRC Regional Research Lab (Scotland)

1990s Digital Library developments; national online services; director, EDINA national data centre (JISC); president, IASSIST; eLib Programme: Digimap (OS); words, numbers, pictures, sounds

2000 - JISC IE middleware – portals/brokers; authentication/authorisation; linking research data & scholarly communication; Geography Data Unit; eLearning & eScience;

interim director, Digital Curation Centre ** still a trainee **

‘Demand-side’ Verbs from Virtual Library

• Discover information object of interest
  e.g. a reference in an A&I database or cited at the foot of an article

• Locate service on information object
  e.g. a service giving electronic access to the full text of the article, or one’s own library having the volume on a shelf nearby

• Request use of service
  via payment of money or (better still) privilege of membership: involves authorisation and authentication

• Access (service on) object of interest
  e.g. online access (and print-out), personal visit or document delivery

MODELS workshops, UKOLN/JISC eLib Programme, 1994ish

UK Digital Curation Centre

• identified in Report commissioned by JISC Cttee for Support of Research (Lord & Macdonald, May 2003)
  – Twin drivers
    • Digital Preservation: ePublishing (DPC) & eLearning
    • Continuing Access: e-Science, ‘data deluge’ & Res Council policies

• Call to set up DCC in JISC Circular 6/03, June 2003
  – Ambitious & demanding remit
  – Joint funding by JISC and e-Science Core Programme
    • Funding for outreach, services & development
    • Funding for research programme

• Task entrusted to Consortium of four partners
  – award made Feb/March 2004

Overall Aim

‘continuing quality improvement in data curation & digital preservation’

• Initial focus: data as evidential base for scholarly conclusions
  – role of data archiving & preservation as keys to reproducibility and reuse

• Wider context & remit: worlds of scholarly communication & eLearning

Objectives

• vibrant research programme
  – addressing the wider issues of digital curation

• Collaborative Associates Network of Data Organisations
  – outreach for strong links across existing community of practice
  – engagement with curators (individuals & organisations)

• service definition and delivery
  – to evaluate tools, methods, standards and policies
  – a repository of tools and technical information

• ‘virtuous circle’
  – expertise, experience & requirement feed into the DCC research programme

What the DCC is not ...

… a national digital repository

… an attempt to teach grandmothers to suck eggs

… just another advisory service

DCC Consortium Partners

• Four Consortium partner institutions:
  – University of Edinburgh - lead partner
  – University of Glasgow (HATII)
  – University of Bath (UKOLN)
  – CCLRC (Rutherford and Daresbury Laboratories)

• Prior links via National eScience Centre (NeSC)
  – jointly managed by Universities of Edinburgh & Glasgow

Some Names & Responsibilities

• Them with titles …
  – Peter Burnhill, Director (Phase One), with Robin Rice, Phase One Project Co-ordinator
    • EDINA & Data Library, University of Edinburgh
  – Peter Buneman, Research Director (& PI on EPSRC grant)
    • Informatics, University of Edinburgh
  – Liz Lyon, Associate Director (Community Support & Outreach)
    • UKOLN, University of Bath
  – Seamus Ross, Associate Director (Service Definition & Delivery)
    • HATII, University of Glasgow
  – David Giaretta, Associate Director (Development)
    • CCLRC

• Two significant & well known ‘Ex Portfolio’ names
  – Malcolm Atkinson, Director, NeSC
  – Chris Rusbridge, Director, Information Services, University of Glasgow

What needs to be done

• Respond to policy imperatives
  • twin aims: excellence in research & excellence in service
    – international respect & national leadership
    – meeting the needs of e-Science
  • impact now and into the future
    – manage complexity, risk and sustainability

• Bridge across communities
  • universities & research institutes
  • scientific data tradition & document tradition
  • different disciplinary perspectives
  • engaging the information & computing sciences

• Develop a collaborative model
  • Associates Network of Data Organisations

[Diagram: the four DCC partners (CCLRC, UKOLN, UofG, UofE) shown alongside a large number of other organisations, under the labels Research Councils, HEIs & FE, Research Institutes, International Collaborations and Standards Bodies.]

Organisations named in the diagram include: CMS-Bristol, Durham, WT-CFG, Leicester, IC, Maastricht, Oxford, Dutch NA, Swiss NA, Urbino, UNC, Salzburg, SDSC, NEODC, CEH, NCS, RLG, Innogen, NHS, Capri, NTUA, INRIA, HUJ, UPC, Max-Planck, MIMAS, IASSIST, LDC, ACM, Data Archive, EDG, GridPP, EGEE, Cambridge, Jodrell Bank, DLI (US), DPC, DELOS, ESA, NASA, NARA, CNES, BNSC, TU Vienna, UPenn, EBI, MRC HGU, Kyoto, USC, GSK, Roslin, IBM Almaden, JHU, CSIRO, Caltech, CDS, ESO, OCLC, AHDS, Microsoft, IBM, Oracle, BT, STK, BADC, BODC, IVOA, ILRT, Council for Museums, Archives & Libraries, RDN, So’ton, OAI, NOF, NLA, NeSC.

developing the collaborative model

[Diagram labels: Industry; research collaborators; standards bodies; testbeds & tools; communities of practice: users; community support & outreach; research; development; services; management & co-ordination; curation organisations, e.g. DPC; Collaborative Associates Network of Data Organisations]

Digital Curation (1): Terminology

• actions needed to maintain and utilise digital data & research results over entire life-cycle
  – for current and future generations of users.

alongside which is Archiving
  – appraisal & retention/disposal
    • logical & physical integrity: authenticity/security

and Digital Preservation
  – long-run technological/legal accessibility & usability

• Data curation in science
  – maintenance of body of trusted data
    • to represent current state of knowledge in area of research.

Digital Curation (2)

Digital Curation = Data Curation * Digital Preservation

• two organising themes:
  – data as evidence
  – archival responsibility

• mix of traditions and of activities
  – shared concern for current and future scholarship
  – what’s different about the digital, about data

• maintenance of body of trusted data

Digital Curation (3)

Digital Curation = Data Curation * Digital Preservation

• Data Curation

– Data in use (huge, distributed, for long periods)

– Adding value (eg annotation)

– Combination and re-combination (provenance)

• Digital Preservation

– Future technological/legal accessibility & usability

– Significance of ‘designated community’ (OAIS)

Curation in action

• Astronomy
  • Integrating and analysing distributed data (AstroGrid)
  • publishing multi-TB sky surveys (SuperCOSMOS & WFCAM)
  • interoperability standards (IVO Alliance)

• BioInformatics
  • data publishing: generic tools for XML export (EBI Biomart)
  • annotation tools for massive data sets (Pubmed, VOTable)
  • archiving tools for dynamic data sets (biological DBs)

• Environmental sciences
  • spatio-temporal annotation (OS Mastermap / Mouse Atlas)

• Document management
  • Tools for capture & normalisation (Xena)
  • Repository certification (RLG Task Force)

Digital Preservation Issues

• Supporting ingest, management and dissemination
  • Registries: file formats, metadata, peripheral devices

• Tracking and testing tools and standards
  • ingest, repository management, data exchange, ontologies, interoperability, metadata

• Using OAIS as reference against which to test new models and architectures

• Research topics
  – Repositories: repository models, registries
  – Long-term viability of metadata
  – Preservation strategies for emerging digital formats

• Invest to Save, Report and recommendations of the NSF-DELOS Working Group on Digital Archiving and Preservation (2003)
  • http://delos-noe.iei.pi.cnr.it/

Research & Development

• Research
  – Annotation, Data integration and publication
  – Appraisal and long-term preservation
  – Socio-economic & legal context
    • rights, responsibilities and viability
  – Performance and Optimisation

• Development into Services
  – Standards & Testbeds
  – File Formats
  – Registry of Metadata Standards

Further topics:

• Evolution of structure, Ontologies, Emulation

Research Agenda

• Aims: evidence & curation as integrative activities
  – usability & automation
  – novel & visible research
    • deliverables/testbeds

• Hot Topics
  – annotation & provenance
    • universal interest, wide subject, e.g. referencing
  – data publishing
    • metadata, Grid services, integration, security, optimisation
  – archiving and appraisal
    • process automation at ingest, curating change, scalability
  – socio-economic and legal
    • organisational dynamics, rights/responsibilities

• Reach out & listen - virtuous circle

Development

• Turns Research into ‘Products for Research’ that our communities can use with confidence
  – tracking and testing tools and standards
    • that are correct, usable, reliable, well documented
      e.g. for ingest, repository management, data exchange, ontologies
    • working with tool developers wherever possible
    • developing testbeds & interworking with other testbeds
  – aim to gain leverage formats
    • working with other projects worldwide
    • using generic tools and techniques
  – to develop strategies for emerging digital formats
  – Metadata standards
    • long-term viability of metadata

• Registries underpin this work to provide basis of Advisory Service

Setting up the DCC

• Funding from the JISC began on 1 March 2004
  – EPSRC Research funding begins on 1 September 2004
    • expect to harvest ‘early crop’ from extant research

• Phase One Set-up
  – from now until Launch of Centre in October 2004
  – face2face meetings: 20/21 March & 24/25 June
    • drawing up programme of deliverables
    • re-deploying & recruiting staff
  – aim to have appointed full time director in time for Launch

Early ‘deliverables’

• Website at www.dcc.ac.uk
  – visit to learn of updates & progress

• especially of such ‘work in progress’ as
  – draft of ‘DCC Approach to Digital Curation’
    » Dr David Giaretta, DCC Associate Director (CCLRC)
  – launch of e-journal
    » Dr Liz Lyon, DCC Associate Director (UKOLN)
  – Digital Curation ‘Manual’
    » Dr Seamus Ross, DCC Associate Director (HATII)
  – and Presentations like this
    • to help build the Associates Network
      – Leona Carpenter

• Helpdesk at [email protected]
  – contact us with offers of collaboration