Digital | Curation | Centre
Continuing Access to Research Data: The New Digital Curation Centre
Peter Burnhill
Director (Phase One)
Funded by:
An Overview
• Personal Provenance
• Digital Curation Centre – what it is (and is not) – who’s involved
• Curating the Future
• We’re all Curators Now
Personal Provenance: A Data Person
1970s Economics undergrad; research administration; Statistics postgrad; medical (bioassay/ultrasound) screening
1980s Survey methodologist; Scot. Edu. Data Archive; social sciences; set up University Data Library, 1984; merged into Computing Services; geographic information systems and online database access: co-director, ESRC Regional Research Lab (Scotland)
1990s Digital Library developments; national online services; director, EDINA national data centre (JISC); president, IASSIST; eLib Programme: Digimap (OS); words, numbers, pictures, sounds
2000 - JISC IE middleware – portals/brokers; authentication/authorisation; linking research data & scholarly communication; Geography Data Unit; eLearning & eScience; interim director, Digital Curation Centre ** still a trainee **
‘Demand-side’ Verbs from Virtual Library
• Discover information object of interest
  e.g. a reference in an A&I database or cited at the foot of an article
• Locate service on information object
  e.g. a service giving electronic access to the full text of the article, or one’s own library having the volume on a shelf nearby
• Request use of service
  via payment of money or (better still) privilege of membership: involves authorisation and authentication
• Access (service on) object of interest
  e.g. online access (and print-out), personal visit or document delivery
MODELS workshops, UKOLN/JISC eLib Programme, 1994ish
UK Digital Curation Centre
• identified in Report commissioned by JISC Committee for Support of Research (Lord & Macdonald, May 2003)
  – Twin drivers
    • Digital Preservation: ePublishing (DPC) & eLearning
    • Continuing Access: e-Science, ‘data deluge’ & Research Council policies
• Call to set up DCC in JISC Circular 6/03, June 2003
  – Ambitious & demanding remit
  – Joint funding by JISC and e-Science Core Programme
    • Funding for outreach, services & development
    • Funding for research programme
• Task entrusted to Consortium of four partners
  – award made Feb/March 2004
Overall Aim
‘continuing quality improvement in data curation & digital preservation’
• Initial focus: data as evidential base for scholarly conclusions
  – role of data archiving & preservation as keys to reproducibility and reuse
• Wider context & remit: worlds of scholarly communication & eLearning
Objectives
• vibrant research programme
  – addressing the wider issues of digital curation
• Collaborative Associates Network of Data Organisations
  – outreach for strong links across existing community of practice
  – engagement with curators (individuals & organisations)
• service definition and delivery
  – to evaluate tools, methods, standards and policies
  – a repository of tools and technical information
• ‘virtuous circle’
  – expertise, experience & requirement feed into the DCC research programme
What the DCC is not ...
… a national digital repository
… an attempt to teach grandmothers to suck eggs
… just another advisory service
DCC Consortium Partners
• Four Consortium partner institutions:
  – University of Edinburgh - lead partner
  – University of Glasgow (HATII)
  – University of Bath (UKOLN)
  – CCLRC (Rutherford and Daresbury Laboratories)
• Prior links via National eScience Centre (NeSC)
  – jointly managed by Universities of Edinburgh & Glasgow
Some Names & Responsibilities
• Them with titles …
  – Peter Burnhill, Director (Phase One), with Robin Rice, Phase One Project Co-ordinator
    • EDINA & Data Library, University of Edinburgh
  – Peter Buneman, Research Director (& PI on EPSRC grant)
    • Informatics, University of Edinburgh
  – Liz Lyon, Associate Director (Community Support & Outreach)
    • UKOLN, University of Bath
  – Seamus Ross, Associate Director (Service Definition & Delivery)
    • HATII, University of Glasgow
  – David Giaretta, Associate Director (Development)
    • CCLRC
• Two significant & well known ‘Ex Portfolio’ names
  – Malcolm Atkinson, Director, NeSC
  – Chris Rusbridge, Director, Information Services, UofGlasgow
What needs to be done
• Respond to policy imperatives
  • twin aims: excellence in research & excellence in service
    – international respect & national leadership
    – meeting the needs of e-Science
  • impact now and into the future
    – manage complexity, risk and sustainability
• Bridge across communities
  • universities & research institutes
  • scientific data tradition & document tradition
  • different disciplinary perspectives
  • engaging the information & computing sciences
• Develop a collaborative model
  • Associates Network of Data Organisations
[Diagram: the four DCC partners (CCLRC, UKOLN, UofG, UofE) shown with a large number of named organisations, under the group labels Research Councils, HEIs & FE, Research Institutes, International Collaborations and Standards Bodies.]
Developing the collaborative model
[Diagram: Collaborative Associates Network of Data Organisations. Labels: industry; research collaborators; standards bodies; testbeds & tools; communities of practice: users; curation organisations, eg DPC; community support & outreach; research; development; services; management & co-ordination.]
Digital Curation (1): Terminology
• actions needed to maintain and utilise digital data & research results over entire life-cycle
  – for current and future generations of users.
alongside which is Archiving
  – appraisal & retention/disposal
    • logical & physical integrity: authenticity/security
and Digital Preservation
  – long-run technological/legal accessibility & usability
• Data curation in science
  – maintenance of body of trusted data
    • to represent current state of knowledge in area of research.
Digital Curation (2)
Digital Curation = Data Curation * Digital Preservation
• two organising themes:
  – data as evidence
  – archival responsibility
• mix of traditions and of activities
  – shared concern for current and future scholarship
  – what’s different about the digital, about data
• maintenance of body of trusted data
Digital Curation (3)
Digital Curation = Data Curation * Digital Preservation
• Data Curation
– Data in use (huge, distributed, for long periods)
– Adding value (eg annotation)
– Combination and re-combination (provenance)
• Digital Preservation
– Future technological/legal accessibility & usability
– Significance of ‘designated community’ (OAIS)
Curation in action
• Astronomy
  • integrating and analysing distributed data (AstroGrid)
  • publishing multi-TB sky surveys (SuperCOSMOS & WFCAM)
  • interoperability standards (IVO Alliance)
• BioInformatics
  • data publishing: generic tools for XML export (EBI Biomart)
  • annotation tools for massive data sets (Pubmed, VOTable)
  • archiving tools for dynamic data sets (biological DBs)
• Environmental sciences
  • spatio-temporal annotation (OS Mastermap/ Mouse Atlas)
• Document management
  • Tools for capture & normalisation (Xena)
  • Repository certification (RLG Task Force)
Digital Preservation Issues
• Supporting ingest, management and dissemination
  • Registries: file formats, metadata, peripheral devices
• Tracking and testing tools and standards
  • ingest, repository management, data exchange, ontologies, interoperability, metadata
• Using OAIS as reference against which to test new models and architectures
• Research topics
  – Repositories: repository models, registries
  – Long-term viability of metadata
  – Preservation strategies for emerging digital formats
• Invest to Save: Report and Recommendations of the NSF-DELOS Working Group on Digital Archiving and Preservation (2003)
  • http://delos-noe.iei.pi.cnr.it/
Research & Development
• Research
  – Annotation, Data integration and publication
  – Appraisal and long-term preservation
  – Socio-economic & legal context
    • rights, responsibilities and viability
  – Performance and Optimisation
• Development into Services
  – Standards & Testbeds
  – File Formats
  – Registry of Metadata Standards
Further topics:
• Evolution of structure, Ontologies, Emulation
Research Agenda
• Aims
  – evidence & curation as integrative activities
  – usability & automation
  – novel & visible research
    • deliverables/testbeds
• Hot Topics
  – annotation & provenance
    • universal interest, wide subject, eg referencing
  – data publishing
    • metadata, Grid services, integration, security, optimisation
  – archiving and appraisal
    • process automation at ingest, curating change, scalability
  – socio-economic and legal
    • organisational dynamics, rights/responsibilities
• Reach out & listen - virtuous circle
Development
• Turns Research into ‘Products for Research’ that our communities can use with confidence
  – tracking and testing tools and standards
    • that are correct, usable, reliable, well documented
      e.g. for ingest, repository management, data exchange, ontologies
    • working with tool developers wherever possible
    • developing testbeds & interworking with other testbeds
  – aim to gain leverage
    • working with other projects worldwide
    • using generic tools and techniques
  – Formats
    • to develop strategies for emerging digital formats
  – Metadata standards
    • long-term viability of metadata
• Registries underpin this work to provide basis of Advisory Service
Setting up the DCC
• Funding from the JISC began on 1 March 2004
  – EPSRC Research funding begins on 1 September 2004
    • expect to harvest ‘early crop’ from extant research
• Phase One Set-up
  – from now until Launch of Centre in October 2004
  – face2face meetings: 20/21 March & 24/25 June
    • drawing up programme of deliverables
    • re-deploying & recruiting staff
  – aim to have appointed full time director in time for Launch
Early ‘deliverables’
• Website at www.dcc.ac.uk
  – visit to learn of updates & progress
• especially of such ‘work in progress’ as
  – draft of ‘DCC Approach to Digital Curation’
    » Dr David Giaretta, DCC Associate Director (CCLRC)
  – launch of e-journal
    » Dr Liz Lyon, DCC Associate Director (UKOLN)
  – Digital Curation ‘Manual’
    » Dr Seamus Ross, DCC Associate Director (HATII)
  – and Presentations like this
    • to help build the Associates Network
  – Leona Carpenter
• Helpdesk at [email protected]
  – contact us with offers of collaboration