Digital Curation Centre: tools and services under development
David Giaretta, Associate Director (Development)
Funders:
Digital Curation Centre: a centre of expertise in data curation and preservation
Organisation
[Diagram labels: Industry; research collaborators; standards bodies; testbeds & tools; communities of practice: users; UKOLN; U of Edinburgh; CCLRC; U of Glasgow; U of Edinburgh; curation organisations, e.g. DPC; Collaborative Associates Network of Data Organisations]
Organisation
[Diagram labels: Industry; research collaborators; standards bodies; testbeds & tools; communities of practice: users; community support & outreach; research; development co-ordination; service definition & delivery; management & admin support; curation organisations, e.g. DPC; Collaborative Associates Network of Data Organisations]
[Diagram of collaborating organisations; labels as they appear:]
CCLRC; UKOLN; UofG; UofE
CMS-Bristol; NIEeS; RG; Durham; WT-CFG; Leicester; IC; Maastricht; Oxford; Dutch NA; Swiss NA; Urbino; UNC; Salzburg; SDSC
NEODC; CEH; RI; NCS; RLG; Innogen; NHS
Capri; NTUA; INRIA; HUJ; UPC; Max-Planck; MIMAS; IASSIST; LDC; ACM; Data Archive
EDG; GridPP; EGEE; Cambridge; Leicester; Jodrell Bank; DLI (US); DPC; DELOS; UNC; ESA
NASA; NARA; CNES; ESA; RLG; BNSC; TU Vienna; UPenn; EBI; MRC HGU; Kyoto; USC; INRIA; GSK; Roslin; IBM Almaden
JHU; CSIRO; Caltech; JHU; CSIRO; CDS; ESO; OCLC; AHDS; Microsoft; IBM; Oracle; BT; STK; BADC; BODC; ESO; IVOA
Research Councils; HEIs & FE; Research Institutes; International Collaborations; Standards Bodies
DPC; MIMAS; ILRT; Council for Museums, Archives & Libraries; RDN; OCLC; So’ton; OAI; NOF; NLA; NeSC
Overview
• Developing tools and services that will be needed in the short to medium term
  – integrating tools from many sources
• These will become new DCC services and will also be usable separately by other projects
• Strongly OAIS based
• Support automated processing & interoperability
OAIS Reference Model – Functional Model
[Figure 4-1: functional entities Ingest, Data Management, Archival Storage, Access, Administration and Preservation Planning, placed between PRODUCER, CONSUMER and MANAGEMENT. Labelled flows: SIP, AIP, DIP, Descriptive Info, queries, result sets, orders.]
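The package flow named in the functional model (a Producer submits a SIP, the archive stores an AIP, a Consumer queries and orders a DIP) can be sketched in a few lines. This is only an illustrative toy, not a DCC tool: the class names, the `aip-N` identifier scheme and the substring search are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SIP:
    """Submission Information Package, as sent by a Producer."""
    producer: str
    content: bytes
    description: str


@dataclass
class AIP:
    """Archival Information Package, as held in Archival Storage."""
    aip_id: str
    content: bytes
    descriptive_info: Dict[str, str]


@dataclass
class DIP:
    """Dissemination Information Package, as delivered to a Consumer."""
    aip_id: str
    content: bytes


@dataclass
class ToyArchive:
    """Hypothetical in-memory archive covering Ingest, Data Management and Access."""
    storage: Dict[str, AIP] = field(default_factory=dict)                 # Archival Storage
    catalogue: Dict[str, Dict[str, str]] = field(default_factory=dict)    # Data Management

    def ingest(self, sip: SIP) -> str:
        """Ingest: turn a SIP into an AIP and record its Descriptive Info."""
        aip_id = f"aip-{len(self.storage) + 1}"  # assumed identifier scheme
        info = {"producer": sip.producer, "description": sip.description}
        self.storage[aip_id] = AIP(aip_id, sip.content, info)
        self.catalogue[aip_id] = info
        return aip_id

    def query(self, term: str) -> List[str]:
        """Access: return the result set of AIP ids whose description mentions term."""
        return [
            aip_id
            for aip_id, info in self.catalogue.items()
            if term.lower() in info["description"].lower()
        ]

    def order(self, aip_id: str) -> DIP:
        """Access: fulfil an order by deriving a DIP from the stored AIP."""
        aip = self.storage[aip_id]
        return DIP(aip.aip_id, aip.content)
```

Administration and Preservation Planning are deliberately left out; they govern the archive rather than move packages through it.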
Representation Net
Representation Information Classification
Representation Information vs Format
• Format = Structure
• Omits important information, e.g.
  – Language, terminology
  – Encryption
• Need to know more than just Format in order to stand a chance of being in a position to use the information
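A minimal sketch of the distinction drawn above: a record that stores only the format (structure) can be checked for what else a future user would need. The field names and the helper `missing_beyond_format` are hypothetical, chosen to mirror the two omissions the slide lists.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RepresentationInfo:
    """Illustrative record: format is only the 'structure' part."""
    structure: str                     # the format, e.g. "CSV, UTF-8"
    semantics: Optional[str] = None    # language, terminology, units
    encryption: Optional[str] = None   # None means "not stated", not "unencrypted"


def missing_beyond_format(rep: RepresentationInfo) -> List[str]:
    """List the non-format information that has not been recorded."""
    missing = []
    if rep.semantics is None:
        missing.append("semantics")
    if rep.encryption is None:
        missing.append("encryption")
    return missing


format_only = RepresentationInfo(structure="CSV, UTF-8")
fuller = RepresentationInfo(
    structure="CSV, UTF-8",
    semantics="column names in English; temperatures in kelvin",
    encryption="none",
)
```

Here `missing_beyond_format(format_only)` reports both gaps, while the fuller record reports none.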
Layered Model from OAIS
More easily applicable to Science data
Representation Information - High Level View
Example of use of Representation Information Labelling
Registry/Repository
• Interface and protocols
  – JAXR “standard”
  – freebXML implementation
  – many access methods
    • URL
    • Web Services
    • API
    • Etc.
• Findability
  – Persistent IDs
• What can we rely on?
  – Labels (to support automated processing)
• Initial service this Summer
  – Hope to work with PRONOM 4 & GDFR
Registry/Repository
• Trusted repository of Rep. Info
  – Authenticity of info
  – Access control
  – Certificates/Digests (are they trustable over the long term?)
• Extensibility
• Distributed
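As a rough illustration of persistent IDs combined with digests, the sketch below keeps Representation Information in memory and checks a SHA-256 digest on retrieval. It does not use JAXR or freebXML; `ToyRegistry`, its methods and the example identifier are assumptions for the example only.

```python
import hashlib
from typing import Dict, Tuple


class ToyRegistry:
    """Hypothetical in-memory registry keyed by persistent identifier."""

    def __init__(self) -> None:
        # pid -> (representation information, digest recorded at registration)
        self._entries: Dict[str, Tuple[bytes, str]] = {}

    def register(self, pid: str, rep_info: bytes) -> str:
        """Store rep_info under pid and return its SHA-256 digest."""
        if pid in self._entries:
            raise ValueError(f"{pid} is already registered; persistent IDs are not reused")
        digest = hashlib.sha256(rep_info).hexdigest()
        self._entries[pid] = (rep_info, digest)
        return digest

    def fetch(self, pid: str, expected_digest: str) -> bytes:
        """Return rep_info only if it still matches the digest the caller holds."""
        rep_info, _recorded = self._entries[pid]
        if hashlib.sha256(rep_info).hexdigest() != expected_digest:
            raise ValueError(f"digest mismatch for {pid}")
        return rep_info


registry = ToyRegistry()
csv_digest = registry.register("pid:example/csv-description", b"comma-separated, UTF-8")
```

A digest only shows that the bytes are unchanged; whether the digest algorithm itself stays trustworthy over the long term is the open question raised in the list above.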
Certification
• RLG task force preparing draft standard
  – Based on OAIS (plus TDR)
  – Expect this to become an ISO standard
• Tool:
  – Checklist and reports
  – …
  – Awaiting release of draft (in May)
Archival Information Package
• METS
• XFDU Packaging
• Expect tools available by end of year
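To give a feel for METS-style packaging, the sketch below builds a skeletal METS document with one file entry and a structural map using Python's standard library. It is not one of the tools referred to above and has not been validated against the METS schema; the function name and the example file location are assumptions.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"


def build_mets(file_id: str, href: str) -> ET.Element:
    """Build a skeletal METS tree: one file in fileSec, referenced from structMap."""
    ET.register_namespace("mets", METS_NS)
    ET.register_namespace("xlink", XLINK_NS)

    root = ET.Element(f"{{{METS_NS}}}mets")

    # File section: where the packaged content lives.
    file_sec = ET.SubElement(root, f"{{{METS_NS}}}fileSec")
    file_grp = ET.SubElement(file_sec, f"{{{METS_NS}}}fileGrp")
    file_el = ET.SubElement(file_grp, f"{{{METS_NS}}}file", {"ID": file_id})
    ET.SubElement(
        file_el,
        f"{{{METS_NS}}}FLocat",
        {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": href},
    )

    # Structural map: points back at the file by its ID.
    struct_map = ET.SubElement(root, f"{{{METS_NS}}}structMap")
    div = ET.SubElement(struct_map, f"{{{METS_NS}}}div")
    ET.SubElement(div, f"{{{METS_NS}}}fptr", {"FILEID": file_id})
    return root


# Hypothetical file location, for illustration only.
package = build_mets("file-1", "file:///data/readings.csv")
xml_text = ET.tostring(package, encoding="unicode")
```

A real AIP would also carry descriptive and administrative metadata sections; they are omitted here to keep the sketch short.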
Preservation Description Info
Will be working with PREMIS on tools
DCC Development Roadmap for next 6-12 months
• Registry
  – Complete phase 1
  – Include links to TNA/PRONOM
  – Hand over to Services group
  – Start Phase 2: aim for “Trusted Repository” status
• Representation Information:
  – Data descriptions of science data using EAST (http://east.cnes.fr) & others
  – Import other Structure description tools and Data Dictionary tools
  – Develop Mapping to data object level
  – Work with other projects, e.g. Emulation, Processing
• Certification
  – Draft certification
    • Checklist
    • Proposed standard
• Additional Tools
  – Metadata extraction tool set
  – Ingest tool (based on PAIMAS standard)
• Testbeds, e.g. large scale data management tools
Research
• To draw together the various functions of curation, from the traditional archival functions to the maintenance and publication of evolving knowledge as seen in scientific databases.
• To identify through direct research collaboration, and through interaction with the service arm of DCC, the key projects in which research is needed.
• To conduct research in areas already identified by the partners as crucial to digital curation.
• To institute two-way conduits between research and service in which practical issues can be drawn to the attention of researchers and the products of research can be tested in practice.
Current research priorities
• Data integration and publication
• Performance and optimisation
• Annotation
• Appraisal and long-term preservation
• Socio-economic and legal context: rights, responsibilities and viability
• Cost-benefit analysis of the data curation process
• Security: safe and effective data analysis environments
• Automation of metadata extraction
• Visitors Programme and Seminar Series
Summary
• Developing and integrating OAIS based tools
• Reviewing other related tools
• See http://www.dcc.ac.uk
  – a Development Web site (http://dev.dcc.rl.ac.uk) with a Wiki and an associated open email list has also been set up.
  – aim to encourage the widest possible collaboration with other projects.
• In the medium to long term, expect tools from DCC Research activities, e.g. Annotation