a centre of expertise in data curation and preservation
EAOLUG :: RSC :: Cambridge 23 May 2006
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 2.5 UK: Scotland License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/2.5/scotland/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.
Introduction to Digital Archives
Maureen Pennock
EAOLUG Spring/Summer Meeting 2006
Today’s talk
• The DCC
  • Background & Context
  • What We Do
• Digital Archives & Archiving
  • Definitions
  • Main Issues
  • OAIS
  • Systems
UK Digital Curation Centre
• JISC Circular 6/03 called for bids in digital curation
• JISC and the e-Science Core Programme funding
  • for development, services and outreach in digital curation
  • for a research programme
• Impetus to action
  • Growth in e-Science activity and data creation
  • Recognition that continuing access to digital information is needed
Partners
• University of Edinburgh (lead site)
  • Chris Rusbridge, Prof Peter Buneman
• University of Glasgow - HATII
  • Prof Seamus Ross, Director of HATII and ERPANET
• University of Bath - UKOLN
  • Dr Liz Lyon, Director of UKOLN
• Council for the Central Laboratory of the Research Councils (CCLRC)
  • Dr David Giaretta
Objectives
• Lead a vibrant international research programme to improve quality in data curation and digital preservation
• Deliver effective, efficient and high-demand services
  • Undertake evaluation of tools, methods, standards and policies
  • Work with the community to establish registries of tools and technical information
• Create an active, innovative and collaborative Associates Network
• Connect communities
  • Universities and research institutions
  • Scientific data and documents
  • International & cross-sector
Research
• Annotation in Databases
• Data archiving
• Socio-economic and legal issues
• Metadata extraction and curation
• Provenance and databases
• Data transformation, integration and publishing
• Security
• Supporting technologies
• Organisational and cultural challenges to digital curation
Development
• DCC Approach to Digital Curation (white paper), which sets out the path for development activities:
  • Monitoring international standards
  • Development of a Representation Information Registry/Repository (DCC RIR)
  • Development of recommendations for tools and methods for generating Representation Information
  • Creating testbeds for digital curation tools
  • Creating auditing and certification processes for trusted repositories
Services
• Information Services
  • Community-developed Digital Curation Manual
  • Briefing Papers & FAQs
  • Technology Watch
  • Case Studies
  • Best Practice Checklists
• Advisory Services
  • Events: information days, workshops, training, conferences
  • Helpdesk
• Audit and Certification Services
Summary
• Support and promote continuing improvement in the quality of data curation and preservation activity
• Nurture strong community relationships between practitioners, researchers, and curators
• Address digital curation from all aspects of the records life-cycle
• Develop and promote curation knowledge, tools and techniques
• Identify and research new organisational, technical, and supporting curation challenges
Digital Curation
• Digital curation is all about maintaining and adding value to a trusted body of digital information for current and future use; specifically, we mean the active management and appraisal of data over the life-cycle of scholarly and scientific materials.
• Digital curation brings a whole host of challenges
• The range of stakeholders that affect the survival of digital material cuts across the whole life-cycle
• Everyone plays an important role
Digital Archiving
• Digital archiving is a curation activity
• Ensures that
  • Data is properly selected
  • Data is properly stored
  • Data can be accessed
  • The logical and physical integrity of the data is maintained over time
  • Data is secure and authentic *
* Lord & MacDonald, e-Science Data Curation Report, 2003
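One common way to check that the integrity of stored data is maintained over time is a fixity check: record a checksum when a file enters the archive and recompute it later. The sketch below is illustrative only and is not taken from the talk; the function names (`sha256_of`, `record_fixity`, `verify_fixity`) and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_fixity(paths):
    """Build a manifest mapping each file path to its digest (hypothetical helper)."""
    return {str(p): sha256_of(Path(p)) for p in paths}


def verify_fixity(manifest):
    """Return the paths that are missing or whose digest no longer matches."""
    changed = []
    for name, expected in manifest.items():
        path = Path(name)
        if not path.is_file() or sha256_of(path) != expected:
            changed.append(name)
    return changed
```

Running `verify_fixity` on a schedule against a manifest created at ingest would flag files that have been altered, corrupted, or lost. It says nothing about whether the content can still be understood, which is the separate concern of digital preservation.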
Digital Preservation
• Digital preservation is an archiving activity
• Ensures that specific items of data are maintained over time so that they can still be accessed and understood through changes in technology *
• Includes content files and associated metadata
• Combats digital obsolescence
• Keeps data authentic despite technological change
• Has technical, organisational, and cultural challenges
* Lord & MacDonald, e-Science Data Curation Report, 2003
What is a Digital Archive?
• Inconsistency in use of the terms digital archive, digital repository, and digital library
• Task Force on Archiving Digital Information 1996: “Defines digital archives strictly in functional terms as repositories of digital information that are collectively responsible for ensuring, through the exercise of various migration strategies, the integrity and long-term accessibility of the nation’s social, economic, cultural and intellectual heritage instantiated in digital form.”
• Provide reliable solutions for life-cycle and long-term management of digital archival materials
• System driver is Preservation, leading to Access
What is a Digital Repository?
• Collections of digital objects: content + metadata
• Cross-domain implementation
• Offer a minimum set of basic services: Get, Search, Access control
• Sustainable & trusted; well-supported and managed
• Policies, processes, services, people
• Overall commitment to stewardship of digital materials
• Enables quick & remote access to digital materials
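The minimum service set named above (Get, Search, Access control) over objects made of content plus metadata can be sketched in a few lines. This is a toy in-memory model under stated assumptions, not the interface of any real repository system; `DigitalObject`, `Repository`, and the rule that an empty `allowed_users` set means "public" are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    identifier: str
    content: bytes
    metadata: dict
    # Assumption for this sketch: an empty set means anyone may read the object.
    allowed_users: set = field(default_factory=set)


class Repository:
    def __init__(self):
        self._objects = {}

    def add(self, obj):
        self._objects[obj.identifier] = obj

    def _can_read(self, obj, user):
        # Access control: public objects, or objects that list this user.
        return not obj.allowed_users or user in obj.allowed_users

    def get(self, identifier, user=None):
        # Get: return one object by identifier (KeyError if unknown).
        obj = self._objects[identifier]
        if not self._can_read(obj, user):
            raise PermissionError(f"{user!r} may not read {identifier!r}")
        return obj

    def search(self, term, user=None):
        # Search: case-insensitive match against metadata values,
        # limited to objects the user is allowed to read.
        term = term.lower()
        return [
            obj.identifier
            for obj in self._objects.values()
            if self._can_read(obj, user)
            and any(term in str(value).lower() for value in obj.metadata.values())
        ]
```

For example, after adding a public report and a thesis restricted to one user, `search("thesis")` returns nothing for an anonymous caller but returns the thesis identifier for the permitted user.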
Main Issues for Digital Archives
• User Requirements
• Transfer & Ingest
• Metadata
• Standards
• Digital preservation strategies
• Linkage
• Audit and Certification
• Legal Issues
• Access restrictions
OAIS
• Open Archival Information System Reference Model
• ISO 14721:2003
• "An archive, consisting of an organisation of people and systems, that has accepted the responsibility to preserve information and make it available for a Designated Community"
• Establishes a common framework of terms and concepts
• Defines an Information Model
• Identifies basic Functions of an OAIS
OAIS Functional Model
• Functional model has six entities:
  • Ingest
  • Archival Storage
  • Data Management
  • Administration
  • Preservation Planning
  • Access
• Described using UML diagrams
OAIS Functional Entities
[Figure: OAIS Functional Entities (Figure 4-1). Entities: Ingest, Archival Storage, Data Management, Administration, Preservation Planning, Access. External actors: Producer, Consumer, Management. Packages: SIP, AIP, DIP. Labelled flows: descriptive info., queries, result sets, orders.]
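In OAIS terms, a Producer submits a Submission Information Package (SIP), the archive stores it as an Archival Information Package (AIP), and a Consumer receives a Dissemination Information Package (DIP). The sketch below walks one package through that path as a rough illustration. It is a heavy simplification: Administration and Preservation Planning are left out, and every class, field, and method name is hypothetical rather than defined by the reference model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SIP:
    """Submission Information Package, as delivered by a Producer."""
    producer: str
    content: bytes
    description: str


@dataclass(frozen=True)
class AIP:
    """Archival Information Package, as held in Archival Storage."""
    aip_id: str
    content: bytes


@dataclass(frozen=True)
class DIP:
    """Dissemination Information Package, as delivered to a Consumer."""
    aip_id: str
    content: bytes
    description: str


class Archive:
    def __init__(self):
        self.archival_storage = {}  # aip_id -> AIP
        self.data_management = {}   # aip_id -> descriptive information

    def ingest(self, sip):
        # Ingest: turn the SIP into an AIP and register its descriptive info.
        aip_id = f"aip-{len(self.archival_storage) + 1}"
        self.archival_storage[aip_id] = AIP(aip_id, sip.content)
        self.data_management[aip_id] = sip.description
        return aip_id

    def query(self, term):
        # Access: a Consumer query is answered from Data Management (result set).
        return [
            aip_id
            for aip_id, description in self.data_management.items()
            if term.lower() in description.lower()
        ]

    def order(self, aip_id):
        # Access: a Consumer order is fulfilled by building a DIP from the AIP.
        aip = self.archival_storage[aip_id]
        return DIP(aip_id, aip.content, self.data_management[aip_id])
```

A typical use is `aip_id = archive.ingest(sip)`, then `archive.query(...)` to find matching packages, then `archive.order(aip_id)` to obtain the DIP.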
DSpace
• DSpace: “DSpace is a groundbreaking digital repository system that captures, stores, indexes, preserves, and redistributes an organization's research data [...] the DSpace software platform serves a variety of digital archiving needs.”
• Open source software
• Example uses:
  • American Museum of Natural History Research Library
  • Chapel Hill, SILS, Theses & Dissertations
  • University of Cambridge - Academic & related content
  • Edinburgh Research Archive (ERA)
EPrints
• EPrints: “GNU EPrints is generic archive software under development by the University of Southampton. It is intended to create a highly configurable web-based archive.”
• Open source software
• Example uses:
  • Southampton Crystal Structure Report Archive
  • Central Connecticut State University Digital Archive
  • Central European University - Preprint Archive
  • Curtin Institute of Technology Institutional Repository
  • DLIST - Digital Library of Information Science & Technology
Fedora
• Fedora: “Open source software that gives organisations a flexible service-oriented architecture for managing and delivering their digital content.”
• Open source software
• Example uses:
  • Digital Case, Case Western Reserve University's electronic repository and archive: stores, disseminates, and preserves faculty research in digital formats (both born digital and digitised)
  • University of Queensland eSpace - research digital repository with published articles and conference papers, book chapters, theses and other forms of written research
Others
• Other systems such as Digital Commons institutional repository service
• Other, custom-built systems
  • NARA Electronic Records Archives (ERA) project
  • UK National Archives
  • Public Record Office, Victoria
  • KB eDepot, Netherlands
  • Several other large bodies whose archive pre-dates development of aforementioned repository software
• Commercial systems
In conclusion
• There is much in common between digital archives, libraries, and repositories
• Intention and subsequent functionality is the key to defining digital storage systems
• Digital Archives offer a framework for maintaining & preserving the authenticity and integrity of records over time
• Several software solutions are available
• Development is ongoing
• Need technical know-how to implement
• There is still a lot of work to do...
Thank you.
Questions?
Maureen Pennock
Join the DCC Associates Network at http://www.dcc.ac.uk