
Digital Preservation. The Past is Prologue Developing Preservation Approaches


Page 1: Digital Preservation. The Past is Prologue Developing Preservation Approaches

Digital Preservation


The Past is Prologue

Developing Preservation Approaches


Diagram by Nancy Y. McGovern based on PhD Research, March 2001


5 Stages of Digital Preservation

1. Digitization leads to understanding that digital content needs to be managed and protected

2. Digital Preservation Projects are initiated

3. Digital Preservation Projects segue into Programs

4. Digital Preservation Programs become comprehensive and coordinated

5. Institutional Programs embrace Inter-institutional Collaboration


Digital Preservation Officer

• First DPO appointed January 2002
  http://www.library.cornell.edu/iris/dpo/

• coordinates digital preservation policy development and implementation

• serves as the liaison to digital preservation initiatives and projects

• develops a conceptual framework for a cohesive digital preservation program


Models and Standards

• Attributes of a Trusted Digital Repository (RLG-OCLC)

http://www.rlg.org/longterm/attributes01.pdf

• OAIS Reference Model (CCSDS)

http://www.ccsds.org/documents/pdf/CCSDS-650.0-R-2.pdf


Models and Standards

• SIP Transfer Issues:
  • Producer-Archive Interface Methodology Abstract Standard (CCSDS)
    http://ssdoo.gsfc.nasa.gov/nost/isoas/CCSDS-651.0-W-1.pdf

• AIP Components (OCLC/RLG PMWG):
  • Content Information
  • Preservation Description Information
    http://www.oclc.org/research/pmwg/

• Format Issues:
  • Draft Standard - Data Dictionary - Technical Metadata for Digital Still Images (NISO)
    http://www.niso.org/committees/committee_au.html


Attributes of a Trusted Repository


1. Administrative responsibility

• Provide evidence of fundamental commitment to standards, best practices

• Commit to OAIS model

• Meet standards on environment (6)

• Share measurements with depositors (6)

• Involve external community experts in validating/certifying practices (6)

• Commit to transparency and accountability (6)


2. Organizational viability

• Demonstrate viability and trustworthiness (3)

• Reflect commitment to long-term retention/management in mission statements

• Have appropriate legal status, staff and professional development (1)(3)

• Establish transparent business practices, effective management policies (6)(3)

• Define inclusive agreements with depositors (6)

• Review/maintain policies and procedures (6)

• Undertake risk management, contingency and succession (trusted inheritors) planning (6)(3)


3. Financial sustainability

• Establish/maintain good business practices and an auditable business plan (1)(2)

• Demonstrate financial fitness and ongoing financial commitment (1)(2)

• Balance risk, benefit, investment, expenditure

• Maintain adequate budget and reserves and actively seek potential funding sources


4. Technological suitability

• Consider/adopt appropriate preservation strategies (6)

• Ensure appropriate infrastructure for acquisition, storage, access (5)

• Establish technology management policy for repository (2)(3)

• Comply with relevant standards and best practices, adequate expertise (6)

• Undergo regular external audits on system components and performance (6)


5. System security

• Assure security of systems for digital assets (3)

• Establish policies and procedures to meet requirements (4)(6)

• Stress processes that will detect, avoid and repair loss, document and notify of changes and resulting actions (4)(6)
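One common way to detect loss or change is a fixity audit: record a checksum for each object at ingest and compare it on a schedule. The sketch below is only an illustration of that idea, not a procedure described in the slides; the function names, the choice of SHA-256, and the in-memory dictionaries standing in for a repository are all assumptions.

```python
import hashlib


def fixity_digest(data: bytes) -> str:
    """Return a SHA-256 hex digest used here as the fixity value (assumed choice)."""
    return hashlib.sha256(data).hexdigest()


def audit(recorded: dict, current: dict) -> dict:
    """Compare recorded digests with the objects currently held.

    recorded maps object name -> digest stored at ingest.
    current maps object name -> bytes found in storage now.
    Returns a status per object: ok, changed, missing, or unrecorded.
    """
    report = {}
    for name, digest in recorded.items():
        if name not in current:
            report[name] = "missing"
        elif fixity_digest(current[name]) == digest:
            report[name] = "ok"
        else:
            report[name] = "changed"
    # Objects present in storage but never registered also deserve a notice.
    for name in current:
        if name not in recorded:
            report[name] = "unrecorded"
    return report
```

Any status other than "ok" would be the trigger for the repair, documentation, and notification steps named in the bullet above.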


6. Procedural accountability

• Enact policies and procedures for tasks and functions, document practices (1)(2)

• Establish monitoring mechanisms to ensure continued operation of systems and procedures (4)(5)

• Record/justify preservation strategies (1)(2)

• Set up feedback mechanisms for problem resolution; negotiate evolving requirements between providers and consumers (1)(2)


Framework Components

• Administrative Responsibility

• Organizational Viability

• Financial Sustainability

• Technological Suitability

• System Security

• Procedural Accountability


Diagram by Nancy Y. McGovern based upon the RLG-OCLC Attributes of a Trusted Repository


Open Archival Information System (OAIS)


Framework to Model


Overview of the OAIS Model

from Reference Model for an Open Archival Information System [4]


OAIS Categories

• [Data Object]

• Representation Information (Structure, Semantic, and Other Information)

• Content Information [1] (Data Object + Representation Information)

• Preservation Description Information [2] (Reference, Context, Provenance and Fixity Information)

• Descriptive Information (Content Information + PDI)

• Packaging Information [physically and logically binds]
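The categories above nest inside one another, which a few data classes can make concrete. This is a rough sketch under stated assumptions: the class and field names are hypothetical, the SHA-256 fixity value is an illustrative choice, and the `descriptive_information` method simply follows the slide's summary of Descriptive Information as drawn from Content Information and PDI. It is not a schema taken from the OAIS document.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class RepresentationInformation:
    structure: str   # e.g. a file format name
    semantics: str   # what the encoded data means
    other: str = ""


@dataclass
class ContentInformation:
    # Data Object + Representation Information
    data_object: bytes
    representation: RepresentationInformation


@dataclass
class PreservationDescriptionInformation:
    # Reference, Context, Provenance and Fixity Information
    reference: str
    context: str
    provenance: list = field(default_factory=list)
    fixity: str = ""


@dataclass
class InformationPackage:
    content: ContentInformation
    pdi: PreservationDescriptionInformation
    packaging: str  # how the parts are physically and logically bound

    def descriptive_information(self) -> dict:
        """Derive a small finding-aid record from content and PDI (assumed fields)."""
        return {
            "identifier": self.pdi.reference,
            "format": self.content.representation.structure,
            "size_bytes": len(self.content.data_object),
        }


def make_package(data, identifier, structure, semantics, context):
    """Assemble a package, computing the fixity value at creation time."""
    content = ContentInformation(data, RepresentationInformation(structure, semantics))
    pdi = PreservationDescriptionInformation(
        reference=identifier,
        context=context,
        provenance=["ingested"],
        fixity=hashlib.sha256(data).hexdigest(),
    )
    return InformationPackage(content, pdi, packaging="single in-memory object")
```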


OAIS at Cornell


Preserving Essential Elements

• Content

• Context

• Structure

• Appearance

• Behavior


Emulation

• Jeff Rothenberg

• Dutch National Library

• IBM

• CAMiLEON Project

• David Bearman


Migration

• Risk Management of Digital Information: A File Format Investigation

• Charles Dollar

• Margaret Hedstrom

• CAMiLEON Project

• Dutch Testbed Project


XML and Object-Based

• NARA and SDSC

• Dutch Testbed Project

• Victoria Electronic Records Project (VERS)

• Harvard SIP proposal


Project Prism

CUL Research Team:

Anne R. Kenney

Nancy Y. McGovern

Peter Botticelli

Richard Entlich


Risk Management Stages

Typical Stages               Prism Stages

1. Risk identification       1. Data gathering
2. Risk classification          Characterization
3. Risk assessment           2. Simple risk declaration
4. Risk analysis             3. Contextualized declaration/detection
5. Program implementation    4. Automated enforcement


Levels of Context

• Web page
  • as a stand-alone object, ignoring its hyperlinks
  • in local context, considering the links into it and out from it

• Web site
  • as a semantically coherent set of linked Web pages
  • as an entity in a broader technical and organizational context
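Moving from the stand-alone page to its local context starts with resolving a page's links and deciding which stay within the same site. The following is a minimal sketch of that step; the function name is hypothetical, and treating "same host name" as "same site" is a simplifying assumption that real sites often violate.

```python
from urllib.parse import urljoin, urlparse


def classify_links(page_url: str, hrefs: list) -> dict:
    """Split a page's links into intra-site and external, by host name.

    Relative links are resolved against page_url first. Links that are
    not http or https (mailto:, javascript:, ...) are skipped.
    """
    site_host = urlparse(page_url).netloc.lower()
    result = {"intra_site": [], "external": []}
    for href in hrefs:
        absolute = urljoin(page_url, href)
        parts = urlparse(absolute)
        if parts.scheme not in ("http", "https"):
            continue
        key = "intra_site" if parts.netloc.lower() == site_host else "external"
        result[key].append(absolute)
    return result
```

For example, `classify_links("http://www.example.org/a/index.html", ["b.html", "http://other.example.com/x"])` places the first link in `intra_site` and the second in `external`.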


Page-level Monitoring

• Formatting: TIDY

• Standards compliance

• Document structure

• Metadata:
  • HTTP headers
  • HTML headers

• Changes
  • Content
  • Location

• Links
  • Out-link structure
  • In-link structure
  • Intra-site
  • Hub
  • Volatility

• Page provenance
  • URL parsing

• Log analysis
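Several of these page-level signals (HTML header metadata, out-links, content changes) can be gathered from a single pass over a page's markup. The sketch below uses only the Python standard library and works on an HTML string that has already been fetched; the class and function names are hypothetical, and comparing whole-page digests is a deliberately crude stand-in for change detection, not the approach used in Project Prism.

```python
import hashlib
from html.parser import HTMLParser


class PageProfile(HTMLParser):
    """Collect <meta name=...> values and <a href=...> targets from one page."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.out_links = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        name = attributes.get("name")
        href = attributes.get("href")
        if tag == "meta" and name:
            self.meta[name.lower()] = attributes.get("content") or ""
        elif tag == "a" and href:
            self.out_links.append(href)


def profile_page(html: str) -> dict:
    """Return metadata, out-links, and a content digest for one snapshot."""
    parser = PageProfile()
    parser.feed(html)
    parser.close()
    return {
        "meta": parser.meta,
        "out_links": parser.out_links,
        "digest": hashlib.sha256(html.encode("utf-8")).hexdigest(),
    }


def content_changed(previous: dict, current: dict) -> bool:
    """True when two snapshots of the same page differ in any byte."""
    return previous["digest"] != current["digest"]
```

Storing one profile per visit would allow content changes and link volatility to be tracked over time by comparing successive snapshots.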


Site-level Monitoring

• Graph analysis

• Static site analysis and Longitudinal study

• Aggregate page analyses

• Site maintenance indicators

• Backup and archiving policies and procedures

• Hardware and software environment

• Network configuration and maintenance
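Graph analysis of a site can begin with very simple measures once page-level link data has been aggregated. As an illustration only, the sketch below treats a site as a dictionary mapping each crawled page to the pages it links to; the function names and the out-link threshold used to call a page a hub are assumptions, not measures defined in the slides.

```python
def link_degrees(site_graph: dict) -> tuple:
    """Return (in_degree, out_degree) dictionaries, counting each distinct link once."""
    out_degree = {page: len(set(targets)) for page, targets in site_graph.items()}
    in_degree = {page: 0 for page in site_graph}
    for targets in site_graph.values():
        for target in set(targets):
            in_degree[target] = in_degree.get(target, 0) + 1
    return in_degree, out_degree


def hubs(site_graph: dict, min_out_links: int = 3) -> list:
    """Pages with at least min_out_links distinct out-links (threshold is arbitrary)."""
    _, out_degree = link_degrees(site_graph)
    return sorted(page for page, count in out_degree.items() if count >= min_out_links)


def dangling_targets(site_graph: dict) -> list:
    """Link targets that were never crawled as pages, a possible sign of link rot."""
    linked = {target for targets in site_graph.values() for target in targets}
    return sorted(linked - set(site_graph))
```

Running the same measures on successive crawls of a site gives a basic form of the longitudinal comparison listed above.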


Research Plan

• Preservation Risk Management for Web Resources: Virtual Remote Control in Cornell’s Project Prism

By Anne R. Kenney, Nancy Y. McGovern, Peter Botticelli, Richard Entlich, Carl Lagoze, and Sandra Payette

D-Lib Magazine, January 2002

http://www.dlib.org/dlib/january02/kenney/01kenney.html


Publisher-Based Digital Archives


Subject-Based Digital Archives


Intersection of Digital Archives

Format-based


Relevant Initiatives

• Metadata Encoding and Transmission Standard (METS) http://www.loc.gov/standards/mets/

[highlighted Web site in RLG DigiNews February 2002]

• Flexible and Extensible Digital Object and Repository Architecture (FEDORA)

• Mellon Fedora Project

http://fedora.comm.nsdlib.org

Slides from January 2002 briefing: http://www.cs.cornell.edu/payette/presentations


Relevant External Projects

• NEDLIB
  • http://www.kb.nl/coop/nedlib/

• CAMiLEON (CEDARS)
  • http://www.si.umich.edu/CAMILEON/index.htm
  • http://www.leeds.ac.uk/cedars/

• PANDORA
  • http://pandora.nla.gov.au/index.html

• Harvard University LDI
  • http://hul.harvard.edu/ldi/

• NARA & SDSC
  • http://www.nara.gov/era/