Long-Term Preservation

Long-Term Preservation. Technical Approaches to Long-Term Preservation the challenge is to interpret formats a similar development: sound carriers From

Long-Term Preservation

Technical Approaches to Long-Term Preservation

• the challenge is to interpret formats
• a similar development: sound carriers
• from phonograph to MP3

• Those that do not keep up with this development will soon lack support:
– new audio documents are only produced in current formats
– spare parts for out-of-date equipment are hard to come by

• Technical approaches to long-term preservation of digital documents fall into two categories:
– those that aim to preserve the original state of documents along with systems that are suitable for rendering the documents in their original format
– those that aim to continually transform digital documents into the formats of state-of-the-art rendition systems and at the same time to retain their original “look and feel”

Migration

• advantages:
– well known
– documents available all the time
– possibly improved quality

• disadvantages:
– reduced authenticity
– hard to automate
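
A single migration step can be sketched as follows. This is only an illustration under stated assumptions: the outdated format is taken to be Latin-1 encoded text and the current format UTF-8, and `migrate_text` and `MigrationRecord` are hypothetical names, not part of any preservation tool. The checksums are recorded so that the change to the original bytes, one aspect of the reduced authenticity listed above, is at least documented.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class MigrationRecord:
    """Hypothetical log entry describing one migration step."""
    source_format: str
    target_format: str
    source_sha256: str  # checksum of the bytes before migration
    target_sha256: str  # checksum of the bytes after migration


def migrate_text(data: bytes,
                 source_encoding: str = "latin-1",
                 target_encoding: str = "utf-8"):
    """Re-encode a text document and describe the step that was taken."""
    text = data.decode(source_encoding)
    migrated = text.encode(target_encoding)
    record = MigrationRecord(
        source_format=source_encoding,
        target_format=target_encoding,
        source_sha256=hashlib.sha256(data).hexdigest(),
        target_sha256=hashlib.sha256(migrated).hexdigest(),
    )
    return migrated, record


# Example: a Latin-1 document is migrated to UTF-8.
original = "Universität".encode("latin-1")
migrated, record = migrate_text(original)
```

The text content survives the step, but the stored bytes differ, which is why the record keeps both checksums.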

Hardware Museums

• The mission of a hardware museum is to collect (and keep operational) all relevant computing systems so that future generations may view our documents in their original environments.

• Hardware museums are not feasible in practice:
– too many items
– additional software and hardware required
– hard to maintain

Hardware museum at the Universität der Bundeswehr München, Germany

Emulation

• Emulators allow the function of processors and other hardware components to be simulated by software.

• When using emulation, the following items have to be preserved for each digital document (using, e.g., migration):
– the character stream and the metadata
– a specification of the hardware that can be interpreted by the emulator
– the complete software of the rendition system (in the form of binary data streams)

• If interested persons would like to access a document conserved in this way, say, 100 years from now, they would have to proceed as follows:

1. Create an emulator
– Load the hardware specification into an emulator to obtain a software implementation which is functionally equivalent to the original hardware.

2. Install software
– On the emulated computer, install the systems software and the application programs needed for rendering the document.

3. Render documents
– Load the character stream of the digital document into the emulated computer and start the rendition software to access the document.
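
The three access steps can be illustrated with a deliberately tiny toy model. Everything in it is an assumption made for illustration: the `Emulator` class, the opcode names, and the three host primitives are hypothetical, and a real emulator interprets a full processor specification rather than a three-entry table.

```python
# Primitive operations that the (hypothetical) host system can perform.
def _decode(state):
    state["text"] = state["stream"].decode("ascii")


def _upper(state):
    state["text"] = state["text"].upper()


def _emit(state):
    state["output"].append(state["text"])


PRIMITIVES = {"decode": _decode, "upper": _upper, "emit": _emit}


class Emulator:
    def __init__(self, hardware_spec):
        # Step 1: interpret the preserved hardware specification, here a
        # mapping from the old machine's opcodes to host primitives.
        self.instruction_set = {
            opcode: PRIMITIVES[name] for opcode, name in hardware_spec.items()
        }
        self.program = []

    def install(self, software):
        # Step 2: install the preserved rendition software (a list of opcodes).
        self.program = list(software)

    def render(self, character_stream):
        # Step 3: load the character stream and run the rendition software.
        state = {"stream": character_stream, "text": "", "output": []}
        for opcode in self.program:
            self.instruction_set[opcode](state)
        return "".join(state["output"])


# The three preserved items for one document (toy values).
hardware_spec = {"0x01": "decode", "0x02": "upper", "0x03": "emit"}
software = ["0x01", "0x02", "0x03"]
character_stream = b"hello archive"

emulator = Emulator(hardware_spec)
emulator.install(software)
rendered = emulator.render(character_stream)
```

The point of the sketch is the separation of concerns: the character stream is never converted, and the same emulator instance could render any other document that uses the same rendition software.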

• advantages of emulation:
– relatively small cost per document
– cost proportional to actual use
– one emulator suffices for many documents
– high authenticity

• Whenever an old format becomes obsolete (while new ones become popular), new conversion techniques and tools have to be developed that achieve the required transformation.

Standard Formats

• costs proportional to number of formats
• standards for simple character sequences and for complex document types

Legal and Social Concerns

• Long-term preservation of digital documents involves legal and social concerns:
1. “Digital Rights Management” (DRM) and copy protection
2. reserved software rights
3. Should hardware manufacturers provide emulators?
4. criteria for selection
5. costs as a limiting factor
6. make costs affordable
7. balance of interests between shareholders

OAIS Models

• Open Archival Information System Reference Model

• an ISO standard on the long-term preservation of digital documents.

• two complementary points of view: an information model and a process model

The Information Model

• Data Object and Information Object
• The knowledge which is required to understand data is called the Knowledge Base.
• In order to understand the data, one needs additional information.
• Example: along with the source code of a Java program, a book about the programming language Java must be available (Representation Information).

• The Content Information is the information object proper which contains all the information necessary to interpret data

• Preservation Description Information (PDI) denotes all the information required to suitably preserve the corresponding Content Information.

• Content Information and PDI are combined into one logical entity, the Information Package.

• Packaging Information specifies how Content Information and PDI are actually related to each other, e.g., by describing the directory structure of a CD-ROM.

• Descriptive Information yields information about the content of the Information Package and thus allows the Information Package to be found in the archive.
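
The relationships between these parts of the information model can be sketched as plain data structures. This is a minimal illustration, not a schema taken from the OAIS standard: the class and field names, the dictionary keys, and the `find` helper are all assumptions chosen for readability.

```python
from dataclasses import dataclass


@dataclass
class ContentInformation:
    data_object: bytes                # the raw data
    representation_information: str   # what is needed to interpret the data


@dataclass
class InformationPackage:
    content_information: ContentInformation
    preservation_description_information: dict  # PDI, free-form in this sketch
    packaging_information: str        # how content and PDI are laid out


@dataclass
class CatalogEntry:
    package: InformationPackage
    descriptive_information: dict     # used to find the package in the archive


def find(catalog, keyword):
    """Return all packages whose (hypothetical) title field mentions the keyword."""
    keyword = keyword.lower()
    return [
        entry.package
        for entry in catalog
        if keyword in entry.descriptive_information.get("title", "").lower()
    ]


# Example based on the Java program mentioned above.
package = InformationPackage(
    content_information=ContentInformation(
        data_object=b"public class Hello {}",
        representation_information="A book about the programming language Java",
    ),
    preservation_description_information={"provenance": "example producer"},
    packaging_information="content in /data, PDI in /pdi on the CD-ROM",
)
catalog = [CatalogEntry(package, {"title": "Hello program in Java"})]
hits = find(catalog, "java")
```

Keeping the Descriptive Information outside the package mirrors its role as a finding aid: a search touches only the catalog entries, not the preserved content.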

Modeling Context and Processes

• In order to define the processes that are going on in the archive in more detail, the OAIS Reference Model starts by considering the context of the archive.

• An archive’s purpose is to maintain documents, which are submitted to it and which are to be made available to future users.

• Producers, i.e., authors, institutions, etc. that deliver documents to the archive.

• Management defines the specific purpose of the archive, e.g., which documents are to be collected and which are not.

• The OAIS Reference Model differentiates three kinds of Information Packages in their relation to the environment of the archive:
– Submission Information Packages (SIP) are sent to the archive by Producers
– Archival Information Packages (AIP) are preserved in the archive
– Dissemination Information Packages (DIP) are passed from the archive to Consumers

• The Ingest process receives an SIP from the Producer and prepares it for storage and administration within the archive.

• SIPs must be transformed into AIPs, and Descriptive Information corresponding to the AIPs has to be created.

• The AIP is passed on to the Archival Storage process, and the corresponding Descriptive Information to the Data Management process.
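
The Ingest flow described above can be sketched in a few lines. This is a simplified illustration under several assumptions: `SIP`, `AIP`, and `ingest` are hypothetical names, Archival Storage and Data Management are modeled as plain dictionaries, and the SHA-256 checksum stored in the PDI is just one plausible piece of preservation information.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class SIP:
    """Submission Information Package as delivered by a Producer (simplified)."""
    producer: str
    title: str
    payload: bytes


@dataclass
class AIP:
    """Information package as preserved in the archive (simplified)."""
    identifier: str
    payload: bytes
    pdi: dict


def ingest(sip, archival_storage, data_management):
    """Transform an SIP into an AIP and hand both results to their processes."""
    digest = hashlib.sha256(sip.payload).hexdigest()
    identifier = "aip-" + digest[:12]  # hypothetical identifier scheme

    aip = AIP(
        identifier=identifier,
        payload=sip.payload,
        pdi={"provenance": sip.producer, "fixity_sha256": digest},
    )
    descriptive_information = {"title": sip.title, "producer": sip.producer}

    archival_storage[identifier] = aip                     # to Archival Storage
    data_management[identifier] = descriptive_information  # to Data Management
    return identifier


# Example: one submission passes through Ingest.
storage, catalog = {}, {}
sip = SIP(producer="Example Press", title="Annual Report", payload=b"report text")
aip_id = ingest(sip, storage, catalog)
```

The two dictionaries share only the identifier, which reflects the split in the model: the stored package and its searchable description are managed by different processes.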

• The Data Management process manages the Descriptive Information and also the data that are necessary to run the system.

• The Administration process handles routine work in the archive, e.g., it negotiates with Producers the prerequisites for sending documents to the archive.

DSEP Model

• Deposit System for Electronic Publications
• The business routine of a library can be subdivided into four domains:
– acquisition of stock
– capturing metadata
– preservation and maintenance
– providing access

• The process Delivery & Capture transforms documents into SIPs conforming to the DSEP standards.

• The process Packaging & Delivery unpacks the DIP and transforms it into a format that can be used by the library system.
