A Daunting PREMIS: Implementing Preservation Metadata within the METS Framework Jerome P. McDonough Graduate School of Library & Information Science, UIUC ICDAT 2006 Inst. of Information Science, Academia Sinica October 19, 2006


A Daunting PREMIS: Implementing Preservation Metadata within the METS Framework

Jerome P. McDonough
Graduate School of Library & Information Science, UIUC

ICDAT 2006
Inst. of Information Science, Academia Sinica
October 19, 2006


One Great Loss for Mankind


Source: Sarkissian, John M. (21 May 2006). The Search for the Apollo 11 SSTV Tapes. Parkes, Australia: CSIRO Parkes Observatory. http://www.honeysucklecreek.net.nyud.net:8080/Apollo_11/tapes/Search_for_SSTV_Tapes.pdf


“Houston, we’ve had a problem here.”

Loss of data due to format conversions

Need to ensure viable access to playback devices for media

Inadequacy of traditional archival practice for ensuring item-level access to media

Need for detailed event history to document life-cycle/provenance of information


History of PREMIS

OCLC/RLG Preservation Metadata Framework Working Group (2001-2002)

“…to define the concept of preservation metadata…and evaluate the prospects for a community-wide, consensus-building activity….”

Final Report: Preservation Metadata for Digital Objects: A Review of the State of the Art

“…to develop a framework outlining the types of information -- i.e., metadata -- that should be associated with an archived digital object.”

Final Report: A Metadata Framework to Support the Preservation of Digital Objects


History of PREMIS

PREservation Metadata: Implementation Strategies [PREMIS] (2003-2005)

“Develop a core preservation metadata set, supported by a data dictionary, with broad applicability across the digital preservation community.”

“Identify and evaluate alternative strategies for encoding, storing, and managing preservation metadata in digital preservation systems.”

Final Report: Data Dictionary for Preservation Metadata: Final Report of the PREMIS Working Group

PREMIS Maintenance Activity at Library of Congress, including XML Schema


PREMIS Data Model


PREMIS Data Dictionary: Object

Object Identifier
Preservation Level
Object Category
Object Characteristics
Creating Application
Original Name
Storage
Environment
Signature Information
Relationship
Linking Event Identifier
Linking Intellectual Entity Identifier
Linking Permission Statement Identifier

An Object can be associated with one or more Rights statements, can participate in one or more Events, and can be related to one or more Agents


PREMIS Data Dictionary: Event

Event Identifier
Event Type
Event Date & Time
Event Detail
Event Outcome
Linking Agent Identifier
Linking Object Identifier

An Event must be related to one or more objects, and can be related to one or more Agents.


PREMIS Data Dictionary: Agent

Agent Identifier
Agent Name
Agent Type

An Agent may hold or grant one or more rights, may carry out, authorize, or compel one or more events, and may create or act upon one or more objects.


PREMIS Data Dictionary: Rights

Permission Statement Identifier
Granting Agreement
Permission Granted
Linking Object
Granting Agent


<premis:object>
  <premis:objectIdentifier>
    <premis:objectIdentifierType>hdl</premis:objectIdentifierType>
    <premis:objectIdentifierValue>loc.music/gottlieb.09611</premis:objectIdentifierValue>
  </premis:objectIdentifier>
  <premis:objectCategory>file</premis:objectCategory>
  <premis:objectCharacteristics>
    <premis:fixity>
      <premis:messageDigestAlgorithm>MD5</premis:messageDigestAlgorithm>
      <premis:messageDigest>36b0319…..</premis:messageDigest>
      <premis:messageDigestOriginator>LocalDCMS</premis:messageDigestOriginator>
    </premis:fixity>
    <premis:size>20800896</premis:size>
    <premis:format>
      <premis:formatDesignation>
        <premis:formatName>image/tiff</premis:formatName>
        <premis:formatVersion></premis:formatVersion>
      </premis:formatDesignation>
    </premis:format>
  </premis:objectCharacteristics>
</premis:object>


<premis:event>
  <premis:eventIdentifier>
    <premis:eventIdentifierType>LocalRepository</premis:eventIdentifierType>
    <premis:eventIdentifierValue>e001</premis:eventIdentifierValue>
  </premis:eventIdentifier>
  <premis:eventType>ingestion</premis:eventType>
  <premis:eventDateTime>2006-06-06T00:00:00.001</premis:eventDateTime>
  <premis:linkingAgentIdentifier>
    <premis:linkingAgentIdentifierType>AgentID</premis:linkingAgentIdentifierType>
    <premis:linkingAgentIdentifierValue>na12345</premis:linkingAgentIdentifierValue>
  </premis:linkingAgentIdentifier>
</premis:event>
<premis:agent>
  <premis:agentIdentifier>
    <premis:agentIdentifierType>AgentID</premis:agentIdentifierType>
    <premis:agentIdentifierValue>na12345</premis:agentIdentifierValue>
  </premis:agentIdentifier>
  <premis:agentName>LC Repository</premis:agentName>
  <premis:agentType>organization</premis:agentType>
</premis:agent>
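The event record points at its agent only by identifier type and value, so a consumer has to resolve that link itself. A minimal Python sketch of that lookup follows. The namespace URI, the `<records>` wrapper element, and the function names `index_agents` and `agents_for_events` are assumptions made for illustration; they are not part of the slides or of any PREMIS tooling.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI; substitute the one your PREMIS schema version declares.
PREMIS_NS = "http://www.loc.gov/standards/premis/v1"
NS = {"premis": PREMIS_NS}

# <records> is a hypothetical wrapper so both records parse as one document.
SAMPLE = f"""<records xmlns:premis="{PREMIS_NS}">
  <premis:event>
    <premis:eventIdentifier>
      <premis:eventIdentifierType>LocalRepository</premis:eventIdentifierType>
      <premis:eventIdentifierValue>e001</premis:eventIdentifierValue>
    </premis:eventIdentifier>
    <premis:eventType>ingestion</premis:eventType>
    <premis:linkingAgentIdentifier>
      <premis:linkingAgentIdentifierType>AgentID</premis:linkingAgentIdentifierType>
      <premis:linkingAgentIdentifierValue>na12345</premis:linkingAgentIdentifierValue>
    </premis:linkingAgentIdentifier>
  </premis:event>
  <premis:agent>
    <premis:agentIdentifier>
      <premis:agentIdentifierType>AgentID</premis:agentIdentifierType>
      <premis:agentIdentifierValue>na12345</premis:agentIdentifierValue>
    </premis:agentIdentifier>
    <premis:agentName>LC Repository</premis:agentName>
    <premis:agentType>organization</premis:agentType>
  </premis:agent>
</records>"""


def index_agents(root):
    """Map (identifier type, identifier value) to the agent's name."""
    index = {}
    for agent in root.iter(f"{{{PREMIS_NS}}}agent"):
        key = (
            agent.findtext("premis:agentIdentifier/premis:agentIdentifierType", namespaces=NS),
            agent.findtext("premis:agentIdentifier/premis:agentIdentifierValue", namespaces=NS),
        )
        index[key] = agent.findtext("premis:agentName", namespaces=NS)
    return index


def agents_for_events(root):
    """Map each event identifier value to the names of its linked agents."""
    agents = index_agents(root)
    resolved = {}
    for event in root.iter(f"{{{PREMIS_NS}}}event"):
        event_id = event.findtext(
            "premis:eventIdentifier/premis:eventIdentifierValue", namespaces=NS
        )
        resolved[event_id] = [
            # A link with no matching agent record resolves to None.
            agents.get((
                link.findtext("premis:linkingAgentIdentifierType", namespaces=NS),
                link.findtext("premis:linkingAgentIdentifierValue", namespaces=NS),
            ))
            for link in event.findall("premis:linkingAgentIdentifier", NS)
        ]
    return resolved
```

Calling `agents_for_events(ET.fromstring(SAMPLE))` pairs event `e001` with the agent whose identifier is `na12345`; a dangling link shows up as `None` rather than raising, which makes broken references easy to report.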


Overview of METS

Digital Library Federation initiative launched in 2001 as successor to the Making of America II project

Goal: Create a single document format for encoding digital library objects which can fulfill the roles of SIP, AIP and DIP within the OAIS reference model

Scope limited to objects composed of text, image, audio and video files (or a combination thereof)

METS Maintenance Activity at Library of Congress, including XML Schema


METS Framework

METS Document

Header

Descriptive MD

Admin. MD

File Section

Link Structure

Structural Map

Behaviors


METS Structure

Object modeled as a tree (e.g., a movie is composed of scenes, which are composed of one or more shots)

Every node in tree structure can be associated with content files and descriptive & administrative metadata

Every content file can be associated with descriptive & administrative metadata
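The tree described above can be sketched as nested METS `div` elements inside a `structMap`, with `fptr` children pointing at content files and the `DMDID`/`ADMID` attributes pointing at metadata sections. The Python below is an illustrative sketch only: the helper names `div` and `build_movie_struct_map`, and every ID and label in it, are hypothetical.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def div(div_type, label, file_ids=(), dmd_id=None, adm_id=None, children=()):
    """Build one node of the structural map."""
    attrs = {"TYPE": div_type, "LABEL": label}
    if dmd_id:
        attrs["DMDID"] = dmd_id  # link to a descriptive metadata section
    if adm_id:
        attrs["ADMID"] = adm_id  # link to an administrative metadata section
    node = ET.Element(f"{{{METS_NS}}}div", attrs)
    # File pointers come before nested divs in the element content.
    for file_id in file_ids:
        ET.SubElement(node, f"{{{METS_NS}}}fptr", {"FILEID": file_id})
    node.extend(children)
    return node


def build_movie_struct_map():
    """Movie -> scene -> shots, using made-up IDs for illustration."""
    shots = [
        div("shot", "Shot 1", file_ids=["FILE-SHOT-1"], adm_id="AMD-SHOT-1"),
        div("shot", "Shot 2", file_ids=["FILE-SHOT-2"], adm_id="AMD-SHOT-2"),
    ]
    scene = div("scene", "Scene 1", dmd_id="DMD-SCENE-1", children=shots)
    movie = div("movie", "Example movie", dmd_id="DMD-MOVIE", children=[scene])
    struct_map = ET.Element(f"{{{METS_NS}}}structMap", {"TYPE": "logical"})
    struct_map.append(movie)
    return struct_map


if __name__ == "__main__":
    print(ET.tostring(build_movie_struct_map(), encoding="unicode"))
```

Metadata can attach at any level here: the movie and the scene carry descriptive metadata references, while each shot carries an administrative one alongside its file pointer.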


METS Administrative Metadata

4 Types: Technical, Rights, Source Document, Digital Provenance

Non-prescriptive/Multiple instances

May be internal (XML or binary) or external (XLink) to METS document

Internal XML reliant on extension schema (e.g., PREMIS) for support
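As a rough illustration of the internal XML case, the Python sketch below wraps a PREMIS event inside a METS digital provenance section using `amdSec`, `digiprovMD`, `mdWrap` and `xmlData`. The PREMIS namespace URI, the section ID, and the function names `make_event` and `wrap_premis_event` are assumptions; the `MDTYPE` value should be checked against the METS schema version in use.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
# Assumed namespace URI; substitute the one your PREMIS schema version declares.
PREMIS_NS = "http://www.loc.gov/standards/premis/v1"

ET.register_namespace("mets", METS_NS)
ET.register_namespace("premis", PREMIS_NS)


def make_event(event_type):
    """Build a bare-bones PREMIS event carrying only an event type."""
    event = ET.Element(f"{{{PREMIS_NS}}}event")
    ET.SubElement(event, f"{{{PREMIS_NS}}}eventType").text = event_type
    return event


def wrap_premis_event(event, section_id):
    """Embed a PREMIS event as internal XML in a digiprovMD section."""
    amd_sec = ET.Element(f"{{{METS_NS}}}amdSec")
    digiprov = ET.SubElement(amd_sec, f"{{{METS_NS}}}digiprovMD", {"ID": section_id})
    md_wrap = ET.SubElement(digiprov, f"{{{METS_NS}}}mdWrap", {"MDTYPE": "PREMIS"})
    xml_data = ET.SubElement(md_wrap, f"{{{METS_NS}}}xmlData")
    xml_data.append(event)
    return amd_sec


if __name__ == "__main__":
    section = wrap_premis_event(make_event("ingestion"), "DIGIPROV-1")
    print(ET.tostring(section, encoding="unicode"))
```

A file or structural map node would then point at this section through its `ADMID` attribute. METS itself only carries the wrapper; validating what sits inside `xmlData` is left to the extension schema.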


METS & PREMIS


OAIS Information Package


On-going Issues

Architecting objects for performance, or the Metadata that Ate Cincinnati

Organizing successful & complete representation networks

Enabling trustworthy metadata

Supporting ‘non-generic’ Event, Rights & Agent metadata

Creating metrics & methods for evaluating digital preservation activities


The Metadata That Ate Cincinnati

Add a 300-page digitized book with TIFF page images, a TEI encoding and a METS wrapper to your repository:

302 PREMIS Object Records, 302 Other Technical Metadata Records, 1 Descriptive Metadata Record, 1 Rights Record, 1 PREMIS Event Record (Ingest), 1 PREMIS Agent Record (Ingesting Agent), 302 PREMIS Event Records (JHOVE Validation)

Migrate TIFF to JPEG2000:

Add 300 PREMIS Event Records, 300 Additional Event Detail Records, 1 PREMIS Agent Record, 300 PREMIS Object Records, 300 Technical Metadata Records

Run Fixity Check on Content Files:

Add 302 PREMIS Event Records, 1 PREMIS Agent Record

Continue ad infinitum….
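The growth in the figures above can be tallied with a short Python sketch. It assumes that the 302 files are the 300 page images plus the TEI file and the METS file, which is one reading of the slide; the category labels and the function names `record_counts` and `total_records` are made up for illustration.

```python
from collections import Counter


def record_counts(pages=300):
    """Metadata records added by each step, following the slide's figures."""
    files = pages + 2  # assumption: page images + TEI file + METS file
    ingest = Counter({
        "PREMIS object": files,
        "technical metadata": files,
        "descriptive metadata": 1,
        "rights": 1,
        "PREMIS event": 1 + files,  # one ingest event + one validation event per file
        "PREMIS agent": 1,
    })
    migration = Counter({
        "PREMIS event": pages,
        "event detail": pages,
        "PREMIS agent": 1,
        "PREMIS object": pages,
        "technical metadata": pages,
    })
    fixity = Counter({
        "PREMIS event": files,
        "PREMIS agent": 1,
    })
    return {"ingest": ingest, "migration": migration, "fixity": fixity}


def total_records(steps):
    """Sum the per-step counters into one running total."""
    total = Counter()
    for counts in steps.values():
        total += counts
    return total


if __name__ == "__main__":
    steps = record_counts()
    for name, counts in steps.items():
        print(name, sum(counts.values()))
    print("total", sum(total_records(steps).values()))
```

With the slide's numbers, ingest alone produces 910 records, and one migration plus one fixity check brings the total to 2,414 records for a single 300-page book.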


Representation Networks

ISO/IEC 15444-1 + :2004/PDAM 1 (JPEG 2000 + Amendment 1/profiles for Dig. Cinema)
SMPTE 384M (MXF)
W3C XML 1.1
SMPTE 372M
EBU Standard N22 1997
AES3-2003
SMPTE 196E
ISO/IEC 15948:2004 (PNG)
Unicode version 4.0.01
SMPTE 12M (auxiliary file format)
SMPTE 336M (KLV)
ISO 15706 (ISAN)
SMPTE 330M-2004 (UMID)
ITU-T Recommendation X.509
ISO 3166 (language code)
TIA-442 (RS-422)
IEEE802.3

Partial (first layer) representation network for Digital Cinema System Specification


Trustworthy Metadata

Metadata from a known (and trusted) source

Metadata that has not experienced unauthorized change

Metadata that is accurate

Metadata that is sufficient for the need

Metadata that is transparent


Generic vs. Specific: Events, Rights & Agents

Event Example -- Migrate SD DTV to HD DTV. You may want to know:

De-interlacing technique (motion-compensated or not, linear or non-linear)

Colorspace conversion (gamma correction, luma equations for source and destination, primary chromaticities and white points for source and destination)

Aspect ratio conversion technique

Similarly, we may want to know more about Rights and Agents than the minimal generic information


Evaluating Digital Preservation Programs

What does it mean to preserve digital content? Does the meaning of “preservation” vary with context?

What metrics should we employ to evaluate the success of a digital preservation program?


Thank you!

Jerome McDonough
Graduate School of Library & Information Science
University of Illinois at Urbana-Champaign
501 E. Daniel Street, MC-493

Champaign, IL [email protected]