Implementation of PREMIS in METS Rebecca Guenther Sr. Networking & Standards Specialist, Library of Congress [email protected] PREMIS Implementation Fair San Francisco, CA October 7, 2009

Implementation of PREMIS in METS

Rebecca Guenther
Sr. Networking & Standards Specialist, Library of Congress
rgue@loc.gov

PREMIS Implementation Fair
San Francisco, CA
October 7, 2009

METS records the (possibly hierarchical) structure of digital objects, the names and locations of the files that comprise those objects, and the associated metadata

A METS document may be a unit of storage (e.g. OAIS AIP) or a transmission format (e.g. OAIS SIP or DIP)

METS is extensible and modular
• METS uses the XML Schema facility for combining vocabularies from different namespaces
• The METS Editorial Board has endorsed PREMIS as an extension schema

Many institutions are trying to use PREMIS within the METS context

Structure of a METS file

OAIS, METS and PREMIS

[Figure: the OAIS information model for an Archival Information Package, annotated with the METS elements and extension schemas that can carry each kind of information.]

OAIS entities: Archival Information Package; Descriptive Information; Content Information; Data Object; Representation Information (Structure, Semantics); Preservation Description Information (Provenance, Reference, Fixity and Context Information); Packaging Information

METS primary schema elements: <METS>, <dmdSec>, <amdSec>, <techMD>, <rightsMD>, <sourceMD>, <digiprovMD>, <fileGrp>, <file>, <structMap>, <mdRef>

Extension schemas: MODS, MARCXML, DC, premis:object, premis:event, premis:rights, metsRights, textMD, MIX

METS extension schemas

“Wrappers” or “sockets” where elements from other schemas can be plugged in

Provides extensibility

Uses the XML Schema facility for combining vocabularies from different namespaces

Endorsed extension schemas:
• Descriptive: MODS, DC, MARCXML
• Technical metadata: MIX (image); textMD (text)
• Preservation related: PREMIS
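The "socket" idea can be illustrated with a small sketch. The Python below is not part of the guidelines: it parses a hypothetical, heavily trimmed METS record and reports, for each mdWrap, the declared MDTYPE and the namespace of the element plugged into xmlData. The sample record, the function name wrapped_vocabularies, and the PREMIS 2.x namespace URI (info:lc/xmlns/premis-v2) are assumptions chosen for the example.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"

# Hypothetical record, trimmed for illustration (not a complete, validated METS file).
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
    xmlns:mods="http://www.loc.gov/mods/v3"
    xmlns:premis="info:lc/xmlns/premis-v2">
  <mets:dmdSec ID="DMD1">
    <mets:mdWrap MDTYPE="MODS">
      <mets:xmlData>
        <mods:mods><mods:titleInfo><mods:title>Sample</mods:title></mods:titleInfo></mods:mods>
      </mets:xmlData>
    </mets:mdWrap>
  </mets:dmdSec>
  <mets:amdSec>
    <mets:techMD ID="TMD1PREMIS">
      <mets:mdWrap MDTYPE="PREMIS">
        <mets:xmlData>
          <premis:object/>
        </mets:xmlData>
      </mets:mdWrap>
    </mets:techMD>
  </mets:amdSec>
  <mets:structMap><mets:div/></mets:structMap>
</mets:mets>"""


def wrapped_vocabularies(mets_xml):
    """Return (MDTYPE, namespace URI) for every element wrapped in an mdWrap/xmlData."""
    root = ET.fromstring(mets_xml)
    found = []
    for wrap in root.iter(f"{{{METS_NS}}}mdWrap"):
        xml_data = wrap.find(f"{{{METS_NS}}}xmlData")
        if xml_data is None:
            continue  # binData or an empty wrapper: nothing to inspect
        for child in xml_data:
            # ElementTree spells qualified names as "{namespace}local".
            ns = child.tag[1:].split("}")[0] if child.tag.startswith("{") else ""
            found.append((wrap.get("MDTYPE"), ns))
    return found


if __name__ == "__main__":
    for md_type, namespace in wrapped_vocabularies(SAMPLE):
        print(md_type, namespace)
```

The METS elements only provide the wrapper; the vocabulary inside xmlData is identified by its own namespace, which is what lets MODS, MIX, textMD or PREMIS be plugged in side by side.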

Why do we need guidelines for using PREMIS with METS?

Contents of each information package may vary depending on its function within a repository

Need to determine how to include representation metadata and associate it with package components

PREMIS data entities (objects, events, rights, agents) do not map perfectly to METS categories for representation metadata (techMD, digiprovMD, rightsMD, sourceMD)

There are redundant elements between the two standards

Both have extensibility mechanisms

Flexibility of both standards requires implementation choices

Development of Guidelines for Using PREMIS with METS for Exchange

PREMIS in METS Guidelines Working Group
• Consists of PREMIS and METS experts
• Focuses on the METS document as a mechanism of exchange of digital objects and their metadata (SIP or DIP)
• Facilitates communication when internal requirements and technical environments vary

Tension between flexibility and being prescriptive to facilitate interoperability
• Consider usage scenarios
• If a SIP, it may get unwrapped and stored in different structures
• If a DIP, it is converted from internal structures to PREMIS
• A more liberal approach is possible for a SIP than a DIP

Establishing guidelines, a METS profile, and examples
http://www.loc.gov/standards/premis/guidelines-premismets.pdf

Implementation issues in using PREMIS with METS

Location of PREMIS metadata within METS documents

Whether to record elements redundantly if they occur in both PREMIS and METS

Relationship of different structural metadata mechanisms in PREMIS and METS

How to record PREMIS Agent entities in METS documents

Use of identifiers to link elements in PREMIS and METS

How to record elements that are also part of a format-specific technical metadata schema (e.g. MIX)

Some recommendations from Guidelines

METS sections
• Use Object in techMD or digiprovMD
• Use Event in digiprovMD
• Use Rights in rightsMD
• Use Agent in digiprovMD or rightsMD

PREMIS Container -- use only if keeping all PREMIS metadata together. Do not use if separating PREMIS metadata into different amdSec subelements

PREMIS and METS redundancies -- choosing which options to use is an implementation decision that should be documented in the profile, e.g. attributes of the METS <file> element (such as SIZE) versus subelements of <objectCharacteristics> in PREMIS
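The placement recommendations above can be expressed as a simple lookup table. The following Python sketch is an illustration only: it checks which amdSec subelement each PREMIS entity sits in and lists the ones that fall outside the recommended sections. The sample record, the function name misplaced_entities, and the PREMIS 2.x namespace are assumptions; a local profile may legitimately choose differently.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
PREMIS_NS = "info:lc/xmlns/premis-v2"  # assumption: PREMIS 2.x namespace

# Recommended METS section(s) for each PREMIS entity, as listed on the slide.
RECOMMENDED = {
    "object": {"techMD", "digiprovMD"},
    "event": {"digiprovMD"},
    "rights": {"rightsMD"},
    "agent": {"digiprovMD", "rightsMD"},
}

# Hypothetical, trimmed record: TMD2 deliberately holds an event in techMD.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
    xmlns:premis="info:lc/xmlns/premis-v2">
  <mets:amdSec>
    <mets:techMD ID="TMD1">
      <mets:mdWrap MDTYPE="PREMIS"><mets:xmlData><premis:object/></mets:xmlData></mets:mdWrap>
    </mets:techMD>
    <mets:techMD ID="TMD2">
      <mets:mdWrap MDTYPE="PREMIS"><mets:xmlData><premis:event/></mets:xmlData></mets:mdWrap>
    </mets:techMD>
    <mets:rightsMD ID="RMD1">
      <mets:mdWrap MDTYPE="PREMIS"><mets:xmlData><premis:rights/></mets:xmlData></mets:mdWrap>
    </mets:rightsMD>
    <mets:digiprovMD ID="DP1">
      <mets:mdWrap MDTYPE="PREMIS"><mets:xmlData><premis:event/></mets:xmlData></mets:mdWrap>
    </mets:digiprovMD>
  </mets:amdSec>
</mets:mets>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on qualified names."""
    return tag.rsplit("}", 1)[-1]


def misplaced_entities(mets_xml):
    """Return (section ID, PREMIS entity, METS section) for entities outside RECOMMENDED."""
    root = ET.fromstring(mets_xml)
    problems = []
    for amd in root.iter(f"{{{METS_NS}}}amdSec"):
        for section in amd:  # techMD, rightsMD, sourceMD or digiprovMD
            section_name = local_name(section.tag)
            for el in section.iter():
                if not el.tag.startswith(f"{{{PREMIS_NS}}}"):
                    continue
                allowed = RECOMMENDED.get(local_name(el.tag))
                if allowed is not None and section_name not in allowed:
                    problems.append((section.get("ID"), local_name(el.tag), section_name))
    return problems


if __name__ == "__main__":
    print(misplaced_entities(SAMPLE))
```

A check of this kind is only as good as the table behind it, which is one reason the guidelines ask that such decisions be written down in a METS profile.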

Recommendations (cont.)

Structural relationship elements -- use the METS structMap to record structural relationships; use PREMIS relationship elements to record preservation and derivation relationships, and structural relationships if desired

ID/IDREF and PREMIS identifier elements -- use the METS ID/IDREF mechanisms; the usual best practices for ID/IDREF apply

Use PREMIS extensibility mechanism for format specific technical metadata

Document decisions in METS profiles

<fileSec>
  <fileGrp>
    <file ID="FID1" SIZE="184302" ADMID="TMD1PREMIS TMD1MIX DP1EVENT DP1AGENT"
          CHECKSUM="4638bc65c5b9715557d09ad373eefd147382ecbf" CHECKSUMTYPE="SHA-1">
      <FLocat LOCTYPE="OTHER" xlink:href="BXF22.JPG"/>
    </file>
  </fileGrp>
</fileSec>
<techMD ID="TMD1PREMIS">
  <mdWrap MDTYPE="PREMIS">
    <xmlData>
      <premis:object>
        <objectCharacteristics>
          <fixity>
            <messageDigestAlgorithm>SHA-1</messageDigestAlgorithm>
            <messageDigest>4638bc65c5b9715557d09ad373eefd147382ecbf</messageDigest>
            <messageDigestOriginator>EchoDep</messageDigestOriginator>
          </fixity>
          <size>184302</size>
        </objectCharacteristics>
      </premis:object>
    </xmlData>
  </mdWrap>
</techMD>

Elements defined in both METS and PREMIS:
• METS: CHECKSUM, CHECKSUMTYPE
    • attributes of <file>
    • not repeatable
• PREMIS: fixity
    • also includes messageDigestOriginator
    • allows multiples
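When a profile chooses to record these values in both places, the two copies can drift apart. The Python sketch below shows one way such a consistency check might look; it is an illustration under stated assumptions, not a tool referenced by the guidelines. The function name redundancy_mismatches is hypothetical, the sample is a trimmed record reusing the values from the slide, and the PREMIS 2.x namespace is assumed.

```python
import xml.etree.ElementTree as ET

NS = {
    "mets": "http://www.loc.gov/METS/",
    "premis": "info:lc/xmlns/premis-v2",  # assumption: PREMIS 2.x namespace
}

# Hypothetical, trimmed record built from the values shown on the slide.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
    xmlns:premis="info:lc/xmlns/premis-v2"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:amdSec>
    <mets:techMD ID="TMD1PREMIS">
      <mets:mdWrap MDTYPE="PREMIS">
        <mets:xmlData>
          <premis:object>
            <premis:objectCharacteristics>
              <premis:fixity>
                <premis:messageDigestAlgorithm>SHA-1</premis:messageDigestAlgorithm>
                <premis:messageDigest>4638bc65c5b9715557d09ad373eefd147382ecbf</premis:messageDigest>
              </premis:fixity>
              <premis:size>184302</premis:size>
            </premis:objectCharacteristics>
          </premis:object>
        </mets:xmlData>
      </mets:mdWrap>
    </mets:techMD>
  </mets:amdSec>
  <mets:fileSec>
    <mets:fileGrp>
      <mets:file ID="FID1" SIZE="184302" ADMID="TMD1PREMIS"
                 CHECKSUM="4638bc65c5b9715557d09ad373eefd147382ecbf" CHECKSUMTYPE="SHA-1">
        <mets:FLocat LOCTYPE="OTHER" xlink:href="BXF22.JPG"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>"""


def redundancy_mismatches(mets_xml):
    """Return (file ID, attribute) pairs where METS and PREMIS values disagree."""
    root = ET.fromstring(mets_xml)
    by_id = {el.get("ID"): el for el in root.iter() if el.get("ID")}
    problems = []
    for file_el in root.iterfind(".//mets:file", NS):
        for ref in file_el.get("ADMID", "").split():  # ADMID is a list of IDREFs
            section = by_id.get(ref)
            if section is None:
                continue
            # PREMIS allows several fixity blocks, so collect every digest.
            digests = {d.text.strip().lower()
                       for d in section.iterfind(".//premis:messageDigest", NS) if d.text}
            checksum = file_el.get("CHECKSUM")
            if checksum and digests and checksum.lower() not in digests:
                problems.append((file_el.get("ID"), "CHECKSUM"))
            sizes = {s.text.strip()
                     for s in section.iterfind(".//premis:size", NS) if s.text}
            size = file_el.get("SIZE")
            if size and sizes and size not in sizes:
                problems.append((file_el.get("ID"), "SIZE"))
    return problems


if __name__ == "__main__":
    print(redundancy_mismatches(SAMPLE))
```

Because METS allows a single CHECKSUM per file while PREMIS allows multiple fixity blocks, the sketch treats the METS value as consistent if it matches any of the PREMIS digests.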

<fileSec>
  <fileGrp>
    <file ID="FID1" ADMID="TMD1PREMIS DP1EVENT DP1AGENT" MIMETYPE="image/jpeg">
      <FLocat LOCTYPE="OTHER" xlink:href="BXF22.JPG"/>
    </file>
  </fileGrp>
</fileSec>
<techMD ID="TMD1PREMIS">
  <mdWrap MDTYPE="PREMIS">
    <xmlData>
      <premis:object>
        <objectCharacteristics>
          <format>
            <formatDesignation>
              <formatName>image/jpeg</formatName>
              <formatVersion>1.02</formatVersion>
            </formatDesignation>
          </format>
        </objectCharacteristics>
      </premis:object>
    </xmlData>
  </mdWrap>
</techMD>

Elements defined both in METS and PREMIS:
• METS: MIMETYPE
    • attribute of <file>
    • optional
• PREMIS: <format>
    • more granular; includes name and version (although name may be MIMETYPE)
    • mandatory

<fileSec>
  <fileGrp>
    <file ID="FID1" ADMID="TMD1PREMIS TMD1MIX DP1EVENT DP1AGENT">
    </file>
  </fileGrp>
</fileSec>
<techMD ID="TMD1PREMIS">
  <linkingEventIdentifier>
    <linkingEventIdentifierType>ECHODEP Hub Event</linkingEventIdentifierType>
    <linkingEventIdentifierValue>echo12345</linkingEventIdentifierValue>
  </linkingEventIdentifier>
</techMD>
<digiprovMD ID="DP1EVENT">
  <premis:event>
    <eventIdentifier>
      <eventIdentifierType>ECHODEP Hub Event</eventIdentifierType>
      <eventIdentifierValue>echo12345</eventIdentifierValue>
    </eventIdentifier>
    <eventType>ingestion</eventType>
    <eventDateTime>2006-05-02T15:12:53</eventDateTime>
  </premis:event>
</digiprovMD>

Elements defined both in METS and PREMIS:
• METS ID/IDREF: used to associate metadata in different sections and for different files
• PREMIS identifiers: explicit linking between entity types
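The two linking mechanisms can be followed side by side. In the Python sketch below (an illustration, with a hypothetical function name event_links and a trimmed sample that assumes the PREMIS 2.x namespace), the METS route follows the ADMID IDREFs on a file to the sections holding events, while the PREMIS route reads the linkingEventIdentifier recorded on the object. When both are recorded, they should name the same events.

```python
import xml.etree.ElementTree as ET

NS = {
    "mets": "http://www.loc.gov/METS/",
    "premis": "info:lc/xmlns/premis-v2",  # assumption: PREMIS 2.x namespace
}

# Hypothetical, trimmed record built from the identifiers shown on the slide.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
    xmlns:premis="info:lc/xmlns/premis-v2">
  <mets:amdSec>
    <mets:techMD ID="TMD1PREMIS">
      <mets:mdWrap MDTYPE="PREMIS"><mets:xmlData>
        <premis:object>
          <premis:linkingEventIdentifier>
            <premis:linkingEventIdentifierType>ECHODEP Hub Event</premis:linkingEventIdentifierType>
            <premis:linkingEventIdentifierValue>echo12345</premis:linkingEventIdentifierValue>
          </premis:linkingEventIdentifier>
        </premis:object>
      </mets:xmlData></mets:mdWrap>
    </mets:techMD>
    <mets:digiprovMD ID="DP1EVENT">
      <mets:mdWrap MDTYPE="PREMIS"><mets:xmlData>
        <premis:event>
          <premis:eventIdentifier>
            <premis:eventIdentifierType>ECHODEP Hub Event</premis:eventIdentifierType>
            <premis:eventIdentifierValue>echo12345</premis:eventIdentifierValue>
          </premis:eventIdentifier>
          <premis:eventType>ingestion</premis:eventType>
          <premis:eventDateTime>2006-05-02T15:12:53</premis:eventDateTime>
        </premis:event>
      </mets:xmlData></mets:mdWrap>
    </mets:digiprovMD>
  </mets:amdSec>
  <mets:fileSec>
    <mets:fileGrp>
      <mets:file ID="FID1" ADMID="TMD1PREMIS DP1EVENT"/>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>"""


def identifier(el, name):
    """Read a PREMIS (type, value) identifier pair, e.g. name='eventIdentifier'."""
    kind = el.findtext(f"premis:{name}Type", default="", namespaces=NS).strip()
    value = el.findtext(f"premis:{name}Value", default="", namespaces=NS).strip()
    return (kind, value)


def event_links(mets_xml, file_id):
    """Return (events reached via METS ADMID, events named by PREMIS linking)."""
    root = ET.fromstring(mets_xml)
    by_id = {el.get("ID"): el for el in root.iter() if el.get("ID")}
    via_mets, via_premis = set(), set()
    for ref in by_id[file_id].get("ADMID", "").split():
        section = by_id.get(ref)
        if section is None:
            continue  # dangling IDREF; a validating parser would reject this
        for ev in section.iterfind(".//premis:event/premis:eventIdentifier", NS):
            via_mets.add(identifier(ev, "eventIdentifier"))
        for link in section.iterfind(".//premis:linkingEventIdentifier", NS):
            via_premis.add(identifier(link, "linkingEventIdentifier"))
    return via_mets, via_premis


if __name__ == "__main__":
    mets_route, premis_route = event_links(SAMPLE, "FID1")
    print(mets_route == premis_route, mets_route)
```

The METS IDs only have meaning inside one document, whereas the PREMIS identifier pairs survive when the metadata is unwrapped from the package, which is part of why a profile may keep both.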

<structMap TYPE="physical">
  <div ORDER="1" TYPE="text">
    <fptr FILEID="FID9"/>
    <div ORDER="1" TYPE="page" LABEL=" Page [1]">
      <fptr FILEID="FID1"/>
    </div>
    <div ORDER="2" TYPE="page" LABEL=" Page [2]">
      <fptr FILEID="FID2"/>
    </div>
  </div>
</structMap>

<relationship>
  <relationshipType>structural</relationshipType>
  <relationshipSubType>is sibling of</relationshipSubType>
  <relatedObjectIdentification>
    <relatedObjectIdentifierType>UCB</relatedObjectIdentifierType>
    <relatedObjectIdentifierValue>FID2</relatedObjectIdentifierValue>
    <relatedObjectSequence>1</relatedObjectSequence>
  </relatedObjectIdentification>
</relationship>

Elements defined both in METS and PREMIS:
• METS: structMap
    • details structural relationships and is the heart of the METS document
    • hierarchical, so may be more expressive than PREMIS semantic units
    • links the elements of the structure to content files and metadata
• PREMIS: <relationship>
    • details all kinds of relationships, including structural
    • data dictionary says that implementations may record by other means
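To see why the structMap can stand in for PREMIS structural relationships, the sketch below derives the "is sibling of" relationship directly from the div hierarchy. It is an illustrative Python example only; the function name structural_siblings is hypothetical and the sample is a trimmed structMap modeled on the one shown on the slide.

```python
import xml.etree.ElementTree as ET

NS = {"mets": "http://www.loc.gov/METS/"}

# Hypothetical, trimmed record: only the structMap from the slide is kept.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:structMap TYPE="physical">
    <mets:div ORDER="1" TYPE="text">
      <mets:fptr FILEID="FID9"/>
      <mets:div ORDER="1" TYPE="page" LABEL="Page [1]">
        <mets:fptr FILEID="FID1"/>
      </mets:div>
      <mets:div ORDER="2" TYPE="page" LABEL="Page [2]">
        <mets:fptr FILEID="FID2"/>
      </mets:div>
    </mets:div>
  </mets:structMap>
</mets:mets>"""


def structural_siblings(mets_xml, file_id):
    """Return the file IDs referenced by sibling divs of the div that points at file_id."""
    root = ET.fromstring(mets_xml)
    siblings = set()
    for parent in root.iterfind(".//mets:div", NS):
        # One set of FILEIDs per direct child div of this parent.
        groups = [{ptr.get("FILEID") for ptr in child.findall("mets:fptr", NS)}
                  for child in parent.findall("mets:div", NS)]
        if any(file_id in group for group in groups):
            for group in groups:
                if file_id not in group:
                    siblings |= group
    return siblings


if __name__ == "__main__":
    print(structural_siblings(SAMPLE, "FID1"))
```

For the sample, the page file FID1 has FID2 as its only sibling, which is the same fact the PREMIS <relationship> block states explicitly. Recording it in PREMIS as well is therefore optional and mainly useful when the PREMIS metadata must remain meaningful outside the METS wrapper.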

Some METS profiles with PREMIS

UCSD simple and complex object

UC Berkeley

ECHO Dep Generic METS Profile for Preservation and Digital Repository Interoperability

LC Profile for Recorded Events

Australian METS Profile

TIPR

… many others

Additional changes to Guidelines

Make extensibility mechanism consistent with METS
• significantPropertiesExtension
• objectCharacteristicsExtension
• creatingApplicationExtension
• environmentExtension
• signatureInformationExtension
• eventOutcomeDetailExtension
• rightsExtension

Additional changes to Guidelines (cont.)

Add the same elements and attributes as in METS to PREMIS extension elements in schema and data dictionary
• mdRef, mdWrap
• binData, xmlData
• Attributes: ID, LABEL, MDTYPE, MIMETYPE, SIZE, CREATED, CHECKSUM, CHECKSUMTYPE

Allow URI or string for MDTYPE

Add use cases/examples to illustrate choices made

Clarify structural relationships

Implementing an Exchange Standard

PREMIS Implementation Tool

• Some tools documented on the PREMIS website: http://www.loc.gov/standards/premis/tools_for_premis.php

• PiM tool developed by Florida Center for Library Automation

• Further work to generate metadata from digital files in PREMIS elements