PREMIS Update Rebecca Guenther Library of Congress [email protected] PREMIS Implementation Fair Vienna, Austria 22 September 2010


PREMIS Update

Rebecca Guenther
Library of Congress
[email protected]

PREMIS Implementation Fair
Vienna, Austria
22 September 2010

Overview

Editorial Committee membership

What's new since the last PREMIS Implementation Fair (iPRES 2009)

PREMIS Data Dictionary and schema revision process

Changes to the Data Dictionary in process
• Schema changes for extensibility
• Data Dictionary version 2.1

PREMIS conformance

Today’s agenda

PREMIS timeline

[Timeline figure, 2002 to 2010. Milestones shown: PREMIS Working Group formed; Metadata Framework for Digital Preservation; PREMIS Data Dictionary released; Maintenance Activity formed; PREMIS Editorial Committee formed; PREMIS 2.0 released; PREMIS Implementation Fairs.]

The State of PREMIS

de facto standard for preservation metadata; in some countries mandated for cultural heritage repositories

PREMIS implementations are appearing in many places, many contexts, many forms

Some experimentation is leading to changes in the data dictionary and schema

PREMIS Implementation Fairs: attempts to consolidate implementation experiences, issues, and best practices

PREMIS Editorial Committee membership

Rebecca Guenther, Chair (Library of Congress)

Yair Brama (ExLibris)

Karin Bredenberg (Riksarkivet, Swedish National Archives)

Priscilla Caplan (Florida Center for Library Automation)

Angela Dappert (British Library)

Angela Di Iorio (Fondazione Rinascimento Digitale)

Markus Enders (British Library)

Noreen Hill (Library and Archives Canada)

Karsten Huth (Sächsisches Staatsarchiv)

David Lake (US National Archives and Records Administration)

Brian Lavoie (OCLC)

Sally Vermaaten (Statistics New Zealand)

Robert Wolfe (MIT/DSpace)

Kate Zwaard (US Government Printing Office)

PREMIS Implementation Fair at iPres 2009

State of PREMIS

Tools
• PREMIS in METS Toolkit
• Univ. of Illinois Hub and Spoke toolkit
• Statistics New Zealand toolkit

Systems
• ExLibris Rosetta
• DAITSS

Potential data model changes

Case studies: implementations

Discussion
• How to store environment information
• Storing auxiliary files
• Exchange

What’s new: PREMIS activities

Integration with other standards and efforts
• Survey of PREMIS in METS profiles (D-Lib Magazine, Sept 2010) http://www.dlib.org/dlib/september10/vermaaten/09vermaaten.html
• Extensibility: add elements about extensions as in METS
• US intelligence community extending for security classification

PREMIS Documentation
• Understanding PREMIS: Priscilla Caplan (2009)
  • Gentle introduction to the PREMIS standard
  • Spanish, German and Italian translations
• PREMIS Data Dictionary for Preservation Metadata version 2.0: translation in Japanese

Workflows and registries
• PREMIS tools to facilitate automated workflows: PREMIS in METS toolkit made available as open source
• PREMIS controlled vocabularies in id.loc.gov

PREMIS Data Dictionary and Schema Revision Process

Send change request for consideration by the PREMIS Editorial Committee via Web form or on pigpen wiki

Non-substantive changes will be documented on the change page on the PREMIS website

Substantive changes will be brought to the PREMIS Implementers’ Group

Editorial Committee will discuss within 2 months

Decisions made
• Changes made no more than twice a year
• Published as addendum to Data Dictionary and/or in revision of XML schema
• Community will be informed about changes and the reasons for them

Changes to Data Dictionary in process (version 2.1)

Correct links

Add linking semantic units from Agent Entity to Events and Rights:
• linkingEventIdentifier
• linkingRightsStatementIdentifier

Correct errors, clarify ambiguous areas

Make storage optional

Revision of extension element notes to indicate new attributes

New Agent semantic units: agentNote, agentExtension
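As a rough illustration of the additions listed above, the sketch below builds an agent record carrying agentNote and the two new linking semantic units. It assumes the PREMIS 2.x XML namespace `info:lc/xmlschema/premis-v2` and assumes that the new linking units take Type/Value sub-elements in the same pattern as the existing identifiers. The helper names (`q`, `add_identifier`, `build_agent`) and all identifier values are made up for the example, and element order should be checked against the released schema.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace used by the PREMIS 2.x XML schema.
PREMIS_NS = "info:lc/xmlschema/premis-v2"
ET.register_namespace("premis", PREMIS_NS)


def q(name):
    """Return the namespace-qualified tag for a PREMIS semantic unit."""
    return f"{{{PREMIS_NS}}}{name}"


def add_identifier(parent, unit, id_type, id_value):
    """Append <unit> with <unitType> and <unitValue> children (assumed pattern)."""
    container = ET.SubElement(parent, q(unit))
    ET.SubElement(container, q(unit + "Type")).text = id_type
    ET.SubElement(container, q(unit + "Value")).text = id_value
    return container


def build_agent():
    """Build an illustrative agent; every value below is hypothetical."""
    agent = ET.Element(q("agent"))
    add_identifier(agent, "agentIdentifier", "local", "agent-001")
    ET.SubElement(agent, q("agentName")).text = "Example ingest service"
    ET.SubElement(agent, q("agentType")).text = "software"
    # New Agent semantic unit in the 2.1 changes: a free-text note.
    ET.SubElement(agent, q("agentNote")).text = "Illustrative note"
    # New linking semantic units from the Agent entity to Events and Rights.
    add_identifier(agent, "linkingEventIdentifier", "local", "event-042")
    add_identifier(agent, "linkingRightsStatementIdentifier", "local", "rights-007")
    return agent


if __name__ == "__main__":
    print(ET.tostring(build_agent(), encoding="unicode"))
```

With these links in place, a consumer could navigate from an agent to its events and rights statements without scanning every event record for the agent's identifier.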

Schema changes for extensibility

Add information about extension points modeled after METS
• Allow for wrapping or reference of PREMIS metadata
• Other attributes: CREATED, STATUS, ID, CHECKSUM, Location type

Include information about metadata type
• MDTYPE, OTHERMDTYPE
• MDTYPEURI

Additional work
• Coordinate with METS Editorial Board
• Define controlled values in id.loc.gov
• Revise PREMIS in METS guidelines
• Revise notes in Data Dictionary

Draft schema ready to go out for review

Intellectual entities

Has been out of scope and only described by an identifier in PREMIS 1.0 and 2.0

Development of use cases for giving information about intellectual entities

Consideration of how to implement: as another level of object or a separate entity?

Use cases for describing intellectual entities

Represent a collection, FRBR work, FRBR expression, fonds, series, files (in the archival sense) in order to

• capture descriptive metadata

• to have business requirements associated with them or to be referenced in business requirements (such as significant characteristics, risk definitions, guidelines for preservation actions, etc.)

• structural and derivative relationships

• rights information

• events and agents

Capture versioning information and metadata update events for intellectual entities like articles and issues

Adding semantic units for Intellectual Entities

Will be added as another level of object

Advantages to this approach:
• Data dictionary will be more compact
• Simplify the dictionary by dropping links such as linkingIntellectualEntityIdentifier
• Could attach events and agents directly, and rights indirectly, to intellectual entities

Next steps
• Present to PREMIS Implementers’ Group for review
• Revise Data Dictionary and schema

PREMIS conformance

Experience in implementing, managing, and using PREMIS semantic units is growing
• Corresponding need to cultivate deeper understanding of what it means to be “PREMIS conformant”

Need new conformance statement that is more detailed and more actionable
• Detailed: precise definition of what conformance means in light of emerging use cases
• Actionable: of practical use as resource for assessing conformance of a given PREMIS implementation

Subgroup within PREMIS Editorial Committee formed
• Brian Lavoie, Rebecca Guenther, Priscilla Caplan, Angela Dappert, Sally Vermaaten, Yair Brama

Some “use cases” for PREMIS conformance

Inter-repository data exchange
• e.g., TIPR project

Repository certification
• e.g., TRAC

Shared registries
• e.g., PRONOM, Unified Digital Formats Registry

Automated workflows/reusable tools
• e.g., SIP/AIP processing

Vendor support
• e.g., ExLibris Rosetta

New PREMIS conformance statement

Establish conditions required for conformance:
• Articulate what implementers must do to assert PREMIS conformance

Describe “degrees of freedom” associated with conformance:
• Identify areas of implementation decision-making where implementers are free to make their own choices while still remaining conformant

http://www.loc.gov/standards/premis/premisConformance_v4.pdf

1. Establish conditions required for conformance

Organize, amplify, and extend conformance conditions set forth in Data Dictionary v1.0 and v2.0

Define conformance from multiple perspectives:
• Level of semantic unit
• Level of Data Dictionary
• Internal to repository
• Inter-repository exchange (import and export)

Provide examples of conformance & non-conformance

Examples of conformance: semantic unit

Conformant: A repository uses a relational database system with an Objekteigenschaften table and establishes in the system documentation that Objekteigenschaften shares the definition of the PREMIS semantic unit objectCharacteristics.

Non-conformant: A repository implements a metadata element objectCategory that records information defined in PREMIS semantic units objectCategory and preservationLevel.
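The two examples above can be read as a rule that a local element may be renamed freely but should not conflate two semantic units. A minimal sketch of that reading follows; the mapping tables and the function name `conflated_elements` are hypothetical, and the rule itself is an interpretation of the examples rather than wording from the conformance statement.

```python
from collections import defaultdict

# Hypothetical documentation of which PREMIS semantic unit(s) each local
# metadata element records, as (local element, semantic unit) pairs.
CONFORMANT_MAPPING = [
    ("Objekteigenschaften", "objectCharacteristics"),
]
NON_CONFORMANT_MAPPING = [
    ("objectCategory", "objectCategory"),
    ("objectCategory", "preservationLevel"),
]


def conflated_elements(mapping):
    """Return local elements that record more than one semantic unit."""
    units_by_element = defaultdict(set)
    for local_element, semantic_unit in mapping:
        units_by_element[local_element].add(semantic_unit)
    return {
        element: sorted(units)
        for element, units in units_by_element.items()
        if len(units) > 1
    }


if __name__ == "__main__":
    print(conflated_elements(CONFORMANT_MAPPING))
    print(conflated_elements(NON_CONFORMANT_MAPPING))
```

The check only flags the many-units-in-one-element case; spreading one semantic unit across several local elements is deliberately not flagged.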

Examples of conformance: Data Dictionary

Conformant: A repository that is conformant in regard to Objects also wants to record information about Events; therefore, it implements metadata elements that, at the minimum, capture all of the information specified in the semantic units eventIdentifier, eventType, and eventDateTime.

Non-conformant: The information a repository records about Events does not include information that corresponds to the PREMIS semantic unit eventType.
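A check along the lines of these two examples might look like the sketch below. It treats an event record as a plain dictionary keyed by PREMIS semantic unit names, which is an assumption made for illustration; the function name `missing_event_units` and the sample values are likewise hypothetical.

```python
# Minimum Event semantic units named in the conformant example above.
REQUIRED_EVENT_UNITS = ("eventIdentifier", "eventType", "eventDateTime")


def missing_event_units(event):
    """Return the required semantic units that are absent or empty."""
    return [unit for unit in REQUIRED_EVENT_UNITS if not event.get(unit)]


if __name__ == "__main__":
    complete = {
        "eventIdentifier": "event-042",
        "eventType": "ingestion",
        "eventDateTime": "2010-09-22T09:30:00",
    }
    without_type = {
        "eventIdentifier": "event-043",
        "eventDateTime": "2010-09-22T10:15:00",
    }
    print(missing_event_units(complete))
    print(missing_event_units(without_type))
```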

Internal and external conformance

Internal: A repository that satisfies the Principles of Use at both the semantic unit and Data Dictionary levels is considered internally conformant.

External (import): A repository that is import conformant must be able to accept PREMIS-conformant information in the form provided by another repository, parse it, and allocate the information to its corresponding metadata elements in the local repository system, as well as associate it with the appropriate Entities.

External (export): A repository that is export conformant must be able to extract PREMIS-conformant information from its local system, and provide it to another repository in an agreed-upon form, and associate it with its appropriate Entity.
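The import and export definitions above amount to a two-way translation between local element names and PREMIS semantic units that preserves the link to the right Entity. The sketch below assumes a flat dictionary as the agreed-upon exchange form and uses invented local field names; neither comes from the conformance statement. linkingObjectIdentifier is used only to show the association with an Object being carried along.

```python
# Hypothetical local field names mapped to PREMIS Event semantic units.
LOCAL_TO_PREMIS = {
    "ereignis_id": "eventIdentifier",
    "ereignis_typ": "eventType",
    "ereignis_zeit": "eventDateTime",
}
PREMIS_TO_LOCAL = {premis: local for local, premis in LOCAL_TO_PREMIS.items()}


def export_event(local_record, object_id):
    """Express a local event with PREMIS names, keeping its link to the Object."""
    exported = {LOCAL_TO_PREMIS[key]: value for key, value in local_record.items()}
    exported["linkingObjectIdentifier"] = object_id
    return exported


def import_event(premis_record):
    """Allocate PREMIS-named values to local fields; return (record, object id)."""
    incoming = dict(premis_record)  # copy so the caller's record is untouched
    object_id = incoming.pop("linkingObjectIdentifier")
    local_record = {PREMIS_TO_LOCAL[key]: value for key, value in incoming.items()}
    return local_record, object_id


if __name__ == "__main__":
    local = {
        "ereignis_id": "event-042",
        "ereignis_typ": "ingestion",
        "ereignis_zeit": "2010-09-22T09:30:00",
    }
    exchanged = export_event(local, "object-001")
    print(exchanged)
    print(import_event(exchanged))
```

A real exchange would more likely use the PREMIS XML schema, possibly wrapped in METS; the dictionary form only keeps the example short.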

2. Degrees of freedom

Naming
• Repository is free to implement semantic units using names different from those defined in Data Dictionary

Granularity
• Repository is free to distribute information defined in a semantic unit across as many metadata elements as it chooses

Level of Detail
• Repository is free to record more detailed information for a semantic unit than what is defined in Data Dictionary

Explicit Recording of Information
• Repository is not required to explicitly record information for an implemented semantic unit (but information must be recoverable in some way when needed)

Use of Controlled Vocabularies
• Repository is free to use (or not use) controlled vocabularies. If repository uses controlled vocabularies, it can use either internally-defined or external/standardized vocabularies
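To make the granularity and explicit-recording freedoms concrete, the sketch below imagines a repository that keeps one log table per kind of event, so eventType is never stored, and that splits eventDateTime across two columns. The table names, column names, and event type values are all hypothetical; the point is only that the PREMIS-level information stays recoverable when needed.

```python
# Hypothetical local storage: one table per kind of event, so eventType is
# implicit in the table name; eventDateTime is split across two columns.
LOCAL_TABLES = {
    "ingest_log": [{"id": "e1", "day": "2010-09-22", "time": "09:30:00"}],
    "fixity_log": [{"id": "e2", "day": "2010-09-22", "time": "10:15:00"}],
}
# Internally defined vocabulary tying each table to an event type.
TABLE_TO_EVENT_TYPE = {"ingest_log": "ingestion", "fixity_log": "fixity check"}


def recover_events(tables):
    """Rebuild PREMIS-named event information from the local layout."""
    events = []
    for table_name, rows in tables.items():
        for row in rows:
            events.append({
                "eventIdentifier": row["id"],
                # Recovered from context rather than explicitly recorded.
                "eventType": TABLE_TO_EVENT_TYPE[table_name],
                # Recombined from two local elements into one semantic unit.
                "eventDateTime": f"{row['day']}T{row['time']}",
            })
    return events


if __name__ == "__main__":
    for event in recover_events(LOCAL_TABLES):
        print(event)
```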

Next steps for conformance

Collect feedback on draft conformance statement from PIG List & PREMIS Implementation Fair participants

Finalize draft for approval by PREMIS Editorial Committee

Post final version on Maintenance Activity Web site

Today’s topics

Data modeling
• Comparison between PREMIS and PLANETS data models
• PREMIS OWL ontology

PREMIS in interchange
• Towards Interoperable Preservation Repositories (TIPR) (Priscilla Caplan, Florida Center for Library Automation)
• ARTAT (Angela Di Iorio, Fondazione Rinascimento Digitale)

PREMIS controlled vocabularies
• PREMIS vocabulary service
• PREMIS events in HathiTrust

Open discussion