

PeDALS: Persistent Digital Archives & Library System

Richard Pearce-Moses
Deputy Director for Technology & Information Resources
Arizona State Library, Archives and Public Records

Curatorial Rationale

  Transformation of traditional, paper-based practices into the digital arena
  Open Archival Information System (OAIS)

  Traditional practice: Acquisition; Arrangement & description; Housing & storage; Reference and access; Preservation
  OAIS functions: Ingest; Storage; Data management; Preservation; Access

Data Flow

Middleware: Microsoft BizTalk

  Automated business rules
    Transforming SIPs to AIPs and DIPs
    Mapping, generating metadata
  Connecting multiple databases (“glue”)
    Many OOOs
    One repository
  Allows communication between systems
  Validation

1. OOO Recordkeeping System

  For each series of records, the office of origin (OOO) and the repository:
    Negotiate metadata you will receive
    Negotiate format of the records (TIFF, PDF, XML)
    Negotiate format of the submission information package (SIP)
    Negotiate frequency and manner of transfer

  OOO develops procedures to create SIPs
    Metadata, record
    Shipping manifest with hash and file names
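The shipping manifest with hash and file names can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not name a hash algorithm or a manifest layout, so SHA-256, the tab-separated line format, and the function names `file_hash`, `build_manifest`, and `write_manifest` are all hypothetical.

```python
import hashlib
from pathlib import Path


def file_hash(path: Path, algorithm: str = "sha256") -> str:
    """Return the hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(sip_dir: Path) -> list[tuple[str, str]]:
    """List (relative file name, hash) pairs for every file under sip_dir."""
    entries = []
    for path in sorted(p for p in sip_dir.rglob("*") if p.is_file()):
        entries.append((path.relative_to(sip_dir).as_posix(), file_hash(path)))
    return entries


def write_manifest(entries, manifest_path: Path) -> None:
    """Write one 'hash<TAB>file name' line per file (assumed layout)."""
    lines = [f"{digest}\t{name}" for name, digest in entries]
    manifest_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
```

Writing the manifest outside the SIP directory keeps it from being hashed as one of the records on a later run.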

Submission Information Packages

  OOO metadata

    "Well number","Owner","Title","File name"
    "56-000001","CITY OF TUCSON","2003 annual report","56 files\56-000001_0000.pdf"
    "56-000001","CITY OF TUCSON","2004 annual report","56 files\56-000001_0000_E52B0.pdf"
    "56-000001","CITY OF TUCSON","2005 annual report","56 files\56-000001_0000_E8578.pdf"
    "56-000001","CITY OF TUCSON","2006 annual report","56 files\56-000001_0000_EC3F8.pdf"

  Records
    XML
    PDF
    Other formats
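As a rough sketch of how the repository side might read OOO metadata in this comma-separated form, the Python below parses it into one dictionary per record. The function name `read_ooo_metadata` and the two-row `SAMPLE` are illustrative only; the sample rows are taken from the slide, and the stripping of stray spaces around headers is an assumption about how tolerant the reader should be.

```python
import csv
import io

# Two rows copied from the slide; backslashes are doubled for Python.
SAMPLE = '''"Well number","Owner","Title","File name"
"56-000001","CITY OF TUCSON","2003 annual report","56 files\\56-000001_0000.pdf"
"56-000001","CITY OF TUCSON","2004 annual report","56 files\\56-000001_0000_E52B0.pdf"
'''


def read_ooo_metadata(text: str) -> list[dict[str, str]]:
    """Parse OOO metadata into one dict per record, keyed by column header."""
    rows = csv.reader(io.StringIO(text), skipinitialspace=True)
    # Strip stray spaces so a header such as '"Owner" ,' still yields 'Owner'.
    header = [cell.strip() for cell in next(rows)]
    return [
        dict(zip(header, (cell.strip() for cell in row)))
        for row in rows
        if row
    ]


records = read_ooo_metadata(SAMPLE)
```

Each entry in `records` can then be matched to its record file through the "File name" column.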

2. Ingest: Transfer to Drop Box

  Transfer to a drop box in DMZ
    FTP
    Tape
    Disk
  Isolated for virus scanning
  Validation
    Were all records received without corruption?
    Were any false records received?
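The two validation questions map naturally onto a comparison between the shipping manifest and the files found in the drop box. The sketch below assumes the manifest has already been loaded as a mapping from relative file name to SHA-256 hex digest; the function name `validate_transfer` and the report keys are hypothetical, and the slides do not say how PeDALS implements this check.

```python
import hashlib
from pathlib import Path


def validate_transfer(manifest: dict[str, str], drop_box: Path) -> dict[str, list[str]]:
    """Compare files in the drop box with the shipping manifest.

    manifest maps each relative file name to its expected SHA-256 hex digest.
    """
    received = {
        p.relative_to(drop_box).as_posix(): p
        for p in drop_box.rglob("*")
        if p.is_file()
    }
    report = {"missing": [], "corrupted": [], "unexpected": []}
    for name, expected in sorted(manifest.items()):
        if name not in received:
            report["missing"].append(name)  # listed but never arrived
        elif hashlib.sha256(received[name].read_bytes()).hexdigest() != expected:
            report["corrupted"].append(name)  # arrived, but the hash differs
    # Files present in the drop box that the manifest never listed.
    report["unexpected"] = sorted(set(received) - set(manifest))
    return report
```

An empty report answers both questions: everything listed arrived intact, and nothing unlisted arrived.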

3. Data Management: Metadata

  Generate core metadata
    Administrative (6 elements)
    Descriptive (28 elements)
    Preservation (12 elements)
  Stored in “Accessions Register”
    MS SQL Server

Administrative Metadata

  Information created by the repository to track records in the system
    Accession Number
    Transfer Authority
    Acquisition Ingest Identifier
    Acquisition Date
    Unique Item Identifier
    Item Location
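To make the six administrative elements concrete, here is a minimal sketch of an accession register table. It uses Python's built-in `sqlite3` purely as a stand-in for MS SQL Server, and the table name, column names, and column types are assumptions; the slides do not show the actual schema.

```python
import sqlite3

# One column per administrative element; names and types are hypothetical.
SCHEMA = """
CREATE TABLE accession_register (
    unique_item_identifier        TEXT PRIMARY KEY,
    accession_number              TEXT NOT NULL,
    transfer_authority            TEXT NOT NULL,
    acquisition_ingest_identifier TEXT NOT NULL,
    acquisition_date              TEXT NOT NULL,
    item_location                 TEXT NOT NULL
)
"""

INSERT = """
INSERT INTO accession_register VALUES (
    :unique_item_identifier, :accession_number, :transfer_authority,
    :acquisition_ingest_identifier, :acquisition_date, :item_location
)
"""


def open_register(path: str = ":memory:") -> sqlite3.Connection:
    """Create the register table in a new (by default in-memory) database."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def register_item(conn: sqlite3.Connection, item: dict[str, str]) -> None:
    """Insert one item; a duplicate identifier raises sqlite3.IntegrityError."""
    with conn:  # commits on success, rolls back on error
        conn.execute(INSERT, item)
```

Making the unique item identifier the primary key means the database itself rejects a second accession of the same item.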

Discovery Metadata

  Information created by the OOO or the repository to help retrieve records for a variety of purposes

    Office of Origin, Variant name
    Source
    Series Title, ID
    Series Dates
    Series Extent
    Series Description
    Arrangement
    Restrictions
    Series Subjects, Keywords
    Activity

    Item Title
    Originator ID
    Item Extent
    Item Date
    Item Description
    First1024
    Party and Role, Subjects, Location
    Item Keywords, Form/Genre
    Related Item
    Language
    Open Date

Preservation Metadata

  Information created by the repository to protect integrity and support readability over time
    Access Facilitators
    Access Inhibitors
    Exceptions
    File Description
    Fixity
    Functionality
    Operating System
    Hardware
    Signature Information
    Software
    Structural Type
    Technical Infrastructure

Mapping and Creating Metadata

  Metadata Element               | Received | PeDALS Core
  Office of Origin               |          | Arizona. Dept of Water Resources
  Var Names                      |          | Water Resources
  Series Title                   |          | Annual Reports
  Series Description/Scope note  |          | [Narrative text]
  Arrangement                    |          | [Narrative text]
  Series Date Range              |          | 2003 – 2006
  Series Subjects                |          | Water wells; Water Supply
  Activity/Function              |          | Regulation; Health and safety
  Transfer authority             |          | Retention schedule 89-403
  Restrictions                   |          | Open

Mapping and Creating Metadata

  Metadata Element | Received           | PeDALS Core
  Title            | 2003 Annual Report | 2003 Annual Report
  Date             |                    | 2003
  Party to Record  | City of Tucson     | City of Tucson: Author
  Location         |                    | Tucson
  Subjects         |                    | Water wells; Water supply
  Description      |                    | [autogenerated]
  First1024        |                    | [autogenerated]
  OOO Id           | 56-00001_0000      | 56-00001_0000
  File format      |                    | PDF
  Fixity           |                    | [autogenerated]
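The item-level mapping can be illustrated with a small Python sketch that copies received values and generates the rest. Several points are assumptions rather than anything the slides state: that First1024 holds the first 1024 characters of the record's extracted text, that Fixity is a SHA-256 digest of the record bytes, that the OOO Id and file format can be read from the file name, and that the default role is "Author". The function name `map_item` is hypothetical.

```python
import hashlib
from pathlib import PureWindowsPath


def map_item(received: dict[str, str], text: str, content: bytes,
             party_role: str = "Author") -> dict[str, str]:
    """Map one row of OOO metadata to a few core elements.

    received: the OOO row (keys "Title", "Owner", "File name")
    text:     text extracted from the record (extraction not shown here)
    content:  the raw bytes of the record file
    """
    # The sample file names use backslashes, so parse them as Windows paths.
    file_name = PureWindowsPath(received["File name"])
    return {
        "Title": received["Title"],
        "Party to Record": f'{received["Owner"]}: {party_role}',
        "OOO Id": file_name.stem,                              # assumed derivation
        "File format": file_name.suffix.lstrip(".").upper(),   # assumed derivation
        "First1024": text[:1024],                              # assumed meaning
        "Fixity": hashlib.sha256(content).hexdigest(),         # assumed algorithm
    }
```

Elements such as Date, Location, and Subjects are left out because the slides do not indicate how they are derived.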

4. Storage

  Create AIP

    <AIP>
      <Hash> </Hash>
      <CoreMetadata> </CoreMetadata>
      <Metadata> </Metadata>
      <Record> </Record>
    </AIP>

  Deposit in Digital Stacks (LOCKSS)
    Generate manifest list to expose to LOCKSS
    LOCKSS harvests from manifest server
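The AIP wrapper shown on the slide can be assembled with Python's standard `xml.etree.ElementTree`. Only the four element names (Hash, CoreMetadata, Metadata, Record) and their order come from the slide. Everything else here is an assumption made for illustration: that the hash is a SHA-256 digest of the record bytes, that each metadata value sits in a child `Element` tag with a `name` attribute, and that the record is embedded as base64 text.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET


def build_aip(core: dict[str, str], ooo_metadata: dict[str, str],
              record: bytes) -> ET.Element:
    """Wrap core metadata, OOO metadata, and a record in one <AIP> element."""
    aip = ET.Element("AIP")
    # Assumed: the hash covers the record bytes only.
    ET.SubElement(aip, "Hash").text = hashlib.sha256(record).hexdigest()

    core_el = ET.SubElement(aip, "CoreMetadata")
    for name, value in core.items():
        ET.SubElement(core_el, "Element", {"name": name}).text = value

    meta_el = ET.SubElement(aip, "Metadata")
    for name, value in ooo_metadata.items():
        ET.SubElement(meta_el, "Element", {"name": name}).text = value

    # Assumed: binary records are carried as base64 text inside <Record>.
    ET.SubElement(aip, "Record").text = base64.b64encode(record).decode("ascii")
    return aip


aip = build_aip({"Title": "2003 Annual Report"},
                {"Owner": "CITY OF TUCSON"},
                b"%PDF-1.4 sample bytes")
xml_bytes = ET.tostring(aip, encoding="utf-8")
```

Serializing with `ET.tostring` yields a single self-contained file per item, which is the kind of object a manifest list could then point to.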

Why LOCKSS?

  Benefits
    Automatic integrity checking
    Automated error correction
    Geographically dispersed copies
    Bitstream preservation
    Committed community of support
    Hardened operating system
  Concerns
    Maximum number of objects in a Unix file system
    Community of support is small

5. Access

  DIPs for public access
    No administrative or preservation metadata
    Formats supported by common browsers
  Website
    Records not confidential (by law)
    SQL query engine with discovery metadata
  Limited access website
    In repository, selected locations
    Record series with personally identifying information

6. Preservation

  Bitstream preservation
    Developing audit procedures
    Periodic validation of dark archives against accession register
  For future development
    Capturing minimum preservation metadata
    On-the-fly rendering tools
    Long-term format migration
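Periodic validation of the dark archives against the accession register could look something like the sketch below. The audit procedures were still being developed at the time of the slides, so this is only one possible shape: it assumes the register can supply each item's identifier with an expected SHA-256 digest, and that some callable can fetch the stored bytes for an identifier. The names `audit_archive` and `fetch` are hypothetical.

```python
import hashlib
from typing import Callable, Iterable, Optional


def audit_archive(register: Iterable[tuple[str, str]],
                  fetch: Callable[[str], Optional[bytes]]) -> dict[str, str]:
    """Check each registered item against the copy held in the dark archive.

    register yields (unique item identifier, expected SHA-256 hex digest);
    fetch returns the stored bytes for an identifier, or None if absent.
    Returns a finding only for items that fail the check.
    """
    findings = {}
    for identifier, expected in register:
        content = fetch(identifier)
        if content is None:
            findings[identifier] = "missing"
        elif hashlib.sha256(content).hexdigest() != expected:
            findings[identifier] = "fixity mismatch"
    return findings
```

Passing the fetch step in as a callable keeps the audit logic independent of how the archive is actually reached.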

Community of Shared Practice

  Personal Relationships
    Challenge of building relationships over the Internet
    Lack of rich, immediate feedback in communication
    Lack of spontaneity, serendipity, play
  Inter-Agency Relationships
    Different practices
    Laws and regulations
    Money

For more information

  http://rpm.lib.az.us/PeDALS/

  Principal Investigator: Richard Pearce-Moses
  Project Coordinator: Sara Muth
  State Partner Leads
    Florida: Mark Flynn
    New York: Bonnie Weddle
    South Carolina: Bill Henry
    Wisconsin: Helmut Knies