DIGITAL ARCHIVES
Into the Light
Gabrielle V. Michalek, Head, Digital Library Initiatives, Carnegie Mellon University
A Sound Digital Archive
Accessibility
Interoperability
Sustainability
Preservation
Carnegie Mellon’s Digital Collections
Senator Heinz - 850,000 images
Herbert Simon - 153,000 images
Allen Newell - 145,000 images
Over 1 Million Images Online http://diva.library.cmu.edu/
Other CMU Digital Projects
SmartWeb Exhibit http://shelf1.library.cmu.edu/IMLS/MindModels/
Million Book Project http://www.rr.cs.cmu.edu/mbdl.doc
Universal Library http://ul.cs.cmu.edu/
History
1992 Began work on Senator Heinz Papers
1995 Developed Helios System to digitize, create metadata, OCR, and provide access to collection
1999 Applied technology to Simon and Newell Collections
2000 Migrated to DIVA
DIVA
Digital Information Versatile Archive
DIVA
New platform
Oracle based
Takes in any XML file
Supports heterogeneous collections
Full-text or fielded searching
Browsing and sorting
Use of Standards
Metadata Creation – EAD, Dublin Core, etc.
Imaging – 600 DPI, 8-bit grayscale, 24-bit color
OCR – ASCII Text
Data Structure – Metadata Encoding and Transmission Standard (METS)
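To give a feel for what the imaging standard above implies for storage, here is a rough sketch of the uncompressed size of a single page scanned at 600 DPI. The page size (US Letter, 8.5 x 11 inches) is an assumption for illustration only; the slides do not state page dimensions or file formats, and compression would reduce these figures.

```python
# Uncompressed size of one scanned page at the bit depths listed above.
# Assumption: a US Letter page (8.5 x 11 inches); the slides give no page size.
DPI = 600
PAGE_WIDTH_IN = 8.5
PAGE_HEIGHT_IN = 11.0


def uncompressed_bytes(bits_per_pixel):
    """Return the raw image size in bytes for one page scanned at DPI."""
    width_px = round(PAGE_WIDTH_IN * DPI)    # 5100 pixels
    height_px = round(PAGE_HEIGHT_IN * DPI)  # 6600 pixels
    return width_px * height_px * bits_per_pixel // 8


grayscale_bytes = uncompressed_bytes(8)   # 8-bit grayscale
color_bytes = uncompressed_bytes(24)      # 24-bit color

print(f"8-bit grayscale: {grayscale_bytes / 1e6:.1f} MB per page")
print(f"24-bit color:    {color_bytes / 1e6:.1f} MB per page")
```

Under these assumptions a grayscale page is about 33.7 MB and a color page about 101.0 MB before compression, which is why storage planning matters for collections of this scale.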
Metadata
What is it and why is it important?
Descriptive Metadata
Data that describes the digital object, such as a bibliographic record or finding aid, e.g., a MARC record
Structural Metadata
Represents the relationships among the parts of a multipart object, e.g., the chapters of a book
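As a minimal sketch of the idea, structural metadata can be pictured as a small tree that records which page images belong to which chapter and in what order. The dictionary layout, labels, and file names below are hypothetical and chosen only for illustration; they are not taken from any CMU system.

```python
# Hypothetical structural metadata for one book: chapters and their page images.
book = {
    "label": "Sample Book",
    "parts": [
        {"label": "Chapter 1", "pages": ["0001.tif", "0002.tif"]},
        {"label": "Chapter 2", "pages": ["0003.tif"]},
    ],
}


def reading_order(obj):
    """Flatten the structure into (chapter label, page file) pairs in order."""
    return [
        (part["label"], page)
        for part in obj["parts"]
        for page in part["pages"]
    ]


for chapter, page in reading_order(book):
    print(chapter, page)
```

Without this layer, the three image files are just three files; with it, a viewer can page through the book and jump to a chapter.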
Administrative Metadata
“Data that supports the unique identification, maintenance, and archiving of digital objects, as well as related functions of the organization managing the repository”, e.g., who created the object, and which software and version were used.
What We Are Using
Archival Collections - Encoded Archival Description (EAD)
Books, Journals, Photographs, etc. – Dublin Core
Metadata Encoding and Transmission Standard - METS
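For readers unfamiliar with Dublin Core, the sketch below builds a small descriptive record for a photograph using Python's standard library. The `dc` namespace URI is the published Dublin Core element set namespace; the `record` wrapper element, the helper name `dublin_core_record`, and all field values are assumptions made up for this example, not the format DIVA actually ingests.

```python
import xml.etree.ElementTree as ET

# Published namespace of the Dublin Core Metadata Element Set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build a record with one dc:* child per (element name, value) pair."""
    record = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields.items():
        ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value
    return record


# Illustrative values only.
photo = dublin_core_record({
    "title": "Sample photograph",
    "creator": "Unknown",
    "type": "Image",
    "format": "image/tiff",
})

print(ET.tostring(photo, encoding="unicode"))
```

The same handful of elements can describe a book, a journal issue, or a photograph, which is what makes Dublin Core a convenient common denominator for mixed collections.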
METS
Incorporates descriptive, structural, and administrative metadata
Allows you to bind heterogeneous collections together and show relationships between information
Becomes a wrapper for the collection
XML Schema http://www.loc.gov/standards/mets/
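The wrapper idea can be sketched as a skeletal METS document that ties the three kinds of metadata together: a descriptive section, an administrative section, a file inventory, and a structural map pointing at the files. The element and attribute names follow the METS standard; the helper `build_mets`, the IDs, the title, and the example.org URLs are hypothetical, the administrative section is left as an empty placeholder, and the output has not been validated against the official schema.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)
ET.register_namespace("dc", DC_NS)


def m(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_mets(title, page_urls):
    """Return a skeletal METS element wrapping one multi-page object."""
    root = ET.Element(m("mets"))

    # Descriptive metadata: a Dublin Core title wrapped inside the document.
    dmd = ET.SubElement(root, m("dmdSec"), ID="DMD1")
    wrap = ET.SubElement(dmd, m("mdWrap"), MDTYPE="DC")
    xml_data = ET.SubElement(wrap, m("xmlData"))
    ET.SubElement(xml_data, f"{{{DC_NS}}}title").text = title

    # Administrative metadata: empty placeholder in this sketch.
    ET.SubElement(root, m("amdSec"), ID="AMD1")

    # File inventory and structural map, filled in together below.
    file_sec = ET.SubElement(root, m("fileSec"))
    group = ET.SubElement(file_sec, m("fileGrp"), USE="master")
    struct_map = ET.SubElement(root, m("structMap"), TYPE="physical")
    item = ET.SubElement(struct_map, m("div"), TYPE="item", LABEL=title, DMDID="DMD1")

    for order, url in enumerate(page_urls, start=1):
        file_id = f"FILE{order}"
        file_el = ET.SubElement(group, m("file"), ID=file_id)
        ET.SubElement(file_el, m("FLocat"), {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": url})
        # Each page div points back at its file by ID.
        page = ET.SubElement(item, m("div"), TYPE="page", ORDER=str(order))
        ET.SubElement(page, m("fptr"), FILEID=file_id)

    return root


# Hypothetical object with two page images.
doc = build_mets("Sample letter", ["http://example.org/0001.tif", "http://example.org/0002.tif"])
print(ET.tostring(doc, encoding="unicode"))
```

Because the structural map refers to files and descriptive records only by ID, the same wrapper shape can hold an EAD-described archival folder or a Dublin Core-described book.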
Goals
Accessibility
Interoperability
Sustainability
Preservation
Thank You
http://diva.library.cmu.edu/