File storage is a lot like a basement closet... Image courtesy of Teemo, Master of Clowning Image...

Preview:

Citation preview

File storage is a lot like a basement closet...

Image courtesy of Teemo, Master of Clowning

Image courtesy of Life Magazine

What happens when it's time to move???

Identification A digital object may belong in MANY potential virtual collections…

… but it originated from ONE SINGLE ANALOG collection. Provenance trumps all!

Slavery African

Americans Sheet Music Tombigbee

River Southern

History … and more

“Gum Tree Canoe,” Published by G.P. Reed (Boston: 1847). Wade Hall collection of Southern History and Culture, Hoole Special Collections, University of Alabama Libraries.

Hierarchical classes approach

Holder ID: u0003

Collection ID: 0000023

Item ID: 0000007

Sequence ID: 0005

Archival File: u0003_0000023_0000007_0005.tif

“Holder” and format type (example: Hoole Special Collections Manuscripts)

Collection within those holdings (example: Gorgas Family Papers)

Numbered item within that collection Sequence of delivery for pages (and subpages)

Concatenated meaningful segments in filename;Same segments mirrored in file structure.

u0003_0001980_0000001 is the first digitized item in the MSS 1980 collection

HOLDER ID

COLLECTION ID

Create Clarity:Information at the Applicable Level

Holder/Format

Documentation

Collection 1 Collection 2 Collection 3Collection 2

Documentation

Item 1 Item 2 Item 3Item 2Documentation

Page 1 Page 2 Page 3

Dropping the Technical Metadata in… where it belongs

Makes METS creation a Piece of Cake!

(and redundant!)

Is simply:

u0003/ 0000002/ 0001/

The directory for

u0003_0000002_0001.tif

storage area

Simple, Clear Hierarchical Organization:

Holder ID Collection ID Item ID Sequence ID

Bringing Content Up to the Level Of the WEB!!! Greater Usability and Access == Longer Life

Images … ImageMagick: http://www.imagemagick.org(it’s free!)

Protected archive area

u0003 u0003

0000023 0000023

0000007

0005

u0003_0000023_0000007_0005.tif

0000007

0005

Thumb, mid-, and large-size derivatives

Web accessible area

Audio … LAME: http://lame.sourceforge.netOCR … TESSERACT: http://code.google.com/p/tesseract-ocr/

http://acumen.lib.ua.edu

ACCESS! Via Acumen

(also free!)

XML schema agnostic No ingest No metadata modifications All content easily

accessible Open to search engines

Recommended