WHERE HAVE ALL THE BINDERS GONE? Greg Colati, University of Denver Jennifer King, George Washington University Sylvia Augusteijn, George Washington University SAA Chicago Session #801 September 1, 2007


WHERE HAVE ALL THE BINDERS GONE?

Greg Colati, University of Denver

Jennifer King, George Washington University

Sylvia Augusteijn, George Washington University

SAA Chicago Session #801

September 1, 2007


WHY MANAGE WITH A DATABASE?

Scale
Centralized management
Access
Reusability
Rearrange-ability


REAL DRIVERS OF CHANGE

Demand for item level access
Born Digital content
Digitized content
Researcher demands and expectations


MANY INPUTS, MANY OUTPUTS

[Diagram: inputs and outputs of the Collections Management Database]

Inputs:
Metadata from Records Management system
Metadata from local systems, e.g. DUVAGA, or IR
In-house cataloging or imported metadata

Outputs:
OAI metadata for harvesters and aggregators
EAD XML for RMOA or other uses
MARC records for III or other uses
Metadata for local systems, e.g. Heritage West, Penrose web, DUVAGA

Also shown: physical object storage location, digital object storage location


OBJECTS AND ATTRIBUTES

I belong to a collection
I belong to a series
I came from somewhere
I am an image
I am a certain file format(s)
I am about something(s)
I am green, blue, and brown


CLUSTERING


VISUALIZATION


© 2007 Gregory C. Colati

CONTEXTUALIZE THE RESOURCE

The Encyclopedia of Chicago http://www.encyclopedia.chicagohistory.org/


I WANT WHAT I WANT …


A CULTURAL SHIFT

General

Specific Association

Object


EXTEND INTEROPERABILITY

Descriptive standards at the item level


MANAGE FROM THE BOTTOM UP

Items and attributes
Create associations, implicit and explicit


PRODUCTIVITY APPROACH TO PROCESSING, MANAGEMENT, AND ACCESS

Automate metadata creation
Metadata extraction
Pre-populate metadata fields using default and automatically generated terms
Stop writing extensive biographical and historical notes
Automate digital content creation


USE THE POWER OF DATABASE TOOLS

Ingest tools discussed above
Export templates for:
MARC
EAD
Various XML schemas for item level export: MARCXML, DC, TEI, VRA etc.
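As a rough illustration of item-level export to one of the schemas named above, the sketch below writes a single item record as Dublin Core elements using Python's standard library. It is not the export template of any particular database: the `record` wrapper element, the `item_to_dc` function, and the choice of the three fields are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def item_to_dc(item):
    """Serialize one item (a dict of strings) as Dublin Core elements.

    The <record> wrapper is a hypothetical container for this sketch,
    not part of any published schema.
    """
    record = ET.Element("record")
    for name in ("identifier", "title", "date"):
        element = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        element.text = item[name]
    return ET.tostring(record, encoding="unicode")


xml_text = item_to_dc(
    {"identifier": "00001", "title": "Journals", "date": "1950-60"}
)
print(xml_text)
```

The same dictionary could feed a second function for another schema, which is the point of keeping the item data separate from any one output format.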


LEVERAGE USE OF DIGITAL REPOSITORIES

We don’t have to be self-sufficient
Outsource low-level functions
Mass storage
Backup


CREATE PARTNERSHIPS

Computer scientists
Librarians
Academic technologists


GET INTO MAINSTREAM DISCOVERY TOOLS
GET “INTO THE FLOW”

Can everyone say
Google
MySpace
YouTube
Facebook


CREATE ACCESS TOOLS BASED ON USER NEEDS

Understand how all of our constituencies seek information and use information

Make our tools reflect these behaviors. When those behaviors change, our tools should change with them.


NEW SKILLS FOR THE DIGITAL ERA

Jennifer King

George Washington University


RE:DISCOVERY MAIN PAGE


RE:DISCOVERY FOR INTERNET SEARCH


RFI AND FINDING AID


From Document to Database

Sylvia Augusteijn

George Washington University

Special Collections and University Archives

SAA session 801

September 1, 2007


Out from the binders

Scope and content notes, series descriptions simple to cut and paste into Re:Discovery

Cut and paste not feasible for thousands of item-level records

“Container list” project is born

Goal: to separate elements of each item name (number, title, date) so Re:Discovery could import them into their respective fields


Container lists

Each item has a number, title, and date, but formats vary slightly in punctuation or spacing

Ways of writing the same name:

1. Correspondence, 1950-57

I. Correspondence – 1950-1957

i. correspondence 1950 to 1957

Naming conventions generally consistent within each finding aid

How to automate?
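One way to picture the problem is a single pattern loose enough to accept all three spellings shown on this slide. The sketch below uses Python rather than the text editor described in the talk. The pattern, the `parse_item` function, and the decision to expand a two-digit end year from the start year are assumptions made for illustration, not the authors' method.

```python
import re

# Hypothetical pattern: an item number (Arabic or Roman), a title, and a
# date range, with loose punctuation between them.
ITEM = re.compile(
    r"""^\s*
    (?P<number>\d+|[ivxlc]+)\.\s+     # "1.", "I." or "i."
    (?P<title>.+?)                    # title, as short as possible
    \s*[,–-]?\s*                      # optional comma, en dash, or hyphen
    (?P<start>\d{4})\s*(?:-|–|to)\s*(?P<end>\d{2,4})\s*$
    """,
    re.IGNORECASE | re.VERBOSE,
)


def parse_item(line):
    """Return (number, title, date) or None if the line is not an item."""
    match = ITEM.match(line)
    if match is None:
        return None
    start, end = match["start"], match["end"]
    if len(end) < 4:
        # Assumption: "1950-57" means the end year shares the start's century.
        end = start[: 4 - len(end)] + end
    return match["number"], match["title"], f"{start}-{end}"


for variant in (
    "1. Correspondence, 1950-57",
    "I. Correspondence – 1950-1957",
    "i. correspondence 1950 to 1957",
):
    print(parse_item(variant))
```

Because conventions are generally consistent within a finding aid, a narrower pattern written per finding aid is often safer than one catch-all expression like this.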


Automation, part 1: Delimiting the text

Container lists saved in a text editor (TextPad)

Delimiters are special characters placed within the text to separate the elements

We chose * to signal the beginning and end of each field and % to signal the boundary between fields

Item as it appears in text of finding aid: 1. Correspondence, 1950-57

Item with delimiters inserted: *1*%*Correspondence*%*1950-57*
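The delimiter scheme above can be expressed as a small function: wrap each field in * and join the fields with %. This is a sketch in Python, with a hypothetical `delimit` function. The check that rejects fields already containing a delimiter character is an added assumption, not something stated in the talk.

```python
FIELD_MARK = "*"       # signals the beginning and end of each field
FIELD_BOUNDARY = "%"   # signals the boundary between fields


def delimit(number, title, date):
    """Build one delimited record from its three elements."""
    fields = (number, title, date)
    for field in fields:
        # Assumption: a field that already contains a delimiter character
        # would be ambiguous on import, so refuse it.
        if FIELD_MARK in field or FIELD_BOUNDARY in field:
            raise ValueError(f"field contains a delimiter: {field!r}")
    return FIELD_BOUNDARY.join(
        f"{FIELD_MARK}{field}{FIELD_MARK}" for field in fields
    )


print(delimit("1", "Correspondence", "1950-57"))
# *1*%*Correspondence*%*1950-57*
```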


Delimiting the text (continued)

Re:Discovery can import directly from the text editor, with instructions

Instructions to Re:Discovery: the first element of this name will be the number, the second will be the title, the third will be the date

*1*%*Correspondence*%*1950-57*

How to add these delimiters to thousands of item records?
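The positional instructions on this slide (first element is the number, second the title, third the date) amount to splitting on % and stripping the surrounding * from each part. The Python sketch below illustrates that mapping only. It does not reproduce Re:Discovery's import mechanism, and `FIELD_NAMES` and `parse_record` are hypothetical names.

```python
# Order matters: position in the record decides which field a value fills.
FIELD_NAMES = ("number", "title", "date")


def parse_record(record):
    """Map a delimited record onto named fields by position."""
    parts = record.split("%")
    if len(parts) != len(FIELD_NAMES):
        raise ValueError(f"expected {len(FIELD_NAMES)} fields: {record!r}")
    values = []
    for part in parts:
        if len(part) < 2 or not (part.startswith("*") and part.endswith("*")):
            raise ValueError(f"field is not wrapped in *: {part!r}")
        values.append(part[1:-1])  # drop the leading and trailing *
    return dict(zip(FIELD_NAMES, values))


print(parse_record("*1*%*Correspondence*%*1950-57*"))
# {'number': '1', 'title': 'Correspondence', 'date': '1950-57'}
```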


Automation, part 2: Regular expressions

A regular expression is a string that uses special characters (such as \ + $ ^ ]) to describe and match patterns of text within a document


Regular expressions (continued)

First used regular expressions to search through our text for anything formatted like an item (i.e. to search for a pattern in which an item number is followed by a title and date)

Then used regular expressions to insert our delimiters in between those elements

To turn a page of this:

1. Journals, 1950-60
2. Photographs, 1970-80
3. Postcards, 1940-50

Into a page of this:

*00001*%*Journals*%*1950-60*
*00002*%*Photographs*%*1970-80*
*00003*%*Postcards*%*1940-50*
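The page-level transformation can be sketched in Python as a single pass over the container list. This assumes one item per line in the form "number. title, date range" and is not the text-editor workflow used in the project. `ITEM_LINE` and `delimit_page` are hypothetical names, and lines that do not look like items are passed through unchanged.

```python
import re

# Assumed item format: "<number>. <title>, <yyyy>-<yy or yyyy>"
ITEM_LINE = re.compile(r"^(\d+)\. (.+), (\d{4}-\d{2,4})$")


def delimit_page(text):
    """Insert delimiters into every line that looks like an item."""
    output = []
    for line in text.splitlines():
        match = ITEM_LINE.match(line)
        if match is None:
            output.append(line)  # leave non-item lines untouched
            continue
        number, title, date = match.groups()
        # Pad the item number to five digits, as in the slide's output.
        output.append(f"*{int(number):05d}*%*{title}*%*{date}*")
    return "\n".join(output)


page = "1. Journals, 1950-60\n2. Photographs, 1970-80\n3. Postcards, 1940-50"
print(delimit_page(page))
```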


Examples of regular expressions

To turn 1. Correspondence, 1950-1957 into

*00001*%*Correspondence, 1950-1957

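Find: \([0-9]\)\. followed by one space (find any digit followed by a period and a space; the period is escaped so it matches only a literal period, not any character)

Replace: *0000\1*%* (replace with *, four zeroes, that digit, and *%*)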

Then to turn *00001*%*Correspondence, 1950-1957 into *00001*%*Correspondence*%*1950-1957

  Find: , \([0-9]\{4\}\) (find any four-digit number preceded by a comma and space)

Replace: *%*\1 (replace the comma and space with *%*)
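For readers working outside a text editor, the two find-and-replace steps can be approximated in Python. The syntax differs from the expressions on the slide: Python writes groups as ( ) and counts as {4} without backslashes. In this sketch the period is escaped, the space after it is consumed, and the first pattern is anchored to the start of a line. The anchor is an added assumption, there to keep the expression from touching digits elsewhere in the text. `step_one` and `step_two` are hypothetical names.

```python
import re


def step_one(text):
    """Turn a leading single-digit item number into *0000N*%*."""
    # Multi-digit item numbers would need their own expression,
    # with fewer padding zeroes.
    return re.sub(r"^([0-9])\. ", r"*0000\1*%*", text, flags=re.MULTILINE)


def step_two(text):
    """Replace the comma and space before a four-digit year with *%*."""
    return re.sub(r", ([0-9]{4})", r"*%*\1", text)


line = "1. Correspondence, 1950-1957"
after_one = step_one(line)
after_two = step_two(after_one)
print(after_one)  # *00001*%*Correspondence, 1950-1957
print(after_two)  # *00001*%*Correspondence*%*1950-1957
```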


Challenges

Tweaking expressions slightly for each new container list

Writing the wrong expression and accidentally replacing the wrong text

Failing to export correctly to Re:Discovery due to small number of missing delimiters
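The last challenge, a handful of missing delimiters, is the kind of problem a quick check before import can catch. The Python sketch below reports every non-blank line that does not have exactly three *-wrapped fields separated by %. It is an illustrative safeguard, not part of the workflow described in the talk, and `RECORD` and `find_bad_lines` are hypothetical names.

```python
import re

# A well-formed record: *field*%*field*%*field*, where no field
# contains a delimiter character.
RECORD = re.compile(r"^\*[^*%]*\*%\*[^*%]*\*%\*[^*%]*\*$")


def find_bad_lines(text):
    """Return (line number, line) for each malformed, non-blank line."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if line.strip() and not RECORD.match(line)
    ]


sample = (
    "*00001*%*Journals*%*1950-60*\n"
    "*00002*%*Photographs*%1970-80*\n"  # missing * before the date
    "\n"
    "*00003*%*Postcards*%*1940-50*"
)
for number, line in find_bad_lines(sample):
    print(f"line {number}: {line}")
```

Running a check like this on a copy of the file, before any bulk replace is saved, also limits the damage from an expression that matched the wrong text.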


Re:Discovery and beyond

Delimited text exported into Re:Discovery

From Re:Discovery, easy creation of EAD finding aids using a template

To date: 257 collections in Re:Discovery (and EAD finding aids on the web)

0 binders


CONTACT INFORMATION:

Greg Colati

Digital Initiatives Coordinator

University of Denver

[email protected]

Jennifer King

Manuscripts Librarian

George Washington University

Washington, DC

[email protected]

Sylvia Augusteijn

Project Archivist

George Washington University

[email protected]