More information on the SIL digitization program than you require. Keri Thompson, Smithsonian Institution Libraries. SPIN Rapid Capture Workshop, February 16, 2012. “MORE”

SIL rapid capture


DESCRIPTION

More information than you require about the Smithsonian Libraries' mass digitization program. Presentation given to Smithsonian staff (and some others) for a day-long symposium on rapid capture methodology. Focuses on SIL's workflow for scanning books.


Page 1: SIL rapid capture

More information on the SIL digitization program than you require

Keri Thompson

Smithsonian Institution Libraries

SPIN Rapid Capture Workshop February 16, 2012

“MORE”

Page 2: SIL rapid capture

Boutique Digitization

Boutique
One-offs
Item-based workflow
Tailored metadata
Hand-crafted data, much user intervention
Opportunistic staffing
Project-specific grants

Illustration by A.E. Marty (1882-1974), Gazette du Bon Genre, July 1920. Smithsonian Institution Libraries

Page 3: SIL rapid capture

Mass Digitization

Prêt à lire

Standardization
Format-based workflow and metadata model
Automate as much as possible
Assigned staff
Funding stream

New York Millinery and Supply Co., 1901. Smithsonian Institution Libraries

Page 4: SIL rapid capture

Ramping Up

Find your niche
Secure Funding
Hire Staff
Purchase Equipment
Standardize on metadata, processes
Automate!
  i.e., find magic automation wizard

Page 5: SIL rapid capture

Our Little Corner of the Web

10 original partner institutions

Digitizing legacy literature of taxonomy

Over 50,000 titles, over 100,000 items, almost 38 million pages

Page 6: SIL rapid capture

Numbers!

[Chart: "Digitization at SI Libraries 1999-present". Series: Total Items. Horizontal axis: 1999 to 2011. Vertical axis: 0 to 14,000. Annotations: not rapid, rapid, too rapid.]

Storage estimates
At Internet Archive: >10TB
Locally: >7.5TB

Page 7: SIL rapid capture

Funding

Multiple grants
Over multiple years
Lather, rinse, repeat

Kalamazoo Tank & Silo Co., Catalog, ca. 1909. Smithsonian Institution Libraries

Page 8: SIL rapid capture

Human Resources

Started in 2008 with
  2 FTE technicians (Grant)
  .7 FTE manager
  .5 FTE cataloger
  Vendor scanning only
  And a host of others!

In 2012 have
  1 FTE technician (Grant)
  2 FTE librarians (Grant)
  .3 FTE manager
  1 scanning technician (Grant)
  And a host of others!

International Time Recording Co., Time Recording Card Clocks, 1914, p. 12. Smithsonian Institution Libraries

Page 9: SIL rapid capture

Equipment

PhaseOne P65, CaptureOne
BC100, CaptureOne
Canon 5D MkII, Biblio

Page 10: SIL rapid capture

In-House Scanning

P65, 60.5MP camera
Strobe lights
Image capture
Filenaming
Crop, rotate
No post-processing
Convert to .tiff
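The "Filenaming" step can be illustrated with a minimal sketch. The slide does not say what naming convention SIL uses, so the `<item id>_<zero-padded sequence>.<ext>` pattern, the function names, and the default padding width below are all assumptions chosen only to show how capture order can be turned into predictable file names.

```python
from __future__ import annotations


def sequence_filename(item_id: str, sequence: int, ext: str = "tiff", width: int = 4) -> str:
    """Build a name like '<item_id>_0001.tiff' (hypothetical scheme, not SIL's)."""
    if sequence < 1:
        raise ValueError("sequence numbers start at 1")
    return f"{item_id}_{sequence:0{width}d}.{ext}"


def rename_plan(captures: list[str], item_id: str) -> list[tuple[str, str]]:
    """Pair each capture, in shooting order, with its sequence-based target name.

    Nothing is renamed on disk; the plan can be reviewed before it is applied.
    """
    return [
        (name, sequence_filename(item_id, position))
        for position, name in enumerate(captures, start=1)
    ]
```

Returning a plan instead of renaming files directly keeps the step easy to check during QC, which matters when a skipped or duplicated page would shift every later sequence number.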

Page 11: SIL rapid capture

Process(es)(es)

[Diagram labels: Vendor; Website presentation; storage; Requests; Special projects; In-house use (exhibitions, brochures); “gap-fills”; Data sources]

Page 12: SIL rapid capture

Generalized workflow

[Diagram]
Process steps: Select & Dedupe; Check out and Ship; Harvest to Local Repository; Scanning; Check in and QC; Item available in IA/BHL; Check in, Add link
Systems: SIRIS; Workflow DB; Internet Archive
Data flows: Item level metadata; Initiate workflow; Mark as scanned; URLs in MARC record; Title level MARC; JP2000s + metadata

Page 13: SIL rapid capture

Standardize Process and Data

Common staging area
Metadata Model
  Title level (MARC) metadata
  Item level metadata
    volume, issue, date, barcode
  Page level metadata
    sequence, page number, page type
Common storage area
Common presentation area

Ericsson LM, Can Efficiency be Measured? Stockholm, Sweden, 1946. Smithsonian Institution Libraries
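The three-level metadata model on this slide can be sketched as nested records. The field names for the item and page levels come from the slide; the class names, the `marc_record_id` pointer, the optional/required choices, and the example page types in the comments are assumptions made for illustration, not SIL's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Page:
    sequence: int                # position in scan order
    page_number: Optional[str]   # printed number such as "12" or "iv"; None if unnumbered
    page_type: str               # e.g. "Text" or "Illustration" (example values, assumed)


@dataclass
class Item:
    barcode: str
    volume: Optional[str] = None
    issue: Optional[str] = None
    date: Optional[str] = None
    pages: List[Page] = field(default_factory=list)


@dataclass
class Title:
    marc_record_id: str          # hypothetical pointer to the title-level MARC record
    items: List[Item] = field(default_factory=list)


def pages_in_order(item: Item) -> List[Page]:
    """Return an item's pages sorted by scan sequence."""
    return sorted(item.pages, key=lambda page: page.sequence)
```

Keeping title-level data as a pointer to MARC, rather than copying fields, mirrors the slide's split between catalog metadata and the item and page data collected during scanning.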

Page 14: SIL rapid capture

Automate Metadata Capture & Transformation

Extract title level metadata
  MARC to MARCXML
Extract item level metadata
  From SIRIS SQL db to xml file
Page level metadata
  Interface for easy data entry
File creation and conversion
Upload to staging area

National Cash Register, Annual Report, 1953. Smithsonian Institution Libraries
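A minimal sketch of the item-level extraction step, pulling one row from a SQL database and writing it out as XML. The real SIRIS schema and the XML layout SIL uses are not given in the slides, so the `items` table, its column names, and the `<item>` element structure are hypothetical, and an in-memory SQLite database would stand in for the SIRIS SQL db when trying this out.

```python
import sqlite3
import xml.etree.ElementTree as ET
from typing import Optional

# Item-level fields named on the metadata model slide.
ITEM_FIELDS = ("barcode", "volume", "issue", "date")


def fetch_item(conn: sqlite3.Connection, barcode: str) -> Optional[dict]:
    """Look up one item by barcode in a hypothetical 'items' table."""
    conn.row_factory = sqlite3.Row
    row = conn.execute(
        "SELECT barcode, volume, issue, date FROM items WHERE barcode = ?",
        (barcode,),
    ).fetchone()
    return dict(row) if row is not None else None


def item_row_to_xml(row: dict) -> str:
    """Serialize the item-level fields to a small XML fragment, skipping empty ones."""
    root = ET.Element("item")
    for name in ITEM_FIELDS:
        value = row.get(name)
        if value is not None:
            ET.SubElement(root, name).text = str(value)
    return ET.tostring(root, encoding="unicode")
```

The query uses a bound parameter rather than string formatting, so a barcode read from a scanner or spreadsheet cannot alter the SQL.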

Page 15: SIL rapid capture

In-house workflow with Macaw

[Diagram]
Process steps: Select & Dedupe; Check out and Ship; Temp. Backup to NAS; Scanning; Check in and QC; Item available in IA/BHL; Check in, Add link
Systems: SIRIS; Workflow DB; Internet Archive; Macaw
Macaw: Packages files for transfer; Creates metadata “Bucket”; Transforms images, creates derivatives; Page level metadata added
Data flows: Item level metadata; Initiate workflow; Title level MARC; JP2000s + metadata; .tiffs; URLs in MARC record; Mark as scanned

Page 16: SIL rapid capture

Metadata Collection and Workflow (Macaw)

Page 17: SIL rapid capture

Room for Improvement

Quality
Speed
Embed metadata

Kenwood Bicycle Mfg. Co., Catalogue for 1895, 1895. Smithsonian Institution Libraries

Page 18: SIL rapid capture

Future

Increase throughput
Scan non-book items (MSS)
Scan un-cataloged items
Frictionless repurposing
Output to METS
Islandora
Local delivery interface

Collier’s, October 18, 1952. Smithsonian Institution Libraries
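As a rough illustration of what "Output to METS" could involve, the sketch below maps page-level metadata (sequence, page number, page type) onto a METS physical structMap. The METS namespace and the `TYPE`, `ORDER`, `ORDERLABEL`, and `LABEL` attributes are part of the METS schema; the function name, the choice of which field goes into which attribute, and the omission of the file section are assumptions, so the result is not a complete or validated METS document.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Optional, Tuple

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)

# (sequence, page_number, page_type), as on the metadata model slide
PageTuple = Tuple[int, Optional[str], str]


def pages_to_mets(item_label: str, pages: Iterable[PageTuple]) -> str:
    """Build a skeletal METS document with one <div> per page, in scan order."""
    mets = ET.Element(f"{{{METS_NS}}}mets", {"LABEL": item_label})
    struct_map = ET.SubElement(mets, f"{{{METS_NS}}}structMap", {"TYPE": "physical"})
    item_div = ET.SubElement(
        struct_map, f"{{{METS_NS}}}div", {"TYPE": "item", "LABEL": item_label}
    )
    for sequence, page_number, page_type in sorted(pages, key=lambda p: p[0]):
        attrs = {"TYPE": "page", "ORDER": str(sequence), "LABEL": page_type}
        if page_number:
            # Printed page number, which may differ from scan order.
            attrs["ORDERLABEL"] = page_number
        ET.SubElement(item_div, f"{{{METS_NS}}}div", attrs)
    return ET.tostring(mets, encoding="unicode")
```

A fuller export would also add a file section and link each page `div` to its image files; that part is left out here because the slides do not describe how files are identified.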

Page 19: SIL rapid capture

Thank You! THAT IS ALL.

Keri Thompson

[email protected]

@DigiKeri_SIL