Digitizing Images Stephen Chapman Weissman Preservation Center Harvard University Library 30 September 2003

Digitizing Images

Stephen Chapman
Weissman Preservation Center
Harvard University Library

30 September 2003

Motivations

Necessity
More and more material is being produced in digital form; more and more of our users want access to such materials.

Excellence
Because we value the highest levels of teaching and research we will have to change our way of doing things.

Innovation
Technology enables uses not possible with analog formats. Obligations are to create coherent, integrated collections and to deliver them with tools that support innovative professional practice.

Why images?
“The special opportunities presented by digital technologies constitute the most fundamental development in the potential for increased access and flexibility of use since the advent of photographic reproduction.”

Neil L. Rudenstine, April 2001

Digital technologies

World Wide Web
“This is what the Internet does well.”

Relational databases
Permit flexible approaches to cataloging: hierarchical structures needed to manage and describe multiple versions of a particular item, including surrogates.

Digital cameras and scanners
Proven capability to create digital surrogates “that faithfully represent the originals in tone and color and provide a level of detail that would enable advanced study.”

David Remington

Products of image digitization

Images
The data for scholars to study. A single version of an image rarely suffices to meet all needs (compare, study, print).

Descriptions of images (text)
The “metadata” the user needs to locate and interpret images.

Descriptions of ownership and rights (text)
The metadata the owner uses to disclose terms and conditions associated with using images.

Source materials
The principal assets valued by owners and users.

Infrastructure

Catalogs
Systems for comprehensive and controlled searching.

Persistent naming
Means to ensure image management and reliable access.

Repository
A trustworthy place to manage images over time.

Delivery
Systems to deliver digital images to authorized users.

Workflow

Stages: planning, selection, prep, imaging, cataloging, deposit, linking

Diagram labels: project, source, surrogates

Project planning
There are no absolute rules for creating good collections, objects or metadata. Every project is unique and each has its own goals. The key to a successful project is not to follow any particular path, but to think strategically and make wise choices.

IMLS Framework of Guidance

Selection
For the kind of pictures we collect, individual public domain analysis is expensive. [T]here is only one practical methodology: accession policy must choose the most conservative boundary as a functional bright line that separates what is acceptable and what is not acceptable.

Robert A. Baron

Prep
Whenever there is handling of original collections, there is a need for the application of conservation knowledge and practice.

Library of Congress NDLP and Conservation Division

Cataloging practices

Local, but…
“Picture catalogs still tend to be incomplete, idiosyncratic, and isolated.”

Helena Zinkham

Movement to consolidate: union catalogs
AMICO, ArtSTOR, VIA, UCAI (UCSD, ArtSTOR, Harvard)

Emerging consensus and best practice
VRA Guide, “Cataloguing Cultural Objects”
http://www.vraweb.org/CCOweb/

Data standards “promote sharing, improve the management of content, and reduce redundancy of effort.”

Descriptive metadata standards

Specific to topics or disciplines
Biology or art

Specific to kinds of materials
Moving pictures, encoded texts

Specific to support particular functions
Discovery, rights management, presentation

Descriptive metadata standards

Which “information pieces”
Data dictionaries (e.g., for OLIVIA)

CDWA, VRA Core, Dublin Core

How information is formed
Content standards and vocabularies:

VIA Working Group has identified over 20

How information is encoded for processing
Syntax (e.g., MARC, RDF)

Virtually no standards govern all of these aspects of metadata.

http://hul.harvard.edu/ois/systems/via/via_standards.html

Key decisions

Scope
Which catalog(s)? … HOLLIS, OASIS, VIA used at Harvard

Item- or group-level cataloging

Extent
Amount of cataloging (project and program policies)

Digital image production

Lights, camera…
Visual literacy and technical skill still absolutely critical

Pixels!
“The more one looks at image quality and ways to clearly define it, the more parameters have to be taken into account.”

Frey and Reilly
- rendering intent
- tone reproduction
- detail and edge reproduction
- color reproduction
- noise

Digital image standards

Formats
DLF Global Digital Format Registry

Quality
I3A/IT10 Electronic Still Picture Imaging Committee

ISO speed, resolution (MTF), OECF, noise and color measurement

ISO 3664:2000 Viewing conditions

Technical metadata for digital still images
NISO Z39.87-2002 AIIM 20-2002 (governed by LC)

Digital imaging practices

Topics: masters, delivery images, quality control, admin metadata

Masters

“support intended current and likely future use”(IMLS Framework)

archival masters (optimized for processing, not viewing)

production masters (optimized for automation)

no compression for grayscale and color images

TIFF = format of choice

Digital imaging practices

Quality control

calibrated devices

calibrated environment

targets

checksums

validation software at repository
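
The checksum step listed above can be sketched as follows. This is a minimal illustration, not the procedure any particular repository uses: the slides do not name a checksum algorithm, so SHA-256 is an assumption here, and the function names `file_checksum` and `verify_checksum` are hypothetical. The idea is to record a digest when the master is produced and recompute it after transfer or deposit to confirm the file has not changed.

```python
import hashlib
from pathlib import Path


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading in chunks so large masters
    do not have to fit in memory. The algorithm choice is an assumption."""
    digest = hashlib.new(algorithm)
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path, recorded_hex, algorithm="sha256"):
    """Return True when the file's current digest matches the recorded one."""
    return file_checksum(path, algorithm) == recorded_hex.lower()
```

A mismatch only says that the bytes differ from what was recorded; it does not say whether the image itself meets the quality targets, which is what the calibration and target checks address.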

Digital imaging practices

Admin metadata

“supports management of resources” (R. Wendler)
• ownership
• access restrictions
• technical attributes of files

XML format

produced and deposited in addition to images
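
As a rough sketch of what an administrative metadata file in XML might look like, the example below builds a small record covering the three areas named above. Every element name (`adminMetadata`, `ownership`, `accessRestrictions`, `technical`, and so on) and every sample value is hypothetical; this is not the DRS deposit schema or the NISO Z39.87 data dictionary, only an illustration of the shape of such a record.

```python
import xml.etree.ElementTree as ET


def build_admin_record(owner, access, file_format, width, height, bits_per_pixel):
    """Build an illustrative admin metadata record and return it as an XML string.
    All element names are hypothetical, not taken from any real schema."""
    record = ET.Element("adminMetadata")
    ET.SubElement(record, "ownership").text = owner
    ET.SubElement(record, "accessRestrictions").text = access

    # Technical attributes of the image file.
    technical = ET.SubElement(record, "technical")
    ET.SubElement(technical, "format").text = file_format
    ET.SubElement(technical, "pixelWidth").text = str(width)
    ET.SubElement(technical, "pixelHeight").text = str(height)
    ET.SubElement(technical, "bitsPerPixel").text = str(bits_per_pixel)

    return ET.tostring(record, encoding="unicode")


# Sample values are invented for illustration only.
xml_text = build_admin_record(
    owner="Example Library",
    access="Restricted to authorized users",
    file_format="TIFF",
    width=4000,
    height=2800,
    bits_per_pixel=24,
)
```

Keeping the record in a separate file, deposited alongside the image, means the repository can read ownership and access terms without opening the image itself.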

Deposit
DRS preservation services provide active oversight to ensure an indefinite lifespan for objects deposited in approved formats. "Oversight" involves monitoring file formats, assessing the vulnerability of digital collections, and transforming files to maintain usability.

HUL DRS Policy Guide

Repository Storage Cost Gaps, Photographs, Example 1
Harvard Depository and OCLC Digital Archive (2003)

$ per photograph, per year

35mm negative, HD film vault: $0.003
24-bit PCD (2) (10.7 MB), OCLC (>1,000 GB rate): $0.16
24-bit TIFF (2) (32 MB), OCLC (>1,000 GB rate): $0.47

Current cost gap: digital 53-157X more expensive than film @ OCLC, 18-52X more expensive at Harvard (DRS)

Repository Storage Cost Gaps, Photographs, Example 2
Harvard Depository and OCLC Digital Archive (2003)

$ per photograph, per year

4 x 5 negative, HD film vault: $0.016
24-bit TIFF (229 MB), OCLC (>1,000 GB rate): $3.35

Current cost gap: digital 209X more expensive than film @ OCLC, 70X more expensive at Harvard (DRS)
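
The OCLC cost-gap multiples in the two examples follow directly from the per-photograph figures: each is the annual digital storage cost divided by the annual film vault cost. The short sketch below reproduces that arithmetic; the constant and function names are illustrative only. The Harvard (DRS) multiples are not recomputed because the slides do not give the DRS per-photograph costs.

```python
# Annual storage cost per photograph in USD, as quoted in the two examples.
FILM_35MM_HD_VAULT = 0.003   # 35mm negative, HD film vault
PCD_10_7MB_OCLC = 0.16       # 24-bit PCD (10.7 MB), OCLC
TIFF_32MB_OCLC = 0.47        # 24-bit TIFF (32 MB), OCLC
FILM_4X5_HD_VAULT = 0.016    # 4 x 5 negative, HD film vault
TIFF_229MB_OCLC = 3.35       # 24-bit TIFF (229 MB), OCLC


def cost_gap(digital_cost, film_cost):
    """Return how many times more expensive digital storage is than film,
    rounded to the nearest whole multiple."""
    return round(digital_cost / film_cost)


example_1_low = cost_gap(PCD_10_7MB_OCLC, FILM_35MM_HD_VAULT)    # 53
example_1_high = cost_gap(TIFF_32MB_OCLC, FILM_35MM_HD_VAULT)    # 157
example_2 = cost_gap(TIFF_229MB_OCLC, FILM_4X5_HD_VAULT)         # 209
```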

Closing cost gaps for repository storage

Compression

Investigate risks associated with using bit-for-bit lossless compression instead of uncompressed formats as preservation masters.

Cost metrics

Bill owners by a unit other than storage size (that is, other than per GB) to sustain the costs of running repository and preservation services.

Subsidies

Create common-good repositories and services (“safe havens”) with secure, sustainable funding lines for items that meet defined criteria.

Hybrid approach viable for still images

Digital Masters

Deposit digital master to repository, pay for annual maintenance regardless of use.

Repurpose digital masters: produce delivery images, in analog or digital formats, in advance and/or upon request.

Analog Masters

Deposit analog (e.g., film) master to repository, pay for annual maintenance regardless of use.

Repurpose analog masters: produce delivery images, in analog or digital formats, in advance and/or upon request.

Lessons learned
Building ArtSTOR into a trusted repository … will require not only time and resources, but also collegiality and the active participation of individuals from academic institutions, museums, libraries, and research centers; specialists in imaging and in building databases; others experienced in the creation of digital resources; experts in intellectual property rights; and wise generalists.

One clear conclusion is that working on this project inspires humility!

William G. Bowen, President
Andrew W. Mellon Foundation

Resources

Your colleagues!

• Mellon Foundation, 2001 President’s Report, “ArtSTOR”

• Harvard University Library, LDI Program Origins

• David Remington, “HCL-DIG General Imaging Practice”

• Helena Zinkham, “Bridges & Whirlpools: Best Access Practices for Pictures”