Digitizing Images
Stephen Chapman, Weissman Preservation Center, Harvard University Library
30 September 2003
Motivations
Necessity: More and more material is being produced in digital form; more and more of our users want access to such materials.
Excellence: Because we value the highest levels of teaching and research we will have to change our way of doing things.
Innovation: Technology enables uses not possible with analog formats. Obligations are to create coherent, integrated collections and to deliver them with tools that support innovative professional practice.
Why images?
“The special opportunities presented by digital technologies constitute the most fundamental development in the potential for increased access and flexibility of use since the advent of photographic reproduction.”
Neil L. Rudenstine, April 2001
Digital technologies
World Wide Web: “This is what the Internet does well.”
Relational databases: Permit flexible approaches to cataloging: hierarchical structures needed to manage and describe multiple versions of a particular item, including surrogates.
Digital cameras and scanners: Proven capability to create digital surrogates “that faithfully represent the originals in tone and color and provide a level of detail that would enable advanced study.” (David Remington)
Products of image digitization
Images: The data for scholars to study. A single version of an image rarely suffices to meet all needs (compare, study, print).
Descriptions of images (text): The “metadata” the user needs to locate and interpret images.
Descriptions of ownership and rights (text): The metadata the owner uses to disclose terms and conditions associated with using images.
Source materials: The principal assets valued by owners and users.
Infrastructure
Catalogs: Systems for comprehensive and controlled searching.
Persistent naming: Means to ensure image management and reliable access.
Repository: A trustworthy place to manage images over time.
Delivery: Systems to deliver digital images to authorized users.
Project planning
There are no absolute rules for creating good collections, objects or metadata. Every project is unique and each has its own goals. The key to a successful project is not to follow any particular path, but to think strategically and make wise choices.
IMLS Framework of Guidance
Selection
For the kind of pictures we collect, individual public domain analysis is expensive. [T]here is only one practical methodology: accession policy must choose the most conservative boundary as a functional bright line that separates what is acceptable and what is not acceptable.
Robert A. Baron
Prep
Whenever there is handling of original collections, there is a need for the application of conservation knowledge and practice.
Library of Congress NDLP and Conservation Division
Cataloging practices
Local, but… “Picture catalogs still tend to be incomplete, idiosyncratic, and isolated.” (Helena Zinkham)
Movement to consolidate (union catalogs): AMICO, ArtSTOR, VIA, UCAI (UCSD, ArtSTOR, Harvard)
Emerging consensus and best practice: VRA Guide, “Cataloguing Cultural Objects,” http://www.vraweb.org/CCOweb/
Data standards “promote sharing, improve the management of content, and reduce redundancy of effort.”
Descriptive metadata standards
Specific to topics or disciplines: Biology or art
Specific to kinds of materials: Moving pictures, encoded texts
Specific to support particular functions: Discovery, rights management, presentation
Descriptive metadata standards
Which “information pieces”: Data dictionaries (e.g., for OLIVIA); CDWA, VRA Core, Dublin Core
How information is formed: Content standards and vocabularies (the VIA Working Group has identified over 20)
How information is encoded for processing: Syntax (e.g., MARC, RDF)
Virtually no standards govern all of these aspects of metadata.
http://hul.harvard.edu/ois/systems/via/via_standards.html
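The three layers above (which elements, how values are formed, how the record is encoded) can be illustrated with a small sketch. It takes a handful of Dublin Core elements and serializes them as XML. The element names and namespace are those of the Dublin Core element set; the `record` wrapper, the function name, and every sample value are hypothetical and do not come from VIA, OLIVIA, or any Harvard catalog.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"


def to_dublin_core_xml(fields):
    """Serialize a dict of Dublin Core element names to an XML string.

    The <record> wrapper is a hypothetical container chosen for this sketch.
    """
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("record")
    for name, value in fields.items():
        ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(record, encoding="unicode")


# Hypothetical sample values, for illustration only.
sample = {
    "title": "View of a library reading room",
    "creator": "Unknown photographer",
    "format": "image/tiff",
    "rights": "Rights status not evaluated",
}
print(to_dublin_core_xml(sample))
```

The choice of elements is the "information pieces" layer, the wording of each value is where content standards and vocabularies would apply, and the XML serialization stands in for the syntax layer.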
Key decisions
Scope: Which catalog(s)? …HOLLIS, OASIS, VIA used at Harvard
Item- or group-level cataloging
Extent: Amount of cataloging (project and program policies)
Digital image production
Lights, camera… Visual literacy and technical skill are still absolutely critical.
Pixels! “The more one looks at image quality and ways to clearly define it, the more parameters have to be taken into account.” (Frey and Reilly)
- rendering intent
- tone reproduction
- detail and edge reproduction
- color reproduction
- noise
Digital image standards
Formats: DLF Global Digital Format Registry
Quality: I3A/IT10 Electronic Still Picture Imaging Committee
ISO speed, resolution (MTF), OECF, noise and color measurement
ISO 3664:2000, Viewing conditions
Technical metadata for digital still images: NISO Z39.87-2002 AIIM 20-2002 (governed by LC)
Digital imaging practices
masters | delivery images | quality control | admin metadata
“support intended current and likely future use” (IMLS Framework)
archival masters (optimized for processing, not viewing)
production masters (optimized for automation)
no compression for grayscale and color images
TIFF = format of choice
Digital imaging practices
Quality control
calibrated devices
calibrated environment
targets
checksums
validation software at repository
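The checksum step in the list above can be sketched as follows: compute a digest when the file is produced, record it, and recompute it at the repository to confirm the file arrived unchanged. This is a minimal illustration using Python's standard `hashlib`; the function names and the choice of SHA-256 are assumptions for this sketch, not the procedure or algorithm used by the DRS.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_deposit(path, expected_hex, algorithm="sha256"):
    """Return True if the file's current digest matches the recorded one."""
    return file_checksum(path, algorithm) == expected_hex.lower()
```

In use, the producer would store the value of `file_checksum("master.tif")` alongside the image (the file name is hypothetical), and the repository would call `verify_deposit` with that stored value after transfer.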
Digital imaging practices
Admin metadata
“supports management of resources” (R. Wendler)
• ownership
• access restrictions
• technical attributes of files
XML format
produced and deposited in addition to images
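As a rough illustration of an XML administrative metadata record covering the three areas listed above (ownership, access restrictions, technical attributes of files), the sketch below builds one with Python's standard `xml.etree.ElementTree`. Every element name, the function name, and the sample values are hypothetical; this is not the DRS deposit schema and not the NISO Z39.87 data dictionary.

```python
import xml.etree.ElementTree as ET


def build_admin_metadata(owner, access, file_name, mime_type, size_bytes, checksum):
    """Build a hypothetical admin metadata record and return it as XML text."""
    root = ET.Element("adminMetadata")
    ET.SubElement(root, "owner").text = owner
    ET.SubElement(root, "accessRestriction").text = access
    # Technical attributes of the file are grouped under one element.
    technical = ET.SubElement(root, "technical")
    ET.SubElement(technical, "fileName").text = file_name
    ET.SubElement(technical, "mimeType").text = mime_type
    ET.SubElement(technical, "sizeBytes").text = str(size_bytes)
    ET.SubElement(technical, "checksum", algorithm="sha256").text = checksum
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample values, for illustration only.
record = build_admin_metadata(
    owner="Example Library",
    access="restricted to authorized users",
    file_name="example_0001.tif",
    mime_type="image/tiff",
    size_bytes=32_000_000,
    checksum="0" * 64,
)
print(record)
```

Keeping the record in a separate XML file, rather than only inside the image header, matches the slide's point that admin metadata is produced and deposited in addition to the images.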
Deposit
DRS preservation services provide active oversight to ensure an indefinite lifespan for objects deposited in approved formats. "Oversight" involves monitoring file formats, assessing the vulnerability of digital collections, and transforming files to maintain usability.
HUL DRS Policy Guide
Repository Storage Cost Gaps, Photographs, Example 1: Harvard Depository and OCLC Digital Archive (2003)
$ per photograph, per year
OCLC (>1,000 GB rate), 24-bit PCD (2) (10.7 MB): $0.16
OCLC (>1,000 GB rate), 24-bit TIFF (2) (32 MB): $0.47
HD film vault, 35mm negative: $0.003
Current cost gap: digital is 53-157X more expensive than film at OCLC, 18-52X more expensive at Harvard (DRS)
Repository Storage Cost Gaps, Photographs, Example 2: Harvard Depository and OCLC Digital Archive (2003)
$ per photograph, per year
OCLC (>1,000 GB rate), 24-bit TIFF (229 MB): $3.35
HD film vault, 4 x 5 negative: $0.016
Current cost gap: digital is 209X more expensive than film at OCLC, 70X more expensive at Harvard (DRS)
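The OCLC cost-gap multiples in both examples follow directly from the per-photograph figures: divide the annual digital storage cost by the annual film storage cost. The short sketch below reproduces that arithmetic from the slide's numbers. The function and variable names are illustrative, and the Harvard (DRS) multiples are not recomputed because the slides do not give the underlying DRS per-photograph costs.

```python
def cost_gap(digital_cost_per_year, film_cost_per_year):
    """Return how many times more expensive digital storage is than film."""
    return digital_cost_per_year / film_cost_per_year


# Example 1: OCLC digital storage versus a 35mm negative in the HD film vault.
gap_pcd = cost_gap(0.16, 0.003)    # 24-bit PCD, 10.7 MB
gap_tiff = cost_gap(0.47, 0.003)   # 24-bit TIFF, 32 MB

# Example 2: OCLC digital storage versus a 4 x 5 negative in the HD film vault.
gap_large_tiff = cost_gap(3.35, 0.016)  # 24-bit TIFF, 229 MB

print(round(gap_pcd), round(gap_tiff), round(gap_large_tiff))  # prints: 53 157 209
```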
Closing cost gaps for repository storage
Compression
Investigate risks associated with using bit-for-bit lossless compression instead of uncompressed formats as preservation masters.
Cost metrics
Bill owners by a unit other than size (size-based billing is, e.g., per GB) to sustain the costs of running repository and preservation services.
Subsidies
Create common-good repositories and services (“safe havens”) with secure, sustainable funding lines for items that meet defined criteria.
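On the compression point above, "bit-for-bit lossless" means that decompressing the stored file yields exactly the original bytes, which can be checked by comparing digests before and after a round trip. The sketch below shows such a check. It uses `zlib` from the Python standard library purely as a stand-in lossless codec on synthetic bytes; it is an assumption for illustration, not a recommendation of a particular image format or compression scheme, and it does not address the longer-term risks the slide asks to be investigated.

```python
import hashlib
import zlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def lossless_round_trip(data: bytes, level: int = 9) -> dict:
    """Compress and decompress data, then report sizes and byte-level fidelity."""
    compressed = zlib.compress(data, level)
    restored = zlib.decompress(compressed)
    return {
        "original_bytes": len(data),
        "compressed_bytes": len(compressed),
        "bit_for_bit_identical": sha256_hex(restored) == sha256_hex(data),
    }


# Synthetic, highly repetitive bytes standing in for image data (not a real image).
sample = bytes(range(256)) * 4096
report = lossless_round_trip(sample)
print(report)
```

Real photographic data usually compresses far less than this repetitive sample, so the size reduction shown here says nothing about the savings a repository could expect.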
Hybrid approach viable for still images
Digital Masters
Deposit digital master to repository, pay for annual maintenance regardless of use.
Repurpose digital masters: produce delivery images, in analog or digital formats, in advance and/or upon request.
Analog Masters
Deposit analog (e.g., film) master to repository, pay for annual maintenance regardless of use.
Repurpose analog masters: produce delivery images, in analog or digital formats, in advance and/or upon request.
Lessons learned
Building ArtSTOR into a trusted repository … will require not only time and resources, but also collegiality and the active participation of individuals from academic institutions, museums, libraries, and research centers; specialists in imaging and in building databases; others experienced in the creation of digital resources; experts in intellectual property rights; and wise generalists.
One clear conclusion is that working on this project inspires humility!
William G. Bowen, President, Andrew W. Mellon Foundation
Resources
Your colleagues!
• Mellon Foundation, 2001 President’s Report, “ArtSTOR”
• Harvard University Library, LDI Program Origins
• David Remington, “HCL-DIG General Imaging Practice”
• Helena Zinkham, “Bridges & Whirlpools: Best Access Practices for Pictures”