Collecting History: Profiles in Science
Alexa T. McCray
National Library of Medicine
Bethesda, MD
[email protected]@nlm.nih.gov
Stanford University, August 21, 1999

Design and Development of a Digital Library System
Early experiences in digital conversion
Development of Profiles in Science system
Critical role of metadata in system design
– Framework for collection management
– Foundation for Web delivery
– Standards for resource description

Lessons learned from an early Digital Library Project
Digital conversion work begun in 1992
Some 1,500 historical documents (about 40,000 pages)
– mimeographs
– oddly sized documents
– interview transcripts
  – some audio segments
– photographs and other memorabilia

Processing the early Collection
Created index (metadata) records
– templates varied by document type
– included topical index terms
– compensated for poor OCR quality
Created digital master copy of the documents (TIFF)
– subsequently derived first GIF and then PDF

Profiles in Science
Profiles in Science Web Site launched in September 1998
Archival collections of eminent biomedical scientists donated to the NLM
– Include text, audio, still images, video
– Include books, journal volumes, pamphlets, diaries, letters, manuscripts, photographs
– Metadata for each item in the collection

Profiles in Science
Educational and research applications
– Scholars of the history of medicine and the biological sciences
– Students gain an appreciation of the methods and success of science
Allows anyone to look “behind the scenes” at how science is done

Profiles Collections
Two collections currently available
– Oswald T. Avery, Joshua Lederberg
Careful attention to copyright and intellectual property concerns
Electronic “exhibit”
– Initial set of digitized items for exhibit
  – Papers continue to be digitized
– Full paper collections available for scholarly use at NLM

Wide Range of Document Types
Collection-specific Categories of Information
Contextualizing the Content
TIFF and PDF Documents
Zoom PDF for Detail
TIFF & JPEG Documents
TIFF & JPEG Photographs
High Resolution TIFF
Streaming Video
Search across Full Data Set

Experiment: Profiled Scientist as Interactive User
Digital documents available before release to the public
Online annotation capability
Annotations complement original document
– Give additional detail, set document in context, add keywords

Sample Annotation

Design of the Profiles in Science System
A single underlying system that is designed to handle the entire life-cycle of a large-scale digital conversion project
Principles
– modularity
– adherence to standards
– extensibility
Metadata forms core of system

System Architecture

Metadata-driven Document Conversion
Interpret metadata in its broadest sense
– data about data
Use metadata to drive the entire system
The metadata record is the basic unit in the system, managing the
– digitization process
– display and organization of the data
– network-based resource discovery

Metadata: Framework for Collection Management
Metadata entry system manages all aspects of digitization process
– Unique identifiers bind digital master files, Web-derivatives, and metadata records
– Enforces quality control (pull-down menus, validation, error messages)
– Reports that manage workflow
– Security measures
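
The binding of files to records through a shared identifier, together with entry-time validation, might look roughly like the following. This is only a minimal sketch: the class `ItemRecord`, the field names, the file-naming scheme, and the vocabulary in `ALLOWED_TYPES` are all hypothetical and are not taken from the Profiles in Science system.

```python
from dataclasses import dataclass, field

# Hypothetical controlled vocabulary, standing in for a pull-down menu.
ALLOWED_TYPES = {"letter", "photograph", "manuscript", "diary"}


@dataclass
class ItemRecord:
    """One metadata record; item_id ties it to its master and derivative files."""
    item_id: str
    title: str
    resource_type: str
    derivative_formats: list = field(default_factory=lambda: ["pdf", "gif"])

    def master_file(self) -> str:
        # Assumed naming scheme: the digital master shares the record's identifier.
        return f"{self.item_id}.tif"

    def derivative_files(self) -> list:
        # Web derivatives reuse the same identifier with a different extension.
        return [f"{self.item_id}.{ext}" for ext in self.derivative_formats]


def validate(record: ItemRecord) -> list:
    """Return a list of error messages; an empty list means the record passes."""
    errors = []
    if not record.item_id.isalnum():
        errors.append("item_id must be alphanumeric")
    if not record.title.strip():
        errors.append("title is required")
    if record.resource_type not in ALLOWED_TYPES:
        errors.append(f"unknown resource_type: {record.resource_type}")
    return errors
```

Because every file name is computed from `item_id`, a record can always be traced to its master and derivatives, and a workflow report could be built by listing the records for which `validate` returns errors.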

Metadata: Display and Organization of the Data
Series of programs generate HTML from metadata RDBMS
– Include consistency checking
Programs generate alternative views
– alphabetical, chronological, resource type, content area
Filtering mechanisms for access management
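
As an illustration of generating alternative views and HTML from the same metadata, here is a small sketch. It assumes the records have already been read from the database into dictionaries. The sample data, the `public` flag used as an access filter, and the function names are invented for the example.

```python
from html import escape

# Hypothetical records, standing in for rows read from the metadata database.
RECORDS = [
    {"id": "AB0002", "title": "Annotated notebook page", "date": "1944-02-01",
     "type": "manuscript", "public": True},
    {"id": "AB0001", "title": "Letter to a colleague", "date": "1943-05-26",
     "type": "letter", "public": True},
    {"id": "AB0003", "title": "Draft memo", "date": "1950-01-15",
     "type": "letter", "public": False},
]

# One sort key per view; all views are derived from the same records.
SORT_KEYS = {
    "alphabetical": lambda r: r["title"].lower(),
    "chronological": lambda r: r["date"],
    "resource_type": lambda r: (r["type"], r["title"].lower()),
}


def build_view(records, view):
    """Return publicly visible records ordered for the requested view."""
    visible = [r for r in records if r["public"]]  # access filter
    return sorted(visible, key=SORT_KEYS[view])


def render_html(records):
    """Render records as an HTML list, escaping text taken from the metadata."""
    items = [
        f'<li><a href="{escape(r["id"])}.pdf">{escape(r["title"])}</a> ({escape(r["date"])})</li>'
        for r in records
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

Calling `render_html(build_view(RECORDS, "chronological"))` yields one page; swapping the view name yields another without touching the underlying records.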

Metadata: Network-based Resource Discovery
Dublin Core elements derived from metadata entry system
– simplicity
– semantic interoperability
– international consensus
– modularity
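
Deriving Dublin Core elements from an internal record can be sketched as a simple field mapping. The internal field names in `FIELD_TO_DC` are assumptions made for illustration. The target names (title, creator, date, type, format, identifier, subject) are standard Dublin Core elements, and the `DC.element` form of the meta tag name is a common convention for embedding them in HTML.

```python
from html import escape

# Assumed mapping from hypothetical internal field names to Dublin Core elements.
FIELD_TO_DC = {
    "title": "title",
    "author": "creator",
    "item_date": "date",
    "resource_type": "type",
    "file_format": "format",
    "item_id": "identifier",
    "index_terms": "subject",
}


def to_dublin_core(record):
    """Return (element, value) pairs; list-valued fields yield repeated elements."""
    pairs = []
    for field_name, element in FIELD_TO_DC.items():
        value = record.get(field_name)
        if value is None:
            continue  # fields absent from the record are simply omitted
        values = value if isinstance(value, list) else [value]
        pairs.extend((element, v) for v in values)
    return pairs


def to_meta_tags(pairs):
    """Encode the pairs as HTML meta tags using the DC.element naming convention."""
    return "\n".join(
        f'<meta name="DC.{element}" content="{escape(str(value))}">'
        for element, value in pairs
    )
```

Internal fields that have no Dublin Core counterpart are left out of the output, which keeps the published description simple while the full record stays in the entry system.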

Sample Metadata Record on Web Site

Digital Conversion Projects
Conversion projects involve extensive human and computational resources
Therefore, it is important to design systems that
– Are extensible
– Automate processes whenever possible
– Adhere to standards
– Ensure the persistence of the data

Profiles in Science
http://profiles.nlm.nih.gov/