Library of Congress Metadata Landscape
Sally H. [email protected]
Content
Library of Congress perspective on
• Descriptive metadata now
• Descriptive metadata evolution
• Broader metadata concerns
LC metadata needs
Same problem everyone has:
Many types of resources
• Books, journals, maps, audio, moving image, still image, artifacts, electronic
Many possible levels of access
• Collection, item, analytic, cut, etc.
Many items
• 125+ million non-electronic
• 3+ million digital resources
Cataloging for electronic resources
Linking to electronic resources
LC service perspective
• Coherence and consistency (as much as possible)
• Explainable to the end user
Primary access tools at LC
Online catalog            content      tagging
Full level cataloging     AACR         MARC 21
Minimal level             AACR         MARC 21
Initial Bib. control      AACR-like    MARC 21
Some collections represented by collection level records in catalog connect to finding aid tools
Finding aid tools at LC
Finding aids                        local      EAD        Various collections of manuscripts, music, photographs
SONIC catalog                       AACR-like  MARC-like  Sound recording collections
PPOC catalog                        AACR       MARC 21    Photograph collections
InQuery                             mixed      internal   Digital conversion collections
Indexing and abstracting services                         Serials
Current LC cataloging “feeds”
• Vendor records (MARC 21)
• Copy from OCLC, etc. (MARC 21)
• Publisher records (ONIX)
• Other for special materials
• Metadata in digital objects
• Metadata with digital objects (future)
LC links to electronic resources
• URIs or equivalent in catalog records and finding aids
• handle server
• OpenURL (experimentation)
Questions
• Is this coherent and consistent?
• Is it scalable to electronic resources?
• Do all resources need the same kind of treatment?
• How about proliferating metadata schemas?
• How do we maintain evolutionary pathway and standardization?
We see content diversity
Content = the data (title, subject term, etc.)
• AACR data, EAD data, DC data, ONIX data, …
• Different use of content rules: Does MARC main entry = DC creator = ONIX contributor?
• More types – administrative, structural, product data, rights management
• Global library community convergence on AACR for descriptive metadata?
We see markup diversity
Markup = data tagging
• MARC 21 tags, DC tags, ONIX tags, MAB tags, UNIMARC tags
• HTML tags
• EAD DTD tags
• (XML tag sets easy to establish)
• Global library community convergence on MARC 21 and EAD?
We see different structures
Structure = record “arrangement”
• ISO 2709
• Microsoft Access
• DTDs, Schemas
• SGML, XML, HTML, ?ML family
• Convergence on XML, Schemas
And at LC we have
• 13,000,000 MARC 21 bibliographic records in primary catalog
• 5,000,000 MARC 21 name authority records
• 300,000 MARC 21 subject records
• 350 trained catalogers
• Integrated Library System
Descriptive metadata evolution
Need to take advantage of XML
• Establish standard MARC 21 in an XML structure
Need simpler (but compatible) alternatives
• Development of MODS
Need interoperability with different schemas
• Assemble coordinated set of tools
Need continuity with current data
• Provide flexible transition options
MARC 21 evolution to XML
MARC 21 (2709)
MARC 21 (2709) records
• Highly developed semantic content
• Installed base of 1000s of MARC 21 systems
• Over 1,000,000,000 MARC 21 records in local and network systems
• Accessible to 100s of Z39.50 clients
• Thousands of librarians who “speak” MARC 21
MARC 21 (2709) record (machine view)
00967cam 2200277 a 4500 001000800000005001700008008004100025020005300229040001800282050002400312082002100336100003000357245007400387260004400461300003500505440001200540500002000552650004200572651002500614
347139419990429094819.1931129s1994 wauab 001 0 eng a 93047676 a0898863872 (acid-free, recycled paper) :c$14.95 aDLCcDLCcDLC 00aGV1046.G3bG47 199400a796.6/4/09432201 aSlavinski, Nadine,d1968-10aGermany by bike :b20 tours geared for discovery /cNadine Slavinski. aSeattle, Wash. :bMountaineers,cc1994. a238 p. :bill., maps ;c22 cm. 0aBy bike aIncludes index. 0aBicycle touringzGermanyxGuidebooks.
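The machine view above is easier to read once the ISO 2709 layout is spelled out: a 24-character leader, a directory of 12-character entries (3-character tag, 4-digit field length, 5-digit starting offset), then the field data. The following Python sketch is illustrative only and is not LC software. The function names are hypothetical, and it assumes ASCII data so that character counts equal byte counts. It builds a tiny record and parses it back; the first two directory entries it produces, 001000800000 and 005001700008, match the start of the directory on the slide.

```python
# Minimal ISO 2709 sketch (assumption: ASCII-only data, so characters == bytes).
FIELD_TERMINATOR = "\x1e"
RECORD_TERMINATOR = "\x1d"
SUBFIELD_DELIMITER = "\x1f"


def build_record(fields):
    """Serialize (tag, body) pairs into leader + directory + data."""
    directory, data, offset = "", "", 0
    for tag, body in fields:
        chunk = body + FIELD_TERMINATOR
        # Directory entry: tag (3), field length (4), starting offset (5).
        directory += f"{tag}{len(chunk):04d}{offset:05d}"
        data += chunk
        offset += len(chunk)
    base = 24 + len(directory) + 1      # leader + directory + its terminator
    total = base + len(data) + 1        # plus the record terminator
    leader = f"{total:05d}cam  22{base:05d} a 4500"
    return leader + directory + FIELD_TERMINATOR + data + RECORD_TERMINATOR


def parse_record(raw):
    """Return the leader and a list of (tag, body) pairs."""
    leader = raw[:24]
    base = int(leader[12:17])           # base address of data
    directory = raw[24:base - 1]        # exclude the directory terminator
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        begin = base + start
        fields.append((tag, raw[begin:begin + length - 1]))  # drop terminator
    return leader, fields


def split_subfields(body):
    """Split a variable data field into its indicators and (code, value) pairs."""
    indicators, *parts = body.split(SUBFIELD_DELIMITER)
    return indicators, [(part[0], part[1:]) for part in parts]


if __name__ == "__main__":
    sample = [
        ("001", "3471394"),
        ("005", "19990429094819.1"),
        ("245", "10\x1faGermany by bike :\x1fb20 tours geared for discovery /"),
    ]
    leader, fields = parse_record(build_record(sample))
    print(leader)
    for tag, body in fields:
        print(tag, split_subfields(body) if tag >= "010" else body)
```

The invisible terminator and delimiter characters are the reason the machine view runs together on the slide.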
MARCXML - MARC 21 in XML
MARCXML record
• XML exact equivalent of MARC (2709) record
• Lossless/roundtrip conversion to/from MARC 21 record
• Simple flexible XML schema, no need to change when MARC 21 changes
• Presentations using XML stylesheets
• Converters available from LC, open source
• LC using with OAI, METS, ZING
• Adopted by OAI to replace oai_marc
MARC21 (2709) to MARCXML
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00967cam 2200277 a 4500</leader>
  <controlfield tag="001">3471394</controlfield>
  <controlfield tag="005">19990429094819.1</controlfield>
  <controlfield tag="008">931129s1994 wauab 001 0 eng </controlfield>
  <datafield tag="020" ind1=" " ind2=" ">
    <subfield code="a">0898863872 (acid-free, recycled paper) :</subfield>
    <subfield code="c">$14.95</subfield>
  </datafield>
  <datafield tag="040" ind1=" " ind2=" ">
    <subfield code="a">DLC</subfield>
    <subfield code="c">DLC</subfield>
    <subfield code="d">DLC</subfield>
  </datafield>
  <datafield tag="050" ind1="0" ind2="0">
    <subfield code="a">GV1046.G3</subfield>
    <subfield code="b">G47 1994</subfield>
  </datafield>
  <datafield tag="082" ind1="0" ind2="0">
    <subfield code="a">796.6/4/0943</subfield>
    <subfield code="2">20</subfield>
  </datafield>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Slavinski, Nadine,</subfield>
    <subfield code="d">1968-</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Germany by bike :</subfield>
    <subfield code="b">20 tours geared for discovery /</subfield>
    <subfield code="c">Nadine Slavinski.</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="a">Seattle, Wash. :</subfield>
    <subfield code="b">Mountaineers,</subfield>
    <subfield code="c">c1994.</subfield>
  </datafield>
  <datafield tag="300" ind1=" " ind2=" ">
    <subfield code="a">238 p. :</subfield>
    <subfield code="b">ill., maps ;</subfield>
    <subfield code="c">22 cm.</subfield>
  </datafield>
  <datafield tag="440" ind1=" " ind2="0">
    <subfield code="a">By bike</subfield>
  </datafield>
  <datafield tag="500" ind1=" " ind2=" ">
    <subfield code="a">Includes index.</subfield>
  </datafield>
  <datafield tag="650" ind1=" " ind2="0">
    <subfield code="a">Bicycle touring</subfield>
    <subfield code="z">Germany</subfield>
    <subfield code="x">Guidebooks.</subfield>
  </datafield>
</record>
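One practical consequence of the XML form is that general-purpose XML tools can read the record directly. As a small illustration (not an LC tool), the Python sketch below uses the standard library to pull fields out of a shortened copy of the record above. The helper name `subfields` is hypothetical; the namespace URI is the one shown in the slide.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the MARCXML record in the slide.
NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Shortened copy of the sample record (only three fields kept).
MARCXML = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <controlfield tag="001">3471394</controlfield>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Slavinski, Nadine,</subfield>
    <subfield code="d">1968-</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Germany by bike :</subfield>
    <subfield code="b">20 tours geared for discovery /</subfield>
    <subfield code="c">Nadine Slavinski.</subfield>
  </datafield>
</record>"""


def subfields(record, tag):
    """Return one list of (code, text) pairs per datafield carrying `tag`."""
    found = []
    for field in record.findall(f"marc:datafield[@tag='{tag}']", NS):
        found.append(
            [(sf.get("code"), sf.text) for sf in field.findall("marc:subfield", NS)]
        )
    return found


record = ET.fromstring(MARCXML)
record_id = record.findtext("marc:controlfield[@tag='001']", namespaces=NS)
title = " ".join(text for _, text in subfields(record, "245")[0])

if __name__ == "__main__":
    print(record_id)
    print(title)
```

The same numeric tags and subfield codes a cataloger already knows become the query keys, which is the continuity the slides emphasize.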
MODS
MODS
• Metadata Object Description Schema – a MARC 21 companion
• Simpler element set than full MARC, but MARC semantics - simplified coded data
• Richer element set than DC
• More compatible with MARC than others
• “Friendly” schema and tagging, no coded values
• Special accommodation of electronic resources
MODS for electronic resources
Development
• electronic resources an important target
• input from several digital library projects
Xlink attribute throughout
Related item structure supports hierarchy needed for complex digital objects
Digital origin attribute
Several date types specifically for digital projects (e.g., capture)
E-resource identifiers, e.g., DOI
MODS
LC uses of MODS
Describing electronic resources
• AV project, web archiving
Technician input
• web archiving
Incorporation with XML resources
• METS projects
OAI collections
• LC offers MODS, MARCXML, DC simple
MARCXML to MODS
<mods xmlns="http://www.loc.gov/mods/">
  <titleInfo>
    <title>Germany by bike : 20 tours geared for discovery /</title>
  </titleInfo>
  <name type="personal">
    <namePart>Slavinski, Nadine,</namePart>
    <namePart type="date">1968-</namePart>
    <role>creator</role>
  </name>
  <typeOfResource>text</typeOfResource>
  <publicationInfo>
    <placeCode authority="marc">wau</placeCode>
    <place>Seattle, Wash. :</place>
    <publisher>Mountaineers,</publisher>
    <dateIssued>c1994.</dateIssued>
    <dateIssued encoding="marc">1994</dateIssued>
    <issuance>monographic</issuance>
  </publicationInfo>
  <language authority="iso639-2b">eng</language>
  <physicalDescription>
    <extent>238 p. : ill., maps ; 22 cm.</extent>
  </physicalDescription>
  <note type="statement of responsibility">Nadine Slavinski.</note>
  <note>Includes index.</note>
  <subject authority="lcsh">
    <topic>Bicycle touring</topic>
    <geographic>Germany</geographic>
    <topic>Guidebooks.</topic>
  </subject>
  <classification authority="lcc">GV1046.G3 G47 1994</classification>
  <classification authority="ddc" edition="20">796.6/4/0943</classification>
  <relatedItem type="series">
    <titleInfo>
      <title>By bike</title>
    </titleInfo>
  </relatedItem>
  <identifier type="isbn">0898863872 (acid-free, recycled paper) :</identifier>
  <identifier type="lccn">93047676</identifier>
  <recordInfo>
    <recordContentSource>DLC</recordContentSource>
    <recordCreationDate encoding="marc">931129</recordCreationDate>
    <recordChangeDate encoding="iso8601">19990429094819.1</recordChangeDate>
    <recordIdentifier>3471394</recordIdentifier>
  </recordInfo>
</mods>
MARCXML and DC
• DC application target – cross domain, metadata in document headers
• Transformation software important to help standardize crosswalks - LC already maintains DC/MARC 21 mapping
• Transformation available from LC, open source
• Offer items for OAI harvesting in DC (MARCXML and MODS)
MARCXML to DC
<rdf:Description xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Germany by bike : 20 tours geared for discovery </dc:title>
  <dc:creator>Slavinski, Nadine, 1968-</dc:creator>
  <dc:type>text</dc:type>
  <dc:publisher>Seattle, Wash. : Mountaineers,</dc:publisher>
  <dc:date>c1994.</dc:date>
  <dc:language>eng</dc:language>
  <dc:subject>Bicycle touring</dc:subject>
</rdf:Description>
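A crosswalk of this kind is essentially a table from MARC tags and subfield codes to DC elements. The Python sketch below shows the idea on a shortened copy of the sample record. It is a hypothetical, partial mapping chosen to mirror the output above, not LC's published crosswalk or stylesheet, and unlike the slide output it does not strip trailing ISBD punctuation such as the " /" after the title.

```python
import xml.etree.ElementTree as ET

MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical, partial crosswalk: (MARC tag, subfield codes, DC element).
CROSSWALK = [
    ("245", "ab", "title"),
    ("100", "ad", "creator"),
    ("260", "ab", "publisher"),
    ("260", "c", "date"),
    ("650", "a", "subject"),
]

# Shortened copy of the sample MARCXML record.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Slavinski, Nadine,</subfield>
    <subfield code="d">1968-</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Germany by bike :</subfield>
    <subfield code="b">20 tours geared for discovery /</subfield>
    <subfield code="c">Nadine Slavinski.</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="a">Seattle, Wash. :</subfield>
    <subfield code="b">Mountaineers,</subfield>
    <subfield code="c">c1994.</subfield>
  </datafield>
  <datafield tag="650" ind1=" " ind2="0">
    <subfield code="a">Bicycle touring</subfield>
    <subfield code="z">Germany</subfield>
    <subfield code="x">Guidebooks.</subfield>
  </datafield>
</record>"""


def marcxml_to_dc(record):
    """Map a MARCXML record element to a dict of DC element name -> values."""
    dc = {}
    for tag, codes, element in CROSSWALK:
        for field in record.findall(f"marc:datafield[@tag='{tag}']", MARC_NS):
            parts = [
                sf.text
                for sf in field.findall("marc:subfield", MARC_NS)
                if sf.get("code") in codes and sf.text
            ]
            if parts:
                dc.setdefault(element, []).append(" ".join(parts))
    return dc


dc_record = marcxml_to_dc(ET.fromstring(SAMPLE))

if __name__ == "__main__":
    for element, values in dc_record.items():
        print(f"dc:{element}", values)
```

Keeping the mapping in one table is what makes a crosswalk easy to standardize and share, which is the point the slide makes about transformation software.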
MARCXML from ONIX
Publisher/bookseller record to MARC (2709) via MARCXML
Complex XML format with
• traditional descriptive data possibilities
• potentially useful descriptive data LC does not currently have or supply
• publisher/bookseller data not of current interest
<imprint>
  <b241>02</b241>
  <b242>Clarion Books</b242>
  <b243>HMCo008</b243>
</imprint>
<b081>Houghton Mifflin Company</b081>
<b209>New York</b209>
<b083>US</b083>
<b003>20021021</b003>
<b087>2002</b087>
<measure>
  <c093>08</c093>
  <c094>0.0</c094>
  <c095>lb</c095>
</measure>
<d101>A little buckaroo is turning two in this birthday book for the very young, the fifth story about the delightful holiday mice. Mischief and near disaster abound when the littlest mouse's sister and brothers throw him a cowboy-themed party. Through simple rhymes and charming illustrations, readers witness the party preparations, the arrival of the guests, the opening of presents, and the blowing out of the candles, as well as the ensuing fulfillment of the little mouse's fondest birthday wish: to be a cowboy.</d101>
<mediafile>
  <f114>04</f114>
  <f115>05</f115>
  <f116>01</f116>
  <f117>ftp://imagesro:[email protected]/low_res/juvenile_jacket_low_res/fall_2002/0618077723.tif</f117>
  <f122>These images may be used only to promote the Houghton Mifflin publications with which they are associated. They may be used only in their entirety without any alteration, other than to change the size of the images. The images must be accompanied by any proprietary notice included therewith.</f122>
Snip from ONIX record
<datafield ind1=" " ind2=" " tag="260">
  <subfield code="a">New York</subfield>
  <subfield code="b">Houghton Mifflin Company</subfield>
  <subfield code="c">2002</subfield>
</datafield>
<datafield ind1=" " ind2=" " tag="300">
  <subfield code="a">32 p.</subfield>
</datafield>
<datafield ind1=" " ind2=" " tag="521">
  <subfield code="a">Children/juvenile.</subfield>
</datafield>
<datafield ind1="1" ind2=" " tag="700">
  <subfield code="a">Cushman, Doug</subfield>
  <subfield code="e">illustrator</subfield>
</datafield>
<datafield ind1="4" ind2="2" tag="856">
  <subfield code="3">Front cover image</subfield>
  <subfield code="u">ftp://imagesro:[email protected]/low_res/juvenile_jacket_low_res/fall_2002/0618077723.tif</subfield>
  <subfield code="z">These images may be used only to promote the Houghton Mifflin publications with which they are associated. They may be used only in their entirety without any alteration, other than to change the size of the images. The images must be accompanied by any proprietary notice included therewith.</subfield>
</datafield>
Snip from MARCXML from ONIX
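Comparing the two snips shows how ONIX elements feed a MARC field: in this example b209, b081, and b087 end up in subfields a, b, and c of field 260. The Python sketch below reproduces only that one field. The `<product>` wrapper, the `FIELD_260` table, and the function name are assumptions made for illustration; the mapping is read off the two snips and is not taken from LC's actual ONIX converter.

```python
import xml.etree.ElementTree as ET

# Three elements from the ONIX snip, wrapped in an assumed <product> element.
ONIX_SNIPPET = """<product>
  <b081>Houghton Mifflin Company</b081>
  <b209>New York</b209>
  <b087>2002</b087>
</product>"""

# Assumed mapping for field 260, inferred from the two snips:
# (MARC subfield code, ONIX short tag).
FIELD_260 = [("a", "b209"), ("b", "b081"), ("c", "b087")]


def onix_to_260(product):
    """Build a MARCXML-style 260 datafield from an ONIX product element."""
    field = ET.Element("datafield", {"ind1": " ", "ind2": " ", "tag": "260"})
    for code, onix_tag in FIELD_260:
        text = product.findtext(onix_tag)
        if text:  # skip elements the publisher did not supply
            subfield = ET.SubElement(field, "subfield", {"code": code})
            subfield.text = text
    return field


field_260 = onix_to_260(ET.fromstring(ONIX_SNIPPET))

if __name__ == "__main__":
    print(ET.tostring(field_260, encoding="unicode"))
```

The remaining ONIX elements in the snip, such as the measure block, have no counterpart in the MARCXML output, which matches the slide's note that some publisher data is not of current interest.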
MARCXML – other tools
• Tagging transformations
• Name instead of number tags?
• Different language tags for MODS?
• MARC 21 XML “full” tagging
• oai_marc to MARCXML
• Character set transformations
• MARCXML to FRBR tool (for experimentation)
• MARC record validation tool
Uses of MARCXML and related tools
• Standardize MARC 21 across community for XML communication and manipulation
• Open MARC 21 to XML programming tools and presentation style sheets
• Standardize MARC 21 for OAI harvesting
• Standardize transformations to and from other standard formats (DC, ONIX, …)
• Basis for evolution while maintaining standardization
Broader metadata needs
• LC descriptions of digitized items include technical and rights data not appropriate for MARC
• Focusing on METS - Metadata Encoding and Transmission Standard
• Descriptive, administrative, and structural in one XML document
Characteristics of METS
• METS enables resource retrieval, object validation, preservation, rights mgt., ...
• Non-proprietary; being developed by library community
• (relatively) Simple; extensible; modular
METS Schema
METS use
LC
• Moving image project
• Selected digital collections
• Developing a record creation utility
Others
• BnF web archiving and digital preservation
• OCLC web archiving
• NL Wales digital collections
• Harvard audio collection
• Michigan State, Berkeley, etc.
In summary
• LC focuses on AACR, MARC 21, and EAD for primary access
• New development is evolutionary
• Employing XML through MARCXML
• Focus for electronic documents in MODS, a MARC derivative
• For broader metadata, METS and appropriate extension schema
More information
• www.loc.gov/marcxml
• www.loc.gov/mods
• www.loc.gov/marc