INLS 520 – Fall 2007, Erik Mitchell
INLS 520
Information Organization
Review
• Metadata models
  – DC, METS
• Metadata standards
  – Dublin Core / QDC
• Encoding schemes
  – HTML, XML, MARC…
• Advanced metadata concepts
  – Schemas, application profiles
Today
• Core Skills for Library/IS types
• MARC Overview
  – Encoding
  – Related Standards
  – Exercise
• RDF Introduction (brief)
• Introduction to programming (brief)
Discussion
• Read assigned posting from NGC4LIB discussion group.
• Share in group & think about the following questions:
  – What are the core skills that an information organization professional should have?
  – What is the relationship of information organization to these “core skills”?
Anatomy of a bibliographic record
MARC value standards
• Fields & Values
  – Fields, Indicators, Subfields
  – More information from OCLC
• Content and encoding standards
  – AACR2
  – RDA
    • Development started in 2004, slated for release in 2009
    • An enjoyable article on the development of RDA
How to enter a title into a MARC record
– AACR2
  • Transcribe title exactly according to spelling but not necessarily punctuation/capitalization.
  • If an alternative title is present, precede it by a comma following the regular title
  • Use a General Material Designation in brackets []
– MARC Standard
  • Use 245 field – indicates main title
  • Indicator 2 – Number of non-filing characters (leading articles)
  • Subfield a – main title
  • Subfield b – remainder of title
  • Subfield h – General Material Designation in brackets []
MARC metadata
• Definition
  – MAchine-Readable Cataloging record
  – Combination of content, value, and encoding standard
• History
  – Created by Henriette Avram in 1968
  – Managed by the Library of Congress
MARC metadata
• The encoding standard
  – Variable-length record
  – Fixed-length leader (together with the directory) defines the position of fields in the record
  – Fixed fields in the leader codify format information
  – Variable-length fields provide descriptive content
• Examples
  – System-ready example record (LC)
  – Uses of MARC fields by OCLC
• More information
  – More information from LC
Encoded MARC record
01802cam 22003371a
4500001001800000003000800018005001700026006001900043007001500062008004100077015001400118035002100132035001800153040005100171041001300222043002100235050001700256082001700273245011000290260006600400300002100466505053300487533015701020650002501177651003301202700004401235776003201279830004801311856009201359949001301451ASPS00000161/nwldVaAlASP20061114120112.0m | d | cr |n ---||a|a730321s1955 mnu 000 0 eng aGB56-6680 9(DLC) 55009368 a(OCoLC)585815 aDLCcODaUdOCoLCdMnHidUkdPBfGdDLCdVaAlASP1 aenghnor an-us---ae-no---00aE184.S2bB55 a325.2481097300aLand of their choiceh[electronic resource] :bthe immigrants write home /cedited by Theodore C. Blegen. a[Minneapolis, Minn.] :bUniversity of Minnesota Press,c1955. a463 p. ;c24 cm.0 aThe immigrant image of America -- The "sloopfolk" arrive -- Westward to El-a-noy -- Wisconsin is the place -- The Atlantic crossing -- Scouting the promised land -- Spreading the gospel -- Journeying toward new horizons -- Ordeal and debate -- Appraising the American scene -- The transatlantic gold rush -- Cheerful voices at mid-century -- More than a ballad -- A humorist in Canaan -- A lady grows old in Texas -- In defense of the southwest -- From a frontier parsonage -- The beautiful land -- The glorious new Scandinavia.I0aElectronic reproduction.bAlexandria, VA :cAlexander Street Press,d2002.f(North American women's letters and diaries).nAvailable via World Wide Web. 0aNorwegian Americans. 0aUnited StatesxCivilization.1 aBlegen, Theodore Christian,d1891-1969.1 cOriginalw(DLC) 55009368 0aNorth American women's letters and diaries.40zAccess restricted to subscribers.uhttp://www.aspresolver.com/aspresolver.asp?NWLD;S16101aER_NAWLD
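The run of digits that follows the leader in the encoded record is the directory. As a rough sketch, assuming the standard 12-character directory entry (3-character tag, 4-digit field length, 5-digit starting position), the first entries can be split apart as below. The function name `parse_directory` is illustrative and not part of any MARC library.

```python
def parse_directory(directory: str) -> list[dict]:
    """Split a MARC directory string into 12-character entries.

    Assumed layout per entry: tag (3 chars), field length (4 digits),
    starting character position (5 digits).
    """
    entries = []
    usable = len(directory) - len(directory) % 12  # ignore a trailing partial entry
    for i in range(0, usable, 12):
        chunk = directory[i:i + 12]
        entries.append({
            "tag": chunk[0:3],
            "length": int(chunk[3:7]),
            "start": int(chunk[7:12]),
        })
    return entries


# First three directory entries copied from the encoded record above.
sample = "001001800000" "003000800018" "005001700026"
for entry in parse_directory(sample):
    print(entry)
```

Each entry's start position equals the previous entry's start plus its length (0 + 18 = 18, 18 + 8 = 26), which is how a reader locates a variable-length field without scanning the whole record.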
Text formatted MARC
• =LDR 01802cam 22003371a 4500
• =001 ASPS00000161/nwld
• =003 VaAlASP
• =005 20061114120112.0
• =006 m\\\\|\\\d\|\\\\\\
• =007 cr\|n\---||a|a
• =008 730321s1955\\\\mnu\\\\\\\\\\\000\0\eng\\
• =015 \\$aGB56-6680
• =035 \\$9(DLC) 55009368
• =035 \\$a(OCoLC)585815
• =040 \\$aDLC$cODaU$dOCoLC$dMnHi$dUk$dPBfG$dDLC$dVaAlASP
• =041 1\$aeng$hnor
• =043 \\$an-us---$ae-no---
• =050 00$aE184.S2$bB55
• =082 \\$a325.24810973
• =245 00$aLand of their choice$h[electronic resource] :$bthe immigrants write home /$cedited by Theodore C. Blegen.
• =260 \\$a[Minneapolis, Minn.] :$bUniversity of Minnesota Press,$c1955.
• =300 \\$a463 p. ;$c24 cm.
• =505 0\$aThe immigrant image of America -- The "sloopfolk" arrive -- Westward to El-a-noy -- Wisconsin is the place -- The Atlantic crossing -- Scouting the promised land -- Spreading the gospel -- Journeying toward new horizons -- Ordeal and debate -- Appraising the American scene -- The transatlantic gold rush -- Cheerful voices at mid-century -- More than a ballad -- A humorist in Canaan -- A lady grows old in Texas -- In defense of the southwest -- From a frontier parsonage -- The beautiful land -- The glorious new Scandinavia.
• =533 I0$aElectronic reproduction.$bAlexandria, VA :$cAlexander Street Press,$d2002.$f(North American women's letters and diaries).$nAvailable via World Wide Web.
• =650 \0$aNorwegian Americans.
• =651 \0$aUnited States$xCivilization.
• =700 1\$aBlegen, Theodore Christian,$d1891-1969.
• =776 1\$cOriginal$w(DLC) 55009368
• =830 \0$aNorth American women's letters and diaries.
• =856 40$zAccess restricted to subscribers.$uhttp://www.aspresolver.com/aspresolver.asp?NWLD;S161
• =949 01$aER_NAWLD
MARC variable fields
• 245 14 $a The MARC record: $b revealed and detailed
  – Field tag: 245
  – Indicators: 14
  – Subfield: $a, $b
  – Contents
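To make the parts of a variable field concrete, here is a minimal sketch that splits the slide's example line into tag, indicators, and subfields, then uses indicator 2 (the non-filing character count) to drop the leading article. It assumes the space-separated layout shown on this slide; the helper names `parse_field` and `filing_title` are hypothetical.

```python
def parse_field(line: str) -> dict:
    """Parse a line such as '245 14 $a Title: $b remainder'.

    Assumes tag, indicators, and content are separated by single spaces.
    """
    tag, indicators, rest = line.lstrip("=").strip().split(" ", 2)
    subfields = []
    for part in rest.split("$")[1:]:       # text before the first '$' is empty
        subfields.append((part[0], part[1:].strip()))
    return {
        "tag": tag,
        "ind1": indicators[0],
        "ind2": indicators[1],
        "subfields": subfields,
    }


def filing_title(field: dict) -> str:
    """Return subfield a with the non-filing characters (indicator 2) removed."""
    title = dict(field["subfields"]).get("a", "")
    skip = int(field["ind2"]) if field["ind2"].isdigit() else 0
    return title[skip:]


field = parse_field("245 14 $a The MARC record: $b revealed and detailed")
print(field["tag"], field["ind1"], field["ind2"])
print(field["subfields"])
print(filing_title(field))
```

With indicator 2 set to 4, the four characters "The " are skipped, so the title files under "MARC" rather than under the article.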
MARC leader
http://www.oclc.org/support/documentation/worldcat/records/subscription/1/1.pdf
MARC fields (1)
• 001-007 Leader/fixed fields
• 010-035 Identifying numbers
• 050-099 Call Numbers
• 100-130 Names
• 210-247 Title
• 250-270 Edition, imprint, etc.
• 300-362 Physical, publication info.
MARC fields (2)
• 500-599 Notes & contextual info.
• 600-699 Subject headings, names
• 700-799 Added entries
• 800-830 Series added entries
• 856 Electronic access
• 900-999 Local information
Example MARC fields (1)
• =LDR 01802cam 22003371a 4500
• =001 ASPS00000161/nwld
• =003 VaAlASP
• =005 20061114120112.0
• =006 m\\\\|\\\d\|\\\\\\
• =007 cr\|n\---||a|a
• =008 730321s1955\\\\mnu\\\\\\\\\\\000\0\eng\\
• =015 \\$aGB56-6680
• =035 \\$9(DLC) 55009368
• =035 \\$a(OCoLC)585815
MARC leader (006)
Position  Field                                        Value
00-04     Logical Record Length                        01802
05        RecStat (Record Status)                      c
06        Type (type of record)                        a
07        BLvl (Bibliographic level)                   m
08        Ctrl (type of control)                       \
09        Character Coding Scheme
10        Indicator Count
11        Subfield Code Count
12-16     Base Address of data
17        ELvl (Encoding Level)                        1
18        Desc (Descriptive catalog form AACR2/ISBD)   a
19        Linked Record Requirement
20        Length of Len-of-field
21        Length of starting character
22        Transaction type code in hex
23        Undf
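The positions in the table can be read directly off the 24-character leader by slicing. The sketch below assumes the example record's leader is `01802cam  22003371a 4500`, that is, with the two blanks at positions 08-09 restored (they appear collapsed to one space in the text-formatted listing). The field names in `LEADER_LAYOUT` are illustrative labels, not official element names.

```python
# (label, start, end) using Python slice positions; the end is exclusive.
LEADER_LAYOUT = [
    ("record_length", 0, 5),
    ("record_status", 5, 6),
    ("type_of_record", 6, 7),
    ("bibliographic_level", 7, 8),
    ("type_of_control", 8, 9),
    ("character_coding", 9, 10),
    ("indicator_count", 10, 11),
    ("subfield_code_count", 11, 12),
    ("base_address", 12, 17),
    ("encoding_level", 17, 18),
    ("descriptive_form", 18, 19),
]


def decode_leader(leader: str) -> dict:
    """Slice a 24-character MARC leader into named positions."""
    if len(leader) != 24:
        raise ValueError("a MARC leader must be exactly 24 characters")
    return {name: leader[start:end] for name, start, end in LEADER_LAYOUT}


# Assumed reconstruction of the example leader (blanks at 08-09 restored).
leader = "01802cam  22003371a 4500"
for name, value in decode_leader(leader).items():
    print(f"{name:22} {value!r}")
```

The base address (00337) is where the field data begins: 24 leader characters, plus 26 directory entries of 12 characters each, plus one field terminator.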
008 Field (Leader – 2)
Position  Field                                          Value
00–05     Entered (Date added to WorldCat)               730321
06        DtSt (Date Type)                               s
07–10     Dates (Date 1)                                 1955
11–14     Dates (Date 2)                                 \\\\
15–17     Ctry (Required if avail.)                      mnu
18–34     Format specific (See Summary of 008 and 006 Field Bytes.)
  18      Illustrations                                  acde
  22      Audience                                       e
  23      Form                                           r
  24      Nature of Contents                             bcde
  28      Gpub (Government Publication)                  \
  29      Conf (Conference Publication)                  0
  30      Fest (Festschrift)                             0
  31      Indx (does the resource have an index)         1
  33      LitF (literary form)                           m
  34      Biog (Is the work biographical)                \
35–37     Lang (Mandatory)                               eng
38        MRec (Modified Record)                         \
39        Srce (Mandatory) Cataloging source             \
Example MARC fields (2)
• =050 00$aE184.S2$bB55
• =082 \\$a325.24810973
• =245 00$aLand of their choice$h[electronic resource] :$bthe immigrants write home /$cedited by Theodore C. Blegen.
• =260 \\$a[Minneapolis, Minn.] :$bUniversity of Minnesota Press,$c1955.
• =300 \\$a463 p. ;$c24 cm.
Example MARC fields (3)
• =505 0\$aExtracted notes fields.
• =650 \0$aNorwegian Americans.
• =651 \0$aUnited States$xCivilization.
• =700 1\$aBlegen, Theodore Christian,$d1891-
• =830 \0$aNorth American women's letters
• =856 40$zAccess restricted to subscribers.$uhttp://www.aspresolver.com/aspresolver.asp?NWLD;S161
• =949 01$aER_NAWLD
MARC Exercises
• Introduction to MarcEdit
  – If you can’t use MarcEdit, use a text editor & follow this standard:
    • =245 04 $a content $b more content
  – Tour of the application
  – Exercise 1 – create a MARC record
  – Exercise 2 – decompile/compile MARC records, batch edit
FRBR Model
http://www.ifla.org/
http://fictionfinder.oclc.org/
http://worldcat.org
http://www.frbr.org
FRBR background
• Work/item
  – C.A. Cutter (1890)
    • Notion of a work
  – S. R. Ranganathan (1930-late 1960)
    • Intellectual entity – expressed thought
    • Physical entity – embodies thought
  – P. Wilson
    • Intellectual entity – work
      – Subject metadata
    • Physical entity – item
      – Selected descriptive metadata
Adapted from Jane Greenberg
FRBR components
• Work
  – distinct intellectual or artistic creation
• Expression
  – intellectual or artistic realization of a work
• Manifestation
  – physical embodiment of an expression of a work
• Item
  – a single exemplar of a manifestation
Adapted from Jane Greenberg
FRBR Example
• Rolling Stones’ IT'S ONLY ROCK 'N ROLL (1974) (Work)
  – Group’s performance recorded for the album (Expression)
    • Recording released in 1974 by MCA Records on tape cassette (Manifestation)
    • Recording released in 1974 by MCA Records on compact disc (Manifestation)
    • Sheet music released in 1992 (?)
Adapted from Jane Greenberg
FRBR diagram
Work, the Performance (1974)
E: Music and lyrics
E: Music (just the instruments)
M: CD, RCA, 2005
M: RS, LP 1974
M: 8-track, RCA, 1975
I: My CD, RCA, 2005 c.2
I: Your CD, RCA, 2005 c.1
I: UNC Musllib.CD, RCA, 2005 c.3
Adapted from Jane Greenberg
FRBR Algorithm (1)
• Process
  – Extract Author
    • Construct Authority author entry from 100, 400 using subfields and 008 data to limit
  – Extract Title
    • Construct Authority title entry from 130, 240, 245, etc. Normalize using NACO
  – Combine these two authorities to create a unique Work identifier
    • <author>Mitchell, Margaret</author><title>Gone with the wind</title>
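The combine step can be sketched as a small key-building function. This is only an illustration under stated assumptions: the `normalize` function below is a much-simplified stand-in for NACO normalization (it lowercases, strips diacritics, replaces punctuation with spaces, and collapses whitespace), and the `author/title` key format is hypothetical rather than the format used by the OCLC algorithm.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Simplified, NACO-like normalization (an assumption, not the real rules)."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    no_punct = re.sub(r"[^\w\s]", " ", no_marks.lower())
    return " ".join(no_punct.split())


def work_key(author: str, title: str) -> str:
    """Combine normalized author and title into one work identifier."""
    return f"{normalize(author)}/{normalize(title)}"


print(work_key("Mitchell, Margaret", "Gone with the wind"))
print(work_key("MITCHELL, Margaret", "Gone with the Wind."))
```

Both calls produce the same key, which is the point of normalizing: records that differ only in case or punctuation collapse into one work set.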
FRBR Algorithm (2)
• Results from a sample extraction (From FRBR doc)
  – <author>/<title> (75.97%)
  – <uniform title> (1.34%)
  – /<title>/[one or more <name>] (17.35%)
  – /<title>/<control number> (5.34%)
• http://www.oclc.org/research/software/frbr/frbr_workset_algorithm.pdf
Warwick Framework
• Components
  – Container
  – Package
    • Metadata set
    • Indirect link
    • Another container
• Origins / Definition
  – Beginnings: Came out of DC discussions in 1995/6
  – Goal: to promote interoperability, define the context of DC metadata, and come up with a way of ‘contextualizing’ DC description
  – Definition: A general model that describes the various parts of a complex object, including the various categories of metadata. (http://www.cs.cornell.edu/wya/DigLib/MS1999/glossary.html)
Resource Description Framework
• Origins
  – PICS (Platform for Internet Content Selection)
  – Warwick Framework
  – Initial goal was to code metadata for the web
• Definition:
  – A data model
  – A set of “statements” about a “resource”
  – RDF triple: a resource (subject), a property type (predicate), and a value (object)
RDF Example
• A resource is a uniquely identifiable thing (URI)• Properties are given context (Property Type)
From Miller, 1998
RDF Model
Subject (Resource): Webpage: http://ils.unc.edu
Predicate (Property type): Author
Object (Value): “Abe Crystal”
“The author of the SILS Webpage is Abe Crystal”
http://ils.unc.edu has a creator with name Abe Crystal
-A literal, a triple, a statement
From Greenberg
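As a minimal sketch of the same statement in code, a triple can be held as a three-part tuple. The `Triple` class and `describe` helper are hypothetical, and using the Dublin Core `creator` URI as the predicate is an assumption made for illustration (the slide only labels the property "Author").

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # the resource, identified by a URI
    predicate: str  # the property type, also identified by a URI
    obj: str        # the value (here a literal string)


# Assumption: Dublin Core "creator" stands in for the slide's "Author" property.
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"

statement = Triple("http://ils.unc.edu", DC_CREATOR, "Abe Crystal")


def describe(triple: Triple) -> str:
    """Render a triple as a short sentence using the last URI segment as the property name."""
    prop = triple.predicate.rsplit("/", 1)[-1]
    return f"{triple.subject} has {prop} {triple.obj!r}"


print(describe(statement))
```

Because the predicate is itself a URI, two descriptions that use the same property URI can be merged or compared by software without guessing what "Author" means in each source.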
How is RDF different?
• RDF is a descriptive model that
  – Allows variable contextualized description
  – Deconstructs the descriptive process
  – Allows more granular automated processing of data
  – Uses exact markup to indicate the context of values (namespaces, schemas)
• A simple example
Encoding RDF in XML
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Description rdf:about="http://purl.org/dc/elements/1.1/">
    <dc:title>The Hang: The Island of Black Jeans</dc:title>
    <dc:creator>SAKI KNAFO</dc:creator>
    <dc:identifier>http://www.stuff.com</dc:identifier>
    <dc:date>Sun, 16 Sep 2007 01:04:40 GMT</dc:date>
    <dc:description>descriptive content</dc:description>
  </rdf:Description>
</rdf:RDF>
Iterative RDF description
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:vcard="http://dli.grainger.uiuc.edu/publications/metadatacasestudy/dc_schemas/vcard.xsd"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Description rdf:about="http://purl.org/dc/elements/1.1/">
    <dc:title>The Hang: The Island of Black Jeans</dc:title>
    <dc:creator rdf:resource="#Creator_001"/>
    <dc:identifier>http://www.stuff.com</dc:identifier>
    <dc:date>Sun, 16 Sep 2007 01:04:40 GMT</dc:date>
    <dc:description>descriptive content</dc:description>
  </rdf:Description>
  <rdf:Description rdf:ID="Creator_001">
    <vcard:given>Saki</vcard:given>
    <vcard:family>Knafo</vcard:family>
    <vcard:email>
      <vcard:userid>[email protected]</vcard:userid>
    </vcard:email>
  </rdf:Description>
</rdf:RDF>
DC in RDF
• Expressing Simple Dublin Core in RDF/XML (Beckett, et al., 2002)
  – http://dublincore.org/documents/dcmes-xml/
  – Note: you cannot do qualification with this recommendation.
• Expressing Qualified Dublin Core in RDF/XML (Kokkelink & Schwänzl, 2002)
  – http://dublincore.org/documents/2002/04/14/dcq-rdf-xml/
Programming 101
• What is a program?
• What concepts do we need to understand?
• Is XSL a programming language?
Programming 101
• Definition:
  – “the act of creating software or some other set of instructions for a computer.” [1]
• Examples
  – Dynamic web sites
  – Compiled applications (like Firefox)
  – Small applications that perform a specific task (such as transform metadata)
Definitions
• Programming Language
  – “A formal language used to write instructions that can be translated into machine language and then executed by a computer.” (definitions)
• Scripting Language
  – Run-time (does not require compilation)
  – Restricted context (requires a specific environment)
  – Functional / Object oriented
  – Definitions
• Compiler / Interpreter
  – A program that builds or executes another program. A compiler creates a self-contained executable file; an interpreter reads a text script at run-time.
Programming approaches
• Logical/structural programming
  – Stream of consciousness
  – Starts at line 1
• Procedural programming
  – Uses functions, sub-functions, subroutines
  – Encapsulation, modularization
• Object-oriented programming
  – Further encapsulation
  – Uses concepts of inheritance, modularity
Flow of Document Models
What is the relationship of the data model to the intended document use in the four following document examples?
The programming process
• Analyze the problem
  – What do you want your program to do?
  – What are your users expecting, what data do you have?
• Plan program flow/logic
  – What steps need to occur, in what order?
  – Useful tools include Step-Form, flowcharts, and pseudocode
• Code the program
  – Create variables, routines, functions
• Compile/run the program
  – Test, verify
  – Release
Programming 101 - Concepts
• General structure
  – Programs have a ‘flow’ to them
  – Programs use functions, algorithms, and objects to compartmentalize operations
  – Programs follow a specific syntax (their own document model)
  – Programs operate in specific environments (compiled platforms, run-time platforms)
Programming 101 – Concepts
• Control Structures
  – Looping (while)
  – Decision making (if)
• Variables
  – Store information for use/reuse
  – A simple variable is name=value
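The three ideas on this slide (a variable, a while loop, and an if decision) fit in a few lines. This is a small sketch in Python, chosen only for illustration; the sample titles are borrowed from earlier slides and the variable names are arbitrary.

```python
# Variables: name = value
titles = ["Land of their choice", "", "Gone with the wind"]
index = 0
non_empty = []

# Looping: repeat while there are titles left to examine
while index < len(titles):
    title = titles[index]
    # Decision making: keep only titles that are not empty
    if title != "":
        non_empty.append(title)
    index += 1

print(non_empty)
```

The same filter-inside-a-loop pattern is what `xsl:for-each` combined with `xsl:if` expresses in a stylesheet.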
Programming 101 - XSL
• Is XSL programming?
• What can we use XSL for?
• Why are we covering it here?
XSL Overview
• Extensible Stylesheet Language
• Components
  – Defined XML standard which is used in conjunction with a transformation engine to transform XML data
  – XQuery/XPath
• Capabilities, limitations
  – Document processing
  – Semi-functional programming language
XSL Introduction
• Styling
  – XSL: Extensible Stylesheet Language
• Querying
  – XPath
  – XQuery
  – XPointer
  – XLink
• Good resources for reference
  – http://www.w3schools.com/xsl/default.asp
  – http://www.w3.org/Style/XSL/
  – http://www.w3schools.com/css/default.asp
  – http://www.csstutorial.net/
XSL Overview - 1
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="/dc">
    Processing Instructions
  </xsl:template>
</xsl:stylesheet>

Contents of <xsl:template...>
<html>
  <head>
    <title>Sample XSL transformation</title>
  </head>
  <body>
    <xsl:for-each select="*">
      <p>
        <b><xsl:value-of select="name(.)"/><xsl:text>:</xsl:text></b>
        <xsl:value-of select="./text()"/>
      </p>
    </xsl:for-each>
  </body>
</html>
XSL – Sample Stylesheet
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/rss">
    <html>
      <body>
        <xsl:for-each select="./channel/item">
          <xsl:value-of select="title"/><br/>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
XSL Control Structures
• For Each
  – <xsl:for-each select="/date"></xsl:for-each>
• Choosing between options
  – <xsl:choose>
      <xsl:when test="contains(/URL, '.edu')">
      </xsl:when>
    </xsl:choose>
• If
  – <xsl:if test="./title != ''"> </xsl:if>
XSL Templates
• Templates work like functions
• Defining a template
  – <xsl:template name="myName">
      <xsl:for-each…..>
      </xsl:for-each>
    </xsl:template>
• Calling a template
  – <xsl:call-template name="myName"/>
XSL Variables
• Variables store values for later use
  – In XSL variables are somewhat limited due to the processing relationship to the XML DOM (once set, a variable's value cannot be changed)
• Defining a Variable
  – <xsl:variable name="myVariable">value here</xsl:variable>
• Using a Variable
  – <xsl:value-of select="$myVariable"/>
XPath
• A DOM-style syntax that allows us to access elements in an XML file
• Examples
  – /dublinCore/title
    • Access the title of a DC record
  – /dublinCore/subject/@attribute
    • Access an attribute of the subject element
  – /dublinCore/
XPath (2)
• XPath functions
  – contains(//item/title, 'England')
  – substring-before(string1, string2), substring-after(string1, string2)
• XPath selectors
  – //elementname – finds an element anywhere in the DOM
  – ./ – from the current context
  – / – from the root context
  – * – wildcard match
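The selectors can be tried outside a stylesheet. The sketch below uses Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath: path steps, `.//`, and `*` work, but functions such as `contains()` do not, so the substring check is done in Python instead. The sample RSS fragment is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample document; the root element is <rss>.
doc = ET.fromstring(
    "<rss><channel>"
    "<item><title>England travel notes</title></item>"
    "<item><title>Norway letters</title></item>"
    "</channel></rss>"
)

# .//title : find <title> elements anywhere below the current context
all_titles = [t.text for t in doc.findall(".//title")]

# ./channel/item/* : wildcard match on every child of each <item>
item_children = [child.tag for child in doc.findall("./channel/item/*")]

# Stand-in for contains(//item/title, 'England'), done in Python
england = [t.text for t in doc.findall(".//item/title") if "England" in (t.text or "")]

print(all_titles)
print(item_children)
print(england)
```

A full XPath 1.0 engine (as used by XSLT processors) would evaluate `contains()` directly inside the expression.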
XSL – Sample Stylesheet
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="/dc">
    <html>
      <head>
        <title>Sample XML File</title>
      </head>
      <body>
        <xsl:for-each select="*">
          <p><b><xsl:value-of select="name(.)"/>: </b><xsl:text> </xsl:text><xsl:value-of select="./text()"/></p>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
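For comparison with a general-purpose language, the logic of this stylesheet (visit each child of the root `dc` element and print its name and text) can be approximated in a few lines of Python. This is not an XSLT processor, only a rough equivalent of the `for-each` loop; the function name `dc_to_html` and the sample record are made up for the example.

```python
import xml.etree.ElementTree as ET
from html import escape


def dc_to_html(xml_text: str) -> str:
    """Mimic the stylesheet: one <p> per child element of the root."""
    root = ET.fromstring(xml_text)
    rows = []
    for child in root:                      # like <xsl:for-each select="*">
        name = escape(child.tag)            # like name(.)
        value = escape(child.text or "")    # like ./text()
        rows.append(f"<p><b>{name}: </b> {value}</p>")
    return (
        "<html><head><title>Sample XML File</title></head><body>"
        + "".join(rows)
        + "</body></html>"
    )


record = (
    "<dc><title>Land of their choice</title>"
    "<creator>Blegen, Theodore Christian</creator></dc>"
)
print(dc_to_html(record))
```

The stylesheet version is declarative (it describes the output for a matched node), while the Python version spells out the loop step by step, which is one way to approach the question of whether XSL counts as programming.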
User generated Metadata
• Based on our work with metadata so far – is this something a general ‘user’ could do?
• What system features would help/hurt user-generated metadata?