An Overview of XML What it Is, How it Works, and How it’s Used for Library Metadata Steven Bernstein CLA Technical Services Section Fall Workshop November 15, 2012


Page 1: An Overview of XML

An Overview of XML
What it Is, How it Works, and How it’s Used for Library Metadata

Steven Bernstein
CLA Technical Services Section Fall Workshop
November 15, 2012

Page 2: An Overview of XML

A Few Things Before I Start

1. Nothing I am about to cover is new. The technologies and standards covered in this presentation have been around for about a decade.

2. That said, most technical services librarians (let alone public services librarians) don’t have an intimate knowledge of how XML works.

3. This presentation assumes you fit into the category of “most technical services librarians”.

Page 3: An Overview of XML

A Few Things Before I Start

4. I am a punster. You are forewarned.

Page 4: An Overview of XML

What is XML?

EXtensible Markup Language

Page 5: An Overview of XML

EXtensible Markup Language: Markup Languages

What are Markup Languages?
Markup languages provide context to electronic data, thereby transforming them into information that can more readily be used by both computers and humans.

Examples of Markup Languages
- HTML
- MARC

Page 6: An Overview of XML

Sample Basic HTML Webpage

<html>
  <head>
    <title>My Website</title>
  </head>
  <body>
    <h1 id="banner">My Website</h1>
    <ul id="menu">
      <li><a href="index.html">Home</a></li>
      <li><a href="about.html">About Me</a></li>
      <li><a href="news.html">News</a></li>
    </ul>
    <p>Welcome to my website! I’m so glad you came for a visit. I don’t have any content yet so please come back again soon.</p>
  </body>
</html>

Page 7: An Overview of XML

Sample Basic MARC Record

100 1_ $a DeWind, Dustin.
245 10 $a Mortality and the inevitability of dying / $c by Dustin DeWind.
260 __ $a Death Valley, Nev. : $b Heavenly Press, $c 2012.
300 __ $a 120 p. : $b ill. ; $c 24 cm.
650 _0 $a Death.

Page 8: An Overview of XML

EXtensible Markup Language: Extensible

What makes XML Extensible?
Tags are not standardized. Anyone can develop their own XML schema with tags that they define themselves.

Examples of XML Schemas
- Really Simple Syndication (RSS)
- Recipe Markup Language (RecipeML)
- MARCXML

Page 9: An Overview of XML

Sample Really Simple Syndication (RSS) Feed

<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
  <channel>
    <title>The Onion</title>
    <description>America’s Finest News Source</description>
    <link>http://www.theonion.com</link>
    <item>
      <title>Netflix Switches Over To Convenient New Physical Locations</title>
      <link>http://www.theonion.com/articles/netflix-switches-over-to-convenient-new-physical-l,19271/</link>
      <pubDate>Mon, 25 Feb 2011 00:00:00 GMT</pubDate>
      <description>Officials at Netflix announced Thursday that the company has finally reached its long-term goal of constructing a chain of easily accessible stores. "Having actual physical locations was always our ultimate intent, and we are proud to provide our customers with the convenient option of driving to a nearby Netflix store and renting any available movie for just $3.99 per title," said Netflix spokesman Henry Regis...</description>
    </item>
    ...
  </channel>
  ...
</rss>
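To see why this kind of markup is easy for a computer to use, here is a minimal sketch (not part of the original slides) that reads a feed like the one above with Python's standard `xml.etree.ElementTree` module. The `RSS_FEED` string is an abbreviated copy of the sample, and `list_items` is a hypothetical helper name chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the sample feed; descriptions are left out for brevity.
RSS_FEED = """<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
  <channel>
    <title>The Onion</title>
    <link>http://www.theonion.com</link>
    <item>
      <title>Netflix Switches Over To Convenient New Physical Locations</title>
      <pubDate>Mon, 25 Feb 2011 00:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def list_items(feed_xml):
    """Return a (title, pubDate) pair for every <item> in the feed."""
    # Parse from bytes so the encoding declaration in the prolog is honored.
    root = ET.fromstring(feed_xml.encode("utf-8"))
    channel = root.find("channel")
    return [
        (item.findtext("title"), item.findtext("pubDate"))
        for item in channel.findall("item")
    ]


for title, published in list_items(RSS_FEED):
    print(published, "|", title)
```

Because every value sits inside a named tag, the program never has to guess which string is the headline and which is the date.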

Page 10: An Overview of XML

Sample RecipeML Record

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE recipeml PUBLIC "-//FormatData//DTD RecipeML 0.5//EN"
  "http://www.formatdata.com/recipeml/recipeml.dtd">
<?xml-stylesheet href="dessert1.css" type="text/css"?>

<recipeml version="0.5">
  <recipe>
    <head>
      <title>Ice</title>
    </head>
    <ingredients>
      <ing>
        <amt>
          <qty>2</qty>
          <unit>ounces</unit>
        </amt>
        <item>water</item>
      </ing>
    </ingredients>
    <directions>
      <step>Freeze the water.</step>
    </directions>
  </recipe>
</recipeml>

Page 11: An Overview of XML

How are XML Schemas Created?

Document Type Definitions (DTDs), XML Stylesheets, and Namespace

Page 12: An Overview of XML

Schemas: Validated XML

Immediately after the initial XML declaration, which states that the file contains XML, a validated XML file usually includes a declaration that links* to an external file that defines:
- All tags that may be used in the file
- How the tags are nested
- The permissible values of each tag, and their format
- The attributes of each tag; and
- The permissible values of each attribute, and their format

* Though it is also possible to include the definitions in the file itself.
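The first two items in that list (which tags exist, and how they nest) can be illustrated with a toy checker. This is only a sketch of the idea, not a DTD or XSD processor: Python's standard library does not validate against either, and the `ALLOWED_CHILDREN` table below is a hypothetical set of rules loosely modeled on the RecipeML sample, not the real RecipeML DTD. It ignores attributes and permissible values entirely.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules in the spirit of a DTD: parent tag -> permitted child tags.
ALLOWED_CHILDREN = {
    "recipe": {"head", "ingredients", "directions"},
    "head": {"title"},
    "ingredients": {"ing"},
    "ing": {"amt", "item"},
    "amt": {"qty", "unit"},
    "directions": {"step"},
    # Leaf tags: text only, no child tags.
    "title": set(), "qty": set(), "unit": set(), "item": set(), "step": set(),
}


def find_violations(element):
    """Return a list of messages for tags that are unknown or badly nested."""
    if element.tag not in ALLOWED_CHILDREN:
        return [f"unknown tag <{element.tag}>"]
    problems = []
    for child in element:
        if child.tag not in ALLOWED_CHILDREN[element.tag]:
            problems.append(f"<{child.tag}> is not allowed inside <{element.tag}>")
        else:
            problems.extend(find_violations(child))
    return problems


GOOD = ("<recipe><head><title>Ice</title></head>"
        "<ingredients><ing><amt><qty>2</qty><unit>ounces</unit></amt>"
        "<item>water</item></ing></ingredients>"
        "<directions><step>Freeze the water.</step></directions></recipe>")
BAD = "<recipe><head><step>Freeze the water.</step></head></recipe>"

print(find_violations(ET.fromstring(GOOD)))
print(find_violations(ET.fromstring(BAD)))
```

A real validating parser does the same job from the DTD or XSD itself, and also checks attributes and value formats.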

Page 13: An Overview of XML

Schemas: Validated XML

There are two common methods of defining valid tags in an XML file:
- Document Type Definitions (DTD); and
- XML Schema Definitions (XSD)

Page 14: An Overview of XML

Namespace: Would the Real Tag Please Stand Up?

Oftentimes, a schema can have child tags and/or attributes with the same names as the child tags of another parent tag. For example:

<member>
  <id>12345</id>
</member>
<item>
  <id>54321</id>
</item>

Page 15: An Overview of XML

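Namespace: Would the Real Tag Please Stand Up?

So as to avoid confusion, xmlns attributes are included in the root element of the XML file, each suffixed with a short unique name for a group of tags that appear in the file (i.e., the namespace name, which serves as the prefix for those tags). The attribute’s value is a URI that uniquely identifies the namespace. By convention it is a “link” to the body responsible for the schema, here followed again by the namespace’s name.

<library xmlns:patronrecord="http://www.mylibrary.org/patronrecord"
         xmlns:itemrecord="http://www.mylibrary.org/itemrecord">

In the first attribute, the "patronrecord" after "xmlns:" is the name as suffix, "http://www.mylibrary.org" is the “link”, and the trailing "patronrecord" is the name.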

Page 16: An Overview of XML

Namespace: Would the Real Tag Please Stand Up?

Tags are defined as belonging to a particular namespace by adding the namespace name to the front of them as a prefix.

<patronrecord:member>
  <patronrecord:id>12345</patronrecord:id>
</patronrecord:member>
<itemrecord:item>
  <itemrecord:id>54321</itemrecord:id>
</itemrecord:item>

Note: XML schemas defined using XSDs support namespace; XML schemas defined using DTDs do not.
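As a minimal sketch of how software tells the two `id` tags apart, the following uses Python's standard `xml.etree.ElementTree` module on the example above. The wrapping `<library>` document is assembled here from the fragments on these slides; the `NS` dictionary is just a local lookup table from prefix to namespace URI.

```python
import xml.etree.ElementTree as ET

LIBRARY_XML = """<library xmlns:patronrecord="http://www.mylibrary.org/patronrecord"
         xmlns:itemrecord="http://www.mylibrary.org/itemrecord">
  <patronrecord:member>
    <patronrecord:id>12345</patronrecord:id>
  </patronrecord:member>
  <itemrecord:item>
    <itemrecord:id>54321</itemrecord:id>
  </itemrecord:item>
</library>"""

# Prefix -> namespace URI, mirroring the xmlns attributes on <library>.
NS = {
    "patronrecord": "http://www.mylibrary.org/patronrecord",
    "itemrecord": "http://www.mylibrary.org/itemrecord",
}

root = ET.fromstring(LIBRARY_XML)

# Each lookup names the namespace explicitly, so the two <id> tags cannot collide.
patron_id = root.findtext("patronrecord:member/patronrecord:id", namespaces=NS)
item_id = root.findtext("itemrecord:item/itemrecord:id", namespaces=NS)
print(patron_id, item_id)

# Internally the parser stores each tag as "{namespace URI}local-name".
print(root[0].tag)
```

The parser identifies a tag by the URI, not by the prefix text, which is why the prefix only needs to be unique within the file.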

Page 17: An Overview of XML

The Magic of XML: XSLT

EXtensible Stylesheet Language Transformations

Page 18: An Overview of XML

Switching Schemas

EXtensible Stylesheet Language Transformations (XSLT) allow one to input an XML file that uses one schema and output an XML file in another schema or format.

XSLT includes functions to manipulate the values as part of the transformation.

Namespace comes in very handy when transforming an XML file from one schema to another.
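The following is a rough Python analogue of what such a transformation does, not XSLT itself (Python's standard library has no XSLT processor, and a real stylesheet would express these rules declaratively as templates). It assumes a cut-down MARCXML record and maps only two fields into MODS-style tags; `marcxml_to_mods` and `subfield` are hypothetical helper names, and the field mapping is an illustrative simplification rather than the Library of Congress stylesheet.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
MODS_NS = "http://www.loc.gov/mods/v3"

MARCXML = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Inay, Matthew</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Afternoon performances /</subfield>
    <subfield code="c">by Matt Inay</subfield>
  </datafield>
</record>"""


def subfield(record, tag, code):
    """Return the text of the first matching MARCXML subfield, or None."""
    path = f"m:datafield[@tag='{tag}']/m:subfield[@code='{code}']"
    return record.findtext(path, namespaces={"m": MARC_NS})


def marcxml_to_mods(marcxml):
    """Build a small MODS-style element from a MARCXML record (two fields only)."""
    record = ET.fromstring(marcxml)
    mods = ET.Element(f"{{{MODS_NS}}}mods")

    # 245 $a -> titleInfo/title, with trailing " /" punctuation removed:
    # the kind of value manipulation a transformation can perform.
    title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
    title = ET.SubElement(title_info, f"{{{MODS_NS}}}title")
    title.text = subfield(record, "245", "a").rstrip(" /")

    # 100 $a -> name[@type='personal']/namePart
    name = ET.SubElement(mods, f"{{{MODS_NS}}}name", {"type": "personal"})
    name_part = ET.SubElement(name, f"{{{MODS_NS}}}namePart")
    name_part.text = subfield(record, "100", "a")
    return mods


mods = marcxml_to_mods(MARCXML)
print(ET.tostring(mods, encoding="unicode"))
```

The two namespace URIs are what keep the input tags and the output tags apart while both are in play.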

Page 19: An Overview of XML

XML for Library Metadata

Available Schemas and Sharing Metadata

Page 20: An Overview of XML

Library XML Schemas

General Metadata
- MARCXML: An XML schema containing all of the tags, indicators, and subfields of MARC21

Authority Metadata
- MADS (Metadata Authority Description Schema): A subset of MARC21 authority tags which uses linguistic tags rather than numerical tags

Page 21: An Overview of XML

Library XML Schemas

Bibliographic Metadata
- MODS (Metadata Object Description Schema): A subset of MARC21 bibliographic tags which uses linguistic tags rather than numerical tags
- METS (Metadata Encoding & Transmission Standard): An XML schema for descriptive, administrative, and structural metadata of digital objects
- YANKEES (Yokel’s And Non-Knowledgeable’s Extensible Encoding Standard): METS is used in Queens; YANKEES is used in the Bronx.

Page 22: An Overview of XML

Library XML Schemas, etc.

Archival Metadata
- EAD (Encoded Archival Description)

Graphical Metadata
- VRA Core (Visual Resources Association Core)
- MIX (Metadata for Images in XML)

Etc. etc. etc.

Page 23: An Overview of XML

Sample MARCXML Record

<?xml version="1.0" encoding="UTF-8"?>
<record xmlns:zs="http://www.loc.gov/zing/srw/"
        xmlns:cinclude="http://apache.org/cocoon/include/1.0"
        xmlns="http://www.loc.gov/MARC21/slim">
  <leader>01298cam a22003255a 4500</leader>
  <controlfield tag="001">14730252</controlfield>
  <controlfield tag="005">20090313105340.0</controlfield>
  <controlfield tag="008">070209r20121957nyua b 000 1 eng </controlfield>
  <datafield tag="100" ind2=" " ind1="1">
    <subfield code="a">Inay, Matthew</subfield>
  </datafield>
  <datafield tag="245" ind2="0" ind1="1">
    <subfield code="a">Afternoon performances /</subfield>
    <subfield code="c">by Matt Inay</subfield>
  </datafield>
</record>
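To show that MARCXML really does carry the same tags, indicators, and subfields as a MARC record, here is a small sketch that turns the data fields of a record like the one above back into the familiar line-per-field display. It assumes only the `datafield` portion of the sample, and `to_marc_display` is a hypothetical helper name; blank indicators are shown as `_`.

```python
import xml.etree.ElementTree as ET

# The record declares the MARCXML namespace as its default, so every tag
# lookup has to name that namespace.
NS = {"marc": "http://www.loc.gov/MARC21/slim"}

MARCXML = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="100" ind2=" " ind1="1">
    <subfield code="a">Inay, Matthew</subfield>
  </datafield>
  <datafield tag="245" ind2="0" ind1="1">
    <subfield code="a">Afternoon performances /</subfield>
    <subfield code="c">by Matt Inay</subfield>
  </datafield>
</record>"""


def to_marc_display(marcxml):
    """Render each MARCXML datafield as 'TAG II $a value $b value ...'."""
    record = ET.fromstring(marcxml)
    lines = []
    for field in record.findall("marc:datafield", NS):
        # Attribute order in the XML does not matter; look each one up by name.
        indicators = (field.get("ind1", " ") + field.get("ind2", " ")).replace(" ", "_")
        subfields = " ".join(
            f"${sf.get('code')} {sf.text}"
            for sf in field.findall("marc:subfield", NS)
        )
        lines.append(f"{field.get('tag')} {indicators} {subfields}")
    return lines


for line in to_marc_display(MARCXML):
    print(line)
```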

Page 24: An Overview of XML

Sample MODS Record

<?xml version="1.0" encoding="UTF-8"?>
<mods version="3.4"
      xsi:schemaLocation="http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-4.xsd"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xmlns="http://www.loc.gov/mods/v3">
  <titleInfo>
    <title>Chicken soup for the vegetarian soul</title>
  </titleInfo>
  <name usage="primary" type="personal">
    <namePart>Soyhen, Aida</namePart>
  </name>
  <typeOfResource>text</typeOfResource>
  <originInfo>
    <place>
      <placeTerm type="code" authority="marccountry">nyu</placeTerm>
    </place>
    <place>
      <placeTerm type="text">New York</placeTerm>
    </place>
    ...
</mods>

Page 25: An Overview of XML

Sample MADS Record

<?xml version="1.0" encoding="utf-8"?>
<madsCollection xsi:schemaLocation="http://www.loc.gov/mads http://www.loc.gov/standards/mads/mads.xsd"
                xmlns="http://www.loc.gov/mods/v3"
                xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xmlns:xlink="http://www.w3.org/1999/xlink">
  <mads version="beta">
    <authority>
      <name type="personal" authority="naf">
        <namePart>O'Shea, Rick</namePart>
      </name>
      <titleInfo authority="naf">
        <title />
      </titleInfo>
    </authority>
    <note type="source">Bouncing back, 1991: t.p. (Steven Bernstein)</note>
    ...
  </mads>
</madsCollection>

Page 26: An Overview of XML

Sharing our Metadata

When our metadata is encoded in an XML schema, we can share it more easily through XSLT stylesheets that convert our library metadata into other standards such as the Resource Description Framework (RDF), the foundation for the Semantic Web. It is more clearly identifiable as metadata and can be more easily harvested.

We can also more easily benefit from the metadata of others for our own use.

Page 27: An Overview of XML

Sharing our Metadata

MARCXML
- Preserves all the detail of the metadata
- Despite using the universal language of numbers for tag names, is difficult for non-librarians to understand
- Uses complex structures to maintain robustness of MARC21

MODS
- Many details lost to simplification
- Is easier for English-speaking non-librarians to understand
- Structured much more simply

Markup of Markup…
Round Hole; Square Peg

Page 28: An Overview of XML

Essential Conversion Tool

MarcEdit by Terry Reese can convert your metadata between many Library XML schemas.

Version 5.8.4698.40412 was just released this week!
http://people.oregonstate.edu/~reeset/marcedit/

Page 29: An Overview of XML

XML: Where are We Going?

Which Library Schemas will be the Future?

Page 30: An Overview of XML

Some Resources

World Wide Web Consortium (W3C)
http://www.w3.org/

W3Schools XML Tutorial
http://www.w3schools.com/xml/

Library of Congress Metadata Standards
http://www.loc.gov/standards/

The Official M.C. Escher Website
http://mcescher.com/