Upload
keyla-sugden
View
216
Download
0
Embed Size (px)
Citation preview
Encoding DC in (X)HTML, XML and RDF
Andy [email protected]
UKOLN, University of Bath, UKhttp://www.ukoln.ac.uk/
UKOLN is supported by:
Tutorial at ECDL 2004, BathSeptember 2004
ECDL 2004 tutorial - Bath, Sept 20042
Contents
• an abstract model for DC (30 mins)• encoding DC in XHTML (30 mins)• encoding DC in XML (30 mins)• encoding DC in RDF/XML (30 mins)• practical examples
• OAI Protocol forMetadata Harvestingand RSS (20 mins)
• assigning identifiers (20 mins)Note: you are going to see lots ofangle-brackets – but no XML schemas!
ECDL 2004 tutorial - Bath, Sept 20043
Important
• this is a tutorial…
…please feel free to ask questionsas we go through!
Important DCMI documents…• DCMI Abstract Model – DRAFT
http://www.ukoln.ac.uk/metadata/dcmi/abstract-model/
• Expressing Dublin Core in HTML/XHTML meta and link elementshttp://dublincore.org/documents/dcq-html/
• Guidelines for implementing Dublin Core in XMLhttp://dublincore.org/documents/dc-xml-guidelines/
• Expressing Simple Dublin Core in RDF/XMLhttp://dublincore.org/documents/dcmes-xml/
• Expressing Qualified Dublin Core in RDF/XMLhttp://dublincore.org/documents/dcq-rdf-xml/
• Namespace Policy for the DCMIhttp://dublincore.org/documents/dcmi-namespace/
• DCMI Metadata Termshttp://dublincore.org/documents/dcmi-terms/
ECDL 2004 tutorial - Bath, Sept 20045
Implementing DC
• this tutorial is about the mechanics of implementing DC in HTML, XML and RDF
• it doesn’t really consider which implementation strategy isthe best!
• ask yourself two questions…• what am I trying to achieve?
• does using HTML, XML or RDF help me achieve it?
• do software and services exist that will support the creation and use of mymetadata?
ECDL 2004 tutorial - Bath, Sept 20046
Abstract models for DC
ECDL 2004 tutorial - Bath, Sept 20047
Why abstract models?
• the first part of this tutorial isn’t going to show any angle brackets!
• why?• because before we start creating DCMI
descriptions we need to understand• the DCMI view of the world/resources we
want to describe (the DCMI resource model)
• the DCMI view of the descriptions we make about that world (the DCMI description model)
ECDL 2004 tutorial - Bath, Sept 20048
DCMI resource model
• each resource that we want to describe has zero or more properties
• a property is a specific aspect, characteristic, attribute or relation used to describe a resource
• each property has one or more values• each value is a resource (the physical or
conceptual entity that is associated with a property when it is used to describe a resource)
ECDL 2004 tutorial - Bath, Sept 20049
Err… but what is a resource?
• W3C/IETF definition of resource is“…anything that has identity. Familiar examples include an
electronic document, an image, a service (e.g., "today's weather report for Los Angeles"), and a collection of other
resources. Not all resources are network "retrievable"; e.g., human beings, corporations, and bound books in a library can
also be considered resources.”
• i.e. a resource is “anything”• physical things (books, cars, people)• digital things (Web pages, digital images)• conceptual things (colours,
points in time)
ECDL 2004 tutorial - Bath, Sept 200410
Yeah, but… no, but…
• but… this seems to be too wide for the things we can describe with DC!
• can we really describe people using DC?• do people have titles and subjects?
• no… in general we only use DC to describe a sub-set of all resources
• anything covered by the DCMIType list…• Collection, Dataset, Event, Image (Still or
Moving), Interactive Resource, Service, Software, Sound, Text, Physical Object
ECDL 2004 tutorial - Bath, Sept 200411
DCMI resource model (2)
• each resource may be a member of one or more classes
• each class may be related to one or more other classes by a refines (sub-class) relationship
• the two classes share some semantics such that all resources that are members of the sub-class are also members of the related class
where the resource is the value of aproperty, the class is referred to asa vocabulary encoding scheme
ECDL 2004 tutorial - Bath, Sept 200412
DCMI resource model (3)
• each property may be related to exactly one other property by a refines (sub-property) relationship
• the two properties share some semantics such that all valid values of the sub-property are also valid values of the related property
ECDL 2004 tutorial - Bath, Sept 200413
DCMI description model
• a description is made up of• one or more statements (about one, and only one,
resource) and• zero or one resource URI (a URI reference that
identifies the resource being described)
• each statement is made up of• a property URI (that identifies a property),• zero or one value URI (that identifies a value of the
property),• zero or one encoding scheme URI (that identifies
the class of the value) and• zero or more value representations of the value
ECDL 2004 tutorial - Bath, Sept 200414
DCMI description model (2)
• each property is an attribute of the resource being described
• each property URI may be repeated in multiple statements
• the value representation may take the form of a value string, a rich value or a related description
ECDL 2004 tutorial - Bath, Sept 200415
DCMI description model (3)
• each value string is a simple, human-readable string that represents the value of the property
• each value string may have an associated encoding scheme URI that identifies a syntax encoding scheme
• each value string may have an associated value string language that is an ISO language tag (e.g. en-GB)
ECDL 2004 tutorial - Bath, Sept 200416
DCMI description model (4)
• each rich value is some marked-up text, an image, a video, some audio, etc. or some combination thereof that represents the resource that is the value of the property
• each related description is a description of (i.e. some metadata about) the resource that is the value of the property
ECDL 2004 tutorial - Bath, Sept 200417
The 1:1 principle
• notice that the model indicates that each property used in a description must be an attribute of the resource being described
• this is commonly referred to as the 1:1 principle - the principle that a DCMI metadata description describes one, and only one, resource
• however…
ECDL 2004 tutorial - Bath, Sept 200418
Description sets
• real-world metadata applications tend to be based on loosely grouped sets of descriptions (where the described resources are typically related in some way)
• known here as description sets• for example, a description set might
comprise descriptions of both a painting and the artist
ECDL 2004 tutorial - Bath, Sept 200419
DCMI records
• description sets are instantiated, for the purposes of exchange between software applications, in the form of metadata records
• each record conforms to one of the DCMI encoding guidelines (XHTML meta tags, XML, RDF/XML, etc.)
<dc:title>a document</dc:title><dc:creator>andy powell</dc:creator>
ECDL 2004 tutorial - Bath, Sept 200420
Values (again!)
• a value is the physical or conceptual entity that is associated with a property when it is used to describe a resource
• the value of the DC Creator property is a person, organisation or service - a physical entitiy
• the value of the DC Date property is a point in time - a conceptual entity
• the value of the DC Coverage property may be a geographic region or country - a physical entity
• the value of the DC Subject property may be a concept - a conceptual entity - or a physical object or person - a physical entity
• each of these entities is a resource
ECDL 2004 tutorial - Bath, Sept 200421
Simple vs. qualified DC?
• the notions of “simple DC” and “qualified DC” are often referred to in DCMI discussions
• a view of what these two terms mean is presented here…
note that not everyoneagrees with mydefinitions!
ECDL 2004 tutorial - Bath, Sept 200422
Simple DC record
• a simple DC record is a record that: • conforms to the abstract model,• comprises only a single description,• uses only the 15 properties in the Dublin
Core Metadata Element Set,• makes no use of value URIs, encoding
schemes, rich values or related descriptions.
ECDL 2004 tutorial - Bath, Sept 200423
A couple of notes…• there is no guaranteed linkage between a simple DC
record and the resource being described because the resource URI is optional
• such a linkage may be made by encoding the URI of the resource as the value string of the DC Identifier element, however this is not mandatory – everything in DC is optional
• while the value string of a property may look like a URI, there is nothing in the simple DC model that indicates this is the case
…at their own risk, implementations may choose to guess which value strings are URIs and which are not…
ECDL 2004 tutorial - Bath, Sept 200424
Qualified DC model
• a qualified DC record is a record that: • conforms to the DCMI abstract model,• contains at least one property taken
from the DCMI Metadata Terms recommendation
ECDL 2004 tutorial - Bath, Sept 200425
A couple of notes…
• it is still the case that there is no guaranteed linkage between a qualified DC record and the resource being described!
• a linkage may be made by encoding the URI of the resource as the value string of the DC Identifier element, however this is not mandatory – everything in DC is optional
…where the value of a property is a URI, we can now indicate that it is a URI by using the ‘URI’ encoding scheme…
ECDL 2004 tutorial - Bath, Sept 200426
Dumb-down
• the process of translating a qualified DC metadata record into a simple DC metadata record is normally referred to as ‘dumbing-down’
• can be separated into two parts: property dumb-down and value dumb-down.
• each of these processes can be be approached in one of two ways• informed dumb-down• uninformed dumb-down
ECDL 2004 tutorial - Bath, Sept 200427
Dumb and dumberer
• informed dumb-down takes place where the software performing the dumb-down algorithm has knowledge built into it about the property relationships and values being used within a specific DCMI metadata application
• uninformed dumb-down takes place where the software performing the dumb-down algorithm has no prior knowledge about the properties and values being used
ECDL 2004 tutorial - Bath, Sept 200428
Dumb-down algorithm
ignore any property that isn't in the Dublin Core Metadata Element Set
use value URI (if present) or value string as new
value string
recursively resolve sub-property relationships until one of the 15 properties in
the DCMES is reached, otherwise ignore
use knowledge of the related descriptions or the
value string to create a new value string
element value
uninformed
informed
…and in all cases:•ignore any related descriptions and rich values,•ignore any encoding scheme URIs.
ECDL 2004 tutorial - Bath, Sept 200429
Encoding DC in XHTML (and HTML!)
ECDL 2004 tutorial - Bath, Sept 200430
What is being described?
• a DC record embedded in an (X)HTML document describes that document
• if you want to describe something else, don’t embed it in the (X)HTML document!
…not everyone would agree with this…
ECDL 2004 tutorial - Bath, Sept 200431
Simple DC elements
• use the ‘name’ and ‘content’ attributes of the XHTML <meta> element to encode the DC element (one of the 15 DCMES elements) and it's value string. Use the following pattern:
<meta name="DC.element" content="Value string" />
• for example:
<meta name="DC.date" content="2001-07-18" />
…the element names of the 15 DCMES elementsalways have a lower-case first letter…
ECDL 2004 tutorial - Bath, Sept 200432
Simple DC values
• value strings go in the XHTML <meta> element ‘content’ attribute…
• the string in the ‘content’ attribute is defined to be CDATA, i.e. a sequence of characters from the document character set which may include character entities
…long value strings may be wrappedacross multiple lines as necessary…
…will need to escape some characters, &, <, >, etc…
ECDL 2004 tutorial - Bath, Sept 200433
Language of the value
• where the language of the value string is indicated, it should be encoded using the ‘xml:lang’ attribute of the XHTML <meta> element. For example:
<meta name="DC.subject" xml:lang="en" content="seafood" /><meta name="DC.subject" xml:lang="fr" content="fruits de mer" />
ECDL 2004 tutorial - Bath, Sept 200434
Repeated elements
• multiple property values should be encoded by repeating the XHTML <meta> element for that property, for example:
<meta name="DC.title" content="First title" /><meta name="DC.title" content="Second title" />
ECDL 2004 tutorial - Bath, Sept 200435
Other DC elements• use the ‘name’ and ‘content’ attributes of the
XHTML <meta> element to encode the DC element (e.g. audience) and it's value. Use the following patterns:
<meta name="DCTERMS.element" content="Value" />
• for example:
<meta name="DCTERMS.audience" content="software developers" />
…element names may be mixed-case butshould always have a lower-case first letter…
ECDL 2004 tutorial - Bath, Sept 200436
Element refinements
• use the ‘name’ and ‘content’ attributes of the XHTML <meta> element to encode the element refinement and it's value. Use the following pattern:
<meta name="DCTERMS.elementRefinement“ content="Value" />
• for example:
<meta name="DCTERMS.modified" content="2001-07-18" />
ECDL 2004 tutorial - Bath, Sept 200437
Element refinements (2)
• element refinements should use the names specified in:
DCMI Metadata Termshttp://dublincore.org/documents/dcmi-terms/
…as a general rule, element refinement names may be mixed-case but should always have a lower-case first letter…
ECDL 2004 tutorial - Bath, Sept 200438
Encoding schemes
• encoding schemes are encoded using the ‘scheme’ attribute of the XHTML <meta> element, using the following pattern:
<meta name="DC.element" scheme="DCTERMS.Scheme" content="Value" />
• for example:
<meta name="DC.date" scheme="DCTERMS.W3CDTF" content="2001-07-18" />
ECDL 2004 tutorial - Bath, Sept 200439
Encoding schemes (2)
• encoding schemes should use the names specified in:
DCMI Metadata Termshttp://dublincore.org/documents/dcmi-terms/
…as a general rule, encoding scheme names may be mixed-case but should always start with an upper-case letter. Encoding scheme names are often all upper-case…
ECDL 2004 tutorial - Bath, Sept 200440
Handling namespaces…• the ‘DC.’ and ‘DCTERMS.’ prefixes are used to
indicate the namespace from which the property is taken
• put the namespace URI in an XHTML <link> element:
<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" /><link rel="schema.DCTERMS" href="http://purl.org/dc/terms/" />
• while any string is allowable as the prefix, current practice is to use ‘DC.’ and ‘DCTERMS.’
ECDL 2004 tutorial - Bath, Sept 200441
Value URIs
• where the value of a property is the URI of another resource (e.g. DC.relation) an alternative form of encoding using the XHTML <link> element is preferred. Use the following pattern:
<link rel="propertyName" href=“valueURI" />
• for example:
<link rel="DC.relation"href="http://www.example.org/" /><link rel="DCTERMS.references"href="http://www.example.org/176459.pdf" />
ECDL 2004 tutorial - Bath, Sept 200442
HTML profile
• to give recipient software applications an indication of the XHTML profile that was used to encode the DCMI metadata, the ‘profile’ attribute of the XHTML <head> element should be used:
<head profile="http://dublincore.org/documents/dcq-html/">
ECDL 2004 tutorial - Bath, Sept 200443
The case of names
• note that some of the old DCMI documents recommend(ed) using an uppercase first letter for the names of DCMES elements and element refinements, e.g. using ‘Title’ rather than ‘title’
• this form of DCMES element naming is acceptable but is no longer considered the preferred form
ECDL 2004 tutorial - Bath, Sept 200444
The case of names (2)
• in general, any software applications that consume DC records embedded into XHTML Web pages should ignore the case of names and should treat both ‘.’ (full-stop) and ‘:’ (colon) as valid characters after the prefix
• i.e. all the following forms should be treated as being equivalent:
<meta name="DC.date" content="2001-07-18" /> <meta name="DC.Date" content="2001-07-18" /> <meta name="dc.date" content="2001-07-18" /> <meta name="dc:date" content="2001-07-18" />
ECDL 2004 tutorial - Bath, Sept 200445
Older versions of HTML
• all the examples in this tutorial conform to XHTML 1.0
• most of the recommendations in this tutorial can be applied to older versions of HTML (e.g. HTML 4.01) but• use ‘lang’ rather than ‘xml:lang’• older versions of HTML do not require the
trailing ‘/’ before the closing ‘>’ in the HTML <meta> element
• very old versions of HTML have no <meta> ‘scheme’ attribute
ECDL 2004 tutorial - Bath, Sept 200446
Mixing DC and non-DC• DC metadata can be mixed with non-DC metadata
in XHTML <meta> elements • the following example embeds DC, AGLS and
unspecified metadata properties in the same XHTML Web page:
<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" /><link rel="schema.AGLS"href="http://www.naa.gov.au/recordkeeping/gov_online/agls/1.2" /><meta name="DC.title" content="Services to Government" /><meta name="keywords" content="archives, information management, public administration" /><meta name="AGLS.Function" scheme="AGIFT" content="recordkeeping standards" />
ECDL 2004 tutorial - Bath, Sept 200447
A couple of examples
• Simple DC
example 1
• Qualified DC
example 2
• ScreenCam of using DC-dot
http://www.ukoln.ac.uk/metadata/dcdot/
ECDL 2004 tutorial - Bath, Sept 200448
Encoding DC in XML
ECDL 2004 tutorial - Bath, Sept 200449
Introduction to XML
• this section is based on:
Guidelines for implementing Dublin Core in XMLhttp://dublincore.org/documents/dc-xml-guidelines/
• nine recommendations for encoding DC in XML
Note: these recommendations do not create RDF/XML. This is not intended to imply that plain XML is better than RDF/XML… …RDF is covered in the next section!
ECDL 2004 tutorial - Bath, Sept 200450
Recommendation 1
• implementers should base their XML applications on XML Schemas rather than XML DTDs
• approaches based on XML Schemas are more flexible and are more easily re-used within other XML applications
…in some cases it may be sensible to provide both an XML Schema and a DTD for the application. Where XML Schemas are not used, a DTD should be provided instead…
ECDL 2004 tutorial - Bath, Sept 200451
Recommendation 1 (2)
• the DCMI maintains a list of XML schemas that are in use in projects or products using DCMI metadata
DCMI Metadata expressed in XML Schema Languagehttp://dublincore.org/schemas/xmls/
ECDL 2004 tutorial - Bath, Sept 200452
Recommendation 2
• implementers should use URI references (see later) to uniquely identify DC elements, element refinements and encoding schemes
…DC namespaces are defined in the DCMI Namespace Recommendation...
ECDL 2004 tutorial - Bath, Sept 200453
Container elements
• note that it is anticipated that records will be encoded within one or more container XML element(s) of some kind
• this tutorial makes no recommendations for the name of any container element, nor for the namespace that the element should be taken from
• candidate container element names include <dc>, <dublinCore>, <resource>, <record> and <metadata>
ECDL 2004 tutorial - Bath, Sept 200454
Recommendation 3
• implementers should encode properties as XML elements and values as the content of those elements
• the name of the XML element should be an XML qualified name (QName) of the property
<dc:title>Dublin Core in XML</dc:title>
• do not use constructs like
<dc:title value="Dublin Core in XML" />
ECDL 2004 tutorial - Bath, Sept 200455
Recommendation 4
• the property names for the 15 DC elements should be all lower-case
<dc:title>Dublin Core in XML</dc:title>
• do not use
<dc:Title>Dublin Core in XML</dc:Title>
ECDL 2004 tutorial - Bath, Sept 200456
Recommendation 5
• multiple property values should be encoded by repeating the XML element for that property
<dc:title>First title</dc:title> <dc:title>Second title</dc:title>
ECDL 2004 tutorial - Bath, Sept 200457
Simple DC example
• example 3
ECDL 2004 tutorial - Bath, Sept 200458
Recommendation 6• element refinements should be treated in the
same way as other properties• the name of the XML element should be an XML
qualified name (QName):
<dcterms:available>2002-06</dcterms:available>
• do not use any of the following:
<dc:date refinement="available">2002-06</dc:date><dc:date type="available">2002-06</dc:date><dc:date> <dcterms:available>2002-06 </dcterms:available></dc:date>
ECDL 2004 tutorial - Bath, Sept 200459
Recommendation 6 (2)
• element refinements are properties in their own right and are therefore best encoded in a similar way to other DC elements
• in particular, it should be noted that element refinements may have further refinements of their own (e.g. ‘format’ is refined by ‘extent’ which might be further refined by ‘duration’)
…nesting does not mean refinement;nesting might be used for other purposes…
ECDL 2004 tutorial - Bath, Sept 200460
Recommendation 7
• encoding schemes should be implemented using the 'xsi:type' attribute of the XML element for the property
• the name of the encoding scheme should be given as the attribute value, and should be in the form of an XML qualified name (QName):
<dc:identifier xsi:type="dcterms:URI"> http://www.ukoln.ac.uk/</dc:identifier>
ECDL 2004 tutorial - Bath, Sept 200461
Recommendation 7 (2)
• it should be noted that there may be existing DC XML applications that use other conventions to support encoding schemes, notably the use of a ‘scheme’ attribute of the XML element for the property
• therefore, it may be sensible for software applications that consume DC XML to be fairly liberal in what they accept
ECDL 2004 tutorial - Bath, Sept 200462
Recommendation 8
• elements, element refinements and encoding schemes should use the names specified in
DCMI Metadata Termshttp://dublincore.org/documents/dcmi-terms/
…note, the 15 DCMES element names all start with a lowercase letter…
ECDL 2004 tutorial - Bath, Sept 200463
Recommendation 8 (2)
• element and element refinement names may be mixed-case but should always have a lower-case first letter
• encoding scheme names may be mixed-case but should always start with an upper-case letter
<dcterms:isPartOf xsi:type="dcterms:URI"> http://www.bbc.co.uk/</dcterms:isPartOf>
<dcterms:temporal xsi:type="dcterms:Period">name=The Great Depression; start=1929; end=1939; </dcterms:temporal>
ECDL 2004 tutorial - Bath, Sept 200464
Recommendation 9
• where the language of the value is indicated, it should be encoded using the ‘xml:lang’ attribute
<dc:subject xml:lang="en"> seafood</dc:subject><dc:subject xml:lang="fr"> fruits de mer</dc:subject>
ECDL 2004 tutorial - Bath, Sept 200465
Some examples
• Qualified DC
example 4
• DC and IMS
example 5
• DC, IMS and ODRL
example 6
HEALTH WARNINGExamples 5 and 6 mayseriously damage yourinteroperability!
ECDL 2004 tutorial - Bath, Sept 200466
Encoding DC in RDF
ECDL 2004 tutorial - Bath, Sept 200467
What is RDF?
• Resource Description Framework• W3C recommendation for metadata• model and syntax(es)• XML is most common syntax in use on
the Web• underpins the ‘semantic Web’
W3C - Resource Description Framework (RDF)http://www.w3.org/RDF/
ECDL 2004 tutorial - Bath, Sept 200468
Why use RDF?
• RDF provides shared metadata ‘model’…• …shared ‘meaning’• metadata can be shared between applications
that have little or no knowledge about each other• e.g. an RDF-based bibliographic application can
consume RDF-based geospatial metadata and have 'some' knowledge of what it means
…with (X)HTML and XML encodings, softwareapplications must have ‘understanding’ hard-codedinto them…
ECDL 2004 tutorial - Bath, Sept 200469
DC in RDF
• DC abstract models map easily onto the RDF model (because RDF was the basis for them!)
• DC in RDF/XML syntax is an encoding of the RDF model in XML
• simple DC is similar to the non-RDF XML we've seen already…
• …but with the addition of <rdf:RDF> and <rdf:Description> container elements
• example 7
ECDL 2004 tutorial - Bath, Sept 200470
RDF basics… the model• model based on triples
• a resource has a property which has a value• often represented as an arc-node diagram (or
“graph”)• resources and properties are identified using URI
references
resource “value”property
ECDL 2004 tutorial - Bath, Sept 200471
A more concrete example• The graph below approximately translates into English as…
the resource identified by the URI http://example.org/ has a dc:creator that is represented by the string “Andy Powell”
http://example.org/ “Andy Powell”
http://purl.org/dc/elements/1.1/creator
ECDL 2004 tutorial - Bath, Sept 200472
Values as resources
• values can be resources too• means that we can then attach properties
to the value as well as to the original resource
• build up quite complex graphs
http://example.org/
“Andy Powell”
dc:creator
“01225 383933”
my:phoneNumbermy:name
ECDL 2004 tutorial - Bath, Sept 200473
Typed and blank nodes
• nodes can be blank (to represent resources that have not be assigned a URI)
• can also indicate the class of a resource using the rdf:type property
http://example.org/
“Andy Powell”
dc:creator
“01225 383933”
my:phoneNumbermy:name
my:Person
rdf:type
ECDL 2004 tutorial - Bath, Sept 200474
Qualified DC in RDF
• now ready to look at some more complex examples
• for full details about how to encode DC in RDF see…
Expressing Simple Dublin Core in RDF/XMLhttp://dublincore.org/documents/dcmes-xml/
Expressing Qualified Dublin Core in RDF/XMLhttp://dublincore.org/documents/dcq-rdf-xml/
ECDL 2004 tutorial - Bath, Sept 200475
Case study 1 – dc:creator
<?xml version="1.0"?><rdf:RDF xmlns:rdf=http://www….xmlns:rdfs=http://www.w3.org/…xmlns:dc=http://purl.org/dc/…xmlns:my="http://purl.org…"> <rdf:Description> <dc:creator> <rdf:Description> <rdf:value> Andy Powell </rdf:value> <my:email> [email protected] </my:email> </rdf:Description> </dc:creator> </rdf:Description></rdf:RDF>
Example RDF description using dc:creator…
ECDL 2004 tutorial - Bath, Sept 200476
<?xml version="1.0"?><rdf:RDF xmlns:rdf=http://www….xmlns:rdfs=http://www.w3.org/…xmlns:dc=http://purl.org/dc/…xmlns:my="http://purl.org…"> <rdf:Description> <dc:creator> <rdf:Description> <rdf:value> Andy Powell </rdf:value> <my:email> [email protected] </my:email> </rdf:Description> </dc:creator> </rdf:Description></rdf:RDF>
Case study 1 – dc:creator
dc:creator
Andy Powell…
my:affiliation a.powell@uko…
my:email
…and the RDF model it represents.
UKOLN, Univ…
a.powell@uko…
Andy Po…
rdfs:label
my:name
ECDL 2004 tutorial - Bath, Sept 200477
<?xml version="1.0"?><rdf:RDF xmlns:rdf=http://www….xmlns:rdfs=http://www.w3.org/…xmlns:dc=http://purl.org/dc/…xmlns:my="http://purl.org…"> <rdf:Description> <dc:creator> <rdf:Description> <rdf:value> Andy Powell </rdf:value> <my:email> [email protected] </my:email> </rdf:Description> </dc:creator> </rdf:Description></rdf:RDF>
Case study 1 – dc:creator
dc:creator
Andy Powell…
my:affiliation a.powell@uko…
my:email
UKOLN, Univ…
a.powell@uko…
Andy Po…
rdfs:label
my:name
But… we don’t want to embed all this information
into every instance metadata record do we?
relatedMetadata
ECDL 2004 tutorial - Bath, Sept 200478
<?xml version="1.0"?><rdf:RDF xmlns:rdf=http://www….xmlns:rdfs=http://www.w3.org/…xmlns:dc=http://purl.org/dc/…xmlns:my="http://purl.org…"> <rdf:Description> <dc:creator> <rdf:Description> <rdf:value> Andy Powell </rdf:value> </rdf:Description> </dc:creator> </rdf:Description></rdf:RDF>
Case study 1 – dc:creator
dc:creator
Andy Powell…
rdfs:label
<?xml version="1.0"?><rdf:RDF xmlns:rdf=http://www….xmlns:rdfs=http://www.w3.org/…xmlns:dc=http://purl.org/dc/…xmlns:my="http://purl.org…"> <rdf:Description> <my:name> Andy Powell </my:name> <my:email> [email protected] </my:email> </rdf:Description></rdf:RDF>
my:affiliation a.powell@uko…
my:email
UKOLN, Univ…
a.powell@uko…
Andy Po…
my:name
Need to separate part of the information out and store it
in a single place – in this case in a directory service…
ECDL 2004 tutorial - Bath, Sept 200479
<?xml version="1.0"?><rdf:RDF xmlns:rdf=http://www….xmlns:rdfs=http://www.w3.org/…xmlns:dc=http://purl.org/dc/…xmlns:my="http://purl.org…"> <rdf:Description> <dc:creator> <rdf:Description> <rdf:value> Andy Powell </rdf:value> </rdf:Description> </dc:creator> </rdf:Description></rdf:RDF>
Case study 1 – dc:creator
valueURI
dc:creator
Andy Powell…
rdfs:label
<?xml version="1.0"?><rdf:RDF xmlns:rdf=http://www….xmlns:rdfs=http://www.w3.org/…xmlns:dc=http://purl.org/dc/…xmlns:my="http://purl.org…"> <rdf:Description> <my:name> Andy Powell </my:name> <my:email> [email protected] </my:email> </rdf:Description></rdf:RDF>
valueURI
my:affiliation a.powell@uko…
my:email
UKOLN, Univ…
a.powell@uko…
Andy Po…
my:name
To do this we need to assign a URI (the ‘valueURI’) to the anonymous ‘value’ node…
ECDL 2004 tutorial - Bath, Sept 200480
<?xml version="1.0"?><rdf:RDF xmlns:rdf=http://www….xmlns:rdfs=http://www.w3.org/…xmlns:dc=http://purl.org/dc/…xmlns:my="http://purl.org…"> <rdf:Description> <dc:creator> <rdf:Description> <rdf:value> Andy Powell </rdf:value> </rdf:Description> </dc:creator> </rdf:Description></rdf:RDF>
Case study 1 – dc:creator
valueURI
dc:creator
Andy Powell…
rdfs:label
<?xml version="1.0"?><rdf:RDF xmlns:rdf=http://www….xmlns:rdfs=http://www.w3.org/…xmlns:dc=http://purl.org/dc/…xmlns:my="http://purl.org…"> <rdf:Description> <my:name> Andy Powell </my:name> <my:email> [email protected] </my:email> </rdf:Description></rdf:RDF>
valueURI
my:affiliation a.powell@uko…
my:email
UKOLN, Univ…
a.powell@uko…
Andy Po…
my:name
relatedMetadataURI
The document containing this information is itself an
RDF resource (the ‘relatedMetadata’) and has
a URI
ECDL 2004 tutorial - Bath, Sept 200481
<?xml version="1.0"?><rdf:RDF xmlns:rdf=http://www….xmlns:rdfs=http://www.w3.org/…xmlns:dc=http://purl.org/dc/…xmlns:my="http://purl.org…"> <rdf:Description> <dc:creator> <rdf:Description> <rdf:value> Andy Powell </rdf:value> <my:email> [email protected] </my:email> </rdf:Description> </dc:creator> </rdf:Description></rdf:RDF>
Case study 1 – dc:creator
valueURI
dc:creator
Andy Powell…
rdfs:label
<?xml version="1.0"?><rdf:RDF xmlns:rdf=http://www….xmlns:rdfs=http://www.w3.org/…xmlns:dc=http://purl.org/dc/…xmlns:my="http://purl.org…"> <rdf:Description> <dc:creator> <rdf:Description> <rdf:value> Andy Powell </rdf:value> <my:email> [email protected] </my:email> </rdf:Description> </dc:creator> </rdf:Description></rdf:RDF>
valueURI
my:affiliation a.powell@uko…
my:email
UKOLN, Univ…
a.powell@uko…
Andy Po…
my:name
relatedMetadataURI
rdfs:seeAlso
Use rdf:seeAlso to form linkage between description
and relatedMetadata…
ECDL 2004 tutorial - Bath, Sept 200482
Case study 2 – dc:subject
<?xml version="1.0"?><rdf:RDF xmlns:rdf=http://www….xmlns:rdfs=http://www.w3.org/…xmlns:dc=http://purl.org/dc/…xmlns:dcterms="http://purl.org…"> <rdf:Description> <dc:subject> <dcterms:MESH> <rdf:value> D08.586.682.075.400 </rdf:value> <rdfs:label> Formate Dehydrogenase </rdfs:label> </dcterms:MESH> </dc:subject> </rdf:Description></rdf:RDF>
Example RDF description using dc:subject (taken
from Qualified DC in RDF recommendation…
ECDL 2004 tutorial - Bath, Sept 200483
Case study 2 – dc:subject
<?xml version="1.0"?><rdf:RDF xmlns:rdf=http://www….xmlns:rdfs=http://www.w3.org/…xmlns:dc=http://purl.org/dc/…xmlns:dcterms="http://purl.org…"> <rdf:Description> <dc:subject> <dcterms:MESH> <rdf:value> D08.586.682.075.400 </rdf:value> <rdfs:label> Formate Dehydrogenase </rdfs:label> </dcterms:MESH> </dc:subject> </rdf:Description></rdf:RDF>
dcterms:MESH
dc:subject
rdf:type D08.586…rdf:type
rdfs:label
Formated…
rdfs:value
…and the RDF model it represents.
ECDL 2004 tutorial - Bath, Sept 200484
Case study 2 – dc:subject
<?xml version="1.0"?><rdf:RDF xmlns:rdf=http://www….xmlns:rdfs=http://www.w3.org/…xmlns:dc=http://purl.org/dc/…xmlns:dcterms="http://purl.org…"> <rdf:Description> <dc:subject> <dcterms:MESH> <rdf:value> D08.586.682.075.400 </rdf:value> <rdfs:label> Formate Dehydrogenase </rdfs:label> </dcterms:MESH> </dc:subject> </rdf:Description></rdf:RDF>
dcterms:MESH
dc:subject
rdf:typerdf:type
But… we don’t want to embed all this information
into every instance metadata record do we?
relatedMetadata
D08.586…
rdfs:label
Formated…
rdfs:value
ECDL 2004 tutorial - Bath, Sept 200485
Case study 2 – dc:subject
<?xml version="1.0"?><rdf:RDF xmlns:rdf=http://www….xmlns:rdfs=http://www.w3.org/…xmlns:dc=http://purl.org/dc/…xmlns:dcterms="http://purl.org…"> <rdf:Description> <dc:subject> <dcterms:MESH> <rdf:value> D08.586.682.075.400 </rdf:value> </dcterms:MESH> </dc:subject> </rdf:Description></rdf:RDF>
dcterms:MESH
dc:subject
rdf:typerdf:type
<?xml version="1.0"?><rdf:RDF xmlns:rdf=http://www….xmlns:rdfs=http://www.w3.org/…xmlns:dc=http://purl.org/dc/…xmlns:dcterms="http://purl.org…"> <dcterms:MESH> <rdf:value> D08.586.682.075.400 </rdf:value> <rdfs:label> Formate Dehydrogenase </rdfs:label> </dcterms:MESH></rdf:RDF>dcterms:MESH
D08.586…
Formated…
Need to separate part of the information out and store it
in a single place – in this case with the terminology
owner…
rdfs:label
Formated…
ECDL 2004 tutorial - Bath, Sept 200486
Case study 2 – dc:subject
<?xml version="1.0"?><rdf:RDF xmlns:rdf=http://www….xmlns:rdfs=http://www.w3.org/…xmlns:dc=http://purl.org/dc/…xmlns:dcterms="http://purl.org…"> <rdf:Description> <dc:subject> <dcterms:MESH> <rdf:value> D08.586.682.075.400 </rdf:value> </dcterms:MESH> </dc:subject> </rdf:Description></rdf:RDF>
valueURI
dcterms:MESH
dc:subject
rdf:typerdf:type
<?xml version="1.0"?><rdf:RDF xmlns:rdf=http://www….xmlns:rdfs=http://www.w3.org/…xmlns:dc=http://purl.org/dc/…xmlns:dcterms="http://purl.org…"> <dcterms:MESH> <rdf:value> D08.586.682.075.400 </rdf:value> <rdfs:label> Formate Dehydrogenase </rdfs:label> </dcterms:MESH></rdf:RDF>
valueURI
dcterms:MESH
D08.586…
Formated…
To do this we need to assign a URI (the ‘valueURI’) to the anonymous ‘value’ node…
rdfs:label
Formated…
ECDL 2004 tutorial - Bath, Sept 200487
Case study 2 – dc:subject
<?xml version="1.0"?><rdf:RDF xmlns:rdf=http://www….xmlns:rdfs=http://www.w3.org/…xmlns:dc=http://purl.org/dc/…xmlns:dcterms="http://purl.org…"> <rdf:Description> <dc:subject> <dcterms:MESH> <rdf:value> D08.586.682.075.400 </rdf:value> </dcterms:MESH> </dc:subject> </rdf:Description></rdf:RDF>
valueURI
dcterms:MESH
dc:subject
rdf:typerdf:type
<?xml version="1.0"?><rdf:RDF xmlns:rdf=http://www….xmlns:rdfs=http://www.w3.org/…xmlns:dc=http://purl.org/dc/…xmlns:dcterms="http://purl.org…"> <dcterms:MESH> <rdf:value> D08.586.682.075.400 </rdf:value> <rdfs:label> Formate Dehydrogenase </rdfs:label> </dcterms:MESH></rdf:RDF>
valueURI
dcterms:MESH
D08.586…
Formated…
relatedMetadataURI
The document containing this information is itself an
RDF resource (the ‘relatedMetadata’) and has
a URI
rdfs:label
Formated…
ECDL 2004 tutorial - Bath, Sept 200488
Case study 2 – dc:subject
<?xml version="1.0"?><rdf:RDF xmlns:rdf=http://www….xmlns:rdfs=http://www.w3.org/…xmlns:dc=http://purl.org/dc/…xmlns:dcterms="http://purl.org…"> <rdf:Description> <dc:subject> <dcterms:MESH> <rdf:value> D08.586.682.075.400 </rdf:value> </dcterms:MESH> </dc:subject> </rdf:Description></rdf:RDF>
valueURI
dcterms:MESH
dc:subject
rdf:typerdf:type
<?xml version="1.0"?><rdf:RDF xmlns:rdf=http://www….xmlns:rdfs=http://www.w3.org/…xmlns:dc=http://purl.org/dc/…xmlns:dcterms="http://purl.org…"> <dcterms:MESH> <rdf:value> D08.586.682.075.400 </rdf:value> <rdfs:label> Formate Dehydrogenase </rdfs:label> </dcterms:MESH></rdf:RDF>
valueURI
dcterms:MESH
D08.586…
Formated…
relatedMetadataURI
rdfs:seeAlso
Use rdf:seeAlso to form linkage between description
and relatedMetadata…
rdfs:label
Formated…
ECDL 2004 tutorial - Bath, Sept 200489
Abstract DC model
<?xml version="1.0"?><rdf:RDF xmlns:rdf=http://www….xmlns:rdfs=http://www.w3.org/…xmlns:dc=http://purl.org/dc/…xmlns:dcterms="http://purl.org…"> <rdf:Description> <dc:subject> <dcterms:MESH> <rdf:value> D08.586.682.075.400 </rdf:value> </dcterms:MESH> </dc:subject> </rdf:Description></rdf:RDF>
valueURI
dcterms:MESH
dc:subject
rdf:typerdf:type
<?xml version="1.0"?><rdf:RDF xmlns:rdf=http://www….xmlns:rdfs=http://www.w3.org/…xmlns:dc=http://purl.org/dc/…xmlns:dcterms="http://purl.org…"> <dcterms:MESH> <rdf:value> D08.586.682.075.400 </rdf:value> <rdfs:label> Formate Dehydrogenase </rdfs:label> </dcterms:MESH></rdf:RDF>
valueURI
dcterms:MESH
D08.586…
Formated…
relatedMetadataURI
rdfs:seeAlso
resource
property
valueURI
valueString
In terms of abstract DC model we now have: resource,
property, value URI, value string (and value string
language), encoding scheme, related description
resource
property
value URI
related description
encoding scheme
rdfs:label
Formated…
value string(language)
ECDL 2004 tutorial - Bath, Sept 200490
Practical examples – OAI and RSS
ECDL 2004 tutorial - Bath, Sept 200491
OAI-PMH
• OAI Protocol for Metadata Harvesting• simple protocol for sharing metadata
records between applications• currently at version 2.0• based on HTTP, XML, XML Schema and
XML namespaces• allows a harvester to ask a remote
repository for some or all of its metadata records
ECDL 2004 tutorial - Bath, Sept 200492
OAI-PMH (2)
• simple DC is default (mandatory) record format
• supports any record format provided it can be encoded using XML (e.g. DC, IMS, MARC, ODRL, …)
Open Archives Initiativehttp://www.openarchives.org/
ECDL 2004 tutorial - Bath, Sept 200493
OAI-PMH example
• record from the American Memory repository at the Library of Congress
http://memory.loc.gov/cgi-bin/oai2_0
• example 8 • ScreenCam of using the
‘repository explorer’• GetRecord for record identifier
oai:lcoa1.loc.gov:loc.gmd/g3701p.rr003570
ECDL 2004 tutorial - Bath, Sept 200494
RSS
• RDF Site Summary or Rich Site Summary (or even Really Simple Syndication)
• at least 3 different versions (0.91, 1.0 and 2.0)
• all based on XML but not compatible• simple format for sharing news feeds on the
Web• RSS ‘channels’ – list of ‘items’• channels updated by updating XML file• RSS clients gather XML on regular basis
ECDL 2004 tutorial - Bath, Sept 200495
RSS 1.0 and DC example
• RSS 1.0 based on RDF• most flexible and extensible of the RSS
‘family’ - not necessarily the most widely deployed
• can include DC in both ‘channel’ and ‘item’ descriptions
• example 9• full documentation at:
RDF Site Summary 1.0 Modules: Qualified Dublin Corehttp://web.resource.org/rss/1.0/modules/dcterms/
ECDL 2004 tutorial - Bath, Sept 200496
Assigning identifiers tometadata terms
ECDL 2004 tutorial - Bath, Sept 200497
What’s the problem?
• the terms used in DCMI metadata records must be assigned a URI reference before they can be used
• qualified DC application profiles generally use ‘local’ additions to DCMI terms
• therefore these additional terms must be assigned a URI reference
…a URI reference is a URI with an optional fragment identifier…
ECDL 2004 tutorial - Bath, Sept 200498
DCMI terms URI references
• all DCMI terms have already been assigned URI references
• for example…
http://purl.org/dc/elements/1.1/titlehttp://purl.org/dc/terms/dateCopyrighted
ECDL 2004 tutorial - Bath, Sept 200499
Namespace-name issues
• encoding syntaxes split the term URI reference into two parts• namespace• name
• the namespace is shortened to a namespace prefix
• for example• DC.title (XHTML)• dc:title (XML, RDF/XML)
ECDL 2004 tutorial - Bath, Sept 2004100
Guidelines
• for groups of related terms, URI references are typically assigned such that they can share a namespace prefix
• all term URI references should resolve to a human and/or machine readable description of the term
• term URI references should use a registered URI scheme
• term URI references should be assigned with the intention of them being as persistent as the Internet
ECDL 2004 tutorial - Bath, Sept 2004101
A note on namespaces
• DCMI namespaceA DCMI namespace is a collection of DCMI terms (a collection of names)
• DCMI termA DCMI term is a DCMI element, a DCMI qualifier or term from a DCMI-maintained controlled vocabulary
• each DCMI namespace is identified by a URI – each name in the namespace is also a URI
• a mechanism for making DCMI terms unique
ECDL 2004 tutorial - Bath, Sept 2004102
How do I assign URIs?
• no clear recommended best practice in this area yet!
• four strategies for assigning URIs are presented here…
… there are other strategies!
ECDL 2004 tutorial - Bath, Sept 2004103
Using project/service URIs
• simple to do…• …but danger of lack of persistence• examples:
http://myservice.org/terms/price<myservice:price> (XML, RDF/XML)“MYSERVICE.price” (XHTML)
http://myproject.org/metadata/vocabs/color#Red<myproject:Red> (RDF/XML)
ECDL 2004 tutorial - Bath, Sept 2004104
Using PURLs
• PURLs are persistent URLs (under the purl.org domain)
• used by DCMI, RSS and others to provide persistent term URI references
• examples:
http://purl.org/dc/elements/1.1/title<dc:title> (XML, RDF/XML)“DC.title” (XHTML)http://purl.org/rss/1.0/link<rss:link> (XML, RDF/XML)
ECDL 2004 tutorial - Bath, Sept 2004105
Using xmlns.com
• domain registered explicitly for use for XML namespaces
• but… persistence policy a little unclear• used for FOAF terms• example:
http://xmlns.cm/foaf/0.1/firstName<foaf:firstName> (RDF/XML)
ECDL 2004 tutorial - Bath, Sept 2004106
Using “info” URIs
• “info” URIs specifically designed for identifying vocabulary terms
• but… not a registered scheme yet and there is currently some “discussion” (i.e. argument!) on various lists about whether they are a good idea
• example:
info:ddc/22/eng//004.678
ECDL 2004 tutorial - Bath, Sept 2004107
What have we learned?
• an abstract model for DC• encoding DC in XHTML• encoding DC in XML• encoding DC in RDF/XML• two practical examples
• OAI Protocol forMetadata Harvesting
• RSS
• how to assign identifiersto new metadata terms
ECDL 2004 tutorial - Bath, Sept 2004108
Questions?