Upload
espen
View
18
Download
0
Tags:
Embed Size (px)
DESCRIPTION
Representing URI Resolution in OWL. Alan Ruttenberg (responsible for errors) Matthias Samwald Jonathan Rees. What goes wrong with URLs. The server disappears (s) The content disappears - 404 (c) The content might change and you want to know and communicate what it used to be (d) - PowerPoint PPT Presentation
Citation preview
Alan Ruttenberg, Oct 3, 2006 HCLS F2F
Representing URI Resolution in OWL
Alan Ruttenberg (responsible for errors)
Matthias Samwald
Jonathan Rees
Alan Ruttenberg, Oct 3, 2006 HCLS F2F
What goes wrong with URLs• The server disappears (s)• The content disappears - 404 (c)• The content might change and you want to know and
communicate what it used to be (d)• Access to the content is too slow (s)• Access to the content is too public (p)• The content is very big (b)• You don't know if a URI is an information resource or
not (w)• You want to record and access metadata -
information about some information resource - and you don't know where to get it. (m)
• You don't know what format an information resource is encoded in. (f)
Alan Ruttenberg, Oct 3, 2006 HCLS F2F
Existing proposals: LSIDs
• Authority • “Location independence”• Data/Metadata distinction• Access method independence• Versioning
• (similar: ARK, purl, DOI…)
Alan Ruttenberg, Oct 3, 2006 HCLS F2F
Existing proposals: http-range14
• Use http. • Use result code to recognize potential non-
information resources• Result code 2xx = information resource• Result code 3xx = any resource, pointer to
more information• Result code 4xx unknown type • “303” to get more information about the thing
Alan Ruttenberg, Oct 3, 2006 HCLS F2F
Existing proposals: Content negotiation
• Agent asks for resource
• Server responds with list of content types and where to get each
• Agent chooses which to retrieve
Alan Ruttenberg, Oct 3, 2006 HCLS F2F
“Short statements”
John Barkley
1. Identify MIME Types of URIs.
2. Identify versions of URIs
3. URIs should dereference to something, even if it is only documentation,e.g., rdfs:comment
4. Use LSIDs
Alan Ruttenberg, Oct 3, 2006 HCLS F2F
“Short statements”
Phil Lord
1. URIs should identify one thing only.
2. URI allocation schemes should encourage stability over time.
3. Resources identified by URIs should be checksummable.
Alan Ruttenberg, Oct 3, 2006 HCLS F2F
“Short statements”
Matthias Samwald1. Be careful what you are talking about
- use separate names2. Distinguish information resource from
non-information resource3. Distinguish making a query from
resolution
Alan Ruttenberg, Oct 3, 2006 HCLS F2F
“Short statements”
David Booth (Information resources/metadata)
1. http://example.org/foo#bar might identify a thing(non-information resource), http://example.org/foo can be used to seek metadata about it.
2. Dereferencing non-information resources yields a http 303 result code
Alan Ruttenberg, Oct 3, 2006 HCLS F2F
Selected issues with proposed solutions
• http-range14 - “late” don’t know anything until you do the retrieval
• Content negotion - confusion over what the thing is - e.g. foaf human readable document, rdf at same address. Try talking about the ugly font.
• LSID - requires server deployment. Based on web services (slow). Unclear semantics of “versions”, “metadata”, “data”
• No single proposal deals with all issues
Alan Ruttenberg, Oct 3, 2006 HCLS F2F
An Alternative
• Use the our SW tools to help solve this problem
• Represent the information that you want to know about URIs in OWL.
• Build an ontology to represent consensus/schema.
• Take advantage of consistency checking, inheritence
Alan Ruttenberg, Oct 3, 2006 HCLS F2F
Goals
• Transparent/explicit. Contract based.
• Adjustable
• Extendable
• Ontologically sound
Alan Ruttenberg, Oct 3, 2006 HCLS F2F
Different things• The temperature of a patient (not an
information resource)• A instrument that measures and reports
temperature (not an information resource)• The record retrieved when you query the
instrument (an information resource)• The record that you retrieved at a certain time
and you copied and saved (an information resource)
but related
Alan Ruttenberg, Oct 3, 2006 HCLS F2F
A sketch
• Some useful distinctions
• Top level classes
• Some properties
• Only about instances (not properties or classes)
Alan Ruttenberg, Oct 3, 2006 HCLS F2F
InformationResource NotAnInformationResource
• Information resources are conceptually “Gettable”
• They might not be able to be retrieved at a particular time
• They might change• Ask yourself: “Would it be possible to
get the thing itself over a network”• Disjoint
Alan Ruttenberg, Oct 3, 2006 HCLS F2F
UnchangingInformationResourceEvolveableInformationResource
• UnchangingInformationResource is like LSID “data”. A promise is made that the content will never change.
• EvolveableInformationResource are resources that might change (even if we don’t want them to, e.g. NCBI gene records)
• Disjoint
Alan Ruttenberg, Oct 3, 2006 HCLS F2F
RetrievalMethod• A way to get an information resource.• Some examples
– StandardURIRetrieval– TransformUriRetrieval
http://genesdbs.org/entrez/7157 => http://cache.ibm.com/?generetrieve&id=http://genesdbs.org/entrez/
– SPARQLMethod => http://genesdbs.org/entrez/7157 =>
select ?dataFROM http://sparql.ibm.com/lifesciWHERE{ http://genesdbs.org/entrez/7157 :data ?data }
– WebServiceMethodSupply a WSDL, name of parameter
Alan Ruttenberg, Oct 3, 2006 HCLS F2F
RetrievalMethod (notes)
• There may be more than one.• When more than one try them all in
random order, or explicitly represent preference.
• For company specific retrieval, add another RetrievalMerthod to an appropriate upper class (one more triple)
Alan Ruttenberg, Oct 3, 2006 HCLS F2F
InformationResourceFormat• Explicitly give enough information to know what you
will have to parse should you retrieve the resource (so you can choose whether or not to retrieve)
• Like mime/type - BUT only the format, not the type (that’s for defining by class)– RDFXML– RDFTurtle– JPEG– TIFF– HTML– …
Note: Different formats of “same” digital thing should be given different names. (but they may be related by some property, e.g. hasOtherRepresentation)
Alan Ruttenberg, Oct 3, 2006 HCLS F2F
Some classes of InformationResource
• XrayImage• Triples (rdf statements)• MedicalRecord• VersionInformation• ProtocolDocumentation• ProvenanceDescription• SPARQLEndpoint• ….• (ask me about metadata!)
Alan Ruttenberg, Oct 3, 2006 HCLS F2F
Properties
• Relate a NotInformationResource to an InformationResource - seeAlso, subjectOfMedicalRecord, foaf:homepage, biozen:d-data (‘described by data’)
Alan Ruttenberg, Oct 3, 2006 HCLS F2F
Properties
• Relating an informationResource to “metadata”
• previousVersion (UnchangingInformationResource)
• hasVersionDescription• hasChangeDescription• generatedBy (not an information resource)• cachedDate • hasMD5 (UnchangingInformationResource)
Alan Ruttenberg, Oct 3, 2006 HCLS F2F
An exampleClass PartnersDigitalXray subclassOf NeverChangingInformationResource subclassOf DigitalXray mediaType hasValue jpegType retrievalmethod: hasValue webServiceMethod1002
http://partners.org/radiology/817277366 rdf:type PartnersDigitalXray
mediaType jpegType (inherited) retrievalmethod webServiceMethod1002 (inherited)
Alan Ruttenberg, Oct 3, 2006 HCLS F2F
Sharing bare URIs
• Don’t, unless you have to. Generally messages should be a set of triples giving adequate information about type, resolution
• If you do, use existing best practices to make them last, e.g. PURL