Upload
isabella-allison
View
28
Download
0
Embed Size (px)
DESCRIPTION
Using RDF/OWL Technologies for Discovery and Use Metadata. M.Benno Blumenthal, Michael Bell, John del Corral, and Emily Grover-Kopec International Research Institute for Climate and Society Columbia University http://iridl.ldeo.columbia.edu/. Definitions. Resource Description Framework (RDF) - PowerPoint PPT Presentation
Citation preview
M.Benno Blumenthal, Michael Bell, John del Corral, and Emily Grover-Kopec
International Research Institute for Climate and Society
Columbia University
http://iridl.ldeo.columbia.edu/
Using RDF/OWL Technologies for Discovery and Use Metadata
Definitions
• Resource Description Framework (RDF)
• Web Ontology Language (OWL)
Why RDF?
Web-based system for interoperating semantics
A key part of the Semantic Web
RDF/OWL is an interesting technology, but it is even more interesting when it is clear that it can help solve our problems
The Data Problem
Users
Datasets
The Tool Interface
Users
Datasets
Tools
Standard Metadata
Users
Datasets
Tools
Standard Metadata Schema/Data Services
Many Data Communities
Tools
Users
Datasets
Standard Metadata Schema
Tools
Users
Datasets
Standard Metadata Schema
Tools
Users
Datasets
Standard Metadata Schema
Tools
Users
Datasets
Standard Metadata Schema
Tools
Users
Datasets
Standard Metadata Schema
Super Schema
Tools
Users
Datasets
Standard Metadata Schema
Tools
Users
Datasets
Standard Metadata Schema
Tools
Users
Datasets
Standard Metadata Schema
Tools
Users
Datasets
Standard Metadata Schema
Tools
Users
Datasets
Standard Metadata Schema
Standard metadata schema
Super Schema: direct
Tools
Users
Datasets
Standard Metadata Schema
Tools
Users
Datasets
Standard Metadata Schema
Tools
Users
Datasets
Standard Metadata Schema
Tools
Users
Datasets
Standard Metadata Schema
Tools
Users
Datasets
Standard Metadata Schema
Standard metadata schema/data service
Flaws
• A lot of work
• Super Schema/Service is the Lowest-Common-Denominator
• Science keeps evolving, so that standards either fall behind or constantly change
RDF Standard Data Model Exchange
Tools
Users
Datasets
Standard Metadata Schema
Tools
Users
Datasets
Standard Metadata Schema
Tools
Users
Datasets
Standard Metadata Schema
Tools
Users
Datasets
Standard Metadata Schema
Tools
Users
Datasets
Standard Metadata Schema
Standard metadata schema
RDF
RDF
RDF
RDF
RDF
RDF
Standard metadata schema
Tools
Users
Datasets
Standard Metadata Schema
RDF
RDFRDF
Tools
Users
Datasets
Standard Metadata Schema
RDF
RDFRDF
Tools
Users
Datasets
Standard Metadata Schem
RDF
RDFRDF
RDF Data Model Exchange
RDF
Tools
Users
Datasets
Standard Metadata Schema
RDF
RDFRDF
Tools
Users
Datasets
Standard Metadata Schema
RDF
RDFRDF
RDF Architecture
RDF
RDF RDF
RDF
RDF RDF
RDF
RDF RDF
RDF
RDF
RDF RDF
RDF
RDF RDF
Virtual (derived) RDF
queries queries queries
Why is this better?
• Maps the original dataset metadata into a standard format that can be transported and manipulated
• Still the same impedance mismatch when mapped to the least-common-denominator standard metadata, but
• When a better standard comes along, the original complete-but-nonstandard metadata is already there to be remapped, and “late semantic binding” means everyone can use the new semantic mapping
• Can uses enhanced mappings between models that are close
• EASIER – these are tools to enhance the mapping process
Sample Tool: Faceted Searchhttp://iridl.ldeo.columbia.edu/ontologies/query2.pl?...
Distinctive Features of the search
• Search terms are interrelated
• terms that describe the set of returns are displayed (spanning and not)
• Returned items also have structure (sub-items and superseded items are not shown)
Architectural Features of the search
• Multiple search structures possible
• Multiple languages possible
• Search structure is kept in the database, not in the code
http://iridl.ldeo.columbia.edu/ontologies/query2.pl
Cast of RDF Characters
Semantic
Layers
Query
Language
Tools and Frameworks
SPARQL Protégé
RDFS SeRQL
OWL Sesame
SKOS Reasoners Redland
SWRL Jena
Triplets of • Subject• Property (or Predicate)• Object
URI’s identify things, i.e. most of the aboveNamespaces are used as a convenient
shorthand for the URI’s
RDF: framework for writing connections
Datatype Properties
{WOA} dc:title “NOAA NODC WOA01”
{WOA} dc:description “NOAA NODC WOA01: World Ocean Atlas 2001, an atlas of objectively analyzed fields of major ocean parameters at monthly, seasonal, and annual time scales. Resolution: 1x1; Longitude: global; Latitude: global; Depth: [0 m,5500 m]; Time: [Jan,Dec]; monthly”
Object Properties
{WOA} iridl:isContainerOf {Grid-1x1},
{Grid-1x1} iridl:isContainerOf {Monthly}
WOA01 diagram
Standard Properties
{WOA} dcterm:hasPart {Grid-1x1},{Grid-1x1} dcterm:hasPart {MONTHLY}
Alternatively
{WOA} iridl:isContainerOf {Grid-1x1},{iridl:isContainerOf} rdfs:subPropertyOf
{dcterm:hasPart}
{SST} rdf:type {cfatt:non_coordinate_variable}, {SST} cfatt:standard_name {cf:sea_surface_temperature}, {SST} netcdf:hasDimension {longitude}
netcdf/CF in RDF
Object properties provide a framework for explicitly writing down relationships between data objects/components, e.g. vague meaning of nesting is made explicit
Properties also can be related, since they are objects too
Noncontextual Modeling
• “noncontextual modeling make RDF the perfect glue between systems and fixed data models” – The Semantic Web
RDF Level
• Transport/Exchange (RDF/XML)
• Storage
• RDF APIs (Redland,Jena,Sesame)
• Query (SPARQL,SeRQL, …)
• Basic Semantics
RDF SemanticsRDF Primer
Truly useful property rdf:type “a”
Underlying Class rdf:Property
Organizational Classes rdf:Bag rdf:Alt rdf:Seq
rdf:List
Structured values rdf:value
Reification rdf:Statement: rdf:subject rdf:predicate rdf:object
Bag Properties rdf:_1 rdf:_2 …
List Properties rdf:first rdf:rest:rdf:nil
RDF-Schema (RDFS)
Transitive Properties rdfs:subClassOf (“is a”), rdfs:subPropertyOf
rdfs:Class, rdfs:Resource
rdfs:member
rdfs:domain, rdfs:range
rdfs:Datatype, rdfs:Literal, rdfs:Container
Refering to other RDF documents
rdfs:seeAlso, rdfs:isDefinedBy
Basic documentation rdfs:label, rdfs:comment
Gazetteer Classes
Gazetteer Individuals
Search Interface Term
• http://iri.columbia.edu/~benno/sampleterm.pdf
Semantics lead to Virtual Triples
Transitive: {a} rdfs:subClassOf {b} rdfs:subClassOf {c} implies {a} rdfs:subClassOf {c}i.e. semantics of rdfs:subClassOf imply additional triples not
explicitly statedLikewise:{a} rdfs:subPropertyOf{b} rdfs:subPropertyOf {c}implies {a} rdfs:subPropertyOf {c}More interestingly,{a} myprop {b}, {myprop} rdfs:subPropertyOf {prop2}
implies {a} prop2 {b}
Subcategories are not subClasses
So carelessly translating existing conceptual organizations can get one into trouble
Domain and Range are inherited
Since the domain and range of a property are classes, then subclasses “inherit” properties (in this sense)
UML/RDFS
• Unified Modeling Language
• Base concepts are the same (RDFS lacks methods), so one can export the underlying structure of the code as the underlying structure for the metadata
• See Representing UML in RDF
Ontologies
Use Conventions to connect concepts to established sets of concepts
Generate additional “virtual” triples from the original set and semantics
RDFS – some property/class semantics
OWL – additional property/class semantics: more sophisticated (ontological) relationships
OWL
Language for expressing ontologies, i.e. the semantics are very important. However, even without a reasoner to generate the implied RDF statements, OWL classes and properties represent a sophistication of the RDF Schema
However, there is a serious split in world view from what we have been talking about: concepts as classes vs concepts as individuals
OWL
rdf:Property owl:DatatypeProperty owl:ObjectProperty owl:AnnotationProperty
owl:FunctionalProperty owl:InverseFunctionalProperty owl:TransitiveProperty owl:SymmetricProperty
rdfs:seeAlso owl:imports
owl:ontology
Protégé
Tool for editing/displaying Ontologies
Different “tabs” display different perspectives
http://protege.stanford.edu/
Cast of RDF Characters II
Semantic
Layers
Query
Language
Tools and Frameworks
SPARQL Protégé
RDFS SeRQL
OWL Sesame
SKOS Reasoners Redland
SWRL Jena
Query Language: SPARQL
• (quick reference at http://www.dajobe.org/2005/04-sparql/)
• Supported by Redland, Jena, Sesame-2.0 (alpha)
• Jena implementation supports url source of triples, i.e. do not even need a triple store
• The standard
Query Language: SeRQL
• Older than SPARQL
• Implemented on top of Sesame
• Currently more powerful than SPARQL, i.e. has nested queries
SeRQL DetailsCopied from on-line tutorial
• Syntax
• Select
• Construct
• Where
• From
SeRQL: basic syntax
{person} foo:worksFor {Company} rdf:type {foo:ITCompany}
SeRQL: multiple statements
{subj1} pred1 {obj1}; pred2 {obj2}
Or
{subj1} pred1 {obj1} , {subj1} pred2 {obj2}
SeRQL: short cuts
{subj1} pred1 {obj1,obj2,obj3}
(also implies obj1,obj2,obj3 are distinct)
SeRQL: Select
SELECT dataset, dlabel
FROM {dataset} rdf:type {iridl:dataset},
[{dataset} rdfs:label {dlabel}]
USING NAMESPACE
iridl = <http://iridl.ldeo.columbia.edu/ontologies/iridl.owl>
Output as table (XML)
SeRQL:Construct
CONSTRUCT {dataset} rdf:type {foo:LabelledDatasets}
FROM {dataset} rdf:type {iridl:dataset}; rdfs:label {dlabel}
USING NAMESPACE
iridl = <http://iridl.ldeo.columbia.edu/ontologies/iridl.owl>
Output as RDF (RDF/XML)
Faceted Search Explicated
Search Interface
• Items (datasets/maps)
• Terms
• Facets
• Taxa
Search Interface Semantic API
{item} dc:title dc:description rss:link iridl:icon dcterm:isPartOf {item2} dcterm:isReplacedBy {item2}
{item} trm:isDescribedBy {term}
{term} a {facet} of {taxa} of {trm:Term},{facet} a {trm:Facet}, {taxa} a {trm:Taxa},{term} trm:directlyImplies {term2}
Faceted Search w/Querieshttp://iridl.ldeo.columbia.edu/ontologies/query2.pl?...
RDF Architecture
RDF
RDF RDF
RDF
RDF RDF
RDF
RDF RDF
RDF
RDF
RDF RDF
RDF
RDF RDF
Virtual (derived) RDF
queries queries queries
Data ServersOntologies
MMI
JPL
StandardsOrganizations
Start Point
RDF Crawler
RDFS SemanticsOwl SemanticsSWRL Rules
SeRQL CONSTRUCT
Search Queries
LocationCanonicalizer
TimeCanonicalizer
Sesame
Search Interface
bibliography
IRI RDF Architecture
Creating Virtual Triples from Semantic Layers
Semantic
Layers
Query
Language
Tools and Frameworks
SPARQL Protégé
RDFS SeRQL
OWL Sesame
SKOS Reasoners Redland
SWRL Jena
SWRL
SWRL: A Semantic Web Rule Language Combining OWL and RuleML
A language for writing rules in RDF/OWL, i.e. RDF statements that are rules for creating new RDF statements
Simple Knowledge Organization System (SKOS)
Schema for relating concepts
Simple Knowledge Oranization System (SKOS)
• So, for a resource of type skos:Concept, any properties of that resource (such as creator, date of modification, source etc.) should be interpreted as properties of a concept, and not as properties of some 'real world thing' that that resource may be a conceptualisation of.
• This layer of indirection allows thesaurus-like data to be expressed as an RDF graph. The conceptual content of any thesaurus can of course be remodelled as an RDFS/OWL ontology. However, this remodelling work can be a major undertaking, particularly for large and/or informal thesauri. A SKOS Core representation of a thesaurus maps fairly directly onto the original data structures, and can therefore be created without expensive remodelling and analysis
RDF Frameworks
Protégé API
Redland Bindings in many languages, supports several triple stores, some with context
Jena Java API, some cmd line utilities, supports inference layers
Sesame HTTP server, Java API, supports inference, version 2 alpha has context
SesameSAIL- Storage and Inference Layer
i.e. you can write down rules that imply virtual triples so that triples are generated as they are put into the store
RDF No inference
RDFS RDFS inference
OWLIM Some OWL inference
Custom
Jena
Java framework
In-memory and persistent stores
Inference API
Topics/Issues
• OpenDAP and RDF: can we transport data semantics without fixing the entire schema?
• netcdf/HDF and RDF: do we need non-contextual modeling in our metadata transport/storage?
• Concepts as classes vs concepts as individuals• Sub-classes vs sub-categories• OWL in detail• Protégé demo
RDF Cast of Characters
Semantic
Layers
Query
Language
Tools and Frameworks
SPARQL Protégé
RDFS SeRQL
OWL Sesame
SKOS Reasoners Redland
SWRL Jena