OUTLINE 1. Foundations of Semantic Web 2. RDF 3. RDFS 4. OWL 5. OWL2
6. Semantic Web Layer Cake 7. RIF
Slide 3
Slide 4
Slide 5
Slide 6
What is Semantic? The word semantic itself implies meaning or
understanding. As such, the fundamental difference between Semantic
Web technologies and other technologies related to data (such as
relational databases or the World Wide Web itself) is that the
Semantic Web is concerned with the meaning and not the structure of
data.
Slide 7
Why do we need Semantic Web? Consider a typical web page. Markup
consists of: rendering information (e.g., font size and colour);
hyperlinks to related content. Semantic content is accessible to
humans but not (easily) to computers.
Slide 8
What information can we see? WWW2002 The eleventh
international world wide web conference Sheraton waikiki hotel
Honolulu, hawaii, USA 7-11 may 2002 1 location 5 days learn
interact Registered participants coming from australia, canada,
chile denmark, france, germany, ghana, hong kong, india, ireland,
italy, japan, malta, new zealand, the netherlands, norway,
singapore, switzerland, the united kingdom, the united states,
vietnam, zaire Register now On the 7th May Honolulu will provide
the backdrop of the eleventh international world wide web
conference. This prestigious event Speakers confirmed Tim
berners-lee Tim is the well known inventor of the Web, Ian Foster
Ian is the pioneer of the Grid, the next generation internet
Slide 9
What information can a machine see
Slide 10
Solution: XML markup with meaningful tags?
Slide 11
Machine sees
Slide 12
Solution To enable machine processing - There can be two
approaches: Smarter machines Smarter data
Slide 13
Approach 1: Smarter machines. Teach computers to understand the
meaning of Web data. This is the Artificial Intelligence (AI)
approach: natural language processing, image recognition, etc.
Slide 14
Approach 2: Smarter data. Make data easier for machines to
understand. Express meaning in a machine-processable format.
Example: metadata. This is the Semantic Web approach: injecting
more metadata so that data become structured.
Slide 15
The Current Web: minimal machine-processable information, "dumb"
links. Resources are linked together, forming the Web. There is no
distinction between kinds of resources or between the kinds of
links that connect resources.
Slide 16
The Semantic Web: an extension of the current Web with more
machine-processable information. To give meaning to resources and
links, new standards and languages are being investigated and
developed. The rules and descriptive information made available by
these languages allow the type of resources on the Web and the
relationships between resources to be characterized individually
and precisely.
Slide 17
Why is machine processing difficult? Two key problems: Problem
1: Ambiguity Problem 2: Language complexity
Slide 18
Ambiguity "David Booth has VIN #2745534." Which "David Booth"?
Vehicle #2745534? Vinyl siding order #2745534? Need to identify
things: Unambiguously, in a Uniform Web-friendly way
Slide 19
Kinds of things to identify. Three kinds of things in the
universe: 1) Web resources 2) Non-Web resources: physical objects,
e.g., cars, people, houses, etc. 3) Abstract concepts: sizes,
colors, verbs, "love", "Creator" (e.g., the creator of a document),
"Airline reservation", etc.
Slide 20
Unambiguously identifying Web resources Solution (trivial):
URLs http://www.example.org/index.html
Slide 21
Unambiguously identifying physical objects Many human systems:
Vehicle Identification Numbers (VIN) Product serial numbers
Employee numbers Problems: Too many formats Most are not global in
scope Solution: Convert to URIs
http://www.example.com/employeeid/85740
Slide 22
Unambiguously identifying abstract concepts Solution: Use URIs
Problem: Which URIs? Need to agree on common vocabulary Solution:
Ontology
Slide 23
URI In computing, a uniform resource identifier (URI) is a
string of characters used to identify a name or a resource. URIs
can be classified as locators (URLs), as names (URNs), or as both.
A uniform resource name (URN) functions like a person's name, while
a uniform resource locator (URL) resembles that person's street
address. In other words: the URN defines an item's identity, while
the URL provides a method for finding it.
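As a rough illustration of the locator/name distinction, the sketch below sorts a URI by its scheme using Python's standard urllib.parse. The function name classify_uri, the short list of locator schemes, and the sample URN are assumptions made for this example; splitting purely on the scheme is a simplification, since a URI can act as both a name and a locator.

```python
from urllib.parse import urlparse

# Schemes treated as locators here; this short list is an assumption.
LOCATOR_SCHEMES = {"http", "https", "ftp"}

def classify_uri(uri: str) -> str:
    """Return 'URN', 'URL', or 'other URI', judging only by the scheme."""
    scheme = urlparse(uri).scheme.lower()
    if scheme == "urn":
        return "URN"
    if scheme in LOCATOR_SCHEMES:
        return "URL"
    return "other URI"

print(classify_uri("http://www.example.org/index.html"))
print(classify_uri("urn:isbn:0451450523"))  # hypothetical sample URN
```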
Slide 24
Ontology "Formal description of concepts and their
relationships" In other words: Vocabulary of terms "book",
"publication", "greyhound", "dog" And their relationships "book
is-a-kind-of publication" "greyhound is-a-kind-of dog"
Slide 25
Ontology Vocabulary + Structure = Taxonomy
Taxonomy + Relationships, Constraints, Rules = Ontology
Ontology + Instances = Knowledge Base
Slide 26
Structure of an Ontology Ontologies typically have two distinct
components: Names for important concepts in the domain Elephant is
a concept whose members are a kind of animal Herbivore is a concept
whose members are exactly those animals who eat only plants or
parts of plants Adult_Elephant is a concept whose members are
exactly those elephants whose age is greater than 20 years
Background knowledge/constraints on the domain Adult_Elephants
weigh at least 2,000 kg All Elephants are either African_Elephants
or Indian_Elephants No individual can be both a Herbivore and a
Carnivore
Slide 27
Dublin Core One well-known ontology. Defines 15 basic terms for
documents and publishing: "title", "creator", "subject",
"publisher", etc. Each term is unambiguously identified by a URI,
e.g., http://purl.org/dc/elements/1.1/creator
Slide 28
Ontology Languages Wide variety of languages for Explicit
Specification Graphical notations Semantic networks Topic Maps (see
http://www.topicmaps.org/) UML RDF Logic based Description Logics
(e.g., OIL, DAML+OIL, OWL) Rules (e.g., RuleML, LP/Prolog) First
Order Logic (e.g., KIF) Conceptual graphs (Syntactically) higher
order logics (e.g., LBase) Non-classical logics (e.g., Flogic,
Non-Mon, modalities) Probabilistic/fuzzy Degree of formality varies
widely Increased formality makes languages more amenable to machine
processing (e.g., automated reasoning)
Slide 29
Slide 30
What is the Purpose of RDF? The purpose of RDF (Resource
Description Framework) is to give a standard way of specifying data
"about" something. Here's an example of an XML document that
specifies data about China's Yangtze river: a River element,
identified as Yangtze, whose child elements give a length of 6300
kilometers, a startingLocation of western China's Qinghai-Tibet
Plateau, and an endingLocation of the East China Sea. The document
is read as: "Here is data about the Yangtze River. It has a length
of 6300 kilometers. Its startingLocation is western China's
Qinghai-Tibet Plateau. Its endingLocation is the East China Sea."
Slide 31
XML --> RDF Modify the following XML document (Yangtze.xml) so
that it is also a valid RDF document (Yangtze.rdf). The data is
unchanged by the conversion: 6300 kilometers, western China's
Qinghai-Tibet Plateau, East China Sea.
Slide 32
The RDF Format (shown on the Yangtze data: 6300 kilometers, western
China's Qinghai-Tibet Plateau, East China Sea) 1) RDF provides an
ID attribute for identifying the resource being described. 2) The
ID attribute is in the RDF namespace. 3) Add the "fragment
identifier symbol" to the namespace.
Slide 33
The RDF Format (cont.) 1) Identifies the type (class) of the
resource being described. 2) Identifies the resource being
described. This resource is an instance of River. 3) These are
properties, or attributes, of the type (class). 4) Values of the
properties: 6300 kilometers, western China's Qinghai-Tibet Plateau,
East China Sea.
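The four parts named above can be picked out of an RDF/XML document mechanically. The sketch below uses Python's standard xml.etree.ElementTree. The XML string is a reconstruction assumed for illustration from the names used in these slides (River, Yangtze, length, startingLocation, endingLocation); the helper split_tag is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Assumed reconstruction of Yangtze.rdf, built from the names in the slides.
YANGTZE_RDF = """<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>"""

def split_tag(tag):
    """Split ElementTree's '{namespace}local' form into (namespace, local)."""
    namespace, _, local = tag[1:].partition("}")
    return namespace, local

root = ET.fromstring(YANGTZE_RDF)
type_ns, type_name = split_tag(root.tag)        # 1) the type (class)
resource_id = root.attrib[f"{{{RDF_NS}}}ID"]    # 2) the resource
# 3) the properties and 4) their values
properties = {split_tag(child.tag)[1]: child.text for child in root}

print(type_ns + type_name)
print(resource_id)
for name, value in properties.items():
    print(name, "=", value)
```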
Slide 34
Namespace Convention xmlns="http://www.geodesy.org/river#"
Question: Why was "#" placed onto the end of the namespace? Answer:
RDF is very concerned about uniquely identifying things -
uniquely identifying the type (class) and uniquely identifying the
properties. If we concatenate the namespace with the type then we
get a unique identifier for the type, e.g.,
http://www.geodesy.org/river#River If we concatenate the namespace
with a property then we get a unique identifier for the property,
e.g., http://www.geodesy.org/river#length
http://www.geodesy.org/river#startingLocation
http://www.geodesy.org/river#endingLocation Thus, the "#" symbol is
simply a mechanism for separating the namespace from the type name
and the property name. Ending the namespace with "#" is a best
practice.
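The concatenation described above is plain string joining. A minimal sketch follows, in which the helper name qualify is hypothetical. It also shows what happens when the "#" is left off the namespace.

```python
NAMESPACE = "http://www.geodesy.org/river#"

def qualify(namespace: str, local_name: str) -> str:
    """Join a namespace and a type or property name into one identifier."""
    return namespace + local_name

names = ["River", "length", "startingLocation", "endingLocation"]
identifiers = [qualify(NAMESPACE, name) for name in names]
for identifier in identifiers:
    print(identifier)

# Without the trailing "#", the namespace and the name run together.
print(qualify("http://www.geodesy.org/river", "River"))
```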
Slide 35
rdf:ID The value of rdf:ID is a "relative URI". The "complete
URI" is obtained by concatenating the URL of the XML document with
"#" and then the value of rdf:ID. E.g., suppose that this RDF/XML
document (Yangtze.rdf), which describes a resource with the rdf:ID
"Yangtze", is located at this URL:
http://www.china.org/geography/rivers. Thus, the complete URI for
this resource is: http://www.china.org/geography/rivers#Yangtze
Slide 36
xml:base On the previous slide we showed how the URL of the
document provided the base URI. Depending on the location of the
document is brittle: it will break if the document is moved, or is
copied to another location. A more robust solution is to specify
the base URI in the document with xml:base, e.g., Resource URI =
concatenation(xml:base, '#', rdf:ID) =
concatenation(http://www.china.org/geography/rivers, '#',
"Yangtze") = http://www.china.org/geography/rivers#Yangtze
Slide 37
rdf:about Instead of identifying a resource with a relative URI
(which then requires a base URI to be prepended), we can give the
complete identity of a resource. However, we use rdf:about, rather
than rdf:ID, e.g.,
rdf:about="http://www.china.org/geography/rivers#Yangtze"
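The three ways of identifying a resource (rdf:ID against the document URL, rdf:ID against xml:base, and rdf:about) can be compared side by side. The sketch below follows the simple concatenation rule used in these slides rather than full URI resolution; the function resource_uri and the "moved" location are hypothetical.

```python
from typing import Optional

def resource_uri(document_url: str,
                 rdf_id: Optional[str] = None,
                 rdf_about: Optional[str] = None,
                 xml_base: Optional[str] = None) -> str:
    """Return the complete URI of a described resource."""
    if rdf_about is not None:
        # rdf:about already carries the complete identity.
        return rdf_about
    if rdf_id is None:
        raise ValueError("need either rdf_id or rdf_about")
    # xml:base, when present, replaces the document's own URL as the base.
    base = xml_base if xml_base is not None else document_url
    return f"{base}#{rdf_id}"

ORIGINAL = "http://www.china.org/geography/rivers"
MOVED = "http://mirror.example.org/rivers-copy"  # hypothetical new location

print(resource_uri(ORIGINAL, rdf_id="Yangtze"))
print(resource_uri(MOVED, rdf_id="Yangtze"))                     # identity changes
print(resource_uri(MOVED, rdf_id="Yangtze", xml_base=ORIGINAL))  # identity kept
print(resource_uri(MOVED, rdf_about=ORIGINAL + "#Yangtze"))      # identity kept
```

Only the second call yields a different URI, which is the brittleness that xml:base and rdf:about avoid.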
Slide 38
The RDF Format value...
Slide 39
Advantage of using the RDF Format You may ask: "Why should I
bother designing my XML to be in the RDF format?" Answer: there are
numerous benefits: The RDF format, if widely used, will help to
make XML more interoperable: Tools can instantly characterize the
structure, "this element is a type (class), and here are its
properties." The RDF format gives you a structured approach to
designing your XML documents. The RDF format is a regular,
recurring pattern. It enables you to quickly identify weaknesses
and inconsistencies of non-RDF-compliant XML designs. It helps you
to better understand your data! You reap the benefits of both
worlds: You can use standard XML editors and validators to create,
edit, and validate your XML. You can use the RDF tools to apply
inferencing to the data. It positions your data for the Semantic
Web! Network effect Interoperability
Slide 40
Disadvantage of using the RDF Format Constrained: the RDF
format constrains you on how you design your XML (i.e., you can't
design your XML in any arbitrary fashion). RDF uses namespaces to
uniquely identify types (classes), properties, and resources. Thus,
you must have a solid understanding of namespaces. Another XML
vocabulary to learn: to use the RDF format you must learn the RDF
vocabulary.
Slide 41
Triple -> resource/property/value
1) http://www.china.org/geography/rivers#Yangtze (resource) has a
http://www.geodesy.org/river#length (property) of 6300 kilometers
(value)
2) http://www.china.org/geography/rivers#Yangtze (resource) has a
http://www.geodesy.org/river#startingLocation (property) of western
China's... (value)
3) http://www.china.org/geography/rivers#Yangtze (resource) has a
http://www.geodesy.org/river#endingLocation (property) of East
China Sea (value)
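One way to hold such triples in a program is as a list of three-field records. This is a sketch only: the Triple type, the field name prop, and the helper values_of are hypothetical, and the startingLocation value is written out in full from the earlier Yangtze example.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    resource: str  # what is being described
    prop: str      # which property
    value: str     # the property's value

YANGTZE = "http://www.china.org/geography/rivers#Yangtze"
RIVER_NS = "http://www.geodesy.org/river#"

triples = [
    Triple(YANGTZE, RIVER_NS + "length", "6300 kilometers"),
    Triple(YANGTZE, RIVER_NS + "startingLocation",
           "western China's Qinghai-Tibet Plateau"),
    Triple(YANGTZE, RIVER_NS + "endingLocation", "East China Sea"),
]

def values_of(resource: str, prop: str) -> list:
    """All values recorded for one resource/property pair."""
    return [t.value for t in triples if t.resource == resource and t.prop == prop]

print(values_of(YANGTZE, RIVER_NS + "length"))
```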
Slide 42
The RDF Format = triples! The fundamental design pattern of RDF
is to structure your XML data as resource/property/value triples!
The value of a property can be a literal (e.g., length has a value
of 6300 kilometers). Also, the value of a property can be a
resource (e.g., property-A has a value of Resource-B, and
property-B has a value of Resource-C). Notice that the RDF design
pattern is an alternating sequence of resource-property. This
pattern is known as "striping".
Terminology As you read the RDF literature you may see the
following terminology: Subject: this term refers to the item that
is playing the role of the resource. Predicate: this term refers to
the item that is playing the role of the property. Object: this
term refers to the item that is playing the role of the value.
Subject/Predicate/Object and Resource/Property/Value are
equivalent!
Slide 45
RDF Parser There is a nice RDF parser at the W3 Web site:
http://www.w3.org/RDF/Validator/ This RDF parser will tell you if
your XML is in the proper RDF format.
Slide 46
What is missing from RDF? Schema support, which enables
reasoning. Solution: Use RDF-S (RDF Schema)
Slide 47
Slide 48
RDF Schema (RDFS) RDF gives a formalism for metadata
annotation, and a way to write it down in XML, but it does not give
any special meaning to vocabulary such as subClassOf or type. RDF
Schema allows you to define vocabulary terms and the relations
between those terms. It gives extra meaning to particular RDF
predicates and resources. This extra meaning, or semantics,
specifies how a term should be interpreted.
Slide 49
RDF Schema Extension to RDF to allow definition of
application-specific classes and properties. Provides a framework
to describe such classes, similar to OOP. Allows instances and
subclasses of classes.
Slide 50
RDF Schema is about creating Taxonomies! [Taxonomy diagram.
Classes: NaturallyOccurringWaterSource, BodyOfWater, Ocean, Lake,
Sea, Stream, River, Tributary, Brook, Rivulet. Properties: length:
Literal; emptiesInto: BodyOfWater]
Slide 51
Yangtze.rdf (length: 6300 kilometers) What inferences can be made
on this RDF/XML, given the taxonomy on the last slide? Inferences
are made by examining a taxonomy that contains River. See next
slide.
Slide 52
[Taxonomy diagram, with the properties length: Literal and
emptiesInto: BodyOfWater] The taxonomy and Yangtze.rdf (length:
6300 kilometers) are passed to an Inference Engine. Inferences: -
Yangtze is a Stream - Yangtze is a NaturallyOccurringWaterSource -
http://www.china.org/geography#EastChinaSea is a BodyOfWater
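A toy version of such an inference engine fits in a few lines. This is a sketch under stated assumptions: the data structures SUBCLASS_OF and RANGE, the short resource name "Yangtze", and the use of "type" as a property name are all illustrative, and only the subclass links spelled out in these slides are included.

```python
# child class -> parent class (only links stated in the slides)
SUBCLASS_OF = {
    "River": "Stream",
    "Stream": "NaturallyOccurringWaterSource",
}
# property -> class of its values
RANGE = {"emptiesInto": "BodyOfWater"}

EAST_CHINA_SEA = "http://www.china.org/geography#EastChinaSea"
facts = [
    ("Yangtze", "type", "River"),
    ("Yangtze", "length", "6300 kilometers"),
    ("Yangtze", "emptiesInto", EAST_CHINA_SEA),
]

def superclasses(cls):
    """Walk up the hierarchy, collecting every ancestor of cls."""
    result = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.append(cls)
    return result

def infer(facts):
    """Derive new type facts from the subclass hierarchy and from ranges."""
    inferred = set()
    for subject, prop, value in facts:
        if prop == "type":
            for parent in superclasses(value):
                inferred.add((subject, "type", parent))
        elif prop in RANGE:
            inferred.add((value, "type", RANGE[prop]))
    return inferred

for fact in sorted(infer(facts)):
    print(fact)
```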
Slide 53
How does a taxonomy facilitate searching? [Taxonomy diagram, with
the properties length: Literal and emptiesInto: BodyOfWater] The
taxonomy shows that when searching for "streams", any RDF/XML that
uses the class Brook, Rivulet, River, or Tributary is relevant.
See next slide.
Slide 54
[Taxonomy diagram, with the properties length: Literal and
emptiesInto: BodyOfWater] Query: "Show me all documents that
contain info about Streams". Search Engine Results: - Yangtze is a
Stream, so this document (Yangtze.rdf, length: 6300 kilometers) is
relevant to the query.
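The search behaviour can be sketched by expanding the query class to all of its subclasses before matching documents. In the sketch below, the placement of Tributary directly under Stream is an assumption (the slides only say it is relevant to Streams), Pacific.rdf is a hypothetical second document, and the function names are illustrative.

```python
# child class -> parent class
SUBCLASS_OF = {
    "River": "Stream",
    "Brook": "Stream",
    "Tributary": "Stream",  # assumed placement
    "Rivulet": "Brook",
}

def subclasses(cls):
    """All direct and indirect subclasses of cls."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for child, parent in SUBCLASS_OF.items():
            if parent == current and child not in found:
                found.add(child)
                frontier.append(child)
    return found

# document -> class of the resource it describes
DOCUMENTS = {
    "Yangtze.rdf": "River",
    "Pacific.rdf": "Ocean",  # hypothetical document
}

def search(query_class):
    """Documents whose class is the query class or one of its subclasses."""
    wanted = {query_class} | subclasses(query_class)
    return sorted(doc for doc, cls in DOCUMENTS.items() if cls in wanted)

print(search("Stream"))
```

Rivulet is reached through Brook, which is the transitivity of subClassOf at work.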
Slide 55
So RDF Schemas RDF Schemas is all about defining taxonomies
(class hierarchies). As we've seen, a taxonomy can be used to make
inferences and to facilitate searching. That's all there is to RDF
Schemas! The rest is just syntax. The previous slide showed the
taxonomy in a graphical form. Obviously, we need to express the
taxonomy in a form that is machine-processable. RDF Schemas
provides an XML vocabulary to express taxonomies.
Slide 56
RDF Schema provides an XML vocabulary to express taxonomies
[Taxonomy diagram, with the properties length: Literal and
emptiesInto: BodyOfWater] "express as" XML:
NaturallyOccurringWaterSource.rdfs
Slide 57
Defining a class (e.g., River) NaturallyOccurringWaterSource.rdfs
(snippet) ... This is read as: "I hereby define a River Class.
River is a subClassOf Stream." "I hereby define a Stream Class.
Stream is a subClassOf NaturallyOccurringWaterSource."... Notes on
the snippet: All classes and properties are defined within rdf:RDF.
The snippet assigns a namespace to the taxonomy! It defines the
River class and the Stream class. Since the Stream class is defined
in the same document we can reference it using a fragment
identifier.
Slide 58
rdfs:Class This type is used to define a class. The rdf:ID
provides a name for the class. The contents are used to indicate
the members of the class. The contents are ANDed together.
Slide 59
Equivalent!
Slide 60
rdfs:subClassOf Stream: this represents the set of Streams,
i.e., the set of instances of type Stream. River: this represents
the set of Rivers, i.e., the set of instances of type River.
Slide 61
rdfs:subClassOf Use this property to indicate a subclass
relationship between one class and another class. You may specify
zero, one, or multiple rdfs:subClassOf properties. Zero: if you
define a class without specifying rdfs:subClassOf then you are
implicitly stating that the class is a subClassOf rdfs:Resource
(the root of all classes). One: if you define a class by specifying
one rdfs:subClassOf then you are indicating that the class is a
subclass of that class. Multiple: if you define a class by
specifying multiple rdfs:subClassOf properties then you are
indicating that the class is a subclass of each of the other
classes. Example: consider the River class: suppose that it has two
rdfs:subClassOf properties - one that specifies Stream and a second
that specifies SedimentContainer. Thus, the two rdfs:subClassOf
properties indicate that a River is a Stream and a
SedimentContainer. That is, each instance of River is both a Stream
and a SedimentContainer.
Slide 62
Example of multiple rdfs:subClassOf properties: River is a
subClassOf Stream and a subClassOf SedimentContainer - a River is
both a Stream and a SedimentContainer. The conjunction (AND) of the
two subClassOf statements makes River a subset of the intersection
of the two classes.
Slide 63
rdfs:subClassOf is transitive [Taxonomy diagram: Ocean, Lake,
BodyOfWater, River, Stream, Sea, NaturallyOccurringWaterSource,
Tributary, Brook, Rivulet] Consider the above class hierarchy. It
says, for example, that: - A Rivulet is a Brook. - A Brook is a
Stream. Therefore, since subClassOf is transitive, a Rivulet is a
Stream. (Note that a Rivulet is also a
NaturallyOccurringWaterSource.)
Slide 64
Defining a property (e.g., emptiesInto)
NaturallyOccurringWaterSource.rdfs (snippet) ... This is read as:
"I hereby define an emptiesInto Property. The domain (class) in
which emptiesInto is used is River. The range (of values) for
emptiesInto is instances of BodyOfWater." That is, the emptiesInto
Property relates (associates) a River (its domain) to a BodyOfWater
(its range).
Slide 65
rdf:Property This type is used to define a property. The rdf:ID
provides a name for the property. The contents are used to indicate
the usage of the property. The contents are ANDed together.
Slide 66
Equivalent!
Slide 67
rdfs:range Use this property to indicate the type of values
that a property will contain. You may specify zero, one, or
multiple rdfs:range properties. Zero: if you define a property
without specifying rdfs:range then you are providing no information
about the type of value that the property will contain. One: if you
define a property by specifying one rdfs:range then you are
indicating that the property will contain a value whose type is
that specified by rdfs:range. Multiple: if you define a property by
specifying multiple rdfs:range properties then you are indicating
that the property will contain a value which belongs to every class
defined by the rdfs:range properties. Example: consider the
property emptiesInto: suppose that it has two rdfs:range properties
- one that specifies BodyOfWater and a second that specifies
CoastalWater. Thus, the two rdfs:range properties indicate that
emptiesInto will contain a value that is a BodyOfWater and a
CoastalWater.
Slide 68
Example of multiple rdfs:range properties: emptiesInto has a
range of BodyOfWater and a range of CoastalWater - the value of
emptiesInto is a BodyOfWater and a CoastalWater.
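The zero, one, and multiple cases for rdfs:range can be put side by side in a small lookup. This is a sketch, assuming a plain dictionary from property name to range classes; the property nickname is hypothetical and stands for a property declared without any range.

```python
# property -> classes named by its rdfs:range statements
RANGES = {
    "emptiesInto": ["BodyOfWater", "CoastalWater"],  # multiple
    "length": ["Literal"],                           # one
    # "nickname" has no entry here                   # zero
}

def value_types(prop):
    """Classes that every value of prop must belong to (all of them at once)."""
    return set(RANGES.get(prop, []))

print(value_types("emptiesInto"))
print(value_types("length"))
print(value_types("nickname"))  # empty set: nothing is known about the value
```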
Slide 69
rdfs:domain Use this property to indicate the classes that a
property will be used with. You may specify zero, one, or multiple
rdfs:domain properties. Zero: if you define a property without
specifying rdfs:domain then you are providing no information about
the class that the property will be used with, i.e., the property
can be used with any class. One: if you define a property by
specifying one rdfs:domain then you are indicating that the
property will be used with the class specified by rdfs:domain.
Multiple: if you define a property by specifying multiple
rdfs:domain properties then you are indicating that the property
will be used with a resource which belongs to every class defined
by the rdfs:domain properties. Example: consider the property
emptiesInto: suppose that it has two rdfs:domain properties - one
that specifies River and a second that specifies Vessel. Thus, the
two rdfs:domain properties indicate that emptiesInto will be used
with a resource that is both a River and a Vessel.
Slide 70
Example of multiple rdfs:domain properties: emptiesInto has a
domain of River and a domain of Vessel - emptiesInto is to be used
in instances that are of type River and Vessel.
Slide 71
Note that properties are defined separately from classes With
most Object-Oriented languages when a class is defined the
properties (attributes) are simultaneously defined. For example, "I
hereby define a Rectangle class, and its attributes are length and
width." With RDF Schema things are different. You define a class
(and indicate its relationships to other classes). Separately, you
define properties and then associate them with a class! For the
above example you would define the Rectangle class (and indicate
that it is a subclass of GeometricObject). Separately, you then
define a length property, indicate its range of values, and then
indicate that length may be used with the Rectangle class. (Thus,
if you have an untyped Resource with a length property you can
infer the Resource is a Rectangle.) Likewise for the width
property.
Slide 72
Advantage of separately defining classes and properties As we
have seen, the RDF Schema approach is to define a class, and then
separately define properties and state that they are to be used
with the class. The advantage of this approach is that anyone,
anywhere, anytime can create a property and state that it is usable
with the class!
The XML Representation of the taxonomy ...
NaturallyOccurringWaterSource.rdfs (snippet)
Slide 74
Literal value A literal type is a simple, untyped string.
Slide 75
NaturallyOccurringWaterSource Ontology!
NaturallyOccurringWaterSource.rdfs defines a set of classes and how
the classes are related. It defines a set of properties and
indicates the type of values they may have and what classes they
may be associated with. That is, it defines an ontology for
NaturallyOccurringWaterSources!
Slide 76
Inferring a resource's class from the properties' domain Notice
that in this RDF/XML instance (length: 6300 kilometers) the class
of the resource (Yangtze) is not identified. However, we can infer
that Yangtze is a River because length and emptiesInto have an
rdfs:domain of River, i.e., their domain asserts that these
properties will be used in a River instance.
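The domain-based inference above can be sketched as a lookup from each property used on a resource to that property's domain. The dictionary layout and the function infer_classes are assumptions for illustration; the emptiesInto value is the East China Sea URI used earlier in these slides.

```python
# property -> class named by its rdfs:domain
DOMAIN = {"length": "River", "emptiesInto": "River"}

# An untyped resource: only its properties are known.
yangtze_properties = {
    "length": "6300 kilometers",
    "emptiesInto": "http://www.china.org/geography#EastChinaSea",
}

def infer_classes(properties):
    """Classes implied by the domains of the properties a resource uses."""
    return {DOMAIN[name] for name in properties if name in DOMAIN}

print(infer_classes(yangtze_properties))
```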
Slide 77
RDF Schemas: simple, yet powerful Let's summarize what we have
learned: Use RDF Schema to define: a class hierarchy (a taxonomy);
properties, associating them with a class (use rdfs:domain) and
indicating the range of values (use rdfs:range). Once an RDF Schema
is defined then it can be used to infer additional facts about
data: an instance of a class is also an instance of all of its
superclasses.
Slide 78
Problems with RDFS RDFS is too weak to describe resources in
sufficient detail. No localised range and domain constraints: can't
say that the range of hasChild is person when applied to persons
and elephant when applied to elephants. No existence/cardinality
constraints: can't say that all instances of person have a mother
that is also a person, or that persons have exactly 2 parents. No
inverse properties: can't say that hasPart is the inverse of
isPartOf. Two classes, same concept: people use different words to
represent the same thing. It would be very useful to be able to
state "this class is equivalent to this second class". One person
may create an ontology with a class called "Airplane". Another
person may create an ontology with a class called "Plane". It would
be useful to be able to indicate that the two classes are
equivalent.
Slide 79
RDF Schemas: Building Block to More Expressive Ontology
Languages (RDF Schema -> OWL) RDF Schema was designed to be
extended. The ontology languages all use RDF Schema's basic notions
of Class, Property, domain, and range. OWL = Web Ontology Language
Slide 80
RDF Schema vs XML Schema XML Schemas is all about syntax. RDF
Schema is all about semantics. An XML Schema tool is intended to
validate that an XML instance conforms to the syntax specified by
the XML Schema. An RDF Schema tool is intended to provide
additional facts to supplement the facts in RDF/XML instances. XML
Schemas is prescriptive - an XML Schema prescribes what an element
may contain, and the order the child elements may occur. RDF
Schemas is descriptive - an RDF Schema simply describes classes and
properties.
Slide 81
Slide 82
Purpose of OWL The purpose of OWL is identical to RDF Schemas -
to provide an XML vocabulary to define classes, their properties,
and the relationships among classes. RDF Schema enables you to
express very rudimentary relationships and has limited inferencing
capability. OWL enables you to express much richer relationships,
thus yielding a much enhanced inferencing capability. A benefit of
OWL is that it facilitates a much greater degree of inference
making than you get with RDF Schemas.
OWL = RDF Schema + more Note: all of the elements/attributes
provided by RDF and RDF Schema can be used when creating an OWL
document.
Slide 85
Web Ontology Language OWL adds many new features to RDF:
Functional properties Inverse functional properties (database keys)
Local domain and range constraints General cardinality constraints
Inverse properties Symmetric and transitive properties
Slide 86
Example 1: The Robber and the Speeder DNA samples from a
robbery identified John Walker Lindh as the suspect. Here is the
police report on the robbery:... Later in the day a state trooper
gives a person a ticket for speeding. The driver's license showed
the name Sulay. Here is the state trooper's report on the
speeder:...
Slide 87
Any Relationship between the Robber and the Speeder? The
Central Intelligence Agency (CIA) has a file on Sulay, which states
that Sulay is owl:sameIndividualAs John Walker Lindh. The local
police, state troopers, and CIA share their information, thus
enabling the following inference to be made: Inference: The Robber
and the Speeder are one and the same!
Slide 88
Lesson Learned OWL provides a property (owl:sameIndividualAs,
named owl:sameAs in the final OWL Recommendation) for indicating
that two resources (e.g., two people) are the same.
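A program can act on such "same individual" statements by grouping names that refer to one individual. The sketch below uses a small union-find structure; it only illustrates the idea and is not how any particular OWL reasoner works. The report keys and helper names are hypothetical.

```python
# Statements of the form (a, b): a is the same individual as b.
same_statements = [("Sulay", "John Walker Lindh")]

# Who each report names.
reports = {"robbery suspect": "John Walker Lindh", "speeder": "Sulay"}

parent = {}

def find(name):
    """Return the representative of the group that name belongs to."""
    parent.setdefault(name, name)
    while parent[name] != name:
        parent[name] = parent[parent[name]]  # shorten the path as we go
        name = parent[name]
    return name

def union(a, b):
    """Merge the groups of a and b."""
    parent[find(a)] = find(b)

for a, b in same_statements:
    union(a, b)

def same_individual(a, b):
    return find(a) == find(b)

print(same_individual(reports["robbery suspect"], reports["speeder"]))
```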
Slide 89
Example 2: Using a Web Bot to Purchase a Camera My Web
Assistant (a Web Bot*) asks a Web Site: "Please send me your
e-catalog". The Web Site answers: "Here's my e-catalog" (an SLR
entry: 1.4, 300mm zoom, optional, $325 USD). My Web Assistant must
then decide: Is "SLR" a Camera? * A Web Bot is a software program
which crawls the Web looking for information.
Slide 90
Camera OWL Ontology [Diagram: Camera, with SLR, Large-Format, and
Digital below it] My Web Assistant program consults the Camera OWL
Ontology. The Ontology shows how SLR is classified: SLR is a type
(subclass) of Camera. Thus, my Web Assistant Bot dynamically
realizes that: Inference: The Olympus-OM10 SLR is a Camera!
Slide 91
Lesson Learned OWL provides elements to construct taxonomies
(called class hierarchies). The taxonomies can be used to
dynamically discover relationships!
Slide 92
Example 3: The Birthplace of King Kameha Upon scanning the
Web, three documents (1, 2, 3) were found which contain information
about King Kameha. Question: What is the birthplace of King
Kameha?
Slide 93
Answer: all three! The three documents give King Kameha's
birthplace as Hawaii, Sandwich Islands, and Aloha State. The Person
OWL Ontology indicates that a Person has only one birthplace
location (Person - birthplace, 1 -> Location). Thus, the Person OWL
Ontology enables this inference to be made: Inference: Hawaii,
Sandwich Islands, and Aloha State all represent the same location!
Slide 94
Lesson Learned
In the example we saw that the Person Ontology defined this
relationship: Person --birthplace--> Location, with cardinality 1.
This is read as: "A person has exactly one birthplace location."
This example is a specific instance of a general capability in OWL to
specify that a subject Resource has exactly one value for a property:
Resource (subject) --property--> Resource (value), with cardinality 1.
We saw in the example that such information can be used to make
inferences.
OWL Terminology: properties that relate a resource to exactly one
other resource are said to have a cardinality=1.
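A minimal Python sketch of the cardinality inference, assuming the three documents have already been reduced to (subject, property, value) triples. The triple list, the SINGLE_VALUED set, and the function name infer_same are all hypothetical names chosen for this illustration.

```python
from itertools import combinations

# Hypothetical triples gathered from the three Web documents.
triples = [
    ("King Kameha", "birthplace", "Hawaii"),
    ("King Kameha", "birthplace", "Sandwich Islands"),
    ("King Kameha", "birthplace", "Aloha State"),
]

# Properties that the ontology declares to have exactly one value.
SINGLE_VALUED = {"birthplace"}


def infer_same(triples, single_valued=SINGLE_VALUED):
    """Return pairs of values that must denote the same resource."""
    values = {}
    for subject, prop, obj in triples:
        if prop in single_valued:
            values.setdefault((subject, prop), []).append(obj)
    same = set()
    for objs in values.values():
        # One subject, one single-valued property, several names:
        # every pair of names refers to the same thing.
        for a, b in combinations(sorted(set(objs)), 2):
            same.add((a, b))
    return same


for a, b in sorted(infer_same(triples)):
    print(a, "is the same location as", b)
```

Note that the values are not treated as a contradiction. Because the property can have only one value, the different names are concluded to be aliases for one location.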
Slide 95
Review
Some of OWL's capabilities are:
An OWL instance document can be enhanced with an OWL property to
indicate that it is the same as another instance.
OWL provides the capability to construct taxonomies (class
hierarchies). Such taxonomies can be used to dynamically understand
how entities in an XML instance relate to other entities.
OWL provides the capability to specify that a subject can have only
one value.
By leveraging OWL, additional facts about your instance data can be
dynamically ascertained. That is, OWL facilitates a dynamic
understanding of the semantics of your data!
Slide 96
Defining Properties in OWL
Recall that with RDF Schema, rdf:Property was used for both:
relating a Resource to another Resource. Example: The emptiesInto
property relates a River to a BodyOfWater.
relating a Resource to an rdfs:Literal or a datatype. Example: The
length property relates a River to an xsd:nonNegativeInteger.
The designers of OWL decided that these are two kinds of properties,
and thus each should have its own class:
owl:ObjectProperty is used to relate a Resource to another Resource.
owl:DatatypeProperty is used to relate a Resource to an rdfs:Literal
or an XML Schema built-in datatype.
Slide 97
ObjectProperty vs. DatatypeProperty
An ObjectProperty relates one Resource to another Resource:
Resource --ObjectProperty--> Resource
A DatatypeProperty relates a Resource to a Literal or an XML Schema
datatype: Resource --DatatypeProperty--> Value
Slide 98
owl:ObjectProperty and owl:DatatypeProperty are subclasses of
rdf:Property
Slide 99
Defining Properties in OWL vs. RDF Schema
[Figure: an RDFS property definition next to an OWL property definition]
Slide 100
The Three Faces of OWL
Slide 101
OWL Full, OWL DL, and OWL Lite
Not everyone will need all of the capabilities that OWL provides.
Thus, there are three versions of OWL: OWL Full, OWL DL, and OWL Lite
(DL = Description Logic).
Slide 102
Comparison
OWL Full: Everything that has been shown in this tutorial is
available. Further, you can mix RDF Schema definitions with OWL
definitions.
OWL DL: You cannot use owl:cardinality with TransitiveProperty. You
cannot use a class as a member of another class, i.e., you cannot
have metaclasses. InverseFunctionalProperty cannot be used with
datatype properties (it can only be used with ObjectProperty).
OWL Lite: All the DL restrictions plus: the only allowed values for
owl:cardinality, owl:minCardinality, and owl:maxCardinality are 0 and
1. Cannot use owl:hasValue. Cannot use owl:disjointWith. Cannot use
owl:oneOf. Cannot use owl:complementOf. Cannot use owl:unionOf.
Slide 103
Advantages/Disadvantages
Full: The advantage of the Full version of OWL is that you get the
full power of the OWL language. The disadvantage of the Full version
of OWL is that it is difficult to build a Full tool. Also, the user
of a Full-compliant tool may not get a quick and complete answer.
DL/Lite: The advantage of the DL or Lite version of OWL is that tools
can be built more quickly and easily, and users can expect responses
from such tools to come more quickly and be more complete. The
disadvantage of the DL or Lite version of OWL is that you don't have
access to the full power of the language.
Slide 104
Slide 105
Experience with OWL
OWL plays a key role in an increasing number and range of
applications, e.g. science, eCommerce, geography, engineering,
defence, etc. For example, OWL tools have been used to identify and
repair errors in a medical ontology.
Experience of OWL in use has identified restrictions on expressivity
and on scalability. These restrictions are problematic in some
applications. Research has now shown how some restrictions can be
overcome, and a W3C group has updated OWL accordingly. The result is
called OWL 2. OWL 2 is now a Proposed Recommendation.
Slide 106
OWL 2
Extends OWL 1. Inherits the OWL 1 language features. The new features
of OWL 2 are based on: real applications, user experience, and tool
developer experience.
Slide 107
OWL 2 in a Nutshell
Extends OWL with a small but useful set of features: that are needed
in applications, for which semantics and reasoning techniques are
well understood, and that tool builders are willing and able to
support.
Adds profiles: language subsets with useful computational properties.
Is fully backwards compatible with OWL: every OWL ontology is a valid
OWL 2 ontology, and every OWL 2 ontology not using new features is a
valid OWL ontology.
Already supported by popular OWL tools and infrastructure: Protégé,
HermiT, Pellet, FaCT++, OWL API.
Slide 108
What's New in OWL 2?
Four kinds of new features: increased expressive power, extended
datatypes, metamodelling and annotations, and syntactic sugar.
Slide 109
Feature 1: Increased expressive power
Qualified cardinality restrictions, e.g.: persons having two friends
who are republicans.
Property chains, e.g.: the brother of your parent is your uncle.
Local reflexivity restrictions (classes of objects that are related
to themselves by a given property), e.g.: narcissists love
themselves; auto-regulating processes regulate themselves.
Reflexive, irreflexive, and asymmetric properties, e.g.: everything
is part of itself (globally reflexive); nothing can be a proper part
of itself (irreflexive); if x is a proper part of y, then the
opposite does not hold (asymmetric).
Disjoint properties, e.g.: you can't be both the parent of and child
of the same person.
Keys, e.g.: country + license plate constitute a unique identifier
for vehicles.
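The property chain example ("the brother of your parent is your uncle") can be sketched in plain Python as a join over triples. The property names hasParent, hasBrother, and hasUncle, the individuals, and the function apply_chain are hypothetical and chosen only for this illustration; in OWL 2 the chain is stated as an axiom and applied by a reasoner.

```python
def apply_chain(facts, first, second, result):
    """Infer (x, result, z) whenever (x, first, y) and (y, second, z) hold."""
    inferred = set()
    for x, p, y in facts:
        if p != first:
            continue
        for y2, q, z in facts:
            # The object of the first property must be the subject
            # of the second one for the chain to link up.
            if q == second and y2 == y:
                inferred.add((x, result, z))
    return inferred


# Hypothetical individuals.
facts = {
    ("Alice", "hasParent", "Bob"),
    ("Bob", "hasBrother", "Carl"),
}

print(apply_chain(facts, "hasParent", "hasBrother", "hasUncle"))
```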
Slide 110
Feature 2: Extended Datatypes
Extra datatypes: a much wider range of XSD datatypes is supported,
e.g.: integer, string, boolean, real, decimal, float, dateTime, ...
Datatype definitions: new user-defined datatypes, e.g. the format of
Italian registration plates: xsd:string with xsd:pattern
"[A-Z]{2} [0-9]{3}[A-Z]{2}"
Datatype restrictions: a range of a datatype, e.g. an adult has an
age >= 18: DatatypeRestriction(xsd:integer minInclusive 18)
Data range combinations:
Intersection of data ranges:
DataIntersectionOf( xsd:nonNegativeInteger xsd:nonPositiveInteger )
Union of data ranges: DataUnionOf( xsd:string xsd:integer )
Complement of a data range: DataComplementOf( xsd:positiveInteger )
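To make the datatype examples concrete, here is a rough Python analogue. It only mimics the value checks with ordinary Python code; the function names are invented for this sketch, the sample plate is made up, and the pattern is assumed to end with a closing quote after the last {2}.

```python
import re

# Pattern from the slide: two letters, a space, three digits, two letters.
PLATE = re.compile(r"[A-Z]{2} [0-9]{3}[A-Z]{2}")


def is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def is_plate(text):
    """User-defined datatype: xsd:string restricted by a pattern."""
    return PLATE.fullmatch(text) is not None


def is_adult_age(value):
    """Mirrors DatatypeRestriction(xsd:integer minInclusive 18)."""
    return is_int(value) and value >= 18


def in_intersection(value):
    """Mirrors DataIntersectionOf(nonNegativeInteger nonPositiveInteger).

    The only integer that is both >= 0 and <= 0 is 0.
    """
    return is_int(value) and value >= 0 and value <= 0


def in_union(value):
    """Mirrors DataUnionOf(xsd:string xsd:integer)."""
    return isinstance(value, str) or is_int(value)


def in_complement(value):
    """Mirrors DataComplementOf(xsd:positiveInteger)."""
    return not (is_int(value) and value > 0)


print(is_plate("AB 123CD"), is_adult_age(18), in_intersection(0))
```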
Slide 111
Feature 3: Metamodelling and Annotations
Restricted form of metamodelling via punning, e.g.: SnowLeopard
subClassOf BigCat (i.e., SnowLeopard is a class); SnowLeopard type
EndangeredSpecies (i.e., SnowLeopard is an individual). Classes and
individuals can have the same name: SnowLeopard is used both as a
class and as an individual.
Annotations of axioms as well as entities, e.g.: SnowLeopard type
EndangeredSpecies (source: WWF). Even annotations of annotations.
Slide 112
Feature 4: Syntactic sugar
Syntax used to make things easier to read or to express. It makes the
language "sweeter" for humans to use.
DisjointUnion: e.g., Element is the DisjointUnion of Earth, Wind,
Fire, and Water, i.e., Element is equivalent to the union of Earth,
Wind, Fire, and Water, and Earth, Wind, Fire, and Water are pairwise
disjoint.
DisjointClasses: a set of classes, all of which are pairwise
disjoint. Example: nothing can be both a LeftLung and a RightLung.
NegativeObjectPropertyAssertion: two individuals and a property that
does not hold between them. Example: patient John does not live in
Povo.
NegativeDataPropertyAssertion: an individual, a literal, and a
property that does not hold between them. Example: John is not 5
years old.
Slide 113
Profiles
Profiles are sublanguages of OWL 2. There are three profiles: OWL 2
EL, OWL 2 QL, and OWL 2 RL.
Slide 114
OWL 2 EL
The EL acronym reflects the profile's basis in the EL family of
description logics. EL is also referred to as a small description
logic (DL). This logic allows for conjunction and existential
restrictions. It does not allow disjunction or universal
restrictions. It can capture the expressive power used by many
large-scale ontologies.
Slide 115
OWL 2 QL
The QL acronym reflects its relation to the standard relational Query
Language. It does not allow existential and universal restrictions to
a class expression or a data range. These restrictions enable a tight
integration with RDBMSs: reasoners can be implemented on top of
standard relational databases. It can answer complex queries.
Slide 116
OWL 2 RL
The RL acronym reflects its relation to Rule Languages. OWL 2 RL is
designed to accommodate: OWL 2 applications that can trade the full
expressivity of the language for efficiency, and RDF(S) applications
that need some added expressivity from OWL 2.
Existential quantification to a class, and union and disjoint union
to class expressions, are not allowed. These restrictions allow OWL 2
RL to be implemented using rule-based technologies such as
rule-extended DBMSs.
Slide 117
Profiles
Profile selection depends on: the expressiveness required by the
application, the priority given to reasoning on classes or on data,
and the size of the datasets.
Slide 118
Using OWL to Define Classes
Slide 119
Constructing Classes using Set Operators
OWL gives you the ability to construct classes using these set
operators: intersectionOf, unionOf, complementOf.
Slide 120
Class Constructors
OWL classes can be constructed from other classes in a variety of
ways: Intersection (Boolean AND), Union (Boolean OR), Complement
(Boolean NOT), Restriction. Class construction is the basis for
description logic.
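The three Boolean constructors behave like ordinary set operations on class extensions. The Python sketch below illustrates this on a small, closed, made-up domain; the individuals c1 to c4 and the membership of the two classes are assumptions for the example. OWL itself reasons under an open-world assumption, so in particular the complement here is only relative to this fixed DOMAIN.

```python
# Hypothetical finite domain of individuals (cameras).
DOMAIN = {"c1", "c2", "c3", "c4"}

# Hypothetical extensions of two named classes.
Digital = {"c1", "c2"}
SLR = {"c2", "c3"}

digital_slr = Digital & SLR        # intersectionOf (Boolean AND)
digital_or_slr = Digital | SLR     # unionOf (Boolean OR)
not_digital = DOMAIN - Digital     # complementOf (Boolean NOT)

print(sorted(digital_slr))
print(sorted(digital_or_slr))
print(sorted(not_digital))
```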
Slide 121
OWL Class Constructors
Slide 122
OWL Axioms
Axioms are (mostly) reducible to inclusion ($\sqsubseteq$):
$C \equiv D$ iff both $C \sqsubseteq D$ and $D \sqsubseteq C$
Slide 123
OWL vs. Database
Advantages of using OWL to define an Ontology:
Extensible: much easier to add new properties. Contrast with a
database, where adding a new column may break a lot of applications
(see example on next slide).
Portable: much easier to move an OWL document than to move a
database.
Advantages of using a Database to define an Ontology:
Mature: database technology has been around a long time and is very
mature.
Slide 124
Slide 125
The Semantic Web architecture is composed of a series of standards
organized into a structure that expresses their interrelationships.
This architecture is represented using a diagram. It starts with the
foundation of URIs and Unicode. On top of that we find the syntactic
interoperability layer in the form of XML, which in turn underlies
RDF and RDF Schema (RDFS). Web ontology languages are built on top of
RDF(S). The last three layers are logic, proof, and trust, which have
not been significantly explored. Some of the layers rely on the
digital signature component to ensure security.
Slide 126
Semantic Web Layer Cake
Illustrates the different parts of the Semantic Web architecture.
First proposed by Tim Berners-Lee.
Slide 127
Evolution of the Web
Slide 128
Logic and Proof
An area of current Semantic Web research.
Good: systems can understand basic concepts (subclass, inverse,
etc.).
Better: being able to state any logical principles we want to, i.e.,
logical statements (rules) that allow the computer to make inferences
and deductions.
Slide 129
Logic
I am an employee of MemberCo. MemberCo is a member of W3C. MemberCo
has GET access to http://www.w3.org/Member/. I (therefore) have
access to http://www.w3.org/Member/.
Slide 130
Example (deduction)
If someone sells more than 100 products, then they are a member of
the Super Salesman club. John sold 102 things; therefore John is a
member of the Super Salesman club. More complex rules and inference
engines are being explored.
Slide 131
Proof
Different people can write logic statements. Machines can follow
semantic links to prove facts.
Prove John is a Super Salesman:
- Sales: John sold 55 widgets + 47 sprockets
- Widgets + sprockets: company products
- 55 + 47 = 102
- 102 > 100
- Super Salesman rule
- Proved: John is a Super Salesman
A Web of information processors (e.g. P2P)
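The Super Salesman proof can be sketched as a Python function that applies the rule and records each step it relied on. This is only a toy illustration: the function name, its parameters, and the wording of the recorded steps are assumptions for this sketch, not a Semantic Web proof format.

```python
def prove_super_salesman(name, sales, company_products, threshold=100):
    """Return (proved, steps), where steps lists the facts and rule used."""
    steps = []

    # Only sales of company products count towards the rule.
    counted = {item: n for item, n in sales.items() if item in company_products}
    sold = " + ".join(f"{n} {item}" for item, n in counted.items())
    steps.append(f"Sales: {name} sold {sold}")

    total = sum(counted.values())
    steps.append(f"Total: {total}")

    # Super Salesman rule: more than `threshold` products sold.
    proved = total > threshold
    steps.append(f"{total} > {threshold}: {proved}")
    if proved:
        steps.append(f"Proved: {name} is a Super Salesman")
    return proved, steps


proved, steps = prove_super_salesman(
    "John",
    {"widgets": 55, "sprockets": 47},
    {"widgets", "sprockets"},
)
for step in steps:
    print("-", step)
```

Returning the steps alongside the result is the point of the sketch: another party can check each recorded fact instead of trusting the conclusion.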
Slide 132
Proof
MemberCo's document employList lists me as an employee. W3C's member
list includes MemberCo. The ACLs for http://www.w3.org/Member/ assert
that employees of members have GET access.
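The chain of facts behind the access decision can be sketched in Python as follows. The data structures are hypothetical stand-ins for the three sources named above (MemberCo's employList, W3C's member list, and the ACLs), and "me" is a placeholder identifier for the requesting person.

```python
# Hypothetical facts, one structure per source.
employees = {("me", "MemberCo")}          # from MemberCo's employList
w3c_members = {"MemberCo"}                # from W3C's member list
PROTECTED = "http://www.w3.org/Member/"   # resource covered by the ACLs


def has_get_access(person, resource=PROTECTED):
    """ACL rule: employees of W3C members have GET access."""
    if resource != PROTECTED:
        return False
    return any(
        employee == person and org in w3c_members
        for employee, org in employees
    )


print(has_get_access("me"))
```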