ICS-FORTH
Describing Resources on the Web: The Resource Description Framework
Vassilis Christophides, Dimitris Plexousakis
Computer Science Department, University of Crete
Institute for Computer Science - FORTH
Heraklion, Crete
http://www.ics.forth.gr/proj/isst/RDF
What is the Problem?
3.6 million Web sites
Five hundred million or more addressable pages on the Web
High consumer expectations conflicting with primitive tools and mechanisms
Uncertain quality, integrity, trust
The Information Landscape in the Web-era
The Web changes relationships among
  authors
  publishers
  information intermediaries and distributors
  users
Lower barriers to “publication”
  rapid dissemination of information and ideas
  less advantage to size or centralization
  greatly expanded access
Manageability is reduced
  resource discovery is chaotic
  organization is haphazard
  preservation is almost non-existent
The Web Information System vs. Traditional Libraries
Search systems are motivated by advertising
Index coverage is unpredictable and limited (1/3)
Too much recall, too little precision
Index spam abounds
Resources (and their names) are volatile
What about versions, editions, back issues?
Archiving is presently unsolved
Authority and quality of service are spotty
Managing access rights is hard
Metadata: Higher Quality Web Information Services
Traditionally, metadata has been understood as “Data about Data”
  it helps to impose order on chaos
Examples:
  a library catalogue contains information (metadata) about publications (data)
  a file system maintains permissions (metadata) about files (data)
Metadata describes other data
  one application’s metadata is another application’s data
  metadata can itself be described by metadata (but that doesn’t make it meta-metadata)
Example:
  price lists (metadata) have expiration dates: metadata about metadata (it is still just metadata!)
Metadata takes Many Forms
resource discovery
document administration
rights management
content rating
security and authentication
archival status
products and services
database schemas
process control or description
Metadata exists for Almost Anything
People
Places
Objects
Concepts
Documents
Archives
Databases
Application: Item and Collection Cataloguing
Describing individual resources
  documents, pages, images, audio files, etc.
Describing the content of collections
  Web sites, databases, directories, etc.
Relationships among resources
  tables of contents, chapters, images, ...
  site maps
Application: Resource Discovery
Search engines can better “understand” the contents of a particular page
More accurate searches
  additional information aids precision
Searches can be automated because less manual “weeding” is needed to process the search results
Application: Electronic Commerce
Metadata can be used to encode information needed in all stages of electronic commerce
  locating seller/buyer & product
    searching “yellow pages”
  agreeing on terms of sale
    prices, terms of payment, contractual information
  transactions
    delivery mechanisms, dates, terms
[Figure: broker, market place, providers/clients]
Application: Intelligent Agents
Representation and sharing of knowledge
  knowledge exchange
  modeling
Communication
  user-to-agent, agent-to-agent, agent-to-service
Resource discovery
  gives web-roaming agents the ability to “understand” their environment
[Figure: places and a service]
Application: Content Rating
Empowering users to select which kinds of web content they wish to see
Child protection
  W3C PICS (Platform for Internet Content Selection) working group
  US Communications Decency Act of 1996
  simple metadata architecture
  precursor to RDF
Application: Digital Signatures
These are key to building the “Web of Trust”
Required by
  agents
  electronic commerce
  collaboration
RDF will become the preferred way to encode digital signatures on documents and on statements about documents
Other Applications
Privacy preferences and policies
  describing a user’s willingness or reluctance to disclose information about himself or herself
  describing a site administrator’s desire to gather information about visiting users
Intellectual property rights
  contractual terms related to usage and distribution rights to a document
(Meta)Data Transmission Methods
Embedded (e.g. META)
Associated with (in HTTP header)
Trusted third party (explicit HTTP GET)
Metadata Assertions
The Web is “machine-readable” but not “machine-understandable”
Metadata is useful
  a lot could be gained from structured description of pages, servers, search services, and other resources
Accommodate multiple varieties of metadata
Metadata requirements will evolve
A Plethora of Metadata Standards
Many metadata standards have evolved at different levels, and to meet different requirements...
MICI
Interoperability Issues
Semantic interoperability: standardisation of content
  “Let’s talk English”
  “cat milk sat drank mat”
Structural interoperability: standardisation of form
  “Here’s how to make a sentence”
  “Cat sat on mat. Drank milk.”
Syntactic interoperability: standardisation of expression
  “These are the rules of grammar”
  “The cat sat on the mat. It drank some milk.”
Metadata Challenges
Many flavours of metadata
  which one do I use?
Managing change
  new varieties, and evolution of existing forms
Tension between functionality and simplicity, extensibility and interoperability
  functions, features, and cool stuff
  simplicity and interoperability
Towards Metadata for Community Webs
A community: a group of people sharing a domain of discourse and a set of resources (e.g., data, documents, services) and having some common interests
  Commerce, Education, Health
Provide community-specific metadata functionality in order to create, administer, and access resources
  common semantic, structural, and syntactic conventions for exchange of resource description information
[Figure: Community Webs: Education, Health, Commerce, Workplace]
Metadata Interoperability in Community Webs
Communities of expertise (not software vendors) are responsible for:
  semantics
  registration
  administration
  access management
  authority of data
  sharing and distribution
[Figure: Community Webs: scientific data, home pages, geo, library, museums, commerce, whatever...]
Metadata Implementation Approaches
Harvesting metadata into a repository (database)
Distributed database search
Harvesting Metadata into a Repository (database)
[Figure: HTML, XML and other types of resources; harvester; repository; query; dynamic document creation from database; retrieve resource]
Distributed Database Search
[Figure: query; Z39.50 gateway; three Z39.50 servers; retrieve resource]
RDF origins
W3C Metadata Activity 1997-2000
PICS (Internet content selection)
Warwick Framework / Dublin Core
XML (XML Data, Channels etc.)
MCF (Apple, Netscape)
URI specification for Web identifiers
RDF Objectives
Enables resource description communities to define their own semantics
  we can disagree about semantics, but share infrastructure (syntax, query, editors)
Imposes structural constraints on the expression of various application metadata
  for consistent encoding, exchange and processing of metadata on the Web
Metadata vocabularies can be developed without central coordination
Fine-grained mixing of diverse metadata
Signed RDF is the basis for trust
XML used for ‘serialisation syntax’
Describing Community Resources using RDF
[Figure: advanced knowledge schemas (ontologies, thesauri); heterogeneous resource descriptions (tagged documents); complexity and diversity of information resources]
The Basic RDF Data Model
RDF: Resource Descriptions
Data model: directed labeled graphs
  Nodes: resources (URIs) or literals
  Edges: properties (attributes or relationships)
Statement: assertion of the form (resource, property, value)
Description: set of statements concerning a resource
XML syntax
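The data model above can be sketched in a few lines of Python. This is only an illustration under assumptions: the class names `Resource`, `Literal` and `Statement` and the helper `description` are hypothetical and do not belong to any RDF library; the sample data reuses the tutorial example from these slides.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Resource:
    """A graph node identified by a URI (hypothetical helper class)."""
    uri: str


@dataclass(frozen=True)
class Literal:
    """A graph node holding a fixed value."""
    value: str


@dataclass(frozen=True)
class Statement:
    """An assertion of the form (resource, property, value)."""
    resource: Resource
    prop: Resource
    value: Union[Resource, Literal]


def description(statements, resource):
    """A description is the set of statements concerning one resource."""
    return [s for s in statements if s.resource == resource]


DC = "http://purl.org/dc/elements/1.0/"
tutorial = Resource("URI:Tutorial")
graph = [
    Statement(tutorial, Resource(DC + "Title"), Literal("RDF Presentation")),
    Statement(tutorial, Resource(DC + "Creator"), Literal("Vassilis Christophides")),
]
about_tutorial = description(graph, tutorial)
```

Keeping resources and literals as distinct types mirrors the rule that only resources may appear as the subject of a statement.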
The notion of Resource
A resource is identified by a URI:
  [absoluteURI | relativeURI] [“#” fragment-id]
The resource identified by a URI may be abstract, i.e. not network retrievable
A resource is distinct from the entity it resolves to at any particular time
  http://www.ics.forth.gr/RDF/
From RFC 2396:
Resource: A resource can be anything that has identity. Familiar examples include an
electronic document, an image, a service (e.g., "today's weather report for Los Angeles"), and a collection of other resources. Not all resources are network "retrievable"; e.g., human beings, corporations, and bound books in a library can also be considered resources. The resource is the conceptual mapping to an entity or set of entities, not necessarily the entity which corresponds to that mapping at any particular instance in time. Thus, a resource can remain constant even when its content---the entities to which it currently corresponds---changes over time, provided that the conceptual mapping is not changed in the process.
RDF Syntax
The RDF model defines formal relationships among resources, properties and values
Syntax is required to...
  store instances of the model into files
  communicate files from one application to another
W3C XML (eXtensible Markup Language)
[Figure: XML-tagged documents]
RDF Model Example: Complex Values
[Figure: URI:Tutorial with dc:Title “RDF Presentation” and dc:Creator “Vassilis Christophides”]
<RDF xmlns="http://www.w3.org/TR/WD-rdf-syntax#"
     xmlns:dc="http://purl.org/dc/elements/1.0/">
  <Description about="URI:Tutorial">
    <dc:Title>RDF Presentation</dc:Title>
    <dc:Creator>Vassilis Christophides</dc:Creator>
  </Description>
</RDF>
[Figure: the dc:Creator value as a structured node with properties bib:Name (“Vassilis Christophides”), bib:Email and bib:Aff (URI:FORTH, “ICS-FORTH”)]
RDF Syntax Example: Complex Values
<RDF xmlns="http://www.w3.org/TR/WD-rdf-syntax#"
     xmlns:dc="http://purl.org/dc/elements/1.0/"
     xmlns:bib="http://www.bib.org/persons#">
  <Description about="URI:Tutorial">
    <dc:Title>RDF Presentation</dc:Title>
    <dc:Creator>
      <Description>
        <bib:Name>Vassilis Christophides</bib:Name>
        <bib:Email>[email protected]</bib:Email>
        <bib:Aff resource="http://www.ics.forth.gr" />
      </Description>
    </dc:Creator>
  </Description>
</RDF>
Equivalent abbreviated syntax for the inner description:
<Description bib:Name="Vassilis Christophides" bib:Email="[email protected]">
  <bib:Aff resource="http://www.ics.forth.gr" />
</Description>
RDF Model Example
[Figure: the previous graph (URI:Tutorial with dc:Title “RDF Presentation” and dc:Creator carrying bib:Name, bib:Email and bib:Aff) extended with the administrative properties admin:By “STEP”, admin:On “01-01-01” and admin:For “...”]
Where do you stop?
The basic RDF model & syntax provide enabling technology
The degree of metadata simplicity/complexity is a matter of:
  resource description communities’ needs, best practice and experience
  organization/institution’s policy
  economics
  goals and requirements of implementation
The Basic RDF Data Model: In Brief
Nodes are resources connected by named properties
  R1 --P1--> R2
The degenerate case is an arc terminating in a fixed value
  R1 --P1--> “foo”
An RDF description consists of a directed graph of arbitrary complexity
  [Figure: resources R1 to R8 connected by properties P1 to P7]
One Additional Concept: Container Values
Containers are collections
  they allow grouping of resources (or literal values)
It is possible to make statements about the container (as a whole) or about its members individually
Different types of containers exist
  Bags: groups of things
  Sequences: ordered groups of things
  Alternates: alternative things/values
    first value is the default
    must be at least one
Duplicate values are permitted
  there is no mechanism to enforce unique value constraints
Syntactic shorthand provided (much like HTML lists)
Containers (continued)
[Figure: URI:Tutorial has a dc:Creator property whose value is a node with rdf:type rdf:Seq, rdf:_1 “Vassilis Christophides” and rdf:_2 “Dimitris Plexousakis”]
Containers (continued)
[Figure: URI:Tutorial has two dc:Creator properties, with values “Vassilis Christophides” and “Dimitris Plexousakis”]
The Basic RDF Data Model: Formal Aspects
Statement := (predicate, subject, object)
  Predicate is a resource
  Subject is a resource
  Object is either a resource or a literal
  Object = Predicate(Subject)
A model is a set of statements
Formal model based on triples (universal relation)
Example
  {author, “http://www.ics.forth.gr/proj/isst/RDF”, node}
  {name, node, “Vassilis Christophides”}
  {email, node, “[email protected]”}
Triples for Container Values: Example
Triples from the first example:
  {“http://www.ics.forth.gr/proj/isst/RDF”, dc:Creator, x}
  {x, rdf:_1, “Vassilis Christophides”}
  {x, rdf:_2, “Dimitris Plexousakis”}
  {x, rdf:type, rdf:Seq}
Triples from the second example:
  {“http://www.ics.forth.gr/proj/isst/RDF”, dc:Creator, “Vassilis Christophides”}
  {“http://www.w3.org/TR/REC-rdf-syntax”, dc:Creator, “Dimitris Plexousakis”}
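The two encodings can be compared with a small Python sketch. The function names `container_triples` and `repeated_property_triples` and the placeholder node name `x` are assumptions made for illustration; triples are written in the (subject, predicate, object) order used on this slide.

```python
def container_triples(subject, prop, members, kind="Seq", node="x"):
    """Encode the members as one container value attached to the subject.

    The container node gets an rdf:type and one rdf:_n property per member,
    so statements can be made about the group as a whole.
    """
    triples = [(subject, prop, node), (node, "rdf:type", "rdf:" + kind)]
    for position, member in enumerate(members, start=1):
        triples.append((node, f"rdf:_{position}", member))
    return triples


def repeated_property_triples(subject, prop, values):
    """Encode each value as a separate statement with the same property."""
    return [(subject, prop, value) for value in values]


tutorial = "http://www.ics.forth.gr/proj/isst/RDF"
authors = ["Vassilis Christophides", "Dimitris Plexousakis"]

as_sequence = container_triples(tutorial, "dc:Creator", authors)
as_repeated = repeated_property_triples(tutorial, "dc:Creator", authors)
```

The sequence form keeps the author order and yields four triples; the repeated-property form yields two unordered statements about the same resource.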
Edge Labeled Directed Graphs (RDF)
[Figure: RDFTutorial --creator--> Vassilis --affiliation--> ICS-FORTH; ICS-FORTH --activities--> ISL; ICS-FORTH --projects--> C-Web]
(creator, RDFTutorial, Vassilis)
(affiliation, Vassilis, ICS-FORTH)
(activities, ICS-FORTH, ISL)
(projects, ICS-FORTH, C-Web)
Node labeled Directed Graph (XML)
[Figure: tree with element nodes root, foo, bar and baz, and attribute nodes href, x, y and z carrying the values “…”, 1, 2, 3 and aaa]
<root>
  <foo href="…" x="1" />
  <bar x="2" y="3">
    <baz z="aaa"/>
  </bar>
</root>
What can we Express in RDF?
RDF relies on an (edge-labeled) directed graph model that can easily
  be extended by just adding more edges
  combine multiple vocabularies, distinguished by their URIs
RDF provides a standard syntax to represent these graphs in XML
  the RDF model can be thought of as a simplified XML Infoset
But RDF goes beyond XML syntactic issues
  it allows semantic networks to be defined on the Web
Semantic Networks
[Figure: semantic network with nodes Person, Artist, Painter, Sculptor, Artifact, Painting, Sculpture and String, isa links between them, and properties name, lives in, creates, paints and sculpts]
“A Person has a name and lives_in somewhere. Artists are persons; painters and sculptors are artists. An artist creates artifacts (paintings or sculptures); a painter paints paintings and a sculptor sculpts sculptures.”
RDF Schema Definition: RDFS
Declaration of label vocabularies for description graph nodes & edges
  enables communities to share machine-readable tokens and define human-readable labels
Node labels (types) are defined as classes
  literal data types as defined by the XML Schema WG
  a resource may have a specific ‘type’ property
Edge labels (predicates) are defined as properties of these classes
  a resource of a given type may have a given property (domain constraint)
  a resource of a given type may be the value of a given predicate (range constraint)
RDFS vocabularies are expressible in the basic RDF model and syntax
  RDFS vocabularies are also Web resources (and have URIs) and therefore can be described using RDF
Constructing and Using RDF schemas
RDF Schema vocabularies allow for
Specialization of both classes & properties (simple & multiple)
Multiple classification of resources under several classes
Unordered, optional, and multi-valued properties
Domain and range polymorphism of properties
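Specialization of classes and properties can be pictured as a subsumption check over parent links. The sketch below is a minimal illustration, assuming the artist vocabulary used elsewhere in these slides; the dictionaries and the functions `ancestors` and `is_a` are hypothetical names, and each entry maps a name to a set of parents so that multiple specialization is possible.

```python
# Parent links, taken from the artist example of this tutorial.
SUBCLASS = {
    "Painter": {"Artist"},
    "Sculptor": {"Artist"},
    "Painting": {"Artifact"},
    "Sculpture": {"Artifact"},
}
SUBPROPERTY = {"paints": {"creates"}, "sculpts": {"creates"}}


def ancestors(name, parents):
    """Return every class (or property) that transitively subsumes `name`."""
    seen = set()
    stack = [name]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_a(name, target, parents):
    """True if `name` is `target` or a direct or indirect specialization of it."""
    return name == target or target in ancestors(name, parents)


painter_is_artist = is_a("Painter", "Artist", SUBCLASS)
paints_supers = ancestors("paints", SUBPROPERTY)
```

The same traversal serves both hierarchies, which reflects the fact that RDFS treats class and property specialization uniformly.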
A Cultural Community Resource Description Example
[Figure: three layers. Portal schema: classes Artist, Painter, Sculptor, Artifact, Painting, Sculpture, Museum and ExtResource; properties creates, paints, sculpts, exhibited, technique, fname, lname, title and last_modified, with String and Date ranges. Portal resource descriptions: nodes &r1 to &r6, with values such as fname “Pablo”, lname “Picasso”, lname “Rodin”, technique “oil on canvas”, title “Reina Sofia Museum”, and last_modified 2000/06/09 and 2000/01/02. Web resources: r1: www.rodin.fr/thinker.gif, r2: museoreinasofia.mcu.es/guernica.jpg, r3: www.artchive.com/woman.jpg, r4: museoreinasofia.mcu.es]
RDF/XML Serialization: Data
<rdf:RDF xml:lang="en"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/TR/2000/PR-rdf-schema-20000327#"
    xmlns="">
  <Painter rdf:ID="picasso132">
    <fname>Pablo</fname>
    <lname>Picasso</lname>
    <paints>
      <Painting rdf:about="http://museoreinasofia.mcu.es/guernica.gif">
        <exhibited>
          <Museum rdf:about="http://museoreinasofia.mcu.es"/>
        </exhibited>
        <technique>oil on canvas</technique>
      </Painting>
    </paints>
    <paints>
      <Painting rdf:about="http://www.artchive.com/woman.jpg"/>
    </paints>
  </Painter>
  <ExtResource rdf:about="http://museoreinasofia.mcu.es">
    <title>Reina Sophia Museum</title>
    <last_modified>2000/06/09</last_modified>
  </ExtResource>
  <Sculptor rdf:ID="rodin424" lname="Rodin">
    <creates>
      <Sculpture rdf:about="http://www.rodin.fr/thinker.gif"/>
    </creates>
  </Sculptor>
</rdf:RDF>
RDF/XML Serialization: Schema
<rdf:RDF xml:lang="en"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/TR/2000/PR-rdf-schema-20000327#">
  <rdfs:Class rdf:ID="Artist"/>
  <rdfs:Class rdf:ID="Artifact"/>
  <rdfs:Class rdf:ID="Style"/>
  <rdfs:Class rdf:ID="Museum"/>
  <rdfs:Class rdf:ID="Sculptor">
    <rdfs:subClassOf rdf:resource="#Artist"/>
  </rdfs:Class>
  <rdfs:Class rdf:ID="Painter">
    <rdfs:subClassOf rdf:resource="#Artist"/>
  </rdfs:Class>
  <rdfs:Class rdf:ID="Sculpture">
    <rdfs:subClassOf rdf:resource="#Artifact"/>
  </rdfs:Class>
  <rdfs:Class rdf:ID="Painting">
    <rdfs:subClassOf rdf:resource="#Artifact"/>
  </rdfs:Class>
  <rdf:Property rdf:ID="creates">
    <rdfs:domain rdf:resource="#Artist"/>
    <rdfs:range rdf:resource="#Artifact"/>
  </rdf:Property>
  <rdf:Property rdf:ID="paints">
    <rdfs:domain rdf:resource="#Painter"/>
    <rdfs:range rdf:resource="#Painting"/>
    <rdfs:subPropertyOf rdf:resource="#creates"/>
  </rdf:Property>
  <rdf:Property rdf:ID="sculpts">
    <rdfs:domain rdf:resource="#Sculptor"/>
    <rdfs:range rdf:resource="#Sculpture"/>
    <rdfs:subPropertyOf rdf:resource="#creates"/>
  </rdf:Property>
  <rdf:Property rdf:ID="exhibited">
    <rdfs:domain rdf:resource="#Painting"/>
    <rdfs:range rdf:resource="#Museum"/>
  </rdf:Property>
  <rdf:Property rdf:ID="technique">
    <rdfs:domain rdf:resource="#Painting"/>
    <rdfs:range rdf:resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Literal"/>
  </rdf:Property>
  <rdf:Property rdf:ID="title">
    <rdfs:domain rdf:resource="#ExtResource"/>
    <rdfs:range rdf:resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Literal"/>
  </rdf:Property>
  ….
</rdf:RDF>
RDF/S vs. Well-Known Formalisms
Relational or object database models (ODMG, SQL)
  classes don’t define table or object types
  instances may have quite different associated properties
  collections with heterogeneous members
Semistructured or XML data models (OEM, UnQL, YAT, XML Schema)
  schema labels on both nodes and edges
  class and property subsumption is not captured
  heterogeneous structures reminiscent of SGML exceptions
Knowledge representation languages (Telos, DL, F-Logic)
  absence of complex values and n-ary relationships (bags, sequences)
Some RDF Applications
Web browsers:
  Netscape 6 from Netscape/AOL uses RDF to integrate various data-oriented applications such as bookmarks, mail/news, channels, etc., as well as for smart browsing and related links (RDF annotation services)
  Amaya Editor/Browser from W3C uses RDF to support user annotations on Web pages as metadata
Brokers/portals:
  RSS (RDF Site Summary) XML/RDF Specification 1.0, 2000
  Web Service Description Language (WSDL) XML/RDF Specification, 2000
  PICS Rating Vocabularies in XML/RDF, W3C NOTE, 27 March 2000
  Platform for Privacy Preferences and RDF/RDF, W3C Draft, 10 May 2000
Content management:
  OCLC Dublin Core Elements in RDF
  ICOM-CIDOC Conceptual Reference Model in RDF
  The Wordnet Lexical Ontology in RDF
  European Treasury Browser in RDF
Practical notes on RDF
Authoring/visualization
  by hand (experts only, perhaps copy & paste)
  support by other tools (editors like Stanford Protégé)
  conversion from existing data stores (using XSLT)
  visualize RDF graphs (using Rudolf RDFViz)
Parsing/validating
  ICS-FORTH Validating RDF Parser (VRP)
  Rapier RDF Parser
  W3C Simple RDF Parser & Compiler (SiRPAC)
Storing/querying
  ICS-FORTH RSSDB/RQL
  Aidministrator Sesame
  Redland Squish
  R.V.Guha RDFdb
Harvesting/crawling
  AIFB RDF Crawling
The ICS-FORTH RDFSuite
The Validating RDF Parser (VRP): Karsten Tolle, Diploma Thesis
  the first RDF parser supporting semantic validation of both resource descriptions and schemas
The RDF Schema Specific DataBase (RSSDB): Sophia Alexaki, M.Sc. Thesis
  the first RDF store using schema knowledge to automatically generate an object-relational (SQL3) representation of RDF metadata and load resource descriptions
The RDF Query Language (RQL): Greg Karvounarakis, M.Sc. Thesis
  the first declarative language for uniformly querying RDF schemas and resource descriptions
The ICS-FORTH RDFSuite Architecture
[Figure: RDFSuite architecture. VRP (parser, validator, VRP internal RDF model) feeds the RDF Loader, which uses the loading RDF Java APIs and JDBC (SQL3) to populate an ORDBMS. The ORDBMS holds the tables Class (c_name), Property (p_name, domain, range), SubClass (subcl, supcl) and SubProperty (subpr, suppr), plus instance tables such as Hotel (URI) and title or creates (source, target). The RQL interpreter (parser, graph constructor, typing, evaluation) accesses the ORDBMS through the DBMS RDF query APIs (LIBC++, SQL3 + SPI functions).]
The Validating RDF Parser (VRP)
The VRP parser checks only whether an RDF file is well-formed according to the RDF M & S Spec
The VRP validator checks whether the model (i.e. triples) generated by the parser satisfies the constraints imposed by the RDF Schema Spec
[Figure: RDF/XML descriptions enter the parser (lexical analyzer, syntax analyzer, namespace manager), which builds the VRP internal RDF model checked by the validator; the model is exposed as an RDF graph model and as an RDF triple model (subject, predicate, object)]
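The kind of semantic check performed after parsing can be illustrated with a domain and range test over triples. This is a simplified sketch and not VRP's actual algorithm: the schema dictionaries, the `types` mapping and the function names are assumptions, and single inheritance is used for brevity.

```python
# Schema facts taken from the artist example (single parent per class here).
SUBCLASS_OF = {
    "Painter": "Artist",
    "Sculptor": "Artist",
    "Painting": "Artifact",
    "Sculpture": "Artifact",
}
# property name -> (domain class, range class)
PROPERTIES = {
    "creates": ("Artist", "Artifact"),
    "paints": ("Painter", "Painting"),
    "sculpts": ("Sculptor", "Sculpture"),
    "exhibited": ("Painting", "Museum"),
}


def is_subclass(cls, ancestor):
    """Walk up the class hierarchy; an untyped resource (None) never matches."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def validate(triples, types):
    """Return a list of constraint violations for (subject, property, object) triples."""
    errors = []
    for subject, prop, obj in triples:
        if prop not in PROPERTIES:
            errors.append(f"undeclared property: {prop}")
            continue
        domain, range_ = PROPERTIES[prop]
        if not is_subclass(types.get(subject), domain):
            errors.append(f"domain violation: {subject} {prop}")
        if not is_subclass(types.get(obj), range_):
            errors.append(f"range violation: {prop} {obj}")
    return errors


types = {"picasso132": "Painter", "rodin424": "Sculptor", "guernica": "Painting"}
ok = validate([("picasso132", "paints", "guernica")], types)
bad = validate([("rodin424", "paints", "guernica")], types)
```

A well-formed file can still fail this check, which is the distinction the slide draws between parsing and validation.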
The RDF to DBMS Loader
[Figure: the extended VRP validator (with additional constraints) holds the RDF model as objects of the classes Resource (URI), RDF_Resource (rdf:type, ...), RDF_Class (rdfs:subClassOf), RDF_Property (rdfs:domain, rdfs:range, rdfs:subPropertyOf, link_list) and RDF_Statement (rdf:predicate, rdf:subject, rdf:object). Example objects: RDF_Class@2344 (URI ns#C1, rdf:type rdfs#Class), RDF_Property@5678 (URI ns#P1, rdf:type rdf#Property, rdfs:domain ns#C1, rdfs:range ns#C2, link_list (r1,r2)) and RDF_Resource@7844 (URI r1, rdf:type ns#C1). Through the RDF loading APIs, each object's store() call writes to the DBMS: the Class table (c_name: ns#C1), the Property table (p_name, domain, range: ns#P1, ns#C1, ns#C2), the ns#C1 table (URI: r1) and the ns#P1 table (source, target: r1, r2). The DBMS also holds a persistent namespace and is accessed through the RDF querying APIs.]
RSSDB Representation of RDF metadata
[Figure: RSSDB tables.
Namespace (id, uri): 1 http://www.w3.org/2000/01/rdf-schema#; 2 http://www.w3.org/1999/02/22-rdf-syntax-ns#; 3 http://www.odp.org/schema.rdf#; 4 http://www.arts.org/schema.rdf#; 5 http://www.dc.org/schema.rdf#
Type (id, nsid, lpart): 1 1 Literal; 2 1 Bag; 3 1 Seq
Class (id, nsid, lpart), with rows including DataResource, ExternalPage, Arts and Art_History
Property (id, nsid, lpart, domainid, rangeid), with rows for title
SubClass (subid, superid) and SubProperty (subid, superid)
Instance tables: t10 to t13 (URI), one per class, and t14, t15 (source, target), one per property]
The RDF Query Language (RQL)
Declarative query language for RDF description bases
  relies on a typed data model (literal & container types + union types)
  follows a functional approach (basic queries and filters)
  adapts the functionality of semistructured or XML query languages to RDF, but also:
    treats properties as self-existent individuals
    exploits taxonomies of node and edge labels
    allows querying of schemas as semistructured data
Relational interpretation of schemas & resource descriptions
  classes (unary relations)
  properties (binary relations)
  containers (n-ary relations)
Browsing Portal Catalogs with RQL
Simple set queries on class and property extents:
Find the resources in the extent of the property creates (includes paints & sculpts)
  creates
  {{ [www.portal.gr/rodin424, www.rodin.fr/thinker.gif],
     [www.portal.gr/picasso132, museoreinasofia.mcu.es/guernica.gif],
     [www.portal.gr/picasso132, www.artchive.com/woman.jpg] }}
Find the resources classified under both ExtResource and Sculpture (multiply classified resources)
  ExtResource intersect Sculpture
  {{ www.rodin.fr/thinker.gif }}
Schema constructs are used as query terms, with support for automatic query expansion (similar to thesauri-based IRS)
Useful to query resources with minimal schema knowledge
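The automatic expansion of `creates` to its subproperties, and the intersection of class extents, can be mimicked with plain Python sets. This is only a sketch of the idea, not how RQL is implemented; the dictionaries and the function `property_extent` are hypothetical, and the class memberships are assumed from the portal example.

```python
SUBPROPERTY_OF = {"paints": "creates", "sculpts": "creates"}

# Pairs (source, target) asserted directly with each property.
DIRECT_EXTENT = {
    "creates": {("www.portal.gr/rodin424", "www.rodin.fr/thinker.gif")},
    "paints": {
        ("www.portal.gr/picasso132", "museoreinasofia.mcu.es/guernica.gif"),
        ("www.portal.gr/picasso132", "www.artchive.com/woman.jpg"),
    },
    "sculpts": set(),
}

# Class extents (assumed memberships, for illustration only).
CLASS_EXTENT = {
    "ExtResource": {"www.rodin.fr/thinker.gif", "museoreinasofia.mcu.es"},
    "Sculpture": {"www.rodin.fr/thinker.gif"},
}


def property_extent(prop):
    """Extent of a property, including the extents of all its subproperties."""
    pairs = set(DIRECT_EXTENT.get(prop, set()))
    for sub, sup in SUBPROPERTY_OF.items():
        if sup == prop:
            pairs |= property_extent(sub)
    return pairs


creates_extent = property_extent("creates")
both = CLASS_EXTENT["ExtResource"] & CLASS_EXTENT["Sculpture"]
```

Querying `creates` returns the pairs asserted with `paints` as well, so a user needs to know only the most general property.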
Personalizing Portal Catalogs with RQL
Navigational queries on semistructured resource descriptions
Find the Museum resources that have been modified in year 2000.
  select x
  from Museum{x}.last_modified{y}
  where y >= 2000/01/01
  {{ museoreinasofia.mcu.es }}
Similar functionality to semistructured or XML query languages (Lorel, UnQL, XQL, XML-QL, XML-GL)
Useful in the absence of schema information or when multiple schemas are used to describe resources
  (data paths not foreseen in the schema)
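Read relationally, the path expression `Museum{x}.last_modified{y}` joins the class extent of Museum with the extent of the property last_modified. A rough Python equivalent follows; the data values come from the portal example, while the variable and function names are assumptions.

```python
from datetime import date

# Extent of the class Museum and of the property last_modified.
MUSEUM = {"museoreinasofia.mcu.es"}
LAST_MODIFIED = [
    ("museoreinasofia.mcu.es", date(2000, 6, 9)),
    ("www.artchive.com/woman.jpg", date(2000, 1, 2)),
]


def museums_modified_since(since):
    """select x from Museum{x}.last_modified{y} where y >= since"""
    return {x for (x, y) in LAST_MODIFIED if x in MUSEUM and y >= since}


result = museums_modified_since(date(2000, 1, 1))
```

The second resource also has a recent modification date but is filtered out because it is not in the extent of Museum.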
Querying Portal Catalogs with Large Schemas
Filtering both resource descriptions and schemas
Find the paintings having as technique “oil on canvas” that have been created by a neo-impressionist painter
  select y
  from {:$X}creates{y:Painting}.technique{z}
  where $X <= neo-impressionist and z = “oil on canvas”
Data filtering with schema information
Schema filtering on class hierarchies
Querying Portal Schemas with RQL
Pure schema queries
Find the properties which specialize the property creates and may have as domain the class Painter, along with their corresponding range classes
  select @P, $Y
  from {:Painter}@P{:$Y}
  where @P <= creates
  {{ [creates, Artifact], [creates, Painting], [creates, Sculpture], [paints, Painting] }}
Schema filtering on property hierarchies
All properties defined or inherited in class Painter
RQL: Examples
[Figure: example schema graph. Classes ns1#Artist, ns1#Painter, ns1#Sculptor, ns1#Impressionist, ns1#PostImpressionist, ns1#Artifact, ns1#Painting, ns1#Sculpture, ns1#Style and odp#ExtPage, plus String and Date; properties ns1#creates, ns1#paints, ns1#sculpts, ns1#has_style, ns1#has_material, ns1#fname, ns1#lname and dc#last_modified]
Similar functionality to DBMS schema QLs (SchemaSQL, XSQL)
Useful for large schemas (integrating ontologies and thesauri)
Putting it all Together
Nested schema and data queries
Find the resources modified after 2000/01/01 which can be reached by a property applied to the class Painting and its subclasses
  select R, y
  from (select @P from {:$X}@P where $X <= Painting){R}.{y}last_modified{z}
  where z >= 2000/01/01
  {{ [exhibited, museoreinasofia.mcu.es] }}
  (R ranges over the labels of type property)
Subcommunities may use different schemas while sharing the same description base
RQL:Examples
[Figure: portal schema fragment (Painting, exhibited, Museum, technique, String, last_modified) and the portal resource descriptions &r2, &r3 and &r4, with technique “oil on canvas”, an exhibited link, and last_modified values 2000/06/09 and 2000/01/02]
Putting it all Together
Schema and data queries
Find all metadata about the resources of the site museoreinasofia.mcu.es
  select x, $$Y, $P, z, $$W
  from {x:$$Y}$P{z:$$W}
  where x like “*museoreinasofia.mcu.es*” or z like “*museoreinasofia.mcu.es*”
  {{ [www.portal.gr/picasso132, Painter, paints, museoreinasofia.mcu.es/guernica.gif, Painting],
     [museoreinasofia.mcu.es/guernica.gif, Painting, exhibited, museoreinasofia.mcu.es, Museum],
     [museoreinasofia.mcu.es/guernica.gif, Painting, technique, “oil on canvas”, string],
     [museoreinasofia.mcu.es, ExtResource, title, “Reina Sophia Museum”, string],
     [museoreinasofia.mcu.es, ExtResource, last_modified, 2000/06/09, date], …. }}
  (URLs’ pattern matching)
Subcommunities may use both different schemas and description bases
RQL Query Processing
select y
from {x}creates{y:Painting}.has_material{z}
where z = “oil on canvas”
select y
from creates A, has_material B, D $C
define x = A.source, y = A.target, w = B.source, z = B.target,
R = range(creates), D = subclassOf(R), E = ^($C)
where z = “oil on canvas” and y = w and $C = Painting and y in E
RQL Query Optimization
[Figure: three successive algebraic plans for the query.
Plan 1: Project y; SemiJoin on y in ^($C); Join on y = w of creates[x,y] with Select z = “oil on canvas” over has_material[w,z]; Select $C = Painting over subclassOf(range(creates))[$C]
Plan 2: Project y; Join on y = w of Select y in ^Painting over creates[x,y] with Select z = “oil on canvas” over has_material[w,z]
Plan 3: Project y; Join on y = w of SemiJoin on y = p of creates[x,y] and Painting[p] with Select z = “oil on canvas” over has_material[w,z]]
select X.target
from creates* X, has_material* Y, Painting P
where X.target = Y.source and X.target = P.uri and Y.target = 'oil on canvas'
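The final SQL form can be tried out on a toy database with Python's built-in `sqlite3` module. This sketch rests on several assumptions: the table layouts are simplified stand-ins for the RSSDB instance tables, the rows (including the 'bronze' value) are invented for illustration, and the `creates*` and `has_material*` notation of the slide is replaced by plain tables that are taken to already contain the subproperty extents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE creates (source TEXT, target TEXT);
    CREATE TABLE has_material (source TEXT, target TEXT);
    CREATE TABLE painting (uri TEXT);

    INSERT INTO creates VALUES
      ('www.portal.gr/picasso132', 'museoreinasofia.mcu.es/guernica.gif'),
      ('www.portal.gr/picasso132', 'www.artchive.com/woman.jpg'),
      ('www.portal.gr/rodin424', 'www.rodin.fr/thinker.gif');
    INSERT INTO has_material VALUES
      ('museoreinasofia.mcu.es/guernica.gif', 'oil on canvas'),
      ('www.rodin.fr/thinker.gif', 'bronze');
    INSERT INTO painting VALUES
      ('museoreinasofia.mcu.es/guernica.gif'),
      ('www.artchive.com/woman.jpg');
    """
)

# Join creates with has_material on the created artifact, keep only paintings.
rows = conn.execute(
    """
    SELECT X.target
    FROM creates X, has_material Y, painting P
    WHERE X.target = Y.source
      AND X.target = P.uri
      AND Y.target = 'oil on canvas'
    """
).fetchall()
paintings = [row[0] for row in rows]
conn.close()
```

The join with the `painting` table plays the role of the semijoin in the last plan: it filters created artifacts by class membership without contributing columns to the result.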
The RQL Query Interpreter
[Figure: interpreter pipeline driven by Main, in numbered steps (1) to (6). The query string goes to syntax analysis (syntactical analysis with lex/yacc, CNF transformation), which yields a syntax tree under CNF. Graph construction (evaluation of dependencies, factorization functions) yields the query graph. Typing (type inference: checks type compatibility, sets appropriate evaluation functions) annotates the query graph. The evaluator (defines evaluation functions, query processing) runs it against the DBMS through the DBMS RDF query APIs and returns the query result.]
RDFSuite Summary
RDFSuite addresses the needs of effective RDF metadata management by providing tools for validation, storage and querying
  validation follows a formal data model and constraints enforcing consistency of RDF schemas
  incremental loading of voluminous description bases in a persistent store
  declarative query language for schema and data querying
Ongoing efforts:
  RQL query optimization
  transactional aspects
  alternative encoding and representation schemes for access optimization
Acknowledgements
Funding was generously provided by the projects:
  C-WEB (IST-1999-13479): “A Generic Platform Supporting Community Webs”
  MESMUSES (IST-2000-26074): “Metaphor for Science Museums”