RAL: an RDF Algebra

Flavius Frasincar, Geert-Jan Houben, Richard Vdovjak, Peter Barna

Department of Mathematics and Computer Science, TU/e Technische Universiteit Eindhoven

WISE 2002, December 12, 2002




Contents

1. Introduction

2. RAL Goals

3. RAL Data Model

4. RAL Operators

5. Conclusion


1. Introduction

• Metadata is machine understandable information about web resources or other things [Source: Tim Berners-Lee, “Metadata Architecture”]

• RDF (Resource Description Framework) is the metadata language for the Web

• RDF extends the syntactic interoperability of XML to semantic interoperability, which makes it the foundation for the Semantic Web


Semantic Web Architecture “Layer Cake”

[Source: Tim Berners-Lee, Director W3C, keynote speech at XML2000, “RDF and the Semantic Web” (Washington DC, 6 Dec. 2000)]


Hera

• Hera research project: Web Information Systems (WIS) and web (hypermedia) generation in WIS

• WIS use RDF to represent and query application data for:

– Semantic integration of data coming from heterogeneous sources

– Semantic information presentation

– Semantic querying

• Huge quantities of data and metadata need to be processed in real-time: optimization is crucial


Hera Methodology/Suite

[Figure: Hera architecture diagram. Layers: Semantic Layer, Application Layer, Presentation Layer. Design phases: Conceptual Design, Integration Design, Application Design, Adaptation Design, Presentation Design. Models: Conceptual Model, Integration Model, Application Model, User Model, User/Platform Profile, Presentation Templates (XSLT). Engines: Integration Engine, Application Engine, Adaptation Engine, Presentation Engine, Cuypers Engine. Actors: (Search) Agent, End User. Data flows: info request, (meta)data, (slice) presentation, RQL / RDF XML, HTML/WML/SMIL.]


RDF Representations

Primitive semantics: Subject Predicate Object

Three alternative notations:

• Triple: (http://example.com/sb.jpg, painted_by, “Rembrandt”)

• RDF/XML:
  <rdf:Description rdf:about="http://example.com/sb.jpg">
    <painted_by>Rembrandt</painted_by>
  </rdf:Description>

• Graph: http://example.com/sb.jpg --painted_by--> Rembrandt
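The triple and graph notations can be mirrored in a few lines of Python. This is only an illustrative sketch: the slides do not use Python, and the names `triple` and `to_graph` are assumptions introduced here.

```python
from collections import defaultdict

# The statement from the slide as a (subject, predicate, object) tuple.
triple = ("http://example.com/sb.jpg", "painted_by", "Rembrandt")


def to_graph(triples):
    """Group triples into subject -> list of (edge label, object) pairs."""
    graph = defaultdict(list)
    for s, p, o in triples:
        graph[s].append((p, o))
    return dict(graph)


graph = to_graph([triple])
print(graph)
```

Each key of `graph` is a node, and each `(label, object)` pair is one outgoing labeled edge.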


RDF Query Languages

• Triple-based:

– Triple [successor of SiLRI] (Horn logic)

– Metalog (Datalog)

• XML-based:

– RDF Query

– RQuery (XQuery)

• Graph-based (but not graphical):

– RQL (OQL)


2. RAL Goals

• Support the formal specification of RDF query languages

• Provide a reference framework to compare different RDF query languages

• Consider the result construction phase

– presently neglected by RDF query languages which focus only on extraction

• Enable algebraic query optimization


RAL

• RAL Data Model: specify what information is accessible (for RAL operators) in an RDF graph

– Nodes: Resources and Literals

– Edges: Properties

• RAL Operators: define operators working on collections of nodes from the RAL Data Model

– Extraction Operators

– Loop Operators

– Construction Operators


3. RAL Data Model

• R is the set of resources, $R = U \cup B$

• U is the set of URI references, $\text{rdf:Property} \in U$

• B is the set of blank nodes

• L is the set of literals; U, B, L are disjoint

• P is the set of properties, $P \subseteq R$, $\text{rdf:type} \in P$

[Figure: set diagram relating R, L, U, B, and P, with rdf:type and rdf:Property marked.]


• An RDF model M is a finite set of triples (statements)

$M \subseteq R \times U \times (R \cup L)$

• The set of properties of an RDF model M

$P_M = \{\, p \mid (s, p, o) \in M \lor (p, \text{rdf:type}, \text{rdf:Property}) \in M \,\}$

• The RDF graph model is similar to a directed labeled graph (DLG)

– It is not a DLG since it allows for multiple edges between two nodes

– It is not a general multigraph because different edges between two nodes cannot share the same label


• The RDF graph model corresponding to an RDF model M is defined by

$G_M = (N, E, l_N, l_E)$, $l_N : N \to R \cup L$, $l_E : E \to P$

using the following construction mechanism:

for each $(s, p, o) \in M$

  add nodes $n_s$, $n_o$ to N (different only if $s \neq o$)

  assign $l_N(n_s) = s$, $l_N(n_o) = o$

  add $e_p$ to E as a directed edge between $n_s$ and $n_o$

  assign $l_E(e_p) = p$

Observations:

• $l_N(\cdot)$ is an injective partial function

• $l_E(\cdot)$ is a total function
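The construction mechanism can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `build_graph`, the generated node and edge identifiers, and the sample triples are all hypothetical, and blank nodes (which make the node labelling partial) are not modelled, so every node here carries a label.

```python
def build_graph(model):
    """Build (N, E, l_N, l_E) from a set of (s, p, o) triples."""
    node_of = {}  # label -> node id, so each label gets exactly one node
    l_N = {}      # node id -> label (the node labelling function)
    E = []        # edges as (edge id, source node, target node)
    l_E = {}      # edge id -> property label (the edge labelling function)

    def node(label):
        if label not in node_of:
            node_of[label] = f"n{len(node_of)}"
            l_N[node_of[label]] = label
        return node_of[label]

    for i, (s, p, o) in enumerate(sorted(model)):
        n_s, n_o = node(s), node(o)  # the same node when s == o
        e = f"e{i}"
        E.append((e, n_s, n_o))
        l_E[e] = p
    return set(l_N), E, l_N, l_E


# Hypothetical sample model: two differently labeled edges join r4 and r2.
M = {("r4", "paints", "r2"), ("r4", "paints", "r3"), ("r4", "creates", "r2")}
N, E, l_N, l_E = build_graph(M)
print(len(N), len(E))
```

Because nodes are looked up by label, no two nodes share a label (injectivity), and every edge receives a label (totality).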


Basic Properties

Nodes

Basic Property    Result for resources       Result for literals
id                $l_N(u)$, $u \in U$        $l_N(s)$, $s \in L$
type              Resource                   Literal

Edges

Basic Property    Result
name              $l_E(p)$, $p \in P$
subject           $r$, $r \in R$
object            $o$, $o \in R \cup L$

• Two non-blank nodes are equal if they have the same id

• Two blank nodes are equal if they have the same properties and the corresponding property values are equal


RDF(S)-Closure

• RDF Model Theory defines the RDF-closure and RDFS-closure of an RDF Model M by proposing a set of rules for generating new triples

• Extensional data: the original model M triples

• Intensional data: the new triples generated by the RDF(S)-closure

• RAL operators work on extensional+intensional data

• Variants of the operators can be defined to neglect the intensional data (similar to the RQL strict interpretation)
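The split between extensional and intensional data can be illustrated with a small fixpoint computation. The sketch below applies only two RDFS-style rules (transitivity of rdfs:subClassOf and inheritance of rdf:type along rdfs:subClassOf), not the full rule set of the RDF Model Theory; the function name `rdfs_closure` and the sample triples are assumptions made for illustration.

```python
SUBCLASS, TYPE = "rdfs:subClassOf", "rdf:type"


def rdfs_closure(model):
    """Apply two RDFS-style rules until no new triple appears (a fixpoint)."""
    closed = set(model)
    while True:
        new = set()
        for s, p, o in closed:
            for s2, p2, o2 in closed:
                if p == SUBCLASS and p2 == SUBCLASS and o == s2:
                    new.add((s, SUBCLASS, o2))  # subclass transitivity
                if p == TYPE and p2 == SUBCLASS and o == s2:
                    new.add((s, TYPE, o2))      # type inheritance
        if new <= closed:
            return closed
        closed |= new


# Hypothetical extensional data: the triples stated in the model itself.
extensional = {
    ("Painter", SUBCLASS, "Creator"),
    ("r4", TYPE, "Painter"),
}
closure = rdfs_closure(extensional)
intensional = closure - extensional  # only the inferred triples
print(intensional)
```

A strict variant of an operator would read from `extensional` only, while the default variant would read from `closure`.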


4. RAL Operators

• All operators have the following form

o[f](x1, x2, …, xn: expression)

where an expression is a collection of nodes and f is a function that takes a collection of nodes as input and returns a collection of nodes as output

• Extraction Operators: retrieve the needed information from an RDF graph

• Loop Operators: control the repetitive application of certain operators

• Construction Operators: build new RDF graphs from the extracted data


[Figure: example RDF graph used in the operator examples. Schema level: classes Technique, Artifact, Painter, Creator, Painting, Image, and Literal; properties tname, exemplified_by, exemplifies, creates, created_by, name, year, cname, image, paints, and painted_by. Instance level: resources r1, r2, r3, r4 and the image resources http://example.com/sb.jpg and http://example.com/sp.jpg; literals “Chiaroscuro”, “Rembrandt”, “Stone Bridge”, 1638, “Self Portrait”, and 1628. Legend: rdf:type, inferred rdf:type, rdfs:subClassOf, rdfs:subPropertyOf; schema and instance levels.]


4.1 Extraction Operators

Projection: $\pi$[re_name](e: expression)

computes the values of the properties whose name matches re_name, a regular expression over strings, for the nodes in the input collection e

Example: $\pi$[(P|p)aint[s]#](r4)

returns the resources painted by r4
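A projection over property names can be sketched in Python as follows. This is an illustration only: the triples are a hypothetical fragment of the example graph, the function name `project` is an assumption, and the pattern uses Python's `re` syntax rather than RAL's own regular-expression notation.

```python
import re

# Hypothetical fragment of the example graph as (subject, property, object).
TRIPLES = [
    ("r4", "paints", "r2"),
    ("r4", "paints", "r3"),
    ("r4", "cname", "Rembrandt"),
]


def project(re_name, nodes, triples=TRIPLES):
    """Values of all properties whose name fully matches re_name."""
    pattern = re.compile(re_name)
    return [o for s, p, o in triples
            if s in nodes and pattern.fullmatch(p)]


painted = project(r"(P|p)aints?", ["r4"])
print(painted)
```

The `cname` value is skipped because its property name does not match the pattern.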


Selection: $\sigma$[condition](e: expression)

selects the nodes of the input collection that fulfill the given condition

Example: $\sigma$[$\pi$[tname] = “Chiaroscuro”](c)

where c is the collection of input resources r1, r2, r3, and r4, returns the resources representing the painting technique with the name “Chiaroscuro”
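The selection example can be mimicked in Python with a predicate over nodes. This is a sketch under assumptions: which resource carries which property is a hypothetical reading of the example graph, and `values` and `select` are names introduced here.

```python
# Hypothetical fragment of the example graph as (subject, property, object).
TRIPLES = [
    ("r1", "tname", "Chiaroscuro"),
    ("r2", "name", "Stone Bridge"),
    ("r3", "name", "Self Portrait"),
    ("r4", "cname", "Rembrandt"),
]


def values(prop, node, triples=TRIPLES):
    """Projection of a single property for a single node."""
    return [o for s, p, o in triples if s == node and p == prop]


def select(condition, nodes):
    """Keep the nodes of the input collection that satisfy condition."""
    return [n for n in nodes if condition(n)]


c = ["r1", "r2", "r3", "r4"]
result = select(lambda n: "Chiaroscuro" in values("tname", n), c)
print(result)
```

The condition is an ordinary function of one node, so any projection-based test can be plugged in.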


Cartesian Product: (x: expression) $\times$ (y: expression)

for each element in the Cartesian product of the input collections, a blank node that has all properties of both originating nodes is added to the result

Example: $\sigma$[$\pi$[rdf:type] = Technique](c) $\times$ $\sigma$[$\pi$[rdf:type] = Painter](c)

returns a collection of blank nodes, each blank node having all the properties of the corresponding pair from the Cartesian product (the new nodes have both types Technique and Painter)
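One way to picture the blank-node construction is the following Python sketch. The property table, the blank-node naming scheme `_:bN`, and the function name `cartesian_product` are assumptions for illustration, not part of RAL.

```python
from itertools import count, product

# Hypothetical property table: node -> list of (property, value) pairs.
PROPS = {
    "r1": [("rdf:type", "Technique"), ("tname", "Chiaroscuro")],
    "r4": [("rdf:type", "Painter"), ("cname", "Rembrandt")],
}

_blank_ids = count(1)


def cartesian_product(xs, ys, props=PROPS):
    """One fresh blank node per pair, carrying both nodes' properties."""
    result = {}
    for x, y in product(xs, ys):
        blank = f"_:b{next(_blank_ids)}"
        result[blank] = props[x] + props[y]
    return result


blanks = cartesian_product(["r1"], ["r4"])
merged = next(iter(blanks.values()))
print(merged)
```

The single resulting blank node carries both rdf:type values, matching the remark that the new nodes have both types Technique and Painter.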


Join: (x: expression) $\bowtie$[condition] (y: expression) = $\sigma$[condition](x $\times$ y)

is a Cartesian product followed by a selection

Example: (x: $\sigma$[$\pi$[rdf:type] = Technique](c)) $\bowtie$[$\pi$[exemplified_by](x) = $\pi$[paints](y)] (y: $\sigma$[$\pi$[rdf:type] = Painter](c))

returns a collection of blank nodes, each blank node having all the properties of the corresponding pair from the Cartesian product that satisfies the given condition
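The join can be sketched in Python as a filtered Cartesian product. Everything named here is an assumption for illustration: the property table, the helper names, and in particular the extra painter `r5` and painting `r6`, which do not appear in the slides and are added only so that the condition rejects one pair.

```python
from itertools import product

# Hypothetical property table; r5 and r6 are invented for this example.
PROPS = {
    "r1": [("rdf:type", "Technique"),
           ("exemplified_by", "r2"), ("exemplified_by", "r3")],
    "r4": [("rdf:type", "Painter"), ("paints", "r2"), ("paints", "r3")],
    "r5": [("rdf:type", "Painter"), ("paints", "r6")],
}


def values(prop, node):
    """Projection of one property for one node, as a set of values."""
    return {v for p, v in PROPS[node] if p == prop}


def join(xs, ys, condition):
    """Cartesian product restricted to the pairs satisfying condition."""
    joined = []
    for x, y in product(xs, ys):
        if condition(x, y):
            # The third item lists the properties of the merged blank node.
            joined.append((x, y, PROPS[x] + PROPS[y]))
    return joined


pairs = join(["r1"], ["r4", "r5"],
             lambda x, y: values("exemplified_by", x) == values("paints", y))
print([(x, y) for x, y, _ in pairs])
```

Only the pair whose technique is exemplified by exactly the paintings of the painter survives the condition.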


Union, Difference, Intersection: (x: expression) $\theta$ (y: expression), where $\theta \in \{\cup, -, \cap\}$

defined as in set theory

Example: $\sigma$[$\pi$[rdf:type] = Technique](c) $\cup$ $\sigma$[$\pi$[rdf:type] = Painter](c)

returns the collection of resources obtained by combining the two collections (these two collections are obtained using two selections)


4.2 Loop Operators

Map: map[f](e: expression)

applies the function f to each element of the input collection; the function results are added to the output collection

Example: map[$\pi$[rdfs:subClassOf]](Painting, Painter)

computes the parent classes using the property rdfs:subClassOf for the collection consisting of Painting and Painter
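The map example can be sketched in Python as follows. The class hierarchy is a hypothetical reading of the example graph, and `ral_map` and `project_subclass_of` are names introduced for illustration.

```python
# Hypothetical class hierarchy: class -> direct parent classes.
SUBCLASS_OF = {"Painting": ["Artifact"], "Painter": ["Creator"]}


def project_subclass_of(node):
    """Projection of rdfs:subClassOf for a single node."""
    return SUBCLASS_OF.get(node, [])


def ral_map(f, collection):
    """Apply f to every element and gather all results in one collection."""
    out = []
    for element in collection:
        out.extend(f(element))
    return out


parents = ral_map(project_subclass_of, ["Painting", "Painter"])
print(parents)
```

The function is applied once per element, so the result contains direct parents only, not the whole chain of superclasses.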


Kleene Star: *[f](e: expression)

repeats the function f, possibly infinitely many times, starting with the given input collection; at each iteration the results of the function are added to the next function input

Example: *[$\pi$[rdfs:subClassOf]](Painting)

computes the transitive closure of the property rdfs:subClassOf starting from Painting, i.e. Painting and all its superclasses
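The repeated application can be sketched in Python as a fixpoint loop. This is an illustration under assumptions: the hierarchy (including the link from Artifact to rdfs:Resource) is hypothetical, the helper names are invented here, and the loop stops as soon as an iteration adds nothing new, which is guaranteed only because the sample graph is finite.

```python
# Hypothetical class hierarchy: class -> direct parent classes.
SUBCLASS_OF = {"Painting": ["Artifact"], "Artifact": ["rdfs:Resource"]}


def superclasses(collection):
    """Direct parents of every class in the collection."""
    return {parent for c in collection for parent in SUBCLASS_OF.get(c, [])}


def kleene_star(f, collection):
    """Re-apply f to its accumulated results until nothing new appears."""
    current = set(collection)
    while True:
        extended = current | f(current)
        if extended == current:
            return current
        current = extended


reachable = kleene_star(superclasses, ["Painting"])
print(sorted(reachable))
```

The starting class stays in the result, which matches the description "Painting and all its superclasses".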


4.3 Construction Operators

Create Node: node[type, id]()

adds a new node to the graph with the given type and id (id is missing for blank nodes) and returns this node; if a resource is created, an rdf:type edge is added between the resource and the node representing rdfs:Resource

The Create Node operator assigns a unique (in the resulting RDF graph) internal identifier to each created node


Example: node[Resource]() and node[Literal, “Caravagio”]()

create a Resource representing a blank node and a Literal representing the string “Caravagio”

[Figure: a blank resource node with an rdf:type edge to rdfs:Resource, and a literal node “Caravagio”.]


Create Edge: edge[name, subject](object: expression)

adds edges between the subject node and each of the nodes in the object collection, and returns the subject node; the label of the edges is given by name which is the id of a property resource

The Create Node and Create Edge operators abort if the “well-formed RDF(S) graph” conditions (e.g. rdf:type cannot refer to a literal, literals cannot have properties etc.) are not met after construction


Example: edge[name, node[Resource]()](node[Literal, “Caravagio”]())

creates an edge labeled with name between the nodes defined in the previous example

[Figure: the blank resource node, typed rdfs:Resource via rdf:type, linked by a name edge to the literal “Caravagio”.]
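The two construction operators can be sketched together as a small Python class. This is a hypothetical illustration, not RAL's implementation: the class `Graph`, its fields, and the integer identifiers are assumptions, only the two well-formedness conditions named on the slides are checked, and the check runs before an edge is added instead of after construction.

```python
from itertools import count


class Graph:
    def __init__(self):
        self.kind = {}   # internal id -> "Resource" or "Literal"
        self.label = {}  # internal id -> id or value (absent for blank nodes)
        self.edges = []  # (subject, name, object) over internal ids
        self._ids = count(1)
        self.rdfs_resource = self._add("Resource", "rdfs:Resource")

    def _add(self, kind, label=None):
        n = next(self._ids)  # unique internal identifier
        self.kind[n] = kind
        if label is not None:
            self.label[n] = label
        return n

    def node(self, kind, label=None):
        """Create Node: returns the new node's internal identifier."""
        n = self._add(kind, label)
        if kind == "Resource":
            self.edges.append((n, "rdf:type", self.rdfs_resource))
        return n

    def edge(self, name, subject, objects):
        """Create Edge: link subject to every object, return subject."""
        if self.kind[subject] == "Literal":
            raise ValueError("literals cannot have properties")
        for o in objects:
            if name == "rdf:type" and self.kind[o] == "Literal":
                raise ValueError("rdf:type cannot refer to a literal")
            self.edges.append((subject, name, o))
        return subject


g = Graph()
painter = g.edge("name", g.node("Resource"), [g.node("Literal", "Caravagio")])
print(g.edges)
```

Because `edge` returns the subject node, calls can be nested in the same way as the RAL expression in the example.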


5. Conclusion

• The RAL algebra is developed from a DB perspective and proposes a set of operators similar to their relational algebra counterparts:

– Extraction Operators: Projection, Selection, Cartesian Product, Join, Union, Difference, Intersection

• Similar to the existing semi-structured query languages, RAL considers powerful repetition operators:

– Loop Operators: Map, Kleene Star

• As opposed to present RDF query languages, RAL supports result construction:

– Construction Operators: Create Node, Create Edge


Future Work

• Analyze the expressive power of RAL compared to RQL, currently a popular RDF query language (build a translation scheme from RQL to RAL)

• Formally specify the semantics of other RDF query languages in terms of RAL

• Compare the expressive power of different RDF query languages using RAL as the reference language

• Explore equivalence rules for RAL expressions to be used in query optimization

• Develop an RDF query optimization algorithm based on RAL