Graph and RDF Databases
Context : Course of Advanced Databases
Prepared by : Nassim BAHRI Nabila HOSNI
February 19th, 2015
Table of contents
I. Introduction : Overview of Big Data & NoSQL
II. Graph Databases
III. RDF Databases
IV. Application example
V. Scientific article
VI. Conclusion and Q&A
Introduction
• Not Only SQL
• Storing data in memory
• Distributed databases
Introduction : Data Model
• Key-Value (Voldemort, Riak)
• Big Table / Column Family (HBase, Cassandra, Hypertable)
• Document Databases (MongoDB)
• Graph Databases (Neo4j)
Introduction : Data Model
[Figure: data size plotted against data complexity for Key-Value Stores, Column Family, Document Databases and Graph Databases, with a marker labeled "90% of use cases".]
Graph databases are the model we are interested in here.
Source : Neo Technology webinar
Graph Databases
What is a graph database?
A graph database is a database whose specific purpose is to store graph-oriented data structures.
It can be seen as an object-oriented database based on graph theory.
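Graph Databases
Representation
• Nodes
• Relationships between nodes
• Properties on both
[Figure: three numbered nodes (1, 2, 3) with their properties: (Name : John, Age : 43), (Name : Google) and (Type : Ford, Color : blue). The relationship "Work in" carries the property Since : 2013.]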
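The representation above can be sketched in a few lines of Python. This is only an illustration under assumptions: the classes `Node` and `Relationship` are hypothetical names, not the API of any graph database, and the node ids are assumed since the figure does not say which number belongs to which node. The point is that both nodes and relationships carry their own properties.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node
    end: Node
    rel_type: str
    properties: dict = field(default_factory=dict)


# Nodes from the slide; the ids 1 and 2 are assumed for illustration.
john = Node(1, {"name": "John", "age": 43})
google = Node(2, {"name": "Google"})

# The relationship itself also holds a property ("since").
works_in = Relationship(john, google, "WORK_IN", {"since": 2013})

summary = (
    f"{works_in.start.properties['name']} -[{works_in.rel_type}]-> "
    f"{works_in.end.properties['name']} (since {works_in.properties['since']})"
)
print(summary)
```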
Graph Databases
The power of Graph Databases
• Performance
• Flexibility
• Agility
Graph VS Relational Databases
Relational Database Modeling

Person
ID   Name
1    Larry Page
2    Sergey Brin
3    Larry Ellison
N    …

Company
ID   Name
1    Google
2    Oracle
…    …
N    …

WorksIn
PersonID   CompanyID   Since
1          1           1998
2          1           2001
3          2           2010

Google's employees?

SELECT Person.Name
FROM Person, Company, WorksIn
WHERE Company.Name = 'Google'
  AND WorksIn.CompanyID = Company.ID
  AND WorksIn.PersonID = Person.ID;

Answering this question requires three lookups (Person, Company, WorksIn).
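To try the relational version of the question, here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. The column types and the `ORDER BY` clause are assumptions added to make the example self-contained and its output deterministic; the tables and the query otherwise follow the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person  (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Company (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE WorksIn (PersonID INTEGER, CompanyID INTEGER, Since INTEGER);
INSERT INTO Person  VALUES (1, 'Larry Page'), (2, 'Sergey Brin'), (3, 'Larry Ellison');
INSERT INTO Company VALUES (1, 'Google'), (2, 'Oracle');
INSERT INTO WorksIn VALUES (1, 1, 1998), (2, 1, 2001), (3, 2, 2010);
""")

# Same three-table join as on the slide; ORDER BY is added for a stable result.
rows = conn.execute("""
SELECT Person.Name
FROM Person, Company, WorksIn
WHERE Company.Name = 'Google'
  AND WorksIn.CompanyID = Company.ID
  AND WorksIn.PersonID = Person.ID
ORDER BY Person.ID
""").fetchall()

google_employees = [name for (name,) in rows]
print(google_employees)
conn.close()
```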
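Graph VS Relational Databases
Graph Database Modeling
[Figure: the same data modeled as a graph.
Nodes : Person 1 (Name : Larry Page), Person 2 (Name : Sergey Brin), Person 3 (Name : Larry Ellison), Company 1 (Name : Google), Company 2 (Name : Oracle).
Relationships : WorksIn edges from persons to companies, carrying the properties Since : 1998, Since : 2001 and Since : 2010.
Only one lookup is needed.]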
Graph Databases
Graph storage and graph processing
1. The underlying storage
• Some databases use native graph storage,
• Others store the graph in a relational database, an object-oriented database,…
2. The processing engine
• Nodes are physically connected to each other in the database,
• This is called index-free adjacency : a traversal follows direct references instead of index lookups.
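The idea of index-free adjacency can be illustrated with a small Python sketch. It is a toy model under assumptions, not how any particular engine lays out its storage: the hypothetical `Node` class keeps direct references to its neighbours, so a traversal never consults a global index.

```python
class Node:
    def __init__(self, name):
        self.name = name
        # Each node holds direct references to its neighbours:
        # no global index is consulted during a traversal.
        self.relationships = []

    def connect(self, rel_type, other, **properties):
        self.relationships.append((rel_type, other, properties))

    def neighbours(self, rel_type):
        return [other for (kind, other, _) in self.relationships if kind == rel_type]


larry = Node("Larry Page")
sergey = Node("Sergey Brin")
google = Node("Google")

larry.connect("WORKS_IN", google, since=1998)
sergey.connect("WORKS_IN", google, since=2001)

# Finding Larry's employer only touches Larry's own relationship list.
employers = [company.name for company in larry.neighbours("WORKS_IN")]
print(employers)
```

The cost of the traversal depends on the number of relationships attached to the starting node, not on the total size of the graph.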
Graph Databases
Graph Database Management System
[Figure omitted] Source : [1]
Graph Databases : Example
Visual Modeling
[Figure: two Person nodes (Name : John, Age : 27 and Name : Sally, Age : 32) and one Book node (Title : Graph Databases, Authors : Ian Robinson, Jim Webber).
FRIEND_OF relationships link John and Sally (Since : 01/09/2013).
HAS_READ relationships link each person to the book (On : 02/09/2013, Rating : 4 and On : 02/03/2013, Rating : 5).]
Graph Databases : Example
Create a simple dataset

// Create Sally
CREATE (sally:Person { name: 'Sally', age: 32 })
// Create John
CREATE (john:Person { name: 'John', age: 27 })
// Create Graph Databases book
CREATE (gdb:Book { title: 'Graph Databases', authors: ['Ian Robinson', 'Jim Webber'] })
// Connect Sally and John as friends
CREATE (sally)-[:FRIEND_OF { since: 1357718400 }]->(john)
// Connect Sally to Graph Databases book
CREATE (sally)-[:HAS_READ { rating: 4, on: 1360396800 }]->(gdb)
// Connect John to Graph Databases book
CREATE (john)-[:HAS_READ { rating: 5, on: 1359878400 }]->(gdb)
Graph Databases : Example
Simple selection from node:
Query 1 : How old is Sally?

MATCH (sally:Person { name: 'Sally' })
RETURN sally.age as sally_age
Graph Databases : Example
Simple selection from node:
Query 2 : Who are the authors of Graph Databases?

MATCH (gdb:Book { title: 'Graph Databases' })
RETURN gdb.authors as authors
Graph Databases : Example
Selection using relationship:
Query 3 : Who are Sally's friends?

MATCH (sally:Person { name: 'Sally' })
MATCH (sally)-[r:FRIEND_OF]-(person)
RETURN person.name as sally_friend
Graph Databases : Example
Selection using relationship and group function:
Query 4 : What is the average rating of Graph Databases?

MATCH (gdb:Book { title: 'Graph Databases' })
MATCH (gdb)<-[r:HAS_READ]-()
RETURN avg(r.rating) as average_rating
Graph Databases : Example
Using order and limit in query:
Query 5 : Who read Graph Databases first, Sally or John?

MATCH (people:Person)
WHERE people.name = 'John' OR people.name = 'Sally'
MATCH (people)-[r:HAS_READ]->(gdb:Book { title: 'Graph Databases' })
RETURN people.name as first_reader
ORDER BY r.on
LIMIT 1
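To check what Queries 4 and 5 should return on the dataset created earlier, the HAS_READ relationships can be replayed in plain Python. This is a sketch under assumptions: the list of dictionaries is an illustrative stand-in for the graph, not Neo4j output, and the `on` values are read as Unix timestamps in seconds.

```python
from datetime import datetime, timezone

# HAS_READ relationships, copied from the CREATE statements of the dataset.
has_read = [
    {"reader": "Sally", "book": "Graph Databases", "rating": 4, "on": 1360396800},
    {"reader": "John", "book": "Graph Databases", "rating": 5, "on": 1359878400},
]

# Query 4: average of the rating property over all HAS_READ relationships.
ratings = [r["rating"] for r in has_read if r["book"] == "Graph Databases"]
average_rating = sum(ratings) / len(ratings)

# Query 5: order by the "on" property and keep the first row.
first = min(has_read, key=lambda r: r["on"])
first_reader = first["reader"]
first_date = datetime.fromtimestamp(first["on"], tz=timezone.utc).date().isoformat()

print(average_rating, first_reader, first_date)
```

With this data the average rating is 4.5 and John is the first reader (3 February 2013, UTC).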
Graph Databases : Example
Visual Modeling
[Figure: three Person nodes (Name : John, Age : 27; Name : Sally, Age : 32; Name : Alain, Age : 19) linked by FRIEND_OF relationships with the properties Since : 01/09/2013 (shown twice) and Since : 01/11/2014.]
Graph Databases : Example
Completing our schema

// Create Alain
CREATE (alain:Person { name: 'Alain', age: 19 })

// Connect Sally and Alain as friends
MATCH (alain:Person { name: 'Alain' })
MATCH (sally:Person { name: 'Sally' })
CREATE (sally)-[:FRIEND_OF { since: 1358818400 }]->(alain)

[Figure: resulting graph with the nodes Alain, Sally, John and the GDB book.]
Graph Databases : Example
Node / relationship navigation:
Query 6 : Which friend do Alain and John have in common?

MATCH (alain:Person { name: 'Alain' })
MATCH (john:Person { name: 'John' })
MATCH (alain)-[:FRIEND_OF]-(person)-[:FRIEND_OF]-(john)
RETURN person.name as friend
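Query 6 amounts to intersecting two neighbourhoods. A minimal Python sketch follows, assuming the FRIEND_OF relationships are exactly the two created in the slides and that direction is ignored, as in the undirected pattern of the query. The `friends` helper is a hypothetical name used only for illustration.

```python
# FRIEND_OF relationships created in the slides (direction as in the CREATE statements).
friend_of = [("Sally", "John"), ("Sally", "Alain")]


def friends(person):
    """Return everyone linked to `person` by FRIEND_OF, ignoring direction."""
    result = set()
    for start, end in friend_of:
        if start == person:
            result.add(end)
        elif end == person:
            result.add(start)
    return result


# A shared friend is a node adjacent to both Alain and John.
common_friends = friends("Alain") & friends("John")
print(common_friends)
```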
Graph Databases : Example
Update node’s properties:
Query 7 : Change Alain's name to Larry

MATCH (n { name: 'Alain' })
SET n.name = 'Larry'

Query 8 : Remove property

MATCH (n { name: 'Larry' })
SET n.name = NULL

Query 9 : Add property

MATCH (n { name: 'John' })
SET n += { hungry: TRUE , position: 'Entrepreneur' }
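The effect of the three SET clauses can be mirrored with Python dictionaries. This is only an analogy under assumptions (a node's properties modeled as a plain `dict`): assigning a value overwrites one property, setting a property to NULL removes it, and `+=` adds or overwrites the listed properties while leaving the others untouched.

```python
alain = {"name": "Alain", "age": 19}
john = {"name": "John", "age": 27}

# Query 7: SET n.name = 'Larry' overwrites a single property.
alain["name"] = "Larry"

# Query 8: SET n.name = NULL removes the property from the node.
alain.pop("name", None)

# Query 9: SET n += {...} merges the map into the existing properties.
john.update({"hungry": True, "position": "Entrepreneur"})

print(alain)
print(john)
```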
RDF Databases
The principle of the web
[Figure: exchange of an HTTP Request and an HTTP Response.]
• URL : http://website.com
• Communication protocol : HTTP
• Representation language : HTML
RDF Databases
Changing status
• URL : Uniform Resource Locator (http://website.com)
• URI : Uniform Resource Identifier (http://animals.com#lion)
• IRI : Internationalized Resource Identifier (http://الحيوانات.tn#lion)
RDF Databases
W3C Standards
• Identification
• Representation
• Query
• Reasoning
RDF Databases
RDF means Resource Description Framework
• Resource : pages, persons, animals, ideas,…
• Description : attributes, characteristics, relationships,…
• Framework : model, language and syntax to build descriptions
RDF Databases
Data model & syntax
Description : (Subject, Predicate, Object)
Example : “doc.html is created by John and belongs to the music theme”
• Doc.html is created by John
• Doc.html belongs to music theme
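RDF Databases
Data model & syntax
(Subject, Predicate, Object) corresponds to (Vertex, Edge, Vertex)
[Figure: the vertex Doc.html is linked to the vertex John by the edge Author and to the vertex Music by the edge Theme.]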
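The correspondence between triples and graph edges can be made concrete with a short Python sketch. The tuple representation and the `graph` dictionary are assumptions chosen for illustration; they are not an RDF library API.

```python
from collections import defaultdict

# The two statements about Doc.html, written as (subject, predicate, object).
triples = [
    ("Doc.html", "Author", "John"),
    ("Doc.html", "Theme", "Music"),
]

# Each triple becomes one labeled, directed edge:
# subject --predicate--> object.
graph = defaultdict(list)
for subject, predicate, obj in triples:
    graph[subject].append((predicate, obj))

edges = graph["Doc.html"]
print(edges)
```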
RDF Databases
Labeled graph with URI and literals
[Figure: http://www.website.com/doc.html is linked to http://www.website.com/john#me by the edge http://www.website.com/schema#author and to the literal "Music" by the edge http://www.website.com/schema#theme.]
RDF Databases
RDF Syntaxes
XML, Turtle, TriG, JSON-LD,…
Turtle syntax

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix site: <http://www.website.com/schema#> .

<http://www.website.com/doc.html>
    site:author <http://www.website.com/john#me> ;
    site:theme "Music" .
RDF Databases
SPARQL Protocol And RDF Query Language
• Syntax similar to SQL

SELECT data
FROM data source
WHERE { conditions }
RDF Databases
SPARQL Protocol And RDF Query Language

Get all persons :
?x rdf:type ex:Person

Get the full graph database :
SELECT ?subject ?property ?value
WHERE { ?subject ?property ?value }

Get all persons who have a name :
SELECT ?x
WHERE {
  ?x rdf:type ex:Person .
  ?x :name ?name .
}
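To see how the triple patterns above select data, here is a toy pattern matcher in Python. It is a sketch under assumptions, not how a SPARQL engine is implemented: terms are plain strings, variables start with `?`, the sample triples are invented for illustration, and the helper names `match_pattern` and `query` are hypothetical.

```python
def is_var(term):
    return term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`; return None on failure."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # A variable must keep the value it was bound to earlier.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new


def query(patterns, triples):
    """Join the patterns one after another, keeping compatible bindings."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for binding in solutions
            for triple in triples
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return solutions


# Invented sample data.
triples = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:alice", ":name", "Alice"),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:doc", "rdf:type", "ex:Document"),
]

all_persons = query([("?x", "rdf:type", "ex:Person")], triples)
named_persons = query(
    [("?x", "rdf:type", "ex:Person"), ("?x", ":name", "?name")], triples
)
print(all_persons)
print(named_persons)
```

In this sample, both `ex:alice` and `ex:bob` are persons, but only `ex:alice` satisfies the second query because `ex:bob` has no name triple.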
RDF Databases
SPARQL Protocol And RDF Query Language
Declaring prefixes

PREFIX esen: <http://esen.tn#>
SELECT ?student
WHERE {
  ?student esen:registeredAt ?x .
}
RDF Databases
SPARQL Protocol And RDF Query Language
Optional pattern

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person ?name
WHERE {
  ?person foaf:homepage <http://john.info> .
  OPTIONAL { ?person foaf:name ?name . }
}

If the person has no foaf:name, the person is still returned and ?name is left unbound.
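The behaviour of OPTIONAL resembles a left outer join: the mandatory pattern decides which rows exist, and the optional pattern only adds bindings when it can. A minimal Python sketch, assuming invented sample triples and a hypothetical helper `person_and_name`, with `None` standing for an unbound variable:

```python
# Invented sample data: John has a homepage but no foaf:name.
triples = [
    ("ex:john", "foaf:homepage", "http://john.info"),
    ("ex:mary", "foaf:homepage", "http://mary.info"),
    ("ex:mary", "foaf:name", "Mary"),
]


def person_and_name(homepage):
    """Mimic: ?person foaf:homepage <homepage> . OPTIONAL { ?person foaf:name ?name }"""
    rows = []
    for person, predicate, value in triples:
        if predicate != "foaf:homepage" or value != homepage:
            continue
        names = [o for (s, p, o) in triples if s == person and p == "foaf:name"]
        if names:
            rows.extend((person, name) for name in names)
        else:
            rows.append((person, None))  # ?name stays unbound
    return rows


john_rows = person_and_name("http://john.info")
mary_rows = person_and_name("http://mary.info")
print(john_rows)
print(mary_rows)
```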
RDF Databases
SPARQL Protocol And RDF Query Language
Union

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name
WHERE {
  ?person foaf:name ?name .
  { ?person foaf:homepage <http://john.info> . }
  UNION
  { ?person foaf:homepage <http://paul.info> . }
}
RDF Databases
SPARQL Protocol And RDF Query Language
Minus

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ex: <http://website.com#>
SELECT ?person
WHERE {
  { ?person rdf:type ?type }
  MINUS
  { ?person rdf:type ex:student }
}
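For these two queries, UNION and MINUS can be read as set union and set difference over the matching resources. The following Python sketch uses invented sample data and plain dictionaries, so it only illustrates the intent of the queries, not the full SPARQL semantics.

```python
# Invented sample data.
homepage = {
    "ex:john": "http://john.info",
    "ex:paul": "http://paul.info",
    "ex:anna": "http://anna.info",
}
names = {"ex:john": "John", "ex:paul": "Paul", "ex:anna": "Anna"}
types = {
    "ex:john": {"ex:person"},
    "ex:paul": {"ex:person", "ex:student"},
    "ex:anna": {"ex:person"},
}

# UNION: persons whose homepage is john.info OR paul.info.
john_info = {p for p, h in homepage.items() if h == "http://john.info"}
paul_info = {p for p, h in homepage.items() if h == "http://paul.info"}
union_names = sorted(names[p] for p in john_info | paul_info)

# MINUS: every typed resource, except those typed ex:student.
typed = {p for p, ts in types.items() if ts}
students = {p for p, ts in types.items() if "ex:student" in ts}
non_students = sorted(typed - students)

print(union_names)
print(non_students)
```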
RDF Databases
Use case : Google rich snippets

<div xmlns:v="http://rdf.data-vocabulary.org/#" typeof="v:Person">
  My name is <span property="v:name">Pierre Dumoulin</span>.
  My personal homepage: <a href="http://www.example.com" rel="v:url">www.homepage.com</a>
  I live at
  <span rel="v:address" typeof="v:address">
    <span property="v:street-address">12 street name</span>
    <span property="v:locality">city name</span>,
    <span property="v:region">XY</span>
    <span property="v:postal-code">12345</span>.
  </span>
</div>
Application example (RDF)
Data storage

# Default graph (stored at http://example.org/foaf/aliceFoaf)
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
_:a foaf:name "Alice" .
_:b foaf:mbox <mailto:bob@work.example> .
_:a foaf:mbox <mailto:alice@work.example> .

Query

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name
FROM <http://example.org/foaf/aliceFoaf>
WHERE { ?x foaf:name ?name }

Result

Name
Alice
Application example (Neo4J)
Question : Who is older, Sally or John?
[Figure: three Person nodes (Name : John, Age : 27; Name : Sally, Age : 32; Name : Alain, Age : 19) linked by FRIEND_OF relationships with the properties Since : 01/09/2013 (shown twice) and Since : 01/11/2014.]
Application example (Neo4J)
Who is older, Sally or John?

MATCH (people:Person)
WHERE people.name = 'John' OR people.name = 'Sally'
RETURN people.name as oldest
ORDER BY people.age DESC
LIMIT 1
Scientific article
Title : Querying RDF Data from a Graph Database Perspective
Book title : The Semantic Web: Research and Applications
Pages : 346-360
Online ISBN : 978-3-540-31547-6
Series Volume : 3532
Publisher : Springer Berlin Heidelberg
Copyright : 2005
Authors : Renzo Angles, Claudio Gutierrez
Scientific article

MODEL        LEVEL              DATA COMPLEXITY   CONNECTIVITY   TYPE OF DATA
Network      physical           simple            high           homogeneous
Relational   logical            simple            low            homogeneous
Semantic     user               simple/medium     high           homogeneous
Object-O     logical/physical   complex           medium         heterogeneous
XML          logical            medium            medium         heterogeneous
RDF          logical            medium            high           heterogeneous

Table 1 : Summary of comparison among different database models
Scientific article

PROPERTY                     G     G+   GraphLog   Gram   GraphDB   Lorel   F-G
Adjacent nodes               +/-   √    √          √      +/-       √       +/-
Adjacent edges               +/-   √    √          √      +/-       √       +/-
Degree of a node             x     √    √          x      ?         x       x
Path                         √     √    √          √      √         √       √
Fixed-length Path            √     √    √          √      √         √       √
Distance between two nodes   x     √    √          x      ?         x       x
Diameter                     x     √    √          x      ?         x       x

Table 2 : Support of some graph database query languages for the example graph properties
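Three of the graph properties compared in the table (degree of a node, distance between two nodes, diameter) can be computed directly on a small graph. The Python sketch below is an illustration under assumptions: it reuses the friendship graph of the Neo4J example as an undirected adjacency dictionary, and the function names are hypothetical.

```python
from collections import deque

# Undirected FRIEND_OF graph from the example slides.
graph = {
    "Sally": {"John", "Alain"},
    "John": {"Sally"},
    "Alain": {"Sally"},
}


def degree(node):
    """Number of relationships attached to a node."""
    return len(graph[node])


def distance(start, goal):
    """Length of the shortest path between two nodes (breadth-first search)."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None  # no path between the two nodes


def diameter():
    """Longest shortest-path distance over all pairs (graph assumed connected)."""
    return max(distance(a, b) for a in graph for b in graph)


print(degree("Sally"), distance("John", "Alain"), diameter())
```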
Scientific article

PROPERTY                     RQL   SeRQL   RDQL   Triple   N3    Versa   RxPath
Adjacent nodes               +/-   +/-     +/-    +/-      +/-   +/-     x
Adjacent edges               +/-   +/-     +/-    +/-      x     x       x
Degree of a node             +/-   x       x      x        x     x       x
Path                         x     x       x      x        x     x       +/-
Fixed-length Path            +/-   +/-     +/-    +/-      +/-   x       +/-
Distance between two nodes   x     x       x      x        x     x       x
Diameter                     x     x       x      x        x     x       x

Table 3 : Support of some current RDF query languages for some example graph properties
Conclusion
• Use a graph database to store data that has a graph form or a hierarchical tree structure.
• Graph database : Performance, Agility, Flexibility
• Typical application : finding the shortest path between nodes
Thanks for your attention
Bibliography
[1] Ian Robinson, Jim Webber, and Emil Eifrem. «Graph Databases». O’Reilly, 2013.
[2] Serge Miranda, Fabien Gandon. «Des Bases de Données à Big Data». Course at Nice Sophia university, MOOC, 2015.
[3] Michel Domenjoud. «Bases de données graphes : un tour d’horizon». Available on <http://blog.octo.com/bases-de-donnees-graphes-un-tour-dhorizon> (consulted 18/02/2015).
[4] Neo4J community. «Cypher Query Language». Available on <http://neo4j.com/developer/data-modeling/> (consulted 18/02/2015).
[5] Frank Manola, Eric Miller, Brian McBride. «RDF 1.1 Primer». Available on <http://www.w3.org/TR/2014/NOTE-rdf11-primer-20140225/> (consulted 18/02/2015).
[6] Eric Prud'hommeaux, Andy Seaborne. «SPARQL Query Language for RDF». Available on <http://www.w3.org/TR/rdf-sparql-query/> (consulted 18/02/2015).
[7] Neo4J community. «Introduction to graph databases webinar». Available on <http://www.neo4j.org/learn/videos_webinar> (consulted 18/02/2015).