Connections to the Real World: Graph Databases and Applications
Achim Friedland <[email protected]>, Aperis GmbH
1st University-Industrial Meeting on Graph Databases, 7.-8. Feb. 2011, Barcelona, Spain
Let’s change our point of view...
Welcome to the customer side... ;)
www.graph-database.org
The Graph Representation Problem
Adjacency matrix vs. Incidence matrix vs. Adjacency list vs. Edge list vs. Classes,
Index-based vs. Index-free Adjacency, Dense vs. Sparse graphs, On-disk vs. In-memory
graphs, All-Indexed vs. Specific-Index-Creation, directed vs. undirected edges,
hypergraphs?, hierarchical graphs?, dynamic graphs?
• Different levels of expressivity
• Sometimes very application-specific
• Hard to optimize a single one for every use case
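To make the trade-offs concrete, here is a minimal Java sketch that stores the same small directed graph as an adjacency matrix, an adjacency list and an edge list. The class and method names are hypothetical and chosen only for this illustration; vertices are assumed to be numbered 0..n-1.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: three representations of one directed graph,
// built from an array of {source, target} pairs.
final class GraphRepresentations {

    // Adjacency matrix: O(n^2) memory, constant-time edge test. Fits dense graphs.
    // A boolean matrix cannot express parallel edges between the same two vertices.
    static boolean[][] adjacencyMatrix(int n, int[][] edges) {
        boolean[][] matrix = new boolean[n][n];
        for (int[] e : edges) {
            matrix[e[0]][e[1]] = true;
        }
        return matrix;
    }

    // Adjacency list: memory proportional to vertices + edges. Fits sparse graphs.
    static List<List<Integer>> adjacencyList(int n, int[][] edges) {
        List<List<Integer>> adjacency = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            adjacency.add(new ArrayList<>());
        }
        for (int[] e : edges) {
            adjacency.get(e[0]).add(e[1]);
        }
        return adjacency;
    }

    // Edge list: the pairs themselves. Compact, but a neighbor lookup scans all edges.
    static List<Integer> neighborsFromEdgeList(int v, int[][] edges) {
        List<Integer> result = new ArrayList<>();
        for (int[] e : edges) {
            if (e[0] == v) {
                result.add(e[1]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[][] edges = { {0, 1}, {0, 2}, {2, 1} };
        System.out.println(adjacencyMatrix(3, edges)[0][2]);
        System.out.println(adjacencyList(3, edges).get(0));
        System.out.println(neighborsFromEdgeList(2, edges));
    }
}
```

All three answer the same question ("which vertices does v point to?") at different memory and lookup costs, which is why a single representation is hard to optimize for every use case.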
The GraphDB Vendor Problem
• Multiple APIs from different vendors
• Unknown internal graph representation
• Unclear design goals
• Community involvement?
Step 1) Define a common API
The Property-Graph Model
The most common graph model within the NoSQL GraphDB space
• directed: Each edge has a source and destination vertex
• attributed: Vertices and edges carry key/value pairs
• edge-labeled: The label denotes the type of relationship
• multi-graph: Multiple edges between any two vertices allowed
Example property graph:
• Vertex properties: Id: 1, name: Alice, age: 21
• Vertex properties: Id: 2, name: Bob, age: 23
• Edge label: Friends
• Edge properties: since: 2009/09/21
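The four properties of the model (directed, attributed, edge-labeled, multi-graph) map onto a very small data structure. The following Java sketch is only an illustration: `PropertyGraphSketch` and its nested types are hypothetical names, not the API of any vendor or library, and the direction of the "Friends" edge (Alice to Bob) is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory property graph, for illustration only.
final class PropertyGraphSketch {

    static final class Vertex {
        final long id;
        final Map<String, Object> properties = new LinkedHashMap<>(); // attributed
        final List<Edge> outEdges = new ArrayList<>();
        final List<Edge> inEdges = new ArrayList<>();

        Vertex(long id) { this.id = id; }
    }

    static final class Edge {
        final long id;
        final String label;   // edge-labeled: the type of relationship
        final Vertex source;  // directed: every edge has a source ...
        final Vertex target;  // ... and a destination vertex
        final Map<String, Object> properties = new LinkedHashMap<>(); // attributed

        Edge(long id, String label, Vertex source, Vertex target) {
            this.id = id;
            this.label = label;
            this.source = source;
            this.target = target;
        }
    }

    final Map<Long, Vertex> vertices = new LinkedHashMap<>();
    final Map<Long, Edge> edges = new LinkedHashMap<>();

    Vertex addVertex(long id) {
        Vertex v = new Vertex(id);
        vertices.put(id, v);
        return v;
    }

    // Multi-graph: edges are keyed by their own id, so any number of edges
    // may connect the same pair of vertices.
    Edge addEdge(long id, Vertex source, Vertex target, String label) {
        Edge e = new Edge(id, label, source, target);
        edges.put(id, e);
        source.outEdges.add(e);
        target.inEdges.add(e);
        return e;
    }

    // Builds the Alice/Bob example; the edge direction is an assumption.
    static PropertyGraphSketch aliceAndBob() {
        PropertyGraphSketch g = new PropertyGraphSketch();
        Vertex alice = g.addVertex(1);
        alice.properties.put("name", "Alice");
        alice.properties.put("age", 21);
        Vertex bob = g.addVertex(2);
        bob.properties.put("name", "Bob");
        bob.properties.put("age", 23);
        Edge friends = g.addEdge(1, alice, bob, "Friends");
        friends.properties.put("since", "2009/09/21");
        return g;
    }

    public static void main(String[] args) {
        for (Edge e : aliceAndBob().edges.values()) {
            System.out.println(e.source.properties.get("name")
                    + " -[" + e.label + "]-> " + e.target.properties.get("name"));
        }
    }
}
```

Keeping both `outEdges` and `inEdges` on each vertex is one possible form of index-free adjacency: a traversal follows object references instead of consulting a global index.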
Property-Graph Constraints?
• Vertex type vs. vertex interfaces?
• Edge label/type vs. edge interfaces?
• Vertex<->Edge constraints?
• Extension: Undirected Edges?
• Extension: Hyperedges?
• Extension: Semantic graphs?
• Extension: Dynamic graphs?
Blueprints: A Property Graph Model Interface for Java and .NET
// Use a class-based in-memory graph
var graph = new InMemoryGraph();
var v1 = graph.AddVertex(new VertexId(1));
var v2 = graph.AddVertex(new VertexId(2));
v1.SetProperty("name", "Alice");
v1.SetProperty("age", 21);
v2.SetProperty("name", "Bob");
v2.SetProperty("age", 23);
var e1 = graph.AddEdge(v1, v2, new EdgeId(1), "Friends");
e1.SetProperty("since", "2009/09/21");
structured data (XML, JSON)
Supported datatypes?
• Strings
• Integers
• DateTime?
• byte[]?
• structured data like XML/JSON?
• List<...>
• ...
Step 2) Declarative ways for querying
Querying a Graph Database
• Programmatic / API
  • From any programming language, Pipes, ...
  • Synchronous or asynchronous
  • Allow bypassing all optimizations
  • Do not try to be smarter than the application developer
• Ad hoc / Explorative
  • Gremlin aka. “high-level pipes”?
  • sones GQL, OrientDB QL aka. “SQL style”?
  • Pattern matching aka. “SPARQL style”?
  • Easy embedding of domain-specific query languages?
Pipes: A data flow framework for property graph models
ISideEffectPipe<in S, out E, out T> : IEnumerator<E>, IEnumerable<E>
• S: source elements
• E: emitted elements
• T: side effect
Pipeline<S, E>
pipe1<S,A> -> pipe2<A,B> -> pipe3<B,E>
• S: source elements
• E: emitted elements
Create complex pipes by combining pipes into pipelines
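The idea of chaining typed pipes can be sketched in a few lines of Java on top of lazy streams. This is not the Pipes API: `PipesSketch`, its `Pipe` interface and the `then` method are hypothetical names for this illustration, and the vertices Carol and Dave are made-up test data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch of composable pipes; not the API of the Pipes framework.
final class PipesSketch {

    // Minimal graph types, just enough to traverse outgoing edges.
    static final class Vertex {
        final String name;
        final List<Edge> outEdges = new ArrayList<>();
        Vertex(String name) { this.name = name; }
        void connectTo(Vertex target, String label) {
            outEdges.add(new Edge(label, target));
        }
    }

    static final class Edge {
        final String label;
        final Vertex inVertex; // the vertex this edge points to
        Edge(String label, Vertex inVertex) {
            this.label = label;
            this.inVertex = inVertex;
        }
    }

    // A pipe lazily turns a stream of S (source) into a stream of E (emitted).
    interface Pipe<S, E> extends Function<Stream<S>, Stream<E>> {
        // Chaining only compiles if this pipe's output type is the next pipe's input type.
        default <T> Pipe<S, T> then(Pipe<E, T> next) {
            return source -> next.apply(this.apply(source));
        }
    }

    static Pipe<Vertex, Edge> outEdges() {
        return vertices -> vertices.flatMap(v -> v.outEdges.stream());
    }

    static Pipe<Edge, Edge> labelEquals(String label) {
        return edges -> edges.filter(e -> e.label.equals(label));
    }

    static Pipe<Edge, Vertex> inVertex() {
        return edges -> edges.map(e -> e.inVertex);
    }

    static Pipe<Vertex, String> name() {
        return vertices -> vertices.map(v -> v.name);
    }

    // One "Friends" hop is itself a pipe built from three smaller pipes.
    static Pipe<Vertex, Vertex> friends() {
        return outEdges().then(labelEquals("Friends")).then(inVertex());
    }

    static List<String> friendsOfFriends(Vertex start) {
        Pipe<Vertex, String> pipeline = friends().then(friends()).then(name());
        return pipeline.apply(Stream.of(start)).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Vertex alice = new Vertex("Alice");
        Vertex bob = new Vertex("Bob");
        Vertex carol = new Vertex("Carol");
        alice.connectTo(bob, "Friends");
        bob.connectTo(carol, "Friends");
        System.out.println(friendsOfFriends(alice));
    }
}
```

Because each pipe declares its input and output type, the compiler rejects a pipeline whose stages do not line up, and nothing is evaluated until the final `collect` pulls elements through the chain.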
Gremlin: A “perl”-style ad hoc query language for graphs
// Friends-of-a-friend
var pipe1 = new VertexEdgePipe(VertexEdgePipe.Step.OUT_EDGES);
var pipe2 = new LabelFilterPipe("Friends", ComparisonFilter.EQUALS);
var pipe3 = new EdgeVertexPipe(EdgeVertexPipe.Step.IN_VERTEX);
var pipe4 = new VertexEdgePipe(VertexEdgePipe.Step.OUT_EDGES);
var pipe5 = new LabelFilterPipe("Friends", ComparisonFilter.EQUALS);
var pipe6 = new EdgeVertexPipe(EdgeVertexPipe.Step.IN_VERTEX);
var pipe7 = new PropertyPipe("name");
var pipeline = new Pipeline(pipe1, pipe2, pipe3, pipe4, pipe5, pipe6, pipe7);
pipeline.SetSource(new SingleEnumerator(graph.GetVertex(new VertexId(1))));
// The same traversal in Gremlin
g:id-v(1)/outE[@label='Friends']/inV/outE[@label='Friends']/inV/@name
sones GQL: A “SQL”-style ad hoc query language for graphs
// Friends-of-a-friend
FROM User u SELECT u.Friends.Friends.name WHERE u.Id = 1
Step 3) Query result formats
Query Result Formats
• Graphs
  • A query result (QR) may be queried over and over again
  • A QR may be stored/cached as a graph
  • But again: (too) many graph representations available
• Other data structures
  • If the result is just a list, why convert it to a graph?
  • Simple for programming languages
  • Much more complicated for query languages
Query Result Formats
• Reduced 2-tier architecture (GraphDB -> Client)
  • Higher performance
  • Avoids relational architecture anti-patterns
• Link-aware, self-describing hypermedia (see Neo4J)
  • e.g. ATOM, XML + XLINK, RDFa
• User-defined/application-specific protocols
  • e.g. serve HTML/GEXF directly (see CouchDB)
  • Allows creating powerful embedded applications
Step 4) Accessing remote graphs
Rexster: An HTTP/REST interface for property graphs
• rexster server
  • Exposes a graph via HTTP/REST
  • Vertices and edges are REST resources
  • Neo4J, OrientDB are available, InfiniteGraph announced
• rexster client
  • Accessing remote graphs
Common CRUD operations...
What about other HTTP verbs?
• PATCH for applying small changes?
• NEIGHBORS?
• EXPLORE (more neighbors...)
• SHORTESTPATH
• CENTRALITY
Default resource representation: JSON
curl -H Accept:application/json http://localhost:8182/graph1/vertices/1
{
  "version" : "0.1",
  "results" : {
    "_type" : "vertex",
    "_id" : "1",
    "name" : "Alice",
    "age" : 21
  },
  "query_time" : 0.014235
}
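The same request can be issued from application code. The sketch below uses the JDK's built-in `java.net.http` client (Java 11+); the class `VertexRequestSketch` and its methods are hypothetical names, and the URL layout `<base>/<graph>/vertices/<id>` is simply copied from the curl call above.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical client-side helper; URL layout taken from the curl example.
final class VertexRequestSketch {

    // Builds GET <baseUri>/<graph>/vertices/<vertexId>, asking for JSON.
    static HttpRequest vertexRequest(String baseUri, String graph, long vertexId) {
        URI uri = URI.create(baseUri + "/" + graph + "/vertices/" + vertexId);
        return HttpRequest.newBuilder(uri)
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    // Sends the request and returns the raw JSON body.
    // Requires a server listening at baseUri.
    static String fetchVertexJson(String baseUri, String graph, long vertexId)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response = client.send(
                vertexRequest(baseUri, graph, vertexId),
                HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) {
        HttpRequest request = vertexRequest("http://localhost:8182", "graph1", 1);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Splitting request construction from sending keeps the URL and header logic checkable without a running server.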
Advanced HTTP/REST concepts
• HTTP caching support?
• HTTP Authentication support?
• Conditional PUT/POST requests?
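For readers unfamiliar with conditional requests, this is roughly what they would look like from the client side, using the standard HTTP headers `If-None-Match` and `If-Match`. Whether a given graph server returns ETags or honors these headers is exactly the open question above, so the sketch assumes it; `ConditionalRequestSketch` and its methods are hypothetical names.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical sketch: assumes the server hands out ETags for vertices.
final class ConditionalRequestSketch {

    // Conditional GET: a server that supports it may answer 304 Not Modified
    // when the cached representation identified by the ETag is still current.
    static HttpRequest conditionalGet(String vertexUri, String etag) {
        return HttpRequest.newBuilder(URI.create(vertexUri))
                .header("Accept", "application/json")
                .header("If-None-Match", etag)
                .GET()
                .build();
    }

    // Conditional PUT: the update should only be applied if the vertex is
    // unchanged since the ETag was issued; otherwise a supporting server
    // would answer 412 Precondition Failed.
    static HttpRequest conditionalPut(String vertexUri, String etag, String json) {
        return HttpRequest.newBuilder(URI.create(vertexUri))
                .header("Content-Type", "application/json")
                .header("If-Match", etag)
                .PUT(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest put = conditionalPut(
                "http://localhost:8182/graph1/vertices/1", "\"v1\"", "{\"age\": 22}");
        System.out.println(put.method() + " " + put.uri() + " " + put.headers().map());
    }
}
```

Conditional updates give optimistic concurrency control over plain HTTP: two clients editing the same vertex cannot silently overwrite each other.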
The GraphDB Graph...
• Neo4J for GIS
• InfoGrid for WebApps
• In-Memory for Caching
• OrientDB for Documents
• OrientDB for Ad Hoc
• TinkerGraph & Gremlin for Ad Hoc
• Neo4J for HA
• InfiniteGraph for Clustering
Questions?
http://www.graph-database.org
http://www.twitter.com/graphdbs