Graph Data -- RDF and Property Graphs

Graph Data

Linked Data and Property Graphs

Contents

Example

LD/RDF

PG

SPARQL

Gremlin/Cypher

Big data: Apache Giraph / Bulk Synchronous Processing

Ideally? + arrays + URIs + attributes

Intro

SPARQL: main editor of the query language spec (v1, v1.1)

RDF: practical wing of the semantic web

Helped with the syntax specs

Apache Jena: implements SPARQL 1.1

Biases ...

Graphs

A set of vertices (nodes) and edges (arcs)

Except the useful kind have labels on edges

and the nodes are just dots.

Graphs

G = ( V , E )

V: vertices (nodes)

E: edges (arcs, links)
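As a minimal sketch of the definition above, a graph can be held as a set of vertices plus a set of edges, and putting a label on each edge gives the "useful kind". The vertex names and the direction of each edge below are illustrative assumptions, not taken from any particular system.

```python
# A graph G = (V, E): V is a set of vertices, E a set of edges.
# Each edge is (source, label, target); names and directions are assumed.
V = {"alice", "bob", "eve"}
E = {
    ("alice", "knows", "bob"),
    ("bob", "knows", "alice"),
    ("eve", "listensTo", "alice"),
}


def neighbours(vertex, label):
    """Return the vertices reached from `vertex` by edges carrying `label`."""
    return {target for (source, lbl, target) in E
            if source == vertex and lbl == label}


# Every edge must connect two members of V.
assert all(s in V and t in V for (s, _, t) in E)
print(neighbours("alice", "knows"))
```

Keeping the label inside the edge tuple means two vertices can be linked by several differently labelled edges at once.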

Graphs

Graphs For Information

[Figure: graph with nodes Alice, Bob and Eve, and edges labelled listensTo and knows]

Graphs For Information

Alice

... has a name: Alice Hacker

... has an employee number

Linked Data / RDF

Standards

What it means

Syntaxes for exchanging data

Query language

URIs name things globally

Uniform representation: a link to another thing is the same as a link to a value

Complex structures encoded in the basic mechanism

Schemaless data integration
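The points above can be sketched with plain tuples: every statement is a (subject, predicate, object) triple, whether the object is another thing or a value, so combining data from two sources needs no schema mapping. The second source and its employee-number property URI are hypothetical, and this sketch keeps URIs and literals as plain strings, which real RDF does not.

```python
# Triples are (subject, predicate, object). The object is either a URI
# (a link to another thing) or a literal value; the shape is the same.
FOAF = "http://xmlns.com/foaf/0.1/"

source_a = {
    ("http://example/alice", FOAF + "name", "Alice Hacker"),
    ("http://example/alice", FOAF + "knows", "http://example/bob"),
}
# Hypothetical second source; the property URI and value are assumptions.
source_b = {
    ("http://example/alice", "http://example/employeeNumber", "1234"),
}

# With no schema to reconcile, integration is a set union.
merged = source_a | source_b


def describe(subject, triples):
    """Collect predicate -> set of objects for one subject."""
    out = {}
    for s, p, o in triples:
        if s == subject:
            out.setdefault(p, set()).add(o)
    return out


alice = describe("http://example/alice", merged)
print(sorted(alice))
```

Because both sources use the same global URI for Alice, the union alone is enough to see all three of her properties together.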

Linked Data

make your stuff available on the Web (whatever format) under an open license

make it available as structured data (e.g., Excel instead of image scan of a table)

use non-proprietary formats (e.g., CSV instead of Excel)

use URIs to denote things, so that people can point at your stuff

link your data to other data to provide context

http://5stardata.info/

Linked Data / RDF

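[Figure: RDF graph. http://example/alice has foaf:name "Alice Hacker" and foaf:knows http://example/bob; http://example/bob has foaf:name "Bob Tester" and foaf:knows http://example/alice]

prefix person: <...>
prefix foaf: <http://xmlns.com/foaf/0.1/>

<http://example/alice> foaf:name "Alice Hacker" ;
    foaf:knows <http://example/bob> .

<http://example/bob> foaf:name "Bob Tester" ;
    foaf:knows <http://example/alice> .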

JSON-LD

Links and semantics for the JSON ecosystem

{
  "@context" : "http://example/person.jsonld",
  "@graph" : [
    { "@id" : "http://example/alice",
      "knows" : "http://example/bob",
      "name" : "Alice Hacker" },
    { "@id" : "http://example/bob",
      "knows" : "http://example/alice",
      "name" : "Bob Tester" }
  ]
}
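To show how the context supplies the links and semantics, here is a rough sketch that turns the JSON above into triples. The context document at http://example/person.jsonld is not shown, so `ASSUMED_CONTEXT` is a guess at what it might contain; the code is a hand-rolled illustration, not a JSON-LD processor.

```python
import json

FOAF = "http://xmlns.com/foaf/0.1/"
# ASSUMPTION: stand-in for the unseen context document. "knows" values are
# treated as node identifiers, "name" values as plain literals.
ASSUMED_CONTEXT = {
    "knows": {"@id": FOAF + "knows", "@type": "@id"},
    "name": {"@id": FOAF + "name"},
}

doc = json.loads("""
{ "@context": "http://example/person.jsonld",
  "@graph": [
    { "@id": "http://example/alice",
      "knows": "http://example/bob", "name": "Alice Hacker" },
    { "@id": "http://example/bob",
      "knows": "http://example/alice", "name": "Bob Tester" }
  ] }
""")


def to_triples(document, context):
    """Flatten the @graph array into (subject, predicate, object) tuples."""
    triples = set()
    for node in document["@graph"]:
        subject = node["@id"]
        for key, value in node.items():
            if key.startswith("@"):
                continue  # skip JSON-LD keywords such as @id
            term = context[key]
            kind = "iri" if term.get("@type") == "@id" else "literal"
            triples.add((subject, term["@id"], (kind, value)))
    return triples


triples = to_triples(doc, ASSUMED_CONTEXT)
print(len(triples))
```

The JSON itself stays ordinary key/value data; only the context decides that "knows" is a link and "name" is a value.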

SPARQL Query

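prefix person: <...>
prefix foaf: <http://xmlns.com/foaf/0.1/>

<http://example/alice> foaf:name "Alice Hacker" ;
    foaf:knows <http://example/bob> .

<http://example/bob> foaf:name "Bob Tester" ;
    foaf:knows <http://example/alice> .

PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?name
WHERE
{ ?person foaf:name "Alice Hacker" ;
          foaf:knows ?friend .
  ?friend foaf:name ?name .
}

----------------
| name         |
================
| "Bob Tester" |
----------------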
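A query of this kind is a list of triple patterns joined on their shared variables. The sketch below is a toy evaluator, assuming three patterns: find the person named "Alice Hacker", follow foaf:knows, then read the friend's foaf:name. It illustrates the matching idea only and is not how Jena or any SPARQL engine is implemented.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
DATA = {
    ("http://example/alice", FOAF + "name", "Alice Hacker"),
    ("http://example/alice", FOAF + "knows", "http://example/bob"),
    ("http://example/bob", FOAF + "name", "Bob Tester"),
    ("http://example/bob", FOAF + "knows", "http://example/alice"),
}


def is_var(term):
    """Variables are written with a leading '?', as in SPARQL."""
    return term.startswith("?")


def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    result = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            if result.setdefault(p_term, t_term) != t_term:
                return None  # variable already bound to something else
        elif p_term != t_term:
            return None  # constant does not match
    return result


def solve(patterns, data):
    """Evaluate patterns left to right, joining on shared variables."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in data:
                new_binding = match(pattern, triple, binding)
                if new_binding is not None:
                    extended.append(new_binding)
        bindings = extended
    return bindings


query = [
    ("?person", FOAF + "name", "Alice Hacker"),
    ("?person", FOAF + "knows", "?friend"),
    ("?friend", FOAF + "name", "?name"),
]
names = [b["?name"] for b in solve(query, DATA)]
print(names)  # ['Bob Tester']
```

The SELECT clause corresponds to the last step, which keeps only the `?name` column from each solution.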

Property Graphs

Separates Links and Attributes

Nodes have attributes and so do edges

Different definitions exist; TinkerPop (https://github.com/tinkerpop/) is the de facto standard

Not universal

Data exchange (web publishing) is not an objective

Analysis and schema-less data applications
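A minimal sketch of the model described above: links are stored apart from attributes, and both nodes and edges carry their own attribute dictionary. The layout and the "since" edge attribute are hypothetical choices for illustration, not a TinkerPop structure.

```python
# Nodes: id -> attributes. Links are kept separately from attributes.
nodes = {
    "alice": {"name": "Alice Hacker"},
    "bob": {"name": "Bob Coder"},
}
# Edges: id -> (out vertex, label, in vertex, edge attributes).
edges = {
    "k1": ("alice", "knows", "bob", {"since": 2010}),  # "since" is hypothetical
    "k2": ("bob", "knows", "alice", {}),
}


def out_edges(node_id, label):
    """Ids of edges leaving `node_id` with the given label, sorted."""
    return sorted(eid for eid, (src, lbl, _dst, _attrs) in edges.items()
                  if src == node_id and lbl == label)


for eid in out_edges("alice", "knows"):
    _src, _lbl, dst, attrs = edges[eid]
    print(eid, nodes[dst]["name"], attrs)
```

In RDF the name would be one more triple; here it is an attribute on the node, and the edge can hold data of its own.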

Build

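import com.tinkerpop.blueprints.Edge;
import com.tinkerpop.blueprints.Graph;
import com.tinkerpop.blueprints.Vertex;
import com.tinkerpop.blueprints.impls.tg.TinkerGraph;

// In-memory graph; the imports assume Blueprints 2.x package names.
Graph graph = new TinkerGraph();
Vertex a = graph.addVertex("alice");
Vertex b = graph.addVertex("bob");
a.setProperty("name", "Alice Hacker");
b.setProperty("name", "Bob Coder");
// addEdge(id, outVertex, inVertex, label)
Edge e1 = graph.addEdge("k1", a, b, "knows");
Edge e2 = graph.addEdge("k2", b, a, "knows");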

GraphSON

{
  "edges" : [
    { "_id" : "k1" , "_inV" : "bob" , "_label" : "knows" , "_outV" : "alice" , "_type" : "edge" } ,
    { "_id" : "k2" , "_inV" : "alice" , "_label" : "knows" , "_outV" : "bob" , "_type" : "edge" }
  ] ,
  "mode" : "NORMAL" ,
  "vertices" : [
    { "_id" : "bob" , "_type" : "vertex" , "name" : "Bob Coder" } ,
    { "_id" : "alice" , "_type" : "vertex" , "name" : "Alice Hacker" }
  ]
}

Gremlin

// Groovy to Java
@SuppressWarnings("unchecked")
Pipe pipe = Gremlin.compile("g.v('alice').out('knows').name");
for (Object name : pipe) {
    System.out.println((String) name);
}

g.v('alice').out('knows').name
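The traversal reads as a pipeline: start at a vertex, step along outgoing "knows" edges, then project a property. A rough sketch of that idea with chained generators follows; the function names `v`, `out` and `prop` are stand-ins chosen here, and the data is a toy copy of the two-vertex example, not a TinkerPop API.

```python
# Toy data mirroring the two-vertex example above.
VERTICES = {"alice": {"name": "Alice Hacker"}, "bob": {"name": "Bob Coder"}}
EDGES = [("alice", "knows", "bob"), ("bob", "knows", "alice")]


def v(vertex_id):
    """Start a traversal at one vertex (like g.v('alice'))."""
    yield vertex_id


def out(stream, label):
    """For each incoming vertex, emit the targets of its `label` edges."""
    for vertex_id in stream:
        for src, lbl, dst in EDGES:
            if src == vertex_id and lbl == label:
                yield dst


def prop(stream, key):
    """Map each vertex to one of its property values (like .name)."""
    for vertex_id in stream:
        yield VERTICES[vertex_id][key]


names = list(prop(out(v("alice"), "knows"), "name"))
print(names)  # ['Bob Coder']
```

Each step consumes the previous step's output lazily, which is the sense in which the compiled Gremlin expression is a pipe.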

Cypher Query

Neo4j-specific

Property Graph + labels (= types) node names

CREATE (alice { name: 'Alice Hacker' }),
       (bob { name: 'Bob Tester' }),
       (alice)-[:knows]->(bob),
       (bob)-[:knows]->(alice)

MATCH (a)-[:knows]->(x)
WHERE a.name = 'Alice Hacker'
RETURN x.name