Graph Algebra with Pattern Matching and Aggregation Support


Page 1: Graph Algebra

1

Graph Algebra with Pattern Matching and Aggregation Support

Page 2: Graph Algebra

2

Nowadays Graph

Variety of Sources
◦ Scientific Studies
◦ Business Activities
◦ Social Needs
◦ Internet

Data are often
◦ Large Scale
◦ Highly Linked
◦ Schema-less

Page 3: Graph Algebra

3

Managing Graph Data

Primary Role of Database
◦ Persistent store
◦ Efficient query

RDBMS
◦ Storage model: vertices and edges as tuples
◦ Query: links are followed by joins

Graph Database
◦ Storage model: graphs
◦ Query: path traversal
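To make the contrast between the two storage models concrete, here is a minimal sketch, assuming a toy "friend" relation. The data and the function names (`two_hop_by_join`, `build_adjacency`, `two_hop_by_traversal`) are hypothetical and only illustrate the idea; they do not describe any particular database system.

```python
from collections import defaultdict

# Hypothetical toy data: the same three "friend" links used by both models.
edge_tuples = [("ann", "bob"), ("bob", "cat"), ("cat", "dan")]


def two_hop_by_join(edges):
    """Relational style: self-join the edge table on first.dst == second.src."""
    return sorted((a, d) for (a, b) in edges for (c, d) in edges if b == c)


def build_adjacency(edges):
    """Graph style: store every vertex together with its outgoing neighbours."""
    adj = defaultdict(list)
    for src, dst in edges:
        adj[src].append(dst)
    return adj


def two_hop_by_traversal(adj, start):
    """Follow links from one start vertex instead of joining whole tables."""
    return sorted(far for near in adj.get(start, []) for far in adj.get(near, []))


adjacency = build_adjacency(edge_tuples)
print(two_hop_by_join(edge_tuples))
print(two_hop_by_traversal(adjacency, "ann"))
```

The join touches every pair of edge tuples, while the traversal only visits the neighbourhood of the start vertex, which is the difference the slide points at.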

Page 4: Graph Algebra

4

Why not RDBMS?

Schema Issue
◦ Each inserted data item may have a different schema (Web Graph)
◦ Hard to represent semi-structured information

Scalability Issues
◦ ACID properties vs. the CAP theorem

Query Performance
◦ Difficult to optimize join-intensive queries

Page 5: Graph Algebra

5

Graph Databases and Query Languages

No Universal Languages !!!

Page 6: Graph Algebra

6

No Universal Language Like SQL?

No commonly agreed algebra

Relational Algebra?
◦ Expressive and proven effective over time
◦ NOT suitable for GRAPH

Graph Algebra?
◦ Still at a preliminary stage

Page 7: Graph Algebra

7

Issues with Relational Algebra (RA)

Defined on Tuples or Sets of Tuples
◦ Mismatch with the nature of graphs
◦ Operators lose their semantics
  What is Union, Intersection, Join in GRAPH?
◦ I/O type? Tables, not GRAPH

Domain centric, not Data centric
◦ Does not anticipate out-of-order data
◦ Treats tuples as independent
  Is not aware of the links among tuples

Queries written using RA are verbose and complex

Page 8: Graph Algebra

8

Advantage of Graph Algebra

An algebra is itself a query language
◦ Easy to work out a language with strong theoretical support

Evaluate the expressiveness of given languages
◦ Justify when to use which: Gremlin, Cypher etc.

Query Optimization
◦ Operator order EQUALS execution plan
◦ Algebraic equivalence IMPLIES query optimization

Page 9: Graph Algebra

9

Advantage of Graph Algebra

Separation of Query and System:
◦ One can write a query on any system as long as a common algebra is supported.
◦ Knowing RA, one can write SQL, PL/SQL, MS/SQL on MySQL, Oracle, SQL Server

Integrate new operators into the database:
◦ Current graph database systems do not support newly developed queries: Graph OLAP, Graph Cube, Graph Aggregation etc.
◦ A proper algebra can incorporate these operators

Page 10: Graph Algebra

10

Existing Works on Graph Algebra

Graph QL [1]
◦ A graph-based algebra whose operators are defined on graphs
◦ Selection
◦ Join – not properly defined
◦ Template

VAQL [2]
◦ Focused on visualization
◦ Selection
◦ Aggregation – restricted
◦ Visualization

Selection is restricted to isomorphism
Aggregation is not defined over edges
No algebraic equivalence

[1] He, Huahai, and Ambuj K. Singh. "Graphs-at-a-time: query language and access methods for graph databases." Proceedings of the 2008 ACM SIGMOD international conference on Management of data. ACM, 2008.
[2] Shaverdian, Anna A., et al. "A graph algebra for scalable visual analytics." Computer Graphics and Applications, IEEE 32.4 (2012): 26-33.

Page 11: Graph Algebra

11

What do we want from a Graph Algebra?

Universal
◦ Independent of graph types: directed vs. undirected, simple vs. hyper, homogeneous vs. heterogeneous.

Expressive
◦ Able to answer typical graph queries: pattern matching, reachability, path finding etc.
◦ Covers Relational Algebra (RA)
  This ensures that a graph database can handle relational data as well

Scale
◦ Able to manage data at scale
  Supports queries that summarize and aggregate data

Page 12: Graph Algebra

12

Extended Algebra – Graph Model

The data model is an attributed graph made up of:
◦ a vertex set, in which each vertex has a unique ID
◦ an edge set
◦ the attributes of each vertex
◦ the attributes of each edge
  Edges carry an identifier as well
  In a simple graph, an edge can be represented by its end points
◦ information about the graph itself
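One possible way to hold such an attributed graph in memory is sketched below. The class name `AttributedGraph` and its field names are hypothetical choices for illustration; the slides define the model mathematically and do not prescribe a data structure.

```python
from dataclasses import dataclass, field


@dataclass
class AttributedGraph:
    """Illustrative container for the five parts of the graph model."""
    vertices: set = field(default_factory=set)        # unique vertex IDs
    edges: dict = field(default_factory=dict)         # edge ID -> (source ID, target ID)
    vertex_attrs: dict = field(default_factory=dict)  # vertex ID -> {attribute: value}
    edge_attrs: dict = field(default_factory=dict)    # edge ID -> {attribute: value}
    graph_attrs: dict = field(default_factory=dict)   # information about the graph itself

    def add_vertex(self, vid, **attrs):
        self.vertices.add(vid)
        self.vertex_attrs[vid] = dict(attrs)

    def add_edge(self, eid, src, dst, **attrs):
        # Edges have their own identifier, so parallel edges stay distinguishable.
        if src not in self.vertices or dst not in self.vertices:
            raise KeyError("both end points must already be vertices")
        self.edges[eid] = (src, dst)
        self.edge_attrs[eid] = dict(attrs)


g = AttributedGraph(graph_attrs={"name": "toy social network"})
g.add_vertex("v1", name="Ann", birthday=1990)
g.add_vertex("v2", name="Bob", birthday=1985)
g.add_edge("e1", "v1", "v2", label="friend")
```

Keeping edge identifiers separate from end points is what lets the same structure cover both simple graphs and graphs with several edges between the same pair of vertices.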

Page 13: Graph Algebra

13

Extended Algebra – Operators

Projection

Restriction

Unification

Pattern Matching

Aggregation

Page 14: Graph Algebra

14

Operators: Projection

Purpose:
◦ Select the data the user is interested in from the base graph

Syntax:

The arguments are the attribute lists for vertex, edge and graph

The result is a new graph whose attributes are trimmed to those lists
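A minimal sketch of projection, assuming a graph is stored as a plain dictionary with `vertices`, `edges` and `attrs` entries. That layout and the function name `project` are assumptions made for this example, not the notation of the slides.

```python
def project(graph, vertex_keep, edge_keep, graph_keep):
    """Return a new graph whose attributes are trimmed to the three lists."""
    def trim(attrs, keep):
        return {k: v for k, v in attrs.items() if k in keep}

    return {
        "vertices": {vid: trim(a, vertex_keep) for vid, a in graph["vertices"].items()},
        "edges": {eid: (s, t, trim(a, edge_keep)) for eid, (s, t, a) in graph["edges"].items()},
        "attrs": trim(graph["attrs"], graph_keep),
    }


# Hypothetical base graph: vertex ID -> attributes, edge ID -> (source, target, attributes).
base = {
    "vertices": {"v1": {"name": "Ann", "birthday": 1990},
                 "v2": {"name": "Bob", "birthday": 1985}},
    "edges": {"e1": ("v1", "v2", {"label": "friend", "since": 2010})},
    "attrs": {"source": "toy data"},
}

names_only = project(base, vertex_keep=["name"], edge_keep=[], graph_keep=[])
```

Vertices and edges themselves are never dropped here; only their attribute sets shrink, and the base graph is left unchanged.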

Page 15: Graph Algebra

15

Operators: Restriction

Purpose:
◦ Restrict the base graph by attribute values

Syntax:

Vertex restriction: select all the vertices (and their induced edges) that match the predicate

Edge restriction: select all the edges (and their end points) that match the predicate

Graph restriction: select the graphs in which every vertex matches the vertex predicate, every edge matches the edge predicate, and the graph itself matches the graph predicate
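The vertex and edge variants can be sketched as follows, again assuming a dictionary-based graph layout. The function names `restrict_vertices` and `restrict_edges` and the sample data are hypothetical.

```python
def restrict_vertices(graph, pred):
    """Keep the vertices whose attributes satisfy pred, plus the edges they induce."""
    kept = {vid: a for vid, a in graph["vertices"].items() if pred(a)}
    edges = {eid: e for eid, e in graph["edges"].items()
             if e[0] in kept and e[1] in kept}
    return {"vertices": kept, "edges": edges, "attrs": dict(graph["attrs"])}


def restrict_edges(graph, pred):
    """Keep the edges whose attributes satisfy pred, plus their end points."""
    edges = {eid: e for eid, e in graph["edges"].items() if pred(e[2])}
    ends = {v for (s, t, _) in edges.values() for v in (s, t)}
    vertices = {vid: a for vid, a in graph["vertices"].items() if vid in ends}
    return {"vertices": vertices, "edges": edges, "attrs": dict(graph["attrs"])}


# Hypothetical base graph: edge ID -> (source, target, attributes).
base = {
    "vertices": {"v1": {"birthday": 1990}, "v2": {"birthday": 1985}, "v3": {"birthday": 1992}},
    "edges": {"e1": ("v1", "v2", {"label": "friend"}),
              "e2": ("v1", "v3", {"label": "comment"})},
    "attrs": {},
}

young = restrict_vertices(base, lambda a: a["birthday"] > 1989)
friends = restrict_edges(base, lambda a: a["label"] == "friend")
```

Note the asymmetry: vertex restriction drops any edge that loses an end point, while edge restriction pulls in exactly the end points of the surviving edges.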

Page 16: Graph Algebra

16

Operator: Unification

Purpose:
◦ Concatenate graphs

Syntax:

Vertex unification: unify vertices with identical IDs

Edge unification: add edges between two vertices that match a predicate

Attribute unification: create a virtual vertex for each distinct value of an attribute
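Vertex and edge unification could look roughly like this. The sketch assumes dictionary-based graphs, assumes edge IDs do not clash between the two inputs, and resolves conflicting attribute values by letting the second graph win; none of these choices come from the slides. The names `unify_vertices` and `unify_edges` are hypothetical.

```python
def unify_vertices(g1, g2):
    """Merge two graphs, treating vertices with identical IDs as the same vertex."""
    vertices = {}
    for g in (g1, g2):
        for vid, attrs in g["vertices"].items():
            vertices.setdefault(vid, {}).update(attrs)  # assumption: later graph wins
    return {"vertices": vertices,
            "edges": {**g1["edges"], **g2["edges"]},    # assumption: edge IDs are distinct
            "attrs": {**g1["attrs"], **g2["attrs"]}}


def unify_edges(graph, pred, label="unified"):
    """Add an edge between every ordered pair of distinct vertices satisfying pred."""
    edges = dict(graph["edges"])
    vids = sorted(graph["vertices"])
    for a in vids:
        for b in vids:
            if a != b and pred(graph["vertices"][a], graph["vertices"][b]):
                edges[f"u:{a}->{b}"] = (a, b, {"label": label})
    return {"vertices": dict(graph["vertices"]), "edges": edges, "attrs": dict(graph["attrs"])}


g1 = {"vertices": {"v1": {"name": "Ann"}, "v2": {"name": "Bob"}},
      "edges": {"e1": ("v1", "v2", {"label": "friend"})}, "attrs": {}}
g2 = {"vertices": {"v2": {"city": "Oslo"}, "v3": {"name": "Cat", "city": "Oslo"}},
      "edges": {"e2": ("v2", "v3", {"label": "friend"})}, "attrs": {}}

merged = unify_vertices(g1, g2)
same_city = lambda a, b: a.get("city") is not None and a.get("city") == b.get("city")
linked = unify_edges(merged, same_city)
```

Attribute unification, which creates a virtual vertex per distinct attribute value, is left out of this sketch.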

Page 17: Graph Algebra

17

Operator: Unification

P(v1,v1) and P(v4,v5) are true

Page 18: Graph Algebra

18

Operator: Unification

Page 19: Graph Algebra

19

Operator: Pattern Matching

Purpose:
◦ Find the subgraphs of the base graph that match a given pattern

Syntax:

The pattern is itself a graph; its definition comes from [1]

One form returns all the matching graphs; the other returns an abstracted match, in which only the vertices that appear in the pattern are returned

[1] Fan, Wenfei, et al. "Adding regular expressions to graph reachability and pattern queries." Data Engineering (ICDE), 2011 IEEE 27th International Conference on. IEEE, 2011.
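As a rough illustration of what "matching a pattern that is itself a graph" means, the sketch below does brute-force matching on labelled edges. This is a deliberately simplified stand-in: it does not implement the regular-expression patterns of [1], and the function name `match_pattern` and the toy data are hypothetical.

```python
from itertools import permutations


def match_pattern(pattern_edges, graph_vertices, graph_edges):
    """Return every binding of pattern variables to distinct graph vertices
    such that each labelled pattern edge exists in the graph."""
    pattern_vars = sorted({v for s, t, _ in pattern_edges for v in (s, t)})
    matches = []
    for combo in permutations(sorted(graph_vertices), len(pattern_vars)):
        binding = dict(zip(pattern_vars, combo))
        if all((binding[s], binding[t], lab) in graph_edges
               for s, t, lab in pattern_edges):
            matches.append(binding)
    return matches


# Hypothetical base graph as a set of (source, target, label) triples.
vertices = {"a", "b", "c", "d"}
edges = {("a", "b", "friend"), ("b", "c", "comment"), ("c", "d", "friend")}

# Pattern: x is a friend of y, and y comments on z.
pattern = [("x", "y", "friend"), ("y", "z", "comment")]
found = match_pattern(pattern, vertices, edges)
```

Trying every assignment is exponential in the pattern size, so this only serves to show the semantics on small inputs.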

Page 20: Graph Algebra

20

Operator: Pattern Matching

Page 21: Graph Algebra

21

Operator: Aggregation

Purpose:
◦ Summarize a given graph

Syntax:

Graph aggregation: every vertex is supplied to one aggregate function and every edge set is supplied to another

Vertex aggregation: given a set of vertices, group them by the given criterion

Edge aggregation: given a set of edges, group them by the given criterion
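A small sketch of the idea, assuming that vertices are grouped by one attribute and that COUNT is the aggregate applied to both vertex groups and edge groups. The grouping key, the choice of COUNT and the function name `aggregate` are assumptions for illustration only.

```python
from collections import Counter


def aggregate(graph, group_key):
    """Collapse vertices sharing group_key into one summary vertex and
    count the edges running between each pair of groups."""
    group_of = {vid: a[group_key] for vid, a in graph["vertices"].items()}
    vertex_counts = Counter(group_of.values())
    edge_counts = Counter((group_of[s], group_of[t])
                          for (s, t, _) in graph["edges"].values())
    return {
        "vertices": {grp: {"count": n} for grp, n in vertex_counts.items()},
        "edges": {f"{s}->{t}": (s, t, {"count": n}) for (s, t), n in edge_counts.items()},
        "attrs": dict(graph["attrs"]),
    }


# Hypothetical base graph: edge ID -> (source, target, attributes).
base = {
    "vertices": {"v1": {"city": "Oslo"}, "v2": {"city": "Oslo"}, "v3": {"city": "Rome"}},
    "edges": {"e1": ("v1", "v3", {}), "e2": ("v2", "v3", {}), "e3": ("v1", "v2", {})},
    "attrs": {},
}

summary = aggregate(base, "city")
```

The result is again a graph, so it can be fed to further operators, which is what makes aggregation usable inside a larger algebraic expression.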

Page 22: Graph Algebra

22

Operator: Aggregation

Page 23: Graph Algebra

23

Expressiveness

This set of operators is more expressive than Relational Algebra and Graph QL

It can represent many graph queries
◦ Reachability
◦ Graph Cube computation
◦ I-OLAP and T-OLAP

Page 24: Graph Algebra

24

Algebra Equivalence

When operators are chained, they form a query execution plan

Example query: find the network induced by the persons whose friends comment on each other's posts, with birthday greater than 1989. Output their names as a graph

Pattern (figure): edges labelled friend, Comment, friend

$\oplus_v(\pi(\sigma_v(\Gamma(RM, G),\ birthday > 1989),\ v.name))$

Execution plan (figure): Base Graph → Matched → Restriction (birthday > 1989) → v.name → V-Unification → Result
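The chaining of restriction, projection and vertex unification in this example can be sketched as nested function calls. The sketch assumes the pattern-matching step has already produced a list of matched graphs; the data, the simplified graph layout (vertex dictionary plus a list of edge pairs) and the function names `restrict`, `project` and `unify` are all hypothetical.

```python
def restrict(graphs, pred):
    """Vertex restriction on each matched graph; edges are induced by kept vertices."""
    out = []
    for g in graphs:
        kept = {v: a for v, a in g["vertices"].items() if pred(a)}
        edges = [(s, t) for s, t in g["edges"] if s in kept and t in kept]
        out.append({"vertices": kept, "edges": edges})
    return out


def project(graphs, keep):
    """Trim every vertex to the listed attributes."""
    return [{"vertices": {v: {k: a[k] for k in keep if k in a}
                          for v, a in g["vertices"].items()},
             "edges": list(g["edges"])} for g in graphs]


def unify(graphs):
    """Vertex unification: merge all graphs on identical vertex IDs."""
    vertices, edges = {}, set()
    for g in graphs:
        for v, a in g["vertices"].items():
            vertices.setdefault(v, {}).update(a)
        edges.update(g["edges"])
    return {"vertices": vertices, "edges": sorted(edges)}


# Hypothetical output of the pattern-matching step.
matched = [
    {"vertices": {"v1": {"name": "Ann", "birthday": 1990},
                  "v2": {"name": "Bob", "birthday": 1992}}, "edges": [("v1", "v2")]},
    {"vertices": {"v2": {"name": "Bob", "birthday": 1992},
                  "v3": {"name": "Cat", "birthday": 1985}}, "edges": [("v2", "v3")]},
]

# Innermost call runs first, mirroring the nesting of the algebraic expression.
result = unify(project(restrict(matched, lambda a: a["birthday"] > 1989), ["name"]))
```

Reading the call from the inside out gives the same order as the plan in the figure, which is the sense in which operator order determines the execution plan.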

Page 25: Graph Algebra

25

Algebra Equivalence

To generate multiple execution plans for the same query, we need theoretical support:

Identity Equivalence:
◦ An operator can be represented by other operators // p is a common attribute predicate
◦ D(P) decomposes a pattern P into edges

◦ //

...

Page 26: Graph Algebra

26

Conclusion

Graph Algebra plays an important role in graph database development

We take one step forward by proposing a Graph Algebra which:
◦ extends existing algebraic work with
  Regular pattern matching
  Aggregation
◦ is expressive and well-defined
◦ contains equivalence rules for further query optimization

Page 27: Graph Algebra

27

Thank you!