Center for Semantic Web Research www.ciws.cl, @ciwschile

Page 1: Center for Semantic Web Research - uchile.cl

Center for Semantic Web Research

www.ciws.cl, @ciwschile

Page 2: Center for Semantic Web Research - uchile.cl

Three institutions

PUC, Chile

◮ Marcelo Arenas (Boss): SW, data exchange, semistructured data

◮ Juan Reutter: SW, graph DBs, DLs

◮ Cristian Riveros: data exchange, semistr. data

◮ Jorge Baier: planning, search

◮ Carlos Buil: SW

Univ. of Talca

◮ Renzo Angles: SW

Univ. of Chile

◮ Pablo Barcelo (Deputy): graph DBs, DB theory

◮ Claudio Gutierrez: SW, graph DBs

◮ Jorge Perez: SW, data exchange

◮ Aidan Hogan: SW, semistr. data

◮ Barbara Poblete: web mining, SNA

◮ Benjamín Bustos: multimedia

Page 3: Center for Semantic Web Research - uchile.cl

Alberto Mendelzon Workshop (AMW)

Page 4: Center for Semantic Web Research - uchile.cl

AMW 2016

◮ Panama City, 6-10 June, 2016

◮ PC Chairs:

Altigran Soares da Silva (UFAM), Reinhard Pichler (TU Wien)

◮ Invited speakers:

  ◮ Diego Calvanese (Data-driven verification)

  ◮ Juliana Freire (Urban data)

  ◮ Lise Getoor (Relational statistical learning)

  ◮ Raghu Ramakrishnan (Big data)

◮ Long & short submissions (due on Feb. 29th, 2016)

◮ AMW School (4 tutorials), 4-5 June, 2016

◮ Very nice environment

Page 5: Center for Semantic Web Research - uchile.cl

Query Languages for Graph DBs:

Bridging the Gap Between Theory and Practice

Pablo Barcelo

DCC, Universidad de Chile

Center for Semantic Web Research (www.ciws.cl)

Page 6: Center for Semantic Web Research - uchile.cl

BACKGROUND AND OBJECTIVES

Page 8: Center for Semantic Web Research - uchile.cl

Graph databases

Trendy applications:

◮ Social network analysis

◮ Semantic web

◮ Scientific databases

◮ Software bug localization

◮ Geo-routing

More in general:

◮ Wherever connections are as important as data

Page 10: Center for Semantic Web Research - uchile.cl

What is a graph database?

A data management system that exposes a graph data model [1].

Several existing graph DB engines and query languages:

◮ DEX/Sparksee - basic algebra

◮ IBM System G - Gremlin

◮ Neo4J - Cypher

◮ Oracle PGX - PGQL

◮ RDF stores (Virtuoso, AllegroGraph, Oracle, IBM) - SPARQL

[1] Graph Databases. Robinson, Webber, & Eifrem. O’Reilly, 2013.

Page 11: Center for Semantic Web Research - uchile.cl

What are graph databases good for?

◮ Flexible modelling of interconnected data

◮ Agile evolution of the data model

◮ Scalable evaluation of join-intensive queries

Page 13: Center for Semantic Web Research - uchile.cl

My personal story

◮ Since 2009: Working on the theory of query languages for graph DBs

◮ Since 2015: Member of the LDBC working group for the design of such a language

Conclusion:

◮ Theory and practice are more connected than expected

Page 14: Center for Semantic Web Research - uchile.cl

Objectives

◮ Identify topics of common interest for theoreticians and developers

◮ Formalize relevant concepts (syntax, semantics, terminology, etc.)

◮ Understand the tradeoff between expressiveness and efficiency

Page 15: Center for Semantic Web Research - uchile.cl

THE DATA MODEL: PROPERTY GRAPHS

Page 17: Center for Semantic Web Research - uchile.cl

The data model is important as it must be:

◮ Flexible enough to accommodate scenarios of practical interest

◮ Simple enough to allow for a clean presentation

◮ Expressive enough for theoretical issues to appear in full force

This is accomplished by the model of property graphs

Page 18: Center for Semantic Web Research - uchile.cl

A property graph

[Figure: a bibliographic property graph.

Author nodes (name, affil): Leonid Libkin, U. of Edinburgh; Robert Tarjan, Princeton Univ.; John Hopcroft, Cornell Univ.; Jeff Ullman, Stanford Univ.; Ron Fagin, IBM Almaden; Moshe Vardi, Rice University; Limsoon Wong, NU Singapore.

Article nodes (name, year): jacmHopcroft74, 1974; FOCSHopcroft67, 1967; PODSUllman89, 1989; PODSFaginUV83, 1983; PODSVardi95, 1995; PODSLibkin95, 1995; IPLLibkinW95, 1995.

Venue nodes (name): FOCS67, PODS89, PODS83, PODS95.

Journal nodes (name): JACM, IPL.

Conference nodes (name): FOCS, PODS.

Edge labels: authored by (some edges carry the attribute corresponding: YES), part of, journal, series.]

Page 19: Center for Semantic Web Research - uchile.cl

What is a property graph then?

◮ It is a graph

◮ It is directed

◮ Its nodes and edges are labeled

◮ Its nodes and edges can carry attributes (properties)
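
The four points above can be made concrete with a small data structure. The following is only a minimal sketch: the class and method names (`PropertyGraph`, `add_node`, `add_edge`) are hypothetical, the edge direction (Article to Author) is an assumption, and which `authored by` edge carries `corresponding: YES` is chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    label: str                                   # e.g. "Author"
    props: dict = field(default_factory=dict)    # attribute name -> value


@dataclass
class Edge:
    edge_id: str
    src: str                                     # id of the source node
    dst: str                                     # id of the target node
    label: str                                   # e.g. "authored by"
    props: dict = field(default_factory=dict)


@dataclass
class PropertyGraph:
    nodes: dict = field(default_factory=dict)    # node id -> Node
    edges: dict = field(default_factory=dict)    # edge id -> Edge

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = Node(node_id, label, props)

    def add_edge(self, edge_id, src, dst, label, **props):
        # Edges are directed and may only connect existing nodes.
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self.edges[edge_id] = Edge(edge_id, src, dst, label, props)


g = PropertyGraph()
g.add_node("a1", "Author", name="Ron Fagin", affil="IBM Almaden")
g.add_node("p1", "Article", name="PODSFaginUV83", year=1983)
# Assumed direction (Article -> Author); the attribute is illustrative only.
g.add_edge("e1", "p1", "a1", "authored by", corresponding="YES")
```

Giving edges their own identifiers, rather than storing them as bare (source, target) pairs, is what lets an edge carry a label and attributes of its own.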

Page 24: Center for Semantic Web Research - uchile.cl

GRAPH PATTERNS

Page 26: Center for Semantic Web Research - uchile.cl

Graph patterns are:

The basic unit for querying property graphs

Definition of graph pattern:

◮ A directed graph

◮ Nodes are given by variables x, y, z, ...

◮ Edges are given by variables X, Y, Z, ...

◮ Nodes and edges satisfy label and attribute constraints (selection)

  ◮ E.g., l(x) = Author, l(Y) = authored by & Y@corresponding = YES

◮ Some of the variables are selected (projection)

Page 32: Center for Semantic Web Research - uchile.cl

Example of a graph pattern

Find pairs of authors who coauthored a paper in PODS83:

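[Figure: graph pattern with two Author nodes, each connected by an authored by edge to an Article node, which is connected by a part of edge to a Venue node with name: PODS83]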

Page 34: Center for Semantic Web Research - uchile.cl

More examples

Get the common friends of Peter, John and Mary:

[Figure: graph pattern with three Person nodes (name: Mary, name: John, name: Peter), each connected by a friend of edge to a fourth Person node]

Find friends of John who are (1) mutual friends, and (2) have lovers that are colleagues:

[Figure: graph pattern with five Person nodes, one of them with name: John, connected by friend of, lover of and works with edges]

Page 35: Center for Semantic Web Research - uchile.cl

Evaluation of graph patterns

1. Find all matchings of the pattern over the property graph

2. Project over the variables in the output
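
The two steps can be sketched as a brute-force evaluator. This is an illustration, not an engine's algorithm. It assumes no injectivity restriction on the mapping, drops edge variables and edge attributes for brevity, and uses a hypothetical fragment of the bibliographic graph in which PODSFaginUV83 is authored by Fagin, Ullman and Vardi (suggested by the article name) with edges directed from Article to Author.

```python
from itertools import product

# Data graph: node id -> (label, properties); edges as (source, label, target).
nodes = {
    "fagin": ("Author", {"name": "Ron Fagin"}),
    "ullman": ("Author", {"name": "Jeff Ullman"}),
    "vardi": ("Author", {"name": "Moshe Vardi"}),
    "art": ("Article", {"name": "PODSFaginUV83"}),
    "venue": ("Venue", {"name": "PODS83"}),
}
edges = {
    ("art", "authored by", "fagin"),
    ("art", "authored by", "ullman"),
    ("art", "authored by", "vardi"),
    ("art", "part of", "venue"),
}

# Pattern: node variable -> (required label, required attribute values).
pattern_nodes = {
    "x": ("Author", {}),
    "y": ("Author", {}),
    "a": ("Article", {}),
    "v": ("Venue", {"name": "PODS83"}),
}
pattern_edges = [
    ("a", "authored by", "x"),
    ("a", "authored by", "y"),
    ("a", "part of", "v"),
]
output = ("x", "y")  # projected variables


def satisfies(node_id, label, constraints):
    """Selection: check the label and attribute constraints on one node."""
    node_label, props = nodes[node_id]
    return node_label == label and all(
        props.get(key) == value for key, value in constraints.items()
    )


def evaluate():
    variables = list(pattern_nodes)
    candidates = [
        [n for n in nodes if satisfies(n, *pattern_nodes[var])]
        for var in variables
    ]
    answers = set()
    # Step 1: enumerate every assignment that preserves all pattern edges.
    for combo in product(*candidates):
        m = dict(zip(variables, combo))
        if all((m[s], lab, m[t]) in edges for s, lab, t in pattern_edges):
            # Step 2: project the matching onto the output variables.
            answers.add(tuple(m[var] for var in output))
    return answers


result = evaluate()
```

With no restriction on the mapping, `x` and `y` may be sent to the same author, so the result contains all nine ordered pairs over the three authors, including pairs such as `("fagin", "fagin")`.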

Page 38: Center for Semantic Web Research - uchile.cl

An example of evaluation

The property graph

[Figure: the bibliographic property graph shown earlier, with Author, Article, Venue, Journal and Conference nodes and authored by, part of, journal and series edges]

Page 39: Center for Semantic Web Research - uchile.cl

An example of evaluation

The graph pattern

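[Figure: graph pattern with two Author nodes, each connected by an authored by edge to an Article node, which is connected by a part of edge to a Venue node with name: PODS83]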

Page 40: Center for Semantic Web Research - uchile.cl

An example of evaluation

A matching

[Figure: the bibliographic property graph with one matching of the pattern highlighted]

Page 41: Center for Semantic Web Research - uchile.cl

An example of evaluation

Its projection

[Figure: the bibliographic property graph with the projection of the previous matching highlighted]

Page 42: Center for Semantic Web Research - uchile.cl

An example of evaluation

Another matching

[Figure: the bibliographic property graph with a second matching of the pattern highlighted]

Page 43: Center for Semantic Web Research - uchile.cl

An example of evaluation

Its projection

[Figure: the bibliographic property graph with the projection of the second matching highlighted]

Page 45: Center for Semantic Web Research - uchile.cl

But what is a matching?

A mapping from:

◮ nodes of the pattern to nodes of the graph, and

◮ edges of the pattern to edges of the graph

which preserves the structure of the pattern in the graph

Page 47: Center for Semantic Web Research - uchile.cl

Different notions of matching

Homomorphism: No restriction on the mapping

◮ Distinct (≠) nodes in the pattern can map to the same node in the graph

Mostly studied in database theory (PODS)

Page 49: Center for Semantic Web Research - uchile.cl

Different notions of matching

Isomorphism: Mapping is injective

◮ Distinct (≠) elements in the pattern map to distinct (≠) elements in the graph

Mostly studied in database systems (SIGMOD)

Page 51: Center for Semantic Web Research - uchile.cl

Different notions of matching

Edge-injective: Self-describing

◮ Distinct (≠) edges in the pattern map to distinct (≠) edges in the graph

Implemented in some graph DB engines (Neo4J)
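
The three notions can be compared on a toy instance. The sketch below is a hypothetical example, not taken from the slides: a path pattern x -X-> y -Y-> z is matched against a four-edge graph, labels are ignored, and every structure-preserving mapping is enumerated before counting how many survive each injectivity requirement.

```python
from itertools import product

# Toy data graph: edge id -> (source, target). Labels are omitted.
graph_edges = {
    "e1": ("u", "w"),
    "e2": ("w", "u"),
    "e3": ("v", "v"),   # self-loop
    "e4": ("w", "v"),
}

# Pattern: edge variable -> (source variable, target variable).
pattern_edges = {"X": ("x", "y"), "Y": ("y", "z")}


def matchings():
    """Yield (node_map, edge_map) pairs that preserve edge endpoints."""
    names = list(pattern_edges)
    for combo in product(graph_edges, repeat=len(names)):
        node_map = {}
        consistent = True
        for name, edge_id in zip(names, combo):
            for var, target in zip(pattern_edges[name], graph_edges[edge_id]):
                # A node variable must receive the same image everywhere.
                if node_map.setdefault(var, target) != target:
                    consistent = False
        if consistent:
            yield node_map, dict(zip(names, combo))


def injective(mapping):
    return len(set(mapping.values())) == len(mapping)


all_matchings = list(matchings())
homomorphisms = len(all_matchings)
edge_injective = sum(1 for _, edge_map in all_matchings if injective(edge_map))
node_injective = sum(1 for node_map, _ in all_matchings if injective(node_map))
```

On this instance the counts are 5, 4 and 1 respectively: traversing the self-loop twice is a homomorphism but not edge-injective, and going back and forth between u and w is edge-injective but not injective on nodes.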

Page 54: Center for Semantic Web Research - uchile.cl

Is there a matching?

An NP-complete problem

How to address this problem?

Page 56: Center for Semantic Web Research - uchile.cl

Solution 1: Restriction on graph patterns

In many applications, graph patterns are tame:

◮ Homomorphism/isomorphism can be solved efficiently for them

Tame: The underlying graph is almost acyclic

◮ Bounded treewidth (database/graph theory)

Page 58: Center for Semantic Web Research - uchile.cl

Solution 2: Heuristics for real-world datasets

Structural optimization techniques for reducing search space:

◮ Join ordering, pruning, indexes (database systems)

Real databases have structure that can be exploited

Page 60: Center for Semantic Web Research - uchile.cl

Solution 3: Inexact evaluation

Use weaker forms of matching that can be evaluated efficiently:

◮ Bisimulations, approximations (database theory/systems)

Compromise the quality of the answer in favor of efficiency

Page 62: Center for Semantic Web Research - uchile.cl

Solution 4: Use different notions of complexity

Graphs and patterns are different beasts:

◮ Graphs are BIG, patterns are small

Assume pattern is fixed (data complexity/database theory):

◮ Matching can then be solved in time polynomial in the size of the graph
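
A back-of-the-envelope bound, not stated on the slide, shows why fixing the pattern helps. Assume a pattern P with k node variables, a graph G with n nodes, and a brute-force evaluator that tries every assignment of variables to nodes and checks each one in time polynomial in the input:

```latex
\[
  T(G, P) \;=\; O\!\left( n^{k} \cdot \mathrm{poly}\bigl(|P| + |G|\bigr) \right)
\]
```

For a fixed pattern, k is a constant and the bound is polynomial in n. The exponent k is where the hardness of the general problem sits, since it grows with the pattern.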

Page 63: Center for Semantic Web Research - uchile.cl

Solution modifiers on graph patterns

Relational operations:

◮ Union

◮ Difference

◮ Cartesian product

Page 64: Center for Semantic Web Research - uchile.cl

Solution modifiers on graph patterns

Relational operations:

◮ Union

◮ Difference

◮ Cartesian product

The language becomes relationally complete

Page 65: Center for Semantic Web Research - uchile.cl

OPTIONAL: An important solution modifier

Allows matching parts of the data only if they are available

◮ Important in the context of semistructured data

◮ Developed by the RDF community

◮ Corresponds to left-outer join in relational algebra
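
The correspondence with the left-outer join can be sketched over sets of variable bindings. This is a minimal illustration: the function name `optional` is hypothetical, and the data (two papers published in STOC85, with a review attached to Saf85 only) is an assumed instance chosen for the example.

```python
# Edge relations of a small hypothetical graph.
published_in = [("AKV85", "STOC85"), ("Saf85", "STOC85")]
evaluated_in = [("Saf85", "review1")]   # assumption: only Saf85 has a review


def optional(left, right):
    """Left-outer join of two lists of bindings (dicts) on shared variables."""
    result = []
    for l in left:
        extended = [
            {**l, **r}
            for r in right
            if all(l[var] == r[var] for var in l.keys() & r.keys())
        ]
        # Keep the left binding unextended when nothing on the right joins.
        result.extend(extended if extended else [l])
    return result


# (x, published in, y) OPTIONAL (x, evaluated in, z)
left = [{"x": paper, "y": conf} for paper, conf in published_in]
right = [{"x": paper, "z": review} for paper, review in evaluated_in]
answers = optional(left, right)
```

Here `answers` contains one binding for AKV85 with `z` left unbound and one binding for Saf85 that includes its review, so no paper is lost for lacking the optional part.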

Page 67: Center for Semantic Web Research - uchile.cl

An example with OPTIONAL

The property graph

[Figure: property graph with two Paper nodes (name: AKV85, year: 1985; name: Saf85, year: 1985), a Conference node (name: STOC85) and a Review node (text: "This paper..."), connected by two published in edges and one evaluated in edge]

The pattern with optional

(x, published in, y) OPTIONAL (x, evaluated in, z)

Page 68: Center for Semantic Web Research - uchile.cl

An example with OPTIONAL

A matching

[Figure: the property graph above with one matching of the pattern highlighted]

The pattern with optional

(x, published in, y) OPTIONAL (x, evaluated in, z)

Page 69: Center for Semantic Web Research - uchile.cl

An example with OPTIONAL

Another matching

[Figure: the property graph above with a second matching of the pattern highlighted]

The pattern with optional

(x, published in, y) OPTIONAL (x, evaluated in, z)

Page 70: Center for Semantic Web Research - uchile.cl

Conclusion

◮ Graph patterns are a versatile and simple language for querying PGs

◮ Graph pattern evaluation comes in different flavors

◮ This problem is challenging (theory/practice)

◮ Different operators can be added in order to increase expressiveness

Page 71: Center for Semantic Web Research - uchile.cl

NAVIGATION

Page 72: Center for Semantic Web Research - uchile.cl

Graphs are there to be navigated

Recall our property graph?

[Figure: the bibliographic property graph shown earlier, with Author, Article, Venue, Journal and Conference nodes and authored by, part of, journal and series edges]

Page 73: Center for Semantic Web Research - uchile.cl

Graphs are there to be navigated

Find pairs of authors linked by a coauthorship sequence

[Figure: pattern with two Author nodes connected by a path whose label is a repetition of (authored by⁻¹ authored by)]

Page 76: Center for Semantic Web Research - uchile.cl

Do practical query languages navigate?

Very little:

◮ Check if there is a directed path between two nodes (DFS)

The previous query cannot be expressed (save for a few exceptions)

No support for regular path queries (database theory, RDF, DL)

◮ Is there a directed path whose label satisfies a regex?

Page 79: Center for Semantic Web Research - uchile.cl

Are RPQs harder to evaluate?

Not really (in theory):

◮ Convert the regex into an automaton

◮ Take the cross product of the property graph and the automaton

◮ Check if there is a path from an initial to a final state (DFS)

Cost is O(|G| · |r|): linear in the size of the data and in the size of the regex
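
The three steps above can be sketched as follows. Several simplifications are assumptions of this sketch: the automaton for the coauthorship expression (authored by⁻¹ authored by)⁺ is written by hand instead of being compiled from a regex, the product is explored on the fly with a breadth-first search (any graph search works), the toy graph with authors A to D is hypothetical, and edges are scanned linearly, so the linear bound would additionally need an adjacency index.

```python
from collections import deque

# Toy graph: (source, label, target), with edges directed Article -> Author.
edges = [
    ("p1", "authored by", "A"),
    ("p1", "authored by", "B"),
    ("p2", "authored by", "B"),
    ("p2", "authored by", "C"),
    ("p3", "authored by", "D"),
]

# Hand-built NFA for (authored by^-1 authored by)+ : state -> [(symbol, next)].
nfa = {
    0: [("authored by^-1", 1)],
    1: [("authored by", 2)],
    2: [("authored by^-1", 1)],
}
initial, final = 0, {2}


def step(node, symbol):
    """Nodes reached from `node` by one edge labeled `symbol`.

    A trailing '^-1' means the edge is traversed backwards.
    """
    if symbol.endswith("^-1"):
        base = symbol[: -len("^-1")]
        return [s for s, lab, t in edges if lab == base and t == node]
    return [t for s, lab, t in edges if lab == symbol and s == node]


def rpq(source):
    """All nodes reachable from `source` along a path accepted by the NFA."""
    start = (source, initial)
    seen = {start}
    queue = deque([start])
    answers = set()
    while queue:
        node, state = queue.popleft()
        if state in final:
            answers.add(node)
        for symbol, next_state in nfa[state]:
            for next_node in step(node, symbol):
                pair = (next_node, next_state)
                if pair not in seen:        # each product state is visited once
                    seen.add(pair)
                    queue.append(pair)
    return answers
```

In this toy graph `rpq("A")` returns A, B and C. A is linked to itself because arbitrary paths may revisit nodes, and C is reached through the shared coauthor B. D is only linked to itself.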

Page 84: Center for Semantic Web Research - uchile.cl

But, what is the semantics?

Is there a path or a simple path?

◮ Database theory concentrates on the former (why?)

◮ Graph DB engines implement the latter (why?)

Our algorithm evaluates RPQs under arbitrary path semantics

Is it possible to use it under a simple path semantics?

Page 88: Center for Semantic Web Research - uchile.cl

RPQs under simple paths

Is there a simple path whose label satisfies a regex?

◮ This problem is NP-complete even if the regex is fixed!

◮ The high complexity depends only on the size of the data

Example: Consider the RPQ (aa)∗

1. It asks whether there is a simple path of even length from x to y

2. This problem is NP-complete
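
The gap between the two semantics for (aa)∗ shows up already on a four-node graph. The sketch below uses a hypothetical graph in which every edge has label a. Arbitrary paths are handled with the product construction, using a two-state parity automaton, while simple paths are handled by exhaustive backtracking, which takes exponential time in the worst case.

```python
# Hypothetical graph; every edge has label a. Adjacency: node -> successors.
edges = {"x": ["y"], "y": ["a"], "a": ["b"], "b": ["y"]}


def even_walk_exists(src, dst):
    """Arbitrary-path semantics: search pairs (node, parity of path length)."""
    seen = {(src, 0)}
    stack = [(src, 0)]
    while stack:
        node, parity = stack.pop()
        if node == dst and parity == 0:
            return True
        for nxt in edges.get(node, []):
            item = (nxt, 1 - parity)
            if item not in seen:
                seen.add(item)
                stack.append(item)
    return False


def even_simple_path_exists(node, dst, visited=frozenset(), length=0):
    """Simple-path semantics: backtracking over paths with no repeated node."""
    if node == dst:
        # A simple path cannot pass through dst and come back to it.
        return length % 2 == 0
    visited = visited | {node}
    return any(
        even_simple_path_exists(nxt, dst, visited, length + 1)
        for nxt in edges.get(node, [])
        if nxt not in visited
    )


walk_x_y = even_walk_exists("x", "y")
simple_x_y = even_simple_path_exists("x", "y")
```

From x to y there is an even-length path x, y, a, b, y that revisits y, but the only simple path has length 1, so `walk_x_y` is true while `simple_x_y` is false.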

Page 89: Center for Semantic Web Research - uchile.cl

Adding RPQs to graph patterns

Gives rise to the class of conjunctive RPQs

◮ Has received considerable attention in theory

◮ (Essentially) unexplored from a practical point of view

◮ Challenging because of matching and RPQ evaluation

Page 94: Center for Semantic Web Research - uchile.cl

Are paths the only form of navigation?

No! We can allow stronger forms of recursion (DB theory, DL, MC):

◮ Navigate with branching (nested regexes, evaluation similar to RPQs)

◮ Allow transitive closure on top of conjunctive RPQs (regular queries, harder to evaluate)

◮ Allow arbitrary recursion (datalog, very hard to evaluate, non-parallelizable)

Page 98: Center for Semantic Web Research - uchile.cl

Conclusions (really, questions)

◮ Are RPQs useful in practice?

◮ Can they be implemented?

◮ Under which semantics?

◮ What about conjunctive RPQs?

◮ Or even stronger forms of recursion?

Page 104: Center for Semantic Web Research - uchile.cl

RETURNING PATHS

Page 107: Center for Semantic Web Research - uchile.cl

Paths in the output

Query languages such as Cypher & PGQL allow:

◮ Return one path

◮ Return one shortest path (DFS)

◮ Return all paths

◮ Return all shortest paths

But, what does it mean to return all paths?

... there can be infinitely many

Systems return simple paths

... but there can be exponentially many
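A small experiment shows the exponential blow-up for simple paths. The graph family below (a chain of n "diamonds") and both function names are assumptions chosen for illustration: the graph has only 3n + 1 nodes, yet the number of simple paths from one end to the other doubles with every diamond.

```python
def diamond_chain(n):
    """Adjacency lists for n diamonds in a row: 3n + 1 nodes, 4n edges."""
    adj = {}
    for i in range(n):
        adj[f"s{i}"] = [f"a{i}", f"b{i}"]   # two ways to leave s_i ...
        adj[f"a{i}"] = [f"s{i + 1}"]        # ... both arriving at s_{i+1}
        adj[f"b{i}"] = [f"s{i + 1}"]
    adj[f"s{n}"] = []
    return adj

def count_simple_paths(adj, node, target, on_path=None):
    """Count paths from node to target that never repeat a node (plain DFS)."""
    on_path = on_path if on_path is not None else set()
    if node == target:
        return 1
    on_path.add(node)
    total = sum(count_simple_paths(adj, nxt, target, on_path)
                for nxt in adj[node] if nxt not in on_path)
    on_path.remove(node)
    return total

counts = [count_simple_paths(diamond_chain(n), "s0", f"s{n}") for n in range(1, 6)]
```

Adding a single cycle to such a graph would make the set of arbitrary (non-simple) paths infinite, which is the first problem raised above.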

Page 108: Center for Semantic Web Research - uchile.cl

Even more interesting for RPQs

◮ Return a shortest path whose label satisfies a regex

◮ And all shortest paths

◮ Return all paths whose label satisfies a regex

◮ And all paths
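For the first item, one standard way to find a shortest path whose label satisfies a regex is a breadth-first search over pairs (graph node, automaton state). The sketch below assumes the regex has already been compiled into a DFA given as a dictionary; the graph, the labels and the function name are hypothetical. "Shortest" means fewest edges among arbitrary paths, so the result may repeat a node.

```python
from collections import deque

def shortest_matching_path(edges, dfa, start_state, accepting, source, target):
    """Return [node, label, node, ...] for a shortest matching path, or None."""
    adj = {}
    for u, label, v in edges:
        adj.setdefault(u, []).append((label, v))
    first = (source, start_state)
    parent = {first: None}          # product state -> (previous state, edge label)
    queue = deque([first])
    while queue:
        node, state = queue.popleft()
        if node == target and state in accepting:
            path, cur = [node], (node, state)
            while parent[cur] is not None:      # walk the parent pointers back
                prev, label = parent[cur]
                path[:0] = [prev[0], label]
                cur = prev
            return path
        for label, nxt in adj.get(node, []):
            nstate = dfa.get((state, label))
            succ = (nxt, nstate)
            if nstate is not None and succ not in parent:
                parent[succ] = ((node, state), label)
                queue.append(succ)
    return None

# Hypothetical transport network: two bus hops or three train hops from a to d.
edges = [("a", "bus", "b"), ("b", "bus", "d"),
         ("a", "train", "c"), ("c", "train", "e"), ("e", "train", "d")]
train_plus = {(0, "train"): 1, (1, "train"): 1}                      # train+
any_plus = {(q, l): 1 for q in (0, 1) for l in ("bus", "train")}     # (bus|train)+
by_train = shortest_matching_path(edges, train_plus, 0, {1}, "a", "d")
by_any = shortest_matching_path(edges, any_plus, 0, {1}, "a", "d")
```

The regex changes which path counts as shortest: without a constraint the two-hop bus route wins, while `train+` forces the three-hop route.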

Page 112: Center for Semantic Web Research - uchile.cl

A solution from database theory

Instead of returning all paths ...

return a compact representation of them

Compact representation: A property graph with all paths in the output

◮ Can be constructed efficiently (in data) for arbitrary paths

◮ Impossible for simple paths
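One way such a compact representation can be built, sketched here under assumptions of our own, is to take the product of the graph with an automaton for the regex and keep only the product edges that lie on some matching path. The result has at most |edges| x |states| edges even when the set of matching paths is infinite. The DFA encoding, the toy graph and the function name are hypothetical.

```python
def path_graph(edges, dfa, start_state, accepting, source, target):
    """Product edges lying on some matching path from source to target."""
    states = {q for q, _ in dfa} | set(dfa.values()) | {start_state}
    product = []
    for u, label, v in edges:
        for q in states:
            q2 = dfa.get((q, label))
            if q2 is not None:
                product.append(((u, q), label, (v, q2)))

    def reach(seeds, forward):
        # Simple quadratic reachability; fine for a sketch, still polynomial.
        seen, stack = set(seeds), list(seeds)
        while stack:
            cur = stack.pop()
            for a, _, b in product:
                src, dst = (a, b) if forward else (b, a)
                if src == cur and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    useful = (reach({(source, start_state)}, True)
              & reach({(target, q) for q in accepting}, False))
    return [e for e in product if e[0] in useful and e[2] in useful]

# Hypothetical graph with a cycle a <-> b; regex x*y as a DFA.
edges = [("a", "x", "b"), ("b", "x", "a"), ("b", "y", "c"), ("a", "y", "d")]
x_star_y = {(0, "x"): 0, (0, "y"): 1}
compact = path_graph(edges, x_star_y, 0, {1}, "a", "c")
```

In this example infinitely many paths from a to c match `x*y` (the cycle can be repeated), yet three product edges describe all of them. No comparable pruning is known to work for simple paths, in line with the second bullet.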

Page 113: Center for Semantic Web Research - uchile.cl

Conclusions

◮ Returning paths is difficult under all interpretations

◮ “All paths” can be compactly represented, but simple paths cannot

◮ The right semantics still needs to be settled

Page 114: Center for Semantic Web Research - uchile.cl

UNGROUPING

Page 119: Center for Semantic Web Research - uchile.cl

Paths can appear in the output

We can ungroup them

◮ List the nodes that appear in them

We can check if a path visits all nodes

◮ Travelling salesman problem!

Interaction with other operators expresses even more complex properties
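A minimal sketch of ungrouping, assuming a path is returned as a list of (source, label, target) edges; both function names are hypothetical. Checking one given path is cheap. The hardness comes from asking whether some simple path with this property exists, which is the Hamiltonian path problem.

```python
def ungroup(path):
    """List the nodes of a path, in order of appearance (repeats included)."""
    if not path:
        return []
    nodes = [path[0][0]]
    nodes.extend(target for _, _, target in path)
    return nodes

def visits_all_nodes(path, all_nodes):
    """True if every node of the graph occurs somewhere on the path."""
    return set(all_nodes) <= set(ungroup(path))

# Hypothetical path returned by a query.
path = [("a", "r", "b"), ("b", "r", "c"), ("c", "r", "a")]
nodes_on_path = ungroup(path)
covers_small = visits_all_nodes(path, {"a", "b", "c"})
covers_large = visits_all_nodes(path, {"a", "b", "c", "d"})
```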

Page 126: Center for Semantic Web Research - uchile.cl

A query language for paths

◮ Variables for paths and nodes

◮ Can check if a node belongs to a path

◮ Can check if the label of a path satisfies a regex

◮ Closure under quantification over nodes and paths

Complexity of evaluation is astronomical (in data)

◮ (Severely) restricting the language leads to efficiency
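To give a feel for why unrestricted path quantification is expensive, the sketch below evaluates one tiny query of this kind by brute force: "there is a path from s to t whose label matches a regex and which contains node z". It enumerates every path up to a length bound, so the work grows exponentially with the bound. The length bound, the single-character edge labels (so that Python's `re` module can match the label word) and all names are assumptions made for illustration.

```python
import re

def paths_up_to(edges, source, max_len):
    """Yield (nodes, label_word) for every path from source with <= max_len edges."""
    adj = {}
    for u, label, v in edges:
        adj.setdefault(u, []).append((label, v))
    frontier = [((source,), "")]
    for length in range(max_len + 1):
        yield from frontier
        if length == max_len:
            break
        frontier = [(nodes + (v,), word + label)
                    for nodes, word in frontier
                    for label, v in adj.get(nodes[-1], [])]

def exists_path(edges, source, target, regex, must_visit, max_len):
    """Brute-force check of: exists p from source to target, label(p) in regex, must_visit in p."""
    pattern = re.compile(regex)
    return any(
        nodes[-1] == target and must_visit in nodes and pattern.fullmatch(word)
        for nodes, word in paths_up_to(edges, source, max_len)
    )

# Hypothetical graph with single-character labels.
edges = [("a", "x", "b"), ("b", "x", "c"), ("a", "y", "c"), ("c", "x", "a")]
via_b_with_x = exists_path(edges, "a", "c", "x+", "b", max_len=4)
via_b_with_y = exists_path(edges, "a", "c", "y", "b", max_len=4)
```

A restricted language can avoid this enumeration, for instance by pushing the regex and membership tests into a product-automaton search instead of materializing paths.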

Page 127: Center for Semantic Web Research - uchile.cl

Question

◮ What can ungrouping do, and what should it do?

Page 128: Center for Semantic Web Research - uchile.cl

FINAL THOUGHTS

Page 129: Center for Semantic Web Research - uchile.cl

Final thoughts

◮ Exciting times for studying graph DBs (theory/practice)

◮ Lots of fine tuning needed

◮ Some issues still unexplored:

  ◮ Comparing paths

  ◮ Ranking of answers

  ◮ Constraints

Page 130: Center for Semantic Web Research - uchile.cl

MANY THANKS

Page 131: Center for Semantic Web Research - uchile.cl

Bibliography

Surveys:

◮ P. Wood. Query languages for graph databases. SIGMOD Record, 2012.

◮ P. Barcelo. Querying graph databases. PODS, 2013.

Regular path queries:

◮ A. Mendelzon and P. Wood. Finding regular simple paths in graph databases. SIAM J. on Computing, 1995.

◮ D. Calvanese, G. de Giacomo, M. Lenzerini, M. Vardi. Reasoning on regular path queries. SIGMOD Record, 2003.

Conjunctive regular path queries:

◮ D. Calvanese, G. de Giacomo, M. Lenzerini, M. Vardi. Containment of conjunctive regular path queries with inverse. KR, 2000.

Page 132: Center for Semantic Web Research - uchile.cl

Bibliography

Queries with more expressive recursion:

◮ P. Barcelo, J. Perez, J. Reutter. Relative expressiveness of nested regular expressions. AMW, 2012.

◮ J. Reutter, M. Romero, M. Vardi. Regular queries on graph databases. ICDT, 2015.

Queries that compare nodes:

◮ L. Libkin, D. Vrgoc. Regular path queries on graphs with data. ICDT, 2012.

◮ G. Fletcher, M. Gyssens, D. Leinders, et al. Relative expressive power of navigational querying on graphs. ICDT, 2011.

◮ P. Barcelo, G. Fontaine, A. W. Lin. Expressive path queries over graphs with data. LMCS, 2015.

Queries with paths as first-class citizens and negation:

◮ P. Barcelo, L. Libkin, A. W. Lin, P. Wood. Expressive path queries over graph-structured data. ACM TODS, 2012.