Graph Databases for SQL Server Professionals Stéphane Fréchette

Graph Databases for SQL Server Professionals - SQLSaturday #350 Winnipeg


Presented on November 22, 2014 at SQLSaturday #350 in Winnipeg, MB, Canada. Graph databases are used to represent graph structures with nodes, edges and properties. Neo4j, an open-source graph database, is reliable and fast for managing and querying highly connected data. This session explores how to install and configure Neo4j, create nodes and relationships, query with the Cypher Query Language, import data, and use Neo4j in concert with SQL Server, providing answers and insight with visual diagrams about the connected data you have in your SQL Server databases. Session Level: Intermediate


Page 1: Graph Databases for SQL Server Professionals - SQLSaturday #350 Winnipeg

Graph Databases for SQL Server Professionals

Stéphane Fréchette


Who am I?

My name is Stéphane Fréchette

SQL Server MVP | Consultant | Speaker | Database & BI Architect | NoSQL. Drums, good food and fine wine. Founder @ukubu, @GatineauOuverte, @TEDxGatineau

I have a passion for architecting, designing and building solutions that matter.

Twitter: @sfrechette

Blog: stephanefrechette.com

Email: [email protected]

11/25/2014 | SQLSaturday Winnipeg #350


Thanks to Our Sponsors!

Gold

Silver

Bronze


Session Outline

What is a Graph?

What is Neo4j?

Data Modeling – The Property Graph

Cypher Query Language

Importing Data…

Use Cases

Demos

Resources


What is a Graph?


Are these Graphs?


This is a Graph

Node

Relationship

A Property Graph


Organization Project Graph


Twitter Social Graph


What is Neo4j?

An open-source graph database by Neo Technology. Neo4j stores data in nodes connected by directed, typed relationships, with properties on both, also known as a Property Graph

Fully ACID compliant

Massively scalable, up to several billion nodes/relationships/properties

Highly available when distributed across multiple machines

Accessible by a convenient REST interface or an object-oriented Java API
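
As a rough illustration of the REST interface, the sketch below only builds a Cypher request and does not send it. It assumes a local server at http://localhost:7474 and the transactional endpoint /db/data/transaction/commit documented for Neo4j 2.x. The helper name build_cypher_request and the Member label are made up for this example, so check the endpoint against the documentation for your Neo4j version.

```python
import json


def build_cypher_request(statement, parameters=None, base_url="http://localhost:7474"):
    """Build (url, headers, body) for one Cypher statement.

    Assumption: the server exposes the Neo4j 2.x transactional endpoint
    at /db/data/transaction/commit. Nothing is sent over the network here.
    """
    url = base_url.rstrip("/") + "/db/data/transaction/commit"
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json; charset=UTF-8",
    }
    body = json.dumps(
        {"statements": [{"statement": statement, "parameters": parameters or {}}]}
    )
    return url, headers, body


# Neo4j 2.x writes parameters as {name}; the Member label is hypothetical.
url, headers, body = build_cypher_request(
    "MATCH (m:Member {name: {name}}) RETURN m", {"name": "Stephane"}
)
print(url)
print(body)
```

The returned triple can be handed to any HTTP client; keeping the request construction separate from the transport makes the payload easy to inspect before anything touches the database.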


Data Modeling

From SQL Server to Graph

Property Graph


Example: Meetup Data In SQL Server

Member

ID  Member
1   Daniel
2   Stephane
3   John
4   Randy

Meetup

ID  Name
1   Ottawa SQL Server User Group
2   Ottawa JavaScript
3   Ottawa Visio User Group
4   Ottawa Tableau User Group
5   Dirty Dancing Ottawa

MeetupOrganizer

MemberID  MeetupID
2         1
1         2
3         3
2         4
3         5

MeetupMember

MemberID  MeetupID
3         1
3         2
4         2
4         4
1         5


Example: Meetup Data In a Graph

Member

name: ‘Stephane’

name: ‘John’

name: ‘Randy’

name: ‘Daniel’

Meetup

name: ‘Ottawa Tableau User Group’

name: ‘Ottawa SQL Server User Group’

name: ‘Ottawa JavaScript’

name: ‘Dirty Dancing Ottawa’

name: ‘Ottawa Visio User Group’
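
One way to picture the move from the four SQL Server tables to a property graph is sketched below: entity rows become labelled nodes with a name property, and each row of a join table becomes a directed, typed relationship. The relationship type names ORGANIZES and MEMBER_OF and the function to_property_graph are assumptions made for this example, not names taken from the slides.

```python
# The four SQL Server tables from the Meetup example, as plain Python data.
members = {1: "Daniel", 2: "Stephane", 3: "John", 4: "Randy"}
meetups = {
    1: "Ottawa SQL Server User Group",
    2: "Ottawa JavaScript",
    3: "Ottawa Visio User Group",
    4: "Ottawa Tableau User Group",
    5: "Dirty Dancing Ottawa",
}
meetup_organizer = [(2, 1), (1, 2), (3, 3), (2, 4), (3, 5)]  # (MemberID, MeetupID)
meetup_member = [(3, 1), (3, 2), (4, 2), (4, 4), (1, 5)]     # (MemberID, MeetupID)


def to_property_graph():
    """Map entity rows to labelled nodes and join-table rows to relationships."""
    nodes = {}
    for member_id, name in members.items():
        nodes[("Member", member_id)] = {"name": name}
    for meetup_id, name in meetups.items():
        nodes[("Meetup", meetup_id)] = {"name": name}

    # ORGANIZES and MEMBER_OF are assumed relationship type names.
    rels = [(("Member", m), "ORGANIZES", ("Meetup", g)) for m, g in meetup_organizer]
    rels += [(("Member", m), "MEMBER_OF", ("Meetup", g)) for m, g in meetup_member]
    return nodes, rels


nodes, rels = to_property_graph()

# Follow ORGANIZES relationships out of the node named Stephane.
organized_by_stephane = sorted(
    nodes[end]["name"]
    for start, rel_type, end in rels
    if rel_type == "ORGANIZES" and nodes[start]["name"] == "Stephane"
)
print(organized_by_stephane)
```

The join tables disappear as separate structures: what was a row in MeetupOrganizer or MeetupMember is now an edge that can be followed directly from a node.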


Cypher Query Language

Cypher is a declarative graph query language that allows for expressive and efficient querying and updating of the graph store

Pattern-matching

Declarative: what to retrieve, not how to retrieve it

Inspired by other well-known languages (SQL, SPARQL, Haskell, Python)

Aggregation, Ordering, Limit

Update the Graph
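
To make pattern matching, aggregation, ordering and limit a little more concrete, here is a small in-memory imitation of what a Cypher MATCH does. It is only a sketch: the match helper, the node keys and the MEMBER_OF relationship type are hypothetical, and a real graph database plans and executes the query itself, which is the point of a declarative language.

```python
from collections import Counter

# Tiny property graph: nodes carry a label and properties,
# relationships are directed and typed (start, type, end).
nodes = {
    "m3": {"label": "Member", "name": "John"},
    "m4": {"label": "Member", "name": "Randy"},
    "g2": {"label": "Meetup", "name": "Ottawa JavaScript"},
    "g4": {"label": "Meetup", "name": "Ottawa Tableau User Group"},
}
rels = [
    ("m3", "MEMBER_OF", "g2"),
    ("m4", "MEMBER_OF", "g2"),
    ("m4", "MEMBER_OF", "g4"),
]


def match(start_label, rel_type, end_label):
    """Return node pairs that fit the pattern (a:start)-[:rel_type]->(b:end)."""
    return [
        (nodes[start], nodes[end])
        for start, kind, end in rels
        if kind == rel_type
        and nodes[start]["label"] == start_label
        and nodes[end]["label"] == end_label
    ]


# Roughly equivalent Cypher (MEMBER_OF is an assumed relationship type):
#   MATCH (m:Member)-[:MEMBER_OF]->(g:Meetup)
#   RETURN g.name, count(m) AS members
#   ORDER BY members DESC
#   LIMIT 1
counts = Counter(meetup["name"] for _, meetup in match("Member", "MEMBER_OF", "Meetup"))
busiest = counts.most_common(1)
print(busiest)
```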


Cypher and T-SQL

Cypher also has a number of keywords with a direct equivalent in SQL, which makes it a curiously familiar language

WHERE

ORDER BY

LIMIT

SUM, COUNT, STDEVP, MIN, MAX etc…

LTRIM, UPPER, LOWER, REPLACE, LEFT, RIGHT, SUBSTRING

DISTINCT

CASE

(SQL Server Pros)-[:WILL_LOVE]->(Cypher)
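
The overlap in keywords is easiest to see side by side. The sketch below runs a SQL query with WHERE, COUNT, ORDER BY and LIMIT against an in-memory SQLite database (used here only because it ships with Python) and keeps a Cypher query of the same shape next to it as a string. Note that T-SQL itself writes the row limit as TOP or OFFSET ... FETCH rather than LIMIT. The MEMBER_OF relationship type and the lowercase name property are assumptions for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Member (ID INTEGER PRIMARY KEY, Member TEXT);
    CREATE TABLE MeetupMember (MemberID INTEGER, MeetupID INTEGER);
    INSERT INTO Member VALUES (1, 'Daniel'), (2, 'Stephane'), (3, 'John'), (4, 'Randy');
    INSERT INTO MeetupMember VALUES (3, 1), (3, 2), (4, 2), (4, 4), (1, 5);
""")

# SQL: the relationship is expressed through a join on the link table.
sql = """
    SELECT m.Member, COUNT(*) AS Meetups
    FROM Member AS m
    JOIN MeetupMember AS mm ON mm.MemberID = m.ID
    WHERE m.Member <> 'Daniel'
    GROUP BY m.Member
    ORDER BY Meetups DESC, m.Member
    LIMIT 2
"""
rows = conn.execute(sql).fetchall()
print(rows)

# Cypher: same WHERE / ORDER BY / LIMIT keywords, but the join becomes a
# pattern and grouping is implied by the non-aggregated return column.
cypher = """
MATCH (m:Member)-[:MEMBER_OF]->(g:Meetup)
WHERE m.name <> 'Daniel'
RETURN m.name AS member, count(g) AS meetups
ORDER BY meetups DESC, member
LIMIT 2
"""
print(cypher.strip())
```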


Cypher - Meetup


Neo4j Browser


Demo (let’s query some data…)


Importing Data…


Some important considerations…

Different import scenarios

Dataset size: 1,000s, 100,000s, 10,000,000s

Dataset format (source): Database, File (CSV, Spreadsheet, GraphML, Geoff), Service, Other

Import type: Initial Bulk Load, Incremental Load, Initial Bulk Load + Incremental Load

Different import tools

Spreadsheet based

Neo4j-shell based: (Cypher, neo4j-shell-tools, Cypher LOAD CSV)

Command-line based: Batch Importer

Neo4j Browser based

ETL Tools: (Talend, Mulesoft, Pentaho Kettle)

Custom software: (Java API, REST API, Spring Data Neo4j)


Many different mappings

Not always clear what you should be using

Depends on your skillsets, dataset size… (lots of other stuff)

Choose wisely!

Import Scenarios

Import Tools


Demo (walkthrough on importing data…)


The Sample Dataset


Importing using Spreadsheets

Very small size datasets < 1000, easy to use

Format data in spreadsheet

Generate Cypher statements with formulas

Copy and execute Cypher in Neo4j Browser
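
The spreadsheet technique is string concatenation: each row is turned into one Cypher statement by a formula. The same idea is sketched below in Python, assuming each row has the ID and Member columns from the Meetup example. The helper names, the Member label and the lowercase property names are illustrative choices, not taken from the demo.

```python
def cypher_string(value):
    """Quote a Python string as a single-quoted Cypher string literal."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def create_member_statement(row):
    """Build one CREATE statement per row, as a spreadsheet formula would."""
    return "CREATE (:Member {{id: {id}, name: {name}}});".format(
        id=int(row["ID"]), name=cypher_string(row["Member"])
    )


# Rows as they might sit in the spreadsheet (column names are assumptions).
rows = [
    {"ID": 1, "Member": "Daniel"},
    {"ID": 2, "Member": "Stephane"},
    {"ID": 3, "Member": "John"},
    {"ID": 4, "Member": "Randy"},
]

statements = [create_member_statement(row) for row in rows]
print("\n".join(statements))
```

The printed statements are what would be copied into the Neo4j Browser. Escaping quotes matters even for tiny datasets, since a name containing an apostrophe would otherwise break the generated statement.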


Importing using neo4j-shell-tools

Small to medium size datasets

https://github.com/jexp/neo4j-shell-tools

Format data in CSV files

Create import-cypher commands for neo4j-shell-tools

Execute commands from neo4j-shell


Importing using LOAD CSV

Native Cypher

Format data in CSV files

Create “LOAD CSV” commands

Execute command from neo4j-shell or browser

Additional “cleanup” for Labels and RelTypes
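
The first two steps can be sketched as follows: write the rows to a CSV file with a header line, then build the LOAD CSV statement that points at it. The file name members.csv, the Member label and the property names are assumptions for this example. toInt() is the Neo4j 2.x spelling (later versions use toInteger()), and how file URLs are resolved depends on the server version and configuration.

```python
import csv
import tempfile
from pathlib import Path

members = [
    {"ID": 1, "Member": "Daniel"},
    {"ID": 2, "Member": "Stephane"},
    {"ID": 3, "Member": "John"},
    {"ID": 4, "Member": "Randy"},
]


def write_members_csv(directory):
    """Step 1: format the data as a CSV file with a header row."""
    path = Path(directory) / "members.csv"
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["ID", "Member"])
        writer.writeheader()
        writer.writerows(members)
    return path


def load_csv_statement(csv_path):
    """Step 2: build a LOAD CSV statement that reads the file by URL."""
    url = csv_path.resolve().as_uri()
    return (
        f"LOAD CSV WITH HEADERS FROM '{url}' AS row\n"
        "MERGE (m:Member {id: toInt(row.ID)})\n"
        "SET m.name = row.Member"
    )


csv_path = write_members_csv(tempfile.mkdtemp())
statement = load_csv_statement(csv_path)
print(statement)
```

Every CSV field arrives in Cypher as a string, which is why the ID is converted explicitly. MERGE is used instead of CREATE so that running the statement twice does not duplicate nodes.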


Importing using Batch Importer

Non-transactional import, suited for very large datasets

Format data in TSV files

Execute Batch Import command

Copy store files to Neo4j Server directory

Start Neo4j Server with generated store files


Use Cases

Principal uses of graph databases include:

Network and Data Center Management (Queries: Impact Analysis, Root Cause Analysis, Quality-of-Service Mapping, Asset Management)

Authorization and Access (Queries: Access Management, Interconnected Group Organization, Provenance)

Social (Queries: Friend Recommendations, Sharing & Collaboration, Influencer Analysis)

Geo (Queries: Routing, Logistics, Capacity Planning)

Recommendations (Queries: Product, Social, Service, and Professional Recommendations)

Fraud Detection

http://www.neotechnology.com/neo4j-use-cases/
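
As a small illustration of the recommendation use case, the sketch below suggests meetups to a member based on what their co-members attend, using the MeetupMember data from the earlier example. The recommend function, the dictionary layout and the MEMBER_OF relationship type in the Cypher comment are assumptions; in a graph database this is a short traversal rather than hand-written loops.

```python
from collections import Counter

# Membership taken from the MeetupMember table in the Meetup example.
member_of = {
    "John": {"Ottawa SQL Server User Group", "Ottawa JavaScript"},
    "Randy": {"Ottawa JavaScript", "Ottawa Tableau User Group"},
    "Daniel": {"Dirty Dancing Ottawa"},
}


def recommend(member):
    """Suggest meetups attended by people who share a meetup with `member`."""
    mine = member_of[member]
    scores = Counter()
    for other, theirs in member_of.items():
        if other == member or not (mine & theirs):
            continue  # skip the member themselves and people with no overlap
        for meetup in theirs - mine:
            scores[meetup] += 1
    return [meetup for meetup, _ in scores.most_common()]


# Roughly equivalent Cypher (MEMBER_OF is an assumed relationship type):
#   MATCH (me:Member {name: 'John'})-[:MEMBER_OF]->()<-[:MEMBER_OF]-(other)
#         -[:MEMBER_OF]->(rec)
#   WHERE NOT (me)-[:MEMBER_OF]->(rec)
#   RETURN rec.name, count(*) AS score
#   ORDER BY score DESC
print(recommend("John"))
```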


Summary

(graphs)-[:ARE]->(everywhere)


Resources

Neo Technology http://www.neotechnology.com/

Neo4j.org (Learn, Develop, Downloads,…) http://www.neo4j.org/

Neo4j on Vimeo http://vimeo.com/neo4j

Neo4j on SlideShare http://www.slideshare.net/neo4j

Neo4j on Github https://github.com/neo4j

Neo4j Cypher Cheat Sheet http://docs.neo4j.org/refcard/2.1/

Neo4j Graph Database as a Service http://www.graphenedb.com/

Linkurious – The easiest way to explore graph databases http://linkurio.us/

KeyLines – Visualize dynamic networks http://keylines.com/

Experiments with NEO4J: Using a graph database as a SQL Server metadata hub http://bit.ly/V2PrxN

Kenny Bastani http://www.kennybastani.com/

Rik Van Bruggen http://blog.bruggen.com/

Max de Marzi http://maxdemarzi.com/

Better Software Development http://jexp.de/blog/

Graph Databases (Free Book) http://graphdatabases.com/

Neo4j GraphGist http://gist.neo4j.org/

GraphConnect Conference http://graphconnect.com/

Titan – Distributed Graph Database https://thinkaurelius.github.io/titan/

InfiniteGraph http://www.infinitegraph.com/

OrientDB http://www.orientechnologies.com/

Cayley by Google https://github.com/google/cayley


What Questions Do You Have?


Thank You

For attending this session
