Property Graphs: Neo4j and...

Preview:

Citation preview

PropertyGraphs:Neo4jandCypher

1

Neo4j-themostcommonlyusedgraphDBMS,byfar

https://db-engines.com/en/ranking/graph+dbms

Model:propertygraphs(graphswithnodesandedgescarryingmultipledatavaluesarrangedaskey-valuepairs)

Querylanguage:Cypher.ASCII-artpatternmatching+usualdatabasefeatures

Modellingtranslationsandcomplexaccessrules

3

EN:house

Modellingtranslationsandcomplexaccessrules

3

EN:house

house

Modellingtranslationsandcomplexaccessrules

3

EN:house

house

ES:casa

DE:Hause

SE:hus

Modellingtranslationsandcomplexaccessrules

3

EN:house

house

building

EN:building

ES:edificio DE:Gebäude

SE:byggnad

ES:casa

DE:Hause

SE:hus

WhiteboardFriendliness

Easy to design and model, direct representation of the model

Whiteboardfriendliness

Tom Hanks Hugo Weaving

Cloud AtlasThe Matrix

Lana Wachowski

ACTED_IN

ACTED_IN ACTED_IN

DIRECTED

DIRECTED

Whiteboardfriendliness

name: Tom Hanks born: 1956

title: Cloud Atlas released: 2012

title: The Matrix released: 1999

name: Lana Wachowski born: 1965

ACTED_IN roles: Zachry

ACTED_IN roles: Bill Smoke

DIRECTED

DIRECTED

ACTED_IN roles: Agent Smith

name: Hugo Weaving born: 1960

Person

MovieMovie

Person Director

ActorPerson Actor

Whiteboardfriendliness

Whiteboardfriendliness

Introtothepropertygraphmodel

Neo4jFundamentals

• Nodes

• Relationships

• Properties

• Labels

Car

PropertyGraphModelComponents

Nodes• Representtheobjectsinthegraph• Canbelabeled

Person Person

Car

DRIVES

PropertyGraphModelComponents

Nodes• Representtheobjectsinthegraph• Canbelabeled

Relationships• Relatenodesbytypeanddirection

LOVES

LOVES

LIVESWITH

OWNS

Person Person

Car

DRIVES

name:“Dan”born:May29,1970twitter:“@dan”

name:“Ann”born:Dec5,1975

since: Jan10,2011

brand:“Volvo”model:“V70”

PropertyGraphModelComponents

Nodes• Representtheobjectsinthegraph• Canbelabeled

Relationships• Relatenodesbytypeanddirection

Properties• Name-valuepairsthatcangoonnodesandrelationships.

LOVES

LOVES

LIVESWITH

OWNS

Person Person

Summaryofthegraphbuildingblocks

• Nodes-Entitiesandcomplexvaluetypes

• Relationships-Connectentitiesandstructuredomain

• Properties-Entityattributes,relationshipqualities,metadata

• Labels-Groupnodesbyrole

GraphQuerying

WhynotSQL?

• SQLisinefficientinexpressinggraphpatternqueriesInparticularforqueriesthat• arerecursive• canacceptpathsofmultipledifferentlengths

• Graphpatternsaremoreintuitiveanddeclarativethanjoins

• SQLcannothandlepathvalues

16

17

Apatternmatchingquerylanguagemadeforgraphs

• Declarative

• Expressive

• PatternMatching

Cypher

PatterninourGraphModel

LOVES

Dan Ann

NODE NODERelationship

Cypher:ExpressGraphPatterns

LOVES

Dan Ann

Relationship

Cypher:ExpressGraphPatterns

(:Person{name:"Dan"})-[:LOVES]->(:Person{name:"Ann"})

LOVES

Dan Ann

Relationship

Cypher:ExpressGraphPatterns

(:Person{name:"Dan"})-[:LOVES]->(:Person{name:"Ann"})

LOVES

Dan Ann

LABEL PROPERTYLABEL PROPERTY

Relationship

Cypher:ExpressGraphPatterns

(:Person{name:"Dan"})-[:LOVES]->(:Person{name:"Ann"})

LOVES

Dan Ann

LABEL PROPERTY

NODE NODE

LABEL PROPERTY

Relationship

Cypher:CREATEGraphPatterns

LOVES

Dan Ann

Relationship

Cypher:CREATEGraphPatterns

CREATE(:Person{name:"Dan"})-[:LOVES]->(:Person{name:"Ann"})

LOVES

Dan Ann

Relationship

Cypher:CREATEGraphPatterns

CREATE(:Person{name:"Dan"})-[:LOVES]->(:Person{name:"Ann"})

LOVES

Dan Ann

LABEL PROPERTYLABEL PROPERTY

Relationship

Cypher:CREATEGraphPatterns

CREATE(:Person{name:"Dan"})-[:LOVES]->(:Person{name:"Ann"})

LOVES

Dan Ann

LABEL PROPERTY

NODE NODE

LABEL PROPERTY

Relationship

Cypher:MATCHGraphPatterns

LOVES

Dan ?

Relationship

Cypher:MATCHGraphPatterns

MATCH(:Person{name:"Dan"})-[:LOVES]->(whom)RETURNwhom

LOVES

Dan ?

Relationship

Cypher:MATCHGraphPatterns

MATCH(:Person{name:"Dan"})-[:LOVES]->(whom)RETURNwhom

LOVES

Dan ?

VARIABLELABEL PROPERTY

Relationship

Cypher:MATCHGraphPatterns

MATCH(:Person{name:"Dan"})-[:LOVES]->(whom)RETURNwhom

LOVES

Dan ?

VARIABLE

NODE NODE

LABEL PROPERTY

Relationship

Agraphqueryexample

Asocialrecommendation

MATCH(person:Person)-[:IS_FRIEND_OF]->(friend),(friend)-[:LIKES]->(restaurant),(restaurant)-[:LOCATED_IN]->(loc:Location),(restaurant)-[:SERVES]->(type:Cuisine)WHEREperson.name='Philip'ANDloc.location='NewYork'ANDtype.cuisine='Sushi'RETURNrestaurant.name

Asocialrecommendation

TheSyntax

Nodes

Nodesaredrawnwithparentheses.

()

Relationships

Relationshipsaredrawnasarrows,withadditionaldetailinbrackets.

-->-[:DIRECTED]->

Patterns

Patternsaredrawnbyconnectingnodesandrelationshipswithhyphens,optionallyspecifyingadirectionwith>and<signs. ()-[]-()()-[]->()()<-[]-()

ThecomponentsofaCypherquery

MATCH(m:Movie)RETURNmMATCHandRETURNareCypherkeywords misavariable:Movieisanodelabel

ThecomponentsofaCypherquery

MATCH(p:Person)-[r:ACTED_IN]->(m:Movie)RETURNp,r,mMATCHandRETURNareCypherkeywords p,r,andmarevariables :Movieisanodelabel :ACTED_INisarelationshiptype

ThecomponentsofaCypherquery

MATCHpath=(:Person)-[:ACTED_IN]->(:Movie)RETURNpathMATCHandRETURNareCypherkeywords pathisavariable:Movieisanodelabel :ACTED_INisarelationshiptype

MATCH(m:Movie)RETURNm

GraphversusTabularresults

MATCH(m:Movie)RETURNm.title,m.releasedPropertiesareaccessedwith{variable}.{property_key}

GraphversusTabularresults

Casesensitive Nodelabels

Relationshiptypes

Propertykeys

Caseinsensitive Cypherkeywords

Casesensitivity

Casesensitive :Person

:ACTED_IN

name

Caseinsensitive MaTcH

return

Casesensitivity

Writequeries

TheCREATEClause

CREATE(m:Movie{title:'MysticRiver',released:2003})

RETURNm

TheSETClause

MATCH(m:Movie{title:'MysticRiver'})

SETm.tagline='Weburyoursinshere,Dave.Wewashthemclean.'

RETURNm

TheCREATEClause

MATCH(m:Movie{title:'MysticRiver'})

MATCH(p:Person{name:'KevinBacon'})

CREATE(p)-[r:ACTED_IN{roles:['Sean']}]->(m)

RETURNp,r,m

TheMERGEClause

MERGE(p:Person{name:'TomHanks'})

RETURNp

TheMERGEClause

MERGE(p:Person{name:'TomHanks',oscar:true})

RETURNp

TheMERGEClause

MERGE(p:Person{name:'TomHanks',oscar:true})

RETURNpThereisnota:Personnodewithname:'TomHanks'andoscar:trueinthegraph,butthereisa:Personnodewithname:'TomHanks'. Whatdoyouthinkwillhappenhere?

TheMERGEClause

MERGE(p:Person{name:'TomHanks'})

SETp.oscar=true

RETURNp

TheMERGEClause

MERGE(p:Person{name:'TomHanks'})-[:ACTED_IN]

->(m:Movie{title:'TheTerminal'})

RETURNp,m

TheMERGEClause

MERGE(p:Person{name:'TomHanks'})-[:ACTED_IN]

->(m:Movie{title:'TheTerminal'})

RETURNp,m

Thereisnota:Movienodewithtitle:"TheTerminal"inthegraph,butthereisa:Personnodewithname:"TomHanks". Whatdoyouthinkwillhappenhere?

MERGE(p:Person{name:'TomHanks'})

MERGE(m:Movie{title:'TheTerminal'})

MERGE(p)-[r:ACTED_IN]->(m)

RETURNp,r,m

TheMERGEClause

MERGE(p:Person{name:'YourName'})

ONCREATESETp.created=timestamp(),p.updated=0

ONMATCHSETp.updated=p.updated+1

RETURNp.created,p.updated;

ONCREATEandONMATCH

GraphModeling

Models

Themodelingworkflow

1.Derivethequestion

2.Obtainthedata

3.Developamodel

4.Ingestthedata

5.Query/Proveourmodel

Developingthemodelandthequery

1.Identifyapplication/end-usergoals2.Figureoutwhatquestionstoaskofthedomain3.Identifyentitiesineachquestion4.Identifyrelationshipsbetweenentitiesineachquestion5.Convertentitiesandrelationshipstopaths

- Thesebecomethebasisofthedatamodel6.Expressquestionsasgraphpatterns

- Thesebecomethebasisforqueries

51

1.Application/End-UserGoals

52

As an employeeI want to know who in the company has similar skills to meSo that we can exchange knowledge

2.QuestionstoaskoftheDomain

53

As an employeeI want to know who in the company has similar skills to meSo that we can exchange knowledge

Whichpeople,whoworkforthesamecompanyasme,havesimilarskillstome?

3.IdentifyEntities

Which people, who work for the same company as me, have similar skills to me?

• Person• Company• Skill

54

4.IdentifyRelationshipsBetweenEntities

Which people, who work for the same company as me, have similar skills to me?

• PersonWORKSFORCompany• PersonHASSKILLSkill

55

5.ConverttoCypherPaths

56

5.ConverttoCypherPaths

•Person WORKS FOR Company

•Person HAS SKILL Skill

56

5.ConverttoCypherPaths

•Person WORKS FOR Company

•Person HAS SKILL Skill

56

NodeNode

Node Node

5.ConverttoCypherPaths

•Person WORKS FOR Company

•Person HAS SKILL Skill

56

Relationship

NodeNode Relationship

Node Node

5.ConverttoCypherPaths

•Person WORKS FOR Company

•Person HAS SKILL Skill

• (:Person)-[:WORKS_FOR]->(:Company),• (:Person)-[:HAS_SKILL]->(:Skill)

56

Relationship

NodeNode Relationship

Node Node

5.ConverttoCypherPaths

•Person WORKS FOR Company

•Person HAS SKILL Skill

• (:Person)-[:WORKS_FOR]->(:Company),• (:Person)-[:HAS_SKILL]->(:Skill)

56

Relationship

NodeNode Relationship

Node Node

Label Label

Label Label

5.ConverttoCypherPaths

•Person WORKS FOR Company

•Person HAS SKILL Skill

• (:Person)-[:WORKS_FOR]->(:Company),• (:Person)-[:HAS_SKILL]->(:Skill)

56

Relationship

NodeNode Relationship

Node Node

Label Label

Label Label

Relationship Type

Relationship Type

ConsolidatePattern

(:Person)-[:WORKS_FOR]->(:Company),(:Person)-[:HAS_SKILL]->(:Skill)

(:Company)<-[:WORKS_FOR]-(:Person)-[:HAS_SKILL]->(:Skill)

57

Person SkillCompany

WORKS_FOR HAS_SKILL

CandidateDataModel(:Company)<-[:WORKS_FOR]-(:Person)-[:HAS_SKILL]->(:Skill)

58

name:Neo4j

name:Ian

name:ACME

Person

Company

WO

RKS_

FOR

HAS_

SKILL

name:Jacob

Person

name:Tobias

Person

WORKS_F

OR WORKS_FOR

name:Scala

name:Python

name:C#

SkillSkillSkillSkillHA

S_SK

ILL

HAS_SKILLHAS_SKILL

HAS_SKILL

HAS_SKILLHAS_

SKILL

6.ExpressQuestionasGraphPattern

Which people, who work for the same company as me, have similar skills to me?

59

skill

company

Company

colleagueme

PersonWORK

S_FOR

WORKS_FOR

Skill

HAS_SKILL HAS_SKILL

Person

CypherQuery

Which people, who work for the same company as me, have similar skills to me?

MATCH(company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill)(company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)WHEREme.name=$nameRETURNcolleague.nameASname,count(skill)ASscore,collect(skill.name)ASskillsORDERBYscoreDESC

60

skill

company

Company

colleagueme

PersonWORK

S_FOR

WORKS_FOR

Skill

HAS_SKILL HAS_SKILL

Person

CypherQuery

Which people, who work for the same company as me, have similar skills to me?

MATCH(company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill)(company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)WHEREme.name=$nameRETURNcolleague.nameASname,count(skill)ASscore,collect(skill.name)ASskillsORDERBYscoreDESC

61

skill

company

Company

colleagueme

PersonWORK

S_FOR

WORKS_FOR

Skill

HAS_SKILL HAS_SKILL

Person

1. Graph pattern

CypherQuery

Which people, who work for the same company as me, have similar skills to me?

MATCH(company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill)(company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)WHEREme.name=$nameRETURNcolleague.nameASname,count(skill)ASscore,collect(skill.name)ASskillsORDERBYscoreDESC

62

skill

company

Company

colleagueme

PersonWORK

S_FOR

WORKS_FOR

Skill

HAS_SKILL HAS_SKILL

Person

1. Graph pattern

2. Filter, using index if available

CypherQuery

Which people, who work for the same company as me, have similar skills to me?

MATCH(company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill)(company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)WHEREme.name=$nameRETURNcolleague.nameASname,count(skill)ASscore,collect(skill.name)ASskillsORDERBYscoreDESC

63

skill

company

Company

colleagueme

PersonWORK

S_FOR

WORKS_FOR

Skill

HAS_SKILL HAS_SKILL

Person

1. Graph pattern

2. Filter, using index if available

3. Create projection of result

FirstMatch

64

name:Neo4j

name:Ian

name:ACME

Person

Company

WO

RKS_

FOR

HAS_

SKILL

name:Jacob

Person

name:Tobias

Person

WORKS_F

OR WORKS_FOR

name:Scala

name:Python

name:C#

SkillSkillSkillSkill

HAS_

SKILL

HAS_SKILLHAS_SKILL HAS_SKILL

HAS_SKILLHAS_

SKILL

skill

company

Company

colleagueme

Person

WORKS_

FOR WORKS_FOR

SkillHAS_SKILL

HAS_

SKILL

Person

SecondMatch

65

name:Neo4j

name:Ian

name:ACME

Person

Company

WO

RKS_

FOR

HAS_

SKILL

name:Jacob

Person

name:Tobias

Person

WORKS_F

OR WORKS_FOR

name:Scala

name:Python

name:C#

SkillSkillSkillSkill

HAS_

SKILL

HAS_SKILLHAS_SKILL HAS_SKILL

HAS_SKILLHAS_

SKILL

skill

company

Company

colleagueme

Person

WORKS_

FOR WORKS_FOR

SkillHAS_SKILL

HAS_

SKILL

Person

ThirdMatch

66

name:Neo4j

name:Ian

name:ACME

Person

Company

WO

RKS_

FOR

HAS_

SKILL

name:Jacob

Person

name:Tobias

Person

WORKS_F

OR WORKS_FOR

name:Scala

name:Python

name:C#

SkillSkillSkillSkill

HAS_

SKILL

HAS_SKILLHAS_SKILL HAS_SKILL

HAS_SKILLHAS_

SKILL

skill

company

Company

colleagueme

Person

WORKS_

FOR WORKS_FOR

SkillHAS_SKILL

HAS_

SKILL

Person

ResultoftheQuery

+-------------------------------------+ | name | score | skills | +-------------------------------------+ | "Ian" | 2 | ["Scala","Neo4j"] | | "Jacob" | 1 | ["Neo4j"] | +-------------------------------------+ 2 rows

67

Modelingexercise:Moviegenres

Thequestion:shouldwemodelthemaspropertiesorasnodes?

Addingmoviegenres

vs

MATCH(m:Movie{title:'TheMatrix'})SETm.genre=['Action','Sci-Fi']RETURNm

Genresasproperties

MATCH(m:Movie{title:'MysticRiver'})SETm.genre=['Action','Mystery']RETURNm

Genresasproperties

Accessingamovie’sgenresisquickandeasy.

MATCH(m:Movie{title:"TheMatrix"})RETURNm.genre;

Thegoodsideofproperties

FindingmoviesthatsharegenresispainfulandwehaveadisconnectedpatternintheMATCHclause-asuresignyouhaveamodelingissue.

MATCH(m1:Movie),(m2:Movie)WHEREany(xINm1.genreWHERExINm2.genre)ANDm1<>m2RETURNm1,m2;

Thebadsideofproperties

MATCH(m:Movie{title:"TheMatrix"})MERGE(action:Genre{name:"Action"})MERGE(scifi:Genre{name:"Sci-Fi"})MERGE(m)-[:IN_GENRE]->(action)MERGE(m)-[:IN_GENRE]->(scifi)

Genresasnodes

MATCH(m:Movie{title:"MysticRiver"})MERGE(action:Genre{name:"Action"})MERGE(mystery:Genre{name:"Mystery"})MERGE(m)-[:IN_GENRE]->(action)MERGE(m)-[:IN_GENRE]->(mystery)

Genresasnodes

Findingmoviesthatsharegenresisanaturalgraphpattern.

MATCH(m1:Movie)-[:IN_GENRE]->(g:Genre),(m2:Movie)-[:IN_GENRE]->(g)RETURNm1,m2,g

Thegoodsideofnodes

Accessingthegenresofmoviesrequiresabitmoretyping.

MATCH(m:Movie{title:"TheMatrix"}),(m)-[:IN_GENRE]->(g:Genre)RETURNg.name;

The(nottoo)badsideofnodes

SymmetricRelationships

Symmetricrelationships

OR

BidirectionalRelationships

Usesinglerelationshipandignoredirectioninqueries

MATCH(:Person{name:'Eric'})-[:MARRIED_TO]-(p2)RETURNp2

Formalsemantics

• NadimeFrancis,PaoloGuagliardo,LeonidLibkin

• Formallydefinesa(large)coreofCypher

82

Futureimprovements

RegularPathQueries

• Conjunctivebi-directionalRegularPathQuerieswithData• Plusspecificationofacostfunctionforpaths,whichallowsformoreinterestingnotionsofshortestpath

PATH PATTERN coauth=()-[:WROTE]->(b)<-[:WROTE]-(), (b)<-[sale:SELLS]-() COST min(sale.price)

MATCH (a)-/~coauth* COST x/->(b)ORDER BY x LIMIT 1084

Furtherimprovements

• Supportforqueryingfrommultiplegraphs• Supportforreturninggraphs• Supportfordefiningviews

• IntegrationwithSQL

85

RDBMSsandGraphs

86

• Cannotmodelorstoredataandrelationshipswithoutcomplexity• Performancedegradeswithnumberandlevelsofrelationships,anddatabasesize

• QuerycomplexitygrowswithneedforJOINs• Addingnewtypesofdataandrelationshipsrequiresschemaredesign,increasingtimetomarket

RDBMScan’thandlerelationshipswell

ExpressComplexQueriesEasilywithCypher

Findallmanagersandhowmanypeopletheymanage,upto3levelsdown.

MATCH(boss)-[:MANAGES*0..3]->(mgr)WHEREboss.name="JohnDoe"AND(mgr)-[:MANAGES]->()RETURNmgr.nameASManager,size((mgr)-[:MANAGES*1..3]->())ASTotal

Cypher SQL

• Modelyourdatanaturallyasagraphofdataandrelationships• Drivegraphmodelfromdomainanduse-cases• Userelationshipinformationinreal-timetotransformyourbusiness• Addnewrelationshipsontheflytoadapttoyourchangingrequirements

UnlockingValuefromYourDataRelationships

• Relationshipsarefirstclasscitizen• Noneedforjoins,justfollowpre-materializedrelationshipsofnodes• Query&Data-locality–navigateoutfromyourstartingpoints• Onlyloadwhat’sneeded• Aggregateandprojectresultsasyougo• Optimizeddiskandmemorymodelforgraphs

HighQueryPerformancewithaNativeGraphDB

Theperformanceadvantageofmaterialisedrelationships

• asamplesocialgraphwith~1,000persons• average50friendsperperson• pathExists(a,b)limitedtodepth4• cacheswarmeduptoeliminatediskI/O

#persons querytime

Relationaldatabase 1,000 2000ms

Neo4j 1,000 2ms

Neo4j 1,000,000 2ms

1. DownloadNeo4j:http://neo4j.com/download/

2. Starttheserver.

3. Itshouldberunningon:http://localhost:7474

4. Log-inwithdefaultcredentialsuser:neo4j password:neo4j

5. Chooseanewpassword

We’regoodtogo!

GettingstartedwithNeo4j