Graph Databases and Neo4j
Brett Ragozzine, PhD
Data Scientist
Intermountain Big Data Conference
Nov 21, 2015
Introduction
● PhD, Astrophysics
  ○ Dark matter, gravitational lensing, image analysis
● Data Scientist - CompuCom (remote, Dallas)
  ○ IT Outfitter, Service, Support
  ○ 1000 customers, millions of end-users
● Digital Innovation team
  ○ Neo4j
Outline
● Graph databases (NoSQL)
  ○ Why they’re different
  ○ What they do well
● Neo4j
● Cypher query language
● GraphGist examples
● Use cases
● Links for getting started
Graph Databases
● Great at storing:
  ○ Relationships (“first-class citizens”)
  ○ Properties
  ○ Sparse data, densely
● Relational DBs store all the 0s
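A minimal sketch of the sparse-storage point, using hypothetical people and a hypothetical `phone` property that are not from the talk. Each node carries only the properties it actually has, so there is no empty column to fill for Bob.

```cypher
// Hypothetical data: Ann has a phone number, Bob does not.
// Bob's node simply has no `phone` property; nothing is stored in its place.
CREATE (:PERSON {name: "Ann", phone: "555-0100"}),
       (:PERSON {name: "Bob"});

// keys(p) lists the property keys actually present on each node.
MATCH (p:PERSON)
RETURN p.name, keys(p);
```

The two statements are meant to be run one after the other. In a relational table the same data would usually need a `phone` column with an empty value for every row that lacks one.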
Graph Databases
● Great for searching:
  ○ Subsets of the graph
● Great at finding:
  ○ Labeled data
  ○ Indexed data
    ■ Relational DBs are too
  ○ Pattern matches
    ■ e.g. "Friend-of-a-friend"
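The friend-of-a-friend pattern can be sketched in Cypher roughly as follows. The names and the `FRIEND` relationship type are assumptions made up for illustration, not data from the talk.

```cypher
// Hypothetical data: Ann knows Bob, and Bob knows Cat.
CREATE (a:PERSON {name: "Ann"}),
       (b:PERSON {name: "Bob"}),
       (c:PERSON {name: "Cat"}),
       (a)-[:FRIEND]->(b),
       (b)-[:FRIEND]->(c);

// Start from a labeled node, walk two FRIEND hops in either direction,
// and keep people who are not already direct friends of the start node.
MATCH (me:PERSON {name: "Ann"})-[:FRIEND]-(friend)-[:FRIEND]-(fof)
WHERE fof <> me AND NOT (me)-[:FRIEND]-(fof)
RETURN DISTINCT fof.name;
```

The whole question is expressed as one path pattern. The equivalent relational query would typically join a friendship table to itself once per hop.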
Graph Databases
● Nodes, relationships, properties, labels
  ○ Billions of nodes, relationships
● Properties
  ○ Don’t store too many together (slow)
  ○ Turn properties into nodes, where applicable
● Don’t store large files in a graph
  ○ Do store the URL
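As a small illustration of the last point, the file itself can live elsewhere while the graph keeps only its location. The `DOCUMENT` label, the property names, and the example.com address are placeholders, not from the talk.

```cypher
// Hypothetical node: the graph stores where the file is, not its contents.
CREATE (d:DOCUMENT {title: "Conference slides",
                    url: "http://example.com/slides.pdf"});
```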
Graph Databases
● Schema
  ○ Design your graph around your questions
    ■ Very different from relational DBs
  ○ Easy to re-design on the fly
    ■ Turn nodes into properties or vice versa
    ■ Difficult to do with relational DBs
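One way such a re-design might look in Cypher, assuming a starting graph in which each `PERSON` node has a `company` property. That starting shape is an assumption for illustration; the talk does not show one.

```cypher
// Assumed starting point: PERSON nodes carry a `company` property.
MATCH (p:PERSON)
WHERE p.company IS NOT NULL
// MERGE reuses an existing COMPANY node with this name, or creates one.
MERGE (c:COMPANY {name: p.company})
// Link the person to the company node.
MERGE (p)-[:WORKS_FOR]->(c)
// Drop the property that the relationship now replaces.
REMOVE p.company;
```

After this runs, everyone at the same company is connected to the same `COMPANY` node, so "who else works here?" becomes a one-hop pattern match rather than a property comparison.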
Graph Databases
● Architecture
  ○ Single Master
    ■ Transactional (single write - pass/fail)
  ○ Large cluster
    ■ High availability (multiple queries/reads at once)
Cypher Query Language
● Interact with Neo4j
  ○ Add, delete, change, query data (SQL-inspired)
● Symbolic language
  ○ Node ()
  ○ Property {name:"Brett"}
  ○ Relationship <-- or -->
    ■ ()<--() or ()-->()
  ○ Label (:PERSON), relationship type -[:HAS_TITLE]->
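The four operations named above (add, query, change, delete) might look like this in Cypher. This is a sketch, not the speaker's own code; the `field` property key is hypothetical, and `DETACH DELETE` assumes Neo4j 2.3 or later.

```cypher
// Add: create a node with a label and a property.
CREATE (p:PERSON {name: "Brett"});

// Query: find it again by label and property.
MATCH (p:PERSON {name: "Brett"})
RETURN p.name;

// Change: set another property on the same node (hypothetical `field` key).
MATCH (p:PERSON {name: "Brett"})
SET p.field = "Astrophysics";

// Delete: remove the node together with any relationships attached to it.
MATCH (p:PERSON {name: "Brett"})
DETACH DELETE p;
```

Each statement is meant to be run on its own, in order.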
Cypher Query Language
● Example
(brett:PERSON {name:"Brett"})-[:HAS_TITLE]->(ds:TITLE {name:"Data Scientist"})
(brett)-[:WORKS_FOR {from:"April 2015"}]->(cc:COMPANY {name:"CompuCom"})
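The two patterns above describe data but are not complete statements on their own. A sketch of how they could be wrapped in `CREATE` and then read back with `MATCH`; the read query is an added illustration, not from the slides.

```cypher
// Create both patterns in one statement; the variable `brett` is reused,
// so both relationships attach to the same PERSON node.
CREATE (brett:PERSON {name: "Brett"})-[:HAS_TITLE]->(ds:TITLE {name: "Data Scientist"}),
       (brett)-[:WORKS_FOR {from: "April 2015"}]->(cc:COMPANY {name: "CompuCom"});

// Read it back: who works for CompuCom, with what title, and since when?
// `from` is backtick-quoted as a precaution, since FROM is also a Cypher
// keyword (for example in LOAD CSV ... FROM).
MATCH (t:TITLE)<-[:HAS_TITLE]-(p:PERSON)-[r:WORKS_FOR]->(c:COMPANY {name: "CompuCom"})
RETURN p.name, t.name, r.`from`;
```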
How to Get Started
● Use cases
  ○ http://neo4j.com/use-cases/
● Neo4j tutorial
  ○ http://neo4j.com/graphacademy/
● GraphGists
  ○ http://graphgist.neo4j.com/#!/gists/all
  ○ https://github.com/neo4j-contrib/graphgist/wiki
Other Graph Databases
● Top 5 Most Popular
  http://db-engines.com/en/ranking/graph+dbms (number of mentions, job postings, etc.)
  ○ Neo4j
  ○ Titan
  ○ OrientDB
  ○ ArangoDB
  ○ Giraph