Welcome School of Engineering, CUSAT 1

A seminar on Neo4j

DESCRIPTION

A seminar on Neo4j and Cypher.

A SEMINAR ON

NEO4J

Presented by: Vishnu Sanker

Project guide: Dr. Sudheep Elayidom

Contents

• Trends in big data

• NoSQL

• Graphs

• Neo4j

• Brief introduction to Cypher

• Pros and Cons of Neo4j

TRENDS IN BIG DATA

1. Increasing data size (big data)

• “Every 2 days we create as much information as we did up to 2003”

- Eric Schmidt

2. Increasingly connected data (graph data)

• For example, plain text documents giving way to hyperlinked HTML

3. Semi-structured data

• Individualization of data, with a common subset

4. Architecture

• From monolithic to modular, distributed applications

NOSQL

• Carlo Strozzi used the term NoSQL in 1998 to name his lightweight, open-source relational database that did not expose the standard SQL interface

• A NoSQL database provides a mechanism for storing and retrieving data that is modeled by means other than the tabular relations used in relational databases

BENEFITS OF NOSQL

• Large volumes of structured, semi-structured and unstructured data

• Agile sprints, quick iteration, and frequent code pushes

• Flexible, easy-to-use object-oriented programming

• Efficient, scale-out architecture instead of expensive, monolithic architecture

TYPES OF NOSQL

• Column

- a column of a distributed data store is a NoSQL object of the lowest level in a keyspace. It is a tuple (a key-value pair) consisting of three elements:

Unique name : Used to reference the column

Value : The content of the column

Timestamp : Used to determine the valid content

• Document oriented

- designed for storing, retrieving, and managing document-oriented information, also known as semi-structured data

• Key-value

- data stored as a collection of key-value pairs

• Graph

- database that uses graph structures with nodes, edges, and properties to represent and store data
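The three-element column described above can be sketched in a few lines of Python. This is only an illustration: the `Column` class, the `resolve` helper and the sample values are hypothetical, and "newest timestamp wins" is an assumed reading of how the timestamp determines the valid content.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str       # unique name, used to reference the column
    value: str      # the content of the column
    timestamp: int  # used to decide which write is the valid one


def resolve(writes):
    """Pick the valid column among conflicting writes (assumption: newest wins)."""
    return max(writes, key=lambda column: column.timestamp)


# Two writes to the same column, made at different times (hypothetical data).
writes = [
    Column("email", "old@example.com", timestamp=100),
    Column("email", "new@example.com", timestamp=250),
]
current = resolve(writes)
print(current.value)
```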

GRAPHS

A GRAPH DATABASE...

NO: not for charts & diagrams, or vector artwork

YES: for storing data that is structured as a graph

Graphs Everywhere

๏Relationships in

•Politics, Economics, History, Science, Transportation

๏Biology, Chemistry, Physics, Sociology

•Body, Ecosphere, Reaction, Interactions

๏Internet

•Hardware, Software, Interaction

๏Social Networks

•Family, Friends

•Work, Communities

•Neighbours, Cities, Society

Good Relationships

๏The world is full of rich, messy and related data

๏Relationships are at least as important as the things they connect

๏Complex interactions

๏Always changing, including changes of structure

๏Graph: Relationships are part of the data

๏RDBMS: Relationships are part of the fixed schema

HOW AN RDB IS REPRESENTED BY A GRAPH

Figure: RDB and the equivalent PROPERTY GRAPH
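One common way to read this mapping is that rows become nodes, non-key columns become node properties, and foreign keys become relationships. The sketch below assumes that reading; the tables, the labels and the `STUDIES_IN` relationship type are all hypothetical.

```python
# Hypothetical relational data: two tables linked by a foreign key.
departments = [{"id": 1, "name": "Computer Science"}]
students = [
    {"id": 10, "name": "Asha", "dept_id": 1},
    {"id": 11, "name": "Ravi", "dept_id": 1},
]

nodes = {}          # (label, primary key) -> properties
relationships = []  # (start node key, relationship type, end node key)

for row in departments:
    nodes[("Department", row["id"])] = {"name": row["name"]}

for row in students:
    # Non-key columns become properties of the node.
    nodes[("Student", row["id"])] = {"name": row["name"]}
    # The foreign key column becomes an explicit relationship.
    relationships.append(
        (("Student", row["id"]), "STUDIES_IN", ("Department", row["dept_id"]))
    )

print(len(nodes), "nodes,", len(relationships), "relationships")
```

The foreign key no longer appears as a property: the connection is stored as data in its own right.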

NEO4J - A GRAPH DATABASE

Neo4j is a Graph Database

๏A Graph Database:

•a schema-free Property Graph

•perfect for complex, highly connected data

๏Why NEO4J:

•reliable with real ACID Transactions

•fast with more than 1M traversals / second

•Server with REST API, or Embeddable on the JVM

•scale out for higher-performance reads with High-Availability
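To make "schema-free property graph" and "traversal" concrete, here is a small in-memory sketch in Python. It is not Neo4j's API: the data, the `neighbours` and `reachable` helpers and the breadth-first strategy are assumptions chosen for illustration.

```python
from collections import deque

# Schema-free: every node carries its own set of properties (hypothetical data).
nodes = {
    1: {"name": "Alice", "age": 30},
    2: {"name": "Bob"},
    3: {"title": "Graph Databases", "year": 2013},
}

# Relationships are typed, directed, and can carry properties too.
relationships = [
    (1, "KNOWS", 2, {"since": 2010}),
    (2, "READ", 3, {}),
]


def neighbours(node_id):
    """Yield the ids of nodes reached by outgoing relationships."""
    for start, _rel_type, end, _props in relationships:
        if start == node_id:
            yield end


def reachable(start_id):
    """Breadth-first traversal; returns node ids in the order they are visited."""
    visited = [start_id]
    queue = deque([start_id])
    while queue:
        current = queue.popleft()
        for next_id in neighbours(current):
            if next_id not in visited:
                visited.append(next_id)
                queue.append(next_id)
    return visited


print(reachable(1))
```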

DATA MODELING FOR NEO4J

SAMPLE CODE

CYPHER - QUERY LANGUAGE FOR NEO4J

• Declarative query language

• Describe what you want, not how

• Based on pattern matching

• declarative grammar with clauses (like SQL)

• aggregation, ordering, limits

• create, update, delete

Cypher: START + RETURN

๏START <lookup> RETURN <expressions>

๏START binds terms using simple look-up

•directly using known ids

•or based on indexed Property

๏RETURN expressions specify result set
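The two look-up styles can be imitated with a toy model in Python. This is a sketch of the idea, not of how Neo4j executes the clause: the data, the helper names and the index name `people` in the comments are hypothetical, and the Cypher lines show the START-based syntax only approximately.

```python
# Toy node store: node id -> properties (hypothetical data).
nodes = {
    1: {"name": "Alice", "age": 30},
    2: {"name": "Bob", "age": 25},
}

# Toy index on the "name" property: value -> list of node ids.
name_index = {}
for node_id, props in nodes.items():
    name_index.setdefault(props["name"], []).append(node_id)


def start_by_id(node_id):
    """Bind nodes directly by a known id."""
    return [nodes[node_id]]


def start_by_index(name):
    """Bind nodes by looking up an indexed property value."""
    return [nodes[i] for i in name_index.get(name, [])]


def return_rows(bound, *keys):
    """Evaluate the RETURN expressions (here: plain property reads) per node."""
    return [tuple(node[key] for key in keys) for node in bound]


# Roughly: START n=node(1) RETURN n.name
by_id = return_rows(start_by_id(1), "name")

# Roughly: START n=node:people(name = "Bob") RETURN n.name, n.age
by_index = return_rows(start_by_index("Bob"), "name", "age")

print(by_id, by_index)
```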

Cypher: MATCH

๏START <lookup> MATCH <pattern> RETURN <expr>

๏MATCH describes a pattern of nodes+relationships

•node terms in optional parentheses

•lines with arrows for relationships
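A pattern such as `(a)-[:KNOWS]->(b)` asks for every pair of nodes joined by a relationship of that type and direction. The Python sketch below imitates that over a toy edge list; the data and the `match_outgoing` helper are hypothetical, and the Cypher in the comments is approximate.

```python
# Hypothetical data: node id -> properties, plus (start, type, end) triples.
nodes = {1: {"name": "Alice"}, 2: {"name": "Bob"}, 3: {"name": "Carol"}}
relationships = [
    (1, "KNOWS", 2),
    (1, "WORKS_WITH", 3),
    (2, "KNOWS", 3),
]


def match_outgoing(start_ids, rel_type):
    """Find every (a, b) fitting (a)-[:rel_type]->(b), with a bound by START."""
    return [
        (a, b)
        for (a, found_type, b) in relationships
        if a in start_ids and found_type == rel_type
    ]


# Roughly: START a=node(1) MATCH (a)-[:KNOWS]->(b) RETURN b.name
friends = [nodes[b]["name"] for _, b in match_outgoing({1}, "KNOWS")]

# A longer pattern is a chain of steps: (a)-[:KNOWS]->(b)-[:KNOWS]->(c)
friends_of_friends = [
    nodes[c]["name"]
    for _, b in match_outgoing({1}, "KNOWS")
    for _, c in match_outgoing({b}, "KNOWS")
]

print(friends, friends_of_friends)
```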

Cypher: WHERE

๏START <lookup> [MATCH <pattern>] WHERE <condition> RETURN <expr>

๏WHERE filters nodes or relationships

•uses expressions to constrain elements
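Filtering can apply to either side of a match: the nodes or the relationships that connect them. A minimal Python sketch with hypothetical data and helper names, and approximate Cypher in the comments:

```python
# Hypothetical data: relationships carry their own properties.
nodes = {
    1: {"name": "Alice", "age": 30},
    2: {"name": "Bob", "age": 25},
    3: {"name": "Carol", "age": 41},
}
relationships = [
    (1, "KNOWS", 2, {"since": 2012}),
    (1, "KNOWS", 3, {"since": 2005}),
]


def match_knows(start_id):
    """Return (relationship properties, end node) for each KNOWS edge leaving start_id."""
    return [
        (rel_props, nodes[end])
        for (start, rel_type, end, rel_props) in relationships
        if start == start_id and rel_type == "KNOWS"
    ]


# Roughly: START a=node(1) MATCH (a)-[:KNOWS]->(b) WHERE b.age > 30 RETURN b.name
older = [b["name"] for _r, b in match_knows(1) if b["age"] > 30]

# Roughly: START a=node(1) MATCH (a)-[r:KNOWS]->(b) WHERE r.since > 2010 RETURN b.name
recent = [b["name"] for r, b in match_knows(1) if r["since"] > 2010]

print(older, recent)
```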

Cypher: SET

๏SET [<node property>] [<relationship property>]

•update a property on a node or relationship

•must follow a START
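In a toy model, SET amounts to writing a key into the property map of whatever START has bound. The sketch below assumes that view; the ids, data and `set_property` helper are hypothetical, and the Cypher in the comments is approximate.

```python
# Hypothetical data: nodes and relationships each have a property map.
nodes = {1: {"name": "Alice", "age": 30}, 2: {"name": "Bob"}}
relationships = {100: {"start": 1, "end": 2, "type": "KNOWS", "props": {}}}


def set_property(props, key, value):
    """Create the property if it is missing, otherwise overwrite it."""
    props[key] = value


# Roughly: START n=node(1) SET n.age = 31
set_property(nodes[1], "age", 31)

# Roughly: START r=relationship(100) SET r.since = 2012
set_property(relationships[100]["props"], "since", 2012)

print(nodes[1], relationships[100]["props"])
```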

Cypher: DELETE

๏DELETE [<node>|<relationship>|<property>]

•delete a node, relationship or property

•must follow a START

•to delete a node, all of its relationships must be deleted first
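The ordering rule for node deletion can be illustrated with a toy store that refuses to delete a node while relationships are still attached to it. Everything here is hypothetical (data, helper names, the choice of `ValueError`); it only mirrors the rule stated above.

```python
# Hypothetical data: relationship id -> (start, type, end).
nodes = {1: {"name": "Alice", "age": 30}, 2: {"name": "Bob"}}
relationships = {100: (1, "KNOWS", 2)}


def delete_property(node_id, key):
    """Remove a single property from a node, if present."""
    nodes[node_id].pop(key, None)


def delete_relationship(rel_id):
    del relationships[rel_id]


def delete_node(node_id):
    """Delete a node, but only once no relationship touches it."""
    attached = [
        rel_id
        for rel_id, (start, _rel_type, end) in relationships.items()
        if node_id in (start, end)
    ]
    if attached:
        raise ValueError(f"node {node_id} still has relationships: {attached}")
    del nodes[node_id]


delete_property(1, "age")

# Deleting the node first is rejected because relationship 100 is attached.
try:
    delete_node(1)
    rejected = False
except ValueError:
    rejected = True

# Deleting the relationship first makes the node deletable.
delete_relationship(100)
delete_node(1)

print(rejected, nodes, relationships)
```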

PROS AND CONS OF NEO4J

PROS

• Powerful data model, as general as an RDBMS

• Connected data is locally indexed

• Easy to query

CONS

• Sharding a graph across machines is difficult

• Needs a new way of thinking

Concluding...

• Neo4j is a property graph database

• It is scalable, flexible, and written in Java

• Cypher is a query language for Neo4j, which is highly declarative and flexible as well
