Geek Nights Neo4j Code Jam


DESCRIPTION

Neo4j is a graph database. It is an embedded, disk-based, fully transactional Java persistence engine that stores data structured in graphs rather than in tables. A graph (mathematical lingo for a network) is a flexible data structure that allows a more agile and rapid style of development.


Attendees

username  fullname         registration  speaker  payment
mtiberg   Michael Tiberg   null          no       0
thobe     Tobias Ivarsson  2010-04-07    yes      0
joe       John Doe         2010-02-05    no       700
...       ...              ...           ...      ...

We all know the relational model. It has been predominant for a long time.


Location

username  latitude        longitude        title          publish
thobe     55°36'47.70"N   12°58'34.50"E    Malmö          yes
joe       37°49'36.00"N   122°25'22.00"W   San Francisco  no
...       ...             ...              ...            ...

The relational model has a few problems, such as:
• poor support for sparse data
• modifying the data model is almost exclusively done through adding tables


Sessions

id   title  time  room  ...
...  ...    ...   ...   ...
...  ...    ...   ...   ...

Session attendance

session  user
...      ...
...      ...

More complication...

After a while, modeling complex relationships leads to complicated schemas


Most of the emerging database technologies are concerned with scaling to huge amounts of data and massive load. They do so by making the data opaque and distributing elements based on key.

Most focus on scaling to large numbers
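To make "opaque data distributed by key" concrete, here is a minimal sketch of hash-based partitioning. The names `shard_for` and `ShardedStore` are hypothetical and not taken from any particular product, and real stores typically add replication and rebalancing that are left out here.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a key to a shard index with a stable hash (illustrative choice)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedStore:
    """Toy key/value store: values are opaque, only the key decides placement."""

    def __init__(self, shard_count):
        # Each dict stands in for one machine in a cluster.
        self.shards = [{} for _ in range(shard_count)]

    def put(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key):
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(shard_count=4)
store.put("thobe", b"opaque bytes the store never inspects")
store.put("joe", b"more opaque bytes")
```

Because the store never looks inside a value, it cannot follow a connection from one record to another; that part of the work stays with the application.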



Scaling to size vs. Scaling to complexity

[Figure: database families placed along a Size axis and a Complexity axis: Key/Value stores, Bigtable clones, Document databases, Graph databases. Annotations: "> 90% of use cases" and "Billions of nodes and relationships".]


The Property Graph data model

[Figure: example graph. Relationship labels shown: LIVES WITH, LOVES, OWNS, DRIVES, LOVES. Properties shown: name: “James”, age: 32, twitter: “@spam”; name: “Mary”, age: 35; brand: “Volvo”, model: “V70”; item type: “car”.]

• Nodes
• Relationships between Nodes
• Relationships have Labels
• Relationships are directed, but traversed at equal speed in both directions
• The semantics of the direction are up to the application (LIVES WITH is symmetric, LOVES is not)
• Nodes have key-value properties
• Relationships have key-value properties
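The bullets above can be sketched in a few lines of plain Python. This is an illustration of the data model only, not the Neo4j API: the classes `Node` and `Relationship` and the helpers `relate` and `neighbours` are hypothetical names, and which node each relationship connects is an assumption, since the slide text keeps only the labels and properties.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    properties: dict = field(default_factory=dict)
    outgoing: list = field(default_factory=list, repr=False)
    incoming: list = field(default_factory=list, repr=False)


@dataclass(eq=False)
class Relationship:
    start: Node
    end: Node
    label: str
    properties: dict = field(default_factory=dict)


def relate(start, end, label, **properties):
    """Create a directed, labeled relationship and register it on both ends."""
    rel = Relationship(start, end, label, properties)
    start.outgoing.append(rel)
    end.incoming.append(rel)
    return rel


def neighbours(node, label, direction="both"):
    """Follow relationships with the given label; either direction costs the same."""
    found = []
    if direction in ("out", "both"):
        found += [r.end for r in node.outgoing if r.label == label]
    if direction in ("in", "both"):
        found += [r.start for r in node.incoming if r.label == label]
    return found


# Property values come from the slide; the wiring below is assumed.
james = Node({"name": "James", "age": 32, "twitter": "@spam"})
mary = Node({"name": "Mary", "age": 35})
car = Node({"brand": "Volvo", "model": "V70"})

relate(james, mary, "LIVES WITH")
relate(james, mary, "LOVES")
owns = relate(james, car, "OWNS", **{"item type": "car"})
relate(mary, car, "DRIVES")
```

LIVES WITH is stored once, in one direction, yet `neighbours(mary, "LIVES WITH")` still finds James because the application chooses to ignore direction for that label. For LOVES the application would pass `direction="out"` and get an answer that depends on who is asking.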

Graphs are whiteboard friendly

Image credits: Tobias Ivarsson

An application domain model outlined on a whiteboard or piece of paper would be translated to an ER diagram, then normalized to fit a relational database. With a graph database, the model from the whiteboard is implemented directly.


[Figure: whiteboard-style example graph. Labels shown: "thobe", "Wardrobe Strength", "Joe project blog", "Hello Joe", "Neo4j performance analysis", "Modularizing Jython".]

What is Neo4j?

• Neo4j is a Graph Database
  • Non-relational (“#nosql”), transactional (ACID), embedded
  • Data is stored as a Graph / Network
    ‣ Nodes and Relationships with properties
    ‣ “Property Graph” or “edge-labeled multidigraph”
• Neo4j is Open Source / Free (as in speech) Software
  • AGPLv3
  • Commercial (“dual license”) license available
    ‣ Free (as in beer) for first server installation
    ‣ Inexpensive (as in startup-friendly) when you grow

Prices are available at http://neotechnology.com/

Contact us if you have questions and/or special license needs (e.g. if you want an evaluation license)

More about Neo4j

• Neo4j is stable
  • In 24/7 operation since 2003
• Neo4j is in active development
  • Neo Technology received VC funding October 2009
• Neo4j delivers high performance graph operations
  • traverses 1’000’000+ relationships / second on commodity hardware

http://neotechnology.com

Path exists in social network

• Each person has on average 50 friends

Database              # persons  query time
Relational database   1 000      2 000 ms
Neo4j Graph Database  1 000      2 ms
Neo4j Graph Database  1 000 000  2 ms

[Figure: people in the network: Tobias, Emil, Johan, Peter]
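The kind of query being timed here, "is there a path between two people?", can be sketched as a breadth-first search over an adjacency map. This is plain Python rather than Neo4j code, and it does not reproduce the timings above. The function name `path_exists`, the `max_depth` limit, and the friendships between the four names are all assumptions made for the example.

```python
from collections import deque


def path_exists(friends, start, goal, max_depth=4):
    """Breadth-first search: True if goal is within max_depth hops of start."""
    if start == goal:
        return True
    visited = {start}
    frontier = deque([(start, 0)])
    while frontier:
        person, depth = frontier.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for friend in friends.get(person, ()):
            if friend == goal:
                return True
            if friend not in visited:
                visited.add(friend)
                frontier.append((friend, depth + 1))
    return False


# Hypothetical friendships; only the names come from the slide.
friends = {
    "Tobias": ["Emil"],
    "Emil": ["Tobias", "Johan"],
    "Johan": ["Emil", "Peter"],
    "Peter": ["Johan"],
    "Joe": [],
}
```

The search only ever touches people reachable from the start node, so its cost depends on the size of that neighbourhood rather than on the total number of persons stored. That is one plausible reading of why the graph database row stays at 2 ms when the network grows from 1 000 to 1 000 000 persons.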
