Build a recommendation engine with Python and Neo4j

Mark Needham / @markhneedham

Logistics

‣ Download and install Neo4j 3.0.0
  • http://neo4j.com/download/

‣ Open your browser to http://localhost:7474

Introducing our data set...

meetup.com’s recommendations

Recommendation queries

‣ Several different types
  • groups to join
  • topics to follow
  • events to attend

‣ As a user of meetup.com trying to find groups to join and events to attend

The data

meetup.com/meetup_api/

What data do we have?

‣ Groups
‣ Members
‣ Events
‣ Topics
‣ Time & Date
‣ Location

Find similar groups to Neo4j

As a member of the Neo4j London group

I want to find other similar meetup groups

So that I can join those groups

What makes groups similar?

Nodes

As a member of the Neo4j London group

I want to find other similar meetup groups

So that I can join those groups

Relationships

Labels

Properties
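The four building blocks named on these slides (nodes, relationships, labels, properties) can be sketched in plain Python. This only illustrates the property graph model, not how Neo4j stores data. The `Node` and `Relationship` classes, the `MEMBER_OF` relationship type, the member name and the exact group name are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set                                   # e.g. {"Group"} or {"Member"}
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node
    type: str                                     # relationship type (name is hypothetical)
    end: Node
    properties: dict = field(default_factory=dict)


# "As a member of the Neo4j London group": two nodes joined by one relationship.
member = Node(labels={"Member"}, properties={"name": "Alice"})      # made-up member
group = Node(labels={"Group"}, properties={"name": "Neo4j London"})  # placeholder name
membership = Relationship(start=member, type="MEMBER_OF", end=group)
```

Labels say what kind of thing a node is, properties carry its data, and the relationship records how two nodes are connected.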

‣ Open your browser to http://localhost:7474 if you haven’t already

‣ Type the following command: :play http://guides.neo4j.com/pydata

Recommend groups by topic
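One way to read "recommend groups by topic" is to rank other groups by how many topics they share with the group we start from. The sketch below does that over an in-memory dictionary rather than a graph. The group names, topics and the `similar_groups` helper are made up for illustration, and the guide itself may score similarity differently.

```python
from collections import Counter

# Hypothetical sample data: group name -> set of topics.
GROUP_TOPICS = {
    "Neo4j London": {"Graph Databases", "NoSQL", "Big Data"},
    "Group A": {"NoSQL", "Big Data", "Python"},
    "Group B": {"Python", "Web Development"},
    "Group C": {"Graph Databases"},
}


def similar_groups(group_name, group_topics):
    """Rank other groups by the number of topics shared with `group_name`."""
    my_topics = group_topics[group_name]
    shared = Counter()
    for other, topics in group_topics.items():
        if other == group_name:
            continue
        overlap = len(my_topics & topics)
        if overlap:
            shared[other] = overlap
    return shared.most_common()


recommendations = similar_groups("Neo4j London", GROUP_TOPICS)
```

Groups with no topic in common are left out entirely, so the result only contains candidates that have at least one reason to be recommended.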

When there are slides to see

Follow Along: Let’s go through the first guide

Take Note: Indexes

Indexes

We create indexes to:

‣ allow fast lookup of nodes which match these (label,property) pairs.

CREATE INDEX ON :Group(name)

The following are index backed:
‣ Equality
‣ STARTS WITH
‣ CONTAINS
‣ ENDS WITH
‣ Range searches
‣ (Non-)existence checks
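To make the list concrete, here is one Cypher predicate per category against the `:Group(name)` index, held as Python strings so they can be pasted into the Neo4j browser or passed to a driver. The search values are made up, and whether the planner actually picks the index for a given query is worth checking with `EXPLAIN`.

```python
# One example predicate per index-backed category; search values are illustrative only.
INDEX_BACKED_QUERIES = {
    "equality": "MATCH (g:Group) WHERE g.name = 'Neo4j London' RETURN g",
    "starts_with": "MATCH (g:Group) WHERE g.name STARTS WITH 'Neo4j' RETURN g",
    "contains": "MATCH (g:Group) WHERE g.name CONTAINS 'Graph' RETURN g",
    "ends_with": "MATCH (g:Group) WHERE g.name ENDS WITH 'London' RETURN g",
    "range": "MATCH (g:Group) WHERE g.name >= 'N' AND g.name < 'O' RETURN g",
    "existence": "MATCH (g:Group) WHERE exists(g.name) RETURN g",
}

for kind, query in INDEX_BACKED_QUERIES.items():
    # Prefixing a query with EXPLAIN shows its plan without running it.
    print(f"{kind}: EXPLAIN {query}")
```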

Indexes

How does Neo4j use indexes?

Indexes are only used to find the starting point for queries.

Relational: Use index scans to look up rows in tables and join them with rows from other tables.

Graph: Use indexes to find the starting points for a query.
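The graph side of this comparison can be sketched with a toy in-memory structure: one index lookup finds the starting node, and everything after that follows references stored on the nodes. The `name_index` and `nodes` dictionaries and all of their contents are hypothetical, and this is a heavily simplified picture rather than Neo4j's actual storage.

```python
# The "index": (label, property value) -> node id.
name_index = {("Group", "Neo4j London"): 1}

# Each node keeps direct references (ids) to its neighbours.
nodes = {
    1: {"label": "Group", "name": "Neo4j London", "topics": [10, 11]},
    10: {"label": "Topic", "name": "Graph Databases", "groups": [1, 2]},
    11: {"label": "Topic", "name": "NoSQL", "groups": [1]},
    2: {"label": "Group", "name": "Group A", "topics": [10]},
}


def groups_sharing_topics(group_name):
    # 1. A single index lookup finds the starting point.
    start_id = name_index[("Group", group_name)]
    # 2. From here on we only follow stored references; the name index is not used again.
    found = set()
    for topic_id in nodes[start_id]["topics"]:
        for group_id in nodes[topic_id]["groups"]:
            if group_id != start_id:
                found.add(nodes[group_id]["name"])
    return sorted(found)


result = groups_sharing_topics("Neo4j London")
```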

Continue: Continue with the Guide

Exercise: Explore the Graph

Solution: Explore the Graph

Continue: Continue with the Guide

Next Step: Clustering topics

Next Step: Group Membership

Exclude groups I’m a member of

As a member of the Neo4j London group

I want to find other similar meetup groups that I’m not already a member of

So that I can join those groups
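A minimal sketch of this refinement, again over in-memory dictionaries: score candidate groups by shared topics, but skip any group the member has already joined. The `recommend_groups` helper, the group names and the membership data are assumptions for illustration, not the guide's query.

```python
def recommend_groups(member, memberships, group_topics, source_group):
    """Rank groups similar to `source_group`, skipping ones `member` already belongs to."""
    already_joined = memberships.get(member, set())
    source_topics = group_topics[source_group]
    scores = {}
    for group, topics in group_topics.items():
        if group == source_group or group in already_joined:
            continue
        overlap = len(source_topics & topics)
        if overlap > 0:
            scores[group] = overlap
    # Highest overlap first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample data.
group_topics = {
    "Neo4j London": {"Graph Databases", "NoSQL"},
    "Group A": {"NoSQL", "Graph Databases"},
    "Group B": {"NoSQL"},
}
memberships = {"me": {"Neo4j London", "Group A"}}

suggestions = recommend_groups("me", memberships, group_topics, "Neo4j London")
```

"Group A" shares the most topics but is filtered out because the member already belongs to it, which leaves "Group B" as the only suggestion.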

Group memberships

Next Guide: Group Membership

Watch Out: Transactions & WITH

Periodic Commit

Cypher keeps all transaction state in memory while running a query, which is fine most of the time.

But when using LOAD CSV, this state can get very large and may result in an OutOfMemory exception.

Periodic Commit

// defaults to 1000
USING PERIODIC COMMIT
LOAD CSV
...

Periodic Commit

// commit every 10000 rows instead of the default 1000
USING PERIODIC COMMIT 10000
LOAD CSV
...
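The idea behind periodic commit is to write rows in batches and commit after each one, so that only one batch of state is held at a time. The sketch below shows that pattern in plain Python. The `write_row` and `commit` callbacks are hypothetical stand-ins for whatever performs the write and the commit; this is not how Neo4j implements the feature.

```python
from itertools import islice


def in_batches(rows, batch_size=1000):
    """Yield lists of at most `batch_size` rows (1000 mirrors the default above)."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def load_rows(rows, write_row, commit, batch_size=1000):
    """Write every row, calling `commit` after each batch; return the number of commits."""
    commits = 0
    for batch in in_batches(rows, batch_size):
        for row in batch:
            write_row(row)
        commit()  # state accumulated for this batch can now be released
        commits += 1
    return commits


written = []
commit_count = load_rows(range(2500), written.append, lambda: None, batch_size=1000)
```

With 2500 rows and a batch size of 1000, the loop commits three times: two full batches and a final partial one.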

WITH

The WITH clause allows query parts to be chained together, piping the results from one to be used as starting points or criteria in the next.

It’s used to:

‣ limit the number of entries that are then passed on to other MATCH clauses
‣ filter on aggregated values
‣ separate reading from updating of the graph
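Filtering on an aggregated value is the use that is easiest to show in isolation. The sketch below pairs a Cypher query using WITH with the same pipeline written over a plain Python list. The `Member` label, the `MEMBER_OF` relationship type and the sample data are assumptions about the data model, not taken from the guide.

```python
from collections import Counter

# Cypher: aggregate, pipe the result through WITH, then filter and limit it.
QUERY = """
MATCH (m:Member)-[:MEMBER_OF]->(g:Group)
WITH g, count(m) AS members
WHERE members >= 2
RETURN g.name AS name, members
ORDER BY members DESC
LIMIT 1
"""

# The same steps over made-up (member, group) pairs.
memberships = [
    ("a", "Group A"), ("b", "Group A"), ("c", "Group A"),
    ("a", "Group B"), ("b", "Group B"),
    ("c", "Group C"),
]

counts = Counter(group for _, group in memberships)                 # count(m) per group
filtered = [(g, n) for g, n in counts.items() if n >= 2]            # WHERE after WITH
top = sorted(filtered, key=lambda item: item[1], reverse=True)[:1]  # ORDER BY ... LIMIT 1
```

The WHERE clause can only refer to `members` because WITH has already computed it, in the same way the list comprehension can only filter once `counts` exists.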

Continue: Continue with the Guide

Exercise: Find yourself and your groups

Solution: Find yourself and your groups

Continue: Continue with the Guide

Next Guide: Event recommendations

Exercise: Extend event recommendations

Solution: Extend event recommendations

That’s all for today!

Mark Needham / @markhneedham