confoo
Graphs, Edges & Nodes
Untangling the social web.
Wednesday, March 9, 2011
What’s a graph?
Graph
[Figure: a graph of nodes and edges, built up over several slides; the final slides label it with numeric values.]
Simple
At most one edge between any pair of nodes.
Multigraph
Multiple edges between vertices allowed.
Pseudograph
Self-loops are permitted.
G = (V, E)
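As a rough illustration of G = (V, E) and the three definitions above, the sketch below stores an undirected graph as a vertex set plus an edge list and reports which kind of graph it is. Python is used only for illustration, and the function name `classify` is hypothetical, not something from the talk.

```python
from collections import Counter

def classify(vertices, edges):
    """Classify an undirected graph G = (V, E) using the slide definitions.

    vertices: a set of vertex labels (V)
    edges:    a list of (u, v) pairs (E); order within a pair is ignored
    """
    for u, v in edges:
        if u not in vertices or v not in vertices:
            raise ValueError(f"edge ({u}, {v}) uses a vertex outside V")

    # A self-loop is an edge whose two endpoints are the same vertex.
    has_loop = any(u == v for u, v in edges)

    # Parallel edges: the same unordered pair appears more than once.
    pair_counts = Counter(frozenset((u, v)) for u, v in edges)
    has_parallel = any(count > 1 for count in pair_counts.values())

    if has_loop:
        return "pseudograph"
    if has_parallel:
        return "multigraph"
    return "simple"
```

For example, `classify({1, 2, 3}, [(1, 2), (2, 3)])` gives `"simple"`, while repeating the edge `(1, 2)` gives `"multigraph"`.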
What’s a node?
vertex, point, junction, 0-simplex
What’s an edge?
arc, branch, line, link, 1-simplex
Directed
Undirected
Data Structures
[Figure: a graph on the vertices 1, 2, 3 and 4.]
(Finite simple graph)
Adjacency Matrix (2d array)
Rows and columns are both indexed by vertices:
    1 2 3 4
1   0 1 1 1
2   1 0 0 0
3   1 0 0 1
4   1 0 1 0
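A minimal sketch of the same matrix as a 2d array, here a Python list of lists. The helper names `has_edge`, `neighbors` and `is_symmetric` are made up for this example.

```python
# Adjacency matrix for the 4-vertex graph on the slide (vertices 1..4).
# Row i, column j holds 1 when an edge joins vertex i+1 and vertex j+1.
MATRIX = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [1, 0, 1, 0],
]

def has_edge(matrix, u, v):
    """Constant-time edge lookup; u and v are 1-based vertex labels."""
    return matrix[u - 1][v - 1] == 1

def neighbors(matrix, u):
    """Scan one full row, so the cost is O(|V|) even for sparse graphs."""
    return [j + 1 for j, cell in enumerate(matrix[u - 1]) if cell]

def is_symmetric(matrix):
    """The matrix of an undirected graph equals its own transpose."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(n))
```

The trade-off is the usual one: edge lookups are cheap, but storage is |V| x |V| no matter how few edges exist.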
[1, 2, 3, 4]
1 -> 2 -> 3 -> 4
2 -> 1
3 -> 1 -> 4
4 -> 1 -> 3
Array entries (vertices) point to singly linked lists.
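The adjacency-list idea can be sketched as follows. This version uses a Python dict of lists rather than an array of singly linked lists, which is an assumption made for brevity; `build_adjacency_list` and `degree` are hypothetical names.

```python
# Edges of the 4-vertex example graph, as unordered pairs.
EDGES = [(1, 2), (1, 3), (1, 4), (3, 4)]

def build_adjacency_list(vertices, edges):
    """Map each vertex to the list of vertices it shares an edge with."""
    adjacency = {v: [] for v in vertices}
    for u, v in edges:
        # Undirected graph: record the edge in both directions.
        adjacency[u].append(v)
        adjacency[v].append(u)
    return adjacency

def degree(adjacency, v):
    """Number of edges touching v: just the length of its list."""
    return len(adjacency[v])

ADJACENCY = build_adjacency_list([1, 2, 3, 4], EDGES)
```

Storage grows with |V| + |E| instead of |V| squared, at the cost of a linear scan of one list to test whether a particular edge exists.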
Visualizations
You are here.
(Graph does not include Justin Bieber)
Social Graphs
User-based item recommendations
Recommend items to me that are popular amongst my friends.
[Figure: People nodes (me, friends) and Items nodes, built up over several slides.]
2-step path on homogeneous bipartite graph.
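One way to read the 2-step path is: step one goes from me to each friend, step two goes from each friend to the items they like, and items are ranked by how many such paths reach them. The sketch below assumes that reading; the data, the names `FRIENDS`, `LIKES` and `recommend`, and the choice to skip items I already have are all illustrative assumptions.

```python
from collections import Counter

# Hypothetical sample data: who my friends are, and who likes which items.
FRIENDS = {"me": {"alice", "bob", "carol"}}
LIKES = {
    "me": {"item_a"},
    "alice": {"item_a", "item_b"},
    "bob": {"item_b", "item_c"},
    "carol": {"item_b"},
}

def recommend(user, friends, likes, limit=3):
    """Rank items by the number of 2-step paths user -> friend -> item."""
    already_liked = likes.get(user, set())
    path_counts = Counter()
    for friend in friends.get(user, set()):        # step 1: user -> friend
        for item in likes.get(friend, set()):      # step 2: friend -> item
            if item not in already_liked:
                path_counts[item] += 1
    # Most paths first; ties broken by name so the result is deterministic.
    ranked = sorted(path_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:limit]]
```

With the sample data, `recommend("me", FRIENDS, LIKES)` ranks `item_b` (liked by three friends) ahead of `item_c` (liked by one).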
Strong Connection Problem (SCP)
There are many of these ‘fundamental’ graph units:
- tripartite graphs (user/asset/tag)
- folksonomies
- multicolor-multiparity graph
- etc.
Graph Storage Engines
Neo4j
“An embedded, disk-based, fully transactional Java persistence engine that stores data structured in graphs rather than in tables.”
http://neo4j.org
HyperGraphDB
“A general purpose, extensible, portable, distributed, embeddable, open-source data storage mechanism. It is a graph database designed specifically for artificial intelligence and semantic web projects.”
http://kobrix.org/hgdb.jsp
Special Purpose Storage Engines
FlockDB
“FlockDB is a database that stores graph data, but it isn't a database optimized for graph-traversal operations. Instead, it's optimized for very large adjacency lists, fast reads and writes, and page-able set arithmetic queries.”
http://engineering.twitter.com/2010/05/introducing-flockdb.html
Redis
“Redis is an advanced key-value store. [...] the dataset is not volatile, and values can be strings, exactly like in memcached, but also lists, sets, and ordered sets. All this data types can be manipulated with atomic operations to push/pop elements, add/remove elements, perform server side union, intersection, difference between sets, etc.”
http://code.google.com/p/redis
A Redis Friends/Followers Example
Redis makes you think in terms of data structures, and operations on those structures.
Set: Finite (for our cases) collection of objects in which order has no significance and multiplicity is generally ignored.
S = { Alice, Bob, Carol }
List: Finite (for our cases) collection of objects in which order *is* significant and multiplicity is allowed.
L = [ X, Y, X, Z, Q ]
Store a username under a string key (SET writes a plain string value, not a Redis set)
Command  Key                Value
SET      uid:1000:username  jperras
Use sets for denoting my followers/people I follow.
Adding a new follower
Command  Key                 Value
SADD     uid:1000:following  1001
SADD     uid:1001:followers  1000
Posting Updates
# Assumes the phpredis extension and a local server; $User and $status
# come from the surrounding application.
$r = new Redis();
$r->connect('127.0.0.1', 6379);
# Allocate a post id and store the post as "userid|timestamp|status".
$postid = $r->incr("global:nextPostId");
$post = $User['id'] . "|" . time() . "|" . $status;
$r->set("post:$postid", $post);
# Fan the post id out to every follower's list of posts.
$followers = $r->sMembers("uid:" . $User['id'] . ":followers");
if ($followers === false) $followers = array();
$followers[] = $User['id']; /* Add the post to our own posts too */
foreach ($followers as $fid) {
    $r->lPush("uid:$fid:posts", $postid);
}
# Push the post on the timeline, and trim the timeline to the
# newest 1000 elements.
$r->lPush("global:timeline", $postid);
$r->lTrim("global:timeline", 0, 999);
Common followers? - Set intersections!
Command  Key 1               Key 2
SINTER   uid:1000:followers  uid:1001:followers
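To make the set operations concrete without a running Redis server, here is a small in-memory sketch in Python. It mirrors the `uid:<id>:following` / `uid:<id>:followers` sets with plain Python sets; the class `FollowGraph` and its methods are hypothetical and are not a Redis client API.

```python
from collections import defaultdict

class FollowGraph:
    """In-memory stand-in for the per-user following/followers sets."""

    def __init__(self):
        self.following = defaultdict(set)  # plays the role of uid:<id>:following
        self.followers = defaultdict(set)  # plays the role of uid:<id>:followers

    def follow(self, follower_id, followee_id):
        """Equivalent in spirit to the two SADD commands above."""
        self.following[follower_id].add(followee_id)
        self.followers[followee_id].add(follower_id)

    def common_followers(self, user_a, user_b):
        """Equivalent in spirit to SINTER on the two followers sets."""
        return self.followers[user_a] & self.followers[user_b]
```

Because sets ignore multiplicity, following the same user twice leaves the data unchanged, which is the property that makes sets a natural fit here.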
A MySQL Example
(simplified)
# Mutual Friends
select f.friend_id
from friends f
join friends m on m.user_id = f.friend_id and m.friend_id = f.user_id
where f.user_id = 1234;

# Following (for directed graphs)
select f.friend_id
from friends f
left join friends m on m.user_id = f.friend_id and m.friend_id = f.user_id
where f.user_id = 1234 and m.user_id is null;

# Followers (for directed graphs)
select f.user_id
from friends f
left join friends m on m.user_id = f.friend_id and m.friend_id = f.user_id
where f.friend_id = 1234 and m.user_id is null;
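To try the self-join idea without a MySQL server, the sketch below runs equivalent queries against an in-memory SQLite database from Python. The two-column `friends(user_id, friend_id)` schema is inferred from the queries, and the sample rows and function names are assumptions made for this example.

```python
import sqlite3

def make_demo_db():
    """Build a tiny friends table: one row per directed edge user -> friend."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE friends (user_id INTEGER, friend_id INTEGER)")
    conn.executemany(
        "INSERT INTO friends VALUES (?, ?)",
        [(1234, 1), (1, 1234),   # 1234 and 1 follow each other
         (1234, 2),              # 1234 follows 2, not reciprocated
         (3, 1234)],             # 3 follows 1234, not reciprocated
    )
    return conn

def mutual_friends(conn, user_id):
    """Edges that exist in both directions (inner self-join)."""
    rows = conn.execute(
        """SELECT f.friend_id FROM friends f
           JOIN friends m
             ON m.user_id = f.friend_id AND m.friend_id = f.user_id
           WHERE f.user_id = ? ORDER BY f.friend_id""",
        (user_id,),
    )
    return [row[0] for row in rows]

def one_way_followers(conn, user_id):
    """Users who point at user_id with no edge back (left join, no match)."""
    rows = conn.execute(
        """SELECT f.user_id FROM friends f
           LEFT JOIN friends m
             ON m.user_id = f.friend_id AND m.friend_id = f.user_id
           WHERE f.friend_id = ? AND m.user_id IS NULL ORDER BY f.user_id""",
        (user_id,),
    )
    return [row[0] for row in rows]
```

Each additional hop in the graph needs another self-join, which is where this approach starts to strain.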
Not too bad.
Relational databases can work for the simplest of cases, but are not always the best solution for many graph operations/algorithms.
Graphs and graph databases are only going to be more and more useful.
However, graph algorithms are hard.
So don’t write your own.
And make sure you use a persistent storage engine that is best suited for the type of queries you will be performing.
Resources
The Algorithm Design Manual, Steven S. Skiena
Programming Collective Intelligence, Toby Segaran
Introduction to Algorithms, Cormen, Leiserson, Rivest
@jperras
Photo Credits
Graph of the internet, circa 2003: http://www.duniacyber.com/freebies/education/what-is-internet-lookslike/ (built from partial troll of public servers using traceroute)
My real friends, for letting me use their Facebook profile images.
References
Large Scale Graph Algorithms (class lectures), Yuri Lifshits, Steklov Institute of Mathematics at St. Petersburg
http://mathworld.wolfram.com/Set.html
Programming Collective Intelligence, Toby Segaran
The Algorithm Design Manual, Steven S. Skiena