8/11/2019 087 Pagerank Mapreduce Pregel
Where we are
Bill Howe, UW 1
1. Graph Tasks
2. Ex: Histograms
3. Structural
4. Ex: PageRank
5. Traversal
6. Patterns
7. Pattern Languages
8. Ex: PRISM
9-10. Ex: Loops in MR
11. Representations
12. Ex: PageRank in MR
12. Ex: PageRank in Pregel
Big Graphs
Social scale
1 billion vertices, 100 billion edges
Web scale
50 billion vertices, 1 trillion edges
Brain scale
100 billion vertices, 100 trillion edges
Gerhard et al., Frontiers in Neuroinformatics, 2011
Web graph from the SNAP database (http://snap.stanford.edu/data)
Paul Butler, Facebook, 2010
https://www.facebook.com/notes/facebook-engineering/visualizing-friendships/469716398919
Material adapted from Paul Burkhardt, Chris Waring
MapReduce for PageRank
class Mapper
    method MAP(id n, vertex N)
        p ← N.PAGERANK / |N.ADJACENCYLIST|
        EMIT(id n, vertex N)
        for all nodeid m in N.ADJACENCYLIST do
            EMIT(id m, value p)

class Reducer
    method REDUCE(id m, [p1, p2, …])
        M ← null, s ← 0
        for all p in [p1, p2, …] do
            if ISVERTEX(p) then
                M ← p
            else
                s ← s + p
        M.PAGERANK ← s * 0.85 + 0.15 / TOTALVERTICES
        EMIT(id m, vertex M)
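The pseudocode above can be sketched in Python with an in-memory dictionary standing in for the shuffle. This is a minimal illustration, not a Hadoop job: the names map_vertex, reduce_vertex and run_iteration, the dict-based vertex record, and the three-vertex graph are all hypothetical, and the sketch assumes every vertex that receives a contribution also has its own record.

```python
from collections import defaultdict

def map_vertex(node_id, vertex):
    """Mapper: re-emit the vertex record, then one rank share per out-edge."""
    yield node_id, vertex  # the graph structure travels through the shuffle
    out = vertex["adjacency"]
    if out:  # assumption: a vertex with no out-edges sends nothing
        share = vertex["pagerank"] / len(out)
        for m in out:
            yield m, share

def reduce_vertex(node_id, values, total_vertices):
    """Reducer: recover the vertex record and sum the incoming shares."""
    vertex, s = None, 0.0
    for v in values:
        if isinstance(v, dict):  # plays the role of ISVERTEX(p)
            vertex = v
        else:
            s += v
    # assumption: every key has a vertex record, so vertex is not None here
    return node_id, dict(vertex, pagerank=s * 0.85 + 0.15 / total_vertices)

def run_iteration(graph):
    """One map, shuffle, reduce pass over an in-memory graph."""
    shuffled = defaultdict(list)  # stand-in for the shuffle phase
    for node_id, vertex in graph.items():
        for key, value in map_vertex(node_id, vertex):
            shuffled[key].append(value)
    return dict(reduce_vertex(k, vs, len(graph)) for k, vs in shuffled.items())

# Hypothetical toy graph; every vertex has at least one out-edge.
graph = {
    "a": {"pagerank": 1 / 3, "adjacency": ["b", "c"]},
    "b": {"pagerank": 1 / 3, "adjacency": ["c"]},
    "c": {"pagerank": 1 / 3, "adjacency": ["a"]},
}
for _ in range(20):  # the iteration is driven from outside map and reduce
    graph = run_iteration(graph)
```

Because no vertex in the toy graph is a dead end, the ranks keep summing to 1 after every pass, which is a quick sanity check on the reducer formula.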
Problems
The entire state of the graph is shuffled on every iteration
We only need to shuffle the new rank contributions, not the graph structure
Further, we have to control the iteration outside of MapReduce
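To put a rough number on the first problem, the sketch below counts how many values cross the shuffle when the Mapper re-emits every vertex record, compared with sending rank contributions alone. The function name shuffled_values and the unit of "one value per rank or neighbor id" are assumptions chosen for illustration; real record sizes depend on the serialization format.

```python
def shuffled_values(adjacency, iterations):
    """Crude count of values moved through the shuffle over all iterations."""
    vertices = len(adjacency)
    edges = sum(len(out) for out in adjacency.values())
    # EMIT(id n, vertex N): one rank plus every neighbor id, for every vertex
    structure_per_iteration = vertices + edges
    # EMIT(id m, value p): one rank share per edge
    contributions_per_iteration = edges
    return {
        "with_structure": iterations
        * (structure_per_iteration + contributions_per_iteration),
        "contributions_only": iterations * contributions_per_iteration,
    }

# Hypothetical toy graph: 3 vertices, 4 edges.
adjacency = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
counts = shuffled_values(adjacency, iterations=30)
```

Under this counting, the graph structure accounts for more than half of the shuffled values in every iteration even though it never changes.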
Pregel
Originally from Google
Open source implementations: Apache Giraph, Stanford GPS, JPregel, Hama
Batch algorithms on large graphs
Malewicz et al., SIGMOD 2010
while any vertex is active and max iterations not reached:
    for each vertex:    (this loop is run in parallel)
        process messages from neighbors from previous iteration
        send messages to neighbors
        set active flag appropriately
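The loop above can be sketched as a small single-process simulation. This is an illustration of the superstep idea only, not the API of Pregel or Giraph: run_supersteps, propagate_max, and the (value, message, halt) return convention are hypothetical, and the per-vertex loop runs sequentially here where a real system would run it in parallel. The example computation spreads the largest value through the graph, with a halted vertex woken up again by any incoming message.

```python
def run_supersteps(values, out_edges, compute, max_supersteps=50):
    """Run compute on active vertices until none is active or the cap is hit."""
    values = dict(values)
    active = set(values)  # every vertex starts active
    inbox = {v: [] for v in values}
    superstep = 0
    while active and superstep < max_supersteps:
        outbox = {v: [] for v in values}
        still_active = set()
        for v in active:  # independent per vertex, so parallelizable
            new_value, message, halt = compute(superstep, values[v], inbox[v])
            values[v] = new_value
            if message is not None:
                for w in out_edges[v]:
                    outbox[w].append(message)
            if not halt:
                still_active.add(v)
        inbox = outbox  # messages become visible in the next superstep
        # a vertex with pending messages is reactivated
        active = still_active | {v for v, msgs in inbox.items() if msgs}
        superstep += 1
    return values, superstep

def propagate_max(superstep, value, messages):
    """Adopt the largest value seen; announce it only when it changed."""
    best = max([value, *messages])
    changed = superstep == 0 or best > value
    return best, (best if changed else None), True  # always vote to halt

# Hypothetical four-vertex chain with edges in both directions.
out_edges = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
final, steps = run_supersteps({"a": 3, "b": 6, "c": 2, "d": 1}, out_edges, propagate_max)
```

Messages are buffered in outbox and swapped in only at the end of a superstep, so no vertex ever sees a message sent during the superstep it is currently executing.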
6/17/2013 Bill Howe, Data Science, Autumn 2012 6
class PageRankVertex : public Vertex {
 public:
  virtual void Compute(MessageIterator* msgs) {
    // From superstep 1 on, fold the incoming rank shares into a new rank.
    if (superstep() >= 1) {
      double sum = 0;
      for (; !msgs->Done(); msgs->Next())
        sum += msgs->Value();
      *MutableValue() = 0.15 / NumVertices() + 0.85 * sum;
    }
    // Send an equal share to every out-neighbor for 30 supersteps, then halt.
    if (superstep() < 30) {
      const int64 n = GetOutEdgeIterator().size();
      SendMessageToAllNeighbors(GetValue() / n);
    } else {
      VoteToHalt();
    }
  }
};
[Figure: a five-vertex graph animated across slides 7-18, showing the rank at each vertex and the messages on each edge. The drawing did not survive extraction; the values below are read from the slide text, in the order they appear.]
Every vertex applies the same update in each superstep:
sum = sum(incoming values)
rank = 0.15 / 5 + 0.85 * sum
Ranks
initial: 0.2, 0.2, 0.2, 0.2, 0.2
after superstep 1: 0.172, 0.03, 0.426, 0.34, 0.03
after superstep 2: 0.0513, 0.03, 0.69, 0.197, 0.03
after superstep 3: 0.0513, 0.03, 0.794, 0.095, 0.03
Messages sent
from the initial ranks: 0.1, 0.1, 0.066, 0.066, 0.066, 0.2, 0.2, 0.2
from the superstep 1 ranks: 0.015, 0.015, 0.01, 0.01, 0.01, 0.172, 0.34, 0.426
from the superstep 2 ranks: 0.015, 0.015, 0.01, 0.01, 0.01, 0.0513, 0.197, 0.69
from the superstep 3 ranks (partially shown): 0.01, 0.095, 0.794
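The numbers on these slides can be reproduced with a few lines of Python applying the same update rule. The edge list below is an assumption: the slide text does not give the graph, and OUT_EDGES is simply one set of edges consistent with the message and rank values shown (two vertices with no incoming edges, and one vertex that links to itself). The names v1 to v5 follow the order in which ranks appear on the slides and are otherwise hypothetical.

```python
# Hypothetical edge list inferred from the slide values, not given in the source.
OUT_EDGES = {
    "v1": ["v4"],
    "v2": ["v1", "v4"],
    "v3": ["v3"],
    "v4": ["v3"],
    "v5": ["v1", "v3", "v4"],
}

def pagerank_supersteps(out_edges, supersteps):
    """Return the rank of every vertex before and after each superstep."""
    n = len(out_edges)
    ranks = {v: 1.0 / n for v in out_edges}  # every vertex starts at 1/n
    history = [dict(ranks)]
    for _ in range(supersteps):
        incoming = {v: 0.0 for v in out_edges}
        for v, targets in out_edges.items():
            for t in targets:
                # each vertex splits its rank evenly over its out-edges
                incoming[t] += ranks[v] / len(targets)
        # rank = 0.15 / n + 0.85 * sum(incoming values)
        ranks = {v: 0.15 / n + 0.85 * incoming[v] for v in out_edges}
        history.append(dict(ranks))
    return history

history = pagerank_supersteps(OUT_EDGES, 3)
for step, ranks in enumerate(history):
    print(step, [round(ranks[v], 4) for v in sorted(ranks)])
```

Under this assumed edge list the computed ranks agree with the slide values to within about 0.002; the small differences come from the slides showing rounded or truncated numbers.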