Parallel Applications And Tools For Cloud Computing Environments
CloudCom 2010, Indianapolis, Indiana, USA
Nov 30 – Dec 3, 2010
Large Scale PageRank with Iterative MapReduce
Shuohuan, Yuduo, Parag, Hui
Outline
- motivation of large scale PageRank
- optimization strategies
- experiment results
- visualization with PlotViz3
Large Scale PageRank
- Large-scale graph processing has become popular; efficient processing of large-scale graphs challenges current MapReduce runtimes.
- Motivation: common optimization strategies for large scale PageRank.
- Current status: Twister, Hadoop, and DryadLINQ PageRank with the ClueWeb data set (50 million pages); MPI PageRank.
Optimization Strategies
- Cache partitions of the web graph in memory (Twister, Pregel, HaLoop, Surfer): static data (am files).
- Partition the web graph (DryadLINQ, (Twister, Hadoop) PageRank): task granularity should fit the memory and network bandwidth of the cloud infrastructure.
- Hierarchical messaging in the reduce stage (Hadoop, (Twister, DryadLINQ) PageRank): local merge.
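The iterative structure these strategies optimize can be illustrated with a minimal single-machine sketch of MapReduce-style PageRank (the function names, toy graph, and damping factor 0.85 are hypothetical, not the deck's actual Twister/Hadoop code):

```python
from collections import defaultdict

D = 0.85  # damping factor (assumed)

def map_phase(graph, ranks):
    # Map side: read the static web graph, emit (target, partial rank) pairs.
    for page, links in graph.items():
        share = ranks[page] / len(links)
        for target in links:
            yield target, share

def reduce_phase(pairs, n):
    # Reduce side: sum the partial ranks per page, then apply damping.
    sums = defaultdict(float)
    for page, partial in pairs:
        sums[page] += partial
    return {page: (1 - D) / n + D * s for page, s in sums.items()}

# Toy graph (hypothetical): every page has at least one out-link.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = {p: 1.0 / len(graph) for p in graph}

for _ in range(50):  # iterate the two phases until roughly converged
    new = reduce_phase(map_phase(graph, ranks), len(graph))
    ranks = {p: new.get(p, (1 - D) / len(graph)) for p in graph}
```

Because the web graph is read in every map phase but never changes between iterations, caching those partitions in memory (the first strategy above) avoids re-loading it each round.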
Cache Static Data
[Chart: PageRank runtime with cached static data, comparing Twister and Hadoop]
Partition the Web Graph: scalability with various numbers of nodes on Madrid
[Chart: runtime on 8 down to 3 nodes; series: 420 file chunks vs. a single file per node, with linear trend lines]
Partition the Web Graph: scalability with various input data sizes on Tempest
[Chart: runtime for 160 to 1280 input files; series: fine granularity vs. coarse granularity, with linear trend lines]
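The granularity trade-off in these partitioning experiments can be sketched with a toy chunker (hypothetical round-robin splitting in Python, not the actual ClueWeb partitioner):

```python
def partition(adjacency_lines, num_chunks):
    # Round-robin adjacency-list lines into num_chunks partitions of
    # near-equal size, so each map task's input fits in memory.
    chunks = [[] for _ in range(num_chunks)]
    for i, line in enumerate(adjacency_lines):
        chunks[i % num_chunks].append(line)
    return chunks

# 100 hypothetical adjacency-list lines split into 8 partitions
lines = ["page%d\tpage%d" % (i, (i + 1) % 100) for i in range(100)]
parts = partition(lines, 8)
sizes = [len(p) for p in parts]
```

More, smaller partitions (fine granularity) allow better load balancing and keep each task within memory limits; fewer, larger partitions (coarse granularity) reduce scheduling and transfer overhead.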
Hierarchical Messaging in the Reduce Stage
[Chart: message sizes for reduce tasks 1 through 32; series: original msg size vs. msg size after local merge]
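The local merge measured above acts like a combiner: partial rank contributions are summed on the sending node before the shuffle, shrinking the message volume. A minimal sketch (the pair format and data are hypothetical):

```python
from collections import defaultdict

def local_merge(pairs):
    # Combine (page, partial rank) pairs on the sending node before
    # shuffling to the reducer, so each page is sent only once.
    merged = defaultdict(float)
    for page, partial in pairs:
        merged[page] += partial
    return list(merged.items())

# 1000 pairs that target only 10 distinct pages (hypothetical data)
pairs = [("page%d" % (i % 10), 0.001) for i in range(1000)]
merged = local_merge(pairs)
```

Here 1000 pairs collapse to 10, which is the kind of message-size reduction the chart compares against the unmerged baseline.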
Visualization with PlotViz3: 1k vertices, red vertex: wikipedia.org