Narayanan Sundaram, Research Scientist, Intel Labs at MLconf SF - 11/13/15

GraphMat: Bridging the Productivity-Performance Gap in Graph Analytics
Narayanan Sundaram
Parallel Computing Lab, Intel Labs

© 2015 Intel Corporation

A cybersecurity application

• Intel Security
• Loopy belief propagation for reputation management
• ~2 billion vertices, ~6 billion edges
• Needed to run daily

• Took almost a day with Giraph on 16 machines

How can we handle Internet-of-Things reputation management without increased performance?

[Figure: traffic classes: port scanning, DDoS, normal traffic]


A social problem

• PageRank
• ~1 trillion edges in graph
• Takes 3 minutes/iteration on 200 machines on Giraph

How can we handle personalized PageRank for even top 1% users without increased performance?

Ching, Avery, Sergey Edunov, Maja Kabiljo, Dionysios Logothetis, and Sambavi Muthukrishnan. "One trillion edges: graph processing at Facebook-scale." Proceedings of the VLDB Endowment 8, no. 12 (2015): 1804-1815.


Problem scale

Social network: ~1 billion vertices, ~100 billion connections

Web graph: ~50 billion pages, ~1 trillion hyperlinks

Brain network: ~100 billion neurons, ~100 trillion connections

Marc Smith: NodeXL Twitter Network Graphs: CHI2010: https://www.flickr.com/photos/marc_smith/4511844243 (License: CC BY 2.0 http://creativecommons.org/licenses/by/2.5 )

Larry & Teddy Page: Blog webgraph: https://www.flickr.com/photos/igboo/1814232325 (License: CC BY 2.0 http://creativecommons.org/licenses/by/2.5 )

Xavier Gigandet et al.: Gigandet X, Hagmann P, Kurant M, Cammoun L, Meuli R, et al. (2008) Estimating the Confidence Level of White Matter Connections Obtained with MRI Tractography. PLoS ONE 3(12): e4006. doi:10.1371/journal.pone.0004006, https://dx.doi.org/10.1371/journal.pone.0004006 (License: CC BY 2.0 http://creativecommons.org/licenses/by/2.5 )


GraphMat

• What is GraphMat?
– GraphMat is a graph programming framework with vertex programming as front-end and sparse matrix operations as back-end
– “Matrix level performance with vertex program productivity”

• How can it help you?
– “I know vertex programming and I like it, but Giraph/GraphX/Pregel/GraphLab… is too slow”
– “I heard that graph programs can be written as matrix operations (and matrices are fast), but I do not want to recode my graph algorithms as matrix algorithms”

Narayanan Sundaram, Nadathur Satish, Md Mostofa Ali Patwary, Subramanya R Dulloor, Michael Anderson, Satya Gautam Vadlamudi, Dipankar Das, Pradeep Dubey, “GraphMat: High performance graph analytics made productive”, PVLDB, Vol 8 No 11, 2015.


Why?

• Why GraphMat?
– We want to enable super-fast distributed graph processing on x86 servers

• Why open-source?
– We want to enable super-fast distributed graph processing on x86 servers for everyone
– C++/MPI
– BSD license

• Integrate it with your data processing/ML frameworks
– We can help


Current state-of-the-art

• GraphMat is faster than other distributed graph frameworks
– Faster than GraphLab, CombBLAS, GraphX, Giraph…

• Optimized for multi-node and multi-core
• Uses vertex programming

• Bringing sparse matrix optimizations from High Performance Computing to Big Graph processing


Diversity in current graph frameworks

• Vertex programming “think like a vertex”
– GraphLab, Giraph, MapGraph, Pregel, GraphX

• Matrix based “graphs are sparse matrices”
– CombBLAS, PEGASUS

• Task models
– Galois

• Declarative programming
– SociaLite (datalog-like)

• Domain-specific languages
– GreenMarl

[Figure: PageRank (8 mi...), speedup w.r.t. Giraph for Giraph, GraphLab, CombBLAS, Galois, and Native; axis ticks from 0 to 160]


Diversity in current graph frameworks (contd.)

[Table: Framework, Productivity, Performance; rows: GraphLab, Giraph, CombBLAS, Galois, GraphX, GraphMat; ratings shown as cell colors]

Green = good, orange = OK, red = bad.

Combine high productivity with great performance

Nadathur Satish, Narayanan Sundaram, Mostofa Patwary, Jiwon Seo, Jongsoo Park, Muhammad Hassaan, Shubho Sengupta, Zhaoming Yin, Pradeep Dubey, “Navigating the Maze of Graph Analytics Frameworks using Massive Graph Datasets”, SIGMOD 2014


Assumptions

• Vertex programming is productive
• Fewer building blocks are better
• Sparse matrix operations are scalable
• Very few people have the ability and interest to optimize “to the metal”
• Can use MPI in distributed setting (even on cloud)
• This assumption may be relaxed in the future
• Graph data fits in memory


GraphMat: High level operation

Benefits
• High productivity (vertex programming for users)
• High performance (optimized sparse matrix backend)

Vertex program:
• Send message to all edges
• Process incoming message
• Reduce
• Operate on vertex

Our transformation:
• Send message → Create (sparse) vector
• Process message → SpMV multiply
• Reduce → SpMV add
• Apply → Data parallel operator

Scatter, Gather, Apply → Generalized SpMV / SpGEMM
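The mapping above can be sketched in a few lines of Python. This is only an illustration of the idea, not GraphMat's C++ implementation: the function name `generalized_spmv`, the edge-list representation, and the dictionary used as a sparse vector are all assumptions made for this example. The point is that the "multiply" and "add" of an SpMV are replaced by user-supplied `process_message` and `reduce` functions.

```python
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical representation: one (src, dst, edge_value) tuple per edge.
Edge = Tuple[int, int, float]


def generalized_spmv(
    edges: Iterable[Edge],
    messages: Dict[int, float],
    process_message: Callable[[float, float], float],
    reduce: Callable[[float, float], float],
) -> Dict[int, float]:
    """Apply a generalized sparse matrix-vector product over an edge list.

    `messages` plays the role of the sparse input vector: only vertices
    that sent a message appear as keys. `process_message` replaces the
    multiply and `reduce` replaces the add of an ordinary SpMV.
    """
    results: Dict[int, float] = {}
    for src, dst, edge_value in edges:
        if src not in messages:
            continue  # inactive vertex: no entry in the sparse vector
        partial = process_message(messages[src], edge_value)
        if dst in results:
            results[dst] = reduce(results[dst], partial)
        else:
            results[dst] = partial
    return results


# With ordinary multiply and add this reduces to a plain SpMV.
example_edges = [(0, 1, 2.0), (0, 2, 3.0), (1, 2, 4.0)]
example_messages = {0: 1.0, 1: 10.0}
plain = generalized_spmv(
    example_edges, example_messages, lambda m, e: m * e, lambda a, b: a + b
)
```

Swapping in addition for `process_message` and `min` for `reduce` gives the shortest-path style update, which is why a small set of user-defined functions is enough to express different algorithms over the same backend.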


Example (Vertex Degree)

Can process in-edges, out-edges or all edges.

C++ templates for handling arbitrary types

User-defined functions to specify a particular algorithm
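As a rough illustration of the vertex-degree example, the sketch below expresses degree counting with the four vertex-program steps. It is written in Python for brevity and is not the C++ template code from the slide; the function name `vertex_degrees` and the `direction` parameter are hypothetical names chosen for this example.

```python
from typing import List, Sequence, Tuple


def vertex_degrees(
    num_vertices: int,
    edges: Sequence[Tuple[int, int]],
    direction: str = "in",
) -> List[int]:
    """Count vertex degrees as a vertex program.

    direction is "in", "out", or "all", mirroring the choice of which
    edges the program processes.
    """
    if direction not in ("in", "out", "all"):
        raise ValueError("direction must be 'in', 'out', or 'all'")

    reduced = [0] * num_vertices
    for src, dst in edges:
        message = 1        # SEND_MESSAGE: every vertex sends the constant 1
        result = message   # PROCESS_MESSAGE: the edge value is ignored
        if direction in ("in", "all"):
            reduced[dst] += result   # REDUCE: sum over incoming edges
        if direction in ("out", "all"):
            reduced[src] += result   # REDUCE: sum over outgoing edges
    return reduced         # APPLY: degree = reduced value


sample_edges = [(0, 1), (0, 2), (1, 2), (3, 0)]
in_degrees = vertex_degrees(4, sample_edges, "in")
```

Only the message, the reduction, and the apply step are specific to degree counting; the traversal loop stands in for the backend that a framework would supply.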


What is new?

• Graph algorithms as linear algebra are well-known

• Unifying vertex programming with linear algebra is new


Example: Single Source Shortest Path

SEND_MESSAGE : message = vertex_distance
PROCESS_MESSAGE : result = message + edge_value
REDUCE : result = min(result, operand)
APPLY : vertex_distance = min(result, vertex_distance)

[Figure: example graph with vertices A, B, C, D, E and edge weights 1, 2, 1, 3, 4, 2, 2]

Init: [∞ ∞ ∞ ∞ ∞] → [0 ∞ ∞ ∞ ∞]

[Figure: the example graph at Iteration 0 and Iteration 1, annotated with reduced values, previous distances, and updated distances. Distance labels shown on the three graph panels: (0, ∞, ∞, ∞, ∞), (0, ∞, 1, 2, 3), (0, 4, 1, 2, 2)]
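A minimal Python sketch of the four shortest-path functions follows. It assumes non-negative edge weights and runs until no distance changes. The graph in it is a hypothetical one made up for this example, not the graph from the slide, and the function name `sssp` is likewise an assumption.

```python
import math
from typing import Dict, List, Sequence, Tuple


def sssp(
    num_vertices: int,
    edges: Sequence[Tuple[int, int, float]],
    source: int,
) -> List[float]:
    """Single source shortest path written as a vertex program."""
    distance = [math.inf] * num_vertices
    distance[source] = 0.0
    active = {source}  # only vertices whose distance changed send messages

    while active:
        reduced: Dict[int, float] = {}
        for src, dst, edge_value in edges:
            if src not in active:
                continue
            message = distance[src]                    # SEND_MESSAGE
            result = message + edge_value              # PROCESS_MESSAGE
            reduced[dst] = min(reduced.get(dst, math.inf), result)  # REDUCE

        next_active = set()
        for vertex, result in reduced.items():
            updated = min(result, distance[vertex])    # APPLY
            if updated < distance[vertex]:
                distance[vertex] = updated
                next_active.add(vertex)
        active = next_active

    return distance


# Hypothetical 5-vertex graph; vertex 4 is unreachable from vertex 0.
toy_edges = [(0, 1, 1.0), (0, 2, 4.0), (1, 2, 2.0), (2, 3, 1.0), (1, 3, 5.0)]
toy_distances = sssp(5, toy_edges, source=0)
```

Each pass of the `while` loop corresponds to one iteration in the slides: messages from active vertices are processed and reduced per destination, and the apply step keeps the smaller of the previous and reduced distances.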


Optimizations

• Flexible graph partitioning
– 1-D, 2-D, Block cyclic

• Flexible data structures
– Compressed Sparse Row (CSR)
– Doubly compressed sparse column (DCSC)
– Dense with bitvectors

• Low-level
– Compiler optimizations
– Vectorization
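To make one of the data structures named above concrete, here is a small Python sketch that builds a Compressed Sparse Row layout from an edge list. It illustrates the general CSR format only and makes no claim about how GraphMat stores its matrices; `build_csr` is a hypothetical helper name.

```python
from typing import List, Sequence, Tuple


def build_csr(
    num_rows: int,
    entries: Sequence[Tuple[int, int, float]],
) -> Tuple[List[int], List[int], List[float]]:
    """Build CSR arrays (row_ptr, col_idx, values) from (row, col, value) entries.

    Entries of row r end up in positions row_ptr[r] .. row_ptr[r + 1] - 1.
    """
    # Count the entries in each row, shifted by one position.
    row_ptr = [0] * (num_rows + 1)
    for row, _, _ in entries:
        row_ptr[row + 1] += 1

    # Prefix sum turns the counts into row start offsets.
    for i in range(num_rows):
        row_ptr[i + 1] += row_ptr[i]

    # Scatter every entry into the next free slot of its row.
    col_idx = [0] * len(entries)
    values = [0.0] * len(entries)
    next_slot = row_ptr[:-1]  # slicing copies, so row_ptr stays intact
    for row, col, value in entries:
        slot = next_slot[row]
        col_idx[slot] = col
        values[slot] = value
        next_slot[row] += 1

    return row_ptr, col_idx, values


csr_row_ptr, csr_col_idx, csr_values = build_csr(
    3, [(0, 1, 1.0), (2, 0, 3.0), (0, 2, 2.0)]
)
```

CSR still spends one `row_ptr` entry on every row, including empty ones. Doubly compressed formats such as DCSC avoid that cost by storing pointers only for non-empty columns, which matters when a partitioned graph leaves many rows or columns empty.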


GraphMat vs others

[Figure: slowdown vs GraphMat (>1 implies GraphMat is faster) for MapGraph, Galois, CombBLAS, and GraphLab; axis ticks from 0 to 8]


Is GraphMat good enough?

[Figure: native runtime vs GraphMat, time in seconds (axis ticks from 0 to 8), GraphMat and native optimized code for Pagerank, Breadth First Search, Triangle counting, and Shortest path]

Within 1.2X of native performance on average

* Native code performs direction optimized sweeps for BFS, GraphMat only forward


Scalability (Preliminary results)

Weak scaling, RMAT 128 M edges/node

[Figure: Pagerank, time per iteration (in sec, axis ticks 0.1 to 100) vs #Nodes (1, 2, 4), GraphMat and GraphX]

[Figure: Shortest path, time in seconds (axis ticks 0.1 to 1000) vs #Nodes (1, 2, 4), GraphMat and GraphX]


Availability

• Open source under BSD license
– https://github.com/narayanan2004/GraphMat
– (Single-node code only at the moment)

• Plan to integrate with 3rd party data processing frameworks
– JNI wrappers to call with Spark as a first step




Summary

• GraphMat bridges the productivity-performance gap for graph analytics

• Within 20% of native code performance
• Faster than GraphLab, CombBLAS, Galois, and GraphX
• As easy as vertex programming

• Integration with other frameworks on the way

• Code available under BSD at https://github.com/narayanan2004/GraphMat




Acknowledgements (Parallel Computing Lab, Intel Labs)
Michael J. Anderson
Nadathur Rajagopalan Satish
Md Mostofa Ali Patwary
Subramanya Dulloor
Satya Gautam Vadlamudi
Nesreen Ahmed
Dipankar Das
Ted Willke
Pradeep Dubey

Questions?
https://github.com/narayanan2004/GraphMat

[email protected]


