GPU-accelerated Large-scale Dense Subgraph Detection
Andy (Chang-Jun) Wu   Email: [email protected]   Xerox Research Center, Webster, NY
What is dense subgraph detection?

Definition:
- Detect subsets of vertices such that the connections within the induced subgraphs are dense and their connections to the rest of the graph are sparse.
- Unsupervised learning

Applications:
• Community detection
• Recommender system
• Graph visualization
• Data exploration
Large-scale dense subgraph detection?
Graph clustering heuristics

Observation:
- Nodes belonging to the same cluster have a high overlap of their neighbors (aka outlinks or adjacency lists).

Clustering heuristics:
- If two nodes have a high overlap of their neighbors, then they most likely belong to the same cluster.
- It is a necessary but not sufficient condition.

[Figure: an instance of a dense subgraph on the vertices u, v, x, y, z, p]

Dense subgraph detection
Γ(x): the adjacency list of vertex x
Γ(x) = {u, v, p, z, y}    Γ(y) = {u, v, p, z, x}
Set comparison

A = Γ(x) = {u, v, p, z, y}    B = Γ(y) = {u, v, p, z, x}
{u, z}    {u, z}

Min-wise permutation theory (Broder, 1997):
$$\Pr\{\min_k(\pi_i(A)) = \min_k(\pi_i(B))\} = \frac{|A \cap B|}{|A \cup B|}$$
π_i(): a permutation on a set of elements
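
The min-wise identity above can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses the single-minimum case (k = 1), where the identity is exact, draws random permutations with Python's `random` module, and the helper names `random_permutation`, `min_under` and `estimate_jaccard` are hypothetical, not taken from the talk.

```python
import random

def random_permutation(universe, rng):
    """Map each element of the universe to its rank under a random permutation."""
    order = list(universe)
    rng.shuffle(order)
    return {elem: rank for rank, elem in enumerate(order)}

def min_under(perm, items):
    """Return the element of `items` with the smallest rank under `perm`."""
    return min(items, key=perm.__getitem__)

def estimate_jaccard(a, b, trials=2000, seed=0):
    """Fraction of random permutations under which both sets share the same minimum."""
    rng = random.Random(seed)
    universe = sorted(set(a) | set(b))
    hits = 0
    for _ in range(trials):
        perm = random_permutation(universe, rng)
        if min_under(perm, a) == min_under(perm, b):
            hits += 1
    return hits / trials

A = {"u", "v", "p", "z", "y"}   # Γ(x) from the slide
B = {"u", "v", "p", "z", "x"}   # Γ(y) from the slide
exact = len(A & B) / len(A | B)  # 4 shared neighbors out of 6 distinct ones
estimate = estimate_jaccard(A, B)
```

With enough permutations, `estimate` should approach `exact`; in the shingling setting only a small number c of permutations is used per adjacency list.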
Shingling on all adjacency lists

[Figure: the adjacency lists Γ(v1), Γ(v2), Γ(v3), Γ(v4), Γ(v5) each go through a permutation step and each yields C shingles; labels: permutation, C shingles, shingle, v1, v2, v3, v4, v5]
Shingling example for a clique

A clique example

Input graph (size: n×n):
v1: v1, v2, v3, …, vn
v2: v1, v2, v3, …, vn
…
vn: v1, v2, v3, …, vn

1st level shingling produces <vertex, shingle> pairs:
<v1, S1>, <v1, S2>, …, <v1, Sc>
…
<vn, S1>, <vn, S2>, …, <vn, Sc>

New input graph built from the 1st level shingles (size: c×n):
S1: v1, v2, v3, …, vn
S2: v1, v2, v3, …, vn
…
Sc: v1, v2, v3, …, vn

After 2nd level shingling, the 2nd level shingles (size: c×c):
T1: S1, S2, S3, …, Sc
T2: S1, S2, S3, …, Sc
…
Tc: S1, S2, S3, …, Sc

Result: clique → dense subgraph {v1, v2, …, vn}
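
The clique example can be reproduced with a small sketch of two shingling passes. This is an illustration under assumptions, not the implementation from the talk: the function name `shingle_pass`, the shingle size `s`, and the decision to tag each shingle with the index of the permutation that produced it (so that the c shingles stay distinct) are all hypothetical choices.

```python
import random

def shingle_pass(adj, s, c, seed=0):
    """One shingling level: map each shingle to the sorted list of nodes that produced it.

    adj: dict node -> list of neighbors; s: elements per shingle; c: number of permutations.
    """
    rng = random.Random(seed)
    universe = sorted({x for nbrs in adj.values() for x in nbrs})
    inverted = {}
    for trial in range(c):
        order = universe[:]
        rng.shuffle(order)                       # one random permutation per trial
        rank = {x: r for r, x in enumerate(order)}
        for node, nbrs in adj.items():
            if len(nbrs) < s:
                continue                         # list too short to form a shingle
            smallest = sorted(nbrs, key=rank.__getitem__)[:s]
            shingle = (trial,) + tuple(smallest)  # tagged with the permutation index
            inverted.setdefault(shingle, []).append(node)
    return {sh: sorted(nodes) for sh, nodes in inverted.items()}

n, s, c = 5, 2, 3
vertices = [f"v{i}" for i in range(1, n + 1)]
clique = {v: list(vertices) for v in vertices}  # n×n input; each list includes the vertex itself, as on the slide
level1 = shingle_pass(clique, s, c, seed=1)     # c shingles, each listing all n vertices (c×n)
level2 = shingle_pass(level1, s, c, seed=2)     # c second-level shingles, each listing all c first-level shingles (c×c)
```

Because every vertex of the clique has the same adjacency list, every permutation yields a single shingle shared by all vertices, which is why the second-level graph collapses to size c×c.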
Shingling on GPU
GPU introduction

[Figure: CUDA memory hierarchy. Thread: per-thread memory (1x). Thread block: per-block shared memory (1x). Grid: global memory (100x). Host memory sits outside the GPU.]

CUDA application: host = CPU, device = GPU
A CPU-GPU computational framework

[Figure: time line of the framework; labels: load input graph, 1st level shingling (adjacency lists → shingles), 2nd level shingling (adjacency lists → shingles), aggregate graph, report dense subgraph]
Shingling

[Figure: the adjacency lists Γ(v1), Γ(v2), Γ(v3), Γ(v4), Γ(v5) each yield C shingles; labels: permutation, sorting, C shingles, shingle, v1, v2, v3, v4, v5]
Shingling on GPU

[Figure: data layout across the memory hierarchy; labels: CPU memory (seg1 to seg7), global memory (seg1 to seg5), shared memory, thread block0 to thread block3, iteration0, iteration1, …, iterationC]
Segmented sorting problem

Parallel counting sort (one thread block over A[0] … A[n]; thread i handles A[i]):
offset[i] = count(j < i where A[j] > A[i]) - count(j > i where A[j] < A[i])
A[i] ---> A[i - offset[i]]
gt: greater counter, lt: less counter
index = i - (gt - lt)

Segmented counting sort: thread block0, thread block1, thread block2, thread block3, each covering BLOCK_SIZE elements.

Parallel odd-even sort, parallel merge sort, parallel radix sort - Satish et al. (2009): shuffle
Segmented counting sort: no data shuffling.
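
The index rule `index = i - (gt - lt)` can be traced with a sequential sketch. It is only an illustration: the function names and the `bounds` array of segment boundaries are assumptions, and the loops that a GPU would run as one thread per element and one thread block per segment are executed serially here.

```python
def rank_sort_segment(a, start, end, out):
    """Write every element of a[start:end] to its sorted position inside out[start:end]."""
    for i in range(start, end):                                 # one GPU thread per i
        gt = sum(1 for j in range(start, i) if a[j] > a[i])     # greater elements before i
        lt = sum(1 for j in range(i + 1, end) if a[j] < a[i])   # smaller elements after i
        out[i - (gt - lt)] = a[i]

def segmented_sort(a, bounds):
    """Sort each segment a[bounds[k]:bounds[k+1]] independently of the others."""
    out = [None] * len(a)
    for start, end in zip(bounds, bounds[1:]):                  # one thread block per segment
        rank_sort_segment(a, start, end, out)
    return out

data = [5, 2, 9, 2, 7, 1, 8, 3]
bounds = [0, 4, 6, 8]            # three segments: [5, 2, 9, 2], [7, 1], [8, 3]
result = segmented_sort(data, bounds)
```

Each element computes its destination from comparisons alone, so no element has to leave its segment; equal keys keep their relative order.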
Experimental studies
Experimental platform

GPU (device): NVIDIA Tesla Kepler K20c
- CUDA capability: 3.5
- CUDA driver/runtime: 5.0
- Streaming multiprocessors: 13
- CUDA cores: 192 per multiprocessor
- Shared memory: 48KB
- Global memory: 5GB

CPU (host): Intel Xeon E5-2650
- RAM: 32GB
- OS: Red Hat Enterprise Linux 6.3

Host and device are connected through PCI Express 3.0.
Performance study

Runtime of each component in gpClust, compared with the serial runtime:

Input graph   CPU       GPU      H<->D    I/O     Total runtime   Serial runtime   Speedup   GPU speedup
20K           52.70     7.57     6.08     0.40    66.75           392.32           5.88x     44.86x
2M            2685.06   447.97   114.18   28.77   3275.89         23,537.80        7.18x     373.71x

Input graphs:

# input seqs.   # singleton vertices   # vertices   # edges      Average degree   Largest CC size
20,000          2,921                  17,079       374,928      44 ± 69          10,707
2,004,241       441,257                1,562,984    56,919,738   73 ± 153         31,872

- Two arbitrary sets of predicted protein families from the Global Ocean Sampling (GOS) project.

α = 10%-20%: serial computation time on the CPU side
Amdahl's law: max speedup = 1/α = 5x - 10x
Better performance can be achieved through streaming.
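
The Amdahl bound quoted above is simple arithmetic. In this sketch the function names are hypothetical and the acceleration factor of 50 is an assumed value chosen only for illustration; the α values are the 10% and 20% serial fractions from the slide.

```python
def amdahl_speedup(alpha, s):
    """Overall speedup when a fraction alpha stays serial and the rest runs s times faster."""
    return 1.0 / (alpha + (1.0 - alpha) / s)

def amdahl_limit(alpha):
    """Upper bound on the speedup as the accelerated part becomes arbitrarily fast."""
    return 1.0 / alpha

limit_20 = amdahl_limit(0.20)           # 20% serial: at most 5x
limit_10 = amdahl_limit(0.10)           # 10% serial: at most 10x
finite_10 = amdahl_speedup(0.10, 50.0)  # assumed 50x acceleration of the parallel part
```

The reported overall speedups of 5.88x and 7.18x fall below the 10x bound for α = 10%, which is consistent with the serial CPU part being the limiting factor.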
Cluster size distribution

[Figure: bar chart of the number of groups (y-axis, 0 to 3500) per group size (x-axis bins: 20-49, 50-99, 100-199, 200-499, 500-999, 1000-2000, >2000), comparing the gpClust approach with the GOS approach]
Conclusion
• A GPU-accelerated large-scale graph clustering algorithm.
  - Scalable solution to large-scale input graphs
  - A good speedup on sequence similarity graphs

Future work
• Parallel counting sort ---> more efficient order statistics.
• Push more workload (e.g. union-find) to the GPU side.
THANKS!
Questions?

BACKUP
Metagenomics

• Environmental microbial communities
  - < 1% of microbes can be isolated and cultivated in a standard laboratory environment.

[Figure: annotation pipeline; labels: assemble DNA & predict genes, translated ORF sequences, 10^6 - 10^8 new sequences, already known protein seq. & families, ~5×10^7 clusters, ~10^8 sequences, protein family identification, community annotation, known family_m, new family_1]
Overview

Input sequences → Graph construction → Dense subgraph detection

Our parallel approach
- pGraph: parallel graph construction
  • Distributed memory alg. - Wu et al. (2008, 2012)
- pClust: Min-wise clustering
  • MapReduce version - Rytsareva et al. (2011)
  • Multi-core CPU version - Chapman et al. (2011)
  • GPGPU version

Global Ocean Sampling (GOS), Yooseph et al. (2007)
- All-against-all BLAST
- K-neighbor clustering
Shingling algorithm (Gibson et al., 2005)

Input graph:
Γ(x) = {a, b, c, d, …, r}
Γ(y) = {a, b, c, e, …, r}
Γ(z) = {…}
…

Permutations: π_1(), π_2(), π_3(), …, π_c()

Shingles:
<x, S_1^x>, <x, S_2^x>, <x, S_3^x>, …, <x, S_c^x>
…
<z, S_1^z>, <z, S_2^z>, <z, S_3^z>, …, <z, S_c^z>

Set comparison
Min-wise permutation theory:
$$\Pr\{\min_k(\pi_i(\Gamma(x))) = \min_k(\pi_i(\Gamma(y)))\} = \frac{|\Gamma(x) \cap \Gamma(y)|}{|\Gamma(x) \cup \Gamma(y)|}$$
Γ(x): the adjacency list of vertex x
π_i(): a permutation on a set of elements
Strategy for qualitative comparison

[Figure: comparison workflow starting from the input sequences.
- GOS: all-against-all BLAST, k-neighbor clustering → GOS clusters (sequence-based profiling)
- gpClust: parallel homology detection, shingling clustering → gpClust clusters (sequence-based profiling)
- Benchmark (protein families): cluster expansion (profile-based profiling)]

Profile-based matching is more sensitive than sequence-based matching.
Qualitative study for 2M sequences

Approach    # clusters   Density       # seqs included   Largest group size   Average group size
Benchmark   813          0.09 ± 0.12   2,004,241         56,266               2,465 ± 4,372
GOS         6,152        0.40 ± 0.27   1,236,712         20,027               201 ± 650
gpClust     6,646        0.75 ± 0.28   1,414,952         19,066               213 ± 721

$$\text{density} = \frac{\#\ \text{edges in a cluster}}{\#\ \text{all possible edges in a cluster}}$$
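
The density measure can be computed directly from an edge list. The sketch assumes an undirected simple graph, so that a cluster of n vertices has n(n-1)/2 possible edges; that counting convention, the function name `cluster_density` and the toy edge list are assumptions for illustration only.

```python
def cluster_density(members, edges):
    """Fraction of the possible undirected edges (no self-loops) present among `members`."""
    members = set(members)
    n = len(members)
    if n < 2:
        return 0.0                      # no edge is possible in a cluster of 0 or 1 vertices
    inside = {frozenset((u, v)) for u, v in edges
              if u != v and u in members and v in members}
    return len(inside) / (n * (n - 1) / 2)

edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
d_triangle = cluster_density(["a", "b", "c"], edges)   # 3 of 3 possible edges
d_all = cluster_density(["a", "b", "c", "d"], edges)   # 4 of 6 possible edges
```

A density close to 1 means the cluster is close to a clique, which is the sense in which the gpClust clusters in the table are denser than the GOS clusters.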