CMU SCS
Large Graph Mining
Christos Faloutsos, CMU
(c) 2013, C. Faloutsos 2
Roadmap
• Introduction – Motivation
• Past work:
  – Big graph mining (‘Pegasus’/hadoop)
  – Propagation / immunization
• Ongoing & future work:
  – (big) tensors
  – brain data
• Conclusions
MLD-AB
(Big) Graphs - Why study them?
Human Disease Network
[Barabasi 2007]
Gene Regulatory Network
[Decourty 2008]
Facebook [2010]: >1B nodes, >$10B
The Internet [2005]
C. Faloutsos (CMU), SUM'13
(Big) Graphs - why study them?
• web-log (‘blog’) news propagation
• computer network security: email/IP traffic and anomaly detection
• Recommendation systems
• ....
• Many-to-many db relationship -> graph
Triangle counting for large graphs?
Anomalous nodes in Twitter (~3 billion edges) [U Kang, Brendan Meeder, +, PAKDD’11]
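As a point of reference for what is being counted, here is a minimal Python sketch of exact per-node triangle counting on a small in-memory graph. It is an illustration only: it is not the PAKDD’11 method and it would not scale to ~3 billion edges. The function name `triangles_per_node` and the toy edge list are assumptions made for this example.

```python
from itertools import combinations


def triangles_per_node(edges):
    """Count, for every node, the triangles it participates in.

    `edges` is an iterable of (u, v) pairs of an undirected graph.
    Self-loops are ignored; duplicate edges are collapsed by the sets.
    """
    # Build an undirected adjacency structure: node -> set of neighbours.
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    counts = {node: 0 for node in adj}
    for u, neighbours in adj.items():
        # Every pair of u's neighbours that is itself connected
        # closes one triangle through u.
        for v, w in combinations(neighbours, 2):
            if w in adj[v]:
                counts[u] += 1
    return counts


# Toy graph (hypothetical data): one triangle 1-2-3 plus a pendant node 4.
toy_edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
toy_counts = triangles_per_node(toy_edges)  # {1: 1, 2: 1, 3: 1, 4: 0}
```

Each triangle is counted once at each of its three corners, so the total number of triangles in the graph is `sum(toy_counts.values()) // 3`.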
Fractional Immunization of Networks
B. Aditya Prakash, Lada Adamic, Theodore Iwashyna (M.D.), Hanghang Tong, Christos Faloutsos
SDM 2013, Austin, TX
Whom to immunize?
• Dynamical Processes over networks
• Each circle is a hospital
• ~3,000 hospitals
• More than 30,000 patients transferred
[US-MEDICARE NETWORK 2005]
Problem: Given k units of disinfectant, whom to immunize?
Whom to immunize?
CURRENT PRACTICE vs. OUR METHOD [US-MEDICARE NETWORK 2005]: ~6x fewer!
Hospital-acquired infections: 99K+ lives, $5B+ per year
Running Time
Wall-clock time: Simulations > 1 week; SMART-ALLOC ≈ 14 secs
> 30,000x speed-up!
What is the ‘silver bullet’?
A: Try to decrease connectivity of graph
Q: how to measure connectivity?
Avg degree, Max degree, Diameter, Modularity, ‘Conductance’
A: first eigenvalue of adjacency matrix
Q1: why??
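To make the answer concrete, the following Python sketch (using NumPy) computes the first, i.e. largest, eigenvalue of a symmetric adjacency matrix and shows how it changes when a node is removed. The helper names `first_eigenvalue` and `drop_node` and the 5-node star graph are assumptions for illustration; this is not the SMART-ALLOC algorithm.

```python
import numpy as np


def first_eigenvalue(adj):
    """Largest eigenvalue of a symmetric (undirected) adjacency matrix."""
    a = np.asarray(adj, dtype=float)
    # eigvalsh returns eigenvalues of a symmetric matrix in ascending order.
    return float(np.linalg.eigvalsh(a)[-1])


def drop_node(adj, node):
    """Return the adjacency matrix with one node (row and column) removed."""
    a = np.asarray(adj, dtype=float)
    keep = [i for i in range(a.shape[0]) if i != node]
    return a[np.ix_(keep, keep)]


# Hypothetical toy graph: a star with hub 0 and four leaves 1..4.
star = np.zeros((5, 5))
star[0, 1:] = 1.0
star[1:, 0] = 1.0

lam_full = first_eigenvalue(star)                   # sqrt(4) = 2 for this star
lam_no_hub = first_eigenvalue(drop_node(star, 0))   # no edges left
lam_no_leaf = first_eigenvalue(drop_node(star, 1))  # star with three leaves
```

On this toy graph, removing the hub drives the first eigenvalue to zero, while removing a leaf only lowers it from 2 to sqrt(3), which matches the intuition that the eigenvalue tracks how well connected the graph is.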
Threshold Conditions for Arbitrary Cascade Models on Arbitrary Networks
B. Aditya Prakash, Deepayan Chakrabarti, Michalis Faloutsos, Nicholas Valler, Christos Faloutsos
IEEE ICDM 2011, Vancouver
extended version, in arxiv: http://arxiv.org/abs/1004.0060
G2 theorem
~10 pages proof
Our thresholds for some models
• s = effective strength
• s < 1 : below threshold

Models                  Effective Strength (s)   Threshold (tipping point)
SIS, SIR, SIRS, SEIR    s = λ · …                s = 1
SIV, SEIV               s = λ · …                s = 1
SI1I2V1V2 (H.I.V.)      s = λ · …                s = 1

Annotations: No immunity; Temp. immunity; w/ incubation
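As a worked illustration of the threshold check, the sketch below assumes the standard SIS form s = λ · β/δ, where λ is the first eigenvalue of the adjacency matrix. The symbols β (infection rate) and δ (recovery rate) are introduced here as assumptions, because the model-specific factors are not legible in the table; the function names and the triangle graph are likewise hypothetical.

```python
import numpy as np


def effective_strength_sis(adj, beta, delta):
    """Effective strength s = lambda * beta / delta (assumed SIS form).

    lambda is the largest eigenvalue of the symmetric adjacency matrix,
    beta is the per-contact infection rate, delta the recovery rate.
    """
    a = np.asarray(adj, dtype=float)
    lam = float(np.linalg.eigvalsh(a)[-1])
    return lam * beta / delta


def below_threshold(s):
    """True when s < 1, i.e. the contagion is below the tipping point."""
    return s < 1.0


# Hypothetical toy graph: a triangle, whose first eigenvalue is 2.
triangle = np.array([[0, 1, 1],
                     [1, 0, 1],
                     [1, 1, 0]])

s_weak = effective_strength_sis(triangle, beta=0.1, delta=0.5)    # about 0.4
s_strong = effective_strength_sis(triangle, beta=0.3, delta=0.5)  # about 1.2
```

With these made-up rates, `below_threshold(s_weak)` is true and `below_threshold(s_strong)` is false; lowering λ (for example by removing well-connected nodes) is one way to push s back under 1.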
Brain data
• Which neurons get activated by ‘bee’
• How wiring evolves
• Modeling epilepsy
N. Sidiropoulos, George Karypis, V. Papalexakis, Tom Mitchell
Preliminary results
• 60 words (‘bee’, ‘apple’, ‘hammer’)
• 80 questions (‘is it alive’, ‘can it hurt you’)
• Brain-scan, for each word
Alive? Can hurt you? …
‘apple’
‘beetle’ ✔
‘hammer’ ✔
[Figure label: Premotor cortex]
CONCLUSION #1 – Big data
• Large datasets reveal patterns/outliers that are invisible otherwise
CONCLUSION #2 – Cross disciplinarity
Thank you! Questions?