28
KDD'09 Faloutsos, Miller, Tsoura kakis P6-1 CMU SCS Large Graph Mining: Power Tools and a Practitioner’s guide Task 6: Virus/Influence Propagation Faloutsos, Miller,Tsourakakis CMU

Large Graph Mining: Power Tools and a Practitioner’s guide

Embed Size (px)

DESCRIPTION

Large Graph Mining: Power Tools and a Practitioner’s guide. Task 6: Virus/Influence Propagation Faloutsos, Miller,Tsourakakis CMU. Outline. Introduction – Motivation Task 1: Node importance Task 2: Community detection Task 3: Recommendations Task 4: Connection sub-graphs - PowerPoint PPT Presentation

Citation preview

Page 1: Large Graph Mining: Power Tools and a Practitioner’s guide

KDD'09 Faloutsos, Miller, Tsourakakis P6-1

CMU SCS

Large Graph Mining:Power Tools and a Practitioner’s guide

Task 6: Virus/Influence Propagation

Faloutsos, Miller,Tsourakakis

CMU

Page 2: Large Graph Mining: Power Tools and a Practitioner’s guide

KDD'09 Faloutsos, Miller, Tsourakakis P6-2

CMU SCS

Outline• Introduction – Motivation• Task 1: Node importance • Task 2: Community detection• Task 3: Recommendations• Task 4: Connection sub-graphs• Task 5: Mining graphs over time• Task 6: Virus/influence propagation• Task 7: Spectral graph theory• Task 8: Tera/peta graph mining: hadoop• Observations – patterns of real graphs• Conclusions

Page 3: Large Graph Mining: Power Tools and a Practitioner’s guide

KDD'09 Faloutsos, Miller, Tsourakakis P6-3

CMU SCS

Detailed outline

• Epidemic threshold– Problem definition– Analysis– Experiments

• Fraud detection in e-bay

Page 4: Large Graph Mining: Power Tools and a Practitioner’s guide

KDD'09 Faloutsos, Miller, Tsourakakis P6-4

CMU SCS

Virus propagation

• How do viruses/rumors propagate?

• Blog influence?

• Will a flu-like virus linger, or will it become extinct soon?

Page 5: Large Graph Mining: Power Tools and a Practitioner’s guide

KDD'09 Faloutsos, Miller, Tsourakakis P6-5

CMU SCS

The model: SIS

• ‘Flu’ like: Susceptible-Infected-Susceptible

• Virus ‘strength’ s= /

Infected

Healthy

NN1

N3

N2Prob.

Prob. β

Prob.

Page 6: Large Graph Mining: Power Tools and a Practitioner’s guide

KDD'09 Faloutsos, Miller, Tsourakakis P6-6

CMU SCS

Epidemic threshold of a graph: the value of , such that

if strength s = / < an epidemic can not happen

Thus,

• given a graph

• compute its epidemic threshold

Page 7: Large Graph Mining: Power Tools and a Practitioner’s guide

KDD'09 Faloutsos, Miller, Tsourakakis P6-7

CMU SCS

Epidemic threshold

What should depend on?

• avg. degree? and/or highest degree?

• and/or variance of degree?

• and/or third moment of degree?

• and/or diameter?

Page 8: Large Graph Mining: Power Tools and a Practitioner’s guide

KDD'09 Faloutsos, Miller, Tsourakakis P6-8

CMU SCS

Epidemic threshold

• [Theorem 1] We have no epidemic, if

β/δ <τ = 1/ λ1,A

Page 9: Large Graph Mining: Power Tools and a Practitioner’s guide

KDD'09 Faloutsos, Miller, Tsourakakis P6-9

CMU SCS

Epidemic threshold

• [Theorem 1] We have no epidemic (*), if

β/δ <τ = 1/ λ1,A

largest eigenvalueof adj. matrix A

attack prob.

recovery prob.epidemic threshold

Proof: [Wang+03](*) under mild, conditional-independence assumptions

Page 10: Large Graph Mining: Power Tools and a Practitioner’s guide

KDD'09 Faloutsos, Miller, Tsourakakis P6-10

CMU SCS

Beginning of proofHealthy @ t+1: - ( healthy or healed ) - and not attacked @ t

Let: p(i , t) = Prob node i is sick @ t+1

1 - p(i, t+1 ) = (1 – p(i, t) + p(i, t) * ) *

j (1 – aji * p(j , t) )

Below threshold, if the above non-linear dynamical system above is ‘stable’ (eigenvalue of Hessian < 1 )

Page 11: Large Graph Mining: Power Tools and a Practitioner’s guide

KDD'09 Faloutsos, Miller, Tsourakakis P6-11

CMU SCS

Epidemic threshold for various networks

Formula includes older results as special cases:

• Homogeneous networks [Kephart+White]– λ1,A = <k>; τ = 1/<k> (<k> : avg degree)

• Star networks (d = degree of center)– λ1,A = sqrt(d); τ = 1/ sqrt(d)

• Infinite power-law networks– λ1,A = ∞; τ = 0 ; [Barabasi]

Page 12: Large Graph Mining: Power Tools and a Practitioner’s guide

KDD'09 Faloutsos, Miller, Tsourakakis P6-12

CMU SCS

Epidemic threshold

• [Theorem 2] Below the epidemic threshold, the epidemic dies out exponentially

Page 13: Large Graph Mining: Power Tools and a Practitioner’s guide

KDD'09 Faloutsos, Miller, Tsourakakis P6-13

CMU SCS

Detailed outline

• Epidemic threshold– Problem definition– Analysis– Experiments

• Fraud detection in e-bay

Page 14: Large Graph Mining: Power Tools and a Practitioner’s guide

KDD'09 Faloutsos, Miller, Tsourakakis P6-14

CMU SCS

Current prediction vs. previous

The formula’s predictions are more accurate

Oregon Star

PL-3

OurOur

Nu

mb

er o

f in

fect

ed n

odes

β/δ β/δ

PL-3

Page 15: Large Graph Mining: Power Tools and a Practitioner’s guide

KDD'09 Faloutsos, Miller, Tsourakakis P6-15

CMU SCS

Experiments (Oregon)

/ > τ (above threshold)

/ = τ (at the threshold)

/ < τ (below threshold)

Page 16: Large Graph Mining: Power Tools and a Practitioner’s guide

KDD'09 Faloutsos, Miller, Tsourakakis P6-16

CMU SCS

SIS simulation - # infected nodes vs time

Time (linear scale)

#inf.(log scale)

above

at

below

Log - Lin

Page 17: Large Graph Mining: Power Tools and a Practitioner’s guide

KDD'09 Faloutsos, Miller, Tsourakakis P6-17

CMU SCS

SIS simulation - # infected nodes vs time

Log - Lin

Time (linear scale)

#inf.(log scale)

above

at

below

Exponentialdecay

Page 18: Large Graph Mining: Power Tools and a Practitioner’s guide

KDD'09 Faloutsos, Miller, Tsourakakis P6-18

CMU SCS

SIS simulation - # infected nodes vs time

Log - Log

Time (log scale)

#inf.(log scale)

above

at

below

Page 19: Large Graph Mining: Power Tools and a Practitioner’s guide

KDD'09 Faloutsos, Miller, Tsourakakis P6-19

CMU SCS

SIS simulation - # infected nodes vs time

Time (log scale)

#inf.(log scale)

above

at

below

Log - Log

Power-lawDecay (!)

Page 20: Large Graph Mining: Power Tools and a Practitioner’s guide

KDD'09 Faloutsos, Miller, Tsourakakis P6-20

CMU SCS

Detailed outline

• Epidemic threshold

• Fraud detection in e-bay

Page 21: Large Graph Mining: Power Tools and a Practitioner’s guide

KDD'09 Faloutsos, Miller, Tsourakakis P6-21

CMU SCS

E-bay Fraud detection

w/ Polo Chau &Shashank Pandit, CMU

NetProbe: A Fast and Scalable System for Fraud Detection in Online Auction Networks, S. Pandit, D. H. Chau, S. Wang, and C. Faloutsos (WWW'07), pp. 201-210

Page 22: Large Graph Mining: Power Tools and a Practitioner’s guide

KDD'09 Faloutsos, Miller, Tsourakakis P6-22

CMU SCS

E-bay Fraud detection

• lines: positive feedbacks• would you buy from him/her?

Page 23: Large Graph Mining: Power Tools and a Practitioner’s guide

KDD'09 Faloutsos, Miller, Tsourakakis P6-23

CMU SCS

E-bay Fraud detection

• lines: positive feedbacks• would you buy from him/her?

• or him/her?

Page 24: Large Graph Mining: Power Tools and a Practitioner’s guide

KDD'09 Faloutsos, Miller, Tsourakakis P6-24

CMU SCS

E-bay Fraud detection - NetProbe

Belief Propagation gives:

Page 25: Large Graph Mining: Power Tools and a Practitioner’s guide

KDD'09 Faloutsos, Miller, Tsourakakis P6-25

CMU SCS

Conclusions

• λ1,A : Eigenvalue of adjacency matrix determines the survival of a flu-like virus– It gives a measure of how well connected is the

graph (~ # paths – see Task 7, later)– May guide immunization policies

• [Belief Propagation: a powerful algo]

Page 26: Large Graph Mining: Power Tools and a Practitioner’s guide

KDD'09 Faloutsos, Miller, Tsourakakis P6-26

CMU SCS

References

• D. Chakrabarti, Y. Wang, C. Wang, J. Leskovec, and C. Faloutsos, Epidemic Thresholds in Real Networks, in ACM TISSEC, 10(4), 2008

• Ganesh, A., Massoulie, L., and Towsley, D., 2005. The effect of network topology on the spread of epidemics. In INFOCOM.

Page 27: Large Graph Mining: Power Tools and a Practitioner’s guide

KDD'09 Faloutsos, Miller, Tsourakakis P6-27

CMU SCS

References (cont’d)

• Hethcote, H. W. 2000. The mathematics of infectious diseases. SIAM Review 42, 599–653.

• Hethcote, H. W. AND Yorke, J. A. 1984. Gonorrhea Transmission Dynamics and Control. Vol. 56. Springer. Lecture Notes in Biomathematics.

Page 28: Large Graph Mining: Power Tools and a Practitioner’s guide

KDD'09 Faloutsos, Miller, Tsourakakis P6-28

CMU SCS

References (cont’d)

• Y. Wang, D. Chakrabarti, C. Wang and C. Faloutsos, Epidemic Spreading in Real Networks: An Eigenvalue Viewpoint, in SRDS 2003 (pages 25-34), Florence, Italy