Sparsification of Influence Networks
Michael Mathioudakis (1), Francesco Bonchi (2), Carlos Castillo (2), Aris Gionis (2), Antti Ukkonen (2)
(1) University of Toronto, Canada
(2) Yahoo! Research Barcelona, Spain


Page 1: Sparsification of Influence Networks

Sparsification of Influence Networks

Michael Mathioudakis (1), Francesco Bonchi (2), Carlos Castillo (2), Aris Gionis (2), Antti Ukkonen (2)

(1) University of Toronto, Canada
(2) Yahoo! Research Barcelona, Spain

Page 2: Sparsification of Influence Networks

Introduction

• users perform actions
  – post messages, pictures, videos
• users are connected with other users
  – interact, influence each other
• actions propagate
• online social networks
  – facebook: 750m users
  – twitter: 100m+ users

[Figure: example post with the reply "nice read indeed!"; timestamps 09:00 and 09:30]

Page 3: Sparsification of Influence Networks

Problem

• sparsify network
  – eliminate a large number of connections
  – keep important connections
• sparsification: a data reduction operation
  – network visualization
  – efficient graph analysis
• which connections are most important for the propagation of actions?

Page 4: Sparsification of Influence Networks

What We Do

• technical framework
  – sparsify network according to observed activity
  – keep connections that best explain propagations
• our approach
  – input: social network & observed propagations
  – learn independent cascade model (ICM)
  – select the k connections most likely to have produced the propagations

Page 5: Sparsification of Influence Networks

Outline

• introduction
• setting
  – social network
  – propagation model
• sparsification
  – optimal algorithm
  – greedy algorithm: spine
• experiments

Page 6: Sparsification of Influence Networks

Social Network

• users – nodes
• B follows A – arc A→B

[Figure: nodes A and B joined by the arc A→B]

Page 7: Sparsification of Influence Networks

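Propagation of Actions

• users perform actions
• actions propagate
• independent cascade model
  – propagation of an action unfolds in timesteps
  – if A performs the action at time t, its follower B performs it at time t+1 with influence probability p(A,B)

[Figure: arc A→B labeled p(A,B), with A at time t and B at time t+1; example posts "I liked this movie" and "great movie"]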
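The timestep behaviour of the independent cascade model can be illustrated with a minimal simulation sketch. The function name `simulate_icm`, the dictionary layout of the arcs, and the convention that seeds activate at timestep 0 are assumptions made for this example; they do not come from the slides.

```python
import random


def simulate_icm(arcs, seeds, rng):
    """Simulate one propagation under the independent cascade model.

    arcs:  dict mapping (u, v) to the influence probability p(u, v)
    seeds: nodes that perform the action at timestep 0 (assumed convention)
    rng:   a random.Random instance
    Returns a dict mapping each activated node to its activation timestep.
    """
    followers = {}
    for (u, v), p in arcs.items():
        followers.setdefault(u, []).append((v, p))

    activated = {s: 0 for s in seeds}
    frontier = list(seeds)
    t = 0
    while frontier:
        t += 1
        newly_active = []
        for u in frontier:
            # u gets a single chance to activate each inactive follower.
            for v, p in followers.get(u, []):
                if v not in activated and rng.random() < p:
                    activated[v] = t
                    newly_active.append(v)
        frontier = newly_active
    return activated
```

Calling `simulate_icm(arcs, ["A"], random.Random(0))` returns one sampled propagation; repeating the call with different random states gives a set of propagations of the kind the later slides take as input.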

Page 8: Sparsification of Influence Networks

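Propagation of Actions

• icm generates propagations
  – a propagation is a sequence of activations
  – each propagation has a likelihood

[Figure: propagation of action α over nodes A, B, C, D, E at times t-1, t, t+1; nodes marked active / not active; arc A→B labeled p(A,B)]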
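As a hedged illustration of what "likelihood" can mean here, the sketch below scores a set of observed propagations under a strict timestep reading of the ICM: a node activated at time t must have been triggered by a parent activated at t-1, and every earlier active parent must have failed. The function name, the data layout, and the treatment of timestep-0 nodes as seeds are assumptions for this example; the exact definition used by the authors may differ.

```python
import math


def log_likelihood(arcs, propagations):
    """Log-likelihood of observed propagations under a simplified ICM.

    arcs:         dict mapping (u, v) to p(u, v)
    propagations: list of dicts, each mapping node -> activation timestep
    Returns -inf if some activation cannot be explained by any arc.
    """
    parents = {}
    for (u, v), p in arcs.items():
        parents.setdefault(v, []).append((u, p))
    nodes = {n for arc in arcs for n in arc}

    total = 0.0
    for times in propagations:
        for v in nodes | set(times):
            t_v = times.get(v)
            if t_v == 0:
                continue  # seeds are taken as given (assumption)
            failed = 1.0  # all attempts that certainly failed
            missed = 1.0  # all attempts at t_v - 1 failing together
            for u, p in parents.get(v, []):
                t_u = times.get(u)
                if t_u is None:
                    continue
                if t_v is None or t_u < t_v - 1:
                    failed *= 1.0 - p
                elif t_u == t_v - 1:
                    missed *= 1.0 - p
            prob = failed if t_v is None else failed * (1.0 - missed)
            if prob <= 0.0:
                return float("-inf")
            total += math.log(prob)
    return total
```

Under this reading, dropping an arc that was the only possible cause of an observed activation sends the log-likelihood to minus infinity.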

Page 9: Sparsification of Influence Networks

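Estimating Influence Probabilities

• input: social network + set of propagations
• output: influence probabilities p(A,B)
• maximum likelihood estimation
  – EM [Saito et al.]

[Figure: propagation of action α over nodes A, B, C, D, E at times t-1, t, t+1; nodes marked active / not active; arc A→B labeled p(A,B)]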

Page 10: Sparsification of Influence Networks

Outline

• introduction
• setting
  – social network
  – propagation model
• sparsification
  – optimal algorithm
  – greedy algorithm: spine
• experiments

Page 11: Sparsification of Influence Networks

Sparsification

• input: social network with influence probabilities p(A,B), and a set of propagations
• output: the k arcs most likely to explain all propagations

[Figure: arc A→B labeled p(A,B)]

Page 12: Sparsification of Influence Networks

Sparsification

• input: social network with influence probabilities p(A,B), and a set of propagations
• output: the k arcs most likely to explain all propagations

[Figure: arc A→B labeled p(A,B)]

Page 13: Sparsification of Influence Networks

Sparsification

• not the k arcs with the largest probabilities
• NP-hard and inapproximable
  – difficult to find even a solution with non-zero likelihood
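A small made-up example shows why keeping the k most probable arcs can fail. The probabilities, the propagation, and the helper `explains` are all hypothetical; the helper assumes a strict timestep model in which each non-seed activation needs a kept arc from a node activated one step earlier.

```python
def explains(kept_arcs, propagation):
    """True if every non-seed activation has a kept arc from a node
    that was activated exactly one timestep earlier (assumed model)."""
    for v, t_v in propagation.items():
        if t_v == 0:
            continue  # seed, needs no explanation
        if not any((u, v) in kept_arcs
                   for u, t_u in propagation.items() if t_u == t_v - 1):
            return False
    return True


# Hypothetical influence probabilities and one observed propagation.
probs = {("A", "B"): 0.9, ("A", "C"): 0.8, ("C", "D"): 0.1}
propagation = {"A": 0, "C": 1, "D": 2}

# Keep the 2 most probable arcs: A->B and A->C.
top2 = set(sorted(probs, key=probs.get, reverse=True)[:2])
# Keep the 2 arcs that the propagation actually used.
used = {("A", "C"), ("C", "D")}

top2_ok = explains(top2, propagation)
used_ok = explains(used, propagation)
```

Here `top2` drops the low-probability arc C→D, so the activation of D is left unexplained and the propagation gets zero likelihood, while `used` explains every activation.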

Page 14: Sparsification of Influence Networks

How to Solve?

• brute-force approach
  – try all subsets of k arcs? no
• break down into smaller problems
  – combine solutions

Page 15: Sparsification of Influence Networks

Optimal Algorithm

• sparsify separately the incoming arcs of individual nodes
  – optimize the corresponding likelihood
• nodes A, B, C: kA + kB + kC = k
• dynamic programming
  – optimal solution
• however…
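The combination step (kA + kB + kC = k) can be sketched as a knapsack-style dynamic program. This is an illustration under assumptions, not the authors' implementation: it takes as given a hypothetical table `best[v][j]` holding the best log-likelihood achievable for node v when j of its incoming arcs are kept. How those per-node tables are computed is not shown on the slide.

```python
def allocate_budget(best, k):
    """Split a budget of at most k arcs across nodes.

    best: dict mapping node -> list, where best[node][j] is the best
          log-likelihood for that node with j incoming arcs kept
          (hypothetical precomputed input)
    Returns (total log-likelihood, dict node -> number of arcs kept).
    """
    neg_inf = float("-inf")
    nodes = list(best)
    dp = [0.0] + [neg_inf] * k  # dp[b]: best total using exactly b arcs
    picks = []                  # picks[i][b]: arcs given to node i in dp[b]
    for v in nodes:
        new = [neg_inf] * (k + 1)
        pick = [0] * (k + 1)
        for b in range(k + 1):
            for j in range(min(b, len(best[v]) - 1) + 1):
                cand = dp[b - j] + best[v][j]
                if cand > new[b]:
                    new[b], pick[b] = cand, j
        dp = new
        picks.append(pick)

    # Take the best budget not exceeding k, then walk back through the picks.
    b = max(range(k + 1), key=lambda i: dp[i])
    total, alloc = dp[b], {}
    for v, pick in zip(reversed(nodes), reversed(picks)):
        alloc[v] = pick[b]
        b -= pick[b]
    return total, alloc
```

The table has one row per node and k + 1 columns, so the combination itself is cheap once the per-node values are available.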

Page 16: Sparsification of Influence Networks

Spine

sparsification of influence networks

• greedy algorithm
  – efficient, good results
• two phases
  – phase 1: try to obtain a non-zero-likelihood solution with k0 < k arcs
  – phase 2: build on top of phase 1

Page 17: Sparsification of Influence Networks

Spine – Phase 1

• obtain a non-zero-likelihood solution
• greedily select the arcs that participate in the most propagations
• until all propagations are explained

[Figure: social network on nodes A, B, C, D; propagations of action α and action β shown at times t and t+1]
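Phase 1 resembles a greedy set cover, which the sketch below illustrates. It assumes that an arc (u, v) can explain the activation of v in a propagation when u was activated exactly one timestep before v; the function name and data layout are hypothetical, and ties are broken by arc order.

```python
def spine_phase1(arcs, propagations):
    """Greedily pick arcs until every explainable activation is covered.

    arcs:         iterable of (u, v) pairs
    propagations: list of dicts, each mapping node -> activation timestep
    Returns the list of chosen arcs, in the order they were picked.
    """
    # covers[arc] = set of (propagation index, node) activations that
    # the arc could explain under the assumed one-step rule.
    covers = {}
    for a, times in enumerate(propagations):
        for (u, v) in arcs:
            if u in times and v in times and times[u] == times[v] - 1:
                covers.setdefault((u, v), set()).add((a, v))

    uncovered = set().union(*covers.values())
    chosen = []
    while uncovered:
        # Pick the arc that explains the most still-unexplained activations.
        arc = max(covers, key=lambda e: len(covers[e] & uncovered))
        chosen.append(arc)
        uncovered -= covers[arc]
    return chosen
```

The number of arcs returned plays the role of k0 on the previous slide; activations that no arc can explain are ignored by this sketch.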

Page 18: Sparsification of Influence Networks

Spine – Phase 2

• add one arc at a time, the one that offers the largest increase in likelihood
• submodular
  – approximation guarantee for phase 2

[Figure: log L plotted against # arcs, from k0 to k]
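The phase 2 loop can be sketched generically. In this illustration `log_likelihood` is a caller-supplied function (hypothetical here) that scores a list of kept arcs; the sketch re-evaluates it for every candidate, which is simple but not efficient.

```python
def spine_phase2(candidates, chosen, k, log_likelihood):
    """Extend a phase 1 solution to k arcs by greedy marginal gain.

    candidates:     iterable of all arcs that may be kept
    chosen:         arcs already selected in phase 1 (k0 of them)
    k:              target number of arcs
    log_likelihood: function taking a list of arcs and returning a score
                    (assumed to be supplied by the caller)
    """
    chosen = list(chosen)
    remaining = set(candidates) - set(chosen)
    while len(chosen) < k and remaining:
        # sorted() only makes tie-breaking deterministic.
        best = max(sorted(remaining),
                   key=lambda e: log_likelihood(chosen + [e]))
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

Because the previously chosen arcs are fixed, picking the arc with the highest resulting score is the same as picking the one with the largest increase in likelihood.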

Page 19: Sparsification of Influence Networks

Outline

• introduction
• setting
  – social network
  – propagation model
• sparsification
  – optimal algorithm
  – greedy algorithm: spine
• experiments

Page 20: Sparsification of Influence Networks

Experiments

• datasets
  – meme.yahoo.com
    actions: postings (photos), nodes: users, arcs: who follows whom
    data from 2010
  – memetracker.org
    actions: mentions of a phrase, nodes: blogs & news sources, arcs: who links to whom
    data from 2009

Page 21: Sparsification of Influence Networks

Experiments

sampled datasets of different sizes

Dataset     Actions   Arcs     Arcs, prob > 0
YMeme-L     26k       1.25M    430k
YMeme-M     13k       1.15M    380k
YMeme-S     5k        466k     73k
MTrack-L    9k        200k     7.8k
MTrack-M    120       110k     1.4k
MTrack-S    780       78k      768

YMeme: meme.yahoo.com
MTrack: memetracker.org

Page 22: Sparsification of Influence Networks

Experiments

• algorithms
  – optimal (very inefficient)
  – spine (a few seconds to 3.5 hrs)
  – by arc probability
  – random

Page 23: Sparsification of Influence Networks


Experiments

Page 24: Sparsification of Influence Networks

Model Selection using BIC

BIC(k) = -2 log L + k log N
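The BIC formula can be used to pick the number of arcs k, as in the sketch below. The slide does not define N; the example assumes it is the number of observations behind the likelihood, and all numbers in the usage lines are made up for illustration.

```python
import math


def bic(log_l, k, n):
    """BIC(k) = -2 log L + k log N, as on the slide."""
    return -2.0 * log_l + k * math.log(n)


def select_k(log_likelihoods, n):
    """Return the k with the smallest BIC.

    log_likelihoods: dict mapping k -> log-likelihood of the k-arc solution
    n:               N in the formula (assumed: number of observations)
    """
    return min(log_likelihoods, key=lambda k: bic(log_likelihoods[k], k, n))


# Hypothetical log-likelihoods for four sparsification levels.
example = {10: -500.0, 20: -300.0, 30: -290.0, 40: -288.0}
chosen_k = select_k(example, 1000)
```

In this made-up example the gain in likelihood beyond k = 20 is too small to offset the k log N penalty, so `select_k` settles on 20.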

Page 25: Sparsification of Influence Networks

Application

• spine as a preprocessing step
• influence maximization
  – select k nodes to maximize the spread of an action
  – NP-hard, greedy approximation [Kempe, Kleinberg, Tardos, 03]
• perform on sparsified network instead
  – large benefit in efficiency, little loss in quality
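To make the application concrete, here is a minimal sketch of greedy seed selection with Monte Carlo spread estimates, in the spirit of the greedy approximation cited on the slide. The function names, the number of simulation runs, and the data layout are assumptions for this example. The sketch expects k to be no larger than the number of nodes.

```python
import random


def spread(arcs, seeds, rng, runs=200):
    """Estimate the expected number of activated nodes by simulation."""
    followers = {}
    for (u, v), p in arcs.items():
        followers.setdefault(u, []).append((v, p))
    total = 0
    for _ in range(runs):
        active, frontier = set(seeds), list(seeds)
        while frontier:
            nxt = []
            for u in frontier:
                for v, p in followers.get(u, []):
                    if v not in active and rng.random() < p:
                        active.add(v)
                        nxt.append(v)
            frontier = nxt
        total += len(active)
    return total / runs


def greedy_seeds(arcs, k, rng, runs=200):
    """Pick k seed nodes, each time adding the one with the largest
    estimated spread when joined to the seeds chosen so far."""
    nodes = sorted({n for arc in arcs for n in arc})
    seeds = []
    for _ in range(k):
        best = max((n for n in nodes if n not in seeds),
                   key=lambda n: spread(arcs, seeds + [n], rng, runs))
        seeds.append(best)
    return seeds
```

Each spread estimate touches every reachable arc in every run, so passing a sparsified `arcs` dictionary reduces the work per estimate, which is where the efficiency benefit claimed on the slide would come from.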

Page 26: Sparsification of Influence Networks


Application

Page 27: Sparsification of Influence Networks


Public Code and Data

http://www.cs.toronto.edu/~mathiou/spine/

Page 28: Sparsification of Influence Networks


The End

Questions?
