47
Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research U. of British Columbia

Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Embed Size (px)

Citation preview

Page 1: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Learning Influence Probabilities in Social

Networks

1 2

Amit Goyal1

Francesco Bonchi2

Laks V. S. Lakshmanan1

U. of British ColumbiaYahoo! ResearchU. of British Columbia

Page 2: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Word of Mouth and Viral Marketing We are more

influenced by our friends than strangers

68% of consumers consult friends and family before purchasing home electronics (Burke 2003)

http://people.cs.ubc.ca/~goyal 2University of British Columbia, Yahoo! Research

Page 3: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Viral Marketing Also known as

Target Advertising Initiate chain

reaction by Word of mouth effect

Low investments, maximum gain

http://people.cs.ubc.ca/~goyal 3University of British Columbia, Yahoo! Research

Page 4: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Viral Marketing as an Optimization Problem Given: Network with

influence probabilities Problem: Select top-k

users such that by targeting them, the spread of influence is maximized

Domingos et al 2001, Richardson et al 2002, Kempe et al 2003

http://people.cs.ubc.ca/~goyal 4University of British Columbia, Yahoo! Research

How to calculate true influence probabilities?

Page 5: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Some Questions Where do those influence probabilities come

from? Available real world datasets don’t have prob.!

Can we learn those probabilities from available data?

Previous Viral Marketing studies ignore the effect of time. How can we take time into account?

Do probabilities change over time?

Can we predict time at which user is most likely to perform an action.

What users/actions are more prone to influence?http://people.cs.ubc.ca/~goyal 5University of British Columbia,

Yahoo! Research

Page 6: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Input Data We focus on actions. Input:

Social Graph: P and Q become friends at time 4.

Action log: User P performs actions a1 at time unit 5.

User Action Time

P a1 5

Q a1 10

R a1 15

Q a2 12

R a2 14

R a3 6

P a3 14

http://people.cs.ubc.ca/~goyal 6University of British Columbia, Yahoo! Research

Page 7: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Our contributions (1/2) Propose several probabilistic influence models

between users. Consistent with existing propagation models.

Develop efficient algorithms to learn the parameters of the models.

Able to predict whether a user perform an action or not.

Predict the time at which she will perform it.7

Page 8: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Our Contributions (2/2) Introduce metrics of users and actions

influenceability. High values => genuine influence.

Validated our models on Flickr.

http://people.cs.ubc.ca/~goyal 8University of British Columbia, Yahoo! Research

Page 9: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Overview Input:

Social Graph: P and Q become friends at time 4.

Action log: User P performs actions a1 at time unit 5.

User Action Time

P a1 5

Q a1 10

R a1 15

Q a2 12

R a2 14

R a3 6

P a3 14

Influence Models

Q R

P

0.330

0

0.5

0.5

0.2

http://people.cs.ubc.ca/~goyal 9University of British Columbia, Yahoo! Research

Page 10: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Background

Page 11: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

General Threshold (Propagation) Model At any point of time, each node is either active or

inactive.

More active neighbors => u more likely to get active.

Notations: S = {active neighbors of u}. pu(S) : Joint influence probability of S on u.

Θu: Activation threshold of user u.

When pu(S) >= Θu, u becomes active.

http://people.cs.ubc.ca/~goyal 11University of British Columbia, Yahoo! Research

Page 12: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

General Threshold Model - Example

Inactive Node

Active Node

Threshold

Joint Influence Probability

Source: David Kempe’s slides

vw 0.5

0.30.2

0.5

0.10.4

0.3 0.2

0.6

0.2

Stop!

U

http://people.cs.ubc.ca/~goyal 12University of British Columbia, Yahoo! Research

x

Page 13: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Our Framework

Page 14: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Solution Framework Assuming independence, we define

pv,u : influence probability of user v on user u

Consistent with the existing propagation models – monotonocity, submodularity.

It is incremental. i.e. can be updated incrementally using

Our aim is to learn pv,u for all edges.

)1(1)( ,uvSv

u pSp

http://people.cs.ubc.ca/~goyal 14University of British Columbia, Yahoo! Research

}){( wSpu uwu pSp , and )(

Page 15: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Influence Models Static Models

Assume that influence probabilities are static and do not change over time.

Continuous Time (CT) Models Influence probabilities are continuous functions of time. Not incremental, hence very expensive to apply on large

datasets.

Discrete Time (DT) Models Approximation of CT models. Incremental, hence efficient.

http://people.cs.ubc.ca/~goyal 15University of British Columbia, Yahoo! Research

Page 16: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Static Models 4 variants

Bernoulli as running example. Incremental hence most efficient. We omit details here

http://people.cs.ubc.ca/~goyal 16University of British Columbia, Yahoo! Research

Page 17: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Time Conscious Models Do influence

probabilities remain constant independently of time?

We propose Continuous Time (CT) Model Based on exponential

decay distribution

NOhttp://people.cs.ubc.ca/~goyal 17University of British Columbia,

Yahoo! Research

Page 18: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Continuous Time Models Best model. Capable of predicting time at which user is

most likely to perform the action. Not incremental

Discrete Time Model Based on step time functions Incremental

http://people.cs.ubc.ca/~goyal 18University of British Columbia, Yahoo! Research

Page 19: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Evaluation Strategy (1/2) Split the action log data into training

(80%) and testing (20%). User “James” have joined “Whistler Mountain”

community at time 5.

In testing phase, we ask the model to predict whether user will become active or not Given all the neighbors who are active Binary Classification

University of British Columbia, Yahoo! Research

http://people.cs.ubc.ca/~goyal 19

Page 20: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Evaluation Strategy (2/2) We ignore all the cases

when none of the user’s friends is active

As then the model is inapplicable.

We use ROC (Receiver Operating Characteristics) curves True Positive Rate (TPR) vs

False Positive Rate (FPR). TPR = TP/P FPR = FP/N

Reality

Prediction

Active Inactive

Active TP FP

Inactive FN TN

Total P N

Operating Point

Ideal Point

http://people.cs.ubc.ca/~goyal 20University of British Columbia, Yahoo! Research

Page 21: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Algorithms Special emphasis on efficiency of applying/testing

the models. Incremental Property

In practice, action logs tend to be huge, so we optimize our algorithms to minimize the number of scans over the action log. Training: 2 scans to learn all models simultaneously. Testing: 1 scan to test one model at a time.

http://people.cs.ubc.ca/~goyal 21University of British Columbia, Yahoo! Research

Page 22: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Experimental Evaluation

Page 23: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Dataset Yahoo! Flickr dataset “Joining a group” is considered as action

User “James” joined “Whistler Mountains” at time 5.

#users ~ 1.3 million #edges ~ 40.4 million Degree: 61.31 #groups/actions ~ 300K #tuples in action log ~ 35.8 million

http://people.cs.ubc.ca/~goyal 23University of British Columbia, Yahoo! Research

Page 24: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Comparison of Static, CT and DT models

Time conscious Models are better than Static Models.

CT and DT models perform equally well.http://people.cs.ubc.ca/~goyal 24University of British Columbia,

Yahoo! Research

Page 25: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Runtime

Static and DT models are far more efficient compared to CT models because of their incremental nature.

Testing

http://people.cs.ubc.ca/~goyal 25University of British Columbia, Yahoo! Research

Page 26: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Predicting Time – Distribution of Error Operating Point is chosen corresponding to

TPR: 82.5%, FPR: 17.5%.

X-axis: error in predicting time (in weeks)

Y-axis: frequency of that error

Most of the time, error in the prediction is very small

http://people.cs.ubc.ca/~goyal 26University of British Columbia, Yahoo! Research

Page 27: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Predicting Time – Coverage vs Error Operating Point is chosen corresponding to

TPR: 82.5%, FPR: 17.5%.

A point (x,y) here means for y% of cases, the error is within

In particular, for 95% of the cases, the error is within 20 weeks.

x

http://people.cs.ubc.ca/~goyal 27University of British Columbia, Yahoo! Research

Page 28: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

User Influenceability

Users with high influenceability => easier prediction of influence => more prone to viral marketing campaigns.

Some users are more prone to influence propagation than others.

Learn from Training data

http://people.cs.ubc.ca/~goyal 28University of British Columbia, Yahoo! Research

Page 29: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Action Influenceability

Some actions are more prone to influence propagation than others.

Actions with high user influenceability => easier prediction of influence => more suitable to viral marketing campaigns.

http://people.cs.ubc.ca/~goyal 29University of British Columbia, Yahoo! Research

Page 30: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Related Work Independently, Saito et al (KES 2008) have

studied the same problem Focus on Independent Cascade Model of

propagation. Apply Expectation Maximization (EM)

algorithm. Not scalable to huge datasets like the one we

are dealing in this work.

http://people.cs.ubc.ca/~goyal 30University of British Columbia, Yahoo! Research

Page 31: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Other applications of Influence Propagations Personalized Recommender Systems

Song et al 2006, 2007 Feed Ranking

Samper et al 2006 Trust Propagation

Guha et al 2004, Ziegler et al 2005, Golbeck et al 2006, Taherian et al 2008

http://people.cs.ubc.ca/~goyal 31University of British Columbia, Yahoo! Research

Page 32: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Conclusions (1/2) Previous works typically assume influence

probabilities are given as input.

Studied the problem of learning such probabilities from a log of past propagations.

Proposed both static and time-conscious models of influence.

We also proposed efficient algorithms to learn and apply the models.

http://people.cs.ubc.ca/~goyal 32University of British Columbia, Yahoo! Research

Page 33: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Conclusions (2/2) Using CT models, it is possible to predict

even the time at which a user will perform it with a good accuracy.

Introduce metrics of users and actions influenceability. High values => easier prediction of influence. Can be utilized in Viral Marketing decisions.

http://people.cs.ubc.ca/~goyal 33University of British Columbia, Yahoo! Research

Page 34: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Future Work Learning optimal user activation

thresholds. Considering users and actions

influenceability in the theory of Viral Marketing.

Role of time in Viral Marketing.

http://people.cs.ubc.ca/~goyal 34University of British Columbia, Yahoo! Research

Page 35: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Thanks!!0.13

http://people.cs.ubc.ca/~goyal 35University of British Columbia, Yahoo! Research

0.30.1

0.41

0.8

0.27

0.2

0.9

0.01

0.7

0.6

0.54

0.1

0.110

0.20.7

Page 36: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Predicting Time CT models can predict the time interval

[b,e] in which she is most likely to perform the action.

0

Joint influence Probability of u getting active

)2ln(,uvvte vtb

Θu

Time ->

is half life period

Tightness of lower bounds not critical in Viral Marketing Applications.

Experiments on the upper bound e.

)2ln(,uv

http://people.cs.ubc.ca/~goyal 36University of British Columbia, Yahoo! Research

Page 37: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Predicting Time - RMSE vs Accuracy CT models can predict the time interval [b,e] in which user is most likely to perform the action.

Experiments only on upper bound e.

Accuracy =

RMSE = root mean square error in days

cases total#

correct is boundupper of prediction when thecases #

RMSE ~ 70-80 days

http://people.cs.ubc.ca/~goyal 37University of British Columbia, Yahoo! Research

Page 38: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Static Models – Jaccard Index Jaccard Index is often used to measure

similarity b/w sample sets.

We adapt it to estimate pv,u

uor by v performed actions of#

u to vfrom propagated actions of#, uvp

http://people.cs.ubc.ca/~goyal 38University of British Columbia, Yahoo! Research

Page 39: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Partial Credits (PC) Let, for an action,

D is influenced by 3 of its neighbors.

Then, 1/3 credit is given to each one of these neighbors.

A B

D

1/3

C

1/31/3

performed vactions ofnumber total

daccumulate credits total, uvp

performedu or vactions ofnumber total

daccumulate credits total, uvp

PC Bernoulli

PC Jaccardhttp://people.cs.ubc.ca/~goyal 39University of British Columbia,

Yahoo! Research

Page 40: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Learning the Models Parameters to learn:

#actions performed by each user – Au

#actions propagated via each edge – Av2u

Mean life time –

P a1 5

Q a1 10

R a1 15

Q a2 12

R a2 14

R a3 6

P a3 14

u Au

P

Q

R

P Q R

P X

Q 0,0 X

R 0,0 X

Input

01

01

01

0,01,5 0,01,10

2

2

0,01,2

3

2

0,01,8

uv,

uv,uv, ,A

http://people.cs.ubc.ca/~goyal 40University of British Columbia, Yahoo! Research

Page 41: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Propagation Models Threshold Models

Linear Threshold Model General Threshold Model

Cascade Models Independent Cascade Model Decreasing Cascade Model

http://people.cs.ubc.ca/~goyal 41University of British Columbia, Yahoo! Research

Page 42: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Properties of Diffusion Models Monotonocity

Submodularity – Law of marginal Gain

Incrementality (Optional) can be updated incrementally

using

TSTpSp uu whenever )()(

TS

TpwTpSpwSp uuuu

whenever

)(}){()(}){(

}){( wSpu uwu pSp , and )(

http://people.cs.ubc.ca/~goyal 42University of British Columbia, Yahoo! Research

Page 43: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Comparison of 4 variants

ROC comparison of 4 variants of Static Models

ROC comparison of 4 variants of Discrete Time (DT) Models

Bernoulli is slightly better than Jaccard Among two Bernoulli variants, Partial Credits (PC) wins by a small margin.

http://people.cs.ubc.ca/~goyal 43University of British Columbia, Yahoo! Research

Page 44: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Discrete Time Models Approximation of

CT Models Incremental, hence

efficient 4-variants

corresponding to 4 Static Models

0uvvt ,vt

0

Influence prob. of v on u

uvvt ,vt

DT Model

0,uvp

0,uvp

Time ->

CT Model

Influence prob. of v on u

http://people.cs.ubc.ca/~goyal 44University of British Columbia, Yahoo! Research

Page 45: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Overview Context and Motivation Background Our Framework Algorithms Experiments Related Work Conclusions

http://people.cs.ubc.ca/~goyal 45University of British Columbia, Yahoo! Research

Page 46: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Continuous Time Models Joint influence probability

Individual probabilities – exponential decay

: maximum influence probability of v on u : the mean life time.

)1(1)( ,tuv

Sv

tu pSp

uvuttuv

tuv epp ,/)(0

,,

0,uvp

uv,

http://people.cs.ubc.ca/~goyal 46University of British Columbia, Yahoo! Research

Page 47: Learning Influence Probabilities in Social Networks 1 2 Amit Goyal 1 Francesco Bonchi 2 Laks V. S. Lakshmanan 1 U. of British Columbia Yahoo! Research

Algorithms Training – All models simultaneously in no more

than 2 scans of training sub-set (80% of total) of action log table.

Testing – One model requires only one scan of testing sub-set (20% of total) of action log table.

Due to the lack of time, we omit the details of the algorithms.

http://people.cs.ubc.ca/~goyal 47University of British Columbia, Yahoo! Research