Uncovering Social Links Through Stochastic Point Processes

Rui Zhang (u5963436)

Dr Marian-Andrei Rizoiu

Research School of Computer Science

The Australian National University

COMP6470 Final Presentation

May, 2017

The Problem

[Figure: (a) a Twitter cascade of Tweets 1 to 9; (b) the real retweet network (tree structure); (c) the retweet network from the Twitter API (star structure), which gives the wrong diffusion structure.]

The tree structure reveals:

1. How tweets diffuse
2. Which user is important in the diffusion

The Problem

Purpose: Infer the real parent-offspring relationship between tweets using only one cascade

Existing methods:

• Probability distribution
  Description: Predict links based on probabilities
  Shortcomings: Needs cascades for optimizing the parameters of the distribution
• NETINF [Gomez-Rodriguez et al KDD’10]
  Description: Choose the links that improve the log-likelihood most significantly
  Shortcomings: Needs cascades for training and for prediction

Sometimes only one cascade occurs, and no further cascades are available for training and improving prediction.

Contents of this Presentation

• Modeling Retweet Cascades with Hawkes Point Processes
• Optimization by Expectation Maximization Algorithm
• Constructing the Twitter Dataset
• Evaluation and Results

Introduction to Hawkes Point Processes

Point Processes: describe events occurring at random locations and/or times.

Hawkes Point Processes [Hawkes Biometrika’71]: events that occur increase the likelihood of occurrence of future events (self-exciting).

Applications of Hawkes Point Processes:

(a) Modeling earthquake aftershocks
(b) Modeling trade

Branching Structure and Hidden Vars

Assumption (self-exciting): retweets in a cascade occur randomly, and the occurrence of retweets is likely to cause more retweets.

Each event $(t_i, m_i)$ is posted by user $u_i$:

• $t$: occurring time
• $m$: user influence (the number of followers)

[Figure: the observed event sequence $(t_1, m_1), \dots, (t_5, m_5)$ of users $u_1, \dots, u_5$ along the occurring-time axis, with $(t_1, m_1)$ the root tweet and branching edges labeled $p_{21}$, $p_{41}$, $p_{32}$, $p_{54}$.]

$p_{ji}$: P(the $j$th retweet is caused by the $i$th retweet)

Modeling Retweet Cascades

Model: Hawkes Point Processes with Power-law Triggering Kernel [Mishra et al CIKM’16]

$$\lambda(t) = \sum_{t_i < t} \phi_{m_i}(t - t_i)$$

$$\phi_{m_i}(t - t_i) = \kappa \, m_i^{\beta} \, (t - t_i + c)^{-(1+\theta)}$$

Optimize model parameters $(\kappa, \beta, c, \theta)$ and hidden variables $p_{ji}$
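The kernel and intensity above can be written down directly. The following is a minimal sketch, assuming NumPy; the function names and the toy cascade are hypothetical, and the parameter values are the "optimal" values quoted for the synthetic data later in this presentation.

```python
import numpy as np

def kernel(tau, m, kappa, beta, c, theta):
    """Power-law triggering kernel: kappa * m**beta * (tau + c)**-(1 + theta)."""
    return kappa * m ** beta * (tau + c) ** (-(1.0 + theta))

def intensity(t, times, mags, kappa, beta, c, theta):
    """Event rate at time t: sum of the kernels of all events strictly before t."""
    times = np.asarray(times, dtype=float)
    mags = np.asarray(mags, dtype=float)
    past = times < t
    return float(np.sum(kernel(t - times[past], mags[past], kappa, beta, c, theta)))

# Hypothetical toy cascade (times and follower counts), not taken from the dataset.
times = [0.0, 1.0, 2.5]
mags = [1000.0, 50.0, 200.0]
params = dict(kappa=0.025, beta=0.51, c=0.001, theta=0.2)
print(intensity(3.0, times, mags, **params))
```

Before the root tweet there are no past events, so the sketch returns an intensity of zero there.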


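Optimization by Expectation Maximization Algorithm

Expectation Maximization (EM) Algorithm (H.EM):

1. An iterative algorithm
2. Alternates between E step and M step

E step, $\{p_{ji}\} \leftarrow (\kappa, \beta, c, \theta)$:

$$p_{ji} = \frac{\phi_{m_i}(t_j - t_i)}{\lambda(t_j)}, \quad i = 1, 2, \dots, j-1, \quad j = 2, 3, \dots, n$$

M step, $(\kappa_{old}, \beta_{old}, c_{old}, \theta_{old}, \{p_{ji}\}) \rightarrow (\kappa, \beta, c, \theta)$:

$$(\kappa, \beta, c, \theta) = \arg\max_{\kappa,\beta,c,\theta} \sum_{j=2}^{n} \sum_{i=1}^{j-1} p_{ji} \log \phi_{m_i}(t_j - t_i) - \int_{t_1}^{t_n} \lambda(t)\,dt$$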
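The E step can be illustrated with a short sketch. It assumes NumPy, strictly increasing event times, and the convention that p_ji is the probability that retweet j was caused by the earlier event i; the function names and the toy cascade are hypothetical.

```python
import numpy as np

def kernel(tau, m, kappa, beta, c, theta):
    """Power-law triggering kernel of the model."""
    return kappa * m ** beta * (tau + c) ** (-(1.0 + theta))

def e_step(times, mags, kappa, beta, c, theta):
    """Return P with P[j, i] = p_ji for i < j (0-based indices), zero elsewhere."""
    times = np.asarray(times, dtype=float)
    mags = np.asarray(mags, dtype=float)
    n = len(times)
    P = np.zeros((n, n))
    for j in range(1, n):
        # Kernel contribution of every earlier event i to the rate at t_j.
        phi = kernel(times[j] - times[:j], mags[:j], kappa, beta, c, theta)
        # phi.sum() is lambda(t_j) when all earlier times are strictly before t_j.
        P[j, :j] = phi / phi.sum()
    return P

# Hypothetical toy cascade, not taken from the dataset.
P = e_step([0.0, 1.0, 2.5], [1000.0, 50.0, 200.0], 0.025, 0.51, 0.001, 0.2)
print(P)
```

Each row after the first sums to one, so row j can be read as a distribution over the candidate parents of retweet j. The M step would then maximize the weighted objective over the four parameters with a numerical optimizer, which this sketch leaves out.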


Constructing the Twitter Dataset

[Diagram: data collection with the components Twitter Crawler, Twitter API, Retweet Cascades, Twitter Users in Cascades, and Friend Networks; label: Simultaneously. Source: Sydney Morning Herald (start: 14th Feb).]

Statistics on Current Data

Item                                                  Quantity
Cascades                                              68040
Tweets in cascades                                    259186
Users in cascades                                     61174
Cascades with more than 50 retweets ($C_{50}$)        274
Users in $C_{50}$                                     16125
Tweets in $C_{50}$                                    33539
Downloaded friends of users in $C_{50}$               16051


Evaluation and Results

Calculate optimal parameters on synthetic data

Baseline: maximizing the observed log-likelihood (MLL) of the same Point Process Models [Mishra et al CIKM’16]

$$(\kappa, \beta, c, \theta) = \arg\max_{(\kappa,\beta,c,\theta)} \sum_{i=2}^{n} \log \lambda(t_i) - \int_{t_1}^{t_n} \lambda(t)\,dt$$

Data: 10 cascades (20 experiments with different initial parameters on each cascade)
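The observed log-likelihood used by the MLL baseline can be evaluated in closed form for the power-law kernel, because the integral of each kernel from its event time to t_n is kappa * m**beta * (c**-theta - (t_n - t_i + c)**-theta) / theta. The sketch below assumes NumPy, theta > 0, c > 0, and strictly increasing times; the function name is hypothetical, and the optimizer that would maximize this value is left out.

```python
import numpy as np

def log_likelihood(times, mags, kappa, beta, c, theta):
    """Observed log-likelihood: sum_{i>=2} log lambda(t_i) - integral of lambda over [t_1, t_n]."""
    times = np.asarray(times, dtype=float)
    mags = np.asarray(mags, dtype=float)
    scale = kappa * mags ** beta
    # Sum of log-intensities at every event after the root tweet.
    log_term = 0.0
    for j in range(1, len(times)):
        lam = np.sum(scale[:j] * (times[j] - times[:j] + c) ** (-(1.0 + theta)))
        log_term += np.log(lam)
    # Closed-form integral of each event's kernel from its own time up to t_n.
    horizon = times[-1] - times
    compensator = np.sum(scale * (c ** -theta - (horizon + c) ** -theta) / theta)
    return float(log_term - compensator)

# Hypothetical two-event example with unit parameters chosen for easy checking.
print(log_likelihood([0.0, 1.0], [1.0, 1.0], kappa=1.0, beta=0.0, c=1.0, theta=1.0))
```

With these unit parameters the kernel is (tau + 1)**-2, so the value is log(0.25) - 0.5.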

[Figure: parameter values fitted by H.EM and MLL on synthetic data, one panel per parameter: K (optimal: 0.025), Beta (optimal: 0.51), C (optimal: 0.001), Theta (optimal: 0.2).]

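Performance Measures

[Figure: for the retweet by $u_5$, the candidate links $p_{51}$, $p_{52}$, $p_{53}$, $p_{54}$ are ordered on a probability axis from 0 to 1 and each is marked True or False; panel labels: time, probability, Friend Networks.]

• ROC curve: Area Under Curve (AUC)
• Accuracy: the link with the highest probability is taken as an edge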
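Both measures can be computed from a matrix of link probabilities and a list of true parents. The sketch below assumes NumPy and makes one choice the slides do not specify: all candidate links of a cascade are pooled into a single ROC curve. The function names and the toy data are hypothetical.

```python
import numpy as np

def link_scores(P, parents):
    """Flatten candidate links (j, i) with i < j into scores and True/False labels."""
    scores, labels = [], []
    for j in range(1, len(parents)):
        for i in range(j):
            scores.append(P[j, i])
            labels.append(parents[j] == i)
    return np.array(scores), np.array(labels, dtype=bool)

def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of true and false links."""
    pos, neg = scores[labels], scores[~labels]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((wins + 0.5 * ties) / (len(pos) * len(neg)))

def accuracy(P, parents):
    """Fraction of retweets whose highest-probability candidate is the true parent."""
    hits = [int(np.argmax(P[j, :j]) == parents[j]) for j in range(1, len(parents))]
    return float(np.mean(hits))

# Hypothetical toy cascade: events 1 and 2 both have event 0 as their true parent.
P = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.3, 0.7, 0.0]])
parents = [-1, 0, 0]
scores, labels = link_scores(P, parents)
print(auc(scores, labels), accuracy(P, parents))
```

In this toy case one of the two true links outranks the single false link and one of the two retweets gets the right parent, so both measures are 0.5.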

Evaluation and Results

Compare with Seven Methods on Real Data

Baselines:

• H.MLLPL (Power-Law Kernel) and H.MLLEXP (Exponential Kernel): infer $p_{ji}$ after optimizing the log-likelihood (H.EM infers it during optimization); do not need training
• Exponential distribution (E), Power-law distribution (PL), Rayleigh distribution (R), and Social Exponential (SE): directly calculate probabilities of links without iterations; need training
  E: $p_{ji} = (\alpha - 1)\, e^{-\alpha (t_j - t_i)}$
  PL: $p_{ji} = (\alpha - 1)\, (t_j - t_i)^{-\alpha}$
  R: $p_{ji} = \alpha\, (t_j - t_i)\, e^{-0.5 \alpha (t_j - t_i)^2}$
  SE: $p_{ji} = \frac{m_i}{\sum_{k=1}^{i} m_k}\, e^{-\alpha (t_j - t_i)}$
• NETINF: select the edges that increase the log-likelihood most significantly; needs training

274 cascades: 254 for testing and 20 for training E, PL, R, SE (mean AUC) and NETINF (mean Accuracy)
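The distribution-based baselines score each candidate link from the time gap alone. The sketch below assumes NumPy and transcribes the E, PL, and R formulas as they appear on this slide, without normalizing the scores over candidates (the slides do not say whether that is done); the function names and the toy times are hypothetical.

```python
import numpy as np

def exponential(dt, alpha):
    """E baseline score, as written on the slide."""
    return (alpha - 1.0) * np.exp(-alpha * dt)

def power_law(dt, alpha):
    """PL baseline score; needs dt > 0."""
    return (alpha - 1.0) * dt ** (-alpha)

def rayleigh(dt, alpha):
    """R baseline score."""
    return alpha * dt * np.exp(-0.5 * alpha * dt ** 2)

def baseline_scores(times, score_fn, alpha):
    """Return S with S[j, i] = score of 'event j was caused by earlier event i'."""
    times = np.asarray(times, dtype=float)
    n = len(times)
    S = np.zeros((n, n))
    for j in range(1, n):
        S[j, :j] = score_fn(times[j] - times[:j], alpha)
    return S

# Hypothetical toy times; alpha would come from the 20 training cascades.
S = baseline_scores([0.0, 1.0, 3.0], power_law, alpha=2.0)
print(S)
```

With a power-law score the most recent earlier event always ranks first, which is one way to see why these baselines cannot use follower counts the way the Hawkes kernel does.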

Evaluation and Results

Compare with Seven Methods on Real Data

Compare H.EM with baselines (AUC)

Method     H.EM   SE     H.MLLPL  EXP    H.MLLEXP  PL     R      NETINF
Mean AUC   0.832  0.872  0.83     0.726  0.726     0.714  0.728  NA

Evaluation and Results

Compare with Seven Methods on Real Data

Compare H.EM with baselines (Accuracy)

Method          H.EM   SE     H.MLLPL  EXP    H.MLLEXP  PL     R      NETINF
Mean Accuracy   0.506  0.556  0.468    0.185  0.187     0.186  0.567  0.249

1. Our method does not need training
2. Inferring $p_{ji}$ during optimization improves performance

Summary

• Modeling by Hawkes Point Processes with Power-law Kernel
• Branching structure of Hawkes used to retrieve the parenthood relation between retweets
• Inferring $p_{ji}$ during optimization is important
• Applied to retrieving the true retweet relations in Twitter cascades

The Way Ahead

• Experiments on more cascades with different themes
• Try more competitive triggering kernels

Thank You!

References

• Gomez-Rodriguez, M., Leskovec, J., & Krause, A. (2010, July). Inferring networks of diffusion and influence. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1019-1028). ACM.
• Mishra, S., Rizoiu, M.-A., & Xie, L. (2016, October). Feature driven and point process approaches for popularity prediction. In Proceedings of the 25th ACM International on Conference on Information and Knowledge Management (pp. 1069-1078). ACM.