MINIMUM-SIZED INFLUENTIAL NODE SET SELECTION FOR SOCIAL NETWORKS UNDER THE INDEPENDENT CASCADE MODEL MobiHoc 2014 Jing (Selena) He Department of Computer Science, Kennesaw State University Shouling Ji, and Raheem Beyah School of Electrical and Computer Engineering, Georgia Institute of Technology Zhipeng Cai Department of Computer Science, Georgia State University


Page 1: MobiHoc 2014 - Kennesaw State University

MINIMUM-SIZED INFLUENTIAL NODE SET

SELECTION FOR

SOCIAL NETWORKS UNDER THE

INDEPENDENT CASCADE MODEL

MobiHoc 2014

Jing (Selena) He

Department of Computer Science, Kennesaw State University

Shouling Ji, and Raheem Beyah

School of Electrical and Computer Engineering, Georgia Institute of Technology

Zhipeng Cai

Department of Computer Science, Georgia State University

Page 2: MobiHoc 2014 - Kennesaw State University


INTRODUCTION

What is a social network?

The graph of relationships and interactions within a group of

individuals.

Page 3: MobiHoc 2014 - Kennesaw State University

SOCIAL NETWORK AND SPREAD OF

INFLUENCE

Social networks play a fundamental role as a medium for the spread of INFLUENCE among their members

Opinions, ideas, information, innovation…

Direct marketing uses the “word-of-mouth” effect to significantly increase profits (Facebook, Twitter, Myspace, …)

Page 4: MobiHoc 2014 - Kennesaw State University

MOTIVATION

• 900 million users, Apr. 2012

• The 3rd largest “country” in the world

• More visitors than Google

• Action: Update status, create event

• More than 4 billion images

• Action: Add tags, add favorites

• 2009: 2 billion tweets per quarter

• 2010: 4 billion tweets per quarter

• Action: Post tweets, retweet

Social networks have already become a bridge connecting our real daily lives and the virtual web space

Page 5: MobiHoc 2014 - Kennesaw State University


MOTIVATION (CONT.)

• Modeling and tracking users’ actions in social networks is a very important issue and can benefit many real applications

– Advertising

– Social recommendation

– Expert finding

– Marketing

– …

Page 6: MobiHoc 2014 - Kennesaw State University

APPLICATION

Who are the opinion leaders in a community?

[Figure: example social network. Surviving labels: Marketer, Alice, George, Frank, Ada, Eve, David, Bob, Carol; edges carry integer weights from 1 to 4.]

Find a minimum-sized set of nodes (users) in a social network that can influence every node in the network

Page 7: MobiHoc 2014 - Kennesaw State University

OUTLINE

Network Model

Model of influence

Minimum-sized Influence Node Set selection problem

Problem definition

Greedy Algorithm

Proof of performance bound

Experiments

Data and setting

Results



Page 9: MobiHoc 2014 - Kennesaw State University

NETWORK MODEL

A social network is represented as an undirected graph

Nodes start either active or inactive

An active node may trigger activation of neighboring nodes based on a pre-defined threshold τ

Monotonicity assumption: active nodes never deactivate



Page 11: MobiHoc 2014 - Kennesaw State University

MODEL OF INFLUENCE

If u1 is active, then the active node set I = {u1}

P1(I) = 1

P2(I) = 0.5

P3(I) = 0.7

P4(I) = 0.6


Page 12: MobiHoc 2014 - Kennesaw State University

MODEL OF INFLUENCE


If u1 and u4 are active, then the active node set I = {u1, u4}

P1(I) = 1 – (1 – P11)(1 – P14) = 1

P2(I) = 1 – (1 – P21)(1 – P24) = 0.9

P3(I) = 1 – (1 – P31)(1 – P34) = 0.97

P4(I) = 1 – (1 – P41)(1 – P44) = 1

Pii = 1, if ui ∈ I

Pii = 0, otherwise

$P_i(I) = 1 - \prod_{u_j \in I} (1 - P_{ij}) \ge \tau$
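The activation probabilities on this slide can be reproduced with a short Python sketch. The matrix `P` below is an assumption: its entries are inferred from the numbers shown on the slides (for example, P24 = 0.8 follows from P2(I) = 0.9), and the function name `activation_probability` is illustrative, not taken from the paper.

```python
from math import prod

# Pairwise influence probabilities for the four-node example.
# Assumption: values are inferred from the slide's numbers, and the graph is
# undirected, so P[i][j] == P[j][i]. Index 0 is u1, index 3 is u4.
P = [
    [0.0, 0.5, 0.7, 0.6],
    [0.5, 0.0, 0.4, 0.8],
    [0.7, 0.4, 0.0, 0.9],
    [0.6, 0.8, 0.9, 0.0],
]


def activation_probability(i, active, P):
    """Return P_i(I) = 1 - prod_{u_j in I} (1 - P_ij), with P_ii = 1 if u_i is in I."""
    if i in active:
        return 1.0  # the factor (1 - P_ii) is zero, so the product vanishes
    return 1.0 - prod(1.0 - P[i][j] for j in active)


I = {0, 3}  # I = {u1, u4}
for i in range(4):
    print(f"P{i + 1}(I) = {activation_probability(i, I, P):.2f}")
```

With I = {u1, u4} this prints 1.00, 0.90, 0.97, and 1.00, matching the values on the slide.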


Page 14: MobiHoc 2014 - Kennesaw State University

MINIMUM-SIZED INFLUENCE NODE SET SELECTION PROBLEM (MINS)

Given

a social network G = (V, E, P)

a threshold τ

Goal

The initially selected active node set, denoted by I, must influence every node in the network:

$\forall u_i \in V,\; P_i(I) = 1 - \prod_{u_j \in I} (1 - P_{ij}) \ge \tau$

Objective

Minimize the size of I
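To make the problem statement concrete, here is a minimal brute-force sketch in Python. It is only practical for tiny graphs and is not the method proposed in the talk. The matrix `P` is an assumption inferred from the four-node example in the slides, and the names `is_feasible` and `brute_force_mins` are hypothetical.

```python
from itertools import combinations
from math import prod

# Assumption: influence probabilities inferred from the slides' four-node
# example (index 0 is u1, index 3 is u4).
P = [
    [0.0, 0.5, 0.7, 0.6],
    [0.5, 0.0, 0.4, 0.8],
    [0.7, 0.4, 0.0, 0.9],
    [0.6, 0.8, 0.9, 0.0],
]
TAU = 0.8
EPS = 1e-9  # tolerance for floating-point comparisons


def activation_probability(i, active, P):
    """P_i(I) = 1 - prod_{u_j in I} (1 - P_ij), with P_ii = 1 if u_i is in I."""
    if i in active:
        return 1.0
    return 1.0 - prod(1.0 - P[i][j] for j in active)


def is_feasible(active, P, tau):
    """Check the MINS constraint: every node reaches probability at least tau."""
    return all(
        activation_probability(i, active, P) >= tau - EPS for i in range(len(P))
    )


def brute_force_mins(P, tau):
    """Try all subsets in order of increasing size; return the first feasible one."""
    n = len(P)
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            if is_feasible(set(subset), P, tau):
                return set(subset)
    return None  # only reached if tau > 1, since I = V gives P_i(I) = 1


best = brute_force_mins(P, TAU)
print(len(best), is_feasible(best, P, TAU))
```

On this example no single node satisfies the constraint, so the smallest feasible set has two nodes. The search is exponential in |V|, which is why an approximation algorithm is of interest.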


Page 16: MobiHoc 2014 - Kennesaw State University

CONTRIBUTION FUNCTION

$f(I) = \sum_{u_i \in V} \min(P_i(I), \tau)$

Greedy algorithm

Initialize I = empty set

While f(I) < |V|τ do

Choose u to maximize f(I ∪ {u})

I = I ∪ {u}

End while

Return I
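The pseudocode above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the matrix `P` is inferred from the four-node example in the slides, ties are broken by lowest node ID as in that example, and the tolerance `eps` is added here only to guard against floating-point rounding.

```python
from math import prod

# Assumption: influence probabilities inferred from the slides' four-node
# example (index 0 is u1, index 3 is u4).
P = [
    [0.0, 0.5, 0.7, 0.6],
    [0.5, 0.0, 0.4, 0.8],
    [0.7, 0.4, 0.0, 0.9],
    [0.6, 0.8, 0.9, 0.0],
]


def activation_probability(i, active, P):
    """P_i(I) = 1 - prod_{u_j in I} (1 - P_ij), with P_ii = 1 if u_i is in I."""
    if i in active:
        return 1.0
    return 1.0 - prod(1.0 - P[i][j] for j in active)


def contribution(active, P, tau):
    """f(I) = sum over all nodes of min(P_i(I), tau)."""
    return sum(min(activation_probability(i, active, P), tau) for i in range(len(P)))


def greedy_mins(P, tau, eps=1e-9):
    """Add the node that maximizes f(I ∪ {u}) until f(I) reaches |V| * tau."""
    n = len(P)
    target = n * tau
    selected = []  # list, so the selection order is preserved
    while contribution(set(selected), P, tau) < target - eps:
        best_node, best_value = None, float("-inf")
        for u in range(n):  # ascending IDs: the lowest ID wins a tie
            if u in selected:
                continue
            value = contribution(set(selected) | {u}, P, tau)
            if value > best_value + eps:  # replace only on a strict improvement
                best_node, best_value = u, value
        if best_node is None:
            raise ValueError("target unreachable; tau must be at most 1")
        selected.append(best_node)
    return selected


print([f"u{u + 1}" for u in greedy_mins(P, 0.8)])
```

With τ = 0.8 the sketch selects u4 first and then u1, which is the set {u4, u1} obtained in the worked example.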


Page 17: MobiHoc 2014 - Kennesaw State University

EXAMPLE

First round: I = empty set

Second round:

I = {u1}

f(I) = 0.8 + 0.5 + 0.7 + 0.6 = 2.6

I = {u2}

f(I) = 0.5 + 0.8 + 0.4 + 0.8 = 2.5

I = {u3}

f(I) = 0.7 + 0.4 + 0.8 + 0.8 = 2.7

I = {u4}

f(I) = 0.6 + 0.8 + 0.8 + 0.8 = 3.0

τ = 0.8

$f(I) = \sum_{u_i \in V} \min(P_i(I), \tau)$
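The four candidate values in this round can be checked with a few lines of Python. The matrix `P` is an assumption inferred from the numbers on the slides, and `contribution` is an illustrative name for f.

```python
from math import prod

# Assumption: influence probabilities inferred from the slides' four-node
# example (index 0 is u1, index 3 is u4).
P = [
    [0.0, 0.5, 0.7, 0.6],
    [0.5, 0.0, 0.4, 0.8],
    [0.7, 0.4, 0.0, 0.9],
    [0.6, 0.8, 0.9, 0.0],
]
TAU = 0.8


def contribution(active, P, tau):
    """f(I) = sum over all nodes of min(P_i(I), tau)."""
    total = 0.0
    for i in range(len(P)):
        # P_ii = 1 for seeds; otherwise combine the independent attempts.
        p_i = 1.0 if i in active else 1.0 - prod(1.0 - P[i][j] for j in active)
        total += min(p_i, tau)  # each node contributes at most tau
    return total


for u in range(4):
    print(f"I = {{u{u + 1}}}: f(I) = {contribution({u}, P, TAU):.1f}")
```

This prints 2.6, 2.5, 2.7, and 3.0 for u1 through u4, so u4 is the first node selected.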


Page 18: MobiHoc 2014 - Kennesaw State University

EXAMPLE

Third round:

I = {u4 ,u1}

f(I) = 0.8 + 0.8 + 0.8 + 0.8 = 3.2

I = {u4 ,u2}

f(I) = 0.8 + 0.8 + 0.8 + 0.8 = 3.2

I = {u4 ,u3}

f(I) = 0.8 + 0.8 + 0.8 + 0.8 = 3.2

Use node ID to break the tie

I = {u4 ,u1}

The greedy algorithm stops, since

f(I) = |V|τ = 4 * 0.8 = 3.2.

τ = 0.8

$f(I) = \sum_{u_i \in V} \min(P_i(I), \tau)$



Page 20: MobiHoc 2014 - Kennesaw State University

THEORETICAL ANALYSIS


Theorem 1. The MINS selection problem is NP-hard.


Page 22: MobiHoc 2014 - Kennesaw State University

SIMULATION SETTINGS

Generate random graphs based on the random graph model G(n,p) = {G | G has n nodes, and an edge between any pair of nodes is generated with probability p}.
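A minimal Python sketch of sampling one graph from G(n,p) follows. The function name `gnp_random_graph` and the `seed` parameter are assumptions for illustration. The slides do not say how influence probabilities were assigned to the generated edges, so the sketch produces only the edge structure.

```python
import random
from itertools import combinations


def gnp_random_graph(n, p, seed=None):
    """Sample an undirected graph on n nodes; each pair is linked with probability p."""
    rng = random.Random(seed)  # local generator, so runs are reproducible
    adjacency = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):  # every unordered pair exactly once
        if rng.random() < p:
            adjacency[u].add(v)
            adjacency[v].add(u)  # undirected: store both directions
    return adjacency


graph = gnp_random_graph(10, 0.3, seed=1)
edge_count = sum(len(neighbors) for neighbors in graph.values()) // 2
print(edge_count)
```

The expected number of edges is p · n(n − 1)/2, which gives a quick sanity check on generated instances.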


Page 23: MobiHoc 2014 - Kennesaw State University

EXPERIMENT DATA

Real-world data set: an academic coauthor network extracted from the academic search system Arnetminer [19].

Co-authorship networks arguably capture many of the key features of social networks more generally.

Resulting graph: 640,134 nodes (authors) and 1,554,643 distinct edges (coauthor relations)


Page 25: MobiHoc 2014 - Kennesaw State University

RESULTS: SIMULATION


Page 26: MobiHoc 2014 - Kennesaw State University

RESULTS: SIMULATION


Page 27: MobiHoc 2014 - Kennesaw State University

RESULTS: REAL DATA


Page 28: MobiHoc 2014 - Kennesaw State University

CONCLUSIONS

We introduce a new optimization problem, the Minimum-sized Influential Node Set (MINS) selection problem, and prove that it is NP-hard under the independent cascade model.

We define a polymatroid contribution function, which suggests a greedy approximation algorithm, and we give a theoretical analysis of its performance ratio.

We conduct extensive experiments and simulations to validate the proposed greedy algorithm on both real-world coauthor data sets and random graphs.
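A polymatroid function is one that is zero on the empty set, monotone, and submodular. As a sanity check rather than a proof, the sketch below verifies these three properties by brute force for the contribution function on the four-node example. The matrix `P` is an assumption inferred from the slides, and `is_polymatroid` is a hypothetical helper name.

```python
from itertools import combinations
from math import prod

# Assumption: influence probabilities inferred from the slides' four-node
# example (index 0 is u1, index 3 is u4).
P = [
    [0.0, 0.5, 0.7, 0.6],
    [0.5, 0.0, 0.4, 0.8],
    [0.7, 0.4, 0.0, 0.9],
    [0.6, 0.8, 0.9, 0.0],
]
TAU = 0.8
EPS = 1e-9  # tolerance for floating-point comparisons


def contribution(active):
    """f(I) = sum over all nodes of min(P_i(I), tau)."""
    total = 0.0
    for i in range(len(P)):
        p_i = 1.0 if i in active else 1.0 - prod(1.0 - P[i][j] for j in active)
        total += min(p_i, TAU)
    return total


def is_polymatroid(f, ground):
    """Brute-force check of f(empty) = 0, monotonicity, and submodularity."""
    subsets = [
        frozenset(c) for k in range(len(ground) + 1) for c in combinations(ground, k)
    ]
    if abs(f(frozenset())) > EPS:
        return False
    for a in subsets:
        for b in subsets:
            if not a <= b:
                continue  # only nested pairs A ⊆ B are relevant
            if f(a) > f(b) + EPS:
                return False  # monotonicity violated
            for u in ground - b:
                # Diminishing returns: the gain of u at A is at least its gain at B.
                if f(a | {u}) - f(a) < f(b | {u}) - f(b) - EPS:
                    return False
    return True


print(is_polymatroid(contribution, frozenset(range(4))))
```

The check enumerates all pairs of subsets, so it is only feasible for very small node sets.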

Page 29: MobiHoc 2014 - Kennesaw State University

FUTURE WORK

Study more realistic network models

Directed graphs

Study more general influence models

Deal with negative influences

Study how the network evolves over time

Page 30: MobiHoc 2014 - Kennesaw State University


Q & A