MobiHoc 2014 - Kennesaw State University

MINIMUM-SIZED INFLUENTIAL NODE SET SELECTION FOR SOCIAL NETWORKS UNDER THE INDEPENDENT CASCADE MODEL

MobiHoc 2014

Jing (Selena) He

Department of Computer Science, Kennesaw State University

Shouling Ji and Raheem Beyah

School of Electrical and Computer Engineering, Georgia Institute of Technology

Zhipeng Cai

Department of Computer Science, Georgia State University

INTRODUCTION

What is a social network?

The graph of relationships and interactions within a group of individuals.

SOCIAL NETWORK AND SPREAD OF INFLUENCE

A social network plays a fundamental role as a medium for the spread of INFLUENCE among its members

Opinions, ideas, information, innovation…

Direct marketing uses “word-of-mouth” effects to significantly increase profits (Facebook, Twitter, MySpace, …)

MOTIVATION

• 900 million users, Apr. 2012

• The 3rd largest “country” in the world

• More visitors than Google

• Action: Update status, create event

• More than 4 billion images

• Action: Add tags, add favorites

• 2009: 2 billion tweets per quarter

• 2010: 4 billion tweets per quarter

• Action: Post tweets, retweet

Social networks have already become a bridge connecting our real daily life and the virtual web space

MOTIVATION (CONT.)

• Modeling and tracking users’ actions in social networks is a very important issue and can benefit many real applications

– Advertising

– Social recommendation

– Expert finding

– Marketing

– …

APPLICATION

Who are the opinion leaders in a community?

[Figure: example social network showing Marketer Alice and users George, Frank, Ada, Eve, David, Bob, and Carol, with numeric labels.]

Find a minimum-sized set of nodes (users) in a social network that can influence every node in the network

OUTLINE

Network Model

Model of influence

Minimum-sized Influence Node Set selection problem

Problem definition

Greedy Algorithm

Proof of performance bound

Experiments

Data and setting

Results

NETWORK MODEL

A social network is represented as an undirected graph

Nodes start either active or inactive

An active node may trigger activation of neighboring nodes based on a pre-defined threshold τ

Monotonicity assumption: active nodes never deactivate

MODEL OF INFLUENCE

If u1 is active, then the active node set I = {u1}

P1(I) = 1

P2(I) = 0.5

P3(I) = 0.7

P4(I) = 0.6

If u1 and u4 are active, then the active node set I = {u1, u4}

P1(I) = 1 – (1 – P11)(1 – P14) = 1

P2(I) = 1 – (1 – P21)(1 – P24) = 0.9

P3(I) = 1 – (1 – P31)(1 – P34) = 0.97

P4(I) = 1 – (1 – P41)(1 – P44) = 1

Pii = 1, if ui ∈ I

Pii = 0, otherwise

$P_i(I) = 1 - \prod_{u_j \in I} (1 - P_{ij}) \ge \tau$
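The combination rule above can be sketched in Python. This is a minimal illustration, not the authors' code: the names `W`, `pairwise`, and `activation_probability` are hypothetical. The values P12, P13, and P14 are read from the I = {u1} slide; P24 = 0.8 and P34 = 0.9 are inferred from the results 0.9 and 0.97, and P23 = 0.4 is inferred from the sums on the EXAMPLE slides, so treat those three as assumptions.

```python
from math import prod

# Pairwise influence values for the four-node example (undirected, so symmetric).
# P24, P34 and P23 are inferred from the slide results, not stated explicitly.
W = {
    frozenset((1, 2)): 0.5,
    frozenset((1, 3)): 0.7,
    frozenset((1, 4)): 0.6,
    frozenset((2, 3)): 0.4,
    frozenset((2, 4)): 0.8,
    frozenset((3, 4)): 0.9,
}

def pairwise(i, j):
    """P_ij: 1 when i == j (only used when u_i is in I), else the edge value or 0."""
    if i == j:
        return 1.0
    return W.get(frozenset((i, j)), 0.0)

def activation_probability(i, active):
    """P_i(I) = 1 - prod over u_j in I of (1 - P_ij)."""
    return 1.0 - prod(1.0 - pairwise(i, j) for j in active)

# Evaluate every node for the active set I = {u1, u4} used on the slide.
for i in (1, 2, 3, 4):
    print(f"P{i}(I) = {activation_probability(i, {1, 4}):.2f}")
```

Up to floating-point rounding, the printed values should agree with the slide's 1, 0.9, 0.97, and 1.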

MINIMUM-SIZED INFLUENCE NODE SET SELECTION PROBLEM (MINS)

Given

a social network G = (V, E, P)

a threshold τ

Goal

The initially selected active node set, denoted by I, can influence every node in the network:

$\forall u_i \in V,\; P_i(I) = 1 - \prod_{u_j \in I} (1 - P_{ij}) \ge \tau$

Objective

Minimize the size of I
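To make the constraint and the objective concrete, the sketch below checks whether a candidate set is feasible and finds a minimum-sized feasible set by exhaustive search on a four-node instance. Exhaustive search is exponential and is used here only for illustration; it is not the authors' algorithm. The names `is_feasible` and `brute_force_mins`, the tolerance `eps`, and the matrix `P` are assumptions (the matrix follows the slides' four-node example, with P23, P24, and P34 inferred from the slide results).

```python
from itertools import combinations
from math import prod

# Symmetric pairwise influence matrix; index k stands for node u_{k+1}.
# The diagonal is 1 because P_ii = 1 whenever u_i is in the active set.
P = [
    [1.0, 0.5, 0.7, 0.6],
    [0.5, 1.0, 0.4, 0.8],
    [0.7, 0.4, 1.0, 0.9],
    [0.6, 0.8, 0.9, 1.0],
]

def is_feasible(active, P, tau, eps=1e-12):
    """True if every node u_i satisfies 1 - prod_{j in I}(1 - P_ij) >= tau."""
    return all(
        1.0 - prod(1.0 - P[i][j] for j in active) >= tau - eps
        for i in range(len(P))
    )

def brute_force_mins(P, tau):
    """Return a smallest feasible active set by trying sets of increasing size."""
    n = len(P)
    for size in range(1, n + 1):
        for candidate in combinations(range(n), size):
            if is_feasible(candidate, P, tau):
                return set(candidate)
    return None  # unreachable for tau <= 1, since I = V is always feasible

best = brute_force_mins(P, 0.8)
print(len(best), is_feasible(best, P, 0.8))
```

On this instance no single node is feasible for τ = 0.8, so the minimum size is 2; several two-node sets satisfy the constraint.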

CONTRIBUTION FUNCTION

$f(I) = \sum_{u_i \in V} \min(P_i(I), \tau)$

Greedy algorithm

Initialize I = empty set

While f(I) < |V|τ do

Choose u to maximize f(I ∪ {u})

I = I ∪ {u}

End while

Return I
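The greedy loop above could be written in Python roughly as follows. This is a sketch under stated assumptions rather than the authors' implementation: the names `contribution` and `greedy_mins` are hypothetical, the tolerance `eps` and the rounding inside the tie-break are additions to guard against floating-point noise, and the matrix `P` uses the four-node example with P23, P24, and P34 inferred from the slide results.

```python
from math import prod

def contribution(active, P, tau):
    """f(I) = sum over u_i of min(P_i(I), tau), with P_i(I) = 1 - prod(1 - P_ij)."""
    return sum(
        min(1.0 - prod(1.0 - P[i][j] for j in active), tau)
        for i in range(len(P))
    )

def greedy_mins(P, tau, eps=1e-9):
    """Add the node with the largest f(I ∪ {u}) until f(I) reaches |V| * tau."""
    n = len(P)
    selected = []
    target = n * tau
    while contribution(selected, P, tau) < target - eps:
        candidates = [u for u in range(n) if u not in selected]
        # max() keeps the first maximizer, so ties go to the lowest node index;
        # rounding makes values that differ only by float noise count as ties.
        best = max(
            candidates,
            key=lambda u: round(contribution(selected + [u], P, tau), 9),
        )
        selected.append(best)
    return selected

# Four-node example; index k stands for node u_{k+1} (values partly inferred).
P = [
    [1.0, 0.5, 0.7, 0.6],
    [0.5, 1.0, 0.4, 0.8],
    [0.7, 0.4, 1.0, 0.9],
    [0.6, 0.8, 0.9, 1.0],
]
print([f"u{k + 1}" for k in greedy_mins(P, 0.8)])
```

Each iteration evaluates f once per remaining candidate, so this direct version costs O(|V|^2) evaluations of P_i(I) per round; it favors clarity over speed.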

EXAMPLE

First round: I = empty set

Second round:

I = {u1}

f(I) = 0.8 + 0.5 + 0.7 + 0.6 = 2.6

I = {u2}

f(I) = 0.5 + 0.8 + 0.4 + 0.8 = 2.5

I = {u3}

f(I) = 0.7 + 0.4 + 0.8 + 0.8 = 2.7

I = {u4}

f(I) = 0.6 + 0.8 + 0.8 + 0.8 = 3.0

Here τ = 0.8 and $f(I) = \sum_{u_i \in V} \min(P_i(I), \tau)$. The candidate u4 gives the largest value, f(I) = 3.0, so u4 is selected.

Third round:

I = {u4 ,u1}

f(I) = 0.8 + 0.8 + 0.8 + 0.8 = 3.2

I = {u4 ,u2}

f(I) = 0.8 + 0.8 + 0.8 + 0.8 = 3.2

I = {u4 ,u3}

f(I) = 0.8 + 0.8 + 0.8 + 0.8 = 3.2

Use node ID to break the tie

I = {u4 ,u1}

The greedy algorithm stops, since f(I) = |V|τ = 4 × 0.8 = 3.2.
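To see why every candidate reaches 3.2 in the third round, the contribution of one candidate can be expanded term by term. The derivation below does so for I = {u4, u2}. The values P12 = 0.5 and P14 = 0.6 come from the I = {u1} slide; P23 = 0.4 and P34 = 0.9 are inferred from the second-round sums and from P3({u1, u4}) = 0.97, so treat them as assumptions.

```latex
\begin{aligned}
P_1(I) &= 1 - (1 - P_{14})(1 - P_{12}) = 1 - 0.4 \times 0.5 = 0.8 \\
P_2(I) &= 1 \quad (u_2 \in I) \\
P_3(I) &= 1 - (1 - P_{34})(1 - P_{32}) = 1 - 0.1 \times 0.6 = 0.94 \\
P_4(I) &= 1 \quad (u_4 \in I) \\
f(I) &= \min(0.8, 0.8) + \min(1, 0.8) + \min(0.94, 0.8) + \min(1, 0.8) = 3.2
\end{aligned}
```

Because each term is capped at τ, any excess above 0.8 does not count, which is why the three candidates tie.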

THEORETICAL ANALYSIS

Theorem 1. The MINS selection problem is NP-hard.

SIMULATION SETTINGS

Random graphs are generated from the random graph model G(n, p) = {G | G has n nodes, and an edge between any pair of nodes is generated with probability p}.
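A G(n, p) graph as defined above could be sampled as follows. This is a minimal sketch: the function name `gnp_random_graph` and the `seed` parameter are hypothetical. The slides do not say how influence values are assigned to the generated edges, so only the topology is produced here.

```python
import random
from itertools import combinations

def gnp_random_graph(n, p, seed=None):
    """Return the edge list of a G(n, p) graph on nodes 0..n-1.

    Each of the n*(n-1)/2 unordered node pairs becomes an edge
    independently with probability p.
    """
    rng = random.Random(seed)  # private generator so runs are reproducible
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]

edges = gnp_random_graph(10, 0.3, seed=42)
print(len(edges), "edges among", 10, "nodes")
```

Passing the same `seed` reproduces the same graph, which helps when comparing algorithms on identical instances.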

EXPERIMENT DATA

Real-world data set: an academic coauthor network extracted from the academic search system Arnetminer [19].

Coauthor networks arguably capture many of the key features of social networks more generally.

Resulting graph: 640,134 nodes (authors) and 1,554,643 distinct edges (coauthor relations)

RESULTS: SIMULATION

RESULTS: REAL DATA

CONCLUSIONS

We introduce a new optimization problem, the Minimum-sized Influential Node Set (MINS) selection problem, and prove that it is NP-hard under the independent cascade model.

We define a polymatroid contribution function, which suggests a greedy approximation algorithm. A comprehensive theoretical analysis of its performance ratio is given.

We conduct extensive experiments and simulations to validate the proposed greedy algorithm on both real-world coauthor data sets and random graphs.

FUTURE WORK

Study more realistic network models

Directed graph

Study more general influence models

Deal with negative influences

Study how the network evolves over time

Q & A
