Discovering Leaders from Community Actions Amit Goyal 1 Francesco Bonchi 2 Laks V.S. Lakshmanan 1 Oct 27, 2008 1 2

Discovering Leaders from Community Actions


Page 1: Discovering Leaders from  Community Actions

Discovering Leaders from Community Actions

Amit Goyal1

Francesco Bonchi2

Laks V.S. Lakshmanan1

Oct 27, 2008 1 2

Page 2: Discovering Leaders from  Community Actions

Context & Motivations: Viral Marketing

Page 3: Discovering Leaders from  Community Actions

Word of Mouth and Viral Marketing

We are more influenced by our friends than by strangers.

68% of consumers consult friends and family before purchasing home electronics (Burke 2003).

Amit Goyal (University of British Columbia) http://cs.ubc.ca/~goyal/

Page 4: Discovering Leaders from  Community Actions

Viral Marketing

Also known as Target Advertising

Initiate a chain reaction through the word-of-mouth effect

Low investment, maximum gain

Page 5: Discovering Leaders from  Community Actions

Viral Marketing as an Optimization Problem

Given: a network with influence probabilities

Problem: select the top-k leaders such that, by targeting them, the spread of influence is maximized

Hao Ma et al 2008, Domingos et al 2001, Richardson et al 2002, Kempe et al 2003

How to calculate true influence probabilities?

Page 6: Discovering Leaders from  Community Actions

A pattern mining approach

We propose a completely different approach based on frequent pattern mining.

We focus on the actions performed by users:

• Joining a community (as in a flickr/facebook community)

• Rating a song or a movie (as in Y! Music, Y! Movie)

The time at which actions are performed is important.

Assumption: users can see their friends' actions.

Page 7: Discovering Leaders from  Community Actions

Our Contributions

Formally define the notion of leaders and its various flavors

Efficient algorithms for extracting these leaders

Demonstrate the utility and scalability of our algorithms via an extensive set of experiments on a real-world dataset:

• Yahoo! Messenger (social graph)

• Yahoo! Movies ratings (actions log)

Page 8: Discovering Leaders from  Community Actions

Rest of the talk

Framework definition:

• Influence propagation on the social network

• Various notions of leaders

Algorithms

Experiments

Related Work

Conclusion

Page 9: Discovering Leaders from  Community Actions

Framework Definition

Page 10: Discovering Leaders from  Community Actions

Input Data (1)

A social network, i.e., an undirected graph G = (V, E) where nodes are users and edges represent social ties.

Users declare their friends, e.g., on Facebook, Yahoo! Messenger, etc.

Page 11: Discovering Leaders from  Community Actions

Input Data (2)

An actions log sorted in chronological order, i.e., a relation Actions(User, Action, Time)

Example: Jack joined the Yoga community at time 5

Assumption: users can see their friends' actions (feeds)

Page 12: Discovering Leaders from  Community Actions

Action Propagation

Jack and Jill are friends. Jack and Mary are friends. The action is “Joining the Yoga community”.

[Figure: Jack joined the Yoga community at time 5, Jill at time 8, Mary at time 1000.]

The action propagated from Jack to Jill (after 3 time units) and from Jack to Mary (after 995 time units).
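The propagation rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming that an action propagates from u to v when the two are friends and u performed the action strictly earlier than v. The function name `propagation_edges` and the tuple layout of the log are illustrative choices, not part of the talk.

```python
def propagation_edges(friends, log, action):
    """Return {(u, v): delay} for friend pairs where u performed `action` before v.

    friends: iterable of undirected (user, user) ties
    log:     iterable of (user, action, time) tuples, as in Actions(User, Action, Time)
    """
    # Time at which each user performed the action (assumed: at most once per user).
    t = {user: time for user, act, time in log if act == action}
    edges = {}
    for u, v in friends:
        if u in t and v in t and t[u] != t[v]:
            first, second = (u, v) if t[u] < t[v] else (v, u)
            edges[(first, second)] = t[second] - t[first]
    return edges


friends = [("Jack", "Jill"), ("Jack", "Mary")]
log = [("Jack", "yoga", 5), ("Jill", "yoga", 8), ("Mary", "yoga", 1000)]
edges = propagation_edges(friends, log, "yoga")
print(edges)
```

On the slide's data this yields one edge from Jack to Jill with delay 3 and one from Jack to Mary with delay 995.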

Page 13: Discovering Leaders from  Community Actions

Propagation Graph

[Figure: propagation graph for “Joining the Yoga community”: Jack (time 5), Jill (time 8), Joey (time 12), Ben (time 15), Mary (time 1000).]

Can we say Mary got influenced by Jack? No.

Page 14: Discovering Leaders from  Community Actions

User Influence Graph

When an action propagates from user u to user v, we may think of v as being influenced by u.

Influence should decay over time.

Size of influence graph << size of propagation graph (PG)

[Figure: the propagation graph (Jack at time 5, Jill at 8, Joey at 12, Ben at 15, Mary at 1000) next to the user influence graph for Jack, which contains Jack, Jill, Joey and Ben but not Mary.]
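One way to read the influence graph is as the part of the propagation graph reachable from a user within a time budget π. The sketch below assumes exactly that, with an inclusive bound t(v) ≤ t(root) + π, which is consistent with IM10(Jack) = 3 elsewhere in the talk. The friendship list is hypothetical, since the figure's edges are not recoverable from this transcript, and `influenced` is an illustrative name.

```python
from collections import deque


def influenced(times, friends, root, pi):
    """Users reachable from `root` along time-increasing friend edges within pi time units."""
    adj = {}
    for u, v in friends:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    limit = times[root] + pi
    seen, queue = {root}, deque([root])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            # Follow an edge only forward in time and only inside the budget.
            if v not in seen and v in times and times[u] < times[v] <= limit:
                seen.add(v)
                queue.append(v)
    return seen - {root}


# Times from the slide; the ties below are an assumed reading of the figure.
times = {"Jack": 5, "Jill": 8, "Joey": 12, "Ben": 15, "Mary": 1000}
friends = [("Jack", "Jill"), ("Jack", "Joey"), ("Jill", "Joey"),
           ("Jill", "Ben"), ("Joey", "Ben"), ("Jack", "Mary")]
print(sorted(influenced(times, friends, "Jack", pi=10)))
```

With π = 10 Mary is excluded because she joined 995 time units after Jack, matching the "No" on the previous slide.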

Page 15: Discovering Leaders from  Community Actions

Leaders – first definition

Who should be a leader?

• For an action, a leader should influence a sufficiently large number of users (at least ψ).

• For an action, a leader should influence these users within a reasonable amount of time (at most π).

• A user should act as a leader in a sufficiently large number of actions (at least σ).

If ψ = 2, π = 15 and σ = 1, then both Jack and Jill are leaders.
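The three thresholds can be checked mechanically once the number of users influenced per (user, action) within π is known. The sketch below assumes the thresholds are inclusive, since the worked example only holds that way: Jill can have at most 2 followers, on 1 action. The counts are hypothetical values consistent with the Yoga example, and `find_leaders` is an illustrative name.

```python
from collections import Counter


def find_leaders(influence_counts, psi, sigma):
    """influence_counts maps (user, action) -> number of users influenced within pi."""
    actions_led = Counter()
    for (user, _action), count in influence_counts.items():
        if count >= psi:          # enough followers for this action
            actions_led[user] += 1
    # Keep users who lead in at least sigma actions.
    return {user for user, n in actions_led.items() if n >= sigma}


# Assumed counts for pi = 15 on the single "yoga" action.
counts = {("Jack", "yoga"): 3, ("Jill", "yoga"): 2, ("Joey", "yoga"): 1,
          ("Ben", "yoga"): 0, ("Mary", "yoga"): 0}
leaders = find_leaders(counts, psi=2, sigma=1)
print(sorted(leaders))
```

Raising ψ to 3 leaves only Jack, and raising σ to 2 leaves nobody, because the example log contains a single action.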

[Figure: several copies of the propagation graph (Jack at time 5, Jill at 8, Joey at 12, Ben at 15, Mary at 1000) with edges labelled by elapsed time units; the labels that survive extraction include 3, 4, 7 and 995.]

Page 16: Discovering Leaders from  Community Actions

Tribe Leader

A leader may influence different users for different actions.

What if a leader leads a fixed set of users for different actions?

We call such leaders Tribe Leaders.

Tribes can be considered small communities.

[Figure: jack with actions A1, A2 and A3; A1, A2 and A3 are 3 different actions.]

Page 17: Discovering Leaders from  Community Actions

Additional Constraint: Genuineness

It may happen that a user acts as a leader but is in fact always a follower of other leaders.

We want to avoid this kind of fake leader.

gen(Jill) = 1/3

Another constraint: confidence

[Figure: Tom, Jill and Jack with edges labelled by actions A1, A2 and A3; A1, A2 and A3 are 3 different actions.]

Page 18: Discovering Leaders from  Community Actions

Algorithms

But how will I discover the leaders?

Page 19: Discovering Leaders from  Community Actions

Algorithms: Overview

Assumptions:

• The social graph is huge: millions of nodes.

• The actions log is huge: millions of tuples.

• For an action, the size of a user influence graph << the size of the propagation graph, for all users.

Our algorithms are able to extract the patterns (leaders and tribe leaders) in no more than one scan of the action log table.

Page 20: Discovering Leaders from  Community Actions

Algorithms: Overview

Scan the action log table by means of a window of size π, backward in time, i.e., starting from the most recent timestamp (the bottom of the table if we assume tuples to be ordered by time).

Efficiently compute the influence matrix, i.e., a Users x Actions matrix in which IMπ(u, a) is the number of users influenced by u w.r.t. action a within time π.

Compute leaders from IM.

[Figure: influence graph for Jack (time 5) with Jill (time 8), Joey (time 12) and Ben (time 15).]

IM10(Jack, “joining yoga community”) = 3
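The backward scan can be illustrated with plain Python sets before any bit-level optimization. This is a simplified sketch, not the algorithm from the talk: it keeps every follower set in memory instead of a sliding window, and it assumes each user performs an action at most once. The friendship list is a hypothetical reading of the figure, and `influence_matrix` is an illustrative name.

```python
from collections import defaultdict


def influence_matrix(log, friends, pi):
    """Return {(user, action): number of users influenced within pi} in one backward pass."""
    adj = defaultdict(set)
    for u, v in friends:
        adj[u].add(v)
        adj[v].add(u)
    im = {}
    reach = defaultdict(dict)   # action -> {user: set of followers}
    when = defaultdict(dict)    # action -> {user: time of the action}
    # Most recent tuple first, so every friend who acted later is already processed.
    for user, action, time in sorted(log, key=lambda row: row[2], reverse=True):
        seen, t = reach[action], when[action]
        followers = set()
        for friend in adj[user]:
            if friend in seen and t[friend] > time:
                followers |= {friend} | seen[friend]
        # Drop anyone who acted more than pi time units after this user.
        followers = {w for w in followers if t[w] <= time + pi}
        seen[user], t[user] = followers, time
        im[(user, action)] = len(followers)
    return im


friends = [("Jack", "Jill"), ("Jack", "Joey"), ("Jill", "Joey"),
           ("Jill", "Ben"), ("Joey", "Ben"), ("Jack", "Mary")]
log = [("Jack", "yoga", 5), ("Jill", "yoga", 8), ("Joey", "yoga", 12),
       ("Ben", "yoga", 15), ("Mary", "yoga", 1000)]
im = influence_matrix(log, friends, pi=10)
print(im[("Jack", "yoga")])
```

Processing tuples from the latest to the earliest means a user's follower set can be assembled from the sets of friends who acted later, which is why a single pass suffices.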

Page 21: Discovering Leaders from  Community Actions

Computing Influence Matrix (1)

We use a bit vector to track which users are influenced by a given user; it is updated incrementally.

A locking mechanism uses another bit vector: 0 => free bit; 1 => occupied bit.

The node-to-bit-index mapping is stored in a queue. Bits must be dynamically allocated.

[Figure: time window on the propagation graph over nodes R, S, T, W and V.]

Node  InfVec
R     01010111
S     01000110
T     00010110
W     00000110
V     00000100

Queue (from head): (V,2) (W,1) (T,4) (S,6) (R,0)

Lock bit vector: 01010111

Page 22: Discovering Leaders from  Community Actions

Computing Influence Matrix (2)

Slide up the current window: delete node V.

• Delete the entry from the queue.

• Update the lock.

• Update the influence vectors.

Before (time window on the propagation graph over nodes R, S, T, W and V):

Node  InfVec
R     01010111
S     01000110
T     00010110
W     00000110
V     00000100

Queue (from head): (V,2) (W,1) (T,4) (S,6) (R,0)

Lock bit vector: 01010111

After deleting V:

Node  InfVec
R     01010011
S     01000010
T     00010010
W     00000010

Queue (from head): (W,1) (T,4) (S,6) (R,0)

Lock bit vector: 01010011

Page 23: Discovering Leaders from  Community Actions

Computing Influence Matrix (3)

New node P added.

• Issue a lock and add an entry to the queue.

• Compute its influence vector by propagation.

• Number of followers of P = 4, so IM(P,a) = 4.

Before (time window on the propagation graph over nodes R, S, T and W):

Node  InfVec
R     01010011
S     01000010
T     00010010
W     00000010

Queue (from head): (W,1) (T,4) (S,6) (R,0)

Lock bit vector: 01010011

After adding P:

Node  InfVec
P     01010111
R     01010011
S     01000010
T     00010010
W     00000010

Queue (from head): (W,1) (T,4) (S,6) (R,0) (P,2)

Lock bit vector: 01010111
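The three slides above can be mimicked with Python integers as bit vectors. This is a rough sketch under stated assumptions: the class name `InfluenceWindow`, its methods, and the lowest-free-bit allocation policy are all illustrative, and the friend lists passed to `add` are a guess at the figure's edges. Bit indices therefore differ from the slides, but the follower counts agree.

```python
from collections import deque


class InfluenceWindow:
    """Bit-vector bookkeeping for the nodes currently inside the time window."""

    def __init__(self):
        self.lock = 0          # bit i set => index i is occupied
        self.queue = deque()   # (node, bit index); head = next node to leave the window
        self.vec = {}          # node -> bits of the node itself and of its followers

    def add(self, node, later_friends):
        """Add an earlier node; later_friends are friends already in the window."""
        bit = 0
        while (self.lock >> bit) & 1:   # find the lowest free bit
            bit += 1
        self.lock |= 1 << bit           # issue the lock
        self.queue.append((node, bit))
        vector = 1 << bit
        for friend in later_friends:    # propagate the followers' vectors
            vector |= self.vec[friend]
        self.vec[node] = vector
        return bin(vector).count("1") - 1   # followers, excluding the node itself

    def expire(self):
        """Slide the window: drop the node at the head of the queue."""
        node, bit = self.queue.popleft()
        mask = ~(1 << bit)
        self.lock &= mask               # release the lock
        del self.vec[node]
        for other in self.vec:          # clear the bit in every influence vector
            self.vec[other] &= mask
        return node


w = InfluenceWindow()
w.add("V", [])
w.add("W", ["V"])
w.add("T", ["W"])
w.add("S", ["W"])
followers_r = w.add("R", ["S", "T"])
expired = w.expire()
followers_p = w.add("P", ["R"])
print(followers_r, expired, followers_p)
```

Because bits are released when a node leaves the window, a vector only needs as many bits as there are nodes visible at once, not one per user in the network.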

Page 24: Discovering Leaders from  Community Actions

Mining Tribe Leaders

The influence matrix is not enough. We use an influence cube: Users x Actions x Users.

ICπ(u,a,v) = 1 when user v is influenced by user u for action a within time π.

We do not explicitly compute the whole cube, due to sparsity.

The problem is the same as discovering the existence of frequent itemsets of size larger than a given threshold.
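To make the frequent-itemset connection concrete, the brute-force sketch below treats each action as a transaction holding the set of users a candidate led for that action. It assumes inclusive thresholds and is deliberately naive, with none of the optimizations the talk refers to. The name `is_tribe_leader` and the follower sets are hypothetical.

```python
from collections import Counter
from itertools import combinations


def is_tribe_leader(follower_sets, psi, sigma):
    """True if some set of psi users appears among the followers in at least sigma actions.

    Checking sets of size exactly psi is enough: any larger frequent set
    contains a psi-sized subset that is at least as frequent.
    """
    support = Counter()
    for followers in follower_sets:
        for subset in combinations(sorted(followers), psi):
            support[subset] += 1
    return any(count >= sigma for count in support.values())


# Hypothetical follower sets of one user for three actions A1, A2, A3.
jack = [{"Jill", "Joey", "Ben"}, {"Jill", "Joey"}, {"Jill", "Joey", "Mary"}]
print(is_tribe_leader(jack, psi=2, sigma=3))
print(is_tribe_leader(jack, psi=3, sigma=2))
```

Here Jill and Joey follow in all three actions, so the user leads a tribe of size 2, but no group of three users follows in two actions.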


Page 25: Discovering Leaders from  Community Actions

Algorithms - Final Comments

The only truly mandatory threshold is π (the time threshold).

Influence Matrix: O(TAn²) bit-level operations

• T = total number of tuples in the action log

• A = total number of distinct actions

• n = maximum number of nodes visible in any position of the time window

• n << N, where N is the total number of users

Tribe Leaders:

• Influence Cube: O(TAn²)

• Finding the existence of frequent itemsets: exponential in the number of followers, but very fast due to optimizations (Bonchi 2003)

Page 26: Discovering Leaders from  Community Actions

Experiments

Enough talking, show me the results dude!!

Page 27: Discovering Leaders from  Community Actions

Data Preparation

Data:

• Social graph: Yahoo! Instant Messenger

• Actions log: Yahoo! Movies

• Action = user u rated movie m at time t

• The two datasets are joined through common user identifiers.

Started from the Yahoo! Instant Messenger subgraph of “most active” users (110M nodes) and 21M ratings from Yahoo! Movies.

Ended with 217.5K nodes, 221.4K edges and 1.8M ratings.

Page 28: Discovering Leaders from  Community Actions

Data characteristics: connected components

Giant component: 94K users (43.2% of connected users)

Total: 46,650 connected components

Page 29: Discovering Leaders from  Community Actions

Leaders Vs. Tribe leaders

π – threshold on time

σ – threshold on number of actions

ψ – threshold on number of influenced users

Page 30: Discovering Leaders from  Community Actions

Number of leaders found

Page 31: Discovering Leaders from  Community Actions

Run-time

Page 32: Discovering Leaders from  Community Actions

Genuineness: an almost binary concept!

Page 33: Discovering Leaders from  Community Actions

Top-10 tribe leaders w.r.t. tribe size

• Tribe leaders exhibit high confidence.

• Tribe leaders with low genuineness were found to be dominated by other tribe leaders present in the tables.

• We found many users acting as leaders in many actions without being tribe leaders.

Page 34: Discovering Leaders from  Community Actions

Related Work (1)

Identifying influential users: Domingos et al 2001, Richardson et al 2002, Kempe et al 2005

Identifying influential bloggers: Agarwal et al 2008

Identifying communities in social networks: Hopcroft et al 2003, Kumar et al 2006, Backstrom et al 2006, Tantipathananandh et al 2007, Huang et al 2008, Friedland et al 2007

Page 35: Discovering Leaders from  Community Actions

Related Work (2)

Influence and correlation in social networks: Aris Anagnostopoulos et al 2008

Revenue maximization: Hartline et al 2008

Near-optimal sensor placement for outbreak detection: Leskovec et al 2007

Heat diffusion model: Hao Ma et al 2008 (CIKM)

Page 36: Discovering Leaders from  Community Actions

Conclusions

Proposed a framework based on frequent pattern mining for discovering leaders in social networks.

Formally defined the problem of extracting leaders from a social graph and an actions log:

• Various notions of leader and tribe leader

• Their confidence and genuineness variants

Efficient algorithms for extracting leaders of various flavors:

• Just one pass over the actions log table

Demonstrated the utility and scalability of our algorithms via an extensive set of experiments on a real-world dataset:

• Yahoo! Messenger (social graph)

• Yahoo! Movies ratings (actions log)

Page 37: Discovering Leaders from  Community Actions

Ongoing/Future Work

Gurumine: Pattern Mining System for Discovering Leaders and Tribes (demo paper to appear in ICDE 2009)

Leadership Cube: what kind of leaders attract what kind of followers for what kind of actions?

Viral Marketing

Stronger notions of influence?

Page 38: Discovering Leaders from  Community Actions

Thanks!

Page 39: Discovering Leaders from  Community Actions


Backup

Page 40: Discovering Leaders from  Community Actions

Number of leaders found

Page 41: Discovering Leaders from  Community Actions

Additional constraint: confidence

Similarly to association rules, we can have a confidence measure for leaders.

Leadership confidence = (# actions in which the user is a leader) / (# actions performed by the user)

Example: say Jack performed 10 actions and acted as a leader in 7 of them (i.e., enough users, as set by ψ, followed within a short time); then conf(Jack) = 7/10.
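The confidence ratio is simple enough to state as code. The sketch below uses exact fractions so that 7/10 stays 7/10; the function name `leadership_confidence` is illustrative, and rejecting a user with no actions is an assumption about how the undefined case should be handled.

```python
from fractions import Fraction


def leadership_confidence(actions_led, actions_performed):
    """Fraction of a user's actions in which the user acted as a leader."""
    if actions_performed <= 0:
        raise ValueError("confidence is undefined for a user with no actions")
    if not 0 <= actions_led <= actions_performed:
        raise ValueError("actions_led must lie between 0 and actions_performed")
    return Fraction(actions_led, actions_performed)


conf_jack = leadership_confidence(7, 10)
print(conf_jack)
```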
