Hypertext2017-Leveraging Followee List Memberships for Inferring User Interests for Passive Users on Twitter

Guangyuan Piao, John G. Breslin

Unit for Social Semantics

28th ACM Conference on Hypertext and Social Media, Prague, Czech Republic, 4-7 July 2017


1/3 of users seek medical information and over 50% of users consume news on social networks

Facebook and Twitter together generate more than 5 billion microblogs per day

[SOURCE] Semantic Filtering for Social Data, Amit et al., Internet Computing’16

According to research by Twopcharts, 44% of Twitter users have never sent a tweet

[SOURCE] http://guardianlv.com/2014/04/twitter-users-are-not-tweeting/

How can we infer user interests for passive users based on the info. of their followees?

Related Work

• user modeling for active users
  • analyzing users’ tweets
  • representing user interests using different approaches
    • bag-of-words
    • topic modeling
    • bag-of-concepts

[Figure: example bag-of-concepts profile with interest frequencies: dbr:Eagles_of_Death_Metal (5), dbr:The_Wombats (2); other labels shown: dbc:Hard_rock, dbp:genre]

Related Work

• user modeling for passive users
  • analyzing information of users’ followees
    • HIW(followees_tweet) [Chen et al. CHI’10]
      • a great amount of data, but also noisy
    • SA(followees_name) [Besel et al. SAC’16, Faralli et al. SNAM’16]
      • link names to entities, construct category-based user profiles
      • spreading activation + WiBi-taxonomy (Wikipedia categories)

[Figure: spreading activation example with dbr:Cristiano_Ronaldo (5), dbc:Real_Madrid_C.F._players, dbr:2014_FIFA_World_Cup_players, Category A, Category B, …]

Related Work

• user modeling for passive users
  • analyzing information of users’ followees
    • IP(followees_bio) [Piao et al. ECIR’17]
      • exploring related categories & entities (1-hop)
      • performed better than HIW(followees_tweet) and SA(followees_name)

[Figure: followee BobHorry (@bob) with the bio “Android developer, educator”; dbr:Android_(operating_system) is linked via dc:subject to dbc:Smartphones and dbc:Tablet_operating_systems, and via dbp:programmedIn to dbr:Java_(programming_language)]

Different Views of Followees

• user modeling for passive users

[Figure: two views of the followee BobHorry (@bob), “Android developer, educator”: biographies (self-description) and list memberships (others’ descriptions)]

Aim of Work

• user modeling for passive users
  • we aim to investigate
    • whether we can leverage the list memberships of followees for inferring user interest profiles
    • whether the two different views of followees complement each other to improve the quality of user profiles

Our Approach

• user modeling leveraging list memberships of followees

Pipeline from Twitter user @alice to an interest profile:

1. fetch user’s followees (Twitter API)
2. fetch list memberships of followees (Twitter API)
3. extract entities from followees’ list memberships (Tag.me)
4. construct primary interests (weighting scheme)
5. interest propagation (DBpedia graph)

Constructing Primary Interests

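• Weighting Scheme 1 (WS1)
  • profile of a followee f in F_u: entity weights, normalized per followee
  • weight of an entity with respect to the target user u: sum of its weights over the followee profiles in F_u

Example:
  normalized followee profiles in F_u:
    A (0.1), B (0.2), F (0.1), …
    B (0.3), F (0.2), C (0.2), …
    …
  target user: B (0.5), …, F (0.3), …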
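The WS1 aggregation can be illustrated with a minimal Python sketch. It assumes that each followee profile is normalized by dividing entity counts by their total, and that the target user's weight is the plain sum over followees, as the example values suggest; the function names `normalize` and `ws1` and the toy counts are hypothetical.

```python
from collections import defaultdict


def normalize(counts):
    """Scale one followee's entity counts so the weights sum to 1 (assumed normalization)."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {entity: count / total for entity, count in counts.items()}


def ws1(followee_entity_counts):
    """Sum normalized followee profiles into a primary interest profile for the target user."""
    profile = defaultdict(float)
    for counts in followee_entity_counts:
        for entity, weight in normalize(counts).items():
            profile[entity] += weight
    return dict(profile)


# Toy data: entity counts extracted from the list memberships of two followees.
followees = [
    {"A": 1, "B": 2, "F": 1},
    {"B": 3, "F": 2, "C": 1},
]
primary_interests = ws1(followees)
```

Because every followee profile sums to 1, each followee contributes equally to the target user's profile regardless of how many lists they appear in.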

Constructing Primary Interests

• Weighting Scheme 2 (WS2)
  • based on the idea of HIW (Chen et al. CHI’10)
  • excluding entities extracted from only a single followee
  • w(u, cj) = the number of followees who have cj in their list memberships

Example:
  followee entity sets:
    A, B, F, …
    B, F, C, …
    …
  target user: B (2), …, F (2), …
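A short Python sketch of WS2 as described above: count, for each entity, the number of followees whose list memberships mention it, and drop entities seen for only one followee. The function name `ws2` and the `min_followees` parameter are hypothetical names used for illustration.

```python
from collections import Counter


def ws2(followee_entities, min_followees=2):
    """Weight each entity by the number of followees whose list memberships contain it.

    Entities that occur for fewer than `min_followees` followees are excluded.
    """
    counts = Counter()
    for entities in followee_entities:
        # set() ensures a followee is counted once per entity.
        counts.update(set(entities))
    return {entity: n for entity, n in counts.items() if n >= min_followees}


# The example from the slide: two followees with entities {A, B, F} and {B, F, C}.
primary_interests = ws2([["A", "B", "F"], ["B", "F", "C"]])
```

With the two followees from the example, only B and F survive, each with weight 2.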

Interest Propagation

• interest propagation using DBpedia (SEMANTiCS’16)
  • SP: # of subpages
  • SC: # of subcategories
  • P: # of properties appearing in the whole DBpedia graph
  • intuition 1: discount common categories
  • intuition 2: discount related entities connected with common properties

[Figure: dbr:Android_(operating_system) linked via dc:subject to dbc:Smartphones and dbc:Tablet_operating_systems, and via dbp:programmedIn to dbr:Java_(programming_language)]
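The two intuitions can be sketched in Python as follows. The slide does not give the discount formulas, so the logarithmic discounts below are an assumption chosen only to show the behaviour (larger SP, SC, or P means a smaller propagated weight); the function names, data layout, and all counts are hypothetical.

```python
import math


def category_discount(sp, sc):
    """Assumed discount: categories with many subpages/subcategories pass on less weight."""
    return 1.0 / (math.log(math.e + sp) * math.log(math.e + sc))


def property_discount(p):
    """Assumed discount: entities reached via very common properties pass on less weight."""
    return 1.0 / math.log(math.e + p)


def propagate(primary, categories, related):
    """Extend primary interests with 1-hop categories and related entities.

    categories: entity -> list of (category, SP, SC)
    related:    entity -> list of (related_entity, P)
    """
    profile = dict(primary)
    for entity, weight in primary.items():
        for category, sp, sc in categories.get(entity, []):
            profile[category] = profile.get(category, 0.0) + weight * category_discount(sp, sc)
        for other, p in related.get(entity, []):
            profile[other] = profile.get(other, 0.0) + weight * property_discount(p)
    return profile


android = "dbr:Android_(operating_system)"
profile = propagate(
    {android: 1.0},
    # SP/SC/P values below are made up for illustration.
    {android: [("dbc:Smartphones", 10, 2), ("dbc:Tablet_operating_systems", 1000, 50)]},
    {android: [("dbr:Java_(programming_language)", 500)]},
)
```

In this toy run the broader category receives less weight than the narrower one, which is the effect intuition 1 describes.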

Interest Propagation

• interest propagation using DBpedia (SEMANTiCS’16)
  • same as previous approach but with DBpedia refinement
    • extracting sub-graph of dbc:Main_topic_classifications
    • merging categories and entities with the same names

Example of merging:
  before: dbc:Apple_Inc. (0.25), dbr:Apple_Inc. (5), dbr:Steve_Jobs (2)
  after: Apple_Inc. (5.25), Steve_Jobs (2)
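The merging step in the before/after example can be sketched in a few lines of Python. This assumes that "same name" means the identifier left after stripping the `dbr:` or `dbc:` prefix and that merged weights are summed, which matches the example (0.25 + 5 = 5.25); the function name `merge_same_names` is hypothetical.

```python
def merge_same_names(profile):
    """Merge categories (dbc:) and entities (dbr:) that share a name by summing their weights."""
    merged = {}
    for uri, weight in profile.items():
        # Strip the namespace prefix so dbc:X and dbr:X collapse into X.
        name = uri.split(":", 1)[1] if uri.startswith(("dbr:", "dbc:")) else uri
        merged[name] = merged.get(name, 0.0) + weight
    return merged


before = {"dbc:Apple_Inc.": 0.25, "dbr:Apple_Inc.": 5, "dbr:Steve_Jobs": 2}
after = merge_same_names(before)
```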

Experiment Setup

• main goal
  • analyze & compare different user modeling strategies in the context of link (URL) recommendations
• link (URL) profile
  • same representation model as for users, built from the link’s content
• ground truth
  • links shared by users in their timeline in the last two weeks

[Figure: user models UM#1 and UM#2 and the candidate links (URLs) feed a recommendation algorithm (cosine similarity), which outputs top-N recommendations]
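The recommendation step can be sketched as follows, assuming user and link profiles are both sparse dictionaries mapping entities to weights. The function names `cosine` and `recommend` and the toy profiles are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse weight vectors stored as dicts."""
    dot = sum(weight * b.get(entity, 0.0) for entity, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(user_profile, link_profiles, n=10):
    """Return the top-n candidate links ranked by similarity to the user profile."""
    ranked = sorted(
        link_profiles,
        key=lambda url: cosine(user_profile, link_profiles[url]),
        reverse=True,
    )
    return ranked[:n]


user = {"Android": 1.0, "Java": 0.5}
links = {
    "url_a": {"Android": 1.0},
    "url_b": {"Cooking": 1.0},
    "url_c": {"Java": 1.0, "Android": 1.0},
}
top_links = recommend(user, links, n=2)
```

Using the same representation for users and links is what makes a single similarity function sufficient to compare different user modeling strategies.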

Experiment Setup

• Twitter dataset
  • 439 random users
  • 2,771 followees on average
  • considered up to 200 followees for each user due to the Twitter API limit for crawling list memberships
• dataset for experiment
  • 439 users
  • 74,488 followees in total, 170 followees on average
  • 15,053 candidate links for recommendations

Experiment Setup

• evaluation metrics
  • MRR (Mean Reciprocal Rank)
    • how early the first relevant item occurs, on average, in the recommendations
  • S@N (Success rate)
    • mean probability that a relevant item occurs in the top-N list
  • P@N (Precision)
    • mean probability that retrieved items in the top-N are relevant
  • R@N (Recall)
    • mean probability that relevant items are retrieved in the top-N
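The four metrics can be sketched per user with the standard definitions; the reported values would then be the mean of each metric over all users. The function names and the toy ranking are hypothetical.

```python
def reciprocal_rank(ranked, relevant):
    """1 / position of the first relevant item, or 0 if none is recommended."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def success_at(ranked, relevant, n):
    """1 if at least one relevant item appears in the top-n, else 0."""
    return 1.0 if any(item in relevant for item in ranked[:n]) else 0.0


def precision_at(ranked, relevant, n):
    """Fraction of the top-n recommendations that are relevant."""
    return sum(item in relevant for item in ranked[:n]) / n


def recall_at(ranked, relevant, n):
    """Fraction of the relevant items that appear in the top-n."""
    if not relevant:
        return 0.0
    return sum(item in relevant for item in ranked[:n]) / len(relevant)


def mean(values):
    """Average a per-user metric over all users (MRR is the mean of reciprocal ranks)."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


ranked = ["a", "b", "c", "d"]   # recommendations for one user
relevant = {"b", "d", "z"}      # links the user actually shared
rr = reciprocal_rank(ranked, relevant)
```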

Info. of List Memberships

•  over 90% of users have at least 1 list membership

•  173 list memberships on average

•  3,047 entities from list memberships vs. 23 from bios, considering up to 50 followees

Results

•  some significant improvement when # of followees is small

Results

[Figure: # of entities (axis from 0 to 50,000) by # of followees (50, 100, 150, 200), without refinement vs. with refinement]

9% compression of profile size, while maintaining a similar performance level

Results – combining two views

•  combining two views of followees

The final rank of an item is determined by the average rank position of each rank based on two user models (Ryen et al. SIGIR’09). The score is defined in terms of:
  • x: rank position based on the 1st user model
  • y: rank position based on the 2nd user model
  • β: importance control parameter

•  combining two views improves the performance significantly
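A rank-combination step of this kind can be sketched in Python. The score formula itself is not preserved on the slide, so the weighted average of rank positions used here, β·x + (1 − β)·y, is an assumption consistent with the "average rank position" description; the function name `combine_ranks` and the toy rankings are hypothetical, and which view counts as the 1st user model is not specified.

```python
def combine_ranks(ranking_1, ranking_2, beta):
    """Merge two rankings of the same candidates by a weighted average of rank positions.

    Assumed score: beta * x + (1 - beta) * y, where x and y are the 1-based
    positions of an item in the two rankings. A lower score means a better final rank.
    """
    pos_1 = {item: i for i, item in enumerate(ranking_1, start=1)}
    pos_2 = {item: i for i, item in enumerate(ranking_2, start=1)}
    shared = set(pos_1) & set(pos_2)
    score = {item: beta * pos_1[item] + (1 - beta) * pos_2[item] for item in shared}
    # Ties are broken by item name so the result is deterministic.
    return sorted(score, key=lambda item: (score[item], item))


by_model_1 = ["a", "b", "c"]
by_model_2 = ["c", "b", "a"]
final = combine_ranks(by_model_1, by_model_2, beta=0.1)
```

Under this assumed form, a small β lets the 2nd user model dominate the final order and a large β favours the 1st.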

Results – combining two views

•  combining two views of followees

[Figure: R@10 (y-axis, 0.0400 to 0.0800) against beta (x-axis, 0 to 1); series: 50, 100, 150, 200]

best performance when β = 0.1, similar results for MRR, P@10, S@10

•  list memberships play a more important role in the combination

Conclusions

•  leveraging list memberships of followees > exploiting biographies especially in the case of a user having a small number of followees

•  combining the two different views of followees can improve the quality of user modeling significantly,

•  and the list memberships of followees play a more important role in the combination


Thank you for your attention!

Guangyuan Piao
homepage: http://parklize.github.io
e-mail: [email protected]
twitter: https://twitter.com/parklize
slideshare: http://www.slideshare.net/parklize
