Multiple Location Profiling for Users and Relationships from Social Network and Content Rui Li, Shengjie Wang, Kevin Chen-Chuan Chang University of Illinois at Urbana-Champaign


Page 1

Multiple Location Profiling for Users and Relationships

from Social Network and Content

Rui Li, Shengjie Wang, Kevin Chen-Chuan Chang

University of Illinois at Urbana-Champaign

Page 2

Users’ locations are important for many information services.

Lives in: Los Angeles

Carol

User

Social Network

Content Provider

Local Content Recommendation

Local Friends Recommendation

and many others.

Page 3

The community has explored social network and content to profile users’ locations.

Profiling a User’s Home Location

Location: Los Angeles

Tweets

Terrible LA traffic!

Want to go to Honolulu for Spring vacation!

See Gaga in Hollywood.

Good Morning!

Social Network (user: location)

Mike: LA

Carol: ?

Lucy: Austin

Gaga: NY

Bob: San Diego

Jean: ?

Page 4

Problem 1: Existing methods profile only a single home location.

Locations of a user’s friends

Locational Word Frequencies

Paramount 1

Los Angeles 1

Hollywood 2

Austin 2

Tweeted Locational Words

Carol lives in Los Angeles and studied at Uni. of Texas at Austin

o incomplete

o inaccurate

Page 5

Problem 2: Existing methods miss profiling relationships entirely.

Relationships Profiling

Carol follows Bob: both Carol and Bob work in Los Angeles

Carol follows Lucy: both Carol and Lucy studied in Austin

Carol tweets Hollywood: Carol lives in Los Angeles

o useful!

Page 6

We focus on multiple location profiling for users and relationships.

Carol in the real world

Location: Los Angeles

Education: Uni. of Texas at Austin

Terrible LA traffic!

Want to go to Honolulu for Spring vacation!

See Gaga in Hollywood.

Good Morning!

Mike: LA

Carol: ?

Lucy: Austin

Gaga: NY

Bob: San Diego

Jean: ?

Carol’s Location Profile: Los Angeles, Austin

Carol follows Lucy: Austin, Austin

Page 7

Our approach is to build a model to connect known relationships with unknown locations.

Known Relationships

Following Relationships

Carol follows Lucy

Carol follows Mike

….

Tweeting Relationships

Carol tweets Hollywood

Carol tweets Honolulu

….

Users’ Locations

?

Unknown Locations

MLP Model

Generation Model

Inference Algorithm

Page 8

There are three challenges for building MLP.

Challenge 1: How to connect users’ locations with relationships?

A. from users’ locations to following relationships

B. from users’ locations to tweeting relationships

Challenge 2: How to model that the relationships are mixed?

A. Some relationships are not based on locations.

B. Each relationship is based on a different location.

Challenge 3: How to utilize home locations from labeled users?

Page 9

Challenge 1.A: We need to connect following relationships with two users’ locations.

Even a user with only one location follows others from different locations.

Following Probability

Carol at Los Angeles follows Bob in San Diego. 20%

Carol at Los Angeles follows Mike in Los Angeles. 30%

We define the following probability as the probability of generating a following relationship from one user to another based on their locations.

Page 10

Observation: We explore the following probability by investigating a corpus.

• It captures our intuition well.

• It fits a power law distribution.
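As a rough illustration of the power-law observation: a following probability of the form P(d) = beta * d^alpha is a straight line in log-log space, so alpha and beta can be estimated with ordinary least squares on the logarithms. The sketch below uses synthetic (distance, probability) pairs. The function name `fit_power_law` and all numbers are assumptions made for illustration, not values measured on the corpus.

```python
import math

def fit_power_law(distances, probabilities):
    """Least-squares fit of log p = log(beta) + alpha * log(d)."""
    xs = [math.log(d) for d in distances]
    ys = [math.log(p) for p in probabilities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx                        # slope in log-log space
    beta = math.exp(mean_y - alpha * mean_x)  # intercept, mapped back
    return alpha, beta

# Synthetic data generated from made-up parameters beta=0.01, alpha=-0.5.
distances = [1.0, 10.0, 100.0, 1000.0]
probabilities = [0.01 * d ** -0.5 for d in distances]
alpha, beta = fit_power_law(distances, probabilities)
```

On real data the points would be noisy, so the fit recovers only approximate parameters.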

Page 11

Solution: We derive a location-based following model for the following probability.

The location-based following model
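A minimal sketch of what such a model could look like, assuming the power-law form P(follow | d) = beta * d^alpha suggested by the observation on the previous page. The parameter values, the `min_distance` clamp, the approximate coordinates, and the function names are hypothetical choices for illustration, not the authors’ fitted model.

```python
import math

# Approximate city coordinates (latitude, longitude) in degrees.
CITIES = {
    "Los Angeles": (34.05, -118.24),
    "San Diego": (32.72, -117.16),
    "Austin": (30.27, -97.74),
}

def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 3958.8 * math.asin(math.sqrt(h))

def following_probability(loc_i, loc_j, alpha=-0.5, beta=0.01, min_distance=1.0):
    """Power-law following probability; alpha and beta are made-up values."""
    # Clamp the distance so that two users in the same city do not
    # produce a division by zero.
    d = max(haversine_miles(CITIES[loc_i], CITIES[loc_j]), min_distance)
    return beta * d ** alpha

p_same = following_probability("Los Angeles", "Los Angeles")
p_near = following_probability("Los Angeles", "San Diego")
p_far = following_probability("Los Angeles", "Austin")
```

With a negative exponent, the probability decreases with distance, which matches the intuition that users are more likely to follow others who are nearby.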

Page 12

Challenge 1.B: We need to connect tweeting relationships with a user’s location.

A user at a location tweets about different locations.

We define the tweeting probability as the probability of generating a tweeting relationship from a user to a venue based on a location.

Probability of Tweeting

Carol at Los Angeles tweets about watching a show in Hollywood. 30%

Carol at Los Angeles tweets about traffic in Los Angeles. 40%

Page 13

Observation: We explore the tweeting probability by investigating a corpus.

• They capture our intuition well.

• They can be modeled as a set of multinomial distributions.

Page 14

Solution: We derive a location-based tweeting model for the tweeting probability.

The location-based tweeting model
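One way to read "a set of multinomial distributions" is one distribution over venues per user location. The sketch below estimates such distributions from (user location, tweeted venue) pairs with additive smoothing. The toy observations, the `smoothing` parameter, and the function name are assumptions for illustration only.

```python
from collections import Counter, defaultdict

def estimate_tweeting_model(observations, venues, smoothing=1.0):
    """Estimate P(venue | user location) as one multinomial per location.

    observations: iterable of (user_location, tweeted_venue) pairs.
    """
    counts = defaultdict(Counter)
    for location, venue in observations:
        counts[location][venue] += 1
    model = {}
    for location, venue_counts in counts.items():
        # Additive smoothing keeps unseen venues at a nonzero probability.
        total = sum(venue_counts.values()) + smoothing * len(venues)
        model[location] = {
            v: (venue_counts[v] + smoothing) / total for v in venues
        }
    return model

# Toy data: users in Los Angeles mostly tweet about nearby venues.
venues = ["Los Angeles", "Hollywood", "Honolulu", "Austin"]
observations = (
    [("Los Angeles", "Los Angeles")] * 4
    + [("Los Angeles", "Hollywood")] * 3
    + [("Los Angeles", "Honolulu")]
    + [("Austin", "Austin")] * 5
    + [("Austin", "Hollywood")]
)
tweeting_model = estimate_tweeting_model(observations, venues)
```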


Page 15

Noisy relationships are not useful!

Noisy Relationships

Carol follows Lady Gaga

Carol tweets Honolulu

Location-based Relationships

Carol follows Lucy

Carol tweets Los Angeles

Challenge 2.A: There are both noisy and location-based relationships.

Page 16

Solution: We propose a mixture component for two types of relationships.

1. A relationship is generated based on either a location-based model or a random model.

2. A binary model selector μ indicates which model is used.

3. The selector is generated via a binomial distribution.
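The three points above can be sketched as a two-component mixture. Here `rho` is a hypothetical name for the probability that the selector picks the location-based model, and the example probabilities are invented for illustration. Summing out the selector gives the marginal probability of a relationship, and Bayes’ rule gives the posterior probability that a relationship is location-based.

```python
import random

def sample_selector(rho, rng):
    """Draw the binary selector: 1 = location-based model, 0 = random model."""
    return 1 if rng.random() < rho else 0

def relationship_probability(p_location_based, p_random, rho):
    """Marginal probability of a relationship after summing out the selector."""
    return rho * p_location_based + (1.0 - rho) * p_random

def posterior_location_based(p_location_based, p_random, rho):
    """P(selector = 1 | relationship observed), by Bayes' rule."""
    numerator = rho * p_location_based
    return numerator / (numerator + (1.0 - rho) * p_random)

rho = 0.7  # made-up mixing weight

# "Carol follows Lucy": well explained by location (made-up probabilities).
lucy = posterior_location_based(p_location_based=0.3, p_random=0.01, rho=rho)

# "Carol follows Lady Gaga": better explained by the random model.
gaga = posterior_location_based(p_location_based=0.0001, p_random=0.05, rho=rho)

selector = sample_selector(rho, random.Random(0))
```

Under these invented numbers, the posterior for the Lucy relationship is close to 1 and the posterior for the Lady Gaga relationship is close to 0, so the noisy relationship contributes little to location profiling.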

Page 17

Challenge 2.B: Location-based relationships are related to multiple locations.

Location-based relationships

Carol follows Lucy: both Carol and Lucy studied in Austin

Carol tweets Hollywood: Carol lives in Los Angeles

Accurate! Complete!

Page 18

Solution: We fundamentally model users’ multiple locations in generating relationships.

Carol

{Los Angeles 0.1, Austin 0.1, … }

Location profile as a multinomial distribution over locations.

Each relationship is based on one particular location from the user’s profile.
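A small sketch of this generative step, assuming a profile stored as a dictionary of probabilities. The profile values and the name `sample_location` are hypothetical; the point is only that each relationship draws its own location assignment from the user’s multinomial profile.

```python
import random

def sample_location(profile, rng):
    """Draw one location assignment from a multinomial location profile."""
    locations = list(profile)
    weights = [profile[loc] for loc in locations]
    return rng.choices(locations, weights=weights, k=1)[0]

# Made-up profile for Carol; the probabilities sum to 1.
carol_profile = {"Los Angeles": 0.6, "Austin": 0.3, "Other": 0.1}

rng = random.Random(0)
relationships = ["follows Lucy", "follows Mike", "tweets Hollywood"]
# Each relationship gets its own assignment, so different relationships
# of the same user can be based on different locations.
assignments = {rel: sample_location(carol_profile, rng) for rel in relationships}
```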

Page 19

Challenge 3: We should utilize observed locations from some users’ profiles.

Mike: LA

Carol: ?

Lucy: Austin

Gaga: NY

Bob: San Diego

Jean: ?

They are useful for profiling locations, but we cannot use them directly to generate relationships!

20% of users provide their home locations in their profiles.

Page 20

Solution: We use observed locations as priors to generate users’ profiles.

Bob

{San Diego 0.9, Los Angeles 0.05, …}

We assume users’ profiles are generated from prior distributions.

Users’ observed home locations are more likely to be generated.
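One common way to encode such a prior is a Dirichlet distribution whose pseudo-count is boosted for the observed home location. The slides do not spell out the prior’s form, so the Dirichlet choice, the `base` and `boost` values, and the function names below are assumptions. The expected profile is the Dirichlet posterior mean given the current location-assignment counts.

```python
def build_prior(locations, home_location=None, base=0.1, boost=5.0):
    """Dirichlet pseudo-counts: a small base for every candidate location,
    plus a boost for the observed home location if the user is labeled."""
    prior = {loc: base for loc in locations}
    if home_location is not None:
        prior[home_location] += boost
    return prior

def expected_profile(prior, assignment_counts=None):
    """Posterior mean of the profile given location-assignment counts."""
    counts = assignment_counts or {}
    total = sum(prior.values()) + sum(counts.values())
    return {loc: (prior[loc] + counts.get(loc, 0)) / total for loc in prior}

locations = ["San Diego", "Los Angeles", "Austin", "New York"]

# Bob is labeled, so his prior favors San Diego but leaves room for others.
bob_profile = expected_profile(build_prior(locations, "San Diego"))

# Carol is unlabeled, so her prior is uniform until relationships are observed.
carol_profile = expected_profile(build_prior(locations))
```

Using the observed location as a soft prior rather than a hard label is what lets a labeled user still receive additional locations from their relationships.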

Page 21

Therefore, we arrive at a complete model.

Page 22

We evaluate our model on a large Twitter corpus.

We crawled a subset of Twitter. There are 139K users, 50 million tweets, and 2 million following relationships.

Page 23

Task 1: In profiling users’ home locations, MLP performs accurately and improves on the baselines.

Page 24

Task 2: In profiling users’ multiple locations, MLP performs accurately and completely.

Precision and Recall at Rank 2

Case Studies

Locations in a similar region

Locations in different areas

Accurately

Completely

Page 25

Task 3: In profiling following relationships, MLP achieves 57% accuracy.

Page 26

Thanks and Questions!

Page 27

Backup for Questions

Page 28

Experiments 1

• We use the home location provided in users’ profiles as ground truth.

• We compare against two baseline methods proposed in the literature.

Page 29

Experiments 2

• We manually labeled the locations of 1000 users and obtained 585 users who clearly have multiple locations.

• We compare against the same baseline methods as in the previous task.

• We measure the performance in terms of “precision” and “recall”.
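For concreteness, precision and recall at rank k for a single user could be computed as below. This uses exact string matching between predicted and labeled locations, which is an assumption; the evaluation in the talk may use a different matching rule (for example, a distance tolerance). The example ranking is invented.

```python
def precision_recall_at_k(ranked_locations, true_locations, k=2):
    """Precision and recall of the top-k predicted locations for one user."""
    top_k = ranked_locations[:k]
    hits = sum(1 for loc in top_k if loc in true_locations)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(true_locations) if true_locations else 0.0
    return precision, recall

# Hypothetical ranking for Carol, whose labeled locations are
# Los Angeles and Austin.
predicted = ["Los Angeles", "San Diego", "Austin"]
truth = {"Los Angeles", "Austin"}
precision, recall = precision_recall_at_k(predicted, truth, k=2)
```

At rank 2, only Los Angeles is a hit, so precision and recall are both 0.5; a method that returns a single home location can never reach full recall on users with two locations.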

Page 30

Experiments 3

• We manually labeled the location assignments for the relationships of the 585 users whose multiple locations are known to us, and obtained 4426 relationships.

• We design a meaningful baseline method, which profiles a relationship based on the users’ home locations.

Page 31

MLP defines the joint probability of observations, parameters, and latent variables.

We infer users’ locations and location assignments with the observed relationships and the given parameters.

We develop our algorithm based on the Gibbs sampling method.

We infer users’ locations and location assignments for relationships as latent variables in the joint probability.
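To make the inference idea concrete, here is a heavily simplified collapsed Gibbs sampler. It is not the full MLP algorithm: it handles a single user, uses only tweeting relationships, omits following relationships and the noisy-relationship mixture, and assumes the tweeting model `psi` and the Dirichlet prior are given. All names and numbers are hypothetical. Each sweep resamples the location assignment of every tweet with probability proportional to (assignment count excluding this tweet + prior) times P(venue | location).

```python
import random

def gibbs_profile(venues_tweeted, psi, prior, iterations=200, burn_in=50, seed=0):
    """Estimate one user's location profile from tweeted venues."""
    rng = random.Random(seed)
    locations = list(prior)
    # Random initial assignment of each tweet to a candidate location.
    z = [rng.choice(locations) for _ in venues_tweeted]
    counts = {loc: 0 for loc in locations}
    for loc in z:
        counts[loc] += 1
    norm = len(venues_tweeted) + sum(prior.values())
    totals = {loc: 0.0 for loc in locations}
    kept = 0
    for it in range(iterations):
        for i, venue in enumerate(venues_tweeted):
            counts[z[i]] -= 1  # exclude the current assignment
            weights = [(counts[loc] + prior[loc]) * psi[loc][venue]
                       for loc in locations]
            z[i] = rng.choices(locations, weights=weights, k=1)[0]
            counts[z[i]] += 1
        if it >= burn_in:
            # Accumulate the profile estimate implied by this sample.
            kept += 1
            for loc in locations:
                totals[loc] += (counts[loc] + prior[loc]) / norm
    return {loc: totals[loc] / kept for loc in locations}

# Made-up tweeting model P(venue | location) and uniform prior.
psi = {
    "Los Angeles": {"Hollywood": 0.5, "Los Angeles": 0.4, "Austin": 0.05, "Honolulu": 0.05},
    "Austin": {"Hollywood": 0.05, "Los Angeles": 0.05, "Austin": 0.85, "Honolulu": 0.05},
    "Honolulu": {"Hollywood": 0.05, "Los Angeles": 0.05, "Austin": 0.05, "Honolulu": 0.85},
}
prior = {"Los Angeles": 0.1, "Austin": 0.1, "Honolulu": 0.1}
tweets = ["Los Angeles", "Hollywood", "Hollywood", "Los Angeles",
          "Austin", "Austin", "Austin", "Hollywood"]
profile = gibbs_profile(tweets, psi, prior)
```

In this toy run the estimated profile places most of its mass on Los Angeles, a smaller share on Austin, and very little on Honolulu, which is the kind of multiple-location profile the talk aims for.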