Modeling Relationship Strength in Online Social Networks
Rongjing Xiang: Purdue University
Jennifer Neville: Purdue University
Monica Rogati: LinkedIn
WWW 2010
Presenter: Chenghui REN
Supervisors: Dr Ben Kao, Prof David Cheung
Why do we care about Relationship Strength?
• Various aspects of online social networks (OSNs) are based on relationship strength:
  – Link prediction
    • Suggesting new people with top relationship strength to users
  – Item recommendation
    • Items may be groups to join, articles to read…
  – Newsfeeds
    • Real-time updates about status change, activities, new posts…
  – People search
  – …
What has been done on Relationship Strength?
• Previous work analyzing OSNs has focused on binary friendship relations
  – E.g., friends or not
• Low cost of link formation leads to networks with different relationship strengths
  – E.g., close friends and acquaintances
• Treating all relationships as equal will increase the level of noise in a learned model and degrade performance.
Problem
• Typically, an OSN contains:
  – Profiles
  – Interaction activities
Goal: propose a method that infers a continuous-valued relationship strength for each link, based on the factors above
Roadmap
• Motivation
• Latent Variable Model
• Experimental Evaluation
• Conclusions
Latent Variable Model: Introduction
• Homophily is common in OSNs
  – People tend to form ties with other people who have similar characteristics
  – The stronger the tie, the higher the similarity
• Relationship strength is modeled as a hidden effect of nodal profile similarities
  – E.g., the schools the users attended and the companies they worked at
  – E.g., the online groups they joined
  – E.g., the geographic locations that they belong to
• Relationship strength is modeled as a hidden cause of user interactions
  – E.g., profile viewing activities
  – E.g., picture tagging
Model: Introduction (Cont’d)
[Diagram: profile attributes (visible) have an effect on relationship strength (invisible), which is a cause of user interactions (visible)]
Model: Introduction (Cont’d)
Goal: Estimate the latent relationship strength z to maximize the overall observed data likelihood
Figure 1: Graphical model representation of the general relationship strength model
Model Specification
First model this part: profile attributes (visible) affect relationship strength (invisible)
Model Specification (Cont’d)
Using a Gaussian distribution to model the conditional probability of z given profile similarities:
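The equation on this slide is not reproduced here. As a minimal sketch of what a Gaussian conditional for z could look like, the Python below assumes that the mean is a linear combination of the profile similarity measures with weights w, and that the variance v is a fixed constant. The function name, the vector `s`, and the variance `v` are illustrative assumptions, not taken from the slides.

```python
import math

def log_p_z_given_similarity(z, s, w, v):
    """Log density of z under a Gaussian with mean w . s and variance v.

    s: profile similarity values for one pair of users (assumed input)
    w: weights on the similarity measures (parameter named on the slides)
    v: variance of the Gaussian (assumed to be a fixed constant)
    """
    mean = sum(wk * sk for wk, sk in zip(w, s))
    return -0.5 * math.log(2 * math.pi * v) - (z - mean) ** 2 / (2 * v)

# Example: two similarity measures for one pair of users.
s = [0.5, 0.25]
w = [1.0, 2.0]
print(log_p_z_given_similarity(1.0, s, w, v=0.5))
```

Under this assumed form, pairs with more similar profiles get a higher expected relationship strength whenever the weights are positive.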
Model Specification (Cont’d)
Then model this part: relationship strength (invisible) is a cause of user interactions (visible)
Model Specification (Cont’d)
Using a logistic function to model the conditional probability of y given u:
Figure 2: Graphical model representation of the specific instantiation
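The logistic form mentioned on this slide can be sketched as follows. The sketch assumes that u is the vector formed by the auxiliary variables followed by the relationship strength z, and that a bias term b is added to the linear score. The function name, the argument `a`, and the bias `b` are illustrative assumptions.

```python
import math

def p_interaction(z, a, theta, b=0.0):
    """Probability that an interaction y is observed, given u = [a..., z].

    z: relationship strength of the pair
    a: auxiliary variables for this interaction type (assumed input)
    theta: one weight per entry of u (parameter named on the slides)
    b: bias term (an assumption of this sketch)
    """
    u = list(a) + [z]
    score = sum(t * uk for t, uk in zip(theta, u)) + b
    return 1.0 / (1.0 + math.exp(-score))

# A stronger relationship raises the interaction probability
# when the weight on z is positive.
print(p_interaction(0.0, [0.0], [1.0, 1.0]))
print(p_interaction(2.0, [0.0], [1.0, 1.0]))
```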
Model Specification (Cont’d)
To avoid over-fitting, L2 regularizers are put on the parameters w and θ, which can be regarded as Gaussian priors:
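The link between an L2 regularizer and a Gaussian prior can be checked numerically. The sketch below assumes a zero-mean Gaussian prior with variance 1/lam on each parameter; the name `lam` for the regularization strength is an assumption. Up to a constant that does not depend on the parameters, the log prior equals the negative L2 penalty.

```python
import math

def l2_penalty(params, lam):
    """L2 regularization term: (lam / 2) * sum of squared parameters."""
    return 0.5 * lam * sum(p * p for p in params)

def gaussian_log_prior(params, lam):
    """Log density of independent zero-mean Gaussians with variance 1 / lam."""
    n = len(params)
    return 0.5 * n * math.log(lam / (2 * math.pi)) - l2_penalty(params, lam)

# The difference in log prior between two parameter vectors equals
# the difference in their L2 penalties, with the sign flipped.
w1, w2, lam = [0.5, -1.0], [2.0, 0.1], 0.3
lhs = gaussian_log_prior(w1, lam) - gaussian_log_prior(w2, lam)
rhs = l2_penalty(w2, lam) - l2_penalty(w1, lam)
print(abs(lhs - rhs) < 1e-12)
```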
Model Inference
Two ways to estimate a latent variable model
[Slide annotations: one approach is marked "Accepted!", the other "Future work"]
Model Inference (Cont’d)
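As a rough illustration of what a point estimate of z could look like, the sketch below runs gradient ascent on the log-likelihood of a single pair of users with all parameters held fixed. It assumes a Gaussian term with a given mean and variance v, one logistic term per interaction type, and no auxiliary variables. The update rule, step size, and all names are assumptions of this sketch and are not presented as the authors' procedure.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def estimate_z(prior_mean, v, ys, thetas, biases, steps=500, lr=0.05):
    """Gradient ascent on log N(z; prior_mean, v) + sum_t log P(y_t | z).

    ys: observed interactions (0 or 1), one per interaction type
    thetas, biases: fixed logistic weight and bias for each interaction type
    """
    z = prior_mean
    for _ in range(steps):
        # Gradient of the Gaussian log density with respect to z.
        grad = -(z - prior_mean) / v
        # Gradient of each logistic log-likelihood term with respect to z.
        for y, theta, b in zip(ys, thetas, biases):
            grad += (y - sigmoid(theta * z + b)) * theta
        z += lr * grad
    return z

# Observed interactions pull the estimate above the profile-based mean;
# absent interactions pull it below.
print(estimate_z(0.0, 0.5, [1, 1], [1.0, 1.0], [0.0, 0.0]))
print(estimate_z(0.0, 0.5, [0, 0], [1.0, 1.0], [0.0, 0.0]))
```

The objective is concave in z under these assumptions, so a small fixed step size is enough for this toy case.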
Roadmap
• Motivation
• Latent Variable Model
• Experimental Evaluation
• Conclusions
Experimental Evaluation
Dataset: Purdue Facebook data
#nodes: 4500
#links: 144,712
Three profile similarity measures:
Two types of user interactions:
Auxiliary variables:
  – #people whose wall i has posted on
  – #people i has tagged in pictures
Experimental Evaluation (Cont’d)
• Use the proposed latent variable model to estimate the relationship strengths for the 144,712 pairs of users
How to evaluate the estimated weighted graph?
Apply the estimated weighted graph in a number of collective classification tasks:
  – Gender: Male?
  – Relationship status: Single?
  – Political views: Conservative?
  – Religious views: Christian?
Classification Algorithm
• Gaussian Random Field Model
  – Autocorrelation is present in the graph
  – Information is propagated from the labeled portion of the graph to infer the values for unlabeled nodes
• Vary the proportion of labeled nodes in the graph from 30% to 90%
• Measure the resulting classification rankings using area under the ROC curve
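The propagation idea above can be sketched with a simple iterative scheme in which every unlabeled node repeatedly takes the weighted average of its neighbors' scores. This is a common way to compute Gaussian random field predictions, but the slides do not say which solver was used, so the procedure, the function name, and the toy graph are assumptions.

```python
def propagate_labels(weights, labels, iterations=200):
    """Propagate scores from labeled to unlabeled nodes on a weighted graph.

    weights: dict mapping an undirected edge (i, j) to its weight
    labels: dict mapping each labeled node to 0.0 or 1.0
    Returns a dict with a score in [0, 1] for every node.
    """
    neighbors = {}
    for (i, j), w in weights.items():
        neighbors.setdefault(i, []).append((j, w))
        neighbors.setdefault(j, []).append((i, w))
    # Labeled nodes keep their label; unlabeled nodes start at 0.5.
    scores = {node: labels.get(node, 0.5) for node in neighbors}
    for _ in range(iterations):
        for node in neighbors:
            if node in labels:
                continue
            total = sum(w for _, w in neighbors[node])
            if total > 0:
                scores[node] = sum(w * scores[nb] for nb, w in neighbors[node]) / total
    return scores

# Toy graph: b is tied strongly to a (label 1) and weakly to c (label 0).
weighted = propagate_labels({("a", "b"): 3.0, ("b", "c"): 1.0}, {"a": 1.0, "c": 0.0})
binary = propagate_labels({("a", "b"): 1.0, ("b", "c"): 1.0}, {"a": 1.0, "c": 0.0})
print(weighted["b"], binary["b"])
```

In the toy graph, the weighted version leans toward the strongly tied neighbor, while the binary version cannot separate the two labels. This is the kind of difference the evaluation is meant to measure.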
ROC curve
x-axis: False positive rate
y-axis: True positive rate
The larger the area under the ROC curve, the higher the overall accuracy of the classification
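The area under the ROC curve equals the probability that a randomly chosen positive example is ranked above a randomly chosen negative one, with ties counted as one half. The sketch below computes it directly from that definition; the function name and the toy data are illustrative.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    scores: predicted score per example (higher means more likely positive)
    labels: true label per example, 1 for positive and 0 for negative
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Three of the four positive-negative pairs are ranked correctly.
print(roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))  # 0.75
```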
Comparisons to Six Graphs
• Four observed graphs
  – Friendship graph
  – Top-friend graph
  – Wall graph
  – Picture graph
• Two additional graphs
  – Profile-similarity graph, which weights each link by
  – Interaction-count graph, which sums the links in the wall
Results
Collective classification performance on various Facebook graphs
Curves for the wall graph and the picture graph lie well below the other curves and are therefore omitted
Roadmap
• Motivation
• Latent Variable Model
• Experimental Evaluation
• Conclusions
Conclusions
• A latent variable model was proposed to estimate relationship strength in OSNs
• The weighted graph formed by the estimated relationship strengths gives rise to higher autocorrelation and better classification
• The model can facilitate many graph learning and social network mining tasks
Q&A
Thanks!