Link Prediction
Topics in Data Mining, Fall 2015
Bruno Ribeiro
Overview
Deterministic Heuristic Methods
Matrix / Probabilistic Methods
Supervised Learning Approaches
Goal
Input: Snapshot of a social network at time t
Output: Predict edges that will be added to the network at time t’ > t
Product Recommendation
Example of a link prediction application
[Figure: user–product matrix with users i and products j; entries record purchases]
Twitter’s Who to Follow
Another example of a link prediction application
Other Applications
A few more examples:
Fraud detection (Beutel, Akoglu, Faloutsos, KDD’15) ◦ Focuses on discovering surprising link patterns
Anomaly detection (Rattigan et al., 2005) ◦ Focuses on finding unlikely existing links
Deterministic Heuristics
A Very Naïve Approach
Every entity is assigned a score
score(v) = how many friends v already has
General Approach
Assign a connection weight score(u, v) to each non-existing edge (u, v)
Rank edges and recommend the ones with the highest scores
Can be viewed as computing a measure of proximity or “similarity” between nodes u and v
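As an illustrative sketch of this recipe (the dict-of-sets graph representation and function names are my own, not from the slides), using common neighbors as the plug-in score:

```python
# Rank all non-existing edges by a pluggable score function and
# recommend the highest-scoring pairs.
from itertools import combinations

def rank_candidate_edges(adj, score):
    """Return non-existing edges (u, v) sorted by score, highest first."""
    candidates = [(u, v) for u, v in combinations(sorted(adj), 2)
                  if v not in adj[u]]
    return sorted(candidates, key=lambda e: score(adj, *e), reverse=True)

def common_neighbors(adj, u, v):
    # Number of neighbors shared by u and v
    return len(adj[u] & adj[v])

# Toy undirected graph as a dict of neighbor sets
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}
ranked = rank_candidate_edges(adj, common_neighbors)
print(ranked[0])  # ('a', 'd') — they share two neighbors, b and c
```

Any of the scores on the following slides can be swapped in for `common_neighbors`.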
Shortest Path
score(u, v) = negated length of the shortest path between u and v. All node pairs that share a neighbor get the same (top) score and are predicted as links
Problem: network diameter is often very small
Common Neighbors
score(u, v) = |N(u) ∩ N(v)|, the number of neighbors that u and v share
Problem: the score can be high simply because the vertices have many neighbors
Jaccard Similarity
Fixes the “too many neighbors” problem of common neighbors by dividing the intersection by the union: score(u, v) = |N(u) ∩ N(v)| / |N(u) ∪ N(v)|
Always between 0 and 1
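A minimal sketch on a dict-of-sets adjacency representation (graph and names illustrative):

```python
def jaccard(adj, u, v):
    """|N(u) ∩ N(v)| / |N(u) ∪ N(v)| — always in [0, 1]."""
    union = adj[u] | adj[v]
    if not union:
        return 0.0
    return len(adj[u] & adj[v]) / len(union)

adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
print(jaccard(adj, "a", "b"))  # share c out of {a, b, c} -> 1/3
```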
Adamic / Adar
This score gives more weight to common neighbors that are not shared with many others: score(u, v) = Σ_{z ∈ N(u) ∩ N(v)} 1 / log |N(z)|
[Figure: Alice and Bob share a common neighbor with 50 neighbors (too active: less evidence of a missing link between Alice and Bob); Bob and Carol share one with 5 neighbors (less active: more evidence of a missing link between Bob and Carol)]
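A sketch mirroring the Alice/Bob/Carol example (the graph is a scaled-down stand-in; node names and degrees are illustrative):

```python
import math

def adamic_adar(adj, u, v):
    """Sum of 1/log|N(z)| over common neighbors z; rare neighbors weigh more."""
    return sum(1.0 / math.log(len(adj[z])) for z in adj[u] & adj[v])

# "hub" is a very active shared contact (degree 5), "mid" a less active
# one (degree 3), so bob–carol should outscore alice–bob.
adj = {
    "alice": {"hub"}, "bob": {"hub", "mid"}, "carol": {"mid"},
    "hub": {"alice", "bob", "x1", "x2", "x3"},
    "mid": {"bob", "carol", "x1"},
    "x1": {"hub", "mid"}, "x2": {"hub"}, "x3": {"hub"},
}
print(adamic_adar(adj, "bob", "carol") > adamic_adar(adj, "alice", "bob"))  # True
```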
Preferential Attachment
The probability of co-authorship of u and v is proportional to the product of their numbers of collaborators: score(u, v) = |N(u)| · |N(v)|
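The score is just a degree product; a one-line sketch (graph illustrative):

```python
def preferential_attachment(adj, u, v):
    """score(u, v) = |N(u)| * |N(v)|"""
    return len(adj[u]) * len(adj[v])

adj = {"a": {"b", "c"}, "b": {"a"}, "c": {"a", "d"}, "d": {"c"}}
print(preferential_attachment(adj, "b", "c"))  # 1 * 2 = 2
```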
Katz (1953)
Exponentially weighted sum of the number of paths of each length l: score(u, v) = Σ_{l=1}^{∞} α^l · |paths⟨l⟩(u, v)|
For small α << 1: predictions ≈ common neighbors
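A sketch of the Katz score computed by truncating the infinite series at a maximum walk length (pure Python; `max_len` and the matrix representation are my own choices):

```python
def katz_scores(A, alpha=0.05, max_len=6):
    """S[i][j] = sum over l=1..max_len of alpha^l * (#walks of length l i->j)."""
    n = len(A)

    def matmul(X, Y):
        return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]

    S = [[0.0] * n for _ in range(n)]
    P = [row[:] for row in A]   # A^1
    w = alpha
    for _ in range(max_len):
        for i in range(n):
            for j in range(n):
                S[i][j] += w * P[i][j]
        P = matmul(P, A)        # next power of A counts longer walks
        w *= alpha
    return S

# Path graph 0-1-2: one walk of length 2 between nodes 0 and 2
A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
S = katz_scores(A, alpha=0.1)
```

With small α the l = 2 term (common neighbors) dominates for non-adjacent pairs, matching the slide's remark.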
Hitting Time
score(u, v) = −H(u, v), where H(u, v) is the expected random-walk hitting time from u to v
Personalized PageRank
Stationary probability that the walker is at v under the following random walk:
◦ With probability α, jump back to u
◦ With probability 1 − α, go to a random neighbor of the current node
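The random walk above can be sketched by power iteration (a fixed iteration count and the graph are illustrative choices):

```python
def personalized_pagerank(adj, u, alpha=0.15, iters=100):
    """Stationary distribution of the walk that teleports to root u w.p. alpha."""
    nodes = sorted(adj)
    p = {v: 0.0 for v in nodes}
    p[u] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in nodes}
        nxt[u] = alpha                       # teleport mass back to the root
        for v in nodes:
            share = (1 - alpha) * p[v] / len(adj[v])
            for w in adj[v]:                 # spread rest uniformly to neighbors
                nxt[w] += share
        p = nxt
    return p

adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
pr = personalized_pagerank(adj, "a")
```

`pr[v]` is the score of candidate edge (a, v); nodes near the root accumulate more probability.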
SimRank
Defined only for directed graphs; N(u) and N(v) are the sets of in-neighbors of u and v
score(u, v) = C / (|N(u)| · |N(v)|) · Σ_{a ∈ N(u)} Σ_{b ∈ N(v)} score(a, b)
score(u, v) ∊ [0, 1]; if u = v then score(u, v) = 1
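An iterative sketch of the SimRank fixed point (tiny directed graph, decay constant C = 0.8 and iteration count are illustrative assumptions):

```python
def simrank(in_nbrs, C=0.8, iters=10):
    """in_nbrs maps node -> set of in-neighbors; returns s[(a, b)] scores."""
    nodes = sorted(in_nbrs)
    s = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}
    for _ in range(iters):
        nxt = {}
        for a in nodes:
            for b in nodes:
                if a == b:
                    nxt[(a, b)] = 1.0
                elif in_nbrs[a] and in_nbrs[b]:
                    total = sum(s[(x, y)]
                                for x in in_nbrs[a] for y in in_nbrs[b])
                    nxt[(a, b)] = C * total / (len(in_nbrs[a]) * len(in_nbrs[b]))
                else:
                    nxt[(a, b)] = 0.0       # no in-neighbors -> no evidence
        s = nxt
    return s

# u and v are both pointed to by w, so they end up similar
in_nbrs = {"u": {"w"}, "v": {"w"}, "w": set()}
s = simrank(in_nbrs)
print(s[("u", "v")])  # C * s(w, w) = 0.8
```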
Latent Space Models
Low Rank Reconstruction
Represent the adjacency matrix A with a lower-rank matrix A_k:
A ≈ V_{n×k} U_{k×n} = A_k
If A_k(u, v) has a large value for a missing edge A(u, v) = 0, then recommend link (u, v)
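A sketch of the rank-k reconstruction via a truncated SVD (assumes numpy; the toy graph is illustrative, and on small symmetric examples the reconstructed value for a missing edge can be small, so the test only checks the approximation properties):

```python
import numpy as np

def low_rank(A, k):
    """Best rank-k approximation A_k of A (Eckart-Young, via truncated SVD)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Triangle {0,1,2} with a tail 2-3-4
A = np.array([
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
], dtype=float)
Ak = low_rank(A, k=2)
# Zero entries of A with large values in Ak are the recommended links
```

The reconstruction error can only shrink as k grows, and k = n recovers A exactly.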
Product Recommendations (non-negative matrix factorization)
Users have a propensity rate to buy a certain category of product (W)
Within a category, some products have a propensity rate to be bought (H)
[Figure: user–product matrix with users i and products j, factored into W and H]
Example Euclidean Generative Model
The problem of link prediction is to infer distances in the latent space, i.e., find the nearest neighbor in latent space that is not currently linked
Raftery, A. E., Handcock, M. S., & Hoff, P. D. (2002). Latent space approaches to social network analysis. J. Amer. Stat. Assoc., 15, 460.
[Figure: vertices embedded in the unit plane]
• Vertices uniformly distributed in a latent unit hyperplane
• Vertices close in latent space are more likely to be connected
• Probability of an edge = f(distance in latent space)
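A sketch of this generative model with a logistic f (the parameter values, the unit square, and all names are illustrative assumptions, not from the paper):

```python
import math
import random

def edge_probability(xu, xv, alpha=10.0, r=0.3):
    """Logistic link probability: 1/2 at distance r, decaying with steepness alpha."""
    d = math.dist(xu, xv)
    return 1.0 / (1.0 + math.exp(alpha * (d - r)))

def sample_graph(n, alpha=10.0, r=0.3, seed=0):
    """Uniform latent positions in the unit square; independent logistic edges."""
    rng = random.Random(seed)
    pos = [(rng.random(), rng.random()) for _ in range(n)]
    edges = {(u, v) for u in range(n) for v in range(u + 1, n)
             if rng.random() < edge_probability(pos[u], pos[v], alpha, r)}
    return pos, edges

pos, edges = sample_graph(20)
```

Prediction then amounts to recommending the unlinked pair with the smallest latent distance (highest edge probability).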
Bayesian Networks
Directed Graphical Models: Bayesian networks and Prob. Relational Models (Getoor et al., 2001; Neville & Jensen, 2003)
◦ Capture the dependence of link existence on attributes of entities
◦ Constrain the probabilistic dependencies to a directed acyclic graph
Undirected Graphical Models: Markov Networks
Whom To Follow and Why
Example of today’s approaches
Credit: N. Barbieri, F. Bonchi, G. Manco, KDD’14
Relational Markov Networks
A Relational Markov Network (RMN) specifies cliques and the potentials between attributes of related entities at the template level
A single model provides a distribution for the entire graph
Train the model to maximize the probability of the observed edges and labels
Use the trained model to predict edges and unknown attributes
Latent Space Rationale
Generative model:
[Figure: model parameters θ (the latent space) generate the observed edges; the link prediction heuristic asks which node u is the most likely missing neighbor of node v]
Relationship between Deterministic & Probabilistic Methods?
Yes (Sarkar, Chakrabarti, Moore, COLT’10)
[Figure: empirically, link prediction accuracy* increases from Random to Shortest Path to Common Neighbors to Adamic/Adar to an ensemble of short paths]
*Liben-Nowell & Kleinberg, 2003; Brand, 2005; Sarkar & Moore, 2007
How do we justify these observations? Especially if the graph is sparse
Credit: Sarkar, Chakrabarti, Moore
Raftery’s Model (cont)
Two sources of randomness:
• Point positions: uniform in D-dimensional space
• Linkage probability: logistic with parameters α, r
• α, r, and D are known
[Figure: logistic link probability as a function of distance; probability ½ at radius r; α determines the steepness]
Credit: Sarkar, Chakrabarti, Moore
Raftery’s Model: Probability of New Edge
Logistic function integrated over the volume determined by d_ij
[Figure: nodes i and j at distance d_ij]
Credit: Sarkar, Chakrabarti, Moore
Connection to Common Neighbors
[Figure: nodes i and j with a common neighbor k inside radius r; the bound on the number of common neighbors involves the network size N, with an error term that decreases with N]
Empirical Bernstein bounds (Maurer & Pontil, 2009)
A distinct radius per node gives Adamic/Adar
Supervised Learning Approaches
Create a binary classifier that predicts whether an edge exists between two nodes
Types of Features
Label features: keywords, frequency of an item
Topological features: shortest distance, degree, centrality, PageRank index
Classifiers: SVM, decision trees, Deep Belief Networks (DBN), k-nearest neighbors (KNN), naive Bayes, radial basis functions (RBF), logistic regression
[Figure: features feed a classifier, which produces the link prediction heuristic]
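A sketch of turning node pairs into feature rows for such a classifier, showing only the topological features from the slide (degree, common neighbors, shortest distance); the graph and helper names are illustrative, and any off-the-shelf classifier could consume these rows:

```python
def shortest_distance(adj, u, v):
    """Breadth-first-search distance from u to v; None if unreachable."""
    seen, frontier, d = {u}, [u], 0
    while frontier:
        if v in frontier:
            return d
        nxt = []
        for x in frontier:
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    nxt.append(y)
        frontier, d = nxt, d + 1
    return None

def pair_features(adj, u, v):
    """One feature row per candidate edge (u, v)."""
    return {
        "deg_u": len(adj[u]),
        "deg_v": len(adj[v]),
        "common_neighbors": len(adj[u] & adj[v]),
        "distance": shortest_distance(adj, u, v),
    }

adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
feats = pair_features(adj, "a", "d")
print(feats)  # {'deg_u': 2, 'deg_v': 1, 'common_neighbors': 1, 'distance': 2}
```

Rows for existing edges (label 1) and sampled non-edges (label 0) then form the training set.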