
Label propagation on graphs

Leonid E. Zhukov

School of Data Analysis and Artificial Intelligence, Department of Computer Science

National Research University Higher School of Economics

Structural Analysis and Visualization of Networks

Leonid E. Zhukov (HSE) Lecture 17 19.05.2015 1 / 26

Lecture outline

1 Label propagation problem

2 Collective classification: iterative classification

3 Semi-supervised learning: random walk based methods, graph regularization


Label propagation

Label propagation - labeling of all nodes in a graph structure

Subset of nodes is labeled: categorical/numeric/binary values

Extend labeling to all nodes on the graph

Classification in networked data, network classification, structured inference


Label propagation problem

Structure can help only if labels/values of linked nodes are correlated

Social networks show assortative mixing - a bias in favor of connections between network nodes with similar characteristics:
– homophily: similar characteristics → connections
– influence: connections → similar characteristics

Can apply to constructed (induced) similarity networks


Network classification

Supervised learning approach

Given graph nodes V = V_l ∪ V_u:
– nodes V_l have given labels Y_l
– nodes V_u do not have labels

Need to find Y_u

Labels can be binary, multi-class, or real-valued

Features (attributes) φ_i can be computed for every node:
– local node features (if available)
– link features (labels from neighbors, attributes from neighbors, node degrees, connectivity patterns)

Feature (design) matrix Φ = (Φ_l, Φ_u)


Network learning components

Local classifier. A locally learned model that predicts a node's label from its own attributes. Uses no network information.

Relational classifier. Takes into account the labels and attributes of a node's neighbors. Uses neighborhood network information.

Collective classifier. Estimates the unknown values jointly by applying the relational classifier iteratively. Strongly depends on the network structure.


Collective classification

Algorithm: Iterative classification method

Input: graph G(V, E), labels Y_l
Output: labels Y

Compute Φ^(0)
Train classifier on (Φ_l^(0), Y_l)
Predict Y_u^(0)
repeat
    Compute Φ_u^(t)
    Train classifier on (Φ^(t), Y^(t))
    Predict Y_u^(t+1) from Φ_u^(t)
until Y_u^(t) converges
Y ← Y^(t)
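As a concrete sketch of this loop (a toy illustration, not a reference implementation: the graph, the neighbor-label-count features, and the nearest-centroid "classifier" standing in for any local model are all invented here):

```python
import numpy as np

def iterative_classification(A, y, labeled, n_classes, max_iter=20):
    """Iterative classification sketch: recompute link features from the
    current labels, retrain, re-predict, until labels stop changing.
    Features: per-class neighbor label counts (a simple link feature).
    Classifier: nearest class centroid (a stand-in for any local model)."""
    y = y.copy()
    n = A.shape[0]
    unlabeled = ~labeled

    def link_features(y):
        onehot = np.zeros((n, n_classes))
        onehot[np.arange(n), y] = 1.0
        return A @ onehot  # neighbor label counts per class

    for _ in range(max_iter):
        phi = link_features(y)
        # "train": centroid of each class over the labeled nodes
        centroids = np.array([phi[labeled & (y == c)].mean(axis=0)
                              for c in range(n_classes)])
        # "predict": assign each unlabeled node to the nearest centroid
        phi_u = phi[unlabeled]
        d = ((phi_u[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new = d.argmin(axis=1)
        if np.array_equal(new, y[unlabeled]):
            break                       # labels stopped changing
        y[unlabeled] = new
    return y

# Toy graph: two triangles {0,1,2} and {3,4,5} joined by the edge (2, 3).
A = np.array([[0,1,1,0,0,0],
              [1,0,1,0,0,0],
              [1,1,0,1,0,0],
              [0,0,1,0,1,1],
              [0,0,0,1,0,1],
              [0,0,0,1,1,0]], dtype=float)
labeled = np.array([True, True, False, True, True, False])
y0 = np.array([0, 0, 0, 1, 1, 0])  # unlabeled nodes start with a dummy label 0
print(iterative_classification(A, y0, labeled, n_classes=2))  # → [0 0 0 1 1 1]
```

The unlabeled nodes 2 and 5 end up with the labels of their own triangle after a few feature/retrain/predict rounds.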


Iterative classification


Relational classifiers

Weighted-vote relational neighbor classifier:

P(y_i = c | N_i) = (1/Z) ∑_{j ∈ N_i} A_ij P(y_j = c | N_j)

Network-only Bayes classifier:

P(y_i = c | N_i) = P(N_i | c) P(c) / P(N_i)

where

P(N_i | c) = (1/Z) ∏_{j ∈ N_i} P(y_j = ỹ_j | y_i = c)

with ỹ_j the observed label of neighbor j.
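A minimal sketch of the weighted-vote rule as a single scoring pass, where only already-labeled neighbors contribute votes (the graph and labeling below are invented for illustration):

```python
import numpy as np

def wvrn_scores(A, y, labeled, n_classes):
    """Weighted-vote relational neighbor scores:
    P(y_i = c | N_i) = (1/Z) * sum_j A_ij * P(y_j = c),
    where unlabeled neighbors contribute nothing in this one-shot pass."""
    onehot = np.zeros((A.shape[0], n_classes))
    onehot[labeled] = np.eye(n_classes)[y[labeled]]
    votes = A @ onehot                    # weighted votes from neighbors
    Z = votes.sum(axis=1, keepdims=True)  # per-node normalization
    return np.divide(votes, Z, out=np.zeros_like(votes), where=Z > 0)

# Toy graph: two triangles joined by the edge (2, 3); nodes 0 and 3 labeled.
A = np.array([[0,1,1,0,0,0],
              [1,0,1,0,0,0],
              [1,1,0,1,0,0],
              [0,0,1,0,1,1],
              [0,0,0,1,0,1],
              [0,0,0,1,1,0]], dtype=float)
y = np.array([0, 0, 0, 1, 0, 0])
labeled = np.array([True, False, False, True, False, False])
scores = wvrn_scores(A, y, labeled, n_classes=2)
print(scores[2])  # node 2 sees one vote for each class → [0.5 0.5]
```

In practice the classifier is applied repeatedly (relaxation labeling), feeding each pass's estimates into the next.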


Semi-supervised learning

Graph-based semi-supervised learning

Given a partially labeled dataset

Data: X = X_l ∪ X_u
– a small set of labeled data (X_l, Y_l)
– a large set of unlabeled data X_u

Similarity graph over the data points G(V, E), where every vertex v_i corresponds to a data point x_i

Transductive learning: learn a function that predicts the labels Y_u for the unlabeled input X_u


Random walk methods

Consider a random walk with absorbing states at the labeled nodes V_l

The probability y_i[c] for node v_i ∈ V_u to have label c:

y_i[c] = ∑_{j ∈ V_l} p_ij^∞ y_j[c]

where y_i[c] is a probability distribution over labels and p_ij = P(i → j) is the one-step transition probability matrix. If the output requires a single label per node, assign the most probable one.

In matrix form:

Ŷ = P^∞ Y

where Y = (Y_l, 0) and Ŷ = (Y_l, Ŷ_u)


Random walk methods

Random walk matrix: P = D^{-1} A

Random walk with absorbing states:

P = [[P_ll, P_lu], [P_ul, P_uu]] = [[I, 0], [P_ul, P_uu]]

At the t → ∞ limit:

lim_{t→∞} P^t = [[I, 0], [(∑_{n=0}^∞ P_uu^n) P_ul, P_uu^∞]] = [[I, 0], [(I − P_uu)^{-1} P_ul, 0]]


Random walk methods

Matrix equation:

(Y_l; Ŷ_u) = [[I, 0], [(I − P_uu)^{-1} P_ul, 0]] (Y_l; Y_u)

Solution:

Ŷ_l = Y_l
Ŷ_u = (I − P_uu)^{-1} P_ul Y_l

(I − P_uu) is non-singular for all label-connected graphs (it is always possible to reach a labeled node from any unlabeled node)
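The closed-form solution is a single linear solve; a sketch (toy graph invented for illustration):

```python
import numpy as np

def absorbing_rw_labels(A, Y_l, labeled):
    """Solve Y_u = (I - P_uu)^{-1} P_ul Y_l for the unlabeled nodes,
    with P = D^{-1} A the random-walk transition matrix."""
    P = A / A.sum(axis=1, keepdims=True)
    u = ~labeled
    P_uu = P[np.ix_(u, u)]
    P_ul = P[np.ix_(u, labeled)]
    I = np.eye(P_uu.shape[0])
    return np.linalg.solve(I - P_uu, P_ul @ Y_l)

# Toy graph: two triangles joined by the edge (2, 3); node 0 carries
# class 0, node 3 carries class 1 (one-hot label rows).
A = np.array([[0,1,1,0,0,0],
              [1,0,1,0,0,0],
              [1,1,0,1,0,0],
              [0,0,1,0,1,1],
              [0,0,0,1,0,1],
              [0,0,0,1,1,0]], dtype=float)
labeled = np.array([True, False, False, True, False, False])
Y_l = np.array([[1.0, 0.0], [0.0, 1.0]])
Y_u = absorbing_rw_labels(A, Y_l, labeled)  # rows for nodes 1, 2, 4, 5
# node 1 → [0.8, 0.2], node 2 → [0.6, 0.4], nodes 4 and 5 → [0, 1]
print(Y_u)
```

Nodes in the triangle around node 0 get high class-0 probability; nodes 4 and 5 can only reach the labeled node 3, so their walk is absorbed there with probability 1.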


Label propagation

Algorithm: Label propagation (Zhu et al., 2002)

Input: graph G(V, E), labels Y_l
Output: labels Y

Compute D_ii = ∑_j A_ij
Compute P = D^{-1} A
Initialize Y^(0) = (Y_l, 0), t = 0
repeat
    Y^(t+1) ← P · Y^(t)
    Y_l^(t+1) ← Y_l
    t ← t + 1
until Y^(t) converges
Y ← Y^(t)
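The algorithm maps directly to a few lines of numpy; a sketch on an invented toy graph, with the unlabeled rows converging to the harmonic solution Ŷ_u = (I − P_uu)^{-1} P_ul Y_l:

```python
import numpy as np

def label_propagation(A, Y_l, labeled, tol=1e-9, max_iter=1000):
    """Zhu et al. (2002) label propagation: repeat Y <- P Y,
    then clamp the labeled rows back to Y_l, until convergence."""
    P = A / A.sum(axis=1, keepdims=True)   # P = D^{-1} A
    Y = np.zeros((A.shape[0], Y_l.shape[1]))
    Y[labeled] = Y_l                       # Y^(0) = (Y_l, 0)
    for _ in range(max_iter):
        Y_new = P @ Y
        Y_new[labeled] = Y_l               # clamp labeled nodes
        if np.abs(Y_new - Y).max() < tol:
            return Y_new
        Y = Y_new
    return Y

# Toy graph: two triangles joined by the edge (2, 3); nodes 0 and 3 labeled.
A = np.array([[0,1,1,0,0,0],
              [1,0,1,0,0,0],
              [1,1,0,1,0,0],
              [0,0,1,0,1,1],
              [0,0,0,1,0,1],
              [0,0,0,1,1,0]], dtype=float)
labeled = np.array([True, False, False, True, False, False])
Y_l = np.array([[1.0, 0.0], [0.0, 1.0]])
Y = label_propagation(A, Y_l, labeled)
print(Y.argmax(axis=1))   # hard labels per node → [0 0 0 1 1 1]
```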


Label propagation

Iteration:

Y^(t) = P Y^(t−1)
Y_u^(t) = P_ul Y_l + P_uu Y_u^(t−1)

Y_u^(t) = ∑_{n=1}^{t} P_uu^{n−1} P_ul Y_l + P_uu^t Y_u^(0)

Convergence:

Ŷ_u = lim_{t→∞} Y_u^(t) = (I − P_uu)^{-1} P_ul Y_l


Label spreading

Algorithm: Label spreading (Zhou et al., 2004)

Input: graph G(V, E), labels Y_l
Output: labels Y

Compute D_ii = ∑_j A_ij
Compute S = D^{-1/2} A D^{-1/2}
Initialize Y^(0) = (Y_l, 0), t = 0
repeat
    Y^(t+1) ← α S Y^(t) + (1 − α) Y^(0)
    t ← t + 1
until Y^(t) converges
Y ← Y^(t)


Label spreading

Iterations:

Y^(t) = α S Y^(t−1) + (1 − α) Y^(0)

At the t → ∞ limit:

lim_{t→∞} Y^(t) = lim_{t→∞} (α S Y^(t−1) + (1 − α) Y^(0))

Y = α S Y + (1 − α) Y^(0)

Solution:

Y = (1 − α)(I − α S)^{-1} Y^(0)
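The closed form and the iteration can be checked against each other numerically; a sketch (graph, labels, and α = 0.8 all invented for illustration):

```python
import numpy as np

def label_spreading(A, Y0, alpha=0.8):
    """Zhou et al. (2004) closed form: Y = (1-alpha)(I - alpha S)^{-1} Y0,
    with S = D^{-1/2} A D^{-1/2} the symmetrically normalized adjacency."""
    d = A.sum(axis=1)
    Dis = np.diag(1.0 / np.sqrt(d))
    S = Dis @ A @ Dis
    n = A.shape[0]
    return (1 - alpha) * np.linalg.solve(np.eye(n) - alpha * S, Y0)

# Toy graph (two triangles + bridge edge); nodes 0 and 3 carry one-hot labels.
A = np.array([[0,1,1,0,0,0],
              [1,0,1,0,0,0],
              [1,1,0,1,0,0],
              [0,0,1,0,1,1],
              [0,0,0,1,0,1],
              [0,0,0,1,1,0]], dtype=float)
Y0 = np.zeros((6, 2))
Y0[0, 0] = 1.0
Y0[3, 1] = 1.0

Y = label_spreading(A, Y0, alpha=0.8)

# Check against the iteration Y(t+1) = alpha S Y(t) + (1 - alpha) Y0,
# which converges geometrically since alpha < 1:
d = A.sum(axis=1); Dis = np.diag(1.0 / np.sqrt(d)); S = Dis @ A @ Dis
Yi = Y0.copy()
for _ in range(500):
    Yi = 0.8 * S @ Yi + 0.2 * Y0
assert np.allclose(Y, Yi)   # iteration limit matches the closed form
```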


Regularization on graphs

Find a labeling Ŷ = (Ŷ_l, Ŷ_u) that is

Consistent with the initial labeling:

∑_{i ∈ V_l} (ŷ_i − y_i)^2 = ||Ŷ_l − Y_l||^2

Consistent with the graph structure (regression function smoothness):

(1/2) ∑_{i,j ∈ V} A_ij (ŷ_i − ŷ_j)^2 = Ŷ^T (D − A) Ŷ = Ŷ^T L Ŷ

Stable (additional regularization):

ε ∑_{i ∈ V} ŷ_i^2 = ε ||Ŷ||^2
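The smoothness identity is easy to verify numerically; a quick check (random symmetric adjacency and values, invented for illustration):

```python
import numpy as np

# Check (1/2) * sum_{i,j} A_ij (y_i - y_j)^2 = y^T L y with L = D - A.
rng = np.random.default_rng(0)
n = 6
A = rng.integers(0, 2, size=(n, n))
A = np.triu(A, 1)
A = (A + A.T).astype(float)            # random symmetric adjacency
y = rng.normal(size=n)
L = np.diag(A.sum(axis=1)) - A         # combinatorial graph Laplacian
lhs = 0.5 * sum(A[i, j] * (y[i] - y[j]) ** 2
                for i in range(n) for j in range(n))
print(np.isclose(lhs, y @ L @ y))      # True: the two expressions agree
```

The identity holds for any symmetric A, which is why the quadratic form Ŷ^T L Ŷ can stand in for the pairwise smoothness penalty in all the objectives that follow.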


Regularization on graphs

Minimization with respect to Ŷ: argmin_Ŷ Q(Ŷ)

Label propagation [Zhu, 2002]:

Q(Ŷ) = (1/2) ∑_{i,j ∈ V} A_ij (ŷ_i − ŷ_j)^2 = Ŷ^T L Ŷ, with fixed Ŷ_l = Y_l

Label spreading [Zhou, 2003]:

Q(Ŷ) = (1/2) ∑_{i,j ∈ V} A_ij (ŷ_i/√d_i − ŷ_j/√d_j)^2 + µ ∑_{i ∈ V} (ŷ_i − y_i)^2

Q(Ŷ) = Ŷ^T L Ŷ + µ ||Ŷ − Y||^2

where here L = I − S = I − D^{-1/2} A D^{-1/2} is the normalized Laplacian


Regularization on graphs

Laplacian regularization [Belkin, 2003]:

Q(Ŷ) = (1/2) ∑_{i,j ∈ V} A_ij (ŷ_i − ŷ_j)^2 + µ ∑_{i ∈ V_l} (ŷ_i − y_i)^2

Q(Ŷ) = Ŷ^T L Ŷ + µ ||Ŷ_l − Y_l||^2

Use the eigenvectors (e_1 .. e_p) corresponding to the smallest eigenvalues of L = D − A:

L e_j = λ_j e_j

Construct a classifier (regression function) on the eigenvectors:

Err(a) = ∑_{i ∈ V_l} (y_i − ∑_{j=1}^{p} a_j e_ji)^2

Predict the value (classify): ŷ_i = ∑_{j=1}^{p} a_j e_ji, class c_i = sign(ŷ_i)


Laplacian regularization

Algorithm: Laplacian regularization (Belkin and Niyogi, 2003)

Input: graph G(V, E), labels Y_l
Output: labels Y

Compute D_ii = ∑_j A_ij
Compute L = D − A
Compute the p eigenvectors e_1 .. e_p with the smallest eigenvalues of L, L e = λ e
Minimize over a_1 .. a_p: argmin_{a_1..a_p} ∑_{i=1}^{l} (y_i − ∑_{j=1}^{p} a_j e_ji)^2, giving a = (E^T E)^{-1} E^T Y_l
Label v_i by sign(∑_{j=1}^{p} a_j e_ji)
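A minimal sketch of the eigenvector regression for the binary case (toy graph and ±1 labels invented for illustration; with p = 2 the fit uses the constant eigenvector and the Fiedler vector):

```python
import numpy as np

def laplacian_reg_classify(A, y_l, labeled, p=2):
    """Belkin-Niyogi sketch: least-squares fit of the labeled values on the
    p eigenvectors of L = D - A with smallest eigenvalues; classify by sign."""
    L = np.diag(A.sum(axis=1)) - A
    w, V = np.linalg.eigh(L)            # eigenvalues in ascending order
    E = V[:, :p]                        # p smoothest eigenvectors
    a, *_ = np.linalg.lstsq(E[labeled], y_l, rcond=None)
    return np.sign(E @ a)

# Toy graph (two triangles + bridge edge); labels +1 / -1 on nodes 0 and 3.
A = np.array([[0,1,1,0,0,0],
              [1,0,1,0,0,0],
              [1,1,0,1,0,0],
              [0,0,1,0,1,1],
              [0,0,0,1,0,1],
              [0,0,0,1,1,0]], dtype=float)
labeled = np.array([True, False, False, True, False, False])
y_l = np.array([1.0, -1.0])
print(laplacian_reg_classify(A, y_l, labeled))
# signs: +1 for nodes 0-2, -1 for nodes 3-5
```

The Fiedler vector separates the two triangles, so the sign of the fitted regression function recovers the intended partition from just two labeled nodes.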


Label propagation example

[Example figures; images not recoverable from the extraction.]

References

S. A. Macskassy, F. Provost. Classification in networked data: a toolkit and a univariate case study. Journal of Machine Learning Research 8, 935-983, 2007.

Y. Bengio, O. Delalleau, N. Le Roux. Label propagation and quadratic criterion. In Semi-Supervised Learning, eds. O. Chapelle, B. Scholkopf, A. Zien, MIT Press, 2006.

S. Bhagat, G. Cormode, S. Muthukrishnan. Node classification in social networks. In Social Network Data Analytics, ed. C. Aggarwal, pp. 115-148, 2011.

D. Zhou, O. Bousquet, T. Lal, J. Weston, B. Scholkopf. Learning with local and global consistency. In NIPS, volume 16, 2004.

X. Zhu, Z. Ghahramani, J. Lafferty. Semi-supervised learning using Gaussian fields and harmonic functions. In ICML, 2003.

M. Belkin, P. Niyogi, V. Sindhwani. Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. Journal of Machine Learning Research 7, 2399-2434, 2006.

