
Label propagation on graphs

Leonid E. Zhukov

School of Data Analysis and Artificial Intelligence, Department of Computer Science

National Research University Higher School of Economics

Structural Analysis and Visualization of Networks

Leonid E. Zhukov (HSE) Lecture 17 19.05.2015 1 / 26

Lecture outline

1 Label propagation problem

2 Collective classification: iterative classification

3 Semi-supervised learning: random walk based methods, graph regularization


Label propagation

Label propagation - labeling of all nodes in a graph structure

Subset of nodes is labeled: categorical/numeric/binary values

Extend labeling to all nodes on the graph

Classification in networked data, network classification, structured inference


Label propagation problem

Structure can help only if labels/values of linked nodes are correlated

Social networks show assortative mixing - a bias in favor of connections between network nodes with similar characteristics:
– homophily: similar characteristics → connections
– influence: connections → similar characteristics

Can apply to constructed (induced) similarity networks


Network classification

Supervised learning approach

Given graph nodes V = V_l ∪ V_u:
– nodes V_l have given labels Y_l
– nodes V_u do not have labels

Need to find Y_u

Labels can be binary, multi-class, or real-valued

Features (attributes) φ_i can be computed for every node:
– local node features (if available)
– link features (labels from neighbors, attributes from neighbors, node degrees, connectivity patterns)

Feature (design) matrix Φ = (Φ_l, Φ_u)


Network learning components

Local classifier. A locally learned model that predicts a node's label from its own attributes. Uses no network information.

Relational classifier. Takes into account the labels and attributes of a node's neighbors. Uses neighborhood network information.

Collective classifier. Estimates the unknown values jointly by applying the relational classifier iteratively. Strongly depends on the network structure.


Collective classification

Algorithm: Iterative classification method

Input: graph G(V, E), labels Y_l
Output: labels Y

Compute Φ^(0)
Train classifier on (Φ_l^(0), Y_l)
Predict Y_u^(0)
repeat
    Compute Φ_u^(t)
    Train classifier on (Φ^(t), Y^(t))
    Predict Y_u^(t+1) from Φ_u^(t)
until Y_u^(t) converges
Y ← Y^(t)
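As a concrete sketch of this loop (a toy illustration, not a reference implementation: the graph, the neighbor-label-count features, and the nearest-centroid "classifier" standing in for any local model are all invented here):

```python
import numpy as np

def iterative_classification(A, y, labeled, n_classes, max_iter=20):
    """Iterative classification sketch: recompute link features from the
    current labels, retrain, re-predict, until labels stop changing.
    Features: per-class neighbor label counts (a simple link feature).
    Classifier: nearest class centroid (a stand-in for any local model)."""
    y = y.copy()
    n = A.shape[0]
    unlabeled = ~labeled

    def link_features(y):
        onehot = np.zeros((n, n_classes))
        onehot[np.arange(n), y] = 1.0
        return A @ onehot  # neighbor label counts per class

    for _ in range(max_iter):
        phi = link_features(y)
        # "train": centroid of each class over the labeled nodes
        centroids = np.array([phi[labeled & (y == c)].mean(axis=0)
                              for c in range(n_classes)])
        # "predict": assign each unlabeled node to the nearest centroid
        phi_u = phi[unlabeled]
        d = ((phi_u[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new = d.argmin(axis=1)
        if np.array_equal(new, y[unlabeled]):
            break                       # labels stopped changing
        y[unlabeled] = new
    return y

# Toy graph: two triangles {0,1,2} and {3,4,5} joined by the edge (2, 3).
A = np.array([[0,1,1,0,0,0],
              [1,0,1,0,0,0],
              [1,1,0,1,0,0],
              [0,0,1,0,1,1],
              [0,0,0,1,0,1],
              [0,0,0,1,1,0]], dtype=float)
labeled = np.array([True, True, False, True, True, False])
y0 = np.array([0, 0, 0, 1, 1, 0])  # unlabeled nodes start with a dummy label 0
print(iterative_classification(A, y0, labeled, n_classes=2))  # → [0 0 0 1 1 1]
```

The unlabeled nodes 2 and 5 end up with the labels of their own triangle after a few feature/retrain/predict rounds.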


Iterative classification


Relational classifiers

Weighted-vote relational neighbor classifier:

P(y_i = c | N_i) = (1/Z) ∑_{j ∈ N_i} A_ij P(y_j = c | N_j)

Network-only Bayes classifier:

P(y_i = c | N_i) = P(N_i | c) P(c) / P(N_i)

where

P(N_i | c) = (1/Z) ∏_{j ∈ N_i} P(y_j = ỹ_j | y_i = c)

with ỹ_j the observed label of neighbor j.
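A minimal sketch of the weighted-vote rule as a single scoring pass, where only already-labeled neighbors contribute votes (the graph and labeling below are invented for illustration):

```python
import numpy as np

def wvrn_scores(A, y, labeled, n_classes):
    """Weighted-vote relational neighbor scores:
    P(y_i = c | N_i) = (1/Z) * sum_j A_ij * P(y_j = c),
    where unlabeled neighbors contribute nothing in this one-shot pass."""
    onehot = np.zeros((A.shape[0], n_classes))
    onehot[labeled] = np.eye(n_classes)[y[labeled]]
    votes = A @ onehot                    # weighted votes from neighbors
    Z = votes.sum(axis=1, keepdims=True)  # per-node normalization
    return np.divide(votes, Z, out=np.zeros_like(votes), where=Z > 0)

# Toy graph: two triangles joined by the edge (2, 3); nodes 0 and 3 labeled.
A = np.array([[0,1,1,0,0,0],
              [1,0,1,0,0,0],
              [1,1,0,1,0,0],
              [0,0,1,0,1,1],
              [0,0,0,1,0,1],
              [0,0,0,1,1,0]], dtype=float)
y = np.array([0, 0, 0, 1, 0, 0])
labeled = np.array([True, False, False, True, False, False])
scores = wvrn_scores(A, y, labeled, n_classes=2)
print(scores[2])  # node 2 sees one vote for each class → [0.5 0.5]
```

In practice the classifier is applied repeatedly (relaxation labeling), feeding each pass's estimates into the next.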


Semi-supervised learning

Graph-based semi-supervised learning

Given a partially labeled dataset

Data: X = X_l ∪ X_u
– a small set of labeled data (X_l, Y_l)
– a large set of unlabeled data X_u

Similarity graph over the data points G(V, E), where every vertex v_i corresponds to a data point x_i

Transductive learning: learn a function that predicts the labels Y_u for the unlabeled input X_u


Random walk methods

Consider a random walk with absorbing states at the labeled nodes V_l

The probability y_i[c] for node v_i ∈ V_u to have label c:

y_i[c] = ∑_{j ∈ V_l} p_ij^∞ y_j[c]

where y_i[c] is a probability distribution over labels and p_ij = P(i → j) is the one-step transition probability matrix. If the output requires a single label per node, assign the most probable one.

In matrix form:

Ŷ = P^∞ Y

where Y = (Y_l, 0) and Ŷ = (Y_l, Ŷ_u)


Random walk methods

Random walk matrix: P = D^{-1} A

Random walk with absorbing states:

P = [[P_ll, P_lu], [P_ul, P_uu]] = [[I, 0], [P_ul, P_uu]]

At the t → ∞ limit:

lim_{t→∞} P^t = [[I, 0], [(∑_{n=0}^∞ P_uu^n) P_ul, P_uu^∞]] = [[I, 0], [(I − P_uu)^{-1} P_ul, 0]]


Random walk methods

Matrix equation:

(Y_l; Ŷ_u) = [[I, 0], [(I − P_uu)^{-1} P_ul, 0]] (Y_l; Y_u)

Solution:

Ŷ_l = Y_l
Ŷ_u = (I − P_uu)^{-1} P_ul Y_l

(I − P_uu) is non-singular for all label-connected graphs (it is always possible to reach a labeled node from any unlabeled node)
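The closed-form solution is a single linear solve; a sketch (toy graph invented for illustration):

```python
import numpy as np

def absorbing_rw_labels(A, Y_l, labeled):
    """Solve Y_u = (I - P_uu)^{-1} P_ul Y_l for the unlabeled nodes,
    with P = D^{-1} A the random-walk transition matrix."""
    P = A / A.sum(axis=1, keepdims=True)
    u = ~labeled
    P_uu = P[np.ix_(u, u)]
    P_ul = P[np.ix_(u, labeled)]
    I = np.eye(P_uu.shape[0])
    return np.linalg.solve(I - P_uu, P_ul @ Y_l)

# Toy graph: two triangles joined by the edge (2, 3); node 0 carries
# class 0, node 3 carries class 1 (one-hot label rows).
A = np.array([[0,1,1,0,0,0],
              [1,0,1,0,0,0],
              [1,1,0,1,0,0],
              [0,0,1,0,1,1],
              [0,0,0,1,0,1],
              [0,0,0,1,1,0]], dtype=float)
labeled = np.array([True, False, False, True, False, False])
Y_l = np.array([[1.0, 0.0], [0.0, 1.0]])
Y_u = absorbing_rw_labels(A, Y_l, labeled)  # rows for nodes 1, 2, 4, 5
# node 1 → [0.8, 0.2], node 2 → [0.6, 0.4], nodes 4 and 5 → [0, 1]
print(Y_u)
```

Nodes in the triangle around node 0 get high class-0 probability; nodes 4 and 5 can only reach the labeled node 3, so their walk is absorbed there with probability 1.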


Label propagation

Algorithm: Label propagation (Zhu et al., 2002)

Input: graph G(V, E), labels Y_l
Output: labels Y

Compute D_ii = ∑_j A_ij
Compute P = D^{-1} A
Initialize Y^(0) = (Y_l, 0), t = 0
repeat
    Y^(t+1) ← P · Y^(t)
    Y_l^(t+1) ← Y_l
    t ← t + 1
until Y^(t) converges
Y ← Y^(t)
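The algorithm maps directly to a few lines of numpy; a sketch on an invented toy graph, with the unlabeled rows converging to the harmonic solution Ŷ_u = (I − P_uu)^{-1} P_ul Y_l:

```python
import numpy as np

def label_propagation(A, Y_l, labeled, tol=1e-9, max_iter=1000):
    """Zhu et al. (2002) label propagation: repeat Y <- P Y,
    then clamp the labeled rows back to Y_l, until convergence."""
    P = A / A.sum(axis=1, keepdims=True)   # P = D^{-1} A
    Y = np.zeros((A.shape[0], Y_l.shape[1]))
    Y[labeled] = Y_l                       # Y^(0) = (Y_l, 0)
    for _ in range(max_iter):
        Y_new = P @ Y
        Y_new[labeled] = Y_l               # clamp labeled nodes
        if np.abs(Y_new - Y).max() < tol:
            return Y_new
        Y = Y_new
    return Y

# Toy graph: two triangles joined by the edge (2, 3); nodes 0 and 3 labeled.
A = np.array([[0,1,1,0,0,0],
              [1,0,1,0,0,0],
              [1,1,0,1,0,0],
              [0,0,1,0,1,1],
              [0,0,0,1,0,1],
              [0,0,0,1,1,0]], dtype=float)
labeled = np.array([True, False, False, True, False, False])
Y_l = np.array([[1.0, 0.0], [0.0, 1.0]])
Y = label_propagation(A, Y_l, labeled)
print(Y.argmax(axis=1))   # hard labels per node → [0 0 0 1 1 1]
```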


Label propagation

Iteration:

Y^(t) = P Y^(t−1)
Y_u^(t) = P_ul Y_l + P_uu Y_u^(t−1)

Y_u^(t) = ∑_{n=1}^{t} P_uu^{n−1} P_ul Y_l + P_uu^t Y_u^(0)

Convergence:

Ŷ_u = lim_{t→∞} Y_u^(t) = (I − P_uu)^{-1} P_ul Y_l


Label spreading

Algorithm: Label spreading (Zhou et al., 2004)

Input: graph G(V, E), labels Y_l
Output: labels Y

Compute D_ii = ∑_j A_ij
Compute S = D^{-1/2} A D^{-1/2}
Initialize Y^(0) = (Y_l, 0), t = 0
repeat
    Y^(t+1) ← α S Y^(t) + (1 − α) Y^(0)
    t ← t + 1
until Y^(t) converges
Y ← Y^(t)


Label spreading

Iterations:

Y^(t) = α S Y^(t−1) + (1 − α) Y^(0)

At the t → ∞ limit:

lim_{t→∞} Y^(t) = lim_{t→∞} (α S Y^(t−1) + (1 − α) Y^(0))

Y = α S Y + (1 − α) Y^(0)

Solution:

Y = (1 − α)(I − α S)^{-1} Y^(0)
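The closed form and the iteration can be checked against each other numerically; a sketch (graph, labels, and α = 0.8 all invented for illustration):

```python
import numpy as np

def label_spreading(A, Y0, alpha=0.8):
    """Zhou et al. (2004) closed form: Y = (1-alpha)(I - alpha S)^{-1} Y0,
    with S = D^{-1/2} A D^{-1/2} the symmetrically normalized adjacency."""
    d = A.sum(axis=1)
    Dis = np.diag(1.0 / np.sqrt(d))
    S = Dis @ A @ Dis
    n = A.shape[0]
    return (1 - alpha) * np.linalg.solve(np.eye(n) - alpha * S, Y0)

# Toy graph (two triangles + bridge edge); nodes 0 and 3 carry one-hot labels.
A = np.array([[0,1,1,0,0,0],
              [1,0,1,0,0,0],
              [1,1,0,1,0,0],
              [0,0,1,0,1,1],
              [0,0,0,1,0,1],
              [0,0,0,1,1,0]], dtype=float)
Y0 = np.zeros((6, 2))
Y0[0, 0] = 1.0
Y0[3, 1] = 1.0

Y = label_spreading(A, Y0, alpha=0.8)

# Check against the iteration Y(t+1) = alpha S Y(t) + (1 - alpha) Y0,
# which converges geometrically since alpha < 1:
d = A.sum(axis=1); Dis = np.diag(1.0 / np.sqrt(d)); S = Dis @ A @ Dis
Yi = Y0.copy()
for _ in range(500):
    Yi = 0.8 * S @ Yi + 0.2 * Y0
assert np.allclose(Y, Yi)   # iteration limit matches the closed form
```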


Regularization on graphs

Find a labeling Ŷ = (Ŷ_l, Ŷ_u) that is

Consistent with the initial labeling:

∑_{i ∈ V_l} (ŷ_i − y_i)^2 = ||Ŷ_l − Y_l||^2

Consistent with the graph structure (regression function smoothness):

(1/2) ∑_{i,j ∈ V} A_ij (ŷ_i − ŷ_j)^2 = Ŷ^T (D − A) Ŷ = Ŷ^T L Ŷ

Stable (additional regularization):

ε ∑_{i ∈ V} ŷ_i^2 = ε ||Ŷ||^2
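The smoothness identity is easy to verify numerically; a quick check (random symmetric adjacency and values, invented for illustration):

```python
import numpy as np

# Check (1/2) * sum_{i,j} A_ij (y_i - y_j)^2 = y^T L y with L = D - A.
rng = np.random.default_rng(0)
n = 6
A = rng.integers(0, 2, size=(n, n))
A = np.triu(A, 1)
A = (A + A.T).astype(float)            # random symmetric adjacency
y = rng.normal(size=n)
L = np.diag(A.sum(axis=1)) - A         # combinatorial graph Laplacian
lhs = 0.5 * sum(A[i, j] * (y[i] - y[j]) ** 2
                for i in range(n) for j in range(n))
print(np.isclose(lhs, y @ L @ y))      # True: the two expressions agree
```

The identity holds for any symmetric A, which is why the quadratic form Ŷ^T L Ŷ can stand in for the pairwise smoothness penalty in all the objectives that follow.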


Regularization on graphs

Minimization with respect to Ŷ: argmin_Ŷ Q(Ŷ)

Label propagation [Zhu, 2002]:

Q(Ŷ) = (1/2) ∑_{i,j ∈ V} A_ij (ŷ_i − ŷ_j)^2 = Ŷ^T L Ŷ, with fixed Ŷ_l = Y_l

Label spreading [Zhou, 2003]:

Q(Ŷ) = (1/2) ∑_{i,j ∈ V} A_ij (ŷ_i/√d_i − ŷ_j/√d_j)^2 + µ ∑_{i ∈ V} (ŷ_i − y_i)^2

Q(Ŷ) = Ŷ^T L Ŷ + µ ||Ŷ − Y||^2

where here L = I − S = I − D^{-1/2} A D^{-1/2} is the normalized Laplacian


Regularization on graphs

Laplacian regularization [Belkin, 2003]:

Q(Ŷ) = (1/2) ∑_{i,j ∈ V} A_ij (ŷ_i − ŷ_j)^2 + µ ∑_{i ∈ V_l} (ŷ_i − y_i)^2

Q(Ŷ) = Ŷ^T L Ŷ + µ ||Ŷ_l − Y_l||^2

Use the eigenvectors (e_1 .. e_p) corresponding to the smallest eigenvalues of L = D − A:

L e_j = λ_j e_j

Construct a classifier (regression function) on the eigenvectors:

Err(a) = ∑_{i ∈ V_l} (y_i − ∑_{j=1}^{p} a_j e_ji)^2

Predict the value (classify): ŷ_i = ∑_{j=1}^{p} a_j e_ji, class c_i = sign(ŷ_i)


Laplacian regularization

Algorithm: Laplacian regularization (Belkin and Niyogi, 2003)

Input: graph G(V, E), labels Y_l
Output: labels Y

Compute D_ii = ∑_j A_ij
Compute L = D − A
Compute the p eigenvectors e_1 .. e_p with the smallest eigenvalues of L, L e = λ e
Minimize over a_1 .. a_p: argmin_{a_1..a_p} ∑_{i=1}^{l} (y_i − ∑_{j=1}^{p} a_j e_ji)^2, giving a = (E^T E)^{-1} E^T Y_l
Label v_i by sign(∑_{j=1}^{p} a_j e_ji)
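A minimal sketch of the eigenvector regression for the binary case (toy graph and ±1 labels invented for illustration; with p = 2 the fit uses the constant eigenvector and the Fiedler vector):

```python
import numpy as np

def laplacian_reg_classify(A, y_l, labeled, p=2):
    """Belkin-Niyogi sketch: least-squares fit of the labeled values on the
    p eigenvectors of L = D - A with smallest eigenvalues; classify by sign."""
    L = np.diag(A.sum(axis=1)) - A
    w, V = np.linalg.eigh(L)            # eigenvalues in ascending order
    E = V[:, :p]                        # p smoothest eigenvectors
    a, *_ = np.linalg.lstsq(E[labeled], y_l, rcond=None)
    return np.sign(E @ a)

# Toy graph (two triangles + bridge edge); labels +1 / -1 on nodes 0 and 3.
A = np.array([[0,1,1,0,0,0],
              [1,0,1,0,0,0],
              [1,1,0,1,0,0],
              [0,0,1,0,1,1],
              [0,0,0,1,0,1],
              [0,0,0,1,1,0]], dtype=float)
labeled = np.array([True, False, False, True, False, False])
y_l = np.array([1.0, -1.0])
print(laplacian_reg_classify(A, y_l, labeled))
# signs: +1 for nodes 0-2, -1 for nodes 3-5
```

The Fiedler vector separates the two triangles, so the sign of the fitted regression function recovers the intended partition from just two labeled nodes.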


Label propagation example

[Example figures; images not recoverable from the extraction.]

References

S. A. Macskassy, F. Provost. Classification in networked data: a toolkit and a univariate case study. Journal of Machine Learning Research 8, 935-983, 2007.

Y. Bengio, O. Delalleau, N. Le Roux. Label propagation and quadratic criterion. In Semi-Supervised Learning, eds. O. Chapelle, B. Scholkopf, A. Zien, MIT Press, 2006.

S. Bhagat, G. Cormode, S. Muthukrishnan. Node classification in social networks. In Social Network Data Analytics, ed. C. Aggarwal, pp. 115-148, 2011.

D. Zhou, O. Bousquet, T. Lal, J. Weston, B. Scholkopf. Learning with local and global consistency. In NIPS, volume 16, 2004.

X. Zhu, Z. Ghahramani, J. Lafferty. Semi-supervised learning using Gaussian fields and harmonic functions. In ICML, 2003.

M. Belkin, P. Niyogi, V. Sindhwani. Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. Journal of Machine Learning Research 7, 2399-2434, 2006.

