Robust Multi-Class Transductive Learning with Graphs

Wei Liu and Shih-Fu Chang
Columbia University
June 19, 2009



Outline

Introduction

Graph Construction

Graph Learning

Robust Multi-Class Graph Transduction (RMGT)

Experiments


What is Semi-Supervised Learning (SSL)?

In the narrow sense, SSL refers particularly to semi-supervised classification using labeled data and unlabeled data, which includes both transductive and inductive cases.

Figure: Narrow-sense semi-supervised learning. Transductive learning predicts labels for the seen (unlabeled) data; inductive learning extends to unseen data.


What is Semi-Supervised Learning (SSL)?

In the wide sense, SSL covers all learning tasks where prior knowledge about a few data points is given and knowledge about the remaining data can be inferred. The knowledge may be labels, response values, vector representations, or pairwise relations.

Figure: Wide-sense semi-supervised learning (e.g., semi-supervised regression and clustering).


Survey and Book

Xiaojin Zhu. Semi-Supervised Learning Literature Survey, Computer Sciences Technical Report 1530, University of Wisconsin-Madison, 2005.
Olivier Chapelle, Bernhard Schölkopf, and Alexander Zien. Semi-Supervised Learning, MIT Press, 2006.


Binary-Class SSL Setting

- A data set X = {x_1, ..., x_l, ..., x_n} ⊂ R^d in which the first l samples are labeled and the remaining u = n − l are unlabeled. Prior labels are stored in y ∈ R^n with y_i ∈ {1, −1} if x_i is labeled and y_i = 0 if unlabeled. The graph Laplacian matrix L, or its normalized variant L̃, is used to infer the overall labeling f ∈ R^n.

- Graph Laplacian: L = D − W, where W is the weight matrix of the graph G(V, E, W) built on the dataset X, and D_ii = Σ_j W_ij.

- Normalized graph Laplacian: L̃ = D^{−1/2} L D^{−1/2}.


State of the Art

Label propagation – the key is the Laplacian-shaped regularizer.

Gaussian Fields and Harmonic Functions (GFHF), Zhu et al. 2003:

    min_f  f^T L f    s.t.  f_l = y_l

Local and Global Consistency (LGC), Zhou et al. 2004:

    min_f  ‖f − y‖² + μ f^T L̃ f

Quadratic Criterion (QC), Bengio et al. 2006:

    min_f  ‖f_l − y_l‖² + μ f^T L f + μ_ε ‖f‖²
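As a concrete illustration of GFHF: the hard constraint f_l = y_l lets the unlabeled part be solved in closed form as f_u = −L_uu^{-1} L_ul y_l. Below is a minimal numpy sketch on a 4-node chain graph (our own toy example, not the authors' code):

```python
import numpy as np

# Tiny chain graph 1-2-3-4; the first two nodes are labeled (+1, -1).
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
D = np.diag(W.sum(axis=1))
L = D - W                              # graph Laplacian L = D - W

l = 2                                  # number of labeled points
y_l = np.array([1.0, -1.0])            # their labels
L_uu = L[l:, l:]                       # Laplacian block over unlabeled nodes
L_ul = L[l:, :l]

# GFHF harmonic solution: f_u = -L_uu^{-1} L_ul y_l
f_u = -np.linalg.solve(L_uu, L_ul @ y_l)
f = np.concatenate([y_l, f_u])
```

Nodes 3 and 4 attach only to the node labeled −1, so the harmonic solution assigns both of them −1.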


Remarks:
1. All these methods are akin to each other. I found that Zhu's GFHF gives more robust performance because of the hard constraint and the absence of trade-off parameters.
2. All these methods depend heavily on graph structure.
3. All these methods generalize naturally to multi-class problems.


Motivation

1. "Several graph-based methods listed here are similar to each other. They differ in the particular choice of the loss function and the regularizer. We believe it is more important to construct a good graph than to choose among the methods. However graph construction, as we will see later, is not a well studied area." – X. Zhu, the SSL survey, 2005.
2. The two most widely used kinds of graphs are the k-NN graph and the ε-neighborhood graph. Empirically, a weighted k-NN graph with small k tends to perform better.


A Simple Toy Problem: Noisy Two Moons

Figure: Noisy two moons given two labeled points (legend: unlabeled, noise, labeled +1, labeled −1). We only have ground-truth labels for the points on the two moons, so classification performance is evaluated on these on-manifold points.


A Simple Toy Problem–Noisy Two Moons

Figure: Error rates over unlabeled points. (a) LGC with 13.55% error rate using a 10-NN graph; (b) GFHF with 14.21% error rate using a 10-NN graph; (c) GFHF with zero error rate using a symmetry-favored 10-NN graph.


Insight

- Using the traditional k-NN graph, LGC and GFHF make many errors, but GFHF achieves a perfect result with the proposed symmetry-favored k-NN graph. This illustrates that graph quality is critical to SSL: the same SSL method yields very different results under different graph construction schemes.


k-NN Graph

- Define an asymmetric n × n matrix:

      A_ij = exp(−d(x_i, x_j)² / σ²)   if j ∈ N_i,
      A_ij = 0                         otherwise,        (1)

  where the set N_i holds the indexes of the k nearest neighbors of point x_i, and d(x_i, x_j) is some distance measure (e.g., Euclidean distance) between x_i and x_j.

- The parameter σ is estimated empirically as σ = Σ_{i=1}^n d(x_i, x_{i_k}) / n, where x_{i_k} is the k-th nearest neighbor of x_i. This estimate is simple and proves sufficiently effective.


k-NN sGraph

- Define a symmetric n × n matrix:

      W_ij = A_ij + A_ji   if j ∈ N_i and i ∈ N_j,
      W_ij = A_ji          if j ∉ N_i and i ∈ N_j,
      W_ij = A_ij          otherwise.                    (2)

  Clearly W = A + A^T, and W is symmetric with W_ii = 0 (to avoid self-loops). This weighting scheme favors the symmetric edges ⟨x_i, x_j⟩ for which x_i is in the neighborhood of x_j and x_j is simultaneously in the neighborhood of x_i.
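The two constructions above, eq. (1) and eq. (2), can be sketched in a few lines of numpy (a minimal illustration under our own variable names, not the authors' code; σ follows the empirical estimate from the previous slide):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))           # 20 toy points in R^2
k = 3

# Pairwise Euclidean distances; exclude self from the neighbor search
dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
np.fill_diagonal(dist, np.inf)
nn = np.argsort(dist, axis=1)[:, :k]   # N_i: indexes of the k nearest neighbors

# sigma = sum_i d(x_i, x_{i_k}) / n, the mean distance to the k-th neighbor
sigma = dist[np.arange(len(X)), nn[:, -1]].mean()

# Asymmetric affinity A of eq. (1)
A = np.zeros_like(dist)
rows = np.repeat(np.arange(len(X)), k)
A[rows, nn.ravel()] = np.exp(-dist[rows, nn.ravel()] ** 2 / sigma ** 2)

# Symmetry-favored weights of eq. (2): W = A + A^T, symmetric edges doubled
W = A + A.T
np.fill_diagonal(W, 0.0)               # no self-loops
```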


Remarks:
1. The weights of symmetric edges are explicitly doubled, reflecting the observation that two points connected by a symmetric edge are likely to lie on the same submanifold.
2. In contrast, the weighting scheme of traditional k-NN graphs treats all edges alike, defining the weighted adjacency matrix as max{A, A^T}.
3. We call the graph constructed through eq. (2) the symmetry-favored k-NN graph, or k-NN sGraph for short. It is relatively robust to noise, since it reinforces the similarities between points on manifolds.


Comparison

Figure: 2-NN graph vs. 2-NN sGraph; thicker edges represent larger edge weights.


Graph Laplacian

- Given the constructed graph G(V, E, W), the smoothness semi-norm used in most graph-based approaches is

      ‖f‖²_G = (1/2) Σ_{i,j} (f(v_i) − f(v_j))² W_ij = f^T L f,

  which elicits the graph Laplacian matrix

      L = D − W.                                         (3)

- The degree matrix D ∈ R^{n×n} is diagonal with D_ii = Σ_{j=1}^n W_ij; D_ii approximates the local density of the neighborhood of x_i.
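The identity ‖f‖²_G = f^T L f is easy to sanity-check numerically. A small sketch, assuming only a symmetric nonnegative W (random here for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.random((6, 6))
W = (B + B.T) / 2                      # symmetric nonnegative weights
np.fill_diagonal(W, 0.0)
D = np.diag(W.sum(axis=1))             # degree matrix, D_ii = sum_j W_ij
L = D - W                              # graph Laplacian of eq. (3)

f = rng.normal(size=6)
lhs = f @ L @ f                        # f^T L f
rhs = 0.5 * np.sum(W * (f[:, None] - f[None, :]) ** 2)  # (1/2) sum W_ij (f_i - f_j)^2
```

The two quantities agree for any f, which is why minimizing f^T L f encourages smoothness along heavily weighted edges.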


Doubly-Stochastic Matrix

- Theorem 1 (in the paper) implies that the smoothness norm emphasizes neighborhoods of high density (large D_ii). In practice, however, sampling is usually non-uniform, so over-emphasizing high-density neighborhoods may occlude the information in sparse regions.

Figure: Non-uniform sampling.


Doubly-Stochastic Matrix

- To fully exploit the power of unlabeled data, we should not let unlabeled points in sparse regions be discounted. We therefore enforce the equal-degree constraint D_ii = 1 by requiring W1 = 1, which makes the adjacency matrix W doubly stochastic.


How to learn?

- We try to learn W from the training data without any presumed functional form. We only assume that W is close to the initial W₀ calculated via eq. (2).

- We can infuse semi-supervised information into W. Consider the pair set

      T = {(i, j) | i = j, or x_i and x_j differ in labels}

  and its matrix form T. In particular, we require W_ij = 0 for (i, j) ∈ T, or equivalently Σ_{(i,j)∈T} W_ij = 0 since W_ij ≥ 0. This constraint is intuitive: it removes self-loops and erroneous edges.


Learning W

- We formulate learning a doubly-stochastic W, subject to the differently-labeled information T, as:

      min_W  G(W) = (1/2) ‖W − W₀‖²_F
      s.t.   Σ_{(i,j)∈T} W_ij = 0,
             W1 = 1,  W = W^T,  W ≥ 0,                   (4)

  where ‖·‖_F denotes the Frobenius norm. Eq. (4) is an instance of quadratic programming (QP).


Learning W

- For efficient computation, we divide this QP into two convex sub-problems:

      min_W  G(W) = (1/2) ‖W − W₀‖²_F
      s.t.   Σ_{(i,j)∈T} W_ij = 0,  W1 = 1,  W = W^T     (5)

  and

      min_W  G(W) = (1/2) ‖W − W₀‖²_F    s.t.  W ≥ 0.    (6)


Learning W

- Sub-problem (6) has a simple solution W = ⌈W₀⌉_{≥0}, where the operator ⌈·⌉_{≥0} zeros out all negative entries of W₀. This operator is essentially a conic-subspace projection.

- Sub-problem (5) is solved by

      W = P(W₀, T) = W₀ − ( t₀ + (2·1^T T μ₀) / |T| ) T + μ₀ 1^T + 1 μ₀^T,   (7)

  where P(W₀, T) acts as an affine-subspace projection operator; t₀ and μ₀ are also computed from W₀.


Successive Projection

- We tackle the original QP (4) by successive projection using the two subspace projection operators.

- Von Neumann's successive projection lemma: the alternating projection process converges onto the intersection of the affine and conic subspaces. The lemma thus guarantees that alternately solving sub-problems (5) and (6) converges to the globally optimal solution of the target problem (4).


Algorithm 1: Doubly-Stochastic Adjacency Matrix Learning
INPUT: the initial adjacency matrix W₀, the differently-labeled information T, and the maximum iteration number MaxIter.
LOOP: for m = 1, ..., MaxIter:
    W_m = P(W_{m−1}, T)
    if W_m ≥ 0, stop the loop; else W_m = ⌈W_m⌉_{≥0}.
OUTPUT: W = W_m.
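The slides do not spell out the quantities t₀ and μ₀ inside P(W₀, T), so the sketch below drops the labeled-pair constraint T and alternates two projections in the spirit of Algorithm 1: a closed-form projection onto the affine set {W : W1 = 1, W = W^T} (our own Lagrange-multiplier derivation, not necessarily identical to eq. (7)) and the conic projection that zeros out negative entries:

```python
import numpy as np

def project_affine(X):
    """Closest matrix to symmetric X with unit row sums (and symmetry preserved).
    Derived from W = X + mu 1^T + 1 mu^T with the multipliers solved in closed form."""
    n = len(X)
    one = np.ones(n)
    s = (n - one @ X @ one) / (2 * n)
    mu = (one - X @ one - s * one) / n
    return X + np.outer(mu, one) + np.outer(one, mu)

def learn_doubly_stochastic(W0, max_iter=2000):
    """Successive projection a la Algorithm 1, without the T constraint."""
    W = W0.copy()
    for _ in range(max_iter):
        W = project_affine(W)          # affine step (sub-problem (5) sans T)
        if W.min() >= 0:               # already nonnegative: done
            break
        W = np.maximum(W, 0.0)         # conic step: zero out negatives (sub-problem (6))
    return W

rng = np.random.default_rng(2)
B = rng.random((8, 8))
W0 = (B + B.T) / 2                     # symmetric nonnegative initial weights
np.fill_diagonal(W0, 0.0)
W = learn_doubly_stochastic(W0)
```

Both exit paths leave W nonnegative, and the alternating projections drive the row sums to 1, yielding a (near-)doubly-stochastic matrix close to W₀.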


Two Rings Toy Problem

Figure: Two rings toy data.


Two Rings Toy Problem

Figure: k = 10. (a) k-NN graph; (b) b-matching graph. The b-matching graph is a regular graph in which each node has exactly k adjacent nodes.


Two Rings Toy Problem

Figure: (c) unit-degree graph; (d) unit-degree graph given two labeled points. Both have doubly-stochastic matrices learned from the 10-NN sGraph; the former does not use the differently-labeled information T (and is already good enough), while the latter does.


Merits of Doubly-Stochastic Matrix

- It offers a nonparametric form for W, flexibly representing data lying in compact clusters or on intrinsic low-dimensional submanifolds.

- It is highly robust to noise: when a noisy sample x_j invades the neighborhood of x_i, the unit-degree constraint keeps the weight W_ij small compared to the weights between x_i and its closer neighbors.

- It provides a "balanced" graph Laplacian whose smoothness norm penalizes the label prediction function uniformly on each sample (node), resulting in uniform label propagation.


Goal

- Solve a soft label matrix F = [F_{·1}, F_{·2}, ..., F_{·c}] ∈ R^{n×c} for any multi-class SSL task; row-wise, F = [F_l ; F_u].

Figure: Given the known class assignments Y_l, infer the unknown F_u; each column F_{·k} accounts for one class.


Multi-Class Constraints

- It suffices to suppose the class posteriors for the labeled data to be p(C_k | x_i) = Y_ik = 1 if x_i ∈ C_k and p(C_k | x_i) = Y_ik = 0 otherwise. Importantly, if we knew the class priors ω = [p(C_1), ..., p(C_c)]^T (with ω^T 1_c = 1) and regarded the soft labels F_ik as p(C_k | x_i), we would have

      (1/n) 1^T F_{·k} ≅ (1/n) Σ_{i=1}^n p(C_k | x_i) = Σ_{i=1}^n p(x_i) p(C_k | x_i) = p(C_k),   (8)

  where the marginal probability density p(x_i) ∝ D_ii = 1 is assumed to be 1/n. Eq. (8) induces the hard constraint 1^T F = n ω^T (i.e., F^T 1 = n ω).


Multi-Class Label Propagation

- To address multi-class problems, our motivation is to let the soft labels F_ik carry the main properties of p(C_k | x_i). Hence we impose two hard constraints, F^T 1 = n ω and F 1_c = 1 (due to Σ_k p(C_k | x_i) = 1; 1_c is the c-dimensional all-ones vector), to obtain a constrained multi-class label propagation:

      min_F  tr(F^T L F)
      s.t.   F_l = Y_l,  F 1_c = 1,  F^T 1 = n ω.        (9)


Multi-Class Label Propagation

- Eq. (9) reduces to

      min  Q(F_u) = tr(F_u^T L_uu F_u) + 2 tr(F_u^T L_ul Y_l)
      s.t. F_u 1_c = 1_u,  F_u^T 1_u = n ω − Y_l^T 1_l,   (10)

  where L_uu and L_ul are sub-matrices of L = [L_ll, L_lu; L_ul, L_uu], and 1_l and 1_u are the l- and u-dimensional all-ones vectors, respectively.
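The paper's Theorem 2 closed form is not reproduced on the slides, but eq. (10) is a quadratic objective with linear equality constraints, so one generic way to solve it (a sketch under our own setup, not the paper's method) is a KKT system on the vectorized variable f = vec(F_u). The two constraint blocks share one redundant equation (both fix the total mass n − l), so the system is solved in the least-squares sense:

```python
import numpy as np

rng = np.random.default_rng(3)
n, l, c = 8, 3, 2
u = n - l

# Random dense symmetric weights -> connected graph -> L_uu positive definite
B = rng.random((n, n))
W = (B + B.T) / 2
np.fill_diagonal(W, 0.0)
L = np.diag(W.sum(axis=1)) - W
L_uu, L_ul = L[l:, l:], L[l:, :l]

Y_l = np.zeros((l, c)); Y_l[[0, 1, 2], [0, 1, 0]] = 1.0   # hard labels of the l points
omega = np.array([0.5, 0.5])                              # assumed class priors

# Vectorize column-major: f = vec(F_u), so the Hessian is I_c kron L_uu
H = np.kron(np.eye(c), L_uu)
g = (L_ul @ Y_l).ravel(order='F')                         # linear term vec(L_ul Y_l)
A1 = np.kron(np.ones((1, c)), np.eye(u))                  # encodes F_u 1_c = 1_u
A2 = np.kron(np.eye(c), np.ones((1, u)))                  # encodes F_u^T 1_u = b
b = np.concatenate([np.ones(u), n * omega - Y_l.T @ np.ones(l)])
A = np.vstack([A1, A2])

# KKT system; rank-deficient by one, hence lstsq instead of solve
KKT = np.block([[H, A.T], [A, np.zeros((A.shape[0], A.shape[0]))]])
rhs = np.concatenate([-g, b])
sol = np.linalg.lstsq(KKT, rhs, rcond=None)[0]
F_u = sol[: u * c].reshape(u, c, order='F')
```

Since I_c ⊗ L_uu is positive definite whenever the graph is connected and l ≥ 1, the minimizer F_u is unique on the constraint set.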


Multi-Class Label Propagation

- Theorem 2 (in the paper) gives a closed-form solution to eq. (10). Unlike all existing label propagation methods, the formulated multi-class label propagation succeeds in incorporating class priors.


Flowchart of RMGT

Figure: The RMGT pipeline: input feature vectors → k-NN sGraph → doubly-stochastic adjacency matrix learning → unit-degree graph → multi-class label propagation (with prior labels) → global classification.


Experimental Setup

Data            #Features   #Samples   #Classes
USPS (test)           256       2007         10
FRGC (subset)        4608       3160        316

Figure: Digit and face images.

RMGT: without graph adjacency matrix learning.
RMGT(W): with graph adjacency matrix learning.


Performance Curves

Figure: Left: error rate (%) vs. number of labeled samples on USPS. Right: recognition rate (%) vs. number of labeled samples (/100) on FRGC. Curves compare LGC, SGT, GFHF+CMN, RMGT, and RMGT(W).


Conclusions

- All compared SSL algorithms achieve performance gains when switching from k-NN graphs to k-NN sGraphs.

- RMGT performs better than the other methods, demonstrating the success of multi-class label propagation with class priors.

- RMGT(W) is significantly superior to the others, showing that the proposed graph learning technique (doubly-stochastic adjacency matrix learning) boosts graph-based SSL performance.


Thanks! For any questions, please email [email protected].
