(Graph-based) Semi-Supervised Learning

Partha Pratim Talukdar
Indian Institute of Science

ppt@serc.iisc.in

April 7, 2015

Supervised Learning

Labeled Data → Learning Algorithm → Model

Examples: Decision Trees, Support Vector Machine (SVM), Maximum Entropy (MaxEnt)

Semi-Supervised Learning (SSL)

Labeled Data + A Lot of Unlabeled Data → Learning Algorithm → Model

Examples: Self-Training, Co-Training

Why SSL?

How can unlabeled data be helpful?

[Figure: two panels, "Without Unlabeled Data" and "With Unlabeled Data", each showing the labeled instances and the resulting decision boundary; the second panel also shows the unlabeled instances. Example from [Belkin et al., JMLR 2006].]

More accurate decision boundary in the presence of unlabeled instances

Inductive vs Transductive

                                      | Inductive                    | Transductive
                                      | (Generalize to Unseen Data)  | (Doesn't Generalize to Unseen Data)
Supervised (Labeled)                  | SVM, Maximum Entropy         | X
Semi-supervised (Labeled + Unlabeled) | Manifold Regularization      | Label Propagation (LP), MAD, MP, ...

Focus of this talk

Most Graph SSL algorithms are non-parametric (i.e., the number of parameters grows with data size)

See Chapter 25 of SSL Book: http://olivier.chapelle.cc/ssl-book/discussion.pdf

Two Popular SSL Algorithms

• Self-Training

• Co-Training

Why Graph-based SSL?

• Some datasets are naturally represented by a graph

  • web, citation network, social network, ...

• Uniform representation for heterogeneous data

• Easily parallelizable, scalable to large data

• Effective in practice

[Figure: text classification results comparing Graph SSL, Supervised, and Non-Graph SSL]

Graph-based SSL

[Figure: example graph whose edges carry similarity weights (0.6, 0.2, 0.8). Two nodes carry the seed labels “politics” and “business”; in the final build the remaining nodes are labeled “business” and “politics”.]

Graph-based SSL

Smoothness Assumption: If two instances are similar according to the graph, then output labels should be similar

• Two stages

  • Graph construction (if not already present)

  • Label Inference

Outline

• Motivation

• Graph Construction

• Inference Methods

• Scalability

• Applications

• Conclusion & Future Work

Graph Construction

• Neighborhood Methods

  • k-NN Graph Construction (k-NNG)

  • ε-Neighborhood Method

• Metric Learning

• Other approaches

Neighborhood Methods

• k-Nearest Neighbor Graph (k-NNG)

  • add edges between an instance and its k-nearest neighbors

• ε-Neighborhood

  • add edges to all instances inside a ball of radius ε

[Figure: k-NNG example with k = 3, and an ε-neighborhood ball of radius ε]
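The two neighborhood constructions above can be sketched in a few lines of NumPy. This is a minimal illustration, not the construction used in the talk: it assumes Euclidean distance, brute-force pairwise distances, and unweighted edges, and the function names `knn_graph` and `eps_graph` are hypothetical.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance between every pair of rows of X (brute force, O(n^2))."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def knn_graph(X, k):
    """Directed k-NN graph: A[i, j] = 1 if j is among the k nearest neighbors of i."""
    D = pairwise_distances(X)
    np.fill_diagonal(D, np.inf)              # a point is not its own neighbor
    n = X.shape[0]
    nearest = np.argsort(D, axis=1)[:, :k]   # indices of the k closest points per row
    A = np.zeros((n, n), dtype=int)
    A[np.repeat(np.arange(n), k), nearest.ravel()] = 1
    return A

def eps_graph(X, eps):
    """Epsilon-neighborhood graph: A[i, j] = 1 if j lies within distance eps of i."""
    A = (pairwise_distances(X) <= eps).astype(int)
    np.fill_diagonal(A, 0)
    return A

# Four points on a line; the last one is far from the rest.
X = np.array([[0.0], [1.0], [1.5], [5.0]])
A_knn = knn_graph(X, k=1)
A_eps = eps_graph(X, eps=1.0)
print(A_knn)
print(A_eps)
```

In this toy example the far-away point still receives an outgoing edge in the k-NN graph, while it has no edges at all in the ε-neighborhood graph.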

Issues with k-NNG

• Not scalable (quadratic)

• Results in an asymmetric graph

  • b is the closest neighbor of a, but not the other way

  [Figure: points a, b, c]

• Results in irregular graphs

  • some nodes may end up with higher degree than other nodes

  [Figure: node of degree 4 in the k-NNG (k = 1)]
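Both issues can be reproduced on toy data. The sketch below is illustrative only: the point coordinates are made up, the helper `knn_directed` is a hypothetical name, and distances are Euclidean. The first example places a, b, c on a line so that b is the nearest neighbor of a while the nearest neighbor of b is c; the second is a star layout in which the center ends up with degree 4 even though k = 1.

```python
import numpy as np

def knn_directed(X, k):
    """Boolean adjacency: A[i, j] is True if j is one of the k nearest neighbors of i."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)
    # A stable sort makes tie-breaking deterministic (lowest index wins).
    nearest = np.argsort(D, axis=1, kind="stable")[:, :k]
    A = np.zeros((len(X), len(X)), dtype=bool)
    A[np.repeat(np.arange(len(X)), k), nearest.ravel()] = True
    return A

# Asymmetry: a = 0, b = 2, c = 3 on a line.
line = np.array([[0.0], [2.0], [3.0]])
a, b, c = 0, 1, 2
A_line = knn_directed(line, k=1)
print(A_line[a, b], A_line[b, a])  # True False

# Irregular degrees: one center surrounded by four points at distance 1.
star = np.array([[0, 0], [1, 0], [-1, 0], [0, 1], [0, -1]], dtype=float)
A_star = knn_directed(star, k=1)
undirected = A_star | A_star.T      # keep an edge if either endpoint chose it
degrees = undirected.sum(axis=1)
print(degrees)                      # [4 1 1 1 1]
```

Taking the union of the directed edges is only one way to obtain an undirected graph; it restores symmetry but does nothing about the uneven degrees.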

Issues with ε-Neighborhood

• Not scalable

• Sensitive to value of ε: not invariant to scaling

• Fragmented Graph: disconnected components

[Figure from [Jebara et al., ICML 2009]: a data set and its ε-neighborhood graph, which is disconnected]

Graph Construction using Metric Learning

Weight of the edge between instances $x_i$ and $x_j$:

$$w_{ij} \propto \exp(-D_A(x_i, x_j))$$

$$D_A(x_i, x_j) = (x_i - x_j)^T A (x_i - x_j)$$

A is estimated using Mahalanobis metric learning algorithms

• Supervised Metric Learning

  • ITML [Kulis et al., ICML 2007]

  • LMNN [Weinberger and Saul, JMLR 2009]

• Semi-supervised Metric Learning

  • IDML [Dhillon et al., UPenn TR 2010]
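As a rough illustration of the weight formula above, the sketch below computes $w_{ij}$ for a small data set. It assumes the matrix A is already given (here a hand-picked diagonal matrix rather than one learned by ITML, LMNN, or IDML) and takes the proportionality constant to be 1; the function names are hypothetical.

```python
import numpy as np

def mahalanobis_sq(xi, xj, A):
    """D_A(x_i, x_j) = (x_i - x_j)^T A (x_i - x_j)."""
    d = xi - xj
    return float(d @ A @ d)

def edge_weights(X, A):
    """W[i, j] = exp(-D_A(x_i, x_j)) for all pairs, with a zero diagonal (no self-loops)."""
    diff = X[:, None, :] - X[None, :, :]
    D = np.einsum("ijp,pq,ijq->ij", diff, A, diff)
    W = np.exp(-D)
    np.fill_diagonal(W, 0.0)
    return W

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
A = np.diag([1.0, 4.0])   # assumed metric: differences along the second axis count 4x
W = edge_weights(X, A)
print(W)
```

With this A, the first and third points are four times "farther apart" than the first and second, so their edge weight is much smaller even though the Euclidean distances are equal.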

Benefits of Metric Learning for Graph Construction

[Figure: bar chart of error (y-axis, 0 to 0.5) on the Amazon, Newsgroups, Reuters, Enron A, and Text datasets, comparing Original, RP, PCA, ITML, and IDML. 100 seed and 1400 test instances, all inferences using LP. Callouts: "Graph constructed using supervised metric learning" and "Graph constructed using semi-supervised metric learning [Dhillon et al., 2010]".]

Careful graph construction is critical! [Dhillon et al., UPenn TR 2010]

Other Graph Construction Approaches

• Local Reconstruction

  • Linear Neighborhood [Wang and Zhang, ICML 2005]

• Regular Graph: b-matching [Jebara et al., ICML 2009]

• Fitting Graph to Vector Data [Daitch et al., ICML 2009]

• Graph Kernels

  • [Zhu et al., NIPS 2005]

Outline

• Motivation

• Graph Construction

• Inference Methods

  - Label Propagation

  - Modified Adsorption

  - Measure Propagation

  - Sparse Label Propagation

  - Manifold Regularization

• Scalability

• Applications

• Conclusion & Future Work

Graph Laplacian

• Laplacian (un-normalized) of a graph:

$$L = D - W, \quad \text{where } D_{ii} = \sum_j W_{ij}, \quad D_{ij} = 0 \text{ for } j \neq i$$

Example: graph on nodes a, b, c, d with edge weights $W_{ab} = 1$, $W_{ac} = 2$, $W_{bc} = 3$, $W_{cd} = 1$. With rows and columns ordered a, b, c, d:

$$L = \begin{pmatrix} 3 & -1 & -2 & 0 \\ -1 & 4 & -3 & 0 \\ -2 & -3 & 6 & -1 \\ 0 & 0 & -1 & 1 \end{pmatrix}$$
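The example Laplacian can be reproduced numerically. This is a small sketch assuming the four-node graph implied by the matrix on the slide (edges a-b, a-c, b-c, c-d with weights 1, 2, 3, 1); the variable names are illustrative.

```python
import numpy as np

# Symmetric weight matrix, rows/columns ordered a, b, c, d.
W = np.array([
    [0, 1, 2, 0],
    [1, 0, 3, 0],
    [2, 3, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

D = np.diag(W.sum(axis=1))   # D_ii = sum_j W_ij, zero off the diagonal
L = D - W                    # un-normalized graph Laplacian

print(L)
print(L.sum(axis=1))         # every row of L sums to zero
```

The zero row sums follow directly from the definition of D, and they mean that a constant vector always lies in the null space of L.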

Graph Laplacian (contd.)

• L is positive semi-definite (assuming non-negative weights)

• Smoothness of prediction f over the graph in terms of the Laplacian:

$$f^T L f = \sum_{i,j} W_{ij} (f_i - f_j)^2$$

  Here f is the vector of scores for a single label on the nodes, and $f^T L f$ is a measure of non-smoothness. The sum runs over the edges of the graph, with each edge {i, j} counted once; this is the convention that reproduces the values below.

Example on the graph with nodes a, b, c, d and edge weights $W_{ab} = 1$, $W_{ac} = 2$, $W_{bc} = 3$, $W_{cd} = 1$ (scores listed in the order a, b, c, d):

• $f^T = [1 \;\; 10 \;\; 5 \;\; 25]$: $f^T L f = 588$ (Not Smooth)

• $f^T = [1 \;\; 1 \;\; 1 \;\; 3]$: $f^T L f = 4$ (Smooth)
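The two smoothness values can be checked directly. The sketch below assumes the same four-node example graph (edges a-b, a-c, b-c, c-d with weights 1, 2, 3, 1) and compares the quadratic form with the edge-wise sum, counting each edge once; the function names are illustrative.

```python
import numpy as np

W = np.array([
    [0, 1, 2, 0],
    [1, 0, 3, 0],
    [2, 3, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

def laplacian(W):
    """Un-normalized Laplacian L = D - W."""
    return np.diag(W.sum(axis=1)) - W

def non_smoothness(f, W):
    """Quadratic form f^T L f."""
    f = np.asarray(f, dtype=float)
    return float(f @ laplacian(W) @ f)

def edge_sum(f, W):
    """Sum of W_ij (f_i - f_j)^2 over edges, each unordered pair counted once."""
    f = np.asarray(f, dtype=float)
    i, j = np.triu_indices_from(W, k=1)
    return float((W[i, j] * (f[i] - f[j]) ** 2).sum())

print(non_smoothness([1, 10, 5, 25], W))  # 588.0
print(non_smoothness([1, 1, 1, 3], W))    # 4.0
```

For the first vector the c-d edge alone contributes 1 · (5 − 25)² = 400, which is why that labeling scores as far less smooth than the second.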

Notations

• $\hat{Y}_{v,l}$: score of estimated label l on node v (Estimated Scores)

• $Y_{v,l}$: score of seed label l on node v (Seed Scores)

• $R_{v,l}$: regularization target for label l on node v (Label Regularization)

• $S$: seed node indicator (diagonal matrix)

• $W_{uv}$: weight of edge (u, v) in the graph

Page 84: (Graph-based) Semi-Supervised Learning - Semantic Scholar...(Graph-based) Semi-Supervised Learning Partha Pratim Talukdar Indian Institute of Science ppt@serc.iisc.in April 7, 2015

LP-ZGL [Zhu et al., ICML 2003]

argminY

m!

l=1

Wuv(Yul − Yvl)2

Yul = Yul, !Suu = 1such that

=

m!

l=1

YTl LYl

GraphLaplacian

23

Page 85: (Graph-based) Semi-Supervised Learning - Semantic Scholar...(Graph-based) Semi-Supervised Learning Partha Pratim Talukdar Indian Institute of Science ppt@serc.iisc.in April 7, 2015

LP-ZGL [Zhu et al., ICML 2003]

argminY

m!

l=1

Wuv(Yul − Yvl)2

Yul = Yul, !Suu = 1such that

Smooth

=

m!

l=1

YTl LYl

GraphLaplacian

23

Page 86: (Graph-based) Semi-Supervised Learning - Semantic Scholar...(Graph-based) Semi-Supervised Learning Partha Pratim Talukdar Indian Institute of Science ppt@serc.iisc.in April 7, 2015

LP-ZGL [Zhu et al., ICML 2003]

argminY

m!

l=1

Wuv(Yul − Yvl)2

Yul = Yul, !Suu = 1such that

Smooth

Match Seeds (hard)

=

m!

l=1

YTl LYl

GraphLaplacian

23

Page 87: (Graph-based) Semi-Supervised Learning - Semantic Scholar...(Graph-based) Semi-Supervised Learning Partha Pratim Talukdar Indian Institute of Science ppt@serc.iisc.in April 7, 2015

LP-ZGL [Zhu et al., ICML 2003]

argminY

m!

l=1

Wuv(Yul − Yvl)2

Yul = Yul, !Suu = 1such that

Smooth

Match Seeds (hard)

• Smoothness

• two nodes connected by an edge with high weight should be assigned similar labels

=

m!

l=1

YTl LYl

GraphLaplacian

23

Page 88: (Graph-based) Semi-Supervised Learning - Semantic Scholar...(Graph-based) Semi-Supervised Learning Partha Pratim Talukdar Indian Institute of Science ppt@serc.iisc.in April 7, 2015

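LP-ZGL [Zhu et al., ICML 2003]

$$\arg\min_{\hat{Y}} \sum_{l=1}^{m} \sum_{u,v} W_{uv} (\hat{Y}_{ul} - \hat{Y}_{vl})^2 = \sum_{l=1}^{m} \hat{Y}_l^{\top} L \hat{Y}_l \quad \text{such that} \quad \hat{Y}_{ul} = Y_{ul}, \ \forall u : S_{uu} = 1$$

where $L$ is the Graph Laplacian.

• Smooth: the objective above
• Match Seeds (hard): the constraint
• Smoothness
  • two nodes connected by an edge with high weight should be assigned similar labels
• Solution satisfies harmonic property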
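The harmonic property can be illustrated with a small sketch. This is not the authors' implementation: it assumes a symmetric weight matrix without self-loops stored as a list of lists, and the names `lp_zgl`, `W`, `Y_seed`, and `is_seed` are hypothetical. Seed rows are clamped (the hard constraint), and every other node is repeatedly set to the weighted average of its neighbors.

```python
def lp_zgl(W, Y_seed, is_seed, iters=200):
    """Iterate toward the harmonic solution with hard-clamped seeds.

    W: symmetric weight matrix without self-loops (assumption),
    Y_seed: n x m seed scores, is_seed: list of booleans.
    """
    n, m = len(Y_seed), len(Y_seed[0])
    Y_hat = [row[:] for row in Y_seed]
    for _ in range(iters):
        new = []
        for v in range(n):
            degree = sum(W[v])
            if is_seed[v] or degree == 0:
                # Seeds keep their labels; isolated nodes are left unchanged.
                new.append(Y_hat[v][:])
            else:
                # Harmonic property: weighted average of the neighbors' scores.
                new.append([
                    sum(W[v][u] * Y_hat[u][l] for u in range(n)) / degree
                    for l in range(m)
                ])
        Y_hat = new
    return Y_hat


# Chain 0 - 1 - 2 - 3 with seeds at both ends (two labels).
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
Y_seed = [[1, 0], [0, 0], [0, 0], [0, 1]]
scores = lp_zgl(W, Y_seed, [True, False, False, True])
```

On this chain the interior nodes interpolate linearly between the two seeds, which is what the harmonic property predicts.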

Outline

• Motivation
• Graph Construction
• Inference Methods
  - Label Propagation
  - Modified Adsorption
  - Manifold Regularization
  - Spectral Graph Transduction
  - Measure Propagation
• Scalability
• Applications
• Conclusion & Future Work


Modified Adsorption (MAD) [Talukdar and Crammer, ECML 2009]

$$\arg\min_{\hat{Y}} \sum_{l=1}^{m+1} \Big[ \|S\hat{Y}_l - SY_l\|^2 + \mu_1 \sum_{u,v} M_{uv} (\hat{Y}_{ul} - \hat{Y}_{vl})^2 + \mu_2 \|\hat{Y}_l - R_l\|^2 \Big]$$

The three terms are, in order: Match Seeds (soft), Smooth, and Match Priors (Regularizer).

• m labels, +1 dummy label (for none-of-the-above label)
• $M = W^{\top} + W$ is the symmetrized weight matrix
• $\hat{Y}_{vl}$: weight of label l on node v (Estimated Scores)
• $Y_{vl}$: seed weight for label l on node v (Seed Scores)
• $S$: diagonal matrix, nonzero for seed nodes
• $R_{vl}$: regularization target for label l on node v (Label Priors)

MAD has extra regularization compared to LP-ZGL [Zhu et al., ICML 2003]; similar to QC [Bengio et al., 2006]

MAD's Objective is Convex


Solving MAD Objective

• Can be solved using matrix inversion (like in LP)
  • but matrix inversion is expensive
• Instead solved exactly using a system of linear equations (Ax = b)
  • solved using Jacobi iterations
  • results in iterative updates
  • guaranteed convergence
  • see [Bengio et al., 2006] and [Talukdar and Crammer, ECML 2009] for details


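Solving MAD using Iterative Updates

Inputs: $Y, R : |V| \times (|L|+1)$; $W : |V| \times |V|$; $S : |V| \times |V|$ diagonal
$\hat{Y} \leftarrow Y$
$M = W + W^{\top}$
$Z_v \leftarrow S_{vv} + \mu_1 \sum_{u \neq v} M_{vu} + \mu_2 \quad \forall v \in V$
repeat
    for all $v \in V$ do
        $\hat{Y}_v \leftarrow \frac{1}{Z_v} \big( (SY)_v + \mu_1 M_{v\cdot} \hat{Y} + \mu_2 R_v \big)$
    end for
until convergence

[Figure: node v with neighbors a, b, c; annotations: Seed, Prior, 0.75, 0.05, 0.60. The new label estimate on v is computed from the current label estimates on its neighbors.]

• Importance of a node can be discounted
• Easily Parallelizable: Scalable (more later)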
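The update rule above can be sketched in plain Python. This is an illustrative sketch rather than the reference implementation: it assumes dense list-of-lists matrices, stores the diagonal of S as a list, runs a fixed number of Jacobi sweeps instead of testing convergence, and uses hypothetical names (`mad`, `S_diag`, `iters`). The default values of `mu1` and `mu2` are arbitrary.

```python
def mad(W, Y, R, S_diag, mu1=1.0, mu2=0.01, iters=100):
    """Jacobi-style MAD updates: every node is updated from the previous sweep."""
    n, k = len(Y), len(Y[0])
    # M = W + W^T, the symmetrized weight matrix.
    M = [[W[u][v] + W[v][u] for v in range(n)] for u in range(n)]
    # Z_v = S_vv + mu1 * sum_{u != v} M_vu + mu2
    Z = [S_diag[v] + mu1 * sum(M[v][u] for u in range(n) if u != v) + mu2
         for v in range(n)]
    Y_hat = [row[:] for row in Y]
    for _ in range(iters):
        new = []
        for v in range(n):
            row = []
            for l in range(k):
                neighbor_mass = sum(M[v][u] * Y_hat[u][l]
                                    for u in range(n) if u != v)
                row.append((S_diag[v] * Y[v][l]
                            + mu1 * neighbor_mass
                            + mu2 * R[v][l]) / Z[v])
            new.append(row)
        Y_hat = new
    return Y_hat


# Directed chain 0 -> 1 -> 2; two labels plus one dummy label (last column).
W = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
Y = [[1, 0, 0], [0, 0, 0], [0, 1, 0]]   # seeds on nodes 0 and 2
R = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]   # all-zero priors, for simplicity
Y_hat = mad(W, Y, R, S_diag=[1, 0, 1])
```

Because each sweep reads only the previous sweep's scores, the inner loop over nodes has no cross-node dependencies, which is what makes the update easy to parallelize.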


When is MAD most effective?

[Figure: Relative Increase in MRR by MAD over LP-ZGL (y-axis, 0 to 0.4) versus Average Degree (x-axis, 0 to 30).]

MAD is particularly effective in denser graphs, where there is greater need for regularization.


Extension to Dependent Labels

Labels are not always mutually exclusive

[Figure: Label Similarity in Sentiment Classification (edge values 1.0, 1.0)]

Modified Adsorption with Dependent Labels (MADDL) [Talukdar and Crammer, ECML 2009]

• Can take label similarities into account
• Convex Objective
• Efficient iterative/parallelizable updates as in MAD

Outline

• Motivation
• Graph Construction
• Inference Methods
  - Label Propagation
  - Modified Adsorption
  - Measure Propagation
  - Sparse Label Propagation
  - Manifold Regularization
• Scalability
• Applications
• Conclusion & Future Work


Measure Propagation (MP) [Subramanya and Bilmes, EMNLP 2008, NIPS 2009, JMLR 2011]

$$C_{KL}: \quad \arg\min_{\{p_i\}} \sum_{i=1}^{l} D_{KL}(r_i \| p_i) + \mu \sum_{i,j} w_{ij} D_{KL}(p_i \| p_j) - \nu \sum_{i=1}^{n} H(p_i)$$

$$\text{s.t.} \quad \sum_{y} p_i(y) = 1, \quad p_i(y) \ge 0, \quad \forall y, i$$

• First term: Divergence on seed nodes, where $r_i$ and $p_i$ are the seed and estimated label distributions (normalized) on node i
• Second term: Smoothness (divergence across edge)
• Third term: Entropic Regularizer
• Constraints: Normalization Constraint

KL Divergence: $D_{KL}(p_i \| p_j) = \sum_{y} p_i(y) \log \frac{p_i(y)}{p_j(y)}$

Entropy: $H(p_i) = -\sum_{y} p_i(y) \log p_i(y)$

CKL is convex (with non-negative edge weights and hyper-parameters)

MP is related to Information Regularization [Corduneanu and Jaakkola, 2003]
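To make the three terms of the MP objective concrete, the sketch below evaluates it for given distributions. It is only an illustration under stated assumptions: seed distributions are passed as a dictionary keyed by node index (the slide instead assumes the first l nodes are the seeds), and the function names `kl`, `entropy`, and `c_kl` are hypothetical.

```python
import math


def kl(p, q):
    """D_KL(p || q); terms with p(y) = 0 contribute zero."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)


def entropy(p):
    """Shannon entropy H(p) in nats."""
    return -sum(a * math.log(a) for a in p if a > 0)


def c_kl(p, r, w, mu, nu):
    """Seed divergence + mu * edge smoothness - nu * entropy."""
    n = len(p)
    seed_term = sum(kl(r[i], p[i]) for i in r)
    smooth_term = sum(w[i][j] * kl(p[i], p[j])
                      for i in range(n) for j in range(n))
    entropy_term = sum(entropy(p_i) for p_i in p)
    return seed_term + mu * smooth_term - nu * entropy_term


# Two connected nodes, both uniform over two labels; node 0 is a seed.
p = [[0.5, 0.5], [0.5, 0.5]]
r = {0: [0.5, 0.5]}
w = [[0, 1], [1, 0]]
value = c_kl(p, r, w, mu=1.0, nu=1.0)
```

In this example the seed and smoothness terms vanish, so the value is just minus the total entropy of the two uniform distributions.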


Solving MP Objective

• For ease of optimization, reformulate MP objective:

$$C_{MP}: \quad \arg\min_{\{p_i, q_i\}} \sum_{i=1}^{l} D_{KL}(r_i \| q_i) + \mu \sum_{i,j} w'_{ij} D_{KL}(p_i \| q_j) - \nu \sum_{i=1}^{n} H(p_i)$$

$$w'_{ij} = w_{ij} + \alpha \times \delta(i, j)$$

• $q_i$: new probability measure, one for each vertex, similar to $p_i$
• The $\alpha \times \delta(i, j)$ term encourages agreement between $p_i$ and $q_i$
• CMP is also convex (with non-negative edge weights and hyper-parameters)
• CMP can be solved using Alternating Minimization (AM)


Alternating Minimization

[Figure: alternating minimization producing the sequence Q0, P1, Q1, P2, Q2, P3, ...]

CMP satisfies the necessary conditions for AM to converge [Subramanya and Bilmes, JMLR 2011]
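A rough sketch of alternating minimization on the reformulated objective follows. The closed-form updates used here were obtained by setting the gradient of each sub-problem to zero under the normalization constraint; they are an assumption of this sketch and should be checked against [Subramanya and Bilmes, JMLR 2011] before use. All function and variable names are hypothetical, and seeds are given as a dictionary keyed by node index.

```python
import math


def kl(p, q):
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)


def entropy(p):
    return -sum(a * math.log(a) for a in p if a > 0)


def w_prime(w, alpha, i, j):
    # w'_ij = w_ij + alpha * delta(i, j)
    return w[i][j] + (alpha if i == j else 0.0)


def c_mp(P, Q, r, w, mu, nu, alpha):
    n = len(P)
    seed = sum(kl(r[i], Q[i]) for i in r)
    smooth = sum(w_prime(w, alpha, i, j) * kl(P[i], Q[j])
                 for i in range(n) for j in range(n))
    return seed + mu * smooth - nu * sum(entropy(p) for p in P)


def update_p(Q, w, mu, nu, alpha):
    """Minimize over each p_i with Q fixed: p_i(y) ~ exp(weighted mean of log q_j(y))."""
    n, k = len(Q), len(Q[0])
    P = []
    for i in range(n):
        gamma = nu + mu * sum(w_prime(w, alpha, i, j) for j in range(n))
        logits = [mu * sum(w_prime(w, alpha, i, j) * math.log(Q[j][y])
                           for j in range(n)) / gamma for y in range(k)]
        top = max(logits)  # subtract the max for numerical stability
        unnorm = [math.exp(t - top) for t in logits]
        total = sum(unnorm)
        P.append([u / total for u in unnorm])
    return P


def update_q(P, r, w, mu, alpha):
    """Minimize over each q_i with P fixed: weighted average of seed and neighbor p_j."""
    n, k = len(P), len(P[0])
    Q = []
    for i in range(n):
        is_seed = 1.0 if i in r else 0.0
        total = is_seed + mu * sum(w_prime(w, alpha, j, i) for j in range(n))
        seed_dist = r.get(i, [0.0] * k)
        Q.append([(is_seed * seed_dist[y]
                   + mu * sum(w_prime(w, alpha, j, i) * P[j][y] for j in range(n)))
                  / total for y in range(k)])
    return Q


# Chain 0 - 1 - 2, seeds at both ends, uniform starting point Q0.
w = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
r = {0: [1.0, 0.0], 2: [0.0, 1.0]}
mu, nu, alpha = 1.0, 0.1, 1.0
Q = [[0.5, 0.5] for _ in range(3)]
history = []
for _ in range(5):
    P = update_p(Q, w, mu, nu, alpha)
    history.append(c_mp(P, Q, r, w, mu, nu, alpha))
    Q = update_q(P, r, w, mu, alpha)
    history.append(c_mp(P, Q, r, w, mu, nu, alpha))
```

Each half-step solves its sub-problem exactly, so the values recorded in `history` should never increase, mirroring the Q0, P1, Q1, P2, ... sequence in the figure.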


Background: Factor Graphs [Kschischang et al., 2001]

Factor Graph
• bipartite graph
• variable nodes (e.g., label distribution on a node)
• factor nodes: fitness function over variable assignment

[Figure: factor graph with Variable Nodes (V) and Factor Nodes (F). Equation annotations: distribution over all variables' values; variables connected to factor f.]


Factor Graph Interpretation of Graph SSL [Zhu et al., ICML 2003] [Das and Smith, NAACL 2012]

3-term Graph SSL Objective (common to many algorithms):

min  Seed Matching Loss (if any) + Edge Smoothness Loss + Regularization Loss

• Smoothness Factor: an edge between $q_1$ and $q_2$ with loss $w_{1,2} \|q_1 - q_2\|^2$ becomes a factor $\phi(q_1, q_2)$, with $\phi(q_1, q_2) \propto w_{1,2} \|q_1 - q_2\|^2$
• Seed Matching Factor (unary)
• Regularization factor (unary)

Factor Graph Interpretation [Zhu et al., ICML 2003] [Das and Smith, NAACL 2012]

[Figure: factor graph over seed nodes r1, r2, r3, r4 and variable nodes q1, q2, q3, q4, ..., q9264, q9265, q9266, q9267, q9268, q9269, q9270]

1. Factor encouraging agreement on seed labels
2. Smoothness Factor
3. Unary factor for regularization, $\log \psi_t(q_t)$


Label Propagation with Sparsity

Enforce through sparsity inducing unary factor

• Lasso (Tibshirani, 1996): $\log \psi_t(q_t) = \ldots$
• Elitist Lasso (Kowalski and Torrésani, 2009): $\log \psi_t(q_t) = \ldots$

For more details, see [Das and Smith, NAACL 2012]



Manifold Regularization [Belkin et al., JMLR 2006]

$$f^{*} = \arg\min_{f} \frac{1}{l} \sum_{i=1}^{l} V(y_i, f(x_i)) + \beta f^{\top} L f + \gamma \|f\|_K^2$$

• Training Data Loss: $V$ is the Loss Function (e.g., soft margin)
• Smoothness Regularizer: $L$ is the Laplacian of graph over labeled and unlabeled data
• Regularizer (e.g., L2): the $\|f\|_K^2$ term

Trains an inductive classifier which can generalize to unseen instances
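The smoothness regularizer can be checked numerically on a toy graph. The sketch below assumes the unnormalized Laplacian L = D - W of a symmetric weight matrix, for which the quadratic form equals half the sum of weighted squared differences over ordered node pairs. The function names and the example values are hypothetical.

```python
def laplacian_quadratic(W, f):
    """f^T L f with L = D - W, computed without building L explicitly."""
    n = len(f)
    degree = [sum(W[i]) for i in range(n)]
    diagonal_part = sum(degree[i] * f[i] * f[i] for i in range(n))
    off_diagonal_part = sum(W[i][j] * f[i] * f[j]
                            for i in range(n) for j in range(n))
    return diagonal_part - off_diagonal_part


def pairwise_smoothness(W, f):
    """0.5 * sum over ordered pairs of W_ij * (f_i - f_j)^2."""
    n = len(f)
    return 0.5 * sum(W[i][j] * (f[i] - f[j]) ** 2
                     for i in range(n) for j in range(n))


# Symmetric toy graph: edge (0, 1) has weight 2, edge (1, 2) has weight 1.
W = [[0, 2, 0], [2, 0, 1], [0, 1, 0]]
f = [1.0, 0.5, -1.0]
quad = laplacian_quadratic(W, f)
pair = pairwise_smoothness(W, f)
```

Both quantities agree, which is why penalizing the quadratic form pushes the classifier toward similar outputs on strongly connected points, labeled or not.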

Other Graph-based SSL Methods

• TACO [Orbach and Crammer, ECML 2012]
• SSL on Directed Graphs
  • [Zhou et al., NIPS 2005], [Zhou et al., ICML 2005]
• Spectral Graph Transduction [Joachims, ICML 2003]
• Graph-SSL for Ordering
  • [Talukdar et al., CIKM 2012]
• Learning with dissimilarity edges
  • [Goldberg et al., AISTATS 2007]

Outline

• Motivation
• Graph Construction
• Inference Methods
• Scalability
  - Scalability Issues
  - Node reordering
  - MapReduce Parallelization
• Applications
• Conclusion & Future Work


More (Unlabeled) Data is Better Data

[Figure from [Subramanya & Bilmes, JMLR 2011]: graph with 120m vertices]

Challenges with large unlabeled data:
• Constructing graph from large data
• Scalable inference over large graphs

Page 161: (Graph-based) Semi-Supervised Learning - Semantic Scholar...(Graph-based) Semi-Supervised Learning Partha Pratim Talukdar Indian Institute of Science ppt@serc.iisc.in April 7, 2015

Outline

• Motivation

• Graph Construction

• Inference Methods

• Scalability

- Scalability Issues
- Node reordering [Subramanya & Bilmes, JMLR 2011; Bilmes & Subramanya, 2011]
- MapReduce Parallelization

• Applications

• Conclusion & Future Work

Label Update using Message Passing

[Figure: node v with neighbors a, b, c and a seed prior; the values 0.75, 0.05 and 0.60 appear in the figure. The current label estimates on the neighbors (e.g., on a) are passed to v, which then computes the new label estimate on v.]

[Figure: SMP with k processors. Graph nodes 1, 2, ..., k, k+1, ... (neighbors not shown) are assigned to Processor 1, Processor 2, ..., Processor k, Processor 1, ...]
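The node-to-processor assignment suggested by the SMP figure can be sketched in a few lines. This is only an illustration under an assumption: the figure appears to show nodes handed out in round-robin order (node 1 to Processor 1, node k+1 back to Processor 1), and the function name `assign_round_robin` is hypothetical, not taken from any of the cited systems.

```python
def assign_round_robin(nodes, k):
    """Split an ordered list of nodes across k processors in round-robin order.

    Assumption: this mirrors the assignment drawn in the figure; a real
    SMP implementation may schedule nodes differently.
    """
    buckets = [[] for _ in range(k)]
    for position, node in enumerate(nodes):
        # Position 0 goes to processor 0, position k wraps back to processor 0.
        buckets[position % k].append(node)
    return buckets


# Seven nodes on three processors (toy sizes chosen for illustration).
parts = assign_round_robin(list(range(1, 8)), 3)
```

Under such an assignment, the order in which nodes are listed determines which nodes run at the same time, which is why node ordering matters.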


Node Reordering Algorithm: Intuition

Which node should be processed along with k? The one whose neighborhood has the highest intersection with the neighborhood of k.

[Figure: node k with neighbors a, b, c; node a with neighbors e, j, c; node b with neighbors a, d, c; node c with neighbors l, m, n.]

Cardinality of Intersection

• |N(k) ∩ N(a)| = 1

• |N(k) ∩ N(b)| = 2 (Best Node)

• |N(k) ∩ N(c)| = 0
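The intuition on this slide can be checked with a small sketch. The neighborhoods below are read off the slide's figure; the helper names `overlap` and `best_companion` are hypothetical and only illustrate the selection rule, not the full reordering algorithm of Subramanya & Bilmes.

```python
# Neighborhoods as drawn in the slide's figure.
neighbors = {
    "k": {"a", "b", "c"},
    "a": {"e", "j", "c"},
    "b": {"a", "d", "c"},
    "c": {"l", "m", "n"},
}


def overlap(graph, u, v):
    """Cardinality of the intersection of the two neighborhoods, |N(u) ∩ N(v)|."""
    return len(graph[u] & graph[v])


def best_companion(graph, node, candidates):
    """Pick the candidate whose neighborhood overlaps most with that of `node`."""
    return max(candidates, key=lambda c: overlap(graph, node, c))


scores = {c: overlap(neighbors, "k", c) for c in ("a", "b", "c")}
best = best_companion(neighbors, "k", ["a", "b", "c"])
```

Here `scores` reproduces the three cardinalities on the slide, and `best` is the node marked as the best node.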


Speed-up on SMP after Node Ordering

[Subramanya & Bilmes, JMLR 2011]


MapReduce Implementation of MAD

• Map

- Each node sends its current label assignments to its neighbors

• Reduce

- Each node updates its own label assignment using messages received from neighbors, and its own information (e.g., seed labels, regularization penalties, etc.)

• Repeat until convergence

[Figure: node v with neighbors a, b, c and a seed prior; the values 0.75, 0.05 and 0.60 appear in the figure. In the Map step the current label estimates on the neighbors (e.g., on b) are sent to v; in the Reduce step v computes its new label estimate.]

Code in Junto Label Propagation Toolkit (includes Hadoop-based implementation): https://github.com/parthatalukdar/junto

Graph-based algorithms are amenable to distributed processing
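The Map/Reduce pattern described above can be sketched in plain Python, without Hadoop. This is a minimal illustration under stated assumptions: the update is a simple weighted average of neighbor scores plus the node's seed scores, not the actual MAD update (its random-walk probabilities and regularization terms are omitted), and it runs a fixed number of iterations instead of testing for convergence. The names `map_step`, `reduce_step`, `propagate` and `seed_weight` are hypothetical and are not Junto's API.

```python
from collections import defaultdict


def map_step(graph, labels):
    """Map: each node sends its current label scores to every neighbor."""
    for node, edges in graph.items():
        for neighbor, weight in edges.items():
            yield neighbor, (weight, labels.get(node, {}))


def reduce_step(node, messages, seeds, seed_weight=1.0):
    """Reduce: combine neighbor messages with the node's own seed labels.

    Simplified weighted average; stands in for the MAD update.
    """
    scores = defaultdict(float)
    total = 0.0
    for weight, neighbor_labels in messages:
        total += weight
        for label, score in neighbor_labels.items():
            scores[label] += weight * score
    if node in seeds:
        total += seed_weight
        for label, score in seeds[node].items():
            scores[label] += seed_weight * score
    if total == 0.0:
        return {}
    return {label: score / total for label, score in scores.items()}


def propagate(graph, seeds, iterations=10):
    """Alternate Map and Reduce for a fixed number of rounds."""
    labels = {node: dict(scores) for node, scores in seeds.items()}
    for _ in range(iterations):
        inbox = defaultdict(list)  # plays the role of the shuffle phase
        for target, message in map_step(graph, labels):
            inbox[target].append(message)
        labels = {node: reduce_step(node, inbox[node], seeds) for node in graph}
    return labels


# Toy chain x - y - z with label A seeded on x and label B seeded on z.
graph = {"x": {"y": 1.0}, "y": {"x": 1.0, "z": 1.0}, "z": {"y": 1.0}}
seeds = {"x": {"A": 1.0}, "z": {"B": 1.0}}
labels = propagate(graph, seeds)
```

In this toy graph the middle node y ends up with equal scores for A and B, while x stays closer to its seed label A. Because each Reduce call only needs the messages addressed to one node, the per-node work can be distributed across machines.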


When to use Graph-based SSL and which method?

• When input data itself is a graph (relational data)

• or, when the data is expected to lie on a manifold

• MAD, Quadratic Criteria (QC)

- when labels are not mutually exclusive
- MADDL: when label similarities are known

• Measure Propagation (MP)

- for probabilistic interpretation

• Manifold Regularization

- for generalization to unseen data (induction)

Graph-based SSL: Summary

• Provide flexible representation

- for both IID and relational data

• Graph construction can be key

• Scalable: Node Reordering and MapReduce

• Can handle labeled as well as unlabeled data

• Can handle multi-class, multi-label settings

• Effective in practice

Open Challenges

• Graph-based SSL for Structured Prediction

- Algorithms: Combining inductive and graph-based methods
- Applications: Constituency and dependency parsing, coreference

• Scalable graph construction, especially with multi-modal data

• Extensions with other loss functions, sparsity, etc.

References (I)

[1] A. Alexandrescu and K. Kirchhoff. Data-driven graph construction for semi-supervised graph-based learning in NLP. In NAACL HLT, 2007.
[2] Y. Altun, D. McAllester, and M. Belkin. Maximum margin semi-supervised learning for structured variables. NIPS, 2006.
[3] S. Baluja, R. Seth, D. Sivakumar, Y. Jing, J. Yagnik, S. Kumar, D. Ravichandran, and M. Aly. Video suggestion and discovery for YouTube: taking random walks through the view graph. In WWW, 2008.
[4] R. Bekkerman, R. El-Yaniv, N. Tishby, and Y. Winter. Distributional word clusters vs. words for text categorization. J. Mach. Learn. Res., 3:1183–1208, 2003.
[5] M. Belkin, P. Niyogi, and V. Sindhwani. Manifold regularization: A geometric framework for learning from labeled and unlabeled examples. Journal of Machine Learning Research, 7:2399–2434, 2006.
[6] Y. Bengio, O. Delalleau, and N. Le Roux. Label propagation and quadratic criterion. Semi-supervised learning, 2006.
[7] T. Berg-Kirkpatrick, A. Bouchard-Côté, J. DeNero, and D. Klein. Painless unsupervised learning with features. In HLT-NAACL, 2010.
[8] J. Bilmes and A. Subramanya. Scaling up Machine Learning: Parallel and Distributed Approaches, chapter Parallel Graph-Based Semi-Supervised Learning. 2011.
[9] S. Blair-Goldensohn, T. Neylon, K. Hannan, G. A. Reis, R. McDonald, and J. Reynar. Building a sentiment summarizer for local service reviews. In NLP in the Information Explosion Era, 2008.
[10] M. Cafarella, A. Halevy, D. Wang, E. Wu, and Y. Zhang. Webtables: exploring the power of tables on the web. VLDB, 2008.
[11] O. Chapelle, B. Schölkopf, A. Zien, et al. Semi-supervised learning. MIT Press, Cambridge, MA, 2006.
[12] Y. Choi and C. Cardie. Adapting a polarity lexicon using integer linear programming for domain specific sentiment classification. In EMNLP, 2009.
[13] S. Daitch, J. Kelner, and D. Spielman. Fitting a graph to vector data. In ICML, 2009.
[14] D. Das and S. Petrov. Unsupervised part-of-speech tagging with bilingual graph-based projections. In ACL, 2011.
[15] D. Das, N. Schneider, D. Chen, and N. A. Smith. Probabilistic frame-semantic parsing. In NAACL-HLT, 2010.
[16] D. Das and N. Smith. Graph-based lexicon expansion with sparsity-inducing penalties. NAACL-HLT, 2012.
[17] D. Das and N. A. Smith. Semi-supervised frame-semantic parsing for unknown predicates. In ACL, 2011.
[18] J. Davis, B. Kulis, P. Jain, S. Sra, and I. Dhillon. Information-theoretic metric learning. In ICML, 2007.
[19] O. Delalleau, Y. Bengio, and N. L. Roux. Efficient non-parametric function induction in semi-supervised learning. In AISTATS, 2005.
[20] P. Dhillon, P. Talukdar, and K. Crammer. Inference-driven metric learning for graph construction. Technical report, MS-CIS-10-18, University of Pennsylvania, 2010.

References (II)

[21] S. Dumais, J. Platt, D. Heckerman, and M. Sahami. Inductive learning algorithms and representations for text categorization. In CIKM, 1998.
[22] J. Friedman, J. Bentley, and R. Finkel. An algorithm for finding best matches in logarithmic expected time. ACM Transaction on Mathematical Software, 3, 1977.
[23] J. Garcke and M. Griebel. Data mining with sparse grids using simplicial basis functions. In KDD, 2001.
[24] A. Goldberg and X. Zhu. Seeing stars when there aren't many stars: graph-based semi-supervised learning for sentiment categorization. In Proceedings of the First Workshop on Graph Based Methods for Natural Language Processing, 2006.
[25] A. Goldberg, X. Zhu, and S. Wright. Dissimilarity in graph-based semi-supervised classification. AISTATS, 2007.
[26] M. Hu and B. Liu. Mining and summarizing customer reviews. In KDD, 2004.
[27] T. Jebara, J. Wang, and S. Chang. Graph construction and b-matching for semi-supervised learning. In ICML, 2009.
[28] T. Joachims. Transductive inference for text classification using support vector machines. In ICML, 1999.
[29] T. Joachims. Transductive learning via spectral graph partitioning. In ICML, 2003.
[30] M. Karlen, J. Weston, A. Erkan, and R. Collobert. Large scale manifold transduction. In ICML, 2008.
[31] S.-M. Kim and E. Hovy. Determining the sentiment of opinions. In Proceedings of the 20th International Conference on Computational Linguistics, 2004.
[32] F. Kschischang, B. Frey, and H. Loeliger. Factor graphs and the sum-product algorithm. Information Theory, IEEE Transactions on, 47(2):498–519, 2001.
[33] K. Lerman, S. Blair-Goldensohn, and R. McDonald. Sentiment summarization: evaluating and learning user preferences. In EACL, 2009.
[34] D. Lewis et al. Reuters-21578. http://www.daviddlewis.com/resources/testcollections/reuters21578, 1987.
[35] J. Malkin, A. Subramanya, and J. Bilmes. On the semi-supervised learning of multi-layered perceptrons. In InterSpeech, 2009.
[36] B. Pang, L. Lee, and S. Vaithyanathan. Thumbs up?: sentiment classification using machine learning techniques. In EMNLP, 2002.
[37] D. Rao and D. Ravichandran. Semi-supervised polarity lexicon induction. In EACL, 2009.
[38] A. Subramanya and J. Bilmes. Soft-supervised learning for text classification. In EMNLP, 2008.
[39] A. Subramanya and J. Bilmes. Entropic graph regularization in non-parametric semi-supervised classification. NIPS, 2009.
[40] A. Subramanya and J. Bilmes. Semi-supervised learning with measure propagation. JMLR, 2011.

References (III)

[41] A. Subramanya, S. Petrov, and F. Pereira. Efficient graph-based semi-supervised learning of structured tagging models. In EMNLP, 2010.
[42] P. Talukdar. Topics in graph construction for semi-supervised learning. Technical report, MS-CIS-09-13, University of Pennsylvania, 2009.
[43] P. Talukdar and K. Crammer. New regularized algorithms for transductive learning. ECML, 2009.
[44] P. Talukdar and F. Pereira. Experiments in graph-based semi-supervised learning methods for class-instance acquisition. In ACL, 2010.
[45] P. Talukdar, J. Reisinger, M. Paşca, D. Ravichandran, R. Bhagat, and F. Pereira. Weakly-supervised acquisition of labeled class instances using graph random walks. In EMNLP, 2008.
[46] B. Van Durme and M. Paşca. Finding cars, goddesses and enzymes: Parametrizable acquisition of labeled instances for open-domain information extraction. In AAAI, 2008.
[47] L. Velikovich, S. Blair-Goldensohn, K. Hannan, and R. McDonald. The viability of web-derived polarity lexicons. In HLT-NAACL, 2010.
[48] F. Wang and C. Zhang. Label propagation through linear neighborhoods. In ICML, 2006.
[49] J. Wang, T. Jebara, and S. Chang. Graph transduction via alternating minimization. In ICML, 2008.
[50] R. Wang and W. Cohen. Language-independent set expansion of named entities using the web. In ICDM, 2007.
[51] K. Weinberger and L. Saul. Distance metric learning for large margin nearest neighbor classification. The Journal of Machine Learning Research, 10:207–244, 2009.
[52] T. Wilson, J. Wiebe, and P. Hoffmann. Recognizing contextual polarity in phrase-level sentiment analysis. In HLT-EMNLP, 2005.
[53] D. Zhou, O. Bousquet, T. Lal, J. Weston, and B. Schölkopf. Learning with local and global consistency. NIPS, 2004.
[54] D. Zhou, J. Huang, and B. Schölkopf. Learning from labeled and unlabeled data on a directed graph. In ICML, 2005.
[55] D. Zhou, B. Schölkopf, and T. Hofmann. Semi-supervised learning on directed graphs. In NIPS, 2005.
[56] X. Zhu and Z. Ghahramani. Learning from labeled and unlabeled data with label propagation. Technical report, CMU-CALD-02-107, Carnegie Mellon University, 2002.
[57] X. Zhu and Z. Ghahramani. Learning from labeled and unlabeled data with label propagation. Technical report, Carnegie Mellon University, 2002.
[58] X. Zhu, Z. Ghahramani, and J. Lafferty. Semi-supervised learning using Gaussian fields and harmonic functions. In ICML, 2003.
[59] X. Zhu and J. Lafferty. Harmonic mixtures: combining mixture models and graph-based methods for inductive and scalable semi-supervised learning. In ICML, 2005.

Thanks!

Web: http://graph-ssl.wikidot.com/
