Exploring Intrinsic Structures from Samples:
Supervised, Unsupervised, and Semisupervised Frameworks
Huan Wang
Multimedia Laboratory
Department of Information Engineering
The Chinese University of Hong Kong
Supervised by Prof. Xiaoou Tang & Prof. Jianzhuang Liu
Outline
• Trace Ratio Optimization & Tensor Subspace Learning: preserve sample feature structures
• Correspondence Propagation: explore the geometric structures and feature domain relations concurrently
• Notations & Introductions
Dimensionality reduction
Concept
Concept. Tensor
• Tensor: multi-dimensional (or multi-way) arrays of components
Application
Concept. Tensor
• Real-world data are affected by multifarious factors. For person identification, for example, we may have facial images with different
► views and poses
► lighting conditions
► expressions
• The observed data evolve differently along the variation of different factors
► image columns and rows
Application
Concept. Tensor
• It is desirable to dig out the intrinsic connections among the different factors affecting the data.
• Tensor provides a concise and effective representation.
[Figure: a face-image tensor whose modes correspond to image rows, image columns, images, illumination, pose, and expression.]
Introduction
Concept. Dimensionality Reduction
• Preserve sample feature structures
• Enhance classification capability
• Reduce the computational complexity
Trace Ratio Optimization. Definition
Given matrices A and B:

  argmax_W Tr(W^T A W) / Tr(W^T B W),  w.r.t. W^T W = I

• A and B are positive semidefinite.
• Homogeneous property: for any orthogonal Q (Q^T Q = Q Q^T = I),

  J(WQ) = Tr(Q^T W^T A W Q) / Tr(Q^T W^T B W Q) = J(W),

so the problem is an optimization over the Grassmann manifold.
• Orthogonality constraint: W^T W = I.
• Special case: when W is a vector w, the objective reduces to the generalized Rayleigh quotient

  argmax_w (w^T A w) / (w^T B w),

which is solved by the generalized eigenvalue decomposition (GEVD): A w = λ B w.
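The vector special case above can be checked numerically. A minimal sketch (the matrices here are random toy data, not from the thesis): the maximizer of the generalized Rayleigh quotient is the top generalized eigenvector of the GEVD A w = λ B w.

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)

# Toy positive semidefinite A; B is made strictly positive definite
# so that the generalized eigenproblem is well posed.
X = rng.standard_normal((5, 10))
A = X @ X.T
Y = rng.standard_normal((5, 10))
B = Y @ Y.T + 1e-3 * np.eye(5)

# Vector case: argmax_w (w'Aw)/(w'Bw) is the leading generalized
# eigenvector of A w = lambda B w (GEVD); eigh solves it directly.
vals, vecs = eigh(A, B)
w = vecs[:, -1]                       # eigenvector of the largest eigenvalue
rayleigh = (w @ A @ w) / (w @ B @ w)  # equals the largest eigenvalue
```

The quotient attained by `w` equals the largest generalized eigenvalue, and no other vector can exceed it; for matrix-valued W the trace ratio no longer admits such a one-shot solution, which motivates the iterative procedure later in the talk.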
Trace Ratio Formulation
• Linear Discriminant Analysis

  argmax_W (Σ_{c=1}^{N_c} n_c ‖W^T x̄_c − W^T x̄‖²) / (Σ_{i=1}^{N} ‖W^T x_i − W^T x̄_{c_i}‖²)
  = argmax_W Tr(W^T S_b W) / Tr(W^T S_w W)

where

  S_b = Σ_{c=1}^{N_c} n_c (x̄_c − x̄)(x̄_c − x̄)^T
  S_w = Σ_{i=1}^{N} (x_i − x̄_{c_i})(x_i − x̄_{c_i})^T
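The two scatter matrices can be sketched directly from their definitions (the toy data below are illustrative values, not from the thesis):

```python
import numpy as np

def lda_scatters(X, labels):
    """Between-class (Sb) and within-class (Sw) scatter matrices:
       Sb = sum_c n_c (mean_c - mean)(mean_c - mean)^T
       Sw = sum_i (x_i - mean_{c_i})(x_i - mean_{c_i})^T
    X: (N, d) sample matrix; labels: length-N class labels."""
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sb = np.zeros((d, d))
    Sw = np.zeros((d, d))
    for c in np.unique(labels):
        Xc = X[labels == c]
        mc = Xc.mean(axis=0)
        diff = (mc - mean)[:, None]
        Sb += len(Xc) * diff @ diff.T
        Sw += (Xc - mc).T @ (Xc - mc)
    return Sb, Sw

# Toy data: two Gaussian classes in 3-D.
rng = np.random.default_rng(1)
X = np.vstack([rng.standard_normal((20, 3)) + [3, 0, 0],
               rng.standard_normal((20, 3)) - [3, 0, 0]])
labels = np.array([0] * 20 + [1] * 20)
Sb, Sw = lda_scatters(X, labels)

# Sanity check: Sb + Sw equals the total scatter St.
St = (X - X.mean(0)).T @ (X - X.mean(0))
```

The identity S_b + S_w = S_t (total scatter) is a useful check on any implementation.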
Trace Ratio Formulation
• Kernel Discriminant Analysis

With W = φ(X) A, e^c the indicator vector of class c, and e the all-ones vector, the kernelized objective becomes

  J(A) = argmax_A Tr(A^T K (Σ_c (1/n_c) e^c e^{cT} − (1/N) e e^T) K A) / Tr(A^T K (I − Σ_c (1/n_c) e^c e^{cT}) K A)
       = argmax_A Tr(A^T K L^p K A) / Tr(A^T K L K A)

  w.r.t. W^T W = I  ⟺  A^T K A = I

Decompose K = K_d K_d^T and let Λ = K_d^T A; then

  J = argmax_Λ Tr(Λ^T K_d^T L^p K_d Λ) / Tr(Λ^T K_d^T L K_d Λ),  w.r.t. Λ^T Λ = I
Trace Ratio Formulation
• Marginal Fisher Analysis
Intra-class graph (intrinsic graph) W^c; inter-class graph (penalty graph) W^m.

  argmax_W (Σ_{i,j} ‖W^T x_i − W^T x_j‖² W^m_{ij}) / (Σ_{i,j} ‖W^T x_i − W^T x_j‖² W^c_{ij})
  = argmax_W Tr(W^T X (D^m − W^m) X^T W) / Tr(W^T X (D^c − W^c) X^T W)
  = argmax_W Tr(W^T X L^m X^T W) / Tr(W^T X L^c X^T W)
Trace Ratio Formulation
• Kernel Marginal Fisher Analysis

  J(A) = argmax_A Tr(A^T K L^p K A) / Tr(A^T K L K A)
  w.r.t. W^T W = I  ⟺  A^T K A = I  (W = φ(X) A)

Decompose K = K_d K_d^T and let Λ = K_d^T A; then

  J = argmax_Λ Tr(Λ^T K_d^T L^p K_d Λ) / Tr(Λ^T K_d^T L K_d Λ),  w.r.t. Λ^T Λ = I
Trace Ratio Formulation
• 2-D Linear Discriminant Analysis

With y_i = L^T x_i R,

  argmax_{L,R} (Σ_c n_c ‖L^T x̄_c R − L^T x̄ R‖²) / (Σ_i ‖L^T x_i R − L^T x̄_{c_i} R‖²)
  = argmax_L Tr(L^T (Σ_c n_c (x̄_c − x̄) R R^T (x̄_c − x̄)^T) L) / Tr(L^T (Σ_i (x_i − x̄_{c_i}) R R^T (x_i − x̄_{c_i})^T) L)
  = argmax_R Tr(R^T (Σ_c n_c (x̄_c − x̄)^T L L^T (x̄_c − x̄)) R) / Tr(R^T (Σ_i (x_i − x̄_{c_i})^T L L^T (x_i − x̄_{c_i})) R)

Left projection & right projection:
Fix one projection matrix & optimize the other.
• Discriminant Analysis with Tensor Representation

  argmax_{U^k|_{k=1}^n} (Σ_c n_c ‖(x̄_c − x̄) ×_1 U^1 … ×_n U^n‖²) / (Σ_i ‖(x_i − x̄_{c_i}) ×_1 U^1 … ×_n U^n‖²)
  = argmax_{U^k} Tr(U^{kT} S_b^k U^k) / Tr(U^{kT} S_w^k U^k)
Trace Ratio Formulation
• Tensor Subspace Analysis

  argmin_{U,V} ((1/2) Σ_{i,j} ‖U^T X_i V − U^T X_j V‖² S_{ij}) / (Σ_i ‖U^T X_i V‖² D_{ii})

For fixed V:

  argmin_U Tr(U^T (Σ_i D_{ii} X_i V V^T X_i^T − Σ_{i,j} S_{ij} X_i V V^T X_j^T) U) / Tr(U^T (Σ_i D_{ii} X_i V V^T X_i^T) U)

For fixed U:

  argmin_V Tr(V^T (Σ_i D_{ii} X_i^T U U^T X_i − Σ_{i,j} S_{ij} X_i^T U U^T X_j) V) / Tr(V^T (Σ_i D_{ii} X_i^T U U^T X_i) V)
Trace Ratio Formulation

  argmax_W Tr(W^T S_b W) / Tr(W^T S_w W)

Conventional solution (ratio trace):

  argmax_W |W^T S_b W| / |W^T S_w W| = argmax_W Tr((W^T S_w W)^{-1} (W^T S_b W)),

solved by the GEVD: S_b w = λ S_w w.
Singularity problem of S_w → Nullspace LDA, Dualspace LDA.

from Trace Ratio to Trace Difference
Preprocessing:

  argmax_W Tr(W^T S^p W) / Tr(W^T S^l W)  →  argmax_W Tr(W^T S^p W) / Tr(W^T S^t W),  where S^t = S^l + S^p

Remove the null space of S^t with Principal Component Analysis. Then

  0 ≤ Tr(W^T S^p W) / Tr(W^T S^t W) ≤ 1
from Trace Ratio to Trace Difference
What will we do?

Objective:  argmax_U Tr(U^T S^p U) / Tr(U^T S U)

Define  λ_t = Tr(U_t^T S^p U_t) / Tr(U_t^T S U_t).
Then  Tr(U_t^T (S^p − λ_t S) U_t) = 0.

Trace Ratio → Trace Difference:  g_{λ_t}(U) = Tr(U^T (S^p − λ_t S) U)

Find U_{t+1} with g_{λ_t}(U_{t+1}) ≥ g_{λ_t}(U_t) = 0, so that

  Tr(U_{t+1}^T (S^p − λ_t S) U_{t+1}) ≥ 0  ⇒  λ_{t+1} = Tr(U_{t+1}^T S^p U_{t+1}) / Tr(U_{t+1}^T S U_{t+1}) ≥ λ_t
from Trace Ratio to Trace Difference
What will we do?

Constraint: U^T U = I.
Let U_{t+1} = [u_1, u_2, …, u_{m'}], where u_1, u_2, …, u_{m'} are the leading eigenvectors of (S^p − λ_t S).
We have  g_{λ_t}(U_{t+1}) ≥ g_{λ_t}(U_t) = 0, where g_{λ_t}(U) = Tr(U^T (S^p − λ_t S) U).
Thus

  λ_{t+1} = Tr(U_{t+1}^T S^p U_{t+1}) / Tr(U_{t+1}^T S U_{t+1}) ≥ λ_t.

The objective rises monotonically!
Main Algorithm Process

Main Algorithm
1: Initialization. Initialize U_0 as an arbitrary column-orthogonal matrix.
2: Iterative optimization.
For t = 1, 2, …, T_max, do:
  1. Set λ_t = Tr(U_{t−1}^T S^p U_{t−1}) / Tr(U_{t−1}^T S U_{t−1}).
  2. Conduct the eigenvalue decomposition: (S^p − λ_t S) v_j = λ_j v_j.
  3. Reshape the projection directions: U_t = [u_1, u_2, …, u_{m'}], the leading eigenvectors.
3: Output the projection matrix.
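The procedure above can be sketched in a few lines of NumPy. This is an illustrative implementation of the iterative trace-ratio idea under the stated update rule, not the thesis code; the toy matrices and the convergence tolerance are assumptions.

```python
import numpy as np

def iterative_trace_ratio(Sp, S, m, T_max=100, tol=1e-10, seed=0):
    """Maximize Tr(U'Sp U)/Tr(U'S U) s.t. U'U = I:
    set lam = current trace ratio, then replace U with the m leading
    eigenvectors of (Sp - lam*S), and repeat until lam stabilizes."""
    d = Sp.shape[0]
    rng = np.random.default_rng(seed)
    U, _ = np.linalg.qr(rng.standard_normal((d, m)))  # arbitrary orthonormal init
    lam = -np.inf
    for _ in range(T_max):
        lam_new = np.trace(U.T @ Sp @ U) / np.trace(U.T @ S @ U)
        _, vecs = np.linalg.eigh(Sp - lam_new * S)
        U = vecs[:, -m:]                              # leading eigenvectors
        if abs(lam_new - lam) < tol:
            break
        lam = lam_new
    return U, lam_new

# Toy positive (semi)definite test matrices.
rng = np.random.default_rng(2)
Xp = rng.standard_normal((6, 12)); Sp = Xp @ Xp.T
Xw = rng.standard_normal((6, 12)); S = Xw @ Xw.T + 1e-3 * np.eye(6)
U, lam = iterative_trace_ratio(Sp, S, m=2)
```

Since the m leading eigenvectors maximize g_λ over all column-orthogonal U, each sweep can only raise the ratio, matching the monotonicity argument on the previous slide.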
Tensor Subspace Learning Algorithms

Traditional tensor discriminant algorithms:
• Tensor Subspace Analysis (He et al.)
• Two-dimensional Linear Discriminant Analysis (Ye et al.)
• Discriminant Analysis with Tensor Representation (Yan et al.)

Common traits:
• project the tensor along different dimensions or ways
• projection matrices for different dimensions are derived iteratively
• each step solves a trace ratio optimization problem
• DO NOT CONVERGE!
Discriminant Analysis Objective
Solve the projection matrices iteratively: leave one projection matrix as the variable while keeping the others constant.

  argmax_{U^k} (Σ_{i,j} ‖(X_i − X_j) ×_1 U^1 … ×_n U^n‖² W^p_{ij}) / (Σ_{i,j} ‖(X_i − X_j) ×_1 U^1 … ×_n U^n‖² W_{ij})

• No closed-form solution.

Mode-k unfolding of the tensor:

  Y_i^k = unfold_k(X_i ×_1 U^1 … ×_{k−1} U^{k−1} ×_{k+1} U^{k+1} … ×_n U^n)

  argmax_{U^k} (Σ_{i,j} ‖U^{kT} Y_i^k − U^{kT} Y_j^k‖² W^p_{ij}) / (Σ_{i,j} ‖U^{kT} Y_i^k − U^{kT} Y_j^k‖² W_{ij})
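Mode-k unfolding and the mode-k product can be sketched with NumPy; the convention below (mode k moved to the front, remaining modes kept in their original order) is one common choice, and the toy tensor is an assumption:

```python
import numpy as np

def unfold(T, k):
    """Mode-k unfolding: move mode k to the front and flatten the rest,
    giving an (I_k x prod(other dims)) matrix."""
    return np.moveaxis(T, k, 0).reshape(T.shape[k], -1)

def mode_k_product(T, U, k):
    """Mode-k product T x_k U: contract U (J x I_k) with mode k of T."""
    return np.moveaxis(np.tensordot(U, T, axes=(1, k)), 0, k)

# A 3rd-order toy tensor (e.g. rows x columns x images).
T = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)
U = np.ones((5, 3))               # hypothetical mode-1 projection
P = mode_k_product(T, U, 1)       # shape (2, 5, 4)
```

With this convention the matrix identity unfold_k(T ×_k U) = U · unfold_k(T) holds, which is exactly what lets the tensor objective be rewritten in terms of the unfolded matrices Y_i^k.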
Objective Deduction
Discriminant Analysis Objective

Trace ratio: a general formulation for the objectives of the discriminant-analysis-based algorithms.

  argmax_{U^k} Tr(U^{kT} S^{pk} U^k) / Tr(U^{kT} S^k U^k)

DATER:
  S^k = Σ_{i,j} W_{ij} (Y_i^k − Y_j^k)(Y_i^k − Y_j^k)^T — within-class scatter of the unfolded data
  S^{pk} = Σ_{i,j} W^p_{ij} (Y_i^k − Y_j^k)(Y_i^k − Y_j^k)^T — between-class scatter of the unfolded data

TSA: W is the similarity matrix constructed from the image manifold, and W^p the corresponding diagonal matrix of weights.
Disagreement between the Objective and the Optimization Process
Why do previous algorithms not converge?

  argmax_{U^1} Tr(U^{1T} S^{p1} U^1) / Tr(U^{1T} S^1 U^1)  →  argmax_{U^1} Tr((U^{1T} S^1 U^1)^{-1} (U^{1T} S^{p1} U^1))  (GEVD)

  argmax_{U^2} Tr(U^{2T} S^{p2} U^2) / Tr(U^{2T} S^2 U^2)  →  argmax_{U^2} Tr((U^{2T} S^2 U^2)^{-1} (U^{2T} S^{p2} U^2))  (GEVD)

But  Tr(A) / Tr(B) ≠ Tr(B^{-1} A).

The conversion from Trace Ratio to Ratio Trace induces an inconsistency among the objectives of different dimensions!
from Trace Ratio to Trace Difference
What will we do?

Objective:  argmax_{U^k} Tr(U^{kT} S^{pk} U^k) / Tr(U^{kT} S^k U^k)

Define  λ_t = Tr(U_t^{kT} S^{pk} U_t^k) / Tr(U_t^{kT} S^k U_t^k).
Then  Tr(U_t^{kT} (S^{pk} − λ_t S^k) U_t^k) = 0.

Trace Ratio → Trace Difference:  g_{λ_t}(U^k) = Tr(U^{kT} (S^{pk} − λ_t S^k) U^k)

Find U_{t+1}^k with g_{λ_t}(U_{t+1}^k) ≥ g_{λ_t}(U_t^k) = 0, so that

  Tr(U_{t+1}^{kT} (S^{pk} − λ_t S^k) U_{t+1}^k) ≥ 0  ⇒  λ_{t+1} = Tr(U_{t+1}^{kT} S^{pk} U_{t+1}^k) / Tr(U_{t+1}^{kT} S^k U_{t+1}^k) ≥ λ_t
from Trace Ratio to Trace Difference
What will we do?

Constraint: U^{kT} U^k = I.
Let U_{t+1}^k = [u_1, u_2, …, u_{m'_k}], where u_1, u_2, …, u_{m'_k} are the leading eigenvectors of (S^{pk} − λ_t S^k).
We have  g_{λ_t}(U_{t+1}^k) ≥ g_{λ_t}(U_t^k) = 0, where g_{λ_t}(U^k) = Tr(U^{kT} (S^{pk} − λ_t S^k) U^k).
Thus

  λ_{t+1} = Tr(U_{t+1}^{kT} S^{pk} U_{t+1}^k) / Tr(U_{t+1}^{kT} S^k U_{t+1}^k) ≥ λ_t.

The objective rises monotonically!
Projection matrices of different dimensions share the same objective.
Main Algorithm Process

Main Algorithm
1: Initialization. Initialize U_0^1, U_0^2, …, U_0^n as arbitrary column-orthogonal matrices.
2: Iterative optimization.
For t = 1, 2, …, T_max, do:
  For k = 1, 2, …, n, do:
   1. Set λ = (Σ_{i,j} ‖(X_i − X_j) ×_1 U_t^1 … ×_{k−1} U_t^{k−1} ×_{k+1} U_{t−1}^{k+1} … ×_n U_{t−1}^n‖² W^p_{ij}) / (Σ_{i,j} ‖(X_i − X_j) ×_1 U_t^1 … ×_{k−1} U_t^{k−1} ×_{k+1} U_{t−1}^{k+1} … ×_n U_{t−1}^n‖² W_{ij}).
   2. Compute S^k and S^{pk}.
   3. Conduct the eigenvalue decomposition: (S^{pk} − λ S^k) v_j = λ_j v_j.
   4. Reshape the projection directions: U_t^k = [u_1, u_2, …, u_{m'_k}], the leading eigenvectors.
3: Output the projection matrices.
Highlights of the Trace Ratio Based Algorithm

• The objective value is guaranteed to increase monotonically, and the multiple projection matrices are proved to converge.
• Only eigenvalue decomposition is applied in the iterative optimization, which makes the algorithm extremely efficient.
• The low-dimensional representations derived from the subspace learning algorithms enjoy enhanced potential classification capability.
• This is the first work to give a convergent solution to general tensor-based subspace learning.
Projection Visualization
Experimental Results
Visualization of the projection matrix W of PCA, ratio trace based LDA, and trace ratio based LDA (ITR) on the FERET database.
Face Recognition Results: Linear
Experimental Results
Comparison: Trace Ratio Based LDA vs. the Ratio Trace based LDA (PCA+LDA)
Comparison: Trace Ratio Based MFA vs. the Ratio Trace based MFA (PCA+MFA)
Face Recognition Results: Kernelization
Experimental Results
Trace Ratio Based KDA vs. the Ratio Trace based KDA
Trace Ratio Based KMFA vs. the Ratio Trace based KMFA
Results on UCI Dataset
Experimental Results
Testing classification errors on three UCI databases for both linear and kernel-based algorithms. Results are obtained from 100 realizations of randomly generated 70/30 splits of data.
Monotonicity of the Objective & Projection Matrix Convergence
Experimental Results
Face Recognition Results
Experimental Results
1. TMFA TR mostly outperforms all the other methods concerned in this work, with only one exception for the case G5P5 on the CMU PIE database.
2. For vector-based algorithms, the trace ratio based formulation is consistently superior to the ratio trace based one for subspace learning.
3. Tensor representation has the potential to improve the classification performance for both trace ratio and ratio trace formulations of subspace learning.
Correspondence Propagation
Geometric Structures & Feature Structures
Explore the geometric structures and feature domain consistency for object registration
Objective
Aim
• Exploit the geometric structures of sample features
• Introduce human interaction for correspondence guidance
• Seek a mapping of features from sets of different cardinalities
• Objects are represented as sets of feature points
Graph Construction
Spatial Graph & Similarity Graph
From Spatial Graph to Categorical Product Graph

Assignment Neighborhood Definition
Definition: Suppose {i_1, j_1} and {i_2, j_2} are vertices of graphs G_1 and G_2 respectively. Two assignments m_{i_1 i_2} = {i_1, i_2} and m_{j_1 j_2} = {j_1, j_2} are neighbors iff both pairs {i_1, j_1} and {i_2, j_2} are neighbors in G_1 and G_2 respectively, namely

  m_{i_1 i_2} ~ m_{j_1 j_2}  iff  i_1 ~ j_1 and i_2 ~ j_2,

where a ~ b means a and b are neighbors.

Example: for feature sets A = {a1, a2, a3, a4, a5, a6} and B = {b1, b2, b3}, the assignment set is

  A × B = {(a1, b1), (a1, b2), (a1, b3), (a2, b1), …, (a6, b3)}.
From Spatial Graph to Categorical Product Graph

  G_a = G_1 ⊗ G_2

The adjacency matrix W_a of G_a can be derived from

  W_a = W_2 ⊗ W_1,

where ⊗ is the matrix Kronecker product operator.

Smoothness along the spatial distribution:

  M^T L_a M = (1/2) Σ_{ij} w^a_{ij} (m_i − m_j)²
Feature Domain Consistency & Soft Constraints

Similarity measure: Σ(S ⊙ M), where ⊙ is the matrix Hadamard product and Σ(T) returns the sum of all elements in T.

One-to-one correspondence penalty:

  Tr((A_1^T M − e_{N_1})(A_1^T M − e_{N_1})^T) + Tr((A_2^T M − e_{N_2})(A_2^T M − e_{N_2})^T),

where A_1 = e_{N_2} ⊗ I_{N_1} and A_2 = I_{N_2} ⊗ e_{N_1}.
Assignment Labeling

Labeled assignments: reliable correspondences & inhomogeneous pairs.

Inhomogeneous pair labeling: assign zeros to those pairs with extremely low similarity scores, M_{i+(j−1)N_1} = 0.

Reliable pair labeling: assign ones to those reliable pairs, M_{i+(j−1)N_1} = 1.
Reliable Correspondence Propagation

Arrangement: reorder the variables into labeled (l) and unlabeled (u) parts.

Assignment variables:  M* = [M_l; M_u]
Coefficient matrices:  A_1* = [A_1^l; A_1^u],  A_2* = [A_2^l; A_2^u]
Similarity scores:  S* = [S_l; S_u]
Spatial adjacency matrices:

  W_a* = [ W_a^{ll}  W_a^{lu} ]      L_a* = [ L_a^{ll}  L_a^{lu} ]
         [ W_a^{ul}  W_a^{uu} ],            [ L_a^{ul}  L_a^{uu} ]
Objective
Reliable Correspondence Propagation

Objective:

  min_{M*}  −S*^T M* + M*^T L_a* M*
            + Tr((A_1*^T M* − e_{N_1})(A_1*^T M* − e_{N_1})^T) + Tr((A_2*^T M* − e_{N_2})(A_2*^T M* − e_{N_2})^T)

Feature domain agreement: S*^T M*
Geometric smoothness regularization: M*^T L_a* M*
One-to-one correspondence penalty: Tr((A_1*^T M* − e_{N_1})(A_1*^T M* − e_{N_1})^T) + Tr((A_2*^T M* − e_{N_2})(A_2*^T M* − e_{N_2})^T)
Solution
Reliable Correspondence Propagation

Relax M* to the real domain; closed-form solution:

  M_u = C_uu^{-1} (B_u − C_ul M_l),

where

  C = A_1* A_1*^T + A_2* A_2*^T + L_a* = [ C_ll  C_lu ]
                                         [ C_ul  C_uu ]

and

  B = [B_l; B_u] = A_1* e_{N_1} + A_2* e_{N_2} + (1/2) S*.
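The propagation step itself is a single linear solve. A minimal numerical sketch with hypothetical sizes (the random C and B below merely stand in for the matrices built from A_1*, A_2*, L_a*, and S*):

```python
import numpy as np

rng = np.random.default_rng(3)
n_l, n_u = 4, 6            # labeled / unlabeled assignment counts (toy sizes)
n = n_l + n_u

# A symmetric positive definite C stands in for A1*A1*^T + A2*A2*^T + La*.
R = rng.standard_normal((n, n))
C = R @ R.T + n * np.eye(n)
B = rng.standard_normal(n)

C_ul = C[n_l:, :n_l]
C_uu = C[n_l:, n_l:]
B_u = B[n_l:]
M_l = rng.uniform(size=n_l)          # known (labeled) assignment scores

# Closed-form propagation: M_u = C_uu^{-1} (B_u - C_ul M_l).
M_u = np.linalg.solve(C_uu, B_u - C_ul @ M_l)
M = np.concatenate([M_l, M_u])
```

Stacking M_l and M_u back together, the unlabeled rows of the stationarity condition C M = B are satisfied exactly, which is what the closed form encodes.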
Rearrangement & Discretization

Inverse process of the element arrangement: M* → M.
Reshape the assignment vector into a matrix: M → M̃.
Thresholding: assignments larger than a threshold are regarded as correspondences.
Eliciting: sequentially pick the assignments with the largest assignment scores.
Semi-supervised & Automatic Systems
Semi-supervised & Unsupervised Frameworks

Exact pairwise correspondence labeling: users give exact correspondence guidance.
Obscure correspondence guidance: rough correspondence of image parts.
Experimental Results. Demonstration
Experiment. Dataset
Experimental Results. Details
Automatic feature matching score on the Oxford real image transformation dataset. The transformations include viewpoint change ((a) Graffiti and (b) Wall sequence), image blur ((c) bikes and (d) trees sequence), zoom and rotation ((e) bark and (f) boat sequence), illumination variation ((g) leuven ) and JPEG compression ((h) UBC).
Summary

Future Works
• From point-to-point correspondence to set-to-set correspondence.
• Multi-scale correspondence searching.
• Combine the object segmentation and registration.
Publications:
[1] Huan Wang, Shuicheng Yan, Thomas Huang and Xiaoou Tang, "A Convergent Solution to Tensor Subspace Learning", International Joint Conference on Artificial Intelligence (IJCAI 07, regular paper), Jan. 2007.
[2] Huan Wang, Shuicheng Yan, Thomas Huang and Xiaoou Tang, "Trace Ratio vs. Ratio Trace for Dimensionality Reduction", IEEE Conference on Computer Vision and Pattern Recognition (CVPR 07), Jun. 2007.
[3] Huan Wang, Shuicheng Yan, Thomas Huang, Jianzhuang Liu and Xiaoou Tang, "Transductive Regression Piloted by Inter-Manifold Relations", International Conference on Machine Learning (ICML 07), Jun. 2007.
[4] Huan Wang, Shuicheng Yan, Thomas Huang and Xiaoou Tang, "Maximum Unfolded Embedding: Formulation, Solution, and Application for Image Clustering", ACM International Conference on Multimedia (ACM MM 06), Oct. 2006.
[5] Shuicheng Yan, Huan Wang, Thomas Huang and Xiaoou Tang, "Ranking with Uncertain Labels", IEEE International Conference on Multimedia & Expo (ICME 07), May 2007.
[6] Shuicheng Yan, Huan Wang, Xiaoou Tang and Thomas Huang, "Exploring Feature Descriptors for Face Recognition", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 07, oral), Apr. 2007.
Thank You!
Transductive Regression on Multi-Class Data
Explore the intrinsic feature structures w.r.t. different classes for regression
Regression Algorithms. Reviews
Exploit the manifold structures to guide the regression.

Belkin et al., "Regularization and Semi-supervised Learning on Large Graphs":
• transduces the function values from the labeled data to the unlabeled ones utilizing local neighborhood relations;
• global optimization for a robust prediction.

Cortes et al., "On Transductive Regression":
• Tikhonov regularization on the Reproducing Kernel Hilbert Space (RKHS).

Classification can be regarded as a special version of regression.

Fei Wang et al., "Label Propagation Through Linear Neighborhoods":
• an iterative procedure is deduced to propagate the class labels within local neighborhoods and has been proved convergent;
• regression values are constrained to 0 and 1 (binary): samples belonging to the corresponding class → 1, otherwise → 0;
• the convergence point can be deduced from the regularization framework.
  f* = argmin_{f ∈ H_K} (1/n) Σ_{i=1}^n V(f(x_i), y_i) + γ ‖f‖²_{H_K}

  f* = argmin_{f ∈ H_K} (1/l) Σ_{i=1}^l V(f(x_i), y_i) + γ_A ‖f‖²_{H_K} + (γ_I / (u+l)²) f^T L f
The Problem We are Facing

• Age estimation w.r.t. different genders (FG-NET Aging Database)
• Pose estimation w.r.t. different genders, illuminations, expressions, and persons (CMU-PIE Dataset)
The Problem We are Facing
• All samples are considered as in the same class
• Samples close in the data space X are assumed to have similar function values (smoothness along the manifold)
• For the incoming sample, no class information is given.
• Utilize class information in the training process to boost the performance
Regression on Multi-Class Samples.
Traditional Algorithms
• The class information is easy to obtain for the training data
TRIM. Intra-Manifold Regularization

• Respective intrinsic graphs are built for the different sample classes.
• Correspondingly, the intra-manifold regularization items for the different classes are calculated separately (intrinsic graph).
• The regularization: f^T L^p f

  when p = 1:  f^T L f = (1/2) Σ_{i,j} (f_i − f_j)² W_{ij}

  when p = 2:  f^T L^T L f = Σ_i ‖f_i − Σ_{j: j~i} w_{ij} f_j‖²,  w.r.t. Σ_j w_{ij} = 1, w_{ij} ≥ 0

• It may not be proper to preserve smoothness between samples from different classes.
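Both quadratic forms can be verified on a toy graph (the weights below are assumptions; the rows of W are normalized to sum to one, so L = I − W):

```python
import numpy as np

# Symmetric toy graph with row sums equal to 1.
W = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])
L = np.eye(3) - W                 # for unit row sums, D - W = I - W
f = np.array([1.0, 2.0, 4.0])

# p = 1: f'Lf equals the pairwise smoothness sum.
p1 = f @ L @ f
p1_sum = 0.5 * sum(W[i, j] * (f[i] - f[j]) ** 2
                   for i in range(3) for j in range(3))

# p = 2: f'L'Lf equals the reconstruction error from the neighbors.
p2 = f @ L.T @ L @ f
p2_sum = sum((f[i] - W[i] @ f) ** 2 for i in range(3))
```

The p = 1 form penalizes differences across edges; the p = 2 form penalizes each sample's deviation from the convex combination of its neighbors, matching the two regularizers above.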
The algorithm
TRIM. Inter-Manifold Regularization
• Assumptions
Samples with similar labels lie generally in similar relative positions on the corresponding sub-manifolds.
• Motivation
1. Align the sub-manifolds of different class samples according to the labeled points and graph structures.
2. Derive the correspondence in the aligned space using the nearest neighbor technique.
The algorithm
TRIM. Manifold Alignment

• Minimize the correspondence error on the landmark points.
• Hold the intra-manifold structures.

  {f^{k*}}|_{k=1}^M = argmin Σ_{k_i} C(f^{k_i} | f^{k_j}|_{j≠i}) + Σ_k f^{kT} D^k f^k,

where

  C(f^{k_i} | f^{k_j}) = Σ_{landmark pairs (x_i^{k_i}, x_j^{k_j})} w^{k_i k_j}_{ij} ‖f_i^{k_i} − f_j^{k_j}‖² + f^{k_i T} L^p_{k_i} f^{k_i} + f^T L^a f.

• The item f^T L^a f is a global compactness regularization, and L^a is the Laplacian matrix of W^a:

  w^a_{ij} = 1 if x_i and x_j are of different classes; 0 otherwise.
TRIM. Inter-Manifold Regularization

• Concatenate the derived inter-manifold graphs to form W^r:

  W^r = [ O      W^{12}  ...  W^{1M} ]
        [ W^{21} O       ...  W^{2M} ]
        [ ...    ...     ...  ...    ]
        [ W^{M1} W^{M2}  ...  O      ]

• Laplacian regularization: f^T L^r f
Objective Deduction
TRIM. Objective

  f* = argmin_{f ∈ H_K}  Σ_k (1/l_k) Σ_{x_i^k ∈ X_l^k} ‖f(x_i^k) − y_i^k‖² + γ_A ‖f‖²_K
        + γ_I Σ_k (1/(N_k)²) f^{kT} L^p_k f^k + γ_r f^T L^r f

• Fitness item: Σ_k (1/l_k) Σ_{x_i^k ∈ X_l^k} ‖f(x_i^k) − y_i^k‖²
• RKHS norm: ‖f‖²_K
• Intra-manifold regularization: Σ_k (1/(N_k)²) f^{kT} L^p_k f^k
• Inter-manifold regularization: f^T L^r f
Solution
TRIM. Solution

• The solution to the minimization of the objective admits an expansion (generalized representer theorem):

  f*(x) = Σ_{i=1}^{Σ_k (l_k + u_k)} α_i K(x, x_i)

Thus the minimization over the Hilbert space boils down to minimizing over the coefficient vector

  α = [α_1^1, …, α_{l_1}^1, …, α_{u_1}^1, …, α_1^M, …, α_{l_M}^M, …, α_{u_M}^M]^T ∈ R^N.

The minimizer is given by

  α* = J^{-1} Σ_k (1/l_k) (S_l^k S^k)^T Y_l^k,

where

  J = Σ_k (1/l_k) (S_l^k S^k)^T (S_l^k S^k) K + γ_A I + γ_I Σ_k (1/(N_k)²) (S^k)^T L^p_k S^k K + γ_r L^r K,

  S_l^k = (I_{l_k × l_k}, O_{l_k × u_k}),  S^k = (O, I_{N_k × N_k}, O),

and K is the N × N Gram matrix of the labeled and unlabeled points over all the sample classes.
Solution
TRIM. Generalization

• For out-of-sample data, the labels can be estimated using

  y_new = Σ_{i=1}^{Σ_k (l_k + u_k)} α_i K(x_i, x_new).

Note that in this framework the class information of the incoming sample is not required in the prediction stage.

Original version without kernel:

  f* = argmin_f Σ_k (1/l_k) Σ_{x_i^k ∈ X_l^k} ‖f_i − y_i^k‖² + γ_I Σ_k (1/(N_k)²) f^{kT} L^p_k f^k + γ_r f^T L^r f
Two Moons
Experiments
YAMAHA Dataset
Experiments. Age Dataset
TRIM vs. the traditional graph-Laplacian-regularized regression: training-set evaluation on the YAMAHA database.
Open set evaluation for the kernelized regression on the YAMAHA database. (left) Regression on the training set. (right) Regression on out-of-sample data