
Linear and Non-Linear Dimensionality Reduction

Alexander Schulz
aschulz(at)techfak.uni-bielefeld.de

University of Pisa, Pisa

04.05.2015 and 07.05.2015

Overview

Dimensionality Reduction
- Motivation
- Linear Projections
- Linear Mappings in Feature Space
- Neighbor Embedding
- Manifold Learning

Advances in Dimensionality Reduction
- Speedup for Neighbor Embeddings
- Quality Assessment of DR
- Feature Relevance for DR
- Visualization of Classifiers
- Supervised Dimensionality Reduction

Curse of Dimensionality [Lee and Verleysen, 2007]

- high-dimensional spaces are almost empty
- hypervolume concentrates in a thin shell close to the surface

Why use Dimensionality Reduction?

[Example: a handwritten digit given as a raw vector of pixel intensities, e.g. 0 0 0 ... 81 34 0 ... 70 255 255 175 ... 63 52 0 ... 0; the raw numbers are hard to interpret directly]

Principal Component Analysis (PCA)

variance maximization

- maximize $\mathrm{var}(w^\top x_i)$ with $\|w\| = 1$
- $= \frac{1}{N}\sum_i (w^\top x_i)^2$
- $= \frac{1}{N}\sum_i w^\top x_i x_i^\top w$
- $= w^\top C w$
- $\Rightarrow$ eigenvectors of the covariance matrix $C$ are optimal
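To make the derivation concrete, here is a minimal NumPy sketch (my own illustration, not code from the slides) that forms the covariance matrix of centered data and projects onto its leading eigenvectors:

```python
# Minimal PCA sketch: project centered data onto the leading eigenvectors of C.
import numpy as np

def pca(X, n_components=2):
    """X: (N, D) data matrix. Returns the (N, n_components) projection."""
    Xc = X - X.mean(axis=0)                    # center the data
    C = Xc.T @ Xc / len(Xc)                    # covariance matrix C = (1/N) sum_i x_i x_i^T
    eigvals, eigvecs = np.linalg.eigh(C)       # eigh: ascending eigenvalues for symmetric C
    W = eigvecs[:, ::-1][:, :n_components]     # directions w maximizing w^T C w
    return Xc @ W

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))
    print(pca(X).shape)                        # (100, 2)
```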

PCA in Action

[Figure: PCA projection example, 3D data and its 2D projection; from [Gisbrecht and Hammer, 2015]]

Distance Preservation (CMDS)

- distances $\delta_{ij} = \|x_i - x_j\|_2$, $d_{ij} = \|y_i - y_j\|_2$
- objective: $\delta_{ij} \approx d_{ij}$
- distances $d_{ij}$ and similarities $s_{ij}$ can be transformed into each other
- $S = U \Lambda U^\top$: eigendecomposition of the matrix of pairwise similarities
- the best low-rank approximation of $S$ in the Frobenius norm is $\hat S = \hat U \hat\Lambda \hat U^\top$, keeping the largest eigenvalues
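A minimal classical MDS sketch of this recipe, as an illustration only: distances are converted into similarities by the usual double-centering convention (an assumption, not stated on the slide), the similarity matrix is eigendecomposed, and only the largest eigenvalues are kept.

```python
# Classical MDS (CMDS) sketch: distances -> similarities -> truncated eigendecomposition.
import numpy as np

def cmds(D, n_components=2):
    """D: (N, N) matrix of pairwise Euclidean distances. Returns an (N, n_components) embedding."""
    N = D.shape[0]
    J = np.eye(N) - np.ones((N, N)) / N          # centering matrix
    S = -0.5 * J @ (D ** 2) @ J                  # similarities (Gram matrix) from distances
    eigvals, eigvecs = np.linalg.eigh(S)         # S = U Lambda U^T
    idx = np.argsort(eigvals)[::-1][:n_components]
    L = np.clip(eigvals[idx], 0, None)           # keep the largest eigenvalues
    return eigvecs[:, idx] * np.sqrt(L)          # Y = U_d Lambda_d^{1/2}

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 4))
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    print(cmds(D).shape)                         # (50, 2)
```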

Manifold Learning

learn a linear manifold

- represent data as projections on an unknown direction $w$: $C = \frac{1}{2N}\sum_i \|x_i - y_i w\|^2$
- What are the best parameters $y_i$ and $w$?

PCA

three ways to obtain PCA

- maximize the variance of a linear projection
- preserve distances
- find a linear manifold such that errors are minimal in an L2 sense

Kernel PCA [Schölkopf et al., 1998]

Idea

- apply a fixed nonlinear preprocessing $\phi(x)$
- perform standard PCA in feature space
- How to apply the kernel trick here?
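A minimal kernel PCA sketch of the idea above, as an illustration only: an RBF kernel matrix (the kernel choice and bandwidth are assumptions) is centered in feature space and its top eigenvectors give the embedding of the training points; see [Schölkopf et al., 1998] for the full derivation.

```python
# Kernel PCA sketch: kernel matrix -> centering in feature space -> top eigenvectors.
import numpy as np

def kernel_pca(X, n_components=2, gamma=1.0):
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-gamma * sq)                       # RBF kernel matrix (assumed kernel choice)
    N = K.shape[0]
    J = np.eye(N) - np.ones((N, N)) / N
    Kc = J @ K @ J                                # centering in feature space
    eigvals, eigvecs = np.linalg.eigh(Kc)
    idx = np.argsort(eigvals)[::-1][:n_components]
    alphas = eigvecs[:, idx] / np.sqrt(np.clip(eigvals[idx], 1e-12, None))  # unit-norm directions in feature space
    return Kc @ alphas                            # projections of the training points

if __name__ == "__main__":
    X = np.random.default_rng(0).normal(size=(60, 3))
    print(kernel_pca(X).shape)                    # (60, 2)
```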

Kernel PCA in Action

[Figure: kernel PCA projection example; from [Gisbrecht and Hammer, 2015]]

Stochastic Neighbor Embedding (SNE) [Hinton and Roweis, 2002]

- introduce a probabilistic neighborhood in the input space
  $$p_{j|i} = \frac{\exp(-0.5\,\|x_i - x_j\|^2/\sigma_i^2)}{\sum_{k \neq i} \exp(-0.5\,\|x_i - x_k\|^2/\sigma_i^2)}$$
- and in the output space
  $$q_{j|i} = \frac{\exp(-0.5\,\|y_i - y_j\|^2)}{\sum_{k \neq i} \exp(-0.5\,\|y_i - y_k\|^2)}$$
- optimize the sum of Kullback-Leibler divergences
  $$C = \sum_i KL(P_i, Q_i) = \sum_i \sum_{j \neq i} p_{j|i} \log\frac{p_{j|i}}{q_{j|i}}$$
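The following sketch computes exactly these quantities, as an illustration only: conditional neighborhood probabilities in input and output space and the summed KL cost. A fixed $\sigma_i$ is assumed here; SNE actually tunes each $\sigma_i$ via a perplexity parameter and minimizes the cost by gradient descent.

```python
# SNE quantities: conditional probabilities p_{j|i}, q_{j|i} and the KL cost.
import numpy as np

def conditional_probs(Z, sigma2=None):
    """Row i holds p_{j|i} (sigma2 per point); with sigma2=None, sigma^2 = 1 as in the output space."""
    sq = np.sum((Z[:, None, :] - Z[None, :, :]) ** 2, axis=-1)
    if sigma2 is None:
        sigma2 = np.ones(len(Z))
    logits = -0.5 * sq / sigma2[:, None]
    np.fill_diagonal(logits, -np.inf)             # exclude j = i
    P = np.exp(logits)
    return P / P.sum(axis=1, keepdims=True)

def sne_cost(X, Y, sigma2):
    P = conditional_probs(X, sigma2)              # p_{j|i} in the input space
    Q = conditional_probs(Y)                      # q_{j|i} in the output space
    mask = P > 0
    return np.sum(P[mask] * np.log(P[mask] / Q[mask]))   # sum_i KL(P_i, Q_i)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X, Y = rng.normal(size=(30, 5)), rng.normal(size=(30, 2))
    print(sne_cost(X, Y, sigma2=np.ones(30)))
```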

Neighbor Retrieval Visualizer (NeRV) [Venna et al., 2010]

- $\mathrm{precision}(i) = \frac{N_{TP,i}}{k_i} = 1 - \frac{N_{FP,i}}{k_i}$, $\quad \mathrm{recall}(i) = \frac{N_{TP,i}}{r_i} = 1 - \frac{N_{MISS,i}}{r_i}$

Neighbor Retrieval Visualizer (NeRV)

- $KL(P_i, Q_i)$ generalizes recall
- $KL(Q_i, P_i)$ generalizes precision
- NeRV optimizes
  $$C = \lambda \sum_i KL(P_i, Q_i) + (1 - \lambda) \sum_i KL(Q_i, P_i)$$
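A minimal sketch of this cost, reusing SNE-style conditional probabilities as computed above; this only evaluates the $\lambda$-weighted sum of the two KL directions, while NeRV itself optimizes it with respect to the embedding $Y$.

```python
# NeRV cost sketch: lambda-weighted sum of KL(P_i, Q_i) and KL(Q_i, P_i).
import numpy as np

def kl_rows(A, B):
    mask = A > 0
    return np.sum(A[mask] * np.log(A[mask] / B[mask]))

def nerv_cost(P, Q, lam=0.5):
    """P, Q: (N, N) conditional neighborhood probabilities (rows sum to 1, zero diagonal)."""
    return lam * kl_rows(P, Q) + (1.0 - lam) * kl_rows(Q, P)
```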

t-Distributed SNE (t-SNE) [van der Maaten and Hinton, 2008]

- symmetrized probabilities p and q
- uses a Student-t distribution in the output space

Neighbor Embedding in Action

[Figure: neighbor embedding projection example; from [Gisbrecht and Hammer, 2015]]

Maximum Variance Unfolding (MVU) [Weinberger and Saul, 2006]

- goal: 'unfold' a given manifold while keeping all the local distances and angles fixed
- maximize $\sum_{ij} \|y_i - y_j\|^2$ s.t. $\sum_i y_i = 0$ and $\|y_i - y_j\|^2 = \|x_i - x_j\|^2$ for all neighbors

Manifold Learner in Action

[Figure: manifold learner projection example; from [Gisbrecht and Hammer, 2015]]

Overview

Dimensionality Reduction
- Motivation
- Linear Projections
- Linear Mappings in Feature Space
- Neighbor Embedding
- Manifold Learning

Advances in Dimensionality Reduction
- Speedup for Neighbor Embeddings
- Quality Assessment of DR
- Feature Relevance for DR
- Visualization of Classifiers
- Supervised Dimensionality Reduction

Barnes-Hut Approach [Yang et al., 2013, van der Maaten, 2013]

Complexity of NE

- neighbor embeddings have complexity $O(N^2)$
- the matrices $P$ and $Q$ have $N^2$ entries
- the gradient requires a summation over all pairs:
  $$\frac{\partial C}{\partial y_i} = \sum_{j \neq i} g_{ij}\,(y_i - y_j)$$

[Figure from [van der Maaten, 2013]]

Barnes-Hut Approach

Barnes-Hut

- approximate the gradient $\frac{\partial C}{\partial y_i} = \sum_{j \neq i} g_{ij}\,(y_i - y_j)$ as
  $$\sum_{j \neq i} g_{ij}\,(y_i - y_j) \approx \sum_t |G^i_t| \cdot g_{it}\,(y_i - y^i_t)$$
  where $G^i_t$ collects the points summarized by the $t$-th cell of a space-partitioning tree and $y^i_t$ is its representative point
- approximate $P$ as a sparse matrix
- results in an $O(N \log N)$ algorithm
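A usage sketch, assuming scikit-learn is available: its TSNE implementation provides the Barnes-Hut approximation, with `angle` controlling the accuracy/speed trade-off of the tree approximation.

```python
# Barnes-Hut t-SNE via scikit-learn (method='barnes_hut' is the O(N log N) variant).
import numpy as np
from sklearn.manifold import TSNE

X = np.random.default_rng(0).normal(size=(1000, 50))
Y = TSNE(n_components=2, method="barnes_hut", angle=0.5, perplexity=30,
         init="pca", random_state=0).fit_transform(X)
print(Y.shape)   # (1000, 2)
```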

Barnes-Hut SNE

[Figure from [van der Maaten, 2013]]

Kernel t-SNE [Gisbrecht et al., 2015]

- $O(N)$ algorithm: apply NE to a fixed subset, map the remainder with an out-of-sample projection
- how to obtain an out-of-sample extension?
- use the kernel mapping
  $$x \mapsto y(x) = \sum_j \alpha_j \cdot \frac{k(x, x_j)}{\sum_l k(x, x_l)} = A\,k$$
- minimization of $\sum_i \|y_i - y(x_i)\|^2$ yields $A = Y \cdot K^{-1}$
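A minimal sketch under stated assumptions: t-SNE is run on a landmark subset (here via scikit-learn), a row-normalized Gaussian kernel matrix is built on the landmarks, and the linear map $A = Y K^{-1}$ (computed here via a pseudo-inverse for stability, with rows as points) maps the remaining data. The bandwidth heuristic is an assumption of this sketch.

```python
# Kernel t-SNE sketch: t-SNE on landmarks + kernel mapping for the rest of the data.
import numpy as np
from sklearn.manifold import TSNE

def normalized_kernel(X, landmarks, sigma):
    sq = np.sum((X[:, None, :] - landmarks[None, :, :]) ** 2, axis=-1)
    K = np.exp(-0.5 * sq / sigma**2)
    return K / K.sum(axis=1, keepdims=True)      # rows: k(x, x_j) / sum_l k(x, x_l)

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 20))
landmarks = X[:200]                              # fixed subset
Y_land = TSNE(n_components=2, random_state=0).fit_transform(landmarks)

sigma = np.median(np.linalg.norm(landmarks[:, None] - landmarks[None, :], axis=-1))
K = normalized_kernel(landmarks, landmarks, sigma)
A = np.linalg.pinv(K) @ Y_land                   # solves min sum_i ||y_i - y(x_i)||^2
Y_rest = normalized_kernel(X[200:], landmarks, sigma) @ A   # map the remainder in O(N)
print(Y_rest.shape)                              # (1800, 2)
```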

Kernel t-SNE

[Figure from [Gisbrecht et al., 2015]]

Quality Assessment of DR [Lee et al., 2013]

- most popular measure
  $$Q_k(X, Y) = \sum_i \frac{|N_k(\vec x_i) \cap N_k(\vec y_i)|}{N\,k}$$
  where $N_k(\cdot)$ denotes the $k$-neighborhood of a point
- a rescaling was added recently
- recently used to compare many DR techniques
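The measure is easy to compute directly; here is a brute-force sketch of the formula above (my own illustration): for each point, count how many of its $k$ nearest neighbors in the original space remain among its $k$ nearest neighbors in the embedding, averaged over $N k$.

```python
# Neighborhood-preservation measure Q_k.
import numpy as np

def q_k(X, Y, k=10):
    def knn(Z):
        D = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
        np.fill_diagonal(D, np.inf)              # a point is not its own neighbor
        return np.argsort(D, axis=1)[:, :k]      # indices of the k nearest neighbors
    NX, NY = knn(X), knn(Y)
    overlap = sum(len(set(a) & set(b)) for a, b in zip(NX, NY))
    return overlap / (len(X) * k)                # in [0, 1], 1 = perfect neighborhood preservation

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 10))
    print(q_k(X, X[:, :2], k=10))
```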

Quality Assessment of DR

[Figure from [Peluffo-Ordóñez et al., 2014]]

Feature Relevance for DR

- Which features are important for a given projection?

data set: ESANN participants

Partic. | University | ESANN paper | #publications | ... | likes beer | ...
1       | A          | 1           | 15            | ... | 1          | ...
2       | A          | 0           | 8             | ... | -1         | ...
3       | B          | 1           | 22            | ... | -1         | ...
4       | C          | 1           | 9             | ... | 0          | ...
5       | C          | 0           | 15            | ... | -1         | ...
6       | D          | ...         | ...           | ... | ...        | ...
...

Visualization of ESANN participants

[Figure: 2D projection of the participants, axes Dim 1 and Dim 2]

Relevance of features for the projection

[Figure: bar chart of relevance values for features 1-10]

Visualization of ESANN participants

[Figure: the same 2D projection (axes Dim 1, Dim 2), points colored by the 'likes beer' feature (1, 0, -1)]

Aim

- estimate the relevance of single features for non-linear dimensionality reductions

Idea

- change the influence of a single feature and observe the change in the quality

Evaluation of Feature Relevance for DR [Schulz et al., 2014a]

NeRV cost function [Venna et al., 2010] $Q_k^{\mathrm{NeRV}}$

- interpretation from an information retrieval perspective
- $d(\vec x^i, \vec x^j)^2 = \sum_l (x_l^i - x_l^j)^2$ becomes $\sum_l \lambda_l^2\,(x_l^i - x_l^j)^2$
- $\lambda_{\mathrm{NeRV}}^k(l) := \lambda_l^2$ where $\lambda$ optimizes $Q_k^{\mathrm{NeRV}}(X_\lambda, Y) + \delta \sum_l \lambda_l^2$

Toy data set 1

[Figures: 3D toy data with classes C1, C2, C3 (axes Dim 1-3); a 2D projection (axes Dim I, Dim II); and the relevance profiles λ_NeRV of Dim 1-3 as a function of the neighborhood size]

Toy data set 2

[Figures: 3D toy data with classes C1, C2, C3 (axes Dim 1-3) and two different 2D projections (axes Dim I, Dim II)]

Relevances for different projections

[Figures: the two 2D projections of toy data set 2 (axes Dim I, Dim II; classes C1-C3) and the corresponding relevance profiles λ_NeRV of Dim 1-3 as a function of the neighborhood size]


98.7%

Why visualize Classifiers?

Where is the Difficulty?

Class borders are

- often non-linear
- often not given in an explicit functional form (e.g. SVM)
- high-dimensional, which makes it infeasible to sample them for a projection

An illustration of the approach

[Figures: 3D data with class 1 and class 2 (axes dim1, dim2, dim3)]

- project data to 2D

An illustration: dimensionality reduction

[Figures: the 2D projection (axes dim I, dim II; class 1, class 2, SVs) and the 2D samples projected back into the 3D space (axes dim1-dim3)]

- sample the 2D data space
- project the samples up
- classify them

An illustration: border visualization

- color intensity codes the certainty of the classifier

Classifier visualization [Schulz et al., 2014b]

- project data to 2D
- sample the 2D data space
- project the samples up
- classify them

[Figures: the 2D projection (class 1, class 2, SVs), the up-projected 2D samples in 3D, and the resulting border visualization]
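A sketch of this four-step recipe, using PCA as a stand-in projection because it has an explicit inverse mapping; the paper [Schulz et al., 2014b] uses (discriminative) nonlinear DR with a learned inverse instead. scikit-learn is assumed to be available, and the toy data and classifier are illustrative choices.

```python
# Classifier-visualization pipeline sketch: project down, sample the 2D plane,
# project the samples up, classify them, and use the certainty for coloring.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
labels = (X[:, 0] + X[:, 1] ** 2 > 0.5).astype(int)
clf = SVC(probability=True).fit(X, labels)        # trained classifier to visualize

# 1) project data to 2D
dr = PCA(n_components=2).fit(X)
Z = dr.transform(X)

# 2) sample the 2D data space on a regular grid
xx, yy = np.meshgrid(np.linspace(Z[:, 0].min(), Z[:, 0].max(), 100),
                     np.linspace(Z[:, 1].min(), Z[:, 1].max(), 100))
grid_2d = np.c_[xx.ravel(), yy.ravel()]

# 3) project the samples up into the original space
grid_hd = dr.inverse_transform(grid_2d)

# 4) classify them; the certainty can be shown as color intensity behind the projected data
certainty = clf.predict_proba(grid_hd)[:, 1].reshape(xx.shape)
print(certainty.shape)                            # (100, 100)
```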

Toy Data Set 1

[Figures: 3D data with classes C1 and C2 (axes dim1, dim2, dim3) and its classifier visualization]

Toy Data Set 2

[Figures: 3D data with classes C1 and C2 (axes dim1, dim2, dim3) and its classifier visualization]

Toy Data Set 2 with NE

[Figures: 3D data with classes C1 and C2 (axes dim1, dim2, dim3) and its classifier visualization]

DR: intrinsically 3D data

[Figures: intrinsically 3D data, classes C1 and C2]


DR: NE projection

Supervised dimensionality reduction

- use the Fisher metric [Peltonen et al., 2004, Gisbrecht et al., 2015]: $d(x, x + dx) = dx^\top J(x)\, dx$
- $J(x) = E_{p(c|x)}\left\{ \left(\frac{\partial}{\partial x} \log p(c|x)\right) \left(\frac{\partial}{\partial x} \log p(c|x)\right)^\top \right\}$
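A minimal sketch of estimating $J(x)$ from any probabilistic classifier $p(c|x)$: the gradients of $\log p(c|x)$ are approximated by finite differences and the expectation is taken under $p(c|x)$. The classifier and the step size are assumptions of this sketch; in [Peltonen et al., 2004] $p(c|x)$ comes from a dedicated conditional density estimate.

```python
# Fisher information matrix J(x) estimated from class probabilities p(c|x).
import numpy as np

def fisher_matrix(predict_proba, x, eps=1e-4):
    """predict_proba: maps an (n, D) array to (n, C) class probabilities. Returns the (D, D) matrix J(x)."""
    D = len(x)
    p = predict_proba(x[None, :])[0]                       # p(c|x)
    grads = np.zeros((len(p), D))                          # rows: d/dx log p(c|x)
    for d in range(D):
        e = np.zeros(D); e[d] = eps
        p_plus = predict_proba((x + e)[None, :])[0]
        p_minus = predict_proba((x - e)[None, :])[0]
        grads[:, d] = (np.log(p_plus) - np.log(p_minus)) / (2 * eps)
    return sum(p[c] * np.outer(grads[c], grads[c]) for c in range(len(p)))

if __name__ == "__main__":
    # toy p(c|x): softmax of a fixed linear model, standing in for any trained classifier
    W = np.array([[1.0, -0.5, 0.2], [-1.0, 0.5, -0.2]])
    def predict_proba(X):
        logits = X @ W.T
        e = np.exp(logits - logits.max(axis=1, keepdims=True))
        return e / e.sum(axis=1, keepdims=True)
    print(fisher_matrix(predict_proba, np.array([0.3, -0.1, 0.7])))
```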

DR: supervised NE projection

prototypes in original space

[Figure: prototypes shown in the original space, classes C1 and C2]

Summary

- different objectives of dimensionality reduction
- new approach to get insight into trained classification models
- discriminative information can yield major improvements

Thank You For Your Attention!

Alexander Schulz
aschulz (at) techfak.uni-bielefeld.de

Literature I

Gisbrecht, A. and Hammer, B. (2015). Data visualization by nonlinear dimensionality reduction. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 5(2):51-73.

Gisbrecht, A., Schulz, A., and Hammer, B. (2015). Parametric nonlinear dimensionality reduction using kernel t-SNE. Neurocomputing, 147:71-82.

Hinton, G. and Roweis, S. (2002). Stochastic neighbor embedding. In Advances in Neural Information Processing Systems 15, pages 833-840. MIT Press.

Lee, J. A., Renard, E., Bernard, G., Dupont, P., and Verleysen, M. (2013). Type 1 and 2 mixtures of Kullback-Leibler divergences as cost functions in dimensionality reduction based on similarity preservation. Neurocomputing, 112:92-108.

Lee, J. A. and Verleysen, M. (2007). Nonlinear Dimensionality Reduction. Springer.

Peltonen, J., Klami, A., and Kaski, S. (2004). Improved learning of Riemannian metrics for exploratory analysis. Neural Networks, 17:1087-1100.

Literature II

Peluffo-Ordóñez, D. H., Lee, J. A., and Verleysen, M. (2014). Recent methods for dimensionality reduction: A brief comparative analysis. In 22th European Symposium on Artificial Neural Networks, ESANN 2014, Bruges, Belgium, April 23-25, 2014.

Schölkopf, B., Smola, A., and Müller, K.-R. (1998). Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation, 10(5):1299-1319.

Schulz, A., Gisbrecht, A., and Hammer, B. (2014a). Relevance learning for dimensionality reduction. In 22th European Symposium on Artificial Neural Networks, ESANN 2014, Bruges, Belgium, April 23-25, 2014.

Schulz, A., Gisbrecht, A., and Hammer, B. (2014b). Using discriminative dimensionality reduction to visualize classifiers. Neural Processing Letters, pages 1-28.

van der Maaten, L. (2013). Barnes-Hut-SNE. CoRR, abs/1301.3342.

van der Maaten, L. and Hinton, G. (2008). Visualizing high-dimensional data using t-SNE. Journal of Machine Learning Research, 9:2579-2605.

Venna, J., Peltonen, J., Nybo, K., Aidos, H., and Kaski, S. (2010). Information retrieval perspective to nonlinear dimensionality reduction for data visualization. Journal of Machine Learning Research, 11:451-490.

Literature III

Weinberger, K. Q. and Saul, L. K. (2006). An introduction to nonlinear dimensionality reduction by maximum variance unfolding. In Proceedings of the 21st National Conference on Artificial Intelligence. AAAI.

Yang, Z., Peltonen, J., and Kaski, S. (2013). Scalable optimization of neighbor embedding for visualization. In Dasgupta, S. and McAllester, D., editors, Proceedings of the 30th International Conference on Machine Learning (ICML-13), volume 28, pages 127-135. JMLR Workshop and Conference Proceedings.