Geometric Data Analysis: Diffusion Maps
MAT 6480W / STT 6705V
Guy Wolf ([email protected])
Université de Montréal, Fall 2019

Source: mat6480w.guywolf.org/slides/T12 - Diffusion Maps.pdf (2019-11-28)


Outline

1. Diffusion maps
     Gaussian kernel
     Diffusion process & affinities
     Spectral embedding
     Embedded diffusion distances
     Anisotropic diffusion

2. Theoretical foundations
     Heat kernel
     Laplace-Beltrami operator
     Asymptotic convergence


Diffusion maps: Geodesic distances vs. diffusion distances

Are geodesic distances sufficient for manifold embedding?

Diffusion-based distances are more robust to sampling & noise.

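Diffusion maps: Gaussian kernel

The Gaussian kernel captures local data neighborhoods.

Normalization $\Longrightarrow$ diffusion kernel
Spectral analysis $\Longrightarrow$ map from $\mathcal{M} \subseteq \mathbb{R}^m$ to $\mathbb{R}^{\delta}$, with $\delta \ll m$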


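Diffusion maps: Diffusion process & affinities

Gaussian kernel:
$$k(x, y) \triangleq e^{-\frac{\|x - y\|^2}{\varepsilon}}$$

Degrees:
$$q(x) \triangleq \sum_y k(x, y)$$

Transition probabilities:
$$p(x, y) \triangleq \frac{k(x, y)}{q(x)}$$

Diffusion affinities:
$$a(x, y) \triangleq \frac{k(x, y)}{\sqrt{q(x)}\,\sqrt{q(y)}} = q^{1/2}(x)\, p(x, y)\, q^{-1/2}(y)$$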
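The four definitions above translate almost line by line into matrix operations. The following is a minimal NumPy sketch, not the course's reference code; the function name `diffusion_matrices` and the random toy data are assumptions made for illustration.

```python
import numpy as np

def diffusion_matrices(X, eps):
    """Return (K, q, P, A) for data X of shape (N, m) and kernel radius eps.

    Hypothetical helper: the name and return layout are illustrative only.
    """
    # Pairwise squared Euclidean distances ||x - y||^2.
    sq_norms = np.sum(X**2, axis=1)
    D2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    D2 = np.maximum(D2, 0.0)           # clip tiny negatives from round-off

    K = np.exp(-D2 / eps)              # Gaussian kernel k(x, y)
    q = K.sum(axis=1)                  # degrees q(x)
    P = K / q[:, None]                 # transition probabilities p(x, y)
    A = K / np.sqrt(np.outer(q, q))    # diffusion affinities a(x, y)
    return K, q, P, A

# Toy data (an assumption): 50 random points in R^3.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
K, q, P, A = diffusion_matrices(X, eps=2.0)
```

Each row of `P` sums to one, while `A` is symmetric; both facts follow directly from the normalizations above.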


Diffusion maps: Tuning the radius ε

The diffusion framework depends on the Gaussian kernel's ability to capture local manifold patches. More precisely, it depends on appropriate tuning of the radius ε.

Example: [figure not reproduced]


Several common heuristics for choosing ε include the distance median (or another percentile), the mean distance to the k-th nearest neighbor, or the maximal distance from a point to its nearest neighbor in the data.


Another approach [1] uses the observation that the degree q(x) should approximate the volume of an intrinsic ball on the manifold. More precisely, without noise and with an appropriate ε, we have:

$$\sum_x \sum_y k(x, y) = \sum_x q(x) \approx C\,(2\pi\varepsilon)^{d/2}, \qquad C = \frac{N^2}{\mathrm{vol}(\mathcal{M})}$$

where d is the intrinsic dimension of the manifold $\mathcal{M}$. Thus, applying log gives a linear relation between $\log \varepsilon$ and $\log \|k(\cdot, \cdot)\|_1$:

$$\log \|k(\cdot, \cdot)\|_1 = \frac{d}{2} \log \varepsilon + \frac{d}{2} \log(2\pi) + \log C .$$

However, if ε is too big, neighborhoods will not be local on the manifold, and if it is too small, the kernel will not capture any neighbors.

[1] Coifman, Shkolnisky, Sigworth, & Singer, IEEE Trans. on Image Proc., 17(10):1891-1899, 2008.


Therefore, one can plot $\log \|k(\cdot, \cdot)\|_1$ against $\log \varepsilon$ and choose ε from the middle linear region of the plot.
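The log-log scan can be sketched as follows. This is an illustration under stated assumptions: the data are 500 evenly spaced points on a unit circle (so the intrinsic dimension is d = 1), the helper name `log_kernel_sum` is hypothetical, and the fitting window is picked by hand as if read off the plot rather than detected automatically.

```python
import numpy as np

def log_kernel_sum(X, eps_values):
    """log ||k(.,.)||_1 for each candidate radius (hypothetical helper)."""
    sq = np.sum(X**2, axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.array([np.log(np.exp(-D2 / eps).sum()) for eps in eps_values])

# Assumed toy manifold: a unit circle in R^2, intrinsic dimension d = 1.
theta = 2.0 * np.pi * np.arange(500) / 500
X = np.column_stack([np.cos(theta), np.sin(theta)])

eps_values = np.logspace(-6, 3, 46)
log_eps = np.log(eps_values)
log_S = log_kernel_sum(X, eps_values)

# Assumption: the middle linear region was identified by inspecting the plot.
window = (eps_values >= 1e-3) & (eps_values <= 1e-1)
slope, intercept = np.polyfit(log_eps[window], log_S[window], 1)

d_est = 2.0 * slope                            # slope of the linear region is d/2
eps_choice = np.exp(log_eps[window].mean())    # geometric middle of the region
```

Outside the window the curve flattens: for very small ε only the diagonal of the kernel survives, and for very large ε every entry approaches one, which matches the two failure modes described above.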


Diffusion maps: Spectral embedding

The diffusion probabilities and affinities between N data points can be organized in N×N matrices P and A. Notice that P is row-stochastic (i.e., each row sums up to one) and A is symmetric.

Furthermore, these matrices are related by conjugation, $A = Q^{1/2} P Q^{-1/2}$, where Q is a diagonal matrix that contains the degrees $Q_{(x,x)} = q(x)$. This relation means P & A have the same eigenvalues, which are in fact all in the range [0, 1], and their eigenvectors are also related by $Q^{1/2}$ and $Q^{-1/2}$. These eigenpairs are used to obtain a diffusion-based spectral embedding of the data.

Finally, for t = 2, 3, . . . , we also have $A^t = Q^{1/2} P^t Q^{-1/2}$. These powers correspond to advancing the diffusion time, since $P^t$ contains the t-step transition probabilities of the diffusion process.
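Both consequences of the conjugation follow in a line each. The short derivation below is a worked check that uses only the definitions above, with $(\lambda, \phi)$ denoting any eigenpair of A.

```latex
% Powers: the inner factors Q^{-1/2} Q^{1/2} cancel pairwise.
A^t = \left(Q^{1/2} P Q^{-1/2}\right)^t
    = Q^{1/2} P \left(Q^{-1/2} Q^{1/2}\right) P \cdots P\, Q^{-1/2}
    = Q^{1/2} P^t Q^{-1/2}

% Eigenpairs: if A\phi = \lambda\phi, then Q^{-1/2}\phi is a right eigenvector of P.
P \left(Q^{-1/2}\phi\right)
    = Q^{-1/2} \left(Q^{1/2} P Q^{-1/2}\right) \phi
    = Q^{-1/2} A \phi
    = \lambda \left(Q^{-1/2}\phi\right)
```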


[Figure: spectrum (eigenvalues) of the diffusion affinity A and its powers; the leading eigenvalues $\lambda_0, \ldots, \lambda_\delta$ mark the numerical rank.]


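$$1 = \lambda_0 \geq \lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_\delta > 0$$

with corresponding eigenvectors $\phi_0, \phi_1, \phi_2, \ldots, \phi_\delta$, each of length N.

$$x \mapsto \Phi_t(x) \triangleq \left[\lambda_0^t \phi_0(x),\ \lambda_1^t \phi_1(x),\ \lambda_2^t \phi_2(x),\ \ldots,\ \lambda_\delta^t \phi_\delta(x)\right]^T$$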


The map Φ(x) is defined using the eigenpairs of A, which are simpler to relate to other kernel methods. However, in many cases we are in fact interested in computing an embedding based on the eigenvectors of P, due to their relations to heat diffusion on manifolds.

It can be shown that the (right) eigenvectors of P are given by $\psi_j = Q^{-1/2} \phi_j$. Furthermore, it can be shown that, up to normalization, $\phi_0 = q^{1/2}$, so we get a constant $\psi_0 = \vec{1}$. This can also be easily verified since, as a row-stochastic matrix, $P\vec{1} = \vec{1}$.

Therefore, diffusion maps are typically defined as

$$\Psi_t(x) = \left[\lambda_1^t \psi_1(x),\ \ldots,\ \lambda_\delta^t \psi_\delta(x)\right]^T,$$

which does not include the trivial eigenpair $(1, \vec{1})$, although for simplicity, Φ(x) is also sometimes referred to as a diffusion map.
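A minimal sketch of this construction is given below, assuming NumPy's symmetric eigensolver `np.linalg.eigh` applied to A. The function name `diffusion_map`, the toy data, and the choice ε = 2 are illustrative assumptions; ψ is left unnormalized, so ψ_0 comes out constant rather than exactly one.

```python
import numpy as np

def diffusion_map(X, eps, t=1, n_components=2):
    """Return (Psi_t, lam, psi); a hypothetical helper, not a library API."""
    sq = np.sum(X**2, axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-D2 / eps)
    q = K.sum(axis=1)
    A = K / np.sqrt(np.outer(q, q))

    # eigh returns ascending eigenvalues for symmetric A; reverse to descending.
    lam, phi = np.linalg.eigh(A)
    lam, phi = lam[::-1], phi[:, ::-1]

    # Right eigenvectors of P: psi_j = Q^{-1/2} phi_j.
    psi = phi / np.sqrt(q)[:, None]

    # Skip the trivial pair (lambda_0, psi_0) and scale by lambda_j^t.
    Psi_t = psi[:, 1:n_components + 1] * lam[1:n_components + 1] ** t
    return Psi_t, lam, psi

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))       # toy data (an assumption)
eps = 2.0
Psi, lam, psi = diffusion_map(X, eps, t=2, n_components=2)
```

Working with the symmetric A and converting afterwards avoids calling a general (non-symmetric) eigensolver on P, which is one practical reason for the detour through A.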


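Diffusion maps: Embedded diffusion distances

For the symmetric affinity matrix A:

$$\overbrace{\|\Phi_t(x) - \Phi_t(y)\|}^{\text{embedded distance}} = \overbrace{\left\|A^t_{(x,\cdot)} - A^t_{(y,\cdot)}\right\|}^{\text{diffusion distance}}$$

For the row-stochastic matrix P, the same identity holds for the map $\Psi_t$ built from the eigenvectors of P, under a weighted norm:

$$\overbrace{\|\Psi_t(x) - \Psi_t(y)\|}^{\text{embedded distance}} = \overbrace{\left\|P^t_{(x,\cdot)} - P^t_{(y,\cdot)}\right\|_{L^2(\|q\|_1/q)}}^{\text{diffusion distance}}$$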
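Both identities can be checked numerically when all N eigenpairs are kept. The sketch below is an illustration on assumed random data; it normalizes ψ so that ψ_0 = 1, which is the scaling under which the weight $\|q\|_1/q$ appears, and it retains the trivial coordinate because it cancels in every difference.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))       # toy data (an assumption)
eps, t = 2.0, 3

sq = np.sum(X**2, axis=1)
D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
K = np.exp(-D2 / eps)
q = K.sum(axis=1)
P = K / q[:, None]
A = K / np.sqrt(np.outer(q, q))

lam, phi = np.linalg.eigh(A)       # orthonormal eigenvectors of A

# Row x of Phi_t is the full-spectrum embedding Phi_t(x).
Phi_t = phi * lam**t

# Eigenvectors of P, scaled so that psi_0 is the all-ones vector (up to sign).
psi = np.sqrt(q.sum()) * phi / np.sqrt(q)[:, None]
Psi_t = psi * lam**t

At = np.linalg.matrix_power(A, t)
Pt = np.linalg.matrix_power(P, t)

x, y = 0, 1
embedded_A = np.linalg.norm(Phi_t[x] - Phi_t[y])
diffusion_A = np.linalg.norm(At[x] - At[y])

weights = q.sum() / q              # the L2(||q||_1 / q) weights
embedded_P = np.linalg.norm(Psi_t[x] - Psi_t[y])
diffusion_P = np.sqrt(np.sum(weights * (Pt[x] - Pt[y]) ** 2))
```

In practice only the leading δ coordinates are kept, so the embedded distance approximates the diffusion distance up to the discarded terms, which shrink like $\lambda_j^{t}$.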


Diffusion maps: Examples

Example (Shell-shaped manifold & anomalies): [figure not reproduced]


Example (Embedding video frames): Diffusion map of BTR video frames, restricted to the first two eigenvectors [figure not reproduced]. Color corresponds to the angle of the BTR rotation.


Diffusion maps: Anisotropic diffusion

Gaussian-based diffusion propagation captures not only the data geometry, but also its distribution, or density. While this may be useful in some applications, an equally useful goal is to separate the underlying geometry from this density. For example, sampling noise and data availability may cause density variations.

It can be shown that the degrees $q(x) = \int k(x, y)\,dy$ provide a smooth approximation of the data density. Therefore, the Gaussian kernel can be normalized as

$$k_\alpha(x, y) = \frac{k(x, y)}{q^\alpha(x)\, q^\alpha(y)},$$

where $\alpha \in \{0, 1/2, 1\}$ is a configurable parameter that determines how much of the density should be canceled. Then, an anisotropic diffusion map can be constructed based on this kernel. When α = 0, this construction gives the original isotropic diffusion map.
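The density normalization adds a single step before the usual construction. The sketch below assumes NumPy; the function name `anisotropic_affinity` and the non-uniformly sampled circle are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def anisotropic_affinity(X, eps, alpha=1.0):
    """Return (K_alpha, q_alpha, P_alpha, A_alpha); a hypothetical helper."""
    sq = np.sum(X**2, axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-D2 / eps)
    q = K.sum(axis=1)                                  # density estimate q(x)

    K_alpha = K / np.outer(q, q) ** alpha              # k / (q(x) q(y))^alpha
    q_alpha = K_alpha.sum(axis=1)                      # degrees of the new kernel
    P_alpha = K_alpha / q_alpha[:, None]               # row-stochastic
    A_alpha = K_alpha / np.sqrt(np.outer(q_alpha, q_alpha))  # symmetric
    return K_alpha, q_alpha, P_alpha, A_alpha

# Assumed toy data: a circle sampled more densely near angle 0.
rng = np.random.default_rng(0)
theta = 2.0 * np.pi * rng.random(200) ** 2
X = np.column_stack([np.cos(theta), np.sin(theta)])

K0, q0, P0, A0 = anisotropic_affinity(X, eps=0.5, alpha=0.0)   # isotropic case
K1, q1, P1, A1 = anisotropic_affinity(X, eps=0.5, alpha=1.0)   # density canceled
```

With `alpha=0.0` the division is by one, so `K0` is the plain Gaussian kernel and the output coincides with the isotropic construction.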


Example [figure not reproduced]. From left to right:
1. A closed curve sampled non-uniformly.
2. Density of the samples (also corresponds to color).
3. Isotropic embedding, which is skewed by sampling density.
4. Anisotropic embedding, which separates geometry from density.


Theoretical foundations: Heat kernel

The theoretical foundation of diffusion maps comes from its relation to heat propagation over manifolds. This diffusion process is modeled by the heat equation and its solution.

Definition (Heat equation): The heat equation is a differential equation for functions $u : \mathcal{M} \times [0, T] \to \mathbb{R}$, given by

$$L(u) \triangleq \frac{\partial u}{\partial t} - \Delta_x u = 0 \ ; \qquad u(x, 0) = f(x)$$

with initial condition f(x).


The fundamental solution of the heat equation is given by the heat kernel K(t, x, y), which satisfies:

$L(K) = 0$ in $(x, t)$.
$\lim_{t \to 0} K(t, x, y) = \delta_y(x)$.


Theoretical foundations: Laplace-Beltrami operator

The heat kernel is closely related to the Laplace-Beltrami operator ∆, which is a manifold equivalent of the Laplacian second derivative operator in Euclidean space. Namely, K can be written as

$$K(t, x, y) = \sum_{j=0}^{\infty} e^{-t\lambda_j}\, \psi_j(x)\, \psi_j(y)$$

where $0 = \lambda_0 < \lambda_1 \leq \lambda_2 \leq \cdots \nearrow \infty$ are the eigenvalues of ∆ (taken here with the sign convention that makes it positive semi-definite), and $\psi_0, \psi_1, \psi_2, \ldots$ are the corresponding eigenfunctions.

This kernel yields the solution

$$u(x, t) = (e^{-t\Delta} f)(x) \triangleq \int_{\mathcal{M}} K(t, x, y)\, f(y)\, dy$$

of the heat equation.
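The spectral form of the heat kernel has a direct finite-dimensional analogue. As a discrete stand-in (an assumption, not part of the slides), the sketch below replaces the manifold by a cycle graph with n nodes and ∆ by its positive semi-definite graph Laplacian, then builds the kernel from eigenpairs exactly as in the series above.

```python
import numpy as np

n = 64
# Graph Laplacian of a cycle (a discretized circle): each node has two neighbors.
I = np.eye(n)
L = 2.0 * I - np.roll(I, 1, axis=1) - np.roll(I, -1, axis=1)

# Eigenpairs of the symmetric Laplacian; all eigenvalues are >= 0.
lam, psi = np.linalg.eigh(L)

def heat_kernel(t):
    """K(t) = sum_j exp(-t * lam_j) psi_j psi_j^T, the matrix exponential of -t L."""
    return (psi * np.exp(-t * lam)) @ psi.T

# Heat initially concentrated at node 0, diffused up to time t = 0.5.
f = np.zeros(n)
f[0] = 1.0
u = heat_kernel(0.5) @ f
```

The total heat `u.sum()` is conserved because the constant vector is the eigenvector with eigenvalue zero, and as t tends to zero `heat_kernel(t)` tends to the identity, the discrete counterpart of $\delta_y(x)$.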


Theoretical foundations: Asymptotic convergence

To establish the relation between diffusion maps and heat propagation over a manifold, we consider a regime where the dataset tends to the entire manifold, and the Gaussian kernel radius ε tends to zero. Furthermore, in this infinitesimal regime we either assume that data is uniformly sampled from the manifold, or that the diffusion process is constructed with an anisotropic kernel with α = 1.

Under this asymptotic regime, there are two results that establish the convergence of the row-stochastic $P_\varepsilon$ to the heat kernel:

$$\lim_{\varepsilon \to 0} \frac{I - P_\varepsilon}{\varepsilon} = \Delta$$

$$\lim_{\varepsilon \to 0} P_\varepsilon^{t/\varepsilon} = e^{-t\Delta}$$

More details can be found in: Diffusion maps (Coifman & Lafon, 2006).
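The first limit can be probed numerically on a case where the answer is known. The sketch below is an illustration under stated assumptions: the manifold is a unit circle sampled on a uniform grid, where the Laplacian eigenfunctions cos(kθ) have eigenvalues k². It only checks that $(I - P_\varepsilon)/\varepsilon$ acts on cos(kθ) with a rate proportional to k²; the proportionality constant depends on how the kernel is normalized and is not examined here. The helper name `generator_rate` is hypothetical.

```python
import numpy as np

def generator_rate(k, eps, n=400):
    """Rayleigh quotient of (I - P_eps)/eps on f(theta) = cos(k * theta)."""
    theta = 2.0 * np.pi * np.arange(n) / n
    X = np.column_stack([np.cos(theta), np.sin(theta)])

    sq = np.sum(X**2, axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-D2 / eps)
    P = K / K.sum(axis=1, keepdims=True)     # row-stochastic P_eps

    f = np.cos(k * theta)
    Lf = (f - P @ f) / eps                   # ((I - P_eps) / eps) f
    return (f @ Lf) / (f @ f)

r1 = generator_rate(1, eps=0.01)
r2 = generator_rate(2, eps=0.01)
ratio = r2 / r1      # expected to be close to 2**2 / 1**2 for small eps
```

Because the grid is uniform, the uniform-sampling assumption of the asymptotic regime holds exactly, so no anisotropic normalization is needed in this toy case.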


Summary

Diffusion maps is a kernel method for nonlinear embedding & dimensionality reduction. It is based on spectral embedding of a diffusion affinity kernel, which consists of normalized Gaussian affinities. Using a manifold data model, the underlying diffusion geometry provides a discretization of a heat propagation process.

Euclidean distances in the embedded space approximate diffusion distances in the data. These distances are similar in nature to the geodesic distances used in Isomap, but they are more robust to noise and erroneous data links.

Anisotropic normalization can be applied in the diffusion map construction to alleviate noisy nonuniform sampling by separating data geometry from data distribution. Other variations also exist to handle special settings or emphasize various data patterns.


Typically, the main steps in a Diffusion Maps algorithm are:
1. Gaussian kernel $k(x, y) = \exp(-\|x - y\|^2 / \varepsilon)$ and degrees $q(x) = \sum_y k(x, y)$.
2. Anisotropic kernel $k_\alpha(x, y) = k(x, y) / (q(x)\, q(y))^\alpha$ and degrees $q_\alpha(x) = \sum_y k_\alpha(x, y)$.
3. Diffusion affinity matrix $A_{(x,y)} = k_\alpha(x, y) / \sqrt{q_\alpha(x)\, q_\alpha(y)}$.
4. SVD to obtain eigenvalues $1 = \lambda_0 \geq \lambda_1 \geq \ldots \geq 0$ and eigenvectors $\phi_0, \phi_1, \ldots$ of A.
5. Compute the δ-dimensional embedding using the map $x \mapsto \Psi_t(x) = \left(\lambda_i^t\, \phi_i(x) / \phi_0(x)\right)_{i=1}^{\delta}$, which is based on the eigenpairs of the row-stochastic t-step transition probability matrix. Notice that this row-stochastic matrix is not directly computed in these steps.
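The five steps above can be sketched end to end as follows. This is a minimal NumPy illustration rather than a reference implementation: the function name `diffusion_maps`, its default parameters, and the toy curve are assumptions, and a dense SVD is used as in step 4 even though large datasets would call for a truncated solver.

```python
import numpy as np

def diffusion_maps(X, eps, alpha=1.0, t=1, delta=2):
    """Embed the rows of X into R^delta (hypothetical helper following steps 1-5)."""
    # Step 1: Gaussian kernel and degrees.
    sq = np.sum(X**2, axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-D2 / eps)
    q = K.sum(axis=1)

    # Step 2: anisotropic kernel and its degrees.
    K_alpha = K / np.outer(q, q) ** alpha
    q_alpha = K_alpha.sum(axis=1)

    # Step 3: symmetric diffusion affinity matrix.
    A = K_alpha / np.sqrt(np.outer(q_alpha, q_alpha))

    # Step 4: SVD; singular values come sorted in descending order.
    U, s, _ = np.linalg.svd(A)
    lam, phi = s[:delta + 1], U[:, :delta + 1]

    # Step 5: lambda_i^t * phi_i(x) / phi_0(x) for i = 1..delta.
    return (lam[1:] ** t) * phi[:, 1:] / phi[:, [0]]

# Assumed toy data: a non-uniformly sampled closed curve in R^2.
rng = np.random.default_rng(0)
theta = 2.0 * np.pi * rng.random(150) ** 2
X = np.column_stack([np.cos(theta), np.sin(theta)])
emb = diffusion_maps(X, eps=0.5, alpha=1.0, t=1, delta=2)
```

Dividing by $\phi_0$ is what converts the eigenvectors of A into those of the row-stochastic matrix, since $\phi_0$ is proportional to $q_\alpha^{1/2}$; this is why P never has to be formed explicitly.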
