Conditional Expectation Manifolds and Brain Population Analysis
Samuel Gerber, University of Utah
Manifold Learning
Some observations on popular algorithms
Isomap
• Approximate geodesic distances by shortest paths in the nearest neighbor graph
• Preserve approximate geodesics
• $\min_x \sum_{i,j} [\delta(y_i, y_j) - d(x_i, x_j)]^2$
• Multidimensional scaling
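The geodesic approximation used by Isomap can be sketched in a few lines. The following is a minimal illustration, not the implementation behind the slides: the helper names `knn_graph` and `shortest_paths`, the half-circle toy data, and the choice `k=2` are all assumptions made for this example.

```python
import heapq
import math


def knn_graph(points, k):
    """Symmetric k-nearest-neighbor graph with Euclidean edge lengths."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d  # keep the graph undirected
    return graph


def shortest_paths(graph, source):
    """Dijkstra's algorithm: graph distance from source to each reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, i = heapq.heappop(heap)
        if d > dist.get(i, math.inf):
            continue  # stale heap entry
        for j, w in graph[i].items():
            nd = d + w
            if nd < dist.get(j, math.inf):
                dist[j] = nd
                heapq.heappush(heap, (nd, j))
    return dist


# Toy data (assumed for illustration): 50 points on a unit half circle.
n = 50
points = [
    (math.cos(math.pi * i / (n - 1)), math.sin(math.pi * i / (n - 1)))
    for i in range(n)
]
graph = knn_graph(points, k=2)
geodesic = shortest_paths(graph, 0)[n - 1]  # path length along the curve
chord = math.dist(points[0], points[-1])    # straight-line distance
print(f"graph geodesic {geodesic:.3f}, Euclidean chord {chord:.3f}")
```

On this toy curve the graph distance between the two endpoints stays close to the arc length π, while the Euclidean distance is only 2. A single spurious edge between distant parts of the curve would collapse the graph distance toward the chord, which is the shortcut problem.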
Properties
• Only relies on accurate local distances
• Shortcuts in graph - very bad approximation
• Quality measure based on graph embedding
• Hard to detect
[Figure: embedding scatter plot and plot of distortion versus dimension]
Properties
• Classical multidimensional scaling is not minimizing $\sum_{i,j} [\delta(y_i, y_j) - d(x_i, x_j)]^2$
• Optimization based approaches
A. Agarwal, J. Phillips and S. Venkatasubramanian, Universal Multi-Dimensional Scaling, Conference on Knowledge Discovery and Data Mining 2010
J. Kruskal, Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis, Psychometrika 1964
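The distance-mismatch objective that the optimization based approaches target can be evaluated directly. The sketch below is only an illustration: the function name `raw_stress` and the 3-4-5 triangle are assumptions for this example, and the sum runs over all ordered pairs (i, j), so each unordered pair is counted twice.

```python
import math


def raw_stress(delta, embedding):
    """Sum over all pairs (i, j) of [delta_ij - ||x_i - x_j||]^2."""
    n = len(embedding)
    total = 0.0
    for i in range(n):
        for j in range(n):
            d = math.dist(embedding[i], embedding[j])
            total += (delta[i][j] - d) ** 2
    return total


# Target distances of a 3-4-5 right triangle (toy data).
delta = [
    [0.0, 3.0, 5.0],
    [3.0, 0.0, 4.0],
    [5.0, 4.0, 0.0],
]
planar = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]  # reproduces every distance
line = [(0.0,), (3.0,), (5.0,)]                # 1-D layout, one distance is off

print(raw_stress(delta, planar))
print(raw_stress(delta, line))
```

The planar layout has zero stress, while the one-dimensional layout cannot reproduce the distance of 4 between the second and third point and pays for it in the objective.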
Laplacian Eigenmaps
• Given a manifold $M$, find functions $f : M \to \mathbb{R}$ such that $\int_M \|\nabla f(y)\|^2 \, dy$ is minimized
• The low dimensional embedding is $x = [f_1(y), \ldots, f_n(y)] \in \mathbb{R}^n$
• Small gradient implies that close by points will be mapped close together
[Figure: example data set and its two-dimensional embedding $x = [f_1(y), f_2(y)]$, with axes $f_1$ and $f_2$]
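On sampled data the gradient integral is usually replaced by a sum over graph edges. The sketch below assumes that standard graph discretization, a sum of w_ij (f_i - f_j)^2 over edges, and a toy path graph; the name `dirichlet_energy` is hypothetical.

```python
def dirichlet_energy(weights, f):
    """Discrete analogue of the integral of ||grad f||^2.

    weights maps an edge (i, j) to its weight w_ij; f holds one value per node.
    """
    return sum(w * (f[i] - f[j]) ** 2 for (i, j), w in weights.items())


# Toy path graph with 6 nodes and unit edge weights (assumed for illustration).
n = 6
path_edges = {(i, i + 1): 1.0 for i in range(n - 1)}

smooth = [i / (n - 1) for i in range(n)]  # varies slowly along the path
rough = [float(i % 2) for i in range(n)]  # alternates 0, 1, 0, 1, ...

print(dirichlet_energy(path_edges, smooth))
print(dirichlet_energy(path_edges, rough))
```

The slowly varying function has the smaller energy, which is why minimizers of this cost map neighboring points to nearby coordinates.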
Properties
• Again only local distances important
• No quality measure of the embedding
Eigenfunction Issue
• Minimizing $\int_M \|\nabla f(y)\|^2 \, dy$
• Orthogonality constraint on $f$ in function space (not geometrically on manifold)
• Eigenvectors with higher frequency along same extension on the manifold can have smaller cost
Eigenfunction Issue
• B is orthogonal to A (in function space)
• Cost of B less than C (the desired eigenvector)
Samuel Gerber, Tolga Tasdizen, Ross Whitaker, Robust Non-linear Dimensionality Reduction using Successive 1-Dimensional Laplacian Eigenmaps, ICML 2007
[Figure: functions A, B, and C plotted as f over the x-y domain]
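The ordering problem can be reproduced on a flat strip, where the eigenvalues are known in closed form. This is a stand-in chosen for illustration, not the example from the slides: for the Laplacian on a rectangle [0, a] x [0, b] with Neumann boundary conditions the eigenfunctions are cos(kπx/a) cos(lπy/b) with eigenvalues π²(k²/a² + l²/b²). The strip size 3.5 x 1 and the name `neumann_eigenvalues` are assumptions.

```python
import math


def neumann_eigenvalues(a, b, max_mode=4):
    """Sorted (eigenvalue, (k, l)) pairs for the rectangle [0, a] x [0, b]."""
    modes = []
    for k in range(max_mode + 1):
        for l in range(max_mode + 1):
            if (k, l) != (0, 0):  # skip the constant eigenfunction
                value = math.pi ** 2 * (k ** 2 / a ** 2 + l ** 2 / b ** 2)
                modes.append((value, (k, l)))
    return sorted(modes)


modes = neumann_eigenvalues(a=3.5, b=1.0)
order = [kl for _, kl in modes[:4]]
print(order)
```

On this strip the three cheapest non-constant eigenfunctions all oscillate along the long direction only. The first eigenfunction that varies along the short direction, which is the one wanted as second embedding coordinate, appears fourth.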
Conditional Expectation Manifolds
Manifold learning as unsupervised non-parametric model fitting
Principal Curves/Surfaces
Curve through the middle of a density
T. Hastie, W. Stuetzle, Principal curves, Journal of the American Statistical Association 1989
Principal Surface Definition
• Minimal orthogonal projection onto surface
• Principal surface iff the conditional expectation of the data given its projection equals the surface
Principal Surface Estimation
• Principal surfaces are extremal points of the objective function (the expected squared projection distance)
• Pick a parametrized surface model
• Optimize over the parameters of the surface model
• Unfortunately principal surfaces are all saddle points of the objective function
• Projection is a non-linear optimization problem
Conditional Expectation Manifolds (CEM)
• Define a coordinate mapping $f$
• Model surface as conditional expectation of coordinate mapping: $g(x) = E[Y \mid f(Y) = x]$
• Optimize coordinate mapping: $\min_f E[\|g(f(Y)) - Y\|^2]$
CEM Estimation
• Coordinate mapping as kernel regression
Samuel Gerber, Tolga Tasdizen, Ross Whitaker, Dimensionality Reduction and Principal Surfaces via Kernel Map Manifolds, ICCV 2009
CEM Estimation
• Conditional expectation estimated with kernel regression
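The two kernel regressions can be sketched for a one-dimensional coordinate. The code below is a minimal illustration under stated assumptions, not the method of the cited paper: Gaussian kernels, fixed bandwidths `sigma_y` and `sigma_x`, a parabola as toy data, and per-sample coordinates `z` that are simply initialized and not optimized. All function names are hypothetical.

```python
import math


def gaussian(dist, sigma):
    return math.exp(-0.5 * (dist / sigma) ** 2)


def coordinate_map(y, samples, z, sigma_y):
    """f(y): kernel regression of the coordinates z_i attached to samples y_i."""
    w = [gaussian(math.dist(y, yi), sigma_y) for yi in samples]
    return sum(wi * zi for wi, zi in zip(w, z)) / sum(w)


def reconstruct(x, samples, fx, sigma_x):
    """g(x): kernel estimate of E[Y | f(Y) = x], with fx[i] = f(y_i)."""
    w = [gaussian(x - fi, sigma_x) for fi in fx]
    total = sum(w)
    dim = len(samples[0])
    return tuple(
        sum(wi * yi[c] for wi, yi in zip(w, samples)) / total for c in range(dim)
    )


def reconstruction_mse(samples, z, sigma_y, sigma_x):
    """Mean squared distance between each sample and g(f(sample))."""
    fx = [coordinate_map(yi, samples, z, sigma_y) for yi in samples]
    err = 0.0
    for yi, fi in zip(samples, fx):
        err += math.dist(reconstruct(fi, samples, fx, sigma_x), yi) ** 2
    return err / len(samples)


# Toy data (assumed): 21 points on a parabola, coordinates z_i set to t.
ts = [i / 10 for i in range(-10, 11)]
samples = [(t, t * t) for t in ts]
z = list(ts)

center = coordinate_map((0.0, 0.0), samples, z, sigma_y=0.2)
mse = reconstruction_mse(samples, z, sigma_y=0.2, sigma_x=0.1)
print(center, mse)
```

In this setup the reconstruction error is a function of the coordinates `z` and the bandwidths, so optimizing the coordinate mapping amounts to adjusting `z` to reduce `reconstruction_mse`; the sketch only evaluates the error and leaves the optimization out.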
Some results
• Effect of optimization
[Figure panels: input; initial fit, MSE 8.6; optimized fit, MSE 2.6]
Some results
• 1965 images of different facial expressions (20x28 pixels)
Work in Progress
• Saddle point property of extrema is problematic for model selection
[Figure 2 panels: (a) curves in the (y1, y2) plane: ground truth, initialization, intermediates, selected; (b) train and test $d(\lambda, Y)^2$ versus iteration]
Figure 2: Minimization of $d(\lambda, Y)^2$ with automatic bandwidth selection starting from $\sigma_g = 1$ and $\sigma_\lambda = 0.1$. (a) Fitted curve with optimization path and (b) train and test error, with points indicating minimal train and test error, respectively.
[Figure 3 panels: (a) curves in the (y1, y2) plane: ground truth, initialization, intermediates, selected; (b) train and test $q(\lambda, Y)^2$ versus iteration]
Figure 3: Minimization of $q(\lambda, Y)^2$ with automatic bandwidth selection starting from $\sigma_g = 1$ and $\sigma_\lambda = 0.1$. (a) Fitted curve with optimization path and (b) train and test error, with points indicating minimal train and test error, respectively.
Work in Progress
• Conditional expectation manifolds pave way for other objective functions
[Figure 4 panels: (a) curves in the (y1, y2) plane: ground truth, initialization, intermediates, selected; (b) train and test $d(\lambda, Y)^2$ versus iteration]
Figure 4: Minimization of $d(\lambda, Y)^2$ with automatic bandwidth selection starting from $\sigma_g = 0.1$ and $\sigma_\lambda = 0.1$. (a) Fitted curve with optimization path and (b) train and test error, with points indicating minimal train and test error, respectively.
[Figure 5 panels: (a) curves in the (y1, y2) plane: ground truth, initialization, intermediates, selected; (b) train and test $q(\lambda, Y)^2$ versus iteration]
Figure 5: Minimization of $q(\lambda, Y)^2$ with automatic bandwidth selection starting from $\sigma_g = 0.1$ and $\sigma_\lambda = 0.1$. (a) Fitted curve with optimization path and (b) train and test error, with points indicating minimal train and test error, respectively.
Brain Population Analysis
Motivation
• Proof of concept
• Conditional expectation manifold for brain images
• Non-linearity in shape space
• Natural extension at the time from single atlas to multiple atlases to continuum
• Simplify statistics on shape spaces
Measuring Shape Differences
• Euclidean space does not capture changes in shape
• Distance based on measuring length of transformation
• Diffeomorphic transform $\phi(r,t) = r + \int_0^t v(\phi(r,\tau),\tau) \, d\tau$
• Riemannian metric ($\|v(r,\tau)\|_Q$)
• Geodesics on diffeomorphic transformations
• Induces metric on images
Large Deformation Diffeomorphic Metric
$\phi(r,t) = r + \int_0^t v(\phi(r,\tau),\tau) \, d\tau$
$d(e,\phi)^2 = \min_v \int_0^t \int_\Omega \|v(r,\tau)\|_Q \, dr \, d\tau$
$d(y_i,y_j)^2 = \min_v \int_0^1 \|v(r,\tau)\|_Q \, d\tau$ such that $\int_\Omega \|y_i(\phi(r,1)) - y_j(r)\|_2^2 \, dr = 0$
$Q = \alpha\nabla + (1-\alpha)I$, $\|u(r)\|_Q^2 = \alpha\|\nabla u(r)\|^2 + (1-\alpha)\|u(r)\|^2$
Manifold in Brain Space
[Figure labels: space of smooth images; manifold induced by diffeomorphic image metric; learned data manifold; samples/images; Fréchet mean on metric manifold; Fréchet mean on data manifold]
[Figure panels: data set (spiral segments); manifold mean; diffeomorphic mean]
Approximating the Diffeomorphic Metric
• For small deformations work in tangent space: $\phi(r,1) \approx r + u(r)$ with $u(r) = v(r,0)$
• Distance defined by $d_a(y_i,y_j)^2 = \min_u \int_\Omega \|u(r)\|_Q^2 \, dr$ such that $\int_\Omega \|y_i(r+u(r)) - y_j(r)\|_2^2 \, dr \le \epsilon$
• For symmetry $d(y_i,y_j) = \frac{1}{2}\left(d_a(y_i,y_j) + d_a(y_j,y_i)\right)$
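Two ingredients of the approximate metric are easy to write down: the cost of a given displacement field and the symmetrization. The sketch below only evaluates these for a known field and does not solve the constrained registration problem. It assumes a one-dimensional domain on a regular grid, forward differences for the gradient, and a cost of α‖∇u‖² + (1-α)‖u‖² integrated over the domain; both function names are hypothetical.

```python
def q_norm_squared(u, spacing, alpha):
    """Discretized integral of alpha*|u'(r)|^2 + (1 - alpha)*|u(r)|^2 in 1-D."""
    grad = [(u[i + 1] - u[i]) / spacing for i in range(len(u) - 1)]
    grad_term = sum(g * g for g in grad) * spacing
    value_term = sum(v * v for v in u) * spacing
    return alpha * grad_term + (1.0 - alpha) * value_term


def symmetric_distance(d_ij, d_ji):
    """Average of the two directed distances, as in the symmetry step."""
    return 0.5 * (d_ij + d_ji)


# Toy field (assumed): a constant shift of 0.5 on 11 grid points.
shift = [0.5] * 11
print(q_norm_squared(shift, spacing=0.1, alpha=1.0))  # gradient term only
print(q_norm_squared(shift, spacing=0.1, alpha=0.5))  # mixed penalty
print(symmetric_distance(1.0, 3.0))
```

With α = 1 a pure translation costs nothing because its gradient vanishes, so the identity term weighted by 1 - α is what makes translations count toward the distance.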
Manifold Representation
• Represent manifold as conditional expectation of some function: $g(x) = E[Y \mid f(Y) = x]$
• Non-Euclidean space: use Fréchet mean $y_m = \arg\min_{y \in M} \sum_{i=1}^n w_i \, d(y,y_i)^2$
B. Davis, P. Fletcher, E. Bullitt, S. Joshi, Population shape regression from random design data, ICCV 2007
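A weighted Fréchet mean can be illustrated on the unit circle, where the arithmetic mean of angles fails. The sketch below is only an illustration: it restricts the minimization to a finite candidate set, which is a simplification, and uses angles with arc-length distance as a toy stand-in for images with a diffeomorphic metric. The function names are hypothetical.

```python
import math


def frechet_mean(candidates, samples, weights, dist):
    """Candidate y minimizing sum_i w_i * d(y, y_i)^2 over a finite set."""
    def cost(y):
        return sum(w * dist(y, yi) ** 2 for w, yi in zip(weights, samples))
    return min(candidates, key=cost)


def arc_distance(a, b):
    """Geodesic distance between two angles (radians) on the unit circle."""
    diff = abs(a - b) % (2 * math.pi)
    return min(diff, 2 * math.pi - diff)


# Toy data (assumed): two angles on either side of 0 degrees.
samples = [math.radians(350), math.radians(10)]
weights = [0.5, 0.5]
candidates = [math.radians(k) for k in range(360)]

mean_angle = frechet_mean(candidates, samples, weights, arc_distance)
naive = sum(w * s for w, s in zip(weights, samples))  # ignores the geometry
print(math.degrees(mean_angle), math.degrees(naive))
```

The Fréchet mean lands at 0 degrees, between the two samples, while the arithmetic mean of the raw angles points to the opposite side of the circle. Using kernel weights for `weights` turns the same minimization into a kernel regression on the manifold.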
Manifold Representation
• Compute embedding based on pairwise distance matrix (Isomap)
• Define coordinate mapping based on the kernel map manifold approach: $f(y) = \frac{\sum_{i=1}^n K_y(d(y,y_i)) \, z_i}{\sum_{j=1}^n K_y(d(y,y_j))}$
• In all steps:
• Large distances have negligible effect
Manifold Representation
$f(y) = \frac{\sum_{i=1}^n K_y(d(y,y_i)) \, z_i}{\sum_{j=1}^n K_y(d(y,y_j))}$
$g(x) = \arg\min_y \sum_{i=1}^n \frac{K_x(\|x - f(y_i)\|_2)}{\sum_{j=1}^n K_x(\|x - f(y_j)\|_2)} \, d(y,y_i)^2$
Results
• OASIS data set
• 416 subjects, age 16 to 80
• 100 subjects diagnosed with mild to moderate dementia
• ADNI data set
• 156 subjects, age 57 to 88
• 38 normal, 84 MCI, 34 early AD
[Figures: two MMSE histograms]
OASIS 2D Embedding
Manifold Fit - OASIS
• Measure reconstruction error
• Comparison to PCA
• Comparison of different metrics
• Scale by average nearest neighbor distance: $\text{error} = \frac{\sum_i d(g(f(y_i)), y_i)}{\sum_i d(\mathrm{nn}(y_i), y_i)}$
[Figure panels: Manifold Model; PCA]
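The scaled reconstruction error is a ratio of two sums of distances and can be computed as follows. This is a minimal sketch: the function name, the Euclidean default for `dist`, and the toy points are assumptions, and in the brain application `dist` would be the image metric and `reconstructions` the values g(f(y_i)).

```python
import math


def relative_reconstruction_error(samples, reconstructions, dist=math.dist):
    """sum_i d(g(f(y_i)), y_i) divided by sum_i d(nn(y_i), y_i)."""
    recon = sum(dist(r, y) for r, y in zip(reconstructions, samples))
    nearest = sum(
        min(dist(y, other) for j, other in enumerate(samples) if j != i)
        for i, y in enumerate(samples)
    )
    return recon / nearest


# Toy data (assumed): samples on a line, reconstructions offset by 0.5.
samples = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
reconstructions = [(x, 0.5) for x, _ in samples]

error = relative_reconstruction_error(samples, reconstructions)
print(error)
```

A value below 1 means that, on average, the reconstruction of a sample is closer to it than its nearest neighbor in the data set.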
Manifold Fit - ADNI
Projection distance: 1.07 0.81 1.23
Statistical Analysis - OASIS
• Linear regression on age, MMSE, CDR
• Comparison to PCA and age as predictor
• Controlled for age - BIC to select best model
Statistical Analysis - OASIS
• Restricted to subjects aged above 60
Statistical Analysis - ADNI
Reconstructions - ADNI
ADNI - Statistics
Extensions
• Different metrics?
• Transformation based metric is expensive
• No optimization of conditional expectation manifold
• Embedding/Statistics including metric tensor.
• Adding supervision
• Fit manifold with respect to a clinical predictor
Thank you
This work is supported by
NIH/NCBC grant U54-EB005149
NSF grant CCF-073222
NIBIB grant 5RO1EB007688-02
Thoughts on Manifold Learning
• For which applications / tasks is manifold learning effective?
• Purely unsupervised tasks are rare
• Exploratory analysis
• In supervised settings:
• Manifold learning as regularization
• Feature extraction
• Stratified, non flat-able manifolds and detection of non-manifold structure