62
Gradient Flow in the Wasserstein Metric Katy Craig University of California, Santa Barbara NIPS, Optimal Transport & Machine Learning December 9th, 2017 Position -0.5 0.0 0.5 Time 0 1 2 3

on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

Gradient Flow in the Wasserstein MetricKaty Craig University of California, Santa Barbara

NIPS, Optimal Transport & Machine Learning December 9th, 2017

Position -0.5 0.0 0.5

Tim

e

0

1

2

3

Page 2: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

Example:

Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the gradient flow.

A curve x(t): [0,T ] → ℝd is the gradient flow of an energy E: ℝd → ℝ if

• “x(t) evolves in the direction of steepest descent of E” • initial value problem: given x(0), find the gradient flow x(t)

gradient flow in finite dimensions

2

metric energy functional gradient flow

(Rd, | · |)E(x) =

1

2x

2 d

dt

x(t) = �x(t)

d

dt

x(t) = �rE(x(t))

Page 3: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

Def: An energy E is λ-convex if or, equivalently, if

for all x,y ∈ ℝ, t ∈ [0,1].

gradient flow in finite dimensions

3

D2E � �Id⇥d

E((1� t)x+ ty) (1� t)E(x) + tE(y)�t(1� t)�

2|x� y|2

Gradient flows often arise when solving optimization problems: minx2Rd

E(x)

Convexity of the energy determines stability and long time behavior.

f(x) =x

2

2, � = 1 f(x) = sin(x), � = �1

Page 4: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

gradient flow in finite dimensions

4

If E(x) is λ-convex, then…

1) Stability: for any gradient flows x(t) and y(t),

2) long time behavior: if λ>0, there is a unique solution x̅ of and any gradient flow x(t) converges to x̅ as t → +∞:

|x(t)� x̄| e

��t |x(0)� x̄|

|x(t)� y(t)| e

��t |x(0)� y(0)|

minx2Rd

E(x)

Page 5: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

gradient flow

5

Page 6: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

gradient flow

5

Page 7: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

d

dt

x(t) = �rXE(x(t))

In general, given a complete metric space (X,d), a curve x(t): ℝ → X is the gradient flow of an energy E: X → ℝ if

gradient flow with different metrics

metric (X,d)

def of ∇X

formula for ∇X

energy

gradient flow

E(x) =1

2x

2

(Rd, | · |)

d

dt

x(t) = �x(t)

(L2(Rd), k · kL2)

rRdE(x) = rE(x) rL2(Rd)E(f) =@E

@f

6

rX

hrE(x), vi = limh!0

E(x+ hv)� E(x)

h

hrE(f), gi = limh!0

E(f + hg)� E(f)

h

Examples: Euclidean L2

E(f) =1

2

Z|f |2

d

dt

f(x, t) = �f(x, t)

“ ’’

Page 8: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

d

dt

x(t) = �rXE(x(t))

In general, given a complete metric space (X,d), a curve x(t): ℝ → X is the gradient flow of an energy E: X → ℝ if

gradient flow with different metrics

metric (X,d)

def of ∇X

formula for ∇X

energy

gradient flow

E(x) =1

2x

2

(Rd, | · |)

d

dt

x(t) = �x(t)

(L2(Rd), k · kL2)

rRdE(x) = rE(x) rL2(Rd)E(f) =@E

@f

6

rX

hrE(x), vi = limh!0

E(x+ hv)� E(x)

h

hrE(f), gi = limh!0

E(f + hg)� E(f)

h

Examples: Euclidean L2

E(f) =1

2

Z|f |2

d

dt

f(x, t) = �f(x, t)

E(f) =1

2

Z|rf |2

“ ’’

Page 9: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

d

dt

x(t) = �rXE(x(t))

In general, given a complete metric space (X,d), a curve x(t): ℝ → X is the gradient flow of an energy E: X → ℝ if

gradient flow with different metrics

metric (X,d)

def of ∇X

formula for ∇X

energy

gradient flow

E(x) =1

2x

2

(Rd, | · |)

d

dt

x(t) = �x(t)

(L2(Rd), k · kL2)

rRdE(x) = rE(x) rL2(Rd)E(f) =@E

@f

6

rX

hrE(x), vi = limh!0

E(x+ hv)� E(x)

h

hrE(f), gi = limh!0

E(f + hg)� E(f)

h

Examples: Euclidean L2

E(f) =1

2

Z|f |2

d

dt

f(x, t) = �f(x, t)

E(f) =1

2

Z|rf |2

d

dt

f(x, t) = �f(x, t)

“ ’’

Page 10: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

finite difference approximation

approximated by approximate values of functionf : Rd ! R {fi}i2hZd

gradient flow with different metrics

metric (X,d)

def of ∇X

formula for ∇X

energy

gradient flow

E(x) =1

2x

2

(Rd, | · |)

d

dt

x(t) = �x(t)

(L2(Rd), k · kL2)

rRdE(x) = rE(x) rL2(Rd)E(f) =@E

@f

7

rX

hrE(x), vi = limh!0

E(x+ hv)� E(x)

h

hrE(f), gi = limh!0

E(f + hg)� E(f)

h

Examples: Euclidean L2

E(f) =1

2

Z|f |2

d

dt

f(x, t) = �f(x, t)

E(f) =1

2

Z|rf |2

d

dt

f(x, t) = �f(x, t)

Page 11: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

metric (X,d)

def of ∇X

formula for ∇X

energy

gradient flow

gradient flow with different metrics

E(x) =1

2x

2

(Rd, | · |)

d

dt

x(t) = �x(t)

rRdE(x) = rE(x)

8

rX

hrE(x), vi = limh!0

E(x+ hv)� E(x)

h

(P2(Rd),W2)

rW2E(⇢) = �r ·✓⇢r@E

@⇢

Examples: Euclidean W2

E(⇢) =1

2

Zx

2⇢(x)dx

d

dt

⇢(x, t) = r · (x⇢(x, t))

Page 12: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

metric (X,d)

def of ∇X

formula for ∇X

energy

gradient flow

hrE(µ),�r · (⇠µ)iTanµP2(Rd) = limh!0

E((id + h⇠)#µ)� E(µ)

h

gradient flow with different metrics

E(x) =1

2x

2

(Rd, | · |)

d

dt

x(t) = �x(t)

rRdE(x) = rE(x)

8

rX

hrE(x), vi = limh!0

E(x+ hv)� E(x)

h

(P2(Rd),W2)

rW2E(⇢) = �r ·✓⇢r@E

@⇢

Examples: Euclidean W2

E(⇢) =1

2

Zx

2⇢(x)dx

d

dt

⇢(x, t) = r · (x⇢(x, t))

Page 13: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

metric (X,d)

def of ∇X

formula for ∇X

energy

gradient flow

gradient flow with different metrics

E(x) =1

2x

2

(Rd, | · |)

d

dt

x(t) = �x(t)

rRdE(x) = rE(x)

8

rX

hrE(x), vi = limh!0

E(x+ hv)� E(x)

h

(P2(Rd),W2)

rW2E(⇢) = �r ·✓⇢r@E

@⇢

Examples: Euclidean W2

E(⇢) =1

2

Zx

2⇢(x)dx

d

dt

⇢(x, t) = r · (x⇢(x, t))

Page 14: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

metric (X,d)

def of ∇X

formula for ∇X

energy

gradient flow

gradient flow with different metrics

E(x) =1

2x

2

(Rd, | · |)

d

dt

x(t) = �x(t)

rRdE(x) = rE(x)

8

rX

hrE(x), vi = limh!0

E(x+ hv)� E(x)

h

(P2(Rd),W2)

rW2E(⇢) = �r ·✓⇢r@E

@⇢

Examples: Euclidean W2

E(⇢) =1

2

Zx

2⇢(x)dx

d

dt

⇢(x, t) = r · (x⇢(x, t))

E(⇢) =

Z⇢(x) log(⇢(x))dx

Page 15: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

metric (X,d)

def of ∇X

formula for ∇X

energy

gradient flow

gradient flow with different metrics

E(x) =1

2x

2

(Rd, | · |)

d

dt

x(t) = �x(t)

rRdE(x) = rE(x)

8

rX

hrE(x), vi = limh!0

E(x+ hv)� E(x)

h

(P2(Rd),W2)

rW2E(⇢) = �r ·✓⇢r@E

@⇢

Examples: Euclidean W2

E(⇢) =1

2

Zx

2⇢(x)dx

d

dt

⇢(x, t) = r · (x⇢(x, t))

E(⇢) =

Z⇢(x) log(⇢(x))dx

d

dt

⇢(x, t) = �⇢(x, t)

Page 16: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

metric (X,d)

def of ∇X

formula for ∇X

energy

gradient flow

particle approximation

approximated by approximate mass of functionf : Rd ! R

gradient flow with different metrics

E(x) =1

2x

2

(Rd, | · |)

d

dt

x(t) = �x(t)

rRdE(x) = rE(x)

9

rX

hrE(x), vi = limh!0

E(x+ hv)� E(x)

h

(P2(Rd),W2)

rW2E(⇢) = �r ·✓⇢r@E

@⇢

Examples: Euclidean W2

NX

i=1

�ximi

E(⇢) =1

2

Zx

2⇢(x)dx

d

dt

⇢(x, t) = r · (x⇢(x, t))

E(⇢) =

Z⇢(x) log(⇢(x))dx

d

dt

⇢(x, t) = �⇢(x, t)

Page 17: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

interpolating with different metrics

10

L2 geodesic

The same dichotomy between values of a function and mass of a function is also present in the geodesics.Def: A constant speed geodesic between two points ρ0 and ρ1 in a metric space (X,d) is any curve ρ:[0,1]→X s.t.

⇢(0) = ⇢0, ⇢(1) = ⇢1, d(⇢(t), ⇢(s)) = |t� s|d(⇢0, ⇢1)

⇢(t) = (1� t)⇢0 + t⇢1

W2 geodesic⇢(t) = ((1� t)id + tT ⇢1

⇢0)#⇢0

Page 18: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

interpolating with different metrics

10

L2 geodesic

The same dichotomy between values of a function and mass of a function is also present in the geodesics.Def: A constant speed geodesic between two points ρ0 and ρ1 in a metric space (X,d) is any curve ρ:[0,1]→X s.t.

⇢(0) = ⇢0, ⇢(1) = ⇢1, d(⇢(t), ⇢(s)) = |t� s|d(⇢0, ⇢1)

⇢(t) = (1� t)⇢0 + t⇢1

W2 geodesic⇢(t) = ((1� t)id + tT ⇢1

⇢0)#⇢0

Page 19: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

interpolating with different metrics

10

L2 geodesic

The same dichotomy between values of a function and mass of a function is also present in the geodesics.Def: A constant speed geodesic between two points ρ0 and ρ1 in a metric space (X,d) is any curve ρ:[0,1]→X s.t.

⇢(0) = ⇢0, ⇢(1) = ⇢1, d(⇢(t), ⇢(s)) = |t� s|d(⇢0, ⇢1)

⇢(t) = (1� t)⇢0 + t⇢1

W2 geodesic⇢(t) = ((1� t)id + tT ⇢1

⇢0)#⇢0

Page 20: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

Examples:

gradient flow in the Wasserstein metric

11

energy functional gradient flow

E(⇢) =

Z⇢ log ⇢

d

dt⇢ = �⇢

E(⇢) =1

m� 1

Z⇢m d

dt⇢ = �⇢m

E(⇢) =

Z(K ⇤ ⇢)⇢

E(⇢) =

ZV ⇢

All Wasserstein gradient flows are of the form

d

dt⇢ = r · (r(K ⇤ ⇢)⇢)

d

dt⇢ = r · (rV ⇢)

continuity equation

d

dt⇢+r · (v⇢) = 0

Page 21: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

Examples:

gradient flow in the Wasserstein metric

11

energy functional gradient flow

E(⇢) =

Z⇢ log ⇢

d

dt⇢ = �⇢

E(⇢) =1

m� 1

Z⇢m d

dt⇢ = �⇢m

E(⇢) =

Z(K ⇤ ⇢)⇢

E(⇢) =

ZV ⇢

All Wasserstein gradient flows are of the form

d

dt⇢ = r · (r(K ⇤ ⇢)⇢)

d

dt⇢ = r · (rV ⇢)

continuity equation

d

dt⇢+r · (v⇢) = 0

v = �r⇢

Page 22: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

Examples:

gradient flow in the Wasserstein metric

11

energy functional gradient flow

E(⇢) =

Z⇢ log ⇢

d

dt⇢ = �⇢

E(⇢) =1

m� 1

Z⇢m d

dt⇢ = �⇢m

E(⇢) =

Z(K ⇤ ⇢)⇢

E(⇢) =

ZV ⇢

All Wasserstein gradient flows are of the form

d

dt⇢ = r · (r(K ⇤ ⇢)⇢)

d

dt⇢ = r · (rV ⇢)

continuity equation

d

dt⇢+r · (v⇢) = 0

v = �r⇢

v = �m⇢m�2r⇢

Page 23: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

Examples:

gradient flow in the Wasserstein metric

11

energy functional gradient flow

E(⇢) =

Z⇢ log ⇢

d

dt⇢ = �⇢

E(⇢) =1

m� 1

Z⇢m d

dt⇢ = �⇢m

E(⇢) =

Z(K ⇤ ⇢)⇢

E(⇢) =

ZV ⇢

All Wasserstein gradient flows are of the form

d

dt⇢ = r · (r(K ⇤ ⇢)⇢)

d

dt⇢ = r · (rV ⇢)

continuity equation

d

dt⇢+r · (v⇢) = 0

v = �r⇢

v = �m⇢m�2r⇢

v = �rV

Page 24: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

Examples:

gradient flow in the Wasserstein metric

11

energy functional gradient flow

E(⇢) =

Z⇢ log ⇢

d

dt⇢ = �⇢

E(⇢) =1

m� 1

Z⇢m d

dt⇢ = �⇢m

E(⇢) =

Z(K ⇤ ⇢)⇢

E(⇢) =

ZV ⇢

All Wasserstein gradient flows are of the form

d

dt⇢ = r · (r(K ⇤ ⇢)⇢)

d

dt⇢ = r · (rV ⇢)

continuity equation

d

dt⇢+r · (v⇢) = 0

v = �r⇢

v = �m⇢m�2r⇢

v = �rV

v = �r(K ⇤ ⇢)

Page 25: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

Examples:

gradient flow in the Wasserstein metric

11

energy functional gradient flow

E(⇢) =

Z⇢ log ⇢

d

dt⇢ = �⇢

E(⇢) =1

m� 1

Z⇢m d

dt⇢ = �⇢m

E(⇢) =

Z(K ⇤ ⇢)⇢

E(⇢) =

ZV ⇢

All Wasserstein gradient flows are of the form

d

dt⇢ = r · (r(K ⇤ ⇢)⇢)

d

dt⇢ = r · (rV ⇢)

continuity equation

d

dt⇢+r · (v⇢) = 0

v = �r⇢

v = �m⇢m�2r⇢

v = �rV

v = �r(K ⇤ ⇢)

v = �r@E

@⇢

Page 26: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

aggregation, drift, and degenerate diffusion:

interaction kernels: degenerate diffusion: • granular media: K(x) = |x|3

• swarming: K(x) = |x|ᵃ/a - |x|ᵇ/b, -d<b<a

• chemotaxis:

12

gradient flow in the Wasserstein metric

| {z }self interaction

|{z}diffusion

K(x) =

(12⇡ log |x| if d = 2,

Cd|x|2�dotherwise.

�⇢m = r · (m⇢m�1

| {z }D

r⇢)•

d

dt⇢ = r · ((rK ⇤ ⇢)⇢) +r · (rV ⇢) +�⇢m K,V : Rd ! R, and m � 1

drift|{z}

E(⇢) =1

2

ZK ⇤ ⇢d⇢+

ZV d⇢+

1

m� 1

Z⇢m

| {z } |{z} |{z}

Page 27: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

biological chemotaxis a colony of slime mold [Gregor, et. al]

13

Page 28: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

biological chemotaxis a colony of slime mold [Gregor, et. al]

13

Page 29: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

K(x) = |x|

aggregation, drift, and degenerate diffusion:

14

gradient flow in the Wasserstein metric

d

dt⇢ = r · ((rK ⇤ ⇢)⇢) +r · (rV ⇢) +�⇢m K,V : Rd ! R, and m � 1

K(x) = |x|3/3

Page 30: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

K(x) = |x|

aggregation, drift, and degenerate diffusion:

14

gradient flow in the Wasserstein metric

d

dt⇢ = r · ((rK ⇤ ⇢)⇢) +r · (rV ⇢) +�⇢m K,V : Rd ! R, and m � 1

K(x) = |x|3/3

Page 31: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

K(x) = |x|

aggregation, drift, and degenerate diffusion:

14

gradient flow in the Wasserstein metric

d

dt⇢ = r · ((rK ⇤ ⇢)⇢) +r · (rV ⇢) +�⇢m K,V : Rd ! R, and m � 1

K(x) = |x|3/3

Page 32: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

K(x) = |x|

aggregation, drift, and degenerate diffusion:

14

gradient flow in the Wasserstein metric

d

dt⇢ = r · ((rK ⇤ ⇢)⇢) +r · (rV ⇢) +�⇢m K,V : Rd ! R, and m � 1

K(x) = |x|3/3

Page 33: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

aggregation, drift, and degenerate diffusion:

14

gradient flow in the Wasserstein metric

d

dt⇢ = r · ((rK ⇤ ⇢)⇢) +r · (rV ⇢) +�⇢m K,V : Rd ! R, and m � 1

K(x) = |x|3/3

Page 34: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

aggregation, drift, and degenerate diffusion:

14

gradient flow in the Wasserstein metric

d

dt⇢ = r · ((rK ⇤ ⇢)⇢) +r · (rV ⇢) +�⇢m K,V : Rd ! R, and m � 1

K(x) = |x|3/3

Page 35: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

aggregation, drift, and degenerate diffusion:

14

gradient flow in the Wasserstein metric

d

dt⇢ = r · ((rK ⇤ ⇢)⇢) +r · (rV ⇢) +�⇢m K,V : Rd ! R, and m � 1

K(x) = |x|3/3� |x| K(x) = |x|3/3

Page 36: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

aggregation, drift, and degenerate diffusion:

14

gradient flow in the Wasserstein metric

d

dt⇢ = r · ((rK ⇤ ⇢)⇢) +r · (rV ⇢) +�⇢m K,V : Rd ! R, and m � 1

K(x) = |x|3/3� |x| K(x) = |x|3/3

Page 37: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

aggregation, drift, and degenerate diffusion:

15

gradient flow in the Wasserstein metric

d

dt⇢ = r · ((rK ⇤ ⇢)⇢) +r · (rV ⇢) +�⇢m K,V : Rd ! R, and m � 1

K(x) = |x|3/3� |x| K(x) = |x|3/3� |x|, V (x) = �x

Page 38: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

aggregation, drift, and degenerate diffusion:

15

gradient flow in the Wasserstein metric

d

dt⇢ = r · ((rK ⇤ ⇢)⇢) +r · (rV ⇢) +�⇢m K,V : Rd ! R, and m � 1

K(x) = |x|3/3� |x| K(x) = |x|3/3� |x|, V (x) = �x

Page 39: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

aggregation, drift, and degenerate diffusion:

15

gradient flow in the Wasserstein metric

d

dt⇢ = r · ((rK ⇤ ⇢)⇢) +r · (rV ⇢) +�⇢m K,V : Rd ! R, and m � 1

K(x) = |x|3/3� |x| K(x) = |x|3/3� |x|, V (x) = �x

Page 40: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

aggregation, drift, and degenerate diffusion:

16

gradient flow in the Wasserstein metric

d

dt⇢ = r · ((rK ⇤ ⇢)⇢) +r · (rV ⇢) +�⇢m K,V : Rd ! R, and m � 1

K(x) = |x|3/3� |x| m = 1,K(x) = |x|3/3� |x|

Page 41: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

aggregation, drift, and degenerate diffusion:

16

gradient flow in the Wasserstein metric

d

dt⇢ = r · ((rK ⇤ ⇢)⇢) +r · (rV ⇢) +�⇢m K,V : Rd ! R, and m � 1

K(x) = |x|3/3� |x| m = 1,K(x) = |x|3/3� |x|

Page 42: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

aggregation, drift, and degenerate diffusion:

16

gradient flow in the Wasserstein metric

d

dt⇢ = r · ((rK ⇤ ⇢)⇢) +r · (rV ⇢) +�⇢m K,V : Rd ! R, and m � 1

K(x) = |x|3/3� |x| m = 1,K(x) = |x|3/3� |x|

Page 43: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

Particle methods

JKO scheme • Pros: reduces simulating grad flow to solving a sequence of optimization

problems involving W2 distance; leverages state of the art W2 solvers. • Cons: current methods lose convexity/stability properties of gradient flow

Finite volume methods, finite element methods… • Pros: adapts numerical approaches inspired by classical fluid mechanics

to gradient flow setting • Cons: current methods lose convexity/stability properties of gradient flow

Without stability, we can’t prove general results on convergence.

17

numerical simulation of W2 grad flow

Page 44: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

A general recipe for a particle methods: (1) approximate ρ₀(x) as a sum of Dirac masses on a grid of spacing h

(2) evolve the locations of the Dirac masses by

(3) is a gradient flow of the original energy;

it inherits all convexity/stability properties, hence

particle methods

18

d

dt

xi(t) = v(xi(t), t) 8i

⇢N (x, t) ! ⇢(x, t)

Goal: Approximate a solution tod

dt

⇢(x, t) +r · (v(x, t)⇢(x, t)) = 0

N

(x, t) =NX

i=1

xi(t)mi

⇢0 ⇡NX

i=1

�ximi

Assume: v(x,t) comes from a Wasserstein gradient flow and is “nice”

Page 45: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

but what about when v(x,t) is not “nice”?

Benefits of particle methods:

(1) positivity preserving

(2) inherently adaptive

(3) energy decreasing

(4) preserves convexity/stability properties of gradient flow at discrete level

particle methods

19

Goal: Approximate a solution tod

dt

⇢(x, t) +r · (v(x, t)⇢(x, t)) = 0

Page 46: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

aggregation, drift, and degenerate diffusion:

v(x,t) is often not nice

20

d

dt⇢ = r · ((rK ⇤ ⇢)⇢) +r · (rV ⇢) +�⇢m K,V : Rd ! R, and m � 1

v = rK ⇤ ⇢+rV +m⇢m�2r⇢

v(x,t)

• Diffusion term is worst: particles do not remain particles

• But even the interaction term can slow down convergence if it has a strong singularity at the origin…

Page 47: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

“Blob Method”: regularize the velocity field to make it nice!

aggregation, drift, and degenerate diffusion:

v(x,t) is often not nice

20

d

dt⇢ = r · ((rK ⇤ ⇢)⇢) +r · (rV ⇢) +�⇢m K,V : Rd ! R, and m � 1

v = rK ⇤ ⇢+rV +m⇢m�2r⇢

v(x,t)

• Diffusion term is worst: particles do not remain particles

• But even the interaction term can slow down convergence if it has a strong singularity at the origin…

Page 48: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

“Blob Method”: regularize the velocity field to make it nice!

aggregation, drift, and degenerate diffusion:

v(x,t) is often not nice

20

d

dt⇢ = r · ((rK ⇤ ⇢)⇢) +r · (rV ⇢) +�⇢m K,V : Rd ! R, and m � 1

v = rK ⇤ ⇢+rV +m⇢m�2r⇢

v(x,t)

• Diffusion term is worst: particles do not remain particles

• But even the interaction term can slow down convergence if it has a strong singularity at the origin…

Page 49: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

A regularized particle method: (0) regularize the interaction kernel via convolution with a mollifier

(1) approximate ρ₀(x) as a sum of Dirac masses on a grid of spacing h

(2) evolve the locations of the Dirac masses by

(3) is a gradient flow of the regularized energy

particle methods

21

Goal: Approximate a solution tod

dt

⇢(x, t) +r · (v(x, t)⇢(x, t)) = 0

a blob method for aggregationd

dt⇢ = r · ((rK ⇤ ⇢)⇢) v = rK ⇤ ⇢

⇢0 ⇡NX

i=1

�ximi

N

(x, t) =NX

i=1

xi(t)mi

E✏(⇢) =

Z(K✏ ⇤ ⇢)⇢

K✏(x) = K ⇤ '✏(x), '✏(x) = '(x/✏)/✏d

Page 50: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

a blob method for aggregationTheorem [C., Bertozzi 2014]: If ε = hq, 0<q<1, the blob method converges as h →0 .

22

two dimensions, aggregationK(x) = log |x|/2π K(x) = |x|2/2 K(x) = |x|3/3

Position -0.5 0.0 0.5

Tim

e

0

1

2

3

• finite vs infinite time collapse• delta function vs delta ring h = 0.04, q = 0.9, m = 4

18 / 23

Page 51: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

aggregation, drift, and diffusion:

• Adding a drift term is straightforward: just do a particle method with

• How can we add diffusion?

• Previous work: stochastic [Liu, Yang 2017], [Huang, Liu 2015], deterministic [Carrillo, Huang, Patacchini, Wolansky 2016]

• Our idea: regularize by convolution with a mollifier.

aggregation + ?

23

d

dt⇢ = r · ((rK ⇤ ⇢)⇢) +r · (rV ⇢) +�⇢m K,V : Rd ! R, and m � 1

v = rK✏ ⇤ ⇢+rV

Page 52: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

Solutions of diffusion equation are gradient flows of

Let’s consider gradient flows of

• Previous work (m=2):[Lions, Mas-Gallic 2000], [P.E. Jabin, in progress]

• For ε > 0, particles remain particles, so can do a particle method for

a blob method for diffusion.

a blob method for degenerate diffusion

24

diffusion equation:d

dt⇢ = �⇢m m � 1

E(⇢) =1

m� 1

Z⇢m

E✏(⇢) =1

m� 1

Z(⇢ ⇤ '✏)

m�1⇢

v = r'✏ ⇤�('✏ ⇤ ⇢)m�2⇢

�+ ('✏ ⇤ ⇢)m�2(r'✏ ⇤ ⇢)

=)

Page 53: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

Solutions of diffusion equation are gradient flows of

Let’s consider gradient flows of

• Previous work (m=2):[Lions, Mas-Gallic 2000], [P.E. Jabin, in progress]

• For ε > 0, particles remain particles, so can do a particle method for

a blob method for diffusion.

a blob method for degenerate diffusion

24

diffusion equation:d

dt⇢ = �⇢m m � 1

E✏(⇢) =1

m� 1

Z(⇢ ⇤ '✏)

m�1⇢

v = r'✏ ⇤�('✏ ⇤ ⇢)m�2⇢

�+ ('✏ ⇤ ⇢)m�2(r'✏ ⇤ ⇢)

=)

Page 54: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

Solutions of diffusion equation are gradient flows of

Let’s consider gradient flows of

• Previous work (m=2):[Lions, Mas-Gallic 2000], [P.E. Jabin, in progress]

• For ε > 0, particles remain particles, so can do a particle method for

a blob method for diffusion.

a blob method for degenerate diffusion

24

diffusion equation:d

dt⇢ = �⇢m m � 1

v = r'✏ ⇤�('✏ ⇤ ⇢)m�2⇢

�+ ('✏ ⇤ ⇢)m�2(r'✏ ⇤ ⇢)

=)

Page 55: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

Solutions of diffusion equation are gradient flows of

Let’s consider gradient flows of

• Previous work (m=2):[Lions, Mas-Gallic 2000], [P.E. Jabin, in progress]

• For ε > 0, particles remain particles, so can do a particle method for

a blob method for diffusion.

a blob method for degenerate diffusion

24

diffusion equation:d

dt⇢ = �⇢m m � 1

v = r'✏ ⇤�('✏ ⇤ ⇢)m�2⇢

�+ ('✏ ⇤ ⇢)m�2(r'✏ ⇤ ⇢)

=)

E(⇢) =

Zlog(⇢)⇢

Page 56: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

Solutions of diffusion equation are gradient flows of

Let’s consider gradient flows of

• Previous work (m=2):[Lions, Mas-Gallic 2000], [P.E. Jabin, in progress]

• For ε > 0, particles remain particles, so can do a particle method for

a blob method for diffusion.

a blob method for degenerate diffusion

24

diffusion equation:d

dt⇢ = �⇢m m � 1

v = r'✏ ⇤�('✏ ⇤ ⇢)m�2⇢

�+ ('✏ ⇤ ⇢)m�2(r'✏ ⇤ ⇢)

=)

E✏(⇢) =

Zlog(⇢ ⇤ '✏)⇢

E(⇢) =

Zlog(⇢)⇢

Page 57: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

a blob method for degenerate diffusion

25

Theorem [Carrillo, C., Patacchini 2017]: Consider

As ε→0, • For all m ≥ 1, Eε Γ-converge to E.

• For m = 2 and initial data with bounded entropy, gradient flows of Eε converge to gradient flows of E.

• For m ≥ 2 and particle initial data with ε = hq, 0<q<1, if a priori estimates hold, gradient flows of Eε converge to gradient flows of E.

E✏(⇢) =

Z(K ⇤ ⇢)⇢+

ZV ⇢+

1

m� 1

Z(⇢ ⇤ '✏)

m�1⇢

Page 58: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

numerics: Keller-Segel (d=2)

26

subcritical mass critical mass supercritical massEvolution of Density

Evolution of Second Moment

Page 59: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

numerics: Keller-Segel (d=2)

26

subcritical mass critical mass supercritical massEvolution of Density

Evolution of Second Moment

Page 60: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

numerics: Keller-Segel (d=2)

26

subcritical mass critical mass supercritical massEvolution of Density

Evolution of Second Moment

Page 61: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

Future work• Convergence for 1 ≤ m < 2?

• Quantitative estimates for m ≥ 2?

• Utility in related fluids and kinetic equations?

Page 62: on Gradient Flow in the Wasserstein Metricweb.math.ucsb.edu/~kcraig/math/curriculum_vitae_files/NIPS_120917.pdfExample: Given x(0) ∈ ℝd, x(t) = x(0)e-t is unique solution of the

Thank you!