DeepONet: Learning nonlinear operators
Lu Lu
Applied Mathematics Instructor, Department of Mathematics, MIT

SIAM Conference on Computational Science and Engineering, Mar 3, 2021
Lu Lu (MIT Math) DeepONet SIAM CSE21 1 / 14
From function to operator

Function: R^d1 → R^d2
  e.g., image classification: image ↦ 5

Operator: function (∞-dim) ↦ function (∞-dim)
  e.g., derivative (local): x(t) ↦ x′(t)
  e.g., integral (global): x(t) ↦ ∫ K(s, t) x(s) ds
  e.g., dynamic systems, biological systems, social systems

⇒ Can we learn operators via neural networks?
⇒ How?
Problem setup

  G : u ∈ V ⊂ C(K1) ↦ G(u) ∈ C(K2)
  G(u) : y ∈ R^d ↦ G(u)(y) ∈ R
Universal Approximation Theorem for Operators

  G : u ∈ V ⊂ C(K1) ↦ G(u) ∈ C(K2)
  G(u) : y ∈ R^d ↦ G(u)(y) ∈ R

Theorem (Chen & Chen, 1995). Suppose that σ is a continuous non-polynomial function, X is a Banach space, K1 ⊂ X and K2 ⊂ R^d are two compact sets in X and R^d, respectively, V is a compact set in C(K1), and G is a continuous operator which maps V into C(K2). Then for any ε > 0, there are positive integers n, p, m, constants c_i^k, ξ_ij^k, θ_i^k, ζ_k ∈ R, w_k ∈ R^d, x_j ∈ K1, for i = 1, …, n, k = 1, …, p, j = 1, …, m, such that

  | G(u)(y) − Σ_{k=1}^p Σ_{i=1}^n c_i^k σ( Σ_{j=1}^m ξ_ij^k u(x_j) + θ_i^k ) σ(w_k · y + ζ_k) | < ε

holds for all u ∈ V and y ∈ K2.
Problem setup

  G : u ∈ V ⊂ C(K1) ↦ G(u) ∈ C(K2)
  G(u) : y ∈ R^d ↦ G(u)(y) ∈ R

[Figure: (A) Inputs & output: the network takes the input function u sampled at sensors, [u(x1), u(x2), …, u(xm)], together with y ∈ R^d, and outputs G(u)(y) ∈ R. (B) Training data: each input function u is evaluated at the same fixed sensors x1, …, xm; the output function G(u) is evaluated at random locations y. (C) Stacked DeepONet: p branch nets produce b1, …, bp, and one trunk net produces t1, …, tp; the output is their inner product. (D) Unstacked DeepONet: a single branch net outputs all of b1, …, bp.]

Inputs: u at sensors {x1, x2, …, xm} ∈ R^m, and y ∈ R^d
Output: G(u)(y) ∈ R

Lu et al., Nature Mach Intell, 2021
Deep operator network (DeepONet)

  G(u)(y) ≈ Σ_{k=1}^p b_k(u) t_k(y)
            (b_k: branch net, t_k: trunk net)

Idea: G(u)(y) is a function of y conditioned on u:
  t_k(y): basis functions of y
  b_k(u): u-dependent coefficients, i.e., functionals of u

Lu et al., Nature Mach Intell, 2021
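The branch-trunk factorization above can be sketched in NumPy with untrained random-weight networks (the widths, tanh activations, and the sine input function are illustrative assumptions; in practice the weights are fit to data):

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(widths):
    """A small random-weight MLP; returns its forward function (tanh hidden layers)."""
    params = [(rng.standard_normal((a, b)) / np.sqrt(a), np.zeros(b))
              for a, b in zip(widths[:-1], widths[1:])]
    def forward(x):
        for i, (W, b) in enumerate(params):
            x = x @ W + b
            if i < len(params) - 1:
                x = np.tanh(x)
        return x
    return forward

m, d, p = 100, 1, 40          # sensors, dim of y, number of branch/trunk outputs
branch = mlp([m, 40, p])      # b(u): sensor values [u(x_1), ..., u(x_m)] -> p coefficients
trunk = mlp([d, 40, p])       # t(y): query location y -> p basis values

xs = np.linspace(0, 1, m)
u_sensors = np.sin(2 * np.pi * xs)[None, :]   # one input function, sampled at the sensors
y = np.array([[0.5]])                         # one query location

# G(u)(y) ≈ Σ_k b_k(u) t_k(y): inner product of branch and trunk outputs
G_uy = np.sum(branch(u_sensors) * trunk(y), axis=-1)
```

Note that the same branch output can be reused across many query points y, which is what makes the factorization cheap to evaluate on dense output grids.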
Error analysis: the number of sensors

Consider G : u(x) ↦ s(x) (x ∈ [0, 1]) defined by the ODE system

  ds(x)/dx = g(s(x), u(x), x),  s(0) = s0.

Each u ∈ V is replaced by its sensor-based approximation um ∈ Vm. Let

  κ(m, V) := sup_{u ∈ V} max_{x ∈ [0,1]} |u(x) − um(x)|.

e.g., for a Gaussian process with kernel exp(−‖x1 − x2‖² / 2l²): κ(m, V) ∼ 1/(m² l²).

Theorem (Lu et al., Nature Mach Intell, 2021). Assume g is Lipschitz continuous (with constant c). Then there exists a DeepONet N such that

  sup_{u ∈ V} max_{x ∈ [0,1]} ‖G(u)(x) − N([u(x1), …, u(xm)], x)‖2 < c e^c κ(m, V).
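The quantity κ(m, V) can be estimated empirically: sample draws of the Gaussian process above, reconstruct each from m equispaced sensors by piecewise-linear interpolation, and take the worst sup-norm error. The grid sizes, sample count, and length scale l = 0.2 below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
l = 0.2                                    # kernel length scale (illustrative)
x_fine = np.linspace(0, 1, 801)            # fine grid standing in for [0, 1]

# Sample 50 functions u ~ GP(0, exp(-|x - x'|^2 / 2l^2)) on the fine grid
K = np.exp(-(x_fine[:, None] - x_fine[None, :]) ** 2 / (2 * l**2))
L = np.linalg.cholesky(K + 1e-10 * np.eye(len(x_fine)))
u = L @ rng.standard_normal((len(x_fine), 50))

def kappa(m):
    """Worst sup-norm error of piecewise-linear reconstruction from m sensors."""
    xs = np.linspace(0, 1, m)
    err = 0.0
    for j in range(u.shape[1]):
        u_m = np.interp(x_fine, xs, np.interp(xs, x_fine, u[:, j]))
        err = max(err, np.abs(u[:, j] - u_m).max())
    return err

errs = [kappa(m) for m in (5, 10, 20, 40, 80)]   # error shrinks as m grows
```

The observed decay is consistent with the κ(m, V) ∼ 1/(m² l²) scaling quoted above, since linear interpolation of a smooth function has O(h²) error.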
Explicit operator: a simple ODE case

  ds(x)/dx = u(x),  x ∈ [0, 1],  s(0) = 0

  G : u(x) ↦ s(y) = s(0) + ∫_0^y u(τ) dτ

Very small generalization error!

[Figure: (A) Best test MSE of FNN, ResNet, Seq2Seq, and Seq2Seq 10x versus stacked/unstacked DeepONets with and without bias. (B) Train/test MSE versus # iterations. (C) L2 relative error versus 1/l. (D) Train/test MSE versus # iterations for poly-fractonomial, Legendre, and GRF input spaces. (E) Train/test MSE versus # epochs for DeepONet versus CNN.]

Lu et al., Nature Mach Intell, 2021
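Training data for this antiderivative operator can be generated directly: sample input functions from a Gaussian random field at the fixed sensors and integrate with the cumulative trapezoidal rule. The length scale, function count, and one-query-per-function layout below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n_funcs, l = 100, 200, 0.2
xs = np.linspace(0, 1, m)                  # fixed sensor locations

# Sample input functions u ~ GP(0, exp(-|x - x'|^2 / 2l^2)) at the sensors
K = np.exp(-(xs[:, None] - xs[None, :]) ** 2 / (2 * l**2))
u = np.linalg.cholesky(K + 1e-10 * np.eye(m)) @ rng.standard_normal((m, n_funcs))

# s(y) = ∫_0^y u(τ) dτ via the cumulative trapezoidal rule, with s(0) = 0
dx = xs[1] - xs[0]
s = np.vstack([np.zeros((1, n_funcs)),
               np.cumsum((u[1:] + u[:-1]) / 2, axis=0) * dx])

# One training triple per function: (u at sensors, random query y, G(u)(y))
idx = rng.integers(0, m, size=n_funcs)
triples = [(u[:, j], xs[idx[j]], s[idx[j], j]) for j in range(n_funcs)]
```

Each triple pairs a full sensor vector with a single random output location, matching the training-data layout in panel B of the architecture figure.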
Implicit operator: gravity pendulum with an external force

  ds1/dt = s2,  ds2/dt = −k sin(s1) + u(t)

  G : u(t) ↦ s(t)

[Figure: test and generalization MSE versus # training data for (A) DeepONet with branch/trunk width 50, (B) DeepONet with branch/trunk width 200, (C) FNN with width 100, (D) Seq2Seq; fitted exponential rates (e^{−x/1400}, e^{−x/3100}, e^{−x/5800}) and polynomial rates (x^{−1.1} to x^{−1.5}) are shown.]

Test/generalization error:
  small dataset: exponential convergence
  large dataset: polynomial rates
  larger networks have a later transition point

Lu et al., Nature Mach Intell, 2021
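Target trajectories for the operator G : u(t) ↦ s(t) can be generated by time-stepping the pendulum system; a minimal forward-Euler sketch is below. The value k = 1, the zero initial state, the sine forcing, and the step count are assumptions for illustration:

```python
import numpy as np

k, nt = 1.0, 1000                          # gravity coefficient and step count (assumed)
t = np.linspace(0, 1, nt + 1)
dt = t[1] - t[0]
u = np.sin(2 * np.pi * t)                  # a sample external force u(t)

s = np.zeros((nt + 1, 2))                  # s = (s1, s2), zero initial state (assumed)
for i in range(nt):
    s1, s2 = s[i]
    s[i + 1] = [s1 + dt * s2,              # ds1/dt = s2
                s2 + dt * (-k * np.sin(s1) + u[i])]   # ds2/dt = -k sin(s1) + u
```

A higher-order integrator (e.g. RK4) would give cleaner training labels; Euler is used here only to keep the sketch short.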
Implicit operator: diffusion-reaction system

  ∂s/∂t = D ∂²s/∂x² + k s² + u(x),  x ∈ [0, 1], t ∈ [0, 1]

# Training points = #u × P

Small dataset: exponential convergence

[Figure: (A) MSE versus P for #u = 50, 100, 200, 400, with exponential fits e^{−x/23} to e^{−x/7.6}; (B) MSE versus #u for P = 50, 100, 200, 400, with fits e^{−x/16} to e^{−x/9.0}; (C, D) the fitted rates with respect to P and #u grow roughly logarithmically, e.g. 0.04 ln(x) − 0.1 and 0.5 + 0.5 ln(x).]

Large dataset: polynomial convergence

[Figure: (A) train/test MSE versus P for #u = 100; (B) train/test MSE versus #u for P = 100; (C) test MSE versus P for #u = 50, 100, 200, 400, with polynomial fits x^{−1.4} to x^{−2.4}; (D) test MSE versus #u for P = 50, 100, 200, 400, with fits x^{−2.6} to x^{−3.7}.]

Lu et al., Nature Mach Intell, 2021
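Solution data for the diffusion-reaction system can be produced with an explicit finite-difference scheme. The coefficient values D = k = 0.01, the sine source term, and the zero initial/boundary conditions below are illustrative assumptions, chosen so the explicit time step is stable (D·dt/dx² ≪ 1/2):

```python
import numpy as np

D, k = 0.01, 0.01                          # diffusion and reaction coefficients (assumed)
nx, nt = 101, 10000
xs = np.linspace(0, 1, nx)
dx, dt = xs[1] - xs[0], 1.0 / nt
u = np.sin(2 * np.pi * xs)                 # a sample source term u(x)

s = np.zeros(nx)                           # zero initial and boundary conditions (assumed)
for _ in range(nt):
    lap = np.zeros(nx)
    lap[1:-1] = (s[2:] - 2 * s[1:-1] + s[:-2]) / dx**2   # second difference in x
    s = s + dt * (D * lap + k * s**2 + u)                # explicit Euler in t
    s[0] = s[-1] = 0.0                                   # enforce boundary values
```

Recording s on a space-time grid for many sampled u, then drawing P random (x, t) query points per function, yields the #u × P training points counted above.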
Stochastic PDE

Consider the elliptic problem with multiplicative noise

  −div(e^{b(x;ω)} ∇u(x;ω)) = f(x),  x ∈ (0, 1),

with zero boundary conditions and f(x) = 10, where the stochastic process b(x;ω) ∼ GP(0, σ² exp(−‖x1 − x2‖² / 2l²)).

  G : b(x;ω) ↦ u(x;ω)

Ideas:
  Karhunen-Loeve (KL) expansion: b(x;ω) ≈ Σ_{i=1}^N √λ_i e_i(x) ξ_i(ω)
  branch net inputs: [√λ1 e1(x), √λ2 e2(x), …, √λN eN(x)] ∈ R^{N×m},
    where √λ_i e_i(x) = √λ_i [e_i(x1), e_i(x2), …, e_i(xm)] ∈ R^m
  trunk net inputs: [x, ξ1, ξ2, …, ξN] ∈ R^{N+1}

Lu et al., Nature Mach Intell, 2021
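The KL expansion can be approximated discretely by eigendecomposing the covariance matrix on the sensor grid; the columns √λ_i e_i then form exactly the branch-net input described above. The grid size m = 128 and N = 8 retained modes are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
m, N = 128, 8                              # sensors and retained KL modes (assumed)
sigma, l = 1.0, 1.5
xs = np.linspace(0, 1, m)

# Covariance matrix of b on the sensor grid and its eigendecomposition
K = sigma**2 * np.exp(-(xs[:, None] - xs[None, :]) ** 2 / (2 * l**2))
lam, phi = np.linalg.eigh(K)               # eigh returns ascending eigenvalues
lam, phi = lam[::-1][:N], phi[:, ::-1][:, :N]   # keep the N leading modes

modes = phi * np.sqrt(np.maximum(lam, 0))  # column i: sqrt(λ_i) e_i at the sensors
xi = rng.standard_normal(N)                # ξ_i(ω) for one random realization
b_sample = modes @ xi                      # b(x;ω) ≈ Σ_i sqrt(λ_i) e_i(x) ξ_i(ω)

branch_input = modes.T                     # shape (N, m), as in the slide
```

Because the squared-exponential kernel with l = 1.5 is very smooth on [0, 1], a handful of modes already reproduce the covariance K almost exactly, which is what makes the low-dimensional (x, ξ1, …, ξN) trunk input workable.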
Stochastic PDE

Example: 10 different random samples of u(x;ω) for b(x;ω) with l = 1.5

[Figure: two panels plotting u(x;ω) versus x on [0, 1], comparing the reference solutions with the DeepONet predictions.]

Lu et al., Nature Mach Intell, 2021
DeepONet for learning operators

DeepONet (Lu et al., Nature Mach Intell, 2021)

Algorithms & applications
  16 ODEs/PDEs (nonlinear, fractional & stochastic)
  Multiphysics & multiscale problems
    Bubble growth dynamics from nm to mm (J Chem Phys, 2021)
    Hypersonics (arXiv:2011.03349)
    Electroconvection (arXiv:2009.12935)

Observation: good performance
  Small generalization error
  Exponential/polynomial error convergence

Theory
  Convergence rate for advection-diffusion equations (arXiv:2102.10621)
DeepM&Mnet: multiphysics & multiscale problems

Hypersonics: coupled flow and finite-rate chemistry behind a normal shock (arXiv:2011.03349)