ADAPTIVE GALERKIN APPROXIMATION ALGORITHMS

FOR PARTIAL DIFFERENTIAL EQUATIONS

IN INFINITE DIMENSIONS

CHRISTOPH SCHWAB AND ENDRE SÜLI

Abstract. Space-time variational formulations of infinite-dimensional Fokker–Planck (FP) and Ornstein–Uhlenbeck (OU) equations for functions on a separable Hilbert space H are developed. The well-posedness of these equations in the Hilbert space L2(H, µ) of functions on H, which are square-integrable with respect to a Gaussian measure µ on H, is proved. Specifically, for the infinite-dimensional FP equation, adaptive space-time Galerkin discretizations, based on a tensorized Riesz basis, built from biorthogonal piecewise polynomial wavelet bases in time and the Hermite polynomial chaos in the Wiener–Itô decomposition of L2(H, µ), are introduced and are shown to converge quasioptimally with respect to the nonlinear, best N-term approximation benchmark. As a consequence, the proposed adaptive Galerkin solution algorithms perform quasioptimally with respect to the best N-term approximation in the finite-dimensional case, in particular. All constants in our error and complexity bounds are shown to be independent of the number of "active" coordinates identified by the proposed adaptive Galerkin approximation algorithms.

1. Introduction

Partial differential equations in infinite dimensions arise in a number of relevant applications; most notably as forward and backward Kolmogorov equations for stochastic partial differential equations (see, e.g., [7, 5] and the references therein). Their numerical solution has, to date, received scant attention, however. Approximations to such equations are typically attempted by path simulations in the corresponding stochastic partial differential equation, whose path-wise solutions belong to function spaces over finite-dimensional domains and can, therefore, be approximated by standard discretization techniques, combined with Monte Carlo path sampling. In the present paper, we propose a novel, adaptive approach to the construction of finite-dimensional numerical approximations to the deterministic forward equation in infinite-dimensional spaces, which exhibit certain optimality properties. The proposed approach is based on space-time variational formulations of these equations, which are posed in Gel'fand triples of Sobolev spaces over a separable Hilbert space H with respect to a Gaussian measure, and on Riesz bases of these spaces, which have been developed for linear and nonlinear parabolic PDEs in finite dimensions in [24, 8, 17].

The structure of the paper is as follows: in the next section we present two space-time variational formulations of linear parabolic equations set in a domain of finite dimension, which we then apply in Section 3 to Fokker–Planck equations that arise in bead-spring chain models for d-dimensional polymeric flow, d ∈ {2, 3}, with chains consisting of K+1 beads whose kinematics are statistically described by a configuration vector q ∈ RKd, K ≫ 1. The probability density function ψ = ψ(q, t) that is sought as the solution of the associated Fokker–Planck equation is therefore a function of Kd spatial variables, with K ≫ 1, and the time variable t. Our aim is to embed this finite-dimensional problem of potentially very high dimension into an infinite-dimensional problem. We therefore turn to Fokker–Planck equations in infinite-dimensional domains by recapitulating

The research reported in this paper was carried out at the Hausdorff Institute for Mathematics (HIM), Bonn, during the HIM Trimester on "High Dimensional Approximation", May–August 2011.

CS was partially supported by the European Research Council (ERC) under the FP7 programme, ERC AdG 247277. ES was partially supported by the EPSRC Science and Innovation award to the Oxford Centre for Nonlinear PDE (EP/E035027/1).



basic facts about Gaussian measures, before proving in Section 5 well-posedness of the second of our two space-time variational formulations in the infinite-dimensional space L2(H, µ), where H denotes a separable Hilbert space and µ is a Gaussian measure on H. We show in particular that the solution operator of the variational formulation is an isomorphism between suitable solution and data spaces. We then turn our attention to adaptive Galerkin approximations. We outline the general principle in Section 6, where we introduce the idea of conversion of abstract, well-posed operator equations on separable Hilbert spaces to equivalent, infinite matrix-vector problems in the sequence space ℓ2(N). After reviewing some relevant facts concerning N-term approximations in ℓ2(N), we introduce abstract adaptive Galerkin approximation algorithms that construct sequences of N-term approximations, which, while not being best N-term approximations, are optimal in the sense that they converge asymptotically at the rate afforded by the best N-term approximation, provided that certain conditions are met by the operators and the Riesz bases used to discretize them. In Section 9 the abstract concepts are applied to infinite-dimensional Fokker–Planck equations. Notably, a Wiener polynomial chaos type Riesz basis in L2(H, µ) and a wavelet basis in time are used for the Galerkin discretization. In Section 10 we verify the abstract assumptions for the specific equations of interest, and in Section 11 we consider the more general setting of nonsymmetric equations with drift, leading to our main result concerning the optimality of adaptive Galerkin discretizations with dimension-independent bounds, in Section 12.

2. Space-time variational formulation of abstract parabolic problems

2.1. Abstract elliptic equations. Suppose that D ⊆ Rd is either a bounded open domain with Lipschitz continuous boundary ∂D, or D = Rd. In D, we consider the elliptic partial differential equation

Au = f in D (2.1)

with suitable homogeneous boundary conditions on ∂D when D is bounded, or subject to a decay condition at infinity when D = Rd. In typical weak formulations of elliptic boundary-value problems, the solution u is sought in a certain separable Hilbert space V of functions defined on D (usually a hilbertian Sobolev space), so that A ∈ L(V, V∗) and f ∈ V∗, where V∗ is the dual space of V, resulting in the following weak formulation of problem (2.1):

Given f ∈ V∗, find u ∈ V such that: a(u, v) = V∗〈f, v〉V ∀v ∈ V.

Here, a(u, v) = V∗〈Au, v〉V, and V∗〈·, ·〉V denotes the V∗ × V duality pairing. It is also assumed that there exists a separable Hilbert space H, identified with its dual H∗, such that

V ↪→ H ∼= H∗ ↪→ V∗, (2.2)

where the continuous embeddings signified by the symbol ↪→ are dense and compact. All Hilbert spaces considered here are, for the sake of simplicity, assumed to be over the field R of real numbers. By (·, ·)H we denote the inner product in H, which extends to the duality pairing V∗〈·, ·〉V on V∗ × V by continuity.

Example 2.1. The prototypical example of a problem of the above form is Poisson's equation on a bounded open Lipschitz domain D ⊂ Rd, subject to a homogeneous Dirichlet boundary condition on ∂D, in which case:

A = −∆,  V = H1_0(D),  H = L2(D),  V∗ = H−1(D) := H1_0(D)∗,  a(u, v) = ∫_D ∇u · ∇v dx.

We shall assume that A is selfadjoint, i.e., A = A∗ (which implies that the bilinear form a(·, ·) is symmetric on V × V), and coercive on V, i.e., there exists a real number γ0 > 0 such that

∀u ∈ V : a(u, u) = V∗〈Au, u〉V ≥ γ0‖u‖2V . (2.3)

Our assumption A ∈ L(V, V∗) implies the existence of a positive real number γ1 = ‖A‖L(V,V∗) ≥ γ0 such that

∀u, v ∈ V : |a(u, v)| ≤ γ1‖u‖V‖v‖V ; (2.4)

i.e., the bilinear form a is bounded.


Hence, (2.3) and (2.4) imply that the energy norm ‖·‖a, defined on V by ‖v‖a := [a(v, v)]1/2, v ∈ V, is equivalent on V to the norm ‖·‖V; viz.,

∀v ∈ V :  γ0 ‖v‖V² ≤ ‖v‖a² ≤ γ1 ‖v‖V².

We recall the following version of the Hilbert–Schmidt theorem (cf. Lemma 15 in [14]).

Theorem 2.2. Suppose that H and V are separable Hilbert spaces, with V densely and compactly embedded in H. Let a : V × V → R be a nonzero, symmetric, coercive and bounded bilinear form. Then, there exist a sequence of real numbers (λn)_{n=1}^∞ and a sequence of unit H-norm elements (ϕn)_{n=1}^∞ in V which solve the following eigenvalue problem: find λ ∈ R and ϕ ∈ V \ {0} such that

a(ϕ, v) = λ (ϕ, v)H   ∀v ∈ V.

The real numbers λn, n ∈ N, which can be assumed to be in increasing order with respect to the index n ∈ N, are positive, bounded away from 0, and lim_{n→∞} λn = +∞.

In addition, the ϕn, n ∈ N, form an orthonormal system in H whose closed span in H is equal to H, and the scaled elements ϕn/√λn, n ∈ N, form an orthonormal system with respect to the inner product defined by the bilinear form a(·, ·), whose closed span with respect to the norm ‖·‖a induced by a(·, ·) is equal to V. Furthermore,

h = ∑_{n=1}^∞ (h, ϕn)H ϕn  and  ‖h‖H² = ∑_{n=1}^∞ (h, ϕn)H²   ∀h ∈ H, (2.5)

and

v = ∑_{n=1}^∞ a(v, ϕn/√λn) ϕn/√λn  and  ‖v‖a² = ∑_{n=1}^∞ a(v, ϕn/√λn)²   ∀v ∈ V;

and, in addition,

h ∈ H and ∑_{n=1}^∞ λn (h, ϕn)H² < ∞  ⟺  h ∈ V.
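As an illustrative aside, the conclusions of Theorem 2.2 can be checked in a finite-dimensional model problem, taking H = Rⁿ with the Euclidean inner product and a(u, v) = uᵀAv for a symmetric positive definite matrix A. The following Python/NumPy sketch (all variable names are our own illustrative choices) verifies the two orthonormality statements and the Parseval identity (2.5):

```python
import numpy as np

# Finite-dimensional model of Theorem 2.2: H = R^n with the Euclidean
# inner product, a(u, v) = u^T A v for a symmetric positive definite A.
rng = np.random.default_rng(0)
n = 6
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)            # symmetric and coercive

lam, Phi = np.linalg.eigh(A)           # eigenpairs, lam in increasing order
assert np.all(lam > 0)

# Columns of Phi are H-orthonormal eigenvectors phi_n ...
assert np.allclose(Phi.T @ Phi, np.eye(n))

# ... and phi_n / sqrt(lam_n) are orthonormal in the a(.,.) inner product.
Psi = Phi / np.sqrt(lam)
assert np.allclose(Psi.T @ A @ Psi, np.eye(n))

# Parseval identity (2.5): ||h||_H^2 = sum_n (h, phi_n)_H^2.
h = rng.standard_normal(n)
coeff = Phi.T @ h
assert np.isclose(h @ h, coeff @ coeff)
```

In infinite dimensions the additional summability condition ∑ λn (h, ϕn)H² < ∞ distinguishes V from H; in the finite-dimensional model this sum is, of course, always finite.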

By virtue of Theorem 2.2, there exist a countable family σ ⊂ R+ of eigenvalues λ of A, bounded away from zero, whose only accumulation point is +∞, and an H-orthonormal sequence {ϕλ : λ ∈ σ} of eigenfunctions ϕλ ∈ V, i.e.,

Aϕλ = λ ϕλ,  λ ∈ σ,  and  (ϕλ, ϕλ′)H = δλλ′,  λ, λ′ ∈ σ.

We then have by (2.5), for each v ∈ H, the Fourier series expansion

v = ∑_{λ∈σ} vλ ϕλ,  vλ := (v, ϕλ)H,

and Parseval's identity

‖v‖H² = ∑_{λ∈σ} |vλ|².

In particular, therefore, the system {ϕλ : λ ∈ σ} forms a normalized Riesz basis of H. The next two lemmas, whose proofs are elementary, show that renormalized versions of {ϕλ : λ ∈ σ} constitute Riesz bases of V and V∗ as well.

Lemma 2.3. The following two-sided bound holds for each v ∈ V:

γ0 ‖v‖V² ≤ ∑_{λ∈σ} λ |vλ|² ≤ γ1 ‖v‖V².

For f ∈ V∗, we have that

f = ∑_{λ∈σ} fλ ϕλ,  fλ := V∗〈f, ϕλ〉V.

Lemma 2.4. The following two-sided bound holds for each f ∈ V∗:

(1/γ1) ‖f‖V∗² ≤ ∑_{λ∈σ} λ⁻¹ |fλ|² ≤ (1/γ0) ‖f‖V∗².


As

u = A⁻¹ f = ∑_{λ∈σ} uλ ϕλ,  with uλ = λ⁻¹ fλ for λ ∈ σ,

it follows that

∑_{λ∈σ} λ |uλ|² = ∑_{λ∈σ} λ⁻¹ |fλ|², (2.6)

where, by Lemma 2.3, the left-hand side of the equality belongs to the interval ‖u‖V² [γ0, γ1] and, by Lemma 2.4, the right-hand side belongs to the interval ‖f‖V∗² [γ1⁻¹, γ0⁻¹]. Here, for a nonnegative real number α and a nonempty bounded closed interval [s, t] ⊂ R, the notations x ∈ α[s, t] and x ∈ [s, t]α are abbreviations for the two-sided inequality αs ≤ x ≤ αt. We deduce from (2.6) in particular that

γ0 ‖u‖V ≤ ‖f‖V∗ ≤ γ1 ‖u‖V.

Thus, the linear operator A⁻¹ is a (bi-Lipschitz) quasi-isometric isomorphism between V∗ and V, when these spaces are equipped with the norms ‖·‖V∗ and ‖·‖V, respectively.
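The diagonal solve underlying (2.6) can likewise be sketched numerically, again with V = H = Rⁿ, so that γ0 and γ1 are the extreme eigenvalues of A and both norms are Euclidean (a sketch, with illustrative names):

```python
import numpy as np

# Finite-dimensional sketch of (2.6): with u = A^{-1} f expanded in the
# eigenbasis, u_lam = f_lam / lam, hence sum lam*u_lam^2 = sum f_lam^2/lam.
rng = np.random.default_rng(1)
n = 8
B = rng.standard_normal((n, n))
A = B @ B.T + np.eye(n)

lam, Phi = np.linalg.eigh(A)
f = rng.standard_normal(n)
f_lam = Phi.T @ f                      # coefficients f_lam = (f, phi_lam)
u_lam = f_lam / lam                    # diagonal solve of A u = f
u = Phi @ u_lam
assert np.allclose(A @ u, f)

# identity (2.6)
assert np.isclose(np.sum(lam * u_lam**2), np.sum(f_lam**2 / lam))

# two-sided bound gamma0 ||u|| <= ||f|| <= gamma1 ||u||
gamma0, gamma1 = lam[0], lam[-1]
nu, nf = np.linalg.norm(u), np.linalg.norm(f)
assert gamma0 * nu <= nf + 1e-12 and nf <= gamma1 * nu + 1e-12
```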

2.2. Abstract parabolic equations. Given T > 0 and the triple (2.2), we consider the abstract parabolic differential equation

∂tu+Au = f in Q := (0, T )×D, (2.7)

where A ∈ L(V,V∗), with ‖A‖L(V,V∗) = γ1 > 0, satisfies A = A∗, and

∃γ0 > 0 ∀v ∈ V :  a(v, v) := V∗〈Av, v〉V ≥ γ0 ‖v‖V².

As our aim is to develop the numerical analysis of space-time adaptive Galerkin discretizations of infinite-dimensional parabolic problems, we begin, following [24], by presenting weak formulations that are amenable to adaptive Galerkin discretizations. To this end, we consider the Bochner spaces L2(0, T; V), L2(0, T; H) and L2(0, T; V∗), and we define

H1(0, T; V) := {u ∈ L2(0, T; V) : u′ ∈ L2(0, T; V)},

as well as

H10,{0}(0, T ;V) := {u ∈ H1(0, T ;V) : u(0) = 0 in V}, (2.8a)

H10,{T}(0, T ;V) := {u ∈ H1(0, T ;V) : u(T ) = 0 in V}. (2.8b)

Here and throughout the rest of the paper, u′ will signify du/dt or ∂u/∂t, depending on the context.

In the variational formulation of the parabolic problem, an important role is played by the space X defined by

X := L2(0, T; V) ∩ H1(0, T; V∗),

which we equip with the norm ‖·‖X defined by

‖v‖X := ( ‖v‖²_{L2(0,T;V)} + ‖v′‖²_{L2(0,T;V∗)} )^{1/2}.

With V, H and V∗ as in the triple (2.2) the following continuous embedding holds:

X ↪→ C([0, T ];H) (2.9)

(in the sense that any v ∈ X is equal almost everywhere to a function that is uniformly continuous as a mapping from the nonempty closed interval [0, T] of the real line into H).

Therefore, for u ∈ X and 0 ≤ t ≤ T, the values u(t) are well-defined in H and there exists a constant C = C(T) > 0 such that

∀u ∈ X ∀t ∈ [0, T] :  ‖u(t)‖H ≤ C ‖u‖X.

In particular, for u ∈ X the values u(0) and u(T) are well-defined in H, and

X0,{0} := {u ∈ X : u(0) = 0 in H},  X0,{T} := {u ∈ X : u(T) = 0 in H}

are closed linear subspaces of X. Henceforth we shall write Y = L2(0, T; V), and we denote by Y∗ the dual space L2(0, T; V∗) of Y, identifying L2(0, T; H) with its own dual.


2.3. First space-time variational formulation. We consider (2.7) in the special case when

Lu := ∂tu+Au = f ∈ Y∗ in Q := (0, T )×D, u(0) = u0 ∈ H, (2.10)

in conjunction with a suitable homogeneous boundary condition incorporated in the (domain of) definition of the linear operator A. We begin by considering the case when the initial condition is also homogeneous, i.e., u0 = 0 ∈ H.

The first space-time variational formulation of the parabolic problem (2.10) is based on the bilinear form

(u, v) ∈ X × Y ↦ B(u, v) := ∫₀ᵀ ( V∗〈u′, v〉V + a(u, v) ) dt ∈ R. (2.11)

Then, the space-time weak formulation of the parabolic problem (2.10) with homogeneous initial condition u0 = 0 in H reads as follows: given f ∈ Y∗, find u ∈ X0,{0} such that

B(u, v) = f(v)   ∀v ∈ Y. (2.12)
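To illustrate the first space-time variational formulation, consider the scalar model case V = H = R and A = λ > 0, in which (2.12) amounts to the ODE u′ + λu = f with u(0) = 0. Discretizing with continuous piecewise linear trial functions (vanishing at t = 0) and piecewise constant test functions reduces (2.12) to the classical box scheme; a minimal sketch under these assumptions (λ, exact solution and step count are arbitrary illustrative choices):

```python
import numpy as np

# Scalar model of (2.12): u' + lam*u = f on (0, T], u(0) = 0.
# Trial: continuous piecewise linears vanishing at t = 0;
# test: piecewise constants. Testing B(u, v) = int (u' + lam*u) v dt
# against the indicator of each time slab gives the box scheme below.
# Manufactured solution u(t) = sin t, so f = cos t + 2 sin t.
lam_c, T, N = 2.0, 1.0, 200
t = np.linspace(0.0, T, N + 1)
h = T / N

F = lambda s: np.sin(s) - 2.0 * np.cos(s)    # antiderivative of f

u = np.zeros(N + 1)                          # u(0) = 0: essential condition
for i in range(1, N + 1):
    rhs = F(t[i]) - F(t[i - 1])              # integral of f over the slab
    # (u_i - u_{i-1}) + lam*h*(u_i + u_{i-1})/2 = int f
    u[i] = (rhs + (1.0 - lam_c * h / 2.0) * u[i - 1]) / (1.0 + lam_c * h / 2.0)

err = np.max(np.abs(u - np.sin(t)))
assert err < 1e-3                            # second-order accurate in h
```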

2.4. Second space-time variational formulation. In (2.12), the initial condition u(0) = 0 is incorporated as an essential condition in the function space X0,{0} in which the solution to the problem is sought. To accommodate a nonhomogeneous initial condition, one may proceed in (at least) two different ways. If

u(0) = u0 ≠ 0 in H, (2.13)

then one can, for example, first subtract from u a function up ∈ X such that up|_{t=0} = u0; we may, in particular, choose for this purpose

up = e⁻ᵗ u0,  for u0 ∈ V ⊂ H.

The disadvantage of this approach is that u0 ∈ V is needed (instead of u0 ∈ H), which can be viewed as an unnecessarily restrictive demand on the regularity of the initial datum u0.

Alternatively, in order to relax the regularity requirement u0 ∈ V to u0 ∈ H, one may impose (2.13) weakly, either by enforcing it via a multiplier as in [24], or through a space-time variational formulation which incorporates it as a natural boundary condition, as follows: we integrate the time derivative by parts, using that

∫₀ᵀ V∗〈u′, v〉V dt = − ∫₀ᵀ V∗〈v′, u〉V dt + (u, v)|₀ᵀ,  u, v ∈ L2(0, T; V) ∩ H1(0, T; V∗). (2.14)

If u(0) = u0 ≠ 0 in H, we require that

v(T) = 0. (2.15)

Thus, (2.14) and (2.15) lead to the weak formulation (2.18) below. To state it, we recall the spaces Y = L2(0, T; V) and

X = L2(0, T; V) ∩ H1(0, T; V∗) (2.16)

and the subspaces (cf. also (2.8))

X0,{0} := {u ∈ X : u(0) = 0 in H}, (2.17a)
X0,{T} := {u ∈ X : u(T) = 0 in H}, (2.17b)

equipped with the norm of X; we shall write ‖·‖X0,{0} and ‖·‖X0,{T} to indicate that the norm of X is applied to an element of X0,{0} and X0,{T}, respectively. Thanks to the continuous embedding (2.9), the spaces X0,{0} and X0,{T} are correctly defined and are closed subspaces of X; in particular, the expression (2.14) is meaningful for u, v ∈ X. The variational form of the parabolic problem (2.10) with weak enforcement of the initial condition, which we shall also refer to as the space-time adjoint weak formulation, then reads: given u0 ∈ H and f ∈ X∗0,{T}, find

u ∈ Y :  B∗(u, v) = ℓ∗(v)   ∀v ∈ X0,{T}. (2.18)

Here the bilinear form B∗(·, ·) is given by

B∗(u, v) := ∫₀ᵀ ( − V∗〈v′, u〉V + a(u, v) ) dt (2.19)


and the linear functional ℓ∗ is defined by

ℓ∗(v) := X∗〈f, v〉X + (u0, v(0))H. (2.20)

Clearly, for f ∈ X∗0,{T} ≃ L2(I; V∗) + H−1_{0,{T}}(I; V) ≃ L2(I) ⊗ V∗ + (H1_{0,{T}}(I))∗ ⊗ V, and for any u0 ∈ H, the functional ℓ∗ in (2.20) is linear and continuous on X0,{T}; i.e., there exists a constant C > 0 such that

∀v ∈ X0,{T} :  |ℓ∗(v)| ≤ C ( ‖f‖X∗0,{T} + ‖u0‖H ) ‖v‖X0,{T}.

2.5. Well-posedness. The well-posedness of the variational problems (2.12) and (2.18) is an immediate consequence of the following result.

Theorem 2.5. Suppose that V and H are separable Hilbert spaces over the field R such that V ↪→ H ≃ H∗ ↪→ V∗ with dense embeddings. Assume further that the bilinear form a(·, ·) : V × V → R is continuous, i.e.,

∃Ma > 0 ∀w, v ∈ V :  |a(w, v)| ≤ Ma ‖v‖V ‖w‖V, (2.21)

and coercive in the sense that it satisfies a Gårding inequality, i.e., there exist ma > 0 and κ ≥ 0 such that

∀v ∈ V :  a(v, v) ≥ ma ‖v‖V² − κ ‖v‖H². (2.22)

Under these hypotheses, there exists a positive constant C such that the bilinear form B(·, ·), as defined in (2.11), satisfies the following inf-sup condition:

inf_{u ∈ X0,{0} \ {0}} sup_{v ∈ Y \ {0}} B(u, v) / ( ‖u‖X0,{0} ‖v‖Y ) ≥ C

and

∀v ∈ Y \ {0} :  sup_{u ∈ X0,{0}} B(u, v) > 0.

Analogously, there exists a positive constant C such that the bilinear form B∗(·, ·) in (2.19) satisfies the following inf-sup condition:

inf_{u ∈ Y \ {0}} sup_{v ∈ X0,{T} \ {0}} B∗(u, v) / ( ‖u‖Y ‖v‖X0,{T} ) ≥ C

and

∀v ∈ X0,{T} \ {0} :  sup_{u ∈ Y} B∗(u, v) > 0.

Furthermore, the bilinear forms B(·, ·) and B∗(·, ·), defined in (2.11) and (2.19), respectively, induce boundedly invertible linear operators B ∈ L(X0,{0}, Y∗) and B∗ ∈ L(Y, (X0,{T})∗), where the spaces X0,{0}, X0,{T} are defined in (2.16) and (2.17), and Y = L2(0, T; V).

The proof is analogous to that of Theorem 5.1 in [24]. The following result is now an immediate consequence of Theorem 2.5.

Theorem 2.6. For every u0 ∈ H and every f ∈ X∗0,{T} = L2(I; V∗) + H−1_{0,{T}}(I; V), there exists a unique weak solution u ∈ Y = L2(0, T; V) of (2.10), in the sense that u satisfies (2.18). Moreover, the operator L : Y → X∗0,{T} is an isomorphism.

In the next section we apply these abstract results to a class of Fokker–Planck equations, posed on high-dimensional domains, that arise in bead-spring chain models in kinetic models of dilute polymers (see, for example, [2] and [3]).


3. Fokker–Planck equation

Suppose that K is a fixed positive integer and d ∈ {2, 3}. Suppose further that Dk ⊂ Rd is either a bounded ball in Rd centred at the origin of Rd and of fixed radius √bk, where bk > 0, k = 1, . . . , K, or that Dk = Rd for all k = 1, . . . , K. By identifying Rd with a ball of radius +∞, we can unify the two scenarios by taking bk ∈ (0, +∞] and by assuming that the bk are either finite for all k = 1, . . . , K or infinite for all k = 1, . . . , K. We define D := D1 × · · · × DK. On the interval [0, bk/2) we consider a function Uk ∈ C1([0, bk/2)), referred to as a potential, such that Uk(0) = 0, Uk is strictly monotonic increasing, and lim_{s→bk/2−} Uk(s) = +∞, k = 1, . . . , K. We then associate with Uk the partial Maxwellian, defined by

Mk(qk) := (1/Zk) exp( −Uk( ½ |qk|² ) ),  qk ∈ Dk,

where

Zk := ∫_{Dk} exp( −Uk( ½ |pk|² ) ) dpk,

for k = 1, . . . , K, and we define the (full) Maxwellian

M(q) := M1(q1) · · · MK(qK),  q = (q1⊤, . . . , qK⊤)⊤ ∈ D1 × · · · × DK = D ⊆ RKd.

Clearly, M(q) > 0 on D, ∫_D M(q) dq = 1, and lim_{|qk|²→bk} Mk(qk) = 0, k = 1, . . . , K. When the domains Dk are bounded balls, we shall suppose that there exist positive constants Ck1 and Ck2, and real numbers αk > 1, k = 1, . . . , K, such that

0 < Ck1 ≤ exp( −Uk( ½ |qk|² ) ) / ( dist(qk, ∂Dk) )^{αk} ≤ Ck2 < ∞,  k = 1, . . . , K.

Alternatively, when Dk = Rd for all k = 1, . . . , K, we shall assume that Uk(s) = s, k = 1, . . . , K.

This section is devoted to the proof of existence and uniqueness of weak solutions to the Fokker–Planck equation

∂tψ − ∑_{i,j=1}^K Aij ∇qj · ( M ∇qi (ψ/M) ) = 0,  (q, t) ∈ D × (0, T], (3.23)

subject to the initial condition

ψ(q, 0) = ψ0(q),  q ∈ D, (3.24)

where A ∈ RK×K is a symmetric positive definite matrix with minimal eigenvalue γ0 > 0 and maximal eigenvalue γ1 > 0, and ψ0 ∈ L1(D) is a nonnegative function such that ∫_D ψ0 dq = 1. As ψ0 is a probability density function, and this property needs to be propagated during the course of the evolution over the time interval [0, T], the boundary condition on ∂D × [0, T] needs to be chosen so that ψ(·, t) remains a nonnegative function for all t ∈ [0, T], and ∫_D ψ(q, t) dq = 1 for all t ∈ [0, T]. This can be achieved, formally at least, by demanding that

∑_{i=1}^K Aij ( M ∇qi (ψ/M) ) · (qj / |qj|) → 0  as |qj|² → bj,  for all j = 1, . . . , K, (3.25)

where either bj ∈ (0, ∞) for all j = 1, . . . , K, when Dj is a bounded ball of radius √bj; or bj := +∞ for all j = 1, . . . , K, when Dj = Rd. By writing

ψ̂ := ψ/M  and  ψ̂0 := ψ0/M,

the initial-value problem (3.23), (3.24) can be restated as follows:

M ∂tψ̂ − ∑_{i,j=1}^K Aij ∇qj · ( M ∇qi ψ̂ ) = 0,  (q, t) ∈ D × (0, T], (3.26)

subject to the initial condition

M(q) ψ̂(q, 0) = M(q) ψ̂0(q),  q ∈ D, (3.27)


together with the (formal) boundary condition

∑_{i=1}^K Aij ( M ∇qi ψ̂ ) · (qj / |qj|) → 0  as |qj|² → bj,  for all j = 1, . . . , K, (3.28)

and the adopted convention that bj := +∞ when Dj = Rd. We consider the Maxwellian-weighted L2 space

L2_M(D) := { ϕ ∈ L1_loc(D) : √M ϕ ∈ L2(D) },

equipped with the inner product (·, ·)_{L2_M(D)} and norm ‖·‖_{L2_M(D)}, defined, respectively, by

(ψ, ϕ)_{L2_M(D)} := ∫_D M(q) ψ(q) ϕ(q) dq  and  ‖ϕ‖_{L2_M(D)} := (ϕ, ϕ)^{1/2}_{L2_M(D)},

and the associated Maxwellian-weighted H1 space

H1_M(D) := { ϕ ∈ L2_M(D) : ∇qk ϕ ∈ [L2_M(D)]^d, k = 1, . . . , K },

equipped with the inner product (·, ·)_{H1_M(D)} and norm ‖·‖_{H1_M(D)}, defined, respectively, by

(ψ, ϕ)_{H1_M(D)} := (ψ, ϕ)_{L2_M(D)} + ∑_{k=1}^K (∇qk ψ, ∇qk ϕ)_{[L2_M(D)]^d}  and  ‖ϕ‖_{H1_M(D)} := (ϕ, ϕ)^{1/2}_{H1_M(D)}.
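In the Gaussian case considered below (d = 1, K = 1, M(q) = (2π)^{−1/2} e^{−q²/2}), the probabilists' Hermite polynomials He_n are orthogonal in L2_M(R), with (He_m, He_n)_{L2_M} = n! δmn; this is the one-dimensional building block of the Hermite polynomial chaos basis mentioned in the introduction. A numerical check via Gauss–Hermite quadrature (a sketch; the quadrature degree and polynomial range are arbitrary choices):

```python
import numpy as np
from numpy.polynomial import hermite_e as He
from math import factorial, sqrt, pi

# Weighted inner product (f, g)_{L2_M} = int f g exp(-q^2/2)/sqrt(2*pi) dq,
# evaluated exactly for polynomials by Gauss-HermiteE quadrature.
x, w = He.hermegauss(20)               # nodes/weights for weight exp(-x^2/2)
for m in range(5):
    for n in range(5):
        cm = np.zeros(m + 1); cm[m] = 1.0     # coefficients of He_m
        cn = np.zeros(n + 1); cn[n] = 1.0     # coefficients of He_n
        ip = np.sum(w * He.hermeval(x, cm) * He.hermeval(x, cn)) / sqrt(2 * pi)
        expected = factorial(n) if m == n else 0.0
        assert np.isclose(ip, expected, atol=1e-10)
```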

We note that, when the Dk are bounded open balls, the open set D, as a Cartesian product of bounded Lipschitz domains, is also a bounded Lipschitz domain (cf. the footnote on p. 10 of [14]).

Under our assumptions on the potentials Uk, k = 1, . . . , K, L2_M(D) and H1_M(D) are separable Hilbert spaces over the field of real numbers (cf. Theorems 2.7.1, 2.8.1 and 8.10.2 in [19]). Clearly, H1_M(D) is continuously embedded in L2_M(D). In fact, the embedding H1_M(D) ↪→ L2_M(D) is dense and compact (in the case when each Dk, k = 1, . . . , K, is a bounded ball, we refer for the proofs of these statements to the Appendix in Barrett & Süli [1] and Appendices B, C, D in the extended version of Barrett & Süli [2]; in the case when Dk = Rd for each k = 1, . . . , K, we refer to Appendices A and D in Barrett & Süli [3]).

Adopting the notations introduced in the previous sections, we take H := L2_M(D), V := H1_M(D), and consider the linear differential operator

Aϕ := − ∑_{i,j=1}^K Aij ∇qj · ( M ∇qi ϕ ),  ϕ ∈ V,

which maps V into its dual space V∗ (with respect to the pivot space H).

We strengthen our original assumption ψ0 ∈ L1(D) by demanding that ψ̂0 ∈ L2_M(D) (note that ‖ψ0‖_{L1(D)} ≤ ‖ψ̂0‖_{L2_M(D)} = ‖ψ̂0‖H for all ψ̂0 ∈ H = L2_M(D)), and we consider the following weak formulation of the Fokker–Planck initial-boundary-value problem: given ψ̂0 ∈ H := L2_M(D), find ψ̂ ∈ Y := L2(0, T; V) = L2(0, T; H1_M(D)) such that

B∗(ψ̂, ϕ) := ∫₀ᵀ ( − V∗〈ϕ′, ψ̂〉V + a(ψ̂, ϕ) ) dt = (ψ̂0, ϕ(0))H   ∀ϕ ∈ X0,{T}. (3.29)

Here, as in the previous sections, ϕ′ signifies the derivative of ϕ with respect to the variable t. On recalling the Brascamp–Lieb inequality (cf. [18], Corollary 2.2) in the case of bk < ∞, k = 1, . . . , K, and the Poincaré inequality for Gaussian measures (cf. the work of Beckner [4], Theorem 1, inequality (3) with p = 1) when bk = ∞, k = 1, . . . , K, we deduce from Theorem 2.6 the existence of a unique weak solution ψ̂ ∈ L2(0, T; H1_M(D)) to the Fokker–Planck equation (3.26), (3.27). By choosing in the weak formulation (3.29) the test function ϕ from the dense linear subspace {ϕ ∈ C∞([0, T]; H1_M(D)) : ϕ(·, T) = 0} of X0,{T}, it then follows from (3.29) that ψ̂ ∈ H1(0, T; H1_M(D)∗), and therefore ψ̂ ∈ L2(0, T; H1_M(D)) ∩ H1(0, T; H1_M(D)∗). Thus, returning to our original variable ψ, we deduce the existence of a unique weak solution ψ to (3.23), (3.24), with ψ/M ∈ L2(0, T; H1_M(D)) ∩ H1(0, T; H1_M(D)∗). The boundary condition (3.25) (or, equivalently, (3.28)) is imposed weakly, by demanding that the function ψ (or, equivalently, ψ̂) belongs to the appropriate function space for weak solutions.


Henceforth we shall concentrate on the case that is simplest from the technical point of view, when Dk = Rd and Uk(s) = s for all k = 1, . . . , K; then,

Mk(qk) = (1/Zk) exp( −½ |qk|² ),  qk ∈ Rd,  where  Zk := ∫_{Rd} exp( −½ |pk|² ) dpk,  k = 1, . . . , K,

and

M(q) = M1(q1) · · · MK(qK) = (1/Z) exp( −½ |q|² ),  q = (q1⊤, . . . , qK⊤)⊤ ∈ RKd,

where |q|² = |q1|² + · · · + |qK|² and Z = ∏_{k=1}^K Zk.
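For this Gaussian Maxwellian each normalization constant is Zk = (2π)^{d/2}, so that ∫_{Rd} Mk(qk) dqk = 1. A quick numerical sanity check for d = 2 on a truncated grid (a sketch; grid size and truncation radius are arbitrary choices):

```python
import numpy as np

# Gaussian Maxwellian with U_k(s) = s on D_k = R^d: Z_k = (2*pi)^{d/2}.
# Numerically confirm that M_k integrates to 1 for d = 2.
d = 2
L, n = 8.0, 400
x = np.linspace(-L, L, n)
dx = x[1] - x[0]
X, Y = np.meshgrid(x, x)
Zk = (2 * np.pi) ** (d / 2)
Mk = np.exp(-0.5 * (X**2 + Y**2)) / Zk
integral = np.sum(Mk) * dx * dx        # Riemann sum; Gaussian decay makes
assert abs(integral - 1.0) < 1e-6      # the truncation error negligible
```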

We are particularly interested in the case when K ≫ 1; specifically, our objective is to develop adaptive numerical algorithms with quantitative optimality properties that are dimensionally robust, i.e., whose computational complexity does not exhibit the curse of dimensionality. To this end, we shall embed the weak formulation of the Fokker–Planck initial-boundary-value problem (3.29), with a finite but arbitrarily large number of independent variables qk, k = 1, . . . , K, into a problem with countably many variables qk, k = 1, 2, . . . , corresponding to formally setting K = +∞. Thus, instead of working on D = RKd, we shall be interested in the case when D = R∞ (cf. the next section for the definition of R∞), which we shall equip with a finite measure, the Gaussian measure on R∞. We shall employ for this purpose the notion of a Gaussian measure on a separable Hilbert space. Our exposition in the next section closely follows Sections 1.1, 1.2, 9.1 and 9.2 of the monograph of Da Prato and Zabczyk [12].

4. Gaussian measures

Suppose that H is a separable Hilbert space with norm ‖·‖H and inner product (·, ·)H over the field of real numbers. We denote by L(H) the space of all bounded linear operators from H into H, equipped with the associated operator norm ‖·‖L(H), and we denote by L+(H) the subset of L(H) consisting of all nonnegative symmetric linear operators. Finally, B(H) will signify the σ-algebra of all Borel subsets of H. A bounded linear operator R ∈ L(H) is said to be of trace class if there exist two sequences (ak)_{k=1}^∞ and (bk)_{k=1}^∞ in H such that

Rh = ∑_{k=1}^∞ (h, ak)H bk,  h ∈ H, (4.1)

and

∑_{k=1}^∞ ‖ak‖H ‖bk‖H < ∞.

The set of all elements of $L(H)$ that are of trace class will be denoted by $L_1(H)$. We note that if $R \in L_1(H)$, then $R$ is a compact linear operator on $H$. The set $L_1(H)$, endowed with the usual operations of addition and scalar multiplication, is a Banach space with the norm
\[
\|R\|_{L_1(H)} := \inf\bigg\{ \sum_{k=1}^\infty \|a_k\|_H\, \|b_k\|_H \,:\, Rh = \sum_{k=1}^\infty (h, a_k)_H\, b_k, \ h \in H, \ (a_k)_{k=1}^\infty, (b_k)_{k=1}^\infty \subset H \bigg\}.
\]

Assuming that $R \in L_1(H)$, the trace $\operatorname{Tr} R$ of $R$ is defined by the formula
\[
\operatorname{Tr} R = \sum_{k=1}^\infty (R e_k, e_k)_H,
\]
where at this stage $(e_k)_{k=1}^\infty$ is any complete orthonormal basis in $H$. In particular, if $R \in L_1(H)$ is expressed by (4.1), then
\[
\operatorname{Tr} R = \sum_{k=1}^\infty (a_k, b_k)_H.
\]
The definition of the trace being independent of the choice of the basis, we have that $|\operatorname{Tr} R| \le \|R\|_{L_1(H)}$. The next proposition (cf. Pietsch [22], Theorem 8.3.3) sharpens these statements further.
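The basis-independence of the trace and the bound $|\operatorname{Tr} R| \le \|R\|_{L_1(H)}$ are easy to exercise numerically in a finite-dimensional truncation. A minimal illustrative sketch (NumPy; the matrix and the orthonormal basis below are arbitrary choices, and in finite dimensions the trace-class norm is the nuclear norm, i.e., the sum of the singular values):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
R = rng.standard_normal((n, n))            # a generic linear operator on R^n

# Trace via the canonical basis and via a random orthonormal basis (QR factor):
# the value does not depend on the choice of basis.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
tr_canonical = sum(R[k, k] for k in range(n))
tr_rotated = sum(Q[:, k] @ R @ Q[:, k] for k in range(n))
assert np.isclose(tr_canonical, tr_rotated)

# |Tr R| <= ||R||_{L_1}: the nuclear norm is the sum of the singular values.
nuclear = np.linalg.svd(R, compute_uv=False).sum()
assert abs(np.trace(R)) <= nuclear + 1e-12
```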

Page 10: ADAPTIVE GALERKIN APPROXIMATION ALGORITHMS FOR …people.maths.ox.ac.uk/suli/schwab.suli.Fokker_Planck.pdf · FOR PARTIAL DIFFERENTIAL EQUATIONS IN INFINITE DIMENSIONS CHRISTOPH SCHWAB

10 CHRISTOPH SCHWAB AND ENDRE SULI

Proposition 4.1. Suppose that $R$ is a compact selfadjoint linear operator on a separable Hilbert space $H$, and that $(\lambda_k)_{k=1}^\infty$ are its eigenvalues (repeated according to their multiplicity). Then, $R \in L_1(H)$ if, and only if, $\sum_{k=1}^\infty |\lambda_k| < +\infty$. Furthermore, $\|R\|_{L_1(H)} = \sum_{k=1}^\infty |\lambda_k|$ and $\operatorname{Tr} R = \sum_{k=1}^\infty \lambda_k$. In particular, if $\lambda_k \ge 0$ for all $k \in \mathbb{N}$, then $\operatorname{Tr} R = \|R\|_{L_1(H)}$.

Let $\mathbb{R}^\infty$ denote the linear space of all sequences $x = (x_k)_{k=1}^\infty$ of real numbers, equipped with the metric
\[
d(x, y) := \sum_{k=1}^\infty 2^{-k}\, \frac{|x_k - y_k|}{1 + |x_k - y_k|}, \qquad x, y \in \mathbb{R}^\infty.
\]

Let further $\ell^2(\mathbb{N})$ denote the Hilbert space of all sequences $x = (x_k)_{k=1}^\infty \in \mathbb{R}^\infty$ such that
\[
\|x\|_{\ell^2(\mathbb{N})} := \bigg( \sum_{k=1}^\infty |x_k|^2 \bigg)^{1/2} < +\infty;
\]
the inner product on $\ell^2(\mathbb{N})$ is defined by $(x, y)_{\ell^2(\mathbb{N})} := \sum_{k=1}^\infty x_k y_k$, for $x, y \in \ell^2(\mathbb{N})$.

Let us further consider, for $a \in \mathbb{R}$ and for $\lambda > 0$, the Gaussian measure on $\mathbb{R}$ with mean $a$ and variance $\lambda$, defined by
\[
N_{a,\lambda}(\mathrm{d}x) := \frac{1}{\sqrt{2\pi\lambda}}\, \mathrm{e}^{-\frac{(x-a)^2}{2\lambda}}\, \mathrm{d}x.
\]

The following result is stated as Theorem 1.2.1 in [12].

Theorem 4.2. Suppose that $a \in H$ and $Q \in L_1^+(H)$. Then, there exists a unique probability measure $\mu$ on $(H, \mathcal{B}(H))$ such that
\[
\int_H \mathrm{e}^{\imath (h, x)_H}\, \mu(\mathrm{d}x) = \mathrm{e}^{\imath (a, h)_H}\, \mathrm{e}^{-\frac{1}{2}(Qh, h)_H}, \qquad h \in H.
\]
Moreover, $\mu$ is the restriction to $H$ (identified with the Hilbert space $\ell^2(\mathbb{N})$) of the product measure
\[
\bigotimes_{k=1}^\infty \mu_k = \bigotimes_{k=1}^\infty N_{a_k, \lambda_k},
\]
defined on $(\mathbb{R}^\infty, \mathcal{B}(\mathbb{R}^\infty))$.

We shall write $\mu = N_{a,Q}$, and we call $a$ the mean and the trace class operator $Q$ the covariance operator of $\mu$. The measure $\mu$ will be referred to as a Gaussian measure on $H$ with mean $a$ and covariance operator $Q$. If the law of a random variable is a Gaussian measure, then the random variable is said to be Gaussian. In particular, Theorem 4.2 implies that a random variable $X$ with values in $H$ is Gaussian if, and only if, for any $h \in H$ the real-valued random variable $(h, X)_H$ is Gaussian.
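In a finite-dimensional truncation, the product-measure structure in Theorem 4.2 translates directly into a sampling recipe: in the eigenbasis of $Q$, draw independent scalars $\xi_k \sim N(0,1)$ and set $X = a + \sum_k \sqrt{\lambda_k}\, \xi_k e_k$. A hedged numerical sketch (the eigenvalue sequence and the test vector $h$ below are arbitrary illustrative choices) checking the characteristic functional:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4                                    # truncation: finitely many modes
lam = 1.0 / (1.0 + np.arange(n)) ** 2    # eigenvalues of Q (summable, illustrative)
a = np.zeros(n)                          # mean

# In the eigenbasis of Q the law N_{a,Q} is the product measure with
# k-th factor N(a_k, lam_k); sampling is therefore coordinatewise.
m = 200_000
X = a + rng.standard_normal((m, n)) * np.sqrt(lam)

# Empirical check of the characteristic functional of Theorem 4.2:
# E[exp(i (h, X))] = exp(i (a, h) - (Q h, h)/2).
h = np.array([0.3, -0.7, 0.5, 0.1])
emp = np.exp(1j * X @ h).mean()
exact = np.exp(1j * (a @ h) - 0.5 * (lam * h ** 2).sum())
assert abs(emp - exact) < 0.01
```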

For $\mu = N_{a,Q}$, we denote by $L^2(H, \mu)$ the Hilbert space of all square-integrable (equivalence classes of) functions from $H$ into $\mathbb{R}$, with inner product
\[
(u, v)_{L^2(H,\mu)} = \int_H u(x)\, v(x)\, \mu(\mathrm{d}x), \qquad u, v \in L^2(H, \mu),
\]
and norm $\|u\|_{L^2(H,\mu)} = (u, u)^{1/2}_{L^2(H,\mu)}$, $u \in L^2(H, \mu)$. Analogously, we shall denote by $L^2(H, \mu; H)$ the Hilbert space of all square-integrable (equivalence classes of) functions from $H$ into $H$, with inner product
\[
(u, v)_{L^2(H,\mu;H)} = \int_H (u(x), v(x))_H\, \mu(\mathrm{d}x), \qquad u, v \in L^2(H, \mu; H),
\]
and norm $\|u\|_{L^2(H,\mu;H)} = (u, u)^{1/2}_{L^2(H,\mu;H)}$, $u \in L^2(H, \mu; H)$.

Throughout the rest of the paper, $\mu = N_Q := N_{0,Q}$, where $Q \in L_1^+(H)$. Moreover, we shall assume that $\operatorname{Ker}(Q) = \{0\}$. We shall also suppose that there exist a complete orthonormal system $(e_k)_{k=1}^\infty$ in $H$ and a sequence $(\lambda_k)_{k=1}^\infty$ of positive real numbers, the eigenvalues of $Q$, such that


$Q e_k = \lambda_k e_k$, $k \in \mathbb{N}$. The eigenvalues $\lambda_k$ are assumed to be enumerated in decreasing order (and repeated according to their multiplicity), accumulating only at zero. For $x \in H$, we shall then write $x_k = (x, e_k)_H$, $k \in \mathbb{N}$.

The subspace $Q^{1/2}(H) = \{Q^{1/2}h : h \in H\}$ is called the reproducing kernel of the Gaussian measure $N_Q := N_{0,Q}$. If $\operatorname{Ker}(Q) = \{0\}$, as has been assumed, then $Q^{1/2}(H)$ is dense in $H$. In fact, if $x_0 \in H$ is such that $(Q^{1/2}h, x_0)_H = 0$ for all $h \in H$, then $Q^{1/2}x_0 = 0$, and therefore $Qx_0 = 0$, which implies that $x_0 = 0$.

For $Q \in L_1^+(H)$ such that $\operatorname{Ker}(Q) = \{0\}$, we introduce the isomorphism $W$ from $H$ into $L^2(H, N_Q)$ as follows: for $f \in Q^{1/2}(H)$, let $W_f \in L^2(H, N_Q)$ be defined by
\[
W_f(x) = (Q^{-1/2}f, x)_H, \qquad x \in H.
\]

Let $\mu = N_Q$. We next define Hermite polynomials in $L^2(H, \mu)$. Let us consider to this end the set $\Gamma$ of all mappings $\gamma : \mathbb{N} \to \{0\} \cup \mathbb{N}$, $n \mapsto \gamma_n$, such that $|\gamma| := \sum_{k=1}^\infty \gamma_k < +\infty$. Clearly, $\gamma \in \Gamma$ if, and only if, $\gamma_n = 0$ for all, except possibly finitely many, $n \in \mathbb{N}$. For any $\gamma \in \Gamma$ we define the Hermite polynomial
\[
H_\gamma(x) := \prod_{k=1}^\infty H_{\gamma_k}(W_{e_k}(x)), \qquad x \in H, \tag{4.2}
\]

where the factors in the product on the right-hand side are defined by
\[
H_n(\xi) = \frac{(-1)^n}{\sqrt{n!}}\, \mathrm{e}^{\frac{\xi^2}{2}}\, \frac{\mathrm{d}^n}{\mathrm{d}\xi^n}\Big( \mathrm{e}^{-\frac{\xi^2}{2}} \Big), \qquad \xi \in \mathbb{R}, \quad n \in \{0\} \cup \mathbb{N}; \tag{4.3}
\]
$H_n$ is the classical univariate Hermite polynomial of degree $n$ on $\mathbb{R}$. Note that $H_0 \equiv 1$, so that for each $\gamma \in \Gamma$ the countable product $H_\gamma(x)$ contains only finitely many nontrivial factors and is, therefore, well-defined. Moreover, with the Gaussian measure $\mu = N_Q$ being a countable product measure, we have that
\[
\forall \gamma, \gamma' \in \Gamma : \quad (H_\gamma, H_{\gamma'})_{L^2(H,\mu)} = \delta_{\gamma,\gamma'}. \tag{4.4}
\]
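The orthonormality relation (4.4) reduces, factor by factor, to the orthonormality of the normalized univariate polynomials (4.3) under the standard Gaussian measure, which can be checked with Gauss–Hermite quadrature. An illustrative sketch (NumPy's `hermite_e` module implements the probabilists' polynomials $\mathrm{He}_n = \sqrt{n!}\, H_n$; quadrature degree and polynomial range are arbitrary choices):

```python
import numpy as np
from numpy.polynomial import hermite_e as He
from math import factorial, sqrt, pi

def H(n, x):
    """Normalized probabilists' Hermite polynomial H_n = He_n / sqrt(n!)."""
    c = np.zeros(n + 1); c[n] = 1.0
    return He.hermeval(x, c) / sqrt(factorial(n))

# hermegauss integrates exactly against the weight exp(-x^2/2);
# dividing the weights by sqrt(2*pi) gives integration against N(0,1).
x, w = He.hermegauss(40)
w = w / sqrt(2.0 * pi)

G = np.array([[np.sum(w * H(m, x) * H(n, x)) for n in range(6)]
              for m in range(6)])
assert np.allclose(G, np.eye(6), atol=1e-8)   # Gram matrix is the identity
```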

We shall denote by $\mathcal{E}(H)$ the linear space spanned by all exponential functions, that is, all functions $\varphi : x \in H \mapsto \varphi(x) \in \mathbb{R}$ of the form
\[
\varphi(x) = \mathrm{e}^{(h,x)_H}, \qquad h \in H.
\]
It follows from Proposition 1.2.5 in Da Prato and Zabczyk [12] that $\mathcal{E}(H)$ is dense in $L^2(H, \mu)$. On account of the separability of $H$, $L^2(H, \mu)$ is separable.

For any $k \in \mathbb{N}$ we consider the partial derivative in the direction $e_k$ (with $e_k$ as above), defined by
\[
D_k \varphi(x) = \lim_{\varepsilon \to 0} \frac{1}{\varepsilon} \big( \varphi(x + \varepsilon e_k) - \varphi(x) \big), \qquad x \in H, \quad \varphi \in \mathcal{E}(H).
\]
Thus, if $\varphi(x) = \mathrm{e}^{(h,x)_H}$ with $h \in H$, then clearly
\[
D_k \varphi(x) = \mathrm{e}^{(h,x)_H}\, h_k, \qquad \text{where } h_k = (h, e_k)_H.
\]

Let $\Lambda_0$ denote the linear span of $\{H_\gamma \otimes e_k : \gamma \in \Gamma,\ k \in \mathbb{N}\}$, with $H_\gamma$ and $e_k$ as above and $\mu = N_Q$, and define the linear operator
\[
D : \mathcal{E}(H) \subset L^2(H, \mu) \to L^2(H, \mu; H) \qquad \text{by} \qquad D\varphi(x) := \sum_{k=1}^\infty D_k \varphi(x)\, e_k, \qquad x \in H.
\]
Thanks to Proposition 9.2.2 in Da Prato and Zabczyk [12], $D_k$ is a closable linear operator for all $k \in \mathbb{N}$. If $\varphi$ belongs to the domain of the closure of $D_k$, which we shall still denote by $D_k$, we shall say that $D_k \varphi$ belongs to $L^2(H, \mu)$.

Analogously, by Proposition 9.2.4 in Da Prato and Zabczyk [12], $D$ is a closable linear operator. If $\varphi$ belongs to the domain of the closure of $D$, which we shall still denote by $D$, we shall say that $D\varphi$ belongs to $L^2(H, \mu; H)$.


For $\mu = N_Q$, let us denote by $W^{1,2}(H, \mu)$ the linear space of all functions $\varphi \in L^2(H, \mu)$ such that $D\varphi \in L^2(H, \mu; H)$, equipped with the inner product
\[
(\varphi, \psi)_{W^{1,2}(H,\mu)} := (\varphi, \psi)_{L^2(H,\mu)} + \int_H (D\varphi(x), D\psi(x))_H\, \mu(\mathrm{d}x)
\]
and norm $\|\varphi\|_{W^{1,2}(H,\mu)} = (\varphi, \varphi)^{1/2}_{W^{1,2}(H,\mu)}$. Then, the Sobolev space $W^{1,2}(H, \mu)$ is complete, and is therefore a separable Hilbert space. In particular, for any $\varphi \in L^2(H, \mu)$, we have that
\[
\varphi = \sum_{\gamma \in \Gamma} \varphi_\gamma H_\gamma, \qquad \text{where } \varphi_\gamma := (\varphi, H_\gamma)_{L^2(H,\mu)}.
\]

The next theorem, proved independently by Da Prato, Malliavin and Nualart [11] and by Peszat [21], provides an analogous characterization of functions belonging to $W^{1,2}(H, \mu)$ in terms of the complete orthonormal basis $(H_\gamma)_{\gamma \in \Gamma}$.

Theorem 4.3. A function $\varphi \in L^2(H, \mu)$ belongs to $W^{1,2}(H, \mu)$ if, and only if,
\[
\sum_{\gamma \in \Gamma} \langle \gamma, \lambda^{-1} \rangle\, |\varphi_\gamma|^2 < +\infty, \tag{4.5}
\]
where $\varphi_\gamma := (\varphi, H_\gamma)_{L^2(H,\mu)}$, $\langle \gamma, \lambda^{-1} \rangle := \sum_{k=1}^\infty \gamma_k \lambda_k^{-1}$, and $(\lambda_k)_{k=1}^\infty$ is the sequence of (positive) eigenvalues (repeated according to their multiplicity) of the covariance operator $Q \in L_1^+(H)$, $\operatorname{Ker}(Q) = \{0\}$. Moreover, if (4.5) holds, then
\[
\|\varphi\|^2_{W^{1,2}(H,\mu)} = \|\varphi\|^2_{L^2(H,\mu)} + \sum_{\gamma \in \Gamma} \langle \gamma, \lambda^{-1} \rangle\, |\varphi_\gamma|^2.
\]

Identifying $L^2(H, \mu)$ with its own dual $L^2(H, \mu)^*$, we obtain
\[
\varphi \in (W^{1,2}(H, \mu))^* \iff \sum_{\gamma \in \Gamma} \langle \gamma, \lambda \rangle\, |\varphi_\gamma|^2 < +\infty.
\]
Furthermore, the embedding of $W^{1,2}(H, \mu)$ into $L^2(H, \mu)$ is compact.

For a proof of these statements we refer, for example, to [12, Theorem 9.2.12]. As $\mathcal{E}(H)$ is dense in both $L^2(H, \mu)$ and $W^{1,2}(H, \mu)$, the embedding of $W^{1,2}(H, \mu)$ into $L^2(H, \mu)$ is also dense.

5. Fokker–Planck equation in countably many dimensions

For $k \in \mathbb{N}$, let us denote by $D_k$ the set $\mathbb{R}^d$ equipped with the Gaussian measure
\[
\mu_k(\mathrm{d}q_k) = N_{a_k, \Sigma_k}(\mathrm{d}q_k) = \frac{1}{(2\pi)^{d/2}\, [\det(\Sigma_k)]^{1/2}}\, \exp\Big({-\tfrac{1}{2}\, (q_k - a_k)^\top \Sigma_k^{-1} (q_k - a_k)}\Big)\, \mathrm{d}q_k,
\]
with mean $a_k \in \mathbb{R}^d$ and positive definite covariance matrix $\Sigma_k \in \mathbb{R}^{d \times d}$. We shall assume henceforth that $a_k = 0$ for all $k \in \mathbb{N}$ and that the covariance operator $Q$, represented by the bi-infinite block-diagonal matrix
\[
\Sigma = \operatorname{diag}(\Sigma_1, \Sigma_2, \dots),
\]
with $d \times d$ diagonal blocks $\Sigma_k$, $k = 1, 2, \dots$, is trace class. We define
\[
D := \bigtimes_{k=1}^\infty D_k,
\]
so that
\[
q := (q_1^\top, q_2^\top, \dots)^\top \in D, \qquad q_k \in D_k, \quad k = 1, 2, \dots.
\]
We equip the domain $D$ with the product measure
\[
\mu := \bigotimes_{k=1}^\infty \mu_k = \bigotimes_{k=1}^\infty N_{0, \Sigma_k}.
\]


Let $A = (A_{ij})_{i,j=1}^\infty \in \mathbb{R}^{\infty \times \infty}$ be a symmetric infinite matrix, i.e., $A_{ij} = A_{ji}$ for all $i, j \in \mathbb{N}$. Suppose further that there exists a real number $\gamma_0 > 0$ such that
\[
\sum_{i,j=1}^\infty A_{ij}\, \xi_i \xi_j \ge \gamma_0 \|\xi\|^2_{\ell^2(\mathbb{N})} \qquad \text{for all } \xi = (\xi_i)_{i=1}^\infty \in \ell^2(\mathbb{N}), \tag{5.1}
\]
and a real number $\gamma_1 > 0$ such that
\[
\bigg| \sum_{i,j=1}^\infty A_{ij}\, \xi_i \eta_j \bigg| \le \gamma_1 \|\xi\|_{\ell^2(\mathbb{N})}\, \|\eta\|_{\ell^2(\mathbb{N})} \qquad \text{for all } \xi = (\xi_i)_{i=1}^\infty \in \ell^2(\mathbb{N}) \text{ and } \eta = (\eta_i)_{i=1}^\infty \in \ell^2(\mathbb{N}). \tag{5.2}
\]

Example 5.1. As an example of an infinite matrix $A$ that satisfies (5.1) and (5.2), we mention
\[
A[\varepsilon] = \operatorname{tridiag}\{(\varepsilon_j, 1, \varepsilon_j) : j = 1, 2, \dots\}, \tag{5.3}
\]
where the sequence $\varepsilon = (\varepsilon_j)_{j=1}^\infty$ is assumed to satisfy $\|\varepsilon\|_{\ell^\infty(\mathbb{N})} < 1/2$. Then, the matrix $A[\varepsilon]$ in (5.3) satisfies (5.1) with $\gamma_0 = 1 - 2\|\varepsilon\|_{\ell^\infty(\mathbb{N})}$ and (5.2) with $\gamma_1 = 1 + 2\|\varepsilon\|_{\ell^\infty(\mathbb{N})}$.
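The constants $\gamma_0$ and $\gamma_1$ in Example 5.1 follow from Gershgorin's theorem, since every row of $A[\varepsilon]$ has off-diagonal sum at most $2\|\varepsilon\|_{\ell^\infty(\mathbb{N})}$. A numerical sanity check on a finite section (the sequence $\varepsilon$ below is an arbitrary illustrative choice):

```python
import numpy as np

n = 50
eps = 0.4 / (1.0 + np.arange(n - 1))           # ||eps||_inf = 0.4 < 1/2
# Symmetric finite section of A[eps]: ones on the diagonal, eps_j off-diagonal.
A = np.eye(n) + np.diag(eps, 1) + np.diag(eps, -1)

g0 = 1.0 - 2.0 * eps.max()                     # coercivity constant, cf. (5.1)
g1 = 1.0 + 2.0 * eps.max()                     # boundedness constant, cf. (5.2)
ev = np.linalg.eigvalsh(A)
assert ev.min() >= g0 - 1e-12 and ev.max() <= g1 + 1e-12
```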

Example 5.2. More generally, we may consider $A$ of block-diagonal form $A = \operatorname{diag}\{A_{11}, A_{22}\}$, where $A_{11} \in \mathbb{R}^{d \times d}$ is symmetric positive definite and $A_{22}$ is as in Example 5.1. Then, (5.1) and (5.2) hold with constants $0 < \gamma_0 \le \gamma_1 < \infty$, which depend, however, on the spectrum of $A_{11}$ and on $d$.

In order to state the space-time variational formulation of the Fokker–Planck equation using the notation introduced in Section 2, we select $H = L^2(D, \mu)$ and $V = W^{1,2}(D, \mu)$, and we write $Y = L^2(0, T; V)$. Guided by the abstract framework in Sections 2.3 and 2.4, we consider the space
\[
X = L^2(0, T; V) \cap H^1(0, T; V^*) \tag{5.4}
\]
and its subspaces $X_{0,\{0\}}$, $X_{0,\{T\}}$ as in (2.17), equipped with the norm of $X$, which are closed due to (2.9). With these spaces, the space-time adjoint weak formulation of the infinite-dimensional Fokker–Planck equation reads as follows: given $\psi_0 \in H$ and $g \in X^*_{0,\{T\}}$, find
\[
\psi \in Y : \quad B^*(\psi, \varphi) = \ell^*(\varphi) \qquad \forall \varphi \in X_{0,\{T\}}, \tag{5.5}
\]

where the bilinear form $B^*(\cdot,\cdot) : Y \times X_{0,\{T\}} \to \mathbb{R}$ is defined by
\[
B^*(\psi, \varphi) := \int_0^T \big( -\,{}_{V^*}\langle \varphi', \psi \rangle_V + a(\psi, \varphi) \big)\, \mathrm{d}t, \tag{5.6}
\]
with
\[
a(\psi, \varphi) := \sum_{i,j=1}^\infty A_{ij}\, (\nabla_{q_i} \psi, \nabla_{q_j} \varphi)_{[L^2(D,\mu)]^d},
\]
and the linear functional $\ell^*$ is defined by
\[
\ell^*(\varphi) := (\psi_0, \varphi(0))_{L^2(D,\mu)}. \tag{5.7}
\]

We note that $\mathrm{e}^{t}\varphi \in X_{0,\{T\}}$ if, and only if, $\varphi \in X_{0,\{T\}}$; analogously, $\mathrm{e}^{-t}\psi \in Y$ if, and only if, $\psi \in Y$. Thus, by selecting $\varphi = \mathrm{e}^{-t}\hat\varphi$ as test function in (5.5), with $\hat\varphi \in X_{0,\{T\}}$ arbitrary, we deduce that $\psi \in Y$ is a solution to (5.5) if, and only if, $\hat\psi := \mathrm{e}^{-t}\psi \in Y$ solves
\[
\hat\psi \in Y : \quad \hat B^*(\hat\psi, \hat\varphi) = \ell^*(\hat\varphi) \qquad \forall \hat\varphi \in X_{0,\{T\}},
\]
where the linear functional $\ell^*$ is defined as in (5.7) (note that $\hat\psi(0) = \psi(0)$), and the bilinear form $\hat B^*(\cdot,\cdot) : Y \times X_{0,\{T\}} \to \mathbb{R}$ is defined by (5.6), except that now
\[
\hat a(\psi, \varphi) := \sum_{i,j=1}^\infty A_{ij}\, (\nabla_{q_i} \psi, \nabla_{q_j} \varphi)_{[L^2(D,\mu)]^d} + (\psi, \varphi)_{L^2(D,\mu)}.
\]


The solution $\psi \in Y$ to (5.5) is then directly recovered from $\hat\psi$ by setting $\psi := \mathrm{e}^{t}\hat\psi$. We shall therefore assume (without loss of generality) in what follows that, whenever the problem (5.5) is referred to, it is the bilinear form $a$ defined by
\[
a(\psi, \varphi) := \sum_{i,j=1}^\infty A_{ij}\, (\nabla_{q_i} \psi, \nabla_{q_j} \varphi)_{[L^2(D,\mu)]^d} + (\psi, \varphi)_{L^2(D,\mu)} \tag{5.8}
\]
that is used in the definition of $B^*$, rather than the bilinear form $a$ defined above, following (5.6).

By adopting this simple convention, the well-posedness of the infinite-dimensional Fokker–Planck equation is now an immediate consequence of Theorem 2.5; in particular, the following result holds.

Theorem 5.3. For the bilinear form $B^*(\cdot,\cdot)$ in (5.6), where $a(\cdot,\cdot)$ is defined by (5.8), there exists a positive constant $C$ such that
\[
\inf_{\psi \in Y \setminus \{0\}}\ \sup_{\varphi \in X_{0,\{T\}} \setminus \{0\}} \frac{B^*(\psi, \varphi)}{\|\psi\|_Y\, \|\varphi\|_{X_{0,\{T\}}}} \ge C \tag{5.9}
\]
and
\[
\forall \varphi \in X_{0,\{T\}} \setminus \{0\} : \quad \sup_{\psi \in Y} B^*(\psi, \varphi) > 0. \tag{5.10}
\]
Furthermore, for each $\psi_0 \in H$ and each $g \in X^*_{0,\{T\}}$, there exists a unique $\psi \in Y = L^2(0, T; V)$ that satisfies (5.5); and the linear operator $B^* \in L(Y, X^*_{0,\{T\}})$ induced by $B^*(\cdot,\cdot)$ is an isomorphism.

Following [24], we shall next develop a class of adaptive Galerkin discretization algorithms for (5.5), along the lines of the adaptive wavelet discretizations of boundedly invertible operator equations considered in [9, 10]. These algorithms exhibit, in particular, "stability by adaptivity", i.e., their stability follows directly from the stability (5.9), (5.10) of the continuous, infinite-dimensional problem through suitable Riesz bases of the spaces $Y$ and $X_{0,\{T\}}$, which we shall construct. Notably, in the present context, these algorithms are dimensionally robust by design: as we shall prove, they deliver a sequence of approximate solutions with finitely supported coefficient vectors, i.e., with only finitely many variables $q_k$ being 'active' in each iteration. We will establish certain optimality properties for these finitely supported Galerkin solutions, with all constants in the error and complexity bounds being absolute, i.e., independent of the number of active variables. We first develop the necessary concepts in an abstract setting, before applying them to the Fokker–Planck equation (5.5) in space-time variational form.

6. Well-posed operator equations as infinite matrix problems

Let us denote, for the moment, by $X, Y$ two generic separable Hilbert spaces over $\mathbb{R}$, and let us assume that we have available a Riesz basis $\Psi^X = \{\psi^X_\lambda : \lambda \in \nabla_X\}$ for $X$, meaning that the synthesis operator
\[
s_{\Psi^X} : \ell^2(\nabla_X) \to X : \quad \mathbf{c} \mapsto \mathbf{c}^\top \Psi^X := \sum_{\lambda \in \nabla_X} c_\lambda \psi^X_\lambda
\]
is boundedly invertible. By identifying $\ell^2(\nabla_X)$ with its dual, the adjoint of $s_{\Psi^X}$, known as the analysis operator, is
\[
s^*_{\Psi^X} : X^* \to \ell^2(\nabla_X) : \quad g \mapsto [g(\psi^X_\lambda)]_{\lambda \in \nabla_X}.
\]
Similarly, let $\Psi^Y = \{\psi^Y_\lambda : \lambda \in \nabla_Y\}$ be a Riesz basis for $Y$, with synthesis operator $s_{\Psi^Y}$ and its adjoint $s^*_{\Psi^Y}$. Here, $\nabla_X$ and $\nabla_Y$ are countable sets of (multi-)indices $\lambda$.

Now let $B \in L(X, Y^*)$ be boundedly invertible; then, its adjoint $B^* \in L(Y, X^*)$ is also boundedly invertible and, for every $f \in Y^*$ and $f^* \in X^*$, the operator equations
\[
Bu = f, \qquad B^* u^* = f^*
\]
admit unique solutions $u \in X$, $u^* \in Y$. Writing $u = s_{\Psi^X} \mathbf{u}$ and $u^* = s_{\Psi^Y} \mathbf{u}^*$, these operator equations are equivalent to the infinite matrix-vector problems
\[
\mathbf{B} \mathbf{u} = \mathbf{f}, \qquad \mathbf{B}^* \mathbf{u}^* = \mathbf{f}^*, \tag{6.1}
\]


where the "load vectors" $\mathbf{f} = s^*_{\Psi^Y} f = [f(\psi^Y_\lambda)]_{\lambda \in \nabla_Y} \in \ell^2(\nabla_Y)$ and $\mathbf{f}^* = s^*_{\Psi^X} f^* = [f^*(\psi^X_\lambda)]_{\lambda \in \nabla_X} \in \ell^2(\nabla_X)$, and the system matrix $\mathbf{B}$, given by
\[
\mathbf{B} = s^*_{\Psi^Y} B\, s_{\Psi^X} = [(B\psi^X_\mu)(\psi^Y_\lambda)]_{\lambda \in \nabla_Y,\, \mu \in \nabla_X} \in L(\ell^2(\nabla_X), \ell^2(\nabla_Y)),
\]
with $\mathbf{B}^*$ defined analogously, are boundedly invertible. We consider the associated bilinear forms
\[
B(\cdot,\cdot) : X \times Y \to \mathbb{R} : (w, v) \mapsto (Bw)(v), \qquad B^*(\cdot,\cdot) : Y \times X \to \mathbb{R} : (v, w) \mapsto (B^* v)(w),
\]
and introduce the notations
\[
\mathbf{B} = B(\Psi^X, \Psi^Y), \qquad \mathbf{f} = f(\Psi^Y), \qquad \mathbf{B}^* = B^*(\Psi^Y, \Psi^X), \qquad \mathbf{f}^* = f^*(\Psi^X).
\]

With the Riesz constants
\[
\Lambda^X_{\Psi^X} := \|s_{\Psi^X}\|_{\ell^2(\nabla_X) \to X} = \sup_{0 \ne \mathbf{c} \in \ell^2(\nabla_X)} \frac{\|\mathbf{c}^\top \Psi^X\|_X}{\|\mathbf{c}\|_{\ell^2(\nabla_X)}}, \qquad
\lambda^X_{\Psi^X} := \|s_{\Psi^X}^{-1}\|^{-1}_{X \to \ell^2(\nabla_X)} = \inf_{0 \ne \mathbf{c} \in \ell^2(\nabla_X)} \frac{\|\mathbf{c}^\top \Psi^X\|_X}{\|\mathbf{c}\|_{\ell^2(\nabla_X)}},
\]
and $\Lambda^Y_{\Psi^Y}$ and $\lambda^Y_{\Psi^Y}$ defined analogously, we obviously have that
\[
\|\mathbf{B}\|_{\ell^2(\nabla_X) \to \ell^2(\nabla_Y)} \le \|B\|_{X \to Y^*}\, \Lambda^X_{\Psi^X}\, \Lambda^Y_{\Psi^Y}, \tag{6.2}
\]
\[
\|\mathbf{B}^{-1}\|_{\ell^2(\nabla_Y) \to \ell^2(\nabla_X)} \le \frac{\|B^{-1}\|_{Y^* \to X}}{\lambda^X_{\Psi^X}\, \lambda^Y_{\Psi^Y}}. \tag{6.3}
\]

7. Best N-term approximations and approximation classes

We say that $\mathbf{u}_N \in \ell^2(\nabla_X)$ is a best $N$-term approximation of $\mathbf{u} \in \ell^2(\nabla_X)$ if it is the best possible approximation to $\mathbf{u}$ in the norm of $\ell^2(\nabla_X)$, given a budget of $N$ coefficients. Determining $\mathbf{u}_N$ requires searching the infinite vector $\mathbf{u}$, and actually locating $\mathbf{u}_N$ is therefore, in general, not practically feasible. Nevertheless, the rates of convergence afforded by best $N$-term approximations serve as a benchmark for concrete numerical schemes. To this end, we collect all $\mathbf{v} \in \ell^2(\nabla_X)$ whose best $N$-term approximations converge with rate $s > 0$ in the approximation class $\mathcal{A}^s_\infty(\ell^2(\nabla_X)) := \{ \mathbf{v} \in \ell^2(\nabla_X) : \|\mathbf{v}\|_{\mathcal{A}^s_\infty(\ell^2(\nabla_X))} < \infty \}$, where
\[
\|\mathbf{v}\|_{\mathcal{A}^s_\infty(\ell^2(\nabla_X))} := \sup_{\varepsilon > 0}\ \varepsilon \cdot \big[ \min\{ N \in \mathbb{N}_0 : \|\mathbf{v} - \mathbf{v}_N\|_{\ell^2(\nabla_X)} \le \varepsilon \} \big]^s.
\]
Generally, best $N$-term approximations cannot be realized in practice, in particular not in situations where the vector $\mathbf{u}$ to be approximated is only defined implicitly as the solution of an infinite matrix-vector problem, such as (6.1). We will now design, following [24], a space-time adaptive Galerkin discretization method for the infinite-dimensional Fokker–Planck equation (5.5) that produces a sequence of approximations to $\mathbf{u}$ which, whenever $\mathbf{u} \in \mathcal{A}^s_\infty(\ell^2(\nabla_X))$ for some $s > 0$, converge to $\mathbf{u}$ with this rate $s$, in computational complexity that is linear with respect to $N$, the cardinality of the support of the "active" coefficients in the computed, finitely supported approximation to $\mathbf{u}$.
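For a concrete, explicitly given vector, the best $N$-term approximation is obtained simply by retaining the $N$ largest coefficients in modulus. A sketch illustrating the benchmark rate for algebraically decaying coefficients (the decay exponent and vector length are illustrative choices):

```python
import numpy as np

def best_n_term(v, N):
    """Keep the N largest-in-modulus entries of v and zero out the rest."""
    u = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[::-1][:N]
    u[idx] = v[idx]
    return u

# Coefficients whose decreasing rearrangement decays like k^{-(s+1/2)}
# yield the best N-term rate  ||v - v_N|| ~ N^{-s}  (here s = 1).
s = 1.0
v = np.arange(1, 10_001) ** (-(s + 0.5))
rng = np.random.default_rng(2)
rng.shuffle(v)                                  # the ordering is irrelevant

for N in (10, 100, 1000):
    err = np.linalg.norm(v - best_n_term(v, N))
    assert 0.5 * N ** (-s) <= err <= 2.0 * N ** (-s)
```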

8. Adaptive Galerkin methods

Let $s > 0$ be such that $\mathbf{u} \in \mathcal{A}^s_\infty(\ell^2(\nabla_X))$. In [9] and [10], adaptive wavelet Galerkin methods for solving elliptic operator equations (6.1) were introduced; the methods considered in both papers are iterative. To be able to bound their complexity, one needs a suitable bound on the complexity of an approximate matrix-vector product in terms of the prescribed tolerance. We formalize this idea through the notion of $s^*$-admissibility.

Definition 8.1. The infinite matrices $\mathbf{B} \in L(\ell^2(\nabla_X), \ell^2(\nabla_Y))$, $\mathbf{B}^* \in L(\ell^2(\nabla_Y), \ell^2(\nabla_X))$ are $s^*$-admissible if there exist routines
\[
\mathrm{APPLY}_{\mathbf{B}}[\mathbf{w}, \varepsilon] \to \mathbf{z}, \qquad \mathrm{APPLY}_{\mathbf{B}^*}[\tilde{\mathbf{w}}, \varepsilon] \to \tilde{\mathbf{z}}
\]


that yield, for any prescribed tolerance $\varepsilon > 0$ and any finitely supported $\mathbf{w} \in \ell^2(\nabla_X)$ and $\tilde{\mathbf{w}} \in \ell^2(\nabla_Y)$, finitely supported vectors $\mathbf{z} \in \ell^2(\nabla_Y)$ and $\tilde{\mathbf{z}} \in \ell^2(\nabla_X)$ such that
\[
\|\mathbf{B}\mathbf{w} - \mathbf{z}\|_{\ell^2(\nabla_Y)} + \|\mathbf{B}^* \tilde{\mathbf{w}} - \tilde{\mathbf{z}}\|_{\ell^2(\nabla_X)} \le \varepsilon,
\]
and for which, for any $s \in (0, s^*)$, there exists an admissibility constant $a_{\mathbf{B},s}$ such that
\[
\#\operatorname{supp} \mathbf{z} \le a_{\mathbf{B},s}\, \varepsilon^{-1/s}\, \|\mathbf{w}\|^{1/s}_{\mathcal{A}^s_\infty(\ell^2(\nabla_X))},
\]
and likewise for $\tilde{\mathbf{z}}$, $\tilde{\mathbf{w}}$. The number of arithmetic operations and storage locations used by the call $\mathrm{APPLY}_{\mathbf{B}}[\mathbf{w}, \varepsilon]$ is bounded by some absolute multiple of
\[
a_{\mathbf{B},s}\, \varepsilon^{-1/s}\, \|\mathbf{w}\|^{1/s}_{\mathcal{A}^s_\infty(\ell^2(\nabla_X))} + \#\operatorname{supp} \mathbf{w} + 1.
\]
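The APPLY routines of [9, 10] are considerably more refined than what can be shown here; the following is only a schematic toy version (illustrative matrix and parameters, no rigorous error control), conveying the basic mechanism: large entries of $\mathbf{w}$ are multiplied by accurate columns of $\mathbf{B}$, small entries by aggressively truncated ones.

```python
import numpy as np

def truncate_col(col, k):
    """Keep the k largest-in-modulus entries of a column (a column of B[k])."""
    z = np.zeros_like(col)
    idx = np.argsort(np.abs(col))[::-1][:k]
    z[idx] = col[idx]
    return z

def apply_schematic(B, w, kmax):
    """Toy APPLY: entries of w are visited in decreasing modulus; an entry in
    the j-th dyadic bin below the largest one is multiplied by its column
    truncated to kmax >> j nonzeros, so large coefficients see accurate
    columns and small coefficients see cheap ones."""
    z = np.zeros(B.shape[0])
    order = np.argsort(np.abs(w))[::-1]
    wmax = abs(w[order[0]])
    if wmax == 0.0:
        return z
    for i in order:
        if w[i] == 0.0:
            break
        j = int(np.log2(wmax / abs(w[i])))
        z += w[i] * truncate_col(B[:, i], max(kmax >> j, 1))
    return z

# Demo on a matrix with off-diagonal decay (i.e., compressible columns).
n = 200
i, jj = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
B = 1.0 / (1.0 + np.abs(i - jj)) ** 3
w = np.zeros(n)
w[[3, 17, 90]] = [1.0, 0.25, 0.05]

exact = B @ w
approx = apply_schematic(B, w, kmax=64)
assert np.linalg.norm(exact - approx) < 1e-2 * np.linalg.norm(exact)
```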

The design of $\mathrm{APPLY}_{\mathbf{B}}[\mathbf{w}, \varepsilon]$ and $\mathrm{APPLY}_{\mathbf{B}^*}[\mathbf{w}, \varepsilon]$ for the system matrices $\mathbf{B}$ and $\mathbf{B}^*$ arising from the infinite-dimensional Fokker–Planck equation (5.5) is the major technical building block in the adaptive Galerkin discretization of (5.5).

In order to approximate $\mathbf{u}$, one must also be able to approximate $\mathbf{f}^*$. To this end, we shall assume in what follows the availability of the following routine.

$\mathrm{RHS}_{\mathbf{f}^*}[\varepsilon] \to \mathbf{f}^*_\varepsilon$: For a given $\varepsilon > 0$, the routine yields a finitely supported $\mathbf{f}^*_\varepsilon \in \ell^2(\nabla_X)$ with
\[
\|\mathbf{f}^* - \mathbf{f}^*_\varepsilon\|_{\ell^2(\nabla_X)} \le \varepsilon \qquad \text{and} \qquad \#\operatorname{supp} \mathbf{f}^*_\varepsilon \lesssim \min\{ N : \|\mathbf{f}^* - \mathbf{f}^*_N\|_{\ell^2(\nabla_X)} \le \varepsilon \},
\]
with the number of arithmetic operations and storage locations used by the call $\mathrm{RHS}_{\mathbf{f}^*}[\varepsilon]$ being bounded by some absolute multiple of $\#\operatorname{supp} \mathbf{f}^*_\varepsilon + 1$.

The availability of the routines $\mathrm{APPLY}_{\mathbf{B}}$ and $\mathrm{RHS}_{\mathbf{f}^*}$ has the following implications.

Proposition 8.2. Let $\mathbf{B}$ in (6.1) be $s^*$-admissible. Then, for any $s \in (0, s^*)$, we have that
\[
\|\mathbf{B}\|_{\mathcal{A}^s_\infty(\ell^2(\nabla_X)) \to \mathcal{A}^s_\infty(\ell^2(\nabla_Y))} \le a^s_{\mathbf{B},s}.
\]
For $\mathbf{z}_\varepsilon := \mathrm{APPLY}_{\mathbf{B}}[\mathbf{w}, \varepsilon]$, we have that
\[
\|\mathbf{z}_\varepsilon\|_{\mathcal{A}^s_\infty(\ell^2(\nabla_Y))} \le a^s_{\mathbf{B},s}\, \|\mathbf{w}\|_{\mathcal{A}^s_\infty(\ell^2(\nabla_X))}.
\]
Analogous statements hold for $\mathbf{B}^*$ in (6.1).

A proof of Proposition 8.2 can be given along the lines of the arguments presented in [9, 10]. Using the definition of $\mathcal{A}^s_\infty(\ell^2(\nabla_Y))$ and the properties of $\mathrm{RHS}_{\mathbf{f}^*}$, we have the following corollary.

Corollary 8.3. Suppose that, in (6.1), $\mathbf{B}^*$ is $s^*$-admissible and $\mathbf{u} \in \mathcal{A}^s_\infty(\ell^2(\nabla_Y))$ for some $s < s^*$; then, for $\mathbf{f}^*_\varepsilon = \mathrm{RHS}_{\mathbf{f}^*}[\varepsilon]$,
\[
\#\operatorname{supp} \mathbf{f}^*_\varepsilon \lesssim a_{\mathbf{B}^*,s}\, \varepsilon^{-1/s}\, \|\mathbf{u}\|^{1/s}_{\mathcal{A}^s_\infty(\ell^2(\nabla_Y))},
\]
with the number of arithmetic operations and storage locations used by the call $\mathrm{RHS}_{\mathbf{f}^*}[\varepsilon]$ being bounded by some absolute multiple of
\[
a_{\mathbf{B}^*,s}\, \varepsilon^{-1/s}\, \|\mathbf{u}\|^{1/s}_{\mathcal{A}^s_\infty(\ell^2(\nabla_Y))} + 1.
\]

Remark 8.4. Besides $\|\mathbf{f}^* - \mathbf{f}^*_\varepsilon\|_{\ell^2(\nabla_X)} \le \varepsilon$, the complexity bounds in Corollary 8.3, with $a_{\mathbf{B}^*,s}$ signifying a constant that depends only on $\mathbf{B}^*$ and $s$, are essential for the use of $\mathrm{RHS}_{\mathbf{f}^*}$ in the adaptive Galerkin methods.

The following corollary of Proposition 8.2 from [24] can be used, for example, for the construction of valid APPLY and RHS routines in the case when the adaptive Galerkin algorithms are applied to a preconditioned system.

Corollary 8.5. Suppose that $\mathbf{B}^* \in L(\ell^2(\nabla_Y), \ell^2(\nabla_X))$ and $\mathbf{C} \in L(\ell^2(\nabla_X), \ell^2(\nabla_Z))$ are both $s^*$-admissible; then, so is $\mathbf{C}\mathbf{B}^* \in L(\ell^2(\nabla_Y), \ell^2(\nabla_Z))$. A valid routine $\mathrm{APPLY}_{\mathbf{C}\mathbf{B}^*}$ is
\[
[\mathbf{w}, \varepsilon] \mapsto \mathrm{APPLY}_{\mathbf{C}}\big[ \mathrm{APPLY}_{\mathbf{B}^*}[\mathbf{w}, \varepsilon/(2\|\mathbf{C}\|)],\ \varepsilon/2 \big], \tag{8.1}
\]
with admissibility constant $a_{\mathbf{C}\mathbf{B}^*,s} \lesssim a_{\mathbf{B}^*,s}\, (\|\mathbf{C}\|^{1/s} + a_{\mathbf{C},s})$ for $s \in (0, s^*)$.


For some $s^* > s$, let $\mathbf{C} \in L(\ell^2(\nabla_X), \ell^2(\nabla_Z))$ be $s^*$-admissible. Then, for
\[
\mathrm{RHS}_{\mathbf{C}\mathbf{f}^*}[\varepsilon] := \mathrm{APPLY}_{\mathbf{C}}\big[ \mathrm{RHS}_{\mathbf{f}^*}[\varepsilon/(2\|\mathbf{C}\|)],\ \varepsilon/2 \big], \tag{8.2}
\]
we have that
\[
\#\operatorname{supp} \mathrm{RHS}_{\mathbf{C}\mathbf{f}^*}[\varepsilon] \lesssim a_{\mathbf{B}^*,s}\, (\|\mathbf{C}\|^{1/s} + a_{\mathbf{C},s})\, \varepsilon^{-1/s}\, \|\mathbf{u}\|^{1/s}_{\mathcal{A}^s_\infty(\ell^2(\nabla_Y))}
\]
and $\|\mathbf{C}\mathbf{f}^* - \mathrm{RHS}_{\mathbf{C}\mathbf{f}^*}[\varepsilon]\|_{\ell^2(\nabla_Z)} \le \varepsilon$, with the number of arithmetic operations and storage locations used by the call $\mathrm{RHS}_{\mathbf{C}\mathbf{f}^*}[\varepsilon]$ being bounded by some absolute multiple of
\[
a_{\mathbf{B}^*,s}\, (\|\mathbf{C}\|^{1/s} + a_{\mathbf{C},s})\, \varepsilon^{-1/s}\, \|\mathbf{u}\|^{1/s}_{\mathcal{A}^s_\infty(\ell^2(\nabla_Y))} + 1.
\]

Remark 8.6. The properties of $\mathrm{RHS}_{\mathbf{C}\mathbf{f}^*}$ stated in Corollary 8.5 show that $\mathrm{RHS}_{\mathbf{C}\mathbf{f}^*}$ is a valid routine for approximating $\mathbf{C}\mathbf{f}^*$ in the sense of Remark 8.4.

In the special case when $\mathbf{B}$ is symmetric positive definite, i.e., $\nabla_X = \nabla_Y$ and $\mathbf{B} = \mathbf{B}^* > 0$, the two Galerkin methods considered in the papers [9, 10] were shown to be quasi-optimal in the following sense.

Theorem 8.7. Suppose that, in (6.1), $\mathbf{B}^*$ is $s^*$-admissible; then, for any $\varepsilon > 0$, the two adaptive Galerkin methods from [9, 10] produce approximations $\mathbf{u}_\varepsilon$ to $\mathbf{u}$ with $\|\mathbf{u} - \mathbf{u}_\varepsilon\|_{\ell^2(\nabla_Y)} \le \varepsilon$. Suppose that, in (6.1), for some $s > 0$ we have $\mathbf{u} \in \mathcal{A}^s_\infty(\ell^2(\nabla_Y))$; then $\#\operatorname{supp} \mathbf{u}_\varepsilon \lesssim \varepsilon^{-1/s}\, \|\mathbf{u}\|^{1/s}_{\mathcal{A}^s_\infty(\ell^2(\nabla_Y))}$, and if, moreover, $s < s^*$, then the number of arithmetic operations and storage locations required by a call of either of these adaptive solvers with tolerance $\varepsilon > 0$ is bounded by some multiple of
\[
\varepsilon^{-1/s}\, (1 + a_{\mathbf{B},s})\, \|\mathbf{u}\|^{1/s}_{\mathcal{A}^s_\infty(\ell^2(\nabla_Y))} + 1.
\]
The multiples depend on $s$ only as $s$ tends to $0$ or $\infty$, and on $\|\mathbf{B}\|$ and $\|\mathbf{B}^{-1}\|$ only as these tend to infinity.

The method from [10] consists of the application of a damped Richardson iteration to $\mathbf{B}\mathbf{u} = \mathbf{f}$, where the required residual computations are approximated using calls of $\mathrm{APPLY}_{\mathbf{B}}$ and $\mathrm{RHS}_{\mathbf{f}}$ within tolerances that decrease linearly with the iteration counter.

The method from [9] produces a sequence $\Xi_0 \subset \Xi_1 \subset \cdots \subset \nabla_X$, together with a corresponding sequence of (approximate) finitely supported Galerkin solutions $\mathbf{u}_i \in \ell^2(\Xi_i)$. The coefficients of the approximate residuals $\mathbf{f} - \mathbf{B}\mathbf{u}_i$ are used as indicators of how to expand $\Xi_i$ to $\Xi_{i+1}$, so that the latter gives rise to an improved Galerkin approximation.

Both methods rely on a recurrent coarsening of the approximation vectors, whereby small coefficients are removed in order to maintain an optimal balance between accuracy and support length; in [15], a modification of the algorithm introduced in [9] was proposed, which does not require the coarsening step.

The key to why $s^*$-admissibility of $\mathbf{B}$ can be expected is the observation that, for a large class of operators, the stiffness matrix with respect to suitable wavelet bases is close to a computable sparse matrix, in the sense of the following definition.

Definition 8.8. $\mathbf{B} \in L(\ell^2(\nabla_X), \ell^2(\nabla_Y))$ is $s^*$-computable if, for each $N \in \mathbb{N}$, there exists a $\mathbf{B}[N] \in L(\ell^2(\nabla_X), \ell^2(\nabla_Y))$ having in each 'column' at most $N$ nonzero entries, whose joint computation takes an absolute multiple of $N$ operations, such that the computability constants
\[
c_{\mathbf{B},s} := \sup_{N \in \mathbb{N}}\ N\, \|\mathbf{B} - \mathbf{B}[N]\|^{1/s}_{\ell^2(\nabla_X) \to \ell^2(\nabla_Y)} \tag{8.3}
\]
are finite for any $s \in (0, s^*)$. The notion of $s^*$-computability of $\mathbf{B}^*$ is defined analogously.

Theorem 8.9. An $s^*$-computable $\mathbf{B}$ is $s^*$-admissible. Moreover, for $s < s^*$, $a_{\mathbf{B},s} \lesssim c_{\mathbf{B},s}$, where the constant in this estimate depends only on $s \downarrow 0$, $s \uparrow s^*$, and on $\|\mathbf{B}\| \to \infty$.

This theorem is proved by constructing a suitable $\mathrm{APPLY}_{\mathbf{B}}$ routine as in [9, §6.4] (a logarithmic factor in the complexity estimate there, due to sorting, was later removed through the use of approximate sorting; see [13] and the references therein).


Remark 8.10. Theorem 8.7 requires that $\mathbf{B}$ is $s^*$-admissible for an $s^* > s$ when $\mathbf{u} \in \mathcal{A}^s_\infty(\ell^2(\nabla_X))$. Generally, this value of $s$ is unknown, and so the condition on $s^*$ should be interpreted in the sense that $s^*$ has to be larger than any $s$ for which membership of the solution $\mathbf{u}$ in $\mathcal{A}^s_\infty(\ell^2(\nabla_X))$ can be expected.

The approach from [10] also applies to the saddle-point variational principle (5.5) whenever a linearly convergent stationary iterative scheme is available for the matrix-vector problem $\mathbf{B}^* \mathbf{u} = \mathbf{f}^*$. There is, unfortunately, no such scheme available for a general boundedly invertible $\mathbf{B}^*$. In particular, for the stiffness matrices $\mathbf{B}^*$ resulting from the space-time saddle-point formulation (5.5) of the infinite-dimensional Fokker–Planck equation, no directly applicable scheme is available.

A remedy proposed in [10] is to apply adaptive Galerkin discretizations to the normal equation
\[
\mathbf{B}\mathbf{B}^* \mathbf{u} = \mathbf{B}\mathbf{f}^*. \tag{8.4}
\]
By Theorem 5.3, the operator $\mathbf{B}\mathbf{B}^* \in L(\ell^2(\nabla_Y), \ell^2(\nabla_Y))$ is boundedly invertible and symmetric positive definite, with
\[
\|\mathbf{B}\mathbf{B}^*\|_{\ell^2(\nabla_Y) \to \ell^2(\nabla_Y)} \le \|\mathbf{B}^*\|^2_{\ell^2(\nabla_Y) \to \ell^2(\nabla_X)}, \qquad \|(\mathbf{B}\mathbf{B}^*)^{-1}\|_{\ell^2(\nabla_Y) \to \ell^2(\nabla_Y)} \le \|(\mathbf{B}^*)^{-1}\|^2_{\ell^2(\nabla_X) \to \ell^2(\nabla_Y)}.
\]
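In a finite-dimensional model the normal-equations device can be exercised directly: the product $\mathbf{B}\mathbf{B}^*$ is symmetric positive definite, so the conjugate gradient method applies even though $\mathbf{B}^*$ itself is neither symmetric nor definite. A hedged sketch (the matrix below is a random illustrative stand-in, not the stiffness matrix arising from (5.5)):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 60
# Illustrative stand-in for the (nonsymmetric, boundedly invertible) matrix B*.
Bstar = np.eye(n) + 0.2 * rng.standard_normal((n, n)) / np.sqrt(n)
B = Bstar.T                     # adjoint of B* in this real, finite setting
f = rng.standard_normal(n)

# Normal equations  B B* u = B f :  B B* is symmetric positive definite,
# so plain conjugate gradients converges.
M, rhs = B @ Bstar, B @ f
u, r = np.zeros(n), rhs.copy()
p = r.copy()
for _ in range(5 * n):
    Mp = M @ p
    alpha = (r @ r) / (p @ Mp)
    u += alpha * p
    r_new = r - alpha * Mp
    if np.linalg.norm(r_new) < 1e-12:
        break
    p = r_new + ((r_new @ r_new) / (r @ r)) * p
    r = r_new

assert np.linalg.norm(Bstar @ u - f) < 1e-8   # u solves the original system
```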

Now let $\mathbf{u} \in \mathcal{A}^s_\infty(\ell^2(\nabla_Y))$ and, for some $s^* > s$, let $\mathbf{B}$ and $\mathbf{B}^*$ be $s^*$-admissible. By Corollary 8.5, with $\mathbf{B}$ in place of $\mathbf{C}$, a valid $\mathrm{RHS}_{\mathbf{B}\mathbf{f}^*}$ routine is given by (8.2), and $\mathbf{B}\mathbf{B}^*$ is $s^*$-admissible with a valid $\mathrm{APPLY}_{\mathbf{B}\mathbf{B}^*}$ routine given by (8.1). In the context of (5.5), one execution of $\mathrm{APPLY}_{\mathbf{B}\mathbf{B}^*}$ corresponds to one (approximate) sweep over the "primal" problem, followed by one (approximate) sweep over the dual problem. A combination of Theorem 8.7 and Corollary 8.5 yields the following result, obtained in [24].

Theorem 8.11. For any $\varepsilon > 0$, the adaptive wavelet methods from [10], or from [9] and [15], applied to the normal equations (8.4), using the above $\mathrm{APPLY}_{\mathbf{B}\mathbf{B}^*}$ and $\mathrm{RHS}_{\mathbf{B}\mathbf{f}^*}$ routines, produce an approximation $\mathbf{u}_\varepsilon$ to $\mathbf{u}$ with $\|\mathbf{u} - \mathbf{u}_\varepsilon\|_{\ell^2(\nabla_Y)} \le \varepsilon$.

Suppose that, for some $s > 0$, $\mathbf{u} \in \mathcal{A}^s_\infty(\ell^2(\nabla_Y))$; then $\#\operatorname{supp} \mathbf{u}_\varepsilon \lesssim \varepsilon^{-1/s}\, \|\mathbf{u}\|^{1/s}_{\mathcal{A}^s_\infty(\ell^2(\nabla_Y))}$, with the constant in this bound depending on $s$ only as $s$ tends to $0$ or $\infty$, and on $\|\mathbf{B}^*\|$ and $\|(\mathbf{B}^*)^{-1}\|$ only as these tend to infinity.

Suppose that $s < s^*$; then, the number of arithmetic operations and storage locations required by a call of either of these adaptive wavelet methods with tolerance $\varepsilon$ is bounded by some multiple of
\[
1 + \varepsilon^{-1/s}\, \big( 1 + a_{\mathbf{B}^*,s}(1 + a_{\mathbf{B},s}) \big)\, \|\mathbf{u}\|^{1/s}_{\mathcal{A}^s_\infty(\ell^2(\nabla_Y))},
\]
where this multiple depends on $s$ only as $s$ tends to $0$ or $\infty$, and on $\|\mathbf{B}^*\|$ and $\|(\mathbf{B}^*)^{-1}\|$ only as these tend to infinity.

9. Infinite-dimensional Fokker–Planck equation as an infinite matrix-vector equation

We apply the foregoing abstract concepts to the space-time variational formulation (5.5) of the infinite-dimensional Fokker–Planck equation. To construct Riesz bases for $X_{0,\{T\}}$ and $Y$, we use that
\[
X_{0,\{T\}} = \big( L^2(0, T) \otimes V \big) \cap \big( H^1_{0,\{T\}}(0, T) \otimes V^* \big) \qquad \text{and} \qquad Y = L^2(0, T) \otimes V,
\]
with $X$ as defined in (5.4), $X_{0,\{T\}}$ as in (2.17), and $Y = L^2(0, T; V)$; recall that $H = L^2(D, \mu)$ and $V = W^{1,2}(D, \mu)$. Let
\[
\Upsilon = \{ H_\gamma : \gamma \in \Gamma \} \subset V, \qquad \text{with } H_\gamma(q) = \prod_{k=1}^\infty H_{\gamma_k}\Big( \frac{q_k}{\sqrt{\lambda_k}} \Big),
\]
denote the polynomial chaos basis (4.2). By Theorem 4.3, $\Upsilon$ is a normalized Riesz basis for $L^2(D, \mu)$, which, when renormalized by the scaling sequence $(\langle \gamma, \lambda^{-1} \rangle)_{\gamma \in \Gamma}$ in $V$ and by the scaling sequence $(\langle \gamma, \lambda \rangle)_{\gamma \in \Gamma}$ in $V^*$, is a Riesz basis for $V$ and $V^*$, respectively. Let
\[
\Theta = \{ \theta_\lambda : \lambda \in \nabla_t \} \subset H^1_{0,\{T\}}(0, T)
\]


be a family of functions that is a normalized Riesz basis for $L^2(0, T)$ which, renormalized in $H^1(0, T)$, is a Riesz basis for $H^1_{0,\{T\}}(0, T)$. From [16, Prop. 1 and 2] it follows that then the collection $\Theta \otimes \Upsilon$, normalized in $X$, i.e., the collection
\[
\bigg\{ (t, q) \mapsto \frac{\theta_\lambda(t)\, H_\gamma(q)}{\sqrt{\langle \gamma, \lambda^{-1} \rangle + \|\theta_\lambda\|^2_{H^1(0,T)}\, \langle \gamma, \lambda \rangle}} \ :\ (\lambda, \gamma) \in \nabla_X := \nabla_t \times \Gamma \bigg\},
\]
is a Riesz basis for $X_{0,\{T\}}$, and that $\Theta \otimes \Upsilon$, normalized in $Y$, i.e., the collection
\[
\bigg\{ (t, q) \mapsto \frac{\theta_\lambda(t)\, H_\gamma(q)}{\|H_\gamma\|_V} \ :\ (\lambda, \gamma) \in \nabla_Y := \nabla_t \times \Gamma \bigg\},
\]
is a Riesz basis for $Y$. Moreover, denoting the Riesz basis for $V^*$ consisting of the collection $\Upsilon$ normalized in $V^*$ by $[\Upsilon]_{V^*}$, and similarly for the other collections and spaces, with the notations introduced in Section 6 we have that
\[
\Lambda^X_{[\Theta \otimes \Upsilon]_X} \le \max\big( \Lambda^{L^2(0,T)}_\Theta\, \Lambda^V_{[\Upsilon]_V},\ \Lambda^{H^1(0,T)}_{[\Theta]_{H^1(0,T)}}\, \Lambda^{V^*}_{[\Upsilon]_{V^*}} \big),
\]
\[
\lambda^X_{[\Theta \otimes \Upsilon]_X} \ge \min\big( \lambda^{L^2(0,T)}_\Theta\, \lambda^V_{[\Upsilon]_V},\ \lambda^{H^1(0,T)}_{[\Theta]_{H^1(0,T)}}\, \lambda^{V^*}_{[\Upsilon]_{V^*}} \big),
\]
\[
\Lambda^Y_{[\Theta \otimes \Upsilon]_Y} \le \Lambda^{L^2(0,T)}_\Theta\, \Lambda^V_{[\Upsilon]_V}, \qquad
\lambda^Y_{[\Theta \otimes \Upsilon]_Y} \ge \lambda^{L^2(0,T)}_\Theta\, \lambda^V_{[\Upsilon]_V}.
\]

Denoting by $\|\Upsilon\|_V$ the infinite diagonal matrix with diagonal entries $\|H_\gamma\|_V$, $\gamma \in \Gamma$, and similarly for the other collections and spaces, the stiffness or system matrix $\mathbf{B}^*$ corresponding to the variational form (5.5) and the Riesz bases $[\Theta \otimes \Upsilon]_Y$, $[\Theta \otimes \Upsilon]_X$ for $Y$ and $X_{0,\{T\}}$ is given by the infinite matrix
\[
\mathbf{B}^* = B^*\big( [\Theta \otimes \Upsilon]_Y,\ [\Theta \otimes \Upsilon]_X \big)
= \big[ \mathrm{Id}_t \otimes \|\Upsilon\|_V^{-1} \big] \circ \bigg[ -(\Theta, \Theta')_{L^2(0,T)} \otimes (\Upsilon, \Upsilon)_H + \int_0^T a(\Theta \otimes \Upsilon, \Theta \otimes \Upsilon)\, \mathrm{d}t \bigg] \circ \big[ \|\Theta \otimes \Upsilon\|_X^{-1} \big],
\]
where $\mathrm{Id}_t$ denotes the identity operator with respect to the $t$ variable and the symbol $\circ$ signifies composition of operators.

Writing the solution $\psi$ of (5.5) as $\psi = \mathbf{u}^\top [\Theta \otimes \Upsilon]_Y$, we deduce that $\mathbf{u}$ is the solution of the infinite matrix-vector equation $\mathbf{B}^* \mathbf{u} = \mathbf{f}^*$ with
\[
\mathbf{f}^* := \big[ (\psi_0, \Upsilon)_H \big]. \tag{9.1}
\]

Introducing the infinite matrices
\[
\mathbf{D}_1 := \big( \|\Theta\|_{H^1(0,T)} \otimes \|\Upsilon\|_{V^*} \big)\, \|\Theta \otimes \Upsilon\|_X^{-1} \qquad \text{and} \qquad \mathbf{D}_2 := \big( \mathrm{Id}_t \otimes \|\Upsilon\|_V \big)\, \|\Theta \otimes \Upsilon\|_X^{-1},
\]
both being diagonal, with entries that are in modulus less than $1$, the infinite matrix operator $\mathbf{B}^*$ can be written as
\[
\mathbf{B}^* = \bigg[ -\big( \Theta, [\Theta]'_{H^1(0,T)} \big)_{L^2(0,T)} \otimes \big( [\Upsilon]_{V^*}, [\Upsilon]_V \big)_H\, \mathbf{D}_1 + \int_0^T a\big( \Theta \otimes [\Upsilon]_V,\ \Theta \otimes [\Upsilon]_V \big)\, \mathrm{d}t\ \mathbf{D}_2 \bigg]. \tag{9.2}
\]

10. $s^*$-admissibility of $\mathbf{B}^*$ from (9.2) and of its adjoint $\mathbf{B}$

By Theorem 8.9, the $s^*$-admissibility of $\mathbf{B}$ and $\mathbf{B}^*$ follows from their $s^*$-computability. To verify the $s^*$-computability of tensor products of (possibly infinite) matrices, it suffices to analyze the $s^*$-computability of the factors, according to the following result from [24, 20].

Proposition 10.1. Let, for some $s^* > 0$, $\mathbf{D}$ and $\mathbf{E}$ be $s^*$-computable. Then,

(a) $\mathbf{D} \otimes \mathbf{E}$ is $s^*$-computable, with computability constant satisfying, for $0 < s < \bar{s} < s^*$, $c_{\mathbf{D} \otimes \mathbf{E}, s} \lesssim (c_{\mathbf{D},\bar{s}}\, c_{\mathbf{E},\bar{s}})^{\bar{s}/s}$; and

(b) for any $\epsilon \in (0, s^*)$, $\mathbf{D} \otimes \mathbf{E}$ is $(s^* - \epsilon)$-computable, with computability constant $c_{\mathbf{D} \otimes \mathbf{E}, s}$ satisfying, for $0 < s < s^* - \epsilon < \bar{s} < s^*$, $c_{\mathbf{D} \otimes \mathbf{E}, s} \lesssim \max(c_{\mathbf{D},\bar{s}}, 1)\, \max(c_{\mathbf{E},\bar{s}}, 1)$.


The constants absorbed in the $\lesssim$ symbol in the bounds on the computability constants in (a) and (b) depend only on $s \downarrow 0$, $\bar{s} \to \infty$ and $\bar{s} - s \downarrow 0$.

In view of the representation (9.2) of $\mathbf{B}^*$, using Corollary 8.5 and Proposition 10.1, part (a), for proving the $s^*$-admissibility of $\mathbf{B}$ and $\mathbf{B}^*$ it suffices to show that $(\Upsilon, \Upsilon)_H$ and $\int_0^T a(\Theta \otimes [\Upsilon]_V, \Theta \otimes [\Upsilon]_V)\, \mathrm{d}t$ and its adjoint are $s^*$-admissible. To prove this, by Theorem 8.9 we need to verify that these objects are $s^*$-computable. This will follow from the fact that $([\Theta]'_{H^1(0,T)}, \Theta)_{L^2(0,T)}$, $(\Theta, [\Theta]'_{H^1(0,T)})_{L^2(0,T)}$, $([\Upsilon]_{V^*}, [\Upsilon]_V)_H$ and $([\Upsilon]_V, [\Upsilon]_{V^*})_H$ are $s^*$-computable and, in view of the definition of $a(\cdot,\cdot)$ in (5.6), from a sparsity assumption on the coefficient matrix $A$.

10.1. Choice of Riesz bases Θ and Υ. We have already assumed that Θ = {θλ : λ ∈ ∇t} is a normalized Riesz basis of L2(0,T) which, when renormalized in H1(0,T), is a Riesz basis for H1_{0,{T}}(0,T). We will now select the basis Θ to be a wavelet basis that satisfies certain additional assumptions. Specifically, we shall assume that the basis functions θλ of Θ are:

(t1) local, i.e., sup_{x∈[0,1], ℓ∈N0} #{|λ| = ℓ : x ∈ supp θλ} < ∞ and diam(supp θλ) ≲ 2^{−|λ|};

(t2) piecewise polynomial of order dt, where by "piecewise" we mean that the singular support consists of a set of points whose cardinality is uniformly bounded;

(t3) globally continuous, specifically ‖θλ‖_{W^k_∞(0,1)} ≲ 2^{|λ|(1/2+k)} for k ∈ {0, 1};

(t4) for |λ| > 0, have d̃t ≥ dt vanishing moments.

The assumptions (t1)–(t4) can be met by wavelet constructions (see, e.g., [8] and the references therein).

As a Riesz basis in D we choose the (countable) "polynomial chaos" basis Υ = {Hγ : γ ∈ Γ}. According to Theorem 4.3, Υ is an orthonormal Riesz basis of L2(D,µ) that, normalized in V = W1,2(D,µ) or in its dual V∗, is a Riesz basis for these spaces, respectively. Indeed, on denoting by Dλ the diagonal matrix

Dλ = diag{⟨γ, λ^{−1}⟩^{1/2} : γ ∈ Γ}, (10.1)

we observe that the diagonal entries of Dλ relate to the V and V∗ norms of [Υ]_H as follows:

Dλ ≂ ‖[Υ]_H‖_V, (Dλ)^{−1} ≂ ‖[Υ]_H‖_{V∗}.

Therefore, the collection [Υ]_V := (Dλ)^{−1}[Υ]_H = {(Dλ_γ)^{−1} Hγ : γ ∈ Γ} is a Riesz basis of V = W1,2(D,µ). This follows readily from the L2(D,µ) orthonormality of Υ and from the identity (cf. [12, (9.2.11)])

D_k Hγ = γ_k^{1/2} λ_k^{−1/2} H_{γ_k−1}(W_{e_k}) H^{(k)}_γ, for k = 1, 2, … and γ ∈ Γ with γ_k ≥ 1, (10.2)

with the notational convention H^{(k)}_γ := ∏_{j≠k} H_{γ_j} for γ ∈ Γ. If γ_k = 0, then D_k Hγ = 0.

By duality, the family [Υ]_{V∗} := Dλ[Υ]_H is a Riesz basis of V∗.
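In a single coordinate, (10.2) reduces to the derivative identity h_n′ = √n h_{n−1} for the normalized (probabilists') Hermite polynomials h_n = He_n/√(n!), which underlies all of the matrix-entry computations below. A finite-difference sanity check, with h_n evaluated by the standard three-term recurrence (the evaluation point and degree are arbitrary choices):

```python
import math

def h(n, x):
    # Normalized probabilists' Hermite polynomials, orthonormal w.r.t. N(0,1),
    # via the recurrence h_{k+1} = (x h_k - sqrt(k) h_{k-1}) / sqrt(k+1).
    h0, h1 = 1.0, x
    if n == 0:
        return h0
    for k in range(1, n):
        h0, h1 = h1, (x * h1 - math.sqrt(k) * h0) / math.sqrt(k + 1)
    return h1

n, x, eps = 5, 0.7, 1e-6
dh = (h(n, x + eps) - h(n, x - eps)) / (2 * eps)   # central difference
assert abs(dh - math.sqrt(n) * h(n - 1, x)) < 1e-6  # h_n' = sqrt(n) h_{n-1}
```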

10.2. s∗-computability of ([Θ]′_{H1(0,T)}, Θ)_{L2(0,T)} and of its adjoint. By (t1), (t2) and (t4), for each λ ∈ ∇t and ℓ ∈ N0, the number of µ ∈ ∇t with |µ| = ℓ and ∫_0^T θ′λ θµ dt ≠ 0 or ∫_0^T θ′µ θλ dt ≠ 0 is bounded, uniformly in λ and ℓ. Indeed, ∫_0^T θ′λ θµ dt can only be nonzero when θµ does not vanish on the singular support of θλ, and, using integration by parts, ∫_0^T θ′µ θλ dt can only be nonzero when θµ does not vanish on the singular support of θλ or at t ∈ {0, T}.

As a consequence of Θ being of order dt ≥ 1, we have that

2^{|λ|} = 2^{|λ|} ‖θλ‖_{L2(0,T)} = 2^{|λ|} ‖(Id − Q_{|λ|−1})θλ‖_{L2(0,T)} ≲ ‖θλ‖_{H1(0,T)}.

Using (t1) and (t3), we infer that

‖θλ‖_{H1(0,T)}^{−1} |∫_0^T θ′λ θµ dt| ≲ 2^{−|λ|} 2^{−max(|λ|,|µ|)} 2^{|λ|(1/2+1)} 2^{|µ|/2} = 2^{−(1/2)||λ|−|µ||}. (10.3)

Finally, we note that any entry of ⟨[Θ]′_{H1(0,T)}, Θ⟩_{L2(0,T)} can be evaluated in closed form in O(1) work and memory. Schur's Lemma (cf. [23, p. 6] and [6, p. 449, Theorem B]) now implies that ([Θ]′_{H1(0,T)}, Θ)_{L2(0,T)} and (Θ, [Θ]′_{H1(0,T)})_{L2(0,T)} are ∞-computable.
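Schur's Lemma enters here in the form of the classical Schur test: if the absolute row sums and column sums of a matrix are uniformly bounded, its ℓ2→ℓ2 norm is bounded by the square root of the product of the two bounds. A sketch against a model matrix carrying the level-wise decay 2^{−(1/2)|ℓ−m|} of (10.3); the size L and the decay profile are illustrative choices, not the paper's actual wavelet matrix:

```python
import numpy as np

def schur_bound(M):
    # Schur test with unit weights: ||M||_{2->2} <= sqrt(max row sum * max col sum)
    A = np.abs(M)
    return np.sqrt(A.sum(axis=1).max() * A.sum(axis=0).max())

# Model matrix with the off-diagonal decay 2^{-|l-m|/2} across L levels.
L = 12
M = np.array([[2.0 ** (-0.5 * abs(l - m)) for m in range(L)] for l in range(L)])
assert np.linalg.norm(M, 2) <= schur_bound(M) + 1e-12
```

The geometric decay keeps every row and column sum below a level-independent constant, which is exactly why the uncompressed matrix is already a bounded operator.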


ADAPTIVE ALGORITHMS IN INFINITE DIMENSIONS 21

Note that from diam(supp θλ) ≲ 2^{−|λ|} (cf. (t1)) and from (t3) we deduce that ‖θλ‖_{H1(0,T)} ≲ 2^{|λ|}, and thus that

‖θλ‖_{H1(0,T)} ≂ 2^{|λ|}.

Remark 10.2. Suppose that, instead of (t3), the θλ also belong to C^{rt}(0,T) for some rt ∈ N (necessarily with rt ≤ dt − 2), i.e., that ‖θλ‖_{W^{s,∞}(0,1)} ≲ 2^{|λ|(1/2+s)} for s ∈ {0, rt + 1}. Then, by subtracting a suitable polynomial of order rt from θ′λ in (10.3), and using that θµ has d̃t ≥ dt ≥ rt vanishing moments, one deduces that, for |λ| ≤ |µ|,

‖θλ‖_{H1(0,T)}^{−1} |∫_0^T θ′λ θµ dt| ≲ 2^{−||λ|−|µ||(1/2+rt)}.

Similarly, for |λ| ≥ |µ|, using integration by parts one obtains that

‖θλ‖_{H1(0,T)}^{−1} |∫_0^T θ′λ θµ dt| ≲ 2^{−||λ|−|µ||(3/2+rt)}

if t ↦ θλ(t) θµ(t) vanishes at t ∈ {0, T}. Since the wavelets in time do not satisfy Dirichlet boundary conditions, there are, however, λ, µ ∈ ∇t with |λ| ≥ |µ| for which t ↦ θλ(t) θµ(t) does not vanish at the boundary. For those entries (10.3) cannot be improved.

Remark 10.3. In [8], multiresolutions Θ = (θλ)_{λ∈∇t} were constructed for which the matrices ([Θ]′_{H1(0,T)}, Θ)_{L2(0,T)} are sparse without compression.

10.3. s∗-computability of (Υ,Υ)_H and ([Υ]_{V∗}, [Υ]_V)_H and of their adjoints. By the L2(H,µ)-orthonormality (4.4) of the family Υ of Hermite polynomials, both (Υ,Υ)_H and ([Υ]_{V∗}, [Υ]_V)_H and their adjoints are bi-infinite, diagonal matrices that are therefore ∞-computable.

10.4. s∗-computability of ∫_0^T a(Θ⊗[Υ]_V, Θ⊗[Υ]_V) dt and of its adjoint. We have that

∫_0^T a(Θ⊗[Υ]_V, Θ⊗[Υ]_V) dt = (Θ,Θ)_{L2(0,T)} ⊗ a([Υ]_V, [Υ]_V). (10.4)

Since the form a(·,·) defined in (5.8) is symmetric, by Proposition 10.1 it suffices to investigate the s∗-computability of the two factors on the right-hand side of (10.4) in order to deduce the s∗-computability of B and B∗.

As was noted in [24, Sec. 8.5], under the assumptions (t1)–(t4) above, (Θ,Θ)_{L2(0,T)} is ∞-computable, so it remains to address the s∗-computability of a([Υ]_V, [Υ]_V).

In view of the definition (5.8) of the bilinear form a(·,·) appearing in (5.6) and noting Definition 8.8, the s∗-computability of the infinite matrix G = a([Υ]_V, [Υ]_V) depends on the structure of the infinite matrix A = (A_{ij})_{i,j=1}^∞ in (5.6). For the sake of simplicity of the exposition we shall suppose that d = 1, so that D = R × R × ⋯. Thanks to (10.4),

a([Υ]_V, [Υ]_V) = (G_{γγ′})_{γ,γ′∈Γ},

where, on recalling the definition (5.8) of the symmetric bilinear form a(·,·), the entries G_{γγ′} of the symmetric matrix G are, for γ, γ′ ∈ Γ, given by

G_{γγ′} = Σ_{i,j≥1} A_{ij} (Dλ_γ)^{−1} (D_{qi}Hγ, D_{qj}H_{γ′})_{L2(D,µ)} (Dλ_{γ′})^{−1} + δ_{γγ′}.

Clearly, if γ = 0, then G_{γγ′} = δ_{γγ′} for all γ′ ∈ Γ; similarly, if γ′ = 0, then G_{γγ′} = δ_{γγ′} for all γ ∈ Γ. We shall therefore focus our attention on the nontrivial case when γ, γ′ ∈ Γ \ {0}; for any such γ and γ′, we have by (10.2) that

G_{γγ′} = Σ_{i,j≥1: γ_i≥1, γ′_j≥1} A_{ij} (Dλ_γ)^{−1} (√(γ_i/λ_i) H_{γ_i−1}H^{(i)}_γ, √(γ′_j/λ_j) H_{γ′_j−1}H^{(j)}_{γ′})_{L2(D,µ)} (Dλ_{γ′})^{−1} + δ_{γγ′}

= Σ_{i,j≥1: γ_i≥1, γ′_j≥1} A_{ij} (Dλ_γ)^{−1}(Dλ_{γ′})^{−1} √(γ_iγ′_j/(λ_iλ_j)) (H_{γ_i−1}H^{(i)}_γ, H_{γ′_j−1}H^{(j)}_{γ′})_{L2(D,µ)} + δ_{γγ′}.



We now impose additional hypotheses on the matrix A. It is illustrative (and will be used in what follows) to first consider the special case A_{ij} = δ_{ij}. We denote the corresponding matrix G by G^{(0)}. Inserting this choice of A_{ij} into the previous expression, we obtain from the L2(D,µ) orthonormality (4.4) of the family {Hγ : γ ∈ Γ} that

G^{(0)}_{γγ′} = Σ_{i,j≥1: γ_i≥1, γ′_j≥1} δ_{ij} √(γ_iγ′_j/(λ_iλ_j)) (Dλ_γ)^{−1}(Dλ_{γ′})^{−1} (H_{γ_i−1}H^{(i)}_γ, H_{γ′_j−1}H^{(j)}_{γ′})_{L2(D,µ)} + δ_{γγ′}

= Σ_{i≥1: γ_i≥1, γ′_i≥1} (√(γ_iγ′_i)/λ_i) (Dλ_γ)^{−1}(Dλ_{γ′})^{−1} (H_{γ_i−1}H^{(i)}_γ, H_{γ′_i−1}H^{(i)}_{γ′})_{L2(D,µ)} + δ_{γγ′}

= δ_{γγ′} (Dλ_γ)^{−1}(Dλ_{γ′})^{−1} Σ_{i: γ_i≥1, γ′_i≥1} √(γ_iγ′_i)/λ_i + δ_{γγ′}

= δ_{γγ′} (Dλ_γ)^{−2} ⟨γ, λ^{−1}⟩ + δ_{γγ′} = 2δ_{γγ′}.

For A_{ij} = δ_{ij} the infinite matrix G = G^{(0)} is, therefore, ∞-computable. We next turn to the tridiagonal matrix A considered in Example 5.1, whose entries are

A_{ij} = δ_{ij} + ε_i δ_{i,j−1} + ε_i δ_{i,j+1}, i, j = 1, 2, … . (10.5)
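The computation G^{(0)} = 2·Id above can be replayed on a finite set of multi-indices: the inner product (H_{γ_i−1}H^{(i)}_γ, H_{γ′_i−1}H^{(i)}_{γ′}) equals 1 precisely when γ = γ′, so each diagonal entry collapses to (Dλ_γ)^{−2}⟨γ, λ^{−1}⟩ + 1 = 2. A sketch with three active coordinates; the eigenvalues λ_i are hypothetical stand-ins:

```python
import itertools, math

lam = [1.0, 0.5, 0.25]          # hypothetical eigenvalues lambda_i of the covariance
Gamma = [g for g in itertools.product(range(3), repeat=3) if any(g)]  # gamma != 0

def ip(g, gp, i):
    # (H_{g_i-1} H^{(i)}_g, H_{g'_i-1} H^{(i)}_{g'})_{L^2}: 1 iff the lowered
    # multi-indices coincide coordinatewise, i.e. iff g == g'.
    return 1.0 if all((g[j] - (j == i)) == (gp[j] - (j == i)) for j in range(3)) else 0.0

def Dl(g):                       # diagonal normalizer <gamma, lambda^{-1}>^{1/2} of (10.1)
    return math.sqrt(sum(gk / lk for gk, lk in zip(g, lam)))

def G0(g, gp):
    s = sum(math.sqrt(g[i] * gp[i]) / lam[i] * ip(g, gp, i)
            for i in range(3) if g[i] >= 1 and gp[i] >= 1)
    return s / (Dl(g) * Dl(gp)) + (1.0 if g == gp else 0.0)

for g in Gamma:
    for gp in Gamma:
        assert abs(G0(g, gp) - (2.0 if g == gp else 0.0)) < 1e-12
```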

With the diagonal term having already been discussed, by superposition we may now confine ourselves to investigating the computability of the matrix G when the matrix A has entries A_{ij} = ε_i δ_{i,j−1}, i, j = 1, 2, …, and when A has entries A_{ij} = ε_i δ_{i,j+1}, i, j = 1, 2, … . The matrices G corresponding to these two cases will be denoted below by G^{(−)} and G^{(+)}, and we denote their respective entries by G^{(+)}_{γγ′} and G^{(−)}_{γγ′}, with γ, γ′ ∈ Γ. For the sake of excluding trivial situations we shall assume henceforth that ε_i ≠ 0, i = 1, 2, … .

Given γ, γ′ ∈ Γ \ {0}, we calculate, as before, that

G^{(±)}_{γγ′} = Σ_{i,j≥1: γ_i≥1, γ′_j≥1} ε_i δ_{i,j±1} √(γ_iγ′_j/(λ_iλ_j)) (Dλ_γ)^{−1}(Dλ_{γ′})^{−1} (H_{γ_i−1}H^{(i)}_γ, H_{γ′_j−1}H^{(j)}_{γ′})_{L2(D,µ)} + δ_{γγ′}

= Σ_{i≥1: γ_i≥1, γ′_{i∓1}≥1} ε_i √(γ_iγ′_{i∓1}/(λ_iλ_{i∓1})) (Dλ_γ)^{−1}(Dλ_{γ′})^{−1} (H_{γ_i−1}H^{(i)}_γ, H_{γ′_{i∓1}−1}H^{(i∓1)}_{γ′})_{L2(D,µ)} + δ_{γγ′}, (10.6)

with the notational convention γ′_0 := 0 and λ_0 := λ_1 (> 0). Thus, for example, if γ = 1_1 := (1, 0, 0, …), then G^{(+)}_{γγ′} = δ_{γγ′} for all γ′ ∈ Γ \ {0}; G^{(−)}_{γγ′} = ε_1 for γ′ = (0, 1, 0, 0, …), and G^{(−)}_{γγ′} = δ_{γγ′} for all other γ′ ∈ Γ \ {0}. Also, if γ = 0, then G^{(±)}_{0γ′} = δ_{0γ′} for all γ′ ∈ Γ; and, analogously, if γ′ = 0, then G^{(±)}_{γ0} = δ_{γ0} for all γ ∈ Γ.

It therefore remains to compute G^{(±)}_{γγ′} for γ, γ′ ∈ Γ \ {0, 1_1}. Hence, defining, for a fixed integer i ≥ 1 and for γ, γ′ ∈ Γ \ {0} such that γ_i ≥ 1 and γ′_{i∓1} ≥ 1, the expression

H^{(±),(i)}_{γγ′} := (H_{γ_i−1}H^{(i)}_γ, H_{γ′_{i∓1}−1}H^{(i∓1)}_{γ′})_{L2(D,µ)},

we deduce that

H^{(±),(i)}_{γγ′} = 1 if γ′_i = γ_i − 1 ∧ γ′_{i∓1} = γ_{i∓1} + 1 ∧ γ′_j = γ_j for j ∈ {i, i∓1}^c ∩ N, and 0 otherwise.

Now, for any γ ∈ Γ \ {0} consider the finite sets Z^±(γ) ⊂ Γ \ {0} consisting of all γ′ ∈ Γ \ {0} for which there exists an integer i ≥ 1 such that H^{(±),(i)}_{γγ′} = 1. Clearly, Z^+(1_1) = ∅, while Z^−(1_1) is equal to the singleton {(0, 1, 0, 0, …)}. More generally,

(a) for each γ ∈ Γ \ {0, 1_1}, the set Z^±(γ) is nonempty;

(b) for each γ ∈ Γ \ {0, 1_1} and γ′ ∈ Z^±(γ), there exists a unique integer i_± = i_±(γ, γ′) ≥ 1 such that H^{(±),(i_±)}_{γγ′} = 1;

(c) for each γ ∈ Γ \ {0, 1_1}, the mapping i_±(γ, ·): γ′ ∈ Z^±(γ) ↦ i_±(γ, γ′) ∈ N is injective;



(d) for each γ ∈ Γ \ {0}, γ ∉ Z^±(γ).
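The sets Z^±(γ) admit a concrete description: γ′ ∈ Z^±(γ) exactly when γ′ arises from γ by moving one unit from some coordinate i to its neighbour i∓1 (with no target coordinate 0 available in the Z^+ case, so the first coordinate can only receive, not send, which is why Z^+(1_1) = ∅). A finite-coordinate sketch; the truncation K is purely an artefact of the illustration:

```python
# Finite truncation to K coordinates (0-based indices) of the index shifts
# behind Z^±(gamma): gamma' = gamma - e_i + e_{i∓1} for admissible i.
K = 4

def Zplus(g):   # move one unit from coordinate i to i-1 (paper's i >= 2)
    return {tuple(v - (j == i) + (j == i - 1) for j, v in enumerate(g))
            for i in range(1, K) if g[i] >= 1}

def Zminus(g):  # move one unit from coordinate i to i+1
    return {tuple(v - (j == i) + (j == i + 1) for j, v in enumerate(g))
            for i in range(K - 1) if g[i] >= 1}

one1 = (1, 0, 0, 0)
assert Zplus(one1) == set()
assert Zminus(one1) == {(0, 1, 0, 0)}
g = (2, 0, 1, 0)
assert g not in Zplus(g) and g not in Zminus(g)     # property (d)
assert Zplus(g) == {(2, 1, 0, 0)}
assert Zminus(g) == {(1, 1, 1, 0), (2, 0, 0, 1)}
```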

Hence, for a fixed γ ∈ Γ \ {0, 1_1} and any γ′ ∈ Z^±(γ), we have from (10.6) (note that, by (d), δ_{γγ′} = 0) that

G^{(±)}_{γγ′} = ε_{i±} √(γ_{i±} γ′_{i±∓1} / (λ_{i±} λ_{i±∓1})) ⟨γ, λ^{−1}⟩^{−1/2} ⟨γ′, λ^{−1}⟩^{−1/2}, where i_± = i_±(γ, γ′) ≥ 1, γ_{i±} ≥ 1, γ′_{i±∓1} ≥ 1; (10.7)

otherwise, for γ′ ∈ (Γ \ {0, 1_1}) \ Z^±(γ), we have that G^{(±)}_{γγ′} = δ_{γγ′}. In any case, we deduce that in each 'row' with index γ ∈ Γ \ {0, 1_1} the matrix G^{(+)} contains nonzero off-diagonal entries only in 'columns' γ′ ∈ Z^+(γ); analogously, in each 'row' with index γ ∈ Γ \ {0} the matrix G^{(−)} contains nonzero off-diagonal entries only in 'columns' γ′ ∈ Z^−(γ). All diagonal entries of G^{(+)} and G^{(−)} are equal to 1.

An analogous calculation reveals that the nonzero off-diagonal entries of the transpose of G^{(∓)} contained in 'row' γ (i.e., 'column' γ of G^{(∓)}) are given by

G^{(∓)}_{γ′γ} = ε_{j∓} √(γ_{j∓±1} γ′_{j∓} / (λ_{j∓±1} λ_{j∓})) ⟨γ, λ^{−1}⟩^{−1/2} ⟨γ′, λ^{−1}⟩^{−1/2}, where j_∓ = j_∓(γ, γ′) ≥ 1, γ′_{j∓} ≥ 1, γ_{j∓±1} ≥ 1. (10.8)

By taking j_∓ = i_± ∓ 1, noting that i_± ↦ j_∓ := i_± ∓ 1 is a one-to-one correspondence, and comparing (10.8) with (10.7), we see that the number of nonzero off-diagonal entries in 'row' γ of G^{(±)} is equal to the number of nonzero off-diagonal entries in 'column' γ of G^{(∓)}. Since ε_{i±} and ε_{i±∓1} may differ, the individual nonzero off-diagonal entries in 'row' γ of G^{(±)} are not necessarily equal to the respective off-diagonal entries in 'column' γ of G^{(∓)}; however, that is of no relevance for the rest of our argument.

Returning to (10.5), we thus deduce by superposition that for the Fokker–Planck equation (3.23), with a tridiagonal coefficient matrix A as defined in Example 5.1, the infinite matrix G = G^{(0)} + G^{(+)} + G^{(−)} is such that in each 'row' with index γ ∈ Γ \ {0} the matrix G has nonzero off-diagonal entries in 'columns' γ′ ∈ Z^+(γ) ∪ Z^−(γ) only. Thanks to the symmetry of G, the number of nonzero off-diagonal entries in each 'row' γ of G is equal to the number of nonzero off-diagonal entries in 'column' γ of G. The diagonal entry of G in each 'row' γ ∈ Γ is equal to 2δ_{γγ} + δ_{γγ} + δ_{γγ} = 4.

We now refer to Definition 8.8 and verify condition (8.3): i.e., we wish to show that for s > 0 we have that

c_{G,s} := sup_{N∈N} N ‖G − G^{[N]}‖_{ℓ2(Γ)→ℓ2(Γ)}^{1/s} < ∞,

where (G^{[N]})_{N=1}^∞ is a sequence of infinite matrices, which we shall define below.

We consider, for each N ≥ 1, the matrix G^{[N]} := G^{(0)} + G^{(+),[N]} + G^{(−),[N]}, where the matrices G^{(+),[N]} and G^{(−),[N]} are defined as follows. For N = 1, we choose G^{(+),[N]} and G^{(−),[N]} to be diagonal, with the same diagonal entries as G^{(+)} and G^{(−)}, respectively. For N > 1, we again define the diagonal parts of the matrices G^{(±),[N]} to be the same as those of G^{(+)} and G^{(−)}. The off-diagonal entries of G^{(+),[N]} and G^{(−),[N]} are defined, 'row'-wise, as follows.

Let 0 < p < 2 and define c_p := ⌊2^{2p/(2−p)}⌋ + 1; for an integer N > 1 we shall then use the notation N_p := c_p N, and for a given sequence ε ∈ ℓ2(N) in the definition (5.3) of the infinite tridiagonal matrix A appearing in the bilinear form a(·,·), we shall denote by ε^{[N_p]} the best N_p-term approximation of ε in ℓ2(N). For N > 1, we define G^{(+),[N]} to contain, in the off-diagonal of the 'row' associated with an index γ ∈ Γ \ {0}, at most N_p nonzero elements G^{(+)}_{γγ′}, whenever γ′ ∈ Z^+(γ) with associated index i_+ := i_+(γ, γ′) is such that i_+ ∈ {j : ε^{[N_p]}_j ≠ 0}. Otherwise, the off-diagonal entry G^{(+),[N]}_{γγ′} is defined to be 0. For N > 1 the off-diagonal entries of the matrix G^{(−),[N]} are defined completely analogously to those of G^{(+),[N]}.

Having thus defined the infinite matrices G^{(±),[N]}, N = 1, 2, …, and noting that, by the triangle inequality,

‖G − G^{[N]}‖_{ℓ2(Γ)→ℓ2(Γ)} ≤ ‖G^{(+)} − G^{(+),[N]}‖_{ℓ2(Γ)→ℓ2(Γ)} + ‖G^{(−)} − G^{(−),[N]}‖_{ℓ2(Γ)→ℓ2(Γ)}, (10.9)



it remains to bound the norms of G^{(±)} − G^{(±),[N]}. We shall use Schur's Lemma (cf. [23, p. 6] and [6, p. 449, Theorem B]) for this purpose. Thanks to the 'row'-wise definition of G^{(±),[N]}, we already know the elements contained in each 'row' of G^{(±)} − G^{(±),[N]}; in particular, the diagonal entries of these matrices are equal to 0. Since the infinite matrices G^{(±)} − G^{(±),[N]} are nonsymmetric, we also need to understand the structure of their 'columns' in order to be able to apply Schur's Lemma. As we already know from (10.8) what the nonzero off-diagonal entries in the columns of G^{(±)} are, it remains to consider the off-diagonal column entries of G^{(±),[N]}. We shall in fact consider G^{(+),[N]} only, as an identical argument applies in the case of G^{(−),[N]}.

Thanks to (10.8), the number of nonzero off-diagonal entries of G^{(+),[N]} in 'column' γ ∈ Γ \ {0} is equal to the number of nonzero off-diagonal entries in 'row' γ of G^{(−),[N]}; let us denote this number by J^+(γ). Hence, column γ of G^{(+),[N]} contains at most J^+(γ) nonzero off-diagonal entries, given by the formula (10.8) whenever ε_{j−} = ε_{i+−1} ≠ 0 is present in the best N_p-term approximation of ε; every other off-diagonal entry in column γ of G^{(+),[N]} is equal to zero, either because the particular entry was already equal to zero in the matrix G^{(+)}, or because the entry was nonzero in G^{(+)} but was set to zero while sweeping through the rows in the course of defining the off-diagonal entries of G^{(+),[N]} 'row'-wise.

Since, for all N ∈ N and every 0 < p < 2,

‖ε − ε^{[N_p]}‖_{ℓ2(N)} ≤ N_p^{−(1/p−1/2)} ‖ε‖_{ℓp(N)} ≤ (2^{2p/(2−p)} N)^{−(1/p−1/2)} ‖ε‖_{ℓp(N)} = (1/2) N^{−(1/p−1/2)} ‖ε‖_{ℓp(N)},

it then follows by Schur's Lemma (cf. [23, p. 6] and [6, p. 449, Theorem B]) that

∀N ∈ N: ‖G^{(±)} − G^{(±),[N]}‖_{ℓ2(Γ)→ℓ2(Γ)} ≤ ‖ε − ε^{[N_p]}‖_{ℓ2(N)} ≤ (1/2) N^{−(1/p−1/2)} ‖ε‖_{ℓp(N)},

and, therefore, by the triangle inequality (10.9), we have that

∀N ∈ N: ‖G − G^{[N]}‖_{ℓ2(Γ)→ℓ2(Γ)} ≤ N^{−(1/p−1/2)} ‖ε‖_{ℓp(N)},

from which we deduce that

c_{G,s} = sup_{N∈N} N ‖G − G^{[N]}‖_{ℓ2(Γ)→ℓ2(Γ)}^{1/s} ≤ sup_{N∈N} N^{1−(1/p−1/2)/s} ‖ε‖_{ℓp(N)}^{1/s} ≤ ‖ε‖_{ℓp(N)}^{1/s} < ∞,

provided that ε ∈ ℓp(N) with 0 < p < 2 and that s is chosen so that

0 < s ≤ s(p) := 1/p − 1/2. (10.10)
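The tail bound invoked above is Stechkin's inequality: if ε ∈ ℓp with p < 2, the best N-term approximation error in ℓ2 decays at least like N^{−(1/p−1/2)}. A numeric sketch; the geometric model sequence and the exponent p are illustrative choices:

```python
import numpy as np

p = 1.0                                   # hypothetical summability exponent, 0 < p < 2
eps_seq = 2.0 ** -np.arange(1, 40)        # an lp-summable model sequence
lp_norm = np.sum(eps_seq ** p) ** (1 / p)

for N in (1, 2, 5, 10):
    # Best N-term approximation keeps the N largest entries; the error is the l2 tail.
    tail = np.sort(np.abs(eps_seq))[: len(eps_seq) - N]
    err = np.linalg.norm(tail)
    assert err <= N ** (-(1 / p - 1 / 2)) * lp_norm  # Stechkin's bound
```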

Referring to the definition of s∗-computability (cf. Definition 8.8), we infer that G is s∗-computable with any 0 < s∗ ≤ s(p) if the sequence ε in Example 5.1 belongs to ℓp(N) with some 0 < p < 2, with s∗ = 1/p − 1/2 (this encompasses the previous case of A_{ij} = δ_{ij}, i, j = 1, 2, …, if p = 0 is understood to indicate that ε is the zero sequence).

11. Fokker–Planck equations with drift

11.1. Bounded invertibility of B and B∗ for nonsymmetric a(·,·). So far, we have assumed that the "spatial" differential operator is defined by the symmetric bilinear form a(·,·). This is always possible when the vector function appearing in the coefficient of the drift term in the Fokker–Planck equation is the gradient of a spring potential, by introducing Maxwellian-weighted spaces L2_M(D) and H1_M(D) and by rescaling the probability density function by the Maxwellian; cf. (3.26). In cases when the drift term cannot be removed by transformation to Maxwellian-weighted operators, however, nonsymmetric bilinear forms a(·,·) must be considered. The bounded invertibility of the operator B corresponding to the bilinear form B∗(·,·) in (5.6) must then be considered separately. We shall now prove the existence of weak solutions for the Fokker–Planck equation with an Ornstein–Uhlenbeck type drift term by verifying the conditions of the abstract existence result, Theorem 2.5. To this end, we consider in detail the structure of the drift term. Let µ be the countable product of Gaussian measures µ_k, k ≥ 1, on R^d with trace-class covariance operator Q. The new contribution added to the symmetric bilinear form a(·,·) that was



under consideration in the previous sections is assumed, for a sequence σ = (σ_k)_{k=1}^∞ ∈ [ℓ∞(N)]^{d×d}, to be the bilinear form d(·,·) defined by

d(ψ, φ) = −Σ_{k≥1} (σ_k q_k ψ, D_k φ)_{L2(H,µ)}.

Proposition 11.1. Assume that σ ∈ [ℓ∞(N)]^{d×d} and that the covariance operator Q of the Gaussian measure µ on H is trace-class. Then, d(·,·): W1,2(H,µ) × W1,2(H,µ) → R is continuous and

|d(ψ, φ)| ≤ ‖σ‖_{[ℓ∞(N)]^{d×d}} (2 Tr Q ∫_H |ψ(q)|² µ(dq) + 4‖Q‖² ∫_H |Dψ(q)|² µ(dq))^{1/2} ‖Dφ‖_{L2(H,µ)}. (11.11)

Proof. We write

|d(ψ, φ)| ≤ ‖σ‖_{[ℓ∞(N)]^{d×d}} (Σ_{k≥1} ‖|q_k| ψ‖²_{L2(H,µ)})^{1/2} (Σ_{k′≥1} ‖D_{k′} φ‖²_{L2(H,µ)})^{1/2} = ‖σ‖_{[ℓ∞(N)]^{d×d}} (Σ_{k≥1} ‖|q_k| ψ‖²_{L2(H,µ)})^{1/2} ‖Dφ‖_{L2(H,µ)}.

Using [12, Prop. 9.2.10] and the assumption that ψ ∈ W1,2(H,µ), we deduce that, for a Gaussian measure µ with trace-class covariance Q,

∫_H ‖q‖² |ψ(q)|² µ(dq) ≤ 2 Tr Q ∫_H |ψ(q)|² µ(dq) + 4‖Q‖² ∫_H |Dψ(q)|² µ(dq).

This then yields the desired inequality (11.11). □
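In one dimension (H = R, Q = λ > 0), the quadratic-moment bound used in the proof reads ∫ q²ψ(q)² dµ ≤ 2λ∫ψ² dµ + 4λ²∫(ψ′)² dµ for µ = N(0, λ). A quadrature check for one smooth test function; the eigenvalue λ and the choice ψ = sin are arbitrary illustrations, not canonical:

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

lam = 0.7                                  # hypothetical covariance eigenvalue
t, w = hermegauss(80)                      # Gauss quadrature for weight exp(-t^2/2)
w = w / np.sqrt(2 * np.pi)                 # normalize so that mu = N(0, 1) in t
q = np.sqrt(lam) * t                       # rescale: q ~ N(0, lam)

psi, dpsi = np.sin(q), np.cos(q)           # smooth test function and its derivative
lhs = np.sum(w * q**2 * psi**2)
rhs = 2 * lam * np.sum(w * psi**2) + 4 * lam**2 * np.sum(w * dpsi**2)
assert lhs <= rhs
```

The inequality follows from one Gaussian integration by parts plus Cauchy–Schwarz, which is why the constants 2 and 4 appear.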

The bound (11.11) implies a Gårding inequality for the Fokker–Planck operator with drift.

Proposition 11.2. Assume that the covariance operator Q of the Gaussian measure µ is trace-class, and that the infinite coefficient matrix A satisfies (5.1), (5.2). Then, the bilinear form

a(·,·) + d(·,·): V × V → R

is continuous (i.e., (2.21) holds) and satisfies the Gårding inequality (2.22) in the triple V ⊂ H ≃ H∗ ⊂ V∗. In particular, the space-time variational formulation (5.5) with the spatial bilinear form a(·,·) + d(·,·) in place of a(·,·) in (5.6) is well-posed and its bilinear form B∗(·,·) induces a boundedly invertible operator B∗ ∈ L(Y, (X_{0,{T}})∗).

Proof. The continuity of the bilinear form a(·,·) + d(·,·) is evident from the previous proposition and the continuity of a(·,·). The Gårding inequality (2.22) follows from the coercivity of a(·,·) on V × V and from the continuity estimate (11.11) using a Cauchy inequality. The bounded invertibility of B∗ ∈ L(Y, (X_{0,{T}})∗) follows from Theorem 2.5. □

11.2. s∗-computability of ∫_0^T d(Θ⊗[Υ]_V, Θ⊗[Υ]_V) dt and of its adjoint. As in Section 10.4, thanks to the independence of the sequence σ ∈ [ℓ∞(N)]^{d×d} of t, we have that

∫_0^T d(Θ⊗[Υ]_V, Θ⊗[Υ]_V) dt = (Θ,Θ)_{L2(0,T)} ⊗ d([Υ]_V, [Υ]_V).

As in the discussion of (10.4), the sparsity of the factor (Θ,Θ)_{L2(0,T)} is considered in [24, Sec. 8.5]. The sparsity and s∗-computability of d(·,·) are therefore determined by those of the infinite drift matrix D = D[σ] defined via d([Υ]_V, [Υ]_V) = (D_{γγ′})_{γ,γ′∈Γ}. With the Hermite polynomial Riesz basis [Υ]_V of L2(H,µ), the matrix entries D_{γγ′} = D_{γγ′}[σ] are (assuming, once again, for the sake of simplicity of the exposition, that d = 1; abbreviating D_{qk} as D_k to simplify the notation; and recalling the definition (10.1) of Dλ_γ):

D_{γγ′}[σ] = Σ_{k≥1} σ_k (Dλ_γ)^{−1} (Dλ_{γ′})^{−1} (D_k(q_k Hγ), H_{γ′})_{L2(H,µ)}. (11.12)



For the ensuing calculations, we define

θ_{γγ′}(k) := (D_k(q_k Hγ), H_{γ′})_{L2(H,µ)}

and observe that

θ_{γγ′}(k) = (Hγ, H_{γ′})_{L2(H,µ)} + (q_k D_k Hγ, H_{γ′})_{L2(H,µ)} = δ_{γγ′} + (q_k D_k Hγ, H_{γ′})_{L2(H,µ)}. (11.13)

In order to calculate the second term on the right-hand side of (11.13), we note that, by (10.2),

D_k Hγ(q) = √(γ_k/λ_k) H_{γ−e_k}(q) if γ_k ≥ 1, and D_k Hγ(q) = 0 otherwise,

where e_k ∈ Γ denotes the multi-index with entry 1 in position k and with zero entries in all other positions. Also, thanks to the three-term recurrence relation

q_k H_{γ_k}(q_k) = √(γ_k + 1) H_{γ_k+1}(q_k) + √γ_k H_{γ_k−1}(q_k), γ_k ∈ N ∪ {0},

for the univariate Hermite polynomials (4.3), with the notational convention that H_m(q_k) ≡ 0 when m < 0, we have that

q_k D_k [H_{γ_k}(q_k/√λ_k)] = q_k √(γ_k/λ_k) H_{γ_k−1}(q_k/√λ_k) = γ_k H_{γ_k}(q_k/√λ_k) + √(γ_k(γ_k−1)) H_{γ_k−2}(q_k/√λ_k)

for any γ_k ≥ 0. Consequently, upon multiplying both sides of the last identity by H^{(k)}_γ, we deduce that

q_k D_k Hγ = γ_k Hγ + √(γ_k(γ_k−1)) H_{γ−2e_k},

which, upon substitution into (11.13) and then inserting the resulting expression into (11.12), yields, for γ, γ′ ∈ Γ, that

D_{γγ′}[σ] = Σ_{k≥1} σ_k ((1 + γ_k) δ_{γγ′} + √(γ_k(γ_k−1)) δ_{γ−2e_k,γ′}) ⟨γ, λ^{−1}⟩^{−1/2} ⟨γ′, λ^{−1}⟩^{−1/2}.

We observe that D[σ] depends linearly on the sequence σ. The verification of the s∗-computability now proceeds analogously to that of A[ε].
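The closed form for D_{γγ′}[σ] can be cross-checked against direct quadrature of the inner products in (11.12) in a two-coordinate truncation. The eigenvalues λ_k and coefficients σ_k below are hypothetical stand-ins; both routes agree to quadrature accuracy:

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

lam = [0.8, 0.5]                    # hypothetical covariance eigenvalues lambda_k
sig = [0.3, -0.2]                   # hypothetical drift coefficients sigma_k
t, w = hermegauss(60); w = w / np.sqrt(2 * np.pi)   # quadrature for N(0,1)

def h(n, x):                        # orthonormal probabilists' Hermite, vectorized
    h0, h1 = np.ones_like(x), np.asarray(x, float)
    if n == 0:
        return h0
    for k in range(1, n):
        h0, h1 = h1, (x * h1 - math.sqrt(k) * h0) / math.sqrt(k + 1)
    return h1

def I1(a, b, lk, eps=1e-5):
    # int d/dq ( q h_a(q/sqrt(lk)) ) h_b(q/sqrt(lk)) dN(0, lk), derivative by FD
    q = math.sqrt(lk) * t
    f = lambda qq: qq * h(a, qq / math.sqrt(lk))
    return np.sum(w * (f(q + eps) - f(q - eps)) / (2 * eps) * h(b, q / math.sqrt(lk)))

def Dnorm(g):
    return math.sqrt(sum(gk / lk for gk, lk in zip(g, lam)))

def entry_quad(g, gp):              # (11.12) with two active coordinates
    s = sum(sig[k] * I1(g[k], gp[k], lam[k]) *
            all(g[j] == gp[j] for j in range(2) if j != k) for k in range(2))
    return s / (Dnorm(g) * Dnorm(gp))

def entry_closed(g, gp):            # the closed form derived above
    s = sum(sig[k] * ((1 + g[k]) * (g == gp) +
            math.sqrt(g[k] * (g[k] - 1)) *
            (tuple(v - 2 * (j == k) for j, v in enumerate(g)) == gp))
            for k in range(2))
    return s / (Dnorm(g) * Dnorm(gp))

for g, gp in [((2, 1), (2, 1)), ((2, 1), (0, 1)), ((3, 2), (3, 0)), ((2, 1), (1, 1))]:
    assert abs(entry_quad(g, gp) - entry_closed(g, gp)) < 1e-6
```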

We denote by σ[N ] an N -term approximation of the sequence σ and observe that D[N ][σ] :=D[σ[N ]] has at most 2N + 1 nonzero entries in each ‘row’ with index γ ∈ Γ. To verify s∗-computability, by Definition 8.8 we must estimate

cD,s = supN∈N

N∥∥∥D[σ]−D[N ][σ]

∥∥∥1/s

`2(Γ)→`2(Γ)≤ supN∈N

(2N + 1)∥∥∥D[σ − σ[N ]]

∥∥∥1/s

`2(Γ)→`2(Γ). (11.14)

For a given fixed γ ∈ Γ, and for any γ′ ∈ Γ and any k ∈ N, we have that

θ_{γγ′}(k) := (1 + γ_k) δ_{γγ′} + √(γ_k(γ_k−1)) δ_{γ−2e_k,γ′} = 1 + γ_k if γ′ = γ; √(γ_k(γ_k−1)) if γ′ = γ − 2e_k; and 0 otherwise.

Therefore, we can estimate

‖D[σ] − D^{[N]}[σ]‖²_{ℓ2(Γ)→ℓ2(Γ)} ≤ sup_{γ∈Γ} Σ_{γ′∈Γ} |D_{γγ′}[σ − σ^{[N]}]|² = sup_{γ∈Γ} (1/⟨γ, λ^{−1}⟩) Σ_{γ′∈Γ} (1/⟨γ′, λ^{−1}⟩) |Σ_{k≥1} (σ_k − σ^{[N]}_k) θ_{γγ′}(k)|².

Fixing γ ∈ Γ for now, we deduce that

S_γ := Σ_{γ′∈Γ} (1/⟨γ′, λ^{−1}⟩) |Σ_{k≥1} (σ_k − σ^{[N]}_k) θ_{γγ′}(k)|² ≤ ‖σ − σ^{[N]}‖²_{ℓ1(N)} { sup_k (1 + γ_k)²/⟨γ, λ^{−1}⟩ + sup_k γ_k(γ_k − 1)/⟨γ − 2e_k, λ^{−1}⟩ }.



This implies that

‖D − D^{[N]}‖²_{ℓ2(Γ)→ℓ2(Γ)} ≤ sup_{γ∈Γ} S_γ/⟨γ, λ^{−1}⟩ ≤ ‖σ − σ^{[N]}‖²_{ℓ1(N)} sup_{γ∈Γ} (1/⟨γ, λ^{−1}⟩) { sup_k (1 + γ_k)²/⟨γ, λ^{−1}⟩ + sup_k γ_k(γ_k − 1)/⟨γ − 2e_k, λ^{−1}⟩ } ≤ C ‖Q‖² ‖σ − σ^{[N]}‖²_{ℓ1(N)}.

Inserting this into (11.14), we infer that for σ ∈ ℓp(N) with some 0 < p < 1 the infinite drift matrix D[σ] is s∗-computable for any 0 < s∗ ≤ 1/p − 1.

12. Optimality

Based on Proposition 10.1 and on the definition (9.2) of the infinite matrix B∗, we deduce from these observations our main result.

Theorem 12.1. Consider the space-time variational formulation (5.5) of the infinite-dimensional Fokker–Planck equation with bilinear form a(·,·) defined in (5.8), corresponding to the tridiagonal matrix A as in Example 5.1, based on sequences ε, σ ∈ ℓp(N) of elements ε_i, σ_i with 0 < p < 1.

Consider its representation B∗u = f∗ using a temporal wavelet basis Θ as in Section 10.1 and a spatial Hermite polynomial chaos basis Υ = {Hγ(q) : γ ∈ Γ, q ∈ D}. Then, for any ε > 0, the adaptive wavelet methods from [10] or [9] (and [15]) applied to the normal equation (8.4), with f∗ and B∗ as in (9.1) and (9.2), respectively, of the infinite matrix representation of the Fokker–Planck equation (5.5) in countably many dimensions produce an approximation u_ε with

‖ψ − u_ε^⊤[Θ⊗Υ]‖_Y ≂ ‖u − u_ε‖ ≤ ε.

Suppose that for some 0 < s < min{dt − 1, s(p)} we have that u ∈ A^s_∞(ℓ2(∇_Y)); then

# supp u_ε ≲ ε^{−1/s} ‖u‖^{1/s}_{A^s_∞(ℓ2(∇_Y))}.

The number of arithmetic operations and storage locations required by one call of the space-time adaptive solver with tolerance ε is bounded by some multiple of ε^{−1/s} ‖u‖^{1/s}_{A^s_∞(ℓ2(∇_Y))} + 1.

The above assertions remain valid when in (3.23) the dimension K is finite, with the computability and admissibility constants c_{B∗,s} and a_{B∗,s} independent of the dimension K.

We remark in closing that for matrices A as in Example 5.2 this result remains valid, though now with admissibility constants depending on the block A11 in an unspecified way.

We also remark that the assumptions on the sequence σ could be slightly weakened, since the limit 1/p − 1 of s∗-computability of D[σ] is larger than the value s(p) found in (10.10).

References

1. J.W. Barrett and E. Süli, Existence of global weak solutions to some regularized kinetic models of dilute polymers, Multiscale Model. Simul. 6 (2007), 506–546.

2. ______, Existence and equilibration of global weak solutions to kinetic models for dilute polymers I: Finitely extensible nonlinear bead-spring chains, Math. Models Methods Appl. Sci. 21 (2011), no. 6, 1211–1289. Extended version available from http://arxiv.org/abs/1004.1432.

3. ______, Existence and equilibration of global weak solutions to kinetic models for dilute polymers II: Hookean-type bead-spring chains, Math. Models Methods Appl. Sci. 22 (2012), no. 5, 1–82. Extended version available from http://arxiv.org/abs/1008.3052.

4. W. Beckner, A generalized Poincaré inequality for Gaussian measures, Proc. Amer. Math. Soc. 105 (1989), no. 2, 397–400. MR 954373 (89m:42027)

5. V. Bogachev, G. Da Prato, and M. Röckner, Existence and uniqueness of solutions for Fokker–Planck equations on Hilbert spaces, J. Evol. Equ. 10 (2010), no. 3, 487–509. MR 2674056 (2011g:60107)

6. D. Borwein and X. Gao, Matrix operators on ℓp to ℓq, Canad. Math. Bull. 37 (1994), no. 4, 448–456.

7. S. Cerrai, Second Order PDE's in Finite and Infinite Dimension: A Probabilistic Approach, Lecture Notes in Mathematics, vol. 1762, Springer-Verlag, Berlin, 2001. MR 1840644 (2002j:35327)

8. N. Chegini and R. Stevenson, Adaptive wavelet schemes for parabolic problems: sparse matrices and numerical results, SIAM J. Numer. Anal. 49 (2011), no. 1, 182–212. MR 2783222



9. A. Cohen, W. Dahmen, and R. DeVore, Adaptive wavelet methods for elliptic operator equations: Convergence rates, Math. Comp. 70 (2001), 27–75.

10. ______, Adaptive wavelet methods II: Beyond the elliptic case, Found. Comput. Math. 2 (2002), no. 3, 203–245.

11. G. Da Prato, P. Malliavin, and D. Nualart, Compact families of Wiener functionals, C. R. Acad. Sci. Paris Sér. I Math. 315 (1992), no. 12, 1287–1291. MR 1194537 (94a:60085)

12. G. Da Prato and J. Zabczyk, Second Order Partial Differential Equations in Hilbert Spaces, London Mathematical Society Lecture Note Series, vol. 293, Cambridge University Press, Cambridge, 2002. MR 1985790 (2004e:47058)

13. T.J. Dijkema, C. Schwab, and R. Stevenson, An adaptive wavelet method for solving high-dimensional elliptic PDEs, Constr. Approx. 30 (2009), no. 3, 423–455. MR 2558688 (2010m:65270)

14. L. Figueroa and E. Süli, Greedy approximation of high-dimensional Ornstein–Uhlenbeck operators with unbounded drift, Tech. report, arXiv:1103.0726v1 [math.NA], http://arxiv.org/abs/1103.0726 (Submitted), 2011.

15. T. Gantumur, H. Harbrecht, and R. Stevenson, An optimal adaptive wavelet method without coarsening of the iterands, Tech. Report 1325, Utrecht University, March 2005. To appear in Math. Comp.

16. M. Griebel and P. Oswald, Tensor product type subspace splittings and multilevel iterative methods for anisotropic problems, Adv. Comput. Math. 4 (1995), no. 1–2, 171–206.

17. R. Guberovic, C. Schwab, and R. Stevenson, Space-time variational saddle point formulations of Stokes and Navier–Stokes equations, Tech. report, Seminar for Applied Mathematics, ETH Zürich, http://www.sam.math.ethz.ch/reports/2011 (Submitted), 2011.

18. D. Knezevic and E. Süli, Spectral Galerkin approximation of Fokker–Planck equations with unbounded drift, M2AN Math. Model. Numer. Anal. 43 (2009), no. 3, 445–485.

19. A. Kufner, O. John, and S. Fučík, Function Spaces, Noordhoff International Publishing, Leyden, 1977, Monographs and Textbooks on Mechanics of Solids and Fluids; Mechanics: Analysis. MR 0482102 (58 #2189)

20. P.-A. Nitsche, Best N term approximation spaces for tensor product wavelet bases, Constr. Approx. 24 (2006), no. 1, 49–70. MR 2217525 (2007a:41032)

21. S. Peszat, On a Sobolev space of functions of infinite number of variables, Bull. Polish Acad. Sci. Math. 41 (1993), no. 1, 55–60. MR 1401886 (97m:46059)

22. A. Pietsch, Nuclear Locally Convex Spaces, Ergebnisse der Mathematik und ihrer Grenzgebiete, Band 66, Springer-Verlag, New York, 1972. Translated from the second German edition by William H. Ruckle. MR 0350360 (50 #2853)

23. I. Schur, Bemerkungen zur Theorie der beschränkten Bilinearformen mit unendlich vielen Veränderlichen, J. Reine Angew. Math. 140 (1911), 1–28.

24. C. Schwab and R. Stevenson, Space-time adaptive wavelet methods for parabolic evolution problems, Math. Comp. 78 (2009), no. 267, 1293–1318. MR 2501051 (2011a:65347)