EFFECTIVE HAMILTONIANS AND AVERAGING

FOR HAMILTONIAN DYNAMICS I

by

L. C. Evans 1 and D. Gomes 2

Department of Mathematics
University of California

Berkeley, CA 94720

Abstract. This paper, building upon ideas of Mather, Moser, Fathi, E and others, applies PDE methods to understand the structure of certain Hamiltonian flows. The main point is that the “cell” or “corrector” PDE, introduced and solved in a weak sense by Lions, Papanicolaou and Varadhan in their study of periodic homogenization for Hamilton–Jacobi equations, formally induces a canonical change of variables, in terms of which the dynamics are trivial. We investigate to what extent this observation can be made rigorous in the case that the Hamiltonian is strictly convex in the momenta, given that the relevant PDE does not usually in fact admit a smooth solution.

1. Introduction.

This is the first of a projected series of papers that develop PDE techniques to understand certain aspects of Hamiltonian dynamics with many degrees of freedom.

1.1. Changing variables.

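The basic issue is this. Given a smooth Hamiltonian $H : \mathbb R^n \times \mathbb R^n \to \mathbb R$, $H = H(p, x)$, we wish to examine the Hamiltonian flow

(1.1)  $\dot{\mathbf x} = D_pH(\mathbf p, \mathbf x), \quad \dot{\mathbf p} = -D_xH(\mathbf p, \mathbf x)$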

under a canonical change of variables

(1.2) (p, x) → (P, X),

1 Supported in part by NSF Grant DMS-9424342.
2 Supported by Praxis XXI-BD 5228/95.

where

(1.3)  $p = D_xu(P, x), \quad X = D_Pu(P, x)$

for a generating function $u : \mathbb R^n \times \mathbb R^n \to \mathbb R$, $u = u(P, x)$. Here we write $D_x = (\partial/\partial x_1, \dots, \partial/\partial x_n)$ and $D_P = (\partial/\partial P_1, \dots, \partial/\partial P_n)$. Assuming that we can find a smooth function $u$ to solve the Hamilton–Jacobi type PDE

(1.4)  $H(D_xu(P, x), x) = \bar H(P)$ in $\mathbb R^n$,

and supposing as well that we can invert the relationships (1.3) to solve for $P, X$ as smooth functions of $p, x$, a calculation shows that we thereby transform (1.1) into the trivial dynamics

(1.5)  $\dot{\mathbf X} = D\bar H(\mathbf P), \quad \dot{\mathbf P} = 0.$

In terms of mechanics, $P$ is an “action” and $X$ an “angle” or “rotation” variable, as for instance in Goldstein [Gd].

Of course we cannot really carry out this classical procedure in general, since the PDE (1.4) does not usually admit a smooth solution and, even if it does, the transformation (1.2), (1.3) is not usually globally defined. Only very special Hamiltonians are integrable in this sense.

1.2. Homogenization.

On the other hand, under some reasonable hypotheses we can in fact build appropriate weak solutions of (1.4), as demonstrated within another context in the classic-but-unpublished paper Lions–Papanicolaou–Varadhan [L-P-V]. These authors look at the initial value problem for the Hamilton–Jacobi PDE

(1.6)  $u^\varepsilon_t + H(D_xu^\varepsilon, x/\varepsilon) = 0$ in $\mathbb R^n \times (0,\infty)$, $\quad u^\varepsilon = g$ on $\mathbb R^n \times \{t = 0\}$,

under the primary assumption that the mapping $x \mapsto H(p, x)$ is $\mathbb T^n$-periodic, where $\mathbb T^n$ denotes the flat torus, that is, the unit cube in $\mathbb R^n$, with opposite faces identified. Consequently as $\varepsilon \to 0$, the nonlinearity in (1.6) is rapidly oscillating; and the problem is to understand the limiting behavior of the solutions $u^\varepsilon$. Lions et al. show under some mild additional hypotheses on the Hamiltonian that $u^\varepsilon \to u$, the limit function $u$ solving a Hamilton–Jacobi PDE of the form

(1.7)  $u_t + \bar H(D_xu) = 0$ in $\mathbb R^n \times (0,\infty)$, $\quad u = g$ on $\mathbb R^n \times \{t = 0\}$.

Here $\bar H : \mathbb R^n \to \mathbb R$, $\bar H = \bar H(P)$, is the effective (or averaged) Hamiltonian, and is built from $H$ as follows.

1.3. How to construct $\bar H$.

First, consider for fixed $P \in \mathbb R^n$ the cell (or corrector) problem

$H(P + D_xv, x) = \lambda$ in $\mathbb R^n$, $\quad x \mapsto v$ is $\mathbb T^n$-periodic.

As proved in Lions–Papanicolaou–Varadhan [L-P-V] (and recounted in [E2] and in Braides–Defranceschi [B-D, §16.2]), there exists a unique real number $\lambda$ for which there exists a viscosity solution. We may then define

$\bar H(P) := \lambda$,

and so rewrite the foregoing as

(1.8)  $H(P + D_xv, x) = \bar H(P)$ in $\mathbb R^n$, $\quad x \mapsto v$ is $\mathbb T^n$-periodic.

Once we set

(1.9)  $u(P, x) := P \cdot x + v(P, x)$,

the PDE in (1.8) is just (1.4).
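In one space dimension the construction can be made explicit, which gives a quick numerical sanity check. The following is a minimal Python sketch and not part of the paper's argument. For the mechanical Hamiltonian $H(p, x) = \frac12 p^2 + V(x)$ with a 1-periodic potential, it relies on the classical one-dimensional formula: $\bar H(P) = \max V$ when $|P| \le \int_0^1 \sqrt{2(\max V - V)}\,dx$, and otherwise $\bar H(P) = \lambda$ solves $|P| = \int_0^1 \sqrt{2(\lambda - V)}\,dx$. The function name, the midpoint quadrature and the bisection bracket are illustrative assumptions.

```python
import numpy as np


def effective_hamiltonian_1d(P, V, n_grid=4000, n_bisect=200):
    """Approximate Hbar(P) for H(p, x) = p**2 / 2 + V(x), V 1-periodic, n = 1."""
    x = (np.arange(n_grid) + 0.5) / n_grid      # midpoint nodes on [0, 1)
    v = V(x)
    v_max = v.max()

    def mean_momentum(lam):
        # Approximates the integral of sqrt(2 (lam - V)) over one period.
        return np.mean(np.sqrt(np.maximum(2.0 * (lam - v), 0.0)))

    target = abs(P)
    if target <= mean_momentum(v_max):
        return float(v_max)                      # flat part of Hbar

    lo = v_max
    hi = v_max + 0.5 * target**2 + 1.0           # guarantees mean_momentum(hi) > target
    for _ in range(n_bisect):
        mid = 0.5 * (lo + hi)
        if mean_momentum(mid) < target:
            lo = mid
        else:
            hi = mid
    return float(0.5 * (lo + hi))


def V_cos(x):
    # Illustrative potential; any smooth 1-periodic V could be used instead.
    return np.cos(2.0 * np.pi * x)
```

For `V_cos` the flat part should extend to $|P| \le 4/\pi$, since $\int_0^1 \sqrt{2(1 - \cos 2\pi x)}\,dx = \int_0^1 2\sin(\pi x)\,dx = 4/\pi$; for $V \equiv 0$ the sketch should reduce to $\bar H(P) = \frac12 P^2$.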

Remark. We pause here to draw attention to some simple observations relating the cell problem (1.8) and semiclassical approximations in quantum mechanics for periodic potentials. These comments are intended as further motivation.

Consider the time-independent Schrödinger equation

(1.10)  $-\frac{\hbar^2}{2}\Delta\psi + V\psi = E\psi$ in $\mathbb R^n$,

where $\hbar$ is Planck’s constant, $V : \mathbb R^n \to \mathbb R$ is a $\mathbb T^n$-periodic potential, and $E$ is the energy corresponding to the eigenstate $\psi : \mathbb R^n \to \mathbb C$. A standard textbook procedure is to look for a solution having the Bloch wave form

(1.11)  $\psi = e^{iP \cdot x/\hbar}\,\phi$,

where $\phi : \mathbb R^n \to \mathbb C$ is $\mathbb T^n$-periodic. We further suppose $\phi$ to have the WKB structure

(1.12)  $\phi = a\,e^{iv/\hbar}$

for periodic $a, v : \mathbb R^n \to \mathbb R$. Substituting (1.11), (1.12) into (1.10) and taking real parts yields

(1.13)  $\frac12|P + D_xv|^2 + V(x) = E$,

up to terms formally of size $O(\hbar)$.

Thus in the semiclassical limit $\hbar \to 0$, we heuristically obtain the cell problem (1.8) for the Hamiltonian $H(p, x) = \frac12|p|^2 + V(x)$ and $\bar H(P) = E$. □

1.4. Questions, absolute minimizers.

The procedure outlined in §1.3 provides us with at least a theoretical construction of $\bar H$ and of a generating function $u$. Returning then to the comments in §1.1, we can now formulate these

Basic Questions. To what extent can we employ $\bar H$ and $u$ to understand the solutions of the Hamiltonian flow (1.1)? In particular, how is information about the dynamics “encoded” into $\bar H$?

These are really hard issues, and to make at least a little progress we will need some additional hypotheses on both the Hamiltonian and the particular trajectories of the ODE we examine. Let us henceforth suppose that the mapping $p \mapsto H(p, x)$ is uniformly convex, in which case we can associate with $H$ the Lagrangian

$L(q, x) := \max_p\,(p \cdot q - H(p, x))$.

Consider then a Lipschitz curve $\mathbf x(\cdot)$ which minimizes the associated action integral, meaning that

(1.14)  $\int_0^T L(\dot{\mathbf x}, \mathbf x)\,dt \le \int_0^T L(\dot{\mathbf y}, \mathbf y)\,dt$

for each time $T > 0$ and each Lipschitz curve $\mathbf y(\cdot)$ with $\mathbf x(0) = \mathbf y(0)$, $\mathbf x(T) = \mathbf y(T)$. We call $\mathbf x(\cdot)$ a (one-sided) absolute minimizer. If we as usual define the momentum

$\mathbf p := D_qL(\dot{\mathbf x}, \mathbf x)$,

then $(\mathbf x(\cdot), \mathbf p(\cdot))$ satisfy Hamilton’s ODE (1.1).

A discovery of Aubry [A], Mather [Mt1-4], Fathi [F1–F3], Moser [Mo], E [EW2], etc., is that solutions of (1.1) corresponding to absolute minimizers are in a strong sense “better” than other solutions. Indeed, these authors have shown that the Hamiltonian dynamics are in some sense “integrable” for such special trajectories. The main goal of our work is to continue this analysis, with particular emphasis upon PDE methods (based upon viscosity solutions of (1.8)), applied to problems with many degrees of freedom.

1.5. Outline.

In §2 below we review the definition of the effective Hamiltonian $\bar H$, introduce the corresponding effective Lagrangian $\bar L$, and recall the connections with the large time asymptotics of absolute minimizers $\mathbf x(\cdot)$.

We then rescale in time $\mathbf x(\cdot)$ and $\mathbf p(\cdot)$ in §3, and introduce certain Young measures $\{\nu_t\}_{t \ge 0}$ on phase space, which record the oscillations of the rescaled functions in asymptotic limits. These measures contain information about the Hamiltonian flow, and so our goal in subsequent sections is understanding their structure. In §4 we show that each $\nu = \nu_t$ is supported on the graph of the mapping $p = D_xu(P, x)$ and furthermore “stays away” from the discontinuities in $D_xu$.

In §5 we prove that $u$ is well behaved on the support of $\sigma$, the projection of $\nu$ onto $x$-space. For this, we firstly derive the formal $L^2$-bound

(1.15)  $\int_{\mathbb T^n} |D_x^2u|^2\,d\sigma \le C$

and then the $L^\infty$-estimate

(1.16)  $|D_x^2u| \le C$ $\sigma$-a.e.

We rigorously establish some analogues of (1.15), (1.16), entailing difference quotients in the $x$-variables. As an application, we provide in §6 a new proof of Mather’s theorem that $\nu$ is supported on an $n$-dimensional Lipschitz continuous graph.

Section 7 extends the techniques from §5 to establish what amounts to an $L^2$-estimate for the mixed second partial derivatives,

(1.17)  $\int_{\mathbb T^n} |D^2_{xP}u|^2\,d\sigma \le C\,D^2\bar H(P)$.

More precisely, we prove a similar inequality involving difference quotients in the variable $P$. An application of this bound appears in §8, where we demonstrate the strict convexity of $\bar H$ in certain directions.

In §9 we draw some further deductions under the assumptions that $\bar H$ is differentiable at $P$ and the components of $Q := D\bar H(P)$ are rationally independent.

A forthcoming companion paper [E-G2] addresses problems with time-dependent Hamiltonians. The primary new topics developed there include a weak interpretation of the “adiabatic invariance of the action” and a discussion of the Berry–Hannay geometric phase correction, computed in terms of effective Hamiltonians.

Our work is strongly related to some extremely interesting papers of Fathi [F1–F3], which develop his “weak KAM theory”. We hope later to work out more clearly some of the connections with Fathi’s discoveries.

Some other relevant papers include Mather [Mt1-4], Weinan E [EW1-2], Sobolevskii [So1-2], Mañé [Mn2-3], Jauslin–Kreiss–Moser [J-K-M], Iturriaga [I], Dias Carneiro [DC], Arisawa [Ar], etc. A good survey is Mather–Forni [M-F], and we have found Mañé’s book [Mn1] to be very useful.

See Concordel [C1,C2], Chou–Duffin [C-D], Nussbaum [N], etc. for connections with nonlinear additive eigenvalue problems. Fathi [F4], Namah–Roquejoffre [N-F], Roquejoffre [R], Barles–Souganidis [B-S] and Fathi–Mather [F-M] discuss some related questions about large time asymptotics of solutions to Hamilton–Jacobi equations. Similar problems for stochastic homogenization have been studied by Rezakhanlou [Rz] and Souganidis [S].

There is also a large literature for time-dependent Hamiltonians with one degree of freedom. In this setting ordering properties for minimizing trajectories provide powerful tools unavailable in higher dimensions. See Mather–Forni [M-F], Aubry [A], Bangert [B2], etc. for more.

We are grateful to G. Barles, L. Barreira, M. Crandall, Weinan E, W. Oliva, D. Serre and A. Weinstein for interesting suggestions and for references.

2. Effective Hamiltonians and Lagrangians.

2.1. The Hamiltonian and Lagrangian.

As in the introduction, $\mathbb T^n$ denotes the standard flat torus.

Hypotheses on the Hamiltonian. Assume now that the given, smooth Hamiltonian $H : \mathbb R^n \times \mathbb R^n \to \mathbb R$, $H = H(p, x)$, satisfies these conditions:

(i) periodicity:

(2.1)  For each $p \in \mathbb R^n$, the mapping $x \mapsto H(p, x)$ is $\mathbb T^n$-periodic.

(ii) strict convexity:

(2.2)  There exist constants $\Gamma, \gamma > 0$ such that $\gamma|\xi|^2 \le \sum_{i,j=1}^n \frac{\partial^2H}{\partial p_i\partial p_j}(p, x)\,\xi_i\xi_j \le \Gamma|\xi|^2$ for each $p, x, \xi \in \mathbb R^n$.

The Lagrangian. We define the associated Lagrangian $L : \mathbb R^n \times \mathbb R^n \to \mathbb R$, $L = L(q, x)$, by duality:

(2.3)  $L(q, x) := \sup_p\,(p \cdot q - H(p, x))$

for $q \in \mathbb R^n$. In view of (2.1), (2.2) we see that $L$ is smooth,

(2.4)  for each $q \in \mathbb R^n$, the mapping $x \mapsto L(q, x)$ is $\mathbb T^n$-periodic,

and

(2.5)  there exist constants $\Gamma, \gamma > 0$ such that $\gamma|\xi|^2 \le \sum_{i,j=1}^n \frac{\partial^2L}{\partial q_i\partial q_j}(q, x)\,\xi_i\xi_j \le \Gamma|\xi|^2$ for all $q, x, \xi \in \mathbb R^n$.

We physically interpret $x$ as position, $p$ as momentum and $q$ as velocity. The corresponding capital letters $X, P, Q$ will likewise respectively denote position, momentum and velocity in new coordinates.
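The duality formula (2.3) is easy to test numerically in one dimension. The sketch below, which is only an illustration, approximates the supremum by a maximum over a finite momentum grid and compares it with the closed form $L(q, x) = \frac12 q^2 - V(x)$ valid for $H(p, x) = \frac12 p^2 + V(x)$. The particular potential, the grid and the function names are assumptions made for the example.

```python
import numpy as np


def lagrangian_from_hamiltonian(H, q, x, p_grid):
    """Approximate L(q, x) = sup_p (p * q - H(p, x)) by a maximum over p_grid (n = 1)."""
    return float(np.max(p_grid * q - H(p_grid, x)))


def H_mech(p, x):
    # Illustrative choice: kinetic energy plus a 1-periodic potential.
    return 0.5 * p**2 + np.cos(2.0 * np.pi * x)


p_grid = np.linspace(-10.0, 10.0, 200001)
L_num = lagrangian_from_hamiltonian(H_mech, 1.2, 0.3, p_grid)
L_exact = 0.5 * 1.2**2 - np.cos(2.0 * np.pi * 0.3)
```

The maximizing momentum is $p = q$ here, which matches the interpretation of $q$ as velocity, since $D_pH(p, x) = p$ for this Hamiltonian.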

2.2. The effective Hamiltonian and Lagrangian.

As explained in the Introduction, we intend next to “average” $H$, following Lions, Papanicolaou, Varadhan [L-P-V]:

Theorem 2.1. (i) For each $P \in \mathbb R^n$ there exists a unique real number, denoted $\bar H(P)$, such that the cell problem

(2.6)  $H(P + D_xv, x) = \bar H(P)$ in $\mathbb R^n$

has a $\mathbb T^n$-periodic, Lipschitz continuous solution $v$.

(ii) In addition, there exists a constant $\alpha$ such that

(2.7)  $D_x^2v \le \alpha I$ in $\mathbb R^n$

in the distribution sense.

We call the function

(2.8)  $\bar H : \mathbb R^n \to \mathbb R$

so defined the effective or averaged Hamiltonian.

Remarks. (i) We understand $v$ to solve (2.6) in the sense of viscosity solutions. This means that if $\phi = \phi(x)$ is a smooth function and

(2.9)  $v - \phi$ has a maximum (minimum) at a point $x_0 \in \mathbb R^n$, then $H(P + D\phi(x_0), x_0) \le \bar H(P)$ ($\ge \bar H(P)$).

We will in fact mostly need only that $v$ is differentiable a.e. with respect to $n$-dimensional Lebesgue measure, and that $v$ solves the PDE (2.6) at any point of differentiability.

(ii) The inequality (2.7) means that

(2.10)  the function $\bar v(x) := v(x) - \frac{\alpha}{2}|x|^2$ is concave on $\mathbb R^n$.

(iii) If $v$ is a solution of (2.6), we will hereafter often write

$v = v(P, x)$  ($P, x \in \mathbb R^n$)

to emphasize the dependence on $P$. □

Given $\bar H$ as above, we define also the effective Lagrangian

(2.11)  $\bar L(Q) := \sup_P\,(P \cdot Q - \bar H(P))$

for $Q \in \mathbb R^n$.

2.3. Properties of $\bar H$ and $\bar L$.

Proposition 2.2. The mappings

$P \mapsto \bar H(P), \quad Q \mapsto \bar L(Q)$

are convex and real-valued. Furthermore, $\bar H$ and $\bar L$ are superlinear:

(2.12)  $\lim_{|P| \to \infty} \frac{\bar H(P)}{|P|} = \lim_{|Q| \to \infty} \frac{\bar L(Q)}{|Q|} = +\infty$.

Proof. 1. See Lions, Papanicolaou, Varadhan [L-P-V] (or [E2]) for a proof that $\bar H$ is convex. The convexity of $\bar L$ is immediate from (2.11).

2. In view of (2.2),

(2.13)  $\bar H(P) \ge \alpha|P + D_xv|^2 - \beta \ge \alpha|P|^2 + 2\alpha P \cdot D_xv - \beta$ a.e.

for appropriate constants $\alpha > 0$, $\beta \ge 0$. We integrate this inequality over $\mathbb T^n$ and recall that $v$ is periodic, so that $\int_{\mathbb T^n} D_xv\,dx = 0$, to deduce

$\bar H(P) \ge \alpha|P|^2 - \beta$.

Thus $\bar H$ is superlinear, and in particular $\bar L(Q) < \infty$ for each $Q$. On the other hand, by construction $\bar H(P) < \infty$ for each $P$; whence the duality formula

$\bar H(P) = \sup_Q\,(P \cdot Q - \bar L(Q))$

implies $\bar L$ is superlinear. □

In later sections we will relate $\bar H$, $\bar L$ to appropriately rescaled minimizers of the action functionals, and for this will several times invoke the following results of Lions–Papanicolaou–Varadhan [L-P-V, §IV]. (See also E [EW1], Braides–Defranceschi [B-D, §16.2].)

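Theorem 2.3. (i) If $\mathbf X : [0, T] \to \mathbb R^n$ is a Lipschitz continuous curve and $\mathbf x_\varepsilon(\cdot) \to \mathbf X(\cdot)$ uniformly, then

(2.14)  $\int_0^T \bar L(\dot{\mathbf X})\,dt \le \liminf_{\varepsilon \to 0} \int_0^T L(\dot{\mathbf x}_\varepsilon, \mathbf x_\varepsilon/\varepsilon)\,dt$.

(ii) Define

(2.15)  $S_\varepsilon(x, y, t) := \inf\left\{\int_0^t L(\dot{\mathbf x}, \mathbf x/\varepsilon)\,ds \;\middle|\; \mathbf x(t) = x,\ \mathbf x(0) = y\right\}$,

for $x, y \in \mathbb R^n$, $t > 0$. Then

(2.16)  $S_\varepsilon(x, y, t) \to t\,\bar L\left(\frac{x - y}{t}\right)$ as $\varepsilon \to 0$,

uniformly on compact subsets of $\mathbb R^n \times \mathbb R^n \times (0,\infty)$.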

3. Young measures.

Next we study the asymptotic behavior as $t \to \infty$ of certain curves that minimize the action.

3.1. Hamilton’s ODE, rescalings.

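Definition. A Lipschitz continuous curve $\mathbf x : [0,\infty) \to \mathbb R^n$ is called a (one-sided) absolute minimizer if

(3.1)  $\int_0^T L(\dot{\mathbf x}, \mathbf x)\,dt \le \int_0^T L(\dot{\mathbf y}, \mathbf y)\,dt$

for each time $T > 0$ and each Lipschitz continuous curve $\mathbf y : [0,\infty) \to \mathbb R^n$ such that $\mathbf x(0) = \mathbf y(0)$, $\mathbf x(T) = \mathbf y(T)$.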

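Given as above an absolutely minimizing curve $\mathbf x(\cdot)$, define the corresponding momentum

(3.2)  $\mathbf p(t) := D_qL(\dot{\mathbf x}(t), \mathbf x(t))$

for $t \ge 0$. Then

(3.3)  $\dot{\mathbf x}(t) = D_pH(\mathbf p(t), \mathbf x(t)), \quad \dot{\mathbf p}(t) = -D_xH(\mathbf p(t), \mathbf x(t))$

for $t \ge 0$.

We wish to understand the pair $(\mathbf x(\cdot), \mathbf p(\cdot))$ for large times, and to this end introduce the rescaled dynamics

$\mathbf x_\varepsilon(t) := \varepsilon\,\mathbf x(t/\varepsilon), \quad \mathbf p_\varepsilon(t) := \mathbf p(t/\varepsilon); \qquad \mathbf x_\varepsilon(0) = \varepsilon\,\mathbf x(0), \quad \mathbf p_\varepsilon(0) = \mathbf p(0).$

It follows from (3.3) that

(3.4)  $\dot{\mathbf x}_\varepsilon(t) = D_pH\left(\mathbf p_\varepsilon(t), \frac{\mathbf x_\varepsilon(t)}{\varepsilon}\right), \quad \dot{\mathbf p}_\varepsilon(t) = -\frac{1}{\varepsilon}D_xH\left(\mathbf p_\varepsilon(t), \frac{\mathbf x_\varepsilon(t)}{\varepsilon}\right)$

for $t \ge 0$.

Remark. Since $\frac{d}{dt}H\left(\mathbf p_\varepsilon(t), \frac{\mathbf x_\varepsilon(t)}{\varepsilon}\right) = 0$, we have $\sup_{t \ge 0} H\left(\mathbf p_\varepsilon(t), \frac{\mathbf x_\varepsilon(t)}{\varepsilon}\right) \le C$ for some constant $C$, independent of $\varepsilon$. But $H(p, x) \ge \frac{\gamma}{2}|p|^2 - C$, and so

(3.5)  $\sup_{t \ge 0}\{|\mathbf p_\varepsilon(t)|, |\dot{\mathbf x}_\varepsilon(t)|\} < \infty$.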
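The mechanism behind (3.5), conservation of energy along the flow, can be observed numerically. The sketch below is an illustration only: it integrates Hamilton's ODE for the one-dimensional Hamiltonian $H(p, x) = \frac12 p^2 + \cos(2\pi x)$ with a Störmer–Verlet (leapfrog) scheme, forms the rescaled pair $(\mathbf p_\varepsilon, \mathbf x_\varepsilon)$, and records the energy drift together with the momentum bound implied by the initial energy. The potential, the initial data, the step size and all names are assumptions made for the example, and the trajectory is a generic solution of the ODE rather than a verified absolute minimizer.

```python
import numpy as np

TWO_PI = 2.0 * np.pi


def V(x):
    # Illustrative 1-periodic potential.
    return np.cos(TWO_PI * x)


def dV(x):
    return -TWO_PI * np.sin(TWO_PI * x)


def leapfrog(p0, x0, dt, n_steps):
    """Stormer-Verlet steps for xdot = p, pdot = -V'(x)."""
    ps = np.empty(n_steps + 1)
    xs = np.empty(n_steps + 1)
    p, x = p0, x0
    ps[0], xs[0] = p, x
    for k in range(n_steps):
        p -= 0.5 * dt * dV(x)
        x += dt * p
        p -= 0.5 * dt * dV(x)
        ps[k + 1], xs[k + 1] = p, x
    return ps, xs


eps = 0.05            # rescaling parameter
T = 1.0               # time horizon in the rescaled variable t
dt = 1e-3             # step in the original (fast) time t / eps
n_steps = int(round(T / eps / dt))
ps, xs = leapfrog(2.0, 0.0, dt, n_steps)

# Rescaled trajectory sampled at the times t_k = eps * k * dt.
p_eps = ps
x_eps = eps * xs
xdot_eps = ps          # D_pH(p, x) = p for this Hamiltonian

energy = 0.5 * ps**2 + V(xs)
energy_drift = float(np.max(np.abs(energy - energy[0])))
p_bound = float(np.sqrt(2.0 * (energy[0] - V(0.5))))   # V(0.5) = min V = -1
```

Because the bound on $|\mathbf p_\varepsilon|$ comes only from the conserved energy, it does not depend on `eps`; changing `eps` alters how long the underlying trajectory is followed, not the size of the momentum.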

3.2. Recording oscillations.

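We expect the functions $\mathbf p_\varepsilon(\cdot)$ and $\mathbf x_\varepsilon(\cdot)/\varepsilon$ (mod $\mathbb T^n$) to oscillate as $\varepsilon \to 0$, and so introduce measures on phase space to record these motions. Invoking for instance the methods from §1.E of [E1], we have

Proposition 3.1. There exists a sequence $\varepsilon_k \to 0$ and for a.e. $t > 0$ a Radon probability measure $\nu_t$ on $\mathbb R^n \times \mathbb T^n$ such that

(3.6)  $\Phi\left(\mathbf p_{\varepsilon_k}(t), \frac{\mathbf x_{\varepsilon_k}(t)}{\varepsilon_k}\right) \rightharpoonup \bar\Phi(t) := \int_{\mathbb R^n}\int_{\mathbb T^n} \Phi(p, x)\,d\nu_t(p, x)$

for each bounded, continuous function

$\Phi : \mathbb R^n \times \mathbb R^n \to \mathbb R$, $\Phi = \Phi(p, x)$,

such that $x \mapsto \Phi(p, x)$ is $\mathbb T^n$-periodic.

We call $\{\nu_t\}_{t \ge 0}$ Young measures associated with the dynamics (3.4).

Remark. The limit (3.6) means

(3.7)  $\int_0^T \Phi\left(\mathbf p_{\varepsilon_k}, \frac{\mathbf x_{\varepsilon_k}}{\varepsilon_k}\right)\zeta\,dt \to \int_0^T \bar\Phi\,\zeta\,dt$

for each $T > 0$ and each smooth function $\zeta : [0, T] \to \mathbb R$. □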

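Lemma 3.2. The support of the measure $\nu_t$ is bounded, uniformly in $t$.

This is clear from (3.5).

Lemma 3.3. For each $C^1$ function $\Phi$ as above,

(3.8)  $\int_{\mathbb R^n}\int_{\mathbb T^n} \{H, \Phi\}\,d\nu_t = 0$

for a.e. $t \ge 0$, where

(3.9)  $\{H, \Phi\} := D_pH \cdot D_x\Phi - D_xH \cdot D_p\Phi$

is the Poisson bracket.

The identity (3.8) means that the measure $\nu_t$ is invariant under the Hamiltonian flow (3.3).

Proof. We have

$\frac{d}{dt}\Phi\left(\mathbf p_\varepsilon, \frac{\mathbf x_\varepsilon}{\varepsilon}\right) = D_p\Phi \cdot \dot{\mathbf p}_\varepsilon + D_x\Phi \cdot \frac{\dot{\mathbf x}_\varepsilon}{\varepsilon} = \frac{1}{\varepsilon}\{H, \Phi\}$

according to (3.4). Take $\zeta : [0, T] \to \mathbb R$ to be smooth, with compact support. Then

$\int_0^T \{H, \Phi\}\left(\mathbf p_\varepsilon, \frac{\mathbf x_\varepsilon}{\varepsilon}\right)\zeta\,dt = -\int_0^T \varepsilon\,\dot\zeta\,\Phi\left(\mathbf p_\varepsilon, \frac{\mathbf x_\varepsilon}{\varepsilon}\right)dt$.

Sending $\varepsilon = \varepsilon_k \to 0$, we deduce (3.8). □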
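The heart of this proof is that $\{H, \Phi\}$ is a total time derivative along the flow, so its long-time average along a bounded trajectory must be small. The following sketch illustrates that mechanism for a single trajectory; it does not construct the Young measures $\nu_t$. It uses the illustrative choices $H(p, x) = \frac12 p^2 + \cos(2\pi x)$ and $\Phi(p, x) = p\,\sin(2\pi x)$, a leapfrog integrator and a left Riemann sum, all of which are assumptions made for the example.

```python
import math

TWO_PI = 2.0 * math.pi


def dV(x):
    # Derivative of the illustrative potential V(x) = cos(2 pi x).
    return -TWO_PI * math.sin(TWO_PI * x)


def bracket_H_Phi(p, x):
    """{H, Phi} = H_p Phi_x - H_x Phi_p for H = p^2/2 + V(x), Phi = p sin(2 pi x)."""
    H_p, H_x = p, dV(x)
    Phi_x = TWO_PI * p * math.cos(TWO_PI * x)
    Phi_p = math.sin(TWO_PI * x)
    return H_p * Phi_x - H_x * Phi_p


def time_average_of_bracket(T, dt=1e-3, p0=2.0, x0=0.0):
    """Left Riemann sum for (1/T) * integral_0^T {H, Phi}(p(t), x(t)) dt."""
    n_steps = int(round(T / dt))
    p, x, total = p0, x0, 0.0
    for _ in range(n_steps):
        total += bracket_H_Phi(p, x) * dt
        p -= 0.5 * dt * dV(x)      # leapfrog step for xdot = p, pdot = -V'(x)
        x += dt * p
        p -= 0.5 * dt * dV(x)
    return total / T
```

For the exact flow the average equals $(\Phi(T) - \Phi(0))/T$, which is at most $2\sup|\Phi|/T$; the pointwise values of the bracket are much larger, for instance $8\pi$ at the initial point $(p, x) = (2, 0)$.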

3.3. Convergence of trajectories, the action vector.

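From (3.5), we conclude that the curves $\{\mathbf x_\varepsilon(\cdot)\}_{\varepsilon > 0}$ are uniformly Lipschitz continuous. Hence we may assume (passing if necessary to a further subsequence) that

(3.10)  $\mathbf x_{\varepsilon_k} \to \mathbf X$

uniformly on compact subsets of $[0,\infty)$, where $\mathbf X : [0,\infty) \to \mathbb R^n$ is Lipschitz continuous, $\mathbf X(0) = 0$.

Lemma 3.4. We have

(3.11)  $\dot{\mathbf X}(t) = Q(t)$ for a.e. $t \ge 0$,

for

(3.12)  $Q(t) := \int_{\mathbb R^n}\int_{\mathbb T^n} D_pH(p, x)\,d\nu_t$.

Proof. The limit (3.10) implies

$\dot{\mathbf x}_{\varepsilon_k} \rightharpoonup \dot{\mathbf X}$;

whence (3.11), (3.12) follow from (3.4). □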

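Theorem 3.5. (i) For a.e. time $t \ge 0$

(3.13)  $\bar L(Q(t)) = \int_{\mathbb R^n}\int_{\mathbb T^n} L(D_pH(p, x), x)\,d\nu_t$.

(ii) Furthermore, there exists $P \in \mathbb R^n$ such that

(3.14)  $P \in \partial\bar L(Q(t)), \quad Q(t) \in \partial\bar H(P)$

for a.e. $t \ge 0$.

Recall that if $\Phi : \mathbb R^n \to \mathbb R$ is convex, we write $y \in \partial\Phi(x)$ to mean

$\Phi(x) + y \cdot (z - x) \le \Phi(z)$ for all $z \in \mathbb R^n$.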
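The subdifferential inequality just recalled can be checked directly on a grid. The sketch below does so for a toy convex function with a corner, $|Q| + \frac12 Q^2$, whose subdifferential at $Q = 0$ is the whole interval $[-1, 1]$. This toy function is a hypothetical stand-in chosen for illustration and is not derived from any particular Hamiltonian; the function names and the grid are likewise assumptions.

```python
import numpy as np


def in_subdifferential(Phi, x, y, z_grid, tol=1e-9):
    """Check Phi(x) + y * (z - x) <= Phi(z) for every z in z_grid (n = 1)."""
    return bool(np.all(Phi(x) + y * (z_grid - x) <= Phi(z_grid) + tol))


def L_toy(Q):
    # Toy convex function with a corner at Q = 0; NOT derived from any Hamiltonian.
    return np.abs(Q) + 0.5 * Q**2


z_grid = np.linspace(-5.0, 5.0, 2001)
```

A corner of this kind corresponds, under the Legendre transform, to a flat piece of the dual function: the dual of `L_toy` is $\frac12\max(|P| - 1, 0)^2$, which vanishes on $[-1, 1]$. At a point of differentiability such as $Q = 2$ the subdifferential reduces to the single slope $1 + Q = 3$.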

Remarks. (i) The point is that $P$ does not depend on $t$. We call $P$ an action vector for the rescaled trajectories $\{\mathbf x_\varepsilon(\cdot)\}_{\varepsilon > 0}$.

(ii) The second assertion above can be restated

$\dot{\mathbf X} \in \partial\bar H(\mathbf P), \quad \dot{\mathbf P} = 0$ for a.e. $t \ge 0$,

and this formulation should be compared with (1.5).

(iii) The existence of $P$ is also a consequence of the Pontryagin Maximum Principle; cf. Clarke [Cl]. □

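Proof. 1. Let $y_\varepsilon := \mathbf x_\varepsilon(0) = \varepsilon\,\mathbf x(0) \to 0$. According to Theorem 2.3

(3.15)  $S_{\varepsilon_k}(x, y_{\varepsilon_k}, t) \to t\,\bar L\left(\frac{x}{t}\right)$  ($x \in \mathbb R^n$, $t > 0$),

uniformly on compact subsets. But

$S_\varepsilon(x, y_\varepsilon, t) = \inf\left\{\int_0^t L(\dot{\mathbf x}, \mathbf x/\varepsilon)\,ds \;\middle|\; \mathbf x(t) = x,\ \mathbf x(0) = y_\varepsilon\right\}$,

and so

(3.16)  $S_\varepsilon(\mathbf x_\varepsilon(t), y_\varepsilon, t) = \int_0^t L\left(\dot{\mathbf x}_\varepsilon, \frac{\mathbf x_\varepsilon}{\varepsilon}\right)ds$,

since the curve $\mathbf x_\varepsilon(\cdot)$ is an absolute minimizer.

2. From (3.10), (3.15) we see that

(3.17)  $S_{\varepsilon_k}(\mathbf x_{\varepsilon_k}(t), y_{\varepsilon_k}, t) \to t\,\bar L\left(\frac{\mathbf X(t)}{t}\right)$.

But then (3.16) implies that

(3.18)  $L\left(\dot{\mathbf x}_{\varepsilon_k}, \frac{\mathbf x_{\varepsilon_k}}{\varepsilon_k}\right) \rightharpoonup \frac{d}{dt}\left(t\,\bar L\left(\frac{\mathbf X}{t}\right)\right)$.

Now

(3.19)  $\frac{d}{dt}\left(t\,\bar L\left(\frac{\mathbf X}{t}\right)\right) \in \bar L\left(\frac{\mathbf X}{t}\right) + \partial\bar L\left(\frac{\mathbf X}{t}\right) \cdot \left(\dot{\mathbf X} - \frac{\mathbf X}{t}\right) \le \bar L(\dot{\mathbf X})$,

by convexity. Consequently, since $\dot{\mathbf x}_\varepsilon = D_pH\left(\mathbf p_\varepsilon, \frac{\mathbf x_\varepsilon}{\varepsilon}\right)$, we deduce from (3.18) that

(3.20)  $\int_{\mathbb R^n}\int_{\mathbb T^n} L(D_pH(p, x), x)\,d\nu_t \le \bar L(\dot{\mathbf X}(t))$

for a.e. $t > 0$.

Conversely, Theorem 2.3 implies

$\int_a^b \bar L(\dot{\mathbf X}(t))\,dt \le \lim_{\varepsilon \to 0} \int_a^b L\left(\dot{\mathbf x}_\varepsilon, \frac{\mathbf x_\varepsilon}{\varepsilon}\right)dt = \int_a^b \int_{\mathbb R^n}\int_{\mathbb T^n} L(D_pH, x)\,d\nu_t\,dt$

for all $0 \le a < b < \infty$ and so

$\bar L(\dot{\mathbf X}(t)) \le \int_{\mathbb R^n}\int_{\mathbb T^n} L(D_pH(p, x), x)\,d\nu_t$

for a.e. $t$. This and (3.20) establish (3.13).

3. In particular,

$\frac{d}{dt}\left(t\,\bar L\left(\frac{\mathbf X(t)}{t}\right)\right) = \bar L(\dot{\mathbf X}(t)) = \bar L(Q(t))$ a.e.;

and so

(3.21)  $\frac{1}{T}\int_0^T \bar L(Q(t))\,dt = \frac{1}{T}\int_0^T \frac{d}{dt}\left(t\,\bar L\left(\frac{\mathbf X(t)}{t}\right)\right)dt = \bar L\left(\frac{\mathbf X(T)}{T}\right) = \bar L\left(\frac{1}{T}\int_0^T Q(t)\,dt\right)$.

This identity, valid for each time $T > 0$, implies that $\{Q(t)\}_{t \ge 0}$ lies in a supporting domain of $\bar L$. This means that

(3.22)  $P \in \partial\bar L(Q(t))$ for a.e. time $t \ge 0$

for some vector $P \in \mathbb R^n$. Equivalently, $Q(t) \in \partial\bar H(P)$.

To confirm (3.22), fix a time $T > 0$, write $\bar Q := \frac{1}{T}\int_0^T Q(t)\,dt$, and take any $P \in \partial\bar L(\bar Q)$. Then owing to (3.21) we have

$\bar L(Q(t)) = \bar L(\bar Q) + P \cdot (Q(t) - \bar Q)$

for a.e. time $0 \le t \le T$. Thus $Q(t)$ is a minimizer of the convex function $Q \mapsto \bar L(Q) - \bar L(\bar Q) - P \cdot (Q - \bar Q)$, and so $P \in \partial\bar L(Q(t))$, for a.e. time $0 \le t \le T$. Taking a sequence of times $T_k \to \infty$ and passing if necessary to a subsequence, we obtain a vector $P$ satisfying (3.22).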

4. Structure of minimizing measures.

We next fix one of the Young measures $\nu_t$ and hereafter write $\nu = \nu_t$. Our goal is to understand the form of this measure, and in particular to describe its support.

Our further deductions will be based entirely upon certain conclusions reached above. These are firstly that $\nu$ is a compactly supported Radon probability measure on $\mathbb R^n \times \mathbb T^n$, for which we define

$Q := \int_{\mathbb R^n}\int_{\mathbb T^n} D_pH(p, x)\,d\nu$,

as in (3.12) above. In addition, we have

(4.1)  $\int_{\mathbb R^n}\int_{\mathbb T^n} \{H, \Phi\}\,d\nu = 0$

for each $C^1$ function $\Phi$ that is $\mathbb T^n$-periodic, and furthermore

(4.2)  $\bar L(Q) = \int_{\mathbb R^n}\int_{\mathbb T^n} L(D_pH(p, x), x)\,d\nu$.

These are, respectively, assertions (3.8) and (3.13) above.

Remarks. Our $\nu$ is therefore a minimal measure in the sense of Mather [Mt1], except that we work in phase space. The advantage is that the flow invariance condition (4.1) is fairly simple, and very useful, in the $(p, x)$ variables. □

Notation. (i) We write $M := \mathrm{spt}(\nu)$ and call $M$ the Aubry–Mather set.

(ii) We denote by $\sigma$ the projection of $\nu$ onto the $x$-variables. That is,

$\sigma(E) := \nu(\mathbb R^n \times E)$

for each Borel subset $E$ of $\mathbb T^n$. □

Take now any $P \in \partial\bar L(Q)$ and let $v = v(P, x)$ be any viscosity solution of the corresponding cell problem

(4.3)  $H(P + D_xv, x) = \bar H(P)$ in $\mathbb R^n$, $\quad x \mapsto v(P, x)$ is $\mathbb T^n$-periodic,

satisfying the semiconcavity condition (2.7). We hereafter set

$u(P, x) := P \cdot x + v(P, x)$.

4.1. Differentiability on the support of ν.

Theorem 4.1. (i) The function $u$ is differentiable in the variable $x$ $\sigma$-a.e., and $\sigma$-a.e. point is a Lebesgue point for $D_xu$.

(ii) We have

$p = D_xu(P, x)$ $\nu$-a.e.

(iii) Furthermore,

(4.4)  $\int_{\mathbb R^n}\int_{\mathbb T^n} H(p, x)\,d\nu = \int_{\mathbb T^n} H(D_xu, x)\,d\sigma = \bar H(P)$;

and if $\bar H$ is differentiable at $P$,

$\int_{\mathbb R^n}\int_{\mathbb T^n} D_pH(p, x)\,d\nu = \int_{\mathbb T^n} D_pH(D_xu, x)\,d\sigma = D\bar H(P)$.

Thus $\nu$ is supported on the graph $p = D_xu(P, x) = P + D_xv(P, x)$, which is single-valued $\sigma$-a.e. Also, the PDE (4.3) holds pointwise, $\sigma$-a.e.

Remarks. Formula (4.4) explicitly displays $\bar H$ as an average of $H$; but for this to be useful, we need to know more about the measure $\sigma$. We will later, in §9, discover a bit more about the structure of $\sigma$.

Observe also that from (4.4) we deduce

(4.5)  $\bar H(P) = \bar H(\hat P)$ if $P, \hat P \in \partial\bar L(Q)$.

Finally, compare assertion (ii) with the canonical change of variables (1.3). □
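Both identities in assertion (iii) can be checked by hand, or numerically, in a smooth one-dimensional situation. The sketch below takes $H(p, x) = \frac12 p^2 + \cos(2\pi x)$ at an energy level $\lambda > \max V$, where $D_xu = \sqrt{2(\lambda - V)}$ is smooth. As a working assumption for this example, $\sigma$ is taken to be the probability density proportional to $1/D_xu$, that is, the fraction of time a rotating trajectory spends near each point. The derivative $D\bar H(P)$ is approximated by a centered difference of the relation $P = \int_0^1 \sqrt{2(\lambda - V)}\,dx$. All names and numerical parameters are illustrative.

```python
import numpy as np

n_grid = 2000
x = (np.arange(n_grid) + 0.5) / n_grid           # midpoint nodes on [0, 1)
V = np.cos(2.0 * np.pi * x)                      # illustrative 1-periodic potential


def P_of(lam):
    """P = integral of sqrt(2 (lam - V)) over one period, valid for lam > max V."""
    return float(np.mean(np.sqrt(2.0 * (lam - V))))


lam = 3.0                                        # energy level, plays the role of Hbar(P)
du = np.sqrt(2.0 * (lam - V))                    # D_x u on the grid
sigma = (1.0 / du) / np.mean(1.0 / du)           # density with sigma * du constant, mass 1

avg_velocity = float(np.mean(du * sigma))        # integral of D_pH(D_x u, x) d(sigma)
avg_energy = float(np.mean((0.5 * du**2 + V) * sigma))

delta = 1e-4
dHbar_dP = 2.0 * delta / (P_of(lam + delta) - P_of(lam - delta))
```

In this setting `avg_energy` should reproduce $\lambda$ and `avg_velocity` should agree with the difference quotient `dHbar_dP`, mirroring the two formulas of assertion (iii). The general case treated in the theorem is much harder precisely because $u$ need not be smooth.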

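Proof. 1. To ease notation, we do not display the dependence of $u$ on the variable $P$, and also write $Du$ for $D_xu$.

Take $\eta_\varepsilon$ to be a smooth, nonnegative, radial convolution kernel, supported in the ball $B(0, \varepsilon)$. Then set

$u^\varepsilon := \eta_\varepsilon * u$.

The strict convexity of $H$ implies for all $p, q \in \mathbb R^n$ that

$H(q, x) \ge H(p, x) + D_pH(p, x) \cdot (q - p) + \frac{\gamma}{2}|q - p|^2$.

Take $q = Du(y)$, $p = Du^\varepsilon(x) = \int_{\mathbb R^n} \eta_\varepsilon(x - y)Du(y)\,dy$ in this expression, multiply by $\eta_\varepsilon(x - y)$, and then integrate with respect to $y$:

$H(Du^\varepsilon(x), x) \le \int_{\mathbb R^n} \eta_\varepsilon(x - y)H(Du(y), x)\,dy - \frac{\gamma}{2}\int_{\mathbb R^n} \eta_\varepsilon(x - y)|Du(y) - Du^\varepsilon(x)|^2\,dy$.

Since the PDE $H(D_xu, x) = \bar H(P)$ holds pointwise a.e., we conclude that

(4.6)  $\beta_\varepsilon(x) + H(Du^\varepsilon(x), x) \le \bar H(P) + C\varepsilon$

for each $x \in \mathbb T^n$, where

(4.7)  $\beta_\varepsilon(x) := \frac{\gamma}{2}\int_{\mathbb R^n} \eta_\varepsilon(x - y)|Du(y) - Du^\varepsilon(x)|^2\,dy$.

2. Recalling again the strict convexity of $H$ with respect to the variable $p$, we deduce

(4.8)  $\frac{\gamma}{2}\int_{\mathbb R^n}\int_{\mathbb T^n} |Du^\varepsilon(x) - p|^2\,d\nu \le \int_{\mathbb R^n}\int_{\mathbb T^n} \left[H(Du^\varepsilon(x), x) - H(p, x) - D_pH(p, x) \cdot (Du^\varepsilon(x) - p)\right]d\nu$.

Now $Du^\varepsilon = P + Dv^\varepsilon$, where $v^\varepsilon = \eta_\varepsilon * v$ is periodic. Consequently

$\int_{\mathbb R^n}\int_{\mathbb T^n} D_pH \cdot Dv^\varepsilon\,d\nu = 0$,

according to (4.1). This observation and (4.6) imply

(4.9)  $\frac{\gamma}{2}\int_{\mathbb R^n}\int_{\mathbb T^n} |Du^\varepsilon - p|^2\,d\nu + \int_{\mathbb T^n} \beta_\varepsilon\,d\sigma \le \bar H(P) - \int_{\mathbb R^n}\int_{\mathbb T^n} \left[H + D_pH \cdot (P - p)\right]d\nu + C\varepsilon$.

Next, $P \in \partial\bar L(Q)$ implies

$\bar L(Q) + \bar H(P) = P \cdot Q$.

Furthermore

$L(D_pH(p, x), x) + H(p, x) = D_pH(p, x) \cdot p$.

Recalling that $Q = \int_{\mathbb R^n}\int_{\mathbb T^n} D_pH\,d\nu$ and substituting into (4.9), we find

(4.10)  $\frac{\gamma}{2}\int_{\mathbb R^n}\int_{\mathbb T^n} |Du^\varepsilon - p|^2\,d\nu + \int_{\mathbb T^n} \beta_\varepsilon\,d\sigma \le -\bar L(Q) + \int_{\mathbb R^n}\int_{\mathbb T^n} L(D_pH, x)\,d\nu + C\varepsilon = C\varepsilon$,

according to (4.2).

3. Now send $\varepsilon \to 0$. Passing as necessary to a subsequence we deduce first from (4.10) that

$\beta_\varepsilon \to 0$ $\sigma$-a.e.

Thus $\sigma$-a.e. point $x$ is a point of approximate continuity of $Du$, and $Du$ is $\sigma$-measurable. Since $u = x \cdot P + v$ and $v$ is semiconcave as a function of $x$ (Theorem 2.1, (ii)), it follows that $u$ is differentiable in $x$, $\sigma$-a.e. Thus

$Du^\varepsilon \to Du$

pointwise, $\sigma$-a.e., and so (4.10) in turn forces

$p = Du(x) = P + Dv(x)$ $\nu$-a.e.

This proves assertion (ii), and (iii) follows then from the cell PDE. □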

Remark. As a consequence of the foregoing proof, we have the identity

(4.11)  $\int_{\mathbb R^n}\int_{\mathbb T^n} D_pH(p, x) \cdot D_xv\,d\nu = \int_{\mathbb T^n} D_pH(D_xu, x) \cdot D_xv\,d\sigma = 0$,

which we will need later. To confirm this, recall from above that

$\int_{\mathbb R^n}\int_{\mathbb T^n} D_pH \cdot D_xv^\varepsilon\,d\nu = 0$.

Since $D_xv^\varepsilon \to D_xv$ boundedly, $\nu$-a.e., we can apply the Dominated Convergence Theorem. □

5. Derivative estimates in the variable x.

We devote this section to showing that our solution $u$ of the cell problem is “smoother” on the support of $\sigma$ than it may be at other points of $\mathbb T^n$. This is a sort of “partial regularity” assertion.

5.1. Formal $L^2$- and $L^\infty$-estimates.

First of all, we provide for the reader some purely formal $L^2$ and $L^\infty$ estimates for $D_x^2u$ on the support of $\sigma$, calculations which provide motivation for the rigorous bounds obtained afterwards.

$L^2$-inequalities. We assume for this that the generating function $u$ is smooth, then differentiate the cell PDE twice with respect to $x_i$, and finally sum for $i = 1, \dots, n$:

$H_{p_kp_l}(D_xu, x)u_{x_kx_i}u_{x_lx_i} + H_{p_k}(D_xu, x)u_{x_kx_ix_i} + 2H_{p_kx_i}(D_xu, x)u_{x_kx_i} + H_{x_ix_i}(D_xu, x) = 0$.

The first term on the left is greater than or equal to $\gamma|D_x^2u|^2$. Thus

$\gamma\int_{\mathbb T^n} |D_x^2u|^2\,d\sigma + \int_{\mathbb T^n} D_pH \cdot D_x(\Delta_xu)\,d\sigma \le C + C\int_{\mathbb T^n} |D_x^2u|\,d\sigma$.

Since $\Delta_xu = \Delta_xv$ is periodic, the second term on the left equals zero, according to (4.1). We consequently conclude

(5.1)  $\int_{\mathbb T^n} |D_x^2u|^2\,d\sigma \le C$,

for some constant $C$ depending only on $H$ and $P$. □

$L^\infty$-inequalities. We can similarly differentiate the cell PDE twice in any unit direction $\xi$, to find

$H_{p_kp_l}(D_xu, x)u_{x_k\xi}u_{x_l\xi} + H_{p_k}(D_xu, x)u_{x_k\xi\xi} + 2H_{p_k\xi}(D_xu, x)u_{x_k\xi} + H_{\xi\xi}(D_xu, x) = 0$,

for $u_{\xi\xi} := \sum_{i,j=1}^n u_{x_ix_j}\xi_i\xi_j$. Take a nondecreasing function $\Phi : \mathbb R \to \mathbb R$, and write $\phi := \Phi' \ge 0$. Multiply the above identity by $\phi(u_{\xi\xi})$, and integrate with respect to $\sigma$. After some simplifications, we find

$\frac{\gamma}{2}\int_{\mathbb T^n} |D_xu_\xi|^2\phi(u_{\xi\xi})\,d\sigma + \int_{\mathbb T^n} D_pH \cdot D_x(\Phi(u_{\xi\xi}))\,d\sigma \le C\int_{\mathbb T^n} \phi(u_{\xi\xi})\,d\sigma$.

Since $u_{\xi\xi} = v_{\xi\xi}$ is periodic, the second term on the left is zero. We select

$\phi(z) = 1$ if $z \le -\mu$, $\quad \phi(z) = 0$ if $z > -\mu$,

for a constant $\mu > 0$. Since $|D_xu_\xi|^2 \ge u_{\xi\xi}^2$, we conclude that $\sigma(\{u_{\xi\xi} \le -\mu\}) = 0$ if $\mu$ is large enough. Because (2.10) provides the opposite estimate $u_{\xi\xi} \le \alpha$, we thereby derive the formal bound

(5.2)  $|u_{\xi\xi}| \le C$ $\sigma$-a.e.,

the constant $C$ depending only upon known quantities. □

Remark. As the interested reader may wish to confirm, the foregoing derivations are especially transparent for the classical Hamiltonian

$H(p, x) = \frac12|p|^2 + V(x)$,

in which case the cell PDE (4.3) is the eikonal equation

$\frac12|D_xu|^2 + V(x) = \bar H(P)$

and (4.1) corresponds to the transport equation

$\mathrm{div}(\sigma D_xu) = 0$.

A clear message is that these two PDE should be considered together as a pair, in accordance with formal semiclassical limits. (See the Remark in §1.3.) □

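5.2. An $L^2$-estimate of difference quotients in $x$.

We now establish an analogue of estimate (5.1), with difference quotients replacing some of the derivatives.

Theorem 5.1. There exists a constant $C$, depending only on $H$ and $P$, such that

(5.3)  $\int_{\mathbb T^n} |D_xu(P, x + h) - D_xu(P, x)|^2\,d\sigma \le C|h|^2$

for $h \in \mathbb R^n$.

Remark. If $D_xu(P, x + h)$ is multivalued, we interpret (5.3) to mean

(5.4)  $\int_{\mathbb T^n} |\xi - D_xu|^2\,d\sigma \le C|h|^2$

for some $\sigma$-measurable selection $\xi \in D_xu(P, \cdot + h)$. □

Proof. 1. To simplify notation we do not display the dependence of $u$ on $P$, and just write $Du$ for $D_xu$.

Fix $h \in \mathbb R^n$ and define the shifted function

$\tilde u(\cdot) := u(\cdot + h)$.

Then

$H(D\tilde u, x + h) = \bar H(P)$ a.e. in $\mathbb R^n$.

Mollifying as in the proof of Theorem 4.1, we have

$H(D\tilde u^\varepsilon, x + h) \le \bar H(P) + C\varepsilon$ in $\mathbb R^n$.

Therefore

$H(D\tilde u^\varepsilon, x) - H(Du, x) \le C\varepsilon + H(D\tilde u^\varepsilon, x) - H(D\tilde u^\varepsilon, x + h)$

$\sigma$-a.e., and consequently

(5.5)  $\frac{\gamma}{2}\int_{\mathbb T^n} |D\tilde u^\varepsilon - Du|^2\,d\sigma + \int_{\mathbb T^n} D_pH(Du, x) \cdot (D\tilde u^\varepsilon - Du)\,d\sigma \le C\varepsilon + \int_{\mathbb T^n} \left[H(D\tilde u^\varepsilon, x) - H(D\tilde u^\varepsilon, x + h)\right]d\sigma \le C(\varepsilon + |h|^2) - \int_{\mathbb T^n} D_xH(D\tilde u^\varepsilon, x) \cdot h\,d\sigma$.

2. Since $D\tilde u^\varepsilon - Du = D\tilde v^\varepsilon - Dv$, the second term on the left hand side of (5.5) vanishes, in view of (4.1), (4.11). Therefore

$\frac{\gamma}{2}\int_{\mathbb T^n} |D\tilde u^\varepsilon - Du|^2\,d\sigma \le C(\varepsilon + |h|^2) - \int_{\mathbb T^n} D_xH(Du, x) \cdot h\,d\sigma + C\int_{\mathbb T^n} |D\tilde u^\varepsilon - Du||h|\,d\sigma$,

and thus

$\frac{\gamma}{4}\int_{\mathbb T^n} |D\tilde u^\varepsilon - Du|^2\,d\sigma \le C(\varepsilon + |h|^2) - \int_{\mathbb R^n}\int_{\mathbb T^n} D_xH \cdot h\,d\nu$.

However (4.1) implies the last term here is zero; whence

(5.6)  $\int_{\mathbb T^n} |D\tilde u^\varepsilon - Du|^2\,d\sigma \le C(\varepsilon + |h|^2)$.

3. We send $\varepsilon \to 0$. Passing as necessary to a subsequence we have

$D\tilde u^\varepsilon \rightharpoonup \xi$ weakly in $L^2_\sigma$

and

(5.7)  $\int_{\mathbb T^n} |\xi - Du|^2\,d\sigma \le C|h|^2$.

4. To conclude, we must show

(5.8)  $\xi \in D\tilde u = Du(\cdot + h)$ $\sigma$-a.e.,

which means that for $\sigma$-a.e. point $x$ there exists a constant $C$ such that

(5.9)  $\tilde u(y) \le \tilde u(x) + \xi \cdot (y - x) + C|y - x|^2$

for all $y$. To confirm this, recall that $\tilde u$, and so also $\tilde u^\varepsilon$, are semiconcave:

$\tilde u^\varepsilon(y) \le \tilde u^\varepsilon(x) + D\tilde u^\varepsilon(x) \cdot (y - x) + C|y - x|^2$

for all $x, y$. Take $g \in L^2_\sigma$, $g \ge 0$. Then fixing $y$ and integrating the variable $x$ with respect to $\sigma$, we find

$0 \le \int_{\mathbb T^n} \left(-\tilde u^\varepsilon(y) + \tilde u^\varepsilon(x) + D\tilde u^\varepsilon(x) \cdot (y - x) + C|y - x|^2\right)g(x)\,d\sigma(x)$.

Let $\varepsilon \to 0$ and note $\tilde u^\varepsilon \to \tilde u$ uniformly. We conclude

$0 \le \int_{\mathbb T^n} \left(-\tilde u(y) + \tilde u(x) + \xi \cdot (y - x) + C|y - x|^2\right)g(x)\,d\sigma(x)$.

This inequality is true for all $g$ as above; whence (5.9) holds for $\sigma$-a.e. point $x$ and all $y$. □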

5.3. L∞-estimates of difference quotients in x.

We next refine the integration arguments above, to derive an L∞ bound on second-orderdifference quotients. This will be a variant of the formal estimate (5.2) above.

Theorem 5.2. There exists a constant C, depending only on H and P , such that

(5.10) |u(P, x + h) − 2u(P, x) + u(P, x − h)| ≤ C|h|2

for all h ∈ Rn and each point x ∈ spt(σ).

Proof. 1. Take h �= 0, and write

u = u(· + h), u = u(· − h).21

We as before consider the mollified functions uε, uε, where we take

(5.11) 0 < ε ≤ η|h|2,

for small η > 0. As in the earlier proofs, we have{

H(Duε, x + h) ≤ H(P ) + Cε,

H(Duε, x − h) ≤ H(P ) + Cε.

Therefore for σ-a.e. point x,

H(Duε, x) − 2H(Du, x) + H(Duε, x)

≤ Cε + H(Duε, x) − H(Duε, x + h) + H(Duε, x) − H(Duε, x − h).

Hence

γ

2(|Duε − Du|2 + |Duε − Du|2) + DpH(Du, x) · (Duε − 2Du + Duε)

≤ C(ε + |h|2) + (DxH(Duε, x) − DxH(Duε, x)) · h,

and consequently

γ

4(|Duε − Du|2 + |Duε − Du|2)

+ DpH(Du, x) · (Duε − 2Du + Duε) ≤ C(ε + |h|2).

2. Fix now a smooth, nondecreasing, function Φ : R → R, and write φ := Φ′ ≥ 0.Multiply the last inequality above by φ

(uε−2u+uε

|h|2), and integrate with respect to σ:

(5.12)

γ

4

∫Tn

(|Duε − Du|2 + |Duε − Du|2)φ(

uε − 2u + uε

|h|2)

+∫

Tn

DpH(Du, x) · (Duε − 2Du + Duε)φ(· · · ) dσ

≤ C(ε + |h|2)∫

Tn

φ(· · · ) dσ.

Now the second term on the left hand side of (5.12) equals

(5.13) |h|2∫

Rn

∫Tn

DpH(p, x) · DxΦ(

uε − 2u + uε

|h|2)

and thus is zero. (To see this, note from (4.1) that the expression (5.13) vanishes if wereplace u by a mollified function uδ. Let δ → 0, recalling the estimates in the proof ofTheorem 4.1.)

22

So now dropping the above term from (5.12) and rewriting, we deduce

(5.14)

∫Tn

|Duε(x + h) − Duε(x − h)|2φ(

uε(x + h) − 2u(x) + uε(x − h)|h|2

)dσ

≤ C(ε + |h|2)∫

Tn

φ

(uε(x + h) − 2u(x) + uε(x − h)

|h|2)

dσ.

3. We confront now a technical problem, as (5.14) entails a mixture of first-order dif-ference quotients for Duε and second-order difference quotients for u, uε. We can howeverrelate these expressions, since u is semiconcave.

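To see this, first of all define

$$ E_\varepsilon := \{ x \in \mathrm{spt}(\sigma) \mid u^\varepsilon(x+h) - 2u(x) + u^\varepsilon(x-h) \le -\mu|h|^2 \}, \tag{5.15} $$

the large constant $\mu > 0$ to be fixed below. Now according to (2.10), the functions

$$ \bar u(x) := u(x) - \frac{\alpha}{2}|x|^2, \qquad \bar u^\varepsilon(x) := u^\varepsilon(x) - \frac{\alpha}{2}|x|^2 \tag{5.16} $$

are concave. Also a point $x \in \mathrm{spt}(\sigma)$ belongs to $E_\varepsilon$ if and only if

$$ \bar u^\varepsilon(x+h) - 2\bar u(x) + \bar u^\varepsilon(x-h) \le -(\mu + \alpha)|h|^2. \tag{5.17} $$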

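Set

$$ f_\varepsilon(s) := \bar u^\varepsilon\left(x + s\frac{h}{|h|}\right) \qquad (-|h| \le s \le |h|). \tag{5.18} $$

Then $f_\varepsilon$ is concave, and

$$ \begin{aligned} \bar u^\varepsilon(x+h) - 2\bar u^\varepsilon(x) + \bar u^\varepsilon(x-h) &= f_\varepsilon(|h|) - 2f_\varepsilon(0) + f_\varepsilon(-|h|) \\ &= \int_{-|h|}^{|h|} f_\varepsilon''(s)(|h| - |s|)\, ds \\ &\ge |h| \int_{-|h|}^{|h|} f_\varepsilon''(s)\, ds \qquad (\text{since } f_\varepsilon'' \le 0) \\ &= |h|\left(f_\varepsilon'(|h|) - f_\varepsilon'(-|h|)\right) \\ &= \left(D\bar u^\varepsilon(x+h) - D\bar u^\varepsilon(x-h)\right) \cdot h. \end{aligned} $$

Consequently if $x \in E_\varepsilon$, this inequality and (5.17) together imply

$$ 2|\bar u^\varepsilon(x) - \bar u(x)| + |D\bar u^\varepsilon(x+h) - D\bar u^\varepsilon(x-h)|\,|h| \ge (\mu + \alpha)|h|^2. $$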

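Now $|\bar u^\varepsilon(x) - \bar u(x)| \le C\varepsilon$ on $\mathbb{T}^n$, since $u$ is Lipschitz continuous. We may therefore take $\eta$ in (5.11) small enough to deduce from the foregoing that

$$ |D\bar u^\varepsilon(x+h) - D\bar u^\varepsilon(x-h)| \ge \left(\frac{\mu}{2} + \alpha\right)|h|. \tag{5.19} $$

But then, since $D\bar u^\varepsilon(x \pm h) = Du^\varepsilon(x \pm h) - \alpha(x \pm h)$,

$$ |Du^\varepsilon(x+h) - Du^\varepsilon(x-h)| \ge \left(\frac{\mu}{2} - \alpha\right)|h|. \tag{5.20} $$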

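4. Return now to (5.14). Taking $\mu > 2\alpha$ and

$$ \phi(z) = \begin{cases} 1 & \text{if } z \le -\mu \\ 0 & \text{if } z > -\mu, \end{cases} $$

we discover from (5.14) and (5.20) that

$$ \left(\frac{\mu}{2} - \alpha\right)^2 |h|^2 \sigma(E_\varepsilon) \le C(\varepsilon + |h|^2)\, \sigma(E_\varepsilon). $$

We fix $\mu$ so large that

$$ \left(\frac{\mu}{2} - \alpha\right)^2 \ge C + 1, $$

to deduce

$$ (|h|^2 - C\varepsilon)\, \sigma(E_\varepsilon) \le 0. $$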

Thus $\sigma(E_\varepsilon) = 0$ if $\eta$ in (5.11) is small enough, and this means

$$ u^\varepsilon(x+h) - 2u(x) + u^\varepsilon(x-h) \ge -\mu|h|^2 $$

for $\sigma$-a.e. point $x$. Now let $\varepsilon \to 0$:

$$ u(x+h) - 2u(x) + u(x-h) \ge -\mu|h|^2 $$

$\sigma$-a.e. Since

$$ u(x+h) - 2u(x) + u(x-h) \le \alpha|h|^2 $$

owing to the semiconcavity, we have

$$ |u(x+h) - 2u(x) + u(x-h)| \le C|h|^2 $$

for $\sigma$-a.e. point $x$. As $u$ is continuous, the same inequality obtains for all $x \in \mathrm{spt}(\sigma)$. □
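The integral identity for second differences used in step 3 of the proof can be checked by integrating by parts on each half-interval. The following is a sketch for a generic $C^2$ function $f$ on $[-a, a]$ with $a = |h|$; the symbols $f$ and $a$ are placeholders introduced here, standing in for the function defined in (5.18) and for $|h|$.

```latex
\begin{align*}
\int_0^a f''(s)(a - s)\, ds
  &= \bigl[f'(s)(a - s)\bigr]_0^a + \int_0^a f'(s)\, ds
   = -a f'(0) + f(a) - f(0), \\
\int_{-a}^0 f''(s)(a + s)\, ds
  &= \bigl[f'(s)(a + s)\bigr]_{-a}^0 - \int_{-a}^0 f'(s)\, ds
   = a f'(0) - f(0) + f(-a), \\
\int_{-a}^{a} f''(s)(a - |s|)\, ds
  &= f(a) - 2f(0) + f(-a).
\end{align*}
% If f'' <= 0, then f''(s)(a - |s|) >= a f''(s) because 0 <= a - |s| <= a,
% which gives the inequality used in the proof.
```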

6. Application: Lipschitz estimates for the support of ν.

We next improve the second derivative bounds from the previous section, and then show as a simple consequence that spt(ν) lies on a Lipschitz continuous graph.

Theorem 6.1. (i) There exists a constant $C$, depending only on $H$ and $P$, such that

$$ |u(P, y) - u(P, x) - D_xu(P, x) \cdot (y - x)| \le C|x - y|^2 \tag{6.1} $$

for all $y \in \mathbb{T}^n$ and $\sigma$-a.e. point $x \in \mathbb{T}^n$.

(ii) Furthermore,

$$ |D_xu(P, y) - D_xu(P, x)| \le C|x - y| \tag{6.2} $$

for all $y \in \mathbb{T}^n$ and for $\sigma$-a.e. point $x \in \mathbb{T}^n$.

(iii) In fact, $u$ is differentiable at each point $x \in \mathrm{spt}(\sigma)$, and estimates (6.1), (6.2) hold for all $y \in \mathbb{T}^n$, $x \in \mathrm{spt}(\sigma)$.

Remark. When Dxu(P, y) is multivalued, (6.2) asserts

|ξ − Dxu(P, x)| ≤ C|x − y|

for all ξ ∈ Dxu(P, y). In particular, for multivalued Dxu(P, y) we have the estimate

diam(Dxu(P, y)) ≤ C dist(y, spt(σ)),

providing a quantitative justification to the informal assertion that “spt(σ) misses the shocks in Du”. □

Proof. 1. Fix $y \in \mathbb{R}^n$ and take any point $x \in \mathrm{spt}(\sigma)$ at which $u$ is differentiable. According to Theorem 5.2 with $h := y - x$, we have

$$ |u(y) - 2u(x) + u(2x - y)| \le C|x - y|^2. \tag{6.3} $$

By semiconcavity, we have

$$ u(y) - u(x) - Du(x) \cdot (y - x) \le C|x - y|^2, \tag{6.4} $$

and also

$$ u(2x - y) - u(x) - Du(x) \cdot (2x - y - x) \le C|x - y|^2. \tag{6.5} $$

Use (6.5) in (6.3):

$$ u(y) - u(x) - Du(x) \cdot (y - x) \ge -C|x - y|^2. $$

This and (6.4) establish (6.1).

2. Estimate (6.2) follows from (6.1), as follows. Take $x$, $y$ as above. Let $z$ be a point to be selected later, with $|x - z| \le 2|x - y|$. The semiconcavity of $u$ implies that

$$ u(z) \le u(y) + Du(y) \cdot (z - y) + C|z - y|^2. \tag{6.6} $$

Also,

$$ u(z) = u(x) + Du(x) \cdot (z - x) + O(|x - z|^2), \qquad u(y) = u(x) + Du(x) \cdot (y - x) + O(|x - y|^2), $$

according to (6.1). Insert these identities into (6.6) and simplify:

$$ (Du(x) - Du(y)) \cdot (z - y) \le C|x - y|^2. $$

Now take

$$ z := y + |x - y|\, \frac{Du(x) - Du(y)}{|Du(x) - Du(y)|} $$

to deduce (6.2).

3. Now take any point $x \in \mathrm{spt}(\sigma)$, and fix $y$. There exist points $x_k \in \mathrm{spt}(\sigma)$ $(k = 1, \dots)$ such that $x_k \to x$ and $u$ is differentiable at $x_k$. According to estimate (6.1),

$$ |u(y) - u(x_k) - Du(x_k) \cdot (y - x_k)| \le C|x_k - y|^2 \qquad (k = 1, \dots). $$

The constant $C$ does not depend on $k$ or $y$. Now let $k \to \infty$. Owing to (6.2) we see that $\{Du(x_k)\}$ converges to some vector $\eta$, for which

$$ |u(y) - u(x) - \eta \cdot (y - x)| \le C|x - y|^2. $$

Consequently $u$ is differentiable at $x$ and $Du(x) = \eta$. □
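The final step of part 2 of the proof can be spelled out as follows. This is a sketch, assuming $x \neq y$ and $Du(x) \neq Du(y)$ (otherwise (6.2) is trivial); the unit vector $e$ is notation introduced here only for illustration.

```latex
\begin{gather*}
e := \frac{Du(x) - Du(y)}{|Du(x) - Du(y)|}, \qquad
z := y + |x - y|\, e, \qquad |z - y| = |x - y|, \\
|x - z| \le |x - y| + |y - z| = 2|x - y|, \\
(Du(x) - Du(y)) \cdot (z - y) = |x - y|\, |Du(x) - Du(y)| \le C|x - y|^2
\quad\Longrightarrow\quad
|Du(x) - Du(y)| \le C|x - y|.
\end{gather*}
```

The second line confirms that this choice of $z$ satisfies the constraint $|x - z| \le 2|x - y|$ imposed when $z$ was introduced.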

As an application of these bounds, we show next that the set M = spt(ν) lies on an n-dimensional Lipschitz continuous graph. This theorem (in position-velocity variables) is due originally to Mather [Mt2].

Theorem 6.2. There exists a constant C, depending only on P and H, such that

$$ |D_xu(P, x_1) - D_xu(P, x_2)| \le C|x_1 - x_2| \tag{6.7} $$

for $\sigma$-a.e. pair of points $x_1$, $x_2$.

Proof. In view of (6.2) we can extend the mapping $x \mapsto Du(x)$ to a uniformly Lipschitz function defined on all of $\mathbb{T}^n$. The support of ν lies on the graph of this mapping. □

7. Derivative estimates in the variable P .

We turn next to some bounds involving variations in $P$. These are rather subtle and involve the smoothness properties of $\bar H$. (See Pöschel [P, p. 656–657] for an explicit linear example, showing that $u$ can be less well behaved in $P$ than in $x$.)

7.1. A formal L2-estimate.

As in §5.1, we begin with a simple, but unjustified, calculation that suggests the later proof. So for the moment suppose $u$ and $\bar H$ are smooth, differentiate the cell PDE twice with respect to $P_i$, and sum on $i$:

$$ H_{p_k p_l}(D_xu, x)\, u_{x_k P_i} u_{x_l P_i} + H_{p_k}(D_xu, x)\, u_{x_k P_i P_i} = \bar H_{P_i P_i}(P). \tag{7.1} $$

The first term on the left is greater than or equal to $\gamma |D^2_{xP} u|^2$. Consequently

$$ \gamma \int_{\mathbb{T}^n} |D^2_{xP} u|^2\, d\sigma + \int_{\mathbb{T}^n} D_pH \cdot D_x(\Delta_P u)\, d\sigma \le \Delta \bar H(P), $$

where $\Delta \bar H = \Delta_P \bar H$ is the Laplacian of $\bar H$ in $P$. Since $\Delta_P u = \Delta_P v$ is periodic, the second term on the left equals zero. Therefore

$$ \int_{\mathbb{T}^n} |D^2_{xP} u|^2\, d\sigma \le C \Delta \bar H(P). \tag{7.2} $$

7.2. An L2-estimate of difference quotients in P.

We next provide a rigorous version of the foregoing calculation, replacing derivatives by difference quotients.

Theorem 7.1. There exists a positive constant C, depending only on H, such that

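$$ \int_{\mathbb{T}^n} |D_xu(\tilde P, x) - D_xu(P, x)|^2\, d\sigma \le C\left(\bar H(\tilde P) - \bar H(P) - Q \cdot (\tilde P - P)\right) \tag{7.3} $$

for all $\tilde P \in \mathbb{R}^n$.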

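Remark. Recall that $Q = \int_{\mathbb{R}^n} \int_{\mathbb{T}^n} D_pH(p, x)\, d\nu = \int_{\mathbb{T}^n} D_pH(D_xu, x)\, d\sigma$ and that $Q \in \partial \bar H(P)$. In (7.3), $u(\tilde P, x) = \tilde P \cdot x + v(\tilde P, x)$ and $v = v(\tilde P, x)$ is any viscosity solution of the cell problem

$$ \begin{cases} H(\tilde P + D_xv, x) = \bar H(\tilde P) & \text{in } \mathbb{R}^n \\ x \mapsto v(\tilde P, x) \text{ is } \mathbb{T}^n\text{-periodic.} \end{cases} \tag{7.4} $$

If $D_xu(\tilde P, x)$ is multivalued, we interpret (7.3) to mean

$$ \int_{\mathbb{T}^n} |\xi - D_xu(P, x)|^2\, d\sigma \le C\left(\bar H(\tilde P) - \bar H(P) - Q \cdot (\tilde P - P)\right) $$

for some $\sigma$-measurable selection $\xi \in D_xu(\tilde P, \cdot)$. □

Proof. 1. Write $\tilde v(\cdot) = v(\tilde P, \cdot)$, $\tilde u = x \cdot \tilde P + \tilde v$. Mollifying, we have

$$ H(D\tilde u^\varepsilon, x) \le \bar H(\tilde P) + C\varepsilon. \tag{7.5} $$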

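Therefore for $\sigma$-almost every point,

$$ \frac{\gamma}{2} |D\tilde u^\varepsilon - Du|^2 + D_pH(Du, x) \cdot (D\tilde u^\varepsilon - Du) \le H(D\tilde u^\varepsilon, x) - H(Du, x) \le \bar H(\tilde P) - \bar H(P) + C\varepsilon. \tag{7.6} $$

Observe that $D\tilde u^\varepsilon - Du = \tilde P - P + (D\tilde v^\varepsilon - Dv)$ and

$$ \int_{\mathbb{R}^n} \int_{\mathbb{T}^n} D_pH \cdot (D\tilde v^\varepsilon - Dv)\, d\nu = 0. $$

Consequently (7.6) yields

$$ \begin{aligned} \frac{\gamma}{2} \int_{\mathbb{T}^n} |D\tilde u^\varepsilon - Du|^2\, d\sigma &\le \bar H(\tilde P) - \bar H(P) - \int_{\mathbb{T}^n} D_pH(Du, x) \cdot (\tilde P - P)\, d\sigma + C\varepsilon \\ &= \bar H(\tilde P) - \bar H(P) - Q \cdot (\tilde P - P) + C\varepsilon. \end{aligned} \tag{7.7} $$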

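Let $\varepsilon \to 0$. □

Remark. For use later, we record the estimate

$$ \limsup_{\varepsilon \to 0} \int_{\mathbb{T}^n} \beta_\varepsilon\, d\sigma \le \bar H(\tilde P) - \bar H(P) - Q \cdot (\tilde P - P), \tag{7.8} $$

for

$$ \beta_\varepsilon(x) := \frac{\gamma}{2} \int_{\mathbb{T}^n} \eta_\varepsilon(x - y)\, |D_xu(\tilde P, y) - D_xu^\varepsilon(\tilde P, x)|^2\, dy. \tag{7.9} $$

To see this, note that as in the proof of Theorem 4.1 we can replace (7.5) by the stronger inequality

$$ \beta_\varepsilon(x) + H(D\tilde u^\varepsilon, x) \le \bar H(\tilde P) + C\varepsilon. $$

□

Corollary 7.2. (i) For each $P \in \mathbb{R}^n$, we have

$$ \int_{\mathbb{T}^n} |D_xu(\tilde P, x) - D_xu(P, x)|^2\, d\sigma \le O(|\tilde P - P|) \quad \text{as } \tilde P \to P. $$

(ii) If $\bar H$ is differentiable at $P$,

$$ \int_{\mathbb{T}^n} |D_xu(\tilde P, x) - D_xu(P, x)|^2\, d\sigma \le o(|\tilde P - P|) \quad \text{as } \tilde P \to P. $$

(iii) If $\bar H$ is twice-differentiable at $P$,

$$ \int_{\mathbb{T}^n} |D_xu(\tilde P, x) - D_xu(P, x)|^2\, d\sigma \le O(|\tilde P - P|^2) \quad \text{as } \tilde P \to P. $$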
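The three rates in Corollary 7.2 can be read off from the right-hand side of (7.3). The following is a sketch of that reasoning; the abbreviation $E(\tilde P)$ is notation introduced here, and the argument uses only the convexity of $\bar H$ and the fact that $Q \in \partial \bar H(P)$.

```latex
\begin{align*}
E(\tilde P) &:= \bar H(\tilde P) - \bar H(P) - Q \cdot (\tilde P - P) \ge 0
  && \text{since } Q \in \partial \bar H(P), \\
E(\tilde P) &= O(|\tilde P - P|)
  && \text{since the convex function } \bar H \text{ is locally Lipschitz}, \\
E(\tilde P) &= o(|\tilde P - P|)
  && \text{if } \bar H \text{ is differentiable at } P,
     \text{ so that } Q = D\bar H(P), \\
E(\tilde P) &= \tfrac12 (\tilde P - P) \cdot D^2 \bar H(P) (\tilde P - P)
     + o(|\tilde P - P|^2) = O(|\tilde P - P|^2)
  && \text{if } \bar H \text{ is twice differentiable at } P.
\end{align*}
```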


8. Application: strict convexity of $\bar H$ in certain directions.

The next estimate allows us to deduce certain strict convexity properties of $\bar H$.

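Theorem 8.1. (i) There exists a positive constant $C$ such that for each $R \in \mathbb{R}^n$, we have

$$ -R \cdot \tilde Q,\ R \cdot \hat Q \le C \left( \liminf_{t \to 0^+} \frac{\bar H(P + tR) - 2\bar H(P) + \bar H(P - tR)}{t^2} \right)^{1/2}, \tag{8.1} $$

where $\tilde Q, \hat Q \in \partial \bar H(P)$.

(ii) In particular, if $\bar H$ is twice differentiable at $P$, then

$$ |D\bar H(P) \cdot R| \le C \left( R \cdot D^2 \bar H(P) R \right)^{1/2} \tag{8.2} $$

for each $R \in \mathbb{R}^n$.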

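Proof. 1. Fix $R \in \mathbb{R}^n$, $t > 0$, and take

$$ \tilde u = u(P + tR, \cdot), \qquad \hat u = u(P - tR, \cdot). $$

Then for $\sigma$-a.e. point $x$:

$$ H(D\tilde u^\varepsilon, x) - 2H(Du, x) + H(D\hat u^\varepsilon, x) \le \bar H(P + tR) - 2\bar H(P) + \bar H(P - tR) + C\varepsilon. $$

Similarly to the proof in §5.2, we deduce

$$ \int_{\mathbb{T}^n} |D\tilde u^\varepsilon - Du|^2 + |D\hat u^\varepsilon - Du|^2\, d\sigma \le C\left(\bar H(P + tR) - 2\bar H(P) + \bar H(P - tR)\right) + C\varepsilon. \tag{8.3} $$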

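2. Since $H(D\tilde u^\varepsilon, x) \le \bar H(P + tR) + C\varepsilon$, we have

$$ \begin{aligned} \bar H(P) - \bar H(P + tR) &\le \int_{\mathbb{T}^n} H(Du, x) - H(D\tilde u^\varepsilon, x)\, d\sigma + C\varepsilon \\ &\le C \left( \int_{\mathbb{T}^n} |Du - D\tilde u^\varepsilon|^2\, d\sigma \right)^{1/2} + C\varepsilon. \end{aligned} \tag{8.4} $$

Likewise,

$$ \bar H(P) - \bar H(P - tR) \le C \left( \int_{\mathbb{T}^n} |Du - D\hat u^\varepsilon|^2\, d\sigma \right)^{1/2} + C\varepsilon. \tag{8.5} $$

Combining (8.3)–(8.5), sending $\varepsilon \to 0$, and recalling the convexity of $\bar H$, we find

$$ -t\tilde Q(t) \cdot R,\ t\hat Q(t) \cdot R \le C\left(\bar H(P + tR) - 2\bar H(P) + \bar H(P - tR)\right)^{1/2} $$

for any $\tilde Q(t) \in \partial \bar H(P + tR)$, $\hat Q(t) \in \partial \bar H(P - tR)$.

Taking any $t_k \to 0$, we may assume $\tilde Q(t_k) \to \tilde Q$, $\hat Q(t_k) \to \hat Q$ with $\tilde Q, \hat Q \in \partial \bar H(P)$. Estimate (8.1) follows. □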

Remarks. (i) From (8.1) we deduce that $\bar H$ is strictly convex in any direction $R$ which is not tangent to the level set $\{\bar H = \bar H(P)\}$, provided $\bar H$ is differentiable at $P$. (Compare this assertion with Iturriaga [I].)

(ii) More generally, if $\bar H(P) > \min_{\mathbb{R}^n} \bar H$, and so $0 \notin \partial \bar H(P)$, there exists an open convex cone of directions $R$ in which $\bar H$ is strictly convex at $P$.

Therefore the graph of $\bar H$ can contain an n-dimensional flat region only possibly at its minimum value. This can in fact happen, even though $H$ is uniformly convex in the variable $p$: see Lions–Papanicolaou–Varadhan [L-P-V] or Braides–Defranceschi [B-D, p. 149]. Consult Concordel [C1,C2] for more. Physically, a flat region at the minimum of $\bar H$ corresponds to “nonballistic” trajectories for the dynamics.

(iii) See also Bangert [B] and Weinan E [EW2] for an example showing that the level sets of $\bar H$ can have corners and/or flat parts. □
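The flat region described in Remark (ii) can be seen numerically in the simplest setting. The sketch below is an illustration under assumptions that are not taken from this paper: it uses the one-dimensional mechanical Hamiltonian H(p, x) = p²/2 + V(x) with the hypothetical potential V(x) = cos(2πx), for which the effective Hamiltonian is known in closed form: it equals max V when |P| is at most the integral over [0, 1] of sqrt(2(max V − V)), and otherwise it is the energy E > max V solving ∫ sqrt(2(E − V)) dx = |P|. The function names `action` and `effective_hamiltonian` are made up for this example.

```python
import math

def V(x):
    # Hypothetical 1-periodic potential, chosen only for illustration.
    return math.cos(2.0 * math.pi * x)

V_MAX = 1.0  # maximum of V

def action(E, n=4000):
    """Midpoint-rule value of the integral over [0, 1] of sqrt(2 (E - V(x))) dx."""
    total = 0.0
    for j in range(n):
        x = (j + 0.5) / n
        total += math.sqrt(max(0.0, 2.0 * (E - V(x))))
    return total / n

def effective_hamiltonian(P, tol=1e-10):
    """Effective Hamiltonian of H(p, x) = p^2/2 + V(x) in one dimension."""
    P = abs(P)
    if P <= action(V_MAX):
        return V_MAX  # flat piece, located at the minimum value
    lo, hi = V_MAX, V_MAX + 1.0
    while action(hi) < P:  # bracket the root; action is increasing in E
        hi = V_MAX + 2.0 * (hi - V_MAX)
    while hi - lo > tol:  # bisection on action(E) = |P|
        mid = 0.5 * (lo + hi)
        if action(mid) < P:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Half-width of the flat region; for this V it equals 4/pi analytically.
P_CRIT = action(V_MAX)
```

For this potential the flat piece is the interval |P| ≤ 4/π, and outside it the computed values increase strictly with |P|, consistent with the strict convexity away from the minimum asserted above.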

9. Application: averaging in the variable X.

Assume for this section that $\bar H$ is differentiable at $P$ and furthermore that $Q = D\bar H(P)$ satisfies the nonresonance condition:

$$ Q \cdot k \neq 0 \quad \text{for each vector } k \in \mathbb{Z}^n,\ k \neq 0. \tag{9.1} $$

Notation. For $h > 0$, we write the vector of difference quotients

$$ D^h_P u(P, x) := \left( \dots, \frac{u(P + he_l, x) - u(P, x)}{h}, \dots \right), \tag{9.2} $$

for $e_l := (0, \dots, 1, \dots, 0)$, the 1 in the $l$-th position. □

Theorem 9.1. Suppose $Q = D\bar H(P)$ satisfies (9.1). Then

$$ \lim_{h \to 0} \int_{\mathbb{T}^n} \Phi(D^h_P u(P, x))\, d\sigma = \int_{\mathbb{T}^n} \Phi(X)\, dX \tag{9.3} $$

for each continuous, $\mathbb{T}^n$-periodic function $\Phi$.

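Proof. 1. Let $u_l(\cdot) := u(P + he_l, \cdot)$, and $u_l^\varepsilon := \eta_\varepsilon * u_l$, for $l = 1, \dots, n$.

Since $H$ is smooth, we have for all $p$, $q$ lying in a compact subset of $\mathbb{R}^n$ that

$$ H(q, x) = H(p, x) + D_pH(p, x) \cdot (q - p) + R, \quad \text{with } |R| \le C|q - p|^2. $$

Take $q = Du_l(y)$, $p = Du_l^\varepsilon(x) = \int_{\mathbb{R}^n} \eta_\varepsilon(x - y) Du_l(y)\, dy$, multiply by $\eta_\varepsilon(x - y)$, and integrate with respect to $y$:

$$ H(Du_l^\varepsilon(x), x) = \int_{\mathbb{R}^n} \eta_\varepsilon(x - y) H(Du_l(y), x)\, dy - \int_{\mathbb{R}^n} \eta_\varepsilon(x - y) R\, dy. \tag{9.4} $$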

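Furthermore the PDE $H(Du_l, x) = \bar H(P + he_l)$ holds pointwise a.e., and so we can conclude that

$$ H(Du_l^\varepsilon, x) = \bar H(P + he_l) + \gamma_\varepsilon^l, \tag{9.5} $$

where the error term is estimated by

$$ |\gamma_\varepsilon^l| \le C(\varepsilon + \beta_\varepsilon^l) $$

for

$$ \beta_\varepsilon^l(x) := \frac{\gamma}{2} \int_{\mathbb{R}^n} \eta_\varepsilon(x - y)\, |Du_l(y) - Du_l^\varepsilon(x)|^2\, dy. $$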

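2. We introduce the partially smoothed vector of difference quotients

$$ D^h_P u^\varepsilon(P, x) := \left( \dots, \frac{u_l^\varepsilon - u}{h}, \dots \right), \tag{9.6} $$

and take then a vector of integers $k = (k_1, \dots, k_n)$, $k \neq 0$.

Next, observe that the function

$$ e^{2\pi i k \cdot D^h_P u^\varepsilon} = e^{2\pi i k \cdot x}\, e^{2\pi i k \cdot D^h_P v^\varepsilon} $$

is $\mathbb{T}^n$-periodic, even though $D^h_P u^\varepsilon$ is not periodic. Hence

$$ \begin{aligned} 0 &= \int_{\mathbb{T}^n} D_pH(Du, x) \cdot D_x\left( e^{2\pi i k \cdot D^h_P u^\varepsilon} \right) d\sigma \\ &= 2\pi i \int_{\mathbb{T}^n} e^{2\pi i k \cdot D^h_P u^\varepsilon} \sum_{l=1}^n k_l\, D_pH(Du, x) \cdot D_x\left( \frac{u_l^\varepsilon - u}{h} \right) d\sigma. \end{aligned} \tag{9.7} $$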

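3. Now (9.5) implies

$$ H(Du_l^\varepsilon, x) - H(Du, x) = \bar H(P + he_l) - \bar H(P) + \gamma_\varepsilon^l. $$

Consequently

$$ D_pH(Du, x) \cdot D(u_l^\varepsilon - u) = \bar H(P + he_l) - \bar H(P) + \Gamma_\varepsilon^l, $$

where

$$ |\Gamma_\varepsilon^l| \le C(\varepsilon + \beta_\varepsilon^l + |Du_l^\varepsilon - Du|^2) \tag{9.8} $$

for $l = 1, \dots, n$. Therefore

$$ D_pH(Du, x) \cdot D_x\left( \frac{u_l^\varepsilon - u}{h} \right) = Q_l + \left( \frac{\bar H(P + he_l) - \bar H(P)}{h} - Q_l + \frac{1}{h} \Gamma_\varepsilon^l \right). \tag{9.9} $$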

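4. Insert (9.9) into (9.7), and then estimate

$$ \begin{aligned} \left| (Q \cdot k) \int_{\mathbb{T}^n} e^{2\pi i k \cdot D^h_P u^\varepsilon}\, d\sigma \right| &\le \frac{C\varepsilon}{h} + C \sum_{l=1}^n \left( \frac{\bar H(P + he_l) - \bar H(P)}{h} - Q_l \right) + \frac{C}{h} \sum_{l=1}^n \int_{\mathbb{T}^n} \beta_\varepsilon^l + |Du_l^\varepsilon - Du|^2\, d\sigma \quad \text{by (9.8)} \\ &\le \frac{C\varepsilon}{h} + C \sum_{l=1}^n \left( \frac{\bar H(P + he_l) - \bar H(P)}{h} - Q_l \right) + \frac{C}{h} \sum_{l=1}^n \int_{\mathbb{T}^n} \beta_\varepsilon^l\, d\sigma, \end{aligned} \tag{9.10} $$

the last inequality following from (7.7) in the proof of Theorem 7.1.

Next, send $\varepsilon \to 0$, and remember (7.8):

$$ \left| (Q \cdot k) \int_{\mathbb{T}^n} e^{2\pi i k \cdot D^h_P u}\, d\sigma \right| \le C \sum_{l=1}^n \left( \frac{\bar H(P + he_l) - \bar H(P)}{h} - Q_l \right). $$

Since $Q_l = \bar H_{P_l}(P)$ and $Q \cdot k \neq 0$, we conclude that

$$ \lim_{h \to 0} \int_{\mathbb{T}^n} e^{2\pi i k \cdot D^h_P u}\, d\sigma = 0 $$

for all $k \in \mathbb{Z}^n$, $k \neq 0$. Because any continuous, $\mathbb{T}^n$-periodic function $\Phi$ can be uniformly approximated by trigonometric polynomials, this implies assertion (9.3). □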

Remarks. (i) Recalling the formal change of variables (1.3), we interpret (9.3) to assert

$$ \text{“ } d\sigma = |\det D^2_{xP} u|\, dx \text{ ”} \tag{9.11} $$

in some weak sense, provided (9.1) holds. See [E-G2, §5.1] for related formal computations.

(ii) Theorem 9.1 provides a partial, but rigorous, interpretation of the following heuristics.

Suppose that our generating function $u$ is smooth, and induces the global change of variables $(p, x) \to (P, X)$ by (1.3). Then the dynamics (1.1) become (1.5); that is,

$$ \begin{cases} \dot{\mathbf{X}} = D\bar H(\mathbf{P}) \\ \dot{\mathbf{P}} = 0. \end{cases} $$

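Consequently $\mathbf{X}(t) = Qt + X_0$, $\mathbf{P}(t) \equiv P$. In view therefore of the nonresonance condition (9.1), we have

$$ \lim_{T \to \infty} \frac{1}{\lambda T} \int_0^{\lambda T} \Phi(\mathbf{X}(t))\, dt = \int_{\mathbb{T}^n} \Phi(X)\, dX $$

for each $\lambda > 0$. However

$$ \begin{aligned} \frac{1}{\lambda T} \int_0^{\lambda T} \Phi(\mathbf{X}(t))\, dt &= \frac{1}{\lambda T} \int_0^{\lambda T} \Phi(D_P u(P, \mathbf{x}(t)))\, dt \\ &= \frac{1}{\lambda} \int_0^{\lambda} \Phi\left( D_P u\left(P, \frac{\mathbf{x}_\varepsilon(t)}{\varepsilon}\right) \right) dt \qquad \text{for } \varepsilon = \frac{1}{T} \\ &\to \frac{1}{\lambda} \int_0^{\lambda} \int_{\mathbb{T}^n} \Phi(D_P u(P, x))\, d\sigma_t\, dt. \end{aligned} $$

Consequently

$$ \int_{\mathbb{T}^n} \Phi(D_P u(P, x))\, d\sigma_t = \int_{\mathbb{T}^n} \Phi(X)\, dX $$

for all $t \ge 0$. □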
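The role of the nonresonance condition (9.1) in the heuristic time average above can be checked numerically. The following is a minimal sketch, not part of the paper's argument: it follows the linear flow X(t) = Qt + X0 on the two-dimensional unit torus and compares the time average of a test function with its space average. The test function `phi`, the starting point `X0`, the two frequency vectors and the helper `time_average` are all hypothetical choices made for this illustration.

```python
import math

def time_average(phi, Q, X0, T, steps=200_000):
    """Midpoint-rule approximation of (1/T) * integral_0^T phi(X0 + Q t) dt."""
    dt = T / steps
    total = 0.0
    for j in range(steps):
        t = (j + 0.5) * dt
        # Position on the unit torus at time t.
        X = [(x0 + q * t) % 1.0 for x0, q in zip(X0, Q)]
        total += phi(X)
    return total / steps

def phi(X):
    # Continuous, 1-periodic test function; its mean over the unit torus is 1.
    return (1.0
            + math.cos(2.0 * math.pi * (X[0] - X[1]))
            + math.sin(2.0 * math.pi * X[0]))

X0 = [0.1, 0.7]
nonresonant = [1.0, math.sqrt(2.0)]  # Q.k != 0 for every nonzero integer vector k
resonant = [1.0, 1.0]                # Q.k = 0 for k = (1, -1)

avg_nonres = time_average(phi, nonresonant, X0, T=2000.0)
avg_res = time_average(phi, resonant, X0, T=2000.0)
```

With the nonresonant frequency the time average approaches the space average 1, whereas for the resonant frequency the difference X[0] − X[1] is conserved along the flow and the time average depends on the starting point.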


References

[Ar] M. Arisawa, Multiscale homogenization for first-order Hamilton–Jacobi–Bellman equations, Advances in Differential Equations (to appear).

[A] S. Aubry, The twist map, the extended Frenkel–Kontorova model and the devil’s staircase, Physica D 7 (1983), 240–258.

[B1] V. Bangert, Minimal geodesics, Ergodic Theory and Dyn. Systems 10 (1989), 263–286.

[B2] V. Bangert, Geodesic rays, Busemann functions and monotone twist maps, Calculus of Variations

2 (1994), 49–63.

[B-S] G. Barles and P. E. Souganidis, On the long time behavior of solutions of Hamilton–Jacobi

equations, J. Math Analysis (to appear).

[B-D] A. Braides and A. Defranceschi, Homogenization of Multiple Integrals, Oxford Univ. Press, 1998.

[C-D] W. Chou and R. J. Duffin, An additive eigenvalue problem of physics related to linear programming, Advances in Appl. Math 8 (1987), 486–498.

[Cl] F. Clarke, Optimization and Nonsmooth Analysis, Wiley-Interscience, 1983.

[C1] M. Concordel, Periodic homogenization of Hamilton-Jacobi equations I: additive eigenvalues and

variational formula, Indiana Univ. Math. J. 45 (1996), 1095–1117.

[C2] M. Concordel, Periodic homogenisation of Hamilton-Jacobi equations II: eikonal equations, Proc.

Roy. Soc. Edinburgh 127 (1997), 665–689.

[DC] M. J. Dias Carneiro, On minimizing measure of the action of autonomous Lagrangians, Nonlinearity 8 (1995), 1077–1085.

[EW1] Weinan E, A class of homogenization problems in the calculus of variations, Comm Pure and

Appl Math 44 (1991), 733–754.

[EW2] Weinan E, Aubry–Mather theory and periodic solutions of the forced Burgers equation, Comm

Pure and Appl Math 52 (1999), 811–828.

[E1] L. C. Evans, Weak Convergence Methods for Nonlinear Partial Differential Equations, American

Math Soc, 1990.

[E2] L. C. Evans, Periodic homogenization of certain fully nonlinear PDE, Proc Royal Society Edinburgh 120 (1992), 245–265.

[E-G2] L. C. Evans and D. Gomes, Effective Hamiltonians and Averaging for Hamiltonian Dynamics II,

to appear.

[F1] A. Fathi, Théorème KAM faible et théorie de Mather sur les systèmes lagrangiens, C. R. Acad. Sci. Paris Sér. I Math. 324 (1997), 1043–1046.

[F2] A. Fathi, Solutions KAM faibles conjuguées et barrières de Peierls, C. R. Acad. Sci. Paris Sér. I Math. 325 (1997), 649–652.

[F3] A. Fathi, Orbites hétéroclines et ensemble de Peierls, C. R. Acad. Sci. Paris Sér. I Math. 326 (1998), 1213–1216.

[F4] A. Fathi, Sur la convergence du semi-groupe de Lax-Oleinik, C. R. Acad. Sci. Paris Sér. I Math. 327 (1998), 267–270.

[F-M] A. Fathi and J. Mather, Failure of convergence of the Lax–Oleinik semigroup in the time–periodic

case, preprint (2000).

[Gd] H. Goldstein, Classical mechanics (2nd ed.), Addison-Wesley, 1980.

[G] D. Gomes, Hamilton–Jacobi Equations, Viscosity Solutions and Asymptotics of Hamiltonian

Systems, Ph.D. Thesis, University of California, Berkeley (2000).

[J-K-M] H. R. Jauslin, H. O. Kreiss and J. Moser, On the forced Burgers equation with periodic boundary

conditions, preprint (1998).

[L-P-V] P.-L. Lions, G. Papanicolaou, and S. R. S. Varadhan, Homogenization of Hamilton–Jacobi equations, unpublished, circa 1988.

[I] R. Iturriaga, Minimizing measures for time-dependent Lagrangians, Proc. London Math Society

73 (1996), 216–240.

[Mn1] R. Mane, Global Variational Methods in Conservative Dynamics, Instituto de Matematica Pura

e Aplicada, Rio de Janeiro.

[Mn2] R. Mane, On the minimizing measures of Lagrangian dynamical systems, Nonlinearity 5 (1992),

623–638.


[Mn3] R. Mane, Generic properties and problems of minimizing measures of Lagrangian systems, Nonlinearity 9 (1996), 273–310.

[Mt1] J. Mather, Minimal measures, Comment. Math Helvetici 64 (1989), 375–394.

[Mt2] J. Mather, Action minimizing invariant measures for positive definite Lagrangian systems, Math.

Zeitschrift 207 (1991), 169–207.

[Mt3] J. Mather, Differentiability of the minimal average action as a function of the rotation number,

Bol. Soc. Bras. Math 21 (1990), 59–70.

[Mt4] J. Mather, Variational construction of connecting orbits, Ann. Inst. Fourier, Grenoble 43 (1993),

1347–1386.

[M-F] J. Mather and G. Forni, Action minimizing orbits in Hamiltonian systems, Transition to Chaos in Classical and Quantum Mechanics, Lecture Notes in Math 1589 (S. Graffi, ed.), Springer, 1994.

[Ms] J. Moser, Recent developments in the theory of Hamiltonian systems, SIAM Review 28 (1986),

459–486.

[N-F] G. Namah and J.-M. Roquejoffre, Comportement asymptotique des solutions d’une classe d’équations paraboliques et de Hamilton–Jacobi, C. R. Acad. Sci. Paris Sér. I Math. 324 (1997), 1367–1370.

[N] R. Nussbaum, Convergence of iterates of a nonlinear operator arising in statistical mechanics,

Nonlinearity 4 (1991), 1223–1239.

[P] J. Poschel, Integrability of Hamiltonian systems on Cantor sets, Comm. Pure Appl. Math 35

(1982), 653–696.

[Rz] F. Rezakhanlou, Central limit theorem for stochastic Hamilton–Jacobi equations, preprint (1998).

[R] J.-M. Roquejoffre, Comportement asymptotique des solutions d’équations de Hamilton–Jacobi monodimensionnelles, C. R. Acad. Sci. Paris Sér. I Math. 326 (1998), 185–189.

[So1] A. N. Sobolevskii, Periodic solutions of the Hamilton–Jacobi equation with periodic forcing term,

Russian Math Surveys 53 (1998), 1375–1376.

[So2] A. N. Sobolevskii, Periodic solutions of the Hamilton–Jacobi equation with a periodic nonhomogeneous term and Aubry–Mather theory, Sbornik: Mathematics 190 (1999), 1487–1504.

[S] P. E. Souganidis, Stochastic homogenization of Hamilton–Jacobi equations and some applications,

Asymptotic Analysis 20 (1999), 1-11.
