Mathematical Physics, Analysis and Geometry - Volume 4

Mathematical Physics, Analysis and Geometry 4: 1–36, 2001. © 2001 Kluwer Academic Publishers. Printed in the Netherlands.

On the Law of Multiplication of Random Matrices

VLADIMIR VASILCHUK
Université Paris 7 Denis Diderot, Mathématiques, case 7012, Paris, France; Institute for Low Temperature Physics, 47 Lenin ave., 310164 Kharkov, Ukraine

(Received: 22 February 2001; in revised form: 2 April 2001)

Abstract. We recover Voiculescu’s results on multiplicative free convolutions of probability measures by different techniques, which were already developed by Pastur and Vasilchuk for the law of addition of random matrices. Namely, we study the normalized eigenvalue counting measure of the product of two $n \times n$ unitary matrices and the measure of the product of three $n \times n$ Hermitian (or real symmetric) positive matrices rotated independently by random unitary (or orthogonal) Haar distributed matrices. We establish the convergence in probability as $n \to \infty$ to a limiting nonrandom measure and obtain functional equations for the Herglotz and Stieltjes transforms of that limiting measure.

Mathematics Subject Classifications (2000): 15A52, 60B99, 60F05.

Key words: random matrices, counting measure, limit laws.

1. Introduction

This paper deals with the eigenvalue distribution of products of $n \times n$ random matrices. We consider two models: (i) the product of two unitary (resp. orthogonal) matrices and (ii) the product of three Hermitian (or real symmetric) positive matrices. We study the eigenvalue distribution of these two ensembles in the limit $n \to \infty$. Namely, we express the limiting normalized counting measure of eigenvalues of the product via the limits of the same counting measures of the corresponding factors. We assume that these limits exist and that the factors are randomly rotated with respect to one another by a unitary (or orthogonal) random matrix uniformly distributed over the group $U(n)$ (resp. $O(n)$).

In this paper, under weaker assumptions, we obtain the analog of Voiculescu’s results concerning the free multiplicative convolutions of probability measures on the real axis and the unit circle. They were studied within the context of free (noncommutative) probability theory introduced by Voiculescu at the beginning of the 90’s (see [2, 8, 10] for results and references). This theory deals with free random variables (operators in von Neumann algebras) that can be modeled by unitary (orthogonal) invariant random matrices [5, 6]. The notion of S-transform introduced in this theory allows one to generalize the functional equations for transforms of limiting counting measures of certain multiplicative unitary invariant models first proposed and studied by Marchenko and Pastur [4].

Motivation for studying multiplicative random matrix ensembles is given by the fact that they appear in some physics studies (see, e.g., [3]).

We use a simple method of deriving the functional equations for limiting eigenvalue distributions. It is a natural extension of the method proposed in [5] to study an additive analog of our ensembles. The basic idea is the same as in [4]: to study not the counting measure itself but rather some integral transforms that are generating functions of the moments of that measure. We derive functional relations for these transforms using the resolvent identity and differential identities for expectations of smooth functions with respect to the Haar measure of $U(n)$ (or $O(n)$).

The paper is organized as follows. In Section 2, we state and discuss the main results (Theorem 2.2 for unitary matrices and Theorem 2.1 for Hermitian positive matrices). In Section 3, we prove auxiliary Theorems 3.1 and 3.2 concerning Hermitian positive definite matrices under the condition that the fourth moments of the normalized counting measures of the factors are bounded uniformly in $n$. In Section 4, we use these results to prove Theorem 2.1. Here the main condition is the uniform boundedness of the second moment of the normalized counting measure of the factors. In Section 5 we prove Theorems 5.1 and 5.2 and then prove Theorem 2.2, giving the solution of the problem for unitary matrices. In Section 6 we prove the auxiliary facts that we need. We also describe generalizations of our results to the cases of orthogonal and real symmetric matrices.

2. Models and Main Results

We study two ensembles of $n \times n$ random matrices $V_n$ and $H_n$ of the form

$$V_n = V_{1,n} V_{2,n}, \tag{2.1}$$

where

$$V_{1,n} = W_n^* S_n W_n, \qquad V_{2,n} = U_n^* T_n U_n,$$

and

$$H_n = H_{1,n}^{1/2} H_{2,n} H_{1,n}^{1/2}, \tag{2.2}$$

where

$$H_{1,n} = W_n^* A_n W_n, \qquad H_{2,n} = U_n^* B_n U_n.$$

We assume that $S_n$, $T_n$, $A_n$, $B_n$, $U_n$ and $W_n$ are mutually independent. $S_n$ and $T_n$ are random unitary matrices, $U_n$ and $W_n$ are unitary (resp. orthogonal) random matrices uniformly distributed over the unitary (resp. orthogonal) group $U(n)$ (resp. $O(n)$) with respect to the Haar measure. $A_n$ and $B_n$ are Hermitian positive definite random matrices.

We will restrict ourselves to the case of Hermitian matrices (resp. to the group $U(n)$). The results for symmetric real matrices (i.e. for the group $O(n)$) are similar, although their proofs are more difficult (see Section 6).

We are interested in the asymptotic behavior, as $n \to \infty$, of the normalized eigenvalue counting measure (NCM) $\nu_n$ of the ensemble (2.1), whose value on any Borel set $\Delta \subset [0, 2\pi]$ is given by

$$\nu_n(\Delta) = \frac{\#\{\mu_i^{(n)} \in \Delta\}}{n}, \tag{2.3}$$

where $\mu_i^{(n)}$ are the eigenvalues of $V_n$. We are also interested in the asymptotic behavior of the NCM $N_n$ of the ensemble (2.2), whose value on any Borel set $\Delta \subset \mathbb{R}$ is given by

$$N_n(\Delta) = \frac{\#\{\lambda_i^{(n)} \in \Delta\}}{n}, \tag{2.4}$$

where $\lambda_i^{(n)}$ are the eigenvalues of $H_n$.

The problem was studied recently by Voiculescu [2, 8, 10] within the context of free (noncommutative) probability. Combining Voiculescu’s results on free multiplicative convolution of measures having nonzero first moment [2, 10] with results on asymptotic freeness of $n \times n$ Haar distributed unitary matrices and nonrandom diagonal matrices [6, 9, 10], one can easily obtain the following result:

PROPOSITION. If the matrices $S_n$, $T_n$, $A_n$ and $B_n$ are nonrandom, if the norms of $A_n$ and $B_n$ are uniformly bounded in $n$, i.e. their NCMs $N_{1,n}$ and $N_{2,n}$ have compact support uniformly in $n$, and if these measures have weak limits as $n \to \infty$

$$\nu_{1,n} \to \nu_1, \qquad \nu_{2,n} \to \nu_2, \tag{2.5}$$

$$N_{1,n} \to N_1, \qquad N_{2,n} \to N_2, \tag{2.6}$$

where $\nu_{1,n}$ and $\nu_{2,n}$ are the NCMs of $S_n$ and $T_n$, then the NCMs (2.3) and (2.4) of (2.1) and (2.2) converge weakly with probability 1 to nonrandom measures $\nu$ and $N$.

Here and below the convergence with probability 1 is understood as that in the natural probability spaces

$$\Omega = \prod_n \Omega_n, \qquad \widetilde{\Omega} = \prod_n \widetilde{\Omega}_n, \tag{2.7}$$

where $\Omega_n$ is the probability space of matrices (2.1), that is, the product of the respective spaces of $S_n$ and $T_n$ and two copies of the group $U(n)$ for $U_n$ and $W_n$, and where $\widetilde{\Omega}_n$ is the probability space of matrices (2.2), that is, the product of the respective spaces of $A_n$ and $B_n$ and two copies of the group $U(n)$ for $U_n$ and $W_n$.

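Besides, according to [10], one can define the S-transforms $S_{\nu_1}$, $S_{\nu_2}$ and $S_{\nu}$ of the measures $\nu_1$, $\nu_2$ and $\nu$, respectively, and the S-transforms $S_{N_1}$, $S_{N_2}$ and $S_{N}$ of the measures $N_1$, $N_2$ and $N$ (see our remark after Theorem 2.2), and one can find the following simple expressions of $S_{\nu}$ and $S_{N}$ via $S_{\nu_{1,2}}$ and $S_{N_{1,2}}$:

$$S_{\nu} = S_{\nu_1} S_{\nu_2}, \qquad S_{N} = S_{N_1} S_{N_2}.$$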

The proof of the asymptotic freeness of $n \times n$ Haar distributed unitary matrices and nonrandom diagonal matrices having, uniformly in $n$, compactly supported NCMs in [6, 8] is based on the asymptotic analysis of the expectations of normalized traces of mixed products of the matrices $S_n$, $T_n$, $U_n$ and $W_n$ and $A_n$, $B_n$, $U_n$ and $W_n$, respectively. It requires a considerable amount of combinatorial analysis, the existence of all moments of the measures $N_r$, $r = 1, 2$, and their rather regular behavior as $n \to \infty$ to obtain the convergence of expectations.

In this paper we obtain analogous results under weaker assumptions by a method that does not involve combinatorics. This is because we work with the Stieltjes transforms of the measures (2.4) and (2.6) and with the Herglotz transforms of the measures (2.3) and (2.5). We directly derive functional equations for their limits by using simple identities for expectations of matrix-valued functions with respect to the Haar measure (Proposition 3.2 below) and elementary facts concerning resolvents of Hermitian and unitary matrices. This method was already used in [5] for the additive random ensembles.

We list below the properties of the Stieltjes and Herglotz transforms that we will need (see, e.g., [1]).

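PROPOSITION 2.1. Let

$$s(z) = \int_{\mathbb{R}} \frac{m(d\lambda)}{\lambda - z}, \qquad \mathrm{Im}\, z \neq 0 \tag{2.8}$$

be the Stieltjes transform of a probability measure $m$ on $\mathbb{R}$. Then

(i) $s(z)$ is analytic in $\mathbb{C} \setminus \mathbb{R}$ and

$$|s(z)| \le |\mathrm{Im}\, z|^{-1}. \tag{2.9}$$

(ii) $$\mathrm{Im}\, s(z)\, \mathrm{Im}\, z > 0, \qquad \mathrm{Im}\, z \neq 0. \tag{2.10}$$

(iii) $$\lim_{y \to \infty} y\, |s(iy)| = 1. \tag{2.11}$$

(iv) For any continuous function $\varphi$ with compact support we have the Frobenius–Perron inversion formula

$$\int_{\mathbb{R}} \varphi(\lambda)\, m(d\lambda) = \lim_{\varepsilon \to 0} \frac{1}{\pi} \int_{\mathbb{R}} \varphi(\lambda)\, \mathrm{Im}\, s(\lambda + i\varepsilon)\, d\lambda. \tag{2.12}$$

(v) Conversely, any function satisfying (2.9)–(2.11) is the Stieltjes transform of a probability measure, and this one-to-one correspondence between measures and their Stieltjes transforms is continuous for the topology of weak convergence for measures and for the topology of convergence on compact subsets of $\mathbb{C} \setminus \mathbb{R}$ for the Stieltjes transforms.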
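As a quick numerical illustration of Proposition 2.1, the following sketch evaluates the Stieltjes transform (2.8) of a discrete probability measure and checks the bound (2.9), the sign condition (2.10) and the normalization (2.11). The atoms, weights and helper names are illustrative assumptions and do not come from the paper.

```python
# Sketch: Stieltjes transform of a discrete probability measure.
# The atoms and weights below are hypothetical, chosen only for illustration.

def stieltjes(atoms, weights, z):
    """s(z) = sum_k w_k / (lambda_k - z) for m = sum_k w_k * delta_{lambda_k}."""
    return sum(w / (lam - z) for lam, w in zip(atoms, weights))

atoms = [0.5, 1.0, 3.0]      # support points of m (assumed)
weights = [0.2, 0.5, 0.3]    # weights of m (assumed), summing to 1

def check_bounds(z):
    """Return True if (2.9) and (2.10) hold at the point z."""
    s = stieltjes(atoms, weights, z)
    bound_ok = abs(s) <= 1.0 / abs(z.imag) + 1e-12   # (2.9)
    sign_ok = s.imag * z.imag > 0                    # (2.10)
    return bound_ok and sign_ok

def normalization(y):
    """y * |s(iy)|, which tends to 1 as y grows, cf. (2.11)."""
    return y * abs(stieltjes(atoms, weights, 1j * y))
```

Calling `check_bounds` at a few nonreal points and `normalization` at a large `y` gives a simple sanity check of the three properties for this particular measure.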


PROPOSITION 2.2. Let

$$t(z) = \int_0^{2\pi} \frac{e^{i\theta} + z}{e^{i\theta} - z}\, \mu(d\theta), \qquad |z| < 1 \tag{2.13}$$

be the Herglotz transform of a probability measure $\mu$ on $[0, 2\pi]$. Then

(i) $t(z)$ is analytic for $|z| < 1$ and

$$|t(z) - 1| \le 2|z|(1 - |z|)^{-1}, \qquad |z| < 1. \tag{2.14}$$

(ii) $$t(0) = 1, \qquad \mathrm{Re}\, t(z) > 0, \qquad |z| < 1. \tag{2.15}$$

(iii) For any function $\varphi$ continuous on $[0, 2\pi]$ we have the inversion formula

$$\int_0^{2\pi} \varphi(\theta)\, \mu(d\theta) = \lim_{r \to 1^-} \frac{1}{2\pi} \int_0^{2\pi} \varphi(\theta)\, \mathrm{Re}\, t(re^{-i\theta})\, d\theta. \tag{2.16}$$

(iv) Conversely, any function satisfying (2.14)–(2.15) is the Herglotz transform of a probability measure on the unit circle. This one-to-one correspondence between measures and their Herglotz transforms is continuous for the topology of weak convergence for measures and for the topology of convergence on compact subsets of $\{z \in \mathbb{C} \mid |z| < 1\}$ for the Herglotz transforms.
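The properties (2.14) and (2.15) can be checked numerically in the same spirit. The sketch below evaluates the Herglotz transform (2.13) of a discrete probability measure on the circle; the angles, weights and function names are illustrative assumptions.

```python
import cmath

# Sketch: Herglotz transform of a discrete probability measure on [0, 2*pi].
# The angles and weights are hypothetical and serve only as an example.

def herglotz(angles, weights, z):
    """t(z) = sum_k w_k (e^{i theta_k} + z) / (e^{i theta_k} - z)."""
    total = 0j
    for theta, w in zip(angles, weights):
        e = cmath.exp(1j * theta)
        total += w * (e + z) / (e - z)
    return total

angles = [0.3, 2.0, 4.5]       # atoms of mu (assumed)
weights = [0.25, 0.25, 0.5]    # weights of mu (assumed), summing to 1

def check_herglotz(z):
    """Return True if (2.14) and the positivity in (2.15) hold at z, |z| < 1."""
    t = herglotz(angles, weights, z)
    r = abs(z)
    return abs(t - 1) <= 2 * r / (1 - r) + 1e-12 and t.real > 0
```

The value `herglotz(angles, weights, 0)` equals the total mass of the measure, which is the normalization $t(0) = 1$ in (2.15).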

Now we state our main results. Since the set of eigenvalues of unitary and Hermitian matrices is unitary invariant, we can replace the matrices (2.1) and (2.2) by

$$V_n = S_n U_n^* T_n U_n \tag{2.17}$$

and

$$H_n = A_n^{1/2} U_n^* B_n U_n A_n^{1/2}, \tag{2.18}$$

where $S_n$, $T_n$, $A_n$, $B_n$ and $U_n$ are as in (2.1) and (2.2). However, it is useful to keep in mind that the problem is symmetric in $S_n$ and $T_n$ and in $A_n$ and $B_n$ (as we will see below).

THEOREM 2.1. Let $H_n$ be a positive definite random $n \times n$ matrix of the form (2.2). Assume that the normalized counting measures $N_{1,n}$, $N_{2,n}$ of $A_n$, $B_n$ converge weakly in probability as $n \to \infty$ to nonrandom probability measures $N_1$, $N_2$. We also assume

$$\sup_n \int_0^{+\infty} \lambda^2\, E\{N_{r,n}(d\lambda)\} \le m^{(2)} < \infty, \qquad r = 1, 2, \tag{2.19}$$

and

$$m_r = \int_0^{+\infty} \lambda\, N_r(d\lambda) > 0, \qquad r = 1, 2, \tag{2.20}$$

i.e. the measures $N_1$, $N_2$ are not concentrated at zero. Then the normalized counting measure $N_n$ of $H_n$ converges in probability to a nonrandom probability measure $N$ whose Stieltjes transform

$$f(z) = \int_0^{+\infty} \frac{N(d\lambda)}{\lambda - z}, \qquad \mathrm{Im}\, z \neq 0 \tag{2.21}$$

is the unique solution of the system

$$\begin{aligned}
f(z)(1 + zf(z)) &= \Delta_1(z)\Delta_2(z), \\
\Delta_1(z) &= f_2\!\left(\frac{z\Delta_2(z)}{1 + zf(z)}\right), \\
\Delta_2(z) &= f_1\!\left(\frac{z\Delta_1(z)}{1 + zf(z)}\right)
\end{aligned} \tag{2.22}$$

in the class of functions $f(z)$, $\Delta_{1,2}(z)$ which are analytic for $\mathrm{Im}\, z \neq 0$ and which satisfy (2.9)–(2.11) and

$$z\Delta_r(z) = -m_r + O(|\mathrm{Im}\, z|^{-1}), \quad r = 1, 2, \qquad |\mathrm{Re}\, z| \le |\mathrm{Im}\, z|, \quad z \to \infty. \tag{2.23}$$

Here $f_1(z)$, $f_2(z)$ are the Stieltjes transforms of $N_1$, $N_2$, and $E\{\,\cdot\,\}$ denotes the expectation with respect to the probability measure generated by $A_n$, $B_n$, $U_n$ and $W_n$.
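The system (2.22) can be checked by hand in the simplest case, which we add here only as an illustration: if $A_n = aI$ and $B_n = bI$ with $a, b > 0$, then $H_n = abI$, so $N_1 = \delta_a$, $N_2 = \delta_b$, $N = \delta_{ab}$, and one expects $f(z) = (ab - z)^{-1}$, $\Delta_1(z) = a(ab - z)^{-1}$, $\Delta_2(z) = b(ab - z)^{-1}$. The sketch below verifies numerically that these functions satisfy (2.22) and the asymptotics (2.23); the function names and test values are assumptions of this example.

```python
# Sketch: check of the system (2.22) for point masses N_1 = delta_a, N_2 = delta_b.
# The candidate solution below is an assumption of this example, derived from H_n = ab*I.

def point_mass_solution(a, b, z):
    """Candidate f, Delta_1, Delta_2 for N_1 = delta_a, N_2 = delta_b."""
    f = 1.0 / (a * b - z)
    delta1 = a / (a * b - z)
    delta2 = b / (a * b - z)
    return f, delta1, delta2

def residuals(a, b, z):
    """Absolute residuals of the three equations of (2.22)."""
    f1 = lambda w: 1.0 / (a - w)   # Stieltjes transform of delta_a
    f2 = lambda w: 1.0 / (b - w)   # Stieltjes transform of delta_b
    f, delta1, delta2 = point_mass_solution(a, b, z)
    r1 = f * (1 + z * f) - delta1 * delta2
    r2 = delta1 - f2(z * delta2 / (1 + z * f))
    r3 = delta2 - f1(z * delta1 / (1 + z * f))
    return abs(r1), abs(r2), abs(r3)

def asymptotic_gap(a, b, z):
    """|z * Delta_1(z) + m_1| with m_1 = a; small for large |z|, cf. (2.23)."""
    _, delta1, _ = point_mass_solution(a, b, z)
    return abs(z * delta1 + a)
```

For instance, `residuals(2.0, 3.0, 1 + 2j)` should vanish up to rounding errors, and `asymptotic_gap(2.0, 3.0, 1e6j)` should be of order $|z|^{-1}$.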

THEOREM 2.2. Let $V_n$ be a random $n \times n$ matrix of the form (2.1). Assume that the normalized counting measures $\nu_{1,n}$, $\nu_{2,n}$ of $S_n$, $T_n$ converge weakly in probability as $n \to \infty$ to nonrandom probability measures $\nu_1$, $\nu_2$ on the unit circle. Then the normalized counting measure $\nu_n$ of $V_n$ converges in probability to a nonrandom probability measure $\nu$ whose Herglotz transform

$$h(z) = \int_0^{2\pi} \frac{e^{i\mu} + z}{e^{i\mu} - z}\, \nu(d\mu), \qquad |z| < 1 \tag{2.24}$$

is the unique solution of the system

$$\begin{aligned}
h^2(z) &= 1 + 4z\Delta_1(z)\Delta_2(z), \\
h(z) &= h_2\!\left(\frac{2z\Delta_2(z)}{1 + h(z)}\right), \\
h(z) &= h_1\!\left(\frac{2z\Delta_1(z)}{1 + h(z)}\right)
\end{aligned} \tag{2.25}$$

in the class of functions $h(z)$, $\Delta_{1,2}(z)$ which are analytic for $|z| < 1$ and which satisfy (2.14)–(2.15) and

$$|\Delta_{1,2}(z)| \le (1 - |z|)^{-1}, \qquad |z| < 1. \tag{2.26}$$

Here $h_{1,2}(z)$ are the Herglotz transforms of the measures $\nu_{1,2}$.
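As an illustration that is not part of the original argument, consider $S_n = e^{i\alpha}I$ and $T_n = e^{i\beta}I$, so that $V_n = e^{i(\alpha+\beta)}I$ and $\nu = \delta_{\alpha+\beta}$. With $w = e^{i(\alpha+\beta)}$, one can check directly that $h(z) = (w+z)/(w-z)$, $\Delta_1(z) = e^{i\alpha}/(w-z)$ and $\Delta_2(z) = e^{i\beta}/(w-z)$ satisfy (2.25) and (2.26). The sketch below does this numerically; the candidate functions and all names are assumptions of the example.

```python
import cmath

# Sketch: check of the system (2.25) for point masses on the unit circle.
# The candidate h, Delta_1, Delta_2 are assumptions of this example.

def residuals_unitary(alpha, beta, z):
    """Residuals of (2.25) and the bound (2.26) for nu_1 = delta_alpha, nu_2 = delta_beta."""
    a, b = cmath.exp(1j * alpha), cmath.exp(1j * beta)
    w = a * b                                # e^{i(alpha + beta)}
    h1 = lambda u: (a + u) / (a - u)         # Herglotz transform of delta_alpha
    h2 = lambda u: (b + u) / (b - u)         # Herglotz transform of delta_beta
    h = (w + z) / (w - z)
    delta1 = a / (w - z)
    delta2 = b / (w - z)
    r1 = h * h - (1 + 4 * z * delta1 * delta2)
    r2 = h - h2(2 * z * delta2 / (1 + h))
    r3 = h - h1(2 * z * delta1 / (1 + h))
    bound_ok = max(abs(delta1), abs(delta2)) <= 1.0 / (1 - abs(z)) + 1e-12
    return abs(r1), abs(r2), abs(r3), bound_ok
```

The three residuals should vanish up to rounding for any $|z| < 1$, which agrees with the fact that the product of the two scalar unitary matrices has the single eigenvalue $e^{i(\alpha+\beta)}$.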


Both theorems will be proved in Sections 3–5. Here we interpret them in terms of the S-transform introduced by Voiculescu in the context of $C^*$-algebras.

2.1. VOICULESCU’S FORMULATION

Consider a probability measure $\mu$ on the unit circle and assume that its first moment is nonzero:

$$\mu_1 = \int_0^{2\pi} e^{i\theta}\, \mu(d\theta) \neq 0.$$

Consider the function

$$\varphi_\mu(z) = -\frac{1 + t(z^{-1})}{2},$$

where $t(z)$ is the Herglotz transform of $\mu$. Since $\varphi_\mu'(z) = \mu_1 + o(1)$, $z \to \infty$, then, according to the local inversion theorem, there exists a unique inverse function $\chi_\mu(\varphi)$ of $\varphi_\mu(z)$, $\chi_\mu(\varphi_\mu(z)) = z$, defined and analytic in a neighborhood of $-1$ and assuming its values in a neighborhood of infinity. On the other hand, for any probability measure $m$ on the real nonnegative semi-axis having nonzero first moment

$$m_1 = \int_0^{\infty} \lambda\, m(d\lambda) > 0,$$

we can consider the function

$$\varphi_m(z) = -(1 + z^{-1}s(z^{-1})),$$

where $s(z)$ is the Stieltjes transform of the measure $m$. Since $\varphi_m'(0) = m_1$, then, according to the local inversion theorem, $\varphi_m(z)$ also has a unique inverse function $\chi_m(\varphi)$ defined and analytic in a neighborhood of zero and assuming its values in a neighborhood of zero. Denote

$$S_\mu(\varphi) = \chi_\mu(\varphi)\varphi^{-1}(1 + \varphi), \qquad S_m(\varphi) = \chi_m(\varphi)\varphi^{-1}(1 + \varphi)$$

and, following Voiculescu [10], call $S_\mu(\varphi)$ the S-transform of the probability measure $\mu$ on the unit circle and $S_m(\varphi)$ the S-transform of the probability measure $m$ on the real nonnegative semi-axis. By using the S-transforms $S_1$, $S_2$ of $\nu_1$, $\nu_2$, we can rewrite (2.25) in the form

$$S(\psi(z)) = \frac{1 + \psi(z)}{z\Delta_1(z)} \cdot \frac{1 + \psi(z)}{z\Delta_2(z)}, \tag{2.27}$$

$$S_r(\psi(z)) = -\frac{1 + \psi(z)}{z\Delta_r(z)}, \qquad r = 1, 2, \tag{2.28}$$

where $\psi(z) = \varphi(z^{-1}) = -(1 + h(z))/2$ and $S(\psi)$ denotes the S-transform of the limiting normalized counting measure $\nu$ of (2.1), whose Herglotz transform is $h(z)$. Then we derive from (2.27) and (2.28) Voiculescu’s very simple expression of $S$ via $S_1$ and $S_2$:

$$S(\psi) = S_1(\psi)S_2(\psi). \tag{2.29}$$

It is easy to check that (2.22) leads to the relations (2.27) for $\psi(z) = -(1 + zf(z))$, where $S(\psi)$ now denotes the S-transform of the limiting normalized counting measure $N$ of (2.2), whose Stieltjes transform is $f(z)$. Thus, (2.22) leads to the same expression (2.29), where $S_{1,2}$ will be the S-transforms of the measures $N_{1,2}$. The relation (2.29) was obtained by Voiculescu in the context of $C^*$-algebra studies (see [9, 10] for results and references).
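To make the definitions of this subsection concrete, here is a short worked example that we add for illustration (it is not taken from the paper): the S-transform of a point mass $m = \delta_a$, $a > 0$, on the nonnegative semi-axis, computed directly from the formulas for $\varphi_m$, $\chi_m$ and $S_m$ above.

```latex
% Worked example (illustrative assumption: m = \delta_a with a > 0).
\begin{align*}
s(z) &= \frac{1}{a - z}, \\
\varphi_m(z) &= -\bigl(1 + z^{-1}s(z^{-1})\bigr)
             = -\Bigl(1 + \frac{1}{az - 1}\Bigr)
             = \frac{az}{1 - az},
             \qquad \varphi_m'(0) = a = m_1, \\
\chi_m(\varphi) &= \frac{\varphi}{a(1 + \varphi)}, \\
S_m(\varphi) &= \chi_m(\varphi)\,\varphi^{-1}(1 + \varphi) = \frac{1}{a}.
\end{align*}
% Hence S_{\delta_a} S_{\delta_b} = 1/(ab) = S_{\delta_{ab}}, in agreement with (2.29):
% for A_n = aI and B_n = bI the matrix (2.2) equals ab I.
```

In this degenerate case the multiplicativity (2.29) reduces to the ordinary multiplication of the scalars $a$ and $b$.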

3. Convergence with Probability 1 for Nonrandom An, Bn

In this section, we start the proof of Theorem 2.1. As a first step we prove the following theorem:

THEOREM 3.1. Let $H_n$ be a positive definite random $n \times n$ matrix of the form (2.2) in which $A_n$ and $B_n$ are nonrandom Hermitian positive matrices, and $U_n$ and $W_n$ are random independent unitary matrices distributed according to the Haar measure on $U(n)$. Assume that the normalized counting measures $N_{1,n}$, $N_{2,n}$ of $A_n$ and $B_n$ converge weakly as $n \to \infty$ to nonrandom probability measures $N_1$, $N_2$,

$$\lim_{n \to \infty} \int_0^{+\infty} \lambda\, N_{r,n}(d\lambda) = m_r = \int_0^{+\infty} \lambda\, N_r(d\lambda) > 0, \qquad r = 1, 2, \tag{3.1}$$

$$\sup_n \int_0^{+\infty} \lambda^4\, N_{r,n}(d\lambda) \le m_4 < \infty, \qquad r = 1, 2. \tag{3.2}$$

Then the normalized counting measure $N_n$ of $H_n$ converges with probability 1 to a nonrandom probability measure $N$ whose Stieltjes transform $f(z)$ (2.21) is the unique solution of (2.22) in the class of functions $f(z)$, $\Delta_{1,2}(z)$ which are analytic for $z \in \mathbb{C} \setminus \mathbb{R}_+$ and which satisfy (2.9)–(2.11) and (2.23).

We use the technique introduced in [5]. Let us recall its basic means. First we collect elementary facts of linear algebra.

PROPOSITION 3.1. Let $M_n$ be the algebra of linear endomorphisms of $\mathbb{C}^n$ equipped with the norm induced by the standard Euclidean norm of $\mathbb{C}^n$. Then

(i) If $\{M_{jk}\}_{j,k=1}^n$ is the matrix of $M \in M_n$ in any orthonormal basis of $\mathbb{C}^n$, then

$$|M_{jk}| \le \|M\|. \tag{3.3}$$

(ii) If $\mathrm{Tr}\, M$ denotes the trace of $M \in M_n$, then

$$|\mathrm{Tr}\, M| \le n\|M\|, \qquad |\mathrm{Tr}\, M_1M_2|^2 \le \mathrm{Tr}\, M_1M_1^*\; \mathrm{Tr}\, M_2M_2^*, \tag{3.4}$$

where $M^*$ is the adjoint of $M$. Furthermore, if $P \in M_n$ is positive definite, then

$$|\mathrm{Tr}\, MP| \le \|M\|\, \mathrm{Tr}\, P. \tag{3.5}$$

(iii) Let

$$G(z) = (M - z)^{-1} \tag{3.6}$$

be the resolvent of $M \in M_n$. It is defined for all nonreal $z$, $\mathrm{Im}\, z \neq 0$, if $M$ is Hermitian. It is defined for all $z$, $|z| \neq 1$, if $M$ is unitary.

(iv) If $G(z_1)$ and $G(z_2)$ are defined, then

$$G(z_1) - G(z_2) = (z_1 - z_2)G(z_1)G(z_2). \tag{3.7}$$

(v) If $M \in M_n$ is Hermitian and $\mathrm{Im}\, z \neq 0$, then

$$\|G(z)\| \le |\mathrm{Im}\, z|^{-1}. \tag{3.8}$$

(vi) If $M \in M_n$ is invertible and if $\{G_{jk}(z)\}_{j,k=1}^n$ is the matrix of its resolvent $G(z)$ in any orthonormal basis, then

$$|G_{jk}(z)| \le \frac{\|M^{-1}\|}{1 - |z|\, \|M^{-1}\|}. \tag{3.9}$$

(vii) If $M_1, M_2 \in M_n$, their resolvents $G_1(z)$, $G_2(z)$ satisfy the “resolvent identity”

$$G_2(z) = G_1(z) - G_1(z)(M_2 - M_1)G_2(z). \tag{3.10}$$

(viii) The differential $G'(z)$ of the resolvent $G(z) = (M - z)^{-1}$ viewed as a function of $M$ satisfies

$$G'(z) \cdot X = -G(z)XG(z) \tag{3.11}$$

for any $X \in M_n$. In particular,

$$\|G'(z)\| \le \|G(z)\|^2 \le |\mathrm{Im}\, z|^{-2}. \tag{3.12}$$
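The identities (3.7) and (3.10) and the entrywise consequence of (3.3) and (3.8) are easy to test numerically. The following sketch does so for $2 \times 2$ Hermitian matrices using only explicit $2 \times 2$ arithmetic; the matrices, spectral parameters and helper names are arbitrary choices made for this illustration.

```python
# Sketch: numerical check of (3.7), (3.10) and of |G_jk(z)| <= 1/|Im z| for 2 x 2 matrices.
# M1, M2, z1, z2 are arbitrary illustrative choices, not taken from the paper.

def mul(p, q):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sub(p, q):
    """Entrywise difference p - q."""
    return [[p[i][j] - q[i][j] for j in range(2)] for i in range(2)]

def scale(c, p):
    """Scalar multiple c * p."""
    return [[c * p[i][j] for j in range(2)] for i in range(2)]

def resolvent(m, z):
    """G(z) = (M - z)^{-1}, cf. (3.6), via the explicit 2 x 2 inverse."""
    a, b, c, d = m[0][0] - z, m[0][1], m[1][0], m[1][1] - z
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def max_abs(p):
    """Largest modulus of an entry."""
    return max(abs(x) for row in p for x in row)

M1 = [[2.0, 1 - 1j], [1 + 1j, -1.0]]   # Hermitian
M2 = [[0.5, 2j], [-2j, 3.0]]           # Hermitian
z1, z2 = 0.3 + 1.0j, -1.0 + 2.0j

# (3.7): G(z1) - G(z2) = (z1 - z2) G(z1) G(z2) for the same matrix M1.
Ga, Gb = resolvent(M1, z1), resolvent(M1, z2)
err_37 = max_abs(sub(sub(Ga, Gb), scale(z1 - z2, mul(Ga, Gb))))

# (3.10): G2 = G1 - G1 (M2 - M1) G2 at the same spectral parameter z1.
G1, G2 = resolvent(M1, z1), resolvent(M2, z1)
err_310 = max_abs(sub(G2, sub(G1, mul(mul(G1, sub(M2, M1)), G2))))

# (3.3) combined with (3.8): every entry of G1 is bounded by 1/|Im z1|.
entry_bound_ok = max_abs(G1) <= 1.0 / abs(z1.imag)
```

Both `err_37` and `err_310` should be at the level of rounding errors, and `entry_bound_ok` should be true for any Hermitian matrix and nonreal spectral parameter.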

Now we present the main technical tool.

PROPOSITION 3.2 ([5]). Let $\Phi \colon M_n \to \mathbb{C}$ be of class $C^1$. Then, for any $M \in M_n$ and any Hermitian $X \in M_n$:

$$\int_{U(n)} \Phi'(U^*MU) \cdot [X, U^*MU]\, dU = 0, \tag{3.13}$$

where $[M_1, M_2]$ denotes the commutator $M_1M_2 - M_2M_1$ and the integral denotes integration over $U(n)$ with respect to the Haar measure.

Proof. Cf. [5], Proposition 3.2. By the invariance of the Haar measure, the integral

$$\int_{U(n)} \Phi(e^{i\varepsilon X}U^*MUe^{-i\varepsilon X})\, dU$$

does not depend on $\varepsilon$. The derivative at $\varepsilon = 0$ gives (3.13). ✷

PROPOSITION 3.3. The system (2.22) has a unique solution in the class of functions $f(z)$, $\Delta_{1,2}(z)$ which are analytic for $\mathrm{Im}\, z \neq 0$ and which satisfy (2.9)–(2.11) and (2.23).

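Proof. Assume that there exist two solutions $(f', \Delta'_{1,2})$ and $(f'', \Delta''_{1,2})$ of (2.22). Denote $\delta f = f' - f''$, $\delta\Delta_{1,2} = \Delta'_{1,2} - \Delta''_{1,2}$. Then, by using (2.22) and the following relations

$$f_r(z) = -z^{-1} + z^{-1} \int_0^{+\infty} \frac{\lambda\, N_r(d\lambda)}{\lambda - z}, \qquad r = 1, 2,$$

we obtain a linear system for $\delta\varphi = z\delta f$, $\delta\Delta_1$, $\delta\Delta_2$:

$$\begin{aligned}
(1 + a_1(z))\delta\varphi - b_1(z)\delta\Delta_1 - c_1(z)\delta\Delta_2 &= 0, \\
(1 + a_2(z))\delta\varphi - b_2(z)\delta\Delta_2 &= 0, \\
(1 + a_3(z))\delta\varphi - b_3(z)\delta\Delta_1 &= 0,
\end{aligned} \tag{3.14}$$

where

$$a_1 = zf' + zf'', \qquad b_1 = z\Delta_2'', \qquad c_1 = z\Delta_1', \tag{3.15}$$

$$a_2 = \frac{z\Delta_2'\, J_2(s_2', s_2'')}{(1 + zf')(1 + zf'')}, \qquad b_2 = \frac{z\, J_2(s_2', s_2'')}{1 + zf'}, \qquad s_2^{\prime,\prime\prime} = \frac{z\Delta_2^{\prime,\prime\prime}}{1 + zf^{\prime,\prime\prime}},$$

$$J_2(z', z'') = \int_0^{+\infty} \frac{\lambda\, N_2(d\lambda)}{(\lambda - z')(\lambda - z'')} \tag{3.16}$$

and $a_3$, $b_3$ can be obtained from $a_2$ and $b_2$ by replacing $N_2$ and $\Delta_2'$ by $N_1$ and $\Delta_1''$ in the above formulas.

For any $y_0 > 0$, consider the domain

$$E(y_0) = \{z \in \mathbb{C} \mid |\mathrm{Im}\, z| \ge y_0,\ |\mathrm{Re}\, z| \le |\mathrm{Im}\, z|\}. \tag{3.17}$$

Due to condition (2.23) and the first equation of the system (2.22), we have for $z \in E(y_0)$

$$1 + zf^{\prime,\prime\prime}(z) = -z^{-1}\bigl(m_1m_2 + O(|\mathrm{Im}\, z|^{-1})\bigr), \qquad z \to \infty. \tag{3.18}$$

Besides, if

$$t(z', z'') = \int_0^{+\infty} \frac{\lambda\, m(d\lambda)}{(\lambda - z')(\lambda - z'')}$$

and $m$ is a probability measure having finite second moment, then we have for $z', z'' \in E(y_0)$

$$\left| z'z''t(z', z'') - \int_0^{+\infty} \lambda\, m(d\lambda) \right| = \left| \int_0^{+\infty} \frac{\lambda^2(z' + z'' - \lambda)\, m(d\lambda)}{(\lambda - z')(\lambda - z'')} \right| \le 6y_0^{-1} \int_0^{+\infty} \lambda^2\, m(d\lambda),$$

i.e.

$$z'z''t(z', z'') = \int_0^{+\infty} \lambda\, m(d\lambda) + O(|\mathrm{Im}\, z|^{-1}), \qquad z', z'' \to \infty, \quad z', z'' \in E(y_0).$$

Thus, from the relation above, (3.15), (3.16), (3.18) and condition (2.23), we obtain that for $z \to \infty$, $z \in E(y_0)$,

$$s_r'(z)s_r''(z)J_r(s_r'(z), s_r''(z)) = m_r + o(1), \qquad r = 1, 2.$$

Hence

$$a_1(z) = -2 + o(1), \quad a_{2,3}(z) = -1 + o(1), \quad b_1(z) = m_2 + o(1),$$

$$c_1(z) = m_1 + o(1), \quad b_2(z) = m_1 + o(1), \quad b_3(z) = m_2 + o(1).$$

Thus the determinant $-(1 + a_1)b_2b_3 + b_1b_2(1 + a_3) + c_1b_3(1 + a_2)$ of the system (3.14) is asymptotically equal to $m_1m_2 > 0$. We conclude that if $y_0$ is sufficiently large in (3.17), then (3.14) has only the trivial solution, i.e. (2.22) is uniquely soluble. ✷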
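The determinant formula used at the end of the proof can be verified mechanically. The sketch below builds the coefficient matrix of the linear system for $(\delta\varphi, \delta\Delta_1, \delta\Delta_2)$, compares its determinant with the closed-form expression, and evaluates it at the leading-order asymptotic values of the coefficients; the function names and the sample numbers are assumptions of this illustration.

```python
# Sketch: determinant of the 3 x 3 linear system for (delta_phi, delta_Delta_1, delta_Delta_2).
# Function names and numerical values are illustrative assumptions.

def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def system_matrix(a1, a2, a3, b1, b2, b3, c1):
    """Rows are the three equations; columns are delta_phi, delta_Delta_1, delta_Delta_2."""
    return [[1 + a1, -b1, -c1],
            [1 + a2, 0.0, -b2],
            [1 + a3, -b3, 0.0]]

def closed_form(a1, a2, a3, b1, b2, b3, c1):
    """The expression -(1 + a1) b2 b3 + b1 b2 (1 + a3) + c1 b3 (1 + a2)."""
    return -(1 + a1) * b2 * b3 + b1 * b2 * (1 + a3) + c1 * b3 * (1 + a2)

def asymptotic_determinant(m1, m2):
    """Determinant at the leading-order values a1 = -2, a2 = a3 = -1, etc."""
    return det3(system_matrix(-2.0, -1.0, -1.0, m2, m1, m2, m1))
```

At the leading-order values the second and third terms vanish because $1 + a_{2,3} = o(1)$, and the first term gives $b_2b_3 = m_1m_2$, as stated in the proof.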

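Proof of Theorem 3.1. Due to the unitary invariance of the eigenvalues of Hermitian matrices, we can assume without loss of generality that $W = I$ in (2.2), i.e. we can work with the random matrix (2.18). We will omit below the subindex $n$ in all cases when it will not lead to confusion. Consider the matrices

$$\widetilde{H}_n = H_{1,n}H_{2,n}, \qquad \widehat{H}_n = H_{2,n}^{1/2}H_{1,n}H_{2,n}^{1/2} \tag{3.19}$$

and their resolvents

$$\widetilde{G}(z) = (\widetilde{H}_n - z)^{-1} = H_{1,n}^{1/2}G(z)H_{1,n}^{-1/2}, \tag{3.20}$$

$$\widehat{G}(z) = (\widehat{H}_n - z)^{-1} = H_{2,n}^{1/2}\widetilde{G}(z)H_{2,n}^{-1/2} = H_{2,n}^{1/2}H_{1,n}^{1/2}G(z)H_{1,n}^{-1/2}H_{2,n}^{-1/2}, \tag{3.21}$$

where $G(z) = (H_n - z)^{-1}$ is the resolvent of (2.2). Because of the trace property we derive from (3.20)

$$n^{-1}\mathrm{Tr}\, G(z) = n^{-1}\mathrm{Tr}\, \widetilde{G}(z) = n^{-1}\mathrm{Tr}\, \widehat{G}(z) = \int_0^{+\infty} \frac{N_n(d\lambda)}{\lambda - z} = g_n(z), \tag{3.22}$$

where $g_n(z)$ is the Stieltjes transform of the NCM of (2.2).

Consider the resolvent identity (3.10) for the pair $(\widetilde{H}, 0)$:

$$\widetilde{G}(z)H_1H_2 = z\widetilde{G}(z) + I. \tag{3.23}$$

Using Proposition 3.2 with

$$\Phi(M) = \bigl((H_1M - z)^{-1}\bigr)_{ab},$$

in view of (3.11), we obtain that

$$\langle (\widetilde{G}H_1[X, H_2]\widetilde{G})_{ab} \rangle = 0. \tag{3.24}$$

Choosing the matrix $X$ with only $X_{ab} \neq 0$,$^{7}$ we obtain

$$\langle (\widetilde{G}(z)H_1)_{aa}(H_2\widetilde{G}(z))_{bb} \rangle = \langle (\widetilde{G}(z)H_1H_2)_{aa}\widetilde{G}_{bb}(z) \rangle. \tag{3.25}$$

Applying to this quantity the operation $n^{-2}\sum_{a,b=1}^n$ and taking into account (3.23), we obtain the relation

$$\langle \delta_{1,n}(z)\delta_{2,n}(z) \rangle = \langle g_n(z) \rangle + \langle zg_n(z)g_n(z) \rangle, \tag{3.26}$$

where

$$\delta_{1,n}(z) = n^{-1}\mathrm{Tr}\, \widetilde{G}(z)H_1 = n^{-1}\mathrm{Tr}\, G(z)H_1, \tag{3.27}$$

$$\delta_{2,n}(z) = n^{-1}\mathrm{Tr}\, \widetilde{G}(z)H_2 = n^{-1}\mathrm{Tr}\, \widehat{G}(z)H_2. \tag{3.28}$$

Introduce now the centered quantities

$$g_n^\circ(z) = g_n(z) - f_n(z), \qquad \delta_{r,n}^\circ(z) = \delta_{r,n}(z) - \Delta_{r,n}(z), \quad r = 1, 2, \tag{3.29}$$

where

$$f_n(z) = \langle g_n(z) \rangle, \qquad \Delta_{r,n}(z) = \langle \delta_{r,n}(z) \rangle. \tag{3.30}$$

With these notations, (3.26) becomes

$$(1 + zf_n(z))f_n(z) = \Delta_{1,n}(z)\Delta_{2,n}(z) + r_{1,n}(z), \tag{3.31}$$

$^{7}$ In fact, in this relation (and in similar relations below) we can substitute only Hermitian matrices, e.g. $(X^{(r)})_{pq} = (\delta_{a,p}\delta_{b,q} + \delta_{a,q}\delta_{b,p})/2$ or $(X^{(i)})_{pq} = -i(\delta_{a,p}\delta_{b,q} - \delta_{a,q}\delta_{b,p})/2$ (here $\delta$ is the Kronecker symbol). Nevertheless, adding the relations obtained, we get relation (3.24) with $(X)_{pq} = (X^{(r)} + iX^{(i)})_{pq} = \delta_{a,p}\delta_{b,q}$.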


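where

$$r_{1,n}(z) = \langle \delta_{1,n}^\circ(z)\delta_{2,n}(z) \rangle - \langle g_n^\circ(z)zg_n(z) \rangle. \tag{3.32}$$

In the next Theorem 3.2, we show that there exist $y_0 > 0$ and $C(y_0) > 0$, both independent of $n$, such that for $z \in E(y_0)$ (3.17) the variances

$$v_1(z) = \langle |g_n^\circ(z)|^2 \rangle, \qquad v_2(z) = \langle |\delta_{1,n}^\circ(z)|^2 \rangle, \qquad v_3(z) = \langle |\delta_{2,n}^\circ(z)|^2 \rangle$$

satisfy

$$v_1(z) \le \frac{C(y_0)}{n^2}, \qquad v_2(z) \le \frac{C(y_0)}{n^2}, \qquad v_3(z) \le \frac{C(y_0)}{n^2}. \tag{3.33}$$

In addition, according to Proposition 3.1, (3.22), (3.27) and (3.23), we have the estimates

$$|g_n(z)| \le |\mathrm{Im}\, z|^{-1}, \qquad |\delta_{r,n}(z)| \le m_4^{1/4}|\mathrm{Im}\, z|^{-1}, \qquad \mathrm{Im}\, z \neq 0, \quad r = 1, 2,$$

and

$$|zg_n(z)| \le 1 + m_4^{1/2}|\mathrm{Im}\, z|^{-1}, \qquad \mathrm{Im}\, z \neq 0.$$

These bounds and the Cauchy–Schwarz inequality for the average $\langle\, \cdot\, \rangle$ imply that uniformly in $n$ for $z \in E(y_0)$

$$|r_{1,n}(z)| \le v_2^{1/2}v_3^{1/2} + (1 + m_4^{1/2}y_0^{-1})v_1^{1/2} \le \frac{C(y_0)}{n^2} + \frac{\sqrt{C(y_0)}}{n}(1 + m_4^{1/2}y_0^{-1}) = O(n^{-1}). \tag{3.34}$$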

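Now consider the matrix

$$Y_1 = B^{-1}U\widetilde{G}(z)AU^*B. \tag{3.35}$$

It is clear that $\delta_{1,n}(z) = n^{-1}\mathrm{Tr}\, Y_1$. On the other hand, using the resolvent identity (3.7) for the pair $(\widetilde{G}(z), \widetilde{G}(0))$, we obtain

$$Y_1 = B^{-1} + zZ_1, \tag{3.36}$$

where

$$Z_1 = B^{-1}U\widetilde{G}(z)U^*. \tag{3.37}$$

By using, for the following function of the unitary matrix $U$,

$$(B^{-1}U\widetilde{G}(z)U^*)_{ac},$$

the analog of (3.13) derived from the left shift invariance of the Haar measure, we obtain

$$\langle (B^{-1}[X, U\widetilde{G}(z)U^*])_{ac} \rangle - \langle (H_2^{-1}U\widetilde{G}(z)H_1U^*[H_2, X]U\widetilde{G}(z)U^*)_{ac} \rangle = 0.$$

Choosing the matrix $X$ with only $X_{bc} \neq 0$, we get

$$(B^{-1})_{ab}\langle (U\widetilde{G}(z)U^*)_{cc} \rangle - \langle (Y_1)_{ab}(U\widetilde{G}(z)U^*)_{cc} \rangle = \langle (Z_1)_{ab}I_{cc} \rangle - \langle (Y_1B^{-1})_{ab}(BU\widetilde{G}(z)U^*)_{cc} \rangle. \tag{3.38}$$

Applying the operation $n^{-1}\sum_{c=1}^n$ to (3.38), we obtain the matrix relation

$$\langle Z_1 \rangle = B^{-1}f_n(z) - \langle Y_1g_n(z) \rangle + \langle Y_1B^{-1}\delta_{2,n}(z) \rangle. \tag{3.39}$$

Regrouping the terms in (3.39) according to (3.29) and using (3.36), we obtain

$$\langle Z_1 \rangle = \frac{\Delta_{2,n}(z)}{1 + zf_n(z)}\langle Y_1 \rangle B^{-1} + R, \tag{3.40}$$

where

$$R = (1 + zf_n(z))^{-1}\bigl(\langle Y_1B^{-1}\delta_{2,n}^\circ(z) \rangle - \langle Y_1g_n^\circ(z) \rangle\bigr). \tag{3.41}$$

Substituting (3.40) in (3.36), we get

$$\langle Y_1 \rangle\bigl(B - z\Delta_{2,n}(z)(1 + zf_n(z))^{-1}I\bigr) = I + zRB. \tag{3.42}$$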

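Besides, using (3.20), (3.22) and the resolvent identity (3.23), we obtain

$$z\Delta_{2,n}(z) = -\langle n^{-1}\mathrm{Tr}\, H_2 \rangle + \langle n^{-1}\mathrm{Tr}\, G(z)H_1^{1/2}H_2^2H_1^{1/2} \rangle, \tag{3.43}$$

$$z\Delta_{1,n}(z) = -\langle n^{-1}\mathrm{Tr}\, H_1 \rangle + \langle n^{-1}\mathrm{Tr}\, \widehat{G}(z)H_2^{1/2}H_1^2H_2^{1/2} \rangle \tag{3.44}$$

and

$$1 + zf_n(z) = z^{-1}\bigl(-\langle n^{-1}\mathrm{Tr}\, H_1H_2 \rangle + \langle n^{-1}\mathrm{Tr}\, G(z)(H_1^{1/2}H_2H_1^{1/2})^2 \rangle\bigr). \tag{3.45}$$

In addition, using Proposition 6.1 with $M_1 = A$ and $M_2 = B$, we obtain

$$\langle n^{-1}\mathrm{Tr}\, H_1H_2 \rangle = n^{-1}\mathrm{Tr}\, H_1\; n^{-1}\mathrm{Tr}\, H_2 \tag{3.46}$$

and

$$\langle n^{-1}\mathrm{Tr}\, (H_1H_2)^2 \rangle \le 2\Bigl( n^{-1}\mathrm{Tr}\, H_1^2\, (n^{-1}\mathrm{Tr}\, H_2)^2 + (n^{-1}\mathrm{Tr}\, H_1)^2\, n^{-1}\mathrm{Tr}\, H_2^2 - (n^{-1}\mathrm{Tr}\, H_1)^2(n^{-1}\mathrm{Tr}\, H_2)^2 + n^{-3}\mathrm{Tr}\, H_1^2\; n^{-1}\mathrm{Tr}\, H_2^2 \Bigr). \tag{3.47}$$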


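Besides, using (3.5), (3.2), the trace property and (3.46), (3.47), we obtain

$$\bigl| \langle n^{-1}\mathrm{Tr}\, G(z)(H_1^{1/2}H_2H_1^{1/2})^2 \rangle \bigr| \le \frac{\langle n^{-1}\mathrm{Tr}\, (H_1H_2)^2 \rangle}{|\mathrm{Im}\, z|} \le \frac{8\mu_2^2}{|\mathrm{Im}\, z|} \le \frac{8m_4}{|\mathrm{Im}\, z|},$$

$$\bigl| \langle n^{-1}\mathrm{Tr}\, G(z)H_1^{1/2}H_2^2H_1^{1/2} \rangle \bigr| \le \frac{\langle n^{-1}\mathrm{Tr}\, H_1^{1/2}H_2^2H_1^{1/2} \rangle}{|\mathrm{Im}\, z|} = \frac{n^{-1}\mathrm{Tr}\, H_1\; n^{-1}\mathrm{Tr}\, H_2^2}{|\mathrm{Im}\, z|} \le \frac{\mu_2^{3/2}}{|\mathrm{Im}\, z|} \le \frac{m_4^{3/4}}{|\mathrm{Im}\, z|}$$

and, similarly,

$$\bigl| \langle n^{-1}\mathrm{Tr}\, \widehat{G}(z)H_2^{1/2}H_1^2H_2^{1/2} \rangle \bigr| \le \frac{\mu_2^{3/2}}{|\mathrm{Im}\, z|} \le \frac{m_4^{3/4}}{|\mathrm{Im}\, z|},$$

where

$$\mu_2 = \max_{r=1,2}\left\{ \sup_n \int \lambda^2\, N_{r,n}(d\lambda) \right\} \le m_4^{1/2}.$$

From (3.43), (3.44), (3.45) and from the above estimates, we derive for $z \in E(y_0)$ and $y_0$ sufficiently large, uniformly in $n$,

$$zf_n(z) = -1 + O(|\mathrm{Im}\, z|^{-1}). \tag{3.48}$$

In addition, according to condition (3.1), we have for some $n'$ sufficiently large and for all $n \ge n'$

$$|1 + zf_n(z)| \ge |z|^{-1}(m_1m_2)/2, \qquad \bigl|\Delta_{r,n}^{-1}(z)\bigr| \le \frac{2|z|}{m_r}, \quad r = 1, 2, \tag{3.49}$$

and

$$\mathrm{Im}\, \frac{z\Delta_{r,n}(z)}{1 + zf_n(z)} \ge \mathrm{Im}\, z\, \frac{m_r}{2m_1m_2}, \qquad \mathrm{Im}\, \frac{zf_n(z)}{\Delta_{r,n}(z)} \ge \mathrm{Im}\, z\, \frac{m_1m_2}{2m_r}, \quad r = 1, 2. \tag{3.50}$$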

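Thus, the matrix

$$P = B - z\Delta_{2,n}(z)(1 + zf_n(z))^{-1}I \tag{3.51}$$

is invertible uniformly in $n$ for $z \in E(y_0)$, and for $y_0$ sufficiently large we have

$$\|P^{-1}\| \le \frac{2m_1}{|\mathrm{Im}\, z|}. \tag{3.52}$$

Thus, we obtain from (3.42)

$$\langle Y_1 \rangle = G_2\!\left(\frac{z\Delta_{2,n}(z)}{1 + zf_n(z)}\right) + zRBP^{-1},$$

where $G_2(z) = (B - z)^{-1}$. Applying the operation $n^{-1}\mathrm{Tr}$ to this relation and using (3.41), we obtain

$$\Delta_{1,n}(z) = f_{2,n}\!\left(\frac{z\Delta_{2,n}(z)}{1 + zf_n(z)}\right) + r_{2,n}(z), \tag{3.53}$$

where

$$f_{2,n}(z) = n^{-1}\mathrm{Tr}\, G_2(z) = \int_0^{+\infty} \frac{N_{2,n}(d\lambda)}{\lambda - z}, \qquad \mathrm{Im}\, z \neq 0 \tag{3.54}$$

and

$$r_{2,n}(z) = \frac{z}{1 + zf_n(z)}\bigl( \langle n^{-1}\mathrm{Tr}\, U\widetilde{G}(z)AU^*P^{-1}\delta_{2,n}^\circ(z) \rangle - \langle n^{-1}\mathrm{Tr}\, BU\widetilde{G}(z)AU^*P^{-1}g_n^\circ(z) \rangle \bigr). \tag{3.55}$$

Using the arguments above for the matrices

$$G_U(z) = U\widetilde{G}(z)U^* = (UAU^*B - z)^{-1}, \qquad Y_2 = AU^*BG_UUA^{-1}, \qquad Z_2 = U^*G_UUA,$$

in which the roles of $A$ and $B$ are interchanged, we obtain the relation for $\Delta_{2,n}(z)$ analogous to (3.53). Thus, we obtain the system of relations

$$\begin{aligned}
f_n(z) + zf_n^2(z) &= \Delta_{1,n}(z)\Delta_{2,n}(z) + r_{1,n}(z), \\
\Delta_{1,n}(z) &= f_{2,n}\!\left(\frac{z\Delta_{2,n}(z)}{1 + zf_n(z)}\right) + r_{2,n}(z), \\
\Delta_{2,n}(z) &= f_{1,n}\!\left(\frac{z\Delta_{1,n}(z)}{1 + zf_n(z)}\right) + r_{3,n}(z),
\end{aligned} \tag{3.56}$$

where

$$f_{r,n}(z) = n^{-1}\mathrm{Tr}\, G_r(z) = \int_0^{+\infty} \frac{N_{r,n}(d\lambda)}{\lambda - z}, \qquad r = 1, 2. \tag{3.57}$$

In addition, according to (3.33), (3.49), (3.52) and the Cauchy–Schwarz inequality, we have for $z \in E(y_0)$

$$|r_{s,n}(z)| \le \frac{16}{m_{4-s}}\bigl( m_4^{1/4}v_{5-s}^{1/2} + m_4^{1/2}v_1^{1/2} \bigr) = O(n^{-1}), \qquad s = 2, 3. \tag{3.58}$$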


Besides, for $z \in \mathbb{C} \setminus \mathbb{R}_{+,\xi}$, where $\xi > 0$ and $\mathbb{R}_{+,\xi}$ is a $\xi$-neighborhood of the real positive semi-axis $\mathbb{R}_+$, we have the following estimates:

$$|f_n(z)| \le \xi^{-1}, \qquad |\Delta_{r,n}(z)| \le m_4^{1/4}\xi^{-1}, \quad r = 1, 2.$$

These estimates imply that $\{f_n(z)\}$ and $\{\Delta_{r,n}(z)\}$, $r = 1, 2$, are sequences of analytic functions, bounded uniformly in $n$ for $z \in \mathbb{C} \setminus \mathbb{R}_{+,\xi}$. Thus these sequences are relatively compact with respect to uniform convergence on any compact subset of $\mathbb{C} \setminus \mathbb{R}_{+,\xi}$. In addition, according to the hypothesis of the theorem, the normalized counting measures $N_{1,n}$, $N_{2,n}$ of $H_{1,n}$, $H_{2,n}$ converge weakly to limiting measures $N_1$, $N_2$. Thus their Stieltjes transforms (3.57) converge uniformly on every compact subset of $E(y_0)$, $y_0 > \xi$, to the Stieltjes transforms $f_{1,2}(z)$ of $N_{1,2}$. Hence, if $y_0$ is large enough, there exist three functions $f$, $\Delta_1$, $\Delta_2$ which are analytic in $E(y_0)$ and which satisfy the limiting system (2.22) for $z \in E(y_0)$. Its unique solubility in (3.17), where $y_0$ is large enough, is proved in Proposition 3.3. Besides, the three functions $f_n$, $\Delta_{1,n}$, $\Delta_{2,n}$ are a priori analytic for $z \in \mathbb{C} \setminus \mathbb{R}_+$. Thus, their limits $f$, $\Delta_1$, $\Delta_2$ are also analytic for $z \in \mathbb{C} \setminus \mathbb{R}_+$. In view of the weak compactness of probability measures and the continuity of the one-to-one correspondence between nonnegative measures and their Stieltjes transforms (see Proposition 2.1(v)), there exists a unique nonnegative measure $N$ such that $f$ admits the representation (2.21). This measure $N$ is a probability measure in view of (3.48).

We conclude that the whole sequence $\{f_n(z)\}$ converges uniformly on every compact subset of $\mathbb{C} \setminus \mathbb{R}_{+,\xi}$, $\xi > 0$, to the limiting function $f(z)$ satisfying (2.22). This result, Theorem 3.2 and the Borel–Cantelli lemma imply that the sequence $\{g_n(z)\}$, where $g_n(z)$ is defined in (3.22), converges with probability 1 to $f(z)$ for any fixed $z \in E(y_0)$. Since the convergence of a sequence of analytic functions on any countable set having an accumulation point in their common domain of definition implies the uniform convergence of the sequence on any compact subset of the domain, we obtain the convergence of $g_n(z)$ to $f(z)$ with probability 1 on any compact subset of $\mathbb{C} \setminus \mathbb{R}_{+,\xi}$. Due to the continuity of the one-to-one correspondence between probability measures and their Stieltjes transforms, the normalized counting measure (NCM) of the eigenvalues of the random matrix (2.2) converges weakly with probability 1 to the nonrandom measure $N$ whose Stieltjes transform (2.21) satisfies (2.22). Theorem 3.1 is proved. ✷
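The passage from the variance bound to convergence with probability 1 can be spelled out as follows. This is the standard Chebyshev and Borel–Cantelli argument, written here as a sketch for a fixed $z \in E(y_0)$ and an arbitrary $\varepsilon > 0$; the symbol $\mathbf{P}$ for the probability is our notation.

```latex
% Sketch: variance bound of order n^{-2}  ==>  almost sure convergence (fixed z in E(y_0)).
\begin{align*}
\mathbf{P}\bigl\{ |g_n(z) - f_n(z)| > \varepsilon \bigr\}
  &\le \frac{\langle |g_n(z) - f_n(z)|^2 \rangle}{\varepsilon^2}
   \le \frac{C(y_0)}{\varepsilon^2 n^2}, \\
\sum_{n=1}^{\infty} \mathbf{P}\bigl\{ |g_n(z) - f_n(z)| > \varepsilon \bigr\}
  &\le \frac{C(y_0)}{\varepsilon^2} \sum_{n=1}^{\infty} \frac{1}{n^2} < \infty .
\end{align*}
% By the Borel--Cantelli lemma, with probability 1 the event |g_n(z) - f_n(z)| > \varepsilon
% occurs only for finitely many n. Taking \varepsilon = 1/k, k = 1, 2, ..., gives
% g_n(z) - f_n(z) -> 0 with probability 1, and f_n(z) -> f(z) yields g_n(z) -> f(z).
```

The summability of $n^{-2}$ is exactly why a variance bound of order $n^{-2}$, rather than $n^{-1}$, is needed for this argument.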

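THEOREM 3.2. Let $H_n$ be a random matrix of the form (2.2) satisfying the conditions of Theorem 3.1. Denote

$$g_n(z) = n^{-1}\mathrm{Tr}\, (H_n - z)^{-1} = n^{-1}\mathrm{Tr}\, (\widetilde{H}_n - z)^{-1},$$

$$\delta_{r,n}(z) = n^{-1}\mathrm{Tr}\, H_{r,n}(\widetilde{H}_n - z)^{-1}, \qquad r = 1, 2, \tag{3.59}$$

where $\widetilde{H}_n$ is defined in (3.19).

Then there exist $y_0$ and $C(y_0)$, both positive and independent of $n$, such that the variances of the random variables (3.59) satisfy for any $z \in E(y_0)$

$$v_1 = \langle |g_n(z) - \langle g_n(z) \rangle|^2 \rangle \le \frac{C(y_0)}{n^2}, \tag{3.60}$$

$$v_{1+r} = \langle |\delta_{r,n}(z) - \langle \delta_{r,n}(z) \rangle|^2 \rangle \le \frac{C(y_0)}{n^2}, \qquad r = 1, 2. \tag{3.61}$$

Proof. We will derive and analyze the system of inequalities

$$v_i \le \sum_{j=1,\, j \neq i}^{3} \alpha_{ij}(y_0)v_i^{1/2}v_j^{1/2} + \frac{\beta_i(y_0)}{n^2}, \qquad i = 1, 2, 3. \tag{3.62}$$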

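Below we will use the notations $g(z)$ and $\delta_r(z)$ for $g_n(z)$ and $\delta_{r,n}(z)$, $r = 1, 2$, and the notations 1 and 2 for two values $z_1$ and $z_2$ of the complex spectral parameter $z$. We assume that $|\mathrm{Im}\, z_{1,2}| \ge y_0 > 0$.

Consider the matrix

$$Q_1 = \langle g^\circ(1)z_2^{-1}BU\widetilde{G}(2)AU^* \rangle, \tag{3.63}$$

where $g^\circ(1) = g(1) - \langle g(1) \rangle$. It follows from (3.23) that $n^{-1}\mathrm{Tr}\, Q_1$ for $z_1 = z$ and $z_2 = \bar{z}$ is the variance (3.60), which we denote $v_1(z)$:

$$v_1(z) = n^{-1}\mathrm{Tr}\, Q_1\big|_{z_1 = z,\, z_2 = \bar{z}} = \langle |g^\circ(z)|^2 \rangle. \tag{3.64}$$

By using, for the following function of the unitary matrix $U$,

$$(U\widetilde{G}(1)U^*)^\circ_{aa}\, z_2^{-1}(BU\widetilde{G}(2)AU^*)_{cd},$$

the analog of (3.13) derived from the left shift invariance of the Haar measure, we obtain

$$\begin{aligned}
z_2^{-1}\bigl( &\langle ([X, U\widetilde{G}(1)U^*])_{aa}(BU\widetilde{G}(2)AU^*)_{cd} \rangle \\
&- \langle (U\widetilde{G}(1)AU^*[B, X]U\widetilde{G}(1)U^*)_{aa}(BU\widetilde{G}(2)AU^*)_{cd} \rangle \\
&+ \langle (U\widetilde{G}(1)U^*)^\circ_{aa}(B[X, U\widetilde{G}(2)AU^*])_{cd} \rangle \\
&- \langle (U\widetilde{G}(1)U^*)^\circ_{aa}(BU\widetilde{G}(2)AU^*[B, X]U\widetilde{G}(2)U^*)_{cd} \rangle \bigr) = 0.
\end{aligned}$$

Choosing in the above relation a matrix $X$ with only $X_{bd} \neq 0$, applying to the result the operation $n^{-2}\sum_{a,d=1}^n$ and taking into account that

$$g^\circ(z) = n^{-1}\sum_{a=1}^n \widetilde{G}^\circ_{aa}(z),$$

we obtain the matrix relation

$$\begin{aligned}
z_2^{-1}\bigl( &n^{-2}\langle BU\widetilde{G}(2)A(U^*BU\widetilde{G}^2(1)AU^* - \widetilde{G}^2(1)AU^*B) \rangle \\
&+ B\langle g^\circ(1)\delta_1(2) \rangle - \langle g^\circ(1)BU\widetilde{G}(2)AU^* \rangle \\
&+ \langle g^\circ(1)BU\widetilde{G}(2)AU^*(1 + z_2g(2)) \rangle \\
&- \langle g^\circ(1)BU\widetilde{G}(2)AU^*B\delta_1(2) \rangle \bigr) = 0.
\end{aligned} \tag{3.65}$$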

Regrouping the terms in (3.65) according to (3.29) and taking into account that

$$z_2^{-1}(BU\widetilde{G}(2)AU^*B - B) = UA^{-1}\widetilde{G}(2)AU^*B,$$

we obtain

$$Q_1(z_2f(2) - \Delta_1(2)B) = -\langle g^\circ(1)g^\circ(2)BU\widetilde{G}(2)AU^* \rangle + \langle g^\circ(1)\delta_1^\circ(2)UA^{-1}\widetilde{G}(2)AU^*B \rangle + R_1, \tag{3.66}$$

where

$$R_1 = n^{-2}z_2^{-1}\langle BU\widetilde{G}(2)A(U^*BU\widetilde{G}^2(1)AU^* - \widetilde{G}^2(1)AU^*B) \rangle. \tag{3.67}$$

Besides, according to (3.49) and (3.50), the matrix

$$P_2 = (z_2f(2) - \Delta_1(2)B) = \Delta_1(2)(z_2f(2)\Delta_1^{-1}(2) - B)$$

is invertible uniformly in $n$ for $z \in E(y_0)$ and

$$\|P_2^{-1}\| \le \frac{8}{m_2}.$$

Multiplying (3.66) by $P_2^{-1}$ from the right and applying the operation $n^{-1}\mathrm{Tr}$ to the result, we obtain in the left-hand side the variance $v_1(z)$ of (3.60). In view of (3.9), (3.49), (3.20) and the trace property, the terms in the right-hand side can be estimated for $z_2 \in E(y_0)$ as follows:

(i) $$|\langle g^\circ(1)g^\circ(2)\, n^{-1}\mathrm{Tr}\, P_2^{-1}BUA^{1/2}G(2)A^{1/2}U^* \rangle| \le \frac{8m_4^{1/2}}{m_2y_0}v_1;$$

(ii) $$|\langle g^\circ(1)\delta_1^\circ(2)\, n^{-1}\mathrm{Tr}\, P_2^{-1}UA^{-1/2}G(2)A^{1/2}U^*B \rangle| \le \frac{8m_4^{1/2}}{m_2y_0}v_1^{1/2}v_2^{1/2};$$

(iii) $$|z_2|^{-1}|\langle n^{-3}\mathrm{Tr}\, P_2^{-1}BUA^{1/2}G(2)A^{1/2}(U^*BUA^{1/2}G^2(1)A^{1/2}U^* - A^{1/2}G^2(1)A^{1/2}U^*B) \rangle| \le n^{-2}\, 16m_4m_2^{-1}y_0^{-3}.$$

These bounds lead, for $8m_4^{1/2}m_2^{-1}y_0^{-1} \le 1/2$, to the first inequality of (3.62), in which

$$\alpha_{12}(y_0) = 16m_4^{1/2}m_2^{-1}y_0^{-1}, \qquad \alpha_{13}(y_0) = 0, \qquad \beta_1 = 32m_4m_2^{-1}y_0^{-3}. \tag{3.68}$$

To obtain the second inequality of the system, consider the matrix

Q2 = 〈δ◦1(1)G(2)A〉. (3.69)

Applying to Q2 operation n−1Tr and setting z1 = z, z2 = z, we obtain the variancev2 of (3.61). On the other hand, using Proposition 3.2 for the function

.(M) = (G(1)A)◦aa(G(2)A)cd,


where G(z) = (AU ∗MU − z)−1, we obtain relation

n−2(−〈U ∗BUG(1)AG(1)AG(2)A〉++〈G(1)AG(1)H G(2)A〉)−−〈δ◦

1(1)δ1(2)U ∗BUG(2)A〉++〈δ◦

1(1)g(2)z2G(2)A〉 + 〈δ◦1(1)G(2)A〉 = 0. (3.70)

We do this by identifying M and B and performing almost the same procedure as that used in the derivation of (3.65), in particular, choosing for the matrix X the matrix with only the (c, b)th entry nonzero. Regrouping terms in (3.70) according to (3.29) and taking into account that $U^*BUG(z) = zA^{-1}G(z) + A^{-1}$, we obtain

$$\bigl((1 + z_2f(2))I - z_2\Delta_1(2)A^{-1}\bigr)Q_2 = \langle\delta_1^\circ(1)\delta_1^\circ(2)U^*BUG(2)A\rangle - \langle\delta_1^\circ(1)g^\circ(2)z_2G(2)A\rangle + R_2, \qquad (3.71)$$

where

$$R_2 = n^{-2}\bigl(\langle U^*BUG(1)AG(1)AG(2)A\rangle - \langle G(1)AG(1)HG(2)A\rangle\bigr). \qquad (3.72)$$

Multiplying (3.71) by A from the left, we get

$$PQ_2 = (1 + z_2f(2))^{-1}\bigl(\langle\delta_1^\circ(1)\delta_1^\circ(2)HG(2)A\rangle - \langle\delta_1^\circ(1)g^\circ(2)z_2AG(2)A\rangle + AR_2\bigr), \qquad (3.73)$$

where $P = A - z_2\Delta_1(2)(1 + z_2f(2))^{-1}I$. It follows from (3.50) that for z2 ∈ E(y0), where y0 is sufficiently large, the matrix P is invertible uniformly in n and

$$\|P^{-1}\| \le \frac{2m_2}{|\operatorname{Im} z_2|}.$$

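Multiplying (3.73) by $P^{-1}$ from the left and applying the operation $n^{-1}\operatorname{Tr}$ to the result, we obtain in the left-hand side the variance v2(z) of (3.61). In view of (3.9), (3.49), (3.20) and the trace property, the terms in the right-hand side can be estimated for z ∈ E(y0) as follows:

(i) $|1 + z_2f(2)|^{-1}\bigl|\langle\delta_1^\circ(1)\delta_1^\circ(2)\,n^{-1}\operatorname{Tr} P^{-1}HG(2)A\rangle\bigr| \le \dfrac{8m_4^{1/4}}{m_1y_0}\,v_2$;

(ii) $|1 + z_2f(2)|^{-1}\bigl|\langle\delta_1^\circ(1)g^\circ(2)\,n^{-1}\operatorname{Tr} P^{-1}z_2AG(2)A\rangle\bigr| \le \dfrac{16m_4^{1/2}}{m_1}\,v_2^{1/2}v_1^{1/2}$;

(iii) $|1 + z_2f(2)|^{-1}\bigl|\langle n^{-3}\operatorname{Tr} P^{-1}A\bigl(HG(1)AG(1)AG(2) - G(1)AG(1)HG(2)A\bigr)\rangle\bigr| \le n^{-2}\,16\,m_4^{5/4}m_1^{-1}y_0^{-2}$.

These bounds lead for $8m_4^{1/4}m_1^{-1}y_0^{-1} \le 1/2$ to the second inequality of (3.62), in which

$$\alpha_{21}(y_0) = 32m_4^{1/2}m_1^{-1}, \quad \alpha_{23}(y_0) = 0, \quad \beta_2 = 32m_4^{5/4}m_1^{-1}y_0^{-2}. \qquad (3.74)$$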

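Using the arguments above for the matrix $Q_3 = \langle\delta_2^\circ(1)BUG(2)U^*\rangle$, we obtain the third inequality of (3.62), where

$$\alpha_{31} = \alpha_{21} = 32m_4^{1/2}m_1^{-1}, \qquad \alpha_{32} = \alpha_{23} = 0, \qquad \beta_3 = \beta_2 = 32m_4^{5/4}m_1^{-1}y_0^{-2}. \qquad (3.75)$$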

Introducing new variables

$$u_1 = y_0^{1/2}v_1^{1/2}, \quad u_2 = v_2^{1/2}, \quad u_3 = v_3^{1/2},$$

we obtain from (3.68), (3.74) and (3.75) the system

$$u_i^2 \le \sum_{j=1,\,j\ne i}^{3} a_{ij}u_iu_j + \frac{\gamma_i}{n^2}, \quad i = 1, 2, 3, \qquad (3.76)$$

where the coefficients {a_ij, i ≠ j} have the form $a_{ij} = y_0^{-1/2}b_{ij}$ with b_ij and γ_i bounded in y0 and n as y0 → ∞ and n → ∞. By choosing y0 sufficiently large (and then fixing it), we can guarantee that 0 ≤ a_ij ≤ 1/4, i ≠ j. Summing the three relations (3.76), we can write the result in the form (cu, u) ≤ γ/n², where γ = γ1 + γ2 + γ3 and the minimum eigenvalue of the matrix c is not less than 1/2 (each row of c has diagonal entry 1 and off-diagonal entries of total modulus at most 1/2). Thus we obtain the bounds (3.60) and (3.61). Theorem 3.2 is proved. ✷
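The last step of the proof can be checked numerically. The following is a minimal sketch in plain Python, assuming a hypothetical random choice of coefficients 0 ≤ a_ij ≤ 1/4; the helper names are illustrative and do not come from the paper. It builds the symmetric matrix c of the quadratic form obtained by summing (3.76) and verifies, via the Gershgorin circle bound, that its eigenvalues are at least 1/2.

```python
import random

def symmetrized_form_matrix(a):
    """Return the symmetric c with (c u, u) = sum_i u_i^2 - sum_{i != j} a[i][j] u_i u_j."""
    n = len(a)
    return [[1.0 if i == j else -(a[i][j] + a[j][i]) / 2.0 for j in range(n)]
            for i in range(n)]

def gershgorin_lower_bound(c):
    """Gershgorin lower bound for the eigenvalues of a real symmetric matrix."""
    n = len(c)
    return min(c[i][i] - sum(abs(c[i][j]) for j in range(n) if j != i)
               for i in range(n))

def quadratic_form(c, u):
    """Evaluate (c u, u)."""
    n = len(c)
    return sum(c[i][j] * u[i] * u[j] for i in range(n) for j in range(n))

random.seed(0)
# Hypothetical coefficients a_ij in [0, 1/4] (the diagonal is not used).
a = [[0.0 if i == j else random.uniform(0.0, 0.25) for j in range(3)]
     for i in range(3)]
c = symmetrized_form_matrix(a)
lower = gershgorin_lower_bound(c)

# Hypothetical nonnegative vector u = (u_1, u_2, u_3).
u = [random.uniform(0.0, 1.0) for _ in range(3)]
form = quadratic_form(c, u)
norm_sq = sum(x * x for x in u)
```

In the extreme case a_ij = 1/4 for all i ≠ j the smallest eigenvalue of c equals 1/2 exactly, so (cu, u) ≥ |u|²/2 and each u_i² is bounded by 2γ/n².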

4. Convergence in Probability for Random An and Bn

According to Theorem 3.2, the randomness of Un in (2.2) (or (2.18)) already allows us to prove that the variance of the Stieltjes transform of the NCM (2.4) vanishes as n → ∞. Therefore, we have only to prove that the additional randomness due to the matrices An and Bn in (2.2) does not destroy this property. We will prove this fact first for An and Bn whose norms are uniformly bounded in n (see Lemma 4.1 below), and then for the general case of Theorem 2.1 by using a truncation procedure.

PROPOSITION 4.1 ([5]). Let {mn} be a sequence of random measures defined on some probability space Ω and {sn} be the sequence of their Stieltjes transforms. Then:

(i) The sequence mn converges in probability to a nonrandom probability measure m if and only if the sequence {sn} converges in probability for any fixed z belonging to some compact set K ⊂ {z ∈ C | Im z > 0} to the Stieltjes transform f of the measure m.

(ii) We can replace the requirement of their convergence for any z belonging to a certain compact of C± by the convergence for any z belonging to any interval of the imaginary axis, i.e. for z = iy, y ∈ [y1, y2], y1 > 0.

(iii) If {mn} is a sequence of random nonnegative measures converging weakly in probability to a nonrandom nonnegative measure m, then the Stieltjes transforms sn of mn and the Stieltjes transform s of m are related by

$$\lim_{n\to\infty} E\Bigl\{\sup_{z\in K}|s_n(z) - s(z)|\Bigr\} = 0 \qquad (4.1)$$

for any compact K of C±.

Proof. Cf. [5], Proposition 4.1. ✷

LEMMA 4.1. Let Hn be the random n × n matrix of the form (2.2) in which An and Bn are random positive definite Hermitian matrices, Un and Vn are random unitary matrices distributed each according to the Haar measure on U(n) and An, Bn, Un and Vn are mutually independent. Assume that the normalized counting measures Nr,n, r = 1, 2 of matrices An and Bn converge weakly in probability as n → ∞ to the nonrandom nonnegative and probability measures Nr, r = 1, 2 respectively and that

$$\sup_n \|A_n\| \le T < \infty, \qquad \sup_n \|B_n\| \le T < \infty, \qquad (4.2)$$

and condition (2.20) of Theorem 2.2 is fulfilled.

Then the normalized counting measure Nn of Hn converges weakly in probability to a nonrandom probability measure whose Stieltjes transform (2.21) is a unique solution of the system (2.22) in the class of functions f(z), Δ1,2(z) analytic for z ∈ C \ R+ and satisfying conditions (2.9)–(2.11) and (2.23).

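Proof. In view of Proposition 4.1(ii) it suffices to show that

$$\lim_{n\to\infty} E\{|g_n(z) - f(z)|\} = 0$$

for any z belonging to some interval of the imaginary axis, i.e.

$$z = iy, \quad y \in [y_1, y_2], \quad 0 < y_1 < y_2 < \infty. \qquad (4.3)$$

Condition (4.2) implies

$$P\Bigl\{\Bigl|\int_0^{+\infty}\lambda N_{r,n}(d\lambda) - m_r\Bigr| \ge \varepsilon\Bigr\} \to 0, \quad r = 1, 2, \quad \text{as } n \to \infty. \qquad (4.4)$$

Thus, for any sufficiently small η > 0 there exists n(η) such that for all n ≥ n(η)

$$P\Bigl\{\frac{10}{11}m_r \le \int_0^{+\infty}\lambda N_{r,n}(d\lambda) \le \frac{12}{11}m_r\Bigr\} \ge 1 - \eta, \quad r = 1, 2. \qquad (4.5)$$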


Denote by Ω∗ the event whose probability is written in the left-hand side of (4.5) and by E∗{ · } the average over the product of Ω∗ and two copies of the group U(n) for Un and Wn. As a result of (4.5), for z = iy, y ∈ [y1, y2], we have

$$\lim_{n\to\infty} E\{|g_n(z) - f(z)|\} \le 2y_1^{-1}\eta + \lim_{n\to\infty} E^*\{|g_n(z) - f(z)|\}$$

and for all n ≥ n(η) and any realization of An and Bn on Ω∗, we have

$$\frac{10}{11}m_r \le \int_0^{+\infty}\lambda N_{r,n}(d\lambda) \le \frac{12}{11}m_r. \qquad (4.6)$$

Thus, to complete the proof we have to show that for any z = iy, y ∈ [y1, y2]

$$\lim_{n\to\infty} E^*\{|g_n(z) - f(z)|\} = 0. \qquad (4.7)$$

Since the relation (4.4) and condition (4.2) of the lemma evidently imply the conditions (3.1) and (3.2) of Theorems 3.1 and 3.2, all the results obtained in these theorems are valid in our case for any fixed realization of random matrices An and Bn on Ω∗. In addition, all n-independent estimating quantities entering various bounds in the proofs of these theorems and depending on the moments m1, m2, m4 in (3.1), (3.2) and on n′, ξ and y0 will now depend on m1, m2, T and on n(η), ξ and y1, y2 in (4.3), but not on any particular realization of random matrices An and Bn.

Using (3.33), we can write that

$$E^*\{|g_n(z) - \langle g_n(z)\rangle|\} \le E^*\{|v_1^{1/2}(z)|\} \le \frac{C}{n},$$

where the symbol 〈 · 〉 denotes, as above, the expectation with respect to the Haar measure on U(n). Thus, it suffices to show

$$\lim_{n\to\infty} E^*\{|\langle g_n(z)\rangle - f(z)|\} = 0, \quad z = iy, \quad y \in [y_1, y_2], \qquad (4.8)$$

where y1 is big enough. Introduce the quantities

$$\gamma_n(y) = iy\bigl(\langle g_n(iy)\rangle - f(iy)\bigr), \qquad \gamma_{r,n}(y) = \langle\delta_{r,n}(iy)\rangle - \Delta_r(iy), \quad r = 1, 2.$$

By using the first equation of system (2.22) and the first relation of (3.56), we can write the identity

$$\gamma_n(y)\bigl(1 + iy(\langle g_n(iy)\rangle + f(iy))\bigr) = iy\Delta_2(iy)\gamma_{1,n}(y) + iy\langle\delta_{1,n}(iy)\rangle\gamma_{2,n}(y) + iy\,r_{1,n}(iy). \qquad (4.9)$$

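On the other hand, using the integral representation (2.8), from the first two equations of (2.22), we obtain

$$1 + iyf(iy) = \rho_2(t_2(y)), \qquad (4.10)$$

where

$$\rho_2(z) = 1 + zf_2(z) = \int_0^{+\infty}\frac{\lambda N_2(d\lambda)}{\lambda - z}, \qquad t_2(y) = \frac{iy\Delta_2(iy)}{1 + iyf(iy)}. \qquad (4.11)$$

Using (4.10), we obtain

$$\gamma_n(y) = [\rho_2(t_{2,n}(y)) - \rho_2(t_2(y))] + \varepsilon_{1,n}(y), \qquad (4.12)$$

where

$$\varepsilon_{1,n}(y) = [1 + iy\langle g_n(iy)\rangle - \rho_2(t_{2,n}(y))], \qquad t_{2,n}(y) = \frac{iy\langle\delta_{2,n}(iy)\rangle}{1 + iy\langle g_n(iy)\rangle}. \qquad (4.13)$$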

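We have

$$E^*\{|\varepsilon_{1,n}(y)|\} \le E^*\{|1 + iy\langle g_n(iy)\rangle - \rho_{2,n}(t_{2,n}(y))|\} + E^*\{|\rho_{2,n}(t_{2,n}(y)) - \rho_2(t_{2,n}(y))|\}, \qquad (4.14)$$

where

$$\rho_{2,n}(z) = 1 + zg_{2,n}(z) = \int_0^{+\infty}\frac{\lambda N_{2,n}(d\lambda)}{\lambda - z} \qquad (4.15)$$

and where g2,n(z) is the Stieltjes transform of the random NCM N2,n of H2,n (cf. (3.57))

$$g_{2,n}(z) = n^{-1}\operatorname{Tr} G_2(z) = \int_0^{+\infty}\frac{N_{2,n}(d\lambda)}{\lambda - z}.$$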
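The identity in (4.15) between the two integral representations follows from 1 + z/(λ − z) = λ/(λ − z). A minimal sketch in plain Python, assuming a hypothetical finite spectrum in place of the eigenvalues of H2,n (the function names are illustrative):

```python
def stieltjes(eigs, z):
    """g(z) = n^{-1} sum_k 1 / (lambda_k - z) for the counting measure of eigs."""
    return sum(1.0 / (lam - z) for lam in eigs) / len(eigs)

def rho(eigs, z):
    """rho(z) = n^{-1} sum_k lambda_k / (lambda_k - z) for the same measure."""
    return sum(lam / (lam - z) for lam in eigs) / len(eigs)

# Hypothetical positive spectrum standing in for the eigenvalues of H_{2,n}.
eigs = [0.5, 1.0, 2.0, 3.5]
# A point z = iy on the imaginary axis, as in (4.3).
z = 2.0j

lhs = rho(eigs, z)
rhs = 1.0 + z * stieltjes(eigs, z)
```

The two values agree up to rounding, and Im g(z) > 0 for Im z > 0, as for any Stieltjes transform of a nonnegative measure.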

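Besides, by using the first two relations of (3.56), we obtain

$$1 + iy\langle g_n(iy)\rangle = \rho_{2,n}(t_{2,n}(y)) + r_{2,n}(iy)\,t_{2,n}(y) - r_{1,n}(iy)\bigl(1 + iy\langle g_n(iy)\rangle\bigr)^{-1},$$

where

$$r_{2,n}(z) = \frac{z}{1 + z\langle g_n(z)\rangle}\bigl(\langle n^{-1}\operatorname{Tr} Y_1P^{-1}\delta_{2,n}^\circ(z)\rangle - \langle n^{-1}\operatorname{Tr} UG(z)AUBU^*P^{-1}g_n^\circ(z)\rangle\bigr),$$

$$P^{-1} = G_2(t_{2,n}(z)), \qquad g_n^\circ(z) = g_n(z) - \langle g_n(z)\rangle, \qquad \delta_{r,n}^\circ(z) = \delta_{r,n}(z) - \langle\delta_{r,n}(z)\rangle, \quad r = 1, 2,$$

are the respective random variables centered by the partial expectations with respect to the Haar measure. Using (4.6) we obtain the analogs of (3.49) and (3.50).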


In our case, we have for all n ≥ n(η) and y ∈ [y1, y2], y1 sufficiently large,

$$|1 + iyf_n(iy)| \ge \frac{m_1m_2}{3y}, \qquad |t_{2,n}(y)| \le \frac{3y}{m_1}, \qquad \operatorname{Im} t_{2,n}(y) \ge \frac{y}{3m_1}. \qquad (4.16)$$

In addition, we have the analog of (3.58)

$$|r_{2,n}(z)| \le \frac{C}{n}.$$

This leads to the following bound for the first term in the right-hand side of (4.14):

$$E^*\{|1 + iy\langle g_n(iy)\rangle - \rho_{2,n}(t_{2,n}(y))|\} \le \frac{3y_2}{m_1}\Bigl(E^*\{|r_{2,n}(iy)|\} + \frac{1}{m_2}E^*\{|r_{1,n}(iy)|\}\Bigr) \le \frac{C}{n}.$$

As for the second term, using (4.16) we obtain for it the following bound:

$$E^*\{|\rho_{2,n}(t_{2,n}(y)) - \rho_2(t_{2,n}(y))|\} \le \frac{3y_2}{m_1}\sup_{z\in K}E^*\{|g_{2,n}(z) - f_2(z)|\} \le \frac{3y_2}{m_1}\sup_{z\in K}E\{|g_{2,n}(z) - f_2(z)|\}, \qquad (4.17)$$

where K is a compact subset of C+,

$$K = \Bigl\{z \in \mathbb{C} : \operatorname{Im} z \ge \frac{y_1}{3m_1},\ |z| \le \frac{3y_2}{m_1}\Bigr\}.$$

The right-hand side of (4.17) tends to zero as n → ∞ in view of the hypothesis of Theorem 2.1 and Proposition 3.3. Thus, there exist 0 < y1 < y2 < ∞ such that for all y ∈ [y1, y2],

$$\lim_{n\to\infty} E^*\{|\varepsilon_{1,n}(y)|\} = 0. \qquad (4.18)$$

Analogous arguments show that $\lim_{n\to\infty} E^*\{|\varepsilon_{2,n}(y)|\} = 0$, where ε2,n(y) is defined from (4.13) and (4.11) by interchanging the indices 1 and 2.

Consider now the first term in the right-hand side of (4.12). In view of (4.11) and (4.15), we can write this term in the form

$$[\rho_2(t_{2,n}(y)) - \rho_2(t_2(y))] = -\frac{t_{2,n}(y)}{1 + iyf(iy)}J_2\gamma_n + \frac{iy}{1 + iy\langle g_n(iy)\rangle}J_2\gamma_{2,n} = -a_2\gamma_n + b_2\gamma_{2,n},$$


where J2, a2 and b2 are defined by formulas (3.15) and (3.16), in which we haveto replace �′

2, �′′2, f ′ and f ′′ by 〈δ2,n〉, �2, 〈gn〉 and f , respectively. Denote by

. = {.ij }3i,j=1 the matrix defined by the left-hand side of system (3.15) and by

H = {Hi}3i=1 the vector with components H1 = γn, H2 = γ1,n, H3 = γ2,n. It is easy

to check using (4.6) that for all n � n(η), y ∈ [y1, y2] and y1 sufficiently large, wehave

det . � m1m2

2,

i.e. matrix . is uniformly invertible in n and y.Then we have from (4.9)

E{|(.H)1|} � E{|yr1,n|}. (4.19)

Besides, relations (4.12) and (4.18) lead to

E{|(.H)2|} � E{|ε1,n|}. (4.20)

Interchanging indices 1 and 2, in the above arguments we also obtain that

E{|(.H)3|} � E{|ε2,n|}. (4.21)

Denote by || · ||1 the l1 -norm of C3 and by || · || the induced matrix norm. Then

we have

E{||H||1} � E{||.−1.H||} � E1/2{||.−1||2}E1/2{||.H||21}. (4.22)

It follows from our arguments above that all entries of the matrices . and .−1 andall components of the vector H are bounded uniformly in n and in realizations ofrandom matrices An, Bn, Un and Wn in (2.2). Thus, we have

||.−1|| �3∑

i,j=1

|(.−1)ij | � C, ||.H||1 �3∑

i,j=1

|.ij ||H|j � C.

These bounds and (4.19)–(4.22) imply that

E{||H||1} � C3/2(E{|r1,n|} + E{|ε1,n|} + E{|ε2,n|})1/2.

In view of (4.18), this inequality implies (4.8), i.e. the assertion of the lemma. ✷

Proof of Theorem 2.1. For any T > 0 introduce the matrices $A_n^T$ and $B_n^T$ by replacing the eigenvalues of An and Bn lying in ]T, ∞[ by T. Denote by $N_{r,n}^T$ the NCMs of $A_n^T$ and $B_n^T$. It is clear that $N_{r,n}^T$ converges weakly in probability to $N_r^T$ as n → ∞, where $N_r^T$ is defined via Nr by the formula

$$N_r^T(\Delta) = \begin{cases} N_r(\Delta), & \Delta \subset\, ]-\infty, T[, \\ N_r([T, \infty[), & \Delta = \{T\}, \\ 0, & \Delta \subset\, ]T, \infty[, \end{cases} \qquad (4.23)$$

for any Borel set Δ ⊂ R.

Now denote

H T = (H1)1/2H T2 (H1)

1/2,

H T = (H T2 )1/2H1(H

T2 )1/2,

H T = (H T2 )1/2H T

1 (H T2 )1/2,

where

H T1 = V ∗AT V, H T

2 = U ∗BT U.

It is clear that the dimensions of ranges of Hn − H Tn and H T

n − H Tn do not exceed

the ranges of H T1,n − H1,n and of H T

2,n − H2,n, respectively, i.e. they do not exceedthe number of those [A]i and [B]i which are larger than T . Therefore, the NCMsNn(λ), N T

n (λ), N Tn (λ) and NT

n (λ) of matrices Hn, H T , H T and H Tn satisfy the

inequalities

|Nn(λ) − N Tn (λ)| � #{|[B]i | � T }

n=

∫ +∞

T

N2,n(dµ), (4.24)

|N Tn (λ) − NT

n (λ)| � #{|[A]i | � T }n

=∫ +∞

T

N1,n(dµ). (4.25)
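The truncation used in this step can be illustrated at the level of a single factor. The following is a minimal sketch in plain Python, assuming a hypothetical finite spectrum and a hypothetical level T; it only shows that replacing eigenvalues above T by T moves the counting function by at most the fraction of eigenvalues that are at least T, and it does not reproduce the rank argument for the products behind (4.24) and (4.25).

```python
def truncate(eigs, T):
    """Replace every eigenvalue larger than T by T, as in the definition of A_n^T."""
    return [min(lam, T) for lam in eigs]

def counting_function(eigs, lam):
    """N(lam): fraction of eigenvalues strictly less than lam."""
    return sum(1 for x in eigs if x < lam) / len(eigs)

# Hypothetical spectrum and truncation level.
eigs = [0.2, 0.7, 1.5, 3.0, 8.0, 12.0]
T = 2.5
eigs_T = truncate(eigs, T)

# Fraction of eigenvalues that are at least T (the mass of [T, +infinity[).
tail = sum(1 for x in eigs if x >= T) / len(eigs)

# Largest difference of the two counting functions on a grid of test points.
grid = [0.1 * k for k in range(150)]
max_gap = max(abs(counting_function(eigs, lam) - counting_function(eigs_T, lam))
              for lam in grid)
```

For this data three of the six eigenvalues are moved, and the counting functions differ by exactly that fraction on the interval between T and the smallest truncated eigenvalue.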

Let g Tn (z), g T

n (z) and g Tn (z) be Stieltjes transforms of N T

n , N Tn and NT

n , respec-tively. Because of the trace property, we have

g Tn (z) = n−1Tr (H T

n − z)−1 = n−1Tr (H Tn − z)−1 = g T

n (z).

Besides, it follows by definition and from (4.24) and (4.25), that if gn(z) is theStieltjes transform of Nn, then we have uniformly on z ∈ E(y0)

|gTn (z) − gn(z)| � |g T

n (z) − gn(z)| + |g Tn (z) − gT

n (z)|� 1

y0

(∫ +∞

T

N1,n(dµ) +∫ +∞

T

N2,n(dµ)

).

Hence

E{|gTn (z) − gn(z)|}

� 1

y0

(∫ +∞

T

E{N1,n(dµ)} +∫ +∞

T

E{N2,n(dµ)})

. (4.26)

Since the norms of matrices $H_1^T$ and $H_2^T$ are bounded, the result of Lemma 4.1 is applicable to the function $g_n^T(z)$, so that, in particular, for n → ∞ it converges in probability to a function $f^T(z)$ obeying the system

$$f^T(z)\bigl(1 + zf^T(z)\bigr) = \Delta_1^T(z)\Delta_2^T(z),$$

$$\Delta_1^T(z) = f_2^T\Bigl(\frac{z\Delta_2^T(z)}{1 + zf^T(z)}\Bigr), \qquad \Delta_2^T(z) = f_1^T\Bigl(\frac{z\Delta_1^T(z)}{1 + zf^T(z)}\Bigr).$$

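In addition, since $E\{g_n^T(z)\}$ and $E\{\delta_{r,n}^T(z)\}$, r = 1, 2 are bounded uniformly in n and T for z ∈ C \ R+,ξ,

$$|E\{g_n^T(z)\}| \le \frac{1}{\xi}, \qquad |E\{\delta_{r,n}^T(z)\}| \le \frac{1}{\xi}\int_0^{+\infty}\lambda E\{N_{r,n}^T(d\lambda)\} \le \frac{1}{\xi}\int_0^{+\infty}\lambda E\{N_{r,n}(d\lambda)\} \le \frac{(m^{(2)})^{1/2}}{\xi},$$

we have

$$|f^T(z)| \le \frac{1}{\xi}, \qquad |\Delta_{1,2}^T(z)| \le \frac{(m^{(2)})^{1/2}}{\xi}. \qquad (4.27)$$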

Besides, as a result of (3.43), (3.44) and (3.45), we have

$$1 + zf^T(z) = -z^{-1}\bigl(m_1^Tm_2^T + O(|\operatorname{Im} z|^{-1})\bigr), \qquad z\Delta_{1,2}^T(z) = -m_{1,2}^T + O(|\operatorname{Im} z|^{-1}), \qquad (4.28)$$

where

$$m_r^T = \int_0^{+\infty}\lambda E\{N_{r,n}^T(d\lambda)\}, \quad r = 1, 2.$$

Besides, for any sequence Tk → ∞, the sequences of analytic functions $\{f^{T_k}(z)\}$ and $\{\Delta_{1,2}^{T_k}(z)\}$ are relatively compact with respect to uniform convergence on any compact subset of C \ R+,ξ. In addition, according to (4.23), the measures $N_{1,2}^{T_k}$ converge weakly to the limiting measures N1,2 and

$$m_{1,2}^T \to m_{1,2}, \quad T \to \infty. \qquad (4.29)$$

Hence, there exist three analytic functions f(z), Δ1(z), Δ2(z) verifying (2.22). Relations (4.28) and (4.29) imply that f(z), Δ1(z), Δ2(z) satisfy the conditions of Proposition 3.3 and, hence, they are uniquely defined.

Furthermore, for z ∈ E(y0), we have

$$E\{|g_n(z) - f(z)|\} \le E\{|g_n(z) - g_n^{T_{k'}}(z)|\} + E\{|g_n^{T_{k'}}(z) - f^{T_{k'}}(z)|\} + |f^{T_{k'}}(z) - f(z)|,$$

where $f^{T_{k'}}(z)$ denotes a convergent subsequence of $f^{T_k}(z)$. Hence, in view of (4.26), the arguments above, and Lemma 6.1, we conclude that for each z ∈ E(y0)

$$\lim_{n\to\infty} E\{|g_n(z) - f(z)|\} = 0.$$

In view of Proposition 4.1, we conclude that the NCMs of the random matrices (2.2) converge weakly in probability to the nonrandom measure N whose Stieltjes transform f(z) satisfies (2.22). Theorem 2.1 is proved. ✷


5. Convergence with Probability 1 for Nonrandom Sn, Tn and in Probabilityfor Random Sn, Tn

The proof of Theorem 2.2 follows the scheme of the proof of Theorem 2.1. In this section, we briefly describe the steps that correspond to those used in the previous sections.

The first step of the proof of Theorem 2.2 consists in the following statements(cf. Theorem 3.1 and Proposition 3.3).

The following Theorem 5.1 generalizes Voiculescu's result on the multiplicative free convolution of probability measures on the unit circle [8], proved under the condition that the NCMs ν1,2 have nonzero first moments, i.e.

$$\int_0^{2\pi} e^{i\theta}\nu_r(d\theta) \ne 0, \quad r = 1, 2.$$

THEOREM 5.1 ([7]). Let Vn be a random n × n matrix of the form (2.1) in which Sn and Tn are nonrandom unitary matrices, Un and Wn are random independent unitary matrices distributed each according to the Haar measure on U(n). Assume that the normalized counting measures νr,n, r = 1, 2 of Sn and Tn converge weakly as n → ∞ to the nonnegative and probability measures on the unit circle νr, r = 1, 2, respectively. Then the normalized counting measure νn of Vn converges in probability to a nonrandom nonnegative and probability measure ν whose Herglotz transform (2.24) is the unique solution of the system (2.25) in the class of functions h(z), Δ1,2(z) analytic for |z| < 1 and satisfying conditions (2.14)–(2.15) and (2.26).

PROPOSITION 5.1. System (2.25) has a unique solution in the class of functions h(z), Δ1,2(z) analytic for |z| < 1 and satisfying conditions (2.14)–(2.15) and (2.26).

Proof. Assume that there exist two solutions (h′, Δ′1,2) and (h′′, Δ′′1,2) of the system. Denote δh = h′ − h′′, δΔ1,2 = Δ′1,2 − Δ′′1,2 and δf = f′ − f′′, where f′,′′ = (h′,′′ − 1)/2z. Then, by using (2.24) and the integral representation (2.13) for h1,2, we obtain the linear system for δf, δΔ1,2

$$\begin{aligned} (1 + a_1(z))\,\delta f - b_1(z)\,\delta\Delta_1 - c_1(z)\,\delta\Delta_2 &= 0, \\ a_2(z)\,\delta f + \delta\Delta_1 - b_2(z)\,\delta\Delta_2 &= 0, \\ a_3(z)\,\delta f - b_3(z)\,\delta\Delta_1 + \delta\Delta_2 &= 0, \end{aligned} \qquad (5.1)$$

where

$$a_1 = z(f' + f''), \quad b_1 = \Delta_2'', \quad c_1 = \Delta_1',$$

$$a_2 = \frac{zs_2'\,I_2(s_2', s_2'')}{1 + zf''}, \quad b_2 = \frac{z\,I_2(s_2', s_2'')}{1 + zf''}, \quad s_2^{\prime,\prime\prime} = \frac{z\Delta_2^{\prime,\prime\prime}}{1 + zf^{\prime,\prime\prime}}, \qquad (5.2)$$

$$I_2(z', z'') = \int_0^{2\pi}\frac{\nu_2(d\theta)}{(e^{i\theta} - z')(e^{i\theta} - z'')} \qquad (5.3)$$

and a3, b3 can be obtained from a2 and b2 by replacing the subindices 2 by 1 in the above formulas. For any 0 < d0 ≤ 1/4 consider the domain

$$D(d_0) = \{z \in \mathbb{C} : |z| \le d_0\}. \qquad (5.4)$$

By using (2.14) and (2.26), we obtain for z ∈ D(d0) that

$$\bigl|s_{1,2}^{\prime,\prime\prime}\bigr| \le 1/2, \qquad \bigl|I_{1,2}(s_{1,2}', s_{1,2}'')\bigr| \le 4$$

and, hence,

$$a_r = o(1), \quad r = 1, 2, 3, \qquad b_{2,3} = o(1), \quad z \to 0.$$

Thus, the determinant $1 + a_1 - b_2b_3 + b_1a_2 + c_1a_3 - a_1b_2b_3 + b_1b_2a_3 + c_1a_2b_3$ of the system (5.1) is asymptotically equal to 1. We conclude that if d0 in (5.4) is small enough, then system (5.1) has only the trivial solution, i.e. system (2.25) is uniquely soluble. ✷
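The smallness argument can be checked on a concrete instance. The sketch below, in plain Python, assumes hypothetical numerical values for the coefficients (small a_r and b_2, b_3, bounded b_1 and c_1) and computes the determinant of the coefficient matrix of (5.1) by cofactor expansion; the names are illustrative only.

```python
def det3(m):
    """Determinant of a 3x3 complex matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def system_matrix(a1, a2, a3, b1, b2, b3, c1):
    """Coefficient matrix of (5.1) in the unknowns (delta f, delta Delta_1, delta Delta_2)."""
    return [[1 + a1, -b1, -c1],
            [a2, 1, -b2],
            [a3, -b3, 1]]

# Hypothetical values: a_r, b_2, b_3 of order eps (they are o(1) as z -> 0),
# while b_1 and c_1 are merely bounded.
eps = 1e-3
b1, c1 = 0.8 - 0.3j, -0.5 + 0.9j
m = system_matrix(eps * (1 + 1j), eps * 2j, -eps,
                  b1, eps * (1 - 1j), eps * 0.5, c1)
d = det3(m)
```

Every term of the determinant other than the leading 1 contains at least one of the small coefficients, so d stays close to 1 and the homogeneous system has only the trivial solution.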

The proof of Theorem 5.1 coincides with the proof of Theorem 3.1 up to the substitution of matrices Sn and Tn for matrices An and Bn (and, correspondingly, V1 and V2 for H1 and H2). The proof for the unitary case is much simpler, e.g. the bounds analogous to (3.49) and (3.50) are as follows (and are also uniform in n):

$$|f_n(z)| \le \frac{1}{1 - |z|}, \qquad |\Delta_{2,n}(z)| \le \frac{1}{1 - |z|}, \qquad \Bigl|\frac{z\Delta_{2,n}(z)}{1 + zf_n(z)}\Bigr| \le 1/2, \quad |z| < 1/4.$$

Thus, for |z| < 1/4 the matrix $T - z\Delta_{2,n}(z)(1 + zf_n(z))^{-1}I$ is invertible uniformly in n and we need not require the first moments of the measures νr,n, r = 1, 2 to be nonzero.

The proof is also based on the following properties of variances.

THEOREM 5.2 ([7]). Let Vn be the random matrix of the form (2.1) satisfying the condition of Theorem 5.1. Denote

$$g_n(z) = n^{-1}\operatorname{Tr}(V - z)^{-1}, \qquad \delta_{r,n}(z) = n^{-1}\operatorname{Tr} V_r(V - z)^{-1}, \quad r = 1, 2. \qquad (5.5)$$

Then there exist d0 and C(d0), both independent of n and such that the variances of random variables (5.5) admit the bounds for |z| ≤ d0 < 1

$$v_1 = \langle|g_n(z) - \langle g_n(z)\rangle|^2\rangle \le \frac{C(d_0)}{n^2}, \qquad (5.6)$$

$$v_{1+r} = \langle|\delta_{r,n}(z) - \langle\delta_{r,n}(z)\rangle|^2\rangle \le \frac{C(d_0)}{n^2}, \quad r = 1, 2. \qquad (5.7)$$

Proof. The proof follows the scheme of the proof of Theorem 3.2, i.e. we derive and analyze the system of inequalities (3.62) (see [7] for more details). ✷

In addition, using the following relation between Stieltjes and Herglotz transforms

$$f(z) = \frac{h(z) - 1}{2z},$$

one can easily establish that (2.25) and (2.22) are equivalent.

Now we are going to prove Theorem 2.2. According to Theorem 5.2, the randomness of Un in (2.1) (or (2.17)) already provides vanishing variance of gn(z) and, hence, it also provides vanishing of the variance of the Herglotz transform of the NCM (2.3) (see (5.6)). Thus we have only to prove that the additional randomness due to the matrices Sn and Tn in (2.1) does not destroy this property.

We use the following analog of Proposition 4.1.

PROPOSITION 5.2. Let {µn} be a sequence of random measures on the unit circle defined on a certain probability space Ω and {hn} be the sequence of their Herglotz transforms. Then

(i) the sequence {µn} converges in probability to a nonrandom probability measure µ on the unit circle if and only if the sequence {hn} converges in probability for any fixed z belonging to a certain compact K ⊂ {z ∈ C | |z| < 1} to the Herglotz transform h of the measure µ;

(ii) if {µn} is a sequence of random nonnegative measures on the unit circle converging weakly in probability to a nonrandom nonnegative measure µ and if hn(z) are the Herglotz transforms of µn and h(z) is the Herglotz transform of µ, then the functions

$$p_n(z) = \frac{h_n(z) - 1}{2z}, \qquad p(z) = \frac{h(z) - 1}{2z}$$

are related as follows:

$$\lim_{n\to\infty} E\Bigl\{\sup_{z\in K}|p_n(z) - p(z)|\Bigr\} = 0$$

for any compact K of {z ∈ C | |z| < 1}.

This proposition can be easily proved by repeating the proof of Proposition 4.1 from [5].

Proof of Theorem 2.2. In view of Proposition 5.2, it suffices to show that

$$\lim_{n\to\infty} E\{|q_n(z) - h(z)|\} = 2|z|\lim_{n\to\infty} E\{|g_n(z) - f(z)|\} = 0$$

for any z belonging to a certain compact D(d0) of {z ∈ C : |z| < 1}, where f(z) = (h(z) − 1)/2z and

$$q_n(z) = \int_0^{2\pi}\frac{e^{i\mu} + z}{e^{i\mu} - z}\,\nu_n(d\mu), \quad |z| < 1.$$
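The relation f(z) = (h(z) − 1)/2z between the two transforms, which underlies the factor 2|z| above, follows from (e^{iθ} + z)/(e^{iθ} − z) − 1 = 2z/(e^{iθ} − z). A minimal sketch in plain Python, assuming a hypothetical set of eigenvalue phases on the unit circle (the function names are illustrative):

```python
import cmath

def herglotz(thetas, z):
    """h(z) = n^{-1} sum_k (e^{i theta_k} + z) / (e^{i theta_k} - z)."""
    return sum((cmath.exp(1j * t) + z) / (cmath.exp(1j * t) - z)
               for t in thetas) / len(thetas)

def stieltjes_circle(thetas, z):
    """f(z) = n^{-1} sum_k 1 / (e^{i theta_k} - z)."""
    return sum(1.0 / (cmath.exp(1j * t) - z) for t in thetas) / len(thetas)

# Hypothetical eigenvalue phases of a unitary matrix and a point with |z| < 1.
thetas = [0.3, 1.1, 2.9, 4.0, 5.5]
z = 0.2 + 0.1j

h = herglotz(thetas, z)
f = stieltjes_circle(thetas, z)
```

For this data f agrees with (h − 1)/(2z) up to rounding, Re h(z) > 0 inside the unit disk, and |f(z)| ≤ 1/(1 − |z|), in line with the universal bounds used in the proof.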


The results obtained in Theorems 5.1 and 5.2 are valid in our case for any fixed realization of random matrices Sn and Tn. In addition, all n-independent estimating quantities entering various bounds in the proofs of these theorems depend only on d0 but not on any particular realization of random matrices Sn and Tn. We will denote below all these quantities simply by the single symbol C, which may have different values in different formulas.

In particular, denoting as above by 〈 · 〉 the expectation with respect to the Haar measure and using (5.6), we can write that

$$E\{|g_n(z) - \langle g_n(z)\rangle|\} \le E\{|v_1^{1/2}(z)|\} \le \frac{C}{n}.$$

Thus, it suffices to show

limn→∞ E{|〈gn(z)〉 − f (z)|} = 0, z ∈ D(d0), (5.8)

where d0 is small enough. Introduce the quantities

γn(z) = (〈gn(z)〉 − f (z)), γr,n(z) = 〈δr,n(z)〉 − �r(z), r = 1, 2. (5.9)

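The first equation of system (2.25) and (3.31) lead to the identity

$$\gamma_n(z)\bigl(1 + z(\langle g_n(z)\rangle + f(z))\bigr) = \Delta_2(z)\gamma_{1,n}(z) + \langle\delta_{1,n}(z)\rangle\gamma_{2,n}(z) + r_{1,n}(z). \qquad (5.10)$$

By using the second equation of system (2.25), we can write the identity

$$\gamma_{1,n}(z) = f_2(t_{2,n}(z)) - f_2(t_2(z)) + \varepsilon_{1,n}(z), \qquad (5.11)$$

where

$$f_2(z) = \frac{h_2(z) - 1}{2z} = \int_0^{2\pi}\frac{\nu_2(d\theta)}{e^{i\theta} - z}, \qquad (5.12)$$

$$\varepsilon_{1,n}(z) = \langle\delta_{1,n}(z)\rangle - f_2(t_{2,n}(z)), \qquad (5.13)$$

$$t_{2,n}(z) = \frac{z\langle\delta_{2,n}(z)\rangle}{1 + z\langle g_n(z)\rangle}, \qquad t_2(z) = \frac{z\Delta_2(z)}{1 + zf(z)}. \qquad (5.14)$$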

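We have

$$E\{|\varepsilon_{1,n}(z)|\} \le E\{|\langle\delta_{1,n}(z)\rangle - g_{2,n}(t_{2,n}(z))|\} + E\{|g_{2,n}(t_{2,n}(z)) - f_2(t_{2,n}(z))|\}. \qquad (5.15)$$

The analogs of (3.53) and (3.54) in our case are

$$\langle\delta_{1,n}(z)\rangle = g_{2,n}\bigl(z\langle\delta_{2,n}(z)\rangle(1 + z\langle g_n(z)\rangle)^{-1}\bigr) + r_{2,n}(z), \qquad (5.16)$$

where g2,n(z) is the random function defined as follows (cf. (3.54))

$$g_{2,n}(z) = n^{-1}\operatorname{Tr} G_2(z) = \int_0^{2\pi}\frac{\nu_{2,n}(d\theta)}{e^{i\theta} - z},$$

$$r_{2,n}(z) = \frac{z}{1 + z\langle g_n(z)\rangle}\bigl(\langle n^{-1}\operatorname{Tr} Y_1P^{-1}\delta_{2,n}^\circ(z)\rangle - \langle n^{-1}\operatorname{Tr} UG(z)SUTU^*P^{-1}g_n^\circ(z)\rangle\bigr),$$

$P^{-1} = G_2(t_{2,n}(z))$, and

$$g_n^\circ(z) = g_n(z) - \langle g_n(z)\rangle, \qquad \delta_{r,n}^\circ(z) = \delta_{r,n}(z) - \langle\delta_{r,n}(z)\rangle, \quad r = 1, 2 \qquad (5.17)$$

are the respective random variables centered by the partial expectations with respect to the Haar measure. In addition, we have the analog of (3.58)

$$|r_{2,n}(z)| \le \frac{C}{n}.$$

This leads to the following bound for the first term in the right-hand side of (5.15):

$$E\{|\langle\delta_{1,n}(z)\rangle - g_{2,n}(t_{2,n}(z))|\} \le E\{|r_{2,n}(z)|\} \le \frac{C}{n}.$$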

The universal bounds

$$|g_n(z)| \le \frac{1}{1 - |z|}, \qquad |\delta_{r,n}(z)| \le \frac{1}{1 - |z|}, \quad r = 1, 2,$$

are valid on all realizations of random matrices Sn and Tn, which implies that for z ∈ D(d0), d0 ≤ 1/4,

$$|t_{r,n}(z)| \le 1/2, \quad r = 1, 2. \qquad (5.18)$$

Thus

$$E\{|g_{2,n}(t_{2,n}(z)) - f_2(t_{2,n}(z))|\} \le \sup_{|\zeta| \le 1/2} E\{|g_{2,n}(\zeta) - f_2(\zeta)|\}. \qquad (5.19)$$

The right-hand side of this inequality tends to zero as n → ∞ in view of the hypothesis of Theorem 2.2 and Proposition 5.2. Thus we have $\lim_{n\to\infty} E\{|\varepsilon_{1,n}(z)|\} = 0$ for all z ∈ D(d0). Analogous arguments show that $\lim_{n\to\infty} E\{|\varepsilon_{2,n}(z)|\} = 0$, where ε2,n(z) is defined from (5.13) and (5.14) by interchanging the indices 1 and 2. Thus we have

$$\lim_{n\to\infty} E\{|\varepsilon_{r,n}(z)|\} = 0, \quad r = 1, 2. \qquad (5.20)$$


Consider now the first term in the right-hand side of (5.11). In view of (5.12) and (5.14), we can write this term in the form

$$f_2(t_{2,n}(z)) - f_2(t_2(z)) = -\frac{zt_{2,n}(z)}{1 + zf(z)}I_2\gamma_n + \frac{z}{1 + zf(z)}I_2\gamma_{2,n} = -a_2\gamma_n + b_2\gamma_{2,n}, \qquad (5.21)$$

where I2, a2 and b2 are defined by formulas (5.2) and (5.3), in which we have to replace Δ′2, Δ′′2, f′ and f′′ by 〈δ2,n〉, Δ2, 〈gn〉 and f, respectively. Denote by

. = {.ij }3i,j=1 the matrix defined by the left-hand side of system (5.1) and by

H = {Hi}3i=1 the vector with components H1 = γn, H2 = γ1,n, H3 = γ2,n.

Thus, the rest of the proof of the theorem corresponds line by line to the proofof Lemma 4.1 after (4.19). ✷

6. Appendix

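PROPOSITION 6.1. Let Mr, r = 1, 2 be nonrandom n × n matrices and Un a unitary random matrix uniformly distributed over the unitary group U(n) with respect to the Haar measure. Then

(i)
$$\langle n^{-1}\operatorname{Tr} M_1U_n^*M_2U_n\rangle = n^{-1}\operatorname{Tr} M_1\; n^{-1}\operatorname{Tr} M_2; \qquad (6.1)$$

(ii)
$$\langle n^{-1}\operatorname{Tr}(M_1U_n^*M_2U_n)^2\rangle(1 - n^{-2}) = \bigl(n^{-1}\operatorname{Tr} M_1\bigr)^2 n^{-1}\operatorname{Tr} M_2^2 + n^{-1}\operatorname{Tr} M_1^2\,\bigl(n^{-1}\operatorname{Tr} M_2\bigr)^2 - \bigl(n^{-1}\operatorname{Tr} M_1\bigr)^2\bigl(n^{-1}\operatorname{Tr} M_2\bigr)^2 - n^{-3}\operatorname{Tr} M_1^2\; n^{-1}\operatorname{Tr} M_2^2, \qquad (6.2)$$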

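where 〈 · 〉 denotes the average over the unitary group U(n).

Proof. (i) Using Proposition 3.2 with $\Phi(M_2) = (M_1U^*M_2U)_{ab}$ we obtain

$$\langle(M_1[X, U^*M_2U])_{ab}\rangle = 0.$$

Choosing the matrix X with only the (a, b)th entry nonzero and applying the operation $n^{-2}\sum_{a,b=1}^n$ to this relation, we get (6.1).

(ii) On the other hand, using Proposition 3.2 with

$$\Phi(M_2) = (M_1U^*M_2UM_1U^*M_2U)_{ab},$$

after the same procedure, we obtain

$$\langle n^{-1}\operatorname{Tr}(M_1U_n^*M_2U_n)^2\rangle = n^{-1}\operatorname{Tr} M_1\,\langle n^{-1}\operatorname{Tr}(M_1U^*M_2^2U)\rangle + \langle n^{-1}\operatorname{Tr}(M_1^2U^*M_2U)\rangle\, n^{-1}\operatorname{Tr} M_2 - \bigl\langle\bigl(n^{-1}\operatorname{Tr}(M_1U^*M_2U)\bigr)^2\bigr\rangle. \qquad (6.3)$$

Besides, using Proposition 3.2 with

$$\Phi(M_2) = (M_1U^*M_2U)_{cc}(M_1U^*M_2U)_{ab},$$

we obtain

$$\langle(M_1[X, U^*M_2U])_{cc}(M_1U^*M_2U)_{ab}\rangle + \langle(M_1U^*M_2U)_{cc}(M_1[X, U^*M_2U])_{ab}\rangle = 0.$$

Choosing the matrix X with only the (a, b)th entry nonzero and applying the operation $n^{-3}\sum_{a,b,c=1}^n$ to this relation, we get

$$\bigl\langle\bigl(n^{-1}\operatorname{Tr}(M_1U^*M_2U)\bigr)^2\bigr\rangle = \bigl(n^{-1}\operatorname{Tr} M_1\bigr)^2\bigl(n^{-1}\operatorname{Tr} M_2\bigr)^2 + n^{-2}\langle n^{-1}\operatorname{Tr} M_1^2U_n^*M_2^2U_n\rangle - n^{-2}\langle n^{-1}\operatorname{Tr}(M_1U_n^*M_2U_n)^2\rangle.$$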

Substituting the relation above in (6.3) and using (6.1) we obtain (6.2). ✷

REMARK 6.1. In this paper we deal with unitary and Hermitian matrices, i.e. we assume that the matrices Sn, Tn, Un and Wn in (2.1) are unitary and An and Bn in (2.2) are Hermitian. It is natural also to consider the case of orthogonal Sn, Tn, Un and Wn and real symmetric An and Bn. This case can be handled by using the analogue of formula (3.13) for the orthogonal group O(n). Indeed, it is easy to see that this analogue has the form

$$\int_{O(n)}\Phi'(O^TMO)\cdot[X, O^TMO]\,dO = 0,$$

where $O^T$ is the transpose of O and X is a real antisymmetric matrix. By using this formula we obtain instead of (3.25)

$$\langle(GH_1)_{aa}(H_2G)_{bb}\rangle - \langle(GH_1)_{ab}(H_2G)_{ab}\rangle = \langle(GH_1H_2)_{aa}G_{bb}\rangle - \langle(GH_1H_2)_{ab}G_{ab}\rangle.$$

The second terms in both sides of this formula give two additional terms in (3.32)

$$\langle n^{-2}\operatorname{Tr}(GH_1)^TH_2G\rangle - \langle n^{-2}\operatorname{Tr}(GH_1H_2)^TG\rangle.$$

These terms, however, produce an asymptotically vanishing contribution to the remainder (3.32), because, in view of (3.5), (3.8) and (3.9), we have in the case of real symmetric An and Bn

$$\bigl|\langle n^{-2}\operatorname{Tr}(GH_1)^TH_2G\rangle - \langle n^{-2}\operatorname{Tr}(GH_1H_2)^TG\rangle\bigr| \le \frac{2}{ny_0^2}\,m_4^{2/4}$$

and in the case of orthogonal Sn and Tn

$$\bigl|\langle n^{-2}\operatorname{Tr}(GV_1)^TV_2G\rangle - \langle n^{-2}\operatorname{Tr}(GV_1V_2)^TG\rangle\bigr| \le \frac{2}{n(1 - d_0)^2}.$$

Similar terms, also negligible as n → ∞, appear in formulas (3.55), (3.41), (3.67) and (3.72) of the proofs of Theorems 3.1 and 3.2. As a result, in this case we obtain the same system (2.25), defining the Herglotz transform of the limiting eigenvalue counting measure of the analogue of (2.1) with orthogonal Sn and Tn and orthogonal Haar-distributed Un and Wn, and the same system (2.22), defining the Stieltjes transform of the limiting eigenvalue counting measure of the analogue of (2.2) with real symmetric An and Bn and orthogonal Haar-distributed Un and Wn.

Acknowledgements

I am thankful to Prof. Anne Boutet de Monvel, Prof. A. Khorunzhy and Prof. L. Pastur for numerous helpful discussions. I also thank the Ministère des Affaires Etrangères de France for financial support.

References

1. Akhiezer, N. I. and Glazman, I. M.: Theory of Linear Operators in Hilbert Space, Frederick Ungar Publishing Co., New York, 1963.

2. Bercovici, H. and Voiculescu, D. V.: Free convolution of measures with unbounded support, Indiana Univ. Math. J. 42 (1993), 733–773.

3. Janik, R. A., Nowak, M. A., Papp, G., Wambach, J., and Zahed, I.: Nonhermitian random matrix models: a free random variable approach, Phys. Rev. E 55 (1997), 4100–4106.

4. Marchenko, V. A. and Pastur, L. A.: Distribution of eigenvalues for some sets of random matrices, Mat. Sb. (N.S.) 72 (114) (1967), 507–536 (Russian).

5. Pastur, L. and Vasilchuk, V.: On the law of addition of random matrices, Comm. Math. Phys. 214 (2000), 249–286.

6. Speicher, R.: Free convolution and the random sum of matrices, Publ. Res. Inst. Math. Sci. 29 (1993), 731–744.

7. Vasilchuk, V.: On the law of multiplication of unitary random matrices, Mat. Fiz. Anal. Geom. 7 (2000), 266–283 (Russian).

8. Voiculescu, D. V.: Limit laws for random matrices and free products, Invent. Math. 104 (1991), 201–220.

9. Voiculescu, D.: A strengthened asymptotic freeness result for random matrices with applications to free entropy, Internat. Math. Res. Notices 1998, No. 1, 41–62.

10. Voiculescu, D. V., Dykema, K. J., and Nica, A.: Free Random Variables, CRM Monograph Series, Amer. Math. Soc., Providence, RI, 1992.



Smoothing Properties of the Heat Semigroups Associated to Hamiltonians Describing Point Interactions in One and Two Dimensions

A. BEN AMOR and PH. BLANCHARD
Universität Bielefeld, Fakultät für Physik, D-33615, Bielefeld, Germany

(Received: 30 June 2000; in final form: 12 April 2001)

Abstract. Smoothing properties of the heat semigroups associated to Hamiltonians describing point interactions in one and two dimensions are investigated. A construction of Hamiltonians describing point interaction on Lp-spaces is then derived and a full description of their spectra is given. In particular, we prove the p-independence of their spectra and the exponential growth of the p-norms of such semigroups for large time.

Mathematics Subject Classifications (2000): primary 79-XX, secondary 81Q10.

Key words: heat semigroup, point interaction, smoothing property, spectrum.

1. Introduction

By the heat semigroup we mean the semigroup associated to the equation

$$\frac{\partial\psi}{\partial t} = -H\psi, \qquad (1)$$

where H is a selfadjoint operator, and we will denote it by exp(−tH). The most important case in applications is when H is a generalized Schrödinger operator on R^d, i.e. a perturbation of minus the Laplacian by a suitable measure (for example, in the generalized Kato class) and especially by a potential. Constructions and investigations of the heat semigroups exp(−tH), where H is a Schrödinger operator, are the subject of an extensive literature, rich in its methods (analytic, probabilistic, or related to operator theory) and contents. In a survey paper, B. Simon [19] proved the Lp-smoothing property of exp(−tH) with H = −Δ + V for a large class of potentials: if the negative part of the potential is in the Kato class and its positive part is in the local Kato class, then for every t > 0, exp(−tH) maps Lp continuously into Lq, 1 ≤ p ≤ q ≤ ∞. In the same paper, the Lp-growth of these semigroups is also studied. For instance, it is proved that for the class of potentials described above, ‖exp(−tH)‖p,p has an exponential growth for large t, namely

$$\|\exp(-tH)\|_{p,p} \le C\exp(\alpha t) \qquad (2)$$


for large t, where C is a positive constant and α ∈ R. Let us emphasize that for d = 1, 2 and V < 0, and due to the existence of negative eigenvalues of H, one has α > 0. Among other interesting features it is proved in [19] that the spectral bound,

$$\inf\sigma(H) = -\lim_{t\to+\infty} t^{-1}\ln\|\exp(-tH)\|_{p,p},$$

is p-independent. This result is now a well-known fact. For instance, R. Hempel and J. Voigt [15, 16] proved that for certain potentials (including those in the Kato class) the spectrum of H is p-independent as well. A more detailed answer to the question asked by B. Simon about p-independence of spectra of Schrödinger operators can be found in [16].

Later on, Ph. Blanchard and Z. M. Ma [4], proved also the Lp-smoothing prop-erty for the heat semigroups associated to H = −�+µ on R

d (d � 3), where µ is asigned measure whose positive part is smooth and negative part is in the generalizedKato class. In an abstract setting, P. Stollmann and J. Voigt [20] investigated theproperties of exp(−tHµ), Hµ = −� + µ on the space L2(X,m), where m is ameasure whose support is X.

Recently (cf. [13] and references therein), a particular interest was showed forthe study of smoothing properties of exp(−tH ), where H = − 1

2�+V on the scaleof Bessel potential spaces on R

d . In [14] (Theorem 1.1), the authors proved thatfor V ∈ Kd,loc with negative part V − ∈ Kd , the boundedness of exp(−tH ) fromLp into L

p,s+2loc is equivalent to V ∈ L

p,s

loc , while A. Gulisashvili [13] gave a sharpestimate of the Lp − L

p,s+2loc norm of exp(−tH ) for V ∈ Lp,s ∩ Kd , where Kd is

the Kato class of potentials and Kd,loc the local Kato class of potentials.Hamiltonians describing point interactions, or Schrödinger operators with point

interactions, are operators corresponding to perturbations of minus the Laplacian by a linear combination of Dirac measures. The mathematical setting and the description of such Hamiltonians are now quite well known [3], and a new approach to handling them is given in [5]. It is known [3] that for $d = 2, 3$ there is only one kind of point interaction, which can be denoted by $H_\alpha = -\Delta + \alpha\delta$. In contrast, in one dimension there are more possibilities of point interactions. However, we will concentrate here on a one-parameter family corresponding to the δ′-interaction [3], which we shall also denote by $H_\alpha$, $-\infty < \alpha \le +\infty$. This family is related to self-adjoint extensions of the operator $-\Delta$ with domain the Sobolev space $H^2_0(\mathbb{R} \setminus \{0\})$ and is determined by the boundary condition:

ϕ′(0+) = ϕ′(0−), ϕ(0+) − ϕ(0−) = αϕ′(0), (3)

for every $\varphi$ in the domain of $H_\alpha$. Hence, up to now operators describing point interactions in $\mathbb{R}^d$ do not fit into a standard situation and, in any case, the theory developed in [20] does not include such operators. However, we are saved by the explicit knowledge of the kernels associated to $H_\alpha$ [1]. Our aim in this paper is to use these formulas to establish $p, q$ smoothing properties of $H_\alpha$ on $L^p(\mathbb{R}^d)$ for $d = 1, 2$. As a consequence, we get a construction of the operators $H_\alpha$ on $L^p$-spaces for $1 \le p < +\infty$, which we denote by $H_{\alpha,p}$ (for $p = 2$ we omit the subscript $p$). Then, combining the techniques used by J. Voigt and P. Stollmann [15, 16] and the new functional calculus developed by E. B. Davies [8], we prove the p-independence of their spectra and derive the exponential growth of the p-norm of $\exp(-tH_{\alpha,p})$.

Such a construction was done by S. Albeverio et al. [2] using the 'family of pseudo-resolvents', thereby exploiting the expression of the resolvent kernel of $H_\alpha$. They carry out the construction for $d = 1$, $p \in [1, +\infty)$, or $d = 2$, $p \in\, ]1, +\infty[$, or $d = 3$, $p \in\, ]\tfrac{3}{2}, 3[$, and do the same for the $C_0$-semigroup $\exp(-tH_\alpha)$. We will use here the reverse strategy: using the explicit formula of the heat kernel associated to $H_\alpha$ (cf. [1]), which we denote, as in [1], by $P_\alpha(t; x, y)$, we prove that for $d = 1, 2$ this kernel also defines a bounded linear operator from $L^p$ into $L^q$ for $1 \le p \le q \le +\infty$. This is the well-known 'smoothing property'. Moreover, this kernel even defines a strongly continuous semigroup on $L^p$ for $1 \le p < +\infty$. Then we denote by $-H_{\alpha,p}$ the generator of $\exp(-tH_\alpha)$ on $L^p$ ($1 \le p < +\infty$). Using the integral representation of the resolvent function [9], p. 55 (which is the Laplace transform of the semigroup), we get that for $k^2$ such that $\mathrm{Im}(k) > 0$ and $\mathrm{Re}(k^2) < \min(s(H_\alpha), s(H_{\alpha,p}))$, $(H_{\alpha,p} - k^2)^{-1}$ is a kernel operator whose kernel is $G_k$, where $G_k$ is the kernel of $(H_\alpha - k^2)^{-1}$. Hence, for $d = 1, 2$, the construction we propose in this paper includes the one made in [2]. A natural question arises: why is it so? This is related to the properties of the heat kernel, which has better properties than the resolvent kernel. A good example is a comparison between the resolvent and the heat kernel of $-\Delta$.

Unfortunately, this method does not work in three dimensions, for a reason that we will explain at the end of the paper.

For the notation we adopt that used in [3]. So $H_\alpha$ is the Hamiltonian describing the point interaction placed at the origin in the space $L^2$, which corresponds in one dimension to the δ′-interaction. We shall denote the free Hamiltonian by $H_0$. The space $L^p(\mathbb{R}^d)$ is denoted simply by $L^p$ for every $1 \le p \le +\infty$. For every closed linear operator $T$, the spectrum of $T$ is denoted by $\sigma(T)$, whereas its spectral bound is denoted by $s(T)$ and is defined by [9]

\[
s(T) = \inf\{\mathrm{Re}(\lambda) : \lambda \in \sigma(T)\}. \tag{4}
\]

2. Smoothing Property in Two Dimensions

Following the notation of [1], we denote the heat kernel of $\exp(-tH_\alpha)$ by $P_\alpha(t; x, y)$ for every $t > 0$. It is given by [1]

\[
P_\alpha(t; x, y) = P(t; x, y) + \frac{1}{2\pi} \int_0^{+\infty} t^{u-1}\, \frac{e^{-\alpha u}}{\Gamma(u)} \int_1^{+\infty} (z - 1)^{u-1} z^{-u}\, e^{-z\frac{|x|^2 + |y|^2}{4t}}\, K_0\!\left(\frac{|x||y|}{2t}\, z\right) dz\, du, \tag{5}
\]

where

\[
P(t; x, y) = \frac{1}{4\pi t} \exp\!\left(-\frac{|x - y|^2}{4t}\right)
\]

is the kernel of the free Hamiltonian and $K_0$ is the Macdonald function. We will also denote by $\tilde{P}_\alpha(t; x, y)$ the difference $P_\alpha(t; x, y) - P(t; x, y)$.

A way to prove the smoothing property of $\exp(-tH_\alpha)$ has been developed in [19]: first prove the boundedness from $L^\infty$ into $L^\infty$ and the boundedness from $L^1$ into $L^\infty$, and then use the Riesz–Thorin convexity theorem [18] to conclude. The first step is given by this lemma.

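LEMMA 2.1. For every $t > 0$, we have

\[
\sup_{x \in \mathbb{R}^2} \int_{\mathbb{R}^2} P_\alpha(t; x, y)\, dy < +\infty. \tag{6}
\]

Proof. Since

\[
\sup_{x \in \mathbb{R}^2} \int_{\mathbb{R}^2} P(t; x, y)\, dy = 1,
\]

we just have to prove that the difference $\tilde{P}_\alpha = P_\alpha - P$ satisfies

\[
\sup_{x \in \mathbb{R}^2} \int_{\mathbb{R}^2} \tilde{P}_\alpha(t; x, y)\, dy < +\infty. \tag{7}
\]

Set

\[
I(x) = \int_{\mathbb{R}^2} \tilde{P}_\alpha(t; x, y)\, dy
\]

and

\[
J(x) = \int_{\mathbb{R}^2} e^{-z\frac{|x|^2 + |y|^2}{4t}}\, K_0\!\left(\frac{|x||y|}{2t}\, z\right) dy.
\]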

Then

\[
I(x) = \frac{1}{2\pi} \int_0^{+\infty} t^{u-1}\, \frac{e^{-\alpha u}}{\Gamma(u)} \int_1^{+\infty} (z - 1)^{u-1} z^{-u} J(x)\, dz\, du. \tag{8}
\]

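Let us make a suitable estimate for $J(x)$. Using polar coordinates, we get

\[
J(x) = 2\pi e^{-\frac{z|x|^2}{4t}} \int_0^{+\infty} e^{-\frac{z r^2}{4t}}\, K_0\!\left(\frac{z r |x|}{2t}\right) r\, dr.
\]

With the change of variable $s = z r|x|/(2t)$ one gets

\[
J(x) = 2\pi e^{-\frac{z|x|^2}{4t}} \left(\frac{2t}{z|x|}\right)^{2} \int_0^{+\infty} e^{-\frac{t}{z|x|^2} s^2}\, s K_0(s)\, ds,
\]

which is equal to (cf. [12], p. 717, 3)

\[
2\pi e^{-\frac{z|x|^2}{4t}}\, \frac{1}{2} \left(\frac{z|x|^2}{t}\right)^{\frac{1}{2}} e^{\frac{z|x|^2}{8t}}\, W_{-\frac{1}{2},0}\!\left(\frac{z|x|^2}{4t}\right), \tag{9}
\]

where $W_{\chi,\mu}$ is the Whittaker function. On the other hand, one has (cf. [17], p. 305) $W_{-\frac{1}{2},0}(z) = z^{\frac{1}{2}} e^{\frac{z}{2}}$, which gives

\[
J(x) = \frac{2\pi}{\sqrt{2}}\, \frac{t}{z}.
\]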

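Thus

\[
\sup_{x \in \mathbb{R}^2} I(x) = \frac{1}{\sqrt{2}} \int_0^{+\infty} t^{u}\, \frac{e^{-\alpha u}}{\Gamma(u)} \int_1^{+\infty} (z - 1)^{u-1} z^{-u-1}\, dz\, du = \frac{1}{\sqrt{2}}\, \nu(t e^{-\alpha}), \tag{10}
\]

where the function $\nu$ is defined by (cf. [10]):

\[
\nu(x) = \int_0^{+\infty} \frac{x^{s}}{\Gamma(s + 1)}\, ds.
\]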
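The last equality in (10) uses the fact that the inner integral over z equals 1/u (substitute w = 1 − 1/z), so that the integrand becomes (te^{−α})^u / Γ(u + 1). To get a feeling for the size of the resulting constant, the following is a minimal numerical sketch of ν and of the quantity C(t) = 1 + ν(te^{−α})/√2 used in the proof of Proposition 2.1. The function names, the cut-off `upper` and the step count are assumptions of this illustration, not part of the paper.

```python
import math


def nu(x, upper=60.0, steps=60000):
    """Trapezoidal approximation of nu(x) = int_0^inf x**s / Gamma(s + 1) ds.

    The cut-off `upper` and the number of steps are assumptions of this
    sketch; they are adequate only for moderate x (say 0 < x <= 5).
    """
    if x <= 0.0:
        return 0.0
    log_x = math.log(x)
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        s = i * h
        # Work with logarithms so that Gamma(s + 1) never overflows.
        term = math.exp(s * log_x - math.lgamma(s + 1.0))
        total += 0.5 * term if i in (0, steps) else term
    return total * h


def semigroup_constant(t, alpha):
    """C(t) = 1 + nu(t * exp(-alpha)) / sqrt(2), the bound used for exp(-t H_alpha)."""
    return 1.0 + nu(t * math.exp(-alpha)) / math.sqrt(2.0)


if __name__ == "__main__":
    for t in (1e-6, 1e-3, 1.0):
        print(t, nu(t), semigroup_constant(t, alpha=0.0))
```

In this sketch ν(x) decreases to 0 as x → 0, roughly like 1/|log x|, which is in line with the small-time estimate (28).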

From this follows

PROPOSITION 2.1. For every $t > 0$ and every $p$ such that $1 \le p \le +\infty$ the operator

\[
\exp(-tH_\alpha) : L^p \to L^p \tag{11}
\]

is bounded and defines a strongly continuous semigroup for $1 \le p < +\infty$.

Proof. First let us set $C(t) = 1 + \frac{1}{\sqrt{2}}\, \nu(t e^{-\alpha})$. For $1 < p \le +\infty$, let $q$ be the conjugate of $p$: $p^{-1} + q^{-1} = 1$. Then, by the Hölder inequality, we have for every $f \in L^p$

\[
|\exp(-tH_\alpha) f(x)|^{p} \le (C(t))^{\frac{p}{q}} \int_{\mathbb{R}^2} P_\alpha(t; x, y)\, |f(y)|^{p}\, dy, \tag{12}
\]

and thereby

\[
\int_{\mathbb{R}^2} |\exp(-tH_\alpha) f(x)|^{p}\, dx \le (C(t))^{\frac{p}{q}} \int_{\mathbb{R}^2} |f(y)|^{p} \left(\int_{\mathbb{R}^2} P_\alpha(t; x, y)\, dx\right) dy. \tag{13}
\]

Now using the symmetry property of the heat kernel, we get

\[
\int_{\mathbb{R}^2} |\exp(-tH_\alpha) f(x)|^{p}\, dx \le (C(t))^{p} \int_{\mathbb{R}^2} |f|^{p}\, dy. \tag{14}
\]

To prove that it defines a strongly continuous semigroup, let us denote by $S(t)$ the operator whose kernel is $\tilde{P}_\alpha = P_\alpha - P$; then we have $\|S(t)\|_{p,p} \le \frac{1}{\sqrt{2}}\, \nu(t e^{-\alpha})$. By the properties of the function $\nu$ [10], p. 219, we have $\lim_{t \to 0} \nu(t) = 0$, and we then get

\[
\lim_{t \to 0} \exp(-tH_\alpha) = I. \tag{15}
\]

Now defining $T_\alpha(t)$ by $T_\alpha(t) = \exp(-tH_\alpha)$ for $t > 0$ and $T_\alpha(0) = I$, we get a strongly continuous semigroup which we shall denote by $\exp(-tH_\alpha)$. For $p = 1$, the proof is straightforward. ✷

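We are now in a position to prove boundedness from $L^1$ into $L^\infty$.

LEMMA 2.2. For every $t > 0$ we have

\[
\sup_{x, y \in \mathbb{R}^2} P_\alpha(t; x, y) < +\infty. \tag{16}
\]

Proof. To achieve this goal, we are going to use the construction of the Hamiltonian $H_\alpha$ via Dirichlet forms as done in [3], which we recall here. We omit the case $\alpha = \infty$, which corresponds to the free Hamiltonian. Let $\varphi_\alpha$ be the following function:

\[
\varphi_\alpha(x) = H^{(1)}_0\!\left(2i e^{-2\pi\alpha + \Psi(1)} |x|\right), \quad x \in \mathbb{R}^2 \setminus \{0\}, \tag{17}
\]

where $H^{(1)}_0$ is the Hankel function. Denote by $H_{\varphi_\alpha}$ the operator associated to the local positive Dirichlet form

\[
\mathcal{E}_{\varphi_\alpha} : D(\mathcal{E}_{\varphi_\alpha}) \subset L^2(\varphi_\alpha^2), \quad \mathcal{E}_{\varphi_\alpha}(f, g) = \int_{\mathbb{R}^2} \nabla f\, \nabla g\, \varphi_\alpha^2\, dx. \tag{18}
\]

Then the Hamiltonian $H_\alpha$ is related to this Dirichlet form by [3]

\[
H_\alpha = \varphi_\alpha [H_{\varphi_\alpha} - \beta I] \varphi_\alpha^{-1}, \tag{19}
\]

where $\beta = 4 e^{2(-2\pi\alpha + \Psi(1))}$. Clearly $\exp(-tH_\alpha) = e^{\beta t} \varphi_\alpha \exp(-tH_{\varphi_\alpha}) \varphi_\alpha^{-1}$.

Now if we denote by $q_\alpha(t; x, y)$ the heat kernel of the operator $\exp(-tH_{\varphi_\alpha})$, then $P_\alpha(t; x, y) = e^{\beta t} \varphi_\alpha(x) \varphi_\alpha^{-1}(y) q_\alpha(t; x, y)$. Since the Dirichlet form $\mathcal{E}_{\varphi_\alpha}$ is local and positive, the kernel $q_\alpha(t; x, y)$ is Markovian [11], hence $0 < q_\alpha(t; x, y) \le 1$. We therefore have on the diagonal the following estimate:

\[
P_\alpha(t; x, x) \le e^{\beta t}. \tag{20}
\]

Using the Chapman–Kolmogorov equation

\[
P_\alpha(t + s; x, y) = \int_{\mathbb{R}^2} P_\alpha(s; x, z) P_\alpha(t; z, y)\, dz, \tag{21}
\]

we get $P_\alpha(t; x, y) \le e^{\beta t}$, which completes the proof. ✷

We now formulate the following theorem: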


THEOREM 2.1. For every $t > 0$ and every $1 \le p \le q \le \infty$, the operator

\[
\exp(-tH_\alpha) : L^p \to L^q \tag{22}
\]

is bounded.
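The interpolation step behind Theorem 2.1 can be made concrete with a small helper. The sketch below only implements the exponent bookkeeping of the Riesz–Thorin convexity theorem: given bounds M0 for L^{p0} → L^{q0} and M1 for L^{p1} → L^{q1}, it returns the intermediate pair (p, q) and the interpolated bound M0^{1−θ} M1^{θ} (stated here for complex L^p spaces). The function name and the sample constants are hypothetical; they are not taken from the paper.

```python
import math


def _inverse(p):
    """Return 1/p with the convention 1/inf = 0."""
    return 0.0 if math.isinf(p) else 1.0 / p


def _from_inverse(v):
    """Return the exponent whose reciprocal is v (inf when v == 0)."""
    return math.inf if v == 0.0 else 1.0 / v


def riesz_thorin(p0, q0, m0, p1, q1, m1, theta):
    """Interpolate between L^{p0}->L^{q0} (norm <= m0) and L^{p1}->L^{q1} (norm <= m1).

    Returns (p, q, bound) with
        1/p = (1 - theta)/p0 + theta/p1,
        1/q = (1 - theta)/q0 + theta/q1,
        bound = m0**(1 - theta) * m1**theta.
    """
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    inv_p = (1.0 - theta) * _inverse(p0) + theta * _inverse(p1)
    inv_q = (1.0 - theta) * _inverse(q0) + theta * _inverse(q1)
    bound = m0 ** (1.0 - theta) * m1 ** theta
    return _from_inverse(inv_p), _from_inverse(inv_q), bound


if __name__ == "__main__":
    # Hypothetical constants: an L^1 -> L^inf bound (as in Lemma 2.2) and an
    # L^inf -> L^inf bound (as in Lemma 2.1), interpolated half-way.
    print(riesz_thorin(1.0, math.inf, 4.0, math.inf, math.inf, 9.0, 0.5))
```

Interpolating the L^1 → L^∞ and L^∞ → L^∞ endpoints in this way covers every pair (p, ∞); combining with the L^p → L^p bounds of Proposition 2.1 then reaches the full range 1 ≤ p ≤ q ≤ ∞.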

Theorem 2.1 gives a variety of properties of the heat semigroup $\exp(-tH_\alpha)$ known for a large class of Schrödinger operators, at least for operators $H = -\Delta + V$ where $V$ is in the Kato class.

For every $1 \le p < +\infty$, let us denote by $-H_{\alpha,p}$ the generator of $\exp(-tH_\alpha)$ on $L^p$, which will be denoted $\exp(-tH_{\alpha,p})$. For $p = 2$, we omit the index $p$. The interpretation of point interactions on $L^p$ spaces as an extension of $-\Delta$ on $D_0 = \{f \in C_0^\infty(\mathbb{R}^d),\ f(0) = 0\}$ was given in [6]. There the authors show that point interactions can be defined on $L^p$ for $d = 1, 2$ and $1 < p < +\infty$, or $d = 3$ and $\tfrac{3}{2} < p < 3$, as negative generators of analytic semigroups. Their construction is essentially based on estimates of the resolvent of $H_\alpha$ with $\alpha = 1$. We emphasize here that for $d = 1, 2$ we have fewer restrictions on $p$ than in [2] or in [6].

Now the question of the p-independence of the spectra of $H_\alpha$ arises. Let us note that for $p = +\infty$ one cannot hope to get the inclusion $\sigma(H_\alpha) \subset \sigma(H_{\alpha,p})$, for the eigenfunction of $H_\alpha$, which is equal to

\[
\frac{i}{4} H^{(1)}_0\!\left(2i e^{\beta} |x|\right), \quad x \ne 0, \tag{23}
\]

where $\beta = -2\pi\alpha + \Psi(1)$, is unbounded.

PROPOSITION 2.2. For every $t > 0$ and $p$ such that $1 \le p < +\infty$ we have

(i) The spectrum of $H_{\alpha,p}$ is p-independent.
(ii) Every isolated eigenvalue of $H_\alpha$ of algebraic multiplicity $m$ is an isolated eigenvalue of $H_{\alpha,p}$ with the same multiplicity, and conversely.
(iii) The spectral bound of $H_{\alpha,p}$ satisfies

\[
-\lim_{t \to \infty} t^{-1} \ln \|\exp(-tH_{\alpha,p})\|_{p,p} = -4 \exp(2(2\alpha + \Psi(1))). \tag{24}
\]

Proof. The proof of assertion (ii) is as in [15]; (iii) follows from (i) and the characterization of the spectral bound of generators of strongly continuous semigroups [9], p. 299. So the important point is to prove (i). Following R. Hempel and J. Voigt [15], we are going to prove first that

\[
\sigma(H_\alpha) \subset \sigma(H_{\alpha,p}). \tag{25}
\]

A crucial argument to prove (25) is that, for every $1 \le p \le q < +\infty$ and every $t > 0$, the operator $\exp(-tH_{\alpha,p})$ maps $L^p$ continuously into $L^q$, for $\exp(-tH_{\alpha,p})$ has the same kernel as $\exp(-tH_\alpha)$. Once this is observed, one can continue the proof of (25) as in [20].

Let us now prove the reverse inclusion. We may suppose that $2 \le p \le +\infty$, and then conclude by duality. Given $\xi \in \rho(H_\alpha)$ and $f \in L^p \cap L^2$, we have in fact $\exp(-tH_\alpha) f = \exp(-tH_{\alpha,p}) f$. By the integral representation of the resolvent function [9], we then have, for every $\xi$ such that $\mathrm{Re}(\xi) < \min(s(H_\alpha), s(H_{\alpha,p}))$, $R_\xi(H_\alpha) f = R_\xi(H_{\alpha,p}) f$, which implies that $H_\alpha f = H_{\alpha,p} f$. Since $\rho(H_\alpha)$ is connected, we get, for every $\xi \in \rho(H_\alpha)$ and every $f \in L^p \cap L^2$,

\[
(H_\alpha - \xi I)^{-1} (H_{\alpha,p} - \xi I) f = f. \tag{26}
\]

It is now sufficient to prove that $T(\xi) = (H_\alpha - \xi I)^{-1}$ is bounded as an operator on $L^p$. Indeed, from the kernel formula of $(H_\alpha - \xi I)^{-1}$ [3], we get that $T(\xi)$ is a closed operator in $L^\infty$ whose domain is the whole space $L^\infty$; hence by the Banach theorem we conclude that $T(\xi)$ is bounded on $L^\infty$. On the other hand, it is bounded on $L^2$; thus by the Riesz–Thorin convexity theorem, we get the boundedness of $T(\xi)$ on $L^p$ for every $2 \le p \le +\infty$, which implies the result. ✷

Remark 2.1. From Proposition 2.2, we get

\[
\|\exp(-tH_{\alpha,p})\|_{p,p} \sim \exp(4t \exp(2\alpha + \Psi(1))) \tag{27}
\]

for large $t$, which expresses the exponential growth of the p-norm of the operator $\exp(-tH_{\alpha,p})$, while for small $t$ we have

\[
\|\exp(-tH_{\alpha,p})\|_{p,p} \le 1 + \frac{C}{|\log(t)|}. \tag{28}
\]

In [4] it is proved that for $1 \le p < +\infty$ and every $f \in L^p$ the function $\exp(-tH) f$ tends to zero at infinity. The same phenomenon occurs in our situation.

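PROPOSITION 2.3. For $1 \le p < +\infty$ and $f \in L^p$ we have

\[
\lim_{|x| \to \infty} |\exp(-tH_{\alpha,p}) f(x)| = 0. \tag{29}
\]

Proof. We give the proof for $p = 2$; for $p \ne 2$, the proof is more or less the same. The proof is based on the explicit formula of the heat kernel [1], Equation (3.16), namely,

\[
P_\alpha(t; x, y) = P(t; x, y) + \frac{e^{-\frac{A}{t}}}{(4\pi t |x||y|)^{\frac{1}{2}}} \int_0^{+\infty} \frac{t^{u} e^{-\alpha u}}{\Gamma(u)} \int_0^{+\infty} \frac{r^{u-1}}{(r + 1)^{u + \frac{1}{2}}}\, e^{-\frac{A}{t} r}\, \tilde{K}_0\!\left(\frac{|x||y|}{2t}(r + 1)\right) dr\, du, \tag{30}
\]

where

\[
A = \frac{(|x| + |y|)^2}{4} \quad \text{and} \quad \tilde{K}_0(z) = \sqrt{\frac{2z}{\pi}}\, \exp(z)\, K_0(z).
\]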

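Let us recall that [1]

\[
\sup_{r \ge 0} |\tilde{K}_0(r)| = M < +\infty, \tag{31}
\]

where $\tilde{K}_0(z) = \sqrt{2z/\pi}\, \exp(z) K_0(z)$. Taking into account that

\[
\lim_{|x| \to +\infty} \exp(-tH_0) f(x) = 0,
\]

we should just prove that

\[
\frac{1}{(4\pi t |x|)^{\frac{1}{2}}} \int_0^{+\infty} \frac{t^{u} e^{-\alpha u}}{\Gamma(u)} \int_0^{+\infty} \frac{r^{u-1}}{(r + 1)^{u + \frac{1}{2}}} \int_{\mathbb{R}^2} \frac{e^{-\frac{A}{t}}}{|y|^{\frac{1}{2}}}\, e^{-\frac{A}{t} r}\, \tilde{K}_0\!\left(\frac{|x||y|}{2t}(r + 1)\right) f(y)\, dy\, dr\, du \tag{32}
\]

tends to zero for $|x| \to \infty$. We denote by $A(x)$ the last term in Equation (32); then

\[
A(x) \le M \int_{\mathbb{R}^2} \frac{1}{|y|^{\frac{1}{2}}}\, e^{-\frac{(|x| + |y|)^2}{4t}}\, |f(y)|\, dy,
\]

for which, by the Hölder inequality, we obtain

\[
A(x) \le 2\pi^{\frac{1}{2}} M \|f\|_{L^2} \left(\int_0^{+\infty} e^{-\frac{s^2}{2t}}\, ds\right)^{\frac{1}{2}}. \tag{33}
\]

This yields

\[
|\exp(-tH_{\alpha,p}) f(x)| \le |\exp(-tH_0) f(x)| + \frac{C \nu(t e^{-\alpha})}{|x|^{\frac{1}{2}}}\, \|f\|_{L^2}, \tag{34}
\]

where $C > 0$, and this completes the proof. ✷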

3. Smoothing Property in One Dimension

In one dimension, the Hamiltonian $H_\alpha$ corresponds for $\alpha = 0$ to the free Hamiltonian, so in this section we will omit the case $\alpha = 0$. For $\alpha \ne 0$, the heat kernel of $H_\alpha$ is given by [1] (Formula 3.4),

\[
P_\alpha(t; x, y) = P(t; x, y) + \frac{\mathrm{sgn}(xy)}{\sqrt{4\pi t}}\, e^{-\frac{(|x| + |y|)^2}{4t}} + \frac{2\, \mathrm{sgn}(xy)}{\alpha \sqrt{4\pi t}} \int_0^{+\infty} e^{-\frac{2}{\alpha} u}\, e^{-\frac{(|x| + |y| + u)^2}{4t}}\, du, \tag{35}
\]

where

\[
P(t; x, y) = \frac{1}{\sqrt{4\pi t}}\, e^{-\frac{|x - y|^2}{4t}}
\]

is the heat kernel of the free one-dimensional Hamiltonian. The arguments used to prove the $p, q$-smoothing property of the heat semigroup in one dimension are quite similar to those used in the previous section. Indeed, we have the following easy lemma:

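LEMMA 3.1. For every $t > 0$, we have

\[
\sup_{x \in \mathbb{R}} \int_{\mathbb{R}} |P_\alpha(t; x, y)|\, dy < +\infty. \tag{36}
\]

Proof. A direct computation shows that, for every $\alpha > 0$, we have

\[
\sup_{x \in \mathbb{R}} \int_{\mathbb{R}} |P_\alpha(t; x, y)|\, dy \le C_1(t) = 2 + 2\sqrt{4\pi t}, \tag{37}
\]

while for $\alpha < 0$ the following estimate holds true:

\[
\sup_{x \in \mathbb{R}} \int_{\mathbb{R}} |P_\alpha(t; x, y)|\, dy \le 2 + \frac{2}{|\alpha|} \sqrt{\pi t}\, \exp\!\left(\frac{4t}{\alpha^2}\right). \tag{38}
\]

✷

LEMMA 3.2. For every $t > 0$, we have

\[
\sup_{x, y \in \mathbb{R}} |P_\alpha(t; x, y)| \le
\begin{cases}
C_1 + C_2 t^{-\frac{1}{2}}, & \alpha > 0, \\
C_1' + C_2' t^{-\frac{1}{2}} + \exp\!\left(\frac{4t}{\alpha^2}\right), & \alpha < 0,
\end{cases}
\]

with positive constants $C_j$ and $C_j'$.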
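As a sanity check on Lemma 3.1, the kernel (35) can be evaluated by simple quadrature. The sketch below treats only the case α > 0 and compares the numerical value of ∫|P_α(t; x, y)| dy with the crude bound 3, which follows by bounding each of the three terms in (35) by 1 in L^1 (this bound is an observation of this illustration, not a constant quoted from the paper). All function names, grid sizes and cut-offs are assumptions chosen for t of order one.

```python
import math


def free_kernel(t, x, y):
    """Free one-dimensional heat kernel P(t; x, y)."""
    return math.exp(-((x - y) ** 2) / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)


def sign(v):
    """Sign function with sign(0) = 0."""
    return (v > 0) - (v < 0)


def delta_prime_kernel(t, x, y, alpha, u_max=40.0, n_u=400):
    """Kernel (35) for alpha > 0; the u-integral uses a midpoint rule."""
    a = abs(x) + abs(y)
    s = sign(x * y)
    pref = 1.0 / math.sqrt(4.0 * math.pi * t)
    h = u_max / n_u
    integral = 0.0
    for j in range(n_u):
        u = (j + 0.5) * h
        integral += math.exp(-2.0 * u / alpha - (a + u) ** 2 / (4.0 * t))
    integral *= h
    return (free_kernel(t, x, y)
            + s * pref * math.exp(-a * a / (4.0 * t))
            + 2.0 * s * pref / alpha * integral)


def l1_norm_in_y(t, x, alpha, half_width=20.0, n_y=400):
    """Midpoint approximation of the integral of |P_alpha(t; x, y)| over y."""
    h = 2.0 * half_width / n_y
    total = 0.0
    for i in range(n_y):
        y = -half_width + (i + 0.5) * h
        total += abs(delta_prime_kernel(t, x, y, alpha))
    return total * h


if __name__ == "__main__":
    for x in (0.0, 0.5, 2.0):
        print(x, l1_norm_in_y(1.0, x, alpha=1.0))
```

For x = 0 the correction terms vanish because sgn(xy) = 0, so the sketch reproduces the free value 1; for other x the computed norm stays below 3, uniformly in x, as Lemma 3.1 requires.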

Now using Lemmas 3.1 and 3.2 and the Riesz–Thorin theorem, one can easily establish the following theorem:

THEOREM 3.1. For every $t > 0$ and every $1 \le p \le q \le \infty$, the operator

\[
\exp(-tH_\alpha) : L^p \to L^q \tag{39}
\]

is bounded and, for $p = q < +\infty$, it defines a strongly continuous semigroup.

Now, as in the previous section, we denote by $-H_{\alpha,p}$ the generator of the operator $\exp(-tH_\alpha)$ on the space $L^p$, and for $p = 2$ we will omit the subscript $p$. The spectral properties of the operators $H_{\alpha,p}$ can be easily investigated using the smoothing properties of their semigroups.

PROPOSITION 3.1. For every $1 \le p < +\infty$ we have

(i) $\sigma(H_{\alpha,p})$ is p-independent.
(ii) Every isolated eigenvalue of $H_\alpha$ of algebraic multiplicity $m$ is an isolated eigenvalue of $H_{\alpha,p}$ with the same multiplicity, and conversely.
(iii)
\[
\lim_{t \to +\infty} t^{-1} \ln \|\exp(-tH_{\alpha,p})\|_{p,p} =
\begin{cases}
\dfrac{4}{\alpha^2}, & \alpha < 0, \\
0, & \alpha > 0.
\end{cases}
\]


We shall prove only assertion (i). To this aim we use a new method which relies on the new functional calculus introduced by E. B. Davies [8]. The application of this calculus requires an estimate for the resolvent function, and the spectra of $H_{\alpha,p}$ must be real. Thanks to the explicit formula of the kernel of the operator $(H_\alpha - k^2)^{-1}$, one can prove a suitable estimate for the resolvent functions. Let us first recall that for $k^2 \in \rho(H_\alpha)$, $\mathrm{Im}\, k > 0$, the kernel of $(H_\alpha - k^2)^{-1}$ is given by [3], p. 92,

\[
G_k(x, y) = \frac{i}{2k}\, e^{ik|x - y|} + \frac{\alpha\, \mathrm{sgn}(xy)}{2(-ik\alpha + 2)}\, e^{ik(|x| + |y|)}. \tag{40}
\]

It is obvious that $G_k(x, y)$ is also the kernel of $(H_{\alpha,p} - k^2)^{-1}$ for every $k^2 \in \rho(H_\alpha) \cap \rho(H_{\alpha,p})$ with $\mathrm{Im}(k) > 0$.

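LEMMA 3.3. For every $k^2 \in \rho(H_\alpha)$ such that $\mathrm{Im}\, k > 0$, we have

\[
\sup_{x \in \mathbb{R}} \int_{\mathbb{R}} |G_k(x, y)|\, dy \le \frac{4}{|\mathrm{Im}(k^2)|}. \tag{41}
\]

Proof. We have

\[
|G_k(x, y)| \le \frac{1}{|2k|}\, e^{-\mathrm{Im}(k)|x - y|} + \left|\frac{\alpha}{2(-ik\alpha + 2)}\right| e^{-\mathrm{Im}(k)(|x| + |y|)}, \tag{42}
\]

which yields

\[
\int_{\mathbb{R}} |G_k(x, y)|\, dy \le \frac{1}{|k|\, \mathrm{Im}(k)} + \frac{|\alpha|}{|{-ik\alpha} + 2|}\, \frac{1}{\mathrm{Im}(k)}. \tag{43}
\]

Observing that

\[
|k|\, \mathrm{Im}(k) \ge |\mathrm{Re}(k)|\, \mathrm{Im}(k) = \frac{1}{2} |\mathrm{Im}(k^2)|
\]

and that

\[
\left|\frac{2}{\alpha} - ik\right| \ge |\mathrm{Re}(k)|,
\]

we get the desired estimate. ✷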
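The estimate (41) is easy to probe numerically from the explicit kernel (40). The following sketch integrates |G_k(x, y)| over y by a midpoint rule and compares it with 4/|Im(k²)|. The sample values of k, α and x, as well as the truncation `half_width`, are assumptions of this illustration.

```python
import cmath
import math


def sign(v):
    """Sign function with sign(0) = 0."""
    return (v > 0) - (v < 0)


def resolvent_kernel(k, x, y, alpha):
    """Kernel G_k(x, y) of (H_alpha - k^2)^{-1} as in (40), for Im k > 0."""
    free_part = 1j / (2.0 * k) * cmath.exp(1j * k * abs(x - y))
    point_part = (alpha * sign(x * y) / (2.0 * (-1j * k * alpha + 2.0))
                  * cmath.exp(1j * k * (abs(x) + abs(y))))
    return free_part + point_part


def l1_norm_in_y(k, x, alpha, half_width=80.0, n_y=16000):
    """Midpoint approximation of the integral of |G_k(x, y)| over y."""
    h = 2.0 * half_width / n_y
    return h * sum(
        abs(resolvent_kernel(k, x, -half_width + (i + 0.5) * h, alpha))
        for i in range(n_y)
    )


def lemma_bound(k):
    """Right-hand side of (41): 4 / |Im(k^2)|."""
    return 4.0 / abs((k * k).imag)


if __name__ == "__main__":
    k = 1.0 + 0.5j
    for alpha in (1.0, -1.0):
        for x in (0.3, -2.0, 5.0):
            print(alpha, x, l1_norm_in_y(k, x, alpha), lemma_bound(k))
```

In these samples the computed L^1 norms stay well below the bound, which reflects the fact that (41) is obtained from (43) by two fairly generous inequalities.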

Using estimate (41) we conclude that for every $k^2 \in \rho(H_\alpha)$ with $\mathrm{Im}\, k > 0$ the operator whose kernel is $G_k$ defines a bounded operator on $L^p$ for $1 \le p \le +\infty$. This operator is nothing else but $(H_{\alpha,p} - k^2)^{-1}$, thereby giving that

\[
\mathbb{C} \setminus \mathbb{R} \subset \rho(H_{\alpha,p}). \tag{44}
\]

Thus, $\sigma(H_{\alpha,p}) \subset \mathbb{R}$. Now the operators $H_{\alpha,p}$ satisfy all hypotheses (especially H1) required by the functional calculus [8].

Proof of Proposition 3.1. Using Lemma 4 in [7] and the fact that for every $\xi \in \rho(H_\alpha) \cap \rho(H_{\alpha,p})$ and every $f \in L^p \cap L^q$ we have $R_\xi(H_\alpha) f = R_\xi(H_{\alpha,p}) f$, we get the spectral p-independence. ✷


Remark 3.1. From Proposition 3.1(iii), we observe that for $\alpha < 0$ the $L^p$ norm of $\exp(-tH_{\alpha,p})$ grows exponentially for large $t$; this is similar to the case of perturbations of the Laplacian by negative potentials. For $\alpha > 0$, on the other hand, $\|\exp(-tH_{\alpha,p})\|_{p,p}$ behaves like $\|\exp(-tH)\|_{p,p}$ for large $t$. Thus the operator $\exp(-tH_{\alpha,p})$ behaves somewhat differently in one dimension than in two dimensions, where the behavior does not depend on the sign of $\alpha$.

We close this section with the following proposition:

PROPOSITION 3.2. For $1 \le p < +\infty$ and $f \in L^p$ we have

\[
\lim_{|x| \to \infty} |\exp(-tH_{\alpha,p}) f(x)| = 0. \tag{45}
\]

Remark 3.2. We studied here only the case of a one-parameter family of point interactions, corresponding to the δ′-interaction. However, the same method applies to the four-parameter family of self-adjoint extensions.

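The method we give here does not work for $d = 3$ for the simple reason that for each $\alpha \ne +\infty$ we have

\[
\sup_{x \in \mathbb{R}^3} \int_{\mathbb{R}^3} P_\alpha(t; x, y)\, dy = +\infty.
\]

In fact, for $\alpha \le 0$, we have [1]

\[
P_\alpha(t; x, y) \ge \frac{2t}{|x||y|}\, P(t; |x| + |y|), \tag{46}
\]

where

\[
P(t; |x| + |y|) = \frac{1}{(4\pi t)^{\frac{3}{2}}}\, e^{-\frac{(|x| + |y|)^2}{4t}}.
\]

Hence,

\[
\int_{\mathbb{R}^3} \tilde{P}_\alpha(t; x, y)\, dy \ge \frac{4\pi^2 t}{|x|}\, D_{-2}\!\left(\frac{|x|}{\sqrt{t}}\right) e^{-\frac{|x|^2}{2t}}, \tag{47}
\]

where $\tilde{P}_\alpha = P_\alpha - P$ and $D_\nu$ is the parabolic cylinder function [17]. This implies that

\[
\lim_{x \to 0} \int_{\mathbb{R}^3} \tilde{P}_\alpha(t; x, y)\, dy = +\infty.
\]

Similarly, one can prove the same result for positive $\alpha$.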

References

1. Albeverio, S., Brzezniak, Z. and Dabrowski, L.: Fundamental solution of the heat and Schrödinger equations with point interaction, J. Funct. Anal. 130(1) (1995), 220–254.
2. Albeverio, S., Brzezniak, Z. and Dabrowski, L.: The heat equation with point interaction in Lp spaces, Integral Equations Operator Theory 21(2) (1995), 127–138.
3. Albeverio, S., Gesztesy, F., Høegh-Krohn, R. and Holden, H.: Solvable Models in Quantum Mechanics, Texts and Monogr. Phys., Springer-Verlag, New York, 1988.
4. Blanchard, Ph. and Ma, Z. M.: Semigroup of Schrödinger operators with potentials given by Radon measures, In: Stochastic Processes, Physics and Geometry (Ascona and Locarno, 1988), World Scientific, Teaneck, NJ, 1990, pp. 160–195.
5. Caspers, W. and Clément, P.: A different approach to singular solutions, Differential Integral Equations 7(5–6) (1994), 1227–1240.
6. Caspers, W. and Clément, Ph.: Point interactions in Lp, Semigroup Forum 46(2) (1993), 253–265.
7. Davies, E. B.: Lp spectral independence and L1 analyticity, J. London Math. Soc. (2) 52(1) (1995), 177–184.
8. Davies, E. B.: The functional calculus, J. London Math. Soc. (2) 52(1) (1995), 166–176.
9. Engel, K. J. and Nagel, R.: One-Parameter Semigroups of Linear Evolution Equations, Springer, New York, 2000.
10. Erdélyi, A., Magnus, W., Oberhettinger, F. and Tricomi, F. G.: Tables of Integral Transforms, Vol. 3, McGraw-Hill, New York, 1954.
11. Fukushima, M.: Dirichlet Forms and Markov Processes, North-Holland, Amsterdam, 1980.
12. Gradshteyn, I. S. and Ryzhik, I. M.: Table of Integrals, Series, and Products, Academic Press, New York, 1965.
13. Gulisashvili, A.: Sharp estimates in smoothing theorems for Schrödinger semigroups, J. Funct. Anal. 170(1) (2000), 161–187.
14. Gulisashvili, A. and Kon, M. A.: Exact smoothing properties of Schrödinger semigroups, Amer. J. Math. 118(6) (1996), 1215–1248.
15. Hempel, R. and Voigt, J.: The spectrum of a Schrödinger operator in Lp(Rν) is p-independent, Comm. Math. Phys. 104(2) (1986), 243–250.
16. Hempel, R. and Voigt, J.: On the Lp-spectrum of Schrödinger operators, J. Math. Anal. Appl. 121(1) (1987), 138–159.
17. Magnus, W., Oberhettinger, F. and Soni, R. Pal.: Formulas and Theorems for the Special Functions of Mathematical Physics, Springer, New York, 1966.
18. Reed, M. and Simon, B.: Methods of Modern Mathematical Physics, Vol. 2, Fourier Analysis and Self-adjointness, Academic Press, New York, 19xx.
19. Simon, B.: Schrödinger semigroups, Bull. Amer. Math. Soc. (N.S.) 7(3) (1982), 447–526.
20. Stollmann, P. and Voigt, J.: Perturbation of Dirichlet forms by measures, Potential Anal. 5(2) (1996), 109–138.



On the Tidal Motion Around the Earth Complicated by the Circular Geometry of the Ocean’s Shape Without Coriolis Forces

Dedicated to Professor L. V. Ovsyannikov on the occasion of his 80th birthday

RANIS N. IBRAGIMOV
Department of Applied Mathematics, University of Waterloo, Waterloo, ON, N2L 3G1, Canada. e-mail: [email protected]

(Received: 10 February 2000; in revised form: 29 March 2001)

Abstract. The Cauchy–Poisson free boundary problem on the stationary motion of a perfect incompressible fluid circulating around the Earth is considered in this paper. Rotation plays a significant role in the early stages of the formation of solitary waves. However, these effects are less important on the solitary waves once they are formed. Therefore, for simplicity, rotation is not included for these simulations. The main concern is to find the inverse conformal mapping of the unknown free boundary in the hodograph plane onto some fixed mapping in the physical domain. The approximate solution to the problem is derived as the application of such a method. The behaviour of tidal waves around the Earth is discussed. It is shown that one of the features of the positively curved bottom is that the problem admits two different higher-order systems of shallow water equations, while the classical problem for the flat bottom admits only one system.

Mathematics Subject Classification (2000): 30C20.

Key words: free boundary, inverse conformal mapping.

1. Introduction

The Cauchy–Poisson problem on the stationary motion of a perfect fluid which has a free boundary and a solid bottom represented by a circle with a sufficiently large radius is considered in this paper. For simplicity, the fluid is not considered in the real, rotating reference frame of the Earth. However, we note that the presence of Coriolis effects does not change the qualitative analysis of the presented method. We have shown in [4] that such a problem can be associated with a two-dimensional model of an oceanic motion around the Earth, since we consider strictly longitudinal flow. Since the problem is a free boundary problem, the analysis is rather difficult.

Permanent water waves have been considered in a large number of papers. However, most researchers are concerned with fluid motion which is infinitely deep and extends infinitely both rightward and leftward (see Crapper [1], Stoker [9] or Stokes [10] for the history). Such problems are usually called Stokes’s problem if the surface tension is neglected and Wilton’s problem if the surface tension is taken into account (see [6] for more details).

We consider water waves for which the ratio of the depth of fluid above the circular bottom to the radius of the circle is small (shallow water).

Our primary concern is to find the conformal mapping (for the Stokes problem) of the unknown free boundary onto a fixed mapping. The resulting Dirichlet problem can be solved numerically using Okamoto’s method [3]. A more detailed structure of the bifurcation of solutions for the related problem was numerically computed by Fujita et al. [3]. The existence of nontrivial solutions for the analogously reduced Dirichlet problem can be found in [4] and [7] as well as in the classical literature (see, e.g., [8] or [9]).

Higher-order shallow water equations in the nonstationary case are derived in this paper. It is shown that the present problem admits two different systems of shallow water equations, while the classical problem for the flat bottom admits only one system (see [2]).

We note that papers [3, 6, 7] are concerned with fluid whose surface tension is taken into account. In fact, the surface tension plays the role of a ‘regulator’ of the problem which substantially simplifies the analysis. Furthermore, the nature of the problem requires that the surface tension should be neglected. Thus, this paper represents a more systematic approach to the problem.

The paper aims to investigate the problem by using a conformal mapping, which distinguishes it from [3, 4, 6, 7].

2. Basic Equations

The analysis of this problem is performed in the following notation: $R$ is the radius of the circle, $r$ is the distance from the origin, $\theta$ is the polar angle, $h_0$ is the undisturbed level of the liquid above the circle, and $h = h(\theta)$ is the level of the disturbance of the free boundary above the circle. For the sake of simplicity, we assume that the pressure is constant on the free boundary. The stream function $\psi = \psi(r, \theta)$ defines the velocity vector, i.e.,

\[
v_r = -\frac{1}{r}\, \psi_\theta, \quad v_\theta = \psi_r.
\]

Hence, irrotational motion of an ideal incompressible fluid of constant pressure in the homogeneous gravity field $g = \mathrm{const}$ is described by the stream function $\psi$ in the domain

\[
\Omega_h = \{(r, \theta) : 0 \le \theta \le 2\pi,\ R \le r \le R + h_0 + h(\theta)\},
\]

which is bounded by the bottom $\Gamma_R = \{(r, \theta) : r = R,\ \theta \in [0, 2\pi]\}$ and the free boundary $\Gamma_h = \{(r, \theta) : r = R + h_0 + h(\theta),\ \theta \in [0, 2\pi]\}$. Note that $\psi$ is a harmonic function in $\Omega_h$, since we assumed that the flow is irrotational. More specifically, we assume that the fluid is incompressible and inviscid and that the flow is stationary. Then the problem is to find the function $h(\theta)$ and the stationary, irrotational flow beneath the free boundary $r = R + h_0 + h(\theta)$ given by the stream function $\psi$ which satisfy the following differential equations:

\[
\Delta\psi = 0 \ (\text{in } \Omega_h), \quad \psi = 0 \ (\text{on } \Gamma_R), \quad \psi = a \ (\text{on } \Gamma_h), \tag{1}
\]

\[
|\nabla\psi|^2 + 2gh = \text{constant} \ (\text{on } \Gamma_h), \tag{2}
\]

\[
\frac{1}{2} \int_0^{2\pi} (R + h_0 + h(\theta))^2\, d\theta = \pi (R + h_0)^2, \tag{3}
\]

where $a = \mathrm{const}$ denotes the flow rate.

Equations (1) to (3) represent the free boundary Cauchy–Poisson problem in which the boundary $\Gamma_h$ is unknown as well as the stream function.

3. The Inverse Transforms Principle

3.1. CONSTANT FLOW

The exact solution

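\[
h \equiv 0 \quad \text{and} \quad \psi = \psi_0 = a \log r \tag{4}
\]

of Equations (1)–(3) corresponds to the constant flow with an undisturbed free boundary. The trivial solution (4) represents a flow whose streamlines are concentric circles with the common center at the origin.

The following nondimensional quantities are introduced:

\[
r = R + h_0 r', \quad h = h_0 h', \quad \psi = a\psi', \quad \varepsilon = \frac{h_0}{R}, \quad F = \frac{h_0 \sqrt{g h_0}}{a},
\]

where $F$ is a Froude number and $R$ is used as a vertical scale. We consider $\varepsilon$ as the small parameter of the problem. After dropping the primes, Equations (1)–(3) are written in terms of $\psi'$, $h'$, and $(r', \theta)$ as follows:

\[
\Delta^{(\varepsilon)} \psi = 0 \ (\text{in } \Omega_h), \tag{5}
\]

\[
\psi = 0 \ (\text{on } \Gamma_R), \tag{6}
\]

\[
\psi = 1 \ (\text{on } \Gamma_h), \tag{7}
\]

\[
\left|\nabla^{(\varepsilon)} \psi\right|^2 + 2F^{-2} h = \text{constant} \ (\text{on } \Gamma_h), \tag{8}
\]

\[
\frac{1}{2} \int_0^{2\pi} (1 + \varepsilon + \varepsilon h(\theta))^2\, d\theta = \pi (1 + \varepsilon)^2 \ (\text{on } \Gamma_h). \tag{9}
\]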


Here the Laplace and gradient operators are given by

�(ε) = (ε∂θ )2 + [(1 + εr)∂r ]2, ∇(ε) =

(ε∂θ

(1 + εr), ∂r

),

where the subscripts imply differentiation.We further consider the complex potential ω(ζ ) = ϕ + iψ , where ζ =

(1 + εr)eiθ is the independent complex variable and ϕ(ζ ) is the velocity potentialwhich is characterized by the analyticity of ϕ + iψ , i.e.,

ϕr = εψθ

(1 + εr),

εϕθ

(1 + εr)= −ψr.

We note that the complex velocity dω/dζ is a single-valued analytic function of ζ ,although ω is not single-valued. In fact, when we turn around the bottom r = 1once, ϕ increases by − ∫ 2π

0 ψr(1, θ)dθ which has a positive sign by the maximumprinciple (Hopf’s lemma). Hence, if we remove the width of annulus region θ = 0,r ∈ [1, 1 + ε], then at every point (r, θ), ω(ζ ) is single-valued analytic functionwhich maps the rectangular (in the ω(α)-hodograph plane) domain with

ϕ ∈[

0,− 2π

log(1 + ε)

]and ψ ∈ [0, 1]

as coordinates onto the annulus

�0h = {(r, θ) : 1 < r < 1 + ε, θ ∈ [0, 2π ]}.

We represent the constant flow (4) by

ω(ζ ) = ϕ + iψ = i log(1 + εr) − θ

log(1 + ε), (10)

where r = ξ0(ψ) and θ = η0(ϕ) transform the rectangular domain in the hodo-graph plane into �0

h. Consequently, each conformal mapping by the function ω(ζ )

Figure 1.


between hodograph and physical planes represents an irrotational flow in the phys-ical ζ plane. Furthermore, Equations (10) implies that

η0(ϕ) = −ϕ log(1 + ε) and ξ0(ψ) = ε−1([1 + ε]ψ − 1

). (11)
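The inversion leading from (10) to (11) can be checked directly: taking real and imaginary parts of (10) gives φ = −θ / log(1 + ε) and ψ = log(1 + εr) / log(1 + ε), and solving for θ and r yields η0 and ξ0. The short sketch below verifies this round trip numerically; the function names and the sample values of ε, r and θ are assumptions of the illustration.

```python
import math


def omega(r, theta, eps):
    """Real and imaginary parts (phi, psi) of the complex potential (10)."""
    scale = math.log1p(eps)
    phi = -theta / scale
    psi = math.log1p(eps * r) / scale
    return phi, psi


def xi0(psi, eps):
    """Radial component of the inverse transform, Equation (11)."""
    return ((1.0 + eps) ** psi - 1.0) / eps


def eta0(phi, eps):
    """Angular component of the inverse transform, Equation (11)."""
    return -phi * math.log1p(eps)


if __name__ == "__main__":
    eps, r, theta = 0.1, 0.37, 1.2
    phi, psi = omega(r, theta, eps)
    # The two printed pairs should agree up to rounding.
    print((r, theta), (xi0(psi, eps), eta0(phi, eps)))
```

In particular ξ0(0) = 0 and ξ0(1) = 1, so the streamlines ψ = 0 and ψ = 1 are sent to the bottom and to the undisturbed free boundary, respectively.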

3.2. REDUCTION ONTO THE BOUNDARY

Now a two-dimensional infinitesimal disturbance $\xi'$ and $\eta'$ is superimposed on $\xi_0$ and $\eta_0$. Then the resulting transform components are

\[
\xi = \xi_0 + \xi', \quad \eta = \eta_0 + \eta'.
\]

The perturbed quantities $\xi'$ and $\eta'$ are assumed to be small quantities so that the nontrivial solution is close to the trivial one.

Then, with Equation (11), the inverse transform can be combined to form

\[
\log \zeta = \psi \log(1 + \varepsilon) - i\varphi \log(1 + \varepsilon) + \log\left(\frac{1 + \varepsilon \xi'}{1 + \varepsilon \xi_0}\right) + i\eta',
\]

since $\log(1 + \varepsilon \xi_0) = \psi \log(1 + \varepsilon)$. Consequently, the polar angle $\theta$ and radius $r$ are given by the equations

\[
\theta = -\varphi \log(1 + \varepsilon) + \eta', \quad r = \varepsilon^{-1}\left[(1 + \varepsilon)^{\psi} \left(\frac{1 + \varepsilon \xi'}{1 + \varepsilon \xi_0}\right) - 1\right].
\]

Since the motion is irrotational, we can decrease the dimension of the problem by one. In other words, we introduce the boundary value for the function $\xi$ and reduce the basic equations to the quantities which arise from the condition on the free boundary $\psi = 1$. To this end, we introduce the regular function $f(\omega) = \alpha + i\beta$ such that

\[
(\alpha, \beta) = \frac{1}{\lambda_\varphi^2 + \eta'^2_\varphi}\, (\lambda_\varphi,\ -\eta'_\varphi).
\]

Then the nonlinear boundary condition (8) can be reduced to a differential equation in conservation form for $f(\omega)$ by virtue of the following lemma:

LEMMA 1. Let the function $\lambda(\varphi, \psi)$ defined by

\[
\frac{1 + \varepsilon \xi'}{1 + \varepsilon \xi_0} = e^{\lambda(\varphi, \psi)} \tag{12}
\]

be differentiable at least once. We introduce the derivative operator $\partial_n$ along the normal to $\psi = 1$ by $\partial_n \mu = \lambda_\psi|_{\psi = 1}$, where $\mu(\varphi) = \lambda(\varphi, 1)$. Then, on the free boundary, the following three relations hold:

\[
\beta_\varphi = -\partial_n \alpha = -\partial_\varphi \partial_n \left\{\left(\tau + \frac{1 + \varepsilon^2}{2}\right) e^{2\mu}\right\}, \tag{13}
\]

where $\tau = (b - \mu F^{-2})(1 + \varepsilon)^2$, and $b = \mathrm{const}$ is the Bernoulli constant.


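Proof. Since the complex potential $\omega$ defined by Equation (10) is an analytic function, $\xi$ and $\eta$ are single-valued and they satisfy the Cauchy–Riemann equations

\[
\frac{\varepsilon}{1 + \varepsilon \xi}\, \xi'_\varphi = \eta'_\psi, \quad \frac{\varepsilon}{1 + \varepsilon \xi}\, \xi'_\psi - \frac{\varepsilon^2 (1 + \varepsilon \xi)^{-1}}{(1 + \varepsilon \xi_0)}\, \xi_{0\psi} = -\eta'_\varphi,
\]

which can be simplified as

\[
\lambda_\varphi = \eta'_\psi, \quad \lambda_\psi = -\eta'_\varphi. \tag{14}
\]

From Equations (11), (14) and the representation $\zeta = (1 + \varepsilon \xi')e^{i\eta'}$, it follows that

\[
\left|\frac{d\zeta}{d\varphi}\right|^2 = \varepsilon^2 (\varepsilon^{-1} + 1)^2 e^{2\lambda} \lambda_\varphi^2 + (1 + \varepsilon \xi)^2 \eta_\varphi^2, \tag{15}
\]

since $\xi_0 = 1$ on the free boundary.

By virtue of Equations (14), (15) and the representation

\[
\eta_\varphi = -\log(1 + \varepsilon) - \partial_n \mu,
\]

the Bernoulli equation (8) takes the form

\[
\tau \left[e^{2\lambda} \lambda_\varphi^2 + \frac{1}{(1 + \varepsilon)^2} \left((1 + \varepsilon \xi')^2 (\log(1 + \varepsilon) + \partial_n \mu)^2\right)\right] - \frac{1}{2} = 0,
\]

which can be transformed into the conservation law

\[
\alpha(\varphi, 1) = \frac{\partial}{\partial \varphi} \left(\tau + \frac{1 + \varepsilon^2}{2}\right) e^{2\mu}. \tag{16}
\]

Thus, the first equation of Equations (13) holds due to the analyticity of the function $f(\omega)$, and the second equation of Equations (13) follows from the definition of the normal derivative operator $\partial_n$. Finally, the last equation is a consequence of changing the order of differentiation, $\partial_\varphi \partial_n \mu = \partial_n \mu_\varphi$. ✷

Note that the function $\mu(\varphi)$ is found through the analyticity of $f(\omega)$, and thus the transformation $\xi(\varphi, \psi)$ is determined by definition (12) as

\[
\xi' = \left(\frac{1}{\varepsilon} + \xi_0(\psi)\right)\left(e^{\lambda(\varphi, \psi)} - 1\right).
\]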

3.3. SOLUTION TO THE DIRICHLET PROBLEM IN A FIXED DOMAIN

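In view of Lemma 1, it follows from the definition of the function $\alpha(\varphi, 1)$ and Equation (16) that integrating Equations (13) over $\varphi$ along $\psi = 1$ leads to the following equation on the free boundary:

\[
4\tau e^{2\mu} [\log(1 + \varepsilon) + \partial_n \mu] + \partial_n \left([2\tau + 1 + \varepsilon^2] e^{2\mu}\right) - \varepsilon \delta_0 = 0,
\]

where $\delta_0$ is the constant of integration, which represents the horizontal impulse flow.

Finally, simplifying the last equation, we arrive at the Dirichlet problem in the fixed domain

\[
\lambda_{\varphi\varphi} + \lambda_{\psi\psi} = 0 \quad (0 < \psi < 1), \tag{17}
\]

\[
\lambda(\varphi, 0) = 0, \quad \lambda(\varphi, 1) = \mu, \tag{18}
\]

\[
\mu\, \partial_n \mu\, F^{-2} - \left(b + \frac{1}{4}\left[1 - F^{-2}\right] - \frac{\varepsilon}{2}\right) \partial_n \mu + \frac{\log(1 + \varepsilon)}{2} (\mu F^{-2} - b) + \frac{\delta e^{-2\mu}}{(1 + \varepsilon)^2} = 0, \tag{19}
\]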

where we denote $\delta = \delta_0/8$.

Now the problem (Equations (17) to (19)) is reduced to finding one function $\mu(\varphi)$, since if the function $\mu$ is known, then the function $\lambda(\varphi, \psi)$ is defined as the solution of the mixed problem for the Laplace equation (17) and the boundary conditions (18). In particular, $\lambda_\psi|_{\psi = 1}$ can be considered as the result of the action of the operator $\partial_n$ on the function $\mu$. Namely, we represent $\lambda(\varphi, \psi)$ by the Fourier series (see, for example, [8] or [9]). Then the dependence between the $n$th Fourier coefficients of the functions $\lambda$, $\mu$ and $\partial_n \mu$ is given by

\[
[\lambda(\varphi, \psi)]_n = \frac{\sinh n\psi}{\sinh n}\, \mu_n, \quad [\mu(\varphi)]_n = \mu_n, \quad [\partial_n \mu(\varphi)]_n = \mu_n\, n \coth n, \tag{20}
\]

in which $\mu(\varphi) = \mu_n e^{in\varphi}$ (summation is assumed). Thus, problem (17)–(19) is written in terms of $[\mu(\varphi)]_n$ only. Since $[\partial_n \mu]_n$ are given by Equation (20), we can represent the disturbance $\mu(\varphi)$ by an expansion in series with respect to the parameter $\varepsilon$ (see also [8]). Consequently, we apply the stretching transformation and expansion

\[
(\mu, \partial_n \mu) = \sum_{i=0}^{\infty} \varepsilon^{i} \left(\varepsilon \mu_i,\ \varepsilon (\partial_n \mu)_i\right) \tag{21}
\]

which is characteristic of shallow water. Substitution of representation (21) intoEquation (19) and elimination mod ε3 (neglecting of the terms with εm, m � 4)yields the approximate solution of the form [µi]n = [µi (b,F )]n as follows:

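\[
\begin{aligned}
&\delta - \frac{b}{2} + \varepsilon\left\{-2\delta + \frac{b}{4} + \frac{F^{-2}}{2}\mu_1 - \frac{1}{4}\left[1 - F^{-2} + 4b\right](\partial_n\mu)_1 - 2\delta\mu_1\right\} \\
&\quad + \varepsilon^2\Big\{3\delta - \frac{b}{6} - \frac{F^{-2}}{4}\mu_1 + \frac{F^{-2}}{4}\mu_2 + \frac{1}{2}(\partial_n\mu)_1 - \frac{1}{4}\left[1 - F^{-2} + 4b\right](\partial_n\mu)_2 \\
&\qquad\quad + 4\delta\mu_1 + 2\delta\mu_1^2 - 2\delta\mu_2 + \mu_1(\partial_n\mu)_1\Big\} \\
&\quad + \varepsilon^3\Big\{-4\delta + \frac{b}{8} + \frac{F^{-2}}{6}\mu_1 - \frac{F^{-2}}{4}\mu_2 + \frac{F^{-2}}{2}\mu_3 + \frac{1}{2}(\partial_n\mu)_2 - \frac{1}{4}\left[1 - F^{-2} + 4b\right](\partial_n\mu)_3 \\
&\qquad\quad - 6\delta\mu_1 - 4\delta\mu_1^2 - \frac{4}{3}\delta\mu_1^3 + 4\delta\mu_2 - 2\delta\mu_3 + 4\delta\mu_1\mu_2 + \mu_1(\partial_n\mu)_2 + (\partial_n\mu)_1\mu_2\Big\} + o(\varepsilon^4).
\end{aligned}
\tag{22}
\]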

Thus, in view of Equations (20), Equation (22) represents the recurrent system of algebraic equations for the determination of all [µ_i]_n, where the horizontal impulse flow has the asymptotic value δ = b/2.
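The relation between the Fourier coefficients of µ and ∂_nµ can be illustrated with a small numerical check. The sketch below is not taken from the paper: it assumes only that the nth mode of λ has the ψ-profile sinh(nψ)/sinh(n) from (20), so that differentiating in ψ at ψ = 1 gives the factor n coth n. The function names are hypothetical.

```python
import math

def strip_mode(n, psi):
    """psi-profile of the nth Fourier mode of lambda: sinh(n*psi)/sinh(n), n != 0."""
    return math.sinh(n * psi) / math.sinh(n)

def boundary_multiplier(n):
    """Assumed factor relating the nth coefficient of d_n(mu) to mu_n: n*coth(n)."""
    return n * math.cosh(n) / math.sinh(n)

def multiplier_by_difference(n, step=1e-6):
    """Central finite difference of the mode profile in psi at psi = 1."""
    return (strip_mode(n, 1.0 + step) - strip_mode(n, 1.0 - step)) / (2.0 * step)

if __name__ == "__main__":
    for n in (1, 2, 5):
        print(n, boundary_multiplier(n), multiplier_by_difference(n))
```

The two columns printed for each n should agree to several digits, which is a quick way to confirm the multiplier before it is used in the recurrence for the [µ_i]_n.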

The shape of the free boundary h(θ) can be determined numerically using Okamoto's method [3]. The existence of an exact solution (ψ, h) can be established analytically by the Fixed Point Theorem (see, for example, [7] or [4]).

4. Behavior of Tide Waves

4.1. EXISTENCE OF STATIONARY WAVES

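The main concern of this section is the evolution of tides around the Earth in time t. In order to bring out the essential parameters of the problem, the dimensional fundamental equations, Equations (1) to (2), are written, in the nonstationary case, as follows:

\[ \Delta(\psi) = 0 \quad (\text{in } \Omega_h), \]

\[ \psi_\theta = 0 \quad (\text{on } \Gamma_R), \]

\[ r h_t + \psi_\theta + \psi_r h_\theta = 0 \quad (\text{on } \Gamma_h), \]

\[ -h_\theta\psi_{t\theta} + r^2\psi_{tr} + \frac{r}{2}\left(\frac{\psi_\theta^2}{r^2} + \psi_r^2\right)_\theta + r g h_\theta = 0 \quad (\text{on } \Gamma_h), \]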

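where Δ = (∂_θθ + r^2 ∂_rr + r ∂_r). The perturbed quantities h′ and ψ′ are interrelated as

\[ h = h', \qquad \psi = -\frac{\gamma}{2\pi}\log r + \psi', \]

where γ is the intensity of the vortex. For small disturbances, we obtain the linear problem in the domain D_0 = {(r, θ) : R ≤ r ≤ R + h_0, 0 ≤ θ ≤ 2π} as follows:

\[ \Delta(\psi') = 0 \quad (\text{in } D_0), \tag{23} \]

\[ \psi'_\theta = 0 \quad (\text{on } \Gamma_R), \tag{24} \]

\[ h'_t + \frac{\psi'_\theta}{r} - \frac{\gamma h'_\theta}{2\pi r^2} = 0 \quad (\text{on } \Gamma_{h_0}), \tag{25} \]

\[ r^2\psi'_{tr} - \frac{\gamma}{2\pi}\psi'_{r\theta} + r g h'_\theta = 0 \quad (\text{on } \Gamma_{h_0}). \tag{26} \]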

Since Equations (23)–(26) are linear, the method of superposition is applicable (see also Friedrichs [2]). Hence it is sufficient to look for periodic solutions of the form

\[ (h', \psi') = (H, \Psi(r))\exp\{i(k\theta - wt)\} \tag{27} \]

in which the wave number k is a given real quantity and the eigenvalues w give the different modes of the tide wave propagation. Substitution of representation (27) into Equations (23)–(26) leads to the expression

\[ \Psi(r) = c\left(r^k - R^{2k}r^{-k}\right) \]

and to the equations

\[ H\left(w + \frac{k\gamma}{2\pi r^2}\right) - \frac{kc}{r}\left(r^k - R^{2k}r^{-k}\right) = 0, \]

\[ \frac{kg}{cr}H - \left(w + \frac{k\gamma}{2\pi r^2}\right)\left(kr^{k-1} + kR^{2k}r^{-k-1}\right) = 0, \]

in which c is a constant of integration. Consequently, the determinantal equation for the longitudinal tide wave is as follows:

\[ w = \pm\sqrt{\frac{kg}{R+h_0}\,\frac{(R+h_0)^{k-1} - R^{2k}(R+h_0)^{-k-1}}{(R+h_0)^{k-1} + R^{2k}(R+h_0)^{-k-1}}} - \frac{k\gamma}{2\pi(R+h_0)^2}. \tag{28} \]

Thus surface tide waves (on the constant flow) are dispersive with two different modes of propagation. Simplification of relation (28) shows that the tide wave is propagated with a speed

\[ a_0 = \frac{w}{k} = \pm\sqrt{gh_0}\,\sqrt{\frac{\tanh[k\ln(1+\varepsilon)]}{kRh_0(1+\varepsilon)}} - \frac{\gamma}{2\pi R^2(1+\varepsilon)^2}. \]

Hence, the condition of the existence of stationary tide waves (a_0 = 0) is

\[ |\gamma| \le 2\pi R\,\varepsilon^{-1/2}(1+\varepsilon)^2\sqrt{gh_0}. \]
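The passage from the determinantal equation (28) to the tanh form of the speed a_0 can be checked numerically. The following is a minimal sketch, assuming ε = h_0/R (an assumption consistent with the factor ln(1 + ε) in the speed formula); the function names and the sample parameter values are hypothetical and chosen only for illustration.

```python
import math

def tide_frequencies(k, g, R, h0, gamma):
    """Both roots w of the determinantal equation (28)."""
    rho = R + h0  # radius of the undisturbed free surface
    upper = rho ** (k - 1) - R ** (2 * k) * rho ** (-k - 1)
    lower = rho ** (k - 1) + R ** (2 * k) * rho ** (-k - 1)
    root = math.sqrt(k * g / rho * upper / lower)
    drift = k * gamma / (2.0 * math.pi * rho ** 2)
    return root - drift, -root - drift

def tide_speeds(k, g, R, h0, gamma):
    """Phase speeds a0 = w/k from the simplified tanh form."""
    eps = h0 / R  # assumed definition of epsilon
    root = math.sqrt(g * h0) * math.sqrt(
        math.tanh(k * math.log(1.0 + eps)) / (k * R * h0 * (1.0 + eps))
    )
    drift = gamma / (2.0 * math.pi * R ** 2 * (1.0 + eps) ** 2)
    return root - drift, -root - drift

if __name__ == "__main__":
    k, g, R, h0, gamma = 3, 9.81, 10.0, 1.0, 5.0  # illustrative values only
    w_plus, w_minus = tide_frequencies(k, g, R, h0, gamma)
    a_plus, a_minus = tide_speeds(k, g, R, h0, gamma)
    print(w_plus / k, a_plus)
    print(w_minus / k, a_minus)
```

The agreement rests on the identity (ρ^{2k} − R^{2k})/(ρ^{2k} + R^{2k}) = tanh[k ln(ρ/R)] with ρ = R + h_0.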

4.2. SPLITTING PHENOMENA FOR SHALLOW WATER EQUATIONS

We suppose that the parameter ε is infinitesimally small, so we consider R as the natural physical scale. Note that the kinematic condition can be written as the mass balance equation. Namely,

\[ h_t + (R+h)^{-1}\,\partial_\theta\int_R^{R+h} v_\theta\,dr = 0, \tag{29} \]

since the radial velocity component is given by

\[ v_r = -r^{-1}\int_R^{R+h} (v_\theta)_\theta\,dr. \]

Hence, the mass balance equation (29) takes the form

\[ r h_t + \partial_\theta(uh) = 0 \quad (\text{on } \Gamma_h), \]

where the average velocity u(θ, t) is defined by the relation

\[ u(\theta, t) = h^{-1}\int_R^{R+h} v_\theta(r, \theta, t)\,dr. \]

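To go further, it is convenient to introduce a nondimensionalization here. We put

\[ t = \frac{R}{U}\,t', \qquad \psi = h_0U\psi', \qquad u = Uu', \]

where U is a unit of velocity. Hereafter, the prime will be omitted. Then the impulse equation is written as

\[ -\frac{\varepsilon^2h_\theta\psi_{t\theta}}{(1+\varepsilon h)^2} + \psi_{tr} + \frac{1}{2(1+\varepsilon h)}\,\partial_\theta\left(\frac{\varepsilon^2\psi_\theta^2}{(1+\varepsilon h)^2} + \psi_r^2\right) + \frac{h_\theta}{1+\varepsilon h} = 0. \tag{30} \]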

We represent the stream function ψ by the Lagrangian expansion (see also [8] or [2]) ψ = Σ_{i=0}^{∞} ε^i ψ^{(i)}. Then the Laplace equation takes the form (mod ε^2)

\[ \psi^{(0)}_{rr} + \varepsilon\left(\psi^{(1)}_{rr} + 2r\psi^{(0)}_{rr} + \psi^{(0)}_r\right) + \varepsilon^2\left(\psi^{(0)}_{\theta\theta} + \psi^{(2)}_{rr} + 2r\psi^{(1)}_{rr} + r^2\psi^{(0)}_{rr} + \psi^{(1)}_r + r\psi^{(0)}_r\right) = 0. \tag{31} \]

Equation (31) represents the recurrent system of differential equations for the determination of ψ^{(i)} as the solution of the Cauchy problem with the boundary conditions ψ(0, θ, t) = 0, ψ(1+h, θ, t) = uh for ψ^{(0)} and zero boundary conditions for ψ^{(1)} and ψ^{(2)}. Hence, the function ψ (mod ε^2) is as follows:

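\[ \psi = ur + \varepsilon\left(\frac{ur^2}{2} - \frac{uhr}{2}\right) + \varepsilon^2\left(-\frac{u_{\theta\theta}r^3}{6} + \frac{uhr^2}{4} + \frac{u_{\theta\theta}h^2r}{6} - \frac{uh^2r}{4}\right). \tag{32} \]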

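We use the Taylor expansion

\[ (1+\varepsilon h)^{-1} = 1 - \varepsilon h + (\varepsilon h)^2 + \cdots \tag{33} \]

to write Equation (30) as

\[ \psi_{tr} + \frac{1}{2}\left(\varepsilon^2\psi_\theta^2 + \psi_r^2\right)_\theta - \varepsilon^2h_\theta\psi_{t\theta} + (\varepsilon h - 1)\left(\frac{1}{2}\varepsilon h(\psi_r^2)_\theta + \varepsilon hh_\theta\right) + h_\theta = 0. \tag{34} \]

Let us multiply Equation (30) by (1 + εh)^2 and then use expansion (33). Then Equation (30) becomes

\[ \psi_{tr} + \frac{1}{2}\left(\varepsilon^2\psi_\theta^2 + \psi_r^2\right)_\theta - \varepsilon^2h_\theta\psi_{t\theta} + \varepsilon h\left(\psi_{tr}(2+\varepsilon h) + \frac{1}{2}(\psi_r^2)_\theta + h_\theta\right) + h_\theta = 0. \tag{35} \]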

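If we substitute ψ defined by Equation (32) into (34), we obtain the following equation of the shallow water theory:

\[
\begin{aligned}
&u_t + uu_\theta + h_\theta + \varepsilon\left(\frac{h}{2}u_t - \frac{u}{2}h_t + u^2h_\theta - hh_\theta\right) \\
&\quad + \varepsilon^2\Big(hh_\theta u_{\theta t} - \frac{h^3}{3}u_{\theta\theta t} + \frac{h^2}{4}u_t + \frac{h}{3}u_{\theta\theta}h_t + hh_\theta u_\theta^2 \\
&\qquad\quad + h^2u_\theta u_{\theta\theta} + \frac{3}{4}uu_\theta + hu^2h_\theta - \frac{1}{3}u_\theta u_{\theta\theta} - \frac{1}{3}uu_{\theta\theta\theta} + h^2h_\theta\Big) = 0.
\end{aligned}
\tag{36}
\]

Consequently, the substitution of ψ (32) into (35) yields

\[
\begin{aligned}
&u_t + uu_\theta + h_\theta + \varepsilon\left(\frac{h}{2}u_t - \frac{u}{2}h_t + \frac{h}{2}uu_t + \frac{u^2}{2}h_\theta + \frac{3}{2}huu_\theta\right) \\
&\quad + \varepsilon^2\Big(-hh_\theta u_{\theta t} - \frac{h^2}{3}u_{\theta\theta t} + \frac{9}{4}h^2u_t + \frac{h}{3}h_tu_{\theta\theta} - uhh_t \\
&\qquad\quad + hh_\theta u_\theta^2 + h^2u_\theta u_{\theta\theta} + \frac{3}{4}h^2uu_\theta + \frac{1}{4}u^2hh_\theta - \frac{1}{3}u_{\theta\theta\theta} \\
&\qquad\quad + \frac{3}{4}h^2u_\theta + \frac{1}{2}uh_\theta + uh^2h_\theta\Big) = 0.
\end{aligned}
\tag{37}
\]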


Equations (36) and (37), supplied with the kinematic condition

\[ \partial_t\left(\varepsilon h^2 + 2h\right) + 2\,\partial_\theta(uh) = 0, \tag{38} \]

represent two systems of the shallow water equations. To verify that these two systems are different, we compute their first integrals in the stationary case as follows:

\[ h + \varepsilon\left(2c^2h - \frac{1}{2}h^2\right) + \varepsilon^2\left(3c^2h^2 - \frac{1}{3}h^3\right) = J_1, \]

\[ h + \frac{1}{2}\varepsilon c^2h + \frac{1}{2}\varepsilon^2\left(\frac{17}{4}c^2h^2 - \frac{1}{2}c^2h^2 - ch\right) = J_2, \]

where c, J_1, J_2 are constants of integration. Obviously, J_1 ≠ J_2.

At first sight, it seems that the problem (23)–(26) does not have a unique solution because of that fact. However, it can be shown that the solution of the problem is invariant with respect to the decomposition of the function which represents the free boundary. Since it is not difficult, the proof is omitted.

Acknowledgements

I wish to express my gratitude to Prof. N. Makarenko of Novosibirsk State University, Russia, for his helpful comments and advice throughout this study. In addition, I would like to thank Prof. L.V. Ovsyannikov for valuable discussions on this subject at seminars at Lavryentiev's Institute of Hydrodynamics, Russia.

I am grateful to Prof. H. Okamoto of the Research Institute of Mathematical Sciences (Kyoto University) for arranging my visit to Japan and for encouraging discussions.

Also, I am grateful to Professor K. Lamb of the University of Waterloo, Canada, for valuable discussions and support.

References

1. Crapper, G. D.: Introduction to Water Waves, Ellis Horwood, London, 1984.
2. Friedrichs, K. O. and Hyers, D. H.: The existence of solitary waves, Comm. Pure Appl. Math. 7 (1954).
3. Fujita, H., Okamoto, H. and Shoji, M.: A numerical approach to a free boundary problem of a circulating perfect fluid, Japan J. Appl. Math. 2 (1985), 197–210.
4. Ibragimov, R. N.: Stationary surface waves on circular liquid layer, Quaest. Math. 23(1) (2000).
5. Levi-Civita, T.: Determination rigoureuse des ondes permanents d'ampleur finie, Math. Ann. 93 (1925), 264–314.
6. Okamoto, H.: Nonstationary free boundary problem for perfect fluid with surface tension, J. Math. Soc. Japan 38(3) (1986).
7. Okamoto, H. and Shoji, M.: On the existence of progressive waves in the flow of perfect fluid around a circle, In: T. Nishida, M. Mimura and H. Fujii (eds), Patterns and Waves, 1986, pp. 631–644.
8. Ovsjannikov, L. V. and Makarenko, N. I.: Nonlinear Problems of Surface and Internal Waves Theory, Nauka, Novosibirsk, 1985.
9. Stoker, J. J.: Water Waves, Interscience Publishers, New York, 1957.
10. Stokes, G. G.: On the theory of oscillatory waves, Trans. Cambridge Philos. Soc. 8 (1847), 441–455.
11. Lamb, K. G.: Are solitary waves solitons? Stud. Appl. Math. 101 (1998), 298–308.



From the Solution of the Tsarev System to the Solution of the Whitham Equations

TAMARA GRAVA
Department of Mathematics, University of Maryland, College Park 20742-4015, U.S.A. and Department of Mathematics, Imperial College, London SW7 2BZ, U.K. e-mail: [email protected].

(Received: 24 October 2000; in final form: 30 May 2001)

Abstract. We study the Cauchy problem for the Whitham modulation equations for increasing smooth initial data. The Whitham equations are a collection of one-dimensional quasi-linear hyperbolic systems. This collection of systems is enumerated by the genus g = 0, 1, 2, . . . of the corresponding hyperelliptic Riemann surface. Each of these systems can be integrated by the so-called hodograph transformation introduced by Tsarev. A key step in the integration process is the solution of the Tsarev linear overdetermined system. For each g > 0, we construct the unique solution of the Tsarev system, which matches the genus g + 1 and g − 1 solutions on the transition boundaries.

Mathematics Subject Classifications (2000): 35Q53, 58F07.

Key words: Whitham equations, hyperelliptic Riemann surfaces, linear overdetermined systems of Euler–Poisson–Darboux type.

1. Introduction

The Whitham equations are a collection of one-dimensional quasi-linear hyperbolic systems of the form [1–3]

\[ \frac{\partial u_i}{\partial t} - \lambda_i(u_1, u_2, \dots, u_{2g+1})\frac{\partial u_i}{\partial x} = 0, \qquad x, t, u_i \in \mathbb{R},\quad i = 1, \dots, 2g+1,\quad g = 0, 1, 2, \dots, \tag{1.1} \]

with the ordering u_1 > u_2 > · · · > u_{2g+1}. For a given g, the system (1.1) is called the g-phase Whitham equations. For g > 0, the speeds λ_i(u_1, u_2, . . . , u_{2g+1}), i = 1, 2, . . . , 2g + 1, depend through u_1, . . . , u_{2g+1} on complete hyperelliptic integrals of genus g. For this reason, the g-phase system is also called a genus g system. The zero-phase Whitham equation has the form

\[ \frac{\partial u}{\partial t} - 6u\frac{\partial u}{\partial x} = 0, \tag{1.2} \]

where we use the notation u_1 = u. Equations (1.1) were found by Whitham [1] in the single-phase case g = 1 and more generally by Flaschka, Forest and McLaughlin [2] in the multi-phase case.


The Whitham equations were also found in [3] when studying the zero dispersion limit of the Korteweg–de Vries equation. The hyperbolic nature of the equations was found by Levermore [4].

In this paper we study the initial-value problem of the Whitham equations for increasing smooth (C^∞) initial data u(x, t = 0) = u_0(x).

The initial-value problem consists of the following. We consider the evolution on the x−u plane of the initial curve u(x, t = 0) = u_0(x) according to the zero-phase equation (1.2). The solution u(x, t) of (1.2), with the initial data u_0(x), is given by the characteristic equation

\[ x = -6tu + f(u), \tag{1.3} \]

where f(u)|_{t=0} is the inverse function of u_0(x). The solution u(x, t) in (1.3) is globally well defined only for 0 ≤ t < t_0, where t_0 = (1/6) min_{u∈R}[f′(u)] is the time of gradient catastrophe of (1.3). Near the point of gradient catastrophe and for a short time t > t_0, the evolving curve is given by a multivalued function with three branches u_1(x, t) > u_2(x, t) > u_3(x, t), which evolve according to the one-phase Whitham equations (see Figure 1).
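The characteristic equation (1.3) and the breaking time t_0 can be illustrated numerically. The sketch below assumes a hypothetical increasing profile f(u) = u + u^3, which is not taken from the paper; the helper names are also hypothetical. For t < t_0 the map u ↦ −6tu + f(u) is increasing, so u(x, t) can be recovered by bisection.

```python
def breaking_time(f_prime, u_lo, u_hi, samples=401):
    """Grid estimate of t0 = (1/6) * min f'(u) over [u_lo, u_hi]."""
    step = (u_hi - u_lo) / (samples - 1)
    return min(f_prime(u_lo + i * step) for i in range(samples)) / 6.0

def zero_phase_solution(f, x, t, u_lo, u_hi, tol=1e-12):
    """Solve x = -6*t*u + f(u) for u by bisection (valid while t < t0)."""
    def residual(u):
        return -6.0 * t * u + f(u) - x
    lo, hi = u_lo, u_hi
    if residual(lo) > 0.0 or residual(hi) < 0.0:
        raise ValueError("root is not bracketed by [u_lo, u_hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    f = lambda u: u + u ** 3              # hypothetical initial data x = f(u)
    f_prime = lambda u: 1.0 + 3.0 * u ** 2
    print("t0 ~", breaking_time(f_prime, -2.0, 2.0))
    print("u(0.325, 0.1) ~", zero_phase_solution(f, 0.325, 0.1, -2.0, 2.0))
```

For this sample profile min f′ = 1 is attained at the inflection point u = 0, so the estimate of t_0 should be close to 1/6.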

Outside the multivalued region, the solution is given by the zero-phase solution u(x, t) defined in (1.3). On the phase transition boundary, the zero-phase solution and the one-phase solution are C^1-smoothly attached (see Figure 1).

Since the Whitham equations are hyperbolic, other points of gradient catastrophe can appear in the branches u_1(x, t) > u_2(x, t) > u_3(x, t) themselves or in u(x, t).

In general, for t > t_0, the evolving curve is given by a multivalued function with an odd number of branches u_1(x, t) > u_2(x, t) > · · · > u_{2g+1}(x, t), g ≥ 0.

Figure 1. In picture (a), the dashed line represents the formal solution of the zero-phase equation and the continuous line represents the solution of the one-phase equations. The solution (u_1(x, t), u_2(x, t), u_3(x, t)) of the one-phase equations and the position of the boundaries x^−(t) and x^+(t) are to be determined from the conditions u(x^−(t), t) = u_1(x^−(t), t), u_x(x^−(t), t) = u_{1x}(x^−(t), t), u(x^+(t), t) = u_3(x^+(t), t), u_x(x^+(t), t) = u_{3x}(x^+(t), t), where u(x, t) is the solution of the zero-phase equation.


These branches evolve according to the g-phase Whitham equations. The g-phase solutions for different g must be glued together in order to produce a C^1-smooth curve in the (x, u) plane evolving smoothly with t (see Figure 1b). The initial-value problem of the Whitham equations is to determine, for almost all t > 0 and x, the phase g(x, t) ≥ 0 and the corresponding branches u_1(x, t) > u_2(x, t) > · · · > u_{2g+1}(x, t) from the initial data x = f(u)|_{t=0}. For generic initial data, it is not known whether the solution of the Whitham equations has a finite genus. Some results in this direction have been obtained in [5, 6]. Using the geometric-Hamiltonian structure [7] of the Whitham equations, Tsarev [8] showed that these equations can be locally integrated by a generalization of the method of characteristics. Namely, he proved that if the functions w_i = w_i(u_1, u_2, . . . , u_{2g+1}), i = 1, . . . , 2g + 1, solve the linear over-determined system

\[ \frac{\partial w_i}{\partial u_j} = \frac{1}{\lambda_i - \lambda_j}\frac{\partial\lambda_i}{\partial u_j}\,[w_i - w_j], \qquad i, j = 1, 2, \dots, 2g+1,\quad i \ne j, \tag{1.4} \]

where λ_i = λ_i(u_1, u_2, . . . , u_{2g+1}), i = 1, . . . , 2g + 1, are the speeds in (1.1), then the solution u(x, t) = (u_1(x, t), u_2(x, t), . . . , u_{2g+1}(x, t)) of the so-called hodograph transformation

\[ x = -\lambda_i(\mathbf{u})\,t + w_i(\mathbf{u}), \qquad i = 1, \dots, 2g+1, \tag{1.5} \]

satisfies system (1.1). Conversely, any solution (u_1(x, t), u_2(x, t), . . . , u_{2g+1}(x, t)) of (1.1) can be obtained in this way.

Furthermore, the solution w_i(u), i = 1, . . . , 2g + 1, g ≥ 0, of (1.4) must satisfy some natural matching conditions which guarantee that the g-phase solutions of (1.1) for different g are glued together in order to produce a C^1-smooth curve in the (x, u) plane.

Tsarev's theorem relies on two factors:

(a) the existence of a solution of the linear over-determined system (1.4);
(b) the existence of a real solution u_1(x, t) > u_2(x, t) > · · · > u_{2g+1}(x, t) of the hodograph transformation (1.5).

In this paper, we investigate problem (a). We construct a new expression for the solution of the Tsarev system. This construction enables us to extend the solution of the Tsarev system, for g > 1, from analytic initial data with polynomial or exponential growth at infinity [15] to any smooth initial data. In addition, this new expression has the following advantages:

(i) it is possible to evaluate explicitly the Jacobian of the hodograph transformation (see (5.45)). It turns out that the determinant of the Jacobian is proportional to the product ∏_{j=1}^{2g+1} Φ^g(u_j; u), where the function Φ^g(r; u) can be explicitly obtained from the initial data;
(ii) it is simpler to study the hodograph transformation near the phase transition boundary.


The investigation of the initial value problem of the Whitham equations was initiated by Gurevich and Pitaevskii [9]. In the case g ≤ 1, they solved Equations (1.1) for step-like initial data and studied numerically the case of cubic initial data. Krichever [10] introduced an algebro-geometric procedure to integrate (1.4). Based on this procedure, Potemin [11] obtained the explicit solution of the system (1.4) for cubic initial data. Kudashev and Sharapov in [12] and Gurevich, Krylov and El in [13] connected the solution of the Tsarev system (1.4) to the solution of some linear overdetermined systems of the Euler–Poisson–Darboux type introduced in [14].

The structure of the solution of the systems of the Euler–Poisson–Darboux type was investigated in depth by Tian in [5], where he obtained for g ≤ 1 the solution of the Tsarev system from the solution of the systems of the Euler–Poisson–Darboux type for any smooth monotone initial data. Furthermore, he partly solved problem (b), proving the solvability of the hodograph transformation for g ≤ 1. The structure of the multiphase solutions has been investigated by Tian [15] and later by El [16]. In [15], Tian built the solution of the Tsarev system for g > 1 for generic polynomial initial data. Again, such a solution is constructed in terms of solutions of the linear overdetermined systems of the Euler–Poisson–Darboux type. Tian's formula is not a priori generalizable to analytic initial data u_0(x) which are bounded at infinity or to any smooth initial data. In this paper, we obtain a new expression for the solution w_i(u), i = 1, . . . , 2g + 1, of the Tsarev system for polynomial initial data. Such an expression enables us to generalize Tian's result to any increasing smooth initial data. Furthermore, we prove that the solution obtained is unique.

This paper is organized as follows. In Section 2 we give some background on Abelian differentials on hyperelliptic Riemann surfaces. We describe the Whitham equations in Section 3, where we show that the solution of the Tsarev system with some given natural matching conditions is unique. In Section 4, we construct a new formula for the solution of the Tsarev system (1.4) for polynomial initial data. Then we show that the formula obtained can be extended to any smooth increasing initial data. In Section 5, we prove that such a formula guarantees that the g-phase solutions of (1.1) for different g are glued together in order to produce a C^1-smooth multivalued curve in the x−u plane evolving smoothly with time. Our conclusions are drawn in Section 6.

2. Riemann Surfaces and Abelian Differentials: Notations and Definitions

Let

\[ S_g := \Big\{P = (r, \mu),\ \mu^2 = \prod_{j=1}^{2g+1}(r - u_j)\Big\} \tag{2.1} \]

be the hyperelliptic Riemann surface of genus g ≥ 0 with real branch points u_1 > u_2 > · · · > u_{2g+1}. We choose the basis {α_j, β_j}_{j=1}^{g} of the homology group H_1(S_g) so that α_j lies fully on the upper sheet and encircles clockwise the interval [u_{2j}, u_{2j−1}], j = 1, . . . , g, while β_j emerges on the upper sheet on the cut [u_{2j}, u_{2j−1}], passes anti-clockwise to the lower sheet through the cut (−∞, u_{2g+1}] and returns to the initial point through the lower sheet.

The one-forms that are analytic on the closed Riemann surface S_g except for a finite number of points are called Abelian differentials.

We define on S_g the following Abelian differentials [17]:

(1) The canonical basis of holomorphic one-forms or Abelian differentials of the first kind φ_1, φ_2, . . . , φ_g:

\[ \phi_k(r) = \frac{r^{g-1}\gamma^k_1 + r^{g-2}\gamma^k_2 + \cdots + \gamma^k_g}{\mu(r)}\,dr, \qquad k = 1, \dots, g. \tag{2.2} \]

The constants γ^k_i are uniquely determined by the normalization conditions

\[ \int_{\alpha_j}\phi_k = \delta_{jk}, \qquad j, k = 1, \dots, g. \tag{2.3} \]

We remark that a holomorphic differential having all its α-periods equal to zero is identically zero [17].

(2) The set σ^g_k, k ≥ 0, g ≥ 0, of the Abelian differentials of the second kind with a pole of order 2k + 2 at infinity, with asymptotic behavior

\[ \sigma^g_k(r) = \left[r^{k-\frac{1}{2}} + O(r^{-\frac{3}{2}})\right]dr \quad \text{for large } r \tag{2.4} \]

and normalized by the condition

\[ \int_{\alpha_j}\sigma^g_k = 0, \qquad j = 1, \dots, g. \tag{2.5} \]

We use the notation

\[ \sigma^g_0(r) = dp^g(r), \qquad 12\,\sigma^g_1(r) = dq^g(r), \qquad g \ge 0. \tag{2.6} \]

In the literature, the differentials dp^g(r) and dq^g(r) are called quasi-momentum and quasi-energy, respectively [7]. The explicit formula for the differentials σ^g_k, k ≥ 0, is given by the expression

\[ \sigma^g_k(r) = \frac{P^g_k(r)}{\mu(r)}\,dr, \qquad P^g_k(r) = r^{g+k} + c^k_1r^{g+k-1} + c^k_2r^{g+k-2} + \cdots + c^k_{g+k}, \tag{2.7} \]

where the coefficients c^k_i = c^k_i(u), u = (u_1, u_2, . . . , u_{2g+1}), i = 1, . . . , g + k, are uniquely determined by (2.4) and (2.5).

(3) The Abelian differential of the third kind ω_{qq_0}(r) with first-order poles at the points Q = (q, µ(q)) and Q_0 = (q_0, µ(q_0)) with residues ±1, respectively. Its periods are normalized by the relation

\[ \int_{\alpha_j}\omega_{qq_0}(r) = 0, \qquad j = 1, \dots, g. \tag{2.8} \]


2.1. RIEMANN BILINEAR RELATIONS

Let ω_1 and ω_2 be two Abelian differentials on the Riemann surface S_g. We suppose that ω_1 has some poles with nonzero residue so that the integral d^{−1}ω_1 has logarithmic singularities on S_g. Let s be the path connecting the singular points of d^{−1}ω_1. We have the following relation:

\[ \sum_{j=1}^{g}\left[\int_{\alpha_j}\omega_1\int_{\beta_j}\omega_2 - \int_{\alpha_j}\omega_2\int_{\beta_j}\omega_1\right] + \int_s\Delta(d^{-1}\omega_1)\,\omega_2 = 2\pi i\sum_{S_g - s}\operatorname{Res}\left[(d^{-1}\omega_1)\,\omega_2\right], \tag{2.9} \]

where Δ(d^{−1}ω_1) is the difference of the values of d^{−1}ω_1 on the two sides of the cut s and the quantity Σ_{S_g−s} Res[(d^{−1}ω_1)ω_2] is the sum of the residues of the differential (d^{−1}ω_1)ω_2 on the cut surface S_g − s. This formula is known as the Riemann bilinear period relation [18].

Assuming ω_1 = ω_{qq_0} and ω_2 = ω_{pp_0} in (2.9) we obtain

\[ \int_{p_0}^{p}\omega_{qq_0} = \int_{q_0}^{q}\omega_{pp_0}. \tag{2.10} \]

Differentiating the above expression with respect to p and q, we obtain the identity

\[ d_q\left[\omega_{qq_0}(p)\right] = d_p\left[\omega_{pp_0}(q)\right], \tag{2.11} \]

where d_q and d_p denote differentiation with respect to q and p, respectively.

In the following we mainly use the normalized Abelian differential of the third kind ω^g_z(r) which has simple poles at the points Q^±(z) = (z, ±µ(z)) with residue ±1, respectively. The differential ω^g_z(r) is explicitly given by the expression

\[ \omega^g_z(r) = \frac{dr}{\mu(r)}\frac{\mu(z)}{r - z} - \sum_{k=1}^{g}\phi_k(r)\int_{\alpha_k}\frac{dt}{\mu(t)}\frac{\mu(z)}{t - z}, \tag{2.12} \]

where φ_k(r), k = 1, . . . , g, is the normalized basis of holomorphic differentials. Using the explicit expression of the φ_k(r)'s in (2.2), we write ω^g_z(r) in the form

\[ \omega^g_z(r) = \frac{dr}{\mu(r)}\frac{\mu(z)}{r - z} - \sum_{j=1}^{g}N_j(z, \mathbf{u})\frac{r^{g-j}}{\mu(r)}\,dr, \tag{2.13} \]

where

\[ N_j(z, \mathbf{u}) = \sum_{k=1}^{g}\gamma^k_j\int_{\alpha_k}\frac{dt}{\mu(t)}\frac{\mu(z)}{t - z} \tag{2.14} \]

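and the coefficients γ^k_j have been defined in (2.2). In order to provide a more useful expression for the N_j's, we apply the Riemann bilinear relation (2.9) to the differentials σ^g_m(r) and ω^g_z(r), getting

\[ \int_{Q^-(z)}^{Q^+(z)}\sigma^g_m(\xi) = -\operatorname*{Res}_{r=\infty}\left[\omega^g_z(r)\,d^{-1}\sigma^g_m(r)\right] = -\frac{4}{2m+1}\Big(-\mu(z)\,\epsilon_{mg} + \sum_{j=1}^{g}N_j(z)\,\Gamma_{m+1-j}\Big), \qquad m = 0, \dots, g. \tag{2.15} \]

In the above formula, ε_{mg} = 1 for m = g and zero otherwise, and the Γ_l's are the coefficients of the expansion for ξ → ∞ of

\[ \frac{1}{\mu(\xi)} = \xi^{-g-\frac{1}{2}}\left(\Gamma_0 + \frac{\Gamma_1}{\xi} + \frac{\Gamma_2}{\xi^2} + \cdots + \frac{\Gamma_l}{\xi^l} + \cdots\right). \tag{2.16} \]

We define Γ_k = 0 for k < 0. Solving (2.15) for N_j(z, u), we obtain

\[
\begin{pmatrix} N_1(z,\mathbf{u}) \\ N_2(z,\mathbf{u}) \\ \cdots \\ N_g(z,\mathbf{u}) \\ -\mu(z) \end{pmatrix}
=
\begin{pmatrix}
\tilde\Gamma_0 & 0 & 0 & \cdots & 0 \\
\tilde\Gamma_1 & \tilde\Gamma_0 & 0 & \cdots & 0 \\
\cdots & \cdots & \cdots & \cdots & \cdots \\
\tilde\Gamma_{g-1} & \tilde\Gamma_{g-2} & \cdots & \tilde\Gamma_0 & 0 \\
\tilde\Gamma_g & \tilde\Gamma_{g-1} & \cdots & \tilde\Gamma_1 & \tilde\Gamma_0
\end{pmatrix}
\begin{pmatrix}
-\frac{1}{4}\int_{Q^-(z)}^{Q^+(z)}\sigma^g_0(\xi) \\
-\frac{3}{4}\int_{Q^-(z)}^{Q^+(z)}\sigma^g_1(\xi) \\
\cdots \\
-\frac{2g-1}{4}\int_{Q^-(z)}^{Q^+(z)}\sigma^g_{g-1}(\xi) \\
-\frac{2g+1}{4}\int_{Q^-(z)}^{Q^+(z)}\sigma^g_g(\xi)
\end{pmatrix},
\tag{2.17}
\]

where the \tilde\Gamma_k's are the coefficients of the expansion for ξ → ∞ of

\[ \mu(\xi) = \xi^{g+\frac{1}{2}}\left(\tilde\Gamma_0 + \frac{\tilde\Gamma_1}{\xi} + \frac{\tilde\Gamma_2}{\xi^2} + \cdots + \frac{\tilde\Gamma_l}{\xi^l} + \cdots\right). \tag{2.18} \]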
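The expansion coefficients in (2.16) and (2.18) are easy to generate numerically, since µ(ξ) = ξ^{g+1/2} ∏_j (1 − u_j/ξ)^{1/2}. The following Python sketch is only an illustration: the function names and the sample branch points are hypothetical. It builds both coefficient sequences from binomial series and checks that their convolution is the unit sequence, which is the property that allows the triangular system (2.15) to be inverted as in (2.17).

```python
def binomial_factor(u_j, n_terms, power):
    """Coefficients of (1 - u_j/xi)**power in powers of 1/xi."""
    coeffs = [1.0]
    for k in range(1, n_terms):
        coeffs.append(coeffs[-1] * (power - (k - 1)) / k * (-u_j))
    return coeffs

def multiply(a, b):
    """Product of two truncated series of equal length."""
    n = len(a)
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(n)]

def expansion_coefficients(branch_points, n_terms, power):
    """power = -1/2: coefficients of 1/mu as in (2.16); power = +1/2: coefficients of mu as in (2.18)."""
    coeffs = [1.0] + [0.0] * (n_terms - 1)
    for u_j in branch_points:
        coeffs = multiply(coeffs, binomial_factor(u_j, n_terms, power))
    return coeffs

if __name__ == "__main__":
    u = (3.0, 2.0, 1.0)  # sample branch points for genus 1, illustrative only
    gamma = expansion_coefficients(u, 6, -0.5)
    gamma_tilde = expansion_coefficients(u, 6, 0.5)
    print(gamma)
    print(gamma_tilde)
    print(multiply(gamma, gamma_tilde))  # should be close to [1, 0, 0, ...]
```

In particular the leading coefficients are 1 in both expansions, and the first-order ones are ±(1/2) Σ_j u_j.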

From the relation (2.17), we obtain the identity, which will be useful later,

\[ \mu(z) = \frac{1}{4}\sum_{k=1}^{g+1}(2k-1)\,\tilde\Gamma_{g+1-k}\int_{Q^-(z)}^{Q^+(z)}\sigma^g_{k-1}(\xi). \tag{2.19} \]

The next proposition is also important for our subsequent considerations.

PROPOSITION 2.1. The Abelian differentials of the second kind σ^g_k(r), k ≥ 0, defined in (2.4) satisfy the relations

\[ \sigma^g_k(r) = \frac{1}{2}\operatorname*{Res}_{z=\infty}\left[\omega^g_z(r)\,z^{k-\frac{1}{2}}\,dz\right] = -\frac{1}{2k+1}\,d_r\operatorname*{Res}_{z=\infty}\left[\omega^g_r(z)\,z^{k+\frac{1}{2}}\right], \tag{2.20} \]

where ω^g_z(r) has been defined in (2.12), ω^g_r(z) is the normalized Abelian differential of the third kind with simple poles at the points Q^±(r) = (r, ±µ(r)) with residue ±1, respectively, and d_r denotes differentiation with respect to r.


Proof. The differential Res_{z=∞}[ω^g_z(r) z^{k−1/2} dz] is normalized because ω^g_z(r) is a normalized differential. From (2.13) and (2.17) it can be easily shown that

\[ \operatorname*{Res}_{z=\infty}\left[\omega^g_z(r)\,z^{k-\frac{1}{2}}\,dz\right] = r^{k-\frac{1}{2}}\,dr + O(r^{-\frac{3}{2}})\,dr \quad \text{for } r \to \infty. \]

Therefore, Res_{z=∞}[ω^g_z(r) z^{k−1/2} dz] coincides with the normalized Abelian differential of the second kind σ^g_k(r). To prove the second equality in (2.20), we consider the integral in the z variable

\[ 0 = \oint_{C_\infty}d_z\left(\omega^g_z(r)\,z^{k+\frac{1}{2}}\right) = \oint_{C_\infty}z^{k+\frac{1}{2}}\left(d_z\omega^g_z(r)\right) + \oint_{C_\infty}\left(k+\tfrac{1}{2}\right)\omega^g_z(r)\,z^{k-\frac{1}{2}}\,dz, \tag{2.21} \]

where C_∞ is a closed contour around the point at infinity. From (2.11), we obtain the identity d_zω^g_z(r) = d_rω^g_r(z). Substituting the above identity in the right-hand side of (2.21), we obtain the second relation in (2.20). ✷

3. Preliminaries on the Theory of the Whitham Equations

The speeds λ_i(u_1, u_2, . . . , u_{2g+1}) of the g-phase Whitham equations (1.1) are given by the ratio [1, 2]:

\[ \lambda_i(\mathbf{u}) = \left.\frac{dq^g(r)}{dp^g(r)}\right|_{r=u_i}, \qquad i = 1, 2, \dots, 2g+1, \tag{3.1} \]

where dp^g(r) and dq^g(r) have been defined in (2.6). In the case g = 0,

\[ dp^0(r) = \frac{dr}{\sqrt{r-u}}, \qquad dq^0(r) = \frac{12r - 6u}{\sqrt{r-u}}\,dr, \tag{3.2} \]

so that one obtains the zero-phase Whitham equation (1.2). For monotonically increasing smooth initial data x = f(u)|_{t=0}, the solution of the zero-phase equation (1.2) is obtained by the method of characteristics [1] and is given by the expression

\[ x = -6tu + f(u). \tag{3.3} \]
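The genus-zero formulas (3.2) can be recovered directly from the definitions (2.4), (2.6) and (2.7). The short computation below is a sketch of that check; the symbol c stands for the single coefficient c^1_1 of (2.7) when g = 0 and k = 1.

```latex
% g = 0: \mu(r) = \sqrt{r-u}, and (2.7) gives
\sigma^0_0(r) = \frac{dr}{\sqrt{r-u}}, \qquad
\sigma^0_1(r) = \frac{r + c}{\sqrt{r-u}}\,dr .
% Expanding for large r:
\frac{r+c}{\sqrt{r-u}}
  = r^{1/2}\Bigl(1+\frac{c}{r}\Bigr)\Bigl(1+\frac{u}{2r}+O(r^{-2})\Bigr)
  = r^{1/2} + \Bigl(c+\frac{u}{2}\Bigr)r^{-1/2} + O(r^{-3/2}).
% Condition (2.4) allows no r^{-1/2} term, so c = -u/2, hence
dq^0(r) = 12\,\sigma^0_1(r) = \frac{12r-6u}{\sqrt{r-u}}\,dr, \qquad
\lambda(u) = \left.\frac{dq^0(r)}{dp^0(r)}\right|_{r=u} = 12u - 6u = 6u .
```

The value 6u is the speed appearing in the zero-phase equation (1.2), and the characteristic form (3.3) is the hodograph relation x = −λt + w with w = f(u).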

The zero-phase solution is globally well-defined only for 0 ≤ t < t_0, where t_0 = (1/6) min_{u∈R}[f′(u)] is the time of gradient catastrophe of the solution (3.3). The breaking is caused by an inflection point in the initial data. For t ≥ t_0, we expect to have single, double and higher phase solutions. For higher genus the Whitham equations can be locally integrated using a generalization of the characteristic equation (3.3). We have the following theorem of Tsarev [8].


THEOREM 3.1. If w_i(u) solves the linear over-determined system

\[ \frac{\partial w_i}{\partial u_j} = a_{ij}\,[w_i - w_j], \qquad a_{ij} = \frac{1}{\lambda_i - \lambda_j}\frac{\partial\lambda_i}{\partial u_j}, \qquad i, j = 1, 2, \dots, 2g+1,\quad i \ne j, \tag{3.4} \]

then the solution (u_1(x, t), u_2(x, t), . . . , u_{2g+1}(x, t)) of the hodograph transformation

\[ x = -\lambda_i(\mathbf{u})\,t + w_i(\mathbf{u}), \qquad i = 1, \dots, 2g+1, \tag{3.5} \]

satisfies system (1.1). Conversely, any solution (u_1, u_2, . . . , u_{2g+1}) of (1.1) can be obtained in this way in a neighborhood of (x_0, t_0) where the u_{ix}'s are not vanishing.

To guarantee that the g-phase solutions for different g are attached continuously, the following natural matching conditions involving f(u) must be imposed on w_i(u_1, u_2, . . . , u_{2g+1}), i = 1, . . . , 2g + 1, g > 0.

When u_l = u_{l+1}, 1 ≤ l ≤ 2g,

\[ w^g_l(u_1, \dots, u_{l-1}, u_l, u_l, u_{l+2}, \dots, u_{2g+1}) = w^g_{l+1}(u_1, \dots, u_{l-1}, u_l, u_l, u_{l+2}, \dots, u_{2g+1}) \tag{3.6} \]

and, for 1 ≤ i ≤ 2g + 1, i ≠ l, l + 1,

\[ w^g_i(u_1, \dots, u_{l-1}, u_l, u_l, u_{l+2}, \dots, u_{2g+1}) = w^{g-1}_i(u_1, \dots, \hat{u}_l, \hat{u}_l, \dots, u_{2g+1}). \tag{3.7} \]

The superscripts g and g − 1 in the w_i's specify the corresponding genus and the hat denotes the variables that have been dropped. When g = 1 we have that

\[ w^1_1(u_1, u_1, u_3) = w^1_2(u_1, u_1, u_3), \qquad w^1_2(u_1, u_3, u_3) = w^1_3(u_1, u_3, u_3) \tag{3.8} \]

and

\[ w^1_3(u_1, u_1, u_3) = f(u_3), \qquad w^1_1(u_1, u_3, u_3) = f(u_1), \tag{3.9} \]

where f(u) is the initial data. In the following, we will sometimes omit the superscript g when we are referring to genus g quantities.

We remark that the λ_i(u)'s in (3.1) satisfy the matching conditions (3.6)–(3.8) and, for g = 1, we have

\[ \lambda_1(u_1, u_3, u_3) = 6u_1, \qquad \lambda_3(u_1, u_1, u_3) = 6u_3. \]

The matching conditions (3.6)–(3.9) guarantee that the solution of the Tsarev system (3.4) is unique.


THEOREM 3.2. If the initial data f(u) ≡ 0, then the solution w^g_i(u) of (3.4) with matching conditions (3.6)–(3.9) is identically zero for 1 ≤ i ≤ 2g + 1, for any g ≥ 0 and for all u_1 > u_2 > · · · > u_{2g+1}.

Proof. The proof is obtained by induction on g. The statement is satisfied for g = 0.

For g = 1, we repeat the arguments of [5]. We fix u_2 and we consider Equation (3.4) with matching conditions (3.8)–(3.9), namely

\[ \frac{\partial w^1_1}{\partial u_3} = a_{13}\,[w^1_1 - w^1_3], \qquad \frac{\partial w^1_3}{\partial u_1} = a_{31}\,[w^1_3 - w^1_1], \]

\[ w^1_1(u_1, u_2, u_2) = f(u_1) \equiv 0, \qquad w^1_3(u_2, u_2, u_3) = f(u_3) \equiv 0. \]

We can regard each of the above equations as a first-order linear ordinary differential equation with a nonhomogeneous term. Integrating them, we obtain a couple of integral equations. By the standard contraction mapping method, it can be shown that when f(u) ≡ 0, this system has only the zero solution, i.e. w^1_1 ≡ w^1_3 ≡ 0 for (u_1, u_3) satisfying u_1 > u_2 > u_3. Because of the arbitrariness of u_2, w^1_1 and w^1_3 vanish as functions of (u_1, u_2, u_3) and, therefore, by (3.4), so does w^1_2(u). Now we suppose the theorem true for genus g − 1 and we prove it for genus g. We fix u_2 > u_3 > · · · > u_{2g} and we consider Equation (3.4) for w^g_1 and w^g_{2g+1} with the matching conditions (3.6)–(3.7), namely

\[
\begin{aligned}
&\frac{\partial}{\partial u_{2g+1}}w^g_1 = a_{1(2g+1)}\,[w^g_1 - w^g_{2g+1}], \qquad \frac{\partial}{\partial u_1}w^g_{2g+1} = a_{(2g+1)1}\,[w^g_{2g+1} - w^g_1], \\
&w^g_1(u_1, u_2, \dots, u_{2g}, u_{2g}) = w^{g-1}_1(u_1, u_2, \dots, u_{2g-1}, \hat{u}_{2g}, \hat{u}_{2g}) \equiv 0, \\
&w^g_{2g+1}(u_2, u_2, \dots, u_{2g}, u_{2g+1}) = w^{g-1}_{2g+1}(\hat{u}_2, \hat{u}_2, u_3, \dots, u_{2g}, u_{2g+1}) \equiv 0.
\end{aligned}
\tag{3.10}
\]

Repeating the arguments developed for genus g = 1, we may conclude that w^g_1(u) ≡ w^g_{2g+1}(u) ≡ 0, for arbitrary u_1 > u_2 > · · · > u_{2g+1}. We then repeat the above argument fixing u_1 > u_3 > · · · > u_{2g−1} > u_{2g+1} and considering Equation (3.4) for w^g_2(u) and w^g_{2g}(u) with the matching conditions (3.6)–(3.7), namely

\[
\begin{aligned}
&\frac{\partial}{\partial u_{2g}}w^g_2 = a_{2(2g)}\,[w^g_2 - w^g_{2g}], \qquad \frac{\partial}{\partial u_2}w^g_{2g} = a_{(2g)2}\,[w^g_{2g} - w^g_2], \\
&w^g_2(u_1, u_2, \dots, u_{2g-1}, u_{2g+1}, u_{2g+1}) = w^{g-1}_2(u_1, u_2, \dots, u_{2g-1}, \hat{u}_{2g+1}, \hat{u}_{2g+1}) \equiv 0, \\
&w^g_{2g}(u_1, u_1, u_3, \dots, u_{2g}, u_{2g+1}) = w^{g-1}_{2g}(\hat{u}_1, \hat{u}_1, u_3, \dots, u_{2g}, u_{2g+1}) \equiv 0.
\end{aligned}
\tag{3.11}
\]

It can be easily shown that also w^g_2(u) ≡ w^g_{2g}(u) ≡ 0 for arbitrary u_1 > u_2 > · · · > u_{2g+1}. Repeating these arguments another g − 2 times, we conclude that w^g_i(u) ≡ 0 for 1 ≤ i ≤ g, g + 2 ≤ i ≤ 2g + 1 and for arbitrary u_1 > u_2 > · · · > u_{2g+1}. Applying (3.4) and the matching conditions (3.6)–(3.7), we can prove that also w^g_{g+1}(u) is identically zero. The theorem is then proved. ✷

The solution of the Tsarev system (3.4) with the matching conditions (3.6)–(3.9) has been obtained in [15, 10] for monotonically increasing polynomial initial data of the form

\[ x = f_a(u) = c_0 + c_1u + \cdots + c_ku^k + \cdots, \tag{3.12} \]

where we assume that only a finite number of the c_k are different from zero. For such initial data, the w_i(u)'s which satisfy (3.4) and the matching conditions (3.6)–(3.9) are given by the expression [15]

\[ w_i(\mathbf{u}) = \left.\frac{ds^g(r)}{dp^g(r)}\right|_{r=u_i}, \qquad i = 1, \dots, 2g+1. \tag{3.13} \]

The differential ds^g(r) in (3.13) is given by

\[ ds^g(r) = \sum_{k=0}^{\infty}\frac{2^kk!}{(2k-1)!!}\,c_k\,\sigma^g_k(r), \tag{3.14} \]

and the differentials σ^g_k(r), k ≥ 0, have been defined in (2.7). For the initial data (3.12), Tian [15] has reduced the expression of the w_i(u)'s in (3.13) to the form

\[ w_i(\mathbf{u}) = \frac{\dfrac{\partial}{\partial u_i}\displaystyle\sum_{k=1}^{g}q_k\gamma^j_k}{\dfrac{\partial\gamma^j_1}{\partial u_i}}, \qquad i = 1, 2, \dots, 2g+1,\quad 1 \le j \le g, \tag{3.15} \]

where the γ^j_k's are the normalization constants of the holomorphic differential φ_j(r) in (2.2) and the functions q_k = q_k(u), k = 1, . . . , g, solve the linear over-determined system of the Euler–Poisson–Darboux type [15]

\[
\begin{aligned}
&2(u_i - u_j)\frac{\partial^2q_k(\mathbf{u})}{\partial u_i\partial u_j} = \frac{\partial q_k(\mathbf{u})}{\partial u_i} - \frac{\partial q_k(\mathbf{u})}{\partial u_j}, \qquad i, j = 1, \dots, 2g+1,\quad k = 1, \dots, g, \\
&q_k(\underbrace{u, u, \dots, u}_{2g+1}) = \frac{2^{g-1}}{(2g-1)!!}\,u^{-k+\frac{1}{2}}\frac{d^{g-k}}{du^{g-k}}\left(u^{g-\frac{1}{2}}f_a^{(k-1)}(u)\right),
\end{aligned}
\tag{3.16}
\]

where f_a^{(k−1)}(u) is the (k − 1)th derivative of the polynomial initial data f_a(u) defined in (3.12). Thus, the solution of the Tsarev system is reduced to the solution of the above systems of equations. The systems (3.16) can be integrated for any smooth initial data [5].
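The normalization 2^k k!/(2k − 1)!! in (3.14) can be sanity-checked in genus 0, where (3.13) should reduce to w(u) = f_a(u) so that the hodograph transformation (3.5) becomes the characteristic equation (3.3). The sketch below rests on an assumption derived from (2.4) and (2.7) rather than stated in the text: for g = 0 the polynomial P^0_k(r) is the polynomial part of r^k (1 − u/r)^{1/2}. Function names and the sample coefficients are hypothetical.

```python
import math

def double_factorial_odd(k):
    """(2k-1)!! with the convention (-1)!! = 1."""
    result = 1
    for m in range(1, 2 * k, 2):
        result *= m
    return result

def p_at_branch_point(k, u):
    """P_k^0(u): polynomial part of r**k * (1 - u/r)**0.5 evaluated at r = u (assumed form)."""
    total, binom = 0.0, 1.0  # binom holds the generalized binomial C(1/2, j)
    for j in range(k + 1):
        total += binom * (-1.0) ** j
        binom *= (0.5 - j) / (j + 1)
    return total * u ** k

def w_genus_zero(coeffs, u):
    """ds^0/dp^0 at r = u for f_a(u) = sum_k coeffs[k] * u**k, following (3.13)-(3.14)."""
    return sum(
        2 ** k * math.factorial(k) / double_factorial_odd(k) * c * p_at_branch_point(k, u)
        for k, c in enumerate(coeffs)
    )

if __name__ == "__main__":
    coeffs = [1.0, 2.0, 0.0, 0.5]  # f_a(u) = 1 + 2u + 0.5 u^3, illustrative only
    u = 1.3
    f_value = sum(c * u ** k for k, c in enumerate(coeffs))
    print(w_genus_zero(coeffs, u), f_value)
```

The two printed numbers should coincide up to rounding, reflecting the identity Σ_{j=0}^{k} (−1)^j C(1/2, j) = (2k − 1)!!/(2^k k!).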


THEOREM 3.3. Let f (u) be a smooth function with domain (a, b), −∞ � a <

b � +∞. The initial value problem

2(ui − uj )∂2qk(u)

∂ui∂uj

= ∂qk(u)

∂ui

− ∂qk(u)

∂uj

,

i �= j, i, j = 1, . . . , 2g + 1, g > 0, (3.17)

qk(u, u, . . . , u︸︷︷︸2g+1

) = Fk(u), (3.18)

Fk(u) = 2(g−1)

(2g − 1)!!u−k+ 1

2dg−k

dug−k

(ug− 1

2 f (k−1)(u)), (3.19)

with the ordering b > u1 > u2 > · · · > u2g+1 > a, has one and only one solution.The solution is symmetric and is given by

qk(u) = 1

C

∫ 1

−1

∫ 1

−1· · ·

∫ 1

−1dξ1dξ2 . . . dξ2g(1 + ξ2g)

g−1 ×

× (1 + ξ2g−1)g− 3

2 . . . (1 + ξ3)12 (1 + ξ1)

− 12 ×

× Fk(1+ξ2g

2 (. . . (1+ξ2

2 (1+ξ1

2 u1 + 1−ξ12 u2) + 1−ξ2

2 u3) + · · ·) + 1−ξ2g

2 u2g+1)√(1 − ξ1)(1 − ξ2) . . . (1 − ξ2g)

,

k = 1, . . . , g, (3.20)

where C = ∏2gm=1 Cm,

Cm =∫ 1

−1

(1 + µ)m2 −1

√1 − µ

dµ, m > 0. (3.21)

Proof. To prove the theorem we follow the procedure in [5]. We start with the following lemma.

LEMMA 3.4 [5]. The system

$$ \begin{aligned}
& 2(z - y)\,h_{zy} = h_z - \rho\,h_y, \qquad \rho > 0,\\
& h(z, z) = s(z),
\end{aligned} \tag{3.22} $$

has, for any smooth initial data $s(z)$, one and only one solution. Moreover, the solution can be written explicitly

$$ h(z, y) = \frac{1}{C_\rho}\int_{-1}^{1}\frac{s\bigl(\frac{1+\mu}{2}z + \frac{1-\mu}{2}y\bigr)}{\sqrt{1-\mu}}\,(1+\mu)^{\frac{\rho}{2}-1}\,d\mu, \tag{3.23} $$

where $C_\rho$ has been defined in (3.21).
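As an informal sanity check of Lemma 3.4 (it is not part of the original argument), the following sketch verifies (3.22)–(3.23) exactly for monomial data $s(z) = z^n$. It assumes the substitution $t = (1+\mu)/2$, under which the weight in (3.23) becomes the Beta$(\rho/2, 1/2)$ density, so that $h(z,y) = E[(tz + (1-t)y)^n]$ has rational coefficients given by rising factorials. The names `rising`, `h_monomial` and `pde_residual` are illustrative choices, not notation from the paper.

```python
from fractions import Fraction
from math import comb


def rising(x, n):
    """Rising factorial x (x+1) ... (x+n-1), computed exactly."""
    out = Fraction(1)
    for i in range(n):
        out *= x + i
    return out


def h_monomial(n, rho):
    """Coefficients {(a, b): c} of h(z, y) = sum c z^a y^b for s(z) = z^n.

    Assumption: with t = (1 + mu)/2 the normalized weight in (3.23) is the
    Beta(rho/2, 1/2) density, so h(z, y) = E[(t z + (1 - t) y)^n].
    """
    alpha, beta = Fraction(rho) / 2, Fraction(1, 2)
    norm = rising(alpha + beta, n)
    return {
        (a, n - a): comb(n, a) * rising(alpha, a) * rising(beta, n - a) / norm
        for a in range(n + 1)
    }


def pde_residual(h, rho):
    """Coefficients of 2(z - y) h_zy - h_z + rho h_y (all zero iff (3.22) holds)."""
    res = {}

    def add(key, val):
        res[key] = res.get(key, Fraction(0)) + val

    for (a, b), c in h.items():
        if a and b:
            add((a, b - 1), 2 * a * b * c)    # 2 z h_zy
            add((a - 1, b), -2 * a * b * c)   # -2 y h_zy
        if a:
            add((a - 1, b), -a * c)           # -h_z
        if b:
            add((a, b - 1), rho * b * c)      # +rho h_y
    return res


if __name__ == "__main__":
    for rho in (1, 2, 3, Fraction(5, 2)):
        for n in range(6):
            h = h_monomial(n, rho)
            on_diagonal = sum(h.values())     # h(z, z) / z^n
            ok = all(v == 0 for v in pde_residual(h, rho).values())
            print(rho, n, on_diagonal == 1, ok)
```

Because the system is linear in $s$, agreement on monomials extends to polynomial data; the sketch says nothing about uniqueness or about general smooth data.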

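Using the above lemma, the linear over-determined systems (3.17) can be integrated for any smooth initial data in the following way. Suppose that $q_k(u_1, u_2, \dots, u_{2g+1})$ is the solution of (3.17)–(3.19). Clearly

$$ A_k(u_1, u_{2g+1}) = q_k(\underbrace{u_1, u_1, \dots, u_1}_{2g}, u_{2g+1}) $$

satisfies

$$ \begin{aligned}
& 2(u_1 - u_{2g+1})\frac{\partial^2 A_k}{\partial u_1\,\partial u_{2g+1}} = \frac{\partial A_k}{\partial u_1} - 2g\,\frac{\partial A_k}{\partial u_{2g+1}},\\
& A_k(u, u) = F_k(u),
\end{aligned} \tag{3.24} $$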

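which, by Lemma 3.4, implies that

$$ A_k(u_1, u_{2g+1}) = \frac{1}{C_{2g}}\int_{-1}^{1}\frac{F_k\bigl(\frac{1+\xi_{2g}}{2}u_1 + \frac{1-\xi_{2g}}{2}u_{2g+1}\bigr)}{\sqrt{1-\xi_{2g}}}\,(1+\xi_{2g})^{g-1}\,d\xi_{2g}. \tag{3.25} $$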

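For each fixed $u_{2g+1}$, the function

$$ B_k(u_1, u_{2g}, u_{2g+1}) = q_k(\underbrace{u_1, \dots, u_1}_{2g-1}, u_{2g}, u_{2g+1}) $$

satisfies

$$ \begin{aligned}
& 2(u_1 - u_{2g})\frac{\partial^2 B_k}{\partial u_1\,\partial u_{2g}} = \frac{\partial B_k}{\partial u_1} - (2g-1)\frac{\partial B_k}{\partial u_{2g}},\\
& B_k(u, u, u_{2g+1}) = A_k(u, u_{2g+1}).
\end{aligned} \tag{3.26} $$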

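Again using Lemma 3.4, we obtain

$$ \begin{aligned}
B_k(u_1, u_{2g}, u_{2g+1}) = \frac{1}{C_{2g}C_{2g-1}}\int_{-1}^{1}\!\int_{-1}^{1} & d\xi_{2g}\,d\xi_{2g-1}\,(1+\xi_{2g})^{g-1}(1+\xi_{2g-1})^{g-\frac32}\\
& \times \frac{F_k\Bigl(\frac{1+\xi_{2g}}{2}\bigl(\frac{1+\xi_{2g-1}}{2}u_1 + \frac{1-\xi_{2g-1}}{2}u_{2g}\bigr) + \frac{1-\xi_{2g}}{2}u_{2g+1}\Bigr)}{\sqrt{1-\xi_{2g}}\,\sqrt{1-\xi_{2g-1}}}.
\end{aligned} \tag{3.27} $$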

Continuing the process of integration, we obtain the solution (3.20). The uniqueness follows from Lemma 3.4 and the above arguments. The boundary conditions (3.18)–(3.19) are clearly satisfied. The symmetry property is obtained by considering the solution $h_k(u) = q_k(u) - q_k(P(u))$, where $P(u)$ is any reordering of the $u_i$'s. Such a solution satisfies (3.17) with $F_k(u) = 0$ and therefore, by construction, equals zero. ✷
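The nested integral (3.20) can be probed in the same spirit. The sketch below is only an illustration under stated assumptions: it takes the test choice $F_k(u) = u^n$ (any polynomial $F_k$ follows by linearity), substitutes $t_j = (1+\xi_j)/2$ so that the weight attached to $\xi_j$ becomes the Beta$(j/2, 1/2)$ density, and then checks with exact rational arithmetic that the resulting polynomial is symmetric, equals $u^n$ on the diagonal, and satisfies (3.17) for every pair $i \ne j$. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import comb


def rising(x, n):
    """Rising factorial x (x+1) ... (x+n-1), computed exactly."""
    out = Fraction(1)
    for i in range(n):
        out *= x + i
    return out


def q_monomial(n, g):
    """Coefficients of (3.20) for the test choice F_k(u) = u^n.

    Returns {exponents of (u_1, ..., u_{2g+1}): coefficient}.  Assumption:
    t_j = (1 + xi_j)/2 is Beta(j/2, 1/2) distributed, and the argument of F_k
    is built by x_j = t_j x_{j-1} + (1 - t_j) u_{j+1}, starting from x_0 = u_1.
    """
    nvars = 2 * g + 1
    # powers[d] holds the polynomial E[x_j^d]; initially x_0 = u_1.
    powers = {d: {(d,) + (0,) * (nvars - 1): Fraction(1)} for d in range(n + 1)}
    for j in range(1, nvars):                 # integrate over xi_j
        alpha, beta = Fraction(j, 2), Fraction(1, 2)
        updated = {}
        for d in range(n + 1):
            poly = {}
            for a in range(d + 1):
                w = (comb(d, a) * rising(alpha, a) * rising(beta, d - a)
                     / rising(alpha + beta, d))
                for expo, c in powers[a].items():
                    key = expo[:j] + (expo[j] + d - a,) + expo[j + 1:]
                    poly[key] = poly.get(key, Fraction(0)) + w * c
            updated[d] = poly
        powers = updated
    return powers[n]


def is_symmetric(poly):
    """True if every permutation of an exponent tuple has the same coefficient."""
    return all(poly.get(p) == c for e, c in poly.items() for p in permutations(e))


def bump(e, idx, delta):
    return e[:idx] + (e[idx] + delta,) + e[idx + 1:]


def epd_residual(poly, i, j):
    """Coefficients of 2(u_i - u_j) q_ij - q_i + q_j for 0-based i != j."""
    res = {}

    def add(key, val):
        res[key] = res.get(key, Fraction(0)) + val

    for e, c in poly.items():
        a, b = e[i], e[j]
        if a and b:
            add(bump(e, j, -1), 2 * a * b * c)    # 2 u_i d_i d_j q
            add(bump(e, i, -1), -2 * a * b * c)   # -2 u_j d_i d_j q
        if a:
            add(bump(e, i, -1), -a * c)           # -d_i q
        if b:
            add(bump(e, j, -1), b * c)            # +d_j q
    return res


if __name__ == "__main__":
    q = q_monomial(2, 1)
    print(q[(2, 0, 0)], q[(1, 1, 0)], is_symmetric(q))
```

For $g = 1$ and $n = 2$ this gives $q = \tfrac15(u_1^2 + u_2^2 + u_3^2) + \tfrac{2}{15}(u_1u_2 + u_1u_3 + u_2u_3)$, which reduces to $u^2$ when all three arguments coincide.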

The formula for the $w_i$'s obtained in (3.13) is valid only for a Taylor series with an infinite radius of convergence. This corresponds to increasing initial data given by a sum of exponentials, sines, cosines, and polynomials. Therefore, it is not obvious that formula (3.15) can be extended to analytic initial data which have a different behavior at infinity or to any smooth initial data.


In the next section we provide a new expression for the $w_i(u)$'s equivalent to (3.15) which enables us to make such an extension. Namely, we show that these $w_i(u)$'s are the unique solution of the Tsarev system (3.4) with matching conditions (3.6)–(3.9) for any monotonically increasing smooth initial data. Furthermore, this new expression enables us to evaluate the Jacobian of the hodograph transformation (3.5) very easily. Namely, the Jacobian turns out to be proportional to the product $\prod_{i=1}^{2g+1}\Phi^g(u_i; u)$, where the function $\Phi^g(r; u)$ satisfies a linear overdetermined system of the Euler–Poisson–Darboux type.

4. Solution of the Tsarev System

In this section, we build the solution of the Tsarev system (3.4) with matching conditions (3.6)–(3.9) for monotonically increasing smooth initial data. We consider initial data of the form $x = f(u)|_{t=0}$, where $f(u)$ is a monotonically increasing function. The domain of $f$ is the interval $(a, b)$, where $-\infty \le a < b \le +\infty$, and the range of $f$ is the real line $(-\infty, +\infty)$.

In order to obtain such a solution, we need the following technical lemma:

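LEMMA 4.1. The differential $ds_g(r)$ defined in (3.14) can be written in the form

$$ ds_g(r) = 2\mu(r)\Bigl(\partial_r\Psi^g(r; u) + \sum_{k=1}^{2g+1}\partial_{u_k}\Psi^g(r; u)\Bigr)dr + \frac{R_g(r)}{\mu(r)}\,dr, \tag{4.1} $$

where

$$ \Psi^g(r; u) = -\operatorname*{Res}_{z=\infty}\left[\frac{F(z)\,dz}{2\mu(z)(z - r)}\right], \qquad q_k(u) = -\operatorname*{Res}_{z=\infty}\left[\frac{z^{g-k}F(z)\,dz}{2\mu(z)}\right], \quad k = 1, \dots, g, \tag{4.2} $$

$$ F(z) = \int_0^z\frac{f_a(\xi)}{\sqrt{z - \xi}}\,d\xi, \tag{4.3} $$

$$ R_g(r) = 2\sum_{k=1}^{2g+1}\partial_{u_k}q_g(u)\prod_{n=1,\,n\ne k}^{2g+1}(r - u_n) + \sum_{k=1}^{g}q_k(u)\sum_{n=1}^{k}(2n-1)\,\Gamma_{k-n}\,P^g_{n-1}(r), \tag{4.4} $$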

the polynomials $P^g_n(r)$, $n \ge 0$, have been defined in (2.7), the $\Gamma_k$'s have been defined in (2.18) and $f_a(\xi)$ is the analytic initial data (3.12).

Proof. Using the second identity in (2.20), we rewrite the differential $ds_g(r)$ defined in (3.14) in the form

$$ ds_g(r) = -d_r\Bigl(\operatorname*{Res}_{z=\infty}\bigl[\omega^g_r(z)F(z)\bigr]\Bigr), \tag{4.5} $$


where $\omega^g_r(z)$ has been defined in (2.12) and $F(z)$ is the Abel transform defined in (4.3) of the analytic initial data (3.12). The identity (4.5) can be checked in a straightforward manner. Using the explicit expression of $\omega^g_r(z)$ in (2.13), we obtain

$$ ds_g(r) = 2\,d_r\bigl(\mu(r)\Psi^g(r; u)\bigr) + \sum_{k=1}^{g}q_k(u)\sum_{n=1}^{k}(2n-1)\,\Gamma_{k-n}\,\sigma^g_{n-1}(r), \tag{4.6} $$

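where $\Psi^g(r; u)$ and $q_k(u)$ have been defined in (4.2). From (4.2) we get the relations

$$ \frac{\Psi^g(r; u)}{r - u_i} - \frac{\Psi^g(u_i; u)}{r - u_i} = 2\,\partial_{u_i}\Psi^g(r; u), \qquad 2\,\partial_{u_i}q_g(u) = \Psi^g(u_i; u), \tag{4.7} $$

and for $g = 0$ we define $u_1 = u$ and

$$ 2\,\partial_u q_0(u) := \Psi^0(u; u) = f_a(u). $$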

Using (4.7) we transform the expression for $ds_g(r)$ in (4.6) to the form (4.1). ✷

The relation (4.1) enables us to write the quantities

$$ w_i(u) = \left.\frac{ds_g(r)}{dp_g(r)}\right|_{r=u_i}, \qquad i = 1, \dots, 2g+1, $$

in (3.13) in the form

$$ w_i(u) = \frac{1}{P^g_0(u_i)}\Bigl[2\,\partial_{u_i}q_g(u)\prod_{n=1,\,n\ne i}^{2g+1}(u_i - u_n) + \sum_{k=1}^{g}q_k(u)\sum_{n=1}^{k}(2n-1)\,\Gamma_{k-n}\,P^g_{n-1}(u_i)\Bigr]. \tag{4.8} $$

We observe that in the formula (4.8) all the information on the initial data is contained in the functions $q_k(u)$. The functions $q_k = q_k(u)$, $k = 1, \dots, g$, solve the linear over-determined system (3.16).

THEOREM 4.2 (Main Theorem). Let $f(u)$ be a smooth monotonically increasing function with domain $(a, b)$, $-\infty \le a < b \le +\infty$, and range $(-\infty, +\infty)$. If $q_k = q_k(u_1, u_2, \dots, u_{2g+1})$, $1 \le k \le g$, is the symmetric solution of the linear over-determined system (3.17)–(3.19), with the ordering $b > u_1 > u_2 > \cdots > u_{2g+1} > a$, then $w_i(u)$, $i = 1, \dots, 2g+1$, defined by

$$ w_i(u) = \frac{1}{P^g_0(u_i)}\Bigl[2\,\partial_{u_i}q_g(u)\prod_{n=1,\,n\ne i}^{2g+1}(u_i - u_n) + \sum_{k=1}^{g}q_k(u)\sum_{n=1}^{k}(2n-1)\,\Gamma_{k-n}\,P^g_{n-1}(u_i)\Bigr], \tag{4.9} $$

solves the Tsarev system (3.4) with matching conditions (3.6)–(3.9).


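Proof. We consider the nontrivial case where $q_k(u) \not\equiv 0$, $k = 1, \dots, g$, and $\partial_{u_j}q_g(u) \not\equiv 0$, $j = 1, \dots, 2g+1$. The proof of the theorem consists of two parts.

(a) The $w_i(u)$'s defined in (4.9) satisfy (3.4).
Using the definition of $w_i(u)$ in (4.9) we have the following relation:

$$ \begin{aligned}
\partial_{u_j}w_i(u) ={}& 2\,\frac{\prod_{n=1,\,n\ne i}^{2g+1}(u_i - u_n)}{P^g_0(u_i)}\,\partial_{u_j}\partial_{u_i}q_g(u) - 2\,\frac{\prod_{n=1,\,n\ne i}^{2g+1}(u_i - u_n)}{P^g_0(u_i)}\,\partial_{u_i}q_g(u)\left(\frac{\partial_{u_j}P^g_0(u_i)}{P^g_0(u_i)} + \frac{1}{u_i - u_j}\right)\\
&+ \partial_{u_j}\left(\sum_{n=1}^{g}(2n-1)\frac{P^g_{n-1}(u_i)}{P^g_0(u_i)}\sum_{k=n}^{g}q_k(u)\,\Gamma_{k-n}\right),
\end{aligned} \qquad i \ne j,\quad i, j = 1, \dots, 2g+1. \tag{4.10} $$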

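From (3.4), we obtain

$$ \frac{\partial}{\partial u_j}\frac{P^g_k(u_i)}{P^g_0(u_i)} = \frac{1}{\lambda_i - \lambda_j}\frac{\partial\lambda_i}{\partial u_j}\left(\frac{P^g_k(u_i)}{P^g_0(u_i)} - \frac{P^g_k(u_j)}{P^g_0(u_j)}\right), \qquad i \ne j,\quad i, j = 1, \dots, 2g+1,\quad k \ge 1, \tag{4.11} $$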

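and, from [15], we have

$$ \frac{1}{\lambda_i - \lambda_j}\frac{\partial\lambda_i}{\partial u_j} = -\frac{\partial_{u_j}P^g_0(u_i)}{P^g_0(u_i)} - \frac12\,\frac{1}{u_i - u_j}, \qquad i \ne j,\quad i, j = 1, \dots, 2g+1, \tag{4.12} $$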

where $\lambda_i(u)$ has been defined in (3.1) and $P^g_k(r)$ has been defined in (2.4). Using the definition of $\Gamma_k$ in (2.18), we get

$$ \frac{\partial\Gamma_k}{\partial u_j} = -\frac12\sum_{m=1}^{k}\Gamma_{k-m}\,u_j^{m-1}. \tag{4.13} $$

It is easy to verify that the functions $F_k(u)$ and the solutions $q_k(u)$, $k = 1, \dots, g$, of (3.17)–(3.19) satisfy the relations

$$ \begin{aligned}
& \partial_u F_k(u) = \frac{2g+1}{2}F_{k+1}(u) + u\,\partial_u F_{k+1}(u), && k = 1, \dots, g-1,\quad g > 0,\\
& \partial_{u_i}q_k(u) = \frac12 q_{k+1}(u) + u_i\,\partial_{u_i}q_{k+1}(u), && i = 1, \dots, 2g+1,\quad k = 1, \dots, g-1,\quad g > 0.
\end{aligned} \tag{4.14} $$
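The first relation in (4.14) can be checked mechanically on monomial data. The sketch below assumes $f(u) = u^p$ with integer $p \ge 0$, for which (3.19) gives $F_k(u) = c_k u^{p-k+1}$ with an explicit rational coefficient $c_k$; it then compares the coefficients of $u^{p-k}$ on both sides. This is an illustration only, and the helper names are hypothetical.

```python
from fractions import Fraction


def falling(x, n):
    """Falling factorial x (x-1) ... (x-n+1), computed exactly."""
    out = Fraction(1)
    for i in range(n):
        out *= x - i
    return out


def double_factorial(n):
    out = 1
    while n > 1:
        out *= n
        n -= 2
    return out


def F_coeff(k, g, p):
    """Return (c, e) with F_k(u) = c u^e in (3.19), assuming f(u) = u^p."""
    prefactor = Fraction(2 ** (g - 1), double_factorial(2 * g - 1))
    # u^(g - 1/2) f^(k-1)(u) is falling(p, k-1) * u^s with s = p + g - k + 1/2.
    s = Fraction(2 * (p + g - k) + 1, 2)
    return prefactor * falling(Fraction(p), k - 1) * falling(s, g - k), p - k + 1


def check_first_relation(g, p):
    """Check dF_k/du = (2g+1)/2 F_{k+1} + u dF_{k+1}/du for k = 1, ..., g-1."""
    for k in range(1, g):
        ck, ek = F_coeff(k, g, p)
        cn, en = F_coeff(k + 1, g, p)
        lhs = ck * ek                              # coefficient of u^(ek - 1)
        rhs = Fraction(2 * g + 1, 2) * cn + cn * en  # coefficient of u^en
        if ek - 1 != en or lhs != rhs:
            return False
    return True


if __name__ == "__main__":
    print(all(check_first_relation(g, p) for g in range(1, 6) for p in range(8)))
```

For $g = 1$ the prefactor is $1$ and $F_1 = f$, in agreement with the boundary value $q_1(u, u, u) = f(u)$ used later in the proof.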

Repeatedly applying the relations (4.14), we obtain the following expression for $\partial_{u_j}q_k(u)$:

$$ \partial_{u_j}q_k(u) = \frac12\sum_{m=1}^{g-k}q_{m+k}(u)\,u_j^{m-1} + u_j^{g-k}\,\partial_{u_j}q_g(u), \qquad k = 1, \dots, g-1. \tag{4.15} $$


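From (3.17) and (4.11)–(4.15) we can write $\partial_{u_j}w_i(u)$, $i \ne j$, in (4.10) in the form

$$ \begin{aligned}
\partial_{u_j}w_i(u) ={}& \frac{\prod_{n=1,\,n\ne i}^{2g+1}(u_i - u_n)}{P^g_0(u_i)}\,\frac{\partial_{u_i}q_g(u) - \partial_{u_j}q_g(u)}{u_i - u_j} - 2\,\frac{\prod_{n=1,\,n\ne i}^{2g+1}(u_i - u_n)}{P^g_0(u_i)}\,\partial_{u_i}q_g(u)\left(\frac{\partial_{u_j}P^g_0(u_i)}{P^g_0(u_i)} + \frac{1}{u_i - u_j}\right)\\
&+ \sum_{n=1}^{g}(2n-1)\frac{P^g_{n-1}(u_i)}{P^g_0(u_i)}\left(\frac12\sum_{k=n}^{g-1}\Gamma_{k-n}\sum_{m=1}^{g-k}q_{m+k}(u)\,u_j^{m-1} + \sum_{k=n}^{g}\Gamma_{k-n}\,u_j^{g-k}\,\partial_{u_j}q_g(u)\right)\\
&+ \sum_{n=1}^{g}(2n-1)\frac{P^g_{n-1}(u_i)}{P^g_0(u_i)}\sum_{k=n}^{g}q_k(u)\left(-\frac12\sum_{m=1}^{k-n}\Gamma_{k-n-m}\,u_j^{m-1}\right)\\
&+ \sum_{n=1}^{g}(2n-1)\frac{1}{\lambda_i - \lambda_j}\frac{\partial\lambda_i}{\partial u_j}\left(\frac{P^g_{n-1}(u_i)}{P^g_0(u_i)} - \frac{P^g_{n-1}(u_j)}{P^g_0(u_j)}\right)\sum_{k=n}^{g}q_k(u)\,\Gamma_{k-n}.
\end{aligned} \tag{4.16} $$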

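Simplifying, we obtain

$$ \begin{aligned}
\partial_{u_j}w_i(u) ={}& -\frac{\prod_{n=1,\,n\ne i}^{2g+1}(u_i - u_n)}{(u_i - u_j)\,P^g_0(u_i)}\,\partial_{u_j}q_g(u) + 2\,\frac{\prod_{n=1,\,n\ne i}^{2g+1}(u_i - u_n)}{P^g_0(u_i)}\,\partial_{u_i}q_g(u)\left(\frac{1}{\lambda_i - \lambda_j}\frac{\partial\lambda_i}{\partial u_j}\right)\\
&+ \frac{1}{\lambda_i - \lambda_j}\frac{\partial\lambda_i}{\partial u_j}\sum_{n=1}^{g}(2n-1)\left(\frac{P^g_{n-1}(u_i)}{P^g_0(u_i)} - \frac{P^g_{n-1}(u_j)}{P^g_0(u_j)}\right)\sum_{k=n}^{g}q_k(u)\,\Gamma_{k-n}\\
&+ \sum_{n=1}^{g}(2n-1)\frac{P^g_{n-1}(u_i)}{P^g_0(u_i)}\sum_{k=n}^{g}\Gamma_{k-n}\,u_j^{g-k}\,\partial_{u_j}q_g(u),
\end{aligned} \qquad i \ne j,\quad i, j = 1, \dots, 2g+1. \tag{4.17} $$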

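Adding and subtracting the quantity

$$ \frac{1}{\lambda_i - \lambda_j}\frac{\partial\lambda_i}{\partial u_j}\,w_j $$

to (4.17), we can reduce it to the form

$$ \begin{aligned}
& \partial_{u_j}w_i - \frac{1}{\lambda_i - \lambda_j}\frac{\partial\lambda_i}{\partial u_j}\,[w_i - w_j]\\
&\quad = \partial_{u_j}q_g(u)\left(\frac{2}{\lambda_i - \lambda_j}\frac{\partial\lambda_i}{\partial u_j}\,\frac{\prod_{n=1,\,n\ne j}^{2g+1}(u_j - u_n)}{P^g_0(u_j)} - \frac{\prod_{n=1,\,n\ne i}^{2g+1}(u_i - u_n)}{(u_i - u_j)\,P^g_0(u_i)} + \sum_{n=1}^{g}(2n-1)\frac{P^g_{n-1}(u_i)}{P^g_0(u_i)}\sum_{k=n}^{g}\Gamma_{k-n}\,u_j^{g-k}\right).
\end{aligned} \tag{4.18} $$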

The term in parentheses in the right-hand side of (4.18) does not depend on the initial data $f(u)$. It is identically zero for the initial data (3.12) because, in such a case, the $w_i$'s satisfy (3.4) [15]. Therefore we conclude that

$$ \partial_{u_j}w_i - \frac{1}{\lambda_i - \lambda_j}\frac{\partial\lambda_i}{\partial u_j}\,[w_i - w_j] = 0 \tag{4.19} $$

for any smooth monotonically increasing initial data $x = f(u)|_{t=0}$.

(b) The $w_i(u)$'s satisfy the matching conditions (3.6)–(3.9).
In the following, we use the superscript $g$ to denote the corresponding genus of the quantities we are referring to. We have the following relations. When $u_l = u_{l+1} = v$ for $1 \le l \le 2g$, the $\Gamma_k$'s defined in (2.18) satisfy

$$ \begin{aligned}
& \Gamma^g_k(u_1, \dots, u_{l-1}, v, v, u_{l+2}, \dots, u_{2g+1})\\
&\quad = \Gamma^{g-1}_k(u_1, \dots, u_{l-1}, u_{l+2}, \dots, u_{2g+1}) - v\,\Gamma^{g-1}_{k-1}(u_1, \dots, u_{l-1}, u_{l+2}, \dots, u_{2g+1}), \qquad k \ge 1,\quad g > 1,
\end{aligned} \tag{4.20} $$

and the $q_k(u)$'s defined in (3.20) satisfy

$$ \begin{aligned}
& q^g_k(u_1, \dots, u_{l-1}, v, v, u_{l+2}, \dots, u_{2g+1}) - v\,q^g_{k+1}(u_1, \dots, u_{l-1}, v, v, u_{l+2}, \dots, u_{2g+1})\\
&\quad = q^{g-1}_k(u_1, \dots, u_{l-1}, u_{l+2}, \dots, u_{2g+1}), \qquad k = 1, \dots, g-1,\quad g > 1.
\end{aligned} \tag{4.21} $$

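For $u_i \ne u_l = u_{l+1} = v$, we have

$$ \begin{aligned}
& \partial_{u_i}q^{g-1}_{g-1}(u_1, \dots, u_{l-1}, u_{l+2}, \dots, u_{2g+1}) - \frac12\,q^g_g(u_1, \dots, u_{l-1}, v, v, u_{l+2}, \dots, u_{2g+1})\\
&\quad = (u_i - v)\,\partial_{u_i}q^g_g(u_1, \dots, u_{l-1}, v, v, u_{l+2}, \dots, u_{2g+1}),
\end{aligned} \tag{4.22} $$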

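which follows from (4.14). Next we study the behavior of the polynomials $P^g_k(r) = P^g_k(r; u)$, $k \ge 0$, when two branch points become coincident. For this purpose, let us consider $u_l = v + \sqrt{\varepsilon}$, $u_{l+1} = v - \sqrt{\varepsilon}$, where $0 < \varepsilon \ll 1$. When $l$ is odd, the polynomial $P^g_k(r; u)$ defined in (2.7) satisfies the following expansion [20]:

$$ \begin{aligned}
& P^g_k(r; u_1, \dots, u_{l-1}, v + \sqrt{\varepsilon}, v - \sqrt{\varepsilon}, u_{l+2}, \dots, u_{2g+1})\\
&\quad = (r - v)P^{g-1}_k(r) + \frac{\varepsilon}{2}\,\sigma^{g-1}_k(v)\Bigl(\partial_v\mu(v) - (r - v)\sum_{j=1}^{g-1}r^{g-1-j}\,\partial_v N^{g-1}_j(v)\Bigr) + O(\varepsilon^2).
\end{aligned} \tag{4.23} $$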

In the above formula, the normalized Abelian differential of the second kind

$$ \sigma^{g-1}_k(r) = \frac{P^{g-1}_k(r)}{\mu(r)}\,dr, $$

with pole at infinity of order $2k + 2$ and with asymptotic behavior (2.4), is defined on the Riemann surface

$$ \mu^2 = (r - u_1)(r - u_2)\cdots(r - u_{l-1})(r - u_{l+2})\cdots(r - u_{2g+1}), \tag{4.24} $$

the quantity

$$ \sigma^{g-1}_k(v) = \frac{P^{g-1}_k(v)}{\mu(v)} $$

and the $N^{g-1}_k(v)$'s have been defined in (2.17).

When $l$ is even, we have [20]

$$ \begin{aligned}
& P^g_k(r; u_1, \dots, u_{l-1}, v + \sqrt{\varepsilon}, v - \sqrt{\varepsilon}, u_{l+2}, \dots, u_{2g+1})\\
&\quad \simeq (r - v)P^{g-1}_k(r) - \frac{(r - v)}{\log\varepsilon}\,\mu(r)\,\omega^{g-1}_v(r)\int_{Q^-(v)}^{Q^+(v)}\sigma^{g-1}_k(\xi),
\end{aligned} \tag{4.25} $$

where $\omega^{g-1}_v(r)$ is the normalized Abelian differential of the third kind defined on the Riemann surface (4.24) and with simple poles at the points $Q^{\pm}(v) = (v, \pm\mu(v))$ with residue $\pm 1$, respectively. From (4.23) and (4.25), we have that for $u_i \ne u_l = u_{l+1} = v$

$$ P^g_k(u_i; u_1, \dots, u_{l-1}, v, v, u_{l+2}, \dots, u_{2g+1}) = (u_i - v)\,P^{g-1}_k(u_i), \qquad k \ge 0. \tag{4.26} $$

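Using the relations (4.20)–(4.22) and (4.26), we obtain for $i \ne l, l+1$, $i = 1, \dots, 2g+1$,

$$ \begin{aligned}
& w^g_i(u_1, \dots, u_{l-1}, v, v, u_{l+2}, \dots, u_{2g+1})\\
&\quad = 2\,\frac{\prod_{k=1,\,k\ne i,l,l+1}^{2g+1}(u_i - u_k)}{P^{g-1}_0(u_i)}\,\partial_{u_i}q^{g-1}_{g-1} + \sum_{m=1}^{g-1}(2m-1)\frac{P^{g-1}_{m-1}(u_i)}{P^{g-1}_0(u_i)}\sum_{k=m}^{g-1}q^{g-1}_k\,\Gamma^{g-1}_{k-m}\\
&\qquad + \left(\sum_{m=1}^{g}(2m-1)\frac{P^{g-1}_{m-1}(u_i)}{P^{g-1}_0(u_i)}\,\Gamma^{g-1}_{g-m} - \frac{\prod_{k=1,\,k\ne i,l,l+1}^{2g+1}(u_i - u_k)}{P^{g-1}_0(u_i)}\right)q^g_g(u)\big|_{u_l=u_{l+1}},
\end{aligned} \tag{4.27} $$

with

$$ \Gamma^{g-1}_k = \Gamma^{g-1}_k(u_1, \dots, u_{l-1}, u_{l+2}, \dots, u_{2g+1}), \qquad k \ge 0, $$

and

$$ q^{g-1}_k = q^{g-1}_k(u_1, \dots, u_{l-1}, u_{l+2}, \dots, u_{2g+1}), \qquad k = 1, \dots, g-1. $$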

The above reduces to the form

$$ \begin{aligned}
& w^g_i(u_1, \dots, u_{l-1}, v, v, u_{l+2}, \dots, u_{2g+1})\\
&\quad = w^{g-1}_i(u_1, \dots, u_{l-1}, u_{l+2}, \dots, u_{2g+1})\\
&\qquad + \left(\sum_{m=1}^{g}(2m-1)\frac{P^{g-1}_{m-1}(u_i)}{P^{g-1}_0(u_i)}\,\Gamma^{g-1}_{g-m} - \frac{\prod_{k=1,\,k\ne i,l,l+1}^{2g+1}(u_i - u_k)}{P^{g-1}_0(u_i)}\right)q^g_g(u)\big|_{u_l=u_{l+1}}.
\end{aligned} \tag{4.28} $$

Using (2.19), the term in parentheses in the right-hand side of (4.28) turns out to be identically zero. Therefore, the boundary conditions (3.7) are satisfied for any smooth monotonically increasing initial data.

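For studying the boundary conditions (3.6), we need to distinguish between $l$ odd or even. Let us define $u_l = v + \sqrt{\varepsilon}$, $u_{l+1} = v - \sqrt{\varepsilon}$, where $0 < \varepsilon \ll 1$. From (4.23), when $l$ is odd the polynomial $P^g_k(r; u)$ defined in (2.7) has the following expansion at $r = v \pm \sqrt{\varepsilon}$:

$$ \begin{aligned}
& P^g_k(v \pm \sqrt{\varepsilon}; u_1, \dots, u_{l-1}, v + \sqrt{\varepsilon}, v - \sqrt{\varepsilon}, u_{l+2}, \dots, u_{2g+1})\\
&\quad = \pm\sqrt{\varepsilon}\,P^{g-1}_k(v) + \varepsilon\bigl(\sigma^{g-1}_k(v)\,\partial_v\mu(v) + 2\,\partial_v P^{g-1}_k(v)\bigr) + O(\varepsilon^2), \qquad k \ge 0,
\end{aligned} \tag{4.29} $$

so that

$$ \lim_{\varepsilon\to 0}\frac{P^g_k(v \pm \sqrt{\varepsilon}; u_1, \dots, u_{l-1}, v + \sqrt{\varepsilon}, v - \sqrt{\varepsilon}, u_{l+2}, \dots, u_{2g+1})}{P^g_0(v \pm \sqrt{\varepsilon}; u_1, \dots, u_{l-1}, v + \sqrt{\varepsilon}, v - \sqrt{\varepsilon}, u_{l+2}, \dots, u_{2g+1})} = \frac{P^{g-1}_k(v)}{P^{g-1}_0(v)}. \tag{4.30} $$

We remark that here and below $P^{g-1}_k(v) = P^{g-1}_k(v; u_1, \dots, u_{l-1}, u_{l+2}, \dots, u_{2g+1})$.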

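When $l$ is even, we have from (4.25)

$$ \begin{aligned}
& P^g_k(v \pm \sqrt{\varepsilon}; u_1, \dots, u_{l-1}, v + \sqrt{\varepsilon}, v - \sqrt{\varepsilon}, u_{l+2}, \dots, u_{2g+1})\\
&\quad \simeq \pm\sqrt{\varepsilon}\,P^{g-1}_k(v) - \frac{\mu(v)}{\log\varepsilon}\int_{Q^-(v)}^{Q^+(v)}\sigma^{g-1}_k(\xi),
\end{aligned} \tag{4.31} $$

so that

$$ \lim_{\varepsilon\to 0}\frac{P^g_k(v \pm \sqrt{\varepsilon}; u_1, \dots, u_{l-1}, v + \sqrt{\varepsilon}, v - \sqrt{\varepsilon}, u_{l+2}, \dots, u_{2g+1})}{P^g_0(v \pm \sqrt{\varepsilon}; u_1, \dots, u_{l-1}, v + \sqrt{\varepsilon}, v - \sqrt{\varepsilon}, u_{l+2}, \dots, u_{2g+1})} = \frac{\int_{Q^-(v)}^{Q^+(v)}\sigma^{g-1}_k(\xi)}{\int_{Q^-(v)}^{Q^+(v)}\sigma^{g-1}_0(\xi)}. \tag{4.32} $$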

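Using (4.20), (4.21) and (4.30) we deduce that for $l$ odd and $u_l = u_{l+1} = v$

$$ \begin{aligned}
& w^g_l(u_1, \dots, u_{l-1}, v, v, u_{l+2}, \dots, u_{2g+1})\\
&\quad = \frac{2\mu^2(v)}{P^{g-1}_0(v)}\,\frac{\partial}{\partial u_l}q^g_g(u)\Big|_{u_l=u_{l+1}} + \sum_{k=1}^{g-1}q^{g-1}_k\sum_{n=1}^{k}(2n-1)\,\Gamma^{g-1}_{k-n}\,\frac{P^{g-1}_{n-1}(v)}{P^{g-1}_0(v)}\\
&\qquad + q^g_g(u)\big|_{u_l=u_{l+1}}\sum_{n=1}^{g}(2n-1)\,\Gamma^{g-1}_{g-n}\,\frac{P^{g-1}_{n-1}(v)}{P^{g-1}_0(v)},
\end{aligned} \tag{4.33} $$

where

$$ q^{g-1}_k = q^{g-1}_k(u_1, \dots, u_{l-1}, u_{l+2}, \dots, u_{2g+1}) $$

and

$$ \Gamma^{g-1}_k = \Gamma^{g-1}_k(u_1, \dots, u_{l-1}, u_{l+2}, \dots, u_{2g+1}). $$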

Since the functions $q_k(u)$ in (3.20) are symmetric with respect to $u_1, u_2, \dots, u_{2g+1}$, we immediately obtain that

$$ \frac{\partial}{\partial u_l}q^g_g(u)\Big|_{u_l=u_{l+1}} = \frac{\partial}{\partial u_{l+1}}q^g_g(u)\Big|_{u_l=u_{l+1}}. \tag{4.34} $$

Therefore, combining (4.33) and (4.34), we deduce that

$$ w^g_l(u_1, \dots, u_{l-1}, v, v, u_{l+2}, \dots, u_{2g+1}) = w^g_{l+1}(u_1, \dots, u_{l-1}, v, v, u_{l+2}, \dots, u_{2g+1}), \qquad l \text{ odd}. \tag{4.35} $$

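When $l$ is even, using (4.20), (4.21) and (4.32), we obtain

$$ \begin{aligned}
& w^g_l(u_1, \dots, u_{l-1}, v, v, u_{l+2}, \dots, u_{2g+1}) = w^g_{l+1}(u_1, \dots, u_{l-1}, v, v, u_{l+2}, \dots, u_{2g+1})\\
&\quad = \sum_{k=1}^{g-1}q^{g-1}_k\sum_{n=1}^{k}(2n-1)\,\Gamma^{g-1}_{k-n}\,\frac{\int_{Q^-(v)}^{Q^+(v)}\sigma^{g-1}_{n-1}(\xi)}{\int_{Q^-(v)}^{Q^+(v)}\sigma^{g-1}_0(\xi)} + q^g_g(u)\big|_{u_l=u_{l+1}=v}\sum_{k=1}^{g}(2k-1)\,\Gamma^{g-1}_{g-k}\,\frac{\int_{Q^-(v)}^{Q^+(v)}\sigma^{g-1}_{k-1}(\xi)}{\int_{Q^-(v)}^{Q^+(v)}\sigma^{g-1}_0(\xi)}.
\end{aligned} \tag{4.36} $$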

We conclude from (4.35) and (4.36) that the boundary condition (3.6) is satisfied. When $g = 1$ we deduce from (4.35)

$$ w^1_1(u_1, u_1, u_3) = w^1_2(u_1, u_1, u_3) $$

and from (4.9) and (4.26)

$$ w^1_3(u_1, u_1, u_3) = 2(u_3 - u_1)\,\partial_{u_3}q_1(u_1, u_1, u_3) + q_1(u_1, u_1, u_3). $$

From (3.17) and (3.20) we get the relation

$$ q_1(u_1, u_1, u_3) = f(u_3) + 2(u_1 - u_3)\,\partial_{u_3}q_1(u_1, u_1, u_3) $$

so that

$$ w^1_3(u_1, u_1, u_3) = f(u_3). $$

An analogous result can be obtained when $u_2 = u_3$, so that the boundary conditions (3.8)–(3.9) are satisfied. The theorem is then proved. ✷
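The genus-one matching step at the end of the proof lends itself to a small exact check. The sketch below is an illustration under stated assumptions: it takes $f(u) = u^n$, uses Lemma 3.4 with $\rho = 2$ to write $q_1(u_1, u_1, u_3) = E[f(t u_1 + (1-t)u_3)]$ with $t$ distributed as Beta$(1, 1/2)$, and then confirms that $2(u_3 - u_1)\,\partial_{u_3}q_1 + q_1$ collapses to $u_3^n$. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def rising(x, n):
    """Rising factorial x (x+1) ... (x+n-1), computed exactly."""
    out = Fraction(1)
    for i in range(n):
        out *= x + i
    return out


def q1_collapsed(n):
    """Coefficients {(a, b): c} of q_1(u1, u1, u3) = sum c u1^a u3^b for f = u^n.

    Assumption: for g = 1 the collapsed function is given by (3.25), i.e. a
    Beta(1, 1/2) average of f over the segment between u3 and u1.
    """
    alpha, beta = Fraction(1), Fraction(1, 2)
    norm = rising(alpha + beta, n)
    return {
        (a, n - a): comb(n, a) * rising(alpha, a) * rising(beta, n - a) / norm
        for a in range(n + 1)
    }


def w3_collapsed(n):
    """Coefficients of 2(u3 - u1) d/du3 q_1 + q_1, with zero terms dropped."""
    q = q1_collapsed(n)
    w = dict(q)
    for (a, b), c in q.items():
        if b:
            w[(a, b)] = w.get((a, b), Fraction(0)) + 2 * b * c            # 2 u3 d_u3 q
            w[(a + 1, b - 1)] = w.get((a + 1, b - 1), Fraction(0)) - 2 * b * c  # -2 u1 d_u3 q
    return {key: val for key, val in w.items() if val != 0}


if __name__ == "__main__":
    for n in range(6):
        print(n, w3_collapsed(n) == {(0, n): Fraction(1)})
```

Only monomial data are tested here; by linearity the same cancellation holds for polynomial $f$, while the statement for general smooth $f$ rests on the argument in the text.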

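Remark 4.3. Expression (3.15) obtained by Tian [15] for the polynomial initial data (3.12) is equivalent to (4.9) for any smooth monotonically increasing initial data. Indeed, (3.15) can be reduced to (4.9) using (4.14) and the identity

$$ \frac{\partial}{\partial u_i}
\begin{pmatrix}
\Gamma_0 & 0 & 0 & \cdots & 0\\
\Gamma_1 & \Gamma_0 & 0 & \cdots & 0\\
\cdots & \cdots & \cdots & \cdots & \cdots\\
\Gamma_{g-2} & \Gamma_{g-3} & \cdots & \Gamma_0 & 0\\
\Gamma_{g-1} & \Gamma_{g-2} & \cdots & \Gamma_1 & \Gamma_0
\end{pmatrix}
\begin{pmatrix}
\gamma^j_1\\ \gamma^j_2\\ \cdots\\ \gamma^j_{g-1}\\ \gamma^j_g
\end{pmatrix}
= \frac14
\begin{pmatrix}
\operatorname{Res}_{r=u_i}\left[\dfrac{\sigma_0(r)\phi_j(r)}{dr}\right]\\
3\operatorname{Res}_{r=u_i}\left[\dfrac{\sigma_1(r)\phi_j(r)}{dr}\right]\\
\cdots\\
(2g-3)\operatorname{Res}_{r=u_i}\left[\dfrac{\sigma_{g-2}(r)\phi_j(r)}{dr}\right]\\
(2g-1)\operatorname{Res}_{r=u_i}\left[\dfrac{\sigma_{g-1}(r)\phi_j(r)}{dr}\right]
\end{pmatrix}, \tag{4.37} $$

where the $\sigma_k(r)$'s are the Abelian differentials of the second kind defined in (2.4), $\phi_j(r)$ is the $j$th holomorphic differential defined in (2.2) and $\operatorname{Res}_{r=u_i}$ is the residue evaluated at $r = u_i$. The $\Gamma_k$'s are the coefficients of the expansion defined in (2.16). The above identity can be obtained from Proposition 5.4 (see below). Formula (3.15) is not effective for proving Theorem 4.2.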


5. $C^1$-Smoothness of the Solution of the Hodograph Transformation

In the following, we show that the $g$-phase solutions of the Whitham equations $u_1(x, t) > u_2(x, t) > \cdots > u_{2g+1}(x, t)$ for different $g \ge 0$ describe a $C^1$-smooth multivalued curve in the $x$–$u$ plane evolving smoothly in time.

The matching conditions (3.6)–(3.9) guarantee that the $C^1$-smoothness is preserved on the phase transition boundary where two Riemann invariants coalesce.

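THEOREM 5.1 ([8]). Let us suppose that the $g$-phase solution defined by (3.5) exists for some $x$ and $t > 0$. Then on the solution of (3.5)

$$ (\partial_x u_i(x, t))^{-1} = \partial_{u_i}(-\lambda_i(u)t + w_i(u)), \qquad i = 1, \dots, 2g+1. \tag{5.38} $$

Proof. Differentiating Equation (3.5) with respect to $x$, we obtain

$$ 1 = \partial_{u_i}(-\lambda_i(u)t + w_i(u))\,\partial_x u_i + \sum_{j\ne i}\partial_{u_j}(-\lambda_i(u)t + w_i(u))\,\partial_x u_j. \tag{5.39} $$

By (1.4) we have, for $i \ne j$, $i, j = 1, \dots, 2g+1$,

$$ \begin{aligned}
\partial_{u_j}(-\lambda_i(u)t + w_i(u)) &= -\partial_{u_j}\lambda_i(u)\,t + \partial_{u_j}w_i(u)\\
&= \frac{1}{\lambda_i - \lambda_j}\frac{\partial\lambda_i}{\partial u_j}\bigl[(-\lambda_i(u)t + w_i(u)) - (-\lambda_j(u)t + w_j(u))\bigr] = 0.
\end{aligned} \tag{5.40} $$

Thus (5.39) becomes

$$ 1 = \partial_{u_i}(-\lambda_i(u)t + w_i(u))\,\partial_x u_i, \qquad i = 1, \dots, 2g+1. \tag{5.41} $$
✷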

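In the next theorem, we evaluate explicitly $\partial_{u_i}(-\lambda_i(u)t + w_i(u))$, $i = 1, \dots, 2g+1$.

THEOREM 5.2. On the solution of the hodograph transformation (3.5), the following relation is satisfied:

$$ \frac{\partial}{\partial u_i}(-\lambda_i(u)t + w_i(u)) = \frac{\prod_{m=1,\,m\ne i}^{2g+1}(u_i - u_m)}{P^g_0(u_i)}\,\Phi^g(u_i; u), \qquad i = 1, \dots, 2g+1,\quad g > 0, \tag{5.42} $$

where

$$ \Phi^g(r; u) = \partial_r\Psi^g(r; u) + \sum_{k=1}^{2g+1}\partial_{u_k}\Psi^g(r; u) \tag{5.43} $$

and $\Psi^g(r; u)$ satisfies the linear overdetermined system

$$ \begin{aligned}
& \frac{\partial}{\partial u_i}\Psi^g(r; u) - \frac{\partial}{\partial u_j}\Psi^g(r; u) = 2(u_i - u_j)\frac{\partial^2}{\partial u_i\,\partial u_j}\Psi^g(r; u), && i \ne j,\quad i, j = 1, \dots, 2g+1,\\
& \frac{\partial}{\partial r}\Psi^g(r; u) - 2\frac{\partial}{\partial u_j}\Psi^g(r; u) = 2(r - u_j)\frac{\partial^2}{\partial r\,\partial u_j}\Psi^g(r; u), && j = 1, \dots, 2g+1,\\
& \Psi^g(r; \underbrace{r, \dots, r}_{2g+1}) = \frac{2^g}{(2g+1)!!}\,f^{(g)}(r).
\end{aligned} \tag{5.44} $$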

Remark 5.3. For polynomial initial data, the solution of the above initial value problem coincides with the function $\Psi^g(r; u)$ defined in (4.2). The integration procedure of (5.44) is analogous to the one illustrated in the proof of Theorem 3.3. The function $\Psi^g(r; u)$ that solves (5.44) satisfies the relations (4.7).

We observe that the determinant of the Jacobian of the hodograph transformation can be obtained easily from (5.42), namely

$$ \det(\text{Jacobian}) = \prod_{i=1}^{2g+1}\Phi^g(u_i; u)\,\frac{\prod_{m=1,\,m\ne i}^{2g+1}(u_i - u_m)}{P^g_0(u_i)}. \tag{5.45} $$

Therefore, the above determinant is nondegenerate if $\Phi^g(u_i; u) \ne 0$, $i = 1, \dots, 2g+1$.

Proof of Theorem 5.2. From a generalization of a result in [4], we obtain

$$ \frac{\partial}{\partial u_i}\frac{P^g_k(u_i)}{P^g_0(u_i)} = \frac12\,\frac{\partial}{\partial r}\frac{P^g_k(r)}{P^g_0(r)}\bigg|_{r=u_i}, \tag{5.46} $$

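so that

$$ \begin{aligned}
\frac{\partial}{\partial u_i}(-\lambda_i(u)t + w_i(u)) ={}& -6t\,\frac{\partial}{\partial r}\frac{P^g_1(r)}{P^g_0(r)}\bigg|_{r=u_i} + 2\sum_{k=1,\,k\ne i}^{2g+1}\frac{\prod_{m=1,\,m\ne k,i}^{2g+1}(u_i - u_m)}{P^g_0(u_i)}\,\partial_{u_i}q_g(u)\\
&+ 2\,\frac{\prod_{m=1,\,m\ne i}^{2g+1}(u_i - u_m)}{P^g_0(u_i)}\,(\partial_{u_i})^2 q_g(u) - 2\,\frac{\prod_{m=1,\,m\ne i}^{2g+1}(u_i - u_m)}{(P^g_0(u_i))^2}\,\partial_{u_i}q_g(u)\,\partial_{u_i}P^g_0(u_i)\\
&+ \frac12\sum_{n=1}^{g}(2n-1)\,\frac{\partial}{\partial r}\frac{P^g_{n-1}(r)}{P^g_0(r)}\bigg|_{r=u_i}\sum_{m=n}^{g}q_m(u)\,\Gamma_{m-n}\\
&+ \sum_{n=1}^{g}(2n-1)\frac{P^g_{n-1}(u_i)}{P^g_0(u_i)}\sum_{m=n}^{g}\partial_{u_i}\bigl(q_m(u)\,\Gamma_{m-n}\bigr).
\end{aligned} \tag{5.47} $$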


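In the above relation we need to compute the following derivatives:

$$ \sum_{m=n}^{g}\partial_{u_i}\bigl(q_m(u)\,\Gamma_{m-n}\bigr) = \sum_{m=n}^{g}u_i^{g-m}\,\Gamma_{m-n}\,\partial_{u_i}q_g(u), \tag{5.48} $$

which are obtained from (2.18) and (4.14) (the remaining double sums coming from (4.13) and (4.15) cancel after summation over $m$), and

$$ \partial_{u_i}P^g_0(u_i) = \partial_r P^g_0(r)\big|_{r=u_i} + \partial_{u_i}P^g_0(r)\big|_{r=u_i}, $$

where $P^g_0(r) = r^g + \alpha^0_1 r^{g-1} + \cdots + \alpha^0_g$. In order to obtain the derivative of the normalization constants $\alpha^0_1, \alpha^0_2, \dots, \alpha^0_g$, we need the following proposition.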

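PROPOSITION 5.4 ([19]). Let $\omega_1(r)$ and $\omega_2(r)$ be two normalized Abelian differentials on $S_g$. Let $\xi = 1/\sqrt{r}$ be the local coordinate at infinity and

$$ \omega_1 = \sum_k a^1_k\,\xi^k\,d\xi, \qquad \omega_2 = \sum_k a^2_k\,\xi^k\,d\xi. $$

Define the bilinear product

$$ V_{\omega_1\omega_2} = \sum_{k\ge 0}\frac{a^1_{-k-2}\,a^2_k}{k+1}; $$

then

$$ \frac{\partial}{\partial u_i}V_{\omega_1\omega_2} = \operatorname*{Res}_{r=u_i}\frac{\omega_1(r)\,\omega_2(r)}{dr}, \qquad i = 1, \dots, 2g+1, \tag{5.49} $$

where $\operatorname{Res}_{r=u_i}(\omega_1(r)\omega_2(r))/dr$ is the residue of the differential $(\omega_1(r)\omega_2(r))/dr$ evaluated at $r = u_i$.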

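Applying the above proposition to $\sigma_0$ and $\sigma_k$, $k = 0, \dots, g-1$, and after nontrivial simplifications we obtain

$$ \begin{aligned}
\frac{\partial}{\partial u_i}
\begin{pmatrix}\alpha^0_1\\ \alpha^0_2\\ \cdots\\ \alpha^0_{g-1}\\ \alpha^0_g\end{pmatrix}
={}& -\frac12\begin{pmatrix}1\\ u_i\\ \cdots\\ u_i^{g-2}\\ u_i^{g-1}\end{pmatrix}
- \frac12\begin{pmatrix}
0 & 0 & 0 & \cdots & 0\\
1 & 0 & 0 & \cdots & 0\\
\cdots & \cdots & \cdots & \cdots & \cdots\\
u_i^{g-3} & u_i^{g-4} & \cdots & 0 & 0\\
u_i^{g-2} & u_i^{g-3} & \cdots & 1 & 0
\end{pmatrix}
\begin{pmatrix}\alpha^0_1\\ \alpha^0_2\\ \cdots\\ \alpha^0_{g-1}\\ \alpha^0_g\end{pmatrix}\\
&+ \frac12\,\frac{P^g_0(u_i)}{\prod_{k=1,\,k\ne i}^{2g+1}(u_i - u_k)}
\begin{pmatrix}
\Gamma_0 & 0 & 0 & \cdots & 0\\
\Gamma_1 & \Gamma_0 & 0 & \cdots & 0\\
\cdots & \cdots & \cdots & \cdots & \cdots\\
\Gamma_{g-2} & \Gamma_{g-3} & \cdots & \Gamma_0 & 0\\
\Gamma_{g-1} & \Gamma_{g-2} & \cdots & \Gamma_1 & \Gamma_0
\end{pmatrix}
\begin{pmatrix}P^g_0(u_i)\\ 3P^g_1(u_i)\\ \cdots\\ (2g-3)P^g_{g-2}(u_i)\\ (2g-1)P^g_{g-1}(u_i)\end{pmatrix},
\end{aligned} \tag{5.50} $$

where the $\Gamma_k$'s have been defined in (2.18). From the above formula, we obtain

$$ \begin{aligned}
\partial_{u_i}P^g_0(u_i) &= \partial_r P^g_0(r)\big|_{r=u_i} + \partial_{u_i}P^g_0(r)\big|_{r=u_i}\\
&= \frac12\,\partial_r P^g_0(r)\big|_{r=u_i} + \frac12\,\frac{P^g_0(u_i)}{\prod_{k=1,\,k\ne i}^{2g+1}(u_i - u_k)}\sum_{n=1}^{g}(2n-1)\,P^g_{n-1}(u_i)\sum_{m=n}^{g}u_i^{g-m}\,\Gamma_{m-n}.
\end{aligned} \tag{5.51} $$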

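Using relations (5.48) and (5.51), we simplify (5.47) to the form

$$ \begin{aligned}
\frac{\partial}{\partial u_i}(-\lambda_i(u)t + w_i(u)) ={}& \frac{\partial_{u_i}\Bigl(2\prod_{m=1,\,m\ne i}^{2g+1}(u_i - u_m)\,\partial_{u_i}q_g(u)\Bigr)}{P^g_0(u_i)}\\
&+ \frac12\,\partial_r\left(-12t\,\frac{P^g_1(r)}{P^g_0(r)} + \sum_{n=1}^{g}(2n-1)\frac{P^g_{n-1}(r)}{P^g_0(r)}\sum_{m=n}^{g}q_m\,\Gamma_{m-n} + 2\,\frac{\prod_{m=1,\,m\ne i}^{2g+1}(u_i - u_m)}{P^g_0(r)}\,\partial_{u_i}q_g(u)\right)\Bigg|_{r=u_i}.
\end{aligned} \tag{5.52} $$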

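The above expression can be written in the form

$$ \begin{aligned}
\frac{\partial}{\partial u_i}(-\lambda_i(u)t + w_i(u)) ={}& \frac{\partial}{\partial r}\left(\frac{-xP^g_0(r) - 12t\,P^g_1(r) + R_g(r)}{2P^g_0(r)}\right)\Bigg|_{r=u_i}\\
&+ \frac{\prod_{m=1,\,m\ne i}^{2g+1}(u_i - u_m)}{P^g_0(u_i)}\left(2(\partial_{u_i})^2 q_g + \sum_{k=1,\,k\ne i}^{2g+1}\frac{\partial_{u_i}q_g(u) - \partial_{u_k}q_g(u)}{u_i - u_k}\right),
\end{aligned} \tag{5.53} $$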

where the polynomial $R_g(r)$ has been defined in (4.4). We remark that such a polynomial has been defined for the analytic initial data (3.12), but it can be naturally extended to any smooth initial data. Applying the relations (3.17) and (4.7) to the last term of the above expression we obtain

$$ \begin{aligned}
2(\partial_{u_i})^2 q_g + \sum_{k=1,\,k\ne i}^{2g+1}\frac{\partial_{u_i}q_g(u) - \partial_{u_k}q_g(u)}{u_i - u_k}
&= 2(\partial_{u_i})^2 q_g + 2\sum_{k=1,\,k\ne i}^{2g+1}\partial_{u_i}\partial_{u_k}q_g\\
&= (\partial_r + \partial_{u_i})\Psi^g(r; u)\big|_{r=u_i} + \sum_{k=1,\,k\ne i}^{2g+1}\partial_{u_k}\Psi^g(u_i; u)\\
&= \Phi^g(u_i; u),
\end{aligned} $$

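where the function $\Phi^g(r; u)$ has been defined in (5.43). Substituting the above relation in (5.53), we obtain

$$ \frac{\partial}{\partial u_i}(-\lambda_i(u)t + w_i(u)) = \frac{\partial}{\partial r}\left(\frac{-xP^g_0(r) - 12t\,P^g_1(r) + R_g(r)}{2P^g_0(r)}\right)\Bigg|_{r=u_i} + \frac{\prod_{m=1,\,m\ne i}^{2g+1}(u_i - u_m)}{P^g_0(u_i)}\,\Phi^g(u_i; u). \tag{5.54} $$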

We observe that

$$ -xP^g_0(r) - 12t\,P^g_1(r) + R_g(r) \equiv 0, \qquad g > 0, $$

on the solution of the Whitham equations. Indeed the hodograph transformation (3.5) is equivalent to imposing $2g + 1$ zeros on the above polynomial, which has degree $2g$. Consequently, such a polynomial is identically zero. Therefore, we can simplify (5.54) to the form (5.42) when $u_1 > u_2 > \cdots > u_{2g+1}$ satisfy the $g$-phase Whitham equations, $g > 0$. ✷

From (5.42) it is clear that the $u_i$'s evolve in such a way that the graph of their curve is certainly $C^1$-smooth in the $x$–$u$ plane for $u_1 > u_2 > \cdots > u_{2g+1}$. Indeed, $\Phi^g(r; u)$ is a smooth function of $r$ and $u$ when the initial data (5.44) is smooth. The next lemma shows that this property is preserved when two Riemann invariants coalesce.

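LEMMA 5.5. When $u_l = u_{l+1}$ the following relations are satisfied:

$$ \partial_{u_i}(-\lambda_i(u)t + w_i(u))\big|_{u_l=u_{l+1}=v} = \partial_{u_i}(-\lambda^{g-1}_i t + w^{g-1}_i), \qquad i \ne l, l+1, \tag{5.55} $$

where

$$ \lambda^{g-1}_i = \lambda^{g-1}_i(u_1, \dots, u_{l-1}, u_{l+2}, \dots, u_{2g+1}) \quad\text{and}\quad w^{g-1}_i = w^{g-1}_i(u_1, \dots, u_{l-1}, u_{l+2}, \dots, u_{2g+1}). $$

Regarding $\partial_x u_l$ and $\partial_x u_{l+1}$, we have

$$ \lim_{u_l\to u_{l+1}}\left[\frac{1}{\partial_{u_l}(-\lambda_l(u)t + w_l(u))} - \frac{1}{\partial_{u_{l+1}}(-\lambda_{l+1}(u)t + w_{l+1}(u))}\right] = 0. \tag{5.56} $$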


Proof. We analyze the behavior of the derivatives $\partial_{u_i}(-\lambda_i(u)t + w_i(u))$, $i = 1, \dots, 2g+1$, when $u_l = u_{l+1} = v$. We first prove (5.55). From (4.23) and (4.25) we have that, for $u_i \ne u_l = u_{l+1} = v$,

$$ \begin{aligned}
& P^g_k(u_i; u_1, \dots, u_{l-1}, v, v, u_{l+2}, \dots, u_{2g+1}) = (u_i - v)\,P^{g-1}_k(u_i), \qquad k \ge 0,\\
& \partial_r P^g_k(r; u_1, \dots, u_{l-1}, v, v, u_{l+2}, \dots, u_{2g+1})\big|_{r=u_i} = P^{g-1}_k(u_i) + (u_i - v)\,\partial_r P^{g-1}_k(r)\big|_{r=u_i},
\end{aligned} \tag{5.57} $$

where

$$ P^{g-1}_k(r) = P^{g-1}_k(r; u_1, \dots, u_{l-1}, u_{l+2}, \dots, u_{2g+1}), \qquad k \ge 0. $$

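Applying (4.20), (4.21) and (5.57) to expression (5.52), we obtain

$$ \begin{aligned}
& \partial_{u_i}\bigl(-\lambda^g_i(u)t + w^g_i(u)\bigr)\big|_{u_l=v,\,u_{l+1}=v}\\
&\quad = -\bigl(-\lambda^{g-1}_i t + w^{g-1}_i\bigr)\left[\frac{\partial_r P^{g-1}_0(r)\big|_{r=u_i}}{2P^{g-1}_0(u_i)} + \frac{1}{2(u_i - v)}\right] + 2(u_i - v)\,\frac{{\prod}'_{k\ne i}(u_i - u_k)}{P^{g-1}_0(u_i)}\,\frac{\partial^2}{\partial u_i^2}q^g_g(u)\big|_{u_l=u_{l+1}=v}\\
&\qquad + \left(\frac{4\,{\prod}'_{k\ne i}(u_i - u_k)}{P^{g-1}_0(u_i)} + 2(u_i - v)\,\frac{{\sum}'_{k\ne i}{\prod}'_{m\ne k,i}(u_i - u_m)}{P^{g-1}_0(u_i)}\right)\partial_{u_i}q^g_g(u)\big|_{u_l=u_{l+1}=v}\\
&\qquad + \frac{q^g_g(u)\big|_{u_l=u_{l+1}=v}}{2P^{g-1}_0(u_i)}\left(\sum_{n=1}^{g}(2n-1)\,\Gamma^{g-1}_{g-n}\,\partial_r P^{g-1}_{n-1}(r)\big|_{r=u_i} + \frac{1}{u_i - v}\sum_{n=1}^{g}(2n-1)\,P^{g-1}_{n-1}(u_i)\,\Gamma^{g-1}_{g-n}\right)\\
&\qquad + \frac{1}{2P^{g-1}_0(u_i)}\,\partial_r\left(\sum_{n=1}^{g-1}(2n-1)\,P^{g-1}_{n-1}(r)\sum_{m=n}^{g-1}q^{g-1}_m\,\Gamma^{g-1}_{m-n} - 12t\,P^{g-1}_1(r)\right)\Bigg|_{r=u_i}\\
&\qquad + \frac{1}{2(u_i - v)\,P^{g-1}_0(u_i)}\left(\sum_{n=1}^{g-1}(2n-1)\,P^{g-1}_{n-1}(u_i)\sum_{m=n}^{g-1}q^{g-1}_m\,\Gamma^{g-1}_{m-n} - 12t\,P^{g-1}_1(u_i)\right),
\end{aligned} \tag{5.58} $$

where

$$ {\prod}' = \prod_{k=1,\,k\ne l,l+1}^{2g+1} \quad\text{and}\quad {\sum}' = \sum_{m=1,\,m\ne l,l+1}^{2g+1}. $$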


Using the relation (2.19), we obtain

$$ \sum_{n=1}^{g}(2n-1)\,\Gamma^{g-1}_{g-n}\,\partial_r P^{g-1}_{n-1}(r)\big|_{r=u_i} = 2\,{\sum_{j\ne i}}'\,\frac{{\prod}'_{k\ne i}(u_i - u_k)}{u_i - u_j} \tag{5.59} $$

and

$$ \sum_{n=1}^{g}(2n-1)\,\Gamma^{g-1}_{g-n}\,P^{g-1}_{n-1}(u_i) = {\prod_{k\ne i}}'\,(u_i - u_k). \tag{5.60} $$

Substituting (5.59) and (5.60) in (5.58) and applying relation (4.22), we can easily obtain (5.55).

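For proving (5.56), we consider only $l$ even. Analogous considerations can be done for $l$ odd. Using expansion (4.25) and relation (2.13), we have

$$ \begin{aligned}
& \partial_r P^g_k(r; u_1, \dots, u_{l-1}, v + \sqrt{\varepsilon}, v - \sqrt{\varepsilon}, u_{l+2}, \dots, u_{2g+1})\\
&\quad \simeq P^{g-1}_k(r) + (r - v)\,\partial_r P^{g-1}_k(r) + \frac{\partial_r\Bigl((r - v)\sum_{j=1}^{g-1}N^{g-1}_j(v)\,r^{g-1-j}\Bigr)}{\log\varepsilon}\int_{Q^-(v)}^{Q^+(v)}\sigma^{g-1}_k(\xi), \qquad k \ge 0,
\end{aligned} \tag{5.61} $$

so that

$$ \begin{aligned}
& \frac{\partial_r P^g_k(r; u_1, \dots, u_{l-1}, v + \sqrt{\varepsilon}, v - \sqrt{\varepsilon}, u_{l+2}, \dots, u_{2g+1})\big|_{r=v\pm\sqrt{\varepsilon}}}{P^g_0(v \pm \sqrt{\varepsilon})}\\
&\quad \simeq -\log\varepsilon\,\frac{\sigma^{g-1}_k(v)}{\int_{Q^-(v)}^{Q^+(v)}\sigma^{g-1}_0(\xi)} - \frac{\int_{Q^-(v)}^{Q^+(v)}\sigma^{g-1}_k(\xi)}{\int_{Q^-(v)}^{Q^+(v)}\sigma^{g-1}_0(\xi)}\sum_{j=1}^{g-1}\frac{v^{g-1-j}}{\mu(v)}\,N^{g-1}_j(v) \pm \frac{\sigma_0(v)\,\sigma_k(v)}{\Bigl(\int_{Q^-(v)}^{Q^+(v)}\sigma^{g-1}_0(\xi)\Bigr)^2}\,\sqrt{\varepsilon}\,\log^2\varepsilon.
\end{aligned} \tag{5.62} $$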

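Substituting (4.32), (4.36) and (5.62) in (5.56) and using the explicit expression (5.38) of the derivative, we obtain

$$ \begin{aligned}
& \partial_{u_l}\bigl(-\lambda^g_l(u)t + w^g_l(u)\bigr)\big|_{u_l=v+\sqrt{\varepsilon},\,u_{l+1}=v-\sqrt{\varepsilon}}\\
&\quad \simeq \left(\frac12\log\varepsilon - \sqrt{\varepsilon}\,\log^2\varepsilon\,\frac{\sigma^{g-1}_0(v)}{\int_{Q^-(v)}^{Q^+(v)}\sigma^{g-1}_0(\xi)}\right)\frac{\sigma^{g-1}_0(v)}{\int_{Q^-(v)}^{Q^+(v)}\sigma^{g-1}_0(\xi)}\left[\frac{\int_{Q^-(v)}^{Q^+(v)}Z^{g-1}(\xi)}{\int_{Q^-(v)}^{Q^+(v)}\sigma^{g-1}_0(\xi)} - \frac{Z^{g-1}(v)}{\sigma^{g-1}_0(v)}\right]
\end{aligned} \tag{5.63} $$

and

$$ \begin{aligned}
& \partial_{u_{l+1}}\bigl(-\lambda^g_{l+1}(u)t + w^g_{l+1}(u)\bigr)\big|_{u_l=v+\sqrt{\varepsilon},\,u_{l+1}=v-\sqrt{\varepsilon}}\\
&\quad \simeq \left(\frac12\log\varepsilon + \sqrt{\varepsilon}\,\log^2\varepsilon\,\frac{\sigma^{g-1}_0(v)}{\int_{Q^-(v)}^{Q^+(v)}\sigma^{g-1}_0(\xi)}\right)\frac{\sigma^{g-1}_0(v)}{\int_{Q^-(v)}^{Q^+(v)}\sigma^{g-1}_0(\xi)}\left[\frac{\int_{Q^-(v)}^{Q^+(v)}Z^{g-1}(\xi)}{\int_{Q^-(v)}^{Q^+(v)}\sigma^{g-1}_0(\xi)} - \frac{Z^{g-1}(v)}{\sigma^{g-1}_0(v)}\right],
\end{aligned} \tag{5.64} $$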

where

$$ Z^{g-1}(r) = -12t\,\sigma^{g-1}_1(r) + \sum_{n=1}^{g-1}(2n-1)\,\sigma^{g-1}_{n-1}(r)\sum_{m=n}^{g-1}q^{g-1}_m\,\Gamma^{g-1}_{m-n} + 2\mu(r)\,dr\sum_{k=1}^{2g+1}\partial_{u_k}q^g_g(u)\big|_{u_l=u_{l+1}=v}. $$

In the above formulas,

$$ Z^{g-1}(v) = \frac{Z^{g-1}(r)}{dr}\bigg|_{r=v}, \qquad q^{g-1}_m = q^{g-1}_m(u_1, \dots, u_{l-1}, u_{l+2}, \dots, u_{2g+1}) $$

and

$$ \Gamma^{g-1}_m = \Gamma^{g-1}_m(u_1, \dots, u_{l-1}, u_{l+2}, \dots, u_{2g+1}). $$

Substituting (5.63) and (5.64) into (5.56), it is easy to verify that the limit (5.56) is satisfied. ✷

Combining Theorem 5.2 and Lemma 5.5, we conclude that the $g$-phase solutions $u_1(x, t) > u_2(x, t) > \cdots > u_{2g+1}(x, t)$ for different $g \ge 0$ are glued together in order to produce a $C^1$-smooth multivalued curve in the $x$–$u$ plane evolving smoothly with time.

6. Conclusion

In this work, we have obtained a new formula for the solution $w_i(u)$, $i = 1, \dots, 2g+1$, of the Tsarev system with matching conditions (3.6)–(3.9) for any increasing smooth initial data. As in [15], this new formula is expressed through some auxiliary functions which solve the linear overdetermined systems of the Euler–Poisson–Darboux type. We have shown that the matching conditions (3.6)–(3.9) for the $w_i(u)$'s guarantee that the solution of the Tsarev system is unique and that the $g$-phase solutions of the Whitham equations for different $g$ are glued together in order to produce a $C^1$-smooth multivalued curve in the $(x, u)$ plane evolving smoothly with time. The new formula (4.9) enabled us to obtain a simple expression for the Jacobian of the hodograph transformation. We believe that this formula will be useful in the future to investigate the solvability of the hodograph transformation for $g > 1$. Indeed, it is still an open problem to prove the existence of a real solution $u_1(x, t) > u_2(x, t) > \cdots > u_{2g+1}(x, t)$, $g > 1$, of the hodograph transformation (1.5) for all $x$ and $t > 0$ and for generic smooth initial data.

Acknowledgements

I am indebted to Professor Boris Dubrovin who posed me the problem of this work and gave me many hints on how to reach the solution. I am grateful to Professor Sergei Novikov for his suggestions during the preparation of the manuscript. This work was partially supported by a CNR grant 203.01.70, by a grant of S. Novikov, and by a Marie Curie Fellowship. I wish to thank the anonymous referee for the suggested improvements to the manuscript. I am also grateful to T. Liverpool for carefully reading the manuscript.

References

1. Whitham, G. B.: Linear and Nonlinear Waves, Wiley, New York, 1974.
2. Flaschka, H., Forest, M. and McLaughlin, D. H.: Multiphase averaging and the inverse spectral solution of the Korteweg–de Vries equations, Comm. Pure Appl. Math. 33 (1980), 739–784.
3. Lax, P. D. and Levermore, C. D.: The small dispersion limit of the Korteweg–de Vries equation, I, II, III, Comm. Pure Appl. Math. 36 (1983), 253–290, 571–593, 809–830.
4. Levermore, C. D.: The hyperbolic nature of the zero dispersion KdV limit, Comm. Partial Differential Equations 13 (1988), 495–514.
5. Fei Ran Tian: Oscillations of the zero dispersion limit of the Korteweg–de Vries equations, Comm. Pure Appl. Math. 46 (1993), 1093–1129.
6. Grava, T.: Existence of a global solution of the Whitham equations, Theoret. Math. Phys. 122(1) (2000), 46–58.
7. Dubrovin, B. and Novikov, S. P.: Hydrodynamics of weakly deformed soliton lattices. Differential geometry and Hamiltonian theory, Russian Math. Surveys 44(6) (1989), 35–124.
8. Tsarev, S. P.: Poisson brackets and one-dimensional Hamiltonian systems of hydrodynamic type, Soviet Math. Dokl. 31 (1985), 488–491.
9. Gurevich, A. G. and Pitaevskii, L. P.: Nonstationary structure of collisionless shock waves, JETP Lett. 17 (1973), 193–195.
10. Krichever, I. M.: The method of averaging for two dimensional integrable equations, Funct. Anal. Appl. 22 (1988), 200–213.
11. Potemin, G. V.: Algebraic-geometric construction of self-similar solutions of the Whitham equations, Uspekhi Mat. Nauk 43(5) (1988), 211–212.
12. Kudashev, V. R. and Sharapov, S. E.: Inheritance of KdV symmetries under Whitham averaging and hydrodynamic symmetry of the Whitham equations, Theoret. Math. Phys. 87(1) (1991), 40–47.
13. Gurevich, A. V., Krylov, A. L. and El, G. A.: Evolution of a Riemann wave in dispersive hydrodynamics, Soviet Phys. JETP 74(6) (1992), 957–962.
14. Eisenhart, L. P.: Ann. of Math. 120 (1918), 262.
15. Fei Ran Tian: The Whitham type equations and linear over-determined systems of Euler–Poisson–Darboux type, Duke Math. J. 74 (1994), 203–221.
16. El, G. A.: Generating function of the Whitham–KdV hierarchy and effective solution of the Cauchy problem, Phys. Lett. A 222 (1996), 393–399.
17. Springer, G.: Introduction to Riemann Surfaces, Addison-Wesley, Reading, Mass., 1957.
18. Rodin, Yu. L.: The Riemann Boundary Value Problem on Riemann Surfaces, Math. Appl. Soviet Ser., D. Reidel, Dordrecht, 1987.
19. Dubrovin, B.: Lectures on 2-D Topological Field Theory, Lecture Notes in Math. 1620, Springer-Verlag, Berlin, 1996.
20. Fay, J.: Theta Functions on Riemann Surfaces, Lecture Notes in Math. 352, Springer-Verlag, Heidelberg, 1973.



A General Framework for Localization of Classical Waves: I. Inhomogeneous Media and Defect Eigenmodes *

ABEL KLEIN¹ and ANDREW KOINES²,**

¹University of California, Irvine, Department of Mathematics, Irvine, CA 92697-3875, U.S.A. e-mail: [email protected]
²Texas A&M University, Department of Mathematics, College Station, TX 77843-3368, U.S.A.

(Received: 7 February 2001)

Abstract. We introduce a general framework for studying the localization of classical waves in inhomogeneous media, which encompasses acoustic waves with position dependent compressibility and mass density, elastic waves with position dependent Lamé moduli and mass density, and electromagnetic waves with position dependent magnetic permeability and dielectric constant. We also allow for anisotropy. We develop mathematical methods to study wave localization in inhomogeneous media. We show localization for local perturbations (defects) of media with a spectral gap, and study midgap eigenmodes.

Mathematics Subject Classifications (2000): 35Q60, 35Q99, 78A99, 78A48, 74J99, 35P99, 47F05.

Key words: wave localization, inhomogeneous media, defects, midgap eigenmodes.

1. Introduction

We provide a general framework for studying localization of acoustic waves, elastic waves, and electromagnetic waves in inhomogeneous media, i.e., the existence of acoustic, elastic, and electromagnetic waves such that almost all of the wave’s energy remains in a fixed bounded region uniformly over time. Our general framework encompasses acoustic waves with position dependent compressibility and mass density, elastic waves with position dependent Lamé moduli and mass density, and electromagnetic waves with position dependent magnetic permeability and dielectric constant. We also allow for anisotropy.

In this first article we develop mathematical methods to study wave localization in inhomogeneous media. As an application we show localization for local perturbations (defects) of media with a gap in the spectrum, and study midgap eigenmodes. In the second article [17], we use the methods developed in this article to study wave localization for random perturbations of periodic media with a gap in the spectrum.

* This work was partially supported by NSF Grant DMS-9800883.
** Current address: Orange Coast College, Department of Mathematics, Costa Mesa, CA 92626, U.S.A.

Our results extend the work of Figotin and Klein [7–11, 16] in several ways: (1) We study a general class of classical waves which includes acoustic, electromagnetic and elastic waves as special cases. (2) We allow for more than one inhomogeneous coefficient (e.g., electromagnetic waves in media where both the magnetic permeability and the dielectric constant are position dependent). (3) We allow for anisotropy in our wave equations. (4) In [17] we prove strong dynamical localization in random media, using the recent results of Germinet and Klein [14] on strong dynamical localization and of Klein, Koines and Seifert [18] on a generalized eigenfunction expansion for classical wave operators.

Previous results on localization of classical waves in inhomogeneous media [7–11, 16, 3] considered only the case of one inhomogeneous coefficient. Acoustic and electromagnetic waves were treated separately. Elastic waves were not discussed.

Our approach to the mathematical study of localization of classical waves, as in the work of Figotin and Klein, is operator theoretic and reminiscent of quantum mechanics. It is based on the fact that many wave propagation phenomena in classical physics are governed by equations that can be recast in abstract Schrödinger form [21, 8, 16]. The corresponding self-adjoint operator, which governs the dynamics, is a first-order partial differential operator, but its spectral theory may be studied through an auxiliary self-adjoint, second-order partial differential operator. These second-order classical wave operators are analogous to Schrödinger operators in quantum mechanics. The method is particularly suitable for the study of phenomena historically associated with quantum mechanical electron waves, especially Anderson localization in random media [7, 8, 11, 16] and midgap defect eigenmodes [9, 10].

Physically interesting inhomogeneous media give rise to nonsmooth coefficients in the classical wave equations, and hence in their classical wave operators (e.g., a medium composed of two different homogeneous materials will be represented by piecewise constant coefficients). Thus we make no assumptions about the smoothness of the coefficients of classical wave operators. Since we allow two inhomogeneous coefficients, we have to deal with domain questions for the quadratic forms associated with classical wave operators.

We must also take into account that many classical wave equations come with auxiliary conditions, and the corresponding classical wave operators are not elliptic (e.g., the Maxwell operator; see [21, 20, 8, 11, 16]).

This paper is organized as follows: In Section 2 we introduce our framework for studying classical waves. We discuss classical wave equations in inhomogeneous media and wave localization. We define first and second order classical wave operators, and use them to rewrite the wave equations in abstract Schrödinger form. We state our results on wave localization created by defects. In Section 3 we study classical wave operators, obtaining the technical tools that are needed for proving localization in inhomogeneous and random media. We study finite volume classical wave operators, discuss interior estimates, give an improved resolvent decay estimate in a gap, prove a Simon–Lieb-type inequality and an eigenfunction decay inequality. In Section 4 we study periodic classical wave operators, and prove a theorem that gives the spectrum of a periodic classical wave operator in terms of the spectra of its restriction to finite cubes with periodic boundary condition. In Section 5 we study the effect of defects on classical wave operators, and give the proofs and details of the results on defects and wave localization stated in Section 2.

2. The Mathematical Framework

2.1. CLASSICAL WAVE EQUATIONS

Many classical wave equations in a linear, lossless, inhomogeneous medium can be written as first-order equations of the form:

\[
\begin{aligned}
K(x)^{-1}\,\frac{\partial}{\partial t}\psi_t(x) &= D^{*}\phi_t(x),\\
R(x)^{-1}\,\frac{\partial}{\partial t}\phi_t(x) &= -D\psi_t(x),
\end{aligned}
\tag{2.1}
\]

where x ∈ R^d (space), t ∈ R (time), ψ_t(x) ∈ C^n and φ_t(x) ∈ C^m are physical quantities that describe the state of the medium at position x and time t, D is an m × n matrix whose entries are first-order partial differential operators with constant coefficients (see Definition 2.1), D* is the formal adjoint of D, and K(x) and R(x) are n × n and m × m positive, invertible matrices, uniformly bounded from above and away from 0, that describe the medium at position x (see Definition 2.3). In addition, D satisfies a partial ellipticity property (see Definition 2.2), and there may be auxiliary conditions to be satisfied by the quantities ψ_t(x) and φ_t(x).

The physical quantities ψ_t(x) and φ_t(x) then satisfy second-order wave equations, with the same auxiliary conditions:

\[
\frac{\partial^2}{\partial t^2}\psi_t(x) = -K(x)D^{*}R(x)D\,\psi_t(x), \tag{2.2}
\]

\[
\frac{\partial^2}{\partial t^2}\phi_t(x) = -R(x)DK(x)D^{*}\phi_t(x). \tag{2.3}
\]

Conversely, given (2.2) (or (2.3)), we may write this equation in the form (2.1) by introducing an appropriate quantity φ_t(x) (or ψ_t(x)), which will then satisfy Equation (2.3) (or (2.2)).
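The passage from (2.1) to (2.2) and (2.3) can be checked in a finite-dimensional analogue. The following is only an illustrative sketch under stated assumptions: D is replaced by an arbitrary m × n complex matrix, K and R by positive diagonal matrices, and all numerical values and helper names (matmul, adjoint, diag) are hypothetical. It builds the generator G of (2.1), so that d/dt (ψ, φ) = G(ψ, φ), and compares the blocks of G² with −K D* R D and −R D K D*.

```python
# Finite-dimensional toy model of (2.1)-(2.3). All values are illustrative
# assumptions: D is an arbitrary m x n matrix, K and R are positive diagonal.

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][p] * Y[p][j] for p in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def adjoint(X):
    """Conjugate transpose."""
    return [[complex(X[i][j]).conjugate() for i in range(len(X))]
            for j in range(len(X[0]))]

def diag(values):
    size = len(values)
    return [[values[i] if i == j else 0.0 for j in range(size)]
            for i in range(size)]

n, m = 2, 3
D = [[1 + 2j, 0.5], [-1j, 2.0], [0.25, 1 - 1j]]   # m x n
K = diag([2.0, 0.5])                               # n x n, positive
R = diag([1.5, 3.0, 0.75])                         # m x m, positive

KDs = matmul(K, adjoint(D))   # K D^*  (n x m)
RD = matmul(R, D)             # R D    (m x n)

# Generator of (2.1): G = [[0, K D^*], [-R D, 0]].
G = [[0.0] * n + KDs[i] for i in range(n)] + \
    [[-x for x in RD[i]] + [0.0] * m for i in range(m)]

G2 = matmul(G, G)
second_order_psi = [[-x for x in row] for row in matmul(KDs, RD)]  # -K D^* R D
second_order_phi = [[-x for x in row] for row in matmul(RD, KDs)]  # -R D K D^*
```

The diagonal blocks of G2 reproduce the right-hand sides of (2.2) and (2.3), and the off-diagonal blocks vanish, which is the matrix version of the decoupling of ψ_t and φ_t at second order.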

The medium is called homogeneous if the coefficient matrices K(x) and R(x) are constant, i.e., they do not depend on the position x. Otherwise the medium is said to be inhomogeneous.


EXAMPLES. Electromagnetic waves: Maxwell equations are given by (2.1) with d = n = m = 3, ψ_t(x) the magnetic field, φ_t(x) the electric field, D = D* the curl,

\[
D\phi = \nabla\times\phi, \qquad K(x) = \frac{1}{\mu(x)}\,I_3 \quad\text{and}\quad R(x) = \frac{1}{\varepsilon(x)}\,I_3,
\]

with µ(x) the magnetic permeability and ε(x) the dielectric constant. (By I_k we denote the k × k identity matrix.) The auxiliary conditions are ∇ · µψ_t = 0 and ∇ · εφ_t = 0.

Acoustic waves: The acoustic equations in d dimensions may be written as (2.1), with n = 1, m = d, ψ_t(x) the pressure, φ_t(x) the velocity, D the gradient,

\[
D\phi = \nabla\phi, \qquad D^{*}\psi = -\nabla\cdot\psi, \qquad K(x) = \frac{1}{\kappa(x)}\,I_1 \quad\text{and}\quad R(x) = \frac{1}{\varrho(x)}\,I_d,
\]

with κ(x) the compressibility and ϱ(x) the mass density. The auxiliary condition is ∇ × ϱφ_t = 0. The usual second-order acoustic equation for the pressure is then given by (2.2).

Elastic waves: The equations of motion for linear elasticity, in an isotropic medium, can be written as the second-order wave equation

\[
\rho(x)\,\frac{\partial^2}{\partial t^2}\psi_t(x) = -\bigl\{\nabla[\lambda(x)+2\mu(x)]\nabla^{*} + \nabla\times\mu(x)\nabla\times\bigr\}\psi_t(x), \tag{2.4}
\]

where x ∈ R^3, ψ_t(x) is the medium displacement, ρ(x) is the mass density, and λ(x) and µ(x) are the Lamé moduli. It is of the form of Equation (2.2), with n = 3, D the differential operator given by

\[
D\psi = (\nabla^{*}\psi)\oplus(\nabla\times\psi) \ \ (\text{a } 4\times 3 \text{ matrix}), \qquad K(x) = \frac{1}{\rho(x)}\,I_3,
\]

and

\[
R(x) = (\lambda(x)+2\mu(x))\,I_1 \oplus \mu(x)\,I_3 \ \ (\text{a } 4\times 4 \text{ matrix}).
\]
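For constant coefficients the elastic example can be examined on the Fourier side. The sketch below is an illustration under assumptions that are not in the text: the medium is homogeneous (constant λ, µ, ρ), k is a real wave vector, the symbol of ∇* = −∇· is taken to be −i kᵀ and that of the curl to be i[k]×, and all names and numerical values are hypothetical. It assembles the symbol K D(k)* R D(k) and applies it to a longitudinal and to a transverse vector.

```python
# Symbol of the elastic operator K D^* R D for a homogeneous medium.
# Assumptions: constant lam, mu, rho; real wave vector k; illustrative values.

def cross_matrix(k):
    """Matrix [k]_x with [k]_x v = k x v."""
    k1, k2, k3 = k
    return [[0.0, -k3, k2], [k3, 0.0, -k1], [-k2, k1, 0.0]]

def elastic_symbol(k, lam, mu, rho):
    """Return the 3 x 3 matrix K D(k)^* R D(k), where D(k) is 4 x 3."""
    # Row 0: symbol of grad^* = -div; rows 1-3: symbol of the curl.
    Dk = [[-1j * kj for kj in k]] + \
         [[1j * entry for entry in row] for row in cross_matrix(k)]
    r_diag = [lam + 2 * mu, mu, mu, mu]        # R = (lam + 2 mu) I_1 (+) mu I_3
    return [[sum(Dk[r][i].conjugate() * r_diag[r] * Dk[r][j]
                 for r in range(4)) / rho
             for j in range(3)] for i in range(3)]

def apply(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

k = (1.0, 2.0, 2.0)                 # |k|^2 = 9
lam, mu, rho = 3.0, 1.5, 2.0
M = elastic_symbol(k, lam, mu, rho)

longitudinal = apply(M, k)                 # k is parallel to itself
transverse = apply(M, (2.0, -1.0, 0.0))    # this vector is orthogonal to k
```

In this sketch the longitudinal vector is multiplied by (λ + 2µ)|k|²/ρ and the transverse one by µ|k|²/ρ, the two squared frequencies one expects for compressional and shear waves in a homogeneous isotropic medium.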

2.2. WAVE EQUATIONS IN ABSTRACT SCHRÖDINGER FORM

The wave equation (2.1) may be rewritten in abstract Schrödinger form [21, 8, 16]:

\[
-i\,\frac{d}{dt}\Psi_t = W\Psi_t, \tag{2.5}
\]

where

\[
\Psi_t = \begin{pmatrix}\psi_t\\ \phi_t\end{pmatrix}
\quad\text{and}\quad
W = \begin{pmatrix} 0 & -iK(x)D^{*}\\ iR(x)D & 0 \end{pmatrix}. \tag{2.6}
\]

The (first-order) classical wave operator W is formally (and can be defined as) a self-adjoint operator on the Hilbert space

\[
H = L^2\bigl(\mathbb{R}^d, K(x)^{-1}\,dx;\mathbb{C}^n\bigr)\oplus L^2\bigl(\mathbb{R}^d, R(x)^{-1}\,dx;\mathbb{C}^m\bigr), \tag{2.7}
\]

where, for a k × k positive invertible matrix-valued measurable function S(x), we set

\[
L^2\bigl(\mathbb{R}^d, S(x)^{-1}\,dx;\mathbb{C}^k\bigr) = \bigl\{ f:\mathbb{R}^d\to\mathbb{C}^k;\ \langle f, S(x)^{-1}f\rangle_{L^2(\mathbb{R}^d,dx;\mathbb{C}^k)} < \infty \bigr\}.
\]

The auxiliary conditions to the wave equation are imposed by requiring the solutions to Equation (2.5) to also satisfy

\[
\Psi_t = P_W^{\perp}\Psi_t, \tag{2.8}
\]

where P_W^⊥ denotes the orthogonal projection onto the orthogonal complement of the kernel of W. The solutions to Equations (2.5) and (2.8) are of the form

\[
\Psi_t = e^{itW}P_W^{\perp}\Psi_0, \qquad \Psi_0\in H. \tag{2.9}
\]

The energy density at time t of a solution Ψ ≡ Ψ_t(x) = (ψ_t(x), φ_t(x)) of the wave equation (2.1) is given by

\[
E_\Psi(t,x) = \tfrac12\bigl\{\langle\psi_t(x),K(x)^{-1}\psi_t(x)\rangle_{\mathbb{C}^n} + \langle\phi_t(x),R(x)^{-1}\phi_t(x)\rangle_{\mathbb{C}^m}\bigr\}. \tag{2.10}
\]

The wave energy, a conserved quantity, is thus given by

\[
E_\Psi = \tfrac12\|\Psi_t\|_H^2 \quad\text{for any } t. \tag{2.11}
\]

Note that (2.9) gives the finite energy solutions to the wave equation (2.1).

2.3. WAVE LOCALIZATION

Let Ψ = Ψ_t(x) be a finite energy solution of the wave equation (2.1). There are many criteria for wave localization, e.g.:

Simple localization: Almost all of the wave’s energy remains in a fixed bounded region at all times, more precisely:

\[
\lim_{R\to\infty}\,\inf_t\,\frac{1}{E_\Psi}\int_{|x|\le R} E_\Psi(t,x)\,dx = 1. \tag{2.12}
\]

Moment localization: For some (or we may require for all) q > 0, we have

\[
\sup_t\int_{\mathbb{R}^d}|x|^q E_\Psi(t,x)\,dx = \tfrac12\,\sup_t\bigl\|\,|x|^{q/2}\Psi_t\bigr\|_H^2 < \infty. \tag{2.13}
\]

Exponential localization (in the L²-sense): For some C < ∞ and m > 0, we have

\[
\sup_t\bigl\|\chi_x E_\Psi(t,\cdot)\bigr\|_1^{1/2} = \tfrac{1}{\sqrt2}\,\sup_t\|\chi_x\Psi_t\|_H \le C e^{-m|x|} \tag{2.14}
\]

for all x ∈ R^d, where χ_x denotes the characteristic function of a cube of side 1 centered at x.

It is easy to see that exponential localization implies moment localization for all q > 0, and moment localization for some q > 0 implies simple localization.

The fact that finite energy solutions are given by (2.9) suggests a method to obtain localized waves: if Ψ_0 ∈ H is an eigenfunction for the classical wave operator W with nonzero eigenvalue ω, i.e., WΨ_0 = ωΨ_0 with ω ≠ 0, then the wave Ψ_t = e^{itω}Ψ_0 exhibits simple localization. If in addition ‖ |x|^{q/2}Ψ_0 ‖²_H < ∞, we have moment localization. If Ψ_0 is exponentially decaying (in the L²-sense), we have exponential localization.

In a homogeneous medium, a classical wave operator cannot have nonzero eigenvalues. (This can be shown using the Fourier transform.) Thus an appropriate inhomogeneous medium is required to produce nonzero eigenvalues, and hence localized waves. In Subsection 2.5 we will see that we can produce eigenvalues in spectral gaps of classical wave operators by introducing defects, i.e., by making local changes in the medium. Moreover, the corresponding waves will exhibit exponential localization.

In the sequel [17] we show that random changes in the media can produce Anderson localization in spectral gaps of periodic classical wave operators.

2.4. CLASSICAL WAVE OPERATORS

We now introduce the mathematical machinery needed to make the preceding discussion rigorous.

It is convenient to work on L²(R^d, dx; C^k) instead of the weighted space L²(R^d, S(x)^{-1} dx; C^k). To do so, note that the operator V_S, given by multiplication by the matrix S(x)^{-1/2}, is a unitary map from the Hilbert space L²(R^d, S(x)^{-1} dx; C^k) to L²(R^d, dx; C^k), and if we set W̃ = (V_K ⊕ V_R) W (V_K^* ⊕ V_R^*), we have

\[
\widetilde{W} = \begin{pmatrix} 0 & -i\sqrt{K(x)}\,D^{*}\sqrt{R(x)}\\ i\sqrt{R(x)}\,D\sqrt{K(x)} & 0 \end{pmatrix}, \tag{2.15}
\]

a formally self-adjoint operator on L²(R^d, dx; C^n) ⊕ L²(R^d, dx; C^m).

In addition, if S_− I ≤ S(x) ≤ S_+ I with 0 < S_− ≤ S_+ < ∞, as will be the case in this article, it turns out that if φ̃ = V_S φ, then the functions φ(x) and φ̃(x) share the same decay and growth properties (e.g., exponential or polynomial decay).
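The effect of the unitary map V_K ⊕ V_R can be seen in a finite-dimensional analogue. The sketch below is an illustration only, under assumptions not made in the text: D is an arbitrary m × n complex matrix, K and R are positive diagonal matrices so that square roots are entrywise, and all numerical values are hypothetical. It forms the matrix W of (2.6), conjugates it with S^{-1/2} where S = K ⊕ R, and records whether each matrix is Hermitian in the unweighted inner product.

```python
import math

# Matrix analogue of (2.6) and (2.15). Illustrative assumptions: D is an
# arbitrary m x n matrix, K and R are positive diagonal matrices.
n, m = 2, 3
D = [[1 + 2j, 0.5], [-1j, 2.0], [0.25, 1 - 1j]]   # m x n
k_diag = [2.0, 0.5]                                # diagonal of K
r_diag = [1.5, 3.0, 0.75]                          # diagonal of R
s_diag = k_diag + r_diag                           # diagonal of S = K (+) R
N = n + m

# W = [[0, -i K D^*], [i R D, 0]] as in (2.6).
W = [[0j] * N for _ in range(N)]
for i in range(n):
    for j in range(m):
        W[i][n + j] = -1j * k_diag[i] * complex(D[j][i]).conjugate()
        W[n + j][i] = 1j * r_diag[j] * D[j][i]

# W_tilde = S^{-1/2} W S^{1/2}; entrywise because S is diagonal here.
W_tilde = [[W[a][b] * math.sqrt(s_diag[b] / s_diag[a]) for b in range(N)]
           for a in range(N)]

def hermitian_defect(M):
    """Largest entry of M - M^* in absolute value."""
    size = len(M)
    return max(abs(M[a][b] - M[b][a].conjugate())
               for a in range(size) for b in range(size))

defect_W = hermitian_defect(W)              # W is not Hermitian for dx alone
defect_W_tilde = hermitian_defect(W_tilde)  # W_tilde is Hermitian
```

With these values W fails to be Hermitian in the unweighted inner product, while the conjugated matrix is Hermitian up to rounding, in line with the statement that (2.15) is formally self-adjoint on the unweighted space.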

Thus it will suffice for us to work on L²(R^d, dx; C^k), and we will do so in the remainder of this article. We set

\[
H^{(k)} = L^2(\mathbb{R}^d, dx;\mathbb{C}^k). \tag{2.16}
\]


Given a closed densely defined operator T on a Hilbert space H, we will denote its kernel by ker T and its range by ran T; note ker T*T = ker T. If T is self-adjoint, it leaves invariant the orthogonal complement of its kernel; the restriction of T to (ker T)^⊥ will be denoted by T_⊥. Note that T_⊥ is a self-adjoint operator on the Hilbert space (ker T)^⊥ = P_T^⊥ H, where P_T^⊥ denotes the orthogonal projection onto (ker T)^⊥.

DEFINITION 2.1. A constant coefficient, first-order, partial differential operator D from H^{(n)} to H^{(m)} (CPDO^{(1)}_{n,m}) is of the form D = D(−i∇), where, for a d-component vector k, D(k) is the m × n matrix

\[
D(k) = \bigl[D(k)_{r,s}\bigr]_{\substack{r=1,\dots,m\\ s=1,\dots,n}}; \qquad D(k)_{r,s} = a_{r,s}\cdot k, \quad a_{r,s}\in\mathbb{C}^d. \tag{2.17}
\]

We set

\[
D_+ = \sup\{\|D(k)\|;\ k\in\mathbb{C}^d,\ |k|=1\}, \tag{2.18}
\]

so ‖D(k)‖ ≤ D_+|k| for all k ∈ C^d. Note that D_+ is bounded by the norm of the matrix [|a_{r,s}|]_{r=1,...,m; s=1,...,n}.

Defined on

\[
\mathcal{D}(D) = \{\psi\in H^{(n)} : D\psi\in H^{(m)} \text{ in distributional sense}\}, \tag{2.19}
\]

a CPDO^{(1)}_{n,m} D is a closed, densely defined operator, and C_0^∞(R^d; C^n) (the space of infinitely differentiable functions with compact support) is an operator core for D. We will denote by D* the CPDO^{(1)}_{m,n} given by the formal adjoint of the matrix in (2.17).

DEFINITION 2.2. A CPDO^{(1)}_{n,m} D is said to be partially elliptic if there exists a CPDO^{(1)}_{n,q} D^⊥ (for some q), satisfying the following two properties:

\[
D^{\perp}D^{*} = 0, \tag{2.20}
\]

\[
D^{*}D + (D^{\perp})^{*}D^{\perp} \ge \Theta\,[(-\Delta)\otimes I_n], \tag{2.21}
\]

with Θ > 0 being a constant. (Δ = ∇ · ∇ is the Laplacian on L²(R^d, dx); I_n denotes the n × n identity matrix.)

If D is partially elliptic, we have

\[
H^{(n)} = \ker D^{\perp}\oplus\ker D, \tag{2.22}
\]

and

\[
D^{*}D + (D^{\perp})^{*}D^{\perp} = (D^{*}D)_{\perp}\oplus\bigl((D^{\perp})^{*}D^{\perp}\bigr)_{\perp}. \tag{2.23}
\]

Note that D is elliptic if and only if it is partially elliptic with D^⊥ = 0. Note also that a CPDO^{(1)}_{n,m} D may be partially elliptic with D* not being partially elliptic [18, Remark 1.1].
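For the Maxwell example, conditions (2.20) and (2.21) can be checked at the level of symbols. The sketch below rests on assumptions that are not stated in the text: D is the curl, the auxiliary operator D^⊥ is taken to be the divergence (a natural candidate, since div curl = 0), k is a real wave vector, and the helper names and the value of k are hypothetical. With these choices the symbols are D(k) = i[k]× and D^⊥(k) = i kᵀ.

```python
# Symbol check of (2.20) and (2.21) for D = curl, with the ASSUMED choice
# D_perp = divergence. The wave vector k is real and purely illustrative.

def cross_matrix(k):
    """Matrix [k]_x with [k]_x v = k x v."""
    k1, k2, k3 = k
    return [[0.0, -k3, k2], [k3, 0.0, -k1], [-k2, k1, 0.0]]

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    return [[sum(X[i][p] * Y[p][j] for p in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

k = (1.0, -2.0, 0.5)
k_squared = sum(c * c for c in k)
C = cross_matrix(k)     # D(k) = i C, hence D(k)^* D(k) = C^T C
row = [list(k)]         # D_perp(k) = i k^T, a 1 x 3 matrix

# (2.20): D_perp(k) D(k)^* = (i k^T)(-i C^T) = k^T C^T, which should vanish.
perp_times_adjoint = matmul(row, transpose(C))

# (2.21): D(k)^* D(k) + D_perp(k)^* D_perp(k) = C^T C + k k^T.
elliptic_sum = [[x + y for x, y in zip(r1, r2)]
                for r1, r2 in zip(matmul(transpose(C), C),
                                  matmul(transpose(row), row))]
```

In this sketch the sum equals |k|² I_3, which is the symbol of (−Δ) ⊗ I_3, so under the stated assumptions the curl is partially elliptic with Θ = 1, although it is not elliptic (its symbol annihilates k).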


DEFINITION 2.3. A coefficient operator S on H^{(n)} (CO_n) is a bounded, invertible operator given by multiplication by a coefficient matrix: an n × n matrix-valued measurable function S(x) on R^d, satisfying

\[
S_- I_n \le S(x) \le S_+ I_n, \quad\text{with } 0 < S_- \le S_+ < \infty. \tag{2.24}
\]

DEFINITION 2.4. A multiplicative coefficient, first-order, partial differential operator from H^{(n)} to H^{(m)} (MPDO^{(1)}_{n,m}) is of the form

\[
A = \sqrt{R}\,D\sqrt{K} \quad\text{on } \mathcal{D}(A) = K^{-1/2}\mathcal{D}(D), \tag{2.25}
\]

where D is a CPDO^{(1)}_{n,m}, K is a CO_n, and R is a CO_m. (We will write A_{K,R} for A whenever it is necessary to make explicit the dependence on the medium, i.e., on the coefficient operators. D does not depend on the medium, so it will be omitted in the notation.)

An MPDO^{(1)}_{n,m} A is a closed, densely defined operator with A* = √K D* √R an MPDO^{(1)}_{m,n}. Note that K^{-1/2} C_0^∞(R^d; C^n) is an operator core for A.

The following quantity will appear often in estimates:

\[
\Xi_A \equiv D_+\sqrt{R_+K_+}. \tag{2.26}
\]

DEFINITION 2.5. A first-order classical wave operator (CWO^{(1)}_{n,m}) is an operator of the form

\[
W_A = \begin{bmatrix} 0 & -iA^{*}\\ iA & 0 \end{bmatrix} \quad\text{on } H^{(n+m)}\cong H^{(n)}\oplus H^{(m)}, \tag{2.27}
\]

where A is an MPDO^{(1)}_{n,m}. If either D or D* is partially elliptic, W_A will also be called partially elliptic.

A CWO^{(1)}_{n,m} is a self-adjoint MPDO^{(1)}_{n+m,n+m}: W_A = √S W_D √S, where S = K ⊕ R is a CO_{n+m} and W_D is a self-adjoint CPDO^{(1)}_{n+m,n+m}. (Note that our definition of a first-order classical wave operator is more restrictive than the one used in [18]. The definition of partial ellipticity is also different; [18] requires both D and D* to be partially elliptic.)

The Schrödinger-like Equation (2.5) for classical waves with the auxiliary condition (2.8) may be written in the form:

\[
-i\,\frac{\partial}{\partial t}\Psi_t = (W_A)_{\perp}\Psi_t, \qquad \Psi_t\in(\ker W_A)^{\perp} = (\ker A)^{\perp}\oplus(\ker A^{*})^{\perp}, \tag{2.28}
\]

with W_A a CWO^{(1)}_{n+m} as in (2.27). Its solutions are of the form

\[
\Psi_t = e^{it(W_A)_{\perp}}\Psi_0, \qquad \Psi_0\in(\ker W_A)^{\perp}, \tag{2.29}
\]

which is just another way of writing (2.9).


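Since

\[
(W_A)^2 = \begin{bmatrix} A^{*}A & 0\\ 0 & AA^{*} \end{bmatrix}, \tag{2.30}
\]

if Ψ_t = (ψ_t, φ_t) ∈ H^{(n)} ⊕ H^{(m)} is a solution of (2.28), then its components satisfy the second-order wave equations (2.2) and (2.3), plus the auxiliary conditions, which may be all written in the form

\[
\frac{\partial^2}{\partial t^2}\psi_t = -(A^{*}A)_{\perp}\psi_t, \quad\text{with } \psi_t\in(\ker A)^{\perp}, \tag{2.31}
\]

\[
\frac{\partial^2}{\partial t^2}\phi_t = -(AA^{*})_{\perp}\phi_t, \quad\text{with } \phi_t\in(\ker A^{*})^{\perp}. \tag{2.32}
\]

The solutions to (2.31) may be written as

\[
\psi_t = \cos\bigl(t(A^{*}A)_{\perp}^{1/2}\bigr)\psi_0 + \sin\bigl(t(A^{*}A)_{\perp}^{1/2}\bigr)\eta_0, \qquad \psi_0,\eta_0\in(\ker A)^{\perp}, \tag{2.33}
\]

with a similar expression for the solutions of (2.32).

The operators (A*A)_⊥ and (AA*)_⊥ are unitarily equivalent (see Lemma A.1): the operator U defined by

\[
U\psi = A(A^{*}A)_{\perp}^{-1/2}\psi \quad\text{for } \psi\in\operatorname{ran}(A^{*}A)_{\perp}^{1/2}, \tag{2.34}
\]

extends to a unitary operator from (ker A)^⊥ to (ker A*)^⊥, and

\[
(AA^{*})_{\perp} = U(A^{*}A)_{\perp}U^{*}. \tag{2.35}
\]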

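In addition, if

\[
\mathcal{U} = \frac{1}{\sqrt2}\begin{bmatrix} I_A & I_A\\ iU & -iU \end{bmatrix}, \quad\text{with } I_A \text{ the identity on } (\ker A)^{\perp}, \tag{2.36}
\]

then 𝒰 is a unitary operator from (ker A)^⊥ ⊕ (ker A)^⊥ to (ker A)^⊥ ⊕ (ker A*)^⊥, and we have the unitary equivalence:

\[
\mathcal{U}^{*}(W_A)_{\perp}\,\mathcal{U} = (A^{*}A)_{\perp}^{1/2}\oplus\bigl[-(A^{*}A)_{\perp}^{1/2}\bigr]. \tag{2.37}
\]

Thus the operator (A*A)_⊥ contains full information about the spectral theory of the operator (W_A)_⊥ (e.g., [8, 18]). In particular

\[
\sigma\bigl((W_A)_{\perp}\bigr) = \sigma\bigl((A^{*}A)_{\perp}^{1/2}\bigr)\cup\bigl(-\sigma\bigl((A^{*}A)_{\perp}^{1/2}\bigr)\bigr), \tag{2.38}
\]

and to find all eigenvalues and eigenfunctions for (W_A)_⊥, it is necessary and sufficient to find all eigenvalues and eigenfunctions for (A*A)_⊥. For if (A*A)_⊥ ψ_{ω²} = ω² ψ_{ω²}, with ω ≠ 0, ψ_{ω²} ≠ 0, we have

\[
(W_A)_{\perp}\Bigl(\psi_{\omega^2},\ \pm\frac{i}{\omega}A\psi_{\omega^2}\Bigr) = \pm\omega\Bigl(\psi_{\omega^2},\ \pm\frac{i}{\omega}A\psi_{\omega^2}\Bigr). \tag{2.39}
\]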

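Conversely, if (W_A)_⊥(ψ_{±ω}, φ_{±ω}) = ±ω(ψ_{±ω}, φ_{±ω}), with ω ≠ 0, it follows that (see [18, Proposition 5.2])

\[
(A^{*}A)_{\perp}\psi_{\pm\omega} = \omega^2\psi_{\pm\omega} \quad\text{and}\quad \phi_{\pm\omega} = \pm\frac{i}{\omega}A\psi_{\pm\omega}. \tag{2.40}
\]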
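The correspondence between eigenpairs of A*A and of W_A in (2.39) can be tested on a small matrix. The sketch below is illustrative and not part of the original argument: A is a hypothetical real 3 × 2 matrix chosen so that AᵀA = [[5, 2], [2, 5]] has the explicit eigenpair ω² = 7, ψ = (1, 1), and the helper names are assumptions.

```python
import math

# Matrix check of (2.39) for a hypothetical 3 x 2 real matrix A with
# A^T A = [[5, 2], [2, 5]], whose eigenpair omega^2 = 7, psi = (1, 1) is explicit.
A = [[1.0, 2.0], [0.0, 1.0], [2.0, 0.0]]
m, n = 3, 2
N = n + m

# W_A = [[0, -i A^*], [i A, 0]] as in (2.27).
W_A = [[0j] * N for _ in range(N)]
for r in range(m):
    for s in range(n):
        W_A[s][n + r] = -1j * A[r][s]    # A is real, so A^* = A^T
        W_A[n + r][s] = 1j * A[r][s]

def apply(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

psi = [1.0, 1.0]
omega = math.sqrt(7.0)
A_psi = apply(A, psi)                    # equals (3, 1, 2)

residuals = {}
for sign in (+1, -1):
    # Candidate eigenvector (psi, +-(i/omega) A psi) for the eigenvalue +-omega.
    v = psi + [sign * 1j / omega * x for x in A_psi]
    Wv = apply(W_A, v)
    residuals[sign] = max(abs(Wv[i] - sign * omega * v[i]) for i in range(N))
```

Both residuals vanish up to rounding, so each eigenvalue ω² of A*A yields the pair ±ω for W_A, as in (2.38) and (2.39).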

DEFINITION 2.6. A second-order classical wave operator on H^{(n)} (CWO^{(2)}_n) is an operator W = A*A, with A an MPDO^{(1)}_{n,m} for some m. (We write W_{K,R} = A*_{K,R} A_{K,R}.) If D in (2.25) is partially elliptic, the CWO^{(2)}_n will also be called partially elliptic.

Note that a first-order classical wave operator W_A is partially elliptic if and only if one of the two second-order classical wave operators A*A and AA* is partially elliptic.

DEFINITION 2.7. A classical wave operator (CWO) is either a CWO^{(1)}_n or a CWO^{(2)}_n. If the operator W is a CWO, we call W_⊥ a proper CWO.

Remark 2.8. A proper classical wave operator W has a trivial kernel by construction, so 0 is not an eigenvalue. However, using a dilation argument, one can show that 0 is in the spectrum of W_⊥ [18, Theorem A.1], so W_⊥ and W have the same spectrum and essential spectrum.

2.5. DEFECTS AND WAVE LOCALIZATION

We now describe our results on defects and wave localization. We can produce eigenvalues in spectral gaps of classical wave operators by introducing defects. Moreover, the corresponding waves exhibit exponential localization. The proofs and details are given in Section 5.

A defect is a modification of a given medium in a bounded domain. Two media, described by coefficient matrices K_0(x), R_0(x) and K(x), R(x), are said to differ by a defect, if they are the same outside some bounded set Ω, i.e., K_0(x) = K(x) and R_0(x) = R(x) if x ∉ Ω. The defect is said to be supported by the bounded set Ω.

We recall that the essential spectrum σ_ess(H) of an operator H consists of all the points of its spectrum, σ(H), which are not isolated eigenvalues with finite multiplicity. Figotin and Klein [9] showed that the essential spectrum of Acoustic and Maxwell operators is not changed by defects. We extend this result to the class of classical wave operators: the essential spectrum of a partially elliptic classical wave operator (first or second order) is not changed by defects.

THEOREM 2.9. Let W_0 and W be partially elliptic classical wave operators for two media which differ by a defect. Then

\[
\sigma_{\mathrm{ess}}(W) = \sigma_{\mathrm{ess}}(W_0). \tag{2.41}
\]

If (a, b) is a gap in the spectrum of W_0, the spectrum of W in (a, b) consists of at most isolated eigenvalues with finite multiplicity, the corresponding eigenmodes decaying exponentially fast away from the defect, with a rate depending on the distance from the eigenvalue to the edges of the gap.

In view of the unitary equivalence (2.37), Theorem 2.9 is an immediate corollary to Theorem 5.1 and Corollary 5.3. For second-order classical wave operators, the exponential decay of an eigenmode is given in (5.11). For first-order classical wave operators, the exponential decay of an eigenmode follows from (2.40), (5.11), and (3.19).

We now turn to the existence of midgap eigenmodes and exponentially localized waves. The next theorem shows that one can design simple defects which generate eigenvalues in a specified subinterval of a spectral gap of W_0, extending [9, Theorem 2] to the class of classical wave operators. We insert a defect that changes the value of K_0(x) and R_0(x) inside a bounded set of ‘size’ ℓ to given positive constants K and R. If (a, b) is a gap in the spectrum of W_0, we will show that we can deposit an eigenvalue of W inside any specified closed subinterval of (a, b), by inserting such a defect with ℓ/√(KR) large enough. We provide estimates on how large is ‘large enough’. Note that the corresponding eigenmode is exponentially decaying by Theorem 2.9, so we construct an exponentially localized wave.

THEOREM 2.10 (Existence of exponentially localized waves). Let (a, b) be a gap in the spectrum of a partially elliptic classical wave operator W_0 = W_{K_0,R_0}, select µ ∈ (a, b), and pick δ > 0 such that the interval [µ − δ, µ + δ] is contained in the gap, i.e., [µ − δ, µ + δ] ⊂ (a, b). Given an open bounded set Ω, x_0 ∈ Ω, 0 < K, R, ℓ < ∞, we introduce a defect that produces coefficient matrices K(x) and R(x) that are constant in the set Ω_ℓ = x_0 + ℓ(Ω − x_0), with

\[
K(x) = K I_n \quad\text{and}\quad R(x) = R I_m \quad\text{for } x\in\Omega_\ell. \tag{2.42}
\]

Then there is a finite constant C, satisfying an explicit lower bound depending only on the order (first or second) of the classical wave operator, and on D_+, µ, δ, and the geometry of Ω, such that if

\[
\frac{\ell}{\sqrt{KR}} > C, \tag{2.43}
\]

then the operator W = W_{K,R} has at least one eigenvalue in the interval [µ − δ, µ + δ].

Theorem 2.10 follows from Theorem 5.4 and (2.37). For second-order operators the explicit lower bound is given in (5.16); for first-order operators it can be calculated from (5.16) and (2.37).


3. Properties of Classical Wave Operators

In this section we discuss several important properties of classical wave operators, which provide the necessary technical tools for proving localization in inhomogeneous and random (see [17]) media.

3.1. A TRACE ESTIMATE

Partially elliptic second-order classical wave operators satisfy a trace estimate that provides a crucial ingredient for many results.

THEOREM 3.1 ([18, Theorem 1.1]). Let W be a partially elliptic second-order classical wave operator on H^{(n)}, and let P_W^⊥ denote the orthogonal projection onto (ker W)^⊥. Then

\[
\operatorname{tr}\bigl(V^{*}P_W^{\perp}(W+I)^{-2r}V\bigr) \le C_{d,n,K_\pm,R_\pm,D_+,D_+^{\perp},\Theta}\,\|V\|_{\infty,2}^2 < \infty, \tag{3.1}
\]

for r ≥ ν, where ν is the smallest integer satisfying ν > d/4. V is the bounded operator on H^{(n)} given by multiplication by an n × n matrix-valued measurable function V(x), with

\[
\|V\|_{\infty,2}^2 = \sum_{y\in\mathbb{Z}^d}\|\chi_{y,1}(x)V^{*}(x)V(x)\|_\infty < \infty. \tag{3.2}
\]

(χ_{y,L} denotes the characteristic function of a cube of side length L centered at y.) The constant C_{d,n,K±,R±,D+,D⊥+,Θ} depends only on the fixed parameters d, n, K_±, R_±, D_+, D^⊥_+, Θ.

3.2. FINITE VOLUME CLASSICAL WAVE OPERATORS

Throughout this paper we use two norms in R^d and C^d:

\[
|x| = \Bigl(\sum_{i=1}^d |x_i|^2\Bigr)^{1/2}, \tag{3.3}
\]

\[
\|x\| = \max\{|x_i|,\ i = 1,\dots,d\}. \tag{3.4}
\]

We set B_r(x) to be the open ball in R^d, centered at x with radius r > 0:

\[
B_r(x) = \{y\in\mathbb{R}^d;\ |y-x| < r\}. \tag{3.5}
\]

By Λ_L(x) we denote the open cube in R^d, centered at x with side L > 0:

\[
\Lambda_L(x) = \{y\in\mathbb{R}^d;\ \|y-x\| < L/2\}, \tag{3.6}
\]

and by Λ̄_L(x) the closed cube. By Λ we will always denote some open cube Λ_L(x). We will identify a closed cube Λ̄_L(x) with a torus in the usual way, and use the following distance in the torus:

\[
d_L(y,y') = \min_{m\in L\mathbb{Z}^d}|y-y'+m| \le \frac{\sqrt d}{2}\,L \quad\text{for } y,y'\in\overline{\Lambda}_L(x). \tag{3.7}
\]
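The torus distance (3.7) is simple to compute, because the minimum over the lattice LZ^d can be taken coordinate by coordinate. The following sketch is illustrative; the function name torus_distance and the sample points are hypothetical.

```python
import math

def torus_distance(y, y_prime, L):
    """Distance (3.7) on the torus of side L: min over m in L*Z^d of |y - y' + m|.

    The Euclidean norm is minimized coordinate-wise, so each coordinate
    difference is reduced to its shortest wrap-around displacement.
    """
    total = 0.0
    for a, b in zip(y, y_prime):
        diff = (a - b) % L              # representative in [0, L)
        diff = min(diff, L - diff)      # shortest displacement, at most L/2
        total += diff * diff
    return math.sqrt(total)

# Two points near opposite faces of the cube of side 2 are close on the torus.
near_faces = torus_distance((0.9, 0.0), (-0.9, 0.0), 2.0)

# Largest possible distance in d = 2: half a side in every coordinate.
corner = torus_distance((1.0, 1.0), (0.0, 0.0), 2.0)
```

The second sample attains the bound (√d/2)L of (3.7) with d = 2 and L = 2, since every coordinate displacement equals L/2.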

We set H^{(n)}_Λ = L²(Λ, dx; C^n). A CPDO^{(1)}_{n,m} D defines a closed densely defined operator D_Λ from H^{(n)}_Λ to H^{(m)}_Λ with periodic boundary condition; an operator core is given by C^∞_per(Λ, C^n), the infinitely differentiable, periodic C^n-valued functions on Λ. The restriction of a CO_n S to Λ gives the bounded, invertible operator S_Λ on H^{(n)}_Λ. Given an MPDO^{(1)}_{n,m} A as in (2.25), we define its restriction A_Λ to the cube Λ with periodic boundary condition by

\[
A_\Lambda = \sqrt{R_\Lambda}\,D_\Lambda\sqrt{K_\Lambda} \quad\text{on } \mathcal{D}(A_\Lambda) = K_\Lambda^{-1/2}\mathcal{D}(D_\Lambda), \tag{3.8}
\]

a closed, densely defined operator on H^{(n)}_Λ. The restriction W_Λ of the second-order classical wave operator W = A*A to Λ with periodic boundary condition is now defined as W_Λ = A*_Λ A_Λ.

If the CPDO^{(1)}_{n,m} D is partially elliptic, then the restriction D_Λ is also partially elliptic, in the sense that Equations (2.20) and (2.21) hold for D_Λ, (D^⊥)_Λ, and Δ_Λ. (Δ_Λ is the Laplacian on L²(Λ, dx) with periodic boundary condition.) This can be easily seen by using the Fourier transform; here the use of periodic boundary condition plays a crucial role. We also have (2.22) and (2.23) with H^{(n)}_Λ.

If Λ = Λ_L(x), we write H^{(n)}_{x,L}, W_{x,L}, and so on.

Given a second-order classical wave operator W on H^{(n)}, we define its finite volume resolvent on a cube Λ by

\[
R_\Lambda(z) = (W_\Lambda - z)^{-1} \quad\text{for } z\notin\sigma(W_\Lambda). \tag{3.9}
\]

If W is partially elliptic, it turns out that (W_Λ)_⊥ has compact resolvent, i.e., R_Λ(z)P^⊥_{W_Λ} is a compact operator for z ∉ σ(W_Λ). Note that it suffices to prove the statement for z = −1. We will prove a stronger statement.

In what follows, we write F ≅ G if the positive self-adjoint operators F and G are unitarily equivalent, and we write F ≼ G if F ≅ J for some positive self-adjoint operator J ≤ G. Note that, if 0 ≤ F ≼ G, then tr f(G) ≤ tr f(F), for any positive, decreasing function f on [0, ∞).

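PROPOSITION 3.2. Let W be a partially elliptic second-order classical wave operator. Then for any finite cube Λ and p > d/2 we have

\[
\operatorname{tr}\bigl\{(W_\Lambda+1)^{-p}P^{\perp}_{W_\Lambda}\bigr\} \le n\operatorname{tr}\bigl\{(K_-R_-\Theta(-\Delta_\Lambda)+1)^{-p}\bigr\} < \infty. \tag{3.10}
\]

Proof. Using Lemma A.1 and (2.24), we get

\[
\begin{aligned}
(W_\Lambda)_{\perp} &= \bigl(\sqrt{K_\Lambda}\,D_\Lambda^{*}R_\Lambda D_\Lambda\sqrt{K_\Lambda}\bigr)_{\perp} \ge R_-\bigl(\sqrt{K_\Lambda}\,D_\Lambda^{*}D_\Lambda\sqrt{K_\Lambda}\bigr)_{\perp}\\
&\cong R_-(D_\Lambda K_\Lambda D_\Lambda^{*})_{\perp} \ge K_-R_-(D_\Lambda D_\Lambda^{*})_{\perp} \cong K_-R_-(D_\Lambda^{*}D_\Lambda)_{\perp}.
\end{aligned}
\tag{3.11}
\]

It follows from (3.11), (2.23), and (2.21) that

\[
\begin{aligned}
\operatorname{tr}\bigl\{(W_\Lambda+1)^{-p}P^{\perp}_{W_\Lambda}\bigr\}
&= \operatorname{tr}\bigl\{\bigl((W_\Lambda)_{\perp}+1\bigr)^{-p}\bigr\}\\
&\le \operatorname{tr}\bigl\{\bigl(K_-R_-(D_\Lambda^{*}D_\Lambda)_{\perp}+1\bigr)^{-p}\bigr\}\\
&\le \operatorname{tr}\bigl\{\bigl(K_-R_-\bigl[(D_\Lambda^{*}D_\Lambda)_{\perp}\oplus((D^{\perp})_\Lambda^{*}(D^{\perp})_\Lambda)_{\perp}\bigr]+1\bigr)^{-p}\bigr\}\\
&\le \operatorname{tr}\bigl\{\bigl(K_-R_-\Theta[(-\Delta_\Lambda)\otimes I_n]+1\bigr)^{-p}\bigr\}\\
&= n\operatorname{tr}\bigl\{(K_-R_-\Theta(-\Delta_\Lambda)+1)^{-p}\bigr\} < \infty,
\end{aligned}
\tag{3.12}
\]

if p > d/2. ✷

Since (W_Λ)_⊥ ≥ 0 has compact resolvent, we may define

\[
N_{W_\Lambda}(E) = \operatorname{tr}\chi_{(-\infty,E)}\bigl((W_\Lambda)_{\perp}\bigr), \tag{3.13}
\]

the number of eigenvalues of (W_Λ)_⊥ that are less than E. If E ≤ 0, we have N_{W_Λ}(E) = 0, and if E > 0, N_{W_Λ}(E) is the number of eigenvalues of W_Λ (or (W_Λ)_⊥) in the interval (0, E). Notice that N_{W_Λ}(E) is the distribution function of the measure n_{W_Λ}(dE) given by

\[
\int h(E)\,n_{W_\Lambda}(dE) = \operatorname{tr}\bigl(h((W_\Lambda)_{\perp})\bigr), \tag{3.14}
\]

for positive continuous functions h of a real variable.

We have the following ‘a priori’ estimate: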

LEMMA 3.3. Let W be a partially elliptic second-order classical wave operator. Then for any finite cube Λ and E > 0 we have

\[
N_{W_\Lambda}(E) \le n\,N_{-\Delta_\Lambda}\Bigl(\frac{E}{K_-R_-\Theta}\Bigr) \le n\,C_d\Bigl(\frac{E}{K_-R_-\Theta}\Bigr)^{d/2}|\Lambda|, \tag{3.15}
\]

where C_d is some finite constant depending only on the dimension d.

Proof. We have

\[
N_{W_\Lambda}(E) \le N_{K_-R_-(D_\Lambda^{*}D_\Lambda)}(E) \tag{3.16}
\]

\[
\le N_{K_-R_-(D_\Lambda^{*}D_\Lambda+(D^{\perp})_\Lambda^{*}(D^{\perp})_\Lambda)}(E) \tag{3.17}
\]

\[
\le N_{K_-R_-\Theta[(-\Delta_\Lambda)\otimes I_n]}(E) = n\,N_{-\Delta_\Lambda}\Bigl(\frac{E}{K_-R_-\Theta}\Bigr), \tag{3.18}
\]

where (3.16) follows from (3.11) and the Min-max Principle, (3.17) follows from (2.23), (3.18) follows from (2.21), plus a simple computation for the equality.

The second inequality in (3.15) is given by a standard estimate. ✷
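The counting function of the periodic Laplacian that appears on the right-hand side of (3.15) can be evaluated directly, since the eigenvalues of −Δ_Λ on a cube of side L with periodic boundary condition are (2π/L)²|j|² with j ∈ Z^d. The sketch below is illustrative: the function name is hypothetical, and the zero eigenvalue is excluded, in line with the convention of (3.13) that only eigenvalues of the restriction to the complement of the kernel are counted.

```python
import itertools
import math

def laplacian_counting_function(E, L, d):
    """Number of nonzero eigenvalues (2*pi/L)^2 |j|^2, j in Z^d, below E.

    These are the eigenvalues of the periodic Laplacian on a cube of side L;
    the zero eigenvalue (constants) is left out of the count.
    """
    if E <= 0:
        return 0
    scale = (2.0 * math.pi / L) ** 2
    j_max = int(math.sqrt(E / scale)) + 1     # no lattice point beyond this
    count = 0
    for j in itertools.product(range(-j_max, j_max + 1), repeat=d):
        eigenvalue = scale * sum(c * c for c in j)
        if 0.0 < eigenvalue < E:
            count += 1
    return count

# Cube of side 2*pi in d = 2: eigenvalues are the integers |j|^2.
count_2d = laplacian_counting_function(9.5, 2.0 * math.pi, 2)
```

For the sample cube the count is the number of nonzero integer points inside a disc of radius √9.5, and for large E it grows like a constant times E^{d/2}|Λ|, which is the shape of the second bound in (3.15).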

3.3. AN INTERIOR ESTIMATE

The following interior estimate is an adaptation of [18, Theorem 4.1] to both finite and infinite volume.

LEMMA 3.4. Let W = A*A be a second-order classical wave operator, and let Λ denote either an open cube or R^d. Let ρ ∈ C_0^1(Λ) and τ ∈ L^∞_loc(Λ, dx), with 0 ≤ ρ(x) ≤ τ(x) and |∇ρ(x)| ≤ cτ(x) a.e., where c is a finite constant. Then, for any ψ ∈ D(W_Λ) we have

\[
\|\rho A_\Lambda\psi\|^2 \le a\|\tau W_\Lambda\psi\|^2 + \Bigl(\frac1a + 4c^2\Xi_A^2\Bigr)\|\tau\psi\|^2 \tag{3.19}
\]

for all a > 0, where Ξ_A is given in (2.26).

Proof. This is proved as [18, Theorem 4.1], keeping track of the constants. ✷

3.4. IMPROVED RESOLVENT DECAY ESTIMATES IN A GAP

We adapt an argument of Barbaroux, Combes and Hislop [1] to second-order classical wave operators, obtaining an improvement on the rate of decay given by the usual Combes–Thomas argument (e.g., [7, Lemma 12], [8, Lemma 15]). Our proof, while based on [1, Lemma 3.1], is otherwise different from the proof for Schrödinger operators, as we use an argument based on quadratic forms, avoiding the analytic continuation of the operators. This way we can accommodate the nonsmoothness of the coefficients of our classical wave operators.

We will prove the decay estimate for both infinite and finite volumes (with periodic boundary condition). We start with infinite volume. Recall that B_r(x) denotes the open ball of radius r centered at x.

THEOREM 3.5. Let W = A*A be a second-order classical wave operator with a spectral gap (a, b). Then for any E ∈ (a, b) and ℓ, ℓ′ > 0 we have

\[
\|\chi_{B_\ell(x)}R(E)\chi_{B_{\ell'}(y)}\| \le C_E\,e^{m_E(\ell+\ell')}\,e^{-m_E|x-y|} \tag{3.20}
\]

for all x, y ∈ R^d, with

\[
m_E = \frac{1}{4\Xi_A}\sqrt{\frac{(E-a)(b-E)}{(a+b+2)(b+1)}} \le \frac{1}{4\Xi_A} \tag{3.21}
\]

and

\[
C_E = \max\Bigl\{\frac{a+b+2}{E-a},\ \frac{4(b+1)}{b-E}\Bigr\}. \tag{3.22}
\]

In addition,

\[
\|\chi_{B_\ell(x)}AR(E)\chi_{B_{\ell'}(y)}\| \le C_E\bigl(2E+16\,\Xi_A^2\bigr)^{1/2}e^{m_E(\ell+\ell'+1)}\,e^{-m_E|x-y|} \tag{3.23}
\]

for all x, y ∈ R^d with |x − y| ≥ ℓ + ℓ′ + 1.
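The constants in Theorem 3.5 are explicit, so they are easy to tabulate for a given gap. The sketch below is illustrative only: the function names, the argument xi_A standing for the quantity in (2.26), and the sample gap (a, b) = (1, 3) are assumptions made for the example. It evaluates (3.21) and (3.22) and the right-hand side of (3.20).

```python
import math

def decay_rate(E, a, b, xi_A):
    """Rate m_E of (3.21) for E in the gap (a, b); xi_A is the constant of (2.26)."""
    return math.sqrt((E - a) * (b - E) / ((a + b + 2.0) * (b + 1.0))) / (4.0 * xi_A)

def decay_constant(E, a, b):
    """Constant C_E of (3.22)."""
    return max((a + b + 2.0) / (E - a), 4.0 * (b + 1.0) / (b - E))

def resolvent_bound(E, a, b, xi_A, ell, ell_prime, distance):
    """Right-hand side of (3.20) for balls of radii ell, ell' at the given distance."""
    m_E = decay_rate(E, a, b, xi_A)
    return decay_constant(E, a, b) * math.exp(m_E * (ell + ell_prime - distance))

# Hypothetical gap (1, 3), energy at its midpoint, and xi_A = 1.
m_mid = decay_rate(2.0, 1.0, 3.0, 1.0)
c_mid = decay_constant(2.0, 1.0, 3.0)
```

The rate vanishes as E approaches either edge of the gap while C_E blows up there, so the estimate is most effective for energies well inside the gap.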

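Proof. We start by defining the operators formally given by

\[
W_\alpha = e^{\alpha\cdot x}\,W e^{-\alpha\cdot x}, \qquad \alpha\in\mathbb{R}^d. \tag{3.24}
\]

To do so, let us consider the bounded operator

\[
G_\alpha = \sqrt{R}\,D(\alpha)\sqrt{K}, \qquad \|G_\alpha\| \le |\alpha|\,\Xi_A. \tag{3.25}
\]

Then

\[
A_\alpha = e^{\alpha\cdot x}A e^{-\alpha\cdot x} = A + iG_\alpha \quad\text{on } \mathcal{D}(A), \tag{3.26}
\]

\[
(A^{*})_\alpha = e^{\alpha\cdot x}A^{*}e^{-\alpha\cdot x} = A^{*} + iG_\alpha^{*} \quad\text{on } \mathcal{D}(A^{*}), \tag{3.27}
\]

are closed, densely defined operators. (Note (A*)_α ≠ (A_α)*.) We define W_α = (A*)_α A_α as a quadratic form. More precisely, for each α ∈ R^d, we define a quadratic form with domain D(A) by

\[
\mathcal{W}_\alpha[\psi] = \bigl\langle((A^{*})_\alpha)^{*}\psi,\ A_\alpha\psi\bigr\rangle. \tag{3.28}
\]

Note that if α = 0, 𝒲 = 𝒲_0 is the closed, nonnegative quadratic form associated to the classical wave operator W.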

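It follows from (3.26) and (3.27) that

\[
\mathcal{W}_\alpha[\psi] - \mathcal{W}[\psi] = 2i\,\mathrm{Re}\langle A\psi, G_\alpha\psi\rangle - \langle G_\alpha\psi, G_\alpha\psi\rangle, \tag{3.29}
\]

so

\[
\bigl|\mathcal{W}_\alpha[\psi] - \mathcal{W}[\psi]\bigr| \le \|G_\alpha\psi\|\bigl(4\|A\psi\|^2 + \|G_\alpha\psi\|^2\bigr)^{1/2} \le 2s\,\mathcal{W}[\psi] + \frac12\Bigl(\frac1s + s\Bigr)|\alpha|^2\Xi_A^2\|\psi\|^2 \tag{3.30}
\]

for any s > 0. It follows [15, Theorem VI.1.33] that 𝒲_α is a closed sectorial form on the form domain of W. We define W_α as the unique m-sectorial operator associated with it [15, Theorem VI.2.1].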

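For each α ∈ R^d, we set

\[
W_{(\alpha)} = W - G_\alpha^{*}G_\alpha, \tag{3.31}
\]

\[
\theta_{(\alpha)} = 1 + |\alpha|^2\Xi_A^2, \tag{3.32}
\]

note

\[
W_{(\alpha)} + \theta_{(\alpha)} \ge 1. \tag{3.33}
\]

It follows from (3.29) that

\[
\begin{aligned}
&(W_{(\alpha)}+\theta_{(\alpha)})^{-1/2}(W_\alpha - E)(W_{(\alpha)}+\theta_{(\alpha)})^{-1/2}\\
&\qquad = (W_{(\alpha)}+\theta_{(\alpha)})^{-1/2}(W_{(\alpha)} - E)(W_{(\alpha)}+\theta_{(\alpha)})^{-1/2} + iY_\alpha,
\end{aligned}
\tag{3.34}
\]

as everywhere defined quadratic forms, where

\[
\bigl\|(W_{(\alpha)}+\theta_{(\alpha)})^{-1/2}(W_{(\alpha)} - E)(W_{(\alpha)}+\theta_{(\alpha)})^{-1/2}\bigr\| \le 1 + \theta_{(\alpha)} + E < \infty, \tag{3.35}
\]

and

\[
Y_\alpha = (W_{(\alpha)}+\theta_{(\alpha)})^{-1/2}\bigl(A^{*}G_\alpha + G_\alpha^{*}A\bigr)(W_{(\alpha)}+\theta_{(\alpha)})^{-1/2} \tag{3.36}
\]

extends to a bounded self-adjoint operator with

\[
\|Y_\alpha\| \le 2|\alpha|\,\Xi_A, \tag{3.37}
\]

in view of (3.25), (3.31), (3.32), (3.33), and

\[
\begin{aligned}
\bigl\|A(W_{(\alpha)}+\theta_{(\alpha)})^{-1/2}\psi\bigr\|^2
&= \bigl\langle(W_{(\alpha)}+\theta_{(\alpha)})^{-1/2}\psi,\ A^{*}A\,(W_{(\alpha)}+\theta_{(\alpha)})^{-1/2}\psi\bigr\rangle &&(3.38)\\
&= \|\psi\|^2 + \bigl\langle(W_{(\alpha)}+\theta_{(\alpha)})^{-1/2}\psi,\ \bigl(G_\alpha^{*}G_\alpha - \theta_{(\alpha)}\bigr)(W_{(\alpha)}+\theta_{(\alpha)})^{-1/2}\psi\bigr\rangle\\
&\le \|\psi\|^2 - \bigl\|(W_{(\alpha)}+\theta_{(\alpha)})^{-1/2}\psi\bigr\|^2 \le \|\psi\|^2. &&(3.39)
\end{aligned}
\]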

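Since (a, b) is a gap in the spectrum of W and E ∈ (a, b), we have that the interval

\[
\Bigl(\frac{a-E}{a+\theta_{(\alpha)}},\ \frac{b-E-|\alpha|^2\Xi_A^2}{b+\theta_{(\alpha)}-|\alpha|^2\Xi_A^2}\Bigr) = \Bigl(\frac{a-E}{a+1+|\alpha|^2\Xi_A^2},\ \frac{b-E-|\alpha|^2\Xi_A^2}{b+1}\Bigr) \tag{3.40}
\]

is a gap in the spectrum of the operator

\[
(W_{(\alpha)}+\theta_{(\alpha)})^{-1/2}(W_{(\alpha)} - E)(W_{(\alpha)}+\theta_{(\alpha)})^{-1/2}, \tag{3.41}
\]

containing 0, as long as

\[
|\alpha| < \frac{\sqrt{b-E}}{\Xi_A}. \tag{3.42}
\]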

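We now use [1, Lemma 3.1] to conclude, if in addition to (3.42) we also require

\[
|\alpha| < \frac{1}{4\Xi_A}\sqrt{\frac{(E-a)(b-E-|\alpha|^2\Xi_A^2)}{(a+1+|\alpha|^2\Xi_A^2)(b+1)}}, \tag{3.43}
\]

that 0 is not in the spectrum of the operator in (3.34), and

\[
\Bigl\|\bigl[(W_{(\alpha)}+\theta_{(\alpha)})^{-1/2}(W_\alpha - E)(W_{(\alpha)}+\theta_{(\alpha)})^{-1/2}\bigr]^{-1}\Bigr\| \le \Omega_\alpha \equiv 2\max\Bigl\{\frac{a+1+|\alpha|^2\Xi_A^2}{E-a},\ \frac{b+1}{b-E-|\alpha|^2\Xi_A^2}\Bigr\}. \tag{3.44}
\]

Since

\[
\mathcal{D}(W_\alpha) \subset \mathcal{D}(\mathcal{W}_\alpha) = \mathcal{D}(\mathcal{W}) = \mathcal{D}\bigl((W_{(\alpha)}+\theta_{(\alpha)})^{1/2}\bigr), \tag{3.45}
\]

we may use (3.33) and (3.44) to obtain, for all φ ∈ D(W_α),

\[
\begin{aligned}
\|(W_\alpha - E)\varphi\|
&\ge \bigl\|(W_{(\alpha)}+\theta_{(\alpha)})^{-1/2}(W_\alpha - E)(W_{(\alpha)}+\theta_{(\alpha)})^{-1/2}(W_{(\alpha)}+\theta_{(\alpha)})^{1/2}\varphi\bigr\|\\
&\ge \Omega_\alpha^{-1}\bigl\|(W_{(\alpha)}+\theta_{(\alpha)})^{1/2}\varphi\bigr\| \ge \Omega_\alpha^{-1}\|\varphi\|.
\end{aligned}
\tag{3.46}
\]

Since (3.46) holds for all α ∈ R^d, and W_α^* = W_{−α}, W_{(α)} = W_{(−α)}, θ_{(α)} = θ_{(−α)}, and Ω_α = Ω_{−α}, we see that we also have (3.46) for W_α^*. We can conclude that E ∉ σ(W_α) and

\[
\bigl\|(W_\alpha - E)^{-1}\bigr\| \le \Omega_\alpha, \tag{3.47}
\]

for all α ∈ R^d satisfying (3.42) and (3.43).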

We now take |α| � mE , where mE is given in (3.21). Then both (3.42) and(3.43) are satisfied, and we also have |α|/A �

√(b − E)/2, so5α � CE , with CE

as in (3.22), and (3.47) gives

‖Rα(E)‖ � CE, with Rα(E) = (Wα − E)−1. (3.48)

We may now prove (3.20). Let x0, y0 ∈ Rd , 8 > 0, and take α = mE

|x0−y0|(x0−y0).We have

χB8(x0)R(E)χB8′ (y0)

= χB8(x0)e−α·xRα(E)eα·xχB8′ (y0)

= e−m|x0−y0|χB8(x0)e−α·(x−x0)Rα(E)e

α·(x−y0)χB8′ (y0), (3.49)

so

‖χB8(x0)R(E)χB8′ (y0)‖� CE‖χB8(x0)e

−α·(x−x0)‖∞‖χB8′ (y0)ea·(x−y0)‖∞e−m|x0−y0|. (3.50)


Since

‖χB8(x0)e±α·(x−x0)‖∞ � e|α|8 = emE8, (3.51)

(3.20) follows from (3.50) and (3.51).

To prove (3.23), we use Lemma 3.4. We let $|x_0 - y_0| \ge \ell + \ell' + 1$, and pick $\rho \in C_0^1(\mathbb{R}^d)$, with $\chi_{B_\ell(x_0)} \le \rho \le \chi_{B_{\ell+1}(x_0)}$ and $|\nabla\rho(x)| \le 2$. We have

‖χB8(x0)AR(E)χB8′ (y0)‖� ‖ρAR(E)χB8′ (y0)‖�(2E + 16/2

A

) 12 ‖χB8+1(x0)R(E)χB8′ (y0)‖, (3.52)

so (3.23) now follows from (3.20). ✷

We now turn to the torus, i.e., we prove a version of Theorem 3.5 for the restriction of a second-order classical wave operator to a cube $\Lambda$ with periodic boundary condition. We use the distance (3.7) in the torus.

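THEOREM 3.6. Let $W$ be a second-order classical wave operator whose restriction with periodic boundary condition to a cube $\Lambda_L(x_0)$ has a spectral gap $(a, b)$. Then for any $E \in (a, b)$ and $\ell > 0$, with $L > 2\ell + 8$, we have

$$\|\chi_{B_\ell(x)} R_{x_0,L}(E) \chi_{B_{\ell'}(y)}\|_{x_0,L} \le C_E\, e^{2 m_{E,L,\ell}\,\ell}\, e^{-m_{E,L,\ell}\, d_L(x,y)} \tag{3.53}$$

for all $x, y \in \Lambda_L(x_0)$, where

$$m_{E,L,\ell} = \frac{m_E}{c_{L,\ell}}, \quad \text{with} \quad c_{L,\ell} = \frac{2\sqrt{d}}{1 - \frac{2(\ell+3)}{L-2}} + 1, \tag{3.54}$$

and $m_E$ and $C_E$ are as in Theorem 3.5.

Proof. Let us fix $x_1, y_1 \in \Lambda_L(x_0)$. By redefining the coefficient operators we may assume $x_0 = 0 = \frac{1}{2}(x_1 + y_1)$ and $x_1, y_1 \in \Lambda_{L/2}(0)$. In particular, $d_L(x_1, y_1) = |x_1 - y_1|$. Let $L > 2\ell + 8$. We pick a real valued function $\xi \in C_0^1(\mathbb{R})$ with $0 \le \xi(t) \le 1$ for all $t \in \mathbb{R}$, such that

$$\xi(t) = 1 \ \text{for } |t| \le \frac{L}{4} + \frac{\ell}{2}, \qquad \xi(t) = 0 \ \text{for } |t| \ge \frac{L}{2} - 1,$$

and

$$|\xi'(t)| \le \left(\frac{L}{4} - \frac{\ell}{2} - 2\right)^{-1} \quad \text{for all } t \in \mathbb{R}.$$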

We set �(x) =∏di=1 ξ(xi) for x ∈ R

d . Notice supp�(x) ⊂ =L(0).


We now proceed as in the proof of Theorem 3.5 with =L(0) substituted for Rd

and definition (3.24) replaced by

(W0,L)α = e�(x)α·xW0,Le−�(x)α·x, α ∈ Rd, (3.55)

and instead of (3.25), consider the bounded operators

(G0,L)α =√R0,L

(D(∇(�(x)α · x)))0,L√K0,L. (3.56)

Since∣∣∇(�(x)α · x)∣∣ �((L2 − 1)

√d

L4 − 8

2 − 2+ 1

)|α| = cL,8|α| (3.57)

for all x ∈ =L(0), with cL,8 as in (3.54), we have

‖(G0,L)α‖ � cL,8|α|/A. (3.58)
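The constant $c_{L,\ell}$ appears in two algebraic forms, once in (3.54) and once in the gradient bound (3.57). The following minimal Python sketch checks with exact rational arithmetic that the two rational factors multiplying $\sqrt{d}$ coincide, and evaluates $c_{L,\ell}$. The function names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction
from math import sqrt


def ratio_357(L, ell):
    """Rational factor multiplying sqrt(d) in (3.57): (L/2 - 1) / (L/4 - ell/2 - 2)."""
    return (Fraction(L, 2) - 1) / (Fraction(L, 4) - Fraction(ell, 2) - 2)


def ratio_354(L, ell):
    """Rational factor multiplying sqrt(d) in (3.54): 2 / (1 - 2(ell + 3)/(L - 2))."""
    return 2 / (1 - Fraction(2 * (ell + 3), L - 2))


def c_const(L, ell, d):
    """c_{L,ell} = ratio * sqrt(d) + 1, defined only when L > 2*ell + 8."""
    if L <= 2 * ell + 8:
        raise ValueError("need L > 2*ell + 8")
    return float(ratio_354(L, ell)) * sqrt(d) + 1.0


if __name__ == "__main__":
    # The two forms agree exactly for every admissible integer pair tried here.
    for L in range(20, 60):
        for ell in range(1, 5):
            assert ratio_357(L, ell) == ratio_354(L, ell)
    print(c_const(40, 4, 3))
```

For fixed $\ell$ the rational factor tends to 2 as $L \to \infty$, so $c_{L,\ell}$ approaches $2\sqrt{d} + 1$ and the torus decay rate $m_{E,L,\ell} = m_E / c_{L,\ell}$ stays bounded below uniformly in large $L$.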

We now proceed as in the proof of Theorem 3.5, except that we must nowsubstitute cL,8|α| for |α| in the estimates. Thus, if |α| � mE/cL,8, we conclude thatE /∈ σ ((W0,L)α) and∥∥(R0,L)α(E)

∥∥ � CE, with (R0,L)α(E) =((W0,L)α − E

)−1. (3.59)

To prove (3.53), we take

α = mE

cL,8|x1 − y1|(x1 − y1),

and complete the proof of as before (with x1, y1 substituted for x, y in (3.53)), as

‖χB8(x1)e±�(x)α·x−α·x1‖∞ = ‖χB8(x1)e

±α·(x−x1)‖∞ � emE,L,88. (3.60)

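✷

3.5. A SIMON–LIEB-TYPE INEQUALITY

The norm in $\mathcal{H}^{(r)}_\Lambda$ and also the corresponding operator norm will both be denoted by $\|\ \|_\Lambda$, or $\|\ \|_{x,L}$ in case $\Lambda = \Lambda_L(x)$. (We omit $r$ from the notation.) If $\Lambda_1 \subset \Lambda_2$ are open cubes (possibly the whole space), let $J^{\Lambda_2}_{\Lambda_1}: \mathcal{H}^{(r)}_{\Lambda_1} \to \mathcal{H}^{(r)}_{\Lambda_2}$ be the canonical injection. If $\Lambda_i = \Lambda_{L_i}(x_i)$, $i = 1, 2$, we write $\|\ \|^{x_2,L_2}_{x_1,L_1}$ for the operator norm from $\mathcal{H}^{(r)}_{\Lambda_{L_1}(x_1)}$ to $\mathcal{H}^{(r)}_{\Lambda_{L_2}(x_2)}$, and $J^{x_2,L_2}_{x_1,L_1} = J^{\Lambda_{L_2}(x_2)}_{\Lambda_{L_1}(x_1)}$.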

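Given a function $\phi \in L^\infty(\mathbb{R}^d)$, with $\operatorname{supp}\phi \subset \Lambda$, we do not distinguish in the notation between $\phi$ as a multiplication operator on $\mathcal{H}^{(r)}_\Lambda$ and on $\mathcal{H}^{(r)}$. If $D$ is a CPDO$^{(1)}_{n,m}$, and $\phi \in C_0^1(\mathbb{R}^d)$, real-valued, with $\operatorname{supp}\phi \subset \Lambda_1 \subset \Lambda_2$, we can verify that, as operators,

$$D_{\Lambda_2} J^{\Lambda_2}_{\Lambda_1} \phi = J^{\Lambda_2}_{\Lambda_1} D_{\Lambda_1} \phi \quad \text{on } \mathcal{D}(D_{\Lambda_1}). \tag{3.61}$$

It follows for the MPDO$^{(1)}_{n,m}$ $A$ that

$$A_{\Lambda_2} J^{\Lambda_2}_{\Lambda_1} \phi = J^{\Lambda_2}_{\Lambda_1} A_{\Lambda_1} \phi \quad \text{on } \mathcal{D}(A_{\Lambda_1}). \tag{3.62}$$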

We set

A[φ] = √RD[φ]√K, (3.63)

where $D[\phi]$, given by multiplication by the matrix-valued function $D(-i\nabla\phi(x))$, is a bounded operator from $\mathcal{H}^{(n)}$ to $\mathcal{H}^{(m)}$, with norm bounded by $D_+\|\nabla\phi\|_\infty$. Thus $A[\phi]$ is a bounded operator given by multiplication by a matrix-valued, measurable function, with (see (2.26))

‖A[φ]‖ � /A‖∇φ‖∞. (3.64)

We denote by $A_\Lambda[\phi]$ its restriction to the cube $\Lambda$; it also satisfies the bound (3.64). We will use the fact that $A_\Lambda R_\Lambda(z)$ is a bounded operator with

$$\|A_\Lambda R_\Lambda(z)\|^2_\Lambda \le \|R_\Lambda(z)\|_\Lambda \left(|z|\,\|R_\Lambda(z)\|_\Lambda + 1\right). \tag{3.65}$$

The basic tool to relate the finite volume resolvents in different scales is the smooth resolvent identity (SRI) (see [2, 7, 8, 18]).

LEMMA 3.7 (SRI). Let $W = A^*A$ be a second-order classical wave operator, let $\Lambda_1 \subset \Lambda_2$ be either open cubes or $\mathbb{R}^d$, and let $\phi \in C_0^1(\mathbb{R}^d)$ with $\operatorname{supp}\phi \subset \Lambda_1$. Then, for any $z \notin \sigma(W_{\Lambda_1}) \cup \sigma(W_{\Lambda_2})$ we have

$$R_{\Lambda_2}(z)\,\phi\, J^{\Lambda_2}_{\Lambda_1} = J^{\Lambda_2}_{\Lambda_1}\,\phi\, R_{\Lambda_1}(z) + R_{\Lambda_2}(z)\, A^*_{\Lambda_2}[\phi]\, J^{\Lambda_2}_{\Lambda_1} A_{\Lambda_1} R_{\Lambda_1}(z) - R_{\Lambda_2}(z)\, A^*_{\Lambda_2} J^{\Lambda_2}_{\Lambda_1} A_{\Lambda_1}[\phi]\, R_{\Lambda_1}(z) \tag{3.66}$$

as bounded operators from $\mathcal{H}^{(n)}_{\Lambda_1}$ to $\mathcal{H}^{(n)}_{\Lambda_2}$.

Proof. This lemma can be proved as [18, Lemma 7.2]. ✷

We will now state and prove a Simon–Lieb-type inequality (SLI) for second-order classical wave operators. This estimate is a crucial ingredient in the multiscale analysis proofs of localization for random operators, where it is used to obtain decay in a larger scale from decay in a given scale [13, 12, 5, 2, 7, 8, 14].

Let us fix $q \in \mathbb{N}$. (In [17] we will work with a periodic background medium, and we will take $q$ to be the period.) We will take cubes $\Lambda_L(x)$ centered at sites $x \in q\mathbb{Z}^d$ with side $L \in 2q\mathbb{N}$ (so in a periodic background medium with period $q$ the background medium will be the same in all cubes in a given scale $L$). For such cubes (with $L \ge 4q$), we set

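$$\Upsilon_L(x) = \left\{ y \in q\mathbb{Z}^d;\ \|y - x\| = \tfrac{L}{2} - q \right\}, \tag{3.67}$$

$$\widetilde{\Upsilon}_L(x) = \Lambda_{L-q}(x) \setminus \Lambda_{L-3q}(x) = \bigcup_{y \in \Upsilon_L(x)} \Lambda_q(y), \tag{3.68}$$

$$\widehat{\Upsilon}_L(x) = \Lambda_{L-\frac{3q}{2}}(x) \setminus \Lambda_{L-\frac{5q}{2}}(x), \tag{3.69}$$

$$M_{x,L} = \chi_{\widetilde{\Upsilon}_L(x)} = \sum_{y \in \Upsilon_L(x)} \chi_{y,q} \quad \text{a.e.}, \tag{3.70}$$

$$\widehat{M}_{x,L} = \chi_{\widehat{\Upsilon}_L(x)}. \tag{3.71}$$

Note that

$$|\Upsilon_L(x)| \le d\,(L - 2q + 1)^{d-1}. \tag{3.72}$$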

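In addition each cube $\Lambda_L(x)$ will be equipped with a function $\phi_{x,L}$ constructed in the following way: we fix an even function $\xi \in C_0^1(\mathbb{R})$ with $0 \le \xi(t) \le 1$ for all $t \in \mathbb{R}$, such that $\xi(t) = 1$ for $|t| \le q/4$, $\xi(t) = 0$ for $|t| \ge 3q/4$, and $|\xi'(t)| \le 3/q$ for all $t \in \mathbb{R}$. (Such a function always exists.) We define

$$\xi_L(t) = \begin{cases} 1, & \text{if } |t| \le \frac{L}{2} - \frac{5q}{4}, \\[2pt] \xi\!\left(|t| - \left(\frac{L}{2} - \frac{3q}{2}\right)\right), & \text{if } |t| \ge \frac{L}{2} - \frac{3q}{2}, \end{cases} \tag{3.73}$$

and set

$$\phi_{x,L}(y) = \phi_L(y - x) \ \text{ for } y \in \mathbb{R}^d, \quad \text{with} \quad \phi_L(y) = \prod_{i=1}^{d} \xi_L(y_i). \tag{3.74}$$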

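We have $\phi_{x,L} \in C_0^1(\mathbb{R}^d)$, with $\operatorname{supp}\phi_{x,L} \subset \Lambda_L(x)$ and $0 \le \phi_{x,L} \le 1$. By construction, we have

$$\chi_{x,\frac{L}{2}-\frac{5q}{4}}\,\phi_{x,L} = \chi_{x,\frac{L}{2}-\frac{5q}{4}}, \qquad \chi_{x,\frac{L}{2}-\frac{3q}{4}}\,\phi_{x,L} = \phi_{x,L}, \tag{3.75}$$

$$\widehat{M}_{x,L}(\nabla\phi_{x,L}) = \nabla\phi_{x,L}, \qquad |\nabla\phi_{x,L}| \le \frac{3\sqrt{d}}{q}. \tag{3.76}$$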
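The cutoff construction in (3.73)–(3.74) can be made concrete. The sketch below is only an illustration under one assumption not made in the text: it takes $\xi$ to be a cubic "smoothstep" profile, which is even, $C^1$, equals 1 for $|t| \le q/4$, vanishes for $|t| \ge 3q/4$, and has maximal slope exactly $3/q$. Any other admissible $\xi$ would do equally well. All function names are hypothetical.

```python
def xi(t, q):
    """One admissible choice of the profile xi (an assumption: cubic smoothstep)."""
    a = abs(t)
    if a <= q / 4:
        return 1.0
    if a >= 3 * q / 4:
        return 0.0
    s = (3 * q / 4 - a) / (q / 2)      # s runs from 1 down to 0 across the transition
    return 3 * s ** 2 - 2 * s ** 3     # C^1 matching at both ends, max slope 3/q


def xi_L(t, L, q):
    """One-dimensional cutoff of (3.73)."""
    a = abs(t)
    if a <= L / 2 - 5 * q / 4:
        return 1.0
    return xi(a - (L / 2 - 3 * q / 2), q)


def phi(y, x, L, q):
    """Product cutoff phi_{x,L}(y) of (3.74); x and y are coordinate tuples."""
    out = 1.0
    for yi, ci in zip(y, x):
        out *= xi_L(yi - ci, L, q)
    return out


def max_slope(q, n=20000):
    """Largest finite-difference slope of xi on [0, q]; bounded by sup |xi'|."""
    h = q / n
    return max(abs(xi((j + 1) * h, q) - xi(j * h, q)) / h for j in range(n))


if __name__ == "__main__":
    q, L = 2, 12
    print(phi((0.0, 0.0), (0.0, 0.0), L, q), phi((4.0, 0.0), (0.0, 0.0), L, q))
    print(max_slope(q), 3 / q)
```

With this choice the gradient of $\phi_{x,L}$ is nonzero only where some coordinate satisfies $L/2 - 5q/4 < |y_i - x_i| < L/2 - 3q/4$, which is the thin belt entering (3.76).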

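Similarly, we also construct a function $\rho_{x,L} \in C_0^1(\mathbb{R}^d)$, $0 \le \rho_{x,L} \le 1$, such that

$$\widehat{M}_{x,L}\,\rho_{x,L} = \widehat{M}_{x,L}, \qquad M_{x,L}\,\rho_{x,L} = \rho_{x,L}, \tag{3.77}$$

$$|\nabla\rho_{x,L}| \le \frac{5\sqrt{d}}{q}. \tag{3.78}$$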

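LEMMA 3.8 (SLI). Let $W$ be a second-order classical wave operator. Let $x, y \in q\mathbb{Z}^d$, $L, \ell \in 2q\mathbb{N}$, and $\Omega$ be a set, with $\Omega \subset \Lambda_{\ell-3q}(y) \subset \Lambda_{L-3q}(x)$. Then, if $z \notin \sigma(W_{x,L}) \cup \sigma(W_{y,\ell})$, we have

$$\|M_{x,L} R_{x,L}(z)\chi_\Omega\|_{x,L} \le \gamma_z\, \|M_{y,\ell} R_{y,\ell}(z)\chi_\Omega\|_{y,\ell}\, \|M_{x,L} R_{x,L}(z) M_{y,\ell}\|_{x,L}, \tag{3.79}$$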


with

γz = 6√d

q/A

(2|z| + 100d

q2/2A

) 12

, (3.80)

where /A is given in (2.26).Proof. We proceed as in [7, Lemma 26]. Using (3.75), (3.66), and Mx,Lφy,8 = 0,

we obtain

Mx,LRx,L(z)Jx,Ly,8 χ5

= Mx,LRx,L(z)J x,Ly,8 φy,8χ5 = Mx,LRx,L(z)A∗x,L[φy,8]J x,Ly,8 Ay,8Ry,8(z)χ5 −−Mx,LRx,L(z)A∗x,LJ x,Ly,8 Ay,8[φy,8]Ry,8(z)χ5. (3.81)

We now use (3.76) and (3.64) to get

‖Mx,LRx,L(z)A∗x,L[φy,8]J x,Ly,8 Ay,8Ry,8(z)χ5‖x,Ly,8= ‖Mx,LRx,L(z)My,8A∗x,L[φy,8]J x,Ly,8 My,8Ay,8Ry,8(z)χ5‖x,Ly,8� 3

√d

q/A‖Mx,LRx,L(z)My,8‖x,L‖My,8Ay,8Ry,8(z)χ5‖y,8, (3.82)

and

‖Mx,LRx,L(z)A∗x,LJ x,Ly,8 Ay,8[φy,8]Ry,8(z)χ5‖x,L= ‖Mx,LRx,L(z)A∗x,LMy,8J x,Ly,8 Ay,8[φy,8]My,8Ry,8(z)χ5‖x,L� 3

√d

q/A‖Mx,LRx,L(z)A∗x,LMy,8‖x,L‖My,8Ry,8(z)χ5‖y,8

= 3√d

q/A‖My,8Ax,LRx,L(z)Mx,L‖x,L‖My,8Ry,8(z)χ5‖y,8. (3.83)

We now appeal to Lemma 3.4 using (3.77) and (3.78). For ψ ∈ H (n)y,8 and a > 0

we get

‖My,8Ay,8Ry,8(z)χ5ψ‖2y,8 � ‖ρy,8Ay,8Ry,8(z)χ5ψ‖2

y,8

� a‖My,8Wy,8Ry,8(z)χ5ψ‖2y,8 +

+(

1

a+ 100d

q2/2A

)‖My,8Ry,8(z)χ5ψ‖2

y,8

�(a|z|2 + 1

a+ 100d

q2/2A

)‖My,8Ry,8(z)χ5ψ‖2

y,8. (3.84)

Choosing a = |z|−1, we get

‖My,8Ay,8Ry,8(z)χ5‖y,8

�(

2|z| + 100d

q2/2A

) 12

‖My,8Ry,8(z)χ5‖y,8. (3.85)


Similarly, we get

‖My,8Ax,LRx,L(z)Mx,L‖x,L

�(

2|z| + 100d

q2/2A

) 12

‖My,8Rx,L(z)Mx,L‖x,L

=(

2|z| + 100d

q2/2A

) 12

‖Mx,LRx,L(z)My,8‖x,L. (3.86)

Since

‖Mx,LRx,L(z)χ5‖x,L = ‖Mx,LRx,L(z)J x,Ly,8 χ5‖x,Ly,8 , (3.87)

the lemma follows from (3.81)–(3.86). ✷

3.6. THE EIGENFUNCTION DECAY INEQUALITY

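The eigenfunction decay inequality (EDI) estimates decay for generalized eigenfunctions from decay of finite volume resolvents.

We start by introducing generalized eigenfunctions for classical wave operators. (We refer to [18] for the details.) Given $\nu > d/4$, we define the weighted spaces (we will omit $\nu$ from the notation) $\mathcal{H}^{(r)}_\pm$ as follows:

$$\mathcal{H}^{(r)}_\pm = L^2\!\left(\mathbb{R}^d, (1 + |x|^2)^{\pm 2\nu}\, dx; \mathbb{C}^r\right).$$

$\mathcal{H}^{(r)}_-$ is the space of polynomially $L^2$-bounded functions. The sesquilinear form

$$\langle \phi_1, \phi_2 \rangle_{\mathcal{H}^{(r)}_+, \mathcal{H}^{(r)}_-} = \int \phi_1(x) \cdot \phi_2(x)\, dx, \tag{3.88}$$

where $\phi_1 \in \mathcal{H}^{(r)}_+$ and $\phi_2 \in \mathcal{H}^{(r)}_-$, makes $\mathcal{H}^{(r)}_+$ and $\mathcal{H}^{(r)}_-$ conjugate duals to each other. By $O^\dagger$ we will denote the adjoint of an operator $O$ with respect to this duality. By construction, $\mathcal{H}^{(r)}_+ \subset \mathcal{H}^{(r)} \subset \mathcal{H}^{(r)}_-$, the natural injections $\imath_+: \mathcal{H}^{(r)}_+ \to \mathcal{H}^{(r)}$ and $\imath_-: \mathcal{H}^{(r)} \to \mathcal{H}^{(r)}_-$ being continuous with dense range, with $\imath_+^\dagger = \imath_-$.

Given a second-order classical wave operator $W = A^*A$, where $A$ is a MPDO$^{(1)}_{n,m}$, we define operators $W_\pm$ on $\mathcal{H}^{(n)}_\pm$ as follows: $A_+$ is the restriction of the operator $A$ to $\mathcal{H}^{(n)}_+$, i.e., $A_+$ is the operator from $\mathcal{H}^{(n)}_+$ to $\mathcal{H}^{(m)}_+$ with domain $\mathcal{D}(A_+) = \{\phi \in \mathcal{D}(A) \cap \mathcal{H}^{(n)}_+;\ A\phi \in \mathcal{H}^{(m)}_+\}$, defined by $A_+\phi = A\phi$ for $\phi \in \mathcal{D}(A_+)$. $A_+$ is a closed densely defined operator, and we set $A_- = (A^*_+)^\dagger$, a closed densely defined operator from $\mathcal{H}^{(n)}_-$ to $\mathcal{H}^{(m)}_-$. We define $W_+ = A^*_+ A_+ = A^\dagger_- A_+$, which is a closed densely defined operator on $\mathcal{H}^{(n)}_+$ with domain $\mathcal{D}(W_+) = \{\phi \in \mathcal{D}(W) \cap \mathcal{H}^{(n)}_+;\ W\phi \in \mathcal{H}^{(n)}_+\}$, and $W_+\phi = W\phi$ for $\phi \in \mathcal{D}(W_+)$ [18, Theorem 4.2]. We define $W_- = W^\dagger_+$, a closed densely defined operator on $\mathcal{H}^{(n)}_-$. Note that $W$ is the restriction of $W_-$ to $\mathcal{H}^{(n)}$.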


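A measurable function $\psi: \mathbb{R}^d \to \mathbb{C}^n$ is said to be a generalized eigenfunction of $W$ with generalized eigenvalue $\lambda$, if $\psi \in \mathcal{H}^{(n)}_-$ (for some $\nu > d/4$) and is an eigenfunction for $W_-$ with eigenvalue $\lambda$, i.e., $\psi \in \mathcal{D}(W_-)$ and $W_-\psi = \lambda\psi$. In other words, $\psi \in \mathcal{H}^{(n)}_-$ and

$$\langle W_+\phi, \psi \rangle_{\mathcal{H}^{(n)}_+, \mathcal{H}^{(n)}_-} = \lambda \langle \phi, \psi \rangle_{\mathcal{H}^{(n)}_+, \mathcal{H}^{(n)}_-} \quad \text{for all } \phi \in \mathcal{D}(W_+). \tag{3.89}$$

Eigenfunctions of $W$ are always generalized eigenfunctions. Conversely, if a generalized eigenfunction is in $\mathcal{H}^{(n)}$, then it is a bona fide eigenfunction.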

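LEMMA 3.9 (EDI). Let $W$ be a second-order classical wave operator, and let $\psi$ be a generalized eigenfunction of $W$ with generalized eigenvalue $E$. Let $x \in q\mathbb{Z}^d$ and $L \in 2q\mathbb{N}$ be such that $E \notin \sigma(W_{x,L})$. Then for any set $\Omega \subset \Lambda_{L-3q}(x)$ we have

$$\|\chi_\Omega \psi\| \le \gamma_E\, \|M_{x,L} R_{x,L}(E)\chi_\Omega\|_{x,L}\, \|M_{x,L}\psi\|, \tag{3.90}$$

with $\gamma_E$ as in (3.80).

Proof. We fix $\nu > d/4$ such that $\psi \in \mathcal{D}(W_-)$ and $W_-\psi = E\psi$. We write $J_{x,L} = J^{\mathbb{R}^d}_{x,L}$.

Using [18, Lemma 4.1], we can show that, weakly in $\mathcal{H}^{(n)}_{x,L}$,

$$J^*_{x,L}\chi_\Omega\psi = \chi_\Omega J^*_{x,L}\phi_{x,L}\psi = \chi_\Omega R_{x,L}(E)(W_{x,L} - E) J^*_{x,L}\phi_{x,L}\psi = \chi_\Omega R_{x,L}(E) A^*_{x,L} J^*_{x,L} A[\phi_{x,L}]\psi + \chi_\Omega R_{x,L}(E) J^*_{x,L} A^*[\phi_{x,L}] A_-\psi. \tag{3.91}$$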

Proceeding as in the proof of Lemma 3.8, we have∥∥χ5Rx,L(E)A∗x,LJ ∗x,LA[φx,L]ψ∥∥x,L= ∥∥χ5Rx,L(E)A∗x,LMx,LJ ∗x,LA[φx,L]Mx,Lψ∥∥x,L� 3

√d

q/A‖χ5Rx,L(E)A∗x,LMx,L‖x,L‖J ∗x,LMx,Lψ‖x,L

= 3√d

q/A‖Mx,LAx,LRx,L(E)χ5‖x,L‖J ∗x,LMx,Lψ‖x,L

� 3√d

q/A

(2|E| + 100d

q2/2A

) 12

××‖Mx,LRx,L(E)χ5‖x,L‖Mx,Lψ‖. (3.92)

Similarly,∥∥χ5Rx,L(E)J ∗x,LA∗[φx,L]A−ψ∥∥x,L= ∥∥χ5Rx,L(E)Mx,LJ ∗x,LA∗[φx,L]Mx,LA−ψ∥∥x,L� 3

√d

q/A‖Mx,LRx,L(E)χ5‖x,L‖Mx,LA−ψ‖, (3.93)


and, using Lemma 3.4, which is also valid for the operator A− (see [18, Theorem4.1], we have

‖Mx,LA−ψ‖2 � ‖ρx,LA−ψ‖2

� a‖Mx,LW−ψ‖2 +(

1

a+ 100d

q2/2A

)‖Mx,Lψ‖2

=(a|E|2 + 1

a+ 100d

q2/2A

)‖Mx,Lψ‖2, (3.94)

for any a > 0.Choosing a = |E|−1 in (3.94), (3.90) follows from (3.91)–(3.94). ✷

4. Periodic Classical Wave Operators

In this section we study classical wave operators in periodic media. The main theorem gives the spectrum of a periodic classical wave operator in terms of the spectra of its restriction to finite cubes with periodic boundary condition.

DEFINITION 4.1. A coefficient operator $S$ on $\mathcal{H}^{(n)}$ is periodic with period $q > 0$ if $S(x) = S(x + qj)$ for all $x \in \mathbb{R}^d$ and $j \in \mathbb{Z}^d$.

DEFINITION 4.2. A medium is called periodic if the coefficient operators $\mathcal{K}$ and $\mathcal{R}$ that describe the medium are periodic with the same period $q$. (We will always take the period $q \in \mathbb{N}$ without loss of generality.) The corresponding classical wave operators will be said to be periodic with period $q$ ($q$-periodic).

If $k, n \in \mathbb{N}$, we say that $k \preceq n$ if $n \in k\mathbb{N}$ and that $k \prec n$ if $k \preceq n$ and $k \ne n$.

THEOREM 4.3. Let $W$ be a $q$-periodic second-order classical wave operator. Let $\{\ell_n;\ n = 0, 1, 2, \ldots\}$ be a sequence in $\mathbb{N}$ such that $\ell_0 = q$ and $\ell_n \prec \ell_{n+1}$ for each $n = 0, 1, 2, \ldots$. Then

$$\sigma(W_{0,\ell_n}) \subset \sigma(W_{0,\ell_{n+1}}) \subset \sigma(W) \quad \text{for all } n = 0, 1, 2, \ldots, \tag{4.1}$$

and

$$\sigma(W) = \overline{\bigcup_{n=0}^{\infty} \sigma(W_{0,\ell_n})}. \tag{4.2}$$

Proof. The analogous result for periodic Schrödinger operators is well known [6]. Periodic acoustic and Maxwell operators are treated in [7, Theorem 14] and [8, Theorem 25], respectively. We will sketch a proof, using Floquet theory. We refer to [19, Section XIII.6] for the definitions and notations of direct integrals of Hilbert spaces.


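We let $Q = \Lambda_q(0)$ be the basic period cell, $\tilde{Q} = \breve{\Lambda}_{2\pi/q}(0)$ the dual basic cell, where

$$\breve{\Lambda}_L(x) = \left\{ y \in \mathbb{R}^d;\ x_i - \tfrac{L}{2} \le y_i < x_i + \tfrac{L}{2},\ i = 1, \ldots, d \right\}.$$

(We should also take $Q = \breve{\Lambda}_q(0)$, but we will not since it will make no difference in what follows.) For any $r \in \mathbb{N}$ we define the Floquet transform

$$\mathcal{F}: \mathcal{H}^{(r)} \to \int^{\oplus}_{\tilde{Q}} \mathcal{H}^{(r)}_Q\, dk \equiv L^2\!\left(\tilde{Q}, dk; \mathcal{H}^{(r)}_Q\right) \tag{4.3}$$

by

$$(\mathcal{F}\psi)(k, x) = \left(\frac{q}{2\pi}\right)^{\frac{d}{2}} \sum_{m \in q\mathbb{Z}^d} e^{ik\cdot(x-m)}\, \psi(x - m), \quad x \in Q,\ k \in \tilde{Q}, \tag{4.4}$$

if $\psi$ has compact support; it extends by continuity to a unitary operator.

The $q$-periodic operator $W$ is decomposable in this direct integral representation, more precisely,

$$\mathcal{F} W \mathcal{F}^* = \int^{\oplus}_{\tilde{Q}} W_Q(k)\, dk, \tag{4.5}$$

where for each $k \in \mathbb{R}^d$ we set $D_Q(k)$ to be the restriction to $Q$ with periodic boundary condition of the operator given by the matrix $D(-i\nabla + k)$ (see (2.17)), a closed, densely defined operator, and let $W_Q(k) = A^*_Q(k) A_Q(k)$ with $A_Q(k) = \sqrt{\mathcal{R}_Q}\, D_Q(k) \sqrt{\mathcal{K}_Q}$. (If for $p \in \frac{2\pi}{q}\mathbb{Z}^d$, $U_p$ denotes the unitary operator on $\mathcal{H}^{(r)}_Q$ given by multiplication by the function $e^{-ip\cdot x}$, then for all $k \in \mathbb{R}^d$ we have $W_Q(k + p) = U^*_p W_Q(k) U_p$.)

Since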

‖AQ(k + h)− AQ(k)‖ � |h|/A, (4.6)

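it follows from the resolvent identity that the map

$$k \in \mathbb{R}^d \mapsto (W_Q(k) + I)^{-1} \in \mathcal{B}\!\left(\mathcal{H}^{(n)}_Q\right) \tag{4.7}$$

is operator norm continuous, so we conclude from (4.5) that

$$\sigma(W) = \bigcup_{k \in \tilde{Q}} \sigma(W_Q(k)). \tag{4.8}$$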

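If $\ell \in q\mathbb{N}$, similar considerations apply to the operator $W_{0,\ell}$, which is $q$-periodic on the torus $\Lambda_\ell(0)$. The Floquet transform

$$\mathcal{F}_\ell: \mathcal{H}^{(r)}_{0,\ell} \to \bigoplus_{k \in \frac{2\pi}{\ell}\mathbb{Z}^d \cap \tilde{Q}} \mathcal{H}^{(r)}_Q \tag{4.9}$$

is a unitary operator now defined by

$$(\mathcal{F}_\ell\psi)(k, x) = \left(\frac{q}{\ell}\right)^{\frac{d}{2}} \sum_{m \in q\mathbb{Z}^d \cap \Lambda_\ell(0)} e^{ik\cdot(x-m)}\, \psi(x - m), \tag{4.10}$$

where $x \in Q$, $k \in \frac{2\pi}{\ell}\mathbb{Z}^d \cap \tilde{Q}$, $\psi \in \mathcal{H}^{(r)}_{0,\ell}$, with $\psi(x - m)$ being properly interpreted in the torus $\Lambda_\ell(0)$. We also have

$$\mathcal{F}_\ell W_{0,\ell} \mathcal{F}^*_\ell = \bigoplus_{k \in \frac{2\pi}{\ell}\mathbb{Z}^d \cap \tilde{Q}} W_Q(k), \tag{4.11}$$

and

$$\sigma(W_{0,\ell}) = \bigcup_{k \in \frac{2\pi}{\ell}\mathbb{Z}^d \cap \tilde{Q}} \sigma(W_Q(k)). \tag{4.12}$$

Theorem 4.3 follows from (4.8) and (4.12). ✷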
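The inclusion (4.1) rests on the nesting of the finite sets of quasi-momenta that index the direct sum in (4.12): if one torus side divides the next, every admissible $k$ at the smaller scale is also admissible at the larger one. The sketch below illustrates this in dimension $d = 1$ only, storing $k/2\pi$ as an exact fraction and taking the dual cell to be the half-open interval $[-\pi/q, \pi/q)$. The function name is hypothetical.

```python
from fractions import Fraction


def quasi_momenta(ell, q):
    """Return {k / (2*pi)} for k in (2*pi/ell) Z intersected with [-pi/q, pi/q), d = 1."""
    if ell % q != 0:
        raise ValueError("the torus side ell must be a multiple of the period q")
    half = Fraction(1, 2 * q)
    return {
        Fraction(j, ell)
        for j in range(-ell, ell + 1)
        if -half <= Fraction(j, ell) < half
    }


if __name__ == "__main__":
    q = 2
    scales = [2, 4, 12, 24]          # each scale strictly divides the next one
    sets = [quasi_momenta(ell, q) for ell in scales]
    for small, large in zip(sets, sets[1:]):
        assert small <= large        # nesting of index sets, hence of spectra
    print([len(s) for s in sets])    # ell / q quasi-momenta at each scale
```

When the divisibility condition fails the index sets need not be nested (compare sides 4 and 6 with $q = 2$), which is why the theorem asks for $\ell_n \prec \ell_{n+1}$.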

5. Defects and Midgap Eigenmodes

We now prove the results in Subsection 2.5.

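THEOREM 5.1 (Stability of essential spectrum). Let $W_0$ and $W$ be second-order partially elliptic classical wave operators for two media which differ by a defect. Then

$$\sigma_{\mathrm{ess}}(W) = \sigma_{\mathrm{ess}}(W_0). \tag{5.1}$$

Proof. We will first prove the theorem when the defect only changes $\mathcal{R}$, i.e., we will show

$$\sigma_{\mathrm{ess}}(W_{\mathcal{K}_0,\mathcal{R}}) = \sigma_{\mathrm{ess}}(W_{\mathcal{K}_0,\mathcal{R}_0}). \tag{5.2}$$

The general case will follow, using Remark 2.8 and Lemma A.1, as then

$$\begin{aligned} \sigma_{\mathrm{ess}}(W_{\mathcal{K}_0,\mathcal{R}_0}) &= \sigma_{\mathrm{ess}}(W_{\mathcal{K}_0,\mathcal{R}}) = \sigma_{\mathrm{ess}}\big((W_{\mathcal{K}_0,\mathcal{R}})_\perp\big) = \sigma_{\mathrm{ess}}\big((W_{\mathcal{R},\mathcal{K}_0})_\perp\big) \\ &= \sigma_{\mathrm{ess}}(W_{\mathcal{R},\mathcal{K}_0}) = \sigma_{\mathrm{ess}}(W_{\mathcal{R},\mathcal{K}}) = \sigma_{\mathrm{ess}}\big((W_{\mathcal{R},\mathcal{K}})_\perp\big) \\ &= \sigma_{\mathrm{ess}}\big((W_{\mathcal{K},\mathcal{R}})_\perp\big) = \sigma_{\mathrm{ess}}(W_{\mathcal{K},\mathcal{R}}). \end{aligned} \tag{5.3}$$

To prove (5.2), we proceed as in [9, Theorem 1]. Let $\mathcal{T}(x) = \mathcal{R}(x) - \mathcal{R}_0(x)$; by our hypotheses it is a bounded, measurable, self-adjoint matrix-valued function with compact support. We write

$$\mathcal{T}(x) = \mathcal{T}_+(x) - \mathcal{T}_-(x),$$

with $\mathcal{T}_\pm(x)$ the positive/negative part of the self-adjoint matrix $\mathcal{T}(x)$. We let $\mathcal{T}_\pm$ denote the bounded operators given by the matrices $\mathcal{T}_\pm(x)$; they would be coefficient operators except for the fact that the functions $\mathcal{T}_\pm(x)$ have compact support, so they are not bounded away from zero. We may still define nonnegative self-adjoint operators $W_{\mathcal{K}_0,\mathcal{T}_\pm}$. We have

$$W_{\mathcal{K}_0,\mathcal{R}} = (W_{\mathcal{K}_0,\mathcal{R}_0} + W_{\mathcal{K}_0,\mathcal{T}_+}) - W_{\mathcal{K}_0,\mathcal{T}_-}, \tag{5.4}$$

as quadratic forms. (Note that $\mathcal{Q}(W_{\mathcal{K}_0,\mathcal{R}}) = \mathcal{Q}(W_{\mathcal{K}_0,\mathcal{R}_0}) \subset \mathcal{Q}(W_{\mathcal{K}_0,\mathcal{T}_\pm})$, where $\mathcal{Q}(W)$ denotes the form domain of the operator $W$.) Thus (5.2) follows from [19, Corollary 4 to Theorem XIII.14] and the following lemma.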

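LEMMA 5.2. Let $W_{\mathcal{K},\mathcal{R}}$ be a second-order partially elliptic classical wave operator, and let $\mathcal{T}$ be like a coefficient operator, except for the fact that the function $\mathcal{T}(x)$ has compact support, so it is not bounded away from zero (i.e., $T_- = 0$). Then

$$\operatorname{tr}\left\{ (W_{\mathcal{K},\mathcal{R}} + I)^{-r}\, W_{\mathcal{K},\mathcal{T}}\, (W_{\mathcal{K},\mathcal{R}} + I)^{-r} \right\} < \infty \tag{5.5}$$

if $r \ge \nu + 1$, where $\nu$ is the smallest integer satisfying $\nu > d/4$.

Proof. Let $\Omega$ denote the support of $\mathcal{T}(x)$. We pick a function $\rho \in C_0^1(\mathbb{R}^d)$ with $\chi_\Omega \le \rho(x) \le \chi_{\tilde{\Omega}}$, where $\tilde{\Omega} = \operatorname{supp}\rho$ is a compact set. We have

$$\mathcal{T} \le T_+\rho^2 \le T_+ R_-^{-1} \rho^2 \mathcal{R}. \tag{5.6}$$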

Thus, using ‖ ‖HS to denote the Hilbert–Schmidt norm, and setting c = ‖∇ρ‖∞,we have

tr{(WK,R + I )−rWK,T (WK,R + I )−r

}(5.7)

� T+R−1− tr

{(WK,R + I )−rA∗K,Rρ

2AK,R(WK,R + I )−r}

= T+R−1−∥∥ρAK,R(WK,R + I )−r

∥∥2HS (5.8)

� T+R−1−{∥∥χ5WK,R(WK,R + I )−r

∥∥2HS + (5.9)

+ (1+ 4c2/2AK,R

)∥∥χ5(WK,R + I )−r∥∥2

HS

}� T+R−1

−{(∥∥χ5(WK,R + I )−r+1

∥∥HS +

∥∥χ5(WK,R + I )−r∥∥

HS

)2 ++ (1+ 4c2/2

AK,R

)∥∥χ5(WK,R + I )−r∥∥2

HS

}<∞, (5.10)

where the final bound in (5.10) follows from Theorem 3.1 if $r - 1 \ge \nu$. To go from (5.8) to (5.9) we used Lemma 3.4. ✷

This finishes the proof of Theorem 5.1. ✷

COROLLARY 5.3 (Behavior of midgap eigenmodes). Let $W_0$ and $W$ be second-order partially elliptic classical wave operators for two media which differ by a defect. If $(a, b)$ is a gap in the spectrum of $W_0$, the spectrum of $W$ in $(a, b)$ consists of at most isolated eigenvalues with finite multiplicity, the corresponding eigenmodes decaying exponentially fast from the defect, with a rate depending on the distance from the eigenvalue to the edges of the gap. If the defect is supported by some ball $B_r(x_0)$, and $E \in (a, b)$ is an eigenvalue for $W$ with a corresponding eigenmode $\psi$, $\|\psi\| = 1$, then

‖χxψ‖2

� 2CE/A0

(E

12 + emE

(2E + 16/2

A0

) 12)emE(

√d

2 +r+2) e−|x−x0| (5.11)

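for all $x \in \mathbb{R}^d$ such that

$$|x - x_0| \ge \frac{\sqrt{d}}{2} + r + 3,$$

where $m_E$ and $C_E$ are as in Theorem 3.5.

Proof. By Theorem 5.1, $W$ has no essential spectrum in $(a, b)$. Thus, if $E \in \sigma(W) \cap (a, b)$, it must be an isolated eigenvalue with finite multiplicity; let $\psi$ be a corresponding eigenvector. To estimate the decay of $\psi$ we have to deal with the fact that the form domains of $W$ and $W_0$ may be different, and $\psi$ may not be in the form domain of $W_0$. Thus we pick $\rho \in C^1(\mathbb{R}^d)$ such that

$$1 - \chi_{B_{r+2}(x_0)}(x) \le \rho(x) \le 1 - \chi_{B_{r+1}(x_0)}(x), \qquad |\nabla\rho(x)| \le 2. \tag{5.12}$$

Since $W$ and $W_0$ differ by a defect supported by $B_r(x_0)$, it follows from (2.25) that $\mathcal{D}_\rho \equiv \rho\mathcal{D}(A) = \rho\mathcal{D}(A_0)$, and $A\varphi = A_0\varphi$ for $\varphi \in \mathcal{D}_\rho$. Thus, if $\phi \in \mathcal{D}(A_0)$, we have

$$\begin{aligned} \langle A_0\phi, A_0\rho\psi \rangle &= \langle A_0\phi, A\rho\psi \rangle = \langle A_0\phi, \rho A\psi \rangle + \langle A_0\phi, A[\rho]\psi \rangle \\ &= \langle \rho A_0\phi, A\psi \rangle + \langle A_0\phi, A[\rho]\psi \rangle \\ &= \langle A_0\rho\phi, A\psi \rangle - \langle A_0[\rho]\phi, A\psi \rangle + \langle A_0\phi, A[\rho]\psi \rangle \\ &= \langle A\rho\phi, A\psi \rangle - \langle A_0[\rho]\phi, A\psi \rangle + \langle A_0\phi, A[\rho]\psi \rangle \\ &= \langle \rho\phi, W\psi \rangle - \langle A_0[\rho]\phi, A\psi \rangle + \langle A_0\phi, A_0[\rho]\psi \rangle \\ &= E\langle \rho\phi, \psi \rangle - \langle A_0[\rho]\phi, A\psi \rangle + \langle A_0\phi, A_0[\rho]\psi \rangle. \end{aligned} \tag{5.13}$$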

Taking φ = (W0 − E)−1χxψ , we get

‖χxψ‖2 = −⟨A0[ρ](W0 − E)−1χxψ,Aψ⟩+

+ ⟨A0(W0 − E)−1χxψ,A0[ρ]ψ⟩

= −⟨A0[ρ]χBr+2(x0)(W0 − E)−1χxψ,Aψ⟩+

+ ⟨χBr+2(x0)A0(W0 − E)−1χxψ,A0[ρ]ψ⟩

� 2/A0

(√E‖χBr+2(x0)(W0 − E)−1χx‖ +

+ ‖χBr+2(x0)A0(W0 − E)−1χx‖)‖ψ‖2, (5.14)

where we used (3.64) and $\|A\psi\|^2 = \langle \psi, W\psi \rangle = E\|\psi\|^2$.

The estimate (5.11) now follows from (5.14), using (3.20) and (3.23) in Theorem 3.5. ✷

The next theorem shows that one can design simple defects which generate eigenvalues in a specified subinterval of a spectral gap of $W_0$, extending [9, Theorem 2] to the class of classical wave operators. Let $\Omega$ be an open bounded subset of $\mathbb{R}^d$, $x_0 \in \Omega$. Typically, we take $\Omega$ to be the cube $\Lambda_1(x_0)$, or the ball $B_1(x_0)$. We set $\Omega_\ell = x_0 + \ell(\Omega - x_0)$ for $\ell > 0$. We insert a defect that changes the value of $\mathcal{K}_0(x)$ and $\mathcal{R}_0(x)$ inside $\Omega_\ell$ to given positive constants $K$ and $R$. If $(a, b)$ is a gap in the spectrum of $W_0$, we will show that we can deposit an eigenvalue of $W$ inside any specified closed subinterval of $(a, b)$, by inserting such a defect with $\ell/\sqrt{KR}$ large enough, how large depending only on $D_+$, the geometry of $\Omega$, and the specified closed subinterval.

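THEOREM 5.4 (Creation of midgap eigenvalues). Let $(a, b)$ be a gap in the spectrum of a second-order partially elliptic classical wave operator $W_0 = W_{\mathcal{K}_0,\mathcal{R}_0}$, select $\mu \in (a, b)$, and pick $\delta > 0$ such that the interval $[\mu - \delta, \mu + \delta]$ is contained in the gap, i.e., $[\mu - \delta, \mu + \delta] \subset (a, b)$. Given an open bounded set $\Omega$, $x_0 \in \Omega$, $0 < K, R, \ell < \infty$, we introduce a defect that produces coefficient matrices $\mathcal{K}(x)$ and $\mathcal{R}(x)$ that are constant in the set $\Omega_\ell = x_0 + \ell(\Omega - x_0)$, with

$$\mathcal{K}(x) = K I_n \quad \text{and} \quad \mathcal{R}(x) = R I_m \quad \text{for } x \in \Omega_\ell. \tag{5.15}$$

If

$$\frac{\ell}{\sqrt{KR}} > \frac{\sqrt{\mu}}{\delta}\, D_+ \inf\left\{ \|\nabla\eta\|_2 + \left( \|\nabla\eta\|_2^2 + \frac{\delta}{\mu}\|\Delta\eta\|_2 \right)^{\frac{1}{2}} \right\}, \tag{5.16}$$

where the infimum is taken over all real valued $C^2$-functions $\eta$ on $\mathbb{R}^d$ with support in $\Omega$ and $\|\eta\|_2 = 1$, the operator $W = W_{\mathcal{K},\mathcal{R}}$ has at least one eigenvalue in the interval $[\mu - \delta, \mu + \delta]$.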
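The size condition of Theorem 5.4 is the solution of a quadratic inequality in $\sqrt{KR}/\ell$, namely the sufficient condition (5.27) used at the end of the proof. The sketch below, with hypothetical function names, computes the threshold for one fixed trial function $\eta$ from the two numbers $\|\nabla\eta\|_2$ and $\|\Delta\eta\|_2$, and checks numerically that the left-hand side of (5.27) equals $\delta$ exactly at the threshold. It assumes that $\|\Delta\eta\|_2$ enters to the first power, as it does in (5.26) and (5.27); the sample values are arbitrary.

```python
from math import sqrt


def threshold(mu, delta, d_plus, grad_norm, lap_norm):
    """Right-hand side of the size condition for one trial function eta.

    grad_norm = ||grad eta||_2, lap_norm = ||Laplacian eta||_2 (first power, an assumption
    taken from (5.26)-(5.27)); the theorem then takes the infimum over eta.
    """
    return (sqrt(mu) / delta) * d_plus * (
        grad_norm + sqrt(grad_norm ** 2 + (delta / mu) * lap_norm)
    )


def lhs_527(x, mu, d_plus, grad_norm, lap_norm):
    """Left-hand side of (5.27) as a function of x = ell / sqrt(K R)."""
    return d_plus ** 2 * lap_norm / x ** 2 + 2.0 * d_plus * sqrt(mu) * grad_norm / x


if __name__ == "__main__":
    mu, delta, d_plus, g, h = 2.0, 0.5, 1.0, 3.0, 4.0   # illustrative values only
    x0 = threshold(mu, delta, d_plus, g, h)
    print(x0, lhs_527(x0, mu, d_plus, g, h))
```

Since the left-hand side of (5.27) is decreasing in $\ell/\sqrt{KR}$, any value above the threshold satisfies (5.27) and any value below it violates (5.27) for that particular $\eta$.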

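Proof. We proceed as in [9, Theorem 2]. In view of Corollary 5.3, it suffices to show that

$$\sigma(W) \cap [\mu - \delta, \mu + \delta] \ne \emptyset \tag{5.17}$$

if (5.16) is satisfied. To prove (5.17), it suffices to find $\varphi \in \mathcal{D}(W)$ such that

$$\|(W - \mu)\varphi\| \le \delta\|\varphi\|. \tag{5.18}$$

To do so, we will construct a function $\varphi \in \mathcal{D}(W)$, with $\|\varphi\| = 1$ and support in $\Omega_\ell$, such that (5.18) holds. In this case the inequality (5.18) takes the following simple form:

$$\|(KR\, D^*D - \mu)\varphi\| \le \delta, \tag{5.19}$$

which is the same as

$$\|(D^*D - \mu')\varphi\| \le \delta', \tag{5.20}$$

with $\mu' = \mu/KR$ and $\delta' = \delta/KR$.

We start by constructing generalized eigenfunctions for the nonnegative operator $D^*D$ corresponding to $\mu'$. In order to do this, we consider $\kappa \in S^d$, pick an eigenvalue $\lambda = \lambda_\kappa > 0$ and a corresponding eigenvector $\xi = \xi_{\kappa,\lambda} \in \mathbb{C}^n$, $|\xi| = 1$, of the $n \times n$ matrix $D(\kappa)^*D(\kappa)$ (see (2.17)). We set

$$f(x) = f_{\kappa,\lambda,\xi}(x) = e^{i\sqrt{\mu'/\lambda}\,\kappa\cdot x}\, \xi \in C^\infty(\mathbb{R}^d; \mathbb{C}^n). \tag{5.21}$$

Note that, pointwise, we have $|f(x)| = 1$, and

$$(D^*Df)(x) = \mu' f(x). \tag{5.22}$$

To produce the desired $\varphi$ satisfying (5.18), we will restrict $f$ to $\Omega_\ell$ in a suitable manner, and prove (5.20). To do so, let $\eta_\ell$ be a real valued $C^2$ function on $\mathbb{R}^d$ with support in $\Omega_\ell$ and $\|\eta_\ell\|_2 = 1$. We set

$$\varphi(x) = \eta_\ell(x) f(x), \quad \text{note } \|\varphi\| = \|\eta_\ell\|_2 = 1. \tag{5.23}$$

We have $\varphi \in \mathcal{D}(D^*D)$ with support in $\Omega_\ell$, and

$$(D^*D - \mu')\varphi = \left[D^*(-i\nabla) D(-i\nabla\eta_\ell)\right] f + \sqrt{\tfrac{\mu'}{\lambda}}\, D^*(-i\nabla\eta_\ell) D(\kappa) f + \sqrt{\tfrac{\mu'}{\lambda}}\, D^*(\kappa) D(-i\nabla\eta_\ell) f. \tag{5.24}$$

Thus

$$\|(D^*D - \mu')\varphi\| \le D_+^2 \|\Delta\eta_\ell\|_2 + 2\sqrt{\tfrac{\mu'}{\lambda}}\, D_+^2 \|\nabla\eta_\ell\|_2. \tag{5.25}$$

We now use a scaling argument (i.e., write $\eta_\ell(x) = \eta(\ell^{-1}(x - x_0) + x_0)$) to conclude that to obtain (5.20), it suffices to find $\eta \in C^2(\mathbb{R}^d, \mathbb{R})$ with support in $\Omega$, $\|\eta\|_2 = 1$, and a unit vector $\kappa \in \mathbb{R}^d$, such that

$$\ell^{-2} D_+^2 \|\Delta\eta\|_2 + 2\ell^{-1}\sqrt{\tfrac{\mu'}{\lambda}}\, D_+^2 \|\nabla\eta\|_2 \le \delta', \tag{5.26}$$

which will be satisfied if

$$\ell^{-2} KR\, D_+^2 \|\Delta\eta\|_2 + 2\ell^{-1}\sqrt{KR}\, D_+ \sqrt{\mu}\, \|\nabla\eta\|_2 \le \delta, \tag{5.27}$$

where we used the fact that $\lambda \le D_+^2$. Thus (5.20) holds if (5.16) is satisfied. ✷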


Appendix: A Useful Lemma

The following well known lemma (e.g., [4, Lemma 2]) is used throughout this paper. We recall that, given a closed densely defined operator $T$ on a Hilbert space $\mathcal{H}$, we denote its kernel by $\ker T$ and its range by $\operatorname{ran} T$. If $T$ is self-adjoint, it leaves invariant the orthogonal complement of its kernel; the restriction of $T$ to $(\ker T)^\perp$ is denoted by $T_\perp$, a self-adjoint operator on the Hilbert space $(\ker T)^\perp$.

LEMMA A.1. Let $B$ be a closed, densely defined operator from the Hilbert space $\mathcal{H}_1$ to the Hilbert space $\mathcal{H}_2$. Then the operators $(B^*B)_\perp$ and $(BB^*)_\perp$ are unitarily equivalent. More precisely, the operator $U$ defined by

$$U\psi = B\,(B^*B)_\perp^{-\frac{1}{2}}\,\psi \quad \text{for } \psi \in \operatorname{ran}\,(B^*B)_\perp^{\frac{1}{2}}, \tag{A.1}$$

extends to a unitary operator from $(\ker B)^\perp$ to $(\ker B^*)^\perp$, and

$$(BB^*)_\perp = U (B^*B)_\perp U^*. \tag{A.2}$$
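In finite dimensions Lemma A.1 says that $B^{*}B$ and $BB^{*}$ have the same nonzero eigenvalues with the same multiplicities, so the traces of all their positive powers must agree. The following sketch checks this consequence exactly for a small real integer matrix. It is a sanity check of one implication of the lemma under these simplifying assumptions, not a proof, and all helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(a):
    return [list(col) for col in zip(*a)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def power_traces(m, pmax):
    """Return [tr(m), tr(m^2), ..., tr(m^pmax)] for a square matrix m."""
    out, cur = [], m
    for _ in range(pmax):
        out.append(trace(cur))
        cur = matmul(cur, m)
    return out


if __name__ == "__main__":
    B = [[1, 2, 0],
         [0, 1, 3]]                      # real 2 x 3 example, so B* is the transpose
    BtB = matmul(transpose(B), B)        # 3 x 3, has a nontrivial kernel
    BBt = matmul(B, transpose(B))        # 2 x 2
    print(power_traces(BtB, 4))
    print(power_traces(BBt, 4))
```

The extra eigenvalue of the larger matrix is zero and does not contribute to any of these traces, which is the finite-dimensional counterpart of restricting to the orthogonal complement of the kernel.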

Acknowledgements

The authors thank Maximilian Seifert for many discussions and suggestions. A. Klein also thanks Alex Figotin, François Germinet, and Svetlana Jitomirskaya for enjoyable discussions.

References

1. Barbaroux, J. M., Combes, J. M. and Hislop, P. D.: Localization near band edges for random Schrödinger operators, Helv. Phys. Acta 70 (1997), 16–43.
2. Combes, J. M. and Hislop, P. D.: Localization for some continuous, random Hamiltonian in d-dimension, J. Funct. Anal. 124 (1994), 149–180.
3. Combes, J. M., Hislop, P. D. and Tip, A.: Band edge localization and the density of states for acoustic and electromagnetic waves in random media, Ann. Inst. H. Poincaré Phys. Théor. 70 (1999), 381–428.
4. Deift, P. A.: Applications of a commutation formula, Duke Math. J. 45 (1978), 267–310.
5. von Dreifus, H. and Klein, A.: A new proof of localization in the Anderson tight binding model, Comm. Math. Phys. 124 (1989), 285–299.
6. Eastham, M.: The Spectral Theory of Periodic Differential Equations, Scottish Academic Press, 1973.
7. Figotin, A. and Klein, A.: Localization of classical waves I: Acoustic waves, Comm. Math. Phys. 180 (1996), 439–482.
8. Figotin, A. and Klein, A.: Localization of classical waves II: Electromagnetic waves, Comm. Math. Phys. 184 (1997), 411–441.
9. Figotin, A. and Klein, A.: Localized classical waves created by defects, J. Statist. Phys. 86 (1997), 165–177.
10. Figotin, A. and Klein, A.: Midgap defect modes in dielectric and acoustic media, SIAM J. Appl. Math. 58 (1998), 1748–1773.
11. Figotin, A. and Klein, A.: Localization of light in lossless inhomogeneous dielectrics, J. Opt. Soc. Amer. A 15 (1998), 1423–1435.
12. Fröhlich, J., Martinelli, F., Scoppola, E. and Spencer, T.: Constructive proof of localization in the Anderson tight binding model, Comm. Math. Phys. 101 (1985), 21–46.
13. Fröhlich, J. and Spencer, T.: Absence of diffusion with Anderson tight binding model for large disorder or low energy, Comm. Math. Phys. 88 (1983), 151–184.
14. Germinet, F. and Klein, A.: Bootstrap multiscale analysis and localization in random media, Comm. Math. Phys., to appear.
15. Kato, T.: Perturbation Theory for Linear Operators, Springer-Verlag, New York, 1976.
16. Klein, A.: Localization of light in randomized periodic media, In: J.-P. Fouque (ed.), Diffuse Waves in Complex Media, Kluwer, Dordrecht, 1999, pp. 73–92.
17. Klein, A. and Koines, A.: A general framework for localization of classical waves: II. Random media, in preparation.
18. Klein, A., Koines, A. and Seifert, M.: Generalized eigenfunctions for waves in inhomogeneous media, J. Funct. Anal., to appear.
19. Reed, M. and Simon, B.: Methods of Modern Mathematical Physics, Vol. IV, Analysis of Operators, Academic Press, New York, 1978.
20. Schulenberger, J. and Wilcox, C.: Coerciveness inequalities for nonelliptic systems of partial differential equations, Arch. Rational Mech. Anal. 88 (1971), 229–305.
21. Wilcox, C.: Wave operators and asymptotic solutions of wave propagation problems of classical physics, Arch. Rational Mech. Anal. 22 (1966), 37–78.



Toda Equations, bi-Hamiltonian Systems, and Compatible Lie Algebroids

ATTILIO MEUCCI
Bain & Co, via Crocefisso 10, I-20122 Milan, Italy. e-mail: [email protected]

(Received: 23 March 2001)

Abstract. We present the bi-Hamiltonian structure of Toda3, a dynamical system studied by Kupershmidt as a restriction of the discrete KP hierarchy. We derive this structure by a suitable reduction of the set of maps from $\mathbb{Z}_d$ to $GL(3,\mathbb{R})$, in the framework of Lie algebroids.

Mathematics Subject Classifications (2000): 37K10, 70H06.

Key words: Toda lattice, Lie algebroids, bi-Hamiltonian manifolds, Marsden–Ratiu reduction.

1. Introduction

It is well known (see [12] and references therein) that the periodic Toda lattice is a bi-Hamiltonian system and that its integrability properties can be easily derived from its bi-Hamiltonian structure. In [11], this structure is investigated by means of a new approach: it stems from a reduction process of a special kind of Lie algebroids [6]. This approach parallels the work of [2], where the ‘continuous counterpart’ of the Toda lattice is studied, namely, the KdV equation.

Indeed, the KdV equation is a bi-Hamiltonian system obtained by reducing the space $\mathrm{Map}(S^1, gl(2,\mathbb{R}))$ of $C^\infty$ maps from $S^1$ to $gl(2,\mathbb{R})$. If instead of the space $\mathrm{Map}(S^1, gl(2,\mathbb{R}))$ one considers the space $\mathrm{Map}(S^1, gl(3,\mathbb{R}))$, one obtains the Boussinesq hierarchy, which also displays a bi-Hamiltonian structure. The discrete version of the KdV equation, the Toda lattice, is obtained in [11] by replacing the circle $S^1$ with the cyclic group $\mathbb{Z}_d$ and the algebra $gl(2,\mathbb{R})$ with the group $GL(2,\mathbb{R})$: therefore, the space to reduce is $\mathrm{Map}(\mathbb{Z}_d, GL(2,\mathbb{R}))$.

In this paper we analyze the discrete version of the Boussinesq equation. We consider therefore the reduction of the space $\mathrm{Map}(\mathbb{Z}_d, GL(3,\mathbb{R}))$. The equations we obtain also display a bi-Hamiltonian structure, which, to the best of our knowledge, is not known in the literature. We call Toda3 the integrable dynamical system that arises naturally from this structure. This dynamical system is studied under a different perspective by Kupershmidt in [5].

The plan of the paper is the following. In Section 2 we introduce the geometrical structure of the phase space to reduce: the set of maps $\mathrm{Map}(\mathbb{Z}_d, GL(3,\mathbb{R}))$. Following the analysis in [11], this space can be endowed with the structure of a Poisson bi-anchored manifold. In Section 3 we present the reduction of the Poisson bi-anchored manifold $\mathrm{Map}(\mathbb{Z}_d, GL(3,\mathbb{R}))$. This reduction is an adaptation of the Marsden–Ratiu reduction scheme for Poisson manifolds [9] and gives rise to a bi-Hamiltonian manifold. In Section 4 we apply the theory of Gelfand and Zakharevich [4] to study the integrability properties of the bi-Hamiltonian flows obtained previously. The last section contains an example.

2. Poisson Bi-anchored Manifolds

In this section we introduce the geometric objects we need for our approach to the Toda lattice. The discussion closely follows [11]. We will endow the manifold $\mathcal{M}$ of the maps from the cyclic group $\mathbb{Z}_d$ to $GL(3,\mathbb{R})$ with several structures, namely a Poisson tensor and two compatible Lie algebroids (see [6]) suitably soldered together: this will make $\mathcal{M}$ into a Poisson bi-anchored manifold.

First we introduce the manifold $\mathcal{M}$. A point $q$ of $\mathcal{M}$ is simply a $d$-tuple of invertible $3 \times 3$ matrices

$$q = (q^1, \ldots, q^d), \tag{1}$$

where

$$q^k = \begin{pmatrix} q^k_1 & q^k_2 & q^k_3 \\ q^k_4 & q^k_5 & q^k_6 \\ q^k_7 & q^k_8 & q^k_9 \end{pmatrix}. \tag{2}$$

We will always be dealing with $d$-tuples of matrices and the following condition is supposed to hold throughout the discussion:

$$(\cdot)^{k+d} = (\cdot)^k. \tag{3}$$

For convenience, we will say that a matrix as in (2) represents a $d$-tuple as in (1). Vector fields on $\mathcal{M}$ will be represented by $d$-tuples of $3 \times 3$ matrices $\dot{q}^k$ whose entries are functions of the point $q \in \mathcal{M}$. In the same way, one-forms on $\mathcal{M}$ are represented as $d$-tuples of $3 \times 3$ matrices $\alpha^k$ where each entry is a function of the point. The value of the one-form $\alpha$ on the vector field $\dot{q}$ is given by the scalar function:

$$\langle \alpha, \dot{q} \rangle = \sum_{k=1}^{d} \operatorname{Tr}\left(\alpha^k \dot{q}^k\right). \tag{4}$$

Next, we endow $\mathcal{M}$ with a Poisson manifold structure. A quick computation shows that the map $P': T^*\mathcal{M} \to T\mathcal{M}$ defined as

$$\dot{q}^k = P'(\alpha)^k = q^k\alpha^k b - b\,\alpha^k q^k, \tag{5}$$

where $b$ is any fixed matrix, is indeed a Poisson tensor, i.e., it defines a Poisson bracket $\{f, g\} = \langle df, P'dg \rangle$. In order to recover the Toda lattice we choose

$$b = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{6}$$

To obtain the Hamiltonian vector field $X_H$ associated with a (Hamiltonian) function $H: \mathcal{M} \to \mathbb{R}$, we simply have to plug its differential $\alpha = dH$ into Equation (5).
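A necessary property of the tensor in (5) is skew-symmetry with respect to the trace pairing (4), which follows from cyclicity of the trace. The sketch below verifies $\langle \alpha, P'(\beta) \rangle = -\langle \beta, P'(\alpha) \rangle$ exactly on random integer data. It does not test the Jacobi identity, and it does not need the matrices $q^k$ to be invertible. The helper names and the choice $d = 4$ are only for illustration.

```python
import random

B = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]   # the fixed matrix b of (6)


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(3)] for i in range(3)]


def trace(a):
    return a[0][0] + a[1][1] + a[2][2]


def poisson(q, alpha):
    """P'(alpha)^k = q^k alpha^k b - b alpha^k q^k, applied site by site as in (5)."""
    return [
        sub(matmul(matmul(qk, ak), B), matmul(matmul(B, ak), qk))
        for qk, ak in zip(q, alpha)
    ]


def pairing(alpha, v):
    """Trace pairing (4) between a one-form and a vector field."""
    return sum(trace(matmul(ak, vk)) for ak, vk in zip(alpha, v))


def random_tuple(rng, d):
    """A d-tuple of 3 x 3 integer matrices."""
    return [[[rng.randint(-5, 5) for _ in range(3)] for _ in range(3)] for _ in range(d)]


if __name__ == "__main__":
    rng = random.Random(0)
    q, alpha, beta = (random_tuple(rng, 4) for _ in range(3))
    print(pairing(alpha, poisson(q, beta)), pairing(beta, poisson(q, alpha)))
```

Because all entries are integers the two printed numbers are exact negatives of each other, with no rounding involved.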

At this point we endow $\mathcal{M}$ with an additional structure: a pencil of Lie algebroids. We recall (see [6]) that $(\mathcal{M}, E, A, \{\cdot,\cdot\})$ is a Lie algebroid if

(i) $E$ is a vector bundle on $\mathcal{M}$
(ii) $\{\cdot,\cdot\}$ is a bilinear composition law on the sections of $E$ that makes them into a Lie algebra
(iii) the map $A: E \to T\mathcal{M}$, called an anchor, is a Lie algebra morphism:

$$A(\{s, t\}) = [A(s), A(t)], \tag{7}$$

where $[\cdot,\cdot]$ is the usual commutator of vector fields.

If two different Lie algebroid structures $(\mathcal{M}, E, A, \{\cdot,\cdot\})$ and $(\mathcal{M}, E, A', \{\cdot,\cdot\}')$ co-exist on the same manifold $\mathcal{M}$ and vector bundle $E$ we can consider the pencil of brackets

$$\{s, t\}_\lambda = \{s, t\} + \lambda\{s, t\}' \tag{8}$$

and the pencil of maps

$$A_\lambda(s) = A(s) + \lambda A'(s), \tag{9}$$

where $\lambda$ is a complex parameter. The two Lie algebroid structures are said to be compatible if $(\mathcal{M}, E, A_\lambda, \{\cdot,\cdot\}_\lambda)$ is a Lie algebroid for every value of $\lambda$.

In our case, we consider the trivial vector bundle $E = M \times [\mathrm{Mat}(3,\mathbb{R})]^d$, where $(\cdot)^d$ denotes the d-fold Cartesian product (·) × ··· × (·). The sections of this bundle are represented by d-tuples of 3 × 3 matrices $s^k$ whose entries are functions of the point q ∈ M. Then we define the pencil of anchors $A_\lambda: E \to TM$ as

$$\dot q^k = A_\lambda(s)^k = s^{k+1}\left(q^k + \lambda b\right) - \left(q^k + \lambda b\right)s^k, \tag{10}$$

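and a composition law $\{\cdot,\cdot\}_\lambda$ as

$$\{s, t\}^k_\lambda = \partial_{A(s)} t^k - \partial_{A(t)} s^k + [t^k, s^k] + \lambda\left(\partial_{A'(s)} t^k - \partial_{A'(t)} s^k\right), \tag{11}$$

where by $\partial_{\dot q} t$ we mean the derivative of the section t along the vector field $\dot q$. It is easy to prove that $(M, E, A_\lambda, \{\cdot,\cdot\}_\lambda)$ is a pencil of Lie algebroids.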

The pencil of Lie algebroids defines a useful relation among one-forms. We define a one-form α to be related with a one-form β (and we denote this by α ∼ β) if $A^*\alpha = A'^*\beta$. In order to calculate this relation explicitly we need to derive the expression of the dual pencil of anchors $A^*_\lambda = A^* + \lambda A'^*$. An element ξ of the dual vector bundle $E^*$ can be naturally identified with a point of E by means of the pairing

$$\langle \xi, s \rangle = \sum_{k=1}^{d} \operatorname{Tr}\left(\xi^k s^k\right).$$

Therefore the dual pencil $A^*_\lambda: T^*M \to E^*$ can be viewed as a map that associates with a one-form α a d-tuple $\xi^k$ of matrices whose entries are functions of the point q ∈ M. A quick calculation shows that $A^*_\lambda$ reads:

$$\xi^k = A^*_\lambda(\alpha)^k = \left(q^{k-1} + \lambda b\right)\alpha^{k-1} - \alpha^k\left(q^k + \lambda b\right). \tag{12}$$
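The duality between (10) and (12) can be checked numerically. The following is a minimal sketch, assuming periodic site indices (site d + 1 is site 1), random test data, and an arbitrary sample value of λ; the function names `anchor`, `dual_anchor` and `pairing` are illustrative only. It verifies that ⟨α, A_λ(s)⟩ and ⟨A*_λ(α), s⟩ agree under the trace pairing.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, lam = 5, 3, 0.7                  # d sites, 3x3 matrices, sample pencil parameter
b = np.zeros((n, n))
b[0, 0] = 1.0                          # the matrix b of Equation (6)

q = rng.normal(size=(d, n, n))         # a point of M: one matrix per site
s = rng.normal(size=(d, n, n))         # a section of E
alpha = rng.normal(size=(d, n, n))     # a one-form on M

def anchor(s, q, lam):
    """Equation (10): (A_lam s)^k = s^{k+1}(q^k + lam b) - (q^k + lam b) s^k."""
    Q = q + lam * b
    return np.stack([s[(k + 1) % d] @ Q[k] - Q[k] @ s[k] for k in range(d)])

def dual_anchor(alpha, q, lam):
    """Equation (12): xi^k = (q^{k-1} + lam b) alpha^{k-1} - alpha^k (q^k + lam b)."""
    Q = q + lam * b
    # index k - 1 = -1 wraps to the last site, which implements periodicity
    return np.stack([Q[k - 1] @ alpha[k - 1] - alpha[k] @ Q[k] for k in range(d)])

def pairing(x, y):
    """Trace pairing: sum over sites of Tr(x^k y^k)."""
    return sum(np.trace(x[k] @ y[k]) for k in range(d))

lhs = pairing(alpha, anchor(s, q, lam))       # <alpha, A_lam(s)>
rhs = pairing(dual_anchor(alpha, q, lam), s)  # <A*_lam(alpha), s>
print(abs(lhs - rhs))
```

The agreement follows from the cyclicity of the trace and a shift of the summation index, which is the "quick calculation" behind (12).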

Thus far we have defined a Poisson structure and a pencil of Lie algebroids on the manifold M. We have arrived at the last step: soldering these structures by means of two maps $J, J': T^*M \to E$ that satisfy the following conditions: if α ∼ β, then

$$P'(\alpha) = A'(J\alpha + J'\beta), \tag{13}$$

$$P'(\beta) = A(J\alpha + J'\beta). \tag{14}$$

It is easy to see that the intertwining maps defined as

$$J: s^k = \alpha^k q^k, \qquad J': s^k = -\alpha^k b,$$

satisfy the above relations. This ends the definition of the geometrical structures of the manifold $M = \mathrm{Map}(\mathbb{Z}_d, GL(3,\mathbb{R}))$: it is a Poisson manifold endowed with two compatible Lie algebroid structures that define a relation on one-forms and two intertwining maps that solder everything. We call such a structure a Poisson bi-anchored manifold.

3. The Reduction

In this section we perform the reduction of the manifold $M = \mathrm{Map}(\mathbb{Z}_d, GL(3,\mathbb{R}))$ introduced in the previous section. Combining a restriction and a projection we obtain a new manifold N of lower dimension endowed with the same geometrical structure as the original manifold M, but with the important additional property of being bi-Hamiltonian.

As a first step, we consider the distribution Im P′ + Im A′. It is easy to verify by formulas (5) and (10) that this distribution is integrable and therefore it foliates $\mathrm{Map}(\mathbb{Z}_d, GL(3,\mathbb{R}))$ in maximal integral leaves, which are the 5d-dimensional hyperplanes of the form

$$q^k = \begin{pmatrix} q^k_1 & q^k_2 & q^k_3 \\ q^k_4 & \nu^k_5 & \nu^k_6 \\ q^k_7 & \nu^k_8 & \nu^k_9 \end{pmatrix}, \tag{15}$$


where the $\nu^k_i$ are constants. The restriction process we mentioned above consists in selecting one of these leaves. To obtain the Toda3 system, we pick the leaf L defined by points of the form

$$q^k = \begin{pmatrix} q^k_1 & q^k_2 & q^k_3 \\ q^k_4 & 0 & 0 \\ q^k_7 & 1 & 0 \end{pmatrix}. \tag{16}$$

The pencil of anchors allows us to define another distribution, namely D = A(ker A′), which is integrable.* As opposed to the GL(2, R) case, this distribution is not tangent to L. We are interested only in the restriction of this distribution to the leaf L, which we denote by $D|_L$. The distribution $E = D|_L \cap TL$ of L is also integrable, and an explicit computation shows that E is spanned by the vector fields of the form

$$\begin{aligned}
\dot q^k_1 &= 0, & \dot q^k_2 &= \mu^k q^k_2 - q^k_3 s^k_8, & \dot q^k_3 &= \mu^{k-1} q^k_3, \\
\dot q^k_4 &= -\mu^{k+1} q^k_4, & \dot q^k_5 &= 0, & \dot q^k_6 &= 0, \\
\dot q^k_7 &= \tau^k q^k_4 - \mu^k q^k_7, & \dot q^k_8 &= 0, & \dot q^k_9 &= 0
\end{aligned} \tag{17}$$

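for arbitrary $\mu^k$, $\tau^k$ (for the invariance of (19) below to hold, the entry $s^k_8$ in (17) must be read as $\tau^{k-1}$). From this expression we see that along the vector fields in E the following equations are satisfied:

$$\dot q^k_1 = 0, \qquad \left(q^{k+1}_2 q^k_4 + q^{k+1}_3 q^k_7\right)^{\bullet} = 0, \qquad \left(q^k_4 q^{k+2}_3\right)^{\bullet} = 0.$$

This means that the distribution E admits the three invariants

$$a^k_1 = q^k_1, \tag{18}$$

$$a^k_2 = q^{k+1}_2 q^k_4 + q^{k+1}_3 q^k_7, \tag{19}$$

$$a^k_3 = q^k_4 q^{k+2}_3. \tag{20}$$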
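The invariance of (19) and (20) along the vector fields (17) can be confirmed symbolically. The sketch below is an illustration under two stated assumptions: site indices are taken modulo d (here d = 5), and the entry $s^k_8$ of (17) is identified with $\tau^{k-1}$. All symbol and function names are hypothetical.

```python
import sympy as sp

d = 5  # number of sites, chosen only for this check
# Coordinates on the leaf (16) that actually move under (17): q^k_2, q^k_3, q^k_4, q^k_7
q = {(k, i): sp.Symbol(f"q{k}_{i}") for k in range(d) for i in (2, 3, 4, 7)}
mu = sp.symbols(f"mu0:{d}")
tau = sp.symbols(f"tau0:{d}")

def qdot(k, i):
    """Components of the vector fields (17), with s^k_8 read as tau^{k-1} (assumption)."""
    if i == 2:
        return mu[k] * q[k, 2] - q[k, 3] * tau[(k - 1) % d]
    if i == 3:
        return mu[(k - 1) % d] * q[k, 3]
    if i == 4:
        return -mu[(k + 1) % d] * q[k, 4]
    if i == 7:
        return tau[k] * q[k, 4] - mu[k] * q[k, 7]
    raise ValueError(i)

def derivative_along_E(f):
    """Derivative of a function f of the coordinates along the vector field (17)."""
    return sp.expand(sum(sp.diff(f, q[key]) * qdot(*key) for key in q))

a2 = [q[(k + 1) % d, 2] * q[k, 4] + q[(k + 1) % d, 3] * q[k, 7] for k in range(d)]  # (19)
a3 = [q[k, 4] * q[(k + 2) % d, 3] for k in range(d)]                                # (20)

residuals = [derivative_along_E(f) for f in a2 + a3]
print(residuals)
```

The invariant (18) needs no check, since $\dot q^k_1 = 0$ in (17).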

At this point we can perform the projection we mentioned above: we define the reduced manifold N to be the quotient of the leaf L with respect to the foliation induced by the distribution E. By (18), (19) and (20) we argue that N is a 3d-dimensional manifold that can be regarded as $\mathbb{R}^{3d}$, endowed with the set of coordinates $(a^k_1, a^k_2, a^k_3)_{k=1,\ldots,d}$. The above formulas also yield the expression of the canonical projection π: L → N.

After obtaining the reduced manifold N we endow it with a Poisson structure.

This can be done by observing that (M, P′, D, L) is Poisson reducible, in the terminology of [9]. In this context, to find the expression of the reduction of P′ we have to extend a generic one-form ϕ on N to a one-form α on M (possibly defined only at the points of the leaf L) which annihilates the distribution D. This means:

$$\langle \alpha, D \rangle = 0, \qquad \langle \alpha, \dot q \rangle = \langle \varphi, \pi_* \dot q \rangle.$$

Let us denote by α = ext(ϕ) any such extension. Then the expression

$$p'(\varphi) = \pi_* \circ P'(\mathrm{ext}(\varphi)) \tag{21}$$

does not depend on the choice of ext(ϕ) and determines a Poisson structure on N.

* (Footnote to the integrability of D = A(ker A′).) This is true in general, provided that for any two sections s, t such that A′(s) = 0 and A′(t) = 0 we have {s, t}′ = 0. It is evident from (11) that this condition holds in our case.

If we denote by $\varphi = \sum_{k=1}^{d} \left(\varphi^k_1\, da^k_1 + \varphi^k_2\, da^k_2 + \varphi^k_3\, da^k_3\right)$ the generic one-form on N, an easy calculation shows that an extension α = ext(ϕ) has the form

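$$\alpha^k = \begin{pmatrix}
\varphi^k_1 & \varphi^k_3 q^{k+2}_3 + q^{k+1}_2 \varphi^k_2 & \varphi^k_2 q^{k+1}_3 \\
\varphi^{k-1}_2 q^{k-1}_4 & \alpha^k_5 & q^{k+1}_3 \varphi^{k-1}_3 q^{k-1}_4 \\
\varphi^{k-2}_3 q^{k-2}_4 + q^{k-1}_7 \varphi^{k-1}_2 & \alpha^k_8 & \alpha^k_9
\end{pmatrix}, \tag{22}$$

where

$$\alpha^k_9 = q^{k-1}_7 \varphi^{k-1}_3 q^{k+1}_3 - q^k_2 \varphi^{k-2}_3 q^{k-2}_4 + \alpha^{k-1}_5.$$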

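As we said, this matrix is not completely determined: the components $\alpha^k_5$, $\alpha^k_8$ are free, since the extension gives an equivalence class of one-forms. Now we can apply formula (21) to obtain the expression of the reduced Poisson tensor p′:

$$\begin{aligned}
\dot a^k_1 &= \varphi^{k-1}_2 a^{k-1}_2 - \varphi^k_2 a^k_2 + \varphi^{k-2}_3 a^{k-2}_3 - \varphi^k_3 a^k_3, \\
\dot a^k_2 &= \left(\varphi^k_1 - \varphi^{k+1}_1\right) a^k_2 - \varphi^{k+1}_2 a^k_3 + \varphi^{k-1}_2 a^{k-1}_3, \\
\dot a^k_3 &= \left(\varphi^k_1 - \varphi^{k+2}_1\right) a^k_3.
\end{aligned} \tag{23}$$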

Recalling the periodic Toda case [11], to find another Poisson structure we have to determine the reduced relation on the one-forms of N. Therefore we need the expression of the reduced dual pencil of anchors $a^*_\lambda$. To have this, in turn, we have to define a proper reduced vector bundle U based on N on which the reduced anchors act: $a_\lambda: U \to TN$. It is convenient to define first the dual vector bundle $U^*$ and the dual pencil $a^*_\lambda$. The definition of the vector bundle U and the pencil $a_\lambda$ will then follow by duality. A lengthy computation [10] allows us to find all these characters. We are only interested in the reduced relation on one-forms, which turns out to be the following: ϕ ∼ ψ (i.e., $a^*(\varphi) = a'^*(\psi)$) if and only if they satisfy

$$\psi^k_1 - \psi^{k+1}_1 = \varphi^k_1 a^k_1 - \varphi^{k+1}_1 a^{k+1}_1 + \varphi^{k-1}_2 a^{k-1}_2 - \varphi^{k+1}_2 a^{k+1}_2 + \varphi^{k-2}_3 a^{k-2}_3 - \varphi^{k+1}_3 a^{k+1}_3, \tag{24}$$

$$\begin{aligned}
-\psi^{k-2}_2 a^{k-2}_3 + \psi^k_2 a^{k-1}_3 ={}& \varphi^{k-2}_1 a^{k-2}_3 - \varphi^{k+1}_1 a^{k-1}_3 - \varphi^{k-2}_2 a^{k-2}_3 a^{k-1}_1 + \varphi^k_2 a^{k-1}_3 a^k_1 \\
& - \varphi^{k-2}_3 a^{k-2}_3 a^{k-1}_2 + \varphi^{k-1}_3 a^{k-1}_3 a^{k-1}_2,
\end{aligned} \tag{25}$$

$$\begin{aligned}
\psi^k_2 a^k_2 - \psi^{k-1}_2 a^{k-1}_2 + \psi^k_3 a^k_3 - \psi^{k-2}_3 a^{k-2}_3 ={}& \varphi^{k-1}_1 a^{k-1}_2 - \varphi^{k+1}_1 a^k_2 + \varphi^{k-2}_2 a^{k-2}_3 - \varphi^{k+1}_2 a^k_3 \\
& + a^k_1\left(\varphi^k_2 a^k_2 - \varphi^{k-1}_2 a^{k-1}_2\right) + a^k_1\left(\varphi^k_3 a^k_3 - \varphi^{k-2}_3 a^{k-2}_3\right).
\end{aligned} \tag{26}$$

Now we focus on the special feature of the reduced Poisson bi-anchored manifold N. By Equations (24), (25) and (26), for a fixed one-form ϕ there is a whole class of one-forms $[\psi] = \psi + \ker a'^*$ that is related with it. In the reduced structure, though, $\ker a'^* \subset \ker p'$. Therefore the following tensor,

$$p(\varphi) := p'(\psi), \tag{27}$$

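is well defined. A lengthy calculation shows that the bracket induced by this tensor satisfies the Jacobi identity. Furthermore, p is compatible with p′, i.e., the pencil $p_\lambda = p + \lambda p'$ is a Poisson tensor* for all values of the complex parameter λ. Explicitly, $\dot a = p(\varphi)$ reads

$$\begin{aligned}
\dot a^k_1 ={}& a^k_1\left(\varphi^{k-1}_2 a^{k-1}_2 - \varphi^k_2 a^k_2 + \varphi^{k-2}_3 a^{k-2}_3 - \varphi^k_3 a^k_3\right) \\
& + a^k_2 \varphi^{k+1}_1 - a^{k-1}_2 \varphi^{k-1}_1 + a^k_3 \varphi^{k+1}_2 - a^{k-2}_3 \varphi^{k-2}_2, \\
\dot a^k_2 ={}& a^k_1\left(\varphi^k_1 a^k_2 + \varphi^{k-1}_2 a^{k-1}_3\right) - a^{k+1}_1\left(\varphi^{k+1}_1 a^k_2 + \varphi^{k+1}_2 a^k_3\right) + \varphi^{k+2}_1 a^k_3 - \varphi^{k-1}_1 a^{k-1}_3 \\
& + a^k_2\left(\varphi^{k-1}_2 a^{k-1}_2 - \varphi^{k+1}_2 a^{k+1}_2 + \varphi^{k-2}_3 a^{k-2}_3 + \varphi^{k-1}_3 a^{k-1}_3 - \varphi^k_3 a^k_3 - \varphi^{k+1}_3 a^{k+1}_3\right), \\
\dot a^k_3 ={}& a^k_3\left(\varphi^k_1 a^k_1 - \varphi^{k+2}_1 a^{k+2}_1 + \varphi^{k-1}_2 a^{k-1}_2 + \varphi^k_2 a^k_2 - \varphi^{k+1}_2 a^{k+1}_2 - \varphi^{k+2}_2 a^{k+2}_2\right. \\
& \left. {} + \varphi^{k-2}_3 a^{k-2}_3 + \varphi^{k-1}_3 a^{k-1}_3 - \varphi^{k+1}_3 a^{k+1}_3 - \varphi^{k+2}_3 a^{k+2}_3\right).
\end{aligned} \tag{28}$$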

Our goal has been achieved: we arrived at a bi-Hamiltonian manifold by means of a systematic procedure of reduction of the original Poisson bi-anchored manifold. Now we have to investigate the information provided by the bi-Hamiltonian structure.

4. The Toda3 System

In this section we show how the bi-Hamiltonian structure obtained in the previous section defines specific vector fields and, moreover, accounts for their integrability. To obtain an integrable system we will focus on the Casimirs of the Poisson pencil, i.e., the functions C whose differentials are in the kernel of $P_\lambda$. Indeed, if we expand a Casimir in powers of λ,

$$C = \sum_i H_i \lambda^i, \tag{29}$$

it is immediate to check that the coefficients $H_i$ satisfy the Lenard relations

$$P'(dH_i) = -P(dH_{i+1}). \tag{30}$$
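For completeness, the check behind (30) can be written out in one line. This is a sketch assuming the pencil is written as $P_\lambda = P + \lambda P'$, in agreement with $p_\lambda = p + \lambda p'$ used for the reduced tensors, and that the expansion (29) may be differentiated term by term:

```latex
0 = P_\lambda(dC)
  = \sum_i \lambda^i\, P(dH_i) + \sum_i \lambda^{i+1}\, P'(dH_i)
  = \sum_i \lambda^{i+1} \bigl[ P(dH_{i+1}) + P'(dH_i) \bigr].
```

Since this must hold for every λ, each bracket vanishes separately, which is exactly the relation (30).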

* (Footnote to the compatibility of p and p′ in Section 3.) We recall that a manifold endowed with two compatible Poisson tensors is said to be bi-Hamiltonian.

It is easily shown (see, e.g., [7]) that this in turn implies that the $H_i$'s are in involution with respect to both Poisson brackets. If these coefficients are enough, the system is integrable in the classical sense of Liouville and Arnold [1]. Nevertheless, in general finding the Casimirs of a Poisson pencil is not easy. In the present case we can make use of the following

PROPOSITION 1. If $h_k$ solves

$$h_k h_{k+1} h_{k+2} = \left(a^{k+2}_1 + \lambda\right) h_k h_{k+1} + a^{k+1}_2 h_k + a^k_3, \tag{31}$$

then $C(\lambda) = h_1 \cdots h_d$ is a Casimir of the Poisson pencil (23)–(28). The solutions $h_k$ and thus the Casimirs C can be calculated explicitly as Laurent series in the parameter λ.

Proof. See the appendix. □

We call (31) the characteristic equation. Proposition 1 allows us in principle to calculate the Casimirs of the Poisson pencil $p_\lambda$, but the computation is lengthy. Fortunately there is a shortcut: if in (31) we set

$$h_k = \frac{\psi_{k+1}}{\psi_k}\,\mu, \tag{32}$$

the characteristic equation becomes the linear system

$$0 = \psi_{k+3}\mu^3 - \left(a^{k+2}_1 + \lambda\right)\psi_{k+2}\mu^2 - a^{k+1}_2 \psi_{k+1}\mu - a^k_3 \psi_k. \tag{33}$$

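We can express (33) in matrix form as

$$L\psi = 0, \tag{34}$$

where L is the matrix

$$L = \begin{pmatrix}
\mu^2(a^1_1 + \lambda) & -\mu^3 & 0 & a^{d-1}_3 & \mu a^d_2 \\
\mu a^1_2 & \mu^2(a^2_1 + \lambda) & -\mu^3 & \cdots & a^d_3 \\
a^1_3 & \mu a^2_2 & \ddots & \ddots & 0 \\
0 & \ddots & \ddots & \mu^2(a^{d-1}_1 + \lambda) & -\mu^3 \\
-\mu^3 & 0 & a^{d-2}_3 & \mu a^{d-1}_2 & \mu^2(a^d_1 + \lambda)
\end{pmatrix}$$

and ψ is the vector of the 'homogeneous coordinates'

$$\psi = \begin{pmatrix} \psi_1 \\ \vdots \\ \psi_d \end{pmatrix}.$$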

For (34) to admit nontrivial solutions we must have det L = 0. It can be proved that the cyclicity of the matrix L implies that its determinant is a polynomial of degree 3 in $\mu^d$. Therefore we must have

$$0 = \det L = \pm\mu^{3d} + K_1\mu^{2d} + K_2\mu^{d} + K_3, \tag{35}$$

where $K_1$, $K_2$, $K_3$ are polynomials in λ (in particular, $K_3$ does not depend on λ). By Proposition 1 and Equation (32), for all µ that satisfy (35) we have that $\mu^d = h_1 \cdots h_d$ is a Casimir, thus $K_1$, $K_2$ and $K_3$ are Casimirs as well. Their coefficients provide all information about the geometry of the system at hand.

The family of dynamical systems associated with the Casimirs $K_1$, $K_2$ and $K_3$ is what we call Toda3. It can be proved that they coincide with Kupershmidt's reduction of the discrete KP hierarchy [5]. We will show in an example how the integrability of these systems stems from their bi-Hamiltonian structure.

5. An Example of the Toda3 System

To illustrate how the scheme described above works we consider the specific case where d = 4. This example is easy to handle, but at the same time general enough. In order to make the equations easier to read we will change notation: we set $a^k_1 = b_k$, $a^k_2 = a_k$, $a^k_3 = c_k$, $\varphi^k_1 = \beta_k$, $\varphi^k_2 = \alpha_k$, $\varphi^k_3 = \gamma_k$. Thus our bi-Hamiltonian manifold N becomes $\mathbb{R}^{12}$ with coordinates $(a_1, \ldots, c_4)$. The Poisson pencil $p_\lambda$ associates with a one-form $\sum_{k=1}^{4}(\alpha_k\, da_k + \beta_k\, db_k + \gamma_k\, dc_k)$ the vector field

$$\begin{aligned}
\dot b_k ={}& (b_k + \lambda)(\alpha_{k-1}a_{k-1} - \alpha_k a_k + \gamma_{k+2}c_{k+2} - \gamma_k c_k) \\
& + a_k\beta_{k+1} - a_{k-1}\beta_{k-1} + c_k\alpha_{k+1} - c_{k+2}\alpha_{k+2}, \\
\dot a_k ={}& (c_{k-1}\alpha_{k-1} + a_k\beta_k)(b_k + \lambda) - (c_k\alpha_{k+1} + a_k\beta_{k+1})(b_{k+1} + \lambda) + c_k\beta_{k+2} - \beta_{k-1}c_{k-1} \\
& + a_k(\alpha_{k-1}a_{k-1} - \alpha_{k+1}a_{k+1} + \gamma_{k-1}c_{k-1} - \gamma_k c_k - \gamma_{k+1}c_{k+1} + \gamma_{k+2}c_{k+2}), \\
\dot c_k ={}& c_k\bigl(\beta_k(b_k + \lambda) - \beta_{k+2}(b_{k+2} + \lambda)\bigr) \\
& + c_k(-\alpha_{k+2}a_{k+2} + \alpha_k a_k - \alpha_{k+1}a_{k+1} + \alpha_{k-1}a_{k-1} + \gamma_{k-1}c_{k-1} - \gamma_{k+1}c_{k+1}),
\end{aligned} \tag{36}$$

where k = 1, 2, 3, 4. Equation (35) provides the Casimirs of the Poisson pencil (36). Indeed, from that equation one expects to find only three Casimirs of the whole pencil. The interesting feature of this approach is that in fact we obtain four Casimirs, which prove to be enough to guarantee integrability. The three coefficients are

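$$K_1(\lambda) = \lambda^4 + \lambda^3 C_1 + \lambda^2 H_1 + \lambda H_2 + H_3, \qquad K_2(\lambda) = \lambda^2 C_2 + \lambda C_3 + H_4, \qquad K_3(\lambda) = C_4,$$

where

$$\begin{aligned}
C_1 ={}& b_1 + b_2 + b_3 + b_4, \\
H_1 ={}& a_1 + a_2 + a_3 + a_4 + b_1b_2 + b_2b_3 + b_3b_4 + b_4b_1 + b_1b_3 + b_2b_4, \\
H_2 ={}& c_1 + c_2 + c_3 + c_4 + b_1b_2b_3 + b_2b_3b_4 + b_3b_4b_1 + b_4b_1b_2 \\
& + b_1(a_2 + a_3) + b_2(a_3 + a_4) + b_3(a_4 + a_1) + b_4(a_1 + a_2), \\
H_3 ={}& b_1b_2b_3b_4 + b_1c_2 + b_2c_3 + b_3c_4 + b_4c_1 + a_1a_3 + a_2a_4 \\
& + b_1b_2a_3 + b_2b_3a_4 + b_3b_4a_1 + b_4b_1a_2, \\
C_2 ={}& -c_2c_4 - c_1c_3, \\
C_3 ={}& -b_1c_2c_4 - b_2c_3c_1 - b_3c_4c_2 - b_4c_1c_3 + c_1a_3a_4 + c_2a_4a_1 + c_3a_1a_2 + c_4a_2a_3, \\
H_4 ={}& -a_1a_2a_3a_4 + b_1a_2a_3c_4 + b_2a_3a_4c_1 + b_3a_4a_1c_2 + b_4a_1a_2c_3 \\
& + a_1c_2c_3 + a_2c_3c_4 + a_3c_4c_1 + a_4c_1c_2 - b_2b_4c_1c_3 - b_1b_3c_2c_4, \\
C_4 ={}& c_1c_2c_3c_4.
\end{aligned}$$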
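Some of the coefficients listed above can be recovered directly from det L. The following sketch builds the cyclic matrix L for d = 4 with SymPy (the symbol names `b1..b4`, `a1..a4`, `c1..c4` stand for $a^k_1$, $a^k_2$, $a^k_3$ and are chosen here for illustration), expands the determinant as a polynomial in µ, and extracts the coefficients of $\mu^8$, $\mu^4$ and $\mu^0$. For d = 4 the leading term in (35) comes out with the minus sign.

```python
import sympy as sp

d = 4
mu, lam = sp.symbols("mu lambda")
b = sp.symbols("b1:5")   # b_k = a^k_1
a = sp.symbols("a1:5")   # a_k = a^k_2
c = sp.symbols("c1:5")   # c_k = a^k_3

# Row j encodes Equation (33) at site j, with cyclic column indices.
L = sp.zeros(d, d)
for j in range(d):
    L[j, j] = mu**2 * (b[j] + lam)
    L[j, (j + 1) % d] = -mu**3
    L[j, (j - 1) % d] = mu * a[(j - 1) % d]
    L[j, (j - 2) % d] = c[(j - 2) % d]

det_poly = sp.Poly(sp.expand(L.det()), mu)
powers = sorted(m[0] for m in det_poly.monoms())      # powers of mu that occur
K1, K2, K3 = (det_poly.nth(n) for n in (8, 4, 0))     # coefficients in (35)

C1 = sp.Poly(K1, lam).nth(3)   # coefficient of lambda^3 in K1
H1 = sp.Poly(K1, lam).nth(2)   # coefficient of lambda^2 in K1
C2 = sp.Poly(K2, lam).nth(2)   # coefficient of lambda^2 in K2
print(powers, C1, H1, C2, K3)
```

Only multiples of d = 4 appear among the powers of µ, which illustrates the statement that det L is a cubic polynomial in $\mu^d$.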

Nonetheless, the second coefficient $K_2(\lambda)$ is itself composed of two independent Casimirs of the pencil: $\lambda^2 C_2$ and $\lambda C_3 + H_4$. Therefore we obtain the four Casimirs:

$$K_1(\lambda) = \lambda^4 + \lambda^3 C_1 + \lambda^2 H_1 + \lambda H_2 + H_3, \qquad K_2'(\lambda) = C_2, \qquad K_2''(\lambda) = \lambda C_3 + H_4, \qquad K_3(\lambda) = C_4.$$

In the theory of Gelfand–Zakharevich the important objects are these eight functions $C_1, \ldots, C_4, H_1, \ldots, H_4$. Since the K's are Casimir functions of the Poisson pencil, we must have:

$$p'(dH_1) = -p(dC_1), \quad p'(dH_2) = -p(dH_1), \quad p'(dH_3) = -p(dH_2), \quad p'(dH_4) = -p(dC_3) \tag{37}$$

and

$$0 = p'(dC_1) = p'(dC_2) = p'(dC_3) = p'(dC_4) = p(dC_2) = p(dH_3) = p(dC_4) = p(dH_4). \tag{38}$$

The 4-particle periodic Toda3 system consists of the vector fields $X_1, \ldots, X_4$ defined as follows:

$$X_1 = p'(dH_1), \quad X_2 = p'(dH_2), \quad X_3 = p'(dH_3), \quad X_4 = p'(dH_4).$$

The explicit expression of these four vector fields is

$$\begin{aligned}
X_1:\quad \dot b_1 ={}& a_4 - a_1, \\
\dot a_1 ={}& a_1(b_2 - b_1) - c_1 + c_4, \\
\dot c_1 ={}& c_1(b_3 - b_1), \\
X_2:\quad \dot b_1 ={}& c_3 - c_1 - (b_3 + b_4)a_1 + (b_2 + b_3)a_4, \\
\dot a_1 ={}& a_1(-b_4b_1 - b_1b_3 + b_2b_3 + b_4b_2 - a_4 + a_2) - c_1(b_4 + b_1) + c_4(b_2 + b_3), \\
\dot c_1 ={}& c_1(-b_4b_1 - b_1b_2 + b_2b_3 + b_3b_4 - a_4 - a_1 + a_2 + a_3), \\
X_3:\quad \dot b_1 ={}& -b_4c_1 + b_2c_3 - (a_3 + b_3b_4)a_1 + (a_2 + b_2b_3)a_4, \\
\dot a_1 ={}& a_1(-b_3b_4b_1 + b_2b_3b_4 - b_3a_4 - b_1a_3 + b_2a_3 + b_4a_2 - c_3 + c_2) \\
& - c_1(a_4 + b_4b_1) + c_4(a_2 + b_2b_3), \\
\dot c_1 ={}& c_1(-b_4b_1b_2 + b_2b_3b_4 - b_4a_1 - b_2a_4 + b_2a_3 + b_4a_2 - c_4 + c_2), \\
X_4:\quad \dot b_1 ={}& -c_1c_4a_3 + c_3c_4a_2, \\
\dot a_1 ={}& -c_4b_1c_1a_3 - c_4c_1c_3 + c_1b_2c_4a_3 + c_1c_4c_2, \\
\dot c_1 ={}& c_1(-a_4a_1c_2 + b_1c_4c_2 + a_2a_3c_4 - b_3c_2c_4).
\end{aligned}$$

Of course, in the above formulas the cyclic condition holds and yields the other components. Due to (37) these vector fields are bi-Hamiltonian. In order to show that they are integrable we choose a symplectic leaf of the Poisson tensor p′. It can be shown that this leaf is given by

$$C_1 = \text{constant}, \quad C_2 = \text{constant}, \quad C_3 = \text{constant}, \quad C_4 = \text{constant}.$$

Therefore, the leaf is an eight-dimensional symplectic submanifold of the twelve-dimensional original phase space. As we said in Section 4, the functions $H_1, \ldots, H_4$ commute with respect to both Poisson brackets. Therefore their restrictions to the symplectic leaf also commute and, since they can be checked to be functionally independent, constitute an integrable system.
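A few of the relations of this section can be tested symbolically. The sketch below implements the pencil (36) for d = 4 with SymPy (function and variable names are illustrative, not taken from the paper), reads off p as the value at λ = 0 and p′ as the coefficient of λ, and checks three things: that $C_1$ and $C_4$ are Casimirs in the sense of (38), that $p'(dH_1) = -p(dC_1)$ as in (37), and that $p'(dH_1)$ reproduces the field $X_1$ once its first component is extended cyclically.

```python
import sympy as sp

d = 4
lam = sp.Symbol("lambda")
b = sp.symbols("b1:5"); a = sp.symbols("a1:5"); c = sp.symbols("c1:5")

def pencil(f):
    """Vector field (36) applied to df, returned as [bdot_1..4, adot_1..4, cdot_1..4]."""
    be = [sp.diff(f, x) for x in b]   # beta_k  = df/db_k
    al = [sp.diff(f, x) for x in a]   # alpha_k = df/da_k
    ga = [sp.diff(f, x) for x in c]   # gamma_k = df/dc_k
    bdot, adot, cdot = [], [], []
    for k in range(d):
        m, p, pp = (k - 1) % d, (k + 1) % d, (k + 2) % d
        bdot.append((b[k] + lam) * (al[m]*a[m] - al[k]*a[k] + ga[pp]*c[pp] - ga[k]*c[k])
                    + a[k]*be[p] - a[m]*be[m] + c[k]*al[p] - c[pp]*al[pp])
        adot.append((c[m]*al[m] + a[k]*be[k]) * (b[k] + lam)
                    - (c[k]*al[p] + a[k]*be[p]) * (b[p] + lam)
                    + c[k]*be[pp] - be[m]*c[m]
                    + a[k] * (al[m]*a[m] - al[p]*a[p] + ga[m]*c[m]
                              - ga[k]*c[k] - ga[p]*c[p] + ga[pp]*c[pp]))
        cdot.append(c[k] * (be[k]*(b[k] + lam) - be[pp]*(b[pp] + lam))
                    + c[k] * (-al[pp]*a[pp] + al[k]*a[k] - al[p]*a[p]
                              + al[m]*a[m] + ga[m]*c[m] - ga[p]*c[p]))
    return [sp.expand(x) for x in bdot + adot + cdot]

def p(f):        # the tensor p: value of the pencil at lambda = 0
    return [x.subs(lam, 0) for x in pencil(f)]

def p_prime(f):  # the tensor p': coefficient of lambda in the pencil
    return [sp.expand(sp.diff(x, lam)) for x in pencil(f)]

C1 = sum(b)
H1 = sum(a) + sum(b[i]*b[j] for i in range(d) for j in range(i + 1, d))
C4 = c[0]*c[1]*c[2]*c[3]

# X_1 from the text, extended to all sites by the cyclic condition
X1 = ([a[(k - 1) % d] - a[k] for k in range(d)]
      + [a[k]*(b[(k + 1) % d] - b[k]) - c[k] + c[(k - 1) % d] for k in range(d)]
      + [c[k]*(b[(k + 2) % d] - b[k]) for k in range(d)])

lenard_residual = [sp.expand(u + v) for u, v in zip(p_prime(H1), p(C1))]
x1_residual = [sp.expand(u - v) for u, v in zip(p_prime(H1), X1)]
print(lenard_residual, x1_residual)
```

The remaining relations in (37) and (38) can be examined with the same two helper functions; they are not asserted here.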

6. Conclusions

In this paper we showed that by reducing a special Poisson bi-anchored manifold, namely the set of maps from $\mathbb{Z}_d$ to GL(3, R), we obtain a new bi-Hamiltonian structure. This bi-Hamiltonian structure gives rise to an integrable system, which we called Toda3 and which already appeared in the work of Kupershmidt [5]. This system represents a generalization of the periodic Toda lattice (which corresponds to GL(2, R)).

There are several further developments of this approach. First of all, it is easy to endow the set of maps from $\mathbb{Z}_d$ to GL(n, R) for a generic n ∈ N with the structure of a bi-anchored Poisson manifold. It is possible to show that the reduction of these manifolds gives rise to other Toda systems, which are the discrete analog of the Gelfand–Dickey hierarchies [3].

Secondly (see [8, 10]), the study of the conservation laws of the periodic Toda lattice allows one to define the discrete analog [5] of the KP equations on the Sato Grassmannian [13]. These represent flows on an infinite-dimensional phase space that admit invariant submanifolds. These submanifolds are the different phase spaces of the Toda system, and the restrictions of the KP equations to these phase spaces are the Toda equations. In this way it is possible to extend to the discrete case the description given for the continuous case in [2], where the KdV hierarchy and the (usual) KP equations are considered.


Appendix. Proof of Proposition 1

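We will split the proof into three parts.

(1) We are looking for Casimir functions, i.e. exact one-forms in the kernel of the Poisson pencil $p_\lambda$, which we rewrite as follows:

$$\begin{aligned}
\dot a^k_1 ={}& \left(a^k_1 + \lambda\right)\left(\varphi^{k-1}_2 a^{k-1}_2 - \varphi^k_2 a^k_2 + \varphi^{k-2}_3 a^{k-2}_3 - \varphi^k_3 a^k_3\right) \\
& - \varphi^{k-1}_1 a^{k-1}_2 + \varphi^{k+1}_1 a^k_2 - \varphi^{k-2}_2 a^{k-2}_3 + \varphi^{k+1}_2 a^k_3, \\
\dot a^k_2 ={}& \varphi^{k-1}_2 a^{k-1}_3\left(a^k_1 + \lambda\right) - \varphi^{k+1}_2 a^k_3\left(a^{k+1}_1 + \lambda\right) \\
& - \varphi^{k-1}_1 a^{k-1}_3 + \varphi^{k+2}_1 a^k_3 + \varphi^{k-1}_3 a^{k-1}_3 a^k_2 - \varphi^k_3 a^k_3 a^k_2 \\
& + a^k_2\left(\varphi^k_1\left(a^k_1 + \lambda\right) - \varphi^{k+1}_1\left(a^{k+1}_1 + \lambda\right) + \varphi^{k-1}_2 a^{k-1}_2 - \varphi^{k+1}_2 a^{k+1}_2 + \varphi^{k-2}_3 a^{k-2}_3 - \varphi^{k+1}_3 a^{k+1}_3\right), \\
\dot a^k_3 ={}& a^k_3\left(\varphi^k_1\left(a^k_1 + \lambda\right) - \varphi^{k+1}_1\left(a^{k+1}_1 + \lambda\right) + \varphi^{k-1}_2 a^{k-1}_2 - \varphi^{k+1}_2 a^{k+1}_2 + \varphi^{k-2}_3 a^{k-2}_3 - \varphi^{k+1}_3 a^{k+1}_3\right) \\
& + a^k_3\left(\varphi^{k+1}_1\left(a^{k+1}_1 + \lambda\right) - \varphi^{k+2}_1\left(a^{k+2}_1 + \lambda\right) + \varphi^k_2 a^k_2 - \varphi^{k+2}_2 a^{k+2}_2 + \varphi^{k-1}_3 a^{k-1}_3 - \varphi^{k+2}_3 a^{k+2}_3\right).
\end{aligned} \tag{39}$$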

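From this expression, it is straightforward to see that in order for the one-form ϕ to be in the kernel of the Poisson pencil it is enough that it satisfies the following equations:

$$\begin{aligned}
0 ={}& \left(a^k_1 + \lambda\right)\left(\varphi^{k-1}_2 a^{k-1}_2 - \varphi^k_2 a^k_2 + \varphi^{k-2}_3 a^{k-2}_3 - \varphi^k_3 a^k_3\right) \\
& - \varphi^{k-1}_1 a^{k-1}_2 + \varphi^{k+1}_1 a^k_2 - \varphi^{k-2}_2 a^{k-2}_3 + \varphi^{k+1}_2 a^k_3, \\
0 ={}& \varphi^{k-1}_2 a^{k-1}_3\left(a^k_1 + \lambda\right) - \varphi^{k+1}_2 a^k_3\left(a^{k+1}_1 + \lambda\right) \\
& - \varphi^{k-1}_1 a^{k-1}_3 + \varphi^{k+2}_1 a^k_3 + \varphi^{k-1}_3 a^{k-1}_3 a^k_2 - \varphi^k_3 a^k_3 a^k_2,
\end{aligned} \tag{40}$$

$$0 = \varphi^k_1\left(a^k_1 + \lambda\right) - \varphi^{k+1}_1\left(a^{k+1}_1 + \lambda\right) + \varphi^{k-1}_2 a^{k-1}_2 - \varphi^{k+1}_2 a^{k+1}_2 + \varphi^{k-2}_3 a^{k-2}_3 - \varphi^{k+1}_3 a^{k+1}_3. \tag{41}$$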

A change of variables reduces this system to a single equation of Riccati type. Indeed, let $h_k$ be any solution of what we call the characteristic equation, which is of Riccati type:

$$h_k h_{k+1} h_{k+2} = \left(a^{k+2}_1 + \lambda\right) h_k h_{k+1} + a^{k+1}_2 h_k + a^k_3. \tag{42}$$

Let us set

$$\varphi^{k+2}_1 = h_k h_{k+1} \rho_k, \qquad \varphi^{k+1}_2 = h_k \rho_k, \qquad \varphi^k_3 = \rho_k.$$

Let us choose the multiplier ρ in such a way that the following equation holds:

$$\varphi^k_1\left(a^k_1 + \lambda\right) + \varphi^{k-1}_2 a^{k-1}_2 + \varphi^k_2 a^k_2 + \varphi^{k-2}_3 a^{k-2}_3 + \varphi^{k-1}_3 a^{k-1}_3 + \varphi^k_3 a^k_3 = L,$$

where L is any constant different from zero.


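(2) Such a one-form ϕ solves (40) and therefore it is an element of the kernel of the Poisson pencil $p_\lambda$. Indeed, let us consider a one-form ϕ that satisfies the system:

$$\begin{aligned}
& h_k h_{k+1} h_{k+2} = \left(a^{k+2}_1 + \lambda\right) h_k h_{k+1} + a^{k+1}_2 h_k + a^k_3, \\
& \varphi^k_1 = h_{k-2} h_{k-1} \varphi^{k-2}_3, \qquad \varphi^k_2 = h_{k-1}\varphi^{k-1}_3, \\
& L = \varphi^k_1\left(a^k_1 + \lambda\right) + \varphi^{k-1}_2 a^{k-1}_2 + \varphi^k_2 a^k_2 + \varphi^{k-2}_3 a^{k-2}_3 + \varphi^{k-1}_3 a^{k-1}_3 + \varphi^k_3 a^k_3.
\end{aligned} \tag{43}$$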

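We will show that ϕ satisfies the following equations:

$$\begin{aligned}
0 ={}& \left(a^k_1 + \lambda\right)\left(\varphi^{k-1}_2 a^{k-1}_2 - \varphi^k_2 a^k_2 + \varphi^{k-2}_3 a^{k-2}_3 - \varphi^k_3 a^k_3\right) \\
& - \varphi^{k-1}_1 a^{k-1}_2 + \varphi^{k+1}_1 a^k_2 - \varphi^{k-2}_2 a^{k-2}_3 + \varphi^{k+1}_2 a^k_3, \\
0 ={}& \varphi^{k-1}_2 a^{k-1}_3\left(a^k_1 + \lambda\right) - \varphi^{k+1}_2 a^k_3\left(a^{k+1}_1 + \lambda\right) \\
& - \varphi^{k-1}_1 a^{k-1}_3 + \varphi^{k+2}_1 a^k_3 + \varphi^{k-1}_3 a^{k-1}_3 a^k_2 - \varphi^k_3 a^k_3 a^k_2, \\
0 ={}& \varphi^k_1\left(a^k_1 + \lambda\right) - \varphi^{k+1}_1\left(a^{k+1}_1 + \lambda\right) + \varphi^{k-1}_2 a^{k-1}_2 - \varphi^{k+1}_2 a^{k+1}_2 \\
& + \varphi^{k-2}_3 a^{k-2}_3 - \varphi^{k+1}_3 a^{k+1}_3.
\end{aligned} \tag{44}$$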

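We immediately see that ϕ fulfills the third equation of (44): just compare with the last equation of (43). Before we verify that the two other equations are satisfied as well, we need a formula. Namely, we have*

$$h_{k-3} h_{k-2} h_{k-1} \varphi^{k-3}_3 - h_{k-2} h_{k-1}\left(a^k_1 + \lambda\right)\varphi^{k-2}_3 = h_{k-1}\varphi^{k-1}_3 a^k_2 + \varphi^k_3 a^k_3. \tag{45}$$

* From the last equation of (43) we obtain

$$\begin{aligned}
L ={}& \varphi^{k-2}_3 h_{k-2} h_{k-1}\left(a^k_1 + \lambda\right) + h_{k-2}\varphi^{k-2}_3 a^{k-1}_2 + h_{k-1}\varphi^{k-1}_3 a^k_2 + \varphi^{k-2}_3 a^{k-2}_3 + \varphi^{k-1}_3 a^{k-1}_3 + \varphi^k_3 a^k_3 \\
={}& \varphi^{k-2}_3\left(h_{k-2} h_{k-1}\left(a^k_1 + \lambda\right) + h_{k-2} a^{k-1}_2 + a^{k-2}_3\right) + h_{k-1}\varphi^{k-1}_3 a^k_2 + \varphi^{k-1}_3 a^{k-1}_3 + \varphi^k_3 a^k_3 \\
={}& \varphi^{k-2}_3\left(h_{k-2} h_{k-1} h_k\right) + \varphi^{k-1}_3\left(h_{k-1} a^k_2 + a^{k-1}_3\right) + \varphi^k_3 a^k_3.
\end{aligned}$$

This implies

$$\varphi^{k-2}_3\left(h_{k-2} h_{k-1} h_k\right) = \varphi^{k-1}_3\left(h_{k-1} h_k h_{k+1} - h_{k-1} a^k_2 - a^{k-1}_3\right) + \varphi^k_3\left(h_k a^{k+1}_2\right) + \varphi^{k+1}_3 a^{k+1}_3$$

and the result follows using the characteristic equation

$$h_{k-1} h_k h_{k+1} - a^k_2 h_{k-1} - a^{k-1}_3 = \left(a^{k+1}_1 + \lambda\right) h_{k-1} h_k.$$

We will show now that ϕ satisfies the first equation of (44):

$$\begin{aligned}
0 ={}& \left(a^k_1 + \lambda\right)\left(\varphi^{k-1}_2 a^{k-1}_2 - \varphi^k_2 a^k_2 + \varphi^{k-2}_3 a^{k-2}_3 - \varphi^k_3 a^k_3\right) \\
& - \varphi^{k-1}_1 a^{k-1}_2 + \varphi^{k+1}_1 a^k_2 - \varphi^{k-2}_2 a^{k-2}_3 + \varphi^{k+1}_2 a^k_3.
\end{aligned} \tag{46}$$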


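We replace in this expression the values of $\varphi^k_1$, $\varphi^k_2$ obtained from (43), and we multiply both sides by $h_{k-2} h_{k-1}$. We obtain that (46) holds if and only if

$$\begin{aligned}
0 ={}& h_{k-2} h_{k-1}\left(a^k_1 + \lambda\right)\left(h_{k-2}\varphi^{k-2}_3 a^{k-1}_2 - h_{k-1}\varphi^{k-1}_3 a^k_2 + \varphi^{k-2}_3 a^{k-2}_3 - \varphi^k_3 a^k_3\right) \\
& - h_{k-2} h_{k-1} h_{k-3} h_{k-2}\varphi^{k-3}_3 a^{k-1}_2 + h_{k-2} h_{k-1} h_{k-1} h_k \varphi^{k-1}_3 a^k_2 \\
& - h_{k-2} h_{k-1} h_{k-3}\varphi^{k-3}_3 a^{k-2}_3 + h_{k-2} h_{k-1} h_k \varphi^k_3 a^k_3.
\end{aligned}$$

Rearranging the terms in the last expression, we end up having to prove that

$$0 = \det\begin{pmatrix}
h_{k-3} h_{k-2} h_{k-1}\varphi^{k-3}_3 - h_{k-2} h_{k-1}\left(a^k_1 + \lambda\right)\varphi^{k-2}_3 & h_{k-1}\varphi^{k-1}_3 a^k_2 + \varphi^k_3 a^k_3 \\
h_{k-2} h_{k-1} h_k - h_{k-2} h_{k-1}\left(a^k_1 + \lambda\right) & h_{k-2} a^{k-1}_2 + a^{k-2}_3
\end{pmatrix}$$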

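but this is true, since the characteristic equation and (45) hold: the two entries in each row coincide. We will show now that ϕ satisfies the second equation of (44):

$$\begin{aligned}
0 ={}& \varphi^{k-1}_2 a^{k-1}_3\left(a^k_1 + \lambda\right) - \varphi^{k+1}_2 a^k_3\left(a^{k+1}_1 + \lambda\right) \\
& - \varphi^{k-1}_1 a^{k-1}_3 + \varphi^{k+2}_1 a^k_3 + \varphi^{k-1}_3 a^{k-1}_3 a^k_2 - \varphi^k_3 a^k_3 a^k_2.
\end{aligned} \tag{47}$$

We replace in this expression the values of $\varphi^k_1$, $\varphi^k_2$ obtained from (43), and we multiply both sides by $h_{k-1}$. We obtain that (47) holds if and only if:

$$\begin{aligned}
0 ={}& h_{k-1} h_{k-2}\varphi^{k-2}_3 a^{k-1}_3\left(a^k_1 + \lambda\right) - h_{k-1} h_k \varphi^k_3 a^k_3\left(a^{k+1}_1 + \lambda\right) \\
& - h_{k-1} h_{k-3} h_{k-2}\varphi^{k-3}_3 a^{k-1}_3 + h_{k-1} h_k h_{k+1}\varphi^k_3 a^k_3 \\
& + h_{k-1}\varphi^{k-1}_3 a^{k-1}_3 a^k_2 - h_{k-1}\varphi^k_3 a^k_3 a^k_2.
\end{aligned}$$

Rearranging the terms in the last expression, we end up having to prove that

$$0 = \det\begin{pmatrix}
a^{k-1}_3 & h_{k-1} h_k h_{k+1} - h_{k-1} h_k\left(a^{k+1}_1 + \lambda\right) - h_{k-1} a^k_2 \\
\varphi^k_3 a^k_3 & h_{k-1} h_{k-3} h_{k-2}\varphi^{k-3}_3 - h_{k-1} h_{k-2}\varphi^{k-2}_3\left(a^k_1 + \lambda\right) - h_{k-1}\varphi^{k-1}_3 a^k_2
\end{pmatrix}$$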

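but this is true, since the characteristic equation and (45) hold.

(3) Furthermore, ϕ is exact. The system

$$\begin{aligned}
& h_k h_{k+1} h_{k+2} = \left(a^{k+2}_1 + \lambda\right) h_k h_{k+1} + a^{k+1}_2 h_k + a^k_3, \\
& \varphi^k_1 = h_{k-2} h_{k-1}\varphi^{k-2}_3, \qquad \varphi^k_2 = h_{k-1}\varphi^{k-1}_3, \\
& L = \varphi^k_1\left(a^k_1 + \lambda\right) + \varphi^{k-1}_2 a^{k-1}_2 + \varphi^k_2 a^k_2 + \varphi^{k-2}_3 a^{k-2}_3 + \varphi^{k-1}_3 a^{k-1}_3 + \varphi^k_3 a^k_3
\end{aligned}$$

is equivalent to

$$\begin{aligned}
& h_k h_{k+1} h_{k+2} = \left(a^{k+2}_1 + \lambda\right) h_k h_{k+1} + a^{k+1}_2 h_k + a^k_3, \\
& \varphi^k_1 = h_{k-2} h_{k-1}\varphi^{k-2}_3, \qquad \varphi^k_2 = h_{k-1}\varphi^{k-1}_3, \\
& 0 = L - \varphi^k_3 a^k_3 + \varphi^k_2\left(a^{k+1}_1 + \lambda\right) h_k - \varphi^k_1 h_k - \varphi^k_2 h_k h_{k+1}.
\end{aligned}$$

Combining the first and the fourth equations we obtain

$$\varphi^k_3\left(a^{k+2}_1 + \lambda\right) h_{k+1} + \varphi^k_3 a^{k+1}_2 = -\frac{L}{h_k} + \varphi^k_3 h_{k+1} h_{k+2} - \varphi^k_2\left(a^{k+1}_1 + \lambda\right) + \varphi^k_1 + \varphi^k_2 h_{k+1}.$$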


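We will use this in the next calculation. We move on to evaluating

$$\begin{aligned}
& \sum_{k=1}^{d}\left(\varphi^{k+2}_1 \dot a^{k+2}_1 + \varphi^{k+1}_2 \dot a^{k+1}_2 + \varphi^k_3 \dot a^k_3\right) \\
&= \sum_{k=1}^{d}\left(h_k h_{k+1}\varphi^k_3 \dot a^{k+2}_1 + h_k\varphi^k_3 \dot a^{k+1}_2 + \varphi^k_3 \dot a^k_3\right) \\
&= \sum_{k=1}^{d}\varphi^k_3\left(h_k h_{k+1}\dot a^{k+2}_1 + h_k \dot a^{k+1}_2 + \dot a^k_3\right) \\
&= \sum_{k=1}^{d}\varphi^k_3\left(\dot h_k h_{k+1} h_{k+2} + h_k \dot h_{k+1} h_{k+2} + h_k h_{k+1}\dot h_{k+2} - \left(a^{k+2}_1 + \lambda\right)\dot h_k h_{k+1} - \left(a^{k+2}_1 + \lambda\right) h_k \dot h_{k+1} - a^{k+1}_2 \dot h_k\right) \\
&= \sum_{k=1}^{d}\left(\varphi^k_3 \dot h_k h_{k+1} h_{k+2} + \varphi^k_3 h_k \dot h_{k+1} h_{k+2} + \varphi^k_3 h_k h_{k+1}\dot h_{k+2} - \varphi^k_3\left(a^{k+2}_1 + \lambda\right)\dot h_k h_{k+1} - \varphi^k_3\left(a^{k+2}_1 + \lambda\right) h_k \dot h_{k+1} - \varphi^k_3 a^{k+1}_2 \dot h_k\right) \\
&= \sum_{k=1}^{d}\left(\varphi^k_3 \dot h_k h_{k+1} h_{k+2} + \varphi^{k+1}_2 \dot h_{k+1} h_{k+2} + \varphi^{k+2}_1 \dot h_{k+2} - \dot h_k\left(\varphi^k_3\left(a^{k+2}_1 + \lambda\right) h_{k+1} + \varphi^k_3 a^{k+1}_2\right) - \varphi^{k+1}_2\left(a^{k+2}_1 + \lambda\right)\dot h_{k+1}\right) \\
&= \sum_{k=1}^{d}\left(\varphi^k_3 \dot h_k h_{k+1} h_{k+2} + \varphi^{k+1}_2 \dot h_{k+1} h_{k+2} + \varphi^{k+2}_1 \dot h_{k+2} - \dot h_k\left(\varphi^k_1 + \varphi^k_2 h_{k+1} + \varphi^k_3 h_{k+1} h_{k+2} - \frac{L}{h_k} - \varphi^k_2\left(a^{k+1}_1 + \lambda\right)\right) - \varphi^{k+1}_2\left(a^{k+2}_1 + \lambda\right)\dot h_{k+1}\right) \\
&= \sum_{k=1}^{d}\left(-\dot h_k\left(\varphi^k_1 + \varphi^k_2 h_{k+1} + \varphi^k_3 h_{k+1} h_{k+2} - \frac{L}{h_k} - \varphi^k_2\left(a^{k+1}_1 + \lambda\right) - \varphi^k_3 h_{k+1} h_{k+2}\right) + \left(\varphi^{k+1}_2 h_{k+2} - \varphi^{k+1}_2\left(a^{k+2}_1 + \lambda\right)\right)\dot h_{k+1} + \varphi^{k+2}_1 \dot h_{k+2}\right) \\
&= \sum_{k=1}^{d}\dot h_k \frac{L}{h_k} = \bigl(L \log(h_1 \cdots h_d)\bigr)^{\bullet}.
\end{aligned}$$

Here the dot denotes the derivative along an arbitrary vector field, the fourth line is obtained by differentiating the characteristic equation, and the terms in the last sum cancel after shifting the summation index cyclically. The last equation follows from L being a constant (independent of both the site k and the variables $a^k_i$). Therefore

$$\sum_{k=1}^{d}\left(\varphi^k_1\, da^k_1 + \varphi^k_2\, da^k_2 + \varphi^k_3\, da^k_3\right) = d\bigl(L \log(h_1 \cdots h_d)\bigr).$$

Thus, ϕ is exact, and $C = h_1 \cdots h_d$ is a Casimir of the Poisson pencil. This ends the proof.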

Acknowledgements

I wish to thank Marco Pedroni for his suggestions and continuous support, as wellas my former advisor, Franco Magri, for useful discussions on the topic.


References

1. Arnold, V. I.: Mathematical Methods of Classical Mechanics, Grad. Texts in Math., Springer, New York, 1989.

2. Falqui, G., Magri, F. and Pedroni, M.: Bihamiltonian geometry, Darboux coverings, and linearization of the KP hierarchy, Comm. Math. Phys. 197 (1998), 303–324.

3. Gelfand, I. M. and Dickey, L. A.: Fractional powers of operators and Hamiltonian systems, Funct. Anal. Appl. 10 (1976), 259–273.

4. Gelfand, I. M. and Zakharevich, I.: On the local geometry of a bi-Hamiltonian structure, In: L. Corvin et al. (eds), The Gelfand Mathematical Seminars 1990–1992, Birkhäuser, Boston, 1993, pp. 51–112.

5. Kupershmidt, B. A.: Discrete Lax equations and differential-difference calculus, Asterisque 123 (1985), 212–245.

6. Mackenzie, K.: Lie Groupoids and Lie Algebroids in Differential Geometry 124, Cambridge Univ. Press, Cambridge, 1987.

7. Magri, F.: Eight lectures on integrable systems, In: Integrability of Nonlinear Systems, Lecture Notes in Phys. 495, Springer, New York, 1997, pp. 256–296.

8. Magri, F.: The bihamiltonian route to Sato Grassmannian, Proc. CRM Conf. on Bispectral Problems, Montreal, CRM Proc. Lecture Notes 14, Amer. Math. Soc., Providence, 1998, pp. 203–209.

9. Marsden, J. E. and Ratiu, T.: Reduction of Poisson manifolds, Lett. Math. Phys. 11 (1986), 161–169.

10. Meucci, A.: The bi-Hamiltonian route to the discrete Sato Grassmannian, PhD thesis, Università Statale di Milano, 1999.

11. Meucci, A.: Compatible Lie algebroids and the periodic Toda lattice, To appear in J. Geom. Phys.

12. Morosi, C. and Pizzocchero, L.: R-matrix theory, formal Casimirs and the periodic Toda lattice, J. Math. Phys. 37 (1996), 4484–4513.

13. Sato, M. and Sato, Y.: Soliton equations as dynamical systems on infinite-dimensional Grassmannian manifolds, In: P. Lax and H. Fujita (eds), Nonlinear PDE's in Applied Sciences (US/Japan Seminar, Tokio), North-Holland, Amsterdam, 1982, pp. 259–271.



Ergodicity for the Randomly Forced 2D Navier–Stokes Equations

SERGEI KUKSIN1 and ARMEN SHIRIKYAN2

1Department of Mathematics, Heriot-Watt University, Edinburgh EH14 4AS, Scotland, U.K. e-mail: [email protected] and Steklov Institute of Mathematics, 8 Gubkina St., 117966 Moscow, Russia.
2Department of Mathematics, Heriot-Watt University, Edinburgh EH14 4AS, Scotland, U.K. e-mail: [email protected] and Institute of Mechanics of MSU, 1 Michurinskii Av., 119899 Moscow, Russia.

(Received: 18 April 2001)

Abstract. We study space-periodic 2D Navier–Stokes equations perturbed by an unbounded random kick-force. It is assumed that Fourier coefficients of the kicks are independent random variables all of whose moments are bounded and that the distributions of the first $N_0$ coefficients (where $N_0$ is a sufficiently large integer) have positive densities against the Lebesgue measure. We treat the equation as a random dynamical system in the space of square integrable divergence-free vector fields. We prove that this dynamical system has a unique stationary measure and study its ergodic properties.

Mathematics Subject Classifications (2000): 37H99, 35Q30.

Key words: Navier–Stokes equations, kick-force, stationary measure, random dynamical system, Ruelle–Perron–Frobenius theorem.

0. Introduction

We continue our study of the randomly forced 2D space-periodic Navier–Stokes system (NS), started in [KS1, KS2]. That is, we consider the equations

$$\dot u - \nu\Delta u + (u, \nabla)u + \nabla p = \eta^\omega(t, x), \qquad \operatorname{div} u = 0, \tag{0.1}$$

where $x \in \mathbb{T}^2 = \mathbb{R}^2/\mathbb{Z}^2$, 0 < ν ≤ 1 is the viscosity, u = u(t, x) is the velocity field, and p = p(t, x) is the pressure. Equations (0.1) are supplemented by the conditions

$$\langle u \rangle \equiv \langle \eta \rangle \equiv 0, \qquad \operatorname{div} \eta = 0.$$

The brackets ⟨·⟩ signify the space averaging. The right-hand side $\eta^\omega$ is a random process with range in the functional space

$$H = \{u \in L^2(\mathbb{T}^2, \mathbb{R}^2) : \operatorname{div} u = 0,\ \langle u \rangle = 0\},$$

and Equations (0.1) define a random dynamical system in H. We provide H with the usual orthonormal basis $\{e_1, e_2, \ldots\}$ formed by the trigonometric vector fields $C_s \begin{pmatrix} -s_2 \\ s_1 \end{pmatrix} \sin(s \cdot x)$ and $C_s \begin{pmatrix} -s_2 \\ s_1 \end{pmatrix} \cos(s \cdot x)$, $s \in \mathbb{Z}^2 \setminus \{0\}$. The $e_j$'s are eigenvectors of the Laplacian, $-\Delta e_j = \alpha_j e_j$. We assume that the eigenvalues $\alpha_j$ are indexed in non-decreasing order.

Figure 1. Evolution defined by (0.1), (0.2).

In [KS1], we consider the NS equations forced by a bounded random kick-force

$$\eta^\omega = \sum_{k \in \mathbb{Z}} \delta(t - kT)\,\eta_k(x), \qquad \eta_k = \sum_{j=1}^{\infty} b_j \xi_{jk} e_j(x), \tag{0.2}$$

where $b_j \ge 0$ are some constants such that

$$b^2 := b_1^2 + b_2^2 + \cdots < \infty,$$

and $\{\xi_{jk}\}$ are independent random variables. It is assumed in [KS1] that the distribution $\mathcal{D}(\xi_{jk})$ of the random variable $\xi_{jk}$ is k-independent and has the form

$$\mathcal{D}(\xi_{jk}) = p_j(r)\, dr \quad \text{for } j \ge 1,\ k \in \mathbb{Z}, \tag{0.3}$$

where the $p_j$'s are Lipschitz continuous functions such that $p_j(0) > 0$, $\operatorname{supp} p_j \subset [-1, 1]$.

Let $\{S_t, t \ge 0\}$ be flow-maps of the free NS equation (0.1) with η ≡ 0. If u(t, x) is a solution for (0.1) with a kick-force (0.2), normalised to be a curve in H that is continuous from the right, then for any integer k and for $t \in [Tk, T(k+1)]$ we have (see Figure 1)

$$u(t) = \begin{cases} S_{t - Tk}(u(Tk)), & t < T(k+1), \\ S_T(u(Tk)) + \eta_k, & t = T(k+1). \end{cases} \tag{0.4}$$

Accordingly, long-time behaviour of solutions for (0.1), (0.2) is described by long-time behaviour of solutions for the following random dynamical system with discrete time:

$$u_k = S(u_{k-1}) + \eta_k, \tag{0.5}$$

where $S = S_T$ and $u_k = u(Tk, \cdot) \in H$.

In [KS1], we show that if relations (0.3) hold with densities $p_j$ as above and

$$b_j \ne 0 \quad \text{for } 1 \le j \le N_0 \tag{0.6}$$

for some finite $N_0 = N_0(\nu, b) \ge 1$, then the random dynamical system (0.5) has in H a unique stationary measure λ. Moreover, if $(u_k, k \ge 0)$ satisfies (0.5) for k > 0 and $u_0 = u$, then

$$\mathcal{D}(u_k) \rightharpoonup \lambda \quad \text{as } k \to \infty \tag{0.7}$$

for any choice of the initial vector u ∈ H.*

We note that if $b_j \ne 0$ for all j ≥ 1 and $\sum b_j^2 < \infty$, then these results apply to Equation (0.1) with any ν > 0, any T > 0 and with arbitrarily large kick-force η as above. See the introduction to [KS1] for discussion of this result and see [G, KS1] for its relations with statistical hydrodynamics.
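The structure of the discrete-time system (0.5) can be illustrated with a toy computation. The sketch below is not the Navier–Stokes map: it assumes that S is replaced by the linear decay $u_j \mapsto e^{-\nu\alpha_j T} u_j$ on finitely many modes (i.e. the nonlinear term is dropped) and that the kicks are Gaussian; the function name and all parameter values are hypothetical.

```python
import numpy as np

def simulate_kicked_modes(u0, alpha, b, nu, T, n_steps, rng):
    """Iterate u_k = S(u_{k-1}) + eta_k, with S replaced by mode-wise linear decay (assumption)."""
    decay = np.exp(-nu * alpha * T)        # stand-in for S = S_T acting on each mode
    u = np.array(u0, dtype=float)
    path = [u.copy()]
    for _ in range(n_steps):
        xi = rng.standard_normal(u.shape)  # kick coefficients xi_jk, independent in j and k
        u = decay * u + b * xi             # eta_k has coefficients b_j * xi_jk
        path.append(u.copy())
    return np.array(path)

# Example usage with three modes; alpha plays the role of the eigenvalues alpha_j
rng = np.random.default_rng(0)
alpha = np.array([1.0, 2.0, 5.0])
b = np.array([1.0, 0.5, 0.25])
path = simulate_kicked_modes(np.zeros(3), alpha, b, nu=0.1, T=1.0, n_steps=1000, rng=rng)
print(path.shape)
```

In this linear stand-in each mode is an autoregressive process, so the existence of a unique stationary law is elementary; the point of the results quoted above is that uniqueness survives for the genuinely nonlinear map S.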

Next, E, Mattingly, and Sinai [EMS] and Bricmont, Kupiainen, and Lefevere [BKL] considered the 2D NS equations perturbed by a white noise force

$$\eta = \sum_{j=1}^{\infty} b_j \dot w_j(t) e_j(x),$$

where $w_1, w_2, \ldots$ are independent standard Brownian motions. Under the assumption that $b_j \ne 0$ for $1 \le j \le N_0(\nu)$ and $b_j = 0$ for j > N with some $\infty > N \ge N_0(\nu)$, they obtained results similar to those reviewed above. We do not discuss these results here, but we mention that, as is shown in [BKL], for almost all initial functions u(0, x) the distribution of a solution converges to the stationary measure exponentially fast.

In this work we study the NS equation with unbounded random kick-forces. That is, we consider Equations (0.1), (0.2), where the independent random variables $\xi_{jk}$ have k-independent distributions as in (0.3), the densities $p_j$ are absolutely continuous and everywhere positive,

$$\int_{-\infty}^{\infty} \left|\frac{\partial p_j(r)}{\partial r}\right| dr < \infty \quad \text{for all } j \ge 1; \qquad p_j(r) > 0 \quad \text{for all } j \ge 1,\ r \in \mathbb{R}, \tag{0.8}$$

and decay at infinity faster than any negative power of r. We consider in fact the following two extreme cases which are allowed by our techniques:

(A) (finite moments) the densities $p_j$ satisfy (0.8) and

$$\int_{-\infty}^{\infty} |r|^m p_j(r)\, dr \le C_m \quad \text{for all } m \ge 1,\ j \ge 1, \tag{0.9}$$

with some fixed constants $C_m$, m ≥ 1.

(B) (finite second exponential moments) the densities $p_j$ satisfy (0.8) and

$$\int_{-\infty}^{\infty} e^{\varkappa_0 r^2} p_j(r)\, dr \le C_0 \quad \text{for any } j \ge 1,$$

with some fixed positive constants $\varkappa_0$ and $C_0$.

We stress that in (A) and (B) it is not assumed that $\int r\, p_j(r)\, dr = 0$.

* In [KS1], we study in fact the system (0.3) restricted to the domain of attainability from zero A, which is a compact subset of H, invariant for (0.3), and prove that the restricted system has a unique stationary measure and satisfies (0.7) for u ∈ A. In the short paper [KS2] we show that this measure is a unique stationary measure for the system in the whole space H and prove that (0.7) holds for any u ∈ H.

For any s > 0, we denote Hs = H ∩ Hs(T2;R2), where Hs(T2;R2) is theSobolev space of order s with the corresponding norm ‖ · ‖s .MAIN THEOREM. Let us assume that condition (A) is satisfied and

∞∑j=1

b2j αsj <∞ for some s > 0.

Then there is an integer N0 < ∞ with the following property: if (0.6) holds, thenthe random dynamical system (0.5) has a unique stationary measure λ such that∫

H

|u|mλ(du) <∞ for all m � 1.

Moreover, the following assertions hold:

(a) λ(H^s) = 1;

(b) if (u_k, k ≥ 0) is a solution of Equation (0.5) with a deterministic initial function u_0 = u, then, for λ-almost all u, convergence (0.7) holds. Moreover,

\[
E f(u_k) \to (\lambda, f) \quad \text{as } k \to \infty, \tag{0.10}
\]

where f is any continuous function on H^s such that |f(u)| ≤ C_1 + C_2‖u‖_s^p for some finite constants C_1, C_2, and p.

(c) if b_j ≠ 0 for all j, then supp λ = H, and convergence (0.10) holds uniformly in u ∈ H^s, ‖u‖_s ≤ R, for any R > 0.

Finally, if condition (B) is also satisfied, then

\[
\int_H e^{\beta |u|^2} \lambda(du) < \infty \quad \text{for some } \beta > 0,
\]

and convergence (0.10) holds for λ-almost all u ∈ H and any function f ∈ C(H^s) such that |f(u)| ≤ C exp(σ‖u‖_s^κ), where the positive constants σ and κ are sufficiently small.

If s > 1, then the delta-function is a continuous functional on the space H^s. Accordingly, if s > 1 and u(t, x) is a solution for (0.1) such that u(0, x) = u_0 ∈ H, then, for λ-almost all u_0 ∈ H, the correlation tensor of the solution E u_i(k, x) u_j(k, y) converges as k → ∞ to the correlation tensor of the measure λ, equal to ∫ u_i(x) u_j(y) λ(du). If u_0 is an arbitrary vector in H, then in this statement the convergence should be replaced by the Cesàro convergence.

The proof of the Main Theorem remains true if condition (A) is replaced by the following weaker assumption with M ≥ 20:

(A_M) the densities p_j satisfy (0.8), and (0.9) holds for m ≤ M and all j ≥ 1.

In this case, the stationary measure λ has M′ ≤ M finite moments, where M′ goes to infinity with M, and (0.10) holds for any continuous functional f: H → R satisfying the inequality |f(u)| ≤ C_1 + C_2‖u‖_s^{M′}.

The proof of the Main Theorem, which occupies Sections 1–5, follows the scheme developed in [KS1] to work with bounded kick-forces. It is based on a Foias–Prodi type reduction of (0.5) to a finite-dimensional abstract Gibbs system which has a unique stationary solution due to a version of the Ruelle–Perron–Frobenius theorem.

In fact, the Main Theorem can be strengthened as follows:

AMPLIFICATION. Under the assumption of the above theorem, convergence (0.10) holds for any u ∈ H, uniformly on bounded subsets of H.

This result can be derived from the Main Theorem (and some intermediate assertions), using the methods of [KS2]. Since the corresponding arguments differ from those used in this work, we shall present them in another publication.

NOTATION

We denote by Z the set of all integers and by Z_0 the set of non-positive integers.

Let X be a topological space. We shall use the following notation.

[B]_X is the closure in the space X of its subset B.

B_X(x, r) is a closed ball in X of radius r centred at x ∈ X.

B(X) is the σ-algebra of Borel subsets of X.

P(X) is the set of probability measures on (X, B(X)).

C(X) is the space of real-valued continuous functions on X.

C_b(X) is the space of bounded functions f ∈ C(X). It is endowed with the supremum-norm ‖f‖_∞.

L^1(X, µ) is the space of Borel functions on X with finite norm

\[
\|f\|_\mu := \int_X |f(x)|\, d\mu(x).
\]

The integral of a function f(x) over the space X with respect to a measure µ will sometimes be denoted by (µ, f):

\[
(\mu, f) = \int_X f(x)\, d\mu(x) = \int_X f\, d\mu.
\]

D(ξ) is the distribution of a random variable ξ.

a ∨ b (a ∧ b) is the maximum (minimum) of real numbers a and b.

We denote by C_i, i = 1, 2, . . . , unessential positive constants.


1. Preliminaries: Equations, Estimates and the Markov Chain

1.1. DESCRIPTION OF THE CLASS OF PROBLEMS IN QUESTION

Let us consider the Navier–Stokes (NS) system (0.1). Applying the L^2-orthogonal projection Π onto the space H of divergence-free vector fields with zero mean value (see the Introduction), we can write this system as

\[
\dot u + \nu L u + B(u, u) = \eta(t, x), \quad x \in \mathbb{T}^2, \quad 0 < \nu \le 1 \tag{1.1}
\]

(for instance, see [CF]). Here u(t) is a two-dimensional vector field with values in the functional space H. The operators L and B have the form

\[
L u = -\Delta u, \qquad B(u, v) = \Pi (u, \nabla) v.
\]

It is assumed that the right-hand side of (1.1) is a kick-force as in the Introduction. To simplify notations, we assume that T = 1. Then η takes the form

\[
\eta(t, x) = \sum_{k=-\infty}^{+\infty} \delta(t - k)\, \eta_k(x), \tag{1.2}
\]

where δ(·) is the Dirac measure and η_k, k ∈ Z, is a sequence of i.i.d. random variables with range in H. We note that if g(t): R → H is a continuous function with compact support, then

\[
\int_{-\infty}^{+\infty} \langle \eta(t), g(t) \rangle\, dt = \sum_{k=-\infty}^{+\infty} \langle \eta_k, g(k) \rangle,
\]

where ⟨·, ·⟩ denotes the scalar product in H.

We now turn to a description of the sequence {η_k}. Let α_1 ≤ α_2 ≤ · · · be eigenvalues of the positive self-adjoint operator L acting in H and let e_j(x), j ≥ 1, be the corresponding eigenfunctions as in the Introduction. We shall assume that the random vector η_k has the form

\[
\eta_k(x) = \sum_{j=1}^{\infty} b_j\, \xi_{jk}\, e_j(x), \tag{1.3}
\]

where {ξ_jk} is a family of independent scalar random variables satisfying condition (A) (see the Introduction), and {b_j} is a sequence of real numbers such that

\[
\sum_{j=1}^{\infty} b_j^2 \alpha_j^s < \infty, \quad s \ge 0. \tag{1.4}
\]

In what follows, we always assume that inequality (1.4) and condition (A) are satisfied. In particular, it follows that

\[
E \|\eta_k\|_s^m < \infty \quad \text{for any } m \ge 1, \tag{1.5}
\]

where ‖·‖_s stands for the sth Sobolev norm:

\[
\|u\|_s = \Bigl( \sum_{j=1}^{\infty} \alpha_j^s |u_j|^2 \Bigr)^{1/2}.
\]

Moreover, if {ξjk} satisfies also condition (B), then E exp(a‖ηk‖2s ) < ∞ for any

constant a > 0 such that

ab2j αsj � %0 for all j � 1.

To simplify notation, we shall write |u| and ‖u‖ instead of ‖u‖_0 and ‖u‖_1, respectively. In what follows, the constants b_j are assumed to be fixed, and we shall not specify dependence of different parameters on them.
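As an illustration of the weighted norm ‖u‖_s used above, the following minimal sketch evaluates it from a finite vector of Fourier coefficients. The function name `sobolev_norm` and the truncation to finitely many modes are assumptions made for the example; they do not come from the paper.

```python
import numpy as np

def sobolev_norm(coeffs, alpha, s):
    """Return (sum_j alpha_j^s |u_j|^2)^(1/2) for a truncated expansion.

    coeffs : coefficients u_j of u in the eigenbasis e_j (finitely many, assumed)
    alpha  : matching eigenvalues alpha_j of L
    s      : Sobolev index; s = 0 gives |u|, s = 1 gives ||u||
    """
    coeffs = np.asarray(coeffs, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    return float(np.sqrt(np.sum(alpha**s * coeffs**2)))
```

With s = 0 the weights disappear and the function returns the L^2-norm |u|; larger s penalises high modes more strongly.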

We now define the notion of a solution for Equation (1.1). For any s ≥ 0 we introduce the space H^s = H ∩ H^s(T^2, R^2) endowed with the norm ‖·‖_s. We note that the operator √L defines an isomorphism H^s → H^{s−1}, s ≥ 1.

Let I ⊂ R be an open interval (which can be of infinite length).

DEFINITION 1.1. A mapping u(t): I → H is called a regular curve if it belongs to L^1_loc(I, H^1) and is continuous at non-integer points of I while at integer points it is continuous from the right and has a limit from the left.

For a Banach space X, let C_0^1(I, X) be the set of continuously differentiable functions f(t): I → X with compact support.

DEFINITION 1.2. A regular curve u(t): I → H is called a solution of Equation (1.1) with a deterministic force of the form (1.2) if the left- and right-hand sides of (1.1) coincide as linear functionals on the space C_0^1(I, H^1). That is,

\[
\int_I \bigl( -\langle u, \dot v \rangle + \nu \langle \sqrt{L}\, u, \sqrt{L}\, v \rangle + \langle B(u, u), v \rangle \bigr)\, dt
= \int_I \langle \eta, v \rangle\, dt = \sum_{k \in \mathbb{Z} \cap I} \langle \eta_k, v(k) \rangle \tag{1.6}
\]

for any v ∈ C_0^1(I, H^1).

A random process u = u^ω(t), t ∈ I, with range in H is called a solution of Equation (1.1) with a random force of the form (1.2), (1.3) if for almost all ω the mapping u^ω(t): I → H is a regular curve satisfying (1.1).

We note that if u(t, x) is a solution of Equation (1.1), then, due to (1.6), we have

\[
u(k, x) - u(k - 0, x) = \eta_k(x) \quad \text{for any integer } k \in I, \tag{1.7}
\]

while on any interval not containing integer points the function u(t, x) satisfies the free Navier–Stokes equations

\[
\dot u + \nu L u + B(u, u) = 0, \quad u(t) \in H. \tag{1.8}
\]


In particular, (0.4) holds with T = 1.

1.2. CAUCHY PROBLEM AND A PRIORI ESTIMATES

We now consider the Cauchy problem for Equation (1.1):

u(0, x) = u0(x), (1.9)

where u_0(x) is a random variable in H. We shall assume that it is independent of η_1, η_2, . . . and that all of its moments are finite:

\[
E |u_0|^m < \infty \quad \text{for any } m \ge 1. \tag{1.10}
\]

We have the following theorem on the correctness of the Cauchy problem:

THEOREM 1.3. Assume that (1.10) is satisfied. Then the problem (1.1), (1.9) has a unique solution defined for t ≥ 0. Moreover, for any m ≥ 1 we have the estimate

\[
E |u(k)|^m \le q^k E |u_0|^m + C(m)\, \nu^{-(m-1)} d_\nu(k)\, E |\eta_k|^m, \quad k \ge 1, \tag{1.11}
\]

where 0 < ν ≤ 1, q = e^{−να_1}, C(m) > 0 is a constant not depending on u_0, k, and ν, and

\[
d_\nu(k) = 1 + q + \cdots + q^{k-1} \le \alpha_1^{-1} e^{\alpha_1} \nu^{-1}.
\]

Finally, if (1.4) holds for some s > 0 and l = l(s) ≥ 1 is the smallest integer no less than s, then there is a constant C(l, m) > 0 such that

\[
E \|u(k)\|_s^m \le C(l, m)
\begin{cases}
\nu^{-m/2} E |u_{k-1}|^m + E \|\eta_k\|_s^m, & l = 1, \\
1 + \nu^{-5lm/2} E |u_{k-1}|^{m_l} + E \|\eta_k\|_s^m, & l \ge 2,
\end{cases} \tag{1.12}
\]

where k ≥ 1 and m_l = m(2l + 1).
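The bound on the geometric sum d_ν(k) in Theorem 1.3 is stated without derivation. One way to obtain it, given here as a hedged reconstruction rather than the authors' argument, uses only the elementary inequality e^x − 1 ≥ x; the auxiliary variable x is introduced for this sketch.

```latex
% Put x = \nu\alpha_1 \in (0, \alpha_1], so that q = e^{-x}. Then
d_\nu(k) = \sum_{i=0}^{k-1} q^i = \frac{1 - q^k}{1 - q}
         \le \frac{1}{1 - e^{-x}} = \frac{e^{x}}{e^{x} - 1}
         \le \frac{e^{x}}{x} \le \frac{e^{\alpha_1}}{\alpha_1 \nu},
% where the second-to-last step uses e^x - 1 \ge x and the last uses x \le \alpha_1 (since \nu \le 1).
```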

In case the random variables u0 and ηk have finite second exponential moments,stronger estimates for the solutions hold:

THEOREM 1.4. Suppose that the random variables ξ_jk satisfy condition (B) and there is ρ > 0 such that

\[
E \exp(\rho \nu |u_0|^2) < \infty. \tag{1.13}
\]

Then the solution of the problem (1.1), (1.9) constructed in Theorem 1.3 satisfies the inequality

\[
E \exp\bigl( \sigma_0 \nu |u(k)|^2 \bigr) \le d(k) \bigl( E \exp( \sigma_0 \nu |u_0|^2 ) \bigr)^{q^k}, \quad k \ge 1, \tag{1.14}
\]

where 0 < ν ≤ 1, q = e^{−α_1 ν}, σ_0 = ρ ∧ (aα_1 e^{−α_1}), and

\[
d(k) = \bigl( E \exp(a |\eta_k|^2) \bigr)^{1 + q + \cdots + q^{k-1}} \le \bigl( E \exp(a |\eta_k|^2) \bigr)^{\frac{1}{1-q}}.
\]


Moreover, if (1.4) holds for some s > 0 and l is the smallest integer no less than s, then there are positive constants C_l and σ_l, depending only on σ_0 and l, such that

\[
E \exp\bigl( \sigma_l \nu^{p_l} \|u(k)\|_s^{2\kappa_l} \bigr) \le C_l\, E \exp\bigl( a \|\eta_k\|_s^2 \bigr)\, E \exp\bigl( \sigma_0 \nu |u(k-1)|^2 \bigr), \quad k \ge 1. \tag{1.15}
\]

Here κ_1 = 1, p_1 = 2, and

\[
\kappa_l = \frac{1}{2l + 1}, \qquad p_l = \frac{7l + 1}{2l + 1} \quad \text{for } l \ge 2.
\]

The proof of Theorems 1.3 and 1.4 is carried out by standard methods and is given in the Appendix (see Section 6).

We shall also need some estimates for the rate of growth and for the mean valueof solutions (and of the right-hand side of the equation).

For any sequence of non-negative numbers a_k and arbitrary integers m ≤ n, we set

\[
\langle a_k \rangle_m^n = \frac{1}{n - m + 1} \sum_{k=m}^{n} a_k.
\]

In the case m > n, we set ⟨a_k⟩_m^n = ⟨a_k⟩_n^m.
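The averaging notation just introduced, including the convention for m > n, can be sketched as follows. The helper name `window_average` and the choice to pass the sequence as a callable are assumptions made for the example.

```python
def window_average(a, m, n):
    """Return the average of a(k) over all integers k between m and n inclusive.

    The order of m and n does not matter, which mirrors the convention
    that the bracket with m > n equals the bracket with the limits swapped.
    """
    lo, hi = (m, n) if m <= n else (n, m)
    total = sum(a(k) for k in range(lo, hi + 1))
    return total / (hi - lo + 1)
```

For instance, `window_average(lambda k: k * k, 1, 3)` averages 1, 4 and 9, and swapping the last two arguments returns the same value.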

PROPOSITION 1.5. Let k_− ≤ k_0 ≤ k_+ be some integers, where k_+ (k_−) can take the value +∞ (−∞), and let u(t, x) be a solution of (1.1) that is defined for k_− ≤ t ≤ k_+ and satisfies the inequality

\[
\sup_{k_- \le k \le k_+} E |u(k)|^m \le N_m \nu^{-m} \quad \text{for } 0 < \nu \le 1,\ m \ge 1, \tag{1.16}
\]

where the constants N_m > 0 do not depend on ν. Then there is a constant M > 1, not depending on N_m and ν, and a non-negative random variable T_ν(ω) ∈ Z such that

\[
\bigl\langle |u(k)|^2 + \|\eta_k\|_s^2 \bigr\rangle_{k_0}^{T} \le M \nu^{-2} \quad \text{for } k_- \le T \le k_+,\ |T - k_0| \ge T_\nu(\omega). \tag{1.17}
\]

Moreover, for any m > 1 there is a constant C_m > 0 such that

\[
E\, T_\nu^m \le C_m \bigl( N_{2m} + E |\eta_1|^{4(m+2)} \bigr) \nu^{-m} \quad \text{for } 0 < \nu \le 1. \tag{1.18}
\]

In this proposition and everywhere below, we assume that k < k_+ if k_+ = +∞ and k > k_− if k_− = −∞.

Remark 1.6. If in Proposition 1.5 we assume that condition (B) is also satisfied and replace inequality (1.16) by the stronger estimate

\[
\sup_{k_- \le k \le k_+} E\, e^{\sigma_0 \nu |u(k)|^2} \le N_0 \quad \text{for } 0 < \nu \le 1, \tag{1.19}
\]

then (1.17) holds with a constant M > 1 (depending on σ_0 solely) and an integer-valued non-negative random variable T_ν(ω) ∈ Z such that

\[
E\, e^{\sigma T_\nu} \le C_0 \quad \text{for } 0 < \nu \le 1,
\]

where the positive constants C_0 and σ depend only on N_0 and σ_0, respectively. The proof of this assertion follows the same scheme as that of Proposition 1.5, and we shall not dwell on it.

We also note that, due to Theorems 1.3 and 1.4, Proposition 1.5 and its modification above apply to any solution of the problem (1.1), (1.9), where the random variable u_0 satisfies condition (1.10) or (1.13).

Proof of Proposition 1.5. To simplify notation, we confine ourselves to the case when k_0 = k_− = 0 and k_+ = +∞. Moreover, we shall only show that

\[
\bigl\langle |u(k)|^2 \bigr\rangle_0^T \le M \nu^{-2} \quad \text{for } T \ge T_\nu(\omega).
\]

It will be clear from the proof that the same arguments apply in the general case.

(1) We first note that

\[
|u(T)|^2 + 2\nu \sum_{k=1}^{T} \int_{k-1}^{k} \|u(t)\|^2\, dt
= |u(0)|^2 + \sum_{k=1}^{T} \bigl( |\eta_k|^2 + 2 \langle \eta_k, u(k - 0) \rangle \bigr), \tag{1.20}
\]

where T ≥ 1 is an arbitrary integer. Indeed, since on any open interval (k − 1, k) the solution u(t, x) satisfies the free NS equations (1.8), we have (see (6.3) with l = 0)

\[
|u(k - 0)|^2 - |u(k - 1)|^2 + 2\nu \int_{k-1}^{k} \|u(t)\|^2\, dt = 0. \tag{1.21}
\]

Besides, relation (1.7) implies that

\[
|u(k)|^2 = |u(k - 0)|^2 + |\eta_k|^2 + 2 \langle \eta_k, u(k - 0) \rangle. \tag{1.22}
\]

Combining (1.21) and (1.22), we derive

\[
2\nu \int_{k-1}^{k} \|u(t)\|^2\, dt = |u(k - 1)|^2 - |u(k)|^2 + |\eta_k|^2 + 2 \langle \eta_k, u(k - 0) \rangle.
\]

Taking the sum over k = 1, . . . , T, we obtain (1.20).
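The energy balance (1.20) can be checked numerically on a toy model. The sketch below is an illustration under stated assumptions, not the paper's setting: the free flow is replaced by the linear heat semigroup acting on finitely many modes (the nonlinear term does not contribute to the energy identity (1.21) in any case), and the function name `energy_balance_gap` is hypothetical.

```python
import numpy as np

def energy_balance_gap(alpha, kicks, u0, nu=1.0):
    """Return (left side) - (right side) of the balance (1.20) for a toy linear flow.

    alpha : eigenvalues alpha_j of L for the retained modes
    kicks : array of shape (T, number of modes), row k-1 holding eta_k
    u0    : initial coefficients u(0)
    """
    alpha = np.asarray(alpha, dtype=float)
    u_start = np.asarray(u0, dtype=float)
    u = u_start.copy()
    decay = np.exp(-nu * alpha)          # one unit of free linear flow, mode by mode
    dissipated = 0.0                     # 2*nu * sum_k int_{k-1}^k ||u(t)||^2 dt
    forcing = 0.0                        # sum_k (|eta_k|^2 + 2 <eta_k, u(k-0)>)
    for eta in np.asarray(kicks, dtype=float):
        # closed form of 2*nu*int ||u(t)||^2 dt over one unit interval
        dissipated += np.sum((1.0 - decay**2) * u**2)
        u_minus = decay * u              # u(k - 0), the state just before the kick
        forcing += np.sum(eta**2) + 2.0 * np.dot(eta, u_minus)
        u = u_minus + eta                # u(k), the state just after the kick
    lhs = np.sum(u**2) + dissipated
    rhs = np.sum(u_start**2) + forcing
    return float(lhs - rhs)
```

The returned gap should vanish up to rounding error for any choice of kicks, which is the content of summing the one-step identity over k.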

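(2) We now recall that (see [CF])

\[
|S_t(v)| \le e^{-\nu \alpha_1 t} |v|, \quad t \ge 0, \tag{1.23}
\]

where α_1 > 0 is the first eigenvalue of L. It follows from (1.23) that

\[
2 |\langle \eta_k, u(k - 0) \rangle| \le (\nu \alpha_1)^{-1} |\eta_k|^2 + \nu \alpha_1 |u(k - 0)|^2
\le (\nu \alpha_1)^{-1} |\eta_k|^2 + \nu \int_{k-1}^{k} \|u(t)\|^2\, dt.
\]

Substitution of this inequality into (1.20) results in

\[
|u(T)|^2 + \nu \sum_{k=1}^{T} \int_{k-1}^{k} \|u(t)\|^2\, dt \le |u(0)|^2 + \bigl( 1 + \alpha_1^{-1} \nu^{-1} \bigr) \sum_{k=1}^{T} |\eta_k|^2. \tag{1.24}
\]

Now note that, by (1.21) and (1.23), we have

\[
\nu \int_{k-1}^{k} \|u(t)\|^2\, dt \ge \frac{1 - e^{-2\alpha_1 \nu}}{2}\, |u(k - 1)|^2.
\]

Combining this with (1.24), we derive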

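\[
\sum_{k=0}^{T} |u(k)|^2 \le c \nu^{-2} \bigl( \nu |u(0)|^2 + T\, E |\eta_1|^2 + \Xi(T) \bigr), \tag{1.25}
\]

where c = c(α_1) > 0 is a constant and

\[
\Xi(T) = \sum_{k=1}^{T} \bigl( |\eta_k|^2 - E |\eta_k|^2 \bigr).
\]

Direct verification shows that, for any integer p ≥ 1,

\[
E \bigl| \Xi(T) \bigr|^{2p} \le c_p \bigl( E |\eta_1|^{4p} \bigr) T^p, \tag{1.26}
\]

where c_p > 0 is a constant depending only on p. We now set

\[
t(\omega) = \min \bigl\{ t \in \mathbb{Z}_+ : \Xi(t') \le t' \text{ for } t' \ge t \bigr\}.
\]

Using (1.26) and applying the Chebyshev inequality, we derive

\[
E\, t^m = \sum_{j=1}^{\infty} P\{t = j\}\, j^m \le \sum_{j=1}^{\infty} P\bigl\{ |\Xi(j - 1)|^{2p} > j^{2p} \bigr\}\, j^m
\le c_p\, E |\eta_1|^{4p} \sum_{j=1}^{\infty} j^{m-p} \le 2 c_p\, E |\eta_1|^{4p},
\]

where p = m + 2. Taking into account (1.25), we conclude that

\[
\bigl\langle |u(k)|^2 \bigr\rangle_0^T \le c \bigl( E |\eta_1|^2 + 1 \bigr) \nu^{-2} \quad \text{for } T \ge T_\nu(\omega),
\]

where T_ν(ω) = t(ω) ∨ (ν|u(0)|^2). This completes the proof of inequality (1.17) in which M = c(E|η_1|^2 + 1). ✷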
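The factor 2 in the moment bound for t(ω) comes from the choice p = m + 2. The following short computation spells this out; it is an added remark, not part of the original proof.

```latex
% With p = m + 2 the exponent in the series is m - p = -2, so
\sum_{j=1}^{\infty} j^{m-p} = \sum_{j=1}^{\infty} \frac{1}{j^{2}} = \frac{\pi^{2}}{6} < 2 .
```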

1.3. MARKOV CHAIN

We recall that S_t denotes the semigroup generated by the free NS system (1.8). Consider a solution u(t, x) of the problem (1.1), (1.9) and set u_k = u(k, x), k ≥ 0. Due to (0.4), we have

u0 = u0, (1.27)

uk = S(uk−1)+ ηk, (1.28)

where S = S_1 and k ≥ 1. Clearly, Equation (1.28) defines a random dynamical system (RDS) in H. Since the random variables η_k and u_0 are independent, the set of solutions corresponding to all u_0 ∈ H is a family of Markov chains with the transition function

\[
P(k, u_0, \Gamma) = P\{u_k \in \Gamma\}, \quad u_0 \in H, \quad \Gamma \in \mathcal{B}(H).
\]

Denote by

\[
P_k: C_b(H) \to C_b(H), \qquad P_k^*: \mathcal{P}(H) \to \mathcal{P}(H)
\]

the Markov operators corresponding to P(k, u_0, Γ).* It follows from Theorem 1.3 that if condition (1.4) is satisfied for some s ≥ 0, then P_1^* µ(H^s) = 1 for any µ ∈ P(H). In particular, when µ is the delta-measure concentrated at u_0, we obtain

\[
P(k, u_0, H^s) = 1 \quad \text{for any } k \ge 1. \tag{1.29}
\]
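The recursion (1.28) can be iterated directly once a time-one map S is available. The sketch below is a toy illustration under explicit assumptions: S is replaced by the linear heat semigroup on finitely many modes (the nonlinear term B is dropped), and the variables ξ_jk are taken standard Gaussian, which is one admissible choice under condition (A). The function name `simulate_kicked_chain` is hypothetical.

```python
import numpy as np

def simulate_kicked_chain(b, alpha, nu=1.0, steps=2000, seed=0):
    """Iterate u_k = S(u_{k-1}) + eta_k and return the energies |u_k|^2.

    S is the toy linear map u_j -> exp(-nu * alpha_j) * u_j (an assumption),
    and eta_k has coefficients b_j * xi_jk with standard Gaussian xi_jk.
    """
    rng = np.random.default_rng(seed)
    b = np.asarray(b, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    decay = np.exp(-nu * alpha)            # action of the toy S on each mode
    u = np.zeros_like(b)                   # deterministic initial state u_0 = 0
    energies = np.empty(steps)
    for k in range(steps):
        xi = rng.standard_normal(b.shape)  # independent kicks at each step
        u = decay * u + b * xi             # one step of the Markov chain
        energies[k] = np.sum(u**2)
    return energies
```

In this linear Gaussian toy model the stationary energy is known in closed form, namely the sum over j of b_j^2 / (1 − e^{−2να_j}), so the long-run average of the returned energies can be compared with it. For the actual NS flow no such formula is claimed here.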

In what follows, we shall need some properties of the operators P_k and P_k^*. The following two lemmas show that P_k can be extended to a broader class of functionals.

LEMMA 1.7. Suppose that condition (1.4) holds for some s > 0. Then P_k can be extended to a continuous operator from C_b(H^s) to C_b(H) whose norm is equal to 1.

Proof. It suffices to consider the case k = 1. Let f ∈ C_b(H^s). In view of Theorem 1.3, for any initial function u_0 ∈ H the solution u_1 = u(1, x) belongs to H^s with probability 1, so that the random variable f(u_1) is well-defined. Moreover, since the operator S: H → H^s is continuous (see Lemma 6.1), we conclude from (1.28) that u_1 continuously depends (in H^s-norm) on u_0 ∈ H for all ω. Therefore the function f(u_1) is also continuous. The continuity of the function P_1 f(u_0) = E f(u_1) follows now from the Lebesgue theorem on dominated convergence. It remains to note that if |f(u)| ≤ 1 for all u ∈ H, then |E f(u_1)| ≤ 1, that is, the norm of the operator P_1: C_b(H^s) → C_b(H) does not exceed 1.

* Since the map S: H → H is continuous, for any f ∈ C_b(H) the function P_k f(u) = ∫_H P(k, u, dv) f(v) is continuous in u. Hence, P_k maps the space C_b(H) into itself.

We now show that the operators P_k can be continued to a class of functionals growing at infinity. For any increasing positive function β(r), r ≥ 0, we denote by C(H^s; β) the space of continuous functions f(u): H^s → R such that

\[
|f(u)| \le \mathrm{const}\, \beta(\|u\|_s), \quad u \in H^s.
\]

It is clear that C(H^s; β) is a Banach space with respect to the norm

\[
\|f\|_{s,\beta} := \sup_{u \in H^s} \bigl\{ |f(u)| / \beta(\|u\|_s) \bigr\}.
\]

We recall that the integer l = l(s) ≥ 1 and the constants m_l (l ≥ 2), κ_l, σ_l, and p_l are defined in Theorems 1.3 and 1.4, and set m_l = m for l = 0, 1.

LEMMA 1.8. Under the conditions of Theorem 1.3, for any m > 1 and m′, 1 ≤ m′ < m, the operator P_k can be extended to a continuous map from C(H^s; β_{m′}) to C(H; β_{m_l}), where β_d(r) = 1 + r^d. Moreover, for any ν, 0 < ν ≤ 1, the norms of the extended operators are bounded uniformly in k ≥ 1.

Remark 1.9. Under the assumptions of Theorem 1.4, the operator P_k extends to a bounded map from C(H^s; γ) to C(H; γ′). Here γ(r) = exp(c r^{2κ_l}) and γ′(r) = exp(c′ r^2), where κ_l is defined in Theorem 1.4 and c and c′ are some positive constants that can be easily recovered from Theorem 1.4.

Proof. The proofs of all assertions are similar, and to simplify notation, we confine ourselves to the case s = 0. Let f ∈ C(H; β_{m′}) and let h_R(r) be a continuous function equal to 1 and 0 for r ≤ R and r ≥ R + 1, respectively. Obviously, the function f_R(u) = h_R(|u|) f(u) belongs to C_b(H). It follows from Lemma 1.7 and inequality (1.11) that for any R_2 > R_1 ≥ 1 we have

\[
\bigl| P_k f_{R_1}(u) - P_k f_{R_2}(u) \bigr| \le \left| \int_H \bigl( h_{R_2}(|v|) - h_{R_1}(|v|) \bigr) f(v)\, \mu_k(dv) \right|
\]
\[
\le \mathrm{const} \int_{R_1 \le |v| \le R_2 + 1} (1 + |v|)^{m'} \mu_k(dv) \tag{1.30}
\]
\[
\le \mathrm{const}\, (1 + R_1)^{m' - m}, \tag{1.31}
\]

where µ_k = P(k, u, ·). We note that inequality (1.31) holds uniformly in u from bounded subsets of H. Letting R_1 go to infinity, we conclude that there is a limit

\[
\lim_{R \to \infty} P_k f_R(u) =: P_k f(u),
\]

and the limiting function P_k f is continuous in u ∈ H. Moreover, it follows from (1.11) that

\[
\bigl| P_k f(u) \bigr| \le \left| \int_H f(v)\, \mu_k(dv) \right| \le \|f\|_{0,\beta_{m'}} \int_H (1 + |v|)^{m'} \mu_k(dv)
\le \mathrm{const}\, \nu^{-m} \|f\|_{0,\beta_{m'}} (1 + |u|)^m.
\]

This completes the proof in the case s = 0. ✷

We now turn to the problem of existence of a stationary measure.

DEFINITION 1.10. A probability measure λ ∈ P(H) is said to be stationary for Equation (1.1) if P_1^* λ = λ.

We recall that the support supp µ of a measure µ is defined as the minimal closed set of full measure and that D(ξ) denotes the distribution of a random variable ξ.

PROPOSITION 1.11. Suppose that condition (1.4) is satisfied for some s > 0. Then there is a stationary measure λ ∈ P(H) such that λ(H^s) = 1 and

\[
\int_H \|u\|_r^m \lambda(du) \le C(l, m)
\begin{cases}
\nu^{-m} & \text{for } r = 0, \\
\nu^{-3m/2} & \text{for } 0 < r \le 1, \\
\nu^{-(5l+2)m/2} & \text{for } 1 < r \le s,
\end{cases} \tag{1.32}
\]

where m ≥ 1, 0 < ν ≤ 1, l = l(r) is the smallest integer no less than r, and C(l, m) is a constant not depending on ν. Moreover, there is a stationary Markov chain (u_k, k ∈ Z) satisfying (1.28) for all k ∈ Z such that D(u_k) = λ. Finally, if all the constants b_j in (1.3) are non-zero and λ_0 ∈ P(H) is an arbitrary stationary measure for P_k^*, then supp λ_0 = H.

Proof. The existence of a stationary measure and inequality (1.32) can easily be proved by the Bogolyubov–Krylov argument using Theorem 1.3 and the Prokhorov theorem on the weak compactness of a tight family of measures (cf. [DZ]). The fact that λ(H^s) = 1 follows immediately from (1.29) and the Chapman–Kolmogorov relation

\[
\lambda(\Gamma) = \int_H P(1, u, \Gamma)\, \lambda(du), \quad u \in H, \quad \Gamma \in \mathcal{B}(H). \tag{1.33}
\]

The existence of a stationary solution of (1.28) with distribution λ follows from the Prokhorov and Skorokhod theorems. (For the proof of this assertion in the case when the support of the distribution of η_k is a bounded subset in H, see [KS1, Section 1.2].)

To prove the last assertion of the proposition, we note that if γ is the distribution of the random variable η_k defined by the formula (1.3) in which all b_j are non-zero, then γ(U) > 0 for any open set U ⊂ H (see Lemma 6.2 in the Appendix). It follows that P(1, u, U) > 0. Setting Γ = U and λ = λ_0 in (1.33), we conclude that λ_0(U) > 0 for any open set U. ✷

Combining Propositions 1.5 and 1.11, we obtain the following assertion.

PROPOSITION 1.12. Suppose that (1.4) holds for some s > 0. Let λ_0 ∈ P(H) be a stationary measure for P_k^* that satisfies the condition

\[
\int_H |u|^m \lambda_0(du) \le N_m \nu^{-m} \quad \text{for } m \ge 1,\ 0 < \nu \le 1, \tag{1.34}
\]

where N_m > 0 do not depend on ν, and let (u_k, k ∈ Z) be a stationary solution of (1.28) such that D(u_k) = λ_0. Then there is a constant M ≥ 1 and for any k_0 ∈ Z there exists an integer-valued nonnegative random variable T_ν(ω) satisfying (1.18) such that (1.17) holds for |T − k_0| ≥ T_ν.

Remark 1.13. Analogues of Propositions 1.11 and 1.12 are true in the casewhen condition (B) holds. In this situation, the stationary measure λ satisfies theinequalities∫

Hrexp(σνpl‖u‖2κl

s

)λ(du) � Cr, 0 < ν � 1,

where 0 � r � s, l = l(r) is the smallest integer no less than r, κ0 = p0 = 1, theconstants pl and κl with l � 1 are defined in Theorem 1.4, and Cr > 0 is a constantnot depending on ν. Moreover, the random variable Tν(ω) has a finite exponentialmoment.

2. Lyapunov–Schmidt Reduction

In this section we prove a result of the Foias–Prodi type and show that if a Markov chain {u_k} is a stationary solution of Equation (1.28), then sufficiently high Fourier modes of u_k are uniquely defined by low modes of the sequence (u_l, l ≤ k). This will enable us to reduce the problem of uniqueness of a stationary solution for (1.28) to a similar question for a Gibbs system with a finite-dimensional phase space.

2.1. FORMULATION OF THE RESULT

To simplify notations, from now on we assume that ν = 1. Let us define H s

as the closure in Hs of the linear manifold spanned by those vectors ej whosecoefficients bj in expansion (1.3) are nonzero. It is clear that if all the coefficients bjare nonzero, then H s = Hs . For any integer N � 2, let HN be the subspace


in H spanned by the vectors ej , j = 1, . . . , N − 1, and let H⊥N be its orthogonalcomplement. We set

H sN = H s ∩HN, H s⊥

N = H s ∩H⊥Nand note that

|w| � α−1/2N ‖w‖ for any w ∈ H⊥N , (2.1)

where αj , j � 1, are the eigenvalues of L indexed in increasing order. We denoteby PN and QN the orthogonal projections onto HN and H⊥N , respectively. Finally,we set

Hs = H ×H s, H

sN = HN ×H s⊥

N ,

and for any s � 0 and any integer N � 1 we define the projections

1N : Hs → HsN ,

(u

η

)!→(

PNu

QNη

).

We shall also use the corresponding projections in the spaces of sequences:

�N :(Hs)Z0 → (

HsN

)Z0,

(u

η

)!→(

PNu

QNη

),

where PNu = (PNul, l � 0) and QNη = (QNηl, l � 0). In the case N = ∞, weset

1∞: Hs → H,

(u

η

)!→ u, �∞:

(Hs)Z0 → HZ0,

(u

η

)!→ u.

Applying QN to (1.28), we obtain

wk = QNS(vk−1 + wk−1)+ ψk, (2.2)

where

vk = PNuk, wk = QNuk, ψk = QNηk.

We wish to show that, for a sufficiently large class of sequences (v_l, l ≤ 0) and (ψ_l, l ≤ 0), Equation (2.2) with k ≤ 0 has a unique solution (w_l, l ≤ 0), and the dependence of the zeroth component w_0 on v_l and ψ_l decays exponentially as a function of l. To formulate the corresponding results, we have to introduce some notations.
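The splitting of a state into low and high modes used in (2.2) can be illustrated on a truncated coefficient vector. This is a sketch under the assumption that a state is stored as finitely many coefficients in the eigenbasis e_1, e_2, . . . ; the helper name `split_modes` is hypothetical.

```python
import numpy as np

def split_modes(coeffs, N):
    """Return (v, w), the low-mode and high-mode parts of a coefficient vector.

    Following the text, the low-mode space is spanned by e_1, ..., e_{N-1},
    so the low part keeps the first N - 1 coefficients and the high part the rest.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    low = coeffs.copy()
    high = coeffs.copy()
    low[N - 1:] = 0.0     # projection onto span{e_1, ..., e_{N-1}}
    high[:N - 1] = 0.0    # projection onto the orthogonal complement
    return low, high
```

The two parts add up to the original vector and are orthogonal, which is what makes it possible to treat the high-mode equation (2.2) separately from the low modes.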

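For a sequence u = (u_l, l ≤ k) with u_l ∈ H and integers m ≤ n ≤ k we set

\[
\bigl\langle |u|^2 \bigr\rangle_m^n \equiv \bigl\langle |u_l|^2 \bigr\rangle_m^n = \frac{1}{n - m + 1} \sum_{l=m}^{n} |u_l|^2.
\]

In what follows, we shall need the following two-sided estimate for ⟨|u|^2⟩_m^n:

\[
2 \langle\langle u \rangle\rangle_m^n \le \bigl\langle |u|^2 \bigr\rangle_m^n \le c\, \langle\langle u \rangle\rangle_m^n, \tag{2.3}
\]

where c = 2(1 − e^{−α_1})^{−1} and

\[
\langle\langle u \rangle\rangle_m^n = \frac{1}{n - m + 1} \sum_{l=m}^{n} \int_0^1 \bigl\| S_t(u_l) \bigr\|^2 dt.
\]

To prove (2.3), we note that if u(t) is a solution of the homogeneous NS system (1.8), then

\[
|u(t)|^2 + 2 \int_0^t \|u(\theta)\|^2\, d\theta = |u(0)|^2, \quad t \ge 0. \tag{2.4}
\]

This estimate immediately implies the left-hand inequality in (2.3). Combining (1.23) with ν = 1 and (2.4), we derive

\[
\int_0^t \|u(\theta)\|^2\, d\theta \ge \frac{1}{2} \bigl( 1 - e^{-2\alpha_1 t} \bigr) |u(0)|^2,
\]

whence follows the right-hand estimate in (2.3).

For any K > 0 and any integer R ≥ 0, we denote by F^s(K, R) the set of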

sequences!(u

η

)=((uk

ηk

), k � 0

), uk ∈ H, ηk ∈ H s , (2.5)

such that Equation (1.28) is satisfied for k ≤ 0, and the following inequality holds:

\[
\bigl\langle |u_k|^2 + \|\eta_k\|_s^2 \bigr\rangle_T^0 \le K, \quad T \in \mathbb{Z},\ T \le -R. \tag{2.6}
\]

It is clear that (2.6) is equivalent to the inequality

\[
(|T| + 1) \bigl\langle |u_k|^2 + \|\eta_k\|_s^2 \bigr\rangle_T^0 \le K (|T| \vee R + 1), \quad T \le 0, \tag{2.7}
\]

which implies, in particular, that

\[
|u_k|^2 + \|\eta_k\|^2 \le K (R \vee |k| + 1), \quad k \le 0. \tag{2.8}
\]

We also introduce the space F^s(K) of sequences (2.5) that satisfy the inequality

\[
\limsup_{T \to -\infty} \bigl\langle |u_k|^2 + \|\eta_k\|_s^2 \bigr\rangle_T^0 \le K.
\]

It is clear that F^s(K, R) ⊂ F^s(K) for any integer R ≥ 0. The sets F^s(K) and F^s(K, R) are subsets of the linear space 𝐇 = (𝐇^0)^{Z_0}. We endow 𝐇 with the Tikhonov topology. That is, a sequence (u^k, η^k) converges to (u, η) if u_l^k → u_l in H and η_l^k → η_l in H for each l ≤ 0. This topology is metrisable; for instance, one can use the distance

\[
\mathrm{dist}\bigl( (u^1, \eta^1), (u^2, \eta^2) \bigr) = \sum_{l=-\infty}^{0} \bigl( |u_l^1 - u_l^2| + |\eta_l^1 - \eta_l^2| \bigr) \wedge 2^{l}.
\]

The sets F^s(K) and F^s(K, R) are provided with the topology of 𝐇.

We stress that the topology in the spaces F^s(K) and F^s(K, R) is defined in terms of the L^2-norm |·|, rather than the H^s-norm ‖·‖_s.

In the theorem below, we have compiled some properties of the spaces F^s(K) and F^s(K, R). We abbreviate F^0(K) = F(K) and F^0(K, R) = F(K, R).

* The choice of the space F^s(K, R) is implied by the fact that if {u_k} is a stationary solution for (1.28) all of whose moments are finite, then with probability 1 the sequence (u_l, η_l, l ≤ 0) belongs to F^s(M, R) for an integer R ≥ 0, where M > 0 is the constant in Proposition 1.12.
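The distance that metrises the Tikhonov topology on sequences indexed by l ≤ 0 can be sketched as follows. The code is a minimal illustration under stated assumptions: sequences are truncated to finitely many entries, each entry is a pair of finite coefficient vectors, and the function name `tikhonov_dist` is hypothetical.

```python
import numpy as np

def tikhonov_dist(seq1, seq2):
    """Return the truncated distance between two sequences of pairs (u_l, eta_l).

    Entry i of each sequence holds the pair with index l = -i, so the cap on
    the i-th term is 2**(-i), matching the weight 2^l in the text.
    """
    total = 0.0
    for i, ((u1, e1), (u2, e2)) in enumerate(zip(seq1, seq2)):
        gap = (np.linalg.norm(np.subtract(u1, u2))
               + np.linalg.norm(np.subtract(e1, e2)))
        total += min(gap, 2.0 ** (-i))   # (a ∧ b) is the minimum of a and b
    return total
```

Because the l-th term is capped by 2^l, entries far in the past contribute little, so convergence in this distance amounts to convergence of every fixed coordinate in the L^2-norm.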

THEOREM 2.1. (i) Let s > 0 and let M > 0 be the constant defined in Propo-sition 1.5. Then for any K � M the space Fs(K) is nonempty. Moreover, for anyinteger R � 0, F(K,R) is closed in H and Fs(K,R) is compact in H.

(ii) There is a constant C∗ > 0 such that ifN ∈ [N0,∞], whereN0 = N0(K) �1 is the smallest integer satisfying the condition

log αN0 > C∗K,

then the restriction of the projection �N to F(K) is injective. Moreover, for anyinteger l � 0 the operator

Wl: FN(K) ≡ �NF(K)→ H⊥Ntaking each ϒ = (v

ψ

) = �N(uη) to wl = QNul satisfies the inequality∣∣Wl(ϒ1)−Wl(ϒ2)∣∣

� |ψ1l − ψ2

l | +l−1∑k=−∞

(Cα

−1/2N

)l−k×× exp

{C(l − k)(⟨|u1|2⟩l−1

k+ ⟨|u2|2⟩l−1

k

)}|ϒ1k −ϒ2

k |. (2.9)

Here ϒi = (vikψik

) ∈ FN(K), i = 1, 2, C > 0 is a constant not depending on K, N ,

and ϒ i , and |ϒ1k −ϒ2

k | = |v1k − v2

k | + |ψ1k − ψ2

k |.Theorem 2.1 will be proved in Subsection 2.3. In the next subsection, we use

this result to establish equivalence of two families of Markov chains related to astationary measure for the original equation.


2.2. THEOREM ON ISOMORPHISM

In what follows, we assume that condition (1.4) is satisfied for some s > 0. Ac-cording to assertion (i) of Theorem 2.1, in this case Fs(K,R) is a compact subsetof H for any R � 0 and K � M. We denote by Bs(K) the Borel σ -algebra onthe topological space Fs(K) and by P s(K) the set of all probability measures on(Fs(K),Bs(K)). In the case s = 0 we shall simply write B(K) and P (K).

We recall that to Equation (1.28) there corresponds an RDS and a family ofMarkov chains {θk} in H given by the formulas

θ0 = u, (2.10)

θk = S(θk−1)+ ηk, (2.11)

where k � 1. Let us fix arbitrary stationary measure λ0 ∈ P (H) for (2.10),(2.11) with finite moments (see (1.34)) and denote by M > 0 the constant inProposition 1.12. Along with {θk}, let us consider another family of Markov chainsin Fs(K), K � M, defined by the rule

�0 =(u

η

), (2.12)

�k =(�k−1,S(�k−1)+

(ηk

ηk

)), (2.13)

where k � 1 and

S: Hs ≡ (Hs)Z0 → Hs , S(U) =(S(u0)

0

)for U ∈ Hs .

It is easy to see that (2.13) defines an RDS in Fs(K) in the sense that if �k ∈Fs(K), K � M, then �k+1 ∈ Fs(K) for all ω ∈ F. Accordingly, Equations (2.12)and (2.13) define a family of Markov chains in Fs(K). Moreover, if (uk, k ∈ Z) is astationary solution for (2.11) such that D(uk) = λ0 (see Propositions 1.5 and 1.11),then the random vector (

(ukηk

), k � 0) belongs to Fs(K) with probability 1, and its

distribution �0 is a stationary measure for (2.12), (2.13).We now consider the image of {�k} under the projection �N . Here and every-

where below, we assume that

N0 � N �∞, logαN0 > C∗K, (2.14)

where C∗ > 0 is the constant in Theorem 2.1. We shall see that all these projectionsare equivalent to the original chain {�k}.

For any integers K � M and N � N0, we set

FsN (K,R) = �NFs(K,R), FsN (K) = �NFs(K).

Thus, for N < ∞ the set FsN(K,R) consists of those sequences ϒ = (vψ

)for

which there is(uη

) ∈ Fs(K,R) such that v = PNu and ψ = QNη. By assertion (ii)


of Theorem 2.1, the pair(uη

)is uniquely determined. Similarly, Fs∞(K,R) consists

of the sequences u = (uk, k � 0) that are the first component of an element(uη

) ∈ Fs(K,R), which is also unique since ηk = uk − S(uk−1). The spaces FsN(K)and Fs∞(K) can be described in a similar way.

In what follows, we assume that FsN(K) is endowed with the Tikhonov topologyof the space HN = (HN)Z0 . We confine ourselves to the case K = 2M (althoughthe arguments below remain valid for all K � M). To simplify notations, we shallwrite Fs and FsN instead of Fs(2M) and FsN (2M), respectively. Since�N : Fs → FsNis a one-to-one continuous mapping, we can define its inverse �−1

N . We claim thatfor any integer N � N0 = N0(2M) the mapping

�N : (Fs ,B(Fs))→ (FsN,B(FsN))is an isomorphism of measurable spaces. Indeed, the fact that �N is measurable(that is, �−1

N (>) ∈ B(Fs) for any > ∈ B(FsN)) follows from the continuity of �N .Therefore, it suffices to show that �N(>) ∈ B(FsN) for any > ∈ B(Fs). To thisend, we first note that

Fs =⋂K>2M

∞⋃R=1

Fs(K,R).

It follows that Fs is a Borel subset of H and, hence, the Borel σ -algebra B(Fs)coincides with the collection of subsets > ⊂ Fs for which there is a Borel set> ∈ B(H) such that > = > ∩ Fs .

We now fix an arbitrary > ∈ B(Fs). Since the restriction of �N to the compactset Fs(K,R) is continuous together with its inverse, the set�N(>) = �N(>)∩FsNbelongs to B(FsN).

What has been proved implies, in particular, that the composition mapping

G = �∞ ◦�−1N : FsN → Fs∞

defines an isomorphism of measurable spaces with the inverse

H = �N ◦�−1∞ : Fs∞ → FsN .

We also note that

G(ϒ) = (ul, l � 0), ul = vl +W0(ϒ), (2.15)

H(u) =((vl

ψl

), l � 0

), vl = PNul, ψl = QN

(ul − S(ul−1)

), (2.16)

where the operator W0 is defined in Theorem 2.1.We now describe the families of Markov chains resulting from application of�N

to {�k}. It is a matter of direct verification to show that for N = ∞ we obtain

θ 0 = u, (2.17)

θ k = (θ k−1, S(θk−1

0 )+ ηk), (2.18)


where k � 1 and θ k = (θkl , l � 0), and for N0 � N <∞ we have

ϒ0 =(v

ψ

), (2.19)

ϒk =(ϒk−1, T (ϒk−1)+

(ϕk

ψk

)), (2.20)

where ϕk = PNηk, ψk = QNηk, and

T

(v

ψ

)=(

PNS(v0 +W0(v,ψ))

0

). (2.21)

We shall treat (2.18) and (2.20) as either random dynamical systems or Markovchains in the corresponding phase spaces. Note that the mapping G conjugates thetwo dynamical systems: if θ k = G(ϒk), then θ k+1 = G(ϒk+1).

Let us denote by P(k,u, >) and P(k,ϒ, >) the transition probabilities for thefamilies {θ k} and {ϒk}, respectively, and by Pk and Pk the Markov semigroupsassociated with them. The above construction implies that

P(k,G(u),G(>)) = P(k,ϒ, >), ϒ ∈ FsN, > ∈ B(FsN),

and, hence,

(Pkf ) ◦G = Pk(f ◦G), f ∈ Cb(FsN).

We now set

�k =(uk

ηk

), uk = (ul, l � k), ηk = (ηl, l � k),

where (uk, k ∈ Z) is a stationary solution such that D(uk) = λ0. It is clear that{�k} is a stationary Markov chain in Fs satisfying (2.13) for all k ∈ Z. Let usconsider its image under the projections �N and �∞:

ϒk = �N�k =((vl

ψl

), l � k

), uk = �∞�k = (ul, l � k),

where vl = PNul and ψl = QNηl . What has been said implies that if N sat-isfies (2.14), then ϒk and uk are stationary Markov chains in FsN and Fs∞ thatsatisfy (2.20) and (2.18), respectively. Moreover, the distribution of each of thesequences ϒk and uk uniquely determines the distribution of �k. Thus, we ob-tain a one-to-one correspondence between some classes of stationary measuresfor (2.11), (2.18), and (2.20). More exactly, we have the following theorem.

THEOREM 2.2. Let λ0 ∈ P (H) be a stationary measure for (2.11) satisfy-ing (1.34) and let (uk, k ∈ Z) be a stationary solution of (2.11) with distribution λ0.Then the distribution µ of the corresponding stationary Markov chain ϒk in FsN is


uniquely defined. Moreover, the measure µ uniquely determines λ0. In particular,if Equation (2.20) has at most one stationary measure concentrated on the set

∞⋃R=1

FsN(2M,R) ⊂ FsN(2M) ≡ FsN,

then Equation (2.11) has a unique stationary measure that satisfies (1.34).

Remark 2.3. If {ϒk = (ϒkl , l � 0), k ∈ Z} is a stationary solution for (2.20),then {ϒk0 , k ∈ Z} is a stationary process. Its distribution in the space of sequences{ϒl, l ∈ Z} is an (abstract) Gibbs measure in the sense of Ruelle, Sinai and Bowen,see discussion in [KS1]. Therefore, uniqueness of a stationary solution for (2.20)which we prove in Section 4 below implies (is in fact equivalent to) uniqueness ofthe corresponding 1D Gibbs system.

2.3. PROOF OF THEOREM 2.1

(i) Since s > 0, Proposition 1.11 implies that there is a stationary solution (u_k, k ∈ Z) of (1.28) whose distribution satisfies inequality (1.19) with ν = 1 and k_± = ±∞. By Proposition 1.5, almost every realisation of the random variable

\[ \Bigl( \binom{u_k}{\eta_k},\ k \le 0 \Bigr) \]

belongs to F^s(M), and therefore F^s(K) ≠ ∅ for K ≥ M.

The proofs of the assertions on compactness and closedness are similar, and we confine ourselves to proving that F^s(K,R) is compact in the space H with Tikhonov topology. Let (u^i, η^i) ∈ F^s(K,R) be an arbitrary sequence. The definition of F^s(K,R) implies that for any l ≤ 0 the sequence η^i_l is bounded in H^s. Furthermore, it follows from Equation (1.28) and the continuity of the map S from H to H^s that the sequence u^i_l is contained in a bounded subset of H^s. Therefore, there are subsequences of {u^i_l} and {η^i_l} that converge in H. It is clear that the limiting pair of sequences (u, η) satisfies (1.28) and belongs to F^s(K,R). This implies the required assertion.

(ii) The case N = ∞ is trivial, and therefore we shall assume that N < ∞. We shall need the following lemma whose proof is given in the Appendix (see Section 6.4).

LEMMA 2.4. There is a constant C > 0 such that the resolving semigroup of the free NS system (1.8) satisfies the inequalities

\[ |S_t(u_{01}) - S_t(u_{02})| \le |u_{01} - u_{02}| \exp\Bigl\{ C \int_0^t \|S_\theta(u_{01})\|^2 \, d\theta \Bigr\}, \tag{2.22} \]

\[ \|S_t(u_{01}) - S_t(u_{02})\| \le C\,(t^{-3/2} \vee 1)\,|u_{01} - u_{02}| \exp\Bigl\{ C \int_0^t \bigl( \|S_\theta(u_{01})\|^2 + \|S_\theta(u_{02})\|^2 \bigr) \, d\theta \Bigr\}, \tag{2.23} \]

where t ≥ 0 and u_{01}, u_{02} ∈ H.


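Let

ϒ^i = (v^i, ψ^i) = �N(u^i, η^i) ∈ FN(K), i = 1, 2.

We set w^i_l = Q_N u^i_l and w^{i-}_l = Q_N S(u^i_{l-1}). By (2.1), (2.3), and (2.23), for any l ≤ 0 we have

\[
\begin{aligned}
|w^1_l - w^2_l| &\le |w^{1-}_l - w^{2-}_l| + |\psi^1_l - \psi^2_l| \le \alpha_N^{-1/2} \|w^{1-}_l - w^{2-}_l\| + |\psi^1_l - \psi^2_l| \\
&\le C \alpha_N^{-1/2} D(l-1, l) \bigl( |v^1_{l-1} - v^2_{l-1}| + |w^1_{l-1} - w^2_{l-1}| \bigr) + |\psi^1_l - \psi^2_l|,
\end{aligned}
\]

where for any integers p < q ≤ 0 we set

\[ D(p,q) = \exp\Bigl\{ C (q-p) \bigl( \langle |u^1|^2 \rangle_p^{q-1} + \langle |u^2|^2 \rangle_p^{q-1} \bigr) \Bigr\}. \]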

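Arguing by induction, for any m < l − 1 we derive

\[
\begin{aligned}
|w^1_l - w^2_l| \le{} & \sum_{k=m+1}^{l-1} \bigl( C \alpha_N^{-1/2} \bigr)^{l-k} D(k,l) \bigl( |v^1_k - v^2_k| + |\psi^1_k - \psi^2_k| \bigr) \\
& + |\psi^1_l - \psi^2_l| + \bigl( C \alpha_N^{-1/2} \bigr)^{l-m} D(m,l)\, |u^1_m - u^2_m|.
\end{aligned}
\tag{2.24}
\]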

It follows from (2.7) and (2.8) that

\[ D(k,l) \le \mathrm{const}\; e^{2KC|k|}, \qquad |u^i_k| \le K^{1/2} |k|^{1/2} + \mathrm{const}, \quad i = 1, 2. \]

Therefore, we can pass to the limit in (2.24) as m → −∞ on condition that

\[ \log \alpha_N > 4KC + 2 \log C. \]
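The role of this condition can be checked numerically. With the bound D(k, l) ≤ const e^{2KC|k|} and l fixed, the k-th term of the sum in (2.24) is dominated by a multiple of r^{|k|}, where r = C α_N^{-1/2} e^{2KC}, and r < 1 is the same as log α_N > 4KC + 2 log C. The following is a minimal sketch of that equivalence; the function names and the sample values of α_N, K and C are illustrative assumptions and do not come from the paper.

```python
import math


def geometric_ratio(alpha_n: float, K: float, C: float) -> float:
    """Ratio r = C * alpha_n**(-1/2) * exp(2*K*C) that controls the tail of the sum."""
    return C * alpha_n ** -0.5 * math.exp(2.0 * K * C)


def limit_condition_holds(alpha_n: float, K: float, C: float) -> bool:
    """Condition log(alpha_n) > 4*K*C + 2*log(C) under which one may let m -> -infinity."""
    return math.log(alpha_n) > 4.0 * K * C + 2.0 * math.log(C)


if __name__ == "__main__":
    # Hypothetical sample values, chosen only for illustration.
    for alpha_n in (5.0e3, 2.0e4):
        r = geometric_ratio(alpha_n, K=1.0, C=2.0)
        print(alpha_n, r, limit_condition_holds(alpha_n, K=1.0, C=2.0))
```

The polynomial growth of |u^i_m| in |m| does not affect the conclusion, since it is absorbed by the geometric decay whenever r < 1.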

This results in

\[ |w^1_l - w^2_l| \le |\psi^1_l - \psi^2_l| + \sum_{k=-\infty}^{l-1} \bigl( C \alpha_N^{-1/2} \bigr)^{l-k} D(k,l) \bigl( |v^1_k - v^2_k| + |\psi^1_k - \psi^2_k| \bigr). \tag{2.25} \]

In particular, if ϒ1 = ϒ2, then u1 = u2 and, in view of (1.28), η1 = η2. It remains to note that (2.25) coincides with (2.9).

3. A Version of the Ruelle–Perron–Frobenius (RPF) Theorem

In this section, we prove a version of the RPF theorem which is a generalisation of the corresponding result from [KS1] to systems with unbounded phase space. Its application to the Markov semi-group corresponding to the family (2.19), (2.20) will give us the required uniqueness of a stationary measure.


3.1. STATEMENT OF THE RESULT

Let X0 ⊂ X1 ⊂ · · · be an increasing family of compact metric spaces which are subsets of a topological space X. We assume that the embeddings XR ⊂ XR+1 ⊂ X are isometries for any integer R ≥ 0. Let B(X) be the Borel σ-algebra on X and let P(X) be the set of all probability measures on (X, B(X)).

Let P(k, υ, Γ) be a family of Feller transition probabilities on (X, B(X)) and let

\[ P_k : C_b(X) \to C_b(X), \qquad P^*_k : P(X) \to P(X), \qquad k \ge 0, \]

be the corresponding Markov semi-groups. Recall that a subset R ⊂ Cb(X) is called a determining family for P(X) if for arbitrary measures µ1, µ2 ∈ P(X) the condition

\[ \int_X f(\upsilon)\, d\mu_1(\upsilon) = \int_X f(\upsilon)\, d\mu_2(\upsilon) \quad \text{for any } f \in R \]

implies that µ1 = µ2.

For any function f(υ), denote by f^+ and f^− its positive and negative parts, respectively:

\[ f^+ = \tfrac{1}{2} (|f| + f), \qquad f^- = \tfrac{1}{2} (|f| - f). \]

For a function f ∈ Cb(X), we shall write

\[ f^+_k = (P_k f)^+, \qquad f^-_k = (P_k f)^-. \]

We shall assume that the condition below is satisfied (cf. hypothesis (H) in [KS1, Section 4.1]):

(H) There is a determining family R for P(X) such that f − c belongs to R for all f ∈ R and c ∈ ℝ, and for any f ∈ R and α > 0 and arbitrary integers R ≥ 0 and ρ ≥ 0 there are k0 = k0(α, f, ρ, R) ∈ N and A = A_f(α, ρ, R) > 1 such that the following property holds: if

\[ \sup_{\upsilon \in X_\rho} f^+_k(\upsilon) \ge \alpha \quad \text{for all } k \ge 0, \tag{3.1} \]

\[ \sup_{\upsilon \in X_\rho} f^-_k(\upsilon) \ge \alpha \quad \text{for all } k \ge 0, \tag{3.2} \]

then for any k ≥ k0 there is l = l(k, α, f, ρ, R) > 0 such that

\[ \sup_{\upsilon \in X_R} (P_l f^+_k)(\upsilon) \le A_f(\alpha, \rho, R) \inf_{\upsilon \in X_R} (P_l f^+_k)(\upsilon), \tag{3.3} \]

\[ \sup_{\upsilon \in X_R} (P_l f^-_k)(\upsilon) \le A_f(\alpha, \rho, R) \inf_{\upsilon \in X_R} (P_l f^-_k)(\upsilon). \tag{3.4} \]

Sufficient conditions guaranteeing the validity of (H) are given in Section 3.3. The following result is a generalisation of Theorem 4.1 in [KS1].


THEOREM 3.1. Suppose that condition (H) is satisfied. Then the assertions below hold.

(i) Let µ ∈ P(X) be a stationary measure of P^*_k such that

\[ A_f(\alpha, \rho, R)\, \mu(X \setminus X_R) \to 0 \quad \text{as } R \to \infty \tag{3.5} \]

for all f ∈ R, α > 0, and ρ ≥ 0. Then, for any f ∈ R,

\[ P_k f \to (\mu, f) \quad \text{as } k \to \infty \text{ in } L^1(X, \mu). \tag{3.6} \]

(ii) The operator P^*_k has at most one stationary measure µ ∈ P(X) satisfying (3.5).


(1) As in the case of a single metric space (see [KS1]), (i) implies (ii). Indeed, if µ1, µ2 ∈ P(X) are two different stationary measures, then there is f ∈ R such that (µ1, f) ≠ (µ2, f). By (i),

Pkf → (µi, f ) as k→∞ in L1(X, µi), i = 1, 2.

Therefore, there is a sequence of integers ks such that

Pks f → (µi, f ) as s →∞ µi-almost everywhere. (3.7)

Let Ci ⊂ X be the set of convergence in (3.7). We have µ1(C1) = µ2(C2) = 1 and C1 ∩ C2 = ∅ and, hence, µ1 and µ2 are singular.

We now compare the measures µ1 and µ = (µ1 + µ2)/2. Applying the above argument to them, we see that µ1 and µ are singular, which contradicts the definition of µ.

(2) Thus, it suffices to establish (i). We can assume without loss of generality that (µ, f) = 0. Since ‖P_k f‖_µ is a nonincreasing sequence, the required assertion will be established if we show that for any ε > 0 there is an integer k_ε ≥ 1 such that

\[ \|P_{k_\varepsilon} f\|_\mu \le \varepsilon. \tag{3.8} \]

Let us assume that for any integer ρ ≥ 0 there is a sequence k_s(ρ) such that

\[ \sup_{\upsilon \in X_\rho} f^+_{k_s(\rho)}(\upsilon) \to 0 \quad \text{as } s \to \infty. \]

In this case, we have

\[ \int_X (P_{k_s(\rho)} f)^+ \, d\mu(\upsilon) = \int_X f^+_{k_s(\rho)} \, d\mu(\upsilon) \le \|f\|_\infty\, \mu(X \setminus X_\rho) + \sup_{\upsilon \in X_\rho} f^+_{k_s(\rho)}(\upsilon). \tag{3.9} \]

It is clear that the right-hand side of (3.9) can be made arbitrarily small by an appropriate choice of ρ and s. Moreover, it follows from the relation (µ, f) = 0 that (µ, f^+_k) = (µ, f^-_k), and therefore a subsequence of (µ, f^+_k) + (µ, f^-_k) = ‖f_k‖_µ goes to zero. What has been said obviously implies (3.8).

Similar arguments apply in the case when, for any integer ρ ≥ 0,

\[ \sup_{\upsilon \in X_\rho} f^-_{k_s(\rho)}(\upsilon) \to 0 \quad \text{as } s \to \infty, \]

where k_s(ρ) is a sequence going to +∞ with s.

(3) Thus, we can assume that inequalities (3.1) and (3.2) hold for some positive constants α and ρ. In this case, by condition (H), for any integers R ≥ 0 and k ≥ k0(α, f, ρ, R) there is l = l(k, α, f, ρ, R) ≥ 0 such that (3.3) and (3.4) are satisfied.

We now fix an arbitrary integer R ≥ 0 and, repeating the scheme applied in [KS1], construct a sequence of integers k_s = k_s(R) such that

\[ \|P_{k_s} f\|_\mu \le \varepsilon_f(R) \bigl( 1 + a_f(R) + \cdots + a_f(R)^{s-1} \bigr) \|f\|_\infty + a_f(R)^s \|f\|_\mu, \tag{3.10} \]

where s ≥ 0 and

\[ \varepsilon_f(R) = \bigl( 1 + 4 A_f(R)^{-1} \bigr) \mu(X \setminus X_R), \qquad a_f(R) = 1 - \mu(X_R) A_f(R)^{-1} < 1. \tag{3.11} \]

Here and henceforth, the dependence on α and ρ is not indicated explicitly.

The proof of (3.10) is by induction on s. For s = 0, in view of the relation P^*_{k_0} µ = µ, we have

\[ \|P_{k_0} f\|_\mu = \|f\|_\mu, \]

which coincides with (3.10) for s = 0.

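Assuming that (3.10) is established for s ≤ r, we now prove it for s = r + 1. We set k_{r+1} = k_r + l_r, where l_r = l(k_r, α, f, ρ, R) ≥ 0 is the integer entering condition (H). In view of (3.3) and (3.4), we have*

\[
\begin{aligned}
\int_X f^{\pm}_{k_r} \, d\mu = \int_X P_{l_r} f^{\pm}_{k_r} \, d\mu &= \int_{X_R} + \int_{X \setminus X_R} \\
&\le \Bigl\{ \sup_{\upsilon \in X_R} (P_{l_r} f^{\pm}_{k_r})(\upsilon) \Bigr\} \mu(X_R) + \|f\|_\infty\, \mu(X \setminus X_R) \\
&\le A_f(R) \Bigl\{ \inf_{\upsilon \in X_R} (P_{l_r} f^{\pm}_{k_r})(\upsilon) \Bigr\} \mu(X_R) + \|f\|_\infty\, \mu(X \setminus X_R).
\end{aligned}
\]

* Here and henceforth a formula involving the symbol ± is a brief notation for the two formulas corresponding to the upper and lower signs.

It follows that

\[ P_{l_r} f^{\pm}_{k_r}(\upsilon) - A_f(R)^{-1} \|f^{\pm}_{k_r}\|_\mu + A_f(R)^{-1} \|f\|_\infty\, \mu(X \setminus X_R) \ge 0 \quad \text{for } \upsilon \in X_R. \]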

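Let us estimate the expression ‖P_{k_{r+1}} f‖_µ = ‖P_{l_r} f_{k_r}‖_µ. We have

\[ \int_X |P_{l_r} f_{k_r}| \, d\mu = \int_{X_R} + \int_{X \setminus X_R} \le D_r(f^+_{k_r}) + D_r(f^-_{k_r}) + \|f\|_\infty\, \mu(X \setminus X_R), \tag{3.12} \]

where

\[ D_r(f^{\pm}_{k_r}) = \int_{X_R} \bigl| P_{l_r} f^{\pm}_{k_r} - A_f(R)^{-1} \|f^{\pm}_{k_r}\|_\mu \bigr| \, d\mu. \]

Now note that

\[ D_r(f^{\pm}_{k_r}) \le \int_{X_R} \Bigl( P_{l_r} f^{\pm}_{k_r}(\upsilon) - A_f(R)^{-1} \|f^{\pm}_{k_r}\|_\mu + A_f(R)^{-1} \|f\|_\infty\, \mu(X \setminus X_R) \Bigr) d\mu + A_f(R)^{-1} \|f\|_\infty\, \mu(X \setminus X_R). \]

This implies that

\[
\begin{aligned}
D_r(f^+_{k_r}) + D_r(f^-_{k_r}) &\le 4 A_f(R)^{-1} \|f\|_\infty\, \mu(X \setminus X_R) + \int_{X_R} \Bigl\{ P_{l_r}(f^+_{k_r} + f^-_{k_r}) - A_f(R)^{-1} \bigl( \|f^+_{k_r}\|_\mu + \|f^-_{k_r}\|_\mu \bigr) \Bigr\} d\mu \\
&\le \bigl( 1 - \mu(X_R) A_f(R)^{-1} \bigr) \|f_{k_r}\|_\mu + 4 A_f(R)^{-1} \|f\|_\infty\, \mu(X \setminus X_R).
\end{aligned}
\]

Substituting this expression into (3.12) and using the induction hypothesis, we obtain

\[ \int_X |P_{k_{r+1}} f| \, d\mu \le \bigl( 1 - \mu(X_R) A_f(R)^{-1} \bigr)^{r+1} \|f\|_\mu + \bigl( 1 + 4 A_f(R)^{-1} \bigr) \|f\|_\infty\, \mu(X \setminus X_R) \sum_{j=0}^{r} \bigl( 1 - \mu(X_R) A_f(R)^{-1} \bigr)^j, \]

which completes the proof of (3.10).

It follows from (3.10) and (3.11) that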

‖Pks (R)f ‖µ � εf (R)

1− af (R)‖f ‖∞ + af (R)s‖f ‖µ

� µ(X \ XR)Af (R){µ(XR)(1+ 4Af (R)

−1)}‖f ‖∞ +

+ af (R)s‖f ‖µ. (3.13)


The expression in the brackets on the right-hand side of (3.13) is no greater than 5. Hence, in view of (3.5), the right-hand side of (3.13) can be made arbitrarily small by a suitable choice of R and s. This completes the proof of (3.8).
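The induction behind (3.10) reduces to the scalar recursion n_{r+1} ≤ a n_r + ε ‖f‖∞ with a = a_f(R) < 1 and ε = ε_f(R). The sketch below, which is only an illustration with made-up numerical values, iterates the recursion with equality and compares the result with the closed form on the right-hand side of (3.10) and with its limit ε ‖f‖∞ / (1 − a). All function and parameter names are assumptions introduced here.

```python
def iterate_bound(a: float, eps: float, sup_norm: float, mu_norm: float, steps: int) -> float:
    """Iterate n -> a * n + eps * sup_norm, starting from n = mu_norm."""
    n = mu_norm
    for _ in range(steps):
        n = a * n + eps * sup_norm
    return n


def closed_form(a: float, eps: float, sup_norm: float, mu_norm: float, s: int) -> float:
    """Right-hand side of (3.10): eps * (1 + a + ... + a**(s-1)) * sup_norm + a**s * mu_norm."""
    geometric = sum(a ** j for j in range(s))
    return eps * geometric * sup_norm + a ** s * mu_norm


if __name__ == "__main__":
    # Hypothetical values standing in for a_f(R), eps_f(R), the sup norm and the L1 norm of f.
    a, eps, sup_norm, mu_norm = 0.9, 0.01, 2.0, 1.0
    print(iterate_bound(a, eps, sup_norm, mu_norm, 50))
    print(closed_form(a, eps, sup_norm, mu_norm, 50))
    print(eps * sup_norm / (1.0 - a))  # limiting value as s grows
```

The limiting value shows why smallness of ε_f(R) relative to 1 − a_f(R), which is what (3.5) provides, is the decisive requirement.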

3.3. SUFFICIENT CONDITIONS FOR APPLICATION OF THEOREM 3.1

Let P(k, υ, Γ), υ ∈ X, Γ ∈ B(X), be a Feller transition function. Suppose that there is a determining family R for P(X) such that R is invariant with respect to addition of a constant, and the following two conditions hold:

(H1) For any f ∈ R, R ≥ 0, and β > 0 and an arbitrary υ ∈ X_R there is an integer k0 = k0(f, R, β) ≥ 1, not depending on υ, and a Borel subset O(f, υ, R, β) ⊂ X such that

\[ |P_k f(\upsilon') - P_k f(\upsilon)| \le \beta \quad \text{for } k \ge k_0,\ \upsilon' \in O(f, \upsilon, R, \beta). \]

(H2) There is an integer ρ0 ≥ 0 such that for any ρ ≥ ρ0, R ≥ 0, β > 0, and f ∈ R there is a constant ε = ε(f, ρ, β) > 0, not depending on R, and an integer l = l(f, ρ, β, R) ≥ 1 for which

\[ P\bigl( l, \upsilon_0, O(f, \upsilon, \rho, \beta) \bigr) \ge \varepsilon \quad \text{for any } \upsilon_0 \in X_R,\ \upsilon \in X_\rho, \tag{3.14} \]

where the set O(f, υ, ρ, β) is defined in condition (H1).

THEOREM 3.2. Suppose that conditions (H1) and (H2) are satisfied. Then (H) holds for R with

\[ A_f(\alpha, \rho, R) = A_f(\alpha, \rho) = \frac{4 \|f\|_\infty}{\alpha\, \varepsilon(f, \rho, \alpha/2)}, \tag{3.15} \]

where ε(f, ρ, α/2) is the constant in condition (H2). In particular, there is at most one stationary measure µ ∈ P(X) concentrated on the union of X_R, R ≥ 0.

Proof. Let f ∈ R be an arbitrary function satisfying (3.1) and (3.2) for an integer ρ ≥ 1. We must prove that (3.3) and (3.4) hold. To simplify notation, we confine ourselves to the case of the index +.

Without loss of generality, it can be assumed that ρ ≥ ρ0, where ρ0 is the integer in condition (H2). Let υ_k ∈ X_ρ be such that

\[ f^+_k(\upsilon_k) \ge \frac{\alpha}{2}, \quad k \ge 0. \]

By condition (H1), there is an integer k0 = k0(f, ρ, α/2) ≥ 1 and a sequence of Borel sets O_k = O(f, υ_k, ρ, α/2) such that

\[ f^+_k(\upsilon') \ge \frac{\alpha}{4} \quad \text{for } \upsilon' \in O_k,\ k \ge k_0. \tag{3.16} \]


Let ε = ε(f, ρ, α/2) > 0 and l = l(f, ρ, α/2, R) ≥ 1 be the constants entering condition (H2). In view of (3.14) and (3.16), we have

\[ \sup_{\upsilon \in X_R} (P_l f^+_k)(\upsilon) \le \|f\|_\infty, \tag{3.17} \]

\[
\begin{aligned}
\inf_{\upsilon \in X_R} (P_l f^+_k)(\upsilon) &= \inf_{\upsilon \in X_R} \int_X P(l, \upsilon, d\upsilon')\, f^+_k(\upsilon') \ge \inf_{\upsilon \in X_R} \int_{O_k} P(l, \upsilon, d\upsilon')\, f^+_k(\upsilon') \\
&\ge \frac{\alpha}{4}\, P(l, \upsilon, O_k) \ge \frac{\alpha\, \varepsilon(f, \rho, \alpha/2)}{4}.
\end{aligned}
\tag{3.18}
\]

Combining (3.17) and (3.18), we arrive at the required inequality.

We now prove the assertion on the uniqueness of a stationary measure. Since the constant A_f(α, ρ, R) is in fact independent of R (see (3.15)), there is at most one stationary measure such that

\[ \mu(X \setminus X_R) \to 0 \quad \text{as } R \to \infty. \tag{3.19} \]

It remains to note that (3.19) is equivalent to the condition that the measure µ is concentrated on the union of X_R, R ≥ 0. ✷

4. Uniqueness of a Stationary Measure for the Reduced Chain

4.1. MAIN RESULT

We denote by P(k, ϒ, Γ) the transition probabilities for the family of Markov chains {ϒ^k} defined in the measurable space (F^s_N, B(F^s_N)) (see (2.19), (2.20)) and by P_k and P^*_k the corresponding Markov semi-groups. We shall also need the following metric generating the Tikhonov topology on H_N:

\[ \mathrm{dist}(\Upsilon^1, \Upsilon^2) = \sum_{l=-\infty}^{0} |\Upsilon^1_l - \Upsilon^2_l| \wedge 2^l. \]
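To make the metric concrete, here is a minimal sketch that evaluates a truncation of the series for two sequences known only up to index −L. It assumes, for simplicity, scalar components (in the paper the components are vectors and |·| is a norm), and the function name is hypothetical. Since each omitted term is at most 2^l, the neglected tail is bounded by 2^{−L}, which the function returns alongside the partial sum.

```python
def truncated_dist(seq1, seq2):
    """Partial sum of sum_{l <= 0} min(|x_l - y_l|, 2**l) over the indices available.

    seq1[j] and seq2[j] hold the components with index l = -j, j = 0, ..., L.
    Returns (partial_sum, tail_bound) with tail_bound = 2**(-L), an upper
    bound for the contribution of all indices l < -L.
    """
    L = min(len(seq1), len(seq2)) - 1
    total = 0.0
    for j in range(L + 1):
        gap = abs(seq1[j] - seq2[j])
        total += min(gap, 2.0 ** (-j))
    return total, 2.0 ** (-L)


if __name__ == "__main__":
    # Scalar toy sequences, listed from l = 0 downwards.
    print(truncated_dist([5.0, 0.1, 0.0], [0.0, 0.0, 0.0]))
```

The cap by 2^l is what makes far-in-the-past components nearly irrelevant, so that convergence in this metric is componentwise convergence.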

THEOREM 4.1. Suppose that condition (1.4) is satisfied for some s > 0. There is a constant K∗ ≥ 2M such that if a finite integer N satisfies (2.14) with K = K∗ and

\[ b_j \ne 0 \quad \text{for } j = 1, \dots, N, \tag{4.1} \]

then P^*_k has a unique stationary measure µ that is concentrated on the union of the sets F^s_N(2M,R), R ≥ 0. Moreover, for any f ∈ Cb(F^s_N) and an arbitrary integer R ≥ 0, we have

\[ P_k f(\Upsilon) \to (\mu, f) \quad \text{uniformly in } \Upsilon \in F^s_N(2M,R) \text{ as } k \to \infty. \tag{4.2} \]


Proof. The existence of a stationary measure follows from Proposition 1.11 and Theorem 2.2. To prove the uniqueness and convergence (4.2), we apply the RPF type theorem established in Section 3.

(1) We set

\[ X_R = F^s_N(2M,R), \qquad X = F^s_N. \]

Let R ⊂ Cb(F^s_N) be the set of continuous cylindrical functions on F^s_N, i.e., the set of functions f: F^s_N → ℝ for which there is an integer m ≥ 0 and a bounded continuous function F: (H_N)^{m+1} → ℝ such that

\[ f(\Upsilon) = F(v_{-m}, \psi_{-m}, \dots, v_0, \psi_0), \qquad \Upsilon = \binom{v}{\psi} \in F^s_N. \tag{4.3} \]

Clearly, R is a determining family for P(F^s_N).

It will be proved in Sections 4.2 and 4.3 that if an integer N satisfies (2.14) with sufficiently large K ≥ 2M, then the transition function P(k, ϒ, Γ) obeys conditions (H1) and (H2) in which

\[ O(f, \Upsilon, R, \beta) = \{ \Upsilon' \in F^s_N \cap F_N(K,R) : \mathrm{dist}(\Upsilon', \Upsilon) \le r \}, \tag{4.4} \]

where K is a fixed constant not depending on f, ϒ, R, β, and N, while r depends only on f, R, and β. By Theorems 3.1 and 3.2, this will imply the uniqueness of a stationary measure concentrated on the union of F^s_N(2M,R), R ≥ 0, and also convergence (4.2) in L1(X, µ)-norm for any f ∈ R. Moreover, as is shown in Proposition 4.4, the sequence formed of the restrictions of the functions P_k f to X_R is uniformly equicontinuous for any integer R ≥ 0. Therefore, by the Arzelà–Ascoli theorem, a subsequence P_{k_l} f converges uniformly on any X_R. In view of the L1-convergence, the limit is uniquely determined, and hence the whole sequence converges uniformly to (µ, f).

(2) We now show that (4.2) holds for any function f ∈ Cb(F^s_N). Since X_ρ is a compact subset of X, the restriction of f to X_ρ is uniformly continuous for any integer ρ ≥ 0. Let us denote by f_ρ an arbitrary uniformly continuous extension of f|_{X_ρ} to H_N such that ‖f_ρ‖∞ ≤ 3‖f‖∞. For instance, we can take

\[ f_\rho(\Upsilon) = \inf_{\Upsilon' \in X_\rho} \bigl\{ f(\Upsilon') + \omega_\rho\bigl( d(\Upsilon, \Upsilon') \bigr) \bigr\}, \]

where ω_ρ(r), r ≥ 0, is the modulus of continuity of f|_{X_ρ}:

\[ \omega_\rho(r) = \sup \bigl\{ |f(\Upsilon^1) - f(\Upsilon^2)| : \Upsilon^1, \Upsilon^2 \in X_\rho,\ d(\Upsilon^1, \Upsilon^2) \le r \bigr\}. \]

Let us denote by J_L: H_N → H_N the operator taking each ϒ = (ϒ_l, l ≤ 0) to (. . . , 0, ϒ_{−L}, . . . , ϒ_0). We define the function

\[ f_{\rho L}(\Upsilon) = f_\rho(J_L \Upsilon), \qquad \Upsilon \in H_N. \]


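Clearly, we have f_{ρL} ∈ R. Thus, convergence (4.2) holds for f = f_{ρL}.

Let us fix an arbitrary R ≥ 0 and write

\[ |P_k f(\Upsilon) - (\mu, f)| \le |P_k f_{\rho L}(\Upsilon) - (\mu, f_{\rho L})| + |(\mu, f - f_{\rho L})| + |P_k f(\Upsilon) - P_k f_{\rho L}(\Upsilon)|. \tag{4.5} \]

As was mentioned above,

\[ \sup_{\Upsilon \in X_R} |P_k f_{\rho L}(\Upsilon) - (\mu, f_{\rho L})| := \varepsilon_1(k, L, \rho), \tag{4.6} \]

where ε1(k, L, ρ) → 0 as k → ∞ for any fixed L ≥ 1 and ρ ≥ 0. Furthermore, it is clear that

\[ \sup_{\Upsilon \in X_\rho} d(\Upsilon, J_L \Upsilon) \to 0 \quad \text{as } L \to \infty \text{ for any } \rho \ge 0. \]

Therefore, in view of the uniform continuity of f_ρ, we have

\[ \sup_{\Upsilon \in X_\rho} |f_{\rho L}(\Upsilon) - f(\Upsilon)| = \sup_{\Upsilon \in X_\rho} |f_\rho(J_L \Upsilon) - f_\rho(\Upsilon)| \le \varepsilon_2 = \varepsilon_2(L, \rho), \]

where ε2(L, ρ) → 0 as L → ∞ for any ρ ≥ 1. It follows that

\[
\begin{aligned}
|(\mu, f - f_{\rho L})| &\le \int_X |f - f_{\rho L}| \, d\mu \le \int_{X_\rho} + \int_{X \setminus X_\rho} \\
&\le \int_{X_\rho} |f - f_{\rho L}| \, d\mu + 4 \|f\|_\infty\, \mu(X \setminus X_\rho) \\
&\le \varepsilon_2(L, \rho) + 4 \|f\|_\infty\, \mu(X \setminus X_\rho).
\end{aligned}
\tag{4.7}
\]

Finally, to estimate the third term on the right-hand side of (4.5), we note that

\[
\begin{aligned}
|P_k f(\Upsilon) - P_k f_{\rho L}(\Upsilon)| &\le \int_X P(k, \Upsilon, d\Upsilon')\, |f(\Upsilon') - f_{\rho L}(\Upsilon')| \\
&\le \int_{X_\rho} + \int_{X \setminus X_\rho} \le \varepsilon_2(L, \rho) + 4 \|f\|_\infty\, P(k, \Upsilon, X \setminus X_\rho).
\end{aligned}
\tag{4.8}
\]

Combining (4.5), (4.6), (4.7), and (4.8), we derive

\[ |P_k f(\Upsilon) - (\mu, f)| \le \varepsilon_1(k, L, \rho) + \varepsilon_2(L, \rho) + 4 \|f\|_\infty \bigl( P(k, \Upsilon, X \setminus X_\rho) + \mu(X \setminus X_\rho) \bigr). \tag{4.9} \]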

To conclude that the right-hand side of (4.9) goes to zero, we need the lemma below. We formulate two estimates: the first is used here, and the second will be needed in the next subsection.

LEMMA 4.2. For any integer R ≥ 0 and any m ≥ 1 there is a constant C_{Rm} > 0 such that

\[ P\bigl( k, \Upsilon, F^s_N \setminus F^s_N(2M, \rho) \bigr) \le C_{Rm}\, \rho^{-m} \quad \text{for } k \ge \rho \ge 1, \tag{4.10} \]

\[ P\bigl( k, \Upsilon, F^s_N \setminus F^s_N(3M, \rho) \bigr) \le C_{Rm}\, \rho^{-m} \quad \text{for } k, \rho \ge 1, \tag{4.11} \]

where ϒ ∈ F^s_N(2M,R).

Let us fix an arbitrary ε > 0. In view of (4.10) and the fact that µ is concentrated on ⋃_{ρ≥0} X_ρ, there is an integer ρ ≥ 0 such that the third term on the right-hand side of (4.9) is less than ε for k ≥ ρ. We then choose integers L ≥ 1 and k0 ≥ ρ so large that ε1(k, L, ρ) ≤ ε for k ≥ k0 and ε2(L, ρ) ≤ ε. Combining all these estimates, we see that (4.9) does not exceed 4ε for k ≥ k0. Thus, to complete the proof of (4.2), it remains to establish Lemma 4.2. ✷

Proof of Lemma 4.2. Let us fix an arbitrary m ≥ 1. It is clear that it suffices to establish (4.10) and (4.11) for sufficiently large ρ. We fix an arbitrary ϒ ∈ F^s_N(2M,R) and denote by U = (u, η) the element of F^s(2M,R) such that �N U = ϒ. Let (u_l, l ≥ 0) be the solution of the problem (1.27), (1.28) with the initial function u_0 = v_0 + W_0(ϒ) (note that |u_0| ≤ (2MR)^{1/2}) and let a_l := |u_l|^2 + ‖η_k‖^2_s. Application of Proposition 1.5 with k_− = 0, k_+ = k_0 = k and Remark 1.6 to the solution u_l, 0 ≤ l ≤ k, shows that, with probability no less than ε_{ρm} := 1 − C_m ρ^{−m}, we have

\[ (k - T + 1) \langle a_l \rangle_T^k \le 2M \bigl( (k - T) \vee \rho + 1 \bigr), \qquad 0 \le T \le k. \tag{4.12} \]

Since U ∈ F^s(2M,R), we conclude that if ρ ≥ R, then

\[ \langle a_l \rangle_T^0 \le 2M \quad \text{for } T \le -R. \tag{4.13} \]

Combining (4.12) and (4.13), we see that for T ≤ 0, with probability ≥ ε_{ρm},

\[
\begin{aligned}
\langle a_l \rangle_T^k &= (|T| + k + 1)^{-1} \bigl( (|T| + 1) \langle a_l \rangle_T^0 + k \langle a_l \rangle_1^k \bigr) \\
&\le M (|T| + k + 1)^{-1} \bigl( 2 (R \vee |T| + 1) + k \vee \rho \bigr) \\
&\le \begin{cases} 2M & \text{if } k \ge \rho \ge 2R, \\ 3M & \text{if } |T| + k \ge \rho \ge R. \end{cases}
\end{aligned}
\]

We have thus proved that

\[ P\bigl( k, \Upsilon, F^s_N(2M, \rho) \bigr) \ge \varepsilon_{\rho m} = 1 - C_m \rho^{-m}, \qquad k \ge \rho \ge 2R, \]

\[ P\bigl( k, \Upsilon, F^s_N(3M, \rho) \bigr) \ge \varepsilon_{\rho m} = 1 - C_m \rho^{-m}, \qquad k \ge 1,\ \rho \ge R. \]

This implies the required inequalities (4.10) and (4.11). ✷

In Section 5, we shall need a corollary of Theorem 4.1. Let us recall that {ϒ^k} is isomorphic to the family of Markov chains {θ^k} defined by (2.17), (2.18). We denote by P_k and P^*_k the Markov semigroups for {θ^k}.


COROLLARY 4.3. Under the conditions of Theorem 4.1, the Markov semigroup P^*_k has a unique stationary measure λ ∈ P(F^s_∞) that is concentrated on the union of F^s_∞(2M,R), R ≥ 0. Moreover, for any f ∈ Cb(F^s_∞) we have

Pkf (u) → (λ, f ) uniformly in u ∈ F^s_∞(2M,R) as k → ∞.

4.2. CHECKING CONDITION (H1)

For any integer m ≥ 0, let R_m be the set of those f ∈ R for which the corresponding function F in (4.3) is defined on (H_N)^{m+1}. We recall that the set O(f, ϒ, R, β) is defined in (4.4).

PROPOSITION 4.4. Let the conditions of Theorem 4.1 be fulfilled and let K ≥ 2M be an arbitrary constant. Then for any integer R ≥ 0 and any β > 0 there is r = r(R, β, K) > 0 satisfying the following property: if f ∈ R_m for an integer m ≥ 1, then

\[ |P_k f(\Upsilon^1) - P_k f(\Upsilon^2)| \le \beta \|f\|_\infty \quad \text{for } k \ge m + 1, \]

where ϒ^1 ∈ F^s_N(2M,R), ϒ^2 ∈ F^s_N ∩ F_N(K,R), and dist(ϒ^1, ϒ^2) ≤ r. In particular, the sequence P_k f|_{X_R}, k ≥ m + 1, is uniformly equicontinuous for any R ≥ 0, and condition (H1) holds with any domain O(f, ϒ, R, β) of the form (4.4).

Proof. (1) Let dv be the Lebesgue measure on the finite-dimensional space H_N and let dα(ψ) be the distribution of the random variables ψ_k on H^{s⊥}_N. We denote by D(v), v ∈ H_N, the density of the random variables ϕ_k with respect to dv. (It follows from (1.3) and the conditions imposed on ξ_{jk} that D(v) = ∏_{j=1}^{N} p_j(b_j x_j), where v = (x_1, . . . , x_N) ∈ H_N.) Direct verification shows that for f ∈ R_m and k ≥ m + 1 we have (cf. [KS1, Section 1.3])

\[ P_k f(\Upsilon) = \int_{(H^s_N)^k} F(\Upsilon_{k-m}, \dots, \Upsilon_k)\, D_k(\Upsilon; \Upsilon^k)\, P_k(d\Upsilon^k), \tag{4.14} \]

where ϒ^k = (ϒ_1, . . . , ϒ_k) and P_k(dϒ^k) = dv_1 · · · dv_k dα(ψ_1) · · · dα(ψ_k),

\[ D_k(\Upsilon; \Upsilon_1, \dots, \Upsilon_k) = \prod_{l=1}^{k} D\bigl( v_l - T_0(\Upsilon, \Upsilon_1, \dots, \Upsilon_{l-1}) \bigr), \tag{4.15} \]

and T_0 is the first component of the operator T defined in (2.21), that is, T_0(ϒ) = P_N S(v_0 + W_0(v, ψ)).

(2) Now let ϒ^1 ∈ F^s_N(2M,R) and ϒ^2 ∈ F^s_N ∩ F_N(K,R). For any k ≥ 1, we denote by V_k = V_k(ϒ^1, ϒ^2) the doubled variational distance between the two measures on (H^s_N)^k defined by the densities D_k(ϒ^i, ϒ^k), i = 1, 2. In other words,

\[ V_k = \int_{(H^s_N)^k} \bigl| D_k(\Upsilon^1, \Upsilon^k) - D_k(\Upsilon^2, \Upsilon^k) \bigr| \, P_k(d\Upsilon^k). \]

Since ‖F‖∞ = ‖f‖∞, it follows from (4.14) and (4.15) that

\[ |P_k f(\Upsilon^1) - P_k f(\Upsilon^2)| \le \|f\|_\infty V_k. \]

Thus, it is sufficient to estimate Vk. To this end, we note that

Vk � Vk−1 +∫(HsN )

k

Dk−1(ϒ1, ϒk−1)�k(ϒ

1,ϒ2;ϒk) Pk(dϒk)=: Vk−1 + Ik, (4.16)

where

�k(ϒ1,ϒ2;ϒk) =

∣∣D(vk − T0(ϒ2, ϒk−1)

)−D(vk − T0(ϒ1, ϒk−1)

)∣∣.We now derive an estimate for Ik = Ik(ϒ1,ϒ2).

(3) Let us fix arbitrary K � 2M and B � 1. To estimate Ik, we represent the do-main of integration (HsN )

k as the union of a sequence of nonintersecting subsets oneach of which the expression �k(ϒ

1,ϒ2;ϒk) admits a uniform estimate. Namely,for any integer ρ � R we set

Ak(ρ) = Ak(ρ) \ Ak(ρ − 1),

where Ak(R− 1) = ∅ and Ak(ρ) is the set of those (ϒ1, . . . , ϒk−1) ∈ (HsN )k−1 forwhich (ϒ1, ϒ1, . . . , ϒk−1) ∈ FsN (3M,ρ). It is easy to see that the union of Ak(ρ),ρ � R, coincides with (HsN )

k−1 for any k � 1. Let us write the integral Ik as

Ik =∞∑ρ=RIkρ, (4.17)

where

Ikρ = Ikρ(ϒ1,ϒ2)

=∫Ak(ρ)×HsN

Dk−1(ϒ1, ϒk−1)�k(ϒ

1,ϒ2;ϒk) Pk(dϒk). (4.18)

By the mean value theorem, we have

�k(ϒ1,ϒ2;ϒk) � Qk(vk)

∣∣T0(ϒ1, ϒk−1)− T0(ϒ

2, ϒk−1)∣∣, (4.19)

where

Qk(vk) =∫ 1

0

∣∣∇D(vk − θT0(ϒ1, ϒk−1)− (1− θ)T0(ϒ

2, ϒk−1))∣∣ dθ.

It is clear that∫HsN

Qk(vk) P1(dϒk) � Q,


where Q > 0 is a constant not depending on ϒ1, ϒ2 and ϒk−1. Therefore, by(4.17)–(4.19), we obtain

Ik � Q∞∑ρ=R

∫Ak(ρ)

Dk−1(ϒ1, ϒk−1)×

× ∣∣T0(ϒ1, ϒk−1)− T0(ϒ

2, ϒk−1)∣∣ Pk−1(dϒk−1)

� Q∞∑ρ=Rhkρ

∫Ak(ρ)

Dk−1(ϒ1, ϒk−1)Pk−1(dϒk−1)

� Q∞∑ρ=Rhkρ P

(k − 1,ϒ1,Ak(ρ)

), (4.20)

where Ak(ρ) is the set of elements in FsN of the form (ϒ1, ϒk−1) with ϒk−1 ∈Ak(ρ), and

hkρ = hkρ(ϒ1,ϒ2) = supϒk−1∈Ak(ρ)

∣∣T0(ϒ1, ϒk−1)− T0(ϒ

2, ϒk−1)∣∣.

(4) We now estimate hkρ . To this end, we need the following lemma.

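LEMMA 4.5. There is a constant C > 0 such that for any K ≥ 2M and any integer ρ ≥ 0 we have

\[ |T_0(\Upsilon)| \le \bigl( K(\rho + 1) \bigr)^{1/2}, \tag{4.21} \]

\[ |T_0(\Upsilon^1) - T_0(\Upsilon^2)| \le C \sum_{q=-\infty}^{0} \bigl( C \alpha_N^{-1/2} \bigr)^{-q} e^{CK(|q| \vee \rho + 1)}\, |\Upsilon^1_q - \Upsilon^2_q|, \tag{4.22} \]

where ϒ, ϒ^1, ϒ^2 ∈ F_N(K, ρ) ∩ F^s_N(K).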

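Taking this assertion for granted, let us complete the proof of the proposition. By definition, we have (ϒ^1, ϒ^{k−1}) ∈ F^s_N(3M, ρ) ∩ F^s_N for ϒ^{k−1} ∈ A_k(ρ). It follows that (ϒ^2, ϒ^{k−1}) ∈ F_N(3K, ρ) ∩ F^s_N. Therefore, in view of inequality (4.21) with K replaced by K_1 := 3K, we have

\[ h_{k\rho} \le \sup_{\Upsilon^{k-1} \in A_k(\rho)} \bigl\{ |T_0(\Upsilon^1, \Upsilon^{k-1})| + |T_0(\Upsilon^2, \Upsilon^{k-1})| \bigr\} \le 2 \bigl( K_1 (\rho + 1) \bigr)^{1/2}. \tag{4.23} \]

On the other hand, inequality (4.22) implies that

\[
\begin{aligned}
h_{k\rho} &\le C \sum_{q=-\infty}^{1-k} \bigl( C \alpha_N^{-1/2} \bigr)^{-q} e^{C K_1 (|q| \vee \rho + 1)}\, |\Upsilon^1_{q+k-1} - \Upsilon^2_{q+k-1}| \\
&\le C \sum_{q=-\infty}^{0} \bigl( C \alpha_N^{-1/2} \bigr)^{-q+k-1} e^{C K_1 (|q| + k) + C K_1 \rho}\, |\Upsilon^1_q - \Upsilon^2_q| \\
&\le C_1(R)\, 2^{-k} e^{C K_1 \rho}\, d,
\end{aligned}
\tag{4.24}
\]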

where d = d(ϒ^1, ϒ^2), C_1(R) > 0 is a constant depending only on R, and the constant K in (2.14) is chosen to be so large that 2(CK_1 + log C + log 2) ≤ log α_N. Note that the third inequality in (4.24) uses the estimate

\[ d(\Upsilon^1, \Upsilon^2) \ge C'(R) \sum_{q=-\infty}^{0} 2^q\, |\Upsilon^1_q - \Upsilon^2_q|, \qquad \Upsilon^1, \Upsilon^2 \in F_N(K,R). \]

Combining (4.23) and (4.24), we derive

\[ h_{k\rho} \le \bigl( C_1(R)\, 2^{-k} e^{C K_1 \rho}\, d \bigr) \wedge \bigl( 2 K_1^{1/2} (\rho + 1)^{1/2} \bigr). \tag{4.25} \]

(5) We can now easily complete the proof of the proposition. We wish to show that V_k ≤ β if d(ϒ^1, ϒ^2) ≤ r, where r = r(β) > 0 is sufficiently small. In view of inequality (4.11) with m = 3 and the inclusion A_k(ρ) ⊂ F^s_N \ F^s_N(3M, ρ − 1) for ρ ≥ R + 1, we have

\[ P\bigl( k - 1, \Upsilon^1, A_k(\rho) \bigr) \le C_{R3}\, \rho^{-3} \tag{4.26} \]

for all k ≥ 1, ρ ≥ R and ϒ^1 ∈ F^s_N(2M,R). Substituting (4.25), (4.26) and (4.20) into (4.16) and iterating the resulting inequality, we arrive at

Vk � CR3Q

k∑j=1

∞∑ρ=Rρ−3

{(C1(R) 2

−k eCK1ρ d) ∧ (2K1/2

1 (ρ + 1)1/2)}

� <(d) := C2

∞∑j=1

∞∑ρ=Rρ−3

{(2−j Dρ d) ∧ ρ1/2

},

where C2 and D are positive constants. Thus, the expression Vk can be estimated by the double series <(d) vanishing for d = 0. By the Lebesgue theorem on dominated convergence, the required assertion will be established if we show that the series converges uniformly in d ∈ [0, 1]. Since all the terms in the sum <(d) are nondecreasing functions of d, it suffices to prove the convergence for d = 1.

To this end, we divide the domain of summation (i. e., j � 1, ρ � R) into twononintersecting sets:

S1 ={(j, ρ) : 2−jDρ � ρ1/22−j/2

},

S2 ={(j, ρ) : 2−jDρ > ρ1/22−j/2

}.

Let <1 and <2 be the sums corresponding to S1 and S2, respectively. Clearly,

<1 � C2

∑(j,ρ)∈S1

ρ−5/22−j/2 <∞.

On the other hand, if (j, ρ) ∈ S2, then j � cρ, where c > 0 depends only on D.Therefore,

<2 � C2

∞∑ρ=R

∑j�cρρ−5/2 � C2c

∞∑ρ=Rρ−3/2 <∞.

Thus, it remains to establish Lemma 4.5. ✷
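The behaviour of the dominating double series can be explored numerically. The sketch below evaluates partial sums of the terms ρ^{−3} {(2^{−j} D^ρ d) ∧ ρ^{1/2}} over j ≥ 1 and ρ ≥ R, with the constant C2 omitted. The values of D and R, the truncation levels and the function names are assumptions made only for illustration. The minimum is taken on the logarithmic scale so that D^ρ never overflows.

```python
import math


def series_term(j: int, rho: int, d: float, D: float) -> float:
    """Term rho**-3 * min(2**-j * D**rho * d, rho**0.5), computed via logarithms."""
    if d == 0.0:
        return 0.0
    log_first = -j * math.log(2.0) + rho * math.log(D) + math.log(d)
    log_second = 0.5 * math.log(rho)
    return rho ** -3.0 * math.exp(min(log_first, log_second))


def partial_sum(d: float, D: float = 2.0, R: int = 1, j_max: int = 200, rho_max: int = 200) -> float:
    """Partial sum over 1 <= j <= j_max and R <= rho <= rho_max (requires R >= 1)."""
    return sum(
        series_term(j, rho, d, D)
        for rho in range(R, rho_max + 1)
        for j in range(1, j_max + 1)
    )


if __name__ == "__main__":
    for d in (0.0, 1e-6, 1e-3, 1.0):
        print(d, partial_sum(d))
```

The printed values vanish at d = 0 and increase with d, in line with the monotonicity used in the proof to reduce the question to d = 1.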


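Proof of Lemma 4.5. Inequality (4.21) is a simple consequence of the definition of T_0 and F_N(K, ρ):

\[ |T_0(\Upsilon)| = |S(v_0 + W_0(\Upsilon))| \le |u| \le \bigl( K(\rho + 1) \bigr)^{1/2}, \qquad u = v_0 + W_0(\Upsilon). \]

Let us prove (4.22). Inequality (2.22) with t = 1 implies that

\[
\begin{aligned}
|T_0(\Upsilon^1) - T_0(\Upsilon^2)| &= \bigl| S(v^1_0 + W_0(\Upsilon^1)) - S(v^2_0 + W_0(\Upsilon^2)) \bigr| \\
&\le \bigl( |v^1_0 - v^2_0| + |W_0(\Upsilon^1) - W_0(\Upsilon^2)| \bigr) \exp\Bigl\{ C_1 \int_0^1 \|S_t(u^1)\|^2 \, dt \Bigr\},
\end{aligned}
\tag{4.27}
\]

where u^i = v^i_0 + W_0(ϒ^i), i = 1, 2. In view of (2.7), (2.9), (2.3) and the definition of the space F_N(K, ρ), we have

\[
\begin{aligned}
|W_0(\Upsilon^1) - W_0(\Upsilon^2)| &\le |\psi^1_0 - \psi^2_0| + \sum_{q=-\infty}^{-1} \bigl( C_2 \alpha_N^{-1/2} \bigr)^{-q} \exp\Bigl\{ C_2 |q| \bigl( \langle |u^1|^2 \rangle_q^{-1} + \langle |u^2|^2 \rangle_q^{-1} \bigr) \Bigr\} |\Upsilon^1_q - \Upsilon^2_q| \\
&\le |\psi^1_0 - \psi^2_0| + \sum_{q=-\infty}^{-1} \bigl( C_2 \alpha_N^{-1/2} \bigr)^{-q} \exp\bigl\{ 2 C_2 K (|q| \vee \rho + 1) \bigr\} |\Upsilon^1_q - \Upsilon^2_q|,
\end{aligned}
\tag{4.28}
\]

where ϒ^1, ϒ^2 ∈ F_N(K, ρ) ∩ F^s_N(K) and u^i = G(ϒ^i), i = 1, 2. Moreover, by inequality (2.4),

\[ \int_0^1 \|S_t(u^1)\|^2 \, dt \le \tfrac{1}{2} |u^1|^2 \le \tfrac{1}{2} K (\rho + 1) \le \tfrac{1}{2} K (|q| \vee \rho + 1), \qquad q \le 0. \tag{4.29} \]

Combining (4.27)–(4.29), we derive (4.22).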

4.3. CHECKING CONDITION (H2)

We recall that B_X(ϒ, r) denotes the ball of radius r in X centred at ϒ.

PROPOSITION 4.6. Under the conditions of Theorem 4.1, there is an integer ρ0 ≥ 1 and positive constants K and C such that if ρ ≥ ρ0, then the following assertions hold:

(i) For any R ≥ 0 there is an integer l^*_1 = l^*_1(R) ≥ 1 such that

\[ P\bigl( l_1, \Upsilon^0, F^s_N(2M, \rho_0) \bigr) \ge 1/2 \quad \text{for any } l_1 \ge l^*_1,\ \Upsilon^0 \in F^s_N(2M,R). \tag{4.30} \]

(ii) For any r > 0, any integer ρ ≥ ρ0, and an arbitrary ϒ ∈ F^s_N(2M, ρ) there is ε = ε(ρ, r) > 0 and an integer l_2 = l_2(ϒ, ρ, r) ≥ 1 such that

\[ P\bigl( l_2, \Upsilon^0, B_X(\Upsilon, r) \cap F_N(K, \rho) \bigr) \ge \varepsilon \quad \text{for any } \Upsilon^0 \in F^s_N(2M, \rho_0). \tag{4.31} \]

Moreover, there is an integer l^*_2 = l^*_2(ρ, r) ≥ 1 such that l_2(ϒ, ρ, r) ≤ l^*_2 for all ϒ ∈ F^s_N(2M, ρ).

(iii) The transition function P(k, ϒ, Γ) satisfies condition (H2) in which the set O(f, ϒ, R, β) has the form (4.4).

Proof. We first show that (i) and (ii) imply (iii). Indeed, let us fix any r > 0 and any integers R ≥ 0 and ρ ≥ ρ0. Choosing l = l^*_1(R) + l^*_2(ρ, r) and l_1 = l − l_2(ϒ, ρ, r), from (4.30), (4.31), and the Chapman–Kolmogorov relation, we derive

\[ P\bigl( l, \Upsilon^0, O(\Upsilon, \rho, r) \bigr) \ge \int_{X_{\rho_0}} P(l_1, \Upsilon^0, d\Upsilon')\, P\bigl( l_2, \Upsilon', O(\Upsilon, \rho, r) \bigr) \ge \varepsilon(\rho, r)/2, \]

where O(ϒ, ρ, r) = B_X(ϒ, r) ∩ F_N(K, ρ). This proves the required assertion. We now turn to the proof of (i) and (ii).
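The two-stage lower bound obtained from the Chapman–Kolmogorov relation can be illustrated on a finite-state Markov chain, where transition probabilities are matrix powers. In the sketch below the chain, the "good" intermediate set (playing the role of X_{ρ0}) and the target set (playing the role of O(ϒ, ρ, r)) are all hypothetical toy data, and the function names are assumptions.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_pow(p, steps):
    """steps-th power of the transition matrix p (identity for steps = 0)."""
    n = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(steps):
        result = mat_mul(result, p)
    return result


def hit_probability(p, x0, target, steps):
    """Probability of being in the set target after the given number of steps from x0."""
    row = mat_pow(p, steps)[x0]
    return sum(row[z] for z in target)


def two_stage_lower_bound(p, x0, good, target, l1, l2):
    """P(l1, x0, good) times the worst-case P(l2, y, target) over y in good."""
    reach_good = hit_probability(p, x0, good, l1)
    worst_hit = min(hit_probability(p, y, target, l2) for y in good)
    return reach_good * worst_hit


if __name__ == "__main__":
    p = [[0.5, 0.5, 0.0], [0.25, 0.5, 0.25], [0.0, 0.5, 0.5]]
    print(hit_probability(p, 0, {2}, 4))
    print(two_stage_lower_bound(p, 0, {0, 1}, {2}, 2, 2))
```

In the proof the first factor is bounded below by 1/2 through (4.30) and the second by ε(ρ, r) through (4.31), which yields the bound ε(ρ, r)/2.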

Proof of (i). For ϒ^0 ∈ F^s_N(2M,R), we set U = (u, η) = �N^{−1} ϒ^0 ∈ F^s(2M,R) (see Section 2.2). We denote by (u_l, l ≥ 0) the trajectory of the RDS (2.11) which starts from u_0 (the zeroth component of u) and set u^k = (u_l, l ≤ k) and a_l = |u_l|^2 + ‖η_k‖^2_s. Since |u_0|^2 ≤ 2M(R + 1), inequality (1.11) implies that there is an integer L_1 = L_1(R) ≥ 1 such that

\[ E\,|u_k| \le \begin{cases} C_1 (R + 1), & 1 \le k \le L_1 - 1, \\ C_1, & k \ge L_1, \end{cases} \tag{4.32} \]

where the constant C_1 > 0 does not depend on R ≥ 0. Let us fix an arbitrary integer R_1 ≥ C_1(R + 1) and estimate the probability of the event

\[ |u_k| \le R_1, \qquad \|\eta_k\|_s \le R_1, \qquad k = 1, \dots, L_1 - 1. \tag{4.33} \]

In view of (4.32), (1.5) and the Chebyshev inequality, we have

\[ P\{ \text{(4.33) holds} \} \ge 1 - \sum_{k=1}^{L_1 - 1} \bigl( P\{ |u_k| \ge R_1 \} + P\{ \|\eta_k\|_s \ge R_1 \} \bigr) \ge 1 - p_1, \]

where p_1 = p_1(R, R_1) → 0 as R_1 → ∞ for any fixed R. Furthermore, let us fix sufficiently large integers ρ0 ≥ 1 and L_0 ≥ ρ0, set l^*_1 := L_1 + L_0, and take an arbitrary l_1 ≥ l^*_1. Applying Proposition 1.5 to the solution u_k, k_− ≤ k ≤ k_+, where k_− = L_1 and k_+ = k_0 = l_1, we conclude that there is a constant C_0 > 0, not depending on R, such that with probability no less than p_0 = 1 − C_0 ρ0^{−1},

\[ \langle a_l \rangle_T^{l_1} \le M, \qquad L_1 \le T \le l_1 - \rho_0. \tag{4.34} \]

Hence, we have shown that

\[ P\{ \text{(4.33) and (4.34) hold} \} \ge p := p_0 + p_1 - 1. \tag{4.35} \]

It is a matter of direct verification to show that if L_0 ≥ 2(R_1^2 L_1 M^{−1} + R), then inequalities (4.33) and (4.34) imply that ((u_k, η_k), k ≤ l_1) ∈ F^s_N(2M, ρ0). In view of (4.35), it follows that for any U ∈ F^s(2M,R) we have

\[ P\Bigl\{ \Bigl( \binom{u_k}{\eta_k},\ k \le l_1 \Bigr) \in F^s(2M, \rho_0) \Bigr\} \ge 1 - p_1(R, R_1) - C_0 \rho_0^{-1}. \tag{4.36} \]

It remains to note that if ρ0 ≥ 1 is so large that C_0 ρ0^{−1} ≤ 1/4, then for any fixed R ≥ 0 we can choose R_1 ≥ R such that the right-hand side of (4.36) is no less than 1/2. This completes the proof of (4.30).

Proof of (ii). We shall need the following elementary lemma.

LEMMA 4.7. Let (x_l, l ≤ 0) be a sequence of nonnegative numbers such that

\[ \sum_{l=T}^{0} x_l \le C (|T| + 1) \quad \text{for } T \le -\rho, \tag{4.37} \]

where ρ ≥ 0 is an integer and C > 0 is a constant not depending on T. Then every integer interval � = [t1, t2] such that t1 ≤ −ρ and t2 ≤ 0 contains an integer point p such that

\[ \frac{1}{q - p + 1} \sum_{l=p}^{q} x_l \le C_0 := \frac{C |t_1|}{t_2 - t_1 + 1} \quad \text{for } p \le q \le 0. \]

Proof. Assuming the contrary, for each p ∈ � we can find an integer m(p), p ≤ m(p) ≤ 0, such that

\[ \sum_{l=p}^{m(p)} x_l > C_0 \bigl( m(p) - p + 1 \bigr). \tag{4.38} \]

Let us define a finite sequence of integers p_1, p_2, . . . , p_n by the following rule: p_1 = t1 and p_j = m(p_{j−1}) + 1 if j ≥ 2 and m(p_{j−1}) ≤ t2. Setting �j = [p_j, m(p_j)] and using inequality (4.38), we derive

\[ \sum_{l=t_1}^{0} x_l \ge \sum_{j=1}^{n} \sum_{l=p_j}^{m(p_j)} x_l > C_0 \sum_{j=1}^{n} \bigl( m(p_j) - p_j + 1 \bigr) \ge C_0 (t_2 - t_1 + 1). \]

This contradicts inequality (4.37) with T = t1. ✷
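The statement of the lemma is easy to test on concrete data. The sketch below searches an integer interval [t1, t2] for a point p such that every average of the sequence over [p, q], p ≤ q ≤ 0, stays below a given threshold. The threshold is passed as a parameter rather than fixed to the constant C0 of the lemma, and the function name and the list layout of the sequence are assumptions made for this illustration.

```python
def find_good_start(x, t1, t2, threshold):
    """Return the first p in [t1, t2] such that, for every q with p <= q <= 0,
    the average of x_l over l = p, ..., q is at most threshold.
    Return None if no such p exists.

    x is a list holding x_l for l = -(len(x) - 1), ..., 0, so x[-1] is x_0.
    """
    offset = len(x) - 1  # the list index of x_l is l + offset
    for p in range(t1, t2 + 1):
        running = 0.0
        ok = True
        for q in range(p, 1):  # q = p, ..., 0
            running += x[q + offset]
            if running > threshold * (q - p + 1):
                ok = False
                break
        if ok:
            return p
    return None


if __name__ == "__main__":
    # A sequence on l = -10, ..., 0 with all of its mass at l = -10.
    x = [11.0] + [0.0] * 10
    print(find_good_start(x, -10, -5, 11.0 / 6.0))
```

In the example the spike at l = −10 rules out p = −10, and the search moves on to the next admissible point, mirroring the way the proof skips over the blocks [p_j, m(p_j)].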


(1) To establish (4.31), we regard (2.20) as an RDS in X (rather than a Markov chain), and using the isomorphism of (2.18) and (2.20), pass from a random trajectory {ϒ^k} to {u^k = G(ϒ^k)}. More exactly, for ϒ and ϒ^0 as in (4.31), let

U = (u, η) = �N^{−1} ϒ ∈ F^s(2M, ρ), U^0 = (u^0, η^0) = �N^{−1} ϒ^0 ∈ F^s(2M, ρ0).

We set F_L = F_L(K, ρ) ∩ F^s_L, where L = N or L = ∞, and consider the restriction of H: F^s_∞ → F^s_N to F_∞. In view of (2.16), inequality (2.22) implies that the mapping H: F_∞ → F_N is uniformly Lipschitz with a Lipschitz constant d not depending on N. Therefore, inequality (4.31) will be proved if we show that

P{u^{l_2} ∈ F_∞, dist(u^{l_2}, u) ≤ r/d} ≥ ε, (4.39)

where u^k, k ≥ 0, is the random trajectory of (2.18) starting from u^0.

(2) We fix arbitrary ρ ≥ ρ0 and r > 0. Let B > 0 be a sufficiently large constant which will be chosen later. Let an integer T_1 = T_1(r, B) ≥ 1 and a positive constant δ_1 = δ_1(r, B) ≤ 1 be such that dist(u′, 0) ≤ r/d for any element u′ ∈ F^s_∞ whose components satisfy the inequalities

\[ |u'_j| \le B e^{-(j + T_1 - 1)} + \delta_1, \qquad 1 - T_1 \le j \le 0. \tag{4.40} \]

Since ⟨|u|^2⟩_q^0 ≤ 2M for q ≤ −ρ, the sequence x_l = |u_l|^2, l ≤ 0, satisfies the conditions of Lemma 4.7 with C = 2M. Let T_2 = T_2(ρ, r, B) be the smallest even integer exceeding (2T_1) ∨ ρ. Applying Lemma 4.7 with t1 = −T_2 and t2 = −T_2/2, we find an integer T = T(ρ, r, B), T_1 ≤ T ≤ T_2, such that

\[ \langle |u|^2 \rangle_{-T}^{-T+l} \le 4M \quad \text{for } 0 \le l \le T. \tag{4.41} \]

We claim that there is a deterministic trajectory ul = (u0, u1, . . . , ul), l = 1, . . . ,T , for (2.18) that corresponds to a control ηl =

(ϕlψl

) ∈ H s and possesses thefollowing properties:

|ul − ul−T | � B e−l , l = 0, . . . , T , (4.42)

‖ϕl‖p � 2Bαp/2N , ψl = ψl−T , l = 1, . . . , T , (4.43)

where p � 0. Taking this assertion for granted, let us show that (4.39) holds withl2 = T . It follows from (4.43) and the inclusion U ∈ Fs(2M,ρ) that

‖ηl‖s � 2Bαs/2N + (2MT )1/2, l = 1, . . . , T .

Therefore, by Lemma 6.2, for any γ > 0 the probability of the event

Fγ :={|ηl − ηl| � γ, l = 1, . . . , T

}can be estimated from below by a constant ε > 0 depending only on N , B, ρ,r, and γ (but not on ϒ). In view of the continuous dependence of trajectoriesfor (2.11) on the control ηl , for any ω ∈ Fγ we have

|ul − ul| � δ = δ(γ ), l = 1, . . . , T ,


where ul , l � 1, is the trajectory of (2.11) corresponding to ηl , and δ(γ ) → 0as γ → 0. Combining this with (4.42), we conclude that u′j := uj+T − uj , j =1− T , . . . , 0, satisfy (4.40) if δ(γ ) � δ1. Therefore,

d(ul2, u) � r/d for ω ∈ Fγ , γ & 1.

Moreover, it is a matter of direct verification to show that ul2 ∈ F∞(K, ρ), whereK = K(M,B) is sufficiently large. This completes the proof of (4.31).

(3) Thus, it remains to establish the existence of a deterministic trajectory ul

satisfying (4.42) and (4.43).Let us set

ϕl = PN(ul−T − S(ul−1)

), ψl = ψl−T , l = 1, . . . , T , (4.44)

where u0 is the zeroth component of u0. Note that the first relation in (4.44) impliesthat vl = vl−T for l = 1, . . . , T . We claim that (4.42) and (4.43) hold with anappropriate constant B > 0. Indeed, (4.43) is a simple consequence of inequal-ity (4.42) whose proof is by induction on l. In view of (4.41) with l = 0 and theinclusion u0 ∈ F∞(2M,ρ0), we have

|u0 − u−T | � |u0| + |u−T | �(2M(ρ0 + 1)

)1/2 + 2M1/2 := B.Let us assume that (4.42) is proved for 0 � l � k−1, k � 1. It follows from (4.41)and inequality (2.24) in which m = 0, l = k, u1

r = ur , and u2r = ur−T that

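$$|u_k - u_{k-T}| \le (C\alpha_N^{-1/2})^k \exp\bigl\{Ck\bigl(\langle|u_j|^2\rangle_0^{k-1} + \langle|u_{j-T}|^2\rangle_0^{k-1}\bigr)\bigr\}|u_0 - u_{-T}| \le (C\alpha_N^{-1/2})^k \exp\{2C(6M + B^2)k\}B \le e^{-k}B,$$

where the integer $N \ge 1$ is so large that

$$\log\alpha_N \ge 4C(6M + B^2) + 2(1 + \log C).$$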
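The choice of $N$ above can be sanity-checked numerically. The following is a minimal sketch, not part of the original argument: the function names and the sample values of $C$, $M$, $B$ are hypothetical. It checks that the per-step factor $C\alpha_N^{-1/2}e^{2C(6M+B^2)}$ equals $e^{-1}$ exactly at the threshold for $\log\alpha_N$ and is smaller above it.

```python
import math

def log_alpha_threshold(C, M, B):
    """Right-hand side of the condition on log(alpha_N); requires C > 0."""
    return 4.0 * C * (6.0 * M + B * B) + 2.0 * (1.0 + math.log(C))

def induction_factor(C, M, B, log_alpha_N):
    """Per-step factor C * alpha_N**(-1/2) * exp(2C(6M + B^2)).

    Computed in log form to avoid overflow for large constants.
    """
    log_factor = math.log(C) - 0.5 * log_alpha_N + 2.0 * C * (6.0 * M + B * B)
    return math.exp(log_factor)

if __name__ == "__main__":
    C, M, B = 2.0, 1.0, 0.5          # illustrative values only
    thr = log_alpha_threshold(C, M, B)
    print(induction_factor(C, M, B, thr))        # equals exp(-1) up to rounding
    print(induction_factor(C, M, B, thr + 1.0))  # strictly smaller
```

Raising this factor to the power $k$ gives the bound $e^{-k}B$ used to close the induction.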

This completes the induction and the proof of the proposition. ✷

5. Uniqueness and Mixing for the Original System

We recall that the Markov semigroups $P_k$ and $P_k^*$ associated with Equation (1.1) and the space $C(H^s, \beta)$ of continuous functions with exponential growth at infinity and the corresponding norm $\|f\|_{s,\beta}$ were introduced in Section 1.3. Also recall that we set $\beta_d(r) = (1+r)^d$, $r \ge 0$.

As before, we assume that $\nu = 1$. For any integer $R \ge 0$, we denote by $H(R)$ the set of those $u \in H$ for which there is $\mathbf{u} \in F^s_\infty(2M, R)$ such that $u_0 = u$, where $u_0$ is the zeroth component of $\mathbf{u}$.


THEOREM 5.1. Suppose that condition (1.4) is satisfied for some $s > 0$. Then there is an integer $N \ge 1$ such that if

$$b_j \ne 0 \quad\text{for } j = 1, \dots, N, \tag{5.1}$$

then the Markov semigroup $P_k^*$ has a unique stationary measure $\lambda \in \mathcal{P}(H)$ satisfying condition (1.34). Moreover, the measure $\lambda$ is concentrated on $H^s$, and if $f \in C(H^s, \beta_m)$ for some $m \ge 1$, then for any integer $R \ge 0$, we have

$$P_k f(u) \to (\lambda, f) \quad\text{as } k \to \infty \text{ uniformly in } u \in H(R). \tag{5.2}$$

In particular, convergence (5.2) holds for $\lambda$-almost all $u \in H$. Finally, if all the constants $b_j$ in (1.3) are nonzero, then (5.2) holds uniformly with respect to $u \in H^s$, $\|u\|_s \le R$, for any $R \ge 0$.

Remark 5.2. The existence and uniqueness of a stationary measure and convergence (5.2) can be established under a weaker assumption. Namely, instead of (0.9), it suffices to assume that

$$\int_{-\infty}^{\infty} |r|^{20} p_j(r)\,dr \le C \quad\text{for all } j \ge 1. \tag{5.3}$$

This assertion can be derived by analysing the arguments in Sections 1–5. We do not dwell on it and only show where the exponent 20 in (5.3) comes from.

When verifying condition (H1), we used (see (4.26)) inequality (4.11) with $m = 3$, which, in turn, is based on the fact that the third moment of the random variable $T_\nu(\omega)$ (see Proposition 1.5) is finite. The $m$th moment of $T_\nu$ can be estimated by a constant depending only on $N_{2m}$ and $E|\eta_1|^{4(m+2)}$ (see (1.16) and (1.18)), and $N_{2m}$ admits an estimate in terms of $E|\eta_k|^{2m}$ (see (1.11)). For $m = 3$ we obtain the expression $E|\eta_k|^{20}$, which can be estimated by the constant $C$ in (5.3).

Proof of Theorem 5.1. The existence of a stationary measure satisfying (1.34)and the fact that λ(H s) = 1 are established in Proposition 1.11. The uniqueness ofsuch a measure follows from Theorem 2.2 and Corollary 4.3. Let us prove (5.2).

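(1) We begin with the case $f \in C_b(H)$. Let us define a function $\mathbf{f} \in C_b(\mathbf{H})$, $\mathbf{H} = H^{\mathbb{Z}_0}$, by the formula

$$\mathbf{f}(\mathbf{u}) = f(u_0), \quad \mathbf{u} = (u_l, l \le 0).$$

We recall that $\mathbf{P}_k$ and $\mathbf{P}_k^*$ stand for the Markov semigroups associated with the family (2.17), (2.18). It is clear that $\mathbf{P}_k\mathbf{f}(\mathbf{u}) = P_k f(u_0)$ for any $\mathbf{u} \in F^s_\infty$. Let $\boldsymbol{\lambda} \in \mathcal{P}(F^s_\infty)$ be the unique stationary measure for $\mathbf{P}_k^*$. By Corollary 4.3, we have

$$\mathbf{P}_k\mathbf{f}(\mathbf{u}) \to (\boldsymbol{\lambda}, \mathbf{f}) \quad\text{as } k \to \infty \text{ uniformly in } \mathbf{u} \in F^s_\infty(2M, R).$$

Since the projection $\mathbf{u} = (u_l, l \le 0) \mapsto u_0$ maps $\boldsymbol{\lambda}$ to $\lambda$, we conclude that $(\boldsymbol{\lambda}, \mathbf{f}) = (\lambda, f)$. Therefore (5.2) holds uniformly in $u \in H(R)$ for any $R \ge 0$.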


(2) To show that (5.2) remains valid for $f \in C(H, \beta_m)$, we use Lemma 1.8. Namely, for $L > 0$ let $h_L(r)$ denote a continuous function that is equal to 1 and 0 for $r \le L$ and $r \ge L+1$, respectively. We take an arbitrary function $f \in C(H, \beta_m)$ and represent it in the form

$$f(u) = f_L(u) + g_L(u), \quad f_L(u) = h_L(|u|)f(u).$$

Since $f_L \in C_b(H)$, we conclude that

$$P_k f_L(u) \to (\lambda, f_L) \quad\text{as } k \to \infty \text{ uniformly in } u \in H(R).$$

It is easy to see that $(\lambda, f_L) \to (\lambda, f)$ as $L \to \infty$. Furthermore, we note that

$$\|g_L\|_{0,\beta_{m'}} \to 0 \quad\text{as } L \to \infty$$

for any $m' > m$. By Lemma 1.8, the norm of the operators

$$P_k : C(H, \beta_m) \to C(H, \beta_{m'})$$

is bounded uniformly in $k \ge 1$. It follows that

$$|P_k g_L(u)| \to 0 \quad\text{as } L \to \infty$$

uniformly in $k \ge 0$ and $u \in H(R) \subset B_H(R_1)$, $R_1 = (2M(R+1))^{1/2}$.

We now write

$$|P_k f(u) - (\lambda, f)| \le |P_k f_L(u) - (\lambda, f_L)| + |P_k g_L(u)| + |(\lambda, f_L - f)|. \tag{5.4}$$

What has been said above implies that the right-hand side of (5.4) tends to zero as $k \to \infty$.

The fact that (5.2) holds also for $f \in C(H^s, \beta_m)$ follows from Lemma 1.8.

(3) We now assume that $b_j \ne 0$ for all $j \ge 1$. To prove that (5.2) holds uniformly in $u \in H^s$, $\|u\|_s \le R$, it suffices to show that the ball $B_{H^s}(R)$ is contained in $H(R')$ for some $R' \ge 1$. This assertion follows immediately from the definition of $F^s_\infty(2M, R)$. ✷

Remark 5.3. If in Theorem 5.1 we assume that condition (B) is also satisfied, then convergence (5.2) holds for functions with exponential growth at infinity (see Main Theorem in the Introduction). Namely, it suffices to assume that $f(u)$ is a continuous function on $H^s$ such that $|f(u)| \le \mathrm{const}\,\exp(\sigma\|u\|_s^{2\kappa_l})$, where $l$ is the smallest integer no less than $s$, $\kappa_l$ is the constant in Theorem 1.4 with $\rho = \infty$, and $\sigma > 0$ is sufficiently small. This assertion can easily be proved by repeating the above arguments and using Remark 1.9, and we shall not dwell on it.

As we saw above, convergence (5.2) is a simple consequence of Theorem 4.1.The following assertion shows that under the same conditions we have a muchstronger result. Its proof requires some new ideas and will be presented in a subse-quent publication.


THEOREM 5.4. Under the conditions of Theorem 5.1, if (5.1) is satisfied for a sufficiently large $N \ge 1$, then for any $f \in C(H^s, \beta_m)$, $m \ge 1$, and $R > 0$, we have

$$P_k f(u) \to (\lambda, f) \quad\text{as } k \to \infty \text{ uniformly in } u \in B_H(R). \tag{5.5}$$

Moreover, if condition (B) is also satisfied, then (5.5) holds for any function described in Remark 5.3.

6. Appendix


The existence and uniqueness of a solution are obvious, so that we confine ourselves to the proof of (1.11) and (1.12).

(1) We begin with the case $s = 0$. Taking the scalar product of (1.8) and $u(t)$ in $H$, we obtain

$$|S_t(u)| \le e^{-\alpha_1\nu t}|u_0|, \quad t \ge 0. \tag{6.1}$$

Since

$$u_k = S(u_{k-1}) + \eta_k, \quad k \ge 1, \tag{6.2}$$

we conclude from inequality (6.1) with $t = 1$ that, for any $\delta > 0$,

$$|u_k|^m \le (1+\delta)e^{-\nu\alpha_1 m}|u_{k-1}|^m + C_m\delta^{-(m-1)}|\eta_k|^m,$$

where the constant $C_m > 0$ depends on $m$ solely. Choosing $q = e^{-\alpha_1\nu}$ and $\delta = e^{(m-1)\alpha_1\nu} - 1$, we derive

$$|u_k|^m \le q|u_{k-1}|^m + C(m)\nu^{-(m-1)}|\eta_k|^m.$$

Taking the average and iterating the resulting inequality, we obtain (1.11).
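The two elementary facts used in this step can be checked numerically. The sketch below is illustrative only: the function names and sample parameters are hypothetical. It confirms that the stated choice of $\delta$ turns the factor $(1+\delta)e^{-\nu\alpha_1 m}$ into $q = e^{-\alpha_1\nu}$, and that iterating a recursion of the form $x_k = q x_{k-1} + c$ with $0 < q < 1$ stays below $q^k x_0 + c/(1-q)$.

```python
import math

def contraction_factor(alpha1, nu, m):
    """(1 + delta) * exp(-nu*alpha1*m) with delta = exp((m-1)*alpha1*nu) - 1."""
    delta = math.exp((m - 1) * alpha1 * nu) - 1.0
    return (1.0 + delta) * math.exp(-nu * alpha1 * m)

def iterate_moment_bound(x0, q, c, k):
    """Iterate x_j = q * x_{j-1} + c for j = 1..k and return x_k."""
    x = x0
    for _ in range(k):
        x = q * x + c
    return x

if __name__ == "__main__":
    alpha1, nu, m = 1.0, 0.5, 3      # illustrative values only
    q = contraction_factor(alpha1, nu, m)
    print(q, math.exp(-alpha1 * nu))  # the two numbers agree
    print(iterate_moment_bound(10.0, q, 1.0, 50), 1.0 / (1.0 - q))
```

The geometric-series bound is what makes the resulting moment estimate uniform in $k$.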

(2) We now consider the case s > 0. We shall need the following lemma, whichis proved in Subsection 6.3.

LEMMA 6.1. The resolving operator $S_t$ of the free NS system (1.8) is continuous from $H$ to $H^s$ for any $t > 0$ and $s \ge 0$. Moreover, for any integer $l \ge 2$ there is a constant $C_l \ge 1$ such that if $u(t, x)$ is a solution of (1.8) for $t \ge 0$, then

$$t^l\|u(t)\|_l^2 + 2\nu\int_0^t \theta^l\|u(\theta)\|_{l+1}^2\,d\theta \le \begin{cases} \nu^{-l}|u_0|^2, & l = 0, 1,\\ C_l\bigl(\nu^{-l}|u_0|^2 + \nu^{-5l}|u_0|^{2/\kappa_l}\bigr), & l \ge 2, \end{cases} \tag{6.3}$$

where $t \ge 0$, and the constant $\kappa_l$ is defined in Theorem 1.4. Furthermore, for $l = 0$ the inequality sign in (6.3) can be replaced by equality.


To simplify notation, we confine ourselves to the case $s > 1$. Let us fix an arbitrary $k \ge 1$. In view of relation (6.2) and inequality (6.3) with $t = 1$, we have (the integer $l = l(s)$ is defined in Theorem 1.3)

$$\|u_k\|_s^m \le 2^{m-1}\bigl(\|S(u_{k-1})\|_l^m + \|\eta_k\|_s^m\bigr) \le C_{ml}\bigl(1 + \nu^{-5lm/2}|u_{k-1}|^{m(2l+1)} + \|\eta_k\|_s^m\bigr),$$

which implies (1.12).


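We confine ourselves to the case $s = 0$. It follows from (6.1) and (6.2) that, for any $\delta > 0$,

$$|u_k|^2 \le (1+\delta)e^{-2\alpha_1\nu}|u_{k-1}|^2 + (1+\delta^{-1})|\eta_k|^2. \tag{6.4}$$

We set $q = e^{-\alpha_1\nu}$, $\delta = e^{\alpha_1\nu} - 1$, and $\sigma_0 = \rho \wedge (a\alpha_1 e^{-\alpha_1})$. Inequality (6.4) implies that

$$\sigma_0\nu|u_k|^2 \le \sigma_0\nu q|u_{k-1}|^2 + a|\eta_k|^2,$$

and therefore, in view of independence of $u_{k-1}$ and $\eta_k$, we have

$$E\,e^{\sigma_0\nu|u_k|^2} \le E\,e^{a|\eta_k|^2}\,E\bigl(e^{\sigma_0\nu|u_{k-1}|^2}\bigr)^q \le E\,e^{a|\eta_k|^2}\bigl(E\,e^{\sigma_0\nu|u_{k-1}|^2}\bigr)^q.$$

Arguing by induction on $k$, we derive (1.14).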

6.3. PROOF OF LEMMA 6.1

Inequality (6.3) is proved by induction on $l$. For $l = 0$, it is well known (see [CF]). We now fix an arbitrary $l = m \ge 1$ and assume that inequality (6.3) is established for $l < m$. Let us take the scalar product in $H$ of Equation (1.8) and the function $L^m u$. Performing some simple transformations, we derive

$$\partial_t\bigl(t^m\|u\|_m^2\bigr) - m t^{m-1}\|u\|_m^2 + 2\nu t^m\|u\|_{m+1}^2 + 2t^m\bigl(L^{\frac{m+1}{2}}u, L^{\frac{m-1}{2}}B(u, u)\bigr) = 0. \tag{6.5}$$

If $m = 1$, then the last term on the left-hand side of (6.5) vanishes, and the required inequality can be established by integration with respect to time. Therefore we assume that $m \ge 2$. In this case, we have the following estimate, which follows easily from Hölder's and interpolation inequalities:

$$\bigl|\bigl(L^{\frac{m+1}{2}}u, L^{\frac{m-1}{2}}B(u,u)\bigr)\bigr| \le c_m\|u\|_{m+1}^{\frac{4m-1}{2m}}\|u\|^{\frac{m+1}{2m}}|u|^{\frac12} \le \frac{\nu}{2}\|u\|_{m+1}^2 + c'_m\nu^{1-4m}\|u\|^{2(m+1)}|u|^{2m}, \tag{6.6}$$

where $c_m$ and $c'_m$ are positive constants. Substituting (6.6) into (6.5) and integrating in time, we obtain

$$t^m\|u(t)\|_m^2 + \nu\int_0^t\theta^m\|u(\theta)\|_{m+1}^2\,d\theta \le m\int_0^t\theta^{m-1}\|u(\theta)\|_m^2\,d\theta + c'_m\nu^{1-4m}\int_0^t\theta^m\|u(\theta)\|^{2(m+1)}|u(\theta)|^{2m}\,d\theta.$$

The required inequality follows now from the induction hypothesis.
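The exponents on the right-hand side of (6.6) are what Young's inequality produces when the factor $\|u\|_{m+1}^{(4m-1)/(2m)}$ is raised to the power 2. The following sketch (the helper name is hypothetical, and exact rational arithmetic is used only for bookkeeping) recomputes them for several values of $m$.

```python
from fractions import Fraction

def young_exponents(m):
    """Exponent bookkeeping for the second inequality in (6.6).

    Returns (p_conj, power of ||u||, power of |u|, power of nu).
    """
    a = Fraction(4 * m - 1, 2 * m)          # power of ||u||_{m+1} on the left
    p = 2 / a                               # chosen so that a * p = 2
    p_conj = p / (p - 1)                    # conjugate exponent
    norm_power = Fraction(m + 1, 2 * m) * p_conj
    l2_power = Fraction(1, 2) * p_conj
    nu_power = -(p_conj / p)                # cost of putting nu/2 on the first term
    return p_conj, norm_power, l2_power, nu_power

if __name__ == "__main__":
    for m in range(2, 6):
        print(m, young_exponents(m))
```

For each $m \ge 2$ the output is $(4m,\ 2(m+1),\ 2m,\ 1-4m)$, matching the powers of $\|u\|$, $|u|$ and $\nu$ in (6.6).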

6.4. PROOF OF LEMMA 2.4

(1) Let $u_i(t)$, $i = 1, 2$, be two solutions of the free NS system (1.8) with initial functions $u_i^0$. Then the difference $u = u_1 - u_2$ satisfies the equation (recall that $\nu = 1$)

$$\dot u + Lu + B(u, u_1) + B(u_2, u) = 0. \tag{6.7}$$

Let us take the scalar product of this equation with $2u(t)$ in $H$. Since

$$|(B(u, u_1), u)| \le c_1|u|\,\|u\|\,\|u_1\| \le \frac12\|u\|^2 + \frac{c_1^2}{2}\|u_1\|^2|u|^2$$

and $(B(u_2, u), u) = 0$, we derive the differential inequality

$$\partial_t|u|^2 + \|u\|^2 \le c_1^2\|u_1\|^2|u|^2. \tag{6.8}$$

Applying the Gronwall inequality, we obtain

$$|u(t)|^2 \le \exp\Bigl\{c_1^2\int_0^t\|u_1(\theta)\|^2\,d\theta\Bigr\}|u^0|^2, \quad u^0 = u_1^0 - u_2^0, \tag{6.9}$$

which coincides with (2.22). Integration of (6.8) now results in

$$\begin{aligned}\int_0^t\|u(\theta)\|^2\,d\theta &\le |u^0|^2 + c_1^2\int_0^t\|u_1(\theta)\|^2|u(\theta)|^2\,d\theta\\ &\le |u^0|^2\Bigl(1 + \int_0^t c_1^2\|u_1(\theta)\|^2 e^{c_1^2\int_0^\theta\|u_1(\sigma)\|^2\,d\sigma}\,d\theta\Bigr) \le \exp\Bigl\{c_1^2\int_0^t\|u_1(\theta)\|^2\,d\theta\Bigr\}|u^0|^2.\end{aligned} \tag{6.10}$$
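The Gronwall step leading to (6.9) can be illustrated on a scalar model. The sketch below is not the Navier–Stokes computation: it uses forward Euler for $y' = a(t)y - b(t)$ with $a, b \ge 0$, where $y$ plays the role of $|u|^2$, $a$ of $c_1^2\|u_1\|^2$ and $b$ of $\|u\|^2$. The function name and the sample coefficients are hypothetical.

```python
import math

def euler_gronwall(y0, a, b, t_end, steps):
    """Forward Euler for y' = a(t) y - b(t).

    Returns (y at t_end, y0 * exp(h * sum of a(t_n))), i.e. the discrete
    solution and the discrete analogue of the Gronwall bound.
    """
    h = t_end / steps
    y = y0
    integral = 0.0
    for n in range(steps):
        t = n * h
        y = y + h * (a(t) * y - b(t))
        integral += h * a(t)
    return y, y0 * math.exp(integral)

if __name__ == "__main__":
    y, bound = euler_gronwall(
        2.0,
        lambda t: 1.0 + math.sin(t) ** 2,   # stands in for c_1^2 ||u_1||^2
        lambda t: 0.5 * t,                  # stands in for ||u||^2 >= 0
        3.0,
        3000,
    )
    print(y, bound)
```

Since $1 + ha \le e^{ha}$ and the dissipative term only lowers $y$, the discrete solution never exceeds the exponential bound, which mirrors how (6.9) follows from (6.8).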

(2) We now take the scalar product of (6.7) with $2tLu(t)$ in $H$:

$$\partial_t\bigl(t\|u\|^2\bigr) + 2t|Lu|^2 = \|u\|^2 - 2t\bigl(B(u, u_1), Lu\bigr) - 2t\bigl(B(u_2, u), Lu\bigr). \tag{6.11}$$

Let us use the inequalities

$$\|v\|_\infty^2 \le c_2^2|v|\,|Lv|, \quad \|v\|^2 \le |v|\,|Lv|$$

to estimate the second and third terms on the right-hand side of (6.11):

$$|(B(u,u_1), Lu)| \le \|u\|_\infty\|u_1\|\,|Lu| \le c_2|u|^{1/2}|Lu|^{3/2}|u_1|^{1/2}|Lu_1|^{1/2} \le \frac12|Lu|^2 + c_2^4|u|^2|u_1|^2|Lu_1|^2, \tag{6.12}$$

$$|(B(u_2,u), Lu)| \le \|u_2\|_\infty\|u\|\,|Lu| \le c_2|u_2|^{1/2}|Lu_2|^{1/2}|u|^{1/2}|Lu|^{3/2} \le \frac12|Lu|^2 + c_2^4|u|^2|u_2|^2|Lu_2|^2. \tag{6.13}$$

We now note that (see (6.1))

$$|u_i(t)| \le |u_i^0|, \quad t \ge 0, \quad i = 1, 2. \tag{6.14}$$

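Substituting (6.12)–(6.14) into (6.11) and integrating with respect to $t$, we derive

$$t\|u\|^2 \le \int_0^t\|u(\theta)\|^2\,d\theta + 2c_2^4\Bigl\{|u_1^0|^2\int_0^t\theta|u|^2|Lu_1|^2\,d\theta + |u_2^0|^2\int_0^t\theta|u|^2|Lu_2|^2\,d\theta\Bigr\}. \tag{6.15}$$

To estimate the expression in the brackets on the right-hand side of (6.15), we apply inequalities (6.9) and (6.3) with $l = 1$:

$$\begin{aligned}|u_i^0|^2\int_0^t\theta|u|^2|Lu_i|^2\,d\theta &\le |u^0|^2|u_i^0|^2\exp\Bigl\{c_1^2\int_0^t\|u_1(\theta)\|^2\,d\theta\Bigr\}\int_0^t\theta|Lu_i|^2\,d\theta\\ &\le |u^0|^2|u_i^0|^4\exp\Bigl\{c_1^2\int_0^t\|u_1(\theta)\|^2\,d\theta\Bigr\}.\end{aligned} \tag{6.16}$$

Furthermore, it follows from (6.1) and (6.3) with $l = 0$ that

$$|u_i^0|^2 \le 2\bigl(1 - e^{-2\alpha_1 t}\bigr)^{-1}\int_0^t\|u_i\|^2\,d\theta \le c_3(t^{-1}\vee 1)\int_0^t\|u_i(\theta)\|^2\,d\theta. \tag{6.17}$$

Substitution of (6.17) into (6.16) results in

$$\begin{aligned}|u_i^0|^2\int_0^t\theta|u|^2|Lu_i|^2\,d\theta &\le c_3^2(t^{-2}\vee 1)\Bigl(\int_0^t\|u_i(\theta)\|^2\,d\theta\Bigr)^2\exp\Bigl\{c_1^2\int_0^t\|u_1(\theta)\|^2\,d\theta\Bigr\}|u^0|^2\\ &\le c_3^2(t^{-2}\vee 1)\exp\Bigl\{\int_0^t\bigl(c_1^2\|u_1(\theta)\|^2 + \|u_2(\theta)\|^2\bigr)d\theta\Bigr\}|u^0|^2.\end{aligned} \tag{6.18}$$

The required inequality (2.23) follows now from (6.15), (6.10), and (6.18).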


6.5. LOWER BOUND FOR MEASURES WITH POSITIVE DENSITY

LEMMA 6.2. Let $\gamma$ be the distribution in $H$ of the random variable

$$\eta(x) = \sum_{j=1}^{\infty} b_j\xi_j e_j(x),$$

where $b_j$ are real numbers satisfying condition (1.4) and $\xi_j$ are independent scalar random variables whose distributions have strictly positive, continuous densities $p_j(r)$ with respect to the Lebesgue measure such that $\int_{\mathbb{R}} r^2 p_j(r)\,dr \le C$ for all $j \ge 1$ and some constant $C > 0$ not depending on $j$. Then $\gamma(B) > 0$ for any open ball $B \subset H^s$. Moreover, for any $p > s$, $R > 0$, and $r > 0$ there is $\varepsilon = \varepsilon(p, R, r) > 0$ such that $\gamma(B) \ge \varepsilon$ for any open ball $B \subset H^s$ of radius $r$ centred at a point $u^0 \in H^p$, $\|u^0\|_p \le R$.

Proof. We recall that $H^s_L$ and $H^{s\perp}_L$ denote the closed subspaces in $H^s$ spanned by the vectors $e_j$, $j = 1, \dots, L-1$, and $e_j$, $j \ge L$, respectively, and that $P_L$ and $Q_L$ are the orthogonal projections in* $H$ onto $H_L$ and $H_L^\perp$. It is clear that for any $u^0 \in H^s$ and $r > 0$ we have

$$B_{H^s}(u^0, r) \supset B_{H^s_L}(v^0, r/\sqrt2) \times B_{H^{s\perp}_L}(w^0, r/\sqrt2),$$

where $v^0 = P_L u^0$, $w^0 = Q_L u^0$, $\varphi = P_L\eta$, $\psi = Q_L\eta$, and $L \ge 2$ is an arbitrary integer. Since $\varphi$ and $\psi$ are independent, we conclude that

$$P\{\eta \in B_{H^s}(u^0, r)\} \ge P\{\varphi \in B_{H^s_L}(v^0, r/\sqrt2)\} \times P\{\psi \in B_{H^{s\perp}_L}(w^0, r/\sqrt2)\}. \tag{6.19}$$

Let us choose an integer $L \ge 2$ so large that

$$\|w^0\|_s \le \frac{r}{2\sqrt2}, \quad \sum_{j=L}^{\infty} b_j^2\alpha_j^s < \frac{r^2}{8C}. \tag{6.20}$$

Since $D(\varphi)$ has a strictly positive continuous density with respect to Lebesgue measure, we conclude that the first factor on the right-hand side of (6.19) is positive. To estimate the second factor, note that, in view of the first inequality in (6.20), we have

$$B_{H^{s\perp}_L}(w^0, r/\sqrt2) \supset B_{H^{s\perp}_L}(r/2\sqrt2).$$

Therefore,

P{ψ ∈ BH s⊥L (w

0, r/√

2)}

� P{‖ψ‖s � r/2

√2}

= 1− P{‖ψ‖2

s � r2/8}. (6.21)

* In the case $s = 0$, we drop the index $s$ from the notation of the spaces $H^s$, $H^s_L$, and $H^{s\perp}_L$.


By the second inequality in (6.20), we have

$$E\|\psi\|_s^2 \le C\sum_{j=L}^{\infty} b_j^2\alpha_j^s < \frac{r^2}{8}.$$

The Chebyshev inequality now implies that the right-hand side of (6.21) is also positive.

To prove the second assertion, it suffices to note that the integer $L \ge 2$ satisfying (6.20) can be chosen uniformly with respect to the set of balls described in the statement of the lemma. ✷

Acknowledgement

This research was supported by the EPSRC grant GR/N63055/01.

References

[BV] Babin, A. V. and Vishik, M. I.: Attractors of Evolutionary Equations, Stud. Math. Appl. 25, North-Holland, Amsterdam, 1992.

[BKL] Bricmont, J., Kupiainen, A. and Lefevere, R.: Exponential mixing for the 2D stochastic Navier–Stokes dynamics, Preprint.

[CF] Constantin, P. and Foias, C.: Navier–Stokes Equations, Chicago Lectures in Math., Univ. Chicago Press, Chicago, 1988.

[DZ] Da Prato, G. and Zabczyk, J.: Ergodicity for Infinite-Dimensional Systems, London Math. Soc. Lecture Note Ser. 229, Cambridge Univ. Press, Cambridge, 1996.

[EMS] E, W., Mattingly, J. C. and Sinai, Ya. G.: Gibbsian dynamics and ergodicity for the stochastically forced Navier–Stokes equation, Preprint.

[G] Gallavotti, G.: Foundations of Fluid Dynamics, Springer-Verlag, Berlin, 2001.

[KS1] Kuksin, S. and Shirikyan, A.: Stochastic dissipative PDE's and Gibbs measures, Comm. Math. Phys. 213 (2000), 291–330.

[KS2] Kuksin, S. and Shirikyan, A.: On dissipative systems perturbed by bounded random kick-forces, To appear in Ergodic Theory Dynam. Systems.

[Re] Revuz, D.: Markov Chains, 2nd edn, North-Holland Math. Library 11, North-Holland, Amsterdam, 1984.



On the Modified Korteweg–De Vries Equation

Dedicated to Professor Ilia A. Shishmarev on his 60th birthday

NAKAO HAYASHI¹,* and PAVEL NAUMKIN²,**

¹Department of Applied Mathematics, Science University of Tokyo, Tokyo, 162-8601, Japan. e-mail: [email protected]
²Instituto de Física y Matemáticas, Universidad Michoacana, AP 2-82, Morelia, CP 58040, Mexico. e-mail: [email protected]

(Received: 18 December 2000)

Abstract. We consider the large time asymptotic behavior of solutions to the Cauchy problem for the modified Korteweg–de Vries equation $u_t + a(t)(u^3)_x + \frac13 u_{xxx} = 0$, $(t, x) \in \mathbb{R}\times\mathbb{R}$, with initial data $u(0, x) = u_0(x)$, $x \in \mathbb{R}$. We assume that the coefficient $a(t) \in C^1(\mathbb{R})$ is a real, bounded and slowly varying function, such that $|a'(t)| \le C\langle t\rangle^{-7/6}$, where $\langle t\rangle = (1+t^2)^{1/2}$. We suppose that the initial data are real-valued and small enough, belonging to the weighted Sobolev space $H^{1,1} = \{\phi \in L^2; \|\sqrt{1+x^2}\sqrt{1-\partial_x^2}\,\phi\| < \infty\}$. In comparison with the previous paper (Internat. Res. Notices 8 (1999), 395–418), here we exclude the condition that the integral of the initial data $u_0$ is zero. We prove the time decay estimates $\sqrt[3]{t^2}\sqrt[3]{\langle t\rangle}\,\|u(t)u_x(t)\|_\infty \le C\varepsilon$ and $\langle t\rangle^{\frac13-\frac1{3\beta}}\|u(t)\|_\beta \le C\varepsilon$ for all $t \in \mathbb{R}$, where $4 < \beta \le \infty$. We also find the asymptotics for large time of the solution in the neighborhood of the self-similar solution.

Mathematics Subject Classification (2000): 35Q35.

Key words: modified Korteweg–de Vries equation, large time asymptotics.

1. Introduction

We consider the modified Korteweg–de Vries (mKdV) equation

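$$u_t + a(t)(u^3)_x + \tfrac13 u_{xxx} = 0, \quad (t,x) \in \mathbb{R}\times\mathbb{R}, \qquad u(0,x) = u_0(x), \quad x \in \mathbb{R}, \tag{1.1}$$

where $u_0$ is a real-valued function and the coefficient $a(t) \in C^1(\mathbb{R})$ is a real, bounded and slowly varying function such that $|a'(t)| \le C\langle t\rangle^{-7/6}$, where $\langle t\rangle = (1+t^2)^{1/2}$.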

The existence and uniqueness of solutions to the Cauchy problem (1.1) were proved in papers [12, 13, 17–19, 22, 28, 31]. The smoothing properties of the solutions were studied in [3, 5, 6, 18, 19, 28]. The linear blow-up effect for slowly decaying solutions of the mKdV equation was proved in [2]. For special cases of the KdV equation itself and for the mKdV equation (when $a(t) = C$), the Cauchy problem was solved, using the Inverse Scattering Transform (IST) method and, via this method, the large time asymptotics of solutions was found (see [1, 7]). The IST method is not applicable for the case $a(t) \ne C$, so to study the global behavior in time of the solution to the Cauchy problem (1.1) we apply here different functional-analytic methods. In a previous paper [15], we proved the quasilinear asymptotics for solutions to the Cauchy problem for the generalized Korteweg–de Vries equation with supercritical nonlinearity $(|u|^{p-1}u)_x$, $p > 3$, and in [16] we studied the large-time asymptotics of solutions to the (mKdV) equation under the condition that the integral of the initial data is equal to zero: $\int u_0(x)\,dx = 0$. In the present paper, we consider the Cauchy problem (1.1) with any small initial data $u_0 \in H^{1,1}(\mathbb{R})$.

* Present address: Department of Mathematics, Osaka University, Osaka 560-0043, Japan. e-mail: [email protected]
** The work of P.N. is partially supported by CONACYT.

We denote the standard Lebesgue space by $L^p = \{\phi \in S'; \|\phi\|_p < \infty\}$, where

$$\|\phi\|_p = \Bigl(\int|\phi(x)|^p\,dx\Bigr)^{1/p} \quad\text{if } 1 \le p < \infty$$

and

$$\|\phi\|_\infty = \operatorname*{ess\,sup}_{x\in\mathbb{R}}|\phi(x)| \quad\text{if } p = \infty,$$

$S'$ denotes the Schwartz space of distributions. For simplicity, we write $\|\phi\| = \|\phi\|_2$. The weighted Sobolev space is

$$H^{m,s} = \bigl\{\phi \in S'; \|\phi\|_{m,s} = \|\langle x\rangle^s(1-\partial_x^2)^{m/2}\phi\| < \infty\bigr\}, \quad m, s \in \mathbb{R},$$

where $\langle x\rangle = \sqrt{1+x^2}$. We define the inner product $(\psi, \phi) = \int\psi(x)\cdot\phi(x)\,dx$ and denote by $C(I; B)$ the space of continuous functions from an interval $I$ to a Banach space $B$. Different positive constants might be denoted by the same letter $C$. In what follows, we consider the case of positive time $t$ only since the negative time is considered similarly.

The aim of this paper is to prove the following results:

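THEOREM 1.1. Let $a(t) \in C^1(\mathbb{R})$ and $|a(t)| + \langle t\rangle^{7/6}|a'(t)| \le C$. We also assume that the initial data $u_0 \in H^{1,1}$ are real-valued functions with a sufficiently small norm $\|u_0\|_{1,1} = \varepsilon$. Then there exists a unique global solution $u \in C(\mathbb{R}; H^{1,1})$ of the Cauchy problem for mKdV equation (1.1) such that

$$\sqrt[3]{t^2}\sqrt[3]{\langle t\rangle}\,\|u(t)u_x(t)\|_\infty \le C\varepsilon \quad\text{and}\quad \langle t\rangle^{\frac13-\frac1{3\beta}}\|u(t)\|_\beta \le C\varepsilon$$

for all $t \in \mathbb{R}$, where $4 < \beta \le \infty$.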


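We denote by

$$S(t,x) = \frac{1}{\sqrt[3]{t}}\,\varphi\Bigl(\frac{x}{\sqrt[3]{t}}\Bigr)$$

the self-similar solution of the (mKdV) equation such that

$$S_t + A(S^3)_x + \tfrac13 S_{xxx} = 0$$

and

$$\int S(t,x)\,dx = \int\varphi(x)\,dx = \int u_0(x)\,dx,$$

where $A = \lim_{t\to\infty}a(t)$. Note that if the function $\varphi$ satisfies the second Painlevé equation $\varphi''(\xi) = -\xi\varphi + 3A\varphi^3$, then

$$S(t,x) = \frac{1}{\sqrt[3]{t}}\,\varphi\Bigl(\frac{x}{\sqrt[3]{t}}\Bigr)$$

satisfies

$$S_t + A(S^3)_x + \tfrac13 S_{xxx} = 0.$$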

The following theorem provides us with the large-time asymptotic behavior of thesolution in the neighborhood of self-similar solutions. More precisely, the solutionof (1.1) is defined by this self-similar solution in the region |x| � C 3

√t and in

the far region −x � 3√t it has a rapidly oscillating structure similar to that of the

nonlinear Schrödinger equation (see [14]).

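THEOREM 1.2. Let the conditions of Theorem 1.1 be fulfilled. Then, for any $u_0 \in H^{1,1}$, there exist unique functions $H_j$ and $B_j \in L^\infty$ ($B_j$ are real-valued), $j = 1, 2$, such that the following asymptotic formula is valid for large time $t \ge 1$:

$$\begin{aligned}u(t,x) = {}&\frac{1}{\sqrt[3]{t}}\varphi\Bigl(\frac{x}{\sqrt[3]{t}}\Bigr) + \frac{\sqrt{2\pi}}{\sqrt[3]{t}}\,\Re\,\mathrm{Ai}\Bigl(\frac{x}{\sqrt[3]{t}}\Bigr)\Bigl(H_1\Bigl(\frac xt\Bigr)\exp\Bigl(iB_1\Bigl(\frac xt\Bigr)\log\frac{|x|}{\sqrt[3]{t}}\Bigr)\\ &+ H_2\Bigl(\frac xt\Bigr)\exp\Bigl(iB_2\Bigl(\frac xt\Bigr)\log\frac{|x|}{\sqrt[3]{t}}\Bigr)\Bigr) + O\Bigl(\varepsilon t^{4\gamma-\frac5{12}}\Bigl(1+\frac{x}{\sqrt[3]{t}}\Bigr)^{-1/4}\Bigr),\end{aligned} \tag{1.2}$$

where $\gamma \in (0, \frac1{50})$ and

$$\mathrm{Ai}(x) = \frac1\pi\int_0^\infty e^{ixz+\frac i3z^3}\,dz$$

is the Airy–Fock function.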


Remark 1.1. Note that the existence of a unique solution for the Painlevé equation $\varphi''(\xi) = -\xi\varphi + 3A\varphi^3$ with the additional condition that the integral (total mass) $\int\varphi(x)\,dx = \int u_0(x)\,dx$ can be proved via the method appearing in [9]. Applying the Fourier transformation to the Painlevé equation, we get the first-order ordinary differential equation with the nonlinear term in the form of convolution and the additional condition transforms to the Cauchy initial data. Then we can use the standard contraction mapping principle to prove the existence and uniqueness.

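We now state our strategy of the proofs. Theorem 1.1 is obtained by the a-priori estimates of local solutions in the norm

$$\|u(t)\|_X = (\|u(t)\|_{1,0} + \|Ju(t)\|_{1,0})\langle t\rangle^{-1/6} + \|\mathcal{F}G(-t)u(t)\|_\infty,$$

where $J = G(t)xG(-t)$ and the Airy free evolution group

$$G(t)\phi = \mathcal{F}^{-1}e^{\frac{it}3\xi^3}\hat\phi(\xi) = \frac1{2\pi}\int dy\,\phi(y)\int d\xi\,e^{i\xi(x-y)+\frac{it}3\xi^3} = \frac1{\sqrt[3]{t}}\,\Re\int\mathrm{Ai}\Bigl(\frac{x-y}{\sqrt[3]{t}}\Bigr)\phi(y)\,dy.$$

Here and below $\mathcal{F}\phi$ or $\hat\phi$ is the Fourier transform of the function $\phi$, defined by

$$\mathcal{F}\phi(\xi) = \frac1{\sqrt{2\pi}}\int e^{-ix\xi}\phi(x)\,dx$$

and $\mathcal{F}^{-1}\phi$ or $\check\phi$ is the inverse Fourier transform of $\phi$, i.e.

$$\mathcal{F}^{-1}\phi(x) = \frac1{\sqrt{2\pi}}\int e^{ix\xi}\phi(\xi)\,d\xi.$$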

In order to obtain the a-priori estimates of solutions in the norm

$$(\|u(t)\|_{1,0} + \|Ju(t)\|_{1,0})\langle t\rangle^{-1/6},$$

we need to use the operator

$$J = G(t)xG(-t) = x - t\partial_x^2.$$

However, the operator $J$ is not a first-order operator and so it does not work well with the nonlinear term. Hence, we make use of the operator

$$I\phi(t,x) = x\phi + 3t\int_{-\infty}^x\partial_t\phi(t,y)\,dy,$$

which almost commutes with the linear part, $L = \partial_t + \frac13\partial_x^3$, of Equation (1.1) and can be considered as a first-order operator for the nonlinear term. The operator $I$ is related to the operator $J$. Indeed, we have

$$I - J = 3t\int_{-\infty}^x dx'\,L.$$

Therefore, we have the desired estimate via the operator $I$ (see (3.3)–(3.6) for details).
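The relations between $J$, $I$ and $L$ used here can be verified directly. The following is a sketch, assuming the Fourier convention $\mathcal{F}\phi(\xi) = (2\pi)^{-1/2}\int e^{-ix\xi}\phi(x)\,dx$ of this paper (so that multiplication by $x$ becomes $i\partial_\xi$ and $-\partial_x^2$ becomes $\xi^2$) and assuming that $\phi$ and its $x$-derivatives vanish as $x \to -\infty$:

```latex
\begin{align*}
\mathcal{F}\bigl[G(t)\,x\,G(-t)\phi\bigr](\xi)
  &= e^{\frac{it}{3}\xi^3}\, i\partial_\xi\Bigl(e^{-\frac{it}{3}\xi^3}\hat\phi(\xi)\Bigr)
   = \bigl(i\partial_\xi + t\xi^2\bigr)\hat\phi(\xi), \\
\text{hence}\quad J\phi &= x\phi - t\,\partial_x^2\phi, \\
(I - J)\phi
  &= 3t\int_{-\infty}^{x}\partial_t\phi(t,y)\,dy + t\,\partial_x^2\phi(t,x) \\
  &= 3t\int_{-\infty}^{x}\Bigl(\partial_t + \tfrac13\partial_y^3\Bigr)\phi(t,y)\,dy
   = 3t\int_{-\infty}^{x} dx'\,L\phi .
\end{align*}
```

The last line uses $\int_{-\infty}^{x}\partial_y^3\phi\,dy = \partial_x^2\phi$, which is where the decay assumption enters.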

To show a-priori estimates of solutions in the norm $\|\mathcal{F}G(-t)u\|_\infty$, we use the stationary phase method, which gives us Equation (3.8):

$$\hat v_t(t,p) = \frac{\sqrt3\,a(t)p^3}{2(p^3t)^{1-\frac\gamma2}\langle p^3t\rangle^{\frac\gamma2}}e^{-\frac{8it}{27}p^3}\hat v^3\Bigl(t,\frac p3\Bigr) - \frac{3ia(t)p^3}{2(p^3t)^{1-\frac\gamma2}\langle p^3t\rangle^{\frac\gamma2}}|\hat v(t,p)|^2\hat v(t,p) + R(t,p),$$

where $v = G(-t)u$ and $R(t,p)$ is considered as a remainder term in our function space since, from our key lemmas (Lemmas 2.3 and 2.4), we have the estimate

$$\|R(t)\|_\infty \le Cp^3\|u(t)\|_X^3(p^3t)^{-1+\frac\gamma2}\langle p^3t\rangle^{-\frac1{12}+2\gamma},$$

where $\gamma \in (0,\frac1{50})$. The first term on the right-hand side of the above equation for the function $\hat v$ can be shown by integration by parts with respect to time $t$ to be a remainder term and the second one is cancelled by introducing the phase function

$$E_v(t,p) = \exp\Bigl(-\frac{3i}2\int_1^t|\hat v(\tau,p)|^2\frac{a(\tau)p^3\,d\tau}{(p^3\tau)^{1-\frac\gamma2}\langle p^3\tau\rangle^{\frac\gamma2}}\Bigr).$$

So we have the desired estimate.

Theorem 1.2 is obtained by application of the above-mentioned method to the equation for the difference $w = u - S$ in the functional space $Y_\delta$ defined below in Section 2. Here the important fact is that the integral of $w$ is equal to 0, which yields an additional time decay estimate for $w$ (see also the method of paper [16]).

Note that the commutator representations

$$[L, J] = 0, \quad [L, I] = 3\int_{-\infty}^x dx'\,L, \quad [J, \partial_x] = [I, \partial_x] = -1$$

are valid. We also define

$$\hat J\phi(t,\xi) = \partial_\xi\phi(t,\xi), \quad \hat I\phi(t,\xi) = \partial_\xi\phi(t,\xi) - \frac{3t}\xi\partial_t\phi(t,\xi).$$
2. Lemmas

Denote

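$$X = \Bigl\{\phi \in C(\mathbb{R}^+; L^2);\ |\phi|_X = \sup_{t\in\mathbb{R}^+}\|\phi(t)\|_X < \infty\Bigr\},$$

where

$$\|u(t)\|_X = (\|u(t)\|_{1,0} + \|Ju(t)\|_{1,0})\langle t\rangle^{-1/6} + \|\mathcal{F}G(-t)u(t)\|_\infty.$$

The next lemma shows that the function space $X$ involves the $L^\beta$-time decay estimates of the functions from $X$.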


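LEMMA 2.1. Let $u \in X$ be a real-valued function. Then we have the following estimates for all $t > 0$:

$$\|u(t)\|_\beta \le C\langle t\rangle^{-\frac13+\frac1{3\beta}}|u|_X, \tag{2.1}$$

where $4 < \beta \le \infty$ and

$$\|u(t)u_x(t)\|_\infty \le Ct^{-\frac23}\langle t\rangle^{-\frac13}|u|_X^2. \tag{2.2}$$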

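Proof. We follow the method of the proof of Lemma 2.2 from [15, 16], so we only give an outline of the proof. By the Sobolev embedding inequality, we have $\|u\|_\beta \le C\|u\|_{1,0}$, $4 < \beta \le \infty$, therefore we only prove (2.1) for $t \ge 1$. We have the identity

$$\begin{aligned} u(t,x) = G(t)v(t) &= \sqrt{\frac2\pi}\,\Re\int_0^\infty e^{ipx+ip^3t/3}\hat v(t,p)\,dp\\ &= \frac{\sqrt2}{\sqrt\pi\sqrt[3]{t}}\,\Re\int_0^\infty e^{iq\chi+iq^3/3}\Bigl(\hat v(t,\kappa) + \Bigl(\hat v\Bigl(t,\frac q{\sqrt[3]{t}}\Bigr) - \hat v(t,\kappa)\Bigr)\Bigr)dq\\ &= \frac{\sqrt{2\pi}}{\sqrt[3]{t}}\,\Re\,\mathrm{Ai}\Bigl(\frac x{\sqrt[3]{t}}\Bigr)\hat v(t,\kappa) + R(t,x), \end{aligned} \tag{2.3}$$

where we denote

$$R(t,x) = \frac{\sqrt2}{\sqrt\pi\sqrt[3]{t}}\,\Re\int_0^\infty e^{iq\chi+iq^3/3}\Bigl(\hat v\Bigl(t,\frac q{\sqrt[3]{t}}\Bigr) - \hat v(t,\kappa)\Bigr)dq,$$

$\chi = x/\sqrt[3]{t}$, $\kappa = \sqrt{-x/t}$ for $x \le 0$ and $\kappa = 0$ for $x \ge 0$, $v(t) = G(-t)u(t)$. Consider the case $x \ge 0$, i.e. $\chi \ge 0$ and $\kappa = 0$. Using the identity

$$e^{iq\chi+iq^3/3} = \frac1{1+iq(q^2+\chi)}\frac\partial{\partial q}\bigl(qe^{iq\chi+iq^3/3}\bigr), \tag{2.4}$$

we integrate by parts with respect to $q$ in the remainder term $R(t,x)$:

$$R(t,x) = \frac C{\sqrt[3]{t}}\,\Re\int_0^\infty\Bigl(\frac{iq(3q^2+\chi)}{1+iq(q^2+\chi)}\Bigl(\hat v\Bigl(t,\frac q{\sqrt[3]{t}}\Bigr) - \hat v(t,0)\Bigr) - \frac q{\sqrt[3]{t}}\hat J\hat v\Bigl(t,\frac q{\sqrt[3]{t}}\Bigr)\Bigr)\frac{e^{iq\chi+iq^3/3}\,dq}{1+iq(q^2+\chi)}. \tag{2.5}$$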
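Identity (2.4), and its analogue (2.7) for $\chi = -\mu^2$, are both instances of $\partial_q\bigl((q-c)e^{i\Phi}\bigr) = (1 + i(q-c)\Phi'(q))e^{i\Phi}$ with $\Phi(q) = q\chi + q^3/3$ and $\Phi'(q) = q^2 + \chi$. The sketch below checks them numerically with a central finite difference; the function names, step size and sample points are illustrative assumptions.

```python
import cmath

def phase(q, chi):
    """exp(i*(q*chi + q**3/3)), the oscillating factor in R(t, x)."""
    return cmath.exp(1j * (q * chi + q ** 3 / 3.0))

def ddq(f, q, h=1e-6):
    """Central finite-difference approximation of f'(q)."""
    return (f(q + h) - f(q - h)) / (2.0 * h)

def rhs_2_4(q, chi):
    """Right-hand side of (2.4)."""
    return ddq(lambda s: s * phase(s, chi), q) / (1.0 + 1j * q * (q * q + chi))

def rhs_2_7(q, mu):
    """Right-hand side of (2.7), where chi = -mu**2."""
    chi = -mu * mu
    derivative = ddq(lambda s: (s - mu) * phase(s, chi), q)
    return derivative / (1.0 + 1j * (q - mu) ** 2 * (q + mu))

if __name__ == "__main__":
    print(abs(rhs_2_4(0.7, 1.3) - phase(0.7, 1.3)))
    print(abs(rhs_2_7(1.1, 0.4) - phase(1.1, -0.16)))
```

Both differences are at the level of finite-difference error. In (2.7) the weight $q - \mu$ vanishes at the stationary point $q = \mu$ of the phase, which is why the denominator becomes $1 + i(q-\mu)^2(q+\mu)$.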

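By the integration by parts and the Cauchy–Schwarz inequality, we have

$$|\hat v(t,p) - \hat v(t,0)| \le C\sqrt{|p|}\,\|\hat J\hat v(t,p)\| \le C\sqrt{|p|}\,\|Ju(t)\|$$

for all $p \in \mathbb{R}$. Therefore, applying the Cauchy inequality, we get from (2.5)

$$\begin{aligned}|R(t,x)| &\le \frac C{\sqrt[3]{t}}\int_0^\infty\Bigl(\Bigl|\hat v\Bigl(t,\frac q{\sqrt[3]t}\Bigr) - \hat v(t,0)\Bigr| + \frac q{\sqrt[3]t}\Bigl|\hat J\hat v\Bigl(t,\frac q{\sqrt[3]t}\Bigr)\Bigr|\Bigr)\frac{dq}{1+q(q^2+\chi)}\\ &\le \frac C{\sqrt[3]{t^2}}\Bigl(\int_0^\infty\Bigl|\hat J\hat v\Bigl(t,\frac q{\sqrt[3]t}\Bigr)\Bigr|^2dq\int_0^\infty\frac{q^2\,dq}{(1+q(q^2+\chi))^2}\Bigr)^{\frac12} + \frac C{\sqrt t}\|Ju\|\int_0^\infty\frac{\sqrt q\,dq}{1+q(q^2+\chi)}\\ &\le \frac{C\|Ju(t)\|}{\sqrt{\langle\chi\rangle}\sqrt t}.\end{aligned} \tag{2.6}$$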

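Now let us consider the second case $x \le 0$. Denote $\chi = -\mu^2 \le 0$, so that

$$\mu = \frac{\sqrt{|x|}}{\sqrt[6]{t}}, \quad \kappa = \frac\mu{\sqrt[3]{t}}.$$

Using the identity

$$e^{iq\chi+iq^3/3} = \frac1{1+i(q-\mu)^2(q+\mu)}\frac\partial{\partial q}\bigl((q-\mu)e^{iq\chi+iq^3/3}\bigr), \tag{2.7}$$

we integrate by parts with respect to $q$ in the remainder term $R(t,x)$:

$$\begin{aligned}R(t,x) = {}&\frac C{\sqrt[3]t}\,\Re\int_0^\infty\Bigl(\frac{i(q-\mu)^2(3q+\mu)}{1+i(q-\mu)^2(q+\mu)}\Bigl(\hat v\Bigl(t,\frac q{\sqrt[3]t}\Bigr) - \hat v(t,\kappa)\Bigr) - \frac{q-\mu}{\sqrt[3]t}\hat J\hat v\Bigl(t,\frac q{\sqrt[3]t}\Bigr)\Bigr)\frac{e^{iq\chi+iq^3/3}}{1+i(q-\mu)^2(q+\mu)}\,dq\\ &- \frac\mu{\pi\sqrt[3]t}\,\Re\,\frac{\hat v(t,0) - \hat v(t,\kappa)}{1+i\mu^3}.\end{aligned} \tag{2.8}$$

Therefore, by the Cauchy–Schwarz inequality, we obtain by (2.8)

$$\begin{aligned}|R(t,x)| &\le \frac C{\sqrt[3]t}\int_0^\infty\Bigl(\Bigl|\hat v\Bigl(t,\frac q{\sqrt[3]t}\Bigr) - \hat v(t,\kappa)\Bigr| + \frac{|q-\mu|}{\sqrt[3]t}\Bigl|\hat J\hat v\Bigl(t,\frac q{\sqrt[3]t}\Bigr)\Bigr|\Bigr)\frac{dq}{1+(q-\mu)^2(q+\mu)} + \frac{C\|Ju\|}{\sqrt t\,\langle\mu\rangle}\\ &\le \frac C{\sqrt t}\|Ju\|\Bigl(\int_0^\infty\frac{\sqrt{|q-\mu|}\,dq}{1+(q-\mu)^2(q+\mu)} + \Bigl(\int_0^\infty\frac{(q-\mu)^2\,dq}{(1+(q-\mu)^2(q+\mu))^2}\Bigr)^{\frac12}\Bigr)\\ &\le \frac{C\|Ju(t)\|}{\sqrt t\sqrt{\langle\mu\rangle}}.\end{aligned} \tag{2.9}$$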


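Using the estimate $|\mathrm{Ai}(\chi)| \le C\langle\chi\rangle^{-1/4}$ for the Airy-type function (see [10, 11]) we get from (2.3), (2.6) and (2.9)

$$|u(t,x)| \le C\langle t\rangle^{-\frac13}\Bigl(1+\frac{|x|}{\sqrt[3]t}\Bigr)^{-\frac14}\bigl(\|\hat v(t)\|_\infty + \langle t\rangle^{-\frac16}\|Ju(t)\|\bigr) \le C\langle t\rangle^{-\frac13}\Bigl(1+\frac{|x|}{\sqrt[3]t}\Bigr)^{-\frac14}|u|_X \tag{2.10}$$

for all $x \in \mathbb{R}$, $t > 0$. Making a change of the variable $\chi = x/\sqrt[3]t$ and using inequality (2.10), we get

$$\|u(t)\|_\beta \le C\langle t\rangle^{-\frac13+\frac1{3\beta}}\|u\|_X\Bigl(\int_0^\infty\langle\chi\rangle^{-\frac\beta4}\,d\chi\Bigr)^{\frac1\beta} \le C\langle t\rangle^{-\frac13+\frac1{3\beta}}|u|_X \tag{2.11}$$

for any $\beta > 4$. Hence, (2.1) follows. As in (2.3) we have for the derivative $u_x$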
for any β > 4. Hence, (2.1) follows. As in (2.3) we have for the derivative ux

ux(t, x) = −√

2√π

3√t2

�∫ ∞

0eiqχ+iq3/3v

(t,

q3√t

)q dq.

In the domain x � 0, using the identity (2.4), analogously to (2.6), we obtain

|ux(t, x)| � C3√t2

‖v‖∞∫ ∞

0

q dq

|1 + iq(q2 + χ)| + C

t

∫ ∞

0

∣∣Jv(t,

q3√t

)∣∣q2dq

|1 + iq(q2 + χ)|� C

3√t2

(‖v‖∞ + 1

6√〈t〉‖〈p〉Jv(t, p)‖

)� C

3√t2

‖u‖X (2.12)

for all t > 0. And in the domain x � 0 using the identity (2.7) we get in analogywith (2.9), for all t � 1

|ux(t, x)| � C3√t2

∫ ∞

0

(∣∣∣∣v(t,

q3√t

)∣∣∣∣+ |q − µ|3√t

∣∣∣∣Jv

(t,

q3√t

)∣∣∣∣)

×

× q dq

1 + (q − µ)2(q + µ)

� C‖v(t)‖∞3√t2

∫ ∞

0

q dq

1 + (q − µ)2(q + µ)+

+ C6√t5

‖Jv(t, p)‖(∫ ∞

0

q2(q − µ)2 dq

(1 + (q − µ)2(q + µ))2

) 12

� C 4√〈χ〉3√t2

(‖v‖∞ + 1

6√t‖Ju‖

)� C 4

√〈χ〉3√t2

‖u‖X . (2.13)

Now estimate (2.2), for all t � 1, follows from (2.10), (2.12) and (2.13), and forthe case 0 < t < 1, x � 0, we get


|ux(t, x)| � C3√t

∫ ∞

0

(v

(t,

q3√t

)+ |q − µ|

3√t

∣∣∣∣Jv

(t,

q3√t

)∣∣∣∣)

×

×q3√t

dq

1 + (q − µ)2(q + µ)

� C6√t

(∫ ∞

0|v(t, p)|2p2 dp

) 12(∫ ∞

0

dq

(1 + (q − µ)2(q + µ))2

) 12

+

+ C√t‖pJv(t, p)‖

(∫ ∞

0

(q − µ)2 dq

(1 + (q − µ)2(q + µ))2

) 12

� C√t(‖Ju‖1,0 + ‖u‖1,0) � C√

t|u|X,

hence in view of (2.10) and the estimate ‖u‖∞ � C‖u‖1,0, we obtain (2.2) for all0 < t � 1. Lemma 2.1 is proved. ✷

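In order to show the following lemma, we need some function spaces. We denote

$$Y_0 = \{\phi \in L^2; \|\phi\|_{Y_0} < \infty\},$$

where

$$\|u(t)\|_{Y_0} = \langle t\rangle^{-\frac16}\|Ju(t)\| + \|\mathcal{F}G(-t)u(t)\|_\infty$$

and

$$Y_\nu = \Bigl\{\phi \in L^2 : \int\phi(x)\,dx = 0,\ \|\phi\|_{Y_\nu} < \infty\Bigr\}, \quad \nu \in \Bigl(0,\frac12\Bigr),$$

where

$$\|u(t)\|_{Y_\nu} = \langle t\rangle^{\frac\nu3-\frac16}\|Ju(t)\| + \sup_{p\in\mathbb{R}}|p|^{-\nu}\langle p^3t\rangle^{\frac\nu3-\frac16}|\mathcal{F}G(-t)u(t)|.$$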

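LEMMA 2.2. Let

$$u(t,x),\ \vartheta(t,x) \in L^\infty((0,\infty), Y_0), \quad w(t,x) \in L^\infty((0,\infty), Y_\delta), \quad \delta = \frac12 - 3\gamma, \quad \gamma \in \Bigl(0,\frac1{50}\Bigr),$$

be real-valued functions. Then we have the following estimates for all $t > 1$:

$$\|u(t)\vartheta(t)w(t)\| \le C\langle t\rangle^{\gamma-1}\|u\|_{Y_0}\|\vartheta\|_{Y_0}\|w\|_{Y_\delta}. \tag{2.14}$$

Moreover, we have the following asymptotics for large time $t \ge 1$ uniformly with respect to $x \in \mathbb{R}$:

$$w(t,x) = \frac{\sqrt{2\pi}}{\sqrt[3]t}\,\Re\,\mathrm{Ai}\Bigl(\frac x{\sqrt[3]t}\Bigr)\hat r(t,\kappa) + O\Bigl(t^{\gamma-\frac12}\Bigl(1+\frac{|x|}{\sqrt[3]t}\Bigr)^{-\frac14}\|w\|_{Y_\delta}\Bigr), \tag{2.15}$$

where

$$\kappa = \sqrt{-\frac xt}\ \text{ for } x \le 0 \quad\text{and}\quad \kappa = 0\ \text{ for } x \ge 0;$$

the function $r(t) = G(-t)w(t)$ and $G(t)$ is the free Airy evolution group.

Proof. By the Cauchy–Schwarz inequality, we get

$$|\hat r(t,p) - \hat r(t,\kappa)| \le \int_\kappa^p|\hat J\hat r(t,\xi)|\,d\xi \le C\langle t\rangle^{\frac{1-2\delta}6}\sqrt{|p-\kappa|}\,\|w\|_{Y_\delta}. \tag{2.16}$$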

We use the representation similar to (2.5) and modify the estimates (2.6) and (2.9)via (2.16) as follows

|R(t, x)| � C3√t

∫ ∞

0

(∣∣r(t, q3√t

)− r(t, 0)∣∣+ q

3√t

∣∣Jr(t,

q3√t

)∣∣)dq1 + q(q2 + χ)

� C3√t2

(∫ ∞

0

∣∣∣∣Jr

(t,

q3√t

)∣∣∣∣2

dq

∫ ∞

0

q2dq

(1 + q(q2 + χ))2

) 12

+

+C 〈t〉γ− 12 ‖w‖Yδ

∫ ∞

0

(qδ + √q)dq

1 + q(q2 + χ)� Ctγ− 1

2 ‖w‖Yδ√〈χ〉and in the same manner as above

|R(t, x)| � C3√t

∫ ∞

0

(∣∣∣∣r(t,

q3√t

)− r(t, κ)

∣∣∣∣+ |q − µ|3√t

∣∣∣∣Jr

(t,

q3√t

)∣∣∣∣)

×

× dq

1 + (q − µ)2(q + µ)+ C|r(t, κ) − r(t, 0)|

3√t 〈µ〉

� Ctγ− 12 ‖w‖Yδ

(∫ ∞

0

(|q − µ|δ + √|q − µ|)dq1 + (q − µ)2(q + µ)

+

+(∫ ∞

0

(q − µ)2 dq

(1 + (q − µ)2(q + µ))2

) 12)

+ Ctγ− 12

〈µ〉 ‖w‖Yδ

� Ctγ− 12 ‖w‖Yδ√〈µ〉 ,

hence, using the estimate

∣∣r(t, κ)∣∣ = ∣∣r(t, κ) − r(t, 0)∣∣ �

∫ κ

0

∣∣Jr (t, p)∣∣dp � |κ|δ ⟨κ3t

⟩γ ‖w‖Yδ ,

we get

|w(t, x)| � C 〈t〉− 13

(1 + |x|

3√t

)− 14(

〈t〉− δ3 + |x| δ

2

tδ2

(x3√t

) 3γ2)

‖w‖Yδ


and, by virtue of Lemma 2.1 (2.10), we have

‖u (t) ϑ (t) w (t)‖� C

t‖u‖Y0 ‖ϑ‖Y0 ‖w‖Yδ ×

×(∫ (

1 + |x|3√t

)− 32(

〈t〉− 2δ3 + |x|δ

tδ

(x3√t

)3γ)dx

) 12

� C 〈t〉− 56 − δ

3 ‖u‖Y0 ‖ϑ‖Y0 ‖w‖Yδ .

Thus, estimate (2.14) and asymptotics (2.15) are true. Lemma 2.2 is proved. ✷In the next lemma we estimate the following integral:

M(t, p) = p3∫∫

e2it3 p3Q0(t, p, ξ)dξ1 dξ2,

where

$$Q = -\tfrac12\bigl(1 - \xi_1^3 - \xi_2^3 - \xi_3^3\bigr), \quad t > 1, \quad p > 0,$$

the vector ξ = (ξ1, ξ2, ξ3) with the relation ξ1 + ξ2 + ξ3 = 1, the function

0(t, p, ξ) = ψ3 (ξ)

(v(t, pξ1)ϑ(t, pξ2)w(t, pξ3) −

− v(t,

p

3

)ϑ(t,

p

3

)w(t,

p

3

)ψ1(ξ) −

− v(t, p)ϑ(t, p)w(t,−p)ψ2(ξ)

)

and ψj ∈ C2(R3), j = 1, 2, 3 are such that

ψ1(ξ) = 1 as∣∣ξ1 − 1

3

∣∣+ ∣∣ξ2 − 13

∣∣ < 110

and

ψ1(ξ) = 0 as∣∣ξ1 − 1

3

∣∣+ ∣∣ξ2 − 13

∣∣ > 15 ,

ψ2(ξ) = 1 as |ξ1 − 1| + |ξ2 − 1| < 110

and

ψ2(ξ) = 0 as |ξ1 − 1| + |ξ2 − 1| > 15 ,

ψ3(ξ) = 1 as ξ1 > 12 , ξ2 > 1

2

and

ψ3(ξ) = 0 as ξ1 < 16 , or ξ2 < 1

6 .


Denote

|φ|0 = supt�1

(t−

16 ‖φ′(t)‖ + ‖φ(t)‖∞

),

|φ|ν = supt�1

(tν3 − 1

6∥∥φ′(t)

∥∥+ supp∈R

∣∣|p|−ν⟨p3t

⟩ ν3 − 1

6 φ(t, p)∣∣).

(If ν > 0, then by |φ|ν < ∞, we assume also that φ(t, 0) = 0 for all t � 1.)

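LEMMA 2.3. Let the functions $v$, $\vartheta$, $w$ be such that the norms $|v|_\alpha, |\vartheta|_\sigma, |w|_\delta < \infty$, where

$$(\alpha,\sigma,\delta) = (0,0,0),\ \bigl(\tfrac12-3\gamma,0,0\bigr),\ \bigl(0,\tfrac12-3\gamma,0\bigr),\ \text{or}\ \bigl(0,0,\tfrac12-3\gamma\bigr),$$

where $\gamma \in (0,\frac1{50})$. Then the following estimate is valid for all $t > 1$, $p > 0$:

$$|M(t,p)| \le \frac{p^\omega|v|_\alpha|\vartheta|_\sigma|w|_\delta}{(p^3t)^{1-\frac\gamma2}\langle p^3t\rangle^{\frac1{12}-2\gamma}},$$

where $\omega = 3+\alpha+\sigma+\delta$.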

Proof. We make a change of variables of integration

$$\xi_1 = \tfrac13 + z - y, \quad \xi_2 = \tfrac13 + z + y$$

so that $\xi_3 = \tfrac13 - 2z$ and

$$Q = -\tfrac49 + 3z^2(1-z) + y^2(1+3z).$$
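The expression for $Q$ in the variables $(y, z)$ is a purely algebraic consequence of the definition $Q = -\frac12(1 - \xi_1^3 - \xi_2^3 - \xi_3^3)$ and the constraint $\xi_1 + \xi_2 + \xi_3 = 1$. A minimal check in exact rational arithmetic is sketched below; the function names and the sample points are illustrative.

```python
from fractions import Fraction

def to_xi(y, z):
    """Change of variables: (y, z) -> (xi1, xi2, xi3)."""
    third = Fraction(1, 3)
    return third + z - y, third + z + y, third - 2 * z

def q_from_xi(xi1, xi2, xi3):
    """Q = -(1/2) * (1 - xi1^3 - xi2^3 - xi3^3)."""
    return -Fraction(1, 2) * (1 - xi1 ** 3 - xi2 ** 3 - xi3 ** 3)

def q_from_yz(y, z):
    """Q = -4/9 + 3 z^2 (1 - z) + y^2 (1 + 3 z)."""
    return -Fraction(4, 9) + 3 * z ** 2 * (1 - z) + y ** 2 * (1 + 3 * z)

if __name__ == "__main__":
    samples = [(Fraction(1, 7), Fraction(-2, 5)), (Fraction(3, 2), Fraction(5, 11))]
    for y, z in samples:
        xi = to_xi(y, z)
        print(sum(xi), q_from_xi(*xi) == q_from_yz(y, z))
```

Because both sides are cubic polynomials in $(y, z)$, agreement on a sufficiently rich set of rational points confirms the identity. From the $(y,z)$ form one also reads off $\partial_y Q = 2y(1+3z)$ and $\partial_z Q = 3z(2-3z) + 3y^2$, which vanish simultaneously at $(y, z) = (0, 0)$ and $(0, \frac23)$.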

Denote also τ = (2t/3)p3. Then we have

M(t, p) = p3∫∫

eiτQ0(t, p, ξ)dξ1 dξ2.

Let us consider first the case τ ∈ (0, 1). We integrate by parts via the identity

eiτQ = 1

1 + 2iτy2 (1 + 3z)

∂

∂y

(yeiτQ

)to get

M = M1 + M2 − M3,

where

M1(t, p) = p3∫∫

eiτQ0(t, p, ξ)4iτy2(1 + 3z)dy dz

(1 + 2iτy2(1 + 3z))2

and

Mj+1(t, p) = p3∫∫

eiτQ0′

ξj(t, p, ξ)y dy dz

1 + 2iτy2(1 + 3z), j = 2, 3.


In the first summand, we make a change of variables $y\sqrt{1 + 3z} = \eta$ and integrate by parts with respect to $z$. Using the identity

$$e^{i\tau Q} = \frac{1}{1 + 3i\tau z^2(2 - 3z)}\,\frac{\partial}{\partial z}\left(z e^{i\tau Q}\right),$$

we obtain

M1 = p3∫∫

eiτQ0(t, p, ξ)4iτη2 dη dz

(1 + 2iτη2)2√

1 + 3z

= p3∫∫

eiτQ

(6iτz2(4 − 9z)

1 + 3iτz2(2 − 3z)0(t, p, ξ) + 3z

2(1 + 3z)0(t, p, ξ) +

+(z − 3yz

2(1 + 3z)

)0′

ξ1(t, p, ξ) +

(z + 3yz

2(1 + 3z)

)0′

ξ2(t, p, ξ) −

− 2z0′ξ3(t, p, ξ)

)4iτη2 dη dz

(1 + 2iτη2)2√

1 + 3z(1 + 3iτz2(2 − 3z)).

Since

|0(t, p, ξ)| � C6(ξ)〈τ 〉γ pα+σ+δ〈z〉 12 −3γ |v|α|ϑ|σ |w|δ

and

p|Jv(t, pξ1)| � Cpα(τt

) 1−2α6 |√pJv(t, pξ1)|,

we get

|0′ξ1(t, p, ξ)| � C6(ξ)〈τ 〉γ pα+σ+δ〈z〉 1

2 −3γ |ϑ |σ |w|δ ××((τ

t

) 1−2α6 ∣∣√pJv(t, pξ1)

∣∣+ |v|α), (2.17)

where

6(ξ) = 1 if ξ1 > 16 and ξ2 > 1

6 and

6(ξ) = 0 otherwise, Jφ(t, q) = ∂qφ(t, q).

The derivatives 0′ξ2

and 0′ξ3

are estimated in the same way. Then, by the Cauchy–Schwarz inequality, we have

|M1| � Cp3∫∫ (|0(t, p, ξ)| + (|z| + |y|)(∣∣0′

ξ1(t, p, ξ)

∣∣ +

+ ∣∣0′ξ2(t, p, ξ)

∣∣+ ∣∣0′ξ3(t, p, ξ)

∣∣)) dy dz

(1 + τ |y|3)(1 + τ |z|3)� Cpω|v|α|ϑ |σ |w|δ

(∫∫|y|� 1

6 +z

√〈z〉 dy dz

(1 + τ |y|3)(1 + τ |z|3) +


+∫z>− 1

6

〈z〉 32 dz

1 + τ |z|3∫

|y±z|< 16

dy

1 + τ |y|3 +

+∫

dy

1 + τ |y|3(∫

z�|y|− 16

〈z〉3−3γ dz

(1 + τ |z|3)2

) 12)

� Cτγ2 −1pω|v|α|ϑ|σ |w|δ.

We estimate the second summand M2 by the Cauchy–Schwarz inequality to get

|M2| � Cp3∫∫

|0′ξ1(t, p, ξ)| |y| dy dz

1 + τ |z|y2

� Cpω|v|α|ϑ |σ |w|δ(∫

dy

(∫∫z>|y|− 1

6

y2 dz

(1 + τ |z|y2)2

) 12

+

+∫z>− 1

6

〈z〉 32 dz

1 + τ |z|3∫

|y±z|< 16

dy

)� Cτ

γ2 −1pω|v|α|ϑ |σ |w|δ.

The integral $M_3$ is estimated analogously. Thus, in the case $\tau \in (0, 1)$, we have

$$|M(t, p)| \le \frac{Cp^\omega}{\tau^{1-\frac\gamma2}\langle\tau\rangle^{\frac1{12}}}\,|v|_\alpha|\vartheta|_\sigma|w|_\delta. \tag{2.18}$$

Now consider the case $\tau \ge 1$. We have two stationary points: (1) $y = 0$, $z = 0$ and (2) $y = 0$, $z = \tfrac23$. As above, we integrate by parts with respect to $y$ to get $M = M_1 + M_2 - M_3$. Introduce a function $Z \in C^2(\mathbb{R})$ such that $Z = z$ if $z \le \tfrac16$, $Z \ge \tfrac16$ if $\tfrac16 \le z \le \tfrac12$ and $Z = \tfrac23 - z$ if $z \ge \tfrac12$. Then, in the first integral $M_1$, we change the variable of integration $y\sqrt{1 + 3z} = \eta$, integrate by parts with respect to $z$ and use the identity

$$e^{i\tau Q} = \frac{1}{Z' - 9i\tau Zz\left(z - \tfrac23\right)}\,\frac{\partial}{\partial z}\left(Z e^{i\tau Q}\right)$$

to get

M1 = p3∫∫

eiτQ((

Z′ − 9iτZz

(z − 2

3

))′Z

Z′ − 9iτZz(z − 2

3

) 0(t, p, ξ) +

+ 3Z

2(1 + 3z)0(t, p, ξ) +

(Z − 3yZ

2 (1 + 3z)

)0′

ξ1(t, p, ξ) +

+(

Z + 3yZ

2 (1 + 3z)

)0′

ξ2(t, p, ξ) + 2Z0′

ξ3(t, p, ξ)

)×

× 4iτη2 dη dz

(1 + 2iτη2)2√

1 + 3z(Z′ − 9iτZz

(z − 2

3

)) .


From the structure of the function 0, we have

|0(t, p, ξ)| � Cτ16 pα+σ+δ

(√|y| +√|Z|)|v|α|ϑ |σ |w|δand we estimate the derivatives 0′

ξ1, 0′

ξ2and 0′

ξ3as above. Then, since∣∣∣∣Z′ − 9iτZz

(z − 2

3

)∣∣∣∣ � C

(1 + τ

∣∣∣∣Zz(z − 2

3

)∣∣∣∣)

� C(1 + τZ2

),

we get

|M1| � Cp3∫∫

|y|� 16 +z

dy dz

(1 + τy2)(1 + τZ2)

(|0(t, p, ξ)| +

+ (|Z| + |y|) (|0′ξ1(t, p, ξ)| + |0′

ξ2(t, p, ξ)| + |0′

ξ3(t, p, ξ)|))

� Cτ16 pω|v|α|ϑ |σ |w|δ

(∫∫|y|� 1

6 +z

(√|y| + √|Z|)dy dz

(1 + τy2)(1 + τZ2〈z〉) +

+∫

dy

1 + τy2

(∫ 〈z〉Z2 dz

(1 + τZ2 〈z〉)2

) 12

+

+∫ 〈z〉 dz

1 + τZ2 〈z〉(∫

y2 dy

(1 + τy2)2

) 12

+

+∫z>− 1

6

|Z| 〈z〉 dz

1 + τ |z|Z2

∫|y±z|< 1

6

dy

1 + τy2

)

� Cτ− 1312 pω|v|α|ϑ |σ |w|δ. (2.19)

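Consider the integral $M_2$. We make a change of variables of integration

$$\xi_1 = 1 - \tfrac23\zeta,\qquad \xi_2 = \tfrac\zeta3 + \eta,\qquad \xi_3 = \tfrac\zeta3 - \eta,$$

that is

$$y = \frac{\zeta + \eta - 1}{2},\qquad z = \frac{3\eta - \zeta + 1}{6}.$$

Then

$$Q = \zeta\eta^2 + \tfrac23\zeta^2 - \tfrac19\zeta^3 - \zeta.$$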
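The expression for $Q$ in the variables $(\zeta, \eta)$ can be verified by the same expansion as before. The following sketch relies only on the definitions in the text, together with the relations $y = (\xi_2 - \xi_1)/2$ and $z = (\xi_1 + \xi_2)/2 - \tfrac13$ implied by the earlier change of variables.

```latex
\begin{align*}
y &= \tfrac12(\xi_2 - \xi_1) = \tfrac12(\zeta + \eta - 1), \qquad
z = \tfrac12(\xi_1 + \xi_2) - \tfrac13 = \tfrac16(3\eta - \zeta + 1),\\
\xi_2^3 + \xi_3^3 &= 2\left(\tfrac\zeta3\right)^3 + 6\cdot\tfrac\zeta3\,\eta^2
   = \tfrac{2}{27}\zeta^3 + 2\zeta\eta^2,\\
\xi_1^3 &= \left(1 - \tfrac23\zeta\right)^3
   = 1 - 2\zeta + \tfrac43\zeta^2 - \tfrac{8}{27}\zeta^3,\\
1 - \xi_1^3 - \xi_2^3 - \xi_3^3 &= 2\zeta - \tfrac43\zeta^2 + \tfrac29\zeta^3 - 2\zeta\eta^2,\\
Q &= -\tfrac12\bigl(1 - \xi_1^3 - \xi_2^3 - \xi_3^3\bigr)
   = \zeta\eta^2 + \tfrac23\zeta^2 - \tfrac19\zeta^3 - \zeta,
\qquad \partial_\eta Q = 2\zeta\eta.
\end{align*}
```

The last derivative is what produces the factor $2i\tau\eta H\zeta$ when one integrates by parts with respect to $\eta$.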

Thus we have

M2 = p3∫∫

eiτQ0′

ξ1(t, p, ξ)(ζ + η − 1)dη dζ

(2 + iτ (ζ + η − 1)2)(1 + 12(3η − ζ + 1))

.

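We divide the domain of integration into two parts: (1) $\eta \ge \tfrac12$ and (2) $\eta < \tfrac12$. We integrate by parts with respect to $\eta$ with the identity

$$e^{i\tau Q} = \frac1A\,\frac{\partial}{\partial\eta}\left(H e^{i\tau Q}\right),$$

where $A = 1 + 2i\tau\eta H\zeta$, $H = \eta - 1$ if $\eta \ge \tfrac12$ and $H = \eta$ if $\eta < \tfrac12$, to get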

M2 = −p3

4

∫eiτQ0′

ξ1(t, p, ξ)

iτζ dζ

(4 + τ 2ζ 2)B+

+p3∫∫

eiτQ

(0′

ξ1(t, p, ξ)

(−1 + (ζ + η − 1)A′

η

A+ (ζ + η − 1)B ′

η

B

)+

+ (ζ + η − 1)

(0′′

ξ1ξ3(t, p, ξ) − 0′′

ξ1ξ2(t, p, ξ)

))H dζ dη

AB,

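where

$$A'_\eta = 2i\tau(\eta + H)\zeta,\qquad B = 2 + i\tau(\zeta + \eta - 1)^2\left(1 + \tfrac12(3\eta - \zeta + 1)\right),$$

$$B'_\eta = 2i\tau(\zeta + \eta - 1)\left(1 + \tfrac12(3\eta - \zeta + 1)\right) + \tfrac32 i\tau(\zeta + \eta - 1)^2.$$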

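If the domain of integration is

$$\tfrac12 - 3\eta \le \zeta \le \tfrac54,\qquad \eta \ge -\tfrac14,$$

then we have

$$1 + \tfrac12(3\eta - \zeta + 1) \ge \tfrac12 + \left(\tfrac54 - \zeta\right) + \tfrac32\left(\eta + \tfrac\zeta3 - \tfrac16\right) \ge \tfrac12,$$

which implies

$$|B| \ge C\left(1 + \tau(\zeta + \eta - 1)^2\right),\qquad |HA'_\eta| \le C|A|,\qquad |(\zeta + \eta - 1)HB'_\eta| \le C|B|\left(|H| + |\zeta + \eta - 1|\right).$$

Therefore,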

|M2| � Cp3∫∫ ∣∣0′

ξ1(t, p, ξ)

∣∣η= 1

2

dζ

(1 + τ |ζ |)(1 + τ(ζ − 1

2

)2) +

+Cp3∫∫ (

(|H | + |ζ + η − 1|)|0′ξ1(t, p, ξ)| +

+ |H | |ζ + η − 1| (|0′′ξ1ξ3

(t, p, ξ)| + |0′′ξ1ξ2

(t, p, ξ)|)) ×× dζ dη

(1 + τ |ηHζ |)(1 + τ(ζ + η − 1)2).

As in (2.17), we have∣∣0′ξ1(t, p, ξ)

∣∣ � C6(ξ)〈τ 〉γ pα+σ+δ〈z〉 12 −3γ |ϑ |σ |w|δ ×

×(

|v|α +(τt

) 1−2α6 |√pJv(t, pξ1)|

)


and ∣∣0′′ξ1ξ2

(t, p, ξ)∣∣

� C6(ξ)〈τ 〉γ pα+σ+δ〈z〉 12 −3γ |w|δ

(|v|α|ϑ |σ +

+(τt

) 1−2α6 |√pJv(t, pξ1)||ϑ|σ +

(τt

) 1−2σ6 |√pJϑ(t, pξ2)||v|α +

+(τt

) 1−α−σ3 |√pJv(t, pξ1)||√pJϑ(t, pξ2)|

),

the derivative |0′′ξ1ξ3

(t, p, ξ)| is estimated in the same way. Hence, we obtain

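$$|M_2| \le Cp^\omega|v|_\alpha|\vartheta|_\sigma|w|_\delta\sum_{j=1}^{6} I_j,$$

where

$$I_1 = \int_{-1}^{\frac54}\frac{d\zeta}{(1 + \tau|\zeta|)\left(1 + \tau(\zeta - \tfrac12)^2\right)} + \tau^{\frac16}\left(\int_{-1}^{\frac54}\frac{d\zeta}{(1 + \tau|\zeta|)^2\left(1 + \tau(\zeta - \tfrac12)^2\right)^2}\right)^{\frac12},$$

$$I_2 = \int_{-\frac14}^{\infty}d\eta\int_{\frac12-3\eta}^{\frac54}\frac{|\zeta + \eta - 1|\sqrt{\langle\zeta\rangle}\,d\zeta}{(1 + \tau|\eta H\zeta|)\left(1 + \tau(\zeta + \eta - 1)^2\right)},$$

$$I_3 = \int_{-\frac14}^{\infty}d\eta\int_{\frac12-3\eta}^{\frac54}\frac{|H|\sqrt{\langle\zeta\rangle}\,d\zeta}{(1 + \tau|\eta H\zeta|)\left(1 + \tau(\zeta + \eta - 1)^2\right)},$$

$$I_4 = \tau^{\frac16}\int_{-\frac14}^{\infty}d\eta\left(\int_{\frac12-3\eta}^{\frac54}\frac{(1 + H^2)(\zeta + \eta - 1)^2\langle\zeta\rangle\,d\zeta}{(1 + \tau|\eta H\zeta|)^2\left(1 + \tau(\zeta + \eta - 1)^2\right)^2}\right)^{\frac12},$$

$$I_5 = \tau^{\frac16}\int_{-\frac14}^{\infty}d\eta\left(\int_{\frac12-3\eta}^{\frac54}\frac{H^2\langle\zeta\rangle\,d\zeta}{(1 + \tau|\eta H\zeta|)^2\left(1 + \tau(\zeta + \eta - 1)^2\right)^2}\right)^{\frac12},$$

$$I_6 = \tau^{\frac13}\left(\int_{-\frac14}^{\infty}d\eta\int_{\frac12-3\eta}^{\frac54}\frac{H^2(\zeta + \eta - 1)^2\langle\zeta\rangle^{1-6\gamma}\,d\zeta}{(1 + \tau|\eta H\zeta|)^2\left(1 + \tau(\zeta + \eta - 1)^2\right)^2}\right)^{\frac12}.$$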

We estimate now each integral. We have

I1 � C

τ

∫ 14

−1

dζ

(1 + τ |ζ |) + C

τ

∫ 54

14

dζ

1 + τ(ζ − 12 )

2+

+ C

τ56

(∫ 14

−1

dζ

(1 + τ |ζ |)2

) 12

+ C

τ56

(∫ 54

14

dζ

(1 + τ(ζ − 12)

2)2

) 12

� C

τ32

.

Now we consider the integral

I2 � C√τ

∫ 12

− 14

dη∫ 5

4

12 −3η

dζ

√〈ζ 〉(τη2|ζ |) 1

2 −γ (τ (ζ + η − 1)2)12 −γ

+


+ C√τ

∫ ∞

12

dη∫ 5

4

12 −3η

dζ

√〈ζ 〉(τη |η − 1| |ζ |)1−γ

� C

τ32 −2γ

∫ 12

− 14

dη∫ 5

4

12 −3η

dζ

√〈ζ 〉η1−2γ |ζ | 1

2 −γ |ζ + η − 1|1−2γ+

+ C

τ32 −2γ

∫ ∞

12

dη∫ 5

4

12 −3η

dζ

√〈ζ 〉η1−γ |η − 1|1−γ |ζ |1−γ

� C

τ32 −2γ

∫dη

〈η〉1−3γ

|η|1−γ |η − 1|1−γ� C

τ32 −2γ

.

In the same manner, we get

I3 � C

∫ 12

− 14

dη∫ 5

4

12 −3η

dζ|η|√〈ζ 〉

(τη2|ζ |)1−γ (τ (ζ + η − 1)2)12 −γ

+

+C

∫ ∞

12

dη∫ 5

4

12 −3η

dζ|η − 1|√〈ζ 〉

(τη|η − 1||ζ |)1−γ (τ (ζ + η − 1)2)12 −γ

� C

τ32 −2γ

∫ 12

− 14

dη∫ 5

4

12 −3η

dζ

√〈ζ 〉η1−2γ |ζ |1−γ |ζ + η − 1|1−2γ

+

+ C

τ32 −2γ

∫ ∞

12

dη∫ 5

4

12 −3η

dζ|η − 1|γ √〈ζ 〉

η1−γ |ζ |1−γ |ζ + η − 1|1−2γ

� C

τ32 −2γ

∫dη

〈η〉1−3γ

|η|1−γ |η − 1|1−γ� C

τ32 −2γ

.

Now we have

I4 � Cτ− 13

∫ ∞

− 14

dη

(∫ 54

12 −3η

〈η〉1−6γ dζ

(1 + τ |ηHζ |)2(1 + τ(ζ + η − 1)2)

) 12

� Cτ− 13

∫ ∞

− 14

dη

(∫ 54

12 −3η

〈η〉 12 −6γ dζ

(τ |η|H |ζ |)1−γ (τ (ζ + η − 1)2)12 −γ

) 12

� Cτγ− 1312

∫ ∞

− 14

dη

(∫ 54

12 −3η

〈η〉 12 −6γ dζ

(|η|H |ζ |)1−γ |ζ + η − 1|1−2γ

) 12

� Cτγ− 1312

∫|η| γ−1

2 Hγ−1

2 |η − 1|γ− 12 〈η〉 1

2 −6γ dη � Cτγ− 1312 , (2.20)

analogously dividing the domain of integration in seven parts then we obtain

I5 � Cτ16

∫ 12

0dη

(∫ 54

1−η2

F1 dζ

) 12

+ Cτ16

∫ 12

− 14

dη

(∫ 54

12 −3η

F2 dζ

) 12

+


+Cτ16

∫ 12

0dη

(∫ 1−η2

12 −3η

F3 dζ

) 12

+ Cτ16

∫ 1

12

dη

(∫ 54

1−η2

F4 dζ

) 12

+

+Cτ16

∫ 1

12

dη

(∫ 1−η2

12 −3η

F5 dζ

) 12

+ Cτ16

∫ ∞

1dη

(∫ 54

1−η2

F5 dζ

) 12

+

+Cτ16

∫ ∞

1dη

(∫ 1−η2

12 −3η

F4 dζ

) 12

� Cτ 2γ− 1213 ,

where

F1 = η2〈ζ 〉(τη2|ζ |)2−2γ (τ (ζ + η − 1)2)

12 −γ

,

F2 = η2〈ζ 〉(τη2|ζ |)2γ−2(τ (ζ + η − 1)2)

12 −γ

,

F3 = η2〈ζ 〉(τη2|ζ |)1−γ τ 2(ζ + η − 1)4

,

F4 = (η − 1)2〈ζ 〉(τη(η − 1)2)2−2γ (τ (ζ + η − 1)2)

12 −γ

,

F5 = (η − 1)2〈ζ 〉(τη|η − 1||ζ |)1−γ (τ (η − 1)2)

32

.

For the last integral we get

I6 � Cτ− 16

(∫ ∞

− 14

dη∫ 5

4

12 −3η

H 2〈ζ 〉1−6γ

(1 + τ |ηHζ |)2(1 + τ(ζ + η − 1)2)dζ

) 12

� Cτ− 16

(∫ 12

0dη∫ 5

4

1−η2

F6 dζ

) 12

+ Cτ− 16

(∫ 0

− 14

dη∫ 5

4

12 −3η

F6 dζ

) 12

+

+Cτ− 16

(∫ 12

0dη∫ 1−η

2

12 −3η

F7 dζ

) 12

+ Cτ− 16

(∫ 1

12

dη∫ 5

4

1−η2

F8 dζ

) 12

+

+Cτ− 16

(∫ 1

12

dη∫ 1−η

2

12 −3η

F9 dζ

) 12

+ Cτ− 16

(∫ ∞

1dη∫ 5

4

1−η2

F10 dζ

) 12

+

+Cτ− 16

(∫ ∞

1dη∫ 1−η

2

12 −3η

F8 dζ

) 12

� Cτγ− 76 ,

where

F6 = η2〈ζ 〉1−6γ

(τη2|ζ |) 32 −γ (τ (ζ + η − 1)2)

12 −γ

,


F7 = η2〈ζ 〉1−6γ

(τη2|ζ |)1−γ τ (ζ + η − 1)2,

F8 = (η − 1)2〈ζ 〉1−6γ

(τη(η − 1)2)32 −γ (τ (ζ + η − 1)2)

12 −γ

,

F9 = (η − 1)2〈ζ 〉1−6γ

(τη|η − 1||ζ |)1−γ (τ (η − 1)2)1−γ,

F10 = (η − 1)2〈ζ 〉1−6γ

(τη|η − 1||ζ |)1−γ τ (η − 1)2.

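Thus, we obtain

$$|M_2| \le C\tau^{2\gamma-\frac{13}{12}}p^\omega|v|_\alpha|\vartheta|_\sigma|w|_\delta. \tag{2.21}$$

In the integral $M_3$, we make a change of variables

$$\xi_1 = \tfrac\zeta3 + \eta,\qquad \xi_2 = 1 - \tfrac23\zeta,\qquad \xi_3 = \tfrac\zeta3 - \eta,$$

that is

$$y = -\frac{\zeta + \eta - 1}{2},\qquad z = \frac{3\eta - \zeta + 1}{6},$$

and then all the estimates are the same as for $M_2$, so we get the estimate

$$|M_3| \le C\tau^{2\gamma-\frac{13}{12}}p^\omega|v|_\alpha|\vartheta|_\sigma|w|_\delta. \tag{2.22}$$

From the estimates (2.18)–(2.22) the result of the lemma follows. Lemma 2.3 is proved. ✷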

In the next lemma, we consider the asymptotic representation for the nonlinearity

$$N(t, p) = p^3\iint e^{\frac{2it}{3}p^3Q}\,v(t, p\xi_1)v(t, p\xi_2)w(t, p\xi_3)\,d\xi_1\,d\xi_2,$$

where

$$Q = -\tfrac12\left(1 - \xi_1^3 - \xi_2^3 - \xi_3^3\right),\qquad t > 0,\ p > 0,\ \xi_1 + \xi_2 + \xi_3 = 1.$$

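LEMMA 2.4. Let the functions $v$, $w$ be such that the norms $|v|_0, |w|_\delta < \infty$, where

$$\delta \in \left[0, \tfrac12 - 3\gamma\right],\qquad \gamma \in \left(0, \tfrac1{50}\right).$$

Then the following representation is valid for all $t > 1$, $p > 0$:

$$N(t, p) = -\frac{\pi\sqrt3\,p^3}{(p^3t)^{1-\frac\gamma2}\langle p^3t\rangle^{\frac\gamma2}}\,e^{-\frac{8it}{27}p^3}\,v^2\!\left(t, \frac p3\right)w\!\left(t, \frac p3\right) + \frac{i\pi p^3}{(p^3t)^{1-\frac\gamma2}\langle p^3t\rangle^{\frac\gamma2}}\left(2|v(t, p)|^2w(t, p) + \overline{w(t, p)}\,v^2(t, p)\right) + O\!\left(\frac{p^{3+\delta}|v|_0^2|w|_\delta}{(p^3t)^{1-\frac\gamma2}\langle p^3t\rangle^{\frac1{12}-2\gamma}}\right).$$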

Proof. There are four stationary points in the integral $N$:

(1) $\xi_1 = \tfrac13$, $\xi_2 = \tfrac13$, $\xi_3 = \tfrac13$,

(2) $\xi_1 = 1$, $\xi_2 = 1$, $\xi_3 = -1$,

(3) $\xi_1 = 1$, $\xi_2 = -1$, $\xi_3 = 1$,

(4) $\xi_1 = -1$, $\xi_2 = 1$, $\xi_3 = 1$.
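The list of stationary points can be recovered directly from the phase. The sketch below treats $\xi_3 = 1 - \xi_1 - \xi_2$ as a dependent variable, as in the text; the auxiliary constant $c \ge 0$ is introduced here only for the case analysis.

```latex
\begin{align*}
Q &= \tfrac12\bigl(\xi_1^3 + \xi_2^3 + \xi_3^3\bigr) - \tfrac12, \qquad \xi_3 = 1 - \xi_1 - \xi_2,\\
\partial_{\xi_1}Q &= \tfrac32\bigl(\xi_1^2 - \xi_3^2\bigr), \qquad
\partial_{\xi_2}Q = \tfrac32\bigl(\xi_2^2 - \xi_3^2\bigr),\\
\nabla Q = 0 &\iff \xi_1^2 = \xi_2^2 = \xi_3^2 =: c^2, \quad \xi_1 + \xi_2 + \xi_3 = 1,\\
(+,+,+):&\quad 3c = 1 \ \Rightarrow\ \xi = \bigl(\tfrac13, \tfrac13, \tfrac13\bigr), \qquad Q = -\tfrac49,\\
\text{one minus sign}:&\quad c = 1 \ \Rightarrow\ \xi \in \{(1,1,-1),\ (1,-1,1),\ (-1,1,1)\}, \qquad Q = 0.
\end{align*}
```

Two or three minus signs would force $-c = 1$ or $-3c = 1$, which is impossible for $c \ge 0$, so these four points are the only ones. The phase $\tfrac{2t}{3}p^3Q$ equals $-\tfrac{8t}{27}p^3$ at the first point and vanishes at the other three.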

In view of the symmetry with respect to variables ξ1, ξ2, ξ3 we can write therepresentation N = ∑5

j=1 Nj , where

N1(t, p) = p3v2(t,

p

3

)w(t,

p

3

) ∫∫e

2it3 p3Qψ1 (ξ) dξ1 dξ2,

N2(t, p) = p3(2|v(t, p)|2w(t, p)+

+ w(t, p)v2(t, p)) ∫∫

e2it3 p3Qψ2 (ξ) dξ1 dξ2,

Nj (t, p) = p3∫∫

e2it3 p3Q0j (t, p, ξ) dξ1 dξ2, j = 3, 4, 5,

the functions 0j(t, p, ξ) are

03(t, p, ξ) =(v(t, pξ1)v(t, pξ2)w(t, pξ3) − v2

(t,

p

3

)w(t,

p

3

)ψ1 (ξ) −

− v2(t, p)w(t, p)ψ2(ξ)

)ψ3 (ξ) ,


(t,

p

3

)w(t,

p

3

)ψ1 (ξ) −

− |v(t, p)|2w(t, p)ψ2(ξ)

)ψ4(ξ),


(t,

p

3

)w(t,

p

3

)ψ1 (ξ) −

− |v(t, p)|2w(t, p)ψ2 (ξ)

)ψ5(ξ),


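where $\xi = (\xi_1, \xi_2, \xi_3)$ and the functions $\psi_j \in C^2(\mathbb{R}^3)$, $j = 1, 2, 3, 4, 5$, are such that

$\psi_1(\xi) = 1$ if $\left|\xi_1 - \tfrac13\right| + \left|\xi_2 - \tfrac13\right| < \tfrac1{10}$ and $\psi_1(\xi) = 0$ if $\left|\xi_1 - \tfrac13\right| + \left|\xi_2 - \tfrac13\right| > \tfrac15$;

$\psi_2(\xi) = 1$ if $|\xi_1 - 1| + |\xi_2 - 1| < \tfrac1{10}$ and $\psi_2(\xi) = 0$ if $|\xi_1 - 1| + |\xi_2 - 1| > \tfrac15$;

$\psi_j(\xi) = 1$ as $\xi_1 > \tfrac12$, $\xi_2 > \tfrac12$ and $\psi_j(\xi) = 0$ as $\xi_1 < \tfrac16$ or $\xi_2 < \tfrac16$ for $j = 3, 4, 5$;

moreover, we assume that

$$\psi_3(\xi) + \psi_4(\xi_2, \xi_3, \xi_1) + \psi_5(\xi_3, \xi_1, \xi_2) = 1.$$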

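Using the stationary phase method (see [10, 11]), for large values of $p^3t > 1$, we get

$$N_1(t, p) = \left(-\frac{\pi\sqrt3}{p^3t}\,e^{-\frac{8it}{27}p^3} + O\!\left(\frac{1}{p^6t^2}\right)\right)p^3v^2\!\left(t, \frac p3\right)w\!\left(t, \frac p3\right)$$

and

$$N_2(t, p) = \left(\frac{i\pi}{p^3t} + O\!\left(\frac{1}{p^6t^2}\right)\right)\left(2p^3|v(t, p)|^2w(t, p) + p^3\,\overline{w(t, p)}\,v^2(t, p)\right).$$

For the summands $N_j$, $j = 3, 4, 5$, we can write an estimate via Lemma 2.2

$$|N_j(t, p)| \le \frac{p^{3+\delta}|v|_0^2|w|_\delta}{(p^3t)^{1-\frac\gamma2}\langle p^3t\rangle^{\frac1{12}-2\gamma}},$$

hence the result of the lemma follows. Lemma 2.4 is now proved. ✷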

3. Proof of Theorems

We first state the local existence theorem.

THEOREM 3.1. Let $u_0 \in H^{1,1}(\mathbb{R})$. Then there exists a finite time interval $[0, T]$ with $T > 0$ such that there exists a unique solution $u \in C([0, T]; X)$ of the Cauchy problem (1.1) satisfying the estimate $\sup_{t\in[0,T]}\|u\|_X \le 2\|u_0\|_X$. Moreover, if we assume that the norm of the initial data $\|u_0\|_{1,1}$ is sufficiently small, then there exists a finite time interval $[0, T]$ with $T > 1$ and a unique solution $u \in C([0, T]; X)$ of the Cauchy problem (1.1) such that $\sup_{t\in[0,T]}\|u\|_X \le 2\|u_0\|_X$.

For the proof of Theorem 3.1, see [4–6, 12, 17–19, 25, 31].

Proof of Theorem 1.1. Let $u$ be the local solution of the Cauchy problem (1.1) described in Theorem 3.1. Then we prove the following estimate:

$$\|u(t)\|_X < 20\varepsilon \tag{3.1}$$

for any $t \in [0, T]$, where $\gamma \in (0, \tfrac1{50})$, $\varepsilon = \|u_0\|_{1,1}$. We prove the theorem by contradiction. We assume that $T_1 > 1$ is the maximal time such that estimate (3.1) is valid for $t \in [0, T_1)$, but is violated at $t = T_1$. We first note that the two conservation laws $\|u\| = \|u_0\|$ and $|\hat u(t, 0)| = |\hat u_0(0)|$ hold. We differentiate Equation (1.1) with respect to $x$ to get $Lu_x = -a(t)(u^3)_{xx}$, where $Lu = (\partial_t + \tfrac13\partial_x^3)u$. Multiplying both sides of this equation by $u_x$ and integrating by parts, we obtain

$$\frac{d}{dt}\|u_x\|^2 \le C\|uu_x\|_\infty\|u_x\|^2.$$

Using estimates (2.1), (2.2) of Lemma 2.1 and Theorem 3.1, we get

$$\|u(t)\|_\beta \le C\varepsilon\langle t\rangle^{-\frac13+\frac1{3\beta}},\qquad \|u(t)u_x(t)\|_\infty \le C\varepsilon^2t^{-\frac23}\langle t\rangle^{-\frac13}, \tag{3.2}$$

where $4 < \beta \le \infty$. By the energy method we easily see that $\|u\|_{1,0} \le \varepsilon\langle t\rangle^\gamma$. A direct computation shows that

$$I\left(a(t)u^3\right)_x = 3ta'(t)u^3 + 3a(t)u^2Iu_x.$$

Therefore applying the operator $I$ to Equation (1.1) we find

$$LIu = ILu + 3\int_{-\infty}^{x}Lu\,dx' = -3ta'(t)u^3 - a(t)I(u^3)_x - 3a(t)u^3 = -3ta'(t)u^3 - 3a(t)u^2(Iu)_x.$$

Hence, we get

$$\frac{d}{dt}\|Iu\|^2 \le C\|uu_x\|\,\|Iu\|^2 + C\langle t\rangle^{-\frac16}\|u^3\|. \tag{3.3}$$

Similarly, we find

$$LIu_x = ILu_x + 3Lu = -3ta'(t)(u^3)_x - a(t)I(u^3)_{xx} - 3a(t)(u^3)_x = -3ta'(t)(u^3)_x - 3a(t)\left(u^2(Iu_x)_x + uu_xIu_x + 2u^2u_x\right).$$

Multiplying both sides of the equation by $Iu_x$ and integrating by parts, we obtain

$$\frac{d}{dt}\|Iu_x\|^2 \le C\|uu_x\|_\infty\left(\|Iu_x\| + \|u\|\right)\|Iu_x\|. \tag{3.4}$$

Applying (3.2) to (3.3)–(3.4) and using the Gronwall inequality for the resulting inequalities, we obtain the estimate

$$\|Iu\| + \|Iu_x\| \le 2\varepsilon\langle t\rangle^\gamma, \tag{3.5}$$

thus by (3.2), (3.5) and Lemma 2.1, we get

$$\|Ju\|_{1,0} \le \|Iu\| + \|Iu_x\| + \|u\| + Ct\|u^3\| + Ct\|u^2u_x\| \le 4\varepsilon\langle t\rangle^{\frac16}. \tag{3.6}$$

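Multiplying both sides of (1.1) by $G(-t)$, we get

$$\left(G(-t)u(t)\right)_t + a(t)G(-t)(u^3)_x = 0.$$

Taking the Fourier transformation, we get

$$v_t(t, p) + \frac{ip}{2\pi}a(t)\iint d\zeta_1\,d\zeta_2\,e^{itQ}v(t, \zeta_1)v(t, \zeta_2)v(t, \zeta_3) = 0, \tag{3.7}$$

where

$$\zeta_3 = p - \zeta_1 - \zeta_2,\qquad Q = -\tfrac13\left(p^3 - \zeta_1^3 - \zeta_2^3 - \zeta_3^3\right),\qquad v(t) = G(-t)u(t).$$

We have $v(t, -p) = \overline{v(t, p)}$ since the solution $u(t, x)$ is real, therefore it is sufficient to consider only the case $p > 0$. Changing the variables of integration $\zeta_j = p\xi_j$ and applying Lemma 2.2 to (3.7), we get the following equation for the function $v(t, p)$ for all $p > 0$, $t > 1$:

$$v_t(t, p) = \frac{\sqrt3\,a(t)p^3}{2(p^3t)^{1-\frac\gamma2}\langle p^3t\rangle^{\frac\gamma2}}\,e^{-\frac{8it}{27}p^3}v^3\!\left(t, \frac p3\right) - \frac{3ia(t)p^3}{2(p^3t)^{1-\frac\gamma2}\langle p^3t\rangle^{\frac\gamma2}}\,|v(t, p)|^2v(t, p) + O\!\left(\frac{p^3|v|_0^3}{(p^3t)^{1-\frac\gamma2}\langle p^3t\rangle^{\frac1{12}-2\gamma}}\right). \tag{3.8}$$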

To get rid of the second summand in the right-hand side of Equation (3.8), we make a change of the dependent variable $v = fE_v$, with

$$E_v(t, p) = \exp\left(-\frac{3i}{2}\int_1^t|v(\tau, p)|^2\frac{a(\tau)p^3\,d\tau}{(p^3\tau)^{1-\frac\gamma2}\langle p^3\tau\rangle^{\frac\gamma2}}\right).$$
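To see why this substitution removes the cubic term $|v|^2v$, one can differentiate $v = fE_v$ directly. In the sketch below the shorthand $\kappa(t, p)$ for the common weight is an assumption of this note and does not appear in the original text.

```latex
\begin{align*}
\kappa(t,p) &:= \frac{a(t)p^3}{(p^3t)^{1-\gamma/2}\langle p^3t\rangle^{\gamma/2}}, \qquad
\partial_t E_v = -\tfrac{3i}{2}\,\kappa\,|v|^2E_v,\\
v_t &= f_tE_v + f\,\partial_tE_v = f_tE_v - \tfrac{3i}{2}\,\kappa\,|v|^2v,\\
\text{so by (3.8)}\qquad
f_tE_v &= \tfrac{\sqrt3}{2}\,\kappa\,e^{-\frac{8it}{27}p^3}v^3\!\left(t,\tfrac p3\right)
   + O\!\left(\frac{p^3|v|_0^3}{(p^3t)^{1-\gamma/2}\langle p^3t\rangle^{1/12-2\gamma}}\right).
\end{align*}
```

Since $a$ is real, the exponent of $E_v$ is purely imaginary, so $|E_v| = 1$ and $|f| = |v|$; dividing by $E_v$ therefore does not change the size of the remaining terms.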

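Then integrating the resulting equation with respect to $t$, we obtain

$$f(t) = f(1) - C\int_1^tE_v(\tau)\,e^{-\frac{8i\tau}{27}p^3}v^3\!\left(\tau, \frac p3\right)\frac{a(\tau)p^3\,d\tau}{(p^3\tau)^{1-\frac\gamma2}\langle p^3\tau\rangle^{\frac\gamma2}} + O\!\left(\varepsilon^3\int_1^t\frac{p^3\,d\tau}{(p^3\tau)^{1-\frac\gamma2}\langle p^3\tau\rangle^{\frac1{12}-2\gamma}}\right).$$

Therefore we get

$$\|\mathcal{F}(G(-t)u(t))\|_\infty = \|f\|_\infty \le 10\varepsilon + C\left|\int_1^tE_v(\tau)\,e^{-\frac{8i\tau}{27}p^3}v^3\!\left(\tau, \frac p3\right)\frac{a(\tau)p^3\,d\tau}{(p^3\tau)^{1-\frac\gamma2}\langle p^3\tau\rangle^{\frac\gamma2}}\right|. \tag{3.9}$$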


To estimate the last integral we integrate by parts using the identity

$$e^{-\frac{8i\tau}{27}p^3} = \frac{1}{1 - \frac{8i\tau}{27}p^3}\,\frac{d}{d\tau}\left(\tau e^{-\frac{8i\tau}{27}p^3}\right).$$
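This identity is again the product rule; a one-line check, with the shorthand $c = \tfrac{8}{27}p^3$ introduced here for readability:

```latex
\begin{align*}
\frac{d}{d\tau}\bigl(\tau e^{-ic\tau}\bigr) &= (1 - ic\tau)\,e^{-ic\tau},
\qquad c := \tfrac{8}{27}p^3,\\
\left|\frac{1}{1 - ic\tau}\right| &= \frac{1}{\sqrt{1 + c^2\tau^2}} \le \frac{C}{\langle p^3\tau\rangle}.
\end{align*}
```

The second line indicates where the extra decay in $\langle p^3\tau\rangle$ comes from after integrating by parts.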

Then we have, for all 1 � s � t, p > 0,∫ t

s

Ev(τ)e− 8iτ

27 p3v3(τ,

p

3

) a(τ)p3 dτ

(p3τ)1− γ2 〈p3τ 〉 γ

2

=[a(τ)Ev(τ)e− 8iτ

27 p3v3(τ,

p

3

)1 − 8iτ

27 p3

]t

s

−

−∫ t

s

Ev(τ)e− 8iτ27 p3

1 − 8iτ27 p

3

(a(τ)v3

(τ,

p

3

)8i27p

3τp3(1 − 8iτ

27 p3)(p3τ)1− γ

2 〈p3τ 〉 γ2

+

+ 3a(τ)τp3

(p3τ)1− γ2 〈p3τ 〉 γ

2v2(τ,

p

3

)vτ

(τ,

p

3

)−

− 3iπa2(τ )p3τp3

(p3τ)2−γ 〈p3τ 〉2γv3(τ,

p

3

)|v(τ, p)|2 +

+ τp3 d

dτ

(a(τ)

(p3τ)1− γ2 〈p3τ 〉 γ

2

)v3(τ,

p

3

))dτ. (3.10)

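From Equation (3.8) we have the estimate $\|v_t(t, p)\|_\infty \le C\varepsilon t^{-1}$. Hence, by (3.10),

$$\left|\int_s^tE_v(\tau)\,e^{-\frac{8i\tau}{27}p^3}v^3\!\left(\tau, \frac p3\right)\frac{a(\tau)p^3\,d\tau}{(p^3\tau)^{1-\frac\gamma2}\langle p^3\tau\rangle^{\frac\gamma2}}\right| \le C\varepsilon^3\int_s^t\frac{p^3\,d\tau}{(p^3\tau)^{1-\frac\gamma2}\langle p^3\tau\rangle^{1+\frac\gamma2}} \le \frac{C\varepsilon^3}{\langle p^3s\rangle}. \tag{3.11}$$

Estimates (3.6), (3.9) and (3.11) give us estimate (3.1). By virtue of estimate (3.1), we can continue the local solution $u(t, x)$ for all $t > 0$. Theorem 1.1 is proved. ✷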

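Proof of Theorem 1.2. We denote $w(t, x) = u(t, x) - S(t, x)$, where

$$S(t, x) = \frac{1}{\sqrt[3]{t}}\,\varphi\!\left(\frac{x}{\sqrt[3]{t}}\right)$$

is the self-similar solution described in the Introduction. Let us prove the following estimate:

$$\|w(t)\|_{Y_\delta} < 30\varepsilon \tag{3.12}$$

for any $t \in [0, T]$, where

$$\delta \in \left[0, \tfrac12 - 3\gamma\right],\qquad \gamma \in \left(0, \tfrac1{50}\right),\qquad \varepsilon = \|u_0\|_{1,1}.$$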


In the same way as in the proof of (3.1), we assume that $T_1 > 1$ is the maximal time such that estimate (3.12) is true for all $t \in [0, T_1)$, but is violated at time $t = T_1$. In view of the self-similar structure of $S(t, x)$, we have $IS = 0$ and, hence,

$$JS = \left(I - 3t\int_{-\infty}^{x}dx\,L\right)S = 3tAS^3.$$

From Equation (1.1), we get

$$Lw = -A\left(w^3 - 3uw^2 + 3u^2w\right)_x - (a(t) - A)(u^3)_x.$$

Via (3.5) and Lemmas 2.1 and 2.2, we obtain

$$\|Jw\| \le \|Iu\| + Ct\left(\|w^3\| + \|uw^2\| + \|u^2w\|\right) + Ct^{-\frac16}\|u^3\| \le 12\varepsilon t^\gamma \tag{3.13}$$

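for all $t \ge 1$. Since

$$\int w(t, x)\,dx = r(t, 0) = 0\quad\text{for all } t > 1,$$

where $r = G(-t)w$, we have

$$|r(t, p)| = |r(t, p) - r(t, 0)| \le \int_0^p|Jr(t, p)|\,dp \le \sqrt p\,\|Jw\| \le |p|^\delta\langle p^3t\rangle^{\frac16-\frac\delta3}t^{-\gamma}\|Jw\| \le C\varepsilon|p|^\delta\langle p^3t\rangle^{\frac16-\frac\delta3},$$

hence, in view of (3.13), estimate (3.12) follows. Now as in (3.7) we obtain

$$\vartheta_t(t, p) + \frac{iAp}{2\pi}\iint d\zeta_1\,d\zeta_2\,e^{itQ}\vartheta(t, \zeta_1)\vartheta(t, \zeta_2)\vartheta(t, \zeta_3) = 0,$$

where $\vartheta(t) = \mathcal{F}G(-t)S(t)$. We now make a change of dependent variables $v = fE_v$, $\vartheta = gE_\vartheta$, where

$$E_v(t, p) = \exp\left(-\frac32iA\int_1^t|v(\tau, p)|^2\frac{p^3\,d\tau}{(p^3\tau)^{1-\frac\gamma2}\langle p^3\tau\rangle^{\frac\gamma2}}\right),$$

$$E_\vartheta(t, p) = \exp\left(-\frac32iA\int_1^t|2\pi\vartheta(\tau, p)|^2\frac{p^3\,d\tau}{(p^3\tau)^{1-\frac\gamma2}\langle p^3\tau\rangle^{\frac\gamma2}}\right).$$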

Then, for the difference h = f − g, we get

ht(t, p)

= ip (A − a (t)) Ev

∫∫dζ1 dζ2eitQv(t, ζ1)v(t, ζ2)v(t, ζ3)−

− iApEv

(∫∫dζ1 dζ2eitQ

(v(t, ζ1)v(t, ζ2)v(t, ζ3)−


− ϑ(t, ζ1)ϑ(t, ζ2)ϑ(t, ζ3))

− 3iπAp3

(p3τ)1− γ2⟨p3τ

⟩ γ2

(|v(τ, p)|2v(τ, p) − |ϑ(τ, p)|2ϑ(τ, p)))−

− ipA(Ev − Eϑ)

(∫∫dζ1dζ2eitQϑ(t, ζ1)ϑ(t, ζ2)ϑ(t, ζ3)−

− 3iπAp3

(p3τ)1− γ2 〈p3τ 〉 γ

2|ϑ(τ, p)|2ϑ(τ, p)

),

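Hence, as in the proof of Theorem 1.1, applying Lemma 2.4 and estimates (3.1), (3.10), (3.11) and (3.12), we get

$$|h(t) - h(s)| \le \frac{C\varepsilon^3|p|^\delta}{\langle p^3s\rangle^{\frac1{12}-3\gamma}},\qquad |f(t) - f(s)| \le \frac{C\varepsilon^3}{\langle p^3s\rangle^{\frac1{12}-3\gamma}}$$

and

$$|g(t) - g(s)| \le \frac{C\varepsilon^3}{\langle p^3s\rangle^{\frac1{12}-3\gamma}} \tag{3.14}$$

for all $t > s > 1$. Therefore, there exist limits

$$H(p) = \lim_{t\to\infty}h(t, p),\qquad F(p) = \lim_{t\to\infty}f(t, p)\qquad\text{and}\qquad G(p) = \lim_{t\to\infty}g(t, p)$$

with the estimates

$$|H(p)| \le C\varepsilon^3|p|^{\delta+3\gamma-\frac1{12}},\qquad |F(p)| \le C\varepsilon^3\qquad\text{and}\qquad |G(p)| \le C\varepsilon^3.$$

Then, using the estimates

$$|h(t, p) - H(p)| \le \frac{C\varepsilon^3|p|^\delta}{\langle p^3t\rangle^{\frac1{12}-3\gamma}},\qquad |f(t, p) - F(p)| \le \frac{C\varepsilon^3}{\langle p^3t\rangle^{\frac1{12}-3\gamma}},$$

$$|g(t, p) - G(p)| \le \frac{C\varepsilon^3}{\langle p^3t\rangle^{\frac1{12}-3\gamma}},\qquad |E_v - E_\vartheta| \le C\varepsilon^2|p|^\delta\langle p^3t\rangle^\gamma,$$

we obtain

$$r = fE_v - gE_\vartheta = f(E_v - E_\vartheta) + hE_\vartheta = HE_\vartheta + F(E_v - E_\vartheta) + (h - H)E_\vartheta - (F - f)(E_v - E_\vartheta) = E_\vartheta\left(H + F\left(E_v\overline{E_\vartheta} - 1\right)\right) + O\!\left(\varepsilon^3t^{4\gamma-\frac1{12}}\right).$$

Note that

$$|\vartheta(t, p)|^2 = |g(t, p)|^2 = |G(p)|^2 + O\!\left(\varepsilon^2\langle p^3t\rangle^{3\gamma-\frac1{12}}\right).$$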


We now denote

@1(t) = −3

2iA

∫ t

1

(|ϑ(τ, p)|2 − |ϑ(t, p)|2) p3 dτ

(p3τ)1− γ2 〈p3τ 〉 γ

2

and

@2(t) = 3

2iA

∫ t

1

(|ϑ(τ, p)|2 − |v(τ, p)|2 − |ϑ(t, p)|2 +

+ |v(t, p)|2) p3 dτ

(p3τ)1− γ2 〈p3τ 〉 γ

2.

Since

|ϑ(t, p)|2 = |g|2, |v(t, p)|2 = |f |2,we get

@1(t) − @1(s)

= −3

2iA

∫ t

s

(|ϑ(τ, p)|2 − |ϑ(t, p)|2) p3 dτ

(p3τ)1− γ2 〈p3τ 〉 γ

2+

+ 3

2iA(|ϑ(t, p)|2 − |ϑ(s, p)|2) ∫ s

1

p3 dτ

(p3τ)1− γ2 〈p3τ 〉 γ

2

and

@2(t) − @2(s) = −3

2iA

∫ t

s

(|f (τ, p)|2 − |g(τ, p)|2 −

− |f (t, p)|2 + |g(t, p)|2) p3 dτ

(p3τ)1− γ2 〈p3τ 〉 γ

2+

+ 3

2iA(|f (t, p)|2 − |g(t, p)|2 −

− |f (s, p)|2 + |g(s, p)|2) ∫ s

1

p3 dτ

(p3τ)1− γ2 〈p3τ 〉 γ

2

for all 1 < s < τ < t . Using estimates (3.14) we get

‖@j(t) − @j(s)‖∞ � Cε2 (p3s)4γ− 1

12 , j = 1, 2. (3.15)

Therefore by (3.15), we see that there exist unique functions 0j ∈ L∞, such thati0j = limt→∞ @j(t) and

‖i0j − @j(t)‖∞ � Cε(p3t

)4γ− 112 , j = 1, 2. (3.16)


By (3.15), (3.16), we find

−3

2iA

∫ t

1|ϑ(τ, p)|2 p3 dτ

(p3τ)1− γ2 〈p3τ 〉 γ

2

= 3

2iA |G(p)|2 log tp3 + i03 + O

(ε2(tp3)4γ− 1

12)

and

−3

2iA

∫ t

1

(|g(τ)|2 − |f (τ)|2) p3 dτ

(p3τ)1− γ2 〈p3τ 〉 γ

2

= 3

2iA(|H |2 − 2�HF

)log tp3 + i04 + O

(ε2t4γ− 1

12)

with some functions 03(p) and 04 (p) ∈ L∞. Hence

Eϑ = exp

(−3

2iA|G(p)|2 log tp3 + i03 + O

(ε2〈p3t〉4γ− 1

12))

and

EϑEv = exp

(3

2iA(|H |2 − 2�HF

)log tp3 + i04 + O

(ε2t4γ− 1

12))

.

Thus

r = e− 32 iA|G(p)|2 log tp3+i03

(H + F

(e

32 iA(|H |2−2�HF) log tp3+i04 − 1

)) ++ O

(ε3t4γ− 1

12)

= H1eiB1 log tp3 + H2eiB2 log tp3 + O(ε3t4γ− 1

12).

The asymptotic formula (1.2) follows now from the asymptotics (2.15) of Lemma 2.2. This completes the proof of Theorem 1.2. ✷

References

1. Ablowitz, M. J. and Segur, H.: Solitons and the Inverse Scattering Transform, SIAM, Philadelphia, 1981.

2. Bona, J. L. and Saut, J.-C.: Dispersive blow-up of solutions of generalized Korteweg–de Vries equation, J. Differential Equations 103 (1993), 3–57.

3. de Bouard, A., Hayashi, N. and Kato, K.: Gevrey regularizing effect for the (generalized) Korteweg–de Vries equation and nonlinear Schrödinger equations, Ann. Inst. H. Poincaré, Anal. Non Linéaire 12 (1995), 673–725.

4. Christ, F. M. and Weinstein, M. I.: Dispersion of small amplitude solutions of the generalized Korteweg–de Vries equation, J. Funct. Anal. 100 (1991), 87–109.

5. Constantin, P. and Saut, J.-C.: Local smoothing properties of dispersive equations, J. Amer. Math. Soc. 1 (1988), 413–446.

6. Craig, W., Kapeller, K. and Strauss, W. A.: Gain of regularity for solutions of KdV type, Ann. Inst. H. Poincaré, Anal. Non Linéaire 9 (1992), 147–186.

7. Deift, P. and Zhou, X.: A steepest descent method for oscillatory Riemann–Hilbert problems. Asymptotics for the MKdV equation, Ann. Math. 137 (1992), 295–368.

8. Dix, D. B.: Large-time Behavior of Solutions of Linear Dispersive Equations, Lecture Notes in Math. 1668, Springer, Berlin, 1997.

9. Dix, D.: Temporal asymptotic behavior of solutions of the Benjamin–Ono equation, J. Differential Equations 90 (1991), 238–287.

10. Fedoryuk, M. V.: Asymptotic methods in analysis, Encycl. of Math. Sciences 13, Springer-Verlag, New York, 1987, pp. 83–191.

11. Fedoryuk, M. V.: Asymptotics: Integrals and Series, Nauka, Moscow, 1987.

12. Ginibre, J., Tsutsumi, Y. and Velo, G.: Existence and uniqueness of solutions for the generalized Korteweg–de Vries equation, Math. Z. 203 (1990), 9–36.

13. Hayashi, N.: Analyticity of solutions of the Korteweg–de Vries equation, SIAM J. Math. Anal. 22 (1991), 1738–1745.

14. Hayashi, N. and Naumkin, P. I.: Large time asymptotics of solutions to the generalized Benjamin–Ono equation, Trans. Amer. Math. Soc. 351(1) (1999), 109–130.

15. Hayashi, N. and Naumkin, P. I.: Large time asymptotics of solutions to the generalized Korteweg–de Vries equation, J. Funct. Anal. 159 (1998), 110–136.

16. Hayashi, N. and Naumkin, P. I.: Large time behavior of solutions for the modified Korteweg–de Vries equation, Internat. Math. Res. Notices 8 (1999), 395–418.

17. Kato, T.: On the Cauchy problem for the (generalized) Korteweg–de Vries equation, In: V. Guillemin (ed.), Advances in Mathematics Supplementary Studies, Stud. in Appl. Math. 8, Berlin, 1983, pp. 93–128.

18. Kenig, C. E., Ponce, G. and Vega, L.: On the (generalized) Korteweg–de Vries equation, Duke Math. J. 59 (1989), 585–610.

19. Kenig, C. E., Ponce, G. and Vega, L.: Well-posedness and scattering results for the generalized Korteweg–de Vries equation via contraction principle, Comm. Pure Appl. Math. 46 (1993), 527–620.

20. Klainerman, S.: Long time behavior of solutions to nonlinear evolution equations, Arch. Rat. Mech. Anal. 78 (1982), 73–89.

21. Klainerman, S. and Ponce, G.: Global small amplitude solutions to nonlinear evolution equations, Comm. Pure Appl. Math. 36 (1983), 133–141.

22. Kruzhkov, S. N. and Faminskii, A. V.: Generalized solutions of the Cauchy problem for the Korteweg–de Vries equation, Math. USSR, Sb. 48 (1984), 391–421.

23. Naumkin, P. I. and Shishmarev, I. A.: Asymptotic behavior as t → ∞ of solutions of the generalized Korteweg–de Vries equation, Math. RAS, Sb. 187(5) (1996), 695–733.

24. Ponce, G. and Vega, L.: Nonlinear small data scattering for the generalized Korteweg–de Vries equation, J. Funct. Anal. 90 (1990), 445–457.

25. Saut, J.-C.: Sur quelques généralisations de l'équation de Korteweg–de Vries, J. Math. Pure Appl. 58 (1979), 21–61.

26. Sidi, A., Sulem, C. and Sulem, P. L.: On the long time behavior of a generalized KdV equation, Acta Appl. Math. 7 (1986), 35–47.

27. Shatah, J.: Global existence of small solutions to nonlinear evolution equations, J. Differential Equations 46 (1982), 409–425.

28. Staffilani, G.: On the generalized Korteweg–de Vries equation, Differential Integral Equations 10 (1997), 777–796.

29. Strauss, W. A.: Dispersion of low-energy waves for two conservative equations, Arch. Rat. Mech. Anal. 55 (1974), 86–92.

30. Strauss, W. A.: Nonlinear scattering theory at low energy, J. Funct. Anal. 41 (1981), 110–133.

31. Tsutsumi, M.: On global solutions of the generalized Korteweg–de Vries equation, Publ. Res. Inst. Math. Sci. 7 (1972), 329–344.



Inverse Spectral Results for AKNS Systems with Partial Information on the Potentials

R. DEL RIO1 and B. GRÉBERT2

1IIMAS-UNAM, Circuito Escolar, Ciudad Universitaria, 04510 México, D.F., México. e-mail: [email protected]
2UMR 6629, Département de Mathématiques, Université de Nantes, 2 rue de la Houssinière, 44322 Nantes Cedex 03, France. e-mail: [email protected]

(Received: 1 March 2001; in final form: 24 August 2001)

Abstract. For the AKNS operator on $L^2([0, 1], \mathbb{C}^2)$ it is well known that the data of two spectra uniquely determine the corresponding potential $\varphi$ a.e. on [0, 1] (Borg's type Theorem). We prove that, in the case where $\varphi$ is a priori known on [a, 1], only a part (depending on a) of two spectra determines $\varphi$ on [0, 1]. Our results include generalizations for Dirac systems of classical results obtained by Hochstadt and Lieberman for the Sturm–Liouville case, where they showed that half of the potential and one spectrum determine all the potential functions. An important ingredient in our strategy is the link between the rate of growth of an entire function and the distribution of its zeros.

Mathematics Subject Classifications (2000): 34A55, 34B05, 34L40, 47E05, 47A10, 47A75.

Key words: AKNS systems, determination of coefficients, inverse problems, selfadjoint operator, spectral theory.

1. Introduction

We study problems related to classical results by Borg [2] and Hochstadt and Lieberman [10]. A vast amount of literature exists on this type of inverse problem for the Sturm–Liouville operator, cf. [4] and references quoted therein. In this note we want to address similar results for AKNS systems. Actually, we use a different approach than in [4]; in particular we do not use the Titchmarsh–Weyl function and Marchenko's theorem. Historically, Gesztesy and Simon first deviated from the Hochstadt–Lieberman approach by linking the length on which the potential was known to a portion of eigenvalue spectra needed to recover the potential on the whole interval in question, see [5, 6]. For related interesting results, see [3].

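For $\varphi \in L^2([0, 1], \mathbb{C})$ we define the AKNS operator on $L^2([0, 1], \mathbb{C}^2)$ by

$$H(\varphi) := \begin{pmatrix}0 & -1\\ 1 & 0\end{pmatrix}\frac{d}{dx} + \begin{pmatrix}-q(x) & p(x)\\ p(x) & q(x)\end{pmatrix}, \tag{1}$$

where $\varphi = q - ip$ and $q$ and $p$ are real valued.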


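Notice that $H(\varphi)$ is unitarily equivalent to the Zakharov–Shabat operator,

$$L(\varphi) := i\begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}\frac{d}{dx} + \begin{pmatrix}0 & \varphi\\ \bar\varphi & 0\end{pmatrix}, \tag{2}$$

where $\bar\varphi$ is the complex conjugate of $\varphi$.

For each $\alpha \in [0, \pi)$ we consider $\sigma(\varphi, \alpha)$, the spectrum of the selfadjoint operator $H(\varphi)$ with domain $F = \binom{Y}{Z} \in H^1([0, 1], \mathbb{C}^2)$ such that

$$Z(0) = 0,\qquad \cos\alpha\,Y(1) - \sin\alpha\,Z(1) = 0. \tag{3}$$

Following [7] (cf. also [8]), $\sigma(\varphi, \alpha)$ is a sequence of real numbers $(\mu_n(\varphi, \alpha))_{n\in\mathbb{Z}}$ satisfying $\mu_n < \mu_{n+1}$ ($n \in \mathbb{Z}$) and $\mu_n = \alpha + n\pi - \pi/2 + l^2(n)$.*

By $\varphi|_{[a,b]}$ we shall denote the restriction of $\varphi$ to the interval $[a, b]$, that is $\varphi|_{[a,b]}(x) = \varphi(x)$ if $x \in [a, b]$.

Our main result is the following theorem: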

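THEOREM 1. Let $\varphi \in L^2([0, 1], \mathbb{C})$, $\alpha, \beta \in [0, \pi)$ with $\alpha \ne \beta$, $0 \le a \le 1$, $l, k \in \mathbb{N} \cup \{\infty\}$ with $\frac1l + \frac1k \ge 2a$. Then $\{\mu_{ln}(\alpha), \mu_{kn}(\beta) \mid n \in \mathbb{Z}\}$** and $\varphi|_{[a,1]}$ uniquely determine $\varphi$ a.e. on [0, 1] and $\alpha$, $\beta$.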

For particular values of a, k, l, we obtain for the AKNS systems

− Borg type Theorem: two spectra uniquely determine ϕ on [0, 1] (a = 1, l = k = 1).

− Hochstadt–Lieberman type Theorem: one spectrum and ϕ on [1/2, 1] uniquely determine ϕ on [0, 1] (a = 1/2, l = 1, k = ∞) (cf. [1] for another proof of this result).

Actually our Theorem includes many more general results such as, for example,

− Half of one spectrum and ϕ on [1/4, 1] uniquely determine ϕ on [0, 1] (a = 1/4, l = 2, k = ∞).

− Half of two spectra and ϕ on [1/2, 1] uniquely determine ϕ on [0, 1] (a = 1/2, l = k = 2).

For the sake of simplicity, we only consider the case of two different boundary conditions. However, the same method of proof applies to more general situations; for instance, considering three spectra, with obvious notation the condition would be

$$\frac1{k_1} + \frac1{k_2} + \frac1{k_3} \ge 2a.$$

In the same way we prove the following theorem:

* $a_n = b_n + l^2(n)$ means that $\sum_{n\in\mathbb{Z}}|a_n - b_n|^2 < \infty$.

** If $l$ (resp. $k$) equals $\infty$ we shall understand that $\{\mu_{ln}(\alpha) \mid n \in \mathbb{Z}\}$ (resp. $\{\mu_{kn}(\beta) \mid n \in \mathbb{Z}\}$) is empty.


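THEOREM 2. Let $\varphi \in L^2([0, 1], \mathbb{C})$, $\alpha, \beta \in [0, \pi)$ with $\alpha \ne \beta$, $0 \le a \le 1$, $l, k \in \mathbb{N} \cup \{\infty\}$ with $\frac1l + \frac1k \ge 4a$. Then $\{\mu_{ln}(\alpha), \mu_{kn}(\beta) \mid n \ge 0\}$ and $\varphi|_{[a,1]}$ uniquely determine $\varphi$ a.e. on [0, 1] and $\alpha$, $\beta$.

Roughly speaking, Theorem 2 means that the 'positive' part of one spectrum $(\mu_n)_{n\ge0}$ gives the same information as $(\mu_{2n})_{n\in\mathbb{Z}}$.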

Of course, the data ϕ|[a,1] in Theorems 1 and 2 can be replaced by ϕ|[0,1−a]. Nevertheless, the interval where ϕ is a priori known must contain 0 or 1. Actually, even the data of Re ϕ on [0, 1], Im ϕ on [1/2 − ε, 1/2 + ε] (ε ∈ (0, 1/2) arbitrary) and one spectrum do not uniquely determine ϕ on [0, 1]. Let us construct a counterexample: From [8, Propositions 1.3 and 1.4] we learn that for any

ϕ ∈ L2([0, 1],C) and n ∈ Z, µn

(ϕ,π

2

)= µn

(ϕ,π

2

),

where ϕ(x) = ϕ(1 − x). Let ϕ = q − ip with q even,

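$p(1 - x) = -p(x)$ for $x \in [1/2 - \varepsilon, 1/2 + \varepsilon]$ and $p(1 - x) = p(x)$ for $x \in [0, 1/2 - \varepsilon)$. Define $\psi$ by $\psi(x) := q(x) + ip(1 - x)$. Then,

$$\sigma(\varphi, \pi/2) = \sigma(\psi, \pi/2),\qquad \operatorname{Re}\varphi = \operatorname{Re}\psi \ \text{in } [0, 1],\qquad \operatorname{Im}\varphi = \operatorname{Im}\psi \ \text{on } [1/2 - \varepsilon, 1/2 + \varepsilon],$$

nevertheless $\varphi \not\equiv \psi$.

A similar construction shows that the data of Re ϕ on [0, 1], Im ϕ on [0, 1] \ (1/2 − ε, 1/2 + ε) (ε ∈ (0, 1/2) arbitrary) and one spectrum do not uniquely determine ϕ on [0, 1].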

Our fundamental strategy can be described as follows. Let ϕ and ψ satisfy the conditions of Theorem 1.

(a) We construct an entire function $f(\lambda, \varphi, \psi)$ (cf. Section 2.1) which vanishes at each $\mu_{ln}(\alpha)$ and $\mu_{kn}(\beta)$, $n \in \mathbb{Z}$.

(b) The partial information on the potential allows us to bound the exponential type of $f$, obtaining $|f(\lambda)| = o(e^{2a|\operatorname{Im}\lambda|})$ as $|\lambda| \to \infty$ (cf. Section 2.2).

(c) We use the general principle that the growth of an entire function is related to the distribution of its zeros* to prove that steps (a) and (b) and the condition $\frac1l + \frac1k \ge 2a$ imply $f \equiv 0$ (cf. Section 2.3).

(d) $f \equiv 0$ implies $\varphi \equiv \psi$ (cf. Section 2.4).

* This is a generalization of the fact that the rate of growth of a polynomial is given by its degree or, equivalently, by its number of roots.

Notice that the spectral problem is well posed for $\varphi \in L^1([0, 1], \mathbb{C}^2)$. However, in this case the asymptotic $\mu_n = \alpha + n\pi - \pi/2 + l^2(n)$ is not satisfied and then we cannot use Lemma 5 (see Section 2.3 on infinite products). Furthermore, working with $L^2$ kernels greatly facilitates the proof of Proposition 9 below. Therefore, the condition $\varphi \in L^2([0, 1], \mathbb{C}^2)$ in Theorems 1 and 2 is possibly not optimal but is needed for our approach.

2. Proofs

2.1. CONSTRUCTION OF THE ENTIRE FUNCTION f

For ϕ ∈ L2([0, 1],C) and λ ∈ C, let

F(·, λ, ϕ) ≡(Y (·, λ, ϕ)Z(·, λ, ϕ)

)∈ H 1([0, 1],C2)

be the unique vector-valued function satisfying H(ϕ)F = λF and F(0, λ, ϕ) =(10

).For each 0 � x � 1 and ϕ ∈ L2([0, 1]),C), λ �→ F(x, λ, ϕ) is entire (cf. [7]

or [8] for the construction of F ).Notice that the spectrum σ (ϕ, α) is the root’s set of

cos α Y (1, λ, ϕ)− sinα Z(1, λ, ϕ) = 0. (4)

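Let

$$G \equiv \begin{pmatrix} G_1(x,\lambda,\varphi) \\ G_2(x,\lambda,\varphi) \end{pmatrix}$$

with

$$G_1(x,\lambda,\varphi) = Z(x,\lambda,\varphi) - iY(x,\lambda,\varphi), \qquad G_2(x,\lambda,\varphi) = Z(x,\lambda,\varphi) + iY(x,\lambda,\varphi). \tag{5}$$

One has

$$L(\varphi)G(\cdot,\lambda,\varphi) = \lambda G(\cdot,\lambda,\varphi) \tag{6}$$

and, furthermore, for $\lambda \in \mathbb{R}$,

$$G_1(x,\lambda,\varphi) = \overline{G_2(x,\lambda,\varphi)}. \tag{7}$$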

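Let $\varphi,\psi \in L^2([0,1],\mathbb{C})$. From (6), the cross-product

$$A(\lambda) := \langle G(\cdot,\lambda,\psi), (L(\varphi)-\lambda)G(\cdot,\lambda,\varphi)\rangle - \langle G(\cdot,\lambda,\varphi), (L(\psi)-\lambda)G(\cdot,\lambda,\psi)\rangle$$

vanishes for all $\lambda \in \mathbb{C}$.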


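On the other hand, by direct calculation using (7), one obtains for $\lambda \in \mathbb{R}$

$$\begin{aligned} A(\lambda) &= \langle G(\psi), L(\varphi)G(\varphi)\rangle - \langle G(\varphi), L(\psi)G(\psi)\rangle \\ &= \int_0^1 G_2(\psi)G_2(\varphi)(\varphi-\psi)\,dx + \int_0^1 G_1(\psi)G_1(\varphi)(\varphi-\psi)\,dx \\ &\quad + i\int_0^1 \frac{d}{dx}\bigl(-G_1(\psi)G_2(\varphi) + G_2(\psi)G_1(\varphi)\bigr)\,dx. \end{aligned}$$

Thus, defining for $\lambda \in \mathbb{C}$

$$f(\lambda,\varphi,\psi) := \int_0^1 G_2(x,\lambda,\psi)G_2(x,\lambda,\varphi)(\varphi(x)-\psi(x))\,dx + \int_0^1 G_1(x,\lambda,\psi)G_1(x,\lambda,\varphi)(\varphi(x)-\psi(x))\,dx, \tag{8}$$

one gets for $\lambda \in \mathbb{R}$

$$f(\lambda,\varphi,\psi) = -i\,\bigl[W(G(\varphi),G(\psi))\bigr]_0^1. \tag{9}$$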

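LEMMA 3. Let $\varphi,\psi \in L^2([0,1],\mathbb{C})$ and $\alpha \in [0,\pi)$. Assume that $\mu \in \sigma(\varphi,\alpha) \cap \sigma(\psi,\alpha)$. Then $f(\mu,\varphi,\psi) = 0$.

Proof. In view of formula (9), one has to prove that $[W(G(\cdot,\mu,\varphi),G(\cdot,\mu,\psi))]_0^1 = 0$. Since $G(0,\mu) = \begin{pmatrix} -i \\ +i \end{pmatrix}$, one has

$$W(G(0,\mu,\varphi),G(0,\mu,\psi)) = 0.$$

On the other hand,

$$W(G(1,\mu,\varphi),G(1,\mu,\psi)) = 2i\bigl(Z(1,\mu,\varphi)Y(1,\mu,\psi) - Y(1,\mu,\varphi)Z(1,\mu,\psi)\bigr) = 0,$$

where we used that (cf. (4)) $\cos\alpha\,Y(1) - \sin\alpha\,Z(1) = 0$. ✷

For simplicity, for $i = 1, 2$, we introduce

$$g_i(x,\lambda) \equiv g_i(x,\lambda,\varphi,\psi) := G_i(x,\lambda,\varphi)\,G_i(x,\lambda,\psi). \tag{10}$$

Then formula (8) becomes, with $r = \varphi - \psi$,

$$f(\lambda,\varphi,\psi) = \int_0^1 \bigl(g_2(x,\lambda)r(x) + g_1(x,\lambda)r(x)\bigr)\,dx. \tag{11}$$

Notice that since $\lambda \mapsto F(x,\lambda,\varphi)$ is entire, $\lambda \mapsto f(\lambda,\varphi,\psi)$ is entire too.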

2.2. ORDER AND TYPE OF THE ENTIRE FUNCTION f

In this section we shall prove that the entire function $f$ defined in (11) satisfies the following lemma:

LEMMA 4. Let $0 \le a \le 1$. Assume $\varphi(x) = \psi(x)$ for $x \in [a,1]$. Then $f(\lambda,\varphi,\psi) = o(e^{2|\operatorname{Im}\lambda|a})$ when $|\lambda| \to \infty$.

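Proof. From [7] (see also [8]) we learn that, uniformly for $x \in [0,1]$, one has when $|\lambda| \to \infty$,

$$\left\| F(x,\lambda) - \begin{pmatrix} \cos\lambda x \\ -\sin\lambda x \end{pmatrix} \right\| = o(e^{|\operatorname{Im}\lambda|x}). \tag{12}$$

Therefore, we get from (10) and (5)

$$g_2(x,\lambda,\varphi,\psi) = \bigl[ie^{i\lambda x} + o(e^{|\operatorname{Im}\lambda|x})\bigr]^2 = -e^{2i\lambda x} + o(e^{2|\operatorname{Im}\lambda|x})$$

and

$$g_1(x,\lambda,\varphi,\psi) = -e^{-2i\lambda x} + o(e^{2|\operatorname{Im}\lambda|x}).$$

Using (11), we obtain (with $r = \varphi - \psi$)

$$f(\lambda,\varphi,\psi) = \int_0^1 \bigl(-e^{2i\lambda x} + o(e^{2|\operatorname{Im}\lambda|x})\bigr)r(x)\,dx + \int_0^1 \bigl(-e^{-2i\lambda x} + o(e^{2|\operatorname{Im}\lambda|x})\bigr)r(x)\,dx.$$

Thus, if $\varphi \equiv \psi$ on $[a,1]$,

$$f(\lambda) = -\int_0^a r(x)e^{2i\lambda x}\,dx - \int_0^a r(x)e^{-2i\lambda x}\,dx + o(e^{2|\operatorname{Im}\lambda|a}), \tag{13}$$

where we used that the error term $o(e^{2|\operatorname{Im}\lambda|x})$ is uniform in $x \in [0,1]$ and that $r \in L^1([0,1])$.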

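For $\alpha \in C^1([0,1])$ one obtains, integrating by parts,

$$\left| \int_0^a \alpha(x)e^{2i\lambda x}\,dx \right| = O\!\left(\frac{e^{2|\operatorname{Im}\lambda|a}}{|\lambda|}\right).$$

Now for $\alpha \in L^2([0,1])$ and $\varepsilon > 0$, let $\alpha_\varepsilon \in C^1([0,1])$ be such that $\|\alpha_\varepsilon - \alpha\|_{L^2} < \varepsilon/2$. There exists $A > 0$ such that, for $|\lambda| > A$,

$$\left| \int_0^a \alpha_\varepsilon(x)e^{2i\lambda x}\,dx \right| \le \frac{\varepsilon}{2}\,e^{2|\operatorname{Im}\lambda|a}$$

and thus

$$\left| \int_0^a \alpha(x)e^{2i\lambda x}\,dx \right| \le \left| \int_0^a \alpha_\varepsilon(x)e^{2i\lambda x}\,dx \right| + \int_0^a |\alpha - \alpha_\varepsilon|\,e^{2|\operatorname{Im}\lambda|x}\,dx \le \varepsilon\,e^{2|\operatorname{Im}\lambda|a}.$$

Therefore, one has proved (cf. [13, Problem 3, p. 15]) that

$$\int_0^a \alpha(x)e^{2i\lambda x}\,dx = o(e^{2|\operatorname{Im}\lambda|a})$$

for $\alpha \in L^2([0,1])$. Thus, using (13), $f(\lambda) = o(e^{2a|\operatorname{Im}\lambda|})$. ✷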


Remark. We have proved that f is an entire function of order not greater than 1 and type not greater than 2a. We are going to use that such a function cannot have ‘many’ zeros (cf. [11]).

2.3. INFINITE PRODUCT REPRESENTATION

We begin with three auxiliary lemmas on infinite products.

Given a sequence of complex numbers $(a_k)_{k\in\mathbb{Z}}$, we say that the product $\prod_{k\in\mathbb{Z}} a_k$ is convergent if the limit $\lim_{N\to\infty}\prod_{|k|\le N} a_k$ exists. In such a case we write

$$\prod_{k\in\mathbb{Z}} a_k := \lim_{N\to\infty}\prod_{|k|\le N} a_k.$$

LEMMA 5. Let $(z_n)_{n\in\mathbb{Z}}$ be a complex sequence satisfying $z_n = n\pi + \ell^2(n)$. Then the formula

$$h(\lambda) := (\lambda - z_0)\prod_{n\in\mathbb{Z}\setminus\{0\}} \frac{z_n - \lambda}{n\pi}$$

defines an entire function satisfying, uniformly for $(n + 1/6)\pi \le |\lambda| \le (n + 5/6)\pi$,

$$h(\lambda) = \sin\lambda\,(1 + o(1)), \qquad n \to +\infty. \tag{14}$$

Lemma 5 is proved in [8, Lemma I-16] (cf. [7] and [13]), but the uniformity of (14) is proved there only for $(n + 1/4)\pi \le |\lambda| \le (n + 3/4)\pi$. The generalization to $(n + 1/6)\pi \le |\lambda| \le (n + 5/6)\pi$ is straightforward (actually, the only important fact is that $|\lambda|$ stays away from $n\pi$, $n \ge 0$).
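As a sanity check on the normalization in Lemma 5 (an illustration only, not part of the proof), one can take the unperturbed sequence z_n = nπ, for which the product reduces to the classical Euler product of sin λ. The following Python sketch evaluates the symmetric partial products over 0 < |n| ≤ N; the function name `truncated_product` and the truncation level N are assumptions made for this illustration.

```python
import cmath
import math


def truncated_product(lam, z, N):
    """Symmetric partial product (lam - z(0)) * prod_{0<|n|<=N} (z(n) - lam)/(n*pi).

    `z` is a callable n -> z_n. The factors for n and -n are multiplied
    together, matching the symmetric limit used in the text.
    """
    value = lam - z(0)
    for n in range(1, N + 1):
        value *= ((z(n) - lam) / (n * math.pi)) * ((z(-n) - lam) / (-n * math.pi))
    return value


def unperturbed(n):
    """The model sequence z_n = n*pi (zero l^2 perturbation)."""
    return n * math.pi


if __name__ == "__main__":
    for lam in (1.0, 2.5, 1.0 + 1.0j):
        approx = truncated_product(lam, unperturbed, 20000)
        print(lam, approx, cmath.sin(lam))
```

For this model sequence the partial products converge to sin λ with a relative error of order |λ|²/(π²N), so a moderate N already reproduces sin λ to several digits.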

From Lemma 5 follows:

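LEMMA 6. Let $(z_n)_{n\in\mathbb{Z}}$ be a sequence of complex numbers satisfying $z_n = n\pi + \ell^2(n)$, and let $k \ge 1$. Then the formula

$$h_k(\lambda) := (\lambda - z_0)\prod_{n\in\mathbb{Z}\setminus\{0\}} \frac{z_{nk} - \lambda}{nk\pi}$$

defines an entire function satisfying, uniformly on $k(n + 1/6)\pi \le |\lambda| \le k(n + 5/6)\pi$,

$$h_k(\lambda) = k\sin\!\left(\frac{\lambda}{k}\right)(1 + o(1)), \qquad n \to +\infty.$$

Proof. For $n \in \mathbb{Z}$ we define $\tilde z_n = z_{nk}/k$. One has

$$h_k(\lambda) = (\lambda - k\tilde z_0)\prod_{n\in\mathbb{Z}\setminus\{0\}} \frac{k\tilde z_n - \lambda}{nk\pi}.$$

By Lemma 5, the function

$$\tilde h(\lambda) = (\lambda - \tilde z_0)\prod_{n\in\mathbb{Z}\setminus\{0\}} \frac{\tilde z_n - \lambda}{n\pi}$$

satisfies

$$\tilde h(\lambda) = \sin\lambda\,(1 + o(1)), \qquad n \to +\infty,$$

uniformly on $(n + 1/6)\pi \le |\lambda| \le (n + 5/6)\pi$.

Notice that $h_k(\lambda) = k\,\tilde h(\lambda/k)$; hence

$$h_k(\lambda) = k\sin\!\left(\frac{\lambda}{k}\right)(1 + o(1)), \qquad n \to +\infty,$$

uniformly on $k(n + 1/6)\pi \le |\lambda| \le k(n + 5/6)\pi$. ✷

As an application of Lemma 6, one has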

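LEMMA 7. Let

$$-\frac{\pi}{2} \le a_j < \frac{\pi}{2} \quad (j = 1, 2), \qquad k_1 \ge 1, \quad k_2 \ge 1,$$

and let $(\mu_n)_{n\in\mathbb{Z}}$, $(\nu_n)_{n\in\mathbb{Z}}$ be two sequences of real numbers satisfying

$$\mu_n = n\pi + \ell^2(n), \qquad \nu_n = n\pi + \ell^2(n).$$

Then the function defined by

$$h(\lambda) := (\lambda - \mu_0 - a_1)(\lambda - \nu_0 - a_2)\prod_{n\in\mathbb{Z}\setminus\{0\}} \frac{\mu_{k_1 n} + a_1 - \lambda}{k_1 n\pi} \times \frac{\nu_{k_2 n} + a_2 - \lambda}{k_2 n\pi}$$

is entire. Furthermore, there exist a sequence $(\gamma_p)_{p\ge 1}$ of positive real numbers with $\gamma_p \to \infty$, and $C > 0$, such that uniformly on $|\lambda| = \gamma_p$ and $p \ge 1$,

$$|h(\lambda)| \ge C\exp\!\left(\left(\frac{1}{k_1} + \frac{1}{k_2}\right)|\operatorname{Im}\lambda|\right).$$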

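Proof. By Lemma 6, the functions

$$h_1(\lambda) := (\lambda - \mu_0 - a_1)\prod_{n\in\mathbb{Z}\setminus\{0\}} \frac{\mu_{k_1 n} + a_1 - \lambda}{k_1 n\pi}$$

and

$$h_2(\lambda) := (\lambda - \nu_0 - a_2)\prod_{n\in\mathbb{Z}\setminus\{0\}} \frac{\nu_{k_2 n} + a_2 - \lambda}{k_2 n\pi}$$

are entire and satisfy, as $n \to \infty$,

$$h_1(\lambda) = k_1\sin\!\left(\frac{\lambda - a_1}{k_1}\right)(1 + o(1)) \tag{15}$$

uniformly on $k_1(n + 1/6)\pi \le |\lambda - a_1| \le k_1(n + 5/6)\pi$, and

$$h_2(\lambda) = k_2\sin\!\left(\frac{\lambda - a_2}{k_2}\right)(1 + o(1)) \tag{16}$$

uniformly on $k_2(n + 1/6)\pi \le |\lambda - a_2| \le k_2(n + 5/6)\pi$.

Moreover, for $j = 1, 2$, there exist $C_j > 0$ such that, uniformly on

$$I_j := \bigcup_{n\in\mathbb{Z}} \{\lambda \mid k_j(n + 1/6)\pi \le |\lambda - a_j| \le k_j(n + 5/6)\pi\},$$

one has (cf. [13, Lemma 1, p. 27])

$$\left|\sin\!\left(\frac{\lambda - a_j}{k_j}\right)\right| > C_j\exp\!\left(\frac{|\operatorname{Im}(\lambda - a_j)|}{k_j}\right). \tag{17}$$

Noticing that $h(\lambda) = h_1(\lambda)h_2(\lambda)$ and in view of (15)–(17), it remains to prove that there exists a sequence $(\gamma_p)_{p\ge 1}$ with $\gamma_p > 0$ ($p \ge 1$) and $\gamma_p \to \infty$ as $p \to \infty$ such that, for each $p \ge 1$, $\gamma_p \in I_1 \cap I_2 \cap \mathbb{R}$.

As $I_j \cap \mathbb{R}$ is a union of segments whose width is $\tfrac{2}{3}k_j\pi$, and the distance between two consecutive segments is $\tfrac{1}{3}k_j\pi$, the existence of such a sequence $(\gamma_p)_{p\ge 1}$ is clear. ✷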
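The last step of the proof can be illustrated numerically. The Python sketch below (an illustration under stated assumptions, not the authors' argument) tests membership in I_j ∩ R through the fractional part of |λ − a_j|/(k_jπ) and scans a grid for admissible radii. The names `in_I` and `find_radii`, the grid step, and the sample values of a_j and k_j are hypothetical choices.

```python
import math


def in_I(lam, a, k):
    """True if the real number lam satisfies
    k(n + 1/6)pi <= |lam - a| <= k(n + 5/6)pi for some integer n >= 0."""
    frac = (abs(lam - a) / (k * math.pi)) % 1.0
    return 1.0 / 6.0 <= frac <= 5.0 / 6.0


def find_radii(a1, k1, a2, k2, lam_max=200.0, step=0.01, min_gap=1.0):
    """Scan (0, lam_max) on a grid and collect radii lying in both sets,
    keeping consecutive radii at least `min_gap` apart."""
    radii = []
    last = -math.inf
    steps = int(lam_max / step)
    for m in range(1, steps):
        lam = m * step
        if lam - last > min_gap and in_I(lam, a1, k1) and in_I(lam, a2, k2):
            radii.append(lam)
            last = lam
    return radii


if __name__ == "__main__":
    # Sample data (hypothetical): a_1 = 0.3, k_1 = 2, a_2 = -0.5, k_2 = 3.
    print(find_radii(0.3, 2, -0.5, 3)[:10])
```

Each set I_j ∩ R omits gaps of total relative length 1/3, so the two families of gaps together cannot cover the half-line, which is why the scan keeps finding radii however far it is continued.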

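We can now state the main result of this section. Recall that $\sigma(\varphi,\alpha) \equiv (\mu_n(\varphi,\alpha))_{n\in\mathbb{Z}}$ is the spectrum of $H(\varphi)$ with domain $F = \begin{pmatrix} Y \\ Z \end{pmatrix} \in H^1(0,1)$ such that $Z(0) = 0$ and $\cos\alpha\,Y(1) - \sin\alpha\,Z(1) = 0$.

PROPOSITION 8. Let $\varphi, \psi \in L^2([0,1],\mathbb{C})$, $0 \le a \le 1$, $0 \le \alpha, \beta < \pi$ with $\alpha \ne \beta$, and $k_1, k_2 \ge 1$ with $1/k_1 + 1/k_2 \ge 2a$. Assume that

(i) $\varphi \equiv \psi$ on $[a,1]$,

(ii) $\mu_{k_1 n}(\varphi,\alpha) = \mu_{k_1 n}(\psi,\alpha)$, $n \in \mathbb{Z}$,

(iii) $\mu_{k_2 n}(\varphi,\beta) = \mu_{k_2 n}(\psi,\beta)$, $n \in \mathbb{Z}$.

Then $f(\cdot,\varphi,\psi) \equiv 0$.

Proof. As mentioned in the introduction, following [8] (see also [7]), one deduces from Rouché’s Theorem and formula (12)

$$\mu_n(\varphi,\alpha) = n\pi + \alpha - \frac{\pi}{2} + \ell^2(n) \tag{18}$$

and

$$\mu_n(\varphi,\beta) = n\pi + \beta - \frac{\pi}{2} + \ell^2(n). \tag{19}$$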


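Therefore, by Lemma 7, the entire function

$$h(\lambda) := (\lambda - \mu_0)(\lambda - \nu_0)\prod_{n\in\mathbb{Z}\setminus\{0\}} \frac{\mu_{k_1 n} - \lambda}{k_1 n\pi}\,\frac{\nu_{k_2 n} - \lambda}{k_2 n\pi},$$

with $\mu_j := \mu_j(\varphi,\alpha)$, $\nu_j := \mu_j(\varphi,\beta)$ ($j \in \mathbb{Z}$), satisfies for some constant $C > 0$

$$|h(\lambda)| \ge C\exp\!\left(\left(\frac{1}{k_1} + \frac{1}{k_2}\right)|\operatorname{Im}\lambda|\right) \ge C\exp(2a|\operatorname{Im}\lambda|) \tag{20}$$

uniformly on $|\lambda| = \gamma_p$ and $p \ge 1$, where $(\gamma_p)_{p\ge 1}$ is a sequence of positive real numbers with $\gamma_p \to \infty$ as $p \to \infty$.

Furthermore, as $\alpha \ne \beta$, $\sigma(\varphi,\alpha) \cap \sigma(\varphi,\beta) = \emptyset$. Thus $(\mu_{k_1 n})_{n\in\mathbb{Z}}$ and $(\nu_{k_2 n})_{n\in\mathbb{Z}}$ are simple roots of $h$.

On the other hand, by Lemma 3 and Hypotheses (ii), (iii), $f(\mu_{k_1 n}) = f(\nu_{k_2 n}) = 0$ for all $n \in \mathbb{Z}$. Besides, by Lemma 4 and Hypothesis (i),

$$f(\lambda) = o(e^{2a|\operatorname{Im}\lambda|}) \tag{21}$$

when $|\lambda| \to \infty$.

Therefore $\lambda \mapsto f(\lambda)/h(\lambda)$ is entire and, combining (20) and (21), we get $|f(\lambda)/h(\lambda)| = o(1)$ as $p \to \infty$ uniformly on $|\lambda| = \gamma_p$. Hence, by the maximum principle, we conclude that $f \equiv 0$. ✷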

2.4. INTEGRAL REPRESENTATION

The main result of this section is

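PROPOSITION 9. Let $\varphi,\psi \in L^2([0,1],\mathbb{C})$. Assume that $f(\lambda,\varphi,\psi) = 0$ for $\lambda \in \mathbb{R}$. Then $\varphi \equiv \psi$.

We follow the same strategy as in [11, Appendix 4]. We first establish an integral representation of $g_1$ and $g_2$ (cf. formula (10)).

LEMMA 10. There exists a kernel $K \in L^2([-1,1]^2,\mathbb{C})$ such that for $x \in [-1,1]$ and $\lambda \in \mathbb{R}$,

$$g_1(x,\lambda) = -e^{-2i\lambda x} + \int_{-x}^{x} \overline{K}(x,u)\,e^{-2i\lambda u}\,du \tag{22}$$

and

$$g_2(x,\lambda) = -e^{2i\lambda x} + \int_{-x}^{x} K(x,u)\,e^{2i\lambda u}\,du. \tag{23}$$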


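Proof. Let $M(\cdot,\lambda,\varphi)$ be the fundamental ($2\times 2$ matrix) solution of $H(\varphi)M = \lambda M$ satisfying $M(0,\lambda,\varphi) = \mathrm{Id}_{2\times 2}$, and let $R(x,\lambda)$ be given by

$$R(x,\lambda) = \begin{pmatrix} \cos\lambda x & \sin\lambda x \\ -\sin\lambda x & \cos\lambda x \end{pmatrix}.$$

From [12, p. 514]* (cf. also [11]) one learns that there exists a kernel $A \in L^2([0,1]^2, M_{2\times 2}(\mathbb{R}))$ (where $M_{2\times 2}(\mathbb{R})$ denotes the space of $2\times 2$ matrices with real entries) such that

$$M(x,\lambda) = R(x,\lambda) + \int_0^x R(x - 2y,\lambda)\,A(x,y)\,dy. \tag{24}$$

* In [12] the authors do not use the same spectral variable, and we have to transform λ into λ/2 in our formula (24).

By definition, $F(x,\lambda,\varphi)$ is the first column of $M(x,\lambda,\varphi)$. Therefore, by (5),

$$G_2(x,\lambda) = (i, 1)\,M(x,\lambda)\begin{pmatrix} 1 \\ 0 \end{pmatrix},$$

and by a straightforward calculation one gets

$$G_2(x,\lambda,\varphi) = ie^{i\lambda x} + \int_0^x e^{i\lambda(x-2y)}K_\varphi(x,y)\,dy, \tag{25}$$

where

$$K_\varphi(x,y) = (i, 1)\,A(x,y)\begin{pmatrix} 1 \\ 0 \end{pmatrix}.$$

Since for $\lambda \in \mathbb{R}$, $G_1(x,\lambda,\varphi) = \overline{G_2(x,\lambda,\varphi)}$, one also has

$$G_1(x,\lambda,\varphi) = -ie^{-i\lambda x} + \int_0^x e^{-i\lambda(x-2y)}\,\overline{K_\varphi}(x,y)\,dy. \tag{26}$$

Inserting (25) in (10) one gets

$$g_2(x,\lambda) = -e^{2i\lambda x} + h_1(x,\lambda) + h_2(x,\lambda), \tag{27}$$

where

$$h_1(x,\lambda) = i\int_0^x \bigl(K_\varphi(x,t) + K_\psi(x,t)\bigr)e^{2i\lambda(x-t)}\,dt \tag{28}$$

and

$$h_2(x,\lambda) = \int_0^x dt\int_0^x ds\,K_\varphi(x,t)K_\psi(x,s)\,e^{2i\lambda(x-t-s)}. \tag{29}$$

By the change of variables $u = x - t - s$ and $v = t - s$ in (29), one obtains

$$h_2(x,\lambda) = \frac{1}{2}\int_D K_\varphi\!\left(x,\frac{v - u + x}{2}\right)K_\psi\!\left(x,\frac{-v - u + x}{2}\right)e^{2i\lambda u}\,du\,dv,$$

where $D = D_1 \cup D_2$ with

$$D_1 := \{(u,v) \mid -x \le u \le 0;\ u \le v \le -u\}$$

and

$$D_2 := \{(u,v) \mid 0 \le u \le x;\ -u \le v \le u\}.$$

Thus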

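$$h_2(x,\lambda) = \int_{-x}^{x} e^{2i\lambda u}K_1(x,u)\,du \tag{30}$$

with

$$K_1(x,u) := \int_{-x}^{x} K_\varphi\!\left(x,\frac{v - u + x}{2}\right)K_\psi\!\left(x,\frac{-v - u + x}{2}\right)\mathbf{1}_D(u,v)\,dv,$$

where $\mathbf{1}_D$ denotes the characteristic function of the set $D$.

Similarly, by the change of variable $u = x - t$ in (28), one has

$$h_1(x,\lambda) = i\int_0^x \bigl(K_\varphi(x,x-u) + K_\psi(x,x-u)\bigr)e^{2i\lambda u}\,du.$$

Thus

$$h_1(x,\lambda) = \int_{-x}^{x} K_2(x,u)\,e^{2i\lambda u}\,du, \tag{31}$$

where

$$K_2(x,u) := i\,\mathbf{1}_{[0,x]}(u)\bigl(K_\varphi(x,x-u) + K_\psi(x,x-u)\bigr).$$

Combining (27), (30) and (31), one gets (23) with

$$K(x,u) = K_1(x,u) + K_2(x,u).$$

We deduce (22) from (23), recalling that $g_1(x,\lambda) = \overline{g_2(x,\lambda)}$ for $\lambda \in \mathbb{R}$. ✷

Proof of Proposition 9. Recall that, with $r = \varphi - \psi$ (cf. (11)),

$$f(\lambda) \equiv f(\lambda,\varphi,\psi) = \int_0^1 \bigl(g_2(x,\lambda)r(x) + g_1(x,\lambda)r(x)\bigr)\,dx.$$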

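By Lemma 10, for $\lambda \in \mathbb{R}$ one gets

$$\begin{aligned} f(\lambda) ={}& \int_0^1 \left(-e^{2i\lambda x} + \int_{-x}^{x} K(x,u)e^{2i\lambda u}\,du\right) r(x)\,dx \\ &+ \int_0^1 \left(-e^{-2i\lambda y} + \int_{-y}^{y} \overline{K}(y,v)e^{-2i\lambda v}\,dv\right) r(y)\,dy. \end{aligned} \tag{32}$$

By the change of variables $x = -y$ and $u = -v$ in the second term of the right-hand side of (32), one obtains

$$\begin{aligned} f(\lambda) ={}& \int_0^1 \left(-e^{2i\lambda x} + \int_{-x}^{x} K(x,u)e^{2i\lambda u}\,du\right) r(x)\,dx \\ &+ \int_{-1}^{0} \left(-e^{2i\lambda x} - \int_{-x}^{x} \overline{K}(-x,-u)e^{2i\lambda u}\,du\right) r(-x)\,dx. \end{aligned} \tag{33}$$

Thus, with

$$m(x) := r(x) \text{ for } x \in [0,1] \quad\text{and}\quad m(x) := r(-x) \text{ for } x \in [-1,0],$$

and with

$$B(x,u) := K(x,u) \quad \text{for } x \in [0,1] \tag{34}$$

and

$$B(x,u) := \overline{K}(-x,-u) \quad \text{for } x \in [-1,0],$$

(33) leads to

$$\begin{aligned} f(\lambda) &= \int_{-1}^{1} \left(-e^{2i\lambda x} + \int_{-|x|}^{|x|} B(x,u)e^{2i\lambda u}\,du\right) m(x)\,dx \\ &= \int_{-1}^{1} e^{2i\lambda x}\left(-m(x) + \int_{-1}^{-|x|} B(u,x)m(u)\,du + \int_{|x|}^{1} B(u,x)m(u)\,du\right) dx. \end{aligned}$$

Since $f(\lambda) = 0$ for all $\lambda \in \mathbb{R}$ and $\{e^{2i\lambda x} \mid \lambda \in \mathbb{R}\}$ spans $L^2([-1,1],\mathbb{C})$, the last formula implies that, for $x \in [-1,1]$,

$$m(x) = \int_{-1}^{-|x|} B(u,x)m(u)\,du + \int_{|x|}^{1} B(u,x)m(u)\,du.$$

In particular, for $x \in [0,1]$, one gets by definition of $B$ and $m$,

$$r(x) = \int_x^1 \bigl(\overline{K}(u,-x)r(u) + K(u,x)r(u)\bigr)\,du.$$

Therefore, defining $P \in L^2([-1,1]^2)$ by

$$P(u,v) := |K(u,v)| + |\overline{K}(u,-v)|,$$

one obtains

$$|r(x)| \le \int_x^1 P(u,x)|r(u)|\,du, \tag{35}$$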


for $x \in [0,1]$. Iterating formula (35) leads to, for each $n \ge 1$,

$$|r(x)| \le \int_x^1 du_1 \int_{u_1}^1 du_2 \cdots \int_{u_{n-1}}^1 du_n\, P(u_1,x)\cdots P(u_n,u_{n-1})\,|r(u_n)|.$$

Interchanging the order of integration we obtain*

$$|r(x)| \le \int_x^1 du_n \int_x^{u_n} du_{n-1} \cdots \int_x^{u_2} du_1\, P(u_1,x)\cdots P(u_n,u_{n-1})\,|r(u_n)|. \tag{36}$$

* The following argument is analogous to parts of [9, Theorem 6, Ch. 2].

Now setting $K_1(u,x) = P(u,x)$ and defining

$$K_j(u,x) := \int_x^u dv\, K_{j-1}(v,x)P(u,v),$$

the inequality (36) can be written as

$$|r(x)| \le \int_x^1 du\, K_n(u,x)|r(u)|. \tag{37}$$

Defining for $0 \le x \le u \le 1$

$$B(x) = \int_x^1 |P(z,x)|^2\,dz, \qquad A(u) = \int_0^u |P(u,z)|^2\,dz, \qquad \rho(x) = \int_0^x A(u)\,du,$$

one obtains by a straightforward recurrence and the Cauchy–Schwarz inequality

$$|K_n(u,x)|^2 \le A(u)B(x)\frac{\rho(u)^{n-2}}{(n-2)!}.$$

Therefore

$$\int_x^1 |K_n(u,x)|^2\,du \le \frac{B(x)}{(n-2)!}\int_0^1 du\,A(u)\rho(u)^{n-2} = \frac{B(x)}{(n-1)!}\,\rho(1)^{n-1}.$$

Hence

$$\int_x^1 |K_n(u,x)|^2\,du \le \frac{1}{(n-1)!}\int_0^1 |P(z,x)|^2\,dz \cdot \left(\int_0^1 du \int_0^1 dz\,|P(u,z)|^2\right)^{n-1}. \tag{38}$$

Using the Cauchy–Schwarz inequality in (37), we get

$$|r(x)|^2 \le \int_x^1 du\,|K_n(u,x)|^2 \cdot \int_x^1 |r(u)|^2\,du. \tag{39}$$

By integrating (39) and using (38) we find

$$\|r\|_{L^2}^2 \le \frac{1}{(n-1)!}\,\|P\|_{L^2}^{2n}\,\|r\|_{L^2}^2 \xrightarrow[n\to\infty]{} 0.$$

It follows then that $r(x) = \varphi(x) - \psi(x) = 0$. ✷
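The conclusion rests on the fact that the factorial in the denominator eventually beats any fixed power of the kernel norm. As a small illustration (not part of the proof), the Python sketch below evaluates the factor multiplying the squared norm of r in the last inequality; the function name `volterra_factor` and the sample value 4.0 for the squared L² norm of P are assumptions.

```python
import math


def volterra_factor(n, p_norm_sq):
    """Return ||P||^(2n) / (n-1)!, the factor in the last inequality,
    given the squared L^2 norm `p_norm_sq` of the kernel P."""
    return p_norm_sq ** n / math.factorial(n - 1)


if __name__ == "__main__":
    # Hypothetical kernel with ||P||^2 = 4: the factor first grows, then collapses.
    for n in (1, 5, 10, 20, 50):
        print(n, volterra_factor(n, 4.0))
```

The factor may exceed 1 for small n, but once it drops below 1 the inequality forces the norm of r to vanish.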

2.5. PROOFS OF THEOREMS 1 AND 2

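Theorem 1 is a direct consequence of Proposition 8 and Proposition 9.

The proof of Theorem 2 is similar; we only have to make the following changes:

− We replace $f(\lambda,\varphi,\psi)$ by

$$\tilde f(\lambda,\varphi,\psi) = f(-\lambda,\varphi,\psi)\,f(\lambda,\varphi,\psi).$$

Thus, since $\mu_{ln}(\varphi,\alpha) = \mu_{ln}(\psi,\alpha)$ and $\mu_{kn}(\varphi,\beta) = \mu_{kn}(\psi,\beta)$ for all $n \in \mathbb{N}$, one has by Lemma 3 that for any $n \in \mathbb{N}$

$$\tilde f(\mu_{ln}(\varphi,\alpha)) = \tilde f(\mu_{kn}(\varphi,\beta)) = 0$$

and

$$\tilde f(-\mu_{ln}(\varphi,\alpha)) = \tilde f(-\mu_{kn}(\varphi,\beta)) = 0.$$

− By Lemma 4 one has

$$\tilde f(\lambda) = o(e^{4a|\operatorname{Im}\lambda|}) \quad \text{as } |\lambda| \to +\infty.$$

− In Proposition 8 we replace $h$ by $\tilde h$, where

$$\tilde h(\lambda) = (\lambda - \mu_0)(\lambda - \nu_0)\prod_{n>0} \frac{(\mu_{ln}^2 - \lambda^2)(\nu_{kn}^2 - \lambda^2)}{(ln\pi)^2(kn\pi)^2}$$

with $\mu_j = \mu_j(\varphi,\alpha)$ and $\nu_j = \mu_j(\varphi,\beta)$ ($j \in \mathbb{N}$). The function $\lambda \mapsto \tilde f(\lambda)/\tilde h(\lambda)$ is still entire.

− By Lemma 7 there exist $C > 0$ and $(\gamma_p)_{p\ge 1}$ with $\gamma_p \to \infty$ as $p \to \infty$ such that, uniformly on $|\lambda| = \gamma_p$,

$$|\tilde h(\lambda)| \ge C\exp\!\left(\left(\frac{1}{l} + \frac{1}{k}\right)|\operatorname{Im}\lambda|\right).$$

− As in Proposition 8 we obtain, using $1/l + 1/k \ge 4a$, that $|\tilde f(\lambda)/\tilde h(\lambda)| = o(1)$ for $p \to \infty$ uniformly on $|\lambda| = \gamma_p$. Hence, by the maximum principle, $\tilde f \equiv 0$, i.e. $f \equiv 0$.

− We apply Proposition 9 to conclude that $\varphi = \psi$.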

Acknowledgements

B.G. would like to acknowledge the support of the ACI project (French Government) and the hospitality of the IIMAS-UNAM institute. R. del R. gratefully acknowledges support by projects IN-102998 PAPIIT-UNAM and 27487E CONACyT (Mexican Government) and the hospitality of the Department of Mathematics of the University of Nantes.

References

1. Amour, L.: Extension on isospectral sets for the AKNS systems, Inverse Problems 12 (1999), 115–120.

2. Borg, G.: Eine Umkehrung der Sturm-Liouvilleschen Eigenwertaufgabe, Acta Math. 78 (1946), 1–96.

3. Clark, S. and Gesztesy, F.: Weyl–Titchmarsh M-function asymptotics, local uniqueness results, trace formulas, and Borg-type theorems for Dirac operators, http://www.ma.utexas.edu/mp_arc-bin/mpa?yn=01-61.

4. del Rio, R., Gesztesy, F. and Simon, B.: Inverse spectral analysis with partial information on the potential, III. Updating boundary conditions, Internat. Math. Res. Notices 15 (1997), 751–758.

5. del Rio, R., Gesztesy, F. and Simon, B.: Corrections and addendum to inverse spectral analysis with partial information on the potential, III. Updating boundary conditions, Internat. Math. Res. Notices 11 (1999), 623–625.

6. Gesztesy, F. and Simon, B.: Inverse spectral analysis with partial information on the potential, II. The case of discrete spectrum, Trans. Amer. Math. Soc. 352 (1999), 2765–2787.

7. Grébert, B. and Guillot, J. C.: Gaps of one dimensional periodic AKNS systems, Forum Math. 5 (1993), 459–504.

8. Grébert, B. and Kappeler, T.: Normal form theory for the NLS equation, Preprint, Univ. Nantes.

9. Hochstadt, H.: Integral Equations, Pure Appl. Math., Wiley, New York, 1973.

10. Hochstadt, H. and Lieberman, B.: An inverse Sturm–Liouville problem with mixed given data, SIAM J. Appl. Math. 34 (1978), 676–680.

11. Levin, B. Ja.: Distribution of Zeros of Entire Functions, Trans. Math. Monogr. 5, Amer. Math. Soc., Providence, 1964.

12. McKean, H. P. and Vaninsky, K. L.: Action-angle variables for the cubic Schrödinger equation, Comm. Pure Appl. Math. 50 (1997), 489–562.

13. Pöschel, J. and Trubowitz, E.: Inverse Spectral Theory, Academic Press, New York, 1987.



Inverse Problem and Monodromy Data for Three-Dimensional Frobenius Manifolds

DAVIDE GUZZETTI
Research Institute for Mathematical Sciences (RIMS), Kyoto University, Kitashirakawa, Sakyo-ku, Kyoto 606-8502, Japan. e-mail: [email protected]

(Received: 10 April 2001; in final form: 21 September 2001)

Abstract. We study the inverse problem for semi-simple Frobenius manifolds of dimension 3 and we explicitly compute a parametric form of the solutions of the WDVV equations in terms of Painlevé VI transcendents. We show that the solutions are labeled by a set of monodromy data. We use our parametric form to explicitly construct polynomial and algebraic solutions and to derive the generating function of Gromov–Witten invariants of the quantum cohomology of the two-dimensional projective space. The procedure is a relevant application of the theory of isomonodromic deformations.

Mathematics Subject Classifications (2000): 53D45, 34M55, 81T45.

Key words: WDVV equation, Frobenius manifold, isomonodromic deformation, Painlevé equation, monodromy, boundary-value problem.

1. Introduction

In this paper we face the problem of analyzing the global structure of three-dimensional Frobenius manifolds and the analytic properties of the corresponding solutions of the WDVV equations. The procedure followed is an application of the theory of isomonodromic deformations and Painlevé equations. The three-dimensional case is analyzed here. It is already highly nontrivial and it is the first step towards a generalization to higher dimensions.

The WDVV equations of associativity were introduced by E. Witten [34], R. Dijkgraaf, E. Verlinde and H. Verlinde [8]. They are differential equations satisfied by the primary free energy $F(t)$ in two-dimensional topological field theory. $F(t)$ is a function of the coupling constants $t := (t^1, t^2, \ldots, t^n)$, $t^i \in \mathbb{C}$. Let $\partial_\alpha := \partial/\partial t^\alpha$. Given a nondegenerate symmetric matrix $\eta^{\alpha\beta}$, $\alpha, \beta = 1, \ldots, n$, and numbers $q_1, q_2, \ldots, q_n$, $r_1, r_2, \ldots, r_n$, $d$ ($r_\alpha = 0$ if $q_\alpha \ne 1$, $\alpha = 1, \ldots, n$), the WDVV equations are

$$\partial_\alpha\partial_\beta\partial_\lambda F\;\eta^{\lambda\mu}\;\partial_\mu\partial_\gamma\partial_\delta F = \text{the same with } \alpha, \delta \text{ exchanged}, \tag{1}$$

$$\partial_1\partial_\alpha\partial_\beta F = \eta_{\alpha\beta}, \tag{2}$$

$$E(F) = (3 - d)F + \text{(at most) quadratic terms}, \tag{3}$$


where the matrix $(\eta_{\alpha\beta})$ is the inverse of the matrix $(\eta^{\alpha\beta})$ and the differential operator $E$ is

$$E := \sum_{\alpha=1}^{n} E^\alpha\partial_\alpha, \qquad E^\alpha := (1 - q_\alpha)t^\alpha + r_\alpha, \quad \alpha = 1, \ldots, n,$$

and will be called the Euler vector field.

The theory of Frobenius manifolds was introduced by B. Dubrovin [9] to formulate the WDVV equations in geometrical terms. It has links to many branches of mathematics like singularity theory and reflection groups [11, 14, 30, 31], algebraic and enumerative geometry [24, 26], isomonodromic deformations theory, boundary-value problems, and Painlevé equations [12].

If we define $c_{\alpha\beta\gamma}(t) := \partial_\alpha\partial_\beta\partial_\gamma F(t)$, $c^\gamma_{\alpha\beta}(t) := \eta^{\gamma\mu}c_{\alpha\beta\mu}(t)$ (sum over repeated indices is always omitted in the paper), and we consider a vector space $A = \operatorname{span}(e_1, \ldots, e_n)$, then we obtain a family of commutative algebras $A_t$ with the multiplication $e_\alpha \cdot e_\beta := c^\gamma_{\alpha\beta}(t)\,e_\gamma$. Equation (1) is equivalent to associativity and (2) implies that $e_1$ is the unity.

DEFINITION. A Frobenius manifold is a smooth/analytic manifold $M$ over $\mathbb{C}$ whose tangent space $T_tM$ at any $t \in M$ is an associative, commutative algebra with unity $e$. Moreover, there exists a nondegenerate bilinear form $\langle\ ,\ \rangle$ defining a flat metric (flat means that the curvature associated to the Levi–Civita connection is zero).

We denote the product by · and the covariant derivative of $\langle\cdot,\cdot\rangle$ by $\nabla$. We require that the tensors

$$c(u,v,w) := \langle u\cdot v, w\rangle \quad\text{and}\quad \nabla_y c(u,v,w), \qquad u, v, w, y \in T_tM,$$

be symmetric. Let $t^1, \ldots, t^n$ be (local) flat coordinates for $t \in M$. Let $e_\alpha := \partial_\alpha$ be the canonical basis in $T_tM$,

$$\eta_{\alpha\beta} := \langle\partial_\alpha,\partial_\beta\rangle, \qquad c_{\alpha\beta\gamma}(t) := \langle\partial_\alpha\cdot\partial_\beta,\partial_\gamma\rangle.$$

The symmetry of $c$ becomes the complete symmetry of $\partial_\delta c_{\alpha\beta\gamma}(t)$ in the indices. This implies the existence of a function $F(t)$ such that $\partial_\alpha\partial_\beta\partial_\gamma F(t) = c_{\alpha\beta\gamma}(t)$, satisfying the WDVV equation (1). Equation (2) follows from the axiom $\nabla e = 0$, which yields $e = \partial_1$. Some more axioms are needed to formulate the quasi-homogeneity condition (3) and we refer the reader to [11–13]. In this way, the WDVV equations are reformulated in geometrical terms.

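We first consider the problem of the local structure of Frobenius manifolds (which has its counterpart in the local classification of solutions of WDVV equations). A Frobenius manifold is characterized by a family of flat connections $\tilde\nabla(z)$ parameterized by a complex number $z$, such that for $z = 0$ the connection is associated to $\langle\ ,\ \rangle$. For this reason the $\tilde\nabla(z)$ are called deformed connections. Let $u, v \in T_tM$, $\frac{d}{dz} \in T_z\mathbb{C}$; the family is defined on $M \times \mathbb{C}$ as

$$\tilde\nabla_u v := \nabla_u v + z\,u\cdot v, \qquad \tilde\nabla_{\frac{d}{dz}} v := \frac{\partial}{\partial z}v + E\cdot v - \frac{1}{z}\mu v,$$

$$\tilde\nabla_{\frac{d}{dz}}\frac{d}{dz} = 0, \qquad \tilde\nabla_u\frac{d}{dz} = 0,$$

where $E$ is the Euler vector field and $\mu := I - (d/2) - \nabla E$ is an operator acting on $v$. In flat coordinates $t = (t^1, \ldots, t^n)$, $\mu$ becomes

$$\mu = \operatorname{diag}(\mu_1, \ldots, \mu_n), \qquad \mu_\alpha = q_\alpha - \frac{d}{2},$$

provided that $\nabla E$ is diagonalizable. This will be assumed in the paper. A flat coordinate $\tilde t(t,z)$ is a solution of $\tilde\nabla d\tilde t = 0$, which is a linear system

$$\partial_\alpha\xi = zC_\alpha(t)\xi, \tag{4}$$

$$\partial_z\xi = \left[\mathcal{U}(t) + \frac{\mu}{z}\right]\xi, \tag{5}$$

where $\xi$ is a column vector of components

$$\xi^\alpha = \eta^{\alpha\mu}\frac{\partial\tilde t}{\partial t^\mu}, \qquad \alpha = 1, \ldots, n,$$

and

$$C_\alpha(t) := (c^\beta_{\alpha\gamma}(t)), \qquad \mathcal{U} := (E^\mu c^\beta_{\mu\gamma}(t))$$

are $n \times n$ matrices.

We restrict to semi-simple Frobenius manifolds, namely analytic Frobenius manifolds such that the matrix $\mathcal{U}$ can be diagonalized with distinct eigenvalues on an open dense subset $\mathcal{M}$ of $M$. Then there exists an invertible matrix $\phi_0 = \phi_0(t)$ such that

$$\phi_0\,\mathcal{U}\,\phi_0^{-1} = \operatorname{diag}(u_1, \ldots, u_n) =: U, \qquad u_i \ne u_j \text{ for } i \ne j \text{ on } \mathcal{M}.$$

The systems (4) and (5) become

$$\frac{\partial y}{\partial u_i} = [zE_i + V_i]\,y, \tag{6}$$

$$\frac{\partial y}{\partial z} = \left[U + \frac{V}{z}\right]y, \tag{7}$$

where the column vector $y$ is $y := \phi_0\,\xi$, $E_i$ is a diagonal matrix such that $(E_i)_{ii} = 1$ and $(E_i)_{jk} = 0$ otherwise, and

$$V_i := \frac{\partial\phi_0}{\partial u_i}\phi_0^{-1}, \qquad V := \phi_0\,\mu\,\phi_0^{-1}.$$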


As it is proved in [11, 12], $u_1, \ldots, u_n$ are local coordinates on $M$. The two bases

$$\frac{\partial}{\partial t^\nu},\ \nu = 1, \ldots, n \qquad\text{and}\qquad \frac{\partial}{\partial u_i},\ i = 1, \ldots, n$$

are related by $\phi_0$ according to the linear combination

$$\frac{\partial}{\partial t^\nu} = \sum_{i=1}^{n} \frac{(\phi_0)_{i\nu}}{(\phi_0)_{i1}}\frac{\partial}{\partial u_i}.$$

Locally we obtain a change of coordinates $t^\alpha = t^\alpha(u)$; then $\phi_0 = \phi_0(u)$, $V = V(u)$. The local Frobenius structure of $M$ is given by parametric formulae

$$t^\alpha = t^\alpha(u), \qquad F = F(u), \tag{8}$$

where $t^\alpha(u)$, $F(u)$ are certain meromorphic functions of $(u_1, \ldots, u_n)$, $u_i \ne u_j$, which can be obtained from $\phi_0(u)$ and $V(u)$. Their explicit construction is the object of the present paper.

The dependence of the system on u is isomonodromic. This means that the monodromy data of the system (7), to be introduced below, do not change for a small deformation of u. Therefore, the coefficients of the system in every local chart of M are naturally labeled by the monodromy data. To calculate the functions (8) in every local chart one has to reconstruct the system (7) from its monodromy data. This is the inverse problem.

We briefly explain what the monodromy data of the system (7) are and why they do not depend on $u$ (locally). For details, the reader is referred to [12]. At $z = 0$ the system (7) has a fundamental matrix solution (i.e. an invertible $n \times n$ matrix solution) of the form

$$Y_0(z,u) = \left[\sum_{p=0}^{\infty}\phi_p(u)\,z^p\right]z^\mu z^R, \tag{9}$$

where $R_{\alpha\beta} = 0$ if $\mu_\alpha - \mu_\beta \ne k > 0$, $k \in \mathbb{N}$. At $z = \infty$ there is a formal $n \times n$ matrix solution of (7) given by

$$Y_F = \left[I + \frac{F_1(u)}{z} + \frac{F_2(u)}{z^2} + \cdots\right]e^{zU},$$

where the $F_j(u)$ are $n \times n$ matrices. It is a well-known result that there exist fundamental matrix solutions with asymptotic expansion $Y_F$ as $z \to \infty$ [2]. Let $l$ be a generic oriented line passing through the origin. Let $l_+$ be the positive half-line and $l_-$ the negative one. Let $\Pi_L$ and $\Pi_R$ be two sectors in the complex plane to the left and to the right of $l$, respectively. There exist unique fundamental matrix solutions $Y_L$ and $Y_R$ having the asymptotic expansion $Y_F$ for $z \to \infty$ in $\Pi_L$ and $\Pi_R$, respectively [2]. They are related by an invertible connection matrix $S$, called the Stokes matrix, such that $Y_L(z) = Y_R(z)S$ for $z \in l_+$. As it is proved in [12], we also have $Y_L(z) = Y_R(z)S^T$ on $l_-$.

Finally, there exists an $n \times n$ invertible connection matrix $C$ such that $Y_0 = Y_R C$ on $\Pi_R$.

DEFINITION. The matrices $R$, $C$, $\mu$ and the Stokes matrix $S$ of the system (7) are the monodromy data of the Frobenius manifold in a neighborhood of the point $u = (u_1, \ldots, u_n)$. It is also necessary to specify which is the first eigenvalue of $\mu$, because the dimension of the manifold is $d = -2\mu_1$ (a more precise definition of monodromy data is in [12]).

The definition makes sense because the data do not change if $u$ undergoes a small deformation. This problem is discussed in [12]. We also refer the reader to [21] for a general discussion of isomonodromic deformations. Here we just observe that since a fundamental matrix solution $Y(z,u)$ of (7) also satisfies (6), the monodromy data cannot depend on $u$ (locally). In fact, $(\partial Y/\partial u_i)Y^{-1} = zE_i + V_i$ is single-valued in $z$. The compatibility of (6) and (7) is equivalent to

$$[U, V_k] = [E_k, V], \tag{10}$$

$$\frac{\partial V}{\partial u_k} = [V_k, V]. \tag{11}$$

Note that (10) determines $V_k$ uniquely, provided that $u_i \ne u_j$ for $i \ne j$, namely

$$(V_k)_{ij} = \frac{\delta_{ki} - \delta_{kj}}{u_i - u_j}\,V_{ij}.$$

We finally recall that by construction $\phi_0$ satisfies

$$\frac{\partial\phi_0}{\partial u_k} = V_k\phi_0, \qquad k = 1, \ldots, n. \tag{12}$$

According to the results of [21], (11) and (12) are necessary and sufficient conditions for the deformation $u$ to be isomonodromic.
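The explicit formula for V_k obtained from (10) can be checked entry by entry: both sides of (10) have (i, j) entry (δ_ki − δ_kj)V_ij. The following Python sketch (an illustration added here, using exact rational arithmetic) verifies this for sample data; the helper names `matmul`, `commutator`, `V_k` and `check_compatibility`, the 0-based index k, and the sample values of u and V are assumptions.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(A, B):
    """Matrix commutator [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]


def V_k(k, u, V):
    """(V_k)_{ij} = (delta_{ki} - delta_{kj}) / (u_i - u_j) * V_{ij}, with 0-based k."""
    n = len(u)
    Vk = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                Vk[i][j] = Fraction((k == i) - (k == j)) / (u[i] - u[j]) * V[i][j]
    return Vk


def check_compatibility(u, V):
    """Check [U, V_k] = [E_k, V] for every k, with U = diag(u)."""
    n = len(u)
    U = [[u[i] if i == j else Fraction(0) for j in range(n)] for i in range(n)]
    for k in range(n):
        E_k = [[Fraction(1) if i == j == k else Fraction(0) for j in range(n)]
               for i in range(n)]
        if commutator(U, V_k(k, u, V)) != commutator(E_k, V):
            return False
    return True


if __name__ == "__main__":
    u = [Fraction(1), Fraction(3), Fraction(7)]  # distinct sample values
    V = [[Fraction(0), Fraction(2), Fraction(-5)],
         [Fraction(-2), Fraction(0), Fraction(4)],
         [Fraction(5), Fraction(-4), Fraction(0)]]
    print(check_compatibility(u, V))
```

The check uses only that the u_i are distinct; the sample V is taken antisymmetric for definiteness, but no symmetry of V enters the identity.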

The inverse problem can be formulated as a boundary-value problem (b.v.p.). Let’s fix $u = u^{(0)} = (u^{(0)}_1, \ldots, u^{(0)}_n)$ such that $u^{(0)}_i \ne u^{(0)}_j$ for $i \ne j$. Suppose we give $\mu$, $\mu_1$, $R$, an admissible line $l$, $S$ and $C$.1 Some more technical conditions must be added, but we refer to [12]. Let $D$ be a disk specified by $|z| < \rho$ for some small $\rho$. Let $P_L$ and $P_R$ be the intersection of the complement of the disk with $\Pi_L$ and $\Pi_R$, respectively. We denote by $\partial D_R$ and $\partial D_L$ the lines on the boundary of $D$ on the side of $P_R$ and $P_L$ respectively; we denote by $\tilde l_+$ and $\tilde l_-$ the portions of $l_+$ and $l_-$ on the common boundary of $P_R$ and $P_L$. Let’s consider the following discontinuous b.v.p.: we want to construct a piecewise holomorphic $n \times n$ matrix function

$$\Phi(z) = \begin{cases} \Phi_R(z), & z \in P_R, \\ \Phi_L(z), & z \in P_L, \\ \Phi_0(z), & z \in D, \end{cases}$$

continuous on the boundary of $P_R$, $P_L$, $D$ respectively, such that

$$\Phi_L(\zeta) = \Phi_R(\zeta)\,e^{\zeta U}Se^{-\zeta U}, \qquad \zeta \in \tilde l_+,$$

$$\Phi_L(\zeta) = \Phi_R(\zeta)\,e^{\zeta U}S^Te^{-\zeta U}, \qquad \zeta \in \tilde l_-,$$

$$\Phi_0(\zeta) = \Phi_R(\zeta)\,e^{\zeta U}C\zeta^{-R}\zeta^{-\mu}, \qquad \zeta \in \partial D_R,$$

$$\Phi_0(\zeta) = \Phi_L(\zeta)\,e^{\zeta U}S^{-1}C\zeta^{-R}\zeta^{-\mu}, \qquad \zeta \in \partial D_L,$$

$$\Phi_{L/R}(z) \to I \quad \text{if } z \to \infty \text{ in } P_{L/R}.$$

1 We remark that due to symmetries of Equation (7) the matrix $C$ is determined by $S$, $R$ and $\mu$ up to some ambiguity that does not affect the corresponding Frobenius structure. Therefore, the relevant monodromy data are $S$, $R$, $\mu$ and $\mu_1$.

The reader may observe that

$$Y_{L/R}(z) := \Phi_{L/R}(z)\,e^{zU}, \qquad Y^{(0)}(z) := \Phi_0(z,u)\,z^\mu z^R$$

have precisely the monodromy properties of the solutions of (7).

THEOREM ([12, 25, 27]). If the above boundary-value problem has a solution for a given $u^{(0)} = (u^{(0)}_1, \ldots, u^{(0)}_n)$ such that $u^{(0)}_i \ne u^{(0)}_j$ for $i \ne j$, then:

(i) It is unique.

(ii) The solution exists and is analytic for $u$ in a neighborhood of $u^{(0)}$.

(iii) The solution has analytic continuation as a meromorphic function on the universal covering of $\mathbb{C}^n\setminus\{\text{diagonals}\}$, where ‘diagonals’ stands for the union of all the sets $\{u \in \mathbb{C}^n \mid u_i = u_j,\ i \ne j\}$.

A solution $Y_{L/R}$, $Y^{(0)}$ of the b.v.p. solves the system (6), (7).1 This means that we can locally reconstruct $V(u)$, $\phi_0(u)$ and (8) from the local solution of the b.v.p. It follows that every local chart of the atlas covering the manifold is labeled by monodromy data. Moreover, $V(u)$, $\phi_0(u)$ and (8) can be continued analytically as meromorphic functions on the universal covering of $\mathbb{C}^n\setminus\text{diagonals}$.

Let $S_n$ be the symmetric group of $n$ elements. Local coordinates $(u_1, \ldots, u_n)$ are defined up to permutation. Thus, the analytic continuation of the local structure of $M$ is described by the braid group $B_n$, namely the fundamental group of $(\mathbb{C}^n\setminus\text{diagonals})/S_n$. There exists an action of the braid group itself on the monodromy data, corresponding to the change of coordinate chart. The group is generated by $n - 1$ elements $\beta_1, \ldots, \beta_{n-1}$ such that $\beta_i$ is represented as a deformation consisting of a permutation of $u_i$, $u_{i+1}$ moving counter-clockwise (clockwise or counter-clockwise is a matter of convention).

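1 We show that a solution $Y_{L/R}$, $Y^{(0)}$ of the b.v.p. solves the system (6), (7). We have $\Phi_R(z) = I + \frac{F_1}{z} + O\!\left(\frac{1}{z^2}\right)$ as $z \to \infty$ in $P_R$. We also have $\Phi_0(z) = \sum_{p=0}^{\infty}\phi_p z^p$ as $z \to 0$. Therefore

$$\frac{\partial Y_R}{\partial z}Y_R^{-1} = \left[U + \frac{[F_1, U]}{z} + O\!\left(\frac{1}{z^2}\right)\right], \qquad z \to \infty,$$

$$\frac{\partial Y^{(0)}}{\partial z}(Y^{(0)})^{-1} = \frac{1}{z}\left[\phi_0\mu\phi_0^{-1} + O(z)\right], \qquad z \to 0.$$

Since $C$ is independent of $u$ the right-hand sides of the two equalities above are equal. Also $S$ is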

If $u_1, \ldots, u_n$ are in lexicographical order w.r.t. $l$, so that $S$ is upper triangular, the braid $\beta_i$ acts on $S$ as follows [12]:

$$S \mapsto S^{\beta_i} = A_i(S)\,S\,A_i(S),$$

where

$$(A_i(S))_{kk} = 1, \qquad k = 1, \ldots, n, \quad k \ne i, i + 1,$$

$$(A_i(S))_{i+1,i+1} = -s_{i,i+1},$$

$$(A_i(S))_{i,i+1} = (A_i(S))_{i+1,i} = 1,$$

and all the other entries are zero. For a generic braid $\beta$ the action $S \to S^\beta$ is decomposed into a sequence of elementary transformations as above. In this way, we are able to describe the analytic continuation of the local structure in terms of monodromy data.
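The elementary braid action is easy to compute by hand for small n, and the following Python sketch (an illustration, with plain nested lists as matrices) implements it directly from the entries of A_i(S) listed above. The function names `braid_matrix` and `braid_action`, the 1-based index i, and the sample Stokes matrix are assumptions made for the example.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[r][l] * B[l][c] for l in range(n)) for c in range(n)]
            for r in range(n)]


def braid_matrix(S, i):
    """A_i(S) for the elementary braid beta_i, with 1-based i in 1..n-1."""
    n = len(S)
    A = [[0] * n for _ in range(n)]
    for k in range(n):
        if k not in (i - 1, i):      # rows/columns other than i, i+1 (1-based)
            A[k][k] = 1
    A[i][i] = -S[i - 1][i]           # (A_i)_{i+1,i+1} = -s_{i,i+1}
    A[i - 1][i] = 1                  # (A_i)_{i,i+1} = 1
    A[i][i - 1] = 1                  # (A_i)_{i+1,i} = 1
    return A


def braid_action(S, i):
    """S -> A_i(S) S A_i(S)."""
    A = braid_matrix(S, i)
    return matmul(matmul(A, S), A)


if __name__ == "__main__":
    S = [[1, 2, 3],
         [0, 1, 5],
         [0, 0, 1]]                  # sample upper triangular Stokes matrix
    print(braid_action(S, 1))
    print(braid_action(S, 2))
```

For a 3 × 3 upper triangular S with unit diagonal and entries a = s_12, b = s_13, c = s_23, a direct computation gives off-diagonal entries (−a, c, b − ac) after β_1 and (b, a − bc, −c) after β_2, so the result is again upper triangular with unit diagonal.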

Not all the braids are actually to be considered. Suppose we do the following gauge $y \mapsto Jy$, $J = \operatorname{diag}(\pm 1, \ldots, \pm 1)$, on the system (7). Then $JUJ^{-1} \equiv U$ but $S$ is transformed to $JSJ^{-1}$, where some entries change sign. The formulae which define a local chart of the manifold in terms of monodromy data, which we are going to describe later, are not affected by this transformation. The analytic continuation of the local structure on the universal covering of $(\mathbb{C}^n\setminus\text{diagonals})/S_n$ is therefore described by the elements of the quotient group

$$B_n/\{\beta \in B_n \mid S^\beta = JSJ\}. \tag{13}$$

From these considerations it is proved in [12] that:

THEOREM ([12]). Given monodromy data $(\mu_1, \mu, R, S, C)$, the local Frobenius structure obtained from the solution of the b.v.p. extends to an open dense subset of the covering of $(\mathbb{C}^n\setminus\text{diagonals})/S_n$ w.r.t. the covering transformations (13).

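independent of $u$, therefore the matrices $Y$ above satisfy

$$\frac{\partial y}{\partial z} = \left[U + \frac{V}{z}\right]y, \qquad V(u) := [F_1(u), U] \equiv \phi_0\mu\phi_0^{-1}.$$

In the same way

$$\frac{\partial Y_R}{\partial u_i}Y_R^{-1} = zE_i + [F_1, E_i] + O\!\left(\frac{1}{z}\right), \qquad z \to \infty,$$

$$\frac{\partial Y^{(0)}}{\partial u_i}(Y^{(0)})^{-1} = \frac{\partial\phi_0}{\partial u_i}\phi_0^{-1} + O(z), \qquad z \to 0.$$

The right-hand sides are equal, therefore the $Y$’s satisfy

$$\frac{\partial y}{\partial u_i} = [zE_i + V_i]\,y, \qquad V_i(u) := [F_1(u), E_i] \equiv \frac{\partial\phi_0}{\partial u_i}\phi_0^{-1}.$$

We conclude that from the solution of the boundary value problem we obtain solutions to (7), (6).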


Let’s start from a Frobenius manifold $M$ of dimension $d$. Let $\mathcal{M}$ be the open submanifold where the matrix $\mathcal{U}(t) = (E^\mu c^\beta_{\mu\gamma}(t))$ has distinct eigenvalues. If we compute its monodromy data $(\mu_1 = -(d/2), \mu, R, S, C)$ at a point $u^{(0)} \in \mathcal{M}$ and we construct the Frobenius structure from the analytic continuation of the corresponding b.v.p. on the covering of $(\mathbb{C}^n\setminus\text{diagonals})/S_n$ w.r.t. the quotient (13), then there is an equivalence of Frobenius structures between this last manifold and $M$.

We now turn to the problem of understanding the global structure of a Frobenius manifold. In order to do it we have to study (8) when two or more distinct coordinates $u_i$, $u_j$, etc., merge. $\phi_0(u)$, $V(u)$ and (8) are multi-valued meromorphic functions of $u = (u_1, \ldots, u_n)$ and the branching occurs when $u$ goes around a loop around the set of diagonals $\bigcup_{ij}\{u \in \mathbb{C}^n \mid u_i = u_j,\ i \ne j\}$. $\phi_0(u)$, $V(u)$ and (8) have singular behavior if $u_i \to u_j$ ($i \ne j$). We call such behavior critical behavior. Although it is impossible to solve the boundary-value problem exactly, except for special cases occurring for $2 \times 2$ systems, we may hopefully compute the asymptotic/critical behavior of the solution, using the isomonodromic deformation method. We will face the problem in the first nontrivial case, namely for three-dimensional Frobenius manifolds.

Instead of analyzing the boundary-value problem directly, we exploit the isomonodromic dependence of the system (7) on $u$, which implies that the solution of the inverse problem must satisfy the nonlinear equations (11), (12). For three-dimensional Frobenius manifolds, (11), (12) are reduced in [11] to a special case of the Painlevé VI equation1:

$$\begin{aligned} \frac{d^2y}{dx^2} ={}& \frac{1}{2}\left[\frac{1}{y} + \frac{1}{y-1} + \frac{1}{y-x}\right]\left(\frac{dy}{dx}\right)^2 - \left[\frac{1}{x} + \frac{1}{x-1} + \frac{1}{y-x}\right]\frac{dy}{dx} \\ &+ \frac{1}{2}\,\frac{y(y-1)(y-x)}{x^2(x-1)^2}\left[(2\mu-1)^2 + \frac{x(x-1)}{(y-x)^2}\right], \end{aligned} \tag{14}$$

$$\mu \in \mathbb{C}, \qquad x = \frac{u_3 - u_1}{u_2 - u_1}.$$
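For numerical experiments with (14) it is convenient to have the right-hand side as a function of (x, y, dy/dx, µ). The Python sketch below is a direct transcription of (14) under that reading; the function name `painleve_vi_rhs` and the sample arguments are assumptions, and no claim is made about which integrator is appropriate near the critical points x = 0, 1, ∞.

```python
def painleve_vi_rhs(x, y, dy, mu):
    """Right-hand side of equation (14): d^2y/dx^2 as a function of
    x, y, dy = dy/dx and the parameter mu.

    Singular when x is 0 or 1, or when y is 0, 1 or x.
    """
    quadratic = 0.5 * (1.0 / y + 1.0 / (y - 1.0) + 1.0 / (y - x)) * dy ** 2
    linear = -(1.0 / x + 1.0 / (x - 1.0) + 1.0 / (y - x)) * dy
    potential = (0.5 * y * (y - 1.0) * (y - x) / (x ** 2 * (x - 1.0) ** 2)
                 * ((2.0 * mu - 1.0) ** 2 + x * (x - 1.0) / (y - x) ** 2))
    return quadratic + linear + potential


if __name__ == "__main__":
    # Sample point (hypothetical): x = 2, y = 3, mu = 1/2.
    print(painleve_vi_rhs(2.0, 3.0, 0.0, 0.5))
    print(painleve_vi_rhs(2.0, 3.0, 1.0, 0.5))
```

At x = 2, y = 3, µ = 1/2 the three terms can be evaluated by hand: with dy/dx = 0 only the last term survives and equals 3/2, while with dy/dx = 1 the terms are 11/12, −5/2 and 3/2, giving −1/12.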

The parameter $\mu$ is $\mu_1$ and the matrix $\mu = \operatorname{diag}(\mu, 0, -\mu)$. We are going to show that the entries of $V(u)$ and :(u) are rational functions of $x$, $y(x)$, $dy/dx$. If $u_i \to u_j$, the critical behavior of $V(u)$, $\phi_0(u)$ and (8) is a consequence of the critical behavior of the transcendent $y(x)$ close to the critical points $x = 0, 1, \infty$. This will be described in the paper.

1 The six classical Painlevé equations were discovered by Painlevé [28] and Gambier [16], who classified all the second-order ordinary differential equations of the type

$$\frac{d^2y}{dx^2} = R\!\left(x, y, \frac{dy}{dx}\right),$$

where $R$ is rational in $dy/dx$, $x$ and $y$. The Painlevé equations satisfy the Painlevé property of absence of movable critical singularities. The general solution of the VIth Painlevé equation can be analytically continued to a meromorphic function on the universal covering of $\mathbb{P}^1\setminus\{0, 1, \infty\}$. For generic values of the integration constants and of the parameters in the equation, the solution cannot be expressed via elementary or classical transcendental functions. For this reason, the solution is called a Painlevé transcendent.

1.1. RESULTS OF THE PAPER

(1) Let F0(t) := ½[(t^1)² t^3 + t^1 (t^2)²]. We prove in Theorem 5.1 of Section 5 that for generic µ the parametric representation (8) becomes

$$
t^2(u) = \tau_2(x,\mu)\,(u_2-u_1)^{1+\mu}, \qquad t^3(u) = \tau_3(x,\mu)\,(u_2-u_1)^{1+2\mu}, \tag{15}
$$

$$
F(u) = F_0(t) + \mathcal{F}(x,\mu)\,(u_2-u_1)^{3+2\mu}, \tag{16}
$$

where τ2(x, µ), τ3(x, µ), ℱ(x, µ) are certain rational functions of µ, x, y(x), dy/dx and a quadrature of y, which we will compute explicitly. The ratio t^2/(t^3)^{(1+µ)/(1+2µ)} is independent of (u2 − u1). Therefore, the closed form F = F(t) must be

$$
F(t) = F_0(t) + (t^3)^{\frac{3+2\mu}{1+2\mu}}\;\varphi\!\left(\frac{t^2}{(t^3)^{\frac{1+\mu}{1+2\mu}}}\right),
$$

where the function ϕ has to be determined by the inversion of (15), (16).

For the value µ = −1, corresponding to the Frobenius manifold called quantum cohomology of the projective space CP², denoted QH*(CP²), we prove in Theorem 5.2 of Section 5 that the coordinate t^2(u) is

$$
t^2(u) = 3\ln(u_2-u_1) + 3\int^x d\zeta\,\tau(\zeta), \tag{17}
$$

where τ(x) is also computed explicitly as a rational function of x, y(x), dy/dx. The coordinate t^3 and F are the limit for µ → −1 of (15), (16). Now e^{t^2}(t^3)³ is independent of (u2 − u1) and so

$$
F(t) = F_0(t) + \frac{1}{t^3}\,\varphi\!\left(e^{t^2}(t^3)^3\right).
$$
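The homogeneity argument behind these two closed forms can be checked numerically. The following is a minimal sketch, not part of the computation of the paper: at fixed x the coefficients in (15)–(17) are constants, so we replace them by arbitrary positive numbers (the defaults `tau2`, `tau3`, `calF`, `tau_int` are placeholders chosen for illustration) and verify that the arguments of ϕ do not change when H = u2 − u1 is rescaled.

```python
import math

def generic_invariants(H, mu, tau2=0.7, tau3=1.3, calF=0.4):
    """Arguments of phi in the generic closed form, at fixed x.

    tau2, tau3, calF stand for tau_2(x, mu), tau_3(x, mu), F(x, mu) of (15), (16);
    their numerical values here are arbitrary placeholders.
    """
    t2 = tau2 * H ** (1 + mu)
    t3 = tau3 * H ** (1 + 2 * mu)
    dF = calF * H ** (3 + 2 * mu)                      # F - F0
    ratio = t2 / t3 ** ((1 + mu) / (1 + 2 * mu))       # argument of phi
    phi_value = dF / t3 ** ((3 + 2 * mu) / (1 + 2 * mu))
    return ratio, phi_value

def cp2_invariants(H, tau_int=0.2, tau3=1.3, calF=0.4):
    """Same check for mu = -1, where t^2 contains 3 ln(u2 - u1), see (17)."""
    t2 = 3 * math.log(H) + 3 * tau_int
    t3 = tau3 / H                                       # degree 1 + 2*mu = -1
    dF = calF * H                                       # degree 3 + 2*mu = 1
    return math.exp(t2) * t3 ** 3, dF * t3
```

Both functions return the same pair of numbers for any H > 0, which is the statement that ϕ depends on a single H-independent variable.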

To our knowledge, this is the first time the explicit parameterization (15)–(17) is given. We stress that in Section 5 the formulas will be completely explicit. Although the proof is mainly a computational problem (the theoretical problem being already solved by the reduction to the Painlevé VI equation [11]), it is very hard.

Moreover, the knowledge of this explicit form is necessary to proceed to the inversion of the parametric formulae t = t(u), F = F(u) close to the diagonals ui = uj, in order to investigate the global structure of the Frobenius manifold and to obtain F = F(t) in closed form.

(2) As we discussed above, the local structure of the manifold and of F(t) is labeled by the monodromy data. The formulae (15)–(17) make this explicit. This follows from the fact that the two integration constants which govern the critical

254 DAVIDE GUZZETTI

behavior of y(x) (and thus of the corresponding solution of (11), (12)) and the parameter µ are contained in the three entries (x0, x1, x∞) of the Stokes matrix

$$
S = \begin{pmatrix} 1 & x_\infty & x_0\\ 0 & 1 & x_1\\ 0 & 0 & 1 \end{pmatrix}, \qquad\text{such that}\qquad x_0^2 + x_1^2 + x_\infty^2 - x_0x_1x_\infty = 4\sin^2(\pi\mu),
$$

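of the system (7). It is known that there exists a class of transcendents whose critical behaviour is

$$
y(x) = \begin{cases}
a^{(0)}\,x^{1-\sigma^{(0)}}\,\bigl(1+O(|x|^{\delta})\bigr), & x\to 0,\\[2pt]
1 - a^{(1)}\,(1-x)^{1-\sigma^{(1)}}\,\bigl(1+O(|1-x|^{\delta})\bigr), & x\to 1,\\[2pt]
a^{(\infty)}\,x^{-\sigma^{(\infty)}}\,\bigl(1+O(|x|^{-\delta})\bigr), & x\to\infty,
\end{cases} \tag{18}
$$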

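where 0 < δ < 1 is a small positive number, a^(i) and σ^(i) are complex numbers such that a^(i) ≠ 0 and 0 ≤ ℜσ^(i) ≤ 1, σ^(i) ≠ 1. The above behavior depends on the entries of S, which determine the constants a^(i), σ^(i) through the formulae

$$
x_i^2 = 4\sin^2\!\left(\frac{\pi}{2}\,\sigma^{(i)}\right), \tag{19}
$$

$$
a^{(0)} = \frac{i\,G(\sigma^{(0)},\mu)^2}{2\sin(\pi\sigma^{(0)})}\Bigl[2\bigl(1+e^{-i\pi\sigma^{(0)}}\bigr) - f(x_0,x_1,x_\infty)\bigl(x_\infty^2 + e^{-i\pi\sigma^{(0)}}x_1^2\bigr)\Bigr]\, f(x_0,x_1,x_\infty), \tag{20}
$$

where

$$
f(x_0,x_1,x_\infty) := \frac{4-x_0^2}{2-x_0^2-2\cos(2\pi\mu)}, \qquad
G(\sigma^{(0)},\mu) = \frac12\,\frac{4^{\sigma^{(0)}}\,\Gamma\!\left(\frac{\sigma^{(0)}+1}{2}\right)^{2}}{\Gamma\!\left(1-\mu+\frac{\sigma^{(0)}}{2}\right)\Gamma\!\left(\mu+\frac{\sigma^{(0)}}{2}\right)}.
$$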

The parameters a^(1), a^(∞) are obtained like a^(0), provided that we do the substitutions

$$
(x_0,x_1,x_\infty)\mapsto(x_1,x_0,x_0x_1-x_\infty), \qquad \sigma^{(0)}\mapsto\sigma^{(1)}
$$

and

$$
(x_0,x_1,x_\infty)\mapsto(x_\infty,-x_1,x_0-x_1x_\infty), \qquad \sigma^{(0)}\mapsto\sigma^{(\infty)},
$$

respectively, in the formula for a^(0). Note that σ^(i) ≠ 1 ⇔ xi ≠ ±2.

The critical behavior of the Painlevé transcendents was obtained in [15, 20] for generic values of the entries of the Stokes matrix, with the exception of real xi such that |xi| ≥ 2, i = 0, 1, ∞. We generalized the result to any xi ≠ ±2 [see D. Guzzetti: On the critical behavior, the connection problem and the elliptic representation of a Painlevé VI equation (2001), to appear. See also: Inverse Problem for Semisimple Frobenius Manifolds, Monodromy Data and the Painlevé VI Equation, Ph.D. thesis and SISSA preprint 101/2000/FM (2000). Formulae (19) and (20) are found in these papers].

We proved in the above-mentioned paper that for the special case ℜσ^(i) = 1 (σ^(i) ≠ 1), and therefore for xi real with |xi| > 2, the solution y(x) behaves like (18) only along spirals, but it is oscillatory when x → i, i = 0, 1, ∞, along a radial path. For example, if x → 0 we have

$$
y(x) = O(x) + \frac{1+O(x)}{\sin^2\!\left(\dfrac{\nu}{2}\ln x - \nu\ln 16 + \dfrac{\pi\nu_1}{2} + \displaystyle\sum_{m=1}^{\infty} c_{0m}(\nu)\left[\left(\frac{e^{i\pi\nu_1}}{16^{i\nu}}\right)x^{i\nu}\right]^{m}\right)}, \qquad x\to 0, \tag{21}
$$

where

$$
\sigma^{(0)} = 1 - i\nu, \quad \nu\in\mathbb{R}\setminus\{0\} \qquad\text{and}\qquad a^{(0)} = -\frac14\left[\frac{e^{i\pi\nu_1}}{16^{\,i\nu-1}}\right].
$$

The series in the denominator converges and defines a holomorphic and bounded function in a suitable domain where y(x) has no movable poles.

The critical behavior of Painlevé transcendents is also analyzed in [33], though the relation to monodromy data is not considered.

In Section 7 we reduce the formulae (15), (16) to closed form for the five algebraic solutions of the Painlevé equation. In this case the transcendent behaves like (18) with rational exponents, so that t and F in (15), (16) are expanded in Puiseux series in x, 1 − x or 1/x. The expansion can be inverted, in order to obtain F = F(t) in closed form as an expansion in t. We prove that we obtain here the three polynomial solutions of the WDVV equations corresponding to the Frobenius structure on the orbit space of Coxeter groups [11, 30, 31], plus two algebraic solutions.

We also apply the procedure to QH*(CP²). This time, ℜσ^(i) = 1, because the Stokes matrix is

$$
S = \begin{pmatrix} 1 & 3 & 3\\ 0 & 1 & 3\\ 0 & 0 & 1 \end{pmatrix},
$$

as is proved in [12, 17]. Therefore, the transcendent has the oscillatory behavior (21) and the reduction of (15)–(17) to closed form is hard. To avoid this difficulty we expanded the transcendent in Taylor series close to a regular point x_reg, we plugged the expansion into (15)–(17) and we obtained t and F as Taylor series in (x − x_reg). We inverted the series and we got a closed form F = F(t).
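As a quick consistency check of the data quoted above, the sketch below verifies that the entries x0 = x1 = x∞ = 3 satisfy the constraint on the Stokes matrix for µ = −1, and that formula (19) then forces ℜσ^(i) = 1. It is only an illustration: the function names are ours, and the sign of ν is fixed to be positive by convention (the text only requires ν ∈ R\{0}).

```python
import cmath
import math

def stokes_invariant(x0, x1, xinf):
    """Left-hand side of x0^2 + x1^2 + xinf^2 - x0*x1*xinf = 4 sin^2(pi*mu)."""
    return x0 ** 2 + x1 ** 2 + xinf ** 2 - x0 * x1 * xinf

def nu_from_entry(x):
    """For real |x| > 2, solve x^2 = 4 sin^2(pi*sigma/2) with sigma = 1 - i*nu.

    Since sin(pi/2 - i*pi*nu/2) = cosh(pi*nu/2), this gives cosh(pi*nu/2) = |x|/2.
    The positive root is returned (an arbitrary choice of sign).
    """
    return (2 / math.pi) * math.acosh(abs(x) / 2)

mu = -1
lhs = stokes_invariant(3, 3, 3)                 # 27 - 27
rhs = 4 * math.sin(math.pi * mu) ** 2           # vanishes for integer mu

nu = nu_from_entry(3)
sigma = complex(1, -nu)                         # real part equal to 1
x_squared = 4 * cmath.sin(cmath.pi * sigma / 2) ** 2
```

Here `lhs` is exactly 0, `rhs` vanishes up to rounding, and `x_squared` reproduces 3² = 9.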

We prove in Section 8 that the closed form we obtain through (15)–(17) coincides with the solution of the WDVV equations which generates the numbers N_k of rational curves CP¹ → CP² of degree k passing through 3k − 1 generic points [24]. Namely

$$
F(t^1,t^2,t^3) = \frac12\left[(t^1)^2t^3 + t^1(t^2)^2\right] + \sum_{k=1}^{\infty}\frac{N_k}{(3k-1)!}\,(t^3)^{3k-1}e^{kt^2}. \tag{22}
$$
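For comparison, the numbers N_k entering (22) can also be generated by the well-known Kontsevich recursion, which is what the WDVV equations give when imposed on an expansion of the form (22). This is not the isomonodromic procedure developed in this paper; the sketch below is only an independent way to tabulate the first values against which such a procedure can be checked. The function name `N` is ours.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def N(d):
    """Number of rational plane curves of degree d through 3d - 1 generic points.

    Kontsevich's recursion with initial value N(1) = 1.
    """
    if d < 1:
        raise ValueError("degree must be a positive integer")
    if d == 1:
        return 1
    total = 0
    for d1 in range(1, d):
        d2 = d - d1
        total += N(d1) * N(d2) * (
            d1 ** 2 * d2 ** 2 * comb(3 * d - 4, 3 * d1 - 2)
            - d1 ** 3 * d2 * comb(3 * d - 4, 3 * d1 - 1)
        )
    return total

first_values = [N(k) for k in range(1, 6)]
```

The list `first_values` starts 1, 1, 12, 620, 87304.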


Therefore, we have constructed a procedure to compute the N_k’s, which is an application of the theory of isomonodromic deformations.

It is known [7] that (22) is convergent in a neighborhood of (t^3)³e^{t^2} = 0, but the global analytic properties of F(t) are unknown. The inverse reconstruction of the corresponding Frobenius manifold starting from its monodromy data may shed some light on these properties. To this purpose, we still have to manage to invert the parametric formulae (15)–(17) if x converges to a critical point. Particularly, we hope to better understand the connection between the monodromy data of the quantum cohomology and the number of rational curves. This problem will be the object of further investigations.

The entire procedure developed here is a significant application of the theory of isomonodromic deformations to a problem of mathematical physics (construction of solutions of WDVV equations) and to pure mathematics (investigation of the global structure of a Frobenius manifold).

2. Inverse Reconstruction of a Frobenius Manifold

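In this section we review the construction of the local parametric solution (8) of the WDVV equations in terms of the coefficients φp of (9). The result is discussed in [12] and it is the main formula which allows one to reduce the problem of solving the WDVV equations to problems of isomonodromic deformations of linear systems of differential equations. As a first step we note that the condition ∇ dt̃ = 0 is satisfied both by a flat coordinate t̃^α and by t̃_α := η_{αβ} t̃^β (sum over β). Thus, we choose a fundamental matrix solution of (4), (5) of the form:

$$
G = (\partial_\alpha\tilde t_\beta) \equiv \left(\eta_{\alpha\gamma}\frac{\partial\tilde t_\beta}{\partial t_\gamma}\right) = \left[\sum_{p=0}^{\infty}H_p(t)z^p\right]z^{\mu}z^{R}, \qquad H_0 = I,
$$

close to z = 0. If we restrict to the system (4) only, we can choose as a fundamental solution

$$
H(z,t) := \sum_{p=0}^{\infty}H_p(t)z^p
$$

and so the flat coordinates of ∇ on M (not on M × C) have the expansion

$$
\tilde t_\alpha = \sum_{p=0}^{\infty}h_{\alpha,p}(t)z^p,
$$

where the functions h_{α,p} must satisfy

$$
h_{\alpha,0} = t_\alpha \equiv \eta_{\alpha\beta}t^\beta, \tag{23}
$$

$$
\partial_\gamma\partial_\beta h_{\alpha,p+1} = c^{\varepsilon}_{\gamma\beta}\,\partial_\varepsilon h_{\alpha,p}, \qquad p = 0, 1, 2, \dots. \tag{24}
$$

We stress that the normalization H0 = I is precisely what is necessary to have t̃_α(z = 0) = t_α and it corresponds exactly to Y0 = φ0G in (9). Observe that h_{α,0} = t_α ≡ η_{αβ}t^β implies

$$
\partial_\beta h_{\alpha,0} = \eta_{\beta\alpha} \equiv c_{\beta\alpha 1}. \tag{25}
$$

Denote by ∇f := (η^{αβ}∂_β f)∂_α the gradient of the function f. We claim that

$$
t_\alpha = \langle\nabla h_{\alpha,0},\nabla h_{1,1}\rangle \equiv \eta^{\mu\nu}\,\partial_\mu h_{\alpha,0}\,\partial_\nu h_{1,1} \tag{26}
$$

are flat coordinates and

$$
F(t) = \frac12\Bigl[\langle\nabla h_{\alpha,1},\nabla h_{1,1}\rangle\,\eta^{\alpha\beta}\,\langle\nabla h_{\beta,0},\nabla h_{1,1}\rangle - \langle\nabla h_{1,1},\nabla h_{1,2}\rangle - \langle\nabla h_{1,3},\nabla h_{1,0}\rangle\Bigr] \tag{27}
$$

solves the WDVV equations. To prove it, it is enough to check by direct differentiation that ∂_α t_β = η_{αβ} and ∂_α∂_β∂_γ F(t) = c_{αβγ}(t), using (23)–(25) and ... some patience.

In the following, we denote the entry (i, j) of a matrix A_k by A_{ij,k}. Recall that

$$
\partial_\mu = \frac{\partial}{\partial t^\mu}, \qquad \partial_i = \frac{\partial}{\partial u_i} \qquad\text{and}\qquad \partial_\mu = \sum_{i=1}^{n}\frac{\phi_{i\mu,0}}{\phi_{i1,0}}\,\partial_i.
$$

Therefore Y_{iα} = (1/φ_{i1,0}) ∂_i t̃_α. It follows that (1/φ_{i1,0}) ∂_i h_{α,p} = φ_{iα,p} and thus

$$
t_\alpha(u) = \sum_{i=1}^{n}\phi_{i\alpha,0}\,\phi_{i1,1}, \tag{28}
$$

$$
F(t(u)) = \frac12\left[t^\alpha t^\beta\sum_{i=1}^{n}\phi_{i\alpha,0}\,\phi_{i\beta,1} - \sum_{i=1}^{n}\bigl(\phi_{i1,1}\phi_{i1,2} + \phi_{i1,3}\phi_{i1,0}\bigr)\right]. \tag{29}
$$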

It is now clear that we can locally reconstruct a Frobenius manifold from the matrices φ0(u), φ1(u), φ2(u), φ3(u) of (9). They are obtained as solutions of the b.v.p. and thus they depend on the monodromy data. Their analytic continuation extends the Frobenius structure on (C^n\diagonals)/S_n, as explained in the Introduction. To understand the global structure of the manifold we need to study the critical behavior of φp(u) and of (28), (29) as ui → uj.

Instead of solving the b.v.p. directly, we exploit the isomonodromic deformation theory. Let us again consider a solution of the b.v.p. of the form

$$
Y_0(z,u) = \left[\sum_{p=0}^{\infty}\phi_p(u)z^p\right]z^{\mu}z^{R}.
$$

It also satisfies (7) and (6) as we explained in the Introduction. Since ∂R/∂ui = 0, if we plug Y0 into (6) we get

$$
\frac{\partial\phi_p}{\partial u_i} = E_i\phi_{p-1} + V_i\phi_p. \tag{30}
$$


Let Φ(z, u) := Σ_{p=0}^∞ φp(u) z^p. The condition Φ(−z, u)^T Φ(z, u) = η holds¹ and it implies

$$
\phi_0^T\phi_0 = \eta, \qquad \sum_{p=0}^{m}(-1)^p\,\phi_p^T\phi_{m-p} = 0 \quad\text{for any } m > 0. \tag{31}
$$

We conclude that φp(u) can be obtained either by solving the b.v.p. or by solving (30) with the condition (31).

3. Inverse Reconstruction of Two-Dimensional Frobenius Manifolds

Let n = 2. In this section we explain the inverse reconstruction of a semi-simple Frobenius manifold for n = 2 through the formulae (28), (29). The two-dimensional case is exactly solved by elementary methods, so our purpose here is didactic: we clarify the procedure which will be followed in the nontrivial three-dimensional case.

3.1. EXACT SOLUTION IN DIMENSION 2 AND MONODROMY DATA

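The coefficients of the system (5) are necessarily:

$$
V(u) = \begin{pmatrix} 0 & i\frac{\sigma}{2}\\ -i\frac{\sigma}{2} & 0 \end{pmatrix}, \qquad U = \mathrm{diag}(u_1,u_2).
$$

Here u = (u1, u2), and V is independent of u. It has the diagonal form

$$
\mu = \phi_0^{-1}V\phi_0 = \mathrm{diag}\left(\frac{\sigma}{2},-\frac{\sigma}{2}\right) \;\Longrightarrow\; \mu_1 = \frac{\sigma}{2},\quad \mu_2 = -\frac{\sigma}{2},\quad d = -\sigma,
$$

where

$$
\phi_0(u) = \begin{pmatrix} \dfrac{1}{2f(u)} & f(u)\\[6pt] \dfrac{1}{2if(u)} & if(u) \end{pmatrix}, \qquad \phi_0^T\phi_0 = \eta := \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}.
$$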

¹ The symmetries ηµ + µ^T η = 0, U^T η = ηU imply that ξ1(−z, t)^T η ξ2(z, t) is independent of z for any two solutions ξ1(z, t), ξ2(z, t) of (5). We choose a fundamental matrix solution of (4), (5) of the form:

$$
G = \left[\sum_{p=0}^{\infty}H_p(t)z^p\right]z^{\mu}z^{R}, \qquad H_0 = I,
$$

close to z = 0. Let H(z, t) := Σ_{p=0}^∞ H_p(t) z^p. Then H(−z, t)^T η H(z, t) = η. Now, Φ(z, u) = φ0(u)H(z, t(u)), therefore

$$
\Phi(-z,u)^T\,\Phi(z,u) = \eta.
$$


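The computation of the Stokes matrix of

$$
\frac{dY}{dz} = \left[\begin{pmatrix} u_1 & 0\\ 0 & u_2 \end{pmatrix} + \frac{1}{z}\begin{pmatrix} 0 & i\frac{\sigma}{2}\\ -i\frac{\sigma}{2} & 0 \end{pmatrix}\right]Y \tag{32}
$$

requires taking into account the two oriented half-lines

$$
R_{12} = \{z = -i\rho\,(\bar u_1 - \bar u_2),\ \rho > 0\}, \qquad R_{21} = -R_{12}.
$$

Let l be an oriented line through the origin, having R12 to the left. Then YL(z, u) = YR(z, u)S on l+ and YL(z, u) = YR(z, u)S^T on l−. The Stokes matrix is

$$
S = \begin{pmatrix} 1 & s\\ 0 & 1 \end{pmatrix}, \qquad s\in\mathbb{C}.
$$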

At the origin we have the solution

$$
Y_0(z,u) = \left[\sum_{k=0}^{\infty}\phi_k(u)z^k\right]z^{\mu}z^{R}. \tag{33}
$$

It is connected to YR through the invertible matrix C according to: Y0(z, u) = YR(z, u)C, z ∈ Π_R (recall that Π_R is the half plane to the right of l). Then

$$
S^TS^{-1} = C\,e^{2\pi i\,\mathrm{diag}\left(\frac{\sigma}{2},\,-\frac{\sigma}{2}\right)}\,e^{2\pi iR}\,C^{-1}.
$$

From the trace, s² = 2(1 − cos(πσ)).
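The trace computation can be spelled out as follows. On the left, S^T S^{-1} has trace 2 − s². On the right, R is strictly triangular, so e^{2πiR} is unipotent and the trace reduces to that of e^{2πiµ}, namely 2 cos(πσ). The sketch below checks this numerically for a sample σ, using the value s = 2 sin(πσ/2); the function names are ours.

```python
import cmath

def trace_left(s):
    """Trace of S^T S^{-1} for S = [[1, s], [0, 1]], computed entry by entry."""
    S_T = [[1, 0], [s, 1]]
    S_inv = [[1, -s], [0, 1]]
    prod = [[sum(S_T[i][k] * S_inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]
    return prod[0][0] + prod[1][1]

def trace_right(sigma):
    """Trace of exp(2*pi*i*mu) with mu = diag(sigma/2, -sigma/2).

    The unipotent factor exp(2*pi*i*R) and the conjugation by C do not change it.
    """
    return cmath.exp(1j * cmath.pi * sigma) + cmath.exp(-1j * cmath.pi * sigma)

sigma = 0.37 + 0.2j                         # arbitrary sample value
s = 2 * cmath.sin(cmath.pi * sigma / 2)
gap_trace = abs(trace_left(s) - trace_right(sigma))
gap_s2 = abs(s ** 2 - 2 * (1 - cmath.cos(cmath.pi * sigma)))
```

Both gaps vanish up to rounding, which is the identity s² = 2(1 − cos(πσ)) = 4 sin²(πσ/2).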

The above monodromy data R, µ, S define the boundary-value problem to reconstruct the system (5). The standard technique to solve a two-dimensional boundary-value problem is to reduce it to a system of differential equations, which is (32) in our case, and then to reduce the system to a second-order differential equation. It turns out that the equation is (after a change of dependent and independent variables) a Whittaker equation. Therefore, the solution of the b.v.p. is given in terms of Whittaker functions W_{κ,µ}. Let H := u1 − u2; the fundamental solutions are

$$
Y_R(z,u) = \begin{pmatrix}
e^{i\frac{\pi}{2}}(Hz)^{-\frac12}\,e^{z\frac{u_1+u_2}{2}}\,W_{\frac12,\frac{\sigma}{2}}\!\left(e^{-i\pi}Hz\right) &
-i\frac{\sigma}{2}\,(Hz)^{-\frac12}\,e^{z\frac{u_1+u_2}{2}}\,W_{-\frac12,\frac{\sigma}{2}}(Hz)\\[8pt]
i\frac{\sigma}{2}\,e^{i\frac{\pi}{2}}(Hz)^{-\frac12}\,e^{z\frac{u_1+u_2}{2}}\,W_{-\frac12,\frac{\sigma}{2}}\!\left(e^{-i\pi}Hz\right) &
(Hz)^{-\frac12}\,e^{z\frac{u_1+u_2}{2}}\,W_{\frac12,\frac{\sigma}{2}}(Hz)
\end{pmatrix}
$$

for arg(R12) < arg(z) < arg(R12) + 2π, where arg(R12) := −(π/2) − arg(u1 − u2), and

$$
Y_L(z,u) = \begin{pmatrix}
(Y_R(z,u))_{11} &
i\frac{\sigma}{2}\,(Hz)^{-\frac12}\,e^{z\frac{u_1+u_2}{2}}\,W_{-\frac12,\frac{\sigma}{2}}\!\left(e^{-2i\pi}Hz\right)\\[8pt]
(Y_R(z,u))_{21} &
-(Hz)^{-\frac12}\,e^{z\frac{u_1+u_2}{2}}\,W_{\frac12,\frac{\sigma}{2}}\!\left(e^{-2i\pi}Hz\right)
\end{pmatrix}
$$

for arg(R12) + π < arg(z) < arg(R12) + 3π.


For the choice of YR and YL above, the sign of s can also be determined. According to our computations from the expansion of YR and YL at z = 0, it is s = 2 sin(πσ/2).

We stress that the only monodromy data are σ and the nonzero entry of R. The purpose of this didactic chapter is to show that (28) and (29) bring solutions F(t) explicitly parameterized by σ and R.

3.2. PRELIMINARY COMPUTATIONS

The functions φp(u) to be plugged into (28), (29) may be derived from the above representations in terms of Whittaker functions. We prefer to proceed in a different way, namely by imposing the conditions of isomonodromicity (30) and the constraint (31) on the solution (33). This is the procedure we will also follow in the three-dimensional case.

The function f(u) in φ0(u) is arbitrary, but subject to the condition of isomonodromicity (30) for p = 0, namely ∂iφ0 = Viφ0, where V1 = V/(u1 − u2), V2 = −V1. Let 𝒰 := φ0^{−1}Uφ0. We will use h(u) to denote an arbitrary function of u. Let us also denote the entry (i, j) of a matrix A_k by A_{ij,k} or by (A_k)_{ij}, according to convenience. Let us decompose R = R1 + R2 + R3 + ···, where R_{ij,k} ≠ 0 only if µi − µj = k > 0 integer. In order to compute φp(u) of (33) we decompose it (and define Hp(u)) as follows:

φp(u) := φ0 Hp(u), p = 0, 1, 2, ....

Plugging the above into (32) we obtain

(1∗∗) φ0 is given.

(2∗∗) µ1 ≠ ±1/2: H_{ij,1} = 𝒰_{ij}/(1 + µj − µi), R1 = 0;
µ1 = 1/2: H_{12,1} = h1(u), R_{12,1} = 𝒰_{12};
µ1 = −1/2: H_{21,1} = h1(u), R_{21,1} = 𝒰_{21}.

(3∗∗) µ1 ≠ ±1: H_{ij,2} = (𝒰H1 − H1R1)_{ij}/(2 + µj − µi), R2 = 0;
µ1 = 1: H_{12,2} = h2(u), R_{12,2} = (𝒰H1)_{12}, R1 = 0;
µ1 = −1: H_{21,2} = h2(u), R_{21,2} = (𝒰H1)_{21}, R1 = 0.

(4∗∗) µ1 ≠ ±3/2: H_{ij,3} = (𝒰H2 − H1R2 − H2R1)_{ij}/(3 + µj − µi), R3 = 0;
µ1 = 3/2: H_{12,3} = h3(u), R_{12,3} = (𝒰H2)_{12}, R1 = R2 = 0;
µ1 = −3/2: H_{21,3} = h3(u), R_{21,3} = (𝒰H2)_{21}, R1 = R2 = 0.

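For any value of σ the isomonodromicity condition ∂iφ0 = Viφ0 reads

$$
\frac{\partial f(u)}{\partial u_1} = -\frac{\sigma}{2}\,\frac{f(u)}{u_1-u_2}, \qquad \frac{\partial f(u)}{\partial u_2} = \frac{\sigma}{2}\,\frac{f(u)}{u_1-u_2}.
$$

In other words,

$$
\frac{\partial f(u)}{\partial u_1} = -\frac{\partial f(u)}{\partial u_2}
$$

and thus f(u) ≡ f(u1 − u2). Let H := u1 − u2. Therefore

$$
\frac{df(H)}{dH} = -\frac{\sigma}{2}\,\frac{f(H)}{H} \;\Longrightarrow\; f(H) = C\,H^{-\frac{\sigma}{2}}, \qquad C \text{ a constant}.
$$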

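We are ready to compute t = t(u), F = F(t(u)) from the formulae

$$
t^1 = \sum_{i=1}^{2}\phi_{i2,0}\,\phi_{i1,1}, \qquad t^2 = \sum_{i=1}^{2}\phi_{i1,0}\,\phi_{i1,1}, \tag{34}
$$

$$
F = \frac12\left[t^\alpha t^\beta\sum_{i=1}^{2}\phi_{i\alpha,0}\,\phi_{i\beta,1} - \sum_{i=1}^{2}\bigl(\phi_{i1,1}\phi_{i1,2} + \phi_{i1,3}\phi_{i1,0}\bigr)\right] \tag{35}
$$

and to reduce them to closed form.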

3.3. THE GENERIC CASE

We start from the generic case of σ not integer. The result of the application of formulae (34), (35) is

$$
t^1 = \frac{u_1+u_2}{2}, \qquad t^2 = \frac{1}{4(1+\sigma)}\,\frac{u_1-u_2}{f(u)^2}
$$

and

$$
F(t(u)) = \frac12(t^1)^2t^2 + \frac{2(1+\sigma)^3}{(1-\sigma)(\sigma+3)}\,(t^2)^3f(u)^4.
$$

But now observe that

$$
f(u)^2 = \frac{u_1-u_2}{4(1+\sigma)\,t^2}, \qquad f(u)^2 \equiv f(H)^2 = C^2H^{-\sigma}, \qquad u_1-u_2 = H.
$$

The above three expressions imply H = C1 (t^2)^{1/(1+σ)}, where C1 = [4(1 + σ)C²]^{1/(1+σ)}. Therefore f(u)^4 = C2 (t^2)^{−2σ/(1+σ)}, where C2 is a constant determined by C1 (we do not need to compute it explicitly in terms of C1 or C). Finally,

$$
F(t) = \frac12(t^1)^2t^2 + C_3\,(t^2)^{\frac{\sigma+3}{\sigma+1}}.
$$

Here C3 is another constant, determined by C2.
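The elimination of H in the generic case amounts to a statement about exponents, which is easy to test numerically. In the sketch below (function names and the value of the constant C are ours, chosen only for illustration) we take f(H) = C H^{−σ/2}, build t^2 and the term (t^2)³ f^4 as functions of H, and check that their ratio to (t^2)^{(σ+3)/(σ+1)} does not depend on H.

```python
def f_of_H(H, sigma, C=1.5):
    """f(H) = C * H^(-sigma/2), the solution of the isomonodromicity condition."""
    return C * H ** (-sigma / 2)

def t2_of_H(H, sigma, C=1.5):
    """t^2 = (u1 - u2) / (4 (1 + sigma) f^2) with u1 - u2 = H."""
    return H / (4 * (1 + sigma) * f_of_H(H, sigma, C) ** 2)

def scaling_constant(H, sigma, C=1.5):
    """Ratio of (t^2)^3 f^4 to (t^2)^((sigma+3)/(sigma+1)); should not depend on H."""
    t2 = t2_of_H(H, sigma, C)
    return t2 ** 3 * f_of_H(H, sigma, C) ** 4 / t2 ** ((sigma + 3) / (sigma + 1))

c_small = scaling_constant(2.0, 0.5)
c_large = scaling_constant(7.0, 0.5)
```

The two values agree, so the non-quadratic part of F is indeed a constant multiple of (t^2)^{(σ+3)/(σ+1)}. Real positive H and σ > −1 are used here only to avoid choosing branches of the powers.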


3.4. THE CASES µ1 = 3/2, µ1 = 1, µ1 = −1

(1) Case µ1 = 3/2, σ = 3. Formula (34) gives the same result as the generic case (with σ = 3) because h3(u) appears only in φ3(u) and does not affect t:

$$
t^1 = \frac{u_1+u_2}{2}, \qquad t^2 = \left.\frac{1}{4(1+\sigma)}\,\frac{u_1-u_2}{f(u)^2}\right|_{\sigma=3}.
$$

Although h3(u) appears in φ3(u), it does not appear in F:

$$
F(t(u)) = \left.\frac12(t^1)^2t^2 + \frac{2(1+\sigma)^3}{(1-\sigma)(\sigma+3)}\,(t^2)^3f(u)^4\right|_{\sigma=3}.
$$

We may proceed as in the generic case. Actually, now the computation of f(u) is straightforward because

$$
R_3 = \begin{pmatrix} 0 & -\frac{1}{16}(u_1-u_2)^3f(u)^2\\ 0 & 0 \end{pmatrix} \equiv \begin{pmatrix} 0 & r\\ 0 & 0 \end{pmatrix}, \qquad r = \text{constant}.
$$

Namely

$$
-\frac{1}{16}(u_1-u_2)^3f(u)^2 = r.
$$

On the other hand, from t^2 we have u1 − u2 = 16 t^2 f(u)² and thus

$$
f(u)^4 = \frac{(-r)^{\frac12}}{16\,(t^2)^{\frac32}}
$$

and, finally, since the coefficient 2(1+σ)³/((1−σ)(σ+3)) equals −32/3 at σ = 3,

$$
F(t) = \frac12(t^1)^2t^2 - \frac23(-r)^{\frac12}(t^2)^{\frac32} \equiv \frac12(t^1)^2t^2 + C\,(t^2)^{\frac32},
$$

where C is an arbitrary constant, depending on r.

(2) Case µ1 = 1, σ = 2. Again, the arbitrary function h2(u) does not appear in t(u) and F(t(u)):

$$
t^1 = \frac{u_1+u_2}{2}, \qquad t^2 = \left.\frac{1}{4(1+\sigma)}\,\frac{u_1-u_2}{f(u)^2}\right|_{\sigma=2},
$$

$$
F(t(u)) = \left.\frac12(t^1)^2t^2 + \frac{2(1+\sigma)^3}{(1-\sigma)(\sigma+3)}\,(t^2)^3f(u)^4\right|_{\sigma=2}.
$$

Now we proceed as in the generic case and we find the generic result with σ = 2.

(3) Case µ1 = −1, σ = −2. Now the formulae (34), (35) yield

$$
t^1 = \frac{u_1+u_2}{2}, \qquad t^2 = \frac14\,\frac{u_2-u_1}{f(u)^2},
$$

$$
F(t(u)) = \frac32(t^1)^2t^2 - \frac23(t^2)^3f(u)^4 - t^1h_2(u).
$$

The condition

$$
0 = \phi_0^T\phi_2 - \phi_1^T\phi_1 + \phi_2^T\phi_0 = \begin{pmatrix} 2h_2(u) + \dfrac{u_1^2-u_2^2}{4f(u)^2} & 0\\ 0 & 0 \end{pmatrix}
$$

implies

$$
h_2(u) = \frac18\,\frac{u_2^2-u_1^2}{f(u)^2} \equiv t^1t^2.
$$

Therefore

$$
F(t(u)) = \frac12(t^1)^2t^2 - \frac23(t^2)^3f(u)^4.
$$

Now we proceed as in the generic case, using f(H) = CH^{−σ/2} = CH, and we find the generic result with σ = −2.

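3.5. THE CASE µ1 = −1/2

We analyze the case µ1 = −1/2, σ = −1. The formula (34) gives

$$
t^1 = \frac{u_1+u_2}{2}, \qquad t^2 = h_1(u).
$$

By putting h1(u) = t^2 we get, from (35),

$$
F(t(u)) = \frac12(t^1)^2t^2 + \frac{1}{16}\,\frac{(u_1-t^1)^3}{f(u)^2} = \frac12(t^1)^2t^2 + \frac{1}{16}\,\frac{\left(\frac{u_1-u_2}{2}\right)^3}{f(u)^2}.
$$

It is straightforward to obtain f(u) from

$$
R_1 = \begin{pmatrix} 0 & 0\\ \dfrac{u_1-u_2}{4f(u)^2} & 0 \end{pmatrix} \equiv \begin{pmatrix} 0 & 0\\ r & 0 \end{pmatrix},
$$

namely,

$$
f(u)^2 = \frac{u_1-u_2}{4r} = \frac{1}{4r}\,H.
$$

The last thing we need is to determine H as a function of t^1, t^2. We cannot use the condition Φ(−z)^T Φ(z) = η, because direct computation shows that

$$
\phi_0^T\phi_1 - \phi_1^T\phi_0 = 0, \qquad \phi_0^T\phi_2 - \phi_1^T\phi_1 + \phi_2^T\phi_0 = 0, \qquad \phi_0^T\phi_3 - \phi_1^T\phi_2 + \phi_2^T\phi_1 - \phi_3^T\phi_0 = 0
$$

are identically satisfied. We make use of the isomonodromicity conditions

$$
\frac{\partial\phi_1}{\partial u_1} = E_1\phi_0 + V_1\phi_1, \qquad \frac{\partial\phi_1}{\partial u_2} = E_2\phi_0 + V_2\phi_1,
$$

which become

$$
\frac{\partial h_1(u)}{\partial u_1} = \frac{1}{4f(u)^2}, \qquad \frac{\partial h_1(u)}{\partial u_2} = -\frac{\partial h_1(u)}{\partial u_1}.
$$

Thus

$$
h_1(u) \equiv h_1(u_1-u_2)
$$

and

$$
\frac{dh_1(H)}{dH} = \frac{r}{H} \;\Longrightarrow\; t^2 \equiv h_1(H) = r\ln(H) + D,
$$

D a constant. Thus

$$
f(u)^4 = \frac{H^2}{16r^2} = C\,e^{\frac{2t^2}{r}},
$$

where C is a constant (C = exp(−2D/r)/(16r²)). Since the non-quadratic term of F is proportional to H², we get the final result

$$
F(t) = \frac12(t^1)^2t^2 + C\,e^{\frac{2t^2}{r}},
$$

with a new constant C.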

3.6. THE CASE µ1 = 1/2

Let µ1 = 1/2, σ = 1. t is as in the generic case,

$$
t^1 = \frac{u_1+u_2}{2}, \qquad t^2 = \frac{u_1-u_2}{8f(u)^2},
$$

while F contains h1(u):

$$
F(t(u)) = \frac12(t^1)^2t^2 + \frac12(t^2)^2h_1(u) - 3(t^2)^3f(u)^4.
$$

We can determine f(u) as in the generic case, or better we observe that

$$
R_1 = \begin{pmatrix} 0 & (u_1-u_2)f(u)^2\\ 0 & 0 \end{pmatrix} \equiv \begin{pmatrix} 0 & r\\ 0 & 0 \end{pmatrix}.
$$

Thus f(u)² = r/(u1 − u2). We determine h1(u). The condition Φ(−z)^T Φ(z) = η does not help, because it is automatically satisfied. We use the isomonodromicity conditions

$$
\frac{\partial\phi_1}{\partial u_1} = E_1\phi_0 + V_1\phi_1, \qquad \frac{\partial\phi_1}{\partial u_2} = E_2\phi_0 + V_2\phi_1,
$$

which become

$$
\frac{\partial h_1(u)}{\partial u_1} = f(u)^2, \qquad \frac{\partial h_1(u)}{\partial u_2} = -\frac{\partial h_1(u)}{\partial u_1}.
$$

Therefore h1(u) ≡ h1(u1 − u2). Then, taking into account that f(u)² = r/H, we obtain

$$
\frac{dh_1(H)}{dH} = \frac{r}{H} \;\Longrightarrow\; h_1(H) = r\ln(H) + D,
$$

D being a constant. Finally, recall that

$$
t^2 = \frac{H}{8f(u)^2} \equiv \frac{H^2}{8r},
$$

hence f(u)^4 = r/(8t^2), so that the term −3(t^2)³f(u)^4 is only quadratic in t^2 and can be dropped from F(t), and h1(u) = (r/2) ln(t^2) + B, where B = (r/2) ln(8r) + D is an arbitrary constant. Finally,

$$
F(t) = \frac12(t^1)^2t^2 + \frac{r}{4}\,(t^2)^2\ln(t^2)
$$

as we wanted.

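3.7. THE CASE µ1 = −3/2

Finally, let’s take µ1 = −3/2, σ = −3. From (34), (35) we have

$$
t^1 = \frac{u_1+u_2}{2}, \qquad t^2 = \frac{u_2-u_1}{8f(u)^2},
$$

$$
F(t(u)) = \frac34(t^1)^2t^2 + (t^2)^3f(u)^4 - \frac12h_3(u).
$$

f(u) is obtainable as in the generic case, but it is straightforward to use

$$
R_3 = \begin{pmatrix} 0 & 0\\ \dfrac{(u_2-u_1)^3}{64f(u)^2} & 0 \end{pmatrix} \equiv \begin{pmatrix} 0 & 0\\ r & 0 \end{pmatrix} \;\Longrightarrow\; f(u)^2 = \frac{(u_2-u_1)^3}{64r}.
$$

To obtain h3(u) we cannot rely on Φ(−z)^T Φ(z) = η, which turns out to be identically satisfied. We use again the conditions

$$
\frac{\partial\phi_3}{\partial u_i} = E_i\phi_2 + V_i\phi_3. \tag{36}
$$

It is convenient to introduce

$$
G(u) := \frac14(t^1)^2t^2 - \frac12h_3(u).
$$

The above (36) becomes

$$
\frac{\partial G}{\partial u_1} = \frac{r}{2(u_2-u_1)}, \qquad \frac{\partial G}{\partial u_2} = -\frac{\partial G}{\partial u_1},
$$

which implies G(u) = G(u1 − u2) and, with H = u1 − u2,

$$
\frac{dG}{dH} = -\frac{r}{2H} \;\Longrightarrow\; G(H) = -\frac{r}{2}\ln(H) + C,
$$

where C is a constant. Finally, inserting f(u)² = (u2 − u1)³/(64r) into the expression of t^2, we find

$$
t^2 = \frac{8r}{H^2} \;\Longrightarrow\; G(H(t^2)) = \frac{r}{4}\ln(t^2) + C_1.
$$

Moreover (t^2)³f(u)^4 = r/8 is a constant. Thus

$$
F(t) = \frac12(t^1)^2t^2 + \frac{r}{4}\ln(t^2),
$$

having dropped the constant terms.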

3.8. CONCLUSIONS

The solution of the boundary-value problem for the monodromy data σ and the nonzero entry r of the matrix R was obtained by solving Equations (30) with the constraints (31).

We have obtained the solutions of the WDVV equations from (34), (35). They can also be derived from elementary considerations (see [11]). Here it becomes clear that they depend explicitly on the monodromy data σ, r. For σ ≠ ±1, −3,

$$
F(t) = \frac12(t^1)^2t^2 + C\,(t^2)^{\frac{3+\sigma}{1+\sigma}},
$$

where C is a constant, while

$$
\begin{aligned}
\sigma = -1:&\quad F(t) = \frac12(t^1)^2t^2 + C\,e^{\frac{2t^2}{r}},\\
\sigma = 1:&\quad F(t) = \frac12(t^1)^2t^2 + C\,(t^2)^2\ln(t^2),\\
\sigma = -3:&\quad F(t) = \frac12(t^1)^2t^2 + C\ln(t^2).
\end{aligned}
$$

4. The Three-Dimensional Case: Computation of φ0 and V in Terms of Painlevé Transcendents

Let n = 3. In this section we explicitly compute φ0 and V in terms of a Painlevé VI transcendent y(x), and conversely we give a formula for y(x) in terms of the entries of φ0 and V.

We can bring η = (η_{αβ}) to the form [11]:

$$
\eta = \begin{pmatrix} 0 & 0 & 1\\ 0 & 1 & 0\\ 1 & 0 & 0 \end{pmatrix}.
$$

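Let

$$
V(u) = \begin{pmatrix} 0 & -M_3 & M_2\\ M_3 & 0 & -M_1\\ -M_2 & M_1 & 0 \end{pmatrix},
$$

which is similar to

$$
\mu = \mathrm{diag}(\mu,0,-\mu), \qquad \mu = \sqrt{-(M_1^2+M_2^2+M_3^2)} \ \text{ constant}, \qquad \mu = -\frac{d}{2}.
$$

By simple linear algebra, we find the eigenvectors of V. φ0 is precisely the matrix whose columns are the eigenvectors. We have to impose also the condition φ0^T φ0 = η and we find the most general form for φ0:

$$
\phi_0 = \begin{pmatrix}
\dfrac{i}{\sqrt2\,\mu}\,\dfrac{M_1M_2-\mu M_3}{(M_1^2+M_3^2)^{\frac12}}\,G(u) & \dfrac{M_1}{i\mu} & \dfrac{i}{\sqrt2\,\mu}\,\dfrac{M_1M_2+\mu M_3}{(M_1^2+M_3^2)^{\frac12}}\,\dfrac{1}{G(u)}\\[10pt]
-\dfrac{i}{\sqrt2\,\mu}\,(M_1^2+M_3^2)^{\frac12}\,G(u) & \dfrac{M_2}{i\mu} & -\dfrac{i}{\sqrt2\,\mu}\,(M_1^2+M_3^2)^{\frac12}\,\dfrac{1}{G(u)}\\[10pt]
\dfrac{i}{\sqrt2\,\mu}\,\dfrac{M_2M_3+\mu M_1}{(M_1^2+M_3^2)^{\frac12}}\,G(u) & \dfrac{M_3}{i\mu} & \dfrac{i}{\sqrt2\,\mu}\,\dfrac{M_2M_3-\mu M_1}{(M_1^2+M_3^2)^{\frac12}}\,\dfrac{1}{G(u)}
\end{pmatrix},
$$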

where G(u) is so far an arbitrary function of u = (u1, u2, u3). To determine it we must impose the isomonodromicity condition (30) for p = 0

$$
\frac{\partial\phi_0}{\partial u_i} = V_i(u)\,\phi_0. \tag{37}
$$

We observe that φ_{i2,0} = M_i/(iµ), i = 1, 2, 3. If we compute the entries of (37) for the φ_{i2,0}’s we recover the equation ∂iV = [Vi, V]. In particular, we note that Σ_i ∂iV = Σ_i ui ∂iV = 0. Thus V(u1, u2, u3) ≡ V(x), where

$$
x = \frac{u_3-u_1}{u_2-u_1}.
$$

Therefore, ∂iV = [Vi, V] becomes:

$$
\frac{dM_1}{dx} = \frac{1}{x}\,M_2M_3, \qquad \frac{dM_2}{dx} = \frac{1}{1-x}\,M_1M_3, \qquad \frac{dM_3}{dx} = \frac{1}{x(x-1)}\,M_1M_2. \tag{38}
$$
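The constancy of µ² = −(M1² + M2² + M3²) is a direct consequence of (38): the derivative of the sum of squares is 2 M1M2M3 [1/x + 1/(1 − x) + 1/(x(x − 1))], and the bracket vanishes identically. The following sketch checks this with exact rational arithmetic at an arbitrary point; the function names are ours.

```python
from fractions import Fraction

def rhs_38(x, M):
    """Right-hand sides of system (38) at the point x for M = (M1, M2, M3)."""
    M1, M2, M3 = M
    return (M2 * M3 / x, M1 * M3 / (1 - x), M1 * M2 / (x * (x - 1)))

def derivative_of_sum_of_squares(x, M):
    """d/dx (M1^2 + M2^2 + M3^2) along a solution of (38)."""
    return 2 * sum(Mi * dMi for Mi, dMi in zip(M, rhs_38(x, M)))

x0 = Fraction(2, 7)
M0 = (Fraction(3), Fraction(-5, 2), Fraction(7, 3))
drift = derivative_of_sum_of_squares(x0, M0)
```

The value `drift` is exactly zero for any choice of x0 (different from 0 and 1) and of M0.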

Equations (37), (38) are reduced in [11] to a special case of the VIth Painlevé equation, with the following choice of the parameters (in the standard notation of [18]):

$$
\alpha = \frac{(2\mu-1)^2}{2}, \qquad \beta = \gamma = 0, \qquad \delta = \frac12.
$$

Namely:

$$
\frac{d^2y}{dx^2} = \frac12\left[\frac1y+\frac1{y-1}+\frac1{y-x}\right]\left(\frac{dy}{dx}\right)^2 - \left[\frac1x+\frac1{x-1}+\frac1{y-x}\right]\frac{dy}{dx} + \frac12\,\frac{y(y-1)(y-x)}{x^2(x-1)^2}\left[(2\mu-1)^2+\frac{x(x-1)}{(y-x)^2}\right], \qquad \mu\in\mathbb{C}. \tag{39}
$$
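To make the choice of parameters concrete, the sketch below codes the right-hand side of (39) and the right-hand side of the general Painlevé VI equation in its usual form with parameters (α, β, γ, δ), and compares them at a sample point. It assumes the common normalization in which the last term of the general equation is y(y−1)(y−x)/(x²(x−1)²) · [α + βx/y² + γ(x−1)/(y−1)² + δx(x−1)/(y−x)²]; the function names are ours.

```python
def pvi_mu_rhs(x, y, dy, mu):
    """Right-hand side of equation (39), i.e. d^2y/dx^2 for PVI_mu."""
    return (0.5 * (1 / y + 1 / (y - 1) + 1 / (y - x)) * dy ** 2
            - (1 / x + 1 / (x - 1) + 1 / (y - x)) * dy
            + 0.5 * y * (y - 1) * (y - x) / (x ** 2 * (x - 1) ** 2)
            * ((2 * mu - 1) ** 2 + x * (x - 1) / (y - x) ** 2))

def pvi_general_rhs(x, y, dy, alpha, beta, gamma, delta):
    """Right-hand side of the general Painleve VI equation (assumed normalization)."""
    return (0.5 * (1 / y + 1 / (y - 1) + 1 / (y - x)) * dy ** 2
            - (1 / x + 1 / (x - 1) + 1 / (y - x)) * dy
            + y * (y - 1) * (y - x) / (x ** 2 * (x - 1) ** 2)
            * (alpha + beta * x / y ** 2 + gamma * (x - 1) / (y - 1) ** 2
               + delta * x * (x - 1) / (y - x) ** 2))

x, y, dy, mu = 0.3 + 0.2j, 1.7 - 0.4j, 0.5 + 0.1j, 0.25 + 0.6j   # sample point
gap = abs(pvi_mu_rhs(x, y, dy, mu)
          - pvi_general_rhs(x, y, dy, (2 * mu - 1) ** 2 / 2, 0, 0, 0.5))
```

The two right-hand sides coincide up to rounding. A function such as `pvi_mu_rhs` is also what one would pass to a numerical ODE integrator to continue y(x) away from the critical points.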

In the following, this equation will be referred to as PVIµ. Let H := u2 − u1, let y = y(x) be a Painlevé transcendent of PVIµ, and let

$$
k = k(x,H) := \frac{k_0\exp\left\{(2\mu-1)\displaystyle\int^{x}d\zeta\,\frac{y(\zeta)-\zeta}{\zeta(\zeta-1)}\right\}}{H^{2\mu-1}}, \qquad k_0\in\mathbb{C}\setminus\{0\}.
$$


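LEMMA. The following φ0 is the general solution of (37):

$$
\phi_{13,0} = i\,\frac{\sqrt{k}\sqrt{y}}{\sqrt{H}\sqrt{x}}, \qquad
\phi_{23,0} = i\,\frac{\sqrt{k}\sqrt{y-1}}{\sqrt{H}\sqrt{1-x}}, \qquad
\phi_{33,0} = -\frac{\sqrt{k}\sqrt{y-x}}{\sqrt{H}\sqrt{x}\sqrt{1-x}},
$$

$$
\phi_{12,0} = \frac{1}{\mu}\,\frac{\sqrt{y-1}\sqrt{y-x}}{\sqrt{x}}\left[\frac{A}{(y-1)(y-x)}+\mu\right],
$$

$$
\phi_{22,0} = \frac{1}{\mu}\,\frac{\sqrt{y}\sqrt{y-x}}{\sqrt{1-x}}\left[\frac{A}{y(y-x)}+\mu\right],
$$

$$
\phi_{32,0} = \frac{i}{\mu}\,\frac{\sqrt{y}\sqrt{y-1}}{\sqrt{x}\sqrt{1-x}}\left[\frac{A}{y(y-1)}+\mu\right],
$$

$$
\phi_{11,0} = \frac{i}{2\mu^2}\,\frac{\sqrt{H}\sqrt{y}}{\sqrt{k}\sqrt{x}}\left[A\left(B+\frac{2\mu}{y}\right)+\mu^2(y-1-x)\right],
$$

$$
\phi_{21,0} = \frac{i}{2\mu^2}\,\frac{\sqrt{H}\sqrt{y-1}}{\sqrt{k}\sqrt{1-x}}\left[A\left(B+\frac{2\mu}{y-1}\right)+\mu^2(y+1-x)\right],
$$

$$
\phi_{31,0} = -\frac{1}{2\mu^2}\,\frac{\sqrt{H}\sqrt{y-x}}{\sqrt{k}\sqrt{x}\sqrt{1-x}}\left[A\left(B+\frac{2\mu}{y-x}\right)+\mu^2(y-1+x)\right],
$$

where

$$
A = A(x) := \frac12\left[\frac{dy}{dx}\,x(x-1) - y(y-1)\right], \qquad
B = B(x) := \frac{A}{y(y-1)(y-x)}.
$$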

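We can also rewrite φ0 as follows

$$
\phi_0 = \begin{pmatrix}
E_{11}/f & E_{12} & E_{13}f\\
E_{21}/f & E_{22} & E_{23}f\\
E_{31}/f & E_{32} & E_{33}f
\end{pmatrix},
$$

where

$$
f = f(x,H) := i\,\frac{\sqrt{k}\sqrt{y-1}}{\sqrt{H}\sqrt{1-x}}, \qquad E_{i2} := \frac{M_i}{i\mu}, \quad i = 1, 2, 3,
$$

$$
E_{11} := \frac{M_1M_2-\mu M_3}{2\mu^2}, \qquad E_{13} := -\frac{M_1M_2+\mu M_3}{M_1^2+M_3^2},
$$

$$
E_{21} := -\frac{M_1^2+M_3^2}{2\mu^2}, \qquad E_{23} := 1,
$$

$$
E_{31} := \frac{M_2M_3+\mu M_1}{2\mu^2}, \qquad E_{33} := -\frac{M_2M_3-\mu M_1}{M_1^2+M_3^2},
$$

and

$$
M_1 = i\,\frac{\sqrt{y-1}\sqrt{y-x}}{\sqrt{x}}\left[\frac{A}{(y-1)(y-x)}+\mu\right],
$$

$$
M_2 = i\,\frac{\sqrt{y}\sqrt{y-x}}{\sqrt{1-x}}\left[\frac{A}{y(y-x)}+\mu\right],
$$

$$
M_3 = -\frac{\sqrt{y}\sqrt{y-1}}{\sqrt{x}\sqrt{1-x}}\left[\frac{A}{y(y-1)}+\mu\right].
$$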

The branches (signs) in the square roots above are arbitrary. A change of the sign of one root (for example of √H) implies a change of two signs in (M1, M2, M3), or the change (φ_{i1,0}, φ_{i3,0}) ↦ −(φ_{i1,0}, φ_{i3,0}). The reader may verify that all these changes do not affect the equations for φ0 and V.
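A simple check of the expressions for M1, M2, M3 is that they satisfy M1² + M2² + M3² = −µ², as required by the definition of µ in this section. This is a purely algebraic identity: it holds for arbitrary values of x, y and dy/dx, whether or not y solves PVIµ. The sketch below verifies it with exact rational arithmetic, working with the squares M_i² so that no choice of branch of the square roots is needed; the function name is ours.

```python
from fractions import Fraction

def M_squares(x, y, dy, mu):
    """Squares of M1, M2, M3 as given after the lemma (branch-independent)."""
    A = Fraction(1, 2) * (dy * x * (x - 1) - y * (y - 1))
    M1_sq = -(y - 1) * (y - x) / x * (A / ((y - 1) * (y - x)) + mu) ** 2
    M2_sq = -y * (y - x) / (1 - x) * (A / (y * (y - x)) + mu) ** 2
    M3_sq = y * (y - 1) / (x * (1 - x)) * (A / (y * (y - 1)) + mu) ** 2
    return M1_sq, M2_sq, M3_sq

x, y, dy, mu = Fraction(2, 5), Fraction(7, 3), Fraction(-3, 4), Fraction(5, 6)
total = sum(M_squares(x, y, dy, mu))
```

The value `total` equals −µ² exactly.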

Proof. The first proof of the lemma is direct substitution of the above φ0 into (37). Direct computation shows that (37) is satisfied if and only if y(x) satisfies the Painlevé equation PVIµ. (Also (38) is satisfied by the Mj’s above if and only if y(x) satisfies PVIµ.) The second proof is constructive. We derived φ0 from the link between the matrix φ0 and the 2 × 2 Fuchsian system associated to Painlevé VI in the theory of isomonodromic deformations developed in [22]. The construction of the Fuchsian system in terms of φ0 can be found in [11]. This construction also implies that φ0 above is the general solution of (37). The Fuchsian system is

$$
\frac{\partial X}{\partial\lambda} = -\mu\sum_{i=1}^{3}\frac{A_i(u)}{\lambda-u_i}\,X, \qquad \lambda\in\mathbb{C}, \tag{40}
$$

where

$$
A_i := \begin{pmatrix} \phi_{i1,0}\phi_{i3,0} & -\phi_{i3,0}^2\\ \phi_{i1,0}^2 & -\phi_{i1,0}\phi_{i3,0} \end{pmatrix}, \qquad A_1+A_2+A_3 = \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix}. \tag{41}
$$

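The system depends isomonodromically on u and it is solved by introducing the following coordinates q(u), p(u) in the space of matrices Ai modulo diagonal conjugation (see [11]): q is the root of

$$
\left(\sum_{i=1}^{3}\frac{A_i}{q-u_i}\right)_{12} = 0 \qquad\text{and}\qquad p := \left(\sum_{i=1}^{3}\frac{A_i}{q-u_i}\right)_{11}.
$$

The entries of the Ai’s are re-expressed as follows:

$$
\phi_{i1,0}\phi_{i3,0} = -\frac{q-u_i}{2\mu^2P'(u_i)}\left[P(q)p^2 + \frac{2\mu}{q-u_i}P(q)p + \mu^2\Bigl(q+2u_i-\sum_{j=1}^{3}u_j\Bigr)\right], \tag{42}
$$

$$
\phi_{i3,0}^2 = -k\,\frac{q-u_i}{P'(u_i)}, \tag{43}
$$

$$
\phi_{i1,0}^2 = -\frac{q-u_i}{4\mu^4P'(u_i)\,k}\left[P(q)p^2 + \frac{2\mu}{q-u_i}P(q)p + \mu^2\Bigl(q+2u_i-\sum_{j=1}^{3}u_j\Bigr)\right]^2. \tag{44}
$$

Here k is a parameter, P(z) = (z − u1)(z − u2)(z − u3). A solution of (40) must also satisfy

$$
\frac{\partial}{\partial u_i}\begin{pmatrix} X_1\\ X_2 \end{pmatrix} = \mu\,\frac{A_i}{\lambda-u_i}\begin{pmatrix} X_1\\ X_2 \end{pmatrix}. \tag{45}
$$

This is precisely the equation which implies that the dependence on u is isomonodromic. The compatibility of (40), (45) is

$$
\frac{\partial q}{\partial u_i} = \frac{P(q)}{P'(u_i)}\left[2p+\frac{1}{q-u_i}\right], \tag{46}
$$

$$
\frac{\partial p}{\partial u_i} = -\frac{P'(q)p^2 + \bigl(2q+u_i-\sum_{j=1}^{3}u_j\bigr)p + \mu(1-\mu)}{P'(u_i)}, \tag{47}
$$

$$
\frac{\partial\ln k}{\partial u_i} = (2\mu-1)\,\frac{q-u_i}{P'(u_i)}.
$$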

In the variables

$$
x = \frac{u_3-u_1}{u_2-u_1}, \qquad y = \frac{q-u_1}{u_2-u_1},
$$

the system (46), (47) becomes PVIµ. From a solution of PVIµ one can reconstruct

$$
q = (u_2-u_1)\,y\!\left(\frac{u_3-u_1}{u_2-u_1}\right) + u_1,
$$

$$
p = \frac12\,\frac{P'(u_3)}{P(q)}\,y'\!\left(\frac{u_3-u_1}{u_2-u_1}\right) - \frac12\,\frac{1}{q-u_3},
$$

and from the very definition of q we have:

$$
y(x) = \frac{xR(x)}{x[1+R(x)]-1}, \qquad R(x) := \left(\frac{\phi_{13,0}}{\phi_{23,0}}\right)^2 = \left(\frac{M_1M_2+\mu M_3}{\mu^2+M_2^2}\right)^2. \tag{48}
$$

This makes it possible to compute y(x) from a solution M1, M2, M3 of (38).

The explicit form for φ_{i1,0} and φ_{i3,0} is derived taking square roots of (43) and (44). The sign is chosen in order to satisfy (42). We then determine φ_{i2,0} from the equality

$$
(\phi_{12,0},\phi_{22,0},\phi_{32,0}) = \pm i\,\bigl(\phi_{21,0}\phi_{33,0}-\phi_{23,0}\phi_{31,0},\ \phi_{13,0}\phi_{31,0}-\phi_{11,0}\phi_{33,0},\ \phi_{11,0}\phi_{23,0}-\phi_{13,0}\phi_{21,0}\bigr). \qquad ✷
$$
5. Explicit Computation of the Flat Coordinates and of F for n = 3

Let t = (t^1, t^2, t^3), with higher indices. We compute the parametric form t = t(x, H) and F = F(x, H) using

$$
t^1 = \sum_{i=1}^{3}\phi_{i3,0}\,\phi_{i1,1}, \qquad t^2 = \sum_{i=1}^{3}\phi_{i2,0}\,\phi_{i1,1}, \qquad t^3 = \sum_{i=1}^{3}\phi_{i1,0}\,\phi_{i1,1}, \tag{49}
$$

$$
F = \frac12\left[t^\alpha t^\beta\sum_{i=1}^{3}\phi_{i\alpha,0}\,\phi_{i\beta,1} - \sum_{i=1}^{3}\bigl(\phi_{i1,1}\phi_{i1,2} + \phi_{i1,3}\phi_{i1,0}\bigr)\right]. \tag{50}
$$

We recall that µ1 = µ, µ2 = 0, µ3 = −µ. Let us compute φ1, φ2, φ3. We decompose φp := φ0Hp, p = 0, 1, 2, .... Hi appears in the fundamental matrix solution of (5):

$$
G(z,u) = (I + H_1z + H_2z^2 + H_3z^3 + \cdots)\,z^{\mu}z^{R}, \qquad z\to 0.
$$

By plugging G(z, u) into (5) we computed the Hi’s. We give their explicit expression below. In the formulae which follow we denote by h^{(k)}_{ij}(u) arbitrary functions of u = (u1, u2, u3), to be determined later; they appear any time 2µ ∈ Z and R is not zero. Let 𝒰 = φ0^{−1}Uφ0, where φ0 is given by the lemma.

(1∗∗∗) Computation of H1.

Generic case: H_{ij,1} = 𝒰_{ij}/(1 + µj − µi), R1 = 0.

µ = 1/2: H_{13,1} = h^{(1)}_{13}(u), R_{13,1} = 𝒰_{13}; H_{ij,1} = 𝒰_{ij}/(1 + µj − µi) if (i, j) ≠ (1, 3).

µ = −1/2: H_{31,1} = h^{(1)}_{31}(u), R_{31,1} = 𝒰_{31}; H_{ij,1} = 𝒰_{ij}/(1 + µj − µi) if (i, j) ≠ (3, 1).

µ = 1: H_{12,1} = h^{(1)}_{12}(u), R_{12,1} = 𝒰_{12}; H_{23,1} = h^{(1)}_{23}(u), R_{23,1} = 𝒰_{23}; H_{ij,1} = 𝒰_{ij}/(1 + µj − µi) if (i, j) ∉ {(1, 2), (2, 3)}.

µ = −1: H_{21,1} = h^{(1)}_{21}(u), R_{21,1} = 𝒰_{21}; H_{32,1} = h^{(1)}_{32}(u), R_{32,1} = 𝒰_{32}; H_{ij,1} = 𝒰_{ij}/(1 + µj − µi) if (i, j) ∉ {(2, 1), (3, 2)}.

(2∗∗∗) Computation of H2. Let 𝒰2 := 𝒰H1 − H1R1.

Generic case: H_{ij,2} = 𝒰_{ij,2}/(2 + µj − µi), R2 = 0.

µ = 1: H_{13,2} = h^{(2)}_{13}(u), R_{13,2} = 𝒰_{13,2}; H_{ij,2} = 𝒰_{ij,2}/(2 + µj − µi) if (i, j) ≠ (1, 3).

µ = −1: H_{31,2} = h^{(2)}_{31}(u), R_{31,2} = 𝒰_{31,2}; H_{ij,2} = 𝒰_{ij,2}/(2 + µj − µi) if (i, j) ≠ (3, 1).

µ = 2: H_{12,2} = h^{(2)}_{12}(u), R_{12,2} = 𝒰_{12,2}; H_{23,2} = h^{(2)}_{23}(u), R_{23,2} = 𝒰_{23,2}; H_{ij,2} = 𝒰_{ij,2}/(2 + µj − µi) if (i, j) ∉ {(1, 2), (2, 3)}.

µ = −2: H_{21,2} = h^{(2)}_{21}(u), R_{21,2} = 𝒰_{21,2}; H_{32,2} = h^{(2)}_{32}(u), R_{32,2} = 𝒰_{32,2}; H_{ij,2} = 𝒰_{ij,2}/(2 + µj − µi) if (i, j) ∉ {(2, 1), (3, 2)}.

(3∗∗∗) Computation of H3. Let 𝒰3 := 𝒰H2 − H2R1 − H1R2.

Generic case: H_{ij,3} = 𝒰_{ij,3}/(3 + µj − µi), R3 = 0.

µ = 3/2: H_{13,3} = h^{(3)}_{13}(u), R_{13,3} = 𝒰_{13,3}; H_{ij,3} = 𝒰_{ij,3}/(3 + µj − µi) if (i, j) ≠ (1, 3).

µ = −3/2: H_{31,3} = h^{(3)}_{31}(u), R_{31,3} = 𝒰_{31,3}; H_{ij,3} = 𝒰_{ij,3}/(3 + µj − µi) if (i, j) ≠ (3, 1).

µ = 3: H_{12,3} = h^{(3)}_{12}(u), R_{12,3} = 𝒰_{12,3}; H_{23,3} = h^{(3)}_{23}(u), R_{23,3} = 𝒰_{23,3}; H_{ij,3} = 𝒰_{ij,3}/(3 + µj − µi) if (i, j) ∉ {(1, 2), (2, 3)}.

µ = −3: H_{21,3} = h^{(3)}_{21}(u), R_{21,3} = 𝒰_{21,3}; H_{32,3} = h^{(3)}_{32}(u), R_{32,3} = 𝒰_{32,3}; H_{ij,3} = 𝒰_{ij,3}/(3 + µj − µi) if (i, j) ∉ {(2, 1), (3, 2)}.

5.1. THE GENERIC CASE µ ≠ ±1/2, ±1, ±3/2, ±2, ±3

Let µ ≠ ±1/2, ±1, ±3/2, ±2, ±3 and let

$$
\phi_0 = \begin{pmatrix}
E_{11}/f & E_{12} & E_{13}f\\
E_{21}/f & E_{22} & E_{23}f\\
E_{31}/f & E_{32} & E_{33}f
\end{pmatrix},
$$

where E_{ij} = E_{ij}(x) and f(x, H) are given in the lemma of Section 4 in terms of y(x). Direct computation gives the entries of H1, H2, H3, then φ1, φ2, φ3 and finally t and F from (49), (50). They are rational functions of x, y(x), y′(x), k(x, H). The computation is very hard and long: we need to substitute the entries of φ0, φ1, φ2 and φ3 in (49), (50) and do proper simplifications. We finally obtain the following

THEOREM 5.1. The flat coordinates (t^1, t^2, t^3) and the free energy F of a three-dimensional semi-simple Frobenius manifold such that µ = −d/2 is not equal to ±1/2, ±1, ±3/2, ±2, ±3, are given by the parametric formulae:

$$
t^1 = u_1 + a(x)H, \qquad
t^2 = \frac{1}{1+\mu}\,b(x)\,\frac{H}{f(x,H)}, \qquad
t^3 = \frac{1}{1+2\mu}\,c(x)\,\frac{H}{f(x,H)^2},
$$

$$
F = F_0(t) + \left[\frac{a_1(x)c(x)^2}{2(1-2\mu)(3+2\mu)} + \frac{(\mu+4)\,b(x)b_1(x)c(x)}{2(1-\mu)(2+\mu)(3+2\mu)} + \frac{b(x)^2\bigl(b_2(x)-a(x)\bigr)}{(2+\mu)(3+2\mu)}\right]\frac{H^3}{f(x,H)^2},
$$

$$
F_0(t) := \frac12\,t^1(t^2)^2 + \frac12\,(t^1)^2t^3,
$$

where

$$
\begin{aligned}
a(x) &:= E_{21}E_{23} + xE_{31}E_{33}, & b(x) &:= E_{22}E_{21} + xE_{32}E_{31}, & b_1(x) &:= E_{23}E_{22} + xE_{33}E_{32},\\
a_1(x) &:= E_{23}^2 + xE_{33}^2, & b_2(x) &:= E_{22}^2 + xE_{32}^2, & c(x) &:= E_{21}^2 + xE_{31}^2,
\end{aligned}
$$

H = u2 − u1, and f(x, H), E_{ij}(x) are given in the lemma. In particular, t and F are rational functions of x, y(x), dy(x)/dx, k(x, H).

Note that F − F0 is independent of u1, namely it is independent of t^1.

5.2. THE CASE OF THE QUANTUM COHOMOLOGY OF PROJECTIVE SPACES: µ = −1

Let µ = −1. This is a nongeneric case, corresponding to the Frobenius manifold called the Quantum Cohomology of $CP^2$. In this case the unknown functions $h^{(1)}_{21}$, $h^{(1)}_{32}$, $h^{(2)}_{31}$ have to be determined. It is known [12, 17] that
\[
R_1 = \begin{pmatrix} 0 & 0 & 0 \\ 3 & 0 & 0 \\ 0 & 3 & 0 \end{pmatrix}, \qquad R_2 = 0.
\]

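The direct computation gives
\[
R_1 = \begin{pmatrix} 0 & 0 & 0 \\ b(x) H f^{-1} & 0 & 0 \\ 0 & b(x) H f^{-1} & 0 \end{pmatrix},
\qquad
R_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ b(x) H f^{-1}\bigl(h^{(1)}_{21} - h^{(1)}_{32}\bigr) & 0 & 0 \end{pmatrix},
\]
which implies
\[
f(x,H) = \frac{H}{3}\, b(x), \qquad h^{(1)}_{21} = h^{(1)}_{32}, \tag{51}
\]
where $h^{(1)}_{32}$ is determined using the differential equation
\[
\frac{\partial \phi_1}{\partial u_i} = E_i \phi_0 + V_i \phi_1,
\]
which implies
\[
\frac{\partial h^{(1)}_{32}}{\partial u_1} = \frac{E_{12}E_{11}}{f}, \qquad
\frac{\partial h^{(1)}_{32}}{\partial u_2} = \frac{E_{22}E_{21}}{f}, \qquad
\frac{\partial h^{(1)}_{32}}{\partial u_3} = \frac{E_{32}E_{31}}{f}.
\]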


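Therefore
\[
\frac{\partial h^{(1)}_{32}}{\partial u_1} + \frac{\partial h^{(1)}_{32}}{\partial u_2} + \frac{\partial h^{(1)}_{32}}{\partial u_3} = 0.
\]
The last equation follows from $E_{12}E_{11} + E_{22}E_{21} + E_{32}E_{31} = 0$, which is a consequence of $\phi_0^T \phi_0 = \eta$. Therefore $h^{(1)}_{32}$ is a function of $x = (u_3 - u_1)/(u_2 - u_1)$ and $H = u_2 - u_1$. Taking into account (51) and the relations
\[
\frac{\partial x}{\partial u_1} = \frac{x-1}{H}, \qquad
\frac{\partial x}{\partial u_2} = -\frac{x}{H}, \qquad
\frac{\partial x}{\partial u_3} = \frac{1}{H},
\]
\[
\frac{\partial H}{\partial u_1} = -1, \qquad
\frac{\partial H}{\partial u_2} = 1, \qquad
\frac{\partial H}{\partial u_3} = 0,
\]
we obtain
\[
\frac{\partial h^{(1)}_{32}}{\partial x} = \frac{3}{x + \dfrac{E_{21}E_{22}}{E_{31}E_{32}}}, \qquad
\frac{\partial h^{(1)}_{32}}{\partial H} = \frac{3}{H},
\]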

which are integrated as
\[
h^{(1)}_{32} = 3\ln(H) + 3\int^x d\zeta\, \frac{1}{\zeta + \dfrac{E_{21}E_{22}}{E_{31}E_{32}}}. \tag{52}
\]
Before determining $h^{(2)}_{31}$, it is worth computing $t$ through (49). We again need to substitute the entries of $\phi_0$ and $\phi_1$ in the formulae (49) and carry out nontrivial simplifications. We obtain
\[
t_1 = u_1 + a(x) H, \qquad t_2 = h^{(1)}_{32}, \tag{53}
\]
\[
t_3 = -c(x)\,\frac{H}{f^2} = -9\,\frac{c(x)}{b(x)^2}\,\frac{1}{H}. \tag{54}
\]

We observe that $h^{(2)}_{31}$ does not appear in $t$. We also observe that both $t_1$ and $t_3$ coincide with the limits for µ → −1 of the same coordinates computed in the generic case. In contrast, no such limit exists for $t_2$.

Now we turn to the differential equation $\partial\phi_2/\partial u_i = E_i\phi_1 + V_i\phi_2$, which gives the following differential equations for $h^{(2)}_{31}$:
\[
\frac{\partial h^{(2)}_{31}}{\partial u_i}
= t_1 \frac{\partial t_3}{\partial u_i} + t_2 \frac{\partial t_2}{\partial u_i} + t_3 \frac{\partial t_1}{\partial u_i},
\qquad i = 1, 2, 3.
\]
They are immediately integrated: $h^{(2)}_{31} = \tfrac12 (t_2)^2 + t_1 t_3$.

Now all the entries of $\phi_p$, p = 0, 1, 2, 3, are known and we can substitute them into (50). We obtain:
\[
F = F_0(t) + 9\left[ \tfrac16\, a_1(x) c(x)^2 + \tfrac34\, b(x) b_1(x) c(x) + \bigl(b_2(x) - a(x)\bigr) b(x)^2 \right] \frac{H}{b(x)^2}. \tag{55}
\]
Remarkably, this coincides with the limit, for µ → −1, of the generic case. We have proved the following theorem:

THEOREM 5.2. The flat coordinates $(t_1, t_2, t_3)$ and the free energy $F$ for the Quantum Cohomology of $CP^2$ are given by the parametric formulae (52), (53), (54) and (55).
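
The statement that (55) is the µ → −1 limit of the generic formula can be checked on the three rational coefficients alone. The following is a minimal sketch, not part of the original computation; the function name `generic_coefficients` is a hypothetical helper introduced here. It evaluates the coefficients of $a_1 c^2$, $b\, b_1 c$ and $b^2(b_2 - a)$ from Theorem 5.1 at µ = −1; multiplying by $H^3/f^2 = 9H/b(x)^2$, which follows from (51), then reproduces the bracket of (55).

```python
from fractions import Fraction


def generic_coefficients(mu):
    """Coefficients of a1*c^2, b*b1*c and b^2*(b2 - a) in the bracket of Theorem 5.1."""
    mu = Fraction(mu)
    k1 = 1 / (2 * (1 - 2 * mu) * (3 + 2 * mu))
    k2 = (mu + 4) / (2 * (1 - mu) * (2 + mu) * (3 + 2 * mu))
    k3 = 1 / ((2 + mu) * (3 + 2 * mu))
    return k1, k2, k3


# At mu = -1 none of the denominators vanish, so the limit is a plain evaluation.
coeffs = generic_coefficients(-1)
print(coeffs)
```

The three values are the fractions 1/6, 3/4 and 1 that appear inside the bracket of (55).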

6. F(t) in Closed Form

(1) Generic case $\mu \neq \pm\frac12, \pm 1, \pm\frac32, \pm 2, \pm 3$.

If we take into account the dependence of $f(x,H)$ and $k(x,H)$ on $H$, we see that both $t$ and $F - F_0$ factorize into a part depending only on $x$ and a part depending only on $H$:
\[
t_2(x,H) = \tau_2(x) H^{1+\mu}, \qquad
t_3(x,H) = \tau_3(x) H^{1+2\mu}, \qquad
F(x,H) = F_0(t) + \mathcal{F}(x) H^{3+2\mu},
\]
where $\tau_2(x)$, $\tau_3(x)$ and $\mathcal{F}(x)$ are explicitly given as rational functions of $x$, $y(x)$, $dy(x)/dx$ and quadratures by the formulae of Theorem 5.1. Hence, the ratio
\[
\frac{t_2}{(t_3)^{\frac{1+\mu}{1+2\mu}}}
\]
is independent of $H$ and the closed form $F = F(t)$ must be
\[
F(t) = F_0(t) + (t_3)^{\frac{3+2\mu}{1+2\mu}}\, \varphi\!\left( \frac{t_2}{(t_3)^{\frac{1+\mu}{1+2\mu}}} \right),
\]
where the function $\varphi(\zeta)$ has to be determined. We obtain closed forms $F = F(t)$ by following the steps below.

(i) First we choose a critical point x = 0, 1, ∞ of PVIµ and we expand $y(x)$ close to the critical point, with parameters $\sigma^{(i)}$, $a^{(i)}$ given by (19), (20) in terms of the entries of the Stokes' matrix $S$. In this paper we consider only the case of rational $\sigma^{(i)}$, therefore any expansion is a Taylor or Puiseux series. The coefficients of the expansion, which are rational in $a^{(i)}$ and $\sigma^{(i)}$, are classical functions of the entries of $S$ (actually, $a^{(i)}$ and $\sigma^{(i)}$ are rational, trigonometric or Γ functions of the monodromy data). The most efficient way to do the expansion is to compute the expansions of $M_1(x)$, $M_2(x)$, $M_3(x)$. The algorithm we use is an expansion of the $M_i$'s in a small parameter [see the Appendix in Section 9]. The effective variable in the expansion is a variable $s \to 0$:
\[
s := \begin{cases}
x & \text{if } x \to 0, \\
1 - x & \text{if } x \to 1, \\
1/x & \text{if } x \to \infty.
\end{cases}
\]

(ii) We plug the above expansions into $\tau_i(x)$ and $\mathcal{F}(x)$, obtaining an expansion in $s$. In particular
\[
\frac{t_2}{(t_3)^{\frac{1+\mu}{1+2\mu}}} \equiv \frac{\tau_2(x(s))}{\tau_3(x(s))^{\frac{1+\mu}{1+2\mu}}}
\]
is expanded.

(iii) One of the following cases may occur for $s \to 0$:
\[
\frac{\tau_2}{\tau_3^{\frac{1+\mu}{1+2\mu}}} \to
\begin{cases}
0, \\
\infty, \\
\zeta_0, \\
\text{no limit},
\end{cases}
\]
where $\zeta_0$ is a nonzero complex number. If the limit does not exist, the problem becomes complicated. This may actually occur for particular values of the monodromy data (we will see later that this is the case of the quantum cohomology of $CP^2$, provided that we take $e^{t_2}(t_3)^3$ instead of $t_2 (t_3)^{-(1+\mu)/(1+2\mu)}$). In the other three cases the limit exists and we have a small quantity $X = X(s) \to 0$ as $s \to 0$, given respectively by
\[
X := \frac{\tau_2}{\tau_3^{\frac{1+\mu}{1+2\mu}}}, \qquad
X := \left( \frac{\tau_2}{\tau_3^{\frac{1+\mu}{1+2\mu}}} \right)^{-1}, \qquad
X := \frac{\tau_2}{\tau_3^{\frac{1+\mu}{1+2\mu}}} - \zeta_0.
\]

(iv) We invert the series $X = X(s)$ and find a series $s = s(X)$ for $X \to 0$. Thus we can rewrite $\tau_2 = \tau_2(X)$, $\tau_3 = \tau_3(X)$, $\mathcal{F} = \mathcal{F}(X)$.

(v) We compute $H$ as a series in $X$ and as a function of $t_3$:
\[
H = H(X, t_3) = \left[ \frac{t_3}{\tau_3(X)} \right]^{\frac{1}{1+2\mu}}.
\]

(vi) By substituting $H(X, t_3)$ into $F - F_0 = \mathcal{F}(X) H^{3+2\mu}$ we obtain a series for $F - F_0$ in the small variable $X$. In other words, we obtain $\varphi(\zeta)$ as a series in $\zeta$, or $1/\zeta$, or $\zeta - \zeta_0$.

(vii) Finally, we re-express $X$ in terms of the variables $t_2$ and $t_3$. We get the closed form $F(t)$ as a series whose coefficients are classical functions of the monodromy data.
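
Step (iv), the inversion of the series $X = X(s)$, can be illustrated on a toy example. The sketch below is an assumption-laden illustration rather than the procedure actually used for the results of Section 7: it handles only integer-power series with exact rational coefficients (a Puiseux series would first need a change of variable such as $s = w^p$), and the helper names `mul` and `revert` are hypothetical.

```python
from fractions import Fraction


def mul(a, b, n):
    """Product of two truncated power series (coefficient lists), kept up to order n."""
    out = [Fraction(0)] * (n + 1)
    for i, ai in enumerate(a[: n + 1]):
        if ai == 0:
            continue
        for j, bj in enumerate(b[: n + 1 - i]):
            out[i + j] += ai * bj
    return out


def revert(c, n):
    """Given X(s) = c[1] s + c[2] s^2 + ... (c[0] = 0, c[1] != 0),
    return d such that s(X) = d[1] X + ... + d[n] X^n."""
    c = [Fraction(x) for x in c] + [Fraction(0)] * (n + 1)
    d = [Fraction(0)] * (n + 1)
    d[1] = 1 / c[1]
    for m in range(2, n + 1):
        # Impose that the coefficient of X^m in X(s(X)) vanishes.
        # Powers s(X)^k with k >= 2 only involve d[1..m-1], which are known.
        power = d[:]
        total = Fraction(0)
        for k in range(2, m + 1):
            power = mul(power, d, m)
            total += c[k] * power[m]
        d[m] = -total / c[1]
    return d


# Toy example: X = s + s^2, whose inverse is s = (-1 + sqrt(1 + 4X)) / 2.
inverse = revert([0, 1, 1], 5)
print(inverse[1:])
```

For the toy series the output coefficients are 1, −1, 2, −5, 14, the signed Catalan numbers, in agreement with the closed-form inverse.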


7. Closed Form of F(t) from Algebraic Solutions of PVIµ

We refer to [15] for the algebraic solutions of PVIµ. The Stokes' matrix of the manifold is
\[
S = \begin{pmatrix} 1 & x_\infty & x_0 \\ 0 & 1 & x_1 \\ 0 & 0 & 1 \end{pmatrix}
\]
and in [15] branches of the algebraic solutions of PVIµ are reconstructed from the above monodromy data. The formulae (19) and (20) give the critical behavior of a branch of $y(x)$ (branch cuts in the $x$-plane, like (−∞, 0) and (1, +∞), are understood). The analytic continuation of the branch has critical behavior still specified by the exponents (19) and the coefficients (20), computed on the entries of a new $S$ obtained by acting with the braid group. This action was described in the Introduction, and it is generated by the two elementary braids
\[
(x_0, x_1, x_\infty) \mapsto (-x_0,\; x_\infty - x_0 x_1,\; x_1),
\]
\[
(x_0, x_1, x_\infty) \mapsto (x_\infty,\; -x_1,\; x_0 - x_1 x_\infty).
\]

A triple $(x_0, x_1, x_\infty)$ specifies an algebraic solution if and only if its orbit under the action of the braid group is finite. There are only five finite orbits, all classified in [15]. For them, the entries are $x_i = -2\cos \pi r_i$, with $0 \le r_i \le 1$ rational, i = 0, 1, ∞. Moreover, µ must be real. In [15] it is proved that the Stokes matrices coincide with the Stokes matrices of the Coxeter groups $A_3$, $B_3$, $H_3$. Namely, let us take one of the above Coxeter groups and choose a basis $e_1, e_2, e_3$ of three vectors which generate the reflections in the planes normal to them w.r.t. a Euclidean metric ( , ). We compute the corresponding Stokes matrix with $x_1 := (e_1, e_2)$, $x_0 := (e_1, e_3)$, $x_\infty := (e_2, e_3)$. If we choose another basis we find a new Stokes' matrix. It turns out that for the group $A_3$ all possible Stokes' matrices constructed in this way belong to a single orbit of the braid group. The same holds true for $B_3$. On the other hand, there are three orbits for $H_3$, therefore there are three inequivalent choices for the basis $e_1, e_2, e_3$. The five orbits correspond to the symmetries of five solids: tetrahedron, cube, icosahedron, great dodecahedron, great icosahedron (see [6]).

The equation PVIµ admits a set of symmetries, which transform the equation for a given µ to another equation with −µ or µ + 1. They are discussed in [15] and they allow one to reduce to the case 0 < µ < 1. With this restriction, there are five algebraic solutions, corresponding to the five orbits of $S$ of $A_3$, $B_3$, $H_3$.

We compute F(t) in closed form for these five algebraic solutions.
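
The finiteness criterion can be tried out directly on integer triples. The following is a small sketch under stated assumptions: the function names are hypothetical, the orbit is generated by breadth-first search under the two elementary braids quoted above (for a finite orbit this forward closure already equals the full group orbit), and a size cap stands in for "infinite". It is applied to the triple (0, −1, −1) used for the tetrahedron below and to the triple (3, 3, 3) of Section 8. Orbits with irrational entries, as for $B_3$ and $H_3$, would need exact algebraic arithmetic instead of Python integers.

```python
from collections import deque


def braid_1(t):
    x0, x1, xi = t
    return (-x0, xi - x0 * x1, x1)


def braid_2(t):
    x0, x1, xi = t
    return (xi, -x1, x0 - x1 * xi)


def orbit(start, cap=1000):
    """Breadth-first closure of `start` under the two elementary braids.

    Returns the set of triples reached, or None if more than `cap`
    distinct triples are found (taken as a sign of an infinite orbit)."""
    seen = {start}
    queue = deque([start])
    while queue:
        t = queue.popleft()
        for image in (braid_1(t), braid_2(t)):
            if image not in seen:
                if len(seen) >= cap:
                    return None
                seen.add(image)
                queue.append(image)
    return seen


a3_orbit = orbit((0, -1, -1))
print(len(a3_orbit), sorted(a3_orbit))
print(orbit((3, 3, 3), cap=200))
```

The orbit of (0, −1, −1) closes after 16 triples and contains the triples (−1, 0, −1), (−1, −1, 0) and (1, 1, 1) chosen in examples (ii)–(iv) below, whereas the search started at (3, 3, 3) keeps producing new triples until the cap is hit.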

(I) TETRAHEDRON ($A_3$), µ = −1/4

There is only one orbit of $S$, completely given in [15]. From the entries of $S$ we compute the critical behavior of $y(x)$ through (19) and (20). This gives the leading terms of the $M_j$'s (j = 1, 2, 3) according to the formulae of Section 4. They are enough to start the expansion in the small parameter explained in the Appendix. This is a Puiseux expansion because the $\sigma^{(i)}$'s (i = 0, 1, ∞) are rational. Then we compute the Puiseux expansion of $y(x)$ through (48) and we expand $t = t(x,H)$, $F = F(x,H)$ of Subsection 5.1. Finally, we apply the procedure explained in Section 6 to obtain the closed form $F = F(t)$. This is the general procedure we follow in all the cases below.

(i) x → 0. We choose the triple $(x_0, x_1, x_\infty) = (0, -1, -1)$. Through (19) and (20) we obtain
\[
y(x) = \tfrac12 x + O(x^2), \qquad x \equiv s \to 0.
\]
Using the small parameter expansion explained in the Appendix, we compute the Puiseux expansion (a Taylor series in this example) of $y$ up to order $x^{m-1}$, for a given large $m$. The small variable $X$ is
\[
X = \frac{t_2}{(t_3)^{3/2}} \to 0 \quad \text{for } x \to 0
\]
and the final result obtained by applying the procedure explained in Section 6 is
\[
F - F_0 = \tfrac{4}{15} k_0^4 (t_3)^5 - k_0^2 (t_3)^5 X^2 + O(X^m), \qquad X \to 0,
\]
\[
\phantom{F - F_0} = \tfrac{4}{15} k_0^4 (t_3)^5 - k_0^2 (t_2)^2 (t_3)^2 + O\!\left( \left[ \frac{t_2}{(t_3)^{3/2}} \right]^m \right),
\]
where $k_0$ is the arbitrary integration constant in $k(x,H)$. Note that different solutions $F(t)$ corresponding to different values of $k_0$ are connected by symmetries of the WDVV equations [11].

(ii) x → 1. Let us choose $(x_0, x_1, x_\infty) = (-1, 0, -1)$. Hence:
\[
y(x(s)) = 1 - \tfrac12 s + O(s^2), \qquad s = 1 - x \to 0.
\]
$X$ is as in (i) and
\[
F - F_0 = \tfrac{4}{15} k_0^4 (t_3)^5 + k_0^2 (t_3)^5 X^2 + O(X^m), \qquad X \to 0,
\]
\[
\phantom{F - F_0} = \tfrac{4}{15} k_0^4 (t_3)^5 + k_0^2 (t_2)^2 (t_3)^2 + O\!\left( \left[ \frac{t_2}{(t_3)^{3/2}} \right]^m \right).
\]
Here $k_0^2$ has the opposite sign w.r.t. the previous case.

(iii) x → ∞. We choose $(x_0, x_1, x_\infty) = (-1, -1, 0)$. Hence,
\[
y(x(s)) = \tfrac12 \left[ 1 + O(s) \right], \qquad s = \frac{1}{x} \to 0.
\]
Again, $X$ is as in (i) and (ii), and the result is precisely as in (i).


(iv) We consider another example for x → 0: we choose $(x_0, x_1, x_\infty) = (1, 1, 1)$. Hence,
\[
y(x) = \frac{4^{2/3}}{50}\, x^{2/3} \bigl( 1 + O(x^\delta) \bigr), \qquad 0 < \delta < 1, \quad x = s \to 0,
\]
as determined by the formulae (19), (20). This time the computation of the expansion of $y(x)$ is harder than before, because of the fractional exponent. The final result of the procedure of Section 6 is
\[
\frac{t_2}{(t_3)^{3/2}} \to \zeta_0 = -\frac{72}{25}\sqrt{2}\, k_0, \qquad x \to 0,
\]
\[
X = \left[ \frac{t_2}{(t_3)^{3/2}} - \zeta_0 \right] \to 0, \qquad x \to 0,
\]
\[
F - F_0 = \frac{119751372}{1953125}\, k_0^4 (t_3)^5 - \frac{314928\sqrt{2}}{15625}\, k_0^3 (t_3)^5 X^2 + \frac{2187}{625}\, k_0^2 (t_3)^5 X^4 + O(X^m), \qquad X \to 0.
\]
Substituting $X$ as a function of $t_2$ and $t_3$ we obtain
\[
F - F_0 = \tfrac{4}{15} \alpha^2 (t_3)^5 - \alpha (t_2)^2 (t_3)^2 + O(X^m), \qquad \alpha = -\frac{2187}{625}\, k_0^2.
\]
We applied the procedure to all the triples in the orbit of $S$ of $A_3$ and we considered x → 0, 1, ∞, obtaining the same results as in the examples above.

(II) CUBE ($B_3$), µ = −1/3

The computations are similar to those for the case $A_3$. We just give one example, namely only the case x → 0 and $(x_0, x_1, x_\infty) = (0, -1, -\sqrt{2})$, which implies:
\[
y(x) = \tfrac23 x + O(x^2), \qquad x \to 0.
\]
We repeated the computation for all the monodromy data $x_0, x_1, x_\infty$ of [15], obtaining the same result as in the example explained here. For it, the small quantity $X$ is
\[
X = \frac{t_2}{(t_3)^2} \to 0
\]
and the procedure of Section 6 gives
\[
F - F_0 = \frac{512}{8505}\, k_0^6 (t_3)^7 - \frac{16}{27}\, k_0^3 (t_2)^2 (t_3)^3 - \frac{2i\sqrt{2}}{9}\, k_0^{3/2} (t_2)^3 t_3 + O(X^m).
\]

For the case $H_3$ we need to distinguish three sub-cases, corresponding to three inequivalent orbits of $S$. Again, all these orbits are explicitly listed in [15]. From them we compute the leading terms of $y(x)$ from (19) and (20) and then the Puiseux expansion up to any desired order by the small parameter expansion explained in the Appendix.

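(III) ICOSAHEDRON ($H_3$), µ = −2/5

We choose for example
\[
(x_0, x_1, x_\infty) = \left( 0,\; 1,\; \frac{1 + \sqrt{5}}{2} \right).
\]
Therefore,
\[
y(x) = \frac{3 + \sqrt{5}}{5 + \sqrt{5}}\, x + O(x^2), \qquad x \to 0.
\]
Now
\[
X = \frac{t_2}{(t_3)^3} \to 0.
\]
The final result is:
\[
F - F_0 = \frac{18}{55}\, \alpha^4 (t_3)^{11} + \frac{9}{5}\, \alpha^2 (t_2)^2 (t_3)^5 + \alpha (t_2)^3 (t_3)^2 + O(X^m),
\]
where $\alpha = -512\sqrt{5} \big/ \bigl[ 3 (\sqrt{5} - 5)^{5/2} (\sqrt{5} + 5)^{5/2} \bigr]\, k_0^{5/2}$.

Remark. In (I), (II), (III) we have recovered the famous polynomial solutions of [11]
\[
F(t) = F_0(t) + a (t_2)^2 (t_3)^2 + \frac{4}{15}\, a^2 (t_3)^5, \qquad a \in \mathbf{C}, \quad d = \tfrac12,
\]
\[
F(t) = F_0(t) + a (t_2)^3 t_3 + 6 a^2 (t_2)^2 (t_3)^3 + \frac{216}{35}\, a^4 (t_3)^7, \qquad a \in \mathbf{C}, \quad d = \tfrac23,
\]
\[
F(t) = F_0(t) + a (t_2)^3 (t_3)^2 + \frac{9}{5}\, a^2 (t_2)^2 (t_3)^5 + \frac{18}{55}\, a^4 (t_3)^{11}, \qquad a \in \mathbf{C}, \quad d = \tfrac45,
\]
which correspond to the Frobenius structures on the space of orbits of the Coxeter groups $A_3$, $B_3$, $H_3$ [11, 30, 31].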

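(IV) GREAT DODECAHEDRON ($H_3$), µ = −1/3

We give the final result, which is a series:
\[
F - F_0 = (t_3)^7 \left[ A_0 + \sum_{k=2}^{\infty} A_k \left( \frac{t_2}{(t_3)^2} \right)^k \right]
\]
\[
\phantom{F - F_0} = \frac{512}{8505}\, k_0^6 (t_3)^7 - \frac{16}{27}\, k_0^3 (t_2)^2 (t_3)^3 + \frac{4i\sqrt{5}}{9}\, k_0^{3/2}\, t_3 (t_2)^3
+ \frac{1}{8} \frac{(t_2)^4}{t_3} + \frac{i\sqrt{5}}{80\, k_0^{3/2}} \frac{(t_2)^5}{(t_3)^3} - \frac{1}{64\, k_0^3} \frac{(t_2)^6}{(t_3)^5} + \cdots.
\]
The series can be computed up to any order in $X = t_2/(t_3)^2 \to 0$.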


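(V) GREAT ICOSAHEDRON ($H_3$), µ = −1/5

The final result is a Puiseux series:
\[
F - F_0 = (t_3)^{13/3} \left[ A_0 + \sum_{k=2}^{\infty} A_k \left( \frac{t_2}{(t_3)^{4/3}} \right)^k \right]
\]
\[
\phantom{F - F_0} = \frac{54}{284375}\, \alpha^4 (t_3)^{13/3} - \frac{3}{125}\, \alpha^2 (t_2)^2 (t_3)^{5/3} + \frac{i}{15}\, \alpha (t_2)^3 (t_3)^{1/3}
+ \frac{1}{72} \frac{(t_2)^4}{t_3} + \frac{i}{108\, \alpha} \frac{(t_2)^5}{(t_3)^{7/3}} + \cdots
\]
for $X = t_2/(t_3)^{4/3} \to 0$. Here $k_0 = 270^{1/5}/30\; \alpha^{6/5}$.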
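
The powers of $t_3$ appearing in cases (I)–(V) are those dictated by the factorization of Section 6, namely $(1+\mu)/(1+2\mu)$ for the small variable $X$ and $(3+2\mu)/(1+2\mu)$ for the overall prefactor. The following short check is only an illustrative sketch; the dictionary keys and the helper name `scaling_exponents` are labels introduced here.

```python
from fractions import Fraction


def scaling_exponents(mu):
    """Return ((1+mu)/(1+2mu), (3+2mu)/(1+2mu)) as exact fractions."""
    mu = Fraction(mu)
    return (1 + mu) / (1 + 2 * mu), (3 + 2 * mu) / (1 + 2 * mu)


cases = {
    "tetrahedron": Fraction(-1, 4),
    "cube": Fraction(-1, 3),
    "icosahedron": Fraction(-2, 5),
    "great dodecahedron": Fraction(-1, 3),
    "great icosahedron": Fraction(-1, 5),
}
exponents = {name: scaling_exponents(mu) for name, mu in cases.items()}
for name, (x_power, prefactor_power) in exponents.items():
    print(f"{name}: X = t2/(t3)^{x_power}, prefactor (t3)^{prefactor_power}")
```

The output reproduces the pairs (3/2, 5), (2, 7), (3, 11), (2, 7) and (4/3, 13/3) used above.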

8. Closed form for $QH^*(CP^2)$

In this case the factorization is $t_3 = \tau_3(x) H^{-1}$ and $F - F_0 = \mathcal{F}(x) H$, but
\[
t_2 = h^{(1)}_{32} = \ln(H^3) + \int^x (\dots).
\]
This implies that $e^{t_2} (t_3)^3$ is independent of $H$ and that
\[
F(t) = F_0(t) + \frac{1}{t_3}\, \varphi\bigl( e^{t_2} (t_3)^3 \bigr)
\]
or
\[
F(t) = F_0(t) + e^{t_2/3}\, \varphi_1\bigl( e^{t_2} (t_3)^3 \bigr).
\]
The situation is more complicated now, because the behavior of $y(x)$ for x → 0 is not like $a x^{1-\sigma} (1 + \text{higher-order terms})$ as in the previous section. The same holds for x → 1 and x → ∞. The reason for this is that the monodromy data are in the orbit, w.r.t. the action of the braid group, of the triple $(x_0, x_1, x_\infty) = (3, 3, 3)$. Hence, the entries of $S$ are real and their absolute value is greater than 2, so $\Re \sigma^{(i)} = 1$. For the data (3, 3, 3) we have
\[
\sigma^{(i)} = 1 - i\nu, \qquad \nu = -\frac{2}{\pi} \ln\left( \frac{3 + \sqrt{5}}{2} \right).
\]
This case corresponds to an oscillatory behaviour (21) if $x$ converges to the critical points along radial directions, as we explained in the Introduction. We recall that the effective parameter $s \to 0$ introduced in Section 6, point (i), is $s = x$ or $1 - x$ or $1/x$. It turns out that $e^{t_2} (t_3)^3$ has no limit as $s \to 0$.

To overcome this difficulty, we compute $F(t)$ in closed form starting from the expansion of $y(x)$ and of $M_1(x)$, $M_2(x)$, $M_3(x)$ close to the nonsingular point $x_c = \exp\{-i(\pi/3)\}$ and we prove that the reduction of (52)–(54) and (55) to closed form is precisely the expansion of $F(t)$ due to Kontsevich [24], namely:
\[
F(t) = F_0(t) + \frac{1}{t_3} \sum_{k=1}^{\infty} \frac{N_k}{(3k-1)!} \bigl[ (t_3)^3 e^{t_2} \bigr]^k, \tag{56}
\]
where $N_k$ is the number of rational curves $CP^1 \to CP^2$ of degree $k$ through $3k - 1$ generic points.

Our result is interesting because it allows us to obtain such a relevant expression starting from the isomonodromy deformation theory.

On the other hand, it is not completely satisfactory. Since the Frobenius manifold can in principle be reconstructed from its monodromy data, we would like to express the coefficients $N_k$ as functions of the monodromy data. The critical behavior of $y(x)$ close to a critical point depends upon two parameters $a^{(i)}$, $\sigma^{(i)}$ which are classical functions of the monodromy data (rational, trigonometric and Γ functions). Thus, our aim is to compute the closed form $F(t)$ from the critical behavior of $y(x)$ close to a critical point, in order to obtain the $N_k$'s as classical functions of the monodromy data.

The choice of a nonsingular point $x_c$ is not satisfactory because the expansion of $y(x)$ close to $x_c$ depends on two parameters (initial data $y(x_c)$, $y'(x_c)$), which in general are not classical functions of $a^{(i)}$ and $\sigma^{(i)}$, i = 0, 1, ∞ (generically, the Painlevé transcendents cannot be expressed via classical functions). Thus, they are not classical functions of the monodromy data.

We now compute $F(t)$ in closed form. We expand $y(x)$ close to $x_c = e^{-i\pi/3}$.

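This choice comes from the knowledge of the structure of $QH^*(CP^2)$ at the point $t_1 = t_3 = 0$ [12, 17]. Namely, we know that
\[
U = \begin{pmatrix} 0 & 0 & 3q \\ 3 & 0 & 0 \\ 0 & 3 & 0 \end{pmatrix}, \qquad q := e^{t_2},
\]
with eigenvalues
\[
u_1 = 3 q^{1/3}, \qquad u_2 = 3 q^{1/3} e^{-i\frac{2\pi}{3}}, \qquad u_3 = 3 q^{1/3} e^{i\frac{2\pi}{3}}.
\]
The matrix $\phi_0$, normalized so that $\phi_0^T \phi_0 = \eta$, is
\[
\phi_0 = \frac{1}{\sqrt{3}} \begin{pmatrix}
q^{-1/3} & 1 & q^{1/3} \\
q^{-1/3} e^{-i\frac{\pi}{3}} & -1 & q^{1/3} e^{i\frac{\pi}{3}} \\
q^{-1/3} e^{i\frac{\pi}{3}} & -1 & q^{1/3} e^{-i\frac{\pi}{3}}
\end{pmatrix}.
\]
Thus
\[
x_c = \frac{u_3 - u_1}{u_2 - u_1} = e^{-i\frac{\pi}{3}}.
\]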
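
The value of $x_c$ is easy to confirm numerically from the eigenvalues quoted above. The snippet below is a minimal sketch: the function name `canonical_coordinates` is a hypothetical helper, and a positive real value of $q$ is assumed so that the real cube root can be used.

```python
import cmath


def canonical_coordinates(q):
    """Eigenvalues u1, u2, u3 of U for a positive real q, in the order used in the text."""
    root = q ** (1 / 3)                # real cube root of q
    w = cmath.exp(2j * cmath.pi / 3)   # primitive cube root of unity
    return 3 * root, 3 * root / w, 3 * root * w


q = 0.5
u1, u2, u3 = canonical_coordinates(q)
xc = (u3 - u1) / (u2 - u1)
print(xc, cmath.exp(-1j * cmath.pi / 3))
```

Each $u_i$ satisfies $u^3 = 27 q$, the characteristic equation of $U$, and the ratio $x_c$ does not depend on the chosen value of $q$.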


Remark. The choice of $x_c$ is also unsatisfactory for another reason: we have to rely on the knowledge of $QH^*(CP^2)$ at the point $t_1 = t_3 = 0$, and not only on the monodromy data.

In order to compute $y(x)$ we start from $M_1(x)$, $M_2(x)$, $M_3(x)$. We look for a regular expansion
\[
M_i(x) = \sum_{k=0}^{\infty} M_i^{(k)} (x - x_c)^k, \qquad i = 1, 2, 3.
\]
We need the initial conditions $M_i^{(0)}$. We can compute them using $M_i = i\mu\, \phi_{i2,0}$:
\[
M_1^{(0)} = -\frac{i}{\sqrt{3}}, \qquad M_2^{(0)} = \frac{i}{\sqrt{3}}, \qquad M_3^{(0)} = \frac{i}{\sqrt{3}}.
\]

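Then we plug the expansion into (38) and we recursively compute the coefficients at any desired order. Finally, we obtain $y(x)$ from (48). We give only the first terms of the expansions, the effective small variable being $s := x - x_c \to 0$:
\[
M_1 = -\frac{i\sqrt{3}}{3} - \left( \frac16 + \frac16 i\sqrt{3} \right) s + \frac19 i\sqrt{3}\, s^2 + \left( \frac1{18} - \frac1{18} i\sqrt{3} \right) s^3 - \left( \frac5{36} - \frac5{108} i\sqrt{3} \right) s^4 + \cdots,
\]
\[
M_2 = \frac{i\sqrt{3}}{3} + \left( \frac16 - \frac16 i\sqrt{3} \right) s - \frac19 i\sqrt{3}\, s^2 - \left( \frac1{18} + \frac1{18} i\sqrt{3} \right) s^3 - \left( \frac5{36} + \frac5{108} i\sqrt{3} \right) s^4 + \cdots,
\]
\[
M_3 = \frac{i\sqrt{3}}{3} - \frac13 s + \frac29 i\sqrt{3}\, s^2 + \frac49 s^3 - \frac{13}{54} i\sqrt{3}\, s^4 + \cdots,
\]
\[
y(x) = \frac12 - \frac16 i\sqrt{3} + \frac13 s - \frac13 i\sqrt{3}\, s^2 - \frac13 s^3 + \frac19 i\sqrt{3}\, s^4 + \frac{13}{45} s^5 - \frac{37}{135} i\sqrt{3}\, s^6 - \frac{17}{27} s^7 + \cdots.
\]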
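
The recursive computation of the Taylor coefficients can be sketched as follows. This is an illustration under stated assumptions, not the author's code: it uses the system (57) of the Appendix, rewritten with $x = x_c + s$ and cleared of denominators, together with formula (58) for $y$, and it works in floating-point complex arithmetic. The names `taylor_M` and `y_from_M` are hypothetical.

```python
import cmath


def taylor_M(order, xc=None):
    """Taylor coefficients of M1, M2, M3 at xc for the system
    x M1' = M2 M3, (1 - x) M2' = M1 M3, x (x - 1) M3' = M1 M2,
    with initial data (-i/sqrt3, i/sqrt3, i/sqrt3)."""
    if xc is None:
        xc = cmath.exp(-1j * cmath.pi / 3)
    r = 1j / 3 ** 0.5
    m1, m2, m3 = [-r], [r], [r]

    def conv(a, b, k):
        # coefficient of s^k in the product of the series a and b
        return sum(a[j] * b[k - j] for j in range(k + 1))

    for k in range(order):
        p23, p13, p12 = conv(m2, m3, k), conv(m1, m3, k), conv(m1, m2, k)
        # (xc + s) M1' = M2 M3
        n1 = (p23 - k * m1[k]) / (xc * (k + 1))
        # (1 - xc - s) M2' = M1 M3
        n2 = (p13 + k * m2[k]) / ((1 - xc) * (k + 1))
        # (xc (xc - 1) + (2 xc - 1) s + s^2) M3' = M1 M2
        prev = (k - 1) * m3[k - 1] if k >= 1 else 0
        n3 = (p12 - (2 * xc - 1) * k * m3[k] - prev) / (xc * (xc - 1) * (k + 1))
        m1.append(n1)
        m2.append(n2)
        m3.append(n3)
    return m1, m2, m3


def y_from_M(M1, M2, M3, x, mu):
    """Formula (58): the Painleve transcendent in terms of M1, M2, M3."""
    R = ((M1 * M2 + mu * M3) / (mu ** 2 + M2 ** 2)) ** 2
    return -x * R / (1 - x * (1 + R))


m1, m2, m3 = taylor_M(4)
xc = cmath.exp(-1j * cmath.pi / 3)
y0 = y_from_M(m1[0], m2[0], m3[0], xc, -1)
print(m1[1], m3[2], y0)
```

The low-order coefficients produced this way agree with the expansions displayed above, and $y_0$ reproduces the constant term $\frac12 - \frac16 i\sqrt{3}$ of $y(x)$.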

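Once we have $M_1$, $M_2$, $M_3$, we can compute the $E_{ij}$'s, the $\phi_p$'s (p = 0, 1, 2, 3) and finally the flat coordinates $t(x,H)$ and $F(x,H)$ through (52)–(55). At low orders:
\[
t_1 = u_1 + \left[ \frac12 - \frac16 i\sqrt{3} + \frac13 s + O(s^2) \right] H,
\]
\[
t_3 = \left[ -9 s + O(s^2) \right] H^{-1},
\]
\[
q = \exp(t_2) = \frac{1}{243}\, i\sqrt{3}\, q_0 \left[ 1 + i\sqrt{3}\, s + O(s^2) \right] H^3,
\]
where $q_0$ is an arbitrary integration constant (recall that $t_2$ is obtained by integration). As for $\mathcal{F}$, the $x$-dependent factor in $F - F_0 = \mathcal{F} H$, we have
\[
\mathcal{F} = \frac16 i\sqrt{3}\, s^2 + \frac16 s^3 - \frac1{18} i\sqrt{3}\, s^4 + O(s^5).
\]
We introduce the small quantity $X$, independent of $H$: $X := t_3 q^{1/3}$. For example, if we take the cubic root $\left( -\frac16 + \frac1{18} i\sqrt{3} \right) q_0^{1/3}$ of $\frac{1}{243} i\sqrt{3}\, q_0$ we compute
\[
X = q_0^{1/3} \left[ \left( \frac32 - \frac12 i\sqrt{3} \right) s - \left( \frac12 + \frac12 i\sqrt{3} \right) s^2 + O(s^3) \right].
\]
Another choice of the branch of the cubic root does not affect the final result (actually, we will see that $F - F_0$ is a series in $X^3$). We invert the series and find $s = s(X)$, and then we find $H = \tau_3(X)/t_3$ as a series in $X \to 0$. Finally, $F$ is computed through (55):
\[
F - F_0 = \frac{1}{t_3} \left[ \frac{1}{2 q_0} X^3 + \frac{1}{120\, q_0^2} X^6 + \frac{1}{3360\, q_0^3} X^9 + \frac{31}{1995840\, q_0^4} X^{12} + \frac{1559}{1556755200\, q_0^5} X^{15} + O(X^{17}) \right].
\]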

We obtained this expansion through the expansions of the $M_i$'s and of $y(x)$ at order 16. If we put $q_0 = 1$, this is exactly Kontsevich's solution (56), with
\[
N_1 = 1, \quad N_2 = 1, \quad N_3 = 12, \quad N_4 = 620, \quad N_5 = 87304.
\]
Though not completely satisfactory for our theoretical purposes, the above is an explicit procedure to compute the Gromov–Witten invariants $N_k$, which is alternative to the usual procedure consisting in the direct substitution of the expansion (56) in the WDVV equations.¹
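
For comparison, the numbers $N_k$ and the coefficients of the series above can be reproduced with the well-known recursion that results from substituting (56) into the WDVV equations (Kontsevich's formula). The sketch below is that standard recursion, not the isomonodromic procedure of this paper; the function name `rational_curve_counts` is a hypothetical helper.

```python
from fractions import Fraction
from math import comb, factorial


def rational_curve_counts(n):
    """N_1, ..., N_n from Kontsevich's recursion, starting from N_1 = 1."""
    N = [0, 1]
    for d in range(2, n + 1):
        total = 0
        for d1 in range(1, d):
            d2 = d - d1
            total += N[d1] * N[d2] * (
                d1 ** 2 * d2 ** 2 * comb(3 * d - 4, 3 * d1 - 2)
                - d1 ** 3 * d2 * comb(3 * d - 4, 3 * d1 - 1)
            )
        N.append(total)
    return N[1:]


counts = rational_curve_counts(5)
# Coefficients N_k / (3k - 1)! multiplying [(t3)^3 e^{t2}]^k in (56).
coefficients = [Fraction(Nk, factorial(3 * k - 1)) for k, Nk in enumerate(counts, start=1)]
print(counts)
print(coefficients)
```

The counts come out as 1, 1, 12, 620, 87304 and the coefficients as 1/2, 1/120, 1/3360, 31/1995840, 1559/1556755200, matching the expansion of $F - F_0$ with $q_0 = 1$.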

The formulae (52)–(54) and (55) are completely explicit as rational functions of $dy/dx$, $y(x)$ and its quadratures. Still, the problem of inversion to obtain a closed form $F(t)$ close to a critical point is very hard because the behavior of $y(x)$ is not given by a Puiseux series, but it is oscillatory. We plan to continue the project of this paper of understanding the global analytic structure of $QH^*(CP^2)$ in the near future.

¹ We observe that for the very special case of $QH^*(CP^2)$
\[
\frac{\partial^2}{\partial (t_2)^2} (F - F_0) = u_1 + u_2 + u_3 - 3 t_1.
\]
This follows from the computation of the intersection form of the Frobenius manifold $QH^*(CP^2)$ in terms of $F$ (see [11]). Therefore,
\[
\frac{\partial^2}{\partial (t_2)^2} (F - F_0) = u_1 + u_2 + u_3 - 3 u_1 - 3 a(x) H = H \bigl( 1 + x - 3 a(x) \bigr).
\]
The above formula allows one to compute $F - F_0$ faster than (55).

9. Appendix

We present a procedure to compute the expansion of the Painlevé transcendents of PVIµ and of the solutions of
\[
\frac{dM_1}{dx} = \frac{1}{x} M_2 M_3, \qquad
\frac{dM_2}{dx} = \frac{1}{1 - x} M_1 M_3, \qquad
\frac{dM_3}{dx} = \frac{1}{x(x-1)} M_1 M_2 \tag{57}
\]
close to the critical point x = 0. The corresponding solution of the PVIµ equation is
\[
y(x) = \frac{-x R(x)}{1 - x\bigl(1 + R(x)\bigr)}, \qquad
R(x) := \left[ \frac{M_1 M_2 + \mu M_3}{\mu^2 + M_2^2} \right]^2. \tag{58}
\]

9.1. EXPANSION WITH RESPECT TO A SMALL PARAMETER

We want to study the behavior of the solution of (57) for x → 0. Let
\[
x := \varepsilon z,
\]
where ε is the small parameter. The system (57) becomes:
\[
\frac{dM_1}{dz} = \frac{1}{z} M_2 M_3, \qquad
\frac{dM_2}{dz} = \frac{\varepsilon}{1 - \varepsilon z} M_1 M_3, \qquad
\frac{dM_3}{dz} = \frac{1}{z(\varepsilon z - 1)} M_1 M_2. \tag{59}
\]
The coefficients of the new system are holomorphic for $\varepsilon \in E := \{ \varepsilon \in \mathbf{C} \mid |\varepsilon| \le \varepsilon_0 \}$ and for $0 < |z| < 1/|\varepsilon_0|$, in particular for $z \in D := \{ z \in \mathbf{C} \mid R_1 \le |z| \le R_2 \}$, where $R_1$ and $R_2$ are independent of ε and satisfy $0 < R_1 < R_2 < 1/\varepsilon_0$.

We will use the small parameter expansion as a formal way to compute the expansions of the $M_j$'s for x → 0, the only justification being that in the cases where we apply it we find expansions in $x$ which we already know to be convergent. To our knowledge, there is no rigorous justification of the (uniform) convergence of the expansions for the $M_j$'s in terms of the variable $x$ restored after the small parameter expansion in powers of ε.

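For ε ∈ E and z ∈ D we can expand the fractions as follows:
\[
\frac{dM_1}{dz} = \frac{1}{z} M_2 M_3, \qquad
\frac{dM_2}{dz} = \varepsilon \sum_{n=0}^{\infty} z^n \varepsilon^n M_1 M_3, \qquad
\frac{dM_3}{dz} = -\frac{1}{z} \sum_{n=0}^{\infty} z^n \varepsilon^n M_1 M_2, \tag{60}
\]
and we look for a solution expanded in powers of ε:
\[
M_j(z, \varepsilon) = \sum_{n=0}^{\infty} M_j^{(n)}(z)\, \varepsilon^n, \qquad j = 1, 2, 3. \tag{61}
\]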


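We compute the $M_j^{(n)}$'s by substituting (61) into (60). To order $\varepsilon^0$ we find
\[
M_2^{(0)\prime} = 0 \;\Longrightarrow\; M_2^{(0)} = \frac{i\sigma}{2}, \qquad
M_1^{(0)\prime} = \frac{1}{z} M_2^{(0)} M_3^{(0)}, \qquad
M_3^{(0)\prime} = -\frac{1}{z} M_2^{(0)} M_1^{(0)},
\]
where σ is so far an arbitrary constant, and the prime denotes the derivative w.r.t. $z$. Then we solve the linear system for $M_1^{(0)}$ and $M_3^{(0)}$ and find
\[
M_1^{(0)} = b z^{-\sigma/2} + a z^{\sigma/2}, \qquad
M_3^{(0)} = i b z^{-\sigma/2} - i a z^{\sigma/2},
\]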

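where $a$ and $b$ are integration constants. The higher orders are
\[
M_2^{(n)}(z) = \int^z d\zeta \sum_{k=0}^{n-1} \zeta^k \sum_{l=0}^{n-1-k} M_1^{(l)}(\zeta)\, M_3^{(n-1-k-l)}(\zeta),
\]
\[
M_1^{(n)\prime} = \frac{1}{z} M_2^{(0)} M_3^{(n)} + A_1^{(n)}(z), \qquad
M_3^{(n)\prime} = -\frac{1}{z} M_2^{(0)} M_1^{(n)} + A_3^{(n)}(z),
\]
where
\[
A_1^{(n)}(z) = \frac{1}{z} \sum_{k=1}^{n} M_2^{(k)}(z)\, M_3^{(n-k)}(z),
\]
\[
A_3^{(n)}(z) = -\frac{1}{z} \left[ \sum_{k=1}^{n} M_2^{(k)}(z)\, M_1^{(n-k)}(z) + \sum_{k=1}^{n} z^k \sum_{l=0}^{n-k} M_1^{(l)}(z)\, M_2^{(n-k-l)}(z) \right].
\]
The system for $M_1^{(n)}$, $M_3^{(n)}$ is closed and nonhomogeneous. By variation of parameters we obtain the particular solution
\[
M_1^{(n)}(z) = \frac{z^{\sigma/2}}{\sigma} \int^z d\zeta\, \zeta^{1 - \frac{\sigma}{2}} R_1^{(n)}(\zeta) - \frac{z^{-\sigma/2}}{\sigma} \int^z d\zeta\, \zeta^{1 + \frac{\sigma}{2}} R_1^{(n)}(\zeta),
\]
\[
M_3^{(n)}(z) = \frac{z}{i\sigma/2} \left( M_1^{(n)}(z)' - A_1^{(n)}(z) \right),
\]
where
\[
R_1^{(n)}(z) = \frac{1}{z} A_1^{(n)}(z) + \frac{i\sigma}{2z} A_3^{(n)}(z) + A_1^{(n)}(z)'.
\]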

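Restoring $x$ we have:
\[
M_j(x) = x^{-\sigma/2} \sum_{k,q=0}^{\infty} b^{(j)}_{kq}\, x^{k + (1-\sigma) q} + x^{\sigma/2} \sum_{k,q=0}^{\infty} a^{(j)}_{kq}\, x^{k + (1+\sigma) q}, \qquad j = 1, 3,
\]
\[
M_2(x) = \sum_{k,q=0}^{\infty} b^{(2)}_{kq}\, x^{k + (1-\sigma) q} + \sum_{k,q=0}^{\infty} a^{(2)}_{kq}\, x^{k + (1+\sigma) q}. \tag{62}
\]
The coefficients $a^{(j)}_{kq}$ and $b^{(j)}_{kq}$ contain ε. In fact, they are functions of the rescaled constants $a \varepsilon^{-\sigma/2}$ and $b \varepsilon^{\sigma/2}$.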

9.2. SOLUTION BY FORMAL COMPUTATION

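Consider the system (57) and expand the fractions as x → 0. We find
\[
\frac{dM_1}{dx} = \frac{1}{x} M_2 M_3, \qquad
\frac{dM_2}{dx} = \sum_{n=0}^{\infty} x^n M_1 M_3, \qquad
\frac{dM_3}{dx} = -\frac{1}{x} \sum_{n=0}^{\infty} x^n M_1 M_2. \tag{63}
\]
We can look for a solution written as formal series
\[
M_j(x) = x^{-\sigma/2} \sum_{k,q=0}^{\infty} b^{(j)}_{kq}\, x^{k + (1-\sigma) q} + x^{\sigma/2} \sum_{k,q=0}^{\infty} a^{(j)}_{kq}\, x^{k + (1+\sigma) q}, \qquad j = 1, 3,
\]
\[
M_2(x) = \sum_{k,q=0}^{\infty} b^{(2)}_{kq}\, x^{k + (1-\sigma) q} + \sum_{k,q=0}^{\infty} a^{(2)}_{kq}\, x^{k + (1+\sigma) q}.
\]
Plugging the series into the equation we find solvable relations between the coefficients and we can determine them. For example, the first relations give
\[
M_2 = \frac{i\sigma}{2} + \left( \frac{i [b^{(1)}_{00}]^2}{1 - \sigma}\, x^{1-\sigma} + \cdots \right) - \left( \frac{i [a^{(1)}_{00}]^2}{1 + \sigma}\, x^{1+\sigma} + \cdots \right),
\]
\[
M_1 = \left( b^{(1)}_{00}\, x^{-\sigma/2} + \cdots \right) + \left( a^{(1)}_{00}\, x^{\sigma/2} + \cdots \right),
\]
\[
M_3 = \left( i b^{(1)}_{00}\, x^{-\sigma/2} + \cdots \right) + \left( -i a^{(1)}_{00}\, x^{\sigma/2} + \cdots \right).
\]
All the coefficients determined by successive relations are functions of σ, $b^{(1)}_{00}$, $a^{(1)}_{00}$. These are the three integration constants on which the solution of (57) must depend. We can identify $b^{(1)}_{00}$ with $b$ and $a^{(1)}_{00}$ with $a$.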

9.3. THE RANGE OF σ IN THE SMALL PARAMETER EXPANSION

The above computations make sense if σ is not an odd integer, otherwise some coefficients of the expansions for the $M_j$'s diverge (see for example the first terms of $M_2$ in the preceding section).

Moreover, the expansion in the small parameter yields the following approximation at order 0 for $M_2$: $M_2 \approx i\sigma/2 \equiv$ constant. The approximation at order 1 contains powers $z^{1-\sigma}$, $z^{1+\sigma}$. If we assume that the approximation at order 0 in ε is actually the limit of $M_2$ as $x = \varepsilon z \to 0$, then we need $-1 < \Re\sigma < 1$. Of course, this makes sense if x → 0 along a radial path (i.e. within a sector of amplitude less than 2π).

The ordering of the expansion (62) is somewhat conventional: namely, we could transfer some terms multiplied by $x^{\sigma/2}$ into the series multiplied by $x^{-\sigma/2}$, and conversely. The first terms are:
\[
M_1(x) = b x^{-\sigma/2} \left( 1 - \frac{b^2}{(1-\sigma)^2}\, x^{1-\sigma} + \frac{\sigma^2}{4(1-\sigma)}\, x + \frac{a^2}{(1+\sigma)^2}\, x^{1+\sigma} + \cdots \right)
+ a x^{\sigma/2} \left( 1 + \frac{b^2}{(1-\sigma)^2}\, x^{1-\sigma} + \frac{\sigma^2}{4(1+\sigma)}\, x - \frac{a^2}{(1+\sigma)^2}\, x^{1+\sigma} + \cdots \right),
\]
\[
M_3(x) = i b x^{-\sigma/2} \left( 1 - \frac{b^2}{(1-\sigma)^2}\, x^{1-\sigma} + \frac{\sigma(\sigma-2)}{4(1-\sigma)}\, x + \frac{a^2}{(1+\sigma)^2}\, x^{1+\sigma} + \cdots \right)
- i a x^{\sigma/2} \left( 1 + \frac{b^2}{(1-\sigma)^2}\, x^{1-\sigma} + \frac{\sigma(\sigma+2)}{4(1+\sigma)}\, x - \frac{a^2}{(1+\sigma)^2}\, x^{1+\sigma} + \cdots \right),
\]
\[
M_2(x) = \frac{i\sigma}{2} + i\, \frac{b^2}{1-\sigma}\, x^{1-\sigma} - i\, \frac{a^2}{1+\sigma}\, x^{1+\sigma} + \cdots.
\]
Note that the dots do not mean higher-order terms. There may be terms bigger than those written above (which are computed through the expansion in the small parameter up to order ε), depending on the value of $\Re\sigma$ in (−1, 1).

Finally, we note that we can always assume that $0 \le \Re\sigma < 1$, because this affects the expansion of the solutions only through the change of two signs. With this in mind, the expansions above are:
\[
M_1 = b x^{-\sigma/2} \bigl( 1 + O(x^{1-\sigma}) \bigr) + a x^{\sigma/2} \bigl( 1 + O(x) \bigr),
\]
\[
M_3 = i b x^{-\sigma/2} \bigl( 1 + O(x^{1-\sigma}) \bigr) - i a x^{\sigma/2} \bigl( 1 + O(x) \bigr),
\]
\[
M_2 = \frac{i\sigma}{2} \bigl( 1 + O(x^{1-\sigma}) \bigr).
\]
If we substitute them in the formula (58) we find a Painlevé transcendent with the behavior $y(x) \sim a x^{1-\sigma}$ for x → 0, as we already explained in the Introduction.

Acknowledgements

I am grateful to B. Dubrovin for introducing me to the theory of Frobenius manifolds and for many discussions and advice. The author is supported by a fellowship of the Japan Society for the Promotion of Science (JSPS).

References

1. Anosov, D. V. and Bolibruch, A. A.: The Riemann–Hilbert Problem, Publ. Steklov Institute Math., 1994.

2. Balser, W., Jurkat, W. B. and Lutz, D. A.: Birkhoff invariants and Stokes' multipliers for meromorphic linear differential equations, J. Math. Anal. Appl. 71 (1979), 48–94.

3. Balser, W., Jurkat, W. B. and Lutz, D. A.: On the reduction of connection problems for differential equations with an irregular singular point to ones with only regular singularities, SIAM J. Math. Anal. 12 (1981), 691–721.

4. Bertola, M.: Jacobi groups, Hurwitz spaces and Frobenius structures, Preprint SISSA 74/98/FM, 1998, to appear in Differential Geom. Appl.

5. Birman, J. S.: Braids, Links, and Mapping Class Groups, Ann. Math. Stud. 82, Princeton Univ. Press, 1975.

6. Coxeter, H. S. M.: Regular Polytopes, Dover, New York, 1963.

7. Di Francesco, P. and Itzykson, C.: Quantum intersection rings, In: R. Dijkgraaf, C. Faber and G. B. M. van der Geer (eds), The Moduli Space of Curves, 1995.

8. Dijkgraaf, R., Verlinde, E. and Verlinde, H.: Topological strings in d < 1, Nuclear Phys. B 352 (1991), 59–86.

9. Dubrovin, B.: Integrable systems in topological field theory, Nuclear Phys. B 379 (1992), 627–689.

10. Dubrovin, B.: Geometry and integrability of topological-antitopological fusion, Comm. Math. Phys. 152 (1993), 539–564.

11. Dubrovin, B.: Geometry of 2D topological field theories, In: Lecture Notes in Math. 1620, Springer, New York, 1996, pp. 120–348.

12. Dubrovin, B.: Painlevé transcendents in two-dimensional topological field theory, In: R. Conte (ed.), The Painlevé Property, One Century Later, Springer, New York, 1999.

13. Dubrovin, B.: Geometry and analytic theory of Frobenius manifolds, math.AG/9807034, 1998.

14. Dubrovin, B.: Differential geometry on the space of orbits of a Coxeter group, math.AG/9807034, 1998.

15. Dubrovin, B. and Mazzocco, M.: Monodromy of certain Painlevé-VI transcendents and reflection groups, Invent. Math. 141 (2000), 55–147.

16. Gambier, B.: Sur des équations différentielles du second ordre et du premier degré dont l'intégrale est à points critiques fixes, Acta Math. 33 (1910), 1–55.

17. Guzzetti, D.: Stokes matrices and monodromy for the quantum cohomology of projective spaces, Comm. Math. Phys. 207 (1999), 341–383. Also see the preprint math/9904099.

18. Its, A. R. and Novokshenov, V. Y.: The Isomonodromic Deformation Method in the Theory of Painlevé Equations, Lecture Notes in Math. 1191, Springer, New York, 1986.

19. Iwasaki, K., Kimura, H., Shimomura, S. and Yoshida, M.: From Gauss to Painlevé, Aspects Math. 16, Vieweg, Braunschweig, 1991.

20. Jimbo, M.: Monodromy problem and the boundary condition for some Painlevé transcendents, Publ. RIMS, Kyoto Univ. 18 (1982), 1137–1161.

21. Jimbo, M., Miwa, T. and Ueno, K.: Monodromy preserving deformations of linear ordinary differential equations with rational coefficients (I), Physica D 2 (1981), 306.

22. Jimbo, M. and Miwa, T.: Monodromy preserving deformations of linear ordinary differential equations with rational coefficients (II), Physica D 2 (1981), 407–448.

23. Jimbo, M. and Miwa, T.: Monodromy preserving deformations of linear ordinary differential equations with rational coefficients (III), Physica D 4 (1981), 26.

24. Kontsevich, M. and Manin, Y. I.: Gromov–Witten classes, quantum cohomology and enumerative geometry, Comm. Math. Phys. 164 (1994), 525–562.

25. Malgrange, B.: Équations différentielles à coefficients polynomiaux, Birkhäuser, Basel, 1991.

26. Manin, Yu. I.: Frobenius Manifolds, Quantum Cohomology and Moduli Spaces, Max Planck Institut für Mathematik, Bonn, Germany, 1998.

27. Miwa, T.: Painlevé property of monodromy preserving equations and the analyticity of τ-functions, Publ. RIMS 17 (1981), 703–721.

28. Painlevé, P.: Sur les équations différentielles du second ordre et d'ordre supérieur, dont l'intégrale générale est uniforme, Acta Math. 25 (1900), 1–86.

29. Ruan, Y. and Tian, G.: A mathematical theory of quantum cohomology, Math. Res. Lett. 1 (1994), 269–278.

30. Saito, K.: Preprint RIMS-288, 1979 and Publ. RIMS 19 (1983), 1231–1264.

31. Saito, K., Yano, T. and Sekiguchi, J.: Comm. Algebra 8(4) (1980), 373–408.

32. Sato, M., Miwa, T. and Jimbo, M.: Holonomic quantum fields. II – The Riemann–Hilbert problem, Publ. RIMS Kyoto Univ. 15 (1979), 201–278.

33. Shimomura, S.: Painlevé transcendents in the neighbourhood of fixed singular points, Funkcial. Ekvac. 25 (1982), 163–184. Series expansions of Painlevé transcendents in the neighbourhood of a fixed singular point, Funkcial. Ekvac. 25 (1982), 185–197. Supplement to 'Series expansions of Painlevé transcendents in the neighbourhood of a fixed singular point', Funkcial. Ekvac. 25 (1982), 363–371. A family of solutions of a nonlinear ordinary differential equation and its application to Painlevé equations (III), (V), (VI), J. Math. Soc. Japan 39 (1987), 649–662.

34. Witten, E.: Nuclear Phys. B 340 (1990), 281–332.



On the Critical Behavior, the Connection Problem and the Elliptic Representation of a Painlevé VI Equation

DAVIDE GUZZETTI
Research Institute for Mathematical Sciences (RIMS), Kyoto University, Kitashirakawa, Sakyo-ku, Kyoto 606-8502, Japan. e-mail: [email protected]

(Received: 4 May 2001, in final form: 20 November 2001)

Abstract. In this paper we find a class of solutions of the sixth Painlevé equation appearing in the theory of WDVV equations. This class covers almost all the monodromy data associated to the equation, except one point in the space of the data. We describe the critical behavior close to the critical points in terms of two parameters and we find the relation among the parameters at the different critical points (connection problem). We also study the critical behavior of Painlevé transcendents in the elliptic representation.

Mathematics Subject Classification (2000): 34M55.

Key words: Painlevé equation, elliptic function, isomonodromic deformation, Fuchsian system,connection problem, monodromy.

1. Introduction

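This work is devoted to the study of the critical behavior of the solutions of a Painlevé VI equation given by a particular choice of the four parameters α, β, γ, δ of the equation (the notations are the standard ones, see [18]):
\[
\alpha = \frac{(2\mu - 1)^2}{2}, \qquad \beta = \gamma = 0, \qquad \delta = \frac12, \qquad \mu \in \mathbf{C}.
\]
The equation is
\[
\frac{d^2 y}{dx^2} = \frac12 \left[ \frac{1}{y} + \frac{1}{y-1} + \frac{1}{y-x} \right] \left( \frac{dy}{dx} \right)^2
- \left[ \frac{1}{x} + \frac{1}{x-1} + \frac{1}{y-x} \right] \frac{dy}{dx}
+ \frac12\, \frac{y(y-1)(y-x)}{x^2 (x-1)^2} \left[ (2\mu - 1)^2 + \frac{x(x-1)}{(y-x)^2} \right], \qquad \mu \in \mathbf{C}. \tag{1}
\]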

Such an equation will be denoted PVI$_\mu$ in the paper. The motivation of our work is that (1) is equivalent to the WDVV equations of associativity in two-dimensional topological field theory introduced by Witten [39], Dijkgraaf, E. Verlinde and H. Verlinde [6]. Such an equivalence is discussed in [9] and it is a consequence of the theory of Frobenius manifolds. Frobenius manifolds are the geometrical setting for the WDVV equations and were introduced by Dubrovin in [7]. They are an important object in many branches of mathematics like singularity theory and reflection groups [9, 12, 34, 35], algebraic and enumerative geometry [24, 26].

The six classical Painlevé equations were discovered by Painlevé [31] and Gambier [15], who classified all the second-order ordinary differential equations of the type

\[
\frac{d^2y}{dx^2} = R\left(x, y, \frac{dy}{dx}\right),
\]

where R is rational in dy/dx, x and y. The Painlevé equations satisfy the Painlevé property of absence of movable branch points and essential singularities. These singularities will be called critical points; for PVI$_\mu$ they are 0, 1, ∞. The behavior of a solution close to a critical point is called critical behavior. The general solution of the sixth Painlevé equation can be analytically continued to a meromorphic function on the universal covering of $\mathbf{P}^1\setminus\{0,1,\infty\}$. For generic values of the integration constants and of the parameters in the equation, it cannot be expressed via elementary or classical transcendental functions. For this reason, it is called a Painlevé transcendent.

The critical behavior for a class of solutions to the Painlevé VI equation was found by Jimbo in [20] for the general Painlevé equation with generic values of α, β, γ, δ (we refer to [20] for a precise definition of generic). A transcendent in this class has the behavior

\[
y(x) = a^{(0)} x^{1-\sigma^{(0)}}\bigl(1 + O(|x|^{\delta})\bigr), \qquad x \to 0, \tag{2}
\]

\[
y(x) = 1 - a^{(1)}(1-x)^{1-\sigma^{(1)}}\bigl(1 + O(|1-x|^{\delta})\bigr), \qquad x \to 1, \tag{3}
\]

\[
y(x) = a^{(\infty)} x^{-\sigma^{(\infty)}}\bigl(1 + O(|x|^{-\delta})\bigr), \qquad x \to \infty, \tag{4}
\]

where δ is a small positive number, $a^{(i)}$ and $\sigma^{(i)}$ are complex numbers such that $a^{(i)} \neq 0$ and

\[
0 \le \Re\sigma^{(i)} < 1. \tag{5}
\]

We remark that x converges to the critical points inside a sector with vertex on the corresponding critical point.

The connection problem, i.e. the problem of finding the relation among the three pairs $(\sigma^{(i)}, a^{(i)})$, i = 0, 1, ∞, was solved by Jimbo in [20] for the above class of transcendents using the isomonodromy deformations theory. He considered a Fuchsian system

\[
\frac{dY}{dz} = \left[\frac{A_0(x)}{z} + \frac{A_x(x)}{z-x} + \frac{A_1(x)}{z-1}\right] Y
\]

such that the 2 × 2 matrices $A_i(x)$ (i = 0, x, 1 are labels) satisfy Schlesinger equations. This ensures that the dependence on x is isomonodromic, according to the isomonodromic deformation theory developed in [21]. Moreover, for a special choice of the matrices, the Schlesinger equations are equivalent to the sixth Painlevé equation, as is explained in [22]. In particular, the local behaviors (2), (3), (4) were obtained using a result on the asymptotic behavior of a class of solutions of Schlesinger equations proved by Sato, Miwa and Jimbo in [33]. The connection problem was solved because the parameters $\sigma^{(i)}$, $a^{(i)}$ were expressed as functions of the monodromy data of the Fuchsian system. For studies on the asymptotic behavior of the coefficients of Fuchsian systems and Schlesinger equations see also [5].

Later, Dubrovin and Mazzocco [13] applied Jimbo’s procedure to PVI$_\mu$, with the restriction that $2\mu \notin \mathbf{Z}$. We remark that this case was not studied by Jimbo, being a nongeneric case. Dubrovin and Mazzocco obtained a class of transcendents with behaviors (2), (3), (4) (again, x converges to a critical point inside a sector) and restriction (5). They also solved the connection problem.

In the case of PVI$_\mu$, the monodromy data of the Fuchsian system, to be introduced later, turn out to be expressed in terms of a triple of complex numbers $(x_0, x_1, x_\infty)$. The two integration constants in y(x) and the parameter µ are contained in the triple. The following relation holds:

\[
x_0^2 + x_1^2 + x_\infty^2 - x_0 x_1 x_\infty = 4\sin^2(\pi\mu). \tag{6}
\]

There exists a one-to-one correspondence between triples (defined up to the change of two signs) and branches of the Painlevé transcendents.* In other words, any branch y(x) is parameterized by a triple: $y(x) = y(x; x_0, x_1, x_\infty)$.

As is proved in [13], the transcendents (2), (3), (4) are parameterized by a triple according to the formulae

\[
x_i^2 = 4\sin^2\left(\frac{\pi}{2}\sigma^{(i)}\right), \qquad i = 0, 1, \infty, \qquad 0 \le \Re\sigma^{(i)} < 1.
\]

A more complicated expression gives $a^{(i)} = a^{(i)}(x_0, x_1, x_\infty)$ in [13]. We recall that a branch is defined by the choice of branch cuts, like $|\arg(x)| < \pi$, $|\arg(1-x)| < \pi$. The analytic continuation of a branch when x crosses the cuts is obtained by an action of the braid group on the triple. This is explained in [13] and in Section 6.

As we mentioned above, it is very important to concentrate on PVI$_\mu$ due to its equivalence to WDVV equations in 2-D topological field theory, and due to its central role in the construction of three-dimensional Frobenius manifolds. It is known [9] that the structure of a local chart of a Frobenius manifold can in principle be constructed from a set of monodromy data. To any manifold a PVI$_\mu$ equation is associated and the monodromy data of the local chart are contained in µ and in the triple $(x_0, x_1, x_\infty)$ of a Painlevé transcendent. The mentioned action of the braid group, which gives the analytic continuation of the transcendent, allows us to pass from one local chart to another.

* There are only some exceptions to the one-to-one correspondence above, which are already treated in [28]. In order to rule them out, we require that at most one of the entries $x_i$ of the triple may be zero and that $(x_0, x_1, x_\infty) \notin \{(2, 2, 2), (-2, -2, 2), (2, -2, -2), (-2, 2, -2)\}$. Two triples which differ by the change of two signs identify the same transcendent. They are called equivalent triples. The one-to-one correspondence is between transcendents and classes of equivalence.

The local structure of a Frobenius manifold is explicitly constructed in [17] starting from the Painlevé transcendents. In [17] it is shown that in order to obtain a local chart from its monodromy data we need to know the critical behavior of the corresponding transcendent in terms of the triple $(x_0, x_1, x_\infty)$ (note that this is equivalent to solving the connection problem).

Recently, Frobenius manifolds have become important in enumerative geometry and quantum cohomology [24, 26]. As is shown in [17], it is possible to compute Gromov–Witten invariants for the quantum cohomology of the two-dimensional projective space starting from a special PVI$_\mu$, with µ = −1. In this case the triple is $(x_0, x_1, x_\infty) = (3, 3, 3)$, as is proved in [10] and [16]. Due to the restriction $0 \le \Re\sigma^{(i)} < 1$, the formulae for the critical behavior and the connection problem obtained by Dubrovin–Mazzocco do not apply if at least one $x_i$ (i = 0, 1, ∞) is real and $|x_i| \ge 2$. Thus, they do not apply in the case of quantum cohomology, because $x_i = 3$ and $\Re\sigma^{(i)} = 1$.
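The claim that $x_i = 3$ forces $\Re\sigma^{(i)} = 1$ can be checked numerically. The following is only a sketch: the function names are ours, and it picks the principal branch of the complex arccosine to solve $\cos(\pi\sigma) = 1 - x^2/2$, which is one of the infinitely many solutions $\pm\sigma + 2n$.

```python
import cmath
import math


def sigma_from_x(x):
    """One solution sigma of cos(pi*sigma) = 1 - x**2/2 (principal arccos)."""
    return cmath.acos(complex(1 - x * x / 2)) / math.pi


def x_squared_from_sigma(sigma):
    """Evaluate 4*sin(pi*sigma/2)**2, which should reproduce x**2."""
    return 4 * cmath.sin(math.pi * sigma / 2) ** 2


if __name__ == "__main__":
    # x = 1 lies in the range covered by restriction (5); x = 3 does not.
    for x in (1.0, 3.0):
        s = sigma_from_x(x)
        print(x, s, x_squared_from_sigma(s))
```

For a real $|x| < 2$ the exponent comes out real in $[0, 1)$, while for $x = 3$ the real part sits exactly on the excluded boundary value 1 with a nonzero imaginary part.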

Therefore, the motivation of our paper becomes clear: in the attempt to extend the results of [13] to the case of quantum cohomology, we actually extended them to almost all monodromy data, namely we found the critical behavior and we solved the connection problem for all the triples satisfying

\[
x_i \neq \pm 2 \;\Longrightarrow\; \sigma^{(i)} \neq 1, \qquad i = 0, 1, \infty.
\]

In order to do this, we extended the Jimbo and Dubrovin and Mazzocco methods and we analyzed the elliptic representation of the Painlevé VI equation.

1.1. OUR RESULTS

We observe that the branch $y(x; x_0, x_1, x_\infty)$ has analytic continuation on the universal covering of $\mathbf{P}^1\setminus\{0,1,\infty\}$. We still denote this continuation by $y(x; x_0, x_1, x_\infty)$, where x is now a point in the universal covering. Therefore:

There is a one-to-one correspondence between triples of monodromy data $(x_0, x_1, x_\infty)$ (defined up to the change of two signs) and Painlevé transcendents, namely $y(x) = y(x; x_0, x_1, x_\infty)$, $x \in \widetilde{\mathbf{P}^1\setminus\{0,1,\infty\}}$.

We mentioned that if we fix a branch, namely if we choose branch cuts like $|\arg x| < \pi$, $|\arg(1-x)| < \pi$, then the branch of $y(x; x_0, x_1, x_\infty)$ has analytic continuation $y(x; x_0', x_1', x_\infty')$ in the cut plane, where $(x_0', x_1', x_\infty')$ is obtained from $(x_0, x_1, x_\infty)$ by an action of the braid group (see Section 6 for details).

We obtained the following results:

(1) A transcendent $y(x; x_0, x_1, x_\infty)$ such that $|x_i| \neq 2$ has behaviors (2), (3), (4) in suitable domains, to be defined below, contained in $\mathbf{C}\setminus\{0\}$, $\mathbf{C}\setminus\{1\}$, $\mathbf{P}^1\setminus\{\infty\}$ respectively. The exponents are restricted by the condition $\sigma^{(i)} \notin (-\infty, 0) \cup [1, +\infty)$, which extends (5).

(2) The parameters $\sigma^{(i)}$, $a^{(i)}$ are computed as functions of $(x_0, x_1, x_\infty)$, and vice versa, by explicit formulae which extend those of [13].

(3) If we enlarge the domains where (2), (3), (4) hold, the behavior of $y(x; x_0, x_1, x_\infty)$ becomes oscillatory. The movable poles of the transcendent lie outside the enlarged domains. In proving this, we investigated the elliptic representation of the transcendent, providing a general result stated in Theorem 3 below.

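We state result (1) in more detail. Let $\sigma^{(0)}$ be a complex number such that $\sigma^{(0)} \notin (-\infty, 0) \cup [1, +\infty)$. We introduce additional parameters $\theta_1, \theta_2 \in \mathbf{R}$, $0 < \sigma < 1$ to define a domain

\[
D(\epsilon; \sigma^{(0)}; \theta_1, \theta_2, \sigma) := \left\{ x \in \mathbf{C}_0 \ \text{s.t.}\ |x| < \epsilon,\ \ e^{-\theta_1 \Im\sigma^{(0)}} |x|^{\sigma} \le \bigl|x^{\sigma^{(0)}}\bigr| \le e^{-\theta_2 \Im\sigma^{(0)}},\ \ 0 < \sigma < 1 \right\},
\]

which can be rewritten as

\[
|x| < \epsilon, \qquad
\Re\sigma^{(0)} \log|x| + \theta_2 \Im\sigma^{(0)} \le \Im\sigma^{(0)} \arg(x) \le (\Re\sigma^{(0)} - \sigma)\log|x| + \theta_1 \Im\sigma^{(0)}.
\]

For real $\sigma^{(0)}$, the domain is more simply defined as

\[
D(\sigma^{(0)}; \epsilon) := \{ x \in \mathbf{C}_0 \ \text{s.t.}\ |x| < \epsilon \}, \qquad \text{for } 0 \le \sigma^{(0)} < 1. \tag{7}
\]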

For simplicity, we study the critical behavior of the transcendent for x → 0 along the family of paths defined below. Such paths start at some point $x_0$ belonging to the domain. If $\Im\sigma^{(0)} = 0$, any regular path will be allowed. If $\Im\sigma^{(0)} \neq 0$, we considered the family

\[
|x| \le |x_0| < \epsilon, \qquad
\arg x = \arg x_0 + \frac{\Re\sigma^{(0)} - \Sigma}{\Im\sigma^{(0)}} \ln\frac{|x|}{|x_0|}, \qquad 0 \le \Sigma \le \sigma. \tag{8}
\]

The condition $0 \le \Sigma \le \sigma$ ensures that the paths remain in the domain as x → 0. In general, these paths are spirals.
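Along a path of the family (8), the modulus $|x^{\sigma^{(0)}}|$ is a constant multiple of $|x|^{\Sigma}$, which is why the bound $0 \le \Sigma \le \sigma$ keeps the path inside the domain. The sketch below illustrates this numerically. The helper names are hypothetical, and a point of the universal covering is represented by its logarithm $\ln|x| + i\arg x$ so that the argument can wind freely.

```python
import cmath
import math


def log_on_path(log_x0, sigma0, Sigma, radius):
    """Logarithm ln|x| + i*arg(x) of the point of modulus `radius` on path (8).

    log_x0 is the logarithm of the starting point x0; Im(sigma0) must be nonzero.
    """
    ln_ratio = math.log(radius) - log_x0.real
    arg = log_x0.imag + (sigma0.real - Sigma) / sigma0.imag * ln_ratio
    return complex(math.log(radius), arg)


def abs_power(log_x, sigma0):
    """|x**sigma0| computed on the covering as |exp(sigma0 * log x)|."""
    return abs(cmath.exp(sigma0 * log_x))


def path_invariant(log_x, sigma0, Sigma):
    """Ratio |x**sigma0| / |x|**Sigma, expected to be constant along (8)."""
    return abs_power(log_x, sigma0) / math.exp(Sigma * log_x.real)


if __name__ == "__main__":
    sigma0, Sigma = complex(1.3, 0.7), 0.4
    log_x0 = complex(math.log(1e-2), 0.5)
    for r in (1e-2, 1e-3, 1e-6):
        lx = log_on_path(log_x0, sigma0, Sigma, r)
        print(r, lx.imag, path_invariant(lx, sigma0, Sigma))
```

The printed argument grows without bound as the radius shrinks, which is the spiral shape mentioned above, while the last column stays fixed.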

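THEOREM 1. Let $\mu \neq 0$. For any $\sigma^{(0)} \notin (-\infty, 0) \cup [1, +\infty)$, for any $a^{(0)} \in \mathbf{C}$, $a^{(0)} \neq 0$, for any $\theta_1, \theta_2 \in \mathbf{R}$ and for any $0 < \sigma < 1$, there exists a sufficiently small positive ε and a small positive number δ such that Equation (1) has a solution

\[
y(x; \sigma^{(0)}, a^{(0)}) = a(x)\, x^{1-\sigma^{(0)}}\bigl(1 + O(|x|^{\delta})\bigr), \qquad 0 < \delta < 1, \tag{9}
\]

as x → 0 along (8) in the domain $D(\epsilon; \sigma^{(0)}; \theta_1, \theta_2, \sigma)$ defined for nonreal $\sigma^{(0)}$, or along any regular path in $D(\epsilon; \sigma^{(0)})$ defined for real $0 \le \sigma^{(0)} < 1$. The amplitude a(x) is

\[
a(x) := a^{(0)}, \qquad \text{for } 0 < \Sigma \le \sigma, \text{ or for real } \sigma^{(0)},
\]

\[
a(x) := a^{(0)}\left(1 + \frac{1}{2a^{(0)}}\bigl|x_0^{\sigma^{(0)}}\bigr| e^{i\alpha(x)} + \frac{1}{16[a^{(0)}]^2}\bigl|x_0^{\sigma^{(0)}}\bigr|^2 e^{2i\alpha(x)}\right) = O(1), \qquad \text{for } \Sigma = 0, \tag{10}
\]

where we have used the notation α(x) to denote the real phase of $x^{\sigma^{(0)}} = |x^{\sigma^{(0)}}| e^{i\alpha(x)} \equiv |x_0^{\sigma^{(0)}}| e^{i\alpha(x)}$, when Σ = 0.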

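Note that in the case (10), we can rewrite the solution as

\[
y(x; \sigma^{(0)}, a^{(0)}) = \sin^2\left(\frac{i\sigma^{(0)}}{2}\ln x - \frac{i}{2}\ln\bigl(4a^{(0)}\bigr) - \frac{\pi}{2}\right) x\,\bigl(1 + O(|x|^{\delta})\bigr). \tag{11}
\]

For brevity, we will sometimes denote the domain by $D(\sigma^{(0)})$. The condition $\mu \neq 0$ is not restrictive because PVI$_{\mu=0}$ coincides with PVI$_{\mu=1}$.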
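The passage from (10) to (11) is an exact algebraic identity between the leading terms: $a x^{1-\sigma}\bigl(1 + x^{\sigma}/(2a) + x^{2\sigma}/(16a^2)\bigr)$ equals $x \sin^2(\cdot)$ with the argument of (11). The sketch below verifies this numerically; it writes $x^{\sigma}$ directly in place of $|x_0^{\sigma}| e^{i\alpha(x)}$, drops the $O(|x|^{\delta})$ factor, and uses function names of our own choosing.

```python
import cmath
import math


def y_power_form(log_x, sigma, a):
    """Leading term of (9) with the amplitude (10), x given by its logarithm."""
    x = cmath.exp(log_x)
    x_sigma = cmath.exp(sigma * log_x)
    amplitude = a * (1 + x_sigma / (2 * a) + x_sigma ** 2 / (16 * a * a))
    return amplitude * x / x_sigma


def y_sine_form(log_x, sigma, a):
    """Leading term of (11): x * sin^2(i*sigma/2*ln x - i/2*ln(4a) - pi/2)."""
    z = 0.5j * sigma * log_x - 0.5j * cmath.log(4 * a) - math.pi / 2
    return cmath.sin(z) ** 2 * cmath.exp(log_x)


if __name__ == "__main__":
    log_x = complex(math.log(0.05), 1.2)
    sigma, a = complex(1.0, 0.5), complex(0.3, -0.2)
    print(y_power_form(log_x, sigma, a))
    print(y_sine_form(log_x, sigma, a))
```

The two printed values agree to rounding error for any nonzero a, independently of the branch chosen for $\ln(4a)$, because only $e^{\ln(4a)}$ enters after expanding the sine.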

From Theorem 1 and the symmetries of (1), we prove the existence of solutions with the following local behaviors:

\[
y(x; \sigma^{(1)}, a^{(1)}) = 1 - a^{(1)}(1-x)^{1-\sigma^{(1)}}\bigl(1 + O(|1-x|^{\delta})\bigr), \qquad x \to 1,
\]
\[
a^{(1)} \neq 0, \qquad \sigma^{(1)} \notin (-\infty, 0) \cup [1, +\infty)
\]

and

\[
y(x; \sigma^{(\infty)}, a^{(\infty)}) = a^{(\infty)} x^{\sigma^{(\infty)}}\left(1 + O\left(\frac{1}{|x|^{\delta}}\right)\right), \qquad x \to \infty,
\]
\[
a^{(\infty)} \neq 0, \qquad \sigma^{(\infty)} \notin (-\infty, 0) \cup [1, +\infty)
\]

in domains $D(\sigma^{(1)})$, $D(\sigma^{(\infty)})$ given by (51), (48), respectively.

The critical behaviors above coincide with (2), (3) and (4) for $0 \le \Re\sigma^{(i)} < 1$, i = 0, 1, ∞. But our result is more general because it extends the range of $\sigma^{(i)}$ to $\Re\sigma^{(i)} < 0$ and $\Re\sigma^{(i)} \ge 1$. For this larger range, x may tend to x = i (i = 0, 1, ∞) along a spiral, according to the shape of $D(\sigma^{(i)})$. For more comments, see Sections 3 and 7.

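Result (2) is stated in the theorem below (where we write σ, a instead of $\sigma^{(0)}$, $a^{(0)}$) and in its comment.

THEOREM 2. Let µ be any nonzero complex number. The transcendent y(x; σ, a) of Theorem 1, defined for $\sigma \notin (-\infty, 0) \cup [1, +\infty)$ and $a \neq 0$, is the representation of a transcendent $y(x; x_0, x_1, x_\infty)$ in D(σ). The triple $(x_0, x_1, x_\infty)$ is uniquely determined (up to the change of two signs) by the following formulae:

(i) $\sigma \neq 0, \pm 2\mu + 2m$ for any $m \in \mathbf{Z}$.

\[
x_0 = 2\sin\left(\frac{\pi}{2}\sigma\right),
\]
\[
x_1 = i\left(\frac{1}{f(\sigma,\mu)G(\sigma,\mu)}\sqrt{a} - G(\sigma,\mu)\frac{1}{\sqrt{a}}\right),
\]
\[
x_\infty = \frac{1}{f(\sigma,\mu)G(\sigma,\mu)e^{-\frac{i\pi\sigma}{2}}}\sqrt{a} + G(\sigma,\mu)e^{-\frac{i\pi\sigma}{2}}\frac{1}{\sqrt{a}},
\]

where

\[
f(\sigma,\mu) = \frac{2\cos^2\left(\frac{\pi}{2}\sigma\right)}{\cos(\pi\sigma) - \cos(2\pi\mu)}, \qquad
G(\sigma,\mu) = \frac{1}{2}\,\frac{4^{\sigma}\,\Gamma\left(\frac{\sigma+1}{2}\right)^2}{\Gamma\left(1 - \mu + \frac{\sigma}{2}\right)\Gamma\left(\mu + \frac{\sigma}{2}\right)}.
\]

Any sign of $\sqrt{a}$ is good (changing the sign of $\sqrt{a}$ is equivalent to changing the sign of both $x_1$, $x_\infty$).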

(ii) σ = 0.

\[
x_0 = 0, \qquad x_1^2 = 2\sin(\pi\mu)\sqrt{1-a}, \qquad x_\infty^2 = 2\sin(\pi\mu)\sqrt{a}.
\]

We can take any sign of the square roots.

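(iii) $\sigma = \pm 2\mu + 2m$.

(iii1) $\sigma = 2\mu + 2m$, m = 0, 1, 2, . . .

\[
x_0 = 2\sin(\pi\mu), \qquad
x_1 = -\frac{i}{2}\,\frac{16^{\mu+m}\,\Gamma\left(\mu + m + \frac{1}{2}\right)^2}{\Gamma(m+1)\,\Gamma(2\mu+m)}\,\frac{1}{\sqrt{a}}, \qquad
x_\infty = i x_1 e^{-i\pi\mu}.
\]

(iii2) $\sigma = 2\mu + 2m$, m = −1, −2, −3, . . .

\[
x_0 = 2\sin(\pi\mu), \qquad
x_1 = 2i\,\frac{\pi^2}{\cos^2(\pi\mu)}\,\frac{1}{16^{\mu+m}\,\Gamma\left(\mu + m + \frac{1}{2}\right)^2\,\Gamma(-2\mu - m + 1)\,\Gamma(-m)}\,\sqrt{a}, \qquad
x_\infty = -i x_1 e^{i\pi\mu}.
\]

(iii3) $\sigma = -2\mu + 2m$, m = 1, 2, 3, . . .

\[
x_0 = -2\sin(\pi\mu), \qquad
x_1 = -\frac{i}{2}\,\frac{16^{-\mu+m}\,\Gamma\left(-\mu + m + \frac{1}{2}\right)^2}{\Gamma(m - 2\mu + 1)\,\Gamma(m)}\,\frac{1}{\sqrt{a}}, \qquad
x_\infty = i x_1 e^{i\pi\mu}.
\]

(iii4) $\sigma = -2\mu + 2m$, m = 0, −1, −2, −3, . . .

\[
x_0 = -2\sin(\pi\mu), \qquad
x_1 = 2i\,\frac{\pi^2}{\cos^2(\pi\mu)}\,\frac{1}{16^{-\mu+m}\,\Gamma\left(-\mu + m + \frac{1}{2}\right)^2\,\Gamma(2\mu - m)\,\Gamma(1 - m)}\,\sqrt{a}, \qquad
x_\infty = -i x_1 e^{-i\pi\mu}.
\]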

In all the above formulae, the relation $x_0^2 + x_1^2 + x_\infty^2 - x_0 x_1 x_\infty = 4\sin^2(\pi\mu)$ is automatically satisfied. Note that $\sigma \neq 1$ implies $x_0 \neq \pm 2$. Changes of two signs in the triple of the formulae above are allowed.

Conversely, a transcendent $y(x; x_0, x_1, x_\infty)$, such that $x_0^2 + x_1^2 + x_\infty^2 - x_0 x_1 x_\infty = 4\sin^2(\pi\mu)$, $x_i \neq \pm 2$, has representation y(x; σ, a) in D(σ) of Theorem 1 with parameters σ and a obtained as follows:

(I) Generic case.

\[
\cos(\pi\sigma) = 1 - \frac{x_0^2}{2},
\]
\[
a = \frac{i\,G(\sigma,\mu)^2}{2\sin(\pi\sigma)}\Bigl[2\bigl(1 + e^{-i\pi\sigma}\bigr) - f(x_0, x_1, x_\infty)\bigl(x_\infty^2 + e^{-i\pi\sigma}x_1^2\bigr)\Bigr]\, f(x_0, x_1, x_\infty),
\]

where

\[
f(x_0, x_1, x_\infty) := f(\sigma(x_0), \mu) = \frac{4 - x_0^2}{2 - x_0^2 - 2\cos(2\pi\mu)} = \frac{4 - x_0^2}{x_1^2 + x_\infty^2 - x_0 x_1 x_\infty}.
\]

σ is determined up to the ambiguity $\sigma \mapsto \pm\sigma + 2n$, $n \in \mathbf{Z}$ [see remark below]. If σ is real we can only choose the solution satisfying $0 \le \sigma < 1$. Any solution σ of the first equation must satisfy the additional restriction $\sigma \neq \pm 2\mu + 2m$ for any $m \in \mathbf{Z}$, otherwise we encounter the singularities in G(σ, µ) and in f(σ, µ).

(II) $x_0 = 0$.

\[
\sigma = 0, \qquad a = \frac{x_\infty^2}{x_1^2 + x_\infty^2},
\]

provided that $x_1 \neq 0$ and $x_\infty \neq 0$, namely $\mu \notin \mathbf{Z}$.

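(III) $x_0^2 = 4\sin^2(\pi\mu)$. Then (6) implies $x_\infty^2 = -x_1^2\exp(\pm 2\pi i\mu)$. Four cases which yield the values of σ not included in (I) and (II) must be considered.

(III1) If $x_\infty^2 = -x_1^2 e^{-2\pi i\mu}$, then

\[
\sigma = 2\mu + 2m, \quad m = 0, 1, 2, \ldots, \qquad
a = -\frac{1}{4x_1^2}\,\frac{16^{2\mu+2m}\,\Gamma\left(\mu + m + \frac{1}{2}\right)^4}{\Gamma(m+1)^2\,\Gamma(2\mu+m)^2}.
\]

(III2) If $x_\infty^2 = -x_1^2 e^{2\pi i\mu}$, then

\[
\sigma = 2\mu + 2m, \quad m = -1, -2, -3, \ldots, \qquad
a = -\frac{\cos^4(\pi\mu)}{4\pi^4}\,16^{2\mu+2m}\,\Gamma\left(\mu + m + \tfrac{1}{2}\right)^4\,\Gamma(-2\mu - m + 1)^2\,\Gamma(-m)^2\,x_1^2.
\]

(III3) If $x_\infty^2 = -x_1^2 e^{2\pi i\mu}$, then

\[
\sigma = -2\mu + 2m, \quad m = 1, 2, 3, \ldots, \qquad
a = -\frac{1}{4x_1^2}\,\frac{16^{-2\mu+2m}\,\Gamma\left(-\mu + m + \frac{1}{2}\right)^4}{\Gamma(m - 2\mu + 1)^2\,\Gamma(m)^2}.
\]

(III4) If $x_\infty^2 = -x_1^2 e^{-2\pi i\mu}$, then

\[
\sigma = -2\mu + 2m, \quad m = 0, -1, -2, -3, \ldots, \qquad
a = -\frac{\cos^4(\pi\mu)}{4\pi^4}\,16^{-2\mu+2m}\,\Gamma\left(-\mu + m + \tfrac{1}{2}\right)^4\,\Gamma(2\mu - m)^2\,\Gamma(1 - m)^2\,x_1^2.
\]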

Let us restore the notation $\sigma^{(0)}$, $a^{(0)}$. At x = 1, ∞ the exponents $\sigma^{(i)}$, i = 1, ∞ are given by $\cos(\pi\sigma^{(i)}) = 1 - (x_i^2/2)$ and the coefficients $a^{(1)}$, $a^{(\infty)}$ are obtained from the formula of $a = a^{(0)}$ of Theorem 2, provided that we do the substitutions

\[
(x_0, x_1, x_\infty) \mapsto (x_1, x_0, x_0x_1 - x_\infty), \qquad \sigma^{(0)} \mapsto \sigma^{(1)}
\]

and

\[
(x_0, x_1, x_\infty) \mapsto (x_\infty, -x_1, x_0 - x_1x_\infty), \qquad \sigma^{(0)} \mapsto \sigma^{(\infty)},
\]

respectively.

This also solves the connection problem for the transcendents $y(x; \sigma^{(i)}, a^{(i)})$, because we are able to compute $(\sigma^{(i)}, a^{(i)})$ for i = 0, 1, ∞ in terms of a fixed triple $(x_0, x_1, x_\infty)$.
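The generic formulae of Theorem 2 can be exercised numerically: map (σ, a) to the triple by case (i), confirm relation (6), then recover (σ, a) by case (I). The sketch below is an illustration under restrictive assumptions of ours: σ and µ are taken real with 0 < σ < 1 so that the standard-library `math.gamma` suffices, and the function names are hypothetical. The amplitude a may be complex.

```python
import cmath
import math


def f(sigma, mu):
    """f(sigma, mu) of Theorem 2 for real sigma, mu."""
    return 2 * math.cos(math.pi * sigma / 2) ** 2 / (
        math.cos(math.pi * sigma) - math.cos(2 * math.pi * mu))


def G(sigma, mu):
    """G(sigma, mu) of Theorem 2 for real arguments away from Gamma poles."""
    return 0.5 * 4 ** sigma * math.gamma((sigma + 1) / 2) ** 2 / (
        math.gamma(1 - mu + sigma / 2) * math.gamma(mu + sigma / 2))


def triple(sigma, a, mu):
    """Case (i): (sigma, a) -> (x0, x1, x_inf)."""
    ff, g, root = f(sigma, mu), G(sigma, mu), cmath.sqrt(a)
    q = cmath.exp(-1j * math.pi * sigma / 2)
    x0 = 2 * math.sin(math.pi * sigma / 2)
    x1 = 1j * (root / (ff * g) - g / root)
    xinf = root / (ff * g * q) + g * q / root
    return x0, x1, xinf


def recover(x0, x1, xinf, mu):
    """Case (I): (x0, x1, x_inf) -> (sigma, a), real x0 with |x0| < 2."""
    sigma = math.acos(1 - x0 * x0 / 2) / math.pi
    fx = (4 - x0 * x0) / (2 - x0 * x0 - 2 * math.cos(2 * math.pi * mu))
    e = cmath.exp(-1j * math.pi * sigma)
    bracket = 2 * (1 + e) - fx * (xinf ** 2 + e * x1 ** 2)
    a = 1j * G(sigma, mu) ** 2 / (2 * math.sin(math.pi * sigma)) * bracket * fx
    return sigma, a


def relation_defect(x0, x1, xinf, mu):
    """Left side minus right side of relation (6)."""
    return (x0 ** 2 + x1 ** 2 + xinf ** 2 - x0 * x1 * xinf
            - 4 * math.sin(math.pi * mu) ** 2)


if __name__ == "__main__":
    t = triple(0.4, complex(0.7, 0.2), 0.3)
    print(t, relation_defect(*t, 0.3), recover(*t, 0.3))
```

The round trip returns the starting pair because the bracket in (I) collapses to $-2i \sin(\pi\sigma)\, a / (f G^2)$ once the expressions of case (i) are substituted.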

Remark. Let $(x_0, x_1, x_\infty)$ be given and let us compute σ and a by the formulae of Theorem 2. The equation

\[
\cos(\pi\sigma) = 1 - \frac{x_0^2}{2} \tag{12}
\]

does not determine σ uniquely. We can choose σ such that $0 \le \Re\sigma \le 1$. This convention will be assumed in the paper. Therefore, all the solutions of (12) are $\pm\sigma + 2n$, $n \in \mathbf{Z}$. If σ is real, we can only choose $0 \le \sigma < 1$. With this convention, there is a one-to-one correspondence between (σ, a) and (a class of equivalence, defined by the change of two signs, of) an admissible triple $(x_0, x_1, x_\infty)$.

We observe that $\sigma = \sigma(x_0)$ and $a = a(\sigma; x_0, x_1, x_\infty)$; namely, the transformation $\sigma \mapsto \pm\sigma + 2n$ affects a. The transcendent $y(x; x_0, x_1, x_\infty)$ has representation $y(x; \sigma(x_0), a(\sigma; x_0, x_1, x_\infty))$ in D(σ). If we choose another solution $\pm\sigma + 2n$ we again have $y(x; x_0, x_1, x_\infty) = y(x; \pm\sigma(x_0) + 2n, a(\pm\sigma(x_0) + 2n; x_0, x_1, x_\infty))$ in the new domain $D(\pm\sigma + 2n)$. Hence (and this is very important) the transcendent $y(x; x_0, x_1, x_\infty)$ has different representations and different critical behaviors in different domains. Outside the union of these domains we are not able to describe the transcendents and we believe that the movable poles lie there (we show this in one example in the paper).

According to the above remark, we restrict to the case $0 \le \Re\sigma^{(i)} \le 1$, $\sigma^{(i)} \neq 1$. So the critical behaviors of $y(x; \sigma^{(i)}, a^{(i)})$ coincide with (2), (3), (4) when $0 \le \Re\sigma^{(i)} < 1$. But for $\Re\sigma^{(i)} = 1$ the critical behaviors (2), (3), (4) hold true only if x converges to a critical point along spirals.

We finally describe the third result. In the case $\Re\sigma^{(i)} = 1$, we obtained the critical behaviors along radial paths using the elliptic representation of Painlevé transcendents. We only consider now the critical point x = 0, because the symmetries of (1), to be discussed in Section 7, yield the behavior close to the other critical points.

The elliptic representation was introduced by Fuchs in [14]:

\[
y(x) = \wp\left(\frac{u(x)}{2}; \omega_1(x), \omega_2(x)\right) + \frac{1+x}{3}.
\]

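Here u(x) solves a nonlinear second-order differential equation to be studied later and $\omega_1(x)$, $\omega_2(x)$ are two elliptic integrals, expanded for |x| < 1 in terms of hypergeometric functions:

\[
\omega_1(x) = \frac{\pi}{2}\sum_{n=0}^{\infty}\frac{\left[\left(\frac{1}{2}\right)_n\right]^2}{(n!)^2}\,x^n,
\]
\[
\omega_2(x) = -\frac{i}{2}\left\{\sum_{n=0}^{\infty}\frac{\left[\left(\frac{1}{2}\right)_n\right]^2}{(n!)^2}\,x^n \ln(x) + \sum_{n=0}^{\infty}\frac{\left[\left(\frac{1}{2}\right)_n\right]^2}{(n!)^2}\,2\left[\psi\left(n + \tfrac{1}{2}\right) - \psi(n+1)\right]x^n\right\},
\]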

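where $\psi(z) := \frac{d}{dz}\ln\Gamma(z)$. We introduce a new domain, depending on two complex numbers $\nu_1$, $\nu_2$ and on the small real number r:

\[
D(r; \nu_1, \nu_2) := \left\{ x \in \mathbf{C}_0 \ \text{such that}\ |x| < r,\ \left|\frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}\,x^{2-\nu_2}\right| < r,\ \left|\frac{e^{i\pi\nu_1}}{16^{\nu_2}}\,x^{\nu_2}\right| < r \right\}.
\]

The domain can also be written as follows:

\[
|x| < r, \qquad \Re\nu_2 \ln|x| + C_1 - \ln r < \Im\nu_2 \arg x < (\Re\nu_2 - 2)\ln|x| + C_2 + \ln r,
\]
\[
C_1 := -\left[4\ln 2\,\Re\nu_2 + \pi\,\Im\nu_1\right], \qquad C_2 := C_1 + 8\ln 2,
\]

if $\Im\nu_2 \neq 0$. If $\Im\nu_2 = 0$, the domain is simply |x| < r.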
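The two series for $\omega_1$ and $\omega_2$ given before the domain can be evaluated with elementary tools. The sketch below sums them for real 0 < x < 1 and compares the result with complete elliptic integrals computed by the arithmetic-geometric mean. The identification $\omega_1(x) = K$ with parameter x and $\omega_2(x) = iK$ with parameter 1 − x is the standard one for these hypergeometric series and is stated here as our assumption rather than taken from the text; the digamma differences use the classical closed forms at integer and half-integer arguments. All function names are hypothetical.

```python
import math


def omega_series(x, terms=400):
    """Sum the series for omega_1 and omega_2 at real 0 < x < 1."""
    coeff = 1.0      # [(1/2)_n]^2 / (n!)^2
    power = 1.0      # x**n
    odd_sum = 0.0    # sum_{k=1}^{n} 1/(2k-1)
    harmonic = 0.0   # sum_{k=1}^{n} 1/k
    plain = 0.0
    with_psi = 0.0
    for n in range(terms):
        if n > 0:
            coeff *= ((n - 0.5) / n) ** 2
            power *= x
            odd_sum += 1.0 / (2 * n - 1)
            harmonic += 1.0 / n
        # psi(n + 1/2) - psi(n + 1) = -2 ln 2 + 2*odd_sum - harmonic
        psi_diff = -2 * math.log(2) + 2 * odd_sum - harmonic
        plain += coeff * power
        with_psi += coeff * 2 * psi_diff * power
    omega1 = math.pi / 2 * plain
    omega2 = -0.5j * (plain * math.log(x) + with_psi)
    return omega1, omega2


def agm(a, b, steps=40):
    """Arithmetic-geometric mean of two positive numbers."""
    for _ in range(steps):
        a, b = (a + b) / 2, math.sqrt(a * b)
    return a


def complete_k(m):
    """Complete elliptic integral K with parameter m, via the AGM."""
    return math.pi / (2 * agm(1.0, math.sqrt(1 - m)))


if __name__ == "__main__":
    x = 0.3
    w1, w2 = omega_series(x)
    print(w1, complete_k(x))
    print(w2, 1j * complete_k(1 - x))
```

Since $2[\psi(1/2) - \psi(1)] = -\ln 16$, the leading behavior is $\omega_2 \approx -\frac{i}{2}\ln(x/16)$ as x → 0, which is where the powers of 16 in the domain $D(r; \nu_1, \nu_2)$ originate.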


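THEOREM 3. For any complex $\nu_1$, $\nu_2$ such that $\nu_2 \notin (-\infty, 0] \cup [2, +\infty)$, there exists a sufficiently small r < 1 such that PVI$_\mu$ has a solution of the form

\[
y(x) = \wp\bigl(\nu_1\omega_1(x) + \nu_2\omega_2(x) + v(x); \omega_1(x), \omega_2(x)\bigr) + \frac{1+x}{3}
\]

in the domain $D(r; \nu_1, \nu_2)$ defined above. The function v(x) is holomorphic in $D(r; \nu_1, \nu_2)$ and has convergent expansion

\[
v(x) = \sum_{n\ge 1} a_n x^n + \sum_{n\ge 0,\ m\ge 1} b_{nm}\,x^n\left(\frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}\,x^{2-\nu_2}\right)^m + \sum_{n\ge 0,\ m\ge 1} c_{nm}\,x^n\left(\frac{e^{i\pi\nu_1}}{16^{\nu_2}}\,x^{\nu_2}\right)^m, \tag{13}
\]

where $a_n$, $b_{nm}$, $c_{nm}$ are certain rational functions of $\nu_2$. Moreover, there exists a constant $M(\nu_2)$ depending on $\nu_2$ such that

\[
|v(x)| \le M(\nu_2)\left(|x| + \left|\frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}\,x^{2-\nu_2}\right| + \left|\frac{e^{i\pi\nu_1}}{16^{\nu_2}}\,x^{\nu_2}\right|\right)
\]

in $D(r; \nu_1, \nu_2)$.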

Theorem 3 allows us to compute the critical behavior. We consider a family of paths along which x may tend to zero, contained in the domain of the theorem. If $0 < \nu_2 < 2$, any regular path is allowed. If $\nu_2$ is any nonreal number, we consider the following family, starting at $x_0 \in D(r; \nu_1, \nu_2)$:

\[
|x| \le |x_0| < r, \qquad \arg(x) = \arg(x_0) + \frac{\Re\nu_2 - V}{\Im\nu_2}\ln\frac{|x|}{|x_0|}, \qquad 0 \le V \le 2. \tag{14}
\]

The restriction $0 \le V \le 2$ ensures that the paths remain in the domain as x → 0.

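COROLLARY. Consider a transcendent

\[
y(x) = \wp\bigl(\nu_1\omega_1(x) + \nu_2\omega_2(x) + v(x); \omega_1(x), \omega_2(x)\bigr) + \frac{1+x}{3}
\]

of Theorem 3. Its critical behavior for x → 0 in $D(r; \nu_1, \nu_2)$ along (14) if $\Im\nu_2 \neq 0$ and 0 < V < 2, or along any regular path if $0 < \nu_2 < 2$, is

\[
y(x) = \left[\frac{1}{2}x - \frac{1}{4}\left[\frac{e^{i\pi\nu_1}}{16^{\nu_2-1}}\right]x^{\nu_2} - \frac{1}{4}\left[\frac{e^{i\pi\nu_1}}{16^{\nu_2-1}}\right]^{-1}x^{2-\nu_2}\right]\bigl(1 + O(x^{\delta})\bigr), \tag{15}
\]

for some 0 < δ < 1. If $\Im\nu_2 \neq 0$ and V = 0 the behavior along (14) is

\[
y(x) = \frac{1}{\sin^2\left(-i\frac{\nu_2}{2}\ln x + \left[i\frac{\nu_2}{2}\ln 16 + \frac{\pi\nu_1}{2}\right] + \sum_{m=1}^{\infty} c_{0m}(\nu_2)\left[\frac{e^{i\pi\nu_1}}{16^{\nu_2}}\,x^{\nu_2}\right]^m\right)}\,\bigl(1 + O(x)\bigr).
\]

If $\Im\nu_2 \neq 0$ and V = 2, the behavior along (14) is

\[
y(x) = \frac{1}{\sin^2\left(i\frac{2-\nu_2}{2}\ln x + \left[i\frac{\nu_2-2}{2}\ln 16 + \frac{\pi\nu_1}{2}\right] + \sum_{m=1}^{\infty} b_{0m}(\nu_2)\left[\frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}\,x^{2-\nu_2}\right]^m\right)}\,\bigl(1 + O(x)\bigr).
\]

Note that (15) is

\[
y(x) = -\frac{1}{4}\left[\frac{e^{i\pi\nu_1}}{16^{\nu_2-1}}\right]x^{\nu_2}\bigl(1 + O(x^{\delta})\bigr), \qquad \text{if } 0 < V < 1, \text{ or } 0 < \nu_2 < 1, \tag{16}
\]

\[
y(x) = -\frac{1}{4}\left[\frac{e^{i\pi\nu_1}}{16^{\nu_2-1}}\right]^{-1}x^{2-\nu_2}\bigl(1 + O(x^{\delta})\bigr), \qquad \text{if } 1 < V < 2, \text{ or } 1 < \nu_2 < 2, \tag{17}
\]

and

\[
y(x) = \sin^2\left(i\frac{1-\nu_2}{2}\ln\frac{|x|}{16} + \frac{\pi\nu_1}{2}\right)x\,\bigl(1 + O(x)\bigr), \qquad \text{if } V = 1, \text{ or } \nu_2 = 1. \tag{18}
\]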

The elliptic representation has been studied from the point of view of algebraic geometry in [27] but, to our knowledge, Theorem 3 and its corollary are the first general results on its critical behavior. We note, however, that for the very special value $\mu = \frac{1}{2}$ the function v(x) vanishes; the transcendents are called Picard solutions in [28] because they were known to Picard [32]. Their critical behavior is studied in [28] and agrees with the corollary.

Comparing (9) with (16), we prove in Section 5.1 that the transcendent of Theorem 3 coincides with $y(x; \sigma^{(0)}, a^{(0)})$ of Theorem 1 on the domain $D(\epsilon, \sigma^{(0)}) \cap D(r; \nu_1, \nu_2)$ with the identification

\[
\sigma^{(0)} = 1 - \nu_2 \qquad \text{and} \qquad a^{(0)} = -\frac{1}{4}\left[\frac{e^{i\pi\nu_1}}{16^{\nu_2-1}}\right]
\]

(note also that (11) is (18)). The identification of $a^{(0)}$ and $\sigma^{(0)}$ makes it possible to connect $\nu_1$ and $\nu_2$ to the monodromy data $(x_0, x_1, x_\infty)$ according to Theorem 2 and to solve the connection problem for the elliptic representation.

The corollary provides the behavior of the transcendents when $\Re\sigma^{(0)} = 1$ ($\sigma^{(0)} \neq 1$) and x → 0 along a radial path. This corresponds to the case $\Re\nu_2 = 0$ ($\nu_2 \neq 0$), with the identification

\[
\sigma^{(0)} = 1 - \nu_2 \qquad \text{and} \qquad a^{(0)} = -\frac{1}{4}\left[\frac{e^{i\pi\nu_1}}{16^{\nu_2-1}}\right].
\]

The critical behavior along a radial path is then

\[
y(x) = \frac{1}{\sin^2\left(\frac{\nu}{2}\ln x - \nu\ln 16 + \frac{\pi\nu_1}{2} + \sum_{m=1}^{\infty} c_{0m}(\nu)\left[\left(\frac{e^{i\pi\nu_1}}{16^{i\nu}}\right)x^{i\nu}\right]^m\right)}\,\bigl(1 + O(x)\bigr), \qquad x \to 0. \tag{19}
\]

The number ν is real, $\nu \neq 0$ and $\sigma^{(0)} = 1 - i\nu$. The series $\sum_{m=1}^{\infty} c_{0m}(\nu)\left[\left(e^{i\pi\nu_1}/16^{i\nu}\right)x^{i\nu}\right]^m$ converges and defines a holomorphic and bounded function in the domain $D(r; \nu_1, i\nu)$:

\[
|x| < r, \qquad C_1 - \ln r < \nu\arg x < -2\ln|x| + C_2 + \ln r.
\]

Note that not all the values of arg x are allowed, namely $C_1 - \ln r < \nu\arg(x)$. Our belief is that y(x) may have movable poles if we extend the range of arg x. We are not able to prove it in general, but we will give an example in Section 5.

We finally remark that the critical behavior of Painlevé transcendents can also be investigated using a representation due to Shimomura [19, 37]. We will review this representation in the paper. However, the connection problem in this representation was not solved.

To summarize, in this paper we give an extended and unified picture of both elliptic and Shimomura’s representations and Dubrovin and Mazzocco’s works, showing that the transcendents obtained in these three different ways are all included in the wider class of Theorem 1. In this way we solve the connection problem for elliptic and Shimomura’s representations by virtue of Theorem 2. Finally, Theorem 3 provides the oscillatory behavior along radial paths when $\Re\sigma^{(0)} = 1$.

2. Monodromy Data and Review of Previous Results

Before giving further details about the results stated above, we review the connection between PVI$_\mu$ and the theory of isomonodromic deformations. We also give a detailed exposition of the results of [13, 28].

The equation PVI$_\mu$ is equivalent to the equations of isomonodromy deformation (Schlesinger equations) of the Fuchsian system

\[
\frac{dY}{dz} = A(z; u)\,Y, \qquad A(z; u) := \left[\frac{A_1(u)}{z - u_1} + \frac{A_2(u)}{z - u_2} + \frac{A_3(u)}{z - u_3}\right],
\]
\[
u := (u_1, u_2, u_3), \qquad \operatorname{tr}(A_i) = \det A_i = 0, \qquad \sum_{i=1}^{3} A_i = -\operatorname{diag}(\mu, -\mu). \tag{20}
\]

The dependence of the system (20) on u is isomonodromic, as is explained below. From the system we obtain a transcendent y(x) of PVI$_\mu$ as follows:

\[
x = \frac{u_2 - u_1}{u_3 - u_1}, \qquad y(x) = \frac{q(u) - u_1}{u_3 - u_1},
\]

where $q(u_1, u_2, u_3)$ is the root of

\[
[A(q; u_1, u_2, u_3)]_{12} = 0, \qquad \text{if } \mu \neq 0.
\]

The case µ = 0 is disregarded, because PVI$_{\mu=0}$ ≡ PVI$_{\mu=1}$.

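Conversely, given a transcendent y(x) the system (20) associated to it is obtained as follows. Let us define

\[
k = k(x, u_3 - u_1) := \frac{k_0 \exp\left\{(2\mu - 1)\displaystyle\int^{x} d\zeta\,\frac{y(\zeta) - \zeta}{\zeta(\zeta - 1)}\right\}}{(u_3 - u_1)^{2\mu - 1}}, \qquad k_0 \in \mathbf{C}\setminus\{0\}.
\]

We have

\[
A_i = -\mu\begin{pmatrix} \phi_{i1}\phi_{i3} & -\phi_{i3}^2 \\ \phi_{i1}^2 & -\phi_{i1}\phi_{i3} \end{pmatrix}, \qquad i = 1, 2, 3, \tag{21}
\]

where

\[
\phi_{13} = i\,\frac{\sqrt{k}\,\sqrt{y}}{\sqrt{u_3 - u_1}\,\sqrt{x}}, \qquad
\phi_{23} = -\frac{\sqrt{k}\,\sqrt{y - x}}{\sqrt{u_3 - u_1}\,\sqrt{x}\,\sqrt{1 - x}}, \qquad
\phi_{33} = i\,\frac{\sqrt{k}\,\sqrt{y - 1}}{\sqrt{u_3 - u_1}\,\sqrt{1 - x}},
\]
\[
\phi_{11} = \frac{i}{2\mu^2}\,\frac{\sqrt{u_3 - u_1}\,\sqrt{y}}{\sqrt{k(x)}\,\sqrt{x}}\left[A\left(B + \frac{2\mu}{y}\right) + \mu^2(y - 1 - x)\right],
\]
\[
\phi_{21} = -\frac{1}{2\mu^2}\,\frac{\sqrt{u_3 - u_1}\,\sqrt{y - x}}{\sqrt{k(x)}\,\sqrt{x}\,\sqrt{1 - x}}\left[A\left(B + \frac{2\mu}{y - x}\right) + \mu^2(y - 1 + x)\right],
\]
\[
\phi_{31} = \frac{i}{2\mu^2}\,\frac{\sqrt{u_3 - u_1}\,\sqrt{y - 1}}{\sqrt{k(x)}\,\sqrt{1 - x}}\left[A\left(B + \frac{2\mu}{y - 1}\right) + \mu^2(y + 1 - x)\right],
\]
\[
A = A(x) := \frac{1}{2}\left[\frac{dy}{dx}\,x(x - 1) - y(y - 1)\right], \qquad
B = B(x) := \frac{A}{y(y - 1)(y - x)}.
\]

Any branch of the square roots can be chosen. For a derivation of the above formulae, see [9, 17, 22].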
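The shape of the matrices (21) is what guarantees the conditions $\operatorname{tr}(A_i) = \det A_i = 0$ required in (20): a matrix of this form is nilpotent for any values of $\phi_{i1}$, $\phi_{i3}$. The following sketch checks this on arbitrary complex numbers; the helper name is hypothetical and the (2,2) entry is taken with the minus sign that the trace condition requires.

```python
def residue_matrix(mu, phi1, phi3):
    """A_i = -mu * [[phi1*phi3, -phi3**2], [phi1**2, -phi1*phi3]] as nested lists."""
    return [[-mu * phi1 * phi3, mu * phi3 * phi3],
            [-mu * phi1 * phi1, mu * phi1 * phi3]]


def trace(m):
    return m[0][0] + m[1][1]


def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def square(m):
    """Matrix product m * m for a 2 x 2 matrix."""
    return [[sum(m[i][k] * m[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


if __name__ == "__main__":
    A = residue_matrix(0.3 + 0.1j, 1.2 - 0.4j, -0.7 + 2.0j)
    print(trace(A), det(A), square(A))
```

Nilpotency is equivalent to $A_i$ being conjugate to a multiple of the Jordan block J, which is the normal form used for the local solutions of the Fuchsian system.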

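The system (20) has Fuchsian singularities at $u_1$, $u_2$, $u_3$. Let us fix a branch Y(z, u) of a fundamental matrix solution by choosing branch cuts in the z plane and a basis of loops in $\pi(\mathbf{C}\setminus\{u_1, u_2, u_3\}; z_0)$, where $z_0$ is a base-point. Let $\gamma_i$ be a basis of loops encircling counter-clockwise the point $u_i$, i = 1, 2, 3. See Figure 1. Then

\[
Y(z, u) \mapsto Y(z, u)M_i, \qquad i = 1, 2, 3, \qquad \det M_i \neq 0,
\]

if z goes around a loop $\gamma_i$. Along the loop $\gamma_\infty := \gamma_1 \cdot \gamma_2 \cdot \gamma_3$ we have $Y \mapsto YM_\infty$, $M_\infty = M_3M_2M_1$. The 2 × 2 matrices $M_i$ are the monodromy matrices, and they give a representation of the fundamental group called monodromy representation. The transformations $Y'(z, u) = Y(z, u)B$, $\det(B) \neq 0$, yield all possible fundamental matrices, hence the monodromy matrices of (20) are defined up to conjugation $M_i \mapsto M_i' = B^{-1}M_iB$. From the standard theory of Fuchsian systems, it follows that we can choose a fundamental solution behaving as follows:

Figure 1.

\[
Y(z; u) = \begin{cases}
\left[I + O\left(\frac{1}{z}\right)\right] z^{-\hat\mu} z^{R} C_\infty, & z \to \infty, \\[4pt]
G_i\,[I + O(z - u_i)]\,(z - u_i)^{J} C_i, & z \to u_i, \quad i = 1, 2, 3,
\end{cases} \tag{22}
\]

where

\[
J = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad \hat\mu = \operatorname{diag}(\mu, -\mu), \qquad G_i J G_i^{-1} = A_i
\]

and

\[
R = \begin{cases}
0, & \text{if } 2\mu \notin \mathbf{Z}, \\[4pt]
\begin{pmatrix} 0 & R_{12} \\ 0 & 0 \end{pmatrix} \ (\mu > 0), \quad \begin{pmatrix} 0 & 0 \\ R_{21} & 0 \end{pmatrix} \ (\mu < 0), & \text{if } 2\mu \in \mathbf{Z}.
\end{cases}
\]

The entries $R_{12}$, $R_{21}$ are determined by the matrices $A_i$. Then

\[
M_i = C_i^{-1} e^{2\pi i J} C_i, \qquad M_\infty = C_\infty^{-1} e^{-2\pi i\hat\mu} e^{2\pi i R} C_\infty.
\]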

The dependence of the Fuchsian system on u is isomonodromic. This means that for small deformations of u, the monodromy matrices do not change [18, 22]. Small deformation means that $x = (u_2 - u_1)/(u_3 - u_1)$ can move in the x-plane provided it does not go around complete loops around 0, 1, ∞. If the deformation is not small, the monodromy matrices change according to an action of the pure braid group, as discussed in [13].

We consider a branch y(x) of a transcendent and we associate to it the Fuchsian system through formulae (21). A branch is fixed by the choice of branch cuts, like

\[
\alpha < \arg(x) < \alpha + 2\pi \quad \text{and} \quad \beta < \arg(1 - x) < \beta + 2\pi, \qquad \alpha, \beta \in \mathbf{R}.
\]

Therefore, the monodromy matrices of the Fuchsian system do not change as x moves in the cut plane. In other words, it is a well defined correspondence which associates a monodromy representation to a branch of a transcendent.

Conversely, the problem of finding a branch of a transcendent for given monodromy matrices (up to conjugation) is the problem of finding a Fuchsian system (20) having the given monodromy matrices. This problem is called the Riemann–Hilbert problem, or the 21st Hilbert problem. For a given PVI$_\mu$ (i.e. for a fixed µ), there is a one-to-one correspondence between a monodromy representation and a branch of a transcendent if and only if the Riemann–Hilbert problem has a unique solution.

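• Riemann–Hilbert (RH) problem: find the coefficients $A_i(u)$, i = 1, 2, 3 from the following monodromy data:

(a) the matrices

\[
\hat\mu = \operatorname{diag}(\mu, -\mu), \qquad \mu \in \mathbf{C}\setminus\{0\},
\]
\[
R = \begin{cases}
0, & \text{if } 2\mu \notin \mathbf{Z}, \\[4pt]
\begin{pmatrix} 0 & b \\ 0 & 0 \end{pmatrix} \ (\mu > 0), \quad \begin{pmatrix} 0 & 0 \\ b & 0 \end{pmatrix} \ (\mu < 0), & \text{if } 2\mu \in \mathbf{Z},
\end{cases} \qquad b \in \mathbf{C},
\]

(b) three poles $u_1$, $u_2$, $u_3$, a base-point and a base of loops in $\pi(\mathbf{C}\setminus\{u_1, u_2, u_3\}; z_0)$. See Figure 1.

(c) three monodromy matrices $M_1$, $M_2$, $M_3$ relative to the loops (counter-clockwise) and a matrix $M_\infty$ similar to $e^{-2\pi i\hat\mu}e^{2\pi i R}$, and satisfying

\[
\begin{aligned}
&\operatorname{tr}(M_i) = 2, \qquad \det(M_i) = 1, \qquad i = 1, 2, 3, \\
&M_3M_2M_1 = M_\infty, \\
&M_\infty = C_\infty^{-1} e^{-2\pi i\hat\mu} e^{2\pi i R} C_\infty,
\end{aligned} \tag{23}
\]

where $C_\infty$ realizes the similitude. We also choose the indices of the problem, namely we fix $\frac{1}{2\pi i}\log M_i$ as follows: let $J := \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$. We require there exist three connection matrices $C_1$, $C_2$, $C_3$ such that

\[
C_i^{-1} e^{2\pi i J} C_i = M_i, \qquad i = 1, 2, 3, \tag{24}
\]

and we look for a matrix-valued meromorphic function Y(z; u) such that

\[
Y(z; u) = \begin{cases}
G_\infty\left(I + O\left(\frac{1}{z}\right)\right) z^{-\hat\mu} z^{R} C_\infty, & z \to \infty, \\[4pt]
G_i\,(I + O(z - u_i))\,(z - u_i)^{J} C_i, & z \to u_i, \quad i = 1, 2, 3.
\end{cases} \tag{25}
\]

$G_\infty$ and $G_i$ are invertible matrices depending on u. The coefficients of the Fuchsian system are then given by

\[
A(z; u_1, u_2, u_3) := \frac{dY(z; u)}{dz}\,Y(z; u)^{-1}.
\]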

A 2 × 2 RH is always solvable at a fixed u [1]. As a function of $u = (u_1, u_2, u_3)$, the solution $A(z; u_1, u_2, u_3)$ extends to a meromorphic function on the universal covering of $\mathbf{C}^3\setminus\bigcup_{i\neq j}\{u \mid u_i = u_j\}$. The monodromy matrices are considered up to the conjugation

\[
M_i \mapsto M_i' = B^{-1}M_iB, \qquad \det B \neq 0, \qquad i = 1, 2, 3, \infty, \tag{26}
\]

and the coefficients of the Fuchsian system itself are considered up to conjugation $A_i \mapsto F^{-1}A_iF$ (i = 1, 2, 3) by an invertible matrix F. Actually, two conjugated Fuchsian systems admit fundamental matrix solutions with the same monodromy, and a given Fuchsian system defines the monodromy up to conjugation.

On the other hand, a triple of monodromy matrices $M_1$, $M_2$, $M_3$ may be realized by two Fuchsian systems which are not conjugated. This corresponds to the fact that the solutions $C_\infty$, $C_i$ of (23), (24) are not unique, and the choice of different particular solutions may give rise to Fuchsian systems which are not conjugated. If this is the case, there is no one-to-one correspondence between monodromy matrices (up to conjugation) and solutions of PVI$_\mu$. It is proved in [28] that

The RH has a unique solution, up to conjugation, for 2µ /∈ Z or for 2µ ∈ Zand R �= 0.�

� The proof is done in the following way: consider two solutions C and C of Equations (23)and (24). Then

(CiC−1i )−1e2πiJ (CiC

−1i ) = e2πiJ ,

(C∞C−1∞ )−1e−2πiµe2πiR(C∞C−1∞ ) = e−2πiµe2πiR.

We find

CiC−1i =

(a b

0 a

), a, b ∈ C, a �= 0.

Note that this matrix commutes with J , then

(z − ui)J Ci = (z − ui)

J

(a b

0 a

)Ci =

(a b

0 a

)(z − ui)

J Ci .

We also find

C∞C−1∞ =

(i) diag(α, β), αβ �= 0; if 2µ /∈ Z,

(ii)

(α β

0 α

)(µ > 0),

(α 0β α

)(µ < 0), α �= 0, if 2µ ∈ Z, R �= 0,

(iii) Any invertible matrix, if 2µ ∈ Z, R = 0.

(27)

310 DAVIDE GUZZETTI

Once the RH is solved, the sum of the matrix coefficients Ai(u) of the solution

A(z;u1, u2, u3) =3∑

i=1

Ai(u)

z− ui

must be diagonalized (to give − diag(µ,−µ)).�� After that, a branch y(x) of PVIµcan be computed from [A(q;u1, u2, u3)]12 = 0. The fact that the RH has a uniquesolution for the given monodromy data (if 2µ /∈ Z or 2µ ∈ Z and R �= 0) meansthat there is a one-to-one correspondence between the triple M1, M2, M3 and thebranch y(x).

We review some known results [13, 28]:(1) One Mi = I if and only if q(u) ≡ ui . This does not correspond to a solution

of PVIµ.(2) If the Mi’s, i = 1, 2, 3, commute, then µ is integer (as follows from the fact

that the 2×2 matrices with 1’s on the diagonals commute if and only if they can be

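Then
\[
\begin{aligned}
\text{(i)}\quad & z^{-\hat\mu} C_\infty = z^{-\hat\mu}\operatorname{diag}(\alpha,\beta)\,\tilde C_\infty = \operatorname{diag}(\alpha,\beta)\,z^{-\hat\mu}\,\tilde C_\infty,\\
\text{(ii)}\quad & z^{-\hat\mu}z^{-R} C_\infty = \cdots = \Bigl[\alpha I + \frac{1}{z^{|2\mu|}}\,Q\Bigr] z^{-\hat\mu}z^{-R}\,\tilde C_\infty,
\end{aligned}
\]
where
\[
Q = \begin{pmatrix} 0 & \beta\\ 0 & 0\end{pmatrix}\quad\text{or}\quad Q = \begin{pmatrix} 0 & 0\\ \beta & 0\end{pmatrix}.
\]
\[
\text{(iii)}\quad z^{-\hat\mu} C_\infty = \cdots = \Bigl[\frac{Q_1}{z^{|2\mu|}} + Q_0 + Q_{-1}\,z^{|2\mu|}\Bigr] z^{-\hat\mu}\,\tilde C_\infty,
\]
where $Q_0 = \operatorname{diag}(\alpha,\beta)$, $Q_{\pm 1}$ are, respectively, upper and lower triangular (or lower and upper triangular, depending on the sign of $\mu$), and $C_\infty\tilde C_\infty^{-1} = Q_1 + Q_0 + Q_{-1}$.

This implies that the two solutions $Y(z;u)$, $\tilde Y(z;u)$ of the form (25) with $C_\nu$ and $\tilde C_\nu$, respectively ($\nu = 1,2,3,\infty$), are such that $Y(z;u)\tilde Y(z;u)^{-1}$ is holomorphic at each $u_i$, while at $z=\infty$ it is
\[
Y(z;u)\tilde Y(z;u)^{-1} \to
\begin{cases}
\text{(i)}\ G_\infty\operatorname{diag}(\alpha,\beta)\,G_\infty^{-1},\\
\text{(ii)}\ \alpha I,\\
\text{(iii)}\ \text{divergent}.
\end{cases}
\]

Thus, the two Fuchsian systems are conjugated only in the cases (i) and (ii), because in those cases $Y\tilde Y^{-1}$ is holomorphic everywhere on $\mathbf{P}^1$, and then it is a constant. In other words, the RH has a unique solution, up to conjugation, for $2\mu\notin\mathbf{Z}$ or for $2\mu\in\mathbf{Z}$ and $R\neq 0$.

** Note that if $G_\infty = I$, then $\sum_{i=1}^{3}A_i$ is already diagonal. Therefore, there is no loss of generality if, for $2\mu\notin\mathbf{Z}$, we solve the Riemann–Hilbert problem for given $M_1$, $M_2$, $M_3$ with the choice of normalization $Y(z;u)z^{\hat\mu}\to I$ if $z\to\infty$. This uniquely determines $A_1$, $A_2$, $A_3$ up to diagonal conjugation. Note that for any diagonal invertible matrix $D$, the sum of $D^{-1}A_iD$ is still diagonal.

On the other hand, if $2\mu\in\mathbf{Z}$ and $R\neq 0$, then $Y(z;u)\tilde Y(z;u)^{-1} = \alpha$, where $\alpha$ appears in (27), case (ii). Therefore the two Fuchsian systems obtained from $A(z;u) := dY/dz\,Y^{-1}$ and $\tilde A(z;u) = d\tilde Y/dz\,\tilde Y^{-1}$ coincide.

In both cases, $A_{12}(z,u)$ changes at most for the multiplication by a constant, therefore the same $y(x)$ is defined by $A_{12}(z,u)=0$ and $\tilde A_{12}(z,u)=0$.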


simultaneously put in upper or lower triangular form). There are solutions of PVIµ only for
\[
M_1 = \begin{pmatrix} 1 & i\pi a\\ 0 & 1\end{pmatrix},\qquad
M_2 = \begin{pmatrix} 1 & i\pi\\ 0 & 1\end{pmatrix},\qquad
M_3 = \begin{pmatrix} 1 & i\pi(1-a)\\ 0 & 1\end{pmatrix},\qquad a\neq 0,1.
\]
In this case $R=0$ and $M_\infty = I$. For $\mu = 1$ the solution is $y(x) = ax/(1-(1-a)x)$ and for other integers $\mu$ the solution is obtained from $\mu=1$ by a birational transformation [13, 28].

(3) Noncommuting $M_i$'s. The parameters in the space of the monodromy representation, independent of conjugation, are
\[
2 - x_1^2 := \operatorname{tr}(M_1M_2),\qquad 2 - x_2^2 := \operatorname{tr}(M_2M_3),\qquad 2 - x_3^2 := \operatorname{tr}(M_1M_3).
\]
The triple $(x_0,x_1,x_\infty)$ in the Introduction is $(x_1,x_2,x_3)$.

(3.1) If at least two of the $x_j$'s are zero, then one of the $M_i$'s is $I$, or the matrices commute. We return to the case (1) or (2). Note that $(x_1,x_2,x_3) = (0,0,0)$ in case (2).

(3.2) At most one of the $x_j$'s is zero. We say that the triple $(x_1,x_2,x_3)$ is admissible. In this case, it is possible to fully parameterize the monodromy using the triple $(x_1,x_2,x_3)$. Namely, there exists a fundamental matrix solution such that
\[
M_1 = \begin{pmatrix} 1 & -x_1\\ 0 & 1\end{pmatrix},\qquad
M_2 = \begin{pmatrix} 1 & 0\\ x_1 & 1\end{pmatrix},\qquad
M_3 = \begin{pmatrix} 1 + \dfrac{x_2x_3}{x_1} & -\dfrac{x_2^2}{x_1}\\[8pt] \dfrac{x_3^2}{x_1} & 1 - \dfrac{x_2x_3}{x_1}\end{pmatrix},
\]
if $x_1\neq 0$. If $x_1 = 0$ we just choose a similar parameterization starting from $x_2$ or $x_3$. The relation $M_3M_2M_1$ similar to $e^{-2\pi i\hat\mu}e^{2\pi iR}$ implies
\[
x_1^2 + x_2^2 + x_3^2 - x_1x_2x_3 = 4\sin^2(\pi\mu).
\]
The conjugation (26) changes the triple by two signs. Thus, the true parameters for the monodromy data are classes of equivalence of triples $(x_1,x_2,x_3)$ defined by the change of two signs.
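The trace relations of case (3.2) can be checked with exact rational arithmetic. The following is only a minimal sketch: the helper names (`matmul`, `trace`, `monodromy_from_triple`) and the sample values of the triple are assumptions made for illustration, not part of the original text. It confirms that the three pairwise traces reproduce $2 - x_j^2$ and that $\operatorname{tr}(M_3M_2M_1) = 2 - (x_1^2+x_2^2+x_3^2-x_1x_2x_3)$, which is the cubic relation once the trace is set equal to $2\cos(2\pi\mu) = 2 - 4\sin^2(\pi\mu)$.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def trace(A):
    """Trace of a 2x2 matrix."""
    return A[0][0] + A[1][1]


def monodromy_from_triple(x1, x2, x3):
    """Matrices M1, M2, M3 of case (3.2); requires x1 != 0."""
    x1, x2, x3 = Fraction(x1), Fraction(x2), Fraction(x3)
    one, zero = Fraction(1), Fraction(0)
    M1 = ((one, -x1), (zero, one))
    M2 = ((one, zero), (x1, one))
    M3 = ((1 + x2 * x3 / x1, -x2**2 / x1),
          (x3**2 / x1, 1 - x2 * x3 / x1))
    return M1, M2, M3


# Arbitrary sample triple (hypothetical values, chosen only for the check).
x1, x2, x3 = Fraction(3), Fraction(1, 2), Fraction(-5, 7)
M1, M2, M3 = monodromy_from_triple(x1, x2, x3)
cubic = x1**2 + x2**2 + x3**2 - x1 * x2 * x3

checks = {
    "tr(M1 M2) = 2 - x1^2": trace(matmul(M1, M2)) == 2 - x1**2,
    "tr(M2 M3) = 2 - x2^2": trace(matmul(M2, M3)) == 2 - x2**2,
    "tr(M1 M3) = 2 - x3^2": trace(matmul(M1, M3)) == 2 - x3**2,
    "tr(M3 M2 M1) = 2 - cubic": trace(matmul(M3, matmul(M2, M1))) == 2 - cubic,
}
print(checks)
```

Exact fractions are used so that the equalities hold identically rather than up to rounding.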

We have to distinguish three sub-cases of (3.2):

(i) $2\mu\notin\mathbf{Z}$. There is a one-to-one correspondence between (classes of equivalence of) monodromy data $(x_0,x_1,x_\infty)\equiv(x_1,x_2,x_3)$ and the branches of transcendents of PVIµ. The connection problem was solved in [13] for the class of transcendents with critical behavior
\[
y(x) = a^{(0)}x^{1-\sigma^{(0)}}\bigl(1+O(|x|^{\delta})\bigr),\qquad x\to 0, \tag{28}
\]
\[
y(x) = 1 - a^{(1)}(1-x)^{1-\sigma^{(1)}}\bigl(1+O(|1-x|^{\delta})\bigr),\qquad x\to 1, \tag{29}
\]
\[
y(x) = a^{(\infty)}x^{-\sigma^{(\infty)}}\bigl(1+O(|x|^{-\delta})\bigr),\qquad x\to\infty, \tag{30}
\]


where $a^{(i)}$ and $\sigma^{(i)}$ are complex numbers such that $a^{(i)}\neq 0$ and $0\le\Re\sigma^{(i)}<1$. $\delta$ is a small positive number. This behavior is true if $x$ converges to the critical points inside a sector in the $x$-plane with vertex on the corresponding critical point and finite angular width. In [13], all the algebraic solutions are classified and related to the finite reflection groups $A_3$, $B_3$, $H_3$.

(ii) The case $\mu$ half integer was studied in [28]. There is an infinite set of Picard type solutions in one-to-one correspondence to triples of monodromy data not in the equivalence class of $(2,2,2)$. These solutions form a two parameter family, behave asymptotically as the solutions of the case (i), and comprise a denumerable subclass of algebraic solutions. In this case, $R\neq 0$. For any half integer $\mu\neq\frac12$, there is also a one-parameter family of Chazy solutions. In this case, $R=0$ and the one-to-one correspondence with monodromy data is lost. In fact, they form an infinite family but any element of the family corresponds to the class of equivalence of the triple $(2,2,2)$. The result of our paper applies to Picard's solutions with $x_i\neq\pm 2$.

(iii) $\mu$ integer. In this case, $R\neq 0$.* There is a one-to-one correspondence between monodromy data and branches. To our knowledge, this case has not been studied before. There are relevant examples of Frobenius manifolds included in this case, like the case of quantum cohomology of $\mathbf{CP}^2$. For this manifold, $\mu=-1$, the triple $(x_1,x_2,x_3) = (3,3,3)$ (see [10, 16]) and the real part of $\sigma$ is equal to 1.

In this paper, we find the critical behavior and we solve the connection problem for any $\mu\neq 0$ and for all the triples $(x_1,x_2,x_3)$ except for the points $x_i = \pm 2 \Longrightarrow \sigma^{(i)} = 1$, $i = 0,1,\infty$.

3. Critical Behavior – Theorem 1

Theorem 1 has been stated in the Introduction and will be proved in Section 8. Here we give some comments about the domain $D(\sigma)$. The superscript of $\sigma^{(i)}$, $a^{(i)}$ will be omitted in this section and we concentrate on a small punctured neighborhood of $x=0$ ($x = 1,\infty$ will be treated in Section 7). The point $x$ can be read as a point in the universal covering of $\mathbf{C}_0 := \mathbf{C}\setminus\{0\}$ with $0<|x|<\epsilon$ ($\epsilon<1$). Namely, $x = |x|e^{i\arg(x)}$, where $-\infty<\arg(x)<+\infty$. Let $\sigma$ be such that $\sigma\notin(-\infty,0)\cup[1,+\infty)$. We defined the domains $D(\epsilon;\sigma;\theta_1,\theta_2,\tilde\sigma)$, or $D(\sigma;\epsilon)$ for real $\sigma$, in the Introduction. Theorem 1 holds in these domains; the small number $\epsilon$ depends on $\tilde\sigma$, $\theta_1$ and $a$. In the following, we may sometimes omit $\epsilon$, $\tilde\sigma$, $\theta_i$ and write simply $D(\sigma)$.

We observe that $|x^{\sigma}| = |x|^{\sigma'(x)}$, where
\[
\sigma'(x) := \Re\sigma - \frac{\Im\sigma\,\arg(x)}{\log|x|}.
\]

* ($R = 0$ only in the case (2) of commuting monodromy matrices and $\mu$ integer.)


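In particular, we have $\sigma'(x) = \sigma$ for real $\sigma$. The exponent $\sigma'(x)$ satisfies the restriction $0\le\sigma'(x)<1$ for $x\to 0$, if $x$ lies in the domain, because
\[
-\frac{\theta_2\Im\sigma}{\ln|x|} \le \sigma'(x) \le \tilde\sigma - \frac{\theta_1\Im\sigma}{\ln|x|},
\]
and
\[
-\frac{\theta_2\Im\sigma}{\ln|x|}\to 0,\qquad \Bigl(\tilde\sigma - \frac{\theta_1\Im\sigma}{\ln|x|}\Bigr)\to\tilde\sigma<1
\]
for $x\to 0$. Figure 2 shows the domains in the $(\ln|x|,\Im\sigma\arg x)$-plane (in the $(\ln|x|,\arg x)$-plane if $\Im\sigma = 0$).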
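The identity $|x^{\sigma}| = |x|^{\sigma'(x)}$ and the two-sided bound on $\sigma'(x)$ are easy to test numerically. The sketch below is illustrative only: the function names and sample values are hypothetical, and the description of the domain as $\Re\sigma\ln|x| + \theta_2\Im\sigma \le \Im\sigma\arg(x) \le (\Re\sigma-\tilde\sigma)\ln|x| + \theta_1\Im\sigma$ is an assumption inferred from the boundary lines quoted in this section (the definition itself is given in the Introduction).

```python
import cmath
import math


def sigma_prime(sigma, modulus, arg):
    """Effective real exponent: Re(sigma) - Im(sigma) * arg(x) / ln|x|."""
    return sigma.real - sigma.imag * arg / math.log(modulus)


def power_on_covering(modulus, arg, sigma):
    """x**sigma for x = |x| e^{i arg x}, with arg x unrestricted."""
    return cmath.exp(sigma * complex(math.log(modulus), arg))


def in_domain(modulus, arg, sigma, theta1, theta2, sigma_tilde):
    """Assumed form of the domain D (see the lead-in): two straight boundary lines."""
    log_r = math.log(modulus)
    lower = sigma.real * log_r + theta2 * sigma.imag
    upper = (sigma.real - sigma_tilde) * log_r + theta1 * sigma.imag
    return lower <= sigma.imag * arg <= upper


# Hypothetical sample values.
sigma = complex(0.4, 0.7)
theta1, theta2, sigma_tilde = 1.0, -1.0, 0.9
modulus, arg = 1e-3, 2.0

lhs = abs(power_on_covering(modulus, arg, sigma))
rhs = modulus ** sigma_prime(sigma, modulus, arg)

log_r = math.log(modulus)
lower_bound = -theta2 * sigma.imag / log_r
upper_bound = sigma_tilde - theta1 * sigma.imag / log_r
inside = in_domain(modulus, arg, sigma, theta1, theta2, sigma_tilde)
print(lhs, rhs, inside, lower_bound, sigma_prime(sigma, modulus, arg), upper_bound)
```

Working with the pair $(\ln|x|, \arg x)$ instead of a complex number $x$ keeps track of the sheet of the universal covering, which `cmath.phase` alone would discard.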

In Figure 2, we draw the paths along which $x\to 0$. Any regular path is allowed if $\Im\sigma = 0$. If $\Im\sigma\neq 0$, we considered the family of paths (8) connecting a point $x_0\in D(\sigma)$ to $x=0$. In general, these paths are spirals, represented in Figure 2 both in the plane $(\ln|x|,\Im\sigma\arg x)$ and in the $x$-plane. They are radial paths if $0\le\Re\sigma<1$ and $\Sigma = \Re\sigma$, because in this case $\arg(x) = $ constant. But there are only spiral paths whenever $\Re\sigma<0$ or $\Re\sigma\ge 1$. In particular, the paths
\[
\Im\sigma\arg(x) = \Im\sigma\arg(x_0) + \Re\sigma\,\log\frac{|x|}{|x_0|}
\]
are parallel to one of the boundary lines of $D(\sigma)$ in the plane $(\ln|x|,\Im\sigma\arg(x))$ and the critical behavior is (11). The boundary line is $\Im\sigma\arg(x) = \Re\sigma\ln|x| + \Im\sigma\,\theta_2$ and it is shared by $D(\sigma)$ and $D(-\sigma)$ (with the same $\theta_2$; see also Remark 2 of Section 4).

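• Important Remark on the Domain: Consider the domain $D(\epsilon;\sigma;\theta_1,\theta_2,\tilde\sigma)$ for $\Im\sigma\neq 0$. In Theorem 1 we can arbitrarily choose $\theta_1$. Apparently, if we increase $\theta_1\Im\sigma$ the domain $D(\epsilon;\sigma;\theta_1,\theta_2)$ becomes larger. But $\epsilon$ itself depends on $\theta_1$. In the proof of Theorem 1 (Section 8), we will show that $\epsilon^{1-\tilde\sigma}\le c\,e^{-\theta_1\Im\sigma}$, where $c$ is a constant, depending on $a$. Equivalently, $\theta_1\Im\sigma\le(\tilde\sigma-1)\ln\epsilon + \ln c$. This means that if we increase $\Im\sigma\,\theta_1$, we have to decrease $\epsilon$. Therefore, for $x\in D(\epsilon;\sigma;\theta_1,\theta_2,\tilde\sigma)$, we have
\[
\Im\sigma\arg(x) \le (\Re\sigma-\tilde\sigma)\ln|x| + \theta_1\Im\sigma \le (\Re\sigma-\tilde\sigma)\ln|x| + (\tilde\sigma-1)\ln\epsilon + \ln c.
\]

We advise the reader to visualize a point $x$ in the plane $(\ln|x|,\Im\sigma\arg(x))$. With this visualization in mind, let $x_\epsilon$ be the point
\[
\{\Im\sigma\arg(x) = (\Re\sigma-\tilde\sigma)\ln|x| + (\tilde\sigma-1)\ln\epsilon + \ln c\}\cap\{|x| = \epsilon\}
\]
(see Figure 3). Namely, $\Im\sigma\arg(x_\epsilon) = (\Re\sigma-1)\ln\epsilon + \ln c$. This has the following implication: Let $\sigma$, $a$, $\tilde\sigma$, $\theta_2$ be fixed. The union of the domains $D(\epsilon=\epsilon(\theta_1);\sigma;\theta_1,\theta_2,\tilde\sigma)$ obtained by letting $\theta_1$ vary is
\[
\bigcup_{\theta_1} D(\epsilon(\theta_1);\sigma;\theta_1,\theta_2,\tilde\sigma) \subseteq B(\sigma,a;\theta_2,\tilde\sigma),
\]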


Figure 2.


Figure 3.

where
\[
B(\sigma,a;\theta_2,\tilde\sigma) := \{|x|<1 \text{ such that } \Re\sigma\ln|x| + \theta_2\Im\sigma \le \Im\sigma\arg(x) < (\Re\sigma-1)\ln|x| + \ln c\}. \tag{31}
\]

The dependence on $a$ of the domain $B$ is motivated by the fact that $c$ depends on $a$ (but not on $\theta_1$, $\theta_2$).

If $0\le\Re\sigma<1$, the above result is not a limitation on the values of $\arg(x)$ in $D(\epsilon;\sigma;\theta_1,\theta_2,\tilde\sigma)$, provided that $|x|$ is sufficiently small.

Also in the case $\Re\sigma<0$, there is no limitation, because any point $x$, such that $|x|<\epsilon$, can be included in $D(\epsilon;\sigma;\theta_1,\theta_2,\tilde\sigma)$ for a suitable $\theta_2$. In fact, we can always decrease $\Im\sigma\,\theta_2$ without affecting $\epsilon$.

But if $\Re\sigma\ge 1$, the situation is different. Actually, all the points $x$ which lie above the set $B(\sigma,a;\theta_2,\tilde\sigma)$ in the $(\ln|x|,\Im\sigma\arg(x))$-plane can never be included in any $D(\epsilon;\sigma;\theta_1,\theta_2,\tilde\sigma)$. See Figure 4. This is an important restriction on the domains of Theorem 1.

4. Parameterization of a Branch Through Monodromy Data – Theorem 2

The second step in our discussion is to compute the relation between the parameters $\sigma$, $a$ of Theorem 1, stated for $x=0$, and the monodromy data $(x_0,x_1,x_\infty)$, to which a unique transcendent $y(x;x_0,x_1,x_\infty)$ is associated. Also in this section, $\sigma^{(0)}$, $a^{(0)}$ are denoted $\sigma$, $a$. The points $x = 1,\infty$ are studied in Section 7.


Figure 4.

We consider the Fuchsian system (20) for the special choice $u_1 = 0$, $u_2 = x$, $u_3 = 1$. The labels $i = 1,2,3$ will be substituted by the labels $i = 0,x,1$, and the system becomes
\[
\frac{dY}{dz} = \Bigl[\frac{A_0(x)}{z} + \frac{A_x(x)}{z-x} + \frac{A_1(x)}{z-1}\Bigr]Y. \tag{32}
\]

Also, the triple $(x_1,x_2,x_3)$ will be denoted by $(x_0,x_1,x_\infty)$, as in [13] and in the Introduction. We consider only admissible triples and $x_i\neq\pm 2$, $i = 0,1,\infty$. We recall that an admissible triple is defined in [13] by the condition that only one $x_i$, $i = 0,1,\infty$, may be zero. Two admissible triples are equivalent if their elements differ just by the change of two signs and
\[
x_0^2 + x_1^2 + x_\infty^2 - x_0x_1x_\infty = 4\sin^2(\pi\mu). \tag{33}
\]
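The notions of admissible triple and of equivalence under the change of two signs can be made concrete in a few lines. This is a small sketch with hypothetical helper names (`is_admissible`, `cubic`, `equivalence_class`); the sample triple $(3,3,3)$ is the one quoted earlier for the quantum cohomology of $\mathbf{CP}^2$. It shows that the left-hand side of (33) takes the same value on every member of an equivalence class.

```python
from itertools import combinations


def is_admissible(triple):
    """A triple is admissible when at most one of its entries is zero."""
    return sum(1 for x in triple if x == 0) <= 1


def cubic(triple):
    """Left-hand side of (33): x0^2 + x1^2 + xinf^2 - x0*x1*xinf."""
    x0, x1, xinf = triple
    return x0**2 + x1**2 + xinf**2 - x0 * x1 * xinf


def equivalence_class(triple):
    """The triple together with the triples obtained by changing two signs."""
    members = {tuple(triple)}
    for i, j in combinations(range(3), 2):
        flipped = list(triple)
        flipped[i], flipped[j] = -flipped[i], -flipped[j]
        members.add(tuple(flipped))
    return members


triple = (3, 3, 3)
orbit = equivalence_class(triple)
print(sorted(orbit), {cubic(t) for t in orbit})
```

Flipping two signs leaves both the squares and the product $x_0x_1x_\infty$ unchanged, which is why the cubic is constant on the class.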

In the Introduction, we called $y(x;x_0,x_1,x_\infty)$ the branch in one-to-one correspondence with an equivalence class of $(x_0,x_1,x_\infty)$. The branch has analytic continuation on the universal covering of $\mathbf{P}^1\setminus\{0,1,\infty\}$. We also denote this continuation by $y(x;x_0,x_1,x_\infty)$, where $x$ is now a point in the universal covering.

Theorem 2 has been stated in full generality in the Introduction and will be proved in Section 9. The result is a generalization of the formulae of [13] to any $\mu\neq 0$ (including half-integer $\mu$) and to all $x_i\neq\pm 2$, $i = 0,1,\infty$.

The proof of the theorem is also valid for the resonant case $2\mu\in\mathbf{Z}\setminus\{0\}$. To read the formulae in this case, it is enough to just substitute an integer for $2\mu$ in the formulae (i) or (I) of the theorem. The cases (ii), (iii); (II), (III) do not occur when $2\mu\in\mathbf{Z}\setminus\{0\}$.


Note that for $\mu$ integer, the case (ii), (II) degenerates to $(x_0,x_1,x_\infty) = (0,0,0)$ and $a$ arbitrary. This is the case in which the triple is not a good parameterization for the monodromy (not admissible triple). Anyway, we know that in this case there is a one-parameter family of rational solutions [28], which are all obtained by a birational transformation from the family
\[
y(x) = \frac{ax}{1-(1-a)x},\qquad \mu = 1.
\]
At $x=0$, the behavior is $y(x) = ax(1+O(x))$, and then the limit of Theorem 2 for $\mu\to n\in\mathbf{Z}\setminus\{0\}$ and $\sigma = 0$ yields the above one-parameter family. Recall that $R = 0$ in this case.

Remark 1. We repeat the remark to Theorem 2 we made in the Introduction; namely, Equation (12) does not uniquely determine $\sigma$. We decided to choose $\sigma$ such that $0\le\Re\sigma\le 1$, so that all the solutions of (12) are $\pm\sigma+2n$, $n\in\mathbf{Z}$. If $\sigma$ is real, we can only choose $0\le\sigma<1$. With this convention, there is a one-to-one correspondence between $(\sigma,a)$ and (a class of equivalence of) an admissible triple $(x_0,x_1,x_\infty)$.

We observed that $a = a(\sigma;x_0,x_1,x_\infty)$ is affected by $\pm\sigma+2n$, $n\in\mathbf{Z}$. Hence, $y(x;x_0,x_1,x_\infty)$ has different critical behaviors in different domains $D(\pm\sigma+2n)$. Outside their union, we expect movable poles.

Remark 2. The domains $D(\sigma)$ and $D(-\sigma)$, with the same $\theta_2$, intersect along the common boundary $\Im\sigma\arg(x) = \Re\sigma\log|x| + \theta_2\Im\sigma$ (see Figure 2). The critical behavior of $y(x;x_0,x_1,x_\infty)$ along the common boundary is given in terms of $(\sigma(x_0), a(\sigma;x_0,x_1,x_\infty))$ and $(-\sigma(x_0), a(-\sigma;x_0,x_1,x_\infty))$, respectively. According to Theorem 1, the critical behaviors in $D(\sigma)$ and $D(-\sigma)$ are different, but they become equal on the common boundary. Actually, along the boundary of $D(\sigma)$ the behavior is given by (11), which we rewrite as
\[
y(x) = A(x;\sigma,a(\sigma))\,x\,\bigl(1+O(|x|^{\delta})\bigr),
\]
where $\delta$ is a small number between 0 and 1 and
\[
A(x;\sigma,a(\sigma)) = a\bigl(Ce^{i\alpha(x;\sigma)}\bigr)^{-1} + \frac12 + \frac{1}{16a}\,Ce^{i\alpha(x;\sigma)},
\]
\[
x^{\sigma} = Ce^{i\alpha(x;\sigma)},\qquad C = e^{-\theta_2\Im\sigma},\qquad
\alpha(x;\sigma) = \Re\sigma\arg(x) + \Im\sigma\ln|x|\,\Big|_{\Im\sigma\arg(x) = \Re\sigma\log|x| + \theta_2\Im\sigma}.
\]

We observe that $\alpha(x;-\sigma) = -\alpha(x;\sigma)$. At the end of Section 9, we prove that $a(\sigma) = 1/(16a(-\sigma))$. This implies that
\[
A(x;-\sigma,a(-\sigma)) = A(x;\sigma,a(\sigma)).
\]
Therefore, the critical behavior, as prescribed by Theorem 1 in $D(\sigma)$ and $D(-\sigma)$, is the same along the common boundary of the two domains.
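The symmetry of the coefficient $A$ under $\sigma\mapsto-\sigma$ follows from $x^{-\sigma} = (x^{\sigma})^{-1}$ together with $a(\sigma) = 1/(16a(-\sigma))$, and can be confirmed numerically. In the sketch below the function name and all numerical values are hypothetical; $a(\sigma)$ is just a sample complex number, not a value computed from monodromy data.

```python
import cmath
import math


def boundary_coefficient(x_sigma, a):
    """A = a / x^sigma + 1/2 + x^sigma / (16 a), as in Remark 2."""
    return a / x_sigma + 0.5 + x_sigma / (16 * a)


# Hypothetical sample values.
sigma = complex(0.3, 0.8)
a_plus = complex(0.7, -0.2)        # stands for a(sigma)
a_minus = 1 / (16 * a_plus)        # a(-sigma), from a(sigma) = 1/(16 a(-sigma))
log_x = complex(math.log(1e-2), 1.5)   # ln|x| + i arg(x)

x_sigma = cmath.exp(sigma * log_x)         # x^sigma
x_minus_sigma = cmath.exp(-sigma * log_x)  # x^(-sigma)

A_plus = boundary_coefficient(x_sigma, a_plus)
A_minus = boundary_coefficient(x_minus_sigma, a_minus)
print(A_plus, A_minus)
```

The algebraic identity holds for every $x$; the common boundary matters because it is the only place where both behaviors of Theorem 1 apply at once.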


We end the section with the following proposition:

PROPOSITION [unicity]. Let $\sigma\notin(-\infty,0)\cup[1,+\infty)$ and $a\neq 0$. Let $y(x)$ be a solution of PVIµ such that $y(x) = ax^{1-\sigma}(1 + \text{higher-order terms})$ as $x\to 0$ in the domain $D(\epsilon;\sigma)$. Suppose that the triple $(x_0,x_1,x_\infty)$ computed by the formulae of Theorem 2 in terms of $\sigma$ and $a$ is admissible. Then, $y(x)$ coincides with $y(x;\sigma,a)$ of Theorem 1.

Proof. See Section 9. ✷

5. Other Representations of the Transcendents – Theorem 3

We need to further investigate the critical behavior close to $x=0$, in order to extend the results of Theorem 1 for $x\to 0$ along paths not allowed by the theorem. In this section we discuss the critical behavior of the elliptic representation of Painlevé transcendents. According to Remark 1 of Section 4, we restrict to $0\le\Re\sigma\le 1$ for $\Im\sigma\neq 0$, or $0\le\sigma<1$ for $\sigma$ real.

In Figure 5 (left) we draw domains $D(\sigma)$, $D(-\sigma)$, $D(-\sigma+2)$, $D(2-\sigma)$, etc., where $y(x;x_0,x_1,x_\infty)$ has different critical behaviors. Some small sectors remain uncovered by the union of the domains (Figure 5 (right)). If $x\to 0$ inside these sectors, we do not know the behavior of the transcendent. For example, if $\Re\sigma = 1$,

Figure 5.


Figure 6. The figure represents a possible configuration of the strips where Theorem 1 doesnot give answers. It is in these strips that we might expect movable poles.

a radial path converging to $x=0$ will end up in a forbidden small sector (see also Figure 7 for the case $\Re\sigma = 1$).

If we draw, for the same $\theta_2$, the domains $B(\sigma)$, $B(-\sigma)$, $B(-\sigma+2)$, etc., defined in (31) we obtain strips in the $(\ln|x|,\Im\sigma\arg(x))$-plane which are certainly forbidden to Theorem 1 (see Figure 6). In the strips we know nothing about the transcendent. We guess that there might be poles there, as we verify in one example later.

What is the behavior along the directions not allowed by Theorem 1? In the very particular case
\[
(x_0,x_1,x_\infty)\in\{(2,2,2),\ (2,-2,-2),\ (-2,-2,2),\ (-2,2,-2)\},
\]
it is known that PVI$_{\mu=-1/2}$ has a 1-parameter family of classical solutions. The critical behavior of a branch for radial convergence to the critical points $0,1,\infty$ was computed in [28]:
\[
y(x) =
\begin{cases}
-\ln(x)^{-2}\bigl(1+O(\ln(x)^{-1})\bigr), & x\to 0,\\
1 + \ln(1-x)^{-2}\bigl(1+O(\ln(1-x)^{-1})\bigr), & x\to 1,\\
-x\ln(1/x)^{-2}\bigl(1+O(\ln(1/x)^{-1})\bigr), & x\to\infty.
\end{cases}
\]


The branch is specified by $|\arg(x)|<\pi$, $|\arg(1-x)|<\pi$. This behavior is completely different from $\sim a(x)x^{1-\sigma}$ as $x\to 0$. Intuitively, as $x_0$ approaches the value 2, $1-\sigma$ approaches 0 and the decay of $y(x)\sim ax^{1-\sigma}$ becomes logarithmic. These solutions were called Chazy solutions in [28], because they can be computed as functions of solutions of the Chazy equation.

This section is devoted to the investigation of the critical behavior at $x=0$ in the regions not allowed in Theorem 1.

5.1. ELLIPTIC REPRESENTATION

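The transcendents of PVIµ can be represented in the elliptic form [14]
\[
y(x) = \wp\Bigl(\frac{u(x)}{2};\omega_1(x),\omega_2(x)\Bigr) + \frac{1+x}{3},
\]
where $\wp(z;\omega_1,\omega_2)$ is the Weierstrass elliptic function of half-periods $\omega_1$, $\omega_2$. $u(x)$ solves the nonlinear differential equation
\[
L(u) = \frac{\alpha}{x(1-x)}\,\frac{\partial}{\partial u}\Bigl[\wp\Bigl(\frac{u}{2};\omega_1(x),\omega_2(x)\Bigr)\Bigr],\qquad \alpha = \frac{(2\mu-1)^2}{2}, \tag{34}
\]
where the differential linear operator $L$ applied to $u$ is
\[
L(u) := x(1-x)\frac{d^2u}{dx^2} + (1-2x)\frac{du}{dx} - \frac14 u.
\]
The half-periods are two independent solutions of the hypergeometric equation $L(u) = 0$:
\[
\omega_1(x) := \frac{\pi}{2}F(x),\qquad \omega_2(x) := -\frac{i}{2}\bigl[F(x)\ln x + F_1(x)\bigr],
\]
where $F(x)$ is the hypergeometric function
\[
F(x) := F\Bigl(\frac12,\frac12,1;x\Bigr) = \sum_{n=0}^{\infty}\frac{\bigl[(\frac12)_n\bigr]^2}{(n!)^2}\,x^n
\]
and
\[
F_1(x) := \sum_{n=0}^{\infty}\frac{\bigl[(\frac12)_n\bigr]^2}{(n!)^2}\,2\Bigl[\psi\Bigl(n+\frac12\Bigr) - \psi(n+1)\Bigr]x^n,
\]
\[
\psi(z) = \frac{d}{dz}\ln\Gamma(z),\qquad \psi\Bigl(\frac12\Bigr) = -\gamma - 2\ln 2,\qquad \psi(1) = -\gamma,\qquad \psi(a+n) = \psi(a) + \sum_{l=0}^{n-1}\frac{1}{a+l}.
\]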

The solutions $u$ of (34) were not studied in the literature, so we did that and we proved a general result in Theorem 3. But first, we give a special example, already known to Picard.

EXAMPLE. The equation PVI$_{\mu=1/2}$ has a two-parameter family of solutions discovered by Picard [28, 30, 32]. It is easily obtained from (34). Since $\alpha = 0$, $u$ solves the hypergeometric equation $L(u) = 0$ and has the general form
\[
\frac{u(x)}{2} := \nu_1\omega_1(x) + \nu_2\omega_2(x),\qquad \nu_i\in\mathbf{C},\quad 0\le\Re\nu_i<2,\quad (\nu_1,\nu_2)\neq(0,0).
\]
A branch of $y(x)$ is specified by a branch of $\ln x$ in $\omega_2(x)$. The monodromy data computed in [28] are
\[
x_0 = -2\cos\pi r_1,\qquad x_1 = -2\cos\pi r_2,\qquad x_\infty = -2\cos\pi r_3,
\]
\[
r_1 = \frac{\nu_2}{2},\quad r_2 = 1-\frac{\nu_1}{2},\quad r_3 = \frac{\nu_1-\nu_2}{2},\qquad\text{for } \Re\nu_1>\Re\nu_2,
\]
\[
r_1 = 1-\frac{\nu_2}{2},\quad r_2 = \frac{\nu_1}{2},\quad r_3 = \frac{\nu_2-\nu_1}{2},\qquad\text{for } \Re\nu_1<\Re\nu_2.
\]
The modular parameter is now a function of $x$:
\[
\tau(x) = \frac{\omega_2(x)}{\omega_1(x)} = \frac{1}{\pi}\bigl(\arg x - i\ln|x|\bigr) + \frac{4i}{\pi}\ln 2 + O(x),\qquad x\to 0.
\]

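We see that $\Im\tau>0$ as $x\to 0$. Now, if
\[
\Bigl|\Im\frac{u(x)}{4\omega_1}\Bigr| < \Im\tau, \tag{35}
\]
we can expand the Weierstrass function in Fourier series. Condition (35) becomes
\[
\frac12\Bigl|\Im\nu_1 + \frac{\Im\nu_2}{\pi}\arg(x) - \frac{\Re\nu_2}{\pi}\ln|x| + \frac{4\ln 2}{\pi}\Re\nu_2\Bigr| < -\frac{\ln|x|}{\pi} + \frac{4\ln 2}{\pi} + O(x),\qquad\text{as } x\to 0.
\]
For $\Im\nu_2\neq 0$, this can be written as follows:
\[
(\Re\nu_2+2)\ln|x| + c_1 < \Im\nu_2\arg(x) < (\Re\nu_2-2)\ln|x| + c_2, \tag{36}
\]
\[
c_1 := -\pi\Im\nu_1 - 4\ln 2\,(\Re\nu_2+2),\qquad c_2 := -\pi\Im\nu_1 - 4\ln 2\,(\Re\nu_2-2).
\]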
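On the other hand, if $\Im\nu_2 = 0$, any value of $\arg x$ is allowed. The Fourier expansion is
\[
\begin{aligned}
y(x) ={}& \frac{x+1}{3} + \frac{1}{F(x)^2}\Biggl[\frac{1}{\sin^2\bigl(-\frac12\bigl[i\nu_2\bigl(\ln(x)+\frac{F_1(x)}{F(x)}\bigr)-\pi\nu_1\bigr]\bigr)} - \frac13 \\
&\qquad + 16\sum_{n=1}^{\infty}\frac{n\,x^{2n}}{e^{-2n\frac{F_1(x)}{F(x)}} - x^{2n}}\,\sin^2\Bigl(-\frac{n}{2}\Bigl[i\nu_2\Bigl(\ln(x)+\frac{F_1(x)}{F(x)}\Bigr)-\pi\nu_1\Bigr]\Bigr)\Biggr]\\
={}& \frac{x}{2} + \Bigl(1-\frac{x}{2}+O(x^2)\Bigr)\Biggl[\frac{1}{\sin^2\bigl(-\frac12\bigl[i\nu_2\bigl(\ln(x)+\frac{F_1(x)}{F(x)}\bigr)-\pi\nu_1\bigr]\bigr)} \\
&\qquad - \frac14\Bigl[\frac{e^{i\pi\nu_1}}{16^{\nu_2-1}}\Bigr]^{-1}x^{2-\nu_2} + O\bigl(x^2 + x^{3-\nu_2} + x^{4-\nu_2}\bigr)\Biggr],
\end{aligned}
\]
$x\to 0$ in the domain (36).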

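As far as radial convergence is concerned, we have

(a) $0<\Re\nu_2<2$,
\[
\frac{1}{\sin^2(\ldots)} = -\frac14\Bigl[\frac{e^{i\pi\nu_1}}{16^{\nu_2-1}}\Bigr]x^{\nu_2}\bigl(1+O(|x^{\nu_2}|)\bigr)
\]
and so
\[
y(x) = \Bigl\{-\frac14\Bigl[\frac{e^{i\pi\nu_1}}{16^{\nu_2-1}}\Bigr]x^{\nu_2} + \frac12 x - \frac14\Bigl[\frac{e^{i\pi\nu_1}}{16^{\nu_2-1}}\Bigr]^{-1}x^{2-\nu_2}\Bigr\}\bigl(1+O(x^{\delta})\bigr),\qquad \delta>0. \tag{37}
\]

This is the same critical behavior of Theorem 1. By virtue of the proposition of Section 4, the transcendent here coincides with $y(x;\sigma,a)$ of Theorem 1 if we identify $1-\sigma$ with $\nu_2$ for $0<\Re\nu_2<1$, or with $2-\nu_2$ for $1<\Re\nu_2<2$. In the case $\Re\nu_2 = 1$ the three terms $x^{\nu_2}$, $x$, $x^{2-\nu_2}$ have the same order and we find again the behavior (10) of Theorem 1 (oscillatory case):
\[
\begin{aligned}
y(x) &= \Bigl\{ax^{\nu_2} + \frac{x}{2} + \frac{1}{16a}x^{2-\nu_2}\Bigr\}\bigl(1+O(x^{\delta})\bigr)\\
&= ax^{\nu_2}\Bigl\{1 + \frac{1}{2a}x^{-i\Im\nu_2} + \frac{1}{16a^2}x^{-2i\Im\nu_2}\Bigr\}\bigl(1+O(x^{\delta})\bigr),
\end{aligned}
\]
where $a = -\frac14\bigl[e^{i\pi\nu_1}/16^{\nu_2-1}\bigr]$.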

(b) $\Re\nu_2 = 0$. Put $\nu_2 = i\nu$ (namely, $\sigma = 1-i\nu$). The domain (36) is now (for sufficiently small $|x|$):
\[
2\ln|x| - \pi\Im\nu_1 - 8\ln 2 < \Im\nu_2\arg(x) < -2\ln|x| - \pi\Im\nu_1 + 8\ln 2
\]
or
\[
2\ln|x| + \pi\Im\nu_1 - 8\ln 2 < \Im\sigma\arg(x) < -2\ln|x| + \pi\Im\nu_1 + 8\ln 2. \tag{38}
\]
For radial convergence, we have
\[
y(x) = \frac{1+O(x)}{\sin^2\Bigl(\frac{\nu}{2}\ln(x) + \frac{\nu}{2}\frac{F_1(x)}{F(x)} + \frac{\pi\nu_1}{2}\Bigr)} + O(x).
\]

This is an oscillating function, and it may have poles. Suppose, for example, that $\nu_1$ is real. Since $F_1(x)/F(x)$ is a convergent power series ($|x|<1$) with real coefficients and defines a bounded function, then $y(x)$ has a sequence of poles on the positive real axis, converging to $x=0$.

In the domain (38), spiral convergence of $x$ to zero is also allowed and the critical behavior is (37) because $\arg x$ is not constant.

Finally, if $\nu = 0$, namely $\nu_2 = 0$ (and then $x_0 = 2$), we have
\[
y(x) = \frac{1}{\sin^2(\pi\nu_1)}\bigl(1+O(|x|)\bigr).
\]

The case (b) in the above example is helpful in understanding the limitation of Theorem 1 because it gives a complete description of the behavior of Painlevé transcendents. Actually, Theorem 1 yields the behavior (37) in the domain $D(\sigma)\cup D(-\sigma)$ ($\Re\sigma = 1$):
\[
(1+\tilde\sigma)\ln|x| + \theta_1\Im\sigma \le \Im\sigma\arg x \le (1-\tilde\sigma)\ln|x| + \theta_1\Im\sigma,
\]
where radial convergence to $x=0$ is not allowed. On the other hand, the transformations $\sigma\to\pm(\sigma-2)$ give a further domain $D(\sigma-2)\cup D(-\sigma+2)$:
\[
(-1+\tilde\sigma)\ln|x| + \theta_1\Im\sigma \le \Im\sigma\arg x \le -(1+\tilde\sigma)\ln|x| + \theta_1\Im\sigma,
\]
but again it is not possible for $x$ to converge to $x=0$ along a radial path. Figure 7 shows $D(\sigma)\cup D(-\sigma)\cup D(2-\sigma)\cup D(\sigma-2)$. Note that a radial path would be allowed if it were possible to make $\tilde\sigma\to 1$. The interior of the set obtained as the limit for $\tilde\sigma\to 1$ of $D(\sigma)\cup D(-\sigma)\cup D(2-\sigma)\cup D(\sigma-2)$ is like (38). Actually, the intersection of (38) and $D(\sigma)\cup D(-\sigma)\cup D(2-\sigma)\cup D(\sigma-2)$ is never empty. In (38), the elliptic representation predicts an oscillating behavior and poles. So it is definitely clear that the 'limit' of Theorem 1 for $\tilde\sigma\to 1$ is not trivial.

Remark on the Example. For $\mu$ half integer all the possible values of $(x_0,x_1,x_\infty)$ such that $x_0^2 + x_1^2 + x_\infty^2 - x_0x_1x_\infty = 4$ are covered by Chazy and Picard's solutions, with the warning that for $\mu = \frac12$ the image (through birational transformations) of Chazy solutions is $y = \infty$. See [28].

We turn to the general case. The elliptic representation has been studied from the point of view of algebraic geometry in [27], but to our knowledge Theorem 3 and its corollary, both stated in the Introduction, are the first general result about its critical behavior appearing in the literature.

We prove Theorem 3 in Section 10. Here we prove the corollary. The critical behavior is obtained expanding $y(x)$ in Fourier series:
\[
\wp\Bigl(\frac{u}{2};\omega_1,\omega_2\Bigr) = -\frac{\pi^2}{12\omega_1^2} + \frac{2\pi^2}{\omega_1^2}\sum_{n=1}^{\infty}\frac{n\,e^{2\pi in\tau}}{1-e^{2\pi in\tau}}\Bigl(1-\cos\Bigl(\frac{n\pi u}{2\omega_1}\Bigr)\Bigr) + \frac{\pi^2}{4\omega_1^2}\,\frac{1}{\sin^2\bigl(\frac{\pi u}{4\omega_1}\bigr)}. \tag{39}
\]
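The Fourier expansion (39) can be evaluated directly and subjected to two simple sanity checks: periodicity under the shift $z\mapsto z+2\omega_2$ (not manifest term by term) and the double pole $\wp(z)\sim z^{-2}$ at the origin. The sketch below writes $z = u/2$; the function name, the truncation order and the sample half-periods are hypothetical choices, and both test points are kept inside the strip $|\Im(z/\omega_1)|<2\Im\tau$ where the series converges.

```python
import cmath
import math


def wp_fourier(z, omega1, omega2, n_terms=60):
    """Truncated Fourier series (39) for the Weierstrass function, with z = u/2."""
    tau = omega2 / omega1
    scale = (math.pi / omega1) ** 2
    total = -scale / 12 + (scale / 4) / cmath.sin(math.pi * z / (2 * omega1)) ** 2
    for n in range(1, n_terms + 1):
        q2n = cmath.exp(2j * math.pi * n * tau)
        total += 2 * scale * n * q2n / (1 - q2n) * (1 - cmath.cos(n * math.pi * z / omega1))
    return total


# Hypothetical sample lattice: omega1 real, tau = omega2/omega1 = 1.2 i.
omega1 = 0.5
omega2 = 0.6j
shift = complex(0.13, 0.05)

# Two points differing by the period 2*omega2, both inside the convergence strip.
value_below = wp_fourier(shift - omega2, omega1, omega2)
value_above = wp_fourier(shift + omega2, omega1, omega2)

# Near the origin the function should behave like 1/z^2.
small = 1e-3
pole_ratio = wp_fourier(small, omega1, omega2) * small**2
print(value_below, value_above, pole_ratio)
```

The first check exercises the interplay between the $\sin^{-2}$ term and the $q$-series; the second checks the constant $-\pi^2/(12\omega_1^2)$ against the $\frac13$ in the Laurent expansion of $\sin^{-2}$.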


Figure 7. Domain $D(\sigma)\cup D(-\sigma)\cup D(\sigma-2)\cup D(-\sigma+2)$ for $\sigma = 1 + i\,\Im\sigma$. Comparison with the domain where Picard's solution is expanded (top picture). We represent the domain $D(r,\nu_1,\nu_2)$ of Theorem 3 for imaginary $\nu_2$, and we compare it to the domain $D(\sigma)$ with the identification $\nu_2 = 1-\sigma$ (and for suitable $\theta_1$, $\theta_2$). The numbers close to the boundary lines are their slopes ($\varepsilon = 1-\tilde\sigma$ is arbitrarily small) (bottom picture).

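The expansion can be performed if $\Im\tau(x)>0$ and $\bigl|\Im\bigl(\frac{u(x)}{4\omega_1(x)}\bigr)\bigr|<\Im\tau$; these conditions are satisfied in $D(r;\nu_1,\nu_2)$. Let us put $F_1/F = -4\ln 2 + g(x)$, where $g(x) = O(x)$. Taking into account (39) and Theorem 3, the expansion of $y(x)$ for $x\to 0$ in $D(r;\nu_1,\nu_2)$ is
\[
\begin{aligned}
y(x) ={}& \Bigl[\frac{1+x}{3} - \frac{\pi^2}{12\,\omega_1(x)^2}\Bigr] + \frac{\pi^2}{\omega_1(x)^2}\sum_{n=1}^{\infty}\frac{n}{1-\bigl(\frac{e^{g(x)}}{16}\bigr)^{2n}x^{2n}}\times\\
&\times\Bigl\{2\Bigl(\frac{e^{g(x)}}{16}\Bigr)^{2n}x^{2n} - e^{n(\nu_2+2)g(x)}\Bigl[\frac{e^{i\pi\nu_1}}{16^{2+\nu_2}}x^{2+\nu_2}\Bigr]^{n}e^{in\pi\frac{v(x)}{\omega_1(x)}} - e^{n(2-\nu_2)g(x)}\Bigl[\frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}x^{2-\nu_2}\Bigr]^{n}e^{-i\pi n\frac{v(x)}{\omega_1(x)}}\Bigr\}\\
&+ \frac{\pi^2}{4\,\omega_1(x)^2}\,\frac{1}{\sin^2\Bigl(-i\frac{\nu_2}{2}\ln x + i\frac{\nu_2}{2}\ln 16 + \frac{\pi\nu_1}{2} - i\frac{\nu_2}{2}g(x) + \frac{\pi v(x)}{2\omega_1(x)}\Bigr)}.
\end{aligned}
\]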


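We observe that
\[
\omega_1(x)\equiv\frac{\pi}{2}F(x) = \frac{\pi}{2}\Bigl(1+\frac14 x+O(x^2)\Bigr),
\]
\[
\frac{1+x}{3} - \frac{\pi^2}{12\,\omega_1(x)^2}\equiv\frac{1+x}{3} - \frac{1}{3F(x)^2} = \frac12 x\,(1+O(x)),
\]
\[
e^{g(x)} = 1+O(x)
\]
and
\[
e^{\pm i\pi\frac{v(x)}{\omega_1(x)}} = 1 + O\Bigl(|x| + \Bigl|\frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}x^{2-\nu_2}\Bigr| + \Bigl|\frac{e^{i\pi\nu_1}}{16^{\nu_2}}x^{\nu_2}\Bigr|\Bigr).
\]
In order to single out the leading terms, we observe that we are dealing with the powers $x$, $x^{2-\nu_2}$, $x^{\nu_2}$ in $D(r;\nu_1,\nu_2)$. If $0<\nu_2<2$ (the only allowed real values of $\nu_2$) $|x^{\nu_2}|$ is leading if $0<\nu_2<1$ and $|x^{2-\nu_2}|$ is leading if $1<\nu_2<2$. We have
\[
\frac{1}{\sin^2(\ldots)} = -4\,\frac{e^{i\pi\nu_1}}{16^{\nu_2}}\,x^{\nu_2}\bigl[1+O(|x|+|x^{\nu_2}|+|x^{2-\nu_2}|)\bigr].
\]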

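Thus, there exists $0<\delta<1$ (explicitly computable in terms of $\nu_2$) such that
\[
\begin{aligned}
y(x) &= \Bigl[\frac12 x - \frac14\Bigl[\frac{e^{i\pi\nu_1}}{16^{\nu_2-1}}\Bigr]x^{\nu_2} - \frac14\Bigl[\frac{e^{i\pi\nu_1}}{16^{\nu_2-1}}\Bigr]^{-1}x^{2-\nu_2}\Bigr]\bigl(1+O(x^{\delta})\bigr)\\
&= \begin{cases}
\sin^2\bigl(\frac{\pi\nu_1}{2}\bigr)\,x\,\bigl(1+O(x^{\delta})\bigr), & \text{if } \nu_2 = 1,\\[4pt]
-\frac14\Bigl[\frac{e^{i\pi\nu_1}}{16^{\nu_2-1}}\Bigr]x^{\nu_2}\bigl(1+O(x^{\delta})\bigr), & \text{if } 0<\nu_2<1,\\[4pt]
-\frac14\Bigl[\frac{e^{i\pi\nu_1}}{16^{\nu_2-1}}\Bigr]^{-1}x^{2-\nu_2}\bigl(1+O(x^{\delta})\bigr), & \text{if } 1<\nu_2<2.
\end{cases}
\end{aligned}
\]
This behavior coincides with that of Theorem 1 for $\sigma = 0$ in the first case, $\sigma = 1-\nu_2$ in the second, and $\sigma = \nu_2-1$ in the third.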

We turn to the case $\Im\nu_2\neq 0$. We consider a path contained in $D(r;\nu_1,\nu_2)$ of the equation
\[
\Im\nu_2\arg(x) = (\Re\nu_2 - V)\ln|x| + b,\qquad 0\le V\le 2 \tag{40}
\]
with a suitable constant $b$ (the path connects some $x_0\in D(r;\nu_1,\nu_2)$ to $x=0$, therefore $b = \Im\nu_2\arg x_0 - (\Re\nu_2-V)\ln|x_0|$). We have $|x^{2-\nu_2}| = |x|^{2-V}e^{b}$, $|x^{\nu_2}| = |x|^{V}e^{-b}$ and so

$|x^{\nu_2}|$ is leading for $0\le V<1$,

$|x^{\nu_2}|$, $|x|$, $|x^{2-\nu_2}|$ have the same order for $V=1$,

$|x^{2-\nu_2}|$ is leading for $1<V\le 2$.
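The two moduli along the path (40) can be checked numerically, which also makes the trichotomy in $V$ explicit. This is only an illustrative sketch: the function names and the sample values of $\nu_2$, $V$, $b$ and $|x|$ are hypothetical.

```python
import cmath
import math


def log_point_on_path(nu2, V, b, modulus):
    """ln x = ln|x| + i arg x for the point of given modulus on the path (40)."""
    arg = ((nu2.real - V) * math.log(modulus) + b) / nu2.imag
    return complex(math.log(modulus), arg)


def leading_power(V):
    """Which of x^nu2, x, x^(2-nu2) dominates as x -> 0 along (40)."""
    if V < 1:
        return "x^nu2"
    if V > 1:
        return "x^(2-nu2)"
    return "all three comparable"


# Hypothetical sample values; Im(nu2) must be nonzero.
nu2 = complex(0.6, 1.3)
V, b, modulus = 0.5, 0.2, 1e-3

log_x = log_point_on_path(nu2, V, b, modulus)
abs_x_nu2 = abs(cmath.exp(nu2 * log_x))               # |x^nu2|
abs_x_2_minus_nu2 = abs(cmath.exp((2 - nu2) * log_x))  # |x^(2-nu2)|

expected_nu2 = modulus**V * math.exp(-b)
expected_2_minus_nu2 = modulus ** (2 - V) * math.exp(b)
print(abs_x_nu2, expected_nu2, abs_x_2_minus_nu2, expected_2_minus_nu2, leading_power(V))
```

Since $|x^{\nu_2}|\,|x^{2-\nu_2}| = |x|^2$, the term $x$ is always the geometric mean of the other two, which is why all three balance exactly at $V=1$.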

If $V = 0$,
\[
\Bigl|\frac{e^{i\pi\nu_1}}{16^{\nu_2}}x^{\nu_2}\Bigr|<r,\quad\text{but } x^{\nu_2}\not\to 0 \text{ as } x\to 0.
\]
If $V = 2$,
\[
\Bigl|\frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}x^{2-\nu_2}\Bigr|<r,\quad\text{but } x^{2-\nu_2}\not\to 0 \text{ as } x\to 0.
\]
This also implies that $v(x)\not\to 0$ as $x\to 0$ along the paths with $V=0$ or $V=2$, while $v(x)\to 0$ for all other values $0<V<2$. We conclude that

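(a) If $x\to 0$ in $D(r;\nu_1,\nu_2)$ along (40) for $V\neq 0,2$, then
\[
y(x) = \Bigl[\frac12 x - \frac14\Bigl[\frac{e^{i\pi\nu_1}}{16^{\nu_2-1}}\Bigr]x^{\nu_2} - \frac14\Bigl[\frac{e^{i\pi\nu_1}}{16^{\nu_2-1}}\Bigr]^{-1}x^{2-\nu_2}\Bigr]\bigl(1+O(x^{\delta})\bigr),\qquad 0<\delta<1.
\]
The three leading terms have the same order if the convergence is along a path asymptotic to (40) with $V=1$. Namely
\[
y(x) = x\,\sin^2\Bigl(i\,\frac{1-\nu_2}{2}\ln x + \frac{\pi\nu_1}{2} + 2i(\nu_2-1)\ln 2\Bigr)\bigl(1+O(x)\bigr),\qquad\text{for } V=1.
\]
Otherwise
\[
y(x) = -\frac14\Bigl[\frac{e^{i\pi\nu_1}}{16^{\nu_2-1}}\Bigr]x^{\nu_2}\bigl(1+O(x^{\delta})\bigr),\qquad\text{for } 0<V<1
\]
or
\[
y(x) = -\frac14\Bigl[\frac{e^{i\pi\nu_1}}{16^{\nu_2-1}}\Bigr]^{-1}x^{2-\nu_2}\bigl(1+O(x^{\delta})\bigr),\qquad\text{for } 1<V<2.
\]
This is the behavior of Theorem 1 with $1-\sigma = \nu_2$ or $2-\nu_2$.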

Important Observation. Let $\nu_2 = 1-\sigma$ and consider the intersection $D(r;\nu_1,\nu_2)\cap D(\sigma)$ in the $(\ln|x|,\Im\nu_2\arg(x))$-plane. It is never empty (see Figure 7). We choose $\nu_1$ such that $a = -\frac14\bigl[e^{i\pi\nu_1}/16^{\nu_2-1}\bigr]$. According to the Proposition in Section 4, the transcendent of the elliptic representation and $y(x;\sigma,a)$ of Theorem 1 coincide at the intersection. Equivalently, we can choose the identification $1-\sigma = 2-\nu_2$ and repeat the argument.

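(b) If $V=0$ the term
\[
\frac{1}{\sin^2\Bigl(-i\frac{\nu_2}{2}\ln x + \bigl[i\frac{\nu_2}{2}\ln 16 + \frac{\pi\nu_1}{2}\bigr] - i\frac{\nu_2}{2}g(x) + \frac{\pi v(x)}{2\omega_1(x)}\Bigr)}
\]
is oscillatory as $x\to 0$ and does not vanish. Note that there are no poles because the denominator does not vanish in $D(r;\nu_1,\nu_2)$ since $\bigl|\frac{e^{i\pi\nu_1}}{16^{\nu_2}}x^{\nu_2}\bigr|<r<1$. Then
\[
\begin{aligned}
y(x) &= O(x) + \frac{1}{F(x)^2}\,\frac{1}{\sin^2\Bigl(-i\frac{\nu_2}{2}\ln x + \bigl[i\frac{\nu_2}{2}\ln 16 + \frac{\pi\nu_1}{2}\bigr] - i\frac{\nu_2}{2}g(x) + \frac{v(x)}{F(x)}\Bigr)}\\
&= \frac{1+O(x)}{\sin^2\Bigl(-i\frac{\nu_2}{2}\ln x + \bigl[i\frac{\nu_2}{2}\ln 16 + \frac{\pi\nu_1}{2}\bigr] + \sum_{m=1}^{\infty}c_{0m}(\nu_2)\Bigl[\frac{e^{i\pi\nu_1}}{16^{\nu_2}}x^{\nu_2}\Bigr]^{m}\Bigr)} + O(x).
\end{aligned}
\]
The last step is obtained taking into account the nonvanishing term in (13) and
\[
\frac{\pi v(x)}{2\omega_1(x)} = \frac{v(x)}{F(x)} = v(x)\,(1+O(x)).
\]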

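(c) If $V=2$, the series
\[
-\sum_{n=1}^{\infty}\frac{n}{1-\bigl(\frac{e^{g(x)}}{16}\bigr)^{2n}x^{2n}}\,e^{n(2-\nu_2)g(x)}\Bigl[\frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}x^{2-\nu_2}\Bigr]^{n}e^{-i\pi n\frac{v(x)}{\omega_1(x)}}
\]
which appears in $y(x)$ is oscillating. Simplifying we obtain
\[
\begin{aligned}
y(x) &= O(x) - 4\,(1+O(x))\sum_{n=1}^{\infty}n\Bigl[\frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}x^{2-\nu_2}\Bigr]^{n}e^{-i\pi n\frac{v(x)}{\omega_1(x)}}\\
&= \frac{1+O(x)}{\sin^2\Bigl(i\frac{2-\nu_2}{2}\ln x + \bigl[i\frac{\nu_2-2}{2}\ln 16 + \frac{\pi\nu_1}{2}\bigr] + \sum_{m=1}^{\infty}b_{0m}(\nu_2)\Bigl[\frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}x^{2-\nu_2}\Bigr]^{m}\Bigr)} + O(x).
\end{aligned}
\]

The observation at the end of point (a) makes it possible to investigate the behavior of the transcendents of Theorem 1 along a path (8) with $\Sigma = 1$. The path (8) coincides with (40) for $V=0$ if we define $1-\sigma := \nu_2$, for $V=2$ if we define $1-\sigma = 2-\nu_2$.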

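In particular, we can analyze the radial convergence when $\Re\sigma = 1$. We identify $\nu_2 = 1-\sigma$ and choose $\nu_2 = i\nu$, $\nu\neq 0$ real. Namely, $\sigma = 1-i\nu$. Let $x\to 0$ in $D(r;\nu_1,i\nu)$ along the line $\arg(x) = $ constant (it is the line with $V=0$). We have
\[
\begin{aligned}
y(x) &= \frac{1}{F(x)^2}\,\frac{1}{\sin^2\Bigl(\frac{\nu}{2}\ln x - \frac{\nu}{2}\ln 16 + \frac{\pi}{2}\nu_1 + \frac{\nu}{2}g(x) + \frac{\pi v(x)}{2\omega_1(x)}\Bigr)} + O(x)\\
&= \frac{1+O(x)}{\sin^2\Bigl(\frac{\nu}{2}\ln x - \frac{\nu}{2}\ln 16 + \frac{\pi\nu_1}{2} + \sum_{m=1}^{\infty}c_{0m}(\nu)\Bigl[\Bigl(\frac{e^{i\pi\nu_1}}{16^{i\nu}}\Bigr)x^{i\nu}\Bigr]^{m} + O(x)\Bigr)} + O(x)\\
&= \frac{1+O(x)}{\sin^2\Bigl(\frac{\nu}{2}\ln x - \frac{\nu}{2}\ln 16 + \frac{\pi\nu_1}{2} + \sum_{m=1}^{\infty}c_{0m}(\nu)\Bigl[\Bigl(\frac{e^{i\pi\nu_1}}{16^{i\nu}}\Bigr)x^{i\nu}\Bigr]^{m}\Bigr)}.
\end{aligned}
\]
The last step is possible because
\[
\sin(f(x)+O(x)) = \sin(f(x)) + O(x) = \sin(f(x))\,(1+O(x))
\]
if $f(x)\not\to 0$, as $x\to 0$; this is our case for
\[
f(x) = \frac{\nu}{2}\ln x - \frac{\nu}{2}\ln 16 + \frac{\pi\nu_1}{2} + \sum_{m=1}^{\infty}c_{0m}(\nu)\Bigl[\Bigl(\frac{e^{i\pi\nu_1}}{16^{i\nu}}\Bigr)x^{i\nu}\Bigr]^{m}
\]
in $D$.

We observe that for $\Re\sigma = 1$ we have a limitation on $\arg(x)$ in $D(r;\nu_1,i\nu)$, namely
\[
-\pi\Im\nu_1 - \ln r < \nu\arg(x). \tag{41}
\]
This is the analogue of the limitation imposed by $B(\sigma,a;\theta_2,\tilde\sigma)$ of (31).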


Remark. If $\Im\nu_2\neq 0$, the freedom $\nu_2\mapsto\nu_2+2N$, $N\in\mathbf{Z}$, is the analogue of the freedom $\sigma\mapsto\pm\sigma+2n$. Moreover, Theorem 3 yields different critical behaviors for the same transcendent on the different domains corresponding to $\nu_2+2N$.

As a last remark, we observe that the coefficients in the expansion of $v(x)$ can be computed by direct substitution of $v$ into the elliptic form of PVIµ, the right-hand side being expanded in Fourier series.

5.2. SHIMOMURA’S REPRESENTATION

In [37] and [19] S. Shimomura proved the following statement for the Painlevé VI equation with any value of the parameters $\alpha$, $\beta$, $\gamma$, $\delta$.

For any complex number $k$ and for any $\sigma\notin(-\infty,0]\cup[1,+\infty)$, there is a sufficiently small $r$ such that the Painlevé VI equation for given $\alpha$, $\beta$, $\gamma$, $\delta$ has a holomorphic solution in the domain
\[
D_s(r;\sigma,k) = \{x\in\mathbf{C}_0 \mid |x|<r,\ |e^{-k}x^{1-\sigma}|<r,\ |e^{k}x^{\sigma}|<r\}
\]
with the following representation:
\[
y(x;\sigma,k) = \frac{1}{\cosh^2\Bigl(\frac{\sigma-1}{2}\ln x + \frac{k}{2} + \frac{v(x)}{2}\Bigr)},
\]
where
\[
v(x) = \sum_{n\ge 1}a_n(\sigma)x^n + \sum_{n\ge 0,\,m\ge 1}b_{nm}(\sigma)\,x^n\bigl(e^{-k}x^{1-\sigma}\bigr)^m + \sum_{n\ge 0,\,m\ge 1}c_{nm}(\sigma)\,x^n\bigl(e^{k}x^{\sigma}\bigr)^m,
\]
$a_n(\sigma)$, $b_{nm}(\sigma)$, $c_{nm}(\sigma)$ are rational functions of $\sigma$ and the series defining $v(x)$ is convergent (and holomorphic) in $D(r;\sigma,k)$. Moreover, there exists a constant $M = M(\sigma)$ such that
\[
|v(x)|\le M(\sigma)\bigl(|x| + |e^{-k}x^{1-\sigma}| + |e^{k}x^{\sigma}|\bigr). \tag{42}
\]

The domain $D(r;\sigma,k)$ is specified by the conditions:
\[
|x|<r,\qquad \Re\sigma\ln|x| + [\Re k - \ln r] < \Im\sigma\arg(x) < (\Re\sigma-1)\ln|x| + [\Re k + \ln r]. \tag{43}
\]

This is an open domain in the plane $(\ln|x|,\arg(x))$. It can be compared with the domain $D(\epsilon;\sigma,\theta_1,\theta_2)$ of Theorem 1 (Figure 8). Note that (43) imposes a limitation on $\arg(x)$. For example, if $\Re\sigma = 1$ we have $\Im\sigma\arg(x) < [\Re k + \ln r]$ ($\ln r<0$). This is similar to (41). We will show that Shimomura's transcendents coincide with


Figure 8.

those of Theorem 1 (see point (a.1) below). So, the above limitation turns out to be the analogue of the limitation imposed on $D(\epsilon;\sigma;\theta_1,\theta_2)$ by $B(\sigma,a;\theta_2,\tilde\sigma)$ of (31).

Like the elliptic representation, Shimomura's allows us to investigate what happens when $x\to 0$ along a path (8) with $\Sigma = 1$, contained in $D_s(r;\sigma,k)$. It is a radial path if $\Re\sigma = 1$. Along (8), we have $|x^{\sigma}| = |x|^{\Sigma}e^{-b}$. We suppose $\Im\sigma\neq 0$.

(a) $0\le\Sigma<1$. We observe that $|x^{1-\sigma}e^{-k}|\to 0$ as $x\to 0$ along the line. Then,
\[
\begin{aligned}
y(x;\sigma,k) &= \frac{1}{\cosh^2\Bigl(\frac{\sigma-1}{2}\ln x + \frac{k}{2} + \frac{v(x)}{2}\Bigr)}
= \frac{4}{x^{\sigma-1}e^{k}e^{v(x)} + x^{1-\sigma}e^{-k}e^{-v(x)} + 2}\\
&= 4e^{-k}e^{-v(x)}x^{1-\sigma}\,\frac{1}{\bigl(1+e^{-k}e^{-v(x)}x^{1-\sigma}\bigr)^2}
= 4e^{-k}e^{-v(x)}x^{1-\sigma}\bigl(1+e^{-v(x)}O(|e^{-k}x^{1-\sigma}|)\bigr).
\end{aligned}
\]
Two sub-cases:

(a.1) $\Sigma\neq 0$. Then $|x^{\sigma}e^{k}|\to 0$ and $v(x)\to 0$ (see (42)). Thus
\[
y(x;\sigma,k) = 4e^{-k}x^{1-\sigma}\bigl(1+O(|x| + |e^{k}x^{\sigma}| + |e^{-k}x^{1-\sigma}|)\bigr).
\]
By the proposition in Section 4, $y(x;\sigma,k)$ and $y(x;\sigma,a)$ coincide, for $a = 4e^{-k}$, in $D_s(r;\sigma,k)\cap D(\epsilon;\sigma;\theta_1,\theta_2)$. The intersection is not empty for any $\theta_1$, $\theta_2$. See Figure 8.
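The chain of equalities used in point (a) is the elementary identity $1/\cosh^2 w = 4t/(1+t)^2$ with $t = e^{-2w}$, and it can be checked numerically together with the leading behavior $4e^{-k}x^{1-\sigma}$. The sketch below is illustrative only: the function names are hypothetical, and `v` is a small sample complex number standing in for $v(x)$, not a value obtained from Shimomura's series.

```python
import cmath
import math


def shimomura_y(log_x, sigma, k, v):
    """1 / cosh^2((sigma-1)/2 ln x + k/2 + v/2)."""
    return 1 / cmath.cosh(0.5 * (sigma - 1) * log_x + 0.5 * k + 0.5 * v) ** 2


def shimomura_y_factored(log_x, sigma, k, v):
    """Same quantity written as 4 t / (1 + t)^2 with t = e^{-k} e^{-v} x^{1-sigma}."""
    t = cmath.exp(-k - v + (1 - sigma) * log_x)
    return 4 * t / (1 + t) ** 2


# Hypothetical sample values; v plays the role of a small value of v(x).
sigma = complex(0.4, 0.5)
k = complex(0.3, -0.1)
v = complex(0.01, 0.02)
log_x = complex(math.log(1e-3), 0.7)   # ln|x| + i arg(x)

y_direct = shimomura_y(log_x, sigma, k, v)
y_factored = shimomura_y_factored(log_x, sigma, k, v)
leading = 4 * cmath.exp(-k + (1 - sigma) * log_x)   # 4 e^{-k} x^{1-sigma}
print(y_direct, y_factored, y_direct / leading)
```

The ratio `y_direct / leading` tends to 1 only when both $e^{-k}x^{1-\sigma}$ and $v(x)$ are small, which is the situation of sub-case (a.1).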

(a.2) $\Sigma = 0$. $|x^{\sigma}e^{k}|\to$ constant $<r$, so $|v(x)|$ does not vanish. Then
\[
y(x) = a(x)\,x^{1-\sigma}\bigl(1+O(|e^{-k}x^{1-\sigma}|)\bigr),\qquad a(x) = 4e^{-k}e^{-v(x)},
\]
and $a(x)$ must coincide with (10) of Theorem 1:


Figure 9.

(b) $\Sigma = 1$. In this case Theorem 1 fails. Now $|x^{1-\sigma}e^{-k}|\to(\text{constant}\neq 0)<r$. Therefore, $y(x)$ does not vanish as $x\to 0$. We keep the representation
\[
y(x;\sigma,k) = \frac{1}{\cosh^2\Bigl(\frac{\sigma-1}{2}\ln x + \frac{k}{2} + \frac{v(x)}{2}\Bigr)}\equiv\frac{1}{\sin^2\Bigl(i\frac{\sigma-1}{2}\ln x + i\frac{k}{2} + i\frac{v(x)}{2} - \frac{\pi}{2}\Bigr)}.
\]
$v(x)$ does not vanish and $y(x)$ is oscillating as $x\to 0$, with no limit. We remark that like in the elliptic representation, $\cosh^2(\ldots)$ does not vanish in $D_s(r;\sigma,k)$, so we do not have poles. Figure 9 synthesizes points (a.1), (a.2), (b).

As an application, we consider the case $\Re\sigma = 1$, namely $\sigma = 1-i\nu$, $\nu\in\mathbf{R}\setminus\{0\}$. Then, the path corresponding to $\Sigma = 1$ is a radial path in the $x$-plane and
\[
y(x;1-i\nu,k) = \frac{1+O(x)}{\sin^2\Bigl(\frac{\nu}{2}\ln(x) + \frac{ik}{2} - \frac{\pi}{2} + \frac{i}{2}\sum_{m\ge 1}b_{0m}(\sigma)\bigl(e^{-k}x^{1-\sigma}\bigr)^m\Bigr)}.
\]

6. Analytic Continuation of a Branch

We describe the analytic continuation of the transcendent $y(x;\sigma,a)$. We choose a basis $\gamma_0$, $\gamma_1$ of two loops around 0 and 1, respectively, in the fundamental group $\pi(\mathbf{P}^1\setminus\{0,1,\infty\},b)$, where $b$ is the base-point (Figure 10). The analytic continuation of a branch $y(x;x_0,x_1,x_\infty)$ along paths encircling $x=0$ and $x=1$ (a loop around $x=\infty$ is homotopic to the product of $\gamma_0$, $\gamma_1$) is given by the action of the group of the pure braids on the monodromy data (Figure 11). This action is computed in [13], to which we refer. For a counter-clockwise loop around 0, we have to transform $(x_0,x_1,x_\infty)$ by the action of the braid $\beta_1^2$, where
\[
\beta_1:\ (x_0,x_1,x_\infty)\mapsto(-x_0,\ x_\infty-x_0x_1,\ x_1),
\]
\[
\beta_1^2:\ (x_0,x_1,x_\infty)\mapsto(x_0,\ x_1+x_0x_\infty-x_1x_0^2,\ x_\infty-x_0x_1).
\]


Figure 10.

Figure 11.

The analytic continuation of the branch $y(x;x_0,x_1,x_\infty)$ is the new branch

\[
y(x;\; x_0,\; x_1 + x_0x_\infty - x_1x_0^2,\; x_\infty - x_0x_1).
\]

For a counter-clockwise loop around 1, we need the braid $\beta_2^2$, given by

\[
\beta_2 : (x_0,x_1,x_\infty) \mapsto (x_\infty,\; -x_1,\; x_0 - x_1x_\infty),
\]
\[
\beta_2^2 : (x_0,x_1,x_\infty) \mapsto (x_0 - x_1x_\infty,\; x_1,\; x_\infty + x_0x_1 - x_\infty x_1^2).
\]

The analytic continuation of $y(x;x_0,x_1,x_\infty)$ is the new branch $y(x;\; x_0 - x_1x_\infty,\; x_1,\; x_\infty + x_0x_1 - x_\infty x_1^2)$.
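The formulae for $\beta_1^2$ and $\beta_2^2$ are obtained by applying $\beta_1$ and $\beta_2$ twice. The sketch below checks this with exact rational arithmetic on one sample triple (the triple itself is an arbitrary assumption). It also checks that both braids preserve the cubic $x_0^2 + x_1^2 + x_\infty^2 - x_0x_1x_\infty$; this invariance is not stated in the passage above and is included only as a consistency check that follows by direct substitution.

```python
from fractions import Fraction

def beta1(t):
    x0, x1, xinf = t
    return (-x0, xinf - x0 * x1, x1)

def beta2(t):
    x0, x1, xinf = t
    return (xinf, -x1, x0 - x1 * xinf)

def beta1_squared(t):
    """Closed formula for beta_1^2 quoted in the text."""
    x0, x1, xinf = t
    return (x0, x1 + x0 * xinf - x1 * x0**2, xinf - x0 * x1)

def beta2_squared(t):
    """Closed formula for beta_2^2 quoted in the text."""
    x0, x1, xinf = t
    return (x0 - x1 * xinf, x1, xinf + x0 * x1 - xinf * x1**2)

def cubic(t):
    x0, x1, xinf = t
    return x0**2 + x1**2 + xinf**2 - x0 * x1 * xinf

# Arbitrary sample monodromy data (illustrative assumption).
triple = (Fraction(3, 2), Fraction(-5, 7), Fraction(2, 3))

print(beta1(beta1(triple)) == beta1_squared(triple))
print(beta2(beta2(triple)) == beta2_squared(triple))
print(cubic(beta1(triple)) == cubic(triple), cubic(beta2(triple)) == cubic(triple))
```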

A generic loop in $\mathbf{P}^1\setminus\{0,1,\infty\}$ is represented by a braid $\beta$, which is a product of factors $\beta_1$ and $\beta_2$. The braid $\beta$ acts on $(x_0,x_1,x_\infty)$ and gives a new triple $(x_0^\beta, x_1^\beta, x_\infty^\beta)$ and a new branch $y(x;x_0^\beta,x_1^\beta,x_\infty^\beta)$.

On the other hand, $y(x;x_0,x_1,x_\infty)$ is the branch of a transcendent which has analytic continuation on the universal covering of $\mathbf{P}^1\setminus\{0,1,\infty\}$. We still denote this transcendent by $y(x;x_0,x_1,x_\infty)$, where $x$ is now regarded as a point in the universal covering. A loop transforms $x$ to a new point $x'$ in the covering. The transcendent at $x'$ is $y(x';x_0,x_1,x_\infty)$. Let $\beta$ be the corresponding braid. We have

\[
y(x;x_0^\beta,x_1^\beta,x_\infty^\beta) = y(x';x_0,x_1,x_\infty). \tag{44}
\]


Let $\sigma$, $a$ be associated to $(x_0,x_1,x_\infty)$ according to Theorem 2. Let $x \in D(\sigma)$. At $x$ we have $y(x;x_0,x_1,x_\infty) = y(x;\sigma,a)$. Let $\sigma^\beta$, $a^\beta = a(\sigma^\beta;x_0^\beta,x_1^\beta,x_\infty^\beta)$ be associated to $(x_0^\beta,x_1^\beta,x_\infty^\beta)$. If $D(\sigma) \cap D(\sigma^\beta)$ is not empty and $x$ also belongs to $D(\sigma^\beta)$, then $y(x;x_0^\beta,x_1^\beta,x_\infty^\beta) = y(x;\sigma^\beta,a^\beta)$ at $x$. If $x \notin D(\sigma^\beta)$, it belongs to one and only one of the domains $D(\pm\sigma^\beta + 2n)$ and $y(x;x_0^\beta,x_1^\beta,x_\infty^\beta) = y(x;\pm\sigma^\beta + 2n, a^\beta)$ at $x$, where $a^\beta = a(\pm\sigma^\beta + 2n;x_0^\beta,x_1^\beta,x_\infty^\beta)$. We note, however, that if $\Re\sigma^\beta = 1$, it may happen that $x$ lies in the strip between $B(\sigma^\beta)$ and $B(2-\sigma^\beta)$, where there may be poles (see the beginning of Section 5). In this case, we are not able to describe the analytic continuation (actually, the new branch may have a pole at $x$). But in this case, we can slightly change $\arg x$ in such a way that $x$ falls in a domain $D(\pm\sigma^\beta + 2n)$.

As an example, let us start at $x \in D(\sigma)$; we perform the loop $\gamma_1$ around 1 and go back to $x$. If $x$ also belongs to $D(\sigma^{\beta_2^2})$, the transformation is

\[
\gamma_1 : y(x;\sigma,a) \to y(x;\sigma^{\beta_2^2},a^{\beta_2^2}).
\]

If $x \notin D(\sigma^{\beta_2^2})$ but $x$ belongs to one of the $D(\pm\sigma^{\beta_2^2} + 2n)$, we have

\[
y(x;\sigma,a) \to y(x;\pm\sigma^{\beta_2^2} + 2n, a^{\beta_2^2}).
\]

Again, let us start at $x \in D(\sigma)$; we perform the loop $\gamma_0$ around 0 and we go back to $x$. The transformation of $(\sigma,a)$ according to the braid $\beta_1^2$ is

\[
(\sigma^{\beta_1^2}, a^{\beta_1^2}) = (\sigma,\; a e^{-2\pi i\sigma}), \tag{45}
\]

as follows from the fact that $x_0$ is not affected by $\beta_1^2$, so that $\sigma$ does not change, and from the explicit computation of $a(\sigma, x_0^{\beta_1^2}, x_1^{\beta_1^2}, x_\infty^{\beta_1^2})$ through Theorem 2 (we will do it at the end of Section 9). Therefore, the effect of $\gamma_0$ is

\[
\gamma_0 : y(x;\sigma,a) \to y(x;\sigma^{\beta_1^2},a^{\beta_1^2}) = y(x;\sigma, a e^{-2\pi i\sigma}).
\]

Since we are considering a loop around 0, it makes sense to consider it as a loop in $\mathbf{C}\setminus\{0\} \cap \{|x| < \epsilon\}$. The loop is $x \mapsto x' = e^{2\pi i}x$. Suppose that also $x' \in D(\sigma)$. Then we can represent the analytic continuation on the universal covering as $y(x;\sigma,a) \to y(x';\sigma,a)$. On the other hand, according to (44), we must have $y(x';\sigma,a) = y(x;\sigma^{\beta_1^2},a^{\beta_1^2})$. This is immediately verified because

\[
y(x';\sigma,a) = a[x']^{1-\sigma}\bigl(1 + O(|x'|^\delta)\bigr) = a e^{-2\pi i\sigma}x^{1-\sigma}\bigl(1 + O(|x|^\delta)\bigr) \equiv y(x;\sigma, a e^{-2\pi i\sigma}).
\]

Thus Theorem 1 is in accordance with the analytic continuation obtained by the action of the braid group.
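The agreement between Theorem 1 and (45) comes down to the fact that $x^{1-\sigma}$ picks up the factor $e^{-2\pi i\sigma}$ when $\arg x$ increases by $2\pi$. A minimal numerical sketch of this step follows; the helper `power_on_cover` and the sample values of $\sigma$, $a$ and $x$ are illustrative assumptions, and only the leading term $a x^{1-\sigma}$ is modelled.

```python
import cmath

def power_on_cover(log_abs: float, arg: float, exponent: complex) -> complex:
    """x**exponent for x = exp(log_abs + i*arg), with arg tracked on the universal covering."""
    return cmath.exp(exponent * complex(log_abs, arg))

# Sample parameters (illustrative assumptions).
sigma = 0.4 + 0.3j
a = 1.5 - 0.5j
log_abs, arg = -3.0, 0.7

# Leading term after the loop x -> x' = e^{2 pi i} x.
lead_after_loop = a * power_on_cover(log_abs, arg + 2 * cmath.pi, 1 - sigma)

# Leading term at the original point with the transformed parameter a e^{-2 pi i sigma}.
a_new = a * cmath.exp(-2j * cmath.pi * sigma)
lead_expected = a_new * power_on_cover(log_abs, arg, 1 - sigma)

gap = abs(lead_after_loop - lead_expected)
print("discrepancy:", gap)
```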

7. Singular Points x = 1, x = ∞ (Connection Problem)

In this section we restore the notation $\sigma^{(0)}$ and $a^{(0)}$ to denote the parameters of Theorem 1 near the critical point $x = 0$. We now describe the analogs of Theorem 1 near $x = 1$ and $x = \infty$. The three critical points 0, 1, $\infty$ are equivalent thanks to the symmetries discussed in [30] and [13].


Figure 12.

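(a) Let

\[
x = \frac{1}{t}, \qquad y(x) := \frac{1}{t}\,y(t). \tag{46}
\]

Then $y(x)$ is a solution of PVI$_\mu$ (variable $x$) if and only if $y(t)$ is a solution of PVI$_\mu$ (variable $t$). The singularities 0 and $\infty$ are exchanged. Theorem 1 holds for $y(t)$ at $t = 0$ with some parameters $\sigma$, $a$ that we now call $\sigma^{(\infty)}$, $a^{(\infty)}$. Then, we go back to $y(x)$ and find a transcendent $y(x;\sigma^{(\infty)},a^{(\infty)})$ with the behavior

\[
y(x;\sigma^{(\infty)},a^{(\infty)}) = a^{(\infty)}x^{\sigma^{(\infty)}}\left(1 + O\!\left(\frac{1}{|x|^\delta}\right)\right), \qquad x \to \infty, \tag{47}
\]

in

\[
D(M;\sigma^{(\infty)};\theta_1,\theta_2,\tilde{\sigma}) := \bigl\{x \in \mathbf{C}\setminus\{\infty\} \ \text{s.t.}\ |x| > M,\ e^{-\theta_1\Im\sigma^{(\infty)}}|x|^{-\tilde{\sigma}} \leq |x^{-\sigma^{(\infty)}}| \leq e^{-\theta_2\Im\sigma^{(\infty)}},\ 0 < \tilde{\sigma} < 1\bigr\}, \tag{48}
\]

where $M > 0$ is sufficiently large and $0 < \delta < 1$ is small (Figure 12).

(b) Let

\[
x = 1 - t, \qquad y(x) = 1 - y(t). \tag{49}
\]

$y(x)$ satisfies PVI$_\mu$ if and only if $y(t)$ satisfies PVI$_\mu$. Theorem 1 holds for $y(t)$ at $t = 0$ with some parameters $\sigma$, $a$ that we now call $\sigma^{(1)}$ and $a^{(1)}$. Going back to $y(x)$, we obtain a transcendent $y(x;\sigma^{(1)},a^{(1)})$ such that

\[
y(x;\sigma^{(1)},a^{(1)}) = 1 - a^{(1)}(1-x)^{1-\sigma^{(1)}}\bigl(1 + O(|1-x|^\delta)\bigr), \qquad x \to 1, \tag{50}
\]

in

\[
D(\epsilon;\sigma^{(1)};\theta_1,\theta_2,\tilde{\sigma}) := \bigl\{x \in \mathbf{C}\setminus\{1\} \ \text{s.t.}\ |1-x| < \epsilon,\ e^{-\theta_1\Im\sigma^{(1)}}|1-x|^{\tilde{\sigma}} \leq |(1-x)^{\sigma^{(1)}}| \leq e^{-\theta_2\Im\sigma^{(1)}},\ 0 < \tilde{\sigma} < 1\bigr\}. \tag{51}
\]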


Consider a branch $y(x;x_0,x_1,x_\infty)$. The symmetries in (a) and (b) affect the monodromy data, according to the following formulae proved in [13]:

\[
y(x;x_0,x_1,x_\infty) = \frac{1}{t}\,y(t;\,x_\infty,\,-x_1,\,x_0 - x_1x_\infty), \qquad x = \frac{1}{t}, \tag{52}
\]
\[
y(x;x_0,x_1,x_\infty) = 1 - y(t;\,x_1,\,x_0,\,x_0x_1 - x_\infty), \qquad x = 1 - t. \tag{53}
\]

We are ready to solve the connection problem for the transcendents of Theorem 1, thus extending the result of [13]. We recall that we always assume that $0 \leq \Re\sigma^{(i)} \leq 1$, $i = 0, 1, \infty$; otherwise we write $\pm\sigma^{(i)} + 2n$, $n \in \mathbf{Z}$.

We consider a transcendent $y(x;\sigma^{(0)},a^{(0)})$. We choose a point $x \in D(\sigma^{(0)})$. At $x$ there exists a unique branch $y(x;x_0,x_1,x_\infty)$ whose analytic continuation in $D(\sigma^{(0)})$ is precisely $y(x;\sigma^{(0)},a(\sigma^{(0)}))$, where the triple of monodromy data $(x_0,x_1,x_\infty)$ corresponds to $\sigma^{(0)}$, $a^{(0)}$ according to Theorem 2.

If we increase the absolute value of the point and we keep $\arg x$ constant, we obtain a new point $X = |X|\exp\{i\arg x\}$, where $|X|$ is large. The branch $y(x;x_0,x_1,x_\infty)$ is also defined at $X$, because we have not changed $\arg x$. According to (52), we compute $\sigma^{(\infty)}$, $a^{(\infty)}$ from the data $(x_\infty, -x_1, x_0 - x_1x_\infty)$ by the formulae of Theorem 2. Therefore, if $X \in D(M;\sigma^{(\infty)})$, the analytic continuation of $y(x;x_0,x_1,x_\infty) = y(x;\sigma^{(0)},a^{(0)})$ at $X$ is $y(X;\sigma^{(\infty)},a^{(\infty)})$.

We observe that if $0 \leq \Re\sigma^{(\infty)} < 1$, it is always possible to choose $X \in D(M;\sigma^{(\infty)})$, provided that $|X|$ is large enough. But for $\Re\sigma^{(\infty)} = 1$ we have a restriction on the argument of the points of $D(M;\sigma^{(\infty)})$ given by a set $B(\sigma^{(\infty)})$ analogous to (31). This implies that $X$ may not be chosen in $D(M;\sigma^{(\infty)})$ for any value of $|X|$. In this case, we can choose $X$ in one of the domains $D(M;\sigma^{(\infty)})$, $D(M;-\sigma^{(\infty)})$, $D(M;2-\sigma^{(\infty)})$, $D(M;\sigma^{(\infty)}-2)$. See Figure 13. This is almost always possible, except for the case when $\arg x$ lies in the strip between $B(\sigma^{(\infty)})$ and $B(2-\sigma^{(\infty)})$, where there may be movable poles (see the discussion about these strips at the beginning of Section 5).

Figure 13.

We recall that $a^{(\infty)}$ depends on $(x_\infty, -x_1, x_0 - x_1x_\infty)$ but it is also affected by the choice of $\pm\sigma^{(\infty)} + 2n$. Thus, we write below $a^{(\infty)}(\pm\sigma^{(\infty)} + 2n)$. We conclude that the analytic continuation of $y(x;x_0,x_1,x_\infty) = y(x;\sigma^{(0)},a^{(0)})$ at $X$ is either $y(X;\sigma^{(\infty)},a^{(\infty)}(\sigma^{(\infty)}))$, or $y(X;-\sigma^{(\infty)},a^{(\infty)}(-\sigma^{(\infty)}))$, or $y(X;2-\sigma^{(\infty)},a^{(\infty)}(2-\sigma^{(\infty)}))$, or $y(X;\sigma^{(\infty)}-2,a^{(\infty)}(\sigma^{(\infty)}-2))$, provided that $X$ is not in the strip where there may be poles. If $X$ falls in the strip, this is not actually a limitation, because we can slightly change $\arg x$ in such a way that $x$ is still in $D(\sigma^{(0)})$ and $X$ falls into $D(M;\sigma^{(\infty)}) \cup D(M;-\sigma^{(\infty)}) \cup D(M;2-\sigma^{(\infty)}) \cup D(M;\sigma^{(\infty)}-2)$.

In the same way we treat the connection problem between $x = 0$ and $x = 1$. We repeat the same argument taking (53) into account. We remark again that for $\Re\sigma^{(1)} = 1$ it is necessary to consider the union of $D(\sigma^{(1)})$, $D(-\sigma^{(1)})$, $D(2-\sigma^{(1)})$, $D(\sigma^{(1)}-2)$ to include all possible values of $\arg(1-x)$.

8. Proof of Theorem 1

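We recall that PVI$_\mu$ is equivalent to the Schlesinger equations for the $2\times 2$ matrices $A_0(x)$, $A_x(x)$, $A_1(x)$ of (32):

\[
\frac{dA_0}{dx} = \frac{[A_x,A_0]}{x}, \qquad
\frac{dA_1}{dx} = \frac{[A_1,A_x]}{1-x}, \qquad
\frac{dA_x}{dx} = \frac{[A_0,A_x]}{x} + \frac{[A_x,A_1]}{1-x}. \tag{54}
\]

The three right-hand sides sum to zero, so $A_0 + A_x + A_1$ is constant along the deformation. We look for solutions satisfying

\[
A_0(x) + A_x(x) + A_1(x) = \begin{pmatrix} -\mu & 0 \\ 0 & \mu \end{pmatrix} := -A_\infty, \qquad \mu \in \mathbf{C}, \quad 2\mu \notin \mathbf{Z},
\]
\[
\operatorname{tr}(A_i) = \det(A_i) = 0.
\]

Now let

\[
A(z,x) := \frac{A_0}{z} + \frac{A_x}{z-x} + \frac{A_1}{z-1}.
\]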

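We have explained that $y(x)$ is a solution of PVI$_\mu$ if and only if $A(y(x),x)_{12} = 0$. The system (54) is a particular case of the system

\[
\frac{dA_\mu}{dx} = \sum_{\nu=1}^{n_2}[A_\mu,B_\nu]\,f_{\mu\nu}(x),
\]
\[
\frac{dB_\nu}{dx} = -\frac{1}{x}\sum_{\nu'=1}^{n_2}[B_\nu,B_{\nu'}] + \sum_{\mu=1}^{n_1}[B_\nu,A_\mu]\,g_{\mu\nu}(x) + \sum_{\nu'=1}^{n_2}[B_\nu,B_{\nu'}]\,h_{\nu\nu'}(x), \tag{55}
\]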


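where the functions $f_{\mu\nu}$, $g_{\mu\nu}$, $h_{\mu\nu}$ are meromorphic with poles at $x = 1, \infty$ and $\sum_\nu B_\nu + \sum_\mu A_\mu = -A_\infty$ (here the subscript $\mu$ is a label, not the eigenvalue of $A_\infty$!). System (54) is obtained for

\[
f_{\mu\nu} = g_{\mu\nu} = \frac{b_\nu}{a_\mu - xb_\nu}, \qquad h_{\mu\nu} = 0,
\]
\[
n_1 = 1, \quad n_2 = 2, \quad a_1 = b_2 = 1, \quad b_1 = 0
\]

and

\[
B_1 = A_0, \qquad B_2 = A_x, \qquad A_1 = A_1.
\]

We prove the analogous result of [33], p. 262, in the domain $D(\epsilon;\sigma;\theta_1,\theta_2,\tilde{\sigma})$ for $\sigma \notin (-\infty,0) \cup [1,+\infty)$:

LEMMA 1. Consider matrices $B_\nu^0$ ($\nu = 1,\ldots,n_2$), $A_\mu^0$ ($\mu = 1,\ldots,n_1$) and $\Lambda$, to be independent of $x$ and such that

\[
\sum_\nu B_\nu^0 + \sum_\mu A_\mu^0 = -A_\infty,
\]
\[
\sum_\nu B_\nu^0 = \Lambda, \qquad \text{eigenvalues}(\Lambda) = \frac{\sigma}{2},\ -\frac{\sigma}{2}, \qquad \sigma \notin (-\infty,0) \cup [1,+\infty).
\]

Suppose that $f_{\mu\nu}$, $g_{\mu\nu}$, $h_{\mu\nu}$ are holomorphic if $|x| < \epsilon'$, for some small $\epsilon' < 1$. For any $0 < \tilde{\sigma} < 1$ and $\theta_1$, $\theta_2$ real, there exists a sufficiently small $0 < \epsilon < \epsilon'$ such that the system (55) has holomorphic solutions $A_\mu(x)$, $B_\nu(x)$ in $D(\epsilon;\sigma;\theta_1,\theta_2,\tilde{\sigma})$ satisfying

\[
\|A_\mu(x) - A_\mu^0\| \leq C|x|^{1-\sigma_1}, \qquad \|x^{-\Lambda}B_\nu(x)x^{\Lambda} - B_\nu^0\| \leq C|x|^{1-\sigma_1}.
\]

Here $C$ is a positive constant and $\tilde{\sigma} < \sigma_1 < 1$.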

Important remark. There is no need to assume here that $2\mu \notin \mathbf{Z}$. The theorem holds true for any value of $\mu$. If in the system (55) the functions $f_{\mu\nu}$, $g_{\mu\nu}$, $h_{\mu\nu}$ are chosen in such a way as to yield Schlesinger equations for the Fuchsian system of PVI$_\mu$, the assumption $2\mu \notin \mathbf{Z}$ is still not necessary, provided that the matrix $R$ in (22) is considered as a monodromy datum independent of the deformation parameter $x$.

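Proof. Let $A(x)$ and $B(x)$ be $2\times 2$ matrices holomorphic on $D(\epsilon;\sigma)$ (we omit $\theta_1$, $\theta_2$, $\tilde{\sigma}$) and such that

\[
\|A(x)\| \leq C_1, \qquad \|B(x)\| \leq C_2 \qquad \text{on } D(\epsilon;\sigma).
\]

Let $f(x)$ be a holomorphic function for $|x| < \epsilon'$. Let $\sigma_2$ be a real number such that $\tilde{\sigma} < \sigma_2 < 1$. Then, there exists a sufficiently small $\epsilon < \epsilon'$ such that for $x \in D(\epsilon;\sigma)$, we have

\[
\|x^{\pm\Lambda}A(x)x^{\mp\Lambda}\| \leq C_1|x|^{-\sigma_2}, \qquad \|x^{\pm\Lambda}B(x)x^{\mp\Lambda}\| \leq C_2|x|^{-\sigma_2},
\]
\[
\left\|x^{-\Lambda}\int_{L(x)} ds\, A(s)\,s^{\Lambda}B(s)s^{-\Lambda}f(s)\; x^{\Lambda}\right\| \leq C_1C_2|x|^{1-\sigma_2},
\]
\[
\left\|x^{-\Lambda}\int_{L(x)} ds\, s^{\Lambda}B(s)s^{-\Lambda}A(s)f(s)\; x^{\Lambda}\right\| \leq C_1C_2|x|^{1-\sigma_2},
\]

where $L(x)$ is a path in $D(\epsilon;\sigma)$ joining 0 to $x$. To prove the estimates, we observe that

\[
\|x^{\Lambda}\| = \|x^{\operatorname{diag}(\frac{\sigma}{2},-\frac{\sigma}{2})}\| = \max\{|x^\sigma|^{\frac{1}{2}}, |x^\sigma|^{-\frac{1}{2}}\} \leq e^{\frac{\theta_1}{2}\Im\sigma}|x|^{-\frac{\tilde{\sigma}}{2}}, \qquad \text{in } D(\epsilon;\sigma).
\]

Note here the importance of the bound $|x^\sigma| \leq e^{-\theta_2\Im\sigma}$ in the definition of $D(\epsilon;\sigma)$: it determines the above estimates of $\|x^{\Lambda}\|$ because it ensures that $|x^{-\sigma}|^{\frac{1}{2}}$ is dominant. If this were not true, the lemma would fail and Theorem 1 could not be proved. Now we estimate

\[
\|x^{\Lambda}A(x)x^{-\Lambda}\| \leq \|x^{\Lambda}\|\,\|A(x)\|\,\|x^{-\Lambda}\| \leq e^{\theta_1\Im\sigma}C_1|x|^{-\tilde{\sigma}} = \bigl(e^{\theta_1\Im\sigma}|x|^{\sigma_2-\tilde{\sigma}}\bigr)C_1|x|^{-\sigma_2}.
\]

Thus, if $\epsilon$ is small enough (we require $\epsilon^{\sigma_2-\tilde{\sigma}} \leq e^{-\theta_1\Im\sigma}$) we obtain $\|x^{\Lambda}A(x)x^{-\Lambda}\| \leq C_1|x|^{-\sigma_2}$.

We turn to the integrals. We choose a real number $\sigma^*$ such that $0 \leq \sigma^* \leq \tilde{\sigma}$ and we choose a path $L(x)$ from 0 to $x$, represented in Figure 14. For $\Im\sigma \neq 0$, $L(x)$ is

Figure 14. Path of integration.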


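given by

\[
\arg(s) = a\log|s| + b, \qquad a = \frac{\Re\sigma - \sigma^*}{\Im\sigma}, \qquad b = \arg x - \frac{\Re\sigma - \sigma^*}{\Im\sigma}\log|x|, \qquad |s| \leq |x|.
\]

For $\Im\sigma = 0$, we choose $L(x)$ with $\sigma^* = \sigma$ and $\arg(s) = \arg(x)$. Note that on $L(x)$ we have $|s^\sigma| = |x^\sigma|\,(|s|^{\sigma^*}/|x|^{\sigma^*})$. Then we compute

\[
\left\|x^{-\Lambda}\int_{L(x)} ds\, A(s)\,s^{\Lambda}B(s)s^{-\Lambda}f(s)\; x^{\Lambda}\right\|
= \left\|\int_{L(x)} ds\; x^{-\Lambda}A(s)x^{\Lambda}\left(\frac{s}{x}\right)^{\Lambda}B(s)\left(\frac{s}{x}\right)^{-\Lambda}f(s)\right\|
\]
\[
\leq e^{\theta_1\Im\sigma}|x|^{-\tilde{\sigma}}\,C_1C_2\max_{|x|<\epsilon}|f(x)|\int_{L(x)}|ds|\,\frac{|s|^{-\sigma^*}}{|x|^{-\sigma^*}}.
\]

The last step in the above inequality follows from

\[
\left\|\left(\frac{s}{x}\right)^{\Lambda}\right\| = \left\|\operatorname{diag}\left(\frac{s^{\frac{\sigma}{2}}}{x^{\frac{\sigma}{2}}}, \frac{s^{-\frac{\sigma}{2}}}{x^{-\frac{\sigma}{2}}}\right)\right\|
= \max_{L}\left\{\frac{|s^{\frac{\sigma}{2}}|}{|x^{\frac{\sigma}{2}}|}, \frac{|s^{-\frac{\sigma}{2}}|}{|x^{-\frac{\sigma}{2}}|}\right\}
= \max\left\{\frac{|s|^{\frac{\sigma^*}{2}}}{|x|^{\frac{\sigma^*}{2}}}, \frac{|s|^{-\frac{\sigma^*}{2}}}{|x|^{-\frac{\sigma^*}{2}}}\right\}
= \frac{|s|^{-\frac{\sigma^*}{2}}}{|x|^{-\frac{\sigma^*}{2}}}, \qquad |s| \leq |x|.
\]

We choose the parameter $\rho = |s|$ on $L(x)$; therefore

\[
s = \rho\,\exp\left\{i\left[\arg x + \frac{\Re\sigma - \sigma^*}{\Im\sigma}\log\frac{\rho}{|x|}\right]\right\}, \qquad 0 < \rho \leq |x|,
\]

and we obtain

\[
|ds| = P(\sigma,\sigma^*)\,d\rho, \qquad
P(\sigma,\sigma^*) := \begin{cases} \sqrt{1 + \left(\dfrac{\Re\sigma - \sigma^*}{\Im\sigma}\right)^2} & \text{for } \Im\sigma \neq 0, \\[2mm] 1 & \text{for } \Im\sigma = 0, \end{cases}
\]
\[
\int_{L(x)}|ds|\,|s|^{-\sigma^*} = P(\sigma,\sigma^*)\int_0^{|x|}d\rho\,\rho^{-\sigma^*} = \frac{P(\sigma,\sigma^*)}{1-\sigma^*}\,|x|^{1-\sigma^*}.
\]

Let $P(\sigma) := \max_{\sigma^*}P(\sigma,\sigma^*)$. The initial integral is less than or equal to

\[
e^{\theta_1\Im\sigma}\max_{|x|<\epsilon}|f(x)|\;C_1C_2\,\frac{P(\sigma)}{1-\tilde{\sigma}}\,|x|^{1-\tilde{\sigma}}.
\]

Now, we write $|x|^{1-\tilde{\sigma}} = |x|^{\sigma_2-\tilde{\sigma}}|x|^{1-\sigma_2}$ and we obtain, for sufficiently small $\epsilon$,

\[
e^{\theta_1\Im\sigma}\max_{|x|<\epsilon}|f(x)|\;C_1C_2\,\frac{P(\sigma)}{1-\tilde{\sigma}}\,|x|^{1-\tilde{\sigma}} \leq C_1C_2|x|^{1-\sigma_2}.
\]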
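Two properties of the spiral path $L(x)$ drive the integral estimate: along it $|s^\sigma| = |x^\sigma|\,(|s|/|x|)^{\sigma^*}$, and the arc-length element is $|ds| = P(\sigma,\sigma^*)\,d\rho$. The following sketch checks both numerically. All numerical values are arbitrary assumptions, and the path slope is taken to be $(\Re\sigma - \sigma^*)/\Im\sigma$, which is the reading of the path definition used here.

```python
import cmath
import math

# Sample data (illustrative assumptions).
sigma = 0.6 + 0.8j        # complex exponent with nonzero imaginary part
sigma_star = 0.25         # real number with 0 <= sigma_star < 1
x_abs, x_arg = 0.05, 1.2  # endpoint x of the path L(x)

slope = (sigma.real - sigma_star) / sigma.imag

def point(rho: float):
    """(log|s|, arg s) for the point of L(x) with |s| = rho."""
    return math.log(rho), x_arg + slope * math.log(rho / x_abs)

def abs_s_pow_sigma(rho: float) -> float:
    log_abs, arg = point(rho)
    return abs(cmath.exp(sigma * complex(log_abs, arg)))

def s_of(rho: float) -> complex:
    log_abs, arg = point(rho)
    return cmath.exp(complex(log_abs, arg))

# Property 1: |s^sigma| / (|x^sigma| (|s|/|x|)^sigma_star) should equal 1.
x_pow = abs_s_pow_sigma(x_abs)
ratios = [abs_s_pow_sigma(r) / (x_pow * (r / x_abs) ** sigma_star)
          for r in (0.04, 0.01, 0.001)]

# Property 2: |ds/d rho| should equal P = sqrt(1 + slope^2).
h, rho0 = 1e-7, 0.01
numeric_speed = abs(s_of(rho0 + h) - s_of(rho0 - h)) / (2 * h)
P = math.sqrt(1 + slope ** 2)

print(ratios, numeric_speed, P)
```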


We remark that the above estimates are still valid for $\sigma = 0$. Actually

\[
\|x^{\Lambda}\| \equiv \left\|x^{\left(\begin{smallmatrix} 0 & 1 \\ 0 & 0 \end{smallmatrix}\right)}\right\|
\]

diverges like $|\log x|$, $\|x^{\Lambda}A(x)x^{-\Lambda}\|$ is less than or equal to $C_1|\log(x)|^2$ and, finally, $\|x^{-\Lambda}\int_{L(x)}ds\,A(s)s^{\Lambda}B(s)s^{-\Lambda}f(s)\,x^{\Lambda}\|$ is less than or equal to $C_1C_2\max|f|\,|\log(x)|^2\int_{L(x)}|ds|\,|\log s|^2$. We choose $L(x)$ to be a radial path $s = \rho\exp(i\arg x)$, $0 < \rho \leq |x|$. Then the integral is $|x|(\log^2|x| - 2\log|x| + 2 + \alpha^2)$, with $\alpha = \arg x$. The factor $|x|$ does the job because we rewrite it as $|x|^{\sigma_2}|x|^{1-\sigma_2}$ (here $\sigma_2$ is any number between 0 and 1) and we proceed as above to choose $\epsilon$ small enough in such a way that ($\max|f|\,|x|^{\sigma_2} \times$ function diverging like $\log^2|x|$) $\leq 1$.

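The estimates above are useful in proving the lemma.

We solve the Schlesinger equations by successive approximations, as in [33]: let $\tilde{B}_\nu(x) := x^{-\Lambda}B_\nu(x)x^{\Lambda}$. The Schlesinger equations are rewritten as

\[
\frac{dA_\mu}{dx} = \sum_{\nu=1}^{n_2}[A_\mu, x^{\Lambda}\tilde{B}_\nu x^{-\Lambda}]\,f_{\mu\nu}(x), \tag{56}
\]
\[
\frac{d\tilde{B}_\nu}{dx} = \frac{1}{x}\left[\tilde{B}_\nu, \sum_\mu x^{-\Lambda}(A_\mu(x) - A_\mu^0)x^{\Lambda}\right] + \sum_{\mu=1}^{n_1}[\tilde{B}_\nu, x^{-\Lambda}A_\mu x^{\Lambda}]\,g_{\mu\nu}(x) + \sum_{\nu'=1}^{n_2}[\tilde{B}_\nu, \tilde{B}_{\nu'}]\,h_{\nu\nu'}(x). \tag{57}
\]

We consider the following system of integral equations:

\[
A_\mu(x) = A_\mu^0 + \int_{L(x)}ds\sum_\nu[A_\mu(s), s^{\Lambda}\tilde{B}_\nu(s)s^{-\Lambda}]\,f_{\mu\nu}(s), \tag{58}
\]
\[
\tilde{B}_\nu(x) = B_\nu^0 + \int_{L(x)}ds\left\{\frac{1}{s}\left[\tilde{B}_\nu(s), \sum_\mu s^{-\Lambda}(A_\mu(s) - A_\mu^0)s^{\Lambda}\right] + \sum_\mu[\tilde{B}_\nu(s), s^{-\Lambda}A_\mu(s)s^{\Lambda}]\,g_{\mu\nu}(s) + \sum_{\nu'}[\tilde{B}_\nu(s), \tilde{B}_{\nu'}(s)]\,h_{\nu\nu'}\right\}. \tag{59}
\]

We solve it by successive approximations:

\[
A_\mu^{(k)}(x) = A_\mu^0 + \int_{L(x)}ds\sum_\nu[A_\mu^{(k-1)}(s), s^{\Lambda}\tilde{B}_\nu^{(k-1)}(s)s^{-\Lambda}]\,f_{\mu\nu}(s),
\]
\[
\tilde{B}_\nu^{(k)}(x) = B_\nu^0 + \int_{L(x)}ds\left\{\frac{1}{s}\left[\tilde{B}_\nu^{(k-1)}(s), \sum_\mu s^{-\Lambda}(A_\mu^{(k-1)}(s) - A_\mu^0)s^{\Lambda}\right] + \sum_\mu[\tilde{B}_\nu^{(k-1)}(s), s^{-\Lambda}A_\mu^{(k-1)}(s)s^{\Lambda}]\,g_{\mu\nu}(s) + \sum_{\nu'}[\tilde{B}_\nu^{(k-1)}(s), \tilde{B}_{\nu'}^{(k-1)}(s)]\,h_{\nu\nu'}\right\}.
\]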


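The functions $A_\mu^{(k)}(x)$, $\tilde{B}_\nu^{(k)}(x)$ are holomorphic in $D(\epsilon;\sigma)$, by construction. Observe that $\|A_\mu^0\| \leq C$, $\|B_\nu^0\| \leq C$ for some constant $C$. We claim that for $|x|$ sufficiently small,

\[
\|A_\mu^{(k)}(x) - A_\mu^0\| \leq C|x|^{1-\sigma_1},
\]
\[
\|x^{-\Lambda}(A_\mu^{(k)}(x) - A_\mu^0)x^{\Lambda}\| \leq C^2|x|^{1-\sigma_2}, \tag{60}
\]
\[
\|\tilde{B}_\nu^{(k)}(x) - B_\nu^0\| \leq C|x|^{1-\sigma_1},
\]

where $\tilde{\sigma} < \sigma_2 < \sigma_1 < 1$. Note that the above inequalities imply $\|A_\mu^{(k)}\| \leq 2C$, $\|\tilde{B}_\nu^{(k)}\| \leq 2C$. Moreover, we claim that

\[
\|A_\mu^{(k)}(x) - A_\mu^{(k-1)}(x)\| \leq C\delta^{k-1}|x|^{1-\sigma_1},
\]
\[
\|x^{-\Lambda}(A_\mu^{(k)}(x) - A_\mu^{(k-1)}(x))x^{\Lambda}\| \leq C^2\delta^{k-1}|x|^{1-\sigma_2}, \tag{61}
\]
\[
\|\tilde{B}_\nu^{(k)}(x) - \tilde{B}_\nu^{(k-1)}(x)\| \leq C\delta^{k-1}|x|^{1-\sigma_1},
\]

where $0 < \delta < 1$.

The above inequalities are proved for $k = 1$, using the simple methods used in the estimates at the beginning of the proof. Then we proceed by induction, still using the same estimates. As an example, we prove the $(k+1)$th step of the first of (61) supposing that the $k$th step of (61) is true. All the other inequalities are proved in the same way. Let us consider:

\[
\|A_\mu^{(k+1)}(x) - A_\mu^{(k)}(x)\|
= \left\|\int_{L(x)}ds\sum_{\nu=1}^{n_2}\Bigl(A_\mu^{(k)}s^{\Lambda}\tilde{B}_\nu^{(k)}s^{-\Lambda} - A_\mu^{(k-1)}s^{\Lambda}\tilde{B}_\nu^{(k-1)}s^{-\Lambda} + s^{\Lambda}\tilde{B}_\nu^{(k-1)}s^{-\Lambda}A_\mu^{(k-1)} - s^{\Lambda}\tilde{B}_\nu^{(k)}s^{-\Lambda}A_\mu^{(k)}\Bigr)f_{\mu\nu}(s)\right\|
\]
\[
\leq \int_{L(x)}|ds|\sum_{\nu=1}^{n_2}\|A_\mu^{(k)}s^{\Lambda}\tilde{B}_\nu^{(k)}s^{-\Lambda} - A_\mu^{(k-1)}s^{\Lambda}\tilde{B}_\nu^{(k-1)}s^{-\Lambda}\|\,|f_{\mu\nu}(s)|
+ \int_{L(x)}|ds|\sum_{\nu=1}^{n_2}\|s^{\Lambda}\tilde{B}_\nu^{(k-1)}s^{-\Lambda}A_\mu^{(k-1)} - s^{\Lambda}\tilde{B}_\nu^{(k)}s^{-\Lambda}A_\mu^{(k)}\|\,|f_{\mu\nu}(s)|.
\]

Now we estimate

\[
\|A_\mu^{(k)}s^{\Lambda}\tilde{B}_\nu^{(k)}s^{-\Lambda} - A_\mu^{(k-1)}s^{\Lambda}\tilde{B}_\nu^{(k-1)}s^{-\Lambda}\|
\leq \|A_\mu^{(k)}s^{\Lambda}\tilde{B}_\nu^{(k)}s^{-\Lambda} - A_\mu^{(k-1)}s^{\Lambda}\tilde{B}_\nu^{(k)}s^{-\Lambda}\| + \|A_\mu^{(k-1)}s^{\Lambda}\tilde{B}_\nu^{(k)}s^{-\Lambda} - A_\mu^{(k-1)}s^{\Lambda}\tilde{B}_\nu^{(k-1)}s^{-\Lambda}\|
\]
\[
\leq \|A_\mu^{(k)} - A_\mu^{(k-1)}\|\,\|s^{\Lambda}\tilde{B}_\nu^{(k)}s^{-\Lambda}\| + \|A_\mu^{(k-1)}\|\,\|s^{\Lambda}\|\,\|\tilde{B}_\nu^{(k)} - \tilde{B}_\nu^{(k-1)}\|\,\|s^{-\Lambda}\|.
\]

By induction,

\[
\leq (C\delta^{k-1}|s|^{1-\sigma_1})\,2C\,e^{\theta_1\Im\sigma}|s|^{-\tilde{\sigma}} + 2C\,(C\delta^{k-1}|s|^{1-\sigma_1})\,e^{\theta_1\Im\sigma}|s|^{-\tilde{\sigma}}.
\]

The other term is estimated in an analogous way. Then

\[
\|A_\mu^{(k+1)} - A_\mu^{(k)}\| \leq \frac{P(\sigma)}{1-\tilde{\sigma}}\,8n_2C^2\max|f_{\mu\nu}|\,\delta^{k-1}e^{\theta_1\Im\sigma}|x|^{1-\tilde{\sigma}}|x|^{1-\sigma_1}.
\]

We choose $\epsilon$ small enough to have

\[
\frac{P(\sigma)}{1-\tilde{\sigma}}\,8n_2C\max|f|\,e^{\theta_1\Im\sigma}|x|^{1-\tilde{\sigma}} \leq \delta.
\]

Note that the choice of $\epsilon$ is independent of $k$. In the case $\sigma = 0$, $|x|^{1-\tilde{\sigma}}$ is substituted by $|x|(\log^2|x| + O(\log|x|))$. ✷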

The inequalities (60), (61) imply the convergence of the successive approximations to a solution of the integral equations (58), (59) satisfying the assertion of the lemma, plus the additional inequality

\[
\|x^{-\Lambda}(A_\mu(x) - A_\mu^0)x^{\Lambda}\| \leq C^2|x|^{1-\sigma_2}.
\]

In order to prove that the solution also solves the differential equations (56), (57), we need the following sub-lemma:

SUB-LEMMA 1. Let $f(x)$ be a holomorphic function in $D(\epsilon,\sigma)$ such that $f(x) = O(|x| + |x^{1-\sigma}|)$ for $x \to 0$ in $D(\epsilon,\sigma)$. Then $F(x) := \int_{L(x)}(1/s)f(s)\,ds$ is holomorphic in $D(\epsilon,\sigma)$ and $dF(x)/dx = (1/x)f(x)$.

We understand that the sub-lemma applies to our case because the entries of the matrices in the integrals in (58), (59) are of order $s^{-1}$, $s^{-\sigma}$, or higher. Thus, if we prove it, the proof of Lemma 1 will be complete.

Proof of Sub-Lemma 1. Let $x + \Delta x$ be another point in $D(\epsilon;\sigma)$ close to $x$. To prove the sub-lemma, it is enough to prove that

\[
\int_{L(x+\Delta x)}\frac{1}{s}f(s)\,ds - \int_{L(x)}\frac{1}{s}f(s)\,ds = \int_x^{x+\Delta x}\frac{1}{s}f(s)\,ds,
\]

where the last integral is on a segment from $x$ to $x + \Delta x$. Namely, we prove that

\[
\left(\int_{L(x)} - \int_{L(x+\Delta x)} + \int_x^{x+\Delta x}\right)ds\,\frac{f(s)}{s} = 0.
\]

We consider a small disk $U_R$ centered at $x = 0$ of small radius $R < \min\{\epsilon, |x|\}$ and the points

\[
x_R := L(x) \cap U_R, \qquad x'_R := L(x+\Delta x) \cap U_R.
\]

Since the integral of $f/s$ on a finite closed curve (not containing 0) is zero, we have

\[
\left(\int_{L(x)} - \int_{L(x+\Delta x)} + \int_x^{x+\Delta x}\right)ds\,\frac{f(s)}{s}
= \left(\int_{L(x_R)} - \int_{L(x'_R)} + \int_{\gamma(x_R,x'_R)}\right)ds\,\frac{f(s)}{s}. \tag{62}
\]


The last integral is on the arc $\gamma(x_R,x'_R)$ from $x_R$ to $x'_R$ on the circle $|s| = R$. We have also taken into account the obvious fact that $L(x_R)$ is contained in $L(x)$ and $L(x'_R)$ is contained in $L(x+\Delta x)$.

We take $R \to 0$ and we prove that the right-hand side in (62) vanishes. First of all, we use the hypothesis, we estimate integrals in the same way we did before, and we obtain

\[
\left|\int_{L(x_R)}\frac{f(s)}{s}\,ds\right| \leq \int_{L(x_R)}\frac{1}{|s|}\,O(|s| + |s^{1-\sigma}|)\,|ds| \leq \frac{P(\sigma,\sigma^*)}{1-\sigma^*}\,O(R + O(R^{1-\sigma^*})).
\]

Therefore $\int_{L(x_R)}(f(s)/s)\,ds \to 0$ for $R \to 0$ (recall that $0 \leq \sigma^* < 1$). In the same way we prove that $\int_{L(x'_R)}(f(s)/s)\,ds \to 0$ for $R \to 0$. We finally estimate the integral on the arc. Since $x_R \in L(x)$ and $x'_R \in L(x+\Delta x)$, we have

\[
\arg x_R = \arg x + \frac{\Re\sigma - \sigma^*}{\Im\sigma}\log\frac{R}{|x|},
\]
\[
\arg x'_R = \arg(x+\Delta x) + \frac{\Re\sigma - \sigma^*}{\Im\sigma}\log\frac{R}{|x+\Delta x|}.
\]

Thus

\[
|\arg x_R - \arg x'_R| = \left|\arg x - \arg(x+\Delta x) + \frac{\Re\sigma - \sigma^*}{\Im\sigma}\log\left|1 + \frac{\Delta x}{x}\right|\right|
\]

is independent of $R$. This implies that the length of $\gamma(x_R,x'_R)$ is $O(R)$. Moreover, $f(x) = O(R + R^{1-\sigma^*})$ on the arc. Hence,

\[
\left|\int_{\gamma(x_R,x'_R)}\frac{1}{s}f(s)\,ds\right| \leq \frac{1}{R}\int_\gamma|f(s)|\,|ds| = O(R^{1-\sigma^*}) \to 0 \qquad \text{for } R \to 0.
\]

This completes the proof of Sub-Lemma 1 and Lemma 1. ✷

We observe that in the proof of Lemma 1 we imposed $(P(\sigma)/(1-\tilde{\sigma}))\,8n_2C\max|f|\,e^{\theta_1\Im\sigma}|x|^{1-\tilde{\sigma}} \leq \delta$. We obtain an important condition on $\epsilon$ which we used for the Remark in Section 3:

\[
e^{\theta_1\Im\sigma}|\epsilon|^{1-\tilde{\sigma}} \leq c, \qquad c := \frac{\delta}{8n_2C}\,\frac{1-\tilde{\sigma}}{P(\sigma)}\,\frac{1}{\max|f_{\mu\nu}|} \tag{63}
\]

(here $C = \max\{\|A_\mu^0\|, \|B_\nu^0\|\}$).

We turn to the case with which we are concerned: we consider three matrices $A_0^0$, $A_x^0$, $A_1^0$ such that

\[
A_0^0 + A_x^0 = \Lambda, \qquad A_0^0 + A_x^0 + A_1^0 = \operatorname{diag}(-\mu,\mu),
\]
\[
\operatorname{tr}(A_i^0) = \det(A_i^0) = 0, \qquad i = 0, x, 1.
\]


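LEMMA 2. Let $r$ and $s$ be two complex numbers not equal to 0 and $\infty$. Let $T$ be the matrix which brings $\Lambda$ to the Jordan form:

\[
T^{-1}\Lambda T = \begin{cases} \operatorname{diag}\left(\dfrac{\sigma}{2}, -\dfrac{\sigma}{2}\right), & \sigma \neq 0, \\[3mm] \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, & \sigma = 0. \end{cases}
\]

The general solution of

\[
A_0^0 + A_x^0 + A_1^0 = \begin{pmatrix} -\mu & 0 \\ 0 & \mu \end{pmatrix}, \qquad \operatorname{tr}(A_i^0) = \det(A_i^0) = 0, \qquad A_0^0 + A_x^0 = \Lambda
\]

is the following:

For $\sigma \neq 0, \pm 2\mu$:

\[
\Lambda = \frac{1}{8\mu}\begin{pmatrix} -\sigma^2 - (2\mu)^2 & (\sigma^2 - (2\mu)^2)\,r \\[1mm] \dfrac{(2\mu)^2 - \sigma^2}{r} & \sigma^2 + (2\mu)^2 \end{pmatrix},
\qquad
A_1^0 = \frac{\sigma^2 - (2\mu)^2}{8\mu}\begin{pmatrix} 1 & -r \\[1mm] \dfrac{1}{r} & -1 \end{pmatrix},
\]
\[
A_0^0 = T\begin{pmatrix} \dfrac{\sigma}{4} & \dfrac{\sigma}{4}\,s \\[2mm] -\dfrac{\sigma}{4}\,\dfrac{1}{s} & -\dfrac{\sigma}{4} \end{pmatrix}T^{-1},
\qquad
A_x^0 = T\begin{pmatrix} \dfrac{\sigma}{4} & -\dfrac{\sigma}{4}\,s \\[2mm] \dfrac{\sigma}{4}\,\dfrac{1}{s} & -\dfrac{\sigma}{4} \end{pmatrix}T^{-1},
\]

where

\[
T = \begin{pmatrix} 1 & 1 \\[1mm] \dfrac{(\sigma+2\mu)^2}{\sigma^2 - (2\mu)^2}\,\dfrac{1}{r} & \dfrac{(\sigma-2\mu)^2}{\sigma^2 - (2\mu)^2}\,\dfrac{1}{r} \end{pmatrix}.
\]

For $\sigma = -2\mu$: $A_0^0$ and $A_x^0$ as above, but

\[
\Lambda = \begin{pmatrix} -\mu & r \\ 0 & \mu \end{pmatrix}, \qquad A_1^0 = \begin{pmatrix} 0 & -r \\ 0 & 0 \end{pmatrix}, \qquad T = \begin{pmatrix} 1 & 1 \\ 0 & \dfrac{2\mu}{r} \end{pmatrix} \tag{64}
\]

or

\[
\Lambda = \begin{pmatrix} -\mu & 0 \\ r & \mu \end{pmatrix}, \qquad A_1^0 = \begin{pmatrix} 0 & 0 \\ -r & 0 \end{pmatrix}, \qquad T = \begin{pmatrix} 1 & 0 \\ -\dfrac{r}{2\mu} & 1 \end{pmatrix}. \tag{65}
\]

For $\sigma = 2\mu$: $A_0^0$ and $A_x^0$ as above, but

\[
\Lambda = \begin{pmatrix} -\mu & r \\ 0 & \mu \end{pmatrix}, \qquad A_1^0 = \begin{pmatrix} 0 & -r \\ 0 & 0 \end{pmatrix}, \qquad T = \begin{pmatrix} 1 & 1 \\ \dfrac{2\mu}{r} & 0 \end{pmatrix} \tag{66}
\]

or

\[
\Lambda = \begin{pmatrix} -\mu & 0 \\ r & \mu \end{pmatrix}, \qquad A_1^0 = \begin{pmatrix} 0 & 0 \\ -r & 0 \end{pmatrix}, \qquad T = \begin{pmatrix} 0 & 1 \\ 1 & -\dfrac{r}{2\mu} \end{pmatrix}. \tag{67}
\]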


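For $\sigma = 0$:

\[
A_0^0 = T\begin{pmatrix} 0 & s \\ 0 & 0 \end{pmatrix}T^{-1}, \qquad A_x^0 = T\begin{pmatrix} 0 & 1-s \\ 0 & 0 \end{pmatrix}T^{-1},
\]
\[
\Lambda = \begin{pmatrix} -\dfrac{\mu}{2} & -\dfrac{\mu^2}{4}\,r \\[2mm] \dfrac{1}{r} & \dfrac{\mu}{2} \end{pmatrix}, \qquad
A_1^0 = \begin{pmatrix} -\dfrac{\mu}{2} & \dfrac{\mu^2}{4}\,r \\[2mm] -\dfrac{1}{r} & \dfrac{\mu}{2} \end{pmatrix},
\]
\[
T = \begin{pmatrix} 1 & 1 \\[1mm] -\dfrac{2}{\mu r} & -2\,\dfrac{\mu+2}{\mu^2}\,\dfrac{1}{r} \end{pmatrix}.
\]

We leave the proof as an exercise for the reader. ✷

We are ready to prove Theorem 1, namely:

Let $a := -\frac{1}{4s}$ if $\sigma \neq 0$, or $a := s$ if $\sigma = 0$. Consider the family of paths

\[
\Im\sigma\arg(x) = \Im\sigma\arg(x_0) + (\Re\sigma - \Sigma)\log\frac{|x|}{|x_0|}, \qquad 0 \leq \Sigma \leq \tilde{\sigma},
\]

contained in $D(\epsilon;\sigma,\theta_1,\theta_2)$, starting at $x_0$. If $\Im\sigma = 0$ we consider any regular path. Along these paths, the solutions of PVI$_\mu$, corresponding to the solutions of Schlesinger equations (54) obtained in Lemma 1, have the following behavior for $x \to 0$:

\[
y(x) = a(x)\,x^{1-\sigma}\bigl(1 + O(|x|^\delta)\bigr),
\]

where $0 < \delta < 1$ is a small number, and

\[
a(x) = a, \qquad \text{if } 0 < \Sigma \leq \tilde{\sigma} \text{ or if } \sigma \text{ is real}.
\]

If $\Sigma = 0$, then $x^\sigma = Ce^{i\alpha(x)}$ ($C$ is a constant $= |x_0^\sigma| \equiv |x^\sigma|$ and $\alpha(x)$ is the real phase of $x^\sigma$) and

\[
a(x) = a\left(1 + \frac{1}{2a}\,Ce^{i\alpha(x)} + \frac{1}{16a^2}\,C^2e^{2i\alpha(x)}\right) = O(1). \tag{68}
\]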

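Proof. $y(x)$ can be computed in terms of the $A_i(x)$ from $A(y(x),x)_{12} = 0$:

\[
y(x) = \frac{x(A_0)_{12}}{(1+x)(A_0)_{12} + (A_x)_{12} + x(A_1)_{12}}
\equiv \frac{x(A_0)_{12}}{x(A_0)_{12} - (A_1)_{12} + x(A_1)_{12}}
= -x\,\frac{(A_0)_{12}}{(A_1)_{12}}\;\frac{1}{1 - x\left(1 + \dfrac{(A_0)_{12}}{(A_1)_{12}}\right)}.
\]

As a consequence of Lemmas 1 and 2, it follows that

\[
|x(A_1)_{12}| \leq c|x|\bigl(1 + O(|x|^{1-\sigma_1})\bigr) \qquad \text{and} \qquad |x(A_0)_{12}| \leq c|x|^{1-\tilde{\sigma}}\bigl(1 + O(|x|^{1-\sigma_1})\bigr),
\]

where $c$ is a constant. Then

\[
y(x) = -x\,\frac{(A_0)_{12}}{(A_1)_{12}}\bigl(1 + O(|x|^{1-\tilde{\sigma}})\bigr).
\]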
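The closed formula for $y(x)$ uses only the condition $A(y,x)_{12} = 0$ together with $(A_0)_{12} + (A_x)_{12} + (A_1)_{12} = 0$, which holds because $A_\infty$ is diagonal. The sketch below checks the formula and its two rewritings with exact rationals. The numerical entries are arbitrary assumptions standing in for $(A_0)_{12}$ and $(A_1)_{12}$ at a fixed $x$.

```python
from fractions import Fraction

def y_from_entries(a0, a1, x):
    """y = x a0 / (x a0 - a1 + x a1), the second form of the formula."""
    return x * a0 / (x * a0 - a1 + x * a1)

def a12(z, x, a0, a1):
    """(1,2) entry of A(z, x) = A0/z + Ax/(z - x) + A1/(z - 1), with ax = -a0 - a1."""
    ax = -a0 - a1
    return a0 / z + ax / (z - x) + a1 / (z - 1)

# Arbitrary sample values (illustrative assumptions).
a0, a1, x = Fraction(2, 3), Fraction(-5, 4), Fraction(1, 7)
ax = -a0 - a1

y = y_from_entries(a0, a1, x)
first_form = x * a0 / ((1 + x) * a0 + ax + x * a1)
third_form = -x * (a0 / a1) / (1 - x * (1 + a0 / a1))
residual = a12(y, x, a0, a1)

print(y, first_form, third_form, residual)
```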

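From Lemma 2 we find, for $\sigma \neq 0, \pm 2\mu$,

\[
(A_0)_{12} = -r\,\frac{\sigma^2 - 4\mu^2}{32\mu}\left[\frac{x^{-\sigma}}{s}\bigl(1 + O(|x|^{1-\sigma_1})\bigr) + s\,x^{\sigma}\bigl(1 + O(|x|^{1-\sigma_1})\bigr) - 2\bigl(1 + O(|x|^{1-\sigma_1})\bigr)\right],
\]
\[
(A_1)_{12} = -r\,\frac{\sigma^2 - 4\mu^2}{8\mu}\bigl(1 + O(|x|^{1-\sigma_1})\bigr).
\]

Then (recall that $\tilde{\sigma} < \sigma_1$)

\[
y(x) = -\frac{x}{4}\left[\frac{x^{-\sigma}}{s}\bigl(1 + O(|x|^{1-\sigma_1})\bigr) + s\,x^{\sigma}\bigl(1 + O(|x|^{1-\sigma_1})\bigr) - 2\bigl(1 + O(|x|^{1-\sigma_1})\bigr)\right]\bigl(1 + O(|x|^{1-\sigma_1})\bigr).
\]

Now $x \to 0$ along a path

\[
\Im\sigma\arg(x) = \Im\sigma\arg(x_0) + (\Re\sigma - \Sigma)\log\frac{|x|}{|x_0|}
\]

for $0 \leq \Sigma \leq \tilde{\sigma}$. Along this path we rewrite $x^\sigma$ in terms of its absolute value $|x^\sigma| = C|x|^\Sigma$ ($C = |x_0^\sigma|/|x_0|^\Sigma$) and its real phase $\alpha(x)$:

\[
x^\sigma = C|x|^\Sigma e^{i\alpha(x)}, \qquad
\alpha(x) = \Re\sigma\arg(x) + \Im\sigma\ln|x|\;\Big|_{\Im\sigma\arg(x) = \Im\sigma\arg(x_0) + (\Re\sigma-\Sigma)\log\frac{|x|}{|x_0|}}.
\]

Then

\[
y(x) = -\frac{x^{1-\sigma}}{4}\left[\frac{1}{s} - 2Ce^{i\alpha(x)}|x|^\Sigma\bigl(1 + O(|x|^{1-\sigma_1})\bigr) + s\,C^2e^{2i\alpha(x)}|x|^{2\Sigma}\bigl(1 + O(|x|^{1-\sigma_1})\bigr)\right]\bigl(1 + O(|x|^{1-\sigma_1})\bigr).
\]

For $\Sigma \neq 0$ the above expression becomes

\[
y(x) = a\,x^{1-\sigma}\bigl(1 + O(|x|^{1-\sigma_1}) + O(|x|^\Sigma)\bigr), \qquad \text{where } a := -\frac{1}{4s}.
\]

We collect the two $O(\ldots)$ contributions in $O(|x|^\delta)$, where $\delta = \min\{1-\sigma_1, \Sigma\}$ is a small number between 0 and 1.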


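We use the occasion here to remark that in the case of real $0 < \sigma < 1$, if we consider $x \to 0$ along a radial path (i.e. $\arg(x) = \arg(x_0)$), then $\Sigma = \tilde{\sigma} = \sigma$ and thus

\[
y(x) = \begin{cases} -\dfrac{1}{4s}\,x^{1-\sigma}\bigl(1 + O(|x|^{\sigma})\bigr) & \text{for } 0 < \sigma < \frac{1}{2}, \\[3mm] -\dfrac{1}{4s}\,x^{1-\sigma}\bigl(1 + O(|x|^{1-\sigma_1})\bigr) & \text{for } \frac{1}{2} < \sigma < 1. \end{cases}
\]

Along the path with $\Sigma = 0$ we have

\[
y(x) = -\frac{x^{1-\sigma}}{4}\left(\frac{1}{s} - 2Ce^{i\alpha(x)} + s\,C^2e^{2i\alpha(x)}\right)\bigl(1 + O(|x|^{1-\sigma_1})\bigr).
\]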

This is (68), for $a = -(1/4s)$. We let the reader verify the theorem also in the cases $\sigma = \pm 2\mu$ (use the matrices (64) and (66); we must disregard the matrices (65), (67); the reason will be clarified in the comment following Lemma 5 and at the end of the proof of Theorem 2) and in the case $\sigma = 0$. For $\sigma = 0$ we obtain

\[
y(x) = a\,x\bigl(1 + O(|x|^{1-\sigma_1})\bigr), \qquad \text{where } a := s. \qquad ✷
\]

In the proof of Lemma 1, we imposed (63). Hence, the reader may observe that $\epsilon$ depends on $\sigma$, $\theta_1$ and on $\|A_0^0\|$, $\|A_x^0\|$, $\|A_1^0\|$; thus it also depends on $a$.
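The identification of the $\Sigma = 0$ behavior with (68) is the algebraic identity $-\frac{1}{4}\bigl(\frac{1}{s} - 2w + s w^2\bigr) = a\bigl(1 + \frac{w}{2a} + \frac{w^2}{16a^2}\bigr)$ for $a = -1/(4s)$ and $w = Ce^{i\alpha}$. A short numerical sketch of this identity follows; the values of $s$, $C$ and $\alpha$ are arbitrary assumptions.

```python
import cmath

# Arbitrary sample values (illustrative assumptions).
s = 0.7 - 0.4j
C = 0.9
alpha = 1.3

a = -1 / (4 * s)
w = C * cmath.exp(1j * alpha)

# Coefficient of x^{1-sigma} as it appears in the proof (path with Sigma = 0).
bracket_form = -(1 / s - 2 * w + s * w ** 2) / 4

# Coefficient a(x) as stated in formula (68).
formula_68 = a * (1 + w / (2 * a) + w ** 2 / (16 * a ** 2))

gap = abs(bracket_form - formula_68)
print("discrepancy:", gap)
```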


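We are interested in Lemma 1 when

\[
f_{\mu\nu} = g_{\mu\nu} = \frac{b_\nu}{a_\mu - xb_\nu}, \qquad h_{\mu\nu} = 0, \qquad a_\mu, b_\nu \in \mathbf{C}, \quad a_\mu \neq 0, \quad \forall\mu = 1,\ldots,n_1.
\]

Equations (55) are the isomonodromy deformation equations for the Fuchsian system

\[
\frac{dY}{dz} = \left[\sum_{\mu=1}^{n_1}\frac{A_\mu(x)}{z - a_\mu} + \sum_{\nu=1}^{n_2}\frac{B_\nu(x)}{z - xb_\nu}\right]Y.
\]

As a corollary of Lemma 1, for a fundamental matrix solution $Y(z,x)$ of the Fuchsian system, the limits

\[
\hat{Y}(z) := \lim_{x\to 0}Y(z,x), \qquad \tilde{Y}(z) := \lim_{x\to 0}x^{-\Lambda}Y(xz,x)
\]

exist when $x \to 0$ in $D(\epsilon;\sigma)$. They satisfy

\[
\frac{d\hat{Y}}{dz} = \left[\sum_{\mu=1}^{n_1}\frac{A_\mu^0}{z - a_\mu} + \frac{\Lambda}{z}\right]\hat{Y}, \qquad
\frac{d\tilde{Y}}{dz} = \sum_{\nu=1}^{n_2}\frac{B_\nu^0}{z - b_\nu}\,\tilde{Y}.
\]

In our case, the last three systems reduce to

\[
\frac{dY}{dz} = \left[\frac{A_0(x)}{z} + \frac{A_x(x)}{z-x} + \frac{A_1(x)}{z-1}\right]Y, \tag{69}
\]
\[
\frac{d\hat{Y}}{dz} = \left[\frac{A_1^0}{z-1} + \frac{\Lambda}{z}\right]\hat{Y}, \tag{70}
\]
\[
\frac{d\tilde{Y}}{dz} = \left[\frac{A_0^0}{z} + \frac{A_x^0}{z-1}\right]\tilde{Y}. \tag{71}
\]

Before taking the limit $x \to 0$, let us choose

\[
Y(z,x) = \left(I + O\!\left(\frac{1}{z}\right)\right)z^{-A_\infty}z^{R}, \qquad z \to \infty, \tag{72}
\]

and define, as above,

\[
\hat{Y}(z) := \lim_{x\to 0}Y(z,x), \qquad \tilde{Y}(z) := \lim_{x\to 0}x^{-\Lambda}Y(xz,x).
\]

For the system (70) we choose a fundamental matrix solution normalized as follows:

\[
\hat{Y}_N(z) = \begin{cases}
\left(I + O\!\left(\dfrac{1}{z}\right)\right)z^{-A_\infty}z^{R}, & z \to \infty, \\[2mm]
(I + O(z))\,z^{\Lambda}\hat{C}_0, & z \to 0, \\[1mm]
\hat{G}_1(I + O(z-1))\,(z-1)^{J}\hat{C}_1, & z \to 1,
\end{cases} \tag{73}
\]

where

\[
\hat{G}_1^{-1}A_1^0\hat{G}_1 = J, \qquad J = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},
\]

and $\hat{C}_0$, $\hat{C}_1$ are invertible connection matrices. Note that $R$ is the same as in (72), since it is independent of $x$. For (71) we choose a fundamental matrix solution normalized as follows:

\[
\tilde{Y}_N(z) = \begin{cases}
\left(I + O\!\left(\dfrac{1}{z}\right)\right)z^{\Lambda}, & z \to \infty, \\[2mm]
\tilde{G}_0(I + O(z))\,z^{J}\tilde{C}_0, & z \to 0, \\[1mm]
\tilde{G}_1(I + O(z-1))\,(z-1)^{J}\tilde{C}_1, & z \to 1.
\end{cases} \tag{74}
\]

Here $\tilde{G}_0^{-1}A_0^0\tilde{G}_0 = J$, $\tilde{G}_1^{-1}A_x^0\tilde{G}_1 = J$. We prove that

\[
\hat{Y}(z) = \hat{Y}_N(z), \qquad \tilde{Y}(z) = \tilde{Y}_N(z)\,\hat{C}_0. \tag{75}
\]

The proof we give here uses the technique of the proof of Proposition 2.1 in [20], generalized to the domain $D(\sigma)$. The (isomonodromic) dependence of $Y(z,x)$ on $x$ is given by

\[
\frac{dY(z,x)}{dx} = -\frac{A_x(x)}{z-x}\,Y(z,x) := F(z,x)\,Y(z,x).
\]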


Then

\[
Y(z,x) = \hat{Y}(z) + \int_{L(x)}dx_1\,F(z,x_1)\,Y(z,x_1).
\]

The integration is on a path $L(x)$ defined by

\[
\arg(x) = a\log|x| + b, \qquad a = \frac{\Re\sigma - \sigma^*}{\Im\sigma} \qquad (0 \leq \sigma^* \leq \tilde{\sigma}),
\]

or $\arg(x) = 0$ if $\Im\sigma = 0$. The path is contained in $D(\sigma)$ and joins 0 and $x$, as in the proof of Theorem 1 (Figure 10). By successive approximations, we have

\[
Y^{(1)}(z,x) = \hat{Y}(z) + \int_{L(x)}dx_1\,F(z,x_1)\,\hat{Y}(z),
\]
\[
Y^{(2)}(z,x) = \hat{Y}(z) + \int_{L(x)}dx_1\,F(z,x_1)\,Y^{(1)}(z,x_1),
\]
\[
\vdots
\]
\[
Y^{(n)}(z,x) = \hat{Y}(z) + \int_{L(x)}dx_1\,F(z,x_1)\,Y^{(n-1)}(z,x_1)
= \left[I + \int_{L(x)}dx_1\int_{L(x_1)}dx_2\ldots\int_{L(x_{n-1})}dx_n\,F(z,x_1)F(z,x_2)\ldots F(z,x_n)\right]\hat{Y}(z).
\]

Performing integration as in the proof of Theorem 1, we evaluate $\|Y^{(n)}(z,x) - Y^{(n-1)}(z,x)\|$. Recall that Y(z) has singularities at $z = 0$, $z = x$. Thus, if $|z| > |x|$, we obtain

\[
\|Y^{(n)}(z,x) - Y^{(n-1)}(z,x)\| \leq \frac{MC^n}{\prod_{m=1}^{n}(m - \sigma^*)}\,|x|^{n-\sigma^*},
\]

where $M$ and $C$ are constants. Then

\[
Y^{(n)} = \hat{Y} + (Y^{(1)} - \hat{Y}) + \cdots + (Y^{(n)} - Y^{(n-1)})
\]

converges for $n \to \infty$ uniformly in $z$ in every compact set contained in $\{z \mid |z| > |x|\}$ and uniformly in $x \in D(\sigma)$. We can exchange limit and integration, thus obtaining $Y(z,x) = \lim_{n\to\infty}Y^{(n)}(z,x)$. Namely

\[
Y(z,x) = U(z,x)\,\hat{Y}(z),
\]
\[
U(z,x) = I + \sum_{n=1}^{\infty}\int_{L(x)}dx_1\int_{L(x_1)}dx_2\ldots\int_{L(x_{n-1})}dx_n\,F(z,x_1)F(z,x_2)\ldots F(z,x_n),
\]

the convergence of the series being uniform in $x \in D(\sigma)$ and in $z$ in every compact set contained in $\{z \mid |z| > |x|\}$. Of course,

\[
U(z,x) = I + O\!\left(\frac{1}{z}\right) \quad \text{for } x \to 0 \text{ and } Y(z,x) \to \hat{Y}(z).
\]

But now observe that

\[
\hat{Y}(z) = U(z,x)^{-1}Y(z,x) = \left(I + O\!\left(\frac{1}{z}\right)\right)\left(I + O\!\left(\frac{1}{z}\right)\right)z^{-A_\infty}z^{R}, \qquad z \to \infty.
\]

Then $\hat{Y}(z) \equiv \hat{Y}_N(z)$. Finally, for $z \to 1$,

\[
Y(z,x) = U(z,x)\,\hat{Y}_N(z) = U(z,x)\,\hat{G}_1(I + O(z-1))\,(z-1)^{J}\hat{C}_1 = G_1(x)(I + O(z-1))\,(z-1)^{J}\hat{C}_1.
\]

This implies $C_1 \equiv \hat{C}_1$ and then

\[
M_1 = \hat{C}_1^{-1}e^{2\pi iJ}\hat{C}_1. \tag{76}
\]

Here we have chosen a monodromy representation for (69) by fixing a base-point and a basis in the fundamental group of $\mathbf{P}^1$ as in Figure 15. $M_0$, $M_1$, $M_x$, $M_\infty$ are the monodromy matrices for the solution (72) corresponding to the loops $\gamma_i$, $i = 0, x, 1, \infty$; they satisfy $M_\infty M_1M_xM_0 = I$. The result (76) may also be proved simply by observing that $M_1$ becomes $\hat{M}_1$ as $x \to 0$ in $D(\sigma)$, because the system (70) is obtained from (69) when $z = x$ and $z = 0$ merge and the singular point $z = 1$ does not move. $x$ may converge to 0 along spiral paths (Figure 15). We recall that the braid $\beta_{i,i+1}$ changes the monodromy matrices of $dY/dz = \sum_{i=1}^{n}A_i(u)/(z - u_i)\,Y$ according to $M_i \mapsto M_{i+1}$, $M_{i+1} \mapsto M_{i+1}M_iM_{i+1}^{-1}$, $M_k \mapsto M_k$ for any $k \neq i, i+1$ (see [13]). Therefore, if $\arg(x)$ increases by $2\pi$ as $x \to 0$ in (69), we have

\[
M_0 \mapsto M_x, \qquad M_x \mapsto M_xM_0M_x^{-1}, \qquad M_1 \mapsto M_1.
\]
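The braid action on the pair $(M_0, M_x)$ leaves the product $M_xM_0$ unchanged, which is why it is compatible with the relation $M_\infty M_1M_xM_0 = I$ while $M_1$ and $M_\infty$ stay fixed. The sketch below illustrates this with exact $2\times 2$ rational matrices. The two sample matrices are arbitrary assumptions, not monodromy matrices computed from the Fuchsian system.

```python
from fractions import Fraction as F

def mul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def inv(a):
    """Inverse of an invertible 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return ((a[1][1] / det, -a[0][1] / det), (-a[1][0] / det, a[0][0] / det))

def braid(m_i, m_next):
    """M_i -> M_{i+1},  M_{i+1} -> M_{i+1} M_i M_{i+1}^{-1}."""
    return m_next, mul(mul(m_next, m_i), inv(m_next))

def trace(a):
    return a[0][0] + a[1][1]

# Arbitrary sample matrices standing in for M_0 and M_x (illustrative assumption).
m0 = ((F(1), F(2)), (F(0), F(1)))
mx = ((F(1), F(0)), (F(-3), F(1)))

new_m0, new_mx = braid(m0, mx)
product_before = mul(mx, m0)
product_after = mul(new_mx, new_m0)

print(product_before == product_after, trace(new_mx) == trace(m0))
```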

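It follows that $M_1$ does not change and then

\[
M_1 \equiv \hat{M}_1 = \hat{C}_1^{-1}e^{2\pi iJ}\hat{C}_1, \tag{77}
\]

where $\hat{M}_1$ is the monodromy matrix of (73) for the loop $\gamma_1$ in the basis of Figure 15.

Now we turn to $\tilde{Y}(z)$. Let $\tilde{Y}(z,x) := x^{-\Lambda}Y(xz,x)$, and by definition $\tilde{Y}(z,x) \to \tilde{Y}(z)$ as $x \to 0$. In this case,

\[
\frac{d\tilde{Y}(z,x)}{dx} = \left[\frac{x^{-\Lambda}(A_0 + A_x)x^{\Lambda} - \Lambda}{x} + \frac{x^{-\Lambda}A_1x^{\Lambda}}{x - \frac{1}{z}}\right]\tilde{Y}(z,x) := \tilde{F}(z,x)\,\tilde{Y}(z,x).
\]

Proceeding by successive approximations as above, we get

\[
\tilde{Y}(z,x) = V(z,x)\,\tilde{Y}(z),
\]
\[
V(z,x) = I + \sum_{n=1}^{\infty}\int_{L(x)}dx_1\ldots\int_{L(x_{n-1})}dx_n\,\tilde{F}(z,x_1)\ldots\tilde{F}(z,x_n) \to I \qquad \text{for } x \to 0,
\]

uniformly in $x \in D(\sigma)$ and in $z$ in every compact subset of $\{z \mid |z| < 1/|x|\}$.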


Figure 15. (1): Branch cuts and loops for the Fuchsian system associated PVIµ. (2) Branchcuts and loops when x → 0. (3) Branch cuts and loops for the research system before andx → 0.

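Let us investigate the behavior of $\tilde{Y}(z)$ as $z \to \infty$ and compare it to the behavior of $\tilde{Y}_N(z)$. First we note that

\[
x^{-\Lambda}\hat{Y}_N(xz) = x^{-\Lambda}(I + O(xz))(xz)^{\Lambda}\hat{C}_0 \to z^{\Lambda}\hat{C}_0 \qquad \text{for } x \to 0.
\]

Then

\[
\bigl[x^{-\Lambda}Y(xz,x)\bigr]\bigl[x^{-\Lambda}\hat{Y}_N(xz)\bigr]^{-1} = x^{-\Lambda}U(xz,x)x^{\Lambda} \to \tilde{Y}(z)\,\hat{C}_0^{-1}z^{-\Lambda}.
\]

On the other hand, from the properties of $U(z,x)$ we know that $x^{-\Lambda}U(xz,x)x^{\Lambda}$ is holomorphic in every compact subset of $\{z \mid |z| > 1\}$ and $x^{-\Lambda}U(xz,x)x^{\Lambda} = I + O(\frac{1}{z})$ as $z \to \infty$. Thus $\tilde{U}(z) := \lim_{x\to 0}x^{-\Lambda}U(xz,x)x^{\Lambda}$ exists uniformly in every compact subset of $\{z \mid |z| > 1\}$ and $\tilde{U}(z) = I + O(1/z)$, $z \to \infty$. Then

\[
\tilde{Y}(z) = \tilde{U}(z)\,z^{\Lambda}\hat{C}_0 \equiv \tilde{Y}_N(z)\,\hat{C}_0,
\]

as we wanted to prove. Finally, the above result implies

\[
Y(z,x) = x^{\Lambda}V\!\left(\frac{z}{x},x\right)\tilde{Y}_N\!\left(\frac{z}{x}\right)\hat{C}_0
= \begin{cases}
x^{\Lambda}V\!\left(\frac{z}{x},x\right)\tilde{G}_0(I + O(z/x))\,x^{-J}z^{J}\tilde{C}_0\hat{C}_0 = G_0(x)(I + O(z))\,z^{J}\tilde{C}_0\hat{C}_0, & z \to 0, \\[2mm]
x^{\Lambda}V\!\left(\frac{z}{x},x\right)\tilde{G}_1\!\left(I + O\!\left(\frac{z}{x} - 1\right)\right)\left(\frac{z}{x} - 1\right)^{J}\tilde{C}_1\hat{C}_0 = G_x(x)(I + O(z-x))\,(z-x)^{J}\tilde{C}_1\hat{C}_0, & z \to x.
\end{cases}
\]

Let $\tilde{M}_0$, $\tilde{M}_1$ denote the monodromy matrices of $\tilde{Y}_N(z)$ in the basis of Figure 13; then

\[
M_0 = \hat{C}_0^{-1}\tilde{C}_0^{-1}e^{2\pi iJ}\tilde{C}_0\hat{C}_0 = \hat{C}_0^{-1}\tilde{M}_0\hat{C}_0, \tag{78}
\]
\[
M_x = \hat{C}_0^{-1}\tilde{C}_1^{-1}e^{2\pi iJ}\tilde{C}_1\hat{C}_0 = \hat{C}_0^{-1}\tilde{M}_1\hat{C}_0. \tag{79}
\]

The same result may be obtained by observing that from

\[
\frac{d(x^{-\Lambda}Y(xz,x))}{dz} = \left[\frac{x^{-\Lambda}A_0x^{\Lambda}}{z} + \frac{x^{-\Lambda}A_xx^{\Lambda}}{z-1} + \frac{x^{-\Lambda}A_1x^{\Lambda}}{z - \frac{1}{x}}\right]x^{-\Lambda}Y(xz,x) \tag{80}
\]

we obtain the system (71) as $z = 1/x$ and $z = \infty$ merge (Figure 15). The singularities $z = 0$, $z = 1$, $z = 1/x$ of (80) correspond to $z = 0$, $z = x$, $z = 1$ of (69). The poles $z = 0$ and $z = 1$ of (80) do not move as $x \to 0$ and $1/x$ converges to $\infty$, in general along spirals. At any turn of the spiral the system (80) has new monodromy matrices according to the action of the braid group

\[
M_1 \mapsto M_\infty, \qquad M_\infty \mapsto M_\infty M_1M_\infty^{-1},
\]

but $M_0 \mapsto M_0$, $M_x \mapsto M_x$. Hence, the limit $\tilde{Y}(z)$ still has monodromy $M_0$ and $M_x$ at $z = 0, x$. Since $\tilde{Y} = \tilde{Y}_N\hat{C}_0$ we conclude that $M_0$ and $M_x$ are (78) and (79).

In order to find the parameterization $y(x;\sigma,a)$ in terms of $(x_0,x_1,x_\infty)$, we have to compute the monodromy matrices $M_0$, $M_1$, $M_\infty$ in terms of $\sigma$ and $a$ and then take the traces of their products. In order to do this, we use the formulae (77), (78), (79). In fact, the matrices $\tilde{M}_i$ ($i = 0, 1$) and $\hat{M}_1$ can be computed explicitly because a $2\times 2$ Fuchsian system with three singular points can be reduced to the hyper-geometric equation, whose monodromy is completely known.

Before going on with the proof, we recall that in the proof of Theorem 1 we defined $a = -(1/4s)$ (or $a = s$ for $\sigma = 0$).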

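LEMMA 3. The Gauss hyper-geometric equation

\[
z(1-z)\frac{d^2y}{dz^2} + [\gamma_0 - z(\alpha_0 + \beta_0 + 1)]\frac{dy}{dz} - \alpha_0\beta_0\,y = 0 \tag{81}
\]

is equivalent to the system

\[
\frac{d\Psi}{dz} = \left[\frac{1}{z}\begin{pmatrix} 0 & 0 \\ -\alpha_0\beta_0 & -\gamma_0 \end{pmatrix} + \frac{1}{z-1}\begin{pmatrix} 0 & 1 \\ 0 & \gamma_0 - \alpha_0 - \beta_0 \end{pmatrix}\right]\Psi, \tag{82}
\]

where

\[
\Psi = \begin{pmatrix} y \\ (z-1)\dfrac{dy}{dz} \end{pmatrix}.
\]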
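The equivalence in Lemma 3 can be illustrated numerically: the Gauss series $F(\alpha_0,\beta_0;\gamma_0;z)$ solves (81), and the vector with components $y$ and $(z-1)\,dy/dz$ then satisfies the two rows of the first-order system (82). The sketch below sums the series directly for $|z| < 1$; the parameter values, the evaluation point, and the helper name `gauss_series` are illustrative assumptions.

```python
def gauss_series(a, b, c, z, n_terms=200):
    """Return (y, y', y'') for the Gauss series F(a, b; c; z), valid for |z| < 1."""
    coeff = 1.0
    y = dy = d2y = 0.0
    for n in range(n_terms):
        y += coeff * z ** n
        if n >= 1:
            dy += n * coeff * z ** (n - 1)
        if n >= 2:
            d2y += n * (n - 1) * coeff * z ** (n - 2)
        # Ratio of consecutive series coefficients.
        coeff *= (a + n) * (b + n) / ((c + n) * (n + 1))
    return y, dy, d2y

# Sample parameters and evaluation point (illustrative assumptions).
a, b, c, z = 0.7, 1.1, 1.6, 0.3
y, dy, d2y = gauss_series(a, b, c, z)

# Residual of the second-order equation (81).
ode_residual = z * (1 - z) * d2y + (c - z * (a + b + 1)) * dy - a * b * y

# Second row of the first-order system (82), with psi2 = (z - 1) y'.
psi2 = (z - 1) * dy
dpsi2 = dy + (z - 1) * d2y
system_rhs = (-a * b * y - c * psi2) / z + (c - a - b) * psi2 / (z - 1)
system_residual = dpsi2 - system_rhs

print(ode_residual, system_residual)
```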

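LEMMA 4. Let $B_0$ and $B_1$ be matrices of eigenvalues $0$, $1-\gamma$, and $0$, $\gamma-\alpha-\beta-1$, respectively, such that

\[
B_0 + B_1 = \operatorname{diag}(-\alpha,-\beta), \qquad \alpha \neq \beta.
\]

Then

\[
B_0 = \begin{pmatrix} \dfrac{\alpha(1+\beta-\gamma)}{\alpha-\beta} & \dfrac{\alpha(\gamma-\alpha-1)}{\alpha-\beta}\,r \\[3mm] \dfrac{\beta(\beta+1-\gamma)}{\alpha-\beta}\,\dfrac{1}{r} & \dfrac{\beta(\gamma-\alpha-1)}{\alpha-\beta} \end{pmatrix}, \qquad
B_1 = \begin{pmatrix} \dfrac{\alpha(\gamma-\alpha-1)}{\alpha-\beta} & -(B_0)_{12} \\[3mm] -(B_0)_{21} & \dfrac{\beta(\beta+1-\gamma)}{\alpha-\beta} \end{pmatrix}
\]

for any $r \neq 0$.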
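The conditions in Lemma 4 (traces, vanishing determinants, and the diagonal sum) can be checked with exact arithmetic on a sample choice of parameters. This is only a spot check for one arbitrarily chosen set of values, not a proof; the function name `lemma4_matrices` is a label introduced for this sketch.

```python
from fractions import Fraction as F

def lemma4_matrices(alpha, beta, gamma, r):
    """Matrices B0, B1 as written in Lemma 4."""
    d = alpha - beta
    b0 = ((alpha * (1 + beta - gamma) / d, alpha * (gamma - alpha - 1) / d * r),
          (beta * (beta + 1 - gamma) / d / r, beta * (gamma - alpha - 1) / d))
    b1 = ((alpha * (gamma - alpha - 1) / d, -b0[0][1]),
          (-b0[1][0], beta * (beta + 1 - gamma) / d))
    return b0, b1

def trace(m):
    return m[0][0] + m[1][1]

def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Arbitrary sample parameters with alpha != beta and r != 0 (illustrative assumption).
alpha, beta, gamma, r = F(1, 3), F(-2, 5), F(7, 4), F(3)
b0, b1 = lemma4_matrices(alpha, beta, gamma, r)

total = tuple(tuple(b0[i][j] + b1[i][j] for j in range(2)) for i in range(2))

print(trace(b0) == 1 - gamma, det(b0) == 0)
print(trace(b1) == gamma - alpha - beta - 1, det(b1) == 0)
print(total == ((-alpha, 0), (0, -beta)))
```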

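We leave the proof as an exercise. The following lemma connects Lemmas 3 and 4:

LEMMA 5. The system (82) with

\[
\alpha_0 = \alpha, \qquad \beta_0 = \beta + 1, \qquad \gamma_0 = \gamma, \qquad \alpha \neq \beta
\]

is gauge-equivalent to the system

\[
\frac{dX}{dz} = \left[\frac{B_0}{z} + \frac{B_1}{z-1}\right]X, \tag{83}
\]

where $B_0$, $B_1$ are given in Lemma 4. This means that there exists a matrix

\[
G(z) := \begin{pmatrix} 1 & 0 \\[1mm] \dfrac{(\alpha-\beta)z + \beta + 1 - \gamma}{(1+\alpha-\gamma)\,r} & z\,\dfrac{\alpha-\beta}{\alpha(1+\alpha-\gamma)}\,\dfrac{1}{r} \end{pmatrix}
\]

such that $X(z) = G(z)\Psi(z)$. It follows that (83) and the corresponding hyper-geometric equation (81) have the same Fuchsian singularities $0, 1, \infty$ and the same monodromy group.

Proof. By direct computation. ✷

Note that the form of $G(z)$ ensures that if $y_1$, $y_2$ are independent solutions of the hyper-geometric equation, then a fundamental matrix of (83) may be chosen to be

\[
X(z) = \begin{pmatrix} y_1(z) & y_2(z) \\ * & * \end{pmatrix}.
\]

We also observe that if we re-define

\[
r_1 := r\,\frac{\alpha(\gamma-\alpha-1)}{\alpha-\beta},
\]

the matrices $G(z)$, $B_0$, $B_1$ are not singular except for $\alpha = \beta$. Actually, we have

\[
B_0 = \begin{pmatrix} \dfrac{\alpha(\beta+1-\gamma)}{\alpha-\beta} & r_1 \\[3mm] \dfrac{\alpha\beta(\beta+1-\gamma)(\gamma-1-\alpha)}{(\alpha-\beta)^2}\,\dfrac{1}{r_1} & \dfrac{\beta(\gamma-\alpha-1)}{\alpha-\beta} \end{pmatrix},
\]
\[
B_1 = \begin{pmatrix} \dfrac{\alpha(\gamma-\alpha-1)}{\alpha-\beta} & -r_1 \\[3mm] -(B_0)_{21} & \dfrac{\beta(\beta+1-\gamma)}{\alpha-\beta} \end{pmatrix},
\]
\[
G(z) = \begin{pmatrix} 1 & 0 \\[1mm] \dfrac{\alpha((\alpha-\beta)z + \beta + 1 - \gamma)}{\beta-\alpha}\,\dfrac{1}{r_1} & -\dfrac{z}{r_1} \end{pmatrix}.
\]

The form of $B_0$, $B_1$ of Lemma 4 will correspond to the matrices defined in Lemma 2 in general, while the form of $B_0$, $B_1$ above will correspond to (64) and (66) of Lemma 2 (with $r_1 \mapsto r$). For this reason, we must disregard the matrices (65), (67) when we prove Theorem 1.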

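Now we compute the monodromy matrices for the systems (70), (71) by reduction to a hyper-geometric equation. We first study the case $\sigma \notin \mathbf{Z}$. Let us start with (70). With the gauge $Y^{(1)}(z) := z^{-\sigma/2}\,Y(z)$, we transform (70) into

\[
\frac{dY^{(1)}}{dz} = \left[ \frac{A^0_1}{z-1} + \frac{A-\frac{\sigma}{2}I}{z} \right] Y^{(1)}. \tag{84}
\]

We identify the matrices $B_0$, $B_1$ with $A-\frac{\sigma}{2}I$ and $A^0_1$, with eigenvalues $0,\ -\sigma$ and $0,\ 0$, respectively. Moreover

\[
A^0_1 + A - \frac{\sigma}{2}I = \mathrm{diag}\left(-\mu-\frac{\sigma}{2},\ \mu-\frac{\sigma}{2}\right).
\]

Thus

\[
\alpha = \mu+\frac{\sigma}{2}, \qquad \beta = -\mu+\frac{\sigma}{2}, \qquad \gamma = \sigma+1; \qquad \alpha-\beta = 2\mu \neq 0 \ \text{ by hypothesis.}
\]

The parameters of the corresponding hyper-geometric equation are

\[
\alpha_0 = \mu+\frac{\sigma}{2}, \qquad \beta_0 = 1-\mu+\frac{\sigma}{2}, \qquad \gamma_0 = \sigma+1.
\]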

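From them we deduce the nature of two linearly independent solutions at $z = 0$. Since $\gamma_0 \notin \mathbf{Z}$ ($\sigma \notin \mathbf{Z}$) the solutions are expressed in terms of hyper-geometric functions. On the other hand, the effective parameters at $z = 1$ and $z = \infty$ are, respectively:

\[
\begin{aligned}
\alpha_1 &:= \alpha_0 = \mu+\tfrac{\sigma}{2}, & \alpha_\infty &:= \alpha_0 = \mu+\tfrac{\sigma}{2},\\
\beta_1 &:= \beta_0 = 1-\mu+\tfrac{\sigma}{2}, & \beta_\infty &:= \alpha_0-\gamma_0+1 = \mu-\tfrac{\sigma}{2},\\
\gamma_1 &:= \alpha_0+\beta_0-\gamma_0+1 = 1, & \gamma_\infty &:= \alpha_0-\beta_0+1 = 2\mu.
\end{aligned}
\]

Since $\gamma_1 = 1$, at least one solution has a logarithmic singularity at $z = 1$. Also note that $\gamma_\infty = 2\mu$, therefore logarithmic singularities appear at $z = \infty$ if $2\mu \in \mathbf{Z}\setminus\{0\}$.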
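The arithmetic behind the effective parameters is easy to reproduce. The short Python sketch below (the function name and the sample values are our own assumptions) recomputes the parameters at $z=1$ and $z=\infty$ from $\mu$ and $\sigma$ and confirms that $\gamma_1 = 1$ and $\gamma_\infty = 2\mu$ regardless of $\sigma$.

```python
def effective_parameters(mu, sigma):
    """Return ((alpha1, beta1, gamma1), (alpha_inf, beta_inf, gamma_inf)) for system (84)."""
    alpha0 = mu + sigma / 2
    beta0 = 1 - mu + sigma / 2
    gamma0 = sigma + 1
    at_one = (alpha0, beta0, alpha0 + beta0 - gamma0 + 1)
    at_infinity = (alpha0, alpha0 - gamma0 + 1, alpha0 - beta0 + 1)
    return at_one, at_infinity


# Arbitrary complex sample values (assumptions for illustration only).
mu, sigma = 0.35 + 0.1j, 0.6 - 0.2j
at_one, at_infinity = effective_parameters(mu, sigma)
print(at_one[2], at_infinity[2])
```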

For the derivations that follow, we use the notations of the fundamental paper by Norlund [29]. To derive the connection formulae we use the paper of Norlund when logarithms are involved. Otherwise, in the generic case, any textbook of special functions (like [25]) may be used.

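First case: $\alpha_0, \beta_0 \notin \mathbf{Z}$. This means $\sigma \neq \pm 2\mu + 2m$, $m \in \mathbf{Z}$.

We can choose the following independent solutions of the hyper-geometric equation:

At $z = 0$,

\[
y^{(0)}_1(z) = F(\alpha_0,\beta_0,\gamma_0;z), \qquad
y^{(0)}_2(z) = z^{1-\gamma_0}F(\alpha_0-\gamma_0+1,\ \beta_0-\gamma_0+1,\ 2-\gamma_0;\ z), \tag{85}
\]

where $F(\alpha,\beta,\gamma;z)$ is the well-known hyper-geometric function (see [29]).

At $z = 1$,

\[
y^{(1)}_1(z) = F(\alpha_1,\beta_1,\gamma_1;1-z), \qquad y^{(1)}_2(z) = g(\alpha_1,\beta_1,\gamma_1;1-z).
\]

Here $g(\alpha,\beta,\gamma;z)$ is a logarithmic solution introduced in [29], and $\gamma \equiv \gamma_1 = 1$.

At $z = \infty$, we consider first the case $2\mu \notin \mathbf{Z}$, while the resonant case will be considered later. Two independent solutions are

\[
y^{(\infty)}_1 = z^{-\alpha_0}F\left(\alpha_\infty,\beta_\infty,\gamma_\infty;\frac{1}{z}\right), \qquad
y^{(\infty)}_2 = z^{-\beta_0}F\left(\beta_0,\ \beta_0-\gamma_0+1,\ \beta_0-\alpha_0+1;\ \frac{1}{z}\right).
\]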

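Then, from the connection formulas between $F(\ldots;z)$ and $g(\ldots;z)$ of [25] and [29] we derive

\[
[y^{(\infty)}_1, y^{(\infty)}_2] = [y^{(0)}_1, y^{(0)}_2]\,C_{0\infty},
\]

\[
C_{0\infty} = \begin{pmatrix}
e^{-i\pi\alpha_0}\,\dfrac{\Gamma(1+\alpha_0-\beta_0)\Gamma(1-\gamma_0)}{\Gamma(1-\beta_0)\Gamma(1+\alpha_0-\gamma_0)} &
e^{-i\pi\beta_0}\,\dfrac{\Gamma(1+\beta_0-\alpha_0)\Gamma(1-\gamma_0)}{\Gamma(1-\alpha_0)\Gamma(1+\beta_0-\gamma_0)} \\[4mm]
e^{i\pi(\gamma_0-\alpha_0-1)}\,\dfrac{\Gamma(1+\alpha_0-\beta_0)\Gamma(\gamma_0-1)}{\Gamma(\alpha_0)\Gamma(\gamma_0-\beta_0)} &
e^{i\pi(\gamma_0-\beta_0-1)}\,\dfrac{\Gamma(1+\beta_0-\alpha_0)\Gamma(\gamma_0-1)}{\Gamma(\beta_0)\Gamma(\gamma_0-\alpha_0)}
\end{pmatrix},
\]

\[
[y^{(0)}_1, y^{(0)}_2] = [y^{(1)}_1, y^{(1)}_2]\,C_{01},
\]

\[
C_{01} = \begin{pmatrix}
0 & -\dfrac{\pi\sin(\pi(\alpha_0+\beta_0))}{\sin(\pi\alpha_0)\sin(\pi\beta_0)}\,\dfrac{\Gamma(2-\gamma_0)}{\Gamma(1-\alpha_0)\Gamma(1-\beta_0)} \\[4mm]
-\dfrac{\Gamma(\gamma_0)}{\Gamma(\gamma_0-\alpha_0)\Gamma(\gamma_0-\beta_0)} & -\dfrac{\Gamma(2-\gamma_0)}{\Gamma(1-\alpha_0)\Gamma(1-\beta_0)}
\end{pmatrix}.
\]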

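We observe that

\[
\begin{aligned}
Y^{(1)}(z) &= \left(I + \frac{F}{z} + O\left(\frac{1}{z^2}\right)\right) z^{\mathrm{diag}(-\mu-\frac{\sigma}{2},\ \mu-\frac{\sigma}{2})}, && z \to \infty,\\
&= G_0\,(I + O(z))\,z^{\mathrm{diag}(0,-\sigma)}\,G_0^{-1}C_0, && z \to 0,\\
&= G_1\,(I + O(z-1))\,(z-1)^{J}\,C_1, && z \to 1,
\end{aligned}
\]

where $G_0 \equiv T$ of Lemma 2; namely $G_0^{-1}AG_0 = \mathrm{diag}(\sigma/2,-\sigma/2)$. By direct substitution in the differential equation, we compute the coefficient $F$:

\[
F = -\begin{pmatrix} (A^0_1)_{11} & \dfrac{(A^0_1)_{12}}{1-2\mu} \\[3mm] \dfrac{(A^0_1)_{21}}{1+2\mu} & (A^0_1)_{22} \end{pmatrix},
\qquad \text{where} \quad
A^0_1 = \frac{\sigma^2-(2\mu)^2}{8\mu}\begin{pmatrix} 1 & -r \\[1mm] \dfrac{1}{r} & -1 \end{pmatrix}.
\]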

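Thus, from the asymptotic behavior of the hyper-geometric function ($F(\alpha,\beta,\gamma;\frac{1}{z}) \sim 1$, $z \to \infty$) we derive

\[
Y^{(1)}(z) = \begin{pmatrix} y^{(\infty)}_1(z) & r\,\dfrac{\sigma^2-(2\mu)^2}{8\mu(1-2\mu)}\,y^{(\infty)}_2(z) \\ * & * \end{pmatrix}.
\]

From

\[
Y^{(1)}(z) \sim \begin{pmatrix} 1 & z^{-\sigma} \\ * & * \end{pmatrix} G_0^{-1}C_0, \qquad z \to 0, \tag{86}
\]

we derive

\[
Y^{(1)}(z) = \begin{pmatrix} y^{(0)}_1(z) & y^{(0)}_2(z) \\ * & * \end{pmatrix} G_0^{-1}C_0.
\]

Finally, observe that

\[
G_1 = \begin{pmatrix} u & \dfrac{u}{\omega} + v\,r \\[2mm] \dfrac{u}{r} & v \end{pmatrix}
\]

for arbitrary $u, v \in \mathbf{C}$, $u \neq 0$, and $\omega := \dfrac{\sigma^2-(2\mu)^2}{8\mu}$.

We recall that

\[
y^{(1)}_2 = g(\alpha_1,\beta_1,1;1-z) \sim \psi(\alpha_1)+\psi(\beta_1)-2\psi(1)-i\pi+\log(z-1), \qquad |\arg(1-z)| < \pi,
\]

as $z \to 1$. We can choose $u = 1$ and a suitable $v$, in such a way that the asymptotic behavior of $Y^{(1)}$ for $z \to 1$ is precisely realized by

\[
Y^{(1)}(z) = \begin{pmatrix} y^{(1)}_1(z) & y^{(1)}_2(z) \\ * & * \end{pmatrix} C_1.
\]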

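Therefore we conclude that the connection matrices are:

\[
C_0 = G_0 \begin{pmatrix} (C_{0\infty})_{11} & r\,\dfrac{\sigma^2-(2\mu)^2}{8\mu(1-2\mu)}\,(C_{0\infty})_{12} \\[3mm] (C_{0\infty})_{21} & r\,\dfrac{\sigma^2-(2\mu)^2}{8\mu(1-2\mu)}\,(C_{0\infty})_{22} \end{pmatrix},
\]

\[
C_1 = C_{01}\,(G_0^{-1}C_0) = C_{01} \begin{pmatrix} (C_{0\infty})_{11} & r\,\dfrac{\sigma^2-(2\mu)^2}{8\mu(1-2\mu)}\,(C_{0\infty})_{12} \\[3mm] (C_{0\infty})_{21} & r\,\dfrac{\sigma^2-(2\mu)^2}{8\mu(1-2\mu)}\,(C_{0\infty})_{22} \end{pmatrix}.
\]

It is now time to consider the resonant case $2\mu \in \mathbf{Z}\setminus\{0\}$. The behavior of $Y^{(1)}$ at $z = \infty$ is

\[
Y^{(1)}(z) = \left(I + \frac{F}{z} + O\left(\frac{1}{z^2}\right)\right) z^{\mathrm{diag}(-\mu-\frac{\sigma}{2},\ \mu-\frac{\sigma}{2})}\,z^{R},
\]

\[
R = \begin{pmatrix} 0 & R_{12} \\ 0 & 0 \end{pmatrix}, \quad \text{for } \mu = \tfrac{1}{2},\ 1,\ \tfrac{3}{2},\ 2,\ \tfrac{5}{2},\ \ldots,
\]

\[
R = \begin{pmatrix} 0 & 0 \\ R_{21} & 0 \end{pmatrix}, \quad \text{for } \mu = -\tfrac{1}{2},\ -1,\ -\tfrac{3}{2},\ -2,\ -\tfrac{5}{2},\ \ldots
\]

and the entry $R_{12}$ is determined by the entries of $A^0_1$. For example, if $\mu = \frac{1}{2}$, we can compute

\[
R_{12} = (A^0_1)_{12} = -r\,\frac{\sigma^2-1}{4}
\]

(and $F_{12}$ arbitrary); if $\mu = -\frac{1}{2}$ we have

\[
R_{21} = (A^0_1)_{21} = -\frac{1}{r}\,\frac{\sigma^2-1}{4}
\]

(and $F_{21}$ arbitrary); if $\mu = 1$ we have

\[
R_{12} = -r\,\frac{\sigma^2(\sigma^2-4)}{32}.
\]

Since $\sigma \notin \mathbf{Z}$, $R \neq 0$. This is true for any $2\mu \in \mathbf{Z}\setminus\{0\}$. Note that the $R$ computed here coincides (by isomonodromicity) with the $R$ of the system (69).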

Therefore, there is a logarithmic solution at $\infty$. Only $C_{0\infty}$, and thus $C_0$ and $C_1$, change with respect to the nonresonant case. We will see shortly that such matrices disappear in the computation of $\mathrm{tr}(M_iM_j)$, $i, j = 0, 1, x$. Therefore, it is not necessary to know them explicitly; the only matrix that must be known is $C_{01}$, which is not affected by the resonance of $\mu$. This is the reason why the formulae of Theorem 2 hold true also in the resonant case.

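Second case: $\alpha_0, \beta_0 \in \mathbf{Z}$, namely $\sigma = \pm 2\mu + 2m$, $m \in \mathbf{Z}$.

The formulae are almost identical to the first case, but $C_{01}$ changes. To see this, we need to distinguish four cases.

(i) $\sigma = 2\mu + 2m$, $m = -1, -2, -3, \ldots$. We choose $y^{(1)}_2(z) = g_0(\alpha_1,\beta_1,\gamma_1;1-z)$. Here $g_0(z)$ is another logarithmic solution of [29]. Thus

\[
C_{01} = \begin{pmatrix} \dfrac{\Gamma(-m)\Gamma(-2\mu-m+1)}{\Gamma(-2\mu-2m)} & 0 \\[3mm] 0 & -\dfrac{\Gamma(1-2\mu-2m)}{\Gamma(1-m-2\mu)\Gamma(-m)} \end{pmatrix}.
\]

As usual, the matrix is computed from the connection formulas between the hyper-geometric functions and $g_0$ that the reader can find in [29].

(ii) $\sigma = 2\mu + 2m$, $m = 0, 1, 2, \ldots$. We choose $y^{(1)}_2(z) = g(\alpha_1,\beta_1,\gamma_1;1-z)$. Thus

\[
C_{01} = \begin{pmatrix} 0 & \dfrac{\Gamma(m+1)\Gamma(2\mu+m)}{\Gamma(2\mu+2m)} \\[3mm] -\dfrac{\Gamma(2\mu+2m+1)}{\Gamma(2\mu+m)\Gamma(m+1)} & 0 \end{pmatrix}.
\]

(iii) $\sigma = -2\mu + 2m$, $m = 0, -1, -2, \ldots$. We choose $y^{(1)}_2(z) = g_0(\alpha_1,\beta_1,\gamma_1;1-z)$. Thus

\[
C_{01} = \begin{pmatrix} \dfrac{\Gamma(1-m)\Gamma(2\mu-m)}{\Gamma(2\mu-2m)} & 0 \\[3mm] 0 & -\dfrac{\Gamma(1+2\mu-2m)}{\Gamma(2\mu-m)\Gamma(1-m)} \end{pmatrix}.
\]

(iv) $\sigma = -2\mu + 2m$, $m = 1, 2, 3, \ldots$. We choose $y^{(1)}_2(z) = g(\alpha_1,\beta_1,\gamma_1;1-z)$. Thus

\[
C_{01} = \begin{pmatrix} 0 & \dfrac{\Gamma(m)\Gamma(m+1-2\mu)}{\Gamma(2m-2\mu)} \\[3mm] -\dfrac{\Gamma(2m+1-2\mu)}{\Gamma(m+1-2\mu)\Gamma(m)} & 0 \end{pmatrix}.
\]

Note that this time

\[
F = \begin{pmatrix} 0 & \dfrac{r}{1-2\mu} \\ 0 & 0 \end{pmatrix}
\]

in the case $\sigma = \pm 2\mu$ (i.e. $m = 0$) because $A^0_1$ has a special form in this case. Then in $C_0$ the elements

\[
\frac{\sigma^2-(2\mu)^2}{8\mu(1-2\mu)}\,(C_{0\infty})_{12}, \qquad \frac{\sigma^2-(2\mu)^2}{8\mu(1-2\mu)}\,(C_{0\infty})_{22}
\]

must be substituted, for $m = 0$, with

\[
\frac{1}{1-2\mu}\,(C_{0\infty})_{12}, \qquad \frac{1}{1-2\mu}\,(C_{0\infty})_{22}.
\]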

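We turn to the system (71). Let $Y$ be the fundamental matrix (74). With the gauge $Y^{(2)}(z) := G_0^{-1}\,(Y_N(z)\,G_0)$ we have

\[
\frac{dY^{(2)}}{dz} = \left[ \frac{B_0}{z} + \frac{B_1}{z-1} \right] Y^{(2)},
\]

\[
B_0 = G_0^{-1}A^0_0\,G_0 = \begin{pmatrix} \dfrac{\sigma}{4} & \dfrac{\sigma}{4}\,s \\[2mm] -\dfrac{\sigma}{4s} & -\dfrac{\sigma}{4} \end{pmatrix}, \qquad
B_1 = G_0^{-1}A^0_x\,G_0 = \begin{pmatrix} \dfrac{\sigma}{4} & -\dfrac{\sigma}{4}\,s \\[2mm] \dfrac{\sigma}{4s} & -\dfrac{\sigma}{4} \end{pmatrix}.
\]

This time the effective parameters at $z = 0, 1, \infty$ are

\[
\begin{aligned}
\alpha_0 &= -\tfrac{\sigma}{2}, & \alpha_1 &= -\tfrac{\sigma}{2}, & \alpha_\infty &= -\tfrac{\sigma}{2},\\
\beta_0 &= \tfrac{\sigma}{2}+1, & \beta_1 &= \tfrac{\sigma}{2}+1, & \beta_\infty &= \tfrac{\sigma}{2},\\
\gamma_0 &= 1, & \gamma_1 &= 1, & \gamma_\infty &= \sigma.
\end{aligned}
\]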

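It follows that both at $z = 0$ and $z = 1$ there are logarithmic solutions. We skip the derivation of the connection formulae, which is done as in the previous cases, with some more technical complications. Before giving the results, we observe that

\[
\begin{aligned}
Y^{(2)}(z) &= \left(I + O\left(\frac{1}{z}\right)\right) z^{\mathrm{diag}(\frac{\sigma}{2},\,-\frac{\sigma}{2})}, && z \to \infty,\\
&= G_0^{-1}G_0\,(1 + O(z))\,z^{J}\,C'_0, && z \to 0,\\
&= G_0^{-1}G_1\,(1 + O(z-1))\,(z-1)^{J}\,C'_1, && z \to 1,
\end{aligned}
\]

where $C'_i := C_iG_0$, $i = 0, 1$. Then

\[
M_0 = G_0\,(C'_0)^{-1}\begin{pmatrix} 1 & 2\pi i \\ 0 & 1 \end{pmatrix} C'_0\,G_0^{-1}, \qquad
M_1 = G_0\,(C'_1)^{-1}\begin{pmatrix} 1 & 2\pi i \\ 0 & 1 \end{pmatrix} C'_1\,G_0^{-1}.
\]

So, we need to compute $C'_i$, $i = 0, 1$. The result is

\[
C'_0 = \begin{pmatrix} (C'_{0\infty})_{11} & \dfrac{\sigma}{\sigma+1}\,\dfrac{s}{4}\,(C'_{0\infty})_{12} \\[3mm] (C'_{0\infty})_{21} & \dfrac{\sigma}{\sigma+1}\,\dfrac{s}{4}\,(C'_{0\infty})_{22} \end{pmatrix}, \qquad C'_1 = C'_{01}\,C'_0,
\]

where

\[
(C'_{0\infty})^{-1} = \begin{pmatrix} \dfrac{\Gamma(\beta_0-\alpha_0)}{\Gamma(\beta_0)\Gamma(1-\alpha_0)}\,e^{i\pi\alpha_0} & 0 \\[3mm] \dfrac{\Gamma(\alpha_0-\beta_0)}{\Gamma(\alpha_0)\Gamma(1-\beta_0)}\,e^{i\pi\beta_0} & -\dfrac{\Gamma(1-\alpha_0)\Gamma(\beta_0)}{\Gamma(\beta_0-\alpha_0+1)}\,e^{i\pi\beta_0} \end{pmatrix},
\]

\[
C'_{01} = \begin{pmatrix} 0 & -\dfrac{\pi}{\sin(\pi\alpha_0)} \\[3mm] -\dfrac{\sin(\pi\alpha_0)}{\pi} & -e^{-i\pi\alpha_0} \end{pmatrix}.
\]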

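The case $\sigma \in \mathbf{Z}$ interests us only if $\sigma = 0$, otherwise $\sigma \notin \mathbf{C}\setminus\{(-\infty,0) \cup [1,+\infty)\}$. We observe that the system (70) is precisely the system for $Y^{(2)}(z)$ with the substitution $\sigma \mapsto -2\mu$. In the formulae for $x_i^2$, $i = 0, 1, \infty$, we only need $C_{01}$, which is obtained from $C'_{01}$ with $\alpha_0 = \mu$.

As for the system (71), the gauge $Y^{(2)} = G_0^{-1}YG_0$ yields

\[
B_0 = \begin{pmatrix} 0 & s \\ 0 & 0 \end{pmatrix}, \qquad B_1 = \begin{pmatrix} 0 & 1-s \\ 0 & 0 \end{pmatrix}.
\]

Here $G_0$ is the matrix such that

\[
G_0^{-1}AG_0 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.
\]

The behavior of $Y^{(2)}(z)$ is now

\[
\begin{aligned}
Y^{(2)}(z) &= \left(I + O\left(\frac{1}{z}\right)\right) z^{J}, && z \to \infty,\\
&= \tilde{G}_0^{-1}\,(1 + O(z))\,z^{J}\,C'_0, && z \to 0,\\
&= \tilde{G}_1\,(1 + O(z-1))\,(z-1)^{J}\,C'_1, && z \to 1.
\end{aligned}
\]

Here $\tilde{G}_i$ is the matrix that puts $B_i$ in Jordan form, for $i = 0, 1$. $Y^{(2)}$ can be computed explicitly:

\[
Y^{(2)}(z) = \begin{pmatrix} 1 & s\log(z) + (1-s)\log(z-1) \\ 0 & 1 \end{pmatrix}.
\]

If we choose $\tilde{G}_0 = \mathrm{diag}(1, 1/s)$, then $C'_0 = \begin{pmatrix} 1 & 0 \\ 0 & s \end{pmatrix}$. In the same way we find $C'_1 = \begin{pmatrix} 1 & 0 \\ 0 & 1-s \end{pmatrix}$.

To prove Theorem 2, it is now enough to compute

\[
\begin{aligned}
2 - x_0^2 &= \mathrm{tr}(M_0M_x) \equiv \mathrm{tr}\big(e^{2\pi iJ}\,(C'_{01})^{-1}\,e^{2\pi iJ}\,C'_{01}\big),\\
2 - x_1^2 &= \mathrm{tr}(M_xM_1) \equiv \mathrm{tr}\big((C'_1)^{-1}\,e^{2\pi iJ}\,C'_1\,C_{01}^{-1}\,e^{2\pi iJ}\,C_{01}\big),\\
2 - x_\infty^2 &= \mathrm{tr}(M_0M_1) \equiv \mathrm{tr}\big((C'_0)^{-1}\,e^{2\pi iJ}\,C'_0\,C_{01}^{-1}\,e^{2\pi iJ}\,C_{01}\big).
\end{aligned}
\]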
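The first of these traces is simple enough to evaluate by machine. The Python sketch below is an illustration of ours, not part of the original proof: it takes $C'_{01}$ with $\alpha_0 = -\sigma/2$ (the value of $\alpha_0$ for the system (71)), uses $e^{2\pi iJ} = \begin{pmatrix} 1 & 2\pi i \\ 0 & 1 \end{pmatrix}$ for the nilpotent Jordan block $J$, and compares $\mathrm{tr}\big(e^{2\pi iJ}(C'_{01})^{-1}e^{2\pi iJ}C'_{01}\big)$ with $2 - 2(1-\cos\pi\sigma)$. Helper names and sample values are assumptions.

```python
import cmath
import math


def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def mat_inv(a):
    """Inverse of an invertible 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


def trace_m0_mx(sigma):
    """tr(e^{2 pi i J} C'01^{-1} e^{2 pi i J} C'01) with alpha0 = -sigma/2."""
    alpha0 = -sigma / 2
    sin_a = cmath.sin(math.pi * alpha0)
    c01_prime = [[0, -math.pi / sin_a],
                 [-sin_a / math.pi, -cmath.exp(-1j * math.pi * alpha0)]]
    unipotent = [[1, 2j * math.pi], [0, 1]]  # exp(2 pi i J) for J = [[0, 1], [0, 0]]
    product = mat_mul(mat_mul(unipotent, mat_inv(c01_prime)), mat_mul(unipotent, c01_prime))
    return product[0][0] + product[1][1]


# Arbitrary non-integer sample values of sigma (assumptions for illustration only).
samples = (0.37, 0.6 + 0.25j)
differences = [abs(trace_m0_mx(s) - (2 - 2 * (1 - cmath.cos(math.pi * s)))) for s in samples]
print(differences)
```

The agreement reflects the identity $2 - 4\sin^2(\pi\sigma/2) = 2\cos(\pi\sigma)$, that is, $x_0^2 = 2(1-\cos\pi\sigma)$.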

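Note the remarkable simplifications obtained from the cyclic property of the trace (for example, $C_0$, $C_1$ and $G_0$ disappear). The fact that $C_0$ and $C_1$ disappear implies that the formulae of Theorem 2 are derived for any $\mu \neq 0$, including the resonant cases. Thus, the connection formulae in the resonant case $2\mu \in \mathbf{Z}\setminus\{0\}$ are the same as in the nonresonant case. The final result of the computation of the traces is:

(I) Generic case:

\[
2(1-\cos(\pi\sigma)) = x_0^2,
\]

\[
\frac{1}{f(\sigma,\mu)}\left(2 + F(\sigma,\mu)\,s + \frac{1}{F(\sigma,\mu)\,s}\right) = x_1^2, \tag{87}
\]

\[
\frac{1}{f(\sigma,\mu)}\left(2 - F(\sigma,\mu)\,e^{-i\pi\sigma}s - \frac{1}{F(\sigma,\mu)\,e^{-i\pi\sigma}s}\right) = x_\infty^2,
\]

where

\[
f(\sigma,\mu) = \frac{2\cos^2\left(\frac{\pi}{2}\sigma\right)}{\cos(\pi\sigma)-\cos(2\pi\mu)} \equiv \frac{4-x_0^2}{x_1^2+x_\infty^2-x_0x_1x_\infty},
\]

\[
F(\sigma,\mu) = f(\sigma,\mu)\,\frac{16^{\sigma}\,\Gamma\left(\frac{\sigma+1}{2}\right)^4}{\Gamma\left(1-\mu+\frac{\sigma}{2}\right)^2\Gamma\left(\mu+\frac{\sigma}{2}\right)^2}.
\]

(II) $\sigma \in 2\mathbf{Z}$, $x_0 = 0$.

\[
2(1-\cos(\pi\sigma)) = 0, \qquad 4\sin^2(\pi\mu)\,(1-s) = x_1^2, \qquad 4\sin^2(\pi\mu)\,s = x_\infty^2.
\]

(III) $x_0^2 = 4\sin^2(\pi\mu)$. Then (33) implies $x_\infty^2 = -x_1^2\exp(\pm 2\pi i\mu)$. Four cases which yield the values of $\sigma$ not included in (I) and (II) must be considered: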

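(III)1 $x_\infty^2 = -x_1^2\,e^{-2\pi i\mu}$

\[
\sigma = 2\mu+2m, \quad m = 0, 1, 2, \ldots, \qquad
s = \frac{\Gamma(m+1)^2\,\Gamma(2\mu+m)^2}{16^{2\mu+2m}\,\Gamma\left(\mu+m+\frac{1}{2}\right)^4}\;x_1^2.
\]

(III)2 $x_\infty^2 = -x_1^2\,e^{2\pi i\mu}$

\[
\sigma = 2\mu+2m, \quad m = -1, -2, -3, \ldots, \qquad
s = \frac{\pi^4}{\cos^4(\pi\mu)}\left[16^{2\mu+2m}\,\Gamma\left(\mu+m+\tfrac{1}{2}\right)^4\,\Gamma(-2\mu-m+1)^2\,\Gamma(-m)^2\,x_1^2\right]^{-1}.
\]

(III)3 $x_\infty^2 = -x_1^2\,e^{2\pi i\mu}$

\[
\sigma = -2\mu+2m, \quad m = 1, 2, 3, \ldots, \qquad
s = \frac{\Gamma(m-2\mu+1)^2\,\Gamma(m)^2}{16^{-2\mu+2m}\,\Gamma\left(-\mu+m+\frac{1}{2}\right)^4}\;x_1^2.
\]

(III)4 $x_\infty^2 = -x_1^2\,e^{-2\pi i\mu}$

\[
\sigma = -2\mu+2m, \quad m = 0, -1, -2, -3, \ldots, \qquad
s = \frac{\pi^4}{\cos^4(\pi\mu)}\left[16^{-2\mu+2m}\,\Gamma\left(-\mu+m+\tfrac{1}{2}\right)^4\,\Gamma(2\mu-m)^2\,\Gamma(1-m)^2\,x_1^2\right]^{-1}.
\]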

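We recall that $a$ in $y(x;\sigma,a)$ is $a = -\dfrac{1}{4s}$ in general, and $a = s$ for $\sigma = 0$.

To compute $\sigma$ and $s$ in the generic case (I) for a given triple $(x_0, x_1, x_\infty)$, we solve the system (87). It has two unknowns and three equations and we need to prove that it is compatible. Actually, the first equation $2(1-\cos(\pi\sigma)) = x_0^2$ always has solutions. Let us choose a solution $\sigma_0$ ($\pm\sigma_0 + 2n$, $\forall n \in \mathbf{Z}$, are also solutions). Substitute it in the last two equations. We need to verify that they are compatible. Instead of $s$ and $1/s$, write $X$ and $Y$. We have the linear system in two variables $X$, $Y$

\[
\begin{pmatrix} F(\sigma_0) & \dfrac{1}{F(\sigma_0)} \\[3mm] F(\sigma_0)\,e^{-i\pi\sigma_0} & \dfrac{1}{F(\sigma_0)\,e^{-i\pi\sigma_0}} \end{pmatrix}
\begin{pmatrix} X \\ Y \end{pmatrix} =
\begin{pmatrix} f(\sigma_0)\,x_1^2 - 2 \\[1mm] 2 - f(\sigma_0)\,x_\infty^2 \end{pmatrix}.
\]

The system has a unique solution if and only if

\[
2i\sin(\pi\sigma_0) = \det\begin{pmatrix} F(\sigma_0) & \dfrac{1}{F(\sigma_0)} \\[3mm] F(\sigma_0)\,e^{-i\pi\sigma_0} & \dfrac{1}{F(\sigma_0)\,e^{-i\pi\sigma_0}} \end{pmatrix} \neq 0.
\]

This happens for $\sigma_0 \notin \mathbf{Z}$. The condition is not restrictive, because for $\sigma$ even we turn to the case (II), and $\sigma$ odd is not in $\mathbf{C}\setminus[(-\infty,0) \cup [1,+\infty)]$. The solution is then

\[
X = \frac{2(1+e^{-i\pi\sigma_0}) - f(\sigma_0)\,(x_1^2 + x_\infty^2\,e^{-i\pi\sigma_0})}{F(\sigma_0)\,(e^{-2\pi i\sigma_0}-1)},
\]

\[
Y = F(\sigma_0)\,\frac{f(\sigma_0)\,e^{-i\pi\sigma_0}\,(e^{-i\pi\sigma_0}x_1^2 + x_\infty^2) - 2e^{-i\pi\sigma_0}(1+e^{-i\pi\sigma_0})}{e^{-2\pi i\sigma_0}-1}.
\]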

Compatibility of the system means that $XY \equiv 1$. This is verified by direct computation. It follows from this construction that for any $\sigma$ solution of the first equation of (87), there always exists a unique $s$ which solves the last two equations.
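The algebra behind the formulas for $X$ and $Y$ can be exercised with a round trip: start from some $s$, compute $x_1^2$ and $x_\infty^2$ from the last two equations of (87), then feed them into the closed formulas for $X$ and $Y$ and check that $X = s$, $Y = 1/s$. The Python sketch below does this. Since this step is purely linear algebra, $f$ and $F$ are treated as arbitrary nonzero numbers, which is an assumption made for the sake of the illustration (as are the function names and sample values).

```python
import cmath
import math


def squares_from_s(sigma, s, f, F):
    """x1^2 and x_inf^2 from the last two equations of (87)."""
    e = cmath.exp(-1j * math.pi * sigma)
    x1_sq = (2 + F * s + 1 / (F * s)) / f
    xinf_sq = (2 - F * e * s - 1 / (F * e * s)) / f
    return x1_sq, xinf_sq


def solve_linear_system(sigma, x1_sq, xinf_sq, f, F):
    """Closed-form solution (X, Y) of the 2x2 linear system, valid for non-integer sigma."""
    e = cmath.exp(-1j * math.pi * sigma)
    denom = e * e - 1  # e^{-2 pi i sigma} - 1
    X = (2 * (1 + e) - f * (x1_sq + xinf_sq * e)) / (F * denom)
    Y = F * (f * e * (e * x1_sq + xinf_sq) - 2 * e * (1 + e)) / denom
    return X, Y


# Arbitrary sample data (assumptions): sigma non-integer, s, f, F nonzero.
sigma, s, f, F = 0.3 + 0.1j, 0.8 - 0.4j, 1.7 + 0.2j, 0.6 - 1.1j
x1_sq, xinf_sq = squares_from_s(sigma, s, f, F)
X, Y = solve_linear_system(sigma, x1_sq, xinf_sq, f, F)
print(abs(X - s), abs(Y - 1 / s), abs(X * Y - 1))
```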

To complete the proof of Theorem 2 (points (i), (ii), (iii)), we just have to compute the square roots of the $x_i^2$ ($i = 0, 1, \infty$) in such a way that (33) is satisfied. For example, the square root of (I) satisfying (33) is

\[
x_0 = 2\sin\left(\frac{\pi}{2}\sigma\right),
\]

\[
x_1 = \frac{1}{\sqrt{f(\sigma,\mu)}}\left(\sqrt{F(\sigma,\mu)\,s} + \frac{1}{\sqrt{F(\sigma,\mu)\,s}}\right),
\]

\[
x_\infty = \frac{i}{\sqrt{f(\sigma,\mu)}}\left(\sqrt{F(\sigma,\mu)\,s}\;e^{-i\frac{\pi\sigma}{2}} - \frac{1}{\sqrt{F(\sigma,\mu)\,s}\;e^{-i\frac{\pi\sigma}{2}}}\right),
\]

which yields (i), with $F(\sigma,\mu) = f(\sigma,\mu)\,(2G(\sigma,\mu))^2$.

We remark that in case (II) only $\sigma = 0$ is in $\mathbf{C}\setminus\{(-\infty,0) \cup [1,+\infty)\}$. If $\mu$ is an integer in (II), the formulae give $(x_0, x_1, x_\infty) = (0, 0, 0)$. The triple is not admissible, and direct computation gives $R = 0$ for the system (84). This is the case of commuting monodromy matrices with a 1-parameter family of rational solutions of $PVI_\mu$.

The last remark concerns the choice of (64), (66) instead of (65), (67). The reason is that at $z = 0$ the system (84) has a solution corresponding to (85). This is true for any $\sigma \neq 0$ in $\mathbf{C}\setminus\{(-\infty,0) \cup [1,+\infty)\}$, also for $\sigma \to \pm 2\mu$. Its behavior is (86), which is obtainable from the $G_0 = T$ of (64), (66) but not of (65), (67). See also the comment following Lemma 5.

Remark. In the proof of Theorem 2 we take the limits of the system and of the rescaled system for $x \to 0$ in $D(\sigma)$. At $x$ we assign the monodromy $M_0, M_1, M_x$ characterized by $(x_0, x_1, x_\infty)$ and then we take the limit proving the theorem. If we start from another point $x' \in D(\sigma)$ we have to choose the same monodromy $M_0, M_1, M_x$, because what we are doing is the limit for $x \to 0$ in $D(\sigma)$ of the matrix coefficient $A(z, x; x_0, x_1, x_\infty)$ of the system (69) considered as a function defined on the universal covering of $\mathbf{C}_0 \cap \{|x| < \varepsilon\}$.

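Proof of Remark 2 of Section 4. We prove that $a(\sigma) = \dfrac{1}{16\,a(-\sigma)}$, namely

\[
s(\sigma) = \frac{1}{s(-\sigma)} \qquad \left(a = -\frac{1}{4s}\right).
\]

Given monodromy data $(x_0, x_1, x_\infty)$ the parameter $s$ corresponding to $\sigma$ is uniquely determined by

\[
\frac{1}{f(\sigma)}\left(2 + F(\sigma)\,s + \frac{1}{F(\sigma)\,s}\right) = x_1^2,
\]

\[
\frac{1}{f(\sigma)}\left(2 - F(\sigma)\,e^{-i\pi\sigma}s - \frac{1}{F(\sigma)\,e^{-i\pi\sigma}s}\right) = x_\infty^2.
\]

We observe that $f(\sigma) = f(-\sigma)$ and that the properties of the Gamma function

\[
\Gamma(1-z)\,\Gamma(z) = \frac{\pi}{\sin(\pi z)}, \qquad \Gamma(z+1) = z\,\Gamma(z)
\]

imply $F(-\sigma) = 1/F(\sigma)$. Then the value of $s$ corresponding to $-\sigma$ is (uniquely) determined by

\[
\frac{1}{f(\sigma)}\left(2 + \frac{s}{F(\sigma)} + \frac{F(\sigma)}{s}\right) = x_1^2,
\]

\[
\frac{1}{f(\sigma)}\left(2 - \frac{s}{F(\sigma)\,e^{-i\pi\sigma}} - \frac{F(\sigma)\,e^{-i\pi\sigma}}{s}\right) = x_\infty^2.
\]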
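The relation $F(-\sigma) = 1/F(\sigma)$ and its consequence for $s$ can be checked numerically with the standard library. The Python sketch below evaluates $f$ and $F$ from their definitions in (I) for real sample values of $\sigma$ and $\mu$ (real because `math.gamma` only accepts real arguments, a limitation of this illustration and not of the paper), and verifies that the pair $(-\sigma, 1/s)$ produces the same $x_1^2$, $x_\infty^2$ as $(\sigma, s)$. Function names and sample values are our own assumptions.

```python
import cmath
import math


def f_of(sigma, mu):
    """f(sigma, mu) from the generic case (I)."""
    return 2 * math.cos(math.pi * sigma / 2) ** 2 / (
        math.cos(math.pi * sigma) - math.cos(2 * math.pi * mu))


def F_of(sigma, mu):
    """F(sigma, mu) from the generic case (I), for real arguments away from Gamma poles."""
    return (f_of(sigma, mu) * 16 ** sigma * math.gamma((sigma + 1) / 2) ** 4
            / (math.gamma(1 - mu + sigma / 2) ** 2 * math.gamma(mu + sigma / 2) ** 2))


def squares(sigma, mu, s):
    """x1^2 and x_inf^2 from the last two equations of (87)."""
    f, F = f_of(sigma, mu), F_of(sigma, mu)
    e = cmath.exp(-1j * math.pi * sigma)
    return (2 + F * s + 1 / (F * s)) / f, (2 - F * e * s - 1 / (F * e * s)) / f


# Arbitrary sample values (assumptions for illustration only).
sigma, mu, s = 0.3, 0.7, 0.8 - 0.4j
product = F_of(sigma, mu) * F_of(-sigma, mu)
direct = squares(sigma, mu, s)
mirrored = squares(-sigma, mu, 1 / s)
print(product, direct, mirrored)
```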

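We conclude that $s(-\sigma) = \dfrac{1}{s(\sigma)}$. ✷

Proof of formula (45). We are ready to prove formula (45), namely $\beta_1^2: (\sigma, a) \mapsto (\sigma, a\,e^{-2\pi i\sigma})$. For $\sigma = 0$, we have $x_0 = 0$ and $\beta_1^2: (0, x_1, x_\infty) \mapsto (0, x_1, x_\infty)$. Thus,

\[
a = \frac{x_\infty^2}{x_1^2+x_\infty^2} \mapsto \frac{x_\infty^2}{x_1^2+x_\infty^2} \equiv a.
\]

For $\sigma = \pm 2\mu + 2m$, we consider the example $\sigma = 2\mu + 2m$, $m = 0, 1, 2, \ldots$. The other cases are analogous. We have $s = x_1^2\,H(\sigma) = -x_\infty^2\,H(\sigma)\,e^{2\pi i\mu}$, where the function $H(\sigma)$ is explicitly given in Theorem 2(III). Then

\[
\beta_1: \ s = -x_\infty^2\,H(\sigma)\,e^{2\pi i\mu} \mapsto -x_1^2\,H(\sigma)\,e^{2\pi i\mu} = -s\,e^{2\pi i\mu}.
\]

Then

\[
\beta_1^2: \ s \mapsto s\,e^{4\pi i\mu} \ \Longrightarrow\ a \mapsto a\,e^{-4\pi i\mu} \equiv a\,e^{-2\pi i\sigma}.
\]

For the generic case (I) ($\sigma \notin \mathbf{Z}$, $\sigma \neq \pm 2\mu + 2m$) recall that the system

\[
F(\sigma)\,s + \frac{1}{F(\sigma)\,s} = x_1^2\,f(\sigma) - 2, \qquad
F(\sigma)\,e^{-i\pi\sigma}s + \frac{1}{F(\sigma)\,e^{-i\pi\sigma}s} = 2 - x_\infty^2\,f(\sigma)
\]

has a unique solution $s$. Also observe that $\beta_1: x_\infty \mapsto x_1$. Then the transformed parameter $\beta_1: s \mapsto s^{\beta_1}$ satisfies the equation

\[
F(\sigma)\,e^{-i\pi\sigma}s^{\beta_1} + \frac{1}{F(\sigma)\,e^{-i\pi\sigma}s^{\beta_1}} = 2 - x_1^2\,f(\sigma) \equiv -\left(F(\sigma)\,s + \frac{1}{F(\sigma)\,s}\right).
\]

Thus $s^{\beta_1} = -e^{i\pi\sigma}s$. This implies

\[
\beta_1^2: \ s \mapsto s\,e^{2\pi i\sigma} \ \Longrightarrow\ a \mapsto a\,e^{-2\pi i\sigma}. \qquad ✷
\]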

We finally prove the proposition stated at the end of Section 4.

Proof. Observe that both $y(x)$ and $y(x;\sigma,a)$ have the same asymptotic behavior for $x \to 0$ in $D(\sigma)$. Let $A_0(x), A_1(x), A_x(x)$ be the matrices constructed from $y(x)$ and $A^*_0(x), A^*_1(x), A^*_x(x)$ those constructed from $y(x;\sigma,a)$ by means of formulae (21). It follows that $A_i(x)$ and $A^*_i(x)$, $i = 0, 1, x$, have the same asymptotic behavior as $x \to 0$. This is the behavior of Lemma 1 of Section 8 (adapted to our case). From the proof of Theorem 2, it follows that $A_0(x), A_1(x), A_x(x)$ and $A^*_0(x), A^*_1(x), A^*_x(x)$ produce the same triple $(x_0, x_1, x_\infty)$. The solution of the Riemann–Hilbert problem for such a triple is unique, up to conjugation of the Fuchsian systems. Therefore, $A_i(x)$ and $A^*_i(x)$, $i = 0, 1, x$, are conjugated. If $2\mu \notin \mathbf{Z}$, the conjugation is diagonal. If $2\mu \in \mathbf{Z}$ and $R \neq 0$, then $A_i(x) = A^*_i(x)$. Putting $[A(z;x)]_{12} = 0$ and $[A^*(z;x)]_{12} = 0$, we conclude that $y(x) \equiv y(x;\sigma,a)$.


The elliptic representation was derived by Fuchs in [14]. In the case of $PVI_\mu$, the representation is discussed at the beginning of Subsection 5.1. Here we study the solutions of (34). We let $x \to 0$. If $\Im\tau > 0$ and

\[
\left|\Im\left(\frac{u}{4\omega_1}\right)\right| < \Im\tau, \tag{88}
\]

we expand the elliptic function in Fourier series (39). The first condition $\Im\tau > 0$ is always satisfied for $x \to 0$ because

\[
\Im\tau(x) = -\frac{1}{\pi}\ln|x| + \frac{4}{\pi}\ln 2 + O(x), \qquad x \to 0.
\]

Therefore, in the following we assume that $|x| < \varepsilon < 1$ for a sufficiently small $\varepsilon$. We look for a solution $u(x)$ of (34) of the form

\[
u(x) = 2\nu_1\omega_1(x) + 2\nu_2\omega_2(x) + 2v(x),
\]

where $v(x)$ is a (small) perturbation to be determined from (34). We observe that

\[
\frac{u(x)}{4\omega_1(x)} = \frac{\nu_1}{2} + \frac{\nu_2}{2}\,\tau(x) + \frac{v(x)}{2\omega_1(x)}
= \frac{\nu_1}{2} + \frac{\nu_2}{2}\left[-\frac{i}{\pi}\ln x - \frac{i}{\pi}\,\frac{F_1(x)}{F(x)}\right] + \frac{v(x)}{2\omega_1(x)}.
\]

Note that for $x \to 0$, $F_1(x)/F(x) = -4\ln 2 + g(x)$, where $g(x) = O(x)$ is a convergent Taylor series starting with $x$. Thus, condition (88) becomes

\[
(2+\Re\nu_2)\ln|x| - C(x,\nu_1,\nu_2) - 8\ln 2 < \Im\nu_2\,\arg(x) < (\Re\nu_2-2)\ln|x| - C(x,\nu_1,\nu_2) + 8\ln 2, \tag{89}
\]

where

\[
C(x,\nu_1,\nu_2) = \left[\Im\frac{\pi v}{\omega_1} + 4\ln 2\,\Re\nu_2 + \pi\,\Im\nu_1 + O(x)\right].
\]

We expand the derivative of $\wp$ appearing in (34):

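\[
\begin{aligned}
\frac{\partial}{\partial u}\wp\left(\frac{u}{2};\omega_1,\omega_2\right)
&= \left(\frac{\pi}{\omega_1}\right)^3 \sum_{n=1}^{\infty} \frac{n^2e^{2\pi in\tau}}{1-e^{2\pi in\tau}}\,\sin\left(\frac{n\pi u}{2\omega_1}\right) - \left(\frac{\pi}{2\omega_1}\right)^3 \frac{\cos\left(\frac{\pi u}{4\omega_1}\right)}{\sin^3\left(\frac{\pi u}{4\omega_1}\right)}\\
&= \frac{1}{2i}\left(\frac{\pi}{\omega_1}\right)^3 \sum_{n=1}^{\infty} \frac{n^2e^{2\pi in\tau}}{1-e^{2\pi in\tau}}\left(e^{in\frac{\pi u}{2\omega_1}} - e^{-in\frac{\pi u}{2\omega_1}}\right)
+ 4i\left(\frac{\pi}{2\omega_1}\right)^3 \frac{e^{i\frac{\pi u}{4\omega_1}} + e^{-i\frac{\pi u}{4\omega_1}}}{\left(e^{i\frac{\pi u}{4\omega_1}} - e^{-i\frac{\pi u}{4\omega_1}}\right)^3}.
\end{aligned}
\]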

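Now we come to a crucial step in the construction: we collect $e^{-i\frac{\pi u}{4\omega_1}}$ in the last term, which becomes

\[
4i\left(\frac{\pi}{2\omega_1}\right)^3 \frac{e^{4\pi i\frac{u}{4\omega_1}} + e^{2\pi i\frac{u}{4\omega_1}}}{\left(e^{2\pi i\frac{u}{4\omega_1}} - 1\right)^3}.
\]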
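The rewriting of the last term rests on the elementary identity $-\cos\theta/\sin^3\theta = 4i\,(q^2+q)/(q-1)^3$ with $q = e^{2i\theta}$, applied to $\theta = \pi u/(4\omega_1)$. The common prefactor $(\pi/2\omega_1)^3$ plays no role. The following Python sketch, with hypothetical function names and arbitrary sample points of our choosing, compares the two sides numerically.

```python
import cmath
import math


def trig_form(theta):
    """-cos(theta) / sin(theta)^3, the trigonometric form of the last term."""
    return -cmath.cos(theta) / cmath.sin(theta) ** 3


def exponential_form(theta):
    """4i (q^2 + q) / (q - 1)^3 with q = exp(2 i theta)."""
    q = cmath.exp(2j * theta)
    return 4j * (q * q + q) / (q - 1) ** 3


# Arbitrary sample points away from the zeros of sin (assumptions for illustration).
samples = (0.4 + 0.9j, 1.1 - 0.3j, math.pi / 4)
gaps = [abs(trig_form(t) - exponential_form(t)) for t in samples]
print(gaps)
```

For instance, at $\theta = \pi/4$ both sides equal $-2$.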

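The denominator does not vanish if $\left|e^{2\pi i\frac{u}{4\omega_1}}\right| < 1$. From now on, this condition is added to (88) and reduces the domain (89). The expansion of $\partial\wp/\partial u$ becomes

\[
\begin{aligned}
\frac{\partial}{\partial u}\wp\left(\frac{u}{2};\omega_1,\omega_2\right)
= {} & \frac{1}{2i}\left(\frac{\pi}{\omega_1}\right)^3 \sum_{n=1}^{\infty} \frac{n^2\,e^{i\pi n\left[-\nu_1+(2-\nu_2)\tau-\frac{v}{\omega_1}\right]}}{1-e^{2\pi in\tau}}\left(e^{2i\pi n\left[\nu_1+\nu_2\tau+\frac{v}{\omega_1}\right]} - 1\right)\\
& + 4i\left(\frac{\pi}{2\omega_1}\right)^3 \frac{e^{2\pi i\left[\nu_1+\nu_2\tau+\frac{v}{\omega_1}\right]} + e^{\pi i\left[\nu_1+\nu_2\tau+\frac{v}{\omega_1}\right]}}{\left(e^{\pi i\left[\nu_1+\nu_2\tau+\frac{v}{\omega_1}\right]} - 1\right)^3}.
\end{aligned}
\]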


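We observe that

\[
e^{i\pi C\tau} = \frac{x^{C}}{16^{C}}\,e^{Cg(x)} = \frac{x^{C}}{16^{C}}\,(1 + O(x)), \qquad x \to 0, \quad \text{for any } C \in \mathbf{C}.
\]

Hence,

\[
\frac{\partial}{\partial u}\wp\left(\frac{u}{2};\omega_1,\omega_2\right) = F\left(x,\ \frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}\,x^{2-\nu_2}\,e^{-i\pi\frac{v}{\omega_1}},\ \frac{e^{i\pi\nu_1}}{16^{\nu_2}}\,x^{\nu_2}\,e^{i\pi\frac{v}{\omega_1}}\right),
\]

where

\[
\begin{aligned}
F(x,y,z) = {} & \frac{1}{2i}\left(\frac{\pi}{\omega_1(x)}\right)^3 \sum_{n=1}^{\infty} \frac{n^2\,e^{n(2-\nu_2)g(x)}}{1-\left[\frac{1}{16}e^{g(x)}\right]^{2n}x^{2n}}\;y^{n}\left(e^{2n\nu_2g(x)}z^{2n} - 1\right)\\
& + 4i\left(\frac{\pi}{2\omega_1(x)}\right)^3 \frac{e^{2\nu_2g(x)}z^2 + e^{\nu_2g(x)}z}{\left(e^{\nu_2g(x)}z - 1\right)^3}.
\end{aligned}
\]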

The series converges for $|x| < \varepsilon$ and for $|y| < 1$, $|yz| < 1$; this is precisely (88). However, we require that the last term is holomorphic, so we have to further impose $|e^{\nu_2g(x)}z| < 1$. On the resulting domain $|x| < \varepsilon$, $|y| < 1$, $|e^{\nu_2g(x)}z| < 1$, $F(x,y,z)$ is holomorphic and satisfies $F(0,0,0) = 0$.

The condition $|y| < 1$, $|e^{\nu_2g(x)}z| < 1$ is

\[
\left|\frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}\,x^{2-\nu_2}\,e^{-i\pi\frac{v}{\omega_1}}\right| < 1, \qquad
\left|e^{\nu_2g(x)}\,\frac{e^{i\pi\nu_1}}{16^{\nu_2}}\,x^{\nu_2}\,e^{i\pi\frac{v}{\omega_1}}\right| < 1,
\]

namely

\[
\Re\nu_2\,\ln|x| - C(x,\nu_1,\nu_2) < \Im\nu_2\,\arg(x) < (\Re\nu_2-2)\ln|x| - C(x,\nu_1,\nu_2) + 8\ln 2, \tag{90}
\]

which is more restrictive than (89). For $\Im\nu_2 = 0$, any value of $\arg(x)$ is allowed, but

\[
\left|\frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}\,x^{2-\nu_2}\,e^{-i\pi\frac{v}{\omega_1}}\right| < 1, \qquad
\left|e^{\nu_2g(x)}\,\frac{e^{i\pi\nu_1}}{16^{\nu_2}}\,x^{\nu_2}\,e^{i\pi\frac{v}{\omega_1}}\right| < 1
\]

imply $0 < \nu_2 < 2$. Thus, $\nu_2 = 0$ is not allowed.

The function $F$ can be decomposed as follows:

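\[
\begin{aligned}
F = {} & F\left(x,\ \frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}\,x^{2-\nu_2},\ \frac{e^{i\pi\nu_1}}{16^{\nu_2}}\,x^{\nu_2}\right)\\
& + \left[F\left(x,\ \frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}\,x^{2-\nu_2}\,e^{-i\pi\frac{v}{\omega_1}},\ \frac{e^{i\pi\nu_1}}{16^{\nu_2}}\,x^{\nu_2}\,e^{i\pi\frac{v}{\omega_1}}\right) - F\left(x,\ \frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}\,x^{2-\nu_2},\ \frac{e^{i\pi\nu_1}}{16^{\nu_2}}\,x^{\nu_2}\right)\right]\\
=: {} & F\left(x,\ \frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}\,x^{2-\nu_2},\ \frac{e^{i\pi\nu_1}}{16^{\nu_2}}\,x^{\nu_2}\right) + G\left(x,\ \frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}\,x^{2-\nu_2},\ \frac{e^{i\pi\nu_1}}{16^{\nu_2}}\,x^{\nu_2},\ v(x)\right).
\end{aligned}
\]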

The above defines $G(x,y,z,v)$. It is holomorphic for $|x|, |y|, |z|, |v|$ less than a sufficiently small $\varepsilon < 1$. Moreover $G(0,0,0,v) = G(x,y,z,0) = 0$.

Let us put $u = u_0 + 2v$, where $u_0 = 2\nu_1\omega_1 + 2\nu_2\omega_2$. Therefore,

\[
L(u_0) = 0 \quad \text{and} \quad L(u_0+2v) = L(u_0) + L(2v) \equiv 2L(v).
\]

Hence, (34) becomes

\[
L(v) = \frac{\alpha}{2x(1-x)}\,(F + G), \tag{91}
\]

where

\[
F = F\left(x,\ \frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}\,x^{2-\nu_2},\ \frac{e^{i\pi\nu_1}}{16^{\nu_2}}\,x^{\nu_2}\right), \qquad
G = G\left(x,\ \frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}\,x^{2-\nu_2},\ \frac{e^{i\pi\nu_1}}{16^{\nu_2}}\,x^{\nu_2},\ v(x)\right).
\]

We put $w := xv'$ (where $v' = dv/dx$), and Equation (91) becomes

\[
w' = \frac{1}{x}\left[\frac{\alpha}{2(1-x)^2}\,F + \frac{x\left(w + \frac{1}{4}v\right)}{1-x} + \frac{\alpha}{2(1-x)^2}\,G\right].
\]

Now, let us define

\[
L(x,y,z) := \frac{\alpha}{2(1-x)^2}\,F(x,y,z),
\]

\[
J(x,y,z,v,w) := \frac{x\left(w + \frac{1}{4}v\right)}{1-x} + \frac{\alpha}{2(1-x)^2}\,G(x,y,z,v).
\]

They are holomorphic for $|x|, |y|, |z|, |v|, |w|$ less than $\varepsilon$ and

\[
L(0,0,0) = 0, \qquad J(0,0,0,v,w) = J(x,y,z,0,0) = 0.
\]

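Equation (34) becomes the system

\[
x\frac{dv}{dx} = w,
\]

\[
x\frac{dw}{dx} = L\left(x,\ \frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}\,x^{2-\nu_2},\ \frac{e^{i\pi\nu_1}}{16^{\nu_2}}\,x^{\nu_2}\right) + J\left(x,\ \frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}\,x^{2-\nu_2},\ \frac{e^{i\pi\nu_1}}{16^{\nu_2}}\,x^{\nu_2},\ v(x),\ w(x)\right).
\]

We reduce it to a system of integral equations

\[
w(x) = \int_{L(x)} \frac{1}{s}\left\{L\left(s,\ \frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}\,s^{2-\nu_2},\ \frac{e^{i\pi\nu_1}}{16^{\nu_2}}\,s^{\nu_2}\right) + J\left(s,\ \frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}\,s^{2-\nu_2},\ \frac{e^{i\pi\nu_1}}{16^{\nu_2}}\,s^{\nu_2},\ v(s),\ w(s)\right)\right\} ds,
\]

\[
v(x) = \int_{L(x)} \frac{1}{s}\int_{L(s)} \frac{1}{t}\left\{L\left(t,\ \frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}\,t^{2-\nu_2},\ \frac{e^{i\pi\nu_1}}{16^{\nu_2}}\,t^{\nu_2}\right) + J\left(t,\ \frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}\,t^{2-\nu_2},\ \frac{e^{i\pi\nu_1}}{16^{\nu_2}}\,t^{\nu_2},\ v(t),\ w(t)\right)\right\} dt\,ds.
\]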

The point $x$ and the path of integration are chosen to belong to the domain where

\[
|x|, \quad \left|\frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}\,x^{2-\nu_2}\right|, \quad \left|\frac{e^{i\pi\nu_1}}{16^{\nu_2}}\,x^{\nu_2}\right|, \quad |v(x)|, \quad |w(x)|
\]

are less than $\varepsilon$, in such a way that $L$ and $J$ are holomorphic. That such a domain is not empty will be shown below. In particular, we will show that if we require that

\[
|x| < r, \qquad \left|\frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}\,x^{2-\nu_2}\right| < r, \qquad \left|\frac{e^{i\pi\nu_1}}{16^{\nu_2}}\,x^{\nu_2}\right| < r,
\]

where $r < \varepsilon$ is small enough, also $|v(x)|$ and $|w(x)|$ are less than $\varepsilon$. Such a domain is precisely the domain of Theorem 3, which is contained in (90).

We choose the path of integration $L(x)$ connecting $0$ to $x$, defined by

\[
\arg(s) = \frac{\Re\nu_2-\nu^*}{\Im\nu_2}\,\log|s| + b,
\]

where

\[
b = \arg x - \frac{\Re\nu_2-\nu^*}{\Im\nu_2}\,\log|x|.
\]

Namely:

\[
\arg(s) = \arg(x) + \frac{\Re\nu_2-\nu^*}{\Im\nu_2}\,\log\frac{|s|}{|x|}.
\]

If $x$ belongs to the domain (90) (or to $D(r;\nu_1,\nu_2)$), then the path does not leave the domain when $s \to 0$, provided that $0 < \nu^* < 2$. If $\Im\nu_2 = 0$, we take the path $\arg s = \arg x$, namely $\nu^* = \nu_2$. The parameterization of the path is

\[
s = \rho\,\exp\left\{i\left(\arg x + \frac{\Re\nu_2-\nu^*}{\Im\nu_2}\,\log\frac{\rho}{|x|}\right)\right\}, \qquad 0 < \rho \le |x|,
\]

therefore

\[
|ds| = P(\nu_2,\nu^*)\,d\rho, \qquad P(\nu_2,\nu^*) := \sqrt{1 + \left(\frac{\Re\nu_2-\nu^*}{\Im\nu_2}\right)^2}.
\]

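We observe that for any complex numbers $A$, $B$ we have

\[
\int_{L(x)} \frac{1}{|s|}\left(|s| + |As^{2-\nu_2}| + |Bs^{\nu_2}|\right)^n |ds| \le \frac{P(\nu_2,\nu^*)}{n\,\min(\nu^*, 2-\nu^*)}\left(|x| + |Ax^{2-\nu_2}| + |Bx^{\nu_2}|\right)^n. \tag{92}
\]

This follows from the consideration that on $L(x)$ we have $|s^{\nu_2}| = |x^{\nu_2}|\,|s|^{\nu^*}/|x|^{\nu^*}$. Therefore

\[
\begin{aligned}
\int_{L(x)} \frac{1}{|s|}\,|s|^{i}\,|As^{2-\nu_2}|^{j}\,|Bs^{\nu_2}|^{k}\,|ds|
&= \frac{|Ax^{2-\nu_2}|^{j}\,|Bx^{\nu_2}|^{k}}{|x|^{(2-\nu^*)j}\,|x|^{\nu^*k}}\;P(\nu_2,\nu^*)\int_0^{|x|} d\rho\;\rho^{\,i-1+(2-\nu^*)j+\nu^*k}\\
&= \frac{P(\nu_2,\nu^*)}{i + j(2-\nu^*) + k\nu^*}\;|x|^{i}\,|Ax^{2-\nu_2}|^{j}\,|Bx^{\nu_2}|^{k}\\
&\le \frac{P(\nu_2,\nu^*)}{(i+j+k)\,\min(\nu^*, 2-\nu^*)}\;|x|^{i}\,|Ax^{2-\nu_2}|^{j}\,|Bx^{\nu_2}|^{k},
\end{aligned}
\]

from which (92) follows, provided that $0 < \nu^* < 2$. For $\Im\nu_2 = 0$ this brings again $0 < \nu_2 < 2$.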
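The key property of the path $L(x)$, namely $|s^{\nu_2}| = |x^{\nu_2}|\,(|s|/|x|)^{\nu^*}$ together with the companion relation $|s^{2-\nu_2}| = |x^{2-\nu_2}|\,(|s|/|x|)^{2-\nu^*}$, can be checked numerically. The Python sketch below does so for a sample $\nu_2$ with nonzero imaginary part. Powers are computed on the universal covering as $|s^{\nu}| = \exp(\Re\nu\,\ln|s| - \Im\nu\,\arg s)$. The helper names and the sample values are assumptions of ours.

```python
import math


def arg_on_path(rho, x_abs, x_arg, nu2, nu_star):
    """arg(s) at the point of L(x) with |s| = rho (requires Im nu2 != 0)."""
    slope = (nu2.real - nu_star) / nu2.imag
    return x_arg + slope * math.log(rho / x_abs)


def abs_power(rho, arg, exponent):
    """|s^exponent| for s = rho * exp(i * arg), with arg taken on the universal covering."""
    return math.exp(exponent.real * math.log(rho) - exponent.imag * arg)


# Arbitrary sample data (assumptions): nu2 complex, 0 < nu_star < 2, endpoint x.
nu2, nu_star = 0.6 + 1.3j, 1.0
x_abs, x_arg = 0.05, 2.5

ratios = []
for rho in (0.04, 0.01, 0.001):
    arg_s = arg_on_path(rho, x_abs, x_arg, nu2, nu_star)
    first = abs_power(rho, arg_s, nu2) / (
        abs_power(x_abs, x_arg, nu2) * (rho / x_abs) ** nu_star)
    second = abs_power(rho, arg_s, 2 - nu2) / (
        abs_power(x_abs, x_arg, 2 - nu2) * (rho / x_abs) ** (2 - nu_star))
    ratios.append((first, second))
print(ratios)
```

Both ratios equal 1 along the path, which is why both $|Bs^{\nu_2}|$ and $|As^{2-\nu_2}|$ decrease as $s \to 0$ exactly when $0 < \nu^* < 2$.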

We observe that a solution of the integral equations is also a solution of the differential equations, by virtue of the analogue of Sub-Lemma 1 of Section 8:

SUB-LEMMA 2. Let $f(x)$ be a holomorphic function in the domain $|x| < \varepsilon$, $|Ax^{2-\nu_2}| < \varepsilon$, $|Bx^{\nu_2}| < \varepsilon$, such that $f(x) = O(|x| + |Ax^{2-\nu_2}| + |Bx^{\nu_2}|)$, $A, B \in \mathbf{C}$. Let $L(x)$ be the path of integration defined above for $0 < \nu^* < 2$ and $F(x) := \int_{L(x)} \frac{1}{s}f(s)\,ds$. Then, $F(x)$ is holomorphic on the domain and $dF(x)/dx = \frac{1}{x}f(x)$.

Proof. We repeat exactly the argument of the proof of Sub-Lemma 1 in Section 8. We choose the point $x + \Delta x$ close to $x$ and we prove that $\int_{L(x)} - \int_{L(x+\Delta x)} = \int_x^{x+\Delta x}$, where the last integral is on a segment. Again, we reduce to the evaluation of the integral in the small portion of $L(x)$, $L(x+\Delta x)$ contained in the disc $U_R$ of radius $R < |x|$ and on the arc $\gamma(x_R, x'_R)$ on the circle $|s| = R$. Taking into account that $f(x) = O(|x| + |Ax^{2-\nu_2}| + |Bx^{\nu_2}|)$ and (92) we have

\[
\begin{aligned}
\left|\int_{L(x_R)} \frac{1}{s}f(s)\,ds\right| &\le \int_{L(x_R)} \frac{1}{|s|}\,O\left(|s| + |As^{2-\nu_2}| + |Bs^{\nu_2}|\right)|ds|\\
&\le \frac{P(\nu_2,\nu^*)}{\min(\nu^*, 2-\nu^*)}\,O\left(|x_R| + |Ax_R^{2-\nu_2}| + |Bx_R^{\nu_2}|\right)\\
&= \frac{P(\nu_2,\nu^*)}{\min(\nu^*, 2-\nu^*)}\,O\left(R^{\min\{\nu^*, 2-\nu^*\}}\right).
\end{aligned}
\]

The last step follows from $|x_R^{\nu_2}| = (|x^{\nu_2}|/|x|^{\nu^*})\,R^{\nu^*}$. So the integral vanishes for $R \to 0$. The same is proved for $\int_{L(x+\Delta x)}$. As for the integral on the arc, we have

\[
|\arg x_R - \arg x'_R| = \left|\arg x - \arg(x+\Delta x) + \frac{\Re\nu_2-\nu^*}{\Im\nu_2}\,\log\left|1 + \frac{\Delta x}{x}\right|\right|
\]

or $|\arg x_R - \arg x'_R| = |\arg x - \arg(x+\Delta x)|$, if $\Im\nu_2 = 0$. This is independent of $R$, therefore the length of the arc is $O(R)$ and

\[
\left|\int_{\gamma(x_R, x'_R)} \frac{1}{|s|}\,|f(s)|\,|ds|\right| = O\left(R^{\min\{\nu^*, 2-\nu^*\}}\right) \to 0 \quad \text{for } R \to 0. \qquad ✷
\]

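Now we prove a fundamental lemma:

LEMMA 6. For any complex $\nu_1$, $\nu_2$ such that $\nu_2 \notin (-\infty,0] \cup [2,+\infty)$, there exists a sufficiently small $r < 1$ such that the system of integral equations has a solution $v(x)$ holomorphic in

\[
D(r;\nu_1,\nu_2) := \left\{x \in \mathbf{C}_0 \ \text{such that} \ |x| < r, \ \left|\frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}\,x^{2-\nu_2}\right| < r, \ \left|\frac{e^{i\pi\nu_1}}{16^{\nu_2}}\,x^{\nu_2}\right| < r\right\}.
\]

Moreover, there exists a constant $M(\nu_2)$ depending on $\nu_2$ such that

\[
|v(x)| \le M(\nu_2)\left(|x| + \left|\frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}\,x^{2-\nu_2}\right| + \left|\frac{e^{i\pi\nu_1}}{16^{\nu_2}}\,x^{\nu_2}\right|\right)
\]

in $D(r;\nu_1,\nu_2)$.

To prove Lemma 6 we need some sub-lemmas.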

SUB-LEMMA 3. Let $L(x,y,z)$ and $J(x,y,z,v,w)$ be two holomorphic functions of their arguments for $|x|, |y|, |z|, |v|, |w| < \varepsilon$, satisfying

\[
L(0,0,0) = 0, \qquad J(0,0,0,v,w) = J(x,y,z,0,0) = 0.
\]

Then, there exists a constant $c > 0$ such that

\[
|L(x,y,z)| \le c\,(|x| + |y| + |z|), \tag{93}
\]

\[
|J(x,y,z,v,w)| \le c\,(|x| + |y| + |z|), \tag{94}
\]

\[
|J(x,y,z,v_2,w_2) - J(x,y,z,v_1,w_1)| \le c\,(|x| + |y| + |z|)\,(|v_2-v_1| + |w_2-w_1|), \tag{95}
\]

for $|x|, |y|, |z|, |v|, |w| < \varepsilon$.

Proof. Let's prove (94).

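\[
\begin{aligned}
J(x,y,z,v,w) &= \int_0^1 \frac{d}{d\lambda}J(\lambda x, \lambda y, \lambda z, v, w)\,d\lambda\\
&= x\int_0^1 \frac{\partial J}{\partial x}(\lambda x, \lambda y, \lambda z, v, w)\,d\lambda + y\int_0^1 \frac{\partial J}{\partial y}(\lambda x, \lambda y, \lambda z, v, w)\,d\lambda + z\int_0^1 \frac{\partial J}{\partial z}(\lambda x, \lambda y, \lambda z, v, w)\,d\lambda.
\end{aligned}
\]

Moreover, for $\delta$ small,

\[
\frac{\partial J}{\partial x}(\lambda x, \lambda y, \lambda z, v, w) = \int_{|\zeta-\lambda x|=\delta} \frac{J(\zeta, \lambda y, \lambda z, v, w)}{(\zeta-\lambda x)^2}\,\frac{d\zeta}{2\pi i},
\]

which implies that $\partial J/\partial x$ is holomorphic and bounded when its arguments are less than $\varepsilon$. The same holds true for $\partial J/\partial y$ and $\partial J/\partial z$. This proves (94), $c$ being a constant which bounds $|\partial J/\partial x|$, $|\partial J/\partial y|$, $|\partial J/\partial z|$. The inequality (93) is proved in the same way. We turn to (95). First we prove that for $|x|, |y|, |z|, |v_1|, |w_1|, |v_2|, |w_2| < \varepsilon$ there exist two holomorphic and bounded functions $\psi_1(x,y,z,v_1,w_1,v_2,w_2)$, $\psi_2(x,y,z,v_1,w_1,v_2,w_2)$ such that

\[
J(x,y,z,v_2,w_2) - J(x,y,z,v_1,w_1) = (v_2-v_1)\,\psi_1(x,y,z,v_1,w_1,v_2,w_2) + (w_2-w_1)\,\psi_2(x,y,z,v_1,w_1,v_2,w_2). \tag{96}
\]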

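In order to prove this, we write

\[
\begin{aligned}
J(x,y,z,v_2,w_2) - J(x,y,z,v_1,w_1)
&= \int_0^1 \frac{d}{d\lambda}J\big(x,y,z,\lambda v_2+(1-\lambda)v_1,\ \lambda w_2+(1-\lambda)w_1\big)\,d\lambda\\
&= (v_2-v_1)\int_0^1 \frac{\partial J}{\partial v}\big(x,y,z,\lambda v_2+(1-\lambda)v_1,\ \lambda w_2+(1-\lambda)w_1\big)\,d\lambda\\
&\quad + (w_2-w_1)\int_0^1 \frac{\partial J}{\partial w}\big(x,y,z,\lambda v_2+(1-\lambda)v_1,\ \lambda w_2+(1-\lambda)w_1\big)\,d\lambda\\
&=: (v_2-v_1)\,\psi_1(x,y,z,v_1,w_1,v_2,w_2) + (w_2-w_1)\,\psi_2(x,y,z,v_1,w_1,v_2,w_2).
\end{aligned}
\]

Moreover, for small $\delta$,

\[
\frac{\partial J}{\partial v}(x,y,z,v,w) = \int_{|\zeta-v|=\delta} \frac{J(x,y,z,\zeta,w)}{(\zeta-v)^2}\,\frac{d\zeta}{2\pi i},
\]

which implies that $\psi_1$ is holomorphic and bounded for its arguments less than $\varepsilon$. We also obtain $\partial J/\partial v\,(0,0,0,v,w) = 0$, then $\psi_1(0,0,0,v_1,w_1,v_2,w_2) = 0$. The proof for $\psi_2$ is analogous. We use (96) to complete the proof of (95). Actually, we observe that

\[
\psi_i(x,y,z,v_1,w_1,v_2,w_2) = \int_0^1 \frac{d}{d\lambda}\psi_i(\lambda x, \lambda y, \lambda z, v_1,w_1,v_2,w_2)\,d\lambda
= x\int_0^1 \frac{\partial\psi_i}{\partial x}\,d\lambda + y\int_0^1 \frac{\partial\psi_i}{\partial y}\,d\lambda + z\int_0^1 \frac{\partial\psi_i}{\partial z}\,d\lambda
\]

and we conclude as in the proof of (94). ✷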


We solve the system of integral equations by successive approximations. We can choose any path $L(x)$ such that $0 < \nu^* < 2$. Here we choose $\nu^* = 1$. For convenience, we put

\[
A := \frac{e^{-i\pi\nu_1}}{16^{2-\nu_2}}, \qquad B := \frac{e^{i\pi\nu_1}}{16^{\nu_2}}.
\]

Therefore, for any $n \ge 1$ the successive approximations are

\[
v_0 = w_0 = 0,
\]

\[
w_n(x) = \int_{L(x)} \frac{1}{s}\left\{L(s, As^{2-\nu_2}, Bs^{\nu_2}) + J\big(s, As^{2-\nu_2}, Bs^{\nu_2}, v_{n-1}(s), w_{n-1}(s)\big)\right\} ds, \tag{97}
\]

\[
v_n(x) = \int_{L(x)} \frac{1}{s}\,w_n(s)\,ds. \tag{98}
\]

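SUB-LEMMA 4. There exists a sufficiently small $\varepsilon' < \varepsilon$ such that for any $n \ge 0$ the functions $v_n(x)$ and $w_n(x)$ are holomorphic in the domain

\[
D(\varepsilon';\nu_1,\nu_2) := \left\{x \in \mathbf{C}_0 \ \text{such that} \ |x| < \varepsilon', \ |Ax^{2-\nu_2}| < \varepsilon', \ |Bx^{\nu_2}| < \varepsilon'\right\}.
\]

They are also correctly bounded, namely $|v_n(x)| < \varepsilon$, $|w_n(x)| < \varepsilon$ for any $n$. They satisfy

\[
|v_n - v_{n-1}| \le \frac{(2c)^nP(\nu_2)^{2n}}{n!}\left(|x| + |Ax^{2-\nu_2}| + |Bx^{\nu_2}|\right)^n, \tag{99}
\]

\[
|w_n - w_{n-1}| \le \frac{(2c)^nP(\nu_2)^{2n}}{n!}\left(|x| + |Ax^{2-\nu_2}| + |Bx^{\nu_2}|\right)^n, \tag{100}
\]

where $P(\nu_2) := P(\nu_2, \nu^* = 1)$ and $c$ is the constant appearing in Sub-Lemma 3. Moreover $x\,dv_n/dx = w_n$.

Proof. We proceed by induction.

\[
w_1 = \int_{L(x)} \frac{1}{s}\,L(s, As^{2-\nu_2}, Bs^{\nu_2})\,ds, \qquad v_1 = \int_{L(x)} \frac{1}{s}\,w_1(s)\,ds.
\]

It follows from Sub-Lemma 2 and (93) that $w_1(x)$ is holomorphic for $|x|, |Ax^{2-\nu_2}|, |Bx^{\nu_2}| < \varepsilon$. From (92) and (93), we have

\[
|w_1(x)| \le \int_{L(x)} \frac{1}{|s|}\,|L(s, As^{2-\nu_2}, Bs^{\nu_2})|\,|ds| \le c\,P(\nu_2)\left(|x| + |Ax^{2-\nu_2}| + |Bx^{\nu_2}|\right) \le 3c\,P(\nu_2)\,\varepsilon' < \varepsilon
\]

on $D(\varepsilon';\nu_1,\nu_2)$, provided that $\varepsilon'$ is small enough. By Sub-Lemma 2, also $v_1(x)$ is holomorphic for $|x|, |Ax^{2-\nu_2}|, |Bx^{\nu_2}| < \varepsilon$ and $x\,(dv_1/dx) = w_1$. By (92) we also have

\[
|v_1(x)| \le c\,P(\nu_2)^2\left(|x| + |Ax^{2-\nu_2}| + |Bx^{\nu_2}|\right) \le 3c\,P(\nu_2)^2\,\varepsilon' < \varepsilon
\]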

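on $D(\varepsilon';\nu_1,\nu_2)$. Note that $P(\nu_2) \ge 1$, so (99) and (100) are true for $n = 1$. Now we suppose that the statement of the sub-lemma is true for $n$ and we prove it for $n+1$. Consider

\[
|w_{n+1}(x) - w_n(x)| = \left|\int_{L(x)} \frac{1}{s}\left[J(s, As^{2-\nu_2}, Bs^{\nu_2}, v_n, w_n) - J(s, As^{2-\nu_2}, Bs^{\nu_2}, v_{n-1}, w_{n-1})\right] ds\right|.
\]

By (95) the above is

\[
\le c\int_{L(x)} \frac{1}{|s|}\left(|s| + |As^{2-\nu_2}| + |Bs^{\nu_2}|\right)\left(|v_n - v_{n-1}| + |w_n - w_{n-1}|\right)|ds|.
\]

By induction this is

\[
\begin{aligned}
&\le 2c\,\frac{(2c)^nP(\nu_2)^{2n}}{n!}\int_{L(x)} \frac{1}{|s|}\left(|s| + |As^{2-\nu_2}| + |Bs^{\nu_2}|\right)^{n+1}|ds|\\
&\le 2c\,\frac{(2c)^nP(\nu_2)^{2n}}{n!}\,\frac{P(\nu_2)}{n+1}\left(|x| + |Ax^{2-\nu_2}| + |Bx^{\nu_2}|\right)^{n+1}\\
&\le \frac{(2c)^{n+1}P(\nu_2)^{2(n+1)}}{(n+1)!}\left(|x| + |Ax^{2-\nu_2}| + |Bx^{\nu_2}|\right)^{n+1}.
\end{aligned}
\]

This proves (100). Now we estimate

\[
\begin{aligned}
|v_{n+1}(x) - v_n(x)| &\le \int_{L(x)} \frac{1}{|s|}\,|w_{n+1}(s) - w_n(s)|\,|ds|\\
&\le \frac{(2c)^{n+1}P(\nu_2)^{2n+1}}{(n+1)!}\int_{L(x)} \frac{1}{|s|}\left(|s| + |As^{2-\nu_2}| + |Bs^{\nu_2}|\right)^{n+1}|ds|\\
&\le \frac{(2c)^{n+1}P(\nu_2)^{2(n+1)}}{(n+1)\,(n+1)!}\left(|x| + |Ax^{2-\nu_2}| + |Bx^{\nu_2}|\right)^{n+1}\\
&\le \frac{(2c)^{n+1}P(\nu_2)^{2(n+1)}}{(n+1)!}\left(|x| + |Ax^{2-\nu_2}| + |Bx^{\nu_2}|\right)^{n+1}.
\end{aligned}
\]

This proves (99). From Sub-Lemma 2, we also conclude that $w_n$ and $v_n$ are holomorphic in $D(\varepsilon',\nu_1,\nu_2)$ and $x\,\dfrac{dv_n}{dx} = w_n$. Finally, we see that

\[
|v_n(x)| \le \sum_{k=1}^{n} |v_k(x) - v_{k-1}(x)| \le \exp\left\{2c\,P^2(\nu_2)\left(|x| + |Ax^{2-\nu_2}| + |Bx^{\nu_2}|\right)\right\} - 1 \le \exp\{6c\,P^2(\nu_2)\,\varepsilon'\} - 1
\]

and the same for $|w_n(x)|$. Therefore, if $\varepsilon'$ is small enough we have $|v_n(x)| < \varepsilon$, $|w_n(x)| < \varepsilon$ on $D(\varepsilon',\nu_1,\nu_2)$. ✷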

Let us define

\[
v(x) := \lim_{n\to\infty} v_n(x), \qquad w(x) := \lim_{n\to\infty} w_n(x)
\]

if they exist. We can also rewrite

\[
v(x) = \lim_{n\to\infty} v_n(x) = \sum_{n=1}^{\infty} \big(v_n(x) - v_{n-1}(x)\big).
\]

We see that the series converges uniformly in $D(\varepsilon',\nu_1,\nu_2)$ because

\[
\left|\sum_{n=1}^{\infty} \big(v_n(x) - v_{n-1}(x)\big)\right| \le \sum_{n=1}^{\infty} \frac{(2c)^nP(\nu_2)^{2n}}{n!}\left(|x| + |Ax^{2-\nu_2}| + |Bx^{\nu_2}|\right)^n = \exp\left\{2c\,P^2(\nu_2)\left(|x| + |Ax^{2-\nu_2}| + |Bx^{\nu_2}|\right)\right\} - 1.
\]

The same holds for $w_n(x)$. Therefore, $v(x)$ and $w(x)$ define holomorphic functions in $D(\varepsilon',\nu_1,\nu_2)$. From Sub-Lemma 4, we also have $x\,(dv(x)/dx) = w(x)$ in $D(\varepsilon',\nu_1,\nu_2)$.

We show that $v(x)$, $w(x)$ solve the initial integral equations. The left-hand side of (97) converges to $w(x)$ for $n \to \infty$. Let us prove that the right-hand side also converges to

\[
\int_{L(x)} \frac{1}{s}\left\{L(s, As^{2-\nu_2}, Bs^{\nu_2}) + J\big(s, As^{2-\nu_2}, Bs^{\nu_2}, v(s), w(s)\big)\right\} ds.
\]

We have to evaluate the following difference:

\[
\left|\int_{L(x)} \frac{1}{s}\,J\big(s, As^{2-\nu_2}, Bs^{\nu_2}, v(s), w(s)\big)\,ds - \int_{L(x)} \frac{1}{s}\,J\big(s, As^{2-\nu_2}, Bs^{\nu_2}, v_n(s), w_n(s)\big)\,ds\right|.
\]

From (95), the above is

\[
\le c\int_{L(x)} \frac{1}{|s|}\left(|s| + |As^{2-\nu_2}| + |Bs^{\nu_2}|\right)\left(|v - v_n| + |w - w_n|\right)|ds|. \tag{101}
\]
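Now we observe that

\[
\begin{aligned}
|v(x) - v_n(x)| &\le \sum_{k=n+1}^{\infty} |v_k - v_{k-1}| \le \sum_{k=n+1}^{\infty} \frac{(2c)^kP(\nu_2)^{2k}}{k!}\left(|x| + |Ax^{2-\nu_2}| + |Bx^{\nu_2}|\right)^k\\
&\le \left(|x| + |Ax^{2-\nu_2}| + |Bx^{\nu_2}|\right)^{n+1}\sum_{k=0}^{\infty} \frac{(2c)^{k+n+1}P(\nu_2)^{2(k+n+1)}}{(k+n+1)!}\left(|x| + |Ax^{2-\nu_2}| + |Bx^{\nu_2}|\right)^k.
\end{aligned}
\]

The series converges. Its sum is less than some constant $S(\nu_2)$ independent of $n$. We obtain

\[
|v(x) - v_n(x)| \le S(\nu_2)\left(|x| + |Ax^{2-\nu_2}| + |Bx^{\nu_2}|\right)^{n+1}.
\]

The same holds for $|w - w_n|$. Thus, (101) is

\[
\le 2c\,S(\nu_2)\int_{L(x)} \frac{1}{|s|}\left(|s| + |As^{2-\nu_2}| + |Bs^{\nu_2}|\right)^{n+2}|ds| \le \frac{2c\,S(\nu_2)\,P(\nu_2)}{n+2}\left(|x| + |Ax^{2-\nu_2}| + |Bx^{\nu_2}|\right)^{n+2}.
\]

Namely,

\[
\left|\int_{L(x)} \frac{1}{s}\,J\big(s, As^{2-\nu_2}, Bs^{\nu_2}, v(s), w(s)\big)\,ds - \int_{L(x)} \frac{1}{s}\,J\big(s, As^{2-\nu_2}, Bs^{\nu_2}, v_n(s), w_n(s)\big)\,ds\right| \le \frac{2c\,S(\nu_2)\,P(\nu_2)}{n+2}\,(3\varepsilon')^{n+2}.
\]

In a similar way, the right-hand side of (98) is

\[
\left|\int_{L(x)} \frac{1}{s}\,\big(w(s) - w_n(s)\big)\,ds\right| \le \frac{S(\nu_2)\,P(\nu_2)}{n+1}\,(3\varepsilon')^{n+1}.
\]

Therefore, the right-hand sides of (97) and (98) converge on the domain $D(r,\nu_1,\nu_2)$ for $r < \min\{\varepsilon', 1/3\}$. We finally observe that $|v(x)|$ and $|w(x)|$ are bounded on $D(r)$. For example,

\[
|v(x)| \le \left(|x| + |Ax^{2-\nu_2}| + |Bx^{\nu_2}|\right)\sum_{k=0}^{\infty} \frac{(2c)^{k+1}P(\nu_2)^{2(k+1)}}{(k+1)!}\left(|x| + |Ax^{2-\nu_2}| + |Bx^{\nu_2}|\right)^k =: M(\nu_2)\left(|x| + |Ax^{2-\nu_2}| + |Bx^{\nu_2}|\right),
\]

where the sum of the series is less than a constant $M(\nu_2)$. We have proved Lemma 6. ✷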

We note that the proof of Lemma 6 only makes use of the properties of $L$ and $J$, regardless of how these functions have been constructed. The structure of the integral equations implies that $v(x)$ is bounded (namely $|v(x)| = O(r)$). Now, we come back to our case, where $L$ and $J$ have been constructed from the Fourier expansion of elliptic functions. We need to check whether (90) and $D(r,\nu_1,\nu_2)$ have nonempty intersection. This is true; indeed $D(r)$ is contained in (90), because in (90) the term $\Im(\pi v/2\omega_1)$ is $O(r)$, while in $D(r,\nu_1,\nu_2)$ the term $\ln r$ appears, and $r$ is small.

To conclude the proof of Theorem 3, we have to work out the explicit series (13). In order to do this, we observe that $w_1$ and $v_1$ are series of the type

$$\sum_{p,q,r \ge 0} c_{pqr}(\nu_2)\, x^p (Ax^{2-\nu_2})^q (Bx^{\nu_2})^r, \tag{102}$$

where $c_{pqr}(\nu_2)$ is rational in $\nu_2$. This follows from $w_1(x) = \int_{L(x)} L(s, As^{2-\nu_2}, Bs^{\nu_2})\,ds$ and from the fact that $L(x, Ax^{2-\nu_2}, Bx^{\nu_2})$ itself is a series (102) by construction, with coefficients $c_{pqr}(\nu_2)$ which are rational functions of $\nu_2$. The same holds true for J. We conclude that $w_n(x)$ and $v_n(x)$ have the form (102) for any n. This implies that the limit v(x) is also a series of type (102). We can reorder such a series to obtain (13). Consider the term $c_{pqr}(\nu_2)\,x^p (Ax^{2-\nu_2})^q (Bx^{\nu_2})^r$, and recall that by definition $B = 1/(16^2 A)$. We absorb $16^{-2r}$ into $c_{pqr}(\nu_2)$ and we study the factor

$$A^{q-r} x^{p+(2-\nu_2)q+\nu_2 r} = A^{q-r} x^{p+2q+(r-q)\nu_2}.$$

We have three cases:

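(1) $r = q$; then we have

$$x^{p+2q} =: x^n, \qquad n = p + 2q.$$

(2) $r > q$; then we have

$$x^{p+2q}\left[\frac{1}{A}\,x^{\nu_2}\right]^{r-q} =: x^n\left[\frac{1}{A}\,x^{\nu_2}\right]^{m}, \qquad n = p + 2q, \quad m = r - q.$$

(3) $r < q$; then we have

$$x^{p+2r}\left[Ax^{2-\nu_2}\right]^{q-r} =: x^n\left[Ax^{2-\nu_2}\right]^{m}, \qquad n = p + 2r, \quad m = q - r.$$

This brings a series of the type (102) to the form (13). The proof of Theorem 3 is complete.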
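The exponent bookkeeping in the three cases can be checked numerically. The following sketch is illustrative only: the function names `monomial` and `reordered` and the sample values of A, x and ν2 are assumptions, the factor $16^{-2r}$ is taken as already absorbed into the coefficient, and in each case the bracket exponent is taken to be $|q - r|$.

```python
import cmath

def monomial(p, q, r, A, x, nu):
    """A^(q-r) * x^(p + (2-nu)*q + nu*r), the factor studied in the text."""
    return A ** (q - r) * cmath.exp((p + (2 - nu) * q + nu * r) * cmath.log(x))

def reordered(p, q, r, A, x, nu):
    """The same factor written as x^n * [bracket]^m, following the three cases."""
    xpow = lambda e: cmath.exp(e * cmath.log(x))  # principal branch of x^e
    if r == q:
        return xpow(p + 2 * q)
    if r > q:
        n, m = p + 2 * q, r - q
        return xpow(n) * (xpow(nu) / A) ** m
    n, m = p + 2 * r, q - r
    return xpow(n) * (A * xpow(2 - nu)) ** m

# Hypothetical sample point; any nonzero A and x would do.
A, x, nu = 0.7 + 0.2j, 0.3 + 0.1j, 0.4 + 0.3j
for p, q, r in [(0, 0, 0), (1, 2, 2), (3, 1, 4), (2, 5, 1), (0, 0, 3), (4, 3, 0)]:
    lhs, rhs = monomial(p, q, r, A, x, nu), reordered(p, q, r, A, x, nu)
    assert abs(lhs - rhs) <= 1e-9 * max(1.0, abs(lhs))
```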

A system of integral equations similar to the one we considered here was first studied by Shimomura in [37] and [19].


Acknowledgements

I am grateful to B. Dubrovin for many discussions and advice. I would like to thank A. Bolibruch, A. Its, M. Jimbo, M. Mazzocco, and S. Shimomura for fruitful discussions. The author is supported by a fellowship of the Japan Society for the Promotion of Science (JSPS).

References

1. Anosov, D. V. and Bolibruch, A. A.: The Riemann–Hilbert Problem, Publ. Steklov Institute of Mathematics, 1994.

2. Balser, W., Jurkat, W. B. and Lutz, D. A.: Birkhoff invariants and Stokes’ multipliers for meromorphic linear differential equations, J. Math. Anal. Appl. 71 (1979), 48–94.

3. Balser, W., Jurkat, W. B. and Lutz, D. A.: On the reduction of connection problems for differential equations with an irregular singular point to ones with only regular singularities, SIAM J. Math. Anal. 12 (1981), 691–721.

4. Birman, J. S.: Braids, Links, and Mapping Class Groups, Ann. of Math. Stud. 82, Princeton Univ. Press, 1975.

5. Bolibruch, A. A.: On movable singular points of Schlesinger equation of isomonodromic deformation, Preprint, 1995; On isomonodromic confluences of Fuchsian singularities, Proc. Steklov Inst. Math. 221 (1998), 117–132; On Fuchsian systems with given asymptotics and monodromy, Proc. Steklov Inst. Math. 224 (1999), 98–106.

6. Dijkgraaf, R., Verlinde, E. and Verlinde, H.: Topological strings in d < 1, Nuclear Phys. B 352 (1991), 59–86.

7. Dubrovin, B.: Integrable systems in topological field theory, Nuclear Phys. B 379 (1992), 627–689.

8. Dubrovin, B.: Geometry and Integrability of topological-antitopological fusion, Comm. Math. Phys. 152 (1993), 539–564.

9. Dubrovin, B.: Geometry of 2D topological field theories, In: Lecture Notes in Math. 1620, Springer, New York, 1996, pp. 120–348.

10. Dubrovin, B.: Painlevé transcendents in two-dimensional topological field theory, In: R. Conte (ed.), The Painlevé Property, One Century Later, Springer, New York, 1999.

11. Dubrovin, B.: Geometry and analytic theory of Frobenius manifolds, math.AG/9807034, 1998.

12. Dubrovin, B.: Differential geometry on the space of orbits of a Coxeter group, math.AG/9807034, 1998.

13. Dubrovin, B. and Mazzocco, M.: Monodromy of certain Painlevé-VI transcendents and reflection groups, Invent. Math. 141 (2000), 55–147.

14. Fuchs, R.: Uber lineare homogene Differentialgleichungen zweiter Ordnung mit drei im Endlichen gelegenen wesentlich singularen Stellen, Math. Annal. 63 (1907), 301–321.

15. Gambier, B.: Sur des équations differentielles du second ordre et du premier degré dont l’intégrale est à points critiques fixes, Acta Math. 33 (1910), 1–55.

16. Guzzetti, D.: Stokes matrices and monodromy for the quantum cohomology of projective spaces, Comm. Math. Phys. 207 (1999), 341–383.

17. Guzzetti, D.: Inverse problem and monodromy data for three-dimensional Frobenius manifolds, J. Math. Phys. Anal. Geom. 4 (2001), 245–291.

18. Its, A. R. and Novokshenov, V. Y.: The Isomonodromic Deformation Method in the Theory of Painlevé Equations, Lecture Notes in Math. 1191, Springer, New York, 1986.

19. Iwasaki, K., Kimura, H., Shimomura, S. and Yoshida, M.: From Gauss to Painlevé, Aspects Math. 16, Vieweg, Braunschweig, 1991.

20. Jimbo, M.: Monodromy problem and the boundary condition for some Painlevé transcendents, Publ. RIMS, Kyoto Univ. 18 (1982), 1137–1161.

21. Jimbo, M., Miwa, T. and Ueno, K.: Monodromy preserving deformations of linear ordinary differential equations with rational coefficients (I), Physica D 2 (1981), 306.

22. Jimbo, M. and Miwa, T.: Monodromy preserving deformations of linear ordinary differential equations with rational coefficients (II), Physica D 2 (1981), 407–448.

23. Jimbo, M. and Miwa, T.: Monodromy preserving deformations of linear ordinary differential equations with rational coefficients (III), Physica D 4 (1981), 26.

24. Kontsevich, M. and Manin, Y. I.: Gromov–Witten classes, quantum cohomology and enumerative geometry, Comm. Math. Phys. 164 (1994), 525–562.

25. Luke, Y. L.: Special Functions and their Approximations, Academic Press, New York, 1969.

26. Manin, V. I.: Frobenius manifolds, quantum cohomology and moduli spaces, Max Planck Institut fur Mathematik, Bonn, 1998.

27. Manin, V. I.: Sixth Painlevé equation, universal elliptic curve, and mirror of P2, alg-geom/9605010.

28. Mazzocco, M.: Picard and Chazy solutions to the Painlevé VI equation, SISSA Preprint No. 89/98/FM, 1998, to appear in Math. Ann. (2001).

29. Norlund, N. E.: The logarithmic solutions of the hypergeometric equation, Mat. Fys. Skr. Dan. Vid. Selsk. 2(5) (1963), 1–58.

30. Okamoto: Studies on the Painlevé equations I, the six Painlevé equation, Ann. Mat. Pura Appl. 146 (1987), 337–381.

31. Painlevé, P.: Sur les équations differentielles du second ordre et d’ordre supérieur, dont l’intégrale générale est uniforme, Acta Math. 25 (1900), 1–86.

32. Picard, E.: Mémoire sur la théorie des functions algébriques de deux variables, J. Liouville 5 (1889), 135–319.

33. Sato, M., Miwa, T. and Jimbo, M.: Holonomic quantum fields. II – The Riemann–Hilbert problem, Publ. RIMS Kyoto Univ. 15 (1979), 201–278.

34. Saito, K.: Preprint RIMS-288, 1979 and Publ. RIMS Kyoto Univ. 19 (1983), 1231–1264.

35. Saito, K., Yano, T. and Sekeguchi, J.: Comm. Algebra 8(4) (1980), 373–408.

36. Shibuya, Y.: Funkcial Ekvac. 11 (1968), 235.

37. Shimomura, S.: Painlevé transcendents in the neighbourhood of fixed singular points, Funkcial Ekvac. 25 (1982), 163–184; Series expansions of Painlevé transcendents in the neighbourhood of a fixed singular point, Funkcial Ekvac. 25 (1982), 185–197; Supplement to Series expansions of Painlevé transcendents in the neighbourhood of a fixed singular point, Funkcial Ekvac. 25 (1982), 363–371; A family of solutions of a nonlinear ordinary differential equation and its application to Painlevé equations (III), (V), (VI), J. Math. Soc. Japan 39 (1987), 649–662.

38. Umemura, H.: Painlevé birational automorphism groups and differential equations, Nagoya Math. J. 119 (1990), 1–80.

39. Witten, E.: On the structure of the topological phase of two dimensional gravity, Nuclear Phys. B 340 (1990), 281–332.



Spectrum Localization of Infinite Matrices*

M.I. GIL’
Department of Mathematics, Ben Gurion University of the Negev, PO Box 653, Beer-Sheva 84105, Israel. e-mail: [email protected]

(Received: 4 September 2000; in final form: 1 June 2001)

Abstract. The paper deals with linear operators in a separable Hilbert space represented by infinite matrices with compact off-diagonal parts. Bounds for the spectrum are established. In particular, new estimates for the spectral radius are proposed. These results are new even in the finite-dimensional case. Also applications to integral, differential and integro-differential operators are discussed.

Mathematics Subject Classifications (2000): 47A10, 47A55, 15A9, 15A18.

Key words: finite and infinite matrices, spectrum localization, integral, differential and integro-differential operators.

1. Introduction and Preliminaries

A lot of papers and books are devoted to the spectrum of compact operators, mainly relating to the distributions of the eigenvalues, cf. [6, 10, 11] and references therein. However, in many applications, for example, in stability theory and numerical analysis, bounds for eigenvalues are very important. But the bounds are investigated considerably less than the distributions.

Let H be a separable complex Hilbert space with a scalar product $(\cdot,\cdot)$, the norm $\|\cdot\|$ and the unit operator I. Let $\{e_k\}_{k=1}^{\infty}$ be an orthogonal normed basis in H. Everywhere below A is a bounded linear operator in H represented by a matrix with the entries

$$a_{jk} = (Ae_k, e_j) \quad (j, k = 1, 2, \ldots). \tag{1.1}$$

In the sequel V, W and D denote the upper triangular, lower triangular, and diagonal parts of A, respectively:

$$\begin{aligned}
&(Ve_k, e_j) = a_{jk} \ \text{for } j < k, \qquad (Ve_k, e_j) = 0 \ \text{for all } j > k;\\
&(We_k, e_j) = a_{jk} \ \text{for } j > k, \qquad (We_k, e_j) = 0 \ \text{for all } j < k;\\
&(De_k, e_k) = a_{kk}, \qquad (De_k, e_j) = 0 \ \text{for } j \ne k \quad (j, k = 1, 2, \ldots).
\end{aligned} \tag{1.2}$$

So A = D + V + W. Throughout this paper it is assumed that

$$V \text{ and } W \text{ are compact operators.} \tag{1.3}$$

* This research was supported by the Israel Ministry of Science and Technology.


Numerous integral operators can be represented by matrix A under condition (1.3). In the present paper we derive bounds for the spectrum of A.

The paper is organized as follows. In Section 2 we prove the main result of the paper – Theorem 2.1 on the spectrum of matrix A. Section 3 is devoted to finite matrices. In particular, in the case of matrices which are ‘near’ to triangular, we improve the Cassini Ovals Theorem and the Frobenius estimate for the spectral radius, cf. [9]. Section 4 deals with infinite matrices having Hilbert–Schmidt off-diagonal components. Besides, we supplement the well-known estimates for the spectral radius [7]. In Section 5 we consider matrices whose off-diagonals are Neumann–Schatten operators. Section 6 is devoted to the nonlinear spectrum. Localization of spectra of unbounded operators is examined in Section 7. Illustrative examples are collected in Section 8. Here we consider an integral operator, a nonselfadjoint differential operator and an integro-differential one. Besides, our results supplement the well-known ones, cf. [1, 2, 8, 11] and references therein.

LEMMA 1.1. Under conditions (1.3), operators V and W are quasinilpotent. That is,

$$\lim_{n\to\infty} \sqrt[n]{\|V^n\|} = 0, \qquad \lim_{n\to\infty} \sqrt[n]{\|W^n\|} = 0.$$

Proof. For any natural m introduce the orthogonal projectors $P_m$ and $Q_m$ by

$$P_m h = \sum_{k=1}^{m} (h, e_k)e_k \quad (h \in H) \quad \text{and} \quad Q_m = I - P_m.$$

Simple calculation shows that $P_k V P_k = V P_k$, $Q_k W Q_k = W Q_k$ and

$$(P_{k+1} - P_k)V(P_{k+1} - P_k) = (Q_{k+1} - Q_k)W(Q_{k+1} - Q_k) = 0$$

for all natural k. Now the required result is due to Lemma 3.2.2 ([3]). ✷

2. The Main Result

In the sequel, $\sigma(B)$ denotes the spectrum of a linear operator B, $r_s(B)$ is its spectral radius, and

$$\rho(\lambda, D) = \inf_{j=1,2,\ldots} |\lambda - a_{jj}| \quad (\lambda \in \mathbb{C}).$$

Clearly, $\sigma(D)$ is the closure of the set $\{a_{kk}\}_{k=1}^{\infty}$.

THEOREM 2.1. Let the conditions

$$\|[(D - \lambda I)^{-1}V]^k\| \le c_k\,\rho^{-k}(\lambda, D)$$

and

$$\|[(D - \lambda I)^{-1}W]^k\| \le d_k\,\rho^{-k}(\lambda, D) \quad (\lambda \notin \sigma(D)) \tag{2.1}$$

hold, where $c_k, d_k$ (k = 1, 2, . . .) are nonnegative numbers with the properties

$$\sqrt[k]{c_k} \to 0, \qquad \sqrt[k]{d_k} \to 0 \quad (k \to \infty). \tag{2.2}$$

Then for any $\mu \in \sigma(A)$, we have either $\mu \in \sigma(D)$, or

$$\sum_{j,k=1}^{\infty} \frac{c_k d_j}{\rho^{k+j}(\mu, D)} \ge 1. \tag{2.3}$$

To prove this theorem we need the following lemma:

LEMMA 2.2. Let A be a bounded linear operator in H of the form

$$A = I + V + W, \tag{2.4}$$

where operators V and W are quasinilpotent. If, in addition, the condition

$$\theta_A \equiv \left\|\sum_{j,k=1}^{\infty} (-1)^{k+j} V^k W^j\right\| < 1 \tag{2.5}$$

is fulfilled, then operator A is boundedly invertible.

Proof. We have

$$A = I + V + W = (I + V)(I + W) - VW.$$

Since W and V are quasinilpotent, the operators I + V and I + W are invertible:

$$(I + V)^{-1} = \sum_{k=0}^{\infty} (-1)^k V^k, \qquad (I + W)^{-1} = \sum_{k=0}^{\infty} (-1)^k W^k.$$

Thus,

$$I + V + W = (I + V)\bigl[I - (I + V)^{-1}VW(I + W)^{-1}\bigr](I + W) = (I + V)(I - B_A)(I + W),$$

where $B_A = (I + V)^{-1}VW(I + W)^{-1}$. But

$$V(I + V)^{-1} = \sum_{k=1}^{\infty} (-1)^{k-1} V^k, \qquad W(I + W)^{-1} = \sum_{k=1}^{\infty} (-1)^{k-1} W^k.$$

So

$$B_A = \sum_{j,k=1}^{\infty} (-1)^{k+j} V^k W^j.$$

If (2.5) holds, then $\|B_A\| < 1$. So $A^{-1} = (I + W)^{-1}(I - B_A)^{-1}(I + V)^{-1}$ is bounded. ✷

Proof of Theorem 2.1. We have

$$A - \lambda I = D + W + V - \lambda = (D - \lambda)\bigl(I + (D - \lambda)^{-1}W + (D - \lambda)^{-1}V\bigr).$$

Let $\mu \ne a_{mm}$ for all natural m. Due to (2.1) and (2.2), operators $(D - \mu)^{-1}V$ and $(D - \mu)^{-1}W$ are quasinilpotent and

$$\left\|\sum_{j,k=1}^{\infty} (-1)^{k+j} \bigl((D - \mu)^{-1}V\bigr)^k \bigl((D - \mu)^{-1}W\bigr)^j\right\| \le \sum_{j,k=1}^{\infty} \frac{c_k d_j}{\rho^{k+j}(\mu, D)}.$$

Assume that

$$\sum_{j,k=1}^{\infty} \frac{c_k d_j}{\rho^{k+j}(\mu, D)} < 1.$$

Then due to the previous lemma, A − µI is invertible. This contradiction proves the required result. ✷
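The factorization used in the proof of Lemma 2.2 can be verified exactly on a small example. The sketch below is an illustration under stated assumptions: it works with 3 × 3 integer matrices, takes V strictly upper triangular and W strictly lower triangular (so both are nilpotent and the double series is a finite sum), and all helper names are hypothetical.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def matadd(X, Y, sign=1):
    return [[x + sign * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def matpow(X, k):
    R = identity(len(X))
    for _ in range(k):
        R = matmul(R, X)
    return R

def b_operator(V, W):
    """B_A = sum_{j,k>=1} (-1)^(k+j) V^k W^j; finite because V^n = W^n = 0."""
    n = len(V)
    B = [[0] * n for _ in range(n)]
    for k in range(1, n):
        for j in range(1, n):
            B = matadd(B, matmul(matpow(V, k), matpow(W, j)), (-1) ** (k + j))
    return B

def factorized(V, W):
    """(I + V)(I - B_A)(I + W), which should reproduce I + V + W."""
    I = identity(len(V))
    return matmul(matmul(matadd(I, V), matadd(I, b_operator(V, W), -1)), matadd(I, W))

V = [[0, 1, 2], [0, 0, 3], [0, 0, 0]]  # strictly upper triangular, V^3 = 0
W = [[0, 0, 0], [4, 0, 0], [5, 6, 0]]  # strictly lower triangular, W^3 = 0
assert factorized(V, W) == matadd(matadd(identity(3), V), W)
```

Integer arithmetic is used so that the comparison is exact rather than up to rounding.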

Note that in the case of a triangular matrix Theorem 2.1 yields the exact result: $\sigma(A)$ is the closure of the set $\{a_{kk},\ k = 1, 2, \ldots\}$.

COROLLARY 2.3. Let $V \ne 0$ and $W \ne 0$. Then under conditions (2.1), (2.2), for any $\mu \in \sigma(A)$, $\rho(\mu, D) \le z(A)$, where z(A) is the unique positive root of the equation

$$\sum_{j,k=1}^{\infty} c_k d_j z^{-j-k} = 1. \tag{2.6}$$

Indeed, the required result is due to comparison of (2.3) with (2.6).

To estimate z(A), consider the equation

$$\sum_{k=1}^{\infty} b_k z^{-k} = 1, \tag{2.7}$$

where the coefficients $b_k$ are nonnegative and have the property

$$\theta_0 \equiv 2\sup_{j} \sqrt[j]{b_j} < \infty.$$

LEMMA 2.4. The unique positive root $z_0$ of Equation (2.7) satisfies the estimate $z_0 \le \theta_0$.


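Proof. Set in (2.7) $z = x^{-1}\theta_0$. We have

$$1 = \sum_{k=1}^{\infty} b_k \theta_0^{-k} x^k. \tag{2.8}$$

But

$$\sum_{k=1}^{\infty} b_k \theta_0^{-k} \le \sum_{k=1}^{\infty} 2^{-k} = 1$$

and therefore the unique positive root $x_0$ of (2.8) satisfies the inequality $x_0 \ge 1$. Hence, $z_0 = \theta_0 x_0^{-1} \le \theta_0$. As claimed. ✷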
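Lemma 2.4 can be tried out on a finite coefficient sequence. The sketch below is only an illustration: the bisection routine, the function names and the sample coefficients are assumptions, and the list `b` stands for $b_1, b_2, \ldots$ with all later coefficients equal to zero.

```python
def series_minus_one(b, z):
    """sum_{k>=1} b[k-1] * z**(-k) - 1 for a finite list of nonnegative coefficients."""
    return sum(bk * z ** (-(k + 1)) for k, bk in enumerate(b)) - 1.0

def positive_root(b, iterations=200):
    """Bisection for the unique positive root; the series is decreasing in z."""
    lo = hi = 1.0
    while series_minus_one(b, lo) <= 0:  # move left until the series exceeds 1
        lo /= 2
    while series_minus_one(b, hi) >= 0:  # move right until the series drops below 1
        hi *= 2
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if series_minus_one(b, mid) > 0:
            lo = mid
        else:
            hi = mid
    return hi

def theta0(b):
    """theta_0 = 2 * sup_j b_j**(1/j)."""
    return 2 * max(bk ** (1.0 / (k + 1)) for k, bk in enumerate(b))

b = [0.5, 2.0, 0.0, 3.0]  # hypothetical coefficients b_1..b_4
z0 = positive_root(b)
assert abs(series_minus_one(b, z0)) < 1e-9
assert z0 <= theta0(b)
```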

Note that the latter lemma generalizes the well-known result for algebraic equations. Rewrite Equation (2.6) as

$$\sum_{k=2}^{\infty} b_k z^{-k} = 1 \quad \text{with} \quad b_j = \sum_{m=1}^{j-1} c_{j-m} d_m \quad (j \ge 2). \tag{2.9}$$

Now the previous lemma gives

$$z(A) \le \psi(A) \equiv 2\sup_{j \ge 2} \sqrt[j]{b_j}. \tag{2.10}$$

3. Finite Matrices

In this section $A = (a_{jk})$ is a complex (n × n)-matrix ($2 \le n < \infty$) and $\|\cdot\|$ is the Euclidean norm in a Euclidean space $\mathbb{C}^n$. Introduce the numbers

$$\gamma_{n,p} = \sqrt{\frac{C^p_{n-1}}{(n-1)^p}} \quad (p = 1, \ldots, n-1) \quad \text{and} \quad \gamma_{n,0} = 1.$$

Here

$$C^p_{n-1} = \frac{(n-1)!}{(n-p-1)!\,p!}$$

are the binomial coefficients. Evidently, for n > 2

$$\gamma^2_{n,p} = \frac{(n-2)(n-3)\cdots(n-p)}{(n-1)^{p-1}\,p!} \le \frac{1}{p!} \quad (p = 1, 2, \ldots, n-1).$$
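The bound on $\gamma^2_{n,p}$ is easy to confirm with exact rational arithmetic. A minimal sketch, assuming the definition of $\gamma_{n,p}$ given above; the function name `gamma_squared` is hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def gamma_squared(n, p):
    """gamma_{n,p}^2 = C(n-1, p) / (n-1)^p, with gamma_{n,0} = 1."""
    if p == 0:
        return Fraction(1)
    return Fraction(comb(n - 1, p), (n - 1) ** p)

# Check gamma_{n,p}^2 <= 1/p! for a range of matrix sizes.
for n in range(2, 13):
    for p in range(1, n):
        assert gamma_squared(n, p) <= Fraction(1, factorial(p))
```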

For an (n × n)-matrix B, denote

$$j_n(B) = \sum_{k=0}^{n-1} \gamma_{n,k} N^k(B),$$

where N(B) is the Frobenius (Hilbert–Schmidt) norm of B: $N^2(B) = \operatorname{Trace} B^*B$.


THEOREM 3.1. Let A be an (n × n)-matrix. Then for any $\mu \in \sigma(A)$ there is an integer $m \le n$, such that either $\mu = a_{mm}$, or

$$\sum_{j,k=1}^{n-1} \frac{\gamma_{n,j}\gamma_{n,k} N^k(V) N^j(W)}{|\mu - a_{mm}|^{k+j}} \ge 1. \tag{3.1}$$

Proof. Due to Lemma 17.1.2 [4], for any nilpotent matrix V

$$\|V^j\| \le \gamma_{n,j} N^j(V). \tag{3.2}$$

Clearly, $(D - \lambda I)^{-1}V$ is a nilpotent operator. Due to (3.2)

$$\|((D - \lambda I)^{-1}V)^j\| \le \gamma_{n,j} N^j((D - \lambda I)^{-1}V) \le \gamma_{n,j}\|(D - \lambda I)^{-1}\|^j N^j(V) \le \gamma_{n,j} N^j(V)\rho^{-j}(\lambda, D).$$

Similarly,

$$\|((D - \lambda I)^{-1}W)^j\| \le \gamma_{n,j} N^j(W)\rho^{-j}(\lambda, D).$$

Now the required result follows from Theorem 2.1. ✷

COROLLARY 3.2. Let $z_n(A)$ be the unique nonnegative root of the algebraic equation

$$\sum_{j,k=1}^{n-1} \gamma_{n,k}\gamma_{n,j}\, z^{2(n-1)-j-k} N^j(V) N^k(W) = z^{2(n-1)}. \tag{3.4}$$

Then for any eigenvalue µ of A there is an integer $m \le n$, such that

$$|\mu - a_{mm}| \le z_n(A) \le \psi_n(A), \tag{3.5}$$

where

$$\psi_n(A) = 2\max_{j \le n} \sqrt[j]{\sum_{k=1}^{j-1} N^{j-k}(V) N^k(W)\gamma_{n,k}\gamma_{n,j-k}}.$$

In particular, the spectral radius of A satisfies the inequalities

$$r_s(A) \le \max_{k=1,\ldots,n} |a_{kk}| + z_n(A) \le \max_{k=1,\ldots,n} |a_{kk}| + \psi_n(A). \tag{3.6}$$

Indeed, Equation (3.4) is equivalent to

$$\sum_{j,k=1}^{n-1} \frac{\gamma_{n,j}\gamma_{n,k} N^k(V) N^j(W)}{z^{j+k}} = 1.$$

The required result is now due to Corollary 2.3 and inequality (2.10).

Note that inequalities (3.6) improve the Frobenius estimate

$$r_s(A) \le \max_{j=1,\ldots,n} \sum_{k=1}^{n} |a_{jk}|,$$

cf. [9], Section 2.2.1, if

$$z_n(A) < \max_{j=1,\ldots,n} \sum_{k=1,\,k \ne j}^{n} |a_{jk}| \quad \text{or} \quad \psi_n(A) < \sup_{j=1,\ldots,n} \sum_{k=1,\,k \ne j}^{n} |a_{jk}|.$$

Put

$$P_j = \sum_{k=1,\,k \ne j}^{n} |a_{jk}| \quad (j = 1, \ldots, n).$$

As is well known, the spectrum of A lies in the union of the Cassini ovals

$$\{\mu \in \mathbb{C} : |(a_{ii} - \mu)(a_{jj} - \mu)| \le P_j P_i\} \quad (i \ne j;\ i, j = 1, \ldots, n)$$

([9], Ch. 3, Section 2.4.2). Let A be upper-triangular: $a_{jk} = 0$ (j > k). Then Theorem 3.1 gives the exact result $\sigma(A) = \bigcup_{j=1}^{n} a_{jj}$. At the same time, if n > 2, the Cassini ovals give us the greater set. So Theorem 3.1 improves the mentioned result for matrices which are ‘near’ to the triangular ones.
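For n = 2 the bound of Corollary 3.2 can be written in closed form, since $\gamma_{2,1} = 1$ and (3.4) reduces to $N(V)N(W) = z^2$. The sketch below checks the resulting bound on a hypothetical 2 × 2 matrix with entries a, b, c, d (first row a, b; second row c, d); the function names and the sample entries are assumptions.

```python
import cmath
import math

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from the characteristic polynomial."""
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

def z2(b, c):
    """Root of (3.4) for n = 2: z^2 = N(V) N(W) = |b| |c|."""
    return math.sqrt(abs(b) * abs(c))

a, b, c, d = 1 + 2j, 3, -0.5j, -2  # hypothetical sample entries
z = z2(b, c)
for mu in eigenvalues_2x2(a, b, c, d):
    # Each eigenvalue is within z of some diagonal entry, as in (3.5).
    assert min(abs(mu - a), abs(mu - d)) <= z + 1e-12
    # Spectral radius bound, as in the first inequality of (3.6).
    assert abs(mu) <= max(abs(a), abs(d)) + z + 1e-12
```

In this case the bound is elementary: $(\mu - a)(\mu - d) = bc$, so the smaller of the two distances cannot exceed $\sqrt{|bc|}$.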

4. Matrices with Hilbert–Schmidt Off-Diagonals

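Again, let $N(\cdot)$ be the Hilbert–Schmidt norm. Assume that

$$N^2(V) = \sum_{j=-\infty}^{\infty} \sum_{k=j+1}^{\infty} |a_{jk}|^2 < \infty, \qquad N^2(W) = \sum_{j=-\infty}^{\infty} \sum_{k=-\infty}^{j-1} |a_{jk}|^2 < \infty. \tag{4.1}$$

That is, V and W are Hilbert–Schmidt operators (HSO).

LEMMA 4.1. Under condition (4.1), let $\mu \in \sigma(A)$. Then either $\mu \in \sigma(D)$ or

$$\sum_{j,k=1}^{\infty} \frac{N^k(V) N^j(W)}{\rho^{j+k}(\mu, D)\sqrt{j!\,k!}} \ge 1. \tag{4.2}$$

Proof. Due to Lemma 2.2.1 ([3]),

$$\|V_0^j\| \le \frac{N^j(V_0)}{\sqrt{j!}} \tag{4.3}$$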


for any quasinilpotent HSO $V_0$. But under (4.1), the operator $(D - \lambda I)^{-1}V$ is a quasinilpotent HSO for any regular point λ of D. Thus

$$\|((D - \lambda I)^{-1}V)^j\| \le \frac{N^j((D - \lambda I)^{-1}V)}{\sqrt{j!}} \le \frac{\|(D - \lambda I)^{-1}\|^j N^j(V)}{\sqrt{j!}} \le \frac{N^j(V)}{\rho^j(\lambda, D)\sqrt{j!}}.$$

Similarly,

$$\|((D - \lambda I)^{-1}W)^j\| \le \frac{N^j(W)}{\rho^j(\lambda, D)\sqrt{j!}}. \tag{4.4}$$

Now the required result follows from Theorem 2.1. ✷

Furthermore, according to relations (4.3), (4.4), Corollary 2.3 and inequality (2.10) yield:

THEOREM 4.2. Under condition (4.1), let $V \ne 0$, $W \ne 0$. Then any $\mu \in \sigma(A)$ satisfies the inequality $\rho(\mu, D) \le z_H(A)$, where $z_H(A)$ is the unique positive root of the equation

$$\sum_{j,k=1}^{\infty} \frac{N^k(V) N^j(W)}{z^{k+j}\sqrt{k!\,j!}} = 1, \tag{4.5}$$

i.e., $\sigma(A)$ lies in the closure of the union of the discs

$$\{z \in \mathbb{C} : |z - a_{kk}| \le z_H(A)\} \quad (k = 1, 2, \ldots).$$

Besides, $z_H(A) \le \psi_H(A)$, where

$$\psi_H(A) = 2\sup_{j=1,2,\ldots} \sqrt[j]{\sum_{k=1}^{j-1} \frac{N^{j-k}(V) N^k(W)}{\sqrt{k!\,(j-k)!}}}.$$

Under (4.1), Theorem 4.2 gives

$$r_s(A) \le \sup_{k=1,2,\ldots} |a_{kk}| + z_H(A) \le \sup_{k=1,2,\ldots} |a_{kk}| + \psi_H(A). \tag{4.6}$$

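Let

$$\sup_{j=1,2,\ldots} \sum_{k=1}^{\infty} |a_{jk}| < \infty. \tag{4.7}$$

Then the following well-known estimate is valid:

$$r_s(A) \le \sup_{j=1,2,\ldots} \sum_{k=1}^{\infty} |a_{jk}| \tag{4.8}$$

([7], Theorem 13.2). So under (4.7), inequalities (4.6) improve (4.8), provided that

$$z_H(A) < \sup_{j=1,2,\ldots} \sum_{k=1,\,k \ne j}^{\infty} |a_{jk}| \quad \text{or} \quad \psi_H(A) \le \sup_{j=1,2,\ldots} \sum_{k=1,\,k \ne j}^{\infty} |a_{jk}|.$$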

5. Matrices with Neumann–Schatten Off-Diagonals

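In this section it is assumed that V and W belong to the Neumann–Schatten ideal $C_{2r}$ for some integer r > 1, i.e.,

$$N_r(V) \equiv \bigl[\operatorname{Trace}\,(V^*)^r V^r\bigr]^{1/2r} < \infty \quad \text{and} \quad N_r(W) \equiv \bigl[\operatorname{Trace}\,(W^*)^r W^r\bigr]^{1/2r} < \infty. \tag{5.1}$$

So $N_1(\cdot) = N(\cdot)$ is the Hilbert–Schmidt norm.

LEMMA 5.1. Under condition (5.1), let $\mu \in \sigma(A)$. Then either $\mu \in \sigma(D)$, or

$$\sum_{i,m=0}^{r-1} \sum_{j,k=1}^{\infty} \frac{N_r^{i+rj}(W)\, N_r^{m+rk}(V)}{\rho^{m+i+r(j+k)}(\mu, D)\sqrt{j!\,k!}} \ge 1. \tag{5.2}$$

Proof. Due to Corollary 2.3.3 of [3], the operator $(D - \lambda I)^{-1}V$ with $\lambda \notin \sigma(D)$ is quasinilpotent and, clearly, it is in $C_{2r}$. Hence, it follows that $((D - \lambda I)^{-1}V)^r$ is a HSO. So according to (4.3)

$$\|((D - \lambda I)^{-1}V)^{rk}\| \le \frac{N_1^k([(D - \lambda I)^{-1}V]^r)}{\sqrt{k!}} \le \frac{N_r^{rk}((D - \lambda I)^{-1}V)}{\sqrt{k!}}.$$

Consequently,

$$\|((D - \lambda I)^{-1}V)^{m+rk}\| \le \frac{N_r^{m+rk}((D - \lambda I)^{-1}V)}{\sqrt{k!}} \le \frac{\|(D - \lambda I)^{-1}\|^{m+rk} N_r^{m+rk}(V)}{\sqrt{k!}} \le \frac{N_r^{m+rk}(V)}{\rho^{m+rk}(\lambda, D)\sqrt{k!}} \tag{5.3}$$

for m < r, k = 1, 2, . . . . Similarly,

$$\|((D - \lambda I)^{-1}W)^{m+rk}\| \le \frac{N_r^{m+rk}(W)}{\rho^{m+rk}(\lambda, D)\sqrt{k!}}. \tag{5.4}$$

Now the required result is due to Theorem 2.1. ✷

Furthermore, according to relations (5.3), (5.4) and Corollary 2.3 we get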


THEOREM 5.2. Under condition (5.1), let $V \ne 0$, $W \ne 0$. Then any $\mu \in \sigma(A)$ satisfies the inequality $\rho(\mu, D) \le z_r(A)$, where $z_r(A)$ is the unique positive root of the equation

$$\sum_{i,m=0}^{r-1} \sum_{j,k=1}^{\infty} \frac{N_r^{rk+i}(V)\, N_r^{rj+m}(W)}{z^{i+m+r(k+j)}\sqrt{k!\,j!}} = 1.$$

That is, $\sigma(A)$ lies in the closure of the union of the discs

$$\{z \in \mathbb{C} : |z - a_{kk}| \le z_r(A)\} \quad (k = 1, 2, \ldots).$$

Clearly, Lemma 2.4 gives us an estimate for $z_r(A)$ and the latter theorem gives the estimate for the spectral radius, which is similar to (4.6).

6. The Nonlinear Spectrum

To apply our results to unbounded operators, in this section we consider the nonlinear spectrum.

Let now the diagonal operator depend on a complex parameter λ: D = D(λ) and $a_{jj} = a_j(\lambda)$, where $a_j(\lambda)$ (j = 1, 2, . . .) are entire functions. Put A(λ) = D(λ) + W + V, where V and W are, as in (1.2), constant upper and lower triangular quasinilpotent matrices. So in a given orthogonal normed basis $\{e_k\}$

$$(A(\lambda)e_j, e_j) = a_j(\lambda), \qquad (A(\lambda)e_j, e_k) = a_{jk} = \text{const} \quad (j \ne k).$$

We will say that λ is a regular value of $A(\cdot)$ if A(λ) has a bounded inverse operator. The complement of the set of all regular values to the closed complex plane is the spectrum of $A(\cdot)$ and is denoted by $\sigma(A(\cdot))$. It is simple to see that the spectrum $\sigma(D(\cdot))$ of the diagonal matrix $D(\cdot)$ coincides with the closure of the set of all the roots of $a_j(\lambda)$, j = 1, 2, . . . . For a regular λ of $D(\cdot)$, $\|D^{-1}(\lambda)\| = 1/\rho_0(\lambda)$, where

$$\rho_0(\lambda) \equiv \inf_{k=1,2,\ldots} |a_k(\lambda)|.$$

Assume that for a regular λ of $D(\cdot)$ and any natural k,

$$\|[D^{-1}(\lambda)V]^k\| \le c_k\,\rho_0^{-k}(\lambda), \qquad \|[D^{-1}(\lambda)W]^k\| \le d_k\,\rho_0^{-k}(\lambda). \tag{6.1}$$

LEMMA 6.1. Under conditions (6.1) and (2.2), let $\mu \in \sigma(A(\cdot))$. Then, either $\mu \in \sigma(D(\cdot))$, or

$$\sum_{j,k=1}^{\infty} \frac{c_k d_j}{\rho_0^{k+j}(\mu)} \ge 1. \tag{6.2}$$

The proof of this lemma is exactly the same as the proof of Theorem 2.1.


THEOREM 6.2. Under conditions (6.1), let $V \ne 0$, $W \ne 0$ and z(A) be the unique positive root of Equation (2.6). Then for any $\mu \in \sigma(A(\cdot))$, the inequality $\rho_0(\mu) \le z(A)$ is true.

Proof. The required result is due to comparison of (6.2) with (2.6). ✷

Furthermore, Lemma 6.1 and inequality (3.2) yield the following corollary:

COROLLARY 6.3. Let W and V be HSO. Then for any $\mu \in \sigma(A(\cdot))$, either there is an integer m, such that $a_m(\mu) = 0$, or

$$\sum_{j,k=1}^{\infty} \frac{N^k(V) N^j(W)}{\rho_0^{j+k}(\mu)\sqrt{j!\,k!}} \ge 1.$$

Moreover, Theorem 6.2 implies

COROLLARY 6.4. Let $W \ne 0$ and $V \ne 0$ be HSO. Then $\rho_0(\mu) \le z_H(A) \le \psi_H(A)$ for any $\mu \in \sigma(A(\cdot))$.

We recall that $z_H(A)$ and $\psi_H(A)$ are defined in Section 4.

In the case (5.1) we also can easily derive results similar to Corollaries 6.3 and 6.4.

7. Spectrum Localization of Unbounded Operators

Again, let $\{e_k\}_{k=1}^{\infty}$ be an orthogonal normed basis in H. Introduce a normal unbounded operator S by

$$Sh = \sum_{k=1}^{\infty} \lambda_k(S)(h, e_k)e_k \quad (h \in \operatorname{Dom}(S)) \tag{7.1}$$

with the eigenvalues $\lambda_k(S)$ and a domain Dom(S), assuming that S has the compact inverse operator

$$S^{-1} = \sum_{k=1}^{\infty} \lambda_k^{-1}(S)(\cdot, e_k)e_k.$$

Let Q be a linear generally unbounded operator in H, represented by a matrix

$$(q_{jk})_{j,k=1}^{\infty}; \qquad q_{jk} = (Qe_j, e_k)$$

with the zero diagonal

$$(Qe_j, e_j) = q_{jj} = 0 \quad (j = 1, 2, \ldots).$$

It is assumed that

$$QS^{-1} \text{ is a compact operator.} \tag{7.2}$$


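In the sequel we investigate the operator $T \equiv S + Q$. Define operators V and W by

$$\begin{aligned}
&(Ve_k, e_j) = q_{jk}\lambda_k^{-1}(S) \ \text{if } j < k, \quad \text{and} \quad (Ve_k, e_j) = 0 \ \text{if } j \ge k;\\
&(We_k, e_j) = q_{jk}\lambda_k^{-1}(S) \ \text{if } j > k, \quad \text{and} \quad (We_k, e_j) = 0 \ \text{if } j \le k.
\end{aligned} \tag{7.3}$$

Due to (7.2), operators V and W defined by (7.3) are compact. Denote

$$D(\lambda) = \operatorname{diag}[a_1(\lambda), a_2(\lambda), \ldots] \quad \text{with} \quad a_j(\lambda) = 1 - \lambda\lambda_j^{-1}(S). \tag{7.4}$$

Obviously,

$$T - \lambda I = S + Q - \lambda I = (I + QS^{-1} - S^{-1}\lambda)S = [D(\lambda) + W + V]S. \tag{7.5}$$

Put $A(\lambda) = I + QS^{-1} - S^{-1}\lambda = D(\lambda) + W + V$. So under the consideration $\rho_0(\lambda) = \rho_S(\lambda)$ where

$$\rho_S(\lambda) \equiv \inf_{k=1,2,\ldots} |1 - \lambda\lambda_k^{-1}(S)|.$$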
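Relation (7.5) can be checked exactly in a finite-dimensional analogue. The sketch below is illustrative only: S is replaced by a hypothetical 3 × 3 diagonal matrix with eigenvalues `eigs`, `Q[j][k]` denotes the entry in row j and column k of a zero-diagonal matrix (an assumption about the indexing convention), and exact rational arithmetic is used.

```python
from fractions import Fraction

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def build_factors(eigs, Q, lam):
    """Return (A_lam, S) with A_lam = I + Q S^{-1} - lam S^{-1} and S = diag(eigs)."""
    n = len(eigs)
    S = [[eigs[j] if j == k else Fraction(0) for k in range(n)] for j in range(n)]
    A_lam = [[(1 - lam / eigs[k]) if j == k else Q[j][k] / eigs[k]
              for k in range(n)] for j in range(n)]
    return A_lam, S

def rho_S(lam, eigs):
    """Finite analogue of rho_S(lam) = inf_k |1 - lam / lambda_k(S)|."""
    return min(abs(1 - lam / e) for e in eigs)

eigs = [Fraction(2), Fraction(3), Fraction(5)]  # hypothetical eigenvalues of S
Q = [[0, 1, -2], [3, 0, 4], [-1, 2, 0]]          # hypothetical zero-diagonal Q
lam = Fraction(7, 3)
A_lam, S = build_factors(eigs, Q, lam)
T_minus_lam = [[(eigs[j] - lam) if j == k else Fraction(Q[j][k]) for k in range(3)]
               for j in range(3)]
assert matmul(A_lam, S) == T_minus_lam  # finite analogue of (7.5)
assert rho_S(lam, eigs) == Fraction(1, 6)
```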

Now Lemma 6.1 implies:

LEMMA 7.1. With notations (7.3) and (7.4), let conditions (6.1) and (2.2) be fulfilled. Then, for any $\mu \in \sigma(T)$, either $\mu \in \sigma(S)$, or

$$\sum_{j,k=1}^{\infty} \frac{c_k d_j}{\rho_S^{k+j}(\mu)} \ge 1.$$

This result is due to Lemma 6.1 and relation (7.5).

THEOREM 7.2. With notations (7.3) and (7.4), let conditions (6.1) and (2.2) be fulfilled. Then for any $\mu \in \sigma(T)$, the inequalities $\rho_S(\mu) \le z(A) \le \psi(A)$ are true, where z(A) is the unique positive root of (2.6) and ψ(A) is defined by (2.10).

This result is due to Theorem 6.2.

The latter theorem and relations (4.3), (4.4) yield:

COROLLARY 7.3. With notations (7.3), let $W \ne 0$, $V \ne 0$ be HSO. Then $\rho_S(\mu) \le z_H(A)$ for any $\mu \in \sigma(T)$, where $z_H(A)$ is the unique positive root of Equation (4.5). That is, $\sigma(T)$ lies in the closure of the union of the sets

$$\{\mu \in \mathbb{C} : |1 - \mu\lambda_k^{-1}(S)| \le z_H(A)\} \quad (k = 1, 2, \ldots).$$

Besides, according to (2.10), $z_H(A) \le \psi_H(A)$, where $\psi_H(A)$ is defined in Theorem 4.2.

In the case (5.1) we also can easily derive results similar to Corollary 7.3.


8. Examples

8.1. AN INTEGRAL OPERATOR

In space $H = L^2[0, 1]$, let us consider an operator A defined by

$$(Au)(x) = u(x) + \int_0^1 K(x, s)u(s)\,ds \quad (0 \le x \le 1), \tag{8.1}$$

where K is a complex valued Hilbert–Schmidt kernel. Take the orthonormal basis

$$e_k(x) = e^{2\pi i k x} \quad (0 \le x \le 1,\ k = 0, \pm 1, \pm 2, \ldots). \tag{8.2}$$

Let

$$K(x, s) = \sum_{j,k=-\infty}^{\infty} b_{jk}\, e_k(x) e_j(s) \tag{8.3}$$

be the Fourier expansion of K with the Fourier coefficients $b_{jk}$. Obviously,

$$a_{jk} \equiv (Ae_k, e_j) = b_{jk} \quad (j \ne k), \quad \text{and} \quad a_{jj} \equiv (Ae_j, e_j) = 1 + b_{jj}.$$

Assume that $\inf_j |1 + b_{jj}| > 0$. According to (1.2) under consideration we have

$$N^2(V) = \sum_{j=-\infty}^{\infty} \sum_{k=j+1}^{\infty} |b_{jk}|^2 < \infty, \qquad N^2(W) = \sum_{j=-\infty}^{\infty} \sum_{k=-\infty}^{j-1} |b_{jk}|^2 < \infty.$$

Now we can apply Theorem 4.2.

8.2. A NONSELFADJOINT DIFFERENTIAL OPERATOR

In space $H = L^2[0, 1]$, let us consider an operator T defined by

$$(Tu)(x) = -\frac{d^2u(x)}{dx^2} + w(x)\frac{du(x)}{dx} + l(x)u(x) \quad (0 < x < 1,\ u \in \operatorname{Dom}(T)) \tag{8.4}$$

with the domain

$$\operatorname{Dom}(T) = \{h \in L^2[0, 1] : h'' \in L^2[0, 1],\ h(0) = h(1),\ h'(0) = h'(1)\}. \tag{8.5}$$

Here $w(\cdot), l(\cdot) \in L^2[0, 1]$ are bounded scalar-valued functions. So the periodic boundary conditions

$$u(0) = u(1), \qquad u'(0) = u'(1) \tag{8.6}$$


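are imposed. With orthonormal basis (8.2), let

$$l = \sum_{k=-\infty}^{\infty} l_k e_k \quad \text{and} \quad w = \sum_{k=-\infty}^{\infty} w_k e_k \qquad (w_k = (w, e_k),\ l_k = (l, e_k)) \tag{8.7}$$

be the Fourier expansions of l and w, respectively. Omitting simple calculations, we have

$$(Te_k, e_j) = i\pi k\,w_{j-k} + l_{j-k} \quad (k \ne j)$$

and

$$(Te_k, e_k) = \pi^2k^2 + i\pi k\,w_0 + l_0.$$

Take Dom(S) = Dom(T) and define operator S by (7.1) with eigenfunctions (8.2) and

$$\lambda_k(S) = \pi^2k^2 + i\pi k\,w_0 + l_0,$$

assuming that $\inf_k |\lambda_k(S)| > 0$. Take a matrix $Q = (q_{jk})_{j,k=-\infty}^{\infty}$ with

$$q_{jk} = i\pi k\,w_{j-k} + l_{j-k} \quad (j \ne k) \quad \text{and} \quad q_{jj} = 0.$$

According to (7.4), under consideration, we have

$$\begin{aligned}
N^2(V) &= \sum_{k=-\infty}^{\infty} \sum_{j=-\infty}^{k-1} \bigl|(\pi^2k^2 + i\pi k\,w_0 + l_0)^{-1}(i\pi k\,w_{j-k} + l_{j-k})\bigr|^2 \\
&= \sum_{k=-\infty}^{\infty} \sum_{m=-\infty}^{-1} \bigl|(\pi^2k^2 + i\pi k\,w_0 + l_0)^{-1}(i\pi k\,w_m + l_m)\bigr|^2 \\
&\le 2\sum_{k=-\infty}^{\infty} |\pi^2k^2 + i\pi k\,w_0 + l_0|^{-2}\,\pi^2|k|^2 \sum_{m=-\infty}^{-1} |w_m|^2 + 2\sum_{k=-\infty}^{\infty} \bigl|(\pi^2k^2 + i\pi k\,w_0 + l_0)^{-1}\bigr|^2 \sum_{m=-\infty}^{-1} |l_m|^2 < \infty
\end{aligned}$$

since $w, l \in L^2$. Similarly,

$$N^2(W) = \sum_{k=-\infty}^{\infty} \sum_{j=k+1}^{\infty} \bigl|(\pi^2k^2 + i\pi k\,w_0 + l_0)^{-1}(i\pi k\,w_{j-k} + l_{j-k})\bigr|^2 < \infty.$$

So V and W are HSO. Now we can apply Corollary 7.3.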


8.3. AN INTEGRO-DIFFERENTIAL OPERATOR

In space $H = L^2[0, 1]$ consider the operator

$$(Tu)(x) = -\frac{d^2u(x)}{dx^2} + w(x)u(x) + \int_0^1 K(x, s)u(s)\,ds \quad (u \in \operatorname{Dom}(T),\ 0 < x < 1) \tag{8.8}$$

with the domain defined by (8.5). So the periodic boundary conditions (8.6) hold. Again, K is a Hilbert–Schmidt kernel and w is a bounded scalar-valued function. Take the orthonormal basis (8.2). Let (8.3) and (8.7) be the Fourier expansions of K and of w, respectively. Obviously,

$$(Te_k, e_j) = w_{j-k} + b_{jk} \quad (j \ne k) \quad \text{and} \quad (Te_k, e_k) = \pi^2k^2 + w_0 + b_{kk}.$$

Define S by (7.1) with eigenfunctions (8.2) and

$$\lambda_k(S) = \pi^2k^2 + w_0 + b_{kk},$$

assuming that $\lambda_k(S) \ne 0$ for any integer k. Take a matrix $Q = (q_{jk})_{j,k=-\infty}^{\infty}$ with

$$q_{jk} = w_{j-k} + b_{jk} \quad (j \ne k) \quad \text{and} \quad q_{jj} = 0.$$

According to (7.4)

$$N^2(V) = \sum_{k=-\infty}^{\infty} \sum_{j=-\infty}^{k-1} \bigl|(\pi^2k^2 + w_0 + b_{kk})^{-1}(w_{j-k} + b_{jk})\bigr|^2 < \infty.$$

Similarly,

$$N^2(W) = \sum_{k=-\infty}^{\infty} \sum_{j=k+1}^{\infty} \bigl|(\pi^2k^2 + w_0 + b_{kk})^{-1}(w_{j-k} + b_{jk})\bigr|^2 < \infty.$$

Thus, V and W are Hilbert–Schmidt operators. Now we can apply Corollary 7.3.

References

1. Edmunds, D. E. and Evans, W. D.: Spectral Theory and Differential Operators, Clarendon Press, Oxford, 1990.

2. Egorov, Y. and Kondratiev, V.: Spectral Theory of Elliptic Operators, Birkhäuser-Verlag, Basel, 1996.

3. Gil’, M. I.: Norm Estimations for Operator-Valued Functions and Applications, Marcel Dekker, New York, 1995.

4. Gil’, M. I.: Stability of Finite and Infinite Dimensional Systems, Kluwer Acad. Publ., Dordrecht, 1998.

5. Horn, R. A. and Johnson, Ch. R.: Topics in Matrix Analysis, Cambridge Univ. Press, Cambridge, 1991.

6. König, H.: Eigenvalue Distribution of Compact Operators, Birkhäuser-Verlag, Basel, 1986.

7. Krasnosel’skii, M. A., Lifshits, J. and Sobolev, A.: Positive Linear Systems. The Method of Positive Operators, Heldermann-Verlag, Berlin, 1989.

8. Locker, J.: Spectral Theory of Nonselfadjoint Two Point Differential Operators, Math. Surveys Monogr. 73, Amer. Math. Soc., Providence, 1999.

9. Marcus, M. and Minc, H.: A Survey of Matrix Theory and Matrix Inequalities, Allyn and Bacon, Boston, 1964.

10. Pietsch, A.: Eigenvalues and s-Numbers, Cambridge Univ. Press, Cambridge, 1987.

11. Prössdorf, S.: Linear Integral Equations, Itogi Nauki i Tekhniki 27, VINITI, Moscow, 1998 (Russian).

Mathematical Physics, Analysis and Geometry 4: 395–396, 2001.

Contents of Volume 4

Volume 4 No. 1 2001

VLADIMIR VASILCHUK / On the Law of Multiplication of Random Matrices 1–36

A. BEN AMOR and PH. BLANCHARD / Smoothing Properties of the Heat Semigroups Associated to Hamiltonians Describing Point Interactions in One and Two Dimensions 37–49

RANIS N. IBRAGIMOV / On the Tidal Motion Around the Earth Complicated by the Circular Geometry of the Ocean’s Shape Without Coriolis Forces 51–63

TAMARA GRAVA / From the Solution of the Tsarev System to the Solution of the Whitham Equations 65–96

Volume 4 No. 2 2001

ABEL KLEIN and ANDREW KOINES / A General Framework for Localization of Classical Waves: I. Inhomogeneous Media and Defect Eigenmodes 97–130

ATTILIO MEUCCI / Toda Equations, bi-Hamiltonian Systems, and Compatible Lie Algebroids 131–146

SERGEI KUKSIN and ARMEN SHIRIKYAN / Ergodicity for the Randomly Forced 2D Navier–Stokes Equations 147–195

Volume 4 No. 3 2001

NAKAO HAYASHI and PAVEL NAUMKIN / On the Modified Korteweg–De Vries Equation 197–227

R. DEL RIO and B. GRÉBERT / Inverse Spectral Results for AKNS Systems with Partial Information on the Potentials 229–244

DAVIDE GUZZETTI / Inverse Problem and Monodromy Data for Three-Dimensional Frobenius Manifolds 245–291

Volume 4 No. 4 2001

DAVIDE GUZZETTI / On the Critical Behavior, the Connection Problem and the Elliptic Representation of a Painlevé VI Equation 293–377

M.I. GIL’ / Spectrum Localization of Infinite Matrices 379–394
