Analysis and Ergodic Theory Summer School, Lake Arrowhead September 17th - September 22nd 2006 Organizers: Ciprian Demeter, University of California, Los Angeles Christoph Thiele, University of California, Los Angeles * supported by NSF grant DMS 0400879 1


Analysis and Ergodic Theory

Summer School, Lake Arrowhead ∗

September 17th - September 22nd 2006

Organizers:

Ciprian Demeter, University of California, Los Angeles

Christoph Thiele, University of California, Los Angeles

∗supported by NSF grant DMS 0400879


Contents

1 The primes contain arbitrarily long arithmetic progressions II
  Tim Austin, UCLA
  1.1 Orienting remarks
  1.2 Warmup: two tricks to help handle the primes
    1.2.1 The W-trick
    1.2.2 Dealing with wraparound
    1.2.3 Our goal, modified
  1.3 The actual construction
  1.4 Proving pseudorandomness

2 A generalization of Birkhoff’s pointwise ergodic theorem
  Svetlana Butler, UIUC
  2.1 Introduction
  2.2 Main results
  2.3 Remarks
  2.4 Outline of the proofs

3 A pointwise ergodic theorem for amenable groups
  S. Zubin Gautam, UCLA
  3.1 Preliminaries
  3.2 Statement of the main result
  3.3 Random selection of Følner sets
    3.3.1 Proof sketch of Lemma 5 for G discrete
    3.3.2 Proof sketch of Lemma 5 for general G
  3.4 Maximal inequality and pointwise ergodic theorem
  3.5 Example: The lamplighter group

4 Arithmetic progressions in primes I
  Alexander Gorodnik, CalTech
  4.1 Introduction
    4.1.1 Notation
    4.1.2 Pseudorandom measures
  4.2 Tools
    4.2.1 Gowers uniformity norms
    4.2.2 Obstructions to uniformity and dual functions
    4.2.3 σ-Algebras generated by generalized Bohr sets
  4.3 Structure theorem

5 Polynomial extensions of van der Waerden’s and Szemerédi’s theorems
  Michael Johnson, Northwestern University
  5.1 Introduction
  5.2 Polynomial Expressions
  5.3 Theorem 4
  5.4 Theorem 2

6 Convergence of Conze-Lesigne Averages
  Tamara Kucherenko, UCLA
  6.1 Introduction
  6.2 Statement of results
  6.3 Reduction to a simpler system
    6.3.1 Characteristic factors
    6.3.2 Reduction to the isometric extension of the Kronecker
    6.3.3 Reduction to an abelian group extension
  6.4 Sketch of the proof of the main theorem

7 Pointwise Ergodic Theorems for Arithmetic Sets, part 2
  Victor Lie
  7.1 Introduction
  7.2 The $L^2$ case for Theorem 1
  7.3 The almost sure convergence of $A_N f$ for $f \in \ell^2(\mathbb{Z})$
  7.4 The $L^p$ case for Theorem 1
  7.5 The outlines of the Theorem 2

8 The Ergodic Theoretical Proof of Szemerédi’s Theorem II
  Anne E. McCarthy, Temple University
  8.1 Introduction
  8.2 Factors
    8.2.1 skew products
    8.2.2 projecting functions
    8.2.3 fiber squares (or relative squares)
  8.3 Maximal SZ Factors
  8.4 Relative Weak Mixing
  8.5 Compact Extensions and Existence

9 Multiple recurrence and Szemerédi’s theorem
  Richard Oberlin, UW Madison
  9.1 Introduction
  9.2 Theorem 2 implies Theorem 3
  9.3 Beginning the proof of Theorem 2
    9.3.1 Weak-mixing systems
    9.3.2 Compact systems
    9.3.3 Weak-mixing and compact factors

10 Entropy of Convolutions on the Circle, I
  Robert C. Rhoades, University of Wisconsin
  10.1 Furstenberg’s Conjecture
  10.2 Uniform Distribution and $c_n$-genericity
  10.3 The Role of Entropy

11 Entropy of convolutions on the circle
  Shuanglin Shao, UCLA
  11.1 Introduction
  11.2 Dimension of sum sets
  11.3 Uniform distribution on subgroups
  11.4 Entropy and subgroups
  11.5 The convolution theorem

12 Bourgain’s Entropy Estimates
  John Workman, Cornell
  12.1 Preliminaries
  12.2 The First Entropy Result
  12.3 The Second Entropy Result
  12.4 Application: Bellow’s Averages

13 Pointwise Ergodic Theorems for Arithmetic Sets, part 1
  Andrew Yingst, USC Columbia
  13.1 Introduction
  13.2 Reduction to Inequalities
  13.3 A Lemma

1 The primes contain arbitrarily long arithmetic progressions II

after B. Green and T. Tao [2]

A summary written by Tim Austin

Abstract

We cover the second half of Green and Tao’s proof of the existence of arbitrarily long arithmetic progressions of prime numbers, by constructing a function and pseudorandom measure suitably associated to the primes.

1.1 Orienting remarks

The proof of the Green-Tao Theorem breaks conveniently into two distinct stages:

1. First, it is shown how the conclusion of Szemerédi’s Theorem can be extended to cover also the cases of certain sets with asymptotically zero density, by assuming instead positive density with respect to a ‘measure’ satisfying certain conditions (wrapped up into a property called ‘pseudorandomness’);

2. Second, such a measure and function must actually be constructed with the function non-zero only on some set related to the primes in such a way that the presence of long arithmetic progressions in the primes themselves may be deduced. (Note that the slightly convoluted wording here is necessary: it will turn out that our function is not itself supported by the primes.)

Here we will be concerned chiefly with the second of the above stages, detailed in Sections 9 – 11 of [2]. Results from the first stage (Sections 1 – 8 of [2]) will mostly be assumed. One theme for this presentation is a pragmatic attitude towards the construction and use of a suitable pseudorandom measure for studying the primes. It seems that the exact construction of such a measure is somewhat arbitrary; the Green-Tao construction should be used because it is one for which existing techniques yield sufficiently good estimates to give a proof of the theorem, and not for any more high-minded reason. Indeed, Bernard Host [3] has recently given a slightly different construction based on a different methodology for some of these estimates.


1.2 Warmup: two tricks to help handle the primes

Even before constructing a particular function and measure, we can foresee certain issues when working with the primes.

1.2.1 The W-trick

Suppose that $f$ is some function $\mathbb{N} \to \mathbb{R}$ that is non-zero only on primes. As it stands there can be no $k$-pseudorandom measure majorizing $f$. This is because the primes “notice and avoid” certain congruence classes: if $a, q$ are not coprime, then there can be at most one prime $\equiv a \bmod q$. Thus for any $q$ the primes are concentrated into the $\phi(q)$ congruence classes modulo $q$ that are coprime to $q$, and so this choice of $f$, and hence also any $\nu \ge f$, would be concentrated on these congruence classes.

However, the $(m_0, t_0, L_0)$-linear forms condition forces $\nu$ to be equidistributed among different congruence classes, if $q$ is sufficiently small: for $q \le L_0$, simply apply the condition to the one linear form $\psi(x) = qx + b$ for any $b$. We circumvent this problem by looking for arithmetic progressions among the primes that occur within one fixed congruence class that is coprime to all $q \le L_0$: say among $\{n : Wn + 1 \text{ is a prime}\}$ for $W$ a product of all small primes.

In order to find $k$-term arithmetic progressions, we need the $(k \cdot 2^{k-1}, 3k-4, k)$-linear forms condition (as specified in the definition of $k$-pseudorandomness) for our measure. Thus our choice of $W$ only need depend on $k$, but it is easier to choose a function $w(N)$ growing slowly to $\infty$ with $N$ (how slowly can safely be decided later), and set

\[ W = W(N) := \prod_{p \le w(N)} p; \]

clearly this is eventually big enough for any fixed $k$.
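As a small illustration of the W-trick, the following sketch computes $W$ for a toy threshold and checks that every value $Wn + 1$ avoids all small moduli. The value `w = 7` and the helper names are assumptions chosen for the example; they do not come from [2].

```python
from math import gcd

def primes_up_to(x):
    """Return all primes p <= x by trial division (adequate for tiny x)."""
    return [p for p in range(2, x + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]

def primorial(w):
    """W = product of all primes p <= w."""
    W = 1
    for p in primes_up_to(w):
        W *= p
    return W

w = 7                 # hypothetical small value standing in for w(N)
W = primorial(w)      # 2 * 3 * 5 * 7

# Every value W*n + 1 is coprime to W, hence to every modulus q <= w.
values = [W * n + 1 for n in range(1, 200)]
coprime_to_small_q = all(gcd(v, q) == 1
                         for v in values for q in range(2, w + 1))
```

Restricting to the class $1 \bmod W$ therefore removes the bias of the primes modulo every $q \le w$ at once, which is the point of the trick.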

1.2.2 Dealing with wraparound

Suppose that we have found an arithmetic progression in $\mathbb{Z}_N$. Even upon identifying $\mathbb{Z}_N$ with $\{1, 2, \ldots, N\}$, our find need not correspond to a genuine arithmetic progression in $\mathbb{N}$, owing to the possibility of wraparound in the cyclic group $\mathbb{Z}_N$.

However, we can rule out this problem if we know that our length-$k$ progression in $\mathbb{Z}_N$ is contained in some very short arc (how short depending on $k$). It suffices to restrict attention to arcs of length less than $(1/k)N$, but for other reasons we will later take $\varepsilon_k N$ for $\varepsilon_k := 1/(2^k (k+4)!)$ (much smaller).
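The wraparound issue can be checked by brute force for small parameters. The sketch below is an illustration only: it takes representatives in $\{0, \ldots, N-1\}$ and the toy values $N = 101$, $k = 4$ (both assumptions of the example), and verifies that a progression in $\mathbb{Z}_N$ whose points span less than $N/k$ always has a constant integer step.

```python
def is_integer_progression(points):
    """True if the points, in the given order, have one constant integer step."""
    steps = {b - a for a, b in zip(points, points[1:])}
    return len(steps) == 1

def check_no_wraparound(N, k):
    """Check every progression in Z_N with nonzero step r and length k >= 2."""
    for x in range(N):
        for r in range(1, N):
            pts = [(x + j * r) % N for j in range(k)]
            in_short_arc = (max(pts) - min(pts)) * k < N
            if in_short_arc and not is_integer_progression(pts):
                return False
    return True

# A progression in Z_11 that wraps around: 9, 12 = 1, 15 = 4 (mod 11).
wrapped = [(9 + 3 * j) % 11 for j in range(3)]
```

Here `wrapped` is a progression in $\mathbb{Z}_{11}$ that is not one in the integers, while `check_no_wraparound` confirms that no such example fits inside a short arc.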

1.2.3 Our goal, modified

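Combining the two tricks described above, we will search for a specific function and measure of the form

\[ f(n) := \begin{cases} g(Wn+1) & \text{if } \varepsilon_k N \le n \le 2\varepsilon_k N \\ 0 & \text{else} \end{cases} \]

(for some function $g$ supported on the primes), and

\[ \nu(n) := \begin{cases} h(Wn+1) & \text{if } \varepsilon_k N \le n \le 2\varepsilon_k N \\ 1 & \text{else} \end{cases} \]

(for some other suitable function $h$).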

1.3 The actual construction

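It has long been known in number theory that sometimes it is easier to study the function $1_{\mathrm{primes}}(n) \log n$ than the indicator function $1_{\mathrm{primes}}(n)$ itself, and so the former seems a good ingredient for $f$. Thus, we set

\[ \Lambda(n) := \begin{cases} \frac{\phi(W)}{W} \log(Wn+1) & \text{if } Wn+1 \text{ is prime} \\ 0 & \text{else} \end{cases} \]

and

\[ f(n) := \begin{cases} k^{-1} 2^{-k-5} \Lambda(n) & \text{if } \varepsilon_k N \le n \le 2\varepsilon_k N \\ 0 & \text{else} \end{cases} \]

for $0 \le n < N$. The various factors in the above definition are simply convenient normalizers.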

For our choice of $\nu$ we will need to make estimates to verify pseudorandomness. Before the appearance of [2], Goldston and Yıldırım [1] had studied sums of the form

\[ \sum_{n \le N} \Lambda_R(n+h_1) \Lambda_R(n+h_2) \cdots \Lambda_R(n+h_m) \]

for $\Lambda_R$ the so-called truncated divisor sum

\[ \Lambda_R(n) := \sum_{d \mid n,\ d \le R} \mu(d) \log\frac{R}{d}, \]

where $\mu$ is the Möbius function:

\[ \mu(n) := \begin{cases} (-1)^r & \text{if } n \text{ is squarefree with } r \text{ prime factors} \\ 0 & \text{else,} \end{cases} \]

which in turn has long been central to the study of the distribution of the primes.
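For concreteness, the truncated divisor sum can be evaluated directly from its definition. The sketch below is a naive implementation meant only for small inputs; the function names are assumptions of the example. It also illustrates two elementary consequences of the definition: a prime $n > R$ has $d = 1$ as its only divisor up to $R$, so $\Lambda_R(n) = \log R$, while for $1 < n \le R$ the sum collapses to the usual von Mangoldt function.

```python
from math import log

def mobius(n):
    """Moebius function: (-1)^r for squarefree n with r prime factors, else 0."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:       # p^2 divided the original n
                return 0
            result = -result
        p += 1
    if n > 1:                    # one remaining prime factor
        result = -result
    return result

def truncated_divisor_sum(n, R):
    """Lambda_R(n) = sum over d | n with d <= R of mu(d) * log(R / d)."""
    return sum(mobius(d) * log(R / d)
               for d in range(1, min(n, int(R)) + 1) if n % d == 0)

at_large_prime = truncated_divisor_sum(101, 10)   # only d = 1 contributes
at_prime_power = truncated_divisor_sum(8, 10)     # von Mangoldt value log 2
at_composite = truncated_divisor_sum(6, 10)       # von Mangoldt value 0
```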

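It is Goldston and Yıldırım’s work that provides a measure for which pseudorandomness can be checked: we set $R := N^{k^{-1} 2^{-k-4}}$, and

\[ \nu(n) := \begin{cases} \frac{\phi(W)}{W} \frac{\Lambda_R(Wn+1)^2}{\log R} & \text{if } \varepsilon_k N \le n \le 2\varepsilon_k N \\ 1 & \text{else} \end{cases} \]

for $0 \le n < N$.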

Remark The choice of this law for $R$ as a function of $N$ is dictated, firstly, by the requirement that $\nu$ majorize a positive multiple of $f$, and then by the various considerations of how slow it needs to be in the proof of pseudorandomness. We take it to be as high a power of $N$ as will allow all these estimates to go through easily; the value of that power is an artefact of these different parts of the proof. It turns out that the growth of $R$ in turn forces a bound on the growth of $w(N)$ (and so also $W(N)$), but this can then simply be absorbed into the requirement that $w(N)$ grow “slowly enough”.

Assuming we have proved pseudorandomness for our $\nu$, the Green-Tao Theorem now follows from:

Lemma 1 (Lemma 9.4 in [2]). We have $\nu(n) \ge 0$ for all $n \in \mathbb{Z}_N$, and $\nu(n) \ge f(n)$ for all $n \in \mathbb{Z}_N$ whenever $N$ is sufficiently large.

Thus:

Proposition 2 (Proposition 9.1 in [2]). Write $\varepsilon_k := 1/(2^k (k+4)!)$, and let $N$ be a sufficiently large prime number. Then there is a $k$-pseudorandom measure $\nu : \mathbb{Z}_N \to [0, \infty)$ such that $\nu(n) \ge k^{-1} 2^{-k-5} \Lambda(n)$ for all $\varepsilon_k N \le n \le 2\varepsilon_k N$.


Proof of the Green-Tao theorem assuming Proposition 2 Define $f$ as above. From Dirichlet’s theorem we observe that

\[ \mathbb{E}(f) = \frac{k^{-1} 2^{-k-5}}{N} \sum_{\varepsilon_k N \le n \le 2\varepsilon_k N} \Lambda(n) = k^{-1} 2^{-k-5} \varepsilon_k (1 + o(1)). \]

We can now apply Proposition 2 and Theorem 3.5 from [2] (as covered in the first half of that paper) to conclude that

\[ \mathbb{E}\big( f(x) f(x+r) \cdots f(x+(k-1)r) \,\big|\, x, r \in \mathbb{Z}_N \big) \ge c(k, k^{-1} 2^{-k-5} \varepsilon_k) - o(1), \]

where $c(k, k^{-1} 2^{-k-5} \varepsilon_k)$ is a positive constant guaranteed by Theorem 3.5 of [2]. Observe that the degenerate case $r = 0$ can contribute at most $O(\frac{1}{N} \log^k N) = o(1)$ to the left-hand side and can thus be discarded. Furthermore, by our trick of restricting attention to $[\varepsilon_k N, 2\varepsilon_k N]$, every progression counted by the expression on the left is a genuine arithmetic progression of integers. Since the right-hand side is positive for sufficiently large $N$, the claim follows from the definition of $\Lambda$.

Remark In fact, by a simple modification of the construction of $f$ and $\nu$ above, one can show that there are arbitrarily long arithmetic progressions in any subset $A$ of the primes with positive relative upper density; that is, such that

\[ \limsup_{N \to \infty} \frac{|A \cap [1, N]|}{|\{n \in [1, N] : n \text{ is prime}\}|} > 0. \]

This is the full strength of Green and Tao’s result.

1.4 Proving pseudorandomness

Pseudorandomness requires the linear forms condition and the correlation condition. These two conditions are proved via two other results, one for each; these intermediate results connect more closely with existing number-theoretic estimates, and then the onward journey to pseudorandomness is more combinatorial.

Proposition 3 (Proposition 9.5 in [2], leading to linear forms condition). Let $m, t$ be positive integers. For each $1 \le j \le m$, let $\psi_j(x) := \sum_{k=1}^{t} L_{jk} x_k + b_j$ be linear forms with integer coefficients $L_{jk}$ such that $|L_{jk}| \le \sqrt{w(N)}/2$ for all $j = 1, \ldots, m$ and $k = 1, \ldots, t$. We assume that the $t$-tuples $(L_{jk})_{k=1}^{t}$ are never identically zero, and that no two $t$-tuples are rational multiples of each other. Write $\theta_j := W\psi_j + 1$. Suppose that $B$ is a product $\prod_{k=1}^{t} I_k \subseteq \mathbb{R}^t$ of intervals $I_k$, each of which has length at least $R^{10m}$. Then (if the function $w(N)$ is sufficiently slowly growing in $N$)

\[ \mathbb{E}\big( \Lambda_R(\theta_1(x))^2 \cdots \Lambda_R(\theta_m(x))^2 \,\big|\, x \in B \big) = (1 + o_{m,t}(1)) \left( \frac{W \log R}{\phi(W)} \right)^m. \]

Proposition 4 (Proposition 9.6 in [2], leading to correlation condition). Let $m \ge 1$ be an integer, and let $B$ be an interval of length at least $R^{10m}$. Suppose that $h_1, \ldots, h_m$ are distinct integers satisfying $|h_i| \le N^2$ for all $1 \le i \le m$, and let $\Delta$ denote the integer

\[ \Delta := \prod_{1 \le j < k \le m} |h_j - h_k|. \]

Then (for $N$ sufficiently large depending on $m$, and assuming the function $w(N)$ sufficiently slowly growing in $N$)

\[ \mathbb{E}\big( \Lambda_R(W(x+h_1)+1)^2 \cdots \Lambda_R(W(x+h_m)+1)^2 \,\big|\, x \in B \big) \le (1 + o_m(1)) \left( \frac{W \log R}{\phi(W)} \right)^m \prod_{p \mid \Delta} \big( 1 + O_m(p^{-1/2}) \big). \]

Remarks

1. The roles of the arbitrary sets $B$ in the above propositions become clear during the deduction of pseudorandomness from them. Briefly, we need the sets $B$ because we will be defining $\nu$ in two different ways inside and outside the interval $[\varepsilon_k N, 2\varepsilon_k N]$, following our second trick from Section 1.2; when verifying the linear forms and correlation conditions, the overall expectation is broken into several expectations over (products of) smaller intervals, such that most of these products of smaller intervals are “homogeneous” for the definition of $\nu$.

2. The extra factor

\[ \prod_{p \mid \Delta} \big( 1 + O_m(p^{-1/2}) \big) \]

arises naturally in the course of the calculations, and sheds a little light on the correlation condition itself. The correlation condition takes the form it does because the quantity

\[ \mathbb{E}\big( \Lambda_R(W(x+h_1)+1)^2 \cdots \Lambda_R(W(x+h_m)+1)^2 \,\big|\, x \in B \big) \]

does not admit a uniform bound as a function of $(h_1, \ldots, h_m)$, but we can show that those tuples $(h_1, \ldots, h_m)$ for which its value is large are very few (notice that, typically, most $p$ dividing $\Delta$ will be large, and so the product will usually not be too large). Recalling our view of the Green-Tao proof as split into two stages, we might say that the definition of the correlation condition is a staging post that has been placed precisely so that both halves of the journey are just possible.

We finish with a representative sketch of some of the proofs relating the above results.

Sketch proof of the linear forms condition from Proposition 3 Let $\psi_j$, $1 \le j \le m$, be our linear forms. Following the first remark above, we chop up the range of summation in the linear forms condition into $Q(N)^t$ almost equal-sized boxes, where $Q$ is slowly growing in $N$. We now treat the sums over the different sets

\[ B_{u_1,\ldots,u_t} := \{ x : x_k \in [\lfloor u_k N/Q \rfloor, \lfloor (u_k+1) N/Q \rfloor) \text{ for } k = 1, \ldots, t \} \]

for $u_1, \ldots, u_t \in \mathbb{Z}_Q$, and observe that asymptotically we may replace the expectation $\mathbb{E}(\,\cdot \mid x \in \mathbb{Z}_N^t)$ with the iterated expectation

\[ \mathbb{E}\big( \mathbb{E}(\,\cdot \mid x \in B_{u_1,\ldots,u_t}) \,\big|\, u_1, \ldots, u_t \in \mathbb{Z}_Q \big) \]

in the linear forms condition.

Call a $t$-tuple $(u_1, \ldots, u_t) \in \mathbb{Z}_Q^t$ nice if for every $1 \le j \le m$ the sets $\psi_j(B_{u_1,\ldots,u_t})$ are either completely contained in the interval $[\varepsilon_k N, 2\varepsilon_k N]$ or are completely disjoint from it. Then in these cases we can either replace every $\nu(\psi_j(x))$ factor by $\frac{\phi(W)}{W \log R} \Lambda_R(\theta_j(x))^2$, and apply Proposition 3 (note that if $Q$ grows slowly enough then for $N$ sufficiently large each $B_{u_1,\ldots,u_t}$ is large enough), or replace every such factor by 1. Hence we always obtain $1 + o_{m,t}(1)$ for the inner of our iterated expectations when $(u_1, \ldots, u_t)$ is nice.

Finally, we prove by elementary estimates that the proportion of non-nice $t$-tuples is asymptotically zero as $N \to \infty$.
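The box decomposition is easy to visualise in the simplest case. The sketch below treats only $t = 1$ with the single form $\psi(x) = x$, which is an assumption made for brevity (the actual argument handles $m$ forms in $t$ variables); the window endpoints `lo`, `hi` and the values of `N`, `Q` are likewise illustrative. It checks that the intervals partition $\{0, \ldots, N-1\}$ into almost equal pieces and that at most two boxes straddle the window, so the proportion of non-nice boxes is at most $2/Q$.

```python
def boxes(N, Q):
    """Intervals [floor(u*N/Q), floor((u+1)*N/Q)) for u = 0, ..., Q-1."""
    return [range(u * N // Q, (u + 1) * N // Q) for u in range(Q)]

def is_nice(box, lo, hi):
    """For psi(x) = x: the box lies entirely inside [lo, hi] or entirely outside."""
    inside = [lo <= x <= hi for x in box]
    return all(inside) or not any(inside)

N, Q = 10007, 50                    # toy sizes; Q plays the role of Q(N)
lo, hi = N // 10, 2 * N // 10       # stand-in for [eps_k N, 2 eps_k N]

parts = boxes(N, Q)
sizes = [len(b) for b in parts]
non_nice = [u for u, b in enumerate(parts) if not is_nice(b, lo, hi)]
```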

Sketch of proof method for Propositions 3 and 4 Adapting the methods of [1], these are proved by representing the summation in question as an iterated contour integral by applying the standard identity

\[ \log_+ y = \frac{1}{2\pi i} \int_{\Gamma_1} \frac{y^z}{z^2} \, dz, \]

where $\Gamma_1$ is the vertical contour $\Gamma_1(t) = 1 + it$, to the $\log(R/d)$ factors in the definition of $\Lambda_R$. The integrand is then massaged into the form of a multiplicative function, which can then in turn be decomposed as

\[ \big( \text{something relatively benign} \big) \times \big( \text{some product of zeta functions} \big). \]

The values of the contour integrals then depend most strongly on the second factor; since the zeta function is relatively well-understood, methods of the residue calculus can then be used to estimate the contour integrals, with small errors that can be tracked as the contours are moved around. The trick is to find estimates on the behaviour of the first factor that are good enough to allow the application of such contour integral methods.
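The contour identity for $\log_+ y = \max(\log y, 0)$ can be sanity-checked numerically. The sketch below is only a rough check under stated assumptions: the contour $z = 1 + it$ is truncated to $|t| \le T$ and discretised with the trapezoid rule, and the parameters `T` and `h` are arbitrary choices for the example, so agreement is expected only to a couple of decimal places.

```python
import cmath
from math import log, pi

def log_plus_via_contour(y, T=1000.0, h=0.02):
    """Approximate (1/(2*pi*i)) * integral of y^z / z^2 dz along Re z = 1.

    With z = 1 + it we have dz = i dt, so the factor i cancels and the
    integral becomes (1/(2*pi)) * integral of y^z / z^2 dt over |t| <= T.
    """
    steps = int(round(2 * T / h))
    ly = log(y)                       # requires y > 0
    total = 0.0 + 0.0j
    for j in range(steps + 1):
        t = -T + j * h
        z = complex(1.0, t)
        weight = 0.5 if j in (0, steps) else 1.0   # trapezoid end weights
        total += weight * cmath.exp(z * ly) / (z * z)
    return (total * h / (2 * pi)).real

above_one = log_plus_via_contour(5.0)   # should be close to log 5
below_one = log_plus_via_contour(0.5)   # should be close to 0
```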

Remark Bernard Host [3] has offered an alternative construction for $\nu$. His follows the pattern of the Green-Tao definition except that he takes

\[ \Lambda(n) := \sum_{d \mid n} \mu(d) \, \chi\!\left( \frac{\log d}{\log R} \right) \]

for a $C^\infty$ function $\chi : \mathbb{R} \to [0, \infty)$ satisfying certain extra conditions. This allows him to write

\[ \chi(x) = \int_{\mathbb{R}} \tau(t) \, e^{-x(1+it)} \, dt \]

for a rapidly decreasing function $\tau$. He then mimics the proofs of Propositions 9.5 and 9.6 sketched above, except the use of the contour integrals is replaced by a (shorter) estimate using Fourier analysis.

References

[1] Goldston D.A. & Yıldırım C.Y., “Small gaps between primes I”, preprint;

[2] Green B. & Tao T., “The primes contain arbitrarily long arithmetic progressions”, to appear in Ann. Math.;

[3] Host B., “Progressions arithmétiques dans les nombres premiers [d’après B. Green et T. Tao]”, Séminaire Bourbaki, Mars 2005, 57ème année, 2004-2005, no. 944.

Tim Austin, UCLA


2 A generalization of Birkhoff’s pointwise ergodic theorem

after Amos Nevo and Elias M. Stein [4]

A summary written by Svetlana Butler

Abstract

The paper generalizes Birkhoff’s pointwise ergodic theorem by presenting ergodic theorems for free non-Abelian groups on finitely many generators.

2.1 Introduction

Given an arbitrary invertible measure preserving transformation $T$ on a probability space $X$, Birkhoff’s pointwise ergodic theorem says that for any $f \in L^1(X)$, the averages of $f$ along the orbit of $T$

\[ \frac{1}{2n+1} \sum_{k=-n}^{n} f(T^k x) \]

converge to the limit $\tilde f(x)$ for almost all $x \in X$, where $\tilde f$ is the conditional expectation of $f$ with respect to the σ-algebra of $T$-invariant sets.
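A quick numerical illustration of Birkhoff’s theorem is the irrational rotation of the circle, which is ergodic, so the limit is simply the integral of $f$. The rotation number $\sqrt{2}$, the test function and the starting point below are arbitrary choices made for this sketch.

```python
from math import cos, pi, sqrt

def two_sided_average(f, T, T_inv, x, n):
    """(1/(2n+1)) * sum_{k=-n}^{n} f(T^k x), iterating T and T^{-1} from x."""
    total = f(x)
    forward = backward = x
    for _ in range(n):
        forward = T(forward)
        backward = T_inv(backward)
        total += f(forward) + f(backward)
    return total / (2 * n + 1)

alpha = sqrt(2)                        # irrational rotation number (illustrative)
T = lambda x: (x + alpha) % 1.0        # rotation of the circle [0, 1)
T_inv = lambda x: (x - alpha) % 1.0    # its inverse
f = lambda x: cos(2 * pi * x)          # integral over [0, 1) equals 0

avg = two_sided_average(f, T, T_inv, x=0.3, n=10_000)
```

The computed average is already very close to the space average 0 of $f$ for this value of $n$.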

Consider two invertible, measure preserving, commuting transformations $T$ and $S$. It is known that the expressions

\[ \frac{1}{(2n+1)^2} \sum_{-n \le n_1, n_2 \le n} f(T^{n_1} S^{n_2} x) \]

converge for almost all $x \in X$, for any $f \in L^1(X)$, and the limit is the conditional expectation of $f$ with respect to the σ-algebra of sets invariant under $T$ and $S$. Thus the pointwise ergodic theorem holds for finite measure preserving actions of the free Abelian group on two generators, $\mathbb{Z}^2$.

It is natural to try to consider a pointwise ergodic theorem for measure preserving actions of the free non-Abelian group on 2 (or $r$) generators. This leads to the following setting:

For a countable group $\Gamma$ let

\[ \ell^1(\Gamma) = \Big\{ \mu = \sum_{\gamma \in \Gamma} \mu(\gamma)\gamma : \sum_{\gamma \in \Gamma} |\mu(\gamma)| < \infty \Big\} \]

denote the group algebra. Let $(X, \mathcal{B}, m)$ be a standard Lebesgue probability space, and assume $\Gamma$ acts on $X$ by measurable automorphisms preserving $m$. The action $(\gamma, x) \mapsto \gamma x$ induces a representation of $\Gamma$ by isometries on the $L^p(X)$ spaces, $1 \le p \le \infty$, and this representation can be extended to the group algebra by

\[ (\mu f)(x) = \sum_{\gamma \in \Gamma} \mu(\gamma) f(\gamma^{-1} x). \]

Let $\mathcal{B}_1 = \{ A \in \mathcal{B} : m(\gamma A \,\triangle\, A) = 0 \ \forall \gamma \in \Gamma \}$ denote the sub-σ-algebra of invariant sets, and denote by $E_1$ the conditional expectation operator on $L^1(X)$ which is associated with $\mathcal{B}_1$.

A sequence $\nu_n \in \ell^1(\Gamma)$ is called a pointwise ergodic sequence in $L^p$ if, for any action of $\Gamma$ on a Lebesgue space $X$ which preserves a probability measure, and for every $f \in L^p(X)$, $\nu_n f(x) \to E_1 f(x)$ for almost all $x \in X$, and in the norm of $L^p(X)$.

In search for pointwise ergodic sequences we look at sequences in $\ell^1(\Gamma)$ with an explicit geometric form. Assume $\Gamma$ is finitely generated, and let $S$ be a finite generating set which is symmetric: $S = S^{-1}$ (we will assume $e \notin S$). When $\Gamma$ is the free group $F_r$, the set of generators $S$ will always be taken to be the set of free generators and their inverses. $S$ induces a length function on $\Gamma$, given by $|\gamma| = \min\{ n : \gamma = s_1 \cdots s_n,\ s_i \in S \}$, $|e| = 0$. The sphere and the ball of radius $n$ are $S_n = \{ w : |w| = n \}$ and $B_n = \{ w : |w| \le n \}$ respectively.
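In the free group $F_r$ the elements of length $n$ are exactly the reduced words of length $n$ in the free generators and their inverses, so spheres and balls can be enumerated directly. The sketch below uses an encoding chosen for this example only (letter `i` for the $i$-th generator and `-i` for its inverse) and confirms the count $\#S_n = 2r(2r-1)^{n-1}$ for small $n$.

```python
def reduced_words(r, n):
    """All reduced words of length n in F_r.

    Letters are nonzero integers: i stands for the i-th free generator and
    -i for its inverse. A word is reduced when no letter is immediately
    followed by its inverse.
    """
    letters = [s * i for i in range(1, r + 1) for s in (1, -1)]
    words = [()]
    for _ in range(n):
        words = [w + (s,) for w in words for s in letters
                 if not w or w[-1] != -s]
    return words

def sphere_size(r, n):
    """Number of elements of length exactly n."""
    return len(reduced_words(r, n))

def ball_size(r, n):
    """Number of elements of length at most n."""
    return sum(sphere_size(r, k) for k in range(n + 1))
```

For instance, in $F_2$ the spheres of radius 1, 2, 3 have 4, 12, 36 elements, so balls grow exponentially rather than polynomially.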

2.2 Main results

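Consider the following sequences:

\[ \sigma_n = \frac{1}{\# S_n} \sum_{w \in S_n} w, \qquad \mu_n = \frac{1}{n+1} \sum_{k=0}^{n} \sigma_k, \]

\[ \sigma'_n = \frac{1}{2} (\sigma_n + \sigma_{n+1}), \qquad \beta_n = \frac{1}{\# B_n} \sum_{w \in B_n} w. \]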

Theorem 1. Consider the free group $F_r$, $r \ge 2$.

1. The sequence $\mu_n$ is a pointwise ergodic sequence in $L^p$, for $1 \le p < \infty$.

2. The sequence $\sigma'_n$ is a pointwise ergodic sequence in $L^p$, for $1 < p < \infty$.

3. $\sigma_{2n}$ converges to an operator of conditional expectation with respect to an $F_r$-invariant sub-σ-algebra. $\beta_{2n}$ converges to the operator $E_1 + \frac{r-1}{r} E$, where $E$ is a projection disjoint from $E_1$. Given $f \in L^p(X)$, $1 < p < \infty$, the convergence is pointwise almost everywhere and in the $L^p$-norm.


Note that the first part of this theorem is a direct analog of Birkhoff’s pointwise ergodic theorem.

Given a sequence $\nu_n \in \ell^1(\Gamma)$, define the associated maximal function $f^*_\nu(x) = \sup_{n \ge 0} |\nu_n f(x)|$. Let $(X, \mathcal{B}, m)$ be an $F_r$-space with an invariant σ-finite measure. Then:

Theorem 2. For each $F_r$, $r \ge 2$, there exist positive constants $C_p(r)$ such that for any $f \in L^p(X)$ the following inequalities hold:

1. $\| f^*_\mu \|_p$, $\| f^*_\sigma \|_p$, and $\| f^*_\beta \|_p$ are all bounded by $C_p(r) \| f \|_p$, for $1 < p < \infty$.

2. $f^*_\mu$ satisfies the maximal inequality of weak type $(1, 1)$, namely:

\[ m\{ x : |f^*_\mu(x)| \ge \delta \} \le C_1(r) \, \delta^{-1} \| f \|_1 \]

for every $\delta > 0$, and $f \in L^1(X)$.

2.3 Remarks

In pointwise ergodic theorems one seeks to establish, for a general sequence of Markov operators $T_k$ acting in $L^p(X, \mathcal{B}, m)$, the existence of the limit $\lim_{k \to \infty} T_k f(x) = \tilde f(x)$ for almost all $x$ and in the $L^p$ norm. A natural and interesting choice is to consider the sequence of operators $\beta_k$ of averaging a function on balls $B_k$ of radius $k$. In the case of a finitely generated Abelian group, the main ingredients which figure in the proof of ergodic theorems are: the fact that the nested sequence of balls $B_n$ is asymptotically invariant under translation; the transfer principle (which uses the fact that $\lim_{N \to \infty} \# B_N / \# B_{N-n} = 1$); and the covering argument. None of these ingredients works if the group is a finitely generated free non-Abelian group $F_r$. For example, $\lim_{N \to \infty} \# B_N / \# B_{N-n} = (2r-1)^n$, so the transfer principle does not apply.

In proving ergodic theorems for free non-abelian groups on $r$ generators different methods are necessary. The following two observations lead to special methods used to prove the ergodic theorems in the paper.

(I) The Markov operators $\beta_k$ and $\sigma_k$ are comparable, in the sense that a maximal inequality for one sequence implies the same inequality for the other, up to a constant. (This fact follows since $\sigma_n \le C_r \beta_n$ and $\beta_n$ is a convex combination of $\sigma_k$, $k = 0, \ldots, n$.) This means that one might use spheres rather than balls. The approach in the paper is to use methods similar to ones devised in [5] to handle singular means.

(II) The convolution identity

\[ \sigma_1 * \sigma_n = \frac{1}{2r} \sigma_{n-1} + \left( 1 - \frac{1}{2r} \right) \sigma_{n+1} \qquad (1) \]

holds in the group algebra $\ell^1(F_r)$. It implies, by induction, that the elements $\sigma_n$ are linear combinations of the convolution powers $\sigma_1^k$, $0 \le k \le n$. Therefore, the spheres $\sigma_n$ generate a commutative convolution ∗-algebra, denoted by $A(F_r)$. The algebra $A(F_r)$, introduced in 1963 (see [1]), has an explicit spectral theory (see, for example, [3]). This spectral theory is important in proofs of both theorems.
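Identity (1) can be verified by direct computation in the group algebra for small $r$ and $n$. The sketch below represents an element of $\ell^1(F_r)$ as a dictionary from reduced words to rational weights; the encoding of words as tuples of signed integers and all function names are assumptions of this example.

```python
from collections import defaultdict
from fractions import Fraction

def reduced_words(r, n):
    """Reduced words of length n in F_r; letter i is a generator, -i its inverse."""
    letters = [s * i for i in range(1, r + 1) for s in (1, -1)]
    words = [()]
    for _ in range(n):
        words = [w + (s,) for w in words for s in letters
                 if not w or w[-1] != -s]
    return words

def multiply(u, v):
    """Product of two reduced words: concatenate and cancel at the junction."""
    u = list(u)
    for s in v:
        if u and u[-1] == -s:
            u.pop()
        else:
            u.append(s)
    return tuple(u)

def sigma(r, n):
    """Uniform probability measure on the sphere S_n, as {word: weight}."""
    words = reduced_words(r, n)
    return {w: Fraction(1, len(words)) for w in words}

def convolve(mu, nu):
    """Convolution in the group algebra."""
    out = defaultdict(Fraction)
    for u, a in mu.items():
        for v, b in nu.items():
            out[multiply(u, v)] += a * b
    return dict(out)

def combine(*terms):
    """Linear combination of (coefficient, measure) pairs."""
    out = defaultdict(Fraction)
    for c, mu in terms:
        for w, a in mu.items():
            out[w] += c * a
    return dict(out)

def identity_holds(r, n):
    """Check sigma_1 * sigma_n = (1/2r) sigma_{n-1} + (1 - 1/2r) sigma_{n+1}."""
    lhs = convolve(sigma(r, 1), sigma(r, n))
    rhs = combine((Fraction(1, 2 * r), sigma(r, n - 1)),
                  (1 - Fraction(1, 2 * r), sigma(r, n + 1)))
    return lhs == rhs
```

The two weights reflect that left multiplication by a generator shortens a reduced word of length $n \ge 1$ for exactly one of the $2r$ generators and lengthens it for the other $2r - 1$.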

2.4 Outline of the proofs

We outline the proof of Theorem 2 first.

Part (2.) of Theorem 2 as well as the strong $L^p$ maximal inequality for the maximal function $f^*_\mu$ (which is Lemma 1 in [4]) follows from the inequality

\[ \sum_{k=0}^{N} \sigma_k \le C_r \sum_{k=0}^{3N} \sigma_1^k \]

and application of the Hopf-Dunford-Schwartz theorem ([2], Ch. 8). The above inequality is proved by using approximation of the binomial coefficients to estimate coefficients $a_n(k)$ in $\sigma_1^n = \sum_{k=0}^{n} a_n(k) \sigma_k$. Here writing $\sigma_1^n$ as a convex combination of $\sigma_k$’s, $0 \le k \le n$, follows from the convolution identity (1).
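The coefficients $a_n(k)$ can be generated recursively from identity (1) together with $\sigma_1 * \sigma_0 = \sigma_1$. The sketch below is one way to tabulate them exactly; it is an illustration of the convex-combination claim, not the estimate used in [4], and the function name is an assumption of the example.

```python
from fractions import Fraction

def power_coefficients(r, n):
    """Coefficients a_n(0), ..., a_n(n) with sigma_1^n = sum_k a_n(k) sigma_k.

    Uses sigma_1 * sigma_0 = sigma_1 and, for k >= 1, identity (1):
    sigma_1 * sigma_k = (1/2r) sigma_{k-1} + (1 - 1/2r) sigma_{k+1}.
    """
    down = Fraction(1, 2 * r)      # weight moved from sigma_k to sigma_{k-1}
    up = 1 - down                  # weight moved from sigma_k to sigma_{k+1}
    a = [Fraction(1)]              # sigma_1^0 = sigma_0
    for _ in range(n):
        nxt = [Fraction(0)] * (len(a) + 1)
        nxt[1] += a[0]             # sigma_1 * sigma_0 = sigma_1
        for k in range(1, len(a)):
            nxt[k - 1] += down * a[k]
            nxt[k + 1] += up * a[k]
        a = nxt
    return a

second_power = power_coefficients(2, 2)   # sigma_1^2 in F_2
```

In $F_2$ this gives $\sigma_1^2 = \frac{1}{4}\sigma_0 + \frac{3}{4}\sigma_2$, and in general the coefficients are those of a nearest-neighbour random walk on the non-negative integers with a drift to the right.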

The proof of part (1.) is more elaborate. First observe that maximal inequalities for $f^*_\mu$ and $f^*_\beta$ follow from the maximal inequality for $f^*_\sigma$, since $\mu_n$ and $\beta_n$ are convex averages of $\sigma_k$’s, $0 \le k \le n$. The proof of the maximal inequality for $f^*_\sigma$ follows the method devised in [5].

For a sequence $P_k$ of bounded linear operators on $L^p(X)$ and a function $f \in L^p(X)$ define a sequence of Cesàro sums:

\[ S^\lambda_n f(x) = \sum_{k=0}^{n} A^\lambda_{n-k} P_k f(x), \]

where $A^\lambda_n = \prod_{k=1}^{n} \left( 1 + \frac{\lambda}{k} \right)$ are complex binomial coefficients. The associated maximal functions are

\[ S^\lambda_* f(x) = \sup_{n \ge 0} \left| \frac{S^\lambda_n f(x)}{(n+1)^{\lambda+1}} \right|. \]

Notice that when $P_k = \sigma_k$, we get $S^0_* f(x) = f^*_\mu(x)$, while the singular mean $S^{-1}_* f(x) = f^*_\sigma(x)$. When $P_k = \sigma_{2k}$, the singular mean $S^{-1}_* f(x) = \sup_{n \ge 0} |\sigma_{2n} f(x)|$.
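The two special cases $\lambda = 0$ and $\lambda = -1$ follow directly from the product formula for $A^\lambda_n$: for $\lambda = 0$ every coefficient equals 1, and for $\lambda = -1$ the factor with $k = 1$ vanishes, so only $A^{-1}_0 = 1$ survives. The sketch below checks this on made-up numbers standing in for the values $P_k f(x)$; the names are assumptions of the example.

```python
def cesaro_coefficient(lam, n):
    """A_n^lam = prod_{k=1}^{n} (1 + lam / k); the empty product gives A_0 = 1."""
    value = 1
    for k in range(1, n + 1):
        value *= 1 + lam / k
    return value

def cesaro_sum(lam, terms):
    """S_n^lam = sum_{k=0}^{n} A_{n-k}^lam * terms[k], where terms[k] = P_k f(x)."""
    n = len(terms) - 1
    return sum(cesaro_coefficient(lam, n - k) * terms[k] for k in range(n + 1))

terms = [2.0, -1.0, 4.0, 0.5]        # hypothetical values of P_0 f(x), ..., P_3 f(x)

plain_sum = cesaro_sum(0, terms)     # lam = 0: the ordinary partial sum
last_term = cesaro_sum(-1, terms)    # lam = -1: only the last term P_n f(x)
```

After dividing by $(n+1)^{\lambda+1}$ these become the average $\frac{1}{n+1} \sum_{k \le n} P_k f(x)$ and the single term $P_n f(x)$, which is why $S^0_*$ and $S^{-1}_*$ recover the maximal functions named in the text.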

Step 1. Establish the maximal $L^p$ inequality

\[ \| S^{\alpha+i\beta}_* f \|_p \le C_\alpha \exp(2\beta^2) \, \| S^0_* |f| \|_p, \qquad 1 \le p < \infty, \ \alpha > 0. \]

This is Lemma 4 (with $P_k = \sigma_k$) in [4] which is proved by using some estimates and the convolution formula for complex binomial coefficients.

Step 2. Prove the $L^2$ maximal inequality

\[ \| S^{-m+i\beta}_* f \|_2 \le C_m \exp(3\beta^2) \, \| f \|_2 \]

for every nonpositive integer $-m$ and $\beta \in \mathbb{R}$. Here we use $P_k = \sigma_{2k}$. It is enough to show the $L^2$ maximal inequality for $S^{-m}_*$. This inequality is proved using the Littlewood-Paley square-function method. The spectral theory for the algebra $A(F_r)$ plays a key role in estimating the Littlewood-Paley square-function.

Step 3. Use the maximal inequalities from Step 1 and Step 2 and apply an analytic interpolation theorem ([6]). The result is a maximal inequality for the singular mean $S_*^{-1} f(x) = \sup_{n \ge 0} \sigma_{2n} f(x)$ in every $L^p$, $p > 1$. This implies the maximal inequality for $f^*_\sigma$ in every $L^p$, $p > 1$, since for $f \ge 0$

$$\frac{1}{2r}\, \sigma_{2n-1} f(x) \le \sigma_{2n} * \sigma_1 f(x),$$

which follows from the convolution identity (1).

The proof of Theorem 1 is based on the strong maximal $L^p$ inequalities from Theorem 2 and the spectral theory of $A(F_r)$. It is shown first (using the spectral theory) that the pointwise limits of Theorem 1 are valid for functions in $L^2(X)$, in particular, for a dense set of bounded functions. The rest of Theorem 1 follows by a standard argument with an application of the strong maximal inequalities from Theorem 2.


References

[1] Arnold, V. I. and Krylov, A. L., Uniform distribution of points on a sphere and some ergodic properties of solutions of linear ordinary differential equations in the complex plane. Soviet Math. Dokl., 4 (1962), 1-5.

[2] Dunford, N. and Schwartz, J. T., Linear operators, vol. 1. Interscience, New York, 1963.

[3] Figa-Talamanca, A. and Picardello, M. A., Harmonic analysis on free groups. Lecture Notes in Pure and Appl. Math., 87. Dekker, New York, 1983.

[4] Nevo, A. and Stein, E. M., A generalization of Birkhoff’s pointwise ergodic theorem. Acta Math., 173 (1994), 135-154.

[5] Stein, E. M., On the maximal ergodic theorem. Proc. Nat. Acad. Sci.U.S.A., 47 (1961), 1894-1897.

[6] – Interpolation of linear operators. Trans. Amer. Math. Soc., 83 (1956),482-492.

Svetlana Butler, UIUC

email: [email protected]


3 A pointwise ergodic theorem for amenable groups

after E. Lindenstrauss [1]

A summary written by S. Zubin Gautam

Abstract

We summarize the proof of the L1-pointwise ergodic theorem for actions of amenable groups in [1], focusing on the discrete case. We then present some results on the lamplighter group Z2 ≀ Z as a demonstration of the scope of the main theorem.

3.1 Preliminaries

In the sequel, all groups are assumed to be locally compact and second countable; | · | and mL will both denote the left Haar measure on a group G. (X,M, µ) will denote a Lebesgue probability space. We begin by recalling the Følner characterization of amenability:

Definition 1. A locally compact group G is amenable if for any compact set K ⊂ G and any δ > 0 there is another compact set F ⊂ G such that

$$|F \,\triangle\, KF| < \delta |F|.$$

Such an F is called (K, δ)-invariant.

Equivalently, G is amenable if it has a “Følner sequence”:

Definition 2. A Følner sequence in a group G is a sequence of compact subsets Fn ⊂ G such that for every compact K ⊂ G and δ > 0, Fn is (K, δ)-invariant for all sufficiently large n.

Now consider a left action G ↷ X by measure-preserving transformations with G amenable. For f : X → R measurable, we consider the averages of f over sets F ⊂ G,

$$A(F, f)(x) = \frac{1}{|F|} \int_F f(gx)\, dm_L(g).$$

Results on the pointwise convergence of these averages along suitable Følner sequences provide natural generalizations of the classical Birkhoff ergodic theorem for Z-actions; however, care must be exercised in the choice of Følner sequence (even in the classical case).


Definition 3. A sequence of subsets Fn ⊂ G is tempered (or satisfies the Shulman condition) if there is some C > 0 such that for all n

$$\left| \bigcup_{k<n} F_k^{-1} F_n \right| \le C |F_n|. \tag{1}$$

It is not too hard to show that any Følner sequence has a tempered subsequence; hence, any convergence result along tempered Følner sequences will apply to general amenable groups.
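For intuition, the Shulman condition can be tested numerically in the classical case G = Z (written additively, so that $F_k^{-1}F_n$ becomes the difference set $F_n - F_k$). The following is a minimal sketch; the two example sequences and the helper names are our own choices, not taken from [1].

```python
def shulman_ratio(sets, n):
    """|union_{k<n} F_k^{-1} F_n| / |F_n| for finite subsets of Z (additive notation)."""
    Fn = sets[n]
    union = set()
    for k in range(n):
        # F_k^{-1} F_n = { -h + g : h in F_k, g in F_n }
        union |= {g - h for h in sets[k] for g in Fn}
    return len(union) / len(Fn)


def tempered_constant(sets):
    """Smallest C that works in the Shulman condition for this finite list of sets."""
    return max(shulman_ratio(sets, n) for n in range(len(sets)))


# Intervals {0, ..., m}: the ratio stays below 2.
intervals = [set(range(m + 1)) for m in range(1, 40)]
# Intervals {m^2, ..., m^2 + m}: still a Folner sequence in Z, but the ratio keeps growing.
shifted = [set(range(m * m, m * m + m + 1)) for m in range(1, 40)]

print(tempered_constant(intervals))
print([round(shulman_ratio(shifted, n), 1) for n in (9, 19, 29)])
```

The second family illustrates why the choice of Følner sequence matters even for Z-actions: the sets are almost invariant, yet no single constant C works in (1).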

3.2 Statement of the main result

Theorem 4 (Pointwise Ergodic Theorem). Let G be an amenable group acting on (X,M, µ) by measure-preserving transformations, and let Fn be a tempered Følner sequence for G. Then for any $f \in L^1(\mu)$ there is a G-invariant $\bar f \in L^1(\mu)$ such that

$$\lim_{n\to\infty} A(F_n, f)(x) = \bar f(x) \quad \text{a.e.}$$

In particular, if G acts ergodically,

$$\lim_{n\to\infty} A(F_n, f)(x) = \int_X f \, d\mu \quad \text{a.e.}$$

(NB: This is both Theorem 1.2 and Theorem 3.3 of [1].)

3.3 Random selection of Følner sets

We will use the following “randomized” covering lemma, Lemma 2.1 of [1]:

Lemma 5. Let δ > 0 be given, and let G be a locally compact second countable group. Let F1, . . . , FN be a finite tempered sequence of compact subsets of G with constant C > 0, and let F be another compact subset of G. Take arbitrary sets Aj such that FjAj ⊂ F for 1 ≤ j ≤ N, and set

$$\mathcal{F} = \{ F_j a \mid a \in A_j,\ 1 \le j \le N \}.$$

Then there is a probability space (Ω,P) and a map $\vec\omega \mapsto \mathcal{F}(\vec\omega)$ from Ω to the set of subcollections of $\mathcal{F}$ satisfying the following conditions:


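1. $\mathcal{F}(\vec\omega)$ is a.s. finite and the counting function

$$\Lambda_{\vec\omega}(g) = \sum_{B \in \mathcal{F}(\vec\omega)} 1_B(g)$$

of $\mathcal{F}(\vec\omega)$ is a measurable function on Ω × F.

2. For all g ∈ F, $\mathbb{E}(\Lambda_{\vec\omega}(g) \mid \Lambda_{\vec\omega}(g) \ge 1) \le 1 + \delta$.

3. With the notation $\|\mathcal{S}\| = \sum_{B \in \mathcal{S}} |B|$ for a collection of sets $\mathcal{S}$, we have

$$\mathbb{E}(\|\mathcal{F}(\vec\omega)\|) = \mathbb{E}\left( \int_F \Lambda_{\vec\omega}(g)\, dm_L(g) \right) \ge \gamma(\delta, C) \left| \bigcup_{j=1}^{N} A_j \right|,$$

where γ(δ, C) = δ/(1 + Cδ).

(We use probabilistic notation for (Ω,P); $\mathbb{E}$ denotes expectation or conditional expectation. $\mathbb{E}(\cdot \mid A)$ is the value taken on A by the conditional expectation with respect to the sub-σ-algebra $\{\emptyset, A, A^c, \Omega\}$.)

Intuitively, condition 2 of the lemma says that “on average” the subcollections $\mathcal{F}(\vec\omega)$ are almost disjoint, while condition 3 guarantees that a typical $\mathcal{F}(\vec\omega)$ will cover a large enough portion of F.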

3.3.1 Proof sketch of Lemma 5 for G discrete

For clarity, it is instructive first to consider the case where G is a discrete group. Take the sample space Ω to be

$$\Omega = \left\{ \vec\omega = \{\omega(j, a)\}_{1 \le j \le N,\ a \in G} \ \middle|\ \omega(j, a) \in \{0, 1\}\ \forall\, j, a \right\}.$$

(One may think of Ω as a coin-flipping experiment indexed by N copies of G.) Let P be the probability measure on Ω such that

$$P\left( \{ \vec\omega \mid \omega(j_0, a_0) = 1 \} \right) = \frac{\delta}{|F_{j_0}|} =: p_{j_0}$$

for all $j_0, a_0$ (taking the σ-algebra generated by all ω(j, a) as the domain); this makes each ω(j, a) an independent {0, 1}-valued random variable on (Ω,P) that is 1 with probability $p_j$.


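Given $\vec\omega = \{\omega(j, a)\} \in \Omega$, $\mathcal{F}(\vec\omega)$ is given by the following N-step recursive algorithm:

Algorithm A:

1. Begin by setting j = N, and define sets $A_{i|N+1} = A_i$ for 1 ≤ i ≤ N.

2. Set $\Sigma_j = \{ a \in A_{j|j+1} \mid \omega(j, a) = 1 \}$ and $\mathcal{F}_j(\vec\omega) = \{ F_j a \mid a \in \Sigma_j \}$.

3. For all i < j, define $A_{i|j} = \{ a \in A_{i|j+1} \mid F_i a \cap F_j \Sigma_j = \emptyset \}$.

4. If j > 1, return to step 2 and replace j with j − 1; if j = 1, proceed to step 5.

5. Set $\mathcal{F}(\vec\omega) = \bigcup_{j=1}^{N} \mathcal{F}_j(\vec\omega)$.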
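The recursion in Algorithm A can be sketched in code for the simplest discrete group, G = Z written additively (so the translate $F_j a$ becomes $F_j + a$). This is an illustration under our own conventions (0-based indices, function names, a seeded `random.Random` standing in for the coin flips ω(j, a)); it is not code from [1].

```python
import random


def algorithm_a(F, A, delta, rng):
    """Randomly select translates F_j + a, a in A_j, following Algorithm A for G = Z.

    F, A: lists of finite sets of integers; list index j-1 plays the role of j.
    Returns a list of pairs (j, a), one for each selected translate F[j] + a.
    """
    N = len(F)
    admissible = [set(Aj) for Aj in A]           # step 1: A_{i|N+1} = A_i
    chosen = []
    for j in range(N - 1, -1, -1):               # stages j = N, N-1, ..., 1
        p = delta / len(F[j])                    # P(omega(j, a) = 1) = delta / |F_j|
        # step 2: keep each admissible translate independently with probability p
        sigma = {a for a in sorted(admissible[j]) if rng.random() < p}
        chosen.extend((j, a) for a in sigma)
        covered = {f + a for f in F[j] for a in sigma}    # the set F_j Sigma_j
        # step 3: smaller indices lose every translate that meets F_j Sigma_j
        for i in range(j):
            admissible[i] = {a for a in admissible[i]
                             if not any((f + a) in covered for f in F[i])}
    return chosen                                # step 5: union of all stages


def counting_function(F, chosen, g):
    """Lambda(g): number of selected translates that contain g."""
    return sum(1 for j, a in chosen if (g - a) in F[j])


rng = random.Random(0)
F = [set(range(4)), set(range(8)), set(range(16))]
A = [set(range(100)) for _ in F]
chosen = algorithm_a(F, A, delta=0.5, rng=rng)
print(len(chosen), max(counting_function(F, chosen, g) for g in range(120)))
```

Translates chosen at different stages are disjoint by construction; translates chosen at the same stage may still overlap, which is exactly what condition 2 of the lemma controls on average.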

The sets $A_{i|j} = A_{i|j}(\vec\omega)$, 1 ≤ i < j ≤ N, should be regarded as encoding the “admissible” translates of $F_i$ at stage j of the recursion; their definition guarantees that the collections $\mathcal{F}_j(\vec\omega)$ are mutually pairwise disjoint. Note that each $A_{i|j}$ only depends on the σ-algebra $\Phi_j$ generated by the random variables ω(j′, a′) for j′ ≥ j and a′ ∈ G. Similarly, the algorithm enjoys a useful recursive property: once $\mathcal{F}_N(\vec\omega)$ (or equivalently $\Phi_N$) is given, the random collection $\bigcup_{j=1}^{N-1} \mathcal{F}_j(\vec\omega)$ has the same distribution (i.e. pushforward measure) as that obtained by running the original algorithm on the sequence $\{F_j\}_{j=1}^{N-1}$ and the sets $A_{j|N}$, 1 ≤ j ≤ N − 1.

Condition 1 of the lemma is easily verified. To prove condition 2, we write $\Lambda_{\vec\omega}(g) = \sum_{j=1}^{N} \Lambda_j^{\vec\omega}(g)$, where

$$\Lambda_j^{\vec\omega}(g) = \sum_{B \in \mathcal{F}_j(\vec\omega)} 1_B(g)$$

is the counting function of $\mathcal{F}_j(\vec\omega)$. Now since the collections $\mathcal{F}_j(\vec\omega)$ are mutually disjoint, it suffices to show that $\mathbb{E}(\Lambda_j^{\vec\omega}(g) \mid \Lambda_j^{\vec\omega}(g) \ge 1) \le 1 + \delta$ for all j. But, viewing $\Lambda_j^{\vec\omega}(g)$ as a random variable on Ω, once $\Phi_{j+1}$ is given, we can write

$$\Lambda_j^{\vec\omega}(g) = \sum_{a \in A_{j|j+1} \cap F_j^{-1} g} \omega(j, a),$$

which is a sum of fewer than $|F_j|$ i.i.d. {0, 1}-valued random variables that are 1 with probability $p_j$. From here calculation yields

$$\mathbb{E}(\Lambda_j^{\vec\omega}(g) \mid \Lambda_j^{\vec\omega}(g) \ge 1) = \mathbb{E}\left( \mathbb{E}(\Lambda_j^{\vec\omega}(g) \mid \Lambda_j^{\vec\omega}(g) \ge 1;\ \Phi_{j+1}) \ \middle|\ \Lambda_j^{\vec\omega}(g) \ge 1 \right) \le 1 + p_j |F_j| = 1 + \delta.$$


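Condition 3 is proved by induction on N; it is easy for N = 1. By the recursive property of Algorithm A mentioned above, if the statement holds for N − 1 we obtain

$$\begin{aligned}
\mathbb{E}\left( \|\mathcal{F}(\vec\omega)\| \mid \Phi_N \right)
&\ge \|\mathcal{F}_N(\vec\omega)\| + \gamma(\delta, C) \left| \bigcup_{j=1}^{N-1} A_{j|N}(\vec\omega) \right| \\
&\ge \|\mathcal{F}_N(\vec\omega)\| + \gamma(\delta, C) \left( \left| \bigcup_{j=1}^{N-1} A_j \right| - C |F_N| \cdot |\Sigma_N(\vec\omega)| \right) \\
&= \|\mathcal{F}_N(\vec\omega)\| + \gamma(\delta, C) \left( \left| \bigcup_{j=1}^{N-1} A_j \right| - C \|\mathcal{F}_N(\vec\omega)\| \right),
\end{aligned}$$

where the second inequality follows from the Shulman condition and the fact that

$$\bigcup_{j=1}^{N-1} A_{j|N} \supset \bigcup_{j=1}^{N-1} A_j \setminus \left( \bigcup_{j=1}^{N-1} F_j^{-1} F_N \right) \Sigma_N.$$

Taking the expectation of both sides and computing $\mathbb{E}(\|\mathcal{F}_N(\vec\omega)\|) = \delta |A_N|$ yields the claim.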

3.3.2 Proof sketch of Lemma 5 for general G

For G a general locally compact, second countable group, we take

$$\Omega = \left\{ \vec\omega = \{\Gamma_j\}_{j=1}^{N} \ \middle|\ \Gamma_j \subset G \text{ locally finite } \forall\, j \right\},$$

where the random subsets $\Gamma_j$ are chosen independently according to Poisson processes on G with respect to the rescaled right Haar measures $\frac{\delta}{|F_j|}\, dm_R$ (this reduces to our original (Ω,P) when G is discrete). We then make the natural modification of step 2 in Algorithm A:

2’. Set $\Sigma_j = \Gamma_j \cap A_{j|j+1}$ and $\mathcal{F}_j(\vec\omega) = \{ F_j a \mid a \in \Sigma_j \}$.

Once we have constructed Ω and the map $\mathcal{F}$, the proof proceeds much as in the discrete case, up to some minor modifications involving the basic properties of a Poisson process and the relationship between the left and right Haar measures on G. NB: Amenability is not necessary for Lemma 5.


3.4 Maximal inequality and pointwise ergodic theorem

Definition 6. Given a sequence Fn of finite-measure subsets of G, the maximal function of f ∈ L1(G) is

$$M[f](x) = \sup_n A(F_n, f)(x).$$

Theorem 7 (Weak (1,1) maximal inequality). Let G ↷ (X,M, µ) as above with G amenable, and let Fn ⊂ G be a tempered sequence of compact sets. Then the associated maximal operator M is weak-type (1, 1); i.e., there exists c > 0 such that

$$\mu\{ x \mid M[f](x) > \lambda \} \le \frac{c}{\lambda} \|f\|_1.$$

The constant c depends on Fn but not on X.

Proof sketch: Fix n and ε > 0. Set $D_n = \{ x \mid \max_{1 \le j \le n} A(F_j, f)(x) > \lambda \}$. By the amenability of G, we can choose some large compact set F′ ⊂ G such that $F := \bigcup_{j=1}^{n} F_j F'$ has measure |F| ≤ (1 + ε)|F′|. The first step is to prove the “group-side maximal inequality”

$$\int_{F'} 1_{D_n}(gx)\, dm_L(g) \le \frac{c}{\lambda} \int_F |f(gx)|\, dm_L(g) \tag{2}$$

for all x ∈ X and c = 2/γ(1, C), with C as in the Shulman condition (1).

Fix x and set $A_j = \{ g \in F' \mid A(F_j g, f)(x) > \lambda \}$ for 1 ≤ j ≤ n. We will apply Lemma 5 to these sets with δ = 1. Let $\mathcal{F}(\vec\omega)$ be a random collection of subsets of F as in the lemma with counting function $\Lambda_{\vec\omega}(g)$. Firstly, the definition of the $A_j$ implies the inequality

$$\lambda |F_j a| \le \int_{F_j a} f(gx)\, dm_L(g)$$

for all $a \in A_j$. Combining this with condition 3 of Lemma 5 yields

$$\begin{aligned}
\lambda \gamma(1, C) \int_{F'} 1_{D_n}(gx)\, dm_L(g)
&= \lambda \gamma(1, C) \left| \bigcup_{j=1}^{n} A_j \right| \\
&\le \lambda\, \mathbb{E}\left( \|\mathcal{F}(\vec\omega)\| \right) \\
&\le \mathbb{E}\left( \int_F \Lambda_{\vec\omega}(g) f(gx)\, dm_L(g) \right).
\end{aligned} \tag{3}$$


But condition 2 of Lemma 5 can be used to get

$$\mathbb{E}\left( \int_F \Lambda_{\vec\omega}(g) f(gx)\, dm_L(g) \right) \le 2 \int_F |f(gx)|\, dm_L(g). \tag{4}$$

Combining (3) and (4) yields (2), and from there integration yields

$$\mu(D_n) \le \frac{c}{\lambda} (1 + \varepsilon) \int_X |f(x)|\, d\mu(x).$$

Letting n → ∞ and ε → 0 yields the maximal inequality.

The main theorem 4 now follows from the maximal inequality along the standard lines (see e.g. Chapters 5 and 6 of [3] or pp. 92-93 of [2]). It is worth noting, however, that the amenability of G comes into play again when showing that the theorem holds for the coboundaries h(gx) − h(x) for g ∈ G and h ∈ L∞(G); we need to use the fact that Fn is a Følner sequence.

3.5 Example: The lamplighter group

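The lamplighter group G is the wreath product

$$G = \mathbb{Z}_2 \wr \mathbb{Z} := \left( \bigoplus_{i \in \mathbb{Z}} \mathbb{Z}_2 \right) \rtimes_\sigma \mathbb{Z},$$

where σ is the right shift action of Z on $\bigoplus \mathbb{Z}_2$. G is well-known to be an amenable group of exponential growth.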

Theorem 8. For all C > 0 there are a finite K ⊂ G and δ > 0 such that if A is (K, δ)-invariant and $C^{-1}|A| \le |B| \le |A|$, then $|B^{-1}A| \ge C|A|$.

We omit the proof of this theorem, from which we immediately obtain:

Corollary 9. The lamplighter group has no Følner sequences satisfying the Tempelman condition

$$|F_n^{-1} F_n| \lesssim |F_n|.$$

Corollary 10. If Fn is a tempered Følner sequence for the lamplighter group,

$$\frac{|F_{n+1}|}{|F_n|} \to \infty.$$

In particular, the growth of |Fn| is super-exponential.


Corollary 9 reveals the limited scope of earlier ergodic theorems along similar lines that require the Tempelman condition (see e.g. section 6.3 of [3]). However, Corollary 10 shows that Theorem 4 only guarantees the convergence of ergodic averages taken along a potentially “sparsely spaced” sequence of Følner sets; moreover, it is conjectured that the corollary should hold for any group of exponential growth.

References

[1] Lindenstrauss, E. 2001. Pointwise theorems for amenable groups. Invent. Math. 146, no. 2, 259-295.

[2] Petersen, K. 1983. Ergodic theory. Cambridge, UK: Cambridge UP.

[3] Tempelman, A. 1992. Ergodic theorems for group actions: informational and thermodynamical aspects. Dordrecht, The Netherlands: Kluwer Academic Publishers.

S. Zubin Gautam, UCLA

email: [email protected]


4 Arithmetic progressions in primes I

after B. Green and T. Tao [3]

A summary written by Alexander Gorodnik¹

4.1 Introduction

This chapter contains the first part of an exposition of

Theorem 1 (Green, Tao [3]). Let A be a subset of positive upper density in the set P of prime numbers, i.e.,

$$\limsup_{N \to \infty} \frac{|A \cap [1, N]|}{|P \cap [1, N]|} > 0.$$

Then for any k ≥ 3, A contains an arithmetic progression of length k.

We start our discussion with the celebrated theorem of Szemeredi, which can be stated in several equivalent forms (see Section 4.1.1 below for basic notation):

Theorem 2 (Szemeredi).

1. Let A be a set of positive integers such that

$$\limsup_{N \to \infty} \frac{1}{N} |A \cap [1, N]| > 0.$$

Then for any k ≥ 3, A contains an arithmetic progression of length k.

2. Given k ≥ 3 and δ > 0, there exists N0 = N0(k, δ) > 0 such that for every N > N0 and every A ⊂ [1, N] with |A| > δN, the set A contains an arithmetic progression of length k.

3. Given k ≥ 3 and δ > 0, there exists c = c(k, δ) > 0 such that for a function $f : \mathbb{Z}_N \to \mathbb{R}$ satisfying 0 ≤ f ≤ 1 and $\mathbb{E}(f) \ge \delta$, we have

$$\mathbb{E}\big( f(x) f(x + r) \cdots f(x + (k-1)r) \mid x, r \in \mathbb{Z}_N \big) \ge c$$

for sufficiently large N.

¹ The author is partially supported by NSF 0400631.


Although there are explicit estimates on the constant N0 in Theorem 2(2), the set P of prime numbers is too sparse to deduce the existence of arithmetic progressions in P directly using known estimates. Recall that according to the prime number theorem,

$$|P \cap [1, N]| \sim \frac{N}{\log N} \quad \text{as } N \to \infty.$$

The starting point of the Green-Tao argument is to consider P as a subset of the set $P_R$ of almost primes, which consists of numbers all of whose prime factors are at least R. When $R = N^\alpha$ for small α > 0, the set $P_R$ is relatively well understood (for example, one can prove Theorem 1 for $P_R$ using sieve methods), and by Mertens’ theorem,

$$\limsup_{N \to \infty} \frac{|P \cap [1, N]|}{|P_R \cap [1, N]|} > 0.$$

The proof of Theorem 1 uses information about the structure of the set of almost primes, ingeniously combined with a positive density argument as in Theorem 2. In fact, it is proved that a subset of positive upper density in a so-called pseudorandom set, which will be defined in Section 4.1.2, contains arbitrarily long arithmetic progressions.

Theorem 3 (relative Szemeredi theorem). Fix k ≥ 3 and δ > 0, and let $\nu : \mathbb{Z}_N \to \mathbb{R}^+$ be a k-pseudorandom function such that $\mathbb{E}(\nu) = 1 + o(1)$. Then for any function $f : \mathbb{Z}_N \to \mathbb{R}^+$ satisfying 0 ≤ f ≤ ν and $\mathbb{E}(f) \ge \delta$, we have

$$\mathbb{E}\big( f(x) f(x + r) \cdots f(x + (k-1)r) \mid x, r \in \mathbb{Z}_N \big) \ge c - o_{k,\delta}(1) \tag{1}$$

as N → ∞, where c > 0 is the same as in Theorem 2(3).

At this stage, we allow the reader to think naively that ν is the normalized characteristic function of the set of almost primes in ZN and f is the normalized characteristic function of the set of primes in ZN. However, to give a rigorous argument, one needs to consider “smoothened” versions of these functions and to eliminate the irregularities coming from congruence properties of primes. The construction of functions ν and f for which Theorem 3 implies Theorem 1 will be given in the following chapter. In this chapter, we discuss the proof of Theorem 3.


Sketch of the proof of Theorem 3. The main ingredient of the proof is Theorem 8 below, which implies that any nonnegative function f which is majorated by a k-pseudorandom ν has a decomposition

$$f = f_U + f_{U^\perp} + E \tag{2}$$

with the error term E satisfying

$$E \ge 0 \quad \text{and} \quad \mathbb{E}(E) = o(1). \tag{3}$$

Decomposition (2) is analogous to the ordinary Koopman–von Neumann decomposition (see Remark 4), and it has the following properties:

$$f_U + f_{U^\perp} \ge 0, \tag{4}$$

$$0 \le f_{U^\perp} \le 1 + o(1), \tag{5}$$

$$\mathbb{E}(f) = \mathbb{E}(f_{U^\perp}) + o(1), \tag{6}$$

$$\mathbb{E}\big( f_0(x) f_1(x + r) \cdots f_{k-1}(x + (k-1)r) \mid x, r \in \mathbb{Z}_N \big) \text{ is small}, \tag{7}$$

where each $f_i$ is either $f_U$ or $f_{U^\perp}$, and $f_i \ne f_{U^\perp}$ for some i.

Now it follows from (3) and (4) that

$$\mathbb{E}\big( f(x) \cdots f(x + (k-1)r) \mid x, r \in \mathbb{Z}_N \big) \ge \mathbb{E}\big( (f_U + f_{U^\perp})(x) \cdots (f_U + f_{U^\perp})(x + (k-1)r) \mid x, r \in \mathbb{Z}_N \big).$$

Because of (7), it suffices to prove a lower estimate for

$$\mathbb{E}\big( f_{U^\perp}(x) f_{U^\perp}(x + r) \cdots f_{U^\perp}(x + (k-1)r) \mid x, r \in \mathbb{Z}_N \big).$$

Using that the function $f_{U^\perp}$ satisfies (5) and (6), this lower estimate follows from the Szemeredi theorem (Theorem 2(3)).

Remark 4 (comparison with ergodic theory). When k = 3, decomposition (2) is a finitary analog of the Koopman–von Neumann decomposition of $L^2(X)$, defined for a probability measure-preserving system (X,B, µ, T), into weakly mixing and compact (almost periodic) parts. Note that the latter decomposition can be used to show that for any nonnegative $f \in L^\infty(X)$ with $\int_X f\, d\mu > 0$, one has

$$\liminf_{N \to \infty} \frac{1}{N} \sum_{i=0}^{N-1} \int_X f(x) f(T^i x) f(T^{2i} x)\, d\mu(x) > 0.$$


This implies the Szemeredi theorem (Theorem 2) with k = 3.

For general k, similar decompositions appeared in Furstenberg’s proof of the Szemeredi theorem [1] and in the works of Host, Kra [2] and Ziegler [4]. These decompositions are of the form

$$f = f_U + f_{U^\perp} \quad \text{with} \quad f_U = f - \mathbb{E}(f \mid \mathcal{B}') \ \text{ and } \ f_{U^\perp} = \mathbb{E}(f \mid \mathcal{B}'),$$

where $\mathcal{B}'$ is a T-invariant sub-σ-algebra of $\mathcal{B}$, such that

$$\lim_{N \to \infty} \frac{1}{N} \sum_{i=0}^{N-1} \int_X f_0(x) f_1(T^i x) \cdots f_{k-1}(T^{(k-1)i} x)\, d\mu(x) = 0,$$

where each $f_i$ is either $f_U$ or $f_{U^\perp}$, and $f_i \ne f_{U^\perp}$ for some i, and the average

$$\frac{1}{N} \sum_{i=0}^{N-1} \int_X f_{U^\perp}(x) f_{U^\perp}(T^i x) \cdots f_{U^\perp}(T^{(k-1)i} x)\, d\mu(x)$$

can be analyzed using some additional structure.

4.1.1 Notation

• k ≥ 3 is the length of the arithmetic progression (fixed).

• N is a large prime number (N → ∞).

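• o(1) and O(1) denote quantities that go to zero and stay bounded, respectively, as N → ∞.

• ZN is the field of residues mod N.

• For a function $f : \mathbb{Z}_N^l \to \mathbb{R}$ and $A \subset \mathbb{Z}_N^l$,

$$\mathbb{E}(f(x) \mid x \in A) := \frac{1}{|A|} \sum_{x \in A} f(x) \quad \text{and} \quad \|f\|_{L^q} := \mathbb{E}\big( |f(x)|^q \mid x \in \mathbb{Z}_N^l \big)^{1/q}.$$

In particular, $\mathbb{E}(f) := \mathbb{E}(f(x) \mid x \in \mathbb{Z}_N^l)$. Also, $\| \cdot \|_{L^\infty}$ denotes the sup-norm.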


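• Given a σ-algebra $\mathcal{B}$ on $\mathbb{Z}_N$, we define the conditional expectation $\mathbb{E}(f \mid \mathcal{B})$, $f \in L^2(\mathbb{Z}_N)$, to be the orthogonal projection on the space of $\mathcal{B}$-measurable functions, i.e.,

$$\mathbb{E}(f \mid \mathcal{B})(x) = \mathbb{E}(f(y) \mid y \in \mathcal{B}(x)),$$

where $\mathcal{B}(x)$ denotes the atom of the σ-algebra $\mathcal{B}$ containing x.

• For σ-algebras $\mathcal{B}_1$ and $\mathcal{B}_2$, $\mathcal{B}_1 \vee \mathcal{B}_2$ denotes the smallest σ-algebra containing both $\mathcal{B}_1$ and $\mathcal{B}_2$.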

4.1.2 Pseudorandom measures

A function ν : ZN → R+ is called a measure if

E(ν) = 1 + o(1).

A measure ν satisfies the $(m_0, t_0, L_0)$-linear form condition if for any $m \le m_0$, $t \le t_0$, and linear forms

$$\psi_i(x) = \sum_{j=1}^{t} L_{ij} x_j + b_i, \qquad i = 1, \ldots, m,$$

where the $L_{ij}$ are rational numbers² with numerator and denominator bounded by $L_0$ in absolute value and $b_i \in \mathbb{Z}_N$, such that the t-tuples $(L_{ij};\ j = 1, \ldots, t)$ are not zero and not rational multiples of each other, we have

$$\mathbb{E}\big( \nu(\psi_1(x)) \cdots \nu(\psi_m(x)) \mid x \in \mathbb{Z}_N^t \big) = 1 + o_{m_0, t_0, L_0}(1).$$

Roughly speaking, the measure ν will be supported on almost primes and the linear form condition says that the events “$\psi_j(x)$ is almost prime” are essentially independent.

The linear form condition is motivated by the Hardy-Littlewood prime tuples conjecture. Let Λ(n) denote the Mangoldt function, which is equal to log p if n is a power of a prime p and zero otherwise, and let $\Lambda_p(n)$ denote the local Mangoldt function, which is equal to $\frac{p}{p-1}$ if (n, p) = 1 and zero otherwise. Let the $\psi_i$’s be linear forms as above with positive $L_{ij}$ and $b_i$.

² Recalling that N is a large prime, we can view the $L_{ij}$’s as elements of $\mathbb{Z}_N$.


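Conjecture 1 (Hardy-Littlewood prime tuple conjecture).

$$\mathbb{E}\big( \Lambda(\psi_1(x)) \cdots \Lambda(\psi_m(x)) \mid x \in \mathbb{Z}_N^t \big) = \prod_p \alpha_p + o_{m_0, t_0, L_0}(1),$$

where

$$\alpha_p = \mathbb{E}\big( \Lambda_p(\psi_1(x)) \cdots \Lambda_p(\psi_m(x)) \mid x \in \mathbb{Z}_p^t \big).$$

The linear form condition is analogous to the following property that holds for any weakly mixing probability measure-preserving system (X,B, µ, T): for any $f_1, \ldots, f_m \in L^\infty(X)$,

$$\frac{1}{N} \sum_{i=0}^{N-1} \int_X f_1(T^i x) \cdots f_m(T^{mi} x)\, d\mu(x) \to \left( \int_X f_1\, d\mu \right) \cdots \left( \int_X f_m\, d\mu \right). \tag{8}$$

A measure ν satisfies the $m_0$-correlation condition if for every $m = 2, \ldots, m_0$, there exists a function $\tau = \tau_m : \mathbb{Z}_N \to \mathbb{R}^+$ such that

$$\mathbb{E}(\tau^q) = O_{m,q}(1)$$

for all q ≥ 1 and

$$\mathbb{E}\big( \nu(x + h_1) \cdots \nu(x + h_m) \mid x \in \mathbb{Z}_N \big) \le \sum_{1 \le i < j \le m} \tau(h_i - h_j)$$

for all $h_1, \ldots, h_m \in \mathbb{Z}_N$.

A measure ν is called k-pseudorandom if it satisfies the $(k 2^{k-1}, 3k - 4, k)$-linear form condition and the $2^{k-1}$-correlation condition.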

4.2 Tools

4.2.1 Gowers uniformity norms

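We define the d-th Gowers uniformity norm $\| \cdot \|_{U^d}$ inductively. Denote by $T_h$ the shift operator: $(T_h f)(x) = f(x + h)$. For $f : \mathbb{Z}_N \to \mathbb{R}$, we set

$$\|f\|_{U^1} = |\mathbb{E}(f(x) \mid x \in \mathbb{Z}_N)|,$$

$$\|f\|_{U^d} = \mathbb{E}\left( \|f \cdot (T_h f)\|_{U^{d-1}}^{2^{d-1}} \ \middle|\ h \in \mathbb{Z}_N \right)^{1/2^d}.$$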


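Explicitly,

$$\|f\|_{U^d} = \mathbb{E}\left( \prod_{\omega \in \{0,1\}^d} f(x + \omega \cdot h) \ \middle|\ x \in \mathbb{Z}_N,\ h \in \mathbb{Z}_N^d \right)^{1/2^d}.$$

One can show that for d ≥ 2, $\| \cdot \|_{U^d}$ is a genuine norm.

Given a family of functions $f_\omega$, $\omega \in \{0,1\}^d$, we define the d-dimensional Gowers inner product:

$$\langle (f_\omega)_{\omega \in \{0,1\}^d} \rangle_{U^d} = \mathbb{E}\left( \prod_{\omega \in \{0,1\}^d} f_\omega(x + \omega \cdot h) \ \middle|\ x \in \mathbb{Z}_N,\ h \in \mathbb{Z}_N^d \right).$$

Then we have the Gowers Cauchy-Schwarz inequality:

$$|\langle (f_\omega)_{\omega \in \{0,1\}^d} \rangle_{U^d}| \le \prod_{\omega \in \{0,1\}^d} \|f_\omega\|_{U^d}.$$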
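The agreement between the inductive definition and the explicit formula can be checked by brute force for small N and d. The following sketch uses our own function names and a sample real-valued f; it is only meant as a sanity check of the two formulas, not as an efficient implementation.

```python
from itertools import product


def gowers_norm_explicit(f, d):
    """U^d norm via the average over x in Z_N, h in Z_N^d of prod_omega f(x + omega.h)."""
    N = len(f)
    total = 0.0
    for x in range(N):
        for h in product(range(N), repeat=d):
            term = 1.0
            for omega in product((0, 1), repeat=d):
                shift = sum(o * hi for o, hi in zip(omega, h))
                term *= f[(x + shift) % N]
            total += term
    mean = total / N ** (d + 1)
    # The mean is nonnegative in exact arithmetic; guard against rounding error.
    return max(mean, 0.0) ** (1.0 / 2 ** d)


def gowers_norm_inductive(f, d):
    """U^d norm via ||f||_{U^d}^{2^d} = E_h ||f * T_h f||_{U^{d-1}}^{2^{d-1}}."""
    N = len(f)
    if d == 1:
        return abs(sum(f) / N)
    acc = 0.0
    for h in range(N):
        g = [f[x] * f[(x + h) % N] for x in range(N)]
        acc += gowers_norm_inductive(g, d - 1) ** (2 ** (d - 1))
    return (acc / N) ** (1.0 / 2 ** d)


f = [1.0, -1.0, 2.0, 0.0, 3.0]
for d in (1, 2, 3):
    print(d, gowers_norm_explicit(f, d), gowers_norm_inductive(f, d))
```

The explicit version costs on the order of $N^{d+1} 2^d$ operations, so it is only practical for very small parameters.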

Gowers uniformity norms can be used to control the averages as in (1):

Theorem 5 (generalized von Neumann theorem). For a k-pseudorandom measure $\nu : \mathbb{Z}_N \to \mathbb{R}^+$ and functions $f_0, \ldots, f_{k-1} : \mathbb{Z}_N \to \mathbb{R}$ such that $|f_i| \le \nu + 1$, we have

$$\mathbb{E}\left( \prod_{i=0}^{k-1} f_i(x + ih) \ \middle|\ x, h \in \mathbb{Z}_N \right) = O\left( \min_{0 \le i \le k-1} \|f_i\|_{U^{k-1}} \right) + o(1).$$

Theorem 5 is a finitary analog of (8). The proof uses a van der Corput type argument and the linear form condition.

We call a function f Gowers uniform if $\|f\|_{U^{k-1}}$ is small. Theorem 5 shows that if at least one of the $f_i$’s is a Gowers uniform function, the average $\mathbb{E}\left( \prod_{i=0}^{k-1} f_i(x + ih) \mid x, h \in \mathbb{Z}_N \right)$ is negligible. This is used to arrange (7).

4.2.2 Obstructions to uniformity and dual functions

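For a function $F : \mathbb{Z}_N \to \mathbb{R}$, define the dual function of F:

$$\mathcal{D}F(x) = \mathbb{E}\left( \prod_{\omega \in \{0,1\}^{k-1} :\ \omega \ne 0^{k-1}} F(x + \omega \cdot h) \ \middle|\ h \in \mathbb{Z}_N^{k-1} \right).$$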


Note that

$$\langle F, \mathcal{D}F \rangle = \|F\|_{U^{k-1}}^{2^{k-1}}. \tag{9}$$

Hence, if the Gowers norm of F is large, F correlates with its dual function, and the dual functions provide obstructions to uniformity.
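Identity (9) is a direct unwinding of the definitions. As a short worked derivation (assuming, as in Section 4.1.1, that the inner product is the normalized one, $\langle f, g \rangle = \mathbb{E}(f(x) g(x) \mid x \in \mathbb{Z}_N)$), the factor $F(x)$ supplies the missing corner $\omega = 0^{k-1}$ of the cube:

```latex
\begin{aligned}
\langle F, \mathcal{D}F \rangle
 &= \mathbb{E}\Big( F(x)\,
    \mathbb{E}\Big( \prod_{\omega \in \{0,1\}^{k-1},\ \omega \neq 0^{k-1}}
      F(x + \omega \cdot h) \ \Big|\ h \in \mathbb{Z}_N^{k-1} \Big)
    \ \Big|\ x \in \mathbb{Z}_N \Big) \\
 &= \mathbb{E}\Big( \prod_{\omega \in \{0,1\}^{k-1}} F(x + \omega \cdot h)
    \ \Big|\ x \in \mathbb{Z}_N,\ h \in \mathbb{Z}_N^{k-1} \Big) \\
 &= \|F\|_{U^{k-1}}^{2^{k-1}} .
\end{aligned}
```

The last equality is the explicit formula for the Gowers norm with d = k − 1.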

The following two properties of dual functions are used in the proof.

Proposition 6. For a function $F : \mathbb{Z}_N \to \mathbb{R}$ such that $|F| \le \nu + 1$, we have

$$\|\mathcal{D}F\|_{L^\infty} \le 2^{2^{k-1} - 1} + o(1).$$

Proposition 6 is deduced from the linear form condition.

Proposition 7. Let $I = [-2^{2^k}, 2^{2^k}]$. Given functions $F_1, \ldots, F_n : \mathbb{Z}_N \to \mathbb{R}$ such that $|\mathcal{D}F_i| \le 2^{2^k}$ and a continuous function $\Phi : I^n \to \mathbb{R}$, we define

$$\psi(x) = \Phi(\mathcal{D}F_1(x), \ldots, \mathcal{D}F_n(x)).$$

Then

$$\langle \nu - 1, \psi \rangle = o_{n, \Phi}(1).$$

Proposition 7 is deduced from the correlation condition using the Gowers Cauchy-Schwarz inequality.

4.2.3 σ-Algebras generated by generalized Bohr sets

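Fix ε, η > 0. Given a function $G : \mathbb{Z}_N \to I := [-2^{2^k}, 2^{2^k}]$, one defines a σ-algebra $\mathcal{B}_{\varepsilon,\eta}(G)$ on $\mathbb{Z}_N$ that satisfies the following properties:

1. For any σ-algebra $\mathcal{B}$ on $\mathbb{Z}_N$,

$$\|G - \mathbb{E}(G \mid \mathcal{B} \vee \mathcal{B}_{\varepsilon,\eta}(G))\|_{L^\infty} \le \varepsilon. \tag{10}$$

2. The σ-algebra $\mathcal{B}_{\varepsilon,\eta}(G)$ is generated by at most O(1/ε) atoms.

3. If A is any atom of $\mathcal{B}_{\varepsilon,\eta}(G)$, then there exists a continuous function $\Psi_A : I \to [0, 1]$ such that

$$\|(1_A - \Psi_A(G))(\nu + 1)\|_{L^1} = O(\eta). \tag{11}$$

In fact, this implies that if $G_1, \ldots, G_n : \mathbb{Z}_N \to I$ and A is an atom of $\mathcal{B}_{\varepsilon,\eta}(G_1) \vee \cdots \vee \mathcal{B}_{\varepsilon,\eta}(G_n)$, then there exists a continuous function $\Psi_A : I^n \to [0, 1]$ such that

$$\|(1_A - \Psi_A(G_1, \ldots, G_n))(\nu + 1)\|_{L^1} = O_n(\eta). \tag{12}$$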


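Roughly speaking, the σ-algebra $\mathcal{B}_{\varepsilon,\eta}(G)$ is generated by the level sets of the function G: the atoms of $\mathcal{B}_{\varepsilon,\eta}(G)$ are $G^{-1}([\varepsilon(n + \alpha), \varepsilon(n + 1 + \alpha)))$, $n \in \mathbb{Z}$, for suitably chosen α.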
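The level-set description makes property (10) transparent: on each atom, G varies by less than ε, so averaging over the atom moves G by less than ε. The sketch below illustrates only this point, with the trivial σ-algebra in the role of $\mathcal{B}$ and an arbitrary shift α; it ignores the parameter η and the careful choice of α needed for (11). All names are ours.

```python
import math


def level_set_atoms(G, eps, alpha=0.0):
    """Partition {0, ..., N-1} into the nonempty sets G^{-1}([eps(n+alpha), eps(n+1+alpha)))."""
    atoms = {}
    for x, value in enumerate(G):
        n = math.floor(value / eps - alpha)   # index of the level set containing value
        atoms.setdefault(n, []).append(x)
    return list(atoms.values())


def conditional_expectation(f, atoms):
    """E(f | B)(x): the average of f over the atom of B containing x."""
    out = [0.0] * len(f)
    for atom in atoms:
        mean = sum(f[x] for x in atom) / len(atom)
        for x in atom:
            out[x] = mean
    return out


G = [0.05, 0.12, 0.31, 0.33, 0.90, 0.91, 0.08, -0.42]
eps = 0.1
atoms = level_set_atoms(G, eps, alpha=0.3)
EG = conditional_expectation(G, atoms)
print(max(abs(g - e) for g, e in zip(G, EG)) <= eps)
```

Refining the partition by a further σ-algebra $\mathcal{B}$ only shrinks the atoms, so the same bound persists for $\mathcal{B} \vee \mathcal{B}_{\varepsilon,\eta}(G)$.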

4.3 Structure theorem

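Theorem 8 (generalized Koopman–von Neumann structure theorem). Let ν be a k-pseudorandom measure and $f : \mathbb{Z}_N \to \mathbb{R}$ such that 0 ≤ f ≤ ν. Let ε > 0 be a small parameter and $N > N_0(\varepsilon)$ sufficiently large. Then there exists a σ-algebra $\mathcal{B}$ and an exceptional set $\Omega \in \mathcal{B}$ such that

$$\mathbb{E}(1_\Omega \nu) = o_\varepsilon(1), \tag{13}$$

$$\|(1 - 1_\Omega)\, \mathbb{E}(\nu - 1 \mid \mathcal{B})\|_{L^\infty} = o_\varepsilon(1), \tag{14}$$

$$\|(1 - 1_\Omega)(f - \mathbb{E}(f \mid \mathcal{B}))\|_{U^{k-1}} \le \varepsilon^{1/2^k}. \tag{15}$$

Now setting

$$f_U = (1 - 1_\Omega)(f - \mathbb{E}(f \mid \mathcal{B})), \qquad f_{U^\perp} = (1 - 1_\Omega)\, \mathbb{E}(f \mid \mathcal{B}), \qquad E = 1_\Omega f,$$

we have decomposition (2) satisfying (3)–(7). Note that (3)–(6) follow directly from Theorem 8, and (7) is derived using Theorem 5.

Sketch of the proof of Theorem 8. In the proof, we use a parameter $\eta \to 0^+$.

First, we set $\mathcal{B}_0 = \{\emptyset, \mathbb{Z}_N\}$ and $\Omega_0 = \emptyset$. Then (13) and (14) obviously hold. If (15) fails, we set

$$F_1 := (1 - 1_{\Omega_0})(f - \mathbb{E}(f \mid \mathcal{B}_0)),$$

$$\mathcal{B}_1 := \mathcal{B}_0 \vee \mathcal{B}_{\varepsilon,\eta}(\mathcal{D}F_1),$$

and define the exceptional set $\Omega_1$ to be the union of the atoms $A \in \mathcal{B}_1$ such that $\mathbb{E}(1_A(\nu + 1)) \le \eta^{1/2}$. Then

$$\mathbb{E}(1_{\Omega_1}(\nu + 1)) = O_\varepsilon(\eta^{1/2}),$$

and using (11) and Proposition 7, one shows that

$$\|(1 - 1_{\Omega_1})\, \mathbb{E}(\nu - 1 \mid \mathcal{B}_1)\|_{L^\infty} = O_\varepsilon(\eta^{1/2}).$$

Continuing this procedure, we construct sequences of functions

$$F_n := (1 - 1_{\Omega_{n-1}})(f - \mathbb{E}(f \mid \mathcal{B}_{n-1})),$$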


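σ-algebras

$$\mathcal{B}_n := \mathcal{B}_{n-1} \vee \mathcal{B}_{\varepsilon,\eta}(\mathcal{D}F_n),$$

and exceptional sets $\Omega_n \in \mathcal{B}_n$ satisfying

$$\mathbb{E}(1_{\Omega_n}(\nu + 1)) = O_{n,\varepsilon}(\eta^{1/2}), \tag{16}$$

$$\|(1 - 1_{\Omega_n})\, \mathbb{E}(\nu - 1 \mid \mathcal{B}_n)\|_{L^\infty} = O_{n,\varepsilon}(\eta^{1/2}). \tag{17}$$

Note that one can check inductively that

$$\|(1 - 1_{\Omega_n})\, \mathbb{E}(f \mid \mathcal{B}_n)\|_{L^\infty} \le 1 + O_{n,\varepsilon}(\eta^{1/2}), \tag{18}$$

$$|F_n| \le (1 + O_{n,\varepsilon}(\eta^{1/2}))(\nu + 1). \tag{19}$$

By (19), (12) and Proposition 7 can be applied at every step to deduce (17). It remains to show that after finitely many steps, we get

$$\|F_n\|_{U^{k-1}} \le \varepsilon^{1/2^k}.$$

This follows from the following claim compared with estimate (18) when η is chosen sufficiently small.

Claim (energy increment property). If $\|F_n\|_{U^{k-1}} > \varepsilon^{1/2^k}$, then

$$\|(1 - 1_{\Omega_n})\, \mathbb{E}(f \mid \mathcal{B}_n)\|_{L^2}^2 > \|(1 - 1_{\Omega_{n-1}})\, \mathbb{E}(f \mid \mathcal{B}_{n-1})\|_{L^2}^2 + 2^{-2^k + 1} \varepsilon.$$

Heuristically, the claim follows from the observation that if $F_n$ is not Gowers uniform, then by (9) it has a nontrivial correlation with the dual function $\mathcal{D}F_n$.

Since the contribution of the exceptional sets can be controlled using (16), we sketch the proof assuming that the exceptional sets are empty. Using that by (9),

$$\langle F_n, \mathcal{D}F_n \rangle = \|F_n\|_{U^{k-1}}^{2^{k-1}} > \varepsilon^{1/2},$$

and by (10),

$$\langle F_n, \mathcal{D}F_n - \mathbb{E}(\mathcal{D}F_n \mid \mathcal{B}_n) \rangle = O(\varepsilon),$$

we deduce that

$$\langle F_n, \mathbb{E}(\mathcal{D}F_n \mid \mathcal{B}_n) \rangle = \langle f - \mathbb{E}(f \mid \mathcal{B}_{n-1}), \mathbb{E}(\mathcal{D}F_n \mid \mathcal{B}_n) \rangle > \varepsilon^{1/2} + O(\varepsilon).$$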


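In the last inequality, we can replace f by $\mathbb{E}(f \mid \mathcal{B}_n)$. Then by the Cauchy-Schwarz inequality and Proposition 6,

$$\|\mathbb{E}(f \mid \mathcal{B}_n) - \mathbb{E}(f \mid \mathcal{B}_{n-1})\|_{L^2} \cdot \left( 2^{2^{k-1} - 1} + O_{n,\varepsilon}(\eta^{1/2}) \right) > \varepsilon^{1/2} + O(\varepsilon).$$

Since $\mathbb{E}(f \mid \mathcal{B}_n) - \mathbb{E}(f \mid \mathcal{B}_{n-1}) \perp \mathbb{E}(f \mid \mathcal{B}_{n-1})$, the claim now follows from the Pythagoras theorem (with an appropriate choice of the parameter η).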

References

[1] H. Furstenberg, Recurrence in ergodic theory and combinatorial number theory. M. B. Porter Lectures. Princeton University Press, Princeton, N.J., 1981.

[2] B. Host and B. Kra, Nonconventional ergodic averages and nilmanifolds. Ann. of Math. (2) 161 (2005), no. 1, 397–488.

[3] B. Green and T. Tao, The primes contain arbitrary long arithmetic progressions. To appear in Ann. Math.

[4] T. Ziegler, Universal characteristic factors and Furstenberg averages. To appear in JAMS.

Alexander Gorodnik, CalTech

email: [email protected]


5 Polynomial extensions of van der Waerden’s and Szemeredi’s theorems

after V. Bergelson and A. Leibman [BL]

A summary written by Michael Johnson

Abstract

We summarize the main ideas of V. Bergelson and A. Leibman’s multidimensional polynomial versions of Szemeredi’s and van der Waerden’s theorems, emphasizing the use of PET-induction in both proofs.

5.1 Introduction

In 1976, H. Furstenberg proved the following ergodic version of Szemeredi’s theorem.

Theorem 1. [Fu1] Let (X,B, µ) be a probability space, and let T be a measure preserving transformation of X. Then for all A ∈ B with µ(A) > 0,

$$\liminf_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} \mu(T^{-n}A \cap \ldots \cap T^{-kn}A) > 0.$$

In 1996, V. Bergelson and A. Leibman generalized this result in followingmanner: first they allowed for polynomial powers on the transformations,and second, their result deals with a commuting group of invertible trans-formations. We call polynomials that take integer values on the integers,integer polynomials.

Theorem 2. [BL] Let (X,B, µ) be a probability space, let $T_1, \dots, T_l$ be commuting invertible measure preserving transformations of X, let $p_{i,j}$ be integer polynomials satisfying $p_{i,j}(0) = 0$ for all 1 ≤ i ≤ k, 1 ≤ j ≤ l, and let A ∈ B with µ(A) > 0. Then

\[
\liminf_{N\to\infty} \frac{1}{N} \sum_{n=0}^{N-1} \mu\Bigl( \bigcap_{i=1}^{k} T_1^{-p_{i,1}(n)} \cdots T_l^{-p_{i,l}(n)} A \Bigr) > 0.
\]

By translating Theorem 2 back to combinatorics, we get the following result as a corollary.

Theorem 3. [BL] Let $S \subseteq \mathbb{Z}^l$ have positive upper Banach density, and let $p_1, \dots, p_k$ be integer polynomials with $p_i(0) = 0$ for all i = 1, 2, . . . , k. Then for each finite subset $\{v_1, \dots, v_k\}$ of $\mathbb{Z}^l$, there exist n ∈ N and $u \in \mathbb{Z}^l$ such that $u + p_i(n) v_i \in S$ for all i = 1, 2, . . . , k.

The following topological version of Theorem 2 corresponds to the multidimensional polynomial version of van der Waerden’s theorem.

Theorem 4. [BL] Let (X, d) be a compact metric space, let $T_1, \dots, T_l$ be commuting homeomorphisms of X, and let $p_{i,j}$ be integer polynomials satisfying $p_{i,j}(0) = 0$ for all 1 ≤ i ≤ k, 1 ≤ j ≤ l. Then for every ε > 0, there exist x ∈ X and n ∈ N such that $d\bigl(T_1^{p_{i,1}(n)} \cdots T_l^{p_{i,l}(n)} x, x\bigr) < \varepsilon$ for all i = 1, 2, . . . , k.

In this summary, we present the main ideas in the proofs of Theorems 2 and 4, emphasizing the use of PET-induction in both proofs.

5.2 Polynomial Expressions

Expressions of the form $T_1^{p_1(n)} \cdots T_l^{p_l(n)}$ with each $p_i$ an integer polynomial of degree less than D are called polynomial expressions. We note that products and inverses of polynomial expressions are again polynomial expressions. Thus the set of all polynomial expressions, PE, forms a group.

We define the degree of a polynomial expression $T_1^{p_1(n)} \cdots T_l^{p_l(n)}$ to be $\max_{1 \le i \le l} \deg(p_i)$. Its weight is the pair (r, d) where $\deg(p_{r+1}) = \dots = \deg(p_l) = 0$ but $\deg(p_r) = d \ge 1$. We say (r, d) > (s, e) if r > s or if r = s and d > e.

Example 5. The polynomial expression $T_1^{n^2+n}\, T_2^{n^3-4n}\, T_3^{n^2+3n}\, T_4^{3n^2+n}\, T_5^{0}$ has degree 3 and weight (4, 2).

The expressions $T_1^{p_1(n)} \cdots T_l^{p_l(n)}$ and $T_1^{q_1(n)} \cdots T_l^{q_l(n)}$ are called equivalent if they have the same weight (r, d) and $\deg(p_r - q_r) < d$.

A finite subset of PE is called a system. The degree of a system is the maximal degree of its elements. For a system A with l transformations and degree D, we form the weight matrix

\[
\begin{pmatrix}
N_{1,1} & \dots & N_{1,D} \\
\vdots & & \vdots \\
N_{l,1} & \dots & N_{l,D}
\end{pmatrix}
\]

where $N_{r,d}$ is the number of equivalence classes in A whose weight is (r, d). We say the weight matrix M precedes the weight matrix N if for some (r, d), $M_{r,d} < N_{r,d}$, and $M_{r,e} = N_{r,e}$ for all e > d and $M_{s,e} = N_{s,e}$ for all s > r, e = 1, . . . , D.

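Example 6. The system

\[
\bigl\{\, T_1^{n^2+n} T_2^{3n},\;\; T_1^{3n^3+n} T_2^{0},\;\; T_1^{3n^3} T_2^{0},\;\; T_1^{0} T_2^{2n^2+2n},\;\; T_1^{4n} T_2^{n^2} \,\bigr\}
\]

has degree 3 and weight matrix

\[
\begin{pmatrix} 0 & 0 & 1 \\ 1 & 2 & 0 \end{pmatrix}.
\]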
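The bookkeeping behind Examples 5 and 6 can be checked mechanically. The following is a minimal sketch, not taken from [BL]: polynomials are stored as coefficient lists, and the helper names (deg, weight, equivalent, weight_matrix) are hypothetical choices made for this illustration.

```python
from itertools import zip_longest

# A polynomial is a list of integer coefficients [c0, c1, c2, ...].
# A polynomial expression T_1^{p_1(n)} ... T_l^{p_l(n)} is the list [p_1, ..., p_l].
# All function names below are illustrative, not notation from the source.

def deg(p):
    """Degree of a coefficient list; constants (including 0) get degree 0."""
    d = 0
    for i, c in enumerate(p):
        if c != 0:
            d = i
    return d

def weight(expr):
    """Weight (r, d): r is the largest index with deg(p_r) >= 1, and d = deg(p_r)."""
    for r in range(len(expr), 0, -1):
        d = deg(expr[r - 1])
        if d >= 1:
            return (r, d)
    return (0, 0)  # expression that does not depend on n

def equivalent(e1, e2):
    """Same weight (r, d) and deg(p_r - q_r) < d."""
    w1, w2 = weight(e1), weight(e2)
    if w1 != w2 or w1 == (0, 0):
        return w1 == w2
    r, d = w1
    diff = [a - b for a, b in zip_longest(e1[r - 1], e2[r - 1], fillvalue=0)]
    return deg(diff) < d

def weight_matrix(system, l, D):
    """Entry [r-1][d-1] counts the equivalence classes of weight (r, d)."""
    reps = []
    for e in system:
        if not any(equivalent(e, rep) for rep in reps):
            reps.append(e)
    M = [[0] * D for _ in range(l)]
    for e in reps:
        r, d = weight(e)
        if r >= 1:
            M[r - 1][d - 1] += 1
    return M

example5 = [[0, 1, 1], [0, -4, 0, 1], [0, 3, 1], [0, 1, 3], [0]]
example6 = [
    [[0, 1, 1], [0, 3]],     # T_1^{n^2+n}   T_2^{3n}
    [[0, 1, 0, 3], [0]],     # T_1^{3n^3+n}  T_2^0
    [[0, 0, 0, 3], [0]],     # T_1^{3n^3}    T_2^0
    [[0], [0, 2, 2]],        # T_1^0         T_2^{2n^2+2n}
    [[0, 4], [0, 0, 1]],     # T_1^{4n}      T_2^{n^2}
]

print(weight(example5))                    # (4, 2)
print(weight_matrix(example6, l=2, D=3))   # [[0, 0, 1], [1, 2, 0]]
```

In Example 6 the second and third expressions are equivalent (their difference in the first exponent has degree 1 < 3), while the last two are not (the difference of the exponents of T_2 still has degree 2), which is why the entry for weight (2, 2) equals 2.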

5.3 Theorem 4

We argue by PET-induction. Let $P = \{g_1(n), \dots, g_k(n)\}$ be a polynomial system with weight matrix A, where each $g_i(n) = T_1^{p_{i,1}(n)} \cdots T_l^{p_{i,l}(n)}$, and suppose that Theorem 4 holds for every system whose weight matrix precedes A. If we show that Theorem 4 holds for the system P, we are done. As an example, we prove Theorem 4 in the case where l = k = 1 and $p_{1,1}(n) = n^2$.

Let (X, d) be a compact metric space and T a homeomorphism of X. Without loss of generality, we assume that T is minimal. Let ε > 0. We need to find x ∈ X and n ∈ N such that $d(T^{n^2} x, x) < \varepsilon$. We will find a sequence of points $x_0, x_1, \dots$ and integers $n_1, n_2, \dots$ such that

\[
d\bigl(T^{(n_m + \dots + n_{l+1})^2} x_m, x_l\bigr) < \varepsilon/2 \quad \text{for all integers } 0 \le l < m.
\]

Here we state the appropriate PET-induction hypothesis, which is the linear topological version of van der Waerden’s theorem.

Proposition 7. [FW] Let (X, d) be a compact metric space and let T be a homeomorphism of X. Then for any ε > 0, any p ∈ N, and any $c_0, \dots, c_{p-1} \in \mathbb{Z}$, there exist x ∈ X and n ∈ N such that $d(T^{c_i n} x, x) < \varepsilon$ for all i = 0, . . . , p − 1. If (X, T) is minimal, the set of such x is dense in X.

Since X is compact, for some l < m we will have $d(x_m, x_l) < \varepsilon/2$. Thus we will have $d\bigl(T^{(n_m + \dots + n_{l+1})^2} x_m, x_m\bigr) < \varepsilon$.

Pick $x_0$ arbitrarily and set $n_1 = 1$ and $x_1 = T^{-n_1^2} x_0$. By continuity, choose $\varepsilon_1 < \varepsilon/2$ such that $d(T^{n_1^2} y, x_0) < \varepsilon/2$ for every y for which $d(y, x_1) < \varepsilon_1$.

Using Proposition 7 (with $\varepsilon = \varepsilon_1/2$, p = 1, and $c_0 = 2n_1$), find $y_1 \in X$, $n_2 \in \mathbb{N}$ such that $d(y_1, x_1) < \varepsilon_1/2$ and $d(T^{2n_1n_2} y_1, y_1) < \varepsilon_1/2$. Set $x_2 = T^{-n_2^2} y_1$. Thus we have

\[
d\bigl(T^{n_2^2} x_2, x_1\bigr) = d(y_1, x_1) < \varepsilon_1/2 < \varepsilon/2
\]

and

\[
d\bigl(T^{2n_1n_2 + n_2^2} x_2, x_1\bigr) \le d\bigl(T^{2n_1n_2} y_1, y_1\bigr) + d(y_1, x_1) < \varepsilon_1.
\]

Thus by choice of $\varepsilon_1$, we have

\[
d\bigl(T^{(n_1+n_2)^2} x_2, x_0\bigr) = d\bigl(T^{n_1^2} T^{2n_1n_2 + n_2^2} x_2,\; T^{n_1^2} x_1\bigr) < \varepsilon/2.
\]

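Given $x_m$, $n_m$, we inductively find $x_{m+1}$, $n_{m+1}$. By continuity, choose $0 < \varepsilon_m < \varepsilon/2$ such that for all l = 0, 1, . . . , m − 1, $d\bigl(T^{(n_m + \dots + n_{l+1})^2} y, x_l\bigr) < \varepsilon/2$ for every y for which $d(y, x_m) < \varepsilon_m$. Using Proposition 7 (with $\varepsilon = \varepsilon_m/2$, p = m, and $c_l = 2(n_m + \dots + n_{l+1})$ for l = 0, 1, . . . , m − 1), find $y_m \in X$, $n_{m+1} \in \mathbb{N}$ such that $d(y_m, x_m) < \varepsilon_m/2$ and $d\bigl(T^{2(n_m + \dots + n_{l+1}) n_{m+1}} y_m, y_m\bigr) < \varepsilon_m/2$ for l = 0, . . . , m − 1. Set $x_{m+1} = T^{-n_{m+1}^2} y_m$. Then

\[
d\bigl(T^{2(n_m + \dots + n_{l+1}) n_{m+1} + n_{m+1}^2} x_{m+1}, x_m\bigr)
\le d\bigl(T^{2(n_m + \dots + n_{l+1}) n_{m+1}} y_m, y_m\bigr) + d(y_m, x_m) < \varepsilon_m, \quad l = 0, \dots, m-1.
\]

Thus by choice of $\varepsilon_m$, we have

\[
d\bigl(T^{n_{m+1}^2} x_{m+1}, x_m\bigr) < \varepsilon/2
\]

and

\[
d\bigl(T^{(n_{m+1} + \dots + n_{l+1})^2} x_{m+1}, x_l\bigr) < \varepsilon/2, \quad l = 0, \dots, m-1.
\]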
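The choice of the constants $c_l = 2(n_m + \dots + n_{l+1})$ in the inductive step comes from expanding a square. The short computation below spells this out; the abbreviation $s_l$ is introduced here only for convenience and does not appear in the source.

```latex
% s_l := n_m + ... + n_{l+1} is shorthand introduced for this sketch only.
\[
(n_{m+1} + n_m + \dots + n_{l+1})^2 \;=\; s_l^2 \;+\; 2 s_l\, n_{m+1} \;+\; n_{m+1}^2 ,
\]
\[
T^{(n_{m+1} + \dots + n_{l+1})^2} x_{m+1}
\;=\; T^{s_l^2}\, T^{2 s_l n_{m+1}}\, T^{n_{m+1}^2} x_{m+1}
\;=\; T^{s_l^2}\bigl( T^{2 s_l n_{m+1}} y_m \bigr).
\]
```

Since $T^{2 s_l n_{m+1}} y_m$ lies within $\varepsilon_m$ of $x_m$, the defining property of $\varepsilon_m$ moves it under $T^{s_l^2}$ to within ε/2 of $x_l$. The quadratic recurrence is thus reduced to the linear recurrence of Proposition 7 applied to the exponents $c_l n$.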

5.4 Theorem 2

We now direct our attention to ideas for Theorem 2. The following proposition is instrumental in the proof of Theorem 2. We prove it using PET-induction.

Proposition 8. [BL] Let α : (X,B, µ,Γ) → (Y,D, ν,Γ) be a weakly mixing extension relative to Γ, let $\mu = \int \mu_y \, d\nu(y)$, let $T_1, \dots, T_l \in \Gamma$, and let $g_i(n) = \prod_{j=1}^{l} T_j^{p_{ij}(n)}$, i = 1, . . . , k, be such that $g_i(n)$ and $g_i(n) g_t^{-1}(n)$, i ≠ t, depend nontrivially on n. Then for any $f_1, \dots, f_k \in L^\infty(\mu)$

\[
\lim_{N\to\infty} \Bigl\| \frac{1}{N} \sum_{n=0}^{N-1} \Bigl( \prod_{i=1}^{k} g_i(n) f_i - \prod_{i=1}^{k} \alpha^*\Bigl( \int f_i \, d\mu_y \Bigr) \Bigr) \Bigr\|_{L^2(\mu)} = 0.
\]

We use the following version of the "van der Corput" Lemma.

Lemma 9. [Be] Let $w_0, w_1, \dots$ be a bounded sequence in a Hilbert space (H, ⟨ , ⟩). If

\[
D\text{-}\lim_{h} \; \lim_{N\to\infty} \frac{1}{N} \sum_{n=0}^{N-1} \langle w_n, w_{n+h} \rangle = 0,
\]

then $\lim_{N\to\infty} \bigl\| \frac{1}{N} \sum_{n=0}^{N-1} w_n \bigr\| = 0$.
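A one-dimensional toy case may help to see what Lemma 9 says. The sketch below is only an illustration under assumptions that are not in the source: the Hilbert space is C, the sequence is $w_n = e^{2\pi i \alpha n^2}$, and the frequency α = √2 and the cutoff N are arbitrary choices. The correlations $\langle w_n, w_{n+h} \rangle$ are linear phases in n, so their averages are tiny, and the average of $w_n$ itself is small as well.

```python
import cmath
import math

def e(t):
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * math.pi * t)

ALPHA = math.sqrt(2)   # illustrative irrational frequency (an assumption)
N = 10_000             # illustrative cutoff (an assumption)

def w(n):
    """Bounded sequence in the Hilbert space C: w_n = e(alpha * n^2)."""
    return e(ALPHA * n * n)

def correlation_average(h):
    """(1/N) * sum_{n<N} <w_n, w_{n+h}>, with <a, b> = a * conj(b)."""
    return sum(w(n) * w(n + h).conjugate() for n in range(N)) / N

def plain_average():
    """(1/N) * sum_{n<N} w_n."""
    return sum(w(n) for n in range(N)) / N

correlations = [abs(correlation_average(h)) for h in range(1, 6)]
average = abs(plain_average())
print(correlations)
print(average)
```

Here $\langle w_n, w_{n+h} \rangle = e^{-2\pi i \alpha (2nh + h^2)}$, a geometric progression in n, which is why the correlation averages decay like 1/N for each fixed h ≠ 0.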


Using standard arguments we make the following reductions. First we assume that $p_{ij}(0) = 0$ for all i, j, and hence $g_1(n), \dots, g_k(n)$ form a system A. Using the multilinearity of the integral, we also assume that $\int f_i \, d\mu_y = 0$ for all i = 1, . . . , k. We must then show

\[
\lim_{N\to\infty} \Bigl\| \frac{1}{N} \sum_{n=0}^{N-1} \prod_{i=1}^{k} g_i(n) f_i \Bigr\|_{L^2(\mu)} = 0.
\]

Lastly, we assume that our group Γ is a finitely generated free abelian group with basis $T_1, \dots, T_l$.

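Proof. Set $w_n = \prod_{i=1}^{k} g_i(n) f_i$. We define

\[
L(n, h) = \Bigl\langle \prod_{i=1}^{k} g_i(n) f_i, \; \prod_{i=1}^{k} g_i(n+h) f_i \Bigr\rangle = \int \prod_{i=1}^{k} g_i(n) f_i \cdot g_i(n+h) f_i \, d\mu. \tag{1}
\]

Thus by Lemma 9 we need to show

\[
D\text{-}\lim_{h} \; \lim_{N\to\infty} \frac{1}{N} \sum_{n=0}^{N-1} L(n, h) = 0.
\]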

By reordering the polynomial expressions, we may assume that for some q ≤ k, $\deg g_1(n) = \dots = \deg g_q(n) = 1$ and $\deg g_i(n) \ge 2$ for q + 1 ≤ i ≤ k. Thus $g_i(n+h) = g_i(n) g_i(h)$ when i ≤ q. For a fixed h ∈ N, we define a new system, $A_h = \{ g_i(n),\; g_i(n+h) g_i^{-1}(h) : i = 1, \dots, k \}$. We note that $g_i(n) = g_i(n+h) g_i^{-1}(h)$ if and only if i ≤ q (i.e. $\deg g_i(n) = 1$). Thus we can write equation (1) as follows:

\[
\begin{aligned}
L(n, h) &= \int \prod_{i=1}^{q} g_i(n) f_i \cdot g_i(n+h) f_i \; \prod_{i=q+1}^{k} g_i(n) f_i \; \prod_{i=q+1}^{k} g_i(n+h) f_i \, d\mu \\
&= \int \prod_{i=1}^{q} g_i(n) \bigl( f_i \cdot g_i(h) f_i \bigr) \; \prod_{i=q+1}^{k} g_i(n) f_i \; \prod_{i=q+1}^{k} g_i(n+h) g_i^{-1}(h) \bigl( g_i(h) f_i \bigr) \, d\mu \\
&= \int \prod_{i=1}^{k'} \tilde g_i(n) \tilde f_i \, d\mu
\end{aligned}
\]

where k′ = 2k − q, $\tilde f_i$ is either $f_m$, $g_m(h) f_m$, or $f_m \cdot g_m(h) f_m$ for some 1 ≤ m ≤ k, and $\tilde g_i$ is either $g_m$ for some 1 ≤ m ≤ k or $g_m(n+h) g_m^{-1}(h)$ for some q + 1 ≤ m ≤ k. In this way, we order the system $A_h = \{ \tilde g_i : i = 1, \dots, k' \}$.

By reordering, assume that $\tilde g_1(n)$ has minimal weight in $A_h$. Set $\hat g_i(n) = \tilde g_i(n) \tilde g_1^{-1}(n)$ for i = 1, . . . , k′. We consider the new system $\hat A_h = \{ \hat g_i : i = 2, \dots, k' \}$. We claim that the weight matrix for $\hat A_h$ precedes that of A. We note that replacing $g_i(n)$ with $g_i(n+h) g_i^{-1}(h)$ does not change the family of equivalence classes. Thus, $A_h$ has the same weight matrix as A. Next, we multiplied each element in $A_h$ by the inverse of an element with the smallest weight, $\tilde g_1(n)$. All polynomial expressions that are not equivalent to $\tilde g_1(n)$ do not change their weights when multiplied by $\tilde g_1^{-1}(n)$. However, those expressions that are equivalent to $\tilde g_1(n)$ decrease in weight when multiplied by $\tilde g_1^{-1}(n)$. Thus, the weight matrix for $\hat A_h$ precedes that of A.

We now apply PET-induction. Suppose that Proposition 8 is true for all systems whose weight matrix precedes that of A (namely $\hat A_h$). Thus

\[
\begin{aligned}
\lim_{N\to\infty} \frac{1}{N} \sum_{n=0}^{N-1} L(n, h)
&= \lim_{N\to\infty} \frac{1}{N} \sum_{n=0}^{N-1} \int \tilde f_1 \cdot \prod_{i=2}^{k'} \hat g_i(n) \tilde f_i \, d\mu \\
&= \int \tilde f_1 \lim_{N\to\infty} \frac{1}{N} \sum_{n=0}^{N-1} \prod_{i=2}^{k'} \hat g_i(n) \tilde f_i \, d\mu \\
&= \int \tilde f_1 \lim_{N\to\infty} \frac{1}{N} \sum_{n=0}^{N-1} \prod_{i=2}^{k'} \hat g_i(n) \alpha^*\Bigl( \int \tilde f_i \, d\mu_y \Bigr) \, d\mu \\
&= \lim_{N\to\infty} \frac{1}{N} \sum_{n=0}^{N-1} \int \prod_{i=1}^{k'} \tilde g_i(n) \Bigl( \int \tilde f_i \, d\mu_y \Bigr) \, d\nu \\
&\le \prod_{i=1}^{k'} \Bigl\| \int \tilde f_i \, d\mu_y \Bigr\|_{L^2(\nu)}
\end{aligned}
\]

for h large enough.

If A has degree at least 2, then $\deg g_k(n) \ge 2$ and $\tilde f_i = f_k$ for some i ≤ k′. Thus by assumption on $f_k$, the last product is zero. Thus $D\text{-}\lim_h \lim_{N\to\infty} \frac{1}{N} \sum_{n=0}^{N-1} L(n, h) = 0$. The linear case, when A has degree 1, was done previously in [FK] to prove the linear multidimensional version of Szemeredi’s Theorem.

An extension α : X → Y is primitive if Γ = Γc × Γw where α is compact relative to Γc and weakly mixing relative to Γw.

Proposition 10. [FK] If γ : X → Z is a nontrivial extension, one can find homomorphisms α : X → Y and α′ : Y → Z with γ = α′ ∘ α such that Y is a nontrivial primitive extension of Z.

Proposition 11. The family of Γ-invariant factors which are SZP-systems has a maximal element.

Thus to prove Theorem 2 we need only check that the property of being an SZP-system is preserved under primitive extensions.

Proposition 12. Given a primitive extension α : X → Y, if Y is SZP, then so is X.

Sketch of Proof. We first break each $g_i(n)$ into its compact and weakly mixing parts as follows: $g_i(n) = R^{(i)}(n) S^{(i)}(n)$ where $R^{(i)}(n) \in \Gamma_c$ and $S^{(i)}(n) \in \Gamma_w$. Let $R_1, \dots, R_r$ be the set of all the compact components of $g_i(n)$, i = 1, . . . , k, including the identity. Likewise let $S_1, \dots, S_s$ be the set of all pairwise distinct weakly mixing components of $g_i(n)$, i = 1, . . . , k.

It is sufficient to find a set P ⊆ N with d(P) > 0 (where d is the lower density) and c > 0 such that

\[
\mu\Bigl( \bigcap_{i,j} R_i^{-1}(n) S_j^{-1}(n) A \Bigr) > c
\]

for all n ∈ P. To do this we first use a corollary of Proposition 8 to find a subset P′ ⊆ N such that $\mu\bigl( \bigcap_{j} S_j^{-1}(n) A \bigr) > c'$. This would prove the theorem in the case that X is weakly mixing relative to Y.

To deal with the compact portion, we find a measurable subset A′ ⊆ A with µ(A′) > 0 such that $S_j^{-1}(n) A$ and $R_i^{-1}(n) S_j^{-1}(n) A$ are sufficiently close for all i = 1, . . . , r, j = 1, . . . , s. We use the following lemma.

Lemma 13. [BL] Let $f, h_1, \dots, h_K \in L^2(\mu)$ and ε > 0 be such that for almost all y ∈ Y and all R ∈ Γc, there exists m ∈ {1, . . . , K} such that $\| Rf - h_m \|_{L^2(\mu_y)} < \varepsilon$. Then for all B ∈ D with ν(B) > 0 there exist P ⊆ N with d(P) > 0, a family of sets $B_n \in D$, and b > 0 such that, for any n ∈ P, 1 ≤ i ≤ r, 1 ≤ j ≤ s,

1. $\nu(B_n) > b$,

2. $S_j(n) B_n \subseteq B$,

3. $\| R_i(n) S_j(n) f - S_j(n) f \|_{L^2(\mu_y)} < 2\varepsilon$ for all $y \in B_n$.

References

[Be] Bergelson, V., Weakly mixing PET. Erg. Th. and Dyn. Sys. 7 (1987), 337-349.

[BL] Bergelson, V., Leibman, A., Polynomial extensions of van der Waerden’s and Szemeredi’s theorems. J. Amer. Math. Soc. 9 (1996), 725-753.

[Fu1] Furstenberg, H., Ergodic behavior of diagonal measures and a theorem of Szemeredi on arithmetic progressions. J. d’Analyse Math. 31 (1977), 204-256.

[FK] Furstenberg, H., Katznelson, Y., An ergodic Szemeredi theorem for commuting transformations. J. d’Analyse Math. 34 (1978), 275-291.

[FW] Furstenberg, H., Weiss, B., Topological dynamics and combinatorial number theory. J. d’Analyse Math. 34 (1978), 61-85.

Michael Johnson, Northwestern University

email: [email protected]


6 Convergence of Conze-Lesigne Averages

after B. Host and B. Kra [5]

A summary written by Tamara Kucherenko

Abstract

For a measure preserving system (X,B, µ, T ) we investigate convergence of

\[
\frac{1}{N} \sum_{n=0}^{N-1} f_1(T^{a_1 n})\, f_2(T^{a_2 n})\, f_3(T^{a_3 n}).
\]

We are able to generalize a theorem of Conze and Lesigne and provide an alternative proof.

6.1 Introduction

Suppose that (X,B, µ, T ) is a measure preserving system and l ≥ 1 is an integer. Given a collection of bounded measurable functions $f_1, \dots, f_l \in L^\infty(\mu)$ and $a_1, \dots, a_l \in \mathbb{N}$, one can ask whether the multiple ergodic average

\[
\frac{1}{N} \sum_{n=0}^{N-1} f_1(T^{a_1 n} x)\, f_2(T^{a_2 n} x) \cdots f_l(T^{a_l n} x) \tag{1}
\]

converges, and in what sense. For l = 1, the existence of this limit in $L^2(\mu)$ is the von Neumann Ergodic Theorem.

A measure preserving transformation T : X → X induces an operator $U_T$ on functions in $L^2(\mu)$, defined by $U_T f(x) = f(Tx)$. We simply write T in place of $U_T$ and hence Tf(x) = f(Tx). The measure preserving system (X,B, µ, T ) is assumed to be ergodic, i.e. the only sets A ∈ B such that $T^{-1}A \subset A$ have either full or zero measure. This setting suffices for most of the theorems we consider since a general system can be decomposed into its ergodic components.

When a system is ergodic, for l = 1 the limit of (1) in $L^2(\mu)$ is the integral $\int_X f_1 \, d\mu$ and in particular is constant. On the other hand, for l ≥ 2 the limit of (1) is not necessarily constant, even when the system is ergodic. For example, if X = T, T : T → T is the rotation Tx = x + α mod 1 for some α ∈ T, $f_1(x) = e^{4\pi i x}$ and $f_2(x) = e^{-2\pi i x}$, then $f_1(T^n x) f_2(T^{2n} x) = f_2^{-1}(x)$ for all n ∈ N. Consequently, the average

\[
\frac{1}{N} \sum_{n=0}^{N-1} f_1(T^n x)\, f_2(T^{2n} x)
\]

converges to a nonconstant function.
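The rotation example can be checked numerically. The sketch below assumes $f_1(x) = e^{4\pi i x}$ and $f_2(x) = e^{-2\pi i x}$, for which every term of the double average equals $e^{2\pi i x} = f_2^{-1}(x)$ regardless of n and α; the rotation number and sample point are arbitrary choices made for the illustration.

```python
import cmath
import math

def f1(x):
    return cmath.exp(4j * math.pi * x)

def f2(x):
    return cmath.exp(-2j * math.pi * x)

def rotate(x, n, alpha):
    """T^n x for the circle rotation T x = x + alpha mod 1."""
    return (x + n * alpha) % 1.0

def double_average(x, alpha, N):
    """(1/N) * sum_{n<N} f1(T^n x) * f2(T^{2n} x)."""
    return sum(f1(rotate(x, n, alpha)) * f2(rotate(x, 2 * n, alpha))
               for n in range(N)) / N

alpha = math.sqrt(2) - 1   # illustrative irrational rotation number
x = 0.3                    # illustrative sample point
print(double_average(x, alpha, 500))
print(cmath.exp(2j * math.pi * x))   # the claimed limit e^{2 pi i x}
```

The phases in nα cancel exactly: $4\pi(x + n\alpha) - 2\pi(x + 2n\alpha) = 2\pi x$, so the average is the nonconstant function $e^{2\pi i x}$ for every N.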

A factor of (X,B, µ, T ) can be defined in several ways.

(a) a T -invariant sub-σ-algebra A of B

(b) a measure preserving system (Y,A, ν, S) and a measurable map π : X → Y such that $\mu \circ \pi^{-1} = \nu$ and $S \circ \pi(x) = \pi \circ T(x)$ for µ-almost all x ∈ X

(c) a T-invariant subspace F of $L^\infty(\mu)$

The first two definitions are equivalent by identifying $\pi^{-1}(A)$ with a T-invariant sub-σ-algebra of B and noting that any T-invariant sub-σ-algebra arises in this way. The first definition implies the third by setting $F = L^\infty(A)$, and the converse is obtained by taking A to be the σ-algebra generated by F-measurable sets.

If (Y,A, ν, S) is a factor of (X,B, µ, T ) and $f \in L^2(\mu)$, the conditional expectation E(f |A) of f with respect to A is the orthogonal projection of f onto $L^2(\nu)$.

Taking the rotational behavior into account, the double averages

\[
\frac{1}{N} \sum_{n=0}^{N-1} T^{a_1 n} f_1 \cdot T^{a_2 n} f_2
\]

can be understood. Furstenberg showed that to prove convergence of the double averages it suffices to replace each $f_i$ for i = 1, 2 by its conditional expectation $E(f_i | Z)$ on the Kronecker factor (see the definition in a subsequent section). The Kronecker factor is said to be characteristic for the double average. Projection to the Kronecker factor does not capture the limiting behavior for l ≥ 3 (see [3]). Instead, we use an abelian group extension of the Kronecker factor.

6.2 Statement of results

Existence of the limit (1) for l = 3 with the hypothesis of total ergodicity, meaning that T and all its powers are ergodic, was proven by Conze and Lesigne (see [1]). Here, convergence is shown directly through an alternative approach. Moreover, the assumption of total ergodicity can be dropped.

Theorem 1. Let (X,B, µ, T ) be a measure preserving system and $a_1, a_2, a_3$ three distinct integers. Suppose that $f_1, f_2, f_3 \in L^\infty(\mu)$. Then the limit

\[
\lim_{N\to\infty} \frac{1}{N} \sum_{n=0}^{N-1} f_1(T^{a_1 n} x)\, f_2(T^{a_2 n} x)\, f_3(T^{a_3 n} x) \tag{2}
\]

exists in $L^2(\mu)$.

The proof separates into two parts. First, we reduce the problem to studying convergence on a simpler system, following the approach of Furstenberg and Weiss [4]. The second part shows convergence for that simpler system and involves techniques from harmonic analysis.

6.3 Reduction to a simpler system

6.3.1 Characteristic factors

Definition 2. Given distinct integers $a_1, a_2, \dots, a_l$ and a factor (Y,A, ν, S) of a system (X,B, µ, T ), we say that Y is a characteristic factor of X for the scheme $a_1, a_2, \dots, a_l$ if for any $f_1, f_2, \dots, f_l \in L^\infty(\mu)$ we have

\[
\lim_{N\to\infty} \Bigl( \frac{1}{N} \sum_{n=0}^{N-1} \prod_{i=1}^{l} f_i \circ T^{a_i n} - \frac{1}{N} \sum_{n=0}^{N-1} \prod_{i=1}^{l} E(f_i \circ T^{a_i n} | A) \Bigr) = 0
\]

in $L^2(\mu)$.

Finding a characteristic factor Y for a system X allows us to restrict to functions defined only on Y, and this restriction simplifies computations when Y has a simple form.

For a double average the characteristic factor is the Kronecker factor, which we denote by (Z,Z, m, S). More specifically, S : Z → Z is the rotation defined by Sz = z + α, and we use π : X → Z for the natural projection.

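For $f \in L^2(\mu)$, we write $\tilde f$ for the function on Z defined by $\tilde f \circ \pi = E(f | Z)$.

Furstenberg and Weiss showed the following (see [4]).

Theorem 3. Let (X,B, µ, T ) be a measure preserving system, with notations as above. Let $b_1, b_2$ be integers. Then for any $f_1, f_2 \in L^\infty(\mu)$ the limit

\[
\lim_{N\to\infty} \frac{1}{N} \sum_{n=0}^{N-1} f_1(T^{b_1 n} x)\, f_2(T^{b_2 n} x)
\]

exists in $L^2(\mu)$ and equals

\[
\int_Z \tilde f_1(z + b_1\theta)\, \tilde f_2(z + b_2\theta) \, dm(\theta)
\]

where z = π(x).

Let $\tilde Z$ be the closed subgroup

\[
\tilde Z = \{ (z + a_1 t,\; z + a_2 t,\; z + a_3 t) : z, t \in Z \}
\]

of $Z^3$ and let $\tilde m$ be its Haar measure. The subgroup $\tilde Z$ is invariant under the transformation $\tilde S = S^{a_1} \times S^{a_2} \times S^{a_3}$. Then $\tilde S \tilde z = \tilde z + \tilde\alpha$ where $\tilde\alpha = (a_1\alpha, a_2\alpha, a_3\alpha)$.

We denote by $(\tilde X, \tilde\mu, \tilde T)$ the product of the systems $(X, \mu, T^{a_i})$ over $(\tilde Z, \tilde m, \tilde S)$. Now by Theorem 3 for $f_1, f_2, f_3 \in L^\infty(\mu)$ we have

\[
\lim_{N\to\infty} \frac{1}{N} \sum_{n=0}^{N-1} \int_X \prod_{i=1}^{3} f_i(T^{a_i n} x) \, d\mu(x)
= \int_{\tilde Z} \prod_{i=1}^{3} \tilde f_i(z_i) \, d\tilde m(\tilde z)
= \int_{\tilde X} \prod_{i=1}^{3} f_i(x_i) \, d\tilde\mu(x_1, x_2, x_3). \tag{3}
\]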

6.3.2 Reduction to the isometric extension of the Kronecker factor

For an ergodic system (X,B, µ, T ) with Kronecker factor (Z, α), let $(\hat Z, D, \nu, T)$ denote the maximal isometric extension of (Z, α) in (X, T ).

Theorem 4. (Furstenberg and Weiss [4]). $\hat Z$ is a characteristic factor of X for all schemes $a_1, a_2, a_3$.

Thus in order to prove the existence of (1) in $L^2(m)$, it suffices to show the convergence for functions defined on the isometric extension $\hat Z$ of Z. We express the extension $\hat Z$ in such a way that the corresponding group extension $X_1 = Z \times L$ (for a suitable metrizable compact group L) is ergodic. Since it suffices to prove the convergence for the system $X_1$, we can from now on assume that X is itself a group extension of its Kronecker factor Z.

6.3.3 Reduction to an abelian group extension

Following the paper of Furstenberg and Weiss [4] we use the theory of Mackey groups to obtain the next theorem.

Theorem 5. (Furstenberg and Weiss [4]). X has a characteristic factor for all schemes $a_1, a_2, a_3$ that is an abelian group extension of its Kronecker factor Z.

From now on we can assume that our system X is a compact abelian group extension X = Z × G of its Kronecker factor, with the natural projection X → Z given by π(z, g) = z. The transformation T on X is given by the cocycle σ : Z → G and can be written as

T (z, g) = (Sz, g + σ(z))

For every integer n we have $T^n(z, g) = (S^n z, g + \sigma^{(n)}(z))$ where $\sigma^{(n)}(z) = \sigma(z) + \sigma(Sz) + \dots + \sigma(S^{n-1}z)$.
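The cocycle formula for the iterates can be sanity-checked on a toy skew product. In the sketch below everything concrete is an assumption made for illustration: Z = G = R/Z represented by exact rationals, a rational rotation number, and the hypothetical cocycle σ(z) = 2z mod 1. The identity checked is purely algebraic and is stated here for n ≥ 0.

```python
from fractions import Fraction

ALPHA = Fraction(3, 11)      # illustrative rotation number (an assumption)

def S(z):
    """Base rotation S z = z + alpha on R/Z."""
    return (z + ALPHA) % 1

def sigma(z):
    """Hypothetical cocycle Z -> G; any function would do for this identity."""
    return (2 * z) % 1

def T(point):
    """Skew product T(z, g) = (S z, g + sigma(z))."""
    z, g = point
    return (S(z), (g + sigma(z)) % 1)

def sigma_n(z, n):
    """sigma^{(n)}(z) = sigma(z) + sigma(S z) + ... + sigma(S^{n-1} z)."""
    total = Fraction(0)
    for _ in range(n):
        total += sigma(z)
        z = S(z)
    return total % 1

def T_iterated(point, n):
    """Apply T n times."""
    for _ in range(n):
        point = T(point)
    return point

def T_closed_form(point, n):
    """(S^n z, g + sigma^{(n)}(z)) as in the displayed formula."""
    z, g = point
    return ((z + n * ALPHA) % 1, (g + sigma_n(z, n)) % 1)

start = (Fraction(1, 5), Fraction(2, 7))
print(T_iterated(start, 4), T_closed_form(start, 4))
```

Exact rational arithmetic is used so that the two sides can be compared with equality rather than a tolerance.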

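Now, $\tilde X = \tilde Z \times G^3$ and the measure $\tilde\mu$ is the product $\tilde m \times m_G \times m_G \times m_G$ of Haar measures. The transformation $\tilde T$ on $\tilde X$ is given by

\[
\tilde T(\tilde z, g_1, g_2, g_3) = \bigl( \tilde z + \tilde\alpha,\; g_1 + \sigma^{(a_1)}(z_1),\; g_2 + \sigma^{(a_2)}(z_2),\; g_3 + \sigma^{(a_3)}(z_3) \bigr).
\]

Thus, the system $(\tilde X, \tilde\mu, \tilde T)$ is a compact abelian group extension of $(\tilde Z, \tilde m, \tilde S)$ by the group $G^3$, given by the cocycle $\tilde\sigma : \tilde Z \to G^3$, where $\tilde\sigma = \sigma^{(a_1)} \times \sigma^{(a_2)} \times \sigma^{(a_3)}$.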

6.4 Sketch of the proof of the main theorem

It now suffices to prove the existence of the limit (2) for the modified system described above. By density, it is enough to consider the case when the functions $f_i$ are of the form

\[
f_i(z, g) = w_i(z) \chi_i(g)
\]

for i = 1, 2, 3, where $w_i \in L^\infty(m)$, $\chi_i \in \hat G$ and x = (z, g).

We denote by M the Mackey group of the cocycle $\tilde\sigma$ on $\tilde Z$. The proof separates into two cases depending on whether or not the character $\chi = (\chi_1, \chi_2, \chi_3)$ belongs to $M^\perp$. The case $\chi \notin M^\perp$ is covered by the following lemma.

Lemma 6. Let the functions $f_i = w_i(z) \chi_i(g)$ be given and assume that $\chi \notin M^\perp$. Then

\[
\lim_{N\to\infty} \frac{1}{N} \sum_{n=0}^{N-1} f_1(T^{a_1 n} x)\, f_2(T^{a_2 n} x)\, f_3(T^{a_3 n} x) \tag{4}
\]

exists in $L^2(\mu)$ and equals zero.

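For the second case $\chi \in M^\perp$ we write

\[
\frac{1}{N} \sum_{n=0}^{N-1} \prod_{i=1}^{3} f_i(T^{a_i n} x) = \prod_{i=1}^{3} \chi_i(g) \cdot \frac{1}{N} \sum_{n=0}^{N-1} \prod_{i=1}^{3} w_i(z + n a_i \alpha)\, \chi_i\bigl(\sigma^{(n a_i)}(z)\bigr).
\]

First, we show that there exists a continuous mapping $t \mapsto \varphi_t$ from Z to $L^2(m)$ such that

\[
\varphi_{n\alpha}(z) = \prod_{i=1}^{3} \chi_i\bigl(\sigma^{(a_i n)}(z)\bigr).
\]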

Using the Hilbert space van der Corput lemma we obtain the following result.

Lemma 7. Let Z be a compact metric space, S : Z → Z a homeomorphism so that (Z, S) is uniquely ergodic with invariant measure m. Let f : Z → H be a continuous map into a Hilbert space H. Then, for all z ∈ Z,

\[
\lim_{N\to\infty} \frac{1}{N} \sum_{n=0}^{N-1} f(S^n z) = \int_Z f(u) \, dm(u)
\]

in H.

By the lemma and the fact that (Z, S) is uniquely ergodic,

\[
\lim_{N\to\infty} \frac{1}{N} \sum_{n=0}^{N-1} \prod_{i=1}^{3} w_i(z + n a_i \alpha)\, \chi_i\bigl(\sigma^{(n a_i)}(z)\bigr) = \int_Z \varphi_t(z) \prod_{i=1}^{3} w_i(z + a_i t) \, dm(t)
\]

in $L^2(m)$. This completes the proof of the main theorem.

References

[1] J.-P. Conze and E. Lesigne, Sur un theoreme ergodique pour des mesures diagonales. C. R. Acad. Sci. Paris, Serie I, 306 (1988), pp. 491–493.

[2] H. Furstenberg, Nonconventional ergodic averages. Proc. Sympos. Pure Math., 50 (1990), pp. 43–56.

[3] H. Furstenberg, Ergodic behaviour of diagonal measures and a theorem of Szemeredi on arithmetic progressions. J. d’Analyse Math., 31 (1977), pp. 204–256.

[4] H. Furstenberg and B. Weiss, A mean ergodic theorem for $\frac{1}{N} \sum_{n=1}^{N} f(T^n x) g(T^{n^2} x)$. Convergence in Ergodic Theory and Probability, Walter de Gruyter & Co, Berlin, New York (1996), pp. 193–227.

[5] B. Host and B. Kra, Convergence of Conze-Lesigne Averages. Isr. J. Math., 149 (2005), pp. 1–19.

Tamara Kucherenko, UCLA

email: [email protected]


7 Pointwise Ergodic Theorems for Arithmetic Sets, part 2

after J. Bourgain [1]

A summary written by Victor Lie

Abstract

We discuss the Shift model associated to a pointwise ergodic theorem along polynomial iterates.

7.1 Introduction

In what follows we will prove the Shift model for the following two results:

Theorem 1. Let (Ω,B, µ, T ) be a dynamical system and let p be a polynomial with integer coefficients. Given r > 1, there is a constant C so that

\[
\Bigl\| \sup_{N \ge 1} |A_N f| \Bigr\|_r \le C \|f\|_r
\]

holds for all $f \in L^r(\Omega, \mu)$, where $A_N$ is given by

\[
A_N f = \frac{1}{N} \sum_{n=1}^{N} f \circ T^{p(n)}.
\]

Furthermore, the sequence $A_N f$ converges pointwise almost everywhere.

Theorem 2. Let (Ω,B, µ, T ) be a dynamical system and let p be a polynomial with real coefficients. For N ≥ 1, let

\[
A_N f = \frac{1}{N} \sum_{n=1}^{N} f \circ T^{\lfloor p(n) \rfloor}.
\]

If f is any bounded measurable function on Ω, then $A_N f$ converges pointwise almost everywhere.

It is worth mentioning that the second result can also be extended, with some effort, to $L^r$ functions; we do not treat this extension in this presentation.

7.2 The L2 case for Theorem 1

According to the first part of the presentation (see Andy Yingst), the maximal inequality and convergence problem for the averages

\[
A_N f = \frac{1}{N} \sum_{n=1}^{N} f \circ T^{p(n)}
\]

with

\[
p(x) = b_1 x + b_2 x^2 + \dots + b_d x^d, \qquad b_j \in \mathbb{Z}, \quad b_d > 0,
\]

can be reduced to proving certain inequalities for the Shift model (Z, S). In this case, one has

\[
A_N f = f * K_N \quad \text{where} \quad K_N = \frac{1}{N} \sum_{n=1}^{N} \delta_{p(n)}
\]

(here $\delta_x$ stands for the Dirac measure at x ∈ Z).

Since this is an $L^2$ problem involving convolution operators, it is natural to expect that multiplier theory will play an important role. Consequently, the idea is to split the Fourier transform of each $K_N$ into well localized pieces, to estimate each such piece by maximal and orthogonality methods, and then to combine all these estimates to obtain global control of our operator. Following these lines we write

\[
A_N f = \mathcal{F}^{-1}\bigl[ \mathcal{F}[K_N]\, \mathcal{F}[f] \bigr]
\]

where $\mathcal{F}$ stands for the Fourier transform

\[
\mathcal{F} : L^2(\mathbb{Z}) \to L^2(\Pi), \qquad \mathcal{F}(f)(\alpha) = \int_{\mathbb{Z}} f(n) e^{-2\pi i n \alpha} \, d\mu(n)
\]

with α ∈ Π (Π = [0, 1]) and µ the canonical (counting) measure on Z.

STEP 1 Obtaining information about the multiplier $\mathcal{F}[K_N]$

Clearly, for α ∈ Π we have

\[
\mathcal{F}[K_N](\alpha) = \frac{1}{N} \sum_{n=1}^{N} e^{-2\pi i p(n) \alpha}.
\]

Now if $\theta b_j \equiv a'_j / q' \pmod 1$ and $(a'_1, \dots, a'_d, q') = 1$, then defining

\[
S(\theta) = S(q', a'_1, \dots, a'_d) = \frac{1}{q'} \sum_{r=0}^{q'-1} e^{2\pi i \sum_{j=1}^{d} r^j a'_j / q'},
\qquad
w_N(\beta) = \frac{1}{N} \int_0^N e^{-2\pi i \beta \sum_{j=1}^{d} b_j y^j} \, dy,
\]

it can be shown that for some δ = δ(d) > 0 and δ′ = δ′(d) > 0 and for fixed α we have:

- if there exists θ = a/q such that $q < N^{\delta}$ and $|\alpha - \theta| < N^{-d+\delta}$, then

\[
\mathcal{F}[K_N](\alpha) = S(\theta)\, w_N(\alpha - \theta) + O(N^{-1/2});
\]

- otherwise,

\[
|\mathcal{F}[K_N](\alpha)| \lesssim N^{-\delta'}.
\]

Now for θ = a/q ((a, q) = 1) we have

\[
|S(\theta)| \lesssim q^{-\delta'}.
\]

Also, for all β ∈ Π the function $w_N(\beta)$ obeys the relations

\[
|1 - w_N(\beta)| \lesssim |\beta| N^d \qquad \text{and} \qquad |w_N(\beta)| \lesssim (1 + |\beta| N^d)^{-1/d}
\]

(we use the notation A ≲ B for A ≤ CB, where C is an absolute constant independent of A and B).
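The role of the complete sums S(θ) can be seen in the simplest case. The sketch below is an illustration under assumptions not made in the source: the polynomial is p(n) = n² and the sign convention of $\mathcal{F}[K_N]$ is used in both sums, so the complete sum here is the complex conjugate of S(θ) as displayed above (with the same modulus). When N is a multiple of q, the multiplier at α = a/q coincides exactly with the complete sum over one period.

```python
import cmath
import math

def e(t):
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * math.pi * t)

def p(n):
    """Illustrative polynomial (an assumption): p(n) = n^2."""
    return n * n

def multiplier_at_rational(a, q, N):
    """F[K_N](a/q) = (1/N) * sum_{n=1}^{N} e(-p(n) * a / q); phases reduced mod q exactly."""
    return sum(e(-((p(n) * a) % q) / q) for n in range(1, N + 1)) / N

def complete_sum(a, q):
    """(1/q) * sum_{r=0}^{q-1} e(-p(r) * a / q), a normalized Gauss-type sum."""
    return sum(e(-((p(r) * a) % q) / q) for r in range(q)) / q

for q in (3, 5, 7):
    print(q, abs(multiplier_at_rational(1, q, 100 * q)), abs(complete_sum(1, q)), q ** -0.5)
```

For odd primes q the modulus of the normalized quadratic Gauss sum equals $q^{-1/2}$, an instance of the decay $|S(\theta)| \lesssim q^{-\delta'}$ quoted in Step 1.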

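STEP 2 Breaking $\mathcal{F}[K_N]$ into well localized pieces

For s ≥ 0, define an exhaustion of the rationals in Π

\[
R_s = \Bigl\{ \theta \in \mathbb{Q} \cap [0, 1] \;\Bigm|\; \theta = \frac{a}{q},\; (a, q) = 1 \text{ and } 2^s \le q < 2^{s+1} \Bigr\}.
\]

Now using the notation from the previous step we introduce

\[
\psi_{s,N}(\alpha) = \sum_{\theta \in R_s} S(\theta)\, w_N(\alpha - \theta)\, \zeta\bigl(10^s(\alpha - \theta)\bigr)
\]

where ζ is a smooth function on R with ζ = 1 on $[-\tfrac{1}{10}, \tfrac{1}{10}]$ and ζ = 0 outside $[-\tfrac{1}{5}, \tfrac{1}{5}]$.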
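To get a feeling for the sets $R_s$, one can simply enumerate them. The sketch below is illustrative only; it identifies the endpoints 0 and 1 (that is, it works on the torus [0, 1)), which is an assumption of this example, and the function name is hypothetical.

```python
from fractions import Fraction
from math import gcd

def rationals_level(s):
    """R_s on the torus: reduced fractions a/q with 0 <= a < q and 2^s <= q < 2^(s+1)."""
    return {Fraction(a, q)
            for q in range(2 ** s, 2 ** (s + 1))
            for a in range(q)
            if gcd(a, q) == 1}

for s in range(1, 7):
    print(s, len(rationals_level(s)), 4 ** s)
```

For s = 1, 2, 3 the counts are 3, 14 and 54, and for each level computed here the size stays below $4^s$, so $\log |R_s|$ grows only linearly in s.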

With this notation, using the estimates from Step 1, we have the following result:

Lemma 1. There exists $\delta_1 > 0$ such that the uniform estimate

\[
\Bigl| \mathcal{F}[K_N](\alpha) - \sum_{s \ge 0} \psi_{s,N}(\alpha) \Bigr| \lesssim N^{-\delta_1}
\]

holds.

STEP 3 Combining the estimates on the well localized pieces into a global estimate

We claim that

\[
\Bigl\| \sup_{N \in Z_1} \bigl| \mathcal{F}^{-1}[\psi_{s,N} \mathcal{F}f] \bigr| \Bigr\|_2 \lesssim s^2 2^{-s\delta'} \|f\|_2
\]

where $Z_1 = \{ 2^k \mid k = 1, 2, \dots \}$.

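If we accept this fact for the moment, then setting $\psi_N = \sum_s \psi_{s,N}$ and using Lemma 1 we have

\[
\begin{aligned}
\Bigl\| \sup_{N \in Z_1} |f * K_N| \Bigr\|_2
&\le \Bigl\| \sup_{N \in Z_1} \bigl| \mathcal{F}^{-1}[\psi_N \mathcal{F}f] \bigr| \Bigr\|_2 + \Bigl( \sum_{N \in Z_1} \bigl\| \mathcal{F}[K_N] - \psi_N \bigr\|_\infty^2 \Bigr)^{1/2} \|f\|_2 \\
&\lesssim \sum_{s=0}^{\infty} \Bigl\| \sup_{N \in Z_1} \bigl| \mathcal{F}^{-1}[\psi_{s,N} \mathcal{F}f] \bigr| \Bigr\|_2 + \|f\|_2 \lesssim \|f\|_2
\end{aligned}
\]

and, due to the positivity of our maximal operator, this implies our $l^2$ variant of Theorem 1:

\[
\Bigl\| \sup_{N} |f * K_N| \Bigr\|_{l^2(\mathbb{Z})} \lesssim \|f\|_{l^2(\mathbb{Z})}.
\]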

Returning now to our claim, we first want to truncate the tails of $\psi_{s,N}$; for this we define

\[
\tilde\psi_{s,N}(\alpha) = \sum_{\theta \in R_s} S(\theta)\, \chi\bigl(N^d(\alpha - \theta)\bigr)\, \zeta\bigl(10^s(\alpha - \theta)\bigr)
\]

(with $\chi = \chi_{[-1,1]}$ considered as a function on R) and remark that we have the uniform estimate

\[
\sum_{N \in Z_1} |\psi_{s,N} - \tilde\psi_{s,N}| \lesssim 2^{-s\delta'}.
\]

Therefore, by a square function argument,

\[
\Bigl\| \sup_{N \in Z_1} \bigl| \mathcal{F}^{-1}[\psi_{s,N} \mathcal{F}f] \bigr| \Bigr\|_2 \lesssim \Bigl\| \sup_{N \in Z_1} \bigl| \mathcal{F}^{-1}[\tilde\psi_{s,N} \mathcal{F}f] \bigr| \Bigr\|_2 + 2^{-s\delta'} \|f\|_2.
\]

Now for $N \in Z_1$ with $N^d \approx 2^j$ let $R_j$ be the $2^{-j}$-neighborhood of $R_s \subset \Pi$. Thus, setting

\[
\mathcal{F}[g_s] = \mathcal{F}[f] \sum_{\theta \in R_s} S(\theta)\, \zeta\bigl(10^s(\alpha - \theta)\bigr), \qquad \text{so that} \qquad \tilde\psi_{s,N} \mathcal{F}[f] = \mathcal{F}[g_s]\, \chi_{R_j},
\]

it follows from the main lemma of this paper (see section 1.3 in the first part of this presentation) that

\[
\Bigl\| \sup_{N \in Z_1} \bigl| \mathcal{F}^{-1}[\tilde\psi_{s,N} \mathcal{F}f] \bigr| \Bigr\|_2 \le \Bigl\| \sup_{j \in \mathbb{Z}_+} \bigl| \mathcal{F}^{-1}[\mathcal{F}[g_s] \chi_{R_j}] \bigr| \Bigr\|_2 \lesssim (\log |R_s|)^2 \|g_s\|_2.
\]

Now $|R_s| < 4^s$, and from Step 1 and Parseval we have $\|g_s\|_2 \lesssim 2^{-s\delta'} \|f\|_2$; putting these facts together completes the proof of our claim.

7.3 The almost sure convergence of ANf for f ∈ l2(Z)

From the first part of the presentation it is enough to prove, in the (Z, S) setting, that

\[
\sum_{j=1}^{J} \|M_j f\|_2 \le o(J) \|f\|_2
\]

with

\[
M_j f = \sup_{\substack{N_j < N < N_{j+1} \\ N \in Z_\varepsilon}} |f * (K_N - K_{N_j})|
\]

where $Z_\varepsilon = \{ (1+\varepsilon)^n \mid n = 1, 2, \dots \}$ for ε > 0 fixed, and $N_j$ is any rapidly increasing sequence ($N_{j+1} > 2N_j$).

Again the main tool is the Fourier transform, used for a better localization of the kernel $K_N$. Indeed, from Lemma 1 we deduce that $f * (K_N - K_{N_j})$ may be replaced by $\mathcal{F}^{-1}[(\psi_N - \psi_{N_j}) \mathcal{F}f]$ when defining $M_j$; fixing $s_0$, it follows from the claim in Step 3 that

\[
\|M_j f\|_2 \le \sum_{s \le s_0} \Bigl\| \sup_{\substack{N_j < N < N_{j+1} \\ N \in Z_\varepsilon}} \bigl| \mathcal{F}^{-1}[(\psi_{s,N} - \psi_{s,N_j}) \mathcal{F}f] \bigr| \Bigr\|_2 + C \varepsilon^{-1} 2^{-\delta'' s_0} \|f\|_2
\]

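where the second term will be $o(\|f\|_2)$ for an appropriate $s_0$ (one may think of $s_0 = J^{1/100}$). Thus it suffices to verify our statement for

\[
M_j f = \sup_{\substack{N_j < N < N_{j+1} \\ N \in Z_\varepsilon}} |f * (w_N - w_{N_j})|
\]

with $w_N$ defined in Section 7.2, Step 1. Now, truncating the tails of w as before, we define the auxiliary maximal function

\[
\tilde M_j f = \sup_{N_j < N < N_{j+1}} |f * (\chi_{N^d} - \chi_{N_j^d})|
\]

where $\chi = \chi_{[0,1]}$ and $\chi_t = \frac{1}{t} \chi_{[0,t]}$. Now since

\[
\Bigl( \sum_{j=1}^{J} |\tilde M_j f|^2 \Bigr)^{1/2} \le J^{1/4} \bigl\| \{ f * \chi_N \mid N = 1, 2, \dots \} \bigr\|_{v^4}
\]

it follows from Part 1 that

\[
\sum_{j=1}^{J} \bigl\| \tilde M_j f \bigr\|_2^2 \lesssim J^{1/2} \|f\|_2^2,
\]

hence

\[
\begin{aligned}
\sum_{j=1}^{J} \|M_j f\|_2^2
&\lesssim \sum_{N \in Z_\varepsilon} \bigl\| \mathcal{F}^{-1}[(w_N - \mathcal{F}[\chi_{N^d}]) \mathcal{F}f] \bigr\|_2^2 + J^{1/2} \|f\|_2^2 \\
&\le \sup_{\alpha} \Bigl[ \sum_{N \in Z_\varepsilon} |w_N(\alpha) - \hat\chi(N^d \alpha)|^2 \Bigr] \|f\|_2^2 + J^{1/2} \|f\|_2^2 \lesssim_\varepsilon J^{1/2} \|f\|_2^2.
\end{aligned}
\]

Now using Cauchy-Schwarz we conclude

\[
\sum_{j=1}^{J} \|M_j f\|_2 \lesssim_\varepsilon J^{3/4} \|f\|_2,
\]

which ends our proof.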
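For completeness, the final Cauchy-Schwarz step can be written out as follows. This is a sketch that reads the preceding bound with squared norms, i.e. as $\sum_{j \le J} \|M_j f\|_2^2 \lesssim_\varepsilon J^{1/2} \|f\|_2^2$; that reading is an assumption about the intended statement.

```latex
\[
\sum_{j=1}^{J} \|M_j f\|_2
\;\le\; J^{1/2} \Bigl( \sum_{j=1}^{J} \|M_j f\|_2^2 \Bigr)^{1/2}
\;\lesssim_{\varepsilon}\; J^{1/2} \bigl( J^{1/2} \|f\|_2^2 \bigr)^{1/2}
\;=\; J^{3/4} \|f\|_2 .
\]
```

Since $J^{3/4} = o(J)$ as J → ∞, this is exactly the bound $\sum_{j \le J} \|M_j f\|_2 \le o(J) \|f\|_2$ required at the start of the section.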


7.4 The Lp case for Theorem 1

In this section we extend the $L^2$ theory to $L^p$, with p > 1. Since the a.s. convergence for functions f in $L^p$ reduces to bounded functions and hence is taken care of by the $L^2$ result, we need to show only the maximal inequality

\[
\Bigl\| \sup_{N} |A_N f| \Bigr\|_p \lesssim \|f\|_p
\]

and only for the shift model (Z, S). Moreover, due to the good properties of the kernels $K_N$, our problem reduces to

\[
\Bigl\| \sup_{k \in \mathbb{Z}_+} |f * K_{2^k}| \Bigr\|_p \lesssim \|f\|_p.
\]

In what follows we will treat only the case 1 < p < 2 ; the remainingrange was established earlier in a paper written by the same autor - see [2].Now for dealing with our problem we will need a much more careful analysisof our operators and this is because the Fourier multipliers involved in theargument need to have good bounds in Lp .Mentaining the previous notations we define the following expressions:-for s a positive integer: Qs = 2s! & Ks = k ∈ Z | 4s ≤ k < 4s+1-for s′ ≤ s and k ∈ Ks , the functions:

Ωk,s′ =∑

0≤a<Qs′

S

(a

Qs′

)w2k

(α− a

Qs′

(Q2

s′

(α− a

Qs′

))

The multiplier Ωk,s′ will substitute in Lp the role played by the expression∑r≤s′ ψr,2k in the L2 case ; indeed the former one has a better distribution

of the support with respect to the rational numbers fact that will imply agood control in the Lp norm ; more over this multiplier will not lose the goodproperty of the last expression in approximating the F(KN).More exactly wehave:- if s′ ≤ s and k ∈ Ks then |F [K2k ](α) − Ωk,s′(α)| < 2−δ′s′

- for 1 < p0 < p < 2 ‖ supk |F−1[Ωk,s′Ff ]| ‖p0. ‖f‖p0

The first estimation is a simple consequence of the properties mentioned atstep 1 ( section 2 ) while the second one is obtained by a carefully study ofeach component of Ωk,s′ .A very important intermediate step consists in obtaining the relation:

∥∥∥∥ supk∈Ks

|f ∗K2k |∥∥∥∥

p

. s ‖f‖p


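where here $p > 1$. This is solved by using duality and reducing it to an $L^2$ problem, for which the Fourier transform method is well adapted. With this done, the proof of our result proceeds smoothly by interpolating between the $L^{p_0}$ and $L^2$ results obtained in the previous sections. Indeed we have:

$$\sup_k |f * K_{2^k}| \le \sum_{s'} \sup_{k \ge 4^{s'}} \big| \mathcal{F}^{-1}[(\Omega_{k,s'} - \Omega_{k,s'-1}) \mathcal{F}f] \big| + \sum_{s} \sup_{k \in K_s} \big| (f * K_{2^k}) - \mathcal{F}^{-1}[\Omega_{k,s} \mathcal{F}f] \big|$$

Now from the facts mentioned before we have:

$$\Big\| \sup_{k \ge 4^{s'}} \big| \mathcal{F}^{-1}[(\Omega_{k,s'} - \Omega_{k,s'-1}) \mathcal{F}f] \big| \Big\|_{p_0} \lesssim \|f\|_{p_0}$$

$$\Big\| \sup_{k \in K_s} \big| (f * K_{2^k}) - \mathcal{F}^{-1}[\Omega_{k,s} \mathcal{F}f] \big| \Big\|_{p_0} \lesssim s \, \|f\|_{p_0}$$

As announced, our aim now is to interpolate the above relations with some better $L^2$ estimates; these are obtained as follows:

$$\Big\| \sup_{k \ge 4^{s'}} \big| (f * K_{2^k}) - \mathcal{F}^{-1}[\Omega_{k,s} \mathcal{F}f] \big| \Big\|_2 \le C \sum_{k \ge 4^{s'}} 2^{-k\delta_1} + \Big\| \sup_{k \ge 4^{s'}} \Big| \sum_{0 \le r \le s'} \mathcal{F}^{-1}[\psi_{r,2^k} \mathcal{F}f] - \mathcal{F}^{-1}[\Omega_{k,s} \mathcal{F}f] \Big| \Big\|_2 + \sum_{r > s'} \Big\| \sup_k \big| \mathcal{F}^{-1}[\psi_{r,2^k} \mathcal{F}f] \big| \Big\|_2$$

where we have used Lemma 1. Now only the second term needs attention:

$$\Omega_{k,s'} - \sum_{r \le s'} \psi_{r,2^k} = \sum_{r \le s'} \sum_{\theta \in R_r} S(\theta) \, w_{2^k}(\alpha - \theta) \big[ \zeta(Q_{s'}^2 (\alpha - \theta)) - \zeta(10^r (\alpha - \theta)) \big] + \sum_{\substack{q \mid Q_{s'} \\ q \ge 2^{s'+1}}} \sum_{\substack{1 \le a \le q \\ (a,q) = 1}} S\Big(\frac{a}{q}\Big) \, w_{2^k}\Big(\alpha - \frac{a}{q}\Big) \, \zeta\Big(Q_{s'}^2 \Big(\alpha - \frac{a}{q}\Big)\Big) = (1) + (2)$$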


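For (1) we have an $l^\infty$ estimate (for $k \ge 4^{s'}$):

$$(1) \lesssim 4^{s'} \sup_{|\beta| > Q_{s'}^{-2}} |w_{2^k}(\beta)| \lesssim 2^{-k/2}$$

For the second sum one may prove (using the same arguments as those presented in Section 2) that the maximal function associated to (2) is bounded in the $l^2$ norm by $C 2^{-\delta' s'}$. Hence we obtain that:

$$\Big\| \sup_{k \in K_s} \big| (f * K_{2^k}) - \mathcal{F}^{-1}[\Omega_{k,s} \mathcal{F}f] \big| \Big\|_2 \lesssim 2^{-\delta' s'} \|f\|_2$$

and by subtraction:

$$\Big\| \sup_{k \ge 4^{s'}} \big| \mathcal{F}^{-1}[(\Omega_{k,s'} - \Omega_{k,s'-1}) \mathcal{F}f] \big| \Big\|_2 \lesssim 2^{-\delta' s'} \|f\|_2$$

Now interpolating between $p_0$ and $2$ we obtain, for $p_0 < p < 2$ and some $\delta_p > 0$:

$$\Big\| \sup_k |f * K_{2^k}| \Big\|_p \le \sum_{s'} \Big\| \sup_{k \ge 4^{s'}} \big| \mathcal{F}^{-1}[(\Omega_{k,s'} - \Omega_{k,s'-1}) \mathcal{F}f] \big| \Big\|_p + \sum_{s} \Big\| \sup_{k \in K_s} \big| (f * K_{2^k}) - \mathcal{F}^{-1}[\Omega_{k,s} \mathcal{F}f] \big| \Big\|_p \lesssim_p \sum_{s'} 2^{-\delta_p s'} + \sum_{s} 2^{-\delta_p s} < C$$

which ends our proof.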

7.5 Outline of the proof of Theorem 2

In this section we need to extend the a.s. convergence of $A_N f$ to polynomials with real coefficients (this will be done only in the $L^\infty$ case). Recall that now:

$$A_N f = \frac{1}{N} \sum_{n=1}^{N} f \circ T^{\lfloor p(n) \rfloor}$$

where $p(x) = b_0 + b_1 x + \ldots + b_d x^d$ ($b_d > 0$ and $d \ge 1$) is a polynomial with real coefficients and $\lfloor l \rfloor$ stands for the integer part of $l$. Now we observe that since we can suppose that at least one of $b_1, b_2, \ldots, b_d$


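is not rational (otherwise we can reduce our problem to the one discussed before), the sequence:

$$\{ p(n) - \lfloor p(n) \rfloor \mid n = 1, 2, \ldots \}$$

is uniformly distributed in $[0,1]$. Now fixing $\epsilon > 0$ small, we define the continuous, piecewise affine function $\tau = \tau_\epsilon$ on $\mathbb{R}$ by $\tau = 1$ on $[\epsilon, 1-\epsilon]$ and $\tau = 0$ outside of $[0,1]$. With this done we set:

$$\widetilde{A}_N f = \frac{1}{N} \sum_{n=1}^{N} \sum_{m \in \mathbb{Z}} \tau(p(n) - m) \, T^m f$$

and from the uniform distribution property we have the pointwise estimate (for $N$ large enough):

$$|A_N f - \widetilde{A}_N f| \le \frac{\|f\|_\infty}{N} \, \# \{ 1 \le n \le N \mid \operatorname{dist}(p(n), \mathbb{Z}) < \epsilon \} \le 3\epsilon \, \|f\|_\infty$$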
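The counting bound in the last estimate can be checked numerically. The following is only an illustrative sketch: the polynomial $p(n) = \sqrt{2}\, n^2$, the values of $N$ and $\epsilon$, and the helper name `fraction_near_integers` are assumptions chosen for the example and do not come from the text.

```python
import math

def fraction_near_integers(p, N, eps):
    """Return (1/N) * #{1 <= n <= N : dist(p(n), Z) < eps}."""
    count = 0
    for n in range(1, N + 1):
        value = p(n)
        dist = abs(value - round(value))  # distance to the nearest integer
        if dist < eps:
            count += 1
    return count / N

# Hypothetical example polynomial with an irrational leading coefficient.
def p(n):
    return math.sqrt(2) * n * n

eps = 0.05
frac = fraction_near_integers(p, 20000, eps)
```

Uniform distribution predicts a fraction close to $2\epsilon$ for large $N$, which leaves room below the bound $3\epsilon$ used above.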

Thus it suffices to show the a.s. convergence for $\widetilde{A}_N f$ with $\epsilon$ fixed. Transferring this problem to the shift model $(\mathbb{Z}, S)$ as before, we find that the Fourier transform of the kernel of $\widetilde{A}_N$ is given by:

$$\mathcal{F}[K_N](\alpha) = \sum_{k \in \mathbb{Z}} \widehat{\tau}(k - \alpha) \, \frac{1}{N} \sum_{n \le N} e^{2\pi i (k - \alpha) p(n)}$$

Now this expression is very similar to the one appearing in Section 2, and so using the same techniques as before one can show that:

$$\Big\| \sup_N |\widetilde{A}_N f| \Big\|_2 \lesssim_\epsilon \|f\|_2$$

finishing our argument.

References

[1] Bourgain, J., Pointwise Ergodic Theorems for arithmetic sets. Inst. Hautes Etudes Sci. Publ. Math., 69 (1989), pp. 5-45.

[2] Bourgain, J., On the Pointwise Ergodic Theorem on Lp for arithmetic sets. Israel J. Math., 61 (1) (1988), pp. 73-84.

Victor Lie, UCLA

email: [email protected]


8 The Ergodic Theoretical Proof of Szemeredi’s Theorem II

after H. Furstenberg, Y. Katznelson, and D. Ornstein [1]

A summary written by Anne E. McCarthy

Abstract

We complete the proof of H. Furstenberg’s multiple recurrence theorem via an analysis of factors of a measure-preserving system. We prove the existence of a maximal factor satisfying property SZ, and that any proper SZ subfactor has an SZ extension.

8.1 Introduction

It has been established in the preceding exposition that Szemeredi’s Theorem is equivalent to a statement about multiple recurrence for measure-preserving systems (m.p.s.). We will complete the proof of this ergodic theory result:

Theorem 1. Let $(X, \mathcal{B}, \mu, T)$ be a m.p.s., and let $A \in \mathcal{B}$ be a set with $\mu(A) > 0$. Then for any positive integer $k$,

$$\liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \mu(A \cap T^{-n}A \cap \cdots \cap T^{-kn}A) > 0.$$

The validity of this theorem has been established in the two prototypical examples of weak mixing systems and compact systems. We have also seen that any m.p.s. that is not weak mixing has a compact factor. Therefore, it is now known that Theorem 1 holds for a factor of any m.p.s. In this exposition we extend to the full generality of the theorem.

Definition 2. We say that a m.p.s. satisfies property SZ if the conclusion of Theorem 1 holds.

We proceed with the proof as follows. For a fixed m.p.s. $(X, \mathcal{B}, \mu, T)$, we consider the family $\mathcal{F}$ of all factors $(X, \mathcal{B}_1, \mu, T)$ that satisfy property SZ, ordered by inclusion. We will show:

1. The family F contains a maximal element.


2. No proper factor of (X,B, µ, T ) can be maximal.

These two facts taken together imply that the maximal element must be $(X, \mathcal{B}, \mu, T)$ itself. Therefore, the system satisfies property SZ.

The proof of item 1 is fairly straightforward, and is given in Section 8.3. To prove item 2 we will need to define the notions of a relatively weak mixing extension and a relatively compact extension. Using these notions, we proceed much as we did in showing that Theorem 1 holds for some factor. We prove that a relatively weak mixing extension is "relatively weak mixing of all orders," and hence SZ. We also show that a relatively compact extension of an SZ system is also SZ. To conclude, we show that if an extension $(X, \mathcal{B}, \mu, T)$ of an SZ system $(Y, \mathcal{D}, \nu, S)$ is not relatively weak mixing, there must be an intermediate extension that is relatively compact with respect to $(Y, \mathcal{D}, \nu, S)$. This final step is analogous to showing that a system that is not weak mixing has a compact factor.

8.2 Factors

We begin by establishing intuition for a factor of a m.p.s, as well as someuseful properties.

8.2.1 skew products

Let $(Y, \mathcal{D}, \nu, S)$ be a m.p.s., and $(Z, \mathcal{E}, \theta)$ be a measure space. Let $y \mapsto \sigma(y)$ be a function from $Y$ into measure-preserving transformations on $Z$, such that $(y, z) \mapsto \sigma(y)(z)$ is measurable. We define the skew product of $Y$ with $Z$ as follows. Let $X = Y \times Z$, $\mathcal{B} = \mathcal{D} \times \mathcal{E}$, and $\mu = \nu \times \theta$, and note that

$$T(y, z) = (S(y), \sigma(y)(z))$$

preserves measure on $(X, \mathcal{B}, \mu)$. Let $\pi : X \to Y$ be the projection $\pi(y, z) = y$. Setting $\mathcal{B}_1 = \pi^{-1}\mathcal{D}$, we can identify $(Y, \mathcal{D}, \nu, S)$ with the factor system $(X, \mathcal{B}_1, \mu, T)$, because they are isomorphic as m.p.s. Furthermore, a theorem of Rokhlin states that given a m.p.s., any factor can be thought of as existing within such a skew product.

Although we do not require the full strength of Rokhlin’s theorem, we will discuss one useful consequence. Because $X$ is a product space, we are able to disintegrate the measure $\mu$. By Fubini’s theorem, for any set $A \in \mathcal{B}$, we have

$$\mu(A) = \int \mu_y(A) \, d\nu(y),$$

where $\mu_y = \delta_y \times \theta$ is a measure supported on the fiber $\pi^{-1}(y)$. We also note that our measure-preserving transformations respect this disintegration in the sense that $\mu_y(T^{-1}A) = \mu_{Sy}(A)$.
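A finite toy model may help make the skew product and the fiber measures concrete. The sketch below is an assumption-laden illustration, not part of the exposition: it takes $Y = \mathbb{Z}/m$ with $S(y) = y + 1$, $Z = \mathbb{Z}/n$ with $\sigma(y)$ the rotation by $y$, uniform measures on both, and an arbitrarily chosen set $A$; all function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

m, n = 4, 6  # hypothetical sizes of the base Y = Z/m and the fiber Z = Z/n

def S(y):
    return (y + 1) % m

def sigma(y, z):
    # sigma(y) is the rotation of Z by y, a bijection of Z for each y
    return (z + y) % n

def T(point):
    y, z = point
    return (S(y), sigma(y, z))

X = list(product(range(m), range(n)))

def mu(A):
    """Uniform product measure nu x theta on X = Y x Z."""
    return Fraction(len(A), m * n)

def mu_fiber(y, A):
    """mu_y(A): theta-measure of the slice of A lying over y."""
    return Fraction(sum(1 for z in range(n) if (y, z) in A), n)

def preimage(A):
    return {x for x in X if T(x) in A}

# An arbitrary test set, chosen only for illustration.
A = {(y, z) for (y, z) in X if (y * z + z) % 3 == 0}
```

In this model one can verify exactly that $T$ preserves $\mu$, that $\mu(A)$ is the $\nu$-average of the fiber measures $\mu_y(A)$, and that $\mu_y(T^{-1}A) = \mu_{Sy}(A)$ for every $y$.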

8.2.2 projecting functions

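We will denote $\mathbf{X} = (X, \mathcal{B}, \mu)$ and $\mathbf{Y} = (Y, \mathcal{D}, \nu)$, where $\mathbf{Y} \simeq (X, \mathcal{B}_1, \mu)$ will always be thought of as a factor of $\mathbf{X}$ in the above sense. Given a function $g \in L^2(\mathbf{X})$, we define the $L^2(\mathbf{Y})$ function $g$ conditioned on $\mathbf{Y}$ via

$$E(g|\mathbf{Y})(y) = \int_{\pi^{-1}(y)} g \, d\mu_y.$$

Alternatively, $L^2(X, \mathcal{B}_1, \mu)$ is a closed subspace of $L^2(\mathbf{X})$, and it is equivalent to define $E(g|\mathbf{Y})$ as the orthogonal projection of $g \in L^2(\mathbf{X})$ onto $L^2(X, \mathcal{B}_1, \mu)$. Viewing $L^2(X, \mathcal{B}_1, \mu) \subset L^2(\mathbf{X})$, we can also view functions on $\mathbf{Y}$ as functions on $\mathbf{X}$. We note some useful properties of projected functions:

$$E(gf|\mathbf{Y}) = g \, E(f|\mathbf{Y}) \quad \text{if } g \text{ is } \mathcal{B}_1\text{-measurable}$$

$$S E(f|\mathbf{Y}) = E(Tf|\mathbf{Y})$$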

8.2.3 fiber squares (or relative squares)

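Using the skew-product structure $X = Y \times Z$, with assignment $\sigma(y)$, we define a new space $X \times_Y X$ as follows. Let $\widetilde{X} = Y \times Z \times Z$, $\widetilde{\mathcal{B}} = \mathcal{D} \times \mathcal{E} \times \mathcal{E}$, and $\widetilde{\mu} = \nu \times \theta \times \theta$. We define the transformation $\widetilde{T}(y, z, z') = (S(y), \sigma(y)(z), \sigma(y)(z'))$.

Given the projection $\pi : X \to Y$, we can also identify $\widetilde{X} = \bigcup_{y \in Y} \pi^{-1}(y) \times \pi^{-1}(y) \subset X \times X$. Also $\widetilde{\mu} = \int \widetilde{\mu}_y \, d\nu(y)$, where $\widetilde{\mu}_y = \mu_y \times \mu_y$, and $\widetilde{\mathcal{B}}$ is the restriction of $\mathcal{B} \times \mathcal{B}$ to $\widetilde{X}$. We denote this system $\widetilde{\mathbf{X}} = \mathbf{X} \times_{\mathbf{Y}} \mathbf{X}$. In the case of relative weak mixing and relative compactness with respect to a factor, we replace the role of $\mathbf{X} \times \mathbf{X}$ with $\mathbf{X} \times_{\mathbf{Y}} \mathbf{X}$. In this setting, the following identity will be helpful:

$$\int_{X \times_Y X} f(y, z) f(y, z') \, d\widetilde{\mu}(y, z, z') = \int_Y E(f|\mathbf{Y})^2 \, d\nu(y)$$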


8.3 Maximal SZ Factors

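Definition 3. A factor $(X, \mathcal{B}_1, \mu, T)$ is said to be Szemeredi (SZ) if

$$\liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \mu(A \cap T^{-n}A \cap \cdots \cap T^{-kn}A) > 0$$

for all $k \in \mathbb{Z}_+$, and for all $A \in \mathcal{B}_1$ with $\mu(A) > 0$.

Recall that for every m.p.s. $(X, \mathcal{B}, \mu, T)$, there exists a factor $(X, \mathcal{B}_1, \mu, T)$ that is SZ. Let $\mathcal{F}$ denote the family of all factors of $(X, \mathcal{B}, \mu, T)$ that are SZ, ordered by inclusion. We will show that $\mathcal{F}$ has a maximal element. For a totally ordered set of factors $\{\mathcal{B}_\alpha\}$, the σ-algebra $\sup_\alpha \mathcal{B}_\alpha$ is characterized by those $A \in \mathcal{B}$ that satisfy the condition: given any $\varepsilon > 0$, there exists $A_0 \in \bigcup_\alpha \mathcal{B}_\alpha$ such that $\mu(A \,\triangle\, A_0) < \varepsilon$.

Theorem 4. Let $\{\mathcal{B}_\alpha\}$ be a totally ordered family of factors, and suppose $\{\mathcal{B}_\alpha\} \subset \mathcal{F}$. Then $\sup_\alpha \mathcal{B}_\alpha$ is SZ.

Proof. Fix $k > 0$ and $A \in \sup_\alpha \mathcal{B}_\alpha$ with $\mu(A) > 0$. Set $\eta = \frac{1}{2(k+1)}$ and choose $A'_0 \in \mathcal{B}_{\alpha_0}$ such that $\mu(A \,\triangle\, A'_0) < \frac{\eta}{4}\mu(A)$.

Consider the factor $(X, \mathcal{B}_{\alpha_0}, \mu, T) \simeq (Y, \mathcal{D}_0, \nu, T_0)$, and let $A''_0 = \pi(A'_0) \in \mathcal{D}_0$. A simple calculation shows that $\nu(\{y \in A''_0 : \mu_y(A) < 1 - \eta\}) < \frac{1}{4}\mu(A)$. We now define the set $A_0 = \{y \in A''_0 : \mu_y(A) > 1 - \eta\}$. By the previous comment, we can conclude that $\nu(A_0) > \frac{1}{2}\mu(A)$.

Because $\mathcal{B}_{\alpha_0} \in \mathcal{F}$, the system $(Y, \mathcal{D}_0, \nu, T_0)$ is SZ. Therefore,

$$\liminf_{N \to \infty} \frac{1}{N} \sum_{j=1}^{N} \nu(A_0 \cap T^{-j}A_0 \cap \cdots \cap T^{-kj}A_0) = a > 0.$$

We can show that for $y \in A_0 \cap T^{-j}A_0 \cap \cdots \cap T^{-kj}A_0$, we have the inequality

$$\mu_y(A \cap T^{-j}A \cap \cdots \cap T^{-kj}A) > \frac{1}{2}.$$

Since for any $B \in \mathcal{B}$ we have $\mu(B) = \int \mu_y(B) \, d\nu(y)$, we can use the above inequality to bound $\mu\big(\bigcap_{l=0}^{k} T^{-jl}A\big)$ below by $\frac{1}{2}\nu\big(\bigcap_{l=0}^{k} T^{-jl}A_0\big)$. Averaging over $j$, the lower limit of the averages for $A$ is at least $a/2 > 0$.

We now apply Zorn’s Lemma to $\mathcal{F}$, to conclude that there is a maximal factor that is SZ.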


8.4 Relative Weak Mixing

In previous work, we have established that if a system is weak mixing it is SZ. We will now define the notion of an extension $(X, \mathcal{B}, \mu, T)$ being weak mixing relative to a factor $(Y, \mathcal{D}, \nu, S)$. The main goal is to show that if an extension $(X, \mathcal{B}, \mu, T)$ is weak mixing relative to an SZ factor $(Y, \mathcal{D}, \nu, S)$, then the extension is SZ as well.

Let us briefly highlight a few facts about weak mixing systems. We say that $(X, \mathcal{B}, \mu, T)$ is weak mixing if the product $T \times T$ is ergodic. This is equivalent to the statement that for all $f, g \in L^\infty(X, \mathcal{B}, \mu)$,

$$\frac{1}{N} \sum_{n=1}^{N} \Big( \int f \, T^n g \, d\mu - \int f \, d\mu \int g \, d\mu \Big)^2 \to 0.$$

We say that a system $(X, \mathcal{B}, \mu, T)$ is weak mixing relative to a factor $(Y, \mathcal{D}, \nu, S)$ if the system $(X \times_Y X, \widetilde{\mathcal{B}}, \widetilde{\mu}, \widetilde{T})$ is ergodic. For a relatively weak mixing extension, we have that for all $f, g \in L^\infty(X, \mathcal{B}, \mu)$,

$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \int \big( E(f \, T^n g|\mathbf{Y}) - E(f|\mathbf{Y}) \, S^n E(g|\mathbf{Y}) \big)^2 d\nu = 0.$$

In the case where $f$ and $g$ are characteristic functions of sets $A$ and $B$, the previous expression can be rewritten as

$$\int \frac{1}{N} \sum_{n=1}^{N} |\mu_y(A \cap T^{-n}B) - \mu_y(A)\mu_y(T^{-n}B)|^2 \, d\nu \to 0.$$

This says that for a relatively weak mixing extension, for most $n$ and $y \in Y$, the sets $A$ and $T^{-n}B$ are independent with respect to $\mu_y$. Interpret this as "relative weak mixing gives weak mixing on the fibers." The theorem below can be explained intuitively by saying that if the base action of $(Y, \mathcal{D}, \nu, S)$ is SZ, and the fibers are weak mixing, then the whole system is SZ.

Theorem 5. Let $(X, \mathcal{B}, \mu, T)$ be a relatively weak mixing extension of the system $(Y, \mathcal{D}, \nu, S)$. If the action of $S$ on $\mathcal{D}$ is SZ, then so is that of $T$ on $X$.

The proof follows the structure of the proof that weak mixing systems are SZ. We show that relative weak mixing gives a "relative weak mixing of all orders." Much like the weak mixing case, this statement is established through an inductive argument using two inequalities: one is the condition for relative weak mixing of order $k$, and the other gives strong convergence in $L^2$; however, we now consider functions projected onto $L^2(Y, \mathcal{D}, \nu)$.


8.5 Compact Extensions and Existence

A function $f \in L^2(X, \mathcal{B}, \mu)$ is almost periodic relative to $\mathbf{Y}$ (AP) if for all $\delta > 0$ there exist $g_1, g_2, \ldots, g_n \in L^2(X, \mathcal{B}, \mu)$ such that for all $j \in \mathbb{Z}$ we have $\inf_{1 \le s \le n} \|T^j f - g_s\|_{L^2(\mu_y)} < \delta$ for all $y \in Y$. The extension $(X, \mathcal{B}, \mu, T)$ is a compact extension of $(Y, \mathcal{D}, \nu, S)$ if the set of AP functions is dense in $L^2(X, \mathcal{B}, \mu)$.

Theorem 6. Let $(X, \mathcal{B}, \mu, T)$ be a compact extension of $(Y, \mathcal{D}, \nu, S)$. If the system $(Y, \mathcal{D}, \nu, S)$ is SZ then so is $(X, \mathcal{B}, \mu, T)$.

Some elements of this proof: Fix a set $A \in \mathcal{B}$. Since AP functions are dense, we are able to slightly modify the set $A$ so that $f = 1_A$ is an AP function. We then remove from $A$ the fibers $\pi^{-1}(y)$ for which $\mu_y(A) < \frac{\mu(A)}{2}$. Let $A_1$ be the set of $y \in Y$ for which $\mu_y(A) \ge \frac{\mu(A)}{2}$. This set $A_1$ has $\nu(A_1) > \frac{\mu(A)}{2}$. For $y \in A_1$ we consider the sequence of vectors $L(k, f, y) = (f, T^n f, \cdots, T^{kn} f)_{n \in \mathbb{Z}} \in \bigoplus_{l=0}^{k} L^2(\mu_y)$. The fact that $f$ is AP (each $T^j f$ is close to some $g_s$, $1 \le s \le n$) tells us that this sequence is totally bounded. This total boundedness allows us to find maximal $\varepsilon$-separated sets. We are then able to prove that the system is SZ using ideas similar to those used in proving that a compact system is SZ.

The proof of Szemeredi’s theorem can now be concluded by establishing that any proper SZ factor has either a compact extension, or an extension that is relatively weak mixing. For then we can conclude that the maximal SZ factor must be the original system itself. This result is proved in the last section, and is similar to the proof that a system that is not weak mixing has a compact factor.

References

[1] Furstenberg, H., Katznelson, Y., and Ornstein, D., The Ergodic Theoretic Proof of Szemeredi’s Theorem. Comment. Math. Helv. 62 (1987), no. 1, 18–37.

Anne E. McCarthy, Temple University

email: [email protected]


9 Multiple recurrence and Szemeredi’s theorem

after H. Furstenberg, Y. Katznelson, and D. Ornstein [2]

A summary written by Richard Oberlin

Abstract

In [2] it is shown that Szemeredi’s theorem concerning arithmetic progressions is equivalent to a statement about multiple recurrence of measure-preserving transformations, and an ergodic theoretic proof of this statement is given. We summarize the first five sections of [2].

9.1 Introduction

Let $(X, \mathcal{B}, \mu)$ be a probability measure space, and let $T$ be an invertible measure-preserving transformation on $(X, \mathcal{B}, \mu)$. Recurrence is the notion that the orbits of $T$ should return close to their initial position. This was originally shown to occur by Poincare:

Theorem 1. Let $A \in \mathcal{B}$ be a set of positive measure. Then for some $n > 0$

$$\mu(A \cap T^{-n}A) > 0. \qquad (1)$$

The proof of the Poincare recurrence theorem is simple. Suppose that (1) does not hold for any $n$. Then, since $T$ is measure preserving, $\mu(T^{-j}A \cap T^{-i}A) = \mu(T^{-(j-i)}A \cap A) = 0$ for every $j > i$. Thus, $\bigcup_{j=0}^{\infty} T^{-j}A$ is an essentially disjoint union and has measure $\sum_{j=0}^{\infty} \mu(A) = \infty$. This contradicts the fact that $\mu$ is a probability measure.

The following “multiple recurrence” theorem of Furstenberg generalizes Theorem 1, showing that for every $k$ the transformations $T, T^2, \ldots, T^{k-1}$ must exhibit simultaneous recurrence.

Theorem 2. Let $(X, \mathcal{B}, \mu)$ be a probability measure space and let $T$ be an invertible measure-preserving transformation on $(X, \mathcal{B}, \mu)$. Suppose $A \in \mathcal{B}$ has positive measure. Then for every $k > 0$ there exists an $n > 0$ such that

$$\mu\Big( \bigcap_{j=0}^{k-1} T^{-jn}A \Big) > 0.$$


The proof of Theorem 2 is decidedly more subtle than that of Theorem 1. Indeed, Furstenberg observed that Theorem 2 is equivalent to Szemeredi’s theorem, a deep result in combinatorial number theory.

Let $\Lambda \subset \mathbb{Z}$. We define the upper density of $\Lambda$

$$UD(\Lambda) := \limsup_{N \to \infty} \sup_{b - a = N} \frac{|\Lambda \cap [a, b)|}{N},$$

where $| \cdot |$ denotes cardinality. Szemeredi’s theorem, proven by Szemeredi in 1975, answered a forty-year-old conjecture of Erdos and Turan.

Theorem 3. Suppose that $UD(\Lambda) > 0$. Then $\Lambda$ contains arbitrarily long arithmetic progressions.

Furstenberg’s observation in 1976 that Theorem 3 is equivalent to Theorem 2 thus proved Theorem 2. However, Furstenberg went much further by giving a direct ergodic theoretic proof of Theorem 2, thereby giving a new proof of Szemeredi’s theorem. The method of Furstenberg’s proof has turned out to be flexible enough to give certain generalizations of Theorem 3 which seem to be inaccessible by Szemeredi’s method, and Szemeredi’s method gives quantitative estimates related to Theorem 3 which seem to be inaccessible by Furstenberg’s method.

9.2 Theorem 2 implies Theorem 3

Here, we focus on the more difficult half of the equivalence of Theorems 2 and 3.

Let $\Lambda \subset \mathbb{Z}$ with $UD(\Lambda) > 0$, and let $k > 0$. We need to find $a \in \mathbb{Z}$ and $b \in \mathbb{Z} \setminus \{0\}$ such that

$$\{a + bj\}_{j=0}^{k-1} \subset \Lambda. \qquad (2)$$

Consider the metric on $2^{\mathbb{Z}}$, say,

$$d(x, y) := \sum_{j \in \mathbb{Z} \,:\, x_j \ne y_j} 2^{-|j|} \qquad (3)$$

and let $T$ be the shift on $2^{\mathbb{Z}}$, $(Tx)_j := x_{j+1}$. Let $\omega$ be the characteristic function of $\Lambda$, and let $X$ be the closure in $2^{\mathbb{Z}}$ of $\{T^n \omega : n \in \mathbb{Z}\}$. Note that $X$ is compact (since $2^{\mathbb{Z}}$ is compact) and $T$-invariant.


We define the open subset of $X$, $A := \{\omega \in X : \omega_0 = 1\}$. Suppose that for some $n > 0$:

$$\bigcap_{j=0}^{k-1} T^{-jn}A \ne \emptyset. \qquad (4)$$

Every element of the intersection (4) contains the arithmetic progression $\{jn\}_{j=0}^{k-1}$. Since each $T^{-jn}A$ is open, the intersection (4) is open, and thus by density contains some translate of $\omega$. Hence, we obtain (2).

To see (4) from Theorem 2, it remains to find a $T$-invariant probability measure $\mu$ on $X$ such that $\mu(A) > 0$. Such a measure must exist based on the assumption that $UD(\Lambda) > 0$. Indeed, choose $\epsilon > 0$ and sequences $\{a_j\}_{j=1}^{\infty}$ and $\{b_j\}_{j=1}^{\infty}$ such that

$$\lim_{j \to \infty} (b_j - a_j) = \infty \qquad (5)$$

and each

$$\frac{|\Lambda \cap [a_j, b_j)|}{b_j - a_j} > \epsilon. \qquad (6)$$

Define the sequence of Borel probability measures on $X$

$$\mu_j(E) := \frac{|\{l \in [a_j, b_j) : T^l \omega \in E\}|}{b_j - a_j}.$$

By the Banach-Alaoglu theorem, the closed unit ball in $C(X)^*$ is weak*-compact. Additionally, since $X$ is a compact metric space, $C(X)$ is separable and hence the closed unit ball in $C(X)^*$ with the weak*-topology is metrizable. Thus, some subsequence of $\{\mu_j\}_{j=1}^{\infty}$ converges to a probability measure $\mu$ on $X$. (Alternately, one may use a simple diagonalization argument in conjunction with the separability of $C(X)$.) From (6), we see that each $\mu_j(A) > \epsilon$ and hence $\mu(A) \ge \epsilon$. From (5), we see that $\mu$ is $T$-invariant.
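The dictionary used in this section can be illustrated on a finite sample. Since $T^l\omega \in A$ exactly when $l \in \Lambda$, the quantity $\mu_j(A)$ is the density of $\Lambda$ in the window $[a_j, b_j)$, and a point $T^a\omega$ of the intersection (4) corresponds to a progression $a, a+n, \ldots, a+(k-1)n$ in $\Lambda$. The sketch below assumes a hypothetical periodic example set and hypothetical helper names; it is a brute-force illustration, not the argument of [2].

```python
def window_density(Lam, a, b):
    """|Lam ∩ [a, b)| / (b - a), which equals mu_j(A) for the window [a, b)."""
    return sum(1 for l in range(a, b) if l in Lam) / (b - a)

def find_progression(Lam, k, search):
    """Return (a, n) with a, a+n, ..., a+(k-1)n all in Lam, or None.

    Only starting points a in `search` and steps 1 <= n < len(search) are tried.
    """
    for n in range(1, len(search)):
        for a in search:
            if all(a + j * n in Lam for j in range(k)):
                return a, n
    return None

# Hypothetical example set: integers in [0, 300) with residue 1, 2 or 4 mod 7.
Lam = {x for x in range(300) if x % 7 in (1, 2, 4)}
density = window_density(Lam, 0, 280)
progression = find_progression(Lam, 5, range(0, 100))
```

For this example the window density is $3/7$, and the search finds a five-term progression with common difference $7$, the period of the set.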

9.3 Beginning the proof of Theorem 2

Associated to each measure-preserving system $(X, \mathcal{B}, \mu, T)$ is the unitary operator $U$ on $L^2(X, \mathcal{B}, \mu)$ given by $Uf(x) := f(T(x))$. We first consider Theorem 2 under the assumption that the spectral behavior of $U$ lies at one of two extreme ends, and later we treat the cases in between.


9.3.1 Weak-mixing systems

For certain types of measure-preserving systems, we may strengthen the conclusion of Theorem 1. If $(X, \mathcal{B}, \mu, T)$ is ergodic, it follows from a weak form of the mean ergodic theorem that for any $A_0, A_1 \in \mathcal{B}$

$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \mu(A_0 \cap T^{-n}A_1) = \mu(A_0)\mu(A_1). \qquad (7)$$

In fact it is easily seen that this only holds when $(X, \mathcal{B}, \mu, T)$ is ergodic.

If we make the stronger assumption that the product system $(X \times X, \mathcal{B} \times \mathcal{B}, \mu \times \mu, T \times T)$ is ergodic, then $(X, \mathcal{B}, \mu, T)$ is said to be weak-mixing, and we find that for every $A_0, A_1 \in \mathcal{B}$

$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} |\mu(A_0 \cap T^{-n}A_1) - \mu(A_0)\mu(A_1)| = 0. \qquad (8)$$

Again, if (8) holds for every $A_0, A_1 \in \mathcal{B}$, then $(X, \mathcal{B}, \mu, T)$ must be weak-mixing. Additionally, it is well known that $(X, \mathcal{B}, \mu, T)$ is weak-mixing if and only if the unitary operator induced by $T$ has continuous spectrum except for the eigenspace of constant functions (see [3]).

A measure-preserving system $(X, \mathcal{B}, \mu, T)$ is said to be “weak-mixing of all orders” if for every $k > 0$ and $A_0, \ldots, A_{k-1} \in \mathcal{B}$,

$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \Big| \mu\Big( \bigcap_{j=0}^{k-1} T^{-jn}A_j \Big) - \prod_{j=0}^{k-1} \mu(A_j) \Big| = 0. \qquad (9)$$

Theorem 4. Every weak-mixing system is weak-mixing of all orders.

Thus, for weak-mixing systems we obtain a stronger form of Theorem 2.

Let $X = 2^{\mathbb{Z}}$, $\mathcal{B}$ the Borel σ-algebra with respect to (3), $\mu$ the product over $\mathbb{Z}$ of the probability measure on $\{0, 1\}$ with $\mu(\{0\}) = \mu(\{1\}) = \frac{1}{2}$, and let $T$ be the shift as in Section 9.2. This system is a Bernoulli system. Bernoulli systems are weak-mixing and one may verify directly that they are weak-mixing of all orders.

9.3.2 Compact systems

If $T^p$ is the identity for some $p > 0$ then $T$ is said to be periodic and Theorem 2 is a triviality. If $T$ is “almost periodic” for every $f \in L^2(X, \mathcal{B}, \mu)$, i.e. if the orbit of $f$, $\{f, Tf, T^2 f, \ldots\}$, has compact closure in $L^2$, then the system $(X, \mathcal{B}, \mu, T)$ is said to be compact.

A system is compact if and only if the unitary operator induced by $T$ has discrete spectrum, and it is in this way that compact systems are the opposite of weak-mixing systems.

Theorem 5. Suppose $(X, \mathcal{B}, \mu, T)$ is compact. Then for every $k > 0$ and every set $A \in \mathcal{B}$ of positive measure

$$\liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \mu\Big( \bigcap_{j=0}^{k-1} T^{-jn}A \Big) > 0.$$

Thus, we obtain a stronger (though not as strong as (9)) version of Theorem 2 for compact systems (and in fact this is the version that will be proven for general systems).

The class of compact systems is exemplified by the system of irrational rotations of the circle. Let $X$ be the circle, represented by the reals modulo the integers $\mathbb{R}/\mathbb{Z}$, with Lebesgue measure $\mu$ and the collection $\mathcal{B}$ of Borel subsets of $X$. We set $T(x) := x + \alpha$ where $\alpha \in \mathbb{R}$ is fixed (and irrational if one wants to avoid a trivial periodic system).

Let $A \in \mathcal{B}$ be a set of positive measure, and $k > 0$. Since the convolution $\chi_A * \chi_{-A}$ is continuous, we may find $\delta > 0$ such that $\mu(A \cap (A + x)) > \mu(A) - \frac{\mu(A)}{2k}$ when $|x| < \delta$. In particular, this implies that

$$\mu\Big( \bigcap_{j=0}^{k-1} (A + jx) \Big) > 0 \qquad (10)$$

when $|x| < \delta$.

It is well known (see for example [4], page 302) that for any positive integer $n$, we may find a rational number $\frac{a}{b}$ with $b \le n$ and

$$\Big| \alpha - \frac{a}{b} \Big| \le \frac{1}{b(n+1)}.$$

Choosing $\frac{1}{n+1} < \delta$, we see that $T^{-b}$ is the rotation $y \to y + x$ where $|x| < \delta$. Hence, from (10) we obtain

$$\mu\Big( \bigcap_{j=0}^{k-1} T^{-jb}A \Big) > 0,$$

and thus Theorem 2 holds for $(X, \mathcal{B}, \mu, T)$.
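The choice of the return time $b$ in this argument can be made concrete by a brute-force search. The sketch below is an illustration under stated assumptions: the rotation number $\alpha = \sqrt{2}$, the threshold $\delta = 0.01$, and the helper name `dirichlet_denominator` are all hypothetical choices, and floating-point arithmetic is used for simplicity.

```python
import math

def dirichlet_denominator(alpha, n):
    """Return b with 1 <= b <= n minimizing the distance from b*alpha to Z."""
    def dist_to_integers(t):
        return abs(t - round(t))
    return min(range(1, n + 1), key=lambda b: dist_to_integers(b * alpha))

alpha = math.sqrt(2)       # hypothetical irrational rotation number
delta = 0.01               # hypothetical closeness threshold
n = math.ceil(1 / delta)   # guarantees 1/(n+1) < delta
b = dirichlet_denominator(alpha, n)

# Signed distance from b*alpha to the nearest integer: T^b is the rotation
# by x modulo 1, and T^{-b} is the rotation by -x.
x = b * alpha - round(b * alpha)
```

By the approximation result quoted above, the minimizing $b$ satisfies $|x| \le \frac{1}{n+1} < \delta$, so the $b$-th power of the rotation moves every point by less than $\delta$.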


9.3.3 Weak-mixing and compact factors

Given a measure-preserving system $(X, \mathcal{B}, \mu, T)$, suppose $\mathcal{B}_1 \subset \mathcal{B}$ is a nontrivial $T$-invariant σ-algebra. Then we may form the new measure-preserving system $(X, \mathcal{B}_1, \mu, T)$. Such a system is referred to as a factor of $(X, \mathcal{B}, \mu, T)$. Although we are still far from proving Theorem 2 for arbitrary measure-preserving systems, the following theorem, when combined with Theorems 4 and 5, shows that Theorem 2 at least holds for a factor of every measure-preserving system.

Theorem 6. A system $(X, \mathcal{B}, \mu, T)$ is weak-mixing if and only if it has no nontrivial compact factors.

Sketch of proof. Suppose $(X, \mathcal{B}, \mu, T)$ is not weak-mixing. Then, there is a nonconstant $\mathcal{B}$-measurable eigenfunction $f$ of $T$ with eigenvalue $\lambda$. Since $T$ is unitary, $|\lambda| = 1$. The orbit closure $\overline{\{T^n f\}_{n=0}^{\infty}}$ is isomorphic to $\overline{\{\lambda^n\}_{n=0}^{\infty}}$, and hence is compact. One may verify that if $g$ is $\mathcal{B}_0$-measurable, where $\mathcal{B}_0$ is the smallest σ-algebra such that $f$ is $\mathcal{B}_0$-measurable, then $\overline{\{T^n g\}_{n=0}^{\infty}}$ is compact. Letting $\mathcal{B}_1$ be the smallest σ-algebra containing $\bigcup_{n=0}^{\infty} T^{-n}\mathcal{B}_0$, we see that $\mathcal{B}_1$ is $T$-invariant and $(X, \mathcal{B}_1, \mu, T)$ is a compact system.

If $(X, \mathcal{B}, \mu, T)$ has a compact factor, then we may choose $A \in \mathcal{B}$ with $0 < \mu(A) < 1$ such that $\{T^n \chi_A\}_{n=0}^{\infty}$ has compact closure. This implies that $\|\chi_A - \chi_{T^{-j}A}\|_{L^2}$ may be taken arbitrarily small for a set of $j$’s with positive density. Thus we may not have (8) with $A_0 = A_1 = A$.

References

[1] Furstenberg, H., Recurrence in Ergodic Theory and Combinatorial Number Theory. Princeton University Press, 1981.

[2] Furstenberg, H., Katznelson, Y., and Ornstein, D., The ergodic theoretical proof of Szemeredi’s theorem. Bull. Amer. Math. Soc. 7 (1982), pp. 527-552.

[3] Halmos, P.R., Ergodic Theory. Chelsea Publishing Company, 1956.

[4] Montgomery, H.L., Niven, I., Zuckerman, H.S., An introduction to the theory of numbers, Fifth Edition. John Wiley & Sons Inc., 1991.

Richard Oberlin, University of Wisconsin Madison

email: [email protected]


10 Entropy of Convolutions on the Circle, I

after E. Lindenstrauss, D. Meiri and Y. Peres [5]

A summary written by Robert C. Rhoades

Abstract

We begin with a brief introduction to the main theorems of [5]. Our discussion focuses on the connections of this work with previous works of Furstenberg, Lyons, Johnson, Rudolph, and Host. The starting point is Furstenberg’s paper [1].

10.1 Furstenberg’s Conjecture

Lindenstrauss, Meiri, and Peres write:

Let $p \ge 2$ be any integer ($p$ need not be prime), and $\mathbb{T} = \mathbb{R}/\mathbb{Z}$ the 1-torus. Denote by $\sigma_p$ the $p$-to-one map $x \mapsto px \pmod 1$. The pair $(\mathbb{T}, \sigma_p)$ is a dynamical system that has additional structure: $\mathbb{T}$ is a commutative group (with the group operation being addition mod 1), and $\sigma_p$ is an endomorphism of it. Even in such a simple system, the interaction between the dynamics and the algebraic structure of $\mathbb{T}$ can be quite subtle; the present work continues the study of this interaction, inspired by the fundamental work of Furstenberg [1].

Furstenberg [1] gave the following very general conjecture about the connection between the group theoretic structure and the dynamics of the system.

Conjecture 2 (Furstenberg’s Conjecture, [1]). The only ergodic invariant measures for the semi-group of circle endomorphisms generated by $\sigma_p$ and $\sigma_q$ for $p$ and $q$ multiplicatively independent (i.e. $\log p / \log q \notin \mathbb{Q}$, that is, $p$ and $q$ are not integral powers of a single integer) are Lebesgue measure and atomic measures concentrated on periodic orbits.

This problem was out of the reach of Furstenberg; however, he was able to prove the corresponding topological result. We say a set $X \subset \mathbb{T}$ is $n$-invariant if for all $x \in X$, $nx \pmod 1 \in X$. Then the corresponding topological result is:

Theorem 1. Any infinite closed set in $\mathbb{T}$, invariant under multiplication by two multiplicatively independent integers $p$ and $q$, must be all of $\mathbb{T}$.


The original conjecture has had a long history and, as stated above, is still open. The strongest result is due to Johnson [2], who assumes the additional condition that $\mu$ has positive entropy.

A similar problem that has been studied more recently, and is perhaps more general, is the one where the measure $\mu$ is assumed to be $\sigma_p$-invariant and we study the action of $\sigma_{c_n}$ on $\mu$ for some sequence $\{c_n\}$. This problem is perhaps most closely related to the conjecture of Furstenberg in the case where $c_n = q^n$. This particular case was studied by Host [3].

One variation of the problem, which seems to be the simplest to consider, is to investigate

$$\frac{1}{N} \sum_{n=0}^{N-1} \sigma_{c_n} \mu.$$

In the paper of Host [3], the following theorem is obtained:

Theorem 2 (Host, [3]). Let $p, q > 1$ be relatively prime integers. Let $\mu$ be a Borel measure on the torus $\mathbb{T}$, invariant under $\sigma_p$ and ergodic with positive entropy. Then

$$\frac{1}{N} \sum_{n=0}^{N-1} \sigma_{q^n} \mu \to \lambda$$

in the weak* topology, where $\lambda$ denotes Lebesgue measure.

This yields the following definition: Let $\{c_n\}$ be a sequence of integers; then a measure $\mu$ on $\mathbb{T}$ is $\{c_n\}$-generic if

$$\frac{1}{N} \sum_{n=0}^{N-1} \sigma_{c_n} \mu \to \lambda$$

in the weak* topology.

In fact, Host proves a much stronger result. Define a measure $\mu$ on $\mathbb{T}$ to be $\{c_n\}$-normal if $\{c_n x \pmod 1\}$ is uniformly distributed for $\mu$-almost every $x \in \mathbb{T}$. Then Host proves the following:

Theorem 3 (Host, [3]). Let $p, q > 1$ be relatively prime integers. If $\mu$ is a Borel measure on the torus, invariant under $\sigma_p$, and ergodic with positive entropy, then $\mu$ is $\{q^n\}$-normal.


By Weyl’s criterion this result is equivalent to the fact that for all a ∈Z \ 0,

1

N

N−1∑

n=0

e(aqnx) → 0 µ− a.e.,

where e(x) = e2πix. In turn, this condition implies that µ is qn-generic.To show how we can then deduce that µ = λ, when σqµ = µ, and thus ob-

tain the original desired result of Furstenberg, we apply the following lemmadue to Johnson and Rudolph:

Lemma 4 (Johnson and Rudolph [4]). Suppose that $\nu, \nu_1, \nu_2, \dots$ are measures invariant under $\sigma_p$, and that $\nu$ is ergodic. Suppose also that

\[ \frac{1}{N} \sum_{n=1}^{N} \nu_n \to \nu \]

in the weak* topology. Then there exists a zero-density set $J \subset \mathbb{N}$ such that $\nu_n \to \nu$ (weakly) as $n \to \infty$ for $n \notin J$.

This line of reasoning shows how Furstenberg’s original conjecture is the weakest in a string of results about the interaction between the algebraic structure of T and the dynamics on T. From here on we focus our attention not on Furstenberg’s conjecture but on the other results, which are more general and stronger.
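The Weyl averages above can be explored numerically. The following is only a sketch under stated assumptions: the point x is a dyadic rational with randomly chosen binary digits (a stand-in for a Lebesgue-typical point, not for a general $\sigma_p$-invariant measure), q = 3, and the helper name `weyl_average` is introduced here for illustration. Exact integer arithmetic is used because floating-point evaluation of $q^n x \pmod 1$ loses all precision after a few dozen steps.

```python
import cmath
import random


def weyl_average(a, q, x_num, bits, N):
    """Return (1/N) * sum_{n=0}^{N-1} e(a * q^n * x) for x = x_num / 2**bits.

    The orbit q^n x (mod 1) is tracked exactly as an integer modulo 2**bits.
    q should be odd, otherwise the orbit collapses to 0 after `bits` steps.
    """
    modulus = 1 << bits
    y = x_num % modulus                  # y / 2**bits equals q^n x (mod 1)
    total = 0j
    for _ in range(N):
        # Keep only the leading 53 bits when converting to a float.
        frac = (y >> (bits - 53)) / float(1 << 53)
        total += cmath.exp(2j * cmath.pi * a * frac)
        y = (y * q) % modulus
    return total / N


random.seed(0)
BITS = 4096                              # enough digits to cover 2000 steps of x -> 3x
x_num = random.getrandbits(BITS)
for a in (1, 2, 5):
    print(a, abs(weyl_average(a, 3, x_num, BITS, 2000)))
```

For such a point the printed moduli should be small and shrink as N grows, whereas at x = 0 every average equals 1.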

10.2 Uniform Distribution and cn-genericity

We have seen that via Weyl’s criterion we can move from a uniform distribution result about a measure µ and a sequence $c_n$ to a result about the “on average” behavior of $\sigma_{c_n}\mu$. This approach was originally due to Host [3]. His approach leads naturally to the definition of the p-adic collision exponent: given an integer sequence $c_n$ and an integer $p > 1$, we define the p-adic collision exponent of the sequence as

\[ \Gamma_p(c_n) = \limsup_{n \to \infty} \frac{\log \left| \{ 0 \le k, \ell < p^n : c_k \equiv c_\ell \pmod{p^n} \} \right|}{n \log p}. \]

Remark 5. Since $k = \ell$ is allowed, we always have $1 \le \Gamma_p(c_n) \le 2$.
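To make the definition concrete, the quantity inside the lim sup can be computed for a single n by brute force. This is a minimal sketch: the function name `collision_ratio` and the sample sequences are illustrative choices, and a finite computation can only suggest the value of the lim sup, not determine it.

```python
import math
from collections import Counter


def collision_ratio(c, p, n):
    """log #{0 <= k, l < p^n : c(k) = c(l) mod p^n} / (n log p) for one fixed n."""
    m = p ** n
    counts = Counter(c(k) % m for k in range(m))
    # A residue hit v times contributes v*v ordered pairs (k, l).
    pairs = sum(v * v for v in counts.values())
    return math.log(pairs) / (n * math.log(p))


print(collision_ratio(lambda k: k, 2, 8))        # injective sequence: lower extreme 1
print(collision_ratio(lambda k: 0, 2, 8))        # constant sequence: upper extreme 2
print(collision_ratio(lambda k: 3 ** k, 2, 8))   # c_k = 3^k with p = 2
```

For $c_k = 3^k$ and $p = 2$, the order of 3 modulo $2^n$ is $2^{n-2}$ for $n \ge 3$, so the ratio equals $(n+2)/n$, which tends to 1.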


Meiri greatly generalized Host’s result by proving the following theorem.

Theorem 6 (Meiri [6], Proposition 2.2 [5]). Fix an integer $p > 1$ and a sequence $c_n$ with p-adic collision exponent $< 2$. Then there exists a constant $h_0 < \log p$ such that every p-invariant and ergodic measure µ with $h(\mu, \sigma_p) > h_0$ is $c_n$-normal. In particular, µ is $c_n$-generic.

The condition on the p-adic collision exponent does not seem too strict. In contrast, the positive entropy condition seems to play an important role in this theorem. Having large entropy means that the measure is not structured. (Is the positive entropy here analogous to the positive upper density in Szemerédi’s theorem?)

10.3 The Role of Entropy

The role of positive entropy seems to be an important one. The big idea from the paper [5] is to take a measure that does not have large enough entropy and somehow create a new measure with greater entropy, large enough to have the desired $c_n$-generic property. One then shows that the new measure having this property implies that the old one had it. This is the basic idea of the bootstrap.

The idea is that convolution will help us create new measures that have increased entropy. Theorem 1.1 of [5] suggests that convolution may be a helpful tool in this problem.

Theorem 7 (Theorem 1.1 [5]). Let $\mu_i$ be a sequence of p-invariant and ergodic measures on T whose normalized base-p entropies $h_i = h(\mu_i, \sigma_p)/\log p$ satisfy

\[ \sum_{i=1}^{\infty} \frac{h_i}{|\log h_i|} = \infty. \]

Then

\[ h(\mu_1 * \cdots * \mu_n, \sigma_p) \to \log p \]

monotonically as $n \to \infty$. In particular, $\mu_1 * \cdots * \mu_n \to \lambda$ weak* and in the $\bar{d}$ metric (with respect to the base-p partition).
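As a worked illustration of where the entropy condition draws the line, consider three hypothetical choices of normalized entropies (chosen here only as numerical examples; the index starts at $i = 2$ so that $\log h_i \ne 0$):

\[
h_i = \frac{1}{i}: \qquad \frac{h_i}{|\log h_i|} = \frac{1}{i \log i}, \qquad \sum_{i \ge 2} \frac{1}{i \log i} = \infty \quad \text{(integral test: } \int \frac{dx}{x \log x} = \log\log x \text{)},
\]

\[
h_i = \frac{1}{i \log i}: \qquad \frac{h_i}{|\log h_i|} = \frac{1}{i \log i \,(\log i + \log\log i)} \le \frac{1}{i (\log i)^2} \quad (i \ge 3), \qquad \text{so the sum converges},
\]

\[
h_i = \frac{1}{i^2}: \qquad \frac{h_i}{|\log h_i|} = \frac{1}{2 i^2 \log i}, \qquad \text{so the sum converges}.
\]

So entropies decaying like $1/i$ still satisfy the hypothesis, while a decay faster by one logarithmic factor already fails it.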

Convolution also has the following property.

Proposition 8 (Proposition 2.4 [5]). An invariant measure µ is $c_n$-generic if and only if $\mu * \mu$ is $c_n$-generic.


Remark 9. This proposition deals with $c_n$-generic measures, not $c_n$-normal ones. This is the reason why Theorem 1.4 of [5] is about $c_n$-genericity and not $c_n$-normality.

With all this in mind we state the Bootstrap Lemma:

Theorem 10 (The Bootstrap Lemma, Theorem 1.7 [5]). Suppose that C is a class of p-invariant measures on T with the following properties:

1. If µ is p-invariant and ergodic and $\mu * \mu \in C$, then $\mu \in C$.

2. If µ is p-invariant and almost every ergodic component of µ is in C, then $\mu \in C$.

3. There exists some constant $h_0 < \log p$ such that every p-invariant and ergodic measure µ with $h(\mu, \sigma_p) > h_0$ is in C.

Then C contains all p-invariant ergodic measures with positive entropy.

Now we can get back to proving the type of $c_n$-generic results that we are looking for:

Theorem 11 (Theorem 1.4 [5]). Let $c_n$ be a sequence with p-adic collision exponent $< 2$, for some $p > 1$. Then any p-invariant ergodic measure µ on T with positive entropy is $c_n$-generic.

Proof. The proof follows from the Bootstrap Lemma with C being the class of $c_n$-generic measures, where $c_n$ has p-adic collision exponent less than 2. We apply Proposition 8 and Theorem 6 to check the first and third conditions in the Bootstrap Lemma, respectively. All that remains is to verify that if µ is p-invariant and almost every ergodic component of µ is $c_n$-generic, then µ is $c_n$-generic.

In fact, with a modified version of the Bootstrap Lemma it is possible to give a strengthened version of this theorem. For $k \in \mathbb{Z} \setminus \{0\}$, define

\[ g_N^{(k)}(x) = \frac{1}{N} \sum_{n=0}^{N-1} e(k c_n x). \]

With these functions, being $c_n$-generic is equivalent to

\[ \int g_N^{(k)}(x)\, d\mu \to 0. \]

The following slightly stronger version of Theorem 11 is obtained as well:


Theorem 12 (Theorem 1.5 [5]). Under the same conditions as Theorem 11, µ is $c_n$-normal in probability. Equivalently,

\[ \int |g_N^{(k)}(x)|^2\, d\mu \to 0 \]

as $N \to \infty$ for $k \ne 0$.

This finishes the summary of the relation of the results in [5] to the conjecture of Furstenberg as well as to results of Lyons, Johnson, Rudolph and Host. In the talks I will give on this paper I will present some background on topics such as Hausdorff dimension, entropy, and their relationship with one another. I will also give in detail the proofs of Theorems 10 and 11. Time permitting, I will finish with some discussion of examples and open questions related to [5].

References

[1] H. Furstenberg, Disjointness in ergodic theory, minimal sets, and a problem in Diophantine approximation, Math. Systems Theory 1 (1967), 1-49.

[2] A. Johnson, Measures on the circle invariant under multiplication by a nonlacunary subsemigroup of the integers, Israel J. Math. 77 (1992), 211-240.

[3] B. Host, Nombres normaux, entropie, translations, Israel J. Math. 91 (1995), 419-428.

[4] A. Johnson and D. Rudolph, Convergence under ×p of ×q invariant measures on the circle, Adv. in Math. 115 (1995), 117-140.

[5] E. Lindenstrauss, D. Meiri and Y. Peres, Entropy of convolutions on the circle, Ann. of Math. 149 (1999), 871-904.

[6] D. Meiri, Entropy and uniform distribution of orbits in $\mathbb{T}^d$, Israel J. Math. 105 (1998), 155-183.

Robert C. Rhoades, University of Wisconsin

email: [email protected]


11 Entropy of convolutions on the circle

after Elon Lindenstrauss, David Meiri and Yuval Peres [3]

A summary written by Shuanglin Shao

Abstract

Given ergodic p-invariant measures $\mu_i$ on the torus $\mathbb{T} = \mathbb{R}/\mathbb{Z}$, we give a sharp condition on their entropies, guaranteeing that the entropy of the convolution $\mu_1 * \cdots * \mu_n$ converges to $\log p$. We also prove a variant of this result for joinings of full entropy on $\mathbb{T}^{\mathbb{N}}$. We also obtain the following corollary concerning the Hausdorff dimension of sum sets: for any sequence $S_i$ of p-invariant closed subsets of T, if $\sum \dim_H(S_i)/|\log \dim_H(S_i)| = \infty$, then $\dim_H(S_1 + \cdots + S_n) \to 1$.

11.1 Introduction

Let $p \ge 2$ be any integer (p need not be prime), and $\mathbb{T} = \mathbb{R}/\mathbb{Z}$ the 1-torus. Denote by $\sigma_p$ the p-to-one map $x \mapsto px \pmod 1$. The pair $(\mathbb{T}, \sigma_p)$ is a dynamical system that has additional structure: T is a commutative group (with the group operation being addition mod 1), and $\sigma_p$ is an endomorphism of it. Inspired by the fundamental work of Furstenberg [2], the present work continues the study of the intersection between the dynamics and the algebraic structure of T. Say that a measure µ on T is p-invariant if $\sigma_p \mu = \mu$, where for every set $A \subset \mathbb{T}$

\[ (\sigma_p \mu)(A) = \mu(\sigma_p^{-1} A). \]

(All measures we consider are Borel probability measures.) Lebesgue measure on T, denoted by λ, has entropy $\log p$ with respect to $\sigma_p$, and is the unique p-invariant measure of maximal entropy. Given two p-invariant measures µ and ν, the group structure of T naturally yields another p-invariant measure, the convolution $\mu * \nu$. Our main results, Theorem 1 and Theorem 4, concern the entropy growth for convolutions of p-invariant measures and their ergodic components. These results have applications to the Hausdorff dimension of sum sets and to genericity of the orbits of measures with positive entropy under multiplication by certain integer sequences.


Theorem 1 (The Convolution theorem). Let $\mu_i$ be a sequence of p-invariant and ergodic measures on T whose normalized base-p entropies $h_i = h(\mu_i, \sigma_p)/\log p$ satisfy

\[ \sum_{i=1}^{\infty} \frac{h_i}{|\log h_i|} = \infty. \tag{1} \]

Then

\[ h(\mu_1 * \cdots * \mu_n, \sigma_p) \to \log p, \quad \text{monotonically, as } n \to \infty. \]

In particular, $\mu_1 * \cdots * \mu_n \to \lambda$ weak*.

It is relatively easy to see that under the hypotheses of the theorem, $\mu_1 * \cdots * \mu_n \to \lambda$ weak*. This means that

\[ \int f(x)\, d(\mu_1 * \cdots * \mu_n) \to \int f(x)\, d\lambda \]

for all continuous f, and gives very little information on the dynamics of $\mu_1 * \cdots * \mu_n$. The entropy condition in Theorem 1 is sharp: if $h_i$ is a sequence of numbers in the range (0, 1) with $\sum h_i/|\log h_i| < \infty$, then there exists a sequence of p-invariant ergodic measures $\mu_i$ such that $h_i = h(\mu_i, \sigma_p)/\log p$, yet $\mu_1 * \cdots * \mu_n$ does not converge to Lebesgue measure λ even in the weak* topology. The convolution theorem has implications for the Hausdorff dimension of sum sets:

Corollary 2. Let $S_i$ be a sequence of p-invariant closed subsets of T, and suppose that

\[ \sum_{i=1}^{\infty} \frac{\dim_H(S_i)}{|\log \dim_H(S_i)|} = \infty. \]

Then $\dim_H(S_1 + \cdots + S_n) \to 1$.

Is the dimension condition of Corollary 2 sharp as well? Specifically, given a sequence of numbers $0 < d_i < 1$ such that $\sum d_i/|\log d_i| < \infty$, can one always find p-invariant closed subsets $S_i \subset \mathbb{T}$ with $\dim_H S_i = d_i$ and $\lim \dim_H(S_1 + \cdots + S_n) < 1$? Currently we can construct sets satisfying the desired conclusion, but only when $\sum d_i/|\log d_i|$ is small enough.

Theorem 3. Let $\{\mu_i\}_{i=1}^{\infty}$ be a sequence of p-invariant and ergodic measures on T such that $\inf_i h(\mu_i, \sigma_p) > 0$. Suppose that µ is a joining of full entropy of $\{\mu_i\}$. Define $\Theta_n : \mathbb{T}^{\mathbb{N}} \to \mathbb{T}$ by $\Theta_n(x) = x_1 + \cdots + x_n \pmod 1$. Then

\[ h(\Theta_n \mu, \sigma_p) \to \log p, \quad \text{monotonically, as } n \to \infty. \]

Theorem 3 is not valid under the weaker entropy assumptions of Theorem 1. Indeed, it is possible to find a joining of full entropy µ with entropies satisfying (1), such that $\Theta_n \mu$ does not even converge weak* to λ.

Theorem 4. Let $\{\mu_i\}_{i=1}^{\infty}$ be a sequence of p-invariant and ergodic measures on T such that $\inf_i h(\mu_i, \sigma_p) > 0$. Let $S_i$ be a sequence of Borel sets of T, and suppose that $\mu_i(S_i) > 0$ for all $i \ge 1$. Then $\dim_H(S_1 + \cdots + S_n) \to 1$.

This summary is organized as follows; we list the main theorems in each section for convenience. In Section 11.2, we use the connection between entropy of measures and Hausdorff dimension to derive Corollary 2 and Theorem 4 from the results about convolutions of measures. In Sections 11.3-11.5, we prove our main results, Theorem 1 and Theorem 4. Section 11.3 contains results about finite cyclic groups which are crucial to the proof of the convolution theorems. Lemma 7 and Lemma 8 study convolutions of measures on a finite cyclic group and contain one key idea in the proof, namely that the convolution of a sequence of measures on a finite cyclic group of order N (we shall use $N = p^k$) tends to be invariant under a subgroup that will typically be rather large (in the cases we will be interested in, this subgroup will be of order approximately $p^{\alpha k}$ for some $0 < \alpha < 1$). Lemma 9 shows that if a measure on $\mathbb{Z}/p^k\mathbb{Z}$ is almost invariant under a subgroup of size $p^{\alpha k}$, the distribution of the $\alpha k$ high-order digits is nearly uniform. In Section 11.4 we begin to show how convolutions of measures on $\mathbb{Z}/p^k\mathbb{Z}$ relate to convolutions of measures on T, where we get measures on $\mathbb{Z}/p^k\mathbb{Z}$ from measures on T by considering the conditional distribution of the first k digits in the base-p expansion of $x \in \mathbb{T}$, given the remaining digits. In Section 11.5 we continue this approach and prove Theorem 1. The basic argument is that if the entropy of $\mu_1 * \cdots * \mu_n$ is almost

\[ \sup_{N \in \mathbb{N}} h(\mu_1 * \cdots * \mu_N, \sigma_p), \]

then for any $k \ge 1$ the distribution of the first k digits of x given the rest of the digits (chosen according to $\mu_1 * \cdots * \mu_n$) must be nearly invariant under a subgroup $G \subset \mathbb{Z}/p^k\mathbb{Z}$ of size $p^{\alpha k}$; for if it were not, then the entropy of $\mu_1 * \cdots * \mu_n$ could be significantly increased by further convolutions. Using Lemma 9, this implies that the first $\alpha k$ digits of x are distributed nearly uniformly. Since k is arbitrary, it follows that $h(\mu_1 * \cdots * \mu_n, \sigma_p) \approx \log p$.


11.2 Dimension of sum sets

For any measure µ, let

\[ \dim \mu = \inf\{\dim_H S \mid S \text{ is a Borel set with } \mu(S) = 1\}. \tag{2} \]

Define the lower dimension of µ by

\[ \underline{\dim}\, \mu = \inf\{\dim_H S \mid S \text{ is a Borel set with } \mu(S) > 0\}. \tag{3} \]

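Lemma 5 (Billingsley). Let µ be a positive finite measure on T. Assume $K \subset \mathbb{T}$ is a Borel set satisfying $\mu(K) > 0$ and

\[ K \subset \left\{ x \in \mathbb{T} : \liminf_{\epsilon \to 0} \frac{\log \mu[B_\epsilon(x)]}{\log \epsilon} \le \gamma \right\}. \]

Then $\dim_H K \le \gamma$. If the lim inf is γ a.e., then $\dim_H K = \gamma$.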

In [4], Meiri and Peres proved that $\dim \mu = \operatorname{ess\,sup}_\theta \dim \mu_\theta$ in a more general context. We have the analogous statement for the lower dimension.

Theorem 6. Let µ be a p-invariant measure on T, and denote its ergodic decomposition by $\mu = \int \nu_\theta\, d\theta$. Then $\underline{\dim}\, \mu = \operatorname{ess\,inf}_\theta \dim \nu_\theta$.

11.3 Uniform distribution on subgroups

The following lemmas are the key to proving Theorem 1.

Lemma 7. Let $X_n$ be an infinite sequence of independent random variables with values in $\mathbb{Z}_N = \mathbb{Z}/N\mathbb{Z}$, for some fixed integer $N > 1$. Suppose that for some non-zero $g \in \mathbb{Z}_N$,

\[ \sum_{j=1}^{\infty} \sum_{x=0}^{N-1} \min\{P(X_j = x), P(X_j = x + g)\} = \infty. \tag{4} \]

Let $S_n = X_1 + \cdots + X_n \pmod N$. Then for any $x \in \mathbb{Z}_N$,

\[ \lim_{n \to \infty} \bigl(P(S_n = x + g) - P(S_n = x)\bigr) = 0. \tag{5} \]


Lemma 8. Let $X_n$ be an infinite sequence of independent random variables with values in $\mathbb{Z}_N = \mathbb{Z}/N\mathbb{Z}$, for some fixed integer $N > 1$. Suppose that there exists a subgroup $G \subset \mathbb{Z}_N$, generated by $g_1, \dots, g_r$, such that (4) holds for $g = g_1, \dots, g_r$. Let $S_n = X_1 + \cdots + X_n \pmod N$, and let $S_n \bmod G$ denote the projection of $S_n$ to $\mathbb{Z}_N/G$. Then

\[ E\,H(S_n \mid S_n \bmod G) \to \log |G|. \]
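Lemmas 7 and 8 can be watched in action on a toy case. The sketch below is an illustration under assumed parameters, not part of the source: N = 8, g = 2, and every $X_j$ equal to 0 or g with probability 1/2, so that each j contributes 1/2 to the sum in (4). Here $S_n$ always lies in the subgroup G = {0, 2, 4, 6}, hence $S_n \bmod G$ is constant and $H(S_n \mid S_n \bmod G) = H(S_n)$.

```python
import math


def convolve_mod(p, q):
    """Law of X + Y (mod N) for independent X ~ p and Y ~ q on Z_N."""
    N = len(p)
    out = [0.0] * N
    for x, px in enumerate(p):
        if px == 0.0:
            continue
        for y, qy in enumerate(q):
            out[(x + y) % N] += px * qy
    return out


def entropy(dist):
    """Shannon entropy (natural logarithm) of a probability vector."""
    return -sum(v * math.log(v) for v in dist if v > 0.0)


N, g, steps = 8, 2, 60
step = [0.0] * N
step[0] = step[g] = 0.5              # law of each X_j (an assumed example)
law = [1.0] + [0.0] * (N - 1)        # law of S_0 = 0
for _ in range(steps):
    law = convolve_mod(law, step)

# Largest violation of invariance under translation by g, as in (5).
max_gap = max(abs(law[(x + g) % N] - law[x]) for x in range(N))
print(max_gap, entropy(law), math.log(4))
```

After 60 steps the law of $S_n$ is numerically indistinguishable from the uniform law on G, and its entropy is close to $\log |G| = \log 4$.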

Lemma 9. Let Y be a $\mathbb{Z}_{p^k}$-valued random variable. Let G be a subgroup of $\mathbb{Z}_{p^k}$, and suppose that n satisfies $p^n \ge |G|$. Then

\[ H(\pi_n(Y)) \ge H(Y \mid Y \bmod G) = H(Y) - H(Y \bmod G). \]

11.4 Entropy and subgroups

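Proposition 10. Let µ, ν be two p-invariant measures, and G a subgroup of $\mathbb{Z}/p^k\mathbb{Z}$ for some $k \in \mathbb{N}$. Then

\[ E\,H_{\mu * \nu}(x_{1 \dots k} \bmod G \mid x_{k+1 \dots \infty}) \ge E\,H_{\mu}(x_{1 \dots k} \bmod G \mid x_{k+1 \dots \infty}). \]

Corollary 11. Let $\mu_i$ be a sequence of p-invariant measures, and denote $\mu = \prod \mu_i$. Suppose that G is a subgroup of $\mathbb{Z}/p^k\mathbb{Z}$ for some $k \in \mathbb{N}$. Then

\[ E\,H_{\mu}(\theta^n_{1 \dots k} \bmod G \mid \theta^n_{k+1 \dots \infty}) \]

is monotone nondecreasing in n.

Lemma 12. Let µ be a measure on T, and suppose that $G \subset \mathbb{Z}/p^k\mathbb{Z}$ is a group of size $\ge p^l$. Then

\[ H(\alpha_{1 \dots l}) \ge (l - 1) \log p - \log |G| + \int H(\alpha_{1 \dots k} \mid \alpha_{1 \dots k} \bmod G \vee \alpha_{k+1 \dots \infty})\, d\mu. \]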

11.5 The convolution theorem

Lemma 13. Let $G \subset \mathbb{Z}_{p^k}$, and suppose that µ and ν are non-atomic measures on T. Then

\[ E\,I_k(\mu * \nu) - E\,I_k(\mu) = E\,I_G(\mu * \nu) - E\,I_G(\mu). \]


Lemma 14. Let $\mu_n$ be a sequence of probability measures on T, and form the product measure $\mu = \prod \mu_i$. Suppose that for some fixed number k,

\[ \sum_{i=1}^{\infty} E\,\psi^{-1}\bigl(H_{\mu_i}(x^i_k \mid t^i_{k+1 \dots \infty})\bigr) = \infty. \tag{6} \]

Then for µ-almost every $t \in \mathbb{T}^{\mathbb{N}}$ there exists a group $G_k(t) \subset \mathbb{Z}_{p^k}$ such that

\[ E\,I_{G_k(t)}(\mu_1 * \cdots * \mu_n) \to E \log |G_k(t)|. \tag{7} \]

Furthermore, $|G_k(t)| \ge p_*^k$ a.e., where $p_*$ is the smallest prime factor of p.

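Lemma 15. Under the assumptions of Lemma 14, define

\[ h_k = \sup_n \frac{1}{k} E\,H_{\mu_1 * \cdots * \mu_n}(x_{1 \dots k} \mid x_{k+1 \dots \infty}). \tag{8} \]

For any m, if $\frac{1}{k} E\,H_{\mu_1 * \cdots * \mu_m}(x_{1 \dots k} \mid x_{k+1 \dots \infty}) > h_k - \epsilon$, then

\[ H_{\mu_1 * \cdots * \mu_m}(t_{1 \dots l}) \ge (l - 1) \log p - (k + 1)\epsilon, \tag{9} \]

where $l = l(k) = \left\lfloor k \frac{\log p_*}{\log p} \right\rfloor$.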

References

[1] P. Billingsley, Ergodic Theory and Information, Wiley, New York (1965).

[2] H. Furstenberg, Disjointness in ergodic theory, minimal sets, and a problem in Diophantine approximation, Math. Systems Theory 1 (1967), 1-49.

[3] E. Lindenstrauss, D. Meiri and Y. Peres, Entropy of convolutions on the circle, Ann. of Math. 149 (1999), 871-904.

[4] D. Meiri and Y. Peres, Bi-invariant sets and measures have integer Hausdorff dimension, Ergodic Theory and Dynamical Systems 19 (1999), no. 2, 523-534.

Shuanglin Shao, UCLA

email: slshao@math.ucla.edu


12 Bourgain’s Entropy Estimates

after J. Bourgain [2]

A summary written by John T. Workman

Abstract

It is shown that if certain sequences of operators $(S_n)$ converge almost surely on some $L^p$, then there is a uniform estimate on the $L^2$-entropy of $\{S_n f : n \in \mathbb{N}\}$ for all f in the $L^2$-unit ball. This shows that there are bounded functions for which Bellow’s averages do not converge almost surely.

12.1 Preliminaries

Let $(X, \mu)$ be a probability space. Let $(T_j)_{j \in \mathbb{N}}$ be a sequence of linear operators from the X-measurable functions to the X-measurable functions satisfying: $T_j : L^1(\mu) \to L^1(\mu)$ are bounded; $T_j : L^2(\mu) \to L^2(\mu)$ are isometries; $T_j$ are positive, i.e., if $f \ge 0$ a.s.[µ] then $T_j(f) \ge 0$ a.s.[µ]; and $T_j(1) = 1$.

Also, assume the $T_j$ satisfy a mean ergodic condition. Namely, for every $1 \le p < \infty$,

\[ \frac{1}{J} \sum_{j=1}^{J} T_j f \to \int_X f(x)\, \mu(dx) \quad \text{in } L^p(\mu)\text{-norm, for all } f \in L^p(\mu). \]

Let $(S_n)_{n \in \mathbb{N}}$ be another sequence of linear operators satisfying $S_n : L^2(\mu) \to L^2(\mu)$ with $\|S_n(f)\|_{L^2(\mu)} \le \|f\|_{L^2(\mu)}$, and $T_j S_n = S_n T_j$ for all j, n. With these hypotheses, we obtain the following results of Bellow and Jones [1].

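Lemma 1. Suppose that for some $1 \le p < \infty$, $S_n f$ converges a.s.[µ] for all $f \in L^p(\mu)$. Then there is a finite-valued function $C(\epsilon)$, for $\epsilon > 0$, so that

\[ \mu\left\{ \sup_n |S_n f| \le C(\epsilon) \right\} > 1 - \epsilon \]

for every $f \in L^\infty(\mu)$, $\|f\|_{L^p(\mu)} \le 1$.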

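Lemma 2. Suppose $S_n f$ converges a.s.[µ] for all $f \in L^\infty(\mu)$. Then there is a finite-valued function $\delta(\epsilon)$, for $\epsilon > 0$, so that $\delta(\epsilon) \to 0$ as $\epsilon \to 0$ and

\[ \int_X \sup_n |S_n f|\, d\mu < \delta(\epsilon) \]

whenever $\|f\|_{L^\infty(\mu)} \le 1$ and $\|f\|_{L^1(\mu)} < \epsilon$.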


The theory of normal random variables and Gaussian processes plays a pivotal role in the proofs of both entropy results. Let $(\Omega, P)$ be another probability space. Recall that a random variable g on Ω is normal (or Gaussian) with mean m and variance $\sigma^2$ if its distribution has the density

\[ f(\omega) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(\omega - m)^2}{2\sigma^2} \right). \]

A normal random variable with mean 0 and variance 1 is called a standard normal random variable.

Let T be a countable indexing set. A stochastic process $(G_t : t \in T)$ is called a Gaussian process if all finite linear combinations $\sum_t a_t G_t$ are normal random variables, and each $G_t$ has mean 0. We define a pseudo-metric on T by $d_G(s, t) = \|G_s - G_t\|_{L^2(P)}$. Denote by $N(T, d_G, \delta)$ the entropy number of T, that is, the minimal number of δ-balls, under $d_G$, needed to cover T. The following fundamental result is Sudakov’s inequality.

Theorem 3. There is a constant $R > 0$ such that if $(G_t : t \in T)$ is a Gaussian process, then

\[ \sup_{\delta > 0} \delta \sqrt{\log N(T, d_G, \delta)} \le R \left\| \sup_{t \in T} |G_t| \right\|_{L^1(P)}. \]

12.2 The First Entropy Result

For $\delta > 0$ and $f \in L^2(\mu)$, let $N_f(\delta)$ be the δ-entropy number of the set $\{S_n f : n \in \mathbb{N}\}$ in $L^2(\mu)$, i.e., the minimum number of δ-balls in $L^2(\mu)$ needed to cover $\{S_n f : n \in \mathbb{N}\}$. The following theorem is the first of Bourgain’s entropy results.

Theorem 4. Suppose that for some $1 \le p < \infty$, $S_n f$ converges a.s.[µ] for all $f \in L^p(\mu)$. Then there exists $C > 0$ such that $\delta (\log N_f(\delta))^{1/2} \le C$ for all $\delta > 0$ and $\|f\|_{L^2(\mu)} \le 1$.

We give a brief idea of the proof. Let $(g_j)_{j \in \mathbb{N}}$ be a sequence of independent standard normal random variables on $(\Omega, P)$. As $(X, \mu)$ is a probability space, $L^r(\mu) \supset L^q(\mu)$ when $r < q$. So, assume without loss of generality that $p \ge 2$. Fix $f \in L^2(\mu)$, $\|f\|_{L^2(\mu)} \le 1$. As $L^\infty(\mu)$ is dense in $L^2(\mu)$, we may also assume $f \in L^\infty(\mu)$.


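For $M \in \mathbb{N}$, denote $\mathbf{M} = \{1, 2, \dots, M\}$, and define a pseudo-metric on $\mathbb{N}$ by $d_f(n, n') = \|S_n f - S_{n'} f\|_{L^2(\mu)}$. Note that $\sup_M N(\mathbf{M}, d_f, \delta) = N_f(\delta)$. Fix M.

As $J^{-1} \sum T_j(f^2) \to \|f^2\|_{L^1(\mu)} = \|f\|^2_{L^2(\mu)} \le 1$ in $L^{p/2}(\mu)$-norm (because $p \ge 2$), choose J so large that

\[ \left\| \frac{1}{J} \sum T_j(f^2) \right\|_{L^{p/2}(\mu)} \le 2. \tag{1} \]

For each pair $n, n' \le M$, $J^{-1} \sum T_j((S_n f - S_{n'} f)^2) \to \|S_n f - S_{n'} f\|^2_{L^2(\mu)} = d_f(n, n')^2$ in $L^1(\mu)$-norm, and thus in probability. So, also choose J big enough that

\[ X' = \left\{ x \in X : \left( \frac{1}{J} \sum T_j\bigl((S_n f - S_{n'} f)^2\bigr)(x) \right)^{1/2} \ge \frac{1}{2} d_f(n, n') \text{ for all } n, n' \in \mathbf{M} \right\} \]

has large probability. Define $F(x, \omega) = J^{-1/2} \sum_{j=1}^{J} g_j(\omega) T_j f(x)$. Let $F^*(x, \omega) = \sup_{n \le M} |S_n F(x, \omega)|$. From Lemma 1, (1), and some general probability results, there is a “universal” constant $R'$ (independent of f, M, and J) such that

\[ X'' = \left\{ x \in X : \int_\Omega F^*(x, \omega)\, P(d\omega) \le R' \right\} \]

has large enough µ-probability, so that $\mu(X' \cap X'') > 0$. Fix $x \in X' \cap X''$. Define a Gaussian process by $G_n(\omega) = S_n F(x, \omega) = J^{-1/2} \sum g_j(\omega) T_j S_n f(x)$. By Sudakov’s inequality and as $x \in X''$,

\[ \sup_{\delta > 0} \delta (\log N(\mathbf{M}, d_G, \delta))^{1/2} \le R \int_\Omega \sup_{n \in \mathbf{M}} |G_n(\omega)|\, P(d\omega) = R \int_\Omega F^*(x, \omega)\, P(d\omega) \le R R'. \]

On the other hand, as $x \in X'$,

\[ d_G(n, n') = \left( \frac{1}{J} \sum T_j(S_n f - S_{n'} f)^2(x) \right)^{1/2} \ge \frac{1}{2} d_f(n, n'), \]

where the equality follows from a simple result on normal random variables and the properties of $(T_j)$. Thus $N(\mathbf{M}, d_f, \delta) \le N(\mathbf{M}, d_G, \delta/2)$ for all δ, so that $\delta (\log N(\mathbf{M}, d_f, \delta))^{1/2} \le 2RR' =: C$. Taking the supremum over M, $\delta (\log N_f(\delta))^{1/2} \le C$ for all $\delta > 0$.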
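The “simple result on normal random variables” invoked for the formula for $d_G$ can be spelled out as follows. This is a sketch under two assumptions that the summary does not state explicitly: f is real-valued, and each $T_j$ is multiplicative, $T_j(h^2) = (T_j h)^2$, as is the case when $T_j h = h \circ \tau_j$ for a measure-preserving map $\tau_j$. For fixed x, write $a_j = T_j(S_n f - S_{n'} f)(x)$. By linearity and $T_j S_n = S_n T_j$,

\[ G_n(\omega) - G_{n'}(\omega) = J^{-1/2} \sum_{j=1}^{J} g_j(\omega)\, a_j . \]

Since the $g_j$ are independent standard normal variables, $E[g_i g_j] = \delta_{ij}$, and therefore

\[ d_G(n, n')^2 = E\left| J^{-1/2} \sum_{j=1}^{J} g_j a_j \right|^2 = \frac{1}{J} \sum_{j=1}^{J} a_j^2 = \frac{1}{J} \sum_{j=1}^{J} T_j\bigl((S_n f - S_{n'} f)^2\bigr)(x), \]

where the last step uses the assumed multiplicativity of $T_j$.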


12.3 The Second Entropy Result

Theorem 5. Suppose $S_n f$ converges a.s.[µ] for all $f \in L^\infty(\mu)$. Then there exists a finite-valued function $C(\delta)$, for $\delta > 0$, such that $N_f(\delta) \le C(\delta)$ for all $\delta > 0$ and $\|f\|_{L^2(\mu)} \le 1$.

The proof is by contradiction. Suppose there is some $\delta > 0$ such that $N_f(\delta)$ is unbounded over the $L^2(\mu)$ unit ball. Fix $K \in \mathbb{N}$, $K > 1$. Then there is some $f \in L^\infty$, $\|f\|_{L^2(\mu)} \le 1$, such that $N_f(\delta) > K$. In particular, there is a subset $I \subset \mathbb{N}$ with $|I| = K$ (cardinality) and $\|S_n f - S_{n'} f\|_{L^2(\mu)} > \delta$ for all $n \ne n' \in I$.

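As in the previous proof, we must choose an appropriately large J, but we suppress those details. Define $F(x, \omega) = J^{-1/2} \sum_{j=1}^{J} g_j(\omega) T_j f(x)$. Write $F(x, \omega) = \varphi(x, \omega) + H(x, \omega)$, where

\[ \varphi(x, \omega) = F(x, \omega)\, \chi_{\{|F(x, \omega)| \le 6\sqrt{\log K}\}} \quad \text{and} \quad H(x, \omega) = F(x, \omega)\, \chi_{\{|F(x, \omega)| > 6\sqrt{\log K}\}}. \]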

Much like the first proof, here we must show that three sets (in Ω) have sufficiently large probability, so that there is some ω in all three sets. This is done by taking J large enough initially, an application of Sudakov’s inequality, and more results in probability theory. This gives $\omega \in \Omega$ such that

\[ \int_X \sup_{n \in I} |S_n H(x, \omega)|\, \mu(dx) \le c_1, \tag{2} \]

\[ \int_X \sup_{n \in I} |S_n F(x, \omega)|\, \mu(dx) \ge c_2 (\log K)^{1/2}, \tag{3} \]

\[ \int_X |\varphi(x, \omega)|\, \mu(dx) \le c_3, \tag{4} \]

for universal constants $c_1, c_2, c_3$. By (2) and (3) we see

\[ \int_X \sup_{n \in I} |S_n \varphi(x, \omega)|\, \mu(dx) \ge c_2 (\log K)^{1/2} - c_1. \tag{5} \]

Define $\psi_K(x) = \frac{1}{6} (\log K)^{-1/2} \varphi(x, \omega)$. Simply by construction, we have $|\psi_K| \le 1$. By (5), $\int_X \sup_{n \in I} |S_n \psi_K(x)|\, \mu(dx) \ge c_2/6 - c_1 (\log K)^{-1/2}/6$. Finally, by (4), we see $\|\psi_K\|_{L^1(\mu)} \le c_3 (\log K)^{-1/2}/6$. As $K \in \mathbb{N}$, $K > 1$ was arbitrary, we can create a sequence $\psi_K$ satisfying these three conditions. However, $c_3 (\log K)^{-1/2}/6 \to 0$, while $c_2/6 - c_1 (\log K)^{-1/2}/6 \ge c_2/12 > 0$ for large K. This contradicts Lemma 2, and completes the proof.


12.4 Application: Bellow’s Averages

Let $\mathbb{T} = \mathbb{R}/\mathbb{Z}$. Then a function $f : \mathbb{R} \to \mathbb{C}$ with period 1 can be viewed as a function on T. Let m be Lebesgue measure, and consider the probability space $(\mathbb{T}, m)$. Let $(a_j)$ be any sequence of non-zero real numbers which converges to 0. For functions $f : \mathbb{T} \to \mathbb{C}$, consider the operators

\[ S_n f(x) = \frac{1}{n} \sum_{j=1}^{n} f(x + a_j). \]

Bellow asked whether $S_n f$ converges to f a.s.[m] for all $f \in L^1(m)$. The answer to this turns out to be no. In fact, it is not even true for all $f \in L^\infty(m)$. We prove this using the second entropy criterion. First, we need the following result.

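Lemma 6. Let $(a_j)$ be a sequence of non-zero real numbers converging to 0. Then, given any $r \in \mathbb{N}$, there exist integers $J_1 < J_2 < \dots < J_r$ satisfying the following: if $\alpha = (\alpha_1, \dots, \alpha_r)$ is a vector of 0’s and 1’s, then there is an integer $n(\alpha)$ such that for all $1 \le s \le r$

\[ \left| 1 - J_s^{-1} \sum_{j \le J_s} e^{2\pi i a_j n(\alpha)} \right| < \frac{1}{10} \quad \text{if } \alpha_s = 0, \]

\[ \left| 1 - J_s^{-1} \sum_{j \le J_s} e^{2\pi i a_j n(\alpha)} \right| > \frac{1}{2} \quad \text{if } \alpha_s = 1. \]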

The proof of this lemma is somewhat technical, but it is straightforward and relies only on some basic complex arithmetic. We now proceed to the solution of Bellow’s question.

Theorem 7. Let $(a_j)$ be any sequence of real numbers which converges to 0, with $a_j \ne 0$ for all j. Then there exists $f \in L^\infty(m)$ such that $S_n f$ does not converge a.s.[m].

Proof. For $b_j = (j - 1)w$, where w is irrational, it follows from Birkhoff’s ergodic theorem that the operators $T_j f(x) = f(x + b_j)$ satisfy the mean ergodic condition. That these $S_n$ and $T_j$ satisfy the other conditions laid out in Section 12.1 is immediate. By Theorem 5, it suffices to show

\[ \sup\{N_f(\delta) : \|f\|_{L^2(m)} \le 1\} = \infty \]

for some $\delta > 0$. In fact, we will do this with $\delta = 1/10$. Let $r \in \mathbb{N}$. By Lemma 6, choose integers $J_1 < \dots < J_r$. Define a function f by

\[ f(x) = 2^{-r/2} \sum_{\alpha \in \{0,1\}^r} e^{2\pi i n(\alpha) x}. \]

Note that $\|f\|_2^2 = \int_0^1 f(x) \overline{f(x)}\, dx = 2^{-r} \sum_\alpha 1 = 1$ by orthogonality. Also,

\[ S_{J_s} f(x) = 2^{-r/2} \sum_{\alpha \in \{0,1\}^r} \beta_{s,\alpha}\, e^{2\pi i n(\alpha) x}, \quad \text{where} \quad \beta_{s,\alpha} = J_s^{-1} \sum_{j \le J_s} e^{2\pi i a_j n(\alpha)}. \]

Fix an α and suppose $\alpha_s = 1$ and $\alpha_t = 0$. By Lemma 6, we have

\[ |\beta_{s,\alpha} - \beta_{t,\alpha}| \ge |\beta_{s,\alpha} - 1| - |1 - \beta_{t,\alpha}| > \frac{1}{2} - \frac{1}{10} = \frac{2}{5}. \]

Hence, by orthogonality and the above, for $s \ne t$,

\[ \|S_{J_s} f - S_{J_t} f\|_{L^2(m)} = 2^{-r/2} \left( \sum_\alpha |\beta_{s,\alpha} - \beta_{t,\alpha}|^2 \right)^{1/2} \ge 2^{-r/2} \left( \sum_{\alpha : \alpha_s \ne \alpha_t} |\beta_{s,\alpha} - \beta_{t,\alpha}|^2 \right)^{1/2} > 1/5. \]

As no two $S_{J_s} f$ could be contained in the same 1/10-ball in $L^2$, we see $N_f(1/10) \ge r$. As r is arbitrary, $\sup N_f(1/10) = \infty$.
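The coefficients $\beta_{s,\alpha}$ in the proof are Fourier multipliers: $S_J$ sends $e^{2\pi i k x}$ to $\beta \cdot e^{2\pi i k x}$ with $\beta = J^{-1} \sum_{j \le J} e^{2\pi i a_j k}$. The following sketch checks this numerically for one assumed sequence, $a_j = 1/j$ (a choice made here for illustration; the theorem allows any non-zero sequence tending to 0). The helper names `multiplier` and `averaged` are hypothetical.

```python
import cmath


def multiplier(a, J, k):
    """beta = (1/J) * sum_{j=1}^{J} e^{2 pi i a(j) k}, the multiplier of S_J at frequency k."""
    return sum(cmath.exp(2j * cmath.pi * a(j) * k) for j in range(1, J + 1)) / J


def averaged(f, a, J, x):
    """S_J f(x) = (1/J) * sum_{j=1}^{J} f(x + a(j))."""
    return sum(f(x + a(j)) for j in range(1, J + 1)) / J


a = lambda j: 1.0 / j                                  # assumed example sequence
k, x = 5, 0.3
wave = lambda t: cmath.exp(2j * cmath.pi * k * t)      # the character e^{2 pi i k t}

# S_J acts on a character as multiplication by the multiplier.
print(abs(averaged(wave, a, 50, x) - multiplier(a, 50, k) * wave(x)))

# For a fixed frequency the multiplier tends to 1 as J grows, since a_j -> 0.
for J in (10, 100, 10000):
    print(J, abs(1 - multiplier(a, J, 1)))
```

For a fixed frequency the multiplier approaches 1, but Lemma 6 exploits the fact that, for each J, some frequency $n(\alpha)$ keeps it far from 1.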

References

[1] Bellow, A. and Jones, R. L., A Banach principle for $L^\infty$. Adv. Math. 120 (1996), 155-172.

[2] Bourgain, J., Almost sure convergence and bounded entropy. Israel J. Math. 63 (1988), no. 1, 79-97.

John Workman, Cornell University

email: [email protected]


13 Pointwise Ergodic Theorems for Arithmetic Sets, part 1

after J. Bourgain [1]

A summary written by Andrew Yingst

Abstract

We reduce a pointwise ergodic theorem along polynomial iterates to certain inequalities. We also prove a lemma to be used in part 2.

13.1 Introduction

Let $(\Omega, \mathcal{B}, \mu, T)$ be a dynamical system. (For us, this is a probability measure space with a measure-preserving automorphism.) We are interested in adaptations of Birkhoff’s individual ergodic theorem which replace the usual ergodic means with means along some polynomial:

\[ A_N f = \frac{1}{N} \sum_{n=1}^{N} f \circ T^{\lfloor p(n) \rfloor}, \]

where p is a polynomial with real or integer coefficients. We follow Bourgain’s arguments for the following two theorems. The first applies only to integer polynomials and includes a maximal inequality.

Theorem 1. Let $(\Omega, \mathcal{B}, \mu, T)$ be a dynamical system and let p be a polynomial with integer coefficients. Given $r > 1$, there is a constant C so that

\[ \left\| \sup_{N \ge 1} |A_N f| \right\|_r \le C \|f\|_r \tag{1} \]

holds for all $f \in L^r(\Omega, \mu)$. Furthermore, the sequence $A_N f$ converges pointwise almost everywhere. If T is weakly mixing and p is non-constant, then the limit is given by $\int f\, d\mu$.

In the second theorem, we generalize to all real polynomials.

Theorem 2. Let $(\Omega, \mathcal{B}, \mu, T)$ be a dynamical system and let p be a polynomial with real coefficients. If f is any bounded measurable function on Ω, then $A_N f$ converges pointwise almost everywhere.
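The averages $A_N f$ are easy to compute in a concrete system. The sketch below uses an assumed example, not one from the source: T is the rotation $x \mapsto x + \sqrt{2} \pmod 1$ on the circle, $p(n) = n^2$, and $f(x) = e^{2\pi i x}$, so that $A_N f(x) = e^{2\pi i x} \cdot N^{-1} \sum_{n \le N} e^{2\pi i \sqrt{2} n^2}$ is a normalized Weyl sum. The names `polynomial_average`, `rotate`, `wave` and `square` are illustrative.

```python
import cmath
import math


def polynomial_average(f, T_power, p, N, x):
    """A_N f(x) = (1/N) * sum_{n=1}^{N} f(T^{floor(p(n))} x)."""
    return sum(f(T_power(math.floor(p(n)), x)) for n in range(1, N + 1)) / N


ALPHA = math.sqrt(2)


def rotate(k, x):
    """k-th iterate of the rotation T x = x + ALPHA (mod 1)."""
    return (x + k * ALPHA) % 1.0


def wave(x):
    return cmath.exp(2j * cmath.pi * x)


def square(n):
    return n * n


for N in (100, 1000, 20000):
    print(N, abs(polynomial_average(wave, rotate, square, N, 0.1)))
```

The printed moduli decrease toward 0, which is the value of $\int f\, d\mu$ for this f; the rotation is not weakly mixing, so here the limit comes from Weyl equidistribution of $\sqrt{2}\, n^2$ rather than from the mixing clause of Theorem 1.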


13.2 Reduction to Inequalities.

The section following this one will argue the maximal inequalities to be used, and a certain lemma which follows from them. We begin by showing that Theorems 1 and 2 follow from these. Hence, for this subsection we assume that inequality (1) holds, as well as the following lemma.

Lemma 3. Let $(\Omega, \mathcal{B}, \mu, T)$ be the shift map on $\mathbb{Z}$ endowed with counting measure, and let p be a polynomial with real coefficients. Fix $\epsilon > 0$, and let $Z_\epsilon = \{\lfloor (1 + \epsilon)^n \rfloor : n \in \mathbb{N}\}$. Let $(N_j)$ be a sequence of positive integers such that $N_{j+1} > 2N_j$ for all $j > 1$. For $j \ge 1$ and f a function on $\mathbb{Z}$, let

\[ M_j f = \sup_{N_j \le N \le N_{j+1},\ N \in Z_\epsilon} |A_N f - A_{N_j} f|. \]

Then there is a function $g : \mathbb{N} \to \mathbb{R}^+$ so that $g(J)/J \to 0$ as $J \to \infty$, and so that for all bounded, measurable f on Ω, we have

\[ \sum_{j=1}^{J} \|M_j f\|_2^2 \le g(J) \|f\|_2^2. \]

We first show that Lemma 3 holds if $(\Omega, \mathcal{B}, \mu, T)$ is any dynamical system. (We do not state this identical lemma.)

Proof. Let f be a bounded measurable function on Ω, fix $J \ge 1$ and $x \in \Omega$, and let K be some large positive integer. Define a function φ on $\mathbb{Z}$ by $\varphi(k) = f(T^k(x))$ for $|k| \le K$, and 0 otherwise. Using the shift map on $\mathbb{Z}$, we have that

\[ A_N \varphi(k) = \frac{1}{N} \sum_{n=1}^{N} \varphi(\lfloor p(n) \rfloor + k) = A_N f(T^k(x)), \quad \text{if } |k + \lfloor p(i) \rfloor| \le K \text{ for all } i = 1, \dots, N. \]

From this it follows that $M_j \varphi(k) = M_j f(T^k(x))$ if $|k + \lfloor p(i) \rfloor| \le K$ for all $i = 1, \dots, N_{j+1}$. Let $M = \max\{|p(i)| : i = 1, \dots, N_{J+1}\}$. Assume that K is so large that $K > M$. From Lemma 3, we have

\[ \sum_{j=1}^{J} \sum_{n=-K+M}^{K-M} |M_j \varphi(n)|^2 \le \sum_{j=1}^{J} \|M_j \varphi\|_2^2 \le g(J) \|\varphi\|_2^2 = g(J) \left( \sum_{n=-K}^{K} |\varphi(n)|^2 \right). \]

For the values of n considered, φ and $M_j \varphi$ agree with an iterate of f:

\[ \sum_{j=1}^{J} \sum_{n=-K+M}^{K-M} |M_j f(T^n(x))|^2 \le g(J) \left( \sum_{n=-K}^{K} |f(T^n(x))|^2 \right). \]

The above holds for all x, and we may integrate both sides with respect to x. Since µ is T-invariant, the integrated terms no longer depend on n. Dividing by $(2K + 1 - 2M)$, we find:

\[ \sum_{j=1}^{J} \|M_j f\|_2^2 \le g(J)\, \frac{2K + 1}{2K + 1 - 2M}\, \|f\|_2^2. \]

This holds for all large K, so letting K go to infinity gives the desired result.

Next we show that the above lemma gives pointwise convergence of $A_N f$ when f is bounded.

Theorem 4. Let $(\Omega, \mathcal{B}, \mu, T)$ be a dynamical system, and let p be a polynomial with real coefficients. If f is a bounded measurable function on Ω, then $A_n f$ converges pointwise almost everywhere.

Proof. By way of contradiction, suppose $f$ is a bounded function for which the above fails. We may assume $|f| < 1$. Let $L > 0$ satisfy $L \le \mu\{x \in \Omega : A_n f(x) \text{ does not converge}\}$ and $L \le 1/2$. For $\epsilon > 0$ let $E_\epsilon$ denote the set of all points $x$ so that $\epsilon$ witnesses that the sequence $(A_n f(x))$ is not a Cauchy sequence. Let $\epsilon > 0$ be such that $\mu(E_\epsilon) > L/2$. Let $\epsilon' = L\epsilon/6$, and as above, let $Z_{\epsilon'} = \{\lfloor (1+\epsilon')^n \rfloor : n \in \mathbb{N}\}$. We now inductively define a sequence $(N_j)$ of positive integers with $N_{j+1} \ge 2N_j$.

Let $N_1 = 1$. Suppose that $N_j$ has been defined. For $x \in E = E_\epsilon$, there are arbitrarily large integers $m$ and $n$ so that $|A_m f(x) - A_n f(x)| > \epsilon$. Hence there are arbitrarily large $n$ so that $|A_n f(x) - A_{N_j} f(x)| > \epsilon/2$. For such an $n$, let $n'$ be the nearest element of $Z_{\epsilon'}$ to $n$. Using the definition of $Z_{\epsilon'}$ it can be verified that for large $n$, $|A_n f(x) - A_{n'} f(x)| \le \frac{|n - n'|}{\min\{n, n'\}} \le 2\epsilon'$. Hence,

\[ |A_{N_j} f(x) - A_{n'} f(x)| > \epsilon/2 - 2\epsilon' \ge \epsilon/3. \]

As before, let $M_j f = \sup_{N_j \le N \le N_{j+1},\, N \in Z_{\epsilon'}} |A_N f - A_{N_j} f|$. By the above argument, as $N_{j+1}$ becomes large, $\{x \in E : M_j f(x) > \epsilon/3\}$ increases to $E$. Therefore, we may choose $N_{j+1} \ge 2N_j$ so that $\mu\{x : M_j f(x) > \epsilon/3\} > L/2$.

Thus the sequence $(N_j)$ is defined, and we have $\|M_j f\|_2 \ge \sqrt{L/2}\,\epsilon/3 > \epsilon'$. But by Lemma 3, we have $\sum_{j=1}^{J} \|M_j f\|_2^2 \le g(J)\|f\|_2^2$. In our case, this gives $J\epsilon'^2 \le g(J)\|f\|_2^2$, contradicting that $g(J)/J \to 0$.
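The passage from an arbitrary index to the nearest element of the lacunary set of integer parts of powers of $1 + \epsilon$ can be sanity-checked numerically. The following is a minimal sketch, not part of the argument: the value `eps = 0.05`, the tested range of `n`, and the helper names `lacunary_set`, `nearest` and `relative_gap` are assumptions chosen for illustration.

```python
import bisect

def lacunary_set(eps, limit):
    """Distinct values floor((1 + eps)**n), n = 0, 1, 2, ..., until one exceeds `limit`."""
    values, n = [], 0
    while True:
        v = int((1 + eps) ** n)          # floor, since the power is positive
        if not values or v > values[-1]:
            values.append(v)
        if v > limit:
            return values
        n += 1

def nearest(values, n):
    """Element of the sorted list `values` closest to n."""
    i = bisect.bisect_left(values, n)
    candidates = values[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda z: abs(z - n))

def relative_gap(values, n):
    """|n - n'| / min(n, n') where n' is the nearest element of `values` to n."""
    m = nearest(values, n)
    return abs(n - m) / min(n, m)

eps = 0.05                                # hypothetical sample value
Z = lacunary_set(eps, 200_000)
worst = max(relative_gap(Z, n) for n in range(1_000, 100_001))
print(worst <= 2 * eps)
```

For this sample the relative distance stays below $2\epsilon$ on the whole tested range, which is the only property of the set that the proof uses.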

At this point, we have completed the proof of Theorem 2. To complete Theorem 1, we use the maximal inequality in a typical way:


Theorem 5. Let $(\Omega, \mathcal{B}, \mu, T)$ be a dynamical system, and let $p$ be a polynomial with integer coefficients. If $f \in L^r(\Omega, \mu)$, then $A_n f$ converges pointwise almost everywhere.

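Proof. Let $\delta, \epsilon > 0$. By the maximal inequality (1), we may find a bounded function $g$ with $\|f - g\|_r$ so small that $\|\sup_{n\ge 1} |A_n(f - g)|\|_r \le \delta^{1/r}\,\frac{\epsilon}{3}$. We must then have $\sup_{n\ge 1} |A_n f - A_n g| \le \epsilon/3$ except on a set of measure at most $\delta$. But $A_n g(x)$ converges: for almost every $x \in \Omega$, there is $N(x)$ such that $m, n > N(x)$ implies $|A_m g(x) - A_n g(x)| < \epsilon/3$. Hence, for all $x$ off a set of measure $\delta$, there is $N(x)$ so that $m, n > N(x)$ implies $|A_n f(x) - A_m f(x)| < \epsilon$. Taking $(\delta_n)$ to be a summable sequence, and $(\epsilon_n)$ converging to zero, we find that $A_n f(x)$ is a Cauchy sequence except on a set of measure at most $\sum \delta_n$. Since the same argument applies to every tail of the sequence $(\delta_n)$, the exceptional set has measure zero.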

13.3 A Lemma

In this subsection, we outline Bourgain's proof of an inequality for certain Fourier multipliers. Let $\mathcal{F}$ denote the Fourier transform from $\ell^2(\mathbb{Z})$ to $L^2[0, 1]$, or from $L^2(\mathbb{R})$ to $L^2(\mathbb{R})$. The goal of this section is to motivate the following:

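Lemma 6. Let $\lambda_1, \dots, \lambda_K$ be points in $I = [0, 1]$, with $d(\lambda_k, \lambda_{k'}) > 2 \cdot 2^{-s}$ for $k \ne k'$. For $j \ge 0$, let $R_j = \bigcup_{i=1}^{K} B(\lambda_i, 2^{-j})$. Then for any function $f$ on $\mathbb{Z}$, the following inequality holds:

\[ \Big\| \sup_{j\ge s} \big|\mathcal{F}^{-1}[\chi_{R_j}\mathcal{F}f]\big| \Big\|_{\ell^2(\mathbb{Z})} \le C(\log K)^2 \|f\|_{\ell^2(\mathbb{Z})}, \]

where $C$ is a constant independent of the choice of $f$ or $\lambda_i$.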

Note that the above supremum is over all those $j$ so that the intervals of $R_j$ are disjoint. Bourgain also generalizes this to take the supremum over all $j$, but in our applications, $f$ is supported on some disjoint $R_j$, so this version suffices.

The argument begins by adapting a form of Doob's Oscillation Lemma for martingales. We won't go through the argument, but do outline the adaptation. To avoid introducing terminology, we state only a special case of Doob's Oscillation Lemma. Given $f$ a measurable function on $[0, 1]$, let $f_n$ denote $\sum_{I \in D_n} \big(\frac{1}{|I|}\int_I f\big)\chi_I$, where $D_n$ is the collection of all dyadic intervals in $[0, 1]$ of length $2^{-n}$.


Lemma 7 (Doob's Oscillation Lemma). For $\lambda > 0$ and $x \in [0, 1]$, let $N_\lambda(x)$ denote the number of jumps of length $\lambda$ in the sequence $(f_n(x))$. Then for each $\lambda > 0$ and each measurable $f$, we have

\[ \|\lambda (N_\lambda)^{1/2}\|_2 \le c\|f\|_2. \]

In our adaptation, we replace the scalar-valued function $f$ with a function $f : \mathbb{R} \to \mathbb{R}^d$. We also replace the sequence $f_n(x)$ with the similar expression $(f * \varphi_t)(x)$, where $\varphi_t(y) = \frac{1}{t}\varphi(\frac{y}{t})$. (Convolution with a vector-valued function is defined coordinate-wise.) Finally, we replace our gap-counting function $N_\lambda(x)$ with $M_\lambda(x)$, defined to be the number of balls of radius $\lambda$ needed to cover the set $\{(f * \varphi_t)(x) : t > 0\}$, taken to be zero if this set has diameter less than $\lambda$.

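Lemma 8. Let $\varphi$ be a smooth real-valued function on $\mathbb{R}$ which vanishes at $\infty$, with $c_\varphi = \int |x\varphi'(x)|\,dx < \infty$. Then for every measurable $f : \mathbb{R} \to \mathbb{R}^d$ and every $s > 2$, we have

\[ \Big\| \sup_{\lambda > 0} \lambda (M_\lambda)^{1/s} \Big\|_2 \le c_\varphi (s - 2)^{-1} \|f\|_2. \]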

From this, we easily obtain the following lemma.

Lemma 9. With the notation above, there is a constant $c'_\varphi$ depending only on $\varphi$, so that for $K \ge 1$ we have

\[ \Big\| \int_0^\infty \min(K, M_\lambda(x))^{1/2}\,d\lambda \Big\|_2 \le c'_\varphi (\log K)^2 \|f\|_2. \]

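Proof. Let $F(x) = \frac{1}{2}\sup_{t>0} \|(f * \varphi_t)(x)\|_2$. (The norm here is the vector norm.) Applying the Hardy-Littlewood Maximal Inequality coordinatewise, we get $\|F\|_2 \le c_\varphi\|f\|_2$. Note that if $\lambda \ge F(x)$, then $M_\lambda(x) = 0$. Hence, we may write

\[ \int_0^\infty \min(K, M_\lambda(x))^{1/2}\,d\lambda \le \int_0^{F(x)K^{-1/s}} K^{1/2}\,d\lambda + \int_{F(x)K^{-1/s}}^{F(x)} K^{\frac{1}{2}-\frac{1}{s}} M_\lambda(x)^{\frac{1}{s}}\,d\lambda \]

\[ \le K^{\frac{1}{2}-\frac{1}{s}} F(x) + K^{\frac{1}{2}-\frac{1}{s}}\,\frac{1}{s}\log(K) \cdot \sup_{\lambda>0} \lambda M_\lambda(x)^{1/s}. \]

Now we are done, by applying the previous lemma and taking $s$ so that $\frac{1}{2} - \frac{1}{s} = \frac{1}{\log K}$. (This makes $K^{\frac{1}{2}-\frac{1}{s}} = e$ and $\frac{1}{s-2} = \frac{\log(K) - 2}{4}$.)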
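The choice of $s$ can be checked numerically. The sketch below reads the condition as $\frac{1}{2} - \frac{1}{s} = \frac{1}{\log K}$, the sign compatible with $s > 2$, which requires $\log K > 2$. The sample value `K = 1000` and the helper name `exponent_choice` are assumptions made for illustration.

```python
import math

def exponent_choice(K):
    """Return s > 2 with 1/2 - 1/s = 1/log(K); this needs log(K) > 2."""
    log_k = math.log(K)
    if log_k <= 2:
        raise ValueError("need log(K) > 2")
    return 2 * log_k / (log_k - 2)

K = 1000.0                           # hypothetical sample value
s = exponent_choice(K)
power = K ** (0.5 - 1 / s)           # compared with e below
inverse_gap = 1 / (s - 2)            # compared with (log K - 2) / 4 below

print(math.isclose(power, math.e))
print(math.isclose(inverse_gap, (math.log(K) - 2) / 4))
```

With this $s$, the factor $(s-2)^{-1}$ from Lemma 8 and the factor $\frac{1}{s}\log K$ from the integral are each of order $\log K$, which is where the power $(\log K)^2$ comes from.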


We are now in a position to outline the proof of Lemma 6.

Proof. We begin with two reductions which we do not argue fully. First, it is sufficient to prove the equivalent lemma for functions $f$ on $\mathbb{R}$. (We still leave $R_j$ a subset of $[0, 1]$; we do not make it periodic.) The idea comes from noting that when $f$ is a function on $\mathbb{Z}$ and $\rho > 0$ is small, the function $f_\rho = \rho^{-1}\sum_{n\in\mathbb{Z}} f(n)\chi_{[n,n+\rho]}$ on $\mathbb{R}$ has $\|f\|_{\ell^2(\mathbb{Z})} = \|f_\rho\|_{L^2(\mathbb{R})}$, and $\mathcal{F}f_\rho \approx \mathcal{F}f$ on $[0, 1]$.

Second, we'd like to replace $\chi_{R_j}$ by a smooth function. Let $\psi$ be some smooth function on $\mathbb{R}$ with $|\psi| \le 1$, $\psi = 1$ on $[-1/2, 1/2]$, and $\psi = 0$ off $(-1, 1)$. Let $g_j(\lambda) = \sum_{k=1}^{K} \psi(2^j(\lambda - \lambda_k))$. The argument that it is sufficient to prove the lemma with $g_j$ replacing $\chi_{R_j}$ comes from the observation that $\sum_{j\ge s} |g_j - \chi_{R_j}|$ is bounded.

We must now show the following inequality:

\[ \Big\| \sup_{j\ge s} \big|\mathcal{F}^{-1}[g_j\mathcal{F}f]\big| \Big\|_{L^2(\mathbb{R})} \le C(\log K)^2 \|f\|_{L^2(\mathbb{R})}. \]

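Let $\varphi = \mathcal{F}^{-1}\psi$ and again let $\varphi_t(x) = \frac{1}{t}\varphi(\frac{x}{t})$. Note that $\psi(2^j\lambda)\cdot\psi(2^{s-1}\lambda) = \psi(2^j\lambda)$ for $j \ge s$. Taking the inverse Fourier transform of both sides gives $\varphi_{2^j} * \varphi_{2^{s-1}} = \varphi_{2^j}$. Using this fact and the usual rules for Fourier transforms, we may write the left-hand side of the above as

\[ \Big\| \sup_{j\ge s} \Big| \sum_{k=1}^{K} e^{2\pi i\lambda_k x} \cdot (f_k * \varphi_{2^j})(x) \Big| \Big\|_2, \tag{2} \]

where $f_k = [e^{-2\pi i\lambda_k x} f(x)] * [\varphi_{2^{s-1}}(x)]$.

The Hardy-Littlewood Inequality gives that $\|\sup_{j\ge s} |f_k * \varphi_{2^j}|\|_2$ is bounded by $c(\varphi)\|f_k\|_2$. Using this we find an inequality of the form

\[ \Big\| \sup_{j\ge s} \Big| \sum_{k=1}^{K} e^{2\pi i\lambda_k x} \cdot (f_k * \varphi_{2^j})(x) \Big| \Big\|_2 \le A\Big(\sum_{k=1}^{K} \|f_k\|_2^2\Big)^{1/2}, \]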

where $A = C\sqrt{K}$. Let $B = B(\varphi, K)$ be the least value of $A$ for which the above inequality holds. From Parseval's formula, we know $\sum_{k=1}^{K} \|f_k\|_2^2 = \sum_{k=1}^{K} \|\mathcal{F}f_k\|_2^2 = \|\mathcal{F}f \cdot g_{s-1}\|_2^2$, which is easily checked to be less than $2\|\mathcal{F}f\|_2^2 = 2\|f\|_2^2$. Hence, we are done if we can show $B \le C(\log K)^2$.


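Because $\mathcal{F}f_k$ is supported on the interval $[-2\cdot 2^{-s}, 2\cdot 2^{-s}]$, we can estimate that $\|(1 - e^{2\pi i\lambda u})\mathcal{F}f_k(\lambda)\|_2 \le \frac{1}{2}\|\mathcal{F}f_k\|_2$ provided $|1 - e^{2\pi i\lambda u}| < \frac{1}{2}$ on $[-2\cdot 2^{-s}, 2\cdot 2^{-s}]$, which will occur for example when $|u| < 2^s/100$. Taking $\mathcal{F}^{-1}$ gives

\[ \|f_k - \sigma_u f_k\|_2 < \frac{1}{2}\|f_k\|_2 \quad \text{when } |u| < 2^s/100, \]

where $\sigma_u g(x) = g(x + u)$. Writing $f_k = (f_k - \sigma_u f_k) + \sigma_u f_k$ and applying the triangle inequality gives

\[ (2) \le \frac{B}{2}\Big(\sum_{k=1}^{K} \|f_k\|_2^2\Big)^{1/2} + \Big\| \sup_{j\ge s} \Big| \sum_{k=1}^{K} e^{2\pi i\lambda_k x} \cdot \sigma_u(f_k * \varphi_{2^j})(x) \Big| \Big\|_2. \]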

Integrating this inequality over $u \in [0, 2^s/100]$ and changing variables lets us replace the last term with

\[ 10\sqrt{2^{-s}}\, \Big\| \Big( \int_0^{2^s} \sup_{t\ge 0} \Big| \sum_{k=1}^{K} e^{2\pi i\lambda_k(x-u)} \cdot (f_k * \varphi_t)(x) \Big|^2 du \Big)^{1/2} \Big\|_2. \]

If we can show the above is bounded by $C(\log K)^2\big(\sum_{k=1}^{K} \|f_k\|_2^2\big)^{1/2}$, we'll know that $C(\log K)^2 + \frac{B}{2}$ is a suitable choice of $A$ in our desired inequality, and hence $C(\log K)^2 + \frac{B}{2} \ge B$, and so we'll be done.
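The smallness condition used in the translation step, $|1 - e^{2\pi i\lambda u}| < \frac{1}{2}$ for $|\lambda| \le 2\cdot 2^{-s}$ and $|u| \le 2^s/100$, can be confirmed on a grid. This is a small illustrative sketch; the grid size, the sample values of `s`, and the function names are assumptions, not part of the proof.

```python
import cmath
import math

def phase_defect(lam, u):
    """|1 - exp(2*pi*i*lam*u)| for a frequency lam and a shift u."""
    return abs(1 - cmath.exp(2j * math.pi * lam * u))

def worst_defect(s, steps=200):
    """Largest defect on a grid with |lam| <= 2 * 2**-s and |u| <= 2**s / 100."""
    lam_max = 2 * 2.0 ** (-s)
    u_max = 2.0 ** s / 100
    grid = [i / steps for i in range(-steps, steps + 1)]
    return max(phase_defect(a * lam_max, b * u_max) for a in grid for b in grid)

# Sample scales s; the bound does not depend on s because lam * u is scale free.
defects = {s: worst_defect(s) for s in (1, 5, 10)}
print(all(d < 0.5 for d in defects.values()))
```

Since $|\lambda u| \le 0.02$ on this range, the worst value is $2\sin(0.02\pi)$, comfortably below $\frac{1}{2}$, so the constant $100$ leaves plenty of room.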

Let $F$ map into $\mathbb{R}^K$ by $F(x) = (f_1(x), \dots, f_K(x))$, and consider the set $A_x = \{(F * \varphi_t)(x) : t > 0\} = \{((f_1 * \varphi_t)(x), \dots, (f_K * \varphi_t)(x)) : t > 0\}$. As in Lemma 9, let $M_\lambda(x)$ denote the minimal number of $\lambda$-balls needed to cover the set $A_x$, or zero if $\mathrm{diam}(A_x) < \lambda$. Fixing $x$, let $q$ be the largest integer so that $M_{2^q}(x)$ is positive. For $j \le q$, let $C_j$ be a collection of balls of radius $2^j$ covering $A_x$ with $|C_j| = M_{2^j}(x)$. Next, for $j \le q$, define $D_j$ by choosing a point of $A_x$ from each ball in $C_j$. Finally, define $B_j$ for $j \le q$ by taking $B_q = D_q$, and for $j < q$ let $B_j$ consist of all points of the form $d - d'$ where $d \in D_j$ and $d'$ is the nearest point of $D_{j+1}$ to $d$.

Because of the covering properties of $C_j$, we know that if $j \le q$ and $y \in B_j$, then $\|y\| < 2\cdot 2^j$. (For $j = q$ we note that $\lim_{t\to\infty}(F * \varphi_t)(x) = 0$, so we may assume $0 \in A_x$.) Further, we know that for each $a \in A_x$, we can write $a = \sum_{j\le q} y_j$ for some sequence $y_j \in B_j$. This sum holds coordinate-wise, and we may write $(f_k * \varphi_t)(x) = \sum_{j\le q} \pi_k(y_j)$, leading us to the inequality

\[ \sqrt{2^{-s}} \Big( \int_0^{2^s} \sup_{t\ge 0} \Big| \sum_{k=1}^{K} e^{2\pi i\lambda_k(x-u)} \cdot (f_k * \varphi_t)(x) \Big|^2 du \Big)^{1/2} \le \sqrt{2^{-s}} \sum_{j\le q} \Big\| \max_{\vec{y}\in B_j} \Big| \sum_{k=1}^{K} e^{2\pi i\lambda_k(x-u)} \cdot \pi_k(\vec{y}) \Big| \Big\|_{L^2[0,2^s],\,du}. \tag{3} \]

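Now on the one hand, the interior of the maximum above is always less than $\sum_{k=1}^{K} |\pi_k(\vec{y})| \le \sqrt{K}\|\vec{y}\|$. On the other hand, it also increases if we replace the max by a sum. We write

\[ \max_{\vec{y}\in B_j} |\dots| \le \min\Big\{ \sqrt{K}\cdot 2\cdot 2^j,\ \Big[ \sum_{\vec{y}\in B_j} \Big| \sum_{k=1}^{K} e^{2\pi i\lambda_k(x-u)} \pi_k(\vec{y}) \Big|^2 \Big]^{1/2} \Big\}. \]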

For positive functions, the norm of the min is less than the min of the norms, so we estimate the $L^2[0, 2^s], du$ norm of each of the above. For the first, we get $2^{j+1}\sqrt{K 2^s}$. For the second, the fact that $|\lambda_j - \lambda_k| > 2^{-s}$ for $j \ne k$ gives the following inequality, which we will not verify:

\[ \Big\| \sum_{k=1}^{K} a_k e^{2\pi i\lambda_k u} \Big\|_{L^2[0,2^s],\,du} \le C\sqrt{2^s}\Big(\sum_{k=1}^{K} |a_k|^2\Big)^{1/2}. \]
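An inequality of this shape can be probed numerically, since the squared $L^2[0, T]$ norm of an exponential sum has a closed form in terms of the frequency differences. The sketch below is only an empirical illustration: the scale `s = 6`, the gap `1.5 / T`, the 40 equally spaced frequencies and the random coefficients are all assumptions chosen for the example.

```python
import cmath
import math
import random

def l2_norm_squared(coeffs, freqs, T):
    """Exact integral over [0, T] of |sum_k a_k exp(2*pi*i*lam_k*u)|**2 du."""
    total = 0.0
    for a_j, lam_j in zip(coeffs, freqs):
        for a_k, lam_k in zip(coeffs, freqs):
            d = lam_j - lam_k
            if d == 0:
                gram = T
            else:
                gram = (cmath.exp(2j * math.pi * d * T) - 1) / (2j * math.pi * d)
            total += (a_j * a_k.conjugate() * gram).real
    return total

def ratio(coeffs, freqs, T):
    """Squared L2[0, T] norm divided by T * sum |a_k|**2."""
    energy = sum(abs(a) ** 2 for a in coeffs)
    return l2_norm_squared(coeffs, freqs, T) / (T * energy)

s = 6
T = 2.0 ** s
gap = 1.5 / T                              # assumed separation, larger than 2**-s
freqs = [k * gap for k in range(40)]       # 40 separated points inside [0, 1]

random.seed(0)
samples = [[complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in freqs]
           for _ in range(20)]
samples.append([1.0 + 0j] * len(freqs))    # all coefficients equal
ratios = [ratio(a, freqs, T) for a in samples]
print(max(ratios))
```

For these samples the ratio stays bounded by a small constant that does not grow with the number of frequencies, which is the behaviour the quoted inequality asserts.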

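This gives that the $L^2[0, 2^s], du$ norm of the second element of the above min is bounded by

\[ \Big( \sum_{\vec{y}\in B_j} C^2 2^s \|\vec{y}\|^2 \Big)^{1/2} \le \Big( C^2 2^s |B_j| (2^{j+1})^2 \Big)^{1/2} = C\, 2^{j+1}\sqrt{2^s}\,(M_{2^j}(x))^{1/2}. \]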

This gives us that (3) is bounded by

\[ C\sum_{j\le q} 2^{j+1}\big(\min\{K, M_{2^j}(x)\}\big)^{1/2} \le 4C\int_0^\infty \big(\min\{K, M_\lambda(x)\}\big)^{1/2}\,d\lambda, \]

since $M_\lambda(x)$ decreases as $\lambda$ increases. But now we are finished, by applying Lemma 9.
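The comparison between the dyadic sum and the integral uses only monotonicity: for a nonincreasing $g \ge 0$, each term $2^{j+1} g(2^j)$ is at most $4\int_{2^{j-1}}^{2^j} g$. The sketch below checks this with a factor of 4 on a toy example. The function `covering_count` is a hypothetical nonincreasing stand-in for $M_\lambda(x)$, and `K`, `q` and the truncation level `j_min` are assumed sample values.

```python
import math

def covering_count(lam):
    """Toy nonincreasing stand-in for M_lambda(x); it vanishes once lam >= 1."""
    return math.ceil(1 / lam) ** 2 if lam < 1 else 0

def g(lam, K):
    """The integrand min(K, M_lambda)**(1/2)."""
    return math.sqrt(min(K, covering_count(lam)))

def dyadic_sum(K, q, j_min):
    """Sum of 2**(j+1) * g(2**j) over j_min <= j <= q."""
    return sum(2.0 ** (j + 1) * g(2.0 ** j, K) for j in range(j_min, q + 1))

def integral_lower_bound(K, q, j_min, pieces=64):
    """Right-endpoint Riemann sum of g over [2**(j_min-1), 2**q].

    Because g is nonincreasing, this never exceeds the true integral.
    """
    total = 0.0
    for j in range(j_min, q + 1):
        left = 2.0 ** (j - 1)              # the interval [2**(j-1), 2**j] has length `left`
        width = left / pieces
        total += sum(g(left + (i + 1) * width, K) for i in range(pieces)) * width
    return total

K, q, j_min = 50, -1, -30                  # q = -1 is the largest j with covering_count(2**j) > 0
lhs = dyadic_sum(K, q, j_min)
rhs = 4 * integral_lower_bound(K, q, j_min)
print(lhs <= rhs)
```

Since the Riemann sum is a lower bound for the integral, the check is slightly stronger than the inequality it illustrates.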

References

[1] Bourgain, J., Pointwise ergodic theorems for arithmetic sets. Inst. Hautes Études Sci. Publ. Math., 69 (1989), pp. 5-45.

Andy Yingst, Univ. of S. Carolina Columbia

