
An Application of Fourier analysis on Boolean functions in Theoretical Computer Science

Ismani Nieuweboer

Bachelor Thesis

Double bachelor’s programme Mathematics & Computer Science

Supervisors: dr. Guus Regts & dr. Jop Briet

Korteweg-de Vries Institute for Mathematics

Faculty of Science

University of Amsterdam

Abstract

After laying down a foundation of Fourier analysis on Boolean functions (functions that have the n-fold Cartesian product of integers modulo 2 as domain), the Level-k inequalities and Chang’s lemma are proven. Subsequently Chang’s lemma is used together with additive combinatorics to show the quasi-polynomial Freiman-Ruzsa theorem originally proven by Sanders. Using this, an overview of linearity testing of Boolean functions and functions from and to the Boolean/Hamming cube, and of its complexity, is given.

Title: An Application of Fourier analysis on Boolean functions in Theoretical Computer Science
Author: Ismani Nieuweboer, [email protected]; student number 10502815
Supervisors: dr. Guus Regts (Korteweg-de Vries Instituut), dr. Jop Briet (Centrum Wiskunde & Informatica)
Second graders: prof. dr. Tom Koornwinder (Korteweg-de Vries Instituut), prof. dr. Ronald de Wolf (Centrum Wiskunde & Informatica)
End date: July 8th, 2016

Korteweg-de Vries Institute for Mathematics (KdVI)
University of Amsterdam
Science Park 105, 1098 XH Amsterdam
http://kdvi.uva.nl/

Acknowledgments

First off I want to thank my supervisors, Jop and Guus, for their time, patience, and guidance. Not to forget their suggestion for the topic of this thesis, which I found very interesting due to its immense diversity. Secondly, I want to show my appreciation to all my friends for their moral support, given that a few times I needed it more than I thought. Last but not least many thanks to my parents back in Suriname, for allowing me to get to this point in life and for supporting me in the choices I make today.

Ismani Nieuweboer


Contents

Introduction

1 Preliminaries
  1.1 Measure theory

2 Fourier analysis on Boolean functions
  2.1 Basic theory
  2.2 Noise stability
  2.3 Hypercontractivity
    2.3.1 Reductions
    2.3.2 Base case and induction
    2.3.3 Small-set expansion theorem, Level-k inequalities, and Chang’s lemma

3 Additive Combinatorics

4 Linearity testing
  4.1 Testing of Boolean functions
  4.2 Testing of functions over Boolean space
  4.3 Algorithmic description

Conclusion

Popular Summary (in Dutch)

Introduction

Theoretical computer science leans broadly on the fundamentals laid down by mathematics. One can argue a lot about where the boundary lies between the two. The theory described in this thesis was pioneered by computer scientists and mathematicians alike.

As a particular example, Boolean functions are used to describe properties of n-bit strings. The Boolean/Hamming cube, $\mathbb{F}_2^n$ and $\{\pm 1\}^n$ are equivalent ways to denote these strings. Fourier analysis is used to analyze the properties of Boolean functions, and for applications throughout (theoretical) computer science. Some examples of these applications are property testing, extremal/additive combinatorics, random graph theory, social choice theory, cryptography, circuit complexity, learning theory and pseudorandomness [6].

The goal of this thesis is to give a description of linearity tests and their complexity. Given a function that is either linear or far from linear, i.e. not linear on a lot of pairs of points, the linearity test described determines which case applies to the function. Two classes of functions are discussed: Boolean functions ($\mathbb{F}_2^n \to \mathbb{F}_2$) and functions over Boolean space ($\mathbb{F}_2^n \to \mathbb{F}_2^n$), up to isomorphism of $\mathbb{F}_2$ with other groups. Working over $\mathbb{F}_2$, linearity only requires that $f(x) + f(y) = f(x + y)$ for all $x, y \in \mathbb{F}_2^n$.

For testing Boolean functions, only standard results from Fourier analysis are needed. For functions from and to the Hamming cube, however, a lot more machinery is needed. Chang’s lemma, a result from hypercontractivity, is derived. This result gives a logarithmic bound on the dimension of the space spanned by elements with large Fourier coefficients.

Freiman and Ruzsa proved an exponential bound and conjectured a polynomial bound on the size of the span of sets A with small doubling. Recently Sanders proved that for a subset within A with a quasi-polynomial lower bound on its size, the size of the span has a polynomial upper bound. This theorem, called the quasi-polynomial Freiman-Ruzsa theorem, is proven using techniques from additive combinatorics and Fourier analysis. This result has been a large breakthrough in theoretical computer science.

Finally, we describe an explicit application of this to the area of property testing in the form of an algorithm for testing linearity of Boolean functions, and functions over Boolean space. The complexity of the algorithm describes the number of points on which the function agrees with a certain linear map. In the case of $\mathbb{F}_2^n \to \mathbb{F}_2^n$ this complexity is directly related to the parameters that appear in the quasi-polynomial Freiman-Ruzsa theorem.

Chapters two and three can be considered mathematics. The result discussed in chapter three has many applications in theoretical computer science [4] other than the one discussed in this thesis. Therefore chapter three can also be considered to belong to the field of theoretical computer science. Chapter four concludes with proofs of correctness for an algorithm, which makes that chapter theoretical computer science.


1 Preliminaries

1.1 Measure theory

Let (S,Σ, µ) be a measure space. For p ∈ R≥1 consider the Lp-space over R.

Definition 1.1.1. [Inner product over $L^2$] Let $f, g \in L^2$. The inner product between $f$ and $g$ is defined as
\[ \langle f \mid g \rangle := \int_S f g \, d\mu. \]

Definition 1.1.2. [p-norm] Let $(S, \Sigma, \mu)$ be a measure space. Then the p-norm of $f$ is defined as
\[ \|f\|_p := \Big( \int_S |f|^p \, d\mu \Big)^{1/p}. \]

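Remark. If $\mu$ is a probability measure, we have $\|f\|_p = \mathbb{E}[|f|^p]^{1/p} = \mathbb{E}_x[|f(x)|^p]^{1/p}$ for $x$ a random variable over $S$ distributed according to $\mu$ (uniformly chosen, when $\mu$ is the uniform measure).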

Theorem 1.1.3. [Hölder’s inequality (Theorem 12.2 in [10])] For $f, g$ measurable on $S$, and $p, q \in [1, \infty]$ Hölder conjugates (a Hölder pair), i.e. $\frac{1}{p} + \frac{1}{q} = 1 \iff p - 1 = \frac{1}{q - 1}$, it holds that
\[ \|fg\|_1 \le \|f\|_p \|g\|_q. \]
In particular, if $f, g \in L^2$ we have
\[ |\langle f \mid g \rangle| \le \|f\|_p \|g\|_q \]
(using the triangle inequality for integration).

Additionally, when $p, q \in (1, \infty)$, $f \in L^p$, $g \in L^q$, the inequality is sharp (i.e. holds with equality) if and only if $|f|^p, |g|^q$ are linearly dependent in $L^1$.

Remark. A calculation shows that the second inequality can be made sharp by the choice of $g(x) := f(x)|f(x)|^{\frac{p}{q} - 1}$. This in particular implies that $\sup_{\|g\|_q = 1} \langle f \mid g \rangle = \|f\|_p$, since the choice of $g$ can always be normalized.

Theorem 1.1.4. [Jensen’s inequality (Theorem 12.14 in [10])] Suppose $x$ is a random variable and $\varphi$ a convex function. Then $\varphi(\mathbb{E}[x]) \le \mathbb{E}[\varphi(x)]$.

Theorem 1.1.5. [Monotonicity of the p-norm (in probability spaces)] Suppose $\mu(S) = 1$, i.e. $(S, \Sigma, \mu)$ is a probability space. Then $\|\cdot\|_p$ is monotonically increasing in $p$.


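Proof. Let $f \in L^p(S)$ and let $q, q' \ge 1$ be Hölder conjugates. Then
\[ \|f\|_p^p = \int |f|^p \, d\mu = \int |f|^p \cdot 1 \, d\mu = \big\langle |f|^p \,\big|\, 1 \big\rangle \le \big\| |f|^p \big\|_q \|1\|_{q'} = \Big( \int |f|^{pq} \, d\mu \Big)^{1/q} \cdot \mu(S)^{1/q'} = \big( \|f\|_{pq}^{pq} \big)^{1/q} \cdot 1 = \|f\|_{pq}^p. \]
Hence $\|f\|_p \le \|f\|_{pq}$. Since $q \ge 1$ if and only if $pq \ge p$, every exponent that is at least $p$ can be written as $pq$ for some $q \ge 1$, and the claim immediately follows.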
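The monotonicity statement can be checked numerically on a small finite probability space. The following is a minimal sketch, assuming the uniform measure on a four-point set; the function values and the list of exponents are hypothetical and chosen only for illustration.

```python
def p_norm(values, p):
    """p-norm of a function on a finite set under the uniform probability measure."""
    return (sum(abs(v) ** p for v in values) / len(values)) ** (1 / p)

# Hypothetical function values f(s) for the four points s of S.
values = [3.0, -1.0, 0.5, 2.0]
exponents = [1, 1.5, 2, 3, 4, 8]
norms = [p_norm(values, p) for p in exponents]

# Each norm should be at most the next one (up to rounding).
is_monotone = all(a <= b + 1e-12 for a, b in zip(norms, norms[1:]))
print(is_monotone)
```

Replacing the average by a plain sum (counting measure) generally breaks the monotonicity, which is why Theorem 1.1.5 assumes a probability space.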

Theorem 1.1.6. [Markov’s inequality (Proposition 10.12 in [10])] Suppose $x \ge 0$ is a nonnegative random variable and let $a > 0$. Then $\mathbb{P}[x \ge a] \le \frac{\mathbb{E}[x]}{a}$.

Theorem 1.1.7. [Marcinkiewicz-Zygmund inequality (Theorem 10.3.2 in [3])] Let $p \ge 1$. There exists a constant $C > 0$ (only depending on $p$) such that for every sequence $x_1, \dots, x_l$ of independent, zero-mean random variables, each with finite $p$-th moment (i.e. $\mathbb{E}[|x_i|^p] < \infty$), it holds that
\[ \mathbb{E}\Big[ \Big| \sum_{i=1}^{l} x_i \Big|^p \Big] \le (Cp)^{p/2} \, \mathbb{E}\Big[ \Big( \sum_{i=1}^{l} |x_i|^2 \Big)^{p/2} \Big]. \]

Corollary 1.1.8. Let $p \ge 1$. There exists a constant $C > 0$ (only depending on $p$) such that for every sequence $x_1, \dots, x_l$ of independent, zero-mean random variables such that $|x_i| \le 1$ for each $i \in [l]$, it holds that
\[ \mathbb{E}\Big[ \Big| \frac{1}{l} \sum_{i=1}^{l} x_i \Big|^p \Big] \le \Big( \frac{Cp}{l} \Big)^{p/2}. \]

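Proof. Note that bounded random variables also have finite $p$-th moment. Hence the Marcinkiewicz-Zygmund inequality can be applied to give
\[ \begin{aligned}
\mathbb{E}\Big[ \Big| \frac{1}{l} \sum_{i=1}^{l} x_i \Big|^p \Big] &= \frac{1}{l^p} \, \mathbb{E}\Big[ \Big| \sum_{i=1}^{l} x_i \Big|^p \Big] \\
&\le \frac{1}{l^p} (Cp)^{p/2} \, \mathbb{E}\Big[ \Big( \sum_{i=1}^{l} |x_i|^2 \Big)^{p/2} \Big] \\
&\le \frac{1}{l^p} (Cp)^{p/2} \Big( \sum_{i=1}^{l} 1 \Big)^{p/2} = \Big( \frac{Cp}{l} \Big)^{p/2}.
\end{aligned} \]
The second inequality follows from the assumption that $|x_i| \le 1$ for each $i \in [l]$.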


2 Fourier analysis on Boolean functions

This chapter closely follows the book Analysis of Boolean Functions by Ryan O’Donnell [6]. Specific references will be given at sections and theorems.

2.1 Basic theory

This section follows parts of chapter 1 of O’Donnell [6].

We will consider bit strings of a fixed length $n \in \mathbb{Z}_{>0}$. One can do a bit-wise XOR operation between two of these bit strings. The strings together with this operation then give rise to an Abelian group.

Mathematically one can describe this group in multiple ways. The first way is as $\mathbb{F}_2^n$ (here $\mathbb{F}_2 = \{0, 1\}$) with addition modulo 2, and a second way is as $\{\pm 1\}^n$ with multiplication. A third way is by subsets of $[n] := \{1, 2, \dots, n\}$ with the symmetric difference between sets as operation. The third way is needed to describe the space on which the Fourier transforms of functions take their arguments, which will be done later in this section.

We identify $\mathbb{F}_2$ and $\{\pm 1\}$ with each other using the group isomorphism $\mathbb{F}_2 \to \{\pm 1\} : x \mapsto (-1)^x$, which extends easily to $n$ dimensions. Furthermore $\mathbb{F}_2^n$ is isomorphic to $2^{[n]}$, with the isomorphism given by $x \mapsto S := \{i : x_i = 1\}$, where we see that $S$ contains $i$ if and only if $x$ has a 1 at position $i$. In short, we have $(\mathbb{F}_2^n, +) \cong (\{\pm 1\}^n, \cdot) \cong (2^{[n]}, \Delta)$.

It is important to note that addition and subtraction are the same in $\mathbb{F}_2$, as well as multiplication and division in $\{\pm 1\}$, however trivial this might be.
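The three descriptions of the group can be compared by brute force for small $n$. The sketch below is only an illustration; the helper names `bits_to_signs`, `bits_to_subset` and `check_isomorphisms` are hypothetical and not part of the thesis. It checks that XOR on bit strings corresponds to the coordinatewise product on $\{\pm 1\}^n$ and to the symmetric difference on subsets of $[n]$.

```python
from itertools import product

def bits_to_signs(x):
    """Map x in F_2^n to {±1}^n via x_i -> (-1)^x_i."""
    return tuple((-1) ** xi for xi in x)

def bits_to_subset(x):
    """Map x in F_2^n to the subset {i : x_i = 1} of [n] (1-indexed)."""
    return frozenset(i + 1 for i, xi in enumerate(x) if xi == 1)

def check_isomorphisms(n):
    """Check that both maps turn addition mod 2 into the other group operations."""
    cube = list(product((0, 1), repeat=n))
    for x in cube:
        for y in cube:
            xor = tuple((a + b) % 2 for a, b in zip(x, y))
            signs = tuple(a * b for a, b in zip(bits_to_signs(x), bits_to_signs(y)))
            if bits_to_signs(xor) != signs:
                return False
            # ^ on frozensets is the symmetric difference.
            if bits_to_subset(xor) != bits_to_subset(x) ^ bits_to_subset(y):
                return False
    return True

print(check_isomorphisms(3))
```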

We now give the definition of the type of functions that are at the center of this thesis.

Definition 2.1.1. [Boolean function] A function $f$ is said to be an $\mathbb{R}$-valued Boolean function if it is specified as
\[ f : \{\pm 1\}^n \to \mathbb{R}. \]
We may also use terminology like $\{\pm 1\}$-valued Boolean functions for $f : \{\pm 1\}^n \to \{\pm 1\}$. The domain of the function can be switched with $\mathbb{F}_2^n$.

The space $\{f : \{\pm 1\}^n \to \mathbb{R}\}$ of such functions forms a vector space under pointwise addition and scalar multiplication.

In chapter 3 (on additive combinatorics) and section 4.2 we will switch from $\{\pm 1\}$ to $\mathbb{F}_2$.

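Two simple examples of Boolean functions are
\[ \min\nolimits_2(x_1, x_2) = -\frac{1}{2} + \frac{1}{2} x_1 + \frac{1}{2} x_2 + \frac{1}{2} x_1 x_2, \]
and
\[ \mathrm{Maj}_3(x_1, x_2, x_3) = \frac{1}{2} x_1 + \frac{1}{2} x_2 + \frac{1}{2} x_3 - \frac{1}{2} x_1 x_2 x_3. \]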

Let $f, g$ be Boolean functions. In Definition 1.1.1, take $\mu$ the uniform measure over $\{\pm 1\}^n$. We then have that the inner product between $f$ and $g$ is equal to
\[ \langle f \mid g \rangle = \frac{1}{2^n} \sum_{x \in \{\pm 1\}^n} f(x) g(x). \]
The stochastic interpretation of this is as follows. Let $\boldsymbol{x} \sim A$ denote that the random variable $\boldsymbol{x}$ is chosen uniformly from a (finite) set $A$. Then $\langle f \mid g \rangle = \mathbb{E}_{\boldsymbol{x} \sim \{\pm 1\}^n}[f(\boldsymbol{x}) g(\boldsymbol{x})]$. Expected values and probabilities written without a random variable are implicitly meant to be over a uniformly chosen one, i.e. $\mathbb{E}[f] = \mathbb{E}_{\boldsymbol{x} \sim \{\pm 1\}^n}[f(\boldsymbol{x})]$.

We now define a type of Boolean functions, characters, which give an orthonormal basis for the space of Boolean functions. Furthermore two equivalent ways of describing characters are given.

Definition 2.1.2. [Character] Define the character $\chi_S : \{\pm 1\}^n \to \{\pm 1\}$ for $S \subseteq [n]$ by
\[ \chi_S : x \mapsto \prod_{i \in S} x_i. \]
Now $(\chi_S)_{S \subseteq [n]}$ is an orthonormal basis for the space of functions. We have $\langle \chi_S \mid \chi_T \rangle = \delta_{S,T}$, where $\delta_{S,T}$ is the Kronecker delta, which is equal to 1 if $S$ equals $T$ and equal to 0 otherwise.

This can be proven in two ways:

• Using the identification $(\{\pm 1\}^n, \cdot) \cong (2^{[n]}, \Delta)$ and proving it directly;

• Using representation theory, since there are exactly $2^n$ of these characters and $\{\pm 1\}^n$ is an Abelian group (see for example [11]).

When writing Boolean functions as $f : \mathbb{F}_2^n \to \mathbb{R}$, one can use
\[ \chi_S : x \mapsto \prod_{i \in S} (-1)^{x_i} \]
as an orthonormal basis of characters instead. This follows from the isomorphism between $\{\pm 1\}$ and $\mathbb{F}_2$ given earlier.

We now give a third way to describe characters, where we replace the index from $2^{[n]}$ with an index from $\mathbb{F}_2^n$. For given $S \subseteq [n]$, let $z \in \mathbb{F}_2^n$ be such that $z_i = 1$ if and only if $i \in S$. Using the characters of Boolean functions over $\mathbb{F}_2^n$, we then have $\prod_{i \in S} (-1)^{x_i} = (-1)^{\sum_{i \in S} x_i} = (-1)^{\sum_{i=1}^{n} x_i z_i}$ for $x \in \mathbb{F}_2^n$. For $z \in \mathbb{F}_2^n$, another equivalent way to describe the basis of characters is therefore $\chi_z : \mathbb{F}_2^n \to \{\pm 1\}$ with
\[ \chi_z : x \mapsto (-1)^{x \cdot z} = (-1)^{\sum_{k=1}^{n} x_k z_k}. \]


Using the first description of the basis of characters, a Boolean function $f$ can be written as
\[ f = \sum_{S \subseteq [n]} \langle f \mid \chi_S \rangle \chi_S. \]
The coefficients $\langle f \mid \chi_S \rangle$ are called Fourier coefficients. Following directly from this is the Fourier transform:

Definition 2.1.3. [Fourier transform] The Fourier transform $\hat{f}$ of a Boolean function $f$ is defined by $\hat{f}(S) := \langle f \mid \chi_S \rangle$ for $S \subseteq [n]$. In particular, $\hat{f} : 2^{[n]} \to \mathbb{R}$.

The decomposition using the orthonormal basis of characters can then be written using these Fourier coefficients as
\[ f = \sum_{S \subseteq [n]} \hat{f}(S) \chi_S. \]
We call this the Fourier expansion of $f$. Note that $\min_2$ and $\mathrm{Maj}_3$ have already been written out in their Fourier expansions.
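For small $n$ the Fourier coefficients can be computed directly from the definition $\hat{f}(S) = \langle f \mid \chi_S \rangle$. The following is a brute-force sketch, not an efficient transform; the function names are hypothetical, and subsets are stored as tuples of 0-based indices rather than subsets of $[n]$. It recovers the expansions of $\min_2$ and $\mathrm{Maj}_3$ given earlier.

```python
from itertools import product, combinations
from math import prod

def chi(S, x):
    """Character chi_S(x): product of x_i over i in S (S holds 0-based indices)."""
    return prod(x[i] for i in S)

def fourier_transform(f, n):
    """Return {S: <f | chi_S>} for all S, using the uniform measure on {±1}^n."""
    cube = list(product((1, -1), repeat=n))
    subsets = [S for k in range(n + 1) for S in combinations(range(n), k)]
    return {S: sum(f(x) * chi(S, x) for x in cube) / len(cube) for S in subsets}

def maj3(x):
    """Majority of three ±1 bits."""
    return 1 if sum(x) > 0 else -1

def min2(x):
    """Minimum of two ±1 bits."""
    return min(x)

print(fourier_transform(maj3, 3))
print(fourier_transform(min2, 2))
```

The coefficients on the singletons of `maj3` come out as 1/2 and the coefficient on the full set as -1/2, matching the expansion of $\mathrm{Maj}_3$.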

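It is easy to see that taking the Fourier transform is a linear map. Let $\alpha, \beta \in \mathbb{R}$, $S \subseteq [n]$, and let $f, g$ be Boolean functions. Then one has
\[ \widehat{(\alpha f + \beta g)}(S) = \langle \alpha f + \beta g \mid \chi_S \rangle = \alpha \langle f \mid \chi_S \rangle + \beta \langle g \mid \chi_S \rangle = \alpha \hat{f}(S) + \beta \hat{g}(S). \]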

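Proposition 2.1.4. [Plancherel’s identity] Let $\langle \cdot \mid \cdot \rangle_{L^2}$ be the inner product on the space of Boolean functions corresponding to the uniform measure on $\{\pm 1\}^n$, as stated earlier. Let $\langle \cdot \mid \cdot \rangle_{\ell^2}$ be the inner product for the function space on $2^{[n]}$ corresponding to the counting measure on $2^{[n]}$. Then, for Boolean functions $f, g$,
\[ \langle f \mid g \rangle_{L^2} = \langle \hat{f} \mid \hat{g} \rangle_{\ell^2}. \]

Proof. Taking the Fourier expansion of both functions and then using linearity and orthonormality gives
\[ \begin{aligned}
\langle f \mid g \rangle_{L^2} &= \Big\langle \sum_{S \subseteq [n]} \hat{f}(S) \chi_S \,\Big|\, \sum_{T \subseteq [n]} \hat{g}(T) \chi_T \Big\rangle_{L^2} = \sum_{S, T \subseteq [n]} \hat{f}(S) \hat{g}(T) \langle \chi_S \mid \chi_T \rangle_{L^2} \\
&= \sum_{S, T \subseteq [n]} \hat{f}(S) \hat{g}(T) \delta_{S,T} = \sum_{S \subseteq [n]} \hat{f}(S) \hat{g}(S) = \langle \hat{f} \mid \hat{g} \rangle_{\ell^2},
\end{aligned} \]
which proves Plancherel’s identity.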

Remark (Parseval’s identity). With $f = g$ this is also called Parseval’s identity and the proposition states
\[ \|f\|_{L^2} = \|\hat{f}\|_{\ell^2}. \tag{2.1} \]
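Plancherel’s and Parseval’s identities are easy to confirm numerically. The sketch below assumes functions are stored as dictionaries from points of $\{\pm 1\}^n$ to values; the random test functions, the seed and the helper names are hypothetical choices for illustration.

```python
import random
from itertools import product, combinations
from math import prod, isclose

def fourier_transform(f, n):
    """f: dict x -> f(x) on {±1}^n; returns dict S -> <f | chi_S>."""
    subsets = [S for k in range(n + 1) for S in combinations(range(n), k)]
    return {S: sum(v * prod(x[i] for i in S) for x, v in f.items()) / 2 ** n
            for S in subsets}

def plancherel_sides(f, g, n):
    """Return (<f|g> under the uniform measure, <hat f|hat g> under the counting measure)."""
    lhs = sum(f[x] * g[x] for x in f) / 2 ** n
    fh, gh = fourier_transform(f, n), fourier_transform(g, n)
    rhs = sum(fh[S] * gh[S] for S in fh)
    return lhs, rhs

random.seed(0)
n = 4
cube = list(product((1, -1), repeat=n))
f = {x: random.uniform(-1, 1) for x in cube}
g = {x: random.uniform(-1, 1) for x in cube}

lhs, rhs = plancherel_sides(f, g, n)
print(isclose(lhs, rhs, abs_tol=1e-12))
```

Calling `plancherel_sides(f, f, n)` gives the two sides of Parseval’s identity (squared).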

For inner products the $L^2$, $\ell^2$ subscripts will be omitted. For norms they will mostly be replaced with a subscript indicating the 2-norm, since the domain of the function implicitly specifies the measure used on the space.


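Definition 2.1.5. [Fourier weights] For $\circ \in \{=, <, >, \le, \ge\}$ and $k \in [n] \cup \{0\}$ define
\[ f^{\circ k} := \sum_{|S| \circ k} \hat{f}(S) \chi_S, \]
and
\[ W^{\circ k}[f] := \|f^{\circ k}\|_2^2 = \sum_{|S| \circ k} \hat{f}(S)^2, \]
where the last equality follows by Parseval’s identity.

Definition 2.1.6. [ε-spectrum of a function] For $\varepsilon \in (0, 1)$ define the ε-spectrum $\mathrm{Spec}_\varepsilon[f] := \{S \subseteq [n] : |\hat{f}(S)| \ge \varepsilon\}$.

Definition 2.1.7. [Density (Definition 1.20 in [6])] $\varphi : \{\pm 1\}^n \to \mathbb{R}_{\ge 0}$ is called a (probability) density if $\mathbb{E}[\varphi] = 1$.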

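Definition 2.1.8. [Random variable drawn from associated probability distribution] Let $\varphi$ be a density. One writes $\boldsymbol{x} \sim \varphi$ for a random variable $\boldsymbol{x}$ chosen from the probability distribution associated with $\varphi$, defined by
\[ \mathbb{P}_{\boldsymbol{x} \sim \varphi}[\boldsymbol{x} = x] = \frac{1}{2^n} \varphi(x), \]
for $x \in \{\pm 1\}^n$. Note that $\sum_{x \in \{\pm 1\}^n} \mathbb{P}_{\boldsymbol{x} \sim \varphi}[\boldsymbol{x} = x] = \sum_{x \in \{\pm 1\}^n} \frac{1}{2^n} \varphi(x) = \mathbb{E}_{\boldsymbol{x} \sim \{\pm 1\}^n}[\varphi(\boldsymbol{x})] = 1$, which means this probability distribution is well-defined.

Remark. When $A \subseteq \{\pm 1\}^n$ the probability density corresponding to uniformly choosing a random variable from $A$ is $\varphi_A := \frac{1_A}{\mathbb{E}[1_A]}$.

This holds since $\mathbb{P}_{\boldsymbol{x} \sim \varphi_A}[\boldsymbol{x} = x] = \frac{1}{2^n} \frac{2^n}{|A|} 1_A(x) = \frac{1}{|A|} 1_A(x) = \mathbb{P}_{\boldsymbol{x} \sim A}[\boldsymbol{x} = x]$, which shows that $\boldsymbol{x} \sim \varphi_A$ if and only if $\boldsymbol{x} \sim A$.

The sets are often omitted for densities of singleton sets $\{x\}$ for $x \in \{\pm 1\}^n$; i.e. we write $\varphi_x$ instead of $\varphi_{\{x\}}$.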

Remark. Suppose that the random variable $\boldsymbol{x} \sim A$ is uniformly chosen from a set $A \subseteq \{\pm 1\}^n$. Let $B \subseteq \{\pm 1\}^n$ be another set. Then with
\[ \mathbb{P}_{\boldsymbol{x} \sim A}[\boldsymbol{x} \in B] = \mathbb{E}_{\boldsymbol{x} \sim A}[1_B(\boldsymbol{x})] = \mathbb{E}_{\boldsymbol{x} \sim \{\pm 1\}^n}[\varphi_A(\boldsymbol{x}) 1_B(\boldsymbol{x})] = \langle \varphi_A \mid 1_B \rangle \tag{2.2} \]
one sees that the probability of $\boldsymbol{x}$ being contained in $B$ can be written as the inner product of a density and an indicator.

Parseval’s identity gives a bound on the dimension of the space spanned by the ε-spectrum of density functions. Here, the dimension of a span of a subset of $2^{[n]}$ is to be interpreted as the dimension of the span of the corresponding set in $\{\pm 1\}^n$. The bound given here will be significantly improved at the end of this chapter (Chang’s lemma, Theorem 2.3.18).


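Proposition 2.1.9. Let $A \subseteq \{\pm 1\}^n$ have volume $\alpha = \mathbb{E}[1_A] = \frac{|A|}{2^n}$, and let $\varepsilon > 0$. Define $d := \dim(\mathrm{Sp}(\mathrm{Spec}_\varepsilon[\varphi_A]))$, using the ε-spectrum as defined in Definition 2.1.6. Then $d \le \varepsilon^{-2} \alpha^{-1}$.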
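The bound of Proposition 2.1.9 can be illustrated on a small example. The sketch below is an assumption-laden illustration: it encodes each $S \subseteq [n]$ as an integer bitmask (so that the span is taken in $\mathbb{F}_2^n$), takes $A$ to be a hypothetical subcube with the first two coordinates fixed to 1, and computes the rank over $\mathbb{F}_2$ by a simple elimination. All helper names are hypothetical.

```python
from itertools import product, combinations
from math import prod

def spectrum(A, n, eps):
    """eps-spectrum of the density phi_A = 1_A / E[1_A], each S encoded as a bitmask."""
    alpha = len(A) / 2 ** n
    spec = []
    for k in range(n + 1):
        for S in combinations(range(n), k):
            coeff = sum(prod(x[i] for i in S) for x in A) / 2 ** n / alpha
            if abs(coeff) >= eps:
                spec.append(sum(1 << i for i in S))
    return spec

def rank_gf2(masks):
    """Rank over F_2 of vectors given as integer bitmasks (xor-basis elimination)."""
    basis = []
    for v in masks:
        for b in basis:
            v = min(v, v ^ b)   # clears the leading bit of b from v if present
        if v:
            basis.append(v)
    return len(basis)

n, eps = 4, 0.5
# Hypothetical example set: the subcube where the first two coordinates equal 1.
A = [x for x in product((1, -1), repeat=n) if x[0] == 1 and x[1] == 1]
alpha = len(A) / 2 ** n

spec = spectrum(A, n, eps)
d = rank_gf2(spec)
bound = eps ** -2 / alpha
print(len(spec), d, bound)
```

For this subcube the density equals $1 + x_1 + x_2 + x_1 x_2$, so the spectrum consists of the four subsets of the first two coordinates and spans a space of dimension 2, well below the bound.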

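Proof. Let $\alpha := \mathbb{E}[1_A] = \frac{|A|}{2^n}$. Then
\[ \begin{aligned}
\alpha^{-1} = \frac{\mathbb{E}[1_A^2]}{\alpha^2} = \|\varphi_A\|_2^2 = \|\hat{\varphi}_A\|_2^2 &= \sum_{S \subseteq [n]} |\hat{\varphi}_A(S)|^2 \\
&\ge \sum_{S \in \mathrm{Spec}_\varepsilon[\varphi_A]} |\hat{\varphi}_A(S)|^2 \ge \sum_{S \in \mathrm{Spec}_\varepsilon[\varphi_A]} \varepsilon^2 = \big| \mathrm{Spec}_\varepsilon[\varphi_A] \big| \varepsilon^2.
\end{aligned} \]
Consequently,
\[ \dim \mathrm{Sp}(\mathrm{Spec}_\varepsilon[\varphi_A]) \le \big| \mathrm{Spec}_\varepsilon[\varphi_A] \big| \le \varepsilon^{-2} \alpha^{-1}. \]

Now the definition of an operation between Boolean functions will be given, which can also be defined for general groups. One can think of this operation as “smoothing” one of the functions using the other.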

Definition 2.1.10. [Convolution of Boolean functions] Let $f, g$ be Boolean functions. The convolution of $f$ and $g$ is the Boolean function defined by
\[ (f * g)(x) := \mathbb{E}_{y \sim \{\pm 1\}^n}[f(y) g(xy)]. \]

Lemma 2.1.11. [(Exercise 1.25 in [6])] The convolution as defined in Definition 2.1.10 is commutative and associative.

Note that for a Boolean function $f$
\[ (\varphi_x * f)(y) = \frac{1}{2^n} \sum_{z \in \{\pm 1\}^n} f(zy) \varphi_x(z) = \frac{1}{2^n} f(xy) \varphi_x(x) = f(xy). \]
Because of this, convolutions with densities on singletons are also called shifts. Related to this, we see that
\[ \begin{aligned}
(\varphi_A * f)(x) = \mathbb{E}_{y \sim \{\pm 1\}^n}[\varphi_A(y) f(xy)] &= \frac{1}{2^n} \sum_{y \in \{\pm 1\}^n} \frac{2^n}{|A|} 1_A(y) f(xy) \\
&= \frac{1}{|A|} \sum_{y \in A} f(xy) = \mathbb{E}_{y \sim A}[f(xy)].
\end{aligned} \]

A few useful properties about convolutions which also hold over a general Abelian group will now follow.

Lemma 2.1.12. The p-norm is invariant under shifts.


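Proof. Writing out gives
\[ \|\varphi_x * f\|_p^p = \mathbb{E}_y[|(\varphi_x * f)(y)|^p] = \mathbb{E}_y[|f(xy)|^p] = \mathbb{E}_z[|f(z)|^p] = \|f\|_p^p, \]
which proves the lemma.

Remark. For $A \subseteq \{\pm 1\}^n$ a subset and $x \in \{\pm 1\}^n$ it holds that $\varphi_A * \varphi_x = \varphi_{Ax}$. Indeed,
\[ (\varphi_x * \varphi_A)(y) = \varphi_A(xy) = \frac{2^n}{|A|} 1_A(xy) = \frac{2^n}{|Ax|} 1_{Ax}(y) = \varphi_{Ax}(y). \]

Lemma 2.1.13. [(Fact 1.26 in [6])] Suppose $\varphi, \psi$ are densities. Let $\boldsymbol{x} \sim \varphi$, $\boldsymbol{y} \sim \psi$ be independently distributed random variables. Define the random variable $\boldsymbol{z} := \boldsymbol{x}\boldsymbol{y}$. Then $\varphi * \psi$ is a density that represents the random variable $\boldsymbol{z}$.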

Proof. Let $z \in \{\pm 1\}^n$. Then one sees that
\[ \mathbb{P}[\boldsymbol{z} = z] = \sum_x \mathbb{P}[\boldsymbol{x} = x] \, \mathbb{P}[\boldsymbol{y} = zx] = \sum_x \frac{\varphi(x)}{2^n} \frac{\psi(zx)}{2^n} = \frac{1}{2^n} (\varphi * \psi)(z), \]
with the last equality following by definition. This proves the lemma, given Definition 2.1.8.

The following theorem is often stated in a more general form, for (Abelian) groups. Take the space of Boolean functions with the convolution on one side, and the space of Fourier transformed Boolean functions on the other. Then the Fourier transform provides an isomorphism between the two function spaces. The theory in its more general form can be found, for example, in [11].

Theorem 2.1.14. [Fourier transform as isomorphism over algebras (Theorem 1.27 in [6])] Let $f, g$ be Boolean functions. Then $\widehat{f * g} = \hat{f} \cdot \hat{g}$.

For a proof we refer to O’Donnell [6].

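Remark. Convolutions can be “flipped across” an inner product using Plancherel’s identity:
\[ \langle f * g \mid h \rangle = \langle \widehat{f * g} \mid \hat{h} \rangle = \langle \hat{f} \cdot \hat{g} \mid \hat{h} \rangle = \langle \hat{f} \mid \hat{g} \cdot \hat{h} \rangle = \langle f \mid g * h \rangle. \tag{2.3} \]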
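Theorem 2.1.14 can be checked by brute force for small $n$. The sketch below assumes functions are stored as dictionaries on $\{\pm 1\}^n$ and uses hypothetical random test functions; it computes the convolution from Definition 2.1.10 and compares its Fourier transform with the pointwise product of the transforms.

```python
import random
from itertools import product, combinations
from math import prod, isclose

def fourier_transform(f, n):
    """f: dict x -> f(x) on {±1}^n; returns dict S -> <f | chi_S>."""
    subsets = [S for k in range(n + 1) for S in combinations(range(n), k)]
    return {S: sum(v * prod(x[i] for i in S) for x, v in f.items()) / 2 ** n
            for S in subsets}

def convolve(f, g, n):
    """(f * g)(x) = E_y[f(y) g(xy)], y uniform, xy the coordinatewise product."""
    return {x: sum(f[y] * g[tuple(a * b for a, b in zip(x, y))] for y in f) / 2 ** n
            for x in f}

random.seed(1)
n = 3
cube = list(product((1, -1), repeat=n))
f = {x: random.uniform(-1, 1) for x in cube}
g = {x: random.uniform(-1, 1) for x in cube}

fh, gh = fourier_transform(f, n), fourier_transform(g, n)
conv_hat = fourier_transform(convolve(f, g, n), n)
ok = all(isclose(conv_hat[S], fh[S] * gh[S], abs_tol=1e-12) for S in fh)
print(ok)
```

The same helper can be used to observe the commutativity claimed in Lemma 2.1.11 by comparing `convolve(f, g, n)` with `convolve(g, f, n)`.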

2.2 Noise stability

This section follows chapter 2 of O’Donnell [6]. Concepts closely related to social choice theory are defined. We start with the definition of an operator which is needed in section 2.3; then we explore properties of this operator.

Definition 2.2.1. [Noise operator (Definition 2.46 and Proposition 2.47 in [6])] Let $\rho \in \mathbb{R}$, and $f$ be a Boolean function. Define the noise operator $T_\rho$ by
\[ T_\rho f := \sum_{S \subseteq [n]} \rho^{|S|} \hat{f}(S) \chi_S. \]


Note that the noise operator is linear by definition, and $T_1 = \mathrm{Id}$.

For $\rho \in [-1, 1]$ one can give a stochastic interpretation to this operator as the expectation of a random variable. For this we need the concept of applying “noise” to an n-bit string, which boils down to flipping each bit independently with a certain probability.

Definition 2.2.2. [ρ-correlation (Definition 2.40 in [6])] Let $\rho \in [-1, 1]$, and $x \in \{\pm 1\}^n$ be fixed. Let $\boldsymbol{y}$ be a random variable taking values in $\{\pm 1\}^n$ such that for every $i \in [n]$ independently,
\[ \mathbb{P}[\boldsymbol{y}_i = x_i] = \frac{1 + \rho}{2}, \qquad \mathbb{P}[\boldsymbol{y}_i = -x_i] = \frac{1 - \rho}{2}. \]
The random variable $\boldsymbol{y}$ is then called ρ-correlated to $x$, denoted by $\boldsymbol{y} \sim N_\rho(x)$.

Remark. Due to the symmetry in $x$ and $y$ in the previous definition one can also consider a pair of random variables $(\boldsymbol{x}, \boldsymbol{y})$. This pair is called a ρ-correlated pair if first $\boldsymbol{x} \sim \{\pm 1\}^n$ is chosen uniformly, and then $\boldsymbol{y} \sim N_\rho(\boldsymbol{x})$ is chosen, i.e. $\boldsymbol{y}$ is chosen ρ-correlated to $\boldsymbol{x}$.

Theorem 2.2.3. [Stochastic interpretation of the noise operator (Definition 2.46 and Proposition 2.47 in [6])] Let $\rho \in [-1, 1]$, and $f$ a Boolean function. Then
\[ T_\rho f(x) = \mathbb{E}_{\boldsymbol{y} \sim N_\rho(x)}[f(\boldsymbol{y})]. \]

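Proof. By linearity it suffices to consider how the noise operator acts on characters. One has
\begin{align}
T_\rho \chi_S(x) = \rho^{|S|} \chi_S(x) = \prod_{i \in S} \rho x_i &= \prod_{i \in S} \mathbb{E}_{\boldsymbol{y} \sim N_\rho(x)}[\boldsymbol{y}_i] \tag{2.4} \\
&= \mathbb{E}_{\boldsymbol{y} \sim N_\rho(x)}\Big[ \prod_{i \in S} \boldsymbol{y}_i \Big] \tag{2.5} \\
&= \mathbb{E}_{\boldsymbol{y} \sim N_\rho(x)}[\chi_S(\boldsymbol{y})]. \notag
\end{align}
Here equation 2.4 holds since
\[ \mathbb{E}_{\boldsymbol{y} \sim N_\rho(x)}[\boldsymbol{y}_i] = x_i \mathbb{P}[\boldsymbol{y}_i = x_i] + (-x_i) \mathbb{P}[\boldsymbol{y}_i = -x_i] = \frac{x_i}{2}(1 + \rho) - \frac{x_i}{2}(1 - \rho) = \rho x_i, \]
which can also be seen as the “damping” of $x_i$ by $\rho$. Furthermore equation 2.5 follows from independence.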
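The two descriptions of the noise operator can be compared exactly for small $n$. The following sketch evaluates $T_\rho f(x)$ once through the Fourier formula of Definition 2.2.1 and once as the expectation over $\boldsymbol{y} \sim N_\rho(x)$, summing over all $y$ with their exact probabilities instead of sampling. The helper names and the choice $\rho = 0.3$ are hypothetical.

```python
from itertools import product, combinations
from math import prod, isclose

def fourier_transform(f, n):
    """f: dict x -> f(x) on {±1}^n; returns dict S -> <f | chi_S>."""
    subsets = [S for k in range(n + 1) for S in combinations(range(n), k)]
    return {S: sum(v * prod(x[i] for i in S) for x, v in f.items()) / 2 ** n
            for S in subsets}

def noise_fourier(f, n, rho, x):
    """T_rho f(x) as sum_S rho^|S| hat f(S) chi_S(x)."""
    fh = fourier_transform(f, n)
    return sum(rho ** len(S) * c * prod(x[i] for i in S) for S, c in fh.items())

def noise_expectation(f, rho, x):
    """E_{y ~ N_rho(x)}[f(y)], using the exact probability of every y."""
    total = 0.0
    for y in f:
        weight = prod((1 + rho) / 2 if a == b else (1 - rho) / 2
                      for a, b in zip(x, y))
        total += weight * f[y]
    return total

n, rho = 3, 0.3
cube = list(product((1, -1), repeat=n))
maj3 = {x: (1 if sum(x) > 0 else -1) for x in cube}

ok = all(isclose(noise_fourier(maj3, n, rho, x), noise_expectation(maj3, rho, x),
                 abs_tol=1e-12) for x in cube)
print(ok)
```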

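Lemma 2.2.4. [Exercise 2.34 in [6]] The noise operator satisfies $|T_\rho f| \le T_\rho |f|$ pointwise, for all Boolean $f$.

Proof. For all $x \in \{\pm 1\}^n$ it holds that
\[ |T_\rho f(x)| = \big| \mathbb{E}_{\boldsymbol{y} \sim N_\rho(x)}[f(\boldsymbol{y})] \big| \le \mathbb{E}_{\boldsymbol{y} \sim N_\rho(x)}[|f(\boldsymbol{y})|] = \mathbb{E}_{\boldsymbol{y} \sim N_\rho(x)}[|f|(\boldsymbol{y})] = T_\rho |f|(x), \]
which proves the lemma.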


Lemma 2.2.5. [Exercise 2.32 in [6]] The noise operator is a multiplicative homomorphism in $\rho$.

Proof. First note that for all Boolean functions $f$:
\[ \widehat{T_\rho f}(S) = \rho^{|S|} \hat{f}(S) \]
by definition of the noise operator and uniqueness of the Fourier expansion. It follows that
\[ \widehat{T_{\tau\rho} f}(S) = (\tau\rho)^{|S|} \hat{f}(S) = \tau^{|S|} \rho^{|S|} \hat{f}(S) = \tau^{|S|} \widehat{T_\rho f}(S) = \widehat{T_\tau T_\rho f}(S) \]
and therefore $\widehat{T_{\tau\rho} f} = \widehat{T_\tau T_\rho f}$. This immediately gives $T_{\tau\rho} f = T_\tau T_\rho f$ for all Boolean functions $f$ and hence
\[ T_{\tau\rho} = T_\tau \circ T_\rho \]
as was claimed.

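Lemma 2.2.6. The noise operator is a Hermitian operator for every $\rho \in \mathbb{R}$.

Proof. Let $f$, $g$ be Boolean functions. Then for every $\rho \in \mathbb{R}$:
\[ \begin{aligned}
\langle T_\rho f \mid g \rangle &= \Big\langle \sum_{S \subseteq [n]} \rho^{|S|} \hat{f}(S) \chi_S \,\Big|\, \sum_{T \subseteq [n]} \hat{g}(T) \chi_T \Big\rangle = \sum_{S, T \subseteq [n]} \rho^{|S|} \hat{f}(S) \hat{g}(T) \langle \chi_S \mid \chi_T \rangle \\
&= \sum_{S \subseteq [n]} \rho^{|S|} \hat{f}(S) \hat{g}(S) \\
&= \sum_{S, T \subseteq [n]} \hat{f}(S) \rho^{|T|} \hat{g}(T) \langle \chi_S \mid \chi_T \rangle = \Big\langle \sum_{S \subseteq [n]} \hat{f}(S) \chi_S \,\Big|\, \sum_{T \subseteq [n]} \rho^{|T|} \hat{g}(T) \chi_T \Big\rangle = \langle f \mid T_\rho g \rangle
\end{aligned} \]
using linearity of the inner product and that $\langle \chi_S \mid \chi_T \rangle = \delta_{S,T}$.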

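Theorem 2.2.7. [Contraction theorem (Exercise 2.33 in [6])] $T_\rho$ is a contraction on $L^p(\{\pm 1\}^n)$ for $p \ge 1$ and $\rho \in [-1, 1]$, i.e. for all Boolean functions $f$:
\[ \|T_\rho f\|_p \le \|f\|_p. \]

Proof. Writing out one gets
\[ \|T_\rho f\|_p^p = \mathbb{E}_{\boldsymbol{x} \sim \{\pm 1\}^n}\big[ |T_\rho f(\boldsymbol{x})|^p \big] = \mathbb{E}_{\boldsymbol{x}}\Big[ \big| \mathbb{E}_{\boldsymbol{y} \sim N_\rho(\boldsymbol{x})}[f(\boldsymbol{y})] \big|^p \Big]. \tag{2.6} \]
Since $t \mapsto |t|^p$ is convex for $p \ge 1$, one can apply Jensen’s inequality (Theorem 1.1.4) to equation 2.6 and then switch the random variables around:
\begin{align}
\mathbb{E}_{\boldsymbol{x}}\Big[ \big| \mathbb{E}_{\boldsymbol{y} \sim N_\rho(\boldsymbol{x})}[f(\boldsymbol{y})] \big|^p \Big] \le \mathbb{E}_{\boldsymbol{x}}\Big[ \mathbb{E}_{\boldsymbol{y} \sim N_\rho(\boldsymbol{x})}\big[ |f(\boldsymbol{y})|^p \big] \Big] &= \mathbb{E}_{(\boldsymbol{x}, \boldsymbol{y})\ \rho\text{-correlated}}\big[ |f(\boldsymbol{y})|^p \big] \notag \\
&= \mathbb{E}_{\boldsymbol{y}}\Big[ \mathbb{E}_{\boldsymbol{x} \sim N_\rho(\boldsymbol{y})}\big[ |f(\boldsymbol{y})|^p \big] \Big]. \tag{2.7}
\end{align}
Since $\boldsymbol{x}$ does not appear in the argument of the inner expectation in equation 2.7, the proof concludes with
\[ \mathbb{E}_{\boldsymbol{y}}\Big[ \mathbb{E}_{\boldsymbol{x} \sim N_\rho(\boldsymbol{y})}\big[ |f(\boldsymbol{y})|^p \big] \Big] = \mathbb{E}_{\boldsymbol{y}}\big[ |f(\boldsymbol{y})|^p \big] = \|f\|_p^p. \]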

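Definition 2.2.8. [Noise stability (Definition 2.42 and Fact 2.48 in [6])] Let $f$ be a Boolean function, $\rho \in \mathbb{R}$. Then the noise stability of $f$ at $\rho$ is defined as
\[ \mathrm{Stab}_\rho[f] = \langle T_\rho f \mid f \rangle. \]

Proposition 2.2.9. [Stochastic interpretation of noise stability (Definition 2.42 and Fact 2.48 in [6])] Let $f$ be a Boolean function, and $\rho \in [-1, 1]$. Then
\[ \mathrm{Stab}_\rho[f] = \mathbb{E}_{(\boldsymbol{x}, \boldsymbol{y})\ \rho\text{-correlated}}\big[ f(\boldsymbol{x}) f(\boldsymbol{y}) \big]. \]

Proof. Writing out the noise stability gives
\[ \mathrm{Stab}_\rho[f] = \langle T_\rho f \mid f \rangle = \mathbb{E}_{\boldsymbol{y}}[T_\rho f(\boldsymbol{y}) f(\boldsymbol{y})] = \mathbb{E}_{\boldsymbol{y}}\Big[ \mathbb{E}_{\boldsymbol{x} \sim N_\rho(\boldsymbol{y})}[f(\boldsymbol{x})] f(\boldsymbol{y}) \Big] = \mathbb{E}_{(\boldsymbol{x}, \boldsymbol{y})\ \rho\text{-correlated}}\big[ f(\boldsymbol{x}) f(\boldsymbol{y}) \big], \]
as was claimed.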
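As an illustration, the noise stability of $\mathrm{Maj}_3$ can be computed in two ways. Combining Definition 2.2.1 with Plancherel’s identity gives $\mathrm{Stab}_\rho[f] = \sum_{S} \rho^{|S|} \hat{f}(S)^2$ (a consequence not spelled out above), and Proposition 2.2.9 gives the expectation over ρ-correlated pairs. The sketch below evaluates both exactly; the helper names and the value $\rho = 0.5$ are hypothetical.

```python
from itertools import product, combinations
from math import prod, isclose

def fourier_transform(f, n):
    """f: dict x -> f(x) on {±1}^n; returns dict S -> <f | chi_S>."""
    subsets = [S for k in range(n + 1) for S in combinations(range(n), k)]
    return {S: sum(v * prod(x[i] for i in S) for x, v in f.items()) / 2 ** n
            for S in subsets}

def stab_fourier(f, n, rho):
    """<T_rho f | f> evaluated on the Fourier side: sum_S rho^|S| hat f(S)^2."""
    return sum(rho ** len(S) * c * c for S, c in fourier_transform(f, n).items())

def stab_correlated(f, n, rho):
    """E[f(x) f(y)] over rho-correlated pairs, with exact pair probabilities."""
    total = 0.0
    for x in f:
        for y in f:
            weight = prod((1 + rho) / 2 if a == b else (1 - rho) / 2
                          for a, b in zip(x, y))
            total += weight * f[x] * f[y]
    return total / 2 ** n   # x is uniform on {±1}^n

n, rho = 3, 0.5
cube = list(product((1, -1), repeat=n))
maj3 = {x: (1 if sum(x) > 0 else -1) for x in cube}

print(stab_fourier(maj3, n, rho), stab_correlated(maj3, n, rho))
```

For $\mathrm{Maj}_3$ the Fourier side reduces to $\frac{3}{4}\rho + \frac{1}{4}\rho^3$, using the coefficients from its Fourier expansion.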

2.3 Hypercontractivity

This section follows parts of chapter 9 and 10 of O’Donnell [6].

It has been shown that $T_\rho$ is a contraction on $L^p(\{\pm 1\}^n)$ for $p \ge 1$. The goal of this section is to prove a stronger statement, namely that $T_\rho$ is a hypercontraction from $L^p(\{\pm 1\}^n)$ to $L^q(\{\pm 1\}^n)$ for certain $p$ and $q$.

Let $\mu(A) = \mathbb{E}[1_A] = \frac{|A|}{2^n}$ be the uniform probability measure on $\mathbb{F}_2^n$, in Definition 1.1.2. For $f \in L^p(\{\pm 1\}^n)$ this gives in particular
\[ \|f\|_p = \Big( \int_{\{\pm 1\}^n} |f|^p \, d\mu \Big)^{1/p} = \Big( \sum_{x \in \{\pm 1\}^n} |f(x)|^p \frac{1}{2^n} \Big)^{1/p} = \mathbb{E}_{\boldsymbol{x} \sim \{\pm 1\}^n}\big[ |f(\boldsymbol{x})|^p \big]^{1/p}. \]

Theorem 2.3.1. [Bonami-Beckner inequality, Hypercontractivity theorem (page 247, 284 in [6])] Let $f : \{\pm 1\}^n \to \mathbb{R}$, $1 \le p \le q \le \infty$ and $0 \le \rho \le \big( \frac{p-1}{q-1} \big)^{1/2}$. Then $f$ is $(p, q, \rho)$-hypercontractive, i.e.
\[ \|T_\rho f\|_q \le \|f\|_p. \]

There are two border cases of this inequality worth describing.


• The case $\rho = 1$ implies $1 = \rho \le \big( \frac{p-1}{q-1} \big)^{1/2} \le \big( \frac{q-1}{q-1} \big)^{1/2} = 1 \iff \frac{p-1}{q-1} = 1 \iff p = q$, for $q \ne 1$. Since $T_1 = \mathrm{Id}$ the inequality then states $\|f\|_p \le \|f\|_p$, which is trivially valid.

• The other case, $q = 1$, implies $1 \le p \le q = 1$ and therefore $p = 1 = q$. The condition on $\rho$ is now not well-defined, but it is already proven that $T_\rho$ is a contraction (Theorem 2.2.7), which exactly proves the Bonami-Beckner inequality for $p = q$.

Both cases therefore need not be considered during the chapter.

Considering $T_\rho$ as a linear operator from $L^p(\{\pm 1\}^n)$ to $L^q(\{\pm 1\}^n)$, this inequality states that $T_\rho$ is a bounded operator, and therefore also continuous. The Bonami-Beckner inequality implies the operator norm can be bounded with $\|T_\rho\| \le 1$. Sharpness is achieved in $T_\rho \chi_\emptyset = \chi_\emptyset$. In particular this implies $\|T_\rho\| = 1$.
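The inequality can be observed numerically on small examples. The sketch below is an illustration only, not a proof: it evaluates $T_\rho f$ through the stochastic interpretation (Theorem 2.2.3) for hypothetical random functions on $\{\pm 1\}^3$ and checks $\|T_\rho f\|_q \le \|f\|_p$ at the extremal $\rho = \big( \frac{p-1}{q-1} \big)^{1/2}$. The helper names, the seed, the tolerance and the exponent pairs are assumptions.

```python
import random
from itertools import product
from math import prod, sqrt

def norm(f, p):
    """p-norm under the uniform probability measure on the domain of f."""
    return (sum(abs(v) ** p for v in f.values()) / len(f)) ** (1 / p)

def noise(f, rho):
    """T_rho f, computed as E_{y ~ N_rho(x)}[f(y)] with exact probabilities."""
    return {x: sum(prod((1 + rho) / 2 if a == b else (1 - rho) / 2
                        for a, b in zip(x, y)) * f[y] for y in f)
            for x in f}

def check_hypercontractive(f, p, q, tol=1e-12):
    """Check ||T_rho f||_q <= ||f||_p at rho = sqrt((p - 1) / (q - 1))."""
    rho = sqrt((p - 1) / (q - 1))
    return norm(noise(f, rho), q) <= norm(f, p) + tol

random.seed(2)
n = 3
cube = list(product((1, -1), repeat=n))
trials = [{x: random.uniform(-1, 1) for x in cube} for _ in range(50)]

print(all(check_hypercontractive(f, 2, 4) for f in trials))
print(all(check_hypercontractive(f, 1.5, 3) for f in trials))
```

The constant function (a multiple of $\chi_\emptyset$) is left unchanged by $T_\rho$, so it attains equality, in line with the sharpness remark above.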

The proof of this inequality consists of the following parts:

• Formulating a Two-function version of the Hypercontractivity theorem, and showing its equivalence with the Bonami-Beckner inequality.

• Reducing the theorem for $n = 1$ to a statement on uniform $\{\pm 1\}$-bits, for $1 \le p < q \le 2$, $\rho = \big( \frac{p-1}{q-1} \big)^{1/2}$.

• Showing that the Bonami-Beckner inequality holds for $n = 1$; this is also called the Two-Point inequality.

• Proving the general statement of the Two-Function version by induction on $n$.

We will start with the Two-function version of the Bonami-Beckner inequality. The reductions, the case $n = 1$ and the induction on $n$ are discussed in later sections.

Theorem 2.3.2. [Two-function Hypercontractivity Theorem (page 284 in [6])] Let $f, g : \{\pm 1\}^n \to \mathbb{R}$ be Boolean functions, and $r, s \ge 0$, $0 \le \rho \le (rs)^{1/2} \le 1$. Then
\[ \mathbb{E}_{(\boldsymbol{x}, \boldsymbol{y})\ \rho\text{-correlated}}\big[ f(\boldsymbol{x}) g(\boldsymbol{y}) \big] \le \|f\|_{1+r} \|g\|_{1+s}. \]

The following proposition and its corollary show equivalence of the Hypercontractivity theorems.

Proposition 2.3.3. [Proposition 10.4 in [6]] Let $1 \le p \le q \le \infty$, $0 \le \rho \le \big( \frac{p-1}{q-1} \big)^{1/2}$. We have for all Boolean functions $f, g$:
\[ \|T_\rho f\|_q \le \|f\|_p \iff \langle T_\rho f \mid g \rangle \le \|f\|_p \|g\|_{q'}, \]
where $q'$ is the Hölder conjugate of $q$.


Proof. Suppose that $\|T_\rho f\|_q \le \|f\|_p$ for every Boolean function $f$. Then for all Boolean functions $f$, $g$:
\[ \langle T_\rho f \mid g \rangle \le \|T_\rho f\|_q \|g\|_{q'} \le \|f\|_p \|g\|_{q'} \]
with the first inequality following by Hölder’s inequality and the second by assumption.

On the other hand, suppose that $\langle T_\rho f \mid g \rangle \le \|f\|_p \|g\|_{q'}$ for all Boolean functions $f$, $g$. Then for every Boolean function $f$ one has
\[ \|T_\rho f\|_q = \sup_{\|g\|_{q'} = 1} \langle T_\rho f \mid g \rangle \le \sup_{\|g\|_{q'} = 1} \|f\|_p \|g\|_{q'} = \|f\|_p, \]
where the first equality holds by sharpness of Hölder’s inequality and the inequality by assumption.

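Corollary 2.3.4. [Equivalence of Hypercontractivity theorems] The Hypercontractivity Theorem (Theorem 2.3.1) and Two-function Hypercontractivity Theorem (Theorem 2.3.2) are equivalent.

Proof. Let $p = 1 + r$ and $q' = 1 + s$ in the previous proposition. Similarly to Proposition 2.2.9 we have
\[ \mathbb{E}_{(\boldsymbol{x}, \boldsymbol{y})\ \rho\text{-correlated}}\big[ f(\boldsymbol{x}) g(\boldsymbol{y}) \big] = \mathbb{E}_{\boldsymbol{y}}\Big[ \mathbb{E}_{\boldsymbol{x} \sim N_\rho(\boldsymbol{y})}[f(\boldsymbol{x})] g(\boldsymbol{y}) \Big] = \mathbb{E}_{\boldsymbol{y}}[T_\rho f(\boldsymbol{y}) g(\boldsymbol{y})] = \langle T_\rho f \mid g \rangle. \]
Furthermore $\big( \frac{p-1}{q-1} \big)^{1/2} = ((p - 1)(q' - 1))^{1/2} = (rs)^{1/2}$, which shows that $\rho \le \big( \frac{p-1}{q-1} \big)^{1/2}$ if and only if $\rho \le (rs)^{1/2}$.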

2.3.1 Reductions

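Suppose $f : \{\pm 1\} \to \mathbb{R}$, i.e. $n = 1$. The Fourier expansion of $f$ is then given by $f = \hat{f}(\emptyset) \chi_\emptyset + \hat{f}(\{1\}) \chi_{\{1\}}$, which gives $f(x) = \hat{f}(\emptyset) + \hat{f}(\{1\}) x$. Therefore, write $f(x) = a + bx$ for $a, b \in \mathbb{R}$. Denoting $f$ like this also gives $T_\rho f(x) = a + \rho b x$. Since $x \in \{\pm 1\}$ only takes two values it is useful to think of the function $f : x \mapsto a + bx$ as the random variable $a + b\boldsymbol{x}$, for $\boldsymbol{x}$ uniformly chosen from $\{\pm 1\}$.

Hence hypercontractivity of $f$ can be expressed as hypercontractivity of the uniform $\{\pm 1\}$-bit $\boldsymbol{x}$, as follows:

Definition 2.3.5. Let $\boldsymbol{x}$ be a random variable uniformly chosen from $\{\pm 1\}$, i.e. $\boldsymbol{x} \sim \{\pm 1\}$. Define for any function $f$ in $\boldsymbol{x}$:
\[ \|f(\boldsymbol{x})\|_p := \|f\|_p = \mathbb{E}_{\boldsymbol{x}}\big[ |f(\boldsymbol{x})|^p \big]^{1/p}. \]
Let again $1 \le p \le q \le \infty$ and $0 \le \rho \le \big( \frac{p-1}{q-1} \big)^{1/2}$. The uniform $\{\pm 1\}$-bit $\boldsymbol{x}$ is now called $(p, q, \rho)$-hypercontractive if for all $a, b \in \mathbb{R}$, it holds that
\[ \|a + \rho b \boldsymbol{x}\|_q \le \|a + b \boldsymbol{x}\|_p. \]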


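It is essential to note that by definition the $\{\pm 1\}$-bit $\boldsymbol{x}$ is hypercontractive if and only if every function $f : x \mapsto a + bx$ is hypercontractive.

The following two lemmas are proven in the context of general Boolean functions, but applied for $n = 1$.

Lemma 2.3.6. Suppose $f : \{\pm 1\}^n \to \mathbb{R}$ is $(p, q, \rho)$-hypercontractive. Let $c \in \mathbb{R}$. Then $cf$ is also $(p, q, \rho)$-hypercontractive.

Proof. We have
\[ \|T_\rho cf\|_q = \|c \cdot T_\rho f\|_q = |c| \|T_\rho f\|_q \le |c| \|f\|_p = \|cf\|_p. \]
This proves the lemma.

Suppose hypercontractivity is proven for the $\{\pm 1\}$-bit $\boldsymbol{x}$ with $a = 1$. Using the previous lemma we then know that $\boldsymbol{x}$ is hypercontractive for any nonzero $a \in \mathbb{R}$.

In the case that $a = 0$ one is left to prove
\[ \|\rho b \boldsymbol{x}\|_q \le \|b \boldsymbol{x}\|_p \iff \rho \|\boldsymbol{x}\|_q \le \|\boldsymbol{x}\|_p \iff \rho \le \frac{\|\boldsymbol{x}\|_p}{\|\boldsymbol{x}\|_q} = 1, \]
where the last equality uses that $|\boldsymbol{x}| = 1$, so that $\|\boldsymbol{x}\|_p = \|\boldsymbol{x}\|_q = 1$ under the uniform probability measure. This holds since $\rho \le \big( \frac{p-1}{q-1} \big)^{1/2}$ and $p \le q$ implies $\big( \frac{p-1}{q-1} \big)^{1/2} \le 1$.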

Lemma 2.3.7. [Exercise 9.7 in [6]] Let $1 \le p \le q$, $0 \le \rho < 1$. Suppose that every non-negative Boolean function $f : \{\pm 1\}^n \to \mathbb{R}$, $f \ge 0$, is $(p, q, \rho)$-hypercontractive. Then the same holds for arbitrary Boolean functions $g : \{\pm 1\}^n \to \mathbb{R}$.

Proof. We have to prove that the Boolean function $g$ is $(p, q, \rho)$-hypercontractive, i.e. that $\|T_\rho g\|_q \le \|g\|_p$. We see that
\[ \|T_\rho g\|_q = \big\| |T_\rho g| \big\|_q \le \big\| T_\rho |g| \big\|_q \le \big\| |g| \big\|_p = \|g\|_p, \]
using Lemma 2.2.4 for the first inequality and the assumption with $f = |g| \ge 0$ for the second.

We already know it is sufficient to prove hypercontractivity of a single bit $\boldsymbol{x} \sim \{\pm 1\}$ (Definition 2.3.5) for $a = 1$. Now suppose this is proven also with $|b| \le 1$. Let $f : x \mapsto 1 + bx$. It holds for all $x \in \{\pm 1\}$ that $f(x) \ge 0$ if and only if $bx \ge -1$, exactly when $b \ge -1$ and $-b \ge -1$, i.e. $|b| \le 1$. Applying the previous lemma restricted to functions $f$ as defined, we then know that every function of this form is hypercontractive. Hence, for $1 \le p \le q \le \infty$ and $0 \le \rho \le \big( \frac{p-1}{q-1} \big)^{1/2}$, we know for the bit $\boldsymbol{x}$ that
\[ \|1 + \rho b \boldsymbol{x}\|_q \le \|1 + b \boldsymbol{x}\|_p \]
holds for every $b \in \mathbb{R}$, which was sufficient for proving hypercontractivity of it. Hence, from this point on, the inequality for $|b| \le 1$ is exactly what needs to be proven in the case $n = 1$.
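The reduced single-bit statement is simple enough to test on a grid. The following sketch evaluates both norms of Definition 2.3.5 in closed form for a uniform $\{\pm 1\}$-bit and checks $\|1 + \rho b \boldsymbol{x}\|_q \le \|1 + b \boldsymbol{x}\|_p$ for $|b| \le 1$ at $\rho = \big( \frac{p-1}{q-1} \big)^{1/2}$. The exponent pairs, the grid and the tolerance are hypothetical choices, and a numerical check of this kind does not replace the proof.

```python
from math import sqrt

def bit_norm(a, b, p):
    """p-norm of a + b*x for a uniform ±1 bit x: (E|a + b x|^p)^(1/p)."""
    return ((abs(a + b) ** p + abs(a - b) ** p) / 2) ** (1 / p)

def two_point_holds(p, q, b, tol=1e-12):
    """Check ||1 + rho b x||_q <= ||1 + b x||_p at rho = sqrt((p - 1) / (q - 1))."""
    rho = sqrt((p - 1) / (q - 1))
    return bit_norm(1, rho * b, q) <= bit_norm(1, b, p) + tol

grid = [k / 50 for k in range(-50, 51)]                    # b in [-1, 1]
pairs = [(1.2, 1.7), (1.5, 2.0), (2.0, 4.0), (1.1, 3.0)]   # hypothetical (p, q)

print(all(two_point_holds(p, q, b) for p, q in pairs for b in grid))
```

Choosing a larger $\rho$ breaks the inequality: for $p = 2$, $q = 4$ and $b = 0.1$, the value $\rho = 0.9$ already gives `bit_norm(1, 0.9 * 0.1, 4) > bit_norm(1, 0.1, 2)`.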

The following lemma shows that the Hypercontractivity theorem only needs to be proven for $\rho = \big( \frac{p-1}{q-1} \big)^{1/2}$ in the case of $n = 1$.


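Lemma 2.3.8. [Exercise 9.11 in [6]] Suppose that a uniform $\{\pm 1\}$-bit $\boldsymbol{x}$ is $(p, q, \rho)$-hypercontractive. Then $\boldsymbol{x}$ is also $(p, q, \tau)$-hypercontractive for all $0 \le \tau < \rho$.

Proof. Let $a \in \mathbb{R}$. Using Lemma 2.3.6 we prove hypercontractivity for $b = 1$. Since uniform $\{\pm 1\}$-bits have mean 0, we have
\[ |a| = |a + \mathbb{E}[\boldsymbol{x}]| = |\mathbb{E}[a + \boldsymbol{x}]| \le \mathbb{E}[|a + \boldsymbol{x}|] = \|a + \boldsymbol{x}\|_1 \le \|a + \boldsymbol{x}\|_q \tag{2.8} \]
with Jensen’s inequality and the fact that the q-norm with the uniform measure is monotonically increasing. Furthermore for $\rho < 1$ we have
\begin{align}
\|a + \rho \boldsymbol{x}\|_q &= \|(1 - \rho) a + \rho (a + \boldsymbol{x})\|_q \notag \\
&\le \|(1 - \rho) a\|_q + \|\rho (a + \boldsymbol{x})\|_q \notag \\
&= (1 - \rho) |a| + \rho \|a + \boldsymbol{x}\|_q && \text{using that } \rho < 1 \notag \\
&\le (1 - \rho) \|a + \boldsymbol{x}\|_q + \rho \|a + \boldsymbol{x}\|_q && \text{with equation 2.8} \notag \\
&= \|a + \boldsymbol{x}\|_q. \tag{2.9}
\end{align}
Finally, let $0 \le \tau < \rho$. Then
\begin{align*}
\|a + \tau \boldsymbol{x}\|_q &= \rho \Big\| \frac{a}{\rho} + \frac{\tau}{\rho} \boldsymbol{x} \Big\|_q \\
&\le \rho \Big\| \frac{a}{\rho} + \boldsymbol{x} \Big\|_q && \text{since } \tfrac{\tau}{\rho} < 1 \text{, using equation 2.9} \\
&= \|a + \rho \boldsymbol{x}\|_q \le \|a + \boldsymbol{x}\|_p && \text{since } \boldsymbol{x} \text{ is hypercontractive.}
\end{align*}
The conclusion is that $\boldsymbol{x}$ is $(p, q, \tau)$-hypercontractive for all $0 \le \tau < \rho$.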

The following two lemmas show that the Bonami-Beckner inequality needs only to beproven for 1 ≤ p ≤ q ≤ 2. Letting 2 ≤ q′, p′ ≤ ∞ be the Holder conjugates of p and q wethen have

q′ − 1 =1

q − 1≤ 1

p− 1= p′ − 1 =⇒ q′ ≤ p′.

Lemma 2.3.9. [Proposition 9.19 in [6]] Let f be a real-valued Boolean function. Supposewe prove (p, q, ρ)-hypercontractivity of f for 1 ≤ p ≤ q ≤ 2. Let 2 ≤ q′ ≤ p′ ≤ ∞ be theirHolder conjugates. Then f is also (q′, p′, ρ)-hypercontractive.

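Proof. We have $\|T_\rho f\|_q \le \|f\|_p$. Let $g$ be any real-valued Boolean function, then

\begin{align*}
\|T_\rho g\|_{p'} &= \sup_{\|f\|_p = 1} \langle f \mid T_\rho g \rangle = \sup_{\|f\|_p = 1} \langle T_\rho f \mid g \rangle \\
&\le \sup_{\|f\|_p = 1} \|T_\rho f\|_q \|g\|_{q'} \le \sup_{\|f\|_p = 1} \|f\|_p \|g\|_{q'} = \|g\|_{q'},
\end{align*}

using sharpness of Hölder’s inequality, that $T_\rho$ is Hermitian, Hölder’s inequality, and the assumption.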


Lemma 2.3.10. [Exercise 9.17 in [6]] Suppose that for every $p, q$ such that $p < 2 < q$ we have that a Boolean function $f$ is $(2, q, (q-1)^{-1/2})$- and $(p, 2, (p-1)^{1/2})$-hypercontractive. Then $f$ also is $\left(p, q, \left(\frac{p-1}{q-1}\right)^{1/2}\right)$-hypercontractive.

Proof. Define $\tau := (q-1)^{-1/2}$ and $\sigma := (p-1)^{1/2}$. We have for all Boolean functions $f$:

\[ \|T_\tau f\|_q \le \|f\|_2, \qquad \|T_\sigma f\|_2 \le \|f\|_p. \]

Let $\rho = \left(\frac{p-1}{q-1}\right)^{1/2} = \tau\sigma$. Then for all Boolean functions $f$:

\[ \|T_\rho f\|_q = \|T_{\tau\sigma} f\|_q = \|T_\tau T_\sigma f\|_q \le \|T_\sigma f\|_2 \le \|f\|_p. \]

2.3.2 Base case and induction

At this point we need to prove much less to actually prove the entire Bonami-Beckner inequality, or Hypercontractivity theorem. Proving the base case and doing induction on the Two-function Hypercontractivity theorem now suffices. First, we give a lemma that is needed for the base case:

Lemma 2.3.11. For $t \ge 0$, $0 \le \theta \le 1$ we have $(1 + t)^\theta \le 1 + \theta t$.

This lemma can be checked by taking derivatives in t.

Theorem 2.3.12. [Two-point inequality (page 286 in [6])] Suppose $1 \le p \le q \le \infty$, $0 \le \rho \le \left(\frac{p-1}{q-1}\right)^{1/2}$ and let $f : \{\pm 1\} \to \mathbb{R}$ be a function on a single bit. Then

\[ \|T_\rho f\|_q \le \|f\|_p. \]

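Proof. Noting that the proof is trivial for $\rho = 1$, and using the reductions from the previous section (2.3.1), one needs to prove for $1 \le p < q \le 2$ (since $T_\rho$ is a contraction (Theorem 2.2.7) and the norms can be “flipped” (Lemma 2.3.9)), a uniform $\{\pm 1\}$-bit $x$, $\rho = \left(\frac{p-1}{q-1}\right)^{1/2}$ and $|\varepsilon| < 1$ that

\[ \|1 + \rho\varepsilon x\|_q \le \|1 + \varepsilon x\|_p. \]

Expanding this one gets

\[ \|1 + \rho\varepsilon x\|_q^p \le \|1 + \varepsilon x\|_p^p \iff \mathbb{E}_x[(1 + \rho\varepsilon x)^q]^{p/q} \le \mathbb{E}_x[(1 + \varepsilon x)^p], \]

dropping the absolute values since $d\varepsilon x > -1$ for $d = \rho$ or $d = 1$. Now write out the expected values, with $x \sim \{\pm 1\}$:

\[ \left( \tfrac{1}{2}(1 + \rho\varepsilon)^q + \tfrac{1}{2}(1 - \rho\varepsilon)^q \right)^{p/q} \le \tfrac{1}{2}(1 + \varepsilon)^p + \tfrac{1}{2}(1 - \varepsilon)^p. \]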


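Since $|\varepsilon| < 1$, the Generalized Binomial Theorem can be applied to obtain

\[ \left( \frac{1}{2} \sum_{k \ge 0} \binom{q}{k} (\rho\varepsilon)^k + \frac{1}{2} \sum_{k \ge 0} \binom{q}{k} (-\rho\varepsilon)^k \right)^{p/q} \le \frac{1}{2} \sum_{k \ge 0} \binom{p}{k} \varepsilon^k + \frac{1}{2} \sum_{k \ge 0} \binom{p}{k} (-\varepsilon)^k, \]

in which all odd terms cancel out to give

\[ \left( 1 + \sum_{l \ge 1} \binom{q}{2l} (\rho\varepsilon)^{2l} \right)^{p/q} \le 1 + \sum_{l \ge 1} \binom{p}{2l} \varepsilon^{2l}. \tag{2.10} \]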

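Given $1 \le q \le 2$ and since

\[ \binom{q}{2l} = \frac{1}{(2l)!} \prod_{j=0}^{2l-1} (q - j) = \frac{1}{(2l)!}\, q(q-1) \prod_{j=2}^{2l-1} (j - q) \ge 0, \]

with Lemma 2.3.11 one has

\[ \left( 1 + \sum_{l \ge 1} \binom{q}{2l} (\rho\varepsilon)^{2l} \right)^{p/q} \le 1 + \frac{p}{q} \sum_{l \ge 1} \binom{q}{2l} (\rho\varepsilon)^{2l} = 1 + \sum_{l \ge 1} \frac{p}{q} \left( \frac{p-1}{q-1} \right)^{l} \binom{q}{2l} \varepsilon^{2l}, \]

which means it is sufficient to prove

\[ \sum_{l \ge 1} \frac{p}{q} \left( \frac{p-1}{q-1} \right)^{l} \binom{q}{2l} \varepsilon^{2l} \le \sum_{l \ge 1} \binom{p}{2l} \varepsilon^{2l}. \]

Comparing terms, it therefore suffices to prove for each $l$ that

\[ \frac{p}{q} \left( \frac{p-1}{q-1} \right)^{l} \binom{q}{2l} \le \binom{p}{2l}. \tag{2.11} \]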

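Now note that for $p = 1$ it has to hold that $0 \le \binom{1}{2l} = 0$. Therefore, we can assume from this point on that $p \ne 1$. Equation 2.11 can now be written out as

\begin{align*}
& \frac{p}{q} \left( \frac{p-1}{q-1} \right)^{l} \frac{1}{(2l)!}\, q(q-1) \prod_{j=2}^{2l-1} (j - q) \le \frac{1}{(2l)!}\, p(p-1) \prod_{j=2}^{2l-1} (j - p) \\
\iff{} & \left( \frac{p-1}{q-1} \right)^{l-1} \prod_{j=2}^{2l-1} (j - q) \le \prod_{j=2}^{2l-1} (j - p) \\
\iff{} & \prod_{j=2}^{2l-1} \frac{j - q}{(q-1)^{1/2}} \le \prod_{j=2}^{2l-1} \frac{j - p}{(p-1)^{1/2}}.
\end{align*}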

Noting that for $j \ge 2$, $r > 1$ one has

\[ \frac{d}{dr} \frac{j - r}{(r-1)^{1/2}} = \frac{-(r-1)^{1/2} - \frac{1}{2}(r-1)^{-1/2}(j - r)}{r - 1} = -\frac{2r - 2 + j - r}{2(r-1)^{3/2}} = -\frac{r + j - 2}{2(r-1)^{3/2}} < 0, \]


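it follows that $\frac{j - r}{(r-1)^{1/2}}$ decreases in $r$. Therefore $p < q$ implies for each $j \ge 2$:

\[ \frac{j - q}{(q-1)^{1/2}} \le \frac{j - p}{(p-1)^{1/2}}. \]

Since $j \ge 2 \ge q > p$, all factors on both sides are non-negative, so these inequalities may be multiplied over $j = 2, \ldots, 2l-1$,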

which concludes the proof.
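As a numerical sanity check of the Two-point inequality (not part of the proof), the following Python sketch evaluates $\|1 + \varepsilon x\|_p - \|1 + \rho\varepsilon x\|_q$ at the critical value $\rho = ((p-1)/(q-1))^{1/2}$ on a grid of exponents $1 \le p < q \le 2$ and values $|\varepsilon| < 1$. The helper names `norm` and `two_point_gap` and the chosen grid are assumptions made for illustration only.

```python
import itertools

def norm(values, r):
    """r-norm of a function on a uniform bit, given by its two values."""
    return (sum(abs(v) ** r for v in values) / len(values)) ** (1.0 / r)

def two_point_gap(p, q, eps, rho=None):
    """Return ||1 + eps*x||_p - ||1 + rho*eps*x||_q for a uniform bit x."""
    if rho is None:
        rho = ((p - 1) / (q - 1)) ** 0.5  # critical noise rate of the theorem
    lhs = norm([1 + rho * eps, 1 - rho * eps], q)
    rhs = norm([1 + eps, 1 - eps], p)
    return rhs - lhs

grid = [i / 20 for i in range(-19, 20)]      # eps ranges over (-1, 1)
exponents = [1.0, 1.2, 1.5, 1.8, 2.0]
worst = min(
    two_point_gap(p, q, eps)
    for p, q in itertools.product(exponents, exponents) if p < q
    for eps in grid
)
print(worst)  # should be non-negative up to floating-point rounding
```

A grid check of this kind can of course only illustrate the inequality; it does not replace the term-by-term comparison carried out above.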

Now induction is done for the Two-function Hypercontractivity theorem. Doing induction on the Two-point inequality to derive the Bonami-Beckner inequality directly is possible, but needs more concepts on Boolean functions than described in this thesis. See also Remark 10.5 in O’Donnell [6].

Theorem 2.3.13. [Two-function Hypercontractivity Induction Theorem (page 261 in [6])] Suppose that the base case of the Two-function Hypercontractivity theorem holds. Then it holds for all $n$.

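Proof. For $n > 1$, let $f, g : \{\pm 1\}^n \to \mathbb{R}$ be Boolean functions. Choose $(x, y)$ $\rho$-correlated. Denote $x' = (x_i)_{i=1}^{n-1}$ and $x = (x', x_n)$, and similarly for $y$. This makes both $(x', y')$ and $(x_n, y_n)$ $\rho$-correlated pairs by definition. Also denote $f_{x_n} = f_{[n-1] \mid x_n}$ for the restriction of $f$ with the last coordinate fixed to $x_n$; similarly for $g$.

Then

\[ \mathbb{E}_{(x,y)\ \rho\text{-correlated}}\big[f(x) g(y)\big] = \mathbb{E}_{(x_n, y_n)}\Big[ \mathbb{E}_{(x', y')}\big[f_{x_n}(x')\, g_{y_n}(y')\big] \Big] \le \mathbb{E}_{(x_n, y_n)}\big[\|f_{x_n}\|_p \|g_{y_n}\|_q\big], \]

where the induction hypothesis is used for the inequality. Define $F(x_n) := \|f_{x_n}\|_p$, $G(y_n) := \|g_{y_n}\|_q$, then

\[ \mathbb{E}_{(x_n, y_n)}\big[\|f_{x_n}\|_p \|g_{y_n}\|_q\big] = \mathbb{E}_{(x_n, y_n)}\big[F(x_n) G(y_n)\big] \le \|F\|_p \|G\|_q, \]

where the inequality follows by the base case. Writing out one gets

\[ \|F\|_p^p = \mathbb{E}_{x_n}\big[|F(x_n)|^p\big] = \mathbb{E}_{x_n}\big[\|f_{x_n}\|_p^p\big] = \mathbb{E}_{x_n}\big[\mathbb{E}_{x'}|f_{x_n}(x')|^p\big] = \mathbb{E}_{x}\big[|f(x)|^p\big] = \|f\|_p^p \]

and similarly for $G$, which implies

\[ \mathbb{E}_{(x,y)\ \rho\text{-correlated}}\big[f(x) g(y)\big] \le \|f\|_p \|g\|_q. \]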

By applying the equivalence of the Hypercontractivity theorems for one and two functions (Proposition 2.3.3) on the Two-point inequality (Theorem 2.3.12), applying induction by the previous theorem, and again noting equivalence of the Hypercontractivity theorems, the Bonami-Beckner inequality is proven.

As an addendum, the following proposition shows that one cannot weaken the conditions on the Bonami-Beckner inequality.


Proposition 2.3.14. [Exercise 9.10b in [6]] $\|1 + \rho\varepsilon x\|_q \le \|1 + \varepsilon x\|_p$ implies $\rho \le \left(\frac{p-1}{q-1}\right)^{1/2}$; in particular, the Bonami-Beckner inequality cannot improve on this bound.

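Proof. Starting with the expansion in equation 2.10 in the proof of Theorem 2.3.12, one has

\[ \|1 + \rho\varepsilon x\|_q \le \|1 + \varepsilon x\|_p \implies \left( 1 + \sum_{l \ge 1} \binom{q}{2l} (\rho\varepsilon)^{2l} \right)^{p/q} \le 1 + \sum_{l \ge 1} \binom{p}{2l} \varepsilon^{2l}. \]

Now cut off both series to their second-order expansions, with corresponding remainder terms expressed using asymptotic notation, to get

\[ \left( 1 + \frac{q(q-1)}{2} \rho^2 \varepsilon^2 + O(\varepsilon^4) \right)^{p/q} \le 1 + \frac{p(p-1)}{2} \varepsilon^2 + O(\varepsilon^4). \]

We now again want to take a second-order expansion to eliminate the power taken on the left-hand side. For this one can use the Generalized Binomial Theorem, or note that with $g(\varepsilon) = \frac{q(q-1)}{2} \rho^2 \varepsilon^2 + O(\varepsilon^4)$, one has $g'(0) = 0$ and therefore, in the Taylor expansion the second term is of the second order and the third term is of the fourth order. Hence one has for $\varepsilon$ small enough:

\[ 1 + \frac{p}{q} \cdot \frac{q(q-1)}{2} \rho^2 \varepsilon^2 + O(\varepsilon^4) \le 1 + \frac{p(p-1)}{2} \varepsilon^2 + O(\varepsilon^4). \]

Writing this out gives

\begin{align*}
& (q-1) \rho^2 \varepsilon^2 + O(\varepsilon^4) \le (p-1) \varepsilon^2 + O(\varepsilon^4) \\
\implies{} & (q-1) \rho^2 + O(\varepsilon^2) \le p - 1 + O(\varepsilon^2) \\
\implies{} & (q-1) \rho^2 \le p - 1 \implies \rho \le \left( \frac{p-1}{q-1} \right)^{1/2},
\end{align*}

letting $\varepsilon \to 0$.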
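The sharpness statement can also be observed numerically. The following Python sketch, with hypothetical helper names and the arbitrarily chosen parameters $p = 1.5$, $q = 2$, $\varepsilon = 0.1$, compares the two norms once at the critical $\rho = ((p-1)/(q-1))^{1/2}$ and once at a slightly larger $\rho$; under these assumptions the inequality should hold in the first case and fail in the second.

```python
def norm(values, r):
    """r-norm of a function on a uniform bit, given by its two values."""
    return (sum(abs(v) ** r for v in values) / len(values)) ** (1.0 / r)

def gap(p, q, rho, eps):
    """||1 + eps*x||_p - ||1 + rho*eps*x||_q; negative means the inequality fails."""
    return norm([1 + eps, 1 - eps], p) - norm([1 + rho * eps, 1 - rho * eps], q)

p, q, eps = 1.5, 2.0, 0.1
rho_critical = ((p - 1) / (q - 1)) ** 0.5   # about 0.7071
gap_critical = gap(p, q, rho_critical, eps)
gap_above = gap(p, q, 0.75, eps)            # rho slightly above the critical value
print(gap_critical, gap_above)
```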

2.3.3 Small-set expansion theorem, Level-k inequalities, and Chang’s lemma

In this section sets $A \subseteq \{\pm 1\}^n$ are considered, with the exception of Chang’s lemma.

We now are going to prove a theorem that touches upon the idea of the Hamming cube being a small-set expander, namely that most of the “weight” of a subset of the Hamming cube lies at its boundary. More about these ideas is explained in O’Donnell [6].

Theorem 2.3.15. [Small-set expansion theorem (page 264 in [6])] Suppose $A \subseteq \{\pm 1\}^n$ has volume $\alpha \in [0, 1]$, i.e. $\mathbb{E}[1_A] = \alpha$. Then for all $0 \le \tau \le 1$ it holds that

\[ \mathrm{Stab}_\tau[1_A] \le \alpha^{\frac{2}{1+\tau}}. \]


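Proof. By definition, Lemma 2.2.5, Lemma 2.2.6 and non-negativity of $\tau$ we have $\mathrm{Stab}_\tau[1_A] = \langle T_\tau 1_A \mid 1_A \rangle = \langle T_{\sqrt{\tau}} 1_A \mid T_{\sqrt{\tau}} 1_A \rangle = \|T_{\sqrt{\tau}} 1_A\|_2^2$. Furthermore, by the Bonami-Beckner inequality (Theorem 2.3.1) with $q = 2$, $p - 1 = \rho^2 = \tau$ it holds that $\|T_{\sqrt{\tau}} 1_A\|_2 \le \|1_A\|_{\tau+1}$. Combining these results gives

\[ \mathrm{Stab}_\tau[1_A] = \|T_{\sqrt{\tau}} 1_A\|_2^2 \le \|1_A\|_{\tau+1}^2 = \mathbb{E}\big[|1_A|^{\tau+1}\big]^{\frac{2}{\tau+1}} = \mathbb{E}[1_A]^{\frac{2}{\tau+1}} = \alpha^{\frac{2}{\tau+1}}, \]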

which concludes the proof.

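In particular, given the assumptions of the theorem just proved, with Proposition 2.2.9 we have

\[ \mathrm{Stab}_\rho[1_A] = \mathbb{E}_{(x,y)\ \rho\text{-correlated}}\big[1_A(x) 1_A(y)\big] = \mathbb{E}\big[1_{A^2}(x, y)\big] = \mathbb{P}\big[(x, y) \in A^2\big] \le \alpha^{\frac{2}{1+\rho}}. \]

This bounds the probability that both $x$ and its noisy copy $y$ lie in $A$. Dividing by $\alpha = \mathbb{P}[x \in A]$ shows that for $x$ chosen uniformly from $A$, the probability of staying inside $A$ when applying noise is at most $\alpha^{\frac{1-\rho}{1+\rho}}$, which for fixed $\rho < 1$ tends to zero as the volume of $A$ decreases.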
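The Small-set expansion theorem can be checked by brute force on a small cube. The sketch below is an illustration only: it assumes the Fourier expression $\mathrm{Stab}_\rho[f] = \sum_S \rho^{|S|} \widehat{f}(S)^2$, takes $n = 4$, and samples random subsets $A$; the function names `fourier_coefficients` and `stability` are hypothetical.

```python
import itertools
import random

def fourier_coefficients(f, n):
    """Brute-force Fourier transform of f: {-1,1}^n -> R, indexed by indicator vectors of S."""
    points = list(itertools.product([1, -1], repeat=n))
    coeffs = {}
    for S in itertools.product([0, 1], repeat=n):
        total = 0.0
        for x in points:
            chi = 1
            for xi, si in zip(x, S):
                if si:
                    chi *= xi          # chi_S(x) is the product of x_i over i in S
            total += f[x] * chi
        coeffs[S] = total / len(points)
    return coeffs

def stability(f, n, rho):
    """Stab_rho[f] computed as sum_S rho^{|S|} * fhat(S)^2."""
    return sum(rho ** sum(S) * c * c for S, c in fourier_coefficients(f, n).items())

random.seed(0)
n = 4
points = list(itertools.product([1, -1], repeat=n))
violations = 0
for _ in range(50):
    A = set(random.sample(points, random.randint(1, len(points))))
    f = {x: 1.0 if x in A else 0.0 for x in points}
    alpha = len(A) / len(points)
    for rho in (0.0, 0.25, 0.5, 0.75, 1.0):
        if stability(f, n, rho) > alpha ** (2 / (1 + rho)) + 1e-12:
            violations += 1
print(violations)
```

The tolerance `1e-12` only absorbs floating-point rounding in the boundary cases $\rho = 0$ and $\rho = 1$, where the bound holds with equality.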

In the following two theorems, asymptotic statements are made that bound the Fourier weight up until a certain level $k$.

Theorem 2.3.16 (Level-k inequalities (page 264 in [6])). Suppose a non-empty set $A \subseteq \{\pm 1\}^n$ has volume $\alpha \in (0, 1]$, i.e. $\mathbb{E}[1_A] = \alpha$. Furthermore take $k \in \mathbb{Z}_{>0}$ such that $k \le 2 \ln(\alpha^{-1})$. Then

\[ W^{\le k}[1_A] \le \left( \frac{2e}{k} \ln(\alpha^{-1}) \right)^k \alpha^2. \]

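Proof. Let $0 < \rho \le 1$. Writing out the Fourier weights of $1_A$ at degrees at most $k$:

\begin{align*}
\rho^k W^{\le k}[1_A] &= \rho^k \sum_{|S| \le k} \widehat{1_A}(S)^2 = \sum_{|S| \le k} \rho^k\, \widehat{1_A}(S)^2 \\
&\le \sum_{|S| \le k} \rho^{|S|}\, \widehat{1_A}(S)^2 \\
&\le \sum_{S \subseteq [n]} \rho^{|S|}\, \widehat{1_A}(S)^2 = \Big\langle \sum_{S \subseteq [n]} \rho^{|S|}\, \widehat{1_A}(S) \chi_S \;\Big|\; \sum_{T \subseteq [n]} \widehat{1_A}(T) \chi_T \Big\rangle \\
&= \langle T_\rho 1_A \mid 1_A \rangle = \mathrm{Stab}_\rho[1_A] \le \alpha^{\frac{2}{1+\rho}} && \text{by the Small-Set Expansion Theorem.}
\end{align*}

When $\rho = 1$, one has $\alpha^{\frac{2}{1+\rho}} = \alpha \le 1 = \alpha^{2(1-\rho)}$. Now assume $\rho < 1$. Then $\frac{1}{1+\rho} = \sum_{n \ge 0} (-\rho)^n = 1 - \rho + \rho^2 - \rho^3 + \ldots$, and $\rho^{2m} \ge \rho^{2m+1}$ implies $\rho^{2m} - \rho^{2m+1} \ge 0$ for every $m$; hence all terms higher than first order can be estimated away to give the inequality $\frac{1}{1+\rho} \ge 1 - \rho$. Therefore, since $\alpha \le 1$,

\[ W^{\le k}[1_A] \le \rho^{-k} \alpha^{\frac{2}{1+\rho}} \le \rho^{-k} \alpha^{2(1-\rho)} \tag{2.12} \]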


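holds for each $0 < \rho \le 1$. We now want to minimize the right-hand side. Setting the derivative of the right side with respect to $\rho$ to zero gives

\begin{align*}
& -k \rho^{-k-1} \alpha^{2(1-\rho)} + \rho^{-k} \cdot \big({-2 \ln(\alpha)}\big)\, \alpha^{2(1-\rho)} = 0 \\
\iff{} & -k \rho^{-k-1} + \rho^{-k} \cdot \big({-2 \ln(\alpha)}\big) = 0 \\
\iff{} & k + 2\rho \ln(\alpha) = 0 \iff \rho = \frac{k}{2 \ln(\alpha^{-1})} \le \frac{k}{k} = 1,
\end{align*}

so that $\rho$ satisfies the conditions of the Small-Set Expansion Theorem. Substituting this value in the right-hand side of equation 2.12 gives

\[ \rho^{-k} \alpha^{2(1-\rho)} = \left( \frac{2}{k} \ln(\alpha^{-1}) \right)^k \alpha^{2\left(1 - \frac{k}{2 \ln(\alpha^{-1})}\right)} = \left( \frac{2e}{k} \ln(\alpha^{-1}) \right)^k \alpha^2, \]

since $\alpha^{-\frac{k}{\ln(\alpha^{-1})}} = \alpha^{\frac{k}{\ln(\alpha)}} = e^{\frac{k}{\ln(\alpha)} \cdot \ln(\alpha)} = e^k$. We conclude $W^{\le k}[1_A] \le \left( \frac{2e}{k} \ln(\alpha^{-1}) \right)^k \alpha^2$.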
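To get a feeling for the Level-k bound, one can evaluate it on subcubes, where the Fourier weights are known in closed form. The sketch below assumes the subcube $A = \{x : x_1 = \ldots = x_c = 1\}$ of volume $\alpha = 2^{-c}$; since $1_A = \prod_{i \le c} (1 + x_i)/2$, every $S \subseteq [c]$ has coefficient $2^{-c}$ and all other coefficients vanish. The function names are hypothetical and the choice of test sets is ours, not the thesis'.

```python
import math

def level_k_weight_subcube(c, k):
    """W^{<=k}[1_A] for the subcube A = {x : x_1 = ... = x_c = 1}.

    Every S inside [c] has Fourier coefficient 2^-c, all others are zero,
    so the weight up to level k is alpha^2 times the number of such S with |S| <= k.
    """
    alpha = 2.0 ** -c
    return alpha ** 2 * sum(math.comb(c, j) for j in range(k + 1))

def level_k_bound(alpha, k):
    """Right-hand side ((2e/k) ln(1/alpha))^k alpha^2 of the Level-k inequality."""
    return (2 * math.e / k * math.log(1 / alpha)) ** k * alpha ** 2

checks = []
for c in range(1, 13):
    alpha = 2.0 ** -c
    k_max = int(2 * math.log(1 / alpha))     # the theorem requires k <= 2 ln(1/alpha)
    for k in range(1, k_max + 1):
        checks.append(level_k_weight_subcube(c, k) <= level_k_bound(alpha, k))
print(all(checks), len(checks))
```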

Theorem 2.3.17 (Sharp form of the Level-1 inequality (Exercise 9.18 in [6])). Suppose a non-empty set $A \subseteq \{\pm 1\}^n$ has volume $\alpha \in (0, 1]$, i.e. $\mathbb{E}[1_A] = \alpha$. Then

\[ W^{=1}[1_A] \le 2 \ln(\alpha^{-1}) \alpha^2. \]

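Proof. Let $0 < \rho \le 1$. Then

\begin{align*}
\rho\, W^{=1}[1_A] &= \rho \sum_{|S| = 1} \widehat{1_A}(S)^2 = \sum_{|S| = 1} \rho^{|S|}\, \widehat{1_A}(S)^2 \\
&\le \sum_{S \ne \emptyset} \rho^{|S|}\, \widehat{1_A}(S)^2 = \sum_{S \subseteq [n]} \rho^{|S|}\, \widehat{1_A}(S)^2 - \widehat{1_A}(\emptyset)^2 \\
&= \mathrm{Stab}_\rho[1_A] - \alpha^2 && \text{since } \mathbb{E}[1_A] = \widehat{1_A}(\emptyset) \\
&\le \alpha^{\frac{2}{1+\rho}} - \alpha^2
\end{align*}

by the Small-Set Expansion Theorem. Hence

\[ W^{=1}[1_A] \le \frac{1}{\rho} \left( \alpha^{\frac{2}{1+\rho}} - \alpha^2 \right). \]

Taking the limit as $\rho \downarrow 0$ and applying l’Hôpital’s rule, it follows that

\[ W^{=1}[1_A] \le \lim_{\rho \downarrow 0} \frac{\alpha^{\frac{2}{1+\rho}} - \alpha^2}{\rho} = \lim_{\rho \downarrow 0} \frac{-\frac{1}{(1+\rho)^2} \ln(\alpha^2)\, \alpha^{\frac{2}{1+\rho}}}{1} = 2 \ln(\alpha^{-1}) \alpha^2, \]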

which proves the theorem.

As promised, we will improve the bound given in Proposition 2.1.9. The theorem that gives this bound, named after Mei-Chu Chang, has a more general version described by Chang as Lemma 3.1 in [2]. This article touches upon the additive combinatorics as described in chapter 3. We focus on a version for the Hamming cube, from O’Donnell [5], in which $\mathbb{F}_2^n$ is used to describe the domain of both original and Fourier transformed Boolean functions.

Again the dimension of a span of a subset of $2^{[n]}$ is to be interpreted as the dimension of the span of the corresponding set in $\mathbb{F}_2^n$.

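Theorem 2.3.18. [Chang’s lemma] Let $A \subseteq \mathbb{F}_2^n$ have volume $\alpha = \mathbb{E}[1_A] = \frac{|A|}{2^n}$, and let $\varepsilon > 0$. Define $d := \dim(\mathrm{Sp}(\mathrm{Spec}_\varepsilon[\phi_A]))$, using the $\varepsilon$-spectrum as defined in Definition 2.1.6. Then $d \le \frac{2}{\varepsilon^2} \ln(\alpha^{-1})$.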

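Proof. In this proof we will use $(\mathbb{F}_2^n, +)$ and $\chi_z(x) = (-1)^{x \cdot z}$ to describe the Fourier basis.

Let $\Gamma \subseteq \mathrm{Spec}_\varepsilon[\phi_A]$ be a maximal linearly independent subset (which can be generated by iterating across elements and adding those from $\mathrm{Spec}_\varepsilon[\phi_A]$ which are not already in the span of $\Gamma$). Since $\Gamma$ is maximal we have $\dim(\mathrm{Sp}(\Gamma)) = d$, so one can write

\[ \Gamma = \{v_1, \ldots, v_d\}. \]

Now let $M : \mathbb{F}_2^n \to \mathbb{F}_2^n$ be an invertible linear map such that for all $i \in \{1, \ldots, d\}$ we have $v_i = M e_i$.

Define $\phi := \phi_A$, $\psi := \phi \circ M^{-T}$, where $M^{-T}$ is the inverse transpose of $M$. Then

\begin{align*}
\widehat{\psi}(e_i) = \widehat{\phi \circ M^{-T}}(e_i) &= \mathbb{E}_x\big[\phi(M^{-T} x) (-1)^{x \cdot e_i}\big] \\
&= \mathbb{E}_y\big[\phi(y) (-1)^{M^T y \cdot e_i}\big] \\
&= \mathbb{E}_y\big[\phi(y) (-1)^{y \cdot M e_i}\big] \\
&= \mathbb{E}_y\big[\phi(y) \chi_{M e_i}(y)\big] = \widehat{\phi}(M e_i) = \widehat{\phi}(v_i).
\end{align*}

Since one chooses $x$ uniformly if and only if one chooses $y = M^{-T} x$ uniformly, the map $\psi$ is a density: $\mathbb{E}_x[\psi(x)] = \mathbb{E}_x[\phi(M^{-T} x)] = \mathbb{E}_y[\phi(y)] = 1$.

This proves that without loss of generality we can assume that $\Gamma = \{e_1, \ldots, e_d\}$, since the transformation preserves the volume of the set considered. Using the Level-1 inequality and $\phi_A = \frac{2^n}{|A|} 1_A = \frac{1}{\alpha} 1_A$, one has

\[ 2 \ln(\alpha^{-1}) \alpha^2 \ge W^{=1}[1_A] = W^{=1}[\alpha \phi_A] = \alpha^2\, W^{=1}[\phi_A]. \]

Hence

\[ 2 \ln(\alpha^{-1}) \ge W^{=1}[\phi_A] \ge \sum_{i=1}^{d} \widehat{\phi_A}(e_i)^2 \ge \sum_{i=1}^{d} \varepsilon^2 = d \varepsilon^2, \]

which implies $d \le \frac{2}{\varepsilon^2} \ln(\alpha^{-1})$, proving the theorem.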
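Chang’s lemma can be tested on small examples by computing the spectrum and its rank directly. The Python sketch below is illustrative only and rests on assumptions that are not spelled out in this section: elements of $\mathbb{F}_2^n$ are encoded as integers (bitmasks), and the $\varepsilon$-spectrum of Definition 2.1.6 is taken to be $\{z : |\widehat{\phi_A}(z)| \ge \varepsilon\}$. The helper names `fourier` and `rank_f2` are hypothetical.

```python
import math
import random

def fourier(phi, n):
    """phihat(z) = E_x[phi(x) * (-1)^(x.z)], with points of F_2^n encoded as ints."""
    size = 1 << n
    return [sum(phi[x] * (-1) ** bin(x & z).count("1") for x in range(size)) / size
            for z in range(size)]

def rank_f2(vectors):
    """Dimension of the span over F_2 of bitmask-encoded vectors (xor basis)."""
    basis = []
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)      # clears the leading bit of b from v if it is set
        if v:
            basis.append(v)
    return len(basis)

random.seed(1)
n, eps = 4, 0.5
size = 1 << n
results = []
for _ in range(30):
    A = random.sample(range(size), random.randint(1, size))
    alpha = len(A) / size
    phi = [1 / alpha if x in A else 0.0 for x in range(size)]   # density of A
    spectrum = [z for z, c in enumerate(fourier(phi, n)) if abs(c) >= eps]
    d = rank_f2(spectrum)
    results.append(d <= 2 / eps ** 2 * math.log(1 / alpha) + 1e-9)
print(all(results))
```

For $A = \mathbb{F}_2^n$ the spectrum consists of the zero vector only, so `rank_f2` returns 0, matching the bound $2\varepsilon^{-2} \ln(1) = 0$.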

Comparing this inequality to Proposition 2.1.9 where we obtained a bound of $d \le \varepsilon^{-2} \alpha^{-1}$, this is a significant improvement. Suppose we take $A = \mathbb{F}_2^n$, so $\alpha = 1$, i.e. $\phi_A = \chi_\emptyset \equiv 1$. Then only the empty set has Fourier coefficient at least $\varepsilon$, since $\widehat{\chi_\emptyset}(S) = \langle \chi_\emptyset \mid \chi_S \rangle = \delta_{\emptyset, S}$. Therefore $\mathrm{Spec}_\varepsilon[\phi_A] = \{\emptyset\}$. This corresponds to the null space $\{0\} \subseteq \mathbb{F}_2^n$, which has dimension zero. While the bound derived from Parseval’s identity does not give a sharp estimate, it is correctly given by Chang’s lemma, where $d \le 2 \varepsilon^{-2} \ln(1) = 0$.

It is worthwhile to note that the Level-k inequalities and Chang’s lemma are independent of the domain of the Boolean functions involved.


3 Additive Combinatorics

Additive combinatorics is an area within mathematics that considers estimates, often combinatorial, involved with addition and subtraction within arbitrary subsets of groups. Working over $\mathbb{F}_2$, subtraction is exactly the same as addition, which gives additional structure. At first sight it might not look like this area of mathematics has anything to do with Fourier analysis. The convolution of Boolean functions from Definition 2.1.10 plays a large role in allowing Fourier analysis to be a toolset to be used in additive combinatorics on $\mathbb{F}_2^n$; addition can be turned into so-called shifts, convolutions with densities on singletons. The largest result from the previous chapter, Chang’s lemma (Theorem 2.3.18), is going to be needed.

The proof techniques in this chapter sometimes vary significantly from earlier chapters: a lot more is done with the domain of the Boolean functions involved. For this reason $(\mathbb{F}_2^n, +)$ is used for the domain of both Boolean functions themselves, and the Fourier transform of Boolean functions. Since Chang’s lemma is independent of the domain of the Boolean functions involved, this is a legitimate choice to make.

This chapter follows a survey written by Lovett [4], with modifications for readability and completeness.

First we need a few definitions.

Definition 3.0.1. Let $A \subseteq G$ be a subset of an Abelian group $(G, +)$ and let $k \in \mathbb{Z}_{>0}$. Define the $k$-sumset of $A$ as

\[ kA = \Big\{ \sum_{j=1}^{k} a_j : a_j \in A \ \ \forall j \in [k] \Big\}. \]

Let $G$ be a group and $A \subseteq G$. Note that it always holds that $|2A| \ge |A|$; let $a \in A$, then $x \mapsto a + x$ is an injection from $A$ to $2A$. Therefore we are interested in the case that $|2A|$ is not too big in relation to $|A|$, i.e. when for a fixed $K \in \mathbb{R}_{\ge 1}$, it holds that $|2A| \le K|A|$. Since one can prove that $|2A| = |A|$ if and only if $A = g + H$ for a certain $g \in G$ and a subgroup $H \le G$, one can view the sets that meet the requirement of not being too big as “almost-cosets”. Hence the following definition:

Definition 3.0.2. Fix $K \in \mathbb{R}_{\ge 1}$. Let $A \subseteq G$ be a subset of a group $(G, +)$. We say $A$ has doubling $K$ if

|2A| ≤ K|A|.
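For subsets of $\mathbb{F}_2^n$ these notions are easy to compute. In the following Python sketch, which is an illustration under our own encoding assumptions, vectors are represented as integers and addition in $\mathbb{F}_2^n$ is bitwise xor; the names `sumset` and `doubling_ratio` are hypothetical.

```python
from itertools import product

def sumset(A, k):
    """k-fold sumset kA in F_2^n; vectors are ints and addition is xor."""
    result = {0}
    for _ in range(k):
        result = {s ^ a for s, a in product(result, A)}
    return result

def doubling_ratio(A):
    """Smallest K for which A has doubling K, namely |2A| / |A|."""
    return len(sumset(A, 2)) / len(A)

subspace = {0b000, 0b001, 0b010, 0b011}      # span of e1, e2 in F_2^3
coset = {v ^ 0b100 for v in subspace}        # a coset of that subspace
generic = {0b0001, 0b0010, 0b0100, 0b1000}   # standard basis of F_2^4
print(doubling_ratio(subspace), doubling_ratio(coset), doubling_ratio(generic))
# prints 1.0 1.0 1.75
```

The subspace and its coset attain the minimal ratio 1, in line with the “almost-coset” viewpoint, while the standard basis has $2A = \{0\} \cup \{e_i + e_j : i < j\}$ of size 7.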

A subset $A$ always has doubling $K$ for $K \ge \frac{|2A|}{|A|}$. Terminology like “small” doubling therefore only makes sense when $K$ is small relative to $\frac{|2A|}{|A|}$.

Now, the goal of this chapter is to prove the following result in additive combinatorics:


Theorem 3.0.3. [Quasi-polynomial Freiman-Ruzsa theorem] Fix $K \in \mathbb{R}_{\ge 1}$. Let $A \subseteq \mathbb{F}_2^n$ have doubling $K$.

Then there is a subset $A' \subseteq A$ such that $|A'| \ge K^{-O(\log(K)^3)} |A|$ and $|\mathrm{Sp}(A')| \le 2K^5 |A'|$.

The theorem just stated follows from another large result that is similar in form, proved originally by Sanders [9]:

Theorem 3.0.4. [Quasi-polynomial Bogolyubov-Ruzsa theorem] Fix $K \in \mathbb{R}_{\ge 1}$. Let $A \subseteq \mathbb{F}_2^n$ have doubling $K$.

Then there exists a linear subspace $V \subseteq 4A$ such that $|V| \ge K^{-O(\log(K)^3)} |A|$.

This theorem will be proven, and at the end of the chapter the reduction from this theorem to the quasi-polynomial Freiman-Ruzsa theorem will be given.

The term quasi-polynomial comes from how the subset $A'$, respectively the subspace $V$, is stated to have quasi-polynomial size in relation to the set of doubling $K$. Similar to this, one can formulate the polynomial Freiman-Ruzsa and polynomial Bogolyubov-Ruzsa conjectures [4]. At the moment of writing it is an open problem to either prove or disprove (one of) these conjectures.

We state the following lemma, to be used a few times over the course of this chapter.

Lemma 3.0.5. [Plünnecke-Ruzsa inequality] Fix $K \in \mathbb{R}_{\ge 1}$. Suppose $A \subseteq G$ is a subset of a group such that $|2A| \le K|A|$. Then for each $t \in \mathbb{Z}_{>0}$ it holds that $|tA| \le K^t |A|$.

For a proof of this lemma, we refer to Corollary 6.28 in [12].

We now need the notion of a Freiman homomorphism.

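Definition 3.0.6. [Freiman homomorphism] Let $A \subseteq \mathbb{F}_2^n$. A linear map $\varphi : \mathbb{F}_2^n \to \mathbb{F}_2^m$ is called a Freiman homomorphism on $A$ of order $t$ if $\varphi$ is injective on $tA$, i.e. by linearity for all $a_i, b_i \in A$, $i \in [t]$:

\[ \sum_{i=1}^{t} \varphi(a_i) = \sum_{i=1}^{t} \varphi(b_i) \implies \sum_{i=1}^{t} a_i = \sum_{i=1}^{t} b_i. \]

Lemma 3.0.7. Let $A \subseteq \mathbb{F}_2^n$ and $t \ge 1$. Choose $m$ (depending on $t$) the smallest integer such that a Freiman homomorphism $\varphi : \mathbb{F}_2^n \to \mathbb{F}_2^m$ on $A$ of order $t$ exists. Then $\varphi(2tA) = \mathbb{F}_2^m$.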

Proof. When $m = n$, the identity map $\mathrm{Id} : \mathbb{F}_2^n \to \mathbb{F}_2^n$ is a Freiman homomorphism of order $t$, which means the minimal $m$ exists.

We know by definition of the image that $\varphi(2tA) \subseteq \mathbb{F}_2^m$. Furthermore for $x \in tA$, $0 = 2x \in 2tA$ gives us $0 = \varphi(0) \in \varphi(2tA)$.

So, suppose $0 \ne y \in \mathbb{F}_2^m$. Let $\psi : \mathbb{F}_2^m \to \mathbb{F}_2^{m-1}$ be a surjective linear map for which it holds that $\psi(y) = 0$ (such a map exists; extend $y$ to a basis of $\mathbb{F}_2^m$ and take the linear map that sends $y$ to $0$ and the remaining $m - 1$ basis elements to a basis of $\mathbb{F}_2^{m-1}$. This results in an $(m-1) \times m$ matrix).


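Define the map $\varphi' = \psi \circ \varphi : \mathbb{F}_2^n \to \mathbb{F}_2^{m-1}$, which is linear by composition of linear maps. Suppose $\varphi'$ is injective on $tA$. Then it is a Freiman homomorphism on $A$ of order $t$, which contradicts the minimality of $m$.

Therefore, we know $\varphi'$ is not injective on $tA$. This means there are distinct $x_1, x_2 \in tA$ such that (by linearity) $\psi(\varphi(x_1 + x_2)) = \varphi'(x_1 + x_2) = 0$, i.e. $\varphi(x_1 + x_2) = 0$ or $\varphi(x_1 + x_2) = y$ since $\mathrm{Ker}(\psi) = \{0, y\}$. The first case would mean $\varphi(x_1) = \varphi(x_2)$ with $x_1 \ne x_2$, which is impossible by injectivity of $\varphi$ on $tA$; therefore $y = \varphi(x_1 + x_2) \in \varphi(2tA)$ since $x_1, x_2 \in tA$, hence $x_1 + x_2 \in 2tA$.

Hence also $\mathbb{F}_2^m \subseteq \varphi(2tA)$ and so $\mathbb{F}_2^m = \varphi(2tA)$.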

The notion of a Freiman homomorphism is now going to be applied in proving that the Bogolyubov-Ruzsa theorem only needs to be proven in the context of “large” sets. “Large” is here defined in terms of the doubling constant $K \ge 1$.

Proposition 3.0.8. Fix $K \in \mathbb{R}_{\ge 1}$. It is sufficient to prove Theorem 3.0.4 for “large” sets, i.e. $A \subseteq \mathbb{F}_2^n$ with doubling $K$ such that $\mathbb{E}[1_A] = \frac{|A|}{2^n} \ge K^{-1}$.

Proof. Fix $K \in \mathbb{R}_{\ge 1}$. We assume Theorem 3.0.4 for “large” sets.

Let $A \subseteq \mathbb{F}_2^n$ such that $A$ has doubling $K$.

Replacing $A$ with $A + a$ for any $a \in A$ we may assume $0 \in A$, and since $|A + a| = |A|$, $|2(A + a)| = |2A|$ this does not matter for the theorem.

Let $\varphi : \mathbb{F}_2^n \to \mathbb{F}_2^m$ be a minimal Freiman homomorphism of $A$ of order 12. Define $B = \varphi(A)$.

Now $0 \in A$ implies $\varphi$ is a Freiman homomorphism on $A$ of order $t$ for any $t \le 12$, as one can choose any of the elements $a_i, b_i$ as zero. In particular for $t = 1, 2$ we get that $|A| = |B|$, $|2A| = |2B|$. Hence $|2B| = |2A| \le K|A| = K|B|$.

Therefore by Theorem 3.0.4 we know a linear subspace $W \subseteq 4B$ with $|W| \ge K^{-O(\log(K)^3)} |B|$ exists. Also, using Lemma 3.0.7, the fact that $\varphi|_A$ maps $A$ onto $B$ and therefore $\varphi(tA) = tB$ for $t \in \mathbb{Z}_{>0}$, and Lemma 3.0.5 we have

\[ 2^m = |\mathbb{F}_2^m| = |\varphi(24A)| = |24B| \le K^{24} |B| \implies \mathbb{E}[1_B] = \frac{|B|}{2^m} \ge K^{-24}, \]

so $B$ is large in $\mathbb{F}_2^m$.

Since $\varphi$ is injective on $12A$, $\varphi|_{12A} : 12A \to \varphi(12A)$ is a bijection and therefore an inverse $\varphi|_{12A}^{-1}$ exists.

Now define $V := \varphi|_{12A}^{-1}(W) \subset 4A$; this will be the linear subspace as to be found in the theorem. Note that since $0 = \varphi(0) \in \varphi(A) = B$, one has $W \subseteq 4B \subseteq 12B$, so $V$ is well-defined.

Let $x, y \in V$. Then $x' := \varphi(x)$, $y' := \varphi(y) \in W$, and since $W$ is a linear subspace we have $z' := x' + y' \in W$. Define $z := \varphi|_{12A}^{-1}(z')$. Since $x, y, z \in V \subset 4A$, it holds that $x + y + z \in 12A$; lastly, $\varphi(x + y + z) = x' + y' + z' = x' + y' + x' + y' = 0$ and due to injectivity of the linear map $\varphi$ on $12A$ we therefore have $x + y + z = 0 \iff x + y = z \in V$. We conclude that $V \subset 4A$ is a linear subspace.

The following result shows that a $[0, 1]$-valued Boolean function, when smoothed, is invariant to many shifts.


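Lemma 3.0.9. [Croot-Sisask] Fix $K \in \mathbb{R}_{\ge 1}$. Let $A \subseteq \mathbb{F}_2^n$ such that $\mathbb{E}[1_A] = \frac{|A|}{2^n} \ge K^{-1}$. Furthermore let $f : \mathbb{F}_2^n \to [0, 1]$, $p \ge 1$, and $\varepsilon > 0$. Then there exists a subset $X \subseteq \mathbb{F}_2^n$ with $\mathbb{E}[1_X] = \frac{|X|}{2^n} \ge K^{-O(p / \varepsilon^2)}$ such that for every $x \in X$:

\[ \|\phi_x * \phi_A * f - \phi_A * f\|_p \le \varepsilon. \]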

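Proof. Take the assumptions of the theorem, let $l \ge C\, 4^{1 + \frac{1}{p}} \frac{p}{\varepsilon^2}$, and let $(a_1, \ldots, a_l)$ be a sequence of independent random variables distributed uniformly over $A$.

Define the random variables

\[ z := \Big\| \phi_A * f - \frac{1}{l} \sum_{i=1}^{l} \phi_{a_i} * f \Big\|_p, \qquad z_i := \phi_A * f - \phi_{a_i} * f. \]

Then

\[ \Big\| \frac{1}{l} \sum_{i=1}^{l} z_i \Big\|_p = \Big\| \frac{1}{l} \sum_{i=1}^{l} \big( \phi_A * f - \phi_{a_i} * f \big) \Big\|_p = \Big\| \phi_A * f - \frac{1}{l} \sum_{i=1}^{l} \phi_{a_i} * f \Big\|_p = z, \]

and all $z_i(x) = \phi_A * f(x) - \phi_{a_i} * f(x) = \mathbb{E}_{a \sim A}[f(x + a)] - f(x + a_i) \in [-1, 1]$ have mean 0 (as both $a, a_i \sim A$ uniformly). We now have, by simply writing out:

\begin{align*}
\mathbb{E}_{a_1, \ldots, a_l \sim A}[z^p] &= \mathbb{E}_{a_i}\Big[ \Big\| \frac{1}{l} \sum_{i=1}^{l} z_i \Big\|_p^p \Big] = \mathbb{E}_{a_i,\, x \sim \mathbb{F}_2^n}\Big[ \Big| \frac{1}{l} \sum_{i=1}^{l} z_i(x) \Big|^p \Big] \\
&\le \mathbb{E}_{x}\Big[ \Big( \frac{Cp}{l} \Big)^{p/2} \Big] && \text{using Corollary 1.1.8} \\
&= \Big( \frac{Cp}{l} \Big)^{p/2}.
\end{align*}

As a consequence,

\begin{align*}
\mathbb{P}_{a_1, \ldots, a_l \sim A}\big[z \le \tfrac{\varepsilon}{2}\big] &\ge 1 - \mathbb{P}\big[z \ge \tfrac{\varepsilon}{2}\big] = 1 - \mathbb{P}\big[z^p \ge \big(\tfrac{\varepsilon}{2}\big)^p\big] \\
&\ge 1 - \frac{\mathbb{E}[z^p]}{(\varepsilon / 2)^p} && \text{by Markov’s inequality} \\
&\ge 1 - \Big( \frac{2}{\varepsilon} \Big)^p \cdot \Big( \frac{Cp}{l} \Big)^{p/2} = 1 - \Big( \frac{4}{\varepsilon^2} \cdot \frac{Cp}{l} \Big)^{p/2} \\
&\ge 1 - \frac{1}{2} = \frac{1}{2} && \text{using the choice of } l.
\end{align*}

Now define

\[ S(A) := \Big\{ (a_1, \ldots, a_l) \in A^l : \Big\| \phi_A * f - \frac{1}{l} \sum_{i=1}^{l} \phi_{a_i} * f \Big\|_p \le \frac{\varepsilon}{2} \Big\} \subseteq \mathbb{F}_2^{nl}. \]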


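Since $\mathbb{P}_{a_1, \ldots, a_l \sim A}[z \le \frac{\varepsilon}{2}] \ge \frac{1}{2}$, the sequences contained in $S(A)$ have to make up at least half of $A^l$. Furthermore shifting by $x \in \mathbb{F}_2^n$ is a bijection (over any finite set), so

\[ |S(A + x)| \ge \frac{1}{2} |A|^l \ge \frac{1}{2}\, 2^{nl} K^{-l}. \]

Defining $w(a) := \sum_{x \in \mathbb{F}_2^n} 1[a \in S(A + x)]$ for $a \in \mathbb{F}_2^{nl}$, we see that

\[ \sum_{a \in \mathbb{F}_2^{nl}} w(a) = \sum_{a \in \mathbb{F}_2^{nl}} \sum_{x \in \mathbb{F}_2^n} 1[a \in S(A + x)] = \sum_{x \in \mathbb{F}_2^n} |S(A + x)| \ge 2^n \cdot \frac{1}{2}\, 2^{nl} K^{-l}. \]

Hence, choosing $a' \in \mathbb{F}_2^{nl}$ such that $w(a')$ is maximal, we have

\[ w(a') \ge \frac{1}{2^{nl}} \sum_{a \in \mathbb{F}_2^{nl}} w(a) \ge \frac{2^n \cdot \frac{1}{2}\, 2^{nl} K^{-l}}{2^{nl}} = \frac{1}{2}\, 2^n K^{-l}. \]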

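Define $X' := \{x \in \mathbb{F}_2^n : a' \in S(A + x)\}$. Let $y \in X'$ be fixed, and define $X := X' + y$. Then every $x \in X$ is of the form $x = z + y$ for $z \in X'$. Note that $|X| = |X'| = w(a') \ge \frac{1}{2}\, 2^n K^{-l}$, which gives the claimed density of $X$ when $l$ is chosen of order $p / \varepsilon^2$.

Now, writing out, we have

\[ \|\phi_x * \phi_A * f - \phi_A * f\|_p = \|\phi_y * (\phi_{A+x} * f - \phi_A * f)\|_p = \|\phi_{A+x+y} * f - \phi_{A+y} * f\|_p, \]

using that the $p$-norm is invariant under shifts (Lemma 2.1.12). Since $A + x + y = A + z$, and working towards an estimate, this is equal to

\begin{align*}
& \Big\| \phi_{A+z} * f - \frac{1}{l} \sum_{i=1}^{l} \phi_{a'_i} * f + \frac{1}{l} \sum_{i=1}^{l} \phi_{a'_i} * f - \phi_{A+y} * f \Big\|_p \\
&\le \Big\| \phi_{A+z} * f - \frac{1}{l} \sum_{i=1}^{l} \phi_{a'_i} * f \Big\|_p + \Big\| \frac{1}{l} \sum_{i=1}^{l} \phi_{a'_i} * f - \phi_{A+y} * f \Big\|_p \\
&\le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon,
\end{align*}

where the second inequality holds given that $a' \in S(A + z)$ and $a' \in S(A + y)$ imply that the first, respectively second term are both at most $\frac{\varepsilon}{2}$. This concludes the proof.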
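The sampling step at the heart of this proof, replacing $\phi_A * f$ by an average over $l$ random shifts, can be illustrated numerically. The sketch below is not the lemma itself: it only measures the approximation error $z$ for one random sample per value of $l$, with $\mathbb{F}_2^n$ encoded as integers under xor and $\phi_A * f(x) = \mathbb{E}_{a \sim A}[f(x + a)]$. The set $A$, the function $f$, the parameters and the helper names are all arbitrary choices made for illustration.

```python
import random

def shift_average(f, shifts, size):
    """x -> average of f(x + a) over the given shifts (addition in F_2^n is xor)."""
    return [sum(f[x ^ a] for a in shifts) / len(shifts) for x in range(size)]

def p_norm(g, p):
    """p-norm with respect to the uniform measure."""
    return (sum(abs(v) ** p for v in g) / len(g)) ** (1 / p)

random.seed(0)
n, p = 6, 2
size = 1 << n
A = random.sample(range(size), size // 2)
f = [random.random() for _ in range(size)]     # an arbitrary [0,1]-valued function
exact = shift_average(f, A, size)              # phi_A * f
errors = {}
for l in (1, 4, 16, 64, 256):
    sample = [random.choice(A) for _ in range(l)]          # a_1, ..., a_l uniform in A
    approx = shift_average(f, sample, size)                # (1/l) sum_i phi_{a_i} * f
    errors[l] = p_norm([e - a for e, a in zip(exact, approx)], p)
print(errors)
```

One would expect the error to shrink roughly like $l^{-1/2}$ as $l$ grows, in line with the bound $\mathbb{E}[z^p] \le (Cp/l)^{p/2}$ used above, although a single random run can deviate from this trend.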

A more general version of the previous lemma has been proven by Sanders [9]. This is done in a more representation theoretic context.

Note that

\[ \mathbb{P}_{a_1, a_2 \sim A}[a_1 + a_2 \in 2A] = 1, \]

by definition. The following result, which is a corollary of Croot-Sisask, shows that shifting $2A$ by elements from a $t$-sumset does not lower this probability much.


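Lemma 3.0.10. Fix $K \in \mathbb{R}_{\ge 1}$. Let $A \subseteq \mathbb{F}_2^n$ such that $\frac{|A|}{2^n} \ge K^{-1}$. Let $t = O(\log(K))$, and $\delta \in (0, 1)$.

Then there exists a subset $X \subseteq \mathbb{F}_2^n$ with $\mathbb{E}[1_X] = \frac{|X|}{2^n} \ge K^{-O(\log(K)^3 / \delta^2)}$ such that for every $y \in tX$ it holds that

\[ \mathbb{P}_{a_1, a_2 \sim A}[a_1 + a_2 + y \in 2A] \ge 1 - \delta. \]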

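Proof. Take the assumptions from the theorem. Set $f = 1_{2A}$, $p = \log(K)$, $\varepsilon = \frac{\delta}{2t}$ and take the $X \subseteq \mathbb{F}_2^n$ from Lemma 3.0.9. Then

\[ \mathbb{E}[1_X] \ge K^{-O(p / \varepsilon^2)} = K^{-O(\log(K)^3 / \delta^2)}. \]

We first claim that for every $y \in tX$

\[ \big\| \phi_y * \phi_A * 1_{2A} - \phi_A * 1_{2A} \big\|_p \le t\varepsilon, \]

which will be proven using a telescoping argument. Let $y = \sum_{i=1}^{t} y_i \in tX$ with $y_i \in X$. Define $s_i := \sum_{j=1}^{i} y_j$ for $i \in \{0, 1, \ldots, t\}$, and $g := \phi_A * 1_{2A}$. Then $y = s_t$ and $0 = s_0$, hence we have

\begin{align*}
\|\phi_y * g - g\|_p &= \Big\| \phi_{s_t} * g + \sum_{i=1}^{t-1} \phi_{s_i} * g - \sum_{i=2}^{t} \phi_{s_{i-1}} * g - \phi_{s_0} * g \Big\|_p \\
&= \Big\| \sum_{i=1}^{t} \phi_{s_i} * g - \sum_{i=1}^{t} \phi_{s_{i-1}} * g \Big\|_p = \Big\| \sum_{i=1}^{t} \big( \phi_{s_i} * g - \phi_{s_{i-1}} * g \big) \Big\|_p,
\end{align*}

where $\phi_{s_t}$ and $\phi_{s_0}$ have been incorporated into their corresponding sums. Continuing, and applying the triangle inequality,

\[ \|\phi_y * g - g\|_p \le \sum_{i=1}^{t} \big\| \phi_{s_{i-1} + y_i} * g - \phi_{s_{i-1}} * g \big\|_p = \sum_{i=1}^{t} \big\| \phi_{s_{i-1}} * (\phi_{y_i} * g - g) \big\|_p = \sum_{i=1}^{t} \big\| \phi_{y_i} * g - g \big\|_p. \]

Here the common factor $\phi_{s_{i-1}}$ is taken out and then can be eliminated due to invariance of the $p$-norm under shifts (Lemma 2.1.12). Finally, for each $i \in [t]$: $\|\phi_{y_i} * \phi_A * 1_{2A} - \phi_A * 1_{2A}\|_p \le \varepsilon$, by Lemma 3.0.9 since $y_i \in X$ for each $i$. Hence

\[ \|\phi_y * \phi_A * 1_{2A} - \phi_A * 1_{2A}\|_p \le \sum_{i=1}^{t} \varepsilon = t\varepsilon, \]

and the claim is proven. Substituting $\varepsilon = \frac{\delta}{2t}$ now gives for all $y \in tX$

\[ \|\phi_y * \phi_A * 1_{2A} - \phi_A * 1_{2A}\|_p \le \frac{\delta}{2}. \]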


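Let $q$ be the Hölder conjugate of $p$. We have

\begin{align*}
\|\phi_A\|_q &= \frac{2^n}{|A|} \mathbb{E}\big[|1_A|^q\big]^{\frac{1}{q}} = \frac{2^n}{|A|} \mathbb{E}[1_A]^{\frac{1}{q}} = \frac{2^n}{|A|} \Big( \frac{|A|}{2^n} \Big)^{\frac{1}{q}} \\
&= \Big( \frac{2^n}{|A|} \Big)^{1 - \frac{1}{q}} = \Big( \frac{2^n}{|A|} \Big)^{\frac{1}{p}} = \Big( \frac{2^n}{|A|} \Big)^{\frac{1}{\log(K)}} \le K^{\frac{1}{\log(K)}} = 2.
\end{align*}

Note that the case $K = 1$ is irrelevant as then $A = \mathbb{F}_2^n$ and the theorem holds trivially. Let $y \in tX$. From the remarks at equation 2.2 and equation 2.3 we have

\[ \mathbb{P}_{a_1, a_2 \sim A}[a_1 + a_2 + y \in 2A] = \langle \phi_A * \phi_A * \phi_y \mid 1_{2A} \rangle = \langle \phi_y * \phi_A * 1_{2A} \mid \phi_A \rangle. \]

We have $\langle \phi_A * 1_{2A} \mid \phi_A \rangle = \langle \phi_A * \phi_A \mid 1_{2A} \rangle = \mathbb{P}_{a_1, a_2 \sim A}[a_1 + a_2 \in 2A] = 1$. Hence it follows that

\begin{align*}
\mathbb{P}_{a_1, a_2 \sim A}[a_1 + a_2 + y \in 2A] &= 1 - \langle \phi_A * 1_{2A} \mid \phi_A \rangle + \langle \phi_y * \phi_A * 1_{2A} \mid \phi_A \rangle \\
&= 1 - \langle \phi_A * 1_{2A} - \phi_y * \phi_A * 1_{2A} \mid \phi_A \rangle.
\end{align*}

Finally with Hölder’s inequality (Theorem 1.1.3) and $\|\phi_A\|_q \le 2$ as derived,

\begin{align*}
\mathbb{P}_{a_1, a_2 \sim A}[a_1 + a_2 + y \in 2A] &\ge 1 - \|\phi_A * 1_{2A} - \phi_y * \phi_A * 1_{2A}\|_p \|\phi_A\|_q \\
&\ge 1 - \frac{\delta}{2} \cdot 2 = 1 - \delta.
\end{align*}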

This proves the lemma.

We already have seen a fair share of how Fourier analysis appears in additive combinatorics. We will now prove the quasi-polynomial Bogolyubov-Ruzsa theorem, which was stated to be one of the largest results of this chapter. In the proof the largest result from the previous chapter, Chang’s lemma (Theorem 2.3.18), will be applied.

Theorem 3.0.4 (Quasi-polynomial Bogolyubov-Ruzsa theorem). Fix $K \in \mathbb{R}_{\ge 1}$. Let $A \subseteq \mathbb{F}_2^n$ have doubling $K$.

Then there exists a linear subspace $V \subseteq 4A$ such that $|V| \ge K^{-O(\log(K)^3)} |A|$.

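Proof. Take the assumptions of the theorem. For $t = \log(10K)$, using Lemma 3.0.10 with $\delta = \frac{1}{10}$ one has the existence of $X \subseteq \mathbb{F}_2^n$ with volume at least $\alpha := \mathbb{E}[1_X] \ge K^{-O(\log(K)^3)}$ such that for every $x \in tX$:

\[ \mathbb{P}_{a_1, a_2 \sim A}[a_1 + a_2 + x \in 2A] \ge \frac{9}{10}. \]

Hence

\[ \mathbb{P}_{a_1, a_2 \sim A,\ x \sim tX}[a_1 + a_2 + x \in 2A] \ge \frac{9}{10}. \]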


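Set $V := \mathrm{Spec}_{\frac{1}{2}}[\phi_X]^\perp$ as the vector space that lies orthogonal to all vectors in the $\frac{1}{2}$-spectrum of $\phi_X$ (the spectrum has to be taken of the density of $X$, since $\alpha$ is the volume of $X$ and the bound on $\widehat{\phi_X}$ outside the spectrum is what is used below). Using Chang’s lemma (Theorem 2.3.18) with $\varepsilon = \frac{1}{2}$ we know with $d := \dim(\mathrm{Sp}(\mathrm{Spec}_{\frac{1}{2}}[\phi_X]))$ that $d \le \frac{2}{\varepsilon^2} \ln(\alpha^{-1}) = 8 \ln(2) \log(\alpha^{-1})$. Therefore, using that each free coordinate takes exactly two values in $\mathbb{F}_2^n$ and the dimension theorem:

\begin{align*}
|V| = 2^{\dim(V)} = 2^n 2^{-d} &\ge |A|\, 2^{-8 \ln(2) \log(\alpha^{-1})} \\
&= 2^{\log(\alpha^{8 \ln(2)})} |A| = \alpha^{8 \ln(2)} |A| \ge K^{-O(\log(K)^3)} |A|.
\end{align*}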

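It remains to show that $V \subset 4A$. Let $v \in V$. Write

\begin{align*}
\mathbb{P}_{a_1, a_2 \sim A,\ x \sim tX}[a_1 + a_2 + x \in 2A] &= \langle \phi_A^{*2} * \phi_X^{*t} \mid 1_{2A} \rangle = \langle \widehat{\phi_A}^{\,2}\, \widehat{\phi_X}^{\,t} \mid \widehat{1_{2A}} \rangle \\
&= \sum_{\alpha \in \mathbb{F}_2^n} \widehat{\phi_A}(\alpha)^2\, \widehat{\phi_X}(\alpha)^t\, \widehat{1_{2A}}(\alpha), \text{ and} \\
\mathbb{P}_{a_1, a_2 \sim A,\ x \sim tX,\ v \sim V}[a_1 + a_2 + x + v \in 2A] &= \langle \phi_A^{*2} * \phi_X^{*t} * \phi_V \mid 1_{2A} \rangle = \langle \widehat{\phi_A}^{\,2}\, \widehat{\phi_X}^{\,t}\, \widehat{\phi_V} \mid \widehat{1_{2A}} \rangle \\
&= \sum_{\alpha \in \mathbb{F}_2^n} \widehat{\phi_A}(\alpha)^2\, \widehat{\phi_X}(\alpha)^t\, \widehat{\phi_V}(\alpha)\, \widehat{1_{2A}}(\alpha).
\end{align*}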

We have

\[ \widehat{\phi_V}(\alpha) = \mathbb{E}_{y \sim \mathbb{F}_2^n}[\phi_V(y) \chi_\alpha(y)] = \mathbb{E}_{y \sim V}[\chi_\alpha(y)] = \sum_{y \in V} \frac{1}{|V|} (-1)^{y \cdot \alpha}. \]

For $\alpha \in V^\perp$ we have $\alpha \cdot y = 0$ for all $y \in V$, so

\[ \widehat{\phi_V}(\alpha) = \sum_{y \in V} \frac{1}{|V|} (-1)^0 = 1. \]

When $\alpha \notin V^\perp$, there is a $y_0 \in V$ with $y_0 \cdot \alpha = 1$. Since $V$ is a subspace, $y \mapsto y + y_0$ is a bijection of $V$, and therefore

\[ \sum_{y \in V} \frac{1}{|V|} (-1)^{y \cdot \alpha} = \sum_{y \in V} \frac{1}{|V|} (-1)^{(y + y_0) \cdot \alpha} = -\sum_{y \in V} \frac{1}{|V|} (-1)^{y \cdot \alpha}, \tag{3.1} \]

which forces $\widehat{\phi_V}(\alpha) = 0$.


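Hence $\widehat{\phi_V}(\alpha) = 1(\alpha \in V^\perp)$. Combining these results,

\begin{align*}
& \mathbb{P}[a_1 + a_2 + x \in 2A] - \mathbb{P}[a_1 + a_2 + x + v \in 2A] \\
&= \sum_{\alpha \in \mathbb{F}_2^n} \widehat{\phi_A}(\alpha)^2\, \widehat{\phi_X}(\alpha)^t\, \widehat{1_{2A}}(\alpha) - \sum_{\alpha \in \mathbb{F}_2^n} \widehat{\phi_A}(\alpha)^2\, \widehat{\phi_X}(\alpha)^t\, \widehat{\phi_V}(\alpha)\, \widehat{1_{2A}}(\alpha) \\
&= \sum_{\alpha \notin V^\perp} \widehat{\phi_A}(\alpha)^2\, \widehat{\phi_X}(\alpha)^t\, \widehat{1_{2A}}(\alpha).
\end{align*}

For this sum $\alpha \notin V^\perp = \mathrm{Spec}_{\frac{1}{2}}[\phi_X]^{\perp\perp} \supseteq \mathrm{Spec}_{\frac{1}{2}}[\phi_X]$, which gives $\alpha \notin \mathrm{Spec}_{\frac{1}{2}}[\phi_X]$. Therefore $|\widehat{\phi_X}(\alpha)|^t < \left(\frac{1}{2}\right)^t$ and $|\widehat{1_{2A}}(\alpha)| = |\mathbb{E}[1_{2A} \chi_\alpha]| \le \mathbb{E}[|1_{2A} \chi_\alpha|] \le 1$, so

\begin{align*}
\mathbb{P}[a_1 + a_2 + x \in 2A] - \mathbb{P}[a_1 + a_2 + x + v \in 2A] &\le \sum_{\alpha \notin V^\perp} \widehat{\phi_A}(\alpha)^2\, 2^{-t} \\
&\le \sum_{\alpha \in \mathbb{F}_2^n} \widehat{\phi_A}(\alpha)^2\, 2^{-t} = 2^{-t} \|\phi_A\|_2^2 && \text{by Parseval’s identity} \\
&= 2^{-t} \Big( \frac{2^n}{|A|} \Big)^2 \|1_A\|_2^2 = 2^{-t} \Big( \frac{2^n}{|A|} \Big)^2 \mathbb{E}[1_A^2] = 2^{-t} \Big( \frac{2^n}{|A|} \Big)^2 \Big( \frac{|A|}{2^n} \Big) \\
&\le 2^{-t} K = 2^{\log((10K)^{-1})} K = \frac{1}{10}.
\end{align*}

It follows that $\mathbb{P}[a_1 + a_2 + x + v \in 2A] \ge \mathbb{P}[a_1 + a_2 + x \in 2A] - \frac{1}{10} \ge \frac{4}{5}$. Define the random variable $b := a_1 + a_2 + x \sim 2A + tX$, then

\[ \mathbb{P}_{b \sim 2A + tX,\ v \sim V}[v \in (2A + b)] = \mathbb{P}_{b \sim 2A + tX,\ v \sim V}[b + v \in 2A] \ge \frac{4}{5}. \]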

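Now choosing $b'$ such that $\mathbb{P}_{v \sim V}[v \in (2A + b')]$ is maximal, we have

\begin{align*}
\frac{|V \cap (2A + b')|}{|V|} &= \mathbb{P}_{v \sim V}[v \in (2A + b')] \\
&\ge \frac{1}{|2A + tX|} \sum_{b \in 2A + tX} \mathbb{P}_{v \sim V}[v \in (2A + b)] \\
&= \mathbb{P}_{b \sim 2A + tX,\ v \sim V}[v \in (2A + b)] \ge \frac{4}{5}. \tag{3.2}
\end{align*}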

Note that $0 \in 4A$, so fix $0 \ne v \in V$. Define $W := V \cap (2A + b')$. Then $|W| \ge \frac{4}{5} |V| > \frac{|V|}{2}$.

Write out the decomposition $v = v_1 + v_2$ with $v_1, v_2 \in V$, which can be done in $\frac{|V|}{2}$ ways; choose $v_1 \in V$ and set $v_2 = v_1 + v$. Shifting a set by addition with $v$ is a bijection, so the pairs are counted twice.

Define the sets

\[ W_1 := \{w \in W^C : w + v \in W^C\}, \quad W_2 := \{w \in W^C : w + v \in W\}, \quad \widetilde{W} := \{w + v : w \in W_2\}, \]


where $W^C$ denotes the set complement of $W$ relative to $V$. Note that $\widetilde{W} \subseteq W$ and that $W_2$ and $\widetilde{W}$ are in bijection with each other by a shift with $v$. Now $V = W \cup W_1 \cup W_2$ can be written as a disjoint union of sets. Hence with $|W| > \frac{|V|}{2}$, we have

\[ 2|W| - 2|\widetilde{W}| > |V| - 2|\widetilde{W}| = |W| + |W_1| + |W_2| - 2|\widetilde{W}| = (|W| - |\widetilde{W}|) + |W_1| \ge |W| - |\widetilde{W}|, \]

which implies $|W| - |\widetilde{W}| > 0$ and so $W \setminus \widetilde{W}$ is not empty. This set difference exactly represents the elements of $W$ of which the shift by $v$ is also contained in $W$; since it is not empty there must be a pair $v_1, v_2 \in W \subseteq 2A + b'$ such that $v = v_1 + v_2 \in 2(2A + b') = 4A$. So, $V \subseteq 4A$, as desired.

When inspecting the arguments given in the proof of the quasi-polynomial Bogolyubov-Ruzsa theorem, one can remark that the parameters $\delta$, $t$ and the $\frac{1}{2}$ from the $\frac{1}{2}$-spectrum can be chosen differently. It only has to hold that

\[ \frac{|V \cap (2A + b')|}{|V|} > \frac{1}{2} \]

(equation 3.2), since the crucial argument in the proof is finding a pair of elements of $V \cap (2A + b')$ into which a given element of the candidate vector space can be decomposed.

Having proved the quasi-polynomial Bogolyubov-Ruzsa theorem, we will now derive the quasi-polynomial Freiman-Ruzsa theorem from it.

Theorem 3.0.3 (Quasi-polynomial Freiman-Ruzsa theorem). Fix $K \in \mathbb{R}_{\ge 1}$. Let $A \subseteq \mathbb{F}_2^n$ have doubling $K$.

Then there is a subset $A' \subseteq A$ such that $|A'| \ge K^{-O(\log(K)^3)} |A|$ and $|\mathrm{Sp}(A')| \le 2K^5 |A'|$.

Proof. Let $A \subseteq \mathbb{F}_2^n$ such that $|2A| \le K|A|$. Then by Theorem 3.0.4 there exists a linear subspace $V \subseteq 4A$ with $|V| \ge K^{-O(\log(K)^3)} |A|$.

Choose $S \subseteq A$ maximal such that for all distinct $s, s' \in S$ we have $s + s' \notin V$ (which holds if and only if $s + V \ne s' + V$ for all distinct $s, s' \in S$). This can be done by writing $\mathbb{F}_2^n = \bigcup_s (s + V)$ as a disjoint union of cosets, with distinct representatives $s$ chosen from $A$ if possible.

Now for distinct $s, s' \in S$ and $v, v' \in V$ we have $v + v' \in V$ and therefore $s + s' \ne v + v'$, i.e. $s + v \ne s' + v'$, so the map $(s, v) \mapsto s + v$ from $S \times V$ to $S + V$ is injective (and therefore bijective, since it is surjective by definition of $S + V$), which gives

\[ |S||V| = |S \times V| = |S + V| \le |A + V| \le |A + 4A| = |5A| \le K^5 |A| \]

by Lemma 3.0.5. This implies $\frac{1}{|S|} \ge \frac{|V|}{K^5 |A|} \ge \frac{K^{-O(\log(K)^3)}}{K^5} \ge K^{-O(\log(K)^3)}$ and $|V| \le K^5 \frac{|A|}{|S|}$.


Now choose $s \in S$ such that the size $|A'|$ of $A' := A \cap (V + s)$ is maximal. Then $\frac{|A|}{|S|} \le |A'|$ since, using that $s + V$ are cosets and therefore $A \cap (V + s)$ for $s \in S$ form a partition of $A$:

\[ |A| = \Big| \bigcup_{s \in S} \big( A \cap (V + s) \big) \Big| = \sum_{s \in S} \big| A \cap (V + s) \big| \le \sum_{s \in S} |A'| = |S||A'|. \]

This gives us the required $|A'| \ge \frac{|A|}{|S|} \ge K^{-O(\log(K)^3)} |A|$.

Finally, since $V$ is a vector space, the span of the coset $V + s$ can have at most twice as many elements as $V$ (working over $\mathbb{F}_2$):

\[ |\mathrm{Sp}(A')| \le |\mathrm{Sp}(V + s)| \le 2|\mathrm{Sp}(V)| = 2|V| \le 2K^5 \frac{|A|}{|S|} \le 2K^5 |A'|, \]

which concludes the reduction.


4 Linearity testing

In this chapter we will consider two types of functions: functions of the form $f : \mathbb{F}_2^n \to \{\pm 1\}$, and of the form $f : \mathbb{F}_2^n \to \mathbb{F}_2^n$.

In both cases we will prove a theorem that states that $f$ is close to a certain linear map, if it is linear on a certain fraction of points. “Close” will be quantified; in the first case there is a direct dependence, and in the second case we get a quasi-polynomial order of approximation.

The goal is to prove correctness of a (probabilistic) algorithm that decides, given a function that is either linear or far from linear, which of the two cases holds.

4.1 Testing of Boolean functions

This section follows parts of chapter 1 of O’Donnell [6]. Boolean functions of the form $f : \mathbb{F}_2^n \to \{\pm 1\}$ will be considered.

When considering linearity tests, it is

Definition 4.1.1 (Distance between functions). Let $f, g$ be $\{\pm 1\}$-valued Boolean functions.

• Define the (Hamming) distance as

\[ d(f, g) := \mathbb{P}[f \neq g]. \]

• For given $\varepsilon \in [0, 1]$, $f$ and $g$ are $\varepsilon$-close if $d(f, g) \le \varepsilon$; otherwise they are $\varepsilon$-far.

• Suppose that $\emptyset \neq \mathcal{P} \subseteq \{f : \mathbb{F}_2^n \to \{\pm 1\}\}$ is a non-empty property of Boolean functions. Define the distance of $f$ to that property as

\[ d(f, \mathcal{P}) := \min_{g \in \mathcal{P}} d(f, g). \]

This minimum exists since $|\{f : \mathbb{F}_2^n \to \{\pm 1\}\}| = 2^{2^n}$ is finite.
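As a small illustration of this definition, the sketch below computes the Hamming distance by brute force over all of $\mathbb{F}_2^n$. The function names `hamming_distance`, `majority` and `dictator`, and the encoding of the bit 1 as $-1$, are assumptions chosen for the example.

```python
from itertools import product

def hamming_distance(f, g, n):
    """d(f, g) = P_x[f(x) != g(x)] for x uniform in F_2^n (brute force)."""
    points = list(product((0, 1), repeat=n))
    return sum(f(x) != g(x) for x in points) / len(points)

def majority(x):
    """±1-valued majority on 3 bits; the bit 1 is encoded as -1."""
    return -1 if sum(x) >= 2 else 1

def dictator(x):
    """The character x -> (-1)^{x_1}."""
    return -1 if x[0] else 1

d = hamming_distance(majority, dictator, 3)
```

The two functions differ exactly on the inputs 011 and 100, so `d` equals 2/8.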

It can be shown that the Hamming distance defines a metric. The proof has been omitted since this property is not necessary for our needs.

Lemma 4.1.2. [(Proposition 1.9 in [6])] Let $f, g : \mathbb{F}_2^n \to \{\pm 1\}$ be Boolean functions. Then $\langle f \mid g \rangle = 1 - 2d(f, g)$.


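Proof. We have

\[
\begin{aligned}
\langle f \mid g \rangle &= \frac{1}{2^n} \sum_{x \in \mathbb{F}_2^n} f(x)g(x) = \frac{1}{2^n} \sum_{\substack{x \in \mathbb{F}_2^n \\ f(x) = g(x)}} f(x)g(x) + \frac{1}{2^n} \sum_{\substack{x \in \mathbb{F}_2^n \\ f(x) \neq g(x)}} f(x)g(x) \\
&= \frac{1}{2^n} \sum_{\substack{x \in \mathbb{F}_2^n \\ f(x) = g(x)}} 1 + \frac{1}{2^n} \sum_{\substack{x \in \mathbb{F}_2^n \\ f(x) \neq g(x)}} (-1) = \mathbb{P}[f = g] - \mathbb{P}[f \neq g] = 1 - 2\mathbb{P}[f \neq g] = 1 - 2d(f, g),
\end{aligned}
\]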

as was claimed.
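The identity of Lemma 4.1.2 is easy to confirm numerically. The following sketch draws two random $\{\pm 1\}$-valued functions on $\mathbb{F}_2^4$ and compares both sides; the truth-table representation as dictionaries and the fixed seed are assumptions made purely for illustration.

```python
from itertools import product
import random

n = 4
points = list(product((0, 1), repeat=n))
rng = random.Random(0)  # fixed seed so the example is reproducible

# Random truth tables of two {±1}-valued Boolean functions.
f = {x: rng.choice((-1, 1)) for x in points}
g = {x: rng.choice((-1, 1)) for x in points}

inner = sum(f[x] * g[x] for x in points) / len(points)  # <f | g>
dist = sum(f[x] != g[x] for x in points) / len(points)  # d(f, g)
identity_holds = abs(inner - (1 - 2 * dist)) < 1e-12
```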

Theorem 4.1.3. [Characters and linearity] A Boolean function $f$ is linear if and only if it is equal to a character: $f = \chi_S$ for a certain $S \subseteq [n]$.

Proof. “$\Longrightarrow$” Suppose $f$ is linear. For $x, y \in \mathbb{F}_2^n$, equip $\mathbb{F}_2^n$ with the inner product $(x, y) = (-1)^{x \cdot y} = (-1)^{\sum_{k=1}^n x_k y_k}$. Then, using the Riesz-Fréchet representation theorem, there exists a $z \in \mathbb{F}_2^n$ such that $f(x) = (x, z) = \chi_z(x)$ for all $x \in \mathbb{F}_2^n$.

“$\Longleftarrow$” Characters are clearly linear.
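For small $n$ the statement of Theorem 4.1.3 can be verified exhaustively. The sketch below, with hypothetical helper names `character` and `is_linear`, checks that every character on $\mathbb{F}_2^3$ is linear and then counts the linear functions among all $2^{2^3} = 256$ truth tables.

```python
from itertools import product

n = 3
points = list(product((0, 1), repeat=n))

def add(x, y):
    """Coordinate-wise addition in F_2^n."""
    return tuple((a + b) % 2 for a, b in zip(x, y))

def character(S):
    """chi_S(x) = (-1)^{sum_{i in S} x_i}; S holds 0-based coordinate indices."""
    return lambda x: -1 if sum(x[i] for i in S) % 2 else 1

def is_linear(f):
    """Check f(x) f(y) = f(x + y) for all pairs x, y."""
    return all(f(x) * f(y) == f(add(x, y)) for x in points for y in points)

subsets = [tuple(i for i in range(n) if mask >> i & 1) for mask in range(2 ** n)]
all_characters_linear = all(is_linear(character(S)) for S in subsets)

# Count the linear functions among all 2^(2^n) truth tables.
linear_count = 0
for values in product((-1, 1), repeat=len(points)):
    table = dict(zip(points, values))
    if is_linear(table.__getitem__):
        linear_count += 1
```

The count agrees with the number $2^n$ of characters, in line with the theorem.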

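Definition 4.1.4. [Almost-linear function] For given $\varepsilon \in [0, 1]$, a Boolean function $f$ is said to be almost/approximately linear or $\varepsilon$-linear if $d(f, \mathcal{P}) \le \varepsilon$ for $\mathcal{P} = \{f : \mathbb{F}_2^n \to \{\pm 1\} : f \text{ linear}\} \subseteq \{\pm 1\}^{\mathbb{F}_2^n}$.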

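Note that linearity of $f$ means that $f(x)f(y) = f(x + y)$ for all $x, y \in \mathbb{F}_2^n$; here $f(x)f(y)$ is denoted multiplicatively since $f$ is $\{\pm 1\}$-valued. Theorem 4.1.5 shows that $\mathbb{P}_{x,y}[f(x)f(y) = f(x + y)] \ge 1 - \varepsilon$ implies that $f$ is $\varepsilon$-linear.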

Theorem 4.1.5. [Characters and almost-linearity (Theorem 1.30 in [6])] Let $\varepsilon \in [0, 1]$ and let $f : \mathbb{F}_2^n \to \{\pm 1\}$ be a Boolean function. Suppose

\[ \mathbb{P}_{x,y \sim \mathbb{F}_2^n}[f(x)f(y) = f(x + y)] \ge 1 - \varepsilon. \]

Then $f$ is $\varepsilon$-linear.

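Proof. We have

\[
\begin{aligned}
1 - \varepsilon &\le \mathbb{P}_{x,y \sim \mathbb{F}_2^n}[f(x)f(y) = f(x + y)] = \mathbb{P}[f(x)f(y)f(x + y) = 1] \\
&= \sum_{x,y} \frac{1}{2^{2n}} \mathbb{1}[f(x)f(y)f(x + y) = 1] = \sum_{x,y} \frac{1}{2^{2n}} \mathbb{1}\Big[\tfrac{1}{2} + \tfrac{1}{2} f(x)f(y)f(x + y) = 1\Big] \\
&= \sum_{x,y} \frac{1}{2^{2n}} \Big(\tfrac{1}{2} + \tfrac{1}{2} f(x)f(y)f(x + y)\Big) = \mathbb{E}_{x,y}\Big[\tfrac{1}{2} + \tfrac{1}{2} f(x)f(y)f(x + y)\Big] \\
&= \tfrac{1}{2} + \tfrac{1}{2} \mathbb{E}_{x,y}[f(x)f(y)f(x + y)] = \tfrac{1}{2} + \tfrac{1}{2} \mathbb{E}_{x}\big[f(x) \mathbb{E}_{y}[f(y)f(x + y)]\big],
\end{aligned}
\]

where the third line uses that $\tfrac{1}{2} + \tfrac{1}{2} f(x)f(y)f(x + y)$ only takes the values 0 and 1, and equals 1 exactly on the pairs $(x, y)$ on which $f$ is linear.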

Using Parseval’s identity (equation 2.1) and Theorem 2.1.14, we then have that this is equal to

\[ \frac{1}{2} + \frac{1}{2} \sum_{S \subseteq [n]} \hat{f}(S) \widehat{f * f}(S) = \frac{1}{2} + \frac{1}{2} \sum_{S \subseteq [n]} \hat{f}(S)^3. \]

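Therefore $\sum_{S \subseteq [n]} \hat{f}(S)^3 \ge 1 - 2\varepsilon$. Hence, choosing $S'$ such that $\hat{f}(S')$ is maximal and using that $\|f\|_2^2 = \sum_{S \subseteq [n]} \hat{f}(S)^2 = 1$, we obtain

\[ 1 - 2d(f, \chi_{S'}) = \langle f \mid \chi_{S'} \rangle = \hat{f}(S') = \sum_{S \subseteq [n]} \hat{f}(S') \hat{f}(S)^2 \ge \sum_{S \subseteq [n]} \hat{f}(S)^3 \ge 1 - 2\varepsilon, \]

which gives $d(f, \chi_{S'}) \le \varepsilon$. We conclude that by definition, $f$ is $\varepsilon$-linear.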
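The two key facts used in this proof, the formula $\mathbb{P}[f(x)f(y) = f(x+y)] = \tfrac{1}{2} + \tfrac{1}{2}\sum_S \hat{f}(S)^3$ and the resulting distance bound, can be checked by brute force for small $n$. The sketch below is only an illustration: subsets $S$ are represented by their indicator vectors, and the names `chi`, `fourier_coefficients` and `acceptance_probability` are hypothetical helpers, not notation from the thesis.

```python
from itertools import product
import random

n = 3
points = list(product((0, 1), repeat=n))

def add(x, y):
    return tuple((a + b) % 2 for a, b in zip(x, y))

def chi(S, x):
    """Character chi_S(x), with S given as an indicator vector in F_2^n."""
    return -1 if sum(s * xi for s, xi in zip(S, x)) % 2 else 1

def fourier_coefficients(f):
    """hat f(S) = E_x[f(x) chi_S(x)] for every S."""
    return {S: sum(f[x] * chi(S, x) for x in points) / len(points) for S in points}

def acceptance_probability(f):
    """P_{x,y}[f(x) f(y) = f(x + y)], computed over all pairs."""
    hits = sum(f[x] * f[y] == f[add(x, y)] for x in points for y in points)
    return hits / len(points) ** 2

rng = random.Random(1)  # fixed seed, arbitrary choice
f = {x: rng.choice((-1, 1)) for x in points}

coeffs = fourier_coefficients(f)
lhs = acceptance_probability(f)
rhs = 0.5 + 0.5 * sum(c ** 3 for c in coeffs.values())
# Distance to the character with the largest Fourier coefficient (Lemma 4.1.2).
distance_to_best_character = (1 - max(coeffs.values())) / 2
```

With $\varepsilon := 1 - \texttt{lhs}$, the theorem predicts `distance_to_best_character` $\le \varepsilon$.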

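A weakened form of the converse statement goes as follows. Suppose $f$ agrees with a character $\chi_S$ on at least $(1 - \varepsilon)2^n$ points. For uniformly random $x, y$, each of $x$, $y$ and $x + y$ is uniformly distributed, so by the union bound all three lie in the set of agreement with probability at least $1 - 3\varepsilon$. On such pairs $f(x)f(y) = \chi_S(x)\chi_S(y) = \chi_S(x + y) = f(x + y)$. Hence $\mathbb{P}_{x,y \sim \mathbb{F}_2^n}[f(x)f(y) = f(x + y)] \ge 1 - 3\varepsilon$.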

4.2 Testing of functions over Boolean space

A function is defined to be over Boolean space simply when $f : \mathbb{F}_2^n \to \mathbb{F}_2^n$ or one of the other group representatives; in this section we will use $\mathbb{F}_2^n$. This is done because the Fourier domain is not needed.

This section closely follows the exposition of the linearity test over Boolean space by Viola [13].

The goal of this section is to prove the following result by Samorodnitsky:

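Theorem 4.2.1. [Samorodnitsky linearity test] Let $\varepsilon \in (0, 1]$. Then there exist $\delta \in (0, 1]$, quasi-polynomial in $\varepsilon$, and $N$ such that for all $n > N$ and all functions $f : \mathbb{F}_2^n \to \mathbb{F}_2^n$ we have that

\[ \mathbb{P}_{x,y \sim \mathbb{F}_2^n}[f(x) + f(y) = f(x + y)] \ge \varepsilon \]

implies there exists an $n \times n$ matrix $M$ such that

\[ \mathbb{P}_{x \sim \mathbb{F}_2^n}[f(x) = Mx] \ge \delta. \]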

In the idea of the proof of this theorem, the graph of a function plays a large role.

Definition 4.2.2. [Graph of a function] Suppose $f : X \to Y$ is a function. Then define $G(f) := \{(x, f(x)) : x \in X\} \subseteq X \times Y$.

The proof of the theorem is based on the following idea: since

\[ (x, f(x)) + (y, f(y)) = (x + y, f(x) + f(y)) \in G(f) \quad \text{if and only if} \quad f(x + y) = f(x) + f(y), \]

the graph is a linear space if and only if $f$ is linear. Inspecting the algebraic structure of (subsets of) the graph allows us to construct the linear map needed.
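The equivalence between linearity of $f$ and closure of $G(f)$ under addition can be seen concretely on $\mathbb{F}_2^2$. In the sketch below the two example maps `swap` and `bitwise_and`, as well as the helper names, are assumptions picked for illustration.

```python
from itertools import product

n = 2
points = list(product((0, 1), repeat=n))

def add(x, y):
    return tuple((a + b) % 2 for a, b in zip(x, y))

def graph(f):
    """G(f) = {(x, f(x)) : x in F_2^n}."""
    return {(x, f(x)) for x in points}

def graph_closed_under_addition(f):
    """Check whether a + b lies in G(f) for all a, b in G(f)."""
    G = graph(f)
    return all((add(x, y), add(fx, fy)) in G for (x, fx) in G for (y, fy) in G)

def is_linear(f):
    return all(f(add(x, y)) == add(f(x), f(y)) for x in points for y in points)

def swap(x):
    """A linear map: exchange the two coordinates."""
    return (x[1], x[0])

def bitwise_and(x):
    """A non-linear map: first output bit is x_1 AND x_2."""
    return (x[0] & x[1], 0)
```

For `bitwise_and`, the pair $x = 01$, $y = 10$ witnesses the failure: the sum of the two graph points is $(11, 00)$, while $f(11) = 10$.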

To prove Samorodnitsky’s theorem, we need the following result, called the Balog-Szemerédi-Gowers (BSG) theorem:

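Theorem 4.2.3. [Balog-Szemerédi-Gowers] Let $\varepsilon \in (0, 1]$. Then there is $N$ such that for all $n > N$ and all $A \subseteq \mathbb{F}_2^n$ it holds that

\[ \mathbb{P}_{a,b \sim A}[a + b \in A] \ge \varepsilon \]

implies there exists a subset $A' \subseteq A$ such that

\[ |A'| \ge \frac{\varepsilon}{3} |A| \quad \text{and} \quad |2A'| \le \Big(\frac{6}{\varepsilon}\Big)^8 |A|. \]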

This theorem will not be proved, as it is out of the scope of this thesis. Viola’s article [13] contains a proof based on graph theory.

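Lemma 4.2.4. For all $\varepsilon > 0$, all sufficiently large $n$, all functions $f : \mathbb{F}_2^n \to \mathbb{F}_2^n$ and all $A \subseteq G(f)$ we have that

\[ 2^n \varepsilon \le |A| \le |\mathrm{Sp}(A)| \le \frac{2^n}{\varepsilon} \]

implies there exists an $n \times n$ matrix $M$ such that

\[ \mathbb{P}_{x \sim \mathbb{F}_2^n}[f(x) = Mx] \ge \frac{\varepsilon^3}{3}. \]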

Proof. Let $(v_1, \dots, v_a)$ be a basis for $\mathrm{Sp}(A)$. Extend this basis to $(v_1, \dots, v_k)$ such that the projection of $(v_1, \dots, v_k)$ onto the first $n$ coordinates spans $\mathbb{F}_2^n$. Writing $v_i = (x_i, y_i)$ for $i \in [k]$, one therefore has $\mathbb{F}_2^n = \mathrm{Sp}((x_1, \dots, x_k))$. Note that for $i \in [a]$, $y_i = f(x_i)$ since $A \subseteq G(f)$.

Working over $\mathbb{F}_2$, adding a vector to a basis doubles the size of the vector space spanned. Therefore

\[ 2^n \varepsilon \le |A| \le |\mathrm{Sp}(A)| = |\mathrm{Sp}((v_1, \dots, v_a))| = |\mathrm{Sp}((x_1, \dots, x_a))| \]

and hence we have $2^{n+k-a} \varepsilon \le |\mathrm{Sp}((x_1, \dots, x_k))| = |\mathbb{F}_2^n| = 2^n$, which gives $k - a \le \log(\tfrac{1}{\varepsilon})$. We also have from $2^a = |\mathrm{Sp}(A)| \le \frac{2^n}{\varepsilon}$ that $a - n \le \log(\tfrac{1}{\varepsilon})$. Together with the previous bound this gives $k - n = (k - a) + (a - n) \le 2\log(\tfrac{1}{\varepsilon})$.

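Since $\mathrm{Sp}((x_1, \dots, x_k)) = \mathbb{F}_2^n$ one can column-reduce the $2n \times k$ matrix

\[ V_k := [v_1 | \cdots | v_k] = \begin{bmatrix} x_1 & \cdots & x_k \\ y_1 & \cdots & y_k \end{bmatrix} \quad \text{into} \quad V_k = W_k \cdot L = \begin{bmatrix} \mathrm{Id}_n & 0 \\ T & U \end{bmatrix} \cdot L \]

for some invertible $k \times k$ matrix $L$, an $n \times n$ matrix $T$, and an $n \times (k - n)$ matrix $U$. Hence every vector $(x, f(x)) \in A$ can be written as a linear combination $(x, f(x)) = \sum_{i=1}^k \alpha_i v_i = V_k \alpha = W_k L \alpha = W_k w$ for $\alpha = (\alpha_1, \dots, \alpha_k) \in \mathbb{F}_2^k$ and $w = L\alpha \in \mathbb{F}_2^k$. Using block matrix multiplication,

\[ \begin{pmatrix} x \\ f(x) \end{pmatrix} = \begin{bmatrix} \mathrm{Id}_n & 0 \\ T & U \end{bmatrix} w, \]

we see that $w = (x, y_x)$ for a $y_x \in \mathbb{F}_2^{k-n}$ and $f(x) = Tx + Uy_x$.

Now, for every $(x, f(x)) \in A$ there exists a $y_x \in \mathbb{F}_2^{k-n}$ such that $f(x) = Tx + Uy_x$.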

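For each $y \in \mathbb{F}_2^{k-n}$, let $w(y) := \sum_{x \in \mathbb{F}_2^n} \mathbb{1}[f(x) = Tx + Uy]$. Choose $y'$ such that $w(y')$ is maximal. Then we have

\[ w(y') \ge \frac{1}{2^{k-n}} \sum_{y \in \mathbb{F}_2^{k-n}} w(y) = \frac{1}{2^{k-n}} \sum_{\substack{y \in \mathbb{F}_2^{k-n} \\ x \in \mathbb{F}_2^n}} \mathbb{1}[f(x) = Tx + Uy] \ge \frac{1}{2^{k-n}} |A|, \]

since the sum ranges over at least all $x$ with $(x, f(x)) \in A$ together with their corresponding $y_x$ such that $f(x) = Tx + Uy_x$. Consequently,

\[ w(y') \ge \frac{1}{2^{k-n}} |A| \ge \frac{2^n \varepsilon}{2^{k-n}} = 2^{2n-k} \varepsilon. \]

Take $u' := Uy'$. Then

\[ \mathbb{P}_{x \sim \mathbb{F}_2^n}[f(x) = Tx + u'] = \sum_{x \in \mathbb{F}_2^n} \mathbb{1}[f(x) = Tx + Uy'] \frac{1}{2^n} = \frac{w(y')}{2^n} \ge \frac{2^{2n-k} \varepsilon}{2^n} = 2^{n-k} \varepsilon \ge \varepsilon^3. \]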
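The final inequality $2^{n-k}\varepsilon \ge \varepsilon^3$ follows from the bound $k - n \le 2\log(\tfrac{1}{\varepsilon})$ obtained earlier in the proof. Spelled out, assuming (as the use of powers of 2 suggests) that the logarithm is taken to base 2:

```latex
\[
2^{n-k}\varepsilon
  = 2^{-(k-n)}\varepsilon
  \ge 2^{-2\log_2(1/\varepsilon)}\varepsilon
  = \big(2^{\log_2 \varepsilon}\big)^{2}\varepsilon
  = \varepsilon^{2}\cdot\varepsilon
  = \varepsilon^{3}.
\]
```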

Viola [13] now gives a stochastic argument that there is an $i \in [n]$ such that

\[ \mathbb{P}_{x \sim \mathbb{F}_2^n}[f(x) = Tx + u'] \ge \frac{99}{100} \varepsilon^3; \]

taking this $i$ and defining $M : x \mapsto Tx + x_i u'$, he claims that it follows for large $n$ that

\[ \mathbb{P}_{x \sim \mathbb{F}_2^n}[f(x) = Mx] \ge \frac{\varepsilon^3}{3}, \]

as was to be proven.

Finally, we can show correctness of Samorodnitsky’s linearity test.

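Theorem 4.2.1 (Samorodnitsky linearity test). Let $\varepsilon \in (0, 1]$. Then there exist $\delta \in (0, 1]$, quasi-polynomial in $\varepsilon$, and $N$ such that for all $n > N$ and all functions $f : \mathbb{F}_2^n \to \mathbb{F}_2^n$ we have that

\[ \mathbb{P}_{x,y \sim \mathbb{F}_2^n}[f(x) + f(y) = f(x + y)] \ge \varepsilon \]

implies there exists an $n \times n$ matrix $M$ such that

\[ \mathbb{P}_{x \sim \mathbb{F}_2^n}[f(x) = Mx] \ge \delta. \]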


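Proof. Let $\varepsilon \in (0, 1]$. Take $N$ from the Balog-Szemerédi-Gowers theorem. Let $n > N$ and $f : \mathbb{F}_2^n \to \mathbb{F}_2^n$. Suppose

\[ \mathbb{P}_{x,y \sim \mathbb{F}_2^n}[f(x) + f(y) = f(x + y)] \ge \varepsilon. \tag{4.1} \]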


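Let $x, y \in \mathbb{F}_2^n$. Then with $a = (x, f(x))$, $b = (y, f(y))$ we have $G(f) \ni a + b = (x + y, f(x) + f(y))$ if and only if $f(x + y) = f(x) + f(y)$. Hence equation 4.1 can be written as

\[ \mathbb{P}_{a,b \sim G(f)}[a + b \in G(f)] \ge \varepsilon. \]

With the Balog-Szemerédi-Gowers theorem (Theorem 4.2.3) one gets the existence of $A \subseteq G(f)$ such that $|A| \ge \frac{\varepsilon}{3} |G(f)|$ and $|2A| \le \big(\frac{6}{\varepsilon}\big)^8 |G(f)|$.

Again noting that $A \subseteq G(f)$, define $B := \{x : (x, f(x)) \in A\}$ as the projection of $A$ onto elements of $\mathbb{F}_2^n$. We apply the quasi-polynomial Freiman-Ruzsa theorem to this set for $K := \big(\frac{6}{\varepsilon}\big)^8 \cdot \frac{3}{\varepsilon}$; this choice of $K$ comes from $|2A| \le \big(\frac{6}{\varepsilon}\big)^8 |G(f)| \le \big(\frac{6}{\varepsilon}\big)^8 \frac{3}{\varepsilon} |A|$. We then know there is $B' \subseteq B$ with $|B'| \ge 2^n K^{-O(\log(K)^3)}$ and $|\mathrm{Sp}(B')| \le 2K^5 |B'|$. Hence

\[ 2^n K^{-O(\log(K)^3)} \le |B'| \le |\mathrm{Sp}(B')| \le 2K^5 |B'| \le 2^n K^{O(\log(K)^3)} \le \frac{2^n}{K^{-O(\log(K)^3)}}. \]

Finally, let $\varepsilon' := K^{-O(\log(K)^3)}$ and $\delta := \frac{\varepsilon'^3}{3}$. Then one concludes using Lemma 4.2.4, applied with $\varepsilon'$ in place of $\varepsilon$, that there exists an $n \times n$ matrix $M$ such that

\[ \mathbb{P}_{x \sim \mathbb{F}_2^n}[f(x) = Mx] \ge \delta. \]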
This proves correctness of Samorodnitsky’s linearity test.

4.3 Algorithmic description

In this section we give a short description of how to apply the previous results from this chapter. Note that we can always encode $\mathbb{F}_2$ into $\{\pm 1\}$ as needed for Boolean functions $f : \mathbb{F}_2^n \to \mathbb{F}_2$ in section 4.1; in this section, we use $\mathbb{F}_2$.

Now, let us consider a linearity test (originally for Boolean functions), stated as an algorithm, by Blum, Luby and Rubinfeld.

Definition 4.3.1. [BLR Test (Section 1.6 in [6])] Given query access to a function $f$, either Boolean or over Boolean space, the BLR (Blum, Luby, Rubinfeld) test is as follows:

Choose $N \in [2^{2n}]$ for the number of times to iterate, i.e. choose a ratio $\varepsilon = \frac{N}{2^{2n}}$.
for k ← 1, . . . , N do
    Choose x, y uniformly from $\mathbb{F}_2^n$.
    Query f at x, y, x + y.
    if f(x + y) ≠ f(x) + f(y) then
        break
    end if
end for
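The loop above translates almost directly into code. The following Python sketch is one possible rendering under stated assumptions: vectors of $\mathbb{F}_2^n$ are encoded as integers in $[0, 2^n)$ with XOR as addition (for both inputs and outputs), and the function name `blr_test`, its parameters and the two example functions are hypothetical choices, not taken from [6].

```python
import random

def blr_test(f, n, num_rounds, rng=random):
    """Run num_rounds rounds of the BLR test on f with query access only.

    Returns False as soon as a triple (x, y, x + y) violates linearity,
    and True if all rounds pass.
    """
    for _ in range(num_rounds):
        x = rng.getrandbits(n)  # uniform element of F_2^n
        y = rng.getrandbits(n)
        if f(x ^ y) != f(x) ^ f(y):
            return False  # the "break" case: f is certainly not linear
    return True

def parity(x):
    """A linear map F_2^n -> F_2: the sum of all coordinates modulo 2."""
    return bin(x).count("1") % 2

def constant_one(x):
    """A non-linear map: f(x + y) = 1 but f(x) + f(y) = 0 for every pair."""
    return 1

passes = blr_test(parity, 8, 100, random.Random(0))
fails = not blr_test(constant_one, 8, 10, random.Random(0))
```

A linear function passes every round regardless of the random choices, whereas the constant-one function is rejected in the very first round, since every triple violates linearity.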

Now suppose we indeed have this query access (which means that we can ask for the output given an input, but we do not know how the function behaves as a whole) to a function $f$, either $\mathbb{F}_2^n \to \mathbb{F}_2$ or $\mathbb{F}_2^n \to \mathbb{F}_2^n$. Suppose also we have the promise that $f$ is either linear, or far from linear. Then, if the BLR test fails, we know for sure that $f$ is far from linear. If the function $f$ passes this test for a fraction $\varepsilon$ of pairs of points, we know, depending on the type of function, that there exists a linear map with which $f$ agrees on a fraction $\delta$ of points. For the values of $\delta$, we refer to the earlier sections.

Note that this does not show which linear map f is close to in the Hamming distance.


Conclusion

In this thesis a path has been laid down towards describing linearity tests. First, Fourier analysis on Boolean functions and corresponding results in hypercontractivity have been described thoroughly using results described by O’Donnell [6]. The path laid down by Lovett [4] to prove the quasi-polynomial Freiman-Ruzsa theorem was also clear, which resulted in the chapter on additive combinatorics. After that, tests for two types of functions have been dealt with, namely Boolean functions ($\mathbb{F}_2^n \to \{\pm 1\}$) and functions over Boolean space ($\mathbb{F}_2^n \to \mathbb{F}_2^n$). The Samorodnitsky linearity test (Theorem 4.2.1) has not been proven in full detail, as the influence of the dimension $n$ of the space $\mathbb{F}_2^n$ is not clear from the proof. Furthermore, the stochastic argument as given in Viola’s article [13] was not fully clear and has therefore not been included. Finally, an algorithmic description was given, using the BLR test. As can be seen, there is still work that could be done on the final topics described in this thesis.

There are several ways to further the research beyond the contents of this thesis, both in breadth and depth. As mentioned in the thesis, Chang [2] and Sanders [9] proved theorems that hold in a more general setting than the ones described here. In particular, finite cyclic groups can be considered in the setting of additive combinatorics; or the theorems can be stated in a more representation theoretic or measure theoretic context. Also, Samorodnitsky and Trevisan [7] [8] described more about the field of property testing. Of course Fourier analysis on Boolean functions can be applied in many other fields as well, as described by O’Donnell [6].


Popular Summary (translated from Dutch)

We consider so-called Boolean functions. Their domain consists of words over 0 and 1, also called n-bit strings, such as 00001101. Given a fixed length of such words, a Boolean function maps every word to a 0 or a 1. Since mathematicians often work with numbers modulo n, the set of n-bit strings is traditionally described by $\mathbb{F}_2^n$, with addition modulo 2 as operation.

To give an example of such a function, words of length 3 can be mapped to the bit that occurs most often:

000 ↦ 0 001 ↦ 0

010 ↦ 0 011 ↦ 1

100 ↦ 0 101 ↦ 1

110 ↦ 1 111 ↦ 1

Depending on the context, the words are also described by $\{-1, 1\}^n$ under multiplication, or by subsets of $\{1, 2, \dots, n\}$ under the symmetric difference.

Applications of Boolean functions can be found in, among others, modern cryptography (securing, for example, banking systems and the internet), social choice theory (where a Boolean function represents a voting rule, the input the votes, and the output the outcome), property testing, and within mathematics itself.

An example of property testing is linearity testing; theoretical computer scientists are interested, among other things, in the linearity of Boolean functions, that is, in whether such a function satisfies $f(x + y) = f(x) + f(y)$ for all $x, y$. The most naive way to find out is of course to go through all points. This is not very smart, since the time required grows exponentially as the words get longer. One would have to go through $\binom{2^n}{2}$ pairs of points; for $n = 20$ this is already $5.5 \cdot 10^{11}$. By giving up certainty and being satisfied with knowing that the function is linear with high probability, one can do better.

This thesis aims to prove that this is indeed possible, and to describe how it should be done given the promise that the function is either linear, or far from linear (so non-linear on a great many pairs of points).

For the first-mentioned functions it is fairly easy to quantify how often one should test. Blum, Luby and Rubinfeld showed as early as 1990 [1] that the function only needs to be evaluated at few points to be able to say that it is not linear, or that with high probability it is close to a linear function (so agrees with a linear function on many points).

Furthermore, one can consider functions that map an n-bit string to another n-bit string. Making statements about linearity testing for this kind of function requires much more; the results needed for this were only developed and published after the turn of the millennium. This thesis attempts to list the prerequisites, and then to give a concrete approach to the problem.


Bibliography

[1] Manuel Blum, Michael Luby, and Ronitt Rubinfeld. “Self-testing/correcting with applications to numerical problems”. In: Proceedings of the twenty-second annual ACM symposium on Theory of computing. ACM. 1990, pp. 73–83.

[2] Mei-Chu Chang et al. “A polynomial bound in Freiman’s theorem”. In: Duke mathematical journal 113.3 (2002). http://projecteuclid.org/euclid.dmj/1087575313, pp. 399–420.

[3] Yuan Shih Chow and Henry Teicher. Probability theory: independence, interchangeability, martingales. Springer Science & Business Media, 2012. Chap. 10.3, p. 356.

[4] Shachar Lovett. “An exposition of Sanders quasi-polynomial Freiman-Ruzsa theorem.” In: 19 (2012). http://theoryofcomputing.org/articles/gs006/, p. 29.

[5] Ryan O’Donnell. Additive Combinatorics and the Polynomial Freiman–Rusza Conjecture. 2012. url: http://www.contrib.andrew.cmu.edu/~ryanod/?p=1173 (visited on 07/04/2016).

[6] Ryan O’Donnell. Analysis of Boolean functions. http://analysisofbooleanfunctions.org/. Cambridge University Press, 2014.

[7] Alex Samorodnitsky. “Low-degree tests at large distances”. In: Proceedings of the thirty-ninth annual ACM symposium on Theory of computing. ACM. 2007, pp. 506–515.

[8] Alex Samorodnitsky and Luca Trevisan. “Gowers uniformity, influence of variables, and PCPs”. In: SIAM Journal on Computing 39.1 (2009), pp. 323–360.

[9] Tom Sanders. “On the Bogolyubov–Ruzsa lemma”. In: Analysis & PDE 5.3 (2012), pp. 627–655.

[10] René L Schilling. Measures, integrals and martingales. Cambridge University Press, 2005, p. 106.

[11] Benjamin Steinberg. Representation theory of finite groups: an introductory approach. Springer Science & Business Media, 2011.

[12] Terence Tao and Van H Vu. Additive combinatorics. https://terrytao.wordpress.com/books/additive-combinatorics/. Cambridge University Press, 2006, p. 269.

[13] Emanuele Viola. “Selected Results in Additive Combinatorics: An Exposition.” In: Theory of Computing, Graduate Surveys 3 (2011). http://theoryofcomputing.org/articles/gs003/, pp. 1–15.
