arXiv:1808.04001v2 [math.NT] 7 Sep 2019

ON THE CHOWLA AND TWIN PRIMES CONJECTURES OVER $\mathbb{F}_q[T]$

WILL SAWIN AND MARK SHUSTERMAN

Abstract. Using geometric methods, we improve on the function field version of the Burgess bound, and show that, when restricted to certain special subspaces, the Möbius function over $\mathbb{F}_q[T]$ can be mimicked by Dirichlet characters. Combining these, we obtain a level of distribution close to 1 for the Möbius function in arithmetic progressions, and resolve Chowla’s $k$-point correlation conjecture with large uniformity in the shifts. Using a function field variant of a result by Fouvry-Michel on exponential sums involving the Möbius function, we obtain a level of distribution beyond 1/2 for irreducible polynomials, and establish the twin prime conjecture in a quantitative form. All these results hold for finite fields satisfying a simple condition.

1. Introduction

Our main results are the resolutions of two open problems in number theory, except with the ring of integers $\mathbb{Z}$ replaced by the ring of polynomials $\mathbb{F}_q[T]$, under suitable assumptions on $q$.

We first fix some notation. Define the norm of a nonzero $f \in \mathbb{F}_q[T]$ to be

$$|f| = q^{\deg(f)} = |\mathbb{F}_q[T]/(f)|. \tag{1.1}$$

The degree of the zero polynomial is $-\infty$, so we set its norm to be 0.

1.1. The main result - twin primes. Our main result covers the twin prime conjecture in its quantitative form. The latter is the 2-point prime tuple conjecture of Hardy-Littlewood, predicting for a nonzero integer $h$ that

$$\#\{X \le n \le 2X : n \text{ and } n+h \text{ are prime}\} \sim S(h)\,\frac{X}{\log^2(X)}, \quad X \to \infty, \tag{1.2}$$

where

$$S(h) = \prod_{p} (1 - p^{-1})^{-2}\,(1 - p^{-1} - p^{-1}\mathbf{1}_{p \nmid h}), \tag{1.3}$$

with $\mathbf{1}_{p \nmid h}$ equal to 1 if $h$ is not divisible by $p$, and 0 otherwise.

For the function field analogue, we set

$$S_q(h) = \prod_{P} (1 - |P|^{-1})^{-2}\,(1 - |P|^{-1} - |P|^{-1}\mathbf{1}_{P \nmid h}) \tag{1.4}$$

where $q$ is a prime power, $P$ ranges over all primes (monic irreducibles) of $\mathbb{F}_q[T]$, and $h \in \mathbb{F}_q[T]$ is nonzero.

W.S. served as a Clay Research Fellow while working on this paper.

Theorem 1.1. For an odd prime number $p$, and a power $q$ of $p$ satisfying $q > 685090p^2$, the following holds. For any nonzero $h \in \mathbb{F}_q[T]$ we have

$$\#\{f \in \mathbb{F}_q[T] : |f| = X,\ f \text{ and } f+h \text{ are prime}\} \sim S_q(h)\,\frac{X}{\log_q^2(X)} \tag{1.5}$$

as $X \to \infty$ through powers of $q$. Moreover, we have a power saving (depending on $q$) in the asymptotic above.

For example, the 2-point Hardy-Littlewood conjecture holds over

$$\mathbb{F}_{3^{15}},\ \mathbb{F}_{5^{11}},\ \mathbb{F}_{7^{9}},\ \mathbb{F}_{11^{8}},\ \mathbb{F}_{685093^{3}}. \tag{1.6}$$
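The exponents in (1.6) can be recovered from the hypothesis $q > 685090p^2$ of Theorem 1.1. The following is a minimal sketch of that arithmetic; the helper name `min_exponent` is ours and purely illustrative, and the code only checks the numerical inequality, not primality of $p$.

```python
# Constant from the hypothesis q > 685090 * p^2 of Theorem 1.1.
THRESHOLD = 685090


def min_exponent(p: int) -> int:
    """Least n >= 1 such that q = p**n satisfies q > THRESHOLD * p**2.

    Hypothetical helper for illustration; p is assumed to be an odd prime.
    """
    n = 1
    while p**n <= THRESHOLD * p**2:
        n += 1
    return n


# Characteristics appearing in (1.6), mapped to the smallest admissible exponent.
examples = {p: min_exponent(p) for p in (3, 5, 7, 11, 685093)}
print(examples)
```

Each listed field is thus the smallest field of its characteristic that satisfies the stated inequality.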

In case $h \in \mathbb{F}_q[T]$ is a monomial, the fact that the count above tends to $\infty$ as $X \to \infty$ (under the weaker assumption $q > 105$) has been proved in [CHLPT15, Theorem 1.3] using an idea of Entin. Their work builds on the recent dramatic progress on this problem over the integers, particularly [Ma15]. The strongest result known over the integers is [PM14, Theorem 16(i)], which says that for any ‘admissible tuple’ of 50 integers, there exists at least one difference $h$ between two elements in the tuple such that there are infinitely many pairs of primes separated by $h$.

Remark 1.2. Our proof of Theorem 1.1 also establishes the analog of the Goldbach problem over function fields, and can be modified to treat more general linear forms in the primes.

The proof of Theorem 1.1 passes through some intermediate results which may be of independent interest. We will discuss these results in the remainder of the introduction. Once Theorem 1.7 and Theorem 1.9 below are established, Theorem 1.1 will follow from a quick argument involving a convolution identity relating the von Mangoldt function, which can be used to count primes, to the Möbius function.

1.2. The key ingredient - Chowla’s conjecture. The main ingredient in the proof of Theorem 1.1 is the removal of the ‘parity barrier’. More precisely, we confirm Chowla’s $k$-point correlation conjecture over $\mathbb{F}_q[T]$ for some prime powers $q$. Over the integers, this conjecture predicts that for any fixed distinct integers $h_1, \dots, h_k$, one has

$$\sum_{n \le X} \mu(n+h_1)\mu(n+h_2)\cdots\mu(n+h_k) = o(X), \quad X \to \infty. \tag{1.7}$$

The only completely resolved case is $k = 1$, which is essentially equivalent to the prime number theorem.


For the function field analogue, we recall that the Möbius function of a monic polynomial $f$ is given by

$$\mu(f) = \begin{cases} 0, & \#\{P : P^2 \mid f\} > 0 \\ 1, & \#\{P : P \mid f\} \equiv 0 \bmod 2 \\ -1, & \#\{P : P \mid f\} \equiv 1 \bmod 2, \end{cases} \tag{1.8}$$

and denote by $\mathbb{F}_q[T]^+$ the set of monic polynomials over $\mathbb{F}_q$.
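To make definition (1.8) concrete, here is a small brute-force sketch that tabulates $\mu$ on all monic polynomials of degree at most 4 over $\mathbb{F}_3$ by sieving. The field size, the degree cutoff and all helper names are illustrative assumptions, not taken from the paper. As a sanity check it uses the standard fact that $\sum_{\deg f = n} \mu(f)$ over monic $f$ equals $1$, $-q$, $0$ for $n = 0$, $n = 1$, $n \ge 2$ respectively.

```python
from itertools import product

P_CHAR = 3   # work over F_3 (illustrative choice)
MAX_DEG = 4  # largest degree enumerated (illustrative choice)


def poly_mul(a, b):
    """Multiply coefficient tuples (constant term first) over F_p."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % P_CHAR
    return tuple(out)


def monic_polys(deg):
    """Yield every monic polynomial of the given degree."""
    for lower in product(range(P_CHAR), repeat=deg):
        yield lower + (1,)


# Sieve: for each monic h, record the first irreducible factor (in enumeration
# order) and the corresponding cofactor.
first_factor, cofactor = {}, {}
for d in range(1, MAX_DEG + 1):
    for f in monic_polys(d):
        if f in first_factor:
            continue  # f was hit by an earlier irreducible, so it is composite
        for e in range(MAX_DEG - d + 1):
            for g in monic_polys(e):
                h = poly_mul(f, g)
                if h not in first_factor:
                    first_factor[h], cofactor[h] = f, g

# mu(f) = 0 if the first irreducible factor divides the cofactor, else -mu(cofactor).
mu = {(1,): 1}
for d in range(1, MAX_DEG + 1):
    for f in monic_polys(d):
        prime, g = first_factor[f], cofactor[f]
        repeated = g != (1,) and first_factor[g] == prime
        mu[f] = 0 if repeated else -mu[g]

degree_sums = {d: sum(mu[f] for f in monic_polys(d)) for d in range(MAX_DEG + 1)}
print(degree_sums)
```

The vanishing of the sums in degree at least 2 is the exact form of the $k = 1$ cancellation over $\mathbb{F}_q[T]$.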

Theorem 1.3. For an odd prime number $p$, an integer $k \ge 1$, and a power $q$ of $p$ satisfying $q > p^2k^2e^2$, the following holds. For distinct $h_1, \dots, h_k \in \mathbb{F}_q[T]$ we have

$$\sum_{\substack{f \in \mathbb{F}_q[T]^+ \\ |f| \le X}} \mu(f+h_1)\mu(f+h_2)\cdots\mu(f+h_k) = o(X), \quad X \to \infty. \tag{1.9}$$

For instance, the 2-point Chowla conjecture holds over

$$\mathbb{F}_{3^{6}},\ \mathbb{F}_{5^{5}},\ \mathbb{F}_{7^{4}},\ \mathbb{F}_{31^{3}}. \tag{1.10}$$
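As a quick check of these examples (our own arithmetic, under the hypothesis $q > p^2k^2e^2$ of Theorem 1.3 with $k = 2$), each listed field is the smallest power of its characteristic that clears the threshold, and 31 is the smallest prime whose cube does:

```latex
\begin{align*}
k = 2:\quad  & p^2 k^2 e^2 = 4e^2 p^2 \approx 29.56\,p^2, \\
p = 3:\quad  & 3^5 = 243 \;<\; 4e^2 \cdot 3^2 \approx 266.0 \;<\; 729 = 3^6, \\
p = 5:\quad  & 5^4 = 625 \;<\; 4e^2 \cdot 5^2 \approx 738.9 \;<\; 3125 = 5^5, \\
p = 7:\quad  & 7^3 = 343 \;<\; 4e^2 \cdot 7^2 \approx 1448.3 \;<\; 2401 = 7^4, \\
p = 31:\quad & 31^2 = 961 \;<\; 4e^2 \cdot 31^2 \approx 28403.5 \;<\; 29791 = 31^3.
\end{align*}
```

For $q = p^3$ the condition reads $p > 4e^2 \approx 29.56$, which fails for $p = 29$ and holds for $p = 31$.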

In fact, in Theorem 1.3 we obtain a power saving inversely proportional to $p$, and the shifts $h_1, \dots, h_k$ can be as large as any fixed power of $X$ (the corresponding assumption on $q$ becomes stronger as this power grows larger). In Corollary 6.1 we also get cancellation in case the sum is restricted to prime polynomials $f$.

Over the integers, the $k = 2$ case of the Chowla conjecture, with logarithmic averaging, was proven by Tao [Tao16, Theorem 3], building on earlier breakthrough work of Matomäki and Radziwiłł [MR16]. The $k$ odd case, again with logarithmic averaging, was handled by Tao and Teräväinen [TT17]. Generalizations of some of these arguments to the function field setting are part of a work in progress by Klurman, Mangerel, and Teräväinen.

In contrast to these works, which deal with any sufficiently general (i.e. non-pretentious) multiplicative function, our result relies on special properties of the Möbius function (in positive characteristic). Specifically, we observe that for any fixed polynomial $r$, the function $\mu(r + s^p)$ essentially equals $\chi_{D_r}(s + c_r)$ where $\chi_{D_r}$ is a quadratic Dirichlet character and $c_r$ is a shift, both depending only on $r$ (and not on $s$). This observation is very closely related to the properties of Möbius described in [CCG08], specifically [CCG08, Theorem 4.8]. Conrad, Conrad, and Gross prove a certain quasiperiodicity property in $s$ for a general class of expressions of the form $r + s^p$, while we give a more precise description via Dirichlet characters in a special case.

Our observation on $\mu(r + s^p)$ arises from the connection between the parity of the number of prime factors of a squarefree $f \in \mathbb{F}_q[T]$, and the sign (inside the symmetric group) of the Frobenius automorphism acting on the roots of $f$. In odd characteristic, this sign is determined by the value of the quadratic character of $\mathbb{F}_q^\times$ on the discriminant of $f$, i.e. the resultant of $f$ and its derivative $f'$. In characteristic $p$, the derivative of $f = r + s^p$ is equal to the derivative of $r$, so the aforementioned sign of Frobenius is determined by the quadratic character of the resultant of $f$ with the fixed polynomial $r'$. The latter is a quadratic Dirichlet character of $f$, and thus equals an additively shifted Dirichlet character of $s$.
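The key step above, that $(r + s^p)' = r'$ in characteristic $p$, can be checked directly on an example. The sketch below works over $\mathbb{F}_5$ with two arbitrarily chosen polynomials; the field, the polynomials and the helper names are illustrative assumptions only.

```python
P_CHAR = 5  # illustrative odd characteristic; coefficients live in F_5


def trim(a):
    """Drop trailing zero coefficients (lists store the constant term first)."""
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a


def add(a, b):
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return trim([(x + y) % P_CHAR for x, y in zip(a, b)])


def mul(a, b):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % P_CHAR
    return trim(out)


def power(a, n):
    result = [1]
    for _ in range(n):
        result = mul(result, a)
    return result


def derivative(a):
    """Formal derivative with respect to T, reduced mod p."""
    return trim([(i * c) % P_CHAR for i, c in enumerate(a)][1:])


r = [2, 0, 1, 3]  # r = 2 + T^2 + 3T^3
s = [1, 4, 0, 2]  # s = 1 + 4T + 2T^3
s_to_p = power(s, P_CHAR)
f = add(r, s_to_p)

# Over F_5, s^5 only involves exponents divisible by 5, so its derivative vanishes.
print(derivative(s_to_p), derivative(f) == derivative(r))
```

Since every exponent occurring in $s^p$ is a multiple of $p$, its derivative is zero, which is why all polynomials $r + s^p$ with $r$ fixed share the derivative $r'$.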

To use our observation, we restrict the sum in Theorem 1.3 to $f$ of the form $r + s^p$ for any fixed $r$, and obtain a short sum in $s$ of a product of additively shifted Dirichlet characters. As the conductors of these characters are typically essentially coprime, we arrive at a short sum of a single Dirichlet character.

Typically in analytic number theory, short character sums are handled by the method of Burgess, who showed in [Bur63] that for a real number $\eta > 1/4$, a squarefree integer $M$, a real number $X \ge |M|^{\eta}$, and a nonprincipal Dirichlet character $\chi$ mod $M$, one has

$$\sup_{s \in \mathbb{Z}} \left| \sum_{|a| \le X} \chi(s+a) \right| = o(X), \quad |M| \to \infty. \tag{1.11}$$

Refining the method of Burgess is the focus of several works, but the exponent 1/4 has not yet been improved (even conditionally). However, in the function field setting, we can do better by a geometric method, as long as $q$ is sufficiently large.

Theorem 1.4. Fix $\eta > 0$. Then for a prime power $q > e^2/\eta^2$ the following holds. For a squarefree $M \in \mathbb{F}_q[T]$, a real number $X \ge |M|^{\eta}$, and a nonprincipal Dirichlet character $\chi$ mod $M$, we have

$$\sup_{s \in \mathbb{F}_q[T]} \left| \sum_{|a| \le X} \chi(s+a) \right| = o(X), \quad |M| \to \infty. \tag{1.12}$$

By further enlarging $q$, we get arbitrarily close to square root cancellation. This is stated more precisely in Corollary 2.7.

To prove Theorem 1.4, we express the problem geometrically, viewing the short interval $\{s + a : |a| \le X\}$ as an affine space over $\mathbb{F}_q$, and the character $\chi$ as arising from a sheaf on that space. Following a strategy from [Hoo91, appendix by Katz], we use vanishing cycles theory to compare the cohomology of this sheaf for the $s = 0$ short interval and its cohomology for a general short interval. Vanishing cycles can only occur when the vanishing locus of $\chi$ is not (geometrically) a simple normal crossings divisor. Arguing as in [Kat89], we split the modulus $M$ of $\chi$ into a product of distinct linear terms over $\overline{\mathbb{F}_q}$, which makes our vanishing locus a union of the hyperplanes where the linear terms vanish, so we can check that this is a simple normal crossings divisor away from some isolated points. This implies that the cohomology groups vanish until almost the middle degree. Since we can precisely calculate the vanishing cycles at the isolated points, we get very good control of the dimensions of the cohomology groups as well.

Remark 1.5. The relation between Möbius and multiplicative characters is less powerful the larger $p$ is, as then fewer polynomials share a given derivative. On the other hand, our geometric character sum bounds become stronger as $q$ grows. Thus, to make this method of proving Chowla work, we need $q$ to be sufficiently large with respect to $p$.

Remark 1.6. The study of the statistics of polynomial factorizations by examining Frobenius as an element of the symmetric group has been very fruitful in the ‘large finite field limit’, where (in the notation of Theorem 1.3) $X$ is kept fixed and $q$ is allowed to grow. We refer to [CaRu14], [Ca15], [GS18] (and references therein) for the large finite field analogs of Theorem 1.3 which save a fixed power of $q$. Our methods likely give an improved savings in the large finite field limit when the characteristic is fixed, as long as the degrees of the polynomials are sufficiently large with respect to the characteristic, but we have not carefully calculated the resulting bounds in this range.

1.3. Further ingredients - level of distribution estimates. Another ingredient in the proof of Theorem 1.1 is an improvement of the level of distribution of the Möbius and von Mangoldt functions in arithmetic progressions. Over the integers, assuming the Generalized Riemann Hypothesis (GRH), this level of distribution is (at least) 1/2, which means that

$$\sum_{\substack{n \le X \\ n \equiv a \bmod M}} \mu(n) = o\left(\frac{X}{|M|}\right), \qquad \sum_{\substack{n \le X \\ n \equiv a \bmod M}} \Lambda(n) = \frac{X}{\varphi(M)} + o\left(\frac{X}{|M|}\right) \tag{1.13}$$

where $M, a$ are coprime integers, and $|M| \le X^{\frac{1}{2}-\epsilon}$ (for any fixed $\epsilon > 0$ and $A \in \mathbb{R}$). For Möbius over $\mathbb{F}_q[T]$, we obtain a level of distribution close to 1.

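Theorem 1.7. Fix $\eta > 0$. For an odd prime number $p$, and a power $q$ of $p$ with $q > p^2e^2\left(\frac{2}{\eta} - 1\right)^2$, the following holds. For coprime $M, a \in \mathbb{F}_q[T]$, and a real number $X$ with $X^{1-\eta} \ge |M|$ we have

$$\sum_{\substack{f \in \mathbb{F}_q[T]^+ \\ |f| \le X \\ f \equiv a \bmod M}} \mu(f) = o\left(\frac{X}{|M|}\right), \quad |M| \to \infty. \tag{1.14}$$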

As in the previous theorems, we obtain a power savings estimate. Here however (as opposed to Theorem 1.3), every $f$ in our sum may have a different derivative, so a somewhat more elaborate implementation of our observation on the Möbius function is required. We put $f = Mg + a$, and wishfully write

$$\mu(Mg + a) \approx \mu(M)\,\mu\left(g + \frac{a}{M}\right) \tag{1.15}$$

in order to create coincidences among the derivatives of the inputs to the Möbius function. This is carried out more formally in Lemma 3.2, where we show that for a power $q$ of an odd prime $p$, and coprime $a, M \in \mathbb{F}_q[T]$, the function $s \mapsto \mu(a + s^pM)$ is essentially proportional to an additive shift of a (quadratic) Dirichlet character in $s$, with the modulus of the character depending on $a$ and $M$ in an explicit way. To visualize the power of this claim, we view $\mathbb{F}_q[T]$ as a rank $p$ lattice over its subring $\mathbb{F}_q[T^p]$. Restricting the Möbius function to any line in this lattice gives a Dirichlet character whose modulus varies with the line.
In order to deduce from Theorem 1.7 an improved level of distribution for primes, we establish in the appendix a function field variant of [FM98] giving quasi-orthogonality of the Möbius function and ‘inverse additive characters’. While Fouvry-Michel work with characters to prime moduli, in order to establish Theorem 1.1 we need arbitrary squarefree moduli.

Theorem 1.8. Let $q$ be a prime power, and let $\epsilon > 0$. Then for a squarefree $M \in \mathbb{F}_q[T]$, and an additive character $\psi$ mod $M$, we have

$$\sum_{\substack{f \in \mathbb{F}_q[T]^+ \\ |f| \le X \\ (f, M) = 1}} \mu(f)\,\psi(\overline{f}) \ll |M|^{\frac{3}{16}+\epsilon} X^{\frac{25}{32}}, \quad X, |M| \to \infty \tag{1.16}$$

where $\overline{f}$ denotes the inverse of $f$ mod $M$, and the implied constant depends only on $q$ and $\epsilon$.

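For nonzero $M \in \mathbb{F}_q[T]$ we define Euler’s totient function by

$$\varphi(M) = \left|(\mathbb{F}_q[T]/(M))^\times\right|, \tag{1.17}$$

and for $f \in \mathbb{F}_q[T]^+$, we define the von Mangoldt function by

$$\Lambda(f) = \begin{cases} \deg(P), & f = P^n \\ 0, & \text{otherwise.} \end{cases} \tag{1.18}$$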

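Theorem 1.9. Fix $\delta < \frac{1}{126}$. For an odd prime $p$ and a power $q$ of $p$ with

$$q > p^2e^2\left(\frac{51 - 26\delta}{1 - 126\delta}\right)^2, \tag{1.19}$$

the following holds. For a squarefree $M \in \mathbb{F}_q[T]$, a polynomial $a \in \mathbb{F}_q[T]$ coprime to $M$, and $X$ a power of $q$ with $X^{\frac{1}{2}+\delta} \ge |M|$ we have

$$\sum_{\substack{f \in \mathbb{F}_q[T]^+ \\ |f| \le X \\ f \equiv a \bmod M}} \Lambda(f) = \frac{X}{\varphi(M)} + o\left(\frac{X}{|M|}\right), \quad |M| \to \infty. \tag{1.20}$$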

We have not ventured too much into improving the constant $\frac{1}{126}$, as our method cannot give anything above $\frac{1}{6}$, even if Theorem 1.8 would give square root cancellation. Since our proof of Theorem 1.9 is based on the ‘convolutional’ connection of von Mangoldt and Möbius, it is not surprising that a level of distribution of $\frac{2}{3} = \frac{1}{2} + \frac{1}{6}$, which is a longstanding barrier for the divisor function over $\mathbb{Z}$ (perhaps the most basic convolution), is a natural limit of our techniques. A large finite field variant of Theorem 1.7 and Theorem 1.9 was earlier proved in [BBSR15, Theorem 2.5].

Remark 1.10. It would be interesting to see whether our results can be extended to characteristic 2, perhaps in a manner similar to the way in which [Ca15] extends the results of [CaRu14].

1.4. Additional results in small characteristic. Throughout this work, we have not made every possible effort to reduce the least values of the prime powers $q$ to which our theorems apply. Instead, we present in the last section some results that hold for $q$ as small as 3.

The first concerns sign changes of Möbius in short intervals. Improving on many previous works, Matomäki and Radziwiłł have shown in [MR16] that for any $\eta > 1/2$, and any large enough positive integer $N$, there exist integers $a, b$ with $|a|, |b| \le N^{\eta}$ such that $\mu(N + a) = 1$, $\mu(N + b) = -1$. In characteristic 3, we show that the exponent 1/2 can be improved to 3/7.

Theorem 1.11. Let $q$ be a power of 3, and fix $3/7 < \eta < 1$. Then for any $f \in \mathbb{F}_q[T]^+$ of large enough norm, there exist $g, h \in \mathbb{F}_q[T]$ with $|g|, |h| \le |f|^{\eta}$ such that $\mu(f + g) = 1$ and $\mu(f + h) = -1$.

We follow the same proof strategy relating Möbius to characters, but since we allow small values of $q$, we cannot apply Theorem 1.4 anymore. Now however, since we are interested in sign changes only (and not cancellation), we can focus on just one of the derivatives appearing. It turns out that if this derivative has a relatively large order of vanishing at 0, the conductor of the associated character is relatively small, and we can apply a function field version (see [Hsu99]) of the aforementioned result of Burgess. For $p > 3$, the character sums arising are too short for the Burgess bound to apply.

For large enough $q$, Theorem 1.11 follows from Theorem 1.7 (since $T \mapsto 1/T$ allows one to think of short intervals as arithmetic progressions), and also from the $k = 1$ case of Theorem 1.3 (as we have sufficient uniformity in the shift).

Our last result, in the spirit of [Gal72], shows that Möbius enjoys cancellation in the arithmetic progression 1 mod a growing power of a fixed prime $P$, no matter how slowly the length of the progression increases.

Theorem 1.12. Let $q$ be a power of 3. Fix an irreducible $P \in \mathbb{F}_q[T]$. Then for a positive integer $n$ we have

$$\sum_{\substack{f \in \mathbb{F}_q[T]^+ \\ |f| \le X \\ f \equiv 1 \bmod P^n}} \mu(f) = o\left(\frac{X}{|P|^n}\right), \quad \frac{X}{|P|^n} \to \infty. \tag{1.21}$$

As before, we can obtain a power saving. Since the progressions are so short, this result does not follow from the previous ones, even if $q$ is large. In view of Maier’s phenomenon, we cannot expect to obtain the theorem for all residue classes.

1.5. Further directions. In future work, we hope to use some of the methods introduced here to address the following problems:

• Obtaining cancellation in ‘polynomial Möbius sums’ such as
$$\sum_{\substack{f \in \mathbb{F}_q[T]^+ \\ |f| \le X}} \mu(f^2 + T)$$
which may be relevant for counting primes of the form $f^2 + T$.
• Calculating the variance (and higher moments) of the Möbius function in short intervals (and arithmetic progressions) over $\mathbb{F}_q[T]$.

1.6. Notation. From this point on, it will be more convenient to work with degrees of polynomials instead of absolute values. For $g \in \mathbb{F}_q[T]$ we denote its degree by $d(g)$. By convention, the latter is $-\infty$ if $g = 0$. The letter $q$ denotes a prime power, and is often suppressed from notation such as

$$\mathcal{M}_d = \{g \in \mathbb{F}_q[T]^+ : d(g) = d\}. \tag{1.22}$$

2. Character sums

The main result of this section is the following.

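Theorem 2.1. Let $t \le m$ be natural numbers, let $f \in \mathbb{F}_q[T]$, let $g \in \mathcal{M}_m$ be squarefree, and let $\chi : (\mathbb{F}_q[T]/g)^\times \to \mathbb{C}^\times$ be a nontrivial character. Then

$$\left| \sum_{\substack{h \in \mathbb{F}_q[T] \\ d(h) < t \\ \gcd(f+h,\,g) = 1}} \chi(f + h) \right| \le (q^{1/2} + 1)\binom{m-1}{t} q^{\frac{t}{2}}. \tag{2.1}$$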
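As a numerical illustration of (2.1), the sketch below evaluates the sums for one small instance: $q = 5$, $g = T^3 + T + 1$ (which has no roots in $\mathbb{F}_5$, hence is irreducible), $t = 2$, and $\chi$ the quadratic character of the field $\mathbb{F}_5[T]/g$. These choices and all function names are our own assumptions for illustration; the code only confirms the inequality in this instance and says nothing about the proof.

```python
from itertools import product
from math import comb, sqrt

Q = 5                 # field size (illustrative; prime, so F_q = Z/5)
M_DEG, T_LEN = 3, 2   # m = d(g) and t, with t <= m as in Theorem 2.1
# Assumed example modulus: g = T^3 + T + 1, irreducible over F_5.


def mulmod(a, b):
    """Multiply residues (a0, a1, a2) modulo g = T^3 + T + 1 over F_5."""
    prod = [0] * 5
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % Q
    for i in (4, 3):  # reduce with T^3 = -T - 1
        c, prod[i] = prod[i], 0
        prod[i - 2] = (prod[i - 2] - c) % Q
        prod[i - 3] = (prod[i - 3] - c) % Q
    return tuple(prod[:3])


def chi(a):
    """Quadratic character of (F_5[T]/g)^x, computed as a^((q^m - 1)/2); 0 at 0."""
    if a == (0, 0, 0):
        return 0
    result, base, e = (1, 0, 0), a, (Q**M_DEG - 1) // 2
    while e:
        if e & 1:
            result = mulmod(result, base)
        base = mulmod(base, base)
        e >>= 1
    return 1 if result == (1, 0, 0) else -1


def short_sum(c):
    """Sum of chi(f + h) over d(h) < 2, where f = c*T^2 (lower terms absorb into h)."""
    return sum(chi((a0, a1, c)) for a0, a1 in product(range(Q), repeat=2))


bound = (sqrt(Q) + 1) * comb(M_DEG - 1, T_LEN) * Q ** (T_LEN / 2)
sums = [short_sum(c) for c in range(Q)]
print(sums, bound)
```

Because shifting $f$ by a polynomial of degree less than $t$ only permutes the terms, the five values of $c$ cover every $f$ modulo $g$ in this instance.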

To prove this theorem, we use the following geometric setup. View $\mathbb{A}^t$ as a space parameterizing polynomials $h$ of degree less than $t$ and let $(c_1, c_2)$ be coordinates on $\mathbb{A}^2$. Let $U \subseteq \mathbb{A}^t \times \mathbb{A}^2$ be the open set consisting of points $(h, (c_1, c_2))$ where $c_1f + h + c_2T^t$ is prime to $g$. Let $j : U \to \mathbb{P}^t \times \mathbb{A}^2$ be the open immersion, embedding $\mathbb{A}^t$ into $\mathbb{P}^t$ in the usual way. Let $\pi : \mathbb{P}^t \times \mathbb{A}^2 \to \mathbb{A}^2$ be the projection.

Let $T$ be the torus parameterizing polynomials of degree less than $m$ that are relatively prime to $g$. (The space $T$ is a torus because, over an algebraically closed field, we may factor $g$ and write $T$ as $\mathbb{G}_m^m$, with the coordinates given by evaluating the polynomial on each of the roots of $g$.) On $T$, we have a character sheaf $\mathcal{L}_\chi$ whose trace function is $\chi$, constructed by the Lang isogeny. Let $\mathcal{L}_\chi(c_1f + h + c_2T^t)$ be the pullback of this sheaf to $U$ along the natural map from $U$ to $T$ that sends $(h, (c_1, c_2))$ to $c_1f + h + c_2T^t$. We will prove (2.1) using geometric properties of the complex $R\pi_*j_!\mathcal{L}_\chi(c_1f + h + c_2T^t)$.

Lemma 2.2. The complex $R\pi_*j_!\mathcal{L}_\chi(c_1f + h + c_2T^t)$ is geometrically isomorphic to its pullback under the map $(c_1, c_2) \to (\lambda c_1, \lambda c_2)$ for any $\lambda \in \mathbb{F}_q^\times$.

Proof. Because $U$ is invariant under multiplying $c_1$, $c_2$, and $h$ by $\lambda$, as are the maps $j$ and $\pi$, it suffices to check that $\mathcal{L}_\chi(c_1f + h + c_2T^t)$ is geometrically isomorphic to its pullback under this multiplication map. It is sufficient that $\mathcal{L}_\chi$ on $T$ is geometrically isomorphic to its pullback under any multiplicative translation, which follows from its construction as a character sheaf. □

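Lemma 2.3. The stalk of $R\pi_*j_!\mathcal{L}_\chi(c_1f + h + c_2T^t)$ at the point $(0, 1)$ is supported in degree $t$, where it has rank $\binom{m-1}{t}$.

Proof. When $c_1 = 0$, $c_2 = 1$, the polynomial $c_1f + h + c_2T^t = T^t + h$ is monic and has degree $t$. By the proper base change theorem, our stalk is thus equivalent to the compactly supported cohomology of the space of degree $t$ monic polynomials that are prime to $g$, with coefficients in $\mathcal{L}_\chi(T^t + h)$. We may view this space as the quotient $\left(\operatorname{Spec} \mathbb{F}_q[x, g(x)^{-1}]\right)^t / S_t$, and denote by

$$\rho : \left(\operatorname{Spec} \mathbb{F}_q[x, g(x)^{-1}]\right)^t \to \left(\operatorname{Spec} \mathbb{F}_q[x, g(x)^{-1}]\right)^t / S_t \tag{2.2}$$

the quotient map. Then $(\rho_*\rho^*\mathbb{Q}_\ell)^{S_t} = \mathbb{Q}_\ell$, so

$$\left(\rho_*\rho^*\mathcal{L}_\chi(T^t + h)\right)^{S_t} = \mathcal{L}_\chi(T^t + h) \tag{2.3}$$

and thus

$$\begin{aligned} H^*\left(\left(\operatorname{Spec} \mathbb{F}_q[x, g(x)^{-1}]\right)^t / S_t,\ \mathcal{L}_\chi(T^t + h)\right) &= H^*\left(\left(\operatorname{Spec} \mathbb{F}_q[x, g(x)^{-1}]\right)^t / S_t,\ \rho_*\rho^*\mathcal{L}_\chi(T^t + h)\right)^{S_t} \\ &= H^*\left(\left(\operatorname{Spec} \mathbb{F}_q[x, g(x)^{-1}]\right)^t,\ \rho^*\mathcal{L}_\chi(T^t + h)\right)^{S_t}. \end{aligned} \tag{2.4}$$

Now $\rho$ is the map defined by factorizing a polynomial into linear terms, so $\rho^*\mathcal{L}_\chi(T^t + h) = (\mathcal{L}_\chi(T - x))^{\boxtimes t}$. By the Künneth formula, it follows that the stalk of $R\pi_*j_!\mathcal{L}_\chi(c_1f + h + c_2T^t)$ at $(0, 1)$ is

$$\left(\left(H^*\left(\operatorname{Spec} \mathbb{F}_q[x, g(x)^{-1}],\ \mathcal{L}_\chi(T - x)\right)\right)^{\otimes t}\right)^{S_t}. \tag{2.5}$$

Because $\chi$ is nontrivial, $\mathcal{L}_\chi(T - x)$ has nontrivial monodromy, so

$$H^*\left(\operatorname{Spec} \mathbb{F}_q[x, g(x)^{-1}],\ \mathcal{L}_\chi(T - x)\right) \tag{2.6}$$

vanishes in degrees other than 1. Because it arises from a character sheaf on a torus, $\mathcal{L}_\chi$ has tame local monodromy, so the rank of this cohomology group in degree 1 is minus the Euler characteristic of $\operatorname{Spec} \mathbb{F}_q[x, g(x)^{-1}]$, that is, $m - 1$. Thus

$$\left(H^*\left(\operatorname{Spec} \mathbb{F}_q[x, g(x)^{-1}],\ \mathcal{L}_\chi(T - x)\right)\right)^{\otimes t} \tag{2.7}$$

is supported in degree $t$, where it equals the $t$-th tensor power of an $(m-1)$-dimensional vector space, with $S_t$ acting the usual way, twisted by the sign character because of the Koszul sign in the tensor product of a derived category. Thus taking $S_t$-invariants is equivalent to taking the $t$-th wedge power of this $(m-1)$-dimensional vector space, which has dimension $\binom{m-1}{t}$. □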

Lemma 2.4. The complement of $U$ in $\mathbb{P}^t \times \mathbb{A}^2$ is a divisor with simple normal crossings relative to $\mathbb{A}^2$ away from

$$\left\{(h, (c_1, c_2)) \in \mathbb{A}^t \times \mathbb{A}^2 : d\left(\gcd(c_1f + h + c_2T^t,\ g)\right) > t\right\}. \tag{2.8}$$

Proof. Because this is a purely geometric question, we may assume that $g$ splits completely, and let $\alpha_1, \dots, \alpha_m$ be its roots. Then the complement of $U$ is the union of the hyperplane at $\infty$ in $\mathbb{P}^t$ with the hyperplanes

$$c_1f(\alpha_i) + h(\alpha_i) + c_2\alpha_i^t = 0, \quad 1 \le i \le m. \tag{2.9}$$

This union has simple normal crossings if the intersection of each subset of our hyperplanes has the expected dimension.

We first consider the case of a subset not including the hyperplane at $\infty$. For any $S \subseteq \{1, \dots, m\}$, the intersection of the hyperplanes corresponding to the elements of $S$ is

$$\left\{(h, (c_1, c_2)) \in \mathbb{A}^t \times \mathbb{A}^2 : \prod_{i \in S}(T - \alpha_i) \,\Big|\, c_1f + h + c_2T^t\right\}. \tag{2.10}$$

Since $\prod_{i \in S}(T - \alpha_i)$ is a polynomial of degree $|S|$, the above has codimension $|S|$ for any $c_1, c_2$ as long as $|S| \le t$, because the set of $h$ of degree less than $t$ which are multiples of this polynomial has the expected dimension. On the other hand, when $|S| > t$ we get that

$$d\left(\gcd(c_1f + h + c_2T^t,\ g)\right) \ge |S| > t, \tag{2.11}$$

so removing these points, we obtain simple normal crossings.

Now we consider the case where we have a set of hyperplanes corresponding to $S \subseteq \{1, \dots, m\}$ and also the divisor at $\infty$. We can take coordinates on the divisor at $\infty$ to be $(h, (c_1, c_2))$, with $h$ nonzero and well-defined up to scaling. In these coordinates, the equation for the intersection of any hyperplane with the divisor at $\infty$ is its original equation with all terms having degree zero in $h$ removed. Thus the equation for the $i$-th hyperplane, restricted to the divisor at $\infty$, is simply $h(\alpha_i) = 0$, so the intersection of the divisor at $\infty$ with the hyperplanes in $S$ consists of $(h, (c_1, c_2))$ where $h$ is a multiple of $\prod_{i \in S}(T - \alpha_i)$. This can only happen if $|S| < t$, as $|S|$ is the degree of this polynomial and $d(h) < t$, but then our intersection always has the expected dimension, so we have simple normal crossings. □

Lemma 2.5. Away from a finite union of lines through the origin, the complex $R\pi_*j_!\mathcal{L}_\chi$ is supported in degree $t$ with rank $\binom{m-1}{t}$.

Proof. Let $l_1 : \mathbb{A}^1 \to \mathbb{A}^2$ send $c$ to $(c, 1)$. We consider the vanishing cycles at zero $R\Phi_c\, l_1^*j_!\mathcal{L}_\chi(f + h + \lambda T^t)$ of the pullback of $j_!\mathcal{L}_\chi(f + h + \lambda T^t)$ under $l_1$. Let us first check that the complement of $U$ is a simple normal crossings divisor everywhere in the fiber over zero. To do this, we apply Lemma 2.4 and observe that we cannot have $d(\gcd(c_1f + h + c_2T^t, g)) > t$ in this fiber as $c_1 = 0$, $c_2 = 1$ and

$$d\left(\gcd(h + T^t,\ g)\right) \le d(h + T^t) \le t. \tag{2.12}$$

By [SGA7-II, XIII Lemma 2.1.11] it follows that the vanishing cycles

$$R\Phi_c\, l_1^*j_!\mathcal{L}_\chi(f + h + \lambda T^t) \tag{2.13}$$

vanish everywhere. Hence the cohomology of the nearby and general fibers is isomorphic. Using Lemma 2.3 to compute the cohomology of the special fiber, it follows that the cohomology of the general fiber is supported in degree $t$ with rank $\binom{m-1}{t}$. By constructibility, the stalk cohomology must have the same description at every point in some nonempty open set. By Lemma 2.2, the same description holds for the stalks in the $\mathbb{G}_m$-orbit of this open set, which is the complement of finitely many lines. □

Let $l_2 : \mathbb{A}^1 \to \mathbb{A}^2$ send $c$ to $(1, c)$.

Lemma 2.6. Taking vanishing cycles at zero, the complex

$$R\Phi_c\, l_2^*j_!\mathcal{L}_\chi(f + h + \lambda T^t) \tag{2.14}$$

has the following properties:

• it is supported on $C = \left\{(h, 0) \in \mathbb{A}^t \times \mathbb{A}^1 : d(\gcd(f + h, g)) > t\right\}$;
• it is supported in degree $t$;
• the rank of its stalk at $(h, 0) \in C$ is $\binom{d(\gcd(f+h,\,g)) - 1}{t}$.

Proof. The first property follows immediately from [SGA7-II, XIII Lemma 2.1.11] and Lemma 2.4.

We note that $l_2^*j_!\mathcal{L}_\chi(f + h + \lambda T^t)[t + 1]$ is the extension by zero of a lisse sheaf in degree $-(t + 1)$ on a variety of dimension $t + 1$ and is thus semiperverse. The dual complex is the pushforward of a lisse sheaf in degree $-(t + 1)$ on a variety of dimension $t + 1$ along an affine open immersion and thus is semiperverse by Artin’s affine theorem. We conclude that

$$l_2^*j_!\mathcal{L}_\chi(f + h + \lambda T^t)[t + 1] \tag{2.15}$$

is a perverse sheaf. By [Ill94, Corollary 4.6], the vanishing cycles of a perverse sheaf are perverse up to a shift by one, so

$$R\Phi_c\, l_2^*j_!\mathcal{L}_\chi(f + h + \lambda T^t)[t] \tag{2.16}$$

is perverse.

The support of the perverse sheaf above is the closed set $C$. Because $C$ does not intersect the divisor at $\infty$, $C$ is finite. A perverse sheaf supported at a finite set is necessarily a sum of skyscraper sheaves supported in degree zero, so we obtain the second property in the statement of this lemma.

It remains to calculate the rank of the stalk at a particular point $(h_0, 0) \in C$. We do that by working locally in an étale neighborhood.

We set

$$g' = \gcd(f + h_0,\ g), \quad g^* = g/g', \tag{2.17}$$

and factor $T$ as the product of the torus $T'$ of residue classes mod $g'$ and the torus $T^*$ of residue classes mod $g^*$. This lets us factor $\mathcal{L}_\chi$ as the tensor product of $\mathcal{L}_{\chi'}$ (the pullback of a character sheaf from $T'$) and $\mathcal{L}_{\chi^*}$ (the pullback of a character sheaf from $T^*$). Because $f + h_0$ is relatively prime to $g^*$, the map $U \to T \to T^*$ extends to a well-defined map in a neighborhood of the point $(h_0, 0)$, so $\mathcal{L}_{\chi^*}(c_1f + h + c_2T^t)$ extends to a lisse sheaf in a neighborhood of $(h_0, 0)$. Because tensoring with a lisse rank one sheaf does not affect vanishing cycles, it suffices to calculate

$$R\Phi_c\, l_2^*j'_!\mathcal{L}_{\chi'}(c_1f + h + c_2T^t), \tag{2.18}$$

where $j'$ is the inclusion of the open set where $\gcd(c_1f + h + c_2T^t, g') = 1$.

By changing variables, we may replace $(f, h)$ with $(f', h')$ where $f' = f + h_0$ and $h' = h - h_0$; we are then tasked with calculating the vanishing cycles $R\Phi_c\, l_2^*j'_!\mathcal{L}_{\chi'}(c_1f' + h' + c_2T^t)$ at zero. Having done this, we observe that $f'$ is a multiple of $g'$, so translation by $f'$ does not affect $\mathcal{L}_{\chi'}(c_1f' + h' + c_2T^t)$, thus these vanishing cycles are the same as $R\Phi_c\, l_2^*j'_!\mathcal{L}_{\chi'}(h' + c_2T^t)$.

As $d(h' + c_2T^t) \le t$, we can only have

$$d\left(\gcd(h' + c_2T^t,\ g')\right) > t \tag{2.19}$$

if $h' + c_2T^t = 0$. Hence, by Lemma 2.4, the complement of the image of $j'$ is a simple normal crossings divisor away from the point $(0, 0)$. Therefore, by [SGA7-II, XIII Lemma 2.1.11], the vanishing cycles are supported at this point. By the vanishing cycles long exact sequence, the Euler characteristic of the vanishing cycles complex is the difference between the Euler characteristic of the generic fiber and the Euler characteristic of the special fiber. Because the vanishing cycles sheaf is supported at a single point in a single degree, its Euler characteristic is $(-1)^t$ times its rank at that point. We will calculate these Euler characteristics and thereby calculate the rank.

By Lemma 2.5, the Euler characteristic of the generic fiber is $(-1)^t\binom{d(g') - 1}{t}$. So it remains to check that the Euler characteristic at the special point is zero. Because $\mathcal{L}_\chi$ is lisse of rank one and tame, the Euler characteristic of the special fiber is the Euler characteristic of the space of polynomials of degree less than $t$ and prime to $g'$. Because this admits a free action of $\mathbb{G}_m$ by scaling, its Euler characteristic is zero. □

We can now prove Theorem 2.1.

Proof. We have a vanishing cycles long exact sequence

$$\left(R\pi_*j_!\mathcal{L}_\chi(c_1f + h + c_2T^t)\right)_{(1,0)} \to \left(R\pi_*j_!\mathcal{L}_\chi(c_1f + h + c_2T^t)\right)_{(1,\eta)} \to H^*\left(\mathbb{P}^t,\ R\Phi_c\, l_2^*j_!\mathcal{L}_\chi(f + h + \lambda T^t)\right).$$

By Lemma 2.5, $\left(R\pi_*j_!\mathcal{L}_\chi(c_1f + h + c_2T^t)\right)_{(1,\eta)}$ is supported in degree $t$ with rank $\binom{m-1}{t}$. By Lemma 2.6, the complex $R\Phi_c\, l_2^*j_!\mathcal{L}_\chi(f + h + \lambda T^t)$ is supported in degree $t$ and at finitely many points, so the third term above is also supported in degree $t$ and is simply the sum of the stalks at those points, and thus has rank

$$r(f, g, t) = \sum_{\substack{h \in \mathbb{F}_q[T] \\ d(h) < t \\ d(\gcd(f+h,\,g)) > t}} \binom{d(\gcd(f + h, g)) - 1}{t}, \tag{2.20}$$

again using Lemma 2.6.

We conclude that, upon suppressing $c_1f + h + c_2T^t$ for brevity, the vanishing cycles long exact sequence becomes

$$0 \to (R^t\pi_*j_!\mathcal{L}_\chi)_{(1,0)} \to (R^t\pi_*j_!\mathcal{L}_\chi)_{(1,\eta)} \to H^t(\mathbb{P}^t,\ R\Phi_c\, l_2^*j_!\mathcal{L}_\chi) \to (R^{t+1}\pi_*j_!\mathcal{L}_\chi)_{(1,0)} \to 0. \tag{2.21}$$

Thus $\left(R\pi_*j_!\mathcal{L}_\chi(c_1f + h + c_2T^t)\right)_{(1,0)}$ is supported in degrees $t$ and $t + 1$, with rank at most $\binom{m-1}{t}$ in degree $t$ and rank bounded by $r(f, g, t)$ in degree $t + 1$.

By Deligne’s theorem, the absolute values of the eigenvalues of Frobenius on the $i$-th cohomology group are at most $q^{i/2}$, so the absolute value of the trace of Frobenius on cohomology is at most

$$\binom{m-1}{t} q^{\frac{t}{2}} + r(f, g, t)\, q^{\frac{t+1}{2}}. \tag{2.22}$$

By the Lefschetz fixed point formula, the trace of Frobenius on cohomology equals the sum of the trace of Frobenius on the stalks, which is

$$\sum_{\substack{h \in \mathbb{F}_q[T] \\ d(h) < t \\ \gcd(f+h,\,g) = 1}} \chi(f + h). \tag{2.23}$$

Finally, we check that $r(f, g, t) \le \binom{m-1}{t}$. To do this fix a root $\alpha$ of $g$, and note that $\binom{d(\gcd(f+h,\,g)) - 1}{t}$ does not exceed the number of degree $t$ divisors of $g$, prime to $T - \alpha$, that divide $f + h$. Each such divisor of $g$ prime to $T - \alpha$ contributes at most once to the sum in Eq. (2.20), so this sum is bounded by the number of such divisors, which is $\binom{m-1}{t}$. □

Corollary 2.7. Fix $\eta > 0$ and $0 < \beta < 1/2$. Then for a prime power $q \ge (e\eta^{-1})^{\frac{2}{1-2\beta}}$ the following holds. For a nonprincipal character $\chi$ to a squarefree modulus $g \in \mathbb{F}_q[T]$, $f \in \mathbb{F}_q[T]$, and $t \ge \eta \cdot d(g)$, we have

$$\sum_{d(h) < t} \chi(f + h) \ll q^{(1-\beta)t} \tag{2.24}$$

as $t \to \infty$, with the implied constant depending only on $q$.

Furthermore, if we have $t \le \eta \cdot d(g) \le t'$, then we still have

$$\sum_{d(h) < t} \chi(f + h) \ll q^{(1-\beta)t'}. \tag{2.25}$$

Proof. If $\eta \ge 1$, the left hand side vanishes and the bound is trivial. Otherwise, we apply Theorem 2.1 to the left side, taking $m = d(g)$, to obtain

$$\sum_{d(h) < t} \chi(f + h) \le (q^{1/2} + 1)\binom{m-1}{t} q^{t/2} \ll \binom{m}{t} q^{t/2} \le \left(\frac{t}{m}\right)^{-t}\left(\frac{m-t}{m}\right)^{t-m} q^{t/2} \tag{2.26}$$

where the last inequality follows from

$$1 = \left(\frac{t}{m} + \frac{m-t}{m}\right)^m = \sum_{k=0}^{m} \binom{m}{k}\left(\frac{t}{m}\right)^k\left(\frac{m-t}{m}\right)^{m-k} \ge \binom{m}{t}\left(\frac{t}{m}\right)^t\left(\frac{m-t}{m}\right)^{m-t}. \tag{2.27}$$

From the Taylor series we can see that $-\log(1 - x) \le x/(1 - x)$ if $x > 0$, so $(1 - x)^{-(1-x)/x} \le e$. Applying this to $x = t/m$, we get

$$\left(\frac{m-t}{m}\right)^{t-m} \le e^t \tag{2.28}$$

so we obtain

$$\sum_{d(h) < t} \chi(f + h) \ll \left(\frac{t}{m}\right)^{-t} e^t q^{t/2}. \tag{2.29}$$

Because $q \ge (e\eta^{-1})^{\frac{2}{1-2\beta}}$ and $t/m \ge \eta$, we have

$$e\left(\frac{t}{m}\right)^{-1} \le e\eta^{-1} \le q^{\frac{1}{2}-\beta} \tag{2.30}$$

hence

$$\left(\frac{t}{m}\right)^{-t} e^t q^{t/2} \le q^{(1-\beta)t}, \tag{2.31}$$

as desired.

To handle the case where $t \le \eta \cdot d(g) \le t'$, first note that we may assume $t' \le m$. We observe that the left hand side of Eq. (2.31) is an increasing function of $t$ because its logarithm

$$t\log m - t\log t + t + t(\log q)/2 \tag{2.32}$$

has derivative

$$\log m - \log t + (\log q)/2 \tag{2.33}$$

which is positive in the range $t \le m$. Thus we can get a bound for the shorter sum which is at least as good as our bound for the longer sum. □
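The two elementary inequalities used in this proof, namely the bound on $\binom{m}{t}$ from (2.27) and the estimate (2.28), are easy to spot-check numerically. The sketch below does so in logarithmic form for all $1 \le t < m \le 60$; the range, the tolerance and the function name are arbitrary choices of ours, and a finite check is of course no substitute for the proof.

```python
from math import comb, log


def inequalities_hold(m_max: int = 60, tol: float = 1e-9) -> bool:
    """Check (2.27) and (2.28) in log form for all 1 <= t < m <= m_max."""
    for m in range(2, m_max + 1):
        for t in range(1, m):
            x = t / m
            # (2.27): binom(m, t) <= (t/m)^(-t) * ((m-t)/m)^(t-m)
            log_binom = log(comb(m, t))
            log_rhs = -t * log(x) - (m - t) * log(1 - x)
            if log_binom > log_rhs + tol:
                return False
            # (2.28): ((m-t)/m)^(t-m) <= e^t
            if -(m - t) * log(1 - x) > t + tol:
                return False
    return True


print(inequalities_hold())
```

Working with logarithms avoids overflow and mirrors the way the proof passes to (2.32).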


3. The Möbius Function

From now on, we will assume that the characteristic $p$ of $\mathbb{F}_q$ is odd. Because of this, $\mathbb{F}_q^\times$ admits a unique quadratic character, which we denote $\psi$. We use freely the basic properties of resultants (see [Jan07]) and the Jacobi symbol (see [Ros13, Chapter 3]).

The following lemma recalls the relation between the (real valued) Jacobi symbol and the quadratic character of a resultant.

Lemma 3.1. Let $f \in \mathbb{F}_q[T]$, $g = a_nT^n + \cdots + a_0$ of degree $n \ge 1$. Then

$$\left(\frac{f}{g}\right) = \psi(a_n)^{\max\{d(f),\,0\}}\,\psi\left(\operatorname{Res}(g, f)\right). \tag{3.1}$$

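Proof. Fix f ≠ 0, and note that both sides above are completely multiplicative in g, so we assume that g is irreducible. For a root θ of g we have

(3.2) $\mathrm{Res}(g,f) = a_n^{d(f)}\prod_{i=0}^{d(g)-1} f\big(\theta^{q^i}\big) = a_n^{d(f)}\prod_{i=0}^{d(g)-1} f(\theta)^{q^i} = a_n^{d(f)} f(\theta)^{\frac{q^{d(g)}-1}{q-1}}$

so we get the mod p congruence

(3.3) $\psi\left(\mathrm{Res}(g,f)\right) = \psi(a_n)^{d(f)}\,\psi\Big(f(\theta)^{\frac{q^{d(g)}-1}{q-1}}\Big) \equiv \psi(a_n)^{d(f)} f(\theta)^{\frac{q^{d(g)}-1}{2}} \equiv \psi(a_n)^{d(f)}\left(\frac{f}{g}\right)$

which suffices for the lemma. □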

Given a D ∈ Fq[T] we write rad(D) for the product of primes that appear in the factorization of D, and $\mathrm{rad}_1(D)$ for the product of primes that appear with odd multiplicity in the factorization of D. The derivative of D (with respect to T) is denoted by D′.

The next lemma interprets the Möbius function (on an arithmetic progression) as a Dirichlet character that ‘depends only on the derivative’.

Lemma 3.2. Let m, k ≥ 0, d ≥ 1 be integers with k ≠ d + m. For $M \in \mathcal{M}_m$, $g \in \mathcal{M}_d$, and $a \in \mathcal{M}_k$ coprime to M, define the polynomials

(3.4) $D = M^2\left(g+\frac{a}{M}\right)'$, $E = \frac{\mathrm{rad}(D)}{\gcd(M,\mathrm{rad}(D))}$, $E_1 = \frac{\mathrm{rad}_1(D)}{\gcd(M,\mathrm{rad}_1(D))}$.

Then

(3.5) $\mu(a+gM) = S\cdot\chi(w+g)$

where $w = w_{a,M,g'} \in \mathbb{F}_q[T]$, $\chi = \chi_{a,M,g'}$ is a (real) multiplicative character mod E with conductor $E_1$, and

(3.6) $S = S_{d,a,M,g'} \in \{0,1,-1\}$

with S = 0 if and only if D = 0.


Proof. Pellet’s formula (see [Con05, Lemma 4.1]) gives

(3.7) $\mu(a+gM) = (-1)^{d(a+gM)}\,\psi\left(\mathrm{Disc}(a+gM)\right)$

and since d(a + gM) = max{k, d + m}, we see that $(-1)^{d(a+gM)}$ can be absorbed into S. Our assumption that k ≠ d + m implies a + gM is monic, so ψ(Disc(a + gM)) equals, up to a sign that S absorbs,

(3.8) $\psi\left(\mathrm{Res}(a+gM,\ a'+g'M+gM')\right)$.

By Lemma 3.1, and the fact that a + gM is monic, the above equals

(3.9) $\left(\frac{a'+g'M+gM'}{a+gM}\right)$

and since gcd(a, M) = 1, by multiplicativity this equals

(3.10) $\left(\frac{a'M+g'M^2+gMM'}{a+gM}\right)\left(\frac{M}{a+gM}\right)^{-1}$.

Using quadratic reciprocity, we can absorb $\left(\frac{M}{a+gM}\right)$ into S. Subtracting M′(a + gM) from the numerator of the first Jacobi symbol above, we get (in the notation of equation (3.4))

(3.11) $\left(\frac{D}{a+gM}\right)$

and set w = 0, S = 0 in case D = 0. Otherwise (if D ≠ 0) we apply quadratic reciprocity once again to obtain

(3.12) $\left(\frac{a+gM}{D}\right)$

up to a sign that goes into S.

We write ξ(a + gM) for the Jacobi symbol above, so that ξ is a multiplicative character mod rad(D) with conductor $\mathrm{rad}_1(D)$. Since rad(D) is squarefree, we see that M is coprime to E (from equation (3.4)). Therefore, by the Chinese remainder theorem, there exists a unique decomposition $\xi = \xi_E\xi_N$ into characters mod E and N = gcd(M, rad(D)) respectively. In this notation, our Jacobi symbol equals $\xi_E(a+gM)\,\xi_N(a+gM)$ and the second factor is simply $\xi_N(a)$, so we absorb it into S. Taking $\overline{M} \in \mathbb{F}_q[T]$ with $M\overline{M} \equiv 1 \pmod{E}$ we can write

(3.13) $\xi_E(a+gM) = \xi_E(a\overline{M}+g)\,\xi_E(M)$

and conclude by setting $w = a\overline{M}$, $\chi = \xi_E$, and absorbing $\xi_E(M)$ into S. □
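Pellet’s formula (3.7), the starting point of the proof, can be checked by brute force in small cases. The Python sketch below is an illustration only. It fixes the field F_3 (an arbitrary choice), stores polynomials as coefficient lists with the lowest degree first, computes Res(f, f′) for monic f as the determinant of multiplication by f′ on F_3[T]/(f), uses the standard convention Disc(f) = (−1)^{n(n−1)/2} Res(f, f′) for monic f of degree n, and compares the result with a trial-division Möbius function. All helper names are hypothetical.

```python
from itertools import product

P = 3  # illustrative choice of a small odd prime; polynomials live in F_3[T]

def trim(f):
    """Drop zero leading coefficients (lists store the lowest degree first)."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f

def divmod_monic(f, g):
    """Quotient and remainder of f by a monic polynomial g over F_P."""
    f = trim(c % P for c in f)
    q = [0] * max(len(f) - len(g) + 1, 0)
    while len(f) >= len(g):
        c, shift = f[-1], len(f) - len(g)
        q[shift] = c
        for i, gi in enumerate(g):
            f[shift + i] = (f[shift + i] - c * gi) % P
        f = trim(f)
    return q, f

def derivative(f):
    return trim((i * c) % P for i, c in enumerate(f))[1:]

def det_mod_p(rows):
    """Determinant over F_P by Gaussian elimination."""
    a, det = [r[:] for r in rows], 1
    n = len(a)
    for col in range(n):
        piv = next((r for r in range(col, n) if a[r][col]), None)
        if piv is None:
            return 0
        if piv != col:
            a[col], a[piv] = a[piv], a[col]
            det = -det
        det = det * a[col][col] % P
        inv = pow(a[col][col], -1, P)
        for r in range(col + 1, n):
            factor = a[r][col] * inv % P
            a[r] = [(x - factor * y) % P for x, y in zip(a[r], a[col])]
    return det % P

def resultant(f, g):
    """Product of g(theta) over the roots theta of monic f, computed as the
    determinant of multiplication by g on F_P[T]/(f)."""
    n = len(f) - 1
    rows = []
    for i in range(n):
        r = divmod_monic([0] * i + list(g), f)[1]  # T^i * g mod f
        rows.append(r + [0] * (n - len(r)))
    return det_mod_p(rows)

def psi(x):
    """Quadratic character of F_P, extended by psi(0) = 0."""
    x %= P
    if x == 0:
        return 0
    return 1 if pow(x, (P - 1) // 2, P) == 1 else -1

def pellet(f):
    """Right-hand side of (3.7) for a monic polynomial f."""
    n = len(f) - 1
    disc = (-1) ** (n * (n - 1) // 2) * resultant(f, derivative(f))
    return (-1) ** n * psi(disc)

def monics(deg):
    return [list(c) + [1] for c in product(range(P), repeat=deg)]

def mobius(f):
    """Moebius function by trial division with monic divisors of increasing degree."""
    f, mu, d = list(f), 1, 1
    while len(f) > 1:
        if 2 * d > len(f) - 1:  # no factor of degree < d remains: f is irreducible
            return -mu
        for g in monics(d):
            q, r = divmod_monic(f, g)
            if not r:
                if not divmod_monic(q, g)[1]:
                    return 0  # g^2 divides the original polynomial
                f, mu = q, -mu
                break
        else:
            d += 1
    return mu
```

Running `all(mobius(f) == pellet(f) for deg in range(1, 5) for f in monics(deg))` compares both sides of (3.7) on every monic polynomial of degree at most 4 over F_3.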

Remark 3.3. With notation as in Lemma 3.2, suppose that χ is principal. We can then write $D = AB^2$ for some A, B ∈ Fq[T] with A | M, simply by taking $A = \mathrm{rad}_1(D)$, since principality gives $E_1 = 1$.


4. Linear forms in Möbius

Proposition 4.1. Let p be a prime, let q be a power of p, let m, d be nonnegative integers, let $M \in \mathcal{M}_m$ be squarefree, and let a ∈ Fq[T]. Then

$\frac{\#\{h : h = g' \text{ for some } g \in \mathcal{M}_d,\ h \equiv a \bmod M\}}{\#\{g' : g \in \mathcal{M}_d\}} \le q^{-\min\left\{m,\ \left\lfloor\frac{d-1}{p}\right\rfloor\right\}}$.

Proof. Let 0 ≤ j ≤ p − 1 be the unique integer congruent to d mod p. Any $g \in \mathcal{M}_d$ can then be uniquely expressed as

(4.1) $g = \sum_{i=0}^{p-1} T^i g_i^p$, $g_j \in \mathcal{M}_{\frac{d-j}{p}}$, $d(g_i) \le \left\lfloor\frac{d-i}{p}\right\rfloor$ for i ≠ j.

For the derivative we then have

(4.2) $g' = \sum_{i=1}^{p-1} i\,T^{i-1} g_i^p$

so we set $a_i = iT^{i-1}$ for 1 ≤ i ≤ p − 1, and consider the congruence

(4.3) $\sum_{i=1}^{p-1} a_i g_i^p \equiv a \bmod M$.

Since M is squarefree, we can uniquely pick $b, b_1, \dots, b_{p-1} \in \mathbb{F}_q[T]$ with

(4.4) $b^p \equiv a \bmod M$, $b_i^p \equiv a_i \bmod M$, 1 ≤ i ≤ p − 1,

so the congruence above becomes

(4.5) $g_1 \equiv b - \sum_{i=2}^{p-1} b_i g_i \bmod M$

and from this restriction on $g_1$ the proposition follows. □
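The denominator in Proposition 4.1, the number of distinct derivatives of monic polynomials of degree d, can be counted directly. The sketch below assumes q = p (the prime field), which is a simplification made only for this illustration, and compares a brute-force count with the closed form obtained by noting that only the coefficients of $T^i$ with 1 ≤ i ≤ d − 1 and p ∤ i are free. The function names are hypothetical.

```python
from itertools import product

def derivative_count(p, d):
    """Number of distinct g' as g ranges over monic degree-d polynomials over F_p."""
    seen = set()
    for lower in product(range(p), repeat=d):
        coeffs = list(lower) + [1]  # a_0, ..., a_{d-1}, a_d = 1
        deriv = tuple((i * c) % p for i, c in enumerate(coeffs))[1:]
        seen.add(deriv)
    return len(seen)

def predicted_count(p, d):
    """Closed form: one free coefficient for each 1 <= i <= d-1 with p not dividing i."""
    return p ** (d - 1 - (d - 1) // p)

if __name__ == "__main__":
    for p, d in [(3, 3), (3, 4), (3, 5), (5, 6)]:
        print(p, d, derivative_count(p, d), predicted_count(p, d))
```

In this setting each class of polynomials with a common derivative has $p^{1+\lfloor (d-1)/p\rfloor}$ elements, which matches the parametrization g = r + s^p with d(s) < d/p used in the proof of Proposition 4.3.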

The following technical proposition follows from the arguments of [BGP92, Page 371] or [CG07, Section 9] (see also [BZ02]), which obtain stronger statements over Z in place of Fq[T].

Proposition 4.2. Fix α, ε > 0, and a prime power q. Then for integers d, m, k ≥ 0 with d ≥ ε(m + k), and $M \in \mathcal{M}_m$, $A \in \mathcal{M}_k$, a ∈ Fq[T], we have

$\#\left\{g \in \mathbb{F}_q[T] : d(g) < d,\ a+gM = \lambda AB^2,\ \lambda \in \mathbb{F}_q,\ B \in \mathbb{F}_q[T]\right\} \ll q^{\left(\frac{1}{2}+\alpha\right)d}$

as d → ∞, with the implied constant depending only on ε, α and q.

We can now prove Theorem 1.3. We first prove the “generic” special case where the derivatives of certain parameters are distinct, and then prove the general case.


Proposition 4.3. Fix ε, δ > 0, 0 < β < 1/2, and a positive integer n. Let q be a power of an odd prime p such that

(4.6) $q > \left(\frac{pne}{\min\left\{\frac{\epsilon}{\epsilon+2},\ \frac{\epsilon\delta}{\epsilon+\delta}\right\}}\right)^{\frac{2}{1-2\beta}}$.

Then for nonnegative integers $d, m_1, \dots, m_n, k_1, \dots, k_n$ with

(4.7) $d \ge \max\{\epsilon m_1, \dots, \epsilon m_n,\ \delta k_1, \dots, \delta k_n\}$, $k_i \ne d+m_i$, 1 ≤ i ≤ n,

and pairs $(a_i, M_i) \in \mathcal{M}_{k_i}\times\mathcal{M}_{m_i}$ for 1 ≤ i ≤ n such that the derivatives $\left(\frac{a_i}{M_i}\right)'$ are all distinct, we have

(4.8) $\sum_{g\in\mathcal{M}_d}\ \prod_{i=1}^{n}\mu(a_i+gM_i) \ll |\mathcal{M}_d|^{1-\frac{\beta}{p}}$

as d → ∞, with the implied constant depending only on β, ε, δ, n and q.

Remark 4.4. Note that the statement of this proposition remains meaningful even if ε and δ are very large, though it is at its strongest when ε and δ are small.

Proof. Let us first assume that for every 1 ≤ i ≤ n we have $\gcd(a_i, M_i) = 1$.

We say that $g_1, g_2 \in \mathcal{M}_d$ are equivalent if $g_1' = g_2'$, and let R be a complete set of representatives of equivalence classes. So for each $g \in \mathcal{M}_d$ there exists a unique r ∈ R such that (g − r)′ = 0, and therefore also a unique s ∈ Fq[T] such that $g - r = s^p$. We can thus write our sum as

(4.9) $\sum_{r\in R}\ \sum_{d(s)<t}\ \prod_{i=1}^{n}\mu\big(a_i+(r+s^p)M_i\big)$, $t = \frac{d}{p}$.

By Lemma 3.2 (the notation of which is used throughout), our sum equals

(4.10) $\sum_{r\in R}\ \prod_{i=1}^{n} S_r^{(i)} \sum_{d(s)<t}\ \prod_{i=1}^{n}\chi_r^{(i)}\big(w_r^{(i)}+s^p\big)$

with $\chi_r^{(i)}$ a character to a squarefree modulus $E_r^{(i)}$ (defined in Lemma 3.2 using $D_r^{(i)}$). Hence, there exist $f_r^{(i)} \in \mathbb{F}_q[T]$ with

(4.11) $\big(f_r^{(i)}\big)^p \equiv w_r^{(i)} \bmod E_r^{(i)}$, 1 ≤ i ≤ n,

so using the fact that $\chi_r^{(i)}$ is real, we see that our inner sum equals

(4.12) $\sum_{d(s)<t}\ \prod_{i=1}^{n}\chi_r^{(i)}\big(f_r^{(i)}+s\big)$.

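For 1 ≤ i < j ≤ n we set

(4.13) $G_r^{(i,j)} = \gcd\big(E_r^{(i)}, E_r^{(j)}\big)$, $U_r = \operatorname{lcm}_{i,j}\big(G_r^{(i,j)}\big)$, $\ell_r = d(U_r)$,

and claim that, for every integer ℓ ≥ 0 and γ > 0, we have

(4.14) $\#\{r\in R : \ell_r \ge \ell\} \ll q^{d\left(1-\frac{1}{p}+\gamma\right)-\min\left(\frac{d-1}{p},\ \ell\right)}$, d → ∞.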

To do this, first note that, for any 1 ≤ i < j ≤ n, since $G_r^{(i,j)}$ divides $E_r^{(i)}$ and divides $E_r^{(j)}$, it also divides $D_r^{(i)}$ and divides $D_r^{(j)}$, so therefore it divides the polynomial

(4.15) $M_j^2 D_r^{(i)} - M_i^2 D_r^{(j)} = M_i^2M_j^2\left(\frac{a_i}{M_i}\right)' - M_i^2M_j^2\left(\frac{a_j}{M_j}\right)'$

which is nonzero by our initial assumption. The degree of the above polynomial is at most d/δ + 3d/ε, so by the divisor bound (see [IK04, Equation 1.81]), the polynomial $G_r^{(i,j)}$ attains $\ll q^{2\gamma d/(n(n-1))}$ values, for any γ > 0. Hence, the tuple $\big(G_r^{(i,j)}\big)_{1\le i<j\le n}$ attains $\ll q^{\gamma d}$ values. For each possible tuple $G_*^{(i,j)}$, we can recover the residue class of r′ mod $G_*^{(i,j)}$ from the congruence

(4.16) $M_j^2 r' \equiv M_j'a_j - a_j'M_j \bmod G_r^{(i,j)}$

and the fact that $G_r^{(i,j)}$ is prime to $M_j$ (because $D_r^{(j)}$ is prime to $M_j$, by definition). Combining these for all i, j we can also recover the residue class of r′ mod the least common multiple $U_*$ of the $G_*^{(i,j)}$.

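Proposition 4.1 tells us that for any α ∈ Fq[T] we have

$\#\{r\in R : r' \equiv \alpha \bmod U_*\} \le q^{-\min\left\{d(U_*),\ \left\lfloor\frac{d-1}{p}\right\rfloor\right\}}\,\#R \ll q^{d\left(1-\frac{1}{p}\right)-\min\left(\frac{d-1}{p},\ \ell\right)}$

since $d(U_*) \ge \ell$, so our claim is established.

Next, we observe from Eq. (4.14) that the contribution of those r ∈ R with $\ell_r \ge \frac{d-1}{p}$ to Eq. (4.9) is $\ll q^{d\left(1-\frac{1}{p}+\gamma\right)}$ and thus can be ignored, as we can choose γ small enough that $1-\frac{1}{p}+\gamma \le 1-\frac{\beta}{p}$. So we may assume that $\ell_r < \frac{d-1}{p}$.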

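We further set

(4.17) $\widetilde{E}_r^{(i)} = \gcd\big(E_r^{(i)}, U_r\big)$, $\overline{E}_r^{(i)} = \frac{E_r^{(i)}}{\widetilde{E}_r^{(i)}}$, 1 ≤ i ≤ n,

and note that $\gcd\big(\widetilde{E}_r^{(i)}, \overline{E}_r^{(i)}\big) = 1$ as $E_r^{(i)}$ is squarefree. The Chinese remainder theorem then gives a unique decomposition

(4.18) $\chi_r^{(i)} = \widetilde{\chi}_r^{(i)}\,\overline{\chi}_r^{(i)}$, 1 ≤ i ≤ n,

to characters mod $\widetilde{E}_r^{(i)}$ and $\overline{E}_r^{(i)}$ respectively. In this notation, our sum reads

(4.19) $\sum_{d(s)<t}\ \prod_{i=1}^{n}\widetilde{\chi}_r^{(i)}\big(f_r^{(i)}+s\big)\ \prod_{i=1}^{n}\overline{\chi}_r^{(i)}\big(f_r^{(i)}+s\big)$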


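so splitting according to the residue class u of s mod $U_r$ we get

(4.20) $\sum_{d(u)<\ell_r}\ \prod_{i=1}^{n}\widetilde{\chi}_r^{(i)}\big(f_r^{(i)}+u\big) \sum_{d(h)<t-\ell_r}\ \prod_{i=1}^{n}\overline{\chi}_r^{(i)}\big(f_r^{(i)}+u+hU_r\big)$.

Since $\gcd\big(\widetilde{E}_r^{(i)}, \overline{E}_r^{(i)}\big) = 1$, we get that $\gcd\big(U_r, \overline{E}_r^{(i)}\big) = 1$ for 1 ≤ i ≤ n. Hence, there exist $V_r^{(i)} \in \mathbb{F}_q[T]$ with $U_rV_r^{(i)} \equiv 1 \bmod \overline{E}_r^{(i)}$. Summing trivially over u, we may thus consider

(4.21) $\prod_{i=1}^{n}\overline{\chi}_r^{(i)}(U_r) \sum_{d(h)<t-\ell_r}\ \prod_{i=1}^{n}\overline{\chi}_r^{(i)}\big(f_r^{(i)}V_r^{(i)}+uV_r^{(i)}+h\big)$.

From $\gcd\big(\widetilde{E}_r^{(i)}, \overline{E}_r^{(i)}\big) = 1$ we moreover conclude that $\{\overline{E}_r^{(i)}\}_{i=1}^{n}$ are pairwise coprime. The Chinese remainder theorem then gives an $f_{r,u} \in \mathbb{F}_q[T]$ with

(4.22) $f_{r,u} \equiv f_r^{(i)}V_r^{(i)}+uV_r^{(i)} \bmod \overline{E}_r^{(i)}$, 1 ≤ i ≤ n,

so defining the character $\chi_r = \overline{\chi}_r^{(1)}\cdots\overline{\chi}_r^{(n)}$, mod $E_r = \overline{E}_r^{(1)}\cdots\overline{E}_r^{(n)}$, the sum above becomes

(4.23) $\sum_{d(h)<t-\ell_r}\chi_r(f_{r,u}+h)$.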

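We have

(4.24) $\frac{t}{d(E_r)} \ge \frac{t}{n\max\{d+2d/\epsilon,\ d/\delta+d/\epsilon\}} = \frac{\min\left\{\frac{\epsilon}{\epsilon+2},\ \frac{\epsilon\delta}{\epsilon+\delta}\right\}}{pn} =: \eta$

so if $\chi_r$ is nonprincipal, Corollary 2.7 bounds the sum above by $\ll q^{(1-\beta')t}$, for some β′ > β + pγ, with γ > 0 arbitrarily small. Because we have

(4.25) $q > \left(\frac{pne}{\min\left\{\frac{\epsilon}{\epsilon+2},\ \frac{\epsilon\delta}{\epsilon+\delta}\right\}}\right)^{\frac{2}{1-2\beta}}$

by assumption, using our definition of η this can be written as

(4.26) $q > \left(e\eta^{-1}\right)^{\frac{2}{1-2\beta}}$

and therefore

(4.27) $q > \left(e\eta^{-1}\right)^{\frac{2}{1-2\beta'}}$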

for any sufficiently small choice of γ.

Hence, those r ∈ R for which $\ell_r = \ell$ and $\chi_r$ is nonprincipal contribute $\ll q^{(1-\beta')t}q^{\ell}$ individually, so the total contribution to our initial sum is

$\ll q^{(1-\beta')t+\ell}\,q^{d\left(1-\frac{1}{p}+\gamma\right)-\ell} = q^{(1-\beta)t}\,q^{d\left(1-\frac{1}{p}\right)}\,q^{\gamma d-(\beta'-\beta)t} = q^{d\left(1-\frac{\beta}{p}\right)}\,q^{d(\gamma-(\beta'-\beta)/p)}$.

The increase from summing over the possible values of ℓ is linear in d and thus can be bounded by the exponential $q^{d((\beta'-\beta)/p-\gamma)}$, so the contribution of all the terms where χ is nonprincipal is $\ll q^{d\left(1-\frac{\beta}{p}\right)}$.


Let r ∈ R be such that $\chi_r$ is principal. Pairwise coprimality implies that $\overline{\chi}_r^{(1)}$ is principal as well, so the conductor of $\chi_r^{(1)}$ divides that of $\widetilde{\chi}_r^{(1)}$, and the latter divides $U_r$. Thus, for some λ ∈ Fq and monic B ∈ Fq[T] we have

$M_1^2r'+a_1'M_1-a_1M_1' = D_r^{(1)} = \mathrm{rad}_1\big(D_r^{(1)}\big)\lambda B^2 = \lambda A\widetilde{A}B^2$, $A \mid M_1$, $\widetilde{A} \mid U_r$

where A is the greatest common divisor of $M_1$ and $\mathrm{rad}_1\big(D_r^{(1)}\big)$, and $\widetilde{A}$ is the conductor of $\chi_r^{(1)}$. Since $U_r$ divides the nonzero polynomial Q defined to be

(4.28) $\prod_{1\le i<j\le n} M_i^2M_j^2\left[\left(\frac{a_i}{M_i}\right)'-\left(\frac{a_j}{M_j}\right)'\right]$,

it follows that $\widetilde{A} \mid Q$ as well, so Proposition 4.2 ensures that the number of r for which the equation above holds is

(4.29) $\ll d_2(M_1Q)\,q^{(1/2+\zeta)d}$.

The divisor bound then allows us to neglect those r for which $\chi_r$ is principal, as long as $\frac{1}{2}+\zeta+\frac{1}{p} < 1-\frac{\beta}{p}$, which is alright as $\beta < \frac{1}{2} \le p\left(\frac{1}{2}-\frac{1}{p}\right)$, so we can choose ζ small enough that this inequality holds.

Let us now handle the case when $a_i$ and $M_i$ are not coprime. Set

(4.30) $H_i = \gcd(a_i, M_i)$, 1 ≤ i ≤ n.

If some $H_i$ is not squarefree, the sum vanishes and the bound is trivial. Otherwise, we have the identity

(4.31) $\mu(a_i+gM_i) = \begin{cases}\mu\left(\frac{a_i}{H_i}+g\frac{M_i}{H_i}\right)\mu(H_i) & \text{if } \gcd\left(\frac{a_i}{H_i}+g\frac{M_i}{H_i},\ H_i\right) = 1,\\ 0 & \text{if } \gcd\left(\frac{a_i}{H_i}+g\frac{M_i}{H_i},\ H_i\right) \ne 1.\end{cases}$

Thus we can write the sum as the constant factor $\prod_{i=1}^{n}\mu(H_i)$ times a similar sum, except that the degrees of $a_i$ and $M_i$ are reduced and the terms where $\gcd\left(\frac{a_i}{H_i}+g\frac{M_i}{H_i},\ H_i\right) \ne 1$ are removed. The degree reduction preserves our inequalities and so is no trouble. Removing the terms with $\gcd\left(\frac{a_i}{H_i}+g\frac{M_i}{H_i},\ H_i\right) \ne 1$ amounts to removing those g which lie in a particular residue class modulo each prime factor of $H_i$, for a total of at most $\Gamma = \sum_{i=1}^{n}\omega(H_i)$ residue classes.

We perform the same argument to this restricted sum. The only change that occurs is when we write $g = r + s^p$, we must assume that s avoids a corresponding set of residue classes modulo these primes. By inclusion-exclusion, we can write a Dirichlet character sum avoiding Γ residue classes as an alternating sum of Dirichlet character sums in at most $2^{\Gamma}$ residue classes, and hence as an alternating sum of at most $2^{\Gamma}$ shorter Dirichlet character sums. Because the sums over each residue class are shorter, we can get the same bound for them by Corollary 2.7. Thus our final bound for this case is worse by a factor of

(4.32) $2^{\Gamma} = q^{\sum_{i=1}^{n} o(d(H_i))} = q^{o(d)}$.

We can absorb this into our bound by slightly increasing β so that it still satisfies the strict inequality (4.6). □

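Theorem 4.5. Fix ε, δ > 0, 0 < β < 1/2, and a positive integer n. Let q be a power of an odd prime p such that

(4.33) $q > \left(\frac{pne}{\min\left\{\frac{\epsilon}{\epsilon+2},\ \frac{\epsilon\delta}{\epsilon+\delta}\right\}}\right)^{\frac{2}{1-2\beta}}$.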



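and pairs $(a_i, M_i) \in \mathcal{M}_{k_i}\times\mathcal{M}_{m_i}$ for 1 ≤ i ≤ n with $a_i/M_i$ distinct, we have

(4.35) $\sum_{g\in\mathcal{M}_d}\ \prod_{i=1}^{n}\mu(a_i+gM_i) \ll |\mathcal{M}_d|^{1-\frac{\beta}{p}}$

as d → ∞, with the implied constant depending only on ε, δ, β, n and q.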

Proof. Our initial assumption is that there are no coincidences among $(a_i/M_i)$, for 1 ≤ i ≤ n, so we can find a prime P not dividing

(4.36) $\prod_{i=1}^{n} M_i \prod_{1\le i<j\le n}\left(a_iM_j - a_jM_i\right)$,

with d(P) = o(d). Let t′ = d(P). Splitting our initial sum according to the residue class z of g mod P we get

(4.37) $\sum_{d(z)<t'}\ \sum_{f\in\mathcal{M}_{d-t'}}\ \prod_{i=1}^{n}\mu(a_i+zM_i+fM_iP)$.

We can bound the sums over residue classes by applying Proposition 4.3. Indeed, if for some 1 ≤ i < j ≤ n we have

(4.38) $\left(\frac{a_i+zM_i}{M_iP}\right)' = \left(\frac{a_j+zM_j}{M_jP}\right)'$

then we get that $a_iM_j \equiv a_jM_i \bmod P$, a contradiction. Because the length of the sum in this case is $q^{d-t'}$, we obtain a savings in each term of $q^{(d-t')\beta/2p}$ from Proposition 4.3. To obtain our desired savings of $q^{d\beta/2p}$, we must choose β + o(1) instead of β in the statement of Proposition 4.3. Similarly, to ensure that $d - t' \ge \max\{\epsilon m_1, \dots, \epsilon m_n, \delta k_1, \dots, \delta k_n\}$ we must choose ε − o(1) and δ − o(1). However, because the inequality from Eq. (4.33) is strict, we may increase β by o(1) and reduce ε and δ by o(1) in such a way that this inequality is still satisfied. □

We prove two corollaries that give weaker results under conditions that are simpler to state.


Corollary 4.6. Fix ε, δ > 0 and a positive integer n. Let q be a power of an odd prime p such that

(4.39) $q > p^2n^2e^2\max\left(1+\frac{2}{\epsilon},\ \frac{1}{\epsilon}+\frac{1}{\delta}\right)^2$.



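and distinct coprime pairs $(a_i, M_i) \in \mathcal{M}_{k_i}\times\mathcal{M}_{m_i}$ for 1 ≤ i ≤ n, we have

(4.41) $\sum_{g\in\mathcal{M}_d}\ \prod_{i=1}^{n}\mu(a_i+gM_i) = o(|\mathcal{M}_d|)$

as d → ∞ for fixed ε, δ, n, q.

Proof. The assumed lower bound on q is equivalent to

(4.42) $q > \left(\frac{pne}{\min\left\{\frac{\epsilon}{\epsilon+2},\ \frac{\epsilon\delta}{\epsilon+\delta}\right\}}\right)^{2}$.

If q satisfies this inequality, we can take β small enough that the inequality

(4.43) $q > \left(\frac{pne}{\min\left\{\frac{\epsilon}{\epsilon+2},\ \frac{\epsilon\delta}{\epsilon+\delta}\right\}}\right)^{\frac{2}{1-2\beta}}$

holds, and then apply Theorem 4.5. □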
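The equivalence of (4.39) and (4.42) rests on the identity 1/min{ε/(ε+2), εδ/(ε+δ)} = max(1 + 2/ε, 1/ε + 1/δ). A short exact check with rational arithmetic, given here as an illustrative sketch with hypothetical function names, is:

```python
from fractions import Fraction

def max_form(eps, delta):
    """max(1 + 2/eps, 1/eps + 1/delta), the factor appearing in (4.39)."""
    return max(1 + 2 / eps, 1 / eps + 1 / delta)

def min_form(eps, delta):
    """1 / min(eps/(eps+2), eps*delta/(eps+delta)), the factor appearing in (4.42)."""
    return 1 / min(eps / (eps + 2), eps * delta / (eps + delta))

if __name__ == "__main__":
    samples = [Fraction(1, 10), Fraction(1, 2), Fraction(1), Fraction(7, 3), Fraction(50)]
    print(all(max_form(e, d) == min_form(e, d) for e in samples for d in samples))
```

Both forms are decreasing in ε and δ, and as ε, δ → ∞ they tend to 1, which is the case used in Corollary 4.7.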

Corollary 4.7. Let q be a power of an odd prime p such that

(4.44) $q > p^2n^2e^2$,

and let $(a_i, M_i) \in \mathcal{M}_{k_i}\times\mathcal{M}_{m_i}$ be distinct coprime pairs for 1 ≤ i ≤ n. Then

(4.45) $\sum_{g\in\mathcal{M}_d}\ \prod_{i=1}^{n}\mu(a_i+gM_i) = o(|\mathcal{M}_d|)$

as d → ∞ for fixed $q, n, a_i, M_i$.

Proof. We can take ε, δ large enough that

(4.46) $q > p^2n^2e^2\max\left(1+\frac{2}{\epsilon},\ \frac{1}{\epsilon}+\frac{1}{\delta}\right)^2$.

For this ε and δ, the inequalities


will be satisfied for all d sufficiently large. We can then apply Corollary 4.6 to deduce the claim. □


5. Level of distribution

The following will be obtained using the techniques of [FM98] and [FKM14] in the appendix.

Theorem 5.1. Fix an odd prime power q. Then for any θ > 0, for nonnegative integers d, m with d ≤ m, squarefree $M \in \mathcal{M}_m$, and an additive character ψ mod M, we have

(5.1) $\sum_{\substack{g\in\mathcal{M}_d\\ (g,M)=1}} \mu(g)\,\psi(\overline{g}) \ll q^{\left(\frac{3}{16}+\theta\right)m+\frac{25}{32}d}$

as d → ∞, with the implied constant depending only on q and θ (here $\overline{g}$ denotes the inverse of g modulo M).

The following proposition allows us to identify the main term in sums of von Mangoldt in arithmetic progressions.

Proposition 5.2. Fix a prime power q. For nonnegative integers d, m, and a squarefree $M \in \mathcal{M}_m$ we have

(5.2) $\sum_{k=1}^{d} kq^{-k}\sum_{\substack{A\in\mathcal{M}_k\\ (A,M)=1}}\mu(A) = -\frac{q^m}{\varphi(M)} + q^{o(m+d)-d}$.

Proof. The left hand side above is the sum of the first d coefficients of the power series

$u\frac{d}{du}\sum_{k=1}^{\infty} q^{-k}u^k\sum_{\substack{A\in\mathcal{M}_k\\ (A,M)=1}}\mu(A) = u\frac{d}{du}\prod_{P\nmid M}\left(1-u^{d(P)}|P|^{-1}\right) = u\frac{d}{du}\,(1-u)\prod_{P\mid M}\left(1-u^{d(P)}|P|^{-1}\right)^{-1}$.

Summing all the coefficients of a power series evaluates it at u = 1. Hence, the main term comes from the equality

$\left(u\frac{d}{du}\big((1-u)F(u)\big)\right)(1) = -F(1)$, $F(u) = \prod_{P\mid M}\left(1-u^{d(P)}|P|^{-1}\right)^{-1}$.

The coefficients of degree greater than d contribute to the error term. To bound the sum of these coefficients, we can write the degree k coefficient as

(5.3) $k\oint_{|u|=r}\frac{(1-u)F(u)}{u^{k+1}}$

for r < q, getting a bound of

(5.4) $(1+r)\,r^{-k-1}\max_{|u|=r}F(u)$.

As long as the above maximum is subexponential in m for all r < q, the expression above will be $q^{o(m)}(1+r)r^{-k-1}$, so the sum of the coefficients of degree greater than d is

(5.5) $q^{o(m)}\sum_{k>d} k(1+r)r^{-k-1} = q^{o(m)}(r-o(1))^{-d}$.

Taking r arbitrarily close to q, the above is $q^{o(m)}(q-o(1))^{-d} = q^{o(m+d)-d}$.

The value of F(u) is indeed subexponential in m, because M has o(m) prime divisors and each contributes at most

(5.6) $\left(1-\left(\frac{r}{q}\right)^{d(P)}\right)^{-1} \le \left(1-\frac{r}{q}\right)^{-1}$.

□

In the following we deduce, from our results on the Möbius function, a level of distribution beyond 1/2 for primes in arithmetic progressions to squarefree moduli. We shall use the convolution identity Λ = −1 ∗ (µ · deg), which for $f \in \mathbb{F}_q[T]^+$ of degree d ≥ 0 says that

(5.7) $\Lambda(f) = -\sum_{k=1}^{d} k\sum_{A\in\mathcal{M}_k}\ \sum_{\substack{B\in\mathcal{M}_{d-k}\\ AB=f}}\mu(A)$.

Corollary 5.3. For any 0 < ω < 1/32, for any odd prime p and power q of p such that $q > p^2e^2\left(1+\frac{50}{1-32\omega}\right)^2$, the following holds. For nonnegative integers d, m with d ≥ (1 − ω)m, squarefree $M \in \mathcal{M}_m$, and a ∈ Fq[T] with d(a) < d + m and coprime to M, we have

(5.8) $\sum_{g\in\mathcal{M}_d}\Lambda(a+gM) = \frac{q^{d+m}}{\varphi(M)} + E_q$

as d → ∞, with a power saving error term $E_q$.

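Proof. We can assume ℓ = m, and use Eq. (5.7) to write our sum as

(5.9) $\sum_{\substack{f\in\mathcal{M}_{m+d}\\ f\equiv a \bmod M}}\Lambda(f) = -\sum_{k=1}^{d+m} k\sum_{\substack{A\in\mathcal{M}_k\\ (A,M)=1}}\mu(A)\sum_{\substack{B\in\mathcal{M}_{m+d-k}\\ AB\equiv a \bmod M}} 1$

so by Proposition 5.2 the range k ≤ d contributes the main term.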


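The (absolute value of the) contribution of any k > d is

(5.10) $q^{d-k}\left|\sum_{\substack{A\in\mathcal{M}_k\\ (A,M)=1}}\mu(A)\sum_{\substack{\psi : \mathbb{F}_q[T]/M\to\mathbb{C}^{\times}\\ \psi(f)=0 \text{ if } d(f)<m+d-k}}\psi(a\overline{A})\,\psi(T^{m+d-k})\right| \le q^{d-k}\sum_{\substack{\psi : \mathbb{F}_q[T]/M\to\mathbb{C}^{\times}\\ \psi(f)=0 \text{ if } d(f)<m+d-k}}\left|\sum_{\substack{A\in\mathcal{M}_k\\ (A,M)=1}}\mu(A)\,\psi(a\overline{A})\right|$

so by Theorem 5.1 we get

(5.11) $\ll q^{d-k}q^{k-d}q^{\left(\frac{3}{16}+\theta\right)m+\frac{25}{32}k} = q^{\left(\frac{3}{16}+\theta\right)m+\frac{25}{32}k}$

which gives a power saving as long as

(5.12) $\left(\frac{3}{16}+2\theta\right)m+\frac{25}{32}k < d$.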

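The (absolute value of the) contribution of any other k is at most

(5.13) $\sum_{\substack{B\in\mathcal{M}_{m+d-k}\\ (B,M)=1}}\left|\sum_{\substack{A\in\mathcal{M}_k\\ AB\equiv a \bmod M}}\mu(A)\right| = \sum_{\substack{B\in\mathcal{M}_{m+d-k}\\ (B,M)=1}}\left|\sum_{g\in\mathcal{M}_{k-m}}\mu(b+gM)\right|$

where b is the unique monic polynomial of degree m congruent to $aB^{-1}$ modulo M. We apply Theorem 4.5 with some fixed β > 0 and with

(5.14) $\epsilon = \delta = \frac{32}{25}\left(\frac{1}{32}-\omega-2\theta\right)$.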

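Then, because (5.12) does not hold, we have

$k-m \ge \frac{32}{25}\left(d-\left(\frac{31}{32}+2\theta\right)m\right) \ge \frac{32}{25}\left((1-\omega)m-\left(\frac{31}{32}+2\theta\right)m\right) = \epsilon m$

so the conditions of Theorem 4.5 are satisfied as long as

(5.15) $q > \left(pe\left(1+\frac{25}{16}\cdot\frac{1}{\frac{1}{32}-\omega-2\theta}\right)\right)^{\frac{2}{1-2\beta}}$.

Because we may take β and θ arbitrarily small, it suffices to have

(5.16) $q > p^2e^2\left(1+\frac{25}{16}\cdot\frac{1}{\frac{1}{32}-\omega}\right)^2 = p^2e^2\left(1+\frac{50}{1-32\omega}\right)^2$.

Summation over k gives only an extra logarithmic factor, so it preserves our power savings. □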
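The arithmetic linking (5.12), (5.14), (5.15) and (5.16) can be verified exactly with rational numbers. The sketch below uses hypothetical helper names and arbitrary sample values of ω, θ and m. It checks that, when n = 1 and ε = δ, the factor in Theorem 4.5 is 1 + 2/ε, that this matches the expression in (5.15), and that at θ = 0 it reduces to the closed form in (5.16).

```python
from fractions import Fraction

def eps(omega, theta):
    """The choice of epsilon = delta made in (5.14)."""
    return Fraction(32, 25) * (Fraction(1, 32) - omega - 2 * theta)

def factor_515(omega, theta):
    """The factor 1 + (25/16) / (1/32 - omega - 2*theta) from (5.15)."""
    return 1 + Fraction(25, 16) / (Fraction(1, 32) - omega - 2 * theta)

def factor_516(omega):
    """The closed form 1 + 50 / (1 - 32*omega) from (5.16)."""
    return 1 + 50 / (1 - 32 * omega)

def k_minus_m_lower_bound(d, m, theta):
    """Lower bound for k - m when (5.12) fails, i.e. (25/32) k >= d - (3/16 + 2 theta) m."""
    return Fraction(32, 25) * (d - (Fraction(31, 32) + 2 * theta) * m)

if __name__ == "__main__":
    omega, theta = Fraction(1, 40), Fraction(1, 1000)
    print(1 + 2 / eps(omega, theta) == factor_515(omega, theta))
    print(factor_515(omega, 0) == factor_516(omega))
```

With d = (1 − ω)m the lower bound for k − m equals εm exactly, which is the borderline case of the displayed chain of inequalities.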


6. Twins

6.1. Chowla sums over primes. We establish cancellation in Möbius autocorrelation over primes.

Corollary 6.1. Fix ε, δ > 0, 0 < α < 1, 0 < β < 1/2, and a positive integer n. Let q be a power of an odd prime p such that

$q > \left(p(n+1)e\max\left(1+\frac{2+2\alpha+4\epsilon^{-1}}{1-\alpha},\ \frac{1+\alpha+2\epsilon^{-1}+2\delta^{-1}}{1-\alpha}\right)\right)^{\frac{2}{1-2\beta}}$.

Take nonnegative integers $d, m_1, \dots, m_n, k_1, \dots, k_n$ with


and distinct coprime pairs $(a_i, M_i) \in \mathcal{M}_{k_i}\times\mathcal{M}_{m_i}$ for 1 ≤ i ≤ n.

Furthermore let m, k be nonnegative integers with m ≤ αd, k < m + d, let $M \in \mathcal{M}_m$, and let $a \in \mathcal{M}_k$ with (a, M) distinct from $(a_i, M_i)$ for every 1 ≤ i ≤ n. Then

(6.2) $\sum_{g\in\mathcal{M}_d}\Lambda(a+gM)\prod_{i=1}^{n}\mu(a_i+gM_i) \ll |\mathcal{M}_d|^{1-\frac{\beta(1-\alpha)}{2p}}$

as d → ∞, with the implied constant depending only on ε, δ, α, n and q.

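Proof. We can assume (a, M) = 1, set $z = \left\lfloor\frac{d+m}{2}\right\rfloor$, and use Eq. (5.7) to write

$\Lambda(a+gM) = -\sum_{b=1}^{z} b\sum_{\substack{B\in\mathcal{M}_b\\ B\mid a+gM}}\mu(B) - \sum_{c=0}^{d+m-z-1}(d+m-c)\sum_{\substack{C\in\mathcal{M}_c\\ C\mid a+gM}}\mu\left(\frac{a+gM}{C}\right)$.

For every B above, taking a monic N = N(B) ∈ Fq[T] with

(6.3) $MN \equiv -a \bmod B$, $d(N) \ne k_i-m_i$, $d(N) \le b+n$,

and writing g = N + hB with $h \in \mathcal{M}_{d-b}$, we see that the sum over b contributes

(6.4) $\sum_{b=1}^{\left\lfloor\frac{d+m}{2}\right\rfloor} b\sum_{B\in\mathcal{M}_b}\mu(B)\sum_{h\in\mathcal{M}_{d-b}}\ \prod_{i=1}^{n}\mu(a_i+NM_i+hBM_i)$

so we can apply Theorem 4.5 to the innermost sum, taking

(6.5) $\epsilon = \frac{1-\alpha}{1+\alpha+2\epsilon^{-1}} \le \frac{d-m}{d+m+2m_i} = \frac{d-\frac{d+m}{2}}{\frac{d+m}{2}+m_i} \le \frac{d-b}{b+m_i}$

and

(6.6) $\delta = \min\left(\frac{1-\alpha}{2\delta^{-1}},\ \frac{1-\alpha}{1+\alpha+2\epsilon^{-1}}\right) \le \min\left(\frac{d-b}{k_i},\ \frac{d-b}{b+m_i}\right) \le \min\left(\frac{d-b}{\deg a_i},\ \frac{d-b}{\deg NM_i}+o(1)\right)$.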


To obtain the condition on q, note that, when calculating $\max\left(1+\frac{2}{\epsilon},\ \frac{1}{\epsilon}+\frac{1}{\delta}\right)$, we can treat δ as $\frac{1-\alpha}{2\delta^{-1}}$, because if the other term is smaller, then $\frac{1}{\epsilon}+\frac{1}{\delta}$ is dominated by $1+\frac{2}{\epsilon}$ anyways.

Taking $L \in \mathbb{F}_q[T]^+$ with

(6.7) $ML \equiv -a \bmod C$, $d(L) \ne k_i-m_i$, $d(L) \ne d$, $d(L) \le c+n+1$,

and writing g = L + hC with $h \in \mathcal{M}_{d-c}$, we see that the sum over c contributes

$\sum_{c=0}^{d+m-z-1}(d+m-c)\sum_{C\in\mathcal{M}_c}\ \sum_{h\in\mathcal{M}_{d-c}}\mu\left(\frac{a+LM}{C}+hM\right)\prod_{i=1}^{n}\mu(a_i+LM_i+hCM_i)$

and we again apply Theorem 4.5 to the innermost sum, with the same values of ε and δ. Everything is the same as before, with b replaced by c, except for two things.

(1) We can no longer use the inequality $b \le \frac{m+d}{2}$, but rather the slightly weaker inequality $c \le \frac{m+d+1}{2}$.

(2) The term $\mu\left(\frac{a+LM}{C}+hM\right)$ appears, which means we must check that

(6.8) $(d-c) \ge \epsilon m$, $(d-c) \ge \epsilon\, d\left(\frac{a+LM}{C}\right)$.

However, (1) is no difficulty, as we may assume d sufficiently large and perturb the parameters ε and δ slightly to ensure the inequalities of Theorem 4.5 still hold. For this reason we will assume $c \le \frac{m+d}{2}$ while handling (2) as well.

To check that (d − c) ≥ εm we observe that, because $c \le \frac{m+d}{2}$ and m ≤ αd, we have $d-c \ge \frac{d-m}{2}$, and thus $\frac{d-c}{m} \ge \frac{1-\alpha}{2\alpha} \ge \frac{1-\alpha}{1+\alpha} \ge \epsilon$.

To check that

(6.9) $d\left(\frac{a+LM}{C}\right) \le \max(m+d-c,\ m+n) \le \delta^{-1}(d-c)$
we observe

δ ≤1 − α

1 + α=

2(d−m)

d+m=

1m

(d−m)/2 + 1≤

1md−c + 1

=d− c

m+ d− c.

�

6.2. Singular series. For nonzero a ∈ Fq[T] define

(6.10) $S_q(a) = \prod_{P\mid a}\left(1-|P|^{-1}\right)^{-1}\prod_{P\nmid a}\left(1-(|P|-1)^{-2}\right)$.

The following proposition allows us to identify the main term in our twin prime number theorem. An analogous result over the integers is proved in [GY03, Lemma 2.1].


Proposition 6.2. Fix a prime power q. Then for an integer n ≥ 1 and a nonzero a ∈ Fq[T] we have

(6.11) $\sum_{k=1}^{n} k\sum_{\substack{M\in\mathcal{M}_k\\ (M,a)=1}}\frac{\mu(M)}{\varphi(M)} = -S_q(a) + (q-1)^{o(n+d(a))-n}$.

Proof. We are interested in the sum of the first n coefficients of

(6.12) $Z(u) = u\frac{d}{du}\sum_{k=1}^{\infty} u^k\sum_{\substack{M\in\mathcal{M}_k\\ (M,a)=1}}\frac{\mu(M)}{\varphi(M)} = u\frac{d}{du}\prod_{P\nmid a}\left(1-u^{d(P)}(|P|-1)^{-1}\right) = u\frac{d}{du}\big((1-u)G(u)\big)$

where G(u) is

$\prod_{P\mid a}\left(1-u^{d(P)}|P|^{-1}\right)^{-1}\prod_{P\nmid a}\left(1-u^{d(P)}|P|^{-1}\right)^{-1}\left(1-u^{d(P)}(|P|-1)^{-1}\right)$.

The sum of all coefficients equals

(6.13) $Z(1) = -G(1) = -S_q(a)$

because

(6.14) $\left(1-|P|^{-1}\right)^{-1}\left(1-(|P|-1)^{-1}\right) = 1-(|P|-1)^{-2}$.

As in Proposition 5.2, to prove the bound for the error term it suffices to prove that G(u) is bounded subexponentially in a for u on each circle of radius < q − 1.

Note that

(6.15) $\left(1-u^{d(P)}|P|^{-1}\right)^{-1}\left(1-u^{d(P)}(|P|-1)^{-1}\right) = 1-\frac{u^{d(P)}}{|P|\,(|P|-1)\left(1-u^{d(P)}|P|^{-1}\right)}$

so G(u) can be rewritten as

$\prod_{P}\left(1-\frac{u^{d(P)}}{|P|\,(|P|-1)\left(1-u^{d(P)}|P|^{-1}\right)}\right)\prod_{P\mid a}\left(1-u^{d(P)}(|P|-1)^{-1}\right)^{-1}$.

The first product above is independent of a and converges on the disc where |u| < q. On a circle of radius r, the value of each term in the second product is at most

(6.16) $\left(1-\frac{r^{d(P)}}{q^{d(P)}-1}\right)^{-1} \le \left(1-\frac{r}{q-1}\right)^{-1}$

and thus is bounded, and the number of terms is o(d(a)), so the product is subexponential in a.

□
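The local identity (6.14) and the size of the singular series (6.10) can be explored numerically. The sketch below is illustrative only. It checks (6.14) exactly for a range of norms, counts monic irreducible polynomials of each degree with the classical necklace formula, and evaluates a truncated Euler product for $S_q(1)$ (no prime divides a = 1). The function names and the truncation degree are assumptions made for this example.

```python
from fractions import Fraction

def local_identity_holds(x):
    """Exact check of (6.14) for a prime of norm x = |P| > 1."""
    x = Fraction(x)
    return (1 - 1 / x) ** -1 * (1 - 1 / (x - 1)) == 1 - 1 / (x - 1) ** 2

def mobius_int(n):
    """Classical Moebius function on positive integers."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def irreducible_count(q, n):
    """Number of monic irreducible polynomials of degree n over F_q."""
    total = sum(mobius_int(d) * q ** (n // d) for d in range(1, n + 1) if n % d == 0)
    return total // n

def singular_series_at_one(q, max_degree):
    """Euler product (6.10) for a = 1, truncated at primes of degree <= max_degree."""
    value = 1.0
    for n in range(1, max_degree + 1):
        value *= (1 - 1 / (q ** n - 1) ** 2) ** irreducible_count(q, n)
    return value

if __name__ == "__main__":
    print(all(local_identity_holds(x) for x in range(2, 200)))
    print(singular_series_at_one(3, 10))
```

The truncated product stabilizes quickly because the factor for primes of degree n differs from 1 by roughly $q^{-n}/n$.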


6.3. Hardy-Littlewood.

Theorem 6.3. For every odd prime number p, and power q of p with

(6.17) $q > 685090\,p^2$,

there exists λ > 0 such that the following holds. For nonnegative integers d > ℓ, and $a \in \mathcal{M}_{\ell}$ we have

(6.18) $\sum_{f\in\mathcal{M}_d}\Lambda(f)\Lambda(f+a) = S_q(a)\,q^d + O\left(q^{(1-\lambda)d}\right)$

as d → ∞, with the implied constant depending only on q.

Proof. Using Eq. (5.7) our sum becomes

(6.19) $-\sum_{k=0}^{d} k\sum_{M\in\mathcal{M}_k}\mu(M)\sum_{N\in\mathcal{M}_{d-k}}\Lambda(a+NM)$.

The appearance of the factor µ(M) allows us to consider only squarefree M, so by Corollary 5.3, the contribution of the range 0 ≤ k ≤ d/(2 − ω) is

(6.20) $-q^d\sum_{k=0}^{d/(2-\omega)} k\sum_{\substack{M\in\mathcal{M}_k\\ (a,M)=1}}\frac{\mu(M)}{\varphi(M)} + O\left(d\,q^{\frac{d}{2-\omega}}E_q\right)$

which by Proposition 6.2 equals

(6.21) $q^dS_q(a) + O\left(d\,q^{\frac{d}{2-\omega}}E_q + q^dE'_q\left(\frac{d}{2-\omega}\right)\right)$.

The error term is of power savings size.

In the other range, we need to prove a power savings bound for

(6.22) $\sum_{k>d/(2-\omega)}^{d} k\sum_{N\in\mathcal{M}_{d-k}}\ \sum_{M\in\mathcal{M}_k}\Lambda(a+MN)\mu(M)$

and this is done by applying Corollary 6.1 to the innermost sum with

(6.23) n = 1, ε = ∞, δ = ∞, $\alpha = 1-\omega > \frac{d-k}{k}$

and β > 0 but very small. This requires

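(6.24) $q > \left(2pe\left(1+\frac{2+2-2\omega}{\omega}\right)\right)^2 = \left(2pe\left(\frac{4}{\omega}-1\right)\right)^2$

and Corollary 5.3 requires

(6.25) $q > p^2e^2\left(1+\frac{50}{1-32\omega}\right)^2$

so the optimal bound is obtained by solving

(6.26) $2\left(\frac{4}{\omega}-1\right) = 1+\frac{50}{1-32\omega}$

whose solution is

(6.27) $\omega = \frac{103-\sqrt{\frac{30803}{3}}}{64} = .0261\ldots$

which satisfies

(6.28) $\left(2e\left(\frac{4}{\omega}-1\right)\right)^2 < 685090$.

□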
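The optimization in (6.26) to (6.28) can be reproduced numerically. Clearing denominators in (6.26) gives the quadratic 96ω² − 309ω + 8 = 0, whose smaller root lies in (0, 1/32). The sketch below, with hypothetical function names, computes that root in floating point and evaluates the resulting constant, so the output is an approximation rather than an exact value.

```python
from math import e, sqrt

def optimal_omega():
    """Smaller root of 96 w^2 - 309 w + 8 = 0, obtained from (6.26)."""
    return (309 - sqrt(309 ** 2 - 4 * 96 * 8)) / 192

def threshold_constant():
    """The quantity (2e(4/w - 1))^2 from (6.28) at the optimal w."""
    w = optimal_omega()
    return (2 * e * (4 / w - 1)) ** 2

if __name__ == "__main__":
    w = optimal_omega()
    print(w)                                        # close to 0.0261
    print(2 * (4 / w - 1), 1 + 50 / (1 - 32 * w))   # the two sides of (6.26)
    print(threshold_constant())                     # slightly below 685090
```

Both requirements (6.24) and (6.25) coincide at this value of ω, which is why the constant in (6.17) is 685090.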

7. Results for Small q

We prove Theorem 1.11.

Proof. Set d(f) = d. We need to show that by suitably changing the coefficients of f in degree at most ηd, one can arrive at a polynomial with a given (nonzero) Möbius value.

Let c < ηd be the largest even integer not divisible by 3. Note that

(7.1) c ≥ ηd − 4.

We take the coefficient of $T^c$ in f to be 1, and the coefficient of $T^k$ to be 0 for every k < c that is not divisible by 3. Hence, it is enough to show that

(7.2) $1, -1 \in \left\{\mu(f+b^3) : b \in \mathcal{M}_{\lfloor c/3\rfloor}\right\}$.

By Lemma 3.2 with M = 1, a = f, and $g = b^3$, our set equals

(7.3) $\left\{S\cdot\chi(w+b^3) : b \in \mathcal{M}_{\lfloor c/3\rfloor}\right\}$

and since the highest power of T that divides $(f+b^3)' = f'$ is $T^{c-1}$, we conclude that S = ±1 and from Remark 3.3 that χ is a nonprincipal character (to a squarefree modulus E). Arguing as in Eq. (4.11) to ‘extract third roots’, we are thus led to consider

(7.4) $\left\{\chi(w+b) : b \in \mathcal{M}_{\lfloor c/3\rfloor}\right\}$.

From Lemma 3.2 we further conclude that

(7.5) d(E) ≤ d − c + 1 ≤ (1 − η)d + 5,

and on the other hand

(7.6) $\lfloor c/3\rfloor \ge \frac{c}{3}-1 \ge \frac{\eta d-4}{3}-1 \ge \frac{\eta}{3}d-3$,

so combining the two we get

(7.7) $\frac{\lfloor c/3\rfloor}{d(E)} \ge \frac{\frac{\eta}{3}d-3}{(1-\eta)d+5}$.

Since we have assumed that 3/7 < η < 1, the right hand side of the above tends to a quantity greater than 1/4 as d → ∞. Consequently, we can use the (function field version of the) Burgess bound (as stated for instance in [Bur63, Theorem 2]) to show that 1 and −1 belong to the set above. Such a version is obtained in [Hsu99]. □
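The role of the threshold 3/7 is visible from the limit of the right hand side of (7.7), which is (η/3)/(1 − η); this exceeds 1/4 exactly when η > 3/7. The following sketch, with hypothetical function names, checks this with exact rational arithmetic and evaluates the finite-d expression for one sample value.

```python
from fractions import Fraction

def limiting_ratio(eta):
    """Limit as d -> infinity of the right hand side of (7.7)."""
    eta = Fraction(eta)
    return (eta / 3) / (1 - eta)

def ratio_at(eta, d):
    """Right hand side of (7.7) for a given degree d."""
    eta = Fraction(eta)
    return (eta / 3 * d - 3) / ((1 - eta) * d + 5)

if __name__ == "__main__":
    print(limiting_ratio(Fraction(3, 7)))          # the borderline value
    print(float(ratio_at(Fraction(1, 2), 10**6)))  # a sample with eta = 1/2
```

For η only slightly above 3/7 the finite-d ratio exceeds 1/4 only once d is large, which is consistent with the asymptotic nature of the statement.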


Now we prove Theorem 1.12.

Proof. Set k = d(P). For positive integers d, n, we seek cancellation in

(7.8) $\sum_{g\in\mathcal{M}_d}\mu(a+gP^n)$

where $a \in \mathbb{F}_q[T]^+$ satisfies d(a) < nk and $a \equiv 1 \bmod P^{n-2}$. We assume first that 3 | n, and follow the proof of Proposition 4.3 up to Eq. (4.23), getting

(7.9) $\sum_{d(h)<t}\chi_r(f+h)$

with $\chi_r$ a character mod $E_r$. If $\chi_r$ is principal, then by Remark 3.3 we have

(7.10) $P^{2n}r'+a'P^n = D_r^{(1)} = AB^2$, A, B ∈ Fq[T], $A \mid P^n$, $P \nmid B$

and since $P^{n-3} \mid a'$, we conclude that

(7.11) $P^3r'+\frac{a'}{P^{n-3}} = AB^2$, A, B ∈ Fq[T], A | P.

There are $\ll q^{d/2}$ choices of r ∈ R satisfying the above, so those can be neglected.

For r ∈ R with $\chi_r$ nonprincipal, we note that

(7.12) $d(E_r) \le d\left(\mathrm{rad}\left(P^{2n}r'+a'P^n\right)\right) \le d\left(P^3r'+\frac{a'}{P^{n-3}}\right)+k$

so for large enough d we have $t/d(E_r) > 1/4$, hence cancellation in Eq. (7.9) is guaranteed by Burgess.

Suppose now that 3 | n + β for some β ∈ {1, 2}, and write

(7.13) $\sum_{g\in\mathcal{M}_d}\mu(1+gP^n) = \sum_{d(g_0)<\beta k}\ \sum_{g_1\in\mathcal{M}_{d-\beta k}}\mu\left(1+g_0P^n+g_1P^{n+\beta}\right)$

for d ≥ β. We have thus reduced to the previous case with $a = 1+g_0P^n$. □

Remark 7.1. We see from the proof that 1 is not the only residue class for which the argument works. Also, the modulus does not have to be a power of a fixed prime, but it has to be ‘multiplicatively close’ to a cube.

Appendix A. Orthogonality of Möbius and inverse additive characters

We explain how a variant of the results of [FM98] carries over to function fields and gives Theorem 5.1.

A standard strategy in the treatment of sums such as those from Eq. (5.1) is to use a combinatorial identity for the Möbius function. Following [FM98], we use Vaughan’s identity, which for $f \in \mathbb{F}_q[T]^+$ gives

(A.1) $\mu(f) = -\sum_{d(g)\le\alpha}\ \sum_{\substack{d(h)\le\beta\\ gh\mid f}}\mu(g)\mu(h) + \sum_{d(g)>\alpha}\ \sum_{\substack{d(h)>\beta\\ gh\mid f}}\mu(g)\mu(h)$

where summation is over monic polynomials, and α, β are nonnegative integers with max{α, β} < d(f). For a proof (that is also valid for function fields) see [IK04, Proposition 13.5].

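By applying Vaughan’s identity as in [FM98, Section 6], we reduce our task to bounding sums of type I:

(A.2) $\Sigma^{(I)}_{k,r} = \sum_{f\in\mathcal{M}_k}\ \sum_{\substack{g\in\mathcal{M}_r\\ (fg,M)=1}}\gamma_f\,\psi\big(\overline{fg}\big)$

where $k \le \frac{1}{8}d(M)+\frac{7}{16}d$, k + r ≤ d, $|\gamma_f| \le 1$, and sums of type II:

(A.3) $\Sigma^{(II)}_{k,r} = \sum_{f\in\mathcal{M}_k}\ \sum_{\substack{g\in\mathcal{M}_r\\ (fg,M)=1}}\gamma_f\,\delta_g\,\psi\big(\overline{fg}\big)$

with $k \ge \frac{1}{8}d(M)+\frac{7}{16}d$, $r \ge \frac{7}{16}d-\frac{3}{8}d(M)$, k + r ≤ d, $|\delta_g| \le 1$. For every ε > 0, we need the bounds

(A.4) $\Sigma^{(I)}_{k,r} \ll q^{\left(\frac{3}{16}+\epsilon\right)d(M)+\frac{25}{32}d}$, $\Sigma^{(II)}_{k,r} \ll q^{d+\epsilon d(M)-\frac{r}{2}}+q^{d+\left(\frac{1}{4}+\epsilon\right)d(M)-\frac{k}{2}}$

that are analogous to [FM98, Equation 6.4]. The bounds (A.4) then imply (5.1) by the argument of [FM98, Section 6].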

A.1. Sums of type I. Following first the bilinear shifting trick argument of [FM98, §4], we obtain the inequality

(A.5) $\Sigma^{(I)}_{k,r} \ll \frac{1}{V}\sum_{\substack{a\in\mathcal{M}_{d_A}\\ (a,M)=1}}\ \sum_{f\in\mathcal{M}_k}\ \sum_{g\in\mathcal{M}_r}\left|\sum_{\substack{b\in\mathcal{M}_{d_B}\\ (a^{-1}g+b,M)=1}}\psi\left(\overline{af(a^{-1}g+b)}\right)\right|$

where, in analogy with the variables A, B from [FM98, (4.8)],

(A.6) $d_A = \frac{3r-k}{4}$, $d_B = \frac{k+r}{4}-1$

and

(A.7) $V = \#\{ab : a \in \mathcal{M}_{d_A},\ b \in \mathcal{M}_{d_B},\ (a,M)=1\} \gg q^{d_A+d_B-\epsilon d(M)}$.

The parameter t and the term e(−bt) from [FM98] do not appear here, as they arise from the failure of an archimedean interval to be perfectly invariant under a shift.


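Now following the Hölder’s inequality argument from [FM98, §4.a], we get, as in [FM98, Equation 4.6], the bound

(A.8) $\left|\sum_{\substack{a\in\mathcal{M}_{d_A}\\ (a,M)=1}}\ \sum_{f\in\mathcal{M}_k}\ \sum_{g\in\mathcal{M}_r}\left|\sum_{\substack{b\in\mathcal{M}_{d_B}\\ a^{-1}g+b\in A^{\times}}}\psi\left(\overline{af(a^{-1}g+b)}\right)\right|\right|^{6} \ll q^{5(k+r+d_A)+\epsilon d(M)}\sum_{b_1,\dots,b_3'\in\mathcal{M}_{d_B}}\left|\sum_{\substack{h\in A\\ h+b_i\in A^{\times}\\ h+b_i'\in A^{\times}}}\ \sum_{\substack{d(s)\le k+d_A\\ (s,M)=1}}\psi\left(\Delta_{\mathbf{b}}(h,s)\right)\right|$

where A = Fq[T]/(M),

(A.9) $\mathbf{b} = (b_1, b_2, b_3, b_1', b_2', b_3') \in \mathbb{F}_q[T]^6$,

and the function $\Delta_{\mathbf{b}}(h,s)$ is defined by

(A.10) $\Delta_{\mathbf{b}}(h,s) = \sum_{i=1}^{3}\left[\frac{1}{(h+b_i)s}-\frac{1}{(h+b_i')s}\right]$.

Because of our conditions $h+b_i \in A^{\times}$, $h+b_i' \in A^{\times}$, and (s, M) = 1, the function $\Delta_{\mathbf{b}}$ is never evaluated where it is undefined. The Hölder’s inequality argument is slightly more involved because the invertibility assumptions are more complex in case M is not prime, but the idea is essentially the same.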

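As in [FM98], we complete the right hand side of Eq. (A.8) and get

(A.11) $\sum_{\substack{h\in A\\ h+b_i\in A^{\times}\\ h+b_i'\in A^{\times}}}\ \sum_{\substack{d(s)\le k+d_A\\ (s,M)=1}}\psi\left(\Delta_{\mathbf{b}}(h,s)\right) = q^{k+d_A-d(M)}\sum_{\substack{z\in\mathbb{F}_q[T]\\ d(z)<d(M)-k-d_A}}\ \sum_{\substack{h\in A\\ h+b_i\in A^{\times}\\ h+b_i'\in A^{\times}}}\ \sum_{s\in A^{\times}}\psi\left(\Delta_{\mathbf{b}}(h,s)+zs\right)$.

In order to treat the innermost sum on the right hand side, we note that

(A.12) $\Delta_{\mathbf{b}}(h,s)+zs = \frac{1}{s}\sum_{i=1}^{3}\left[\frac{1}{h+b_i}-\frac{1}{h+b_i'}\right]+zs$

so the aforementioned innermost sum over s is a Kloosterman sum. We put

(A.13) $A_{\mathbf{b}} = \left\{x \in A : (x+b_1)\cdots(x+b_3') \in A^{\times}\right\}$,

define a function $R_{\mathbf{b}} : A_{\mathbf{b}} \to A$ by

(A.14) $R_{\mathbf{b}}(x) = \sum_{i=1}^{3}\left(\frac{1}{x+b_i}-\frac{1}{x+b_i'}\right)$,

and a Kloosterman sum

(A.15) $S(x,z) = \sum_{y\in A^{\times}}\psi\left(xy^{-1}+zy\right)$, x ∈ A, z ∈ Fq[T].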

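In this notation, for any z ∈ Fq[T] we have

(A.16) $\sum_{h\in A_{\mathbf{b}}}\ \sum_{s\in A^{\times}}\psi\left(\Delta_{\mathbf{b}}(h,s)+zs\right) = \sum_{x\in A_{\mathbf{b}}}S(R_{\mathbf{b}}(x),z)$

and the following claim.

Proposition A.1. For $\sigma \in S_3$ and

(A.17) $M^{\mathbf{b}}_{\sigma} = \gcd\left(M,\ b_1-b'_{\sigma(1)},\ b_2-b'_{\sigma(2)},\ b_3-b'_{\sigma(3)}\right)$

we have if p > 3 the bound

(A.18) $\left|\sum_{x\in A_{\mathbf{b}}}S(R_{\mathbf{b}}(x),z)\right| \ll |M|\,d_2(M)^4\left|\operatorname{lcm}_{\sigma\in S_3}\gcd\left(M^{\mathbf{b}}_{\sigma}, z\right)\right|$

with the implied constant depending only on q.

If p = 3, with

(A.19) $M^{\mathbf{b}}_{\triangle} = \gcd\left(M,\ b_1-b_2,\ b_2-b_3,\ b_1'-b_2',\ b_2'-b_3'\right)$,

we have the bound

(A.20) $\left|\sum_{x\in A_{\mathbf{b}}}S(R_{\mathbf{b}}(x),z)\right| \ll |M|\,d_2(M)^4\left|\operatorname{lcm}_{\sigma\in S_3\cup\{\triangle\}}\gcd\left(M^{\mathbf{b}}_{\sigma}, z\right)\right|$

with the implied constant depending only on q.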

Proof. Since both the bound and the sum are multiplicative in M , it sufficesto handle the case when M is prime, where we show that

(A.21)

∣∣∣∣∣∣

∑

x∈A\{b1,...,b′3}

S(Rb(x),m)

∣∣∣∣∣∣≤ 16|A|

unless z = 0 and either

• for some σ ∈ S3 we have bi = b′σ(i) for all 1 ≤ i ≤ 3;

• or p = 3, b1 = b2 = b3, and b′1 = b′2 = b′3;

in which case we have the trivial bound

(A.22)

∣∣∣∣∣∣

∑

x∈A\{b1,...,b′3}

S(Rb(x), 0)

∣∣∣∣∣∣≤ |A|2.

The relevance of these conditions is that the residue of the pole of Rb ata point x equals

(A.23) #{1 ≤ i ≤ 3 : bi = −x} − #{1 ≤ i ≤ 3 : b′i = −x}


so it is nonzero whenever these two numbers are not equal, except whenp = 3, one of these numbers is 3, and the other is zero. Hence, Rb has apole unless each bi is equal to some bσ(i)′ , or p = 3, all the bi are equal, andthe b′i are also all equal.

Excluding these ‘trivial’ values of b for z = 0, we get that the rational function $R_b$ is nonconstant and at most 6 to 1. Hence, if $R_b(x) \neq 0$ we get $S(R_b(x), 0) = -1$, while for the values of x with $R_b(x) = 0$, at most 6 in number, we have

(A.24) S(0, 0) = |A| − 1.

In total, we get

(A.25) $\left| \sum_{x \in A \setminus \{b_1, \dots, b_3'\}} S(R_b(x), 0) \right| \le |A| + 6|A| = 7|A|.$

Suppose from now on that $z \neq 0$. Note that if the rational function $R_b$ is constant then it necessarily vanishes identically, so we get

(A.26) $\left| \sum_{x \in A \setminus \{b_1, \dots, b_3'\}} S(0, z) \right| \le |A|.$

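We can thus assume that $R_b$ is nonconstant, and let $\mathcal{K}\ell_2$ be the Kloosterman sheaf defined by Katz. With this notation, our sum can be written as

(A.27) $\begin{aligned} & -\sum_{x \in A \setminus \{b_1, \dots, b_3'\}} \operatorname{tr}\left(\operatorname{Frob}_{|A|}, (\mathcal{K}\ell_2)_{R_b(x) z}\right) \\ &= -\sum_{x \in A \setminus \{b_1, \dots, b_3'\}} \operatorname{tr}\left(\operatorname{Frob}_{|A|}, ([zR_b]^* \mathcal{K}\ell_2)_x\right) \\ &= -\sum_{i=0}^{2} (-1)^i \operatorname{tr}\left(\operatorname{Frob}_{|A|}, H^i_c\left(\mathbb{A}^1_{\mathbb{F}_q} \setminus \{-b_1, \dots, -b_3'\}, [zR_b]^* \mathcal{K}\ell_2\right)\right) \end{aligned}$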

by the Grothendieck-Lefschetz fixed point formula. Because the geometric monodromy group of $\mathcal{K}\ell_2$ is $SL_2$, which is connected, the geometric monodromy group of its pullback by any finite covering map is $SL_2$, which has no monodromy coinvariants, so the cohomology groups in degree 0 and 2 vanish. As $\mathcal{K}\ell_2$ is pure of weight 1, its pullback by a finite covering map is mixed of weight at most 1, so by Deligne’s theorem, the eigenvalues of $\operatorname{Frob}_{|A|}$ on $H^1_c\left(\mathbb{A}^1_{\mathbb{F}_q} \setminus \{-b_1, \dots, -b_3'\}, [zR_b]^* \mathcal{K}\ell_2\right)$ have absolute value at most |A|. Hence, in order to bound our sum by 16|A|, it suffices to prove that the dimension of the above cohomology group is at most 16.

By the aforementioned vanishing of cohomology in degrees 0 and 2, the dimension of our cohomology group equals minus the Euler characteristic. Since $[zR_b]^* \mathcal{K}\ell_2$ is lisse of rank 2 on $\mathbb{A}^1_{\mathbb{F}_q} \setminus \{-b_1, \dots, -b_3'\}$, its Euler characteristic is twice the Euler characteristic of $\mathbb{A}^1_{\mathbb{F}_q} \setminus \{-b_1, \dots, -b_3'\}$, which is

(A.28) $2\left(1 - \#\{-b_1, \dots, -b_3'\}\right),$

minus the sum of the Swan conductors at each singular point. Because the rational function $zR_b$ has a zero at ∞ and a pole of order at most 1 at each $b_i$ or $b_i'$, the Swan conductor of $[zR_b]^* \mathcal{K}\ell_2$ at ∞ vanishes and the Swan conductor of $[zR_b]^* \mathcal{K}\ell_2$ at $b_i$ or $b_i'$ is at most 1, so the total Euler characteristic is at least

(A.29) $2 - 3\,\#\{-b_1, \dots, -b_3'\} \ge 2 - 3 \cdot 6 = -16.$

□
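As a numerical sanity check on the prime case of Proposition A.1, the following Python sketch evaluates the sum in (A.21) in the simplest possible setting. It assumes A = F_p for a prime p > 3 (that is, q = p and M of degree 1) and takes ψ(t) = exp(2πit/p); the function names and the sample shifts b, b′ are illustrative choices, not taken from the paper.

```python
import cmath

def psi(t, p):
    """Additive character of F_p (an assumed choice): t -> exp(2*pi*i*t/p)."""
    return cmath.exp(2j * cmath.pi * (t % p) / p)

def kloosterman(x, z, p):
    """S(x, z) = sum over y in F_p^* of psi(x*y^{-1} + z*y), as in (A.15)."""
    return sum(psi(x * pow(y, -1, p) + z * y, p) for y in range(1, p))

def R_b(x, b, b_prime, p):
    """R_b(x) = sum_i (1/(x + b_i) - 1/(x + b'_i)) in F_p, as in (A.14)."""
    return sum(pow(x + bi, -1, p) - pow(x + bpi, -1, p)
               for bi, bpi in zip(b, b_prime)) % p

def sum_over_A_b(z, b, b_prime, p):
    """Sum of S(R_b(x), z) over x in A_b, the set defined in (A.13)."""
    A_b = [x for x in range(p) if all((x + c) % p for c in (*b, *b_prime))]
    return sum(kloosterman(R_b(x, b, b_prime, p), z, p) for x in A_b)

p = 31
b, b_prime = (1, 2, 3), (4, 5, 6)  # hypothetical shifts; b' is not a permutation of b

# z = 0: each term is -1 or |A| - 1, so the total is at most 7|A| in absolute value.
assert abs(sum_over_A_b(0, b, b_prime, p)) <= 7 * p
# z != 0: the bound 16|A| of (A.21).
assert all(abs(sum_over_A_b(z, b, b_prime, p)) <= 16 * p for z in range(1, p))
```

The modular inverse `pow(y, -1, p)` requires Python 3.8 or later. For a prime this small the second assertion already follows from the Weil bound applied term by term, so the check is an illustration of the statement and no substitute for the cohomological argument.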

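Corollary A.2. Keeping the same notation, we have if p > 3

$\sum_{h \in A_b} \sum_{\substack{s \in A^\times \\ d(s) \le k + d_A}} \psi\left(\Delta_b(h, s)\right) \ll d_2(M)^5 \left( |M| + q^{\frac{3(r+k)}{4}} \left| \operatorname{lcm}_{\sigma \in S_3} M^b_\sigma \right| \right)$

and, if p = 3, the same bound but with $M^b_\triangle$ also included in the lcm.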

Proof. Using Eq. (A.11), Eq. (A.16), and Proposition A.1 we get a bound of

(A.30) $\ll q^{k + d_A} d_2(M)^4 \sum_{\substack{z \in \mathbb{F}_q[T] \\ d(z) < d(M) - k - d_A}} \left| \operatorname{lcm}_{\sigma \in S_3} \gcd(M^b_\sigma, z) \right|$

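for the left hand side above. Summing over the possible values of the least common multiple, we get

(A.31) $\begin{aligned} &\ll q^{\frac{3(r+k)}{4}} d_2(M)^4 \sum_{L \mid \operatorname{lcm}_{\sigma \in S_3} M^b_\sigma} |L| \sum_{\substack{z \in \mathbb{F}_q[T] \\ d(z) < d(M) - \frac{3(r+k)}{4} \\ L \mid z}} 1 \\ &\ll q^{\frac{3(r+k)}{4}} d_2(M)^4 \sum_{L \mid \operatorname{lcm}_{\sigma \in S_3} M^b_\sigma} |L| \max\left\{ q^{d(M) - \frac{3(r+k)}{4} - d(L)}, \ 1 \right\}. \end{aligned}$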

The contribution of $q^{d(M) - \frac{3(r+k)}{4} - d(L)}$ (respectively, of 1) is the first (respectively, the second) summand of the right hand side in our corollary. □
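The inner count in (A.31) rests on the fact that the polynomials z with d(z) < D that are divisible by L number max{q^{D − d(L)}, 1}, the zero polynomial included. The brute-force sketch below checks this under the simplifying assumptions that q = p is prime and L is monic; polynomials are stored as coefficient sequences from the constant term upward, and the helper names are hypothetical.

```python
from itertools import product

def poly_rem(a, m, p):
    """Remainder of a modulo a monic polynomial m over F_p (coefficients low to high)."""
    a = list(a)
    dm = len(m) - 1
    # Eliminate the leading coefficients of a one at a time, from the top down.
    for i in range(len(a) - 1, dm - 1, -1):
        c = a[i] % p
        if c:
            for j in range(dm + 1):
                a[i - dm + j] = (a[i - dm + j] - c * m[j]) % p
    return [x % p for x in a[:dm]]

def count_multiples(L, D, p):
    """Number of z in F_p[T] with deg z < D and L | z (z = 0 is counted)."""
    if D <= 0:
        return 1  # only the zero polynomial qualifies
    return sum(1 for z in product(range(p), repeat=D)
               if not any(poly_rem(z, L, p)))

def predicted(L, D, p):
    """The closed form max{p^(D - deg L), 1} used in (A.31)."""
    return max(p ** max(D - (len(L) - 1), 0), 1)

L = [1, 1, 1]  # T^2 + T + 1 over F_2
for D in range(0, 7):
    assert count_multiples(L, D, 2) == predicted(L, D, 2)
```

For D at most d(L) the only multiple is z = 0, which is where the value 1 in the maximum comes from.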

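Corollary A.3. Notation unchanged, we have

(A.32) $\sum_{b_1, \dots, b_3' \in \mathcal{M}_{d_B}} \left| \sum_{h \in A_b} \sum_{\substack{s \in A^\times \\ d(s) \le k + d_A}} \psi\left(\Delta_b(h, s)\right) \right| \ll |M| \, d_2(M)^{12} q^{\frac{6}{4}(k+r)}.$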

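Proof. By Corollary A.2 we have a bound of

(A.33) $\ll \sum_{b_1, \dots, b_3' \in \mathcal{M}_{d_B}} d_2(M)^5 \left( |M| + q^{\frac{3(r+k)}{4}} \left| \operatorname{lcm}_{\sigma \in S_3} M^b_\sigma \right| \right).$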


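Summing first over tuples $M_\sigma$ of divisors of M we get

(A.34) $\sum_{\sigma \in S_3} \sum_{M_\sigma \mid M} \sum_{\substack{b_1, \dots, b_3' \in \mathcal{M}_{d_B} \\ \forall \sigma \ M_\sigma = M^b_\sigma}} d_2(M)^5 \left( |M| + q^{\frac{3(r+k)}{4}} \left| \operatorname{lcm}_{\sigma \in S_3} M_\sigma \right| \right).$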

For each such tuple, the conditions

(A.35) $M_\sigma = M^b_\sigma, \quad \sigma \in S_3$

imply the congruences

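(A.36) $b_i \equiv b'_{\sigma(i)} \bmod M_\sigma, \quad 1 \le i \le 3, \ \sigma \in S_3$

which determine $b_1', b_2', b_3' \bmod \operatorname{lcm}_{\sigma \in S_3} M_\sigma$ once $b_1, b_2, b_3$ are chosen. Hence, for each tuple of divisors of M, the number of possible values of b is at most

(A.37) $\max\left\{ q^{6 d_B} \left| \operatorname{lcm}_{\sigma \in S_3} M_\sigma \right|^{-3}, \ q^{3 d_B} \right\}.$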

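If p = 3, we need a slightly more complicated argument. There are at most $M_\triangle^2$ ways to choose the congruence classes of b modulo $M_\triangle$, and then choosing $b_1, b_2, b_3$ arbitrarily now determines $b_1', b_2', b_3' \bmod \operatorname{lcm}_{\sigma \in S_3 \cup \{\triangle\}} M_\sigma$. Because the number of ways to choose b modulo $M_\triangle$ and then choose $b_1, b_2, b_3$ is

(A.38) $\begin{cases} M_\triangle^2 \left( q^{3 d_B} / M_\triangle^3 \right) \le q^{3 d_B} & \text{if } M_\triangle \le q^{d_B} \\ q^{2 d_B} \le q^{3 d_B} & \text{if } M_\triangle > q^{d_B} \end{cases}$

in either case the number of possible values of b is at most

(A.39) $\max\left\{ q^{6 d_B} \left| \operatorname{lcm}_{\sigma \in S_3 \cup \{\triangle\}} M_\sigma \right|^{-3}, \ q^{3 d_B} \right\}.$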

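Setting $\tau = d(\operatorname{lcm}_{\sigma \in S_3} M_\sigma)$ (or adding △ if p = 3), and taking the maximal possible contribution for every tuple of divisors of M, we get the bound

(A.40) $\max_{0 \le \tau \le d(M)} d_2(M)^{|S_3| + 1} \left( q^{6 d_B - 3\tau} + q^{3 d_B} \right) d_2(M)^5 \left( |M| + q^{\frac{3(r+k)}{4} + \tau} \right).$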

Expanding the brackets above, we see that each exponent is a linear function of τ, hence maximized either at τ = 0 or at τ = d(M). Using Eq. (A.6), one observes that the maximal terms $q^{6 d_B + d(M)}$ and $q^{3 d_B + \frac{3(r+k)}{4} + d(M)}$ agree and arrives at the right hand side of Eq. (A.32). □
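The endpoint argument can be replayed with exact rational arithmetic. The sketch below expands the product in (A.40) into its four powers of q and maximizes the exponent over integer τ. It assumes the relation d_B = (r + k)/4, which is what the stated agreement of the two maximal terms requires (Eq. (A.6) itself is not reproduced in this part of the appendix), and it also assumes 3(r + k)/4 ≤ d(M); the function name and sample values are hypothetical.

```python
from fractions import Fraction

def max_exponent(dM, r, k, dB):
    """Largest exponent of q, over 0 <= tau <= d(M), among the four terms of
    (q^(6dB - 3tau) + q^(3dB)) * (q^(d(M)) + q^(3(r+k)/4 + tau))."""
    c = Fraction(3 * (r + k), 4)
    best = None
    for tau in range(dM + 1):
        terms = [
            6 * dB - 3 * tau + dM,       # first bracket term 1, second bracket term 1
            6 * dB - 3 * tau + c + tau,  # term 1 with term 2
            3 * dB + dM,                 # term 2 with term 1
            3 * dB + c + tau,            # term 2 with term 2
        ]
        m = max(terms)
        best = m if best is None else max(best, m)
    return best

dM, r, k = 20, 8, 4
dB = Fraction(r + k, 4)  # assumed relation, see the lead-in
# The maximum equals d(M) + (6/4)(k + r), matching the power of q in (A.32).
assert max_exponent(dM, r, k, dB) == dM + Fraction(6, 4) * (k + r)
```

With these sample values the maximum is attained both by the first term at τ = 0 and by the last term at τ = d(M), which is the agreement the proof refers to.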

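It then follows from Eq. (A.8), Eq. (A.32), and the divisor bound that

$\left| \sum_{\substack{a \in \mathcal{M}_{d_A} \\ (a, M) = 1}} \sum_{f \in \mathcal{M}_k} \sum_{g \in \mathcal{M}_r} \left| \sum_{b \in \mathcal{M}_{d_B}} \psi\left( a f (a^{-1} g + b) \right) \right| \right| \ll q^{\frac{41}{24} r + \frac{21}{24} k + \left( \frac{1}{6} + \epsilon \right) d(M)}$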

and thus (matching the ℓ = 3 case of [FM98, (1.2.3)]) we get

(A.41) $\Sigma^{(I)}_{k,r} \ll q^{\frac{17}{24} r + \frac{7}{8} k + \left( \frac{1}{6} + \epsilon \right) d(M)}$


using Eq. (A.5). Using the fact that $k \le \frac{1}{8} d(M) + \frac{7}{16} d$ and $k + r \le d$ we arrive at the first bound in Eq. (A.4).
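The exponent bookkeeping in this last step can be made explicit. The sketch below inserts the two constraints into the exponent of (A.41) using exact fractions and returns the resulting coefficients of d and d(M), with the ε term left aside. The function name is hypothetical, and since Eq. (A.4) is not reproduced in this part of the appendix, the output is meant to be compared against its first bound and is not a quotation of it.

```python
from fractions import Fraction

def type_one_exponent():
    """Coefficients (of d, of d(M)) after inserting r <= d - k and
    k <= d(M)/8 + 7d/16 into the exponent 17/24 r + 7/8 k + 1/6 d(M)."""
    coeff_r, coeff_k, coeff_dM = Fraction(17, 24), Fraction(7, 8), Fraction(1, 6)
    # r <= d - k gives 17/24 r + 7/8 k <= 17/24 d + (7/8 - 17/24) k.
    slack_k = coeff_k - coeff_r
    assert slack_k > 0  # so the worst case is k as large as allowed
    # k <= 1/8 d(M) + 7/16 d
    coeff_d = coeff_r + slack_k * Fraction(7, 16)
    coeff_dM_total = coeff_dM + slack_k * Fraction(1, 8)
    return coeff_d, coeff_dM_total

print(type_one_exponent())  # (Fraction(25, 32), Fraction(3, 16))
```

So the exponent of q is at most (25/32) d + (3/16 + ε) d(M) under the two stated constraints.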

A.2. Sums of type II. Keeping the same notation, we follow the proof of [FKM14, Theorem 1.17]. Applying Cauchy’s inequality and Pólya-Vinogradov completion as in [FKM14, Section 3], we get

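(A.42) $\begin{aligned} \left| \Sigma^{(II)}_{k,r} \right|^2 &\le q^k \sum_{g_1, g_2 \in \mathcal{M}_r} \delta_{g_1} \delta_{g_2} \sum_{\substack{f \in \mathcal{M}_k \\ (f g_1, M) = (f g_2, M) = 1}} \psi\left( f^{-1} g_1 - f^{-1} g_2 \right) \\ &\le q^k \sum_{g_1, g_2 \in \mathcal{M}_r} q^{k - d(M)} \sum_{\substack{h \in \mathbb{F}_q[T] \\ d(h) < d(M) - k}} C(g_1 - g_2, h), \end{aligned}$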

where

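(A.43) $C(g, h) = \sum_{z \in A^\times} \psi\left( g z^{-1} \right) e_p\left( \operatorname{Tr}^{A}_{\mathbb{F}_p}(h z) \right), \quad g, h \in \mathbb{F}_q[T].$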

We have the following analog of [FKM14, Proposition 3.1].

Proposition A.4. For g, h ∈ Fq[T ] and $M_{g,h} = \gcd(M, g, h)$ we have

(A.44) $|C(g, h)| \le d_2(M) \sqrt{|M|} \sqrt{|M_{g,h}|}.$

Proof. Since both C(g, h) and our putative bound are multiplicative in M, it suffices to show that for a prime P we have

(A.45) $|C(g, h)| \le 2\sqrt{|P|}$

unless g ≡ h ≡ 0 mod P . To demonstrate that, take f ∈ Fq[T ]/(P ) with

(A.46) $\psi(z) = e_p\left( \operatorname{Tr}^{\mathbb{F}_q[T]/(P)}_{\mathbb{F}_p}(f z) \right)$

and note that

(A.47) $C(g, h) = \sum_{z \in (\mathbb{F}_q[T]/(P))^\times} e_p\left( \operatorname{Tr}^{\mathbb{F}_q[T]/(P)}_{\mathbb{F}_p}\left( g f z^{-1} + h z \right) \right).$

We have a Weil bound of $2\sqrt{|P|}$ for this exponential sum unless the rational function $g f z^{-1} + h z$ is an Artin-Schreier polynomial, which can only happen if it is constant, as all its poles have order at most 1. The latter happens only if g = h = 0, as desired. □
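For a prime modulus the bound of Proposition A.4 can be checked exhaustively in a toy case. The sketch below assumes q = p prime and P of degree 1, so that F_q[T]/(P) = F_p and the trace is the identity, and it takes f = 1 in (A.46); these choices and the function names are illustrative assumptions. For prime M one has d_2(M) = 2, and |M_{g,h}| equals |M| when g ≡ h ≡ 0 and 1 otherwise.

```python
import cmath
from math import sqrt

def e_p(t, p):
    """The character t -> exp(2*pi*i*t/p) on F_p."""
    return cmath.exp(2j * cmath.pi * (t % p) / p)

def C(g, h, p):
    """C(g, h) = sum over z in F_p^* of e_p(g*z^{-1} + h*z),
    i.e. (A.47) with f = 1 and trivial trace."""
    return sum(e_p(g * pow(z, -1, p) + h * z, p) for z in range(1, p))

p = 37
for g in range(p):
    for h in range(p):
        gcd_norm = p if (g == 0 and h == 0) else 1  # |M_{g,h}| for prime M
        bound = 2 * sqrt(p) * sqrt(gcd_norm)        # d_2(M) sqrt|M| sqrt|M_{g,h}|
        assert abs(C(g, h, p)) <= bound + 1e-9
```

The degenerate cases behave as the proof suggests: C(0, 0) = p − 1, while C(g, 0) and C(0, h) equal −1 when the nonzero argument is a unit.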

By Eq. (A.42) we have

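(A.48) $\left| \Sigma^{(II)}_{k,r} \right|^2 \le q^{2k - d(M)} \sum_{L \mid M} \sum_{\substack{h \in \mathbb{F}_q[T] \\ d(h) < d(M) - k}} \sum_{\substack{g_1, g_2 \in \mathcal{M}_r \\ M_{g_1 - g_2, h} = L}} C(g_1 - g_2, h)$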


so from Proposition A.4 and the divisor bound, this is at most

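(A.49) $q^{2k - \frac{d(M)}{2} + \epsilon d(M)} \sum_{L \mid M} q^{\frac{d(L)}{2}} \sum_{\substack{h \in \mathbb{F}_q[T] \\ d(h) < d(M) - k}} \sum_{\substack{g_1, g_2 \in \mathcal{M}_r \\ M_{g_1 - g_2, h} = L}} 1.$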

Since Mg1−g2,h = L implies that g2 ≡ g1, h ≡ 0 mod L, the above is at most

(A.50) $q^{2k - \frac{d(M)}{2} + \epsilon d(M)} \sum_{L \mid M} q^{\frac{d(L)}{2} + r + \max(r - d(L), 0) + \max(d(M) - k - d(L), 0)}$

so setting ξ = d(L) and applying the divisor bound once again, we arrive at

(A.51) $\max_{0 \le \xi \le d(M)} q^{2k + \frac{\xi}{2} + r + \max\{r - \xi, 0\} + \max\{d(M) - k - \xi, 0\} - \frac{d(M)}{2} + \epsilon d(M)}.$

As a function of ξ, the exponent above is convex, so its values do not exceed those at ξ = 0 and ξ = d(M), which are

(A.52) $q^{\frac{d(M)}{2} + \epsilon d(M) + 2r + k}, \quad q^{r + 2k + \epsilon d(M)}$

in view of our assumption that r ≤ d ≤ d(M). Taking a square root and using the fact that k + r ≤ d we get the second bound in (A.4).
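The convexity step can be verified mechanically. The sketch below evaluates the exponent in (A.51), with the ε d(M) term dropped, at every integer ξ and confirms that the maximum sits at an endpoint and that the endpoint values are the two exponents in (A.52). It assumes r ≤ d(M) and k ≤ d(M), as implied by k + r ≤ d ≤ d(M); the function names and sample values are hypothetical.

```python
from fractions import Fraction

def exponent(xi, dM, r, k):
    """Exponent of q in (A.51) at xi = d(L), without the epsilon*d(M) term."""
    return (2 * k + Fraction(xi, 2) + r + max(r - xi, 0)
            + max(dM - k - xi, 0) - Fraction(dM, 2))

def endpoint_check(dM, r, k):
    """Return the exponents at xi = 0 and xi = d(M) after checking (A.52)."""
    values = [exponent(xi, dM, r, k) for xi in range(dM + 1)]
    at_zero, at_dM = values[0], values[-1]
    assert max(values) == max(at_zero, at_dM)      # convex: maximum at an endpoint
    assert at_zero == Fraction(dM, 2) + 2 * r + k  # first exponent in (A.52)
    assert at_dM == r + 2 * k                      # second exponent in (A.52)
    return at_zero, at_dM

print(endpoint_check(20, 6, 5))  # (Fraction(27, 1), Fraction(16, 1))
```

Each of the three ξ-dependent summands is convex (one linear, two of the form max{c − ξ, 0}), which is why checking the two endpoints suffices.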

Remark A.5. There are (at least) two other potential approaches for bounding our type II sums. The first is to follow the proof of [FM98, Proposition 1.3] that uses Bombieri’s bound on complete exponential sums [FM98, Lemma 4.3]. One then argues as in [FM98, Section 5]. The second is to follow the proof of [FM98, Theorem 1.4] given in [FM98, Section 7].

References

[BBSR15] E. Bank, L. Bary-Soroker, L. Rosenzweig, Prime polynomials in short intervals and in arithmetic progressions, Duke Math. J. 164.2 (2015): 277-295.

[BGP92] E. Bombieri, A. Granville, J. Pintz, Squares in arithmetic progressions, Duke Math. J. 66.3 (1992): 369-385.

[BZ02] E. Bombieri, U. Zannier, A Note on squares in arithmetic progressions, II, Atti della Accademia Nazionale dei Lincei. Classe di Scienze Fisiche, Matematiche e Naturali. Rendiconti Lincei. Matematica e Applicazioni 13.2 (2002): 69-75.

[Bur63] D. A. Burgess (1963), On Character Sums and L-Series II, Proc. Lond. Math. Soc. 3(1), 524-536.

[Ca15] D. Carmon, The autocorrelation of the Möbius function and Chowla’s conjecture for the rational function field in characteristic 2, Phil. Trans. R. Soc. A 2015.

[CaRu14] D. Carmon, Z. Rudnick, The autocorrelation of the Möbius function and Chowla’s conjecture for the rational function field, Q. J. Math. (2014) 65 (1): 53–61.

[CHLPT15] A. Castillo, C. Hall, R. Lemke Oliver, P. Pollack, L. Thompson, (2015) Bounded gaps between primes in number fields and function fields, Proc. Amer. Math. Soc. 143(7), 2841-2856.

[CG07] J. Cilleruelo, A. Granville, Lattice points on circles, squares in arithmetic progressions and sumsets of squares, Additive combinatorics, 43, 241-262, Amer. Math. Soc. Providence, RI, 2007.

[CCG08] B. Conrad, K. Conrad, and R. Gross (2008) Prime specialization in genus 0, Transactions of the AMS, 360(6), 2867-2908.


[Con05] K. Conrad (2005), Irreducible values of polynomials: a non-analogy, Number Fields and Function Fields: Two Parallel Worlds, 71-85, Progress in Mathematics, 239, Birkhäuser, Basel.

[FM98] E. Fouvry, P. Michel, Sur certaines sommes d’exponentielles sur les nombres premiers, Ann. scient. Éc. Norm. Sup., 4e série, t. 31, 1998, 93-130.

[FKM14] E. Fouvry, E. Kowalski, P. Michel (2014), Algebraic trace functions over the primes, Duke Math. J. 163(9), 1683-1736.

[Gal72] P. X. Gallagher (1972), Primes in progressions to prime-power modulus, Invent. math. 16(3), 191-201.

[GY03] D. Goldston, C. Yıldırım, Higher correlations of divisor sums related to primes I: Triple correlations, Integers 3 (2003) A5.

[GS18] O. Gorodetsky, W. Sawin, Correlation of Arithmetic Functions over Fq[T ], arXiv preprint, 2018.

[Hoo91] C. Hooley, On the number of points on a complete intersection over a finite field, J. of Number Theory, 38.3 (1991), 338-358.

[Hsu99] C. N. Hsu (1999), Estimates for Coefficients of L-Functions for Function Fields, Fin. Fields App. 5(1), 76-88.

[Ill94] L. Illusie (1994), Autour du théorème de monodromie locale, Astérisque 223, 9-57.

[IK04] H. Iwaniec, E. Kowalski, Analytic number theory, Vol. 53. Amer. Math. Soc. 2004.

[Jan07] S. Janson (2007), Resultant and discriminant of polynomials, Notes.

[Kat89] N. Katz (1989), An Estimate for Character Sums, JAMS, 2, 2, 197-200.

[Ma15] J. Maynard (2015), Small gaps between primes, Ann. Math. 181, 1-31.

[MR16] K. Matomäki, M. Radziwiłł (2016), Multiplicative functions in short intervals, Ann. Math 1015-1056.

[PM14] Polymath, D. H. J. (2014) Variants of the Selberg sieve, and bounded intervals containing many primes, Res. Math. sci. 1(1), 12.

[Ros13] M. Rosen (2013), Number theory in function fields (Vol. 210). Springer Science & Business Media.

[Sa18] W. Sawin, Square-root cancellation for sums of factorization functions over short intervals in function fields, arXiv preprint, 2018.

[SGA7-II] P. Deligne, N. Katz, eds. Séminaire de Géométrie Algébrique du Bois Marie - 1967-69 - Groupes de monodromie en géométrie algébrique - (SGA 7) - vol. 2, Lecture Notes in Mathematics (in French), Vol. 340. Springer-Verlag.

[Tao16] T. Tao, The logarithmically averaged Chowla and Elliott conjectures for two-point correlations, Forum of Mathematics, Pi, Vol 4, 2016, e8, Cambridge University Press.

[TT17] T. Tao, J. Teräväinen, The structure of logarithmically averaged correlations of multiplicative functions, with applications to the Chowla and Elliott conjectures, arXiv preprint, 2017.

Department of Mathematics, Columbia University, New York, NY 10027, USA

E-mail address: [email protected]

Department of Mathematics, University of Wisconsin-Madison, 480 Lincoln Drive, Madison, WI 53706, USA

E-mail address: [email protected]
