Information Processing Letters 109 (2008) 133–135
A note on the distribution of the number of prime factors of the integers ✩
Aravind Srinivasan 1
Department of Computer Science and Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742, USA
Article info

Article history: Received 31 March 2008. Received in revised form 3 September 2008. Available online 11 September 2008. Communicated by C. Scheideler.

Keywords: Chernoff bounds; Probabilistic number theory; Primes; Tail bounds; Dependent random variables; Randomized algorithms

Abstract

The Chernoff–Hoeffding bounds are fundamental probabilistic tools. An elementary approach is presented to obtain a Chernoff-type upper-tail bound for the number of prime factors of a random integer in $\{1, 2, \ldots, n\}$. The method illustrates tail bounds in negatively-correlated settings.
© 2008 Elsevier B.V. All rights reserved.
1. Introduction
Large-deviation bounds such as the Chernoff–Hoeffding bounds are of much use in randomized algorithms and probabilistic analysis. Hence, it is valuable to understand how such bounds can be extended to situations where the conditions of their standard versions do not hold; the most important such condition is independence. Here, we show one such extension, to the classical problem of the distribution of the number of prime factors of integers.
For any positive integer $N$, let $\nu(N)$ denote the number of prime factors of $N$, ignoring multiplicities. Let $\ln x$
✩ An earlier version of this work appeared in the Proc. Hawaii International Conference on Statistics, Mathematics, and Related Fields, 2005.
E-mail address: [email protected]. URL: http://www.cs.umd.edu/~srin.
1 This research was done in parts at: (i) Cornell University, Ithaca, NY (supported in part by an IBM Graduate Fellowship); (ii) Institute for Advanced Study, Princeton, NJ (supported in part by grant 93-6-6 of the Alfred P. Sloan Foundation); (iii) DIMACS Center, Rutgers University, Piscataway, NJ (supported in part by NSF-STC91-19999 and by support from the N.J. Commission on Science and Technology); and (iv) the University of Maryland (supported in part by NSF Award CCR-0208005).
0020-0190/$ – see front matter © 2008 Elsevier B.V. All rights reserved. doi:10.1016/j.ipl.2008.09.010
denote the natural logarithm of $x$, as usual. It is known that the average value of $\nu(i)$, for $i \in [n] = \{1, 2, \ldots, n\}$, is $\mu_n \doteq \ln\ln n + O(1)$, for sufficiently large $n$. See also the discussion in Alon and Spencer [1]. We are interested in seeing whether there is a "significant" fraction of integers $i \in [n]$ for which $\nu(i)$ deviates "largely" from $\mu_n$.
Formally, Hardy and Ramanujan [4] showed that for any function $\omega(n)$ with $\lim_{n \to \infty} \omega(n) = \infty$,
$$\frac{\bigl|\{i \in [n]:\ \nu(i) \geq \ln\ln n + \omega(n)\sqrt{\ln\ln n}\}\bigr|}{n} = o(1), \qquad (1)$$
where the "o(1)" term goes to zero as $n$ increases. Their proof was fairly complicated. Turán [9] gave a very elegant and short proof of this result; his proof is as follows. Let $E[Z]$ and $\operatorname{Var}[Z]$ denote the expected value and variance of a random variable $Z$, respectively. Define $P_n$ to be the set of primes in $[n]$. For a randomly picked $x \in [n]$, define, for every prime $p \in P_n$, $X_p$ to be 1 if $p$ divides $x$, and 0 otherwise. Clearly,
$$\nu(x) = \sum_{p \in P_n} X_p.$$
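As an aside (an illustration of ours, not part of the original note), this decomposition is easy to check numerically: the sketch below sieves $\nu(i)$ for all $i \leq n$ and compares the empirical mean with the Mertens-type estimate $\ln\ln n + 0.2615\ldots$ discussed in this section.

```python
import math

def omega_sieve(n):
    """omega[i] = nu(i), the number of distinct prime factors of i:
    for each prime p, add 1 to every multiple of p."""
    omega = [0] * (n + 1)
    for p in range(2, n + 1):
        if omega[p] == 0:  # p has no smaller prime factor, so p is prime
            for m in range(p, n + 1, p):
                omega[m] += 1
    return omega

n = 100_000
omega = omega_sieve(n)
empirical_mean = sum(omega[1:]) / n          # E[nu(x)] for x uniform in [n]
estimate = math.log(math.log(n)) + 0.2615    # ln ln n + Mertens' constant
print(f"empirical mean = {empirical_mean:.3f}, estimate = {estimate:.3f}")
```

The agreement is within a few percent already at $n = 10^5$, consistent with the $\pm O(\ln^{-2} n)$ error term quoted later in this section.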
Hence,
$$\mu_n = E\bigl[\nu(x)\bigr] = \sum_{p \in P_n} E[X_p] = \sum_{p \in P_n} \frac{\lfloor n/p \rfloor}{n},$$
and thus
$$\mu_n \sim \mu'_n \doteq \sum_{p \in P_n} \frac{1}{p} = \ln\ln n + 0.261\ldots \pm O\bigl(\ln^{-2} n\bigr),$$
where the last equality follows from Mertens' theorem (see, for instance, Rosser and Schoenfeld [7]). By Chebyshev's inequality,
$$\Pr\bigl(|\nu(x) - \mu_n| \geq \lambda\bigr) \leq \frac{\operatorname{Var}[\nu(x)]}{\lambda^2},$$
and by obtaining good upper bounds on the variances $\operatorname{Var}[X_p]$ and the covariances $\operatorname{Cov}[X_p, X_q]$, Turán obtains his result that
$$\Pr\bigl(|\nu(x) - \mu_n| \geq \lambda\bigr) \leq O\!\left(\frac{\ln\ln n}{\lambda^2}\right), \qquad (2)$$
which, in particular, implies (1). Erdős and Kac [3] show that as $n \to \infty$, the tail of $\nu(x)$ (and of any function from a fairly broad class of functions of $x$) approaches that of the corresponding normal distribution; i.e., if $\omega$ is real and $K_n = |\{i \in [n]:\ \nu(i) \geq \ln\ln n + \omega\sqrt{2\ln\ln n}\}|$, then
$$\lim_{n \to \infty} \frac{K_n}{n} = \pi^{-1/2} \int_{-\infty}^{\omega} e^{-u^2}\, du. \qquad (3)$$
We strengthen the "upper-tail" part of (2) by showing that for any $n$ and any parameter $\delta > 0$,
$$\Pr\bigl(\nu(x) \geq \mu_n(1+\delta)\bigr) \leq \left(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right)^{\mu'_n}.$$
In contrast with (3), we get a bound for every $n$; thus, for instance, we get a concrete bound for deviations that are an order of magnitude larger than the standard deviation. We point out that strong upper- and lower-tail bounds are known using non-probabilistic methods [6]. The goal of this note is to show that a simple probabilistic approach suffices to derive exponential upper-tail bounds here. We also hope that the method and result may be of pedagogic use in showing the strength of probabilistic methods, and in the study of tail bounds for (negatively) correlated random variables.
2. Large deviation bounds
We first quickly review some salient features of the work of Schmidt et al. [8].
2.1. Chernoff–Hoeffding type bounds in non-independent scenarios
The basic idea used in the Chernoff–Hoeffding (henceforth CH) bounds is as follows [2,5]. Given $n$ random variables (henceforth "r.v."s) $X_1, X_2, \ldots, X_n$, we want to upper-bound the upper-tail probability $\Pr(X \geq a)$, where $X \doteq \sum_{i=1}^{n} X_i$, $\mu \doteq E[X]$, $a = \mu(1+\delta)$, and $\delta > 0$. For any fixed $t > 0$,
$$\Pr(X \geq a) = \Pr\bigl(e^{tX} \geq e^{at}\bigr) \leq \frac{E[e^{tX}]}{e^{at}};$$
by computing an upper bound $u(t)$ on $E[e^{tX}]$ and minimizing $u(t)/e^{at}$ over $t > 0$, we can upper-bound $\Pr(X \geq a)$. Suppose $X_i \in \{0,1\}$ for each $i$, a commonly occurring case. In this case, a commonly used such bound is
$$\Pr\bigl(X \geq \mu(1+\delta)\bigr) \leq F(\mu,\delta) \doteq \left(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right)^{\mu} \qquad (4)$$
(see, for example, [1]).

One basic idea of [8] when $X_i \in \{0,1\}$ is as follows. Suppose we define, for $z = (z_1, z_2, \ldots, z_n) \in \mathbb{R}^n$, a family of functions $S_j(z)$, $j = 0, 1, \ldots, n$, where $S_0(z) \equiv 1$ and, for $1 \leq j \leq n$,
$$S_j(z) \doteq \sum_{1 \leq i_1 < i_2 < \cdots < i_j \leq n} z_{i_1} z_{i_2} \cdots z_{i_j}.$$
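These $S_j$ are the elementary symmetric polynomials in $z_1, \ldots, z_n$. As an illustration of ours (not from [8]), all of them can be computed in $O(n^2)$ time by the standard one-pass product recurrence; the sketch below checks this against the displayed definition on a small example.

```python
from itertools import combinations
from math import prod

def elementary_symmetric(z):
    """Return [S_0(z), ..., S_n(z)]: the coefficients of prod_i (1 + z_i * y),
    built by multiplying in one factor (1 + z_i * y) at a time."""
    S = [0.0] * (len(z) + 1)
    S[0] = 1.0
    for zi in z:
        for j in range(len(z), 0, -1):  # descend so each z_i enters a monomial once
            S[j] += zi * S[j - 1]
    return S

def S_by_definition(z, j):
    """Direct sum over all j-element index subsets, as in the definition."""
    return sum(prod(c) for c in combinations(z, j))

z = [0.5, 1.0, 2.0, 3.0]
S = elementary_symmetric(z)
for j in range(len(z) + 1):
    assert abs(S[j] - S_by_definition(z, j)) < 1e-12
print(S)
```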
Then, for any $t > 0$, there exist non-negative reals $a_0, a_1, \ldots, a_n$ such that $e^{tX} \equiv \sum_{i=0}^{n} a_i S_i(X_1, X_2, \ldots, X_n)$. So, we may consider functions of the form
$$\sum_{i=0}^{n} y_i S_i(X_1, X_2, \ldots, X_n),$$
where $y_0, y_1, \ldots, y_n \geq 0$, instead of restricting ourselves to those of the form $e^{tX}$ for some $t > 0$. For any $y = (y_0, y_1, \ldots, y_n) \in \mathbb{R}^{n+1}_{+}$, define $f_y(X_1, X_2, \ldots, X_n) \doteq \sum_{i=0}^{n} y_i S_i(X_1, X_2, \ldots, X_n)$. Then, it is easy to see that
$$\Pr(X \geq a) = \Pr\!\left(f_y(X_1, \ldots, X_n) \geq \sum_{i=0}^{a} y_i \binom{a}{i}\right) \leq \frac{E[f_y(X_1, \ldots, X_n)]}{\sum_{i=0}^{a} y_i \binom{a}{i}}.$$
So, the goal now is to minimize this upper bound over $(y_0, y_1, \ldots, y_n) \in \mathbb{R}^{n+1}_{+}$. Assuming that the $X_i$'s are independent, it is shown in [8] that the optimum for the upper tail occurs roughly when $y_i = 1$ if $i = \lceil \mu\delta \rceil$, and $y_i = 0$ otherwise. We can summarize this discussion by:
Theorem 2.1. (See [8].) Let bits $X_1, X_2, \ldots, X_n$ be random with $X = \sum_i X_i$, and let $\mu = E[X]$, $k = \lceil \mu\delta \rceil$. Then for any $\delta > 0$,
$$\Pr\bigl(X \geq \mu(1+\delta)\bigr) \leq \frac{E[S_k(X_1, X_2, \ldots, X_n)]}{\binom{\mu(1+\delta)}{k}}.$$
If the $X_i$'s are independent, then this is at most
$$\left(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right)^{\mu}.$$
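A small numerical sketch of the independent case (with parameters of our own choosing, not from [8]): for independent bits with $\Pr(X_i = 1) = p_i$, each monomial in $S_k$ factorizes, so $E[S_k(X_1, \ldots, X_n)] = S_k(p_1, \ldots, p_n)$, and Theorem 2.1's bound can be evaluated directly and compared with $F(\mu, \delta)$. Note that $\binom{\mu(1+\delta)}{k}$ needs a generalized binomial coefficient, since $\mu(1+\delta)$ is usually not an integer.

```python
import math

def elementary_symmetric(p):
    """S_j(p) for all j = 0..n, via the one-pass product recurrence."""
    S = [0.0] * (len(p) + 1)
    S[0] = 1.0
    for pi in p:
        for j in range(len(p), 0, -1):
            S[j] += pi * S[j - 1]
    return S

def gen_binom(a, k):
    """Generalized binomial coefficient a(a-1)...(a-k+1)/k! for real a."""
    num = 1.0
    for i in range(k):
        num *= a - i
    return num / math.factorial(k)

# illustrative choice: success probabilities 1/p for the first ten primes
p = [1 / q for q in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29)]
mu = sum(p)
delta = 1.0
k = math.ceil(mu * delta)
S = elementary_symmetric(p)                      # E[S_j] under independence
schmidt_bound = S[k] / gen_binom(mu * (1 + delta), k)
F = (math.e**delta / (1 + delta) ** (1 + delta)) ** mu
print(schmidt_bound, F)
```

For independent bits, Theorem 2.1 guarantees `schmidt_bound <= F`; in this example the symmetric-polynomial bound is in fact noticeably sharper.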
2.2. Tail bounds for ν(x)
Returning to our original scenario, let $n$ be our given integer. For a randomly picked $x \in [n]$, let $X_p$ be 1 if $p$ divides $x$, and 0 otherwise. As stated earlier,
$$\nu(x) = \sum_{p \in P_n} X_p.$$
Let $\{\hat{X}_p \mid p \in P_n\}$ be a set of independent binary random variables with $\Pr(\hat{X}_p = 1) = 1/p$. For any $r$ and any set of primes $p_{i_1}, p_{i_2}, \ldots, p_{i_r}$, note that
$$E\!\left[\prod_{j=1}^{r} X_{p_{i_j}}\right] = \Pr\!\left(\bigwedge_{j=1}^{r} (p_{i_j} \mid x)\right) = \frac{\bigl\lfloor n / \prod_{j=1}^{r} p_{i_j} \bigr\rfloor}{n} \leq \frac{1}{\prod_{j=1}^{r} p_{i_j}} = E\!\left[\prod_{j=1}^{r} \hat{X}_{p_{i_j}}\right]. \qquad (5)$$
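The inequality in (5) is elementary ($\lfloor y \rfloor \leq y$), and can be checked exhaustively for small prime subsets; a quick sketch of ours (an illustration, not part of the note):

```python
from itertools import combinations
from math import prod

def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    return m >= 2 and all(m % d for d in range(2, int(m**0.5) + 1))

n = 10_000
primes = [p for p in range(2, 50) if is_prime(p)]

# (5): Pr[every p in the subset divides x] = floor(n / prod) / n <= prod(1/p),
# the right side being the same probability for the independent copies.
for r in (1, 2, 3):
    for subset in combinations(primes, r):
        P = prod(subset)
        lhs = (n // P) / n
        rhs = 1 / P
        assert lhs <= rhs + 1e-15
print("negative-correlation inequality (5) holds for all tested prime subsets")
```

Equality holds exactly when $\prod_j p_{i_j}$ divides $n$; otherwise the floor makes the left side strictly smaller, which is the negative correlation being exploited.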
Thus we get:

Theorem 2.2. For any $n \geq 2$ and for any $\delta > 0$,
$$\Pr\bigl(\nu(x) \geq \mu_n(1+\delta)\bigr) \leq \left(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right)^{\mu'_n},$$
just by invoking Theorem 2.1.
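To illustrate (again as a sketch of ours, not from the note), Theorem 2.2 can be compared against the truth for a concrete $n$: the code below sieves $\nu(x)$ for all $x \leq n$, computes the exact $\mu_n$ and $\mu'_n$, and evaluates both sides of the bound for one illustrative $\delta$.

```python
import math

n = 100_000

# nu[x] = number of distinct prime factors of x, via a sieve
nu = [0] * (n + 1)
primes = []
for p in range(2, n + 1):
    if nu[p] == 0:  # p is prime
        primes.append(p)
        for m in range(p, n + 1, p):
            nu[m] += 1

mu_n = sum(nu[1:]) / n                     # exact E[nu(x)], x uniform in [n]
mu_prime = sum(1 / p for p in primes)      # mu'_n = sum of 1/p over primes <= n

delta = 0.5
threshold = mu_n * (1 + delta)
empirical = sum(1 for x in range(1, n + 1) if nu[x] >= threshold) / n
bound = (math.e**delta / (1 + delta) ** (1 + delta)) ** mu_prime
print(f"Pr[nu(x) >= {threshold:.2f}] = {empirical:.4f} <= bound {bound:.4f}")
```

As expected for a Chernoff-type bound at moderate deviations, the bound is valid but not tight; it becomes most informative for deviations well beyond the $\sqrt{\ln\ln n}$ scale of (2) and (3).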
3. Variants
Why does our approach not work directly for the lower tail of $\nu(x)$ as well? The reason is that a direct negative-correlation result analogous to (5) does not appear to hold. It would be interesting to see whether good lower-tail bounds can also be obtained by a short proof; as in [3,9], it may be possible to make quantitative use of the fact that the $\{X_p\}$ are all "almost independent". It is known that counting the prime divisors including multiplicity changes the function a little [6], and it would be worth considering short (probabilistic) proofs for the tail behavior of this function as well. More generally, can we concretely exploit the "near-independence" properties of additive number-theoretic functions [3]?
Acknowledgements
I thank Noga Alon, Eric Bach, Carl Pomerance, Christian Scheideler and Joel Spencer for valuable discussions and suggestions.
References
[1] N. Alon, J. Spencer, The Probabilistic Method, second ed., John Wiley & Sons, Inc., 2000.
[2] H. Chernoff, A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations, Annals of Mathematical Statistics 23 (1952) 493–509.
[3] P. Erdős, M. Kac, The Gaussian law of errors in the theory of additive number theoretic functions, American Journal of Mathematics 62 (1940) 738–742.
[4] G.H. Hardy, S. Ramanujan, The normal number of prime factors of a number n, Quarterly Journal of Mathematics 48 (1917) 76–92.
[5] W. Hoeffding, Probability inequalities for sums of bounded random variables, Journal of the American Statistical Association 58 (1963) 13–30.
[6] C. Pomerance, On the distribution of round numbers, in: K. Alladi (Ed.), Number Theory Proceedings, Ootacamund, India, 1984, in: Lecture Notes in Mathematics, vol. 1122, 1985, pp. 173–200.
[7] J.B. Rosser, L. Schoenfeld, Approximate formulas for some functions of prime numbers, Illinois Journal of Mathematics 6 (1962) 64–94.
[8] J.P. Schmidt, A. Siegel, A. Srinivasan, Chernoff–Hoeffding bounds for applications with limited independence, SIAM Journal on Discrete Mathematics 8 (1995) 223–250.
[9] P. Turán, On a theorem of Hardy and Ramanujan, Journal of the London Mathematical Society 9 (1934) 274–276.