
Noname manuscript No. (will be inserted by the editor)

A Randomized Sieving Algorithm for Approximate Integer Programming

Daniel Dadush

the date of receipt and acceptance should be inserted later

Abstract The Integer Programming Problem (IP) for a polytope P ⊆ R^n is to find an integer point in P or decide that P is integer free. We give a randomized algorithm for an approximate version of this problem, which correctly decides whether P contains an integer point or whether a (1 + ε)-scaling of P about its center of gravity is integer free in 2^{O(n)}(1/ε^2)^n-time and 2^{O(n)}(1/ε)^n-space with overwhelming probability. Our algorithm proceeds by reducing the approximate IP problem to an approximate Closest Vector Problem (CVP) under a "near-symmetric" norm. Our main technical contribution is an extension of the AKS randomized sieving technique, first developed by Ajtai, Kumar and Sivakumar (STOC 2001) for lattice problems under the ℓ_2 norm, to the setting of asymmetric norms. We also present an application of our techniques to exact IP, where we give a nearly optimal algorithmic implementation of the Flatness Theorem, a central ingredient for many IP algorithms. Our results also extend to general convex bodies and lattices.

Keywords Integer Programming · Lattice Problems · Shortest Vector Problem · Closest Vector Problem

1 Introduction

The Integer Programming (IP) Problem, i.e. the problem of deciding whether a polytope contains an integer point, is a classic problem in Operations Research and Computer Science. Algorithms for IP were first developed in the 1950s when Gomory [1] gave a finite cutting plane algorithm to solve general (Mixed)-Integer Programs. However, the first algorithms with complexity guarantees (i.e. better than finiteness) came much later. The first such algorithm was the breakthrough result of Lenstra [2], which gave the first fixed dimension

Daniel Dadush, Computer Science Department, New York University, 251 Mercer Street, New York, NY 10012, USA. E-mail: [email protected]


polynomial time algorithm for IP. Lenstra's approach revolved around finding "flat" integer directions of a polytope, and achieved a leading complexity term of 2^{O(n^3)}, where n is the number of variables. Lenstra's approach was generalized and substantially improved upon by Kannan [3], who used an entire short lattice basis to yield an O(n^{2.5})^n-time and poly(n)-space algorithm. In [4], Hildebrand and Köppe use strong ellipsoidal rounding and a recent solver for the Shortest Vector Problem (SVP) under the ℓ_2 norm [5] to give a 2^{O(n)} n^{2n}-time and 2^{O(n)}-space algorithm for IP. Lastly, Dadush et al. [6] use a solver for SVP under general norms to give a 2^{O(n)}(n^{4/3} polylog(n))^n-expected time and 2^{O(n)}-space algorithm. Following the works of Lenstra and Kannan, fixed dimension polynomial algorithms were discovered for many related problems such as counting the number of integer points in a rational polyhedron [7], parametric integer programming [8,9], and integer optimization over quasi-convex polynomials [10,4]. However, over the last twenty years the known algorithmic complexity of IP has only modestly decreased. A central open problem in the area therefore remains the following [5,4,6]:

Problem 1 Does there exist a 2^{O(n)}-time algorithm for Integer Programming?

In this paper, we show that if one is willing to accept an approximate notion of containment, then the answer to the above question is affirmative. More precisely, we give a randomized algorithm which can correctly distinguish whether a polytope P contains an integer point or if a (1 + ε)-dilation of P about its center of gravity contains no integer points in 2^{O(n)}(1/ε^2)^n-time and 2^{O(n)}(1/ε)^n-space with overwhelming probability. Our results naturally extend to the setting of general convex bodies and lattices, where the IP problem in this context is to decide for a convex body K and lattice L in R^n whether K ∩ L ≠ ∅. To obtain the approximate IP result, we reduce the problem to a (1 + ε)-approximate Closest Vector Problem (CVP) under a "near-symmetric" norm.

Given an n-dimensional lattice L ⊆ R^n (integer combinations of a basis b_1, . . . , b_n ∈ R^n), the SVP is to find min_{y ∈ L\{0}} ‖y‖, and given x ∈ R^n the CVP is to find min_{y ∈ L} ‖y − x‖, where ‖·‖ is a given (asymmetric) norm. An asymmetric norm ‖·‖ satisfies all the standard norm properties except symmetry, i.e. we allow ‖x‖ ≠ ‖−x‖.
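To make the asymmetric setting concrete, here is a toy brute-force CVP under a gauge norm. This is purely illustrative and exponential-time (the paper's actual algorithm is the sieving method of Section 3); the shifted-box norm ball and the enumeration radius are assumptions of the example, not anything from the paper.

```python
import itertools
import numpy as np

def gauge(x, A, b):
    """||x||_C for C = {y : Ay <= b} with 0 in int(C) (so b > 0):
    ||x||_C = max(0, max_i <a_i, x> / b_i)."""
    return max(0.0, float(np.max(A @ x / b)))

def brute_force_cvp(B, x, A, b, R=3):
    """Minimize ||y - x||_C over y in L(B), enumerating coefficient
    vectors in {-R,...,R}^n (illustrative only)."""
    n = B.shape[1]
    best_d, best_y = float("inf"), None
    for c in itertools.product(range(-R, R + 1), repeat=n):
        y = B @ np.array(c, dtype=float)
        d = gauge(y - x, A, b)
        if d < best_d:
            best_d, best_y = d, y
    return best_d, best_y

# C = [-1, 3]^2, an asymmetric norm ball; L = Z^2, target x = (0.4, 0.6).
A = np.array([[1., 0.], [-1., 0.], [0., 1.], [0., -1.]])
b = np.array([3., 1., 3., 1.])
x = np.array([0.4, 0.6])
d, y = brute_force_cvp(np.eye(2), x, A, b)
```

Here the closest point is y = (1, 1), and the asymmetry is visible: ‖y − x‖_C = 0.2 while ‖x − y‖_C = 0.6.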

Our methods in this setting are based on a randomized sieving technique first developed by Ajtai, Kumar and Sivakumar [11,12] for solving the Shortest (SVP) and approximate Closest Vector Problem (CVP). In [11], they give a 2^{O(n)}-time and space randomized sieving algorithm for SVP in the ℓ_2 norm, extending this in [12] to give a 2^{O(n/ε)}-time and space randomized algorithm for (1 + ε)-CVP in the ℓ_2 norm. In [13], Blömer and Naewe adapt the AKS sieve to give a 2^{O(n)}-time and space randomized algorithm for ℓ_p SVP, and a 2^{O(n)}(1/ε^2)^n-time and 2^{O(n)}(1/ε)^n-space randomized algorithm for (1 + ε)-CVP under ℓ_p norms. In [14], the previous results are extended to give a 2^{O(n)}-time and space randomized algorithm for SVP under any symmetric norm. In [15], a technique to boost any 2-approximation algorithm for ℓ_∞ CVP is given, which yields a 2^{O(n)}(ln(1/ε))^n-time and 2^{O(n)}-space algorithm for (1 + ε)-CVP under ℓ_∞.

Our main technical contribution is an extension of the AKS sieving technique to give a 2^{O(n)}(1/ε^2)^n-time and 2^{O(n)}(1/ε)^n-space randomized algorithm for CVP under any near-symmetric norm.

1.1 Definitions

In what follows, K ⊆ R^n will denote a convex body (a full dimensional compact convex set) and L ⊆ R^n will denote an n-dimensional lattice. We define the dual lattice of L as L* = {y ∈ R^n : 〈y, x〉 ∈ Z ∀x ∈ L}. K will be presented by a membership oracle in the standard way (see section 2), and L will be presented by a generating basis b_1, . . . , b_n ∈ R^n. We define the barycenter (or centroid) of K as b(K) = (1/vol_n(K)) ∫_K x dx.

For sets A, B ⊆ R^n and scalars s, t ∈ R, define the Minkowski sum sA + tB = {sa + tb : a ∈ A, b ∈ B}. int(A) denotes the interior of the set A.

Let C ⊆ R^n be a convex body with 0 ∈ int(C). Define the norm (possibly asymmetric) induced by C (or gauge function of C) as ‖x‖_C = inf{s ≥ 0 : x ∈ sC} for x ∈ R^n. ‖·‖_C satisfies all standard norm properties except symmetry, i.e. ‖x‖_C ≠ ‖−x‖_C is allowed. ‖·‖_C (or C) is γ-symmetric, for 0 < γ ≤ 1, if vol_n(C ∩ −C) ≥ γ^n vol_n(C). Note that C is 1-symmetric iff C = −C. For the sake of concision, we shall from now on use the generic term norm to include both asymmetric and symmetric norms.
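As a concrete illustration of γ-symmetry (an example computation, not from the paper): for an axis-aligned box C with 0 in its interior, C ∩ −C is again a box, so the volume ratio in the definition can be evaluated exactly.

```python
import numpy as np

def box_gamma(lo, hi):
    """Largest gamma with vol(C ∩ -C) >= gamma^n vol(C), for the box
    C = prod_i [lo_i, hi_i] with lo_i < 0 < hi_i.  Here
    C ∩ -C = prod_i [max(lo_i, -hi_i), min(hi_i, -lo_i)]."""
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    vol_ratio = np.prod((np.minimum(hi, -lo) - np.maximum(lo, -hi)) / (hi - lo))
    return vol_ratio ** (1.0 / len(lo))

# The shifted box C = [-1, 3]^2 has C ∩ -C = [-1, 1]^2, so
# gamma = (4/16)^(1/2) = 1/2; a symmetric box gives gamma = 1.
```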

For a lattice L and norm ‖·‖_C, define the first minimum of L under ‖·‖_C as λ_1(C, L) = inf_{z ∈ L\{0}} ‖z‖_C (the length of the shortest non-zero vector). For a target x, lattice L, and norm ‖·‖_C, define the distance from x to L under ‖·‖_C as d_C(L, x) = inf_{z ∈ L} ‖z − x‖_C.

1.2 Results

We state our main result in terms of general convex bodies and lattices. We recover the standard integer programming setting by setting L = Z^n, the standard integer lattice, and K = {x ∈ R^n : Ax ≤ b}, a general polytope. For simplicity, we often omit standard polynomial factors from the runtimes of our algorithms (i.e. polylog terms associated with bounds on K or the bit length of the basis for L).

Our main result is the following:

Theorem 1 (Approximate IP Feasibility) For 0 < ε ≤ 1/2, there exists a 2^{O(n)}(1/ε^2)^n-time and 2^{O(n)}(1/ε)^n-space randomized algorithm which with probability at least 1 − 2^{−n} either outputs a point

y ∈ ((1 + ε)K − εb(K)) ∩ L

or correctly decides that K ∩ L = ∅. Furthermore, if

( 1/(1 + ε) K + ε/(1 + ε) b(K) ) ∩ L ≠ ∅,

the algorithm returns a point z ∈ K ∩ L with probability at least 1 − 2^{−n}.

Here we note that (1 + ε)K − εb(K) = (1 + ε)(K − b(K)) + b(K), which corresponds to a (1 + ε)-scaling of K about its barycenter b(K).

The above theorem shows that IP can be solved much faster than the current n^{O(n)} worst case running time when K contains a "deep" lattice point (i.e. within a slight scaling of K around its barycenter). Indeed, as long as

( 1/(1 + n^{−1/2}) K + n^{−1/2}/(1 + n^{−1/2}) b(K) ) ∩ L ≠ ∅,

our algorithm will find an integer point in 2^{O(n)} n^n-time (which is the conjectured complexity of the IP algorithm in [6]). Hence to improve the time complexity of IP below 2^{O(n)} n^{δn}, for any 0 < δ < 1, one may assume that all the integer points lie close to the boundary, i.e. that

( 1/(1 + n^{−δ/2}) K + n^{−δ/2}/(1 + n^{−δ/2}) b(K) ) ∩ L = ∅.

The above statement helps elucidate the structure of the "hard" IP instances for current algorithms, i.e. they are instances with many infeasible lattice points lying near the boundary of K.

Starting from the above algorithm, we can use a binary search procedure to go from approximate feasibility to approximate optimization. This yields the following theorem:

Theorem 2 (Approximate Integer Optimization) For v ∈ R^n, 0 < ε ≤ 1/2, δ > 0, there exists a 2^{O(n)}(1/ε^2)^n polylog(1/δ, ‖v‖_2)-time and 2^{O(n)}(1/ε)^n-space randomized algorithm which with probability at least 1 − 2^{−n} either outputs

y ∈ (K + ε(K − K)) ∩ L

such that

sup_{z ∈ K ∩ L} 〈v, z〉 ≤ 〈v, y〉 + δ,

or correctly decides that K ∩ L = ∅.

The above theorem states that if we wish to optimize over K ∩ L, we can find a lattice point in a slight blowup of K whose objective value is essentially as good as that of any point in K ∩ L. We remark that the blowup is worse than in Theorem 1, since (1 + ε)K − εx ⊆ K + ε(K − K) for any x ∈ K. This stems from the need to call the feasibility algorithm on multiple restrictions of K. For a vector v ∈ R^n, we define the width of K with respect to v as

width_K(v) = sup_{x ∈ K} 〈x, v〉 − inf_{x ∈ K} 〈x, v〉.


It is easy to check that width_K(·) defines a symmetric norm on R^n. To get a clearer understanding of the "blowup body" K + ε(K − K), its constraints can be understood from the following formula:

sup_{x ∈ K + ε(K − K)} 〈v, x〉 = sup_{x ∈ K} 〈v, x〉 + ε width_K(v).

Hence each valid constraint 〈v, x〉 ≤ c for K is relaxed by an ε-fraction of v's width (or variation) with respect to K.

Application to Exact Integer Programming. A natural question is whether the approximate IP problem has a direct application to exact IP. While we have not yet found a natural class of IPs for which our approximate IP solver gives provably optimal or near-optimal solutions (since we do not know how to control which constraints are violated), we can instead show the utility of our solver as a tool for solving the exact problem. In his breakthrough work, Lenstra [2] showed how to efficiently solve IPs using small lattice width directions. At the core of Lenstra's algorithm is a subroutine that, given a convex body K¹ and lattice L, either outputs a lattice point in K, or outputs a decomposition of L into consecutive parallel lattice hyperplanes such that only a "small" number of these hyperplanes intersect K. To solve the IP when the latter case occurs, the idea is to recursively solve the n − 1 dimensional IPs indexed by the lattice hyperplanes intersecting K (since any lattice point in K must be contained in exactly one of them), and return any lattice point found by the recursive calls. This strategy still forms the basis of the fastest current IP algorithms [4,6], where the bulk of the new work has focused on finding better hyperplane decompositions (i.e. decompositions minimizing the number of hyperplanes intersecting K).

For the main application of our approximate IP solver, we use it to give an algorithm which either outputs a lattice point inside K, or outputs a nearly "optimal" hyperplane decomposition with respect to K. To explain what drives the "thinness" of these decompositions (i.e. the number of intersecting hyperplanes), we must first describe Khinchine's Flatness theorem in the geometry of numbers. We define the lattice width of K with respect to L as

width(K, L) = min_{y ∈ L*\{0}} width_K(y).    (1)

We note that finding a smallest width non-zero dual vector corresponds exactly to an SVP on L* with respect to the symmetric norm width_K(·). The lattice width essentially corresponds to the optimal thinness of any hyperplane decomposition of L with respect to K. To see this, take y ∈ L*\{0} such that width_K(y) = width(K, L), and let H^i_y = {x : 〈x, y〉 = i} for i ∈ Z. Since y ≠ 0, we see that the H^i_y are distinct parallel hyperplanes, and since y ∈ L* we have that each H^i_y corresponds to the affine hull of the lattice points it contains (i.e. a lattice hyperplane) and that

L ⊆ ⋃_{i ∈ Z} H^i_y  ⇒  K ∩ L ⊆ ⋃_{i ∈ Z : H^i_y ∩ K ≠ ∅} H^i_y.

Furthermore, by convexity of K, the set of hyperplanes H^i_y intersecting K is consecutive, i.e. {i ∈ Z : H^i_y ∩ K ≠ ∅} is an interval in Z. Lastly, it is not hard to check that ⌊width_K(y)⌋ ≤ |{i ∈ Z : H^i_y ∩ K ≠ ∅}| ≤ ⌊width_K(y)⌋ + 1. In conclusion, if K has width(K, L) = λ, then the optimal hyperplane decomposition of L with respect to K has thinness between ⌊λ⌋ and ⌊λ⌋ + 1.

¹ This was originally done for K a polyhedron, though this makes no essential difference.
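The counting bound just stated can be checked directly for a polytope given by its vertices (an illustrative sketch; the triangle and the dual vector are arbitrary choices made for the example):

```python
import math
import numpy as np

def intersecting_hyperplanes(V, y):
    """Indices i with H_y^i = {x : <x,y> = i} meeting K = conv(rows of V).
    By convexity these form the integer interval [ceil(min), floor(max)]."""
    vals = V @ y
    lo, hi = math.ceil(vals.min()), math.floor(vals.max())
    return list(range(lo, hi + 1))

V = np.array([[0.3, 0.2], [3.7, 0.2], [0.3, 2.9]])  # a triangle K
y = np.array([1.0, 0.0])                            # a dual vector in (Z^2)*
w = float((V @ y).max() - (V @ y).min())            # width_K(y) = 3.4
idx = intersecting_hyperplanes(V, y)                # hyperplanes i = 1, 2, 3
# floor(3.4) = 3 <= |idx| <= 4, as the bound claims.
```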

Clearly, without any assumptions on K and L, L need not admit a thin lattice decomposition with respect to K (indeed, width(K, L) may be arbitrarily large). Therefore, from the perspective of IP, it is important to have a good sufficient condition for the lattice width to be small. The Flatness theorem, first proven by Khinchine [16], states that if K is lattice-free (i.e. K contains no lattice points), then K has lattice width bounded by a function of dimension alone. We define the flatness constant of K as

Fl(K) = sup{width(TK, Z^n) : T affine invertible, TK ∩ Z^n = ∅}    (2)

Also, for n ∈ N, we define the n-dimensional flatness constant

Fl(n) = sup{Fl(K) : convex body K ⊆ R^n},

i.e. the worst case flatness bound over all n-dimensional convex bodies. From the definition, we see that Fl(K) is invariant under invertible affine transformations of K. Furthermore, it is also easy to check that one can replace Z^n in the definition with any other lattice without changing Fl(K) (since all n-dimensional lattices are related by linear transformations). Given the importance to IP, much work has focused on the problem of deriving strong bounds for the flatness constant [17–23]. We provide the best known bounds below:

Theorem 3 (Flatness Theorem) For a convex body K ⊆ R^n,

– [21] Fl(K) = Ω(n).

– [23] Fl(K) = O(n^{4/3} polylog(n)).

– [22] if K is a polytope with O(n^c) facets, Fl(K) = O(cn log n).

– [20] if K is an ellipsoid, Fl(K) = O(n).

We now present the main result of this section:

Theorem 4 (Algorithmic Flatness Theorem) For 0 ≤ ε ≤ 1/2, there exists a 2^{O(n)}(1/ε^2)^n-time and 2^{O(n)}(1/ε)^n-space randomized algorithm that correctly outputs one of the following with probability at least 1 − 2^{−n}:

(a) EMPTY if K ∩ L = ∅ (guaranteed if ((1 + ε)K − εb(K)) ∩ L = ∅), or

(b) z ∈ K ∩ L (guaranteed if ( 1/(1 + ε) K + ε/(1 + ε) b(K) ) ∩ L ≠ ∅), or

(c) y ∈ L*\{0} satisfying width_K(y) = width(K, L) ≤ (1 + ε)Fl(K).


In the above theorem, the guarantees in (a)-(c) should be interpreted as follows: conditioned on the "success" of the algorithm (which occurs with probability 1 − 2^{−n}), and assuming that the condition in the guarantee is satisfied, the stated outcome will occur with conditional probability 1. Note that this does not imply that a certain outcome cannot occur if the condition from the guarantee is violated. For example, it is possible that the algorithm correctly returns a point y ∈ K ∩ L even if y ∈ ∂K (and hence not "deep inside" K).

Given our current toolset, implementing the above algorithm is straightforward. First, we run the approximate IP solver on K with parameter ε. We then need only determine what to do if the solver neither decides that K ∩ L = ∅ nor returns a lattice point x ∈ K ∩ L. In this case, we run any general norm SVP solver [14,6] to compute a shortest non-zero vector in L* under the norm width_K(·). Here we get the desired bound on width(K, L) from the fact that K is nearly lattice-free.

The above algorithm can be thought of as a nearly "optimal" implementation of the Flatness theorem. Comparing to previous algorithms with guarantees of the above type, the previous best obtained hyperplane decompositions of thinness O(n^2) [4]. Hence here we gain a factor of at least n^2/Fl(K) = Ω(n^{2/3}) (using Theorem 3). The blowup in most previous approaches stems from the need to use an ellipsoidal approximation of K, which generally results in an n factor blowup with respect to the ellipsoidal estimates. We note that using the above algorithm as the core IP subroutine (setting ε = 1/2), one directly gets a randomized 2^{O(n)} Fl(n)^n-time and 2^{O(n)}-space algorithm for exact IP (matching the complexity of [6]). To see this, note that at any recursion node we create at most (1 + ε)Fl(n) ≤ 2Fl(n) subproblems, and hence the entire recursion tree has size at most 2^n Fl(n)^n (with very high probability). This controls the complexity of the algorithm since we never do more than 2^{O(n)} work at any recursion node.

In [6], they give an algorithm which constructs hyperplane decompositions of thinness O(n^{4/3}) in a slightly different manner than presented above. Here, they first compute width(K, L) using a general norm SVP solver, and if width(K, L) ≥ Fl(n), they replace K by K′ = αK + (1 − α)x (a scaled down version of K), for any x ∈ K, satisfying width(K′, L) = Fl(n). Since K′ always has lattice width ≥ Fl(n), it is guaranteed that K′ ∩ L ≠ ∅ ⇔ K ∩ L ≠ ∅, and hence it suffices to solve the IP on K′. Furthermore, by definition K′ now admits a hyperplane decomposition of thinness O(Fl(n)). The drawbacks of this approach are that it requires an explicit bound on the worst case flatness constant (our algorithm is agnostic to Fl(n)), that it cannot guarantee near optimal thinness of the hyperplane decompositions on a per instance basis, and that it provides very little information on how close K is to being lattice-free.

Unfortunately, it is unclear at this point whether using the algorithm from Theorem 4 can by itself enable an improvement in the worst case complexity of IP achieved in [6]. The reason for this is that in the worst case, the thinness of the hyperplane decompositions produced by both approaches remains essentially the same (though one would expect the average case to be much better for the algorithm from Theorem 4). However, with a more careful analysis, one might be able to show that using the algorithm from Theorem 4 allows one to quickly prune a very large portion of the search tree. To justify this, we note that we only enter case (c) when K is both close to being lattice feasible and close to being lattice infeasible. Therefore, if very little pruning occurs during the IP algorithm, it means that almost all recursion nodes correspond to slices of K that lie on the boundary of lattice feasibility, a scenario which seems quite unlikely. We therefore believe that the algorithm from Theorem 4 should, at the very least, provide a useful tool for future IP algorithms.

1.3 Main Tool

We now describe the main tool used to derive all of the above algorithms. At the heart of Theorem 1 is the following algorithm:

Theorem 5 Let ‖·‖_C denote a γ-symmetric norm. For x ∈ R^n, 0 < ε ≤ 1/2, there exists a 2^{O(n)}(1/(γ^4 ε^2))^n-time and 2^{O(n)}(1/(γ^2 ε))^n-space randomized algorithm which computes a point y ∈ L satisfying

‖y − x‖_C ≤ (1 + ε) d_C(L, x)

with probability at least 1 − 2^{−n}. Furthermore, if d_C(L, x) ≤ t λ_1(C, L), for t ≥ 2, then an exact closest vector can be found in randomized 2^{O(n)}(t^2/γ^4)^n-time and 2^{O(n)}(t/γ^2)^n-space with probability at least 1 − 2^{−n}.

The above algorithm generalizes the AKS sieve to work for asymmetric norms. As mentioned previously, [13] gave the above result for ℓ_p norms, and [14] gave a 2^{O(n)}-time exact SVP solver for symmetric norms (also implied by the above since SVP ≤ CVP, see [24]). In [6], a Las Vegas algorithm (where only the runtime is probabilistic, not the correctness) is given for the exact versions of the above results (i.e. where an exact closest / shortest vector is found) with similar asymptotic complexity, which crucially uses the techniques of [5] developed for ℓ_2-CVP. In [5], Micciancio and Voulgaris give deterministic 2^{O(n)}-time and space algorithms for both ℓ_2 SVP and CVP based upon Voronoi cell computations.

Hence compared with previous results, the novelty of the above algorithm is the extension of the AKS sieving technique to (1 + ε)-CVP under asymmetric norms. As seen from Theorems 1, 2, and 4, the significance of this extension lies in its applications to IP. We note that asymmetry arises naturally in the context of IP, since the continuous relaxation of an IP need not be symmetric (or even closely approximable by a symmetric set). Furthermore, we believe our results illustrate the versatility of the AKS sieving paradigm.

From a high level, our algorithm uses the same framework as [13,14]. We first show that the AKS sieve can be used to solve the Subspace Avoiding Problem (SAP), which was first defined in [13], and then use a reduction from CVP to SAP to get the final result. The technical challenge we overcome is finding


the correct generalizations of each of the steps performed in previous algorithms to the asymmetric setting. We discuss this further in section 3.2.

1.4 Organization

In section 2, we give some general background in convex geometry and lattices. In section 3.1, we describe the reductions from Approximate Integer Programming to Approximate CVP and from Approximate Integer Optimization to Approximate Integer Programming, as well as our implementation of the Flatness theorem. In section 3.2, we present the algorithm for the Subspace Avoiding Problem, and in section 3.3 we give the reduction from CVP to SAP.

2 Preliminaries

Convexity: For a convex body K ⊆ R^n containing 0 ∈ int(K), we define the polar of K as K* = {x ∈ R^n : 〈x, y〉 ≤ 1 ∀y ∈ K}. By classical duality, we have that (K*)* = K. Furthermore, one can check that for x ∈ R^n, ‖x‖_K = inf{s ≥ 0 : x ∈ sK} = sup_{y ∈ K*} 〈x, y〉.

Computation Model: A convex body K ⊆ R^n is (a_0, r, R)-centered if a_0 + rB_2^n ⊆ K ⊆ a_0 + RB_2^n, where B_2^n is the unit Euclidean ball. All the convex bodies in this paper will be (a_0, r, R)-centered unless otherwise specified. To interact with K, algorithms are given access to a membership oracle for K, i.e. an oracle O_K such that O_K(x) = 1 if x ∈ K and 0 otherwise. In some situations, an exact membership oracle is difficult to implement (e.g. deciding whether a matrix A has operator norm ≤ 1), in which case we settle for a "weak" membership oracle, which only guarantees its answer for points that are either ε-deep inside K or ε-far from K (the error tolerance ε is provided as an input to the oracle). We note that for a centered convex body K ⊆ R^n equipped with a weak membership oracle, one can approximately optimize any linear function over K in polynomial time (see [25], section 4.3, for additional details).

For a (0, r, R)-centered K, the gauge function ‖·‖_K is an asymmetric norm. To interact with norms, algorithms are given a distance oracle, i.e. a function which on input x returns ‖x‖_K. It is not hard to check that given a membership oracle for K, one can compute ‖x‖_K to within any desired accuracy using binary search. Also, recall that ‖x‖_K ≤ 1 ⇔ x ∈ K, hence a distance oracle can easily implement a membership oracle. All the algorithms in this paper can be made to work with weak oracles, but for simplicity of presentation, we assume that our oracles are all exact and that the conversion between different types of oracles occurs automatically. We note that when K is a polytope, all the necessary oracles can be implemented exactly and without difficulty.
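The binary search mentioned above is routine; here is an illustrative sketch (assuming, as in the text, a membership oracle for K and the containment rB_2^n ⊆ K ⊆ RB_2^n, which brackets ‖x‖_K between ‖x‖_2/R and ‖x‖_2/r):

```python
import numpy as np

def gauge_from_membership(mem, x, r, R, tol=1e-9):
    """Approximate ||x||_K = inf{s >= 0 : x in s*K} by binary search, given
    a membership oracle `mem` for K with r*B ⊆ K ⊆ R*B (0-centered).
    Then ||x||_2 / R <= ||x||_K <= ||x||_2 / r brackets the answer."""
    norm2 = float(np.linalg.norm(x))
    if norm2 == 0.0:
        return 0.0
    lo, hi = norm2 / R, norm2 / r
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # x in mid*K  <=>  x/mid in K  <=>  ||x||_K <= mid
        if mem(x / mid):
            hi = mid
        else:
            lo = mid
    return hi

# Example: K the Euclidean unit ball (r = R = 1), so ||x||_K = ||x||_2.
mem_ball = lambda p: np.linalg.norm(p) <= 1.0
```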

In the oracle model of computation, complexity is measured by the number of oracle calls and arithmetic operations.


Probability: For random variables X, Y ∈ Ω, we define the total variation distance between X and Y as

d_TV(X, Y) = sup_{A ⊆ Ω} |Pr(X ∈ A) − Pr(Y ∈ A)|.

The following lemma is a standard fact in probability theory:

Lemma 1 Let (X_1, . . . , X_m) ∈ Ω^m and (Y_1, . . . , Y_m) ∈ Ω^m denote independent random variables satisfying d_TV(X_i, Y_i) ≤ ε for i ∈ [m]. Then

d_TV((X_1, . . . , X_m), (Y_1, . . . , Y_m)) ≤ mε.

For two random variables X, Y ∈ Ω, we write X ≡_D Y if X and Y are identically distributed (i.e. have the same probability distribution). For a continuous random variable X, we let dPr[X = x], for x ∈ Ω, denote its probability density at x.

Algorithms on Convex Bodies: For the purposes of our sieving algorithm, we will need an algorithm to sample uniform points from K. We call a random vector X ∈ K η-uniform if the total variation distance between X and a uniform vector on K is at most η. The following result of [26] provides this:

Theorem 6 (Uniform Sampler) Given η > 0, there exists an algorithm which outputs an η-uniform X ∈ K using at most poly(n, ln(1/η), ln(R/r)) calls to the oracle and arithmetic operations.

Our main IP algorithm will provide a guarantee with respect to the barycenter of K. The following standard lemma shows that one can approximate a point near b(K) with overwhelming probability via sampling methods. We include a proof of this in the appendix for completeness.

Lemma 2 (Approx. Barycenter) For ε > 0, let b = (1/N) ∑_{i=1}^N X_i, N = (2c/ε)^2 n, where c > 0 is an absolute constant and X_1, . . . , X_N are iid 4^{−n}-uniform samples on K ⊆ R^n. Then

Pr[‖±(b − b(K))‖_{K−b(K)} > ε] ≤ 2^{−n}.

Lattices: An n-dimensional lattice L ⊆ R^n is formed by the integral combinations of linearly independent vectors b_1, . . . , b_n ∈ R^n. Letting B = (b_1, . . . , b_n), for a point x ∈ R^n we define the modulus operator as

x (mod B) = B(B^{−1}x − ⌊B^{−1}x⌋)

where for y ∈ R^n, ⌊y⌋ = (⌊y_1⌋, . . . , ⌊y_n⌋). We note that x (mod B) ∈ B[0, 1)^n, i.e. the fundamental parallelepiped of B, and that x − (x (mod B)) ∈ L, hence x (mod B) is the unique representative of the coset x + L in B[0, 1)^n.
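In code, the modulus operator is a one-liner (an illustrative sketch with numpy; floating point round-off near integer coefficients is ignored here):

```python
import numpy as np

def lattice_mod(x, B):
    """Return x (mod B) = B(B^{-1}x - floor(B^{-1}x)), the unique
    representative of the coset x + L(B) inside B[0,1)^n."""
    c = np.linalg.solve(B, x)          # coefficients B^{-1} x
    return B @ (c - np.floor(c))

B = np.array([[2., 0.], [0., 3.]])     # basis of the lattice 2Z x 3Z
x = np.array([5., 7.])
r = lattice_mod(x, B)                  # r = (1., 1.)
# x - r = (4., 6.) is a lattice point: 2*2 = 4 and 2*3 = 6.
```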


Convex Geometry:

Lemma 3 Take x, y ∈ K satisfying ‖±(x − y)‖_{K−y} ≤ α < 1. Then for z ∈ R^n we have that

1. for τ ≥ 0, z ∈ τK + (1 − τ)y ⇔ ‖z − y‖_{K−y} ≤ τ;
2. ‖z − y‖_{K−y} ≤ ‖z − x‖_{K−x} + α|1 − ‖z − x‖_{K−x}|;
3. ‖z − x‖_{K−x} ≤ ‖z − y‖_{K−y} + (α/(1 − α))|1 − ‖z − y‖_{K−y}|.

The following theorem of Milman and Pajor tells us that K − b(K) is 1/2-symmetric.

Theorem 7 ([27]) Assume b(K) = 0. Then vol_n(K ∩ −K) ≥ (1/2^n) vol_n(K).

Using the above theorem, we give a simple extension which shows that near-symmetry is a stable property.

Corollary 1 Assume b(K) = 0. Then for x ∈ K we have that K − x is (1/2)(1 − ‖x‖_K)-symmetric.

Proofs of Lemma 3 and Corollary 1 are included in the appendix.

3 Algorithms

3.1 Integer Programming

For the first result in this section, we give a reduction from Approximate Integer Programming to the Approximate Closest Vector Problem.

Proof (Theorem 1: Approximate Integer Programming) We are given 0 < ε ≤ 1/2, and we wish to find a lattice point in ((1 + ε)K − εb(K)) ∩ L or decide that K ∩ L = ∅. The algorithm, which we denote by ApproxIP(K, L, ε), is the following:

Algorithm:

1. Compute b ∈ K satisfying ‖±(b − b(K))‖_{K−b(K)} ≤ 1/3, using Lemma 2.
2. Compute y ∈ L such that y is a (1 + 2ε/5)-approximate closest lattice vector to b under the norm ‖·‖_{K−b}, using Approx-CVP (Theorem 9).
3. Return y if ‖y − b‖_{K−b} ≤ 1 + 3ε/4, and otherwise return EMPTY (i.e. K ∩ L = ∅).

Correctness: Assuming that steps (1) and (2) return correct outputs (which occurs with overwhelming probability), we show that the final output is correct.

First note that if ‖y − b‖_{K−b} ≤ 1 + 3ε/4, then by Lemma 3 we have that

‖y − b(K)‖_{K−b(K)} ≤ ‖y − b‖_{K−b} + (1/3)|1 − ‖y − b‖_{K−b}| ≤ 1 + 3ε/4 + (1/3)(3ε/4) = 1 + ε


as required. Now assume that K ∩ L ≠ ∅. Then we can take z ∈ L such that ‖z − b‖K−b ≤ 1. Since y is a (1 + 2ε/5)-approximate closest vector, we must have that ‖y − b‖K−b ≤ 1 + 2ε/5. Hence by the reasoning in the previous paragraph, we have that ‖y − b(K)‖K−b(K) ≤ 1 + ε as needed.

For the furthermore, we assume that ((1/(1 + ε))K + (ε/(1 + ε))b(K)) ∩ L ≠ ∅. So we may pick z ∈ L such that ‖z − b(K)‖K−b(K) ≤ 1/(1 + ε). By Lemma 3, we have that

‖z − b‖K−b ≤ ‖z − b(K)‖K−b(K) + ((1/3)/(1 − 1/3)) |1 − ‖z − b(K)‖K−b(K)| ≤ 1/(1 + ε) + (1/2)(ε/(1 + ε)) = (1 + ε/2)/(1 + ε)

Next by the assumptions on y, we have that ‖y − b‖K−b ≤ ((1 + ε/2)/(1 + ε))(1 + 2ε/5) ≤ 1, since 0 < ε ≤ 1/2. Hence y ∈ K ∩ L as needed.
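The last inequality, ((1 + ε/2)/(1 + ε))(1 + 2ε/5) ≤ 1 for 0 < ε ≤ 1/2 (with equality exactly at ε = 1/2), is elementary; a quick numeric check:

```python
# Verify ((1 + e/2)/(1 + e)) * (1 + 2*e/5) <= 1 on a grid of eps in (0, 1/2].
grid = [k / 1000 for k in range(1, 501)]
vals = [((1 + e / 2) / (1 + e)) * (1 + 2 * e / 5) for e in grid]
assert all(v <= 1 + 1e-12 for v in vals)
print(max(vals))   # maximum occurs at the endpoint eps = 1/2
```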

Runtime: For step (1), by Lemma 2 we can compute b ∈ K satisfying ‖±(b − b(K))‖K−b(K) ≤ 1/3, with probability at least 1 − 2^(−n), by letting b be the average of O(n) 4^(−n)-uniform samples over K. By Theorem 6, each of these samples can be computed in poly(n, ln(R/r)) time.

For step (2), we first note that by Corollary 1, K − b is (1 − 1/3)(1/2) = 1/3-symmetric. Therefore, the call to the Approximate CVP algorithm with error parameter 2ε/5 returns a valid approximation vector with probability at least 1 − 2^(−n) in time O(3(5/(2ε))^2)^n = 2^(O(n))(1/ε^2)^n. Hence the entire algorithm takes time 2^(O(n))(1/ε^2)^n and outputs a correct answer with probability at least 1 − 2^(−n+1) as needed.

We now provide the reduction from Approximate Integer Optimization to Approximate Integer Programming Feasibility. To achieve this we show that Algorithm 1 (ApproxOPT), which proceeds by an essentially standard binary search, satisfies the desired specifications for Theorem 2.

To analyze ApproxOPT, we will require the following technical lemma. Due to the tedious nature of the proof, we delay its presentation till the appendix.

Lemma 4 (Well-Centered Bodies) Let Ka,b = K ∩ {x ∈ Rn : a ≤ 〈x,v〉 ≤ b}, where K is as in Algorithm 1. Then during the execution of ApproxOPT, every call of ApproxIP is executed on a (a′0, r′, R′)-centered convex body Ka,b, where r′ ≥ 3rδ/(64R‖v‖2), R′ ≤ 2R, and a′0 can be computed in polynomial time as a convex combination of a0, xu, and xl.

Proof (Theorem 2: Approximate Integer Optimization) We are given v ∈ Rn, 0 < ε ≤ 1/2, and δ > 0, where we wish to find a lattice point in (K + ε(K − K)) ∩ L whose objective value is within an additive δ of the best point in K ∩ L. We recall that K is (a0, r, R)-centered. Since we lose nothing by making δ smaller, we shall assume that δ ≤ (1/64)‖v‖2 r. This will allow us to assume that 〈v,xu〉 − 〈v,xl〉 > 127δ (see Lemma 4), i.e. there is significant space between the upper and lower bounds for the continuous relaxation. We will show that Algorithm 1 correctly solves the optimization problem.


Algorithm 1 Algorithm ApproxOPT(K,L,v, ε, δ)
Input: (a0, r, R)-centered convex body K ⊆ Rn presented by a membership oracle, lattice L ⊆ Rn given by a basis, objective v ∈ Rn, tolerance parameters 0 < ε ≤ 1/2 and 0 < δ ≤ r‖v‖2/64.
Output: EMPTY if K ∩ L = ∅, or z ∈ (K + ε(K − K)) ∩ L satisfying sup_{y∈K∩L} 〈v,y〉 ≤ 〈v, z〉 + δ
1: z ← ApproxIP(K,L, ε)
2: if z = EMPTY then
3:   return EMPTY
4: Compute xl,xu ∈ K using the ellipsoid algorithm satisfying inf_{x∈K} 〈v,x〉 ≥ 〈v,xl〉 − δ/16 and sup_{x∈K} 〈v,x〉 ≤ 〈v,xu〉 + δ/16
5: Set l ← 〈v, z〉 and u ← 〈v,xu〉 + δ/16
6: while u − l > δ do
7:   m ← (1/2)(u + l)
8:   y ← ApproxIP(K ∩ {x ∈ Rn : m ≤ 〈v,x〉 ≤ u}, L, ε)
9:   if y = EMPTY then
10:    u ← m
11:    if u < 〈xl,v〉 − δ/16 then return z; else u ← max{u, 〈xl,v〉 + 3δ/16}
12:  else
13:    Set z ← y and l ← 〈v, z〉
14: return z
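The control flow of the binary search can be sketched as follows. This is a simplified 1-dimensional illustration with an exact slab-feasibility oracle standing in for ApproxIP; the line-11 re-centering tweaks and failure handling are omitted.

```python
import math

def approx_opt(approx_ip_slab, x_l, x_u, delta):
    """Sketch of the binary search in Algorithm 1 (ApproxOPT), specialized
    to maximizing <v, x> in 1-d. approx_ip_slab(m, u) is a stand-in for
    ApproxIP on K cap {m <= <v,x> <= u}; returns a lattice point or None."""
    z = approx_ip_slab(x_l, x_u)          # line 1: initial feasibility call
    if z is None:
        return None                        # EMPTY
    l, u = z, x_u + delta / 16            # line 5
    while u - l > delta:                   # line 6
        m = (u + l) / 2                    # line 7
        y = approx_ip_slab(m, u)           # line 8
        if y is None:
            u = m                          # line 10 (line-11 tweaks omitted)
        else:
            z, l = y, y                    # line 13 (objective value is y itself)
    return z

# Toy instance: K = [0, 9.7] subset R, L = Z, v = 1 (so <v, x> = x).
def slab_oracle(a, b):                     # exact stand-in for ApproxIP
    k = math.floor(min(b, 9.7))
    return k if k >= max(a, 0) else None

print(approx_opt(slab_oracle, 0, 9.7, 0.1))
```

The loop halves the gap u − l until it drops below δ, returning the optimal integer 9 here.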

Correctness: Assuming that all the calls to the ApproxIP solver output a correct result (which occurs with overwhelming probability), we show that Algorithm 1 is correct. As can be seen, the algorithm performs a standard binary search over the objective value. During each iteration of the while loop, the value u represents the current best upper bound on sup_{y∈K∩L} 〈v,y〉, where this bound is achieved first by bounding sup_{x∈K} 〈v,x〉 (line 5), and later by showing the lattice infeasibility of appropriate restrictions of K (line 8). On line 11, the value u can potentially be relaxed to max{u, 〈xl,v〉 + 3δ/16}, though this operation maintains that u is an upper bound. We remark that the value of u is relaxed here to guarantee that the restrictions of K on which ApproxIP is called are all well-centered (the details of this are presented in Lemma 4). Similarly, the value l represents the objective value of the best lattice point found thus far, which is denoted by z. Now as long as the value of z is not null, we claim that z ∈ K + ε(K − K). To see this, note that z is the output of some call to ApproxIP on Ka,b = K ∩ {x ∈ Rn : a ≤ 〈v,x〉 ≤ b} for some a < b, the lattice L, with tolerance parameter ε. Hence if z is not null, we are guaranteed that

z ∈ (1 + ε)Ka,b − εb(Ka,b) = Ka,b + ε(Ka,b − b(Ka,b)) ⊆ Ka,b + ε(Ka,b − Ka,b) ⊆ K + ε(K − K)   (3)

since b(Ka,b) ∈ Ka,b ⊆ K. Therefore z ∈ K + ε(K − K) as required. Now, the algorithm returns EMPTY if K ∩ L = ∅ (line 3), or z if u < 〈xl,v〉 − δ/16 or if u − l < δ (line 14). Note that if u < 〈xl,v〉 − δ/16, then since u is a valid upper bound on max_{w∈K∩L} 〈w,v〉 and since

u < 〈xl,v〉 − δ/16 ≤ min_{y∈K} 〈y,v〉

we must have that K ∩ L = ∅. Therefore the returned vector z clearly has higher objective value than any vector in K ∩ L (which is empty). Lastly, if u − l < δ, the returned vector z ∈ K + ε(K − K) is indeed nearly optimal. The algorithm's output is therefore valid as required.

Runtime: Assuming that each call to ApproxIP returns a correct result, we shall bound the number of iterations of the while loop. After this, using a union bound over the failure probabilities of the ApproxIP calls, we get a bound on the probability that the algorithm does not perform as described by the analysis.

First, we show that the gap u − l decreases by a factor of at least 3/4 after each non-exiting iteration of the loop. We first examine the case where Km,u is declared EMPTY in line 8. Let u0 denote the value of u on line 7. After line 10, note that u = m = (1/2)(l + u0). Since the loop exits if u < 〈xl,v〉 − δ/16 on line 11, we may assume that this is not the case. If 〈xl,v〉 + 3δ/16 ≤ u, note that u is unchanged after line 11. Therefore u − l = (1/2)(u0 + l) − l = (1/2)(u0 − l) decreases by a factor 1/2. Otherwise, we have that 〈xl,v〉 − δ/16 ≤ (1/2)(u0 + l) < 〈xl,v〉 + 3δ/16, and the value of u after line 11 is set to 〈xl,v〉 + 3δ/16. From the while loop invariant, note that u0 − l ≥ δ. Therefore, after line 11, we have that

u − l = 〈xl,v〉 + 3δ/16 − l = (〈xl,v〉 + 3δ/16 − (1/2)(u0 + l)) + (1/2)(u0 − l) ≤ (1/4)δ + (1/2)(u0 − l) ≤ (3/4)(u0 − l)

as needed. Next, if a lattice point y is returned in line 8, we know by equation (3) that y ∈ Km,u + ε(Km,u − Km,u). Therefore

〈v,y〉 ≥ inf_{x∈Km,u} 〈v,x〉 − ε(sup_{x∈Km,u} 〈v,x〉 − inf_{x∈Km,u} 〈v,x〉) ≥ m − ε(u − m)   (4)

Since m = (1/2)(l + u) and ε ≤ 1/2, we see that

u − 〈v,y〉 ≤ (u − m) + ε(u − m) ≤ (1/2)(u − l) + (1/4)(u − l) = (3/4)(u − l)

as needed. From here, we claim that we perform at most ⌈ln(4R‖v‖2/δ)/ln(4/3)⌉ iterations of the while loop. Now since K ⊆ a0 + RB2^n, note that widthK(v) ≤ 2R‖v‖2. Therefore, using equation (4), the initial value of u − l (line 6) is at most

2R‖v‖2 + ε(2R‖v‖2) + δ/16 ≤ 2R‖v‖2 + (1/2)(2R‖v‖2) + 2r‖v‖2/16 ≤ 4R‖v‖2

Since u − l decreases by a factor of at least 3/4 at each iteration, it takes at most ⌈ln(4R‖v‖2/δ)/ln(4/3)⌉ iterations before u − l ≤ δ. Since we call ApproxIP at most twice at each iteration, the probability that any one of these calls fails (whereupon the above analysis does not hold) is at most 2⌈ln(4R‖v‖2/δ)/ln(4/3)⌉F,


where F is the failure probability of a call to ApproxIP. For the purposes of this algorithm, we claim that the error probability of a call to ApproxIP can be made arbitrarily small by repetition. To see this, note that any lattice vector returned by ApproxIP(Ka,b, L, ε) is always a success for our purposes, since by the algorithm's design any returned vector is always in (Ka,b + ε(Ka,b − Ka,b)) ∩ L (which is sufficient for us). Hence the only failure possibility is that ApproxIP returns that Ka,b ∩ L = ∅ when this is not the case. By the guarantees on ApproxIP, the probability that this occurs over k independent repetitions is at most 2^(−nk). Hence by repeating each call to ApproxIP O(1 + (1/n) ln ln(R‖v‖2/δ)) times, the total error probability over all calls can be reduced to 2^(−n) as required. Hence with probability at least 1 − 2^(−n), the algorithm correctly terminates in at most ⌈ln(4R‖v‖2/δ)/ln(4/3)⌉ iterations of the while loop.

Lastly, we must check that each call to ApproxIP is done over a well-centered body Ka,b, i.e. we must be able to provide to ApproxIP a center a′0 ∈ Rn and radii r′, R′ such that a′0 + r′B2^n ⊆ Ka,b ⊆ a′0 + R′B2^n, where each of a′0, r′, R′ has size polynomial in the input parameters. This is guaranteed by Lemma 4.

Given the above, since we call ApproxIP at most once at each iteration (over a well-centered convex body), with probability at least 1 − 2^(−n) the total running time is 2^(O(n))(1/ε^2)^n polylog(R, r, δ, ‖v‖2) and the space usage is 2^(O(n))(1/ε)^n as required.

For the last result of this section, we give our implementation of the Flatness Theorem.

Proof (Theorem 4: Algorithmic Flatness Theorem) Take 0 < ε ≤ 1/2, K ⊆ Rn a convex body, and L an n-dimensional lattice. Here we wish to either find a lattice point in K, decide that K is lattice free, or return a small lattice width direction for K. We shall achieve this with the following algorithm:

Algorithm:

1. Compute y ← ApproxIP(K,L, ε).
2. If y = EMPTY, return EMPTY. Else if y ∈ K ∩ L, return y.
3. Else, use a symmetric norm SVP solver to compute y ∈ L∗ \ {0} such that widthK(y) = width(K,L). Return y.
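In code, the three-step structure is a simple dispatch; in this sketch all three subroutines are injected stand-ins for the real ApproxIP call, an exact membership test for K ∩ L, and the dual SVP solver.

```python
def algorithmic_flatness(approx_ip, in_K_and_L, dual_svp):
    """Sketch of the algorithm above. approx_ip(): lattice point of the
    (1+eps)-scaling of K, or None; in_K_and_L(y): exact membership test;
    dual_svp(): shortest nonzero dual vector under width_K. All stand-ins."""
    y = approx_ip()                        # step 1
    if y is None:
        return ("EMPTY", None)             # K is lattice free
    if in_K_and_L(y):
        return ("POINT", y)                # step 2: exact lattice point in K
    return ("FLAT", dual_svp())            # step 3: flat direction for K

# Dispatch illustration with trivial stubs:
print(algorithmic_flatness(lambda: (1, 1), lambda y: False, lambda: (0, 1)))
```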

Correctness: We will prove correctness of the algorithm assuming that our call to ApproxIP and to the SVP solver return correct outputs (which occurs with overwhelming probability). First, note that if ((1 + ε)K − εb(K)) ∩ L = ∅, then by the guarantees on ApproxIP, we must have that ApproxIP declares K ∩ L = ∅. Furthermore, if ((1/(1 + ε))K + (ε/(1 + ε))b(K)) ∩ L ≠ ∅, then again by the guarantees on ApproxIP, we have that ApproxIP returns y ∈ K ∩ L. This proves that the first two guarantees of Theorem 4 are satisfied.

Let us now assume that both of the preliminary tests fail, and that the algorithm computes a shortest non-zero vector y in L∗ under the norm widthK(·). Here, we must show that widthK(y) = width(K,L) ≤ (1 + ε)Fl(K).


To see this, note that by the guarantees on ApproxIP, we must have that ((1/(1 + ε))K + (ε/(1 + ε))b(K)) ∩ L = ∅. Therefore, we see that

Fl(K) ≥ width((1/(1 + ε))K + (ε/(1 + ε))b(K), L) = (1/(1 + ε)) width(K,L)

as needed.

Runtime: The algorithm makes one call to ApproxIP and one call to a symmetric norm SVP solver. The call to ApproxIP requires 2^(O(n))(1/ε^2)^n-time and 2^(O(n))(1/ε)^n-space, and successfully returns a correct solution with probability at least 1 − 2^(−n). Next, making a call to the AKS-based symmetric SVP solver of Arvind and Joglekar [14] requires at most 2^(O(n))-time and space, and returns a correct solution with probability at least 1 − 2^(−n). We note that since K is well-centered, the norm widthK(x), for x ∈ Rn, can be computed to within any desired accuracy in polynomial time using the ellipsoid algorithm. Therefore, the full algorithm requires at most 2^(O(n))(1/ε^2)^n-time and 2^(O(n))(1/ε)^n-space, and outputs a correct solution with probability at least 1 − 2^(−n+1), as needed.

3.2 Subspace Avoiding Problem

In the following two sections, C ⊆ Rn will denote a (0, r, R)-centered γ-symmetric convex body, and L ⊆ Rn will denote an n-dimensional lattice.

In this section, we introduce the Subspace Avoiding Problem of [13], and outline how the AKS sieve can be adapted to solve it under general norms.

Let M ⊆ Rn be a linear subspace with dim(M) = k ≤ n − 1. Let λ(C,L,M) = inf_{x∈L\M} ‖x‖C. Note that under this definition, we have the identity λ1(C,L) = λ(C,L, {0}).

Definition 1 The (1 + ε)-Approximate Subspace Avoiding Problem (SAP) with respect to C, L and M is to find a lattice vector y ∈ L \ M such that ‖y‖C ≤ (1 + ε)λ(C,L,M).

For x ∈ Rn, let ‖x‖∗C = min{‖x‖C, ‖x‖−C}. For a point x ∈ Rn, define s(x) = 1 if ‖x‖C ≤ ‖x‖−C and s(x) = −1 if ‖x‖C > ‖x‖−C. From the notation, we have that ‖x‖∗C = ‖x‖s(x)C = ‖s(x)x‖C.

We begin with an extension of the AKS sieving lemma to the asymmetric setting. The following lemma will provide the central tool for the SAP algorithm.

Lemma 5 (Basic Sieve) Let (x1,y1), (x2,y2), . . . , (xN,yN) ∈ Rn × Rn denote a list of pairs satisfying yi − xi ∈ L, ‖xi‖∗C ≤ β and ‖yi‖∗C ≤ D ∀i ∈ [N]. Then a clustering c : {1, . . . , N} → J, J ⊆ [N], satisfying:

1. |J| ≤ 2(5/γ)^n
2. ‖yi − yc(i) + xc(i)‖∗C ≤ (1/2)D + β
3. yi − yc(i) + xc(i) − xi ∈ L

for all i ∈ [N] \ J, can be computed in deterministic O(N(5/γ)^n)-time.


Proof

Algorithm: We build the set J and clustering c iteratively, starting from J = ∅, in the following manner. For each i ∈ [N], check if there exists j ∈ J such that ‖yi − yj‖s(xj)C ≤ D/2. If such a j exists, set c(i) = j. Otherwise, append i to the set J and set c(i) = i. Repeat.
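The greedy clustering admits a direct transcription. In this sketch norm(x, sign) stands in for ‖x‖_{sign·C}; the demo instantiates it with the Euclidean norm (so C is the unit ball and γ = 1) purely for illustration.

```python
import math

def basic_sieve_cluster(pairs, norm, s, D):
    """Greedy clustering from Lemma 5. pairs[i] = (x_i, y_i); norm(x, sign)
    evaluates ||x||_{sign*C}; s(x) picks the smaller side. Returns (J, c)."""
    J, c = [], {}
    for i, (xi, yi) in enumerate(pairs):
        for j in J:
            xj, yj = pairs[j]
            diff = tuple(a - b for a, b in zip(yi, yj))
            if norm(diff, s(xj)) <= D / 2:
                c[i] = j                   # attach i to existing center j
                break
        else:
            J.append(i)                    # i becomes a new cluster center
            c[i] = i
    return J, c

# Illustration with C = unit Euclidean ball (symmetric norm, gamma = 1).
norm = lambda x, sign: math.hypot(*x)      # ||x||_C = ||x||_{-C} = ||x||_2
s = lambda x: 1
pairs = [((0, 0), (0, 0)), ((0, 0), (10, 0)),
         ((0, 0), (10.4, 0)), ((0, 0), (5.2, 0))]
J, c = basic_sieve_cluster(pairs, norm, s, D=11)
print(J, c)
```

With D/2 = 5.5, the first two points become centers and the remaining two attach to them.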

Analysis: We first note that for any i, j ∈ [N], we have that yi − yj + xj − xi = (yi − xi) − (yj − xj) ∈ L, since by assumption both yi − xi, yj − xj ∈ L. Hence, property (3) is trivially satisfied by the clustering c.

We now check that the clustering satisfies property (2). For i ∈ [N] \ J, note that by construction we have that ‖yi − yc(i)‖sC ≤ D/2, where s = s(xc(i)). Therefore by the triangle inequality, we have that

‖yi − yc(i) + xc(i)‖∗C ≤ ‖yi − yc(i) + xc(i)‖sC ≤ ‖yi − yc(i)‖sC + ‖xc(i)‖sC = ‖yi − yc(i)‖sC + ‖xc(i)‖∗C ≤ D/2 + β

as required.

We now show that J satisfies property (1). By construction of J, we know that for i, j ∈ J, i < j, that ‖yj − yi‖s(xi)C > D/2. Therefore we have that

‖yj − yi‖s(xi)C > D/2 ⇒ ‖yj − yi‖C∩−C = ‖yi − yj‖C∩−C > D/2,

where the last equality follows by symmetry of C ∩ −C. We now claim that

(yi + (D/4)(C ∩ −C)) ∩ (yj + (D/4)(C ∩ −C)) = ∅.   (5)

Assume not, then we may pick z in the intersection above. Then by assumption

‖yj − yi‖C∩−C = ‖(yj − z) + (z − yi)‖C∩−C ≤ ‖yj − z‖C∩−C + ‖z − yi‖C∩−C = ‖z − yj‖C∩−C + ‖z − yi‖C∩−C ≤ D/4 + D/4 = D/2,

a clear contradiction.

For each i ∈ [N], by assumption ‖yi‖∗C ≤ D ⇔ yi ∈ D(C ∪ −C). Therefore, we get that

yi + (D/4)(C ∩ −C) ⊆ D(C ∪ −C) + (D/4)(C ∩ −C) = D((C + (1/4)(C ∩ −C)) ∪ (−C + (1/4)(C ∩ −C))) ⊆ D((C + (1/4)C) ∪ (−C + (1/4)(−C))) = (5/4)D(C ∪ −C)   (6)


From (5), (6), and since J ⊆ [N], we have that

|J| = voln({yi : i ∈ J} + (D/4)(C ∩ −C)) / voln((D/4)(C ∩ −C)) ≤ voln((5/4)D(C ∪ −C)) / voln((D/4)(C ∩ −C)) ≤ (5/4)^n (voln(DC) + voln(−DC)) / ((γ/4)^n voln(DC)) = 2(5/γ)^n

as needed.

Bounding the running time of the clustering algorithm is straightforward. For each element of [N], we iterate once through the partially constructed set J. Since |J| ≤ 2(5/γ)^n throughout the entire algorithm, we have that the entire runtime is bounded by O(N(5/γ)^n) as required.

Definition 2 (Sieving Procedure) For a list of pairs (x1,y1), . . . , (xN,yN) as in Lemma 5, we call an application of the Sieving Procedure the process of computing the clustering c : [N] → J, and outputting the list of pairs (xi, yi − yc(i) + xc(i)) for all i ∈ [N] \ J.

Note that the Sieving Procedure deletes the set of pairs associated with the cluster centers J, and combines the remaining pairs with their associated centers.

We remark on some differences with the standard AKS sieve. Here the Sieving Procedure does not guarantee that ‖yi‖C decreases after each iteration. Instead it shows that at least one of ‖yi‖C or ‖−yi‖C decreases appropriately at each step. Hence the region we must control is in fact D(C ∪ −C), which we note is generally non-convex. Additionally, our analysis shows that how well we can use ‖ · ‖C to sieve depends only on voln(C ∩ −C)/voln(C), which is a very flexible global quantity. For example, if C = [−1, 1]^(n−1) × [−1, 2^n] (i.e. a cube with one highly skewed coordinate) then C is still 1/2-symmetric, and hence the sieve barely notices the asymmetry.

The algorithm for approximate SAP we describe presently will construct a list of large pairs as above, and use repeated applications of the Sieving Procedure to create shorter and shorter vectors.

We will need the following lemmas for the SAP algorithm.

Lemma 6 Let C ⊆ Rn be a (0, r, R)-centered convex body, L ⊆ Rn be an n-dimensional lattice, and M ⊆ Rn, dim(M) ≤ n − 1, be a linear subspace. Then a number ν > 0 satisfying

ν ≤ λ(C,L,M) ≤ 2n(R/r)ν

can be computed in polynomial time.

The above lemma follows directly from Lemma 4.1 of [13]. They prove it for ℓp balls, but it is easily adapted to the above setting using the relationship (1/R)‖x‖2 ≤ ‖x‖C ≤ (1/r)‖x‖2 (since C is (0, r, R)-centered).


Lemma 7 Take v ∈ Rn where β ≤ ‖v‖C ≤ (3/2)β. Define C+v = βC ∩ (v − βC) and C−v = (βC − v) ∩ −βC. Then

voln(C+v)/voln(βC) = voln(C−v)/voln(βC) ≥ (γ/4)^n

Furthermore, int(C+v) ∩ int(C−v) = ∅.

Proof Since ‖v‖C ≤ (3/2)β, we see that v/2 ∈ (3/4)βC. Now we get that

v/2 + (1/4)β(C ∩ −C) ⊆ (3/4)βC + (1/4)βC = βC

Furthermore, since ‖v/2 − v‖−C = ‖−v/2‖−C = (1/2)‖v‖C ≤ (3/4)β, we also have that v/2 ∈ v − (3/4)βC. Therefore

v/2 + (1/4)β(C ∩ −C) ⊆ (v − (3/4)βC) + (1/4)β(−C) = v − βC

We therefore conclude that

voln(βC ∩ (v − βC))/voln(βC) ≥ voln(v/2 + (1/4)β(C ∩ −C))/voln(βC) = voln(C ∩ −C)/(4^n voln(C)) ≥ (γ/4)^n

as needed. For the furthermore, we recall that ‖x‖C = inf{s ≥ 0 : x ∈ sC} = sup{〈x,y〉 : y ∈ C∗} and that (−C)∗ = −C∗. Now assume there exists x ∈ C+v ∩ C−v. Then x = v − βk1 = βk2 − v, where k1,k2 ∈ C. Choose y ∈ C∗ such that 〈y,v〉 = ‖v‖C. Note that

〈y, v − βk1 − (βk2 − v)〉 = 2〈y,v〉 − β(〈y,k1〉 + 〈y,k2〉) = 2‖v‖C − β(〈y,k1〉 + 〈y,k2〉) ≥ 2‖v‖C − β(‖k1‖C + ‖k2‖C) ≥ 2β − 2β = 0

Since v − βk1 − (βk2 − v) = x − x = 0 by construction, all of the above inequalities must hold with equality. In particular, we must have that 1 = ‖k1‖C = ‖k2‖C = 〈y,k1〉 = 〈y,k2〉. Since −y ∈ (−C)∗, we know that

v − βC ⊆ {x ∈ Rn : 〈−y, x − v〉 ≤ β}

and since 〈−y, (v − βk1) − v〉 = β〈y,k1〉 = β, we must have that v − βk1 ∈ ∂C+v. Via a symmetric argument, we get that βk2 − v ∈ ∂C−v. Therefore C+v ∩ C−v ⊆ ∂C+v ∩ ∂C−v ⇔ int(C+v) ∩ int(C−v) = ∅, as needed.

Algorithm ShortVectors (Algorithm 2) is the core subroutine for the SAP solver. We relate some important details about the SAP algorithm. Our algorithm for SAP follows a standard procedure. We first guess a value β satisfying β ≤ λ(C,L,M) ≤ (3/2)β, and then run ShortVectors on inputs C, L, M, β and ε. We show that for this value of β, ShortVectors outputs a (1 + ε)-approximate solution with overwhelming probability.

As can be seen above, the main task of the ShortVectors algorithm is to generate a large quantity of random vectors, and sieve them until they


Algorithm 2 ShortVectors(C,L,M, β, ε)

Input: (0, r, R)-centered γ-symmetric convex body C ⊆ Rn, basis B = (b1, . . . ,bn) ∈ Q^(n×n) for L, linear subspace M ⊆ Rn, scaling parameter β > 0, tolerance parameter 0 < ε ≤ 1/2
1: D ← n max_{1≤i≤n} ‖bi‖C
2: N0 ← 4⌈6 ln(D/β)⌉(20/γ^2)^n + 8(36/(γ^2 ε))^n, η ← 2^(−(n+1))/N0
3: Create pairs (X^0_1, Y^0_1), (X^0_2, Y^0_2), . . . , (X^0_{N0}, Y^0_{N0}) as follows: for each i ∈ [N0], compute Z an η-uniform sample over βC (using Theorem 6) and a uniform s in {−1, 1}, and set X^0_i ← sZ and Y^0_i ← X^0_i (mod B).
4: t ← 0
5: while D ≥ 3β do
6:   Apply the Sieving Procedure to (X^t_1, Y^t_1), . . . , (X^t_{Nt}, Y^t_{Nt}), yielding (X^{t+1}_1, Y^{t+1}_1), . . . , (X^{t+1}_{Nt+1}, Y^{t+1}_{Nt+1})
7:   D ← D/2 + β and t ← t + 1
8: return {Y^t_i − X^t_i − (Y^t_j − X^t_j) : i, j ∈ [Nt]} \ M

are all of relatively small size (i.e. 3β ≤ 3λ(C,L,M)). ShortVectors then examines all the differences between the sieved vectors in the hope of finding one of size (1 + ε)λ(C,L,M) in L \ M. ShortVectors, in fact, needs to balance certain tradeoffs. On the one hand, it must sieve enough times to guarantee that the vector differences have small size. On the other, it must use "large" perturbations sampled from β(C ∪ −C) to guarantee that these differences do not all lie in M.

We note that the main algorithmic difference with respect to [13,14] is the use of a modified sieving procedure (i.e. one that pays attention to asymmetry when building the clusters) as well as the use of a different sampling distribution for the perturbation vectors (i.e. over β(C ∪ −C) instead of just βC).

Theorem 8 (Approximate-SAP) For 0 < ε ≤ 1/2, a lattice vector y ∈ L \ M such that ‖y‖C ≤ (1 + ε)λ(C,L,M) can be computed in 2^(O(n))(1/(γ^4 ε^2))^n-time and 2^(O(n))(1/(γ^2 ε))^n-space with probability at least 1 − 2^(−n). Furthermore, if λ(C,L,M) ≤ t λ1(C,L), t ≥ 2, a vector y ∈ L \ M satisfying ‖y‖C = λ(C,L,M) can be computed in 2^(O(n))(t^2/γ^4)^n-time and 2^(O(n))(t/γ^2)^n-space with probability at least 1 − 2^(−n).

Proof

Algorithm: The algorithm for (1 + ε)-SAP is as follows:

1. Using Lemma 6, compute a value ν satisfying ν ≤ λ(C,L,M) ≤ 2n(R/r)ν.
2. For each i ∈ {0, 1, . . . , ⌈ln(2nR/r)/ln(3/2)⌉}, let β = (3/2)^i ν and run ShortVectors(C,L,M, β, ε).
3. Return the shortest vector found with respect to ‖ · ‖C over the above runs of ShortVectors.
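The guessing loop over β can be sketched as follows, with short_vectors(beta) an injected stand-in for a full run of ShortVectors (the demo uses a trivial 1-dimensional stand-in returning candidates of size at most 3β):

```python
import math

def approx_sap(short_vectors, nu, R_over_r, n, norm):
    """Sketch of the (1+eps)-SAP driver: try beta = (3/2)^i * nu over the
    range given by Lemma 6 and keep the best candidate found.
    short_vectors(beta) stands in for one run of ShortVectors."""
    best = None
    steps = math.ceil(math.log(2 * n * R_over_r) / math.log(1.5))
    for i in range(steps + 1):
        beta = 1.5 ** i * nu
        for y in short_vectors(beta):
            if best is None or norm(y) < norm(best):
                best = y
    return best

# Trivial 1-d stand-in: "lattice vectors" 7, 3, 12, kept when |x| <= 3*beta.
best = approx_sap(lambda b: [x for x in (7, 3, 12) if abs(x) <= 3 * b],
                  1.0, 4.0, 2, abs)
print(best)
```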


Preliminary Analysis: In words, the algorithm first guesses a good approximation of λ(C,L,M) (among polynomially many choices) and runs the ShortVectors algorithm on these guesses. By design, there will be one iteration of the algorithm where β satisfies β ≤ λ(C,L,M) ≤ (3/2)β. We prove that for this setting of β the algorithm returns a (1 + ε)-approximate solution to the SAP problem with probability at least 1 − 2^(−n).

Let v ∈ L \ M denote an optimal solution to the SAP problem, i.e. v satisfies ‖v‖C = λ(C,L,M). We will show that with probability at least 1 − 2^(−n), a small perturbation of v (Claim 4) will be in the set returned by ShortVectors when run on inputs C, L, M, β, and ε.

Within the ShortVectors algorithm, we will assume that the samples generated over βC (line 3) are exactly uniform. By doing this, we claim that the probability that ShortVectors returns a (1 + ε)-approximate solution to the SAP problem changes by at most 2^(−(n+1)). To see this, note that we generate exactly N0 such samples, all of which are η-uniform. Therefore by Lemma 1, we have that the total variation distance between the vector of approximately uniform samples and truly uniform samples is at most N0η = 2^(−(n+1)). Lastly, the event that ShortVectors returns a (1 + ε)-approximate solution is a random function of these samples, and hence when switching uniform samples for η-uniform ones, the probability of this event changes by at most 2^(−(n+1)). Therefore to prove the theorem, it suffices to show that the failure probability under truly uniform samples is at most 2^(−(n+1)).

In the proof, we adopt all the names of parameters and variables defined in the execution of ShortVectors. We denote the pairs at stage t as (X^t_1, Y^t_1), . . . , (X^t_{Nt}, Y^t_{Nt}). We also let C+v, C−v be as in Lemma 7. For any stage t ≥ 0, we define the pair (X^t_i, Y^t_i), i ∈ [Nt], as good if X^t_i ∈ int(C+v) ∪ int(C−v).

We now outline the remaining analysis, which we break up into four separate claims. We note that at a high level, the analysis is identical to the ones presented in [13,14].

– Claim 1: With high probability, the number of good pairs sampled at stage 0 is large.

The argument is a straightforward Chernoff bound.

– Claim 2: After the main sieving stage (while loop, lines 5-7), we still have many good pairs remaining.

Using the packing argument from Lemma 5, we argue that we only remove 2(5/γ)^n pairs at each stage, and hence even if all the removed pairs are good, there will still be many good pairs left at the last stage.

– Claim 3: At the last stage T, for any good pair (X^T_i, Y^T_i), we can randomly (using a carefully chosen distribution) send it to (X^T_i ± v, Y^T_i) without changing the global output distribution of the stage T variables.


Here we use a symmetry Fv (see equation (8) for details) of the base sampling distribution over β(C ∪ −C) which preserves cosets of L, i.e. Fv(sZ) ≡D sZ and Fv(x) (mod B) = x (mod B). The main idea is that since the algorithm doesn't inspect the X^T_i coordinate of a surviving pair (X^T_i, Y^T_i) during the main sieving stage (other than knowing its coset representative), one can apply the symmetry Fv via (X^T_i, Y^T_i) → (Fv(X^T_i), Y^T_i) without changing the global output distribution.

– Claim 4: At the last stage T, the difference set

{Y^T_i − X^T_i − (Y^T_j − X^T_j) : i, j ∈ [N_T]} \ M

contains a lattice vector w close to v w.h.p.

Let (X^T_i, Y^T_i), i ∈ [NG], be the good pairs at the last stage T. The sieve allows us to conclude (with a slight technical modification) that all the lattice vectors Y^T_i − X^T_i, i ∈ [NG], are of length ≤ 4β (i.e. short). Using a packing bound, we group these lattice vectors into at most O(1/(γε))^n clusters (substantially fewer than the number of good vectors) of radius βε. We use the symmetry Fv to send each good pair Y^T_i − X^T_i → Y^T_i − Fv(X^T_i) = Y^T_i − X^T_i ± v (this is legal in the analysis since it doesn't change the global output distribution). We now argue that w.h.p. at least one pair of lattice vectors Y^T_i − Fv(X^T_i), Y^T_j − Fv(X^T_j) in the same cluster have difference equal to ±v + error, where the error has length ≤ βε. This follows from the closeness of the lattice vectors in the same cluster and the random ±v "jiggling" induced by the Fv map.

We now formalize the above claims and give the full proofs.

Claim 1: Let G denote the event that there are at least (1/2)(γ/4)^n N0 good pairs at stage 0. Then G occurs with probability at least 1 − e^(−(1/48)(γ/4)^n N0).

Proof Let Gi = I[X^0_i ∈ int(C+v) ∪ int(C−v)], i ∈ [N0], denote the indicator random variables indicating whether (X^0_i, Y^0_i) is good or not. Let si, i ∈ [N0], denote the {−1, 1} random variable indicating whether X^0_i is sampled uniformly from βC or −βC. Since β ≤ ‖v‖C ≤ (3/2)β, by Lemma 7 we have that

Pr[Gi = 1] ≥ ∑_{b∈{−1,+1}} Pr[X^0_i ∈ int(C^b_v) | si = b] Pr[si = b] = (1/2) voln(βC ∩ (v − βC))/voln(βC) + (1/2) voln((βC − v) ∩ (−βC))/voln(−βC) ≥ (γ/4)^n

From the above we see that E[∑_{i=1}^{N0} Gi] ≥ (γ/4)^n N0. Since the Gi's are i.i.d. Bernoulli random variables, by the Chernoff bound we get that Pr[¬G] = Pr[∑_{i=1}^{N0} Gi < (1/2)(γ/4)^n N0] ≤ e^(−(1/48)(γ/4)^n N0), as needed.


Claim 2: Let T denote the last stage of the sieve (i.e. the value of t at the end of the while loop). Then conditioned on G, the number of good pairs at stage T is at least NG = 4(9/(γε))^n.

Proof Examine (X^0_i, Y^0_i) for i ∈ [N0]. We first claim that ‖Y^0_i‖∗C ≤ D. To see this, note that Y^0_i = Bz where z = B^(−1)X^0_i − ⌊B^(−1)X^0_i⌋ ∈ [0, 1)^n. Hence

‖Y^0_i‖∗C ≤ ‖Y^0_i‖C = ‖∑_{j=1}^n bj zj‖C ≤ ∑_{j=1}^n zj ‖bj‖C ≤ n max_{1≤j≤n} ‖bj‖C = D

as needed. Let Dt = max{‖Y^t_i‖∗C : i ∈ [Nt]}, where we note that the above shows that D0 ≤ D. By Lemma 5, we know that Nt ≥ Nt−1 − 2(5/γ)^n and that Dt ≤ (1/2)Dt−1 + β for t ≥ 1. For Dt ≥ 3β, we see that (1/2)Dt + β ≤ (5/6)Dt. Given the previous bounds, an easy computation reveals that T ≤ ⌈ln(D/(3β))/ln(6/5)⌉ ≤ ⌈6 ln(D/β)⌉.

From the above, we see that during the entire sieving phase we remove at most T · 2(5/γ)^n ≤ 2⌈6 ln(D/β)⌉(5/γ)^n pairs. Since we never modify the X^t_i's during the sieving operation, any pair that starts off as good stays good as long as it survives through the last stage. Since we start with at least (1/2)(γ/4)^n N0 good pairs in stage 0, we are left with at least

(1/2)(γ/4)^n N0 − 2⌈6 ln(D/β)⌉(5/γ)^n ≥ 4(9/(γε))^n = NG

good pairs at stage T as required.
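The choice of N0 in Algorithm 2 is engineered so that the accounting above works out exactly: (1/2)(γ/4)^n · 4⌈6 ln(D/β)⌉(20/γ^2)^n = 2⌈6 ln(D/β)⌉(5/γ)^n cancels the removed pairs, and the second term of N0 contributes exactly NG. A numeric sanity check of this accounting (arbitrary sample parameters, not taken from the paper):

```python
import math

def surviving_good_pairs(n, gamma, eps, D_over_beta):
    """Lower bound on good pairs surviving the sieve, per Claim 2."""
    T = math.ceil(6 * math.log(D_over_beta))                   # stage bound
    N0 = 4 * T * (20 / gamma**2) ** n + 8 * (36 / (gamma**2 * eps)) ** n
    start_good = 0.5 * (gamma / 4) ** n * N0                   # Claim 1 bound
    removed = 2 * T * (5 / gamma) ** n                         # Lemma 5, all stages
    return start_good - removed

n, gamma, eps = 3, 0.5, 0.25
NG = 4 * (9 / (gamma * eps)) ** n
print(surviving_good_pairs(n, gamma, eps, 100.0), NG)
```

The two printed numbers agree, reflecting the exact cancellation in the displayed inequality.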

Modifying the output: Here we will analyze a way of modifying the output of ShortVectors, which maintains the global output distribution but makes the output analysis far simpler.

Let w(x) = I[x ∈ βC] + I[x ∈ −βC]. Letting Z be uniform on βC and s be uniform on {−1, 1} (i.e. the distribution in line 3), for x ∈ β(C ∪ −C) we have that the probability density function of sZ is

d Pr[sZ = x] = d Pr[Z = x] Pr[s = 1] + d Pr[Z = −x] Pr[s = −1] = (1/2)(I[x ∈ βC]/voln(βC) + I[x ∈ −βC]/voln(βC)) = w(x)/(2 voln(βC)).   (7)

Examine the function fv : β(C ∪ −C) → β(C ∪ −C) defined by

fv(x) = x − v if x ∈ int(C+v); x + v if x ∈ int(C−v); x otherwise.

Since int(C+v) ∩ int(C−v) = ∅, it is easy to see that fv is a well-defined bijection on β(C ∪ −C) satisfying fv(fv(x)) = x. Furthermore, by construction we see that fv(int(C+v)) = int(C−v) and fv(int(C−v)) = int(C+v). Lastly, note that for any x ∈ β(C ∪ −C), fv(x) ≡ x (mod B), since fv(x) is just a lattice vector shift of x.

We note that if a sample X^0_i, i ∈ [N0], is good then fv(X^0_i) = X^0_i ± v, and if it is not good then fv(X^0_i) = X^0_i. This is the main property we will need about good "perturbations".

Let Fv denote the random function where

Fv(x) = x with probability w(x)/(w(x) + w(fv(x))), and Fv(x) = fv(x) with probability w(fv(x))/(w(x) + w(fv(x))).   (8)

Here, we intend that different applications of the function Fv all occur with independent randomness. Crucially, we will see that Fv defines a symmetry of the base sample distribution over β(C ∪ −C), i.e. Fv(X^0_i) ≡D sZ (see equation (10) below). Next we define the function cv as

cv(x,y) = fv(x) if ‖y − fv(x)‖∗C < ‖y − x‖∗C, and cv(x,y) = x otherwise.

The sole purpose of the cv function is to be able to send a good last stage pair (X^T_i, Y^T_i) to a representative for which Y^T_i − X^T_i is short under ‖ · ‖∗C. This will be crucial for enabling the packing bound in Claim 4.
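Equation (8) is simple to realize in code. The sketch below takes the weight w and involution fv as callables; the demo instantiates them with a toy 1-dimensional choice (βC = [−1, 2], a shift of 1.5 swapping one interior pair), not the paper's exact C±v sets, just to exhibit the two-point distribution.

```python
import random

def F_v(x, f_v, w, rng):
    """Random symmetry from equation (8): keep x with probability
    w(x) / (w(x) + w(f_v(x))), otherwise move to f_v(x)."""
    wx, wfx = w(x), w(f_v(x))
    return x if rng.random() < wx / (wx + wfx) else f_v(x)

# Toy 1-d instance: beta*C = [-1, 2], beta*(-C) = [-2, 1], so
# w(x) = [x in [-1,2]] + [x in [-2,1]]; f_v swaps the pair {0.25, 1.75}.
w = lambda x: int(-1 <= x <= 2) + int(-2 <= x <= 1)
f_v = lambda x: x - 1.5 if x > 0.5 else x + 1.5
rng = random.Random(1)
samples = [F_v(1.75, f_v, w, rng) for _ in range(4000)]
frac_kept = sum(s == 1.75 for s in samples) / len(samples)
print(frac_kept)   # concentrates near w(1.75)/(w(1.75)+w(0.25)) = 1/3
```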

We claim that for any x, y, v, Fv(x) ≡D Fv(cv(x,y)). This follows from equation (8), noting that

Pr[Fv(x) = x] = Pr[Fv(cv(x,y)) = x] = w(x)/(w(x) + w(fv(x)))
Pr[Fv(x) = fv(x)] = Pr[Fv(cv(x,y)) = fv(x)] = w(fv(x))/(w(x) + w(fv(x)))   (9)

We now prove that F_v yields a symmetry of the distribution of sZ:

d Pr[F_v(sZ) = x] = d Pr[sZ = x] Pr[F_v(x) = x] + d Pr[sZ = f_v(x)] Pr[F_v(f_v(x)) = x]
 = (w(x)/(2 vol_n(βC))) · (w(x)/(w(x) + w(f_v(x)))) + (w(f_v(x))/(2 vol_n(βC))) · (w(x)/(w(x) + w(f_v(x))))
 = w(x)/(2 vol_n(βC)) = d Pr[sZ = x]    (by equation (7))    (10)
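The identity (10) uses only that f_v is a measure-preserving involution and that the density of sZ is proportional to w. The following one-dimensional toy run (my own illustration: the body C = [−1, 2], the scale β = 1, and the involution f below are stand-ins, not the paper's construction) makes both (7) and (10) concrete:

```python
import random

# Toy setting: beta*C = [-1, 2], -beta*C = [-2, 1], and
# w(x) = I[x in beta*C] + I[x in -beta*C] as in the text.
def w(x):
    return (-1.0 <= x <= 2.0) + (-2.0 <= x <= 1.0)

# A stand-in involution playing the role of f_v: it shifts by v = 1.5
# between the disjoint regions (0.5, 2) and (-1, 0.5) and is the identity
# elsewhere, so f(f(x)) = x and f preserves Lebesgue measure.
def f(x):
    if 0.5 < x < 2.0:
        return x - 1.5
    if -1.0 < x < 0.5:
        return x + 1.5
    return x

def F(x, rng):
    # The random flip of equation (8): keep x w.p. w(x)/(w(x) + w(f(x))).
    return x if rng.random() < w(x) / (w(x) + w(f(x))) else f(x)

rng = random.Random(0)
N = 200_000
hits_base = hits_flip = 0
for _ in range(N):
    x = rng.choice((-1, 1)) * rng.uniform(-1.0, 2.0)  # x = s*Z as in line 3
    hits_base += -1.0 <= x <= 1.0
    hits_flip += -1.0 <= F(x, rng) <= 1.0

# By (7), the density of sZ is w(x)/(2 vol(beta*C)), so Pr[sZ in [-1,1]] = 2/3
# (w = 2 there); by (10), F(sZ) has the same law.
p_base, p_flip = hits_base / N, hits_flip / N
print(p_base, p_flip)  # both close to 2/3
```

The empirical frequencies before and after the flip agree, illustrating that F_v leaves the law of sZ invariant.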

For any stage t ≥ 0 and i ∈ [N_t], define X̄^t_i = c_v(X^t_i, Y^t_i).

Claim 3: For any stage t ≥ 0, the pairs (X^t_1, Y^t_1), …, (X^t_{N_t}, Y^t_{N_t}) and (F_v(X̄^t_1), Y^t_1), …, (F_v(X̄^t_{N_t}), Y^t_{N_t}) are identically distributed. Furthermore, this remains true after conditioning on the event G.


Proof To prove the claim, it suffices to show that the sequences

1. (X^t_1, Y^t_1), …, (X^t_{N_t}, Y^t_{N_t}) and 2. (F_v(X̄^t_1), Y^t_1), (X^t_2, Y^t_2), …, (X^t_{N_t}, Y^t_{N_t})

are identically distributed both before and after conditioning on G. Our analysis will be independent of the index analyzed, hence the claim will follow by applying the proof inductively on each remaining pair in the second list.

First, we begin by showing that the sequences

2. (F_v(X̄^t_1), Y^t_1), (X^t_2, Y^t_2), …, (X^t_{N_t}, Y^t_{N_t}) and 3. (F_v(X^t_1), Y^t_1), (X^t_2, Y^t_2), …, (X^t_{N_t}, Y^t_{N_t})

are identically distributed. Let us first condition on any valid instantiation of the normal stage t variables,

(X^t_1, Y^t_1), …, (X^t_{N_t}, Y^t_{N_t}) = (x^t_1, y^t_1), …, (x^t_{N_t}, y^t_{N_t}).

From equation (9), we know that as distributions F_v(c_v(x^t_1, y^t_1)) ≡_D F_v(x^t_1), where we remember that x̄^t_1 = c_v(x^t_1, y^t_1). Therefore sequences 2 and 3 are clearly identically distributed. Since the above equivalence holds for any full conditioning of the stage t variables, we see that the equivalence must also hold in general, and in particular before or after conditioning on G.

It now remains to show that the sequences

1. (X^t_1, Y^t_1), …, (X^t_{N_t}, Y^t_{N_t}) and 3. (F_v(X^t_1), Y^t_1), (X^t_2, Y^t_2), …, (X^t_{N_t}, Y^t_{N_t})

are identically distributed. The pairs in (1) correspond exactly to the induced distribution of the algorithm on the stage t variables. We think of the pairs in (3) as the induced distribution of a modified algorithm on these variables, where the modified algorithm just runs the normal algorithm and replaces (X^t_1, Y^t_1) by (F_v(X^t_1), Y^t_1) in stage t. To show the distributional equivalence, we exhibit a probability preserving correspondence between runs of the normal and modified algorithm having the same stage t variables.

For 0 ≤ k ≤ t, let the pairs (x^k_i, y^k_i), i ∈ [N_k], denote a valid run of the normal algorithm through stage t. We label this as run A. Let us denote the sequence of ancestors of (x^t_1, y^t_1) in the normal algorithm by (x^k_{a_k}, y^k_{a_k}) for 0 ≤ k < t. By definition of this sequence, we have that x^0_{a_0} = x^1_{a_1} = · · · = x^t_1. Since the ShortVectors algorithm is deterministic given the initial samples, the probability density of this run is simply

d Pr[∩_{i∈[N_0]} X^0_i = x^0_i] = d Pr[X^0_{a_0} = x^t_1] ∏_{i∈[N_0], i≠a_0} d Pr[X^0_i = x^0_i]    (11)

by the independence of the samples and since x^0_{a_0} = x^t_1. Notice that if we condition on the event G, assuming the run A belongs to G (i.e. that there are enough good pairs at stage 0), the above probability density is simply divided by Pr[G].

If x^t_1 ∉ int(C⁺_v) ∪ int(C⁻_v), note that F_v(x^t_1) = x^t_1, i.e. the action of F_v is trivial. In this case, we associate run A with the identical run for the modified algorithm, which is clearly valid and has the same probability. Now assume that x^t_1 ∈ C⁺_v ∪ C⁻_v. In this case, we associate run A to two runs of the modified algorithm: A, identical to run A, and C, run A with (x^k_{a_k}, y^k_{a_k}) replaced by (f_v(x^k_{a_k}), y^k_{a_k}) for 0 ≤ k < t. Note that both of the associated runs have the same stage t variables as run A by construction.

We must check that both runs are indeed valid for the modified algorithm. To see this, note that up until stage t, the modified algorithm just runs the normal algorithm. Run A inherits validity for these stages from the fact that run A is valid for the normal algorithm. To see that run C is valid, we first note that f_v(x^0_{a_0}) ≡ x^0_{a_0} ≡ y^0_{a_0} (mod B) (B is the lattice basis), which gives validity for stage 0. By design of the normal sieving algorithm, note that during run A, the algorithm never inspects the contents of x^k_{a_k} for 0 ≤ k < t. This is because none of the y^k_{a_k}, 0 ≤ k < t, is designated as a cluster center during the runs of the Sieving Procedure. Therefore, if (x^k_{a_k}, y^k_{a_k}) denotes a valid ancestor sequence in run A, then so does (f_v(x^k_{a_k}), y^k_{a_k}) in run C for 0 ≤ k < t. For stage t, note that the normal algorithm, given the stage 0 inputs of run A, would output (x^t_1, y^t_1), …, (x^t_{N_t}, y^t_{N_t}) for the stage t variables, and given the stage 0 inputs of run C would output (f_v(x^t_1), y^t_1), (x^t_2, y^t_2), …, (x^t_{N_t}, y^t_{N_t}). Hence in run A, the modified algorithm retains the normal algorithm's output, and in run C, it switches it from f_v(x^t_1) back to x^t_1. Therefore, both runs are indeed valid for the modified algorithm. Furthermore, note that if run A is in G, then the stage 0 variables of both A and C index a good run for the normal algorithm, since the pair (x^0_{a_0}, y^0_{a_0}) is good iff (f_v(x^0_{a_0}), y^0_{a_0}) is good. Hence we see that the correspondence described is valid both before and after conditioning on G. Lastly, it is clear that for any run of the modified algorithm, the above correspondence yields a unique run of the normal algorithm.

It remains to show that the correspondence is probability preserving. We must therefore compute the probability density associated with the union of runs A and C for the modified algorithm. Using the analysis from the previous paragraph and the computation in (11), we see that this probability density is

(d Pr[X^0_{a_0} = x^t_1] Pr[F_v(x^t_1) = x^t_1] + d Pr[X^0_{a_0} = f_v(x^t_1)] Pr[F_v(f_v(x^t_1)) = x^t_1]) ∏_{i∈[N_0], i≠a_0} d Pr[X^0_i = x^0_i]    (12)

Above, the first term corresponds to run A, which samples x^t_1 in stage 0 and then chooses to keep (x^t_1, y^t_1) in stage t, and the second term corresponds to run C, which samples f_v(x^t_1) in stage 0 and chooses to flip (f_v(x^t_1), y^t_1) to (x^t_1, y^t_1) in stage t. Now by definition of F_v and since F_v(X^0_{a_0}) ≡_D X^0_{a_0} (see equation (10)), we have that

d Pr[X^0_{a_0} = x^t_1] Pr[F_v(x^t_1) = x^t_1] + d Pr[X^0_{a_0} = f_v(x^t_1)] Pr[F_v(f_v(x^t_1)) = x^t_1]
 = d Pr[F_v(X^0_{a_0}) = x^t_1] = d Pr[X^0_{a_0} = x^t_1].

Hence the probabilities in equations (11) and (12) are equal as needed. Lastly, note that when conditioning on G, both of the corresponding probabilities are divided by Pr[G], and hence equality is maintained.


Output Analysis:

Claim 4: Let T denote the last stage of the sieve. Then conditioned on the event G, with probability at least 1 − (2/3)^{N_G/2} there exists a lattice vector w in the set {(Y^T_i − X^T_i) − (Y^T_j − X^T_j) : i, j ∈ [N_T]} satisfying

(†) w ∈ L \ M, w − v ∈ M ∩ L and ‖w − v‖_C < εβ.

Furthermore, any lattice vector satisfying (†) is a (1 + ε)-approximate solution to SAP.

Proof Let (x^T_1, y^T_1), …, (x^T_{N_T}, y^T_{N_T}) denote any valid instantiation of the stage T variables corresponding to a good run of the algorithm (i.e. one belonging to G). Let (x̄^T_i, y^T_i) = (c_v(x^T_i, y^T_i), y^T_i) for i ∈ [N_T]. By Claim 3, it suffices to prove the claim for the pairs (F_v(x̄^T_1), y^T_1), …, (F_v(x̄^T_{N_T}), y^T_{N_T}). This follows since the probability of success (i.e. the existence of the desired vector) conditioned on G is simply the average, over all instantiations above, of the conditional probability of success.

Since our instantiation corresponds to a good run, by Claim 2 we have at least N_G good pairs in stage T. Since c_v preserves good pairs, the same holds true for (x̄^T_i, y^T_i), i ∈ [N_T]. For notational convenience, let us assume that the pairs (x̄^T_i, y^T_i), i ∈ [N_G], are all good. We remember that f_v(x̄^T_i) = x̄^T_i ± v and f_v(f_v(x̄^T_i)) = x̄^T_i for i ∈ [N_G].

First, since T is the last stage, we know that ‖y^T_i‖*_C ≤ 3β for i ∈ [N_G]. Next, for i ∈ [N_G], by definition of c_v we have that

‖y^T_i − x̄^T_i‖*_C = min{‖y^T_i − x^T_i‖*_C, ‖y^T_i − f_v(x^T_i)‖*_C}.

Let s = s(y^T_i), i.e. ‖y^T_i‖*_C = ‖y^T_i‖_{sC}. Since (x̄^T_i, y^T_i) is good, at least one of −x̄^T_i, −f_v(x̄^T_i) lies in βsC. Without loss of generality, we assume −x̄^T_i ∈ βsC. Therefore, we get that

‖y^T_i − x̄^T_i‖*_C ≤ ‖y^T_i − x̄^T_i‖_{sC} ≤ ‖y^T_i‖_{sC} + ‖−x̄^T_i‖_{sC} ≤ 3β + β = 4β.    (13)

Let S denote the set {y^T_i − x̄^T_i : i ∈ [N_G]}. Since x̄^T_i ≡ y^T_i (mod B), we note that S ⊆ L. Also, by equation (13) we have that S ⊆ 4β(C ∪ −C) ∩ L. Let Λ ⊆ S denote a maximal subset such that

(x + (ε/2)β int(C ∩ −C)) ∩ (y + (ε/2)β int(C ∩ −C)) = ∅

for distinct x, y ∈ Λ. Since S ⊆ 4β(C ∪ −C), we see that for x ∈ S

x + (ε/2)β(C ∩ −C) ⊆ 4β(C ∪ −C) + (ε/2)β(C ∩ −C) ⊆ (4 + ε/2)β(C ∪ −C).

Therefore we see that

|Λ| ≤ vol_n((4 + ε/2)β(C ∪ −C)) / vol_n((ε/2)β(C ∩ −C)) = ((8 + ε)/ε)^n · vol_n(C ∪ −C)/vol_n(C ∩ −C)
    ≤ ((8 + ε)/ε)^n · 2 vol_n(C)/(γ^n vol_n(C)) ≤ 2(9/(γε))^n ≤ (1/2)N_G.
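As a quick numeric sanity check of the chain of inequalities above (pure arithmetic, not part of the proof), one can verify that the exact volume ratio is dominated by the rounded bound 2(9/(γε))^n over a grid of parameter values:

```python
# Check ((8+eps)/eps)^n * 2/gamma^n <= 2*(9/(gamma*eps))^n for sample values
# with eps <= 1/2 and gamma in (0, 1]; this holds since 8 + eps <= 9.
def bound_exact(n, gamma, eps):
    return ((8 + eps) / eps) ** n * 2 / gamma ** n

def bound_rounded(n, gamma, eps):
    return 2 * (9 / (gamma * eps)) ** n

checks = [(n, g, e) for n in (2, 5, 10, 25)
                    for g in (0.25, 0.5, 1.0)
                    for e in (0.05, 0.25, 0.5)]
ok = all(bound_exact(n, g, e) <= bound_rounded(n, g, e) for n, g, e in checks)
print(ok)  # True
```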

Since Λ is maximal, we note that for any x ∈ S, there exists y ∈ Λ such that

int(x + (ε/2)β(C ∩ −C)) ∩ int(y + (ε/2)β(C ∩ −C)) ≠ ∅ ⇔ ‖x − y‖_{C∩−C} < εβ.    (14)

Let c_1, …, c_{|Λ|} ∈ [N_G] denote indices such that Λ = {y^T_{c_i} − x̄^T_{c_i} : 1 ≤ i ≤ |Λ|}, and let 𝒞 = {c_j : 1 ≤ j ≤ |Λ|}. For j ∈ {1, …, |Λ|}, recursively define the sets

I_j = {i ∈ [N_G] : ‖(y^T_i − x̄^T_i) − (y^T_{c_j} − x̄^T_{c_j})‖_{C∩−C} < εβ} \ (𝒞 ∪ (∪_{k=1}^{j−1} I_k)).    (15)

Given equation (14), we have by construction that the sets 𝒞, I_1, …, I_{|Λ|} partition [N_G]. For each j ∈ {1, …, |Λ|}, we examine the differences

S_j = {±[(y^T_i − F_v(x̄^T_i)) − (y^T_{c_j} − F_v(x̄^T_{c_j}))] : i ∈ I_j}.

We will show that S_j fails to contain a vector satisfying (†) with probability at most (2/3)^{|I_j|}. First we note that S_j ⊆ L, since y^T_i ≡ x̄^T_i ≡ F_v(x̄^T_i) (mod B) for i ∈ [N_G].

We first condition on the value of F_v(x̄^T_{c_j}), which is either x̄^T_{c_j}, x̄^T_{c_j} − v or x̄^T_{c_j} + v. We examine the case where F_v(x̄^T_{c_j}) = x̄^T_{c_j}; the analysis for the other two cases is similar. Now, for i ∈ I_j, we analyze the difference

y^T_i − F_v(x̄^T_i) − (y^T_{c_j} − x̄^T_{c_j}).    (16)

Let δ_i = (y^T_i − x̄^T_i) − (y^T_{c_j} − x̄^T_{c_j}). Depending on the output of F_v(x̄^T_i), note that the vector (16) is either (a) δ_i or of the form (b) ±v + δ_i (since f_v(x̄^T_i) = x̄^T_i ± v). We claim that a vector of form (b) satisfies (†). To see this, note that after possibly negating the vector, it can be brought to the form v ± δ_i ∈ L, where we have that

‖±δ_i‖_C ≤ ‖±δ_i‖_{C∩−C} = ‖δ_i‖_{C∩−C} < εβ ≤ ελ(C, L, M) < λ(C, L, M),    (17)

since i ∈ I_j and ε ≤ 1/2. Since δ_i ∈ L and ‖δ_i‖_C < λ(C, L, M), we must have that ±δ_i ∈ M ∩ L. Next, since v ∈ L \ M and ±δ_i ∈ M, we have that v ± δ_i ∈ L \ M. Lastly, note that

‖v ± δ_i‖_C ≤ ‖v‖_C + ‖±δ_i‖_C < λ(C, L, M) + εβ ≤ (1 + ε)λ(C, L, M)

as required.

Now the probability that the vector in (16) is of form (b) is

Pr[F_v(x̄^T_i) = f_v(x̄^T_i)] = w(f_v(x̄^T_i))/(w(x̄^T_i) + w(f_v(x̄^T_i))) ≥ 1/3,

since for any x ∈ β(C ∪ −C) we have that 1 ≤ w(x) ≤ 2. Since each i ∈ I_j indexes a vector in S_j not satisfying (†) with probability at most 1 − 1/3 = 2/3, the probability that S_j contains no vector satisfying (†) is at most (2/3)^{|I_j|} (by independence) as needed.

Let F_j, j ∈ {1, …, |Λ|}, denote the event that S_j does not contain a vector satisfying (†). Note that F_j only depends on the pairs (F_v(x̄^T_{c_j}), y^T_{c_j}) and (F_v(x̄^T_i), y^T_i) for i ∈ I_j. Since the sets I_1, …, I_{|Λ|}, 𝒞 partition [N_G], these dependencies are all disjoint, and hence the events are independent. Therefore the probability that none of S_1, …, S_{|Λ|} contains a vector satisfying (†) is at most

Pr[∩_{j=1}^{|Λ|} F_j] ≤ ∏_{j=1}^{|Λ|} (2/3)^{|I_j|} = (2/3)^{N_G − |Λ|} ≤ (2/3)^{N_G/2}

as needed.
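The packing-and-clustering step in the proof above can be sketched as follows. This is my own illustration, not the paper's pseudocode: the Euclidean norm stands in for the symmetric norm ‖·‖_{C∩−C}, and the radius plays the role of εβ. A maximal "center" set Λ is picked greedily, and every remaining index joins the first center within the radius, yielding the partition 𝒞, I_1, …, I_{|Λ|}:

```python
import math

def norm(d):
    return math.sqrt(sum(t * t for t in d))

def dist(d, e):
    return norm(tuple(a - b for a, b in zip(d, e)))

def greedy_partition(diffs, radius):
    centers = []                         # indices c_1, ..., c_{|Lambda|}
    for i, d in enumerate(diffs):
        if all(dist(d, diffs[c]) >= radius for c in centers):
            centers.append(i)            # maximal packing: pairwise far apart
    clusters = {c: [] for c in centers}  # the sets I_j
    for i, d in enumerate(diffs):
        if i in clusters:
            continue
        for c in centers:                # first center within the radius
            if dist(d, diffs[c]) < radius:
                clusters[c].append(i)
                break
    return centers, clusters

diffs = [(0.0, 0.0), (0.1, 0.0), (3.0, 0.0), (3.1, 0.1), (0.0, 3.0)]
centers, clusters = greedy_partition(diffs, radius=1.0)
print(centers)  # [0, 2, 4]
covered = set(centers) | {i for I in clusters.values() for i in I}
print(sorted(covered) == list(range(len(diffs))))  # True: a partition
```

Maximality of the center set guarantees that every point lies within the radius of some center, which is exactly the role of equation (14).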

Runtime and Failure Probability: We first analyze the runtime. First, we make O(n log(R/r)) guesses for the value of λ(C, L, M). We run the ShortVectors algorithm once for each such guess β. During one iteration of the sieving algorithm, we first generate N_0 η-uniform samples from βC (line 3). By Theorem 6, this takes poly(n, ln(1/η), ln β, ln R, ln r) time per sample. We also reduce each sample modulo the basis B of L, which takes poly(|B|) time (|B| is the bit size of the basis). Next, by the analysis of Claim 2, we apply the sieving procedure at most ⌈6 ln(D/β)⌉ times (runs of the while loop at line 5), where each iteration of the sieving procedure (line 6) takes at most O(N_0 (5/γ)^n) time by Lemma 5. Lastly, we return the set of differences (line 8), which takes at most O(N_0²) time. Now, by standard arguments, one has that the values D and β (for each guess) each have size (bit description length) polynomial in the input, i.e. polynomial in |B| (the bit size of the basis of L), n, ln R, ln r. Since N_0 = O(ln(D/β)(36/(γ²ε))^n), we have that the total running time is

poly(n, ln R, ln r, |B|, 1/ε) · O(1/(γ⁴ε²))^n

as needed.

Furthermore, the space usage of the algorithm is clearly dominated by the space needed to store the sample pairs (X^0_1, Y^0_1), …, (X^0_{N_0}, Y^0_{N_0}). Hence the space usage is bounded by poly(n, |B|, ln R, ln r) N_0 = poly(n, |B|, ln R, ln r) 2^{O(n)}(1/(γ²ε))^n

as needed.

We now analyze the success probability. Here we only examine the guess β where β ≤ λ(C, L, M) ≤ (3/2)β. Assuming perfectly uniform samples over βC, by the analysis of Claim 4 we have that, conditioned on G, we fail to output a (1 + ε)-approximate solution to SAP with probability at most (2/3)^{N_G/2}. Hence, under the uniform sampling assumption, the total probability of failure is at most

(2/3)^{N_G/2} + Pr[G^c] ≤ (2/3)^{N_G/2} + e^{−(1/48)γⁿN_0} ≤ 2^{−(n+1)}

by Claim 2. When switching to η-uniform samples, as argued in the preliminary analysis, this failure probability increases by at most the total variation distance, i.e. by at most ηN_0 = 2^{−(n+1)}. Therefore, the algorithm succeeds with probability at least 1 − 2^{−(n+1)} − 2^{−(n+1)} = 1 − 2^{−n} as needed.

We remark that in the above analysis, if we assume that the samples are exactly uniform, then the error probability becomes doubly exponentially small. Indeed, if one is willing to let the uniform sampler over C run for exponential time (allowing η to be doubly exponentially small), then the total error probability can be reduced to this level. This in fact improves over the analyses of [13,14], where even with exact samples the error probability remains exponentially small. Here, we achieve this improvement by a better analysis of the pairwise difference set in the last step.

Exact SAP: Here we are given the guarantee that λ(C, L, M) ≤ tλ₁(C, L), and we wish to use our SAP solver to obtain an exact minimizer for SAP. To solve this, we run the approximate SAP solver on C, L, M with parameter ε = 1/t, which takes 2^{O(n)}(t²/γ⁴)^n time and 2^{O(n)}(t/γ²)^n space. Let v ∈ L \ M be a lattice vector satisfying ‖v‖_C = λ(C, L, M). By Claim 4, with probability at least 1 − 2^{−n} we are guaranteed to output a lattice vector w ∈ L \ M such that

‖w − v‖_C < ελ(C, L, M) ≤ (1/t) · tλ₁(C, L) = λ₁(C, L).

However, since w − v ∈ L and ‖w − v‖_C < λ₁(C, L), we must have that w − v = 0. Therefore w = v, and our SAP solver returns an exact minimizer as needed.

3.3 Closest Vector Problem

In this section, we present a reduction from Approximate-CVP to Approximate-SAP for general norms. In [13], it is shown that ℓ_p CVP reduces to ℓ_p SAP in one higher dimension. By relaxing the condition that the lifted SAP problem remain in ℓ_p, we give a very simple reduction from CVP in any norm to SAP in one higher dimension under a different norm that is essentially as symmetric. Given the generality of our SAP solver, such a reduction suffices.

Theorem 9 (Approximate-CVP) Take x ∈ ℝⁿ. Then for any ε ∈ (0, 1/3), a vector y ∈ L satisfying ‖y − x‖_C ≤ (1 + ε)d_C(L, x) can be computed in time O(1/(γ⁴ε²))^n with probability at least 1 − 2^{−n}. Furthermore, if d_C(L, x) ≤ tλ₁(C, L), t ≥ 2, then a vector y ∈ L satisfying ‖y − x‖_C = d_C(L, x) can be computed in time O(t²/γ⁴)^n with probability at least 1 − 2^{−n}.

Proof To show the theorem, we use a slightly modified version of Kannan's lifting technique to reduce CVP to SAP. Let us define L′ ⊆ ℝ^{n+1} as the lattice generated by L × {0} and (−x, 1).


In the standard way, we first guess a value β > 0 satisfying β ≤ d_C(L, x) ≤ (3/2)β. Now let C′ = C × [−1/(2β), 1/β]. For (y, z), y ∈ ℝⁿ, z ∈ ℝ, we have that

‖(y, z)‖_{C′} = max{‖y‖_C, βz, −2βz}.

Also, note that C′ ∩ −C′ = (C ∩ −C) × [−1/(2β), 1/(2β)]. Now we see that vol_{n+1}(C′ ∩ −C′) = (1/β)vol_n(C ∩ −C) and vol_{n+1}(C′) = (3/(2β))vol_n(C). Therefore

vol_{n+1}(C′ ∩ −C′) = (1/β)vol_n(C ∩ −C) ≥ (1/β)γⁿvol_n(C) = (2/3)γⁿvol_{n+1}(C′).

Hence C′ is γ^{n/(n+1)}(2/3)^{1/(n+1)} ≥ γ(1 − 1/n)-symmetric. Let M = {y ∈ ℝ^{n+1} : y_{n+1} = 0}. Define m : L → L′ \ M by m(y) = (y − x, 1), where it is easy to see that m is well-defined and injective. Define

S = {y ∈ L : ‖y − x‖_C ≤ (1 + ε)d_C(L, x)} and
S′ = {y ∈ L′ \ M : ‖y‖_{C′} ≤ (1 + ε)λ(C′, L′, M)}.

We claim that m defines a norm preserving bijection between S and S′. Taking y ∈ L, we see that

‖m(y)‖_{C′} = ‖(y − x, 1)‖_{C′} = max{‖y − x‖_C, β, −2β} = ‖y − x‖_C

since β ≤ d_C(L, x) ≤ ‖y − x‖_C by construction. So we have that ‖m(y)‖_{C′} = ‖y − x‖_C, and hence λ(C′, L′, M) ≤ inf_{y∈L} ‖y − x‖_C = d_C(L, x). Next, take (y, z) ∈ L′ \ M, y ∈ ℝⁿ, z ∈ ℝ, such that ‖(y, z)‖_{C′} ≤ (1 + ε)λ(C′, L′, M). We claim that z = 1. Assume not; then since (y, z) ∈ L′ \ M, we must have that either z ≥ 2 or z ≤ −1. In either case, we have that

‖(y, z)‖_{C′} = max{‖y‖_C, βz, −2βz} ≥ max{βz, −2βz} ≥ 2β.

Now since β ≤ d_C(L, x) ≤ (3/2)β, ε ∈ (0, 1/3), and λ(C′, L′, M) ≤ d_C(L, x), we get that

‖(y, z)‖_{C′} ≥ 2β = (1 + 1/3)(3/2)β ≥ (1 + 1/3)d_C(L, x) > (1 + ε)d_C(L, x) ≥ (1 + ε)λ(C′, L′, M),

a clear contradiction to our initial assumption. Since z = 1, we may write y = w − x where w ∈ L. Therefore, we see that

‖(y, z)‖_{C′} = ‖(w − x, 1)‖_{C′} = max{‖w − x‖_C, β, −2β} = ‖w − x‖_C

since ‖w − x‖_C ≥ d_C(L, x) ≥ β. So we have that (1 + ε)λ(C′, L′, M) ≥ ‖(y, z)‖_{C′} = ‖w − x‖_C ≥ d_C(L, x). Since the previous statement still holds when choosing ε = 0, we must have that λ(C′, L′, M) ≥ d_C(L, x), and hence λ(C′, L′, M) = d_C(L, x).

From the above, for y ∈ S, we have that ‖m(y)‖C′ = ‖y − x‖C ≤ (1 +ε)dC(L,x) = (1 + ε)λ(C ′,L′,M), and hence m(y) ∈ S′ as needed. Next if(y, z) ∈ S′, from the above we have that z = 1, and hence (y, z) = (w − x, 1)

Page 32: A Randomized Sieving Algorithm for Approximate Integer ...dadush/papers/asymmetric-aks.pdf · The Integer Programming (IP) Problem, i.e. the problem of deciding whether a polytope

32 Daniel Dadush

where w ∈ L. Therefore (y, z) = m(w), where ‖w − x‖C = ‖(y, z)‖C′ ≤(1 + ε)λ(C ′,L′,M) = (1 + ε)dC(L, x), and hence w ∈ S. Since the map m isinjective, we get that m defines a norm preserving bijection between S and S′

as claimed.Hence solving (1 + ε)-CVP with respect to C,L,x is equivalent to solving

(1 + ε)-SAP with respect to C ′,L′,M . Hence applying the 1 + ε approximationalgorithm for SAP from Theorem 8, we get an algorithm for (1 + ε)-CVP whichruns in 2O(n)( 1

γ4ε2 )n-time and 2O(n)( 1γ2ε )

n-space and succeeds with probability

at least 1− 2−n as required.For exact CVP, we are given the guarantee that dC(L,x) ≤ tλ1(C,L). From

analysis above, we see that

λ₁(C′, L′) = min{λ(C′, L′, M), inf_{y∈L\{0}} ‖(y, 0)‖_{C′}} = min{d_C(L, x), λ₁(C, L)}.

Therefore

λ(C′, L′, M) = d_C(L, x) = min{d_C(L, x), tλ₁(C, L)} ≤ t min{d_C(L, x), λ₁(C, L)} = tλ₁(C′, L′).

Hence we may again use the SAP solver in Theorem 8 to solve the exact CVP problem in 2^{O(n)}(t²/γ⁴)^n time and 2^{O(n)}(t/γ²)^n space with probability at least 1 − 2^{−n} as required.
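The lifting construction in the proof can be exercised on a toy instance. In the sketch below (my own, with the ℓ∞ unit square standing in for the general body C, L = ℤ² and a hypothetical target x), we build the lifted lattice generated by (L, 0) and (−x, 1), equip it with ‖(y, z)‖_{C′} = max{‖y‖_C, βz, −2βz}, and check by brute force that λ(C′, L′, M) = d_C(L, x), with the minimizer at height z = 1:

```python
def gauge_C(y):                       # gauge of C = [-1, 1]^2 (sup-norm)
    return max(abs(t) for t in y)

x = (0.4, 0.3)                        # target point
beta = 0.3                            # a guess with beta <= d_C(L,x) <= 1.5*beta

def norm_Cprime(y, z):
    return max(gauge_C(y), beta * z, -2 * beta * z)

# d_C(L, x): distance from x to Z^2 under the sup-norm, by enumeration.
d = min(gauge_C((a - x[0], b - x[1]))
        for a in range(-3, 4) for b in range(-3, 4))

# lambda(C', L', M): shortest lifted lattice vector with z != 0; lifted
# lattice points are (a - c*x[0], b - c*x[1], c) for integers a, b, c.
best = min((norm_Cprime((a - c * x[0], b - c * x[1]), c), c)
           for a in range(-3, 4) for b in range(-3, 4)
           for c in range(-3, 4) if c != 0)

print(d)     # 0.4
print(best)  # (0.4, 1): the minimizer lives at height z = 1
```

The asymmetric interval [−1/(2β), 1/β] in the last coordinate is what forces the minimizer to height z = 1, exactly as argued in the proof.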

4 Acknowledgments

I would like to thank Santosh Vempala for useful discussions relating to thisproblem.

References

1. Ralph Gomory. An outline of an algorithm for solving integer programs. Bulletin of theAmerican Mathematical Society, 64(5):275–278, 1958.

2. Hendrik W. Lenstra. Integer programming with a fixed number of variables. Mathematicsof Operations Research, 8(4):538–548, 1983.

3. Ravi Kannan. Minkowski’s convex body theorem and integer programming. Mathematicsof operations research, 12(3):415–440, 1987.

4. Robert Hildebrand and Matthias Köppe. A new Lenstra-type algorithm for quasiconvex polynomial integer minimization with complexity 2^{O(n log n)}. Arxiv, Report 1006.4661, 2010. http://arxiv.org.

5. Daniele Micciancio and Panagiotis Voulgaris. A deterministic single exponential timealgorithm for most lattice problems based on Voronoi cell computations. SIAM Journalof Computing, 42(3):1364–1391, 2013. Preliminary version in STOC 2010.

6. Daniel Dadush, Chris Peikert, and Santosh Vempala. Enumerative lattice algorithmsin any norm via M-ellipsoid coverings. In Proceedings of 52nd annual Symposium onFoundations of Computer Science (FOCS), 580–589, 2011.

7. Alexander Barvinok. A polynomial time algorithm for counting integral points inpolyhedra when the dimension is fixed. Mathematics of Operations Research, 19(4):769–779, 1994.


8. Ravi Kannan. Test sets for integer programs, ∀∃ sentences. In DIMACS Series inDiscrete Mathematics and Theoretical Computer Science Volume 1, 39–47, 1990.

9. Friedrich Eisenbrand and Gennady Shmonin. Parametric integer programming in fixeddimension. Mathematics of Operations Research, 33(4):839–850, 2008.

10. Sebastian Heinz. Complexity of integer quasiconvex polynomial optimization. Journal ofComplexity, 21(4):543–556, 2005. Festschrift for the 70th Birthday of Arnold Schonhage.

11. Miklos Ajtai, Ravi Kumar, and D. Sivakumar. A sieve algorithm for the shortest latticevector problem. In Proceedings of 33rd Symposium on the Theory of Computing (STOC),601–610, 2001.

12. Miklos Ajtai, Ravi Kumar, and D. Sivakumar. Sampling short lattice vectors and theclosest lattice vector problem. In Proceedings of the 17th Conference on ComputationalComplexity (CCC), 53–57, 2002.

13. Johannes Blomer and Stefanie Naewe. Sampling methods for shortest vectors, closestvectors and successive minima. Theoretical Computer Science, 410(18):1648–1665, 2009.Preliminary version in ICALP 2007.

14. Vikraman Arvind and Pushkar S. Joglekar. Some sieving algorithms for lattice problems.In Proceedings of 28th Foundations of Software Technology and Theoretical ComputerScience (FSTTCS), 25–36, 2008.

15. Friedrich Eisenbrand, Nicolai Hahnle, and Martin Niemeier. Covering cubes and the clos-est vector problem. In Proceedings of the 27th annual ACM symposium on ComputationalGeometry (SoCG), 417–423, 2011.

16. Aleksandr Y. Khinchin. A quantitative formulation of Kronecker’s theory of approxima-tion. Izv. Acad. Nauk SSSR, 12:113–122, 1948.

17. Laszlo Babai. On Lovasz’ lattice reduction and the nearest lattice point problem.Combinatorica, 6(1):1–13, 1986. Preliminary version in STACS 1985.

18. Jeffrey C. Lagarias, Hendrik W. Lenstra Jr., and Claus-Peter Schnorr. Korkin-Zolotarevbases and successive minima of a lattice and its reciprocal lattice. Combinatorica,10(4):333–348, 1990.

19. Ravi Kannan and Laszlo Lovasz. Covering minima and lattice point free convex bodies.Annals of Mathematics, 128:577–602, 1988.

20. Wojciech Banaszczyk. New bounds in some transference theorems in the geometry ofnumbers. Mathematische Annalen, 296:625–635, 1993.

21. Wojciech Banaszczyk. Inequalities for convex bodies and polar reciprocal lattices in Rn II: Application of k-convexity. Discrete and Computational Geometry, 16:305–311, 1996.

22. Wojciech Banaszczyk, Alexander Litvak, Alain Pajor, and Stanislaw Szarek. Theflatness theorem for nonsymmetric convex bodies via the local theory of Banach spaces.Mathematics of Operations Research, 24(3):728–750, 1999.

23. Mark Rudelson. Distance between non-symmetric convex bodies and the MM*-estimate.Positivity, 4(8):161–178, 2000.

24. Oded Goldreich, Daniele Micciancio, Shmuel Safra, and Jean-Pierre Seifert. Approxi-mating shortest lattice vectors is not harder than approximating closest lattice vectors.Information Processing Letters, 71(2):55–61, 1999.

25. Martin Grotschel, Laszlo Lovasz, and Alexander Schrijver. Geometric Algorithms andCombinatorial Optimization. Springer, 1988.

26. Martin E. Dyer, Alan M. Frieze, and Ravi Kannan. A random polynomial time algorithmfor approximating the volume of convex bodies. In Proceedings of the 21st Symposiumon the Theory of Computing (STOC), 375–381, 1989.

27. Vitali D. Milman and Alain Pajor. Entropy and asymptotic geometry of non-symmetricconvex bodies. Advances in Mathematics, 152(2):314–335, 2000.

28. Ravi Kannan, Laszlo Lovasz, and Miklos Simonovits. Isoperimetric problems for convexbodies and a localization lemma. Discrete & Computational Geometry, 13:541–559,1995.

29. Grigoris Paouris. Concentration of mass on isotropic convex bodies. Comptes RendusMathematique, 342(3):179–182, 2006.


A Appendix

A.1 Preliminaries

Logconcave functions: A function f : ℝⁿ → ℝ₊ is logconcave if for all x, y ∈ ℝⁿ and 0 ≤ α ≤ 1, we have that f(x)^α f(y)^{1−α} ≤ f(αx + (1−α)y). For a convex body K ⊆ ℝⁿ, the indicator function I_K of K is easily seen to be logconcave. A logconcave density f : ℝⁿ → ℝ₊ is a logconcave function for which ∫_{ℝⁿ} f(x)dx = 1.

A random variable X ∈ ℝⁿ is logconcave if X admits a logconcave density f : ℝⁿ → ℝ₊. We note that the uniform distribution over K admits a logconcave density, i.e. π_K(x) = (1/vol_n(K)) I_K[x] for x ∈ ℝⁿ. A classical fact is that for two independent logconcave random variables X, Y ∈ ℝⁿ, the sum X + Y is also a logconcave random variable. The random variable X (with density f) is isotropic if E[X] = ∫_{ℝⁿ} x f(x)dx = 0 (mean zero) and E[XXᵗ] = (∫_{ℝⁿ} x_i x_j f(x)dx)_{ij} = I_n, the n × n identity (identity covariance matrix). For any full-dimensional random variable X ∈ ℝⁿ, there exists an affine transformation T : ℝⁿ → ℝⁿ (unique up to rotations) such that TX is isotropic. A convex body K is isotropic if the uniform distribution over K is isotropic.

We will need the following two theorems in the proof of Lemma 2. The first theorem, due to Kannan, Lovász and Simonovits, gives sandwiching estimates for isotropic convex bodies.

Theorem 10 (Isotropic Sandwiching [28]) Let K ⊆ ℝⁿ be an isotropic convex body. Then

√((n+2)/n) · B₂ⁿ ⊆ K ⊆ √(n(n+2)) · B₂ⁿ.

The following theorem of Paouris gives strong concentration estimates for logconcave random variables.

Theorem 11 (Measure Concentration [29]) Let X ∈ ℝⁿ be an isotropic logconcave random variable. Then for some absolute constant c > 0,

Pr[‖X‖ ≥ c√n t] ≤ e^{−√n t} for all t ≥ 1.

Proof (Lemma 2: Approx. Barycenter) Let X₁, …, X_N denote iid uniform samples over K ⊆ ℝⁿ, where N = (2c/ε)²n². We will show that for b = (1/N)∑_{i=1}^N X_i, the following holds:

Pr[‖±(b − b(K))‖_{K−b(K)} > ε] ≤ 4^{−n}.    (18)

Since the above statement is invariant under affine transformations, we may assume K is isotropic, i.e. b(K) = E[X₁] = 0, the origin, and E[X₁X₁ᵗ] = I_n, the n × n identity. Since K is isotropic, from Theorem 10 we have that B₂ⁿ ⊆ K. Therefore, to show (18) it suffices to prove that Pr[‖b‖₂ > ε] ≤ 4^{−n}. Since the X_i's are iid isotropic random vectors, we see that E[b] = (1/N)∑_{i=1}^N E[X_i] = 0 and

E[bbᵗ] = (1/N²) ∑_{i,j∈[N]} E[X_iX_jᵗ] = (1/N²) ∑_{i=1}^N E[X_iX_iᵗ] = (1/N) I_n.

Now, since the X_i's are logconcave, we have that b is also logconcave. Note that the random variable √N b is logconcave and isotropic, since E[√N b(√N b)ᵗ] = N E[bbᵗ] = I_n. Therefore, by the concentration inequality of Paouris (Theorem 11) we have that

Pr[‖b‖₂ > ε] = Pr[‖√N b‖₂ > √N ε] = Pr[‖√N b‖₂ > 2cn] ≤ e^{−2n} < 4^{−n}

as claimed. To prove the lemma, we note that when switching the X_i's from truly uniform to 4^{−n}-uniform, the above probability changes by at most N 4^{−n} = (2c/ε)²n² 4^{−n} by Lemma 1. Therefore the total error probability under 4^{−n}-uniform samples is at most 2^{−n} as needed.
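The estimator of Lemma 2 is simply an empirical average of uniform samples. A minimal two-dimensional sketch (my own toy, not the paper's setting: K is the triangle with vertices (0,0), (1,0), (0,1), whose exact barycenter is (1/3, 1/3)):

```python
import random

rng = random.Random(0)
N = 20_000
sx = sy = 0.0
for _ in range(N):
    u, v = rng.random(), rng.random()
    if u + v > 1.0:                 # fold the unit square onto the triangle
        u, v = 1.0 - u, 1.0 - v
    sx += u
    sy += v
b = (sx / N, sy / N)                # empirical barycenter estimate
print(b)  # close to (1/3, 1/3)
```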


Convexity:

Proof (Lemma 3: Estimates for norm recentering) We have z ∈ ℝⁿ, x, y ∈ K satisfying (†) ‖±(x − y)‖_{K−y} ≤ α < 1. We prove the statements as follows:

1. ‖z − y‖_{K−y} ≤ τ ⇔ (z − y) ∈ τ(K − y) ⇔ z ∈ τK + (1 − τ)y, as needed.

2. Let τ = ‖z − x‖_{K−x}. Then by (1), we have that z ∈ τK + (1 − τ)x. Now note that

(1 − τ)(x − y) ∈ |1 − τ|α(K − y)

by assumption (†) and (1). Therefore

z ∈ τK + (1 − τ)x = τK + (1 − τ)y + (1 − τ)(x − y) ⊆ τK + (1 − τ)y + α|1 − τ|(K − y)
  = (τ + α|1 − τ|)K + (1 − τ − α|1 − τ|)y.

Hence by (1), we have that

‖z − y‖_{K−y} ≤ τ + α|1 − τ| = ‖z − x‖_{K−x} + α|1 − ‖z − x‖_{K−x}|

as needed.

3. We first show that

±(y − x) ∈ (α/(1 − α))(K − x).

By (1) and (†) we have that

(x − y) ∈ α(K − y) ⇔ (x − y) − α(x − y) ∈ α(K − y) − α(x − y)
⇔ (1 − α)(x − y) ∈ α(K − x) ⇔ (x − y) ∈ (α/(1 − α))(K − x),

as needed. Next, since 0 ≤ α ≤ 1, we have that |1 − 2α| ≤ 1. Therefore by (†) we have that

(1 − 2α)(y − x) ∈ |1 − 2α|α(K − y) ⊆ α(K − y),

since 0 ∈ K − y. Now note that

(1 − 2α)(y − x) ∈ α(K − y) ⇔ (1 − 2α)(y − x) + α(y − x) ∈ α(K − y) + α(y − x)
⇔ (1 − α)(y − x) ∈ α(K − x) ⇔ (y − x) ∈ (α/(1 − α))(K − x),

as needed. Let τ = ‖z − y‖_{K−y}. Then by (1), we have that z ∈ τK + (1 − τ)y. Now note that

z ∈ τK + (1 − τ)y = τK + (1 − τ)x + (1 − τ)(y − x)
  ⊆ τK + (1 − τ)x + (α/(1 − α))|1 − τ|(K − x)
  = (τ + (α/(1 − α))|1 − τ|)K + (1 − τ − (α/(1 − α))|1 − τ|)x.

Hence by (1), we have that

‖z − x‖_{K−x} ≤ τ + (α/(1 − α))|1 − τ| = ‖z − y‖_{K−y} + (α/(1 − α))|1 − ‖z − y‖_{K−y}|

as needed.
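Estimate (2) of the lemma can be checked numerically. In the sketch below (my own illustration: an axis-aligned box stands in for the general convex body K, and the gauge of K − c is computed coordinate-wise), we draw random x, y ∈ K and z ∈ ℝ², set α to be exactly ‖±(x − y)‖_{K−y}, and verify the inequality whenever α < 1:

```python
import random

LO, HI = (-1.0, -1.0), (2.0, 3.0)    # K = [-1,2] x [-1,3], a stand-in body

def gauge(center, p):
    # Gauge (Minkowski functional) of K - center at p for the box K above:
    # inf{t > 0 : p in t*(K - center)}.
    t = 0.0
    for pi, ci, lo, hi in zip(p, center, LO, HI):
        if pi > 0:
            t = max(t, pi / (hi - ci))
        elif pi < 0:
            t = max(t, pi / (lo - ci))
    return t

rng = random.Random(1)
violations = tested = 0
for _ in range(2000):
    x = tuple(rng.uniform(l + 0.1, h - 0.1) for l, h in zip(LO, HI))
    y = tuple(rng.uniform(l + 0.1, h - 0.1) for l, h in zip(LO, HI))
    d = tuple(a - b for a, b in zip(x, y))
    alpha = max(gauge(y, d), gauge(y, tuple(-t for t in d)))
    if alpha >= 1.0:
        continue                      # hypothesis (dagger) does not hold
    z = tuple(rng.uniform(-5.0, 5.0) for _ in range(2))
    tested += 1
    lhs = gauge(y, tuple(zi - yi for zi, yi in zip(z, y)))
    tau = gauge(x, tuple(zi - xi for zi, xi in zip(z, x)))
    if lhs > tau + alpha * abs(1.0 - tau) + 1e-9:
        violations += 1

print(tested > 0, violations)  # True 0
```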

Proof (Corollary 1: Stability of symmetry) We claim that (1 − ‖x‖_K)(K ∩ −K) ⊆ (K − x) ∩ (x − K). Take z ∈ K ∩ −K; then note that

‖x + (1 − ‖x‖_K)z‖_K ≤ ‖x‖_K + (1 − ‖x‖_K)‖z‖_K ≤ ‖x‖_K + (1 − ‖x‖_K)‖z‖_{K∩−K} ≤ ‖x‖_K + (1 − ‖x‖_K) = 1,

hence x + (1 − ‖x‖_K)(K ∩ −K) ⊆ K ⇔ (1 − ‖x‖_K)(K ∩ −K) ⊆ K − x. Next note that

‖−x + (1 − ‖x‖_K)z‖_{−K} ≤ ‖−x‖_{−K} + (1 − ‖x‖_K)‖z‖_{−K} ≤ ‖x‖_K + (1 − ‖x‖_K)‖z‖_{K∩−K} ≤ ‖x‖_K + (1 − ‖x‖_K) = 1,

hence −x + (1 − ‖x‖_K)(K ∩ −K) ⊆ −K ⇔ (1 − ‖x‖_K)(K ∩ −K) ⊆ x − K, as needed. Now we see that

vol_n((K − x) ∩ (x − K)) ≥ vol_n((1 − ‖x‖_K)(K ∩ −K)) = (1 − ‖x‖_K)ⁿ vol_n(K ∩ −K),

and so the claim follows from Theorem 7.

A.2 Integer Programming

Proof (Lemma 4: Well-Centered Bodies) Clearly for any a′₀ ∈ K_{a,b} we have that K_{a,b} ⊆ a′₀ + 2RB₂ⁿ, since K_{a,b} ⊆ K ⊆ a₀ + RB₂ⁿ. Therefore we need only worry that K_{a,b} contains a polynomially sized ball around an easy to compute point in K_{a,b}. Since a, b correspond to m, u in the while loop (lines 6-13 in Algorithm 1), we have that

b − a ≥ δ/2,  ⟨x_l, v⟩ + δ/16 ≤ b,  a ≤ ⟨x_u, v⟩ − δ/16,

where the last inequality follows since a + δ/2 ≤ b ≤ ⟨x_u, v⟩ + δ/16. Since a₀ + rB₂ⁿ ⊆ K and by assumption δ ≤ (1/64) r‖v‖₂, we also have that

width_K(v) ≥ width_{a₀+rB₂ⁿ}(v) = 2r‖v‖₂ ≥ 128δ.

From the above, we additionally conclude that

⟨x_l, v⟩ ≤ ⟨a₀, v⟩ ≤ ⟨x_u, v⟩ and ⟨x_u, v⟩ − ⟨x_l, v⟩ ≥ width_K(v) − δ/8 ≥ 127δ.

Let I denote the interval [⟨x_l, v⟩, ⟨x_u, v⟩] ∩ [a, b]. Combining the above inequalities, it is not hard to check that length(I) ≥ 3δ/16 (corresponding to having b shifted as far to the left as possible). Let I_l = [⟨x_l, v⟩, ⟨a₀, v⟩] ∩ I and I_u = [⟨a₀, v⟩, ⟨x_u, v⟩] ∩ I. Since I_l and I_u partition I, we must have that either length(I_l) ≥ 3δ/32 or length(I_u) ≥ 3δ/32. Assume we are in the former case (the analysis for the latter case is symmetric). Let c denote the midpoint of I_l. Define a′₀ as

a′₀ = ((⟨a₀, v⟩ − c)/⟨a₀ − x_l, v⟩) x_l + ((c − ⟨x_l, v⟩)/⟨a₀ − x_l, v⟩) a₀,

where we note that a′₀ can easily be computed in polynomial time. Since a′₀ is a convex combination of x_l and a₀, and since ⟨a′₀, v⟩ = c ∈ [a, b], we have that a′₀ ∈ K_{a,b}. Next, by our assumption that length(I_l) ≥ 3δ/32, that c is the midpoint of I_l, and that ‖x_l − a₀‖₂ ≤ R, we have that

(c − ⟨x_l, v⟩)/⟨a₀ − x_l, v⟩ ≥ 3δ/(64R‖v‖₂).

Since a₀ + rB₂ⁿ ⊆ K, we get that

K ⊇ ((⟨a₀, v⟩ − c)/⟨a₀ − x_l, v⟩) x_l + ((c − ⟨x_l, v⟩)/⟨a₀ − x_l, v⟩)(a₀ + rB₂ⁿ)
  = a′₀ + r((c − ⟨x_l, v⟩)/⟨a₀ − x_l, v⟩) B₂ⁿ ⊇ a′₀ + (3rδ/(64R‖v‖₂)) B₂ⁿ.

Furthermore, note that

max{⟨x, v⟩ : x ∈ a′₀ + (3rδ/(64R‖v‖₂)) B₂ⁿ} = c + 3rδ/(64R) ≤ c + 3δ/64 ≤ b,

since c is the midpoint of I_l ⊆ [a, b]. By the symmetric argument, we also have that min{⟨x, v⟩ : x ∈ a′₀ + (3rδ/(64R‖v‖₂)) B₂ⁿ} ≥ c − 3δ/64 ≥ a. Therefore, we have that

a′₀ + (3rδ/(64R‖v‖₂)) B₂ⁿ ⊆ K_{a,b}

as needed.
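The recentering step above is a one-line computation. A tiny numeric sketch (hypothetical values of my own, not from the paper) of the convex combination defining a′₀, which places the new center on the segment [x_l, a₀] at the prescribed height c along v:

```python
def recenter(a0, xl, v, c):
    # a0' = lam*x_l + (1-lam)*a0 with <a0', v> = c, as in Lemma 4.
    ip = lambda u, w: sum(ui * wi for ui, wi in zip(u, w))
    denom = ip(a0, v) - ip(xl, v)            # <a0 - x_l, v>
    lam = (ip(a0, v) - c) / denom            # weight on x_l
    return tuple(lam * xi + (1 - lam) * ai for xi, ai in zip(xl, a0))

a0, xl, v, c = (2.0, 1.0), (0.0, 0.0), (1.0, 0.0), 0.5
a0p = recenter(a0, xl, v, c)
print(a0p)  # (0.5, 0.25): on the segment [x_l, a0] with <a0', v> = 0.5
```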