2011 IEEE 52nd Annual Symposium on Foundations of Computer Science (FOCS), Palm Springs, CA, USA (2011.10.22-2011.10.25)

Efficient Reconstruction of Random Multilinear Formulas

Ankit Gupta, Microsoft Research India

[email protected]

Neeraj Kayal, Microsoft Research India

[email protected]

Satya Lokam, Microsoft Research [email protected]

Abstract: In the reconstruction problem for a multivariate polynomial f, we have blackbox access to f and the goal is to efficiently reconstruct a representation of f in a suitable model of computation. We give a polynomial time randomized algorithm for reconstructing random multilinear formulas. Our algorithm succeeds with high probability when given blackbox access to the polynomial computed by a random multilinear formula according to a natural distribution. This is the strongest model of computation for which a reconstruction algorithm is presently known, albeit efficient in a distributional sense rather than in the worst case. Previous results on this problem considered much weaker models such as depth-3 circuits with various restrictions or read-once formulas.

Our proof uses ranks of partial derivative matrices as a key ingredient and combines them with an analysis of the algebraic structure of random multilinear formulas. Partial derivative matrices have earlier been used to prove lower bounds in a number of models of arithmetic complexity, including multilinear formulas and constant depth circuits. As such, our results give supporting evidence to the general thesis that mathematical properties that capture efficient computation in a model should also enable learning algorithms for functions efficiently computable in that model.

Keywords: arithmetic circuits; multilinear formulas; reconstruction; learning.

1. INTRODUCTION

We study the problem of reconstructing a multivariate polynomial: given blackbox access to a hidden polynomial f ∈ F[x_1, ..., x_n] over a finite¹ field F, reconstruct a representation of f in some suitable model of computation. A reconstruction algorithm can adaptively query the blackbox to evaluate f on inputs of its choice from F^n. Its efficiency is measured in terms of the number of queries and the running time. We typically assume f itself to be efficiently computable in some model of computation, e.g., depth-3 circuits of polynomial size, and also require the reconstruction algorithm to produce a succinct representation of f in some (possibly different) model of computation. The most obvious representation of a multivariate polynomial is its formula as a sum, weighted by coefficients from F, of monomials, i.e., a depth-2 ΣΠ formula. In this case, the problem of reconstruction is more commonly referred to as interpolation: given blackbox access to a polynomial, produce its representation as a sum of products. However, many interesting polynomials, e.g., the determinant, have exponentially long (in the number of variables) representations as a sum of products, whereas as a straight line program or an arithmetic circuit, they can be represented much more succinctly. The reconstruction problem demands such succinct representations as outputs and hence is a generalization of the interpolation problem. In its most general formulation, e.g., produce (roughly) the smallest arithmetic circuit for f, the reconstruction problem is extremely hard. If a circuit class C has a deterministic reconstruction algorithm, it is easy to see that C also has a deterministic (blackbox) polynomial identity testing (PIT) algorithm. On the other hand, a deterministic PIT implies superpolynomial size lower bounds against C for an explicit polynomial. Hence, a deterministic reconstruction algorithm for C is at least as hard as proving superpolynomial lower bounds against C. Thus, much of the research in this area focusses on reconstructing polynomials efficiently computable by weaker variants of arithmetic circuits.

¹ Many of the definitions make sense for infinite fields as well.

Previous work on the reconstruction problem focussed on polynomials computable by restricted constant depth arithmetic circuits and read-once formulas. In particular, it covered depth-2 circuits [4] (i.e., the interpolation problem), depth-3 circuits with bounded top fan-in, and multilinear depth-3 formulas with bounded top fan-in [9], [3]. See [11] for more details on previous work.

In this paper, we consider the model of multilinear formulas. An arithmetic formula, using + and × operations, is multilinear if the formal polynomial computed by each of its subformulas is multilinear. Our main result is a randomized reconstruction algorithm for a class of random multilinear formulas. The algorithm uses as a blackbox a multilinear formula randomly chosen according to a natural distribution (see Section 2 below for details). It succeeds with high probability w.r.t. its internal randomness and the choice of the formula from the distribution. Its output is a multilinear formula of the same size as the hidden formula; it is, in fact, the smallest multilinear formula computing the hidden polynomial. This is the strongest model, and the first one of super-constant depth, in arithmetic complexity for which an efficient (even in a randomized or distributional sense) reconstruction algorithm is shown. We further remark that a slight variant of the problem of reconstructing multilinear formulas, even for depth three formulas, is known to be NP-hard. Specifically, Hastad [1] showed that reconstructing the smallest set-multilinear formula (an even weaker model than multilinear formulas) for a given set-multilinear polynomial is NP-hard. This indicates that without some kind of a distributional assumption, it would be unrealistic to hope for a reconstruction algorithm for multilinear formulas. Alternatively, it indicates that there is unlikely to be a worst-case reconstruction algorithm for multilinear formulas.

2011 52nd Annual IEEE Symposium on Foundations of Computer Science

0272-5428/11 $26.00 © 2011 IEEE

DOI 10.1109/FOCS.2011.70

From a broad perspective, reconstructing polynomials from arithmetic complexity classes is, in some sense, analogous to learning concept classes of Boolean functions using membership and equivalence queries (see Chapter 5 of the survey by Shpilka and Yehudayoff [11] for justifying arguments for the analogy to the Boolean world and, more generally, for previous work in this area). While research on the theory of learnability in the Boolean world has evolved into a mature discipline, thanks to fundamental notions such as PAC learning due to Valiant, research on learnability in the arithmetic world has been gaining momentum only in recent years.

A recurring theme in Boolean and arithmetic domains is that techniques used to prove lower bounds for a model of computation are often helpful in designing learning algorithms for that model. At a very high level, a lower bound proof identifies mathematical properties of a model of computation that capture efficient computation in that model. Thus functions efficiently computable in that model should possess the same or similar properties, and these properties should also be useful in learning such functions. This thesis has been borne out in the Boolean world by several examples, e.g., Fourier approximability of AC^0 circuits is useful in both lower bounds and learning algorithms. In the arithmetic world, we see a similar trend, but there is still an abundance of open questions suggested by this general theme.

Our results in this paper, and in this direction in general, are guided by, and provide supporting evidence to, the thesis mentioned above. One of the key ingredients of our proof is the use of partial derivative matrices of polynomials computed in a multilinear formula. We note that properties of partial derivatives of a polynomial have been an important tool in proving lower bounds in a variety of models. In particular, Raz [7] used them to prove lower bounds on multilinear formulas and Raz and Shpilka used them for lower bounds on constant depth circuits. Nisan [6] also used them to prove lower bounds in the noncommutative setting. Thus it is to be expected that properties of partial derivatives of polynomials are useful in reconstruction algorithms. Indeed, Klivans and Shpilka [5] prove that whenever the space of partial derivatives has polynomial dimension, one has polynomial time reconstruction algorithms. This implies reconstruction algorithms for some restricted versions of depth-3 circuits and Arithmetic Branching Programs (ABPs) since their partial derivatives span low-dimensional spaces. This approach, however, cannot be used for multilinear formulas since there are multilinear formulas whose partial derivatives span spaces of exponential dimension.

Nevertheless, Raz [7] combines rank arguments about partial derivative matrices and combinatorial arguments based on random restrictions to prove quasipolynomial lower bounds on the multilinear formula complexity of the determinant and permanent polynomials. In this paper, we also exploit rank arguments about partial derivative matrices of polynomials computed in a multilinear formula and combine them with additional structural properties of random multilinear formulas to derive our reconstruction algorithm.

2. DEFINITIONS AND MAIN RESULT

We recall that an arithmetic formula is a binary tree such that (i) each leaf is labeled by either a variable from X = {x_1, ..., x_n} or an element of the field F, (ii) each internal node is either a + gate or a × gate, and (iii) the incoming edges of a + gate are also labeled by constants from F. A + gate computes the linear combination of its inputs with coefficients given by the constants on the incoming edges of the gate. A × gate computes the product of its inputs. Each gate v in the formula is naturally associated to a polynomial p_v ∈ F[X] computed at v. In particular, the polynomial computed at the root (output node) is the polynomial computed by the formula. The size of a formula is the number of leaves in the tree. The (multiplicative) depth of a node is the number of × gates on the path from that node to the root. The depth of the formula is the maximum depth of a leaf. An arithmetic formula is said to be multilinear if each gate in it computes a multilinear polynomial, i.e., in each of its monomials the power of every input variable is at most one.

Definition 2.1. Syntactic Multilinear Formulas: Let Φ be an arithmetic formula over X = {x_1, ..., x_n}. Let Φ_v denote the subformula rooted at a node v and X_v be the set of variables that appear in Φ_v. Then, Φ is said to be syntactic multilinear if for every product gate v = v_1 × v_2 of Φ, the sets X_{v_1} and X_{v_2} are disjoint.

Note that for any multilinear formula, there exists a syntactic multilinear formula of the same size that computes the same polynomial (see [7]). Hence, we often omit the word “syntactic” while referring to multilinear formulas. Moreover, any syntactic multilinear formula can be converted into a syntactic multilinear formula with alternating layers of + and × gates with only a polynomial blow-up in size (see [8]).

A Natural Distribution on the set of Multilinear Formulas:

Our reconstruction algorithm uses, as a blackbox, a random multilinear formula drawn according to a distribution as defined below. Informally, this distribution constructs a binary² tree with + and × gates at alternating levels (with a + gate at the root). Each + gate computes a random linear combination of its inputs over F. Moving down the tree, at each × gate, we partition the variables into two equal-sized sets and recursively build a subformula rooted at each child of this × gate. We stop the recursion when the number of variables is small enough (we choose this to be about log^3 n for technical reasons, to ensure an error probability of 1/poly(n)).

² We assume this for the clarity of presentation and handle the k-ary case in Section A.

A formal definition of the distribution follows. Let M(X, F) be the set of all possible syntactic multilinear formulas over the variable set X = {x_1, ..., x_n} and a (sufficiently large) finite field F. We propose the following method SAMPLE(X, F) to sample a random syntactic multilinear formula from the set M(X, F), thereby inducing a natural P-samplable distribution D(X, F) on the set M(X, F). This distribution also depends on an integer parameter β_n, which we assume to be Θ(log^3 n).

Sampling Method SAMPLE(X, F):

Step 1: Ψ ← CONSTRUCT(X, +), where CONSTRUCT(X, op) is defined below.

Step 2: Let W be the set of wires in Ψ incident to a + gate. Let Φ be the syntactic multilinear arithmetic formula obtained by labeling each w_i ∈ W by a randomly and independently chosen c_i ∈_R F.

Step 3: return(Φ).

CONSTRUCT(X, op):

Case 1: |X| ≤ β_n. Let Ψ be the formula with a + gate at the root that has wires incident to it from each x_i ∈ X.

Case 2: |X| > β_n and op = ×. Partition X randomly into two equal-sized sets X_1, X_2 and let Ψ_1 ← CONSTRUCT(X_1, +), Ψ_2 ← CONSTRUCT(X_2, +). Let Ψ be the formula with a × gate at the root and Ψ_1, Ψ_2 as its two children.

Case 3: |X| > β_n and op = +. Let Ψ_1 ← CONSTRUCT(X, ×), Ψ_2 ← CONSTRUCT(X, ×). Let Ψ be the formula with a + gate at the root and Ψ_1, Ψ_2 as its two children.

Step: return(Ψ).
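The sampling procedure above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the field is taken to be a prime field F_p, formulas are encoded as nested tuples, and the names `construct`, `sample`, `children`, `variables`, `is_syntactic_multilinear` and `evaluate` are hypothetical helpers, not notation from the paper. When |X| is odd the two halves differ in size by one.

```python
import random

def construct(X, op, beta):
    """Build an unlabeled formula skeleton following CONSTRUCT(X, op)."""
    if len(X) <= beta:                       # Case 1: base + gate over the variables
        return ("+", [("var", x) for x in X])
    if op == "*":                            # Case 2: split X randomly into two halves
        shuffled = random.sample(X, len(X))
        half = len(X) // 2
        return ("*", [construct(shuffled[:half], "+", beta),
                      construct(shuffled[half:], "+", beta)])
    # Case 3: a + gate whose two children are independent product subformulas on X
    return ("+", [construct(X, "*", beta), construct(X, "*", beta)])

def sample(X, p, beta):
    """SAMPLE: label every wire entering a + gate with a uniform element of F_p."""
    def label(node):
        kind, payload = node
        if kind == "var":
            return node
        kids = [label(child) for child in payload]
        if kind == "+":
            return ("+", [(random.randrange(p), kid) for kid in kids])
        return ("*", kids)
    return label(construct(list(X), "+", beta))

def children(node):
    """Child nodes of a labeled formula node (wire constants are dropped)."""
    kind, payload = node
    if kind == "var":
        return []
    return [kid for _, kid in payload] if kind == "+" else payload

def variables(node):
    """Set of variables appearing in the subformula rooted at node."""
    if node[0] == "var":
        return {node[1]}
    return set().union(*(variables(kid) for kid in children(node)))

def is_syntactic_multilinear(node):
    """Check Definition 2.1: children of every product gate use disjoint variables."""
    kids = children(node)
    if node[0] == "*" and variables(kids[0]) & variables(kids[1]):
        return False
    return all(is_syntactic_multilinear(kid) for kid in kids)

def evaluate(node, point, p):
    """Evaluate the formula at a point of F_p^n (variables are indices into point)."""
    kind, payload = node
    if kind == "var":
        return point[payload] % p
    if kind == "+":
        return sum(c * evaluate(kid, point, p) for c, kid in payload) % p
    return evaluate(payload[0], point, p) * evaluate(payload[1], point, p) % p
```

For instance, `sample(range(16), 101, beta=2)` returns a labeled formula over 16 variables, and `evaluate` can then serve as the blackbox that a reconstruction algorithm queries.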

We now state our main reconstruction result for multilinear formulas.

Theorem 2.2. Let Φ ∼ D(X, F) be a random multilinear formula sampled as above and let Φ̂ ∈ F[X] be the polynomial computed by Φ. Then, there is an n^{O(1)}-time randomized algorithm A which, given blackbox access to Φ̂, constructs a syntactic multilinear formula Φ_A of size at most size(Φ) and such that

$$\Pr[\hat{\Phi}_A \neq \hat{\Phi}] \le \frac{n^{O(1)}}{|F|} + \frac{1}{n^{\Omega(1)}},$$

where the probability is taken over the randomness in the choice of Φ and the internal randomness of A.

3. BASIC IDEA AND APPROACH

Suppose we have blackbox access to the output polynomial f of a random multilinear formula Φ. By querying f at points of our choice, we want to recover Φ. How do we do so? We give an overview of our approach.

Determining the nature of the output gate: Observe that if the output node was a × gate then the output would be a reducible polynomial³. The converse is not true in general. That is, it can happen that the output gate is a + gate and f is reducible as well. At this point we invoke the assumption that the formula Φ is chosen randomly and deduce that with high probability over the random choice of Φ the output node is a × node if and only if f is reducible (Lemma 5.9). Thus, we can use the blackbox factoring algorithm of Kaltofen [2] (or alternatively the multilinear factoring algorithm of Shpilka and Volkovich [10]) to determine whether f is reducible and this helps us answer our first question. The next thing that we would like to do is get blackbox access to the two children. Once we have that we can recursively do the reconstruction of the two subformulas. There are two cases depending on the nature of the output gate.

Case I: Output node is a × gate. In this case we factor f using Kaltofen’s algorithm. Now it can happen (in rare circumstances) that the number of factors of f is larger than the number of children of the output node. For a generic (i.e. randomly chosen) formula Φ these two quantities will however be equal (Lemma 5.9) so that Kaltofen’s algorithm provides blackbox access to the two children of the output node. We then recursively compute the formulas for the two children.

Case II: Output node is a + gate. In this case we need to go one level deeper. The two children of the output node are × gates (except when we are in the base case) so that the output polynomial f is of the form

f = A · B + C · D.

Our aim will be to obtain blackbox access to the four ‘grandchildren’ A, B, C and D. If we can do that then we can recursively compute formulas for these polynomials and we would be done. At this point we use the fact that we are dealing with (syntactic) multilinear formulas. It means that there exists a partition of the set of variables into four (disjoint) subsets u, v, x and y such that

f(u, v, x, y) = A(u, v) · B(x, y) + C(v, x) · D(u, y). (1)

In general this partition of the set of variables can be arbitrary, in which case it becomes much more difficult to find Φ. However, when Φ is random then with high probability all these sets are roughly of the same size (Lemma 5.1). Now it turns out that we can exploit the ideas in the lower bound proof of Raz [7] to find this partition of the set of variables. Very roughly, the idea is that for the right partition the rank of a certain related matrix will be very small whereas for every other partition the rank of this matrix will be much larger. This is one of the key technical arguments (Theorem 5.6) in our work and is described in its proof sketch.

For now assume that we know the subsets u, v, x and y. Knowing these subsets, how do we obtain blackbox access to A, B, C and D? The idea is that if in equation (1) we substitute each u-variable and each v-variable by some random values, say u = a and v = b, then A(a, b) becomes a constant so that the degree of (A · B) drops after this substitution (with high probability, this substitution does not change the degree of C · D). This means that the homogeneous part of largest degree of f(a, b, x, y) is a product of the homogeneous parts of largest degrees of C(b, x) and D(a, y). Thus factoring the homogeneous part of largest degree of f gives us blackbox access to the largest degree homogeneous parts of C(b, x) and D(a, y). This idea can be extended suitably (see Theorem 5.5) to obtain blackbox access to the whole of each polynomial A, B, C and D. This completes our brief overview of the reconstruction algorithm for multilinear formulas.

³ If one of the children was a constant then the subtree rooted at that node can be discarded and we would have a smaller formula computing the same polynomial.

4. PRELIMINARIES AND NOTATIONS

Lemma 4.1 (Chernoff’s bound). Let ζ_1, ..., ζ_n be independent uniform 0-1 random variables. Then,

$$\Pr\Big[(1-\delta)\frac{n}{2} \le \sum_i \zeta_i \le (1+\delta)\frac{n}{2}\Big] \ge 1 - 2\exp(-\delta^2 n/8).$$

Lemma 4.2 (DeMillo-Lipton-Schwartz-Zippel). Let f ∈ F[x_1, ..., x_n] be a non-zero polynomial of degree d ≥ 0. Let S be a finite subset of F and let r_1, ..., r_n be selected randomly from S. Then

$$\Pr[f(r_1, r_2, \ldots, r_n) = 0] \le \frac{d}{|S|}.$$

The above lemma automatically results in the following PIT algorithm which succeeds with probability ≥ 1 − d/|S|.

Algorithm 1 (Blackbox PIT). Given blackbox access to a polynomial f ∈ F[x_1, ..., x_n] of degree d, query f(r_1, r_2, ..., r_n) to the blackbox for r_1, ..., r_n ∈_R S, where S is any finite subset of F. Conclude f = 0 iff f(r_1, r_2, ..., r_n) = 0.
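Algorithm 1 is short enough to sketch directly. The following Python sketch assumes a prime field F_p with S = F_p and models the blackbox as a function on a list of field elements; the name `blackbox_pit` and the optional `trials` parameter (repeating the single query of Algorithm 1 to reduce the error) are illustrative choices, not part of the paper.

```python
import random

def blackbox_pit(f, n, sample_set, trials=1):
    """Return True if the blackbox f appears to be the zero polynomial.

    Each trial queries f at a point drawn uniformly from sample_set^n.
    A non-zero answer proves f != 0; by Lemma 4.2, a non-zero polynomial of
    degree d passes one trial as "zero" with probability at most d/|S|.
    """
    for _ in range(trials):
        point = [random.choice(sample_set) for _ in range(n)]
        if f(point) != 0:
            return False
    return True

# Toy usage over F_p with p = 10007 (an arbitrary prime chosen for illustration).
P = 10007
S = list(range(P))
# (x0 + x1)^2 - x0^2 - 2*x0*x1 - x1^2 is identically zero.
zero_poly = lambda r: ((r[0] + r[1]) ** 2 - r[0] ** 2 - 2 * r[0] * r[1] - r[1] ** 2) % P
# x0*x1 - x2 is a non-zero polynomial of degree 2.
nonzero_poly = lambda r: (r[0] * r[1] - r[2]) % P
```

With these definitions, `blackbox_pit(zero_poly, 2, S)` always reports zero, while `blackbox_pit(nonzero_poly, 3, S, trials=20)` reports non-zero except with probability at most (2/10007)^20.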

Kaltofen’s Blackbox Factoring: We state the multivariate blackbox factoring algorithm by Kaltofen [2] (also see [10]) in the context of multilinear polynomials.

Lemma 4.3 (Kaltofen’s Blackbox Factoring). There is a randomized polynomial-time algorithm that, given blackbox access to a multilinear polynomial f ∈ F[x_1, ..., x_n], with probability 1 − 2^{−Ω(n)}, outputs blackboxes to all the irreducible factors of f.

Notation: [n] denotes the set {1, 2, ..., n}. For a polynomial f, f^{[d]} denotes the homogeneous degree-d part of f. Tuples are denoted by placing a bar over a letter, e.g. x̄. For a tuple β = (β_1, ..., β_n), iβ denotes the tuple (iβ_1, ..., iβ_n). For an arithmetic formula Φ, the polynomial computed at the root is denoted by Φ̂.

5. RECONSTRUCTING MULTILINEAR FORMULAS

5.1. Structural Properties of Multilinear Formulas from D(X, F)

Before we sketch the proof of Theorem 2.2, we state and examine some structural properties of random multilinear formulas which are essential for our algorithm to work. Due to space constraints, the proofs of the lemmas here have been omitted.

Our first lemma says that for the variables in the subformula rooted at a + gate, the two partitions induced by the children (× gates) of that gate intersect more or less “transversally,” i.e., each block of either partition is split nontrivially (in fact in a rather balanced way) by the other partition. Moreover, a child polynomial of a × gate (a grandchild of the + gate) here is not annihilated by zeroing out either subset of its variables induced by the partition at the sibling product gate.

Lemma 5.1. Let Φ ∼ D(X, F). Then, for all nodes of Φ, the following hold with probability at least 1 − n^{O(1)}/|F| − 1/n^{Ω(1)}:

1) The polynomial computed by a node at (multiplicative) depth h is a homogeneous polynomial of degree n/(β_n 2^h).

2) The polynomial computed at a + gate is of the form α·A(v, u)B(x, y) + β·C(v, x)D(u, y) where for all p ∈ {v, u, x, y}, |p| ≥ (1/8)|v ∪ u ∪ x ∪ y|.

3) In the above polynomial computed at a + gate, for all R ∈ {A, B, C, D}, say R(p, q), R(0, q) ≠ 0 and R(p, 0) ≠ 0.

Given a multilinear polynomial f over two variable sets Y = {y_1, ..., y_m} and Z = {z_1, ..., z_n}, define M_f as a 2^m × 2^n matrix whose (p, q) entry, p ⊆ Y and q ⊆ Z, is the coefficient of the monomial pq in f. The rank of M_f in this case is denoted by Rank_{Y,Z}(f). We will use the following properties of the partial derivatives matrix.

Lemma 5.2 ([7]). Given two multilinear polynomials f and g over the variable set Y ∪ Z,

1) Rank_{Y,Z}(f + g) ≤ Rank_{Y,Z}(f) + Rank_{Y,Z}(g),

2) Rank_{Y,Z}(f·g) = Rank_{Y,Z}(f)·Rank_{Y,Z}(g) if f and g are polynomials on disjoint sets of variables, and

3) Rank_{Y,Z}(f) ≤ 2^{min(Y(f), Z(f))} where Y(f) and Z(f) are the number of Y and Z variables that occur in f.
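To make the partial derivative matrix concrete, here is a small Python sketch that builds M_f from an explicit coefficient list and computes its rank by Gaussian elimination. It assumes a small prime field F_p (p = 101 is an arbitrary choice) and represents a multilinear polynomial as a dictionary mapping a frozenset of variable names to a coefficient; all function names are hypothetical helpers introduced for illustration.

```python
from itertools import combinations

P = 101  # assumed prime modulus, chosen only for illustration

def subsets(variables):
    """All subsets of a variable list, as frozensets, in a fixed order."""
    variables = list(variables)
    return [frozenset(c) for k in range(len(variables) + 1)
            for c in combinations(variables, k)]

def pd_matrix(f, Y, Z):
    """M_f: the (p, q) entry is the coefficient of the monomial p*q in f."""
    rows, cols = subsets(Y), subsets(Z)
    return [[f.get(p | q, 0) % P for q in cols] for p in rows]

def rank_mod_p(M):
    """Rank of an integer matrix over F_P via Gauss-Jordan elimination."""
    M = [row[:] for row in M]
    rank = 0
    ncols = len(M[0]) if M else 0
    for c in range(ncols):
        pivot = next((r for r in range(rank, len(M)) if M[r][c] % P), None)
        if pivot is None:
            continue
        M[rank], M[pivot] = M[pivot], M[rank]
        inv = pow(M[rank][c], P - 2, P)          # inverse by Fermat's little theorem
        M[rank] = [(x * inv) % P for x in M[rank]]
        for r in range(len(M)):
            if r != rank and M[r][c] % P:
                factor = M[r][c]
                M[r] = [(a - factor * b) % P for a, b in zip(M[r], M[rank])]
        rank += 1
    return rank

def add(f, g):
    """Sum of two polynomials in the dictionary representation."""
    h = dict(f)
    for m, c in g.items():
        h[m] = (h.get(m, 0) + c) % P
    return h

def mul(f, g):
    """Product of two multilinear polynomials on disjoint variable sets."""
    h = {}
    for m1, c1 in f.items():
        for m2, c2 in g.items():
            assert not (m1 & m2), "variable sets must be disjoint"
            h[m1 | m2] = (h.get(m1 | m2, 0) + c1 * c2) % P
    return h

Y, Z = ["y1", "y2"], ["z1", "z2"]
f = {frozenset(): 1, frozenset({"y1", "z1"}): 1}   # f = 1 + y1*z1
g = {frozenset(): 1, frozenset({"y2", "z2"}): 1}   # g = 1 + y2*z2
```

On this toy instance, f and g each have rank 2, their product has rank 4 (matching property 2 of Lemma 5.2), and their sum has rank 3, within the subadditive bound of property 1.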

We next show that, w.h.p., a random linear combination of two multilinear polynomials has rank at least that of each of them.

Lemma 5.3. Let f and g be two multilinear polynomials over the variable set Y ∪ Z and field F. Let S ⊆ F. For α, β chosen uniformly at random from S, let p be the probability that Rank_{Y,Z}(α·f + β·g) ≥ max{Rank_{Y,Z}(f), Rank_{Y,Z}(g)}. Then we have

$$p \ge 1 - \frac{2^{\min\{|Y|,|Z|\}}}{|S|}.$$

5.2. Simulating Blackbox Access to Subformulas

Our reconstruction algorithm will be recursive on the structure of the (unknown, random) multilinear formula. Hence, we will need to simulate blackbox access to its components using blackbox access to the polynomial/formula itself. The next lemma shows this for the homogeneous component of a given degree and the theorem below for the grandchildren of a + node.

Lemma 5.4. Let F be a field with at least d + 1 elements and let f ∈ F[x_1, ..., x_n] be a degree d polynomial. Given blackbox access to f we can simulate blackbox access to the f^{[r]}’s, where f^{[r]} denotes the homogeneous degree-r part of f.
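The proof of Lemma 5.4 is omitted in the text; one standard way to obtain such a simulation, sketched below as an assumption rather than as the authors' argument, uses the identity f(t·x) = Σ_r t^r f^{[r]}(x): querying f at t·x for d + 1 distinct values of t and interpolating in t recovers every f^{[r]}(x). The Python sketch works over an assumed prime field F_p, and the function names are illustrative.

```python
P = 10007  # assumed prime modulus with P >= d + 1

def poly_mul_linear(coeffs, a):
    """Multiply a polynomial in t (coefficients low degree first) by (t - a) mod P."""
    out = [0] * (len(coeffs) + 1)
    for i, c in enumerate(coeffs):
        out[i] = (out[i] - a * c) % P
        out[i + 1] = (out[i + 1] + c) % P
    return out

def interpolate_coeffs(ts, ys):
    """Coefficients of the unique polynomial of degree < len(ts) through (ts, ys)."""
    n = len(ts)
    result = [0] * n
    for j in range(n):
        basis, denom = [1], 1
        for k in range(n):
            if k != j:
                basis = poly_mul_linear(basis, ts[k])
                denom = denom * (ts[j] - ts[k]) % P
        scale = ys[j] * pow(denom, P - 2, P) % P   # divide by the Lagrange denominator
        for i, c in enumerate(basis):
            result[i] = (result[i] + scale * c) % P
    return result

def homogeneous_part(f, d, r):
    """Return a blackbox for f^[r], given a blackbox f of degree at most d."""
    ts = list(range(d + 1))                        # d + 1 distinct field elements
    def part(x):
        ys = [f([t * xi % P for xi in x]) for t in ts]
        return interpolate_coeffs(ts, ys)[r]       # coefficient of t^r is f^[r](x)
    return part
```

Each query to the simulated blackbox costs d + 1 queries to f, so the simulation stays within polynomial time.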

Theorem 5.5. Let {v, u, x, y} be a partition of {x_1, ..., x_n} and let f(v, u, x, y) = A(v, u)B(x, y) + C(v, x)D(u, y) be a non-zero polynomial such that

1) A, B, C, D are homogeneous multilinear polynomials over the indicated variable sets,

2) either deg(AB) ≠ deg(CD) or deg(A) = deg(B) = deg(C) = deg(D),

3) for all R ∈ {A, B, C, D}, say R(p, q), R(0, q) ≠ 0 and R(p, 0) ≠ 0,

4) for all p ∈ {v, u, x, y}, |p| ≥ δn, for some δ > 0.

Then there is an n^{O(1)}-time randomized algorithm that, given blackbox access to f and the partition {v, u, x, y}, constructs blackboxes for A, B, C, D with probability at least 1 − n^{O(1)}/|F| − 1/2^{Ω(n)}.

Proof: The proof follows from the algorithm TRICKLEDOWN below.

Input: The partition {v, u, x, y} and an oracle for A(v, u)B(x, y) + C(v, x)D(u, y) where A, B, C, D are polynomials satisfying the above stated properties.

Output: Blackboxes for A, B, C, D.

Algorithm: TRICKLEDOWN

Step 1: Using the blackbox for f = AB + CD, construct blackboxes for the f^{[i]}’s for all i ∈ [n].

Step 2: For i ∈ [n], using blackbox PIT, determine if f^{[i]} ≠ 0. If there is only one such i then proceed to the next step. Otherwise let f^{[i]}, f^{[j]} be the two non-zero components. For f^{[i]}(v, u, x, y), determine using blackbox PIT, if f^{[i]}(v, u, 0, 0) is 0. If yes, conclude f^{[i]} = AB and f^{[j]} = CD, else the other way. Using Kaltofen’s factoring algorithm, construct blackboxes for irreducible factors of A(v, u)B(x, y). For each factor h(v, u, x, y), determine, using blackbox PIT, if h(0, 0, x, y) is 0. If yes, conclude it is a factor of A, else of B. Similarly, construct blackboxes for C and D.

Step 3: Determining degrees of A, B, C, D. Using Kaltofen’s factoring algorithm, gain blackbox access to irreducible factors of f(v, u, 0, 0) = C(v, 0)D(u, 0). For each factor h, determine, using blackbox PIT, if h becomes the zero polynomial after instantiating v to 0. If yes it is a factor of C(v, 0), else of D(u, 0). Similarly, construct blackboxes for C(0, x) and D(0, y). Having constructed blackboxes for C(v, 0) and D(u, 0), conclude

$$d = \deg(C) = \log\left(\frac{C(2\alpha, 0)}{C(\alpha, 0)}\right)$$

for a randomly chosen α ∈ F^{|v|} (since C(v, 0) is homogeneous of degree d, C(2α, 0) = 2^d · C(α, 0)), and similarly for D, A, B.

Step 4: Constructing a blackbox for C. To determine C(α, β), for any α ∈ F^{|v|}, β ∈ F^{|x|}, we will make the substitution x := β and y := γ for γ ∈_R F^{|y|}. In the ensuing discussion we shall denote by g̃(v, u) the polynomial obtained by making the above substitution in a polynomial g(v, u, x, y), i.e., g̃ := g(v, u, β, γ). Then f̃ := f(v, u, β, γ) is of the form

$$\tilde f = \underbrace{A(v,u)\,B(\beta,\gamma)}_{\text{only degree } \deg(A) \text{ terms}} + \underbrace{C(v,\beta)\,D(u,\gamma)}_{\text{terms can have degree } > \deg(A)} = A(v,u)\,B(\beta,\gamma) + \tilde C(v)\,\tilde D(u).$$

Then,

$$\tilde C(v) = \tilde C^{[d]}(v) + \cdots + \tilde C^{[1]}(v) + C(0, \beta)$$

while

$$\tilde D(u) = \tilde D^{[d]}(u) + \cdots + \tilde D^{[1]}(u) + D(0, \gamma).$$

Note that f̃^{[2d]} = C̃^{[d]}(v) · D̃^{[d]}(u). Using Kaltofen’s algorithm, obtain blackboxes for C̃^{[d]}(v) and D̃^{[d]}(u) using the blackbox for f̃^{[2d]}. As Kaltofen’s algorithm gives blackboxes for irreducible factors of C̃^{[d]}(v)D̃^{[d]}(u) and any such factor depends on either v or u, to find out if a factor h(v, u) depends on u use blackbox PIT on h(v, 0).

Step 5: Constructing blackboxes for C̃^{[i]}(v) and D̃^{[i]}(u) for i ∈ [d − 1]. Having gained blackboxes for C̃^{[d]}(v) and D̃^{[d]}(u) we note that

$$\tilde f(v,u)^{[2d-1]} = \tilde C^{[d]} \cdot \tilde D^{[d-1]} + \tilde C^{[d-1]} \cdot \tilde D^{[d]}. \qquad (2)$$

Also

$$\tilde f(v,2u)^{[2d-1]} = 2^{d-1}\,\tilde C^{[d]} \cdot \tilde D^{[d-1]} + 2^{d}\,\tilde C^{[d-1]} \cdot \tilde D^{[d]}. \qquad (3)$$

Solving (2) and (3) for C̃^{[d−1]}, we get that C̃^{[d−1]}(v) equals

$$\frac{1}{\tilde D^{[d]}(u)} \left[ \frac{1}{2^{d-1}}\,\tilde f(v,2u)^{[2d-1]} - \tilde f(v,u)^{[2d-1]} \right].$$

As we have blackbox access to f̃^{[2d−1]} and D̃^{[d]}(u), we have blackbox access to C̃^{[d−1]}(v) (after instantiating u randomly to avoid making the denominator vanish in the above equation). Similarly we have blackbox access to D̃^{[d−1]}(u). In general, after constructing blackboxes for C̃^{[r]}(v) and D̃^{[r]}(u) for all r ∈ [d′ + 1 : d] and proceeding as above, one obtains the following expression for C̃^{[d′]}(v):

$$\tilde C^{[d']}(v) = E_1 \cdot (E_2 - E_3)$$

where

$$E_1 = \frac{2^{d'}}{(2^{d} - 2^{d'})\,\tilde D^{[d]}(u)},$$

$$E_2 = \frac{\tilde f(v,2u)^{[d+d']}}{2^{d'}} - \tilde f(v,u)^{[d+d']},$$

$$E_3 = \sum_{i=d'+1}^{d-1} (2^{d-i} - 1)\,\tilde C^{[i]}(v)\,\tilde D^{[d+d'-i]}(u).$$

Hence using the above procedure, blackboxes for C̃^{[d′]}(v), for all d′ ∈ [d], can be constructed. Also, using the blackbox for C(0, x) constructed in Step 3, determine C(0, β). This completes our blackbox for C(v, β).

Step 6: Repeat the above 3 steps similarly with the correct parameters to construct blackboxes for A, B and D.
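The elimination behind the expression for the degree-d′ part can be written out as a short worked derivation. As a sketch, write g(v, u) for the substituted polynomial f(v, u, β, γ), and write C_i and D_j (shorthand introduced here, not notation from the paper) for the homogeneous degree-i part in v of C(v, β) and the degree-j part in u of D(u, γ). The derivation assumes d′ ≥ 1, that the term A(v, u)B(β, γ) contributes nothing in degree d + d′ (e.g. when deg(A) = d), and that 2^{d−d′} − 1 is invertible in F.

```latex
\begin{align*}
g(v,u)^{[d+d']}  &= \sum_{i=d'}^{d} C_i\, D_{d+d'-i}, \\
g(v,2u)^{[d+d']} &= \sum_{i=d'}^{d} 2^{\,d+d'-i}\, C_i\, D_{d+d'-i}, \\
\frac{g(v,2u)^{[d+d']}}{2^{d'}} - g(v,u)^{[d+d']}
                 &= \sum_{i=d'}^{d} \bigl(2^{\,d-i}-1\bigr)\, C_i\, D_{d+d'-i} \\
                 &= \bigl(2^{\,d-d'}-1\bigr)\, C_{d'}\, D_{d}
                    + \sum_{i=d'+1}^{d-1} \bigl(2^{\,d-i}-1\bigr)\, C_i\, D_{d+d'-i}, \\
C_{d'}           &= \frac{2^{d'}}{\bigl(2^{d}-2^{d'}\bigr)\, D_{d}}
                    \left[ \frac{g(v,2u)^{[d+d']}}{2^{d'}} - g(v,u)^{[d+d']}
                    - \sum_{i=d'+1}^{d-1} \bigl(2^{\,d-i}-1\bigr)\, C_i\, D_{d+d'-i} \right].
\end{align*}
```

The i = d term drops out because 2^0 − 1 = 0, which is why only already-computed parts C_i with i > d′ appear on the right. Setting d′ = d − 1 makes the remaining sum empty and gives back the formula obtained from equations (2) and (3).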

5.3. The Reconstruction Algorithm RECONSTRUCT(O_Φ, X, F, m)

We are now ready to present the reconstruction algorithm for random multilinear formulas.

Input: oracle O_Φ for the polynomial Φ̂ computable by a multilinear formula Φ sampled using SAMPLE(X, F), where X = {x_1, ..., x_n}, and size m of the seed partition⁴ (m = Θ(log n)).

Output: multilinear formula Ψ such that |Ψ| ≤ |Φ| and Ψ̂ = Φ̂, or else FAIL.

Step 1: Determining linearity: For any x_i ∈ X, f_i = Φ̂|_{x_i=1} − Φ̂|_{x_i=0} is the coefficient polynomial of x_i in Φ̂. For all f_i’s, using blackbox PIT on f_i|_{x_j=1} − f_i|_{x_j=0}, determine if f_i depends on x_j. If for all x_i with a non-zero f_i, f_i does not depend on any variable of X, then Φ̂ is linear and in this case simply interpolate Φ̂ exactly and output a Σ-circuit for it.

Step 2: Reducible Φ̂: Using Kaltofen’s factoring algorithm construct oracles for the irreducible factors h_i of Φ̂. If Φ̂ is irreducible proceed to the next step. Else using blackbox PIT, as described in the previous step, determine the variable sets of these factors. Recursively using RECONSTRUCT, construct formulas Ψ_i for the h_i’s. If RECONSTRUCT fails on any h_i output FAIL. Else, output a formula with a × gate at the root and the Ψ_i’s as its children.

⁴ The size of the seed partition is kept unchanged while recursing.

Step 3: Determining a seed partition: Let Φ̂ = A(v, u)B(x, y) + C(v, x)D(u, y). Randomly choose an m-sized subset S of X. In Φ̂, instantiate the variables in X \ S to random values over F to get Φ̂_S = A_S(v_S, u_S)B_S(x_S, y_S) + C_S(v_S, x_S)D_S(u_S, y_S) and interpolate it in n^{O(1)} time. Iterate over all possible partitions {v″, u″, x″, y″} of S such that the size of each set in them is at least γm (for a small enough γ) and let {v′, u′, x′, y′} be a partition such that Rank_{v′,y′}(Φ̂_S|_{v′,y′}) ≤ 2 and Rank_{u′,x′}(Φ̂_S|_{u′,x′}) ≤ 2, where Φ̂_S|_{v′,y′} is Φ̂_S with the variables in S \ (v′ ∪ y′) instantiated to random values in F, and similarly for Φ̂_S|_{u′,x′}. Having interpolated Φ̂_S, this can be done in n^{O(1)} time using Gaussian elimination as there are 2^{O(log n)} such possible partitions and the partial derivative matrix on O(log n) variables is of size at most 2^{O(log n)}.

Step 4: Extending the seed partition {v′, u′, x′, y′}: For all x_i ∈ X \ S do the following. Let S_i = S ∪ {x_i}. In Φ̂, instantiate the variables in X \ S_i to random values over F to get Φ̂_{S_i} and interpolate it in 2^{O(log n)} time. Iterate over the following 4 partitions of S_i: {v′ ∪ {x_i}, u′, x′, y′}, {v′, u′ ∪ {x_i}, x′, y′}, {v′, u′, x′ ∪ {x_i}, y′}, {v′, u′, x′, y′ ∪ {x_i}}, and determine the partition {v″, u″, x″, y″} such that Rank_{v″,y″}(Φ̂_{S_i}|_{v″,y″}) ≤ 2 and Rank_{u″,x″}(Φ̂_{S_i}|_{u″,x″}) ≤ 2, where Φ̂_{S_i}|_{v″,y″} is Φ̂_{S_i} with the variables in S_i \ (v″ ∪ y″) instantiated to random values in F. Attach x_i to the appropriate block of the seed partition. This can be done in 2^{O(log n)} time.

Step 5: Using the TRICKLEDOWN algorithm and the above determined partition {v, u, x, y} of X, construct oracles for A, B, C, D. Then, recursively using RECONSTRUCT, construct formulas Ψ_R for R ∈ {A, B, C, D}. If RECONSTRUCT fails on any of them output FAIL. Else, let Ψ_AB be the formula with a × gate at the root and Ψ_A, Ψ_B as its children, and define Ψ_CD similarly from Ψ_C, Ψ_D. Output a formula Ψ with a + gate at the root and Ψ_AB, Ψ_CD as its children.

This completes the description of the algorithm RECONSTRUCT. Algorithm A of Theorem 2.2 is now essentially RECONSTRUCT, returning Ψ using blackbox calls to Φ̂. (If RECONSTRUCT outputs FAIL, A outputs a random multilinear formula.) The bound on the running time of A is obvious. For correctness, it’s crucial to show that the partition determined by Steps 3 and 4 is, w.h.p., the original partition of Φ. We do this in the next section. This will complete the proof of Theorem 2.2.
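Step 1 of RECONSTRUCT relies only on restrictions of the blackbox, which the following Python sketch illustrates. It assumes a prime field F_p and a blackbox given as a function on a list of field elements; the helper names `restrict`, `coefficient_poly`, `depends_on` and `is_linear`, and the number of PIT trials, are illustrative assumptions rather than part of the paper.

```python
import random

P = 10007  # assumed prime modulus

def restrict(f, i, value):
    """Blackbox for f with the variable x_i fixed to value."""
    def g(point):
        q = list(point)
        q[i] = value
        return f(q)
    return g

def coefficient_poly(f, i):
    """Blackbox for f|x_i=1 - f|x_i=0, the coefficient polynomial of x_i
    when f is multilinear."""
    f1, f0 = restrict(f, i, 1), restrict(f, i, 0)
    return lambda point: (f1(point) - f0(point)) % P

def depends_on(f, j, n, trials=10):
    """Randomized test of whether multilinear f depends on x_j:
    blackbox PIT on f|x_j=1 - f|x_j=0."""
    diff = coefficient_poly(f, j)
    return any(diff([random.randrange(P) for _ in range(n)]) != 0
               for _ in range(trials))

def is_linear(f, n):
    """f has degree at most 1 iff no coefficient polynomial f_i depends on
    any variable (a zero f_i trivially depends on nothing)."""
    for i in range(n):
        f_i = coefficient_poly(f, i)
        if any(depends_on(f_i, j, n) for j in range(n)):
            return False
    return True

# Toy blackboxes on 3 variables.
linear_f = lambda x: (3 + 2 * x[0] + 5 * x[2]) % P
product_f = lambda x: (x[0] * x[1] + x[2]) % P
```

Here `is_linear(linear_f, 3)` holds while `is_linear(product_f, 3)` does not, since the coefficient polynomial of x0 in `product_f` is x1. The same `depends_on` test is what Step 2 uses to determine the variable set of each factor.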

5.4. Uniqueness of the Seed Partition

Our main result for this section is Theorem 5.6, which shows that for a large F, Steps 3 and 4 of the RECONSTRUCT method determine the correct partition w.h.p. We need some preliminary discussion leading up to the theorem.

Placement of random field elements on the wires of a randommultilinear formula drawn by SAMPLE(X,F):

While sampling a multilinear formula from the set M(X, F) we first sampled a formula without any field elements using the method CONSTRUCT and later placed field elements, chosen independently and uniformly from F, on its wires. Also note that distinct wires originating from any of the x_i’s have distinct independent uniform r.v.’s on them. For instance, consider a multilinear formula on X in which every x_i has at most one wire originating from it. Let the polynomial computed by the formula be

$$\sum_{k=1}^{N} \alpha_k \cdot M_k$$

where the M_k’s are multilinear monomials. Now, if we place an r.v. r_i on the wire from x_i then a term like α·x_1x_3x_n becomes α·r_1r_3r_n·x_1x_3x_n. Hence essentially, the coefficient of a multilinear monomial M on X takes the form α_M·M_r where M_r is the multilinear monomial Π_{x_i ∈ M} r_i and each α_M is independent of the r_i’s. These observations help us prove Lemma 5.8, showing that for every monomial M there is a set of log N monomials containing M such that the set of coefficients of these monomials is mutually independent. Also, it is easy to note that this remains true even after instantiating variables to random values over F.

Instantiating n − m variables to random field elements inStep 3 of the RECONSTRUCT method:

Let $\Phi$ be a random multilinear formula sampled using SAMPLE$(X, \mathbb{F})$ and let $\Phi = A(v, u) B(x, y) + C(v, x) D(u, y)$. In Step 3 of the RECONSTRUCT method, we choose an $m$-sized subset $S$ of $X$ randomly and instantiate the variables in $X \setminus S$ to random values over $\mathbb{F}$ to get $\Phi_S = A_S(v_S, u_S) B_S(x_S, y_S) + C_S(v_S, x_S) D_S(u_S, y_S)$. Using Chernoff bounds, it easily follows that, w.h.p., the sizes of the sets $v_S$, etc., are $\Omega(m)$. Let $Y = S$ and $Z = X \setminus S$. In the SAMPLE method, partitioning the set $Y \cup Z$ at a $\times$ gate (where $|Y| \le |Z|$) into two equal-sized sets $a, b$ can be viewed as follows: label the $y_i$'s in $Y$ with independent uniform 0-1 values, include the $y_i$'s with label 0 in $a$ and those with label 1 in $b$, and finally place the $Z$ variables randomly to make $|a| = |b|$. It is now easy to see that, in the above expression for $\Phi_S$, the polynomials $A_S, B_S, C_S, D_S$ are close in distribution to a multilinear formula sampled by the following method on their respective variable sets.

Sampling Method SAMPLE2$(X, \mathbb{F})$:

Step 1: $\Psi \leftarrow$ CONSTRUCT2$(X, +)$.

Step 2: Let $W$ be the set of wires in $\Psi$ incident to a $+$ gate. Let $\Phi$ be the syntactic multilinear arithmetic formula obtained by labeling each $w_i \in W$ by a randomly and independently chosen $c_i \in_R \mathbb{F}$.

Step 3: return $\Phi$.

where CONSTRUCT2(X, op):

Case 1: $X = \{x_i\}$. Let $\Psi$ be a single $+$ gate with $x_i$ as one input and the field element 1 as the other input.

Case 2: op $= \times$. Label each $x_i \in X$ with an independent, uniformly chosen 0-1 value. Include the $x_i$'s labeled 0 in a set $X_1$ and the rest in $X_2$. If either of $X_1$, $X_2$ is empty, then repeat. Let $\Psi_1 \leftarrow$ CONSTRUCT2$(X_1, +)$, $\Psi_2 \leftarrow$ CONSTRUCT2$(X_2, +)$. Let $\Psi$ be the formula with a $\times$ gate at the root and $\Psi_1$ and $\Psi_2$ as its two children.

Case 3: op $= +$. Let $\Psi_1 \leftarrow$ CONSTRUCT2$(X, \times)$, $\Psi_2 \leftarrow$ CONSTRUCT2$(X, \times)$. Let $\Psi$ be the formula with a $+$ gate at the root and $\Psi_1$ and $\Psi_2$ as its two children.

Step: return $\Psi$.
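The sampling method above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the prime `P` stands in for the field $\mathbb{F}$, formulas are encoded as nested tuples, and the helper names (`construct2`, `label_wires`, `sample2`, `evaluate`) are hypothetical and not taken from the paper.

```python
import random

P = 10007  # hypothetical prime; F_P stands in for the field F

def construct2(variables, op, rng):
    """Skeleton without field elements, following Cases 1-3 of CONSTRUCT2."""
    if len(variables) == 1:                      # Case 1: X = {x_i}
        return ("+", ("var", variables[0]), ("const", 1))
    if op == "*":                                # Case 2: random 0-1 labels
        while True:
            bits = [rng.randrange(2) for _ in variables]
            x1 = [v for v, b in zip(variables, bits) if b == 0]
            x2 = [v for v, b in zip(variables, bits) if b == 1]
            if x1 and x2:                        # repeat if a side is empty
                break
        return ("*", construct2(x1, "+", rng), construct2(x2, "+", rng))
    # Case 3: a + gate whose two children are x gates on the same set X
    return ("+", construct2(variables, "*", rng), construct2(variables, "*", rng))

def label_wires(node, rng):
    """Step 2 of SAMPLE2: a random field element on each wire into a + gate."""
    if node[0] in ("var", "const"):
        return node
    left, right = label_wires(node[1], rng), label_wires(node[2], rng)
    if node[0] == "+":
        return ("+", rng.randrange(P), left, rng.randrange(P), right)
    return ("*", left, right)

def sample2(variables, rng):
    return label_wires(construct2(list(variables), "+", rng), rng)

def evaluate(node, point):
    """Evaluate a labeled formula at `point` (dict: variable name -> value)."""
    kind = node[0]
    if kind == "var":
        return point[node[1]] % P
    if kind == "const":
        return node[1] % P
    if kind == "+":
        _, c1, left, c2, right = node
        return (c1 * evaluate(left, point) + c2 * evaluate(right, point)) % P
    return evaluate(node[1], point) * evaluate(node[2], point) % P
```

Because every $\times$ gate multiplies formulas on disjoint variable sets, the sampled formula is syntactically multilinear, so `evaluate` is affine in each variable when the others are fixed.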

Theorem 5.6 (Uniqueness of Partition). Let $\{a, b\}$ and $\{c, d\}$ be partitions of $y \cup z$, where $|a|, |b|, |c|, |d|, |y|, |z|$ are all $\Omega(m)$. Let $A(a), B(b), C(c), D(d)$ be polynomials independently computed by random multilinear formulas sampled using SAMPLE2 over the indicated variable sets. Then for independent $\alpha, \beta \in_R \mathbb{F}$,

$$\Pr\left[\mathrm{Rank}_{y,z}(\alpha \cdot AB + \beta \cdot CD) \le 2\right] \le \frac{2^{O(m)}}{|\mathbb{F}|} + \frac{1}{2^{\Omega(m)}},$$

unless

1) either $y = a$ & $z = b$ or $y = b$ & $z = a$, and

2) either $y = c$ & $z = d$ or $y = d$ & $z = c$.

Before we sketch a proof of Theorem 5.6, let us see how it is used in the proof of Theorem 2.2. In Step 3 of RECONSTRUCT, we consider the ranks of the partial derivative matrices for $\Phi_S|_{v',y'}$ and $\Phi_S|_{u',x'}$ w.r.t. the partitions $\{v', y'\}$ and $\{u', x'\}$, respectively. First, note that if $v'$, etc., form the correct partition of $S$, i.e., in $\Phi_S$, $v_S = v'$, etc., then both of the above matrices have rank at most 2. We use Theorem 5.6 to show that, w.h.p., the only partition of $S$ (into four parts) that satisfies these two rank conditions is the correct partition. Indeed, by the discussion preceding Theorem 5.6, we can see that $A_S|_{v',y'}$, $B_S|_{v',y'}$, $C_S|_{v',y'}$, and $D_S|_{v',y'}$ can be viewed as samples from SAMPLE2 on the variable set $v' \cup y'$ (assigning $S \setminus (v' \cup y')$ to random values). Similarly for $A_S|_{u',x'}$, etc., on $u' \cup x'$. Now, Theorem 5.6 says that if $\mathrm{Rank}_{v',y'}(\Phi_S|_{v',y'}) \le 2$, then, w.h.p., the set of variables that each of $A_S|_{v',y'}$, etc., depends on must be either $v'$ or $y'$. Thus, w.l.o.g., we must have $v_S = v'$ and $y_S = y'$. By a similar argument applied to $\Phi_S|_{u',x'}$, we can conclude that $u_S = u'$ and $x_S = x'$. Note that since $AB$ and $CD$ are defined on two independent partitions of $X$, it is unlikely that $A$ and $C$ depend on the same set of variables. Furthermore, we can also see that Step 4 associates each $x_i$ with the correct block of the seed partition by applying this argument repeatedly for the seed partition augmented with $x_i$. This concludes the proof that Steps 3 and 4 determine the correct partition for $\Phi$.
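The rank condition used in this argument can be checked by hand on a toy instance. The sketch below is an illustration only: it assumes a small prime `P` in place of $\mathbb{F}$, stores a multilinear polynomial as a dictionary from monomials (frozensets of variable names) to coefficients, and uses hypothetical helper names. The polynomials `A`, `B`, `C`, `D` are hand-picked, not sampled from SAMPLE2.

```python
from itertools import combinations

P = 101  # hypothetical small prime standing in for the field F

def monomials(variables):
    """All multilinear monomials over `variables`, each as a frozenset."""
    return [frozenset(c) for k in range(len(variables) + 1)
            for c in combinations(variables, k)]

def multiply(f, g):
    """Product of two polynomials on disjoint variable sets."""
    h = {}
    for mf, cf in f.items():
        for mg, cg in g.items():
            h[mf | mg] = (h.get(mf | mg, 0) + cf * cg) % P
    return h

def add(f, g):
    h = dict(f)
    for m, c in g.items():
        h[m] = (h.get(m, 0) + c) % P
    return h

def partial_derivative_matrix(f, ys, zs):
    """Entry (m_Y, m_Z) is the coefficient of the monomial m_Y * m_Z in f."""
    return [[f.get(my | mz, 0) for mz in monomials(zs)] for my in monomials(ys)]

def rank_mod_p(matrix):
    """Rank over F_P by Gauss-Jordan elimination."""
    rows = [list(r) for r in matrix]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] % P), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], P - 2, P)
        rows[rank] = [x * inv % P for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col] % P:
                factor = rows[i][col]
                rows[i] = [(a - factor * b) % P for a, b in zip(rows[i], rows[rank])]
        rank += 1
        if rank == len(rows):
            break
    return rank

def rank_wrt(f, ys, zs):
    return rank_mod_p(partial_derivative_matrix(f, ys, zs))

fs = frozenset
A = {fs(): 1, fs({"y1"}): 2, fs({"y1", "y2"}): 3}   # A(Y) = 1 + 2*y1 + 3*y1*y2
B = {fs(): 1, fs({"z1"}): 1, fs({"z2"}): 5}         # B(Z) = 1 + z1 + 5*z2
C = {fs(): 4, fs({"y2"}): 1}                        # C(Y) = 4 + y2
D = {fs({"z1"}): 7, fs({"z1", "z2"}): 1}            # D(Z) = 7*z1 + z1*z2
h = add(multiply(A, B), multiply(C, D))

print(rank_wrt(h, ["y1", "y2"], ["z1", "z2"]))  # partition matching the factors
print(rank_wrt(h, ["y1", "z1"], ["y2", "z2"]))  # a partition that cuts across them
```

With respect to the partition that matches the factors, $AB + CD$ is a sum of two rank-1 terms, so its rank is at most 2. For this toy instance, the partition that cuts across the factors yields a rank above 2, which is the behavior the uniqueness argument relies on.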


The following technical lemmas are used in the proof of Theorem 5.6. Their proofs will appear in the full version of the paper. Throughout this paper, LI stands for “Linearly Independent” and LD for “Linearly Dependent.”

Lemma 5.7. Let $f$ and $g$ be two multilinear polynomials over an $n$-sized variable set $Y \cup Z$ and field $\mathbb{F}$. Then for any $S \subseteq \mathbb{F}$ and independently chosen $\alpha, \beta \in_R S$,

$$\Pr_{\alpha, \beta \in_R S}\left[\mathrm{Rank}_{Y,Z}(\alpha \cdot f + \beta \cdot g) > 2\right] \ge 1 - \frac{2n}{|S|},$$

unless $f$ and $g$ have one of the following forms,

1) $f = f_1(Y) f_2(Z)$ and $g = g_1(Y) g_2(Z)$

2) $f = f_1(Y) f_2(Z) + f_3(Y) f_4(Z)$ ($f_1, f_3$ are LI, $f_2, f_4$ are LI) and either $g = [a \cdot f_1(Y) + b \cdot f_3(Y)]\, g_2(Z)$ or $g = g_1(Y)\, [a \cdot f_2(Z) + b \cdot f_4(Z)]$

3) $f = f_1(Y) f_2(Z) + f_3(Y) f_4(Z)$ ($f_1, f_3$ are LI, $f_2, f_4$ are LI) and $g = [a \cdot f_1(Y) + b \cdot f_3(Y)]\, g_2(Z) + [c \cdot f_1(Y) + d \cdot f_3(Y)]\, g_4(Z)$ ($g_2, g_4$ are LI and $ad \neq bc$)

4) $f = f_1(Y) f_2(Z) + f_3(Y) f_4(Z)$ and $g = [a \cdot f_1(Y) + b \cdot f_3(Y)]\, g_2(Z) + g_3(Y)\, [c \cdot f_2(Z) + d \cdot f_4(Z)]$ ($f_1, f_3, g_3$ are LI, $f_2, f_4, g_4$ are LI and $ac = -bd$)

and their analogous cases, where the $f_i$'s and $g_i$'s are some multilinear polynomials on their indicated variable sets and $a, b, c, d \in \mathbb{F}$.

Lemma 5.8. Let $S$ be a set of multilinear monomials over $r_1, r_2, \ldots, r_n$, where the $r_i$'s are independent r.v.'s and each $r_i \in_R \mathbb{F}^*$. Then for every $M \in S$ there exists a set $S_M \subseteq S$ such that

1) $|S_M| \ge \log |S| - 1$ and

2) $S_M \cup \{M\}$ is a set of independent uniform r.v.'s over $\mathbb{F}^*$.
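To see why such a lemma needs a careful choice of $S_M$, it helps to test small sets of monomials by brute force. The following sketch is an illustration, not part of the paper's proof. It assumes a small prime `p` and a hypothetical helper `jointly_uniform` that enumerates all assignments of $r_1, \ldots, r_n$ over $\mathbb{F}_p^*$ and checks whether the resulting monomial values are jointly uniform, which is equivalent to being mutually independent and individually uniform.

```python
from itertools import product
from collections import Counter

def jointly_uniform(monomial_list, num_vars, p):
    """True iff the monomials (tuples of variable indices) are mutually
    independent uniform r.v.'s over F_p^* when each r_i is uniform on F_p^*."""
    counts = Counter()
    for r in product(range(1, p), repeat=num_vars):
        values = []
        for mono in monomial_list:
            v = 1
            for i in mono:
                v = v * r[i] % p
            values.append(v)
        counts[tuple(values)] += 1
    # Jointly uniform: every tuple in (F_p^*)^k occurs, and equally often.
    all_outcomes_hit = len(counts) == (p - 1) ** len(monomial_list)
    return all_outcomes_hit and len(set(counts.values())) == 1

# Monomials r1*r2 and r2*r3 over F_5^*
print(jointly_uniform([(0, 1), (1, 2)], 3, 5))
# Adding r1*r3: the product of all three is (r1*r2*r3)^2, always a square
print(jointly_uniform([(0, 1), (1, 2), (0, 2)], 3, 5))
```

The first set is jointly uniform, while the second is not, since the product of its three monomials is always a square in $\mathbb{F}_5^*$. This is why only a subset of the monomials in $S$ can be guaranteed independent.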

Lemma 5.9 (Irreducibility Lemma). Let $f_R$ be the polynomial computed by a random multilinear formula over the variable set $X = \{x_1, x_2, \ldots, x_m\}$ and field $\mathbb{F}$ sampled using SAMPLE2. The probability that there exists a proper partition $\{Y, Z\}$ of $X$ such that $\mathrm{Rank}_{Y,Z}(f_R) = 1$ is at most $\frac{2^{O(m)}}{|\mathbb{F}|}$.

Lemma 5.10. Let $\{Y, Z\}$, with $|Y| \le |Z|$, be a partition of the variable set $X = \{x_1, \ldots, x_m\}$ such that both $|Y|, |Z|$ are at least $\gamma m$ for some $\gamma > 0$, and let $\delta$ be a sufficiently large integer constant. Let $f$ be the polynomial computed by a random multilinear formula sampled using SAMPLE2$(X, \mathbb{F})$. Then, with probability at least

$$1 - \frac{2^{O(m)}}{|\mathbb{F}|} - \frac{1}{2^{\gamma m / 18 \log_2 \delta}},$$

1) there are at least $\delta$ distinct monomials multilinear in $Z$ variables such that the coefficients of these are polynomials in $Y$ each containing at least $\delta$ monomials, and

2) $\mathrm{Rank}_{Y,Z}(f) > 2$.

Proof sketch for Theorem 5.6: We first show, using Lemma 5.7, that a random linear combination $\alpha f + \beta g$ has rank $\le 2$ w.r.t. a partition $(Y, Z)$ of the underlying variable set only under very special conditions. The most natural of these is when $f$ and $g$ are both of rank 1, i.e., $f(Y, Z) = f_1(Y) \cdot f_2(Z)$ and $g(Y, Z) = g_1(Y) \cdot g_2(Z)$. The other (degenerate) conditions arise when at least one of $f$ or $g$ has rank 2 and can be categorized into a small number of special cases. The second part of the proof is to show that when $f = AB$ and $g = CD$, and $A$, $B$, $C$, and $D$ are samples from SAMPLE2, the degenerate conditions are satisfied with very low probability. This will imply that $AB$ and $CD$ must satisfy the natural condition and hence their supports must satisfy (1) and (2). For the second part, we use two main arguments about a random formula according to SAMPLE2 on $m$ variables: (i) it must have rank at least two, w.h.p., for any nontrivial partition of its variables (Irreducibility Lemma, Lemma 5.9) and (ii) for any partition $(Y, Z)$ with $|Y|, |Z| \ge \Omega(m)$, it must contain many monomials in $Z$ variables whose coefficients (which are polynomials in $Y$) must also contain many monomials in $Y$ variables (Lemma 5.10). By (i), we only need to consider the case when, say, $f$ is of rank 2 w.r.t. some partition (not necessarily $(Y, Z)$). This, combined with any of the degeneracy conditions, implies that the number of statistically independent monomials in $Y$ variables in the coefficient of a suitably chosen $Z$-monomial in $g$ must be small (since they are determined by linear combinations, given by the degeneracy conditions, of a small number of coefficients of $f$'s factors). But this contradicts (ii) since (by Lemma 5.8) there must be many independent monomials in $Y$ variables.

6. CONCLUSIONS AND OPEN PROBLEMS

In this paper, we proved that multilinear arithmetic formulas can be efficiently reconstructed when the formula is chosen randomly according to some natural distributions. This is the strongest model of arithmetic complexity for which such a reconstruction algorithm is currently known, even if the efficiency is in a distributional sense. A slight variant of the worst-case reconstruction of multilinear formulas is known to be intractable. Our result gives supporting evidence to the general theme that mathematical techniques useful in proving lower bounds against a model of computation should also enable learning algorithms for that model. This general view and the technical and intuitive connections among the areas of lower bounds in arithmetic complexity, polynomial identity testing, and reconstruction motivate several open questions for future research. Some examples include worst-case reconstruction of depth-3 constant top fan-in arithmetic formulas (over arbitrary fields), constant top fan-in depth-4 multilinear formulas, Arithmetic Branching Programs (non-commutative as well as suitably restricted commutative models), and so on. Worst-case reconstruction in many nontrivial models is either very hard or provably intractable. However, requiring efficiency only in a distributional sense appears to bring the reconstruction problem in several powerful models including, for example, general arithmetic formulas, within reach and thus open up a direction of positive results in this area. This paper may be viewed as a first result in that direction.

ACKNOWLEDGMENTS

The authors would like to thank the anonymous referees for their thoughtful comments and suggestions, which have significantly helped us in improving the paper. We also thank Amir Shpilka for pointing out an improvement in the field size.

REFERENCES

[1] J. Hastad, “Tensor rank is NP-complete,” J. Algorithms, vol. 11, no. 4, pp. 644–654, 1990.

[2] E. Kaltofen, “Factorization of polynomials given by straight-line programs,” in Randomness and Computation. JAI Press, 1989, pp. 375–412.

[3] Z. S. Karnin and A. Shpilka, “Reconstruction of generalized depth-3 arithmetic circuits with bounded top fan-in,” in IEEE Conference on Computational Complexity, 2009, pp. 274–285.

[4] A. Klivans and D. A. Spielman, “Randomness efficient identity testing of multivariate polynomials,” in STOC, 2001, pp. 216–223.

[5] A. R. Klivans and A. Shpilka, “Learning restricted models of arithmetic circuits,” Theory of Computing, vol. 2, no. 1, pp. 185–206, 2006.

[6] N. Nisan, “Lower bounds for non-commutative computation,” in Proceedings of the twenty-third annual ACM symposium on Theory of computing, ser. STOC ’91. New York, NY, USA: ACM, 1991, pp. 410–418.

[7] R. Raz, “Multi-linear formulas for permanent and determinant are of super-polynomial size,” Journal of the Association for Computing Machinery, vol. 56, no. 2, 2009.

[8] R. Raz and A. Yehudayoff, “Lower bounds and separations for constant depth multilinear circuits,” Computational Complexity, vol. 18, no. 2, pp. 171–207, 2009.

[9] A. Shpilka, “Interpolation of depth-3 arithmetic circuits with two multiplication gates,” SIAM J. Comput., vol. 38, no. 6, pp. 2130–2161, 2009.

[10] A. Shpilka and I. Volkovich, “On the relation between polynomial identity testing and finding variable disjoint factors,” in ICALP (1), 2010, pp. 408–419.

[11] A. Shpilka and A. Yehudayoff, “Arithmetic circuits: A survey of recent results and open questions,” Foundations and Trends in Theoretical Computer Science, vol. 5, no. 3-4, pp. 207–388, 2010.

APPENDIX

HANDLING + GATES WITH FAN-IN k

In the previous sections we described how to reconstruct a random binary multilinear formula sampled using SAMPLE$(X, \mathbb{F})$. We now show that our algorithm, with a few minor modifications, can also reconstruct a random $k$-ary multilinear formula (where $k = O(1)$), i.e., a formula sampled using the method SAMPLE$(X, \mathbb{F})$ where, in Case 3, instead of recursively constructing 2 children, it constructs $k$ children. Hence the resulting formula would have $+$ gates of fanin $k$. As before, it can be shown that, w.h.p., the polynomial computed at a $+$ gate in such a formula would be irreducible. Hence, if the root was a $\times$ gate, then we can gain blackbox access to the children using Kaltofen's algorithm and/or the algorithm of Shpilka and Volkovich [10]. The overall strategy of the algorithm is as in the outline given in Section 3. The interesting case is where the root of the formula is a $+$ gate, so that the polynomial computed is of the form

$$f = \sum_{i=1}^{k} A_i(x_i, \ldots, x_{i+k-1}) \cdot B_i(X \setminus \{x_i, \ldots, x_{i+k-1}\}),$$

where $\{x_i\}_{i \in [2k]}$ is a partition of the variable set $X$ such that the $x_i$'s are of roughly the same size and the $A_i$'s, $B_i$'s are random $k$-ary multilinear formulas on their respective variable sets. With high probability, the $A_i$'s, $B_i$'s are of the same degree, say $d$. As before, we obtain blackbox access to the grandchildren in two steps. In the first step, we determine the partition $\{x_i\}_{i \in [2k]}$ of the variable set $X$. In the second step, we indicate how to obtain the evaluation of $A_i$ (respectively $B_i$) at any given point.

Determining the partition. As in the case of fanin two, the idea is that for the correct partition of $X$ the rank of a certain matrix will be very small, whereas it will be large for every incorrect partition. As before, the problem can be reduced to the case where $|X|$ is relatively small (roughly $\Theta(k \log n)$) by first determining a seed partition and then extending it by introducing one variable at a time. With this lifting idea at hand, it suffices to determine whether a given partition is “the correct one” (w.h.p. over the random choice of the formula, it will hold true that there is a unique correct partition). The idea is that if

$$X = \biguplus_{i \in [2k]} x_i$$

is the correct partition, then for any $x_{i-1}, x_i$ we have

$$\hat{f} = \hat{A}_i(x_i)\, \hat{B}_i(x_{i-1}) + \sum_{j \neq i} \alpha_j \hat{A}_j(x_{i-1}, x_i) + \beta_j \hat{B}_j(x_{i-1}, x_i),$$

where for any polynomial $g$, $\hat{g}$ denotes the polynomial obtained by setting all the variables from $X \setminus \{x_{i-1}, x_i\}$ to random values. This means that for any $j < d$, the polynomial $\hat{f}^{[2d-j]}$, the homogeneous part of degree $(2d - j)$ of $\hat{f}$, will be of rank $(j + 1)$ with respect to the partition $x_{i-1} \uplus x_i$. On the other hand, $\hat{f}^{[2d-j]}$ will not satisfy the stated rank bound for any incorrect partition (w.h.p. over the random choice of the formula).
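The lifting step, which extends a seed partition one variable at a time, has a simple control flow that can be sketched independently of the rank computation. In the sketch below, `passes_rank_test` is a hypothetical oracle standing in for the rank check described above, and the function name `extend_seed_partition` is likewise an assumption made for illustration.

```python
def extend_seed_partition(seed_blocks, remaining, passes_rank_test):
    """Place each remaining variable into the unique block for which the
    seed partition augmented with that variable passes the rank test.

    `passes_rank_test` is a hypothetical oracle: it receives a list of
    blocks (frozensets of variables) and returns True or False.
    """
    seed = [frozenset(b) for b in seed_blocks]
    result = [set(b) for b in seed]
    for x in remaining:
        hits = [i for i in range(len(seed))
                if passes_rank_test([b | {x} if j == i else b
                                     for j, b in enumerate(seed)])]
        if len(hits) != 1:
            raise ValueError(f"no unique block found for variable {x!r}")
        result[hits[0]].add(x)
    return result
```

Each test involves only the seed variables plus one new variable, so the instances handed to the oracle stay small even as the recovered partition grows.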

Obtaining blackbox access to the grandchildren. We now indicate how one can obtain blackbox access to the $A_i$'s and $B_i$'s, given blackbox access to $f$ and the partition $\{x_i\}_{i \in [2k]}$. In order to determine, say, $A_1(\beta_1, \ldots, \beta_k)$, one substitutes all the variables except $x_1$ and $x_{2k}$ to appropriate values and looks at the homogeneous components of the resulting polynomial. Specifically, make the substitution

$$x_j := \begin{cases} \beta_j & \text{for } j \in [2..k], \\ \alpha_j \in_R \mathbb{F}^{|x_j|} & \text{for } j \in [(k+1)..(2k-1)] \end{cases}$$

and let the resulting polynomial be $\hat{f}(x_1, x_{2k})$. It turns out then that $\hat{f}^{[2d]}$ equals $A_1^{[d]}(x_1) \cdot B_1^{[d]}(x_{2k})$. Factoring this polynomial gives us access to $A_1^{[d]}(x_1, \beta_2, \ldots, \beta_k)$ and therefore to $A_1^{[d]}(\beta_1, \ldots, \beta_k)$ as well. As in the case of fanin two, this idea can be extended suitably to obtain $A_1(\beta_1, \ldots, \beta_k)$. This completes our brief description for extending the reconstruction result to formulas with higher fanin.
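The text does not spell out how the homogeneous components of a blackbox polynomial are accessed. One standard way, sketched here as an assumption rather than as the authors' method, is to interpolate along the line through the origin: $g(t) = f(t \cdot a) = \sum_k t^k f^{[k]}(a)$, so the coefficient of $t^k$ is the value of $f^{[k]}$ at $a$. The prime `P` and the function names are hypothetical.

```python
P = 10007  # hypothetical prime; F_P stands in for the field F

def interpolate(xs, ys):
    """Coefficients (low to high degree) of the unique polynomial of degree
    < len(xs) through the points (xs[i], ys[i]) over F_P."""
    n = len(xs)
    coeffs = [0] * n
    for i in range(n):
        basis = [1]   # running product of (t - xs[k]) for k != i
        denom = 1
        for k in range(n):
            if k == i:
                continue
            new = [0] * (len(basis) + 1)
            for d, c in enumerate(basis):
                new[d] = (new[d] - c * xs[k]) % P
                new[d + 1] = (new[d + 1] + c) % P
            basis = new
            denom = denom * (xs[i] - xs[k]) % P
        scale = ys[i] * pow(denom, P - 2, P) % P
        for d, c in enumerate(basis):
            coeffs[d] = (coeffs[d] + scale * c) % P
    return coeffs

def homogeneous_part_at(blackbox, point, k, degree_bound):
    """Value at `point` of the degree-k homogeneous part of the polynomial
    behind `blackbox`, using degree_bound + 1 queries (needs degree_bound < P)."""
    ts = list(range(1, degree_bound + 2))
    values = [blackbox([t * x % P for x in point]) for t in ts]
    return interpolate(ts, values)[k]

# Toy blackbox: f(x, y, z) = 5 + 2z + 3xy + xyz
def f(v):
    return (5 + 2 * v[2] + 3 * v[0] * v[1] + v[0] * v[1] * v[2]) % P

print([homogeneous_part_at(f, [2, 3, 4], k, 3) for k in range(4)])
```

Each evaluation of a homogeneous component costs `degree_bound + 1` queries to the original blackbox, which keeps the overall query count polynomial.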
