COMS 6998: Advanced Complexity — Spring 2017
Lecture 14: April 27, 2017
Lecturer: Rocco Servedio    Scribe: Emmanouil Vlatakis
1 Introduction
Last time
During the last lecture we completed the ABFR lower bound method: we finished the proof about computing PAR using circuits with a MAJ gate at the top and ∨, ∧, ¬ gates everywhere else. In addition, we proved Razborov's AC0 lower bounds using PAR gates.
Today
Today we discuss linear threshold functions (LTFs). We will prove lower bounds for circuits that use LTF gates. We will present the main facts about LTFs and the basic randomized communication complexity (RCC) model. Finally, we analyze LTFs in the RCC model.
2 Linear Threshold Functions (LTFs)
Definition 1 (Linear Threshold Function (LTF)). A Boolean function f : {0,1}^n → {0,1} is called a weighted majority or (linear) threshold function if it is expressible as f(x) = sgn(w1x1 + · · · + wnxn − θ) = sgn(w · x − θ) for some w1, . . . , wn, θ ∈ R.
A simple example of an LTF is φ(x) = 1[3x1 + 4x2 + · · · + 10xn ≥ 71]. LTFs also go by several other names: weighted majorities, weighted threshold functions, perceptrons, halfspaces, linear separators, etc.
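To make Definition 1 concrete, here is a minimal sketch of an LTF evaluator (the names `ltf` and `sgn` are ours, not from the lecture; we use the ±1 output convention and treat sgn(0) as −1, which is an assumption about the convention):

```python
# Minimal sketch of Definition 1. We use the +-1 output convention
# and treat sgn(0) as -1 (a convention assumption).

def sgn(v):
    return 1 if v > 0 else -1

def ltf(w, theta, x):
    """Evaluate sgn(w . x - theta)."""
    return sgn(sum(wi * xi for wi, xi in zip(w, x)) - theta)

# A toy halfspace over {0,1}^3: output +1 iff 3*x1 + 4*x2 + 10*x3 > 7.
w, theta = [3, 4, 10], 7
print(ltf(w, theta, [0, 0, 1]))   # 10 - 7 > 0   -> 1
print(ltf(w, theta, [1, 1, 0]))   # 3 + 4 - 7 = 0 -> -1
```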
Definition 2 (Polynomial Threshold Function (PTF)). A function f : {0,1}^n → {0,1} is called a polynomial threshold function (PTF) of degree at most k if it is expressible as f(x) = sgn(p(x)) for some real polynomial p : {0,1}^n → R of degree at most k.
Figure 1: Linear Separator.
Corollary 3. Every LTF is just a degree-1 PTF.
Fact 4. Is every Boolean function an LTF? No: PAR(x) is not linearly separable:
Figure 2: f(x) = xor(x1, x2)
Fact 5. Any LTF f(x) has many different representations (w, θ); in fact, there are infinitely many different representations over the hypercube {−1,1}^n. Any coefficient wi can be perturbed within the range [wi − 1/(cn), wi + 1/(cn)], for a sufficiently large constant c, without changing the value of f(x) over {−1,1}^n.
Corollary 6. Combining the previous fact with the density of the rational numbers in R, there is always a representation of an LTF with only rational coefficients.
Example 7. Both of the following representations are equivalent over {−1,1}^2:

    x1 + x2 ≥ 1.5
    0.99x1 + 1.002x2 ≥ 1.5005

Fact 8. sgn(w · x − θ) = sgn(c(w · x − θ)) for all c > 0.
Fact 9. Any LTF has an integer representation, that is, one with w1, . . . , wn, θ ∈ Z.
Proof. As mentioned before, we can choose a sufficiently small εn such that f(x) = sgn(∑_{i∈[n]} wixi − θ) equals g(x) = sgn(∑_{i∈[n]} w′ixi − θ) for all x ∈ {−1,1}^n, whenever each w′i ∈ (wi − εn, wi + εn). As is well known from real analysis, the rational numbers are dense in the reals. This implies that for every i ∈ [n] there is at least one rational w′i in the range (wi − εn, wi + εn). Thus we can adopt the following simple strategy: perturb every irrational coefficient wi (and θ) to a rational number in its corresponding range, and then multiply every coefficient by the least common multiple of the denominators.
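The two-step strategy in this proof (rational perturbation, then clearing denominators) can be sketched as follows; `to_integer_ltf` is our illustrative helper, with `limit_denominator` standing in for "pick a rational in (wi − εn, wi + εn)":

```python
from fractions import Fraction
from itertools import product
from math import lcm, sqrt

# Sketch of the proof of Fact 9: replace each (possibly irrational)
# coefficient by a nearby rational, then clear denominators to get
# an integer representation.

def sgn(v):
    return 1 if v > 0 else -1

def to_integer_ltf(w, theta, denom=10**6):
    # limit_denominator plays the role of "choose a rational in the
    # interval (w_i - eps_n, w_i + eps_n)" from the proof.
    wq = [Fraction(v).limit_denominator(denom) for v in w]
    tq = Fraction(theta).limit_denominator(denom)
    m = lcm(*[f.denominator for f in wq + [tq]])
    return [int(f * m) for f in wq], int(tq * m)

w, theta = [sqrt(2), sqrt(3)], 0.5     # irrational weights
wi, ti = to_integer_ltf(w, theta)

# The integer LTF agrees with the original everywhere on {-1,1}^2.
for x in product([-1, 1], repeat=2):
    assert sgn(sum(a * b for a, b in zip(w, x)) - theta) == \
           sgn(sum(a * b for a, b in zip(wi, x)) - ti)
print("integer weights:", wi, "threshold:", ti)
```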
Definition 10 (The weight of an integer representation of an LTF). The weight of an integer representation (w, θ) of an LTF is ∑_{i=1}^n |wi|.

Definition 11 (The weight of an LTF). The weight of an LTF f(x), denoted W(f(x)), is the smallest weight over all integer representations of the function.
Example 12.

    or(x1, . . . , xn) = sgn(∑_{i=1}^n xi − 1),  x ∈ {0,1}^n

Using the above representation we can easily argue that W(or) ≤ n. In addition, we can argue that this is optimal. Let g(x) = sgn(∑_{i=1}^n wixi − t) be a weight-optimal representation of the or function. By definition of or, for every i ∈ [n] we have or(ei) = or((0, . . . , 0, 1, 0, . . . , 0)) = 1, with the 1 in the i-th position. This implies that wi ≥ t for all i ∈ [n]. Additionally, we have or(0) = −1, which implies t > 0. Therefore wi > 0 for all i, and since the wi are integers, wi ≥ 1. This concludes the proof that the weight of or is at least n as well.
Example 13. MAJ(x1, · · · , xn) has weight Θ(n):

    MAJ(x) = sgn(∑_i xi − n/2) for x ∈ {0,1}^n,   MAJ(x) = sgn(∑_i xi) for x ∈ {−1,1}^n

To prove the lower bound W(MAJ) = Ω(n), i.e., for the {0,1}^n case, instead of {ei}_{i∈[n]} and 0 one can use all the (n choose ⌊n/2⌋+1) positive examples of the MAJ function which contain ⌊n/2⌋ + 1 ones, together with all the (n choose ⌈n/2⌉−1) negative examples which contain ⌈n/2⌉ − 1 ones. This family of vectors is sufficient to prove that t ≥ n/2 and wi ≥ 1.
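The two representations above are easy to check by brute force. A small sketch (names are ours; to avoid the sgn(0) ambiguity we use explicit threshold comparisons, which is an assumption about the convention):

```python
from itertools import product

# Brute-force verification of the or / MAJ representations over
# {0,1}^n for small n.

def or_ltf(x):
    return 1 if sum(x) >= 1 else -1            # all weights 1: W(or) <= n

def maj_ltf(x):
    return 1 if sum(x) >= len(x) / 2 else -1   # sum_i x_i >= n/2

n = 5
for x in product([0, 1], repeat=n):
    assert or_ltf(x) == (1 if any(x) else -1)
    # for odd n, "at least n/2 ones" == "strictly more ones than zeros"
    assert maj_ltf(x) == (1 if x.count(1) > x.count(0) else -1)
print("or/maj LTF representations verified for n =", n)
```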
2.1 LTF weight upper bound
Lemma 14 (Weight upper bound). Any n-variate LTF has weight at most 2^{O(n log n)}.
Proof sketch: Each configuration ((x1, · · · , xn), f(x)) corresponds to a linear inequality.

• For example, if f(1, · · · , 1) = 1 then we get w1 + w2 + · · · + wn − θ ≥ 1.

• On the other hand, if f(−1, 1, · · · , 1) = −1 then we get −w1 + w2 + · · · + wn − θ ≤ −1.

Since f(x) is guaranteed to be an LTF, we know the system is feasible. More precisely, using LP arguments we can conclude that there exists a set of n + 1 of these inequalities which, when converted to equalities, yields a solution S = {w*1, · · · , w*n, θ*}; via LP arguments again, it is not difficult to prove that this solution is guaranteed to satisfy all 2^n linear inequalities. We thus obtain a system M of n + 1 equations of the form

    ±w1 ± w2 ± · · · ± wn − θ = ±1

Notice that the matrix of the system has only ±1 entries. Solving the system by Cramer's rule gives a rational solution. The determinant of the system is, in the worst case, upper bounded by n! ≤ n^n = 2^{n log n}. This determinant is the common denominator of the solution given by Cramer's rule, and the numerators are determinants of minors of the system. Therefore, using the scaling trick we discussed to transform all coefficients to integers, we get a weight of 2^{O(n log n)}.
Question: The previous fact upper bounds the weight of an LTF. Is this huge weight really necessary?
Example 15. Consider the following decision list:

L = x1 → x2 → · · · → xn
     ↓    ↓           ↓
    +1   −1   · · ·   ±1

⇒ LTF = sgn(2^n x1 − 2^{n−1} x2 + 2^{n−2} x3 − · · · )

Notice that, because of the priority ordering of the decision list, the weights must satisfy inequality constraints like

    |w1| ≥ ∑_{i≥1} |w_{2i}|,   |w2| ≥ ∑_{i≥1} |w_{2i+1}|,   |w3| ≥ ∑_{i≥2} |w_{2i}|,   . . .

For this particular problem, the smallest integer weights satisfying these constraints are the Fibonacci numbers, which grow exponentially as ((1+√5)/2)^n.
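The compilation of an alternating decision list into a single LTF with exponentially decreasing weights can be checked by brute force. A sketch, assuming the list alternates +1, −1, . . . and defaults to −1 when no variable fires (function names are ours):

```python
from itertools import product

# Sketch of Example 15: an alternating decision list compiled into one
# LTF with weights 2^n, -2^(n-1), 2^(n-2), ...

def decision_list(x):
    # "if x1: output +1; elif x2: output -1; elif x3: output +1; ..."
    for i, xi in enumerate(x):
        if xi == 1:
            return 1 if i % 2 == 0 else -1
    return -1                       # assumed default when nothing fires

def dl_as_ltf(x):
    n = len(x)
    s = sum((1 if i % 2 == 0 else -1) * 2 ** (n - i) * xi
            for i, xi in enumerate(x))
    return 1 if s > 0 else -1       # each weight dominates all later ones

n = 6
for x in product([0, 1], repeat=n):
    assert decision_list(x) == dl_as_ltf(x)
print("decision list == LTF on {0,1}^%d" % n)
```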
Lemma 16 (Bad news [Has94]). There exists an n-variate LTF with weight 2^{Ω(n log n)}.
Question: Can we approximate any LTF with a lower weight LTF?
Answer: Yes
Fact 17. Let f(x) be any LTF. For any ε > 0 there exists an LTF f′ such that dist(f, f′) = Pr_{x∈{0,1}^n}[f(x) ≠ f′(x)] ≤ ε, where W(f′(x)) is (1/ε)^{log^2(1/ε)} · poly(n).
Example 18. For the previous representation of the decision list L, we can simply drop the least significant weights.
3 LTF as gates
First, note that an LTF gate with integer weights can be transformed into an equivalent MAJ gate:

f(x) = sgn(w1x1 + w2x2 + · · · − θ)
⇕
f(x) = sgn(x1 + · · · + x1 (w1 times) + x2 + · · · + x2 (w2 times) + · · · + (−1) + · · · + (−1) (θ times))

Figure 3: Suppose that we have an LTF of the form f(x) = sgn(w · x − θ) = 1[w · x > θ]. Then we can convert it to a majority gate by duplicating each input wire wi times.

For example, LTF(3x1, · · · , 2xn) ≡ MAJ(x1, x1, x1, · · · , xn, xn).
So constructing circuits of LTF gates is equivalent to constructing circuits of MAJ gates, but at a cost proportional to the weight of the LTFs.
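The wire-duplication trick can be sanity-checked by brute force. A sketch (names are ours; it assumes integer weights and a positive integer θ):

```python
from itertools import product

# Sketch of the wire-duplication trick: sgn(sum w_i x_i - theta) equals
# a MAJ gate fed x_i repeated |w_i| times (negated when w_i < 0), plus
# theta copies of the constant -1.

def maj(bits):
    return 1 if sum(bits) > 0 else -1

def ltf_to_maj_inputs(w, theta, x):
    wires = []
    for wi, xi in zip(w, x):
        wires += [xi if wi > 0 else -xi] * abs(wi)
    wires += [-1] * theta            # assumes theta is a positive integer
    return wires

w, theta = [3, -2, 1], 2
for x in product([-1, 1], repeat=3):
    direct = 1 if sum(wi * xi for wi, xi in zip(w, x)) - theta > 0 else -1
    assert direct == maj(ltf_to_maj_inputs(w, theta, x))
print("LTF == MAJ on duplicated wires")
```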
Some additional interesting remarks:
1. We do not know of any counterexamples to the following claim:
“Every f in NP (e.g. f = Clique) has poly(n)-size LTF circuits of depth 2”
2. Any poly(n)-size depth-d LTF circuit can be computed by a poly(n)-size depth-(d + 1) MAJ circuit.

3. Furthermore, it is not difficult to prove that any LTF circuit can be computed by a depth-O(1) and/or/not circuit with MAJ gates at the very bottom level.
3.1 LTF circuits lower bounds
Fact 19 (Layered LTF circuits). Any depth-2 layered LTF circuit for parity must haveΩ(√n) LTF gates
Note: A circuit is called layered if and only if, after a level decomposition from the input to the output, every wire of one level goes only to the very next level.

Proof sketch:

• An edge of {−1,1}^n is a pair (x, x^{⊕i}), where x^{⊕i} agrees with x on every coordinate j ≠ i and has its i-th coordinate flipped: (x^{⊕i})_i = −xi.

• If f(x) is an LTF that separates the {−1,1}^n hypercube, then f slices the edge (x, x^{⊕i}) if and only if f(x) ≠ f(x^{⊕i}).

Definition 20 (Influence of a variable). The influence of a variable xi is the (normalized) number of edges of the form (y, y^{⊕i}) that are sliced by f:

    Inf_i(f) = (1/2^n) |{y ∈ {−1,1}^n : f(y) ≠ f(y^{⊕i})}|

• For any Boolean function, |f̂({i})| ≤ Inf_i(f), and if f is an LTF the inequality is tight.

• For any Boolean function, Parseval's identity gives ∑_{S⊆[n]} f̂(S)^2 = 1, and hence ∑_{i∈[n]} f̂({i})^2 ≤ 1. Therefore, for any LTF f:

    ∑_{i∈[n]} (Inf_i(f))^2 ≤ 1

Using Cauchy–Schwarz we get:

    ∑_{i∈[n]} Inf_i(f) ≤ √n

• Therefore, any LTF f slices at most a 1/√n fraction of all n·2^{n−1} edges.

• The parity function has to slice all the edges. Therefore we need at least √n LTF gates in the bottom layer to slice them all.
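The influence bounds in this proof sketch can be verified by brute force for small n. A sketch (function names are ours; the particular LTF weights are an arbitrary choice):

```python
from itertools import product
from math import sqrt

# Brute-force check: for an LTF, sum_i Inf_i(f) <= sqrt(n), while
# parity has Inf_i = 1 for every i (it slices every edge).

def influence(f, n, i):
    sliced = 0
    for x in product([-1, 1], repeat=n):
        y = list(x)
        y[i] = -y[i]                    # flip the i-th coordinate
        if f(x) != f(tuple(y)):
            sliced += 1
    return sliced / 2 ** n

n = 4
ltf = lambda x: 1 if 3 * x[0] + 2 * x[1] + x[2] + x[3] > 0 else -1
par = lambda x: 1 if x.count(-1) % 2 == 0 else -1

total_ltf = sum(influence(ltf, n, i) for i in range(n))
total_par = sum(influence(par, n, i) for i in range(n))
print(total_ltf, "<=", sqrt(n))         # LTF: total influence <= sqrt(n)
print(total_par)                        # parity: total influence = n
assert total_ltf <= sqrt(n) + 1e-9
assert abs(total_par - n) < 1e-9
```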
Slicing the cube. Using LTFs raises many interesting questions. For example: how many hyperplanes do we need to slice all n·2^{n−1} edges of the {−1,1}^n hypercube?

1. n LTFs are enough: we use one per variable.
2. For the {−1,1}^6 cube, it is possible to slice all edges using 5 LTFs.
3. Generalizing the last result, we can slice all the edges of {−1,1}^n using 5n/6 LTFs.

How many different LTFs exist? Using Chow's theorem we can enumerate 2^{Θ(n^2)} of them; a tighter result gives 2^{n^2/2 − n/10} different LTFs.
What is a hard function for LTF circuits?

Generally our preferred hard function is PAR. Here, we will use it through the backdoor.

Definition 21. The inner product function is the 2n-variate Boolean function IP(x, y) = ⟨x, y⟩ mod 2, or equivalently IP(x1, · · · , xn, y1, · · · , yn) = PAR(x1 ∧ y1, · · · , xn ∧ yn).

Let us recall the definition of the parity function: χS(x) = ⊕_{i∈S} xi. Therefore we have χS(x) = IP(1S, x), where 1S is the indicator vector of S.
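Definition 21 and the identity above can be written out directly in code (a minimal sketch; function names are ours):

```python
# Sketch of Definition 21 and the identity chi_S(x) = IP(1_S, x).

def ip(x, y):
    # Inner product mod 2: PAR(x1 & y1, ..., xn & yn).
    return sum(xi & yi for xi, yi in zip(x, y)) % 2

def par(bits):
    return sum(bits) % 2

x, y = [1, 0, 1, 1], [1, 1, 0, 1]
assert ip(x, y) == par([xi & yi for xi, yi in zip(x, y)])

# chi_S(x) = IP(1_S, x), where 1_S is the indicator vector of S.
S = {0, 2}
one_S = [1 if i in S else 0 for i in range(len(x))]
assert ip(one_S, x) == (x[0] + x[2]) % 2
print("IP(x, y) =", ip(x, y))
```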
3.2 This lecture's lower bounds

Today we will prove the following lower bounds:

1. Any circuit of LTF gates for IP must use at least Ω(n/log n) gates.
2. Any decision tree with LTFs as nodes for inner product must have depth Ω(n/log n).
3. Any circuit of the form MAJ ∘ LTF for IP uses at least 2^{Ω(n)} LTF gates.

Question: Why do we use IP instead of PAR?
Answer: Under randomized communication complexity with public coins, PAR has a very cheap randomized protocol. On the other hand, IP requires very high communication complexity even for randomized protocols.
4 Randomized Communication Complexity
First, let us recall the model of communication complexity. Let A and B be two cooperative parties, say Alice and Bob. There is a known f : X × Y → Z with f(x, y) = z. However, A only gets x and B only gets y. They must send information to each other to compute f, and at the end both must know the result f(x, y).
Definition 22 (Deterministic communication complexity of a function f(x, y)). For a protocol P, cost(P) is the depth of the protocol tree (this is the number of bits of communication needed in the worst case). The deterministic communication complexity of f, D(f), is the minimum of cost(P) over all protocols P that compute f.
Definition 23 (Randomized communication model with public coins). The setting is as above: there is a known f : X × Y → Z, A only gets x, B only gets y, and at the end both must know f(x, y). Additionally, Alice and Bob have access to a common string of random bits.
Notice that the randomized communication model with public coins is just a probability distribution over deterministic protocols: when Alice and Bob draw random bits, and the drawn coins are public, their sample corresponds to a jointly sampled deterministic protocol P_r. Thus, we have the following definition:

Definition 24 (Randomized communication protocol). A randomized protocol P for computing f(x, y) works as follows: A and B sample from a public uniform random string and, based on this sample, they agree on a deterministic protocol P_r. Then A and B run this protocol.
Definition 25 (The cost of a randomized protocol). For a randomized protocol P, define cost(P)(x, y) as the depth of the protocol tree (the number of bits of communication needed in the worst case) on input (x, y). The cost of the protocol is cost(P) = max_{(x,y)∈X×Y} cost(P)(x, y).

Definition 26 (Error of a randomized protocol). A randomized protocol P has error ε if and only if for all (x, y) ∈ X × Y:

    Pr_{r ∈ public coins}[P(x, y) ≠ f(x, y)] ≤ ε

Definition 27 (Randomized ε-error communication complexity). The randomized ε-error communication complexity of f, Rε(f), is the minimum of cost(P) over all protocols P that compute f with error at most ε.
Lemma 28. Rε(EQ(x, y)) = O(log(1/ε)).
Proof. It suffices to exhibit a protocol that achieves this bound.

Protocol P:

• Sample, independently, uniformly and publicly, log(1/ε) strings r^1, · · · , r^{log(1/ε)} of length n.

• Alice holds the string x and Bob holds the string y.

• Alice computes the log(1/ε) inner products αj = ⟨x, r^j⟩ mod 2 for j = 1, · · · , log(1/ε).

• Bob computes the log(1/ε) inner products βj = ⟨y, r^j⟩ mod 2.

• Alice sends Bob the log(1/ε) bits α1, · · · , α_{log(1/ε)}.

• Bob answers ∧_{j=1}^{log(1/ε)} [αj = βj].
• Correctness

– First, if x = y = s, then for every j:

    αj = ⟨x, r^j⟩ = ⟨s, r^j⟩ = ⟨y, r^j⟩ = βj

Therefore, if x = y then Bob always answers correctly.

– Second, if x ≠ y then there exists an index i such that xi ≠ yi. Consider the sets of agreement and disagreement indices:

    E = {i : xi = yi},   D = {i : xi ≠ yi}

Given that x ≠ y, we have |D| ≥ 1.
Pr[Bob fails] = Pr[∩_{j=1}^{log(1/ε)} (αj = βj)] = ∏_{j=1}^{log(1/ε)} Pr[αj = βj],

since the r^j are independent. Let us analyze Pr[αk = βk]:

Pr[αk = βk] = Pr[⟨x, r^k⟩ = ⟨y, r^k⟩ mod 2]
            = Pr[⟨x − y, r^k⟩ = 0 mod 2]
            = Pr[∑_{i=1}^n (xi − yi) r^k_i = 0 mod 2]
            = Pr[∑_{i∈E} 0 · r^k_i + ∑_{i∈D} 1 · r^k_i = 0 mod 2]
            = Pr[∑_{i∈D} r^k_i = 0 mod 2] = 1/2

The last step is just the random-subset-sum identity. Finally:

Pr[Bob fails] = ∏_{j=1}^{log(1/ε)} Pr[αj = βj] = (1/2)^{log(1/ε)} = ε
• Complexity

– As the coins are public, the only bits Alice and Bob exchange are the log(1/ε) inner-product bits.

– So O(log(1/ε)) communication bits suffice to solve the equality problem with error probability at most ε.
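The protocol above is easy to simulate end to end. A sketch (names such as `eq_protocol` are ours; note the protocol is one-sided: it never errs when x = y):

```python
import math
import random

# Simulation of the public-coin equality protocol of Lemma 28:
# compare log(1/eps) random inner products mod 2.

def eq_protocol(x, y, eps, rng):
    k = max(1, round(math.log2(1 / eps)))   # number of shared random strings
    for _ in range(k):
        r = [rng.randint(0, 1) for _ in x]            # public random string
        a = sum(xi * ri for xi, ri in zip(x, r)) % 2  # Alice's bit <x, r>
        b = sum(yi * ri for yi, ri in zip(y, r)) % 2  # Bob's bit  <y, r>
        if a != b:
            return False      # definitely x != y
    return True               # "x == y" (wrong with probability <= eps)

rng = random.Random(0)
x = [1, 0, 1, 1, 0, 1, 0, 0]
y = [1, 0, 1, 1, 0, 1, 0, 1]
assert eq_protocol(x, x, 1 / 16, rng)     # never errs when inputs are equal
trials = 10000
errs = sum(eq_protocol(x, y, 1 / 16, rng) for _ in range(trials))
print("empirical error:", errs / trials)  # should be near (1/2)^4 = 0.0625
```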
Exercise 29. R_{1/2−ε}(IP) ≥ n − log(1/ε).
5 Communication Complexity of Univariate Functions

Suppose that we want to analyze the communication complexity of a univariate function f(x), i.e., a function of a single input that gets split between the two parties. To define the communication complexity of this class, we compute:
Deterministic communication complexity: D(f) = max over all splits of D(f)
Randomized communication complexity: Rε(f) = max over all splits of Rε(f)
In any case:

• To prove an upper bound, we should solve the problem for every partition.

• To prove a lower bound, it suffices to exhibit a specific partition.
Fact 30. For the class of linear threshold functions, we know that the deterministic communication complexity is D(LTF) ≥ Ω(n).
It seems that this lower bound cannot be improved, so let us analyze randomized public-coin protocols. The best-known result is the following:

Theorem 31. Rε(LTF) ≤ O(log(n/ε))

Here in class we will prove a simpler result:

Theorem 32. Rε(LTF) ≤ O(log n) · O(log((log n)/ε))
Proof. As we mentioned, in order to upper bound the complexity we have to design a protocol, and the proposed protocol should be independent of the specific split. So let us assume that x is split arbitrarily into

Alice's bits := X = (x1, · · · , xk),   (x_{k+1}, · · · , xn) = Y =: Bob's bits

At the beginning of the lecture we showed that any LTF has an integer representation with |wi| ≤ 2^{O(n log n)} for all i ∈ [n]. Therefore each wi is described by O(n log n) bits.
Protocol P:

1. Alice computes α = ∑_{i=1}^k wi xi.

2. Bob computes β = θ − ∑_{i=k+1}^n wi xi. Note: they want to decide whether α > β; these numbers are O(n log n) bits long.

3. Alice and Bob run a binary search to find the most significant bit position at which α and β differ; call it i*.

4. If i* = −1, Alice and Bob agree to output 0.

5. If α_{i*} = 1, Alice and Bob agree to output 1.

6. If α_{i*} = 0, Alice and Bob agree to output 0.

Their binary search scheme is the following: they try to identify the longest common prefix of their numbers.
Binary search:

1. ℓu ← n log n
2. ℓd ← (n log n)/2
3. Alice looks at a ← the bits of α in positions ℓd through ℓu.
4. Bob looks at b ← the bits of β in positions ℓd through ℓu.
5. Alice and Bob run the R_{ε/(2 log(n log n))} equality protocol on (a, b).
6. If a = b, the most significant differing bit lies below the window, so they recurse on the suffix.
7. If a ≠ b, it lies inside the window, so they recurse on the prefix.
8. When ℓu − ℓd = 0, let i* = ℓu = ℓd:
   • If α and β differ at position i*, output i*.
   • Otherwise, output −1 /* the numbers were equal */
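The binary search can be sketched as follows, with an exact prefix comparison standing in for the randomized equality protocol (in the real protocol each comparison is an equality test with error ε/(2 log(n log n))); names are ours, and α, β are assumed to be nonnegative integers of at most nbits bits:

```python
# msb_diff finds the most significant bit position where alpha and beta
# differ (or -1 if they are equal), using only equality tests on
# high-order windows -- mirroring steps 3-8 of the protocol.

def msb_diff(alpha, beta, nbits):
    if alpha == beta:
        return -1                  # i* = -1: the numbers are equal
    lo, hi = 0, nbits
    # Invariant: the prefixes above bit hi agree; above bit lo they differ.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if (alpha >> mid) == (beta >> mid):   # "equality test" on a prefix
            hi = mid               # high parts equal: descend to lower half
        else:
            lo = mid               # they already differ in the high part
    return lo                      # the single differing position

def ltf_verdict(alpha, beta, nbits):
    # Decide "alpha > beta" from the most significant differing bit.
    i = msb_diff(alpha, beta, nbits)
    if i == -1:
        return 0                   # alpha == beta, so alpha > beta is false
    return (alpha >> i) & 1        # 1 iff alpha's bit wins, i.e. alpha > beta

assert msb_diff(0b1011, 0b1001, 4) == 1
assert ltf_verdict(13, 9, 4) == 1 and ltf_verdict(9, 13, 4) == 0
print("verdicts agree with direct comparison")
```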
• Communication complexity: the binary search has O(log(length)) = O(log(n log n)) stages, where length ≤ log W(f) = O(n log n) is the bit-length of α and β, since W(f) = 2^{O(n log n)}. Each stage runs an equality protocol with error ε/(2 log(n log n)), costing O(log((log n)/ε)) bits. So the total complexity is O(log(n log n) · log((log n)/ε)).

• Correctness: by a union bound, the error probability is at most (number of stages) × ε/(2 log(n log n)) ≤ ε.
Lemma 33. Let f be a Boolean function computable by a circuit C of s gates drawn from a gate family ℛ. Then R_{1/3}(f) ≤ s · R_{1/(3s)}(ℛ).
Proof.

Protocol P:

• Topologically sort the circuit for f from the bottom (input) layer to the top (output) layer; each layer is made up of some gates.

• Process the layers from bottom to top, applying the R_{1/(3s)}(ℛ) protocol to each gate.

• Output the result of the final gate of the top layer.

• Communication complexity: for each gate we apply an R_{1/(3s)}(ℛ) protocol, and there are at most s gates, so we use s · R_{1/(3s)}(ℛ) bits of communication.

• Correctness: by a union bound, the error probability is (1/(3s)) × (#gates) ≤ (1/(3s)) × s = 1/3.
5.1 Circuit lower bounds
Corollary 34. Any circuit of LTF gates for IP needs Ω(n/log n) gates.

Proof. As we mentioned, n − log(1/ε) ≤ R_{1/2−ε}(IP). Therefore n − 2 ≤ R_{1/3}(IP) ≤ s · R_{1/(3s)}(LTF).
Let’s assume now by contradiction that there was a circuit that uses s = o( nlogn
).
However given the previous result, we have that sR 13s
(LTF) = o( nlogn
)R 1
3o(n
logn)
(LTF) =
o(n)But this means that n− 2 ≤ o(n), which is obviously a contradiction.
Corollary 35. Any decision tree with LTFs as nodes for inner product must have depth Ω(n/log n).
Proof. The proof is similar to the previous one. The corresponding lemma for this computation model is the following:

Lemma 36. Let f be a Boolean function computable by a decision tree T of height h whose nodes are gates from the family ℛ. Then R_{1/3}(f) ≤ h · R_{1/(3h)}(ℛ).

Proof sketch: The reason this lemma differs from the previous one, which refers to general circuits, is simple. Evaluating a decision tree follows only one specific root-to-leaf branch, so the portion of the tree actually used in the computation is at most the longest branch, whose length is by definition the depth of the tree. Thus, in the worst case, the communication needed is O(depth(T)) · R_{1/(3h)}(ℛ).
Let’s assume now by contradiction that there was a tree that has depth . Howevergiven the previous result, we have that :
dR 13d
(LTF) = o( nlogn
)R 1
3o(n
logn)
(LTF) = o(n)
However as we have already mentioned:
n− 2 ≤ R13(IP)
This implies that if there was any DT T with depth o( nlogn
) that uses as nodes LTF
gates then n− 2 ≤ R13(IP) ≤ o(n), which is obviously a contradiction.
Let’s recall the last goal of this lecture:
Theorem 37. Any circuit of the form MAJ ∘ LTF for IP uses at least 2^{Ω(n)} LTF gates.
Figure 4: Any circuit C = MAJ ∘ LTF has two layers: the bottom layer is made up of s LTF gates, and the top layer is made up of the final MAJ gate.
Proof. Here we will need a new trick. We will prove the following lemma:

Lemma 38. For any function of the form f = MAJ(C1, · · · , Cs), where the Ci belong to a gate family ℛ, we have R_{1/2−1/(4s)}(f) ≤ R_{1/(4s)}(ℛ).
Proof.

• Again, we have to implement a protocol for f using the best possible randomized protocol for ℛ.

• First, the reader should notice that there is an extremely simple O(1)-bit protocol with error 1/2: just toss a public coin and output it as the value of the majority gate.

• However, since majority gates are defined over an odd number of inputs, we can describe a protocol that boosts the success probability a little above 1/2.
Question: What is the probability that the majority function agrees with a uniformly randomly chosen one of its input variables (where the probability is also over a uniformly random input)?

Answer: At least ((s+1)/2)/s, since on every input at least (s+1)/2 of the s inputs agree with the majority value.
• Therefore, where C1, · · · , Cs are the bottom-layer gates:

    Pr_{i ∈ [s]}[f(x) = Ci(x)] ≥ ((s+1)/2)/s = 1/2 + 1/(2s)
• So the protocol is pretty simple:

Protocol P:

1. Choose a bottom-layer circuit C*(x) uniformly at random (using public coins) and run the R_{1/(4s)}(ℛ) protocol on it.
2. Output its result.
• Indeed, protocol P outputs the correct result with probability at least

    Pr_{i ∈ [s]}[f(x) = Ci(x)] · Pr[Ci(x) is computed correctly] ≥ (1/2 + 1/(2s))(1 − 1/(4s)) ≥ 1/2 + 1/(2s) − 1/(4s) = 1/2 + 1/(4s)
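The counting fact underlying Lemma 38 — on every input, at least (s+1)/2 of the s bottom gates agree with the majority — can be checked by brute force. A sketch with toy LTF gates (the gate weights and names are our illustrative choices):

```python
from itertools import product

# Brute-force check: a uniformly random bottom-layer gate agrees with
# f = MAJ(C1, ..., Cs) with probability >= 1/2 + 1/(2s), over a random
# gate index i and a random input x.

def ltf(w, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else -1

gates = [[3, 1, -1, 1], [1, -2, 1, 1], [-1, 1, 2, 1]]   # s = 3 toy LTFs
s, n = len(gates), 4

agree = total = 0
for x in product([-1, 1], repeat=n):
    outs = [ltf(w, x) for w in gates]
    f = 1 if sum(outs) > 0 else -1     # odd s: the vote is never tied
    agree += sum(o == f for o in outs) # gates agreeing with the majority
    total += s
print(agree / total, ">=", 0.5 + 1 / (2 * s))
assert agree / total >= 0.5 + 1 / (2 * s)
```

The assertion holds for any choice of gates: on each input, a majority vote over an odd number of gates is, by definition, shared by at least (s+1)/2 of them.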
Now we are ready to prove our last lower bound. Suppose that s = 2^{o(n)}. We know that

    n − log(1/ε) ≤ R_{1/2−ε}(IP)   for all ε > 0

Let ε = 1/(4s); then we have

    n − o(n) ≤ R_{1/2−1/(4s)}(IP)

Additionally:

    R_{1/2−1/(4s)}(IP) ≤ R_{1/2−1/(4s)}(MAJ ∘ LTF) ≤ R_{1/(4s)}(LTF) ≤ O(log(4ns)) = o(n)

Combining the last two inequalities we get n − o(n) ≤ o(n), which is a contradiction.
References
[Has94] Johan Håstad. On the size of weights for threshold gates. SIAM Journal on Discrete Mathematics, 7(3):484–492, August 1994.