Approximate List-Decoding and Hardness Amplification

Valentine Kabanets (SFU), joint work with Russell Impagliazzo and Ragesh Jaiswal (UCSD)


Error-Correcting Codes, Randomness and Complexity

“Classical” complexity (pre-Randomness): P vs NP, P vs NL, …

“Modern” complexity (Randomness): cryptography, NP = PCP(log n, 1), BPP vs P, expanders/extractors/…

Use of classical error-correcting codes (Hadamard, Reed-Solomon, …)

Invention of new kinds of codes (locally testable, locally decodable, locally list-decodable, …)

Example: Derandomization

Idea: Replace a truly random string with a computationally random (pseudorandom) string

Goal: Save on the number of random bits used

Example: Derandomization

Computational Randomness = Computational Hardness

Hard-to-compute functions ⇒ Pseudorandom Generators (PRG) ⇒ derandomization (BPP = P)

PRG ⇒ Hard-to-compute functions

Hardness of Boolean functions

worst-case hardness of f : every (resource-bounded) algorithm A computes f(x) incorrectly for at least one input x

average-case hardness of f : every (resource-bounded) algorithm A computes f(x) incorrectly for at least a δ fraction of inputs x

PRG requires average-case hard functions

Worst-Case to Average-Case

f(x1) f(x2) f(x3) … f(xN)

Error-Correcting Encoding

g(x1) g(x2) g(x3) … g(xM)

N = 2^n

M = 2^{O(n)}

Correcting Errors

f(x1) f(x2) f(x3) … f(xN)

Error-Correcting Decoding

g(x1) g(x2) g(x3) … g(xM)

If we can compute g on “many” inputs, then we can compute f on all inputs.

A Closer Look

Implicit Error-Correcting Decoding

If h(x) = g(x) for “many” x, and h is computable by a “small” circuit, then f is computable by a “small” circuit.

[Diagram: h (with h ≈ g) → decoding → f]

Use Locally Decodable error-correcting codes!

List-Decodable Codes

Implicit Error-Correcting List-Decoding

If h(x) = g(x) for a ½ + ε fraction of inputs, and h is computable by a “small” circuit, then f is computable by a “small” circuit.

[Diagram: h (with h ≈ g) → list-decoding → f]

Use Locally List-Decodable error-correcting codes!

[Sudan, Trevisan, Vadhan ’01] (algebraic polynomial-based codes)

Hardness Amplification

Yao’s XOR Lemma: If f: {0,1}^n → {0,1} is δ-hard for size s (i.e., any size-s circuit errs on ≥ δ fraction of inputs), then

f^{⊕k}(x1,…,xk) = f(x1) ⊕ … ⊕ f(xk) is (1/2 − ε)-hard for size s’ = s · poly(ε, δ), for ε ≈ 2^{−Ω(δk)}.

Proof: By contradiction. Suppose we have a small circuit computing f^{⊕k} on more than a ½ + ε fraction of inputs; show how to build a new circuit computing f on > 1 − δ fraction of inputs.
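As intuition for why the advantage should decay exponentially in δk, here is a heuristic calculation, not the actual proof. Assume, purely for illustration, a predictor that errs on each f(x_i) independently with probability exactly δ. The XOR of its k guesses is correct exactly when an even number of guesses are wrong:

```latex
\Pr[\text{XOR of the } k \text{ guesses is correct}]
  \;=\; \tfrac12 + \tfrac12\,(1 - 2\delta)^k
  \;\le\; \tfrac12 + \tfrac12\, e^{-2\delta k},
  \qquad 0 \le \delta \le \tfrac12 .
```

The lemma itself has to handle arbitrary small circuits, whose errors need not be independent, which is where the real work lies.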

XOR-based Code

Think of a binary message msg on N = 2^n bits as a truth-table of a Boolean function f.

The code of msg is of length N^k, where code(x1,…,xk) = f(x1) ⊕ … ⊕ f(xk).

This is very similar to a version of Hadamard code …
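The encoding just described can be sketched in a few lines of Python. This is a brute-force illustration for tiny parameters only; the function name `xor_code` and the sample message are assumptions made for the example.

```python
from itertools import product

def xor_code(msg, k):
    """Return the k-XOR codeword of msg as a dict keyed by k-tuples of positions.

    msg is a list of bits of length N, read as the truth table of f,
    so f(x) = msg[x] for a position x in range(N).
    """
    n_positions = len(msg)
    codeword = {}
    for xs in product(range(n_positions), repeat=k):
        bit = 0
        for x in xs:
            bit ^= msg[x]  # f(x1) XOR ... XOR f(xk)
        codeword[xs] = bit
    return codeword

msg = [1, 0, 1, 1]          # N = 4, i.e. n = 2
code = xor_code(msg, k=3)   # the codeword has N**k = 64 positions
```

Even for this toy message the codeword is 16 times longer than the message, which is why decoders for this code must work locally, through a circuit, rather than read the whole codeword.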

Hadamard Code

Given a binary msg on N bits, the Hadamard code of msg is a string of 2^N bits, where for an N-bit string r, the code at r is

Had(msg)_r = <msg, r> mod 2

(the inner product of msg and r).

Our XOR-code is essentially a truncated Hadamard code where we only consider N-bit strings r of Hamming weight k:

f(x1) ⊕ … ⊕ f(xk) = <msg, r>, where r_i = 1 for i = x1, …, xk and r_i = 0 elsewhere.
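A small check of this correspondence, written as a sketch under the assumption that the k positions are distinct (so that r really has Hamming weight k). The helper names `hadamard` and `indicator` are hypothetical.

```python
from itertools import product

def hadamard(msg):
    """Hadamard codeword of msg: one bit <msg, r> mod 2 for every N-bit r."""
    n_bits = len(msg)
    return {r: sum(m * ri for m, ri in zip(msg, r)) % 2
            for r in product((0, 1), repeat=n_bits)}

def indicator(xs, n_bits):
    """N-bit string r with r[i] = 1 exactly for the positions i listed in xs."""
    return tuple(1 if i in xs else 0 for i in range(n_bits))

msg = [1, 0, 1, 1, 0]                  # N = 5
had = hadamard(msg)                    # 2**5 = 32 bits
xs = (0, 2, 4)                         # k = 3 distinct positions
xor_bit = msg[0] ^ msg[2] ^ msg[4]     # the XOR-code bit at (x1, x2, x3)
same = had[indicator(xs, len(msg))] == xor_bit
```

Here `same` compares the XOR-code bit with the Hadamard bit at the weight-3 string r that marks positions 0, 2 and 4.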

List-Decoding Hadamard Code

Given a 2^N-bit string w, how many N-bit strings m1, …, mt are there such that Had(mi) agrees with w in ≥ ½ + ε fraction of positions?

Answer: O(1/ε^2) (easy to show using discrete Fourier analysis, or elementary probability theory)

The famous Goldreich-Levin algorithm provides an efficient way of list-decoding the Hadamard code with optimal list size O(1/ε^2)
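For very small N the list can be found by exhaustive search, which gives a way to see the bound in action. The sketch below is not the Goldreich-Levin algorithm: it simply tries all 2^N messages. It relies on the standard Parseval-type bound that at most 1/(4ε²) messages can reach agreement ½ + ε; the function names and the random received word are assumptions made for the example.

```python
import random
from itertools import product

def hadamard_agreement(msg, w):
    """Fraction of positions r where <msg, r> mod 2 equals w[r]."""
    hits = sum((sum(m * ri for m, ri in zip(msg, r)) % 2) == bit
               for r, bit in w.items())
    return hits / len(w)

def brute_force_list(w, n_bits, eps):
    """All N-bit messages whose Hadamard code agrees with w on >= 1/2 + eps."""
    return [msg for msg in product((0, 1), repeat=n_bits)
            if hadamard_agreement(msg, w) >= 0.5 + eps]

random.seed(0)  # arbitrary seed, only for reproducibility
n_bits, eps = 4, 0.25
# A random received word w of length 2**N = 16.
w = {r: random.randint(0, 1) for r in product((0, 1), repeat=n_bits)}
decoded = brute_force_list(w, n_bits, eps)
list_size_bound = 1 / (4 * eps ** 2)   # = 4 for eps = 1/4
```

Whatever w is, `len(decoded)` cannot exceed `list_size_bound`. The exhaustive search takes time exponential in N, whereas Goldreich-Levin achieves the same list size efficiently.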

List-Decoding k-XOR-Code

Given a string w, how many strings m1, …, mt are there such that each k-XOR codeword code(mi) agrees with w in ≥ ½ + ε fraction of positions?

Answer: Too many! (Any two messages that differ in < 1/k fraction of bits have almost identical codewords.)
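The parenthetical claim is easy to check numerically. The sketch below flips one bit of a toy message and measures how often the two k-XOR codewords agree. The codewords differ exactly on tuples that hit the flipped position an odd number of times, so the agreement is (1 + (1 − 2/N)^k)/2. The messages and function names are made up for illustration.

```python
from itertools import product

def xor_bit(msg, xs):
    """f(x1) XOR ... XOR f(xk), with f given by the truth table msg."""
    bit = 0
    for x in xs:
        bit ^= msg[x]
    return bit

def codeword_agreement(msg_a, msg_b, k):
    """Fraction of k-tuples on which the two k-XOR codewords agree."""
    n_positions = len(msg_a)
    tuples = list(product(range(n_positions), repeat=k))
    same = sum(xor_bit(msg_a, xs) == xor_bit(msg_b, xs) for xs in tuples)
    return same / len(tuples)

msg_a = [0, 1, 1, 0, 1, 0, 0, 1]   # N = 8
msg_b = list(msg_a)
msg_b[0] ^= 1                      # flip one bit: distance 1/8 < 1/k for k = 2
agreement = codeword_agreement(msg_a, msg_b, k=2)
predicted = (1 + (1 - 2 / len(msg_a)) ** 2) / 2
```

Both values come out to 50/64, far above ½, so every message within a small ball around msg_a lands on the list for code(msg_a). This is why exact list-decoding is hopeless here and an approximate notion is needed.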

List-Decoding k-XOR-Code

Correct question: Given a string m, how many k-XOR codewords code(msg1), …, code(msgt) are there such that

(1) each code(msgi) agrees with m in ≥ ½ + ε fraction of positions, and

(2) every pair msgi and msgj differ in at least δ fraction of positions?

Answer: 1/(4ε^2 − e^{−2δk}), which is O(1/ε^2) for δ > log(1/ε)/k (as is the case for Yao’s XOR Lemma!)
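To see why the threshold on the pairwise distance makes the bound O(1/ε²), read the bound as 1/(4ε² − e^{−2δk}), with ε the agreement advantage and δ the pairwise distance, and assume the logarithm is natural:

```latex
\delta \ge \frac{\ln(1/\varepsilon)}{k}
\;\Longrightarrow\;
e^{-2\delta k} \le e^{-2\ln(1/\varepsilon)} = \varepsilon^{2}
\;\Longrightarrow\;
\frac{1}{4\varepsilon^{2} - e^{-2\delta k}}
\;\le\; \frac{1}{3\varepsilon^{2}}
\;=\; O\!\left(\frac{1}{\varepsilon^{2}}\right).
```

With a different base for the logarithm only the constant changes.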

The List Size

The proof of Yao’s XOR Lemma yields an approximate list-decoding algorithm for the XOR-code defined above.

But the list size is 2^{poly(1/ε)} rather than the optimal poly(1/ε)

Our Result for k-XOR Code

There is a randomized algorithm such that, for ε ≥ poly(1/k):

Given a circuit C that computes code(msg) in a ½ + ε fraction of positions, the algorithm outputs with high probability a list of poly(1/ε) circuits that contains a circuit agreeing with msg in ≥ 1 − k^{−0.1} fraction of positions. The running time is poly(|C|, 1/ε).

Direct Product Lemma

If f: {0,1}^n → {0,1} is δ-hard for size s (i.e., any size-s circuit errs on ≥ δ fraction of inputs), then

f^k(x1,…,xk) = f(x1)…f(xk) is (1 − ε)-hard for size s’ = s · poly(ε, δ), for ε ≈ 2^{−Ω(δk)} (i.e., any size-s’ circuit computes f^k correctly on at most an ε fraction of inputs).

The XOR Lemma and the Direct Product Lemma are essentially equivalent, thanks to the Goldreich-Levin list-decoding algorithm for Hadamard codes. Hence, it is enough to give a list-decoding version of the Direct Product Lemma.

The proof of the DP Lemma

[Impagliazzo & Wigderson]: Give an efficient randomized algorithm LEARN that, given as input a circuit C ε-computing f^k (where f: {0,1}^n → {0,1}) and poly(n, 1/ε) pairs (x, f(x)) for independent uniform x’s, with high probability outputs a circuit C’ (1 − δ)-computing f.

Need to know f(x) for poly(n, 1/ε) random x’s. Let’s choose x’s at random, and then try all possibilities for the values of f on these x’s. This gives a list of 2^{poly(n, 1/ε)} circuits.
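The source of the exponential list size can be made concrete: with t sampled inputs there are 2^t ways to guess the values of f on them, and each guess yields one candidate circuit. A minimal sketch of the enumeration step only (LEARN itself is not implemented here, and the sample inputs are hypothetical):

```python
from itertools import product

def candidate_labelings(xs):
    """Yield every possible assignment of bits to the sampled inputs xs."""
    for bits in product((0, 1), repeat=len(xs)):
        yield dict(zip(xs, bits))

xs = ["000", "011", "101"]              # t = 3 sampled inputs (hypothetical)
candidates = list(candidate_labelings(xs))
```

Exactly one of the 2^t labelings matches f on every sampled input, and running LEARN on that one gives the good circuit. The others are what inflate the list.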

Reducing the List Size

Magic: We will use the circuit C ε-computing f^k to generate poly(n, 1/ε) pairs (x, f(x)) for independent uniform x’s, and then run LEARN on C and the generated pairs (x, f(x)).

Well… Cannot do exactly that, but …

Imperfect samples

We will use the circuit C ε-computing f^k to generate poly(n, 1/ε) pairs (x, b_x) for a distribution on x’s that is statistically close to uniform and such that for most x’s we have b_x = f(x).

Then run a generalization of LEARN on C and the generated pairs (x, b_x), where the generalized LEARN is tolerant of imperfect samples (x, b_x).

How to generate imperfect samples

Warm-up

Given a circuit C ε-computing f^k, we want to generate (x, f(x)) where x is almost uniformly distributed.

First attempt: Pick a k-tuple (x1,…,xk) uniformly at random from the ε-fraction of k-tuples where C is correct. Evaluate C(x1,…,xk) = b1…bk. Pick a random i, 1 ≤ i ≤ k, and output (xi, bi).

A Sampling Lemma

Let S ⊆ {0,1}^{nk} be any set of density ε. Define a distribution D as follows: Pick a k-tuple of n-bit strings (x1,…,xk) uniformly at random from S, pick uniformly an index 1 ≤ i ≤ k, and output xi.

Then the statistical distance between D and the uniform distribution is at most (log(k/ε)/k)^{1/2} ≈ 1/√k
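The lemma can be sanity-checked exactly on a toy instance. The sketch below takes n = 1 (each x_i is a single bit), picks a deliberately skewed set S of 12-tuples, and computes the statistical distance of D from uniform by enumeration. The bound is evaluated in the form sqrt(ln(k/ε)/k), with ε denoting the density of S. Reading the slide's "log" as a natural logarithm is an assumption, as are the function name and the choice of S.

```python
from itertools import product
from math import log, sqrt

def distance_from_uniform(subset, universe_size, k):
    """Exact statistical distance between D and uniform on range(universe_size).

    D: pick a k-tuple uniformly from subset, pick a uniform index i, output x_i.
    """
    mass = [0.0] * universe_size
    for xs in subset:
        for x in xs:
            mass[x] += 1.0 / (len(subset) * k)
    return 0.5 * sum(abs(p - 1.0 / universe_size) for p in mass)

k, universe_size = 12, 2                  # n = 1, so each x_i is a single bit
all_tuples = list(product(range(universe_size), repeat=k))
# A deliberately skewed set S: tuples containing at least 9 ones.
subset = [xs for xs in all_tuples if sum(xs) >= 9]
eps = len(subset) / len(all_tuples)       # density of S, about 0.073
dist = distance_from_uniform(subset, universe_size, k)
bound = sqrt(log(k / eps) / k)
```

Even though every tuple in S is heavily biased towards ones, a random coordinate of a random tuple from S is still within the stated distance of a uniform bit, because S is not too sparse.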

Using the Sampling Lemma

If we could sample k-tuples on which C is correct, then we would have a pair (x, f(x)) for x that is ≈ 1/√k-close to uniform.

But we can’t! Instead, run the previous sampling procedure with a random k-tuple (x1,…,xk) some poly(1/ε) number of times.

With high probability, at least one pair will be of the form (x, f(x)) for x close to uniform.

Getting more pairs

Given a circuit C ε-computing f^k, we can get k^{1/2} pairs (x, f(x)), for x’s statistically close to uniform, by viewing the input k-tuple as a k^{1/2}-tuple of k^{1/2}-tuples, and applying the Sampling Lemma to that “meta-tuple”.
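The regrouping step is just a reshaping of the input tuple. A minimal sketch, assuming k is a perfect square and using a hypothetical helper name:

```python
from math import isqrt

def as_meta_tuple(xs):
    """Split a k-tuple into sqrt(k) consecutive blocks of sqrt(k) entries each."""
    k = len(xs)
    side = isqrt(k)
    if side * side != k:
        raise ValueError("this sketch assumes k is a perfect square")
    return [tuple(xs[i * side:(i + 1) * side]) for i in range(side)]

xs = tuple(range(16))            # stand-in for (x1, ..., x16)
blocks = as_meta_tuple(xs)       # 4 blocks of 4 inputs each
```

Each block now plays the role of a single coordinate in the Sampling Lemma. Selecting a random block, together with the corresponding output bits of C, yields sqrt(k) pairs at once instead of one.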

What does it give us ?

Given a circuit C ε-computing f^k, we can generate about k^{1/2} samples (x, f(x)). (Roughly speaking.)

Need about n/ε^2 samples (to run LEARN).

If n/ε^2 < k^{1/2}, then done. What if n/ε^2 > k^{1/2}?

Direct Product Amplification

Idea: Given a circuit C ε-computing f^k, construct a new circuit C’ that ε’-computes f^{k’} for k’ = k^{3/2}, and ε’ > ε^2.

Iterate a constant number of times, and get a circuit poly(ε)-computing f^{poly(k)} for any poly(k). If ε = poly(1/k), we are done [since n/ε^2 ≤ poly(k)].
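Unrolling the iteration makes the "constant number of times" claim explicit. Writing k_t and ε_t for the parameters after t rounds (notation introduced here for illustration), with each round taking k to k^{3/2} and ε to more than ε²:

```latex
k_0 = k,\quad \varepsilon_0 = \varepsilon,\qquad
k_{t} = k_{t-1}^{3/2},\quad \varepsilon_{t} > \varepsilon_{t-1}^{2}
\;\Longrightarrow\;
k_t = k^{(3/2)^{t}},\qquad \varepsilon_t > \varepsilon^{2^{t}} .
```

For example, t = 3 rounds give k_3 = k^{27/8} and ε_3 > ε^8. For any constant t the tuple length is polynomial in k and the success probability is polynomial in ε.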

Direct Product Amplification

Cannot achieve perfect DP amplification! Instead, can create a circuit C’ such that, for at least ε’ fraction of tuples (x1,…,xk’), C’(x1,…,xk’) agrees with f(x1),…,f(xk’) in “most” positions.

Because of this imperfection, we can only get pairs of the form (x, b_x) where x’s are almost uniform and “most” b_x = f(x).

Putting Everything Together

C for f^k → [DP amplification] → C’ for f^{k^c} → [Sampling] → pairs (x, b_x) → [LEARN] → circuit (1 − δ)-computing f, with probability > poly(ε)

Repeat poly(1/ε) times to get a list containing a good circuit for f, w.h.p.
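The repetition count follows from a standard calculation. Suppose one run of the pipeline produces a good circuit with probability at least p, where p is polynomial in the advantage ε, and let η be a target failure probability (a parameter introduced here for illustration). For T independent runs:

```latex
\Pr[\text{all } T \text{ runs fail}] \;\le\; (1 - p)^{T} \;\le\; e^{-pT} \;\le\; \eta
\qquad\text{whenever}\qquad
T \;\ge\; \frac{\ln(1/\eta)}{p} .
```

Since 1/p is polynomial in 1/ε, so is the number of repetitions, and hence the size of the output list.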

An application to uniform hardness amplification

Hardness amplification in PH

Theorem: Suppose there is a language L in P^NP_|| (P with non-adaptive queries to an NP oracle) that is 1/n^c-hard for BPP. Then there is a language L’ in P^NP_|| that is (1/2 − n^{−d})-hard for BPP, for any constant d.

Trevisan gives a weaker reduction (from 1/n^c to (1/2 − log^{−α} n) hardness) but within NP. Since we use the nonmonotone function XOR as an amplifier, we get outside NP.

Open Questions

Achieving optimal list-size decoding for arbitrary ε.

What monotone functions f yield efficiently list-decodable f-based error-correcting codes? Getting an analogue of the Goldreich-Levin algorithm for monotone f-based codes would yield better uniform hardness amplification in NP.
