
Page 1: BEYOND MATRIX COMPLETION - People | MIT CSAIL

ANKUR MOITRA, MASSACHUSETTS INSTITUTE OF TECHNOLOGY

BEYOND MATRIX COMPLETION

Based on joint work with Boaz Barak (Harvard)

Page 2:

Part I: Matrix completion

Pages 3-7:

THE NETFLIX PROBLEM

(figure: a users × movies ratings matrix, 480,189 users × 17,770 movies, entries are 1-5 stars, ~1% of entries observed)

Can we (approximately) fill in the missing entries?

Pages 8-14:

MATRIX COMPLETION

Let M be an unknown, approximately low-rank matrix:

M ≈ (rank-one factor: drama) + (rank-one factor: comedy) + … + (rank-one factor: sports)

Model: we are given random observations M_{i,j} for all (i,j) ∈ Ω

Is there an efficient algorithm to recover M?

Pages 15-16:

MATRIX COMPLETION

The natural formulation is non-convex, and NP-hard:

min rank(X)   s.t.   (1/|Ω|) Σ_{(i,j) ∈ Ω} |X_{i,j} − M_{i,j}| ≤ η

There is a powerful, convex relaxation…

Pages 17-19:

THE NUCLEAR NORM

Consider the singular value decomposition of X:

X = U Σ V^T   (U, V orthogonal; Σ diagonal)

Let σ_1 ≥ σ_2 ≥ … ≥ σ_r > σ_{r+1} = … = σ_m = 0 be the singular values.

Then rank(X) = r, and ‖X‖_* = σ_1 + σ_2 + … + σ_r (the nuclear norm).
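The SVD identities above are easy to check numerically. A minimal sketch (the matrix, sizes, and seed are hypothetical, not from the talk):

```python
import numpy as np

# Check the slide's identities: for X with SVD X = U Sigma V^T,
# rank(X) = r (the number of nonzero singular values), and
# ||X||_* = sigma_1 + ... + sigma_r.
rng = np.random.default_rng(0)
r = 3
X = rng.standard_normal((8, r)) @ rng.standard_normal((r, 10))  # rank-r by construction

sigma = np.linalg.svd(X, compute_uv=False)  # singular values, sorted descending
rank = int((sigma > 1e-10).sum())           # exactly r are (numerically) nonzero
nuclear_norm = sigma.sum()                  # sigma_1 + ... + sigma_r (the tail is ~0)

assert rank == r
```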

Pages 20-22:

This yields a convex relaxation that can be solved efficiently:

(P)   min ‖X‖_*   s.t.   (1/|Ω|) Σ_{(i,j) ∈ Ω} |X_{i,j} − M_{i,j}| ≤ η

[Fazel], [Srebro, Shraibman], [Recht, Fazel, Parrilo], [Candes, Recht], [Candes, Tao], [Candes, Plan], [Recht], …

Theorem: If M is n × n, has rank r, and is C-incoherent, then (P) recovers M exactly from C^6 n r log^2 n observations.

This is nearly optimal, since there are O(nr) parameters.
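(P) is usually handed to a semidefinite-programming solver; as a runnable stand-in, here is a minimal alternating-projection heuristic for the same completion task (truncate to rank r, re-impose the observations). The sizes, rank, sampling rate, and iteration count are hypothetical, and this sketch is not the algorithm the theorem analyzes:

```python
import numpy as np

# Minimal matrix-completion sketch via alternating projection: project onto the
# rank-r matrices, then re-impose the observed entries. This heuristic stands in
# for the nuclear-norm program (P); it is NOT the algorithm from the slides.
rng = np.random.default_rng(1)
n, r = 40, 2
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # unknown rank-r matrix
mask = rng.random((n, n)) < 0.6                                # observed entries, Omega

X = np.where(mask, M, 0.0)                 # zeros in the unobserved entries
for _ in range(200):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    X = (U[:, :r] * s[:r]) @ Vt[:r]        # project onto rank-r matrices
    X[mask] = M[mask]                      # re-impose the observations

rel_err = np.linalg.norm(X - M) / np.linalg.norm(M)
```

In this easy, heavily oversampled regime the iteration converges to a near-exact completion; the theorem's regime (≈ n r polylog n observations) is far sparser.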

Page 23: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   Robust  PCA   [Candes  et  al.],  [Chandrasekaran  et  al.],  …  Can  we  recover  a  low  rank  matrix  from  sparse  corrupMons?  

Page 24: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   Robust  PCA   [Candes  et  al.],  [Chandrasekaran  et  al.],  …  Can  we  recover  a  low  rank  matrix  from  sparse  corrupMons?  

Superresolu;on,  compressed  sensing  off-­‐the-­‐grid  [Candes,  Fernandez-­‐Granda],  [Tang  et  al.],  …  

Can  we  recover  well-­‐separated  points  from  low-­‐frequency  measurements?  

f1   f2   f3  f4  

Page 25:

Part II: Higher-order structure?

Pages 26-31:

TENSOR COMPLETION

Can using more than two attributes lead to better recommendations?

e.g. Groupon: a users × activities × time tensor
(time: season, time of day, weekday/weekend, etc.)

T ≈ Σ_{i=1}^{r} a_i ⊗ b_i ⊗ c_i   (e.g. one rank-one factor for "snow sports")

Can we (approximately) fill in the missing entries?

Pages 32-33:

THE TROUBLE WITH TENSORS

Natural approach (suggested by many authors):

(P)   min ‖X‖_*   s.t.   (1/|Ω|) Σ_{(i,j,k) ∈ Ω} |X_{i,j,k} − T_{i,j,k}| ≤ η

where ‖·‖_* is the tensor nuclear norm.

[Gurvits], [Liu], [Harrow, Montanaro]: The tensor nuclear norm is NP-hard to compute!

Pages 34-36:

In fact, most of the linear algebra toolkit is ill-posed, or computationally hard for tensors…

e.g. [Hillar, Lim] "Most Tensor Problems are NP-Hard"

(Table I of [Hillar, Lim], "Tractability of Tensor Problems":)

Bivariate Matrix Functions over R, C: Undecidable (Proposition 12.2)
Bilinear System over R, C: NP-hard (Theorems 2.6, 3.7, 3.8)
Eigenvalue over R: NP-hard (Theorem 1.3)
Approximating Eigenvector over R: NP-hard (Theorem 1.5)
Symmetric Eigenvalue over R: NP-hard (Theorem 9.3)
Approximating Symmetric Eigenvalue over R: NP-hard (Theorem 9.6)
Singular Value over R, C: NP-hard (Theorem 1.7)
Symmetric Singular Value over R: NP-hard (Theorem 10.2)
Approximating Singular Vector over R, C: NP-hard (Theorem 6.3)
Spectral Norm over R: NP-hard (Theorem 1.10)
Symmetric Spectral Norm over R: NP-hard (Theorem 10.2)
Approximating Spectral Norm over R: NP-hard (Theorem 1.11)
Nonnegative Definiteness: NP-hard (Theorem 11.2)
Best Rank-1 Approximation: NP-hard (Theorem 1.13)
Best Symmetric Rank-1 Approximation: NP-hard (Theorem 10.2)
Rank over R or C: NP-hard (Theorem 8.2)
Enumerating Eigenvectors over R: #P-hard (Corollary 1.16)
Combinatorial Hyperdeterminant: NP-, #P-, VNP-hard (Theorems 4.1, 4.2, Corollary 4.3)
Geometric Hyperdeterminant: Conjectures 1.9, 13.1
Symmetric Rank: Conjecture 13.2
Bilinear Programming: Conjecture 13.4
Bilinear Least Squares: Conjecture 13.5

Note: Except for positive definiteness and the combinatorial hyperdeterminant, which apply to 4-tensors, all problems refer to the 3-tensor case.

Pages 37-42:

FLATTENING A TENSOR

Many tensor methods rely on flattening: flat(T) rearranges an n_1 × n_2 × n_3 tensor (e.g. users × activities × time) into an n_1 × n_2·n_3 matrix with

flat(T)_{i,(j,k)} = T_{i,j,k}

This is a rearrangement of the entries, into a matrix, that does not increase its rank:

flat( Σ_{i=1}^{r} a_i ⊗ b_i ⊗ c_i ) = Σ_{i=1}^{r} a_i vec(b_i c_i^T)^T

where vec(b_i c_i^T) is an n_2·n_3-dimensional vector.
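The flattening identity above can be verified directly. A small sketch (dimensions and factors are hypothetical):

```python
import numpy as np

# Flatten a 3-tensor T of shape (n1, n2, n3) into an n1 x (n2*n3) matrix.
# For T = sum_i a_i (x) b_i (x) c_i, flat(T) = sum_i a_i vec(b_i c_i^T)^T,
# so the flattening has matrix rank at most r.
rng = np.random.default_rng(2)
n1, n2, n3, r = 5, 6, 7, 3
A = rng.standard_normal((n1, r))
B = rng.standard_normal((n2, r))
C = rng.standard_normal((n3, r))

T = np.einsum('ir,jr,kr->ijk', A, B, C)  # CP tensor of rank <= r
flat = T.reshape(n1, n2 * n3)            # flat(T): rows indexed by i, columns by (j,k)

# The same matrix, built directly from the rank-one terms:
flat_direct = sum(np.outer(A[:, i], np.outer(B[:, i], C[:, i]).ravel()) for i in range(r))

assert np.allclose(flat, flat_direct)
assert np.linalg.matrix_rank(flat) <= r
```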

Page 43: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

Let  n1  =  n2  =  n3  =  n  

We  would  need  O(n2r)  observaMons  to  fill-­‐in  flat(T)      

   

Page 44: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

Let  n1  =  n2  =  n3  =  n  

We  would  need  O(n2r)  observaMons  to  fill-­‐in  flat(T)      

   There  are  many  other  variants  of  flaTening,  but  with  comparable  guarantees  

[Liu,  Musialski,  Wonka,  Ye],  [Gandy,  Recht,  Yamada],      [SignoreTo,  De  Lathauwer,  Suykens],  [Tomioko,  Hayashi,  Kashima],    [Mu,  Huang,  Wright,  Goldfarb],  …  

Page 45: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

Let  n1  =  n2  =  n3  =  n  

We  would  need  O(n2r)  observaMons  to  fill-­‐in  flat(T)  

Can  we  beat  flauening?  

   

   There  are  many  other  variants  of  flaTening,  but  with  comparable  guarantees  

[Liu,  Musialski,  Wonka,  Ye],  [Gandy,  Recht,  Yamada],      [SignoreTo,  De  Lathauwer,  Suykens],  [Tomioko,  Hayashi,  Kashima],    [Mu,  Huang,  Wright,  Goldfarb],  …  

Page 46: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

Let  n1  =  n2  =  n3  =  n  

We  would  need  O(n2r)  observaMons  to  fill-­‐in  flat(T)  

Can  we  beat  flauening?  

Can  we  make  beuer  predicMons  than  we  do  by  treaMng  each  ac;vity  x  ;me  as  unrelated?  

   

   There  are  many  other  variants  of  flaTening,  but  with  comparable  guarantees  

[Liu,  Musialski,  Wonka,  Ye],  [Gandy,  Recht,  Yamada],      [SignoreTo,  De  Lathauwer,  Suykens],  [Tomioko,  Hayashi,  Kashima],    [Mu,  Huang,  Wright,  Goldfarb],  …  

Page 47:

Part III: Nearly optimal algorithms for noisy tensor completion

Pages 48-49:

OUR RESULTS

Suppose we are given |Ω| = m noisy observations from T:

T = Σ_{i=1}^{r} σ_i · a_i ⊗ b_i ⊗ c_i + noise,   with |σ_i|, ‖a_i‖_∞, ‖b_i‖_∞, ‖c_i‖_∞ ≤ C and the noise bounded by η

Theorem: There is an efficient algorithm that with probability 1−δ outputs X with

(1/n^3) Σ_{i,j,k} |X_{i,j,k} − T_{i,j,k}|  ≤  2C^3 r · (n^{3/2} log^4 n)/m  +  C^3 r · √(ln(2/δ)/m)  +  2η

Pages 50-56:

SOME REMARKS

In many settings, a 1−o(1) fraction of the entries of T are at least r^{1/2}/polylog(n) in magnitude.

When m = Ω(n^{3/2} r), the average error is asymptotically smaller, and hence X_{i,j,k} = (1 ± o(1)) T_{i,j,k} for a 1−o(1) fraction of the entries:

"Almost all of the entries, almost entirely correct"

For r = n^{3/2−δ} (highly overcomplete), we only need to observe an n^{−δ} fraction of the entries.

Algorithms for decomposing such tensors given all the entries need stronger assumptions (e.g. that the factors are random).

Pages 57-59:

LOWER BOUNDS

Not only is the tensor nuclear norm hard to compute, but…

Noisy tensor completion with m observations ⟷ Refute random 3-SAT with m clauses

(believed to be hard if m = n^{3/2−δ})

Page 60:

Part IV: Matrix completion revisited: connections to random CSPs

Page 61: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   Can  we  disMnguish  between  low-­‐rank  and  random?  

Page 62: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   

+    noise  

Can  we  disMnguish  between  low-­‐rank  and  random?  

aT  

a  

Case  #1:  Approximately  low-­‐rank  

Page 63: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   

Mi,j  =     random  ±1      w/  probability  ¼  

+    noise  

For  each  (i,j)        Ω  

aiaj                              w/  probability  ¾  

Can  we  disMnguish  between  low-­‐rank  and  random?  

where  each  ai  =  ±1      

aT  

a  

Case  #1:  Approximately  low-­‐rank  

Page 64: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   Can  we  disMnguish  between  low-­‐rank  and  random?  

Page 65: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   

,  Mi,j  =  random  ±1    For  each  (i,j)        Ω  

Can  we  disMnguish  between  low-­‐rank  and  random?  

Case  #2:  Random  

random  

Page 66: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   

,  Mi,j  =  random  ±1    For  each  (i,j)        Ω  

Can  we  disMnguish  between  low-­‐rank  and  random?  

Case  #2:  Random  

random  

In  Case  #1  the  entries  are  (somewhat)  predictable,  but  in    Case  #2  they  are  completely  unpredictable  
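A simple spectral statistic already separates the two cases: zero-fill the unobserved entries and look at the top singular value. A sketch (n, m, and the seed are hypothetical):

```python
import numpy as np

# Distinguishing sketch: observed entries either follow a planted rank-one
# pattern a a^T (Case #1, with probability 3/4 per entry) or are uniformly
# random +-1 (Case #2). The top singular value of the zero-filled observation
# matrix separates the two cases.
rng = np.random.default_rng(3)
n, m = 200, 8000
a = rng.choice([-1, 1], size=n)

def observed_matrix(planted):
    M = np.zeros((n, n))
    rows = rng.integers(0, n, size=m)     # duplicate samples simply overwrite
    cols = rng.integers(0, n, size=m)
    noise = rng.choice([-1, 1], size=m)
    if planted:
        keep = rng.random(m) < 0.75
        M[rows, cols] = np.where(keep, a[rows] * a[cols], noise)
    else:
        M[rows, cols] = noise
    return M

top_planted = np.linalg.norm(observed_matrix(True), 2)   # ~ (3/4) * density * n
top_random = np.linalg.norm(observed_matrix(False), 2)   # ~ 2 * sqrt(density * n)

assert top_planted > top_random
```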

Page 67: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   There  are  two  very  different  communiMes  that  (essenMally)  auacked  this  same  disMnguishing  problem:  

Page 68: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   There  are  two  very  different  communiMes  that  (essenMally)  auacked  this  same  disMnguishing  problem:  

The  community  working  on  matrix  comple;on  

Page 69: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   There  are  two  very  different  communiMes  that  (essenMally)  auacked  this  same  disMnguishing  problem:  

The  community  working  on  matrix  comple;on  The  community  working  on  refu;ng  random  CSPs  

Page 70: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

AN  INTERPRETATION  

(i1,  j1;  σ1),  (i2,  j2;  σ2),  …,  (im,  jm;  σm)    

We  can  interpret:  

±1  r.v.    as  a  random  2-­‐XOR  formula  ψ  

Page 71: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

AN  INTERPRETATION  

In  parMcular  each  observaMon/fctn  value  maps  to  a  clause:  

(i,  j,  σ)  

We  can  interpret:  

±1  r.v.    as  a  random  2-­‐XOR  formula  ψ  

(i1,  j1;  σ1),  (i2,  j2;  σ2),  …,  (im,  jm;  σm)    

vi      vj    =    σ  �  

variables    constraint  

Page 72: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

AN  INTERPRETATION  

(and  vice-­‐versa)  

In  parMcular  each  observaMon/fctn  value  maps  to  a  clause:  

(i,  j,  σ)   vi      vj    =    σ  �  

We  can  interpret:  

±1  r.v.    as  a  random  2-­‐XOR  formula  ψ  

(i1,  j1;  σ1),  (i2,  j2;  σ2),  …,  (im,  jm;  σm)    

variables    constraint  

Page 73: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

STRONG  REFUTATION  

We  will  say  that  an  algorithm  strongly  refutes*  random  2-­‐XOR  with  m  clauses  if:  

Page 74: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

STRONG  REFUTATION  

We  will  say  that  an  algorithm  strongly  refutes*  random  2-­‐XOR  with  m  clauses  if:  

(1)  On  any  2-­‐XOR  formula  ψ,  it  outputs  val  where:  

OPT(ψ)  ≤  val(ψ)  

Page 75: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

STRONG  REFUTATION  

We  will  say  that  an  algorithm  strongly  refutes*  random  2-­‐XOR  with  m  clauses  if:  

(1)  On  any  2-­‐XOR  formula  ψ,  it  outputs  val  where:  

OPT(ψ)  ≤  val(ψ)  

largest  fracMon  of  clauses  of  ψ  that  can  be  saMsfied    

Page 76: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

STRONG  REFUTATION  

We  will  say  that  an  algorithm  strongly  refutes*  random  2-­‐XOR  with  m  clauses  if:  

(1)  On  any  2-­‐XOR  formula  ψ,  it  outputs  val  where:  

OPT(ψ)  ≤  val(ψ)  

Page 77: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

STRONG  REFUTATION  

We  will  say  that  an  algorithm  strongly  refutes*  random  2-­‐XOR  with  m  clauses  if:  

(1)  On  any  2-­‐XOR  formula  ψ,  it  outputs  val  where:  

OPT(ψ)  ≤  val(ψ)  

(2)  With  high  probability  (for  random  ψ  with  m  clauses):  

val(ψ)  =     1  2   +  o(1)  

Page 78: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

Lemma:    If    (i1,  j1;  σ1),  …,  (im,  jm;  σm)     ψ   then        

σ    

i  

j  

A  2  OPT(ψ)  –   ≤  1  

n  1  m   2  

A  

Page 79: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

Lemma:    If    (i1,  j1;  σ1),  …,  (im,  jm;  σm)     ψ   then         j  

Proof:  Map  the  assignment  to  a  unit  vector  so  that  xi  =  ±1/√n  and  take  the  quadraMc  form  on  A  

2  OPT(ψ)  –   ≤  1  n  

1  m   2  

A  σ    

i  

A  

Page 80: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   

1  m   A   1  

mn  

Lemma:    If    (i1,  j1;  σ1),  …,  (im,  jm;  σm)     ψ   then    j  

Proof:  Map  the  assignment  to  a  unit  vector  so  that  xi  =  ±1/√n  and  take  the  quadraMc  form  on  A  

2  OPT(ψ)  –   ≤  1  n  

1  m   2  

A  σ    

i  

A  

Page 81: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   

1  m   A   1  

mn  OPT(ψ)  ≤  

1  2   +  o(1)  

Lemma:    If    (i1,  j1;  σ1),  …,  (im,  jm;  σm)     ψ   then    j  

Proof:  Map  the  assignment  to  a  unit  vector  so  that  xi  =  ±1/√n  and  take  the  quadraMc  form  on  A  

2  OPT(ψ)  –   ≤  1  n  

1  m   2  

A  σ    

i  

A  

m  =  ω(n)  

Page 82: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   

This  solves  the  strong  refutaMon  problem…    

Lemma:    If    (i1,  j1;  σ1),  …,  (im,  jm;  σm)     ψ   then    j  

Proof:  Map  the  assignment  to  a  unit  vector  so  that  xi  =  ±1/√n  and  take  the  quadraMc  form  on  A  

2  OPT(ψ)  –   ≤  1  n  

1  m   2  

A  σ    

i  

A  

1  m   A   1  

mn  OPT(ψ)  ≤  

1  2   +  o(1)  m  =  ω(n)  
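The lemma turns refutation into a single spectral-norm computation: build A from the clauses and output val(ψ) = 1/2 + (n/2m)·‖A‖. A sketch for random ψ (sizes and the seed are hypothetical):

```python
import numpy as np

# Spectral refutation sketch for random 2-XOR: put each clause v_i XOR v_j = sigma
# into a matrix A with A[i, j] = sigma (+-1), and certify, via the lemma,
#     OPT(psi) <= 1/2 + (n / 2m) * ||A||.
# For random psi with m = omega(n) clauses this certificate is 1/2 + o(1).
rng = np.random.default_rng(4)
n, m = 100, 20000
A = np.zeros((n, n))
i_idx = rng.integers(0, n, size=m)
j_idx = rng.integers(0, n, size=m)
sigma = rng.choice([-1, 1], size=m)
np.add.at(A, (i_idx, j_idx), sigma)   # clauses on the same pair accumulate

val = 0.5 + (n / (2 * m)) * np.linalg.norm(A, 2)

assert 0.5 < val < 0.7                # well below 1: psi is far from satisfiable
```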

Page 83: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   There  are  two  very  different  communiMes  that  (essenMally)  auacked  this  same  disMnguishing  problem:  

The  community  working  on  matrix  comple;on  The  community  working  on  refu;ng  random  CSPs  

Page 84: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   There  are  two  very  different  communiMes  that  (essenMally)  auacked  this  same  disMnguishing  problem:  

The  community  working  on  matrix  comple;on  

The  same  spectral  bound  implies:  

An  algorithm  for  strongly  refuMng  random  2-­‐XOR  (1)  

The  community  working  on  refu;ng  random  CSPs  

Page 85: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   There  are  two  very  different  communiMes  that  (essenMally)  auacked  this  same  disMnguishing  problem:  

The  community  working  on  matrix  comple;on  

The  same  spectral  bound  implies:  

An  algorithm  for  strongly  refuMng  random  2-­‐XOR  An  algorithm  for  the  disMnguishing  problem  

(1)  

(2)  

The  community  working  on  refu;ng  random  CSPs  

Page 86: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   There  are  two  very  different  communiMes  that  (essenMally)  auacked  this  same  disMnguishing  problem:  

The  community  working  on  matrix  comple;on  

The  same  spectral  bound  implies:  

An  algorithm  for  strongly  refuMng  random  2-­‐XOR  An  algorithm  for  the  disMnguishing  problem  

(1)  

(2)  

The  community  working  on  refu;ng  random  CSPs  

GeneralizaMon  bounds  for  the  nuclear  norm  (3)  

Page 87: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   It  also  yields  bounds  on  how  well  the  soluMon  to  the  convex  program  generalizes  [Srebro,  Shraibman]  …  

min      X          s.t.    *  

(i,j)      Ω  

|Xi,j–Mi,j|  ≤  η  1  |Ω|  

Page 88: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   It  also  yields  bounds  on  how  well  the  soluMon  to  the  convex  program  generalizes  [Srebro,  Shraibman]  …  

min      X          s.t.    *  

(i,j)      Ω  

|Xi,j–Mi,j|  ≤  η  1  |Ω|  

An  approach  through  sta;s;cal  learning  theory:    

Page 89: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   It  also  yields  bounds  on  how  well  the  soluMon  to  the  convex  program  generalizes  [Srebro,  Shraibman]  …  

min      X          s.t.    *  

(i,j)      Ω  

|Xi,j–Mi,j|  ≤  η  1  |Ω|  

An  approach  through  sta;s;cal  learning  theory:    

empirical  error:  (i,j)      Ω  

|Xi,j–Mi,j|  1  |Ω|  

Page 90: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   It  also  yields  bounds  on  how  well  the  soluMon  to  the  convex  program  generalizes  [Srebro,  Shraibman]  …  

min      X          s.t.    *  

(i,j)      Ω  

|Xi,j–Mi,j|  ≤  η  1  |Ω|  

An  approach  through  sta;s;cal  learning  theory:    

empirical  error:  (i,j)      Ω  

|Xi,j–Mi,j|  1  |Ω|  

(≤  η)  

Page 91: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   It  also  yields  bounds  on  how  well  the  soluMon  to  the  convex  program  generalizes  [Srebro,  Shraibman]  …  

min      X          s.t.    *  

(i,j)      Ω  

|Xi,j–Mi,j|  ≤  η  1  |Ω|  

An  approach  through  sta;s;cal  learning  theory:    

empirical  error:  

predic;on  error:  

(i,j)      Ω  

|Xi,j–Mi,j|  1  |Ω|  

|Xi,j–Mi,j|  1  

(≤  η)  

n2  

Page 92: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   It  also  yields  bounds  on  how  well  the  soluMon  to  the  convex  program  generalizes  [Srebro,  Shraibman]  …  

Then  if  we  let  

K =  {  X  s.t.   *  X   ≤  1  }    =   }  abT  s.t.  conv  {   a   b   ≤    ,   1  

Page 93: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   It  also  yields  bounds  on  how  well  the  soluMon  to  the  convex  program  generalizes  [Srebro,  Shraibman]  …  

Then  if  we  let  

K =  {  X  s.t.   *  X   ≤  1  }    =   }  abT  s.t.  conv  {   a   b   ≤    ,   1  

sup  X      K

–  

generaliza;on  error:  

empirical  error  (X)   predic;on  error  (X)  (on  Ω)    

Page 94: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   It  also  yields  bounds  on  how  well  the  soluMon  to  the  convex  program  generalizes  [Srebro,  Shraibman]  …  

Then  if  we  let  

K =  {  X  s.t.   *  X   ≤  1  }    =   }  abT  s.t.  conv  {   a   b   ≤    ,   1  

generalizaMon  error    ≤  best  agreement  with  random  funcMon  Theorem:  “   ”  

(on  Ω)    

sup  X      K

–  

generaliza;on  error:  

empirical  error  (X)   predic;on  error  (X)  (on  Ω)    

Page 95: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   It  also  yields  bounds  on  how  well  the  soluMon  to  the  convex  program  generalizes  [Srebro,  Shraibman]  …  

Then  if  we  let  

K =  {  X  s.t.   *  X   ≤  1  }    =   }  abT  s.t.  conv  {   a   b   ≤    ,   1  

generalizaMon  error    ≤  best  agreement  with  random  funcMon  Theorem:  “   ”  

(on  Ω)    

Rademacher  complexity  

sup  X      K

–  

generaliza;on  error:  

empirical  error  (X)   predic;on  error  (X)  (on  Ω)    

Page 96: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   More  precisely:  

sup  X      K a  =  1  

X  ia,ja  σa  1  m  

Rademacher  complexity  (Rm(K))    

It  also  yields  bounds  on  how  well  the  soluMon  to  the  convex  program  generalizes  [Srebro,  Shraibman]  …  

Page 97: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   More  precisely:  

sup  X      K a  =  1  

X  ia,ja  σa  1  m  

Rademacher  complexity  (Rm(K))    

=   1  m   A  

It  also  yields  bounds  on  how  well  the  soluMon  to  the  convex  program  generalizes  [Srebro,  Shraibman]  …  

Page 98: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   More  precisely:  

sup  X      K a  =  1  

X  ia,ja  σa  1  m  

Rademacher  complexity  (Rm(K))    

=   1  m   A  

It  also  yields  bounds  on  how  well  the  soluMon  to  the  convex  program  generalizes  [Srebro,  Shraibman]  …  

1  m   A   1  

mn  

Page 99: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   More  precisely:  

sup  X      K a  =  1  

X  ia,ja  σa  1  m  

Rademacher  complexity  (Rm(K))    

=   1  m   A  

Rm(K)  =  o(      )      1  n  

It  also  yields  bounds  on  how  well  the  soluMon  to  the  convex  program  generalizes  [Srebro,  Shraibman]  …  

1  m   A   1  

mn  m  =  ω(n)  

Page 100: BEYONDMATRIXCOMPLETION& - People | MIT CSAIL

   There  are  two  very  different  communiMes  that  (essenMally)  auacked  this  same  disMnguishing  problem:  

The  community  working  on  matrix  comple;on  The  community  working  on  refu;ng  random  CSPs  

Noisy  matrix  compleMon  with  m  observaMons  

Strongly  refute*  random    2-­‐XOR/2-­‐SAT  with  m  clauses  

*Want  an  algorithm  that  cerMfies  a  formula  is  far  from  saMsfiable  

Rademacher  Complexity  

Page 102:

There are two very different communities that (essentially) attacked this same distinguishing problem:

The community working on matrix completion | The community working on refuting random CSPs

Noisy tensor completion with m observations ↔ Strongly refute* random 3-XOR/3-SAT with m clauses [Coja-Oghlan, Goerdt, Lanka]

(via Rademacher complexity)

*Want an algorithm that certifies a formula is far from satisfiable

Page 106:

[Coja-Oghlan, Goerdt, Lanka]: Reduce 3-XOR to 4-XOR

Two 3-XOR clauses sharing a variable v_i (variables in ±1, so XOR is multiplication),

(v_i · v_j · v_k = σ)  and  (v_i · v_j′ · v_k′ = σ′),

multiply to give the 4-XOR clause

(v_j · v_k · v_j′ · v_k′ = σ σ′)

This yields n⁵p² = n² log^O(1) n clauses

Warning: The 4-XOR clauses are not independent!
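The clause-combining step above can be sketched directly. A toy version (instance sizes and helper names are illustrative, not from the talk): pair up 3-XOR clauses that share their first variable; the shared v_i cancels, leaving a 4-XOR clause with sign σσ′.

```python
import itertools
import random

# Sketch of the 3-XOR -> 4-XOR combining step: (v_i*v_j*v_k = s) and
# (v_i*v_j2*v_k2 = s2) multiply to (v_j*v_k*v_j2*v_k2 = s*s2), since
# v_i * v_i = 1 over +-1 variables.
random.seed(0)

def random_3xor(n, m):
    # clause = (i, j, k, sigma) with sigma in {+1, -1}
    return [(random.randrange(n), random.randrange(n), random.randrange(n),
             random.choice([1, -1])) for _ in range(m)]

def to_4xor(clauses):
    by_first = {}
    for (i, j, k, s) in clauses:
        by_first.setdefault(i, []).append((j, k, s))
    derived = []
    for group in by_first.values():
        for (j, k, s), (j2, k2, s2) in itertools.combinations(group, 2):
            derived.append((j, k, j2, k2, s * s2))   # v_i cancels
    return derived

fours = to_4xor(random_3xor(50, 400))
print(len(fours), "derived 4-XOR clauses (note: they are NOT independent)")
```

A planted assignment that satisfies all the 3-XOR clauses automatically satisfies every derived 4-XOR clause, which is exactly why the reduction preserves satisfiability.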

Page 111:

[Coja-Oghlan, Goerdt, Lanka]: Reduce 3-XOR to 4-XOR

(v_i · v_j · v_k = σ) and (v_i · v_j′ · v_k′ = σ′) give (v_j · v_k · v_j′ · v_k′ = σ σ′)

Form the matrix A with rows indexed by pairs (j, j′), columns indexed by pairs (k, k′), and entry σ σ′ for each derived 4-XOR clause.

Hence the paired variables for the rows (and columns) come from different clauses!

Bound ‖A‖ via the trace method.
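A hedged sketch of this certificate matrix (sizes are illustrative, and the clause generation below is a random stand-in, not the actual reduction output): index rows by pairs (j, j′), columns by pairs (k, k′), and add σσ′ per derived clause. For any assignment v, the 4-XOR advantage is xᵀAx with x = v ⊗ v, which is at most n²‖A‖.

```python
import numpy as np

# Certificate matrix A for the derived 4-XOR clauses: entry s*s2 at row
# (j, j2), column (k, k2). Then sum_clauses s*s2*v_j*v_j2*v_k*v_k2 = x^T A x
# with x = kron(v, v), and |x^T A x| <= ||x||^2 * ||A||_op = n^2 * ||A||_op.
n = 20
rng = np.random.default_rng(2)

A = np.zeros((n * n, n * n))
for _ in range(2000):
    j, j2, k, k2 = rng.integers(0, n, size=4)
    A[j * n + j2, k * n + k2] += rng.choice([-1.0, 1.0])

v = rng.choice([-1.0, 1.0], size=n)
x = np.kron(v, v)                 # x[(j, j2)] = v[j] * v[j2]
advantage = x @ A @ x
print("advantage:", advantage, " certified bound:", n * n * np.linalg.norm(A, 2))
```

The naive spectral bound above ignores that the entries of A are dependent; that is where the trace method on the slide comes in.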

Page 113:

There are two very different communities that (essentially) attacked this same distinguishing problem:

The community working on matrix completion | The community working on refuting random CSPs

Noisy tensor completion with m observations ↔ Strongly refute* random 3-XOR/3-SAT with m clauses [Coja-Oghlan, Goerdt, Lanka]

(via embedding in SOS)

*Want an algorithm that certifies a formula is far from satisfiable

We then embed this algorithm into the sixth level of the sum-of-squares hierarchy, to get a relaxation for tensor prediction

Page 115:

GENERALIZATION BOUNDS

Suppose we are given |Ω| = m noisy observations T_{i,j,k} ± η, and the factors of T are C-incoherent.

Theorem: There is an efficient algorithm that, with probability 1−δ, outputs X with

(1/n³) Σ_{i,j,k} |X_{i,j,k} − T_{i,j,k}| ≤ 2C³r √(n^{3/2} log⁴n / m) + C³r √(ln(2/δ) / m) + 2η

This comes from giving an efficiently computable norm ‖·‖_K whose Rademacher complexity is asymptotically smaller than the trivial bound whenever m = Ω(n^{3/2} log⁴n)
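To get a feel for the rates, the right-hand side of the slide's bound can be evaluated numerically. A sketch (the exact grouping of the bound is reconstructed from the slide fragments, and the constants C, r, η, δ below are made up for the example):

```python
import math

# Illustrative evaluation of the reconstructed error bound:
#   err <= 2*C^3*r*sqrt(n^{3/2} log^4 n / m)
#        + C^3*r*sqrt(ln(2/delta) / m) + 2*eta
def error_bound(n, r, m, C=2.0, eta=0.01, delta=0.05):
    main = 2 * C**3 * r * math.sqrt(n**1.5 * math.log(n)**4 / m)
    conc = C**3 * r * math.sqrt(math.log(2 / delta) / m)
    return main + conc + 2 * eta

n, r = 1000, 3
for mult in [1, 10, 100]:
    m = int(mult * n**1.5 * math.log(n)**4)   # the m = n^{3/2} polylog(n) regime
    print(f"m = {m:12d}: bound = {error_bound(n, r, m):.4f}")
```

As m grows past n^{3/2} polylog(n) the two sampling terms vanish and the bound bottoms out at the noise floor 2η, matching the theorem's message.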

Page 118:

SUMMARY

New Algorithm: We gave an algorithm for 3rd-order tensor prediction that uses m = n^{3/2} r log⁴n observations

An Inefficient Algorithm: There is an inefficient algorithm that uses m = nr log n observations (via the tensor nuclear norm)

A Phase Transition: Even for n^δ rounds of the powerful sum-of-squares hierarchy, no norm solves tensor prediction with m = n^{3/2−δ} r observations

Page 119:

Epilogue:

New directions in computational vs. statistical tradeoffs

Page 123:

DISCUSSION

Convex programs are unreasonably effective for linear inverse problems!

But we gave simple linear inverse problems that exhibit striking gaps between efficient and inefficient estimators

Where else are there computational vs. statistical tradeoffs?

New Direction: Explore computational vs. statistical tradeoffs through the powerful sum-of-squares hierarchy