Matrix Completion from Fewer Entriesdimacs.rutgers.edu/.../UnifyingTheory/Slides/montanari.pdf · 2009. 4. 9. · r/ . with probability larger than 1−exp(−Bn). Theorem (Keshavan,

Matrix Completion from Fewer Entries

Raghunandan Keshavan, Andrea Montanari and Sewoong Oh

Stanford University

March 30, 2009

Raghunandan Keshavan, Andrea Montanari and Sewoong Oh Matrix Completion

Outline

1 The problem, a look at the data, and some results (slides)

2 Proofs (blackboard)

arXiv:0901.3150


The problem, a look at the data, and some results


Netflix dataset: A big (!) matrix

3

13

4

1

1

5

4 4

4

4

4

4

4

4

4

4

4

4

4

4

3 3

3

3

1

1

1

11

5

5

5

2

2

2

2

5 · 105 users

2·10

4m

ovies

108 ratings

M =


A big (!) matrix

?

13

4

1

1

5

4 4

4

4

4

4

4

4

4

4

4

4

4

4

3 3

3

3

1

1

1

11

5

5

5

2

2

2

2

3

?

?

?

?

5 · 105 users

2·10

4m

ovies

106 queries

M =


You get a prize if. . .

RMSE < 0.8563 ;−)

Is this possible?



RMSE < 0.8563 ;−)

Is this possible?



RMSE < 0.8563 ;−)

Is this possible?


A model: Incoherent low-rank matrices


The observations

24312412365126251454231542321542143214324135124424225534242444245231552162561272662262626711515252241

13421532161432614361436514327147171542154437171521726547152481582524858141258141841852423233334448148

243124123651262514542315423215421432143241351244233232121444225552315521625612726622626267115152522412431241236512625145423154232154214321432413512442422555231552162561272662262141241221262671151525224141315426514236152461547‘614542422471‘6567157157‘65147‘615241543154311315464566366531253151353116‘161466243124123651262514542315423215421432143241351244242255523155216256127266226262671151434343432252522412431241236512625145423154232154214321432413512442422555231552162561272662262345235256662671151525224141315426514236152461547‘614542422471‘6567157157‘65147‘615241543154311315312345334646653151353116‘161466243124123651262514542315423215421432143241351244242255523155213434666635625612726622626267115152522412431241236512625145423154232154214321432413512442422343454345355523155216256127266226262671151525224141315426514236152461547‘614542422471‘6567157157‘65147‘353534543361524154315431131531253151353116‘161466243124123651262514542315423215421432143434534534524135124424225552315521625612726622626267115152522412431241236512625145423154232154214324534535455143241351244242255542315521625612726622626267115152524141315426514236152461547‘614542422471‘346567157157‘65147‘61524154315431131531253151353116‘16146453454356243124123651262514542315423215421432143241351331331111244242255523155216254612726622626267115152522412431241236512625145423154232154214333421123332143241351244242255523155216256127266226262671151525224124312412365126251454231542321542143214324135124424225552315521625612721231‘1313266226262671151525224141315426514236152461547‘614542422471‘6567157157‘65147‘615241543154311131232333311531253151353116‘1614662431241236512625145423154232154214321432413512442422555231552162561272662262626711515252244374474744141315426514236152461547‘614542422471‘6567157157‘65147‘615241543154311343344453551531253151353116‘161466143265421542715765127651543151221652465236125436541625143615243162534535666461r526146341645264616161141315426514236152461547‘614542422471‘6567157157‘65147‘615241543154366363443135131531253151353116‘16146641315426514236152461547‘614542422471‘6567157157‘65147‘615241543144444345554431131531253151353116‘161466

41315426514236152461547‘614542422471‘6567157157‘65147‘615241545345346664315431131531253151353116‘161466

2431241236512625145423154232154214321432413512442422555231552162544634646666127266226262671151525224124312412365126251454231542321542143214324135124424225552315521625446436666661272662262626711515252241

24312412365126251454231542321542143214324135124424225552315521625464363423361272662262626711515252241

41315426514236152461547‘614542422471‘6567157157‘65147‘615241543135353453445431131531253151353116‘16146624312412365126251454231542321542143214324135124424225552315521624534466433561272662262626711515252241

2431241236512625145423154232154234343444551432143241351244242255523155216256127266226262671151525224124312412365126251454231542434444453332154214321432413512442422555231552162561272662262626711515252241

243124123651262514542315423215421214324135124424225552315521625612726622626267132424425152522333333415125125653426356254412545346532671735712351663571237213533333333172671238127638172681871881

nα users

nm

ovies

M =


The observations

3

13

4

1

1

5

4 4

4

4

4

4

4

4

4

4

4

4

4

4

3 3

3

3

1

1

1

11

5

5

5

2

2

2

2

nα users

nm

ovies

nε unif. random positions

ME =


You need some structure!

nα

rr

nM = U

VT

r n


You need some structure!

nα

rr

nM = U

VT

r n


Unstructured factors

A1. Bounded entries

|Mia| ≤ Mmax = µ0

√r .

A2. Incoherence

r∑k=1

U2ik ≤ µ1 r ,

r∑k=1

V2ak ≤ µ1 r .

[Candes, Recht 2008]


Metric (RMSE)

D(M, M) ≡

1

n2M2max

∑i ,a

|Mia − Mia|2

1/2


Previous work

Theorem (Candes, Recht, 2008)

If

ε ≥ C r n1/5 log n

then whp

1. M is unique given the observed entries.

2. M is the unique minimum of a SDP.

cf. also [Recht, Fazel, Parrilo 2007]


Previous work


If


then whp





Previous work


If


then whp





Previous work


If


then whp





Previous work


If


then whp





Great, but. . .

1. n1/5 observations for 1 bit of information?

2. RMSE = 0?

3. SDP = O(n4...6). Substitute n = 105. . .


Great, but. . .


2. RMSE = 0?



Great, but. . .


2. RMSE = 0?



Great, but. . .


2. RMSE = 0?



O(n) entries are enough (practice)


A movie


Rank = 1: Bayes optimal vs. Belief Propagation

0

0.2

0.4

0.6

0.8

1

0 2 4 6 8 10 12 14

n=100n=1000

n=10000

ε

D


Rank = 2: Belief Propagation

0

0.2

0.4

0.6

0.8

1

1.2

1.4

0 2 4 6 8 10 12

n=100n=1000

n=10000

ε

D



0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

0 2 4 6 8 10 12 14 16 18

n=100n=1000

n=10000

ε

D



0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

2

2.2

0 5 10 15 20

n=100n=1000

n=10000

ε

D


O(n) entries are enough (theory)


Naive spectral algorithm

MEia =

Mia if (i , a) ∈ E ,0 otherwise.

Projection

ME =n∑

i=1

σixiyTi , σ1 ≥ σ2 ≥ . . .

Tr (ME ) =n√

α

ε

r∑i=1

σixiyTi .


Naive spectral algorithm

MEia =

Mia if (i , a) ∈ E ,0 otherwise.

Projection

ME =n∑

i=1

σixiyTi , σ1 ≥ σ2 ≥ . . .

Tr (ME ) =n√

α

ε

r∑i=1

σixiyTi .


Troubles and solution

If ε = O(1), ‘spurious’ singular values Ω(√

log n/(log log n)).

Trimming

MEia =

ME

ia if deg(i) ≤ 2 Edeg(i), deg(a) ≤ 2 Edeg(a) ,0 otherwise.


Troubles and solution

If ε = O(1), ‘spurious’ singular values Ω(√

log n/(log log n)).

Trimming

MEia =

ME

ia if deg(i) ≤ 2 Edeg(i), deg(a) ≤ 2 Edeg(a) ,0 otherwise.


Not-as-naive spectral algorithm

Spectral Matrix Completion( matrix ME )

1: Trim ME , and let ME be the output;

2: Project ME to Tr (ME );

3: Clean residual errors by gradient descent in the factors.


Theorem (Keshavan, M, Oh, 2009)

Assume r ≤ n1/2 and bounded entries. Then

1

nMmax||M− Tr (M

E )||F = RMSE ≤ C√

r/ε.

with probability larger than 1− exp(−Bn).


Assune r = O(1), bounded entries and incoherent factors, withΣmin, Σmax uniformly bounded away from 0 and ∞.

If ε ≥ C ′ log n thenSpectral Matrix Completion returns, whp, the matrix M.



Assume r ≤ n1/2 and bounded entries. Then

1

nMmax||M− Tr (M

E )||F = RMSE ≤ C√

r/ε.

with probability larger than 1− exp(−Bn).


Assune r = O(1), bounded entries and incoherent factors, withΣmin, Σmax uniformly bounded away from 0 and ∞.

If ε ≥ C ′ log n thenSpectral Matrix Completion returns, whp, the matrix M.


A comparison

Theorem (Achlioptas, McSherry 2007)

Assume ε ≥ (8 log n)4 and bounded entries. Then

1

nMmax||M− Tr (M

E )||F = RMSE ≤ 4√

r/ε.

with probability larger than 1− exp(−19(log n)4).

(For n = 106, (8 log n)4 ≈ 1.5 · 108)


A comparison

Theorem (Achlioptas, McSherry 2007)

Assume ε ≥ (8 log n)4 and bounded entries. Then

1

nMmax||M− Tr (M

E )||F = RMSE ≤ 4√

r/ε.

with probability larger than 1− exp(−19(log n)4).

(For n = 106, (8 log n)4 ≈ 1.5 · 108)


One more comparison

Theorem (Candes, Tao, March 8, 2009)

Assume bounded entries and strongly incoherent factorsIf ε ≥ C r (log n)6 thenSemidefinite Programming returns, whp, the matrix M.

A2’. Strong incoherence

r∑k=1

U2ik ≤ µ1 r ,

∣∣∣∣∣r∑

k=1

UikUjk

∣∣∣∣∣ ≤ µ1

√r ,


One more comparison




r∑k=1

U2ik ≤ µ1 r ,

∣∣∣∣∣r∑

k=1

UikUjk

∣∣∣∣∣ ≤ µ1

√r ,


One more comparison




r∑k=1

U2ik ≤ µ1 r ,

∣∣∣∣∣r∑

k=1

UikUjk

∣∣∣∣∣ ≤ µ1

√r ,


Our approach: Graph theory

movies

users

1

1

nα

n

i

a

(i , a) ∈ E ⇔ User a rated movie i .


Our approach: Graph theory

movies

users

1

1

nα

n

i

a

(i , a) ∈ E ⇔ User a rated movie i .


Back to the data


Random r = 4, n = 10000, ε = 12.5

0 10 20 30 40 50 60 70 800

10

20

30

40

σ1σ2σ3σ4


Netflix data (trimmed)

0 20 40 60 80 100 120 1400

20

40

60

80

100

120

140


Is Neflix a random low-rank matrix?

Compare for coordinate descent (SimonFunk).


Rank = 3

0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

2

0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000

LowrankNetflix Data

Random Data U[-1 1]

0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

2

0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000

LowrankNetflix Data

Random Data U[-1 1]

D

steps steps

fit error pred. error


Rank = 4

0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

2

0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000

LowrankNetflix Data

Random Data U[-1 1]

0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

2

0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000

LowrankNetflix Data

Random Data U[-1 1]

D

steps steps



Rank = 5

0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

2

0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000

LowrankNetflix Data

Random Data U[-1 1]

0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

2

0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000

LowrankNetflix Data

Random Data U[-1 1]

D

steps steps



Proofs (blackboard)


Documents

Matrix Completion from Fewer Entriesdimacs.rutgers.edu/.../UnifyingTheory/Slides/montanari.pdf · 2009. 4. 9. · r/ . with probability larger than 1−exp(−Bn). Theorem (Keshavan,