
Page 1: MACH: Fast Randomized Tensor Decompositions

Charalampos (Babis) E. Tsourakakis


SIAM Data Mining Conference, April 30th, 2010

Page 2: MACH: Fast Randomized Tensor Decompositions

- Introduction: Why Tensors? Tensor Decompositions
- Our Motivation
- Proposed Method
- Experimental Results: Case study I (Intemon), Case study II (Intel Berkeley Lab)
- Conclusion

Page 3: MACH: Fast Randomized Tensor Decompositions

[Figure: Intel Berkeley lab sensor time series: Temperature, Light, Voltage, and Humidity (value vs. time in minutes).]

Page 4: MACH: Fast Randomized Tensor Decompositions

[Figure: the data cube, with time and location axes.]

The data are modeled as a tensor, i.e., a multidimensional matrix, of size T x (#sensors) x (#types of measurements).

Observation: multi-aspect data can be modeled in this way.

The three modes: time mode, sensor mode, measurement type mode.
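For concreteness, here is a minimal NumPy sketch of this 3-mode layout (the dimensions and the sample value are illustrative, not taken from the talk's datasets):

```python
import numpy as np

# timeticks x sensors x measurement types
T, n_sensors, n_types = 10080, 100, 12
X = np.zeros((T, n_sensors, n_types))

# X[t, s, m] holds the value of measurement type m at sensor s, timetick t
X[0, 3, 1] = 22.5   # e.g., one hypothetical temperature reading
```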

Page 5: MACH: Fast Randomized Tensor Decompositions


A 5-mode tensor: voxels x subjects x trials x task conditions x timeticks

Functional Magnetic Resonance Imaging (fMRI)

Tensors naturally model numerous real-world datasets. And now what?

Page 6: MACH: Fast Randomized Tensor Decompositions

  Introduction   Why  Tensors?    Tensor  Decompositions  

  Our  Motivation    Proposed  Method    Experimental  Results  

  Case  study  I:    Intemon      Case  study  II:  Intel  Berkeley  Lab  

  Conclusion  MACH: Fast Randomized Tensor Decompositions, SDM 2010 6

Page 7: MACH: Fast Randomized Tensor Decompositions


A (m x n) = σ1 u1 v1^T + σ2 u2 v2^T + σ3 u3 v3^T + ...

Singular value decomposition (SVD): the "Swiss army knife" of matrix decompositions (O'Leary).
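As a quick illustration (not from the talk), the rank-k truncation that keeps only the top-k (σi, ui, vi) triplets can be computed with NumPy; by the Eckart-Young theorem this truncation is the best rank-k approximation in the Frobenius norm:

```python
import numpy as np

A = np.random.rand(100, 50)                    # any m x n matrix
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 3                                          # keep the top-k components
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]    # σ1 u1 v1^T + ... + σk uk vk^T
```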

Page 8: MACH: Fast Randomized Tensor Decompositions

[Figure: SVD of a document-to-term matrix: (documents x terms) = (documents to document hidden concepts) x (strength of each concept) x (term hidden concepts to terms). Example terms: data, graph, java, brain, lung; hidden concepts: CS, MD.]

Page 9: MACH: Fast Randomized Tensor Decompositions

Two families of algorithms extend SVD to the multilinear setting: PARAFAC/CANDECOMP decompositions and the Tucker decomposition.


Kolda & Bader, Tensor Decompositions and Applications, SIAM Review.

Page 10: MACH: Fast Randomized Tensor Decompositions


Tucker is an SVD-like decomposition of a tensor, with one projection matrix per mode and a core tensor:

X ≈ G x1 U1 x2 U2 x3 U3   (core tensor G, one projection matrix Ui per mode)

J. Sun showed that Tucker decompositions can be used to extract useful knowledge from monitoring systems.
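To make the definition concrete, here is a minimal HOSVD-style sketch in Python/NumPy (a simplified illustration, not the implementation used in the paper; `ranks` is the target rank per mode):

```python
import numpy as np

def unfold(X, mode):
    """Matricize tensor X along the given mode."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def hosvd(X, ranks):
    """Truncated higher-order SVD: one projection matrix per mode plus
    a core tensor G, so that X ~ G x1 U[0] x2 U[1] ... xN U[N-1]."""
    U = []
    for mode, r in enumerate(ranks):
        # leading r left singular vectors of the mode-n unfolding
        u, _, _ = np.linalg.svd(unfold(X, mode), full_matrices=False)
        U.append(u[:, :r])
    G = X
    for mode, u in enumerate(U):
        # contract mode `mode` of the core with u^T, keeping the axis order
        G = np.moveaxis(np.tensordot(u.T, G, axes=(1, mode)), 0, mode)
    return G, U

# Example: compress a random 10 x 8 x 6 tensor to a 3 x 3 x 3 core
G, U = hosvd(np.random.rand(10, 8, 6), ranks=(3, 3, 3))
```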

Page 11: MACH: Fast Randomized Tensor Decompositions

- Introduction: Why Tensors? Tensor Decompositions
- Our Motivation
- Proposed Method
- Experimental Results: Case study I (Intemon), Case study II (Intel Berkeley Lab)
- Conclusion

Page 12: MACH: Fast Randomized Tensor Decompositions

Most real-world processes result in sparse tensors.

However, there exist important processes that result in dense tensors:


Physical process                                           | Non-zero entries
-----------------------------------------------------------|-----------------
Sensor network (sensor x measurement type x timeticks)     | 85%
Computer network (machine x measurement type x timeticks)  | 81%

Page 13: MACH: Fast Randomized Tensor Decompositions

It can be either very slow or impossible to perform a Tucker decomposition on a dense tensor due to memory constraints.


Given that (low-rank) Tucker decompositions are valuable in practice, can we "trade" a "little bit" of accuracy for efficiency?

Page 14: MACH: Fast Randomized Tensor Decompositions

- Introduction: Why Tensors? Tensor Decompositions
- Our Motivation
- Proposed Method
- Experimental Results: Case study I (Intemon), Case study II (Intel Berkeley Lab)
- Conclusion

Page 15: MACH: Fast Randomized Tensor Decompositions


MACH extends the work of Achlioptas and McSherry on fast low-rank matrix approximation (STOC 2001) to the multilinear setting.

Page 16: MACH: Fast Randomized Tensor Decompositions

- Toss a coin for each non-zero entry: with probability p the entry "survives".
- If an entry survives, reweigh it by 1/p; if not, set it to zero!
- Perform Tucker on the sparsified tensor! For the theoretical results and more details, see the MACH paper. A sketch of the sparsification step follows.
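A minimal sketch of this sparsification step in Python/NumPy (an illustration of the idea, assuming a dense NumPy tensor; the function name is illustrative, not from the paper's code):

```python
import numpy as np

def mach_sparsify(X, p, rng=None):
    """Keep each non-zero entry of X with probability p and reweigh
    survivors by 1/p, so the result is an unbiased estimator of X."""
    rng = np.random.default_rng() if rng is None else rng
    survives = rng.random(X.shape) < p          # one coin per entry
    return np.where(survives & (X != 0), X / p, 0.0)

# Sparsify first, then run the (much cheaper) Tucker/HOSVD on the result
X_sparse = mach_sparsify(np.random.rand(100, 12, 50), p=0.1)
```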


Page 17: MACH: Fast Randomized Tensor Decompositions

- Introduction: Why Tensors? Tensor Decompositions
- Our Motivation
- Proposed Method
- Experimental Results: Case study I (Intemon), Case study II (Intel Berkeley Lab)
- Conclusion

Page 18: MACH: Fast Randomized Tensor Decompositions

Intemon: a prototype monitoring and mining system for data centers, developed at Carnegie Mellon University.

Tensor X: 100 machines x 12 types of measurement x 10080 timeticks.

Page 19: MACH: Fast Randomized Tensor Decompositions


For p = 0.1, Pearson's correlation coefficient between the MACH and the exact component is 0.99 (ideal: ρ = 1).
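For reference, this comparison metric can be computed directly (a small sketch with placeholder vectors standing in for the exact and the MACH factor columns):

```python
import numpy as np

exact = np.random.rand(10080)                    # placeholder exact component
approx = exact + 0.01 * np.random.randn(10080)   # placeholder MACH component
rho = np.corrcoef(exact, approx)[0, 1]           # ideally close to 1
```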

Page 20: MACH: Fast Randomized Tensor Decompositions


[Figure: exact vs. MACH components, side by side. Find the differences!]

The qualitative analysis, which is what matters for our goals, remains the same!

Page 21: MACH: Fast Randomized Tensor Decompositions

- Introduction: Why Tensors? Tensor Decompositions
- Our Motivation
- Proposed Method
- Experimental Results: Case study I (Intemon), Case study II (Intel Berkeley Lab)
- Conclusion

Page 22: MACH: Fast Randomized Tensor Decompositions

Intel Berkeley Lab.

Tensor X: 54 sensors x 4 types of measurement x 5385 timeticks.

Page 23: MACH: Fast Randomized Tensor Decompositions


The qualitative analysis, which is what matters for our goals, remains the same!

Page 24: MACH: Fast Randomized Tensor Decompositions


The spatial principal mode is also preserved, and Pearson's correlation coefficient is again almost 1!

[Figure: exact vs. MACH spatial components, side by side.]

Page 25: MACH: Fast Randomized Tensor Decompositions


Remarks: (1) Daily periodicity is apparent. (2) Pearson's correlation coefficient with the exact component is 0.99.

Page 26: MACH: Fast Randomized Tensor Decompositions

- Introduction: Why Tensors? Tensor Decompositions
- Our Motivation
- Proposed Method
- Experimental Results: Case study I (Intemon), Case study II (Intel Berkeley Lab)
- Conclusion

Page 27: MACH: Fast Randomized Tensor Decompositions

- Randomized algorithms for tensors.
- The smallest p* for tensor sparsification for the HOOI algorithm.
- Randomized algorithms work very well (e.g., sublinear-time algorithms), but are typically hard to analyze.



Page 30: MACH: Fast Randomized Tensor Decompositions


Remark: even though our theoretical results refer to HOSVD, MACH also works with HOOI.

Page 31: MACH: Fast Randomized Tensor Decompositions


Canonical decomposition (CANDECOMP/PARAFAC) and Tucker decomposition.