
Page 1: MACH: Fast Randomized Tensor Decompositions

Charalampos (Babis) E. Tsourakakis


SIAM Data Mining Conference, April 30th, 2010

Page 2: MACH: Fast Randomized Tensor Decompositions

- Introduction: Why Tensors? Tensor Decompositions
- Our Motivation
- Proposed Method
- Experimental Results: Case study I (Intemon), Case study II (Intel Berkeley Lab)
- Conclusion

Page 3: MACH: Fast Randomized Tensor Decompositions

[Figure: Intel Berkeley lab sensor time series: Temperature, Light, Voltage, and Humidity (value vs. time in minutes).]

Page 4: MACH: Fast Randomized Tensor Decompositions

[Figure: the data cube, with time and location axes.]

The data are modeled as a tensor, i.e., a multidimensional matrix, of size T x (#sensors) x (#types of measurements).

Observation: multi-aspect data can be modeled in this way.

The three modes: time mode, sensor mode, measurement type mode.
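For concreteness, here is a minimal NumPy sketch of this 3-mode layout (the dimensions and the sample value are illustrative, not taken from the talk's datasets):

```python
import numpy as np

# timeticks x sensors x measurement types
T, n_sensors, n_types = 10080, 100, 12
X = np.zeros((T, n_sensors, n_types))

# X[t, s, m] holds the value of measurement type m at sensor s, timetick t
X[0, 3, 1] = 22.5   # e.g., one hypothetical temperature reading
```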

Page 5: MACH: Fast Randomized Tensor Decompositions


A 5-mode tensor: voxels x subjects x trials x task conditions x timeticks

Functional Magnetic Resonance Imaging (fMRI)

Tensors naturally model numerous real-world datasets. And now what?

Page 6: MACH: Fast Randomized Tensor Decompositions

  Introduction   Why  Tensors?    Tensor  Decompositions  

  Our  Motivation    Proposed  Method    Experimental  Results  

  Case  study  I:    Intemon      Case  study  II:  Intel  Berkeley  Lab  

  Conclusion  MACH: Fast Randomized Tensor Decompositions, SDM 2010 6

Page 7: MACH: Fast Randomized Tensor Decompositions


A (m x n) = σ1 u1 v1^T + σ2 u2 v2^T + σ3 u3 v3^T + ...

Singular value decomposition (SVD): the "Swiss army knife" of matrix decompositions (O'Leary).
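As a quick illustration (not from the talk), the rank-k truncation that keeps only the top-k (σi, ui, vi) triplets can be computed with NumPy; by the Eckart-Young theorem this truncation is the best rank-k approximation in the Frobenius norm:

```python
import numpy as np

A = np.random.rand(100, 50)                    # any m x n matrix
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 3                                          # keep the top-k components
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]    # σ1 u1 v1^T + ... + σk uk vk^T
```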

Page 8: MACH: Fast Randomized Tensor Decompositions

[Figure: SVD of a document-to-term matrix: (documents x terms) = (documents to document hidden concepts) x (strength of each concept) x (term hidden concepts to terms). Example terms: data, graph, java, brain, lung; hidden concepts: CS, MD.]

Page 9: MACH: Fast Randomized Tensor Decompositions

Two families of algorithms extend SVD to the multilinear setting: PARAFAC/CANDECOMP decompositions and the Tucker decomposition.


Kolda & Bader, Tensor Decompositions and Applications, SIAM Review.

Page 10: MACH: Fast Randomized Tensor Decompositions


Tucker is an SVD-like decomposition of a tensor, with one projection matrix per mode and a core tensor:

X ≈ G x1 U1 x2 U2 x3 U3   (core tensor G, one projection matrix Ui per mode)

J. Sun showed that Tucker decompositions can be used to extract useful knowledge from monitoring systems.
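To make the definition concrete, here is a minimal HOSVD-style sketch in Python/NumPy (a simplified illustration, not the implementation used in the paper; `ranks` is the target rank per mode):

```python
import numpy as np

def unfold(X, mode):
    """Matricize tensor X along the given mode."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def hosvd(X, ranks):
    """Truncated higher-order SVD: one projection matrix per mode plus
    a core tensor G, so that X ~ G x1 U[0] x2 U[1] ... xN U[N-1]."""
    U = []
    for mode, r in enumerate(ranks):
        # leading r left singular vectors of the mode-n unfolding
        u, _, _ = np.linalg.svd(unfold(X, mode), full_matrices=False)
        U.append(u[:, :r])
    G = X
    for mode, u in enumerate(U):
        # contract mode `mode` of the core with u^T, keeping the axis order
        G = np.moveaxis(np.tensordot(u.T, G, axes=(1, mode)), 0, mode)
    return G, U

# Example: compress a random 10 x 8 x 6 tensor to a 3 x 3 x 3 core
G, U = hosvd(np.random.rand(10, 8, 6), ranks=(3, 3, 3))
```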

Page 11: MACH: Fast Randomized Tensor Decompositions

- Introduction: Why Tensors? Tensor Decompositions
- Our Motivation
- Proposed Method
- Experimental Results: Case study I (Intemon), Case study II (Intel Berkeley Lab)
- Conclusion

Page 12: MACH: Fast Randomized Tensor Decompositions

Most real-world processes result in sparse tensors.

However, there exist important processes that result in dense tensors:


Physical process                                           | Non-zero entries
-----------------------------------------------------------|-----------------
Sensor network (sensor x measurement type x timeticks)     | 85%
Computer network (machine x measurement type x timeticks)  | 81%

Page 13: MACH: Fast Randomized Tensor Decompositions

It can be either very slow or impossible to perform a Tucker decomposition on a dense tensor due to memory constraints.


Given that (low-rank) Tucker decompositions are valuable in practice, can we "trade" a "little bit" of accuracy for efficiency?

Page 14: MACH: Fast Randomized Tensor Decompositions

- Introduction: Why Tensors? Tensor Decompositions
- Our Motivation
- Proposed Method
- Experimental Results: Case study I (Intemon), Case study II (Intel Berkeley Lab)
- Conclusion

Page 15: MACH: Fast Randomized Tensor Decompositions


MACH extends the work of Achlioptas and McSherry on fast low-rank matrix approximation (STOC 2001) to the multilinear setting.

Page 16: MACH: Fast Randomized Tensor Decompositions

- Toss a coin for each non-zero entry: with probability p the entry "survives".
- If an entry survives, reweigh it by 1/p; if not, set it to zero!
- Perform Tucker on the sparsified tensor! For the theoretical results and more details, see the MACH paper. A sketch of the sparsification step follows.
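A minimal sketch of this sparsification step in Python/NumPy (an illustration of the idea, assuming a dense NumPy tensor; the function name is illustrative, not from the paper's code):

```python
import numpy as np

def mach_sparsify(X, p, rng=None):
    """Keep each non-zero entry of X with probability p and reweigh
    survivors by 1/p, so the result is an unbiased estimator of X."""
    rng = np.random.default_rng() if rng is None else rng
    survives = rng.random(X.shape) < p          # one coin per entry
    return np.where(survives & (X != 0), X / p, 0.0)

# Sparsify first, then run the (much cheaper) Tucker/HOSVD on the result
X_sparse = mach_sparsify(np.random.rand(100, 12, 50), p=0.1)
```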


Page 17: MACH: Fast Randomized Tensor Decompositions

- Introduction: Why Tensors? Tensor Decompositions
- Our Motivation
- Proposed Method
- Experimental Results: Case study I (Intemon), Case study II (Intel Berkeley Lab)
- Conclusion

Page 18: MACH: Fast Randomized Tensor Decompositions

Intemon: a prototype monitoring and mining system for data centers, developed at Carnegie Mellon University.

Tensor X: 100 machines x 12 types of measurement x 10080 timeticks.

Page 19: MACH: Fast Randomized Tensor Decompositions


For p = 0.1, Pearson's correlation coefficient between the MACH and the exact component is 0.99 (ideal: ρ = 1).
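For reference, this comparison metric can be computed directly (a small sketch with placeholder vectors standing in for the exact and the MACH factor columns):

```python
import numpy as np

exact = np.random.rand(10080)                    # placeholder exact component
approx = exact + 0.01 * np.random.randn(10080)   # placeholder MACH component
rho = np.corrcoef(exact, approx)[0, 1]           # ideally close to 1
```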

Page 20: MACH: Fast Randomized Tensor Decompositions


[Figure: exact vs. MACH components, side by side. Find the differences!]

The qualitative analysis, which is what matters for our goals, remains the same!

Page 21: MACH: Fast Randomized Tensor Decompositions

- Introduction: Why Tensors? Tensor Decompositions
- Our Motivation
- Proposed Method
- Experimental Results: Case study I (Intemon), Case study II (Intel Berkeley Lab)
- Conclusion

Page 22: MACH: Fast Randomized Tensor Decompositions

Intel Berkeley Lab.

Tensor X: 54 sensors x 4 types of measurement x 5385 timeticks.

Page 23: MACH: Fast Randomized Tensor Decompositions


The qualitative analysis, which is what matters for our goals, remains the same!

Page 24: MACH: Fast Randomized Tensor Decompositions


The spatial principal mode is also preserved, and Pearson's correlation coefficient is again almost 1!

[Figure: exact vs. MACH spatial components, side by side.]

Page 25: MACH: Fast Randomized Tensor Decompositions


Remarks: (1) Daily periodicity is apparent. (2) Pearson's correlation coefficient with the exact component is 0.99.

Page 26: MACH: Fast Randomized Tensor Decompositions

- Introduction: Why Tensors? Tensor Decompositions
- Our Motivation
- Proposed Method
- Experimental Results: Case study I (Intemon), Case study II (Intel Berkeley Lab)
- Conclusion

Page 27: MACH: Fast Randomized Tensor Decompositions

- Randomized algorithms for tensors.
- The smallest p* for tensor sparsification for the HOOI algorithm.
- Randomized algorithms work very well (e.g., sublinear-time algorithms), but are typically hard to analyze.



Page 30: MACH: Fast Randomized Tensor Decompositions


Remark: even though our theoretical results refer to HOSVD, MACH also works with HOOI.

Page 31: MACH: Fast Randomized Tensor Decompositions


Canonical decomposition (CANDECOMP/PARAFAC) and Tucker decomposition.