Scalable Training of Mixture Models via Coresets

Daniel Feldman

Matthew Faulkner

Andreas Krause

Fitting Mixtures to Massive Data

Importance Sample

EM on the full data: generally expensive

Weighted EM on the coreset: fast!
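To make "weighted EM" concrete, here is a minimal sketch of one EM iteration in which each point carries a weight. It assumes spherical covariances (one variance per component) purely to keep the code short; the function name `weighted_em_step` and its parameters are illustrative, not taken from the talk.

```python
import numpy as np

def weighted_em_step(X, w, pis, mus, variances):
    """One EM iteration for a spherical GMM on weighted points (X[i], w[i])."""
    n, d = X.shape
    # E-step: log of pi_j * N(x_i; mu_j, var_j * I), shape (n, k)
    sq = ((X[:, None, :] - mus[None, :, :]) ** 2).sum(axis=2)
    log_p = (np.log(pis)[None, :]
             - 0.5 * d * np.log(2 * np.pi * variances)[None, :]
             - 0.5 * sq / variances[None, :])
    log_norm = np.logaddexp.reduce(log_p, axis=1, keepdims=True)
    resp = np.exp(log_p - log_norm)
    # M-step: every sufficient statistic is scaled by the point's weight
    r = resp * w[:, None]
    nk = r.sum(axis=0)
    new_pis = nk / w.sum()
    new_mus = (r.T @ X) / nk[:, None]
    new_sq = ((X[:, None, :] - new_mus[None, :, :]) ** 2).sum(axis=2)
    new_vars = (r * new_sq).sum(axis=0) / (d * nk)
    weighted_ll = float((w * log_norm[:, 0]).sum())
    return new_pis, new_mus, new_vars, weighted_ll

# Toy data: giving the first point weight 2 acts like duplicating it
X = np.array([[0.0, 0.0], [0.2, 0.1], [4.0, 4.0], [4.1, 3.9]])
w = np.array([2.0, 1.0, 1.0, 1.0])
pis = np.array([0.5, 0.5])
mus = np.array([[0.0, 0.0], [4.0, 4.0]])
variances = np.array([1.0, 1.0])
step_weighted = weighted_em_step(X, w, pis, mus, variances)
step_duplicated = weighted_em_step(np.vstack([X[:1], X]), np.ones(5),
                                   pis, mus, variances)
```

The cost of each iteration scales with the number of weighted points, which is why running it on a small weighted sample is much cheaper than running EM on all of the data.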

Coresets for Mixture Models

Naïve Uniform Sampling

Small cluster is missed

Sample a set U of m points uniformly

High variance
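A quick calculation shows why a small cluster is easily missed. The sketch below assumes the m points are drawn uniformly with replacement; the function name and the numbers are illustrative only.

```python
def prob_cluster_missed(n: int, cluster_size: int, m: int) -> float:
    """Probability that m uniform draws (with replacement) from n points
    contain no point of a cluster with `cluster_size` members."""
    return (1.0 - cluster_size / n) ** m

# A cluster holding 0.1% of one million points, uniform sample of 1,000 points
p = prob_cluster_missed(n=1_000_000, cluster_size=1_000, m=1_000)
print(f"{p:.3f}")
```

In this setting the cluster is missed entirely in roughly 37% of samples, and even when it is hit it is represented by only a handful of points.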

Sampling Distribution

Sampling distribution

Bias sampling towards small clusters

Importance Weights

Weights

Sampling distribution

Creating a Sampling Distribution

Iteratively find representative points

• Sample a small set uniformly at random

• Remove half the blue points nearest the samples

• Sample a small set uniformly at random

Small clusters are represented
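The sample-and-remove loop above can be sketched as follows. This is a minimal version under stated assumptions: `beta` (the size of each small sample) and the stopping rule (stop once at most `beta` points remain and keep them all) are choices made here for illustration.

```python
import numpy as np

def find_representatives(X, beta, rng):
    """Return indices of representative points of X (shape (n, d))."""
    remaining = np.arange(len(X))
    chosen = []
    while len(remaining) > beta:
        # Sample a small set uniformly at random from the remaining points
        sample = rng.choice(remaining, size=beta, replace=False)
        chosen.extend(sample.tolist())
        # Distance from every remaining point to its nearest sampled point
        diffs = X[remaining][:, None, :] - X[sample][None, :, :]
        nearest = np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1)
        # Remove the half of the remaining points nearest the samples
        order = np.argsort(nearest)
        remaining = remaining[order[len(remaining) // 2:]]
    chosen.extend(remaining.tolist())
    return np.array(sorted(set(chosen)))

# One big cluster (990 points) and one small, far-away cluster (10 points)
rng = np.random.default_rng(0)
big = rng.random((990, 2))
small = 100.0 + rng.random((10, 2))
X = np.vstack([big, small])
reps = find_representatives(X, beta=10, rng=rng)
```

Points that are never sampled but lie far from every sample survive each halving, so a distant small cluster ends up among the representatives even though uniform sampling alone would rarely hit it.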

Partition data via a Voronoi diagram centered at points

Sampling distribution

Points in sparse cells get more mass, and so do points far from centers

Importance Weights

Weights

Importance Sample
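One way to turn the two rules (more mass for sparse cells, more mass for points far from centers) into a sampling distribution with matching importance weights is sketched below. The equal 50/50 mix of the two terms is a simplification chosen here and not the constants used by the authors; `sampling_distribution` and `importance_sample` are hypothetical names.

```python
import numpy as np

def sampling_distribution(X, centers):
    """Mix a distance term and a sparse-cell term into probabilities q."""
    sq = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    cell = sq.argmin(axis=1)          # Voronoi cell of each point
    d2 = sq.min(axis=1)               # squared distance to its own center
    cell_size = np.bincount(cell, minlength=len(centers))
    n_cells = np.count_nonzero(cell_size)
    # Points far from their center get more mass
    far = d2 / d2.sum() if d2.sum() > 0 else np.full(len(X), 1.0 / len(X))
    # Points in sparse cells get more mass (each cell shares 1 / n_cells)
    sparse = 1.0 / (n_cells * cell_size[cell])
    return 0.5 * far + 0.5 * sparse

def importance_sample(X, q, m, rng):
    """Draw m points according to q; weight each by 1 / (m * q)."""
    idx = rng.choice(len(X), size=m, replace=True, p=q)
    weights = 1.0 / (m * q[idx])
    return X[idx], weights

# A dense cell with 98 points and a sparse cell with 2 points
centers = np.array([[0.0, 0.0], [100.0, 0.0]])
dense = np.array([[1.0, 0.0], [-1.0, 0.0]] * 49)
sparse_pts = np.array([[101.0, 0.0], [99.0, 0.0]])
X = np.vstack([dense, sparse_pts])
q = sampling_distribution(X, centers)
C, w = importance_sample(X, q, m=20, rng=np.random.default_rng(0))
```

The weight 1 / (m q) undoes the sampling bias: in expectation the weighted sample has the same total mass as the original data, so points that were oversampled count for less.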

Coresets via Adaptive Sampling

A General Coreset Framework

Contributions for Mixture Models:

A Geometric Perspective

Gaussian level sets can be expressed purely geometrically:

affine subspace
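A standard identity makes this geometric view explicit. The notation is chosen here and is not taken from the slides: Σ = Σ_j λ_j v_j v_jᵀ is the eigendecomposition of the covariance with orthonormal v_j, and H_j is the affine hyperplane through μ with normal v_j.

```latex
-\ln \mathcal{N}(x;\mu,\Sigma)
  = \tfrac{1}{2}\ln\det(2\pi\Sigma)
  + \tfrac{1}{2}\,(x-\mu)^{\top}\Sigma^{-1}(x-\mu),
\qquad
(x-\mu)^{\top}\Sigma^{-1}(x-\mu)
  = \sum_{j=1}^{d}\frac{\bigl(v_j^{\top}(x-\mu)\bigr)^{2}}{\lambda_j}
  = \sum_{j=1}^{d}\frac{\operatorname{dist}(x,H_j)^{2}}{\lambda_j}.
```

Up to an additive constant, the negative log-density is therefore a weighted sum of squared distances from x to affine subspaces, which is the kind of cost that geometric coreset tools handle.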

Geometric Reduction

Lifts geometric coreset tools to mixture models

Soft-min
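As a sketch of why a soft-min appears: writing a_i for the negative log of the i-th weighted component density (notation assumed here), the negative log-likelihood of a point under a k-component mixture is a soft-min of the per-component costs and stays within ln k of the hard minimum.

```latex
-\ln p(x) = -\ln \sum_{i=1}^{k} e^{-a_i(x)} =: \operatorname{softmin}\bigl(a_1(x),\dots,a_k(x)\bigr),
\qquad
a_i(x) = -\ln\bigl(w_i\,\mathcal{N}(x;\mu_i,\Sigma_i)\bigr),
\qquad
\min_i a_i(x) - \ln k \;\le\; \operatorname{softmin}(a(x)) \;\le\; \min_i a_i(x).
```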

Semi-Spherical Gaussian Mixtures

Extensions and Generalizations

Level Sets

Composition of Coresets

Merge [cf. Har-Peled, Mazumdar 04]

Compress

Coresets on Streams

Compress

Error grows linearly with number of compressions

Error grows with height of tree
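The merge-and-compress tree for streams can be sketched as below. The `compress` function here is only a stand-in (weight-proportional resampling) so that the example is self-contained; a real pipeline would call a coreset construction at that point. All names and the chunk sizes are illustrative.

```python
import numpy as np

def merge(a, b):
    """Union of two weighted sets (points, weights); adds no error."""
    return np.vstack([a[0], b[0]]), np.concatenate([a[1], b[1]])

def compress(points, weights, m, rng):
    """Stand-in compression: weight-proportional resampling to m points.
    Assumption: a real system would build a coreset here instead."""
    total = weights.sum()
    idx = rng.choice(len(points), size=m, replace=True, p=weights / total)
    return points[idx], np.full(m, total / m)

def stream_coreset(chunks, m, rng):
    """levels[h] holds at most one compressed set of tree height h."""
    levels = []
    for chunk in chunks:
        node = compress(chunk, np.ones(len(chunk)), m, rng)
        h = 0
        # Two sets of equal height are merged and compressed one level up
        while h < len(levels) and levels[h] is not None:
            merged = merge(levels[h], node)
            node = compress(*merged, m, rng)
            levels[h] = None
            h += 1
        if h == len(levels):
            levels.append(None)
        levels[h] = node
    # Final answer: merge whatever is left on each level
    result = None
    for node in levels:
        if node is not None:
            result = node if result is None else merge(result, node)
    return result, levels

rng = np.random.default_rng(0)
chunks = [rng.normal(size=(100, 2)) for _ in range(8)]
result, levels = stream_coreset(chunks, m=20, rng=rng)
```

Each stored set has passed through at most as many compressions as the tree is high, which is logarithmic in the number of chunks, whereas compressing the running summary after every chunk would apply one compression per chunk.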

Coresets in Parallel

Handwritten Digits

Obtain 100-dimensional features from 28x28 pixel images via PCA. Fit GMM with k=10 components.

MNIST data: 60,000 training, 10,000 testing
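The PCA preprocessing step can be sketched with a plain SVD. The random array below is a synthetic stand-in for the flattened 28x28 images, not the MNIST data itself, and `pca_project` is a hypothetical helper name.

```python
import numpy as np

def pca_project(X, n_components):
    """Project rows of X onto the top principal components."""
    mean = X.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return (X - mean) @ vt[:n_components].T

rng = np.random.default_rng(0)
images = rng.random((500, 28 * 28))    # synthetic stand-in for pixel data
features = pca_project(images, 100)    # 100-dimensional feature vectors
```

The resulting feature vectors are what a GMM with k=10 components would then be fitted to.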

Neural Tetrode Recordings

Waveforms of neural activity at four co-located electrodes in a live rat hippocampus. 4 x 38 samples = 152 dimensions.

T. Siapas et al, Caltech

Community Seismic Network

Detect and monitor earthquakes using smart phones, USB sensors, and cloud computing.

CSN Sensors Worldwide

Learning User Acceleration

17-dimensional acceleration feature vectors

Seismic Anomaly Detection

GMM used for anomaly detection
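A common way to use a fitted GMM for anomaly detection is to flag points whose log-likelihood falls below a threshold. The sketch below assumes spherical components and a hand-picked threshold; both are simplifications made here, and the slides do not state how the threshold is chosen.

```python
import numpy as np

def gmm_log_likelihood(X, pis, mus, variances):
    """Per-point log-likelihood under a spherical GMM."""
    d = X.shape[1]
    sq = ((X[:, None, :] - mus[None, :, :]) ** 2).sum(axis=2)
    log_p = (np.log(pis) - 0.5 * d * np.log(2 * np.pi * variances)
             - 0.5 * sq / variances)
    return np.logaddexp.reduce(log_p, axis=1)

def flag_anomalies(X, pis, mus, variances, threshold):
    """True where a point is less likely than the threshold allows."""
    return gmm_log_likelihood(X, pis, mus, variances) < threshold

# Single standard-normal component in 2-D: one typical point, one outlier
pis = np.array([1.0])
mus = np.array([[0.0, 0.0]])
variances = np.array([1.0])
points = np.array([[0.0, 0.0], [10.0, 10.0]])
flags = flag_anomalies(points, pis, mus, variances, threshold=-20.0)
```

In practice the threshold would typically be set from the log-likelihoods of normal training data, for example at a low percentile.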

Conclusions

• Lift geometric coreset tools to the statistical realm: new complexity result for GMM level sets

• Parallel (MapReduce) and Streaming implementations

• Strong empirical performance, enables learning on mobile devices

• GMMs admit coresets of size independent of n, with extensions for other mixture models
