Optimistic Concurrency Control for Distributed Learning
Xinghao Pan, Joseph E. Gonzalez, Stefanie Jegelka, Tamara Broderick, Michael I. Jordan
Machine Learning Algorithm
[Diagram: Data, Model Parameters]
Distributed Machine Learning
[Diagram: Data, Model Parameters, with "!!" marked on the Model Parameters]
Concurrency: more machines = less time
Correctness: serial equivalence
Coordination-free
[Diagram: Data, Model Parameters]
Mutual Exclusion
[Diagram: Data, Model Parameters]
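As a minimal sketch of the mutual-exclusion approach, assuming the model parameters live in a shared in-memory vector: each worker takes a lock before touching the parameters, so updates are serialized and the result matches a serial execution. The names `SharedParams`, `add_update`, and `run_workers` are hypothetical and not from the talk.

```python
import threading


class SharedParams:
    """Hypothetical shared model parameters guarded by a single lock."""

    def __init__(self, dim):
        self.values = [0.0] * dim
        self._lock = threading.Lock()

    def add_update(self, delta):
        # Mutual exclusion: only one worker modifies the parameters at a time.
        with self._lock:
            for i, d in enumerate(delta):
                self.values[i] += d


def worker(params, updates):
    for delta in updates:
        params.add_update(delta)


def run_workers(num_workers=4, updates_per_worker=1000, dim=3):
    """Run several workers that all add the same updates to shared parameters."""
    params = SharedParams(dim)
    updates = [[1.0] * dim for _ in range(updates_per_worker)]
    threads = [
        threading.Thread(target=worker, args=(params, updates))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return params.values
```

The lock guarantees correctness but makes workers wait on each other, which is the concurrency cost the slides contrast with the coordination-free approach.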
[Figure: Correctness vs. Concurrency, each axis running from Low to High. Labels: Coordination-free; Mutual exclusion; "Mechanism for ensuring correctness"; "Conflicts are rare"; Optimistic Concurrency Control (marked "?")]
Optimistic Concurrency Control
[Diagram: Data, Model Parameters]
• Optimistic updates
• Validation: detect conflict
• Resolution: fix conflict
Hsiang-Tsung Kung and John T. Robinson. On optimistic methods for concurrency control.
ACM Transactions on Database Systems (TODS), 6(2):213–226, 1981.
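The three steps (optimistic update, validation, resolution) can be sketched generically, assuming a single shared value with a version counter. This is an illustration of the pattern, not the talk's implementation: `VersionedCell`, `occ_update`, and `run_counter` are hypothetical names, and retrying is only one possible way to resolve a conflict.

```python
import threading


class VersionedCell:
    """Hypothetical shared value with a version counter used for validation."""

    def __init__(self, value):
        self.value = value
        self.version = 0
        self._commit_lock = threading.Lock()  # held only for short read/commit steps

    def read(self):
        with self._commit_lock:
            return self.value, self.version

    def try_commit(self, new_value, read_version):
        # Validation: succeed only if nobody committed since our read.
        with self._commit_lock:
            if self.version != read_version:
                return False  # conflict detected
            self.value = new_value
            self.version += 1
            return True


def occ_update(cell, fn):
    """Apply fn optimistically; on conflict, resolve by re-reading and retrying."""
    retries = 0
    while True:
        value, version = cell.read()   # optimistic read
        new_value = fn(value)          # compute without holding any lock
        if cell.try_commit(new_value, version):
            return retries
        retries += 1                   # resolution: redo the work on fresh state


def run_counter(num_workers=4, increments=500):
    """Several workers increment one counter using optimistic updates."""
    cell = VersionedCell(0)

    def work():
        for _ in range(increments):
            occ_update(cell, lambda v: v + 1)

    threads = [threading.Thread(target=work) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cell.value
```

When conflicts are rare, almost every commit validates on the first attempt, so the cost of resolution is paid only occasionally.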
[Figure: Concurrency and Correctness axes, with Optimistic Concurrency Control]
Application: Clustering
• Natural domain for parallelization
• K-means – popular algorithm
• Fixed number of clusters – not fit for Big Data
Big Data solution: DP-means + OCC
Example
Example: K-means
Bad!
Example: DP-means
Correct clusters
Sequential!
Brian Kulis and Michael I. Jordan. Revisiting k-means: New algorithms via Bayesian nonparametrics.
In Proceedings of the 29th International Conference on Machine Learning, 2012.
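For readers unfamiliar with DP-means, here is a minimal serial sketch in plain Python, assuming the usual formulation in which a point whose squared distance to every existing center exceeds a threshold starts a new cluster. The function names and the parameter `lam` are illustrative choices, not taken from the slides.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def dp_means(points, lam, max_iters=10):
    """Serial DP-means sketch; lam is the squared-distance threshold (assumed name)."""
    centers = [mean(points)]            # start with one cluster at the global mean
    assignments = [0] * len(points)
    for _ in range(max_iters):
        changed = False
        for i, p in enumerate(points):
            dists = [sq_dist(p, c) for c in centers]
            k = min(range(len(centers)), key=dists.__getitem__)
            if dists[k] > lam:
                # Too far from every center: this point opens a new cluster.
                centers.append(tuple(p))
                k = len(centers) - 1
            if assignments[i] != k:
                assignments[i] = k
                changed = True
        # Recompute means, dropping clusters that ended up empty.
        groups = {}
        for i, k in enumerate(assignments):
            groups.setdefault(k, []).append(points[i])
        old_ids = sorted(groups)
        centers = [mean(groups[k]) for k in old_ids]
        remap = {k: j for j, k in enumerate(old_ids)}
        assignments = [remap[k] for k in assignments]
        if not changed:
            break
    return centers, assignments
```

The inner loop is inherently sequential: whether a point opens a new cluster depends on the clusters created by the points processed before it.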
OCC DP-means
Validation
Resolution
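One way to picture how validation and resolution might apply to DP-means is the single-pass sketch below. It is a simplified, single-process simulation under stated assumptions, not the authors' implementation: workers assign points against a snapshot of the centers and optimistically propose new centers, then a serial step accepts a proposal only if no center accepted earlier in the same pass lies within the threshold. All function names and `lam` are hypothetical, and the recomputation of cluster means is omitted.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def worker_pass(block, centers, lam):
    """Optimistic step: assign to a known center or propose a new one."""
    assigned, proposals = {}, []
    for idx, p in block:
        dists = [sq_dist(p, c) for c in centers]
        if dists and min(dists) <= lam:
            assigned[idx] = dists.index(min(dists))
        else:
            proposals.append((idx, p))
    return assigned, proposals


def validate_and_resolve(proposals, centers, lam):
    """Serial validation; centers is extended in place with accepted proposals."""
    assigned = {}
    first_new = len(centers)
    for idx, p in proposals:
        near = [k for k in range(first_new, len(centers))
                if sq_dist(p, centers[k]) <= lam]
        if near:
            # Conflict: resolve by reusing the nearest newly accepted center.
            assigned[idx] = min(near, key=lambda k: sq_dist(p, centers[k]))
        else:
            centers.append(tuple(p))
            assigned[idx] = len(centers) - 1
    return assigned


def occ_dp_means_epoch(points, centers, lam, num_workers=2):
    """One simulated pass: the worker loop stands in for parallel execution."""
    indexed = list(enumerate(points))
    blocks = [indexed[w::num_workers] for w in range(num_workers)]
    snapshot = list(centers)
    results = [worker_pass(b, snapshot, lam) for b in blocks]
    assignments, proposals = {}, []
    for assigned, props in results:
        assignments.update(assigned)
        proposals.extend(props)
    proposals.sort()                    # validate in point-index order
    centers = list(centers)
    assignments.update(validate_and_resolve(proposals, centers, lam))
    return centers, [assignments[i] for i in range(len(points))]
```

Only the proposed points go through the serial validation step, so if new clusters are rare, most of the work stays in the parallel worker pass.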
Evaluation: Amazon EC2
[Figure: runtime in seconds per complete pass over the data (y-axis, 0 to 3500) vs. number of machines (x-axis, 1 to 8). Series: OCC DP-means runtime; projected linear scaling]
2x #machines ≈ ½x runtime
~140 million data points; 1, 2, 4, 8 machines
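The "projected linear scaling" line can be read as the ideal in which runtime is inversely proportional to the number of machines. A small sketch of that arithmetic follows. The function names are hypothetical, and any baseline value passed in is a placeholder, not a measurement from the talk.

```python
def projected_linear_runtime(baseline_seconds, num_machines):
    """Ideal linear scaling: runtime shrinks in proportion to machine count."""
    if num_machines < 1:
        raise ValueError("num_machines must be at least 1")
    return baseline_seconds / num_machines


def speedup(baseline_seconds, measured_seconds):
    """Observed speedup relative to the single-machine baseline."""
    return baseline_seconds / measured_seconds


# Placeholder baseline only; substitute a measured single-machine runtime.
for n in (1, 2, 4, 8):
    print(n, projected_linear_runtime(1000.0, n))
```

Comparing `speedup` on measured runtimes against the machine count shows how close a run comes to the projected line.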
Optimistic Concurrency Control
High concurrency:
• Conflicts rare
• Validation easy
• Resolution cheap
OCCified Algorithms
• Online facility location
• BP-means: feature modeling
Ongoing
• Stochastic gradient descent
• Collapsed Gibbs sampling
What can OCC do for you?
See us @ poster [email protected]
Optimistic Concurrency Control
Big Learning @ NIPS 2013
http://biglearn.org
Xinghao Pan, Joseph E. Gonzalez, Stefanie Jegelka, Tamara Broderick, and Michael I. Jordan. Optimistic concurrency control for distributed unsupervised learning.
ArXiv e-prints, arXiv:1307.8049, 2013.