Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BICO: BIRCH meets Coresets for k -means
Hendrik Fichtenberger, Marc Gille, Melanie Schmidt, ChrisSchwiegelshohn, Christian Sohler
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
Clustering Algorithms: Practice and Theory
The k -means ProblemGiven a point set P ⊆ Rd ,compute a set C ⊆ Rd
with |C| = k centerswhich minimizescost(P,C)
=∑p∈P
minc∈C||c − p||2,
the sum of the squareddistances.
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
Clustering Algorithms: Practice and Theory
Popular k -means algorithms. . .Lloyd’s algorithm (1982)k -means++ (2007)several approximation algorithms (recent)
. . . for Big DataBIRCH (1996)MacQueen’s k -means (1967)several approximations using coresets (recent)
Implementability, Speed and good quality?
Stream-KM++ (2010)BICO
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
Clustering Algorithms: Practice and Theory
Popular k -means algorithms. . .Lloyd’s algorithm (1982)k -means++ (2007)several approximation algorithms (recent)
. . . for Big DataBIRCH (1996)MacQueen’s k -means (1967)several approximations using coresets (recent)
Implementability, Speed and good quality?
Stream-KM++ (2010)BICO
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
Clustering Algorithms: Practice and Theory
Popular k -means algorithms. . .Lloyd’s algorithm (1982)k -means++ (2007)several approximation algorithms (recent)
. . . for Big DataBIRCH (1996)MacQueen’s k -means (1967)several approximations using coresets (recent)
Implementability, Speed and good quality?
Stream-KM++ (2010)BICO
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
Clustering Algorithms: Practice and Theory
Popular k -means algorithms. . .Lloyd’s algorithm (1982) moderate speedk -means++ (2007) moderate speed & qualityseveral approximation algorithms (recent) quality
. . . for Big DataBIRCH (1996) fastMacQueen’s k -means (1967) fastseveral approximations using coresets (recent) quality
Implementability, Speed and good quality?
Stream-KM++ (2010) next slidesBICO next slides
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
Clustering Algorithms: Practice and Theory
Popular k -means algorithms. . .Lloyd’s algorithm (1982) moderate speedk -means++ (2007) moderate speed & qualityseveral approximation algorithms (recent) quality
. . . for Big DataBIRCH (1996) fastMacQueen’s k -means (1967) fastseveral approximations using coresets (recent) quality
Implementability, Speed and good quality?Stream-KM++ (2010) next slides
BICO next slides
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
Clustering Algorithms: Practice and Theory
Popular k -means algorithms. . .Lloyd’s algorithm (1982) moderate speedk -means++ (2007) moderate speed & qualityseveral approximation algorithms (recent) quality
. . . for Big DataBIRCH (1996) fastMacQueen’s k -means (1967) fastseveral approximations using coresets (recent) quality
Implementability, Speed and good quality?Stream-KM++ (2010) next slidesBICO next slides
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
Clustering Algorithms: Practice and Theory
Costs
3e+012
3.5e+012
4e+012
4.5e+012
5e+012
5.5e+012
6e+012
6.5e+012
7e+012
15 20 25 30
Cos
ts
Number of Centers
BigCross
StreamKM++BICO
MacQueenBIRCH
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
Clustering Algorithms: Practice and Theory
Running Time
0
200
400
600
800
1000
1200
10 20 30 40 50
Run
ning
Tim
e
Number of Centers
Census
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
How BIRCH computes a summary of the data
Ideastart with BIRCH for the basic design because it is very fastanalyze its flawsdevelop an improved algorithm based ontheoretical observations
Now: Description of BIRCH
WarningBIRCH has several phaseswe are only interested in the main phase(and a little in the rebuilding phase)
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
How BIRCH computes a summary of the data
Ideastart with BIRCH for the basic design because it is very fastanalyze its flawsdevelop an improved algorithm based ontheoretical observations
Now: Description of BIRCH
WarningBIRCH has several phaseswe are only interested in the main phase(and a little in the rebuilding phase)
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
How BIRCH computes a summary of the data
Ideastart with BIRCH for the basic design because it is very fastanalyze its flawsdevelop an improved algorithm based ontheoretical observations
Now: Description of BIRCH
WarningBIRCH has several phaseswe are only interested in the main phase(and a little in the rebuilding phase)
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
How BIRCH computes a summary of the data
BIRCH
stores points in a treeeach node represents a subset of the input point setsubset is summarized by the number of points, the centroidof the set and the squared distances to the centroidall points in the same subset get the same center
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
How BIRCH computes a summary of the data
BIRCHstores points in a tree
each node represents a subset of the input point setsubset is summarized by the number of points, the centroidof the set and the squared distances to the centroidall points in the same subset get the same center
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
How BIRCH computes a summary of the data
BIRCHstores points in a treeeach node represents a subset of the input point set
subset is summarized by the number of points, the centroidof the set and the squared distances to the centroidall points in the same subset get the same center
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
How BIRCH computes a summary of the data
BIRCHstores points in a treeeach node represents a subset of the input point setsubset is summarized by the number of points, the centroidof the set and the squared distances to the centroid
all points in the same subset get the same center
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
How BIRCH computes a summary of the data
BIRCHstores points in a treeeach node represents a subset of the input point setsubset is summarized by the number of points, the centroidof the set and the squared distances to the centroidall points in the same subset get the same center
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
How BIRCH computes a summary of the data
BIRCHnodes in the tree represent subsets of pointspoints at the same node get the same center
Insertion of a new pointWhen a new point p is added to the tree
BIRCH searches for the ‘closest’ node according to∑q∈(S∪{p})
(q − µ(S ∪ {p}))2 −∑
q∈S(q − µ(S))2
p is added to the node representing subset S∗ if∑q∈(S∗∪{p})
(q − µS)2/(|S∗|+ 1) ≤ T 2 for a given threshold T
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
How BIRCH computes a summary of the data
BIRCHnodes in the tree represent subsets of pointspoints at the same node get the same center
Insertion of a new pointWhen a new point p is added to the tree
BIRCH searches for the ‘closest’ node according to∑q∈(S∪{p})
(q − µ(S ∪ {p}))2 −∑
q∈S(q − µ(S))2
p is added to the node representing subset S∗ if∑q∈(S∗∪{p})
(q − µS)2/(|S∗|+ 1) ≤ T 2 for a given threshold T
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
How BIRCH computes a summary of the data
BIRCHnodes in the tree represent subsets of pointspoints at the same node get the same center
Insertion of a new pointWhen a new point p is added to the tree
BIRCH searches for the ‘closest’ node according to∑q∈(S∪{p})
(q − µ(S ∪ {p}))2 −∑
q∈S(q − µ(S))2
p is added to the node representing subset S∗ if∑q∈(S∗∪{p})
(q − µS)2/(|S∗|+ 1) ≤ T 2 for a given threshold T
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
How BIRCH computes a summary of the data
BIRCHnodes in the tree represent subsets of pointspoints at the same node get the same center
Insertion of a new pointWhen a new point p is added to the tree
BIRCH searches for the ‘closest’ node according to∑q∈(S∪{p})
(q − µ(S ∪ {p}))2 −∑
q∈S(q − µ(S))2
p is added to the node representing subset S∗ if∑q∈(S∗∪{p})
(q − µS)2/(|S∗|+ 1) ≤ T 2 for a given threshold T
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
How BIRCH computes a summary of the data
150 points drawn uniformly around (−0.5,0) and (0,0.5)75 points drawn uniformly from [−4,−2]× [4,2] as noiseCenters and partitions computed by BIRCH and BICO
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
How BIRCH computes a summary of the data
150 points drawn uniformly around (−0.5,0) and (0,0.5)75 points drawn uniformly from [−4,−2]× [4,2] as noiseCenters and partitions computed by BIRCH and BICO
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
How BIRCH computes a summary of the data
150 points drawn uniformly around (−0.5,0) and (0,0.5)75 points drawn uniformly from [−4,−2]× [4,2] as noiseCenters and partitions computed by BIRCH and BICO
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
How BIRCH computes a summary of the data
Insights from BIRCHFast point by point updatesTree structure
Insertion decision should be improved
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
How BIRCH computes a summary of the data
Insights from BIRCHFast point by point updatesTree structure
Insertion decision should be improved
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
Coresets
Coresetssmall summary of given datatypically of constant or polylogarithmic sizecan be used to approximate the cost of the original data
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
Coresets
CoresetsGiven a set of points P, a weighted subset S ⊂ P is a(k , ε)-coreset if for all sets C ⊂ C of k centers it holds
|costw (S,C)− cost(P,C)| ≤ ε cost(P,C)
where costw (S,C) =∑
p∈S minc∈C w(p)(.p, c).
3 2
2
4
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
Coresets
CoresetsGiven a set of points P, a weighted subset S ⊂ P is a(k , ε)-coreset if for all sets C ⊂ C of k centers it holds
|costw (S,C)− cost(P,C)| ≤ ε cost(P,C)
where costw (S,C) =∑
p∈S minc∈C w(p)(.p, c).
3 2
2
4
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
Coresets
CoresetsGiven a set of points P, a weighted subset S ⊂ P is a(k , ε)-coreset if for all sets C ⊂ C of k centers it holds
|costw (S,C)− cost(P,C)| ≤ ε cost(P,C)
where costw (S,C) =∑
p∈S minc∈C w(p)(.p, c).
3 2
2
4
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
Coresets
CoresetsGiven a set of points P, a weighted subset S ⊂ P is a(k , ε)-coreset if for all sets C ⊂ C of k centers it holds
|costw (S,C)− cost(P,C)| ≤ ε cost(P,C)
where costw (S,C) =∑
p∈S minc∈C w(p)(.p, c).
3 2
2
4
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
Coresets
CoresetsGiven a set of points P, a weighted subset S ⊂ P is a(k , ε)-coreset if for all sets C ⊂ C of k centers it holds
|costw (S,C)− cost(P,C)| ≤ ε cost(P,C)
where costw (S,C) =∑
p∈S minc∈C w(p)(.p, c).
3 2
2
4
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
Coresets
CoresetsGiven a set of points P, a weighted subset S ⊂ P is a(k , ε)-coreset if for all sets C ⊂ C of k centers it holds
|costw (S,C)− cost(P,C)| ≤ ε cost(P,C)
where costw (S,C) =∑
p∈S minc∈C w(p)(.p, c).
3 2
2
4
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
Coresets
Merge & Reduceread data in blockscompute a coresetfor each block→ smerge coresets in atree fashion space s · log n
Runtime: No asymptotic increase, but overhead in practice
BIRCH uses point-wise updates :-)
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
Coresets
Merge & Reduceread data in blockscompute a coresetfor each block→ smerge coresets in atree fashion space s · log n
Runtime: No asymptotic increase, but overhead in practice
BIRCH uses point-wise updates :-)
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
Coresets
Insights from Coreset ThreoryLimit the induced error
Goal: Point set P ′ in each node should induce at mostε · cost(P ′,C) error (for an optimal solution C)
Base insertion decision on induced errorReplacing all points in a node by the (weighted) centroid islike moving all points to the centroidInduced error is connected to the 1-means cost of the set
Side noteAvoiding Merge & Reduce is a good idea
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
Coresets
Insights from Coreset ThreoryLimit the induced error
Goal: Point set P ′ in each node should induce at mostε · cost(P ′,C) error (for an optimal solution C)
Base insertion decision on induced errorReplacing all points in a node by the (weighted) centroid islike moving all points to the centroidInduced error is connected to the 1-means cost of the set
Side noteAvoiding Merge & Reduce is a good idea
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
BIRCHnodes in the tree represent subsets of pointspoints at the same node get the same centerimprove insertion decision
AdjustmentsNodes additionally have a reference point and a range
Closest is now determined by Euclidean distanceWe say a node is full with regard to a point p if adding p tothe node increases its 1-means cost above a threshold T
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
BIRCHnodes in the tree represent subsets of pointspoints at the same node get the same centerimprove insertion decision
AdjustmentsNodes additionally have a reference point and a range
Closest is now determined by Euclidean distanceWe say a node is full with regard to a point p if adding p tothe node increases its 1-means cost above a threshold T
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
BIRCHnodes in the tree represent subsets of pointspoints at the same node get the same centerimprove insertion decision
AdjustmentsNodes additionally have a reference point and a rangeClosest is now determined by Euclidean distance
We say a node is full with regard to a point p if adding p tothe node increases its 1-means cost above a threshold T
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
BIRCHnodes in the tree represent subsets of pointspoints at the same node get the same centerimprove insertion decision
AdjustmentsNodes additionally have a reference point and a rangeClosest is now determined by Euclidean distanceWe say a node is full with regard to a point p if adding p tothe node increases its 1-means cost above a threshold T
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
1. Find closest reference point2. If node is not in range3. Then create a new node4. Else add to node if possible5. If not, go one level down,6. And Find closest child, goto 2.
Threshold TRadius Ri
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
1. Find closest reference point2. If node is not in range3. Then create a new node4. Else add to node if possible5. If not, go one level down,6. And Find closest child, goto 2.
Threshold TRadius Ri
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
1. Find closest reference point2. If node is not in range3. Then create a new node4. Else add to node if possible5. If not, go one level down,6. And Find closest child, goto 2.
Threshold TRadius Ri
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
1. Find closest reference point2. If node is not in range3. Then create a new node4. Else add to node if possible5. If not, go one level down,6. And Find closest child, goto 2.
Threshold TRadius Ri
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
1. Find closest reference point2. If node is not in range3. Then create a new node4. Else add to node if possible5. If not, go one level down,6. And Find closest child, goto 2.
Threshold TRadius Ri
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
1. Find closest reference point2. If node is not in range3. Then create a new node4. Else add to node if possible5. If not, go one level down,6. And Find closest child, goto 2.
Threshold TRadius Ri
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
Theorem
For T ≈ OPT/(k · log n · 8d · εd+2) and Ri :=√
T/(8 · 2i),the set of centroids weighted by the number of points in thesubset is a (1 + ε)-coresetfor constant d , the number of nodes is O(k · log n · ε−(d+2))
ProblemWe do not know OPT and thus cannot compute T !
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
Theorem
For T ≈ OPT/(k · log n · 8d · εd+2) and Ri :=√
T/(8 · 2i),the set of centroids weighted by the number of points in thesubset is a (1 + ε)-coresetfor constant d , the number of nodes is O(k · log n · ε−(d+2))
ProblemWe do not know OPT and thus cannot compute T !
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
Rebuilding algorithmDouble T when maximum number of nodes is reached‘Rebuild’ the tree according to new T
Let T ′ and R′i be before and T and Ri be after the doublingMove all nodes one level down and create empty first levelNotice that Ri =
√T/(8 · 2i) =
√T ′/8 · 2i−1 = R′i−1
⇒ Radius doesn’t change!
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
Rebuilding algorithmDouble T when maximum number of nodes is reached‘Rebuild’ the tree according to new T
Let T ′ and R′i be before and T and Ri be after the doublingMove all nodes one level down and create empty first levelNotice that Ri =
√T/(8 · 2i) =
√T ′/8 · 2i−1 = R′i−1
⇒ Radius doesn’t change!
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
For every node in the tree1. Find ref. point one level higher2. If none in range, move node
and children one level up3. If it is in range, check T4. If sum is smaller than T ,
merge nodes6. Else make it a child
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
For every node in the tree1. Find ref. point one level higher2. If none in range, move node
and children one level up3. If it is in range, check T4. If sum is smaller than T ,
merge nodes6. Else make it a child
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
For every node in the tree1. Find ref. point one level higher2. If none in range, move node
and children one level up3. If it is in range, check T4. If sum is smaller than T ,
merge nodes6. Else make it a child
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
For every node in the tree1. Find ref. point one level higher2. If none in range, move node
and children one level up3. If it is in range, check T4. If sum is smaller than T ,
merge nodes6. Else make it a child
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
For every node in the tree1. Find ref. point one level higher2. If none in range, move node
and children one level up3. If it is in range, check T4. If sum is smaller than T ,
merge nodes6. Else make it a child
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
For every node in the tree1. Find ref. point one level higher2. If none in range, move node
and children one level up3. If it is in range, check T4. If sum is smaller than T ,
merge nodes6. Else make it a child
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
For every node in the tree1. Find ref. point one level higher2. If none in range, move node
and children one level up3. If it is in range, check T4. If sum is smaller than T ,
merge nodes6. Else make it a child
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
For every node in the tree1. Find ref. point one level higher2. If none in range, move node
and children one level up3. If it is in range, check T4. If sum is smaller than T ,
merge nodes6. Else make it a child
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
For every node in the tree1. Find ref. point one level higher2. If none in range, move node
and children one level up3. If it is in range, check T4. If sum is smaller than T ,
merge nodes6. Else make it a child
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
For every node in the tree1. Find ref. point one level higher2. If none in range, move node
and children one level up3. If it is in range, check T4. If sum is smaller than T ,
merge nodes6. Else make it a child
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
For every node in the tree1. Find ref. point one level higher2. If none in range, move node
and children one level up3. If it is in range, check T4. If sum is smaller than T ,
merge nodes6. Else make it a child
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
Merging might result in violations of the range of nodes
Increases the coreset size by a constant factorBut does not destroy the coreset property
⇒ BICO computes a coreset in the data stream setting ¨
. . . if we compute lower bound on T
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
Merging might result in violations of the range of nodes
Increases the coreset size by a constant factorBut does not destroy the coreset property
⇒ BICO computes a coreset in the data stream setting ¨
. . . if we compute lower bound on T
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
Merging might result in violations of the range of nodesIncreases the coreset size by a constant factor
But does not destroy the coreset property
⇒ BICO computes a coreset in the data stream setting ¨
. . . if we compute lower bound on T
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
Merging might result in violations of the range of nodesIncreases the coreset size by a constant factorBut does not destroy the coreset property
⇒ BICO computes a coreset in the data stream setting ¨
. . . if we compute lower bound on T
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
Merging might result in violations of the range of nodesIncreases the coreset size by a constant factorBut does not destroy the coreset property
⇒ BICO computes a coreset in the data stream setting ¨
. . . if we compute lower bound on T
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH meets Coresets
Merging might result in violations of the range of nodesIncreases the coreset size by a constant factorBut does not destroy the coreset property
⇒ BICO computes a coreset in the data stream setting ¨
. . . if we compute lower bound on T
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BICO is cool :-)
The actual solution is computed with k -means++.
Adjustments
Add heuristic speed-up to find closest reference point
Speed-upProject all ref. points to d random 1-dim. subspacesProject new point p to the same subspacesCount how many ref. points are in range of p in everysubspaceSearch nearest neighbor in the shortest list
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BICO is cool :-)
The actual solution is computed with k -means++.
AdjustmentsSet coreset size to 200k
Add heuristic speed-up to find closest reference point
Speed-upProject all ref. points to d random 1-dim. subspacesProject new point p to the same subspacesCount how many ref. points are in range of p in everysubspaceSearch nearest neighbor in the shortest list
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BICO is cool :-)
The actual solution is computed with k -means++.
AdjustmentsSet coreset size to 200kAdd heuristic speed-up to find closest reference point
Speed-upProject all ref. points to d random 1-dim. subspacesProject new point p to the same subspacesCount how many ref. points are in range of p in everysubspaceSearch nearest neighbor in the shortest list
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BICO is cool :-)
The actual solution is computed with k -means++.
AdjustmentsSet coreset size to 200kAdd heuristic speed-up to find closest reference point
Speed-upProject all ref. points to d random 1-dim. subspacesProject new point p to the same subspacesCount how many ref. points are in range of p in everysubspaceSearch nearest neighbor in the shortest list
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BICO is cool :-)
The actual solution is computed with k -means++.
AdjustmentsSet coreset size to 200kAdd heuristic speed-up to find closest reference point
Speed-upProject all ref. points to d random 1-dim. subspacesProject new point p to the same subspacesCount how many ref. points are in range of p in everysubspaceSearch nearest neighbor in the shortest list
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BICO is cool :-)
Data SetsData Sets used in StreamKM++ paper from UCI repository:Tower, CoverType, Census and BigCross (cross product)CalTech128 by Rene Grzeszick, group of Prof. Finkconsists of 128 SIFT descriptors of an object database
Data Set SizesBigCross CalTech128 Census CoverType Tower
n 11620300 3168383 2458285 581012 4915200d 57 128 68 55 3
n · d 662357100 405553024 167163380 31955660 14745600
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BICO is cool :-)
Data SetsData Sets used in StreamKM++ paper from UCI repository:Tower, CoverType, Census and BigCross (cross product)CalTech128 by Rene Grzeszick, group of Prof. Finkconsists of 128 SIFT descriptors of an object database
Data Set SizesBigCross CalTech128 Census CoverType Tower
n 11620300 3168383 2458285 581012 4915200d 57 128 68 55 3
n · d 662357100 405553024 167163380 31955660 14745600
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BICO is cool :-)
ImplementationsAuthor’s implementations for StreamKM++ and BIRCHimplementation for MacQueen’s k -means fromESMERALDA (framework by group of Prof. Fink)
ExperimentsExperiments done on mud1-6 and mud8100 runs for every test instancevalues shown in the diagrams are mean values
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BICO is cool :-)
ImplementationsAuthor’s implementations for StreamKM++ and BIRCHimplementation for MacQueen’s k -means fromESMERALDA (framework by group of Prof. Fink)
ExperimentsExperiments done on mud1-6 and mud8100 runs for every test instancevalues shown in the diagrams are mean values
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BICO is cool :-)
Costs
3e+012
3.5e+012
4e+012
4.5e+012
5e+012
5.5e+012
6e+012
6.5e+012
7e+012
15 20 25 30
Cos
ts
Number of Centers
BigCross
StreamKM++BICO
MacQueenBIRCH
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BICO is cool :-)
Costs
1e+011
1.5e+011
2e+011
2.5e+011
3e+011
3.5e+011
4e+011
4.5e+011
5e+011
5.5e+011
6e+011
10 20 30 40 50
Cos
ts
Number of Centers
Covertype
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BICO is cool :-)
Costs
1e+012
1.02e+012
1.04e+012
1.06e+012
1.08e+012
1.1e+012
1.12e+012
1.14e+012
1.16e+012
250
Costs
Number of Centers
BigCross
StreamKM++BICO
MacQueenBIRCH
1e+011
1.5e+011
2e+011
2.5e+011
3e+011
250
Number of Centers
CalTech
1e+011
1.2e+011
1.4e+011
1.6e+011
1.8e+011
2e+011
2.2e+011
2.4e+011
2.6e+011
1000
Number of Centers
CalTech
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BICO is cool :-)
Running time
0
200
400
600
800
1000
1200
10 20 30 40 50
Run
ning
Tim
e
Number of Centers
Census
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BICO is cool :-)
Running time
0
1000
2000
3000
4000
5000
6000
7000
250
Ru
nn
ing
Tim
e
Number of Centers
Census
StreamKM++
BICOBICO KMeans++
MacQueen
BIRCH
0
20000
40000
60000
80000
100000
120000
140000
160000
180000
1000
Number of Centers
Census
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BICO is cool :-)
Running time
0
2000
4000
6000
8000
10000
12000
14000
16000
18000
250
Number of Centers
CalTech
0
100000
200000
300000
400000
500000
600000
1000
Number of Centers
CalTech
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BICO is cool :-)
Running time
0
2000
4000
6000
8000
10000
12000
14000
250
Ru
nn
ing
Tim
e
Number of Centers
BigCross
StreamKM++
BICOBICO KMeans++
MacQueen
BIRCH
0
20000
40000
60000
80000
100000
120000
140000
160000
180000
1000
Number of Centers
BigCross
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BICO is cool :-)
Trade-OffBIRCH MQ m = 200k 100k 25k
Time 616 4241 20271 5444 618Costs 58 · 1010 72 · 1010 51 · 1010 52 ·1010 55 · 1010
Tests run on on BigCross with k = 1000BICO is less than 1.5 times slower than MacQueen withm = 100k while still computing reasonable costsfaster than BIRCH for m = 25k , still much better cost thanBIRCH and MacQueen
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BICO is cool :-)
ZieleImplementierung in bekanntem Framework / Anbindung
Baustein fur andere AlgorithmenVergleich verschiedener Strategien fur Nearest Neighbor
BICO: BIRCH meets Coresets for k -means
Introduction Insights from BIRCH Coreset Theory BICO Experiments End
BIRCH MQ m = 200k 100k 25kTime 616 4241 20271 5444 618Costs 58 · 1010 72 · 1010 51 · 1010 52 ·1010 55 · 1010
Thank you for your attention!
BICO: BIRCH meets Coresets for k -means