24
Practical Machine Learning in R Clustering Lars Kotthoff 12 [email protected] 1 with slides from Bernd Bischl and Michel Lang 2 slides available at http://www.cs.uwyo.edu/~larsko/ml-fac 1

Practical Machine Learning in R - Clusteringlarsko/ml-fac/03-clustering.pdf · Practical Machine Learning in R - Clustering Author: Lars Kotthoff [email protected] Created Date: 10/22/2017

  • Upload
    others

  • View
    5

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Practical Machine Learning in R - Clusteringlarsko/ml-fac/03-clustering.pdf · Practical Machine Learning in R - Clustering Author: Lars Kotthoff larsko@uwyo.edu Created Date: 10/22/2017

Practical Machine Learning inR

Clustering

Lars Kotthoff12

[email protected]

1with slides from Bernd Bischl and Michel Lang2slides available at http://www.cs.uwyo.edu/~larsko/ml-fac

1

Page 2: Practical Machine Learning in R - Clusteringlarsko/ml-fac/03-clustering.pdf · Practical Machine Learning in R - Clustering Author: Lars Kotthoff larsko@uwyo.edu Created Date: 10/22/2017

Unsupervised Clustering

●●● ●●

●●

● ●

●●

●●

● ●●

●●

●● ●●

● ●● ●

● ●

●●

●● ●●

● ●

● ●

●●●

● ●

●●

● ●

●● ●

●●

0.0

0.5

1.0

1.5

2.0

2.5

2 4 6

a

b

Goal: Group data by similarity, or estimate membershipprobabilities

2

Page 3: Practical Machine Learning in R - Clusteringlarsko/ml-fac/03-clustering.pdf · Practical Machine Learning in R - Clustering Author: Lars Kotthoff larsko@uwyo.edu Created Date: 10/22/2017

k-means clustering

▷ pick k cluster centers randomly▷ assign each data point to a cluster by shortest mean distance▷ centroid (point with smallest mean distance to all points) of

each cluster becomes new center▷ repeat until convergence

3

Page 4: Practical Machine Learning in R - Clusteringlarsko/ml-fac/03-clustering.pdf · Practical Machine Learning in R - Clustering Author: Lars Kotthoff larsko@uwyo.edu Created Date: 10/22/2017

k-means clustering

▷ easy to understand, runs quickly▷ need to specify number of clusters▷ clusters are spherical

4

Page 5: Practical Machine Learning in R - Clusteringlarsko/ml-fac/03-clustering.pdf · Practical Machine Learning in R - Clustering Author: Lars Kotthoff larsko@uwyo.edu Created Date: 10/22/2017

k-means clustering

5

Page 6: Practical Machine Learning in R - Clusteringlarsko/ml-fac/03-clustering.pdf · Practical Machine Learning in R - Clustering Author: Lars Kotthoff larsko@uwyo.edu Created Date: 10/22/2017

k-means clustering

6

Page 7: Practical Machine Learning in R - Clusteringlarsko/ml-fac/03-clustering.pdf · Practical Machine Learning in R - Clustering Author: Lars Kotthoff larsko@uwyo.edu Created Date: 10/22/2017

k-means clustering

7

Page 8: Practical Machine Learning in R - Clusteringlarsko/ml-fac/03-clustering.pdf · Practical Machine Learning in R - Clustering Author: Lars Kotthoff larsko@uwyo.edu Created Date: 10/22/2017

k-means clustering

8

Page 9: Practical Machine Learning in R - Clusteringlarsko/ml-fac/03-clustering.pdf · Practical Machine Learning in R - Clustering Author: Lars Kotthoff larsko@uwyo.edu Created Date: 10/22/2017

k-means clustering

9

Page 10: Practical Machine Learning in R - Clusteringlarsko/ml-fac/03-clustering.pdf · Practical Machine Learning in R - Clustering Author: Lars Kotthoff larsko@uwyo.edu Created Date: 10/22/2017

k-means clustering

10

Page 11: Practical Machine Learning in R - Clusteringlarsko/ml-fac/03-clustering.pdf · Practical Machine Learning in R - Clustering Author: Lars Kotthoff larsko@uwyo.edu Created Date: 10/22/2017

k-means clustering

11

Page 12: Practical Machine Learning in R - Clusteringlarsko/ml-fac/03-clustering.pdf · Practical Machine Learning in R - Clustering Author: Lars Kotthoff larsko@uwyo.edu Created Date: 10/22/2017

k-means clustering

12

Page 13: Practical Machine Learning in R - Clusteringlarsko/ml-fac/03-clustering.pdf · Practical Machine Learning in R - Clustering Author: Lars Kotthoff larsko@uwyo.edu Created Date: 10/22/2017

k-means clustering

13

Page 14: Practical Machine Learning in R - Clusteringlarsko/ml-fac/03-clustering.pdf · Practical Machine Learning in R - Clustering Author: Lars Kotthoff larsko@uwyo.edu Created Date: 10/22/2017

k-means clustering

14

Page 15: Practical Machine Learning in R - Clusteringlarsko/ml-fac/03-clustering.pdf · Practical Machine Learning in R - Clustering Author: Lars Kotthoff larsko@uwyo.edu Created Date: 10/22/2017

k-means clustering

15

Page 16: Practical Machine Learning in R - Clusteringlarsko/ml-fac/03-clustering.pdf · Practical Machine Learning in R - Clustering Author: Lars Kotthoff larsko@uwyo.edu Created Date: 10/22/2017

k-means clustering

16

Page 17: Practical Machine Learning in R - Clusteringlarsko/ml-fac/03-clustering.pdf · Practical Machine Learning in R - Clustering Author: Lars Kotthoff larsko@uwyo.edu Created Date: 10/22/2017

k-means clustering

17

Page 18: Practical Machine Learning in R - Clusteringlarsko/ml-fac/03-clustering.pdf · Practical Machine Learning in R - Clustering Author: Lars Kotthoff larsko@uwyo.edu Created Date: 10/22/2017

k-means clustering

18

Page 19: Practical Machine Learning in R - Clusteringlarsko/ml-fac/03-clustering.pdf · Practical Machine Learning in R - Clustering Author: Lars Kotthoff larsko@uwyo.edu Created Date: 10/22/2017

k-means clustering

By Chire - Own work, GFDL, https://commons.wikimedia.org/w/index.php?curid=59409335

19

Page 20: Practical Machine Learning in R - Clusteringlarsko/ml-fac/03-clustering.pdf · Practical Machine Learning in R - Clustering Author: Lars Kotthoff larsko@uwyo.edu Created Date: 10/22/2017

EM – Expectation Maximization

▷ maximize likelihood of clusters, given data▷ estimate distribution of data as mixture of distributions▷ compute expectation of clusters for fixed model▷ determine model parameters that maximize fixed clusters

▷ repeat until convergence▷ can determine number of clusters automatically

20

Page 21: Practical Machine Learning in R - Clusteringlarsko/ml-fac/03-clustering.pdf · Practical Machine Learning in R - Clustering Author: Lars Kotthoff larsko@uwyo.edu Created Date: 10/22/2017

EM – Expectation Maximization

21

Page 22: Practical Machine Learning in R - Clusteringlarsko/ml-fac/03-clustering.pdf · Practical Machine Learning in R - Clustering Author: Lars Kotthoff larsko@uwyo.edu Created Date: 10/22/2017

DBScan

▷ density-based clustering▷ find core points (with a large number of neighbors)▷ find connected core points, and which core points other points

are assigned to▷ number of clusters and shape determined automatically▷ need to specify minimum number of points in a cluster and

density threshold

22

Page 23: Practical Machine Learning in R - Clusteringlarsko/ml-fac/03-clustering.pdf · Practical Machine Learning in R - Clustering Author: Lars Kotthoff larsko@uwyo.edu Created Date: 10/22/2017

DBScan

23

Page 24: Practical Machine Learning in R - Clusteringlarsko/ml-fac/03-clustering.pdf · Practical Machine Learning in R - Clustering Author: Lars Kotthoff larsko@uwyo.edu Created Date: 10/22/2017

Exercises

http://www.cs.uwyo.edu/~larsko/ml-fac/03-clustering-exercises.Rmd

24