MS Clustering Chapters15_to_17_Part5

Uploaded by walter-obrien

MS Clustering

Chapters 15 to 17, Part 5


What is it

Clustering is the classification of objects into different groups, or more precisely, the partitioning of a data set into subsets (clusters), so that the data in each subset (ideally) share some common trait - often proximity according to some defined distance measure.
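To make "proximity according to some defined distance measure" concrete, here is a minimal sketch that assigns each point to its nearest cluster center. It assumes Euclidean distance; the sample points, the centers, and the function name `assign_to_nearest` are illustrative and not part of the original slides.

```python
import math

def assign_to_nearest(points, centers):
    """Return, for each point, the index of the closest center (Euclidean distance)."""
    return [
        min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
        for p in points
    ]

# Hypothetical 2-D data: two points near (1, 1.5) and two near (8.5, 9).
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)]
centers = [(1.0, 1.5), (8.5, 9.0)]

labels = assign_to_nearest(points, centers)
print(labels)
```

Any other distance measure could be swapped in for `math.dist`; the grouping that results depends on that choice.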


We have been doing it

• We have been grouping people, cars, etc.
• We are just not very good at it when we have too many items to keep track of
• Experts can track five to six dimensions; a data set may have many times that
• We can see only the obvious groups, most likely
• It is difficult for us to see the hidden ones, or the combined ones


An Example

You can group your customers (for a bike store) into several groups based on
• Gender
• Income
• Age
• Etc.

There may be other traits to consider, such as whether they play games.


Principles of Clustering

Guessing and lying (MS)
• Setting clusters
• Training with data
• Calibrating your clusters
• Training again
• Repeating until converged or going nowhere

The clustering methodology is very sensitive to the starting points and can converge to a local solution that may not be the globally optimal one.
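The sensitivity to starting points can be seen with a small sketch. It uses plain k-means as a stand-in for the set, train, calibrate, repeat loop (an assumption; the slides do not name the method here), and the four data points and both sets of starting centers are made up for illustration.

```python
import math

def kmeans(points, centers, max_iter=100):
    """Plain k-means: assign cases, recompute centers, repeat until nothing moves."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        # Training step: put each case in the cluster with the nearest center.
        labels = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        # Calibration step: move each center to the mean of its cases.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[k])
        if new_centers == centers:
            break  # converged
        centers = new_centers
    # Sum of squared distances to the assigned center (lower is better).
    sse = sum(math.dist(p, centers[label]) ** 2 for p, label in zip(points, labels))
    return centers, labels, sse

# Corners of a wide rectangle: the natural split is left pair vs right pair.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

_, good_labels, good_sse = kmeans(points, [(0.0, 0.0), (4.0, 0.0)])
_, bad_labels, bad_sse = kmeans(points, [(2.0, 0.0), (2.0, 1.0)])

print(good_labels, good_sse)  # left/right split
print(bad_labels, bad_sse)    # bottom/top split, a much worse local solution
```

Both runs stop because no center moves any more, yet the second is stuck in a solution with a far larger error. Trying several starting points and keeping the best result is a common remedy.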


Soft and hard clustering

• One case, one cluster – hard
• One case, several clusters – soft
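The difference can be sketched as follows. The soft memberships here are weights proportional to exp(-distance²), normalized to sum to 1; that weighting is an assumption chosen for simplicity, not necessarily what any particular product uses. The hard assignment simply takes the cluster with the largest membership.

```python
import math

def soft_memberships(point, centers):
    """One case, several clusters: a membership weight per cluster, summing to 1."""
    weights = [math.exp(-math.dist(point, c) ** 2) for c in centers]
    total = sum(weights)
    return [w / total for w in weights]

def hard_membership(point, centers):
    """One case, one cluster: the index of the cluster with the largest weight."""
    probs = soft_memberships(point, centers)
    return max(range(len(centers)), key=lambda k: probs[k])

# Hypothetical centers and a case that sits between them, closer to the first.
centers = [(0.0, 0.0), (2.0, 0.0)]
point = (0.8, 0.0)

probs = soft_memberships(point, centers)
label = hard_membership(point, centers)
print(probs, label)
```

The soft result keeps the information that the case partly resembles the second cluster; the hard result discards it.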


Scalable clustering

Ideally, data points that will not change clusters do not need to be considered again.

Microsoft's implementation reads the first 50,000 cases. If the model does not converge on those, it processes the next 50,000, rather than reading in and processing all 100,000 at once.
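The batch-by-batch idea can be sketched as below. This is a simplified illustration, not Microsoft's actual implementation: it uses k-means, keeps only per-cluster sums and counts for earlier batches (so those cases are never revisited), and treats "converged" as "assignments in the current batch stopped changing within `max_iter` passes". The names `fit_in_batches`, `sample_size`, and `max_iter` are hypothetical.

```python
import math

def nearest(point, centers):
    return min(range(len(centers)), key=lambda k: math.dist(point, centers[k]))

def fit_in_batches(cases, centers, sample_size, max_iter=10):
    """Read cases one batch at a time; stop early once a batch converges."""
    centers = [tuple(c) for c in centers]
    dims = len(centers[0])
    # Summaries of earlier batches: those cases are not looked at again.
    frozen_sums = [[0.0] * dims for _ in centers]
    frozen_counts = [0] * len(centers)
    cases_read = 0
    for start in range(0, len(cases), sample_size):
        batch = cases[start:start + sample_size]
        cases_read += len(batch)
        labels, converged = None, False
        for _ in range(max_iter):
            new_labels = [nearest(p, centers) for p in batch]
            if new_labels == labels:
                converged = True  # no case in this batch switched clusters
                break
            labels = new_labels
            # Recompute centers from the frozen summaries plus the current batch.
            sums = [row[:] for row in frozen_sums]
            counts = frozen_counts[:]
            for p, k in zip(batch, labels):
                counts[k] += 1
                for d in range(dims):
                    sums[k][d] += p[d]
            centers = [
                tuple(s / counts[k] for s in sums[k]) if counts[k] else centers[k]
                for k in range(len(centers))
            ]
        if converged:
            break  # no need to read the remaining cases
        frozen_sums, frozen_counts = sums, counts
    return centers, cases_read

# Hypothetical data: six cases, read three at a time.
cases = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0), (1.0, 0.0), (11.0, 10.0)]
start_centers = [(0.0, 0.0), (10.0, 10.0)]

centers_a, read_a = fit_in_batches(cases, start_centers, sample_size=3)
centers_b, read_b = fit_in_batches(cases, start_centers, sample_size=3, max_iter=1)
print(read_a, read_b)
```

With enough passes the first batch converges and the second batch is never read; with a single pass per batch, convergence is never confirmed and all cases are processed.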


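A few interesting parameters

Clustering_Method
• Which method to use, 1 to 4 (1 = scalable EM, 2 = non-scalable EM, 3 = scalable K-means, 4 = non-scalable K-means)

Cluster_Count
• The number of clusters to find
• 0 makes the algorithm guess a good number

Minimum_Support
• The case count below which a cluster is considered empty

Stopping_Tolerance
• The number of cases that switch clusters, used to decide when training has converged

Sample_Size
• For scalable clustering: the number of cases read in each pass

Cluster_Seed
• Where to put the initial clusters (the seed for the random starting points)

Maximum_Input_Attributes
• The number of attributes that can be considered before automatic feature selection kicks in. Automatic feature selection selects the most popular attributes.

Maximum_States
• The number of possible values (states) of an attribute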
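Two of these parameters are easy to picture in code. The sketch below follows the slide's descriptions only: the stopping rule counts the cases that switched clusters between two passes, and a cluster with fewer cases than the minimum support is reported as empty. The function names and the exact comparisons (`<=`, `<`) are assumptions for illustration, not the product's documented behavior.

```python
from collections import Counter

def cases_switched(previous_labels, current_labels):
    """Count the cases whose cluster changed between two training passes."""
    return sum(1 for a, b in zip(previous_labels, current_labels) if a != b)

def has_converged(previous_labels, current_labels, stopping_tolerance):
    """Assumed rule: stop once no more than `stopping_tolerance` cases switch."""
    return cases_switched(previous_labels, current_labels) <= stopping_tolerance

def empty_clusters(labels, cluster_count, minimum_support):
    """Clusters holding fewer than `minimum_support` cases are treated as empty."""
    counts = Counter(labels)
    return [k for k in range(cluster_count) if counts[k] < minimum_support]

# Hypothetical assignments of eight cases from two consecutive passes.
previous = [0, 0, 1, 1, 2, 2, 0, 1]
current = [0, 0, 1, 1, 2, 0, 0, 1]

print(cases_switched(previous, current))
print(has_converged(previous, current, stopping_tolerance=1))
print(empty_clusters(current, cluster_count=3, minimum_support=2))
```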


Understanding The Results

Comprehending the results can be difficult because you have to look from many directions
• High-level overview
• Look into a cluster
• Determine how a cluster differs from a nearby one


High-level overview

Cluster Profiles view (too much info)
• Get some sense of who or what is in each cluster

http://1.bp.blogspot.com/-tAvUhhOrvro/UhBNg40RGjI/AAAAAAAABBg/v3dn8DIjtSI/s1600/Clustering_fnl.png


High-level overview

Cluster Diagram view
• Get some sense of the relationships among clusters


Look into a cluster

The Cluster Characteristics view
• See the attributes that go together
• Note that an attribute may rank high simply because it ranks high in every cluster. In that case, it is not that interesting.
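One way to spot such uninteresting attributes is to compare how common a trait is inside a cluster with how common it is across all cases. The sketch below computes that ratio (often called lift); the bike-store style data, the trait names, and the function names are made up for illustration, and this is not how the viewer itself ranks attributes.

```python
def trait_frequency(cases, trait, cluster=None):
    """Share of cases having `trait`, within one cluster or across all cases."""
    selected = [traits for c, traits in cases if cluster is None or c == cluster]
    return sum(trait in traits for traits in selected) / len(selected)

def lift(cases, trait, cluster):
    """Ratio near 1: the trait is just as common everywhere, so not distinctive."""
    return trait_frequency(cases, trait, cluster) / trait_frequency(cases, trait)

# Hypothetical cases as (cluster, set of traits).
cases = [
    (0, {"owns car", "plays games"}),
    (0, {"owns car", "plays games"}),
    (0, {"owns car", "plays games"}),
    (0, {"owns car"}),
    (1, {"owns car"}),
    (1, {"owns car"}),
    (1, {"owns car"}),
    (1, {"owns car", "plays games"}),
]

print(trait_frequency(cases, "owns car", cluster=0), lift(cases, "owns car", 0))
print(trait_frequency(cases, "plays games", cluster=0), lift(cases, "plays games", 0))
```

Here "owns car" is the most frequent trait in cluster 0 but is equally frequent everywhere, while "plays games" is less frequent yet says more about what sets the cluster apart.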


Cluster characteristic view


Look outside a cluster

Discrimination and Complement
• Shows you which attributes are important
