18
FCM-Based Model Selection Algorithms for Determining the Number of Clusters Haojun Sun,ShengruiWang*,Qingshan Jiang Received 16 December 2002; received in revised form 29 March 2004; accepted 29 March 2004 Presenter Chia-Cheng Chen 1

Haojun Sun,ShengruiWang*,Qingshan Jiang Received 16 December 2002; received in revised form 29 March 2004; accepted 29 March 2004 Presenter Chia-Cheng

Embed Size (px)

Citation preview

FCM-Based Model Selection Algorithms for Determining the

Number of Clusters

Haojun Sun,ShengruiWang*,Qingshan Jiang

Received 16 December 2002; received in revised form 29 March 2004; accepted 29 March 2004

Presenter Chia-Cheng Chen 1

Introduction

Basic algorithm

A new validity index

Experimental results

Conclusion and perspectives

Outline

2

Clustering is a process for grouping a set of objects into classes or clusters so that the objects within a cluster have high similarity.

Because of its concept of fuzzy membership, FCM is able to deal more effectively with outliers and to perform membership grading, which is very important in practice.

Introduction

3

FCM algorithm

Basic algorithm

4

Basic algorithm(Cont.)

5

FCM-based model selection algorithm

Basic algorithm(Cont.)

6

FBSA: FCM-Based Splitting Algorithm

Basic algorithm(Cont.)

7

Function S(i)

Basic algorithm(Cont.)

8

A new validity index

A new validity index

9

A new validity index(Cont.)

10

DataSet1◦ IRIS data◦ This is a biometric data set consisting of 150 measurements

belonging to three flower varieties DataSet2◦Mixture of Gaussian distributions◦ 50 data vectors in each of the 5ve clusters

DataSet3 ◦Mixture of Gaussian distributions◦ 500 data vectors

Experimental results

11

Experimental results(Cont.)

12

Experimental results(Cont.)

13

Experimental results(Cont.)

14

Experimental results(Cont.)

15

Experimental results(Cont.)

16

Experimental results(Cont.)

17

The major contributions of this paper are an improved FCM-based algorithm for determining the number of clusters and a new index for validating clustering results.

Use of the new algorithm to deal with the dimension reduction problem is another promising avenue.

Conclusion and perspectives

18