
Page 1

Unsupervised Feature Selection for Multi-Cluster Data

Deng Cai et al., KDD 2010

Presenter: Yunchao Gong, Dept. of Computer Science, UNC Chapel Hill

Page 2

Introduction

Page 3

The problem

• Feature selection for high-dimensional data

• Assume a high-dimensional n × d data matrix X (n samples, d features)

• When d is very large (e.g., d > 10,000), problems arise:
– Storing the data is expensive
– Learning from it is computationally challenging

Page 4

A solution to this problem

• Select the most informative features

• Again, consider the high-dimensional n × d data matrix X

[Figure: the n × d data matrix X]

Page 5

After feature selection

[Figure: after selection, the matrix shrinks from n × d to n × d', with d' << d]

• The reduced representation is more scalable and much more efficient to work with

Page 6

Previous approaches

• Supervised feature selection
– Fisher score
– Information-theoretic feature selection

• Unsupervised feature selection
– Laplacian Score
– Max Variance
– Random selection

Page 7

Motivation of the paper

• Improving clustering (unsupervised) using feature selection

• Automatically select the most useful features by discovering the manifold structure of the data

Page 8

Motivation and Novelty

• Previous approaches
– Greedy selection
– Select each feature independently

• This approach
– Considers all features together

Page 9

A toy example

Page 10

Some background—manifold learning

• Tries to discover the low-dimensional manifold structure underlying the data

Page 11

[Figure: manifolds in vision; example of appearance variation]

Page 12

Some background—manifold learning

• Discovers the smooth manifold structure of the data
• Maps different manifolds (classes) as far apart as possible

Page 13

The proposed method

Page 14

Outline of the approach

• 1. manifold learning for cluster analysis

• 2. sparse coefficient regression

• 3. sparse feature selection

Page 15

Outline of the approach

[Figure: the three-step pipeline; step 1 generates the response matrix Y]

Page 16

1. Manifold learning for clustering

• Manifold learning to find the clusters of the data

[Figure: toy data with four clusters, labeled 1–4]

Page 17

1. Manifold learning for clustering

• Observations
– Points in the same class are clustered together

Page 18

1. Manifold learning for clustering

• Idea
– How can we discover the manifold structure?

Page 19

1. Manifold learning for clustering

• Map similar points close together
• Map dissimilar points far apart

min_f Σ_{i,j} (f_i − f_j)^2 K_{ij}
(K_{ij} is the similarity between data points x_i and x_j)

Page 20

1. Manifold learning for clustering

• Map similar points close together
• Map dissimilar points far apart

min_f Σ_{i,j} (f_i − f_j)^2 K_{ij} = min_f 2 f^T L f,
where L = D − K and D = diag(sum(K))
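The equality on this slide is the standard graph-Laplacian identity; as a quick sanity check (standard algebra written in LaTeX notation, not copied from the slides):

\sum_{i,j} (f_i - f_j)^2 K_{ij}
  = \sum_{i,j} \big(f_i^2 + f_j^2 - 2 f_i f_j\big) K_{ij}
  = 2\, f^\top D f - 2\, f^\top K f
  = 2\, f^\top (D - K)\, f
  = 2\, f^\top L f,
\qquad \text{with } D_{ii} = \sum_j K_{ij}, \quad L = D - K.

The constant factor 2 does not change the minimizer, so it is usually dropped.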

Page 21

1. Manifold learning for clustering

• Constrain f to be D-orthonormal (f^T D f = I) to eliminate the arbitrary scaling of f

• So we have the minimization problem: min_f f^T L f subject to f^T D f = I, whose solutions are the generalized eigenvectors of L f = λ D f (the trivial constant eigenvector is discarded)

Page 22

Summary of manifold discovery

• Construct the nearest-neighbor graph and its affinity matrix K

• Choose a weighting scheme for K (e.g., 0–1 or heat-kernel weights)

• Solve the resulting generalized eigenproblem L f = λ D f for the embedding vectors f

• Use the eigenvectors f as the response vectors Y (a code sketch of this step follows)
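A minimal sketch of this manifold-discovery step (my own illustration, not the authors' code), assuming a k-nearest-neighbor graph with heat-kernel weights and SciPy/scikit-learn helpers; the function name spectral_embedding and the parameter defaults are placeholders:

import numpy as np
from scipy.linalg import eigh
from sklearn.neighbors import kneighbors_graph

def spectral_embedding(X, n_clusters, n_neighbors=5, t=1.0):
    """Embed the rows of X (n x d) into n_clusters response vectors.

    Builds a k-NN affinity matrix K, forms D = diag(sum(K)) and L = D - K,
    then solves the generalized eigenproblem L f = lambda D f and keeps the
    smallest nontrivial eigenvectors. Assumes every point has at least one
    neighbor at a positive distance, so that D is positive definite.
    """
    dist = kneighbors_graph(X, n_neighbors, mode="distance").toarray()
    K = np.where(dist > 0, np.exp(-dist ** 2 / t), 0.0)  # heat-kernel weights
    K = np.maximum(K, K.T)                               # symmetrize the graph
    D = np.diag(K.sum(axis=1))
    L = D - K                                            # graph Laplacian
    eigvals, eigvecs = eigh(L, D)                        # ascending eigenvalues
    return eigvecs[:, 1:n_clusters + 1]                  # drop the constant vector

Each column of the returned matrix plays the role of one response vector Y_k in step 2.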

Page 23

Response vector Y

• Y reveals the manifold structure!

[Figure: the four toy clusters (1–4) and the corresponding response matrix Y; each column of Y indicates one cluster]

Page 24

2. Sparse coefficient regression

• Once we have obtained Y, how do we use it to perform feature selection?

• Y reveals the cluster structure, so we use it as the response in a sparse regression

Page 25

Sparse regression

• The ``Lasso'': min_a ||Y_k − X a||^2 subject to ||a||_1 ≤ γ
(the L1 constraint drives most entries of a to zero, i.e., a is sparse)

Page 26

Steps to perform sparse regression

• Generate Y from step 1
• Take the data matrix X as input
• Denote the k-th column of Y by Y_k
• For each Y_k, estimate a sparse coefficient vector a_k by solving the Lasso problem above (a code sketch follows)
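A hedged sketch of this per-column regression, assuming scikit-learn's LARS solver truncated at a fixed number of nonzero coefficients stands in for the L1-constrained fit; sparse_coefficients and n_nonzero are illustrative names/values, not the paper's exact settings:

import numpy as np
from sklearn.linear_model import Lars

def sparse_coefficients(X, Y, n_nonzero=50):
    """Regress each response column Y[:, k] on X under a sparsity constraint.

    Returns a d x c matrix A whose k-th column is the sparse coefficient
    vector a_k for the k-th response.
    """
    d, c = X.shape[1], Y.shape[1]
    A = np.zeros((d, c))
    for k in range(c):
        # LARS truncated at n_nonzero active features stands in for the
        # L1-constrained (Lasso-style) fit of Y_k on X.
        model = Lars(n_nonzero_coefs=n_nonzero)
        model.fit(X, Y[:, k])
        A[:, k] = model.coef_
    return A

Using LassoLars with an explicit penalty weight would be an equally reasonable stand-in here.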

Page 27

illustration

[Figure: regressing Y_k on X; the estimated coefficient vector a is sparse, with only a few nonzero entries]

Page 28

3. Sparse feature selection

• But Y has c different columns Y_k, so we obtain c coefficient vectors a_k

• How to finally combine them?

• A simple heuristic: score each feature by the largest absolute coefficient it receives across the c vectors (sketched below)
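A minimal sketch of this combination step, assuming the per-feature score is the maximum absolute coefficient across the c sparse vectors; the helper names are placeholders:

import numpy as np

def mcfs_scores(A):
    """Score each feature by the largest |coefficient| it receives
    across the c sparse vectors (columns of the d x c matrix A)."""
    return np.abs(A).max(axis=1)

def select_features(A, n_selected):
    """Indices of the n_selected highest-scoring features."""
    return np.argsort(mcfs_scores(A))[::-1][:n_selected]

For example, select_features(A, 200) would keep the 200 highest-scoring features.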

Page 29

illustration

[Figure: the c sparse coefficient vectors a_1, …, a_c placed side by side; the feature score is taken across them]

Page 30

The final algorithm

• 1. manifold learning to obtain Y

• 2. sparse regression to select features

• 3. final combination

(the response Y is formed from the embedding vectors f obtained in step 1; an end-to-end sketch follows)
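Tying the three steps together, reusing the hypothetical helpers sketched on the earlier slides (parameter values are placeholders, not tuned settings):

def mcfs(X, n_clusters, n_selected, n_neighbors=5, n_nonzero=50):
    """End-to-end sketch: spectral responses -> sparse regression -> scores."""
    Y = spectral_embedding(X, n_clusters, n_neighbors)  # step 1
    A = sparse_coefficients(X, Y, n_nonzero)            # step 2
    idx = select_features(A, n_selected)                # step 3
    return X[:, idx], idx

# e.g., X_reduced, idx = mcfs(X, n_clusters=4, n_selected=200)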

Page 31

Step 1

[Figure: the four toy clusters (1–4) and the response matrix Y produced by the manifold-learning step]

Page 32

Step 2

[Figure: each column of Y is regressed on X; the resulting coefficient vectors a are sparse, with only a few nonzero entries]

Page 33

Step 3

[Figure: the c sparse coefficient vectors a_1, …, a_c are combined into a single per-feature score]

Page 34

Discussion

Page 35

Novelty of this approach

• Considers all features jointly, rather than scoring them one at a time

• Uses sparse regression (the Lasso) to perform feature selection

• Combines manifold learning with feature selection

Page 36

Results

Page 37

Face image clustering

Page 38

Digit recognition

Page 39

Summary

– The method is technically new and interesting

– But not groundbreaking

Page 40

Summary

• Test on more challenging vision datasets
– Caltech
– ImageNet