Improving the Quality of Self-Organizing Maps by Self-Intersection Avoidance

Intelligent Database Systems
Presenter: BEI-YI JIANG
Authors: GUENAEL CABANES, YOUNES BENNANI, DOMINIQUE FRESNEAU
2012. ELSEVIER

Page 1: Improving the Quality of Self-Organizing Maps by  Self-Intersection Avoidance

Intelligent Database Systems Lab

Presenter : BEI-YI JIANG

Authors : GUENAEL CABANES, YOUNES BENNANI, DOMINIQUE FRESNEAU

2012. ELSEVIER

Improving the Quality of Self-Organizing Maps by Self-Intersection Avoidance

Page 2: Improving the Quality of Self-Organizing Maps by  Self-Intersection Avoidance

Outlines

• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments

Page 3: Improving the Quality of Self-Organizing Maps by  Self-Intersection Avoidance

Motivation

• The exponential growth of data generates terabytes of very large databases.

• The growing number of data dimensions and data objects presents tremendous challenges for effective data analysis and data exploration methods and tools.

Page 4: Improving the Quality of Self-Organizing Maps by  Self-Intersection Avoidance

Objectives

• Develop a method of describing data from enriched and segmented prototypes using a topological clustering algorithm.

• Provide data visualizations via maps and graphs, to provide a comprehensive exploration of the data structure.

Page 5: Improving the Quality of Self-Organizing Maps by  Self-Intersection Avoidance

Methodology

• Prototype enrichment
• Clustering of prototypes
• Modeling data distributions
• Visualization

Page 6: Improving the Quality of Self-Organizing Maps by  Self-Intersection Avoidance

Methodology-learning data structure

• Prototype enrichment

Input: The distance matrix Dist(w, x) between the M prototypes w and the N data x.

Output: The density Di and the local variability si associated with each prototype wi. The neighborhood values vi,j associated with each pair of prototypes wi and wj.

Page 7: Improving the Quality of Self-Organizing Maps by  Self-Intersection Avoidance

Methodology-learning data structure

• Principle

− Density modes: a measure of the data density surrounding the prototype (local density).

− Local variability: the average distance between a prototype and the data it represents.

− The neighborhood: a prototype's neighborhood measure.
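The three quantities above can be sketched from the distance matrix Dist(w, x) as follows. This is only an illustrative sketch, not the authors' algorithm (the slides show that as a figure). It assumes a Gaussian kernel with a hypothetical bandwidth `sigma` for the density, takes the data "represented" by a prototype to be the data for which it is the closest prototype, and counts the neighborhood value of a pair as the number of data whose two closest prototypes are that pair. The function name `enrich_prototypes` is made up for this example.

```python
import numpy as np

def enrich_prototypes(dist, sigma=1.0):
    """dist: (M, N) distances between M >= 2 prototypes and N data points."""
    m, n = dist.shape
    # Density (assumed Gaussian kernel estimate around each prototype).
    kernel = np.exp(-dist**2 / (2.0 * sigma**2))
    density = kernel.sum(axis=1) / (n * sigma * np.sqrt(2.0 * np.pi))
    # Closest and second-closest prototype of every data point.
    order = np.argsort(dist, axis=0)
    first, second = order[0], order[1]
    # Local variability: mean distance from a prototype to the data it represents.
    variability = np.zeros(m)
    for i in range(m):
        mask = first == i
        if mask.any():
            variability[i] = dist[i, mask].mean()
    # Neighborhood (assumed): count of data whose two closest prototypes are i and j.
    counts = np.zeros((m, m), dtype=int)
    np.add.at(counts, (first, second), 1)
    neighborhood = counts + counts.T  # make the measure symmetric
    return density, variability, neighborhood

# Toy usage: three 1-D prototypes and four 1-D data points.
w = np.array([0.0, 1.0, 10.0])
x = np.array([0.1, 0.9, 0.4, 10.2])
D, s, v = enrich_prototypes(np.abs(w[:, None] - x[None, :]))
```

A prototype that represents no data keeps a local variability of 0 in this sketch; a real implementation would need to decide how to treat such empty prototypes.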

Page 8: Improving the Quality of Self-Organizing Maps by  Self-Intersection Avoidance

Methodology-learning data structure

• Algorithm

Page 9: Improving the Quality of Self-Organizing Maps by  Self-Intersection Avoidance

Methodology-learning data structure

• Clustering of prototypes

Input: Density values Di. Neighborhood values vi,j.

Output: The clusters of prototypes.
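One way to turn these inputs into clusters is sketched below. It is not the algorithm shown on the slides (which appears only as a figure), and the name `cluster_prototypes` is hypothetical. It assumes that two prototypes are connected when their neighborhood value is positive, and that each prototype joins the cluster of the local density maximum reached by repeatedly moving to its densest connected neighbor.

```python
import numpy as np

def cluster_prototypes(density, neighborhood):
    """Label every prototype with the index of the density mode it climbs to."""
    density = np.asarray(density, dtype=float)
    neighborhood = np.asarray(neighborhood)
    m = len(density)
    parent = np.arange(m)
    for i in range(m):
        neighbors = np.flatnonzero(neighborhood[i] > 0)
        # Listing i first means ties keep i as its own mode, so every move
        # along `parent` strictly increases density and the ascent terminates.
        candidates = np.concatenate(([i], neighbors))
        parent[i] = candidates[np.argmax(density[candidates])]
    labels = np.empty(m, dtype=int)
    for i in range(m):
        j = i
        while parent[j] != j:  # climb until a local density maximum
            j = parent[j]
        labels[i] = j
    return labels
```

With this scheme the number of clusters is not supplied by the user: it equals the number of local density maxima found. Merging modes separated by a shallow density valley would be a further step that this sketch leaves out.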

Page 10: Improving the Quality of Self-Organizing Maps by  Self-Intersection Avoidance

Methodology-learning data structure

• Algorithm

Page 11: Improving the Quality of Self-Organizing Maps by  Self-Intersection Avoidance

Methodology-learning data structure

• Algorithm

Page 12: Improving the Quality of Self-Organizing Maps by  Self-Intersection Avoidance

Methodology-learning data structure

Page 13: Improving the Quality of Self-Organizing Maps by  Self-Intersection Avoidance

Methodology-learning data structure

• Presents some interesting qualities

− The number of clusters is automatically detected by the algorithm.
− Non-linearly separable clusters and non-hyperspherical clusters can be detected.
− The algorithm can deal with noise (i.e. touching clusters) by using density estimation.

Page 14: Improving the Quality of Self-Organizing Maps by  Self-Intersection Avoidance

Methodology-learning data structure

• Modeling data distributions
• Density function
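The density function itself is not legible in this transcript. As a hedged illustration only, one common way to model a data distribution from prototypes is a weighted mixture of spherical Gaussian kernels centered on the prototypes. The sketch below assumes that form; the names `mixture_density`, `bandwidths` and `weights` are hypothetical, and how the weights would be fitted is left open.

```python
import numpy as np

def mixture_density(x, prototypes, bandwidths, weights):
    """Evaluate sum_i weights[i] * N(x; prototypes[i], bandwidths[i]**2 * I).

    x: (n, d) points, prototypes: (M, d), bandwidths and weights: (M,).
    """
    x = np.atleast_2d(x)
    d = prototypes.shape[1]
    # Squared distance from every point to every prototype, shape (n, M).
    sq = ((x[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    norm = (np.sqrt(2.0 * np.pi) * bandwidths) ** d
    kernels = np.exp(-sq / (2.0 * bandwidths**2)) / norm
    return kernels @ weights
```

Under these assumptions, a natural choice would be to use each prototype's local variability as its bandwidth, so that prototypes covering spread-out data contribute wider kernels.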

Page 15: Improving the Quality of Self-Organizing Maps by  Self-Intersection Avoidance

Methodology-learning data structure

• Algorithm

Page 16: Improving the Quality of Self-Organizing Maps by  Self-Intersection Avoidance

Methodology-A new two-level coclustering algorithm

Page 17: Improving the Quality of Self-Organizing Maps by  Self-Intersection Avoidance

Methodology-A new two-level coclustering algorithm

Page 18: Improving the Quality of Self-Organizing Maps by  Self-Intersection Avoidance

Experiments

Page 19: Improving the Quality of Self-Organizing Maps by  Self-Intersection Avoidance

Experiments

Page 20: Improving the Quality of Self-Organizing Maps by  Self-Intersection Avoidance

Experiments

Page 21: Improving the Quality of Self-Organizing Maps by  Self-Intersection Avoidance

Conclusions

• Propose a new data structure modeling method, based on the learning of prototypes.

• Propose a new coclustering algorithm to solve different kinds of problems. The results are easy to read and understand, and are compatible with biologists' knowledge.

• Propose a visualization method that enhances the data structure within and between groups.

Page 22: Improving the Quality of Self-Organizing Maps by  Self-Intersection Avoidance

Comments

• Advantages
- Resolve some clustering problems
- Obtained results are easy to read and understand
- Enhance the data structure

• Applications
- Analyze and visualize biological experimental