Modeling and Visualization of High Dimensional Data · syllabus.cs.manchester.ac.uk/pgt/COMP61021/lectures/SOM.pdf

Self-Organizing Maps (SOM)

COMP61021 Modelling and Visualization of High Dimensional Data

Additional reading can be found in the non-assessed exercises (week 9) on this course unit's teaching page.

Textbook: Ch. 9 in [3]


Outline

• Introduction
• Kohonen SOM
• Learning Algorithm
• Visualization Method
• Examples
• Relevant Issues
• Conclusions


Introduction

• Self-organizing maps (SOM)

– SOM is a biologically inspired unsupervised neural network that approximates an unlimited number of input data by a finite set of nodes arranged in a low-dimensional grid, where neighboring nodes correspond to more similar input data.

– The model is produced by a learning algorithm that automatically orders the inputs on a one- or two-dimensional grid according to their mutual similarity.

– Useful for clustering analysis and data visualization

[Figure panels: input space, initial weights, final weights]


Kohonen SOM

Competition

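$\mathbf{w} = (w_1, w_2)^T$

$\mathbf{x} = (x_1, x_2)^T$

$d_E(\mathbf{x}, \mathbf{w}) = \|\mathbf{x} - \mathbf{w}\| = \sqrt{(\mathbf{x} - \mathbf{w})^T (\mathbf{x} - \mathbf{w})}$

e.g., $N = 2$

hard-wired connection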
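
The competition step can be sketched as follows: every node compares its weight vector with the input using the Euclidean distance $d_E$, and the node with the smallest distance wins. This is a minimal illustration; the function names and the example weight values are assumptions made for the sketch, not part of the slides.

```python
import math

def euclidean(x, w):
    """Euclidean distance d_E(x, w) = ||x - w||."""
    return math.sqrt(sum((xi - wi) ** 2 for xi, wi in zip(x, w)))

def best_matching_unit(x, weights):
    """Index of the node whose weight vector is closest to the input x."""
    return min(range(len(weights)), key=lambda j: euclidean(x, weights[j]))

# Illustrative (made-up) weights for four nodes with N = 2 inputs.
weights = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
x = (0.9, 0.2)
winner = best_matching_unit(x, weights)  # node 1, whose weights are (1.0, 0.0)
```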


Kohonen SOM

Cooperation

"radius": $d_{ik} = 2$
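
To make the cooperation step concrete, the sketch below lists the nodes of a rectangular grid that lie within a given lattice radius of the winner, and weights them by distance. Two things are assumptions here: the lattice distance is taken to be Euclidean distance between grid coordinates, and the weighting is the Gaussian neighborhood commonly used with SOMs. The function names are hypothetical.

```python
import math

def lattice_distance(p, q):
    """Distance between two node positions (row, col) on the grid (assumed Euclidean)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def neighbours_within(winner, rows, cols, radius):
    """All grid positions whose lattice distance to the winner is at most `radius`."""
    return [(r, c) for r in range(rows) for c in range(cols)
            if lattice_distance((r, c), winner) <= radius]

def gaussian_neighbourhood(d, sigma):
    """Assumed Gaussian weighting: 1 at the winner, decaying with lattice distance d."""
    return math.exp(-d * d / (2.0 * sigma * sigma))

# Nodes within radius 2 of the centre node of a 5x5 grid.
hood = neighbours_within((2, 2), rows=5, cols=5, radius=2)
```

With a different lattice metric (for example, counting grid steps), the set of cooperating nodes for the same radius would differ.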


Kohonen SOM

Adaptation

(see the algorithm on the following slides for details)
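
As a rough sketch of a single adaptation step, the commonly used Kohonen update moves a node's weight vector toward the input by a fraction that depends on a learning rate and on the node's neighborhood weight. The symbols `eta` (learning rate) and `h` (neighborhood weight of the node relative to the winner) are assumptions introduced for this example.

```python
def adapt(w, x, eta, h):
    """One update w <- w + eta * h * (x - w) for a single node."""
    return [wi + eta * h * (xi - wi) for wi, xi in zip(w, x)]

# The winner (h = 1) moves halfway toward the input when eta = 0.5.
w_new = adapt([0.0, 0.0], [1.0, 2.0], eta=0.5, h=1.0)

# A node far from the winner (h = 0) is left unchanged.
w_far = adapt([1.0, 1.0], [3.0, 3.0], eta=0.5, h=0.0)
```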


Learning Algorithm

Time constants used in the algorithm: $\tau_1$, $\tau_2$
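
The three steps (competition, cooperation, adaptation) can be combined into a training loop along the following lines. This is a sketch under stated assumptions, not the algorithm as given on the slides: the neighborhood width is assumed to decay as $\sigma_0 \exp(-n/\tau_1)$, the learning rate as $\eta_0 \exp(-n/\tau_2)$, the neighborhood is assumed Gaussian, and all default parameter values are made up for illustration.

```python
import math
import random

def train_som(data, rows, cols, n_iter=2000, eta0=0.1, sigma0=2.0,
              tau1=1000.0, tau2=1000.0, seed=0):
    """Train a rows x cols SOM on `data`; returns (grid positions, weight vectors)."""
    rng = random.Random(seed)
    dim = len(data[0])
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [[rng.random() for _ in range(dim)] for _ in grid]
    for n in range(n_iter):
        x = rng.choice(data)
        # Competition: the node with the closest weight vector wins.
        win = min(range(len(grid)),
                  key=lambda j: sum((a - b) ** 2 for a, b in zip(x, weights[j])))
        # Assumed schedules: neighborhood width and learning rate shrink over time.
        sigma = sigma0 * math.exp(-n / tau1)
        eta = eta0 * math.exp(-n / tau2)
        for j, pos in enumerate(grid):
            # Cooperation: Gaussian weight based on squared lattice distance to the winner.
            d2 = (pos[0] - grid[win][0]) ** 2 + (pos[1] - grid[win][1]) ** 2
            h = math.exp(-d2 / (2.0 * sigma * sigma))
            # Adaptation: move the weight vector toward the input.
            weights[j] = [w + eta * h * (xi - w) for w, xi in zip(weights[j], x)]
    return grid, weights

# Usage: a 4x4 map trained on points drawn uniformly from the unit square.
rng = random.Random(1)
data = [(rng.random(), rng.random()) for _ in range(200)]
grid, weights = train_som(data, rows=4, cols=4)
```

Because each update is a convex combination of the old weight and an input, the weights stay inside the range spanned by the initial weights and the data.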


Learning Algorithm


Visualization Method

• In 2-D/3-D space, neurons are visualized as points whose positions in the weight space change as learning takes place. Each neuron is described by its weight vector.

• Two neurons are connected by an edge if they are direct neighbors in the neural network lattice. For 2-D/3-D data, the lattice can be displayed in the original data space by plotting the weight vectors.

• The locations specified by weight vectors of neurons in a grid mimic the distribution of the training data.
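
The lattice drawing described above can be prepared as a list of line segments: one segment per pair of directly neighboring nodes, with endpoints given by the two weight vectors. The sketch assumes a rectangular grid in which direct neighbors are the nodes immediately left/right and above/below; the function name and example weights are hypothetical.

```python
def lattice_edges(rows, cols):
    """Pairs of node indices (row-major order) that are direct neighbors on the grid."""
    edges = []
    for r in range(rows):
        for c in range(cols):
            j = r * cols + c
            if c + 1 < cols:
                edges.append((j, j + 1))      # neighbor to the right
            if r + 1 < rows:
                edges.append((j, j + cols))   # neighbor below
    return edges

# Made-up 2-D weights of a 2x2 map; each edge becomes a segment in data space.
weights = [(0.1, 0.1), (0.9, 0.2), (0.2, 0.8), (0.8, 0.9)]
segments = [(weights[a], weights[b]) for a, b in lattice_edges(2, 2)]
```

Passing each segment to a plotting routine, together with the training points, gives the kind of picture in which the grid can be compared with the data distribution.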


Visualization Method


Visualization Method

• Example: U-Matrix
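
A simplified U-matrix can be computed as shown below: for every node, take the average distance between its weight vector and the weight vectors of its direct lattice neighbors. Large values indicate boundaries between clusters, small values indicate nodes inside a cluster. This is a sketch of one common simplified variant (one value per node, 4-connected neighbors), which is an assumption; the example weights are made up.

```python
import math

def u_matrix(weights, rows, cols):
    """weights[r][c] is the weight vector of node (r, c); returns a rows x cols grid."""
    u = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    diff = zip(weights[r][c], weights[rr][cc])
                    dists.append(math.sqrt(sum((a - b) ** 2 for a, b in diff)))
            # Average distance to the existing neighbors (0 for a single-node map).
            u[r][c] = sum(dists) / len(dists) if dists else 0.0
    return u

# A 1x4 map whose weights form two tight pairs separated by a gap.
weights = [[(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.1, 0.0)]]
u = u_matrix(weights, rows=1, cols=4)
```

In this example the two middle nodes get the largest values because they sit on either side of the gap between the two groups.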


Examples

• Example 1: 1-D self-organizing map


Examples

• Example 2: 2-D self-organizing map


Examples

• Example 3: self-organizing map of synthetic data sets

After SOM learning has converged, we obtain SOMs for different data distributions.

The grid mimics the data distribution!


Examples

• Example 4: Taxonomy of animals

A grouping with SOM according to similarity has emerged. Regions labeled on the map: birds, peaceful, hunters.

Animal names and their attributes

                   Dove  Hen   Duck  Goose Owl   Hawk  Eagle Fox   Dog   Wolf  Cat   Tiger Lion  Horse Zebra Cow
is        Small    1     1     1     1     1     1     0     0     0     0     1     0     0     0     0     0
          Medium   0     0     0     0     0     0     1     1     1     1     0     0     0     0     0     0
          Big      0     0     0     0     0     0     0     0     0     0     0     1     1     1     1     1
has       2 legs   1     1     1     1     1     1     1     0     0     0     0     0     0     0     0     0
          4 legs   0     0     0     0     0     0     0     1     1     1     1     1     1     1     1     1
          Hair     0     0     0     0     0     0     0     1     1     1     1     1     1     1     1     1
          Hooves   0     0     0     0     0     0     0     0     0     0     0     0     0     1     1     1
          Mane     0     0     0     0     0     0     0     0     0     1     0     0     1     1     1     0
          Feathers 1     1     1     1     1     1     1     0     0     0     0     0     0     0     0     0
likes to  Hunt     0     0     0     0     1     1     1     1     0     1     1     1     1     0     0     0
          Run      0     0     0     0     0     0     0     0     1     1     0     1     1     1     1     0
          Fly      1     0     0     1     1     1     1     0     0     0     0     0     0     0     0     0
          Swim     0     0     1     1     0     0     0     0     0     0     0     0     0     0     0     0
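
To see why such a grouping can emerge, it helps to look at the attribute vectors themselves: each animal is a 13-dimensional binary vector (one column of the table), and similar animals differ in only a few attributes. The sketch below encodes the table and counts differing attributes between animals. The helper names are hypothetical, and counting differing attributes (Hamming distance) is used here only for illustration; it is not claimed to be the measure used to produce the map.

```python
ANIMALS = ["Dove", "Hen", "Duck", "Goose", "Owl", "Hawk", "Eagle", "Fox",
           "Dog", "Wolf", "Cat", "Tiger", "Lion", "Horse", "Zebra", "Cow"]

# One string per attribute row of the table; character i belongs to ANIMALS[i].
ATTRIBUTES = {
    "Small":    "1111110000100000",
    "Medium":   "0000001111000000",
    "Big":      "0000000000011111",
    "2 legs":   "1111111000000000",
    "4 legs":   "0000000111111111",
    "Hair":     "0000000111111111",
    "Hooves":   "0000000000000111",
    "Mane":     "0000000001001110",
    "Feathers": "1111111000000000",
    "Hunt":     "0000111101111000",
    "Run":      "0000000011011110",
    "Fly":      "1001111000000000",
    "Swim":     "0011000000000000",
}

def feature_vector(animal):
    """13-dimensional binary attribute vector (one table column) for an animal."""
    col = ANIMALS.index(animal)
    return [int(bits[col]) for bits in ATTRIBUTES.values()]

def hamming(a, b):
    """Number of attributes in which animals a and b differ."""
    return sum(x != y for x, y in zip(feature_vector(a), feature_vector(b)))

def nearest(animal):
    """The other animal with the fewest differing attributes."""
    return min((a for a in ANIMALS if a != animal), key=lambda a: hamming(animal, a))
```

For instance, `hamming("Dove", "Hen")` is 1 (they differ only in "Fly"), while `hamming("Dove", "Cow")` is 8, which is consistent with birds and hoofed animals ending up in different regions of the map.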


Relevant Issues


Relevant Issues

• SOM extension

– PSOM: continuous projection: interpolation between centroid locations via parameterisation

– disSOM: SOM working on distance between objects; more general than distance Nonnegative Matrix Factorization

– Hierarchical SOM: extension from single to multiple layers for multi-scale data analysis

– Generative topographic map (GTM): a probabilistic counterpart of the SOM that is provably convergent and does not require a shrinking neighborhood or a decreasing step size.

– Kernel SOM: overcomes two major limitations of the Kohonen SOM


Conclusions

• Kohonen SOM is a biologically inspired neural network for high-dimensional data clustering and visualisation.
• Its most important property is topology preservation.
• Learning involves two phases: ordering and convergence.
• There is no guarantee that the SOM always converges, hence parameter tuning is needed.
• There are several variants or extensions, which tend to overcome the limitations of the SOM.
• There are a number of successful applications of SOM.
