Lecture 10 Artificial Neural Networks Dr. Jianjun Hu mleg.cse.sc.edu/edu/csce833 CSCE833 Machine Learning University of South Carolina Department of Computer Science and Engineering


Page 1

Lecture 10: Artificial Neural Networks

Dr. Jianjun Hu
mleg.cse.sc.edu/edu/csce833

CSCE833 Machine Learning

University of South Carolina
Department of Computer Science and Engineering

Page 2

Outline

Midterm moved to March 15th

Neural Network Learning

Self-Organizing Maps: Origins, Algorithm, Example

Lecture Notes for E. Alpaydın (2004), Introduction to Machine Learning, © The MIT Press (V1.1)

Page 3

Neuron: no division, only 1 axon


Page 4

Neural Networks

Networks of processing units (neurons) with connections (synapses) between them

Large number of neurons: ~10^10

Large connectivity: ~10^5 synapses per neuron

Parallel processing
Distributed computation/memory
Robust to noise and component failures

Page 5

Understanding the Brain

Levels of analysis (Marr, 1982):
1. Computational theory
2. Representation and algorithm
3. Hardware implementation

Reverse engineering: from hardware to theory
Parallel processing: SIMD vs MIMD
Neural net: SIMD with modifiable local memory
Learning: update by training/experience

Page 6

Perceptron

(Rosenblatt, 1962)

y = \sum_{j=1}^{d} w_j x_j + w_0 = \mathbf{w}^T \mathbf{x}

where \mathbf{w} = [w_0, w_1, \ldots, w_d]^T, \mathbf{x} = [1, x_1, \ldots, x_d]^T, and the fixed input x_0 = +1 supplies the bias weight w_0.

Page 7

What a Perceptron Does

Regression: y = wx + w_0
Classification: y = 1(wx + w_0 > 0)

With a sigmoid output, the perceptron instead computes a smooth value that can be read as a posterior probability:

y = \mathrm{sigmoid}(\mathbf{w}^T \mathbf{x}) = \frac{1}{1 + \exp(-\mathbf{w}^T \mathbf{x})}

(Figures: the linear unit with bias input x_0 = +1, and the same unit followed by the sigmoid nonlinearity s.)
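In NumPy, the three output types can be sketched as follows (a minimal illustration with hypothetical function names, not code from the lecture):

```python
import numpy as np

def linear_out(w, w0, x):
    """Regression output: y = w.x + w0."""
    return float(np.dot(w, x) + w0)

def threshold_out(w, w0, x):
    """Classification output: y = 1(w.x + w0 > 0)."""
    return 1 if linear_out(w, w0, x) > 0 else 0

def sigmoid_out(w, w0, x):
    """Sigmoid output: y = 1 / (1 + exp(-(w.x + w0))), a posterior estimate."""
    return 1.0 / (1.0 + np.exp(-linear_out(w, w0, x)))
```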

Page 8

K Outputs

Regression:

y_i = \sum_{j=1}^{d} w_{ij} x_j + w_{i0} = \mathbf{w}_i^T \mathbf{x}, \qquad \mathbf{y} = \mathbf{W}\mathbf{x}

Classification:

o_i = \mathbf{w}_i^T \mathbf{x}, \qquad y_i = \frac{\exp o_i}{\sum_k \exp o_k}

choose C_i if y_i = \max_k y_k

Page 9

Training

Online (instances seen one by one) vs batch (whole sample) learning:
No need to store the whole sample
Problem may change in time
Wear and degradation in system components

Stochastic gradient-descent: Update after a single pattern

Generic update rule (LMS rule):

Update = LearningFactor · (DesiredOutput − ActualOutput) · Input

\Delta w_{ij}^t = \eta\,(r_i^t - y_i^t)\, x_j^t
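The rule can be written as a one-line function; a minimal sketch (the function name and default η are my choices):

```python
import numpy as np

def lms_update(w, x, r, y, eta=0.1):
    """Generic (LMS) update: Delta w_j = eta * (r - y) * x_j,
    i.e. LearningFactor * (DesiredOutput - ActualOutput) * Input."""
    return w + eta * (r - y) * x
```

In the stochastic setting this is applied after each single pattern rather than once per pass over the whole sample.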

Page 10

Training a Perceptron: Regression

Regression (linear output): y^t = \mathbf{w}^T \mathbf{x}^t

E^t(\mathbf{w} \mid \mathbf{x}^t, r^t) = \frac{1}{2}\,(r^t - y^t)^2 = \frac{1}{2}\left(r^t - \mathbf{w}^T \mathbf{x}^t\right)^2

\Delta w_j^t = \eta\,(r^t - y^t)\, x_j^t
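An online training loop for the linear case might look like this; the toy data and hyperparameters (target line r = 2x + 1, η = 0.1, 50 epochs) are assumptions for the demo, not values from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data generated from a known line r = 2x + 1 (my choice for the demo).
X = rng.uniform(-1.0, 1.0, size=200)
R = 2.0 * X + 1.0

w, w0, eta = 0.0, 0.0, 0.1
for epoch in range(50):
    for x, r in zip(X, R):
        y = w * x + w0           # linear output
        w += eta * (r - y) * x   # Delta w  = eta (r - y) x
        w0 += eta * (r - y)      # Delta w0 = eta (r - y), since x0 = +1
```

On this noiseless data the stochastic updates recover the generating line.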

Page 11

Classification

Single sigmoid output:

y^t = \mathrm{sigmoid}(\mathbf{w}^T \mathbf{x}^t)

E^t(\mathbf{w} \mid \mathbf{x}^t, r^t) = -r^t \log y^t - (1 - r^t)\log(1 - y^t)

\Delta w_j^t = \eta\,(r^t - y^t)\, x_j^t

K > 2 softmax outputs:

y_i^t = \frac{\exp \mathbf{w}_i^T \mathbf{x}^t}{\sum_k \exp \mathbf{w}_k^T \mathbf{x}^t}

E^t(\{\mathbf{w}_i\} \mid \mathbf{x}^t, \mathbf{r}^t) = -\sum_i r_i^t \log y_i^t

\Delta w_{ij}^t = \eta\,(r_i^t - y_i^t)\, x_j^t
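A softmax layer with this update can be sketched as follows (the names and the default η are mine; the max-shift is a standard numerical-stability trick, not from the slides):

```python
import numpy as np

def softmax_forward(W, x):
    """y_i = exp(o_i) / sum_k exp(o_k), with o = W x (bias folded in via x0 = 1)."""
    o = W @ x
    e = np.exp(o - o.max())   # shift by the max for numerical stability
    return e / e.sum()

def softmax_update(W, x, r, eta=0.5):
    """Cross-entropy gradient step: Delta w_ij = eta (r_i - y_i) x_j."""
    y = softmax_forward(W, x)
    return W + eta * np.outer(r - y, x)
```

One update on a pattern of class C_1 raises that class's estimated probability.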

Page 12

Learning Boolean AND

Page 13

XOR

No w0, w1, w2 satisfy:

(Minsky and Papert, 1969)

For the four XOR patterns we would need:

w_0 \le 0 \quad (x = (0,0),\ y = 0)
w_2 + w_0 > 0 \quad (x = (0,1),\ y = 1)
w_1 + w_0 > 0 \quad (x = (1,0),\ y = 1)
w_1 + w_2 + w_0 \le 0 \quad (x = (1,1),\ y = 0)

Adding the middle two inequalities gives w_1 + w_2 + 2w_0 > 0, which together with w_0 \le 0 contradicts the last one.

Page 14

Multilayer Perceptrons

(Rumelhart et al., 1986)

z_h = \mathrm{sigmoid}(\mathbf{w}_h^T \mathbf{x}) = \frac{1}{1 + \exp\left[-\left(\sum_{j=1}^{d} w_{hj} x_j + w_{h0}\right)\right]}, \quad h = 1, \ldots, H

y_i = \mathbf{v}_i^T \mathbf{z} = \sum_{h=1}^{H} v_{ih} z_h + v_{i0}
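The forward computation can be sketched in NumPy (a minimal illustration; handling the biases via a leading +1 entry is an implementation choice, not prescribed by the slides):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def mlp_forward(W, v, x):
    """z_h = sigmoid(w_h^T x), y = v^T z; biases via a leading +1 entry.
    W is H x (d+1); v has H+1 entries with the bias v0 first."""
    z = sigmoid(W @ np.concatenate(([1.0], x)))
    y = float(v @ np.concatenate(([1.0], z)))
    return y, z
```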

Page 15

x1 XOR x2 = (x1 AND ~x2) OR (~x1 AND x2)
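The decomposition above can be checked with hand-set threshold units; the particular weight values below are one valid choice, not taken from the slides:

```python
def step(a):
    """Threshold unit: 1 if a > 0 else 0."""
    return 1 if a > 0 else 0

def xor_mlp(x1, x2):
    """Two hidden units compute x1 AND NOT x2 and NOT x1 AND x2;
    the output unit ORs them, giving x1 XOR x2."""
    z1 = step(x1 - x2 - 0.5)    # x1 AND NOT x2
    z2 = step(x2 - x1 - 0.5)    # NOT x1 AND x2
    return step(z1 + z2 - 0.5)  # z1 OR z2
```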

Page 16

Backpropagation

To update the first-layer weights w_{hj}, apply the chain rule through the hidden units:

\frac{\partial E}{\partial w_{hj}} = \frac{\partial E}{\partial y_i}\,\frac{\partial y_i}{\partial z_h}\,\frac{\partial z_h}{\partial w_{hj}}

with the forward equations as before:

z_h = \mathrm{sigmoid}(\mathbf{w}_h^T \mathbf{x}) = \frac{1}{1 + \exp\left[-\left(\sum_{j=1}^{d} w_{hj} x_j + w_{h0}\right)\right]}, \quad h = 1, \ldots, H

y_i = \mathbf{v}_i^T \mathbf{z} = \sum_{h=1}^{H} v_{ih} z_h + v_{i0}

Page 17

Regression

Forward:

z_h = \mathrm{sigmoid}(\mathbf{w}_h^T \mathbf{x})

y^t = \sum_{h=1}^{H} v_h z_h^t + v_0

Error:

E(\mathbf{W}, \mathbf{v} \mid \mathcal{X}) = \frac{1}{2} \sum_t (r^t - y^t)^2

Backward:

\Delta v_h = \eta \sum_t (r^t - y^t)\, z_h^t

\Delta w_{hj} = \eta \sum_t (r^t - y^t)\, v_h\, z_h^t (1 - z_h^t)\, x_j^t
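A single stochastic backprop step for this regression network can be sketched as follows (a minimal illustration; η = 0.1 and the bias handling via a leading +1 entry are my choices):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def forward(W, v, x):
    """Forward pass: z_h = sigmoid(w_h^T x), y = v^T z (leading +1 for biases)."""
    z = sigmoid(W @ np.concatenate(([1.0], x)))
    return float(v @ np.concatenate(([1.0], z))), z

def backprop_step(W, v, x, r, eta=0.1):
    """One stochastic update:
    Delta v_h  = eta (r - y) z_h
    Delta w_hj = eta (r - y) v_h z_h (1 - z_h) x_j"""
    y, z = forward(W, v, x)
    delta = r - y
    v_new = v + eta * delta * np.concatenate(([1.0], z))
    W_new = W + eta * delta * np.outer(v[1:] * z * (1 - z),
                                       np.concatenate(([1.0], x)))
    return W_new, v_new
```

One step on a single pattern moves the output towards the target, reducing the squared error for that pattern.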

Page 18

Regression with Multiple Outputs

(Figure: network with inputs x_j, hidden units z_h, weights w_{hj} and v_{ih}, and outputs y_i.)

E(\mathbf{W}, \mathbf{V} \mid \mathcal{X}) = \frac{1}{2} \sum_t \sum_i (r_i^t - y_i^t)^2

y_i^t = \sum_{h=1}^{H} v_{ih} z_h^t + v_{i0}

\Delta v_{ih} = \eta \sum_t (r_i^t - y_i^t)\, z_h^t

\Delta w_{hj} = \eta \sum_t \left[\sum_i (r_i^t - y_i^t)\, v_{ih}\right] z_h^t (1 - z_h^t)\, x_j^t

Page 19

Page 20

Page 21

(Figure: hidden-unit net inputs w_h x + w_0, activations z_h, and weighted outputs v_h z_h.)

Page 22

Two-Class Discrimination

One sigmoid output y^t estimates P(C_1 \mid \mathbf{x}^t), with P(C_2 \mid \mathbf{x}^t) \equiv 1 - y^t:

y^t = \mathrm{sigmoid}\left(\sum_{h=1}^{H} v_h z_h^t + v_0\right)

E(\mathbf{W}, \mathbf{v} \mid \mathcal{X}) = -\sum_t \left[ r^t \log y^t + (1 - r^t) \log (1 - y^t) \right]

\Delta v_h = \eta \sum_t (r^t - y^t)\, z_h^t

\Delta w_{hj} = \eta \sum_t (r^t - y^t)\, v_h\, z_h^t (1 - z_h^t)\, x_j^t

Page 23

K>2 Classes

o_i^t = \sum_{h=1}^{H} v_{ih} z_h^t + v_{i0}

y_i^t = \frac{\exp o_i^t}{\sum_k \exp o_k^t} = \hat{P}(C_i \mid \mathbf{x}^t)

E(\mathbf{W}, \mathbf{V} \mid \mathcal{X}) = -\sum_t \sum_i r_i^t \log y_i^t

\Delta v_{ih} = \eta \sum_t (r_i^t - y_i^t)\, z_h^t

\Delta w_{hj} = \eta \sum_t \left[\sum_i (r_i^t - y_i^t)\, v_{ih}\right] z_h^t (1 - z_h^t)\, x_j^t

Page 24

Multiple Hidden Layers

MLP with one hidden layer is a universal approximator (Hornik et al., 1989), but using multiple layers may lead to simpler networks

z_{1h} = \mathrm{sigmoid}(\mathbf{w}_{1h}^T \mathbf{x}) = \mathrm{sigmoid}\left(\sum_{j=1}^{d} w_{1hj} x_j + w_{1h0}\right), \quad h = 1, \ldots, H_1

z_{2l} = \mathrm{sigmoid}(\mathbf{w}_{2l}^T \mathbf{z}_1) = \mathrm{sigmoid}\left(\sum_{h=1}^{H_1} w_{2lh} z_{1h} + w_{2l0}\right), \quad l = 1, \ldots, H_2

y = \mathbf{v}^T \mathbf{z}_2 = \sum_{l=1}^{H_2} v_l z_{2l} + v_0

Page 25

Improving Convergence

Momentum

Adaptive learning rate

Momentum:

\Delta w_i^t = -\eta \frac{\partial E^t}{\partial w_i} + \alpha\, \Delta w_i^{t-1}

Adaptive learning rate:

\Delta \eta = \begin{cases} +a & \text{if } E^{t+\tau} < E^t \\ -b\,\eta & \text{otherwise} \end{cases}
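A momentum step is a one-liner; a sketch with assumed defaults η = 0.1, α = 0.9:

```python
def momentum_step(w, grad, prev_dw, eta=0.1, alpha=0.9):
    """Delta w^t = -eta * dE/dw + alpha * Delta w^(t-1)."""
    dw = -eta * grad + alpha * prev_dw
    return w + dw, dw
```

On the toy quadratic E = w^2/2 (so dE/dw = w), repeated momentum steps keep making progress even where the gradient is small, and still converge to the minimum.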

Page 26

Overfitting/Overtraining

Number of weights: H(d + 1) + (H + 1)K

Page 27

Page 28

Tuning the Network Size

Destructive: weight decay

\Delta w_i = -\eta \frac{\partial E}{\partial w_i} - \lambda w_i

E' = E + \frac{\lambda}{2} \sum_i w_i^2

Constructive: growing networks (Ash, 1989) (Fahlman and Lebiere, 1989)
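Weight decay as an update rule; a sketch with assumed defaults for η and λ:

```python
def weight_decay_update(w, grad, eta=0.1, lam=0.01):
    """Delta w_i = -eta * dE/dw_i - lambda * w_i: weights not reinforced
    by the error gradient are pulled towards zero."""
    return w - eta * grad - lam * w
```

With a zero error gradient the weight shrinks geometrically, which is the "destructive" pruning effect.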

Page 29

Dimensionality Reduction

Page 30

Page 31

Learning Time: sequential learning

Applications:
Sequence recognition: speech recognition
Sequence reproduction: time-series prediction
Sequence association

Network architectures:
Time-delay networks (Waibel et al., 1989)
Recurrent networks (Rumelhart et al., 1986)

Page 32

Time-Delay Neural Networks

Page 33

Recurrent Networks

Page 34

Unfolding in Time

Page 35

Self-Organizing Maps: Origins

Ideas first introduced by C. von der Malsburg (1973), developed and refined by T. Kohonen (1982)

Neural network algorithm using unsupervised competitive learning

Primarily used for organization and visualization of complex data

Biological basis: ‘brain maps’

Teuvo Kohonen

Page 36

Self-Organizing Maps: SOM Architecture

A lattice of neurons ('nodes') accepts and responds to a set of input signals
Responses are compared; a 'winning' neuron is selected from the lattice
The selected neuron is activated together with its 'neighbourhood' neurons
An adaptive process changes the weights to more closely resemble the inputs

(Figure: a 2-d array of neurons j, each connected to the full set of input signals x_1, x_2, x_3, ..., x_n through weighted synapses w_j1, w_j2, w_j3, ..., w_jn.)

Page 37

Self-Organizing Maps: SOM Result Example

‘Poverty map’ based on 39 indicators from World Bank statistics (1992)

Classifying World Poverty Helsinki University of Technology

Page 38


Page 39

Self-Organizing Maps: SOM Algorithm Overview

1. Randomly initialise all weights
2. Select input vector x = [x1, x2, x3, ..., xn]
3. Compare x with the weights wj of each neuron j to determine the winner
4. Update the winner so that it becomes more like x, together with the winner's neighbours
5. Adjust parameters: learning rate & 'neighbourhood function'
6. Repeat from (2) until the map has converged (i.e. no noticeable changes in the weights) or a pre-defined number of training cycles has passed
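The six steps above can be sketched as a minimal SOM trainer; the grid size, decay schedules, and Gaussian neighbourhood below are my assumptions, not values prescribed by the slides:

```python
import numpy as np

rng = np.random.default_rng(1)

def train_som(data, grid=(4, 4), n_cycles=500, eta0=0.5, sigma0=1.5):
    """Minimal SOM sketch following steps 1-6 (assumed hyperparameters)."""
    rows, cols = grid
    # 1. randomly initialise all weights
    W = rng.random((rows, cols, data.shape[1]))
    coords = np.dstack(np.meshgrid(np.arange(rows), np.arange(cols),
                                   indexing="ij"))
    for t in range(n_cycles):
        # 5. decay learning rate and neighbourhood width each cycle
        eta = eta0 * (1 - t / n_cycles)
        sigma = sigma0 * (1 - t / n_cycles) + 0.5
        # 2. select an input vector
        x = data[rng.integers(len(data))]
        # 3. winner = node whose weights have smallest Euclidean distance to x
        d = np.linalg.norm(W - x, axis=2)
        win = np.unravel_index(d.argmin(), d.shape)
        # 4. move the winner and its neighbours towards x
        grid_d2 = ((coords - np.array(win)) ** 2).sum(axis=2)
        h = np.exp(-grid_d2 / (2 * sigma ** 2))   # neighbourhood function
        W += eta * h[..., None] * (x - W)
    return W
```

Because every update is a convex combination of the old weights and an input, the weights stay inside the data range.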

Page 40

Initialisation

(i) Randomly initialise the weight vectors wj for all nodes j

Page 41

(ii) Choose an input vector x from the training set

On a computer, a text can be represented as a frequency distribution over words.

A Text Example:

Self-organizing maps (SOMs) are a data visualization technique invented by Professor Teuvo Kohonen which reduces the dimensions of data through the use of self-organizing neural networks. The problem that data visualization attempts to solve is that humans simply cannot visualize high-dimensional data on their own, so techniques are created to help us understand this high-dimensional data.

Input vector (word frequencies for this region of text):

Self-organizing   2
maps              1
data              4
visualization     2
technique         2
Professor         1
invented          1
Teuvo Kohonen     1
dimensions        1
...
Zebra             0

Page 42

Finding a Winner

(iii) Find the best-matching neuron c(x), usually the neuron whose weight vector has the smallest Euclidean distance from the input vector x

The winning node is the one that is, in some sense, 'closest' to the input vector. 'Euclidean distance' is the straight-line distance between the data points, if they were plotted on a (multi-dimensional) graph.

The Euclidean distance between two vectors a = (a_1, a_2, \ldots, a_n) and b = (b_1, b_2, \ldots, b_n) is calculated as:

d(\mathbf{a}, \mathbf{b}) = \sqrt{\sum_i (a_i - b_i)^2}
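As a sanity check, the distance is a one-liner in NumPy:

```python
import numpy as np

def euclidean(a, b):
    """d(a, b) = sqrt(sum_i (a_i - b_i)^2)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sqrt(((a - b) ** 2).sum()))
```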

Page 43

Weight Update

SOM weight update equation:

w_j(t+1) = w_j(t) + \eta(t)\, h_{c(x)}(j, t)\, [\mathbf{x} - w_j(t)]

"The weights of every node are updated at each cycle by adding to the current weights: current learning rate × degree of neighbourhood with respect to the winner × difference between the current weights and the input vector."

(Figures: an example of the learning rate \eta(t) decaying over the number of cycles, and of the neighbourhood function h_{c(x)}(j, t); the x-axis shows distance from the winning node, the y-axis the 'degree of neighbourhood', with a maximum of 1.)
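The update equation, with one possible (assumed) learning-rate schedule and a Gaussian neighbourhood; the function names and defaults are mine:

```python
import numpy as np

def eta_t(t, n_cycles, eta0=0.5):
    """One possible linearly decaying learning rate (assumed schedule)."""
    return eta0 * (1.0 - t / n_cycles)

def h_neigh(dist, sigma=1.0):
    """Gaussian degree of neighbourhood: 1 at the winner, falling with distance."""
    return float(np.exp(-dist ** 2 / (2.0 * sigma ** 2)))

def som_update(w_j, x, t, n_cycles, dist_to_winner):
    """w_j(t+1) = w_j(t) + eta(t) * h(j, t) * (x - w_j(t))."""
    return w_j + eta_t(t, n_cycles) * h_neigh(dist_to_winner) * (x - w_j)
```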

Page 44

Example: Self-Organizing Maps

The animals are to be ordered by a neural network. Each animal is described by its attributes (size, living space), e.g. Mouse = (0/0).

Size: small = 0, medium = 1, big = 2; Living space: Land = 0, Water = 1, Air = 2

Animal   Size     Living space   Encoding
Mouse    small    Land           (0/0)
Lion     medium   Land           (1/0)
Horse    big      Land           (2/0)
Shark    big      Water          (2/1)
Dove     small    Air            (0/2)

Page 45

Example: Self-Organizing Maps

This training is repeated many times. In the best case, animals with the most similar attributes end up close together on the map.

(Figure: trained map with node weight vectors such as (0.75/0.6875), (0.1875/1.25), (1.375/0.5), (1/0.875); Mouse at (0.75/0), Lion at (1/0.75), Horse at (1.5/0), Shark at (1.625/1), Dove at (1.125/1.625); the land animals form one region.)

Page 46

Example: Self-Organizing Maps

[Teuvo Kohonen 2001] Self-Organizing Maps; Springer;

A grouping according to similarity has emerged

Animal names and their attributes

(Map regions: birds, peaceful species, hunters; attributes grouped as 'is', 'has', 'likes to'.)

          Dove Hen Duck Goose Owl Hawk Eagle Fox Dog Wolf Cat Tiger Lion Horse Zebra Cow
Small      1    1    1    1    1   1    0    0   0   0    1   0     0    0     0    0
Medium     0    0    0    0    0   0    1    1   1   1    0   0     0    0     0    0
Big        0    0    0    0    0   0    0    0   0   0    0   1     1    1     1    1
2 legs     1    1    1    1    1   1    1    0   0   0    0   0     0    0     0    0
4 legs     0    0    0    0    0   0    0    1   1   1    1   1     1    1     1    1
Hair       0    0    0    0    0   0    0    1   1   1    1   1     1    1     1    1
Hooves     0    0    0    0    0   0    0    0   0   0    0   0     0    1     1    1
Mane       0    0    0    0    0   0    0    0   0   1    0   0     1    1     1    0
Feathers   1    1    1    1    1   1    1    0   0   0    0   0     0    0     0    0
Hunt       0    0    0    0    1   1    1    1   0   1    1   1     1    0     0    0
Run        0    0    0    0    0   0    0    0   1   1    0   1     1    1     1    0
Fly        1    0    0    1    1   1    1    0   0   0    0   0     0    0     0    0
Swim       0    0    1    1    0   0    0    0   0   0    0   0     0    0     0    0

Page 47

Conclusion

Advantages
The SOM is an algorithm that projects high-dimensional data onto a two-dimensional map.
The projection preserves the topology of the data, so that similar data items are mapped to nearby locations on the map.
SOMs have many practical applications in pattern recognition, speech analysis, industrial and medical diagnostics, and data mining.

Disadvantages
A large quantity of good-quality, representative training data is required.
There is no generally accepted measure of the 'quality' of a SOM, e.g. average quantization error (how well the data is classified).

Page 48

Discussion topics

What is the main purpose of the SOM?
Do you know any example systems that use the SOM algorithm?

Page 49

References

[Witten and Frank (1999)] Witten, I.H. and Frank, E. Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, San Francisco, CA, USA, 1999.
[Kohonen (1982)] Kohonen, T. Self-organized formation of topologically correct feature maps. Biological Cybernetics, 43:59-62, 1982.
[Kohonen (1995)] Kohonen, T. Self-Organizing Maps. Springer, Berlin, Germany, 1995.
[Vesanto (1999)] Vesanto, J. SOM-Based Data Visualization Methods. Intelligent Data Analysis, 3:111-126, 1999.
[Kohonen et al. (1996)] Kohonen, T., Hynninen, J., Kangas, J., and Laaksonen, J. SOM_PAK: The Self-Organizing Map Program Package. Report A31, Helsinki University of Technology, Laboratory of Computer and Information Science, Jan. 1996.
[Vesanto et al. (1999)] Vesanto, J., Himberg, J., Alhoniemi, E., and Parhankangas, J. Self-Organizing Map in Matlab: the SOM Toolbox. In Proceedings of the Matlab DSP Conference 1999, Espoo, Finland, pp. 35-40, 1999.
[Wong and Bergeron (1997)] Wong, P.C. and Bergeron, R.D. 30 Years of Multidimensional Multivariate Visualization. In Nielson, G.M., Hagen, H., and Müller, H., editors, Scientific Visualization: Overviews, Methodologies and Techniques, pages 3-33, Los Alamitos, CA, 1997. IEEE Computer Society Press.
[Honkela (1997)] Honkela, T. Self-Organizing Maps in Natural Language Processing. PhD thesis, Helsinki University of Technology, Espoo, Finland, 1997.
[SVG wiki] http://en.wikipedia.org/wiki/Scalable_Vector_Graphics
[Jost Schatzmann (2003)] Schatzmann, J. Using Self-Organizing Maps to Visualise Clusters and Trends in Multidimensional Datasets. Final Year Individual Project Report, Imperial College London, 19 June 2003.