
Page 1: Michael Biehl Mathematics and Computing Science University of Groningen / NL Tutorial as satellite event of CAIP 2015 Saint Martin’s Institute of Higher

Michael BiehlMathematics and Computing ScienceUniversity of Groningen / NL

Tutorial as satellite event of CAIP 2015 Saint Martin’s Institute of Higher Education Malta, August 31, 2015

Distance based classifiers: Basic concepts, recent developments, and application examples

www.cs.rug.nl/~biehl

Page 2:

St. Martin’s Institute, August 2015

1) Distance based classifiers, Learning Vector Quantization

classification problems

distance based classifiers, from KNN to prototypes

the basic scheme: LVQ1

cost function based training: GLVQ

Application: classification of adrenal tumors (I)

Receiver Operator Characteristics

performance evaluation by (cross-) validation

2) GLVQ implementation

stochastic gradient descent, learning rate schedule

batch gradient descent, step size control

Demo: GLVQ with the no-nonsense GMLVQ toolbox

Overview

Page 3:

3) Alternative distance measures and Relevance Learning

Fixed distance measures: Minkowski measures, kernelized distances, divergences

Application example: detection of Cassava Mosaic Disease

Adaptive distance measures: Matrix Relevance Learning Vector Quantization

Application example: adrenal tumors cont'd

Demos: GMLVQ with the no-nonsense GMLVQ toolbox

Application example: early diagnosis of Rheumatoid Arthritis

Uniqueness, regularization and singularity control

Challenges in bio-medical data analysis

Concluding remarks, references

Overview

Page 4:

1) Distance based classifiers, Learning Vector Quantization

Page 5:

classification problems

- character/digit/speech recognition

- medical diagnoses

- pixel-wise segmentation in image processing

- object recognition/scene analysis

- fault detection in technical systems

- ...

machine learning approach:

extract information from example data

parameterized in a learning system (neural network, LVQ, SVM...)

working phase: application to novel data

here only: supervised learning, classification:

Page 6:

distance based classification

assignment of data (objects, observations,...)

to one or several classes (crisp/soft) (categories, labels)

based on comparison with reference data (samples, prototypes)

in terms of a distance measure (dis-similarity, metric)

representation of data (a key step!)

- collection of qualitative/quantitative descriptors

- vectors of numerical features

- sequences, graphs, functional data

- relational data, e.g. in terms of pairwise (dis-) similarities

Page 7:

K-NN classifier

a simple distance-based classifier

- store a set of labeled examples

- classify a query according to the label of the Nearest Neighbor (or the majority of K NN)

- local decision boundary according to (e.g.) Euclidean distances

- piece-wise linear class borders parameterized by all examples

(figure: feature space with query point)

+ conceptually simple, no training required, one parameter (K)

- expensive storage and computation; sensitivity to "outliers" can result in overly complex decision boundaries
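The K-NN rule above fits in a few lines. A minimal illustrative sketch in Python/numpy (the toolbox shown later in the tutorial is Matlab, so this is not the author's code):

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Classify query x by majority vote among its k nearest
    stored examples, using squared Euclidean distances."""
    d = np.sum((X_train - x) ** 2, axis=1)   # distances to all stored examples
    nearest = np.argsort(d)[:k]              # indices of the k closest
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]         # majority label among the k NN
```

Note that every stored example enters the comparison, which is exactly the storage and computation cost criticized above.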

Page 8:

prototype based classification

a prototype based classifier [Kohonen 1990, 1997]

- represent the data by one or several prototypes per class

- classify a query according to the label of the nearest prototype (or alternative schemes)

- local decision boundaries according to (e.g.) Euclidean distances

- piece-wise linear class borders parameterized by prototypes

(figure: feature space with query point)

+ less sensitive to outliers, lower storage needs, little computational effort in the working phase

- training phase required in order to place prototypes; model selection problem: number of prototypes per class, etc.

Page 9:

What about the curse of dimensionality ?

concentration of norms/distances for large N

"distance based methods are bound to fail in high dimensions"?

LVQ:

- prototypes are not just random data points

- carefully selected representatives of the data

- distances of a given data point to prototypes are compared

projection to a non-trivial low-dimensional subspace!

see also: models of LVQ training, analytical treatment in the limit of large N; successful training needs enough training examples [Ghosh et al., 2007; Witoelar et al., 2010]

Page 10:

Nearest Prototype Classifier (NPC)

set of prototypes carrying class-labels,

based on a dissimilarity / distance measure

nearest prototype classification:

- given an input, determine the winner, i.e. the closest prototype

- assign the input to the class of the winner

minimal requirements on the distance measure:

standard example: squared Euclidean
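The NPC rule can be sketched directly; a minimal Python/numpy illustration (the `dist` hook is a hypothetical helper standing in for the general dissimilarity measure):

```python
import numpy as np

def npc_classify(x, W, c, dist=None):
    """Nearest prototype classifier: determine the winner, i.e. the
    prototype with minimal distance to x, and assign x to its class.
    W: (P, N) matrix of prototypes, c: (P,) prototype labels."""
    if dist is None:
        # standard example: squared Euclidean distance
        dist = lambda w, x: np.sum((w - x) ** 2)
    d = np.array([dist(w, x) for w in W])
    return c[np.argmin(d)]
```

Only the P prototypes are stored and compared, in contrast to K-NN, which compares against all examples.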

Page 11:

∙ identification of prototype vectors from labeled example data

∙ distance based classification (e.g. Euclidean)

Learning Vector Quantization

N-dimensional data, feature vectors

• initialize prototype vectors for different classes

competitive learning: LVQ1 [Kohonen, 1990]

• identify the winner (closest prototype)

• present a single example

• move the winner

- closer towards the data (same class)

- away from the data (different class)

Page 12:

∙ identification of prototype vectors from labeled example data

∙ distance based classification (e.g. Euclidean)

Learning Vector Quantization

N-dimensional data, feature vectors

∙ tessellation of feature space [piece-wise linear]

∙ distance-based classification [here: Euclidean distances]

∙ generalization ability correct classification of new data

∙ aim: discrimination of classes (≠ vector quantization or density estimation)

Page 13:

sequential presentation of labelled examples

… the winner takes it all:

learning rate

many heuristic variants/modifications: [Kohonen, 1990,1997]

- learning rate schedules ηw (t) [Darken & Moody, 1992]

- update more than one prototype per step

iterative training procedure:

randomized initialization of the prototypes, e.g. close to the class-conditional means

LVQ1

LVQ1 update step:
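The winner-takes-all update can be sketched as follows; an illustrative Python/numpy version of the LVQ1 step (the slide's own formula is in vector notation, this is just a transcription under standard assumptions):

```python
import numpy as np

def lvq1_step(x, y, W, c, eta=0.05):
    """One LVQ1 update [Kohonen, 1990]: the winner takes it all.
    The closest prototype is moved towards the example if the labels
    agree, and away from it otherwise; all other prototypes stay put.
    W: (P, N) prototypes (modified in place), c: (P,) prototype labels,
    eta: learning rate."""
    J = np.argmin(np.sum((W - x) ** 2, axis=1))  # identify the winner
    psi = 1.0 if c[J] == y else -1.0             # attraction vs. repulsion
    W[J] += eta * psi * (x - W[J])
    return W
```

Iterating this over a randomized sequence of labelled examples, with eta decreasing over time, is the iterative training procedure described above.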

Page 14:

LVQ1 update step:

LVQ1-like update for generalized distance:

requirement:

update decreases (increases) distance if classes coincide (are different)

LVQ1

Page 15:

cost function based LVQ

one example: Generalized LVQ [Sato & Yamada, 1995]

two winning prototypes (closest correct, closest incorrect): minimize E

Φ sigmoidal (linear for small arguments), e.g.:

- E approximates the number of misclassifications

- E favors large margin separation of classes

- E favors class-typical prototypes
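In the Sato & Yamada formulation the per-example cost is Φ(μ) with the relative distance difference μ = (dJ − dK)/(dJ + dK). A minimal illustrative sketch in Python/numpy (squared Euclidean distance and Φ = tanh are assumptions for concreteness):

```python
import numpy as np

def glvq_cost(X, y, W, c, phi=np.tanh):
    """GLVQ cost E = sum_i Phi(mu_i), with mu = (dJ - dK) / (dJ + dK),
    where dJ is the distance to the closest prototype with the correct
    label and dK the distance to the closest one with any other label.
    mu lies in [-1, 1]; mu < 0 means the example is classified correctly,
    so a steep sigmoidal Phi makes E approximate the number of errors,
    while a (near-)linear Phi favors large margin separation."""
    E = 0.0
    for x, lab in zip(X, y):
        d = np.sum((W - x) ** 2, axis=1)
        dJ = d[c == lab].min()   # closest correct prototype
        dK = d[c != lab].min()   # closest incorrect prototype
        E += phi((dJ - dK) / (dJ + dK))
    return E
```

Gradient-based minimization of this E with respect to the two winning prototypes is the GLVQ training scheme discussed on the following slides.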

Page 16:

cost function based LVQ

There is nothing objective about objective functions

James L. McClelland

Page 17:

GLVQ

training = optimization with respect to prototype position,

e.g. single example presentation, stochastic gradient descent,

update of two prototypes per step

based on non-negative, differentiable distance

requirement:

Page 18:

GLVQ

training = optimization with respect to prototype position,

e.g. single example presentation, stochastic sequence of examples,

update of two prototypes per step

based on non-negative, differentiable distance

Page 19:

GLVQ

training = optimization with respect to prototype position,

e.g. single example presentation, stochastic sequence of examples,

update of two prototypes per step

based on Euclidean distance

moves prototypes towards / away from the sample, with prefactors

Page 20:

related schemes

Many variants of LVQ

intuitive schemes: LVQ2, LVQ2.1, LVQ3, OLVQ, ...

cost function based: RSLVQ (likelihood ratios) ...

Supervised Neural Gas (NG)

many prototypes, rank based update

Supervised Self-Organizing Maps (SOM)

neighborhood relations, topology preserving mapping

Radial Basis Function Networks (RBF)

hidden units = centroids (prototypes) with Gaussian activation

Page 21:

An example problem: classification of adrenal tumors

Wiebke Arlt, Angela Taylor, Dave J. Smith, Peter Nightingale, P.M. Stewart, C.H.L. Shackleton et al.

Petra Schneider, Han Stiekema, Michael Biehl

Johann Bernoulli Institute for Mathematics and Computer Science University of Groningen

School of Medicine, Queen Elizabeth Hospital, University of Birmingham/UK (+ several centers in Europe)

tumor classification

[Arlt et al., J. Clin. Endocrinology & Metabolism, 2011]

Page 22:

∙ adrenal tumors are common (1-2%) and mostly found incidentally

∙ adrenocortical carcinomas (ACC) account for 2-11% of adrenal incidentalomas (ACA: adrenocortical adenomas)

∙ conventional diagnostic tools (CT, MRI) lack sensitivity and are labor and cost intensive

www.ensat.org

adrenal gland

∙ idea: tumor classification based on steroid excretion profile

tumor classification

Page 23:

- urinary steroid excretion (24 hours)

- 32 potential biomarkers

- biochemistry imposes correlations, grouping of steroids

tumor classification

Page 24:

data set:

102 patients with benign ACA

45 patients with malignant ACC

(figure: color coded excretion values per ACA/ACC patient and steroid marker; logarithmic scale, relative to healthy controls)

tumor classification

Page 25:

Generalized LVQ , training and performance evaluation

∙ data divided into 90% training and 10% test set

∙ determine prototypes by (stochastic) gradient descent

typical profiles (1 per class)

∙ apply classifier to test data

evaluate performance (error rates)

∙ employ Euclidean distance measure

in the 32-dim. feature space

∙ repeat and average over many random splits

tumor classification

Page 26:

ACA

ACC

prototypes: steroid excretion in ACA/ACC

tumor classification

Page 27:

∙ Receiver Operator Characteristics (ROC) [Fawcett, 2000]

obtained by introducing a biased NPC:

(figure: ROC of the biased NPC; x-axis: false positive rate (1-specificity), y-axis: true positive rate (sensitivity); diagonal = random guessing; θ = 0 marks the unbiased NPC)

Area under Curve

all tumors classified as ACA

- no false alarms

- no true positives detected

all tumors classified as ACC

- all true positives detected

- max. number of false alarms

tumor classification
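The sweep over the bias θ can be emulated by thresholding a per-patient score (e.g. the distance difference between the two prototypes; the score convention below is an assumption for illustration). A small Python/numpy sketch computing the resulting ROC and its area under curve:

```python
import numpy as np

def roc_auc(scores, labels):
    """ROC of a biased NPC: predict the positive class (ACC) whenever
    score > theta; sweeping theta from +inf to -inf traces the curve
    from (0,0) (everything called ACA, no false alarms, no true
    positives) to (1,1) (everything called ACC, all true positives,
    maximal number of false alarms).
    scores: higher = more ACC-like; labels: 1 = ACC, 0 = ACA."""
    order = np.argsort(-np.asarray(scores))     # descending scores
    lab = np.asarray(labels)[order]
    tpr = np.cumsum(lab) / lab.sum()            # sensitivity
    fpr = np.cumsum(1 - lab) / (1 - lab).sum()  # 1 - specificity
    fp = np.concatenate(([0.0], fpr))
    tp = np.concatenate(([0.0], tpr))
    # Area under Curve via the trapezoidal rule
    auc = np.sum(np.diff(fp) * (tp[1:] + tp[:-1]) / 2.0)
    return fpr, tpr, auc
```

An AUC of 1 corresponds to perfect separation, 0.5 to random guessing.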

Page 28:

ROC characteristics (averaged over splits of the data set)

AUC=0.87

GLVQ performance:

tumor classification

Page 29:

2) GLVQ implementation

Page 30:

brief excursion: gradient descent

stochastic gradient descent: convergence requires a decreasing learning rate with 'time' (number of steps t),

e.g. as

condition [Robbins and Monro, 1951]:

alternatives:

- more general optimization schemes (conjugate gradient, line search, second order derivatives, …)

- adaptive learning rates

- …

Page 31:

batch gradient descent

batch gradient-based descent w.r.t. GLVQ costs

concatenated prototype vector

update in the direction ofthe negative (full) gradient

step size for normalized gradient

Page 32:

batch gradient descent

too small: slow convergence

too large: over-shooting, zig-zagging, oscillatory behavior, divergence

Waypoint averaging [Papari, Biehl, Bunte, 2011] (here: modified default step)

default: increase αw by a factor, e.g. 1.1

if E(mean over k last positions) < E(next position)

    replace next position by the mean

    reduce αw by a factor, e.g. 2/3

end
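The step-size control above can be sketched as follows; an illustrative Python/numpy version (the toy quadratic cost and all function names are assumptions, not the toolbox implementation):

```python
import numpy as np

def waypoint_step(E, grad, w, history, alpha, k=5):
    """One batch step with waypoint averaging (sketch of the modified
    default step described above): step along the normalized negative
    gradient; if the mean over the k last positions has lower cost than
    the new position, replace it by the mean and reduce the step size
    (by 2/3), otherwise increase the step size slightly (factor 1.1)."""
    w_new = w - alpha * grad(w) / np.linalg.norm(grad(w))
    history.append(w_new)
    history[:] = history[-k:]              # keep only the k last waypoints
    w_mean = np.mean(history, axis=0)
    if E(w_mean) < E(w_new):               # over-shooting / zig-zagging
        return w_mean, alpha * 2.0 / 3.0
    return w_new, alpha * 1.1              # default: increase alpha

# usage on a toy quadratic cost
E = lambda w: float(np.sum(w ** 2))
grad = lambda w: 2.0 * w
w, alpha, hist = np.array([4.0, 3.0]), 1.0, []
for _ in range(25):
    w, alpha = waypoint_step(E, grad, w, hist, alpha)
```

The averaging damps the zig-zagging of an over-large step, while the multiplicative adaptation keeps the step size near the largest value that still decreases the cost.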

Page 33:

- collection of Matlab code (no toolboxes required)

includes example data sets and limited documentation

- mainly for demo-purposes (do not use for critical applications)

efficiency, programming style, etc. were not in the focus

“no nonsense” GMLVQ code collection

provides: single runs, visualization of the data set

leave-one-out, subset validation procedure

variants/options: GLVQ, [GRLVQ], GMLVQ

null-space projection

singularity-control

A no-nonsense beginners’ tool for G(M)LVQ:

http://www.cs.rug.nl/~biehl/No-Nonsense-GMLVQ.zip

Page 34:

example demo

>> load twoclass-difficult.mat
(98 examples, 34-dim. feature vectors, binary labels)
>> [gmlvq_system,curves_single,param_set]=run_single(fvec,lbl,100)

learning curvesand step sizes

prototypes

Page 35:

example demo

>> load twoclass-difficult.mat
>> [gmlvq_system,curves_single,param_set]=run_single(fvec,lbl,100)

training set ROC visualization (features 33, 34)

Page 36:

example demo

avg. validation set ROCavg. learning curves

>> [gmlvq_mean,roc_val,lcurves_mean,lcurves_std,param_set]=… run_validation(fvec,lbl,50);

GLVQ without relevances…

learning curves, averages over 5 validation runs with 10% of examples left out for testing

avg. prototypes

Page 37:

http://matlabserver.cs.rug.nl/gmlvqweb/web/

More sophisticated Matlab code [K. Bunte]: more options, training by non-linear optimization, etc.

Relevance and Matrix adaptation in Learning Vector Quantization (GLVQ, GRLVQ, GMLVQ and LiRaM LVQ): http://www.cs.rug.nl/~biehl/

more links

Pre- and re-prints etc.:

Page 38:

3) Alternative distance measures and relevance learning

Page 39:

fixed, pre-defined distance measures: GLVQ (or, more generally, cost function based LVQ) can be based on general, differentiable distances,

e.g. Minkowski measures

Alternative distance measures

possible work-flow

- select several distance measures according to prior knowledge or a data-driven choice in a preprocessing step

- compare performance of various measures

examples: Kernelized distances

Divergences (statistics)

Page 40:

Kernelized distances

rewrite squared Euclidean distance in terms of dot-product

distance measure associated with general inner product or kernel function

e.g. Gaussian Kernel

implicit mapping to a high-dimensional space for better separability of classes; similar: Support Vector Machine
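Rewriting the squared Euclidean distance in terms of dot-products and replacing them by a kernel gives the kernelized distance. An illustrative Python/numpy sketch (the Gaussian kernel bandwidth `sigma` is an assumed parameter):

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel k(x, y) = exp(-|x - y|^2 / (2 sigma^2))."""
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))

def kernel_distance(x, y, k=gaussian_kernel):
    """Kernelized squared distance d(x, y) = k(x,x) - 2 k(x,y) + k(y,y):
    the squared Euclidean distance between the (implicit) feature-space
    images of x and y, mirroring |x - y|^2 = x.x - 2 x.y + y.y for the
    linear kernel k(x, y) = x.y."""
    return k(x, x) - 2.0 * k(x, y) + k(y, y)
```

With the linear kernel this reduces to the plain squared Euclidean distance; a non-linear kernel corresponds to an implicit mapping to a high-dimensional space.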

Page 41:

Divergence Based LVQ: Detection of Cassava Mosaic Disease

Ernest Mwebaye, John Quinn, Jennifer Aduwo

Petra Schneider, Michael Biehl

Johann Bernoulli Institute, University of Groningen

Department of Computer Science, Makerere University, Kampala; Namulonge Crop Research Center, Uganda


Thomas Villmann, Sven Haase

Frank-Michael Schleif

University of Applied Sciences, Mittweida

University Bielefeld, Germany

divergence based LVQ

[Neurocomputing, 2011]

Page 42:

(figure: healthy vs. Mosaic leaves)

Example: detection of Mosaic disease in Cassava (maniok) plants

Makerere University and Namulonge Crop Research Center, Uganda

LVQ classifiers based on histogram specific distance measures

divergences (statistics) for non-negative, possibly normalized data

(densities, spectral data, more general functional data)

leaf images

divergence based LVQ

Page 43:

Squared Euclidean distance:

Cauchy-Schwarz divergence


divergence based LVQ

Page 44:

example family: γ-divergences

non-symmetric (in general), violates the triangle inequality

includes: Kullback-Leibler, Cauchy-Schwarz, Euclidean
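Two members of the family named above, sketched in Python/numpy for histogram-like data (the parameter correspondences follow the slide; the concrete formulas are the standard definitions and not taken from the toolbox):

```python
import numpy as np

def kl_divergence(p, q):
    """Kullback-Leibler divergence for normalized histograms with
    strictly positive entries; non-symmetric in p and q."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sum(p * np.log(p / q)))

def cs_divergence(p, q):
    """Cauchy-Schwarz divergence for non-negative, not necessarily
    normalized data: -log( <p,q> / (|p| |q|) ). Symmetric, zero iff
    p and q are proportional, but violates the triangle inequality."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(-np.log(np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q))))
```

Both are differentiable in their arguments, which is all that cost function based LVQ requires of the dissimilarity measure.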

divergence based LVQ

Page 45:

http://www.air.ug/mcrops/

Page 46:

Adaptive distance measures


Page 47:

relevance learning:

- employ a parameterized distance measure with only the mathematical form fixed in advance

- update its parameters in the training process together with prototype training

- adaptive, data driven dissimilarity

example: Matrix Relevance LVQ: data-driven optimization of prototypes and relevance matrix in the same training process (≠ pre-processing)

Relevance Learning

Page 48:

Quadratic distance measure

generalized quadratic distance:

variants: one global, several local, class-wise relevance matrices Λ(j)

→ piecewise quadratic decision boundaries

rectangular Ω: discriminative low-dim. representation, e.g. for visualization [Bunte et al., 2012]

diagonal matrices: single feature weights [Bojer et al., 2001] [Hammer et al., 2002]

scaling of features, general linear transformation of feature space

potential normalization:
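The generalized quadratic distance and the normalization can be written out as a short Python/numpy sketch (illustrative only; the parameterization Λ = Ωᵀ Ω follows the slides, the helper names are assumptions):

```python
import numpy as np

def gmlvq_distance(x, w, Omega):
    """Generalized quadratic distance d_Lambda(x, w) = (x-w)^T Lambda (x-w)
    with Lambda = Omega^T Omega, so Lambda is positive semi-definite by
    construction. Equivalently: the squared Euclidean distance of the
    linearly transformed difference Omega (x - w)."""
    z = Omega @ (x - w)
    return float(z @ z)

def normalize(Omega):
    """Potential normalization: fix the overall scale so that
    trace(Lambda) = sum of squared elements of Omega = 1."""
    return Omega / np.sqrt(np.sum(Omega ** 2))
```

A diagonal Ω recovers single feature weights (relevances); a full matrix additionally weights pairs of features via a general linear transformation of feature space.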

Page 49:

But this is just Mahalanobis distance…

[Mahalanobis, 1936]

S: covariance matrix of the random vectors (calculated once from the data, fixed definition, not adaptive)

if you insist…

(‘two point version’)

So it is a generalized Mahalanobis distance?

No.

a generalized broccoli

a generalization of Ohm's Law

Page 50:

Relevance Matrix LVQ

optimization of prototypes and distance measure

WTA

Matrix-LVQ1

Page 51:

Relevance Matrix LVQ

Generalized Matrix LVQ

(GMLVQ)

optimization of prototypes and distance measure

Page 52:

heuristic interpretation

summarizes

- the contribution of the original dimension

- the relevance of original features for the classification

interpretation assumes implicitly:

features have equal order of magnitude

e.g. after z-score-transformation →

(averages over data set)

standard Euclidean distance for linearly transformed features

Page 53:

Relevance Matrix LVQ

optimization of prototype positions and distance measure(s) in one training process (≠ pre-processing)

motivation:

improved performance

- weighting of features and pairs of features

simplified classification schemes

- elimination of non-informative, noisy features

- discriminative low-dimensional representation

insight into the data / classification problem

- identification of most discriminative features

- incorporation of prior knowledge (e.g. structure of Ω)

Page 54:

related schemes

Relevance LVQ variants

local, rectangular, structured, restricted... relevance matrices

for visualization, functional data, texture recognition, etc.

relevance learning in Robust Soft LVQ, Supervised NG, etc.

combination of distances for mixed data ...

Relevance Learning related schemes in supervised learning ...

RBF Networks [Backhaus et al., 2012]

Neighborhood Component Analysis [Goldberger et al., 2005]

Large Margin Nearest Neighbor [Weinberger et al., 2006, 2010]

and many more!

Linear Discriminant Analysis (LDA): one prototype per class + global matrix, different objective function!

Page 55:

Classification of adrenal tumors (cont‘d)

Wiebke Arlt, Angela Taylor, Dave J. Smith, Peter Nightingale, P.M. Stewart, C.H.L. Shackleton et al.

Petra Schneider, Han Stiekema, Michael Biehl

Johann Bernoulli Institute for Mathematics and Computer Science University of Groningen

School of Medicine, Queen Elizabeth Hospital, University of Birmingham/UK (+ several centers in Europe)

[Arlt et al., J. Clin. Endocrinology & Metabolism, 2011]

[Biehl et al., Europ. Symp. Artificial Neural Networks (ESANN), 2012]

Page 56:

∙ adrenocortical tumors, difficult differential diagnosis: ACC: adrenocortical carcinomas, ACA: adrenocortical adenomas

∙ idea: steroid metabolomics, tumor classification based on urinary steroid excretion, 32 candidate steroid markers:

adrenocortical tumors

Page 57:

Generalized Matrix LVQ , ACC vs. ACA classification

∙ data divided into 90% training, 10% test set

∙ determine prototypes

typical profiles (1 per class)

∙ apply classifier to test data

evaluate performance (error rates, ROC)

∙ adaptive generalized quadratic distance measure

parameterized by

∙ repeat and average over many random splits

adrenocortical tumors

data set: 24 hrs. urinary steroid excretion

- 102 patients with benign ACA

- 45 patients with malignant ACC

Page 58:

Generalized Matrix LVQ , ACC vs. ACA classification

∙ data divided into 90% training, 10% test set (z-score transformed)

∙ determine prototypes

typical profiles (1 per class)

∙ apply classifier to test data

evaluate performance (error rates, ROC)

∙ adaptive generalized quadratic distance measure

parameterized by

∙ repeat and average over many random splits

tumor classification (cont’d)

[Arlt et al., 2011][Biehl et al., 2012]

Page 59:

diagonal / off-diagonal elements

fraction of runs (random splits) in which a steroid is rated among the 9 most relevant markers

subset of 9 selected steroids ↔ technical realization (patented, University of Birmingham/UK)

tumor classification

Relevance matrix

Page 60:

diagonal / off-diagonal elements

ACA / ACC prototypes: discriminative, e.g. steroid 19

tumor classification

Page 61:

diagonal / off-diagonal elements

ACA / ACC prototypes: non-trivial role: steroid 8 among the most relevant!

tumor classification

Page 62:

highly discriminativecombination of markers!

weakly discriminative markers (e.g. steroids 8, 12)

tumor classification

Page 63:

ROC characteristics

clear improvement due to adaptive distances

(figure: ROC curves, false positive rate (1-specificity) vs. true positive rate (sensitivity))

AUC:

- Euclidean: 0.87

- GRLVQ (diagonal relevances): 0.93

- GMLVQ (full matrix): 0.97

tumor classification

Page 64:

observation / theory :

low rank of resulting relevance matrix

often: single relevant eigendirection

eigenvalues in ACA/ACC classification

intrinsic regularization

nominally ~ N×N adaptive parameters in Matrix LVQ reduce to ~ N effective degrees of freedom

low-dimensional representation

facilitates, e.g., visualization of labeled data sets
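The observation above can be checked numerically: diagonalize Λ = Ωᵀ Ω, read off the effective rank from the eigenvalue spectrum, and project data onto the leading eigenvector for visualization. A toy sketch in Python/NumPy (the rank-1 Ω is made up for illustration):

```python
import numpy as np

# Illustration (not toolbox code): a trained Lambda = Omega^T Omega is often
# effectively low-rank, so the nominally N x N parameters reduce to ~N
# effective degrees of freedom; projecting data onto the leading
# eigenvector(s) yields a low-dimensional visualization.
omega = np.outer([1.0, 2.0], [0.6, 0.8])   # made-up rank-1 transformation
lam = omega.T @ omega                       # relevance matrix Lambda

eigvals, eigvecs = np.linalg.eigh(lam)      # ascending eigenvalues
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# effective rank: number of non-negligible eigenvalues
rank = int(np.sum(eigvals > 1e-10 * eigvals[0]))
assert rank == 1

# 1D visualization coordinates of a data matrix X (rows = samples)
X = np.random.default_rng(0).normal(size=(5, 2))
coords = X @ eigvecs[:, 0]
assert coords.shape == (5,)
```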

tumor classification

theory: stationarity of Matrix RLVQ

Biehl et al.: Stationarity of Matrix Relevance LVQ. Proc. IJCNN 2015

Page 65:

tumor classification

visualization of the data set

ACA / ACC

Page 66:

modified batch gradient descent

batch gradient-based descent w.r.t. costs

concatenated prototype vector

elements of Ω

updates in the direction of the normalized gradients

waypoint averaging and step size control, separately for the prototypes and for Ω
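The scheme above can be written out schematically as follows. This is a simplified Python sketch, not the toolbox implementation: the quadratic toy cost and the accept/reject heuristic stand in for the toolbox's waypoint averaging, and all names are illustrative:

```python
import numpy as np

# Schematic batch descent: updates follow the *normalized* gradients, with
# separate step sizes for the prototype part and the Omega part; step sizes
# shrink when the cost increases and grow cautiously when it decreases.
def batch_descent(w, omega, cost, grad_w, grad_omega,
                  eta_w=0.1, eta_om=0.05, steps=300):
    c_old = cost(w, omega)
    for _ in range(steps):
        dw, dom = grad_w(w, omega), grad_omega(w, omega)
        w_new = w - eta_w * dw / (np.linalg.norm(dw) + 1e-12)
        om_new = omega - eta_om * dom / (np.linalg.norm(dom) + 1e-12)
        c_new = cost(w_new, om_new)
        if c_new <= c_old:             # accept, increase step sizes
            w, omega, c_old = w_new, om_new, c_new
            eta_w *= 1.1; eta_om *= 1.1
        else:                          # reject, decrease step sizes
            eta_w /= 2.0; eta_om /= 2.0
    return w, omega

# toy cost: ||w||^2 + ||Omega - I||^2, minimized at w = 0, Omega = I
cost = lambda w, om: float(w @ w + np.sum((om - np.eye(2)) ** 2))
gw = lambda w, om: 2 * w
go = lambda w, om: 2 * (om - np.eye(2))
w_opt, om_opt = batch_descent(np.ones(2), np.zeros((2, 2)), cost, gw, go)
assert cost(w_opt, om_opt) < 0.1
```

Normalizing the two gradients decouples their scales, which is why the prototype and matrix parts get their own step size schedules.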

Page 67:

example demo

>> load twoclass-difficult.mat   % 98 34-dim. feature vectors, binary classification
>> [gmlvq_system,curves_single,param_set]=run_single(fvec,lbl,100)

prototypes and relevance matrix

learning curves and step sizes

Page 68:

example demo

>> load twoclass-difficult.mat
>> [gmlvq_system,curves_single,param_set]=run_single(fvec,lbl,100)

training set ROC; visualization of the data set

Page 69:

example demo

avg. validation set ROC; avg. prototypes and relevance matrix

>> [gmlvq_mean,roc_val,lcurves_mean,lcurves_std,param_set]= ...
      run_validation(fvec,lbl,50);

GMLVQ learning curves, averages over 5 validation runs with 10 % of examples left out for testing

Page 70:

a multi-class problem

visualization of the 18-dim. data set; avg. prototypes and rel. matrix

>> load uci-segmenation-sampled
>> [gmlvq_system,curves_single,param_set]=run_single(fvec,lbl,50)

Page 71:

Singularity control

Note:

singularity of the relevance matrix can lead to numerical instabilities and over-simplification effects

singularity control: add a penalty term to the costs; its derivative → modified matrix update

(implemented in the no-nonsense gmlvq code collection)
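One concrete form of such a penalty, used in matrix-relevance regularization, is −μ ln det(ΩΩᵀ), whose Ω-gradient is −2μ(Ω⁺)ᵀ; it diverges as Ω approaches singularity and so keeps Λ away from rank collapse. The sketch below (Python for illustration; μ and the toy matrix are arbitrary choices) verifies the derivative numerically:

```python
import numpy as np

# Penalty -mu * ln det(Omega Omega^T) and its gradient -2 * mu * pinv(Omega)^T.
def penalty(omega, mu):
    _, logdet = np.linalg.slogdet(omega @ omega.T)
    return -mu * logdet

def penalty_grad(omega, mu):
    return -2.0 * mu * np.linalg.pinv(omega).T

# gradient check by central finite differences
om = np.array([[1.0, 0.3],
               [0.2, 1.5]])      # well-conditioned toy matrix, det = 1.44
mu, eps = 0.1, 1e-6
g = penalty_grad(om, mu)
for i in range(2):
    for j in range(2):
        e = np.zeros((2, 2)); e[i, j] = eps
        num = (penalty(om + e, mu) - penalty(om - e, mu)) / (2 * eps)
        assert abs(num - g[i, j]) < 1e-4
```

In training, this gradient is simply added to the matrix update, which is how the modified update mentioned above arises.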

Page 72:

Uniqueness

(I) uniqueness of Ω, given Λ

matrix square root is not unique

irrelevant rotations, reflections, symmetries… canonical representation in terms of the eigen-decomposition of Λ:

- pos. semi-definite symmetric solution (Matlab: “sqrtm”)
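The non-uniqueness and the canonical choice can be demonstrated directly: any orthogonal transformation R of Ω leaves Λ unchanged, while the eigen-decomposition of Λ yields the symmetric positive semi-definite root, which is what Matlab's sqrtm returns for a symmetric psd input. Python/NumPy sketch for illustration:

```python
import numpy as np

# Symmetric psd square root via the eigen-decomposition of Lambda.
def psd_sqrt(lam):
    vals, vecs = np.linalg.eigh(lam)
    vals = np.clip(vals, 0.0, None)      # guard tiny negative round-off
    return vecs @ np.diag(np.sqrt(vals)) @ vecs.T

omega = np.array([[2.0, 0.5],
                  [0.0, 1.0]])
lam = omega.T @ omega

theta = 0.7                               # arbitrary rotation R
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
omega2 = R @ omega

# both Omegas induce the identical relevance matrix ...
assert np.allclose(omega2.T @ omega2, lam)
# ... and the canonical symmetric root reproduces Lambda as well
root = psd_sqrt(lam)
assert np.allclose(root @ root, lam)
assert np.allclose(root, root.T)
```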

Page 73:

Uniqueness

(II) uniqueness of the relevance matrix for a given data set?

simple example: consider two identical, entirely irrelevant features

their contributions cancel exactly if … → disregarded in the classification of the training data; but a naïve interpretation of the diagonal suggests high relevance and could cause non-trivial effects for novel data

Page 74:

Uniqueness

(II) uniqueness (cont’d)

given transformation: … is possible if the rows of … are in the null-space of … → identical mapping of the examples, but different for …

… is singular if features are correlated / dependent; possible to extend … by the prototypes

Page 75:

regularization

training process yields

determine … with eigenvectors and eigenvalues

regularization:

(K < J): retain the eigenspace corresponding to the K largest eigenvalues; also removes the span of small non-zero eigenvalues

(K = J): removes all null-space contributions; unique solution with minimal Euclidean norm of the row vectors; equivalent: Moore-Penrose inverse X⁺ (implemented in the no-nonsense gmlvq code collection)
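The (K = J) correction can be sketched as a projection of the rows of Ω onto the row space of the data matrix X, via the Moore-Penrose pseudo-inverse. Components of Ω in the null space of X never affect the mapped training data, and removing them yields the unique solution with minimal Euclidean row norms. Illustrative NumPy code, not the toolbox implementation:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(10, 3))        # samples as rows
X[:, 2] = X[:, 0] + X[:, 1]         # correlated feature -> rank-deficient X

P = np.linalg.pinv(X) @ X           # projector onto the row space of X
omega = rng.normal(size=(2, 3))
omega_reg = omega @ P               # regularized mapping

# identical action on all training examples ...
assert np.allclose(X @ omega.T, X @ omega_reg.T)
# ... with row norms that can only shrink under the projection
assert np.linalg.norm(omega_reg) <= np.linalg.norm(omega) + 1e-12
```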

Page 76:

regularization

two routes to a regularized mapping:

pre-processing of the data (PCA): mapped feature space, fixed K, prototypes yet unknown; example: diagnosis of rheumatoid arthritis

regularization after/during training: retains the original features, flexible K, may include the prototypes; example: Wine data set

Strickert, Hammer, Villmann, Biehl: Regularization and improved interpretation of linear data mappings and adaptive distance measures. IEEE SSCI 2013

Page 77:

illustrative example

infra-red spectral data: 124 wine samples, 256 wavelengths; 30 training data, 94 test spectra

GMLVQ classification of alcohol content: high / low / medium

[UCI ML repository]

Page 78:

GMLVQ

best performance: 7 dimensions remaining

over-fitting effect

null-space correction: P = 30 dimensions

Page 79:

original

regularized

regularization:
- enhances generalization
- smooths the relevance profile/matrix
- removes ‘false relevances’
- improves interpretability of Λ

raw relevance matrix

posterior regularization

Page 80:

Early diagnosis of Rheumatoid Arthritis

Expression of chemokines CXCL4 and CXCL7 by synovial macrophages defines an early stage of rheumatoid arthritis

Annals of the Rheumatic Diseases, 2015 (available online)

L. Yeo, N. Adlard, M. Biehl, M. Juarez, M. Snow, C.D. Buckley, A. Filer, K. Raza, D. Scheel-Toellner

Page 81:

disease stages: uninflamed control, early inflammation (resolving vs. early RA), established RA

cytokine-based diagnosis of RA at the earliest possible stage?

ultimate goals: understand pathogenesis and mechanism of progression

rheumatoid arthritis (RA)

Page 82:

synovium → tissue section → mRNA extraction → real-time PCR

synovial tissue cytokine expression

IL1A IL17F FASL CXCL4 CCL15 TGFB1 KITLG
IL1B IL18 CD70 CXCL5 CCL16 TGFB2 MST1
IL1RN IL19 CD30L CXCL6 CCL17 TGFB3 SPP1
IL2 IL20 4-1BB-L CXCL7 CCL18 EGF SFRP1
IL3 IL21 TRAIL CXCL9 CCL19 FGF2 ANXA1
IL4 IL22 RANKL CXCL10 CCL20 TGFA TNFRSF13B
IL5 IL23A TWEAK CXCL11 CCL21 IGF2 IL6R
IL6 IL24 APRIL CXCL12 CCL22 VEGFA NAMPT
IL7 IL25 BAFF CXCL13 CCL23 VEGFB C1QTNF3
IL8 IL26 LIGHT CXCL14 CCL24 MIF VCAM1
IL9 IL27 TL1A CXCL16 CCL25 LIF LGALS1
IL10 IL28A GITRL CCL1 CCL26 OSM LGALS9
IL11 IL29 FASLG CCL2 CCL27 ADIPOQ LGALS3
IL12A IL32 IFNA1 CCL3 CCL28 LEP LGALS12
IL12B IL33 IFNA2 CCL4 XCL1 GHRL
IL13 LTA IFNB1 CCL5 XCL2 RETN
IL14 TNF IFNG CCL7 CX3CL1 CTLA4
IL15 LTB CXCL1 CCL8 CSF1 EPO
IL16 OX40L CXCL2 CCL11 CSF2 TPO
IL17A CD40L CXCL3 CCL13 CSF3 FLT3LG

panel of 117 cytokines

• cell signaling proteins

• regulate immune response

• produced by, e.g. T-cells, macrophages, lymphocytes, fibroblasts, etc.

Page 83:

GMLVQ analysis

pre-processing:

• log-transformed expression values (117 dim. data, 47 samples in total)

• 21 leading principal components explain ca. 90% of the total variation

two two-class problems: (A) established RA vs. uninflamed controls; (B) early RA vs. resolving inflammation

• 1 prototype per class, global relevance matrix, adaptive distance measure

• leave-one-out validation

evaluation in terms of Receiver Operating Characteristics
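The evaluation pipeline above can be sketched as follows. A nearest-class-mean classifier stands in for the trained GMLVQ system (purely illustrative); the leave-one-out loop and the rank-based AUC computation are the point:

```python
import numpy as np

def auc(scores, labels):
    """Area under the ROC curve via the rank statistic."""
    scores, labels = np.asarray(scores), np.asarray(labels)
    pos, neg = scores[labels == 1], scores[labels == 0]
    # fraction of (pos, neg) pairs ranked correctly (ties count 1/2)
    grid = pos[:, None] - neg[None, :]
    return float((np.sum(grid > 0) + 0.5 * np.sum(grid == 0)) / grid.size)

def loo_scores(X, y):
    """Leave-one-out: hold out sample i, fit on the rest, score sample i."""
    scores = []
    for i in range(len(y)):
        mask = np.arange(len(y)) != i
        Xtr, ytr = X[mask], y[mask]
        m0, m1 = Xtr[ytr == 0].mean(0), Xtr[ytr == 1].mean(0)
        # score: difference of squared distances to the two class means
        scores.append(np.sum((X[i] - m0) ** 2) - np.sum((X[i] - m1) ** 2))
    return np.array(scores)

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(3, 1, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
assert auc(loo_scores(X, y), y) > 0.9
```

Sweeping the decision threshold over the collected held-out scores traces out exactly the ROC curves shown on the following slides.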

Page 84:

ROC curves (false positive rate vs. true positive rate) and diagonal relevances Λii vs. cytokine index i (leave-one-out):

established RA vs. uninflamed control / early RA vs. resolving inflammation

Matrix Relevance LVQ; initialization of the LVQ system

Page 85:

CXCL4 chemokine (C-X-C motif) ligand 4

CXCL7 chemokine (C-X-C motif) ligand 7

direct study on the protein level, staining / imaging of synovial tissue: macrophages are the predominant source of CXCL4/7 expression

protein level studies

• high levels of CXCL4 and CXCL7 in the first 12 weeks of synovitis in early RA

• expression on macrophages outside of blood vessels discriminates early RA / resolving cases

(2 PhD thesis projects)

Page 86:

ROC curves (false positive rate vs. true positive rate) and diagonal relevances Λii vs. cytokine index i (leave-one-out):

established RA vs. uninflamed control / early RA vs. resolving inflammation

relevant cytokines, e.g. macrophage stimulating 1 (MST1)

Page 87:

four class problem

one prototype per class and one global matrix, trained in one go

low-rank relevance matrix (rank ≈ 2)

visualization of the data set in terms of the eigenvectors of Λ

Niels Kluiter, research internship at JBI Groningen

Page 88:

four class problem

- extract binary classifiers (healthy vs. est. RA, resolving vs. early RA) by restricting the system to the corresponding prototypes

- for varying number K of PCs used as feature vectors, determine the corresponding ROC performances

robust in a range of 14 < K < 20

healthy vs. est. RA

K=16: AUC = 0.92

early vs. resolving RA

K=16: AUC = 0.79

to do: nested leave-one-out validation

Page 89:

four class problem

read off problem-specific relevances from the eigenvectors of Λ

control vs. est. RA / resolving vs. early RA

Page 90:

Some challenges in biomedical data, further application examples

Page 91:

challenges in bio-medical data

A. Filer, A. Clark, M. Juarez, J. Falconer et al.

- micro-array gene expression data

high-dimensional (~50,000 probes): PCA + GMLVQ

(work in progress)

early Arthritis vs. resolving inflammations

- preliminary result:

better than random classification

close inspection of high-relevance genes: the system discriminates male/female patients; the prediction reflects the higher prevalence of RA in female patients

leave-one-out

“accuracy is not enough”

Page 92:

interpretability

- important: understand the basis of decisions

- white-box approaches for classification/regression etc.

- insights into the data and problem at hand

- e.g. selection of most discriminative bio-markers

challenges

relevance of steroid markers (www.ensat.org)

adrenocortical tumors: adenomas (ACA) vs. carcinomas (ACC)

W. Arlt, M. Biehl et al. Urine steroid metabolomics as a biomarker tool for detecting malignancy in adrenal tumors J. of Clin. Endocrinology & Metabolism 96: 3775-3784 (2011).

Page 93:

large amounts of data, e.g. image data bases

LifeLines (longitudinal patient data)

prescription data bases [E. Hak, K. Taxis]

challenges

query images A, B, C, D; retrieval: √ = same class, × = different class

UMCG data base of skin lesion images

K. Bunte, M. Biehl, M.F. Jonkman, N. Petkov Learning Effective Color Features for Content Based Image Retrieval in Dermatology. Pattern Recognition 44 (2011) 1892-1902.

Page 94:

high-dimensional data, e.g. medical images (CT, MRI, PET …)

gene expression, DNA sequences, …

challenges

axes: projections onto the leading eigenvectors of Λ

M. Biehl, K. Bunte, P. Schneider: Analysis of Flow Cytometry Data by Matrix Relevance Learning Vector Quantization. PLOS One 8: e59401 (2013)

- low-dim. representation
- feature selection
- visualization

high-throughput flow cytometry: ~10k cells × 30 markers per sample; derive 186 features; GMLVQ, low-dim. projection

Page 95:

incomplete data

challenges

- missing values, noise, uncertain labels…

imputation, semi-supervised learning

- complementary data sets…

learning from privileged information, transfer learning

mixed data

- combination of different sources / technical platforms: suitable adaptive & integrative (dis-)similarity measures

E. Mwebaze, G. Bearda, M. Biehl, D. Zühlke: Combining dissimilarity measures for prototype-based classification. Proc. of the 23rd European Symposium on Artificial Neural Networks ESANN 2015, d-side publishing, 31-36 (2015)

Page 96:

distances combined

N-dim. vector / M-bin histogram / temporal sequence / …

Euclidean / divergence / (mis-)alignment

combined distance measure, e.g. … + source-specific prototypes

relevance learning!

E. Mwebaze, G. Bearda, M. Biehl, D. Zühlke: Combining dissimilarity measures for prototype-based classification. Proc. of the 23rd European Symposium on Artificial Neural Networks ESANN 2015, d-side publishing, 31-36 (2015)
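In the spirit of the cited approach, a sketch of such a combined measure: a Euclidean term for a vector-valued source plus a symmetrized divergence for a histogram source, mixed by non-negative weights that relevance learning would adapt. All names, the specific divergence, and the example weights are illustrative:

```python
import numpy as np

def euclidean2(x, w):
    """Squared Euclidean distance for a vector source."""
    return float(np.sum((x - w) ** 2))

def sym_kl(p, q, eps=1e-12):
    """Symmetrized Kullback-Leibler-like divergence for a histogram source."""
    p, q = p + eps, q + eps
    return float(np.sum(p * np.log(p / q) + q * np.log(q / p)))

def combined_distance(x_vec, x_hist, w_vec, w_hist, weights):
    """Weighted sum of source-specific dissimilarities."""
    a, b = weights
    return a * euclidean2(x_vec, w_vec) + b * sym_kl(x_hist, w_hist)

x_vec, w_vec = np.array([1.0, 0.0]), np.array([0.0, 0.0])
x_h, w_h = np.array([0.5, 0.5]), np.array([0.5, 0.5])
# identical histograms: only the Euclidean part contributes
d = combined_distance(x_vec, x_h, w_vec, w_h, weights=(1.0, 1.0))
assert np.isclose(d, 1.0, atol=1e-6)
```

Treating the mixing weights as adaptive parameters alongside the prototypes is exactly where relevance learning enters.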

Page 97:

challenges

imbalanced data sets

- prevalence of diseases (screening vs. differential diagnosis)

- role of false positive / false negatives

T. Villmann, M. Kaden, W. Herrmann, M. Biehl: Learning Vector Quantization for ROC-optimization

possible working points

Page 98:

causal relations vs. correlation

challenges

- predictive power vs. causal dependence ?

www.causality.inf.ethz.ch/data/LUCAS.html

E. Mwebaze, J. Quinn, M. Biehl: Causal Relevance Learning for Robust Classification under Interventions. Proc. 19th Europ. Symp. on Artificial Neural Networks ESANN 2011

Page 99:

challenges

Non-vectorial data:

data not given as vectors in a Euclidean space, e.g. symbolic sequences of different length (‘loooooooongword’ vs. ‘shrtwrd’)

known: pairwise dis-similarities, e.g. edit-distance; ‘relational data’ given as a matrix

pseudo-Euclidean embedding; prototypes expressed as linear combinations of the examples

Page 100:

non-vectorial data

distances

Training: updates w.r.t. prototype coefficients, e.g. LVQ1-like or GLVQ

Working phase: WTA classification of novel data:

distance from known example data

distance from prototypes

[Hammer, Schleif, Zhu, 2011] [Hammer & Hasenfuss, 2010]
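The relational computation cited above [Hammer & Hasenfuss] expresses prototypes as convex combinations w = Σⱼ αⱼ xⱼ and obtains squared distances from the dissimilarity matrix alone, via d²(xᵢ, w) = (Dα)ᵢ − ½ αᵀDα, where D holds pairwise squared dissimilarities; no explicit embedding is ever needed. The sketch below (illustrative names) verifies the identity on Euclidean toy data, where it is exact:

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(6, 3))
# pairwise *squared* dissimilarities between all examples
D = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)

alpha = np.array([0.5, 0.2, 0.3, 0.0, 0.0, 0.0])   # convex coefficients
w = alpha @ X                                       # explicit prototype

# relational distances, computed from D and alpha only
d2_relational = D @ alpha - 0.5 * alpha @ D @ alpha
# explicit distances in the embedding, for comparison
d2_explicit = np.sum((X - w) ** 2, axis=1)
assert np.allclose(d2_relational, d2_explicit)
```

Training then updates the coefficient vectors α, e.g. LVQ1-like or via GLVQ, exactly as stated above.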


Page 101:

CAIP contributions

Gert-Jan de Vries, Steffen Pauws and Michael Biehl: Facial Expression Recognition using Learning Vector Quantization

Thomas Villmann, Marika Kaden, David Nebel and Michael Biehl: Learning Vector Quantization with Adaptive Cost-based Outlier-Rejection

Page 102:

a review article

For a recent review and further references see:

M. Biehl, B. Hammer, T. Villmann: Distance measures for prototype based classification. In: BrainComp, Proc. of the International Workshop on Brain-Inspired Computing, Cetraro/Italy, July 2013. L. Grandinetti, T. Lippert, N. Petkov (editors), Springer Lecture Notes in Computer Science Vol. 8603, pp. 100-116 (2014)

check www.cs.rug.nl/~biehl for more references and application examples