52
Multiple kernel Self-Organizing Maps Madalina Olteanu (2) , Nathalie Villa-Vialaneix (1,2) , Christine Cierco-Ayrolles (1) http://www.nathalievilla.org [email protected], [email protected], [email protected] Groupe de travail SAMM-Graph (25/01/2013) (1) (2) MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 1 / 25

Multiple kernel Self-Organizing Maps

  • Upload
    tuxette

  • View
    158

  • Download
    3

Embed Size (px)

DESCRIPTION

January, 25th, 2013 Groupe de travail Graphes, SAMM, Université Paris 1

Citation preview

Page 1: Multiple kernel Self-Organizing Maps

Multiple kernel Self-Organizing Maps

Madalina Olteanu(2), Nathalie Villa-Vialaneix(1,2), ChristineCierco-Ayrolles(1)

http://www.nathalievilla.org

[email protected], [email protected],

[email protected]

Groupe de travail SAMM-Graph (25/01/2013)

(1) (2)

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 1 / 25

Page 2: Multiple kernel Self-Organizing Maps

Introduction

Outline

1 Introduction

2 MK-SOM

3 Applications

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 2 / 25

Page 3: Multiple kernel Self-Organizing Maps

Introduction

Data, notations and objectives

Data: D datasets (xdi )i=1,...,n,d=1,...,D all measured on the same individuals

or on the same objects, {i, . . . , n}, each taking values in an arbitrary spaceGd .

Examples:• (xd

i )i is p numeric variables;

• (xdi )i is n nodes of a graph;

• (xdi )i is p factors;

• (xdi )i is a text...

Purpose: Combine all datasets to obtain a map of individuals/objects:self-organizing maps for clustering {i, . . . , n} using all datasets

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 3 / 25

Page 4: Multiple kernel Self-Organizing Maps

Introduction

Data, notations and objectives

Data: D datasets (xdi )i=1,...,n,d=1,...,D all measured on the same individuals

or on the same objects, {i, . . . , n}, each taking values in an arbitrary spaceGd .Examples:• (xd

i )i is p numeric variables;

• (xdi )i is n nodes of a graph;

• (xdi )i is p factors;

• (xdi )i is a text...

Purpose: Combine all datasets to obtain a map of individuals/objects:self-organizing maps for clustering {i, . . . , n} using all datasets

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 3 / 25

Page 5: Multiple kernel Self-Organizing Maps

Introduction

Data, notations and objectives

Data: D datasets (xdi )i=1,...,n,d=1,...,D all measured on the same individuals

or on the same objects, {i, . . . , n}, each taking values in an arbitrary spaceGd .Examples:• (xd

i )i is p numeric variables;

• (xdi )i is n nodes of a graph;

• (xdi )i is p factors;

• (xdi )i is a text...

Purpose: Combine all datasets to obtain a map of individuals/objects:self-organizing maps for clustering {i, . . . , n} using all datasets

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 3 / 25

Page 6: Multiple kernel Self-Organizing Maps

Introduction

Particular case [Villa-Vialaneix et al., 2013]Data: A weighted undirected network represented by a graph G with nnodes x1, . . . , xn with weight matrix W : Wij = Wji and Wii = 0.

For every node, additional information is provided: either , ,

Examples: Weight of people in a social network, Number of visits of aweb page in WWW...

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 4 / 25

Page 7: Multiple kernel Self-Organizing Maps

Introduction

Particular case [Villa-Vialaneix et al., 2013]Data: A weighted undirected network represented by a graph G with nnodes x1, . . . , xn with weight matrix W : Wij = Wji and Wii = 0.For every node, additional information is provided: either numericalvariables, factors, textual information... or a combination of all of them.

Examples: Weight of people in a social network, Number of visits of aweb page in WWW...

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 4 / 25

Page 8: Multiple kernel Self-Organizing Maps

Introduction

Particular case [Villa-Vialaneix et al., 2013]Data: A weighted undirected network represented by a graph G with nnodes x1, . . . , xn with weight matrix W : Wij = Wji and Wii = 0.For every node, additional information is provided: either numericalvariables, factors, textual information... or a combination of all of them.

Examples: Gender in a social network, Functional group of a gene in agene interaction network...

Examples: Weight of people in a social network, Number of visits of aweb page in WWW...

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 4 / 25

Page 9: Multiple kernel Self-Organizing Maps

Introduction

Particular case [Villa-Vialaneix et al., 2013]Data: A weighted undirected network represented by a graph G with nnodes x1, . . . , xn with weight matrix W : Wij = Wji and Wii = 0.For every node, additional information is provided: either numericalinformation, factors, textual information... or a combination of all of them.

Examples: Weight of people in a social network, Number of visits of aweb page in WWW...

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 4 / 25

Page 10: Multiple kernel Self-Organizing Maps

Introduction

Purpose

Node clustering: find communities, i.e., groups of “close” nodes in thegraph; close meaning:

• densely connected and sharing (comparatively) a few links with theother groups (“communities”);

• but also having similar labels.

Here: self-organizing map approach to produce a map of the nodes(clutering+visualization).

Multiple sources of information are handled bycombining kernels.

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 5 / 25

Page 11: Multiple kernel Self-Organizing Maps

Introduction

Purpose

Node clustering: find communities, i.e., groups of “close” nodes in thegraph; close meaning:

• densely connected and sharing (comparatively) a few links with theother groups (“communities”);

• but also having similar labels.

Here: self-organizing map approach to produce a map of the nodes(clutering+visualization).

Multiple sources of information are handled bycombining kernels.

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 5 / 25

Page 12: Multiple kernel Self-Organizing Maps

Introduction

Purpose

Node clustering: find communities, i.e., groups of “close” nodes in thegraph; close meaning:

• densely connected and sharing (comparatively) a few links with theother groups (“communities”);

• but also having similar labels.

Here: self-organizing map approach to produce a map of the nodes(clutering+visualization).

Multiple sources of information are handled bycombining kernels.

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 5 / 25

Page 13: Multiple kernel Self-Organizing Maps

Introduction

Purpose

Node clustering: find communities, i.e., groups of “close” nodes in thegraph; close meaning:

• densely connected and sharing (comparatively) a few links with theother groups (“communities”);

• but also having similar labels.

Here: self-organizing map approach to produce a map of the nodes(clutering+visualization). Multiple sources of information are handled bycombining kernels.

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 5 / 25

Page 14: Multiple kernel Self-Organizing Maps

MK-SOM

Outline

1 Introduction

2 MK-SOM

3 Applications

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 6 / 25

Page 15: Multiple kernel Self-Organizing Maps

MK-SOM

Basics on SOM

Project the data on a squared grid (each square of the grid is a cluster)

Project the data on a squared grid (each square of the grid is a cluster)such that:• the nodes in a same cluster are highly connected• the nodes in two close clusters are also (less) connected• the nodes in two distant clusters are (almost) not connected

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 7 / 25

Page 16: Multiple kernel Self-Organizing Maps

MK-SOM

Basics on SOM

Project the data on a squared grid (each square of the grid is a cluster)such that:• the nodes in a same cluster are highly connected• the nodes in two close clusters are also (less) connected• the nodes in two distant clusters are (almost) not connected

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 7 / 25

Page 17: Multiple kernel Self-Organizing Maps

MK-SOM

Basics on SOM• the map is made of neurons (visually symbolized by, e.g.,

rectangles), 1...M, each associated to a prototype pi (a prototype is a“representer” of the neuron in the original dataset);

• the map is equipped with a neighborhood relationship, i.e., a“distance” (actually a dissimilarity) between neurons, δ;

• goal: find the best mapping f(xi) ∈ {1, . . . ,M} of the observations xi

on the map minimizing the energy

E =n∑

i=1

M∑j=1

h(δ(f(xi), j))‖xi − pi‖2.

i.e., each data is assigned to a neuron so that xi is:• “close” to the prototype of f(xi);• “close” (to a lesser extent) to the prototypes of the neighboring neurons

of f(xi);• “distant” to the prototypes of the neurons that are distant of f(xi).

(topology preservation)

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 8 / 25

Page 18: Multiple kernel Self-Organizing Maps

MK-SOM

Basics on SOM• the map is made of neurons (visually symbolized by, e.g.,

rectangles), 1...M, each associated to a prototype pi (a prototype is a“representer” of the neuron in the original dataset);

• the map is equipped with a neighborhood relationship, i.e., a“distance” (actually a dissimilarity) between neurons, δ;

• goal: find the best mapping f(xi) ∈ {1, . . . ,M} of the observations xi

on the map minimizing the energy

E =n∑

i=1

M∑j=1

h(δ(f(xi), j))‖xi − pi‖2.

i.e., each data is assigned to a neuron so that xi is:• “close” to the prototype of f(xi);• “close” (to a lesser extent) to the prototypes of the neighboring neurons

of f(xi);• “distant” to the prototypes of the neurons that are distant of f(xi).

(topology preservation)

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 8 / 25

Page 19: Multiple kernel Self-Organizing Maps

MK-SOM

Basics on SOM• the map is made of neurons (visually symbolized by, e.g.,

rectangles), 1...M, each associated to a prototype pi (a prototype is a“representer” of the neuron in the original dataset);

• the map is equipped with a neighborhood relationship, i.e., a“distance” (actually a dissimilarity) between neurons, δ;

• goal: find the best mapping f(xi) ∈ {1, . . . ,M} of the observations xi

on the map minimizing the energy

E =n∑

i=1

M∑j=1

h(δ(f(xi), j))‖xi − pi‖2.

i.e., each data is assigned to a neuron so that xi is:• “close” to the prototype of f(xi);• “close” (to a lesser extent) to the prototypes of the neighboring neurons

of f(xi);• “distant” to the prototypes of the neurons that are distant of f(xi).

(topology preservation)

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 8 / 25

Page 20: Multiple kernel Self-Organizing Maps

MK-SOM

on-line SOMData: x1, . . . , xn ∈ R

d

1: Initialization: randomly set p01 ,...,p0

M in Rd

2: for t = 1→ T do3: Randomly choose i ∈ {1, . . . , n}4: Assignment

f t (xi)← arg minj=1,...,M

‖xi − pt−1j ‖Rd

5: for all j = 1→ M do Representation6: pt

j ← pt−1j + µtht (δ(f t (xi), j))

(1i − pt−1

j

)7: end for8: end for

where ht is a decreasing function which reduces the neighborhood when tincreases and µt is generally of order 1/t .

Problem with non numerical data: definitions of ‖.‖ and pj???

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 9 / 25

Page 21: Multiple kernel Self-Organizing Maps

MK-SOM

on-line SOMData: x1, . . . , xn ∈ R

d

1: Initialization: randomly set p01 ,...,p0

M in Rd

2: for t = 1→ T do3: Randomly choose i ∈ {1, . . . , n}4: Assignment

f t (xi)← arg minj=1,...,M

‖xi − pt−1j ‖Rd

5: for all j = 1→ M do Representation6: pt

j ← pt−1j + µtht (δ(f t (xi), j))

(1i − pt−1

j

)7: end for8: end for

where ht is a decreasing function which reduces the neighborhood when tincreases and µt is generally of order 1/t .Problem with non numerical data: definitions of ‖.‖ and pj???

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 9 / 25

Page 22: Multiple kernel Self-Organizing Maps

MK-SOM

Kernels/Multiple kernel

What is a kernel?

(xi) ∈ G, (K(xi , xj))ij st: K(xi , xj) = K(xj , xi) and∀ (αi)i ,

∑ij αiαjK(xi , xj) ≥ 0. In this case [Aronszajn, 1950],

∃ (H , 〈., .〉) , Φ : G → H st : K(xi , xj) = 〈Φ(xi),Φ(xj)〉

Examples:• nodes of a graph: Heat kernel K = e−βL or K = L+ where L is the

Laplacian [Kondor and Lafferty, 2002, Smola and Kondor, 2003,Fouss et al., 2007]

• numerical variables: Gaussian kernel K(xi , xj) = e−β‖xi−xj‖2;

• text: [Watkins, 2000]...

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 10 / 25

Page 23: Multiple kernel Self-Organizing Maps

MK-SOM

Kernels/Multiple kernel

What is a kernel?

(xi) ∈ G, (K(xi , xj))ij st: K(xi , xj) = K(xj , xi) and∀ (αi)i ,

∑ij αiαjK(xi , xj) ≥ 0. In this case [Aronszajn, 1950],

∃ (H , 〈., .〉) , Φ : G → H st : K(xi , xj) = 〈Φ(xi),Φ(xj)〉

Examples:• nodes of a graph: Heat kernel K = e−βL or K = L+ where L is the

Laplacian [Kondor and Lafferty, 2002, Smola and Kondor, 2003,Fouss et al., 2007]

• numerical variables: Gaussian kernel K(xi , xj) = e−β‖xi−xj‖2;

• text: [Watkins, 2000]...

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 10 / 25

Page 24: Multiple kernel Self-Organizing Maps

MK-SOM

Kernel SOM[Mac Donald and Fyfe, 2000, Andras, 2002, Villa and Rossi, 2007](xi)i ⊂ G described by a kernel (K(xi , xi′))ii′ . Prototypes are defined in thefeature space: pj =

∑i γjiφ(xi); energy calculated calculated in H :

1: Prototypes initialization: randomly set p0j =

∑ni=1 γ

0ij Φ(xi) st γ0

ij ≥ 0and

∑i γ

0ij = 1

2: for t = 1→ T do3: Randomly choose i ∈ {1, . . . , n}4: Assignment

f t (xi)← arg minj=1,...,M

‖xi − pt−1j ‖H

5: for all j = 1→ M do Representation6: γt

j ← γt−1j + µtht (δ(f t (xi), j))

(1i − γ

t−1j

)7: end for8: end for

Usually, µt ∼ 1/t .MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 11 / 25

Page 25: Multiple kernel Self-Organizing Maps

MK-SOM

Kernel SOM[Mac Donald and Fyfe, 2000, Andras, 2002, Villa and Rossi, 2007](xi)i ⊂ G described by a kernel (K(xi , xi′))ii′ . Prototypes are defined in thefeature space: pj =

∑i γjiφ(xi); energy calculated calculated in H :

1: Prototypes initialization: randomly set p0j =

∑ni=1 γ

0ij Φ(xi) st γ0

ij ≥ 0and

∑i γ

0ij = 1

2: for t = 1→ T do3: Randomly choose i ∈ {1, . . . , n}4: Assignment

f t (xi)← arg minj=1,...,M

∑ll′γt−1

jl γt−1jl′ K t (xl , xl′) − 2

∑l

γt−1jl K t (xl , xi)

5: for all j = 1→ M do Representation6: γt

j ← γt−1j + µtht (δ(f t (xi), j))

(1i − γ

t−1j

)7: end for8: end for

Usually, µt ∼ 1/t .MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 11 / 25

Page 26: Multiple kernel Self-Organizing Maps

MK-SOM

Combining kernels

Suppose: each (xdi )i is described by a kernel Kd , kernels can be

combined (e.g., [Rakotomamonjy et al., 2008] for SVM):

K =D∑

d=1

αdKd ,

with αd ≥ 0 and∑

d αd = 1.

Remark: Also useful to integrate different types of information comingfrom different kernels on the same dataset.

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 12 / 25

Page 27: Multiple kernel Self-Organizing Maps

MK-SOM

Combining kernels

Suppose: each (xdi )i is described by a kernel Kd , kernels can be

combined (e.g., [Rakotomamonjy et al., 2008] for SVM):

K =D∑

d=1

αdKd ,

with αd ≥ 0 and∑

d αd = 1.Remark: Also useful to integrate different types of information comingfrom different kernels on the same dataset.

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 12 / 25

Page 28: Multiple kernel Self-Organizing Maps

MK-SOM

multiple kernel SOM

1: Prototypes initialization: randomly set p0j =

∑ni=1 γ

0ji Φ(xi) st γ0

ji ≥ 0 and∑i γ

0ji = 1

2: Kernel initialization: set (α0d) st α0

d ≥ 0 and∑

d αd = 1 (e.g., α0d = 1/D);

K0 ←∑

d α0dKd

3: for t = 1→ T do4: Randomly choose i ∈ {1, . . . , n}5: Assignment

f t (xi)← arg minj=1,...,M

‖xi − pt−1j ‖

2H(K t )

6: for all j = 1→ M do Representation7: γt

j ← γt−1j + µtht (δ(f t (xi), j))

(1i − γ

t−1j

)8: end for9: empty state ;-)

10: end for

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 13 / 25

Page 29: Multiple kernel Self-Organizing Maps

MK-SOM

multiple kernel SOM

1: Prototypes initialization: randomly set p0j =

∑ni=1 γ

0ji Φ(xi) st γ0

ji ≥ 0 and∑i γ

0ji = 1

2: Kernel initialization: set (α0d) st α0

d ≥ 0 and∑

d αd = 1 (e.g., α0d = 1/D);

K0 ←∑

d α0dKd

3: for t = 1→ T do4: Randomly choose i ∈ {1, . . . , n}5: Assignment

f t (xi)← arg minj=1,...,M

∑ll′γt−1

jl γt−1jl′ K t (xl , xl′) − 2

∑l

γt−1jl K t (xl , xi)

6: for all j = 1→ M do Representation7: γt

j ← γt−1j + µtht (δ(f t (xi), j))

(1i − γ

t−1j

)8: end for9: empty state ;-)

10: end for

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 13 / 25

Page 30: Multiple kernel Self-Organizing Maps

MK-SOM

Tuning (αd)d on-linePurpose: minimize over (γji)ji and (αd)d the energy

E((γji)ji , (αd)d) =n∑

i=1

M∑j=1

h (f(xi), j)∥∥∥∥φα(xi) − pαj (γj)

∥∥∥∥2

α,

KSOM picks up an observation xi whose contribution to the energy is:

E|xi =M∑

j=1

h (f(xi), j)∥∥∥∥φα(xi) − pαj (γj)

∥∥∥∥2

α(1)

Idea: Add a gradient descent step based on the derivative of (1):

∂E|xi

∂αd=

M∑j=1

h(f(xi), j)

Kd(xdi , x

di ) − 2

n∑l=1

γjlKd(xdi , x

dl )+

n∑l,l′=1

γjlγjl′Kd(xdl , x

dl′ )

=M∑

j=1

h(f(xi), j)‖φ(xi) − pdj (γj)‖

2Kd

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 14 / 25

Page 31: Multiple kernel Self-Organizing Maps

MK-SOM

Tuning (αd)d on-linePurpose: minimize over (γji)ji and (αd)d the energy

E((γji)ji , (αd)d) =n∑

i=1

M∑j=1

h (f(xi), j)∥∥∥∥φα(xi) − pαj (γj)

∥∥∥∥2

α,

KSOM picks up an observation xi whose contribution to the energy is:

E|xi =M∑

j=1

h (f(xi), j)∥∥∥∥φα(xi) − pαj (γj)

∥∥∥∥2

α(1)

Idea: Add a gradient descent step based on the derivative of (1):

∂E|xi

∂αd=

M∑j=1

h(f(xi), j)

Kd(xdi , x

di ) − 2

n∑l=1

γjlKd(xdi , x

dl )+

n∑l,l′=1

γjlγjl′Kd(xdl , x

dl′ )

=M∑

j=1

h(f(xi), j)‖φ(xi) − pdj (γj)‖

2Kd

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 14 / 25

Page 32: Multiple kernel Self-Organizing Maps

MK-SOM

Tuning (αd)d on-linePurpose: minimize over (γji)ji and (αd)d the energy

E((γji)ji , (αd)d) =n∑

i=1

M∑j=1

h (f(xi), j)∥∥∥∥φα(xi) − pαj (γj)

∥∥∥∥2

α,

KSOM picks up an observation xi whose contribution to the energy is:

E|xi =M∑

j=1

h (f(xi), j)∥∥∥∥φα(xi) − pαj (γj)

∥∥∥∥2

α(1)

Idea: Add a gradient descent step based on the derivative of (1):

∂E|xi

∂αd=

M∑j=1

h(f(xi), j)

Kd(xdi , x

di ) − 2

n∑l=1

γjlKd(xdi , x

dl )+

n∑l,l′=1

γjlγjl′Kd(xdl , x

dl′ )

=M∑

j=1

h(f(xi), j)‖φ(xi) − pdj (γj)‖

2Kd

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 14 / 25

Page 33: Multiple kernel Self-Organizing Maps

MK-SOM

adaptive multiple kernel SOM

1: Prototypes initialization: randomly set p0j =

∑ni=1 γ

0ji Φ(xi) st γ0

ji ≥ 0 and∑i γ

0ji = 1

2: Kernel initialization: set (α0d) st α0

d ≥ 0 and∑

d αd = 1 (e.g., α0d = 1/D);

K0 ←∑

d α0dKd

3: for t = 1→ T do4: Randomly choose i ∈ {1, . . . , n}5: Assignment

f t (xi)← arg minj=1,...,M

‖xi − pt−1j ‖

2H(K t )

6: for all j = 1→ M do Representation7: γt

j ← γt−1j + µtht (δ(f t (xi), j))

(1i − γ

t−1j

)8: end for9:

Kernel update αtd ← αt−1

d + νt∂E|xi∂αd

and K t ←∑

d αtdKd

10: end for

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 15 / 25

Page 34: Multiple kernel Self-Organizing Maps

MK-SOM

adaptive multiple kernel SOM

1: Prototypes initialization: randomly set p0j =

∑ni=1 γ

0ji Φ(xi) st γ0

ji ≥ 0 and∑i γ

0ji = 1

2: Kernel initialization: set (α0d) st α0

d ≥ 0 and∑

d αd = 1 (e.g., α0d = 1/D);

K0 ←∑

d α0dKd

3: for t = 1→ T do4: Randomly choose i ∈ {1, . . . , n}5: Assignment

f t (xi)← arg minj=1,...,M

‖xi − pt−1j ‖

2H(K t )

6: for all j = 1→ M do Representation7: γt

j ← γt−1j + µtht (δ(f t (xi), j))

(1i − γ

t−1j

)8: end for9: Kernel update αt

d ← αt−1d + νt

∂E|xi∂αd

and K t ←∑

d αtdKd

10: end for

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 15 / 25

Page 35: Multiple kernel Self-Organizing Maps

Applications

Outline

1 Introduction

2 MK-SOM

3 Applications

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 16 / 25

Page 36: Multiple kernel Self-Organizing Maps

Applications

Example1: simulated data

Graph with 200 nodes classified in 8 groups:

• graph: Erdös Reyni models: groups 1 to 4 and groups 5 to 8 withintra-group edge probability 0.3 and inter-group edge probability 0.01;

• numerical data: nodes labelled with 2-dimensional Gaussian

vectors: odd groups N((

00

),

(0.3 00 0.3

));

• factor with two levels: groups 1, 2, 5 and 7: first level; other groups:second level.

Only the knownledgeon the three datasetscan discriminate all 8groups.

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 17 / 25

Page 37: Multiple kernel Self-Organizing Maps

Applications

Experiment

Kernels: graph: L+, numerical data: Gaussian kernel; factor: anotherGaussian kernel on the disjunctive recoding

Comparison: on 100 randomly generated datasets as previouslydescribed:

• multiple kernel SOM with all three data;

• kernel SOM with a single dataset;

• kernel SOM with two datasets or all three datasets in a single(Gaussian) kernel.

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 18 / 25

Page 38: Multiple kernel Self-Organizing Maps

Applications

Experiment

Kernels: graph: L+, numerical data: Gaussian kernel; factor: anotherGaussian kernel on the disjunctive recodingComparison: on 100 randomly generated datasets as previouslydescribed:

• multiple kernel SOM with all three data;

• kernel SOM with a single dataset;

• kernel SOM with two datasets or all three datasets in a single(Gaussian) kernel.

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 18 / 25

Page 39: Multiple kernel Self-Organizing Maps

Applications

An exampleMK-SOM All in one kernel

Graph only Numerical variables only

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 19 / 25

Page 40: Multiple kernel Self-Organizing Maps

Applications

Numerical comparison(over 100 simulations) with mutual information∑

ij

|Ci ∩ C̃j |

200log|Ci ∩ C̃j |

|Ci | × |C̃j |

(adjusted version, equal to 1 if partitions are identical[Danon et al., 2005]).

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 20 / 25

Page 41: Multiple kernel Self-Organizing Maps

Applications

Numerical comparison(over 100 simulations) with mutual information∑

ij

|Ci ∩ C̃j |

200log|Ci ∩ C̃j |

|Ci | × |C̃j |

(adjusted version, equal to 1 if partitions are identical[Danon et al., 2005]).

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 20 / 25

Page 42: Multiple kernel Self-Organizing Maps

Applications

Example 2: Data coming from a medieval corpusExample from [Boulet et al., 2008]http://graphcomp.univ-tlse2.fr/ In the “Archive départementalesdu Lot” (Cahors, France), big corpus of 5000 transactions (mostly landcharters)

• coming from 4 “seigneuries” (about 25 little villages) in South West ofFrance;

• being established between 1240 and 1520 (just before and after thehundred years’ war).

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 21 / 25

Page 43: Multiple kernel Self-Organizing Maps

Applications

Graph descriptionGraph:

• nodes: 1446 individuals directly involved in the transactions (1 446individuals);

• edges: the two individuals are involved in the same transaction (3 192edges).

Labels on nodes:

• numerical: average date of activity;

• text: family name of the individual.

Kernels:

• L+ for the graph;

• linear kernel for the dates;

• spectral string kernel for the family names[Karatzoglou and Feinerer, 2010] (uses common strings having alength larger than 4).

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 22 / 25

Page 44: Multiple kernel Self-Organizing Maps

Applications

Graph descriptionGraph:

• nodes: 1446 individuals directly involved in the transactions (1 446individuals);

• edges: the two individuals are involved in the same transaction (3 192edges).

Labels on nodes:

• numerical: average date of activity;

• text: family name of the individual.

Kernels:

• L+ for the graph;

• linear kernel for the dates;

• spectral string kernel for the family names[Karatzoglou and Feinerer, 2010] (uses common strings having alength larger than 4).

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 22 / 25

Page 45: Multiple kernel Self-Organizing Maps

Applications

Graph descriptionGraph:

• nodes: 1446 individuals directly involved in the transactions (1 446individuals);

• edges: the two individuals are involved in the same transaction (3 192edges).

Labels on nodes:

• numerical: average date of activity;

• text: family name of the individual.

Kernels:

• L+ for the graph;

• linear kernel for the dates;

• spectral string kernel for the family names[Karatzoglou and Feinerer, 2010] (uses common strings having alength larger than 4).

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 22 / 25

Page 46: Multiple kernel Self-Organizing Maps

Applications

Date and graph maps

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 23 / 25

Page 47: Multiple kernel Self-Organizing Maps

Applications

Name map

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 24 / 25

Page 48: Multiple kernel Self-Organizing Maps

Applications

Conclusion

Summary• finding communities in graph, while taking into account labels;

• uses multiple kernel and automatically tunes the combination;

• the method gives relevant communities according to all sources ofinformation and a well-organized map.

Possible developments• currently studying a similar approach for dissimilarity/relational SOM;

• Main issue: computational time; currently studying a sparse version.

Thank you for your attention...

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 25 / 25

Page 49: Multiple kernel Self-Organizing Maps

Applications

Conclusion

Summary• finding communities in graph, while taking into account labels;

• uses multiple kernel and automatically tunes the combination;

• the method gives relevant communities according to all sources ofinformation and a well-organized map.

Possible developments• currently studying a similar approach for dissimilarity/relational SOM;

• Main issue: computational time; currently studying a sparse version.

Thank you for your attention...

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 25 / 25

Page 50: Multiple kernel Self-Organizing Maps

Applications

Conclusion

Summary• finding communities in graph, while taking into account labels;

• uses multiple kernel and automatically tunes the combination;

• the method gives relevant communities according to all sources ofinformation and a well-organized map.

Possible developments• currently studying a similar approach for dissimilarity/relational SOM;

• Main issue: computational time; currently studying a sparse version.

Thank you for your attention...

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 25 / 25

Page 51: Multiple kernel Self-Organizing Maps

Applications

ReferencesAndras, P. (2002).Kernel-Kohonen networks.International Journal of Neural Systems, 12:117–135.

Aronszajn, N. (1950).Theory of reproducing kernels.Transactions of the American Mathematical Society, 68(3):337–404.

Boulet, R., Jouve, B., Rossi, F., and Villa, N. (2008).Batch kernel SOM and related laplacian methods for social network analysis.Neurocomputing, 71(7-9):1257–1273.

Danon, L., Diaz-Guilera, A., Duch, J., and Arenas, A. (2005).Comparing community structure identification.Journal of Statistical Mechanics, page P09008.

Fouss, F., Pirotte, A., Renders, J., and Saerens, M. (2007).Random-walk computation of similarities between nodes of a graph, with application to collaborative recommendation.IEEE Trans Knowl Data En, 19(3):355–369.

Karatzoglou, A. and Feinerer, I. (2010).Kernel-based machine learning for fast text mining in R.Comput Statist Data Anal, 54:290–297.

Kondor, R. and Lafferty, J. (2002).Diffusion kernels on graphs and other discrete structures.In Proceedings of the 19th International Conference on Machine Learning, pages 315–322.

Mac Donald, D. and Fyfe, C. (2000).The kernel self organising map.In Proceedings of 4th International Conference on knowledge-based intelligence engineering systems and applied technologies,pages 317–320.

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 25 / 25

Page 52: Multiple kernel Self-Organizing Maps

Applications

Rakotomamonjy, A., Bach, F., Canu, S., and Grandvalet, Y. (2008).SimpleMKL.J Mach Learn Res, 9:2491–2521.

Smola, A. and Kondor, R. (2003).Kernels and regularization on graphs.In Warmuth, M. and Schölkopf, B., editors, Proceedings of the Conference on Learning Theory (COLT) and Kernel Workshop,Lecture Notes in Computer Science, pages 144–158.

Villa, N. and Rossi, F. (2007).A comparison between dissimilarity SOM and kernel SOM for clustering the vertices of a graph.In 6th International Workshop on Self-Organizing Maps (WSOM), Bielefield, Germany. Neuroinformatics Group, BielefieldUniversity.

Villa-Vialaneix, N., Olteanu, M., and Cierco-Ayrolles, C. (2013).Carte auto-organisatrice pour graphes étiquetés.In Actes des Ateliers FGG (Fouille de Grands Graphes), colloque EGC (Extraction et Gestion de Connaissances), Toulouse,France.

Watkins, C. (2000).Dynamic alignment kernels.In Smola, A., Bartlett, P., Schölkopf, B., and Schuurmans, D., editors, Advances in Large Margin Classifiers, pages 39–50,Cambridge, MA, USA. MIT P.

MK-SOM (SAMM-Graph) Nathalie Villa-Vialaneix Toulouse, 25 Janvier 2013 25 / 25