Computational Intelligence: Lecture 20 – Clustering to form semantic concepts

Clustering for Semantic labels - Wk9


Page 1: Clustering for Semantic labels - Wk9

8/8/2019 Clustering for Semantic labels - Wk9

http://slidepdf.com/reader/full/clustering-for-semantic-labels-wk9 1/39

Computational Intelligence:

Lecture 20

Clustering to form semantic concepts


Overview
• Interpretability of fuzzy representation
• Histogram analysis
• LVQ (Learning Vector Quantization)
• FCM (Fuzzy C-Means)
• FKP (Fuzzy Kohonen Partitioning)
• PFKP (Pseudo Fuzzy Kohonen Partitioning)


Semantic Label Clustering
• Semantic properties of a linguistic variable are given by the quintuple ⟨L, T(L), U, G, M⟩,

where L is the name of the variable; T(L) is the linguistic term set of L; U is a universe of discourse; G is a syntactic rule which generates T(L); and M is a semantic rule that associates each term in T(L) with its meaning.

– Each linguistic term is characterized by a fuzzy set, which is described using a membership function.

[Figure: two example membership functions plotted over x ∈ [0, 10]: μT(x) and μG(x), the latter a Gaussian MF.]


• A linguistic variable x named L = “performance”
• Five linguistic terms, where T(L) = {“very small”, “small”, “medium”, “large” and “very large”}.
• The semantic assignment M is shown in the figure – normal and convex membership functions, ordered such that “very small” ≺ “small” ≺ “medium” ≺ “large” ≺ “very large”.
• Universe of discourse U = [0, 100] of the base variable x.

[Figure: membership functions μT(x) for “very small”, “small”, “medium”, “large” and “very large” plotted over x (performance) ∈ [0, 100].]


Criteria of Interpretability
• Coverage: the membership functions cover the whole universe of discourse.
• Normal: max over x of μ_X(x) = 1.
• Convex: x ≤ y ≤ z ⇒ μ_X(y) ≥ min(μ_X(x), μ_X(z)).
• Ordered: X1 ≺ X2 ≺ … ≺ Xj ≺ … ≺ Xn, where X1 ≺ X2 denotes X1 precedes X2.
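The normality and convexity criteria can be checked mechanically for a membership function sampled on a grid. A small Python sketch (illustrative, not from the lecture; for a sampled function, fuzzy convexity is equivalent to rising to a single peak and then falling):

```python
import numpy as np

def is_normal(mu):
    """Normal: the membership values reach 1 somewhere."""
    return bool(np.isclose(np.max(mu), 1.0))

def is_convex(mu):
    """Convex in the fuzzy sense: for grid points x <= y <= z,
    mu(y) >= min(mu(x), mu(z)).  Equivalent to the sampled MF
    being non-decreasing up to its peak and non-increasing after."""
    peak = int(np.argmax(mu))
    rises = np.all(np.diff(mu[: peak + 1]) >= 0)
    falls = np.all(np.diff(mu[peak:]) <= 0)
    return bool(rises and falls)

tri = np.array([0.0, 0.5, 1.0, 0.5, 0.0])   # triangular: normal and convex
dip = np.array([0.0, 1.0, 0.2, 1.0, 0.0])   # dips between two peaks: not convex
```

A term set would additionally be checked for coverage (the supports jointly span U) and for the ≺ ordering of the peaks.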


Clustering
• Clustering is a method that organizes patterns into clusters such that patterns within a cluster are more similar to each other than to patterns in other clusters.
• When the crisp partition in classical clustering analysis is replaced with a fuzzy partition or a fuzzy pseudo-partition, it is referred to as fuzzy clustering.
• Examples: LVQ (Kohonen), FCM (Bezdek), MLVQ (Ang and Quek), DIC (Tung and Quek).


[Figure: example patterns from classes P1, P2 and P3.]

10/7/2008


[Figure: class-conditional histograms (Number of patterns vs. feature value) for classes P1, P2 and P3.]


[Figure: class distributions over the Perimeter feature (Number of patterns vs. Perimeter) for classes P1, P2 and P3.]


• Data sets:
– a design set for designing a classifier
– a test set for evaluating the obtained classifier
• The patterns are stored in an m-by-(n+1) matrix, where m is the number of patterns and n is the number of features (the extra column holds the class label).


Features and class labels:

Area  Perimeter  Class
  3       6       P1
  5       7       P1
  4       4       P1
  7       6       P1
 15      10       P2
 14      12       P2
 17      13       P2
 14      19       P3
 13      20       P3
 15      22       P3
  …       …       …

Design set: odd-indexed entries. Test set: even-indexed entries.


Flowchart for Histogram Analysis

Feature extraction: from image to features
↓
Data reduction: none
↓
Probability estimate: histogram analysis


[Figure: histograms of 50 sample points drawn from a Gaussian distribution, using 3, 10 and 25 bins.]


Histogram Analysis
• Properties:
– Does not require explicit use of density functions
– Dilemma between the number of intervals and the number of points
– Rule of thumb: the number of intervals equals the square root of the number of points
– To convert to density functions, the total area must be unity
– Can be used with any number of features, but is subject to the curse of dimensionality
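The square-root bin rule and the unit-area normalisation above can be sketched as follows; the function name and defaults are illustrative, not from the lecture:

```python
import numpy as np

def histogram_density(samples, n_bins=None):
    """Histogram-based density estimate.

    Uses the slide's rule of thumb: the number of intervals
    equals the square root of the number of points.  Bin counts
    are scaled so the total area is unity, turning the histogram
    into a density function.
    """
    samples = np.asarray(samples, dtype=float)
    if n_bins is None:
        n_bins = max(1, int(round(np.sqrt(len(samples)))))
    counts, edges = np.histogram(samples, bins=n_bins)
    widths = np.diff(edges)
    density = counts / (counts.sum() * widths)  # total area = 1
    return density, edges

rng = np.random.default_rng(0)
x = rng.normal(size=50)                 # 50 samples from a Gaussian
density, edges = histogram_density(x)   # sqrt(50) ~ 7 bins
area = float(np.sum(density * np.diff(edges)))
```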


[Figure: kernel density estimates with σ = 0.1 and σ = 0.3.]


– Also known as the Parzen estimator
– Can be used in multi-feature estimation
– Normal optimal smoothing strategy:

σ_opt = ( 4 / (3n) )^(1/5) · σ

where σ denotes the standard deviation of the distribution and n is the number of samples.

A. W. Bowman and A. Azzalini, Applied Smoothing Techniques for Data Analysis: The Kernel Approach with S-Plus Illustrations. New York: Oxford University Press, 1997.
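A minimal sketch of a Parzen estimator with a Gaussian kernel, using the normal optimal smoothing strategy above for the bandwidth (function and variable names are illustrative):

```python
import numpy as np

def parzen_kde(samples, query, sigma_opt=None):
    """Parzen (kernel) density estimate with a Gaussian kernel.

    The bandwidth defaults to the normal optimal smoothing
    strategy on the slide: sigma_opt = (4 / (3n))**(1/5) * sigma,
    where sigma is the sample standard deviation.
    """
    samples = np.asarray(samples, dtype=float)
    query = np.asarray(query, dtype=float)
    n = len(samples)
    if sigma_opt is None:
        sigma_opt = (4.0 / (3.0 * n)) ** 0.2 * samples.std(ddof=1)
    # Average one Gaussian kernel centred on each sample point.
    diffs = (query[:, None] - samples[None, :]) / sigma_opt
    kernels = np.exp(-0.5 * diffs**2) / np.sqrt(2.0 * np.pi)
    return kernels.mean(axis=1) / sigma_opt

rng = np.random.default_rng(1)
data = rng.normal(loc=0.0, scale=1.0, size=200)
grid = np.linspace(-4.0, 4.0, 81)
dens = parzen_kde(data, grid)
area = float(np.sum(dens) * (grid[1] - grid[0]))  # close to 1
```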


Learning Vector Quantization
• LVQ networks are unsupervised neural networks that determine the weights for cluster centres in an iterative and sequential manner.
• Each output neuron has a weight vector v_j that is adjusted during learning.
• The winner, whose weight vector has the minimum distance from the input, updates its weights.
• This is repeated until the weights are forced to stabilize through the specification of a learning rate.

[Figure: single-layer network with input layer x1 … xn, output layer y1 … yc and weight vectors v1 … vc; the winning output neuron is highlighted.]


LVQ – Cont’d

• Winner selection: ‖x − v_i^(T)‖ = min over j = 1..c of ‖x − v_j^(T)‖
• Weight update:

v_j^(T) = v_j^(T−1) + α (x − v_j^(T−1))   if j = i
v_j^(T) = v_j^(T−1)                        if j ≠ i

where c is the number of clusters, x is the input vector, v_i is the i-th cluster centre and α is the learning constant.

Pseudo code:
(1) Define the number of clusters c and a small terminating condition ε.
(2) Initialise the weights.
(3) Determine the winning neuron i based on distance.
(4) Update the winner: v_i^(T) = v_i^(T−1) + α (x_k − v_i^(T−1)).
(5) Check the terminating condition; otherwise repeat with a new vector.
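The pseudo code above can be sketched as a sequential winner-take-all loop; this is an illustrative implementation with a fixed learning constant, not the lecture's exact code:

```python
import numpy as np

def lvq_cluster(data, c=3, alpha=0.1, eps=1e-4, max_epochs=100, seed=0):
    """Sequential winner-take-all vector quantization sketch.

    Follows the slide's pseudo code: for each input vector, find
    the nearest centre and move only that winner towards the input
    by the learning constant alpha.  Training stops when the
    centres move less than eps over a full epoch.
    """
    rng = np.random.default_rng(seed)
    centres = data[rng.choice(len(data), size=c, replace=False)].copy()
    for _ in range(max_epochs):
        old = centres.copy()
        for x in data:
            winner = np.argmin(np.linalg.norm(centres - x, axis=1))
            centres[winner] += alpha * (x - centres[winner])  # update winner only
        if np.abs(centres - old).sum() <= eps:
            break
    return centres

# Two well-separated blobs: the centres should land near (0,0) and (10,10).
rng = np.random.default_rng(42)
blob_a = rng.normal(0.0, 0.5, size=(50, 2))
blob_b = rng.normal(10.0, 0.5, size=(50, 2))
data = np.vstack([blob_a, blob_b])
centres = lvq_cluster(data, c=2)
```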


Fuzzy C-Means (FCM) – Bezdek
• A fuzzy pseudo-partition of a finite data set satisfies

Σ over i = 1..c of μ_i(x_k) = 1   for all k = 1..n

0 < Σ over k = 1..n of μ_i(x_k) < n   for all i = 1..c

• An objective function for fuzzy clustering (m defines the degree of fuzziness):

J_m = Σ over k = 1..n Σ over i = 1..c of [μ_i(x_k)]^m ‖x_k − v_i‖²


FCM – Cont’d
• Steps:
– Define the number of clusters (c), the degree of fuzziness (m) and the terminating condition (ε).
– Initialise t and the pseudo-partition P^(0).
– Compute the cluster centres v_1, v_2, … v_i, … v_c:

v_i^(T) = [ Σ over k = 1..n of (μ_i^(T)(x_k))^m x_k ] / [ Σ over k = 1..n of (μ_i^(T)(x_k))^m ]   for i = 1..c


FCM – Cont’d
– Update the new pseudo-partition:

μ_i^(T+1)(x_k) = [ Σ over j = 1..c of ( ‖x_k − v_i^(T)‖² / ‖x_k − v_j^(T)‖² )^(1/(m−1)) ]^(−1)   for i = 1..c, k = 1..n

– Compare the distance between the partitions, E = ‖P^(T+1) − P^(T)‖:

E = Σ over i = 1..c Σ over k = 1..n of | μ_i^(T+1)(x_k) − μ_i^(T)(x_k) |

Stop when E ≤ ε.
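Putting the FCM steps together (centre update, pseudo-partition update, stopping test E ≤ ε), a compact illustrative implementation:

```python
import numpy as np

def fcm(data, c=2, m=2.0, eps=1e-4, max_iter=100, seed=0):
    """Fuzzy C-Means sketch following the slide's update equations."""
    rng = np.random.default_rng(seed)
    n = len(data)
    # Random initial pseudo-partition: memberships sum to 1 per point.
    u = rng.random((c, n))
    u /= u.sum(axis=0)
    for _ in range(max_iter):
        um = u ** m
        # Cluster centres: membership-weighted means of the data.
        v = um @ data / um.sum(axis=1, keepdims=True)
        # Squared distances ||x_k - v_i||^2 (floored to avoid division by zero).
        d2 = ((data[None, :, :] - v[:, None, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)
        # Membership update: mu_i = 1 / sum_j (d_i^2 / d_j^2)^(1/(m-1)).
        ratio = (d2[:, None, :] / d2[None, :, :]) ** (1.0 / (m - 1.0))
        u_new = 1.0 / ratio.sum(axis=1)
        change = np.abs(u_new - u).sum()   # E = |P(T+1) - P(T)|
        u = u_new
        if change <= eps:
            break
    return v, u

# Two 1-D clusters around 0 and 10.
data = np.array([[0.0], [0.5], [1.0], [9.0], [9.5], [10.0]])
centres, u = fcm(data, c=2, m=2.0)
```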


• The iterative optimization of the objective function is a local optimization algorithm.
• Unable to perform on-line training.
• Performance depends on a good choice of the weighting exponent m.


[Figure: IRIS data set – FCM with m = 1.5, ε = 0.0001. Membership degree μ(x) against sepal length (cm), sepal width (cm), petal length (cm) and petal width (cm) for the classes setosa, versicolor and virginica. The derived membership functions are trapezoidal-like.]


FCM – Cont’d

[Figure: IRIS data set – FCM with m = …, ε = …. Membership degree μ(x) against the four features for setosa, versicolor and virginica. The derived membership functions are Gaussian-like.]


MLVQ Results

• The width of each Gaussian membership function is derived from ‖v_i − v_closest‖, the distance between the centroid v_i and its closest neighbouring centroid, scaled by the parameter σ.

[Figure: IRIS data set – MLVQ with λ = 0.02, σ = 1.5, ε = 0.0001. Membership degree μ(x) against sepal length, sepal width, petal length and petal width (cm) for setosa, versicolor and virginica.]


MLVQ Results – Cont’d

[Figure: IRIS data set – MLVQ with λ = 0.02, σ = 3.0, ε = 0.0001. Membership degree μ(x) against the four features for setosa, versicolor and virginica.]


• A cluster can be described by a fuzzy interval ⟨α, β, γ, δ⟩ with a centroid v, also known as a trapezoidal fuzzy number.

[Figure: trapezoidal membership function μ(x) with corner points α, β, γ, δ and centroid v marked on the x-axis.]


– Cont’d
• The subinterval where μ(x) = 1 is called the kernel of the fuzzy interval, and the subinterval [α, δ] is called the support:
• [β, γ] = kernel of the fuzzy interval,
• [α, δ] = support of the fuzzy interval.
• LVQ can be used to derive the centroid v, but it cannot derive the parameters (α, β, γ, δ) of the trapezoidal-shaped membership function:

μ(x) = 0                  if x < α or x > δ
μ(x) = (x − α)/(β − α)    if α ≤ x ≤ β
μ(x) = 1                  if β ≤ x ≤ γ
μ(x) = (δ − x)/(δ − γ)    if γ ≤ x ≤ δ
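The piecewise definition above translates directly into code; a small sketch with illustrative names:

```python
def trapezoid_mf(x, a, b, c, d):
    """Trapezoidal membership function from the slide.

    a, b, c, d correspond to alpha, beta, gamma, delta:
    [b, c] is the kernel (membership 1), [a, d] is the support.
    """
    if x < a or x > d:
        return 0.0
    if a <= x <= b:
        # A degenerate edge (a == b) jumps straight to the kernel.
        return 1.0 if b == a else (x - a) / (b - a)
    if b <= x <= c:
        return 1.0
    return 1.0 if d == c else (d - x) / (d - c)

# Kernel [2, 4], support [1, 5].
vals = [trapezoid_mf(x, 1.0, 2.0, 4.0, 5.0) for x in (0.5, 1.5, 3.0, 4.5, 6.0)]
```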


The Fuzzy Kohonen Partition algorithm – supervised

• Define:
– c as the number of classes,
– λ ≤ 1/Ω as the learning constant, where Ω = number of data vectors,
– η as the learning width and a small positive number ε as a stopping criterion; n = total number of data vectors.
• Initialise the weights evenly spaced over the range of each input dimension:

v_i^(0) = x_min + i (x_max − x_min) / (c + 1)   for i = 1..c

• Determine the i-th cluster that the data vector x_k belongs to and update the weights v_i of that cluster.


The Fuzzy Kohonen Partition algorithm – supervised (cont’d)
• Compute the error to the cluster and the difference in error between iterations:

e^(T+1) = Σ over k = 1..n of ‖x_k − v^(T+1)‖

de^(T+1) = e^(T+1) − e^(T)

• Repeat while ¬(de^(T+1) ≤ ε).
– End of determining centroids.


The Fuzzy Kohonen Partition algorithm – supervised (cont’d)
• Initialise:

α_i = β_i = γ_i = δ_i = φ_i = v_i   for i = 1..c

– where φ_i is the pseudo weight of v_i.

[Figure: three adjacent clusters i = 1, 2, 3 along the feature axis.]

• Determine the i-th cluster that the data vector x_k belongs to and update the pseudo weight φ_i of that cluster:

φ_i = φ_i + η (x_k − φ_i)


The Fuzzy Kohonen Partition algorithm – supervised (cont’d)
• Update the four points of the Trapezoidal Fuzzy Number (TrFN):

α_i = min(α_i, x_k)
β_i = min(β_i, φ_i)
γ_i = max(γ_i, φ_i)
δ_i = max(δ_i, x_k)
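A sketch of one FKP update for the winning cluster in one dimension, assuming, as reconstructed on the preceding slides, that λ moves the centroid, η moves the pseudo weight φ, and the TrFN corners follow the min/max rules above; all names are illustrative:

```python
def fkp_update(state, x, lam=0.02, eta=0.5):
    """One FKP update for the winning cluster (1-D sketch).

    `state` holds (v, phi, a, b, g, d) for one cluster: the
    centroid v, the pseudo weight phi, and the TrFN corner points
    alpha, beta, gamma, delta.  The support [a, d] stretches to
    the data, the kernel [b, g] stretches to the pseudo weight.
    """
    v, phi, a, b, g, d = state
    v = v + lam * (x - v)          # centroid update (learning constant)
    phi = phi + eta * (x - phi)    # pseudo weight update (learning width)
    a = min(a, x)                  # support: alpha_i = min(alpha_i, x_k)
    d = max(d, x)                  # support: delta_i = max(delta_i, x_k)
    b = min(b, phi)                # kernel:  beta_i  = min(beta_i, phi_i)
    g = max(g, phi)                # kernel:  gamma_i = max(gamma_i, phi_i)
    return (v, phi, a, b, g, d)

# Start all six values at the centroid, as in the initialisation slide.
state = (5.0, 5.0, 5.0, 5.0, 5.0, 5.0)
for x in (4.0, 6.5, 5.5, 3.5):
    state = fkp_update(state, x)
v, phi, a, b, g, d = state
```

With η = 0 the pseudo weight never moves and the kernel stays at the centroid; larger η widens the kernel, matching the η = 0 vs. η = 0.5 result slides.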


The Fuzzy Kohonen Partition algorithm – Results

[Figure: IRIS data set – FKP with λ = 0.02, η = 0, ε = 0.0005. Membership degree μ(x) against sepal length, sepal width, petal length and petal width (cm) for setosa, versicolor and virginica.]


The Fuzzy Kohonen Partition algorithm – Results

[Figure: IRIS data set – FKP with λ = 0.02, η = 0.5, ε = 0.0005. Membership degree μ(x) against the four features for setosa, versicolor and virginica.]


[Figure: IRIS data set – PFKP with α = 0.02, η = 0, ε = 0.0005. Membership degree μ(x) against sepal length, sepal width, petal length and petal width (cm) for setosa, versicolor and virginica.]


PFKP – Cont’d

[Figure: IRIS data set – PFKP with λ = 0.02, η = 0.01, ε = 0.0005. Membership degree μ(x) against the four features for setosa, versicolor and virginica.]


• Two clustering algorithms, the Fuzzy Kohonen Partition (FKP) and the Pseudo Fuzzy Kohonen Partition (PFKP), were proposed to directly derive appropriate membership functions from training data.
• Both algorithms directly derive trapezoidal membership functions that are convex and normal from training data, while the latter derives a pseudo-partition of the input space.