
[IEEE TENCON 2009 - 2009 IEEE Region 10 Conference - Singapore (2009.01.23-2009.01.26)] TENCON 2009 - 2009 IEEE Region 10 Conference - A novel clustering method for fuzzy model identification


978-1-4244-4547-9/09/$26.00 ©2009 IEEE TENCON 2009

A Novel Clustering Method for Fuzzy Model Identification

Meena Tushir
Deptt. of Electrical & Electronics Engg., MSIT, New Delhi, India

Smriti Srivastava
Deptt. of Instrumentation & Control Engg., NSIT, New Delhi, India

Abstract: Takagi-Sugeno models are an important class of fuzzy rule-based models, generally used for prediction and control. Fuzzy clustering is one of the effective methods for their identification. In this paper, we propose to use a fuzzy clustering method (the kernel-based fuzzy c-means method) to identify the structure of a fuzzy model and thereby construct a multi-input fuzzy model automatically. To clarify the advantages of the proposed method, we also present modeling examples, among them a model of a human operator's control action and a qualitative model that explains the trends in the time series data of a stock price.

Keywords: TS models, system identification, kernel function

I. Introduction

Fuzzy system identification has attracted a lot of interest in the past. With this technique, it is usually assumed that there is no prior knowledge about the system or that the expert's knowledge is not sufficiently trustworthy. In this case, instead of using a fixed prior interpretation of the system, one uses raw input-output data to augment one's prior knowledge or even to generate new knowledge about the system. This approach was initially proposed by Takagi, Sugeno and Kang [1] under the name of TSK fuzzy modeling. Inspired by classical system theory, TSK modeling is also referred to as system identification [2]. The problem of fuzzy system identification involves eliciting IF-THEN rules from raw input-output data. It usually proceeds in two steps: 1) clustering and 2) specification of the input-output relations (IF-THEN rules).

Clustering of numerical data forms the basis of many classification and system modeling algorithms. The purpose of clustering is to distill natural groupings of data from a large data set, producing a concise representation of a system's behavior. In particular, the fuzzy c-means (FCM) clustering algorithm [3,4] has been widely studied and applied. In this paper, we propose to use a new kernel-based hybrid c-means clustering model [5], which adopts a kernel-induced metric in the data space to replace the original Euclidean norm metric. By replacing the inner product with an appropriate kernel function, one can implicitly perform a nonlinear mapping to a high-dimensional feature space in which the data are more clearly separable; the proposed method is therefore characterized by higher clustering accuracy. Although clustering is generally associated with classification problems, here we use fuzzy clustering as an intuitive approach for generating objective rules in fuzzy modeling.

The proposed approach is composed of two steps: structure identification and parameter identification. In structure identification, a clustering method provides a systematic procedure to determine the number of fuzzy rules and to construct an initial fuzzy model from the given input-output data. In parameter identification, the gradient descent method is used to tune the parameters of the constructed fuzzy model to obtain a more precise fuzzy model from the same data.

Section II of this paper reviews fuzzy system identification. Section III explains the kernel-based fuzzy clustering algorithm. Section IV shows how kernel-based clustering can be applied to fuzzy system identification and presents its performance. Concluding remarks are given in Section V.

II. Fuzzy System Identification

Fuzzy identification is an effective tool for the approximation of uncertain nonlinear systems on the basis of measured data. Among the different fuzzy modeling techniques, the Takagi-Sugeno (TS) model has attracted the most attention. This model consists of if-then rules with fuzzy antecedents and mathematical functions in the consequent part. Fuzzy clustering has been used quite extensively to obtain the antecedent membership functions, while the parameters of the consequent functions can be estimated by standard linear least-squares methods. The model is of the following form:

Rule $i$: IF $x_1$ is $A_{i1}$ and ... and $x_n$ is $A_{in}$ THEN

$$y_i = c_{i0} + c_{i1} x_1 + \cdots + c_{in} x_n$$

where $i = 1, 2, \ldots, l$, $l$ is the number of IF-THEN rules, $c_{ik}$ ($k = 0, 1, \ldots, n$) are the consequent parameters, $y_i$ is the output of the $i$th IF-THEN rule, and $A_{ij}$ is a fuzzy set.

Given an input $(x_1, x_2, \ldots, x_n)$, the final output of the fuzzy model is inferred as follows:

$$y = \sum_{i=1}^{l} w_i\, y_i \tag{1}$$

where $y_i$ is calculated from the consequent equation of the $i$th implication, and the weight $w_i$ is the overall truth value of the premise of the $i$th implication for the input, calculated as

$$w_i = \prod_{k=1}^{n} A_{ik}(x_k) \tag{2}$$

where Gaussian membership functions are used to represent the fuzzy sets

$$A_{ik}(x_k) = \exp\left(-\frac{(x_k - m_{ik})^2}{\beta_{ik}^2}\right) \tag{3}$$

with $m_{ik}$ being the center and $\beta_{ik}$ the width of the Gaussian curve. From (1) and (2)

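$$y = \sum_{i=1}^{l} \sum_{k=0}^{n} w_i\, c_{ik}\, x_k \tag{4}$$

where $x_0 = 1$, so that the $k = 0$ term contributes the constant $c_{i0}$ of each rule.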
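The inference in (1) to (4) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names, the list-based data layout, and the numbers in the usage lines are assumptions made for the example, and the membership function uses $\beta_{ik}^2$ in the denominator as in (3).

```python
import math

def gaussian_membership(x, center, width):
    # Eq. (3): membership grade of a scalar input in a Gaussian fuzzy set
    return math.exp(-((x - center) ** 2) / (width ** 2))

def ts_output(x, centers, widths, consequents):
    """Output of a TS model for one input vector.

    x              : list of n inputs
    centers[i][k]  : center m_ik of rule i for input k
    widths[i][k]   : width beta_ik of rule i for input k
    consequents[i] : [c_i0, c_i1, ..., c_in]
    (this layout is an illustrative assumption)
    """
    y = 0.0
    for m_i, beta_i, c_i in zip(centers, widths, consequents):
        # Eq. (2): firing strength is the product of the memberships
        w_i = 1.0
        for x_k, m_ik, beta_ik in zip(x, m_i, beta_i):
            w_i *= gaussian_membership(x_k, m_ik, beta_ik)
        # Rule consequent: y_i = c_i0 + c_i1*x_1 + ... + c_in*x_n
        y_i = c_i[0] + sum(c * x_k for c, x_k in zip(c_i[1:], x))
        # Eq. (1): weighted sum over the rules
        y += w_i * y_i
    return y

# Two rules, one input (made-up numbers for illustration only).
y = ts_output([2.0],
              centers=[[2.0], [3.0]],
              widths=[[1.0], [1.0]],
              consequents=[[1.0, 3.0], [0.0, 1.0]])
print(y)
```

The first rule is centered on the input, so it fires with strength 1, while the second fires with strength $e^{-1}$. Note that (1) is an unnormalized weighted sum, so the sketch does not divide by the sum of the firing strengths.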

III. Kernel based fuzzy c-means clustering

A kernel function generalizes the distance metric: it measures the distance between two data points after they have been mapped into a high-dimensional space in which they are more clearly separable. By employing a mapping function $\phi(x)$, which defines a nonlinear transformation $x \rightarrow \phi(x)$, a data structure that is not linearly separable in the original data space can possibly be mapped into a linearly separable one in the higher-dimensional feature space. Our proposed model, called kernel-based hybrid c-means clustering (KPFCM) [5], adopts a kernel-induced metric in place of the Euclidean norm used in the original possibilistic fuzzy c-means clustering [9,10]. KPFCM minimizes the following objective function:

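$$J_{KPFCM}(U,V,T) = \sum_{k=1}^{N}\sum_{i=1}^{c} \left(a\,u_{ik}^{m} + b\,t_{ik}^{\eta}\right) \left\|\phi(x_k) - \phi(v_i)\right\|^2 + \sum_{i=1}^{c} \gamma_i \sum_{k=1}^{N} (1 - t_{ik})^{\eta} \tag{5}$$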

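where $\|\phi(x_k) - \phi(v_i)\|^2$ is the squared distance between $\phi(x_k)$ and $\phi(v_i)$. The distance in the feature space is calculated through the kernel in the input space as follows:

$$\begin{aligned}
\|\phi(x_k) - \phi(v_i)\|^2 &= \left(\phi(x_k) - \phi(v_i)\right)\cdot\left(\phi(x_k) - \phi(v_i)\right) \\
&= \phi(x_k)\cdot\phi(x_k) - 2\,\phi(x_k)\cdot\phi(v_i) + \phi(v_i)\cdot\phi(v_i) \\
&= K(x_k,x_k) - 2K(x_k,v_i) + K(v_i,v_i)
\end{aligned}$$

If we adopt the Gaussian function as the kernel function, i.e.

$$K(x,y) = \exp\left(-\frac{\|x - y\|^2}{2\sigma^2}\right)$$

where $\sigma$, the kernel width, is a positive number, then $K(x,x) = 1$ and the squared distance reduces to $2\left(1 - K(x_k,v_i)\right)$. Thus (5) can be written as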

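$$J_{KPFCM}(U,V,T) = 2\sum_{k=1}^{N}\sum_{i=1}^{c} \left(a\,u_{ik}^{m} + b\,t_{ik}^{\eta}\right)\left(1 - K(x_k,v_i)\right) + \sum_{i=1}^{c} \gamma_i \sum_{k=1}^{N} (1 - t_{ik})^{\eta} \tag{6}$$

Given a set of points $X$, we minimize $J_{KPFCM}(U,V,T)$ in order to determine $U$, $V$ and $T$. We adopt an alternating optimization approach to minimize $J_{KPFCM}(U,V,T)$ and need the following theorem.

Theorem 1: The necessary conditions for minimizing $J_{KPFCM}$ under the constraint on $U$ ($\sum_{i=1}^{c} u_{ik} = 1$ for every $k$) are

$$u_{ik} = \frac{\left(\dfrac{1}{1 - K(x_k,v_i)}\right)^{1/(m-1)}}{\sum_{j=1}^{c}\left(\dfrac{1}{1 - K(x_k,v_j)}\right)^{1/(m-1)}} \tag{7}$$

$$t_{ik} = \frac{1}{1 + \left[\dfrac{2b\left(1 - K(x_k,v_i)\right)}{\gamma_i}\right]^{1/(\eta-1)}} \tag{8}$$

$$v_i = \frac{\sum_{k=1}^{N}\left(a\,u_{ik}^{m} + b\,t_{ik}^{\eta}\right) K(x_k,v_i)\, x_k}{\sum_{k=1}^{N}\left(a\,u_{ik}^{m} + b\,t_{ik}^{\eta}\right) K(x_k,v_i)} \tag{9}$$

The optimal value of the kernel parameter can be obtained through (6), i.e.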

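$$\frac{\partial J}{\partial \sigma} = -\sum_{k=1}^{N}\sum_{i=1}^{c} 2\left(a\,u_{ik}^{m} + b\,t_{ik}^{\eta}\right) K(x_k,v_i)\, \frac{\|x_k - v_i\|^2}{\sigma^3} \tag{10}$$

It is suggested to select $\gamma_i$ [6, 7] as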

$$\gamma_i = H\, \frac{\sum_{k=1}^{N} u_{ik}^{m}\; 2\left(1 - K(x_k,v_i)\right)}{\sum_{k=1}^{N} u_{ik}^{m}} \tag{11}$$

Typically, $H$ is chosen as 1. The general form of the kernel-based hybrid c-means clustering algorithm is given below:

KERNEL-BASED HYBRID c-MEANS CLUSTERING
____________________________________
* Fix the number of clusters $c$; fix the parameters $a$, $b$, $m$, $\eta$ (with $m, \eta > 1$); set the learning rate $\alpha$;
* Execute an FCM clustering algorithm to find the initial $U$ and $V$;
* Initialize the typicality values $t_{ik}^{0}$ randomly;
* Initialize the kernel parameter $\sigma = \sigma_0$;
* Set the iteration count $k = 1$;
Repeat
    Update $v_i^{k}$ using (9).
    Compute $\partial J / \partial \sigma$ using (10).
    Compute $\gamma_i$ using (11).
    Update $t_{ik}^{k}$ using (8).
    Update $u_{ik}^{k}$ using (7).
    Update the kernel parameter using $\sigma_{k+1} = \sigma_k + \alpha\, \dfrac{\partial J}{\partial \sigma}$.
Until a given stopping criterion is satisfied.
____________________________________
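The alternating updates of the listing can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' code: the kernel width $\sigma$ is held fixed (the update based on (10) is left out), the initial memberships come from (7) instead of an FCM run, the initial typicalities are set to 0.5 instead of random values so that the run is repeatable, and `EPS` is a numerical guard that does not appear in the paper. All function names and the toy data are hypothetical.

```python
import math

EPS = 1e-12  # numerical guard (our addition, not part of the paper)

def kernel(x, v, sigma):
    """Gaussian kernel K(x, v) = exp(-||x - v||^2 / (2 sigma^2))."""
    d2 = sum((p - q) ** 2 for p, q in zip(x, v))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def update_memberships(X, V, sigma, m):
    """Eq. (7): fuzzy memberships U[i][k]; each column sums to 1."""
    inv = [[(1.0 / max(1.0 - kernel(x, v, sigma), EPS)) ** (1.0 / (m - 1.0))
            for x in X] for v in V]
    totals = [sum(row[k] for row in inv) for k in range(len(X))]
    return [[row[k] / totals[k] for k in range(len(X))] for row in inv]

def update_gammas(X, V, U, sigma, m, H=1.0):
    """Eq. (11): one scale parameter gamma_i per cluster."""
    gammas = []
    for v, u_i in zip(V, U):
        num = sum((u ** m) * 2.0 * (1.0 - kernel(x, v, sigma))
                  for x, u in zip(X, u_i))
        den = sum(u ** m for u in u_i)
        gammas.append(max(H * num / den, EPS))
    return gammas

def update_typicalities(X, V, gammas, sigma, b, eta):
    """Eq. (8): typicality values T[i][k] in (0, 1]."""
    return [[1.0 / (1.0 + (2.0 * b * (1.0 - kernel(x, v, sigma)) / g)
                    ** (1.0 / (eta - 1.0)))
             for x in X] for v, g in zip(V, gammas)]

def update_centers(X, V, U, T, sigma, a, b, m, eta):
    """Eq. (9): kernel-weighted mean of the data for each cluster."""
    new_V = []
    for v, u_i, t_i in zip(V, U, T):
        w = [(a * u ** m + b * t ** eta) * kernel(x, v, sigma)
             for x, u, t in zip(X, u_i, t_i)]
        total = sum(w)
        new_V.append([sum(w_k * x[d] for w_k, x in zip(w, X)) / total
                      for d in range(len(v))])
    return new_V

def kpfcm(X, V, sigma, a=1.0, b=1.0, m=2.0, eta=2.0, iterations=20):
    """Alternating updates in the order of the listing, with sigma fixed."""
    U = update_memberships(X, V, sigma, m)  # stands in for the FCM initialization
    T = [[0.5] * len(X) for _ in V]         # fixed instead of random, for repeatability
    for _ in range(iterations):
        V = update_centers(X, V, U, T, sigma, a, b, m, eta)
        gammas = update_gammas(X, V, U, sigma, m)
        T = update_typicalities(X, V, gammas, sigma, b, eta)
        U = update_memberships(X, V, sigma, m)
    return U, V, T

# Toy one-dimensional data with two well-separated groups (illustration only).
X = [[0.0], [0.2], [5.0], [5.2]]
U, V, T = kpfcm(X, V=[[0.5], [4.5]], sigma=1.0)
print([round(v[0], 2) for v in V])
```

Because the kernel weight $K(x_k, v_i)$ in (9) decays quickly with distance, each center is pulled almost entirely by the points close to it, which is why the two centers in this toy run settle inside their own groups.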

IV. Experimental Results

To demonstrate the effectiveness of the proposed method for system identification, we applied the fuzzy c-means clustering method and our proposed clustering method to chemical plant data and stock price data and compared their performance. We choose $m = 2$, which is a common choice for fuzzy clustering. The kernel used in the experiments is the Gaussian kernel

$$K(x,y) = \exp\left(-\frac{\|x - y\|^2}{2\sigma^2}\right)$$

where $\sigma$ is the kernel parameter that should be optimized. The initial value of the kernel parameter $\sigma$ is set to

$$\sigma_0 = \frac{1}{c}\left[\frac{\sum_{j=1}^{l} \|x_j - \bar{x}\|^2}{l}\right]$$

Case I. Human Operation at a Chemical Plant

We use a TSK fuzzy modeling approach to model an operator's control of a chemical plant. The plant produces a polymer by the polymerization of some monomers. There are five input candidates, which a human operator might refer to for control, and one output. 70 data points of these six variables from the actual plant operation are taken from [8]. Training is composed of two phases. In the first phase, we use KPFCM clustering to find the centers $m_1, m_2, \ldots, m_l$, and the widths of the membership functions are calculated as follows:

$$\beta_{ij} = \frac{\sum_{k=1}^{N} u_{ik}^{m}\,(x_{kj} - m_{ij})^2}{\sum_{k=1}^{N} u_{ik}^{m}}$$
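The width formula above is a membership-weighted mean squared deviation from the cluster center, computed separately for each input dimension. A minimal sketch, assuming the memberships are stored as `U[i][k]` and the data as a list of input vectors (both layouts and the function name are assumptions made for illustration):

```python
def membership_widths(X, U, centers, m):
    """beta[i][j] = sum_k u_ik^m (x_kj - m_ij)^2 / sum_k u_ik^m."""
    widths = []
    for u_i, m_i in zip(U, centers):
        den = sum(u ** m for u in u_i)
        widths.append([
            sum((u ** m) * (x[j] - m_i[j]) ** 2 for u, x in zip(u_i, X)) / den
            for j in range(len(m_i))
        ])
    return widths

# One cluster, two inputs, two samples with equal membership (made-up numbers).
X = [[0.0, 1.0], [2.0, 5.0]]
beta = membership_widths(X, U=[[0.5, 0.5]], centers=[[1.0, 3.0]], m=2)
print(beta)
```

Both samples lie one unit from the center in the first input and two units away in the second, so the widths are the squared deviations 1 and 4.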

In the second phase, the gradient descent method is used to minimize the error function $E = \frac{1}{2} e^2$, where $e = y^{*} - y$, and $y$ and $y^{*}$ denote the outputs of the fuzzy model and of the real system, respectively. First we find six clusters by our proposed clustering method, which implies six rules in this case. Fig. 1 shows the initial membership functions for the five input variables. Fig. 2 shows the identified output and the desired output vs. the number of samples; the identified output follows the desired output closely.
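As an illustration of the second phase, the sketch below performs one gradient-descent step on the consequent parameters only. It relies on (4): since $y = \sum_i \sum_k w_i c_{ik} x_k$ with $x_0 = 1$, we have $\partial E / \partial c_{ik} = -e\, w_i x_k$, so the update is $c_{ik} \leftarrow c_{ik} + \alpha\, e\, w_i x_k$. Restricting the step to the consequents, the function name, and the numbers are simplifying assumptions; the paper tunes the parameters of the fuzzy model without listing the individual update rules.

```python
def consequent_step(x, weights, consequents, target, lr):
    """One gradient-descent step on the consequent parameters c_ik.

    weights[i]     : firing strength w_i of rule i for this sample
    consequents[i] : [c_i0, c_i1, ..., c_in]
    Returns the updated parameters and the error E before the update.
    """
    x_ext = [1.0] + list(x)  # x_0 = 1 carries the constant term
    y = sum(w * sum(c * x_k for c, x_k in zip(c_i, x_ext))
            for w, c_i in zip(weights, consequents))
    e = target - y           # e = y* - y
    updated = [[c + lr * e * w * x_k for c, x_k in zip(c_i, x_ext)]
               for w, c_i in zip(weights, consequents)]
    return updated, 0.5 * e * e

# Single rule, single input, repeated steps on one sample (made-up numbers).
params = [[0.0, 0.0]]
for _ in range(3):
    params, err = consequent_step([2.0], [1.0], params, target=1.0, lr=0.1)
    print(err)
```

With a sufficiently small learning rate the printed error shrinks from step to step, since each step moves the model output toward the target for that sample.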

Figure (1). Initial membership functions using KPFCM clustering (five panels, one per input variable; membership grade from 0 to 1)

Figure (2). Output of the plant (plot for identification; x-axis: No. of samples; y-axis: identifier output; curves: desired output, identified output with KPFCM clustering, with FCM, without clustering)

Fig. 3 shows the final values of the membership functions for the five input variables after training. Fig. 4 shows the performance index vs. the number of iterations.

Figure (3). Final membership functions using KPFCM clustering (five panels, one per input variable)

Figure (4). Variation of square of error (x-axis: No. of iterations; y-axis: magnitude of performance index, ×10^5; curves: with KPFCM clustering, with FCM clustering, without clustering)

Case II. Trend of Stock Prices

In the second experiment, we take up the trend data of a stock price. Here we use the daily data of a stock market. There are 100 data points. The stock data consists of ten inputs and one output.

Applying fuzzy clustering, we obtain four rules. Fig. 5 compares the identified price with the actual price. Fig. 6 shows the variation of the SE with our proposed clustering, with FCM clustering and without clustering. The model initialized with KPFCM clustering reaches the lowest error of the three.

Figure (5). Plot of desired vs. actual price (plot for identification; x-axis: No. of samples; y-axis: identifier output; curves: desired output, identified output)

Figure (6). Variation of square of error (x-axis: No. of iterations; y-axis: magnitude of performance index, ×10^4; curves: without clustering, with FCM clustering, with KPFCM clustering)

Table 1: Square of error (SE)

Square of error (SE)
Cases      Without Clustering    With FCM Clustering    With KPFCM Clustering
Case I     6.2247×10^5           3.0016×10^5            2.7828×10^5
Case II    2500                  2214                   1301

Table 1 shows the values of the SE of the identified plant for the two cases discussed in the paper. In both cases the SE is lowest when the parameters of the membership functions, viz. $m_{ij}$ and $\beta_{ij}$, are initialized using KPFCM clustering.

V. Conclusion

In this paper, we have proposed the use of a new kernel-based hybrid fuzzy clustering method (KPFCM) for the structure identification of a fuzzy model. We have compared the results with the fuzzy model obtained using standard fuzzy c-means clustering and with the conventional technique without clustering. The model obtained using the new clustering method was shown to be more accurate.

References:

[1] M. Sugeno and G. T. Kang, "Structure identification of fuzzy model," Fuzzy Sets and Systems, vol. 28, pp. 15-33, 1988.

[2] T. Takagi and M. Sugeno, "Fuzzy identification of systems and its application to modeling and control," IEEE Trans. on Systems, Man & Cybernetics, vol. 15, pp. 116-132, 1985.

[3] J. Dunn, "A fuzzy relative of the ISODATA process and its use in detecting compact, well separated clusters," J. Cybernetics, vol. 3, no. 3, pp. 32-57, 1974.

[4] J. Bezdek, "Cluster validity with fuzzy sets," J. Cybernetics, vol. 3, pp. 58-71, 1974.

[5] M. Tushir and S. Srivastava, "A new kernel based hybrid c-means clustering model," Proceedings of IEEE Int. Conf. on Fuzzy Systems, pp. 1-5, 23-26 July 2007, London, U.K.

[6] R. Krishnapuram and J. Keller, "A possibilistic approach to clustering," IEEE Trans. on Fuzzy Systems, vol. 1, no. 2, pp. 98-110, Apr. 1993.

[7] N. R. Pal, K. Pal, J. Keller and J. C. Bezdek, "A possibilistic fuzzy c-means clustering algorithm," IEEE Trans. on Fuzzy Systems, vol. 13, no. 4, pp. 517-530, 2005.

[8] M. Sugeno and T. Yasukawa, "A fuzzy-logic-based approach to qualitative modeling," IEEE Trans. on Fuzzy Systems, vol. 1, no. 1, pp. 7-31, 1993.

[9] N. R. Pal, K. Pal and J. C. Bezdek, "A mixed c-means clustering model," Proceedings of the IEEE Int. Conf. on Fuzzy Systems, Spain, pp. 11-21, 1997.

[10] N. R. Pal, K. Pal, J. Keller and J. C. Bezdek, "A possibilistic fuzzy c-means clustering algorithm," IEEE Trans. on Fuzzy Systems, vol. 13, no. 4, pp. 517-530, 2005.
