Cybernetics and Systems: An International Journal
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/ucbs20

A NEW METHOD TO GENERATE FUZZY RULES FROM TRAINING INSTANCES FOR HANDLING CLASSIFICATION PROBLEMS
SHYI-MING CHEN (a) and CHENG-HAO YU (b)
(a) Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan, R.O.C.
(b) Department of Electronic Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan, R.O.C.
Published online: 30 Nov 2010.

To cite this article: SHYI-MING CHEN & CHENG-HAO YU (2003) A NEW METHOD TO GENERATE FUZZY RULES FROM TRAINING INSTANCES FOR HANDLING CLASSIFICATION PROBLEMS, Cybernetics and Systems: An International Journal, 34:3, 217-232, DOI: 10.1080/01969720302837
To link to this article: http://dx.doi.org/10.1080/01969720302837
A major task in developing a fuzzy classification system is to generate a set of
fuzzy rules from training instances to deal with a specific classification problem.
In recent years, many methods have been developed to generate fuzzy
rules from training instances. We present a new method to generate fuzzy
rules from training instances to deal with the Iris data classification problem.
The proposed method can discard some useless input attributes to improve
the average classification accuracy rate. It can obtain a higher average classification
accuracy rate and it generates fewer fuzzy rules and fewer input
fuzzy sets in the generated fuzzy rules than the existing methods.
This work was supported in part by the National Science Council, Republic of China, under Grant NSC 90-2213-E-011-053.
Address correspondence to Professor Shyi-Ming Chen, Ph.D., Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan, R.O.C.
Cybernetics and Systems: An International Journal, 34: 217-232, 2003
Copyright © 2003 Taylor & Francis
0196-9722/03 $12.00 + .00
DOI: 10.1080/01969720390184399
INTRODUCTION
In 1965, Zadeh proposed the theory of fuzzy sets (Zadeh 1965). It has
been used to deal with uncertain and imprecise data. Based on the fuzzy
set theory, we can design a fuzzy classification system to generate fuzzy
rules from training data. There are two common approaches to obtaining a set
of fuzzy rules for a fuzzy classification system. One approach is to acquire
the knowledge from human experts and then translate it into fuzzy rules,
but this is time-consuming. The other approach is to use a machine learning
method to automatically generate fuzzy rules from the training data
(Hong and Lee 1996, 1999; Castro et al. 1999; Chen and Chen 2000;
Hong and Chen 1999, 2000; Kao and Chen 2000; Wang et al. 1999; Wu
and Chen 1999). Castro et al. (1999) presented a method to generate
fuzzy rules in expert systems. Chang and Chen (2000) presented a method
to generate fuzzy rules from numerical data based on the exclusion of
attribute terms. Chen and Yeh (1998) presented a method for generating
fuzzy rules from relational database systems for estimating null values.
Chen et al. (1999) presented a method for generating fuzzy rules from
numerical data for handling fuzzy classification problems. Chen and Lin
(2000) presented a method for constructing fuzzy decision trees and
generating fuzzy classification rules from training examples. Chen and
Chen (2000) presented a method to generate fuzzy rules for fuzzy classification
systems. Hong and Chen (1999) presented a method for generating
fuzzy rules based on finding relevant attributes and membership
functions. Hong and Chen (2000) also presented a method for inducing
fuzzy rules based on processing individual fuzzy attributes. Hong and Lee
(1996) presented a method for induction of fuzzy rules and membership
functions from training examples. Hong and Lee (1999) investigated the
effect of the merging order on performance of fuzzy induction. Kao and
Chen (2000) presented a method to generate fuzzy rules from training
data containing noise for handling classification problems. Lin and Chen
(2000) presented a method to generate weighted fuzzy rules from training
data for handling fuzzy classification problems. Wang et al. (1999) presented
a fuzzy inductive strategy for modular rules. Wu and Chen (1999)
presented a method for constructing membership functions and fuzzy
rules from training examples.
In this article, we present a new method to generate fuzzy rules from
training instances to deal with the Iris data (Fisher 1936) classification
problem. The proposed method can discard some useless input attributes
to improve the average classification accuracy rate. It can obtain a higher
average classification accuracy rate and it generates fewer fuzzy rules and
fewer input fuzzy sets in the generated fuzzy rules than the existing
methods.
BASIC CONCEPTS OF FUZZY SETS
In 1965, Zadeh proposed the theory of fuzzy sets (Zadeh 1965). Let A be
a triangular fuzzy set in the universe of discourse U, where
\[
A = \int_a^b \left(\frac{u - a}{b - a}\right) \Big/ u \; + \; \int_b^c \left(\frac{c - u}{c - b}\right) \Big/ u, \quad \forall u \in U. \tag{1}
\]
The membership function of the triangular fuzzy set A is shown in
Figure 1, where the membership function of the triangular fuzzy set A
also can be represented by a triplet (a, b, c), where b is called the center of
the triangular fuzzy set A; a and c are called the left vertex and the right
vertex of the triangular fuzzy set A, respectively.
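As a minimal sketch of Eq. (1), the membership degree of a value u in the triangular fuzzy set (a, b, c) can be computed as follows. The function name `triangular` and the numbers in the usage lines are illustrative assumptions, not taken from the article, and the sketch assumes a < b < c.

```python
def triangular(u, a, b, c):
    """Membership degree of u in the triangular fuzzy set (a, b, c).

    a is the left vertex, b the center, c the right vertex (assumes a < b < c).
    """
    if u <= a or u >= c:
        return 0.0                      # outside the support of the fuzzy set
    if u <= b:
        return (u - a) / (b - a)        # rising edge, first integral of Eq. (1)
    return (c - u) / (c - b)            # falling edge, second integral of Eq. (1)


# Illustrative values only: the fuzzy set (2, 4, 8).
print(triangular(3.0, 2.0, 4.0, 8.0))   # 0.5
print(triangular(4.0, 2.0, 4.0, 8.0))   # 1.0
print(triangular(6.0, 2.0, 4.0, 8.0))   # 0.5
```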
In the following, we briefly review the union and the intersection
operations of fuzzy sets and review a similarity measure of fuzzy sets
from Klir and Yuan (1995) and Zadeh (1965).
Definition 1: Let A and B be fuzzy sets of the universe of discourse U
characterized by the membership functions $\mu_A$ and $\mu_B$, respectively, where
$\mu_A : U \to [0, 1]$ and $\mu_B : U \to [0, 1]$. The union of the fuzzy sets A and B,
denoted as $A \cup B$, is defined as follows:
\[
\mu_{A \cup B}(u_i) = \max(\mu_A(u_i), \mu_B(u_i)), \quad \forall u_i \in U. \tag{2}
\]
Figure 1. A triangular membership function.
The intersection of the fuzzy sets A and B, denoted as $A \cap B$, is defined as
follows:
\[
\mu_{A \cap B}(u_i) = \min(\mu_A(u_i), \mu_B(u_i)), \quad \forall u_i \in U. \tag{3}
\]
Definition 2: Let A and B be fuzzy sets of the universe of discourse U
characterized by the membership functions $\mu_A$ and $\mu_B$, respectively, where
$\mu_A : U \to [0, 1]$ and $\mu_B : U \to [0, 1]$. The degree of similarity between the fuzzy
sets A and B, denoted as S(A, B), is defined as follows:
\[
S(A, B) = \frac{\text{The Area of } A \cap B}{\text{The Area of } A \cup B}, \tag{4}
\]
where $S(A, B) \in [0, 1]$. The larger the value of S(A, B), the greater the
similarity between the fuzzy sets A and B.
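Equations (2) through (4) can be sketched numerically as follows. This is one possible way to evaluate the areas, assuming triangular fuzzy sets and a trapezoidal approximation on a fine grid; the helper names (`triangular`, `area`, `similarity`) and the two example sets are illustrative assumptions, not part of the article.

```python
def triangular(a, b, c):
    """Return the membership function of the triangular fuzzy set (a, b, c)."""
    def mu(u):
        if u <= a or u >= c:
            return 0.0
        return (u - a) / (b - a) if u <= b else (c - u) / (c - b)
    return mu


def area(f, lo, hi, steps=20000):
    """Trapezoidal approximation of the area under f on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h


def similarity(mu_a, mu_b, lo, hi):
    """Eq. (4): area of the intersection (min) over area of the union (max)."""
    inter = area(lambda u: min(mu_a(u), mu_b(u)), lo, hi)   # Eq. (3)
    union = area(lambda u: max(mu_a(u), mu_b(u)), lo, hi)   # Eq. (2)
    return inter / union if union > 0 else 0.0


# Illustrative fuzzy sets (hypothetical values): two overlapping unit triangles.
A = triangular(0.0, 1.0, 2.0)
B = triangular(1.0, 2.0, 3.0)
print(round(similarity(A, B, 0.0, 3.0), 3))   # 0.143 (exactly 1/7 analytically)
```

Sampling on a grid keeps the sketch independent of the shape of the membership functions; for triangles the areas could also be computed in closed form.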
A NEW ALGORITHM FOR FUZZY RULES GENERATION
In this section, we present a new algorithm for constructing membership
functions and generating fuzzy rules from training data to deal with the
Iris data classification problem. The algorithm is presented as follows.
Step 1: Choose m instances from the Iris data as the training data set,
and let the rest of the instances of the Iris data be the testing data set.
Step 2: Find the maximum attribute value, the minimum attribute
value, and the average attribute value for each input attribute (i.e., sepal
length (SL), sepal width (SW), petal length (PL), and petal width (PW))
of each species of flowers (i.e., Iris-Setosa, Iris-Versicolor, and Iris-
Virginica) from the training data set to form the membership function of
each attribute for each species.
Step 3: Based on Eq. (4), calculate the average degree of similarity
between each species for each input attribute.
Step 4: For each input attribute, if the average degree of similarity
between the species for this input attribute is greater than the threshold
value $\alpha$, where $\alpha \in [0, 1]$, then discard this input attribute. In this article,
we let $\alpha = 0.05$. Then, the rest of the input attributes (i.e., $X_1, X_2, \ldots,$
and $X_n$) are used as the input attributes of the generated fuzzy rules
shown as follows:
IF $X_1$ is $A_1$ AND $X_2$ is $A_2$ AND $\cdots$ AND $X_n$ is $A_n$
THEN the flower is $B$,
where $B$ denotes the species of the flower, $B \in$ {Iris-Setosa, Iris-Versicolor,
Iris-Virginica}, $A_1, A_2, \ldots,$ and $A_n$ are input fuzzy sets formed by the
attribute values of the input attributes $X_1, X_2, \ldots,$ and $X_n$, respectively,
with respect to the species $B$ derived in Step 2, where $1 \le n \le 4$. Because
there are three species of the flower, three fuzzy rules will be generated
from the training data set.
If the testing instance is $(y_1, y_2, \ldots, y_n)$, where $y_i$ denotes the attribute
value of the input attribute $X_i$ and $1 \le i \le n$, then the degree of possibility
$\mu_R(y_1, y_2, \ldots, y_n)$ that the flower is $B$ can be evaluated by
\[
\mu_R(y_1, y_2, \ldots, y_n) = \mu_{A_1}(y_1) \times \mu_{A_2}(y_2) \times \cdots \times \mu_{A_n}(y_n),
\]
where $\mu_R(y_1, y_2, \ldots, y_n) \in [0, 1]$.
AN EXAMPLE
In the following, we apply the proposed algorithm to deal with the Iris
data (Fisher 1936) classification problem. Table 1 shows the Iris data.
There are three species of flowers in the Iris data (i.e., ‘‘Iris-Setosa,’’ ‘‘Iris-
Versicolor,’’ and ‘‘Iris-Virginica’’) and there are 150 instances in the Iris
data, with 50 instances for each species and each species with four
attributes (i.e., SL, SW, PL, and PW).
The step-by-step illustration of the proposed algorithm is shown as
follows:
Step 1: The computer randomly chooses 75 instances of the Iris data as
the training data set and lets the rest of the instances of the Iris data be
the testing data set. Assume that the chosen training instances are as
shown in Table 2.
Step 2: Based on Table 2, we can find the maximum attribute value, the
minimum attribute value, and the average attribute value of each input
attribute of each species of flower from the training data set as shown in
Table 3.
Table 1. Iris data (Fisher 1936)
Iris-Setosa Iris-Versicolor Iris-Virginica
SL SW PL PW SL SW PL PW SL SW PL PW
5.1 3.5 1.4 0.2 7.0 3.2 4.7 1.4 6.3 3.3 6.0 2.5
4.9 3.0 1.4 0.2 6.4 3.2 4.5 1.5 5.8 2.7 5.1 1.9
4.7 3.2 1.3 0.2 6.9 3.1 4.9 1.5 7.1 3.0 5.9 2.1
4.6 3.1 1.5 0.2 5.5 2.3 4.0 1.3 6.3 2.9 5.6 1.8
5.0 3.6 1.4 0.2 6.5 2.8 4.6 1.5 6.5 3.0 5.8 2.2
5.4 3.9 1.7 0.4 5.7 2.8 4.5 1.3 7.6 3.0 6.6 2.1
4.6 3.4 1.4 0.3 6.3 3.3 4.7 1.6 4.9 2.5 4.5 1.7
5.0 3.4 1.5 0.2 4.9 2.4 3.3 1.0 7.3 2.9 6.3 1.8
4.4 2.9 1.4 0.2 6.6 2.9 4.6 1.3 6.7 2.5 5.8 1.8
4.9 3.1 1.5 0.1 5.2 2.7 3.9 1.4 7.2 3.6 6.1 2.5
5.4 3.7 1.5 0.2 5.0 2.0 3.5 1.0 6.5 3.2 5.1 2.0
4.8 3.4 1.6 0.2 5.9 3.0 4.2 1.5 6.4 2.7 5.3 1.9
4.8 3.0 1.4 0.1 6.0 2.2 4.0 1.0 6.8 3.0 5.5 2.1
4.3 3.0 1.1 0.1 6.1 2.9 4.7 1.4 5.7 2.5 5.0 2.0
5.8 4.0 1.2 0.2 5.6 2.9 3.6 1.3 5.8 2.8 5.1 2.4
5.7 4.4 1.5 0.4 6.7 3.1 4.4 1.4 6.4 3.2 5.3 2.3
5.4 3.9 1.3 0.4 5.6 3.0 4.5 1.5 6.5 3.0 5.5 1.8
5.1 3.5 1.4 0.3 5.8 2.7 4.1 1.0 7.7 3.8 6.7 2.2
5.7 3.8 1.7 0.3 6.2 2.2 4.5 1.5 7.7 2.6 6.9 2.3
5.1 3.8 1.5 0.3 5.6 2.5 3.9 1.1 6.0 2.2 5.0 1.5
5.4 3.4 1.7 0.2 5.9 3.2 4.8 1.8 6.9 3.2 5.7 2.3
5.1 3.7 1.5 0.4 6.1 2.8 4.0 1.3 5.6 2.8 4.9 2.0
4.6 3.6 1.0 0.2 6.3 2.5 4.9 1.5 7.7 2.8 6.7 2.0
5.1 3.3 1.7 0.5 6.1 2.8 4.7 1.2 6.3 2.7 4.9 1.8
4.8 3.4 1.9 0.2 6.4 2.9 4.3 1.3 6.7 3.3 5.7 2.1
5.0 3.0 1.6 0.2 6.6 3.0 4.4 1.4 7.2 3.2 6.0 1.8
5.0 3.4 1.6 0.4 6.8 2.8 4.8 1.4 6.2 2.8 4.8 1.8
5.2 3.5 1.5 0.2 6.7 3.0 5.0 1.7 6.1 3.0 4.9 1.8
5.2 3.4 1.4 0.2 6.0 2.9 4.5 1.5 6.4 2.8 5.6 2.1
4.7 3.2 1.6 0.2 5.7 2.6 3.5 1.0 7.2 3.0 5.8 1.6
4.8 3.1 1.6 0.2 5.5 2.4 3.8 1.1 7.4 2.8 6.1 1.9
5.4 3.4 1.5 0.4 5.5 2.4 3.7 1.0 7.9 3.8 6.4 2.0
5.2 4.1 1.5 0.1 5.8 2.7 3.9 1.2 6.4 2.8 5.6 2.2
5.5 4.2 1.4 0.2 6.0 2.7 5.1 1.6 6.3 2.8 5.1 1.5
4.9 3.1 1.5 0.2 5.4 3.0 4.5 1.5 6.1 2.6 5.6 1.4
5.0 3.2 1.2 0.2 6.0 3.4 4.5 1.6 7.7 3.0 6.1 2.3
5.5 3.5 1.3 0.2 6.7 3.1 4.7 1.5 6.3 3.4 5.6 2.4
4.9 3.6 1.4 0.1 6.3 2.3 4.4 1.3 6.4 3.1 5.5 1.8
For each species Y of the flowers, where $Y \in$ {Iris-Setosa, Iris-Versicolor,
Iris-Virginica}, and each input attribute X, where $X \in$ {SL, SW, PL, PW},
the minimum attribute value, the maximum attribute value, and the average
attribute value of X are defined as the left vertex, the right vertex, and the
center, respectively, of the triangular fuzzy set X(Y) of the linguistic
value of the input attribute X. We let the membership degree of the
left vertex and the right vertex of the fuzzy set X(Y) be equal to
1/(the number of training instances for each species), as shown in Figures 2
through 5. For example, from Table 3, we can see that the minimum attribute
value, the maximum attribute value, and the average attribute value of
the input attribute SL of the species Iris-Setosa of the training instances
are 4.3 cm, 5.8 cm, and 5.028 cm, respectively. We can construct the
membership function SL(Iris-Setosa) for the input attribute SL of the
species Iris-Setosa as shown in Figure 2. Because the number of instances
for each species in the training data set is 25, we let the membership
degree of the left vertex (i.e., 4.3 cm) and the right vertex (i.e., 5.8 cm)
of the fuzzy set SL(Iris-Setosa) be equal to 1/25 = 0.04, as shown in
Figure 2. In the same way, we can construct the membership functions
SL(Iris-Versicolor), SL(Iris-Virginica), SW(Iris-Setosa), SW(Iris-Versicolor),
SW(Iris-Virginica), PL(Iris-Setosa), PL(Iris-Versicolor),
PL(Iris-Virginica), PW(Iris-Setosa), PW(Iris-Versicolor), and
PW(Iris-Virginica), respectively, as shown in Figures 2 to 5.
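The construction in Step 2 can be illustrated with a short sketch that derives the triplet (left vertex, center, right vertex) from the training values of one attribute of one species. The function name `triangle_from_values` is an assumption made for illustration; the data are the sepal lengths of the 25 Iris-Setosa training instances listed in Table 2.

```python
from statistics import mean


def triangle_from_values(values):
    """Return (left vertex, center, right vertex) = (minimum, average, maximum)."""
    return (min(values), mean(values), max(values))


# Sepal length (cm) of the 25 Iris-Setosa training instances in Table 2.
setosa_sl = [5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9, 5.4, 4.8, 4.8,
             4.3, 5.8, 5.7, 5.4, 5.1, 5.7, 5.1, 5.4, 5.1, 4.6, 5.1, 4.8]

a, b, c = triangle_from_values(setosa_sl)
vertex_degree = 1 / len(setosa_sl)   # membership degree assigned to both vertices

print(a, round(b, 3), c)             # 4.3 5.028 5.8
print(vertex_degree)                 # 0.04
```

The printed triplet matches the Iris-Setosa sepal length row of Table 3.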
Table 1. Continued
Iris-Setosa Iris-Versicolor Iris-Virginica
SL SW PL PW SL SW PL PW SL SW PL PW
4.4 3.0 1.3 0.2 5.6 3.0 4.1 1.3 6.0 3.0 4.8 1.8
5.1 3.4 1.5 0.2 5.5 2.5 4.0 1.3 6.9 3.1 5.4 2.1
5.0 3.5 1.3 0.3 5.5 2.6 4.4 1.2 6.7 3.1 5.6 2.4
4.5 2.3 1.3 0.3 6.1 3.0 4.6 1.4 6.9 3.1 5.1 2.3
4.4 3.2 1.3 0.2 5.8 2.6 4.0 1.2 5.8 2.7 5.1 1.9
5.0 3.5 1.6 0.6 5.0 2.3 3.3 1.0 6.8 3.2 5.9 2.3
5.1 3.8 1.9 0.4 5.6 2.7 4.2 1.3 6.7 3.3 5.7 2.5
4.8 3.0 1.4 0.3 5.7 3.0 4.2 1.2 6.7 3.0 5.2 2.3
5.1 3.8 1.6 0.2 5.7 2.9 4.2 1.3 6.3 2.5 5.0 1.9
4.6 3.2 1.4 0.2 6.2 2.9 4.3 1.3 6.5 3.0 5.2 2.0
5.3 3.7 1.5 0.2 5.1 2.5 3.0 1.1 6.2 3.4 5.4 2.3
5.0 3.3 1.4 0.2 5.7 2.8 4.1 1.3 5.9 3.0 5.1 1.8
Step 3: Based on Eq. (4), we can calculate the average degree of similarity
of each pair of species for each input attribute as follows. First, we
calculate the degree of similarity of each pair of species for the
attribute SL:
i. Based on Eq. (4), the degree of similarity S(SL(Iris-Setosa), SL(Iris-Versicolor))
between the species Iris-Setosa and Iris-Versicolor for the
attribute SL can be evaluated as follows:
S(SL(Iris-Setosa), SL(Iris-Versicolor)) = 0.1356.
Table 2. Training data set
Iris-Setosa Iris-Versicolor Iris-Virginica
SL SW PL PW SL SW PL PW SL SW PL PW
5.1 3.5 1.4 0.2 7.0 3.2 4.7 1.4 6.3 3.3 6.0 2.5
4.9 3.0 1.4 0.2 6.4 3.2 4.5 1.5 5.8 2.7 5.1 1.9
4.7 3.2 1.3 0.2 6.9 3.1 4.9 1.5 7.1 3.0 5.9 2.1
4.6 3.1 1.5 0.2 5.5 2.3 4.0 1.3 6.3 2.9 5.6 1.8
5.0 3.6 1.4 0.2 6.5 2.8 4.6 1.5 6.5 3.0 5.8 2.2
5.4 3.9 1.7 0.4 5.7 2.8 4.5 1.3 7.6 3.0 6.6 2.1
4.6 3.4 1.4 0.3 6.3 3.3 4.7 1.6 4.9 2.5 4.5 1.7
5.0 3.4 1.5 0.2 4.9 2.4 3.3 1.0 7.3 2.9 6.3 1.8
4.4 2.9 1.4 0.2 6.6 2.9 4.6 1.3 6.7 2.5 5.8 1.8
4.9 3.1 1.5 0.1 5.2 2.7 3.9 1.4 7.2 3.6 6.1 2.5
5.4 3.7 1.5 0.2 5.0 2.0 3.5 1.0 6.5 3.2 5.1 2.0
4.8 3.4 1.6 0.2 5.9 3.0 4.2 1.5 6.4 2.7 5.3 1.9
4.8 3.0 1.4 0.1 6.0 2.2 4.0 1.0 6.8 3.0 5.5 2.1
4.3 3.0 1.1 0.1 6.1 2.9 4.7 1.4 5.7 2.5 5.0 2.0
5.8 4.0 1.2 0.2 5.6 2.9 3.6 1.3 5.8 2.8 5.1 2.4
5.7 4.4 1.5 0.4 6.7 3.1 4.4 1.4 6.4 3.2 5.3 2.3
5.4 3.9 1.3 0.4 5.6 3.0 4.5 1.5 6.5 3.0 5.5 1.8
5.1 3.5 1.4 0.3 5.8 2.7 4.1 1.0 7.7 3.8 6.7 2.2
5.7 3.8 1.7 0.3 6.2 2.2 4.5 1.5 7.7 2.6 6.9 2.3
5.1 3.8 1.5 0.3 5.6 2.5 3.9 1.1 6.0 2.2 5.0 1.5
5.4 3.4 1.7 0.2 5.9 3.2 4.8 1.8 6.9 3.2 5.7 2.3
5.1 3.7 1.5 0.4 6.1 2.8 4.0 1.3 5.6 2.8 4.9 2.0
4.6 3.6 1.0 0.2 6.3 2.5 4.9 1.5 7.7 2.8 6.7 2.0
5.1 3.3 1.7 0.5 6.1 2.8 4.7 1.2 6.3 2.7 4.9 1.8
4.8 3.4 1.9 0.2 6.4 2.9 4.3 1.3 6.7 3.3 5.7 2.1
ii. Based on Eq. (4), the degree of similarity S(SL(Iris-Setosa), SL(Iris-Virginica))
between the species Iris-Setosa and Iris-Virginica for the
attribute SL can be evaluated as follows:
S(SL(Iris-Setosa), SL(Iris-Virginica)) = 0.083.
iii. Based on Eq. (4), the degree of similarity S(SL(Iris-Versicolor),
SL(Iris-Virginica)) between the species Iris-Versicolor and Iris-Virginica
for the attribute SL can be evaluated as follows:
S(SL(Iris-Versicolor), SL(Iris-Virginica)) = 0.510.
Thus, the average degree of similarity AVE(SL) of each pair of the
species for the attribute SL can be evaluated as follows:
Table 3. Minimum attribute value, maximum attribute value, and average attribute
value for the training data set
Species          Input attribute     Minimum (cm)   Maximum (cm)   Average (cm)
Iris-Setosa      Sepal Length (SL)   4.3            5.8            5.028
Iris-Setosa      Sepal Width (SW)    2.9            4.4            3.480
Iris-Setosa      Petal Length (PL)   1.0            1.9            1.460
Iris-Setosa      Petal Width (PW)    0.1            0.5            0.248
Iris-Versicolor  Sepal Length (SL)   4.9            7.0            6.012
Iris-Versicolor  Sepal Width (SW)    2.0            3.3            2.764
Iris-Versicolor  Petal Length (PL)   3.3            4.9            4.312
Iris-Versicolor  Petal Width (PW)    1.0            1.8            1.344
Iris-Virginica   Sepal Length (SL)   4.9            7.7            6.576
Iris-Virginica   Sepal Width (SW)    2.2            3.8            2.928
Iris-Virginica   Petal Length (PL)   4.5            6.9            5.640
Iris-Virginica   Petal Width (PW)    1.5            2.5            2.044
Figure 2. Membership functions of the attribute sepal length (SL) for the species.
AVE(SL) = [S(SL(Iris-Setosa), SL(Iris-Versicolor))
+ S(SL(Iris-Setosa), SL(Iris-Virginica))
+ S(SL(Iris-Versicolor), SL(Iris-Virginica))] / 3 = 0.243.
In the same way, we can calculate the average degrees of similarity
AVE(SW), AVE(PL), and AVE(PW) of each pair of the species for the
attributes SW, PL, and PW, respectively. The results are as follows:
S(SW(Iris-Setosa), SW(Iris-Versicolor)) = 0.054,
S(SW(Iris-Setosa), SW(Iris-Virginica)) = 0.2194,
S(SW(Iris-Versicolor), SW(Iris-Virginica)) = 0.492,
AVE(SW) = [S(SW(Iris-Setosa), SW(Iris-Versicolor))
+ S(SW(Iris-Setosa), SW(Iris-Virginica))
+ S(SW(Iris-Versicolor), SW(Iris-Virginica))] / 3 = 0.255,
Figure 3. Membership functions of the attribute sepal width (SW) for the species.
Figure 4. Membership functions of the attribute petal length (PL) for the species.
S(PL(Iris-Setosa), PL(Iris-Versicolor)) = 0,
S(PL(Iris-Setosa), PL(Iris-Virginica)) = 0,
S(PL(Iris-Versicolor), PL(Iris-Virginica)) = 0.024,
AVE(PL) = [S(PL(Iris-Setosa), PL(Iris-Versicolor))
+ S(PL(Iris-Setosa), PL(Iris-Virginica))
+ S(PL(Iris-Versicolor), PL(Iris-Virginica))] / 3 = 0.0079,
S(PW(Iris-Setosa), PW(Iris-Versicolor)) = 0,
S(PW(Iris-Setosa), PW(Iris-Virginica)) = 0,
S(PW(Iris-Versicolor), PW(Iris-Virginica)) = 0.053,
AVE(PW) = [S(PW(Iris-Setosa), PW(Iris-Versicolor))
+ S(PW(Iris-Setosa), PW(Iris-Virginica))
+ S(PW(Iris-Versicolor), PW(Iris-Virginica))] / 3 = 0.018.
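The pairwise similarities and their averages can be checked with the sketch below, which also applies the threshold test with the value 0.05 used in this article. It is a rough numerical check, not the authors' program: it assumes plain triangular membership functions built from the (minimum, average, maximum) triplets of Table 3 with zero membership at the two vertices, which reproduces the listed similarity values to the printed precision, and it approximates the areas on a fine grid. All function and variable names are illustrative.

```python
from itertools import combinations

# (minimum, average, maximum) in cm for each attribute and species, from Table 3.
TABLE3 = {
    "SL": {"Iris-Setosa": (4.3, 5.028, 5.8),
           "Iris-Versicolor": (4.9, 6.012, 7.0),
           "Iris-Virginica": (4.9, 6.576, 7.7)},
    "SW": {"Iris-Setosa": (2.9, 3.480, 4.4),
           "Iris-Versicolor": (2.0, 2.764, 3.3),
           "Iris-Virginica": (2.2, 2.928, 3.8)},
    "PL": {"Iris-Setosa": (1.0, 1.460, 1.9),
           "Iris-Versicolor": (3.3, 4.312, 4.9),
           "Iris-Virginica": (4.5, 5.640, 6.9)},
    "PW": {"Iris-Setosa": (0.1, 0.248, 0.5),
           "Iris-Versicolor": (1.0, 1.344, 1.8),
           "Iris-Virginica": (1.5, 2.044, 2.5)},
}


def tri(u, a, b, c):
    """Plain triangular membership degree (zero at and beyond the vertices)."""
    if u <= a or u >= c:
        return 0.0
    return (u - a) / (b - a) if u <= b else (c - u) / (c - b)


def similarity(t1, t2, steps=20000):
    """Eq. (4) by trapezoidal sums; the grid spacing cancels in the ratio."""
    lo, hi = min(t1[0], t2[0]), max(t1[2], t2[2])
    h = (hi - lo) / steps
    inter = union = 0.0
    for i in range(steps + 1):
        u = lo + i * h
        w = 0.5 if i in (0, steps) else 1.0      # trapezoid end-point weights
        m1, m2 = tri(u, *t1), tri(u, *t2)
        inter += w * min(m1, m2)
        union += w * max(m1, m2)
    return inter / union


def average_similarity(sets_by_species):
    """Average of Eq. (4) over all pairs of species for one attribute."""
    pairs = list(combinations(sets_by_species.values(), 2))
    return sum(similarity(p, q) for p, q in pairs) / len(pairs)


ALPHA = 0.05   # threshold value used in this article
averages = {attr: average_similarity(sets) for attr, sets in TABLE3.items()}
kept = [attr for attr, ave in averages.items() if ave <= ALPHA]

for attr, ave in averages.items():
    print(attr, round(ave, 4))
print(kept)    # ['PL', 'PW']
```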
Figure 5. Membership functions of the attribute petal width (PW) for the species.
Step 4: Because the threshold value $\alpha$ given by the user is 0.05, if the
average degree of similarity of each pair of the species of an input attribute
is greater than 0.05, then discard this input attribute. From the
calculation results of Step 3, we can see that the average degrees of similarity
of each pair of the species for the attributes SL and SW are
greater than 0.05. Thus, the attributes SL and SW are discarded. Finally,
we use the rest of the input attributes (i.e., PL and PW) as the input
attributes of the generated fuzzy rules. Because there are three species of
flowers in the Iris data, there are three fuzzy rules to be generated, shown
as follows:
Rule R1: IF Petal Length is PL(Iris-Setosa) AND
Petal Width is PW(Iris-Setosa)
THEN the flower is Iris-Setosa,
Rule R2: IF Petal Length is PL(Iris-Versicolor) AND
Petal Width is PW(Iris-Versicolor)
THEN the flower is Iris-Versicolor,
Rule R3: IF Petal Length is PL(Iris-Virginica) AND
Petal Width is PW(Iris-Virginica)
THEN the flower is Iris-Virginica,
where the membership functions of PL(Iris-Setosa), PW(Iris-Setosa),
PL(Iris-Versicolor), PW(Iris-Versicolor), PL(Iris-Virginica), and PW(Iris-
Virginica) are shown in Figures 4 and 5, respectively. In the following, we
use the generated fuzzy rules to classify the testing instance (5.0 cm,
3.0 cm, 1.6 cm, 0.2 cm) in Table 1 as follows:
i. Let’s consider the generated fuzzy rule R1:
Rule R1: IF Petal Length is PL(Iris-Setosa) AND
Petal Width is PW(Iris-Setosa)
THEN the flower is Iris-Setosa.
From Figures 4 and 5, we can see that $\mu_{PL(\text{Iris-Setosa})}(1.6\ \text{cm}) = 0.682$ and
$\mu_{PW(\text{Iris-Setosa})}(0.2\ \text{cm}) = 0.676$. Then, from fuzzy rule R1, we can see
that the degree of possibility that the flower is Iris-Setosa is equal to
$\mu_{PL(\text{Iris-Setosa})}(1.6\ \text{cm}) \times \mu_{PW(\text{Iris-Setosa})}(0.2\ \text{cm}) \approx 0.461$.
ii. Let’s consider the generated fuzzy rule R2:
Rule R2: IF Petal Length is PL(Iris-Versicolor) AND
Petal Width is PW(Iris-Versicolor)
THEN the flower is Iris-Versicolor.
From Figures 4 and 5, we can see that $\mu_{PL(\text{Iris-Versicolor})}(1.6\ \text{cm}) = 0$ and
$\mu_{PW(\text{Iris-Versicolor})}(0.2\ \text{cm}) = 0$. Then, from fuzzy rule R2, we can see that
the degree of possibility that the flower is Iris-Versicolor is equal to
$\mu_{PL(\text{Iris-Versicolor})}(1.6\ \text{cm}) \times \mu_{PW(\text{Iris-Versicolor})}(0.2\ \text{cm}) = 0$.
iii. Let’s consider the generated fuzzy rule R3:
Rule R3: IF Petal Length is PL(Iris-Virginica) AND
Petal Width is PW(Iris-Virginica)
THEN the flower is Iris-Virginica.
From Figures 4 and 5, we can see that $\mu_{PL(\text{Iris-Virginica})}(1.6\ \text{cm}) = 0$
and $\mu_{PW(\text{Iris-Virginica})}(0.2\ \text{cm}) = 0$. Then, from fuzzy rule R3, we can see
that the degree of possibility that the flower is Iris-Virginica is equal to
$\mu_{PL(\text{Iris-Virginica})}(1.6\ \text{cm}) \times \mu_{PW(\text{Iris-Virginica})}(0.2\ \text{cm}) = 0$.
From i, ii, and iii, we can see that fuzzy rule R1 gets the highest degree of
possibility (i.e., 0.461) among the fuzzy rules R1, R2, and R3. Therefore,
the testing instance (5.0 cm, 3.0 cm, 1.6 cm, 0.2 cm) is classified as the
species "Iris-Setosa." This classification result coincides
with the one shown in Table 1. It should be noted that in the generated
fuzzy rules R1, R2, and R3, the membership functions PL(Iris-Setosa),
PW(Iris-Setosa), PL(Iris-Versicolor), PW(Iris-Versicolor),
PL(Iris-Virginica), and PW(Iris-Virginica) are called the input fuzzy sets
of the generated fuzzy rules.
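The worked classification above can be reproduced with the following sketch. It is not the authors' implementation: it assumes plain triangular membership functions built from the Table 3 triplets (zero membership at the vertices), which yields the values 0.682 and 0.676 quoted for rule R1, and all names (`RULES`, `possibility`, `classify`) are illustrative.

```python
def tri(u, a, b, c):
    """Plain triangular membership degree for the fuzzy set (a, b, c)."""
    if u <= a or u >= c:
        return 0.0
    return (u - a) / (b - a) if u <= b else (c - u) / (c - b)


# Antecedents of rules R1 to R3: (minimum, average, maximum) in cm for the
# retained attributes PL and PW, taken from Table 3.
RULES = {
    "Iris-Setosa":     {"PL": (1.0, 1.460, 1.9), "PW": (0.1, 0.248, 0.5)},
    "Iris-Versicolor": {"PL": (3.3, 4.312, 4.9), "PW": (1.0, 1.344, 1.8)},
    "Iris-Virginica":  {"PL": (4.5, 5.640, 6.9), "PW": (1.5, 2.044, 2.5)},
}


def possibility(instance, antecedent):
    """Product of the membership degrees of all antecedent fuzzy sets."""
    degree = 1.0
    for attr, (a, b, c) in antecedent.items():
        degree *= tri(instance[attr], a, b, c)
    return degree


def classify(instance):
    """Return the species whose rule has the highest degree of possibility.

    If several rules tie (e.g., all degrees are 0), the first one listed wins;
    the article does not specify a tie-breaking rule.
    """
    scores = {species: possibility(instance, ant) for species, ant in RULES.items()}
    return max(scores, key=scores.get), scores


# Testing instance (5.0 cm, 3.0 cm, 1.6 cm, 0.2 cm); only PL and PW are used.
label, scores = classify({"PL": 1.6, "PW": 0.2})
print(label, round(scores[label], 3))   # Iris-Setosa 0.461
```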
We have implemented the proposed algorithm on a Pentium III PC
by using Visual Basic 6.0. By applying the implemented program to deal
with the Iris data classification problem, we can obtain the following
experimental results:
1. If the training data set contains 150 training instances (i.e., the full Iris
data) and the testing data set is equal to the training data set (i.e., the
full Iris data), then after executing the program 200 times, the average
classification accuracy rate is 97.3300%, the number of generated fuzzy
rules is 3, and the number of input fuzzy sets in the antecedent portions
of the generated fuzzy rules is 6.
2. If the training data set contains 120 training instances randomly chosen
from the Iris data, and the testing data set contains the rest of the
instances of the Iris data (i.e., 30 instances), then after executing the
program 200 times, the average classification accuracy rate is 96.8250%,
the number of generated fuzzy rules is 3, and the number of input fuzzy
sets in the antecedent portions of the generated fuzzy rules is 6.
3. If the training data set contains 75 training instances randomly chosen
from the Iris data, and the testing data set contains the rest of the
instances of the Iris data (i.e., 75 instances), then after executing the
program 200 times, the average classification accuracy rate is
96.3427%, the number of generated fuzzy rules is 3, and the number
of input fuzzy sets in the antecedent portions of the generated fuzzy
rules is 6.
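The evaluation protocol of the three experiments (a random split, repeated 200 times, with the accuracy averaged over the runs) might be sketched as follows. This is an assumed harness rather than the authors' Visual Basic program: the article does not state whether the random choice is stratified by species, so the sketch uses a simple random shuffle, and `train` and `classify` are hypothetical callables supplied by the caller.

```python
import random


def average_accuracy(instances, train, classify, m, runs=200, seed=0):
    """Average classification accuracy (in percent) over repeated random splits.

    instances: list of (features, label) pairs.
    train(training) returns a model; classify(model, features) returns a label.
    m: number of training instances; the remaining instances are used for testing.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        shuffled = instances[:]
        rng.shuffle(shuffled)
        training, testing = shuffled[:m], shuffled[m:]
        if not testing:
            testing = training       # as in experiment 1: test on the training set
        model = train(training)
        correct = sum(classify(model, x) == y for x, y in testing)
        total += correct / len(testing)
    return 100.0 * total / runs


# Toy usage with a hypothetical one-attribute data set and a fixed-threshold classifier.
toy = [((v,), "low" if v < 5 else "high") for v in range(10)]
toy_train = lambda training: 5
toy_classify = lambda model, x: "low" if x[0] < model else "high"
print(average_accuracy(toy, toy_train, toy_classify, m=6, runs=20))   # 100.0
```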
In the following, we compare the experimental results of the pro-
posed algorithm with Hong and Lee’s (1996) algorithm, Wu and Chen’s
(1999) algorithm, and Castro et al.’s (1999) algorithm as shown in
Table 4.
Table 4. A comparison of the average classification accuracy rate, the average number of
generated fuzzy rules, and the average number of input fuzzy sets for different
algorithms
Algorithm                        Training    Testing     Executions  Average         Average number  Average number
                                 instances   instances               classification  of generated    of input
                                                                     accuracy rate   fuzzy rules     fuzzy sets
The proposed algorithm           150         150         200         97.3300%        3               6
The proposed algorithm           120         30          200         96.8250%        3               6
The proposed algorithm           75          75          200         96.3427%        3               6
Wu and Chen's (1999) algorithm   75          75          200         96.2100%        3               8.21
Hong and Lee's (1996) algorithm  75          75          200         95.5700%        6.21            8
Castro et al.'s (1999) algorithm 120         30          10          96.6000%        11              25
CONCLUSIONS
In this article, we have presented a new method for generating fuzzy rules
and constructing membership functions from training instances to deal
with the Iris data classification problem. The proposed algorithm has the
following advantages: (1) it can obtain a higher average classification
accuracy rate; (2) it generates fewer fuzzy rules; and (3) it generates fewer
input fuzzy sets in the antecedent portions of the generated fuzzy rules.
From Table 4, we can see that, for matching training and testing set sizes,
the proposed method has a higher average classification accuracy rate than
the existing methods, generates no more fuzzy rules (3, versus 3, 6.21, and 11),
and uses fewer input fuzzy sets in the antecedent portions of the generated
fuzzy rules (6, versus 8.21, 8, and 25).
REFERENCES
Castro, J. L., J. J. Castro-Schez, and J. M. Zurita. 1999. Learning maximal structure rules in fuzzy logic for knowledge acquisition in expert systems. Fuzzy Sets and Systems, 101(2):331-342.
Chang, C. H., and S. M. Chen. 2000. A new method to generate fuzzy rules from numerical data based on the exclusion of attribute terms. In Proceedings of the 2000 International Computer Symposium: Workshop on Artificial Intelligence, pp. 57-64, Chiayi, Taiwan, Republic of China.
Chen, S. M., and S. Y. Lin. 2000. A new method for constructing fuzzy decision trees and generating fuzzy classification rules from training examples. Cybernetics and Systems, 31(7):763-785.
Chen, S. M., and M. S. Yeh. 1998. Generating fuzzy rules from relational database systems for estimating null values. Cybernetics and Systems, 29(6):363-376.
Chen, S. M., S. H. Lee, and C. H. Lee. 1999. Generating fuzzy rules from numerical data for handling fuzzy classification problems. In Proceedings of the 1999 National Computer Symposium, Vol. 2, pp. 336-343, Taipei, Taiwan, Republic of China.
Chen, Y. C., and S. M. Chen. 2000. A new method to generate fuzzy rules for fuzzy classification systems. In Proceedings of the 2000 Eighth National Conference on Fuzzy Theory and Its Applications, Taipei, Taiwan, Republic of China.
Fisher, R. 1936. The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7:179-188.
Hong, T. P., and J. B. Chen. 1999. Finding relevant attributes and membership functions. Fuzzy Sets and Systems, 103(1):389-404.
Hong, T. P., and J. B. Chen. 2000. Process individual fuzzy attributes for fuzzy rule induction. Fuzzy Sets and Systems, 112(1):127-140.
Hong, T. P., and C. Y. Lee. 1996. Induction of fuzzy rules and membership functions from training example. Fuzzy Sets and Systems, 84(10):33-47.
Hong, T. P., and C. Y. Lee. 1999. Effect of merging order on performance of fuzzy induction. Intelligent Data Analysis, 3(2):139-151.
Kao, C. M., and S. M. Chen. 2000. A new method to generate fuzzy rules from training data containing noise for handling classification problems. In Proceedings of the Fifth Conference on Artificial Intelligence and Applications, pp. 324-332, Taipei, Taiwan, Republic of China.
Klir, G. J., and B. Yuan. 1995. Fuzzy sets and fuzzy logic theory and applications. Englewood Cliffs, NJ: Prentice Hall.
Lin, H. L., and S. M. Chen. 2000. Generating weighted fuzzy rules from training data for handling fuzzy classification problems. In Proceedings of the 2000 International Computer Symposium: Workshop on Artificial Intelligence, pp. 11-18, Chiayi, Taiwan, Republic of China.
Wang, C. H., J. F. Liu, T. P. Hong, and S. S. Tseng. 1999. A fuzzy inductive strategy for modular rules. Fuzzy Sets and Systems, 103(1):91-105.
Wu, T. P., and S. M. Chen. 1999. A new method for constructing membership functions and fuzzy rules from training examples. IEEE Transactions on Systems, Man, and Cybernetics-Part B, 29(1):25-40.
Zadeh, L. A. 1965. Fuzzy sets. Information and Control, 8:338-353.