
IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 27, NO. 3, MARCH 2019

Intuitionistic Fuzzy Rough Set-Based Granular Structures and Attribute Subset Selection

Anhui Tan, Wei-Zhi Wu, Yuhua Qian, Jiye Liang, Jinkun Chen, and Jinjin Li

Abstract—Attribute subset selection is an important issue in data mining and information processing. However, most automatic methodologies consider only the relevance factor between samples while ignoring the diversity factor. This may not allow the utilization value of hidden information to be exploited. For this reason, we propose a hybrid model named intuitionistic fuzzy (IF) rough set to overcome this limitation. The model combines the technical advantages of rough set and IF set and can effectively consider the above-mentioned statistical factors. First, fuzzy information granules based on IF relations are defined and used to characterize the hierarchical structures of the lower and upper approximations of IF rough set within the framework of granular computing. Second, the computation of IF rough approximations and knowledge reduction in IF information systems are investigated. Third, based on the approximations of IF rough set, significance measures are developed to evaluate the approximation quality and classification ability of IF relations. Furthermore, a forward heuristic algorithm for finding an optimal reduct of IF information systems is developed using these measures. Finally, numerical experiments are conducted on public datasets to examine the effectiveness and efficiency of the proposed algorithm in terms of the number of selected attributes, computational time, and classification accuracy.

Index Terms—Attribute reduction, granular structure, intuitionistic fuzzy (IF) relation, IF rough set, rough approximation.

Manuscript received April 2, 2018; revised June 27, 2018; accepted July 24, 2018. Date of publication August 2, 2018; date of current version February 27, 2019. This work was supported in part by the National Natural Science Foundation of China under Grant 61602415, Grant 61573321, Grant 41631179, Grant 61672332, Grant 61773247, and Grant 61773349, in part by the Program for the Outstanding Innovative Teams of Higher Learning Institutions of Shanxi and the Program for the Young San Jin Scholars of Shanxi, and in part by the Natural Science Foundation of Zhejiang Province of China under Grant LY18F030017. (Corresponding author: Anhui Tan.)

A. Tan is with the School of Mathematics, Physics and Information Science, Zhejiang Ocean University, Zhoushan 316022, China, and also with the School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China (e-mail: [email protected]).

W.-Z. Wu is with the School of Mathematics, Physics and Information Science, Zhejiang Ocean University, Zhoushan 316022, China, and also with the Key Laboratory of Oceanographic Big Data Mining and Application of Zhejiang Province, Zhoushan 316022, China (e-mail: [email protected]).

Y. Qian is with the Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Taiyuan 030006, China, and also with the Institute of Big Data Science and Industry, Shanxi University, Taiyuan 030006, China (e-mail: [email protected]).

J. Liang is with the School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China, and also with the Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Taiyuan 030006, China (e-mail: [email protected]).

J. Chen and J. Li are with the School of Mathematics and Statistics, Minnan Normal University, Zhangzhou 363000, China (e-mail: [email protected]; [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TFUZZ.2018.2862870

    I. INTRODUCTION

ROUGH set theory and fuzzy set theory are two important tools for conceptualizing and analyzing various types of data. They have attracted considerable attention in the domains of granular computing and machine learning. Pawlak's rough set [1] utilizes an indiscernibility relation to construct the lower and upper approximations of any arbitrary subset of the universe. It can directly handle categorical attributes [2]–[5], whereas numerical and continuous ones must be discretized into several intervals, which are assigned a set of symbolic values, before the rough set model is applied. Different discretization methods may change the original distribution of the data and lead to information loss. In view of this observation, fuzzy rough set [6], [7], as a combination of fuzzy set and rough set, has proven to be an effective tool for overcoming the problem of discretization and can be used to analyze numerical and continuous attributes without additional preprocessing steps. Over the past decade, this theory has been successfully applied in gene expression, cluster analysis, feature selection, and rule extraction [8]–[11].

We elaborate the advantages of fuzzy rough set from the following two angles. First, this model [12]–[16] was introduced to express the idea of granular computing, which refers to the theories and methodologies that utilize granules in information processing. Zadeh [16] first proposed and discussed the issue of fuzzy information granulation, which is used to generate a family of fuzzy information granules from numerical and continuous data. Pedrycz et al. [17] presented a granular evaluation of fuzzy models and delivered an augmentation of fuzzy models by forming information granules. Qian et al. [18] developed a measure, named the fuzzy granular structure distance, to discriminate the difference between fuzzy granular structures. By considering the intension and extension of information, Xu and Li [19] addressed the mechanism required to train arbitrary fuzzy information granules. Chen et al. [20] introduced the concept of granular fuzzy set based on fuzzy similarity relations and described the granular structures of fuzzy rough approximation operators. Second, fuzzy rough set is widely utilized for feature selection (also called attribute reduction) [21], [22]. Jensen and Shen [10], [23] first defined a notion of dependence function-based reduct and designed a heuristic reduction algorithm using the fuzzy rough set. In [24], a computational domain was presented to improve the computational efficiency of the algorithm proposed in [10]. In [25], the authors examined the granular structures of the lower and upper fuzzy rough approximations and constructed a discernibility matrix, with which the foundation of fuzzy-rough attribute reduction was established.


Considerable effort has been devoted to developing reduction methods by remodeling the discernibility matrices [26], [27]. Wang et al. [28]–[30] introduced the concept of the fuzzy decision of samples for feature selection using the fuzzy rough set. It is noteworthy that the fuzzy rough set is sensitive to noise interference. For this reason, its robustness has been enhanced to improve the generalizability of fuzzy rough set models in the presence of attribute noise and class noise [31]–[33]. An additional class of attribute reduction algorithms with the fuzzy rough set is built on fuzzy information entropy. Hu et al. [34] employed information entropy to redefine the notions of attribute subset reduct and relative reduct. Zhang et al. [35] presented the granular structures of fuzzy rough set under general fuzzy relations and proposed granule-based information entropy for feature selection.

The intuitionistic fuzzy (IF) set [36], [37], as an intuitive generalization of fuzzy set, simultaneously takes the membership and nonmembership of objects belonging to target sets into account. This model possesses a stronger ability to express the delicate ambiguities of information in the objective world than the traditional fuzzy set does. The combination of the IF set and rough set resulted in the introduction of the IF rough set [38]–[40]. It employs the rough set approximations to describe IF sets by associating a collection of IF relations between objects. Zhou et al. [41], [42] examined the approximation operators of the IF rough set by using constructive and axiomatic approaches. Huang et al. [43], [44] developed IF rough set models for interval-valued and multigranulation data and discussed the hierarchical structures and uncertainty measures of IF information [45]. Zhang et al. [46] provided a general framework of IF rough set models related to general binary relations between two universes. Hesameddini et al. [47] and Zhang [48] investigated the attribute reduction of the IF rough set and constructed the reduction structures based on discernibility matrices. All these results indicate that the IF rough set has received wide attention in the research community in recent years. However, the studies have focused mainly on model generalization [49]–[51], property exploration [52]–[54], and measure description [55], [56], while the advantages of the IF rough set have rarely been demonstrated in fields such as granular computing and feature selection.

The fuzzy rough set considers only the relevance of samples while overlooking the diversity. To overcome this limitation, in this paper, we employ the IF rough set model to construct the fuzzy lower and upper approximations of decisions. This model aims to maintain both the maximal degrees of samples' membership to their own classes and nonmembership to other classes simultaneously. We first define several types of fuzzy information granules and then present the hierarchical structures of the IF rough set from the viewpoint of granular computing. Significance measures are introduced to evaluate the classification ability of the IF relations in an IF decision system. A forward reduction algorithm is further proposed for finding a reduct of the system. Finally, we compare the proposed algorithm with existing algorithms. The experimental results show that the new algorithm is effective and efficient in the aspects of dimensionality reduction, computational time, and classification accuracy.

The rest of this paper is organized as follows. In the remainder of this section, we briefly review some basic concepts of the IF set, IF relation, and IF rough set. In Section II, we define several types of fuzzy granules and characterize the hierarchical structures of the lower and upper approximations of the IF rough set by using these granules. In Section III, we present a simple way to compute the IF rough approximations in IF decision systems. In Section IV, we discuss the knowledge reduction of IF decision information systems in detail. A dependence function-based heuristic reduction algorithm is then formulated to compute a suboptimal reduct of IF decision systems. In Section V, numerical experiments to verify the feasibility and stability of the proposed algorithm are described. The paper is concluded with a summary.

    A. Preliminaries

In this section, we review several basic concepts related to the IF set and IF relation. The notion of the IF rough set is then presented. More details can be found in [36], [37], [40], and [41].

1) IF Set and IF Relation:

Definition 1 ([36], [37]): Let U be the universe. An IF set X on U is

X = {(μ_X(x), γ_X(x))/x | x ∈ U}

where μ_X : U → [0, 1] and γ_X : U → [0, 1] satisfy μ_X(x) + γ_X(x) ≤ 1 for any x ∈ U. μ_X(x) and γ_X(x) are called the membership and nonmembership degrees of object x to set X, respectively. The pair (μ_X(x), γ_X(x)) is an IF number of x w.r.t. X.

The family of all IF sets on U is denoted by IF, in which 1̃ = {(1, 0)/x | x ∈ U} is the IF universal set and ∅̃ = {(0, 1)/x | x ∈ U} is the IF empty set. Obviously, each ordinary fuzzy set X can be represented by an IF set: X = {(μ_X(x), 1 − μ_X(x))/x | x ∈ U}.

Denote the quality of X as |X| = Σ_{x∈U} (1 + μ_X(x) − γ_X(x))/2 [57], [58], where the translation constant 1 ensures that each summand is nonnegative, and the scaling divisor 2 bounds each summand within 0 and 1. Moreover, define the probability quality of X w.r.t. the universe as p(X) = |X|/|U|. We obtain the following basic properties.

Property 1: Let U be the universe and X, Y ∈ IF.
1) 0 ≤ |X| ≤ |U| and 0 ≤ p(X) ≤ 1.
2) p(X) = 1 iff X = 1̃.
3) p(X) = 0 iff X = ∅̃.
4) If Y ⊆ X, then |Y| ≤ |X| and p(Y) ≤ p(X).

We can see from Property 1 that the smaller the membership degree and the larger the nonmembership degree, the smaller the IF set. In extreme cases, 1̃ and ∅̃ are the largest and smallest IF sets, respectively. All IF sets in IF on the universe generate a complete lattice.
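A minimal sketch (in Python; the representation of an IF set as a dict of (μ, γ) pairs and all names are illustrative assumptions, not the paper's code) of the quality and probability quality defined above:

```python
# Illustrative sketch of |X| and p(X) for an IF set stored as a dict mapping
# each object to its IF number (mu, gamma).

def quality(X):
    """|X| = sum over x of (1 + mu_X(x) - gamma_X(x)) / 2."""
    return sum((1 + mu - gamma) / 2 for (mu, gamma) in X.values())

def probability_quality(X):
    """p(X) = |X| / |U|, where |U| is the number of objects."""
    return quality(X) / len(X)

# Example: X = (0.5, 0.3)/x1 + (0.6, 0.1)/x2 + (0.2, 0.7)/x3 (cf. Example 5).
X = {"x1": (0.5, 0.3), "x2": (0.6, 0.1), "x3": (0.2, 0.7)}
print(quality(X))              # 1.6
print(probability_quality(X))  # 0.5333...
```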

Definition 2 ([59]): Let U be the universe. An IF binary relation R on U is defined as

R = {⟨(x, y), μ_R(x, y), γ_R(x, y)⟩ | (x, y) ∈ U × U}

where μ_R : U × U → [0, 1] and γ_R : U × U → [0, 1] satisfy μ_R(x, y) + γ_R(x, y) ≤ 1 for any (x, y) ∈ U × U. μ_R(x, y) and γ_R(x, y) are, respectively, the similarity and diversity degrees of x to y.

For two IF relations R and B, we say that R is finer than B, denoted by R ⪯ B, if and only if μ_R(x, y) ≤ μ_B(x, y) and γ_R(x, y) ≥ γ_B(x, y) for any (x, y) ∈ U × U.

Assume U = {x1, x2, ..., xn}. An IF relation is then represented by a matrix R = [(μ_R(xi, xj), γ_R(xi, xj))]_{n×n}, where the (i, j)th entry displays the IF number between xi and xj.

An IF relation R is an IF tolerance relation if it satisfies the following:
a) reflexivity: μ_R(x, x) = 1, γ_R(x, x) = 0;
b) symmetry: μ_R(x, y) = μ_R(y, x), γ_R(x, y) = γ_R(y, x).

The representative matrix of an IF tolerance relation is symmetric, and its diagonal entries are (1, 0). Throughout this study, we restrict the IF relations to IF tolerance ones.

2) IF Rough Set:

Definition 3 ([41], [42]): Let U be the universe and R an IF relation. For any IF set X ∈ IF, the lower and upper approximations of X w.r.t. R are, respectively, defined as

R(X) = {(μ_{R(X)}(x), γ_{R(X)}(x))/x | x ∈ U}
R̄(X) = {(μ_{R̄(X)}(x), γ_{R̄(X)}(x))/x | x ∈ U}

where

μ_{R(X)}(x) = inf_{y∈U} max(γ_R(x, y), μ_X(y))
γ_{R(X)}(x) = sup_{y∈U} min(μ_R(x, y), γ_X(y))
μ_{R̄(X)}(x) = sup_{y∈U} min(μ_R(x, y), μ_X(y))
γ_{R̄(X)}(x) = inf_{y∈U} max(γ_R(x, y), γ_X(y)).

The pairs (μ_{R(X)}(x), γ_{R(X)}(x)) and (μ_{R̄(X)}(x), γ_{R̄(X)}(x)) are the IF numbers of object x to the lower and upper approximation sets, respectively. If R(X) = R̄(X), then (R(X), R̄(X)) is referred to as an IF definable set; otherwise, as an IF rough set.

Property 2: Let R and B be two IF relations. If R ⪯ B, then for any X ∈ IF, it holds that
a) R(X) ⊆ X ⊆ R̄(X);
b) B(X) ⊆ R(X);
c) R̄(X) ⊆ B̄(X).

Property 2 discloses that the IF approximations are monotonic with the increase or decrease in the granularity of the IF relations. A finer IF relation leads to a larger lower approximation and a smaller upper approximation w.r.t. a given IF set.
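Since the universe is finite in practice, the inf/sup in Definition 3 reduce to min/max over U. The following is a minimal sketch of this computation (in Python; matrix/vector representation and names are illustrative assumptions, not the authors' code); the data reproduce the relation R and set X of Example 5 in Section III:

```python
# Illustrative sketch of the IF lower/upper approximations of Definition 3.

def if_approximations(mu_R, gamma_R, mu_X, gamma_X):
    """mu_R, gamma_R: n x n similarity/diversity matrices (lists of lists).
    mu_X, gamma_X: length-n membership/nonmembership vectors.
    Returns (lower, upper), each a list of IF numbers (mu, gamma)."""
    n = len(mu_X)
    lower, upper = [], []
    for i in range(n):
        lo_mu = min(max(gamma_R[i][j], mu_X[j]) for j in range(n))
        lo_ga = max(min(mu_R[i][j], gamma_X[j]) for j in range(n))
        up_mu = max(min(mu_R[i][j], mu_X[j]) for j in range(n))
        up_ga = min(max(gamma_R[i][j], gamma_X[j]) for j in range(n))
        lower.append((lo_mu, lo_ga))
        upper.append((up_mu, up_ga))
    return lower, upper

# Data of Example 5: R over U = {x1, x2, x3} and X.
mu_R    = [[1.0, 0.6, 0.3], [0.6, 1.0, 0.4], [0.3, 0.4, 1.0]]
gamma_R = [[0.0, 0.2, 0.6], [0.2, 0.0, 0.5], [0.6, 0.5, 0.0]]
mu_X, gamma_X = [0.5, 0.6, 0.2], [0.3, 0.1, 0.7]
print(if_approximations(mu_R, gamma_R, mu_X, gamma_X))
# lower: [(0.5, 0.3), (0.5, 0.4), (0.2, 0.7)]
# upper: [(0.6, 0.2), (0.6, 0.1), (0.4, 0.5)]
```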

3) IF Decision Information System: Classification is one of the most important tasks in the machine learning field. For dealing with IF information and IF information-related classification, we formally define the notion of IF information systems and examine IF rough approximations in this context.

Definition 4: Let U be the universe and R a family of IF relations. The pair S = (U, R) is called an IF information system. Moreover, if d is a decision attribute, then S = (U, R ∪ {d}) is called an IF decision system.

For processing real datasets, it is possible to construct IF relations from the attributes by applying various methods. Then, the datasets can be transformed to IF information systems for further analysis.

Assume d is a decision attribute, which divides the universe into a family of decision classes U/d = {d1, d2, ..., dl} according to different decision labels. Each decision class can be typically written as an IF set: di = {(μ_{di}(x), γ_{di}(x))/x | x ∈ U}, where

(μ_{di}(x), γ_{di}(x)) = (1, 0) if x ∈ di, and (0, 1) if x ∉ di.

Data often contain redundant information, which is irrelevant for knowledge representation and decision classification. Its removal can save storage space and computational time, and raise learning efficiency. In this section, we describe the use of the IF rough set to reduce irrelevant IF relations and select important ones from IF decision systems.

Given a subset of IF relations B ⊆ R, a special IF relation may be induced from B by taking the interrelationship among the IF relations. We represent the corresponding IF relation w.r.t. B by the symbol B. Note that the induced IF relation varies with different constructive methods. Based on this generalized definition, we examine the knowledge reduction of IF decision systems in the following.

Let S = (U, R ∪ {d}) be an IF decision system with U/d = {d1, d2, ..., dl}. Given a subset B ⊆ R, we can compute the distributions of the IF lower and upper approximations of the decision classes as

B(d) = {B(d1), B(d2), ..., B(dl)}
B̄(d) = {B̄(d1), B̄(d2), ..., B̄(dl)}.

Definition 5: Let S = (U, R ∪ {d}) be an IF decision system with U/d = {d1, d2, ..., dl} and B ⊆ R. Define Pos_B(d) = ∪_{i=1}^{l} B(di) as the IF positive region w.r.t. B. B is called a reduct of S if it satisfies the following:
a) Pos_B(d) = Pos_R(d);
b) Pos_{B−{R}}(d) ⊂ Pos_B(d) ∀R ∈ B.

In Definition 5, the IF positive region is an IF set generated by taking the union of the lower approximations of all decision classes. It reflects the classification ability of any subset B. The larger the IF positive region, the stronger the classification ability of B.

II. GRANULAR STRUCTURES OF INTUITIONISTIC FUZZY APPROXIMATIONS

In this section, we examine the granular structures of the IF rough approximation operators. We first define several types of fuzzy granules and clarify the hierarchies of the lower and upper approximations of an IF set using these granules.

Definition 6: Let U be the universe and R an IF relation. Given a real number λ ∈ [0, 1] and x ∈ U, we define a type-1 fuzzy granule as

[x^λ_γR](y) = λ if γ_R(x, y) < λ, and 0 if γ_R(x, y) ≥ λ.

In Definition 6, the granule [x^λ_γR] is based on the diversity function γ_R of the relation R. Parameter λ influences the size of the fuzzy granule: the larger the value of λ, the larger the fuzzy granule. The lower approximations can be hierarchically described as follows.

Theorem 1: Let U be the universe and R an IF relation. For X ∈ IF and x ∈ U, we have

μ_{R(X)}(x) = sup{λ | [x^λ_γR] ⊆ μ_X}.

Proof:
1) For any x ∈ U, let μ_{R(X)}(x) = λ0. Then, from Definition 3, it holds that max(γ_R(x, y), μ_X(y)) ≥ λ0 for any y ∈ U, and there exists one y0 ∈ U satisfying max(γ_R(x, y0), μ_X(y0)) = λ0. We have that if γ_R(x, y) < λ0, then μ_X(y) ≥ λ0. This means that [x^λ0_γR](y) ≤ μ_X(y) whenever γ_R(x, y) < λ0. On the other hand, if γ_R(x, y) ≥ λ0, then [x^λ0_γR](y) = 0. On the whole, [x^λ0_γR] ⊆ μ_X.
2) Assume that there is some λ1 > λ0 such that [x^λ1_γR] ⊆ μ_X. It follows that [x^λ1_γR](y) ≤ μ_X(y) for any y ∈ U. We know from Definition 6 that the condition γ_R(x, y) < λ1 implies [x^λ1_γR](y) = λ1. With the fact that [x^λ1_γR] ⊆ μ_X, we have that if γ_R(x, y) < λ1, then μ_X(y) ≥ λ1. Subsequently, max(γ_R(x, y), μ_X(y)) ≥ λ1 both when γ_R(x, y) < λ1 and when γ_R(x, y) ≥ λ1, and hence

μ_{R(X)}(x) = inf_{y∈U} max(γ_R(x, y), μ_X(y)) ≥ λ1.

This implies that μ_{R(X)}(x) ≥ λ1 > λ0, which is in contradiction with μ_{R(X)}(x) = λ0. Hence, μ_{R(X)}(x) = sup{λ | [x^λ_γR] ⊆ μ_X, λ ∈ [0, 1]}. We complete the proof. □

From Theorem 1, one can see that the membership of an object x to the set μ_{R(X)} is the maximal real number λ that guarantees that the granule [x^λ_γR] is contained in μ_X. An example is used to examine this idea.
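As a quick numerical check of Theorem 1 over a finite universe, the sup can be approximated by a grid search over λ. A minimal sketch (in Python; grid-search strategy and names are illustrative assumptions, not from the paper), using the data of Example 5:

```python
# Illustrative sketch: mu_{R(X)}(x) = sup{lambda : [x^lambda_gammaR] <= mu_X}.

def type1_granule(gamma_row, lam):
    """[x^lam_gammaR](y) = lam if gamma_R(x, y) < lam else 0."""
    return [lam if g < lam else 0.0 for g in gamma_row]

def lower_mu_by_granules(gamma_row, mu_X, step=0.01):
    """Largest lambda on a grid whose granule is contained in mu_X."""
    best, lam = 0.0, 0.0
    while lam <= 1.0:
        granule = type1_granule(gamma_row, lam)
        if all(g <= m for g, m in zip(granule, mu_X)):
            best = lam
        lam = round(lam + step, 10)
    return best

# Row of gamma_R for x1 and mu_X from Example 5.
gamma_row_x1 = [0.0, 0.2, 0.6]
mu_X = [0.5, 0.6, 0.2]
print(lower_mu_by_granules(gamma_row_x1, mu_X))  # 0.5, matching Example 1
```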

Example 1 (continued from Example 5 in Section III): In Example 5, we obtain that μ_{R(X)} = 0.5/x1 + 0.5/x2 + 0.2/x3. From Definition 6, we have that

[x1^0.5_γR] = 0.5/x1 + 0.5/x2 + 0/x3
[x2^0.5_γR] = 0.5/x1 + 0.5/x2 + 0/x3
[x3^0.2_γR] = 0/x1 + 0/x2 + 0.2/x3.

Since μ_X = 0.5/x1 + 0.6/x2 + 0.2/x3, we can see that [x1^0.5_γR] ⊆ μ_X, [x2^0.5_γR] ⊆ μ_X, and [x3^0.2_γR] ⊆ μ_X.

Moreover, if we take λ > 0.5, it holds that [x1^λ_γR](x1) = λ > μ_X(x1) and [x2^λ_γR](x1) = λ > μ_X(x1), from which it follows that neither [x1^λ_γR] ⊆ μ_X nor [x2^λ_γR] ⊆ μ_X holds. If λ > 0.2, it follows that [x3^λ_γR](x3) = λ > μ_X(x3), so [x3^λ_γR] ⊆ μ_X does not hold.

By these analyses, we assert that μ_{R(X)} = 0.5/x1 + 0.5/x2 + 0.2/x3. This is in accordance with Theorem 1.

Definition 7: Let U be the universe and R an IF relation. Given a real number λ ∈ [0, 1] and x ∈ U, define a type-2 fuzzy granule as

[x^λ_μR](y) = λ if μ_R(x, y) > λ, and 1 if μ_R(x, y) ≤ λ.

The granule [x^λ_μR] is based on the similarity function μ_R of the relation R and the parameter λ: the larger the value of λ, the larger the fuzzy granule. We have the following theorem.

Theorem 2: Let U be the universe and R an IF relation. For X ∈ IF and x ∈ U, we have

γ_{R(X)}(x) = inf{λ | [x^λ_μR] ⊇ γ_X}.

Proof:
1) For any x ∈ U, let γ_{R(X)}(x) = λ0. Then, from Definition 3, it holds that min(μ_R(x, y), γ_X(y)) ≤ λ0 for any y ∈ U, and there exists one y0 ∈ U satisfying min(μ_R(x, y0), γ_X(y0)) = λ0. We have that if μ_R(x, y) > λ0, then γ_X(y) ≤ λ0. This means that [x^λ0_μR](y) ≥ γ_X(y) whenever μ_R(x, y) > λ0. On the other hand, if μ_R(x, y) ≤ λ0, then [x^λ0_μR](y) = 1. On the whole, [x^λ0_μR] ⊇ γ_X.
2) Assume that there is some λ1 < λ0 such that [x^λ1_μR] ⊇ γ_X. It follows that [x^λ1_μR](y) ≥ γ_X(y) for any y ∈ U. We know from Definition 7 that the condition μ_R(x, y) > λ1 implies [x^λ1_μR](y) = λ1. With the fact that [x^λ1_μR] ⊇ γ_X, we have that if μ_R(x, y) > λ1, then γ_X(y) ≤ λ1. Subsequently, min(μ_R(x, y), γ_X(y)) ≤ λ1 both when μ_R(x, y) > λ1 and when μ_R(x, y) ≤ λ1, and hence

γ_{R(X)}(x) = sup_{y∈U} min(μ_R(x, y), γ_X(y)) ≤ λ1.

This implies that γ_{R(X)}(x) ≤ λ1 < λ0, which is in contradiction with γ_{R(X)}(x) = λ0. Hence, γ_{R(X)}(x) = inf{λ | [x^λ_μR] ⊇ γ_X}. We complete the proof. □

From Theorem 2, one can see that the value of an object x in the set γ_{R(X)} is the minimal real number λ that guarantees that the granule [x^λ_μR] contains γ_X. An example is used to illustrate this idea.

Example 2 (continued from Example 5): Example 5 indicates that γ_{R(X)} = 0.3/x1 + 0.4/x2 + 0.7/x3. From Definition 7, we have that

[x1^0.3_μR] = 0.3/x1 + 0.3/x2 + 1/x3
[x2^0.4_μR] = 0.4/x1 + 0.4/x2 + 1/x3
[x3^0.7_μR] = 1/x1 + 1/x2 + 0.7/x3.

Since γ_X = 0.3/x1 + 0.1/x2 + 0.7/x3, we can see that [x1^0.3_μR] ⊇ γ_X, [x2^0.4_μR] ⊇ γ_X, and [x3^0.7_μR] ⊇ γ_X.

Moreover, if λ < 0.3, it holds that [x1^λ_μR](x3) = λ < γ_X(x3), from which it follows that [x1^λ_μR] ⊇ γ_X does not hold. If λ < 0.4, we have that [x2^λ_μR](x3) = λ < γ_X(x3), so [x2^λ_μR] ⊇ γ_X does not hold. If λ < 0.7, it follows that [x3^λ_μR](x3) = λ < γ_X(x3). Thus, [x3^λ_μR] ⊇ γ_X does not hold. This is in accordance with the conclusion in Theorem 2.

From the examples mentioned above, we see that the IF lower approximations can be characterized and computed using fuzzy information granules. We next examine the IF upper approximations. The following theorems can be similarly verified.

Theorem 3: μ_{R̄(X)}(x) = inf{λ | [x^λ_μR] ⊇ μ_X}.

Proof: The proof is similar to that of Theorem 2. □

Example 3 (continued from Example 5): In Example 5, we have that μ_{R̄(X)} = 0.6/x1 + 0.6/x2 + 0.4/x3. We now use Theorem 3 to examine μ_{R̄(X)}. From Definition 7, we have that

[x1^0.6_μR] = 0.6/x1 + 1/x2 + 1/x3
[x2^0.6_μR] = 1/x1 + 0.6/x2 + 1/x3
[x3^0.4_μR] = 1/x1 + 1/x2 + 0.4/x3.

Since μ_X = 0.5/x1 + 0.6/x2 + 0.2/x3, we can see that [x1^0.6_μR] ⊇ μ_X, [x2^0.6_μR] ⊇ μ_X, and [x3^0.4_μR] ⊇ μ_X.

Moreover, if we take λ < 0.6, it holds that [x1^λ_μR](x2) = λ < μ_X(x2) and [x2^λ_μR](x2) = λ < μ_X(x2), from which it follows that neither [x1^λ_μR] ⊇ μ_X nor [x2^λ_μR] ⊇ μ_X holds. If λ < 0.4, it holds that [x3^λ_μR](x2) = λ < μ_X(x2), so [x3^λ_μR] ⊇ μ_X does not hold.

Theorem 3 thus implies that μ_{R̄(X)}(x1) = 0.6, μ_{R̄(X)}(x2) = 0.6, and μ_{R̄(X)}(x3) = 0.4. Consequently, μ_{R̄(X)} = 0.6/x1 + 0.6/x2 + 0.4/x3.

Theorem 4: γ_{R̄(X)}(x) = sup{λ | [x^λ_γR] ⊆ γ_X}.

Proof: The proof is similar to that of Theorem 1. □

Example 4 (continued from Example 5): From Example 5, we obtain that γ_{R̄(X)} = 0.2/x1 + 0.1/x2 + 0.5/x3. From Definition 6, we compute that

[x1^0.2_γR] = 0.2/x1 + 0/x2 + 0/x3
[x2^0.1_γR] = 0/x1 + 0.1/x2 + 0/x3
[x3^0.5_γR] = 0/x1 + 0/x2 + 0.5/x3.

Since γ_X = 0.3/x1 + 0.1/x2 + 0.7/x3, we can see that [x1^0.2_γR] ⊆ γ_X, [x2^0.1_γR] ⊆ γ_X, and [x3^0.5_γR] ⊆ γ_X.

If λ > 0.2, it holds that [x1^λ_γR](x2) = λ > γ_X(x2), from which it follows that [x1^λ_γR] ⊆ γ_X does not hold. If λ > 0.1, we have [x2^λ_γR](x2) = λ > γ_X(x2). This indicates that [x2^λ_γR] ⊆ γ_X does not hold. If λ > 0.5, it holds that [x3^λ_γR](x2) = λ > γ_X(x2), so [x3^λ_γR] ⊆ γ_X does not hold.

Theorem 4 thus indicates that γ_{R̄(X)}(x1) = 0.2, γ_{R̄(X)}(x2) = 0.1, and γ_{R̄(X)}(x3) = 0.5. Consequently, γ_{R̄(X)} = 0.2/x1 + 0.1/x2 + 0.5/x3.

In this section, we presented the construction of two types of fuzzy granules based on IF relations, and the hierarchical structures of the IF rough lower and upper approximations were then characterized. In this sense, the lower and upper approximations of an IF set can be computed by granular computing-based methods.

III. COMPUTATION OF INTUITIONISTIC FUZZY APPROXIMATIONS

The computation of IF approximations is a basic issue for the further application of the IF rough set. We employ the following example to illustrate the acquisition of the rough approximations of an IF set.

Example 5: Assume U = {x1, x2, x3} and the IF relation R is

R = [ (1, 0)      (0.6, 0.2)  (0.3, 0.6)
      (0.6, 0.2)  (1, 0)      (0.4, 0.5)
      (0.3, 0.6)  (0.4, 0.5)  (1, 0)  ].

Let X = (0.5, 0.3)/x1 + (0.6, 0.1)/x2 + (0.2, 0.7)/x3. We now compute the lower and upper approximations of X. We have that

μ_{R(X)}(x1) = inf{max(0, 0.5), max(0.2, 0.6), max(0.6, 0.2)} = 0.5
μ_{R(X)}(x2) = inf{max(0.2, 0.5), max(0, 0.6), max(0.5, 0.2)} = 0.5
μ_{R(X)}(x3) = inf{max(0.6, 0.5), max(0.5, 0.6), max(0, 0.2)} = 0.2

and

γ_{R(X)}(x1) = sup{min(1, 0.3), min(0.6, 0.1), min(0.3, 0.7)} = 0.3
γ_{R(X)}(x2) = sup{min(0.6, 0.3), min(1, 0.1), min(0.4, 0.7)} = 0.4
γ_{R(X)}(x3) = sup{min(0.3, 0.3), min(0.4, 0.1), min(1, 0.7)} = 0.7.

This means that R(X) = (0.5, 0.3)/x1 + (0.5, 0.4)/x2 + (0.2, 0.7)/x3.

Moreover,

μ_{R̄(X)}(x1) = sup{min(1, 0.5), min(0.6, 0.6), min(0.3, 0.2)} = 0.6
μ_{R̄(X)}(x2) = sup{min(0.6, 0.5), min(1, 0.6), min(0.4, 0.2)} = 0.6
μ_{R̄(X)}(x3) = sup{min(0.3, 0.5), min(0.4, 0.6), min(1, 0.2)} = 0.4.

Also,

γ_{R̄(X)}(x1) = inf{max(0, 0.3), max(0.2, 0.1), max(0.6, 0.7)} = 0.2
γ_{R̄(X)}(x2) = inf{max(0.2, 0.3), max(0, 0.1), max(0.5, 0.7)} = 0.1
γ_{R̄(X)}(x3) = inf{max(0.6, 0.3), max(0.5, 0.1), max(0, 0.7)} = 0.5.

Subsequently, R̄(X) = (0.6, 0.2)/x1 + (0.6, 0.1)/x2 + (0.4, 0.5)/x3. We clearly see that R(X) ⊆ X ⊆ R̄(X).

The computation of IF approximations is a difficult task because it requires a full run through all the IF numbers in the IF relation. To facilitate this process, we examine the special properties of IF approximations in IF decision systems. We arrive at the following statements for the IF rough approximations of decision classes.

Theorem 5: Let S = (U, R ∪ {d}) be an IF decision system with R ∈ R and U/d = {d1, d2, ..., dl}. For any 1 ≤ i ≤ l, we have

μ_{R(di)}(x) = inf_{y∉di} γ_R(x, y)
γ_{R(di)}(x) = sup_{y∉di} μ_R(x, y)
μ_{R̄(di)}(x) = sup_{y∈di} μ_R(x, y)
γ_{R̄(di)}(x) = inf_{y∈di} γ_R(x, y).

Proof:
1) From Definition 3, we have that μ_{R(di)}(x) = inf_{y∈U} max(γ_R(x, y), μ_{di}(y)). If y ∈ di, then μ_{di}(y) = 1, which leads to max(γ_R(x, y), μ_{di}(y)) = max(γ_R(x, y), 1) = 1. Consequently, μ_{R(di)}(x) = inf_{y∉di} max(γ_R(x, y), μ_{di}(y)) = inf_{y∉di} max(γ_R(x, y), 0) = inf_{y∉di} γ_R(x, y).
2) It holds that γ_{R(di)}(x) = sup_{y∈U} min(μ_R(x, y), γ_{di}(y)). If y ∈ di, then γ_{di}(y) = 0, which leads to min(μ_R(x, y), γ_{di}(y)) = min(μ_R(x, y), 0) = 0. We have that γ_{R(di)}(x) = sup_{y∉di} min(μ_R(x, y), γ_{di}(y)) = sup_{y∉di} min(μ_R(x, y), 1) = sup_{y∉di} μ_R(x, y).
3) The fact that y ∉ di implies μ_{di}(y) = 0 and min(μ_R(x, y), μ_{di}(y)) = 0. Hence, μ_{R̄(di)}(x) = sup_{y∈U} min(μ_R(x, y), μ_{di}(y)) = sup_{y∈di} min(μ_R(x, y), 1) = sup_{y∈di} μ_R(x, y).
4) The fact that y ∉ di implies γ_{di}(y) = 1 and max(γ_R(x, y), γ_{di}(y)) = 1. Subsequently, γ_{R̄(di)}(x) = inf_{y∈di} max(γ_R(x, y), γ_{di}(y)) = inf_{y∈di} max(γ_R(x, y), 0) = inf_{y∈di} γ_R(x, y). □

Theorem 5 intuitively indicates the following.
1) The lower membership degree of an object x to a decision class di is the minimum diversity between x and the objects outside the decision class.
2) The lower nonmembership degree of an object to a decision class is the maximal similarity between x and the objects outside the decision class.
3) The upper membership degree of an object to a decision class is the maximal similarity between x and the objects in the same decision class.
4) The upper nonmembership degree of an object to a decision class is the minimum diversity between x and the objects in the same decision class.

These conclusions present a simpler means of calculating the IF approximations of decision classes.
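Over a finite universe, Theorem 5 turns each approximation into a single pass over the objects inside or outside the class. A minimal sketch (in Python; names and the boolean-membership encoding are illustrative assumptions, not the authors' code):

```python
# Illustrative sketch of Theorem 5 for one decision class d_i.

def class_approximations(mu_R, gamma_R, members):
    """members[j] is True iff x_j belongs to d_i.
    Returns (lower, upper): per-object IF numbers (mu, gamma)."""
    n = len(members)
    lower, upper = [], []
    for i in range(n):
        outside = [j for j in range(n) if not members[j]]
        inside = [j for j in range(n) if members[j]]
        lo_mu = min(gamma_R[i][j] for j in outside) if outside else 1.0
        lo_ga = max(mu_R[i][j] for j in outside) if outside else 0.0
        up_mu = max(mu_R[i][j] for j in inside) if inside else 0.0
        up_ga = min(gamma_R[i][j] for j in inside) if inside else 1.0
        lower.append((lo_mu, lo_ga))
        upper.append((up_mu, up_ga))
    return lower, upper

# R1 of Example 6 with the class d2 = {x2, x3}:
mu_R1    = [[1.0, 0.6, 0.3], [0.6, 1.0, 0.4], [0.3, 0.4, 1.0]]
gamma_R1 = [[0.0, 0.2, 0.6], [0.2, 0.0, 0.5], [0.6, 0.5, 0.0]]
lower, upper = class_approximations(mu_R1, gamma_R1, [False, True, True])
# lower = [(0.0, 1.0), (0.2, 0.6), (0.6, 0.3)] -- cf. R1(d2) in Example 7
```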

In particular, if an object does (or does not) belong to a decision class, the rough approximations are directly obtained as follows.

Property 3: Let S = (U, R ∪ {d}) be an IF decision system with R ∈ R and U/d = {d1, d2, ..., dl}. For any 1 ≤ i ≤ l, if x ∉ di, then

μ_{R(di)}(x) = 0, γ_{R(di)}(x) = 1.

Proof:
1) With the fact that γ_R(x, x) = 0 and x ∉ di, it follows from Theorem 5 that μ_{R(di)}(x) = inf_{y∉di} γ_R(x, y) = γ_R(x, x) = 0.
2) With the fact that μ_R(x, x) = 1 and x ∉ di, we have that γ_{R(di)}(x) = sup_{y∉di} μ_R(x, y) = μ_R(x, x) = 1. □

Property 3 means that, if x ∉ di, the lower membership degree and lower nonmembership degree are fixed to 0 and 1, respectively.

Property 4: Let S = (U, R ∪ {d}) be an IF decision system with R ∈ R and U/d = {d1, d2, ..., dl}. For any 1 ≤ i ≤ l, if x ∈ di, then

μ_{R̄(di)}(x) = 1, γ_{R̄(di)}(x) = 0.

Proof:
1) We have that μ_R(x, x) = 1. From Theorem 5, it holds that μ_{R̄(di)}(x) = sup_{y∈di} μ_R(x, y) = μ_R(x, x) = 1.
2) Since γ_R(x, x) = 0, from Theorem 5, we see that γ_{R̄(di)}(x) = inf_{y∈di} γ_R(x, y) = γ_R(x, x) = 0. □

Property 4 shows that if x ∈ di, the upper membership degree and upper nonmembership degree are fixed to 1 and 0, respectively.

From Theorem 5 and Properties 3 and 4, we find that the IF rough approximations in IF decision systems satisfy special properties, which simplify the process of computation. In other words, if we want to obtain the rough approximations of the decision classes, we need only to compute the lower approximations of an object w.r.t. its own decision class, i.e., μ_{R(di)}(x) and γ_{R(di)}(x) for x ∈ di, and the upper approximations of an object w.r.t. other decision classes, i.e., μ_{R̄(di)}(x) and γ_{R̄(di)}(x) for x ∉ di.

IV. KNOWLEDGE REDUCTION OF INTUITIONISTIC FUZZY DECISION SYSTEMS

To measure the IF positive region generated by any subset B ⊆ R, define a dependence function f(B) = p(Pos_B(d)) as the probability quality of the IF positive region, i.e., the ratio of the size of the IF positive region to that of the universe. The reduct of IF decision systems can also be defined from the viewpoint of the dependence function as follows:
1) f(B) = f(R);
2) f(B − {R}) < f(B) ∀R ∈ B.

The following conclusions present a simplified method for computing the IF positive region and the dependence function.

Property 5: Let S = (U, R ∪ {d}) be an IF decision system with U/d = {d1, d2, ..., dl} and B ⊆ R. Then
1) Pos_B(d)(x) = B(di0)(x), where x ∈ di0;
2) f(B) = Σ_{i=1}^{l} Σ_{x∈di} p(B(di)(x)); and
3) f(B) = 1/2 + (1/(2|U|)) Σ_{i=1}^{l} Σ_{x∈di} (μ_{B(di)}(x) − γ_{B(di)}(x)).

Proof:
1) From Property 3, if x ∉ di, then μ_{B(di)}(x) = 0 and γ_{B(di)}(x) = 1. It follows that Pos_B(d)(x) = (∪_{i=1}^{l} B(di))(x) = B(di0)(x) for x ∈ di0.
2) It holds that f(B) = p(Pos_B(d)) = Σ_{x∈U} p(Pos_B(d)(x)). Combining with 1), we have that f(B) = Σ_{x∈U} p(Pos_B(d)(x)) = Σ_{i=1}^{l} Σ_{x∈di} p(B(di)(x)).
3) With the fact that p(B(di)(x)) = (1/|U|) · (1 + μ_{B(di)}(x) − γ_{B(di)}(x))/2, it follows that

f(B) = Σ_{i=1}^{l} Σ_{x∈di} p(B(di)(x))
     = (1/|U|) Σ_{i=1}^{l} Σ_{x∈di} (1 + μ_{B(di)}(x) − γ_{B(di)}(x))/2
     = 1/2 + (1/(2|U|)) Σ_{i=1}^{l} Σ_{x∈di} (μ_{B(di)}(x) − γ_{B(di)}(x)),

where the leading term 1/2 arises because the decision classes partition U, so Σ_{i=1}^{l} Σ_{x∈di} 1 = |U|. □

It is observed from Property 5 that if we want to obtain the IF positive region, we need only to compute the lower approximation of each object w.r.t. its own decision class.
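Property 5(3) makes the dependence function directly computable from the induced relation, without materializing the full approximations. A minimal sketch (in Python; representation and names are illustrative assumptions, not from the paper); the data reproduce the induced relation R = ∩R and decision classes of Example 6 below:

```python
# Illustrative sketch of Property 5(3):
# f(B) = 1/2 + (1/(2|U|)) * sum over each object x of
#        (mu_{B(di)}(x) - gamma_{B(di)}(x)) for x's own class di,
# with the lower approximation taken from Theorem 5.

def dependence(mu_B, gamma_B, labels):
    """mu_B, gamma_B: n x n matrices of the induced IF relation B;
    labels: length-n class labels. Returns f(B) = p(Pos_B(d))."""
    n = len(labels)
    acc = 0.0
    for i in range(n):
        outside = [j for j in range(n) if labels[j] != labels[i]]
        mu_lo = min((gamma_B[i][j] for j in outside), default=1.0)
        ga_lo = max((mu_B[i][j] for j in outside), default=0.0)
        acc += mu_lo - ga_lo
    return 0.5 + acc / (2 * n)

# Induced relation R = ∩R of Example 6, classes d1 = {x1}, d2 = {x2, x3}:
mu_B    = [[1.0, 0.5, 0.2], [0.5, 1.0, 0.4], [0.2, 0.4, 1.0]]
gamma_B = [[0.0, 0.3, 0.7], [0.3, 0.0, 0.5], [0.7, 0.5, 0.0]]
print(round(dependence(mu_B, gamma_B, ["d1", "d2", "d2"]), 4))  # 0.5167
```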

The following example is employed to exhibit the idea of Property 5.

Example 6: Let S = (U, R ∪ {d}) be an IF decision system, where U = {x1, x2, x3} and R = {Ri | 1 ≤ i ≤ 4}:

R1 = [ (1, 0)      (0.6, 0.2)  (0.3, 0.6)
       (0.6, 0.2)  (1, 0)      (0.4, 0.5)
       (0.3, 0.6)  (0.4, 0.5)  (1, 0)  ]

R2 = [ (1, 0)      (0.7, 0.3)  (0.4, 0.4)
       (0.7, 0.3)  (1, 0)      (0.5, 0.3)
       (0.4, 0.4)  (0.5, 0.3)  (1, 0)  ]

R3 = [ (1, 0)      (1, 0)      (0.5, 0.5)
       (1, 0)      (1, 0)      (0.4, 0.4)
       (0.5, 0.5)  (0.4, 0.4)  (1, 0)  ]

R4 = [ (1, 0)      (0.5, 0.1)  (0.2, 0.7)
       (0.5, 0.1)  (1, 0)      (0.8, 0.1)
       (0.2, 0.7)  (0.8, 0.1)  (1, 0)  ].

Denote a special IF relation by B = ∩{R | R ∈ B} w.r.t. any subset B ⊆ R, that is, μ_B(x, y) = min_{R∈B} μ_R(x, y) and γ_B(x, y) = max_{R∈B} γ_R(x, y). Clearly, this intersection is a commonly used construction in the existing literature. Consequently,

R = ∩R = [ (1, 0)      (0.5, 0.3)  (0.2, 0.7)
           (0.5, 0.3)  (1, 0)      (0.4, 0.5)
           (0.2, 0.7)  (0.4, 0.5)  (1, 0)  ].

The decision is defined as U/d = {d1, d2}, where d1 = {x1} and d2 = {x2, x3}.

From Property 3, we directly obtain that

μ_{R(d1)}(x2) = 0, γ_{R(d1)}(x2) = 1
μ_{R(d1)}(x3) = 0, γ_{R(d1)}(x3) = 1
μ_{R(d2)}(x1) = 0, γ_{R(d2)}(x1) = 1.

From Theorem 5, we compute that

μ_{R(d1)}(x1) = inf_{xi∉d1} γ_R(x1, xi) = inf(γ_R(x1, x2), γ_R(x1, x3)) = inf(0.3, 0.7) = 0.3
γ_{R(d1)}(x1) = sup_{xi∉d1} μ_R(x1, xi) = sup(μ_R(x1, x2), μ_R(x1, x3)) = sup(0.5, 0.2) = 0.5
μ_{R(d2)}(x2) = inf_{xi∉d2} γ_R(x2, xi) = inf(γ_R(x2, x1)) = inf(0.3) = 0.3
γ_{R(d2)}(x2) = sup_{xi∉d2} μ_R(x2, xi) = sup(μ_R(x2, x1)) = sup(0.5) = 0.5
μ_{R(d2)}(x3) = inf_{xi∉d2} γ_R(x3, xi) = inf(γ_R(x3, x1)) = inf(0.7) = 0.7
γ_{R(d2)}(x3) = sup_{xi∉d2} μ_R(x3, xi) = sup(μ_R(x3, x1)) = sup(0.2) = 0.2.

To sum up,

R(d1) = (0.3, 0.5)/x1 + (0, 1)/x2 + (0, 1)/x3
R(d2) = (0, 1)/x1 + (0.3, 0.5)/x2 + (0.7, 0.2)/x3.

The IF positive region w.r.t. R is

Pos_R(d) = R(d1) ∪ R(d2) = (0.3, 0.5)/x1 + (0.3, 0.5)/x2 + (0.7, 0.2)/x3.

The probability quality of the IF positive region is

p(Pos_R(d)) = |Pos_R(d)|/|U|
            = (1/3)((1 + 0.3 − 0.5)/2 + (1 + 0.3 − 0.5)/2 + (1 + 0.7 − 0.2)/2)
            ≈ 0.52.

Theorem 6: Let S = (U, R ∪ {d}) be an IF decision system with A, B ⊆ R. If B ⪯ A, then
1) Pos_A(d) ⊆ Pos_B(d);
2) f(A) ≤ f(B).

Proof:
1) Since B ⪯ A, it holds from Property 2 that A(di) ⊆ B(di) for any di. This implies that Pos_A(d) ⊆ Pos_B(d).
2) It can be concluded from 1) and Property 1. □

It is noted from Theorem 6 that the IF positive region is monotonic with the granularity of the induced IF relation. In other words, the finer the IF relation induced by a subset of IF relations, the larger the IF positive region. A subset of IF relations may maintain the entire positive region. From this viewpoint, the notion of reduct can be introduced.

The dependence function reflects the classification ability of IF relations and can be used to evaluate their significance. Adding different IF relations to a given subset of IF relations may lead to a variety of changes in the dependence function. Accordingly, the significance of IF relations can be defined as follows.

Definition 8: Let S = (U, R ∪ {d}) be an IF decision system with B ⊆ R and R ∈ R − B. The significance of R w.r.t. B is defined as Sig(R, B, d) = f(B ∪ {R}) − f(B).

Based on the above-mentioned discussion, a forward heuristic algorithm for finding a reduct of an IF decision system is formulated as shown in Algorithm 1.

Algorithm 1: A Heuristic Algorithm for Finding a Reduct of an IF Decision System Based on the IF Positive Region (Algorithm IFPR).

Input: An IF decision system (U, R ∪ {d});
Output: An IF relation reduct B.
1: Let B = ∅;
2: For each IF relation R ∈ R
3:     Compute the IF relation B ∪ R;
4:     Compute the IF approximations μ_{B∪R(di)}(x) and γ_{B∪R(di)}(x) for each x ∈ di according to Theorem 5;
5:     Compute the dependence functions f(B) and f(B ∪ {R}) according to Property 5(3);
6:     Compute the significance Sig(R, B, d) according to Definition 8;
7: End For
8: Find R0 with maximum value Sig(R0, B, d);
9: If Sig(R0, B, d) > δ
10:     Let B ← B ∪ {R0} and R ← R − {R0};
11:     Return to Step 2;
12: Else
13:     Output B and terminate the algorithm.
14: End If

The parameter δ is used to stop the loop in the algorithm. The larger the value of δ, the smaller the number of selected IF relations. Therefore, the value of δ should be chosen in an appropriate range according to the real data. Moreover, the computation of the IF relation B ∪ R in Step 3 adopts an incremental method. That is, we store the IF relation B from the previous loop and take its intersection with the relation R, instead of recalculating the IF relation B ∪ R from scratch. This strategy avoids a considerable amount of unnecessary time-consuming calculation.

In Step 4, the computation of the IF approximations for every object can be completed in O(|U|²). In Step 5, the dependence functions can be obtained within O(|U|²). In Steps 2–8, the elements of R are evaluated by using the significance measure, and their complexity is O(|U|²|R|). The overall time complexity of the algorithm is thus O(|U|²|R|).
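The following is a minimal Python sketch of Algorithm IFPR (illustrative names; not the authors' MATLAB implementation), with the incremental intersection of Step 3 and the dependence function of Property 5. On the data of Example 6 with δ = 0, it selects R4 and then R2 and stops, matching the trace of Example 7 below.

```python
# Illustrative sketch of Algorithm IFPR. An IF relation is a pair
# (mu, gamma) of n x n matrices; a subset of relations is combined by
# intersection (min of mu, max of gamma).

def intersect(rel_a, rel_b):
    """Intersection of two IF relations: min memberships, max nonmemberships."""
    (mu_a, ga_a), (mu_b, ga_b) = rel_a, rel_b
    n = len(mu_a)
    mu = [[min(mu_a[i][j], mu_b[i][j]) for j in range(n)] for i in range(n)]
    ga = [[max(ga_a[i][j], ga_b[i][j]) for j in range(n)] for i in range(n)]
    return mu, ga

def dependence(rel, labels):
    """f(B) via Theorem 5 and Property 5 (repeated here so the block runs
    standalone): lower approximation of each object w.r.t. its own class."""
    mu, ga = rel
    n = len(labels)
    acc = 0.0
    for i in range(n):
        outside = [j for j in range(n) if labels[j] != labels[i]]
        mu_lo = min((ga[i][j] for j in outside), default=1.0)
        ga_lo = max((mu[i][j] for j in outside), default=0.0)
        acc += mu_lo - ga_lo
    return 0.5 + acc / (2 * n)

def ifpr(relations, labels, delta=0.0):
    """Forward selection: add the relation of maximal significance
    Sig(R, B, d) = f(B union {R}) - f(B) while it exceeds delta."""
    remaining = dict(relations)        # name -> (mu, gamma)
    current = None                     # induced relation of B (incremental)
    f_current, reduct = 0.0, []
    while remaining:
        scored = {name: dependence(rel if current is None
                                   else intersect(current, rel), labels)
                  for name, rel in remaining.items()}
        best = max(scored, key=scored.get)
        if scored[best] - f_current <= delta:
            break                      # no candidate is significant enough
        current = (remaining[best] if current is None
                   else intersect(current, remaining[best]))
        f_current = scored[best]
        reduct.append(best)
        del remaining[best]
    return reduct
```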

Let us employ an example to show the idea of the new algorithm.

Example 7 (continued from Example 6): We take the value δ = 0 in Algorithm 1 for illustration. We first initialize B = ∅ and compute the IF approximations under each Ri (1 ≤ i ≤ 4), respectively, in what follows. We obtain that

μ_{R1(d1)}(x1) = inf_{xi∉d1} γ_{R1}(x1, xi) = inf(γ_{R1}(x1, x2), γ_{R1}(x1, x3)) = inf(0.2, 0.6) = 0.2
γ_{R1(d1)}(x1) = sup_{xi∉d1} μ_{R1}(x1, xi) = sup(μ_{R1}(x1, x2), μ_{R1}(x1, x3)) = sup(0.6, 0.3) = 0.6
μ_{R1(d2)}(x2) = inf_{xi∉d2} γ_{R1}(x2, xi) = inf(γ_{R1}(x2, x1)) = inf(0.2) = 0.2
γ_{R1(d2)}(x2) = sup_{xi∉d2} μ_{R1}(x2, xi) = sup(μ_{R1}(x2, x1)) = sup(0.6) = 0.6
μ_{R1(d2)}(x3) = inf_{xi∉d2} γ_{R1}(x3, xi) = inf(γ_{R1}(x3, x1)) = inf(0.6) = 0.6
γ_{R1(d2)}(x3) = sup_{xi∉d2} μ_{R1}(x3, xi) = sup(μ_{R1}(x3, x1)) = sup(0.3) = 0.3.

Hence,

R1(d1) = (0.2, 0.6)/x1 + (0, 1)/x2 + (0, 1)/x3
R1(d2) = (0, 1)/x1 + (0.2, 0.6)/x2 + (0.6, 0.3)/x3.

The IF positive region w.r.t. R1 is

Pos_{R1}(d) = R1(d1) ∪ R1(d2) = (0.2, 0.6)/x1 + (0.2, 0.6)/x2 + (0.6, 0.3)/x3.

The dependence function of the IF positive region w.r.t. R1 is

f(R1) = p(Pos_{R1}(d)) = (1/3)((1 + 0.2 − 0.6)/2 + (1 + 0.2 − 0.6)/2 + (1 + 0.6 − 0.3)/2) ≈ 0.42.

Similarly,

R2(d1) = (0.3, 0.7)/x1 + (0, 1)/x2 + (0, 1)/x3
R2(d2) = (0, 1)/x1 + (0.3, 0.7)/x2 + (0.4, 0.4)/x3.

The IF positive region w.r.t. R2 is

Pos_{R2}(d) = R2(d1) ∪ R2(d2) = (0.3, 0.7)/x1 + (0.3, 0.7)/x2 + (0.4, 0.4)/x3.

The dependence function of the IF positive region w.r.t. R2 is f(R2) = p(Pos_{R2}(d)) ≈ 0.37.

In addition,

R3(d1) = (0, 1)/x1 + (0, 1)/x2 + (0, 1)/x3
R3(d2) = (0, 1)/x1 + (0, 1)/x2 + (0.5, 0.5)/x3.

The IF positive region w.r.t. R3 is

Pos_{R3}(d) = R3(d1) ∪ R3(d2) = (0, 1)/x1 + (0, 1)/x2 + (0.5, 0.5)/x3.

The dependence function of the IF positive region w.r.t. R3 is f(R3) = p(Pos_{R3}(d)) ≈ 0.17.

Additionally,

R4(d1) = (0.1, 0.5)/x1 + (0, 1)/x2 + (0, 1)/x3
R4(d2) = (0, 1)/x1 + (0.1, 0.5)/x2 + (0.7, 0.2)/x3.

The IF positive region w.r.t. R4 is

Pos_{R4}(d) = R4(d1) ∪ R4(d2) = (0.1, 0.5)/x1 + (0.1, 0.5)/x2 + (0.7, 0.2)/x3.

The dependence function of the IF positive region w.r.t. R4 is f(R4) = p(Pos_{R4}(d)) = 0.45.

By the above analyses, we further obtain that

Sig(R1, B, d) ≈ 0.42, Sig(R2, B, d) ≈ 0.37
Sig(R3, B, d) ≈ 0.17, Sig(R4, B, d) = 0.45.

From Steps 8–10, we set B = {R4} and R = {R1, R2, R3}. Since Sig(R4, B, d) > 0, we go to the next cycle.

It is calculated that

Pos_{B∪R1}(d) = (0.2, 0.5)/x1 + (0.2, 0.5)/x2 + (0.7, 0.2)/x3
Pos_{B∪R2}(d) = (0.3, 0.5)/x1 + (0.3, 0.5)/x2 + (0.7, 0.2)/x3
Pos_{B∪R3}(d) = (0.1, 0.5)/x1 + (0.1, 0.5)/x2 + (0.7, 0.2)/x3.

Consequently,

Sig(R1, B, d) ≈ 0.48 − 0.45 = 0.03
Sig(R2, B, d) ≈ 0.52 − 0.45 = 0.07
Sig(R3, B, d) = 0.45 − 0.45 = 0.

Hence, B = {R4, R2} and R = {R1, R3}. Since Sig(R2, B, d) > 0, we next go to the third cycle.

In the third cycle, Sig(R1, B, d) = 0 and Sig(R3, B, d) = 0, so the algorithm terminates and the IF reduct B = {R4, R2} is returned.

    V. NUMERICAL EXPERIMENTS

In a sequence of experiments, we assess the performance of the proposed algorithm, IFPR, and compare it with some existing attribute reduction algorithms. The experiments are set up as follows.

1) The experiments are performed on a personal computer with Windows XP and an Intel(R) Core(TM) i5-2400 CPU @ 3.10 GHz with 4 GB of memory. The algorithms are implemented using MATLAB 7.8. Datasets downloaded from the UCI machine learning repository [60] and the Kent Ridge Bio-medical dataset repository [61] are used for the experimental analysis; they are described in Table I.

2) Before reduction, the datasets are transformed into IF decision systems. For convenience, we adopt a simple method based on the similarity degree of objects. For any attribute a, the similarity degree between objects x and y is defined as [62]

R_a(x, y) = max( min( (a(y) − a(x) + σ)/σ, (a(x) − a(y) + σ)/σ ), 0 )

where σ² is the variance of attribute a. For any attribute set B, the corresponding IF relation R_B is defined as R_B = {⟨(x, y), μ_{R_B}(x, y), γ_{R_B}(x, y)⟩ | (x, y) ∈ U × U}, where μ_{R_B}(x, y) = min_{a∈B} R_a(x, y) and γ_{R_B}(x, y) = 1 − max_{a∈B} R_a(x, y). Obviously, it satisfies μ_{R_B}(x, y) + γ_{R_B}(x, y) ≤ 1 for any x, y ∈ U and thus constitutes an IF relation.

[TABLE I: DESCRIPTION OF DATA]
[TABLE II: COMPARISON ON THE NUMBERS OF SELECTED ATTRIBUTES]
[Fig. 1. Portion of selected attributes with the algorithms.]
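A minimal sketch of this construction (in Python; the paper's experiments used MATLAB, so names and structure here are illustrative assumptions): per-attribute similarity R_a with σ taken as the attribute's standard deviation, then μ = min and γ = 1 − max over the attribute set.

```python
# Illustrative sketch of building an IF relation from numerical attributes.

from statistics import pstdev

def attribute_similarity(values):
    """R_a(x, y) = max(min((a(y)-a(x)+s)/s, (a(x)-a(y)+s)/s), 0), s = std dev."""
    s = pstdev(values) or 1.0          # guard against zero-variance attributes
    n = len(values)
    return [[max(min((values[j] - values[i] + s) / s,
                     (values[i] - values[j] + s) / s), 0.0)
             for j in range(n)] for i in range(n)]

def if_relation(columns):
    """columns: list of attribute value lists. Returns (mu, gamma) matrices
    with mu = min over attributes and gamma = 1 - max over attributes."""
    sims = [attribute_similarity(col) for col in columns]
    n = len(columns[0])
    mu = [[min(s[i][j] for s in sims) for j in range(n)] for i in range(n)]
    gamma = [[1 - max(s[i][j] for s in sims) for j in range(n)]
             for i in range(n)]
    return mu, gamma
```

Since min_a R_a(x, y) ≤ max_a R_a(x, y), the returned pair always satisfies μ + γ ≤ 1, and reflexivity gives diagonal entries (1, 0), so the result is an IF tolerance relation.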

3) As suggested by many scholars [29], [30], the parameter δ in the new algorithm IFPR is set as 0.001 for low-dimensional data and 0.01 for high-dimensional data. We compare the proposed algorithm IFPR with several representative ones: the fuzzy boundary region-based algorithm (B-FRFS) [23], the discernibility matrix-based fuzzy rough reduction algorithm (DM-FR) [63], and the fuzzy positive region-based accelerator algorithm (FA-FPR) [64]. These algorithms employ different dependence functions as the evaluation indicators of features. In algorithm B-FRFS, the features that can minimize the boundary region of all decision concepts are desirable. Algorithm DM-FR regards the features owning the largest discrimination in the discernibility matrix as the optimal solution. Algorithm FA-FPR is an accelerated version of the fuzzy positive region reduction algorithm of Hu et al. [34], while their selected feature subsets are always the same.

[Fig. 2. Computational time of the algorithms.]
[TABLE III: COMPARISON ON THE COMPUTATIONAL TIMES OF DIFFERENT ALGORITHMS]

4) Three indices, i.e., the number of selected attributes, the computational time, and the classification accuracy, are adopted for comparison. Two classical classifiers, i.e., the tree-based CART [65] and the k-nearest neighbor rule (KNN, K = 10), are employed to evaluate the classification performance of the reduction algorithms. All classification results are obtained by tenfold cross validation. The optimal results among the different algorithms in the following tables are emphasized in bold font.

    Table II presents the numbers of selected attributes of thefour reduction algorithms on each dataset, which are depictedin Fig. 1. As seen in this table, all four algorithms can signif-icantly decrease the size of the attribute subsets. In particular,

  • TAN et al.: INTUITIONISTIC FUZZY ROUGH SET-BASED GRANULAR STRUCTURES AND ATTRIBUTE SUBSET SELECTION 537

    TABLE IVCOMPARISON ON CLASSIFICATION ACCURACIES OF REDUCED DATA WITH CART

    TABLE VCOMPARISON ON CLASSIFICATION ACCURACIES OF REDUCED DATA WITH KNN

    the dimensionality reduction achieves 50% for low-dimensionaldatasets and 99.98% for high-dimensional datasets. This impliesthat all the reduction algorithms are efficient in terms of reduc-ing redundant attributes. Moreover, as seen in Fig. 1, the resultsobtained by the proposed algorithm are consistently better thanthose obtained by the other algorithms on most datasets. For ex-ample, the proposed algorithm selects eight attributes on datasetheart and ten attributes on dataset wpbc, while the other algo-rithms select more than ten attributes on most data. The numberof attributes selected on an average by the new algorithm is11.47, which is smaller than that selected by the other threealgorithms. Hence, the proposed algorithm is effective in termsof decreasing the dimensionality of data.

    Table III records the computational time of the reductionalgorithms, which is depicted in Fig. 2. We can see that algorithmIFPR and algorithm FA-FPR are faster than the other two inmost cases. This is because that algorithm FA-FPR is equippedwith an accelerator, where the numbers of samples that need tobe dealt with in the current step is largely reduced in previoussteps, whereas the proposed algorithm adopts a fast computationmethod based on matrix representation and matrix logic. Asseen in this table, algorithm IFPR takes dozens of seconds onall datasets. For instance, algorithm IFPR takes about 32.16 son data Leukemia, 0.39 s on data iono, and 27.14 s on data

    MLL. These results indicate that algorithm IFPR runs fast onboth low-dimensional and high-dimensional datasets. Hence,the proposed algorithm is more acceptable in terms of runningtime.

Tables IV and V record the classification performance of the four reduction algorithms tested using two types of classifiers, CART and KNN (K = 10), respectively, where the boldface entries indicate the highest classification accuracies obtained. In the case of CART, all four algorithms significantly improve the classification accuracies on most datasets compared to the raw data. Moreover, algorithm IFPR outperforms the other algorithms on nine of the 15 datasets. At the same time, algorithm IFPR outperforms the other algorithms on eight datasets with respect to KNN. It should be emphasized that algorithm IFPR shows a decrease in classification accuracy compared to the raw horse dataset in Table IV. However, the decrease in classification accuracy is only 0.26%, while the reduction in dimensionality reaches 60.87%. On the other 14 datasets, algorithm IFPR offers a clear improvement in classification accuracy over the raw data. Moreover, the average accuracies indicate that algorithm IFPR ranks first with respect to CART, with a margin of 1.39% between it and the second-best accuracy of 83.99% achieved by algorithm B-FRFS, and a margin of 3.94% between it and the 81.43% accuracy of the raw data. Meanwhile, the average accuracy achieved by algorithm IFPR with respect to KNN is 86.57%, which exceeds that of algorithm DM-FR by a margin of 1.02% and that of the raw data by a margin of 4.84%.

Fig. 3. Classification accuracy of reduced data with CART.

Fig. 4. Classification accuracy of reduced data with KNN.

Figs. 3 and 4 display the classification accuracies of the four algorithms and the raw data based on the CART and KNN classifiers, respectively. It can be seen that IFPR obtains favorable classification results on most of the datasets. Thus, the proposed method, which takes both the similarity and the diversity of objects into account, is feasible and effective. The reasons can be elaborated as follows. Algorithm B-FRFS strives to achieve the smallest uncertainty of the decision classes; this idea decreases the difference between the lower and upper approximations of all decision concepts. Algorithm DM-FR uses the occurrence frequency of features in the discernibility matrix, so the selected feature subset depends on the discernibility matrix constructed. Algorithm FA-FPR aims to keep the positive region unchanged, which guarantees the maximal degree of samples' membership to their own classes. However, these algorithms are all based on the same fuzzy rough set model, which considers only the relevance of samples while overlooking their diversity. The proposed algorithm is based on the IF rough set, which takes both factors into account. This idea maintains both the maximal degrees of samples' membership to their own classes and of their nonmembership to other classes, and thus can simultaneously promote a high rate of correct classification and a low rate of misclassification.
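The argument above rests on IF relations that carry a membership (similarity) degree and a nonmembership (diversity) degree for each pair of objects. One plausible construction from a normalized numerical attribute, given here as an assumption for illustration rather than as the paper's exact formula, is:

```python
# One plausible IF relation built from a numerical attribute a scaled into
# [0, 1]. The formulas are illustrative assumptions; they guarantee
# mu + nu <= 1 for every pair, as an IF relation requires.
import numpy as np

def if_relation(a: np.ndarray):
    """Return (mu, nu): n x n membership and nonmembership matrices."""
    d = np.abs(a[:, None] - a[None, :])  # pairwise distance in [0, 1]
    mu = 1.0 - d                         # similarity (relevance) degree
    nu = d / (2.0 - d)                   # diversity degree, with nu <= 1 - mu
    return mu, nu
```

Because nu is not forced to equal 1 - mu, the pair (mu, nu) leaves room for hesitancy, and a reduct that preserves both degrees retains exactly the information that the fuzzy rough baselines discard.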

    VI. CONCLUSION

The rough set and IF set are two important tools for describing the uncertainty of knowledge and have been widely applied in the frameworks of granular computing and data dimensionality reduction. In this study, we introduced several types of fuzzy granules, by means of which the granular structures of the lower and upper approximations of the IF rough set were characterized. Furthermore, to deal with data dimensionality reduction with the IF rough set, we formalized a type of information system, i.e., the IF decision information system, for representing IF information. The IF approximations were examined in IF decision systems. The knowledge reduction of IF decision systems was addressed from the viewpoint of maintaining the IF approximations of the decision classes, and a reduction algorithm was developed to remove redundant IF relations from these systems. To examine the efficiency of the proposed algorithm, we constructed IF relations from the numerical attributes of data, which carry both the similarity and diversity degrees between objects. Experiments were conducted to compare the new algorithm with existing ones. This study presents a pioneering attempt to investigate the problem of attribute reduction on real datasets by combining the techniques of the rough set and the IF set.

Some problems related to the IF rough set remain that need to be considered and discussed further. They are summarized as follows. First, the granular structures of the fuzzy rough set have been elaborated for attribute reduction in many studies in the literature. How can we develop the corresponding solution for the granular structures of the IF rough set? Second, many IF relations can be induced from real datasets by a variety of methods. Convergence is important for the theoretical analysis and algorithmic application of IF rough set models. In a future study, we will search for models that can maintain the convergence of algorithms on real datasets. Third, IF rough set-based techniques also need to be applied to the field of classification learning in the future.

    REFERENCES

    [1] Z. Pawlak, “Rough sets,” Int. J. Comput. Inf. Sci., vol. 11, no. 5, pp. 341–356, 1982.

[2] J. Dai, Q. Hu, J. Zhang, H. Hu, and N. Zheng, “Attribute selection for partially labeled categorical data by rough set approach,” IEEE Trans. Cybern., vol. 47, no. 9, pp. 2460–2471, Sep. 2017.

[3] J. Dai, Q. Hu, H. Hu, and D. Huang, “Neighbor inconsistent pair selection for attribute reduction by rough set approach,” IEEE Trans. Fuzzy Syst., vol. 26, no. 2, pp. 937–950, Apr. 2018.

[4] A. Tan, W. Wu, J. Li, and G. Lin, “Evidence-theory-based numerical characterization of multigranulation rough sets in incomplete information systems,” Fuzzy Sets Syst., vol. 294, no. 1, pp. 18–35, 2016.

[5] A. Tan, W. Wu, and Y. Tao, “A set cover-based approach for the test-cost-sensitive attribute reduction problem,” Soft Comput., vol. 21, no. 20, pp. 6159–6173, 2017.

[6] D. Dubois and H. Prade, “Rough fuzzy sets and fuzzy rough sets,” Int. J. General Syst., vol. 17, no. 2/3, pp. 191–209, 1990.

[7] D. Dubois and H. Prade, “Fuzzy sets in approximate reasoning,” Fuzzy Sets Syst., vol. 40, no. 1, pp. 143–244, 1991.

[8] Q. Hu, D. Yu, W. Pedrycz, and D. Chen, “Kernelized fuzzy rough sets and their applications,” IEEE Trans. Knowl. Data Eng., vol. 23, no. 11, pp. 1649–1667, Nov. 2011.

[9] Q. Hu, L. Zhang, Y. Zhou, and W. Pedrycz, “Large-scale multimodality attribute reduction with multi-kernel fuzzy rough sets,” IEEE Trans. Fuzzy Syst., vol. 26, no. 1, pp. 226–238, Feb. 2018.

[10] R. Jensen and Q. Shen, “Fuzzy-rough attributes reduction with application to web categorization,” Fuzzy Sets Syst., vol. 141, no. 3, pp. 469–485, 2004.

[11] R. Wang, D. Chen, and S. Wong, “Fuzzy-rough-set-based active learning,” IEEE Trans. Fuzzy Syst., vol. 22, no. 6, pp. 1699–1704, Dec. 2014.

[12] J. Mi, Y. Leung, H. Zhao, and T. Feng, “Generalized fuzzy rough sets determined by a triangular norm,” Inf. Sci., vol. 178, no. 16, pp. 3203–3213, 2008.

[13] W. Wu, Y. Leung, and J. Mi, “On characterizations of (I,T)-fuzzy rough approximation operators,” Fuzzy Sets Syst., vol. 154, no. 1, pp. 76–102, 2005.

[14] Y. Y. Yao, “A comparative study of fuzzy sets and rough sets,” Inf. Sci., vol. 109, no. 1–4, pp. 21–47, 1998.

[15] D. S. Yeung, D. Chen, E. C. C. Tsang, J. W. T. Lee, and X. Wang, “On the generalization of fuzzy rough sets,” IEEE Trans. Fuzzy Syst., vol. 13, no. 3, pp. 343–361, Jun. 2005.

[16] L. A. Zadeh, “Fuzzy sets,” Inf. Control, vol. 8, no. 3, pp. 338–353, 1965.

[17] X. Liu, W. Pedrycz, T. Chai, and M. Song, “The development of fuzzy rough sets with the use of structures and algebras of axiomatic fuzzy sets,” IEEE Trans. Knowl. Data Eng., vol. 21, no. 3, pp. 443–462, Mar. 2009.

[18] Y. Qian, Y. Li, J. Liang, G. Lin, and C. Dang, “Fuzzy granular structure distance,” IEEE Trans. Fuzzy Syst., vol. 23, no. 6, pp. 2245–2259, Dec. 2015.

[19] W. Xu and W. Li, “Granular computing approach to two-way learning based on formal concept analysis in fuzzy datasets,” IEEE Trans. Cybern., vol. 46, no. 2, pp. 366–379, Feb. 2016.

[20] D. Chen, Y. Yang, and H. Wang, “Granular computing based on fuzzy similarity relations,” Soft Comput., vol. 15, no. 6, pp. 1161–1172, 2010.

[21] Q. He, C. Wu, D. Chen, and S. Zhao, “Fuzzy rough set based attribute reduction for information systems with fuzzy decisions,” Knowl.-Based Syst., vol. 24, no. 5, pp. 689–696, 2011.

[22] Y. Lin, Q. Hu, J. Liu, J. Li, and X. Wu, “Streaming feature selection for multi-label learning based on fuzzy mutual information,” IEEE Trans. Fuzzy Syst., vol. 25, no. 6, pp. 1491–1507, Dec. 2017.

[23] R. Jensen and Q. Shen, “New approaches to fuzzy-rough feature selection,” IEEE Trans. Fuzzy Syst., vol. 17, no. 4, pp. 824–838, Aug. 2009.

[24] R. B. Bhatt and M. Gopal, “On fuzzy rough sets approach to feature selection,” Pattern Recognit. Lett., vol. 26, no. 7, pp. 965–975, 2005.

[25] G. C. Y. Tsang, D. Chen, E. C. C. Tsang, J. W. T. Lee, and D. S. Yeung, “On attributes reduction with fuzzy rough sets,” in Proc. IEEE Int. Conf. Syst., Man, Cybern., Oct. 2005, vol. 3, no. 3, pp. 2775–2780.

[26] D. Chen and Y. Yang, “Attribute reduction for heterogeneous data based on the combination of classical and fuzzy rough set models,” IEEE Trans. Fuzzy Syst., vol. 22, no. 5, pp. 1325–1334, Oct. 2014.

[27] E. C. C. Tsang, D. Chen, D. S. Yeung, J. W. T. Lee, and X. Wang, “Attribute reduction using fuzzy rough sets,” IEEE Trans. Fuzzy Syst., vol. 16, no. 5, pp. 1130–1141, Oct. 2008.

[28] C. Wang, M. Shao, Q. He, Y. Qian, and Y. Qi, “Feature subset selection based on fuzzy neighborhood rough sets,” Knowl.-Based Syst., vol. 111, no. 1, pp. 173–179, 2016.

[29] C. Wang et al., “A fitting model for feature selection with fuzzy rough sets,” IEEE Trans. Fuzzy Syst., vol. 25, no. 4, pp. 741–753, Aug. 2017.

[30] C. Wang, X. Wang, D. Chen, Q. Hu, Z. Dong, and Y. Qian, “Feature selection based on neighborhood discrimination index,” IEEE Trans. Neural Netw. Learn. Syst., vol. 29, no. 7, pp. 2986–2999, Jul. 2018.

[31] Q. Hu, L. Zhang, S. An, D. Zhang, and D. Yu, “On robust fuzzy rough set models,” IEEE Trans. Fuzzy Syst., vol. 20, no. 4, pp. 636–651, Aug. 2012.

[32] S. Zhao, H. Chen, C. Li, X. Du, and H. Sun, “A novel approach to building a robust fuzzy rough classifier,” IEEE Trans. Fuzzy Syst., vol. 23, no. 4, pp. 769–786, Aug. 2015.

[33] S. Zhao, E. C. C. Tsang, and D. Chen, “The model of fuzzy variable precision rough sets,” IEEE Trans. Fuzzy Syst., vol. 17, no. 2, pp. 451–467, Apr. 2009.

[34] Q. Hu, Z. Xie, and D. Yu, “Hybrid attribute reduction based on a novel fuzzy-rough model and information granulation,” Pattern Recognit., vol. 40, no. 12, pp. 3509–3521, 2007.

[35] X. Zhang, C. Mei, D. Chen, and J. Li, “Feature selection in mixed data: A method using a novel fuzzy rough set-based information entropy,” Pattern Recognit., vol. 56, no. 1, pp. 1–15, 2016.

[36] K. Atanassov, “Intuitionistic fuzzy sets,” Fuzzy Sets Syst., vol. 20, no. 1, pp. 87–96, 1986.

[37] K. Atanassov, Intuitionistic Fuzzy Sets: Theory and Applications. Heidelberg, Germany: Physica-Verlag, 1999.

[38] C. Cornelis, M. D. Cock, and E. E. Kerre, “Intuitionistic fuzzy rough sets: At the crossroads of imperfect knowledge,” Expert Syst., vol. 20, no. 5, pp. 260–270, 2003.

[39] C. Cornelis, G. Deschrijver, and E. E. Kerre, “Implication in intuitionistic fuzzy and interval-valued fuzzy set theory: Construction, classification, application,” Int. J. Approx. Reasoning, vol. 35, no. 1, pp. 55–95, 2004.

[40] S. K. Samanta and T. K. Mondal, “Intuitionistic fuzzy rough sets and rough intuitionistic fuzzy sets,” J. Fuzzy Math., vol. 9, no. 3, pp. 561–582, 2001.

[41] L. Zhou and W. Wu, “On generalized intuitionistic fuzzy rough approximation operators,” Inf. Sci., vol. 178, no. 11, pp. 2448–2465, 2008.

[42] L. Zhou, W. Wu, and W. Zhang, “On characterization of intuitionistic fuzzy rough sets based on intuitionistic fuzzy implicators,” Inf. Sci., vol. 179, no. 7, pp. 883–898, 2009.

[43] B. Huang, Y. Zhuang, and H. Li, “Information granulation and uncertainty measures in interval-valued intuitionistic fuzzy information systems,” Eur. J. Oper. Res., vol. 231, no. 1, pp. 162–170, 2013.

[44] B. Huang, C. Guo, Y. Zhuang, H. Li, and X. Zhou, “Intuitionistic fuzzy multigranulation rough sets,” Inf. Sci., vol. 277, no. 1, pp. 299–320, 2014.

[45] B. Huang, C. Guo, H. Li, G. Feng, and X. Zhou, “Hierarchical structures and uncertainty measures for intuitionistic fuzzy approximation space,” Inf. Sci., vol. 336, no. 1, pp. 92–114, 2016.

[46] X. Zhang, B. Zhou, and P. Li, “A general frame for intuitionistic fuzzy rough sets,” Inf. Sci., vol. 216, no. 24, pp. 34–49, 2012.

[47] E. Hesameddini, M. Jafarzadeh, and H. Latifizadeh, “Rough set theory for the intuitionistic fuzzy information systems,” Int. J. Modern Math. Sci., vol. 6, no. 3, pp. 132–143, 2013.

[48] Z. Zhang, “Attributes reduction based on intuitionistic fuzzy rough sets,” J. Intell. Fuzzy Syst., vol. 30, no. 2, pp. 1127–1137, 2016.

[49] Z. Gong and X. Zhang, “Variable precision intuitionistic fuzzy rough sets model and its application,” Int. J. Mach. Learn. Cybern., vol. 5, no. 2, pp. 263–280, 2015.

[50] P. Jena and S. K. Ghosh, “Intuitionistic fuzzy rough sets,” Notes Intuit. Fuzzy Sets, vol. 8, no. 1, pp. 1–18, 2002.

[51] D. Liang and D. Liu, “Deriving three-way decisions from intuitionistic fuzzy decision-theoretic rough sets,” Inf. Sci., vol. 300, no. 10, pp. 28–48, 2015.

[52] Y. Liu and Y. Lin, “Intuitionistic fuzzy rough set model based on conflict distance and applications,” Appl. Soft Comput., vol. 31, no. 2, pp. 266–273, 2015.

[53] Y. Liu, Y. Lin, and H. Zhao, “Variable precision intuitionistic fuzzy rough set model and applications based on conflict distance,” Exp. Syst., vol. 32, no. 2, pp. 220–227, 2015.

[54] B. Sun, W. Ma, and Q. Liu, “An approach to decision making based on intuitionistic fuzzy rough sets over two universes,” J. Oper. Res. Soc., vol. 64, no. 7, pp. 1079–1089, 2017.

[55] D. Coker, “Fuzzy rough sets are intuitionistic L-fuzzy sets,” Fuzzy Sets Syst., vol. 96, no. 3, pp. 381–383, 1998.

[56] T. Das, D. Acharjya, and M. Patr, “Multi criterion decision making using intuitionistic fuzzy rough set on two universal sets,” Int. J. Intell. Syst. Appl., vol. 7, no. 4, pp. 26–33, 2015.

[57] E. Szmidt and J. Kacprzyk, “Entropy for intuitionistic fuzzy sets,” Fuzzy Sets Syst., vol. 118, no. 3, pp. 467–477, 2001.

[58] I. K. Vlachos and G. D. Sergiadis, “Subsethood, entropy, and cardinality for interval-valued fuzzy sets-an algebraic derivation,” Fuzzy Sets Syst., vol. 158, no. 12, pp. 1384–1396, 2007.

[59] H. Bustince and P. Burillo, “Structures on intuitionistic fuzzy relations,” Fuzzy Sets Syst., vol. 78, no. 78, pp. 293–303, 1996.

[60] C. L. Blake and C. J. Merz, UCI Repository of Machine Learning Databases, 1998. [Online]. Available: http://www.ics.uci.edu/mlearn/MLRepository.html

[61] Kent Ridge Bio-medical Dataset, 2002. [Online]. Available: http://datam.i2r.a-tar.edu.sg/datasets/krbd/index.html

[62] R. Jensen and Q. Shen, “Semantics-preserving dimensionality reduction: Rough and fuzzy-rough-based approaches,” IEEE Trans. Knowl. Data Eng., vol. 16, no. 12, pp. 1457–1471, Dec. 2004.

[63] D. Chen, L. Zhang, S. Zhao, Q. Hu, and P. Zhu, “A novel algorithm for finding reducts with fuzzy rough sets,” IEEE Trans. Fuzzy Syst., vol. 20, no. 2, pp. 385–389, Apr. 2012.

[64] Y. Qian, Q. Wang, H. Cheng, J. Liang, and C. Dang, “Fuzzy-rough feature selection accelerator,” Fuzzy Sets Syst., vol. 258, no. 1, pp. 61–78, 2015.

[65] L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone, Classification and Regression Trees. Belmont, CA, USA: Wadsworth, 1984.

    Authors’ photographs and biographies not available at the time of publication.
