[IEEE 2010 3rd International Symposium on Systems and Control in Aeronautics and Astronautics (ISSCAA) - Harbin, China (2010.06.8-2010.06.10)] 2010 3rd International Symposium on Systems

Research on an Efficient Rough Set based Attribute Reduction Algorithm

Jun Wang, Xiu-Feng Zhong, and Xi-Yuan Peng

Abstract: Rough set theory is a mathematical theory developed in recent years that can deal with imprecise and uncertain information. It has been proven that computing all the reductions, and the minimal reduction, of an information system is an NP-hard problem. In this paper, a coding and sorting method is proposed to reduce the computational complexity of computing the indiscernibility relation and the positive region, so that an attribute reduction can be obtained efficiently. Experimental results show that the proposed algorithm computes attribute reductions efficiently.

I. INTRODUCTION

Rough set theory was first proposed by Professor Pawlak [1] and has become a useful mathematical tool for dealing with vague and imprecise knowledge. After more than twenty years of development, it has been widely applied in information system analysis, artificial intelligence, decision support systems, data mining, pattern recognition, and process control [2]. The foundation of rough set theory is the approximation relation: a piece of knowledge, which is always represented by a set, is described by its upper approximation set and its lower approximation set. Attribute reduction, or feature selection, in an information system is one of the core research topics in rough set theory. Because attribute reduction is an NP-hard problem, heuristic algorithms are usually adopted to obtain an optimal or approximately optimal attribute reduction. A discernibility matrix based attribute reduction algorithm with $O(|C|^3|U|^2)$ computational complexity was proposed in [3]. A positive region based attribute reduction algorithm with $O(|C|^2|U|^2)$ complexity was proposed in [4]. In [5], a positive region based reduction algorithm combined with quick sorting was proposed, which reduced the computational complexity from $O(|C|^2|U|^2)$ to $O(|C|^2|U|\log|U|)$. An attribute reduction algorithm with $O(|C|^2|U|)$ computational complexity was proposed in [6], but this figure does not include the cost of computing the tolerance classes; in fact, the computational complexity of the algorithm proposed in [6] is $O(|C|^3|U|^2)$. In order to reduce the computational complexity further, we propose a fast rough set attribute reduction algorithm based on coding and sorting with $O(|C||U|\log|U|)$ computational complexity and $O(|C||U|)$ space complexity.

This work was supported by the Natural Science Foundation of Guangdong Province of China (No. 9451503101003263), the Educational Commission of Guangdong Province of China and the Youth Foundation of Shantou University.

J. Wang and X.F. Zhong are with the Department of Electronics Engineering, Shantou University, No. 243 Daxue Road, Shantou, Guangdong, 515063, People’s Republic of China. Email: [email protected]

X.Y. Peng is with the Auto-testing and Control Laboratory, Harbin Institute of Technology, Harbin, Heilongjiang, P. R. China.

II. SOME DEFINITIONS

Definition 1. Let $S = (U, C, D, V, f)$ be a decision table or information system, where $U = \{u_1, u_2, \ldots, u_n\}$ is the set of objects, $C$ is the set of condition attributes, $D$ is the set of decision attributes, $V = \bigcup_{a \in C \cup D} V_a$ where $V_a$ is the value domain of attribute $a$, and $f$ is the mapping function from $U \times (C \cup D)$ to $V$. A subset of attributes $B \subseteq (C \cup D)$ determines an indiscernibility relation $IND(B) = \{(x, y) \in U \times U \mid \forall a \in B,\ f(x, a) = f(y, a)\}$, and this relation induces a partition of $U$ denoted by $U/B$. Any element of $U/B$ (a set of objects), $[x]_B = \{y \mid \forall a \in B,\ f(x, a) = f(y, a)\}$, is called an equivalence class.
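Definition 1 can be illustrated with a small sketch. The decision table below is hypothetical (three condition attributes in columns 0 to 2, one decision attribute in column 3), and the helper name `partition` is an assumption introduced only for this example. It groups objects with a dictionary, which is the simplest way to obtain $U/B$ and is not the coding and sorting method developed later in the paper.

```python
from collections import defaultdict

def partition(table, attrs):
    """Return U/B as lists of object indices that agree on every attribute in attrs."""
    classes = defaultdict(list)
    for idx, row in enumerate(table):
        key = tuple(row[a] for a in attrs)  # the values f(x, a) for a in B
        classes[key].append(idx)
    return list(classes.values())

# Hypothetical decision table: columns 0-2 are condition attributes, column 3 is the decision.
table = [
    (0, 1, 0, 1),
    (0, 1, 1, 1),
    (1, 0, 0, 0),
    (0, 1, 0, 0),
    (1, 0, 1, 0),
]

print(partition(table, [0, 1]))     # [[0, 1, 3], [2, 4]]
print(partition(table, [0, 1, 2]))  # [[0, 3], [1], [2], [4]]
```

Adding an attribute to $B$ can only split existing equivalence classes, which is why the second partition is a refinement of the first.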

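Definition 2. For any $R \subseteq C \cup D$ in $S = (U, C, D, V, f)$, there is a partition of $U$, i.e. $U/R = \{R_1, R_2, \ldots, R_l\}$, and any $X \subseteq U$ can be approximated with this partition. The set $\underline{R}X = \bigcup\{R_i \mid R_i \in U/R,\ R_i \subseteq X\}$ is called the lower approximation of $X$ with respect to $R$, and $\overline{R}X = \bigcup\{R_i \mid R_i \in U/R,\ R_i \cap X \neq \emptyset\}$ is called the upper approximation of $X$ with respect to $R$.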
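As a minimal sketch of Definition 2, the function below computes both approximations from a given partition. The function name `approximations` and the example partition are assumptions made for illustration.

```python
def approximations(classes, X):
    """Return (lower, upper) approximations of X for a partition given as an iterable of sets."""
    X = set(X)
    lower, upper = set(), set()
    for cls in classes:
        cls = set(cls)
        if cls <= X:   # class entirely inside X: contributes to the lower approximation
            lower |= cls
        if cls & X:    # class overlaps X: contributes to the upper approximation
            upper |= cls
    return lower, upper

# Hypothetical partition U/R of U = {0, 1, 2, 3, 4} and a target set X.
classes = [{0, 3}, {1}, {2}, {4}]
lower, upper = approximations(classes, {0, 1, 2})
print(lower)  # {1, 2}
print(upper)  # {0, 1, 2, 3}
```

Object 0 belongs to $X$ but shares its class with object 3, so it appears only in the upper approximation.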

Definition 3. Let $S$ be an information system, and let $P \subseteq (C \cup D)$ and $Q \subseteq (C \cup D)$. Then the positive region $POS_P(Q)$ is defined as $POS_P(Q) = \bigcup_{X \in U/Q} \underline{P}X$, where $\underline{P}X$ is the lower approximation of $X$ with respect to $P$.
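The positive region can be computed directly from Definition 3, as in the sketch below. The table and the helper names `partition` and `positive_region` are hypothetical. A $P$-class is added to the result exactly when it lies inside a single $Q$-class, which is the same as taking the union of the lower approximations of all $X \in U/Q$.

```python
from collections import defaultdict

def partition(table, attrs):
    """Return the partition of object indices induced by the attributes in attrs."""
    classes = defaultdict(set)
    for idx, row in enumerate(table):
        classes[tuple(row[a] for a in attrs)].add(idx)
    return list(classes.values())

def positive_region(table, P, Q):
    """POS_P(Q): union over X in U/Q of the P-lower approximation of X."""
    q_classes = partition(table, Q)
    pos = set()
    for p_cls in partition(table, P):
        if any(p_cls <= q_cls for q_cls in q_classes):
            pos |= p_cls
    return pos

# Hypothetical decision table: columns 0-2 are condition attributes, column 3 is the decision.
table = [
    (0, 1, 0, 1),
    (0, 1, 1, 1),
    (1, 0, 0, 0),
    (0, 1, 0, 0),
    (1, 0, 1, 0),
]

print(positive_region(table, [0, 1, 2], [3]))  # {1, 2, 4}
print(positive_region(table, [0, 1], [3]))     # {2, 4}
```

Objects 0 and 3 have identical condition values but different decisions, so they are outside the positive region for any subset of the condition attributes.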

Definition 4. For every $b \in B \subseteq C$ in the information system $S$, if $POS_B(D) = POS_{B-\{b\}}(D)$, then $b$ is an unnecessary attribute of the attribute set $B$ with respect to $D$. Otherwise, $b$ is a necessary attribute of $B$ with respect to $D$. If every attribute of $B \subseteq C$ is necessary with respect to $D$, we say that $B$ is independent with respect to $D$.

Definition 5. For any $B \subseteq C$ in the information system $S$, if $POS_B(D)$ is equal to $POS_C(D)$ and $B$ is independent with respect to $D$, then $B$ is called a reduction of $C$ with respect to $D$.

Definition 6. The intersection of all reductions of $C$ with respect to $D$ is called the core attribute set of $C$ on $D$, denoted by $Core_D(C)$.

Theorem 1. The necessary and sufficient condition [10] for an attribute $a_i$ to be a core attribute of $C$ on $D$ is $POS_{C-\{a_i\}}(D) \neq POS_C(D)$.
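Theorem 1 gives a direct test for core attributes: remove one attribute at a time and check whether the positive region changes. The sketch below applies that test to a hypothetical table. The names `positive_region` and `core` are assumptions for this example, and the positive region is computed by dictionary grouping rather than by the paper's coding and sorting method.

```python
from collections import defaultdict

def positive_region(table, P, D):
    """Objects whose P-equivalence class carries a single combination of D values."""
    groups = defaultdict(list)
    for idx, row in enumerate(table):
        groups[tuple(row[a] for a in P)].append(idx)
    pos = set()
    for members in groups.values():
        if len({tuple(table[i][d] for d in D) for i in members}) == 1:
            pos.update(members)
    return pos

def core(table, C, D):
    """Attributes a with POS_{C-{a}}(D) != POS_C(D), following Theorem 1."""
    full = positive_region(table, C, D)
    return [a for a in C
            if positive_region(table, [x for x in C if x != a], D) != full]

# Hypothetical decision table: columns 0-2 are condition attributes, column 3 is the decision.
table = [
    (0, 1, 0, 1),
    (0, 1, 1, 1),
    (1, 0, 0, 0),
    (0, 1, 0, 0),
    (1, 0, 1, 0),
]

print(core(table, [0, 1, 2], [3]))  # [2]
```

In this table attributes 0 and 1 carry the same information, so either can be dropped without changing the positive region, while dropping attribute 2 shrinks it from {1, 2, 4} to {2, 4}.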

978-1-4244-6044-1/10/$26.00 ©2010 IEEE

III. EFFICIENT ATTRIBUTE REDUCTION ALGORITHM WITH ROUGH SET BASED ON CODING AND SORTING

A. Fast algorithm for the indiscernibility relation

The indiscernibility relation is the foundation of rough set theory, and the computational complexity of computing it determines the computational complexity of rough set based attribute reduction. [7] proposed an algorithm for computing the indiscernibility relation with computational complexity $O(|B||U|^2)$, in which the values of each attribute were compared for every pair of samples. [8] proposed an algorithm for computing the indiscernibility relation with computational complexity $O(|B||U|^2)$, in which every pair of samples was compared to determine whether the two samples belong to the same equivalence class. [5] proposed an algorithm for computing the indiscernibility relation with computational complexity $O(|B||U|\log|U|)$, in which the samples were sorted by the values of every attribute and then the partition of the samples was obtained with one further scan.

[5] showed that $IND(B)$ has the following property.

Property 1. Two samples belong to the same equivalence class if and only if all discretized attribute values of the two samples are equal [5].

According to Property 1, [5] first sorted the samples in $U$ on the attribute set $B$ and then obtained the equivalence classes, which requires $|B|$ rounds of sorting. The attribute values are discretized before a rough set based algorithm is applied, and the number of discretized values of every attribute does not exceed the number of representable integers, so these discretized values can be mapped to a series of integers, which can in turn be represented in complement or offset binary. In this paper, the discretized values of the $|B|$ attributes of every sample in $U$ are first transformed into one big integer, then these integers are sorted, and finally the equivalence classes are obtained: objects with the same integer value are in the same equivalence class. Let the values of every attribute $a \in B \subseteq C$ be representable with $CODE_a$ binary bits, so that the attribute set $B = [B_1, B_2, \ldots, B_{|B|}]$ is coded with $Codebook = [CODE_{B_1}, CODE_{B_2}, \ldots, CODE_{B_{|B|}}]$. Then any sample or object $X = [x_{B_1}, x_{B_2}, \ldots, x_{B_{|B|}}]$ in $U$ can be viewed as a binary string and encoded into an integer $CodedValue(X)$ with the following equation:

$$CodedValue(X) = \sum_{i=1}^{|B|} x_{B_i} \cdot 2^{\sum_{j=1}^{i-1} CODE_{B_j}} \qquad (1)$$

After the samples or objects are encoded in this way, we have the following theorem.

Theorem 2. If two samples have the same coded value, then the two samples are in the same equivalence class.
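The coding of Eq. (1) amounts to packing the attribute values of one object into disjoint bit fields of a single integer. The sketch below assumes that every value is a non-negative integer that fits in the stated number of bits. The function name `coded_value` and the bit widths are hypothetical choices for illustration.

```python
def coded_value(values, bit_widths):
    """Pack attribute values into one integer as in Eq. (1).

    values[i] plays the role of x_{B_i}; bit_widths[i] plays the role of CODE_{B_i}.
    """
    code, shift = 0, 0
    for v, width in zip(values, bit_widths):
        if not 0 <= v < (1 << width):
            raise ValueError("value does not fit in its bit field")
        code += v << shift   # x_{B_i} * 2^(sum of the widths of earlier attributes)
        shift += width
    return code

bit_widths = [2, 1, 3]                     # hypothetical codebook
print(coded_value((3, 1, 5), bit_widths))  # 3 + 1*4 + 5*8 = 47
print(coded_value((1, 0, 2), bit_widths))  # 1 + 0*4 + 2*8 = 17
```

Because the bit fields do not overlap, two objects receive the same integer only when they agree on every attribute, which is the content of Theorem 2. Python integers have arbitrary precision, so the total width is not limited to a machine word in this sketch.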

This theorem can be proved easily from the coding process. With Theorem 2, we have the fast algorithm for computing the equivalence classes of $IND(B)$, given as Algorithm 1.

Algorithm 1 Computing the indiscernibility relation $IND(B)$
Input: $S = \langle U, C, D, V, f \rangle$, $A = C \cup D$, $U = \{u_1, u_2, \ldots, u_{|U|}\}$ and $B \subseteq A$
Output: $IND(B)$
Step 1: Encode the objects in the set $U$ according to Eq. (1);
Step 2: Quick sort the encoded objects (below, $u_1, \ldots, u_{|U|}$ denote the objects in sorted order);
Step 3: $s = 1$, $j = 1$, $B_1 = \{u_1\}$;
Step 4: for $i = 2$ to $|U|$
  {
    if $CodedValue(u_i) = CodedValue(u_j)$
      $B_s = B_s + \{u_i\}$;
    else
      { $s = s + 1$; $B_s = \{u_i\}$; $j = i$; }
  }
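The steps of Algorithm 1 can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the name `ind_partition`, the table, and the per-attribute bit widths are hypothetical, values are assumed to be non-negative integers that fit their bit fields, and Python's built-in `sorted` (Timsort) stands in for quick sort, which keeps the $O(|U|\log|U|)$ sorting cost.

```python
def ind_partition(table, attrs, bit_widths):
    """Equivalence classes of IND(B) by coding, sorting, and one linear scan."""
    def code(row):
        c, shift = 0, 0
        for a in attrs:
            c |= row[a] << shift      # Eq. (1): place each value in its own bit field
            shift += bit_widths[a]
        return c

    # Steps 1-2: encode every object, then sort by the coded value.
    coded = sorted((code(row), idx) for idx, row in enumerate(table))

    # Steps 3-4: one scan; a change of coded value starts a new class.
    classes, prev = [], None
    for c, idx in coded:
        if classes and c == prev:
            classes[-1].append(idx)
        else:
            classes.append([idx])
            prev = c
    return classes

# Hypothetical binary decision table; every attribute needs one bit.
table = [
    (0, 1, 0, 1),
    (0, 1, 1, 1),
    (1, 0, 0, 0),
    (0, 1, 0, 0),
    (1, 0, 1, 0),
]
print(ind_partition(table, [0, 1], [1, 1, 1, 1]))  # [[2, 4], [0, 1, 3]]
```

Only one sort is needed regardless of $|B|$, whereas sorting attribute by attribute requires $|B|$ passes.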

B. Fast computation of the positive region

The relative core and the relative reduction are both defined on the positive region, so computing the positive region efficiently is very important. Using Theorem 1 in [5] and Algorithm 1 in this paper, we obtain the following algorithm for computing the positive region, given as Algorithm 2.

Algorithm 2 Computing the positive region $POS_P(Q)$
Input: $S = \langle U, C, D, V, f \rangle$, $A = C \cup D$, $U = \{u_1, u_2, \ldots, u_{|U|}\}$ and $P, Q \subseteq A$
Output: $POS_P(Q)$
Step 1: Compute the partition $U/P = \{P_1, P_2, \ldots, P_l\}$ of $U$ with $P$ by using Algorithm 1;
Step 2: $POS_P(Q) = \emptyset$;
Step 3: for $i = 1$ to $l$
  {
    Compute the partition $P_i/Q = \{Q_{i1}, Q_{i2}, \ldots, Q_{im}\}$;
    if (there is only one equivalence class in $P_i/Q$, i.e. $m = 1$)
      { $POS_P(Q) = POS_P(Q) \cup P_i$; }
  }
Step 4: Output $POS_P(Q)$.
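A compact sketch of Algorithm 2 is given below. It partitions $U$ by $P$, partitions each block again by $Q$, and keeps the blocks that stay in one piece. The helper names `sort_partition` and `positive_region` and the table are hypothetical, and tuple keys are used in place of the coded integers of Eq. (1) to keep the example short.

```python
def sort_partition(table, indices, attrs):
    """Split `indices` into classes that agree on `attrs`: sort, then one scan."""
    def key(i):
        return tuple(table[i][a] for a in attrs)

    classes = []
    for i in sorted(indices, key=key):
        if classes and key(i) == key(classes[-1][0]):
            classes[-1].append(i)   # same key as the current class
        else:
            classes.append([i])     # key changed: start a new class
    return classes

def positive_region(table, P, Q):
    """POS_P(Q): union of the P-classes that form a single Q-class."""
    pos = set()
    for p_cls in sort_partition(table, range(len(table)), P):   # Step 1
        if len(sort_partition(table, p_cls, Q)) == 1:           # Step 3
            pos.update(p_cls)
    return pos

# Hypothetical decision table: columns 0-2 are condition attributes, column 3 is the decision.
table = [
    (0, 1, 0, 1),
    (0, 1, 1, 1),
    (1, 0, 0, 0),
    (0, 1, 0, 0),
    (1, 0, 1, 0),
]
print(positive_region(table, [0, 1, 2], [3]))  # {1, 2, 4}
print(positive_region(table, [0, 1], [3]))     # {2, 4}
```

Checking whether a block has a single $Q$-class could also be done with one linear pass over the block; the second sort is kept here only to mirror the structure of the algorithm.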

C. Incremental Calculation of Positive Region

Attribute reduction has been proved to be an NP-hard problem, so researchers usually aim to find a suboptimal reduction by using heuristic information. A common property of the heuristic algorithms for attribute reduction is the use of the importance of attributes. Let $S = \langle U, C \cup D, V, f \rangle$ be an information system and $a \in C - R$ ($R \subset C$); then the importance of the attribute $a$ to the decision attribute set $D$ can be defined as Eq. (2) [9]:

$$SGF(a, R, D) = \gamma_{R \cup \{a\}} - \gamma_R \qquad (2)$$

where $\gamma_R = |POS_R(D)|/|U|$. From Eq. (2), computing $SGF(a, R, D)$ requires first computing $POS_R(D)$ and $POS_{R \cup \{a\}}(D)$, both of which can be computed with Algorithm 2. In order to reduce the computational complexity further, $POS_{R \cup \{a\}}(D)$ can be calculated incrementally by using the partition $U/R$ obtained while $POS_R(D)$ was being computed, as in Algorithm 3.

Algorithm 3 Incremental computing of $POS_{R \cup \{a\}}(D)$
Input: $S = \langle U, C, D, V, f \rangle$, $U/R = \{R_1, R_2, \ldots, R_m\}$, $0 < m \le |U|$
Output: $POS_{R \cup \{a\}}(D)$
Step 1: Compute $POS_{\{a\}}(D|R_i)$ with Algorithm 2 for every $R_i$ in $U/R$;
Step 2: $POS_{R \cup \{a\}}(D) = \bigcup_{i=1}^{m} POS_{\{a\}}(D|R_i)$.

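The computational complexity of Algorithm 3 is about $\sum_{i=1}^{m} O(|R_i|\log|R_i|) \le \sum_{i=1}^{m} O(|R_i|\log(\max_i |R_i|)) \le O(|U|\log|U|)$, because the blocks $R_i$ partition $U$ and so $\sum_{i=1}^{m}|R_i| = |U|$. So the computational complexity of computing $SGF(a, R, D)$ is $O(|U|\log|U|)$.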
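The incremental step and the significance measure of Eq. (2) can be sketched together. The code assumes that $U/R$ and $POS_R(D)$ are already available from an earlier computation, and it normalises by $|U|$, the usual convention for the dependency degree $\gamma$. The names `pos_within`, `incremental_pos` and `sgf` and the table are hypothetical.

```python
def pos_within(table, block, a, D):
    """POS_{a}(D | block): members of `block` whose a-class inside the block has one decision."""
    ordered = sorted(block, key=lambda i: table[i][a])
    pos, start = [], 0
    for end in range(1, len(ordered) + 1):
        # A run of equal a-values ends at the end of the block or where the value changes.
        if end == len(ordered) or table[ordered[end]][a] != table[ordered[start]][a]:
            run = ordered[start:end]
            if len({tuple(table[i][d] for d in D) for i in run}) == 1:
                pos.extend(run)
            start = end
    return pos

def incremental_pos(table, partition_R, a, D):
    """Algorithm 3: POS_{R ∪ {a}}(D) as the union of POS_{a}(D | R_i) over the blocks of U/R."""
    pos = set()
    for block in partition_R:
        pos.update(pos_within(table, block, a, D))
    return pos

def sgf(table, partition_R, pos_R, a, D):
    """Eq. (2): gamma_{R ∪ {a}} - gamma_R, with gamma normalised by |U|."""
    return (len(incremental_pos(table, partition_R, a, D)) - len(pos_R)) / len(table)

# Hypothetical decision table: columns 0-2 are condition attributes, column 3 is the decision.
table = [
    (0, 1, 0, 1),
    (0, 1, 1, 1),
    (1, 0, 0, 0),
    (0, 1, 0, 0),
    (1, 0, 1, 0),
]
partition_R = [[0, 1, 3], [2, 4]]   # U/R for R = {0, 1}
pos_R = {2, 4}                      # POS_R(D) for the same R

print(incremental_pos(table, partition_R, 2, [3]))  # {1, 2, 4}
print(sgf(table, partition_R, pos_R, 2, [3]))       # 0.2
```

Each block is sorted on the single attribute $a$ only, so the attributes already in $R$ are never revisited.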

D. The proposed attribute reduction algorithm

Algorithms 1 to 3 can be integrated into a complete algorithm for attribute reduction, given as Algorithm 4.

Algorithm 4 Rough set based attribute reduction algorithm
Input: $S = \langle U, C, D, V, f \rangle$
Output: A reduct of $C$ relative to $D$
Step 1: Compute $Core_D(C)$;
Step 2: $RED = Core_D(C)$;
Step 3: Compute $POS_C(D)$ and $POS_{RED}(D)$;
Step 4: While ($POS_C(D) \neq POS_{RED}(D)$)
  {
    Find an attribute $a$ in the attribute set $C - RED$ which maximizes $SGF(a, RED, D)$;
    $RED = RED \cup \{a\}$;
    Update $POS_{RED}(D)$;
  }
Step 5: Output $RED$.

The computational complexity of Algorithm 4 is $O(|C||U|\log|U|)$. Furthermore, its space complexity is only $O(|U|)$.
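The overall control flow of Algorithm 4 can be sketched as below. This is a simplified illustration, not the authors' implementation: the positive region is computed by dictionary grouping rather than by coding and sorting or by the incremental update, so it does not reproduce the stated complexity. Since $\gamma_{RED}$ is the same for every candidate, maximising $SGF(a, RED, D)$ is equivalent to maximising $|POS_{RED \cup \{a\}}(D)|$, which is what the code does. The names `positive_region` and `reduce_attributes` and the table are hypothetical.

```python
from collections import defaultdict

def positive_region(table, P, D):
    """Objects whose P-equivalence class carries a single combination of D values."""
    groups = defaultdict(list)
    for idx, row in enumerate(table):
        groups[tuple(row[a] for a in P)].append(idx)
    pos = set()
    for members in groups.values():
        if len({tuple(table[i][d] for d in D) for i in members}) == 1:
            pos.update(members)
    return pos

def reduce_attributes(table, C, D):
    """Greedy forward selection starting from the core (Steps 1-5 of Algorithm 4)."""
    target = positive_region(table, C, D)
    # Steps 1-2: the core, found with Theorem 1.
    red = [a for a in C
           if positive_region(table, [x for x in C if x != a], D) != target]
    pos = positive_region(table, red, D)
    # Step 4: add the most significant attribute until the positive region is restored.
    while pos != target:
        candidates = [a for a in C if a not in red]
        best = max(candidates,
                   key=lambda a: len(positive_region(table, red + [a], D)))
        red.append(best)
        pos = positive_region(table, red, D)
    return red

# Hypothetical decision table: columns 0-2 are condition attributes, column 3 is the decision.
table = [
    (0, 1, 0, 1),
    (0, 1, 1, 1),
    (1, 0, 0, 0),
    (0, 1, 0, 0),
    (1, 0, 1, 0),
]
print(reduce_attributes(table, [0, 1, 2], [3]))  # [2, 0]
```

The loop always terminates, because once every attribute of $C$ has been added the positive region equals $POS_C(D)$. Like other greedy heuristics of this kind, the result preserves the positive region but is not guaranteed to be a minimal reduction.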

IV. EXPERIMENTAL RESULTS

An integer-type dataset was randomly generated to compare the performance of two rough set based attribute reduction algorithms, i.e. the fast sorting based reduction algorithm (Algorithm a) and the proposed algorithm (Algorithm b). The experimental results are shown in Figure 1 and Figure 2. Figure 1 shows the running time ratio of Algorithm a to Algorithm b with a variable number of condition attributes and a fixed dataset size. From Figure 1, we can see that while $|C|$ is less than 350, the running time ratio increases greatly as the number of condition attributes $|C|$ increases and is much greater than $|C|$; once $|C|$ is greater than 350, the ratio decreases as $|C|$ increases and becomes approximately equal to $|C|$. Figure 2 shows the running time ratio of Algorithm a to Algorithm b with a variable dataset size and a fixed number of condition attributes. From Figure 2, we can see that the running time ratio decreases as the dataset size increases and becomes approximately equal to the number of condition attributes $|C|$. From Figure 1 and Figure 2, we can conclude that the proposed algorithm is much more efficient than the compared algorithm [5] and that the computational complexity decreased from $O(|C|^2|U|\log|U|)$ to $O(|C||U|\log|U|)$.

Fig. 1. Running time ratio of Algorithm a to Algorithm b with a variable number of condition attributes and a fixed dataset size

Fig. 2. Running time ratio of Algorithm a to Algorithm b with a variable dataset size and a fixed number of condition attributes

V. CONCLUSION

Rough set theory uses the upper and lower approximations of concepts to deal with uncertainty in an information system, and the uncertainty is determined entirely by the data, which makes the approach strongly objective. Owing to its unique advantages, rough set theory has attracted more and more attention; its theoretical foundations have been well established and it has been applied successfully in many fields. To make rough set theory more widely used in practice, it is important to find efficient algorithms with lower computational complexity [5]. Rough set based attribute reduction in information systems is a focus of theoretical research, and its key problem is how to compute the positive region efficiently. [5] proposed a fast algorithm whose computational complexity is $O(|C|^2|U|\log|U|)$, but this is still too high for many applications. In this paper, we proposed an improved rough set based attribute reduction algorithm whose computational complexity is $O(|C||U|\log|U|)$, much less than that of the compared algorithm, achieved by reducing the number of sorting passes. Experimental results showed that the proposed algorithm performed attribute reduction roughly $|C|$ times faster than the compared algorithm. Our algorithm is therefore well suited to attribute reduction in relatively large information systems.

REFERENCES

[1] Z. Pawlak, “Rough sets”, International Journal of Computer and Information Sciences, vol. 11, no. 5, 1982, pp. 341-356.

[2] F. Tay et al., “Fault diagnosis based on rough set theory”, Engineering Applications of Artificial Intelligence, vol. 16, no. 1, 2003, pp. 39-43.

[3] X. Hu, N. Cercone, “Learning in relational databases: A rough set approach”, International Journal of Computational Intelligence, vol. 11, no. 2, 1995, pp. 323-338.

[4] D. Ye, “An improvement to Jelonek’s attribution reduction algorithm”, Acta Electronica Sinica, vol. 28, no. 12, 2000, pp. 81-82.

[5] S. Liu, Q. Sheng, B. Wu, Z. Shi, and F. Hu, “Research on Efficient Algorithms for Rough Set Methods”, Chinese Journal of Computers, vol. 25, no. 5, 2003, pp. 524-529.

[6] J. Liang, Z. Xu, “The algorithm on knowledge reduction in incomplete information systems”, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, vol. 10, no. 1, 2002, pp. 95-103.

[7] J. Guan, D. Bell, “Rough computational methods for information systems”, Artificial Intelligence, vol. 105, no. 1-2, 1998, pp. 77-103.

[8] W. Zhang et al., “Theory and methods of Rough Set”, Beijing: Science Press, 2001.

[9] J. Jelonek, K. Krawiec, R. Slowinski, “Rough set reduction of attributes and their domains for neural networks”, International Journal of Computational Intelligence, vol. 11, no. 2, 1995, pp. 339-347.

[10] T. Zhang, J. Xiao, X. Wang, “Algorithms of Attribute Relative Reduction in Rough Set Theory”, Acta Electronica Sinica, vol. 33, no. 11, 2005, pp. 2080-2083.
