
[IEEE 2010 3rd International Symposium on Systems and Control in Aeronautics and Astronautics (ISSCAA) - Harbin, China (2010.06.8-2010.06.10)] 2010 3rd International Symposium on Systems


Research on an Efficient Rough Set based Attribute Reduction Algorithm

Jun Wang, Xiu-Feng Zhong, and Xi-Yuan Peng

Abstract— Rough set theory is a mathematical theory developed in recent years that can deal with imprecise and uncertain information. It has been proven that computing all the reductions, or the minimal reduction, of an information system is an NP-hard problem. In this paper, a coding and sorting method is proposed to reduce the computational complexity of computing the indiscernibility relation and the positive region, so that attribute reduction can be obtained efficiently. Experimental results show that the proposed algorithm computes attribute reductions efficiently.

I. INTRODUCTION

Rough set theory was first proposed by Professor Pawlak. It has since become a useful mathematical tool for dealing with vague and imprecise knowledge. Over more than twenty years of development, rough set theory has been widely applied in information system analysis, artificial intelligence, decision support systems, data mining, pattern recognition, and process control [2]. The foundation of rough set theory is the approximation relation: a piece of knowledge, which is always represented by a set, is described by its upper approximation set and its lower approximation set. Attribute reduction, or feature selection, in information systems is one of the core research topics in rough set theory. Because attribute reduction is an NP-hard problem, heuristic algorithms are usually adopted to obtain an optimal or approximately optimal attribute reduction. A discernibility matrix based attribute reduction algorithm with $O(|C|^3|U|^2)$ computational complexity was proposed in [3]. A positive region based attribute reduction algorithm with $O(|C|^2|U|^2)$ complexity was proposed in [4]. A positive region based reduction algorithm combined with quick sorting was proposed in [5], reducing the computational complexity from $O(|C|^2|U|^2)$ to $O(|C|^2|U|\log|U|)$. [6] proposed an attribute reduction algorithm with $O(|C|^2|U|)$ computational complexity, but this figure did not include the cost of computing the tolerance classes. In fact, the computational complexity of the algorithm proposed in [6] is $O(|C|^3|U|^2)$. To reduce the computational complexity further, we propose a fast rough set attribute reduction algorithm based on coding and sorting, with $O(|C||U|\log|U|)$ computational complexity and $O(|C||U|)$ space complexity.

This work was supported by the Natural Science Foundation of Guangdong Province of China (No. 9451503101003263), the Educational Commission of Guangdong Province of China, and the Youth Foundation of Shantou University.

J. Wang and X.F. Zhong are with the Department of Electronics Engineering, Shantou University, No. 243 Daxue Road, Shantou, Guangdong, 515063, People’s Republic of China. Email: [email protected]

X.Y. Peng is with the Auto-testing and Control Laboratory, Harbin Institute of Technology, Harbin, Heilongjiang, P. R. China.

II. SOME DEFINITIONS

Definition 1. Let $S = (U, C, D, V, f)$ be a decision table or information system, where $U = \{u_1, u_2, \ldots, u_n\}$ is the set of objects, $C$ is the set of condition attributes, $D$ is the set of decision attributes, $V = \bigcup_{a \in C \cup D} V_a$ with $V_a$ the value domain of attribute $a$, and $f$ is the mapping function from $U \times (C \cup D)$ to $V$. A subset of attributes $B \subseteq (C \cup D)$ determines an indiscernibility relation $IND(B) = \{(x, y) \in U \times U \mid \forall a \in B, f(x, a) = f(y, a)\}$, and this relation induces a partition of $U$ denoted by $U/B$. Any element of $U/B$ (a set of objects), $[x]_B = \{y \mid \forall a \in B, f(x, a) = f(y, a)\}$, is called an equivalence class.
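Definition 1 can be illustrated with a minimal sketch. The toy decision table `TABLE` and the function names `partition` and `indiscernible` are hypothetical, introduced only for illustration; the grouping here uses a hash map for brevity, not the coding and sorting method developed later in the paper.

```python
from collections import defaultdict

# Hypothetical toy decision table: condition attributes a, b, c; decision d.
TABLE = [
    {"a": 0, "b": 0, "c": 1, "d": 0},
    {"a": 0, "b": 0, "c": 1, "d": 0},
    {"a": 0, "b": 1, "c": 0, "d": 1},
    {"a": 1, "b": 1, "c": 0, "d": 1},
    {"a": 1, "b": 1, "c": 0, "d": 0},
    {"a": 1, "b": 0, "c": 1, "d": 1},
]

def partition(table, attrs):
    """Return U/B as a list of equivalence classes (lists of object indices)."""
    classes = defaultdict(list)
    for idx, row in enumerate(table):
        # Objects with identical values on every attribute in B share a key.
        classes[tuple(row[a] for a in attrs)].append(idx)
    return list(classes.values())

def indiscernible(table, attrs, x, y):
    """True when the pair (x, y) belongs to IND(B)."""
    return all(table[x][a] == table[y][a] for a in attrs)

print(partition(TABLE, ["a", "b", "c"]))  # [[0, 1], [2], [3, 4], [5]]
```

Objects 3 and 4 fall into the same class because they agree on a, b and c, even though their decisions differ.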

Definition 2. For any $R \subseteq C \cup D$ in $S = (U, C, D, V, f)$, there is a partition of $U$, i.e. $U/R = \{R_1, R_2, \ldots, R_l\}$, and any $X \subseteq U$ can be described with this partition. $\underline{R}X = \bigcup\{R_i \mid R_i \in U/R, R_i \subseteq X\}$ is called the lower approximation of $X$ with respect to $R$, and $\overline{R}X = \bigcup\{R_i \mid R_i \in U/R, R_i \cap X \neq \emptyset\}$ is called the upper approximation of $X$ with respect to $R$.
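A short sketch of the two approximations, assuming the partition is already available as a list of blocks of object indices. The function name `approximations` and the example blocks are hypothetical.

```python
def approximations(blocks, target):
    """Return (lower, upper) approximations of `target` for a partition `blocks`."""
    target = set(target)
    lower, upper = set(), set()
    for block in blocks:
        members = set(block)
        if members <= target:   # block lies entirely inside X
            lower |= members
        if members & target:    # block overlaps X
            upper |= members
    return lower, upper

# Example partition U/R and a target concept X (both made up for illustration).
blocks = [[0, 1], [2], [3, 4], [5]]
X = {2, 3, 5}
lower, upper = approximations(blocks, X)
print(sorted(lower), sorted(upper))  # [2, 5] [2, 3, 4, 5]
```

The block [3, 4] only partly overlaps X, so it contributes to the upper approximation but not to the lower one.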

Definition 3. Let $S$ be an information system, and let $P \subseteq (C \cup D)$ and $Q \subseteq (C \cup D)$. Then the positive region $POS_P(Q)$ is defined as $POS_P(Q) = \bigcup_{X \in U/Q} \underline{P}X$, where $\underline{P}X$ is the lower approximation of $X$ with respect to $P$.
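Read literally, Definition 3 takes the union of the $P$-lower approximations of every class in $U/Q$. The sketch below follows that reading directly; the table `TABLE` and the helper names are hypothetical, and no attempt is made at efficiency here.

```python
from collections import defaultdict

# Hypothetical toy decision table: condition attributes a, b, c; decision d.
TABLE = [
    {"a": 0, "b": 0, "c": 1, "d": 0},
    {"a": 0, "b": 0, "c": 1, "d": 0},
    {"a": 0, "b": 1, "c": 0, "d": 1},
    {"a": 1, "b": 1, "c": 0, "d": 1},
    {"a": 1, "b": 1, "c": 0, "d": 0},
    {"a": 1, "b": 0, "c": 1, "d": 1},
]

def partition(table, attrs):
    """Return U/attrs as a list of sets of object indices."""
    classes = defaultdict(set)
    for idx, row in enumerate(table):
        classes[tuple(row[a] for a in attrs)].add(idx)
    return list(classes.values())

def positive_region(table, P, Q):
    """POS_P(Q): union over X in U/Q of the P-lower approximation of X."""
    p_blocks = partition(table, P)
    pos = set()
    for X in partition(table, Q):
        for block in p_blocks:
            if block <= X:      # block is part of the lower approximation of X
                pos |= block
    return pos

print(sorted(positive_region(TABLE, ["a", "b", "c"], ["d"])))  # [0, 1, 2, 5]
```

Objects 3 and 4 are excluded because they are indiscernible on the condition attributes yet carry different decisions.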

Definition 4. For every $b \in B \subseteq C$ in the information system $S$, if $POS_B(D) = POS_{B-\{b\}}(D)$, then $b$ is an unnecessary attribute of the attribute set $B$ with respect to $D$; otherwise, $b$ is a necessary attribute of $B$ with respect to $D$. If every attribute of $B \subseteq C$ is necessary with respect to $D$, $B$ is said to be independent with respect to $D$.

Definition 5. For any $B \subseteq C$ in the information system $S$, if $POS_B(D)$ is equal to $POS_C(D)$ and $B$ is independent with respect to $D$, then $B$ is called a reduction of $C$ with respect to $D$.
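Definitions 4 and 5 can be checked mechanically on a small table. The following is a minimal sketch under stated assumptions: `TABLE`, `positive_region` and `is_reduction` are hypothetical names, and the positive region is computed by testing whether each block of $U/P$ carries a single decision value, which is equivalent to Definition 3.

```python
from collections import defaultdict

# Hypothetical toy decision table: condition attributes a, b, c; decision d.
TABLE = [
    {"a": 0, "b": 0, "c": 1, "d": 0},
    {"a": 0, "b": 0, "c": 1, "d": 0},
    {"a": 0, "b": 1, "c": 0, "d": 1},
    {"a": 1, "b": 1, "c": 0, "d": 1},
    {"a": 1, "b": 1, "c": 0, "d": 0},
    {"a": 1, "b": 0, "c": 1, "d": 1},
]

def positive_region(table, P, D):
    """POS_P(D): objects whose P-class carries exactly one decision value."""
    groups = defaultdict(list)
    for idx, row in enumerate(table):
        groups[tuple(row[a] for a in P)].append(idx)
    pos = set()
    for block in groups.values():
        if len({tuple(table[i][d] for d in D) for i in block}) == 1:
            pos.update(block)
    return pos

def is_reduction(table, B, C, D):
    """Definition 5: POS_B(D) = POS_C(D) and every attribute of B is necessary."""
    pos_B = positive_region(table, B, D)
    if pos_B != positive_region(table, C, D):
        return False
    # Definition 4: b is necessary when dropping it changes the positive region.
    return all(
        positive_region(table, [x for x in B if x != b], D) != pos_B for b in B
    )

C, D = ["a", "b", "c"], ["d"]
print(is_reduction(TABLE, ["a", "b"], C, D))  # True
print(is_reduction(TABLE, C, C, D))           # False
```

The full set C fails the test because b (or c) can be dropped without changing the positive region, so C is not independent.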

Definition 6. The intersection of all reductions of $C$ with respect to $D$ is called the core attribute set of $C$ with respect to $D$, denoted by $Core_D(C)$.

Theorem 1. The necessary and sufficient condition [10] for an attribute $a_i$ to be a core attribute of $C$ with respect to $D$ is $POS_{C-\{a_i\}}(D) \neq POS_C(D)$.
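Theorem 1 gives a direct test for core membership: drop one attribute at a time and compare positive regions. A minimal sketch, in which `TABLE`, `positive_region` and `core` are hypothetical names used for illustration:

```python
from collections import defaultdict

# Hypothetical toy decision table: condition attributes a, b, c; decision d.
TABLE = [
    {"a": 0, "b": 0, "c": 1, "d": 0},
    {"a": 0, "b": 0, "c": 1, "d": 0},
    {"a": 0, "b": 1, "c": 0, "d": 1},
    {"a": 1, "b": 1, "c": 0, "d": 1},
    {"a": 1, "b": 1, "c": 0, "d": 0},
    {"a": 1, "b": 0, "c": 1, "d": 1},
]

def positive_region(table, P, D):
    """POS_P(D): objects whose P-class carries exactly one decision value."""
    groups = defaultdict(list)
    for idx, row in enumerate(table):
        groups[tuple(row[a] for a in P)].append(idx)
    pos = set()
    for block in groups.values():
        if len({tuple(table[i][d] for d in D) for i in block}) == 1:
            pos.update(block)
    return pos

def core(table, C, D):
    """Attributes a_i with POS_{C - {a_i}}(D) != POS_C(D), as in Theorem 1."""
    pos_C = positive_region(table, C, D)
    return [
        a for a in C
        if positive_region(table, [c for c in C if c != a], D) != pos_C
    ]

print(core(TABLE, ["a", "b", "c"], ["d"]))  # ['a']
```

In this table, removing a empties the positive region, while removing b or c leaves it unchanged, so only a is a core attribute.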

978-1-4244-6044-1/10/$26.00 ©2010 IEEE


III. EFFICIENT ATTRIBUTE REDUCTION ALGORITHM WITH ROUGH SET BASED ON CODING AND SORTING

A. Fast algorithm for the indiscernibility relation

The indiscernibility relation is the foundation of rough set theory, and the computational complexity of computing it determines the computational complexity of rough set based attribute reduction. [7] proposed an algorithm for computing the indiscernibility relation with computational complexity $O(|B||U|^2)$, in which the values of each attribute were compared for every pair of samples. [8] proposed an algorithm with the same computational complexity $O(|B||U|^2)$, in which every two samples were compared to determine whether they belong to the same equivalence class. [5] proposed an algorithm for computing the indiscernibility relation with computational complexity $O(|B||U|\log|U|)$, in which the samples were sorted by the values of every attribute and then the partition of the samples was obtained with one further scan.

[5] showed that $IND(B)$ has the following property.

Property 1. Two samples belong to the same equivalence class if and only if all discretized attribute values of the two samples are equal [5].

According to Property 1, [5] first sorted the samples in $U$ by the attribute set $B$ and then obtained the equivalence classes, which requires $|B|$ sorting passes. The attribute values are discretized before a rough set based algorithm is applied, and the number of discretized values of each attribute is bounded by the range of an integer, so the discretized values can be mapped onto integers, which in turn can be represented in two's complement or offset binary. In this paper, the discretized values of the $|B|$ attributes of every sample in $U$ are first transformed into one big integer, then these integers are sorted, and finally the equivalence classes are obtained: objects with the same integer value are in the same equivalence class. Suppose the values of each attribute $a \in B \subset C$ can be represented with $CODE_a$ binary bits, so that the attribute set $B = [B_1, B_2, \ldots, B_{|B|}]$ is coded into $Codebook = [CODE_{B_1}, CODE_{B_2}, \ldots, CODE_{B_{|B|}}]$. Then any sample or object $X = [x_{B_1}, x_{B_2}, \ldots, x_{B_{|B|}}]$ in $U$ can be regarded as a binary string and encoded as an integer $CodedValue(X)$ with the following equation:

$$CodedValue(X) = \sum_{i=1}^{|B|} x_{B_i} \cdot 2^{\sum_{j=1}^{i-1} CODE_{B_j}} \qquad (1)$$
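Eq. (1) amounts to packing each attribute value into its own bit field of one integer. The sketch below assumes non-negative integer attribute values and takes $CODE_a$ to be the bit length of the largest value of attribute $a$; that choice, and the names `bit_widths` and `coded_value`, are assumptions made for illustration rather than details given in the paper.

```python
def bit_widths(table, attrs):
    """CODE_a for each attribute: bits needed for its largest value (at least 1)."""
    return [max(max(row[a] for row in table).bit_length(), 1) for a in attrs]

def coded_value(row, attrs, widths):
    """Eq. (1): sum of x_{B_i} * 2 ** (CODE_{B_1} + ... + CODE_{B_{i-1}})."""
    value, shift = 0, 0
    for a, width in zip(attrs, widths):
        value += row[a] << shift   # place x_{B_i} in its own bit field
        shift += width             # the next field starts after CODE_{B_i} bits
    return value

# Made-up rows with three discretized attributes.
rows = [
    {"a": 2, "b": 5, "c": 1},
    {"a": 3, "b": 0, "c": 0},
    {"a": 2, "b": 5, "c": 1},
]
attrs = ["a", "b", "c"]
widths = bit_widths(rows, attrs)
codes = [coded_value(r, attrs, widths) for r in rows]
print(widths, codes)  # [2, 3, 1] [54, 3, 54]
```

Because every value fits inside its own field, two rows receive the same integer exactly when they agree on every attribute in `attrs`. Python integers are unbounded, so the packed value never overflows; a fixed-width language would need a big-integer type or multi-word keys.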

After the samples or objects are encoded with the aforementioned policy, we have the following theorem.

Theorem 2. If two samples have the same coded value, these two samples are in the same equivalence class.

This theorem can be proved easily from the coding process. With Theorem 2, we have the fast algorithm for computing the equivalence classes of $IND(B)$, given as Algorithm 1.

Algorithm 1 Computing the indiscernibility relation $IND(B)$
Input: $S = \langle U, C, D, V, f \rangle$, $A = C \cup D$, $U = \{u_1, u_2, \ldots, u_{|U|}\}$ and $B \subseteq A$
Output: $IND(B)$
Step 1: Encode the objects in the set $U$ according to Eq. (1);
Step 2: Quick-sort the encoded objects (below, $u_1, \ldots, u_{|U|}$ denote the objects in sorted order);
Step 3: $s = 1$, $j = 1$, $B_1 = \{u_1\}$;
Step 4: for $i = 2$ to $|U|$
{
    if $CodedValue(u_i) = CodedValue(u_j)$
        $B_s = B_s \cup \{u_i\}$;
    else
    {
        $s = s + 1$;
        $B_s = \{u_i\}$;
        $j = i$;
    }
}
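The steps of Algorithm 1 can be sketched as follows. This is an illustrative version under stated assumptions: attribute values are non-negative integers, `TABLE`, `coded_value` and `ind_partition` are hypothetical names, and Python's built-in `sorted` (Timsort) stands in for quick sort, which keeps the same $O(|U|\log|U|)$ number of comparisons.

```python
# Hypothetical toy decision table: condition attributes a, b, c; decision d.
TABLE = [
    {"a": 0, "b": 0, "c": 1, "d": 0},
    {"a": 0, "b": 0, "c": 1, "d": 0},
    {"a": 0, "b": 1, "c": 0, "d": 1},
    {"a": 1, "b": 1, "c": 0, "d": 1},
    {"a": 1, "b": 1, "c": 0, "d": 0},
    {"a": 1, "b": 0, "c": 1, "d": 1},
]

def coded_value(row, attrs, widths):
    """Pack the attribute values of one object into a single integer, Eq. (1)."""
    value, shift = 0, 0
    for a, width in zip(attrs, widths):
        value += row[a] << shift
        shift += width
    return value

def ind_partition(table, attrs):
    """Equivalence classes of IND(attrs), found by encode, sort, single scan."""
    if not table:
        return []
    widths = [max(max(r[a] for r in table).bit_length(), 1) for a in attrs]
    # Steps 1-2: encode every object, then sort (code, index) pairs.
    coded = sorted((coded_value(r, attrs, widths), i) for i, r in enumerate(table))
    # Steps 3-4: one scan; a change in the code starts a new class.
    blocks = [[coded[0][1]]]
    for (prev, _), (code, idx) in zip(coded, coded[1:]):
        if code == prev:
            blocks[-1].append(idx)
        else:
            blocks.append([idx])
    return blocks

print(ind_partition(TABLE, ["a", "b", "c"]))  # [[2], [3, 4], [0, 1], [5]]
```

The classes come out in order of coded value rather than in order of first appearance, which does not matter for the partition itself.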

B. Fast computation of the positive region

The relative core and the relative reduction are both defined in terms of the positive region, so computing the positive region efficiently is very important. Combining Theorem 1 in [5] with Algorithm 1 in this paper gives the following positive region algorithm, Algorithm 2.

Algorithm 2 Computing the positive region $POS_P(Q)$
Input: $S = \langle U, C, D, V, f \rangle$, $A = C \cup D$, $U = \{u_1, u_2, \ldots, u_{|U|}\}$ and $P, Q \subseteq A$
Output: $POS_P(Q)$
Step 1: Compute the partition $U/P = \{P_1, P_2, \ldots, P_l\}$ of $U$ with respect to $P$ by using Algorithm 1;
Step 2: $POS_P(Q) = \emptyset$;
Step 3: for $i = 1$ to $l$
{
    Compute the partition $P_i/Q = \{Q_{i1}, Q_{i2}, \ldots, Q_{im}\}$;
    if ($m = 1$, i.e. $P_i/Q$ contains only one equivalence class)
    {
        $POS_P(Q) = POS_P(Q) \cup P_i$;
    }
}
Step 4: Output $POS_P(Q)$.
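Algorithm 2 can be sketched on top of a sort-and-scan partition. For brevity this sketch sorts tuples of attribute values instead of the packed integers of Eq. (1); that substitution, the table `TABLE`, and the names `partition` and `positive_region` are assumptions made for illustration.

```python
# Hypothetical toy decision table: condition attributes a, b, c; decision d.
TABLE = [
    {"a": 0, "b": 0, "c": 1, "d": 0},
    {"a": 0, "b": 0, "c": 1, "d": 0},
    {"a": 0, "b": 1, "c": 0, "d": 1},
    {"a": 1, "b": 1, "c": 0, "d": 1},
    {"a": 1, "b": 1, "c": 0, "d": 0},
    {"a": 1, "b": 0, "c": 1, "d": 1},
]

def partition(table, attrs, indices):
    """Sort-and-scan partition of the objects in `indices` by `attrs`."""
    keyed = sorted((tuple(table[i][a] for a in attrs), i) for i in indices)
    blocks = []
    for n, (key, idx) in enumerate(keyed):
        if n > 0 and key == keyed[n - 1][0]:
            blocks[-1].append(idx)   # same key as the previous object
        else:
            blocks.append([idx])     # key changed: start a new class
    return blocks

def positive_region(table, P, Q):
    """POS_P(Q): union of the classes of U/P that Q does not split further."""
    pos = set()
    for block in partition(table, P, range(len(table))):   # Step 1
        if len(partition(table, Q, block)) == 1:            # Step 3: m = 1
            pos.update(block)
    return pos

print(sorted(positive_region(TABLE, ["a", "b", "c"], ["d"])))  # [0, 1, 2, 5]
```

A class of $U/P$ enters the positive region only when all of its objects agree on $Q$, which is exactly the case $m = 1$.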

C. Incremental Calculation of Positive Region

Attribute reduction has been proved to be an NP-hard problem, so one usually aims to find a suboptimal reduction by using heuristic information. A common feature of heuristic attribute reduction algorithms is the use of attribute importance. Let $S = \langle U, C \cup D, V, F \rangle$ be an information system and $a \in C - R$ ($R \subset C$); then the importance of the attribute $a$ to the decision attribute set $D$ can be defined as Eq. (2) [9]:

$$SGF(a, R, D) = \gamma_{R \cup \{a\}} - \gamma_R \qquad (2)$$

where $\gamma_R = |POS_R(D)|/|U|$.

From Eq. (2), computing $SGF(a, R, D)$ requires first computing $POS_R(D)$ and $POS_{R \cup \{a\}}(D)$. Both can be computed with Algorithm 2. To reduce the computational complexity further, $POS_{R \cup \{a\}}(D)$ can be calculated incrementally from the partition $U/R$ obtained while computing $POS_R(D)$, as in Algorithm 3.
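A small sketch of the significance measure in Eq. (2). It assumes the dependency degree is normalised by the number of objects $|U|$ (the usual convention), uses `fractions.Fraction` to keep the arithmetic exact, and computes both positive regions from scratch rather than incrementally. `TABLE`, `positive_region`, `gamma` and `sgf` are hypothetical names.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical toy decision table: condition attributes a, b, c; decision d.
TABLE = [
    {"a": 0, "b": 0, "c": 1, "d": 0},
    {"a": 0, "b": 0, "c": 1, "d": 0},
    {"a": 0, "b": 1, "c": 0, "d": 1},
    {"a": 1, "b": 1, "c": 0, "d": 1},
    {"a": 1, "b": 1, "c": 0, "d": 0},
    {"a": 1, "b": 0, "c": 1, "d": 1},
]

def positive_region(table, R, D):
    """POS_R(D): objects whose R-class carries exactly one decision value."""
    groups = defaultdict(list)
    for idx, row in enumerate(table):
        groups[tuple(row[a] for a in R)].append(idx)
    pos = set()
    for block in groups.values():
        if len({tuple(table[i][d] for d in D) for i in block}) == 1:
            pos.update(block)
    return pos

def gamma(table, R, D):
    """Dependency degree: |POS_R(D)| / |U| (assumed normalisation)."""
    return Fraction(len(positive_region(table, R, D)), len(table))

def sgf(table, a, R, D):
    """Eq. (2): gain in dependency degree when attribute a is added to R."""
    return gamma(table, R + [a], D) - gamma(table, R, D)

print(sgf(TABLE, "b", ["a"], ["d"]))  # 2/3
```

Here adding b to {a} moves four of the six objects into the positive region, while adding c to {a, b} afterwards would contribute nothing.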

Algorithm 3 Incremental computation of $POS_{R \cup \{a\}}(D)$
Input: $S = \langle U, C, D, V, f \rangle$, $U/R = \{R_1, R_2, \ldots, R_m\}$, $0 < m \le |U|$
Output: $POS_{R \cup \{a\}}(D)$
Step 1: Compute $POS_{\{a\}}(D \mid R_i)$ with Algorithm 2 for every $R_i$ in $U/R$;
Step 2: $POS_{R \cup \{a\}}(D) = \bigcup_{i=1}^{m} POS_{\{a\}}(D \mid R_i)$.

The computational complexity of Algorithm 3 is about $\sum_{i=1}^{m} O(|R_i|\log|R_i|) \le \sum_{i=1}^{m} O(|R_i|\log(\max_i|R_i|)) \le O(|U|\log|U|)$. So the computational complexity of computing $SGF(a, R, D)$ is $O(|U|\log|U|)$.
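The incremental idea can be sketched as follows: each existing class $R_i$ is split only by the new attribute $a$, and each resulting piece is kept if it carries a single decision. The names `split` and `incremental_pos` and the table `TABLE` are hypothetical, and tuple keys are sorted in place of packed integers for brevity.

```python
# Hypothetical toy decision table: condition attributes a, b, c; decision d.
TABLE = [
    {"a": 0, "b": 0, "c": 1, "d": 0},
    {"a": 0, "b": 0, "c": 1, "d": 0},
    {"a": 0, "b": 1, "c": 0, "d": 1},
    {"a": 1, "b": 1, "c": 0, "d": 1},
    {"a": 1, "b": 1, "c": 0, "d": 0},
    {"a": 1, "b": 0, "c": 1, "d": 1},
]

def split(table, attrs, indices):
    """Sort-and-scan partition of the objects in `indices` by `attrs`."""
    keyed = sorted((tuple(table[i][a] for a in attrs), i) for i in indices)
    blocks = []
    for n, (key, idx) in enumerate(keyed):
        if n > 0 and key == keyed[n - 1][0]:
            blocks[-1].append(idx)
        else:
            blocks.append([idx])
    return blocks

def incremental_pos(table, blocks_R, a, D):
    """Return (POS_{R + a}(D), U/(R + a)) given the existing partition U/R."""
    pos, refined = set(), []
    for block in blocks_R:                      # each R_i is handled separately
        for sub in split(table, [a], block):    # refine R_i by the new attribute
            refined.append(sub)
            if len(split(table, D, sub)) == 1:  # the piece has a single decision
                pos.update(sub)
    return pos, refined

blocks_a = split(TABLE, ["a"], range(len(TABLE)))   # U/{a}
pos, refined = incremental_pos(TABLE, blocks_a, "b", ["d"])
print(sorted(pos), refined)  # [0, 1, 2, 5] [[0, 1], [2], [5], [3, 4]]
```

Since each sort touches only the objects of one class $R_i$, the total sorting work is bounded by the sum in the complexity estimate above. The refined partition can be reused when the next attribute is added.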

D. The proposed attribute reduction algorithm

Algorithms 1 to 3 can be integrated into a complete algorithm for attribute reduction, given as Algorithm 4.

Algorithm 4 Rough set based attribute reduction algorithm
Input: $S = \langle U, C, D, V, f \rangle$
Output: A reduct of $C$ relative to $D$
Step 1: Compute $Core_D(C)$;
Step 2: $RED = Core_D(C)$;
Step 3: Compute $POS_C(D)$ and $POS_{RED}(D)$;
Step 4: While ($POS_C(D) \neq POS_{RED}(D)$)
{
    Find an attribute $a$ from the attribute set $C - RED$ which maximizes $SGF(a, RED, D)$;
    $RED = RED \cup \{a\}$;
    Update $POS_{RED}(D)$;
}

The computational complexity of Algorithm 4 is $O(|C||U|\log|U|)$. Furthermore, its space complexity is only $O(|U|)$.
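The greedy loop of Algorithm 4 can be sketched end to end on a toy table. This is an illustrative version only: it recomputes every positive region from scratch with a hash-based grouping instead of the coded sort and the incremental update, so it does not reproduce the stated complexity. `TABLE`, `positive_region` and `reduce_attributes` are hypothetical names.

```python
from collections import defaultdict

# Hypothetical toy decision table: condition attributes a, b, c; decision d.
TABLE = [
    {"a": 0, "b": 0, "c": 1, "d": 0},
    {"a": 0, "b": 0, "c": 1, "d": 0},
    {"a": 0, "b": 1, "c": 0, "d": 1},
    {"a": 1, "b": 1, "c": 0, "d": 1},
    {"a": 1, "b": 1, "c": 0, "d": 0},
    {"a": 1, "b": 0, "c": 1, "d": 1},
]

def positive_region(table, R, D):
    """POS_R(D): objects whose R-class carries exactly one decision value."""
    groups = defaultdict(list)
    for idx, row in enumerate(table):
        groups[tuple(row[a] for a in R)].append(idx)
    pos = set()
    for block in groups.values():
        if len({tuple(table[i][d] for d in D) for i in block}) == 1:
            pos.update(block)
    return pos

def reduce_attributes(table, C, D):
    """Greedy reduction: start from the core, add the most significant attribute."""
    pos_C = positive_region(table, C, D)
    # Steps 1-2: the core, found with the test of Theorem 1.
    red = [a for a in C
           if positive_region(table, [c for c in C if c != a], D) != pos_C]
    pos_red = positive_region(table, red, D)
    # Step 4: maximizing SGF(a, RED, D) is the same as maximizing
    # |POS_{RED + a}(D)|, because gamma_RED is fixed within one iteration.
    while pos_red != pos_C:
        candidates = [a for a in C if a not in red]
        best = max(candidates,
                   key=lambda a: len(positive_region(table, red + [a], D)))
        red.append(best)
        pos_red = positive_region(table, red, D)
    return red

print(reduce_attributes(TABLE, ["a", "b", "c"], ["d"]))  # ['a', 'b']
```

Attributes b and c tie in significance here; `max` keeps the first maximal candidate, so b is chosen. The loop always terminates because adding every attribute of C reproduces $POS_C(D)$.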

IV. EXPERIMENTAL RESULTS

An integer-valued dataset is randomly generated to compare the performance of two rough set based attribute reduction algorithms, i.e. the fast sorting based reduction algorithm of [5] (Algorithm a) and the proposed algorithm (Algorithm b). The experimental results are shown in Figure 1 and Figure 2. Figure 1 shows the running time ratio of Algorithm a to Algorithm b with a variable number of condition attributes and a fixed dataset size. From Figure 1, we can see that while $|C|$ is less than 350 the running time ratio increases greatly as $|C|$ increases and is much greater than $|C|$; once $|C|$ exceeds 350, the ratio decreases as $|C|$ increases and becomes approximately equal to $|C|$. Figure 2 shows the running time ratio of Algorithm a to Algorithm b with a variable dataset size. From Figure 2, we can see that, for a fixed number of condition attributes, the running time ratio decreases as the dataset size increases and becomes approximately equal to $|C|$. From Figure 1 and Figure 2, we can conclude that the efficiency of the proposed algorithm is much higher than that of the compared algorithm [5], with the computational complexity reduced from $O(|C|^2|U|\log|U|)$ to $O(|C||U|\log|U|)$.
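As a rough consistency check on the observed ratio, and assuming (an assumption, not a measurement) that both implementations have similar constant factors and that lower-order terms are negligible, the ratio of the two complexity bounds is

$$\frac{T_a}{T_b} \approx \frac{|C|^2\,|U|\log|U|}{|C|\,|U|\log|U|} = |C|$$

where $T_a$ and $T_b$ are hypothetical names for the running times of Algorithm a and Algorithm b. This agrees with the regime in which the measured ratio approaches $|C|$.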

Fig. 1. Running time ratio of Algorithm a and Algorithm b with variable number of condition attributes and fixed dataset size

Fig. 2. Running time ratio of Algorithm a and Algorithm b with variable dataset size and fixed number of condition attributes

V. CONCLUSION

Rough set theory uses the upper and lower approximations of concepts to deal with uncertainty in an information system, and the uncertainty is determined entirely by the data, which makes the approach strongly objective. Because of these unique advantages, rough set theory has attracted more and more attention; its theoretical foundations are well established and it has been applied successfully in many fields. To make rough set theory more widely used in practice, it is important to find efficient algorithms with lower computational complexity [5]. Rough set based attribute reduction in information systems is a focus of theoretical research, and its key problem is how to compute the positive region efficiently. [5] proposed a fast algorithm whose computational complexity is $O(|C|^2|U|\log|U|)$, which is still too high for many applications. In this paper, we proposed an improved rough set based attribute reduction algorithm whose computational complexity is $O(|C||U|\log|U|)$, much lower than that of the compared algorithm, achieved by reducing the number of sorting passes. Experimental results showed that the proposed algorithm performed attribute reduction roughly $|C|$ times faster than the compared algorithm. Our algorithm is therefore well suited to attribute reduction in relatively large information systems.

REFERENCES

[1] Z. Pawlak, “Rough sets”, International Journal of Computer and Information Sciences, vol.11, no.5, 1982, pp.341-356.

[2] F. Tay et al., “Fault diagnosis based on rough set theory”, Engineering Applications of Artificial Intelligence, vol.16, no.1, 2003, pp.39-43.

[3] X. Hu, C. Nick, “Learning in relational databases: A rough set approach”, International Journal of Computational Intelligence, vol.11, no.2, 1995, pp.323-338.

[4] D. Ye, “An improvement to Jelonek’s attribution reduction algorithm”, Acta Electronica Sinica, vol.28, no.12, 2000, pp.81-82.

[5] S. Liu, Q. Sheng, B. Wu, Z. Shi, and F. Hu, “Research on Efficient Algorithms for Rough Set Methods”, Chinese Journal of Computers, vol.25, no.5, 2003, pp.524-529.

[6] J. Liang, Z. Xu, “The algorithm on knowledge reduction in incomplete information systems”, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, vol.10, no.1, 2002, pp.95-103.

[7] J. Guan, D. Bell, “Rough computational methods for information systems”, Artificial Intelligence, vol.105, no.1-2, 1998, pp.77-103.

[8] W. Zhang et al., “Theory and methods of Rough Set”, Beijing: Science Press, 2001.

[9] J. Jelonek, K. Krawiec, R. Slowinski, “Rough set reduction of attributes and their domains for neural networks”, International Journal of Computational Intelligence, vol.11, no.2, 1995, pp.339-347.

[10] T. Zhang, J. Xiao, X. Wang, “Algorithms of Attribute Relative Reduction in Rough Set Theory”, Acta Electronica Sinica, vol.33, no.11, 2005, pp.2080-2083.
