An Innovative Approach for Attribute Reduction in Rough Set Theory

Abstract

The Rough Sets Theory is used in data mining with emphasis on the treatment of uncertain or vague information. In the case of classification, this theory implicitly calculates reducts of the full set of attributes, eliminating those that are redundant or meaningless. Such reducts may even serve as input to other classifiers other than Rough Sets. The typical high dimensionality of current databases precludes the use of greedy methods to find optimal or suboptimal reducts in the search space and requires the use of stochastic methods. In this context, the calculation of reducts is typically performed by a genetic algorithm, but other metaheuristics have been proposed with better performance. This work proposes the innovative use of two known metaheuristics for this calculation, the Variable Neighborhood Search, the Variable Neighborhood Descent, besides a third heuristic called Decrescent Cardinality Search. The last one is a new heuristic specifically proposed for reduct calculation. Considering some databases commonly found in the literature of the area, the reducts that have been obtained present lower cardinality, i.e., a lower number of attributes.

Share and Cite:

Pessoa, A. and Stephany, S. (2014) An Innovative Approach for Attribute Reduction in Rough Set Theory. Intelligent Information Management, 6, 223-239. doi: 10.4236/iim.2014.65022.

Conflicts of Interest

The authors declare no conflicts of interest.

References

[1] Pawlak, Z. (1982) Rough Sets. International Journal of Computing and Information Sciences, 11, 341-356.
http://www.springerlink.com/index/r5556398717921x5.pdf
[2] Komorowski, J., Pawlak, Z., Polkowski, L. and Skowron, A. (1999) Rough Sets: A Tutorial. Warsaw.
http://secs.ceas.uc.edu/~mazlack/dbm.w2011/Komorowski.RoughSets.tutor.pdf
[3] Ohrn, A. (1999) Discernibility and Rough Sets in Medicine: Tools and Applications. Norwegian University of Science and Technology, Trondheim.
[4] Jensen, R. and Shen, Q. (2003) Finding Rough Set Reducts with Ant Colony Optimization. Proceedings of the 2003 UK Workshop on Computational Intelligence, 15-22.
http://users.aber.ac.uk/rkj/pubs/papers/antRoughSets.pdf
[5] Hedar, A.-R., Wang, J. and Fukushima, M. (2008) Tabu Search for Attribute Reduction in Rough Set Theory. Soft Computing, 12, 909-918.
http://www.springerlink.com/index/c4181mh5l23816u3.pdf
[6] Pessoa, A.S.A., Stephany, S. (2012) Feature Selection with New Metaheuristics in the Rough Sets Theory. XXXIV National Congress of Computational and Applied Mathematics, 1, 1-7.
[7] Hansen, P. and Mladenovic, N. (2003) A Tutorial on Variable Neighborhood Search. Montreal.
http://yalma.fime.uanl.mx/~roger/work/teaching/mecbs5122/5-VNS/VNS-tutorial-G-2003-46.pdf
[8] Han, J., Sanchez, R. and Hu, X. (2005) Feature Selection Based on Relative Attribute Dependency: An Experimental Study. Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing, 3641, 214-223.
http://www.springerlink.com/index/743agvmf3etlaq0p.pdf http://dx.doi.org/10.1007/11548669_23
[9] Zadeh, L.A. (1965) Fuzzy Sets. Information and Control, 8, 338-353.
http://dx.doi.org/10.1016/S0019-9958(65)90241-X
[10] Shafer, G. (1976) A Mathemathical Theory of Evidence. Princeton University Press, Princeton.
[11] Dempster, A.P. (1967) Upper and Lower Probabilities Induced by a Multivalued Mapping. The Annals of Mathematical Statistics, 38, 325-339.
http://dx.doi.org/10.1214/aoms/1177698950
[12] Zadeh, L.A. (1999) Fuzzy Sets as a Basis for a Theory of Possibility. Fuzzy Sets and Systems, 100, 9-34.
http://dx.doi.org/10.1016/S0165-0114(99)80004-9
[13] Nicoletti, M. do C. and Uchôa, J.Q. (1997) Rough Sets under the Perspective of Pertinence Function. 3rd Brazilian Symposium on Intelligent Automation, Vitoria, Brazil, 307-313, in Portuguese.
[14] Chouchoulas, A. and Shen, Q. (2001) Rough Set-Aided Keyword Reduction for Text Categorization. Applied Artificial Intelligence, 15, 843-873.
http://www.tandfonline.com/doi/abs/10.1080/088395101753210773
[15] Jensen, R. and Shen, Q. (2005) Semantics-Preserving Dimensionality Reduction: Rough and Fuzzy-Rough-Based Approaches. IEEE Transactions on Knowledge and Data Engineering, 16, 1457-1471.
http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1350758
[16] Holland, J.H. (1975) Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor.
http://www.citeulike.org/group/664/article/400721
[17] Goldberg, D.E. (1989) Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley, Boston.
http://www.mendeley.com/research/genetic-algorithms-in-search-optimization-and-machine-learning/
[18] Kirkpatrick, S., Gelatt, C.D. and Vecchi, M.P. (1983) Optimization by Simulated Annealing. Science, 220, 671-680.
http://www.sciencemag.org/content/220/4598/671.short http://dx.doi.org/10.1126/science.220.4598.671
[19] Glover, F. and McMillan, C. (1986) The General Employee Scheduling Problem. An Integration of MS and AI. Computers Operations Research, 13, 563-573.
http://dx.doi.org/10.1016/0305-0548(86)90050-X
[20] Colorni, A., Dorigo, M. and Maniezzo, V. (1991) Distributed Optimization by Ant Colonies. European Conference on Artificial Life, Paris, France 134-142.
[21] Hansen, P. and Mladenovic, N. (1997) Variable Neighborhood Search. Computers & Operations Research, 24, 1097-1100.
http://www.sciencedirect.com/science/article/pii/S0305054897000312
http://dx.doi.org/10.1016/S0305-0548(97)00031-2
[22] Blake, C.L. and Merz, C.J. (1998) UCI Repository of Machine Learning Databases. University of California, Oakland.
http://www.ics.uci.edu/~mlearn

Copyright © 2023 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.