TITLE:
A KNN Undersampling Approach for Data Balancing
AUTHORS:
Marcelo Beckmann, Nelson F. F. Ebecken, Beatriz S. L. Pires de Lima
KEYWORDS:
Machine Learning, Class Overlapping, Imbalanced Datasets
JOURNAL NAME:
Journal of Intelligent Learning Systems and Applications, Vol.7 No.4, November 11, 2015
ABSTRACT: In supervised learning, an imbalanced
number of instances among the classes in a dataset can cause algorithms to
misclassify instances of the minority class as belonging to the majority class.
To address this problem, the KNN algorithm serves as a basis for several
balancing methods. These balancing methods are revisited in this work, and a
new and simple KNN undersampling approach is proposed. In the experiments,
the KNN undersampling method outperformed other sampling methods. The
proposed method also outperformed the results reported in other studies,
which indicates that the simplicity of KNN can serve as a basis for efficient
algorithms in machine learning and knowledge discovery.
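
The abstract does not state the exact removal rule, so the following is only a minimal sketch of one plausible neighbor-based undersampling scheme, not the authors' confirmed procedure. It assumes that a majority-class instance is dropped when at least t of its k nearest neighbors belong to another class. The function name knn_undersample, the parameters k and t, and the toy data are all hypothetical.

```python
from math import dist


def knn_undersample(X, y, majority_label, k=3, t=1):
    """Drop majority-class points that have at least t of their k nearest
    neighbours outside the majority class.

    This rule is an assumption made for illustration; the abstract does
    not specify the removal criterion.
    """
    keep = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        if yi != majority_label:
            keep.append(i)  # minority instances are always kept
            continue
        # Indices of the k nearest other points by Euclidean distance
        neighbours = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: dist(xi, X[j]),
        )[:k]
        foreign = sum(1 for j in neighbours if y[j] != majority_label)
        if foreign < t:
            keep.append(i)  # neighbourhood is dominated by the majority class
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data: four majority points near the origin, two minority points near
# (1, 1), and one majority point sitting inside the minority region.
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
     (1.0, 1.0), (1.1, 1.0), (0.95, 1.0)]
y = [0, 0, 0, 0, 1, 1, 0]

X_bal, y_bal = knn_undersample(X, y, majority_label=0, k=3, t=1)
print(y_bal)
```

In this toy run only the majority point at (0.95, 1.0) is removed, since two of its three nearest neighbors are minority instances. The brute-force neighbor search is quadratic in the number of instances and is meant for illustration only.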