TITLE:
An Improved Algorithm for Imbalanced Data and Small Sample Size Classification
AUTHORS:
Yong Hu, Dongfa Guo, Zengwei Fan, Chen Dong, Qiuhong Huang, Shengkai Xie, Guifang Liu, Jing Tan, Boping Li, Qiwei Xie
KEYWORDS:
Class Imbalance Learning, Over-Sampling, High-Dimensional Small-Sample Size, Support Vector Machine
JOURNAL NAME:
Journal of Data Analysis and Information Processing, Vol.3 No.3, July 8, 2015
ABSTRACT: Traditional classification algorithms perform poorly on imbalanced data sets and on data with small sample sizes. To address this problem, a novel method is proposed that changes the class distribution by adding virtual samples, which are generated by the windowed regression over-sampling (WRO) method. The proposed WRO method reflects not only the additive effects but also the multiplicative effect between samples. A comparative study of the proposed method and other over-sampling methods, such as the synthetic minority over-sampling technique (SMOTE) and borderline over-sampling (BOS), is provided on UCI datasets and a Fourier transform infrared spectroscopy (FTIR) data set. Experimental results show that the WRO method can achieve better performance than the other methods.
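For readers unfamiliar with over-sampling, the following is a minimal sketch of SMOTE-style interpolation, one of the baseline methods named in the abstract. It is not the WRO method, whose details are not given in this record. The function name `smote_like_oversample` and its parameters (`n_new`, `k`, `seed`) are hypothetical and chosen only for illustration; the sketch assumes the minority class has at least two samples.

```python
import numpy as np

def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Generate n_new virtual minority samples by interpolating between
    each chosen sample and one of its k nearest minority neighbors.
    Illustrative SMOTE-style sketch only; not the WRO method."""
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    k = min(k, n - 1)  # cannot have more neighbors than other samples

    # Pairwise Euclidean distances within the minority class.
    diffs = minority[:, None, :] - minority[None, :, :]
    dist = np.linalg.norm(diffs, axis=2)
    np.fill_diagonal(dist, np.inf)  # exclude each sample from its own neighbors

    # Indices of the k nearest neighbors for every minority sample.
    neighbors = np.argsort(dist, axis=1)[:, :k]

    # Pick a base sample and one of its neighbors for each new sample.
    base = rng.integers(0, n, size=n_new)
    picks = neighbors[base, rng.integers(0, k, size=n_new)]

    # Place the virtual sample at a random point on the connecting segment.
    gap = rng.random((n_new, 1))
    return minority[base] + gap * (minority[picks] - minority[base])
```

Because each virtual sample lies on a line segment between two existing minority samples, this kind of interpolation captures only additive combinations of samples, which is the limitation the abstract says WRO is meant to go beyond.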