Ensemble-based active learning for class imbalance problem

PP. 1022-1029
DOI: 10.4236/jbise.2010.310133

ABSTRACT

In medical diagnosis, class imbalance is a common problem. Although unlabeled data are abundant, labeled data are difficult and expensive to obtain. In this paper, an ensemble-based active learning algorithm is proposed to address the class imbalance problem. Artificial data are created according to the distribution of the training dataset to make the ensemble diverse, and the random subspace re-sampling method is used to reduce the data dimension. Member classifiers are selected based on misclassification cost estimation: the minority class is assigned higher weights for misclassification costs, while each testing sample has a variable penalty factor that induces the ensemble to correct its current errors. In our experiments with UCI disease datasets, F-value and G-means, rather than classification accuracy, are used as the evaluation metrics. Compared with other ensemble methods, our method shows the best performance and needs fewer labeled samples.
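The abstract does not define the two evaluation metrics, so the following is only a minimal sketch that assumes their usual definitions: the F-value as the harmonic mean of precision and recall on the minority (positive) class, and the G-mean as the geometric mean of sensitivity and specificity. The function names, the `positive` label, and the default `beta=1.0` are illustrative assumptions, not taken from the paper.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN, treating `positive` as the minority class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t != positive and p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def f_value(y_true, y_pred, positive=1, beta=1.0):
    """F-value on the positive class (beta=1 is an assumption)."""
    tp, fp, _, fn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Toy imbalanced sample: 2 positives (minority) and 8 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
all_negative = [0] * 10  # a classifier that ignores the minority class
accuracy = sum(t == p for t, p in zip(y_true, all_negative)) / len(y_true)
```

On this toy sample, the classifier that always predicts the majority class still scores 80% accuracy, while both its F-value and G-mean are zero. This illustrates why accuracy can be misleading under class imbalance.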

Share and Cite:

Yang, Y. and Ma, G. (2010) Ensemble-based active learning for class imbalance problem. Journal of Biomedical Science and Engineering, 3, 1022-1029. doi: 10.4236/jbise.2010.310133.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.