TITLE:
Early Detection of Diabetes Using a Hybrid Approach Based on the Voting Classifier
AUTHORS:
Adlès Francis Kouassi, Bi Irié Cyrille Dje, Kigninman Désiré Kone, Olivier Asseu
KEYWORDS:
Diabetes Prediction, Extra Trees Classifier, XGBoost, K-Nearest Neighbors, SMOTEENN, Machine Learning, Hyperparameter Optimization
JOURNAL NAME:
Open Journal of Applied Sciences, Vol.15 No.4, March 27, 2025
ABSTRACT: Early detection of type 2 diabetes is a major challenge for healthcare professionals, because a late diagnosis can lead to severe complications that are difficult to manage. This paper proposes a hybrid approach based on a Voting ensemble, designed to improve the accuracy of diabetes prediction. The methodology has three main steps. First, we balance the dataset classes with the SMOTEENN method to correct the imbalance between positive and negative cases. Next, we combine three complementary algorithms, Extra Trees Classifier (ETC), XGBoost (XGB), and K-Nearest Neighbors (KNN), using the Voting strategy, which leverages the specific strengths of each model while reducing their individual limitations. Finally, we apply GridSearch to optimize the hyperparameters. In experiments on the Pima Indians Diabetes Dataset, the hybrid model achieves an accuracy of 95.50%, a precision of 93.22%, a recall of 98.21%, an F1-Score of 95.65%, and an AUC-ROC of 98.83%. These results surpass those of the individual models, demonstrating the potential of this approach for developing reliable and effective tools for the early diagnosis of type 2 diabetes.
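The voting and hyperparameter-search steps described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The data are synthetic, shaped to resemble the Pima dataset. Soft voting is assumed, since the abstract does not say whether hard or soft voting was used. The parameter grid is hypothetical. So that the sketch depends on scikit-learn alone, scikit-learn's GradientBoostingClassifier stands in for XGBoost, and the SMOTEENN resampling step (available in the imbalanced-learn package) is omitted. In a fuller pipeline, resampling would typically be applied to the training split only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    VotingClassifier,
)
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic stand-in data (768 rows, 8 features, about 35% positives),
# not the real Pima Indians Diabetes Dataset.
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=5,
    weights=[0.65, 0.35], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Three base learners combined by soft voting (averaged class probabilities).
# Assumption: GradientBoostingClassifier is a stand-in for XGBoost.
voter = VotingClassifier(
    estimators=[
        ("etc", ExtraTreesClassifier(random_state=0)),
        ("gb", GradientBoostingClassifier(random_state=0)),
        # KNN is distance-based, so its inputs are standardized first.
        ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier())),
    ],
    voting="soft",
)

# Hypothetical, deliberately small grid; the paper's grid is not given in the abstract.
param_grid = {
    "etc__n_estimators": [100, 200],
    "gb__learning_rate": [0.05, 0.1],
    "knn__kneighborsclassifier__n_neighbors": [3, 5, 7],
}
search = GridSearchCV(voter, param_grid, cv=3, scoring="f1")
search.fit(X_train, y_train)

# Evaluate the refitted best ensemble on the held-out split.
proba = search.predict_proba(X_test)[:, 1]
accuracy = accuracy_score(y_test, search.predict(X_test))
auc = roc_auc_score(y_test, proba)
print(search.best_params_, round(accuracy, 3), round(auc, 3))
```

The scores printed by this sketch come from synthetic data and are not comparable to the figures reported in the abstract.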