TITLE:
A Machine Learning Classification Model for Detecting Prediabetes
AUTHORS:
A. K. M. Raquibul Bashar, Mahdi Goudarzi, Chris P. Tsokos
KEYWORDS:
Prediabetes, Machine Learning, SVM, Forest, Cumulative Lift
JOURNAL NAME:
Journal of Data Analysis and Information Processing, Vol.12 No.3, August 27, 2024
ABSTRACT: The incidence of prediabetes in the USA has reached a dangerous level. If prediabetes is ignored, the likelihood of developing chronic and complex health issues is very high. Early detection is therefore critical to reduce or avoid type 2 diabetes and the other health issues that result from undiagnosed and untreated prediabetes. This study aims to detect prediabetes using artificial intelligence methods. The data come from a Centers for Disease Control and Prevention (CDC) survey conducted by the Division of Health and Nutrition Examination Surveys (DHANES). Several machine learning algorithms are applied and compared on Average Squared Error (ASE), Kolmogorov-Smirnov (Youden) scores, area under the ROC curve, and other performance measures. Based on these scores, Random Forest is selected as the champion model, with approximately 89% accuracy.
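The comparison measures named in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function names are hypothetical, the eight labels and scores are invented toy values rather than study data, and the Kolmogorov-Smirnov (Youden) score is assumed here to be the maximum gap between the true positive rate and the false positive rate along the ROC curve.

```python
def average_squared_error(y_true, y_prob):
    """Mean of squared differences between 0/1 labels and predicted probabilities."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_points(y_true, y_prob):
    """Return (FPR, TPR) points obtained by lowering the decision threshold.

    Assumes both classes are present. Tied scores are moved together.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    ranked = sorted(zip(y_prob, y_true), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        score = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == score:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def ks_youden(points):
    """Largest vertical gap TPR - FPR along the ROC curve."""
    return max(tpr - fpr for fpr, tpr in points)


def area_under_roc(points):
    """Trapezoidal area under the (FPR, TPR) curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


def cumulative_lift(y_true, y_prob, fraction):
    """Positive rate in the top-scored fraction divided by the overall positive rate."""
    n_top = max(1, round(fraction * len(y_true)))
    ranked = sorted(zip(y_prob, y_true), key=lambda pair: -pair[0])
    top_rate = sum(label for _, label in ranked[:n_top]) / n_top
    base_rate = sum(y_true) / len(y_true)
    return top_rate / base_rate


# Toy example (invented values): 1 = prediabetic, 0 = not prediabetic.
labels = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

curve = roc_points(labels, scores)
print("ASE:", average_squared_error(labels, scores))
print("KS (Youden):", ks_youden(curve))
print("AUC:", area_under_roc(curve))
print("Cumulative lift, top 25%:", cumulative_lift(labels, scores, 0.25))
```

Under this kind of comparison, a candidate with lower ASE and higher KS, AUC, and lift on held-out data would be preferred; how the study weighs the measures against each other when choosing the champion model is not stated in the abstract.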