TITLE:
Analysis of Risk Factors and Segment-Specific Strategies for Diabetes Prevention
AUTHORS:
Aya Patricia Konan, Adama Coulibaly, Kouassi Bernard Saha, Souleymane Oumtanaga
KEYWORDS:
Diabetes, KMeans, Logistic Regression, Decision Tree, Segmentation
JOURNAL NAME:
Journal of Applied Mathematics and Physics, Vol.13 No.9, September 28, 2025
ABSTRACT: This study proposes a segmented approach to analyzing diabetes risk factors, using the dataset diabete_custom.xlsx (150 individuals, 14 medical and behavioral variables). Two hybrid pipelines, KMeans combined with logistic regression and KMeans combined with a decision tree, were used to define three clusters corresponding to low, moderate, and high risk, and to identify key variables such as blood glucose, BMI, and heredity. The hybrid models improve accuracy and interpretability compared with KMeans alone, and the decision tree is slightly more effective when the clusters are imbalanced. These findings provide a foundation for personalized interventions tailored to each risk profile, including targeted screening, glycemic and nutritional monitoring, physical activity, and educational campaigns.
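
The hybrid approach summarized in the abstract can be illustrated with a minimal sketch, assuming scikit-learn and NumPy are available. It is not the authors' code. The data are synthetic stand-ins for diabete_custom.xlsx, restricted to three of the variables named in the abstract. The feature values, the choice to rank clusters by mean glucose, and the tree depth are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): 150 individuals, 3 illustrative features.
# Each tuple is (mean glucose in mg/dL, mean BMI, probability of family history).
feature_names = ["glucose", "bmi", "heredity"]
group_params = [(90, 22, 0.1), (115, 27, 0.4), (160, 33, 0.8)]
blocks = []
for glucose_mean, bmi_mean, heredity_prob in group_params:
    glucose = rng.normal(glucose_mean, 8, 50)
    bmi = rng.normal(bmi_mean, 2, 50)
    heredity = rng.binomial(1, heredity_prob, 50)
    blocks.append(np.column_stack([glucose, bmi, heredity]))
X = np.vstack(blocks)

# Step 1: unsupervised segmentation into three clusters on standardized features.
X_scaled = StandardScaler().fit_transform(X)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_scaled)

# Relabel clusters as 0 = low, 1 = moderate, 2 = high risk.
# Ranking by mean glucose is an assumption of this sketch.
cluster_glucose = [X[labels == k, 0].mean() for k in range(3)]
rank = np.empty(3, dtype=int)
rank[np.argsort(cluster_glucose)] = np.arange(3)
risk = rank[labels]

# Step 2: supervised models trained to reproduce the cluster labels,
# which makes the segmentation interpretable.
X_train, X_test, Xs_train, Xs_test, y_train, y_test = train_test_split(
    X, X_scaled, risk, test_size=0.3, random_state=0
)
logit = LogisticRegression(max_iter=1000).fit(Xs_train, y_train)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

logit_accuracy = logit.score(Xs_test, y_test)
tree_accuracy = tree.score(X_test, y_test)

# Variable influence: mean absolute coefficient (logistic regression)
# and impurity-based importance (decision tree).
logit_influence = dict(zip(feature_names, np.abs(logit.coef_).mean(axis=0)))
tree_importance = dict(zip(feature_names, tree.feature_importances_))

print("KMeans + logistic regression accuracy:", round(logit_accuracy, 3))
print("KMeans + decision tree accuracy:", round(tree_accuracy, 3))
print("Logistic regression influence:", logit_influence)
print("Decision tree importance:", tree_importance)
```

In this sketch the classifiers are scored on how well they recover the KMeans segments on held-out individuals, so the accuracies describe agreement with the clustering rather than clinical diagnostic performance. The logistic regression is fitted on standardized features so that coefficient magnitudes are comparable across variables, while the tree is fitted on raw values so that its split thresholds stay in the original units.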