TITLE:
Enhancing Prediction of Osteoporosis Using Supervised and Unsupervised Learning: New Approach to Disease Subtyping
AUTHORS:
Muhannad Almohaimeed
KEYWORDS:
Osteoporosis, Machine Learning, Prediction, Clustering, Subtypes
JOURNAL NAME:
Intelligent Information Management, Vol.17 No.2, March 19, 2025
ABSTRACT: Background: Osteoporosis is a serious health condition that can lead to severe clinical complications, including fractures. Such fractures are a major risk factor for disability and even death in the elderly. Timely diagnosis of osteoporosis can help identify patients at risk, prevent such fractures, and improve patient outcomes. Objective: The aim of this study is to explore a novel hybrid approach that characterizes osteoporotic patients into distinct subtypes, leading to improved classification of the condition. Methods: We examined a cohort of 10,000 patients drawn from nationwide chronic disease data in Germany, of whom 1293 had osteoporosis. We included various medical variables such as chronic kidney disease, cancer, stroke, hypertension, and diabetes. We deployed a hybrid approach that used HDBSCAN clustering to stratify patients into distinct subtypes. We then constructed predictive models for each subtype using seven different classification methods. Results: We identified seven distinct subtypes, each linked with different conditions such as cancer, cardiovascular diseases, and chronic obstructive pulmonary disease (COPD). Logistic Regression achieved the highest subtype-level prediction performance, reaching an accuracy of 87.8%, compared with the other predictive models trained on the original dataset without clustering. The unsupervised learning step improved prediction for all classification methods, underscoring the value of applying subtype analysis to complex data. Conclusion: This research showed that a hybrid methodology is important both for discovering patient subtypes and for making predictions more precise. The choice of methods was critical in ensuring robust prediction performance. The predictive model is valuable for identifying patients at high risk of osteoporosis and for enabling early intervention and prevention strategies. This approach holds potential for the study of other complex clinical diseases using other data sources.