TITLE:
Extra Trees Model for Heart Disease Prediction
AUTHORS:
Uchenna J. Nzenwata, Emokiniovo Edwin, Emmanuel A. Chukwu, Dare Osilaja, Johnson O. Hinmikaiye, Chidiebere Enyinnah
KEYWORDS:
Accuracy, Extra Tree Model, Heart Disease Prediction, Machine Learning, Predictive Model, Random Forest, Recursive Feature Elimination
JOURNAL NAME:
Journal of Data Analysis and Information Processing, Vol.13 No.2, April 30, 2025
ABSTRACT: Heart disease remains a major cause of death worldwide, so reliable prediction models are needed to enable early detection and treatment. This study investigates the Extra Tree (Extremely Randomized Trees) algorithm as a machine learning approach to improving the accuracy of heart disease prediction. The work covers data preparation, model training, and performance evaluation using accuracy, precision, recall, and F1-score. The dataset was obtained from the University of California, Irvine (UCI) Machine Learning Repository and contains about 319,796 instances and 18 attributes related to heart disease, spanning a variety of medical and demographic variables. The attributes serve as the features. Recursive feature elimination, with Random Forest as the estimator, reduced the number of features from 18 to 7. With an 80%/20% split between training and test sets, the Extra Tree model achieved accuracy, precision, recall, and F1 scores of 93.1%, 94.8%, 100%, and 93.1%, respectively. It also outperformed a number of baseline models in terms of accuracy and predictive power. The study concludes that the model could be integrated into a clinical decision support system to help healthcare providers diagnose cardiac disease. Furthermore, the feature importance analysis can help direct future research toward identifying the most significant risk factors for cardiovascular disease.
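The workflow summarized in the abstract (recursive feature elimination with a Random Forest estimator, reduction from 18 to 7 features, an 80%/20% split, and an Extra Trees classifier scored on accuracy, precision, recall, and F1) might be sketched roughly as follows with scikit-learn. This is an illustrative sketch only, not the authors' code: the use of scikit-learn, the synthetic stand-in data from make_classification, the hyperparameters, the random seed, the stratified split, and the hypothetical helper name run_pipeline are all assumptions, and the numbers it prints are unrelated to the results reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split


def run_pipeline(X, y, n_features=7, seed=0):
    """Hypothetical helper: RFE (Random Forest estimator), then Extra Trees."""
    # 80% / 20% train/test split, as stated in the abstract.
    # Stratification is an assumption; the abstract does not mention it.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed, stratify=y
    )

    # Recursive feature elimination down to n_features columns.
    # Fitting on the training split only is an assumption made here
    # so that the test data does not influence feature selection.
    selector = RFE(
        estimator=RandomForestClassifier(n_estimators=50, random_state=seed),
        n_features_to_select=n_features,
    )
    selector.fit(X_train, y_train)

    # Extra Trees classifier trained on the selected features
    # (hyperparameters are illustrative, not taken from the paper).
    model = ExtraTreesClassifier(n_estimators=100, random_state=seed)
    model.fit(selector.transform(X_train), y_train)
    y_pred = model.predict(selector.transform(X_test))

    metrics = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred, zero_division=0),
        "f1": f1_score(y_test, y_pred, zero_division=0),
    }
    return selector.support_, metrics


# Synthetic stand-in data with 18 features; NOT the heart disease dataset.
X, y = make_classification(
    n_samples=500, n_features=18, n_informative=7, random_state=0
)
selected_mask, metrics = run_pipeline(X, y)
print("selected feature indices:", [i for i, keep in enumerate(selected_mask) if keep])
print(metrics)
```

To try this on real data, X and y would be replaced by the preprocessed feature matrix and binary heart disease label; categorical attributes would need to be encoded numerically first, a step this sketch does not cover.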