Improving coronary heart disease prediction with real-life dataset: a stacked generalization framework with maximum clinical attributes and SMOTE balancing for imbalanced data
Madhuri Dubey, Jitendra Tembhurne, Richa Makhijani
DOI: https://doi.org/10.1007/s11042-024-19429-9
IF: 2.577
2024-06-02
Multimedia Tools and Applications
Abstract: Heart disease increases the strain on the heart by reducing its ability to pump blood throughout the body, which can lead to heart attacks and strokes. It is becoming a global threat because of unhealthy lifestyles, prevalent stroke history, physical inactivity, and patients' existing medical conditions. In predictive analytics, many studies have proposed models that give early warning of heart disease based on various attributes. Although the reported performance metrics were good, those models were trained on only a few features. This study aims to train a model on all the essential attributes for heart disease prediction using the Framingham Heart Study (FHS) dataset. The dataset is pre-processed with interquartile range (IQR) outlier detection, followed by oversampling with the Synthetic Minority Oversampling Technique (SMOTE). We propose a stacked generalization approach in which several machine learning classifiers with optimized hyperparameters, namely logistic regression, random forest, K-nearest neighbour, Naïve Bayes, support vector machine, XGBoost, and decision tree, are trained to identify the best learner for predicting coronary heart disease with improved performance. The proposed model is tested on both the original imbalanced FHS dataset and the SMOTE-balanced FHS dataset. Logistic regression achieves 86.51% accuracy on the original (imbalanced) dataset, while the support vector machine outperforms the other individual models on the SMOTE-balanced dataset with an accuracy of 93.07%. The proposed stacked generalization approach with cross-validation achieves 97.2% accuracy on the SMOTE-balanced dataset.
computer science, information systems, theory & methods,engineering, electrical & electronic, software engineering
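
The pre-processing described in the abstract (IQR outlier detection followed by SMOTE oversampling) can be sketched as follows. This is a minimal standard-library illustration, not the authors' implementation: the function names `iqr_bounds`, `drop_outliers`, and `smote`, the 1.5 fence multiplier, and the choice of k = 5 neighbours are all assumptions that the abstract does not state.

```python
import random
import statistics


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences: Q1 - k*IQR and Q3 + k*IQR (k is assumed)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(rows, k=1.5):
    """Keep only rows whose every feature lies inside its column's IQR fences."""
    bounds = [iqr_bounds(col, k) for col in zip(*rows)]
    return [
        row for row in rows
        if all(lo <= v <= hi for v, (lo, hi) in zip(row, bounds))
    ]


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority points.

    Each point is a random interpolation between a randomly chosen
    minority sample and one of its k nearest minority neighbours
    (squared Euclidean distance). Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic
```

To balance a two-class dataset with this sketch, one would call `smote(minority_rows, len(majority_rows) - len(minority_rows))` and append the result to the minority class. In practice a maintained implementation, such as the `SMOTE` class in the imbalanced-learn package, would normally be preferred.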