Mitigating class imbalance in heart disease detection with machine learning

Mohbey, Krishna Kumar
DOI: https://doi.org/10.1007/s11042-024-19705-8
IF: 2.577
2024-06-27
Multimedia Tools and Applications
Abstract: Cardiovascular disease has emerged as a severe health issue that affects individuals of all ages and backgrounds. Surveys indicate that cardiovascular disorders cause 32% of all deaths worldwide. One significant contributing factor is the difficulty of predicting cardiac illnesses such as heart attacks, because prediction is complicated and requires substantial information and expertise. Machine learning has become one of the most prominent subfields of computer science and has been applied effectively to many complex problems, notably in medicine. In this study, we adopt various supervised classifiers and analyze their effectiveness in heart disease prediction. To address the imbalanced classification problem, we apply four imbalance-handling methods: Synthetic Minority Oversampling TEchnique (SMOTE), random oversampling (ROS), random undersampling (RUS), and cost-sensitive learning. For experimentation, we use an extremely imbalanced dataset, the Behavioral Risk Factor Surveillance System (BRFSS) 2021 Heart Disease Health Indicators Dataset. The results indicate that random oversampling performed noticeably better than the other methods.
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering
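
To make the imbalance-handling methods named in the abstract concrete, here is a minimal Python sketch of random oversampling, random undersampling, and an inverse-frequency class weighting of the kind often used for cost-sensitive learning. It is an illustration under stated assumptions, not the paper's implementation: the function names, the toy data, and the weighting formula are all hypothetical choices made here, and SMOTE (which synthesizes new minority points by interpolation rather than duplicating existing ones) is not shown.

```python
import random
from collections import Counter


def _group_by_class(rows, labels):
    """Collect the rows belonging to each class label."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return groups


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows until every class matches the largest class."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    target = max(len(members) for members in groups.values())
    out_rows, out_labels = [], []
    for label, members in groups.items():
        # Sample with replacement to fill the gap up to the majority size.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        out_rows.extend(members + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of each class, sized to match the smallest class."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    target = min(len(members) for members in groups.values())
    out_rows, out_labels = [], []
    for label, members in groups.items():
        # Sample without replacement; the remaining rows are discarded.
        out_rows.extend(rng.sample(members, target))
        out_labels.extend([label] * target)
    return out_rows, out_labels


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Assumption: one common heuristic for cost-sensitive learning; the paper
    may weight misclassification costs differently.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Toy imbalanced data (hypothetical): 8 negatives, 2 positives.
rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2

over_rows, over_labels = random_oversample(rows, labels)
under_rows, under_labels = random_undersample(rows, labels)
weights = balanced_class_weights(labels)
```

Oversampling keeps every original row at the cost of duplicated minority examples, undersampling discards majority rows, and class weighting leaves the data untouched and shifts the penalty into the loss. In practice, resampling is applied to the training split only, so that evaluation data keeps the original class distribution.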