Brain stroke prediction model based on boosting and stacking ensemble approach

Subhash Mondal, Soumadip Ghosh, Amitava Nag
DOI: https://doi.org/10.1007/s41870-023-01418-0
2023-08-24
International Journal of Information Technology
Abstract: Brain stroke is a rapidly growing concern among young age groups. According to estimates in a recent WHO report, deaths from stroke worldwide will rise to 6.7 million per year if the condition goes undetected and untreated. Machine learning (ML) based prediction models can reduce the fatality rate by detecting this medical condition early through analysis of the factors that influence cerebral stroke. This research studied the performance and behavior of six boosting-based ensemble ML models for the prediction of brain stroke: Ada-Boost (AB), histogram-based gradient boost (HGB), XGBoost (XGB), gradient boost (GB), light gradient boosting machine (LGBM), and Cat boost (CB). We used two separate datasets with similar attributes to build and validate the deployed model. Dataset 1 (DF 1) contains 43,400 tuples and was used for training and testing the models. Dataset 2 (DF 2) has 4981 tuples and 11 attributes and was used to validate model performance after basic data pre-processing steps were applied. We achieved an accuracy of 98.51% with CB on DF 2. We then stacked subsets of the six models mentioned above and listed the results. We concluded that the stacked model performed well in finding the best mapping function for predicting stroke, with an accuracy of 97.88%. This study offers an insightful view of a boosting-based stacked generalization model for predicting brain stroke at an early stage.