Study on the effectiveness of AutoML in detecting cardiovascular disease

T.V. Afanasieva, A.P. Kuzlyakin, A.V. Komolov
2023-08-19
Abstract: Cardiovascular diseases are widespread among patients with chronic noncommunicable diseases and are one of the leading causes of death, including among people of working age. The article motivates the development and application of patient-oriented systems, in which machine learning (ML) is a promising technology for predicting cardiovascular diseases. Automated machine learning (AutoML) simplifies and speeds up the development of AI/ML applications, which is key when patient-oriented systems are developed by their own users, in particular medical specialists. The authors propose a framework for applying automated machine learning and three scenarios, applied to data combining five data sets of cardiovascular disease indicators from the UCI Machine Learning Repository, to investigate its effectiveness in detecting this class of diseases. The study investigated one AutoML model that used and optimized the hyperparameters of thirteen basic ML models (KNeighborsUnif, KNeighborsDist, LightGBMXT, LightGBM, RandomForestGini, RandomForestEntr, CatBoost, ExtraTreesGini, ExtraTreesEntr, NeuralNetFastAI, XGBoost, NeuralNetTorch, LightGBMLarge) and included the most accurate of them in a weighted ensemble. The results showed that the structure of the AutoML model for detecting cardiovascular diseases depends not only on the efficiency and accuracy of the basic models used, but also on the scenario for preprocessing the initial data, in particular on the data normalization technique. The comparative analysis showed that the accuracy of the AutoML model in detecting cardiovascular disease ranged from 87.41% to 92.3%: the maximum accuracy was obtained when the source data were normalized into binary values, and the minimum when the built-in AutoML technique was used.
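The weighted ensemble mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the ensemble weights are derived or which models end up in the ensemble, so the function name `weighted_ensemble_proba`, the weight values, and the 0.5 decision threshold below are all assumptions made for illustration only.

```python
from typing import Dict, List


def weighted_ensemble_proba(
    base_probas: Dict[str, List[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Blend per-model positive-class probabilities using normalized weights.

    Hypothetical helper: models missing from `weights` contribute nothing.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_samples = len(next(iter(base_probas.values())))
    combined = [0.0] * n_samples
    for name, probas in base_probas.items():
        share = weights.get(name, 0.0) / total  # normalize so shares sum to 1
        for i, p in enumerate(probas):
            combined[i] += share * p
    return combined


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


# Illustrative values only (assumed, not taken from the paper).
probas = {"CatBoost": [0.9, 0.2], "XGBoost": [0.6, 0.4]}
weights = {"CatBoost": 0.75, "XGBoost": 0.25}
blended = weighted_ensemble_proba(probas, weights)
labels = [1 if p >= 0.5 else 0 for p in blended]  # assumed 0.5 threshold
```

In this sketch a model with a larger weight pulls the blended probability toward its own prediction, which is one way the choice of base models can change the behaviour of the final ensemble.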
Machine Learning
What problem does this paper attempt to address?
The paper attempts to address the effectiveness of Automated Machine Learning (AutoML) in detecting cardiovascular diseases. Specifically, the researchers propose a framework to evaluate AutoML methods and explore their performance in detecting cardiovascular diseases through three different data preprocessing scenarios. The specific issues the paper aims to address are as follows:

1. **Can AutoML models improve the accuracy of detecting cardiovascular diseases?**
   - The researchers aim to improve the accuracy of detecting cardiovascular diseases by optimizing base machine learning models (ML models) and incorporating them into AutoML models.
2. **Can AutoML's built-in data preprocessing algorithms compare to those manually created by data scientists?**
   - The paper explores the impact of different data preprocessing methods on the structure and performance of AutoML models, particularly the different implementations of data normalization techniques.

Through experiments in these three scenarios, the paper aims to validate the effectiveness of AutoML models in practical applications, especially in the healthcare field, to see if they can simplify and accelerate the development process of machine learning models and improve the accuracy of predicting cardiovascular diseases.
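To make the contrast between normalization techniques concrete, the sketch below compares two manual options on a single numeric indicator: min-max scaling and normalization into binary values. The summary does not state the rule the authors used to binarize the data, so thresholding at the column median is an assumption, and the helper names `min_max_scale` and `binarize_column` as well as the sample values are hypothetical.

```python
from statistics import median
from typing import List, Optional


def min_max_scale(values: List[float]) -> List[float]:
    """Rescale a column linearly to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def binarize_column(
    values: List[float], threshold: Optional[float] = None
) -> List[int]:
    """Map a column to 0/1 values.

    Assumption: values strictly above the threshold become 1, and the
    threshold defaults to the column median when none is given.
    """
    cut = median(values) if threshold is None else threshold
    return [1 if v > cut else 0 for v in values]


# Illustrative readings only (assumed, not taken from the data sets).
readings = [120.0, 140.0, 160.0]
scaled = min_max_scale(readings)    # continuous values in [0, 1]
binary = binarize_column(readings)  # 0/1 flags relative to the median
```

Binarization discards the magnitude of each indicator and keeps only which side of the threshold it falls on, so the two scenarios can hand quite different feature distributions to the same AutoML pipeline.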