Machine learning model matters its accuracy: a comparative study of ensemble learning and AutoML using heart disease prediction
Yagyanath Rimal, Siddhartha Paudel, Navneet Sharma, Abeer Alsadoon
DOI: https://doi.org/10.1007/s11042-023-16380-z
IF: 2.577
2023-09-28
Multimedia Tools and Applications
Abstract: Ensemble machine learning combines multiple weak individual models to achieve better performance than any one of them alone. Recent research focuses on improving machine learning models for accurate classification and prediction on test data, which highlights the critical issue of overall model quality. In this study, ensembles of weak learners were combined into strong models, their precision, accuracy, and F1 scores were compared separately, and majority-voting aggregation was used to recommend the best model for deployment. The accuracy and performance of the decision tree, logistic regression, support vector machine, random forest, artificial neural network, Gaussian, k-nearest neighbor, and multilayer perceptron models were compared to find the best predictor. Similarly, Automated Machine Learning (AutoML) supports both binary classification and regression problems and can be applied directly, without manual feature engineering. AutoML builds a list of more robust candidate models in tabular form and then determines which one predicts most accurately. This research compares eighteen (18) machine learning models, i.e., eight (8) that were individually trained and ten (10) from AutoML, on their accuracy, MSE, and R² scores using the same open-source heart disease data set. Among the individually trained models, the support vector machine, logistic regression, and neural network models produced the highest accuracy (80%), compared with 76% for the Gaussian, k-nearest neighbor, and multilayer perceptron algorithms. With AutoML, the generalized linear model (88%), gradient boosting model (87%), distributed random forest model (87%), and extra tree model (82%) were recommended for heart disease classification on the strength of their accuracy scores, which ultimately mattered most for prediction quality.
computer science, information systems, theory & methods,engineering, electrical & electronic, software engineering
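
The majority-voting aggregation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `majority_vote` and `accuracy`, and the toy labels below, are hypothetical and chosen only to show how hard voting over several weak classifiers can outperform each individual model.

```python
from collections import Counter


def majority_vote(predictions):
    """Hard-voting aggregation: for each sample, return the label
    predicted by the largest number of models.

    predictions: list of per-model label lists, all of equal length.
    Ties are broken in favour of the label seen first; with an odd
    number of binary classifiers, ties cannot occur.
    """
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data (1 = disease, 0 = no disease); not from the paper.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
model_a = [1, 0, 1, 0, 0, 0, 1, 1]  # wrong on samples 3 and 7
model_b = [1, 1, 1, 1, 0, 0, 0, 0]  # wrong on samples 1 and 6
model_c = [0, 0, 1, 1, 0, 1, 1, 0]  # wrong on samples 0 and 5

ensemble = majority_vote([model_a, model_b, model_c])

for name, pred in [("model_a", model_a), ("model_b", model_b),
                   ("model_c", model_c), ("ensemble", ensemble)]:
    print(f"{name}: accuracy = {accuracy(y_true, pred):.2f}")
```

In this toy case each model errs on different samples, so the vote corrects every individual mistake. On real data the gain depends on how correlated the models' errors are.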