Abstract: BACKGROUND: With the advent of artificial intelligence technology, machine learning algorithms have been widely used in disease prediction. OBJECTIVE: Cardiovascular disease (CVD) seriously jeopardizes human health worldwide. An effective CVD prediction model is therefore of great significance for controlling the risk of the disease and safeguarding the physical and mental health of the population. METHODS: Using the UCI heart disease dataset as an example, a single-classifier machine learning prediction model was first constructed. Subsequently, six methods, including Pearson, chi-squared, RFE and LightGBM, were used together for feature screening. On the basis of the base classifiers, Soft Voting fusion and Stacking fusion were carried out to build a prediction model for cardiovascular diseases, in order to enable early warning and disease intervention for high-risk populations. To address the data imbalance problem, the SMOTE method was applied to the dataset, and the prediction performance of the model was analyzed across multiple dimensions and indicators. RESULTS: Among the single-classifier models, the MLP algorithm performed best on the preprocessed heart disease dataset. After feature selection, five features were eliminated. The ENSEM_SV algorithm, which combines the base classifiers and determines the prediction by soft voting on their outputs, achieved the best value on five metrics, including Accuracy, Jaccard_Score, Hamm_Loss and AUC, with an AUC of 0.951. The RF, ET, GBDT, and LGB algorithms were used as the base classifiers of the first-stage sub-model, and the AB algorithm was selected as the second-stage model. The ensemble algorithm ENSEM_ST, obtained by Stacking fusion of the two stages, exhibited the best performance on seven indicators, including Accuracy, Sensitivity, F1_Score and Mathew_Corrcoef, with an AUC of 0.952.
Furthermore, the classification performance of the algorithms was compared across different training-set proportions. The results indicated that both fusion models predicted better than the single models, and that ENSEM_ST fusion performed better overall than ENSEM_SV fusion. CONCLUSIONS: The fusion model established in this study substantially improved overall classification accuracy and stability. It has good application value in the predictive analysis of CVD diagnosis and can provide a valuable reference for disease diagnosis and intervention strategies.
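
The soft-voting step that the abstract attributes to ENSEM_SV can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `soft_vote`, the equal default weights, and the probability values are all hypothetical, chosen only to show how averaging class probabilities can give a different answer from counting majority votes.

```python
def soft_vote(model_probas, weights=None):
    """Average per-class probabilities across classifiers, then pick the argmax.

    model_probas[m][i][c] is the probability that classifier m assigns
    to class c for sample i. Weights default to equal (an assumption).
    """
    n_models = len(model_probas)
    if n_models == 0:
        raise ValueError("at least one classifier is required")
    if weights is None:
        weights = [1.0] * n_models
    total = sum(weights)
    n_samples = len(model_probas[0])
    n_classes = len(model_probas[0][0])

    averaged = []
    for i in range(n_samples):
        # Weighted mean of the probability each classifier gives to class c.
        row = [
            sum(w * p[i][c] for w, p in zip(weights, model_probas)) / total
            for c in range(n_classes)
        ]
        averaged.append(row)

    # Predicted label = class with the highest averaged probability.
    labels = [max(range(n_classes), key=row.__getitem__) for row in averaged]
    return averaged, labels


# Hypothetical outputs of three base classifiers on two patients
# (columns: P(no disease), P(disease)).
clf_a = [[0.90, 0.10], [0.40, 0.60]]
clf_b = [[0.60, 0.40], [0.45, 0.55]]
clf_c = [[0.20, 0.80], [0.70, 0.30]]

averaged, labels = soft_vote([clf_a, clf_b, clf_c])
print(labels)
```

For the second patient, two of the three hypothetical classifiers lean towards class 1, so a majority (hard) vote would return 1. The averaged probabilities favour class 0, because the third classifier is much more confident, which is the behaviour that distinguishes soft voting from hard voting.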

Comparing different feature selection algorithms for cardiovascular disease prediction

Comparative Study on Heart Disease Prediction Using Feature Selection Techniques on Classification Algorithms

Analyzing the impact of feature selection methods on machine learning algorithms for heart disease prediction

Machine Learning-Based Model to Predict Heart Disease in Early Stage Employing Different Feature Selection Techniques

Comparative analysis of supervised learning algorithms for prediction of cardiovascular diseases

Enhanced feature selection and ensemble learning for cardiovascular disease prediction: hybrid GOL2-2 T and adaptive boosted decision fusion with babysitting refinement

Machine Learning-Based Comparative Study For Heart Disease Prediction

Analyzing the impact of feature selection on the accuracy of heart disease prediction

Machine Learning Models for Cardiovascular Disease Prediction: A Comparative Study

A comparative analysis of meta-heuristic optimization algorithms for feature selection on ML-based classification of heart-related diseases

A Review of Feature Selection and Classification Approaches for Heart Disease Prediction

Refining heart disease prediction accuracy using hybrid machine learning techniques with novel metaheuristic algorithms

Cardiac disease prediction using AI algorithms with SelectKBest

Score and Correlation Coefficient-Based Feature Selection for Predicting Heart Failure Diagnosis by Using Machine Learning Algorithms

Prediction of Heart Disease using Machine Learning Algorithms with Feature Selection Techniques

Two-level boosting classifiers ensemble based on feature selection for heart disease prediction

A proposed technique for predicting heart disease using machine learning algorithms and an explainable AI method

Enhancing Heart Disease Prediction Accuracy through Machine Learning Techniques and Optimization

Exploring Predictive Methods for Cardiovascular Disease: A Survey of Methods and Applications

Comparative Assessment of Machine Learning Algorithms for Heart Disease Prediction

Comparative Analysis of Machine Learning Algorithms for Heart Disease Prediction