Machine Learning-based Diabetes Prediction: A Cross-Country Perspective

A. Nesa, Md. Saiful Islam, Sadia Afrin Shampa
DOI: https://doi.org/10.1109/NCIM59001.2023.10212596
2023-06-16
Abstract: Diabetes is a chronic disease characterized by high blood sugar levels, with long-lasting effects on human health. Accurately predicting diabetes occurrence is challenging because labeled data are limited and diabetes datasets often contain outliers or missing values. In diabetes research, machine learning (ML) algorithms are extensively employed to analyze datasets and predict the onset of the disease. In this study, diabetes data from Bangladesh, India, and Germany were examined using various ML models. The experimental results demonstrate that, on the Bangladesh dataset, boosting ML algorithms such as AdaBoost, CatBoost, Gradient Boost, and XGBoost perform better and effectively predict the occurrence of diabetes. Additionally, basic models like Random Forests and Decision Trees achieved satisfactory results on the performance metrics used. Early detection of diabetes plays a crucial role in mitigating associated risk factors and severity. ML algorithms have emerged as valuable tools in diabetes prediction, leveraging the available data to make accurate predictions. The study's findings underscore the potential of boosting ML algorithms, such as AdaBoost, CatBoost, Gradient Boost, and XGBoost, in predicting diabetes based on the Bangladesh dataset. Furthermore, the study acknowledges the acceptable performance of basic models like Random Forests and Decision Trees on diabetes data. In conclusion, this study contributes to the understanding of diabetes prediction by analyzing datasets from multiple countries. The results highlight the effectiveness of ML algorithms, particularly boosting algorithms, in accurately predicting diabetes occurrence. This knowledge can aid researchers, healthcare professionals, and policymakers in implementing strategies for early detection and management of diabetes, ultimately improving patient outcomes and overall public health.
Computer Science, Medicine