Abstract: Obesity is strongly associated with multiple risk factors and contributes significantly to an increased risk of chronic disease morbidity and mortality worldwide. Understanding the association between risk factors and the occurrence of obesity remains challenging. The traditional regression approach limits analysis to a small number of predictors and imposes assumptions of independence and linearity. Machine Learning (ML) methods offer an alternative approach to the analysis of obesity data. This study assesses the ability of three ML methods, namely Logistic Regression, Classification and Regression Trees (CART), and Naïve Bayes, to identify the presence of obesity from publicly available health data, as an attempt to go beyond traditional prediction models, and compares the performance of the three methods. The main objective of the study is to establish a set of risk factors for obesity in adults among the available study variables. We also address data imbalance using the Synthetic Minority Oversampling Technique (SMOTE) when predicting obesity status from the risk factors available in the dataset. Logistic Regression shows the highest performance of the three methods. Nevertheless, kappa coefficients show only moderate concordance between predicted and measured obesity. The following variables are significant in predicting obesity status in adults: location, marital status, age group, education, sweet drinks, fatty/oily foods, grilled foods, preserved foods, seasoning powders, soft/carbonated drinks, alcoholic drinks, mental emotional disorders, diagnosed hypertension, physical activity, smoking, and fruit and vegetable consumption.
Identifying these risk factors could inform health authorities in designing new policies or modifying existing ones to better control chronic diseases, especially in relation to risk factors associated with obesity. Moreover, applying ML methods to publicly available health data, such as the Indonesian Basic Health Research (RISKESDAS), is a promising strategy for building a more robust understanding of how multiple risk factors are associated in predicting health outcomes.
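
The kappa coefficient mentioned in the abstract measures agreement between predicted and measured labels after correcting for agreement expected by chance. The sketch below shows the standard Cohen's kappa calculation for readers unfamiliar with it. It is an illustration only, not the study's code: the function name `cohen_kappa` and the toy labels are hypothetical and are not taken from the RISKESDAS data.

```python
from collections import Counter


def cohen_kappa(measured, predicted):
    """Cohen's kappa between two equally long label sequences."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(measured)
    # Observed agreement: share of cases where both labels match.
    p_o = sum(m == p for m, p in zip(measured, predicted)) / n
    # Chance agreement, computed from the marginal label frequencies.
    m_counts = Counter(measured)
    p_counts = Counter(predicted)
    p_e = sum(m_counts[c] * p_counts[c] for c in m_counts) / (n * n)
    if p_e == 1.0:
        return 1.0  # both sequences consist of one identical label
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical toy labels (1 = obese, 0 = not obese); not study data.
measured = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
kappa = cohen_kappa(measured, predicted)
print(round(kappa, 3))
```

In this toy case the raw agreement is 0.8, but the chance-corrected kappa is about 0.58. On the commonly used Landis and Koch scale, values from 0.41 to 0.60 are labelled moderate, which illustrates how a classifier can look accurate while agreeing only moderately with measured status.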
