Clustering Based BMI Indexing for Child Disease Prone-Probability Prediction

Meena Moharana, Manjusha Pandey, Siddharth Swarup Rautaray
DOI: https://doi.org/10.1007/s42979-023-01823-z
2023-05-25
SN Computer Science
Abstract: Early-age obesity (EAO) has a significant impact on public health worldwide. Obesity in children can be identified with body mass index (BMI) computed from a child's age-specific vitals, since an increase in BMI caused by excess body fat is associated with early-age obesity. This work analyses the impact of parental factors together with child obesity using data analytic techniques (decision tree, random forest, OLS regression, and the k-means algorithm), and suggests how fatal obesity can be along with what is required to overcome it. The ongoing research explores the possibility of generating predictive models from existing/logged data and using them for imputation. Decision tree, random forest, and k-means models with higher accuracy were built, followed by two major hypothesis analyses, a z-test and OLS regression. The current model is used to find PtD (prone to disease) clusters in the dataset. The PtD clusters are defined as those obtained with k = 3, and the EAO clusters as those obtained with k = 5. The proposed model yields accuracy levels of (a1 = 0.987%, a2 = 0.989%, m1 = 83%) for DT + k-means and (a1 = 0.976%, a2 = 0.988%, m1 = 79%) for RF + k-means. The model rejects the null hypothesis (mean = 20.31, std = 7.91, p = 0.05). Clustering of both CBMI and CorrBMI produced the clusters (E, D, C, B, A), which give the EAO clusters, and the clusters (C, B, A), which give the PtD clusters with k = 3. Out of 1102 instances in the sample data, 562 children were deduced to be PtD and 857 children to be suffering from EAO. The performance evaluation of the proposed models reports an accuracy of 0.993 for the random forest algorithm and 0.997 for the decision tree.
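The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard BMI formula (weight in kg divided by height in metres squared) rather than the paper's age-specific CBMI/CorrBMI indices, uses a plain one-dimensional Lloyd-style k-means written with the standard library only, and runs on made-up (weight, height) pairs. The names `bmi`, `kmeans_1d`, and `children` are hypothetical, and the numeric cluster labels are not the paper's A to E lettering.

```python
import random


def bmi(weight_kg, height_m):
    """Standard BMI: weight (kg) divided by height (m) squared."""
    return weight_kg / (height_m ** 2)


def kmeans_1d(values, k, iters=100, seed=0):
    """Lloyd-style k-means on scalar values.

    Returns (labels, centers) where centers are sorted ascending and
    label 0 refers to the lowest-BMI cluster.
    """
    rng = random.Random(seed)
    # Initialise centers from k distinct observed values.
    centers = rng.sample(sorted(set(values)), k)

    def assign(v):
        # Index of the nearest center for value v.
        return min(range(k), key=lambda j: abs(v - centers[j]))

    for _ in range(iters):
        labels = [assign(v) for v in values]
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            # Keep the old center if a cluster ends up empty.
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers

    # Relabel clusters by ascending center so labels are comparable across runs.
    order = sorted(range(k), key=lambda j: centers[j])
    rank = {j: r for r, j in enumerate(order)}
    labels = [rank[assign(v)] for v in values]
    return labels, sorted(centers)


# Hypothetical (weight_kg, height_m) records, for illustration only.
children = [
    (14.0, 1.00), (16.5, 1.05), (20.0, 1.10), (24.0, 1.15), (30.0, 1.20),
    (36.0, 1.25), (42.0, 1.30), (50.0, 1.35), (58.0, 1.40), (66.0, 1.40),
]
bmi_values = [bmi(w, h) for w, h in children]

# k = 5 mirrors the EAO setting in the abstract; k = 3 mirrors the PtD setting.
eao_labels, eao_centers = kmeans_1d(bmi_values, k=5)
ptd_labels, ptd_centers = kmeans_1d(bmi_values, k=3)
```

Sorting the cluster labels by centroid makes the output easier to map onto ordered categories, but how the paper assigns its letters to clusters is not stated in the abstract, so that mapping is left open here.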