Trends in the prevalence of osteoporosis and effects of heavy metal exposure using interpretable machine learning

Hewei Xiao, Xueyan Liang, Huijuan Li, Xiaoyu Chen, Yan Li
DOI: https://doi.org/10.1016/j.ecoenv.2024.117238
2024-11-01
Abstract: Evidence that heavy metal exposure contributes to osteoporosis is limited. Multi-parameter scoring machine learning (ML) techniques were developed using National Health and Nutrition Examination Survey data to predict osteoporosis from heavy metal exposure levels. Twelve ML models were trained to find an optimal predictive model for osteoporosis, and identification was carried out using the best-performing model. For model interpretation, Shapley additive explanation (SHAP) methods and partial dependence plots (PDP) were incorporated into the ML pipeline. Linear trends in osteoporosis over time were evaluated with logistic regression by regressing osteoporosis on survey cycle. To train and validate the predictive models, 5745 eligible participants were randomly split into training and testing sets. The gradient boosting decision tree model performed best among the predictive models, with an accuracy of 89.40 % in the testing set; its area under the curve and F1 score were 0.88 and 0.39, respectively. SHAP analysis identified urinary Co, urinary Tu, blood Cd, and urinary Hg levels as the most influential factors for osteoporosis. Across the ranges of urinary Co (0.20-6.10 μg/mg creatinine), urinary Tu (0.06-1.93 μg/mg creatinine), blood Cd (0.07-0.50 μg/L), and urinary Hg (0.06-0.75 μg/mg creatinine), the risk of osteoporosis showed a distinct upward trend as values increased. Our analysis showed that urinary Co, urinary Tu, blood Cd, and urinary Hg contributed substantially to the model's predictions.
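The abstract reports a high accuracy (89.40 %) alongside a much lower F1 score (0.39). A minimal sketch of how the two metrics are computed from a confusion matrix may help in reading those numbers together. The function name and the counts below are hypothetical and invented purely for illustration; they are not taken from the paper, which does not report its confusion matrix here.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical imbalanced test set (NOT the paper's data):
    # 100 positive cases and 900 negative cases.
    metrics = classification_metrics(tp=30, fp=25, fn=70, tn=875)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

Because accuracy counts true negatives and F1 does not, a classifier evaluated on data where the positive class is rare can score well on accuracy while still missing many positive cases. Whether this explains the figures reported in the abstract depends on the class balance in the study's testing set, which the abstract does not state.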