Development of an interpretable machine learning model associated with heavy metals’ exposure to identify coronary heart disease among US adults via SHAP: Findings of the US NHANES from 2003 to 2018

Xi Li, Yang Zhao, Dongdong Zhang, Lei Kuang, Hao Huang, Weiling Chen, Xueru Fu, Yuying Wu, Tianze Li, Jinli Zhang, Lijun Yuan, Huifang Hu, Yu Liu, Ming Zhang, Fulan Hu, Xizhuo Sun, Dongsheng Hu
DOI: https://doi.org/10.1016/j.chemosphere.2022.137039
IF: 8.8
2023-01-01
Chemosphere
Abstract: Limited information is available on the links between heavy metals' exposure and coronary heart disease (CHD). We aim to establish an efficient and explainable machine learning (ML) model that associates heavy metals' exposure with CHD identification. Our datasets for investigating the associations between heavy metals and CHD were sourced from the US National Health and Nutrition Examination Survey (US NHANES, 2003–2018). Five ML models were established to identify CHD by heavy metals' exposure. Further, 11 discrimination characteristics were used to test the strength of the models. The optimally performing model was selected for identification. Finally, the SHapley Additive exPlanations (SHAP) tool was used for interpreting the features to visualize the selected model's decision-making capacity. In total, 12,554 participants were eligible for this study. The best performing random forest classifier (RF) based on 13 heavy metals to identify CHD was chosen (AUC: 0.827; 95%CI: 0.777–0.877; accuracy: 95.9%). SHAP values indicated that cesium (1.62), thallium (1.17), antimony (1.63), dimethylarsonic acid (0.91), barium (0.76), arsenous acid (0.79), total arsenic (0.01) in urine, and lead (3.58) and cadmium (4.66) in blood positively contributed to the model, while cobalt (−0.15), cadmium (−2.93), and uranium (−0.13) in urine negatively contributed to the model. The RF model was efficient, accurate, and robust in identifying an association between heavy metals' exposure and CHD among US NHANES 2003–2018 participants. Cesium, thallium, antimony, dimethylarsonic acid, barium, arsenous acid, and total arsenic in urine, and lead and cadmium in blood show positive relationships with CHD, while cobalt, cadmium, and uranium in urine show negative relationships with CHD.
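To make the positive and negative SHAP contributions reported in the abstract more concrete, the sketch below computes exact Shapley values by brute force for a small toy scoring function. It is an illustration only, not the authors' pipeline: the function `toy_risk_score`, its coefficients, the three feature names, and the sample and baseline values are all hypothetical and are not taken from the paper or from NHANES. SHAP tooling for a random forest would normally use a tree-specific algorithm rather than enumerating every feature subset as done here.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Enumerates every feature subset, so it is only practical for a
    handful of features.
    """
    n = len(x)
    features = range(n)

    def value(subset):
        # Features in `subset` take the sample's values; the rest stay at baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in features]
        return predict(mixed)

    phi = []
    for i in features:
        others = [j for j in features if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when added to coalition s.
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Hypothetical toy score with invented coefficients (not from the paper).
def toy_risk_score(v):
    lead, cadmium, cobalt = v
    return 0.5 * lead + 0.3 * cadmium - 0.2 * cobalt + 0.1 * lead * cadmium


sample = [2.0, 1.0, 3.0]    # hypothetical participant
baseline = [1.0, 1.0, 1.0]  # hypothetical reference values

phi = shapley_values(toy_risk_score, sample, baseline)
for name, contribution in zip(["lead", "cadmium", "cobalt"], phi):
    print(f"{name}: {contribution:+.3f}")
```

In this toy setting, a feature whose value raises the score relative to the baseline receives a positive contribution and one that lowers it receives a negative contribution. The contributions sum to the difference between the score for the sample and the score for the baseline, which is the additivity property that SHAP relies on.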
environmental sciences