Interpretable machine learning framework to predict gout associated with dietary fiber and triglyceride-glucose index

Shunshun Cao, Yangyang Hu
DOI: https://doi.org/10.1186/s12986-024-00802-2
2024-05-15
Nutrition & Metabolism
Abstract: Gout prediction is essential for the development of individualized prevention and treatment plans. Our objective was to develop an efficient and interpretable machine learning (ML) model using the SHapley Additive exPlanation (SHAP) to link dietary fiber and triglyceride-glucose (TyG) index to predict gout.
nutrition & dietetics
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper aims to develop an efficient and interpretable machine learning (ML) model that predicts the occurrence of gout, using the SHapley Additive exPlanation (SHAP) method to explore how dietary fiber and the triglyceride-glucose (TyG) index relate to gout.

#### Main Objectives

- **Develop an interpretable ML model**: Use the SHAP method to explain the Light Gradient Boosting Machine (LGBM) model, revealing how it arrives at its gout predictions.
- **Improve prediction accuracy**: Select the best-performing model, LGBM, to predict gout related to dietary fiber and the TyG index.
- **Individualized interventions**: Based on the model's predictions, formulate individualized prevention and treatment plans for people at potential risk of gout.

#### Method Overview

- **Data Source**: Use the dataset from the National Health and Nutrition Examination Survey (NHANES) from 2005 to 2018.
- **Model Selection**: Evaluate six machine learning models (including SVM, RF, GBDT, XGBoost, and CatBoost), and ultimately select LGBM as the optimal algorithm.
- **Feature Importance Analysis**: Analyze feature importance through SHAP values, finding that age and uric acid (UA) have the greatest influence on the model's output.
- **Model Interpretation**: Use SHAP decision plots to show visually how the LGBM classification model reaches its decision for individual participants.

#### Results and Conclusions

- **Model Performance**: The LGBM model has high accuracy and robustness, with an AUC of 0.823 (95% confidence interval: 0.798–0.848) and an accuracy of 95.3%.
- **Feature Impact**: Lower dietary fiber intake and a higher TyG index both have a significant positive impact on the predicted probability of gout.
- **Practical Application**: Increasing dietary fiber intake and reducing the TyG index can help reduce the risk of gout.
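The summary does not spell out how the TyG index is computed. As a minimal sketch, assuming the commonly used definition TyG = ln(fasting triglycerides [mg/dL] × fasting glucose [mg/dL] / 2), it could be calculated as follows. The function name `tyg_index` and the example values are illustrative only and are not taken from the paper.

```python
import math


def tyg_index(triglycerides_mg_dl: float, glucose_mg_dl: float) -> float:
    """Triglyceride-glucose index under the commonly used definition.

    Assumption: TyG = ln(TG [mg/dL] * fasting glucose [mg/dL] / 2).
    The summary does not confirm which variant the authors applied.
    """
    if triglycerides_mg_dl <= 0 or glucose_mg_dl <= 0:
        raise ValueError("triglycerides and glucose must be positive")
    return math.log(triglycerides_mg_dl * glucose_mg_dl / 2.0)


# Hypothetical participant: TG = 150 mg/dL, fasting glucose = 100 mg/dL.
example_tyg = tyg_index(150.0, 100.0)
print(round(example_tyg, 2))  # 8.92
```

Because the index is a logarithm of a product, raising either triglycerides or fasting glucose raises the TyG index, which is the direction the summary associates with higher predicted gout risk.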
Through this study, the authors hope to provide clinicians with a more transparent and reliable model to better understand and predict the occurrence of gout.
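To make the idea behind SHAP concrete, the sketch below computes exact Shapley values for a toy three-feature risk score by enumerating every feature coalition. It is not the authors' pipeline: the paper explains an LGBM model, whereas `toy_risk_score`, its coefficients, the feature values, and the baseline are all hypothetical and chosen only for illustration. Features missing from a coalition are replaced by a baseline value, which is one simple convention among several.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values by enumerating all coalitions of features.

    Features outside a coalition take their baseline value (an assumed
    convention; SHAP implementations handle this in more refined ways).
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_risk_score(features):
    """Hypothetical linear score; coefficients are invented, not fitted."""
    age, uric_acid, fiber = features
    return 0.02 * age + 0.3 * uric_acid - 0.01 * fiber


x = [60.0, 7.0, 10.0]          # hypothetical individual: age, UA, fiber
baseline = [45.0, 5.5, 17.0]   # hypothetical reference values
phi = shapley_values(toy_risk_score, x, baseline)

# The attributions sum to the gap between this prediction and the baseline.
gap = toy_risk_score(x) - toy_risk_score(baseline)
print([round(v, 3) for v in phi], round(gap, 3))
```

In this toy setup the lower-than-baseline fiber intake receives a positive attribution, mirroring the direction reported in the summary. Exhaustive enumeration scales exponentially with the number of features, so it is practical only for small illustrations like this one.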