Interpretable Machine Learning Algorithm Reveals Novel Gut Microbiome Features in Predicting Type 2 Diabetes

Wanglong Gou, Chu-Wen Ling, Yan He, Zengliang Jiang, Yuanqing Fu, Xu Fengzhe, Ze-Lei Miao, Sun Ting-yu, Lin Jie-Sheng, Huilian Zhu, Hong-Wei Zhou, Yu-Ming Chen, Ju-Sheng Zheng
DOI: https://doi.org/10.1093/cdn/nzaa062_016
2020-05-29
Current Developments in Nutrition
Abstract:
Objectives: The relationship between the gut microbiome and type 2 diabetes (T2D) in human cohorts has been controversial. We hypothesized that this limitation could be addressed by integrating a cutting-edge interpretable machine learning framework with large-scale human cohort studies.
Methods: Three independent cohorts with >9000 participants were included in this study. We proposed a new machine learning-based analytic framework that uses LightGBM to infer the relationship between the incorporated features and T2D, and SHapley Additive exPlanations (SHAP) to identify microbiome features associated with the risk of T2D. We then generated a microbiome risk score (MRS) that integrates the threshold and direction of the identified microbiome features to predict T2D risk.
Results: We identified 15 microbiome features associated with the risk of T2D (two are indicators of microbial diversity; the others are taxa-related features). The identified T2D-related gut microbiome features showed superior T2D prediction accuracy compared with host genetics or traditional risk factors. Furthermore, the MRS (per unit change in MRS) was consistently positively associated with T2D risk in the discovery cohort (RR 1.28, 95% CI 1.23-1.33), external validation cohort 1 (RR 1.23, 95% CI 1.13-1.34), and external validation cohort 2 (GGMP, RR 1.12, 95% CI 1.06-1.18). The MRS could also predict future glucose increment. We subsequently identified dietary and lifestyle factors that could prospectively modulate the microbiome features, and found that body fat distribution may be the key factor modulating the gut microbiome-T2D relationship.
Conclusions: Taken together, we propose a new analytical framework for investigating microbiome-disease relationships. The identified microbiome features may serve as potential drug targets for T2D in the future.
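The abstract says the MRS integrates the threshold and direction of each identified feature but does not give the scoring rule. One plausible reading, shown here only as a minimal sketch, is a simple count of features whose value falls on the risk side of its threshold. The `FeatureRule` class, the feature names, and every threshold and value below are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class FeatureRule:
    """Hypothetical scoring rule for one microbiome feature."""
    name: str
    threshold: float
    risk_when_above: bool  # True: values above the threshold count toward risk


def microbiome_risk_score(sample: Dict[str, float], rules: List[FeatureRule]) -> int:
    """Count how many features fall on the risk side of their threshold.

    Assumption: each feature contributes 0 or 1 point; the study's actual
    scoring scheme is not described in the abstract.
    """
    score = 0
    for rule in rules:
        value = sample[rule.name]
        if rule.risk_when_above:
            on_risk_side = value > rule.threshold
        else:
            on_risk_side = value < rule.threshold
        score += int(on_risk_side)
    return score


# Illustrative rules and values only; none of these come from the study.
rules = [
    FeatureRule("shannon_diversity", threshold=3.0, risk_when_above=False),
    FeatureRule("taxon_a_abundance", threshold=0.05, risk_when_above=True),
    FeatureRule("taxon_b_abundance", threshold=0.01, risk_when_above=False),
]
sample = {
    "shannon_diversity": 2.4,   # below 3.0, so on the risk side
    "taxon_a_abundance": 0.08,  # above 0.05, so on the risk side
    "taxon_b_abundance": 0.02,  # not below 0.01, so not on the risk side
}
print(microbiome_risk_score(sample, rules))  # prints 2
```

In the framework the abstract describes, the thresholds and directions would presumably be read off the SHAP values of the fitted LightGBM model rather than set by hand as they are here.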
Funding Sources: This study was funded by the National Natural Science Foundation of China (81903316, 81773416), Westlake University (101396021801), and the 5010 Program for Clinical Researches (2007032) of Sun Yat-sen University (Guangzhou, China).