Hybrid Explainable Artificial Intelligence Models for Targeted Metabolomics Analysis of Diabetic Retinopathy

Fatma Hilal Yagin, Cemil Colak, Abdulmohsen Algarni, Yasin Gormez, Emek Guldogan, Luca Paolo Ardigò
DOI: https://doi.org/10.3390/diagnostics14131364
IF: 3.6
2024-06-27
Diagnostics
Abstract: Background: Diabetic retinopathy (DR) is a prevalent microvascular complication of diabetes mellitus, and early detection is crucial for effective management. Metabolomics profiling has emerged as a promising approach for identifying potential biomarkers associated with DR progression. This study aimed to develop a hybrid explainable artificial intelligence (XAI) model for targeted metabolomics analysis of patients with DR, utilizing a focused approach to identify specific metabolites exhibiting varying concentrations among individuals without DR (NDR), those with non-proliferative DR (NPDR), and individuals with proliferative DR (PDR) who have type 2 diabetes mellitus (T2DM). Methods: A total of 317 T2DM patients, including 143 NDR, 123 NPDR, and 51 PDR cases, were included in the study. Serum samples underwent targeted metabolomics analysis using liquid chromatography and mass spectrometry. Several machine learning models, including Support Vector Machines (SVC), Random Forest (RF), Decision Tree (DT), Logistic Regression (LR), and Multilayer Perceptrons (MLP), were implemented as solo models and in a two-stage ensemble hybrid approach. The models were trained and validated using 10-fold cross-validation. SHapley Additive exPlanations (SHAP) were employed to interpret the contributions of each feature to the model predictions. Statistical analyses were conducted using the Shapiro–Wilk test for normality, the Kruskal–Wallis H test for group differences, and the Mann–Whitney U test with Bonferroni correction for post-hoc comparisons. Results: The hybrid SVC + MLP model achieved the highest performance, with an accuracy of 89.58%, a precision of 87.18%, an F1-score of 88.20%, and an F-beta score of 87.55%. SHAP analysis revealed that glucose, glycine, and age were consistently important features across all DR classes, while creatinine and various phosphatidylcholines exhibited higher importance in the PDR class, suggesting their potential as biomarkers for severe DR. Conclusion: The hybrid XAI models, particularly the SVC + MLP ensemble, demonstrated superior performance in predicting DR progression compared to solo models. The application of SHAP facilitates the interpretation of feature importance, providing valuable insights into the metabolic and physiological markers associated with different stages of DR. These findings highlight the potential of hybrid XAI models combined with explainable techniques for early detection, targeted interventions, and personalized treatment strategies in DR management.
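The statistical workflow named in the abstract (a normality check, an omnibus test across the three groups, then corrected pairwise comparisons) could look roughly like the sketch below. It assumes SciPy and NumPy are available and uses synthetic, skewed values for a single hypothetical metabolite; the group sizes mirror the abstract, but none of the numbers are study data.

```python
from itertools import combinations

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic, skewed "concentrations" for one hypothetical metabolite (not study data).
groups = {
    "NDR": rng.lognormal(mean=1.0, sigma=0.4, size=143),
    "NPDR": rng.lognormal(mean=1.1, sigma=0.4, size=123),
    "PDR": rng.lognormal(mean=1.3, sigma=0.4, size=51),
}

# Step 1: Shapiro-Wilk normality test within each group.
normality_p = {name: stats.shapiro(values).pvalue for name, values in groups.items()}

# Step 2: Kruskal-Wallis H test for any difference among the three groups.
omnibus_p = stats.kruskal(*groups.values()).pvalue

# Step 3: pairwise Mann-Whitney U tests, Bonferroni-corrected by multiplying
# each raw p-value by the number of comparisons and capping at 1.
pairs = list(combinations(groups, 2))
posthoc_p = {}
for first, second in pairs:
    raw_p = stats.mannwhitneyu(
        groups[first], groups[second], alternative="two-sided"
    ).pvalue
    posthoc_p[(first, second)] = min(1.0, raw_p * len(pairs))

print("Shapiro-Wilk p-values:", normality_p)
print("Kruskal-Wallis p-value:", omnibus_p)
print("Bonferroni-adjusted pairwise p-values:", posthoc_p)
```

In a real analysis this loop would presumably run once per metabolite, which is an assumption here rather than something the abstract states.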
medicine, general & internal
What problem does this paper attempt to address?
The main problem this paper attempts to address is the early detection and progression prediction of diabetic retinopathy (DR). Specifically, the study aims to develop a hybrid explainable artificial intelligence (XAI) model for targeted metabolomics analysis in patients with type 2 diabetes mellitus (T2DM) to identify changes in the concentration of specific metabolites at different stages of DR (no DR, non-proliferative DR, and proliferative DR). Through this approach, the study hopes to discover potential biomarkers associated with DR progression, thereby supporting early diagnosis, personalized treatment strategies, and effective disease management.

### Research Background

- **Diabetic Retinopathy (DR)**: DR is a common microvascular complication of diabetes, and early detection is crucial for effective management.
- **Metabolomics Analysis**: Metabolomics analysis has emerged as a promising method for identifying potential biomarkers associated with DR progression.
- **Research Objective**: To develop a hybrid XAI model for targeted metabolomics analysis to identify changes in the concentration of specific metabolites at different stages of DR.

### Methods

- **Sample Selection**: The study included 317 T2DM patients, divided into three groups: no DR (NDR), non-proliferative DR (NPDR), and proliferative DR (PDR).
- **Metabolomics Analysis**: Targeted metabolomics analysis of serum samples was performed using liquid chromatography and mass spectrometry techniques.
- **Machine Learning Models**: Various machine learning models (SVM, random forest, decision tree, logistic regression, and multilayer perceptron) were implemented, and a two-stage ensemble approach was adopted.
- **Model Validation**: Training and validation were conducted using 10-fold cross-validation, and SHAP values were used to interpret the contribution of features to model predictions.
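The summary does not say how the two stages of the ensemble are wired together. One plausible reading, sketched below with scikit-learn, is a stacked model in which an SVM produces per-class decision scores and an MLP makes the final prediction from them, evaluated with stratified 10-fold cross-validation. The stacking design, the hyperparameters, and the synthetic data are all assumptions for illustration, not the authors' configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the metabolomics table (not study data):
# 317 samples and 3 classes in roughly the NDR/NPDR/PDR proportions.
X, y = make_classification(
    n_samples=317,
    n_features=20,
    n_informative=8,
    n_classes=3,
    weights=[0.45, 0.39, 0.16],
    random_state=0,
)

# Stage 1 (assumed): an SVM on standardized features yields per-class decision scores.
stage_one = make_pipeline(StandardScaler(), SVC())

# Stage 2 (assumed): an MLP is trained on out-of-fold stage-1 scores.
hybrid = StackingClassifier(
    estimators=[("svc", stage_one)],
    final_estimator=MLPClassifier(
        hidden_layer_sizes=(16,), max_iter=2000, random_state=0
    ),
    stack_method="decision_function",
    cv=5,
)

# Stratified 10-fold cross-validation of the whole two-stage model.
outer_cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(hybrid, X, y, cv=outer_cv, scoring="accuracy")
print(f"mean accuracy over 10 folds: {scores.mean():.3f}")
```

Wrapping both stages in one estimator keeps the inner stacking folds inside each outer training split, so the held-out fold never influences either stage.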
### Results

- **Model Performance**: The hybrid SVM + MLP model performed the best, with an accuracy of 89.58%, precision of 87.18%, F1 score of 88.20%, and F-beta score of 87.55%.
- **Feature Importance**: SHAP analysis showed that glucose, glycine, and age were significant across all DR categories, while creatinine and various phosphatidylcholines were more important in the PDR category, suggesting these metabolites as potential biomarkers for severe DR.

### Conclusion

- **Hybrid XAI Model**: The SVM + MLP ensemble model, in particular, excelled in predicting DR progression, outperforming individual models.
- **Explainability Techniques**: The application of SHAP helped interpret feature importance, providing valuable insights into metabolic and physiological markers at different DR stages.
- **Clinical Application**: These findings highlight the potential of hybrid XAI models combined with explainability techniques in early DR detection, targeted interventions, and personalized treatment strategies.

Through this study, the authors hope to provide new tools and techniques for the early diagnosis and management of DR, thereby improving patient treatment outcomes and quality of life.
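SHAP attributions are based on Shapley values, which split the gap between a prediction and a baseline prediction across the input features. The sketch below computes exact Shapley values by brute force for a tiny hypothetical scoring function. The function `toy_score`, its weights, and the sample values are invented for illustration and have no connection to the study's fitted models; the `shap` library itself relies on faster approximations rather than this enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample by enumerating all feature subsets.

    Features outside a coalition are set to their baseline value, which is
    one common (assumed) way to represent a "missing" feature.
    """
    n = len(x)
    values = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition.
                phi += weight * (predict(with_i) - predict(without_i))
        values.append(phi)
    return values


def toy_score(features):
    """Hypothetical risk score with invented weights and one interaction term."""
    glucose, glycine, age = features
    return 0.8 * glucose - 0.5 * glycine + 0.3 * age + 0.1 * glucose * age


names = ["glucose", "glycine", "age"]
sample = [1.5, -0.2, 0.7]    # standardized values for one made-up patient
baseline = [0.0, 0.0, 0.0]   # reference point, e.g. the cohort mean after scaling

phi = shapley_values(toy_score, sample, baseline)
for name, value in zip(names, phi):
    print(f"{name}: {value:+.4f}")
```

The attributions always sum to the difference between the score at the sample and the score at the baseline, which is the property that lets per-feature SHAP values be read as additive contributions.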