Predictive modeling of co-infection in lupus nephritis using multiple machine learning algorithms

Jiaqian Zhang, Bo Chen, Jiu Liu, Pengfei Chai, Hongjiang Liu, Yuehong Chen, Huan Liu, Geng Yin, Shengxiao Zhang, Caihong Wang, Qibing Xie
DOI: https://doi.org/10.1038/s41598-024-59717-w
IF: 4.6
2024-04-23
Scientific Reports
Abstract: This study aimed to analyze peripheral blood lymphocyte subsets in lupus nephritis (LN) patients and use machine learning (ML) methods to establish an effective algorithm for predicting co-infection in LN. This study included 111 non-infected LN patients, 72 infected LN patients, and 206 healthy controls (HCs). Patient information, infection characteristics, medication, and laboratory indexes were recorded. Eight ML methods were compared to establish a model through a training group and verify the results in a test group. We trained the ML models, including Logistic Regression, Decision Tree, K-Nearest Neighbors, Support Vector Machine, Multi-Layer Perceptron, Random Forest, AdaBoost, and Extreme Gradient Boosting (XGB), and further evaluated potential predictors of infection. Infected LN patients had significantly decreased levels of T, B, helper T, suppressor T, and natural killer cells compared to non-infected LN patients and HCs. The number of regulatory T cells (Tregs) in LN patients was significantly lower than in HCs, with infected patients having the lowest Tregs count. Among the ML algorithms, XGB demonstrated the highest accuracy and precision for predicting LN infections. The innate and adaptive immune systems are disrupted in LN patients, and monitoring lymphocyte subsets can help prevent and treat infections. The XGB algorithm was recommended for predicting co-infection in LN.
multidisciplinary sciences
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper addresses the prediction of co-infection in patients with lupus nephritis (LN). Specifically, it analyzes peripheral blood lymphocyte subsets in LN patients and compares multiple machine learning (ML) algorithms to build a model that predicts whether an LN patient has a co-infection.

#### Research background

- **Overview of LN**: LN is a serious complication of systemic lupus erythematosus (SLE) that manifests as hematuria, proteinuria, edema, hypertension, and renal insufficiency.
- **Co-infection problem**: Approximately 40% to 60% of SLE patients develop clinically evident LN, and LN is closely linked to overall morbidity and mortality. Although treatment has improved, LN patients remain prone to co-infection, which threatens their prognosis.
- **Existing challenges**: No prediction model currently targets co-infection in LN patients specifically, and the performance of existing prediction models does not meet clinical needs.

#### Research objectives

1. **Data analysis**: Analyze changes in peripheral blood lymphocyte subsets in LN patients, including T cells, B cells, helper T cells, suppressor T cells, and natural killer (NK) cells.
2. **Model construction**: Build prediction models with eight machine learning algorithms: Logistic Regression, Decision Tree, K-Nearest Neighbors, Support Vector Machine, Multi-Layer Perceptron, Random Forest, AdaBoost, and Extreme Gradient Boosting (XGB).
3. **Model evaluation**: Fit each algorithm on a training group, verify its predictive performance on a test group, and select the best-performing model.
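
The training-group and test-group step can be sketched as a stratified split, which keeps the infected to non-infected ratio roughly equal in both groups. This is a minimal illustration, not the authors' procedure: the cohort sizes (111 non-infected, 72 infected) come from the abstract, but the 30% test fraction, the seed, and the helper name `stratified_split` are assumptions, since the summary does not state how the groups were formed.

```python
import random


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx) with the class ratio preserved in both parts.

    Hypothetical helper: test_fraction and seed are illustrative assumptions,
    not values reported in the paper.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Take the same fraction of every class for the test group.
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Cohort sizes from the abstract: 0 = non-infected LN, 1 = infected LN.
labels = [0] * 111 + [1] * 72
train_idx, test_idx = stratified_split(labels)
n_test_infected = sum(labels[i] for i in test_idx)
print(len(train_idx), len(test_idx), n_test_infected)
```

Each candidate algorithm would then be fitted on the samples in `train_idx` and scored only on those in `test_idx`, so that the comparison reflects performance on patients the model has not seen.
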
#### Main findings

- **Changes in lymphocyte subsets**: Compared with non-infected LN patients and healthy controls, infected LN patients had significantly fewer T cells, B cells, helper T cells, suppressor T cells, and NK cells. Regulatory T cell (Treg) counts were significantly lower in LN patients than in healthy controls, and lowest in infected patients.
- **Best algorithm**: Among the eight ML algorithms, XGB showed the highest accuracy and precision for predicting co-infection in LN patients.
- **Potential applications**: Monitoring lymphocyte subsets in LN patients can help prevent and treat infections, and the authors recommend the XGB algorithm for predicting co-infection in LN.

#### Conclusion

Using multiple machine learning algorithms, and XGB in particular, the study reports that co-infection in LN patients can be predicted effectively. This gives clinicians a means of early identification and intervention, which could improve the prognosis of LN patients.
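
The ranking criterion named in the findings, accuracy and precision, can be made concrete with a short sketch. Accuracy is the share of all predictions that are correct, and precision is the share of predicted infections that are truly infections. The labels, the two model names, and the tie-breaking order below are invented purely for illustration; they are not the paper's data or results.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = infected)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy(y_true, y_pred):
    """(TP + TN) / all predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def precision(y_true, y_pred):
    """TP / (TP + FP); defined as 0.0 when nothing is predicted positive."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Toy test-group labels and predictions (hypothetical, not from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
predictions = {
    "model_a": [1, 1, 0, 0, 0, 1, 0, 1],
    "model_b": [1, 1, 1, 0, 0, 0, 0, 0],
}

scores = {
    name: (accuracy(y_true, y_pred), precision(y_true, y_pred))
    for name, y_pred in predictions.items()
}
# Rank by accuracy first, then precision (an assumed tie-breaking order).
best = max(scores, key=scores.get)
print(scores, best)
```

Precision matters here alongside accuracy because the two classes are imbalanced: a model that rarely predicts infection can still score a high accuracy while being of little clinical use.
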