[Establishment of a prognostic model for non-nephrotic membranous nephropathy based on unbalanced data]

Y Q Liu,Y Y Lu,W X Li,Z J Wu,F Zhang,Y R Wang,R S Li,X S Zhou
DOI: https://doi.org/10.3760/cma.j.cn112137-20221115-02399
2023-05-16
Abstract: Objective: To construct a machine learning model based on unbalanced data to predict the progression of non-nephrotic membranous nephropathy. Methods: The clinical and pathological data of patients diagnosed with non-nephrotic membranous nephropathy by renal biopsy at Shanxi People's Hospital from January 2018 to December 2021 were retrospectively analyzed. Prediction models were constructed using logistic regression, support vector machine (SVM) and light gradient boosting machine (lightGBM), respectively. A mixed sampling technique was used to handle the unbalanced data, and the area under the receiver operating characteristic curve (AUC) was used to evaluate the predictive performance of the models. Finally, Shapley additive explanation (SHAP) was used to interpret the results of the optimal prediction model. Results: A total of 148 patients were included in the study, 84 males and 64 females, with a mean age of (47.2±12.5) years. The follow-up time [M(Q1, Q3)] was 14 (7, 20) months. Twenty-three patients (15.5%) reached the renal end-point event during the study. The SVM model had the highest AUC (0.868, 95%CI: 0.813-0.925), followed by logistic regression (AUC=0.865, 95%CI: 0.755-0.899) and lightGBM (AUC=0.791, 95%CI: 0.690-0.882). Recursive feature elimination with cross-validation (RFECV) based on random forest (RF) and the SHAP plot based on the SVM model showed that immunohistochemistry IgG, total protein (TP), anti-phospholipase A2 receptor (anti-PLA2R), blood chloride and D-dimer were risk factors affecting the progression of non-nephrotic membranous nephropathy. Moreover, patients with high immunohistochemistry IgG, anti-PLA2R and D-dimer had an increased risk of reaching the renal end-point event.
Conclusion: The SVM model established in this study can effectively predict the progression of non-nephrotic membranous nephropathy and provides a new method for the early identification of high-risk patients and for precision therapy.
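
The abstract does not say which mixed sampling scheme was used or how the AUC was computed, so the following is only a minimal sketch under stated assumptions: it treats "mixed sampling" as random undersampling of the majority class combined with random oversampling (with replacement) of the minority class, and computes the AUC as the fraction of positive/negative pairs ranked correctly. The function names `mixed_resample` and `roc_auc` and the target size of 60 per class are hypothetical and not taken from the paper; only the cohort size (148 patients, 23 events) comes from the abstract.

```python
import random


def mixed_resample(rows, labels, target_per_class, seed=0):
    """Resample so that every class ends up with target_per_class rows.

    Assumed scheme (not the paper's): classes larger than the target are
    randomly undersampled without replacement; smaller classes keep all
    their rows and are topped up by sampling with replacement.
    """
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        if len(members) >= target_per_class:
            chosen = rng.sample(members, target_per_class)
        else:
            shortfall = target_per_class - len(members)
            chosen = members + [rng.choice(members) for _ in range(shortfall)]
        out_rows.extend(chosen)
        out_labels.extend([label] * target_per_class)
    return out_rows, out_labels


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Class sizes from the abstract: 23 events among 148 patients.
labels = [1] * 23 + [0] * 125
rows = list(range(148))  # placeholder row identifiers, not real data
balanced_rows, balanced_labels = mixed_resample(rows, labels, target_per_class=60)

# Toy scores, purely illustrative: 3 of 4 positive/negative pairs are ordered correctly.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

In a sketch like this, resampling would normally be applied to the training split only, so that the AUC is measured on data with the original class ratio.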