Machine-Learning Based Early Warning System for Prediction for Disseminated Intravascular Coagulation after Allogeneic Hematopoietic Stem Cell Transplantation: A Nationwide Multicenter Study
Zhuo-Yu An,Ye-Jun Wu,Yun He,Xiao-Lu Zhu,Yan Su,Chen-Cong Wang,Tian-Xiao Han,Hai-Xia Fu,Feng-Rong Wang,Xiao-Dong Mo,Yu Wang,Xiang-Yu Zhao,Yuan-Yuan Zhang,Wei Han,Huan Chen,Yao Chen,Chen-Hua Yan,Jing-Zhi Wang,Ting-Ting Han,Yu-Hong Chen,Yi-Fei Cheng,Ying-Jun Chang,Lan-Ping Xu,Kai-Yan Liu,Xiao Jun Huang,Xiaohui Zhang
DOI: https://doi.org/10.1182/blood-2021-150437
IF: 20.3
2021-11-05
Blood
Abstract: Introduction Allogeneic haematopoietic stem cell transplantation (allo-HSCT) has been demonstrated to be the most effective therapy for various malignant and nonmalignant haematological diseases. The wide use of allo-HSCT has inevitably led to a variety of post-transplantation complications, including bleeding complications such as disseminated intravascular coagulation (DIC). DIC accounts for a significant proportion of life-threatening bleeding cases occurring after allo-HSCT. However, information on markers for early identification remains limited, and no predictive tools for DIC after allo-HSCT are available. This research aimed to identify the risk factors for DIC after allo-HSCT and to establish models that predict its occurrence. Methods DIC was defined according to the International Society on Thrombosis and Haemostasis (ISTH) scoring system. Overall, 197 patients with DIC after allo-HSCT at Peking University People's Hospital and 7 other centers in China from January 2010 to June 2021 were retrospectively identified. Each patient was randomly matched to 3 controls based on the time of allo-HSCT (±3 months) and length of follow-up (±6 months). A lasso regression model was used for data dimension reduction, feature selection, and identification of candidate risk factors. Multivariable logistic regression analysis was used to develop the prediction model, which incorporated the clinical risk factors and was presented as a nomogram. The performance of the nomogram was assessed with respect to its calibration, discrimination, and clinical usefulness, with both internal and external validation. Several machine learning classifiers were also trained on the same classification task: XGBClassifier, LogisticRegression, MLPClassifier, RandomForestClassifier, and AdaBoostClassifier.
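The 1:3 control matching described in the Methods can be illustrated with a minimal sketch. The abstract does not give the matching procedure beyond the two windows, so everything here is an assumption made for illustration: the record fields (`id`, `hsct_date`, `followup_months`), the helper names, the 30.44-day month approximation, and the toy data. It also matches a single case only and does not prevent a control from being reused across cases.

```python
import random
from datetime import date

def months_apart(d1, d2):
    """Approximate absolute gap between two dates in months (assumes 30.44-day months)."""
    return abs((d1 - d2).days) / 30.44

def match_controls(case, pool, k=3, hsct_window=3.0, followup_window=6.0, rng=None):
    """Randomly pick up to k controls whose transplant date and follow-up
    length fall within the given windows (in months) of the case."""
    rng = rng or random.Random(0)  # fixed seed so the sketch is reproducible
    eligible = [
        c for c in pool
        if months_apart(c["hsct_date"], case["hsct_date"]) <= hsct_window
        and abs(c["followup_months"] - case["followup_months"]) <= followup_window
    ]
    return rng.sample(eligible, min(k, len(eligible)))

# Hypothetical toy records, not data from the study.
case = {"id": 0, "hsct_date": date(2015, 6, 1), "followup_months": 24}
pool = [
    {"id": 1, "hsct_date": date(2015, 5, 1), "followup_months": 20},
    {"id": 2, "hsct_date": date(2015, 8, 15), "followup_months": 28},
    {"id": 3, "hsct_date": date(2015, 12, 1), "followup_months": 24},  # transplant date too far
    {"id": 4, "hsct_date": date(2015, 6, 20), "followup_months": 40},  # follow-up too different
    {"id": 5, "hsct_date": date(2015, 3, 10), "followup_months": 22},
    {"id": 6, "hsct_date": date(2015, 7, 4), "followup_months": 19},
]

matched = match_controls(case, pool)
matched_ids = sorted(c["id"] for c in matched)
print(matched_ids)
```

In this toy pool, controls 1, 2, 5, and 6 satisfy both windows, so three of them are drawn at random. A full implementation would also need a rule for cases with fewer than three eligible controls.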
Results A total of 7280 patients received allo-HSCT from January 2010 to June 2021, and DIC occurred in 197 of these patients (incidence of 2.7%). The derivation cohort included 120 patients with DIC and 360 matched controls from Peking University People's Hospital, and the validation cohort included the remaining 77 patients with DIC and 231 matched controls from the other 7 centers; all had received allo-HSCT. The median time to DIC was 99.0 (IQR, 46.8-220) days after allo-HSCT. The overall survival of patients with DIC was significantly reduced (P < 0.0001). By lasso regression, the 10 variables with the highest importance were, from highest to lowest, prothrombin time activity (PTA), shock, C-reactive protein, international normalized ratio, bacterial infection, oxygenation, fibrinogen, blood creatinine, white blood cell count, and acute respiratory distress syndrome. In the multivariate analysis, the independent risk factors for DIC were PTA, bacterial infection, and shock (P < 0.001), and these predictors were included in the clinical prediction nomogram. The model showed good discrimination, with a C-index of 0.975 (95% CI, 0.939 to 0.987, through internal validation), and good calibration. Application of the nomogram in the validation cohort still gave good discrimination (C-index, 0.778 [95% CI, 0.759 to 0.766]) and good calibration. Decision curve analysis demonstrated that the nomogram was clinically useful. The ROC curves of the different machine learning models showed that XGBClassifier performed best on this dataset, with an area under the curve of 0.86. Conclusions Risk factors for DIC after allo-HSCT were identified, and a nomogram model and various machine learning models were established to predict the occurrence of DIC after allo-HSCT. Combined, these can help identify high-risk patients and enable timely treatment.
In the future, we will further refine the prognostic model utilizing nationwide multicenter data and conduct prospective clinical trials to reduce the incidence of DIC after allo-HSCT and improve the prognosis. Disclosures No relevant conflicts of interest to declare.
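The discrimination measures reported in the Results (C-index and area under the ROC curve) coincide for a binary outcome: both are the proportion of case-control pairs in which the case receives the higher predicted risk, with ties counted as half. The sketch below computes this directly. The function name and the toy labels and scores are assumptions made for illustration, not values from the study, and the abstract does not state how the authors computed their estimates.

```python
def c_index(y_true, y_score):
    """Concordance index for a binary outcome (equal to the ROC AUC).

    Counts, over all (case, control) pairs, how often the case has the
    higher predicted score; tied scores contribute 0.5.
    """
    concordant = 0.0
    pairs = 0
    for yi, si in zip(y_true, y_score):
        if yi != 1:
            continue  # outer loop visits cases only
        for yj, sj in zip(y_true, y_score):
            if yj != 0:
                continue  # inner loop visits controls only
            pairs += 1
            if si > sj:
                concordant += 1.0
            elif si == sj:
                concordant += 0.5
    if pairs == 0:
        raise ValueError("need at least one case and one control")
    return concordant / pairs

# Hypothetical toy example: 2 cases (label 1) and 3 controls (label 0).
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.4]
print(c_index(labels, scores))  # 4.5 concordant out of 6 pairs -> 0.75
```

The pairwise loop is quadratic in sample size, which is fine at the cohort sizes described here; rank-based formulas give the same value more efficiently.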
hematology