Application and comparison of several machine learning methods in the prognosis of cervical cancer
Yawen Ling,Weiwei Zhang,Zhidong Li,Xiaorong Pu,Yazhou Ren
DOI: https://doi.org/10.22514/ejgo.2022.056
2022-01-01
European Journal of Gynaecological Oncology
Abstract: Accurate prognosis of cervical cancer in the clinical setting is challenging because of the complexity of the causative factors. Given the drawbacks of the widely used Cox proportional hazards model, such as its inability to fully use the available information and its possible failure to achieve the best fit, several machine learning approaches have been developed in search of better prognostic prediction models. However, the practical applicability of these approaches is often limited because they rely on public databases. Therefore, for cervical cancer, there is a need to explore the value of machine learning in terms of its practical application in prognostic prediction. In this study, we applied several machine learning methods, including k-nearest neighbors (KNN), decision tree (DT), logistic regression (LR), support vector machine (SVM), random forest (RF), extreme gradient boosting (XGBoost) and light gradient boosting machine (LightGBM), to predict patient survival using real-world pathological data of 216 patients collected from the Fifth People's Hospital of Chengdu. The experimental results showed that these methods have promising application value in predicting the overall survival (OS) of patients with cervical cancer (KNN: F1-score = 0.95, Accuracy = 0.93; DT: F1-score = 0.94, Accuracy = 0.92; LR: F1-score = 0.92, Accuracy = 0.90; SVM: F1-score = 0.94, Accuracy = 0.92; RF: F1-score = 0.96, Accuracy = 0.95; XGBoost: F1-score = 0.96, Accuracy = 0.95; LightGBM: F1-score = 0.96, Accuracy = 0.95). Moreover, XGBoost and LightGBM provided the importance of the clinical indicators associated with cervical cancer, whose correlation with OS and progression-free survival (PFS) could then be further examined. Thus, the predictors of OS and PFS were successfully identified. Finally, the results were confirmed by the Cox proportional hazards model.
These results indicated that machine learning methods can accurately predict the OS of patients with cervical cancer. Moreover, the methods can be used to analyze the correlation between clinical indicators and OS or PFS to help doctors make more accurate decisions in a clinical setting.
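The abstract reports each model's performance as an F1-score and an accuracy. As a minimal sketch of how these two metrics are computed for a binary survival label, the following uses plain Python. The function names, the label encoding (1 = survived, 0 = died) and the ten example labels are illustrative assumptions; they are not the study's code or data.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true label."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical labels for ten patients (assumed encoding: 1 = survived, 0 = died).
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 0, 0, 0, 1]

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
print(f"Accuracy = {acc:.2f}, F1-score = {f1:.2f}")
```

Because the F1-score ignores true negatives, it can differ from accuracy when the two outcome classes are imbalanced, which is one reason to report both.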