Prediction of Second Primary Lung Cancer Patient’s Survivability Based on Improved Eigenvector Centrality-Based Feature Selection

Peng Liu, Kexin Jin, Yiping Jiao, Mutian He, Shumin Fei
DOI: https://doi.org/10.1109/access.2021.3063944
IF: 3.9
2021-01-01
IEEE Access
Abstract: Modeling the survival of second primary lung cancer (SPLC) patients is of both theoretical and practical importance. Cancer survivability prediction may provide advice for better clinical decisions and personalized medicine. The Surveillance, Epidemiology, and End Results (SEER) program provides large data sets for analysis with machine learning methods. SPLC cases are identified and labeled from the SEER database; the data set is then preprocessed with improved eigenvector centrality-based feature selection (IECFS). The IECFS method utilizes interclass and intraclass dispersions and the ranking criteria. The method achieves its best performance by adjusting the value of the $\alpha$ parameter and the number of features selected. The experiment is divided into five folds. For five-year survivability, the method yields a prediction accuracy of 90.998%, which is higher than the original classification accuracy (89.16%) and that of the other state-of-the-art feature selection methods. For three-year survivability, the proposed method yields a prediction accuracy of 83.16%, slightly outperforming all of the compared methods. The method is effective and generalizable.
computer science, information systems; telecommunications; engineering, electrical & electronic
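
The abstract names the ingredients of the feature selection step (interclass and intraclass dispersions, an $\alpha$ parameter, a ranking criterion) but does not give the formulas. The following is a minimal sketch of generic eigenvector centrality-based feature ranking, not the paper's IECFS: it assumes a Fisher-style dispersion ratio per feature, a feature graph mixed by `alpha` between that ratio and the features' standard deviations, and a ranking by the principal eigenvector. The function names `dispersion_ratio` and `rank_features`, and the exact form of the adjacency matrix, are illustrative assumptions.

```python
import numpy as np


def dispersion_ratio(X, y):
    """Per-feature ratio of interclass to intraclass dispersion (Fisher-style).

    Assumption: the abstract does not define its dispersions; this is the
    common textbook form.
    """
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        class_mean = Xc.mean(axis=0)
        between += len(Xc) * (class_mean - overall_mean) ** 2
        within += ((Xc - class_mean) ** 2).sum(axis=0)
    return between / (within + 1e-12)


def rank_features(X, y, alpha=0.5):
    """Return feature indices ordered from most to least central.

    Assumption: the graph is A = alpha * f f^T + (1 - alpha) * max(s_i, s_j),
    where f is the normalized dispersion ratio and s the normalized standard
    deviation. The paper's improved variant may weight these differently.
    """
    f = dispersion_ratio(X, y)
    f = f / (f.max() + 1e-12)
    s = X.std(axis=0)
    s = s / (s.max() + 1e-12)

    A = alpha * np.outer(f, f) + (1.0 - alpha) * np.maximum.outer(s, s)

    # A is symmetric, so eigh applies; eigenvalues come back in ascending
    # order and the last column is the principal eigenvector.
    _, vecs = np.linalg.eigh(A)
    centrality = np.abs(vecs[:, -1])  # the sign of an eigenvector is arbitrary
    return np.argsort(-centrality)
```

In a pipeline like the one the abstract describes, `alpha` and the number of top-ranked features kept would be chosen by comparing classifier accuracy across the five folds, with the ranking computed on each training fold only.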