Establishing a survival probability prediction model for different lung cancer therapies

Hsiu-An Lee, Hsiao-Hsien Rau, Louis R. Chao, Chien-Yeh Hsu
DOI: https://doi.org/10.1007/s11227-019-02992-6
IF: 3.3
2019-09-18
The Journal of Supercomputing
Abstract: Cancer is the leading cause of death in Taiwan, according to the Ministry of Health and Welfare (2017), with cancers of the trachea, bronchus, and lung being the most prevalent. Thus, it is critically important to study this disease. Taiwan's National Health Insurance Research Database (NHIRDB), which covers 99.9% of residents, makes it possible to analyze comorbidities and predict the outcomes of clinical therapy. This study focuses on non-small cell lung cancer. We first obtained cancer registration indexes from two million individual patient records in the NHIRDB by screening for patients with a clinical diagnosis of ICD C33-34 (trachea, bronchus and lung cancer). We then used these cancer registration indexes to identify all therapies and comorbidities of these patients and used them as input parameters to establish a predictive model of survival probability for lung cancer. Linear and nonlinear data mining methods were employed to build prediction models to study the effects of different therapies on the 3-year survival probability of lung cancer patients. We found that the artificial neural network (ANN) model performs better than the logistic regression (LR) model. The best point of the ANN model on the ROC curve is at sensitivity = 77.6% and specificity = 76.8%, with AUROC = 83%.
computer science, theory & methods; engineering, electrical & electronic; hardware & architecture
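The abstract reports a "best point" on the ROC curve together with an AUROC, but does not say how that point was chosen. The following is a minimal sketch of one common approach, assuming the best point is the threshold that maximizes Youden's J (sensitivity + specificity - 1); that criterion, the function names, and the toy data are all assumptions for illustration, not the authors' method or data.

```python
def roc_points(y_true, scores):
    """Return (threshold, sensitivity, specificity) for each distinct score."""
    pos = sum(1 for y in y_true if y == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")
    points = []
    for t in sorted(set(scores), reverse=True):
        # A case is predicted positive when its score is at or above t.
        tp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 0)
        points.append((t, tp / pos, 1 - fp / neg))
    return points


def best_point_youden(y_true, scores):
    """Assumed criterion: maximize Youden's J = sensitivity + specificity - 1."""
    return max(roc_points(y_true, scores), key=lambda p: p[1] + p[2] - 1)


def auroc(y_true, scores):
    """AUROC as P(random positive outscores random negative); ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels (1 = survived 3 years) and hypothetical model scores,
# not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.55, 0.7, 0.4, 0.2, 0.1]

threshold, sens, spec = best_point_youden(y_true, scores)
area = auroc(y_true, scores)
print(threshold, sens, spec, area)
```

In practice the same scores would come from each fitted model (LR or ANN) on held-out patients, so the two models can be compared on AUROC and on their respective best points.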