Gastric cancer prediction model based on C5.0 classification algorithm

Zhigang HUANG, Hong LIU, Juan LIU, Qishan ZHANG
DOI: https://doi.org/10.13878/j.cnki.jnuist.2017.04.008
2017-01-01
Abstract: The incidence of gastric cancer is very high in China: new gastric cancer diagnoses in China account for 42% of the worldwide total each year, so gastric cancer has become a focus of malignant tumor prevention and control in China. In this paper, the C5.0 classification algorithm is used to predict the survival rate of gastric cancer patients, and experiments are carried out using the SEER database of the U.S. National Cancer Institute. Data preprocessing and data integration methods are given that account for the imbalanced nature of the gastric cancer records. The prediction results show that the C5.0 algorithm achieves higher accuracy and specificity than the BP (back-propagation) neural network method, and that there is an obvious correlation between birthplace and the survival state of gastric cancer patients. This study is a practical application of data mining technology in medicine. It has reference value for the clinical diagnosis of gastric cancer and can help doctors formulate reasonable treatment and prevention programs.
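The two evaluation measures named in the abstract, accuracy and specificity, can be illustrated with a minimal sketch. The function name, the label encoding (1 for the positive class, 0 for the negative class), and the toy labels are assumptions made for illustration; they are not taken from the paper or from the SEER data.

```python
def accuracy_and_specificity(y_true, y_pred, positive=1):
    """Compute accuracy and specificity from paired true/predicted labels.

    Assumption: binary labels, with `positive` marking the positive class.
    """
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1  # true positive
        elif truth != positive and pred != positive:
            tn += 1  # true negative
        elif truth != positive and pred == positive:
            fp += 1  # false positive
        else:
            fn += 1  # false negative

    total = tp + tn + fp + fn
    # Accuracy: share of all records classified correctly.
    accuracy = (tp + tn) / total if total else 0.0
    # Specificity: share of actual negatives classified as negative.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return accuracy, specificity


# Toy labels, invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
acc, spec = accuracy_and_specificity(y_true, y_pred)
print(f"accuracy={acc:.2f}, specificity={spec:.2f}")
```

On imbalanced records, accuracy alone can look favorable when a model mostly predicts the majority class, which is one reason a second measure such as specificity is commonly reported alongside it.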