Improving Colorectal Polyp Classification Based on Physical Examination Data — An Ensemble Learning Approach

Chong Li, Xiaolei Xie, Jinlin Li, Nan Kong
DOI: https://doi.org/10.1109/coase.2017.8256102
2018-01-01
Abstract: Colorectal cancer is a common type of cancer. Because of its alarming incidence and mortality rates, early detection and treatment have received increasing attention. Colorectal polyps form and grow in the initial stages of most colorectal cancer cases. Because medical resources are limited and screening compliance is low, it is more desirable in China than in industrialized countries to characterize the relations between the detection of colorectal polyps and various potential determinants, including basic health information, comorbidities, and lifestyle conditions. With these relations characterized, one can better predict polyp onset for each individual. In this paper, we present a data-driven modeling study to improve binary classification of colorectal polyp occurrence. We apply several machine-learning methods, particularly random forests, to physical examination data from 849 Chinese people to build the classifiers. Our results suggest improved prediction performance with a random forest model. Our results also show that a subject's negative mood score, rarely recorded in previous studies, is highly correlated with colorectal polyp occurrence.