Identifying an optimal machine learning generated image marker to predict survival of gastric cancer patients

Huong Pham, Meredith Jones, Tiancheng Gai, Warid Islam, Gopichandh Danala, Javier Jo, Bin Zheng
DOI: https://doi.org/10.1117/12.2611788
2022-04-04
Abstract: Computer-aided detection and/or diagnosis (CAD) schemes typically include machine learning classifiers trained using handcrafted features. The objective of this study is to investigate the feasibility of identifying and applying a new quantitative imaging marker to predict survival of gastric cancer patients. A retrospective dataset including CT images of 403 patients is assembled. Among them, 162 patients survived more than 5 years. A CAD scheme is applied to segment gastric tumors depicted in multiple CT image slices. After gray-level normalization of each segmented tumor region to reduce image value fluctuation, we use a special feature selection library of the publicly available Pyradiomics software to compute 103 features. To identify an optimal approach to predict patient survival, we investigate two imaging markers generated by logistic regression models (LRMs). The first marker fuses image features computed from one CT slice, and the second fuses the weighted average of image features computed from multiple CT slices. The two LRMs are trained and tested using a leave-one-case-out cross-validation method. Using the LRM-generated prediction scores, receiver operating characteristic (ROC) curves are computed, and the area under the ROC curve (AUC) is used as the index to evaluate performance in predicting patient survival. Study results show that the case-based prediction AUC values are 0.70 and 0.72 for the LRM-generated imaging markers that fuse image features computed from a single CT slice and from multiple CT slices, respectively. This study demonstrates that (1) radiomics features computed from CT images carry valuable discriminatory information to predict survival of gastric cancer patients and (2) fusion of quasi-3D image features yields higher prediction accuracy than using simple 2D image features.
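The evaluation pipeline described in the abstract (weighted averaging of per-slice features, a logistic regression model scored by leave-one-case-out cross-validation, and AUC as the performance index) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic rather than Pyradiomics features, the slice weights are random because the abstract does not state the weighting scheme, the logistic regression is a plain gradient-descent fit, and all function names are hypothetical. It is not the authors' implementation.

```python
import math
import random


def weighted_average_features(slice_features, slice_weights):
    """Fuse per-slice feature vectors into one case-level vector.
    The weighting scheme is an assumption; the abstract does not specify it."""
    total = sum(slice_weights)
    n_features = len(slice_features[0])
    return [
        sum(w * f[j] for f, w in zip(slice_features, slice_weights)) / total
        for j in range(n_features)
    ]


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=200):
    """Fit logistic regression by batch gradient descent (illustrative only)."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def leave_one_case_out_scores(X, y):
    """Train on all cases but one, score the held-out case, repeat for every case."""
    scores = []
    for i in range(len(X)):
        w, b = fit_logistic(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        scores.append(sigmoid(b + sum(wj * xj for wj, xj in zip(w, X[i]))))
    return scores


def auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one
    (ties count one half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def make_synthetic_cases(n_cases, seed=0):
    """Synthetic stand-in for radiomics data: each case has 2 to 5 slices
    with two features per slice, fused by a randomly weighted average."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_cases):
        label = i % 2
        center = 2.0 if label == 1 else -2.0
        n_slices = rng.randint(2, 5)
        slices = [[rng.gauss(center, 1.0), rng.gauss(0.0, 1.0)] for _ in range(n_slices)]
        weights = [rng.uniform(0.5, 2.0) for _ in range(n_slices)]
        X.append(weighted_average_features(slices, weights))
        y.append(label)
    return X, y


X_demo, y_demo = make_synthetic_cases(30)
demo_scores = leave_one_case_out_scores(X_demo, y_demo)
print("leave-one-case-out AUC on synthetic data:", auc(demo_scores, y_demo))
```

Because each case is scored by a model that never saw it during training, the resulting AUC is a case-based estimate, which is the property the leave-one-case-out design is meant to provide. The synthetic classes here are deliberately well separated, so the printed AUC says nothing about the values reported in the study.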