Reduction of false positives by machine learning for computer-aided detection of colonic polyps

Xin Zhao, Su Wang, Hongbin Zhu, Zhengrong Liang
DOI: https://doi.org/10.1117/12.812468
2009-01-01
Abstract: With the development of computer-aided detection of polyps (CADpolyp), various features have been extracted to detect the initial polyp candidates (IPCs). In this paper, three approaches were used to reduce the number of false positives (FPs): multiple linear regression (MLR) and two modified machine learning methods, namely a neural network (NN) and a support vector machine (SVM), each based on its own characteristics and specific learning purpose. Compared to MLR, the two modified machine learning methods are much more sophisticated and better adapted to the data provided. To achieve optimal sensitivity and specificity, the raw features were pre-processed by principal component analysis (PCA) to remove second-order statistical correlation prior to any learning. The gain from PCA was demonstrated on 26 collected patient studies, which included 32 colonic polyps confirmed by both optical colonoscopy (OC) and virtual colonoscopy (VC). The learning and testing results showed that, at 100% detection sensitivity, the two modified machine learning methods reduce the number of FPs by 48.9% (or 7.2 FPs per patient) and 45.3% (or 7.7 FPs per patient), respectively, in comparison with the traditional MLR method. Because more features than necessary are generally stacked as input vectors to machine learning algorithms, this paper also considers and discusses dimensionality reduction for a more compact feature combination, i.e., how to determine the dimensionality retained after the PCA linear transform. In addition, we propose a new PCA-scaled data pre-processing method that helps reduce the FPs significantly. Finally, free-response receiver operating characteristic (fROC) curves for the three FP-reduction approaches were acquired and compared.
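The PCA pre-processing step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `pca_decorrelate`, the 95% retained-variance threshold, the per-component scaling by the square root of the eigenvalue (one possible reading of "PCA-scaled"), and the synthetic feature matrix are all assumptions made for illustration only.

```python
import numpy as np

def pca_decorrelate(features, variance_kept=0.95, scale=True):
    """Project features onto leading principal components (hypothetical helper).

    features: (n_candidates, n_features) array of raw candidate features.
    variance_kept: assumed rule for choosing the retained dimensionality.
    scale: if True, divide each component by its standard deviation so the
           outputs have unit variance (an assumed form of PCA scaling).
    """
    X = np.asarray(features, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each feature
    cov = np.cov(Xc, rowvar=False)               # second-order statistics
    eigvals, eigvecs = np.linalg.eigh(cov)       # symmetric eigendecomposition
    order = np.argsort(eigvals)[::-1]            # sort by descending variance
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]
    ratio = np.cumsum(eigvals) / eigvals.sum()   # cumulative explained variance
    k = min(int(np.searchsorted(ratio, variance_kept) + 1), X.shape[1])
    Z = Xc @ eigvecs[:, :k]                      # decorrelated components
    if scale:
        Z = Z / np.sqrt(eigvals[:k] + 1e-12)     # unit variance per component
    return Z, k

# Synthetic stand-in for candidate features: 6 correlated features
# generated from 3 latent factors (not data from the paper).
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 6))
demo_features = latent @ mixing + 0.01 * rng.normal(size=(200, 6))
Z, k = pca_decorrelate(demo_features)
```

The decorrelated, reduced matrix `Z` would then be passed to a classifier such as an NN or SVM in place of the raw features; how the paper actually selects the retained dimensionality is not stated in the abstract.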