Polyp detection in CT colonography based on imbalanced data sets

Xin XIONG, Lisheng XU, Chunwu WANG, Yan KANG
2013-01-01
Abstract: Polyp detection in CT colonography suffers from imbalanced data sets in which negative (non-polyp) samples dominate. At the data level, SMOTE (Synthetic Minority Over-Sampling Technique) was applied to reduce the degree of imbalance by synthesizing minority samples. At the algorithm level, a Boosting approach was employed to improve classification performance. By combining Boosting with SMOTE (SMOTEBoost), the proposed classifier not only improved the prediction of minority samples but also maintained accuracy over the entire data set. To satisfy the real-time requirements of polyp detection, MRMR (Minimum Redundancy Maximum Relevance) was used to select simple, low-cost features for training the first stage of a cascade, which rejects the great majority of negative samples and speeds up processing. The experimental results showed that the classifier achieved an overall per-polyp sensitivity of 90% for polyps with a diameter of 5 mm or greater, with an average of 6 false positives per volume.
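The data-level step mentioned in the abstract can be illustrated with a minimal sketch of SMOTE-style oversampling. This is not the authors' implementation: the function name `smote`, the parameters `k` and `seed`, and the toy two-dimensional feature vectors are all assumptions made for illustration. Each synthetic sample is placed at a random position on the line segment between a minority sample and one of its nearest minority neighbours.

```python
import random
from math import dist


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic points interpolated between minority samples
    and their k nearest minority neighbours (illustrative sketch only)."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")

    synthetic = []
    for _ in range(n_synthetic):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]

        # Find its k nearest minority neighbours, excluding the base itself.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: dist(base, p))[:k]

        # Interpolate at a random position between base and one neighbour.
        neighbour = rng.choice(neighbours)
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical feature vectors for four polyp (minority) candidates.
polyps = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
extra_polyps = smote(polyps, n_synthetic=6, k=2)
```

In a SMOTEBoost-style scheme, synthetic minority samples of this kind would be generated in each boosting round before the weak learner is trained; the sketch above covers only the sampling step.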