VFEDet: a variational information bottleneck based feature enhancement object detection network

Mingyu Wu, Ming Zhu, Ruixue Tang
DOI: https://doi.org/10.1117/12.2589411
2021-01-27
Abstract: Anchor-based two-stage object detection methods such as Faster R-CNN are commonly used for detection tasks in various fields. Because the networks in these methods are built on pre-trained classification models, their performance depends largely on the backbone's properties, which limits their generalization ability on some specific datasets. To overcome this problem and enhance the model's representation ability, we propose a Variational Information Bottleneck Based Feature Enhancement Object Detection Network (VFEDet). We first design a spatial-wise feature enhancement module in the first stage that highlights the critical targets in an image, using a weighting map generated from the original features by a Variational Information Bottleneck (VIB). The module effectively suppresses overfitting and makes the features carry more discriminative information for recognition and bounding box regression. Furthermore, we modify the second stage by inserting a VIB after the first fully connected layer to improve the model's robustness. By introducing these two parts into the original detection model, we achieve a 39.34% improvement on a thyroid nodule ultrasound image dataset from a previous work that is polluted by a special kind of noise. The effectiveness of the proposed method is also evaluated on the COCO dataset.
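The spatial-wise weighting described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the VIB parameters are produced, which prior is used, or how the weighting map is squashed. The sketch assumes a per-location Gaussian latent with a standard normal prior, the usual reparameterization trick, and a sigmoid to turn the sampled latent into a weight. The names `vib_spatial_weighting`, `kl_to_standard_normal`, `mu_map`, and `logvar_map` are hypothetical; in a real detector the two maps would come from learned layers applied to the feature map.

```python
import math
import random


def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, exp(logvar)) || N(0, 1) ) for one scalar latent."""
    return 0.5 * (mu * mu + math.exp(logvar) - 1.0 - logvar)


def vib_spatial_weighting(features, mu_map, logvar_map, rng=None, sample=True):
    """Reweight an H x W x C feature map with a VIB-style spatial map.

    features:   nested lists of shape H x W x C
    mu_map:     nested lists of shape H x W (assumed output of a learned layer)
    logvar_map: nested lists of shape H x W (assumed output of a learned layer)
    Returns (enhanced features, mean KL per spatial location).
    """
    rng = rng or random.Random(0)
    enhanced, kl_total, count = [], 0.0, 0
    for f_row, mu_row, lv_row in zip(features, mu_map, logvar_map):
        out_row = []
        for feat, mu, logvar in zip(f_row, mu_row, lv_row):
            # Reparameterization: z = mu + sigma * eps; eps is 0 when not sampling.
            eps = rng.gauss(0.0, 1.0) if sample else 0.0
            z = mu + math.exp(0.5 * logvar) * eps
            # Assumed squashing: a sigmoid maps the latent to a weight in (0, 1).
            weight = 1.0 / (1.0 + math.exp(-z))
            out_row.append([weight * c for c in feat])
            kl_total += kl_to_standard_normal(mu, logvar)
            count += 1
        enhanced.append(out_row)
    return enhanced, kl_total / count
```

Under these assumptions, the returned KL term would be added to the detection loss with a small coefficient, so that the weighting map is pushed to keep only the information that helps recognition and box regression. The same sample-then-penalize pattern could be applied to the vector after the first fully connected layer in the second stage.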