Learning from Noisy Anchors for One-Stage Object Detection.

Hengduo Li, Zuxuan Wu, Chen Zhu, Caiming Xiong, Richard Socher, Larry S. Davis
DOI: https://doi.org/10.1109/cvpr42600.2020.01060
2019-01-01
Computer Vision and Pattern Recognition
Abstract: State-of-the-art object detectors rely on regressing and classifying an extensive list of possible anchors, which are divided into positive and negative samples based on their intersection-over-union (IoU) with corresponding ground-truth objects. Such a harsh split conditioned on IoU results in binary labels that are potentially noisy and challenging for training. In this paper, we propose to mitigate noise incurred by imperfect label assignment such that the contributions of anchors are dynamically determined by a carefully constructed cleanliness score associated with each anchor. Exploring outputs from both regression and classification branches, the cleanliness scores, estimated without incurring any additional computational overhead, are used not only as soft labels to supervise the training of the classification branch but also as sample re-weighting factors for improved localization and classification accuracy. We conduct extensive experiments on COCO, and demonstrate, among other things, that the proposed approach steadily improves RetinaNet by ~2% with various backbones.
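To make the idea in the abstract concrete, here is a minimal sketch of how a per-anchor cleanliness score might serve both as a soft classification target and as a loss weight. The abstract does not give the exact formula, so everything specific here is an assumption for illustration: the convex combination controlled by `alpha`, the default value 0.75, the use of the score itself as the re-weighting factor, and the function names `cleanliness_score` and `soft_label_bce`.

```python
import math


def cleanliness_score(loc_iou, cls_conf, alpha=0.75):
    """Blend localization quality and classification confidence.

    loc_iou:  IoU between the anchor's regressed box and its ground-truth box.
    cls_conf: predicted probability for the ground-truth class.
    alpha:    hypothetical mixing weight (assumed, not taken from the abstract).
    """
    return alpha * loc_iou + (1.0 - alpha) * cls_conf


def soft_label_bce(pred_prob, soft_target, weight=1.0, eps=1e-7):
    """Binary cross-entropy against a soft target in [0, 1], scaled by a weight."""
    p = min(max(pred_prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    loss = -(soft_target * math.log(p) + (1.0 - soft_target) * math.log(1.0 - p))
    return weight * loss


if __name__ == "__main__":
    # Two illustrative anchors given as (loc_iou, cls_conf); values are made up.
    anchors = [(0.9, 0.8), (0.55, 0.2)]
    for loc_iou, cls_conf in anchors:
        c = cleanliness_score(loc_iou, cls_conf)
        # The score is used twice: as the soft label and as the sample weight.
        loss = soft_label_bce(pred_prob=cls_conf, soft_target=c, weight=c)
        print(f"cleanliness={c:.3f} weighted_loss={loss:.4f}")
```

Because the score is computed from outputs the detector already produces, a scheme of this shape adds no extra forward passes, which is consistent with the abstract's claim of no additional computational overhead.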