Using Multi-Label Classification to Improve Object Detection.

Tao Gong, Bin Liu, Qi Chu, Nenghai Yu
DOI: https://doi.org/10.1016/j.neucom.2019.08.089
IF: 6
2019-01-01
Neurocomputing
Abstract: This paper proposes a novel multi-task framework for object detection. The framework uses multi-label classification as an auxiliary task to improve object detection, and it can be trained and tested end-to-end. The object detection branch adopts the R-FCN method to solve the detection task, while the multi-label branch uses an attention mechanism to solve the multi-label classification task. The features generated by the attention mechanism in the multi-label branch contain rough localization information about the objects and are therefore useful for object detection. Both the box-level and the image-level multi-label features are fused to improve detection accuracy. The proposed framework does not require any extra annotation, since the ground truth for multi-label classification can be obtained directly from the bounding box annotations. This differs from other multi-task frameworks such as StuffNet and Mask R-CNN, which need extra semantic segmentation and instance segmentation annotations. Extensive experiments on PASCAL VOC 2007, PASCAL VOC 2012 and MS COCO demonstrate the effectiveness of the proposed approach. Code is publicly available at: https://github.com/GT9505/MONet.
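The abstract's point that multi-label ground truth comes for free from bounding box annotations can be illustrated with a minimal sketch. This is not taken from the MONet repository: the function name `boxes_to_multilabel` and its input format (one class name per annotated box) are assumptions made for illustration, and the class list is the standard set of 20 PASCAL VOC categories.

```python
from typing import List, Sequence

# The 20 PASCAL VOC object categories.
VOC_CLASSES = (
    "aeroplane", "bicycle", "bird", "boat", "bottle",
    "bus", "car", "cat", "chair", "cow",
    "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
)


def boxes_to_multilabel(
    box_classes: Sequence[str],
    classes: Sequence[str] = VOC_CLASSES,
) -> List[int]:
    """Hypothetical helper: turn per-box class names into a multi-hot vector.

    A class is marked 1 if at least one annotated box of that class
    appears in the image, and 0 otherwise. Box coordinates are not needed.
    """
    index = {name: i for i, name in enumerate(classes)}
    target = [0] * len(classes)
    for name in box_classes:
        target[index[name]] = 1  # repeated instances collapse to a single 1
    return target


# An image annotated with two dogs and one person.
labels = boxes_to_multilabel(["dog", "person", "dog"])
present = [VOC_CLASSES[i] for i, v in enumerate(labels) if v]
print(present)
```

Because the image-level target only records which classes are present, it needs no labeling beyond what a detection dataset already provides.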