Sparse Object Detector With Faster Convergence and Simpler Structure

Jiajie Hu, Dong Yin
DOI: https://doi.org/10.1109/ISCEIC59030.2023.10271223
2023-01-01
Abstract: Mainstream object detection methods are commonly categorized as dense or sparse. Sparse object detectors suffer from slow convergence because of the limitations of transformer encoders or feature pyramid networks in processing image feature maps. In this paper, a sparse object detector that relies on neither encoders nor feature pyramid networks is proposed to mitigate this issue. First, a 4D sampling module is designed to focus on the regions where potential objects are located, rather than on all regions; this design significantly improves convergence speed. Second, a relative position encoding is presented to strengthen the position awareness of the sampling module. Lastly, a dynamic instance interactive head and a multi-layer perceptron mixer are used to decode the sparse queries. Experimental results demonstrate that the proposed detector outperforms the majority of existing sparse object detectors, reaching 46.9 AP with a ResNet-50 backbone. Moreover, the proposed method achieves comparable performance with only 80% of the training epochs.
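The idea of sampling only where objects may be, rather than processing every location of the feature map, can be illustrated with a minimal sketch. This is not the paper's 4D sampling module, whose details the abstract does not give; it is a generic box-guided bilinear sampler on a single-channel map. The names `bilinear` and `sample_in_boxes`, the `grid` parameter, and the regular-grid point layout are all assumptions made for illustration.

```python
from typing import List, Tuple

FeatureMap = List[List[float]]            # H x W, single channel (assumption)
Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2) in feature-map pixels


def bilinear(feat: FeatureMap, x: float, y: float) -> float:
    """Bilinearly interpolate feat at the continuous location (x, y)."""
    h, w = len(feat), len(feat[0])
    # Clamp so that the four neighbours stay inside the map.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    dx, dy = x - x0, y - y0
    top = feat[y0][x0] * (1 - dx) + feat[y0][x1] * dx
    bottom = feat[y1][x0] * (1 - dx) + feat[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def sample_in_boxes(feat: FeatureMap, boxes: List[Box], grid: int = 2) -> List[List[float]]:
    """Sample grid*grid points inside each query box (hypothetical layout)."""
    sampled = []
    for x1, y1, x2, y2 in boxes:
        points = []
        for i in range(grid):
            for j in range(grid):
                # Centres of the cells of a regular grid laid over the box.
                px = x1 + (j + 0.5) * (x2 - x1) / grid
                py = y1 + (i + 0.5) * (y2 - y1) / grid
                points.append(bilinear(feat, px, py))
        sampled.append(points)
    return sampled


# An 8x8 map whose value at (x, y) is x + 10*y, and two query boxes.
feature_map = [[float(x + 10 * y) for x in range(8)] for y in range(8)]
query_boxes = [(0.0, 0.0, 4.0, 4.0), (1.0, 1.0, 2.0, 2.0)]
features = sample_in_boxes(feature_map, query_boxes, grid=2)
```

With two boxes and `grid=2`, the sketch reads 8 interpolated values instead of all 64 map locations. A sparse detector would additionally learn where to place the points and would update the boxes between decoding stages; both steps are omitted here.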