Visible and Clear: Finding Tiny Objects in Difference Map

Bing Cao, Haiyu Yao, Pengfei Zhu, Qinghua Hu
2024-09-30
Abstract: Tiny object detection is one of the key challenges in the field of object detection. The performance of most generic detectors decreases dramatically in tiny object detection tasks. The main challenge lies in extracting effective features of tiny objects. Existing methods usually perform generation-based feature enhancement, which is seriously affected by spurious textures and artifacts, making it difficult to make the tiny-object-specific features visible and clear for detection. To address this issue, we propose a self-reconstructed tiny object detection (SR-TOD) framework. For the first time, we introduce a self-reconstruction mechanism in the detection model and discover the strong correlation between it and tiny objects. Specifically, we impose a reconstruction head in-between the neck of a detector, constructing a difference map of the reconstructed image and the input, which shows high sensitivity to tiny objects. This inspires us to enhance the weak representations of tiny objects under the guidance of the difference maps, thus improving the visibility of tiny objects for the detectors. Building on this, we further develop a Difference Map Guided Feature Enhancement (DGFE) module to make the tiny feature representation clearer. In addition, we propose a new multi-instance anti-UAV dataset, called DroneSwarms, which contains a large number of tiny drones with the smallest average size to date. Extensive experiments on the DroneSwarms dataset and other datasets demonstrate the effectiveness of the proposed method. The code and dataset will be publicly available.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper addresses the key challenge in **tiny object detection**: most general-purpose detectors struggle to extract effective features of tiny objects from their feature maps. Existing methods usually rely on generation-based feature enhancement, which is vulnerable to spurious textures and artifacts. As a result, tiny-object-specific features are hard to make visible and clear, which degrades detection performance.

### Main contributions of the paper

1. **Propose the self-reconstructed tiny object detection (SR-TOD) framework**:
   - For the first time, introduce an image self-reconstruction mechanism into the detection model and identify the strong correlation between the difference map and tiny objects, which provides prior information on their location and structure.
   - Through self-reconstruction, turn the information that tiny objects usually lose into actionable prior guidance.
2. **Design the Difference Map Guided Feature Enhancement (DGFE) module**:
   - Enhance the features of tiny objects by computing an element-wise attention matrix along the channel dimension, making them clearer.
   - The DGFE module can be integrated easily and flexibly into general-purpose detectors, significantly improving tiny object detection performance.
3. **Propose a new multi-instance anti-UAV dataset (DroneSwarms)**:
   - The dataset has the smallest average target size to date (about 7.9 pixels) and covers typical tiny object detection scenes under a variety of complex backgrounds and lighting conditions.
   - Extensive experiments on this dataset and on two other datasets containing many tiny objects verify the effectiveness of the method.

### Method overview

1. **Overall architecture**:
   - The backbone network extracts features from the input image, and the FPN module builds a multi-scale feature pyramid from them.
   - The P2 feature map is sent to the reconstruction head, which generates a reconstructed image of the same size as the input image.
   - The difference map between the original image and the reconstructed image is computed and sent, together with the P2 feature map, to the DGFE module, which produces an enhanced feature map P2'.
   - The enhanced feature map P2' replaces the original P2 as the bottom layer of the feature pyramid and is sent to the detection head.
2. **Difference map**:
   - The difference map is the absolute difference between the reconstructed image and the original image, averaged along the channel dimension.
   - The parameters of the reconstruction head are optimized with the mean squared error (MSE) loss between the original image and the reconstructed image.
3. **Difference-map-guided feature enhancement**:
   - An element-wise attention matrix is computed from a binarized difference map and a re-weighting mechanism, and it is used to perform targeted feature enhancement on the P2 feature map.
   - Binarization filters out most of the noise signals, making the difference map sharper, while the re-weighting maintains feature diversity.

### Experimental results

- **DroneSwarms dataset**:
  - RFLA combined with SR-TOD reaches 39.0 AP, exceeding the other methods by 1.1 AP, with an overall improvement of 2.1 AP.
  - In particular, it improves by 2.3 points each on very tiny objects (AP_vt) and tiny objects (AP_t), which indicates that the method is especially effective on the smallest targets.
- **VisDrone2019 dataset**:
  - RFLA combined with SR-TOD reaches 27.8 AP, exceeding the other methods by 0.6 AP.
  - Cascade R-CNN combined with SR-TOD performs best on small objects, with an AP_s improvement of 2.2 points.
- **AI-TOD dataset**:
  - DetectoRS combined with SR-TOD reaches 24.0 AP, exceeding RFLA by 2.3 AP and surpassing all competitors.
  - All detectors show significant performance improvements after being combined with SR-TOD.

### Conclusion

By introducing the self-reconstruction mechanism and the difference-map-guided feature enhancement module, this paper addresses the information loss problem in tiny object detection and significantly improves detection performance. The proposed SR-TOD framework performs well on multiple datasets and has broad application prospects.
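
The difference-map and enhancement steps in the method overview can be illustrated with a small NumPy sketch. It is not the authors' implementation. The function names, the fixed `threshold`, the hand-picked `channel_weights`, and the residual form `p2 + attention * p2` are all assumptions made for illustration. In the paper these components are part of a trained network, and the difference map would first be resized to the resolution of P2.

```python
import numpy as np


def difference_map(image, reconstruction):
    """Absolute per-pixel difference, averaged over channels.

    Both inputs have shape (H, W, C); the result has shape (H, W).
    """
    return np.abs(reconstruction - image).mean(axis=-1)


def reconstruction_loss(image, reconstruction):
    """Mean squared error between the input and its reconstruction."""
    return float(np.mean((reconstruction - image) ** 2))


def binarize(diff, threshold):
    """Keep only locations whose difference exceeds the threshold.

    Assumption: a fixed scalar threshold stands in for whatever
    thresholding the trained model uses.
    """
    return (diff > threshold).astype(diff.dtype)


def enhance_features(p2, diff, threshold, channel_weights):
    """Hypothetical stand-in for difference-map-guided enhancement.

    p2:              feature map of shape (C, H, W)
    diff:            difference map of shape (H, W), assumed to be
                     already resized to the resolution of p2
    channel_weights: per-channel re-weighting factors of shape (C,),
                     chosen by hand here rather than learned
    """
    mask = binarize(diff, threshold)                        # (H, W)
    # Element-wise attention: channel weight times spatial mask.
    attention = channel_weights[:, None, None] * mask[None, :, :]
    # Residual form (an assumption): features outside the mask are unchanged.
    return p2 + attention * p2


# Toy example: the reconstruction is wrong at exactly one pixel, as it
# might be where a tiny object is poorly reconstructed.
image = np.zeros((4, 4, 3))
reconstruction = image.copy()
reconstruction[1, 2, :] = 0.6

diff = difference_map(image, reconstruction)
loss = reconstruction_loss(image, reconstruction)

p2 = np.ones((2, 4, 4))
enhanced = enhance_features(p2, diff, threshold=0.1,
                            channel_weights=np.array([0.5, 1.0]))
```

In this toy case only location `(1, 2)` survives binarization, so only the features at that location are amplified and all others pass through unchanged. This mirrors the idea of using the difference map as a spatial prior for tiny objects.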