EFLNet: Enhancing Feature Learning for Infrared Small Target Detection

Bo Yang, Xinyu Zhang, Jian Zhang, Jun Luo, Mingliang Zhou, Yangjun Pi
DOI: https://doi.org/10.1109/TGRS.2024.3365677
2024-02-27
Abstract: Single-frame infrared small target detection is considered a challenging task because of the extreme imbalance between target and background, the extreme sensitivity of bounding box regression to infrared small targets, and the ease with which target information is lost in the high-level semantic layers. In this article, we propose an enhancing feature learning network (EFLNet) to address these problems. First, we notice that there is an extreme imbalance between the target and the background in the infrared image, which makes the model pay more attention to background features than to target features. To address this problem, we propose a new adaptive threshold focal loss (ATFL) function that decouples the target and the background and uses an adaptive mechanism to adjust the loss weight, forcing the model to allocate more attention to target features. Second, we introduce the normalized Gaussian Wasserstein distance (NWD) to alleviate the difficulty of convergence caused by the extreme sensitivity of bounding box regression to infrared small targets. Finally, we incorporate a dynamic head mechanism into the network to enable adaptive learning of the relative importance of each semantic layer. Experimental results demonstrate that our method achieves better detection performance on infrared small targets than state-of-the-art (SOTA) deep-learning-based methods. The source code and bounding-box-annotated datasets are available on GitHub: https://github.com/YangBo0411/infrared-small-target
Computer Vision and Pattern Recognition
### What problem does this paper attempt to solve?

This paper addresses three key issues in single-frame infrared small target detection:

1. **Extreme imbalance between target and background**: The ratio of target to background in infrared images is highly uneven, so models tend to focus on background features rather than target features.
2. **Sensitivity of bounding box regression**: For infrared small targets, a slight positional change in a bounding box leads to a large change in Intersection over Union (IoU), which makes it hard for the model to converge.
3. **Loss of target information in high-level semantic layers**: Target information is easily lost during downsampling. Shallow features contain more target information but are not fully utilized.

To address these issues, the authors propose an enhancing feature learning network (EFLNet) with the following improvements:

- **Adaptive Threshold Focal Loss (ATFL)**: Decouples target and background and uses an adaptive mechanism to adjust the loss weights, so that the model allocates more attention to target features.
- **Normalized Gaussian Wasserstein Distance (NWD)**: Alleviates the convergence difficulty caused by the high sensitivity of bounding box regression to infrared small targets.
- **Dynamic Head Mechanism**: Uses a self-attention mechanism to learn the relative importance of each semantic layer, which improves detection of infrared small targets.

Experimental results show that the method outperforms existing state-of-the-art deep-learning-based methods on infrared small target detection. The authors also release versions of existing public infrared small target datasets with bounding box annotations, so that the datasets support detection tasks rather than segmentation tasks only.
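The IoU sensitivity that motivates NWD can be illustrated with a small sketch. It is not taken from the EFLNet code: it follows the commonly cited NWD formulation, in which each box `(cx, cy, w, h)` is modeled as a 2D Gaussian and the second-order Wasserstein distance between the two Gaussians is mapped to `(0, 1]` by an exponential. The function names, the example boxes, and the normalizing constant `c = 12.8` are illustrative assumptions. In practice `c` is a dataset-dependent value.

```python
import math


def iou(a, b):
    """IoU of two axis-aligned boxes given as (cx, cy, w, h)."""
    ax1, ay1 = a[0] - a[2] / 2, a[1] - a[3] / 2
    ax2, ay2 = a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1 = b[0] - b[2] / 2, b[1] - b[3] / 2
    bx2, by2 = b[0] + b[2] / 2, b[1] + b[3] / 2
    # Overlap extent along each axis, clamped at zero for disjoint boxes.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union


def nwd(a, b, c=12.8):
    """Normalized Gaussian Wasserstein distance between two boxes.

    Each box is treated as a Gaussian with mean (cx, cy) and
    covariance diag(w^2 / 4, h^2 / 4). `c` is an assumed,
    dataset-dependent normalizing constant.
    """
    w2 = math.sqrt(
        (a[0] - b[0]) ** 2
        + (a[1] - b[1]) ** 2
        + ((a[2] - b[2]) / 2) ** 2
        + ((a[3] - b[3]) / 2) ** 2
    )
    return math.exp(-w2 / c)


# The same 2-pixel horizontal shift applied to a small and a large box.
small, small_shifted = (10.0, 10.0, 4.0, 4.0), (12.0, 10.0, 4.0, 4.0)
large, large_shifted = (100.0, 100.0, 40.0, 40.0), (102.0, 100.0, 40.0, 40.0)

print("IoU small:", iou(small, small_shifted))
print("IoU large:", iou(large, large_shifted))
print("NWD small:", nwd(small, small_shifted))
print("NWD large:", nwd(large, large_shifted))
```

With these example boxes, the 2-pixel shift cuts the IoU of the 4x4 box to one third while the 40x40 box stays above 0.9. The NWD value depends only on the center offset and the size difference, so it is the same for both boxes and changes smoothly with the shift.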