Abstract: Infrared target detection has important applications in rescue and Earth observation. However, the low signal-to-clutter ratios and severe background noise of infrared imaging pose great challenges for detecting dim infrared targets. Most algorithms extract features only from the spatial domain, and the lack of temporal information leads to unsatisfactory detection performance when the difference between the target and the background is not significant. Although some methods use temporal information in the detection process, these non-learning-based methods fail to account for complex and changeable backgrounds and require parameters to be adjusted according to the input. To tackle this problem, we propose the Spatio-Temporal Differential Multiscale Attention Network (STDMANet), a learning-based method for multiframe infrared small target detection. STDMANet first uses a temporal multiscale feature extractor to obtain spatiotemporal (ST) features at multiple time scales and then passes them to a spatial multiscale feature refiner, which enhances the semantics of the ST features while preserving the position information of small targets. Finally, unlike other learning-based networks that require binary masks for training, we design a mask-weighted heatmap loss that trains the network with only center point annotations. The proposed loss also balances missed detections against false alarms, trading off finding the targets against suppressing the background. Extensive quantitative experiments on public datasets show that the proposed STDMANet improves the ${F_{1}}$ score up to 0.9744, surpassing the state-of-the-art baseline by 0.1682. Qualitative experiments show that the proposed method can stably extract moving foreground targets from video sequences with various backgrounds while reducing the false alarm rate more effectively than other recent baseline methods.
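
For readers unfamiliar with the reported metric, the following minimal sketch shows how an $F_{1}$ score combines missed detections and false alarms, using the standard definition (harmonic mean of precision and recall). The function name `f1_score` and the example counts are assumptions made for illustration and do not come from the paper; the abstract does not state how detections are matched to ground-truth targets. The last line only subtracts the two numbers reported in the abstract to obtain the baseline score they imply.

```python
def f1_score(true_pos: int, false_alarms: int, missed: int) -> float:
    """Standard F1 from detection counts (hypothetical helper, not the paper's code)."""
    if true_pos == 0:
        return 0.0  # no correct detections: precision and recall are both zero
    precision = true_pos / (true_pos + false_alarms)  # lowered by false alarms
    recall = true_pos / (true_pos + missed)           # lowered by missed detections
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only (assumed, not taken from the paper).
example_f1 = f1_score(true_pos=90, false_alarms=10, missed=10)

# Baseline F1 implied by the abstract: reported score minus reported margin.
implied_baseline_f1 = round(0.9744 - 0.1682, 4)
```

Because $F_{1}$ drops when either precision or recall drops, a high value indicates that a detector is neither missing many targets nor raising many false alarms.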

STDMANet: Spatio-Temporal Differential Multiscale Attention Network for Small Moving Infrared Target Detection

Hierarchical attention-guided multiscale aggregation network for infrared small target detection

Multiscale Interactive Attention Network for Infrared Small Target Detection

Dense Nested Attention Network for Infrared Small Target Detection

FDDBA-NET: Frequency Domain Decoupling Bidirectional Interactive Attention Network for Infrared Small Target Detection

4DST-BTMD: an Infrared Small Target Detection Method Based on 4-D Data-Sphered Space

Infrared Small Target Detection Based on a Temporally-Aware Fully Convolutional Neural Network

Improved Dense Nested Attention Network Based on Transformer for Infrared Small Target Detection

Multiscale Progressive Fusion Filter Network for Infrared Small Target Detection

5-D Spatial-Temporal Information-Based Infrared Small Target Detection in Complex Environments

LEC-MTNN: a Novel Multi-Frame Infrared Small Target Detection Method Based on Spatial-Temporal Patch-Tensor

Location-Guided Dense Nested Attention Network for Infrared Small Target Detection

An Improved U-Net Infrared Small Target Detection Algorithm Based on Multi-Scale Feature Decomposition and Fusion and Attention Mechanism

Cross-Layer Feature Guided Multiscale Infrared Small Target Detection

IMNN-LWEC: A Novel Infrared Small Target Detection Based on Spatial–Temporal Tensor Model

MPANet: Multi-Patch Attention For Infrared Small Target object Detection

SSTNet: Sliced Spatio-Temporal Network With Cross-Slice ConvLSTM for Moving Infrared Dim-Small Target Detection

Guided Attention and Joint Loss for Infrared Dim Small Target Detection

Multi-Scale Direction-Aware Network for Infrared Small Target Detection

ST-Trans: Spatial-Temporal Transformer for Infrared Small Target Detection in Sequential Images

CCRANet: A Two-Stage Local Attention Network for Single-Frame Low-Resolution Infrared Small Target Detection