SCTransNet: Spatial-channel Cross Transformer Network for Infrared Small Target Detection

Shuai Yuan, Hanlin Qin, Xiang Yan, Naveed Akhtar, Ajmal Mian

DOI: https://doi.org/10.1109/TGRS.2024.3383649

2024-04-30

Abstract: Infrared small target detection (IRSTD) has recently benefitted greatly from U-shaped neural models. However, largely overlooking effective global information modeling, existing techniques struggle when the target has high similarities with the background. We present a Spatial-channel Cross Transformer Network (SCTransNet) that leverages spatial-channel cross transformer blocks (SCTBs) on top of long-range skip connections to address the aforementioned challenge. In the proposed SCTBs, the outputs of all encoders interact through a cross transformer to generate mixed features, which are redistributed to all decoders to effectively reinforce semantic differences between the target and clutter at full scales. Specifically, SCTB contains the following two key elements: (a) spatial-embedded single-head channel-cross attention (SSCA) for exchanging local spatial features and full-level global channel information to eliminate ambiguity among the encoders and facilitate high-level semantic associations of the images, and (b) a complementary feed-forward network (CFN) for enhancing the feature discriminability via a multi-scale strategy and cross-spatial-channel information interaction to promote beneficial information transfer. Our SCTransNet effectively encodes the semantic differences between targets and backgrounds to boost its internal representation for detecting small infrared targets accurately. Extensive experiments on three public datasets, NUDT-SIRST, NUAA-SIRST, and IRSTD-1k, demonstrate that the proposed SCTransNet outperforms existing IRSTD methods. Our code will be made public at

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

This paper proposes a new solution to the problem of Infrared Small Target Detection (IRSTD). Specifically, it covers the following key points:

1. **Limitations of Existing Technologies**: Although methods based on U-shaped neural networks have made significant progress in infrared small target detection, they often neglect effective modeling of global information, so performance degrades when the background and target are highly similar.
2. **Proposed New Method**: The authors propose an architecture called "Spatial-channel Cross Transformer Network" (SCTransNet), which places Spatial-channel Cross Transformer Blocks (SCTBs) on the long-range skip connections. The SCTBs let features from all encoder levels interact and redistribute the mixed features to all decoders, thereby improving detection performance.
3. **Core Components**: Each SCTB comprises two key parts, namely Spatial-embedded Single-head Channel-cross Attention (SSCA) and a Complementary Feed-forward Network (CFN). SSCA exchanges local spatial features and full-level global channel information, while CFN enhances feature discriminability through a multi-scale strategy and cross-spatial-channel information interaction.
4. **Experimental Validation**: Extensive experiments on three public datasets (NUDT-SIRST, NUAA-SIRST, and IRSTD-1k) demonstrate that the proposed SCTransNet outperforms existing infrared small target detection methods.

In summary, the paper targets the difficulty of separating small targets from similar-looking backgrounds in infrared imagery, and addresses it with a spatial-channel cross transformer structure that reinforces the semantic differences between target and clutter at all scales.
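To make the idea of channel-cross attention across encoder levels more concrete, here is a minimal NumPy sketch. It is not the authors' implementation and is not SSCA itself: it omits the spatial embedding and all learned projections, and it assumes that every encoder level has already been flattened and resampled to the same number of spatial tokens. The function name `channel_cross_attention`, the scaling factor, and the residual addition are illustrative assumptions. The sketch only shows how the channels of one level can attend to the channels of all levels with a single attention head.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def channel_cross_attention(levels):
    """Single-head attention over channels (hypothetical helper).

    levels: list of arrays, one per encoder level, each of shape (C_i, N),
            where N is a shared number of spatial tokens (an assumption).
    Returns a list of arrays with the same shapes as the inputs.
    """
    # Keys and values: channels of all levels stacked together, (C_all, N).
    kv = np.concatenate(levels, axis=0)
    n_tokens = kv.shape[1]

    outputs = []
    for q in levels:
        # Channel-to-channel similarity: (C_i, C_all).
        scores = q @ kv.T / np.sqrt(n_tokens)
        weights = softmax(scores, axis=-1)
        # Mix information from every level into this level's channels.
        mixed = weights @ kv  # (C_i, N)
        # Residual connection (assumed here, common in transformer blocks).
        outputs.append(q + mixed)
    return outputs


# Toy usage: three encoder levels with 8, 16, and 32 channels, 64 tokens each.
rng = np.random.default_rng(0)
levels = [rng.standard_normal((c, 64)) for c in (8, 16, 32)]
mixed = channel_cross_attention(levels)
```

Because attention is computed over channels rather than spatial positions, the attention map for each level has size C_i by C_all, independent of image resolution. In a full model, the mixed features would then be passed to the corresponding decoder stages.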

Local Information Guided Global Integration for Infrared Small Target Detection

IST-TransNet: Infrared Small Target Detection Based on Transformer Network

Cross-Layer Feature Guided Multiscale Infrared Small Target Detection

ST-Trans: Spatial-Temporal Transformer for Infrared Small Target Detection in Sequential Images

MDCENet: Multi-Dimensional Cross-Enhanced Network for Infrared Small Target Detection

A Feature Enhancement and Augmentation-Based Infrared Small Target Detection Network

IRSTFormer: A Hierarchical Vision Transformer for Infrared Small Target Detection

Multilevel Interactive Enhanced Network for Infrared Small-Target Detection

IR-TransDet: Infrared Dim and Small Target Detection with IR-Transformer

Region-guided Network with Visual Cues Correction for Infrared Small Target Detection

Nanetformer: Nested Attention Network with Auxiliary Transformer Enhancement for Infrared Small Target Detection

FTC-Net: Fusion of Transformer and CNN Features for Infrared Small Target Detection

CDMNet: Contrastive Distribution Mapped Network for Infrared Small Target Detection

Thermodynamics-Inspired Multi-Feature Network for Infrared Small Target Detection

CSC-Net: A Lightweight Model for Remote Real-Time Monitoring of Infrared Small Targets

Exploring Feature Compensation and Cross-level Correlation for Infrared Small Target Detection

SFFNet: Shallow Feature Fusion Network Based on Detection Framework for Infrared Small Target Detection

Context-aware Cross-Level Attention Fusion Network for Infrared Small Target Detection

Dense Nested Attention Network for Infrared Small Target Detection

SSTNet: Sliced Spatio-Temporal Network With Cross-Slice ConvLSTM for Moving Infrared Dim-Small Target Detection