Infrared Weak Target Detection Method Based on Cross Connection and Fusion Attention Mechanism
Hui Li, Zhengzhou Li, Yuxin Yang, Congyu Hao, Haitao Liu
DOI: https://doi.org/10.3788/gzxb20245309.0910002
IF: 0.6
2024-01-01
ACTA PHOTONICA SINICA
Abstract: Compared with radar and visible-light imaging, infrared imaging offers good concealment, resistance to interference and all-weather operation. It is widely used in civil applications such as medical imaging, traffic management and automatic driving, and in military applications such as early-warning, air-defence and naval-defence systems. In complex backgrounds, small targets are usually submerged in clutter and appear weak; they lack key information such as shape, colour and texture, which leads to a large number of false alarms. Traditional methods mainly extract shallow features of the target and background. Because they do not effectively mine and exploit deep features, both their ability to detect small targets and their adaptability to complex scenes need to be improved. To address the weak signals, indistinct features and frequent false alarms that degrade the detection of infrared weak targets in complex backgrounds, an infrared weak target detection method based on cross connection and a fusion attention mechanism is proposed. The method combines the attention mechanism with residual networks to extract multiple features of small targets and reduce interference from complex backgrounds; a bidirectional cross-connection structure fuses low-level and high-level feature information, strengthening the feature representation of weak targets; a high-resolution detection layer is added and the prior boxes of the weak targets are regrouped, enhancing the network's ability to learn the differences between target and background features; finally, the ground-truth and predicted target boxes are each modelled as a Gaussian distribution and the similarity between the two distributions is calculated, which avoids the sensitivity of the regression loss to small positional deviations under the IoU metric and improves the accuracy of loss regression.
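The motivation for replacing IoU with a Gaussian similarity can be illustrated with a small sketch. The abstract does not give the exact formula, so the code below assumes one common choice, the normalised Wasserstein distance between two axis-aligned Gaussians; the function names, the `(cx, cy, w, h)` box format and the constant `c` are hypothetical and are not taken from the paper.

```python
import math


def iou(box_a, box_b):
    """Standard IoU for boxes given as (cx, cy, w, h)."""
    ax1, ax2 = box_a[0] - box_a[2] / 2, box_a[0] + box_a[2] / 2
    ay1, ay2 = box_a[1] - box_a[3] / 2, box_a[1] + box_a[3] / 2
    bx1, bx2 = box_b[0] - box_b[2] / 2, box_b[0] + box_b[2] / 2
    by1, by2 = box_b[1] - box_b[3] / 2, box_b[1] + box_b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


def gaussian_similarity(box_a, box_b, c=12.8):
    """Assumed form: exp(-W2 / c), where W2 is the 2-Wasserstein distance
    between Gaussians with mean (cx, cy) and covariance diag(w^2/4, h^2/4).

    For such Gaussians, W2^2 reduces to the squared Euclidean distance
    between the vectors (cx, cy, w/2, h/2). `c` is a hypothetical
    normalising constant that would be tuned per dataset.
    """
    vec_a = (box_a[0], box_a[1], box_a[2] / 2, box_a[3] / 2)
    vec_b = (box_b[0], box_b[1], box_b[2] / 2, box_b[3] / 2)
    w2 = math.sqrt(sum((p - q) ** 2 for p, q in zip(vec_a, vec_b)))
    return math.exp(-w2 / c)


# A 4x4-pixel target and predictions shifted right by 2 and by 4 pixels.
truth = (10.0, 10.0, 4.0, 4.0)
for shift in (0.0, 2.0, 4.0):
    pred = (10.0 + shift, 10.0, 4.0, 4.0)
    print(shift, round(iou(truth, pred), 3), round(gaussian_similarity(truth, pred), 3))
```

For a target only a few pixels wide, IoU falls to zero as soon as the boxes stop overlapping, so it provides no gradient signal for nearby misses, whereas the Gaussian similarity decays smoothly with the offset.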
The algorithm structure consists of four parts: backbone, neck, head and prediction. The input is an infrared image of size 256×256. The CBS module used by the backbone consists of a convolution, batch normalisation and an activation function. The C3 module consists of three convolutional layers and x ResUnit modules concatenated together. In the last layer of the backbone network, the Convolutional Block Attention Module (CBAM) is introduced and fused with the residual network to form the C3CSA feature-extraction module, which reduces background interference in complex scenes. SPPF denotes fast spatial pyramid pooling. The neck employs a Bidirectional Feature Pyramid Network (BiFPN) whose cross connections fuse low-level detail features into high-level features and pass high-level semantic information top-down to the low-level features. Additional cross connections reduce the loss of weak target information caused by deep feature extraction, enable interaction between global and local information, and strengthen the representation and localisation of weak targets at different scales. Upsample denotes the up-sampling process. The head adds a weak target detection head that operates on a 64×64 high-resolution feature map, which avoids the background interference introduced by large-scale detection heads, and finally predicts the location and confidence of the target. Comparative tests were conducted on publicly available infrared small target datasets. The experimental results show that the algorithm achieves the best performance in detecting infrared small and weak targets across a variety of complex backgrounds, with significant improvements in average accuracy, recall and speed. The average detection accuracy of the proposed algorithm reaches 98.4%, the model size is only 11.9 MB, and the detection speed is as high as 107 frames/s.
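The relationship between the 256×256 input and the 64×64 detection head can be checked with simple stride arithmetic. The sketch below is illustrative only: the stride set (4, 8, 16, 32) and the 3-pixel target width are assumptions, chosen because a 64×64 map on a 256×256 input implies a stride of 4; the abstract does not list the other heads.

```python
def head_grid_sizes(input_size, strides):
    """Return {stride: feature-map side length} for a square input image."""
    return {stride: input_size // stride for stride in strides}


def footprint_in_cells(target_px, stride):
    """Width of a target, measured in feature-map cells, at a given stride."""
    return target_px / stride


INPUT_SIZE = 256           # input image side length, from the abstract
STRIDES = (4, 8, 16, 32)   # assumed strides; only stride 4 (64x64) is implied by the text
TARGET_PX = 3              # hypothetical width of a weak target in pixels

sizes = head_grid_sizes(INPUT_SIZE, STRIDES)
for stride, side in sizes.items():
    cells = footprint_in_cells(TARGET_PX, stride)
    print(f"stride {stride:2d}: {side}x{side} map, target spans {cells:.3f} cells")
```

Under these assumptions, a target a few pixels wide still covers most of a cell on the 64×64 map but only a small fraction of a cell on the coarser maps, which is consistent with the abstract's case for a high-resolution head.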
A comparison of the detection performance of various algorithms in terms of PR curves, ROC curves and results in complex scenes shows that the proposed algorithm detects weak targets in infrared images of complex scenes more accurately and with a lower false alarm rate, and that it can be deployed on an embedded terminal for real-time processing.