PCSA: Enhancing CNN Performance With Pyramid Channel and Spatial Attention

Yehang Zhang
DOI: https://doi.org/10.1109/access.2024.3368801
IF: 3.9
2024-03-02
IEEE Access
Abstract: Recent studies have shown that attention mechanisms can effectively improve the performance of deep convolutional neural networks. In this paper, we propose Pyramid Channel and Spatial Attention (PCSA), which reconstructs the features produced by pyramidal multiscale convolution by extracting spatial weights and channel weights. This dual weight extraction helps merge the multiscale information more accurately and strengthens the model's focus on the complex locations of image objects. As a plug-and-play module, PCSA can be easily added to various backbone networks. We apply the PCSA module to two backbone networks, VGG and ResNet, and name the improved models VGG-PCSA and PCSANet, respectively. Experimental results on the CIFAR-10, CIFAR-100, and NaSC-TG2 datasets show that our models significantly outperform their backbone networks while keeping the number of parameters low, and perform better than most state-of-the-art channel attention methods. In addition, we visualize feature maps and class activation maps to explain the better performance of PCSA.
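The abstract describes the module only at a high level, so the following is a minimal NumPy sketch of one possible reading of it, not the authors' implementation. Several assumptions are made for illustration: the function names `conv2d_same` and `pcsa_sketch`, the kernel sizes (3, 5, 7), the use of random filters in place of learned ones, sigmoid gating on pooled features, and averaging the reweighted branches are all hypothetical choices that the abstract does not specify.

```python
import numpy as np


def conv2d_same(x, w):
    """Naive zero-padded 'same' convolution (cross-correlation, as in CNNs).

    x: (C_in, H, W) feature map; w: (C_out, C_in, k, k) filters with odd k.
    """
    c_out, c_in, k, _ = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    _, h, wd = x.shape
    out = np.zeros((c_out, h, wd))
    # Accumulate the contribution of each kernel offset (i, j).
    for i in range(k):
        for j in range(k):
            out += np.einsum("oc,chw->ohw", w[:, :, i, j], xp[:, i:i + h, j:j + wd])
    return out


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def pcsa_sketch(x, kernel_sizes=(3, 5, 7), seed=0):
    """Hypothetical pyramid channel + spatial reweighting of a (C, H, W) map."""
    rng = np.random.default_rng(seed)
    c, h, w = x.shape
    out = np.zeros((c, h, w))
    for k in kernel_sizes:
        # Assumption: random filters stand in for learned convolution weights.
        filters = rng.normal(scale=1.0 / np.sqrt(c * k * k), size=(c, c, k, k))
        feat = conv2d_same(x, filters)                # one pyramid scale, (C, H, W)
        channel_w = sigmoid(feat.mean(axis=(1, 2)))   # one weight per channel, (C,)
        spatial_w = sigmoid(feat.mean(axis=0))        # one weight per location, (H, W)
        # Reweight the branch by both sets of weights, then accumulate.
        out += feat * channel_w[:, None, None] * spatial_w[None, :, :]
    return out / len(kernel_sizes)


x = np.random.default_rng(1).normal(size=(4, 8, 8))
y = pcsa_sketch(x)
```

Because the output has the same shape as the input, a block of this kind could in principle be dropped between existing layers of a backbone, which is the sense in which the abstract calls PCSA plug-and-play.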
Categories: Computer Science, Information Systems; Telecommunications; Engineering, Electrical & Electronic