Abstract: Lightweight, high-performance networks are important in vision perception systems. Recent research on convolutional neural networks has shown that attention mechanisms can significantly improve network performance. However, existing approaches either use only one of the two types of attention (channel and spatial) or combine both at the cost of increased model complexity. In this study, we propose the adaptive attention module (AAM), a lightweight yet effective module that comprises channel and spatial submodules to balance model performance and complexity. The AAM first applies the channel submodule to generate intermediate channel-refined features. In this submodule, an adaptive mechanism lets the model learn the relative weights of the features extracted by global max pooling and global average pooling, so that the weighting can adapt to different stages of the model, thus enhancing performance. The spatial submodule employs a group-interact-aggregate strategy to enhance the expression of important features. It groups the intermediate channel-refined features along the channel dimension into multiple subfeatures for parallel processing and, for each subfeature, generates a spatial attention feature descriptor and a channel-wise refined subfeature. It then aggregates all the refined subfeatures and applies a "channel shuffle" operator to transfer information between different subfeatures, thereby generating the final refined features and adaptively emphasizing important regions. Additionally, AAM is a plug-and-play architectural unit that can be directly used to replace standard convolutions in various convolutional neural networks. Extensive tests on CIFAR-100, ImageNet-1k, BDD100K, and MS COCO demonstrate that AAM improves baseline network performance across various models and tasks, thereby validating its versatility.
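Two operations named in the abstract can be illustrated with a minimal sketch: the adaptive weighting of global average and global max pooling, and the "channel shuffle" step. This is not the authors' implementation. The abstract does not say how the pooling weights are parameterized, so the sketch assumes a single hypothetical scalar `alpha` passed through a sigmoid, and it assumes the shuffle follows the common interleaving form (reshape to groups, transpose, flatten). The function names and the nested-list feature layout are illustrative only.

```python
import math


def adaptive_channel_descriptor(feature, alpha):
    """Mix global average and global max pooling per channel.

    feature: list of C channels, each a list of H rows of W floats.
    alpha:   hypothetical learnable scalar (an assumption, not from the
             abstract); sigmoid(alpha) is the weight of the average branch.
    """
    w = 1.0 / (1.0 + math.exp(-alpha))  # weight in (0, 1)
    descriptor = []
    for channel in feature:
        values = [v for row in channel for v in row]
        avg = sum(values) / len(values)   # global average pooling
        mx = max(values)                  # global max pooling
        descriptor.append(w * avg + (1.0 - w) * mx)
    return descriptor


def channel_shuffle(channels, groups):
    """Interleave channels across groups (assumed shuffle form).

    Equivalent to viewing the channel list as (groups, per_group),
    transposing to (per_group, groups), and flattening.
    """
    n = len(channels)
    if n % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


# One 2x2 channel; alpha = 0 weights both pooling branches equally.
desc = adaptive_channel_descriptor([[[1.0, 3.0], [5.0, 7.0]]], alpha=0.0)

# Six channels in two groups: [0, 1, 2] and [3, 4, 5].
shuffled = channel_shuffle([0, 1, 2, 3, 4, 5], groups=2)
```

With `alpha = 0.0` the two pooling branches contribute equally, and a learned `alpha` would shift the balance toward one branch or the other. After the shuffle, neighbouring channels come from different groups, which is how information can pass between subfeatures that were processed in parallel.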

Fine-grained Image Recognition Via Attention Interaction and Counterfactual Attention Network

Subtler mixed attention network on fine-grained image classification

Counterfactual Attention Learning for Fine-Grained Visual Categorization and Re-identification

Aggregate Attention Module for Fine-Grained Image Classification

Fine-grained image classification method based on hybrid attention module

Attention-in-Attention Networks for Surveillance Video Understanding in Internet of Things

Fine-grained Image Recognition Based on Attention Map and Image Sampling

Adaptive Attention Module for Image Recognition Systems in Autonomous Driving

Attend and Guide (AG-Net): A Keypoints-driven Attention-based Deep Network for Image Recognition

Mixed Attention Mechanism for Small-Sample Fine-grained Image Classification

Attention in Attention: Modeling Context Correlation for Efficient Video Classification

Towards accurate RGB-D saliency detection with complementary attention and adaptive integration

Fully Convolutional Attention Networks for Fine-Grained Recognition

Adversarial Complementary Attention-Enhancement Network For Fine-grained Image Recognition

Focus Longer to See Better: Recursively Refined Attention for Fine-Grained Image Classification

Infrared and Visible Image Fusion via Interactive Compensatory Attention Adversarial Learning

Attention Graph: Learning Effective Visual Features for Large-Scale Image Classification

Bilinear Residual Attention Networks for Fine-Grained Image Classification

Semantic-Guided Information Alignment Network for Fine-Grained Image Recognition

MCA: Multidimensional Collaborative Attention in Deep Convolutional Neural Networks for Image Recognition