Abstract: To capture feature information efficiently in fine-grained image classification, this study proposes a new network model based on hybrid attention. The model is built on a hybrid attention module (MA) and, with the help of an attention erasure module (EA), adaptively enhances the salient regions of an image and captures more detailed image information. Specifically, the study designs an attention module that applies the attention mechanism along both the channel and spatial dimensions, highlighting the important regions and key feature channels in the image so that distinctive local features can be extracted. The attention erasure module (EA) then removes the salient regions of the image according to the features already identified, shifting the network's focus to other feature details and improving the diversity and completeness of the features. In addition, the study modifies the pooling layer of ResNet50 to enlarge the receptive field and strengthen feature extraction in the network's shallow layers. The resulting features are fused to form the final feature representation. To assess the proposed model, experiments were conducted on three public fine-grained image classification datasets: Stanford Cars, FGVC-Aircraft, and CUB-200-2011. The method achieved classification accuracies of 92.8%, 94.0%, and 88.2% on these datasets, respectively. Compared with existing approaches, the proposed method shows a clear improvement, with higher accuracy and robustness.
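As a rough illustration of the two ideas named in the abstract (attention over both channel and spatial dimensions, followed by erasure of the most salient region), the following minimal sketch may help. It is not the authors' implementation: the function names, the parameter-free sigmoid gating (used here in place of any learned layers), and the `ratio` threshold are all assumptions made for illustration only.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat):
    """Scale each channel by a sigmoid of its global average.

    Assumption: a parameter-free stand-in for a learned channel gate.
    `feat` is a nested list of shape [C][H][W].
    """
    weights = []
    for channel in feat:
        values = [v for row in channel for v in row]
        weights.append(sigmoid(sum(values) / len(values)))
    return [[[v * w for v in row] for row in channel]
            for channel, w in zip(feat, weights)]


def spatial_attention_map(feat):
    """Return an [H][W] map: sigmoid of the mean over channels at each position."""
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    return [[sigmoid(sum(feat[k][i][j] for k in range(c)) / c)
             for j in range(w)] for i in range(h)]


def hybrid_attention(feat):
    """Apply channel attention, then spatial attention (hypothetical ordering)."""
    refined = channel_attention(feat)
    attn = spatial_attention_map(refined)
    out = [[[v * attn[i][j] for j, v in enumerate(row)]
            for i, row in enumerate(channel)] for channel in refined]
    return out, attn


def erase_salient(feat, attn, ratio=0.8):
    """Zero every position whose attention exceeds ratio * peak attention.

    Assumption: a simple threshold rule; the abstract does not specify
    how the erased region is chosen.
    """
    threshold = ratio * max(max(row) for row in attn)
    return [[[0.0 if attn[i][j] > threshold else v for j, v in enumerate(row)]
             for i, row in enumerate(channel)] for channel in feat]


# Toy feature map with 2 channels of size 2x2; the bottom-right position dominates.
feat = [[[1.0, 1.0], [1.0, 9.0]],
        [[1.0, 1.0], [1.0, 5.0]]]
out, attn = hybrid_attention(feat)
erased = erase_salient(feat, attn, ratio=0.8)
```

In this toy example the erased feature map keeps the low-response positions and zeroes the dominant one, so a second pass over `erased` would have to rely on the remaining detail, which is the intuition behind attention erasure described above.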
