Abstract: Vehicle recognition technology is widely applied in automatic parking, traffic restrictions, and public security investigations, and it plays a significant role in the construction of intelligent transportation systems. Fine-grained vehicle recognition goes beyond conventional vehicle recognition by distinguishing more detailed sub-categories. The task is more challenging because inter-class differences are subtle while intra-class variations are large. Localization-classification subnetworks are an effective and frequently used approach to this task, but previous research has typically relied on deep CNN feature maps for object localization, whose low resolution leads to poor localization accuracy. We propose a multi-layer feature fusion localization (MFFL) method that fuses the high-resolution feature map from a shallow CNN layer with the deep feature map, making full use of the rich spatial information in the shallow feature map to localize objects more precisely. In addition, traditional methods acquire local attention information through complex model designs, which frequently results in regional redundancy or information omission. To address this, we introduce an attention module that adaptively enhances the expressiveness of global features and generates global attention features. These global attention features are then integrated with object-level features and local attention cues to achieve more comprehensive attention enhancement. Lastly, we devise a multi-branch model and train it end to end with the above object localization and attention enhancement methods, so that the branches collaborate to extract fine-grained features adequately. Extensive experiments on the Stanford Cars dataset and the self-built Cars-126 dataset demonstrate the effectiveness of our method, which achieves a leading position among existing methods with 97.7% classification accuracy on the Stanford Cars dataset.
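
The idea of fusing a high-resolution shallow feature map with a low-resolution deep feature map for localization can be illustrated with a small NumPy sketch. This is not the MFFL implementation: the abstract does not specify the fusion operator or the thresholding rule, so the channel-sum activation map, the element-wise product used for fusion, the mean threshold, and all function names (`activation_map`, `upsample_nearest`, `fused_bounding_box`) are assumptions made purely for illustration.

```python
import numpy as np


def activation_map(features):
    """Sum a (C, H, W) feature map over channels and rescale to [0, 1]."""
    amap = features.sum(axis=0).astype(float)
    amap = amap - amap.min()
    peak = amap.max()
    return amap / peak if peak > 0 else amap


def upsample_nearest(amap, factor):
    """Nearest-neighbour upsampling of a 2-D map by an integer factor."""
    return np.repeat(np.repeat(amap, factor, axis=0), factor, axis=1)


def fused_bounding_box(shallow, deep):
    """Locate an object from a shallow and a deep feature map.

    Assumes both maps are (C, H, W) arrays and that the shallow spatial
    size is an integer multiple of the deep one in both dimensions.
    Returns (top, left, bottom, right), inclusive, in shallow-map cells.
    """
    factor = shallow.shape[1] // deep.shape[1]
    deep_up = upsample_nearest(activation_map(deep), factor)
    # Assumed fusion rule: the deep map says roughly where the object is,
    # the shallow map sharpens the boundary at higher resolution.
    fused = activation_map(shallow) * deep_up
    mask = fused > fused.mean()
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        # Nothing exceeds the threshold: fall back to the whole map.
        return (0, 0, shallow.shape[1] - 1, shallow.shape[2] - 1)
    return (int(rows[0]), int(cols[0]), int(rows[-1]), int(cols[-1]))
```

For simplicity the sketch takes the bounding box of every above-threshold cell instead of isolating a single connected region, which a real localization branch would likely need. The box is expressed in shallow-map cells, so it can be scaled back to image coordinates with a finer stride than a box computed from the deep map alone.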


Multi-layer feature fusion and attention enhancement for fine-grained vehicle recognition research
