Abstract: Driven by the needs of criminal investigation technology and the development of deep learning, person re-identification has gradually become a research hotspot. Recently, various neural network-based person re-identification methods have shown excellent results. However, most frameworks rely on complex structural designs or redundant networks to guide model construction, which greatly increases training and application costs. In addition, the correlation between channel information and spatial information in the pedestrian feature map is insufficiently exploited. Therefore, we design a lightweight attention module to address this lack of correlation. The proposed module sequentially extracts channel and spatial features from person images and associates the two kinds of information through sequential connections. The module has a simple structure and adds very few parameters to the backbone network. We place the attention module in each feature extraction layer so that it focuses on the pedestrian information extracted by that layer. To address the problem of complex model structure, we choose a residual network as the backbone and use the attention mechanism to extract person features, without pose point estimation or additional auxiliary networks, which reduces model complexity. We also adjust the drop rate of the person classification layer to improve the model's generalization ability. We evaluate our method on three public datasets: Market-1501, DukeMTMC-reID, and CUHK03 (both detected and labeled). The results demonstrate the effectiveness of the proposed method, which obtains highly competitive performance on all three datasets.
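
The sequential channel-then-spatial idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' module: the abstract does not specify the layers, so the gates below are parameter-free (a sigmoid of an average) purely to keep the example self-contained, and all function names are hypothetical. A real attention module would typically compute the gates with learned layers.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap):
    """Scale each channel by a gate derived from its global average.

    Assumption: a parameter-free gate stands in for learned layers.
    fmap is a nested list indexed as [channel][row][col].
    """
    out = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        gate = sigmoid(sum(values) / len(values))
        out.append([[v * gate for v in row] for row in channel])
    return out


def spatial_attention(fmap):
    """Scale each spatial location by a gate derived from the channel mean."""
    n_ch = len(fmap)
    height, width = len(fmap[0]), len(fmap[0][0])
    gates = [
        [sigmoid(sum(fmap[c][i][j] for c in range(n_ch)) / n_ch)
         for j in range(width)]
        for i in range(height)
    ]
    return [
        [[fmap[c][i][j] * gates[i][j] for j in range(width)]
         for i in range(height)]
        for c in range(n_ch)
    ]


def sequential_attention(fmap):
    """Apply channel attention first, then spatial attention on its output."""
    return spatial_attention(channel_attention(fmap))


# Toy 2-channel, 2x2 feature map (hypothetical values).
feature_map = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
refined = sequential_attention(feature_map)
```

Because the spatial gate is computed from the already channel-weighted map, the two kinds of information are coupled through the ordering itself, and the output keeps the input's shape, so such a block could in principle be placed after any feature extraction layer.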

Triplet Attention Network for Video-Based Person Re-Identification

Instance Hard Triplet Loss for In-video Person Re-identification

A Novel Two-Stream Saliency Image Fusion CNN Architecture for Person Re-Identification

Joining Features by Global Guidance with Bi-Relevance Trihard Loss for Person Re-Identification

Deep Siamese Network with Multi-level Similarity Perception for Person Re-identification

Person Re-identification Based on Transform Algorithm

Dual Attention Matching Network for Context-Aware Feature Sequence based Person Re-Identification

Diverse Part Attentive Network for Video-Based Person Re-Identification

MSTN: A Multi-granular Spatial–Temporal Network for video-based person re-identification

Learning Recurrent 3D Attention for Video-Based Person Re-Identification

Dense 3D-Convolutional Neural Network for Person Re-Identification in Videos

Multi-Level Fusion Temporal-Spatial Co-Attention for Video-Based Person Re-Identification

Concentrated Multi-Grained Multi-Attention Network for Video Based Person Re-Identification

AA-RGTCN: Reciprocal Global Temporal Convolution Network with Adaptive Alignment for Video-Based Person Re-Identification

Information complementary attention-based multidimension feature learning for person re-identification

Pose-Aided Video-based Person Re-Identification via Recurrent Graph Convolutional Network

A Person Re-Identification Network Based Upon Channel Attention and Self-Attention

Multi-Scale Triplet CNN for Person Re-Identification

ASTA-Net: Adaptive Spatio-Temporal Attention Network for Person Re-Identification in Videos

MIX-Net: Hybrid Attention/Diversity Network for Person Re-Identification

Multi-Scale 3D Convolution Network for Video Based Person Re-Identification