Abstract: Cloth-changing person re-identification (ReID) is a newly emerging research topic that aims to retrieve pedestrians whose clothes are changed. Since the human appearance with different clothes exhibits large variations, it is very difficult for existing approaches to extract discriminative and robust feature representations. Current works mainly focus on body shape or contour sketches, but the human semantic information and the potential consistency of pedestrian features before and after changing clothes are not fully explored or are ignored. To solve these issues, in this work, a novel semantic-aware attention and visual shielding network for cloth-changing person ReID (abbreviated as SAVS) is proposed where the key idea is to shield clues related to the appearance of clothes and only focus on visual semantic information that is not sensitive to view/posture changes. Specifically, a visual semantic encoder is first employed to locate the human body and clothing regions based on human semantic segmentation information. Then, a human semantic attention (HSA) module is proposed to highlight the human semantic information and reweight the visual feature map. In addition, a visual clothes shielding (VCS) module is also designed to extract a more robust feature representation for the cloth-changing task by covering the clothing regions and focusing the model on the visual semantic information unrelated to the clothes. Most importantly, these two modules are jointly explored in an end-to-end unified framework. Extensive experiments demonstrate that the proposed method can significantly outperform state-of-the-art methods, and more robust features can be extracted for cloth-changing persons. Compared with multibiometric unified network (MBUNet) (published in TIP 2023), this method can achieve improvements of 17.5% (30.9%) and 8.5% (10.4%) on the LTCC and Celeb-reID datasets in terms of mean average precision (mAP) (rank-1), respectively. When compared with the Swin Transformer (Swin-T), the improvements can reach 28.6% (17.3%), 22.5% (10.0%), 19.5% (10.2%), and 8.6% (10.1%) on the PRCC, LTCC, Celeb, and NKUP datasets in terms of rank-1 (mAP), respectively.
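
The two ideas named in the abstract, covering clothing regions and reweighting features with semantic attention, can be illustrated with a toy sketch. This is not the SAVS implementation: the label ids, the function names `shield_clothes` and `reweight`, the constant fill value, and the `1 + attention` scaling are all assumptions chosen for illustration, and plain nested lists stand in for image tensors and feature maps.

```python
from typing import List

# Hypothetical label ids for a human parsing map (assumption: real parsers
# define their own label sets).
BACKGROUND, HEAD, UPPER_CLOTHES, PANTS, ARMS, LEGS = 0, 1, 2, 3, 4, 5
CLOTHING_LABELS = {UPPER_CLOTHES, PANTS}
BODY_LABELS = {HEAD, ARMS, LEGS}

Grid = List[List[float]]


def shield_clothes(image: Grid, parsing: List[List[int]], fill: float = 0.0) -> Grid:
    """Cover every pixel whose parsing label is a clothing class with `fill`."""
    return [
        [fill if label in CLOTHING_LABELS else pixel
         for pixel, label in zip(img_row, lab_row)]
        for img_row, lab_row in zip(image, parsing)
    ]


def reweight(features: Grid, attention: Grid) -> Grid:
    """Scale each location by (1 + attention): attended regions are amplified,
    the rest are left unchanged (an assumed residual-style weighting)."""
    return [
        [f * (1.0 + a) for f, a in zip(f_row, a_row)]
        for f_row, a_row in zip(features, attention)
    ]


# Toy 3x2 "image" and its parsing map.
image = [[0.9, 0.8], [0.5, 0.4], [0.3, 0.2]]
parsing = [[HEAD, BACKGROUND], [UPPER_CLOTHES, ARMS], [PANTS, LEGS]]

shielded = shield_clothes(image, parsing)

# Binary attention that highlights non-clothing body parts.
attention = [[1.0 if label in BODY_LABELS else 0.0 for label in row] for row in parsing]
reweighted = reweight(shielded, attention)

print(shielded)
print(reweighted)
```

In this sketch the clothing pixels are zeroed before any weighting, so the attention step can only emphasize regions that remain after shielding, which mirrors the stated goal of focusing on information unrelated to the clothes.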

VAC-Net: Visual Attention Consistency Network for Person Re-identification

Person Re-identification Based on Transform Algorithm

Recurrent Deep Attention Network for Person Re-Identification

AMC-Net: Attentive Modality-Consistent Network for Visible-Infrared Person Re-Identification

VMRFANet: View-Specific Multi-Receptive Field Attention Network for Person Re-identification

Consistency-driven feature scoring and regularization network for visible–infrared person re-identification

A part-based attention network for person re-identification

Learning View-Specific Deep Networks for Person Re-Identification

Correlation-guided Semantic Consistency Network for Visible-infrared Person Re-identification

GW-net: an Efficient Grad-Cam Consistency Neural Network with Weakening of Random Erasing Features for Semi-Supervised Person Re-Identification

Related Attention Network for Person Re-Identification

A Spatial-Channel Multi-Attention Parallel Network for Visible-Infrared Person Re-identification

A Semantic-aware Attention and Visual Shielding Network for Cloth-changing Person Re-identification

Discriminative Spatial Feature Learning for Person Re-Identification

CNN Attention Enhanced ViT Network for Occluded Person Re-Identification

Boosting Person Re-Identification with Viewpoint Contrastive Learning and Adversarial Training

ReMamba: a Hybrid CNN-Mamba Aggregation Network for Visible-Infrared Person Re-Identification

Self-Erasing Network for Person Re-Identification

Recurrent matching networks of spatial alignment learning for person re-identification

Adaptive Attention-Aware Network for Unsupervised Person Re-Identification

Attention-Aware Adversarial Network for Person Re-Identification