Abstract: Occluded person re-identification (Re-ID) is a challenging task, as pedestrians are often obstructed by occluders such as non-pedestrian objects or non-target pedestrians. Previous methods have relied heavily on auxiliary models, such as human pose estimators, to obtain information about unoccluded regions. However, these auxiliary models fall short in accounting for pedestrian occlusions, which can lead to misrepresentations. In addition, some previous works learn feature representations from single images, ignoring the potential relations among samples. To address these issues, this paper introduces a Multi-Level Relation-Aware Transformer (MLRAT) model for occluded person Re-ID. The model comprises two novel modules: Patch-Level Relation-Aware (PLRA) and Sample-Level Relation-Aware (SLRA). PLRA learns fine-grained local features by modeling the structural relations between key patches, removing the dependency on auxiliary models. It adopts a model-free method to select key patches that have high semantic correlation with the final pedestrian representation. In particular, to alleviate the interference of occlusion, PLRA captures the structural relations among key patches via a two-layer Graph Convolution Network (GCN), which guides local feature fusion and learning. SLRA is designed to help the model learn discriminative features by modeling the relations among samples. Specifically, to mitigate noisy relations from irrelevant samples, we present a Relation-Aware Transformer (RAT) block that captures the relations among neighbors. Furthermore, to bridge the gap between the training and testing phases, a self-distillation method is employed to transfer the sample-level relations captured by SLRA to the backbone. Extensive experiments are conducted on four occluded datasets, two partial datasets and two holistic datasets. The results show that the proposed MLRAT model significantly outperforms existing baselines on the four occluded datasets, while maintaining top performance on the two partial datasets and the two holistic datasets.
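The PLRA idea described above (select key patches by their correlation with the pedestrian representation, then relate them with a two-layer GCN) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not specify the correlation measure or how the graph is built, so the sketch assumes cosine similarity for both, and all function names, tensor shapes, and the random weights are hypothetical.

```python
import numpy as np

def select_key_patches(patch_tokens, global_token, k):
    """Return indices of the k patches most cosine-similar to the global token.
    Cosine similarity is an assumed stand-in for 'semantic correlation'."""
    p = patch_tokens / np.linalg.norm(patch_tokens, axis=1, keepdims=True)
    g = global_token / np.linalg.norm(global_token)
    scores = p @ g
    return np.argsort(-scores)[:k]

def normalized_adjacency(feats):
    """Build a dense graph from non-negative pairwise cosine similarity
    (an assumption), then apply symmetric normalization D^-1/2 A D^-1/2."""
    f = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    adj = np.maximum(f @ f.T, 0.0)   # drop negative affinities
    np.fill_diagonal(adj, 1.0)       # self-loops
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def two_layer_gcn(feats, adj_norm, w1, w2):
    """Two graph-convolution layers with a ReLU in between."""
    hidden = np.maximum(adj_norm @ feats @ w1, 0.0)
    return adj_norm @ hidden @ w2

# Toy input: 16 patch tokens and one global token, all of dimension 8.
rng = np.random.default_rng(0)
patch_tokens = rng.normal(size=(16, 8))
global_token = rng.normal(size=8)

idx = select_key_patches(patch_tokens, global_token, k=4)
key_feats = patch_tokens[idx]
adj_norm = normalized_adjacency(key_feats)

# Random weights stand in for learned parameters.
w1 = rng.normal(size=(8, 8)) * 0.1
w2 = rng.normal(size=(8, 8)) * 0.1
local_feats = two_layer_gcn(key_feats, adj_norm, w1, w2)
print(local_feats.shape)
```

In a trained model the patch and global tokens would come from the Transformer backbone and the GCN weights would be learned jointly with the Re-ID objective; the sketch only shows how the data flows from selection to graph-based fusion.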

Pedestrian 3D Shape Understanding for Person Re-Identification via Multi-View Learning

3D Person Re-identification Based on Global Semantic Guidance and Local Feature Aggregation

Multi-view Information Integration and Propagation for Occluded Person Re-identification

Full-scaled Deep Metric Learning for Pedestrian Re-Identification

Parameter-Efficient Person Re-identification in the 3D Space

Occluded Person Re-Identification with Single-scale Global Representations

Occluded Person Re-Identification with Pose Estimation Correction and Feature Reconstruction

Learning to Know Where to See - A Visibility-Aware Approach for Occluded Person Re-identification

Robust Video-Based Person Re-Identification by Hierarchical Mining

Learning Visual-Spatial Saliency for Multiple-Shot Person Re-Identification

Multi-Rate Gated Recurrent Convolutional Networks for Video-Based Pedestrian Re-Identification

Semantic-Aware Occlusion-Robust Network for Occluded Person Re-Identification

Focus on the Visible Regions: Semantic-Guided Alignment Model for Occluded Person Re-Identification

Deep Multi-View Feature Learning for Person Re-Identification

Visible Infrared Cross-Modality Person Re-Identification Network Based on Adaptive Pedestrian Alignment

A Multi-Level Relation-Aware Transformer model for occluded person re-identification

Ped-Mix: Mix Pedestrians for Occluded Person Re-identification

DGSN: Learning How to Segment Pedestrians from Other Datasets for Occluded Person Re-Identification

Mgd: Mask Guided De-Occlusion Framework For Occluded Person Re-Identification

Attribute-Guided Collaborative Learning for Partial Person Re-Identification

Part-based Representation Enhancement for Occluded Person Re-identification