Abstract: Person re-identification (Re-ID) aims to match person images across non-overlapping camera views. Most Re-ID methods focus on small-scale surveillance systems in which each pedestrian is captured by different cameras in adjacent scenes. In large-scale surveillance systems that cover wider areas, however, a pedestrian of interest must be tracked across distant scenes (e.g., a criminal suspect escaping from one city to another). Because most pedestrians appear only in limited local areas, it is difficult to collect training data containing cross-camera pairs of the same person. In this work, we study intra-camera supervised person re-identification across distant scenes (ICS-DS Re-ID), which trains on cross-camera unpaired data with intra-camera identity labels. This setting is challenging because most existing Re-ID methods rely on cross-camera paired data to learn camera-invariant features. To learn camera-invariant representations from cross-camera unpaired training data, we propose a cross-camera feature prediction method that mines cross-camera self-supervision information from camera-specific feature distributions by transforming features into fake cross-camera positive pairs and minimizing the distances within those pairs. Furthermore, we automatically localize and extract local-level features with a transformer. Jointly learning global-level and local-level features yields a global-local cross-camera feature prediction scheme that mines fine-grained cross-camera self-supervision information. Finally, cross-camera self-supervision and intra-camera supervision are combined in a single framework. Experiments are conducted in the ICS-DS setting on the Market-SCT, Duke-SCT, and MSMT17-SCT datasets. The results demonstrate the superiority of our method, which improves Rank-1 by 15.4 and mAP by 22.3 on Market-SCT over the second-best method.
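
The fake-pair idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes, purely for illustration, that each camera's feature distribution is summarized by a per-dimension mean and standard deviation, and that a feature is "predicted" in another camera by re-standardizing it with those statistics. The names `camera_stats`, `predict_cross_camera`, and `fake_pair_loss` are hypothetical, and the abstract does not specify the actual predictor or the transformer-based local branch.

```python
import numpy as np

def camera_stats(features, eps=1e-6):
    """Per-dimension mean and standard deviation of one camera's features."""
    return features.mean(axis=0), features.std(axis=0) + eps

def predict_cross_camera(features, src_stats, dst_stats):
    """Map features from the source camera's distribution to the target's.

    ASSUMPTION: a simple re-standardization stands in for whatever
    predictor the paper actually uses.
    """
    src_mean, src_std = src_stats
    dst_mean, dst_std = dst_stats
    return (features - src_mean) / src_std * dst_std + dst_mean

def fake_pair_loss(features, fake_features):
    """Mean squared Euclidean distance between each feature and its fake partner."""
    return float(np.mean(np.sum((features - fake_features) ** 2, axis=1)))

# Toy data: two cameras with different feature statistics and no shared identities.
rng = np.random.default_rng(0)
cam_a = rng.normal(loc=0.0, scale=1.0, size=(64, 8))
cam_b = rng.normal(loc=2.0, scale=0.5, size=(64, 8))

stats_a, stats_b = camera_stats(cam_a), camera_stats(cam_b)

# Each camera-A feature and its prediction in camera B form a fake positive pair.
fake_b = predict_cross_camera(cam_a, stats_a, stats_b)
loss = fake_pair_loss(cam_a, fake_b)
print(f"fake-pair distance loss: {loss:.3f}")
```

In a trainable model, a loss of this kind would be backpropagated into the feature extractor so that features of the same person become less dependent on the camera; the NumPy version above only computes the quantity on fixed toy features.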

Discriminative Spatial Feature Learning for Person Re-Identification

A Novel Two-Stream Saliency Image Fusion CNN Architecture for Person Re-Identification

Deep Siamese Network with Multi-level Similarity Perception for Person Re-identification

Person Re-identification Based on Transform Algorithm

Joining Features by Global Guidance with Bi-Relevance Trihard Loss for Person Re-Identification

RETRACTED CHAPTER: Person Re-identification Based on Transform Algorithm

Person Re-Identification Based on Spatial Feature Learning and Multi-Granularity Feature Fusion

Spatial-Temporal Correlation and Topology Learning for Person Re-Identification in Videos

Deep-Person: Learning discriminative deep features for person Re-Identification

Information complementary attention-based multidimension feature learning for person re-identification

An End-to-End Foreground-Aware Network for Person Re-Identification

Person Re-identification Based on CNN with Multi-scale Contour Embedding

Cross-Camera Feature Prediction for Intra-Camera Supervised Person Re-identification across Distant Scenes

Learning Deep Context-aware Features over Body and Latent Parts for Person Re-identification

Multi-level Similarity Perception Network for Person Re-identification

Deep Fusion Feature Representation Learning With Hard Mining Center-Triplet Loss for Person Re-Identification

Discriminative feature extraction for video person re-identification via multi-task network

Co-Saliency Spatio-Temporal Interaction Network for Person Re-Identification in Videos

A Discriminatively Learned CNN Embedding for Person Reidentification

STFE: A Comprehensive Video-Based Person Re-Identification Network Based on Spatio-Temporal Feature Enhancement

Domain-adaptive Person Re-identification without Cross-camera Paired Samples