Abstract: Benefiting directly from the rapid advancement of deep learning methods, person re-identification (Re-ID) applications have become widespread and achieved remarkable success in recent years. Nevertheless, cross-scene Re-ID is still hindered by large view variation, because temporal clues are difficult to exploit effectively: the computational burden is heavy, and discriminative features are hard to incorporate flexibly. To alleviate this problem, we propose a long-short temporal–spatial clues excited network (LSTS-NET) for robust person Re-ID across different scenes. In essence, our LSTS-NET comprises a motion appearance model and a motion-refinement aggregating scheme. The former extracts temporal clues through multi-range low-rank analysis, both in consecutive frames and in cross-camera videos, which augments person-related features with details while suppressing cluttered backgrounds across different scenes. The latter aggregates the temporal clues with spatial features: it automatically activates person-specific features by incorporating personalized motion-refinement layers and several motion-excitation CNN blocks into deep networks, which expedites the extraction and learning of discriminative features from different temporal clues. As a result, our LSTS-NET can robustly distinguish persons across different scenes. To verify the improvement brought by LSTS-NET, we conduct extensive experiments and comprehensive evaluations on 8 widely recognized public benchmarks. All the experiments confirm that LSTS-NET significantly boosts the Re-ID performance of existing deep learning methods and outperforms state-of-the-art methods in terms of robustness and accuracy.
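
The abstract does not specify how the low-rank analysis is formulated. As a rough illustration of the general idea only, the sketch below splits a short grayscale clip into a low-rank part (mostly static background) and a residual (mostly motion) using a truncated SVD. The truncated SVD, the function name `split_low_rank`, the `rank` parameter, and the toy clip are all assumptions made for this example; they are not taken from LSTS-NET.

```python
import numpy as np

def split_low_rank(frames, rank=1):
    """Split a clip into a low-rank part and a residual (hypothetical helper).

    frames: array of shape (T, H, W), one grayscale frame per time step.
    Returns (low_rank, residual), both of shape (T, H, W).
    """
    t, h, w = frames.shape
    # Each frame becomes one row of a T x (H*W) matrix.
    x = frames.reshape(t, h * w).astype(np.float64)
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    # Keep only the leading `rank` singular components.
    low = (u[:, :rank] * s[:rank]) @ vt[:rank]
    return low.reshape(t, h, w), (x - low).reshape(t, h, w)

# Toy clip (assumed data): a static gradient background plus one bright
# pixel that moves one column to the right in every frame.
T, H, W = 6, 8, 8
background = np.tile(np.linspace(0.0, 1.0, W), (H, 1))
clip = np.repeat(background[None], T, axis=0)
for i in range(T):
    clip[i, 4, i + 1] += 5.0

low, residual = split_low_rank(clip, rank=1)
# Large residual magnitude marks pixels the low-rank part cannot explain.
motion_energy = np.abs(residual)
```

In this toy setting the static background is absorbed by the rank-1 term, so the moving pixel dominates `motion_energy` in each frame. A real pipeline would presumably operate on learned features and multiple temporal ranges rather than raw pixels, which this sketch does not attempt.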

Long-Short Temporal–Spatial Clues Excited Network for Robust Person Re-identification

Adversarial Recovery Network for Low-Light Person Re-Identification

Deep Siamese Network with Multi-level Similarity Perception for Person Re-identification

Joining Features by Global Guidance with Bi-Relevance Trihard Loss for Person Re-Identification

Person Re-identification Network Based on Multi-Level Feature Fusion

Person Re-identification Based on Transform Algorithm

MSTN: A Multi-granular Spatial–Temporal Network for video-based person re-identification

Dual Attention Matching Network for Context-Aware Feature Sequence based Person Re-Identification

Deep-Person: Learning discriminative deep features for person Re-Identification

Discriminative Spatial Feature Learning for Person Re-Identification

Video-Based Person Re-Identification Using Spatial-Temporal Memory Coupling Network

STFE: A Comprehensive Video-Based Person Re-Identification Network Based on Spatio-Temporal Feature Enhancement

Multi-level Similarity Perception Network for Person Re-identification

Co-Saliency Spatio-Temporal Interaction Network for Person Re-Identification in Videos

Multi-Level Fusion Temporal-Spatial Co-Attention for Video-Based Person Re-Identification

Cross-modal Local Shortest Path and Global Enhancement for Visible-Thermal Person Re-Identification

Learning Deep Context-aware Features over Body and Latent Parts for Person Re-identification

AA-RGTCN: Reciprocal Global Temporal Convolution Network with Adaptive Alignment for Video-Based Person Re-Identification

Multi-Scale Relation Network for Person Re-identification

Information complementary attention-based multidimension feature learning for person re-identification

Asymmetric double networks mutual teaching for unsupervised person Re-identification