Abstract: Person re-identification (re-ID) is an important and challenging topic in video surveillance and public security. Re-ID aims to retrieve the same person across different cameras. Despite recent progress, re-ID remains difficult because of varying camera views, changing body posture, complex backgrounds, and occlusion. To obtain a more discriminative image similarity descriptor, we propose a novel method in this paper. First, we design a body partition extraction network to extract three body regions with efficient alignment. Second, we propose a multi-stream contribution framework that fuses feature distances with different contributions and generates the final image similarity descriptor. In addition, we combine re-ID with semantic segmentation: a mask feature is introduced into the proposed framework, and we design a contribution feedback module to generate the contribution coefficients dynamically. Third, to improve re-ID performance, we propose a fragment learning method to optimize the contribution feedback module. Fourth and last, we propose a $k$-distribution re-ranking strategy to further improve performance. Our method achieves competitive results on two popular datasets, CUHK03 and Market1501, with rank-1 accuracy of 93.5% and 85.7%. The proposed re-ranking method yields performance boosts of 2.3% and 2.8%. These results demonstrate the effectiveness of the proposed multi-stream contribution framework and the $k$-distribution re-ranking strategy.
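The idea of fusing per-stream feature distances with different contributions can be illustrated with a minimal Python sketch. This is not the authors' implementation: the stream names (`region_1`, `region_2`), the Euclidean distance, the fixed contribution weights, and the helper names `fused_distance` and `rank_gallery` are all assumptions made for illustration. In the paper the coefficients are generated dynamically by a contribution feedback module, which is not reproduced here.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fused_distance(query_feats, gallery_feats, contributions):
    """Weighted sum of per-stream distances.

    `contributions` maps a (hypothetical) stream name to a non-negative
    weight; weights are normalized so that they sum to 1.
    """
    total = sum(contributions.values())
    return sum(
        (weight / total) * euclidean(query_feats[stream], gallery_feats[stream])
        for stream, weight in contributions.items()
    )


def rank_gallery(query_feats, gallery, contributions):
    """Return gallery ids sorted by ascending fused distance to the query."""
    return sorted(
        gallery,
        key=lambda gid: fused_distance(query_feats, gallery[gid], contributions),
    )


# Toy example with two assumed streams and hand-picked weights.
query = {"region_1": [0.0, 0.0], "region_2": [0.0, 0.0]}
gallery = {
    "a": {"region_1": [3.0, 4.0], "region_2": [0.0, 0.0]},
    "b": {"region_1": [0.0, 0.0], "region_2": [3.0, 4.0]},
}
weights = {"region_1": 3.0, "region_2": 1.0}
ranking = rank_gallery(query, gallery, weights)
```

Because `region_1` carries three times the weight of `region_2` in this toy setup, a mismatch in `region_1` is penalized more heavily, so gallery item `b` ranks ahead of `a`.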

Spatial-temporal Fusion Network with Residual Learning and Attention Mechanism: A Benchmark for Video-Based Group Re-ID

A Novel Two-Stream Saliency Image Fusion CNN Architecture for Person Re-Identification

Joining Features by Global Guidance with Bi-Relevance Trihard Loss for Person Re-Identification

Contribution-Based Multi-Stream Feature Distance Fusion Method with $k$-Distribution Re-Ranking for Person Re-Identification

Gaussian-based Probability Fusion for Person Re-Identification with Taylor Angular Margin Loss

Person Re-identification Network Based on Multi-Level Feature Fusion

Multi-Level Fusion Temporal-Spatial Co-Attention for Video-Based Person Re-Identification

Dual Attention Matching Network for Context-Aware Feature Sequence based Person Re-Identification

MSTN: A Multi-granular Spatial–Temporal Network for video-based person re-identification

Video-Based Person Re-Identification Using Spatial-Temporal Memory Coupling Network

Spatial-Temporal Correlation and Topology Learning for Person Re-Identification in Videos

Spatial-Temporal Attention-aware Learning for Video-based Person Re-identification

Rethinking Temporal Fusion for Video-based Person Re-identification on Semantic and Time Aspect

Parallel Attention with Weighted Efficient Network for Video-Based Person Re-Identification

Relation-Guided Spatial Attention and Temporal Refinement for Video-Based Person Re-Identification

Object Re-identification via Spatial-temporal Fusion Networks and Causal Identity Matching

STFE: A Comprehensive Video-Based Person Re-Identification Network Based on Spatio-Temporal Feature Enhancement

A Video Target Re-Recognition Method Based on Adaptive Attention Enhancement and Multi-Scale Feature Fusion

Revisiting Temporal Modeling for Video-based Person ReID

Spatial and Temporal Mutual Promotion for Video-Based Person Re-Identification