Abstract: Person re-identification plays an important role in searching for a specific person in a camera network with non-overlapping cameras. The most critical problem for re-identification is feature representation. In this paper, a multi-level cross-view consistent feature learning framework is proposed for person re-identification. First, local deep, LOMO and SIFT features are extracted to form multi-level features. Specifically, local features are extracted from the lower and higher layers of a convolutional neural network (CNN); these features complement each other because they capture appearance and semantic properties, respectively. Second, ID-based cross-view multi-level dictionary learning (IDB-CMDL) is carried out to obtain a sparse and discriminative feature representation. Third, cross-view consistent word learning is performed to obtain cross-view consistent BoVW histograms from the sparse feature representation. Finally, multi-level metric learning fuses the multiple BoVW histograms and learns the sample distance in the subspace for ranking. Experiments on the public CUHK03, Market1501, and DukeMTMC-ReID datasets show results that are superior to many state-of-the-art methods for person re-identification. (c) 2021 Elsevier B.V. All rights reserved.
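
The last two stages described in the abstract (pooling sparse codes into BoVW histograms, then fusing the levels and ranking by distance) can be illustrated with a minimal sketch. It is not the paper's method: sum-pooling of absolute codes, fusion by plain concatenation, and Euclidean ranking are stand-in assumptions, and the function names `bovw_histogram`, `fuse_levels`, and `rank_gallery` are hypothetical. The IDB-CMDL dictionary learning and the learned metric are not reproduced here.

```python
import numpy as np

def bovw_histogram(sparse_codes):
    """Pool sparse codes of shape (n_local_features, n_words) into one histogram.

    Assumption: sum-pooling of absolute coefficients followed by L1
    normalisation; the paper's consistent word learning is not modelled.
    """
    hist = np.abs(sparse_codes).sum(axis=0)
    total = hist.sum()
    # Leave an all-zero histogram untouched to avoid dividing by zero.
    return hist / total if total > 0 else hist

def fuse_levels(histograms):
    """Fuse per-level histograms by concatenation (a stand-in for learned fusion)."""
    return np.concatenate(histograms)

def rank_gallery(query, gallery):
    """Return gallery indices sorted by Euclidean distance to the query.

    Assumption: plain Euclidean distance replaces the learned subspace metric.
    """
    dists = np.linalg.norm(gallery - query, axis=1)
    return np.argsort(dists, kind="stable")

# Toy usage: three local features coded over a three-word dictionary.
codes = np.array([[0.0, 2.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, -1.0, 0.0]])
query_hist = bovw_histogram(codes)
gallery_hists = np.array([[1.0, 0.0, 0.0],
                          [0.25, 0.75, 0.0],
                          [0.0, 1.0, 0.0]])
ranking = rank_gallery(query_hist, gallery_hists)
```

In this toy example the second gallery entry matches the query histogram exactly, so it is ranked first. In a real system the histograms from the deep, LOMO, and SIFT levels would each be computed from their own dictionary before fusion.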

Multi-view Feature Fusion for Person Re-Identification

Contribution-Based Multi-Stream Feature Distance Fusion Method with $k$-Distribution Re-Ranking for Person Re-Identification

Joining Features by Global Guidance with Bi-Relevance Trihard Loss for Person Re-Identification

Person Re-identification Network Based on Multi-Level Feature Fusion

Gaussian-based Probability Fusion for Person Re-Identification with Taylor Angular Margin Loss

Multi-feature Fusion Network for Person Reidentification

Multi-level cross-view consistent feature learning for person re-identification

Unity is Strength: Unifying Convolutional and Transformeral Features for Better Person Re-Identification

Learning View-Specific Deep Networks for Person Re-Identification.

Person Re-Identification by Optimizing and Integrating Multiple Feature Representations

Multi-scale Feature Fusion Network for Person Re-Identification.

Joint Cross-Consistency Learning and Multi-Feature Fusion for Person Re-Identification

Learning fused features with parallel training for person re-identification

Adaptive Feature Fusion Via Graph Neural Network for Person Re-identification.

Multiple-local feature and attention fused person re-identification method

Multi-Level Fusion Temporal-Spatial Co-Attention for Video-Based Person Re-Identification

Multi-level Feature Fusion and Multi-Loss Learning for Person Re-Identification.

Adaptive Multi-Metric Fusion for Person Re-identification.

VMRFANet: View-Specific Multi-Receptive Field Attention Network for Person Re-identification

Multi Deep Invariant Feature Learning for Cross-Resolution Person Re-Identification