ReMix: Training Generalized Person Re-identification on a Mixture of Data

Timur Mamedov, Anton Konushin, Vadim Konushin
2024-10-29
Abstract: Modern person re-identification (Re-ID) methods have a weak generalization ability and experience a major accuracy drop when capturing environments change. This is because existing multi-camera Re-ID datasets are limited in size and diversity, since such data is difficult to obtain. At the same time, enormous volumes of unlabeled single-camera records are available. Such data can be easily collected, and therefore, it is more diverse. Currently, single-camera data is used only for self-supervised pre-training of Re-ID methods. However, the diversity of single-camera data is suppressed by fine-tuning on limited multi-camera data after pre-training. In this paper, we propose ReMix, a generalized Re-ID method jointly trained on a mixture of limited labeled multi-camera and large unlabeled single-camera data. Effective training of our method is achieved through a novel data sampling strategy and new loss functions that are adapted for joint use with both types of data. Experiments show that ReMix has a high generalization ability and outperforms state-of-the-art methods in generalizable person Re-ID. To the best of our knowledge, this is the first work that explores joint training on a mixture of multi-camera and single-camera data in person Re-ID.
Computer Vision and Pattern Recognition, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

This paper addresses the weak generalization ability of person re-identification (Re-ID) methods. Specifically, existing Re-ID methods suffer a significant accuracy drop when the capturing environment changes, which limits their use in real-world scenarios. The paper attributes this to two causes:

1. **Limited training data**: Existing multi-camera Re-ID datasets are small in scale and lack diversity because such data is difficult to obtain.
2. **Underutilization of single-camera data**: Although large volumes of unlabeled single-camera data are available, they are currently used only for self-supervised pre-training. Subsequent fine-tuning still relies on limited multi-camera data, which suppresses the diversity that the single-camera data contributed.

To address these issues, the paper proposes a method called ReMix, which improves the generalization ability of Re-ID by training jointly on limited labeled multi-camera data and a large amount of unlabeled single-camera data. ReMix relies on the following innovations:

- **Novel data sampling strategy**: Obtains pseudo-labels for a large amount of unlabeled single-camera data and forms mini-batches from the mixed data.
- **New loss functions**: An instance loss, an augmentation loss, and a center loss, each adapted for joint use with both types of data so that training on the mixture is effective.
- **Combination of self-supervised pre-training and joint training**: Uses self-supervised pre-training to generate high-quality pseudo-labels, then further improves generalization through joint training.

Experimental results show that ReMix outperforms existing methods in both cross-dataset and multi-source cross-dataset scenarios, demonstrating the effectiveness of the proposed method.
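To make the idea of forming mini-batches from mixed data more concrete, here is a minimal Python sketch. It is not the authors' sampling procedure, which this summary does not specify. It assumes a common identity-balanced scheme: pick a few labels per source, then a fixed number of images per label. The function names, the `(sample_id, label)` input format, the per-source counts, and the convention that a pseudo-label of `-1` marks an unclustered single-camera image are all hypothetical.

```python
import random
from collections import defaultdict


def group_by_label(samples):
    """Map each label to the list of sample ids that carry it."""
    groups = defaultdict(list)
    for sample_id, label in samples:
        groups[label].append(sample_id)
    return groups


def sample_source(samples, n_ids, k, source, rng, drop_label=None):
    """Pick n_ids labels from one source, then k samples per label.

    Labels with fewer than k samples are sampled with replacement.
    drop_label (assumption) removes a marker label such as -1.
    """
    groups = group_by_label(samples)
    if drop_label is not None:
        groups.pop(drop_label, None)
    chosen = rng.sample(sorted(groups), n_ids)
    batch = []
    for label in chosen:
        pool = groups[label]
        picks = rng.sample(pool, k) if len(pool) >= k else rng.choices(pool, k=k)
        batch.extend((sample_id, label, source) for sample_id in picks)
    return batch


def build_mixed_batch(multi_cam, single_cam, n_multi_ids, n_single_ids, k, seed=0):
    """Concatenate a labeled multi-camera part and a pseudo-labeled single-camera part."""
    rng = random.Random(seed)
    multi_part = sample_source(multi_cam, n_multi_ids, k, "multi", rng)
    single_part = sample_source(single_cam, n_single_ids, k, "single", rng, drop_label=-1)
    return multi_part + single_part


# Toy data: ground-truth ids for multi-camera, cluster ids for single-camera.
multi = [(f"m{i}", i % 4) for i in range(40)]
single = [(f"s{i}", (i % 5) - 1) for i in range(50)]
batch = build_mixed_batch(multi, single, n_multi_ids=2, n_single_ids=3, k=4, seed=1)
```

Each batch entry keeps its source tag so that a training loop could route the two parts to different loss terms. How ReMix actually combines its instance, augmentation, and center losses across the two parts is not described in this summary.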