Enhancing Long-Term Person Re-Identification Using Global, Local Body Part, and Head Streams

Duy Tran Thanh, Yeejin Lee, Byeongkeun Kang
2024-03-05
Abstract: This work addresses the task of long-term person re-identification. Typically, person re-identification assumes that people do not change their clothes, which limits its applications to short-term scenarios. To overcome this limitation, we investigate long-term person re-identification, which considers both clothes-changing and clothes-consistent scenarios. In this paper, we propose a novel framework that effectively learns and utilizes both global and local information. The proposed framework consists of three streams: global, local body part, and head streams. The global and head streams encode identity-relevant information from an entire image and a cropped image of the head region, respectively. Both streams encode the most distinct, less distinct, and average features using the combinations of adversarial erasing, max pooling, and average pooling. The local body part stream extracts identity-related information for each body part, allowing it to be compared with the same body part from another image. Since body part annotations are not available in re-identification datasets, pseudo-labels are generated using clustering. These labels are then utilized to train a body part segmentation head in the local body part stream. The proposed framework is trained by backpropagating the weighted summation of the identity classification loss, the pair-based loss, and the pseudo body part segmentation loss. To demonstrate the effectiveness of the proposed method, we conducted experiments on three publicly available datasets (Celeb-reID, PRCC, and VC-Clothes). The experimental results demonstrate that the proposed method outperforms the previous state-of-the-art method.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

This paper addresses long-term person re-identification. Traditional person re-identification methods usually assume that people do not change clothes, which limits their application to short-term scenarios. To overcome this limitation, the paper studies long-term person re-identification, covering both the case where people change clothes and the case where they do not. Specifically, it proposes a framework that learns and uses both global and local information. The framework has three streams: a global stream, a local body part stream, and a head stream. Together, these streams extract and encode identity-related features so that a person can be re-identified in images taken at different times and places, even after a change of clothes.

### Main Contributions

1. **Proposed an effective long-term person re-identification framework**: The framework combines global and local information, with one stream encoding global information and two streams extracting local features (the local body part stream and the head stream).
2. **Encoded three types of feature vectors in the global stream and head stream**: These vectors represent the most salient, less salient, and average features, obtained by combining adversarial erasing, max pooling, and average pooling.
3. **Utilized explicit and implicit methods to encode local information**: The local body part stream implicitly discovers body parts through clustering, while the head stream explicitly detects and crops the head region.
4. **Experimental validation of the method's effectiveness**: Results on three public datasets (Celeb-reID, PRCC, and VC-Clothes) show that the proposed method outperforms the previous state-of-the-art method.
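The three feature types named in contribution 2 can be illustrated with a toy example. The sketch below is a simplification written for this summary, not the authors' implementation: it assumes a non-negative feature map stored as nested lists shaped `[channels][height][width]`, and it models adversarial erasing as zeroing a small window around each channel's peak before pooling again. The function names and the `radius` parameter are hypothetical.

```python
# Illustrative sketch only; not the authors' implementation.
# A feature map is a nested list shaped [channels][height][width].

def global_max_pool(fmap):
    """Per-channel maximum over all spatial positions."""
    return [max(max(row) for row in channel) for channel in fmap]


def global_avg_pool(fmap):
    """Per-channel mean over all spatial positions."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in fmap
    ]


def erase_peak(fmap, radius=0):
    """Return a copy with a square window around each channel's peak set to 0.

    Assumption: activations are non-negative (e.g. post-ReLU), so writing 0
    removes the peak. `radius` is a hypothetical parameter.
    """
    erased = []
    for channel in fmap:
        h, w = len(channel), len(channel[0])
        # Locate the strongest activation in this channel.
        py, px = max(
            ((y, x) for y in range(h) for x in range(w)),
            key=lambda pos: channel[pos[0]][pos[1]],
        )
        erased.append([
            [0.0 if abs(y - py) <= radius and abs(x - px) <= radius
             else channel[y][x]
             for x in range(w)]
            for y in range(h)
        ])
    return erased


def three_features(fmap, radius=0):
    """Most salient, less salient (after erasing), and average features."""
    most_salient = global_max_pool(fmap)
    less_salient = global_max_pool(erase_peak(fmap, radius))
    average = global_avg_pool(fmap)
    return most_salient, less_salient, average


fmap = [
    [[1.0, 5.0], [3.0, 2.0]],  # channel 0
    [[0.0, 4.0], [8.0, 4.0]],  # channel 1
]
most, less, avg = three_features(fmap)
print(most, less, avg)  # [5.0, 8.0] [3.0, 4.0] [2.75, 4.0]
```

Erasing the peak forces the second max-pooling pass to respond to a different location, which is the intuition behind encoding a less salient feature alongside the most salient one.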
### Method Overview

- **Global Stream**: Extracts identity-related features from the entire image, encoding the most salient, less salient, and average features by combining adversarial erasing, max pooling, and average pooling.
- **Local Body Part Stream**: Generates pseudo-labels by clustering, uses them to train a body part segmentation head, and extracts features for each body part so that they can be compared with the same body part in another image.
- **Head Stream**: Explicitly detects and crops the head region and extracts identity-related features from it, using the same three feature types as the global stream.

The framework is trained on a weighted sum of the identity classification loss, the pair-based loss, and the pseudo body part segmentation loss. By integrating the three streams, it outperforms the previous state-of-the-art method on Celeb-reID, PRCC, and VC-Clothes, covering both clothes-changing and clothes-consistent scenarios.
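The pseudo-labeling step of the local body part stream can be sketched as clustering per-pixel feature vectors and reading the cluster index as a body part label. The example below is a minimal sketch under stated assumptions, not the paper's procedure: it uses plain k-means (the summary does not say which clustering algorithm is used), initializes centroids deterministically from evenly spaced pixels, and takes features as nested lists shaped `[height][width][channels]`. The names `kmeans`, `pseudo_part_labels`, and `num_parts` are hypothetical.

```python
# Illustrative sketch only; k-means is an assumed stand-in for the
# clustering step, which the source does not specify.

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20):
    """Lloyd's k-means with deterministic, evenly spaced initial centroids.

    Assumes 1 <= k <= len(points).
    """
    step = max(1, len(points) // k)
    centroids = [list(points[i * step]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(v) / len(members) for v in zip(*members)]
    return labels, centroids


def pseudo_part_labels(features, num_parts):
    """Cluster per-pixel features and return a [height][width] label map."""
    h, w = len(features), len(features[0])
    pixels = [features[y][x] for y in range(h) for x in range(w)]
    labels, _ = kmeans(pixels, num_parts)
    return [labels[y * w:(y + 1) * w] for y in range(h)]


# Toy 4x2 map: the top two rows and bottom two rows have distinct features.
features = [
    [[1.0, 0.0], [0.9, 0.1]],
    [[0.8, 0.2], [1.0, 0.1]],
    [[0.1, 0.9], [0.0, 1.0]],
    [[0.2, 0.8], [0.1, 1.0]],
]
label_map = pseudo_part_labels(features, num_parts=2)
print(label_map)  # [[0, 0], [0, 0], [1, 1], [1, 1]]
```

In a pipeline of this kind, a label map like `label_map` would serve as the training target for the body part segmentation head, since the re-identification datasets provide no body part annotations.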