Rethinking Person Re-Identification via Semantic-Based Pretraining

Suncheng Xiang, Dahong Qian, Jingsheng Gao, Zirui Zhang, Ting Liu, Yuzhuo Fu
DOI: https://doi.org/10.1145/3628452
2023-10-17
Abstract: Pretraining is a dominant paradigm in computer vision. Supervised ImageNet pretraining is commonly used to initialize the backbones of person re-identification (Re-ID) models. However, recent works report the surprising result that CNN-based pretraining on ImageNet has limited impact on Re-ID systems, owing to the large domain gap between ImageNet and person Re-ID data. Seeking an alternative to this traditional pretraining, we investigate semantic-based pretraining, which exploits additional textual data, as a counterpart to ImageNet pretraining. Specifically, we manually construct FineGPR-C, a diverse caption dataset that is the first of its kind for person Re-ID events. Building on it, we propose VTBR, a purely semantic-based pretraining approach that uses dense captions to learn visual representations from fewer images. We train convolutional neural networks from scratch on the captions of the FineGPR-C dataset and then transfer them to downstream Re-ID tasks. Comprehensive experiments on benchmark datasets show that VTBR achieves performance competitive with ImageNet pretraining despite using up to 1.4× fewer images, revealing its potential for Re-ID pretraining. Our source code is publicly available at https://github.com/JeremyXSC/VTBR.
computer science, information systems, theory & methods, software engineering