Style-Agnostic Representation Learning for Visible-Infrared Person Re-Identification

Jianbing Wu, Hong Liu, Wei Shi, Mengyuan Liu, Wenhao Li
DOI: https://doi.org/10.1109/tmm.2023.3294002
IF: 7.3
2024-01-01
IEEE Transactions on Multimedia
Abstract: One main challenge of visible-infrared person re-identification (VI Re-ID) lies in the large style discrepancy between the heterogeneous data. We present a STyle-Agnostic Representation learning (STAR) framework that bridges the modality gaps at both data and feature levels in a progressive manner. At the data level, we present Cross Modality Blending (CMB), a powerful and parameter-free data augmentation scheme that smoothly synthesizes intermediate modalities by conducting identity-preserving patch exchange and smooth cross-modality blending. At the feature level, we explore the inter-modality feature alignment problem from a new perspective of the style-related feature statistics. Specifically, we design a plug-and-play Adaptive Style Normalization (ASN) module to discard the intrinsic style distractors without losing discriminative content via dual-level adaptive distribution normalization and discriminability compensation. Moreover, considering that an appropriate modality intermediary can convey relevant information on the inter-modality distribution shift, we propose Reciprocal Modality Bridging Learning (RMBL) to better steer the modality bridging process. Two lightweight modality transformation modules are designed in RMBL to model an appropriate intermediate space by manipulating high-order statistics under our shortest distance constraint. Meanwhile, intermediary-guided distribution alignment is reciprocally conducted to align heterogeneous features to the modality intermediary. Experiments on VI Re-ID benchmarks demonstrate the superiority and flexibility of STAR over state-of-the-art methods. Our code is available at https://github.com/KimbingNg/Star-VIReID.
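
To make the phrase "style-related feature statistics" concrete, the sketch below shows plain per-channel instance normalization on toy data. It is not the ASN module described in the abstract, which additionally involves adaptive, dual-level normalization and discriminability compensation. It only illustrates the underlying assumption that modality "style" can be approximated by the channel-wise mean and standard deviation of a feature map. The function names `channel_stats` and `strip_style` and the toy features are hypothetical and do not come from the paper or its repository.

```python
import math


def channel_stats(channel, eps=1e-5):
    """Return the mean and (eps-stabilized) standard deviation of one channel."""
    mean = sum(channel) / len(channel)
    var = sum((x - mean) ** 2 for x in channel) / len(channel)
    return mean, math.sqrt(var + eps)


def strip_style(feature_map, eps=1e-5):
    """Instance-normalize every channel to roughly zero mean and unit variance."""
    normalized = []
    for channel in feature_map:
        mean, std = channel_stats(channel, eps)
        normalized.append([(x - mean) / std for x in channel])
    return normalized


def max_gap(a, b):
    """Largest element-wise absolute difference between two feature maps."""
    return max(abs(x - y) for ca, cb in zip(a, b) for x, y in zip(ca, cb))


# Toy assumption: both modalities share the same content pattern but differ
# by a per-modality scale and offset, standing in for the "style" shift.
content = [[0.0, 1.0, 2.0, 3.0], [1.0, -1.0, 1.0, -1.0]]
visible = [[2.0 * x + 5.0 for x in ch] for ch in content]
infrared = [[0.5 * x - 1.0 for x in ch] for ch in content]

gap_before = max_gap(visible, infrared)
gap_after = max_gap(strip_style(visible), strip_style(infrared))
print(f"gap before: {gap_before:.3f}, gap after: {gap_after:.6f}")
```

In this toy setting the two modalities become nearly identical after normalization because the shift is purely a scale and offset. Real visible and infrared features also differ in content-relevant ways, and the abstract indicates that ASN is designed to remove style distractors without losing that discriminative content.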