Balanced and Essential Modality-Specific and Modality-Shared Representations for Visible-Infrared Person Re-Identification

Soonyong Gwon, Sejun Kim, Kisung Seo
DOI: https://doi.org/10.1109/lsp.2024.3358725
2024-02-07
IEEE Signal Processing Letters
Abstract: Retrieving and matching images of individuals in Visible-Infrared Person Re-identification is challenging because of the large modality gap between daytime color images and nighttime infrared images. Existing approaches rely on inefficient data augmentation and/or biased modality characteristics, which limits their potential for performance improvement. To solve these problems, we propose a novel approach that balances modality-specific and modality-shared representations and includes an efficient transform for increasing the diversity of instances. First, we propose the Informative Weighted Gray transform (IWG), which aims to maximize the diversity of instances by generating distinct combinations of the RGB color channels. Second, we introduce the Customized Modality-Specific Enhanced Module (CMSpEM), which produces enhanced feature maps using an attention mechanism between pre-pooling and post-pooling features and reinforces modality-specific features in a modality-shared network with an extremely small number of parameters. Third, we introduce the Pseudo Label-oriented Modality-Specific (PLMSp) Loss, which explicitly reduces the modality gap during representation learning by using pseudo labels as anchors. We compare our Balanced and Essential Modality-Specific and Modality-Shared Network (BEMSSNet) with various existing methods in terms of mAP and Rank-1 performance on the SYSU-MM01 and RegDB datasets. Experimental results demonstrate that our proposed model outperforms the existing state-of-the-art methods.
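The abstract describes IWG only as a gray transform built from distinct combinations of the RGB channels, so the following is a minimal sketch of a generic randomly weighted gray transform, not the authors' method. The function names `random_channel_weights` and `weighted_gray`, the convex-weight sampling scheme, and the tiny test image are all assumptions made for illustration; how IWG actually selects "informative" weights is not stated in the abstract.

```python
import random


def random_channel_weights(rng):
    """Draw three positive weights that sum to 1.

    Assumption: uniform sampling followed by normalization. The abstract
    does not say how IWG chooses its weights.
    """
    raw = [rng.random() + 1e-9 for _ in range(3)]
    total = sum(raw)
    return tuple(w / total for w in raw)


def weighted_gray(image, weights):
    """Convert an RGB image to a 3-channel gray image.

    `image` is a list of rows, each row a list of (r, g, b) tuples with
    values in 0..255. Each output pixel repeats the weighted sum of its
    input channels so the result keeps the three-channel layout.
    """
    wr, wg, wb = weights
    out = []
    for row in image:
        new_row = []
        for r, g, b in row:
            value = int(round(wr * r + wg * g + wb * b))
            value = max(0, min(255, value))  # clamp to the valid range
            new_row.append((value, value, value))
        out.append(new_row)
    return out


# Hypothetical 2x2 image: pure red, pure green, pure blue, mid gray.
rng = random.Random(0)
image = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (128, 128, 128)]]
weights = random_channel_weights(rng)
gray = weighted_gray(image, weights)
```

Drawing a fresh weight triple per training sample yields a different gray rendering of the same visible image each time, which is one plausible reading of "diversity of instances"; a fixed triple such as the standard luma coefficients would give a single rendering.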
engineering, electrical & electronic