Multi-knowledge-driven enhanced module for visible-infrared cross-modal person Re-identification

Shihao Shan, Peixin Sun, Guoqiang Xiao, Song Wu
DOI: https://doi.org/10.1007/s13735-024-00327-7
2024-04-18
International Journal of Multimedia Information Retrieval
Abstract: Visible-Infrared Person Re-identification (VI-ReID) is challenging in social security surveillance because the semantic gap between cross-modal data significantly reduces VI-ReID performance. To overcome this challenge, this paper proposes a novel Multi Knowledge-driven Enhancement Module (MKEM) for high-performance VI-ReID. It focuses on explicitly learning appropriate transition modalities and effectively synthesizing them, reducing the burden on the model of learning vastly different cross-modal knowledge. The MKEM consists of a Visible Knowledge-driven Enhancement Module (VKEM) and an Infrared Knowledge-driven Enhancement Module (IKEM), which generate knowledge-accumulating transition modalities for the visible and infrared modalities, respectively. To leverage the transition modalities effectively, the model needs to learn the original data distribution while accumulating knowledge of the transition modalities; thus, a Diversity Loss is designed to guide the representations of the generated transition modalities to be diverse, which facilitates the model's knowledge accumulation. To prevent redundant knowledge accumulation, a Consistency Loss is proposed to maintain the semantic similarity between the original and generated transition modalities. Furthermore, we implement a Bias Adjustment Strategy (BAS) to adjust the gap between the head and tail categories. We evaluate the proposed MKEM on two VI-ReID benchmark datasets, SYSU-MM01 and RegDB, and the experimental results demonstrate that our method significantly outperforms existing methods. The source code of the proposed MKEM is available at https://github.com/SWU-CS-MediaLab/MKEM.
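The abstract names a Diversity Loss and a Consistency Loss but gives no formulas, so the following is only an illustrative sketch of how two such opposing objectives could interact. Everything in it is an assumption rather than the authors' formulation: the use of cosine similarity, the averaging scheme, and the function names `cosine`, `consistency_loss`, and `diversity_loss` are all hypothetical. The paper and the linked repository remain the reference for the actual definitions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Small epsilon guards against division by zero for all-zero vectors.
    return dot / (norm_u * norm_v + 1e-12)


def consistency_loss(original, transitions):
    """ASSUMED form: mean (1 - cosine) between the original feature and
    each transition feature. Low when transitions stay semantically close
    to the original modality."""
    return sum(1.0 - cosine(original, t) for t in transitions) / len(transitions)


def diversity_loss(transitions):
    """ASSUMED form: mean pairwise cosine similarity among transition
    features. Low when the transition features differ from one another."""
    pairs = [
        (i, j)
        for i in range(len(transitions))
        for j in range(i + 1, len(transitions))
    ]
    if not pairs:
        return 0.0
    return sum(cosine(transitions[i], transitions[j]) for i, j in pairs) / len(pairs)


if __name__ == "__main__":
    original = [1.0, 0.0]
    identical = [[1.0, 0.0], [1.0, 0.0]]   # consistent but not diverse
    spread = [[1.0, 0.0], [0.0, 1.0]]      # diverse but less consistent
    print(consistency_loss(original, identical), diversity_loss(identical))
    print(consistency_loss(original, spread), diversity_loss(spread))
```

In this toy setting the two terms pull in opposite directions: identical transition features minimize the consistency term but maximize the diversity term, while orthogonal ones do the reverse. That tension mirrors the trade-off the abstract describes between accumulating new knowledge and avoiding redundant knowledge.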
computer science, artificial intelligence, software engineering