Cross-Modality Vessel Re-Identification With Deep Alignment Decomposition Network

Yafei Lv, Qian Wu, Zaidao Wen, Jinhui Wu
DOI: https://doi.org/10.1109/TMM.2024.3406193
IF: 7.3
IEEE Transactions on Multimedia
Abstract: Cross-modality vessel re-identification (ReID) is a challenging task in maritime surveillance, requiring robust methods to match vessels accurately across disparate imaging modalities. This paper introduces a novel Cross-modality Alignment Decomposition Network (CAD-Net) to address the complexities of this task. CAD-Net incorporates a geometric-semantic cross-modal alignment module that mitigates geometric and modality variances within the global features. It also integrates an adaptive local decomposition module with a diversity regularization, which captures local vessel features without relying on predefined part separation criteria. To address the scarcity of cross-modal vessel datasets, which are predominantly biased towards the visible-light modality, and to evaluate the proposed framework, we have constructed a novel dataset named KongTong-boat (KT-boat). It comprises 2,826 high-resolution images (1,443 RGB images and 1,383 IR images) of 117 distinct vessels. This dataset can serve as a new benchmark for evaluating cross-modality vessel ReID algorithms, filling a critical gap in the field. Experimental results on the KT-boat dataset demonstrate the effectiveness of CAD-Net for cross-modality ReID. Notably, on the KT-boat and RegDB datasets, CAD-Net consistently outperforms state-of-the-art cross-modality ReID algorithms from general cross-modality pedestrian benchmarks across key evaluation metrics, including rank-1 accuracy and mean Average Precision (mAP).
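The two metrics named in the abstract, rank-1 and mAP, are standard retrieval measures in ReID. The following is a minimal sketch of how they are commonly computed from a query-by-gallery distance matrix, not the paper's evaluation code. The function name `rank1_and_map`, the toy distances, and the identity labels are hypothetical, and the sketch assumes a simple protocol without the same-camera filtering that some benchmarks apply.

```python
from typing import List, Sequence, Tuple


def rank1_and_map(
    dist: Sequence[Sequence[float]],
    query_ids: Sequence[int],
    gallery_ids: Sequence[int],
) -> Tuple[float, float]:
    """Return (rank-1 accuracy, mAP) for a query-by-gallery distance matrix."""
    rank1_hits = 0
    average_precisions: List[float] = []
    for q, row in enumerate(dist):
        # Sort gallery indices from nearest to farthest for this query.
        order = sorted(range(len(row)), key=lambda g: row[g])
        matches = [gallery_ids[g] == query_ids[q] for g in order]
        if not any(matches):
            continue  # skip queries whose identity is absent from the gallery
        rank1_hits += int(matches[0])
        # Average precision: mean of the precision at each correct match.
        hits = 0
        precisions: List[float] = []
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precisions.append(hits / rank)
        average_precisions.append(sum(precisions) / len(precisions))
    n = len(average_precisions)
    if n == 0:
        raise ValueError("no query has a matching gallery identity")
    return rank1_hits / n, sum(average_precisions) / n


# Hypothetical toy case: 2 IR queries matched against 3 RGB gallery images.
toy_dist = [[0.1, 0.5, 0.3], [0.2, 0.4, 0.9]]
toy_rank1, toy_map = rank1_and_map(toy_dist, [0, 1], [0, 1, 0])
print(toy_rank1, toy_map)  # prints 0.5 0.75
```

In the toy case, the first query retrieves both of its matches at the top two ranks (average precision 1.0), while the second finds its only match at rank 2 (average precision 0.5), giving a rank-1 accuracy of 0.5 and an mAP of 0.75.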
Environmental Science, Engineering, Computer Science
What problem does this paper attempt to address?