Language-Assisted Siamese Contrastive Framework for Fine-Grained Remote Sensing Ship Image Retrieval

Yaohua Zhang, Zhizhuo Jiang, Yu Liu, Yaowen Li, Xueqian Wang, Yiming Zhang, Chenggang Yan
DOI: https://doi.org/10.1109/igarss53475.2024.10640466
2024-01-01
Abstract: As the number of remote sensing (RS) images increases, it is crucial to retrieve ship targets according to specific demands. Existing ship image retrieval methods extract features only from the image modality, so they may not fully utilize the rich text information available and they ignore the high-level hierarchical relations between ship classes. In this paper, we propose a language-assisted siamese contrastive framework, namely LASCF, for fine-grained ship retrieval in RS images. In LASCF, siamese vision models are employed to measure the similarity between images. Moreover, a label text encoder with a pretrained language model is designed to extract high-level semantic information from the labels, so that information about the hierarchical relations between ship classes is fused into LASCF. Finally, a multimodal similarity measurement module based on contrastive learning is proposed to optimize the siamese vision models. The experimental results show that the proposed LASCF outperforms several existing state-of-the-art methods.
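The abstract does not give the exact loss used in LASCF, so the following is only a minimal sketch of how a contrastive objective could combine image-to-image similarity from two siamese branches with image-to-label-text similarity. Everything here is an assumption made for illustration: the function names, the InfoNCE-style formulation, the `temperature` and `text_weight` parameters, and the use of precomputed embeddings in place of real vision and language encoders.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def cross_entropy(logits, targets):
    """Mean cross-entropy over rows, using a numerically stable log-softmax."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(targets)), targets].mean())


def multimodal_contrastive_loss(img_a, img_b, label_text_emb, labels,
                                temperature=0.1, text_weight=0.5):
    """Hypothetical combined loss (not the paper's definition).

    img_a, img_b   : (N, D) embeddings from the two siamese branches,
                     where row i of each is a same-class pair.
    label_text_emb : (C, D) one text embedding per ship class label.
    labels         : (N,) integer class index of each pair.
    """
    a = l2_normalize(img_a)
    b = l2_normalize(img_b)
    t = l2_normalize(label_text_emb)

    # Image-to-image term: pair (a_i, b_i) is the positive, other rows are
    # negatives. This assumes each class appears at most once per batch.
    img_loss = cross_entropy(a @ b.T / temperature, np.arange(len(a)))

    # Image-to-text term: each image should be closest to the embedding of
    # its own class label among all class labels.
    txt_loss = cross_entropy(a @ t.T / temperature, labels)

    return (1.0 - text_weight) * img_loss + text_weight * txt_loss
```

In this sketch the text term is what would let label semantics (for example, related ship classes having nearby text embeddings) shape the image embedding space; at retrieval time only the image branch would be needed.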