Spatial–Channel Attention Transformer With Pseudo Regions for Remote Sensing Image-Text Retrieval
Cuili Xu, Hang Liu, Yinxuan Hou, Huihui Li, Dongqing Wu, Lei Guo, Gong Cheng
DOI: https://doi.org/10.1109/TGRS.2024.3395313
IF: 8.2
IEEE Transactions on Geoscience and Remote Sensing
Abstract: Recently, remote sensing image-text retrieval (RSITR) has received significant attention due to its flexible query form and effective management of remote sensing images. However, prior work often relies on compact global features and ignores local features that can reflect salient objects in the images. Moreover, these methods primarily model interactions between features in the spatial domain, which is insufficient for mining the rich semantic information present in remote sensing images. In this article, we propose a novel spatial-channel attention transformer (SCAT) with pseudo regions to address these issues. Concretely, to acquire fine-grained perception of local objects, we introduce a pseudo region generation (PRG) module that adaptively aggregates grid features with similar semantic information into multiple clusters through a clustering algorithm. The generated cluster centers can flexibly and efficiently represent local objects in remote sensing images without relying on sophisticated object detectors. Furthermore, to achieve a comprehensive understanding of image semantic information, we carefully construct a novel SCAT. By exploiting spatial and channel attention to explore the dependencies between features in both the spatial and channel domains, the proposed SCAT enhances the model’s ability to identify both “where to look” and “what it is,” thereby obtaining a more powerful representation. In addition, SCAT incorporates two novel designs that alleviate the high overhead caused by attention modeling. Extensive experiments on two benchmark datasets, RSICD and RSITMD, demonstrate the effectiveness and superiority of our proposed method.
Environmental Science, Computer Science
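
The two ideas named in the abstract, clustering grid features into pseudo regions and attending over both the spatial and channel domains, can be illustrated with a small sketch. This is not the authors' implementation. The abstract does not say which clustering algorithm PRG uses, so plain k-means with a deterministic initialization is assumed here. The attention is unparameterized (no learned projections), and the two overhead-reducing designs mentioned in the abstract are not reproduced. All function names (`pseudo_regions`, `spatial_attention`, `channel_attention`) are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(m):
    return [list(col) for col in zip(*m)]


def pseudo_regions(grid_feats, k, iters=10):
    """Assumed stand-in for PRG: plain k-means over N grid features.

    Returns k cluster centers, each acting as one pseudo region.
    Requires 1 <= k <= len(grid_feats).
    """
    # Assumption: deterministic init from evenly spaced grid cells.
    step = len(grid_feats) // k
    centers = [list(grid_feats[i * step]) for i in range(k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for f in grid_feats:
            dists = [sum((x - c) ** 2 for x, c in zip(f, ctr)) for ctr in centers]
            buckets[dists.index(min(dists))].append(f)
        for j, members in enumerate(buckets):
            if members:  # keep the old center if a cluster goes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def attention(tokens):
    """Unparameterized self-attention: softmax(T T^T / sqrt(d)) T."""
    d = len(tokens[0])
    weights = [softmax([dot(q, k) / math.sqrt(d) for k in tokens]) for q in tokens]
    cols = transpose(tokens)
    out = [[dot(w, col) for col in cols] for w in weights]
    return out, weights


def spatial_attention(feats):
    # feats is N x C; each region is a token, so the attention map is N x N.
    return attention(feats)


def channel_attention(feats):
    # Each channel is a token (C x N), so the attention map is C x C.
    out, weights = attention(transpose(feats))
    return transpose(out), weights  # output is back in N x C layout


# Toy example: 4 grid cells with 3 channels, grouped into 2 pseudo regions.
grid = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.2], [0.1, 0.9, 0.1]]
regions = pseudo_regions(grid, k=2)
_, spatial_map = spatial_attention(regions)
_, channel_map = channel_attention(regions)
print(len(spatial_map), len(spatial_map[0]))  # shape of the region-to-region map
print(len(channel_map), len(channel_map[0]))  # shape of the channel-to-channel map
```

The spatial map relates pseudo regions to one another ("where to look"), while the channel map relates feature channels to one another ("what it is"). Both branches return features in the same N x C layout, so in this sketch they could be summed or concatenated; how the paper fuses them is not stated in the abstract.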