MfrNet: A New Multi-Scale Feature Refining Method for Remote Sensing Image Change Captioning

Kaiqi Xu, Yingping Han, Rui Yang, Xiutiao Ye, Yanhe Guo, Hantong Xing, Shuang Wang
DOI: https://doi.org/10.1109/igarss53475.2024.10640584
2024-01-01
Abstract: Remote Sensing Image Change Captioning (RSICC) is an emerging multimodal field with promising prospects. This paper introduces MFRNet, a remote sensing image change captioning model based on multi-scale, refined features. The model first extracts multi-scale features from the dual-temporal images and feeds them into the JointAtt and Dense Fusion (JADF) module, which applies attention-based mutual guidance and feature refinement to suppress noise. The refined features are then passed to a transformer-based sentence generator, which produces the change caption. In experiments on the LEVIR-CC dataset, MFRNet outperforms the compared state-of-the-art methods on all reported metrics.
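The abstract does not specify how the JADF module is built, so the following is only a minimal sketch of one plausible reading of "attention mutual guidance" followed by fusion, not the authors' implementation. It assumes that each temporal image's tokens attend to the other image's tokens through plain scaled dot-product attention with a residual connection, and that fusion is a simple concatenation across time steps and scales. The function names (`cross_attend`, `mutual_guidance`, `fuse_scales`) are hypothetical, and learned projections, multiple heads, and the paper's actual dense fusion design are all omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys_values):
    """Scaled dot-product attention: rebuild each query token
    as a weighted average of the other image's tokens."""
    dim = len(queries[0])
    attended = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
                  for k in keys_values]
        weights = softmax(scores)
        attended.append([sum(w * v[j] for w, v in zip(weights, keys_values))
                         for j in range(dim)])
    return attended


def add_tokens(x, y):
    """Element-wise residual sum of two token lists of equal shape."""
    return [[a + b for a, b in zip(row_x, row_y)] for row_x, row_y in zip(x, y)]


def mutual_guidance(feat_before, feat_after):
    """Assumed stand-in for 'attention mutual guidance': each time
    step attends to the other, then adds the result to itself."""
    guided_before = add_tokens(feat_before, cross_attend(feat_before, feat_after))
    guided_after = add_tokens(feat_after, cross_attend(feat_after, feat_before))
    return guided_before, guided_after


def fuse_scales(scales_before, scales_after):
    """Apply mutual guidance at every scale, concatenate the two time
    steps per token, and stack the tokens of all scales into one list
    (a simplified placeholder for the fusion step)."""
    fused = []
    for feat_before, feat_after in zip(scales_before, scales_after):
        guided_before, guided_after = mutual_guidance(feat_before, feat_after)
        fused.extend(b + a for b, a in zip(guided_before, guided_after))
    return fused
```

In this sketch each scale is a list of tokens and each token is a list of floats with the same width at every scale; the output of `fuse_scales` is the token sequence that a transformer-based caption decoder would consume.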