Inter-Temporal Interaction and Symmetric Difference Learning for Remote Sensing Image Change Captioning

Yunpeng Li, Xiangrong Zhang, Xina Cheng, Puhua Chen, Licheng Jiao
DOI: https://doi.org/10.1109/tgrs.2024.3462091
2024-01-01
Abstract: Remote sensing image change captioning (RSICC) is more challenging than remote sensing change detection because it requires both extracting the changes that have occurred between similar remote sensing image (RSI) pairs and generating a caption that describes them. However, few works have investigated RSICC; the main challenges are learning abundant change clues and bridging the modality gap. To address these problems, we rethink the task from the perspective of obtaining and aligning symmetrical change features for temporal RSIs. In the proposed network, intertemporal interaction and symmetric difference learning are cascaded through several multitemporal integration units to model differences from coarse to fine representations. Specifically, we design a cross-temporal attention (CTA) mechanism to model direct interaction between bi-temporal RSIs, encouraging information coupling between intralevel representations and suppressing irrelevant interference. To learn robust change features, a symmetric difference transformer (SDT) module is devised to guarantee temporal symmetry between the "before-to-after" and "after-to-before" change representations. In addition, a bi-directional triplet ranking loss is adopted to guide the network toward strongly discriminative and temporally symmetric change representations. Extensive experiments on the Dubai-CC and LEVIR-CC datasets demonstrate that our framework with the proposed components achieves excellent performance and surpasses recent state-of-the-art methods. Code is available at https://github.com/romanticLYP/TISDNet
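The abstract names a bi-directional triplet ranking loss but does not give its formula. As a rough illustration only, the sketch below shows the common hinge-based form of such a loss over two sets of paired embeddings, where the i-th vector in one set matches the i-th vector in the other. The function names, the cosine similarity, the margin value of 0.2, and the choice of which two embedding sets are paired are all assumptions made for this example and are not taken from the paper or its repository.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def bidirectional_triplet_ranking_loss(feats_a, feats_b, margin=0.2):
    """Hinge-based ranking loss applied in both directions.

    feats_a[i] and feats_b[i] are assumed to be a matched pair; every other
    index in the batch serves as a negative. The margin is a hypothetical
    default, not a value reported by the authors.
    """
    n = len(feats_a)
    # sim[i][j] = similarity between the i-th vector of A and the j-th vector of B
    sim = [[cosine(a, b) for b in feats_b] for a in feats_a]
    loss = 0.0
    for i in range(n):
        positive = sim[i][i]
        for j in range(n):
            if j == i:
                continue
            # Direction A -> B: a_i should score higher with b_i than with b_j
            loss += max(0.0, margin - positive + sim[i][j])
            # Direction B -> A: b_i should score higher with a_i than with a_j
            loss += max(0.0, margin - positive + sim[j][i])
    return loss / n


# Toy usage with two 2-D pairs
aligned = bidirectional_triplet_ranking_loss([[1, 0], [0, 1]], [[1, 0], [0, 1]])
swapped = bidirectional_triplet_ranking_loss([[1, 0], [0, 1]], [[0, 1], [1, 0]])
```

In this toy case the correctly aligned pairs incur no penalty, while the swapped pairs are penalized in both directions, which is the behavior a ranking loss of this kind is meant to enforce.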