Pixel-Level Change Detection Pseudo-Label Learning for Remote Sensing Change Captioning

Chenyang Liu, Keyan Chen, Zipeng Qi, Haotian Zhang, Zhengxia Zou, Zhenwei Shi

2024-05-21

Abstract: The existing methods for Remote Sensing Image Change Captioning (RSICC) perform well in simple scenes but exhibit poorer performance in complex scenes. This limitation is primarily attributed to the model's constrained visual ability to distinguish and locate changes. Acknowledging the inherent correlation between change detection (CD) and RSICC tasks, we believe pixel-level CD is significant for describing the differences between images through language. Regrettably, the current RSICC dataset lacks readily available pixel-level CD labels. To address this deficiency, we leverage a model trained on existing CD datasets to derive CD pseudo-labels. We propose an innovative network with an auxiliary CD branch, supervised by pseudo-labels. Furthermore, a semantic fusion augment (SFA) module is proposed to fuse the feature information extracted by the CD branch, thereby facilitating the nuanced description of changes. Experiments demonstrate that our method achieves state-of-the-art performance and validate that learning pixel-level CD pseudo-labels significantly contributes to change captioning. Our code will be available at:

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

The paper addresses the poor performance of current Remote Sensing Image Change Captioning (RSICC) methods in complex scenes. Existing methods perform well in simple scenes but degrade in complex ones, mainly because the models have a limited visual ability to distinguish and locate changes. The authors argue that pixel-level change detection (CD) is important for describing differences between images in language, yet the current RSICC dataset lacks readily available pixel-level CD labels. To work around this, they use a model trained on existing CD datasets to generate CD pseudo-labels, and they propose a network with an auxiliary CD branch supervised by those pseudo-labels. A semantic fusion augment (SFA) module then fuses the feature information extracted by the CD branch into the captioning pipeline, supporting more detailed change descriptions. According to the reported experiments, the method achieves state-of-the-art performance on the RSICC task, and the results indicate that learning from pixel-level CD pseudo-labels contributes significantly to change captioning.
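The pseudo-label idea can be illustrated with a small sketch. It is not the authors' implementation: the binary change mask, the 0.5 threshold, the binary cross-entropy loss, the weighted-sum objective, and every function name below (`make_pseudo_label`, `bce_loss`, `joint_loss`) are assumptions made for illustration, and the page above does not state which loss or label format the paper actually uses.

```python
import math


def make_pseudo_label(change_prob, threshold=0.5):
    """Binarize a per-pixel change-probability map into a 0/1 mask.

    `change_prob` stands in for the output of a CD model trained on an
    existing CD dataset (hypothetical values, assumed binary labels).
    """
    return [[1 if p >= threshold else 0 for p in row] for row in change_prob]


def bce_loss(pred_prob, target, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and a 0/1 mask."""
    total, count = 0.0, 0
    for pred_row, target_row in zip(pred_prob, target):
        for p, t in zip(pred_row, target_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
            count += 1
    return total / count


def joint_loss(caption_loss, cd_loss, cd_weight=1.0):
    """Combine the captioning loss with the auxiliary CD loss (assumed weighted sum)."""
    return caption_loss + cd_weight * cd_loss


# Toy 2x2 example with made-up numbers.
external_cd_output = [[0.9, 0.2], [0.6, 0.1]]   # from the externally trained CD model
pseudo_label = make_pseudo_label(external_cd_output)

branch_prediction = [[0.8, 0.3], [0.4, 0.2]]    # from the auxiliary CD branch
cd_loss = bce_loss(branch_prediction, pseudo_label)
total_loss = joint_loss(caption_loss=2.0, cd_loss=cd_loss, cd_weight=0.5)
```

In this sketch the pseudo-label is treated as a fixed target, so only the auxiliary branch is penalized for disagreeing with it; the captioning loss value is a placeholder for whatever sequence loss the captioning decoder produces.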

Semantic-CC: Boosting Remote Sensing Image Change Captioning via Foundational Knowledge and Semantic Guidance

A Decoupling Paradigm with Prompt Learning for Remote Sensing Image Change Captioning

Detection Assisted Change Captioning for Remote Sensing Image

Beyond Pixel-Level Annotation: Exploring Self-Supervised Learning for Change Detection With Image-Level Supervision

CSDACD: Domain-adaptive Change Detection Network for Cross-seasonal Remote Sensing Images

Change Captioning for Satellite Images Time Series

Intertemporal Interaction and Symmetric Difference Learning for Remote Sensing Image Change Captioning

Language-Guided Semantic Clustering for Remote Sensing Change Detection

Inter-Temporal Interaction and Symmetric Difference Learning for Remote Sensing Image Change Captioning

Remote Sensing Image Change Captioning Using Multi-Attentive Network with Diffusion Model

Diffusion-RSCC: Diffusion Probabilistic Model for Change Captioning in Remote Sensing Images

ChangeCLIP: Remote sensing change detection with multimodal vision-language representation learning

Changes to Captions: An Attentive Network for Remote Sensing Change Captioning

Contrastive Scene Change Representation Learning for High-Resolution Remote Sensing Scene Change Detection

Reliable Contrastive Learning for Semi-Supervised Change Detection in Remote Sensing Images

Progressive Scale-aware Network for Remote sensing Image Change Captioning

Enhancing Perception of Key Changes in Remote Sensing Image Change Captioning

Remote Sensing Image Semantic Change Detection Boosted by Semi-supervised Contrastive Learning of Semantic Segmentation

Remote Sensing Image Change Captioning With Dual-Branch Transformers: A New Method and a Large Scale Dataset

RaSRNet: an End-to-End Relation-Aware Semantic Reasoning Network for Change Detection in Optical Remote Sensing Images