A Method of Efficient Synthesizing Post-disaster Remote Sensing Image with Diffusion Model and LLM

Ming Wu, Chuang Zhang, Ruizhe Ou, Haotian Yan
DOI: https://doi.org/10.1109/APSIPAASC58517.2023.10317383
2023-10-31
Abstract: Current deep learning models are typically driven by big data, yet interpretation models for emergency management lack relevant training data. Existing pre-trained image generative models cannot directly generate post-disaster remote sensing images without fine-tuning. In this paper, we demonstrate that a pre-trained image generative model, fine-tuned with very few unlabelled images (i.e., fewer than 100 fine-tuning images) at very low training cost (i.e., one 2080Ti GPU), can synthesize disaster-affected remote sensing imagery under natural language guidance. To achieve this lower cost, we embrace the trend toward large models, leveraging a pre-trained caption model, GPT-4, and a pre-trained text-to-image Stable Diffusion model for this task. The Stable Diffusion model, fine-tuned with our method, successfully synthesizes remote sensing images affected by disasters using natural language guidance in both image inpainting and image generation tasks. In addition, the synthesized data provide ground truth for training other interpretation models. As a result, our method can synthesize a large amount of data for emergency management interpretation models to learn from when existing data is scarce, only unlabelled data is available, and time is limited, so that they achieve better interpretation performance. Furthermore, our approach highlights the significance of combining human feedback with large models when synthesizing data that lies outside the prior knowledge of the large model, especially when little data and little computational power are available.
Environmental Science, Computer Science