SatDM: Synthesizing Realistic Satellite Image with Semantic Layout Conditioning using Diffusion Models

Orkhan Baghirli, Hamid Askarov, Imran Ibrahimli, Ismat Bakhishov, Nabi Nabiyev
2023-09-29
Abstract: Deep learning models in the Earth Observation domain heavily rely on the availability of large-scale accurately labeled satellite imagery. However, obtaining and labeling satellite imagery is a resource-intensive endeavor. While generative models offer a promising solution to address data scarcity, their potential remains underexplored. Recently, Denoising Diffusion Probabilistic Models (DDPMs) have demonstrated significant promise in synthesizing realistic images from semantic layouts. In this paper, a conditional DDPM model capable of taking a semantic map and generating high-quality, diverse, and correspondingly accurate satellite images is implemented. Additionally, a comprehensive illustration of the optimization dynamics is provided. The proposed methodology integrates cutting-edge techniques such as variance learning, classifier-free guidance, and improved noise scheduling. The denoising network architecture is further complemented by the incorporation of adaptive normalization and self-attention mechanisms, enhancing the model's capabilities. The effectiveness of our proposed model is validated using a meticulously labeled dataset introduced within the context of this study. Validation encompasses both algorithmic methods such as Fréchet Inception Distance (FID) and Intersection over Union (IoU), as well as a human opinion study. Our findings indicate that the generated samples exhibit minimal deviation from real ones, opening doors for practical applications such as data augmentation. We look forward to further explorations of DDPMs in a wider variety of settings and data modalities. An open-source reference implementation of the algorithm and a link to the benchmarked dataset are provided at https://github.com/obaghirli/syn10-diffusion.
Computer Vision and Pattern Recognition, Image and Video Processing
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

The paper addresses the difficulty of obtaining large-scale, accurately labeled satellite images in the field of Earth observation. Specifically, it proposes a conditional generative model based on diffusion models that synthesizes realistic satellite images from semantic layouts. This approach alleviates data scarcity and improves the quality and diversity of generated satellite images, which makes it useful for practical applications such as data augmentation.

### Main Contributions

1. **SAT25K Dataset**: Compiled a meticulously curated building footprint dataset, including image slices and their corresponding semantic layouts.
2. **Exploration of Diffusion Models**: Conducted an in-depth study on the application of diffusion models in synthesizing satellite images.
3. **SatDM Model**: Developed a high-performance conditional diffusion model specifically for image generation under semantic layout conditions.
4. **Open Source Code and Model Weights**: Released the source code and model weights to promote reproducible research.

### Method Overview

- **Loss Function**: The model is trained by maximizing the Evidence Lower Bound (ELBO), which amounts to minimizing the KL divergence between the forward-process posterior and the learned reverse process.
- **Sampling Process**: Sampling starts from standard Gaussian noise and iteratively denoises it to produce the final satellite image.
- **Denoising Network**: A time-conditioned U-Net that injects the semantic layout through Spatially-Adaptive Normalization (SPADE) modules, so that the semantic information is not washed out by the convolution and normalization layers.

### Experimental Validation

- **Quantitative Evaluation**: Evaluated using algorithmic metrics such as Fréchet Inception Distance (FID) and Intersection over Union (IoU).
- **Qualitative Evaluation**: Verified the quality of generated samples through a human opinion study.

### Conclusion

The paper demonstrates that the proposed conditional diffusion model can generate high-quality satellite images with limited data and computational resources, opening up practical applications such as data augmentation. Future research will further explore the application of diffusion models in more settings and data modalities.
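
As a rough illustration of some building blocks named above, the following pure-Python sketch shows a noise schedule, the closed-form forward noising step, the classifier-free guidance combination, and the IoU metric. It is not taken from the reference implementation. The function names are hypothetical, the cosine schedule is an assumption (the text only says "improved noise scheduling"), and the guidance formula follows one common convention that may differ from the one used in the paper.

```python
import math
import random


def cosine_alpha_bar(T, s=0.008):
    """Cumulative signal rate alpha_bar_t for t = 0..T.

    Assumption: the cosine schedule; the source does not name its schedule.
    """
    f = [math.cos((t / T + s) / (1 + s) * math.pi / 2) ** 2 for t in range(T + 1)]
    return [v / f[0] for v in f]


def q_sample(x0, t, alpha_bar, rng):
    """Forward process: draw x_t ~ q(x_t | x_0) in closed form.

    x0 is a flat list of pixel values; returns the noisy sample and the noise used.
    """
    a = alpha_bar[t]
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * n for x, n in zip(x0, noise)]
    return xt, noise


def guided_eps(eps_cond, eps_uncond, w):
    """Classifier-free guidance: push the noise estimate away from the
    unconditional prediction. Assumed convention: (1 + w) * cond - w * uncond."""
    return [(1.0 + w) * c - w * u for c, u in zip(eps_cond, eps_uncond)]


def iou(pred, target):
    """Intersection over Union for two flat binary masks of equal length."""
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    union = sum(1 for p, t in zip(pred, target) if p or t)
    return inter / union if union else 1.0  # two empty masks count as a match


if __name__ == "__main__":
    rng = random.Random(0)
    alpha_bar = cosine_alpha_bar(T=1000)
    x0 = [0.5, -0.25, 1.0, 0.0]  # toy "image" with four pixels
    xt, noise = q_sample(x0, t=500, alpha_bar=alpha_bar, rng=rng)
    print("noisy sample:", xt)
    print("IoU:", iou([1, 1, 0, 0], [1, 0, 1, 0]))
```

In this sketch, `guided_eps` with `w = 0` returns the conditional estimate unchanged, and larger `w` weights the semantic layout more heavily. In a real pipeline the two noise estimates would come from the denoising network evaluated with and without the semantic map, and the IoU would compare a segmentation of the generated image against the conditioning layout.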