Terrain Diffusion Network: Climatic-Aware Terrain Generation with Geological Sketch Guidance

Zexin Hu, Kun Hu, Clinton Mo, Lei Pan, Zhiyong Wang
DOI: https://doi.org/10.48550/arXiv.2308.16725
2023-08-31
Abstract: Sketch-based terrain generation seeks to create realistic landscapes for virtual environments in various applications such as computer games, animation and virtual reality. Recently, deep learning based terrain generation has emerged, notably the ones based on generative adversarial networks (GAN). However, these methods often struggle to fulfill the requirements of flexible user control and maintain generative diversity for realistic terrain. Therefore, we propose a novel diffusion-based method, namely terrain diffusion network (TDN), which actively incorporates user guidance for enhanced controllability, taking into account terrain features like rivers, ridges, basins, and peaks. Instead of adhering to a conventional monolithic denoising process, which often compromises the fidelity of terrain details or the alignment with user control, a multi-level denoising scheme is proposed to generate more realistic terrains by taking into account fine-grained details, particularly those related to climatic patterns influenced by erosion and tectonic activities. Specifically, three terrain synthesisers are designed for structural, intermediate, and fine-grained level denoising purposes, which allow each synthesiser to concentrate on a distinct terrain aspect. Moreover, to maximise the efficiency of our TDN, we further introduce terrain and sketch latent spaces for the synthesisers with pre-trained terrain autoencoders. Comprehensive experiments on a new dataset constructed from NASA Topology Images clearly demonstrate the effectiveness of our proposed method, achieving state-of-the-art performance. Our code and dataset will be publicly available.
Computer Vision and Pattern Recognition, Artificial Intelligence, Multimedia
What problem does this paper attempt to address?
The paper addresses a limitation of existing terrain generation methods: they struggle to offer flexible user control while also maintaining generative diversity in realistic terrain. Specifically:

1. **Trade-off between user control and generation diversity**: GAN-based terrain generation methods can generate diverse terrains, but they follow user-provided conditions poorly (for example, structural features such as rivers, ridges, basins and peaks). As a result, the generated terrains may not meet users' expectations.
2. **Realism of geological detail and climatic influence**: Existing methods often fail to account for how geological processes (such as erosion and tectonic activity) and climatic patterns shape terrain, so the generated terrains lack realism.

To solve these problems, the authors propose a diffusion-based method, the Terrain Diffusion Network (TDN). TDN aims to generate terrains that both follow user control and look highly realistic by introducing a multi-level denoising scheme and user sketch guidance. Specific improvements include:

- **Multi-level denoising synthesizers**: TDN uses three synthesizers, at the structural, intermediate and fine-grained levels. Each synthesizer concentrates on a distinct aspect of the terrain, which improves both the diversity and the realism of the output.
- **User sketch guidance**: TDN lets users define the main structural features of the terrain (such as rivers, ridges, basins and peaks) by drawing sketches, and uses this information throughout the generation process. This improves user control and keeps the generated terrains in line with users' expectations.
- **Terrain and sketch latent spaces**: To improve efficiency, TDN uses pre-trained terrain and sketch autoencoders to compress terrains and sketches into low-dimensional latent spaces, which reduces the computational cost of generation.

In conclusion, the paper aims to generate terrains that are both diverse and realistic by combining user control with a climatic-aware multi-level denoising scheme, thereby addressing the shortcomings of current methods.
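
The multi-level idea summarized above can be illustrated with a toy sketch. This is not the authors' implementation, and the summary does not say how the three levels are scheduled. The sketch assumes that each level handles one consecutive third of the reverse diffusion timesteps, with the noisiest third going to the structural synthesizer. The names `pick_level`, `make_toy_synthesizer` and `multi_level_denoise`, the latent shape, and the "pull toward the sketch latent" update are all hypothetical placeholders for learned networks. Only the routing of timesteps to level-specific synthesizers, conditioned on a sketch latent, is the point.

```python
import numpy as np


def pick_level(t, num_steps):
    """Map timestep t (num_steps - 1 is noisiest, 0 is cleanest) to a level.

    Assumption for illustration: the schedule is split into equal thirds.
    """
    third = num_steps / 3
    if t >= 2 * third:
        return "structural"
    if t >= third:
        return "intermediate"
    return "fine"


def make_toy_synthesizer(strength):
    """Stand-in for a learned denoiser: nudges the latent toward the sketch latent."""
    def synthesizer(z, sketch_latent, t):
        return z + strength * (sketch_latent - z)
    return synthesizer


def multi_level_denoise(sketch_latent, synthesizers, num_steps=30, seed=0):
    """Run a reverse loop in latent space, routing each step to one synthesizer."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(sketch_latent.shape)  # start from Gaussian noise
    used = []
    for t in reversed(range(num_steps)):
        level = pick_level(t, num_steps)
        z = synthesizers[level](z, sketch_latent, t)
        used.append(level)
    return z, used


# Hypothetical step sizes: coarse levels move the latent more than fine ones.
synthesizers = {
    "structural": make_toy_synthesizer(0.30),
    "intermediate": make_toy_synthesizer(0.15),
    "fine": make_toy_synthesizer(0.05),
}
sketch_latent = np.zeros((4, 8, 8))  # placeholder for an encoded user sketch
z, used = multi_level_denoise(sketch_latent, synthesizers)
```

In a real system each synthesizer would be a trained network operating on autoencoder latents, and the final latent would be passed through the terrain decoder. Both are omitted here.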