DepthGAN: GAN-based depth generation from semantic layouts

Yidi Li, Jun Xiao, Yiqun Wang, Zhengda Lu

DOI: https://doi.org/10.1007/s41095-023-0350-8

IF: 4.1268

2024-04-27

Computational Visual Media

Abstract: Existing GAN-based generative methods are typically used for semantic image synthesis. We pose the question of whether GAN-based architectures can generate plausible depth maps, and find that existing methods have difficulty generating depth maps that reasonably represent 3D scene structure because they lack global geometric correlations. Thus, we propose DepthGAN, a novel method of generating a depth map from a semantic layout to aid the construction and manipulation of well-structured 3D scene point clouds. Specifically, we first build a feature generation model with a cascade of semantically aware transformer blocks to obtain depth features with global structural information. For our semantically aware transformer block, we propose a mixed attention module and a semantically aware layer normalization module to better exploit semantic consistency for depth feature generation. Moreover, we present a novel semantically weighted depth synthesis module, which generates adaptive depth intervals for the current scene. We generate the final depth map by using a weighted combination of semantically aware depth weights for different depth ranges. In this manner, we obtain a more accurate depth map. Extensive experiments on indoor and outdoor datasets demonstrate that DepthGAN achieves superior results both quantitatively and visually for the depth generation task.
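
The weighted combination over depth ranges described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `synthesize_depth`, the softmax normalization of interval widths and per-pixel weights, the use of interval centers, and the default depth range `d_min`/`d_max` are all assumptions chosen for illustration. The sketch only shows the general idea of splitting a depth range into scene-adaptive intervals and blending them per pixel.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def synthesize_depth(width_logits, weight_logits, d_min=0.1, d_max=10.0):
    """Blend K adaptive depth intervals into a depth map (illustrative only).

    width_logits:  shape (K,), one score per interval for the whole scene.
    weight_logits: shape (K, H, W), one score per interval per pixel.
    d_min, d_max:  hypothetical depth range, not taken from the paper.
    """
    # Normalized widths partition [d_min, d_max] into K adaptive intervals.
    widths = softmax(width_logits, axis=0) * (d_max - d_min)
    edges = d_min + np.concatenate([[0.0], np.cumsum(widths)])
    centers = 0.5 * (edges[:-1] + edges[1:])

    # Per-pixel weights over the K intervals sum to 1.
    weights = softmax(weight_logits, axis=0)

    # Final depth is the weighted combination of the interval centers.
    depth = np.tensordot(centers, weights, axes=(0, 0))
    return depth, centers


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    depth, centers = synthesize_depth(
        rng.normal(size=8), rng.normal(size=(8, 4, 5))
    )
    print(depth.shape, centers)
```

Because the per-pixel weights form a convex combination of the interval centers, every predicted depth stays inside the chosen range, and the interval widths can shift resolution toward the depths that matter for a given scene.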

computer science, software engineering

What problem does this paper attempt to address?

This paper addresses the difficulty that existing methods based on generative adversarial networks (GANs) have in generating depth maps from semantic layouts. In particular, these methods struggle to produce depth maps that reasonably represent 3D scene structure because they cannot capture global geometric correlations. Specifically, the paper makes three points:

1. **Limitations of Existing Methods**: Existing methods based on convolutional neural networks (CNNs) focus only on local information because of their limited receptive fields. They cannot accurately predict the global geometric relationships between different objects, so the generated depth maps are visually incoherent.
2. **Importance of Depth Maps**: As a 2.5D medium, a depth map measures the distance between objects and the camera in three-dimensional space, providing a transition from 2D images to 3D scenes. Generating accurate and plausible depth maps is therefore important for constructing 3D scenes.
3. **Proposal of a New Task**: The paper proposes a new task: generating accurate depth maps using only simple semantic layouts as input, to assist visual designers in constructing 3D scenes.

To solve these problems, the paper proposes DepthGAN, a GAN-based depth map generation method. It generates depth features containing global structural information through a cascade of semantically aware transformer blocks, and produces the final depth map through a semantically weighted depth synthesis module. Experimental results on indoor and outdoor datasets show that DepthGAN outperforms existing methods both quantitatively and visually.
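
To make the "transition from 2D images to 3D scenes" concrete, the sketch below back-projects a depth map into a point cloud using the standard pinhole camera model. It is an illustration rather than part of DepthGAN: the function name `depth_to_point_cloud` and the intrinsics `fx`, `fy`, `cx`, `cy` are assumptions, since the summary does not state which camera parameters the paper uses.

```python
import numpy as np


def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map to an (N, 3) point cloud.

    fx, fy: focal lengths in pixels (hypothetical values supplied by the caller).
    cx, cy: principal point in pixels (hypothetical values supplied by the caller).
    Pixels with non-positive depth are treated as invalid and dropped.
    """
    h, w = depth.shape
    # u indexes columns (image x), v indexes rows (image y).
    u, v = np.meshgrid(np.arange(w), np.arange(h))

    # Pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy

    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]


if __name__ == "__main__":
    demo_depth = np.full((2, 2), 2.0)
    print(depth_to_point_cloud(demo_depth, fx=1.0, fy=1.0, cx=0.0, cy=0.0))
```

Each valid pixel becomes one 3D point, which is why a single depth map yields only the surfaces visible from the camera and is described as 2.5D rather than full 3D.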

3D-Aware Image Synthesis Via Learning Structural and Textural Representations

Depth Generation Network: Estimating Real World Depth From Stereo And Depth Images

DCL: Differential Contrastive Learning for Geometry-Aware Depth Synthesis

Occlusion-aware Unsupervised Light Field Depth Estimation based on Multi-Scale GANs

Indoor Scene Generation from a Collection of Semantic-Segmented Depth Images

Unsupervised Learning of Depth Estimation and Camera Pose With Multi-Scale GANs

Domain-Transferred Synthetic Data Generation for Improving Monocular Depth Estimation

Depth Estimation from Monocular Image and Coarse Depth Points Based on Conditional GAN

Depth Information Precise Completion-GAN: A Precisely Guided Method for Completing Ill Regions in Depth Maps

Depth Structure Preserving Scene Image Generation

Unpaired Single-Image Depth Synthesis with cycle-consistent Wasserstein GANs

Depth Images Could Tell Us More: Enhancing Depth Discriminability for RGB-D Scene Recognition

RGB-Depth Fusion GAN for Indoor Depth Completion

RDFC-GAN: RGB-Depth Fusion CycleGAN for Indoor Depth Completion

DepGAN: Leveraging Depth Maps for Handling Occlusions and Transparency in Image Composition

Conditional Generative Adversarial Network for Monocular Image Depth Map Prediction

Transferring to Real-World Layouts: A Depth-aware Framework for Scene Adaptation

Deep Structured Generative Models

Subsurface Depths Structure Maps Reconstruction with Generative Adversarial Networks

Generative Adversarial Networks for Unsupervised Monocular Depth Prediction