Discovering Interpretable Latent Space Directions for 3D-Aware Image Generation

Zhiyuan Yang, Qingfu Zhang
DOI: https://doi.org/10.1109/tetci.2024.3369319
2024-01-01
IEEE Transactions on Emerging Topics in Computational Intelligence
Abstract: 2D GANs have yielded impressive results, especially in image synthesis. However, they often encounter challenges with multi-view inconsistency due to the absence of 3D perception in their generation process. To overcome this shortcoming, 3D-aware GANs have been proposed to take advantage of both 3D representation methods and GANs, but editing their semantic attributes remains very difficult. To explore the semantic disentanglement in the 3D-aware latent space, this paper proposes a general framework and presents two representative approaches for the 3D manipulation task in both supervised and unsupervised manners. Our key idea is to utilize existing latent discovery methods and bring direct compatibility to 3D control. Specifically, we propose a novel module to extract the semantic latent space of the existing 3D-aware models, and then develop two approaches to find a normal editing direction in the latent space. Leveraging the meaningful semantic latent directions, we can easily edit the shape and appearance attributes while preserving the 3D consistency. Quantitative and qualitative experiments show that our method is effective and efficient for the 3D-aware generation with steerability on both synthetic and real-world datasets.