Abstract: Style-based GANs achieve state-of-the-art results for generating high-quality images, but lack explicit and precise control over camera poses. Recently proposed NeRF-based GANs have made great progress towards 3D-aware image generation. However, these methods either rely on convolution operators, which are not rotationally invariant, or use complex yet suboptimal training procedures to integrate NeRF and CNN sub-structures, yielding non-robust, low-quality images at a large computational cost. Building on our open-source CIPS-3D framework (https://github.com/PeterouZh/CIPS-3D), this paper presents an upgraded version called CIPS-3D++, which aims at robust, high-resolution, and efficient 3D-aware GANs. On the one hand, our basic model CIPS-3D, encapsulated in a style-based architecture, features a shallow NeRF-based 3D shape encoder and a deep MLP-based 2D image decoder, achieving robust, rotation-invariant image generation and editing. On the other hand, our proposed CIPS-3D++ inherits the rotational invariance of CIPS-3D and adds geometric regularization and upsampling operations, which enables high-resolution, high-quality image generation and editing with high computational efficiency. Trained on raw single-view images, without any bells and whistles, CIPS-3D++ sets new records for 3D-aware image synthesis, with an FID of 3.2 on FFHQ at 1024×1024 resolution. At the same time, CIPS-3D++ runs efficiently and has a low GPU memory footprint, so it can be trained end-to-end on high-resolution images directly, in contrast to previous alternating or progressive training methods. Based on the infrastructure of CIPS-3D++, we propose a 3D-aware GAN inversion algorithm named FlipInversion, which can reconstruct a 3D object from a single-view image. We also provide a 3D-aware stylization method for real images based on CIPS-3D++ and FlipInversion.
In addition, we analyze the mirror-symmetry problem that arises during training and solve it by introducing an auxiliary discriminator for the NeRF network. Overall, CIPS-3D++ provides a strong base model that can serve as a testbed for transferring GAN-based image editing methods from 2D to 3D. Our open-source project and accompanying demo videos are available at https://github.com/PeterouZh/CIPS-3Dplusplus.

Learning Reconstruction Models of Textured 3D Mesh Using StyleGAN2

3D-Aware Image Synthesis Via Learning Structural and Textural Representations

2D GANs Meet Unsupervised Single-view 3D Reconstruction

3D-Mask-GAN: Unsupervised Single-View 3D Object Reconstruction

Monocular 3D Object Reconstruction with GAN Inversion

Progressive Learning of 3D Reconstruction Network from 2D GAN Data

Enhanced 3D Shape Reconstruction With Knowledge Graph of Category Concept

Semi-supervised Three-dimensional Reconstruction Framework with GAN

Self-Supervised Geometry-Aware Encoder for Style-Based 3D GAN Inversion

Unsupervised Style-based Explicit 3D Face Reconstruction from Single Image

Get3DHuman: Lifting StyleGAN-Human into a 3D Generative Model using Pixel-aligned Reconstruction Priors

Fine Detailed Texture Learning for 3D Meshes with Generative Models

3D-GANTex: 3D Face Reconstruction with StyleGAN3-based Multi-View Images and 3DDFA based Mesh Generation

CIPS-3D++: End-to-End Real-Time High-Resolution 3D-Aware GANs for GAN Inversion and Stylization

Single-view 3D Mesh Reconstruction for Seen and Unseen Categories

High-Quality Textured 3D Shape Reconstruction with Cascaded Fully Convolutional Networks

Geometry aware 3D generation from in-the-wild images in ImageNet

Lifting 2D StyleGAN for 3D-Aware Face Generation

Texture-Shape Optimized GAT for 3D Face Reconstruction

Neural 3D Face Rendering Conditioned on 2D Appearance Via GAN Disentanglement Method

Reconstruction of three-dimension digital rock guided by prior information with a combination of InfoGAN and style-based GAN