ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Image
Kyle Sargent, Zizhang Li, Tanmay Shah, Charles Herrmann, Hong-Xing Yu, Yunzhi Zhang, Eric Ryan Chan, Dmitry Lagun, Li Fei-Fei, Deqing Sun, Jiajun Wu
2024-04-24
Abstract: We introduce a 3D-aware diffusion model, ZeroNVS, for single-image novel view synthesis for in-the-wild scenes. While existing methods are designed for single objects with masked backgrounds, we propose new techniques to address challenges introduced by in-the-wild multi-object scenes with complex backgrounds. Specifically, we train a generative prior on a mixture of data sources that capture object-centric, indoor, and outdoor scenes. To address issues from data mixture such as depth-scale ambiguity, we propose a novel camera conditioning parameterization and normalization scheme. Further, we observe that Score Distillation Sampling (SDS) tends to truncate the distribution of complex backgrounds during distillation of 360-degree scenes, and propose "SDS anchoring" to improve the diversity of synthesized novel views. Our model sets a new state-of-the-art result in LPIPS on the DTU dataset in the zero-shot setting, even outperforming methods specifically trained on DTU. We further adapt the challenging Mip-NeRF 360 dataset as a new benchmark for single-image novel view synthesis, and demonstrate strong performance in this setting. Our code and data are at http://kylesargent.github.io/zeronvs/
Computer Vision and Pattern Recognition, Graphics