Abstract: Novel view synthesis is a long-standing problem that revolves around rendering frames of a scene from novel camera viewpoints. Volumetric approaches provide a solution for modeling occlusions through an explicit 3D representation of the camera frustum. Multi-plane Images (MPI) are volumetric methods that represent the scene using fronto-parallel planes at distinct depths, but they suffer from depth discretization, which leads to a 2.5D scene representation. Another line of work relies on implicit 3D scene representations. Neural Radiance Fields (NeRF) use neural networks to encode the continuous 3D scene structure within the network weights, achieving photorealistic synthesis results; however, these methods are constrained to per-scene optimization, which is inefficient in practice. Multi-plane Neural Radiance Fields (MINE) open the door to combining implicit and explicit scene representations. They enable continuous 3D scene representations, especially in the depth dimension, while using the input image features to avoid per-scene optimization. The main drawback of existing work in this domain is its restriction to single-view input, which limits synthesis to a narrow range of viewpoints. In this work, we thoroughly examine the performance, generalization, and efficiency of single-view multi-plane neural radiance fields. In addition, we propose a new multi-plane NeRF architecture that accepts multiple views to improve the synthesis results and expand the viewing range. Features from the input source frames are fused through a proposed attention-aware fusion module that highlights important information from the different viewpoints. Experiments show the effectiveness of attention-based fusion and the promising outcomes of our proposed method when compared to multi-view NeRF and MPI techniques.
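
To make the multi-plane idea in the abstract concrete, the following is a minimal sketch of how a generic MPI renders one pixel: plane depths are sampled at a fixed set of values, and the per-plane colors and opacities are blended front to back with the standard "over" operator. It is not the architecture proposed in this work. The function names `plane_depths` and `composite_planes`, the choice of uniform sampling in disparity, and the per-pixel list layout are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Color = Tuple[float, float, float]


def plane_depths(near: float, far: float, num_planes: int) -> List[float]:
    """Return plane depths spaced uniformly in disparity (1 / depth).

    Assumption: uniform disparity sampling, a common choice for MPIs.
    The fixed, finite set of depths is the discretization the abstract
    refers to.
    """
    if num_planes < 2:
        raise ValueError("need at least two planes")
    d_near, d_far = 1.0 / near, 1.0 / far
    step = (d_far - d_near) / (num_planes - 1)
    return [1.0 / (d_near + i * step) for i in range(num_planes)]


def composite_planes(colors: Sequence[Color], alphas: Sequence[float]) -> Color:
    """Blend one pixel's plane samples with the 'over' operator.

    colors[i] and alphas[i] belong to plane i; index 0 is nearest
    to the camera.
    """
    if len(colors) != len(alphas):
        raise ValueError("colors and alphas must have the same length")
    out = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light not blocked by nearer planes
    for color, alpha in zip(colors, alphas):
        weight = transmittance * alpha
        for c in range(3):
            out[c] += weight * color[c]
        transmittance *= 1.0 - alpha
    return (out[0], out[1], out[2])
```

For example, a half-transparent red plane in front of an opaque blue plane contributes equally with it, while a fully opaque front plane hides everything behind it. This is how an explicit plane stack models occlusion.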

Multi-Scale Feature Fusion for Single Image Novel View Synthesis

An Improved Novel View Synthesis Approach Based on Feature Fusion and Channel Attention

Depth-Assisted Full Resolution Network for Single Image-Based View Synthesis.

Novel View Synthesis from a Single Unposed Image via Unsupervised Learning

CMC: Few-shot Novel View Synthesis via Cross-view Multiplane Consistency

Novel View Synthesis with Pixel-Space Diffusion Models

A neural refinement network for single image view synthesis

ViewFusion: Towards Multi-View Consistency via Interpolated Denoising

Content-Aware Warping for View Synthesis

LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias

View synthesis with multiplane images from computationally generated RGB-D light fields

ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis

ViewFusion: Learning Composable Diffusion Models for Novel View Synthesis

Pixel-Aligned Multi-View Generation with Depth Guided Decoder

Efficient-3DiM: Learning a Generalizable Single-image Novel-view Synthesizer in One Day

Novel View Synthesis from only a 6-DoF Camera Pose by Two-stage Networks

Deep Learning based Novel View Synthesis

MultiDiff: Consistent Novel View Synthesis from a Single Image

Self-Supervised Visibility Learning for Novel View Synthesis

Multi-Plane Neural Radiance Fields for Novel View Synthesis

Remote Sensing Novel View Synthesis With Implicit Multiplane Representations