FoV-NeRF: Foveated Neural Radiance Fields for Virtual Reality

Nianchen Deng, Zhenyi He, Jiannan Ye, Budmonde Duinkharjav, Praneeth Chakravarthula, Xubo Yang, Qi Sun
DOI: https://doi.org/10.48550/arXiv.2103.16365
2022-07-22
Abstract: Virtual Reality (VR) is becoming ubiquitous with the rise of consumer displays and commercial VR platforms. Such displays require low latency and high quality rendering of synthetic imagery with reduced compute overheads. Recent advances in neural rendering showed promise of unlocking new possibilities in 3D computer graphics via image-based representations of virtual or physical environments. Specifically, the neural radiance fields (NeRF) demonstrated that photo-realistic quality and continuous view changes of 3D scenes can be achieved without loss of view-dependent effects. While NeRF can significantly benefit rendering for VR applications, it faces unique challenges posed by high field-of-view, high resolution, and stereoscopic/egocentric viewing, typically causing low quality and high latency of the rendered images. In VR, this not only harms the interaction experience but may also cause sickness. To tackle these problems toward six-degrees-of-freedom, egocentric, and stereo NeRF in VR, we present the first gaze-contingent 3D neural representation and view synthesis method. We incorporate the human psychophysics of visual- and stereo-acuity into an egocentric neural representation of 3D scenery. We then jointly optimize the latency/performance and visual quality while mutually bridging human perception and neural scene synthesis to achieve perceptually high-quality immersive interaction. We conducted both objective analysis and subjective studies to evaluate the effectiveness of our approach. We find that our method significantly reduces latency (up to 99% time reduction compared with NeRF) without loss of high-fidelity rendering (perceptually identical to full-resolution ground truth). The presented approach may serve as the first step toward future VR/AR systems that capture, teleport, and visualize remote environments in real-time.
Graphics, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses how to render Neural Radiance Fields (NeRF) efficiently in virtual reality (VR) applications, with low latency and high image quality. Conventional NeRF methods suffer from low image quality and high latency under the high field-of-view (FoV), high-resolution, and stereoscopic/egocentric viewing demands of VR. These problems not only harm the interaction experience but may also cause sickness. To overcome them, the paper proposes the first gaze-contingent 3D neural representation and view synthesis method.

### Specific problems solved by the paper:

1. **High latency**: Conventional NeRF rendering requires a large amount of computation, resulting in high latency, which is a serious problem for VR applications that must respond quickly.
2. **Low image quality**: At high field-of-view and high resolution, conventional NeRF methods cannot deliver high-quality images, especially in the foveal region, where the human eye is most sensitive to image quality.
3. **Poor fit to VR viewing**: Existing NeRF methods mainly target the "outside-in" viewing mode, whereas VR applications usually require "inside-out" (egocentric) viewing, so these methods perform poorly in VR environments.

### Solutions:

The paper solves the above problems through the following methods:

1. **Gaze-contingent neural radiance field representation**: The paper introduces a new 3D neural representation that incorporates the spatial/stereoscopic acuity and temporal sensitivity of the human visual system to achieve low latency and high image quality in VR.
2. **Adaptive monocular and stereoscopic acuity**: The paper designs two networks, one for the foveal region (high acuity) and one for the peripheral visual field (low acuity), and adaptively adjusts image resolution and stereo rendering, reducing computation while preserving visual quality.
3. **Real-time frame synthesis**: The paper proposes an image-based rendering method that quickly synthesizes the final display frame at runtime. A smooth transition function improves visual consistency between layers, and contrast enhancement further improves the visual fidelity of the peripheral region.
4. **Joint optimization of latency and quality**: Using a spatio-temporal model, the paper analyzes the relationship between latency and quality under different parameter settings and searches for the parameter configuration that best balances the two.

### Summary:

The gaze-contingent 3D neural representation and view synthesis method proposed in this paper addresses the high latency and low image quality that conventional NeRF methods exhibit in VR applications, opening new possibilities for future VR/AR systems.
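To illustrate the smooth transition between layers mentioned under real-time frame synthesis, here is a minimal Python sketch. It is not the paper's implementation. It assumes that the foveal and peripheral layers have already been rendered and resampled to the same display resolution as grayscale 2D lists, and that the blend weight depends only on pixel distance from the gaze point. The names `smoothstep`, `blend_layers`, `inner_radius`, and `outer_radius` are hypothetical, and the Hermite smoothstep is a common stand-in for whatever transition function the authors use.

```python
import math


def smoothstep(edge0, edge1, x):
    """Hermite interpolation: 0 at or below edge0, 1 at or above edge1."""
    t = min(max((x - edge0) / (edge1 - edge0), 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)


def blend_layers(fovea, periphery, gaze, inner_radius, outer_radius):
    """Blend a sharp foveal layer into a peripheral layer around the gaze point.

    fovea, periphery: 2D lists of floats with identical shape (assumption).
    gaze: (row, col) pixel position of the tracked gaze (assumption).
    inner_radius, outer_radius: pixel distances where the transition starts
    and ends (hypothetical parameters, not taken from the paper).
    """
    height, width = len(periphery), len(periphery[0])
    blended = []
    for r in range(height):
        row = []
        for c in range(width):
            # Distance from the gaze point, in pixels.
            dist = math.hypot(r - gaze[0], c - gaze[1])
            # Weight of the peripheral layer: 0 near the gaze, 1 far from it.
            w = smoothstep(inner_radius, outer_radius, dist)
            row.append((1.0 - w) * fovea[r][c] + w * periphery[r][c])
        blended.append(row)
    return blended


# Toy layers: foveal layer is all 1.0, peripheral layer is all 0.0.
size = 9
fovea = [[1.0] * size for _ in range(size)]
periphery = [[0.0] * size for _ in range(size)]
frame = blend_layers(fovea, periphery, gaze=(4, 4), inner_radius=1.0, outer_radius=3.0)
```

In this toy setup the pixel at the gaze point takes its value entirely from the foveal layer, pixels beyond `outer_radius` come entirely from the peripheral layer, and pixels in between mix the two without a visible seam. A real system would operate on full-color GPU textures and define the radii in degrees of visual angle rather than pixels.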