Acoustic Volume Rendering for Neural Impulse Response Fields

Zitong Lan, Chenhao Zheng, Zhiwei Zheng, Mingmin Zhao
2024-11-10
Abstract: Realistic audio synthesis that captures accurate acoustic phenomena is essential for creating immersive experiences in virtual and augmented reality. Synthesizing the sound received at any position relies on the estimation of impulse response (IR), which characterizes how sound propagates in one scene along different paths before arriving at the listener's position. In this paper, we present Acoustic Volume Rendering (AVR), a novel approach that adapts volume rendering techniques to model acoustic impulse responses. While volume rendering has been successful in modeling radiance fields for images and neural scene representations, IRs present unique challenges as time-series signals. To address these challenges, we introduce frequency-domain volume rendering and use spherical integration to fit the IR measurements. Our method constructs an impulse response field that inherently encodes wave propagation principles and achieves state-of-the-art performance in synthesizing impulse responses for novel poses. Experiments show that AVR surpasses current leading methods by a substantial margin. Additionally, we develop an acoustic simulation platform, AcoustiX, which provides more accurate and realistic IR simulations than existing simulators. Code for AVR and AcoustiX are available at https://zitonglan.github.io/avr.
Sound, Audio and Speech Processing
What problem does this paper attempt to address?
This paper aims to improve the realism of audio synthesis for virtual and augmented reality by accurately modeling the impulse response (IR), which describes how sound propagates along different paths before reaching the listener's position. Existing methods struggle to generate high-fidelity impulse responses; in particular, they fail to capture fine IR detail and correct spatial variation. These limitations stem mainly from a lack of physical constraints, which leaves the network prone to overfitting the training data and generalizing poorly.

To overcome these challenges, the authors propose **Acoustic Volume Rendering (AVR)**, a framework that models acoustic impulse response fields through frequency-domain volume rendering and spherical integration. The main contributions are:

1. **Frequency-Domain Volume Rendering**: The impulse response is converted from the time domain to the frequency domain, and time delays are applied as phase shifts. This bypasses the limitation of finite sampling in the time domain. It not only reduces spatial variation but also facilitates network optimization.
2. **Spherical Integration**: Because the listener receives signals from all directions, AVR casts rays uniformly over the sphere and integrates over the spherical surface to synthesize impulse response measurements. This design also supports personalized audio experiences, since individual head-related transfer functions (HRTFs) can be integrated during inference.
3. **Incorporation of Physical Principles**: AVR inherently encodes wave propagation principles, which ensures multi-view consistency and thereby improves the model's generalization and accuracy.
4. **High-Performance Simulation Platform**: In addition to AVR, the authors developed an acoustic simulation platform named **AcoustiX**, which generates more accurate impulse responses than existing simulators, especially with respect to signal phase and arrival time.

With these innovations, AVR significantly outperforms existing leading methods on multiple real-world and simulated datasets, especially in zero-shot and personalized binaural audio synthesis. The AcoustiX platform also provides a more accurate physical basis for acoustics-related research.
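The two signal-processing ideas named in the summary, applying a time delay as a phase shift in the frequency domain and integrating over uniformly distributed directions on a sphere, can be illustrated with a minimal standard-library Python sketch. This is not the authors' implementation: the function names (`delay_via_phase_shift`, `fibonacci_sphere`, `integrate_over_sphere`) are hypothetical, the naive DFT is used only to keep the example dependency-free, and the Fibonacci lattice is an assumed stand-in for whatever uniform ray sampling AVR actually uses.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform (illustration only)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT matching dft() above."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def delay_via_phase_shift(signal, delay_samples):
    """Circularly delay a real signal by a linear phase ramp on its spectrum.

    delay_samples may be fractional, which a plain index shift in the
    time domain cannot express.
    """
    n = len(signal)
    shifted = []
    for k, value in enumerate(dft(signal)):
        # Use signed frequency indices so the ramp stays conjugate-symmetric.
        k_signed = k if k <= n // 2 else k - n
        shifted.append(value * cmath.exp(-2j * math.pi * k_signed * delay_samples / n))
    # Keep the real part; any residual imaginary part is discarded.
    return [v.real for v in idft(shifted)]


def fibonacci_sphere(num_dirs):
    """Approximately uniform unit vectors on the sphere (Fibonacci lattice)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    dirs = []
    for i in range(num_dirs):
        z = 1.0 - 2.0 * (i + 0.5) / num_dirs  # evenly spaced in z = equal-area bands
        r = math.sqrt(1.0 - z * z)
        phi = golden_angle * i
        dirs.append((r * math.cos(phi), r * math.sin(phi), z))
    return dirs


def integrate_over_sphere(values):
    """Equal-weight quadrature: each direction stands for 4*pi/N steradians."""
    return 4.0 * math.pi * sum(values) / len(values)


if __name__ == "__main__":
    impulse = [0.0] * 64
    impulse[10] = 1.0
    delayed = delay_via_phase_shift(impulse, 5)
    print("peak index after 5-sample delay:", max(range(64), key=lambda t: delayed[t]))

    dirs = fibonacci_sphere(1024)
    print("integral of z^2 over the sphere:", integrate_over_sphere([d[2] ** 2 for d in dirs]))
```

In a renderer along these lines, each direction returned by `fibonacci_sphere` would presumably carry its own frequency-domain contribution, with the propagation delay applied as in `delay_via_phase_shift`, before the per-direction results are combined by `integrate_over_sphere`. A direction-dependent weight such as an HRTF could multiply each contribution at that step.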