A Physics-Informed Neural Network-Based Approach for the Spatial Upsampling of Spherical Microphone Arrays

Federico Miotello, Ferdinando Terminiello, Mirco Pezzoli, Alberto Bernardini, Fabio Antonacci, Augusto Sarti
2024-07-26
Abstract: Spherical microphone arrays are convenient tools for capturing the spatial characteristics of a sound field. However, achieving superior spatial resolution requires arrays with numerous capsules, consequently leading to expensive devices. To address this issue, we present a method for spatially upsampling spherical microphone arrays with a limited number of capsules. Our approach exploits a physics-informed neural network with Rowdy activation functions, leveraging physical constraints to provide high-order microphone array signals, starting from low-order devices. Results show that, within its domain of application, our approach outperforms a state-of-the-art method based on signal processing for spherical microphone array upsampling.
Audio and Speech Processing, Machine Learning, Sound, Signal Processing
What problem does this paper attempt to address?
The problem this paper addresses is: **how to spatially upsample a spherical microphone array (SMA) that has a limited number of capsules, improving spatial resolution while maintaining computational efficiency and generalization ability**. An SMA captures the spatial characteristics of a sound field, but higher spatial resolution usually requires more capsules, which raises device cost. To address this, the authors propose a method based on a physics-informed neural network (PINN) with Rowdy activation functions that derives high-order microphone array signals from low-order SMA signals, thereby achieving spatial upsampling.

### Problem Background

1. **Applications of the Spherical Microphone Array (SMA)**:
   - SMAs are widely used in fields such as virtual reality, augmented reality, and remote conferencing.
   - They allow efficient estimation of the spherical harmonic representation of the sound field for tasks such as source localization and separation.
2. **Existing Challenges**:
   - In practice, the limited number of microphones in an SMA limits the representation of the sound field in the spherical harmonic domain.
   - This can lead to spatial aliasing and encoding errors, so the order of the spherical harmonic expansion must be limited to avoid aliasing.
   - Spatial upsampling techniques can improve spatial resolution by adding virtual microphones, but existing learning-based methods rely on large amounts of training data, which are difficult to obtain.

### Proposed Method

The authors propose a method based on a PINN with Rowdy activation functions, aiming at:

- **Improving Spatial Resolution**: Extract more information from low-order devices by generating high-order microphone array signals.
- **Maintaining Computational Efficiency**: Utilize the efficiency and generalization ability of the neural network.
- **Using Physical Constraints**: Ensure that the network output conforms to the basic physical laws of the sound field (such as the wave equation), thereby improving the accuracy of the results.

### Main Contributions

- **Innovation**: The Rowdy activation function enhances the network's ability to capture high-frequency components and promotes convergence to meaningful solutions.
- **Experimental Verification**: Experiments on measured data verify the effectiveness of the method, which performs better than an existing signal-processing method.
- **Future Prospects**: Explore additional physical constraints and test the method in more challenging scenarios.

### Summary

This paper proposes a method based on a physics-informed neural network (PINN) with Rowdy activation functions for the spatial upsampling of spherical microphone arrays. It improves spatial resolution and, within its domain of application, outperforms a state-of-the-art signal-processing method, while maintaining computational efficiency and generalization ability.
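
The two ingredients named above, a Rowdy activation and a wave-equation constraint, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the commonly described form of the Rowdy activation (a base nonlinearity plus sinusoidal terms with trainable amplitudes), a 1D wave equation instead of the full 3D sound field, finite differences instead of automatic differentiation, and a speed of sound of 343 m/s. The names `rowdy`, `wave_residual`, and `plane_wave` are hypothetical.

```python
import numpy as np

C = 343.0  # assumed speed of sound in air, m/s (not taken from the paper)


def rowdy(z, amplitudes, n=1.0, base=np.tanh):
    """Assumed Rowdy form: base(z) + sum_k a_k * n * sin(k * n * z), k = 1, 2, ...

    `amplitudes` would be trainable in a real network; with all amplitudes
    at zero (or an empty list) the function reduces to the base activation.
    """
    z = np.asarray(z, dtype=float)
    out = base(z)
    for k, a in enumerate(amplitudes, start=1):
        out = out + a * n * np.sin(k * n * z)
    return out


def wave_residual(p, x, t, h=1e-3, c=C):
    """Central-difference residual of the 1D wave equation: p_xx - p_tt / c^2.

    A physics-informed loss would penalize this residual at sampled points;
    here `p` is any callable p(x, t) standing in for the network output.
    """
    dt = h / c  # time step matched to the spatial step
    p_xx = (p(x + h, t) - 2.0 * p(x, t) + p(x - h, t)) / h**2
    p_tt = (p(x, t + dt) - 2.0 * p(x, t) + p(x, t - dt)) / dt**2
    return p_xx - p_tt / c**2


# Illustrative check: a 1 kHz plane wave satisfies the wave equation,
# so its residual should be close to zero.
k = 2.0 * np.pi * 1000.0 / C  # wavenumber in rad/m


def plane_wave(x, t):
    return np.sin(k * x - C * k * t)


print(abs(wave_residual(plane_wave, 0.1, 0.0)))
print(rowdy([-1.0, 0.0, 1.0], amplitudes=[0.1, 0.05], n=2.0))
```

In a real PINN the residual would be computed by differentiating the network output with respect to its inputs and added to the data-fitting loss, so that predictions at virtual microphone positions remain physically plausible.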