Towards a Real-Time Production of Immersive Spatial Audio of High Individuality with an RBF Neural Network

Weiping Tu, Yuhong Yang, Bo Du, Jiaxi Zheng, Shuangxing Zhai
DOI: https://doi.org/10.1016/j.jpdc.2019.04.020
IF: 4.542
2019-01-01
Journal of Parallel and Distributed Computing
Abstract: Immersion perception plays a critical role in the tremendous success of recent augmented/virtual reality applications, in which high-quality spatial audio is mandatory. However, because the numerous anthropometric parameters of listeners are highly individual, deriving the proper acoustic perturbation characteristics when producing immersive spatial audio via loudspeakers, a process in which speed and precision are both important, has long been a research challenge. This study first adopts gain vectors for loudspeakers (GVLs) to represent the acoustic perturbations, which are sensitive to both the frequency bands and the anthropometric parameters of an individual. A radial basis function (RBF) neural network then maps the parameter sets to the corresponding GVLs. A parallel convolution algorithm convolves the GVLs with the source signals, and the outputs drive the loudspeakers to produce the designated spatial audio of high individuality. Experimental results indicate the following: (1) the binaural cue deviation decreases by 12.21% on average, and the subjective score of the listener increases by 27.24%, and (2) the ratio of the time consumed by parallel convolution based on six threads to that of a general convolution is 0.373, demonstrating that immersive spatial audio of high individuality can be produced in real time.
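The two computational steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the kernel, the network size, or how the convolution is partitioned, so the Gaussian kernel, the overlap-add block scheme, and every name below (`rbf_forward`, `parallel_convolve`, `centers`, `widths`, `weights`, `block`) are assumptions made for illustration only.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def rbf_forward(x, centers, widths, weights):
    """Gaussian RBF network (assumed form): hidden activations, then a linear output layer.

    x is a listener's anthropometric parameter vector; the output stands in for a gain vector.
    """
    hidden = [
        math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / (2.0 * s ** 2))
        for c, s in zip(centers, widths)
    ]
    # weights[j][k] links hidden unit j to output gain k.
    n_out = len(weights[0])
    return [sum(h * w[k] for h, w in zip(hidden, weights)) for k in range(n_out)]


def convolve(signal, kernel):
    """Direct linear convolution; output length is len(signal) + len(kernel) - 1."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def parallel_convolve(signal, kernel, n_threads=6, block=4):
    """Overlap-add (assumed scheme): convolve blocks concurrently, then sum overlapping tails."""
    starts = range(0, len(signal), block)
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        parts = list(pool.map(lambda s: convolve(signal[s:s + block], kernel), starts))
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for s, part in zip(starts, parts):
        for i, v in enumerate(part):
            out[s + i] += v
    return out
```

The default of six workers mirrors the thread count reported in the abstract. Pure-Python threads share the interpreter lock, so this sketch shows only how the work can be decomposed; it is not expected to reproduce the reported 0.373 time ratio.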