Delay-and-Sum Beamforming Based Spatial Mapping for Multi-Source Sound Localization
Changjiang He, Siyao Cheng, Rong Zheng, Jie Liu
DOI: https://doi.org/10.1109/jiot.2024.3352051
IF: 10.6
2024-01-01
IEEE Internet of Things Journal
Abstract: Multi-source sound localization has applications in many domains, including auditory scene analysis, fault detection and diagnosis in manufacturing, and augmented reality. In the far field, 3D sound source localization is equivalent to finding the direction of arrival (DOA), namely the azimuth and elevation angles of the sound sources. Recent DOA estimation pipelines take multichannel audio inputs, extract spectral features from each channel, and feed them into a deep neural network. Unfortunately, the spectral features contain only the time-frequency information of the audio signals; spatial information is captured only implicitly in the signals across different channels, and it depends heavily on the acoustic array geometry. To embed the spatial information of the sound sources into the spectral feature representation, we propose a spatial mapping method based on delay-and-sum beamforming (DSB) that encodes sound source location information. It can be combined with different feature extraction methods and machine learning models for DOA estimation. Furthermore, a redundancy removal procedure is proposed to accelerate DSB computation so that the pipeline can run in real time on embedded GPUs such as the NVIDIA Jetson Nano. We conduct extensive experiments using two neural network models along with the DSB method on two datasets. The experiments demonstrate that the DSB method effectively reduces DOA errors: when DSB is incorporated into feature extraction, the DOA errors are reduced by up to 19.24%. In addition, redundancy removal accelerates the feature extraction process by up to 30.42%.
computer science, information systems; telecommunications; engineering, electrical & electronic
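
For readers unfamiliar with the delay-and-sum idea the abstract builds on, the following is a minimal textbook sketch of a far-field, frequency-domain delay-and-sum beamformer that scans a grid of azimuth and elevation angles and records the output power for each direction. It is not the authors' spatial mapping or their redundancy removal procedure, neither of which is specified in the abstract. The function names (`unit_direction`, `dsb_power_map`), the argument layout, and the speed of sound of 343 m/s are assumptions made for illustration.

```python
import numpy as np


def unit_direction(azimuth, elevation):
    """Unit vector from the array origin toward a far-field source (radians)."""
    return np.array([
        np.cos(elevation) * np.cos(azimuth),
        np.cos(elevation) * np.sin(azimuth),
        np.sin(elevation),
    ])


def dsb_power_map(signals, mic_positions, fs, azimuths, elevations, c=343.0):
    """Delay-and-sum output power over a grid of candidate directions.

    signals:       (n_mics, n_samples) time-domain channels
    mic_positions: (n_mics, 3) microphone coordinates in metres
    fs:            sampling rate in Hz
    c:             assumed speed of sound in m/s (illustrative default)
    Returns an array of shape (len(elevations), len(azimuths)).
    """
    n_mics, n_samples = signals.shape
    spectra = np.fft.rfft(signals, axis=1)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    power = np.zeros((len(elevations), len(azimuths)))
    for i, el in enumerate(elevations):
        for j, az in enumerate(azimuths):
            # A plane wave from direction u reaches a microphone at position p
            # earlier than the origin by (p . u) / c seconds.
            advance = mic_positions @ unit_direction(az, el) / c
            # Compensate that advance with a per-channel phase shift, then
            # average the aligned channels (the "delay" and the "sum").
            steering = np.exp(-2j * np.pi * freqs[None, :] * advance[:, None])
            beam = np.sum(spectra * steering, axis=0) / n_mics
            power[i, j] = np.sum(np.abs(beam) ** 2)
    return power
```

The direction whose grid cell holds the largest power is the beamformer's DOA estimate for a single dominant source. A map of this kind is one way spatial information can be made explicit alongside spectral features, under the assumptions stated above.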