Sound Event Localization and Detection Based on Deep Learning

Dada Zhao,Kai Ding,Xiaogang Qi,Yu Chen,Hailin Feng
DOI: https://doi.org/10.23919/jsee.2023.000110
IF: 1.363
2024-01-01
Journal of Systems Engineering and Electronics
Abstract:Acoustic source localization (ASL) and sound event detection (SED) are two widely pursued independent research fields. In recent years, in order to achieve a more complete spatial and temporal representation of sound field, sound event localization and detection (SELD) has become a very active research topic. This paper presents a deep learning-based multi- overlapping sound event localization and detection algorithm in three-dimensional space. Log-Mel spectrum and generalized cross-correlation spectrum are joined together in channel dimension as input features. These features are classified and regressed in parallel after training by a neural network to obtain sound recognition and localization results respectively. The channel attention mechanism is also introduced in the network to selectively enhance the features containing essential information and suppress the useless features. Finally, a thourough comparison confirms the efficiency and effectiveness of the proposed SELD algorithm. Field experiments show that the proposed algorithm is robust to reverberation and environment and can achieve higher recognition and localization accuracy compared with the baseline method.
What problem does this paper attempt to address?