Sound Event Detection Using Multi-Scale Dense Convolutional Recurrent Neural Network with Lightweight Attention

Hanyue Yang, Liyan Luo, Mei Wang, Xiyu Song, Fukun Mi
DOI: https://doi.org/10.1109/eiect60552.2023.10442997
2023-01-01
Abstract: Neural-network-based sound event detection has attracted considerable attention because of its high detection accuracy. Existing sound event detection algorithms often improve performance by constructing deeper and more complex detection models. To address the high computational complexity of existing high-performance sound event detection algorithms, this paper proposes a sound event detection algorithm using a multi-scale dense convolutional recurrent neural network with lightweight attention (MS-AttDenseNet-RNN). It uses a multi-scale dense convolutional neural network with lightweight attention (MS-AttDenseNet) to capture information from receptive fields of different scales over the sound signal and then generates importance-enhanced multi-scale features. Finally, a recurrent neural network (RNN) models the temporal dependency of the enhanced multi-scale features. Compared to the system model that won the DCASE challenge, this method achieves an 18.9% improvement in F1 score and a 4.0% reduction in error rate, while reducing the number of parameters by 75.8% and the computational complexity by 47.5%. These results indicate that the proposed method lowers parameter count and computational complexity without sacrificing accuracy.
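The abstract reports results as F1 score and error rate but does not say how they are computed. As a minimal sketch, assuming the segment-based definitions commonly used in DCASE sound event detection evaluations (the abstract does not confirm this variant), the two metrics can be computed from per-segment sets of active event labels. The function name `segment_metrics` and the example labels are hypothetical.

```python
def segment_metrics(reference, predicted):
    """Segment-based F1 and error rate (assumed DCASE-style definitions).

    reference, predicted: equal-length lists with one set of active
    event labels per time segment.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must cover the same segments")

    tp = fp = fn = 0            # totals accumulated over all segments
    subs = dels = ins = 0       # substitutions, deletions, insertions
    n_ref = 0                   # number of active reference events

    for ref, pred in zip(reference, predicted):
        seg_tp = len(ref & pred)   # events detected correctly
        seg_fp = len(pred - ref)   # events predicted but not present
        seg_fn = len(ref - pred)   # events present but missed
        tp += seg_tp
        fp += seg_fp
        fn += seg_fn
        # Within a segment, a miss paired with a false alarm is a substitution;
        # unpaired misses are deletions and unpaired false alarms are insertions.
        subs += min(seg_fp, seg_fn)
        dels += max(0, seg_fn - seg_fp)
        ins += max(0, seg_fp - seg_fn)
        n_ref += len(ref)

    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    er = (subs + dels + ins) / n_ref if n_ref else 0.0
    return f1, er


# Hypothetical four-segment example.
reference = [{"speech"}, {"speech", "dog"}, set(), {"car"}]
predicted = [{"speech"}, {"speech"}, {"dog"}, {"dog"}]
f1, er = segment_metrics(reference, predicted)
```

Under these definitions a higher F1 and a lower error rate both indicate better detection, which is why the abstract describes an improvement in the former and a reduction in the latter.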