Research on Environmental Sound Recognition Method based on Residual Network with Dual Attention Mechanism

Li Jia, Xiang Sun, Zhengyong Qiu, Meng Sun
DOI: https://doi.org/10.1109/ITAIC58329.2023.10409085
2023-12-08
Abstract: Existing deep learning methods weight all sound features equally, which limits the accuracy of environmental sound recognition. To address this, an environmental sound recognition method based on a residual network with a dual attention mechanism is proposed. First, the log-Mel spectrogram (LM) and the Mel-frequency cepstral coefficients (MFCC) of the sound signal are extracted, and the two features are fused as the input to the neural network. Second, a dual attention mechanism is built from a channel attention mechanism and a spatial attention mechanism. It automatically learns a weight for each feature channel so that the model focuses on the key channel information, and it also improves the model's ability to extract features in the time-frequency domain, so that sound features are captured better and recognition accuracy improves. Finally, a fully connected layer and a softmax output layer classify the features to recognize the different sound signals. The experimental results show that the fused features give a higher recognition rate than either single feature, and that the proposed model reaches a recognition rate of 95.42%, which is higher than that of the other models compared and meets the requirements of practical engineering applications.
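The pipeline described in the abstract (feature fusion, channel and spatial attention, fully connected layer with softmax) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption rather than the authors' design: the two features are fused by stacking them as channels, the attention follows a simplified CBAM-style layout (the spatial branch uses a per-pixel weighted sum instead of a convolution), the skip connection is a bare addition, all weights are random rather than trained, and random arrays stand in for real LM and MFCC features. All sizes and names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / e.sum()

def channel_attention(feat, w1, w2):
    """feat: (C, F, T). Returns one weight per channel, shape (C,)."""
    avg = feat.mean(axis=(1, 2))  # global average pooling per channel
    mx = feat.max(axis=(1, 2))    # global max pooling per channel

    def mlp(v):
        # Shared two-layer perceptron with a ReLU in between.
        return w2 @ np.maximum(w1 @ v, 0.0)

    return sigmoid(mlp(avg) + mlp(mx))

def spatial_attention(feat, w_avg, w_max):
    """feat: (C, F, T). Returns a time-frequency weight map, shape (F, T)."""
    avg = feat.mean(axis=0)  # average over channels
    mx = feat.max(axis=0)    # max over channels
    # Simplification: scalar weights instead of a learned convolution.
    return sigmoid(w_avg * avg + w_max * mx)

def dual_attention(feat, w1, w2, w_avg, w_max):
    """Apply channel attention first, then spatial attention."""
    ca = channel_attention(feat, w1, w2)
    feat = feat * ca[:, None, None]
    sa = spatial_attention(feat, w_avg, w_max)
    return feat * sa[None, :, :]

rng = np.random.default_rng(0)
n_bands, n_frames, n_classes = 40, 64, 10  # hypothetical sizes

# Random stand-ins for the two extracted features (assumed equal shape).
log_mel = rng.standard_normal((n_bands, n_frames))
mfcc = rng.standard_normal((n_bands, n_frames))

# Assumed fusion strategy: stack the two features as input channels.
fused = np.stack([log_mel, mfcc])  # shape (2, n_bands, n_frames)
n_channels = fused.shape[0]

# Untrained, random parameters (illustration only).
w1 = rng.standard_normal((n_channels, n_channels))
w2 = rng.standard_normal((n_channels, n_channels))
w_fc = rng.standard_normal((n_classes, n_channels))

# Residual-style skip connection around the attention module.
refined = fused + dual_attention(fused, w1, w2, 0.5, 0.5)

# Global average pooling, fully connected layer, softmax output.
probs = softmax(w_fc @ refined.mean(axis=(1, 2)))
print(probs.shape, int(np.argmax(probs)))
```

In a trained model the weights would be learned jointly with the residual network's convolutional layers, and the input features would come from an audio front end rather than a random generator.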
Computer Science, Environmental Science