An Attention-augmented Fully Convolutional Neural Network for Monaural Speech Enhancement

Zezheng Xu, Ting Jiang, Chao Li, Jiacheng Yu
DOI: https://doi.org/10.1109/iscslp49672.2021.9362114
2021-01-01
Abstract: Convolutional neural networks (CNNs) have achieved remarkable results in speech enhancement. However, because convolution is a local operation, it struggles to capture the global context of a feature map. To address this, we propose an attention-augmented fully convolutional neural network for monaural speech enhancement. Specifically, we integrate a new two-dimensional relative self-attention mechanism into fully convolutional networks. In addition, we use the Huber loss as the loss function, which is more robust to noise. Experimental results indicate that, compared with the optimally modified log-spectral amplitude (OMLSA) estimator and other CNN-based models, the proposed network performs better on five metrics and balances noise suppression against speech distortion well. We also embed the proposed attention mechanism into other convolutional networks and obtain satisfactory results, which suggests that the mechanism generalizes well.
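To illustrate why the Huber loss is less sensitive to large errors than a squared-error loss, here is a minimal sketch in plain Python. The abstract does not give the threshold or the features the loss is computed on, so the function name `huber_loss`, the default `delta=1.0`, and the toy inputs are assumptions for illustration only, not the authors' implementation. The loss is quadratic for residuals up to `delta` and linear beyond it, so a single large outlier contributes far less than it would under mean squared error.

```python
def huber_loss(predicted, target, delta=1.0):
    """Mean Huber loss between two equal-length sequences of floats.

    delta is a hypothetical threshold; the abstract does not specify one.
    """
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    if len(predicted) == 0:
        raise ValueError("inputs must be non-empty")

    total = 0.0
    for p, t in zip(predicted, target):
        r = abs(p - t)
        if r <= delta:
            # Small residual: quadratic penalty, same as half squared error.
            total += 0.5 * r * r
        else:
            # Large residual: linear penalty, which limits the pull of outliers.
            total += delta * (r - 0.5 * delta)
    return total / len(predicted)


# Toy example: one small residual (0.5) and one large residual (3.0).
clean = [0.0, 0.0]
estimate = [0.5, 3.0]
print(huber_loss(estimate, clean))
```

With these toy values the large residual is penalized linearly (2.5) rather than quadratically (4.5 under half squared error), which is the robustness property the abstract refers to.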