Parallel convolutional recurrent network based monaural speech enhancement system

LI Xin-yuan, HUANG He-ming
DOI: https://doi.org/10.16208/j.issn1000-7024.2023.04.030
2023-01-01
Abstract: To improve the convergence speed and generalization of speech enhancement systems while reducing the amount of training data required, a speech enhancement system based on a parallel convolutional recurrent network (PCRN) is proposed. Building on the convolutional recurrent network (CRN), normalized gated linear units (NGLU) are used to improve performance and convergence speed, a parallel recurrent layer structure processes both the original and the encoder-processed speech features, and the output of the parallel structure is passed through a post-processing module. Experiments on the THCHS30 and LibriSpeech speech datasets and the NOISEX92 and PNL 100 Nonspeech Sounds noise datasets show that, compared with several state-of-the-art speech enhancement systems, the proposed method improves performance by up to 36.92% and improves convergence speed by 62.36%.
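The abstract does not define the NGLU, so the following is only an illustrative sketch of the general idea of a gated linear unit followed by normalization. It assumes dense projections instead of convolutions and a per-frame zero-mean, unit-variance normalization of the gated output. The function name `normalized_glu`, the weight shapes, and the choice of normalization are hypothetical and may differ from the authors' design.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def normalized_glu(x, w_lin, w_gate, eps=1e-5):
    """Gated linear unit followed by per-frame normalization.

    x:      (frames, in_dim) feature matrix
    w_lin:  (in_dim, out_dim) weights of the linear branch
    w_gate: (in_dim, out_dim) weights of the gating branch
    The normalization used here is an assumption, not taken from the paper.
    """
    linear = x @ w_lin                # candidate features
    gate = sigmoid(x @ w_gate)        # values in (0, 1) controlling what passes
    gated = linear * gate             # standard GLU gating
    mean = gated.mean(axis=-1, keepdims=True)
    var = gated.var(axis=-1, keepdims=True)
    return (gated - mean) / np.sqrt(var + eps)

# Toy input: 4 frames of 8-dimensional features projected to 16 channels.
rng = np.random.default_rng(0)
features = rng.standard_normal((4, 8))
w_lin = rng.standard_normal((8, 16))
w_gate = rng.standard_normal((8, 16))
out = normalized_glu(features, w_lin, w_gate)
```

In this sketch each output frame has approximately zero mean and unit variance across channels, which is one common way normalization can stabilize training; whether this matches the mechanism behind the reported convergence gain is not stated in the abstract.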