A Deep Proximal-Unfolding Method for Monaural Speech Dereverberation

Meihuang Wang, Minmin Yuan, Andong Li, Chengshi Zheng, Xiaodong Li
DOI: https://doi.org/10.23919/apsipaasc55919.2022.9979935
2022-01-01
Abstract: Speech is often distorted by reverberation in an enclosure when the microphone is placed far from the speech source, which reduces speech quality and intelligibility. Recent years have witnessed the development of deep neural networks, and many deep learning-based methods have been proposed for dereverberation. Most of these methods remove reverberation by directly mapping the reverberant speech to the target speech, an approach that often lacks adequate interpretability and limits the performance upper bound. This paper proposes a deep unfolding method with an interpretable network structure. First, the dereverberation problem was reformulated based on the maximum a posteriori criterion, and an iterative optimization algorithm was then devised by using proximal operators. Second, we unfolded the iterative optimization algorithm into a multi-stage deep neural network, where each stage corresponded to a specific operation of the iterative procedure. Experiments were conducted on the WSJ0-SI84 corpus, and the results on both simulated and real RIRs showed that the proposed model outperformed previous models and achieved state-of-the-art performance in terms of PESQ, ESTOI and frequency-weighted segmental SNR.
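The abstract does not spell out the iteration or the network, so the following is only a minimal sketch of the general idea of unfolding a proximal iteration into a fixed number of stages. It is not the authors' model: the sparsity prior (soft-thresholding), the toy decaying filter `h`, the time-domain signal model, and all function names, step sizes and thresholds are assumptions chosen for illustration. In a learned unfolded network, the per-stage parameters (or the proximal step itself) would typically be replaced by trainable components.

```python
# Illustrative sketch only: a generic unrolled proximal-gradient (ISTA-style)
# loop on a toy 1-D deconvolution problem. Not the network from the paper.

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1, applied element-wise."""
    return [max(abs(a) - tau, 0.0) * (1.0 if a >= 0 else -1.0) for a in v]

def convolve(x, h):
    """Causal convolution of x with h, truncated to len(x)."""
    n = len(x)
    y = [0.0] * n
    for i in range(n):
        for k, hk in enumerate(h):
            if i - k >= 0:
                y[i] += hk * x[i - k]
    return y

def correlate(r, h):
    """Adjoint of `convolve`: correlates the residual r with the filter h."""
    n = len(r)
    g = [0.0] * n
    for j in range(n):
        for k, hk in enumerate(h):
            if j + k < n:
                g[j] += hk * r[j + k]
    return g

def unfolded_stages(y, h, steps, thresholds):
    """Run one stage per (step, threshold) pair.

    Each stage is a gradient step on the data-fit term followed by a
    proximal step on the prior. The per-stage values are fixed here;
    in a trained unfolded model they would be learned (assumption).
    """
    x = [0.0] * len(y)
    for mu, tau in zip(steps, thresholds):
        residual = [a - b for a, b in zip(convolve(x, h), y)]
        grad = correlate(residual, h)
        x = soft_threshold([xi - mu * gi for xi, gi in zip(x, grad)], mu * tau)
    return x

# Toy data (all values are assumptions): a sparse "clean" signal smeared by
# a short decaying filter standing in for a room impulse response.
h = [1.0, 0.6, 0.36, 0.216]
clean = [0.0] * 16
clean[2], clean[9] = 1.0, -0.8
reverberant = convolve(clean, h)

num_stages = 50
estimate = unfolded_stages(reverberant, h, [0.2] * num_stages, [0.01] * num_stages)
```

Each pass through the loop plays the role of one network stage, which is what makes the structure interpretable: every stage corresponds to a named operation of the underlying iteration rather than an opaque mapping.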