A LSTM-Based Joint Progressive Learning Framework for Simultaneous Speech Dereverberation and Denoising

Xin Tang, Jun Du, Li Chai, Yannan Wang, Qing Wang, Chin-Hui Lee
DOI: https://doi.org/10.1109/apsipaasc47483.2019.9023160
2019-01-01
Abstract: We propose a joint progressive learning (JPL) framework that gradually maps highly noisy and reverberant speech features to less noisy and less reverberant speech features in a layer-by-layer stacking scenario, for simultaneous speech denoising and dereverberation. Because each such intermediate mapping is easier to learn than mapping highly distorted speech features directly to clean and anechoic speech features, we adopt a divide-and-conquer learning strategy based on a long short-term memory (LSTM) architecture and explicitly design multiple intermediate target layers. Each hidden layer of the LSTM network is guided by a step-by-step increase in signal-to-noise ratio (SNR) and decrease in reverberation time. Moreover, post-processing that averages the estimated intermediate targets is applied to further improve the enhancement performance. Experiments demonstrate that the proposed JPL approach not only improves objective measures of speech quality and intelligibility, but also achieves a more compact model design than the direct-mapping approach and the two-stage approach, namely denoising followed by dereverberation.
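The progressive-target idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the linear SNR and reverberation-time schedule, the weighted mean-squared-error loss, and the choice to average all supplied estimates are assumptions made for illustration, since the abstract does not give these details. The LSTM itself is omitted; the sketch only shows how intermediate target conditions, a multi-target loss, and the averaging post-processing might fit together.

```python
from typing import List, Sequence, Tuple

Features = List[List[float]]  # frames x feature bins (e.g. log-power spectra)


def progressive_conditions(snr_db: float, rt60_s: float,
                           snr_gain_db: float, rt60_drop_s: float,
                           num_intermediate: int) -> List[Tuple[float, float]]:
    """Hypothetical schedule: each intermediate target raises the SNR and
    lowers the reverberation time by a fixed step; the final target is
    clean (infinite SNR) and anechoic (zero reverberation time)."""
    conditions = []
    for k in range(1, num_intermediate + 1):
        conditions.append((snr_db + k * snr_gain_db,
                           max(rt60_s - k * rt60_drop_s, 0.0)))
    conditions.append((float("inf"), 0.0))
    return conditions


def mse(estimate: Features, target: Features) -> float:
    """Mean squared error over all frames and feature bins."""
    total, count = 0.0, 0
    for est_frame, tgt_frame in zip(estimate, target):
        for e, t in zip(est_frame, tgt_frame):
            total += (e - t) ** 2
            count += 1
    return total / count


def multi_target_loss(estimates: Sequence[Features],
                      targets: Sequence[Features],
                      weights: Sequence[float]) -> float:
    """Assumed training objective: weighted sum of per-target MSEs, so that
    every target layer is supervised by its own intermediate target."""
    return sum(w * mse(e, t) for w, e, t in zip(weights, estimates, targets))


def average_estimates(estimates: Sequence[Features]) -> Features:
    """Post-processing sketch: element-wise mean of the supplied estimates."""
    n = len(estimates)
    num_frames = len(estimates[0])
    num_bins = len(estimates[0][0])
    return [[sum(est[f][b] for est in estimates) / n for b in range(num_bins)]
            for f in range(num_frames)]
```

For example, `progressive_conditions(0.0, 1.0, 10.0, 0.25, 2)` describes two intermediate targets followed by the clean, anechoic target; the step sizes here are placeholders, not values from the paper.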