Supervised Single-Channel Speech Dereverberation And Denoising Using A Two-Stage Processing

Long Zhang, Jiaxu Chen, You Luo, Jiafei Fu, Zhongfu Ye
DOI: https://doi.org/10.1109/ICIVC.2017.7984668
2017-01-01
Abstract: In many acoustic conditions, a single-channel recorded speech signal may be severely affected by reverberation and noise, which reduces speech quality and intelligibility. This paper proposes a novel two-stage processing scheme for single-channel speech dereverberation and denoising that enhances the spectrum of the noisy reverberant signal. Like previous methods, the proposed method uses a non-negative approximation of the convolutive transfer function (N-CTF) to simultaneously estimate the magnitude spectrograms of the speech signal and the room impulse response (RIR). The novelty of the proposed algorithm lies in decomposing the RIR into two parts, which yields a two-stage scheme for enhancing speech in noisy environments. In the first stage, iterative updates estimate a less reverberant speech signal and a short RIR; in the second stage, iterative updates estimate the clean speech signal and another short RIR. Both stages include denoising steps. By decomposing the long RIR into two parts, the proposed algorithm enhances speech more effectively and reduces computation time. Additionally, the optimal estimator is derived based on temporal stacking to exploit speech temporal dynamics. Experiments on two simulated RIRs compare the proposed method with a state-of-the-art method, and the results show that the proposed method significantly improves the quality and intelligibility of the enhanced speech.
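The N-CTF model mentioned in the abstract is commonly written as a per-frequency convolution along the frame axis, Y[k, n] ≈ sum over l of H[k, l] * S[k, n - l], where S and H are non-negative magnitude spectrograms of the speech and the RIR. The sketch below is only an illustration of that forward model and of why a long RIR can in principle be split into two short ones. It assumes the two parts combine by convolution, which is one possible reading of the abstract and not a confirmed detail of the paper. The function names, sizes, and random data are hypothetical, and no estimation or denoising step is shown.

```python
import random

def convolve(a, b):
    """Full linear convolution of two 1-D sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def nctf_forward(S, H):
    """Apply Y[k, n] = sum_l H[k, l] * S[k, n - l] for every frequency bin k.

    S: list of K rows (one per bin), each with N non-negative frame magnitudes.
    H: list of K rows, each with L non-negative filter taps.
    The output is truncated to the first N frames.
    """
    n_frames = len(S[0])
    return [convolve(s_k, h_k)[:n_frames] for s_k, h_k in zip(S, H)]

def cascade(H1, H2):
    """Combine two short per-bin filters into one longer equivalent filter."""
    return [convolve(h1, h2) for h1, h2 in zip(H1, H2)]

# Hypothetical sizes: K bins, N frames, and two short filters of L1 and L2 taps.
random.seed(0)
K, N, L1, L2 = 4, 50, 3, 5
S = [[random.random() for _ in range(N)] for _ in range(K)]
H1 = [[random.random() for _ in range(L1)] for _ in range(K)]
H2 = [[random.random() for _ in range(L2)] for _ in range(K)]

# Reverberation applied in two short steps ...
Y_two_stage = nctf_forward(nctf_forward(S, H2), H1)
# ... matches a single pass with one filter of length L1 + L2 - 1.
H_long = cascade(H1, H2)
Y_one_stage = nctf_forward(S, H_long)

max_diff = max(
    abs(a - b)
    for row_a, row_b in zip(Y_two_stage, Y_one_stage)
    for a, b in zip(row_a, row_b)
)
print(len(H_long[0]), max_diff < 1e-9)
```

Under this assumption, each stage only has to estimate a filter with few taps, which is consistent with the abstract's claim that splitting the long RIR saves time.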