Preserving Early Reflections to Improve Speech Quality of Reverberant Speech Separation

Fengfan Hou, Dongmei Li, Xupeng Jia, Chao Ma
DOI: https://doi.org/10.1109/icsip52628.2021.9688830
2021-01-01
Abstract: Speech separation is an important component of robust speech processing systems, and recent models have outperformed ideal time-frequency magnitude masks on noise-free datasets. However, reverberant environments greatly degrade the performance of these models. Inspired by the use of early reflections in speech enhancement tasks, we explore the effect of preserving early reflections in reverberant speech and propose a training method for reverberant monaural speech separation systems. The proposed training method treats early reflections arriving within 20 ms after the direct sound as part of the target speech. Compared with the original network, which uses only the direct sound as the target speech, it improves the perceptual evaluation of speech quality (PESQ) by 0.145 and the short-time objective intelligibility (STOI) by 0.017.
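The abstract does not say how the training targets are constructed. As a minimal sketch of the idea, assuming a room impulse response (RIR) is available for each source and that the direct path is the largest-magnitude tap, the target could be formed by truncating the RIR 20 ms after the direct sound and convolving it with the dry speech. The function names `split_rir` and `make_target` and the peak-picking rule are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def split_rir(rir, sample_rate, early_ms=20.0):
    """Split an RIR into an early part and a late tail.

    The early part holds everything up to and including `early_ms`
    milliseconds after the direct sound. Assumption: the direct path
    is the tap with the largest absolute amplitude.
    """
    direct_idx = int(np.argmax(np.abs(rir)))
    window = int(round(early_ms * 1e-3 * sample_rate))
    cutoff = direct_idx + window + 1  # exclusive end of the early part

    early = np.zeros_like(rir)
    early[:cutoff] = rir[:cutoff]
    late = rir - early
    return early, late


def make_target(dry, rir, sample_rate, early_ms=20.0):
    """Build a training target: dry speech convolved with the early RIR.

    With early_ms=0 only the direct sound is kept, which corresponds
    to the direct-sound baseline target described in the abstract.
    """
    early, _ = split_rir(rir, sample_rate, early_ms)
    # Trim to the dry signal length so target and mixture stay aligned.
    return np.convolve(dry, early)[: len(dry)]


# Toy example: direct sound at tap 10, one early and one late reflection.
sample_rate = 16000
rir = np.zeros(4000)
rir[10] = 1.0    # direct sound
rir[100] = 0.5   # reflection about 5.6 ms after the direct sound (kept)
rir[2000] = 0.3  # reflection about 124 ms after the direct sound (dropped)

dry = np.zeros(4000)
dry[0] = 1.0     # unit impulse standing in for dry speech

target = make_target(dry, rir, sample_rate)
```

Setting `early_ms=0.0` in the same call yields the direct-sound target, so the two training conditions compared in the abstract differ only in this one parameter under the sketch's assumptions.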