Single-channel Speech Enhancement Student under Multi-channel Speech Enhancement Teacher

Yuzhu Zhang, Hui Zhang, Xueliang Zhang
DOI: https://doi.org/10.23919/apsipaasc55919.2022.9980275
2022-01-01
Abstract: In recent years, deep neural networks have brought significant success to single-channel speech enhancement. These approaches train a model on a synthetic noisy speech corpus created by adding noise to clean speech. Because the synthetic training data does not match the actual application environment, the model's performance in real conditions is not guaranteed. This paper proposes using a multi-channel speech enhancement teacher model to guide a single-channel noise suppression student model: the multi-channel teacher's processed signal serves as the single-channel student's training target. With the proposed approach, the single-channel speech enhancement model can be trained on real noisy speech and perform as well as a multi-channel speech enhancement model. Experimental results on CHiME-3 demonstrate that the proposed approach achieves competitive performance in both speech enhancement and automatic speech recognition tasks, even without ground-truth signals.
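The teacher-student setup described in the abstract can be illustrated with a toy sketch. It rests on loud assumptions: the abstract does not specify the teacher or student architectures, so here the teacher is replaced by plain channel averaging (which only helps because the simulated channels are already time-aligned) and the student by a short FIR filter trained with gradient descent on mean squared error. All function names, the six-channel simulated recording, and the loss are hypothetical and are not taken from the paper.

```python
import math
import random

random.seed(0)  # deterministic toy data


def teacher_enhance(channels):
    """Hypothetical stand-in for the multi-channel teacher: average the channels."""
    num_channels = len(channels)
    return [sum(frame) / num_channels for frame in zip(*channels)]


def student_enhance(noisy, weights):
    """Hypothetical single-channel student: a causal FIR filter on one microphone."""
    return [
        sum(w * noisy[t - k] for k, w in enumerate(weights) if t - k >= 0)
        for t in range(len(noisy))
    ]


def mse(pred, target):
    """Mean squared error between two equal-length signals."""
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(pred)


def train_student(noisy, target, taps=4, lr=0.05, epochs=200):
    """Fit the student to the teacher's output with plain gradient descent on MSE."""
    weights = [0.0] * taps
    n = len(noisy)
    for _ in range(epochs):
        pred = student_enhance(noisy, weights)
        grads = [0.0] * taps
        for t in range(n):
            err = pred[t] - target[t]
            for k in range(taps):
                if t - k >= 0:
                    grads[k] += 2.0 * err * noisy[t - k] / n
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights


# Simulated "real" recording (assumption): the same tone reaches six microphones,
# each with independent noise. No clean reference is kept or used for training.
num_samples, num_channels = 400, 6
noisy_channels = [
    [math.sin(0.2 * t) + random.gauss(0.0, 0.3) for t in range(num_samples)]
    for _ in range(num_channels)
]

teacher_target = teacher_enhance(noisy_channels)  # pseudo-label from the teacher
student_input = noisy_channels[0]                 # the student sees one microphone

initial_loss = mse(student_enhance(student_input, [0.0] * 4), teacher_target)
student_weights = train_student(student_input, teacher_target)
final_loss = mse(student_enhance(student_input, student_weights), teacher_target)

print(f"loss vs. teacher target: {initial_loss:.4f} -> {final_loss:.4f}")
```

The point of the sketch is the data flow: the training target comes from the teacher's output on the same noisy recording, so no clean ground-truth signal appears anywhere in the training loop.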