Speech Enhancement Model for High Sampling Rate Speech Datasets Based on Multi-branch Time Convolutional Network

Zehua Zhang, Mingjiang Wang, Xuyi Zhuang
DOI: https://doi.org/10.1109/icsip52628.2021.9688870
2021-01-01
Abstract: This paper proposes a speech enhancement method for high-sampling-rate speech based on a multi-branch time convolutional network (TCN). The most important parameter in traditional speech enhancement algorithms is the a priori signal-to-noise ratio (SNR). The Deep Xi framework is used to estimate the a priori SNR, and the multi-branch TCN maps the amplitude spectrum of noisy speech to the a priori SNR. The proposed multi-branch TCN captures context information better and has a smaller model size. In addition, in the waveform reconstruction stage, the weighted Euclidean distortion measure is used to correct the gain function. Experimental results on a speech dataset with a 48 kHz sampling rate show that the proposed strategy achieves superior performance.
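To illustrate how an a priori SNR estimate is turned into enhanced speech, the following minimal sketch applies a gain to each frequency bin of one noisy magnitude frame. It is an illustration only, not the paper's method: it uses the standard Wiener gain G = xi / (1 + xi) in place of the gain corrected with the weighted Euclidean distortion measure, and it assumes the SNR estimates are given in dB. The function names (wiener_gain, db_to_linear, enhance_frame) are hypothetical.

```python
def db_to_linear(snr_db):
    """Convert an SNR in dB to a linear power ratio."""
    return 10.0 ** (snr_db / 10.0)


def wiener_gain(xi):
    """Wiener gain for a linear-scale a priori SNR xi >= 0.

    Assumption: this stands in for the paper's corrected gain function,
    which is not specified in the abstract.
    """
    return xi / (1.0 + xi)


def enhance_frame(noisy_magnitudes, xi_db_estimates):
    """Scale each bin's noisy magnitude by the gain for its a priori SNR.

    noisy_magnitudes: magnitude spectrum of one noisy frame, one value per bin.
    xi_db_estimates:  a priori SNR estimate per bin in dB (in the paper this
                      would come from the multi-branch TCN; here it is an input).
    """
    return [
        wiener_gain(db_to_linear(xi_db)) * magnitude
        for magnitude, xi_db in zip(noisy_magnitudes, xi_db_estimates)
    ]
```

With this gain, a bin whose a priori SNR is 0 dB is attenuated by half, while bins with a high SNR pass nearly unchanged and bins with a very low SNR are suppressed toward zero.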