A Target Speaker Separation Neural Network with Joint-Training

Wenjing Yang, Jing Wang, Hongfeng Li, Na Xu, Fei Xiang, Kai Qian, Shenghua Hu
2021-01-01
Abstract: Target speaker separation aims to extract a target speaker's speech from a mixture containing multiple interfering voices. It is a promising way to address conventional difficulties in speech separation, such as arbitrary source permutation and an unknown number of sources, and it is useful for personal applications such as online meetings and personal phone calls. Recently, deep-learning-based models have provided more alternatives for target speaker separation tasks. In this paper, we propose a target speaker separation neural network with joint training that separates the target voice in the spectrogram domain using a proposed combinative loss function. Experimental results show that, compared with the baseline, the proposed method yields better performance on both test data and real data, and that the proposed combinative loss function is more effective for this task.