Robust Spatial Filtering Network for Separating Speech in the Direction of Interest

Dongxu Liu, Dongmei Li, Chao Ma, Xupeng Jia
DOI: https://doi.org/10.1109/icsip57908.2023.10271031
2023-01-01
Abstract: In recent years, target speech separation has drawn considerable attention with the development of deep-learning methods. Target speech from a specific direction of interest (DOI) can be extracted with auxiliary directional information. However, separating target signals from the DOI has not been investigated in detail, and the performance of existing systems can degrade when direction estimation errors occur. In this paper, a spatial filtering convolutional recurrent network (SF-CRN) is proposed for target speech separation in the direction of interest. Inspired by the GCC-PHAT localization method, we construct a direction feature (DF) that is integrated with the multi-channel short-time complex spectra as the network input. In addition, we propose a training method that makes the network more robust to direction estimation error. The experimental results show that our network achieves a significant performance improvement on target speech separation.
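For readers unfamiliar with GCC-PHAT, the following is a minimal sketch of the textbook method for estimating the time delay between two microphone signals. It is not the authors' direction feature, whose construction the abstract does not specify. The function names `dft` and `gcc_phat_delay` and the parameter `max_lag` are hypothetical, and a naive DFT is used only to keep the example free of dependencies.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for short signals)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat_delay(sig, ref, max_lag):
    """Estimate the delay of `sig` relative to `ref`, in samples (illustrative)."""
    n = len(sig) + len(ref)  # zero-pad so circular correlation acts as linear
    S = dft(list(sig) + [0.0] * (n - len(sig)))
    R = dft(list(ref) + [0.0] * (n - len(ref)))
    # PHAT weighting: keep only the phase of the cross-power spectrum.
    cross = []
    for s, r in zip(S, R):
        c = s * r.conjugate()
        cross.append(c / (abs(c) + 1e-12))
    cc = [v.real for v in dft(cross, inverse=True)]
    # Search lags in [-max_lag, max_lag]; negative lags wrap to the end of cc.
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda lag: cc[lag % n])
```

A positive return value means `sig` lags `ref`. With a known microphone spacing and sampling rate, such a delay can be mapped to an arrival angle, which is the sense in which GCC-PHAT serves as a localization method.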