Multi-Band PIT and Model Integration for Improved Multi-Channel Speech Separation

Lianwu Chen, Meng Yu, Dan Su, Dong Yu
DOI: https://doi.org/10.1109/icassp.2019.8682470
2019-01-01
Abstract: The recent exploration of deep learning for supervised speech separation has significantly accelerated progress on the multi-talker speech separation problem. The multi-channel extension has attracted much research attention because spatial information is beneficial in far-field acoustic environments. In this paper, we review the most recent models of multi-channel permutation invariant training (PIT), investigate spatial features formed by microphone pairs along with their underlying impact and issues, present a multi-band architecture for effective feature encoding, and integrate single-channel and multi-channel PIT models to resolve the spatial overlapping problem in the conventional multi-channel PIT framework. The evaluation confirms that the proposed model and training approach significantly improve multi-channel speech separation.