GLMSnet: Single Channel Speech Separation Framework in Noisy and Reverberant Environments

Huiyu Shi, Xi Chen, Tianlong Kong, Shouyi Yin, Peng Ouyang
DOI: https://doi.org/10.1109/ASRU51503.2021.9688217
2021-01-01
Abstract: In real noisy and reverberant environments, the performance of current single-channel speech separation algorithms decreases significantly. To address this, this paper proposes a novel speech separation framework called the Graph convolution and Leading global Multi-scale separation network (GLMSnet). A graph convolution network (GCN) is introduced on high-level features to model global context and incorporate long-range information, and it can be inserted at any desired position in the network. Furthermore, global multi-scale convolution is proposed to aggregate features from different levels and improve the audio quality of the separated speech. A leading factor is applied to increase the valid information of the target speech. We evaluate our method on the WHAMR! dataset. The results show that the proposed method achieves state-of-the-art speech separation performance in the presence of noise and reverberation, improving on the previous best model by 22.7%.
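The abstract does not say how the graph is built or how the GCN is parameterized. As a rough illustration of the general idea only, the sketch below treats each time frame of a high-level feature map as a graph node, assumes a similarity-based adjacency so that every frame can exchange information with every other frame, and wraps the layer in a residual connection so that its output has the same shape as its input. The function name `graph_conv`, the softmax adjacency, the residual connection, and all sizes are assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def graph_conv(features, weight):
    """One graph-convolution step over T frame-level feature vectors.

    features: (T, C) array, one row per time frame (graph node).
    weight:   (C, C) array, a learnable projection in a real model.
    """
    # Assumption: adjacency comes from scaled pairwise feature similarity,
    # so each frame aggregates from all other frames (global context).
    scores = features @ features.T / np.sqrt(features.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    adjacency = np.exp(scores)
    adjacency = adjacency / adjacency.sum(axis=1, keepdims=True)  # rows sum to 1

    # Aggregate neighbor features, project them, then apply ReLU.
    return np.maximum(adjacency @ features @ weight, 0.0)

rng = np.random.default_rng(0)
x = rng.standard_normal((50, 16))        # 50 frames, 16 channels (made-up sizes)
w = rng.standard_normal((16, 16)) * 0.1  # random stand-in for trained weights
y = x + graph_conv(x, w)                 # residual form keeps the input shape
```

Because the residual form preserves the `(T, C)` shape, a block like this could in principle be placed between any two layers that operate on the same feature size, which is one plausible reading of the claim that the GCN can be inserted at any desired position.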