Monaural Speech Separation Method Based on Recurrent Attention with Parallel Branches

Xue Yang,Changchun Bao,Xu Zhang,Xianhong Chen
DOI: https://doi.org/10.21437/interspeech.2023-518
2023-01-01
Abstract:In many speech separation methods, the contextual information contained in the feature sequence is mainly modeled by recurrent layer and/or self-attention mechanism. However, how to combine these two powerful approaches more effectively needs to be explored. In this paper, a recurrent attention with parallel branches is proposed to first fully exploit the contextual information contained in the time-frequency (T-F) features. Then, this information is further modeled by the recurrent modules in a conventional manner. Specifically, the proposed recurrent attention with parallel branches uses two attention modules stacked sequentially. Each attention module has two parallel branches of self-attention to model dependencies along two axes and one convolutional layer for feature fusion. Thus, the contextual information contained in the T-F features can be fully exploited and further modeled by the recurrent modules. Experimental results showed the effectiveness of our proposed method.
What problem does this paper attempt to address?