Acoustic echo cancellation based on two‐stage BLSTM

Zhiwei Niu, Shifeng Ou, Peng Song, Ying Gao
DOI: https://doi.org/10.1049/ell2.13164
2024-04-04
Electronics Letters
Abstract: Acoustic echo cancellation (AEC) methods aim to suppress the acoustic coupling in hands‐free speech communication. Traditional AEC works by identifying the acoustic impulse response using adaptive algorithms. With recent research advances, deep learning has become an attractive choice for AEC. This paper introduces a two‐stage bidirectional long short‐term memory (TS‐BLSTM) framework that incorporates a multi‐head self‐attention mechanism after each BLSTM block. The attention is intended to capture contextual information better and to further enhance the ability of the model to handle complex acoustic scenarios. The BLSTM blocks are used to aggregate magnitude spectrum information, modelling both time and frequency dependencies. Additionally, dilated convolution is introduced to broaden the range of information covered by each convolution output. The magnitude decoder estimates a mask for the input, which is applied to generate an estimated magnitude spectrum of the near‐end speech. Experimental results indicate that the proposed method achieves promising outcomes.
engineering, electrical & electronic
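Two ideas mentioned in the abstract, dilated convolution and mask-based magnitude estimation, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`receptive_field`, `dilated_conv1d`, `apply_mask`), the kernel values, and the element-wise form of the masking are assumptions chosen for illustration only.

```python
# Illustrative sketch only; all names and values here are hypothetical,
# not taken from the paper.

def receptive_field(kernel_size, dilation):
    """Number of input samples spanned by one output of a dilated convolution."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1-D dilated convolution (cross-correlation form, no padding)."""
    span = receptive_field(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        acc = 0.0
        for k, weight in enumerate(kernel):
            # Kernel taps are spaced `dilation` samples apart.
            acc += weight * signal[start + k * dilation]
        out.append(acc)
    return out


def apply_mask(mixture_magnitude, mask):
    """Element-wise masking (assumed form): estimated magnitude = mask * input magnitude."""
    return [m * x for m, x in zip(mask, mixture_magnitude)]


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    # Same 3-tap kernel, wider coverage per output as dilation grows.
    print(receptive_field(3, 1), dilated_conv1d(x, [1, 1, 1], dilation=1))
    print(receptive_field(3, 2), dilated_conv1d(x, [1, 1, 1], dilation=2))
    print(apply_mask([2.0, 4.0], [0.5, 0.25]))
```

With the same three-tap kernel, raising the dilation from 1 to 2 widens the span of each output from 3 to 5 input samples without adding parameters, which is the sense in which dilation broadens the range of information in each convolution output.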