Abstract: Acoustic echo is a persistent issue in telecommunication that degrades speech quality and can break down communication, either entirely or for a period of time; acoustic echo cancellation (AEC) systems were developed to address it. The demand for AEC has risen significantly since the 2020 global pandemic, as speakers and listeners now communicate in unpredictable settings such as home environments, where echo and noise significantly disrupt communication. Numerous AEC solutions have been proposed, including adaptive filters and deep learning techniques. However, their effectiveness is notably lower in double-talk scenarios, where the near-end and far-end speakers talk simultaneously, and in noisy environments. This paper proposes a novel transQT neural network (TNN), an end-to-end neural network that leverages the constant Q transform (CQT) and a transformer-inspired self-attention module to eliminate echo and noise in noisy double-talk scenarios. Additionally, it uses the smooth L1 loss function to enable efficient training and enhance the overall performance of the proposed model. In the proposed TNN, the CQT serves as the front end that converts the signal from the time domain to the time-frequency domain. The primary aim of the CQT is to improve speech quality, since its logarithmic frequency scale aligns more closely with the human auditory system. The attention module is incorporated among the layers of the proposed model to focus on the double-talk and noisy parts of speech. It aids the AEC model by making it easier to separate the clean target signal from the parts affected by double-talk and noise. The smooth L1 loss is employed to ensure smooth training and stable, efficient convergence. It is also less sensitive to variability in the data, and therefore reduces large errors and the overall loss. An experimental implementation was conducted for both causal and non-causal scenarios. The proposed TNN model demonstrated superior performance in terms of speech quality, as measured by the perceptual evaluation of speech quality (PESQ), and it also showed a significant reduction of echo, quantified by echo return loss enhancement (ERLE). The performance was further evaluated using the correlation coefficient, which indicates the relationship between the clean signal and the echo signal.
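
To make two of the quantities named in the abstract concrete, the following is a minimal sketch of the smooth L1 loss and of ERLE in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the threshold `beta=1.0`, the function names `smooth_l1` and `erle_db`, the stabilizing constant `eps`, and the use of the common textbook ERLE definition (microphone power over residual power, in dB) are all assumptions, since the abstract does not give the exact formulas used.

```python
import math


def smooth_l1(pred, target, beta=1.0):
    """Mean smooth L1 loss between two equal-length sequences.

    Assumed form: quadratic for |error| < beta, linear otherwise,
    so large errors contribute less than they would under a squared loss.
    """
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        if d < beta:
            total += 0.5 * d * d / beta   # quadratic region near zero
        else:
            total += d - 0.5 * beta       # linear region for large errors
    return total / len(pred)


def erle_db(mic, residual, eps=1e-12):
    """ERLE in dB, assumed here as 10*log10(mic power / residual power).

    `mic` is the microphone signal containing echo; `residual` is the
    signal after echo cancellation. `eps` avoids division by zero.
    """
    if len(mic) == 0 or len(residual) == 0:
        raise ValueError("inputs must be non-empty")
    mic_power = sum(x * x for x in mic) / len(mic)
    res_power = sum(x * x for x in residual) / len(residual)
    return 10.0 * math.log10((mic_power + eps) / (res_power + eps))
```

Under this assumed definition, a higher ERLE means more echo was removed: attenuating the residual amplitude by a factor of 10 relative to the microphone signal corresponds to roughly 20 dB.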

Novel TransQT Neural Network: A Deep Learning Framework for Acoustic Echo Cancellation in Noisy Double-Talk Scenario
