A Denoising-Aided Multi-Task Learning Method for Blind Estimation of Reverberation Time

Yulong Zhang, Jinqiu Sang, Chengshi Zheng, Xiaodong Li
DOI: https://doi.org/10.1016/j.measurement.2024.114568
IF: 5.6
2024-01-01
Measurement
Abstract: Noise in reverberant speech severely limits the accuracy with which current deep learning (DL) methods estimate the reverberation time T60. To address this issue, this paper proposes a denoising-aided multi-task learning (DAMTL) method for blind T60 estimation. Specifically, speech denoising serves as an auxiliary task and is trained jointly with T60 estimation to improve prediction accuracy. The two tasks are integrated into one DL network by sharing the same encoder, which takes the complex-valued spectrum as input to extract comprehensive high-dimensional features from noisy reverberant speech. Accordingly, complex-valued versions of the 2-D convolutional layer (Conv2d), batch normalization, and long short-term memory (LSTM) are formulated. Furthermore, the noise robustness and applicability of DAMTL are examined by comparison with state-of-the-art DL-based methods on simulated and real-world recorded data. The results demonstrate the effectiveness and superiority of the proposed DAMTL, especially in low signal-to-noise ratio (SNR) scenarios and practical applications.
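To make the two ideas in the abstract concrete, here is a minimal pure-Python sketch of a complex-valued 2-D convolution built from four real-valued convolutions, together with a joint loss that adds a denoising term to the T60 term. It is an illustration only, not the authors' implementation: the function names (`conv2d_real`, `complex_conv2d`, `multitask_loss`), the mean-squared-error losses, and the `aux_weight` parameter are all assumptions, since the abstract does not state the loss functions or task weighting.

```python
# Illustrative sketch under stated assumptions; not the DAMTL implementation.

def conv2d_real(x, w):
    """Valid 2-D cross-correlation of input x (H x W) with kernel w (kh x kw).

    Both arguments are nested lists of numbers; no padding or stride.
    """
    kh, kw = len(w), len(w[0])
    out_h, out_w = len(x) - kh + 1, len(x[0]) - kw + 1
    return [
        [
            sum(x[i + a][j + b] * w[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def complex_conv2d(x_re, x_im, w_re, w_im):
    """Complex convolution expressed with four real convolutions.

    (W_re + j W_im) * (X_re + j X_im)
        = (W_re*X_re - W_im*X_im) + j (W_im*X_re + W_re*X_im)
    """
    rr = conv2d_real(x_re, w_re)
    ii = conv2d_real(x_im, w_im)
    ri = conv2d_real(x_re, w_im)
    ir = conv2d_real(x_im, w_re)
    out_re = [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(rr, ii)]
    out_im = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(ri, ir)]
    return out_re, out_im


def multitask_loss(t60_pred, t60_true, denoised, clean, aux_weight=0.5):
    """Joint objective: T60 error plus a weighted denoising error.

    ASSUMPTION: squared error for both tasks and a fixed scalar weight.
    `denoised` and `clean` are flat lists of complex spectrum bins.
    """
    t60_loss = (t60_pred - t60_true) ** 2
    denoise_loss = sum(abs(d - c) ** 2 for d, c in zip(denoised, clean)) / len(clean)
    return t60_loss + aux_weight * denoise_loss
```

In a real network the real and imaginary kernels would be learned parameters and the two losses would be backpropagated through the shared encoder; the sketch only shows the arithmetic that a complex-valued layer and a multi-task objective involve.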