Densely Connected Progressive Learning for LSTM-Based Speech Enhancement

Tian Gao, Jun Du, Li-Rong Dai, Chin-Hui Lee
DOI: https://doi.org/10.1109/icassp.2018.8461861
2018-01-01
Abstract: Recently, we proposed a novel progressive learning (PL) framework for deep neural network (DNN) based speech enhancement to improve performance in low signal-to-noise ratio (SNR) environments. In this study, several new contributions are made to this framework. First, the advanced long short-term memory (LSTM) architecture is adopted to achieve better results, namely LSTM-PL, where each LSTM layer is guided to explicitly learn an intermediate target with a specific SNR gain. However, we observe that the performance of the LSTM-PL architecture degrades easily as the number of intermediate targets increases, possibly because information is lost when more target layers are involved. Accordingly, we propose densely connected progressive learning, in which the input and the estimates of the intermediate targets are spliced together to learn the next target. This new structure can fully utilize the rich set of information from the multiple learning targets and alleviate the information loss problem. Experimental results demonstrate that the dense structure with deeper LSTM layers can yield significant gains in the speech intelligibility measure for all noise types and levels. Moreover, post-processing with more targets tends to achieve better performance.
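The dense splicing described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `make_stage` helper is a hypothetical stand-in that uses a single affine map with tanh in place of the LSTM layers, and the feature size, frame count, and number of targets are assumed values chosen only to show how the input dimension grows from stage to stage.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_stage(in_dim, out_dim):
    """Hypothetical stand-in for an LSTM block: one affine map plus tanh."""
    W = rng.standard_normal((in_dim, out_dim)) * 0.1
    b = np.zeros(out_dim)
    return lambda x: np.tanh(x @ W + b)

def dense_progressive_forward(noisy, stages):
    """Run the stages in order. Each stage receives the noisy input
    spliced (concatenated) with every earlier target estimate."""
    estimates = []
    for stage in stages:
        spliced = np.concatenate([noisy] + estimates, axis=-1)
        estimates.append(stage(spliced))
    return estimates

feat_dim = 8      # assumed feature size per frame
num_targets = 3   # assumed: two intermediate targets plus the final target

# Stage k sees the input plus k earlier estimates, so its input
# dimension is feat_dim * (k + 1).
stages = [make_stage(feat_dim * (k + 1), feat_dim) for k in range(num_targets)]

frames = rng.standard_normal((5, feat_dim))  # 5 frames of noisy features
outputs = dense_progressive_forward(frames, stages)
```

In a trained system each entry of `outputs` would be supervised by its own target with a specific SNR gain. Here the weights are random, so only the wiring and the tensor shapes are meaningful.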