Gaussian Density Guided Deep Neural Network For Single-Channel Speech Enhancement

Li Chai, Jun Du, Yan-Nan Wang
DOI: https://doi.org/10.1109/MLSP.2017.8168116
2017-01-01
Abstract: Recently, the minimum mean squared error (MMSE) criterion has been a benchmark optimization criterion for deep neural network (DNN) based speech enhancement. In this study, we propose a probabilistic learning framework for estimating the DNN parameters for single-channel speech enhancement. First, statistical analysis shows that the prediction error vector at the DNN output closely follows a unimodal density for each log-power spectral component. Accordingly, we present a maximum likelihood (ML) approach to DNN parameter learning by characterizing the prediction error vector as a multivariate Gaussian density with a zero mean vector and an unknown covariance matrix. The proposed learning approach is shown to generalize better than MMSE-based DNN learning to unseen noise types, and it can significantly reduce speech distortion in low-SNR environments.
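The relation between the ML criterion and MMSE described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a diagonal covariance matrix (the abstract only says the covariance is unknown), and the names `gaussian_nll`, `ml_variance`, and `mse` are hypothetical. Under that assumption, the negative log-likelihood of the zero-mean Gaussian error reduces to half the squared error when every variance is fixed to 1, and the per-component variance that maximizes the likelihood is the mean squared error of that component.

```python
import numpy as np

def gaussian_nll(pred, target, var):
    """Mean negative log-likelihood of the prediction error under a
    zero-mean Gaussian with DIAGONAL covariance (an assumption of this
    sketch). pred, target: (N, D) arrays; var: (D,) variances."""
    err = target - pred
    # Per-frame NLL, dropping the constant (D / 2) * log(2 * pi).
    per_frame = 0.5 * np.sum(err ** 2 / var + np.log(var), axis=1)
    return per_frame.mean()

def ml_variance(pred, target, floor=1e-8):
    """Closed-form ML estimate of each diagonal variance: the mean
    squared error of that component, floored for numerical safety."""
    err = target - pred
    return np.maximum(np.mean(err ** 2, axis=0), floor)

def mse(pred, target):
    """MMSE-style objective: mean over frames of the squared error norm."""
    return np.mean(np.sum((target - pred) ** 2, axis=1))
```

In an actual training loop one would presumably alternate between updating the variances and updating the network weights, or learn both jointly; the abstract does not specify which, so this sketch only evaluates the objective.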