A Binaural Deep Neural Networks Parameter Mask for the Robust Automatic Speech Recognition System

Yi Jiang, Runsheng Liu
DOI: https://doi.org/10.1109/icnisc.2016.082
2016-01-01
Abstract: Within the framework of computational auditory scene analysis (CASA), a parameter mask estimator based on deep neural networks (DNN) is proposed for automatic speech recognition (ASR) in noisy environments. This paper addresses robustness in binaural machine speech recognition by estimating speech energy with a DNN. An ideal parameter mask (IPM), calculated from the energies of the target speech and the mixture, is introduced as the training target of the DNN estimator. We systematically examine how the DNN generalizes to untrained and real office configurations. Evaluations and comparisons show that the DNN-based binaural estimator yields superior signal-to-noise ratio (SNR) and ASR performance in a variety of noisy environments.