An information fusion framework with multi-channel feature concatenation and multi-perspective system combination for the deep-learning-based robust recognition of microphone array speech

Yan-Hui Tu, Jun Du, Qing Wang, Xiao Bao, Li-Rong Dai, Chin-Hui Lee
DOI: https://doi.org/10.1016/j.csl.2016.12.004
IF: 3.252
2017-01-01
Computer Speech & Language
Abstract:
• Early fusion using multiple beamformers and feature concatenation.
• Late fusion of subnets from multiple perspectives.
• A simplified and effective MVDR beamforming approach.
• Building the best one-pass single DNN system among all submissions to CHiME-3.
What problem does this paper attempt to address?