Abstract: The conventional minimum variance distortionless response (MVDR) beamformer suffers in practice from sensitivity to steering vector mismatch and from beampattern distortion. To address these problems, this paper proposes a robust, low-latency broadband MVDR beamforming method based on covariance matrix reconstruction and applies it to speech enhancement with a linear microphone array. Several key steps are optimized, and the main contribution is the treatment of the correlation terms that arise from low-latency processing. First, the direction of arrival (DOA) is corrected and the steering vector is estimated by exploiting the sparsity of the DOAs of the sound sources, which improves robustness to steering vector mismatch. Second, the correlation terms between the sound sources and the noise are estimated and eliminated using the Capon power within the eigen-subspace, and the indirect dominant method is used to eliminate the correlation terms between the sound sources; the covariance matrix is then reconstructed to obtain a more robust MVDR beamformer. Third, the problem of white noise amplification at low-frequency bins is analyzed, and a white noise gain (WNG) modification method is proposed to trade off interference suppression against WNG. In the experiments, the TIMIT corpus is used to generate a multi-channel speech data set, and the proposed method is evaluated at different DOAs and input signal-to-interference-plus-noise ratios (SINRs). The experimental results show that the proposed method effectively suppresses interference and reduces noise, with strong robustness.
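For orientation, the textbook narrowband MVDR weights and the white noise gain mentioned in the abstract can be sketched as follows. This is only a minimal illustration and not the method proposed in the paper: it assumes a far-field uniform linear array, a known interference-plus-noise covariance, and plain diagonal loading as a generic stand-in for a WNG adjustment. The function names, array geometry, and all numeric values are hypothetical.

```python
import numpy as np

def steering_vector(num_mics, spacing_m, doa_deg, freq_hz, c=343.0):
    """Far-field steering vector of a uniform linear array (assumed geometry)."""
    positions = np.arange(num_mics) * spacing_m
    delays = positions * np.cos(np.deg2rad(doa_deg)) / c
    return np.exp(-2j * np.pi * freq_hz * delays)

def mvdr_weights(R, a, loading=0.0):
    """Standard MVDR solution w = R^{-1} a / (a^H R^{-1} a).

    `loading` adds a scaled identity to R (diagonal loading), a common
    generic robustness device; it is NOT the paper's WNG modification.
    """
    R_loaded = R + loading * np.eye(R.shape[0])
    Ri_a = np.linalg.solve(R_loaded, a)
    return Ri_a / np.vdot(a, Ri_a)

def white_noise_gain(w, a):
    """WNG = |w^H a|^2 / (w^H w)."""
    return np.abs(np.vdot(w, a)) ** 2 / np.real(np.vdot(w, w))

# Toy scene (all values are illustrative assumptions).
M, d, f = 6, 0.04, 1000.0
a_target = steering_vector(M, d, 90.0, f)
a_interf = steering_vector(M, d, 40.0, f)

# Interference-plus-noise covariance: one strong interferer plus white noise.
R = 10.0 * np.outer(a_interf, a_interf.conj()) + 0.01 * np.eye(M)

w = mvdr_weights(R, a_target)
w_loaded = mvdr_weights(R, a_target, loading=1.0)

print("target response:      ", np.abs(np.vdot(w, a_target)))
print("interference response:", np.abs(np.vdot(w, a_interf)))
print("WNG without loading:  ", white_noise_gain(w, a_target))
print("WNG with loading:     ", white_noise_gain(w_loaded, a_target))
```

In this sketch the distortionless constraint keeps the target response at unity while the interferer is strongly attenuated, and increasing the loading moves the weights toward delay-and-sum, which raises the WNG at the cost of a shallower null. That is the kind of trade-off between interference suppression and WNG the abstract refers to.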

Robust speech recognition using beamforming with adaptive microphone gains and multichannel noise reduction

Deep Long Short-Term Memory Adaptive Beamforming Networks For Multichannel Robust Speech Recognition

On Design of Robust Deep Models for CHiME-4 Multi-Channel Speech Recognition with Multiple Configurations of Array Microphones

Noise Robust Speech Recognition Using Multi-Channel Based Channel Selection And Channel Weighting

An information fusion approach to recognizing microphone array speech in the CHiME-3 challenge based on a deep learning framework

Building state-of-the-art distant speech recognition using the CHiME-4 challenge with a setup of speech enhancement baseline

Channel selection using neural network posterior probability for speech recognition with distributed microphone arrays in everyday environments

An analysis of environment, microphone and data simulation mismatches in robust speech recognition

A Study of Learning Based Beamforming Methods for Speech Recognition

The THU-SPMI CHiME-4 system: Lightweight design with advanced multi-channel processing, feature enhancement, and language modeling

A unified multichannel far-field speech recognition system: combining neural beamforming with attention based end-to-end model

Design of a robust MVDR beamforming method with Low-Latency by reconstructing covariance matrix for speech enhancement

Wavoice: A mmWave-assisted Noise-resistant Speech Recognition System

A Speaker-Dependent Approach to Separation of Far-Field Multi-Talker Microphone Array Speech for Front-End Processing in the CHiME-5 Challenge

An Iterative Mask Estimation Approach to Deep Learning Based Multi-Channel Speech Recognition

Unsupervised Speech Enhancement Based on Multichannel NMF-Informed Beamforming for Noise-Robust Automatic Speech Recognition

Attention-Based Beamformer For Multi-Channel Speech Enhancement

Multi-Channel Feature Adaptation for Robust Speech Recognition

An information fusion framework with multi-channel feature concatenation and multi-perspective system combination for the deep-learning-based robust recognition of microphone array speech
