Abstract: Compared with single-channel speech processing, multi-microphone speech processing methods are capable of high interference suppression in noisy environments because of their spatial filtering capability. This dissertation develops novel microphone array speech processing methods in a variety of configurations, and also analyzes and provides insights into existing popular techniques. First, we develop a two-microphone source localization technique for multiple speech sources that utilizes speech-specific properties and the generalized mixture decomposition clustering algorithm. Voiced speech is sparse in the frequency domain and can be represented by sinusoidal tracks via sinusoidal modeling, which provides high local SNR. By utilizing the inter-channel phase differences (IPD) between the two channels on the sinusoidal tracks, the localization of multiple mixed speech sources is turned into a clustering problem on the IPD vs. frequency plot. The generalized mixture decomposition algorithm (GMDA) is used to cluster the groups of points corresponding to the multiple sources and thus estimate the direction of arrival (DOA) of each source. Our next work considers data-dependent adaptive beamformers, which are known to have high resolution and interference rejection capability when the array steering vector is accurately known. However, these methods degrade severely if steering vector error exists, so robust variants are needed to remedy this sensitivity. We compare and analyze recent developments in adaptive beamforming. We then develop a robust broadband adaptive beamforming algorithm that combines the robustness of delay-and-sum beamforming in the look direction with the high interference rejection capability of adaptive beamforming algorithms. Based on J. Li and P. Stoica's work on robust Capon beamforming, we develop variants of the constrained robust Capon beamformer that attempt to limit the search in the underlying optimization problem to a feasible set of steering vectors, thereby achieving improved performance. Another class of promising multi-channel signal separation algorithms that complement beamforming methods is blind source separation. We analyze and provide insight into one such class of blind source separation methods, independent component analysis (ICA) methods. For separating convolutively mixed source signals, the frequency-domain ICA approach is often used because it simplifies the time-domain convolutive mixing problem into an instantaneous mixing problem in each frequency bin. We examine and provide insights into frequency-domain ICA methods for source separation in reverberant environments. Concentrating on bin-wise ICA methods, a significant contribution of this work is to show that signals modeled using Gaussian scale mixture (GSM) densities can be separated using ICA even though they might be dependent on each other, as long as the frame dynamics of the source signals are different almost surely. We also analyze the stability conditions of complex maximum likelihood ICA/IVA. Lastly, in an attempt to make the best of ICA and beamforming methods, we propose two approaches for combining geometric information with ICA algorithms to solve the permutation problem in a scenario where approximate information about the direction of the desired source is known.
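To illustrate why source localization reduces to a problem on the IPD vs. frequency plot, the following is a minimal sketch under assumptions that are not stated in the abstract: a far-field, free-field model, a single source, a hypothetical microphone spacing of 4 cm, a speed of sound of 343 m/s, and frequencies low enough that the phase does not wrap. Under these assumptions the IPD is a straight line through the origin whose slope encodes the DOA. The function names `ipd_model` and `estimate_doa` are illustrative only, and the simple least-squares line fit stands in for the GMDA clustering, which is needed in the multi-source case and is not reproduced here.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def ipd_model(freq_hz, doa_rad, mic_spacing_m, c=SPEED_OF_SOUND):
    """Unwrapped far-field IPD (radians) for a source at doa_rad from broadside."""
    return 2.0 * np.pi * freq_hz * mic_spacing_m * np.sin(doa_rad) / c


def estimate_doa(freq_hz, ipd_rad, mic_spacing_m, c=SPEED_OF_SOUND):
    """Fit a line through the origin to (frequency, IPD) points and map its slope to a DOA."""
    slope = np.dot(freq_hz, ipd_rad) / np.dot(freq_hz, freq_hz)
    sin_theta = np.clip(slope * c / (2.0 * np.pi * mic_spacing_m), -1.0, 1.0)
    return np.arcsin(sin_theta)


# Synthetic (frequency, IPD) points standing in for measurements on sinusoidal tracks.
rng = np.random.default_rng(0)
spacing = 0.04                           # hypothetical 4 cm spacing
freqs = np.linspace(200.0, 3000.0, 50)   # below the spatial aliasing limit for 4 cm
true_doa = np.deg2rad(30.0)
ipd = ipd_model(freqs, true_doa, spacing) + rng.normal(0.0, 0.02, freqs.size)

doa_hat = estimate_doa(freqs, ipd, spacing)
print(f"estimated DOA: {np.rad2deg(doa_hat):.2f} degrees")
```

With several simultaneous sources, the points fall on several such lines at once (and wrap at higher frequencies), which is why a clustering step is required rather than a single line fit.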

Closely Coupled Array Processing and Model-Based Compensation for Microphone Array Speech Recognition

Microphone array processing via joint wideband angle-of-arrival estimation and speech feature enhancement

Low Complexity Modeling of Cross-Spectral Matrix and Its Application in the Non-Synchronous Measurements of Microphones Array

Matched-field source localization under multi-coherent modal group model via covariance matrix matching

GSC-like Speech Enhancement for Dual Small Microphone Array

Feature mapping of multiple beamformed sources for robust overlapping speech recognition using a microphone array

Broadband Beamforming Compensation Algorithm in CI Front-End Acquisition

An Algorithm of Model Compensation Based on the Estimation of Additive Noise and Channel Function for Speech Recognition

Adaptive Beamforming Based on Interference-Plus-Noise Covariance Matrix Reconstruction for Speech Separation

A unified multichannel far-field speech recognition system: combining neural beamforming with attention based end-to-end model

Microphone Array Processing for Speech: Dual Channel Localization, Robust Beamforming, and ICA Analysis

Speech Enhancement with Generalized Sidelobe Canceller Based on a Coherence-based Filter for Small Microphone Arrays

A Speaker-Dependent Approach to Separation of Far-Field Multi-Talker Microphone Array Speech for Front-End Processing in the CHiME-5 Challenge

Design of a robust MVDR beamforming method with Low-Latency by reconstructing covariance matrix for speech enhancement

A Comparative Study of Noise Estimation Algorithms for Nonlinear Compensation in Robust Speech Recognition

Robust Speaker Recognition in Cross-Channel Condition Based on Gaussian Mixture Model

Array Configuration-Agnostic Personalized Speech Enhancement Using Long-Short-Term Spatial Coherence

Separating Voices from Multiple Sound Sources Using 2D Microphone Array

Observer for Phased Microphone Array Signal Processing with Nonlinear Output

Residual Noise Compensation For Robust Speech Recognition In Nonstationary Noise