Abstract: Compared with single-channel speech processing, multi-microphone speech processing methods can achieve high interference suppression in noisy environments because of their spatial filtering capability. This dissertation develops novel microphone array speech processing methods in a variety of configurations, and also analyzes and provides insights into existing popular techniques. First, we develop a two-microphone source localization technique for multiple speech sources that utilizes speech-specific properties and the generalized mixture decomposition algorithm (GMDA) for clustering. Voiced speech is sparse in the frequency domain and can be represented by sinusoidal tracks via sinusoidal modeling, which provides high local SNR. By utilizing the inter-channel phase differences (IPD) between the two channels on the sinusoidal tracks, the localization of multiple mixed speech sources is turned into a clustering problem on the IPD vs. frequency plot. GMDA is used to cluster the groups of points corresponding to the individual sources and thus estimate the direction of arrival (DOA) of each source. Our next work considers data-dependent adaptive beamformers, which are known to have high resolution and interference rejection capability when the array steering vector is accurately known. However, these methods degrade severely if steering vector errors exist, so robust variants are needed to remedy this sensitivity. We compare and analyze recent developments in adaptive beamforming. We then develop a robust broadband adaptive beamforming algorithm that combines the robustness of delay-and-sum beamforming in the look direction with the high interference rejection capability of adaptive beamforming. Based on J. Li and P. Stoica's work on robust Capon beamforming, we develop variants of the constrained robust Capon beamformer that attempt to limit the search in the underlying optimization problem to a feasible set of steering vectors, thereby achieving improved performance. Another promising class of multi-channel signal separation algorithms that complements beamforming is blind source separation. We analyze and provide insight into one such class of blind source separation methods, independent component analysis (ICA) methods. For separating convolutively mixed source signals, the frequency domain ICA approach is often used because it simplifies the time domain convolutive mixing problem into an instantaneous mixing problem in each frequency bin. We examine and provide insights into frequency domain ICA methods for source separation in reverberant environments. Concentrating on bin-wise ICA methods, a significant contribution of this work is to show that signals modeled using a Gaussian scale mixture (GSM) density can be separated using ICA even though they might be dependent on each other, as long as the frame dynamics of the source signals are different almost surely. We also analyze the stability conditions of complex maximum likelihood ICA/IVA. Lastly, in an attempt to make the best of ICA and beamforming methods, we propose two approaches for combining geometric information with ICA algorithms to solve the permutation problem in a scenario where approximate information about the direction of the desired source is known.
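
The link between IPD and DOA mentioned in the abstract can be illustrated with a minimal single-source sketch. It is not the dissertation's GMDA clustering method: it assumes a far-field, free-field source, an angle measured from broadside, no phase wrapping, and a plain least-squares line through the origin in place of clustering. The function names, the 5 cm microphone spacing, and the 343 m/s speed of sound are all illustrative assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for room-temperature air


def ipd_for_doa(freq_hz, doa_deg, mic_spacing_m, c=SPEED_OF_SOUND):
    """Ideal far-field IPD in radians for a source doa_deg away from broadside."""
    delay = mic_spacing_m * math.sin(math.radians(doa_deg)) / c
    return 2.0 * math.pi * freq_hz * delay


def doa_from_ipd_points(freqs_hz, ipds_rad, mic_spacing_m, c=SPEED_OF_SOUND):
    """Fit a line through the origin on the IPD-vs-frequency plot and map its slope to a DOA."""
    num = sum(f * p for f, p in zip(freqs_hz, ipds_rad))
    den = sum(f * f for f in freqs_hz)
    slope = num / den  # radians per Hz
    sin_theta = slope * c / (2.0 * math.pi * mic_spacing_m)
    sin_theta = max(-1.0, min(1.0, sin_theta))  # guard against rounding outside [-1, 1]
    return math.degrees(math.asin(sin_theta))


# Hypothetical sinusoidal-track frequencies and a hypothetical 5 cm microphone spacing.
freqs = [200.0, 400.0, 800.0, 1600.0, 3000.0]
d = 0.05
ipds = [ipd_for_doa(f, 30.0, d) for f in freqs]
estimate = doa_from_ipd_points(freqs, ipds, d)
print(round(estimate, 3))
```

With several simultaneous sources, the points on the IPD vs. frequency plot fall along several such lines, which is why the abstract frames localization as a clustering problem rather than a single line fit.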

Separating Voices from Multiple Sound Sources Using 2D Microphone Array

Microphone array processing via joint wideband angle-of-arrival estimation and speech feature enhancement

Experiments on Blind Speech Separations

Source Separation by Feature-Based Clustering of Microphones in Ad Hoc Arrays

Adaptive Beamforming Based on Interference-Plus-Noise Covariance Matrix Reconstruction for Speech Separation

Adaptive Speech Separation Based on Beamforming and Frequency Domain-Independent Component Analysis

A Speaker-Dependent Approach to Separation of Far-Field Multi-Talker Microphone Array Speech for Front-End Processing in the CHiME-5 Challenge

Deep Ad-hoc Beamforming Based on Speaker Extraction for Target-Dependent Speech Separation

Microphone Array Processing for Speech: Dual Channel Localization, Robust Beamforming, and ICA Analysis

Towards Robust Multiple Blind Source Localization Using Source Separation and Beamforming

A Multi-channel Speech Separation System for Unknown Number of Multiple Speakers

A unified multichannel far-field speech recognition system: combining neural beamforming with attention based end-to-end model

Locate and Beamform: Two-dimensional Locating All-neural Beamformer for Multi-channel Speech Separation

Localization and Separation of Acoustic Sources by Using a 2.5-Dimensional Circular Microphone Array

Design and Implementation of A Space Domain Spherical Microphone Array with Application to Source Localization and Separation

Poster Abstract: Voxnet Acoustic Array For Multiple Bird Source Separation By Beamforming Using Measured Data

Comparison and application of the far-field identification algorithms for multiple sound sources based on microphone array

A Consolidated Perspective on Multimicrophone Speech Enhancement and Source Separation

Distributed speech separation in spatially unconstrained microphone arrays

Binaural Angular Separation Network

DualSep: A Light-weight dual-encoder convolutional recurrent network for real-time in-car speech separation