Abstract: This paper addresses the problem of multiple-speaker localization in noisy and reverberant environments, using binaural recordings of an acoustic scene. A complex-valued Gaussian mixture model (CGMM) is adopted, whose components correspond to all the possible candidate source locations defined on a grid. After optimizing the CGMM-based objective function, given an observed set of complex-valued binaural features, both the number of sources and their locations are estimated by selecting the CGMM components with the largest weights. An entropy-based penalty term is added to the likelihood to impose sparsity over the set of CGMM component weights. This favors a small number of detected speakers relative to the large number of initial candidate source locations. In addition, the direct-path relative transfer function (DP-RTF) is used to build robust binaural features. The DP-RTF, recently proposed for single-source localization, encodes interchannel information corresponding to the direct path of sound propagation and is thus robust to reverberation. In this paper, we extend the DP-RTF estimation to the case of multiple sources. In the short-time Fourier transform domain, a consistency test is proposed to check whether a set of consecutive frames is associated with the same source or not. Reliable DP-RTF features are selected from the frames that pass the consistency test and are then used for source localization. Experiments carried out using both simulated data and real data recorded with a robotic head confirm the effectiveness of the proposed multisource localization method.
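The weight-selection idea in the abstract can be illustrated with a small numerical sketch. It is not the authors' algorithm: the abstract does not say how the penalized objective is optimized, so the code below swaps in a plain exponentiated-gradient ascent on the weight simplex. The unit-modulus feature templates (`candidate_means`), the isotropic noise variance `var`, the penalty strength `gamma`, the step size, and the detection threshold are all assumptions made for illustration only.

```python
import numpy as np


def candidate_means(num_candidates, num_freqs):
    """Hypothetical unit-modulus feature template for each grid location."""
    k = np.arange(num_candidates)[:, None]
    f = np.arange(1, num_freqs + 1)[None, :]
    return np.exp(2j * np.pi * k * f / num_candidates)


def simulate_features(means, active, per_source, var, seed=0):
    """Draw noisy complex features around the templates of the active sources."""
    rng = np.random.default_rng(seed)
    dim = means.shape[1]
    blocks = []
    for k in active:
        noise = (rng.standard_normal((per_source, dim))
                 + 1j * rng.standard_normal((per_source, dim)))
        # Circular complex noise with variance `var` per dimension.
        blocks.append(means[k] + np.sqrt(var / 2.0) * noise)
    return np.concatenate(blocks)


def fit_sparse_weights(x, means, var, gamma=0.1, step=0.5, iters=200):
    """Maximize mean log-likelihood minus gamma * entropy(w) over the weights.

    Component means and variance stay fixed; only the weights are updated,
    using exponentiated-gradient steps (an assumed stand-in optimizer).
    """
    # Squared distance of every feature to every candidate template: (N, K).
    sq = np.sum(np.abs(x[:, None, :] - means[None, :, :]) ** 2, axis=2)
    log_p = -sq / var  # log-density up to a constant shared by all components
    # Row-wise rescaling avoids underflow and leaves the gradient unchanged.
    p = np.exp(log_p - log_p.max(axis=1, keepdims=True))
    w = np.full(p.shape[1], 1.0 / p.shape[1])
    for _ in range(iters):
        mix = p @ w
        # Likelihood gradient plus gradient of gamma * sum(w * log w).
        grad = (p / mix[:, None]).mean(axis=0) + gamma * (np.log(w) + 1.0)
        w = w * np.exp(step * grad)
        w = np.maximum(w / w.sum(), 1e-12)  # floor keeps log(w) finite
        w /= w.sum()
    return w


def detect_sources(w, threshold=0.1):
    """Indices of the candidates whose weight exceeds an assumed threshold."""
    return np.flatnonzero(w > threshold)


if __name__ == "__main__":
    means = candidate_means(num_candidates=8, num_freqs=4)
    x = simulate_features(means, active=[2, 5], per_source=100, var=0.05)
    w = fit_sparse_weights(x, means, var=0.05)
    print(np.round(w, 3), detect_sources(w))
```

In this toy setting the negative-entropy term pushes the weights of unused candidates toward zero, so counting sources reduces to counting the weights above the threshold. In practice the features would be estimated DP-RTFs and the templates would come from a model or measurement of the recording setup.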

A Two Microphone-Based Approach For Source Localization Of Multiple Speech Sources

Two microphone based direction of arrival estimation for multiple speech sources using spectral properties of speech

Passive Localization Based on Double-Correlation Function Using Sources of Opportunity

Efficient Localization of Low-Frequency Sound Source with Non-Synchronous Measurement at Coprime Positions by Alternating Direction Method of Multipliers

Multiple-Speaker Localization Based on Direct-Path Features and Likelihood Maximization with Spatial Sparsity Regularization

Multiple Sound Source Counting and Localization Based on Spatial Principal Eigenvector

DOA estimation of multiple speech sources based on the single-source point detection using an FOA microphone

Multiple Sound Source Counting and Localization Based on TF-Wise Spatial Spectrum Clustering

Joint DOA Estimation and Dereverberation Based on Multi-Channel Linear Prediction Filtering and Azimuth Sparsity

Multiple Sound Source Localization Based on TDOA Clustering and Multi-Path Matching Pursuit

Probabilistic Binaural Multiple Sources Localization Based On Time-Delay Compensation Estimator And Clustering Analysis

Source Separation by Feature-Based Clustering of Microphones in Ad Hoc Arrays

Dual-Microphone Source Location Method in 2-D Space

Microphone Clustering and BP Network based Acoustic Source Localization in Distributed Microphone Arrays

A Fast and Robust Localization Method for Low-Frequency Acoustic Source: Variational Bayesian Inference Based on Nonsynchronous Array Measurements

Estimation for the Location of Multiple Moving Sound Sources in Small-Distance Dual-Microphone

Multiple Sound Source Localization and Counting Using One Pair of Microphones in Noisy and Reverberant Environments

ACP1–ADA1 interaction in type 2 diabetes: a study in coronary artery disease

Deconvolution-based Acoustic Source Localization and Separation Algorithms

Towards Robust Multiple Blind Source Localization Using Source Separation and Beamforming

Closed-form multiple source direction-of-arrival estimator under reverberant environments