Abstract: This paper addresses the problem of multiple-speaker localization in noisy and reverberant environments, using binaural recordings of an acoustic scene. A complex-valued Gaussian mixture model (CGMM) is adopted, whose components correspond to all the candidate source locations defined on a grid. After optimizing the CGMM-based objective function, given an observed set of complex-valued binaural features, both the number of sources and their locations are estimated by selecting the CGMM components with the largest weights. An entropy-based penalty term is added to the likelihood to impose sparsity over the set of CGMM component weights. This favors a small number of detected speakers relative to the large number of initial candidate source locations. In addition, the direct-path relative transfer function (DP-RTF) is used to build robust binaural features. The DP-RTF, recently proposed for single-source localization, encodes interchannel information corresponding to the direct path of sound propagation and is thus robust to reverberation. In this paper, we extend the DP-RTF estimation to the case of multiple sources. In the short-time Fourier transform domain, a consistency test is proposed to check whether a set of consecutive frames is associated with the same source or not. Reliable DP-RTF features are selected from the frames that pass the consistency test and are then used for source localization. Experiments carried out using both simulated data and real data recorded with a robotic head confirm the effectiveness of the proposed multisource localization method.
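The weight-estimation idea in the abstract (fixed mixture components on a grid, an entropy penalty on the weights, then selection of the largest weights) can be illustrated with a small toy sketch. Everything below is an assumption made for illustration and is not the paper's algorithm: the likelihoods are real-valued toy Gaussians rather than complex-valued DP-RTF features, the optimizer is a plain exponentiated-gradient ascent on the simplex, and the names `toy_likelihoods`, `estimate_sparse_weights`, `detect_sources`, as well as the values of `gamma`, `step` and `threshold`, are hypothetical.

```python
import numpy as np


def toy_likelihoods(true_idx=(2, 7), n_cand=10, frames_per_source=100):
    """Toy stand-in for per-frame component likelihoods (assumption, not DP-RTF).

    Returns lik[t, d]: likelihood of frame t under candidate location d.
    """
    grid = np.arange(n_cand)
    labels = np.repeat(np.array(true_idx), frames_per_source)
    # Gaussian bump around the true candidate, plus a small floor.
    return np.exp(-0.5 * (grid[None, :] - labels[:, None]) ** 2) + 1e-6


def estimate_sparse_weights(lik, gamma=0.1, step=0.5, n_iter=200, eps=1e-12):
    """Maximize mean log-likelihood minus gamma * entropy over the weights.

    The components are fixed; only the weights w (on the simplex) are updated,
    here with exponentiated-gradient steps (an illustrative choice).
    """
    n_cand = lik.shape[1]
    w = np.full(n_cand, 1.0 / n_cand)
    for _ in range(n_iter):
        mix = lik @ w                                # mixture likelihood per frame
        grad_ll = (lik / mix[:, None]).mean(axis=0)  # gradient of mean log-likelihood
        grad_pen = gamma * (np.log(w) + 1.0)         # gradient of gamma * sum(w log w)
        w = w * np.exp(step * (grad_ll + grad_pen))
        w = np.maximum(w, eps)                       # keep log(w) finite
        w /= w.sum()                                 # project back onto the simplex
    return w


def detect_sources(w, threshold=0.1):
    """Indices of candidate locations whose weight exceeds a (hypothetical) threshold."""
    return np.flatnonzero(w > threshold)


lik = toy_likelihoods()
weights = estimate_sparse_weights(lik)
print(detect_sources(weights))
```

In this toy setting the weight mass concentrates on the two candidates that generated the frames, so counting the weights above the threshold gives the number of sources and their indices give the locations. With real features the threshold and penalty strength would need tuning.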

A Two-Microphone Method for Localization of Multiple Speech Sources Using Complex Exponential Transform of Phase Differences

Passive Localization Based on Double-Correlation Function Using Sources of Opportunity

A Two Microphone-Based Approach For Source Localization Of Multiple Speech Sources

Passive Localization Based on Spectral Estimation Methods Using Sources of Opportunity

Efficient Localization of Low-Frequency Sound Source with Non-Synchronous Measurement at Coprime Positions by Alternating Direction Method of Multipliers

Multiple-Speaker Localization Based on Direct-Path Features and Likelihood Maximization with Spatial Sparsity Regularization

A Modified Cross Power-Spectrum Phase Method Based on Microphone Array for Acoustic Source Localization

Two microphone based direction of arrival estimation for multiple speech sources using spectral properties of speech

An iteratively reweighted steered response power approach to multisource localization using a distributed microphone network

Dual-Microphone Source Location Method in 2-D Space

Multiple Sound Source Counting and Localization Based on TF-Wise Spatial Spectrum Clustering

Multiple Sound Source Localization Based on TDOA Clustering and Multi-Path Matching Pursuit

Real-time Sound Source Localization Using Hybrid Framework

SRP-DNN: Learning Direct-Path Phase Difference for Multiple Moving Sound Source Localization

Joint DOA Estimation and Dereverberation Based on Multi-Channel Linear Prediction Filtering and Azimuth Sparsity

Estimation for the Location of Multiple Moving Sound Sources in Small-Distance Dual-Microphone

Probabilistic Binaural Multiple Sources Localization Based On Time-Delay Compensation Estimator And Clustering Analysis

Microphone Clustering and BP Network based Acoustic Source Localization in Distributed Microphone Arrays

A Two-Stage Approach for the Estimation of Multiple Input Multiple Output Acoustic Channels

RSS-Based Multiple Co-Channel Sources Localization With Unknown Shadow Fading and Transmitted Power

Two-Microphones Speech Separation Using Generalized Gaussian Mixture Model