Binaural sound source localization based on generalized parametric model and two-layer matching strategy in complex environments

Liu Hong, Pang Cheng, Zhang Jie
DOI: https://doi.org/10.1109/ICRA.2015.7139822
2015-01-01
Abstract: Binaural sound source localization is an important technique for applications such as Human-Robot Interaction (HRI), video conferencing, and speech enhancement. In many real application scenarios, especially in enclosed environments, reverberation and noise degrade the precision of position estimates. This paper therefore proposes a new binaural sound source localization method for complex environments, based on a generalized parametric model and a two-layer matching strategy. First, cepstral prefiltering is used to dereverberate the binaural signals. Then, two binaural cues computed from a dual-channel frequency representation are combined to estimate the azimuths of the sources. In addition, the generalized parametric model describes the relationship between azimuth and the binaural cues by finding the optimal scaling factors from training data. Finally, a two-layer matching strategy based on Bayes' rule makes the final decision, which effectively reduces computational complexity. Experiments validate the proposed approach and show that it achieves better results than several available methods without extra spatial burden. © 2015 IEEE.
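The abstract does not spell out how the two matching layers work, so the following is only a minimal sketch of one plausible reading: a coarse layer scores a sparse subset of azimuth templates, and a fine layer applies Bayes' rule to the templates around the best coarse match. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the toy cue model in `toy_cues`, the Gaussian likelihood, the uniform prior, and the `coarse_step` parameter.

```python
import numpy as np

def toy_cues(azimuth_deg):
    # Hypothetical 2-D cue vector per azimuth (stand-in for two binaural cues).
    theta = np.radians(azimuth_deg)
    return np.stack([np.sin(theta), np.asarray(azimuth_deg) / 90.0], axis=-1)

def gaussian_log_likelihood(observed, templates, sigma):
    # log p(observed | azimuth) up to an additive constant, one value per template.
    diff = templates - observed
    return -0.5 * np.sum(diff ** 2, axis=-1) / sigma ** 2

def two_layer_match(observed, azimuths, templates, coarse_step=6, sigma=0.05):
    """Return (estimated azimuth, number of templates evaluated)."""
    n = len(azimuths)
    # Layer 1: score only every coarse_step-th template.
    coarse_idx = np.arange(0, n, coarse_step)
    coarse_ll = gaussian_log_likelihood(observed, templates[coarse_idx], sigma)
    best = coarse_idx[np.argmax(coarse_ll)]
    # Layer 2: score every template in the neighbourhood of the coarse winner.
    fine_idx = np.arange(max(0, best - coarse_step + 1), min(n, best + coarse_step))
    fine_ll = gaussian_log_likelihood(observed, templates[fine_idx], sigma)
    # Bayes' rule with a uniform prior: posterior is the normalized likelihood.
    posterior = np.exp(fine_ll - fine_ll.max())
    posterior /= posterior.sum()
    estimate = azimuths[fine_idx[np.argmax(posterior)]]
    return estimate, len(coarse_idx) + len(fine_idx)

azimuths = np.arange(-90, 91, 5)   # 37 candidate azimuths in degrees
templates = toy_cues(azimuths)     # one template cue vector per azimuth
observed = toy_cues(35)            # noise-free observation from 35 degrees
estimate, n_evaluated = two_layer_match(observed, azimuths, templates)
print(estimate, n_evaluated, len(azimuths))
```

In this toy setup the search scores fewer templates than an exhaustive scan over all candidate azimuths, which is the kind of saving the abstract attributes to the two-layer strategy. The sketch relies on the cue varying smoothly with azimuth, so that the coarse winner lies near the true direction.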