Direction of Arrival Estimation for Indoor Environments Based on Acoustic Composition Model with a Single Microphone

Xingchen Guo, Xuexin Xu, Xunquan Chen, Jinhui Chen, Rong Jia, Zhihong Zhang, Tetsuya Takiguchi, Edwin R. Hancock
DOI: https://doi.org/10.1016/j.patcog.2022.108715
IF: 8
2022-01-01
Pattern Recognition
Abstract: This paper presents an effective method for multiple-talker localization using only a single microphone in a room. One of the main challenges is obtaining a model that can be used to estimate the localization parameter. This model must be sensitive to all possible speaker locations and must correctly discriminate between their positions. The reverberant speech signal in a room environment can be modeled as a composition of the clean speech and the acoustic transfer function (ATF). The ATF is a useful tool for describing changes in the speech source, so ATF-based approaches can be used to identify talker locations with a single microphone. This paper presents two methods for obtaining the ATF of a room, referred to as the Composite Reverberant Speech (CRS) model and the Direct Training Reverberant Speech (DTRS) model. Approaches based on the proposed methods can successfully and accurately perform the multi-talker localization task with a single microphone. Experiments also demonstrate the effectiveness of the proposed methods. © 2022 Elsevier Ltd. All rights reserved.
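The composition of clean speech and the ATF mentioned in the abstract can be illustrated with the standard room-acoustics signal model, in which the reverberant signal is the convolution of the clean signal with a room impulse response and the ATF is that response in the frequency domain. The following is a minimal sketch under that assumption only; it is not the paper's CRS or DTRS model. The function name `reverberate`, the random stand-in for speech, and the echo delays and gains are all hypothetical.

```python
import numpy as np

def reverberate(clean, impulse_response):
    """Full linear convolution of a clean signal with a room impulse response."""
    return np.convolve(clean, impulse_response, mode="full")

rng = np.random.default_rng(0)
clean = rng.standard_normal(256)  # stand-in for a clean speech frame (assumption)

# Hypothetical impulse response: direct path plus two decaying echoes.
impulse_response = np.zeros(64)
impulse_response[0] = 1.0
impulse_response[20] = 0.5
impulse_response[45] = 0.25

reverberant = reverberate(clean, impulse_response)

# In the frequency domain, convolution becomes multiplication by the ATF.
n = len(reverberant)  # zero-pad to the full convolution length
atf = np.fft.rfft(impulse_response, n)
spectrum_product = np.fft.rfft(clean, n) * atf
assert np.allclose(np.fft.rfft(reverberant, n), spectrum_product)
```

Because the impulse response depends on where the source sits relative to the microphone, a different talker position would give a different ATF, which is the property a single-microphone localization approach can exploit.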