Exploring the Large-Scale TDOA Feature Space for Speaker Diarization.

Yi Yang, Jia Liu
DOI: https://doi.org/10.1007/978-3-319-07857-1_97
2014-01-01
Abstract: Time-Delay-Of-Arrival (TDOA) features have proven highly beneficial to conventional acoustic feature-based speaker diarization systems because they link speakers to their location information. However, most state-of-the-art speaker diarization systems rely on a relatively small number of distant microphones, which may not be sufficient to fully exploit the spatial information of speakers. In this study, the feature space spanned by TDOAs from up to 64 distant microphones is explored with the aim of improving speaker classification, an important branch of speaker diarization. Additionally, having observed intrinsic correlations in the high-dimensional feature space spanned by large-scale TDOAs, we compare several dimensionality reduction algorithms to find an effective low-dimensional representation of TDOAs. Experimental results on speaker classification show consistent improvements as the TDOA feature space is expanded by increasing the number of distant microphones. Furthermore, dimensionality reduction that exploits manifold information proves necessary for large-scale TDOAs.
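As a rough illustration of what a TDOA feature vector is, the sketch below estimates the delay of each microphone channel relative to a reference channel. The abstract does not say which delay estimator the authors use; GCC-PHAT is assumed here only because it is a common choice, and the function names `gcc_phat_delay` and `tdoa_vector`, the `max_lag` parameter, and the choice of channel 0 as reference are all hypothetical.

```python
import numpy as np


def gcc_phat_delay(sig, ref, max_lag):
    """Estimate the delay of `sig` relative to `ref`, in samples, via GCC-PHAT.

    A positive result means `sig` lags behind `ref`.
    """
    n = len(sig) + len(ref)  # zero-pad so the circular correlation does not wrap
    cross = np.fft.rfft(sig, n=n) * np.conj(np.fft.rfft(ref, n=n))
    cross /= np.abs(cross) + 1e-12  # PHAT weighting: keep phase, drop magnitude
    cc = np.fft.irfft(cross, n=n)
    # Reorder so that the array covers lags -max_lag .. +max_lag
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    return int(np.argmax(cc)) - max_lag


def tdoa_vector(channels, max_lag=32):
    """Build one TDOA feature vector from an array of shape (n_mics, n_samples).

    Channel 0 is used as the reference (an assumption of this sketch), so the
    result has n_mics - 1 entries.
    """
    ref = channels[0]
    return np.array([gcc_phat_delay(ch, ref, max_lag) for ch in channels[1:]])
```

With 64 microphones and a single reference this yields a 63-dimensional vector per analysis window, which gives a sense of why the abstract turns to dimensionality reduction for large arrays.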