A Quick and Effective Speaker Diarization System.

Zuoer Chen, Liang He
DOI: https://doi.org/10.21437/odyssey.2022-24
2022-01-01
Abstract: Agglomerative hierarchical clustering with variational Bayes hidden Markov model re-clustering (AHC-VBHMM) and spectral clustering (SC) are currently the two dominant clustering methods for the speaker diarization task. The former achieves state-of-the-art performance on several well-known evaluation databases, such as CallHome 97, CallHome 00, NIST RT09, and DIHARD, at the cost of heavy computation. The latter requires fewer computational resources but fails to make good use of time-series information. To combine the merits of the two methods, we propose a quick and effective diarization method based on adaptive spectral clustering and VBHMM re-clustering. In addition, we adopt an end-to-end diarization method to handle overlapping speech. The proposed system improves diarization performance, achieving a lower diarization error rate (DER) and a lower real-time factor (RTF) on the evaluation databases.
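The two metrics named in the abstract follow their standard definitions: DER is the sum of missed speech, false-alarm speech, and speaker-confusion time divided by the total reference speech time, and RTF is processing time divided by audio duration. The sketch below is only an illustration of those definitions; the function names and the numbers are hypothetical and are not taken from the paper.

```python
def diarization_error_rate(missed, false_alarm, confusion, total_speech):
    """Return DER as a fraction.

    All arguments are durations in seconds, measured against the reference
    annotation. total_speech is the total reference speech time.
    """
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return (missed + false_alarm + confusion) / total_speech


def real_time_factor(processing_seconds, audio_seconds):
    """Return RTF: values below 1.0 mean faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# Hypothetical example values, not results from the paper.
der = diarization_error_rate(
    missed=12.0, false_alarm=6.0, confusion=18.0, total_speech=600.0
)
rtf = real_time_factor(processing_seconds=30.0, audio_seconds=600.0)
print(f"DER = {der:.1%}, RTF = {rtf:.3f}")
```

Lower is better for both metrics, which is why the abstract reports the two together: a re-clustering step may reduce DER while increasing RTF, and the proposed system aims to lower both.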