VB-HMM Speaker Diarization with Enhanced and Refined Segment Representation
Xianhong Chen, Liang He, Can Xu, Yi Liu, Tianyu Liang, Jia Liu
DOI: https://doi.org/10.21437/odyssey.2018-19
2018-01-01
Abstract: The variational Bayes hidden Markov model (VB-HMM) is a soft speaker diarization system. It is often combined with fixed length segmentation (FLS) instead of speaker change detection (SCD) to avoid propagating SCD errors. However, because each segment is too short to carry enough speaker information, the emission probability (the probability of a segment given a speaker) is noisy and inaccurate. We therefore propose a VB-HMM speaker diarization system with enhanced and refined segment representation. First, it enhances each segment's representation with its neighbors in the stream, drawing on more information from the same speaker to improve the accuracy of the emission probability. Then, during the iterations, it refines the segment representation using speaker change points to remove information belonging to other speakers. Experimental results on RT09 show that VB-HMM with enhanced and refined segment representation achieves a relative improvement of 22.9% over VB-HMM with FLS only.
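
The abstract does not say how segments are enhanced or refined, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that segments are frame-index ranges, that "enhancement" means widening each segment's window to cover a fixed number of neighboring segments, and that "refinement" means clipping that window at any speaker change point that falls outside the core segment. The function names, the `context` parameter, and the change point list are all hypothetical.

```python
def fixed_length_segments(num_frames, seg_len):
    """Split frame indices [0, num_frames) into consecutive fixed-length segments."""
    return [(s, min(s + seg_len, num_frames)) for s in range(0, num_frames, seg_len)]


def enhance_with_neighbors(segments, context):
    """Widen each segment to span `context` neighbors on each side (assumed scheme)."""
    last = len(segments) - 1
    enhanced = []
    for i in range(len(segments)):
        lo = max(0, i - context)
        hi = min(last, i + context)
        enhanced.append((segments[lo][0], segments[hi][1]))
    return enhanced


def refine_with_change_points(segments, enhanced, change_points):
    """Clip each widened window at change points lying outside the core segment.

    A change point `cp` is taken to mean that a different speaker starts at
    frame `cp` (assumption). Frames on the far side of a change point are
    dropped so the window keeps only context likely to share the speaker.
    """
    refined = []
    for (start, end), (win_start, win_end) in zip(segments, enhanced):
        for cp in change_points:
            if win_start < cp <= start:
                win_start = cp  # drop left context from before the change
            elif end <= cp < win_end:
                win_end = cp    # drop right context from after the change
        refined.append((win_start, win_end))
    return refined


segments = fixed_length_segments(num_frames=10, seg_len=3)
enhanced = enhance_with_neighbors(segments, context=1)
refined = refine_with_change_points(segments, enhanced, change_points=[6])
```

In a VB-HMM loop, the change points would presumably be re-estimated from the current speaker posteriors at each iteration, so the refined windows would be recomputed as the diarization improves. That coupling is not shown here.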