Speaker Clustering Aided by Visual Dialogue Analysis

Shuang Zhang, Wei Hu, Tao Wang, Jia Liu, Yimin Zhang
DOI: https://doi.org/10.1007/978-3-540-89796-5_71
2008-01-01
Abstract: Speaker clustering aims to automatically group speech segments by speaker. With speaker clustering, we can discover the main cast list of a long video and retrieve each cast member's relevant video clips for efficient browsing. In this paper, we propose a dialogue-supervised speaker clustering method, which uses the results of visual dialogue analysis to improve speaker clustering performance. Compared with the traditional approach based only on acoustic features, the dialogue-supervised approach achieves a significant improvement in clustering results on movies and TV series.