Abstract: Speaker clustering is an important problem in speech processing applications such as speaker diarization; however, its behavior in adverse acoustic environments has not been studied comprehensively. To address this gap, we investigate its components individually. A speaker clustering system contains three components: a feature extraction front end, a dimensionality reduction algorithm, and a clustering back end. In this paper, we use the standard Gaussian mixture model based universal background model (GMM-UBM) as the front end to extract high-dimensional supervectors, and compare three dimensionality reduction algorithms and two clustering algorithms. The three dimensionality reduction algorithms are principal component analysis (PCA), spectral clustering (SC), and the multilayer bootstrap network (MBN). The two clustering algorithms are k-means and agglomerative hierarchical clustering (AHC). We conducted extensive experiments in both in-domain and out-of-domain settings on noisy versions of the NIST 2006 speaker recognition evaluation (SRE) and NIST 2008 SRE corpora. Experimental results in various noisy environments show that (i) the MBN-based systems perform best in most cases, while the SC-based systems outperform both the PCA-based systems and the original supervector-based systems; and (ii) AHC is more robust than k-means.
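The three-stage structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the GMM-UBM front end is replaced by synthetic Gaussian "supervectors", the SC reduction is approximated with scikit-learn's `SpectralEmbedding` (RBF affinity), MBN is omitted because no standard library implementation is assumed here, and normalized mutual information (NMI) is used as an illustrative score. The helper names `make_toy_supervectors` and `run_pipeline` are hypothetical.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.decomposition import PCA
from sklearn.manifold import SpectralEmbedding
from sklearn.metrics import normalized_mutual_info_score


def make_toy_supervectors(n_speakers=4, utts_per_speaker=25, dim=200, seed=0):
    """Synthetic stand-in for GMM-UBM supervectors (assumption, not real data)."""
    rng = np.random.default_rng(seed)
    centers = rng.normal(scale=2.0, size=(n_speakers, dim))  # one mean per speaker
    labels = np.repeat(np.arange(n_speakers), utts_per_speaker)
    X = centers[labels] + rng.normal(size=(labels.size, dim))  # per-utterance noise
    return X, labels


def run_pipeline(X, labels, n_speakers, n_components=10, seed=0):
    """Score every (dimensionality reduction, clustering) pair with NMI."""
    reducers = {
        # "raw" keeps the original supervectors as a baseline.
        "raw": lambda A: A,
        "pca": lambda A: PCA(n_components=n_components,
                             random_state=seed).fit_transform(A),
        # Laplacian eigenvector embedding, used here as the SC reduction step.
        "sc": lambda A: SpectralEmbedding(n_components=n_speakers,
                                          affinity="rbf",
                                          random_state=seed).fit_transform(A),
    }
    clusterers = {
        "kmeans": lambda Z: KMeans(n_clusters=n_speakers, n_init=10,
                                   random_state=seed).fit_predict(Z),
        "ahc": lambda Z: AgglomerativeClustering(n_clusters=n_speakers,
                                                 linkage="average").fit_predict(Z),
    }
    scores = {}
    for r_name, reduce in reducers.items():
        Z = reduce(X)
        for c_name, cluster in clusterers.items():
            pred = cluster(Z)
            scores[(r_name, c_name)] = normalized_mutual_info_score(labels, pred)
    return scores


if __name__ == "__main__":
    X, y = make_toy_supervectors()
    for pair, nmi in sorted(run_pipeline(X, y, n_speakers=4).items()):
        print(pair, round(nmi, 3))
```

On clean, well-separated toy data every combination is expected to score similarly; the differences reported in the abstract concern noisy real speech, which this sketch does not model.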

An Investigation of Speaker Clustering Algorithms in Adverse Acoustic Environments
