A Study on Graph Embedding for Speaker Recognition.

Liang He, Ruida Li, Mengqi Niu
DOI: https://doi.org/10.1109/ICASSP48485.2024.10448308
2024-01-01
Abstract: Currently, most speaker recognition systems make a decision by calculating the similarity between enrollment and test embeddings extracted with convolutional neural networks. However, each embedding has a different local structure relative to its neighbors in the low-dimensional space; this structure is beneficial but often ignored. We regard embeddings as nodes on a graph, compute edges among them with a distance function and a nearest-neighbor algorithm, and extract graph embeddings with graph neural networks (GNNs) to further mine relational information for speaker recognition. Variants of GNNs and different graph configurations are comparatively studied on the NIST SRE14 i-vector challenge and VoxCeleb1 datasets. Experimental results demonstrate the excellent performance of the proposed graph embeddings.
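The graph construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the cosine similarity measure, the symmetrized k-nearest-neighbor rule, the single GCN-style propagation step, the function names knn_adjacency and gcn_layer, and the random toy data are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np

def knn_adjacency(embeddings, k):
    """Symmetric k-nearest-neighbor adjacency from cosine similarity (assumed metric)."""
    # L2-normalize rows so that the dot product equals cosine similarity.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # exclude each node from its own neighbor list
    n = sim.shape[0]
    neighbors = np.argsort(-sim, axis=1)[:, :k]  # indices of the k most similar nodes
    adj = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    adj[rows, neighbors.ravel()] = 1.0
    return np.maximum(adj, adj.T)  # keep an edge if either endpoint selected it

def gcn_layer(adj, features, weight):
    """One graph-convolution step: degree-normalized aggregation, projection, ReLU."""
    a_hat = adj + np.eye(adj.shape[0])  # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ features @ weight, 0.0)

# Toy data standing in for speaker embeddings; the weights are untrained.
rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4))  # 6 hypothetical embeddings of dimension 4
w = rng.normal(size=(4, 3))  # hypothetical projection to 3 dimensions
adj = knn_adjacency(x, k=2)
graph_emb = gcn_layer(adj, x, w)
```

In this sketch, each row of graph_emb mixes a node's own embedding with those of its nearest neighbors, which is one simple way the local structure mentioned in the abstract could enter the representation. In a real system the weights would be trained, and the choice of k and distance function would be among the graph configurations to compare.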