A Speaker Recognition Method Based on Stable Learning

Jian Zhang, Jing Ma, Xiaochen Guo, Lin Li, Liang He
DOI: https://doi.org/10.1109/ICASSP48485.2024.10446329
2024-01-01
Abstract: With the development of deep learning, speaker recognition systems have steadily improved in performance, and generalization ability has become an important aspect of model evaluation. Typically, an improved model is compared against a baseline system to demonstrate performance gains. However, such a comparison does not reveal how the voiceprint features learned by the improved model differ from those learned by the baseline. This paper introduces an improved speaker recognition system based on the ECAPA-TDNN model. It uses stable learning to eliminate sample correlation and employs attribution analysis to compare voiceprint feature learning between the improved and baseline systems. Experimental results demonstrate that stable learning improves the model’s generalization performance and helps it learn better voiceprint features. The effectiveness and generalization capability of the proposed method are verified through experiments on the VoxCeleb, CNCeleb, and LibriSpeech datasets. This work contributes to enhancing speaker recognition performance, analyzing differences in voiceprint feature learning, and advancing the field.