Self-Supervised Speaker Verification with Mini-Batch Prediction Correction

Junxu Wang, Zhihua Fang, Liang He
DOI: https://doi.org/10.21437/interspeech.2024-1466
2024-01-01
Abstract: Applying self-supervised learning to speaker verification remains challenging. In the two-stage solution, the clustering-iteration step in stage 2 determines the upper bound on system performance. Because the pseudo-labels obtained through clustering are noisy, we propose a new method for learning with noisy pseudo-labels that operates on mini-batches: a unified alignment method, based on the model-predicted mean and an exponential moving average, determines which samples with noisy pseudo-labels can be rectified. In addition, we explore different iterative training methods and propose one that accounts for the effects of both re-clustering and noisy pseudo-labels. By combining these techniques, our system achieves results similar to or better than those of previous studies.
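The abstract does not spell out the correction rule, so the following is only a minimal sketch of what mini-batch pseudo-label correction could look like, not the authors' implementation. It assumes that each sample carries an exponential moving average (EMA) of its predicted class probabilities, and that a pseudo-label is rectified when the EMA prediction disagrees with it and the EMA confidence exceeds the mean confidence of the mini-batch. The function names `ema_update` and `correct_batch`, the `momentum` parameter, and the thresholding rule itself are all hypothetical.

```python
def ema_update(prev, probs, momentum=0.9):
    """Blend the previous EMA prediction with the current class probabilities.

    `momentum` is an assumed hyperparameter, not a value taken from the paper.
    """
    return [momentum * p + (1.0 - momentum) * q for p, q in zip(prev, probs)]


def correct_batch(pseudo_labels, ema_probs):
    """Return corrected labels for one mini-batch.

    Hypothetical rule: relabel a sample when its EMA prediction disagrees
    with the pseudo-label AND its EMA confidence is strictly above the
    mean confidence of the mini-batch. Otherwise keep the pseudo-label.
    """
    confidences = [max(p) for p in ema_probs]
    threshold = sum(confidences) / len(confidences)  # batch-mean confidence
    corrected = []
    for label, probs, conf in zip(pseudo_labels, ema_probs, confidences):
        predicted = probs.index(max(probs))  # argmax of the EMA prediction
        if predicted != label and conf > threshold:
            corrected.append(predicted)  # treat the pseudo-label as noisy
        else:
            corrected.append(label)  # keep the clustering label
    return corrected


# Three samples with clustering labels 0, 1, 2 and their EMA predictions.
pseudo = [0, 1, 2]
ema = [[0.1, 0.8, 0.1], [0.2, 0.6, 0.2], [0.4, 0.3, 0.3]]
result = correct_batch(pseudo, ema)
```

In this toy batch only the first sample is relabeled: its EMA prediction is confident and disagrees with the clustering label, while the third sample disagrees but falls below the batch-mean confidence and therefore keeps its label.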