Noise-Disentanglement Metric Learning for Robust Speaker Verification

Yao Sun, Hanyi Zhang, Longbiao Wang, Kong Aik Lee, Meng Liu, Jianwu Dang
DOI: https://doi.org/10.1109/icassp49357.2023.10096848
2023-01-01
Abstract: Automatic speaker verification (ASV) suffers from performance degradation in noisy environments. To address this problem, we propose noise-disentanglement metric learning, which reduces speaker-irrelevant noisy components and builds a noise-invariant embedding space. Specifically, the disentanglement module, which comprises a speaker encoder and a reconstruction module, is dedicated to decoupling speech signals. The speaker encoder disentangles speaker-related components, and the reconstruction module increases the model’s ability to constrain the noise information by reconstructing the signal. In addition, distribution optimization is introduced to supervise the spatial structure of speaker embeddings under noisy environments. Experiments on VoxCeleb1 indicate that the proposed method improves the performance of the speaker verification system in both clean and noisy conditions.
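The abstract names two ingredients, a reconstruction term and a constraint on the embedding space, without giving their exact form. The sketch below is a toy illustration of how two such terms might be combined into one objective. It is not the paper's loss: the cosine-distance invariance term, the mean-squared-error reconstruction term, the choice of reconstruction target, and the names toy_objective and recon_weight are all assumptions made for illustration.

```python
import math


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def cosine_distance(u, v):
    """1 - cosine similarity; 0 when the vectors point the same way."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("embeddings must be non-zero")
    return 1.0 - dot / (norm_u * norm_v)


def toy_objective(clean_emb, noisy_emb, target, reconstructed, recon_weight=1.0):
    """Hypothetical combined objective (an assumption, not the paper's loss).

    invariance: pulls the embedding of a noisy utterance toward the
        embedding of the same utterance without noise.
    reconstruction: penalizes the gap between the reconstruction
        module's output and the target signal.
    """
    invariance = cosine_distance(clean_emb, noisy_emb)
    reconstruction = mse(target, reconstructed)
    return invariance + recon_weight * reconstruction
```

With identical embeddings and a perfect reconstruction, both terms vanish and the objective is zero. In a real system the embeddings and the reconstruction would come from trained networks, and the weighting between the terms would be a tuned hyperparameter.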