Spoofing Speaker Verification With Voice Style Transfer And Reconstruction Loss

Gaël Le Lan, A. Larcher, Thomas Thebaud
DOI: https://doi.org/10.1109/WIFS53200.2021.9648375
2021-12-07
Abstract: In this paper, we investigate a template reconstruction attack against a speaker verification system. A stolen speaker embedding is processed with a zero-shot voice-style transfer system to reconstruct a Mel-spectrogram containing as much speaker information as possible. We assume the attacker has black-box access to a state-of-the-art automatic speaker verification system. We modify the AutoVC voice-style transfer system to spoof the automatic speaker verification system. We find that integrating a new loss targeting embedding reconstruction and optimizing the training hyper-parameters significantly improves spoofing. The results obtained for speaker verification are similar to those for other biometrics, such as handwritten digits or face verification. We show on standard corpora (VoxCeleb and VCTK) that the reconstructed Mel-spectrograms contain enough speaker characteristics to spoof the original authentication system.
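The abstract does not give the form of the embedding reconstruction loss. As a rough illustration only, the sketch below assumes it is a cosine distance between the stolen embedding and the embedding extracted from the reconstructed Mel-spectrogram, added to a mean-squared Mel reconstruction term. The function names, the `weight` parameter, and the choice of distance are hypothetical and are not taken from the paper.

```python
import numpy as np


def cosine_distance(a, b, eps=1e-8):
    """Return 1 - cosine similarity between two 1-D vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b) + eps
    return 1.0 - float(np.dot(a, b) / denom)


def spoofing_loss(mel_target, mel_reconstructed,
                  emb_target, emb_reconstructed, weight=1.0):
    """Hypothetical combined training loss (an assumption, not the paper's formula).

    mel_term: mean squared error between target and reconstructed Mel-spectrograms.
    emb_term: cosine distance between the target speaker embedding and the
              embedding computed from the reconstructed Mel-spectrogram.
    weight:   assumed hyper-parameter balancing the two terms.
    """
    mel_target = np.asarray(mel_target, dtype=float)
    mel_reconstructed = np.asarray(mel_reconstructed, dtype=float)
    mel_term = float(np.mean((mel_target - mel_reconstructed) ** 2))
    emb_term = cosine_distance(emb_target, emb_reconstructed)
    return mel_term + weight * emb_term
```

In a training loop, `emb_reconstructed` would come from passing the generated Mel-spectrogram through a speaker encoder, so that lowering the second term pushes the output toward the stolen embedding.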
Computer Science