PhoneyTalker: an Out-of-the-Box Toolkit for Adversarial Example Attack on Speaker Recognition

Meng Chen,Li Lu,Zhongjie Ba,Kui Ren
DOI: https://doi.org/10.1109/infocom48880.2022.9796934
2022-01-01
Abstract:Voice has become a fundamental method for human-computer interactions and person identification these days. Benefit from the rapid development of deep learning, speaker recognition exploiting voice biometrics has achieved great success in various applications. However, the shadow of adversarial example attacks on deep neural network-based speaker recognition recently raised extensive public concerns and enormous research interests. Although existing studies propose to generate adversarial examples by iterative optimization to deceive speaker recognition, these methods require multiple iterations to construct specific perturbations for a single voice, which is input-specific, time-consuming, and non-transferable, hindering the deployment and application for non-professional adversaries. In this paper, we propose PhoneyTalker, an out-of-the-box toolkit for any adversary to generate universal and transferable adversarial examples with low complexity, releasing the requirement for professional background and specialized equipment. PhoneyTalker decomposes an arbitrary voice into phone combinations and generates phone-level perturbations using a generative model, which are reusable for voices from different persons with various texts. Experiments on mainstream speaker recognition systems with large-scale corpus show that PhoneyTalker outperforms state-of-the-art methods with overall attack success rates of 99.9% and 84.0% under white-box and black-box settings respectively.
What problem does this paper attempt to address?