Abstract: Automatic speaker verification (ASV) systems are widely used in voice user interfaces for person identification and access control via voiceprints. A typical ASV system consists of three stages: training, enrollment, and verification. Previous work has shown that an ASV system can be bypassed at the training stage by backdoor attacks and at the verification stage by adversarial example attacks. In this paper, we propose UltraBD, a new type of backdoor attack that targets the enrollment stage via adversarial ultrasound and is highly imperceptible, synchronization-free, and content-independent. The adversary injects ultrasound backdoor examples while the legitimate user is enrolling, so that the polluted voiceprint stored in the ASV system grants access to both the legitimate user and the adversary with relatively high confidence. The main challenge is that when, what, and how the legitimate user speaks during enrollment is unpredictable and highly variable. We address it by augmenting the generation and optimization of the ultrasound backdoor examples with randomness in the synchronization time and in the relative amplitude ratio. Furthermore, we optimize the modulation mechanism of the adversarial ultrasound by tuning the baseband signal on a limited set of frequency points, which improves its robustness in physical-world settings. We validate UltraBD on two common datasets with two open-source ASV models. Results show that UltraBD is robust to various configurations, e.g., different speakers and utterance content. In sum, our attack calls attention to a new attack surface of ASV systems and sheds light on its fundamental mechanisms.
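The randomized augmentation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the helper name `mix_with_random_sync`, the peak-based definition of the amplitude ratio, and the range `(0.1, 0.5)` are assumptions made only for illustration. The sketch overlays a trigger waveform on an enrollment utterance at a random start time and a random relative amplitude, which is the kind of variation an optimization loop would need to average over.

```python
import numpy as np

def mix_with_random_sync(speech, trigger, rng, ratio_range=(0.1, 0.5)):
    """Overlay `trigger` on `speech` at a random offset and relative amplitude.

    Hypothetical helper: the ratio is defined here as trigger peak divided by
    speech peak, which is an assumption, not a detail taken from the paper.
    """
    speech = np.asarray(speech, dtype=np.float64)
    trigger = np.asarray(trigger, dtype=np.float64)

    # Random synchronization time: the trigger may start at any sample.
    offset = int(rng.integers(0, len(speech)))
    # Random relative amplitude ratio between trigger and speech.
    ratio = float(rng.uniform(*ratio_range))

    speech_peak = np.max(np.abs(speech))
    trigger_peak = np.max(np.abs(trigger))
    scale = ratio * speech_peak / trigger_peak if trigger_peak > 0 else 0.0

    # Add the scaled trigger, truncating it at the end of the utterance.
    mixed = speech.copy()
    end = min(len(speech), offset + len(trigger))
    mixed[offset:end] += scale * trigger[: end - offset]
    return mixed, offset, ratio

# Usage: one second of a 220 Hz tone as stand-in speech, 0.25 s trigger.
rng = np.random.default_rng(0)
speech = np.sin(2 * np.pi * 220 * np.arange(16000) / 16000)
trigger = np.ones(4000)
mixed, offset, ratio = mix_with_random_sync(speech, trigger, rng)
```

Drawing a fresh offset and ratio for every training sample is one simple way to avoid assuming that the attacker knows when the user starts speaking or how loud the user is.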

Adversarial Defense for Automatic Speaker Verification by Cascaded Self-Supervised Learning Models

Improving the Adversarial Robustness for Speaker Verification by Self-Supervised Learning

UltraBD: Backdoor Attack against Automatic Speaker Verification Systems via Adversarial Ultrasound

Voting for the right answer: Adversarial defense for speaker verification

VarASV: Enabling Pitch-variable Automatic Speaker Verification Via Multi-task Learning

Defense for Black-box Attacks on Anti-spoofing Models by Self-Supervised Learning

Defense Against Adversarial Attacks on Spoofing Countermeasures of ASV

Voice Presentation Attack Detection Using Convolutional Neural Networks

Scalable Ensemble-based Detection Method against Adversarial Attacks for Speaker Verification

The defender's perspective on automatic speaker verification: An overview

Defending Against Adversarial Attacks in Speaker Verification Systems

Diffusion-Based Adversarial Purification for Speaker Verification

Adversarial Sample Detection for Speaker Verification by Neural Vocoders

Defending Adversarial Attacks on Cloud-aided Automatic Speech Recognition Systems

AdvSV: An Over-the-Air Adversarial Attack Dataset for Speaker Verification

Spoofing Speaker Verification System by Adversarial Examples Leveraging the Generalized Speaker Difference

LMD: A Learnable Mask Network to Detect Adversarial Examples for Speaker Verification

Adversarial Attacks on Spoofing Countermeasures of Automatic Speaker Verification

Spoofing-Aware Speaker Verification with Unsupervised Domain Adaptation

Generalizing Speaker Verification for Spoof Awareness in the Embedding Space

Spoofing-Aware Speaker Verification Robust Against Domain and Channel Mismatches