Abstract: Automatic speaker verification (ASV) systems are widely used in voice user interfaces for person identification and access control via voiceprints. A typical ASV system has three stages: training, enrollment, and verification. Previous work has shown that an ASV system can be bypassed at the training stage by backdoor attacks and at the verification stage by adversarial example attacks. In this paper, we propose UltraBD, a new type of backdoor attack that targets the enrollment stage via adversarial ultrasound and is highly imperceptible, synchronization-free, and content-independent. By injecting ultrasound backdoor examples while the legitimate user is enrolling, the adversary pollutes the voiceprint stored in the ASV system, which then grants access to both the legitimate user and the adversary with relatively high confidence. The main challenge is that when, what, and how the legitimate user speaks during enrollment is unpredictable and varied. We address it by augmenting the generation and optimization of the ultrasound backdoor examples with randomized synchronization time and relative amplitude ratio. Furthermore, we optimize the modulation mechanism of adversarial ultrasound by tuning the baseband signal on a limited set of frequency points to improve its robustness in physical-world settings. We validate UltraBD on two common datasets with two open-source ASV models. Results show that UltraBD is robust to various configurations, e.g., different speakers and utterance content. In sum, our attack calls attention to a new attack surface of ASV systems and sheds light on its fundamental mechanisms.
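To make the randomized synchronization time and relative amplitude ratio concrete, the sketch below overlays a stand-in backdoor signal on a stand-in enrollment utterance at a random offset and a random amplitude ratio. It illustrates only that augmentation step as we read it from the abstract. The function name `mix_with_random_sync`, the ratio range, and the sine-wave signals are hypothetical, and the sketch does not cover the optimization of the backdoor examples or the ultrasound modulation.

```python
import math
import random


def mix_with_random_sync(speech, trigger, rng, ratio_range=(0.1, 0.5)):
    """Overlay `trigger` on `speech` at a random offset and amplitude ratio.

    Hypothetical helper: the name, arguments, and ratio range are
    illustrative assumptions, not taken from the paper.
    """
    if not trigger or len(trigger) > len(speech):
        raise ValueError("trigger must be non-empty and no longer than speech")

    # Random start sample: the attacker does not know when the user speaks.
    offset = rng.randint(0, len(speech) - len(trigger))
    # Random trigger-to-speech peak ratio: relative loudness is unknown too.
    ratio = rng.uniform(*ratio_range)

    speech_peak = max(abs(s) for s in speech) or 1.0
    trigger_peak = max(abs(t) for t in trigger) or 1.0
    scale = ratio * speech_peak / trigger_peak

    mixed = list(speech)
    for i, t in enumerate(trigger):
        mixed[offset + i] += scale * t
    return mixed, offset, ratio


# Stand-in signals at an assumed 16 kHz sample rate.
SAMPLE_RATE = 16000
speech = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(16000)]
trigger = [math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE) for n in range(4000)]

rng = random.Random(0)  # seeded only so the sketch is repeatable
mixed, offset, ratio = mix_with_random_sync(speech, trigger, rng)
```

Drawing a fresh offset and ratio for every training sample is one plausible way to make a backdoor example insensitive to when and how loudly the user speaks; the actual procedure in UltraBD may differ.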

UltraBD: Backdoor Attack against Automatic Speaker Verification Systems via Adversarial Ultrasound
