Abstract: Advances in automatic speaker verification (ASV) promote research into spoofing detection systems for real-world applications. The performance of ASV systems can be severely degraded by multiple types of spoofing attacks, namely synthetic speech (SS), voice conversion (VC), replay, twins, and impersonation, especially when the synthetic spoofing attacks are unseen during training. A reliable and robust spoofing detection system can act as a security gate that filters out spoofing attacks before they reach the ASV system. In this study, a weighted additive angular margin loss is proposed to address the data imbalance issue, and different margins are assigned to improve generalization to unseen spoofing attacks. Meanwhile, we incorporate a meta-learning loss function that optimizes the differences between the embeddings of the support set and the query set in order to learn a spoofing-category-independent embedding space for utterances. Furthermore, we craft adversarial examples by adding imperceptible perturbations to spoofing speech as a data augmentation strategy, and we use an auxiliary batch normalization (BN) so that the corresponding normalization statistics are computed exclusively on the adversarial examples. Additionally, a simple attention module is integrated into the residual block to refine the feature extraction process. Evaluation results on the Logical Access (LA) track of the ASVspoof 2019 corpus confirm the effectiveness of the proposed approaches, with a pooled EER of 0.87% and a min t-DCF of 0.0277. These advancements offer effective options for reducing the impact of spoofing attacks on voice recognition/authentication systems.
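
The weighted additive angular margin loss mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the code below assumes the common AAM-softmax form, with a per-class margin added to the target angle and a per-class weight multiplying the loss. The function name, the scale value, and the margin and weight values are hypothetical and are not taken from the paper.

```python
import math

def weighted_aam_loss(cosines, label, margins, weights, scale=30.0):
    """Loss for one utterance (illustrative form, not the paper's exact definition).

    cosines: cosine similarity between the embedding and each class centre
    label:   index of the true class
    margins: per-class angular margin in radians (hypothetical values)
    weights: per-class loss weight used to offset class imbalance
    """
    logits = []
    for k, c in enumerate(cosines):
        c = max(-1.0, min(1.0, c))  # guard acos against rounding error
        if k == label:
            theta = math.acos(c)
            # Add the margin to the target angle only; cap at pi so the
            # penalised cosine never starts increasing again.
            c = math.cos(min(theta + margins[k], math.pi))
        logits.append(scale * c)
    # Numerically stable log-softmax of the target logit
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return -weights[label] * (logits[label] - log_sum)

# Two classes: 0 = bona fide, 1 = spoof (assumed ordering).
cos = [0.8, 0.1]
plain = weighted_aam_loss(cos, 0, margins=[0.0, 0.0], weights=[1.0, 1.0])
with_margin = weighted_aam_loss(cos, 0, margins=[0.3, 0.3], weights=[1.0, 1.0])
```

With the same cosine scores, the margin makes the target class harder to satisfy, so `with_margin` is larger than `plain`. Giving each class its own margin and weight is one way to treat the rarer class and the class expected to contain unseen attacks differently.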

Toward Improving Synthetic Audio Spoofing Detection Robustness via Meta-Learning and Disentangled Training With Adversarial Examples
