An Empirical Study on Channel Effects for Synthetic Voice Spoofing Countermeasure Systems

You Zhang, Ge Zhu, Fei Jiang, Zhiyao Duan
DOI: https://doi.org/10.48550/arXiv.2104.01320
2021-10-11
Abstract: Spoofing countermeasure (CM) systems are critical in speaker verification; they aim to discern spoofing attacks from bona fide speech trials. In practice, however, acoustic condition variability in speech utterances may significantly degrade the performance of CM systems. In this paper, we conduct a cross-dataset study on several state-of-the-art CM systems and observe significant performance degradation compared with their single-dataset performance. Observing differences of average magnitude spectra of bona fide utterances across the datasets, we hypothesize that channel mismatch among these datasets is one important reason. We then verify it by demonstrating a similar degradation of CM systems trained on original but evaluated on channel-shifted data. Finally, we propose several channel robust strategies (data augmentation, multi-task learning, adversarial learning) for CM systems, and observe a significant performance improvement on cross-dataset experiments.
Audio and Speech Processing, Sound
What problem does this paper attempt to address?
This paper addresses the significant performance degradation of spoofing countermeasure (CM) systems, which detect synthetic-speech spoofing attacks, in cross-dataset evaluation. Specifically, when a CM system is trained on one dataset and tested on another, its ability to distinguish bona fide speech from spoofing attacks drops substantially. The authors hypothesize that channel mismatch between datasets is one important cause, and they support this by showing a similar degradation when systems trained on original data are evaluated on channel-shifted data. They then propose several channel-robust strategies (data augmentation, multi-task learning, and adversarial learning) that significantly improve performance in cross-dataset experiments.
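The channel-mismatch hypothesis starts from comparing average magnitude spectra across datasets. The following is a minimal sketch of that kind of comparison, not the authors' code: it uses only the Python standard library, a white-noise signal as a stand-in for bona fide speech, and a hypothetical one-pole low-pass filter (`simulate_channel`) as a stand-in for a channel shift. The frame length, hop size, filter coefficient, and all function names are assumptions chosen for illustration.

```python
import cmath
import math
import random


def magnitude_spectrum(frame):
    """Magnitude of the one-sided DFT of a real-valued frame (naive O(n^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def average_magnitude_spectrum(signal, frame_len=64, hop=32):
    """Average the magnitude spectra of Hann-windowed frames of a signal."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / frame_len) for t in range(frame_len)]
    total = [0.0] * (frame_len // 2 + 1)
    count = 0
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + t] * window[t] for t in range(frame_len)]
        for k, m in enumerate(magnitude_spectrum(frame)):
            total[k] += m
        count += 1
    if count == 0:
        raise ValueError("signal is shorter than one frame")
    return [v / count for v in total]


def simulate_channel(signal, alpha=0.7):
    """Hypothetical channel: one-pole low-pass y[t] = (1 - alpha) * x[t] + alpha * y[t-1]."""
    out, prev = [], 0.0
    for x in signal:
        prev = (1 - alpha) * x + alpha * prev
        out.append(prev)
    return out


def log_spectral_gap_db(spec_a, spec_b, eps=1e-12):
    """Per-bin difference in dB between two average magnitude spectra."""
    return [20 * math.log10((a + eps) / (b + eps)) for a, b in zip(spec_a, spec_b)]


random.seed(0)
original = [random.gauss(0.0, 1.0) for _ in range(2048)]  # white-noise stand-in for speech
shifted = simulate_channel(original)  # assumed channel shift, for illustration only

gap = log_spectral_gap_db(
    average_magnitude_spectrum(original), average_magnitude_spectrum(shifted)
)
print(f"gap at lowest bin:  {gap[0]:.1f} dB")
print(f"gap at highest bin: {gap[-1]:.1f} dB")
```

Because the assumed channel attenuates high frequencies, the gap grows toward the upper bins. A systematic, frequency-dependent gap of this kind between the bona fide utterances of two datasets is the sort of evidence that would point to channel mismatch rather than differences in the spoofing attacks themselves.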