Multi-branch Network with Circle Loss Using Voice Conversion and Channel Robust Data Augmentation for Synthetic Speech Detection.

Ruoyu Wang, Jun Du, Chang Wang
DOI: https://doi.org/10.1007/978-3-031-20233-9_62
2022-01-01
Abstract: Synthesized speech in internet and telephone communications is often difficult for traditional systems to detect because of channel coding. Moreover, traditional systems, limited by their training data, tend to perform poorly on specific synthetic attacks. Accordingly, we propose a new data augmentation strategy that trains a voice conversion system without out-of-set data to synthesize specific attack data and performs single-channel data augmentation on both training and evaluation data. Further, we use a multi-branch network and introduce circle loss to improve system performance. The effectiveness of our approach is validated on the ASVspoof 2019 and 2021 LA databases.
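For readers unfamiliar with the circle loss named in the abstract, the following is a minimal sketch of its standard pair-wise form for a single anchor. It is not the authors' implementation: the abstract does not say how the loss is attached to the multi-branch network. The function name `circle_loss`, the helper functions, and the default `margin` and `gamma` values are illustrative assumptions, and the inputs are assumed to be cosine similarities in [-1, 1].

```python
import math


def _logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def _softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def circle_loss(sp, sn, margin=0.25, gamma=64.0):
    """Pair-wise circle loss for one anchor (hypothetical helper).

    sp: similarities between the anchor and same-class samples.
    sn: similarities between the anchor and different-class samples.
    margin, gamma: illustrative defaults, not values taken from the paper.
    """
    # Optimum points and decision margins of the standard formulation.
    o_p, o_n = 1.0 + margin, -margin
    delta_p, delta_n = 1.0 - margin, margin

    # Each similarity is re-weighted by how far it is from its optimum,
    # so poorly optimized pairs contribute larger logits.
    logit_p = [-gamma * max(o_p - s, 0.0) * (s - delta_p) for s in sp]
    logit_n = [gamma * max(s - o_n, 0.0) * (s - delta_n) for s in sn]

    # log(1 + sum_j exp(logit_n_j) * sum_i exp(logit_p_i))
    return _softplus(_logsumexp(logit_p) + _logsumexp(logit_n))


# Well-separated similarities should give a much smaller loss than overlapping ones.
well_separated = circle_loss(sp=[0.9, 0.95], sn=[0.05, 0.1])
overlapping = circle_loss(sp=[0.3, 0.4], sn=[0.6, 0.7])
```

In a spoofing detector, `sp` and `sn` would presumably be computed between embeddings of bona fide and spoofed utterances, but that pairing is an assumption here rather than something the abstract states.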