Transferring Audio Deepfake Detection Capability Across Languages

Zhongjie Ba, Qing Wen, Peng Cheng, Yuwei Wang, Feng Lin, Li Lu, Zhenguang Liu
DOI: https://doi.org/10.1145/3543507.3583222
2023-01-01
Abstract: The proliferation of deepfake content has motivated a surge of detection studies. However, existing audio deepfake detection methods work exclusively in English, and data resources in other languages are lacking. Cross-lingual deepfake detection, a critical but rarely explored area, calls for more study. This paper conducts the first comprehensive study of deepfake detection from a cross-lingual perspective. We observe that English data, rich in deepfake algorithms, can teach a detector about various spoofing artifacts, helping it perform detection across language domains. Based on this observation, we construct a first-of-its-kind cross-lingual evaluation dataset of heterogeneous spoofed speech uttered in the two most widely spoken languages, then explore domain adaptation (DA) techniques to transfer the artifact detection capability and propose effective and practical DA strategies that fit the cross-lingual scenario. Our adversarial-based DA paradigm teaches the model to learn real/fake knowledge while losing language dependency. Extensive experiments over 137 hours of audio clips validate that the adapted models can detect fake audio generated by unseen algorithms in the new domain.