Multi-Scale Permutation Entropy for Audio Deepfake Detection

Chenglong Wang, Jiayi He, Jiangyan Yi, Jianhua Tao, Chu Yuan Zhang, Xiaohui Zhang
DOI: https://doi.org/10.1109/icassp48485.2024.10448095
2024-01-01
Abstract: With the widespread application of Automatic Speaker Verification (ASV) technology in security authentication, fake audio attacks pose a growing threat to system security. In this study, we apply multi-scale permutation entropy (MPE) to audio deepfake detection; MPE measures the complexity of audio signals and captures their dynamic characteristics at different scales. Experimental results indicate that MPE can effectively improve the performance of LFCC. For example, on the ASVspoof2019 LA test set, it achieves an equal error rate (EER) of less than 2%, which is around 50% lower than that of LFCC. Notably, MPE generalizes remarkably well to the In-the-Wild dataset, where its EER is comparable to that of Wav2vec without requiring pretraining. We therefore believe that MPE holds promise for anti-spoofing in voice biometric recognition. Our code is available at https://github.com/ADDchallenge/MPE-for-audio-deepfake-detection
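To make the idea of measuring complexity at different scales concrete, the following is a minimal sketch of the textbook formulation of multi-scale permutation entropy: coarse-grain the signal by averaging non-overlapping windows, then compute the normalized permutation entropy of ordinal patterns at each scale. It is not the authors' implementation (see the repository linked above for that); the function names, the embedding order of 3, the delay of 1, and the scales 1 to 4 are all illustrative assumptions, and how the resulting values are combined with LFCC is not shown.

```python
import math

import numpy as np


def coarse_grain(x, scale):
    """Average non-overlapping windows of length `scale` (scale 1 returns x)."""
    x = np.asarray(x, dtype=float)
    n = len(x) // scale
    return x[: n * scale].reshape(n, scale).mean(axis=1)


def permutation_entropy(x, order=3, delay=1):
    """Normalized permutation entropy in [0, 1] (order and delay are assumed defaults)."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (order - 1) * delay
    if n <= 0:
        raise ValueError("signal too short for this order and delay")
    # Each row of `idx` selects one delay vector of length `order`.
    idx = np.arange(n)[:, None] + np.arange(order) * delay
    # The ordinal pattern of a delay vector is the permutation that sorts it.
    patterns = np.argsort(x[idx], axis=1, kind="stable")
    _, counts = np.unique(patterns, axis=0, return_counts=True)
    p = counts / counts.sum()
    h = -np.sum(p * np.log2(p))
    # Divide by the maximum entropy, log2(order!), to normalize.
    return float(h / math.log2(math.factorial(order)))


def multiscale_permutation_entropy(x, scales=(1, 2, 3, 4), order=3, delay=1):
    """Permutation entropy of the coarse-grained signal at each scale."""
    return [permutation_entropy(coarse_grain(x, s), order, delay) for s in scales]
```

Under these assumptions, a strictly increasing signal produces a single ordinal pattern and therefore zero entropy at every scale, while white noise produces values close to 1. In a detection pipeline the function would typically be applied per frame of the waveform rather than to the whole utterance, though that framing choice is also an assumption here.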