ADD 2023: the Second Audio Deepfake Detection Challenge

Jiangyan Yi, Jianhua Tao, Ruibo Fu, Xinrui Yan, Chenglong Wang, Tao Wang, Chu Yuan Zhang, Xiaohui Zhang, Yan Zhao, Yong Ren, Le Xu, Junzuo Zhou, Hao Gu, Zhengqi Wen, Shan Liang, Zheng Lian, Shuai Nie, Haizhou Li
2023-05-23
Abstract: Audio deepfake detection is an emerging topic in the artificial intelligence community. The second Audio Deepfake Detection Challenge (ADD 2023) aims to spur researchers around the world to build innovative technologies that accelerate and foster research on detecting and analyzing deepfake speech utterances. Unlike previous challenges (e.g., ADD 2022), ADD 2023 moves beyond binary real/fake classification: it asks systems to localize the manipulated intervals in partially fake speech and to identify the source that generated a fake audio clip. ADD 2023 also includes more rounds of evaluation for the audio fake game sub-challenge. The challenge comprises three sub-challenges: audio fake game (FG), manipulation region location (RL), and deepfake algorithm recognition (AR). This paper describes the datasets, evaluation metrics, and protocols, and reports some findings on audio deepfake detection tasks.
Sound, Audio and Speech Processing
What problem does this paper attempt to address?
The paper addresses audio deepfake detection by presenting the second Audio Deepfake Detection Challenge (ADD 2023), which is intended to encourage researchers to develop new techniques for detecting and analyzing deepfake speech. ADD 2023 goes beyond binary real/fake classification in two ways: localizing the manipulated segments within partially fake audio, and identifying the source that generated a fake utterance. The challenge comprises three sub-challenges: audio fake game (FG), manipulation region location (RL), and deepfake algorithm recognition (AR). Compared with previous challenges such as ADD 2022, it adds more evaluation rounds to the FG sub-challenge and introduces RL and AR as new tasks. The paper also describes the datasets, evaluation metrics, and protocols, and reports some findings on audio deepfake detection tasks.