Review of audio deepfake detection techniques: Issues and prospects

Abhishek Dixit, Nirmal Kaur, Staffy Kingra
DOI: https://doi.org/10.1111/exsy.13322
IF: 3.3
2023-04-25
Expert Systems
Abstract: In recent years, multimedia content has improved in realism and plausibility owing to the development of deep learning techniques, particularly generative adversarial networks and variational auto-encoders. Although digital content, especially digital movies shot from a certain viewpoint, gives a true representation of reality, the ubiquitous use of content manipulation techniques casts doubt on its veracity. Deepfaking, an AI-based tampering technique, can map the facial and acoustic features of a source person onto a target with the intention of making the target appear to say or enact things that never happened. Numerous approaches have been proposed in the literature for the detection of image and video deepfakes. With technological advancement, researchers have also started to examine audio deepfakes and ways to detect them. As there is currently no comprehensive overview of audio deepfake generation and detection techniques, this paper aims to provide a survey of the relevant literature in this area. The survey is intended to inform the research community about the available audio generation and detection approaches, to support the design of reliable detection models that classify fake and real audio.
computer science, artificial intelligence, theory & methods