Abstract: This paper evaluates whether training undergraduate students to listen for expert-defined linguistic features improves their ability to discern audio deepfakes. Such features have been shown to improve the performance of AI algorithms; here, we ascertain whether that improvement also translates to the perceptual awareness and discernment ability of human listeners. Because humans are the weakest link in any cybersecurity solution, we propose that listener discernment is a key factor in improving the trustworthiness of audio content. In this study, we determine whether training that familiarizes listeners with English language variation can improve their ability to discern audio deepfakes. We focus on undergraduate students because this demographic group is constantly exposed to social media and to the potential for deception and misinformation online. To the best of our knowledge, ours is the first study to address English audio deepfake discernment through such techniques. Our research goes beyond informational training by introducing targeted linguistic cues to listeners, via a training module, as a deepfake discernment mechanism. In a pre-/post- experimental design, we evaluated the impact of the training on 264 students, a representative cross section of all students at the University of Maryland, Baltimore County, across experimental and control sections. Findings show that the experimental group exhibited a statistically significant decrease in uncertainty when evaluating audio clips and an improvement in their ability to correctly identify clips they were initially unsure about. While these results are promising, future research will explore more robust and comprehensive training for greater impact.

Listening for Expert Identified Linguistic Features: Assessment of Audio Deepfake Discernment among Undergraduate Students
