Quantum-Trained Convolutional Neural Network for Deepfake Audio Detection

Chu-Hsuan Abraham Lin, Chen-Yu Liu, Samuel Yen-Chi Chen, Kuan-Cheng Chen
2024-10-12
Abstract: The rise of deepfake technologies has posed significant challenges to privacy, security, and information integrity, particularly in audio and multimedia content. This paper introduces a Quantum-Trained Convolutional Neural Network (QT-CNN) framework designed to enhance the detection of deepfake audio, leveraging the computational power of quantum machine learning (QML). The QT-CNN employs a hybrid quantum-classical approach, integrating Quantum Neural Networks (QNNs) with classical neural architectures to optimize training efficiency while reducing the number of trainable parameters. Our method incorporates a novel quantum-to-classical parameter mapping that effectively utilizes quantum states to enhance the expressive power of the model, achieving up to 70% parameter reduction compared to classical models without compromising accuracy. Data pre-processing involved extracting essential audio features, label encoding, feature scaling, and constructing sequential datasets for robust model evaluation. Experimental results demonstrate that the QT-CNN achieves comparable performance to traditional CNNs, maintaining high accuracy during training and testing phases across varying configurations of QNN blocks. The QT framework's ability to reduce computational overhead while maintaining performance underscores its potential for real-world applications in deepfake detection and other resource-constrained scenarios. This work highlights the practical benefits of integrating quantum computing into artificial intelligence, offering a scalable and efficient approach to advancing deepfake detection technologies.
Sound, Artificial Intelligence, Audio and Speech Processing, Quantum Physics
What problem does this paper attempt to address?
The paper addresses the detection of deepfake audio. It proposes a Quantum-Trained Convolutional Neural Network (QT-CNN) framework that uses Quantum Machine Learning (QML) to improve detection. QT-CNN takes a hybrid quantum-classical approach, combining Quantum Neural Networks (QNNs) with classical neural network architectures to optimize training efficiency and reduce the number of trainable parameters. The main contributions of the paper are:

1. **Parameter Reduction**: Through a quantum-to-classical parameter mapping, QT-CNN reduces the number of trainable parameters by up to 70% compared to classical models without sacrificing accuracy.
2. **Hybrid Training Mechanism**: QT-CNN uses quantum circuits to optimize the weight parameters of the CNN, enhancing its robustness against adversarial inputs while maintaining performance.
3. **Experimental Validation**: Experiments on the DEEP-VOICE dataset show that QT-CNN maintains high accuracy during both training and testing, and that it performs well across different configurations of QNN blocks.
4. **Practicality**: The framework uses quantum computing only during the training phase and classical hardware for inference, which makes it suitable for resource-constrained environments such as edge computing or real-time processing.

In summary, the QT-CNN framework shows the potential of combining quantum computing with classical machine learning for deepfake audio detection, offering an efficient and reliable approach for practical applications.
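The quantum-to-classical parameter mapping can be illustrated with a minimal sketch. The text above does not specify the circuit layout or the mapping network, so everything concrete here is an assumption made for illustration: a QNN built from RY rotations and a CNOT chain, a tiny tanh network as the mapping model, a NumPy statevector simulation in place of quantum hardware, and arbitrary sizes (1000 target weights, 4 QNN layers, 8 hidden units). The idea shown is that n qubits yield 2^n measurement probabilities, so a CNN with m weights needs only about log2(m) qubits, and the trainable parameters are the rotation angles plus the small mapping network rather than the m weights themselves. No training loop is included, and the resulting count is not meant to reproduce the 70% figure reported in the paper.

```python
import math
import numpy as np


def apply_ry(state, theta, qubit, n_qubits):
    """Apply an RY(theta) rotation to one qubit of a statevector."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    gate = np.array([[c, -s], [s, c]])
    psi = state.reshape([2] * n_qubits)
    # Contract the gate with the chosen qubit axis, then restore axis order.
    psi = np.tensordot(gate, psi, axes=([1], [qubit]))
    psi = np.moveaxis(psi, 0, qubit)
    return psi.reshape(-1)


def apply_cnot(state, control, target, n_qubits):
    """Flip the target qubit on the part of the state where control is 1."""
    psi = state.reshape([2] * n_qubits).copy()
    idx = [slice(None)] * n_qubits
    idx[control] = 1
    sub = psi[tuple(idx)].copy()  # the control axis is dropped by the index
    t_axis = target if target < control else target - 1
    psi[tuple(idx)] = np.flip(sub, axis=t_axis)
    return psi.reshape(-1)


def qnn_probabilities(thetas, n_qubits):
    """Assumed ansatz: layers of RY rotations followed by a CNOT chain."""
    state = np.zeros(2 ** n_qubits)
    state[0] = 1.0  # start in |0...0>
    for layer in thetas:
        for q in range(n_qubits):
            state = apply_ry(state, layer[q], q, n_qubits)
        for q in range(n_qubits - 1):
            state = apply_cnot(state, q, q + 1, n_qubits)
    return np.abs(state) ** 2  # measurement probability of each basis state


def map_to_weights(probs, n_qubits, mapping_params, n_weights):
    """Hypothetical mapping model: (basis-state bits, probability) -> weight."""
    w1, b1, w2, b2 = mapping_params
    index = np.arange(n_weights)
    bits = (index[:, None] >> np.arange(n_qubits)[::-1]) & 1
    feats = np.concatenate([bits, probs[:n_weights, None]], axis=1)
    hidden = np.tanh(feats @ w1 + b1)
    return (hidden @ w2 + b2).ravel()


# Illustrative sizes only; none of these values come from the paper.
n_weights = 1000                              # weights in the classical CNN
n_qubits = math.ceil(math.log2(n_weights))    # qubits needed to index them
n_layers, n_hidden = 4, 8

rng = np.random.default_rng(0)
thetas = rng.uniform(0, 2 * np.pi, size=(n_layers, n_qubits))
mapping_params = (
    rng.normal(size=(n_qubits + 1, n_hidden)),
    np.zeros(n_hidden),
    rng.normal(size=(n_hidden, 1)),
    np.zeros(1),
)

probs = qnn_probabilities(thetas, n_qubits)
weights = map_to_weights(probs, n_qubits, mapping_params, n_weights)

# Trainable parameters: rotation angles plus the mapping network.
n_trainable = thetas.size + sum(p.size for p in mapping_params)
print(f"qubits: {n_qubits}, trainable: {n_trainable}, CNN weights: {n_weights}")
```

In a full pipeline the generated `weights` vector would be reshaped and loaded into the CNN layers, with the loss gradient flowing back to the rotation angles and mapping parameters. Once training finishes, only the generated classical weights are needed, which is consistent with the point above that inference runs on classical hardware.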