Abstract: The scarcity of high-quality, large-scale labeled datasets poses a major challenge for employing deep learning models in video deception detection. To address this issue, inspired by the psychological theory on the relation between deception and expressions, we propose a novel method called AFFAKT, which enhances classification performance by transferring useful and correlated knowledge from a large facial expression dataset. Two key challenges in knowledge transfer arise: 1) *how much* knowledge of facial expression data should be transferred and 2) *how to* effectively leverage the transferred knowledge for the deception classification model during inference. Specifically, the proposed H-OTKT module first quantifies the optimal relation mapping between facial expression classes and deception samples and then transfers knowledge from the facial expression dataset to the deception samples. Moreover, a correlation prototype within a second proposed module, SRKB, retains the invariant correlations between facial expression classes and deception classes through momentum updating. During inference, the transferred knowledge is fine-tuned with the correlation prototype using a sample-specific re-weighting strategy. Experimental results on two deception detection datasets demonstrate the superior performance of our proposed method. The interpretability study reveals strong associations between deception and negative emotions, which is consistent with psychological theory.

What problem does this paper attempt to address?

The paper addresses the following problem: in video deception detection, the lack of high-quality, large-scale labeled datasets limits the performance of deep learning models. Current deception detection datasets, such as Real Life Trial (RLT) and DOLOS, usually contain only a small number of labeled samples, which restricts the training of deep neural networks and hinders further performance improvement.

To address this, the authors propose AFFAKT (Affective Facial Knowledge Transfer), which improves the classification performance of video deception detection models by transferring useful and relevant knowledge from large-scale facial expression datasets. The method aims to answer two key questions:

1. How much knowledge from facial expression data should be transferred?
2. How can the transferred knowledge be used effectively during inference to improve the deception classification model?

### Solution Overview

AFFAKT mainly consists of the following four modules:

1. **Encoder Layer**: A pre-trained encoder extracts feature representations of the source domain (facial expression datasets) and the target domain (deception detection datasets).
2. **Hierarchical Optimal Transport Knowledge Transfer Module (H-OTKT)**: Hierarchical optimal transport (H-OT) automatically quantifies the latent correlation between facial expression categories and deception samples, and determines how much knowledge to transfer from each category to each sample. The formula is:

   \[ OT_{\text{high}}(P, Q)=\min_{T \in \Pi(P, Q)}\langle T, M\rangle_F-\epsilon H(T) \]

   where \( T \in \mathbb{R}^{n\times L_s} \) and \( M \in \mathbb{R}^{n\times L_s} \) are the transport-plan matrix and the cost matrix respectively, \( \Pi(P, Q) \) is the set of transport plans whose marginal distributions are \( P \) and \( Q \), and \( H(T) \) is the entropy regularizer on \( T \), weighted by \( \epsilon \).
3. **Classification Layer**: A multi-layer perceptron (MLP) followed by a softmax function produces the final classification prediction. The network is optimized with the cross-entropy loss \( L_{ce} \) and the Sinkhorn-divergence-based loss \( L_{ot} \), which reduces the discrepancy between the source and target feature spaces.
4. **Sample-specific Re-weighting Knowledge Bank Module (SRKB)**: A correlation prototype \( B \) is built through a momentum-update mechanism to maintain the invariant relationship between the target classes and the source classes. In the test phase, a sample-specific re-weighting strategy uses this prototype to improve detection performance.

### Experimental Results

The experimental results show that AFFAKT outperforms existing methods on two video deception detection datasets (RLT and DOLOS), with gains reported in F1-score, accuracy (ACC), and AUC. In addition, the interpretability study shows a high correlation between deceptive behavior and negative emotions, which is consistent with psychological theory.

In conclusion, AFFAKT improves the performance of video deception detection by transferring relevant knowledge from facial expression data, which mitigates the limits that insufficient labeled data places on deep learning models.
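The entropy-regularized transport objective used in the H-OTKT module is commonly solved with Sinkhorn iterations. The following is a minimal NumPy sketch of that generic solver for a single, flat transport problem; it is not the paper's hierarchical procedure. The function name `sinkhorn_plan`, the parameters `eps` and `n_iters`, the uniform marginals, and the random cost matrix are all assumptions made for illustration.

```python
import numpy as np


def sinkhorn_plan(M, p, q, eps=0.5, n_iters=500):
    """Approximate the plan T minimising <T, M>_F - eps * H(T)
    subject to row marginals p and column marginals q."""
    K = np.exp(-M / eps)          # Gibbs kernel derived from the cost matrix
    u = np.ones_like(p)
    v = np.ones_like(q)
    for _ in range(n_iters):
        u = p / (K @ v)           # rescale rows to match p
        v = q / (K.T @ u)         # rescale columns to match q
    return u[:, None] * K * v[None, :]


# Hypothetical sizes: n = 4 target samples, L_s = 7 source expression classes.
rng = np.random.default_rng(0)
n, L_s = 4, 7
M = rng.random((n, L_s))          # placeholder cost matrix (assumption)
p = np.full(n, 1.0 / n)           # uniform mass over samples (assumption)
q = np.full(L_s, 1.0 / L_s)       # uniform mass over classes (assumption)

T = sinkhorn_plan(M, p, q)
print(T.shape)                    # one row per sample, one column per class
```

Each row of `T` can be read as how much mass a sample receives from each source class, which is the sense in which a transport plan quantifies "how much" to transfer. A smaller `eps` gives a sharper plan but needs more iterations and is more prone to numerical underflow in `K`.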

AFFAKT: A Hierarchical Optimal Transport based Method for Affective Facial Knowledge Transfer in Video Deception Detection
