Cross-Modal Augmentation for Few-Shot Multimodal Fake News Detection

Ye Jiang, Taihang Wang, Xiaoman Xu, Yimin Wang, Xingyi Song, Diana Maynard
2024-07-16
Abstract: The nascent topic of fake news requires automatic detection methods to quickly learn from limited annotated samples. Therefore, the capacity to rapidly acquire proficiency in a new task with limited guidance, also known as few-shot learning, is critical for detecting fake news in its early stages. Existing approaches either involve fine-tuning pre-trained language models which come with a large number of parameters, or training a complex neural network from scratch with large-scale annotated datasets. This paper presents a multimodal fake news detection model which augments multimodal features using unimodal features. For this purpose, we introduce Cross-Modal Augmentation (CMA), a simple approach for enhancing few-shot multimodal fake news detection by transforming n-shot classification into a more robust $(n \times z)$-shot problem, where $z$ represents the number of supplementary features. The proposed CMA achieves SOTA results over three benchmark datasets, utilizing a surprisingly simple linear probing method to classify multimodal fake news with only a few training samples. Furthermore, our method is significantly more lightweight than prior approaches, particularly in terms of the number of trainable parameters and epoch times. The code is available here: https://github.com/zgjiangtoby/FND_fewshot
Machine Learning, Artificial Intelligence, Computation and Language
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper primarily aims to address the following issues:

1. **Few-shot learning in fake news detection**: For emerging topics (such as COVID-19), large-scale annotated data is not yet available, so traditional fake news detection methods perform poorly in few-shot scenarios. The paper proposes Cross-Modal Augmentation (CMA), which uses unimodal features to augment the multimodal features and thereby improves few-shot learning.
2. **Effective utilization of multimodal information**: Existing multimodal fake news detection methods typically require complex neural networks or large-scale annotated datasets, requirements that are difficult to meet in practice. The proposed method classifies multimodal news with a simple linear probe trained on only a few samples, and it significantly reduces the number of trainable parameters.
3. **Consistency and inconsistency of cross-modal information**: The semantic consistency between text and images is crucial for fake news detection. Through its cross-modal augmentation mechanism, the paper explores how unimodal features can assist multimodal fusion to better capture the semantic relationship between text and images.

In summary, the paper targets multimodal fake news detection under few-shot conditions and proposes cross-modal augmentation as a lightweight and efficient solution.
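The idea of turning an n-shot problem into an (n × z)-shot one can be illustrated with a small sketch. This is not the authors' implementation: the random vectors stand in for pre-extracted text, image, and fused features, the averaging fusion is a placeholder, and the helper names `stack_views` and `train_linear_probe` are hypothetical. Which feature views count toward z is also an assumption here; the sketch simply treats each stacked view as one extra labeled example per sample.

```python
import numpy as np

def stack_views(views, labels):
    """Treat each feature view of a sample as its own training example.

    views  : list of z arrays, each of shape (n, d), rows aligned by sample
    labels : array of shape (n,)
    returns: X of shape (n * z, d) and y of shape (n * z,)
    """
    X = np.concatenate(views, axis=0)
    y = np.tile(labels, len(views))  # repeat labels once per view, same row order
    return X, y

def train_linear_probe(X, y, lr=0.1, steps=200):
    """Fit a logistic-regression probe with plain gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probability of class 1
        grad = p - y                             # gradient of the log loss w.r.t. logits
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

# Toy data: n = 4 labeled samples with d = 16 dimensional features (assumed values).
rng = np.random.default_rng(0)
n, d = 4, 16
labels = np.array([0, 1, 0, 1])
text_feat = rng.normal(size=(n, d))    # stand-in for text features
image_feat = rng.normal(size=(n, d))   # stand-in for image features
fused_feat = (text_feat + image_feat) / 2  # placeholder fusion, not the paper's

X, y = stack_views([fused_feat, text_feat, image_feat], labels)
w, b = train_linear_probe(X, y)
print(X.shape, y.shape)  # 4 samples x 3 views = 12 training rows
```

Only the probe's weight vector and bias are trained in this sketch, which is the sense in which a linear probe keeps the number of trainable parameters small.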