MDMN: Multi-task and Domain Adaptation Based Multi-modal Network for Early Rumor Detection
Honghao Zhou, Tinghuai Ma, Huan Rong, Yurong Qian, Yuan Tian, Najla Al-Nabhan
DOI: https://doi.org/10.1016/j.eswa.2022.116517
IF: 8.5
2022-01-01
Expert Systems with Applications
Abstract: With the development of social media, people tend to express their opinions through multimedia such as text, photos, audio, and video. Meanwhile, more rumors hidden in multi-modal content are misleading social media users. The early rumor detection task aims to detect rumors before they spread. However, annotating multi-modal data often requires a large amount of manpower. Existing approaches commonly use transfer learning to overcome this, but they ignore the differences between the source domain of the pre-trained models and the task domain. In this paper, a Multi-task and Domain Adaptation based Multi-modal Network (MDMN) is proposed, which consists of three components: a Textual Feature Extractor, a Visual Feature Extractor, and a Fusion & Classification Network. To improve the diversity and stability of the textual representation, a Multi-task Sharing Layer, a task-specific Transformer Encoder, and a Selection Layer are applied. Domain Adaptation is used to train an adaptive model for extracting the visual representation. The adaptive model can encode task data better than fine-tuned pre-trained models. The multi-modal representations are then fused through two fusion strategies, each with its own benefits. Experiments on multi-modal datasets collected from Weibo and Twitter show that the proposed MDMN can outperform the baseline methods. The decision-level fusion strategy achieves a Recall of over 92%.
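To illustrate the distinction between fusion strategies mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. The abstract names only the decision-level strategy; the sketch assumes the other is a feature-level strategy that concatenates the textual and visual representations before classification. All function names, the equal modality weighting, and the toy weights are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(features, weights, bias):
    """Apply one dense layer: one output score per row of weights."""
    return [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]


def feature_level_fusion(text_feat, image_feat, weights, bias):
    """Assumed strategy: concatenate both representations, then classify once."""
    fused = list(text_feat) + list(image_feat)
    return softmax(linear(fused, weights, bias))


def decision_level_fusion(text_probs, image_probs, text_weight=0.5):
    """Combine the class probabilities predicted separately by each modality.

    text_weight is a hypothetical mixing coefficient, not a value from the paper.
    """
    return [
        text_weight * t + (1.0 - text_weight) * v
        for t, v in zip(text_probs, image_probs)
    ]


# Toy inputs: 2-dimensional features per modality, classes = [non-rumor, rumor].
text_feat, image_feat = [0.2, 0.9], [0.7, 0.1]
weights = [[0.5, -0.3, 0.1, 0.4], [-0.2, 0.6, 0.3, -0.1]]
bias = [0.0, 0.0]

feature_probs = feature_level_fusion(text_feat, image_feat, weights, bias)
decision_probs = decision_level_fusion([0.2, 0.8], [0.6, 0.4])
```

In this sketch, feature-level fusion lets a single classifier learn interactions between modalities, while decision-level fusion keeps the two classifiers independent and only merges their predictions.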