Unsupervised Cross-Domain Rumor Detection from Multiple Sources Based on RoBERTa and Multi-CNN.

Taozheng Zhang, Shuaidong Hu
DOI: https://doi.org/10.1145/3613330.3613340
2023-01-01
Abstract: Internet rumors are prevalent and harmful to society, so automatic rumor detection is essential. However, supervised learning methods are impractical because data labeling is costly in the early stage of rumor propagation. Moreover, rumors can originate from multiple different source domains, a scenario that single-source domain adaptation methods cannot handle. To address these challenges, this paper proposes a rumor detection model, based on the RoBERTa pre-trained model and multiple convolutional neural networks, that combines unsupervised learning and multi-source domain adaptation. The model uses only the microblog text content to transfer knowledge from multiple source domains to the target domain. We find that RoBERTa effectively extracts dynamic text representations, and that the discrepancy is better reduced by aligning the distributions of each pair of source and target domains in multiple feature spaces. Additionally, in quantitative evaluation, dynamic distribution adaptation performs best in terms of F1 and time cost. Finally, extensive experiments demonstrate that the proposed model outperforms transfer-learning baseline models in rumor detection.
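The pairwise alignment idea in the abstract can be illustrated with a minimal sketch. It assumes that each source domain has its own feature space in which its features and the target features are compared, and it uses a linear-kernel maximum mean discrepancy (MMD) as a stand-in discrepancy measure. The abstract reports that dynamic distribution adaptation performs best, so MMD here is a simplification, not the paper's method. All function names and the data layout are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Batch = Sequence[Vector]


def mean_vector(features: Batch) -> List[float]:
    """Average a batch of feature vectors dimension by dimension."""
    n = len(features)
    dim = len(features[0])
    return [sum(row[j] for row in features) / n for j in range(dim)]


def linear_mmd(source: Batch, target: Batch) -> float:
    """Squared distance between batch means (biased linear-kernel MMD)."""
    mu_s = mean_vector(source)
    mu_t = mean_vector(target)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


def multi_source_alignment_loss(pairs: Sequence[Tuple[Batch, Batch]]) -> float:
    """Average the discrepancy over all (source, target) pairs.

    Assumption: pairs[i] holds the features of source domain i and the
    target features projected into that source's own feature space.
    """
    losses = [linear_mmd(src, tgt) for src, tgt in pairs]
    return sum(losses) / len(losses)


# Toy features: the first pair is already aligned, the second is not.
pairs = [
    ([[0.0, 0.0], [2.0, 2.0]], [[1.0, 1.0], [1.0, 1.0]]),
    ([[1.0, 0.0]], [[0.0, 0.0]]),
]
loss = multi_source_alignment_loss(pairs)
```

In a full model, a term of this kind would be added to the classification loss on the labeled source domains, so that the feature extractors learn representations on which source and target batches look alike.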