Domain- and category-style clustering for general fake news detection via contrastive learning

Danke Wu, Zhenhua Tan, Haoran Zhao, Taotao Jiang, Ning Geng
DOI: https://doi.org/10.1016/j.ipm.2024.103725
IF: 7.466
2024-04-14
Information Processing & Management
Abstract: Nowadays, online social networks increase information dissemination but also accelerate the spread of fake news. Existing work mainly focuses on detecting fake news in a predefined scenario and therefore struggles with general tasks, especially newly emerged events and unseen news domains. Explorations of linguistic style have shown promising results across events. However, most of them require complex preprocessing to capture styles and ignore the compatibility of certain styles across news domains, and hence are inefficient in real applications. To address these problems, we propose a domain- and category-style clustering framework to learn general style patterns across news domains. Two key modules, content integrity detection (CID) and contrastive style detection (CSD), cooperate to obtain event-independent styles in an adversarial manner, which eliminates the need for data preprocessing. Meanwhile, in the CSD module, a multilevel contrastive loss is developed to refine style clustering at both the domain and category levels, improving the generalization and discrimination of the learned style patterns. Extensive experiments show that our framework improves F1 scores by 2.37%/2.08% on unseen events/news domains and by 0.63%/1.19% on known events/news domains. Furthermore, the quantitative analysis demonstrates the existence of general style patterns and suggests that real news is more likely to use the hashtag (#), the mention function (@), and numerals, while fake news tends to use '!' and '?'. Our data and code are available for comparison.
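The surface markers named at the end of the abstract can be counted with a few lines of Python. This is a minimal illustrative sketch, not the authors' analysis or code: the function name `count_style_markers`, the marker set, and the sample strings are assumptions introduced here.

```python
# Surface markers mentioned in the abstract. Treating them as plain
# character counts is an assumption made for illustration only.
MARKERS = ("#", "@", "!", "?")


def count_style_markers(text: str) -> dict:
    """Count hashtag, mention, '!' and '?' characters plus digits in a post."""
    counts = {marker: text.count(marker) for marker in MARKERS}
    # Each digit character is counted individually (so "2024" counts as 4).
    counts["digit"] = sum(ch.isdigit() for ch in text)
    return counts


if __name__ == "__main__":
    # Hypothetical sample posts, not taken from the paper's datasets.
    for post in ("Update #flood @city: 3 roads closed", "Shocking!! Really?"):
        print(post, "->", count_style_markers(post))
```

Counts like these only describe surface usage. The framework in the paper learns style patterns with the CID and CSD modules rather than from hand-written counters.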
computer science, information systems, information science & library science
What problem does this paper attempt to address?