Credible, Unreliable or Leaked?: Evidence Verification for Enhanced Automated Fact-checking

Zacharias Chrysidis, Stefanos-Iordanis Papadopoulos, Symeon Papadopoulos, Panagiotis C. Petrantonakis

DOI: https://doi.org/10.1145/3643491.3660278

2024-04-29

Abstract: Automated fact-checking (AFC) is garnering increasing attention from researchers aiming to help fact-checkers combat the increasing spread of misinformation online. While many existing AFC methods incorporate external information from the Web to help examine the veracity of claims, they often overlook the importance of verifying the source and quality of collected "evidence". One overlooked challenge involves the reliance on "leaked evidence", information gathered directly from fact-checking websites and used to train AFC systems, resulting in an unrealistic setting for early misinformation detection. Similarly, the inclusion of information from unreliable sources can undermine the effectiveness of AFC systems. To address these challenges, we present a comprehensive approach to evidence verification and filtering. We create the "CREDible, Unreliable or LEaked" (CREDULE) dataset, which consists of 91,632 articles classified as Credible, Unreliable and Fact-checked (Leaked). Additionally, we introduce the EVidence VERification Network (EVVER-Net), trained on CREDULE to detect leaked and unreliable evidence in both short and long texts. EVVER-Net can be used to filter evidence collected from the Web, thus enhancing the robustness of end-to-end AFC systems. We experiment with various language models and show that EVVER-Net can demonstrate impressive performance of up to 91.5% and 94.4% accuracy, while leveraging domain credibility scores along with short or long texts, respectively. Finally, we assess the evidence provided by widely-used fact-checking datasets including LIAR-PLUS, MOCHEG, FACTIFY, NewsCLIPpings+ and VERITE, some of which exhibit concerning rates of leaked and unreliable evidence.

Computation and Language, Computers and Society, Information Retrieval, Social and Information Networks

What problem does this paper attempt to address?

The paper addresses the lack of source and quality verification for evidence in automated fact-checking (AFC) systems. Existing AFC methods typically rely on external information retrieved from the web to check the veracity of claims, but often overlook the importance of verifying the source and quality of the collected "evidence." This leads to two main problems:

1. **Leaked Evidence**: Many AFC systems collect information directly from fact-checking websites and use it to train models, which creates an unrealistic setting for early detection of newly emerging misinformation.
2. **Unreliable Sources**: Information obtained from unreliable sources can undermine the effectiveness of AFC systems.

To address these issues, the authors propose a comprehensive approach to evidence verification and filtering. They construct a dataset named “CREDible, Unreliable or LEaked” (CREDULE), containing 91,632 articles categorized as credible, unreliable, or fact-checked (leaked). Additionally, they introduce the EVidence VERification Network (EVVER-Net), trained on CREDULE, to detect leaked and unreliable evidence in both short and long texts, thereby enhancing the robustness of end-to-end AFC systems.
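To make the idea of evidence filtering concrete, the following is a minimal sketch of a purely domain-based filter. It is not EVVER-Net, which is a trained model that also uses the text of the evidence; it only illustrates where a filter sits between evidence retrieval and claim verification. The domain lists, the function names, and the "unverified" label are assumptions made for illustration and do not come from the paper or from CREDULE.

```python
from urllib.parse import urlparse

# Hypothetical, illustrative domain lists; not taken from the paper or CREDULE.
FACT_CHECK_DOMAINS = {"snopes.com", "politifact.com"}
UNRELIABLE_DOMAINS = {"unreliable-news.example"}


def domain_of(url: str) -> str:
    """Return the lowercased host of a URL without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def label_evidence(url: str) -> str:
    """Label a URL as 'leaked', 'unreliable', or 'unverified' by domain lookup."""
    host = domain_of(url)
    if host in FACT_CHECK_DOMAINS:
        return "leaked"
    if host in UNRELIABLE_DOMAINS:
        return "unreliable"
    # A domain lookup alone cannot establish credibility, so anything
    # not on a list is only marked as unverified.
    return "unverified"


def filter_evidence(urls):
    """Drop URLs labeled 'leaked' or 'unreliable'; keep the rest in order."""
    return [u for u in urls if label_evidence(u) == "unverified"]
```

A lookup like this only matches exact hosts (other subdomains are not handled) and cannot recognize leaked or unreliable content republished elsewhere, which is one reason a learned, text-aware classifier is attractive for this task.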

Ev2R: Evaluating Evidence Retrieval in Automated Fact-Checking

AVeriTeC: A Dataset for Real-world Claim Verification with Evidence from the Web

Complex Claim Verification with Evidence Retrieved in the Wild

Weakly Supervised Veracity Classification with LLM-Predicted Credibility Signals

Evidence-backed Fact Checking using RAG and Few-Shot In-Context Learning with LLMs

AmbiFC: Fact-Checking Ambiguous Claims with Evidence

Give Me More Details: Improving Fact-Checking with Latent Retrieval

Claim-Dissector: An Interpretable Fact-Checking System with Joint Re-ranking and Veracity Prediction

Contrastive Learning to Improve Retrieval for Real-world Fact Checking

Retrieval Augmented Fact Verification by Synthesizing Contrastive Arguments

That is a Known Lie: Detecting Previously Fact-Checked Claims

Fully Automated Fact Checking Using External Sources

Explainability of Automated Fact Verification Systems: A Comprehensive Review

Fact Checking Beyond Training Set

DEFAME: Dynamic Evidence-based FAct-checking with Multimodal Experts

Linked Credibility Reviews for Explainable Misinformation Detection

Towards Explainable Fact Checking

"The Data Says Otherwise"-Towards Automated Fact-checking and Communication of Data Claims

Retrieval Augmented Verification: Unveiling Disinformation with Structured Representations for Zero-Shot Real-Time Evidence-guided Fact-Checking of Multi-modal Social media posts

FEVER: a large-scale dataset for Fact Extraction and VERification