Abstract: Automated fact-checking (AFC) is attracting growing attention from researchers aiming to help fact-checkers combat the increasing spread of misinformation online. While many existing AFC methods incorporate external information from the Web to help examine the veracity of claims, they often overlook the importance of verifying the source and quality of the collected "evidence". One overlooked challenge is the reliance on "leaked evidence": information gathered directly from fact-checking websites and used to train AFC systems, which results in an unrealistic setting for early misinformation detection. Similarly, including information from unreliable sources can undermine the effectiveness of AFC systems. To address these challenges, we present a comprehensive approach to evidence verification and filtering. We create the "CREDible, Unreliable or LEaked" (CREDULE) dataset, which consists of 91,632 articles classified as Credible, Unreliable, or Fact-checked (Leaked). Additionally, we introduce the EVidence VERification Network (EVVER-Net), trained on CREDULE to detect leaked and unreliable evidence in both short and long texts. EVVER-Net can be used to filter evidence collected from the Web, thus enhancing the robustness of end-to-end AFC systems. We experiment with various language models and show that EVVER-Net achieves up to 91.5% and 94.4% accuracy when leveraging domain credibility scores together with short and long texts, respectively. Finally, we assess the evidence provided by widely used fact-checking datasets, including LIAR-PLUS, MOCHEG, FACTIFY, NewsCLIPpings+ and VERITE, some of which exhibit concerning rates of leaked and unreliable evidence.

Credible, Unreliable or Leaked?: Evidence Verification for Enhanced Automated Fact-checking