Abstract: Real-world misinformation, often multimodal, can be partially or fully factual yet misleading, using diverse tactics such as conflating correlation with causation. Such misinformation is severely understudied, challenging to address, and harmful across social domains, particularly on social media, where it can spread rapidly. High-quality and timely correction of misinformation that identifies and explains its (in)accuracies effectively reduces false beliefs. Although manual correction is widely accepted, it is difficult to make timely and scalable. While LLMs have versatile capabilities that could accelerate misinformation correction, they struggle due to a lack of recent information, a tendency to produce false content, and limitations in addressing multimodal information. We propose MUSE, an LLM augmented with access to and credibility evaluation of up-to-date information. By retrieving evidence as refutations or supporting context, MUSE identifies and explains content (in)accuracies with references. It conducts multimodal retrieval and interprets visual content to verify and correct multimodal content. Given the absence of a comprehensive evaluation approach, we propose 13 dimensions of misinformation correction quality. Fact-checking experts then evaluate responses to social media content that is not presupposed to be misinformation but broadly includes (partially) incorrect and correct posts that may (not) be misleading. Results demonstrate MUSE's ability to write high-quality responses to potential misinformation (across modalities, tactics, domains, and political leanings, and for information that has not previously been fact-checked online) within minutes of its appearance on social media. Overall, MUSE outperforms GPT-4 by 37% and even high-quality responses from laypeople by 29%. Our work provides a general methodological and evaluative framework to correct misinformation at scale.

LRQ-Fact: LLM-Generated Relevant Questions for Multimodal Fact-Checking

Multimodal Large Language Models to Support Real-World Fact-Checking

Multimodal Misinformation Detection using Large Vision-Language Models

RAGAR, Your Falsehood RADAR: RAG-Augmented Reasoning for Political Fact-Checking using Multimodal Large Language Models

Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification

How to Train Your Fact Verifier: Knowledge Transfer with Multimodal Open Models

Correcting misinformation on social media with a large language model

PACAR: Automated Fact-Checking with Planning and Customized Action Reasoning Using Large Language Models

OpenFactCheck: Building, Benchmarking Customized Fact-Checking Systems and Evaluating the Factuality of Claims and LLMs

DELL: Generating Reactions and Explanations for LLM-Based Misinformation Detection

LEMMA: Towards LVLM-Enhanced Multimodal Misinformation Detection with External Knowledge Augmentation

Multimodal Fact-Checking with Vision Language Models: A Probing Classifier based Solution with Embedding Strategies

Can LLMs Produce Faithful Explanations For Fact-checking? Towards Faithful Explainable Fact-Checking via Multi-Agent Debate

FactLens: Benchmarking Fine-Grained Fact Verification

LM vs LM: Detecting Factual Errors via Cross Examination

Multimodal Automated Fact-Checking: A Survey

MFC-Bench: Benchmarking Multimodal Fact-Checking with Large Vision-Language Models

Factcheck-Bench: Fine-Grained Evaluation Benchmark for Automatic Fact-checkers

OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs

Automated Claim Matching with Large Language Models: Empowering Fact-Checkers in the Fight Against Misinformation