Evaluating the Efficacy of Large Language Models in Detecting Fake News: A Comparative Analysis

Sahas Koka, Anthony Vuong, Anish Kataria
2024-06-05
Abstract: In an era increasingly influenced by artificial intelligence, the detection of fake news is crucial, especially in contexts like election seasons where misinformation can have significant societal impacts. This study evaluates the effectiveness of various LLMs in identifying and filtering fake news content. Utilizing a comparative analysis approach, we tested four large LLMs -- GPT-4, Claude 3 Sonnet, Gemini Pro 1.0, and Mistral Large -- and two smaller LLMs -- Gemma 7B and Mistral 7B. By using fake news dataset samples from Kaggle, this research not only sheds light on the current capabilities and limitations of LLMs in fake news detection but also discusses the implications for developers and policymakers in enhancing AI-driven informational integrity.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
This paper evaluates how effectively different large language models (LLMs) detect fake news. Specifically, the researchers aim to determine which models are more accurate and reliable at distinguishing true news from false news. Because the proliferation of fake news affects public opinion, social trust, and the decision-making processes of governments and research institutions, reliable detection is crucial for maintaining the integrity of information dissemination and supporting informed decision-making. The study tested six language models (GPT-4, Claude 3 Sonnet, Gemini Pro 1.0, Mistral Large, Mistral 7B, and Gemma 7B) on fake news dataset samples from Kaggle and compared their performance using accuracy, precision, recall, and F1 score.
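For readers unfamiliar with the four metrics named above, the following is a minimal sketch of how they can be computed for a binary fake/real classifier. It is not the authors' evaluation code: the function name `classification_metrics`, the `"fake"`/`"real"` label strings, and the choice of "fake" as the positive class are assumptions made for illustration, and the sample labels are invented rather than taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def classification_metrics(y_true, y_pred, positive="fake"):
    """Compute accuracy, precision, recall, and F1 for binary labels.

    `positive` is the label treated as the positive class (assumed here
    to be "fake"; the paper summary does not specify this).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard every division so empty or degenerate inputs yield 0.0.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return Metrics(accuracy, precision, recall, f1)


if __name__ == "__main__":
    # Invented labels for illustration only; not data from the study.
    truth = ["fake", "fake", "real", "real", "fake"]
    model_output = ["fake", "real", "real", "fake", "fake"]
    print(classification_metrics(truth, model_output))
```

Precision here answers "of the articles flagged as fake, how many were fake?", while recall answers "of the fake articles, how many were flagged?"; F1 is their harmonic mean, which is why a model can score well on accuracy yet poorly on F1 when the classes are imbalanced.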