Generative Large Language Models in Automated Fact-Checking: A Survey

Ivan Vykopal, Matúš Pikuliak, Simon Ostermann, Marián Šimko
2024-10-30
Abstract: The dissemination of false information on online platforms presents a serious societal challenge. While manual fact-checking remains crucial, Large Language Models (LLMs) offer promising opportunities to support fact-checkers with their vast knowledge and advanced reasoning capabilities. This survey explores the application of generative LLMs in fact-checking, highlighting various approaches and techniques for prompting or fine-tuning these models. By providing an overview of existing methods and their limitations, the survey aims to enhance the understanding of how LLMs can be used in fact-checking and to facilitate further progress in their integration into the fact-checking process.
Computation and Language
What problem does this paper attempt to address?
This paper explores the application of generative large language models (LLMs) in automated fact-checking. Specifically, it addresses the following issues:

1. **Widespread dissemination of false information**: With the rise of social media, the spread of false information has become a serious societal challenge. Although manual fact-checking is still necessary, generative LLMs offer new ways to support fact-checkers through their broad knowledge and advanced reasoning capabilities.
2. **Improving the efficiency and accuracy of fact-checking**: Traditional fact-checking relies mainly on human effort, which is limited and cannot keep pace with the growing volume of false information. The paper therefore explores how generative LLMs can improve the efficiency and accuracy of fact-checking.
3. **Review and limitations of existing methods**: The paper reviews existing methods for applying generative LLMs to fact-checking and points out their limitations. With this overview, it aims to improve researchers' understanding of how generative LLMs are used in fact-checking and to promote further development in the field.

### Main contributions

- **Comprehensive overview**: The paper provides a comprehensive overview of the methods and limitations of using generative LLMs for automated fact-checking, covering 69 related papers.
- **Method classification**: The paper classifies the methods into three categories according to the output type of the generative LLM: structured output, unstructured output, and synthetic data generation.
- **Technical analysis**: The paper analyzes several techniques in detail, namely prompting, fine-tuning, and augmentation with external knowledge, and explores how they are applied to fact-checking tasks.
### Specific tasks

The paper discusses the application of generative LLMs in the following fact-checking tasks:

1. **Fact verification and fake news detection**: This is the most common task, which involves evaluating the authenticity of a given claim or checking the credibility of a longer text (such as a news article).
2. **Evidence retrieval**: Collect key information from reliable sources to evaluate the authenticity of a claim.
3. **Claim detection**: Identify claims that contain verifiable information and may require further verification.
4. **Detection of verified claims**: Reduce the repetitive work of fact-checkers and improve efficiency by identifying similar claims that have already been verified in the database.

### Method classification

- **Structured output**: Generate answers with a predefined structure, such as category labels in classification tasks or numerical scores in regression tasks.
- **Unstructured output**: Generate continuous text, such as explanations or summaries.
- **Synthetic data generation**: Generate new datasets or parts of datasets for training or fine-tuning other models.

### Technical analysis

- **Prompting**: Improve the performance of LLMs by designing instructions, especially in cases where data is limited.
- **Fine-tuning**: Adapt the pre-trained model to a specific task, especially when the model cannot effectively complete the task through prompting.
- **External knowledge augmentation**: Combine external tools and knowledge bases to provide up-to-date information and improve the accuracy of fact-checking.

Through these methods and techniques, the paper hopes to provide researchers with a comprehensive perspective and promote the broader application and development of generative LLMs in fact-checking.
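
To make the structured-output category more concrete, the following is a minimal sketch of how prompting with externally supplied evidence might yield a label from a predefined set. It is an illustration under stated assumptions, not a method taken from the survey: the label set, the prompt wording, and the names `build_prompt`, `parse_label`, `verify`, and `fake_llm` are all hypothetical, and `fake_llm` is a stand-in for whatever model call a real system would use.

```python
from typing import Callable, List

# Hypothetical label set; the survey does not prescribe specific labels.
LABELS = ("SUPPORTED", "REFUTED", "NOT ENOUGH INFO")


def build_prompt(claim: str, evidence: List[str]) -> str:
    """Compose a zero-shot prompt that asks the model for exactly one label."""
    evidence_block = "\n".join(f"- {item}" for item in evidence) or "- (none)"
    return (
        "You are assisting a fact-checker.\n"
        f"Claim: {claim}\n"
        f"Evidence:\n{evidence_block}\n"
        f"Answer with exactly one of: {', '.join(LABELS)}."
    )


def parse_label(response: str) -> str:
    """Map free-form model text onto the predefined label set.

    Naive substring matching, used here only for illustration.
    """
    text = response.upper()
    for label in LABELS:
        if label in text:
            return label
    # Conservative fallback when no known label appears in the response.
    return "NOT ENOUGH INFO"


def verify(claim: str, evidence: List[str], generate: Callable[[str], str]) -> str:
    """Run one claim through prompt construction, generation, and parsing."""
    return parse_label(generate(build_prompt(claim, evidence)))


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call (assumption: returns plain text)."""
    return "Label: refuted. The evidence contradicts the claim."


print(verify(
    "The Eiffel Tower is in Berlin.",
    ["The Eiffel Tower is located in Paris."],
    fake_llm,
))  # prints REFUTED
```

Passing the generator in as a callable keeps the sketch independent of any particular model or API. The substring-based parser is deliberately simple: a response such as "not supported" would still match `SUPPORTED`, so a real system would likely need stricter output constraints or parsing.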