Measuring text summarization factuality using atomic facts entailment metrics in the context of retrieval augmented generation

N. E. Kriman
2024-08-28
Abstract: The use of large language models (LLMs) has increased significantly since the introduction of ChatGPT in 2022, demonstrating their value across various applications. However, a major challenge for enterprise and commercial adoption of LLMs is their tendency to generate inaccurate information, a phenomenon known as "hallucination." This project proposes a method for estimating the factuality of an LLM-generated summary relative to its source text. Our approach uses Naive Bayes classification to assess the accuracy of the generated content.
Computation and Language