Utilizing GPT to Enhance Text Summarization: A Strategy to Minimize Hallucinations

Hassan Shakil, Zeydy Ortiz, Grant C. Forbes
2024-05-07
Abstract: In this research, we use the DistilBERT model to generate extractive summaries and the T5 model to generate abstractive summaries. We also generate hybrid summaries by combining the DistilBERT and T5 models. Central to our research is a GPT-based refining process that minimizes hallucinations, a common problem in AI-generated summaries. We evaluate the summaries both before and after refinement using a range of traditional and novel metrics, demonstrating marked improvements in the accuracy and reliability of the summaries. Results highlight significant improvements in reducing hallucinatory content, thereby increasing the factual integrity of the summaries.
Computation and Language, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
This paper aims to reduce "hallucination" in automatic text summarization, that is, content in an AI-generated summary that does not match the source text. The authors use the DistilBERT model to generate extractive summaries and the T5 model to generate abstractive summaries, and they combine the two methods to generate hybrid summaries. To further reduce hallucination, they introduce a GPT-based refinement process that evaluates and revises the generated summaries to improve their accuracy and reliability.

Specifically, the paper focuses on the following aspects:

1. **Generating unrefined summaries**: Use DistilBERT to generate extractive summaries, use T5 to generate abstractive summaries, and combine the two to generate hybrid summaries.
2. **GPT-based refinement process**: Evaluate and refine the generated summaries with a GPT model to reduce hallucinatory content.
3. **Evaluation metrics**: Use a variety of traditional and new evaluation metrics (such as FactSumm, QAGS, SummaC, ROUGE, and GPT-3.5 Turbo) to evaluate the quality of the unrefined and refined summaries.
4. **Statistical analysis**: Verify the effectiveness of the refinement process with statistical methods such as paired t-tests.

The goal of the paper is to use these methods to significantly reduce hallucination in summaries and thereby improve their accuracy and reliability.
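
The paired t-test mentioned in the statistical analysis compares two scores for the same document, one before refinement and one after. The following is a minimal sketch of that calculation using only the Python standard library. It is not the authors' code: the function name `paired_t_statistic` and the score values are hypothetical and chosen purely for illustration, not taken from the paper.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for a paired t-test.

    `before` and `after` hold one metric score per document,
    in the same document order.
    """
    if len(before) != len(after):
        raise ValueError("both score lists must have the same length")
    n = len(before)
    if n < 2:
        raise ValueError("at least two paired scores are required")

    # Per-document improvement: refined score minus unrefined score.
    diffs = [a - b for b, a in zip(before, after)]

    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")

    # t = mean difference divided by its standard error.
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical scores for four documents (illustrative only).
unrefined_scores = [0.50, 0.60, 0.55, 0.40]
refined_scores = [0.60, 0.65, 0.70, 0.50]

t_value, dof = paired_t_statistic(unrefined_scores, refined_scores)
print(f"t = {t_value:.3f}, degrees of freedom = {dof}")
```

A large positive t value suggests that refined summaries score consistently higher than their unrefined counterparts. Turning t into a p-value requires the Student's t distribution; in practice a library routine such as `scipy.stats.ttest_rel` performs the full test.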