Hallucination Detection in LLMs: Fast and Memory-Efficient Fine-Tuned Models

Gabriel Y. Arteaga, Thomas B. Schön, Nicolas Pielawski
2024-12-06
Abstract: Uncertainty estimation is a necessary component when implementing AI in high-risk settings, such as autonomous cars, medicine, or insurance. Large Language Models (LLMs) have seen a surge in popularity in recent years, but they are subject to hallucinations, which may cause serious harm in high-risk settings. Despite their success, LLMs are expensive to train and run: they need a large amount of computation and memory, preventing the use of ensembling methods in practice. In this work, we present a novel method that allows for fast and memory-friendly training of LLM ensembles. We show that the resulting ensembles can detect hallucinations and are a viable approach in practice as only one GPU is needed for training and inference.
Machine Learning, Artificial Intelligence, Computation and Language
What problem does this paper attempt to address?
This paper addresses the detection of hallucinations in large language models (LLMs). It focuses on two types: faithfulness hallucinations and factual hallucinations. Both can have serious consequences when AI is used in high-risk scenarios such as self-driving cars, medicine, or insurance. Existing detection methods are effective on specific tasks, but they are usually limited to a narrow range of tasks and lack broad applicability. Deep ensembles can provide more reliable uncertainty estimates, but they require a large amount of computing resources, which limits their practical use.
To address these problems, the paper proposes a fast and memory-efficient method for training LLM ensembles. The method combines low-rank adaptation (LoRA) matrices with component-specific rank-1 matrix modifications to reduce computational overhead and make ensembling practical. The authors then show how the uncertainty estimates produced by the ensemble can be used as features to train a binary classifier that distinguishes hallucinated content from correct content, which allows both types of hallucination to be detected.
The main contributions of the paper are:
1. A fast and memory-efficient LLM fine-tuning method that uses LoRA matrices and component-specific rank-1 matrix modifications, reducing computational overhead and making ensemble methods more feasible.
2. A new hallucination-detection method that recasts hallucination detection as a binary classification task and uses the LLM ensemble's uncertainty estimates as features to distinguish hallucinated content from correct content.
3. A demonstration that the method runs effectively on minimal hardware (a single A40 GPU), showing its efficiency and scalability.
Through these contributions, the paper not only improves the accuracy of hallucination detection in LLMs but also provides a practical solution for deploying these models in resource-constrained environments.
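The two ideas summarized above, a shared low-rank update with cheap per-member rank-1 modifications and uncertainty estimates used as classifier features, can be illustrated with a small sketch. It is not the authors' implementation. It assumes a BatchEnsemble-style composition in which each member's weight is (W0 + B A) multiplied elementwise by the outer product of two member-specific vectors r and s, and it assumes that predictive entropy, expected entropy, and their difference (mutual information) serve as the uncertainty features. The summary specifies neither choice, and every function and variable name below is hypothetical.

```python
import math
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def member_weight(w0, lora_b, lora_a, r, s):
    """Assumed composition: (W0 + B A) scaled elementwise by outer(r, s)."""
    delta = matmul(lora_b, lora_a)  # shared low-rank update
    return [[(w0[i][j] + delta[i][j]) * r[i] * s[j] for j in range(len(s))]
            for i in range(len(r))]


def forward(weight, x):
    """Matrix-vector product giving one member's logits."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(p):
    return -sum(q * math.log(q) for q in p if q > 0.0)


def uncertainty_features(member_probs):
    """Entropy-based features computed from the members' predictive distributions."""
    n = len(member_probs)
    mean_p = [sum(col) / n for col in zip(*member_probs)]
    predictive = entropy(mean_p)                            # total uncertainty
    expected = sum(entropy(p) for p in member_probs) / n    # average member uncertainty
    return {
        "predictive_entropy": predictive,
        "expected_entropy": expected,
        "mutual_information": predictive - expected,        # member disagreement
    }


def param_counts(d_out, d_in, rank, n_members):
    """Parameters for one layer: shared weights plus rank-1 vectors vs. full copies."""
    shared = d_out * d_in + rank * (d_out + d_in)
    ensemble = shared + n_members * (d_out + d_in)
    full_copies = n_members * d_out * d_in
    return ensemble, full_copies


# Toy demonstration with random weights (illustrative values only).
rng = random.Random(0)
d_out, d_in, rank, n_members = 4, 3, 2, 3


def rand(rows, cols):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


w0, lora_b, lora_a = rand(d_out, d_in), rand(d_out, rank), rand(rank, d_in)
# Member-specific rank-1 vectors, initialised near 1 so members start similar.
fast = [([rng.gauss(1.0, 0.3) for _ in range(d_out)],
         [rng.gauss(1.0, 0.3) for _ in range(d_in)]) for _ in range(n_members)]
x = [rng.gauss(0.0, 1.0) for _ in range(d_in)]

member_probs = [softmax(forward(member_weight(w0, lora_b, lora_a, r, s), x))
                for r, s in fast]
print(uncertainty_features(member_probs))
print(param_counts(4096, 4096, 8, 4))
```

In a setup like this, the feature dictionary computed for each generated token or answer would be the input row for a binary classifier (for example, logistic regression) trained on examples labeled as hallucinated or correct. The parameter count illustrates why the approach is memory-friendly: only the two vectors r and s are duplicated per member, while the pretrained weights and LoRA matrices are stored once.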