Large Language Models Must Be Taught to Know What They Don't Know

Sanyam Kapoor, Nate Gruver, Manley Roberts, Katherine Collins, Arka Pal, Umang Bhatt, Adrian Weller, Samuel Dooley, Micah Goldblum, Andrew Gordon Wilson

2024-06-13

Abstract: When using large language models (LLMs) in high-stakes applications, we need to know when we can trust their predictions. Some works argue that prompting high-performance LLMs is sufficient to produce calibrated uncertainties, while others introduce sampling methods that can be prohibitively expensive. In this work, we first argue that prompting on its own is insufficient to achieve good calibration and then show that fine-tuning on a small dataset of correct and incorrect answers can create an uncertainty estimate with good generalization and small computational overhead. We show that a thousand graded examples are sufficient to outperform baseline methods and that training through the features of a model is necessary for good performance and tractable for large open-source models when using LoRA. We also investigate the mechanisms that enable reliable LLM uncertainty estimation, finding that many models can be used as general-purpose uncertainty estimators, applicable not just to their own uncertainties but also the uncertainty of other models. Lastly, we show that uncertainty estimates inform human use of LLMs in human-AI collaborative settings through a user study.

Machine Learning, Artificial Intelligence, Computation and Language

What problem does this paper attempt to address?

This paper addresses how to obtain reliable uncertainty estimates from large language models (LLMs), so that users know when to trust their predictions in high-stakes applications. Prior work disagrees on how to do this: some studies argue that prompting high-performance LLMs is enough to produce calibrated uncertainties, while others propose sampling methods that can be prohibitively expensive. The paper first argues that prompting alone is insufficient for good calibration, then shows that fine-tuning on a small dataset of correct and incorrect answers yields an uncertainty estimate that generalizes well and adds little computational overhead. About one thousand graded examples are enough to outperform baseline methods. Training through the model's features is necessary for good performance, and LoRA makes this tractable for large open-source models. The paper also investigates the mechanisms behind reliable uncertainty estimation and finds that many models can act as general-purpose uncertainty estimators, applicable to other models' uncertainty as well as their own. Finally, a user study shows that uncertainty estimates inform how people use LLMs in human-AI collaborative settings.
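To make "calibration" concrete, here is a minimal sketch of one common way to score it on graded examples: the binned expected calibration error (ECE), the weighted average gap between stated confidence and observed accuracy. The text above does not say which metric the paper uses, so treat ECE here as an assumption. The function name, the number of bins, and the example data are all hypothetical.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Binned ECE: weighted mean of |accuracy - mean confidence| per bin."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("need at least one graded example")

    # Each bin accumulates [count, sum of confidences, number correct].
    bins = [[0, 0.0, 0] for _ in range(n_bins)]
    for conf, is_correct in zip(confidences, correct):
        if not 0.0 <= conf <= 1.0:
            raise ValueError("confidences must lie in [0, 1]")
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx][0] += 1
        bins[idx][1] += conf
        bins[idx][2] += int(is_correct)

    total = len(confidences)
    ece = 0.0
    for count, conf_sum, n_correct in bins:
        if count == 0:
            continue  # empty bins contribute nothing
        accuracy = n_correct / count
        mean_conf = conf_sum / count
        ece += (count / total) * abs(accuracy - mean_conf)
    return ece


# Hypothetical graded answers (True = the model's answer was correct).
graded_correct = [True, True, False, True, False, False, True, False]

# An estimator that always reports 95% confidence, although only half
# of the answers are correct.
overconfident = [0.95] * 8

# An estimator whose confidence tracks correctness more closely.
better = [0.9, 0.8, 0.2, 0.9, 0.1, 0.3, 0.8, 0.2]

ece_overconfident = expected_calibration_error(overconfident, graded_correct)
ece_better = expected_calibration_error(better, graded_correct)
```

A lower value means stated confidence matches observed accuracy more closely. In this toy data the always-confident estimator scores worse than the one whose confidence varies with correctness, which is the kind of gap a learned uncertainty estimate is meant to close.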

Improving the Reliability of Large Language Models by Leveraging Uncertainty-Aware In-Context Learning

The Calibration Gap between Model and Human Confidence in Large Language Models

Large Language Model Uncertainty Measurement and Calibration for Medical Diagnosis and Treatment

Can LLMs Learn Uncertainty on Their Own? Expressing Uncertainty Effectively in A Self-Training Manner

Look before you leap: An exploratory study of uncertainty measurement for large language models

Large Language Model Confidence Estimation via Black-Box Access

Do LLMs estimate uncertainty well in instruction-following?

Large language model uncertainty proxies: discrimination and calibration for medical diagnosis and treatment.

Enhancing Trust in Large Language Models with Uncertainty-Aware Fine-Tuning

Overconfidence is Key: Verbalized Uncertainty Evaluation in Large Language and Vision-Language Models

Finetuning Language Models to Emit Linguistic Expressions of Uncertainty

Calibrating Large Language Models Using Their Generations Only

Guiding Reinforcement Learning Using Uncertainty-Aware Large Language Models

"I'm Not Sure, But...": Examining the Impact of Large Language Models' Uncertainty Expression on User Reliance and Trust

Distinguishing the Knowable from the Unknowable with Language Models

Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs

Calibrating Large Language Models with Sample Consistency

Do Large Language Models Know What They Don't Know?