Uncertainty Quantification in Large Language Models Through Convex Hull Analysis

Ferhat Ozgur Catak, Murat Kuzlu

2024-06-28

Abstract: Uncertainty quantification has become increasingly critical for large language models (LLMs), particularly in high-risk applications requiring reliable outputs. However, traditional methods for uncertainty quantification, such as probabilistic models and ensemble techniques, face challenges when applied to the complex and high-dimensional nature of LLM-generated outputs. This study proposes a novel geometric approach to uncertainty quantification using convex hull analysis. The proposed method leverages the spatial properties of response embeddings to measure the dispersion and variability of model outputs. The prompts are categorized into three types, i.e., 'easy', 'moderate', and 'confusing', and multiple responses are generated for each prompt using different LLMs at varying temperature settings. The responses are transformed into high-dimensional embeddings via a BERT model and subsequently projected into a two-dimensional space using Principal Component Analysis (PCA). The Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm is used to cluster the embeddings, and the convex hull is computed for each selected cluster. The experimental results indicate that the uncertainty of LLMs depends on the prompt complexity, the model, and the temperature setting.

Artificial Intelligence, Computation and Language

What problem does this paper attempt to address?

This paper addresses the problem of quantifying uncertainty in large language models (LLMs). Traditional uncertainty quantification methods, such as probabilistic models and ensemble techniques, face challenges when applied to the complex, high-dimensional outputs that LLMs generate. The paper proposes a new geometric method based on convex hull analysis, which uses the spatial characteristics of response embeddings to measure the dispersion and variability of model outputs. Prompts are divided into three categories: "easy", "moderate", and "confusing". Different LLMs then generate multiple responses to each prompt at different temperature settings. The responses are converted into high-dimensional embeddings and projected onto a two-dimensional space via principal component analysis (PCA). Finally, the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm clusters the projected embeddings, and the convex hull of each selected cluster is computed to evaluate the degree of uncertainty. Experimental results show that the uncertainty of LLMs depends on the complexity of the prompt, the model type, and the temperature setting.
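
The geometric part of the pipeline described above (PCA projection, DBSCAN clustering, convex hull per cluster) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `hull_area_uncertainty`, the `eps` and `min_samples` defaults, and the choice to sum the hull areas of all clusters into one score are hypothetical. The embeddings are taken as a ready-made array, so the BERT encoding step is left out, and the demo uses random vectors in place of real response embeddings.

```python
import numpy as np
from scipy.spatial import ConvexHull
from sklearn.cluster import DBSCAN
from sklearn.decomposition import PCA


def hull_area_uncertainty(embeddings, eps=0.5, min_samples=3):
    """Sum of convex hull areas of DBSCAN clusters in 2-D PCA space.

    `embeddings` is an (n_responses, dim) array for ONE prompt.
    `eps`, `min_samples`, and the summation are illustrative assumptions.
    """
    # Project the high-dimensional embeddings onto two principal components.
    points = PCA(n_components=2).fit_transform(np.asarray(embeddings))

    # Cluster the projected points; DBSCAN labels noise points as -1.
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points)

    total_area = 0.0
    for label in set(labels) - {-1}:
        cluster = points[labels == label]
        if len(cluster) < 3:
            continue  # a 2-D hull needs at least three points
        # For 2-D input, ConvexHull.volume is the enclosed area
        # (ConvexHull.area would be the perimeter).
        # Note: exactly collinear clusters would make Qhull raise an error.
        total_area += ConvexHull(cluster).volume
    return total_area


# Demo with synthetic stand-ins for response embeddings (not real data).
rng = np.random.default_rng(0)
base = rng.normal(size=(50, 16))
print("tightly grouped responses:", hull_area_uncertainty(0.05 * base))
print("widely spread responses:  ", hull_area_uncertainty(0.30 * base))
```

A larger area means the responses to a prompt are more spread out in embedding space, which the method reads as higher uncertainty. Because DBSCAN works with a fixed distance threshold, the score is sensitive to `eps` and to the scale of the embeddings, so both would need tuning on real data.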

A Survey on Uncertainty Quantification of Large Language Models: Taxonomy, Open Research Challenges, and Future Directions

Improving Uncertainty Quantification in Large Language Models via Semantic Embeddings

Uncertainty Quantification for In-Context Learning of Large Language Models

Improving Medical Diagnostics with Vision-Language Models: Convex Hull-Based Uncertainty Analysis

Testing Uncertainty of Large Language Models for Physics Knowledge and Reasoning

SPUQ: Perturbation-Based Uncertainty Quantification for Large Language Models

Benchmarking LLMs via Uncertainty Quantification

LLM Uncertainty Quantification through Directional Entailment Graph and Claim Level Response Augmentation

Distinguishing the Knowable from the Unknowable with Language Models

Semantic Density: Uncertainty Quantification for Large Language Models through Confidence Measurement in Semantic Space

Shifting Attention to Relevance: Towards the Uncertainty Estimation of Large Language Models

Quantifying Uncertainty in Natural Language Explanations of Large Language Models

On Verbalized Confidence Scores for LLMs

Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling

Can Large Language Models Faithfully Express Their Intrinsic Uncertainty in Words?

MAQA: Evaluating Uncertainty Quantification in LLMs Regarding Data Uncertainty

Kernel Language Entropy: Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities

To Believe or Not to Believe Your LLM

DiverseAgentEntropy: Quantifying Black-Box LLM Uncertainty through Diverse Perspectives and Multi-Agent Interaction

UBENCH: Benchmarking Uncertainty in Large Language Models with Multiple Choice Questions