Abstract: Estimating the uncertainty of responses of Large Language Models (LLMs) remains a critical challenge. While recent Bayesian methods have demonstrated effectiveness in quantifying uncertainty through low-rank weight updates, they typically require complex fine-tuning or post-training procedures. In this paper, we propose Training-Free Bayesianization (TFB), a novel framework that transforms existing off-the-shelf trained LoRA adapters into Bayesian ones without additional training. TFB systematically searches for the maximally acceptable level of variance in the weight posterior, constrained within a family of low-rank isotropic Gaussian distributions. We theoretically demonstrate that under mild conditions, this search process is equivalent to variational inference for the weights. Through comprehensive experiments, we show that TFB achieves superior uncertainty estimation and generalization compared to existing methods while eliminating the need for complex training procedures. Code will be available at https://github.com/Wang-ML-Lab/bayesian-peft.
What problem does this paper attempt to address?
This paper addresses uncertainty estimation for the responses of large language models (LLMs). Although low-rank adapters (LoRA) are very effective at adapting LLMs to new tasks, they provide no mechanism for systematically estimating uncertainty. Existing Bayesian methods can quantify uncertainty through low-rank weight updates, but they usually require complex fine-tuning or post-training procedures. The key challenge is therefore how to convert existing, off-the-shelf LoRA adapters into Bayesian adapters without additional training, so that uncertainty can be estimated effectively.
### Main contributions of the paper
1. **Proposing Training-Free Bayesianization (TFB)**: Described as the first framework that converts a trained LoRA adapter into a Bayesian adapter without retraining, continued training, or even gradient estimation.
2. **Theoretical connection**: Establishing a theoretical connection between TFB and variational inference (VI) by proving that, under mild conditions, TFB's maximum-variance search is equivalent to minimizing the variational free energy.
3. **Efficient implementation**: Developing an efficient implementation whose search requires only an anchor dataset, which makes it applicable across domains and to various LoRA variants.
4. **Experimental verification**: Evaluating TFB across multiple settings, datasets, LLM backbones, LoRA weights, and LoRA variants, with experiments showing that it outperforms existing methods in uncertainty estimation and generalization.
### Method overview
The core idea of TFB is to constrain the weight posterior of the LoRA adapter to a family of low-rank isotropic Gaussian distributions and to search systematically for the largest acceptable variance of that posterior. The specific steps are:
1. **Low-rank isotropic Gaussian posterior**: Restrict the weight posterior to a more compact family of Gaussian distributions, obtained by projecting a full-space isotropic Gaussian onto the low-rank space.
2. **Determining the maximum variance**: Solve a constrained optimization problem for the largest posterior standard deviation \(\sigma_q\) such that the performance degradation stays within an acceptable range.
3. **Algorithm implementation**: Find this \(\sigma_q\) by binary search or grid search, then use it to Bayesianize all LoRA layers.
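The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `sample_low_rank_noise` and `search_max_sigma`, the `evaluate` callback, and the SVD-based construction of the low-rank noise are all hypothetical choices made for this example. The sketch also assumes that the score is higher-is-better and that degradation grows monotonically with \(\sigma_q\), which is what makes binary search valid.

```python
import numpy as np


def sample_low_rank_noise(B, sigma_q, n_cols, rng):
    """Draw weight noise that is isotropic within the column space of B.

    B is an (m, r) LoRA factor; the result has shape (m, n_cols).
    Assumption for this sketch: B has full column rank.
    """
    # Thin SVD: the columns of U are an orthonormal basis of B's column space.
    U, _, _ = np.linalg.svd(B, full_matrices=False)
    # i.i.d. Gaussian coordinates with standard deviation sigma_q ...
    Z = rng.normal(0.0, sigma_q, size=(U.shape[1], n_cols))
    # ... mapped back into the full weight space.
    return U @ Z


def search_max_sigma(evaluate, sigma_lo, sigma_hi, tolerance, n_iters=30):
    """Binary-search the largest sigma whose performance drop is <= tolerance.

    evaluate(sigma) is a hypothetical callback returning a higher-is-better
    score (for example accuracy) measured on the search dataset.
    Assumes the drop is non-decreasing in sigma and sigma_lo is acceptable.
    """
    baseline = evaluate(0.0)  # score of the original, deterministic adapter
    lo, hi = sigma_lo, sigma_hi
    for _ in range(n_iters):
        mid = 0.5 * (lo + hi)
        if baseline - evaluate(mid) <= tolerance:
            lo = mid  # still acceptable: try a larger variance
        else:
            hi = mid  # too much degradation: shrink the interval
    return lo


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    B = rng.normal(size=(8, 2))  # toy LoRA factor with rank 2
    noise = sample_low_rank_noise(B, sigma_q=0.05, n_cols=4, rng=rng)
    print(noise.shape)

    # Toy stand-in for a real evaluation: the score falls off as sigma**2.
    sigma_star = search_max_sigma(lambda s: 1.0 - s ** 2, 0.0, 1.0, tolerance=0.01)
    print(sigma_star)
```

In a real setting, `evaluate` would average the model's score over several noise samples at the given \(\sigma_q\); a grid search would replace the loop with a scan over candidate values and would not need the monotonicity assumption.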
### Theoretical analysis
The paper also provides a theoretical analysis showing that, under mild conditions, TFB's variance search is equivalent to variational inference over the weights. This gives TFB a principled foundation, while the procedure itself remains simple and practical.
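As a rough sketch of the two objects being related, in notation assumed here rather than taken from the paper: let \(\ell(\mathcal{D}; \sigma_q)\) denote the evaluation loss on the search dataset \(\mathcal{D}\) under a posterior with standard deviation \(\sigma_q\), and let \(\epsilon\) be the tolerated degradation. The variance search can then be written as

\[
\sigma_q^{\ast} = \max \left\{ \sigma_q \ge 0 \;:\; \ell(\mathcal{D}; \sigma_q) - \ell(\mathcal{D}; 0) \le \epsilon \right\},
\]

while variational inference minimizes the standard free energy

\[
\mathcal{F}(q) = -\,\mathbb{E}_{q(W)}\!\left[\log p(\mathcal{D} \mid W)\right] + \mathrm{KL}\!\left(q(W) \,\|\, p(W)\right).
\]

If both \(q\) and the prior are isotropic Gaussians, with prior standard deviation \(\sigma_p\), the KL term shrinks as \(\sigma_q\) grows toward \(\sigma_p\), whereas the data-fit term typically worsens. This trade-off is the intuition behind reading the largest acceptable variance as a variational solution; the precise conditions for the equivalence are those stated in the paper.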
In conclusion, the paper proposes Training-Free Bayesianization (TFB) to Bayesianize the LoRA adapters of large language models without additional training, and reports more accurate uncertainty estimation and better generalization as a result.