AutoLoRA: AutoGuidance Meets Low-Rank Adaptation for Diffusion Models

Artur Kasymov, Marcin Sendera, Michał Stypułkowski, Maciej Zięba, Przemysław Spurek
2024-10-05
Abstract: Low-rank adaptation (LoRA) is a fine-tuning technique that can be applied to conditional generative diffusion models. LoRA utilizes a small number of context examples to adapt the model to a specific domain, character, style, or concept. However, due to the limited data utilized during training, the fine-tuned model performance is often characterized by strong context bias and a low degree of variability in the generated images. To solve this issue, we introduce AutoLoRA, a novel guidance technique for diffusion models fine-tuned with the LoRA approach. Inspired by other guidance techniques, AutoLoRA searches for a trade-off between consistency in the domain represented by LoRA weights and sample diversity from the base conditional diffusion model. Moreover, we show that incorporating classifier-free guidance for both LoRA fine-tuned and base models leads to generating samples with higher diversity and better quality. The experimental results for several fine-tuned LoRA domains show superiority over existing guidance techniques on selected metrics.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses two problems that arise when diffusion models are fine-tuned with Low-Rank Adaptation (LoRA): strong context bias and low diversity in the generated images. Specifically:

1. **Lack of diversity in generated images**: LoRA adapts the model to a specific domain, character, style, or concept from a small number of context examples. Because so little data is used in training, the fine-tuned model tends to produce images with little variability.
2. **Overly strong context bias**: Because LoRA fine-tuning typically uses few samples, the generated images are biased toward the training examples and generalize poorly beyond them.

To address these problems, the authors introduce AutoLoRA, a guidance technique for diffusion models fine-tuned with LoRA. Its main contributions are:

- **Increasing the diversity of generated images**: By combining the base conditional diffusion model with the LoRA fine-tuned model, AutoLoRA increases sample diversity while staying consistent with the domain represented by the LoRA weights.
- **Reducing context bias**: AutoLoRA reduces the bias that the LoRA model acquires from fine-tuning on a small dataset.
- **Balancing diversity and consistency**: By combining Classifier-Free Guidance (CFG) with an AutoGuidance-style mechanism, AutoLoRA trades off prompt adherence, LoRA consistency, and generalization.

### Formula summary

- **LoRA update formula**: \[ W' = W+\alpha\cdot\Delta W = W+\alpha\cdot A\cdot B \] where \(W\in\mathbb{R}^{d\times k}\) is the frozen pretrained weight matrix, \(\alpha\) is a scaling factor, \(A\in\mathbb{R}^{d\times r}\), \(B\in\mathbb{R}^{r\times k}\), \(r\) is the rank, and \(r\ll\min(d,k)\).
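- **Classifier-Free Guidance (CFG) formula**: \[ \hat{\epsilon}_w(x_t,y)=\epsilon_\theta(x_t,\emptyset)+w\cdot(\epsilon_\theta(x_t,y)-\epsilon_\theta(x_t,\emptyset)) \] where \(w\) is the guidance scale and \(\emptyset\) denotes the empty (unconditional) prompt.
- **AutoLoRA guidance formula**: \[ \epsilon_{\text{AutoLoRA}}^\gamma(x_t,y)=\epsilon(x_t,y)+\gamma\cdot(\epsilon_{\text{LoRA}}(x_t,y)-\epsilon(x_t,y)) \] where \(\epsilon\) is the base model's noise prediction, \(\epsilon_{\text{LoRA}}\) is the LoRA fine-tuned model's prediction, and \(\gamma\) is the guidance scale: \(\gamma=0\) recovers the base model and \(\gamma=1\) recovers the LoRA model.

According to the paper's experiments on several LoRA fine-tuned domains, this guidance outperforms existing guidance techniques on selected metrics, producing more diverse and higher-quality images with less bias toward the fine-tuning data.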
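
The three formulas in the summary can be sketched in a few lines of plain Python. This is an illustrative toy, not the authors' implementation. The function names (`lora_merge`, `cfg`, `autolora`, `autolora_with_cfg`) are hypothetical, and scalars stand in for noise-prediction tensors (the guidance formulas are elementwise, so the same expressions apply to arrays). The nesting of CFG inside AutoLoRA guidance, with separate scales `w_base` and `w_lora`, is an assumption based on the abstract's statement that CFG is applied to both the base and the LoRA fine-tuned model.

```python
"""Toy sketch of the LoRA, CFG, and AutoLoRA formulas (not the authors' code)."""


def lora_merge(W, A, B, alpha):
    """Return W' = W + alpha * (A @ B) for nested-list matrices.

    W is d x k, A is d x r, B is r x k, with rank r much smaller than d and k.
    """
    r = len(B)
    return [
        [W[i][j] + alpha * sum(A[i][p] * B[p][j] for p in range(r))
         for j in range(len(W[0]))]
        for i in range(len(W))
    ]


def cfg(eps_uncond, eps_cond, w):
    """Classifier-free guidance: move from the unconditional prediction
    toward the conditional one by a factor w."""
    return eps_uncond + w * (eps_cond - eps_uncond)


def autolora(eps_base, eps_lora, gamma):
    """AutoLoRA guidance: move from the base model's prediction toward
    the LoRA model's prediction by a factor gamma."""
    return eps_base + gamma * (eps_lora - eps_base)


def autolora_with_cfg(base_uncond, base_cond, lora_uncond, lora_cond,
                      w_base, w_lora, gamma):
    """ASSUMPTION: apply CFG to each model first, then AutoLoRA guidance
    between the two CFG-guided predictions."""
    guided_base = cfg(base_uncond, base_cond, w_base)
    guided_lora = cfg(lora_uncond, lora_cond, w_lora)
    return autolora(guided_base, guided_lora, gamma)


if __name__ == "__main__":
    # Scalars stand in for noise predictions at one denoising step.
    for gamma in (0.0, 1.0, 1.5):
        print(gamma, autolora(eps_base=1.0, eps_lora=3.0, gamma=gamma))
```

With `gamma = 0.0` the result equals the base prediction, with `gamma = 1.0` it equals the LoRA prediction, and values above 1 extrapolate past the LoRA prediction, away from the base model. The CFG scale `w` behaves the same way between the unconditional and conditional predictions.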