SVFit: Parameter-Efficient Fine-Tuning of Large Pre-Trained Models Using Singular Values

Chengwei Sun, Jiwei Wei, Yujia Wu, Yiming Shi, Shiyuan He, Zeyu Ma, Ning Xie, Yang Yang
2024-09-09
Abstract: Large pre-trained models (LPMs) have demonstrated exceptional performance in diverse natural language processing and computer vision tasks. However, fully fine-tuning these models poses substantial memory challenges, particularly in resource-constrained environments. Parameter-efficient fine-tuning (PEFT) methods, such as LoRA, mitigate this issue by adjusting only a small subset of parameters. Nevertheless, these methods typically employ random initialization for low-rank matrices, which can lead to inefficiencies in gradient descent and diminished generalizability due to suboptimal starting points. To address these limitations, we propose SVFit, a novel PEFT approach that leverages singular value decomposition (SVD) to initialize low-rank matrices using critical singular values as trainable parameters. Specifically, SVFit performs SVD on the pre-trained weight matrix to obtain the best rank-r approximation matrix, emphasizing the most critical singular values that capture over 99% of the matrix's information. These top-r singular values are then used as trainable parameters to scale the fundamental subspaces of the matrix, facilitating rapid domain adaptation. Extensive experiments across various pre-trained models in natural language understanding, text-to-image generation, and image classification tasks reveal that SVFit outperforms LoRA while requiring 16 times fewer trainable parameters.
Machine Learning
What problem does this paper attempt to address?
This paper addresses the memory cost of fine-tuning large pre-trained models (LPMs) in resource-constrained environments. Full fine-tuning requires a large amount of memory, which is often unavailable in such settings. Existing parameter-efficient fine-tuning (PEFT) methods such as LoRA alleviate this problem by adjusting only a small number of parameters, but they usually initialize the low-rank matrices randomly, which may lead to inefficient gradient descent and poor generalization. To overcome these limitations, the paper proposes SVFit, a new PEFT method that uses singular value decomposition (SVD) to initialize the low-rank matrices and treats the most important singular values as trainable parameters. This allows the model to adapt quickly to new domains while significantly reducing the number of trainable parameters.

### Main contributions:
1. **Propose SVFit**: A novel PEFT method that initializes low-rank matrices through SVD and trains only the top \( r \) singular values, significantly reducing the number of trainable parameters while fine-tuning efficiently and retaining the core capabilities of the model.
2. **Theoretical analysis**: Explains the mechanisms behind SVFit, showing how singular values capture the key information in pre-trained models and how new domain knowledge can be learned with very few parameters.
3. **Experimental verification**: Extensive experiments on natural language understanding, image classification, and text-to-image generation show that SVFit is superior to LoRA and other state-of-the-art techniques in parameter efficiency and overall performance.

### Method overview:
1. **SVD**: Perform SVD on the pre-trained weight matrix \( W \) to obtain the best rank-\( r \) approximation matrix \( W_r \) and the residual matrix \( W_e \).
2. **Initialization and training**: Use the top \( r \) singular values obtained from the SVD as the trainable parameters, and train only these singular values while keeping everything else frozen.
3. **Fast adaptation**: Adapt quickly to new domains by rescaling the fundamental subspaces obtained from the SVD.

### Experimental results:
- **Natural language understanding tasks**: On the GLUE benchmark, SVFit performs well across multiple tasks and achieves the highest scores on CoLA and STS-B.
- **Image classification tasks**: SVFit outperforms LoRA and other methods on multiple datasets. With the ViT-large model, it outperforms LoRA (0.8M trainable parameters) using only 0.036M trainable parameters.
- **Text-to-image generation tasks**: In subject-driven generation, the images generated by SVFit are comparable in quality to those of LoRA, while requiring fewer trainable parameters.

In conclusion, SVFit provides a new method for efficiently fine-tuning large pre-trained models in resource-constrained environments by combining singular value decomposition with an effective parameter initialization strategy.
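
The method overview above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the helper names `svfit_decompose` and `svfit_weight`, the 64×48 random matrix, the rank `r = 8`, and the toy squared-error objective are all assumptions made for illustration, and a real setup would use a deep-learning framework with automatic differentiation. The sketch splits \( W \) into \( W_r \) and \( W_e \), treats the top \( r \) singular values as the only trainable parameters, and takes one gradient step on them.

```python
import numpy as np

def svfit_decompose(W, r):
    """Split W into rank-r SVD factors and the residual W_e = W - W_r."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    U_r, s_r, Vt_r = U[:, :r], s[:r], Vt[:r, :]
    W_r = (U_r * s_r) @ Vt_r      # best rank-r approximation of W
    W_e = W - W_r                 # residual matrix, never updated
    return U_r, s_r, Vt_r, W_e

def svfit_weight(U_r, s_trainable, Vt_r, W_e):
    """Effective weight: frozen subspaces scaled by the trainable singular values."""
    return (U_r * s_trainable) @ Vt_r + W_e

# Hypothetical "pre-trained" weight and rank, chosen only for the demo.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 48))
r = 8
U_r, s_r, Vt_r, W_e = svfit_decompose(W, r)

# At initialization the effective weight reproduces W up to rounding error.
reconstruction_error = float(np.max(np.abs(svfit_weight(U_r, s_r, Vt_r, W_e) - W)))

# Trainable parameters for this single matrix, using the same rank r:
svfit_params = r                                  # one scalar per singular value
lora_params = r * (W.shape[0] + W.shape[1])       # LoRA factors: (m x r) and (r x n)

# Toy objective (an assumption): pull the effective weight toward a target.
W_target = W + 0.1 * rng.standard_normal(W.shape)

def loss(s):
    diff = svfit_weight(U_r, s, Vt_r, W_e) - W_target
    return 0.5 * float(np.sum(diff ** 2))

# Gradient w.r.t. the singular values: diagonal of U_r^T (dL/dW) V_r.
grad_W = svfit_weight(U_r, s_r, Vt_r, W_e) - W_target
grad_s = np.einsum("ij,ij->j", U_r, grad_W @ Vt_r.T)

loss_before = loss(s_r)
s_new = s_r - 0.5 * grad_s        # one plain gradient-descent step on s only
loss_after = loss(s_new)
```

Only the `r` entries of `s_new` change during the step; `U_r`, `Vt_r`, and `W_e` stay fixed, which is where the parameter savings come from. At equal rank, the per-matrix count is `r` against `r * (m + n)` for LoRA; the savings reported in the paper depend on the ranks and layers chosen there, which this sketch does not reproduce.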