Learn to Preserve and Diversify: Parameter-Efficient Group with Orthogonal Regularization for Domain Generalization

Jiajun Hu, Jian Zhang, Lei Qi, Yinghuan Shi, Yang Gao
2024-07-21
Abstract: Domain generalization (DG) aims to avoid the performance degradation of the model when the distribution shift between the limited training data and unseen test data occurs. Recently, foundation models with enormous parameters have been pre-trained with huge datasets, demonstrating strong generalization ability and showing promising direction for solving the DG problem. However, fully Fine-Tuning (FT) the foundation models results in unsatisfactory out-of-distribution accuracy due to the destroyed pre-trained generalized features. Recently, Parameter-Efficient Fine-Tuning (PEFT) alleviates the above problem by fine-tuning a small portion of the model parameters while keeping the rest frozen, which achieves better generalization performance compared to FT. Nevertheless, PEFT still suffers from the issue of overfitting to the training domains. To address the above issue, we propose Parameter-Efficient Group with Orthogonal regularization (PEGO) for vision transformers, which effectively preserves the generalization ability of the pre-trained network and learns more diverse knowledge compared with conventional PEFT. Specifically, we inject a group of trainable Low-Rank Adaptation (LoRA) modules into the pre-trained model and propose an orthogonal regularization loss to enhance the generalization ability of the model. Our framework achieves SOTA performance on five DG benchmarks, while only requiring training a small number of parameters without adding additional testing cost.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses two key problems in **Domain Generalization (DG)**:

1. **Overfitting when fine-tuning pre-trained models**:
   - When a large pre-trained model such as a Vision Transformer (ViT) is fully fine-tuned (Full Fine-Tuning, FT), its performance on unseen test data degrades, because the large number of trainable parameters easily overfits the limited source-domain data.
   - Parameter-Efficient Fine-Tuning (PEFT) alleviates overfitting by tuning only a small number of parameters, but it can still overfit to the source domains and may partially distort the generalized features of the pre-trained model.
2. **Making full use of the generalization ability of pre-trained models**:
   - Models pre-trained on large-scale datasets generalize strongly. The challenge in DG is to preserve and exploit this ability during fine-tuning.
   - Existing DG methods mainly focus on extracting invariant features from the limited source domains or on generating more training data through data augmentation, and they overlook how to preserve and use the generalization ability of the pre-trained model itself.

To solve these problems, the authors propose the **Parameter-Efficient Group with Orthogonal Regularization (PEGO)** framework. Specifically:

- **Learn to Preserve**: An orthogonal regularization loss constrains the weights of the injected LoRA modules to be orthogonal to the pre-trained weights, which minimizes distortion of the pre-trained generalized features and preserves the generalization ability of the pre-trained model.
- **Learn to Diversify**: Multiple LoRA modules are injected into each layer and constrained to be orthogonal to one another, which encourages the model to learn more diverse knowledge so that it can better handle a variety of unseen domains.

Through these two mechanisms, PEGO both alleviates overfitting and significantly improves generalization to unseen domains. Experimental results show that PEGO achieves state-of-the-art performance on multiple DG benchmark datasets.
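
The two orthogonality constraints can be illustrated with a small sketch. This summary does not give the exact loss, so the form below is an assumption: each LoRA update is written as `B @ A`, the "preserve" term penalizes the squared Frobenius norm of `W0 @ A_i^T`, and the "diversify" term penalizes `A_i @ A_j^T` for every pair of modules in a layer. The function names (`preserve_loss`, `diversify_loss`, `merge`) are hypothetical, and plain nested lists stand in for weight tensors. The `merge` helper shows the standard LoRA property that the updates can be folded into the frozen weight, which is presumably why no extra cost is incurred at test time.

```python
# Hypothetical sketch of PEGO-style orthogonal regularization (assumed form,
# not the authors' implementation). Matrices are nested lists of numbers.

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    Yt = transpose(Y)  # rows of Yt are the columns of Y
    return [[sum(a * b for a, b in zip(row, col)) for col in Yt] for row in X]

def frob_sq(X):
    # Squared Frobenius norm: sum of squared entries.
    return sum(v * v for row in X for v in row)

def preserve_loss(W0, A_list):
    # Assumed "Learn to Preserve" term: sum_i ||W0 A_i^T||_F^2.
    # Zero when every row of each A_i is orthogonal to every row of W0.
    return sum(frob_sq(matmul(W0, transpose(A))) for A in A_list)

def diversify_loss(A_list):
    # Assumed "Learn to Diversify" term: sum_{i<j} ||A_i A_j^T||_F^2.
    # Zero when the modules span mutually orthogonal directions.
    total = 0
    for i in range(len(A_list)):
        for j in range(i + 1, len(A_list)):
            total += frob_sq(matmul(A_list[i], transpose(A_list[j])))
    return total

def merge(W0, A_list, B_list):
    # Fold all LoRA updates into the frozen weight: W0 + sum_i B_i A_i.
    merged = [row[:] for row in W0]
    for A, B in zip(A_list, B_list):
        delta = matmul(B, A)
        merged = [[m + d for m, d in zip(mr, dr)]
                  for mr, dr in zip(merged, delta)]
    return merged

# Toy layer: the frozen weight W0 (2 x 4) only uses the first two input axes.
W0 = [[1, 0, 0, 0],
      [0, 2, 0, 0]]
A1 = [[0, 0, 1, 0]]  # rank-1 module on the third axis
A2 = [[0, 0, 0, 1]]  # rank-1 module on the fourth axis
A3 = [[1, 0, 1, 0]]  # overlaps with both W0 and A1
B1 = [[1], [0]]
B2 = [[0], [1]]

print(preserve_loss(W0, [A1, A2]))   # 0: both modules are orthogonal to W0
print(diversify_loss([A1, A2]))      # 0: modules are orthogonal to each other
print(preserve_loss(W0, [A3]))       # 1: A3 overlaps with the first row of W0
print(diversify_loss([A1, A3]))      # 1: A3 overlaps with A1
print(merge(W0, [A1, A2], [B1, B2])) # [[1, 0, 1, 0], [0, 2, 0, 1]]
```

In training, terms like these would be added to the task loss with weighting coefficients. Those coefficients, and the layers that receive LoRA modules, are not specified in this summary.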