HyperLoRA: Efficient Cross-task Generalization Via Constrained Low-Rank Adapters Generation

Chuancheng Lv,Lei Li,Shitou Zhang,Gang Chen,Fanchao Qi,Ningyu Zhang,Hai-Tao Zheng
DOI: https://doi.org/10.18653/v1/2024.findings-emnlp.956
2024-01-01
Abstract:Adapting pre-trained language models (PLMs) for cross-task generalization is a crucial research area within the field of NLP. While fine-tuning and in-context learning are effective approaches for adapting LMs to emerging tasks, they can be costly and inefficient. Recently, some researchers have focused on achieving efficient task adaptation via hypernetwork, which is a meta network that generates task-specific weights based on task-oriented information without any optimization. However, the training of hypernetworks often lacks stability since the optimization signal is not straightforward, and the task information is not adequately representative. Moreover, previous works train hypenetworks with the general corpus, which is struggling with few-shot adaptation. To address these issues, we introduce HyperLoRA, a hypernetwork for LoRA parameters generation involving hypernetwork pre-training on instruction-following data and generalization fine-tuning on sparse task data. Furthermore, we utilize a constrained training loss and a gradient-based demonstration selection strategy to enhance the training stability and performance. Experimental results and analysis across four benchmark datasets (P3, S-NI, BBH, and SuperGLUE) demonstrate the proposed approach has flexible generalization ability and superior performance.
What problem does this paper attempt to address?