Abstract: Large-scale pre-trained language models (e.g., BERT) have attracted great attention in recent years. It is straightforward to fine-tune them on natural language understanding tasks such as text classification; however, incorporating them effectively and efficiently into natural language generation tasks such as neural machine translation remains a challenging problem. In this paper, we integrate two pre-trained BERT models from the source and target language domains into a sequence-to-sequence model by introducing light-weight adapter modules. The adapters are inserted between BERT layers and tuned on downstream tasks, while the parameters of the BERT models are fixed during fine-tuning. As pre-trained language models are usually very deep, inserting adapters into all layers results in a considerable number of new parameters. To deal with this problem, we introduce latent variables, learned during fine-tuning, that decide whether to use an adapter in each layer. In this way, the model automatically determines which adapters to use, which greatly improves parameter efficiency and decoding speed. We evaluate the proposed framework on various neural machine translation tasks. Equipped with parallel sequence decoding, our model consistently outperforms autoregressive baselines while halving inference latency. With automatic adapter selection, the proposed model achieves a further 20% speedup while still outperforming autoregressive baselines. When applied to autoregressive decoding, the proposed model also achieves performance comparable to state-of-the-art baseline models.
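
The adapter-plus-gate idea described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the bottleneck shape, the ReLU nonlinearity, the sigmoid gate, the 0.5 threshold, the `tanh` stand-in for a frozen BERT layer, and all names (`Adapter`, `gate_logit`, `frozen_layer`, `forward`) are assumptions made here for illustration, since the abstract does not specify them.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


class Adapter:
    """Bottleneck adapter with a residual connection and a per-layer gate.

    Only the adapter weights and the gate logit would be trained; the
    pre-trained layers around it stay fixed.
    """

    def __init__(self, d_model, d_bottleneck, rng):
        self.down = [[rng.gauss(0.0, 0.02) for _ in range(d_model)]
                     for _ in range(d_bottleneck)]
        # Zero-initialised up-projection: the adapter starts as the identity.
        self.up = [[0.0] * d_bottleneck for _ in range(d_model)]
        # Hypothetical latent variable deciding whether this adapter is used.
        self.gate_logit = 0.0

    def gate_prob(self):
        return 1.0 / (1.0 + math.exp(-self.gate_logit))

    def __call__(self, h, use_hard_gate=False):
        p = self.gate_prob()
        if use_hard_gate and p < 0.5:
            return list(h)  # adapter skipped entirely at inference time
        z = [max(0.0, v) for v in matvec(self.down, h)]  # ReLU bottleneck
        delta = matvec(self.up, z)
        g = 1.0 if use_hard_gate else p  # soft gate while fine-tuning
        return [hi + g * di for hi, di in zip(h, delta)]


def frozen_layer(h):
    """Stand-in for one frozen pre-trained layer (assumption: elementwise tanh)."""
    return [math.tanh(v) for v in h]


def forward(h, adapters, use_hard_gate=False):
    """Alternate frozen layers with their (possibly skipped) adapters."""
    for adapter in adapters:
        h = frozen_layer(h)
        h = adapter(h, use_hard_gate)
    return h


rng = random.Random(0)
adapters = [Adapter(d_model=4, d_bottleneck=2, rng=rng) for _ in range(3)]
adapters[1].gate_logit = -5.0  # pretend fine-tuning switched this adapter off

x = [0.5, -0.25, 1.0, 0.0]
out = forward(x, adapters, use_hard_gate=True)
active = sum(a.gate_prob() >= 0.5 for a in adapters)
```

In this sketch, a skipped adapter costs no computation at inference and its weights need not be stored, which is one plausible reading of how layer-wise selection could improve both parameter efficiency and decoding speed.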

Parameter-Efficient Adapter Based on Pre-trained Models for Speech Translation

Parameter-Efficient Fine-Tuning With Adapters

Leveraging Parameter-Efficient Transfer Learning for Multi-Lingual Text-to-Speech Adaptation

Lightweight Adapter Tuning for Multilingual Speech Translation

VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks

Unlocking Parameter-Efficient Fine-Tuning for Low-Resource Language Translation

Parameter-Efficient Transfer Learning for NLP

Hadamard Adapter: An Extreme Parameter-Efficient Adapter Tuning Method for Pre-trained Language Models

Adaptive Adapters: an Efficient Way to Incorporate BERT into Neural Machine Translation

Exploiting Adapters for Cross-Lingual Low-Resource Speech Recognition

Efficient Adaptation: Enhancing Multilingual Models for Low-Resource Language Translation

Efficient Adapters for Giant Speech Models

Meta-adapter: efficient cross-lingual adaptation with meta-learning

ELP-Adapters: Parameter Efficient Adapter Tuning for Various Speech Processing Tasks

Hierarchical Recurrent Adapters for Efficient Multi-Task Adaptation of Large Speech Models

Parameter-Efficient Learning for Text-to-Speech Accent Adaptation

Parameter-Efficient Fine-Tuning Methods for Pretrained Language Models: A Critical Review and Assessment

Adapter-X: A Novel General Parameter-Efficient Fine-Tuning Framework for Vision

Efficient Adapter Tuning of Pre-trained Speech Models for Automatic Speaker Verification

Parameter-efficient Dysarthric Speech Recognition Using Adapter Fusion and Householder Transformation

An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models