Abstract: Large language models (LLMs), typically designed as a function of next-word prediction, have excelled across extensive NLP tasks. Despite this generality, next-word prediction is often not an efficient formulation for many of these tasks, demanding an extreme scale of model parameters (tens or hundreds of billions) and sometimes yielding suboptimal performance. In practice, it is often desirable to build more efficient models: although less versatile, they still apply to a substantial subset of problems, delivering on-par or even superior performance with much smaller model sizes. In this paper, we propose text alignment as an efficient unified model for a wide range of crucial tasks involving text entailment, similarity, question answering (and answerability), factual consistency, and so forth. Given a pair of texts, the model measures the degree of alignment between their information. We instantiate an alignment model (Align) through lightweight finetuning of RoBERTa (355M parameters) using 5.9M examples from 28 datasets. Despite its compact size, extensive experiments show the model's efficiency and strong performance: (1) on over 20 datasets covering the aforementioned diverse tasks, the model matches or surpasses FLAN-T5 models that have around 2x or 10x more parameters, and the single unified model also outperforms task-specific models finetuned on individual datasets; (2) when applied to evaluate the factual consistency of language generation on 23 datasets, our model improves over various baselines, including the much larger GPT-3.5 (ChatGPT) and sometimes even GPT-4; (3) the lightweight model can also serve as an add-on component for LLMs such as GPT-3.5 in question answering tasks, improving the average exact match (EM) score by 17.94 and the F1 score by 15.05 by identifying unanswerable questions.
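As a rough illustration of point (3), the add-on use can be pictured as gating an LLM's answer on an alignment score between the context and the question-answer pair. The sketch below is not the paper's implementation: `toy_alignment_score` is a hypothetical token-overlap stand-in for the finetuned Align model, and the `gate_answer` helper, its `threshold` value, and the `"unanswerable"` label are all assumptions made for illustration.

```python
from typing import Callable


def toy_alignment_score(context: str, claim: str) -> float:
    """Hypothetical stand-in for an alignment model.

    Returns the fraction of claim tokens that also occur in the context.
    A real alignment model would be a finetuned encoder, not token overlap.
    """
    context_tokens = set(context.lower().split())
    claim_tokens = claim.lower().split()
    if not claim_tokens:
        return 0.0
    hits = sum(1 for tok in claim_tokens if tok in context_tokens)
    return hits / len(claim_tokens)


def gate_answer(
    context: str,
    question: str,
    answer: str,
    score_fn: Callable[[str, str], float] = toy_alignment_score,
    threshold: float = 0.5,  # assumed value, chosen only for this example
) -> str:
    """Keep the candidate answer only if it aligns with the context.

    The question and candidate answer are joined into one text and scored
    against the context; low scores are treated as unanswerable.
    """
    score = score_fn(context, f"{question} {answer}")
    return answer if score >= threshold else "unanswerable"


if __name__ == "__main__":
    ctx = "paris is the capital of france"
    print(gate_answer(ctx, "what is the capital of france", "paris"))
    print(gate_answer(ctx, "who painted the mona lisa", "picasso"))
```

Swapping `score_fn` for a call into an actual alignment model would leave the gating logic unchanged; the threshold would then need tuning on held-out data.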

Aligner: One Global Token is Worth Millions of Parameters when Aligning Large Language Models

Parameter-Efficient Tuning Helps Language Model Alignment

A Deep Dive into the Trade-Offs of Parameter-Efficient Preference Alignment Techniques

PAFT: A Parallel Training Paradigm for Effective LLM Fine-Tuning

Learning Global Controller in Latent Space for Parameter-Efficient Fine-Tuning

Towards Better Parameter-Efficient Fine-Tuning for Large Language Models: A Position Paper

MetaAlign: Align Large Language Models with Diverse Preferences during Inference Time

Aligner: Efficient Alignment by Learning to Correct

Text Alignment Is An Efficient Unified Model for Massive NLP Tasks

Non-Intrusive Adaptation: Input-Centric Parameter-efficient Fine-Tuning for Versatile Multimodal Modeling

AlignGPT: Multi-modal Large Language Models with Adaptive Alignment Capability

Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment

Scalable Efficient Training of Large Language Models with Low-dimensional Projected Attention

Parameter-efficient Tuning for Large Language Model Without Calculating Its Gradients

Position-Aware Parameter Efficient Fine-Tuning Approach for Reducing Positional Bias in LLMs

Aligners: Decoupling LLMs and Alignment

MetaAligner: Towards Generalizable Multi-Objective Alignment of Language Models

LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models

Panacea: Pareto Alignment via Preference Adaptation for LLMs

On Diversified Preferences of Large Language Model Alignment

Safety Alignment Should Be Made More Than Just a Few Tokens Deep