Abstract: The recent success of Large Language Models (LLMs) has attracted significant attention in both academia and industry. Substantial efforts have been made to enhance the zero- and few-shot generalization capabilities of open-source LLMs through finetuning. Currently, the prevailing approach is instruction-tuning, which trains LLMs to complete real-world tasks by generating responses guided by natural language instructions. Notably, this approach may underperform on sequence and token classification tasks. Unlike text generation tasks, classification tasks have a limited label space, where precise label prediction matters more than generating diverse and human-like responses. Prior research has found that instruction-tuned LLMs cannot outperform BERT, prompting us to explore the potential of leveraging latent representations from LLMs for supervised label prediction. In this paper, we introduce a label-supervised adaptation for LLMs, which aims to finetune the model with discriminant labels. We evaluate this approach with Label Supervised LLaMA (LS-LLaMA), which is based on LLaMA-2-7B, a relatively small-scale LLM, and can be finetuned on a single GeForce RTX 4090 GPU. We extract latent representations from the final LLaMA layer and project them into the label space to compute the cross-entropy loss. The model is finetuned by Low-Rank Adaptation (LoRA) to minimize this loss. Remarkably, without intricate prompt engineering or external knowledge, LS-LLaMA substantially outperforms LLMs ten times its size and demonstrates consistent improvements over robust baselines such as BERT-Large and RoBERTa-Large in text classification. Moreover, by removing the causal mask from decoders, LS-unLLaMA achieves state-of-the-art performance in named entity recognition (NER). Our work sheds light on a novel approach to adapting LLMs for various downstream tasks.
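
The label-supervised objective described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the helper names (`matvec`, `lora_weight`, `cross_entropy`, `classify_loss`) are hypothetical, the toy vectors stand in for a single final-layer representation (the abstract does not say how token representations are pooled), and the low-rank update is shown on a generic frozen weight matrix rather than on any specific LLaMA layer.

```python
import math


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_weight(base, a, b, scale=1.0):
    """Return base + scale * (b @ a) for a frozen weight `base`.

    Shapes (assumed for illustration): base is d_out x d_in,
    b is d_out x r, a is r x d_in, with r much smaller than d_in.
    Only a and b would be trained; base stays fixed.
    """
    rank = len(a)
    return [
        [
            base[i][j] + scale * sum(b[i][k] * a[k][j] for k in range(rank))
            for j in range(len(base[0]))
        ]
        for i in range(len(base))
    ]


def cross_entropy(logits, label):
    """Negative log-probability of `label` under softmax(logits)."""
    peak = max(logits)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_norm - logits[label]


def classify_loss(hidden, head, label):
    """Project a hidden vector into the label space and score it."""
    logits = matvec(head, hidden)
    return logits, cross_entropy(logits, label)


if __name__ == "__main__":
    hidden = [0.5, -1.0, 2.0]                  # toy final-layer representation
    head = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # toy projection to 2 labels
    logits, loss = classify_loss(hidden, head, label=0)
    print("logits:", logits, "loss:", loss)
```

In an actual finetuning run the loss would be averaged over a batch and minimized by gradient descent over the low-rank factors; the sketch only shows the forward computation of the objective.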

Fine-tuning Large Language Models for Domain-specific Machine Translation

Refining Translations with LLMs: A Constraint-Aware Iterative Prompting Approach

Building Accurate Translation-Tailored LLMs with Language Aware Instruction Tuning

Large Language Model for Multi-Domain Translation: Benchmarking and Domain CoT Fine-tuning

Adapting Large Language Models for Document-Level Machine Translation

How Much Data is Enough Data? Fine-Tuning Large Language Models for In-House Translation: Performance Evaluation Across Multiple Dataset Sizes

Enhancing Document-level Translation of Large Language Model via Translation Mixed-instructions

The Fine-Tuning Paradox: Boosting Translation Quality Without Sacrificing LLM Abilities

A Novel Paradigm Boosting Translation Capabilities of Large Language Models

Guiding Large Language Models to Post-Edit Machine Translation with Error Annotations

Prompting Large Language Models for Zero-Shot Domain Adaptation in Speech Recognition

Fine-grained LLM Agent: Pinpointing and Refining Large Language Models via Fine-Grained Actionable Feedback

Label Supervised LLaMA Finetuning

Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation

Fine-tuning Large Language Models for Adaptive Machine Translation

MindLLM: Pre-training Lightweight Large Language Model from Scratch, Evaluations and Domain Applications

Multi-Task Instruction Tuning of LLaMa for Specific Scenarios: A Preliminary Study on Writing Assistance

Fine-tuning large language models for domain adaptation: Exploration of training strategies, scaling, model merging and synergistic capabilities

Instruction-Tuned LLMs Succeed in Document-Level MT Without Fine-Tuning -- But BLEU Turns a Blind Eye

Mitigating the Language Mismatch and Repetition Issues in LLM-based Machine Translation via Model Editing

Fine-tuning Large Language Models for Entity Matching