Let's Focus on Neuron: Neuron-Level Supervised Fine-tuning for Large Language Model

Haoyun Xu, Runzhe Zhan, Derek F. Wong, Lidia S. Chao

2024-03-18

Abstract: Large Language Models (LLMs) are composed of neurons that exhibit various behaviors and roles, which become increasingly diversified as models scale. Recent studies have revealed that not all neurons are active across different datasets, and this sparsity correlates positively with the task-specific ability, leading to advancements in model pruning and training efficiency. Traditional fine-tuning methods engage all parameters of LLMs, which is computationally expensive and may not be necessary. In contrast, Parameter-Efficient Fine-Tuning (PEFT) approaches aim to minimize the number of trainable parameters, yet they still operate at a relatively macro scale (e.g., layer-level). We introduce Neuron-Level Fine-Tuning (NeFT), a novel approach that refines the granularity of parameter training down to the individual neuron, enabling more precise and computationally efficient model updates. The experimental results show that NeFT not only exceeded the performance of full-parameter fine-tuning and PEFT but also provided insights into the analysis of neurons.

Computation and Language

What problem does this paper attempt to address?

### The Problem the Paper Aims to Solve

This paper addresses the inefficiency of parameter updates when fine-tuning large language models (LLMs). Specifically:

1. **High computational cost of full-parameter fine-tuning**: Full-parameter fine-tuning updates every parameter in the model. This is computationally expensive and may be unnecessary, since not all neurons are active for a given task.
2. **Limitations of existing parameter-efficient fine-tuning (PEFT) methods**: PEFT methods reduce the number of trainable parameters but still operate at a relatively coarse granularity (e.g., the layer level), so they do not reach finer-grained optimization.
3. **Optimization potential at the neuron level**: Prior studies show that neuron activity varies across datasets and that this sparsity is positively correlated with task-specific ability. Fine-tuning at the neuron level can therefore make model updates more precise and more efficient.

To this end, the paper proposes **Neuron-Level Fine-Tuning (NeFT)**, which refines the granularity of parameter training down to individual neurons. Experimental results show that NeFT outperforms both full-parameter fine-tuning and PEFT, and also yields insights into neuron behavior.
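To make "training at the granularity of individual neurons" concrete, here is a minimal sketch in plain Python. It is not the paper's implementation. It assumes that a neuron corresponds to one row of a weight matrix and that a set of selected neuron indices is already available; how NeFT actually chooses neurons is not described in this summary, so the `selected` set, the function names, and the toy numbers below are all hypothetical.

```python
from typing import List, Set

Matrix = List[List[float]]


def masked_sgd_step(weights: Matrix, grads: Matrix,
                    selected: Set[int], lr: float) -> Matrix:
    """Apply one SGD step, updating only the rows (neurons) in `selected`.

    Assumption: row i of `weights` holds the parameters of neuron i.
    Rows outside `selected` are copied unchanged, i.e. they stay frozen.
    """
    updated: Matrix = []
    for i, (w_row, g_row) in enumerate(zip(weights, grads)):
        if i in selected:
            updated.append([w - lr * g for w, g in zip(w_row, g_row)])
        else:
            updated.append(list(w_row))
    return updated


def trainable_fraction(weights: Matrix, selected: Set[int]) -> float:
    """Fraction of parameters that belong to the selected neurons."""
    total = sum(len(row) for row in weights)
    trainable = sum(len(weights[i]) for i in selected)
    return trainable / total


# Toy layer with 4 neurons of 2 parameters each (hypothetical values).
weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
grads = [[1.0, 1.0] for _ in weights]
selected = {1, 3}  # hypothetical choice of neurons to train

new_weights = masked_sgd_step(weights, grads, selected, lr=0.5)
print(new_weights)
print(trainable_fraction(weights, selected))
```

In this toy layer only neurons 1 and 3 move, so half of the parameters are trainable and the rest keep their pretrained values. In a real framework the same effect would typically be obtained by masking gradients rather than copying rows.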

Discovering Long-Term Effects on Parameter Efficient Fine-tuning

An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models

Non-Intrusive Adaptation: Input-Centric Parameter-efficient Fine-Tuning for Versatile Multimodal Modeling

Parameter-Efficient Fine-Tuning Methods for Pretrained Language Models: A Critical Review and Assessment

Parameter-Efficient Fine-Tuning Enhances Adaptation of Single Cell Large Language Model for Cell Type Identification

Towards Better Parameter-Efficient Fine-Tuning for Large Language Models: A Position Paper

Parameter-efficient fine-tuning of large-scale pre-trained language models

Sparse Matrix in Large Language Model Fine-tuning

Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning

NLSR: Neuron-Level Safety Realignment of Large Language Models Against Harmful Fine-Tuning

Learning Global Controller in Latent Space for Parameter-Efficient Fine-Tuning

Delta Tuning: A Comprehensive Study of Parameter Efficient Methods for Pre-trained Language Models

Deconfounded Causality-aware Parameter-Efficient Fine-Tuning for Problem-Solving Improvement of LLMs

Theoretical Insights into Fine-Tuning Attention Mechanism: Generalization and Optimization

LANDeRMT: Detecting and Routing Language-Aware Neurons for Selectively Finetuning LLMs to Machine Translation

Exploring Parameter-Efficient Fine-Tuning Techniques for Code Generation with Large Language Models

Sparsity-Accelerated Training for Large Language Models

Parameter-Efficient Fine-Tuning of Large Language Models for Unit Test Generation: An Empirical Study

LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models