Let's Focus on Neuron: Neuron-Level Supervised Fine-tuning for Large Language Model

Haoyun Xu, Runzhe Zhan, Derek F. Wong, Lidia S. Chao
2024-03-18
Abstract: Large Language Models (LLMs) are composed of neurons that exhibit various behaviors and roles, which become increasingly diversified as models scale. Recent studies have revealed that not all neurons are active across different datasets, and this sparsity correlates positively with the task-specific ability, leading to advancements in model pruning and training efficiency. Traditional fine-tuning methods engage all parameters of LLMs, which is computationally expensive and may not be necessary. In contrast, Parameter-Efficient Fine-Tuning (PEFT) approaches aim to minimize the number of trainable parameters, yet they still operate at a relatively macro scale (e.g., layer-level). We introduce Neuron-Level Fine-Tuning (NeFT), a novel approach that refines the granularity of parameter training down to the individual neuron, enabling more precise and computationally efficient model updates. The experimental results show that NeFT not only exceeded the performance of full-parameter fine-tuning and PEFT but also provided insights into the analysis of neurons.
Computation and Language
What problem does this paper attempt to address?
### The Problem the Paper Aims to Solve

This paper addresses the inefficiency of parameter updates when fine-tuning large language models (LLMs). Specifically:

1. **High computational cost of full-parameter fine-tuning**: Traditional fine-tuning updates every parameter in the model. This is computationally expensive and may be unnecessary, since not all neurons are active on every task.
2. **Limitations of existing parameter-efficient fine-tuning methods**: Parameter-Efficient Fine-Tuning (PEFT) methods reduce the number of trainable parameters, but they still operate at a relatively coarse scale (for example, the layer level) and do not reach finer-grained optimization.
3. **Optimization potential at the neuron level**: Prior studies show that neuron activity varies across datasets and that this sparsity correlates positively with task-specific ability. Fine-tuning at the neuron level can therefore make model updates more precise and more efficient.

To this end, the paper proposes **Neuron-Level Fine-Tuning (NeFT)**, which refines the granularity of parameter training down to individual neurons. Experimental results show that NeFT outperforms both full-parameter fine-tuning and PEFT methods, and that it also offers insights into neuron behavior.
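To make the idea of neuron-level granularity concrete, the following is a minimal sketch rather than the paper's implementation. It assumes that a "neuron" corresponds to one row of a weight matrix and that the indices of the trainable neurons are already known; this summary does not describe how NeFT selects them. The function name `neuron_level_step` and all values are hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def neuron_level_step(
    weights: Matrix,
    grads: Matrix,
    trainable_neurons: Sequence[int],
    lr: float = 0.1,
) -> Matrix:
    """Apply one plain SGD step to the selected rows only.

    Assumption: each row of `weights` holds the parameters of one neuron.
    Rows that are not listed in `trainable_neurons` are copied unchanged.
    """
    selected = set(trainable_neurons)
    return [
        [w - lr * g for w, g in zip(w_row, g_row)] if i in selected else list(w_row)
        for i, (w_row, g_row) in enumerate(zip(weights, grads))
    ]


# Toy layer with three neurons; only neuron 1 is marked as trainable.
weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
grads = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
updated = neuron_level_step(weights, grads, trainable_neurons=[1], lr=0.5)
```

In this toy example only the second row changes, while the other two neurons keep their original weights. Full-parameter fine-tuning would instead update every row.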