Layer Importance and Hallucination Analysis in Large Language Models via Enhanced Activation Variance-Sparsity

Zichen Song, Sitan Huang, Yuxin Wu, Zhongfeng Kang
2024-11-15
Abstract: Evaluating the importance of different layers in large language models (LLMs) is crucial for optimizing model performance and interpretability. This paper first explores layer importance using the Activation Variance-Sparsity Score (AVSS), which combines normalized activation variance and sparsity to quantify each layer's contribution to overall model performance. By ranking layers based on AVSS and pruning the least impactful 25%, our experiments on tasks such as question answering, language modeling, and sentiment classification show that over 90% of the original performance is retained, highlighting potential redundancies in LLM architectures. Building on AVSS, we propose an enhanced version, EAVSS, tailored to assess hallucination propensity across layers. This improved approach introduces Hallucination-Specific Activation Variance (HSAV) and Hallucination-Specific Sparsity (HSS) metrics, allowing precise identification of hallucination-prone layers. By incorporating contrastive learning on these layers, we effectively mitigate hallucination generation, contributing to more robust and efficient LLMs, with a maximum performance improvement of 12%. Our results on the NQ, SciQ, TriviaQA, TruthfulQA, and WikiQA datasets demonstrate the efficacy of this method, offering a comprehensive framework for both layer importance evaluation and hallucination mitigation in LLMs.
Computation and Language, Performance
What problem does this paper attempt to address?
This paper addresses two key problems in large language models (LLMs):

1. **Layer Importance Evaluation**: The paper studies how much each layer of an LLM contributes to overall performance, with the aim of optimizing performance and improving interpretability. It proposes the Activation Variance-Sparsity Score (AVSS), which combines normalized activation variance and sparsity to quantify each layer's contribution. Experiments show that ranking layers by AVSS and pruning the least important 25% retains more than 90% of the original performance across multiple tasks (question answering, language modeling, and sentiment classification), revealing potential redundancy in LLM architectures.
2. **Hallucination Generation Analysis and Mitigation**: The paper analyzes and mitigates the tendency of specific layers to generate hallucinations. It extends AVSS into the Enhanced Activation Variance-Sparsity Score (EAVSS), which introduces the Hallucination-Specific Activation Variance (HSAV) and Hallucination-Specific Sparsity (HSS) metrics to precisely identify hallucination-prone layers. Applying contrastive learning to these layers reduces hallucination generation and improves the model's robustness and efficiency. On the NQ, SciQ, TriviaQA, TruthfulQA, and WikiQA datasets, EAVSS yields a maximum performance improvement of 12%.

In conclusion, by proposing AVSS and EAVSS, the paper offers both a way to evaluate layer importance in LLMs and a framework for analyzing and mitigating hallucinations, providing new ideas for building more efficient, more reliable, and more interpretable LLMs.
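
To make the rank-and-prune idea concrete, here is a minimal sketch of scoring layers from activation variance and sparsity and selecting the lowest-scoring 25% for pruning. The summary above does not give the exact AVSS formula, so the combination used here (sum-normalized variance multiplied by the fraction of non-zero activations) is an assumption for illustration, as are the function names `avss_scores` and `layers_to_prune`, the near-zero threshold `eps`, and the toy activations. It is not the authors' implementation.

```python
from statistics import pvariance


def avss_scores(layer_activations, eps=1e-6):
    """Return one score per layer; scores sum to 1 when any layer is active.

    layer_activations: one flat list of floats per layer.
    ASSUMED form (not from the paper): normalized variance * (1 - sparsity).
    """
    variances = [pvariance(acts) for acts in layer_activations]
    var_total = sum(variances)
    raw = []
    for acts, var in zip(layer_activations, variances):
        # Sparsity: fraction of activations whose magnitude is below eps.
        sparsity = sum(1 for a in acts if abs(a) < eps) / len(acts)
        norm_var = var / var_total if var_total > 0 else 0.0
        raw.append(norm_var * (1.0 - sparsity))
    total = sum(raw)
    if total == 0:
        return [0.0] * len(raw)  # no variance anywhere: no ranking signal
    return [r / total for r in raw]


def layers_to_prune(scores, fraction=0.25):
    """Indices of the lowest-scoring `fraction` of layers, lowest first."""
    n_prune = int(len(scores) * fraction)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i])
    return ranked[:n_prune]


# Toy activations for four hypothetical layers (not real model data).
toy = [
    [0.0, 0.0, 0.0, 0.1],    # mostly zero, low variance
    [1.0, -1.0, 2.0, -2.0],  # dense, high variance
    [0.5, 0.0, -0.5, 0.0],   # half zero
    [3.0, -3.0, 0.0, 1.0],   # high variance, one zero
]
print(layers_to_prune(avss_scores(toy)))  # prints [0]
```

In practice the per-layer activations would be collected from a real model on a calibration set, and the retained-performance figure would have to be measured by re-evaluating the pruned model; this sketch covers only the scoring and selection step.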