Attention Heads of Large Language Models: A Survey

Zifan Zheng, Yezhaohui Wang, Yuxin Huang, Shichao Song, Mingchuan Yang, Bo Tang, Feiyu Xiong, Zhiyu Li
2024-09-24
Abstract: Since the advent of ChatGPT, Large Language Models (LLMs) have excelled in various tasks but remain as black-box systems. Consequently, the reasoning bottlenecks of LLMs are mainly influenced by their internal architecture. As a result, many researchers have begun exploring the potential internal mechanisms of LLMs, with most studies focusing on attention heads. Our survey aims to shed light on the internal reasoning processes of LLMs by concentrating on the underlying mechanisms of attention heads. We first distill the human thought process into a four-stage framework: Knowledge Recalling, In-Context Identification, Latent Reasoning, and Expression Preparation. Using this framework, we systematically review existing research to identify and categorize the functions of specific attention heads. Furthermore, we summarize the experimental methodologies used to discover these special heads, dividing them into two categories: Modeling-Free methods and Modeling-Required methods. Also, we outline relevant evaluation methods and benchmarks. Finally, we discuss the limitations of current research and propose several potential future directions.
Computation and Language
What problem does this paper attempt to address?
The paper aims to explain the internal working mechanisms of attention heads in Large Language Models (LLMs). Specifically:

1. **Background and Motivation**: Although LLMs perform well on a wide range of tasks, their internal mechanisms remain opaque. Many researchers therefore work on uncovering the internal reasoning processes of these models, particularly the role of attention heads.
2. **Main Contributions**:
   - Distill the human thought process into a four-stage framework and apply it to the analysis of LLM reasoning: Knowledge Recalling (KR), In-Context Identification (ICI), Latent Reasoning (LR), and Expression Preparation (EP).
   - Systematically classify the functions of specific attention heads reported in existing research and explore how heads at different stages collaborate.
   - Summarize the experimental methods used to identify special attention heads, divided into Modeling-Free and Modeling-Required methods.
   - Outline related evaluation methods and benchmarks.
3. **Core Content**:
   - Describe in detail the different types of attention heads and their roles in each reasoning stage.
   - Analyze how attention heads capture contextual information through the QK matrix and write reasoning results back into the residual stream through the OV matrix.
   - Discuss attention heads tied to specific tasks, such as multiple-choice question answering and binary decision tasks, along with other specialized heads.

Through this work, the paper attempts to provide a theoretical foundation for interpretability research on large language models and to suggest directions for future research.
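The QK and OV roles mentioned above can be illustrated with a minimal single-head sketch in NumPy. This is not code from the survey: the function name `attention_head`, the weight names `W_Q`, `W_K`, `W_V`, `W_O`, the toy dimensions, and the random weights are all assumptions chosen for illustration, and layer normalization, biases, and multi-head concatenation are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_head(X, W_Q, W_K, W_V, W_O):
    """One causal attention head (illustrative sketch).

    X: (seq_len, d_model) residual stream.
    Returns the attention pattern and the vector the head writes back.
    """
    seq_len = X.shape[0]
    d_head = W_Q.shape[1]

    # QK side: scores decide which positions each token reads from.
    scores = (X @ W_Q) @ (X @ W_K).T / np.sqrt(d_head)
    # Causal mask: a token may not attend to later positions.
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    pattern = softmax(scores, axis=-1)

    # OV side: what is copied from the attended positions and
    # projected back into the model dimension.
    head_out = pattern @ (X @ W_V) @ W_O
    return pattern, head_out

# Toy dimensions and random weights (assumptions, not from the paper).
rng = np.random.default_rng(0)
seq_len, d_model, d_head = 5, 16, 4
X = rng.normal(size=(seq_len, d_model))
W_Q, W_K, W_V = (rng.normal(size=(d_model, d_head)) for _ in range(3))
W_O = rng.normal(size=(d_head, d_model))

pattern, head_out = attention_head(X, W_Q, W_K, W_V, W_O)
X_next = X + head_out  # the head's output is added to the residual stream
```

In this sketch `pattern` depends only on `W_Q` and `W_K`, while the content written into `X_next` depends only on `W_V` and `W_O`, which is why the two sides can be studied separately when characterizing what a head does.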