Identifying Semantic Induction Heads to Understand In-Context Learning

Jie Ren, Qipeng Guo, Hang Yan, Dongrui Liu, Quanshi Zhang, Xipeng Qiu, Dahua Lin
2024-07-25
Abstract: Although large language models (LLMs) have demonstrated remarkable performance, the lack of transparency in their inference logic raises concerns about their trustworthiness. To gain a better understanding of LLMs, we conduct a detailed analysis of the operations of attention heads and aim to better understand the in-context learning of LLMs. Specifically, we investigate whether attention heads encode two types of relationships between tokens present in natural languages: the syntactic dependency parsed from sentences and the relation within knowledge graphs. We find that certain attention heads exhibit a pattern where, when attending to head tokens, they recall tail tokens and increase the output logits of those tail tokens. More crucially, the formulation of such semantic induction heads has a close correlation with the emergence of the in-context learning ability of language models. The study of semantic attention heads advances our understanding of the intricate operations of attention heads in transformers, and further provides new insights into the in-context learning of LLMs.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
The paper primarily aims to address the following issues:

1. **In-depth understanding of the attention mechanism in large language models (LLMs)**: LLMs perform strongly on natural language processing (NLP) tasks, yet their internal workings remain opaque. To narrow this gap, the authors analyze the operations of attention heads in detail, focusing on how those heads support in-context learning (ICL).
2. **Exploring the semantic induction capability of attention heads**: The authors investigate whether attention heads encode two types of relationships between tokens in natural language: syntactic dependencies (such as subject-verb and verb-object) and semantic relations in knowledge graphs (such as "used for" and "feature of"). They define a new type of attention head, the **semantic induction head**, which, when attending to a head token, tends to raise the output logits of the tail tokens that stand in a given relation to it.
3. **Studying the correlation between semantic induction heads and in-context learning ability**: The authors divide ICL ability into three levels (loss reduction, format adherence, and pattern discovery), observe that these levels emerge gradually during the early stages of training, and examine how the appearance of semantic induction heads correlates with the emergence of ICL ability.

In summary, by introducing the concept of semantic induction heads, the paper deepens our understanding of the internal mechanisms of LLMs and reveals a close connection between semantic induction heads and ICL ability, which is relevant to improving the safety and reliability of LLMs.
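The head-to-tail pattern described above can be illustrated with a toy calculation. The sketch below is not the paper's metric or code; it assumes a single attention head with one-hot token embeddings, an identity unembedding, and a hand-built value-output matrix `W_OV` that maps a hypothetical head token (id 1) onto a hypothetical tail token (id 3). The function names `head_logit_contribution` and `tail_logit_boost` are invented for this illustration. It shows how attending to the head token translates into a raised logit for the tail token.

```python
import numpy as np

def head_logit_contribution(attn, context_ids, W_E, W_OV, W_U):
    """Logits that one attention head adds at the current position.

    attn        : (n_ctx,) attention weights over the context tokens
    context_ids : (n_ctx,) token ids in the context
    W_E, W_OV, W_U : toy embedding, value-output, and unembedding matrices
    """
    values = W_E[context_ids] @ W_OV   # (n_ctx, d_model) per-token value vectors
    head_out = attn @ values           # (d_model,) attention-weighted sum
    return head_out @ W_U              # (vocab,) contribution to the output logits

def tail_logit_boost(logits, tail_id):
    """How far the tail token's logit is raised above the vocabulary mean."""
    return float(logits[tail_id] - logits.mean())

# Toy setup (all values are assumptions chosen for illustration only).
vocab = 5
W_E = np.eye(vocab)                 # one-hot embeddings
W_U = np.eye(vocab)                 # identity unembedding
W_OV = np.zeros((vocab, vocab))
W_OV[1, 3] = 1.0                    # head token 1 writes in the direction of tail token 3

context_ids = np.array([0, 1, 2])   # the context contains the head token (id 1)
attn = np.array([0.1, 0.7, 0.2])    # the head attends mostly to the head token

logits = head_logit_contribution(attn, context_ids, W_E, W_OV, W_U)
boost = tail_logit_boost(logits, tail_id=3)
print("logit contribution:", logits)
print("tail logit boost:", boost)
```

In this toy setting the boost grows with the attention weight placed on the head token, which is the qualitative behavior the summary attributes to semantic induction heads. A real analysis would use a trained model's weights and relation-annotated token pairs rather than hand-built matrices.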