Abstract: Although large language models (LLMs) have demonstrated remarkable performance, the lack of transparency in their inference logic raises concerns about their trustworthiness. To gain a better understanding of LLMs, we conduct a detailed analysis of the operations of attention heads and aim to better understand the in-context learning of LLMs. Specifically, we investigate whether attention heads encode two types of relationships between tokens present in natural languages: the syntactic dependency parsed from sentences and the relation within knowledge graphs. We find that certain attention heads exhibit a pattern where, when attending to head tokens, they recall tail tokens and increase the output logits of those tail tokens. More crucially, the formulation of such semantic induction heads has a close correlation with the emergence of the in-context learning ability of language models. The study of semantic attention heads advances our understanding of the intricate operations of attention heads in transformers, and further provides new insights into the in-context learning of LLMs.

What problem does this paper attempt to address?

The paper primarily aims to address the following issues:

1. **In-depth understanding of the attention mechanism in large language models (LLMs)**: LLMs perform strongly on natural language processing (NLP) tasks, yet their internal workings remain opaque. The authors aim to narrow this gap by analyzing the operations of attention heads in detail, focusing on how they support in-context learning (ICL).
2. **Exploring the semantic induction capability of attention heads**: The authors investigate whether attention heads encode two types of relationships between tokens in natural language: syntactic dependencies (such as subject-verb and verb-object) and semantic relations in knowledge graphs (such as "used for" and "feature of"). They define a new type of attention head, **semantic induction heads**, which, when attending to a particular token (the head token), tend to raise the output logits of tokens related to it (the tail tokens).
3. **Studying the correlation between semantic induction heads and in-context learning ability**: The authors categorize ICL ability into three levels (loss reduction, format adherence, and pattern discovery), observe that these levels emerge gradually in the early stages of training, and examine how the appearance of semantic induction heads correlates with the emergence of ICL ability.

In summary, by introducing the concept of semantic induction heads, the paper deepens our understanding of the internal mechanisms of LLMs and reveals a close correlation between semantic induction heads and ICL ability, which is relevant to further improving the safety and reliability of LLMs.
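The semantic induction head behavior described above (attend to a head token, raise the logits of a related tail token) can be illustrated with a toy calculation. This is a minimal sketch under stated assumptions, not the paper's actual metric: the function name `tail_logit_boost`, the combined value-output matrix `w_ov`, the unembedding matrix `w_u`, and all numbers are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import numpy as np

def tail_logit_boost(attn, resid, w_ov, w_u, query_pos, head_pos, tail_id):
    """Attention-weighted change in the logit of `tail_id` at `query_pos`
    that one attention head contributes by reading from `head_pos`.

    All arguments are illustrative assumptions, not the paper's notation:
      attn  : (seq, seq) attention weights of a single head
      resid : (seq, d_model) inputs to the head at each position
      w_ov  : (d_model, d_model) combined value-output projection
      w_u   : (d_model, vocab) unembedding matrix
    """
    # Vector the head would write if it attended fully to head_pos.
    written = resid[head_pos] @ w_ov
    # Effect of that vector on every output logit.
    logits = written @ w_u
    # Center so the score measures a boost relative to the average token.
    centered = logits - logits.mean()
    # Scale by how strongly query_pos actually attends to head_pos.
    return attn[query_pos, head_pos] * centered[tail_id]

# Toy setup: 2 positions, d_model = 4, vocabulary of 3 tokens.
attn = np.array([[1.0, 0.0],
                 [0.5, 0.5]])          # position 1 puts weight 0.5 on position 0
resid = np.array([[1.0, 0.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0, 0.0]])
w_ov = np.eye(4)                       # identity, so the head copies its input
w_u = np.zeros((4, 3))
w_u[0, 1] = 3.0                        # direction 0 promotes vocabulary token 1

boost_tail = tail_logit_boost(attn, resid, w_ov, w_u,
                              query_pos=1, head_pos=0, tail_id=1)
boost_other = tail_logit_boost(attn, resid, w_ov, w_u,
                               query_pos=1, head_pos=0, tail_id=0)
print(boost_tail, boost_other)
```

In this toy setting, a positive score for the tail token and a negative score for an unrelated token is the pattern one would expect from a head that behaves like a semantic induction head for that (head, tail) pair. A real analysis would average such scores over many pairs drawn from dependency parses or knowledge-graph triples.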

Identifying Semantic Induction Heads to Understand In-Context Learning

In-context Learning and Induction Heads

Unveiling Induction Heads: Provable Training Dynamics and Feature Learning in Transformers

Induction Heads as an Essential Mechanism for Pattern Matching in In-context Learning

Theoretical Understanding of In-Context Learning in Shallow Transformers with Unstructured Data

A Theory of Emergent In-Context Learning as Implicit Structure Induction

Decoding In-Context Learning: Neuroscience-inspired Analysis of Representations in Large Language Models

The Evolution of Statistical Induction Heads: In-Context Learning Markov Chains

Does In-Context Learning Really Learn? Rethinking How Large Language Models Respond and Solve Tasks via In-Context Learning

How Transformers Implement Induction Heads: Approximation and Optimization Analysis

Linking In-context Learning in Transformers to Human Episodic Memory

Attention Heads of Large Language Models: A Survey

What In-Context Learning "Learns" In-Context: Disentangling Task Recognition and Task Learning

From Unstructured Data to In-Context Learning: Exploring What Tasks Can Be Learned and When

Investigating the Learning Behaviour of In-Context Learning: A Comparison with Supervised Learning

The Mystery of In-Context Learning: A Comprehensive Survey on Interpretation and Analysis

Interpretable Language Modeling via Induction-head Ngram Models

Towards Understanding In-Context Learning with Contrastive Demonstrations and Saliency Maps

Why Larger Language Models Do In-context Learning Differently?

Large Language Models Are In-Context Semantic Reasoners Rather Than Symbolic Reasoners

Unveiling In-Context Learning: A Coordinate System to Understand Its Working Mechanism