Abstract: This study analyzes changes in the attention mechanisms of large language models (LLMs) when used to understand natural conversations between humans (human-human). We analyze three use cases of LLMs: interactions over web content, code, and mathematical texts. By analyzing attention distance, dispersion, and interdependency across these domains, we highlight the unique challenges posed by conversational data. Notably, conversations require nuanced handling of long-term contextual relationships and exhibit higher complexity through their attention patterns. Our findings reveal that while language models exhibit domain-specific attention behaviors, there is a significant gap in their ability to specialize in human conversations. Through detailed attention entropy analysis and t-SNE visualizations, we demonstrate the need for models trained with a diverse array of high-quality conversational data to enhance understanding and generation of human-like dialogue. This research highlights the importance of domain specialization in language models and suggests pathways for future advancement in modeling human conversational nuances.

What problem does this paper attempt to address?

The main problem this paper addresses is the limited ability of large language models (LLMs) to understand and process natural human conversations. Specifically, the authors focus on:

1. **Differences in Attention Mechanisms**: How the attention mechanisms of LLMs behave on natural conversations between humans (human-human conversations) compared with other types of data (web content, code, and mathematical texts).
2. **Unique Challenges of Conversational Data**: Natural conversations involve long-term context dependence, complex emotional expression, dynamic adjustment, and multimodal information. These characteristics make conversational data harder to process than other types of text.
3. **Data Scarcity Problem**: Real human conversation makes up an extremely small proportion of the data on the Internet, so LLMs are trained without sufficient high-quality conversational data, which limits their ability to understand and generate human conversations.
4. **Domain Specificity**: Although LLMs perform well in some specific domains (such as code and mathematics), there is a significant gap when they process human conversations. The authors therefore argue that more domain-specific models are needed to improve conversation processing.

### Specific Research Contents

To explore these problems, the authors carried out the following analyses:

- **Attention Distance**: The difference in attention distance (\(\Delta D_\alpha\)) between two data domains quantifies how strongly the model attends to long-distance dependencies when processing different types of text. Here \(X^{D_k}\) is the set of sequences from domain \(D_k\) and \(\alpha_{i,j}(x)\) is the attention weight from position \(i\) to position \(j\) in sequence \(x\).

  \[ D_\alpha^{D_k}=\frac{\sum_{x \in X^{D_k}} \sum_{i = 1}^{|x|} \sum_{j = 1}^{|x|} \alpha_{i,j}(x)\cdot(i - j)}{\sum_{x \in X^{D_k}} \sum_{i = 1}^{|x|} \sum_{j = 1}^{|x|} \alpha_{i,j}(x)} \]

  \[ \Delta D_\alpha=D_\alpha^{D_1}-D_\alpha^{D_2} \]

- **Attention Dispersion**: The entropy of the attention distribution measures how the model allocates attention in each domain and is used to evaluate the effectiveness of its learning and processing strategies.

  \[ \mathrm{Entropy}_\alpha(x_i)=-\sum_{j = 1}^{|x|} \alpha_{i,j}(x)\log\bigl(\alpha_{i,j}(x)\bigr) \]

- **Interdependency Analysis**: The "Interdependency Factor (IF)" quantifies the dependency relationships between elements in different data domains and reveals their information structure and dynamic changes.

  \[ IF=\frac{1}{|N|^2-|N|}\sum_{i = 1}^{|N|} \sum_{j = 1,\, j\neq i}^{|N|} a_{ij} \]

Through these analyses, the authors aim to reveal the limitations of current general-purpose LLMs in processing conversational data, and they emphasize the importance of incorporating more diverse, real-life conversational data into training to improve the model's ability to understand and generate human conversations.
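The three metrics above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: each attention map is assumed to be a square list of lists whose entry `attn[i][j]` is the weight from position `i` to position `j`, the function names are hypothetical, and \(a_{ij}\) in the Interdependency Factor is assumed to be a pairwise dependency score (here simply an attention weight), since the summary does not define it further.

```python
import math


def attention_distance(attn_maps):
    """Attention-weighted mean of (i - j) over all maps of one domain.

    attn_maps: list of square matrices (assumed layout: attn[i][j] is the
    weight from position i to position j).
    """
    numerator = 0.0
    denominator = 0.0
    for attn in attn_maps:
        for i, row in enumerate(attn):
            for j, weight in enumerate(row):
                numerator += weight * (i - j)
                denominator += weight
    return numerator / denominator


def attention_entropy(row):
    """Shannon entropy of one attention row; zero weights contribute 0."""
    return -sum(w * math.log(w) for w in row if w > 0.0)


def interdependency_factor(scores):
    """Mean of the off-diagonal entries of a square score matrix."""
    n = len(scores)
    total = sum(scores[i][j] for i in range(n) for j in range(n) if i != j)
    return total / (n * n - n)


# Toy causal attention maps standing in for two domains (made-up values).
domain_1 = [[[1.0, 0.0, 0.0],
             [0.5, 0.5, 0.0],
             [0.25, 0.25, 0.5]]]
domain_2 = [[[1.0, 0.0, 0.0],
             [1.0, 0.0, 0.0],
             [1.0, 0.0, 0.0]]]

delta_d = attention_distance(domain_1) - attention_distance(domain_2)
last_row_entropy = attention_entropy(domain_1[0][2])
if_score = interdependency_factor(domain_1[0])
```

In this toy setup, `delta_d` is negative because the second map places all of its weight on the first token and so attends further back on average, while `last_row_entropy` is higher for rows that spread their weight over more positions.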

Are Human Conversations Special? A Large Language Model Perspective

Large Human Language Models: A Need and the Challenges

Do Large Language Models Perform the Way People Expect? Measuring the Human Generalization Function

Large Language Models: The Need for Nuance in Current Debates and a Pragmatic Perspective on Understanding

The Importance of Understanding Language in Large Language Models

Toward Cultural Interpretability: A Linguistic Anthropological Framework for Describing and Evaluating Large Language Models (LLMs)

Talk Less, Interact Better: Evaluating In-context Conversational Adaptation in Multimodal LLMs

The Limitations of Large Language Models for Understanding Human Language and Cognition

Attention Heads of Large Language Models: A Survey

Real or Robotic? Assessing Whether LLMs Accurately Simulate Qualities of Human Responses in Dialogue

Large language models as linguistic simulators and cognitive models in human research

Existential Conversations with Large Language Models: Content, Community, and Culture

Large Language Models Humanize Technology

Characterizing Similarities and Divergences in Conversational Tones in Humans and LLMs by Sampling with People

Large Models of What? Mistaking Engineering Achievements for Human Linguistic Agency

Large Language Models Know What To Say But Not When To Speak

Talking about Large Language Models

Understanding User Experience in Large Language Model Interactions

Large Language Models: A Historical and Sociocultural Perspective

HLB: Benchmarking LLMs' Humanlikeness in Language Use

Limits of Large Language Models in Debating Humans