Formality is Favored: Unraveling the Learning Preferences of Large Language Models on Data with Conflicting Knowledge

Jiahuan Li, Yiqing Cao, Shujian Huang, Jiajun Chen
2024-10-07
Abstract: Having been trained on massive pretraining data, large language models have shown excellent performance on many knowledge-intensive tasks. However, pretraining data tends to contain misleading and even conflicting information, and it is intriguing to understand how LLMs handle these noisy data during training. In this study, we systematically analyze LLMs' learning preferences for data with conflicting knowledge. We find that pretrained LLMs establish learning preferences similar to humans, i.e., preferences towards formal texts and texts with fewer spelling errors, resulting in faster learning and more favorable treatment of knowledge in data with such features when facing conflicts. This finding is generalizable across models and languages and is more evident in larger models. An in-depth analysis reveals that LLMs tend to trust data with features that signify consistency with the majority of data, and it is possible to instill new preferences and erase old ones by manipulating the degree of consistency with the majority data.
Computation and Language
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

This paper explores the learning preferences of large language models (LLMs) when they are trained on data containing conflicting knowledge. Through a systematic analysis, the researchers found that pretrained LLMs exhibit human-like learning preferences when the data contain conflicting information: they favor formal texts and texts with fewer spelling errors. As a result, the models learn faster from data with these features and assign higher probabilities to the knowledge in such data at test time.

### Main Contributions

1. **Proposing the Research Question**: The researchers pose the question of which data LLMs prefer to learn from when the data contain conflicting knowledge.
2. **Validating the Findings**: The researchers experimentally validated that existing LLMs have a significant preference for formal texts and texts with fewer spelling errors during learning, and that this finding holds across different models and languages.
3. **In-depth Explanation**: The researchers provided a preliminary explanation of how LLMs form these learning preferences, suggesting that LLMs can identify features that signal consistency between the current data and other data, and use these features to decide whether the current data is worth learning.

### Research Methods

1. **Data Construction**: The researchers constructed synthetic biographical data containing conflicting knowledge to study LLMs' learning preferences.
2. **Experimental Setup**: The researchers fine-tuned the LLaMA2-7B model with a standard language modeling objective and evaluated its preference between conflicting pieces of knowledge on a test set.
3. **Experimental Results**: By comparing learning outcomes under different text features, the researchers found that LLMs have a significant preference for formal texts and texts with correct spelling.
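
The data construction and evaluation steps above can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the two templates, the names `make_conflict_pair` and `preference_rate`, and the tie-handling rule are all assumptions made for this example, and the paper's exact metric may differ. The idea is to render one fact about a fictional person in two styles with conflicting values, then measure how often the fine-tuned model scores the style-A value above the style-B value.

```python
from dataclasses import dataclass

# Hypothetical templates: the same kind of fact rendered in two styles.
TEMPLATES = {
    "news": "According to recent reports, {name} was born in {value}.",
    "casual": "so yeah, {name} was born in {value} i think",
}


@dataclass
class ConflictPair:
    name: str
    text_a: str   # training document in style A, carrying value_a
    text_b: str   # training document in style B, carrying value_b
    value_a: str
    value_b: str


def make_conflict_pair(name, value_a, value_b, style_a="news", style_b="casual"):
    """Render two documents that disagree about one attribute of `name`."""
    if value_a == value_b:
        raise ValueError("values must conflict")
    return ConflictPair(
        name=name,
        text_a=TEMPLATES[style_a].format(name=name, value=value_a),
        text_b=TEMPLATES[style_b].format(name=name, value=value_b),
        value_a=value_a,
        value_b=value_b,
    )


def preference_rate(scores):
    """Fraction of test items where the style-A answer gets the higher score.

    `scores` is a list of (logprob_a, logprob_b) tuples, one per test item.
    Ties count as half a win (an assumption of this sketch).
    """
    if not scores:
        raise ValueError("no scores given")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a, b in scores)
    return wins / len(scores)
```

In use, each `(logprob_a, logprob_b)` pair would come from the fine-tuned model's log-probabilities for the two conflicting answers. A rate above 0.5 would indicate a preference for style A, and a rate near 0.5 would indicate no preference.
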
### Main Findings

1. **Learning Speed**: LLMs learn faster from texts with certain features, such as texts written in scientific report style or news style.
2. **Preference Existence**: When conflicting knowledge is present, LLMs are more likely to favor knowledge from formal texts, such as scientific reports and news articles.
3. **Impact of Model Size**: As model size increases, LLMs' preference for news style gradually strengthens.
4. **Cross-Model and Language Generalization**: LLMs exhibit consistent preferences across different languages and models, but the degree of preference varies by model.

### Hypothesis for Preference Formation

The researchers proposed a Consistency-Driven Feature Preference Hypothesis: LLMs can observe how consistent texts with a given feature are with the rest of the data, and they form an intrinsic preference based on this observation. When learning from data containing conflicting knowledge, LLMs decide which knowledge to learn based on these preferences.

### Conclusion

This paper systematically reveals the learning preferences of large language models when dealing with data containing conflicting knowledge and provides a preliminary explanation for them. The results help explain the knowledge learning mechanisms of LLMs and offer new perspectives for future research. The study also has limitations: it was conducted only on synthetic datasets, so the findings may not fully carry over to real-world datasets.