Abstract: Large Language Models (LLMs) are capable of displaying a wide range of abilities that are not directly connected with the task for which they are trained: predicting the next words of human-written texts. In this article, I review recent research investigating the cognitive abilities developed by LLMs and their relation to human cognition. I discuss the nature of the indirect process that leads to the acquisition of these cognitive abilities, their relation to other indirect processes, and the implications for the acquisition of integrated abilities. Moreover, I propose the factors that enable the development of abilities that are related only very indirectly to the proximal objective of the training task. Finally, I discuss whether the full set of capabilities that LLMs could possibly develop is predictable.
What problem does this paper attempt to address?
This paper explores how large language models (LLMs) come to exhibit a range of complex cognitive abilities that they were never directly trained for. Specifically, the author reviews recent research on the cognitive abilities developed by LLMs and their relationship with human cognition, and discusses how these abilities are acquired through indirect processes. The author also examines why the development of these abilities is unexpected and whether their emergence can be predicted. Finally, the paper discusses the differences between LLMs and natural intelligence, especially in knowledge acquisition methods, the amount of training data, and computational properties.
### The main issues of concern in the paper include:
1. **Unexpected abilities of LLMs**:
- The paper points out that LLMs can not only predict the next word in a text, but also exhibit many complex cognitive abilities that are not directly related to this task, such as formal language ability, factual knowledge, dynamic semantic operations, theoretical thinking ability, logical reasoning, etc.
- These abilities are acquired indirectly: to predict the next word accurately, the model must understand the preceding text in depth, and that understanding requires certain cognitive skills.
2. **Acquisition of indirect abilities**:
- The author explains why the task of predicting the next word can promote the development of complex cognitive abilities, and likens this indirect acquisition to the indirect processes at work in natural evolution and individual learning.
- The paper proposes several factors that enable LLMs to acquire these indirect abilities, including the high informativeness of prediction errors, the predictability of human languages, and a large amount of available training data.
3. **Predictability and emergence of abilities**:
- The paper discusses whether the abilities of LLMs can be predicted. Although overall performance can be predicted from model size, dataset size, and training time, the emergence of specific abilities is difficult to predict.
- The author cites the research of Wei et al. (2022), pointing out that some abilities are absent from smaller models but appear abruptly in larger ones, which makes it difficult to anticipate what larger models will be able to do.
4. **Comparison between LLMs and natural intelligence**:
- The paper makes a detailed comparison between LLMs and humans in terms of knowledge acquisition methods, the amount of training data, and computational properties.
- Humans actively acquire knowledge and skills through interaction with the external environment and others, while LLMs passively acquire knowledge through text data.
- The author also discusses whether this passive learning method limits the ability of LLMs to obtain high-quality representations and how future research can further understand the development process of these abilities.
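
The training objective referred to throughout, predicting the next word and learning from the prediction error, can be made concrete with a toy example. The sketch below is not taken from the paper and is not an LLM: it assumes a simple bigram counter with add-one smoothing as a stand-in for the model, and all function names (`train_bigram`, `next_token_probs`, `average_prediction_error`) and the sample sentence are hypothetical. It only illustrates what "prediction error" means: the negative log-probability the model assigned to the word that actually came next.

```python
import math
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token follows each preceding token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_token_probs(counts, prev, vocab, alpha=1.0):
    """Add-alpha smoothed distribution over the next token given `prev`."""
    total = sum(counts[prev].values()) + alpha * len(vocab)
    return {tok: (counts[prev][tok] + alpha) / total for tok in vocab}


def average_prediction_error(counts, tokens, vocab):
    """Mean negative log-likelihood (in nats) of each actual next token."""
    losses = []
    for prev, nxt in zip(tokens, tokens[1:]):
        p = next_token_probs(counts, prev, vocab)[nxt]
        losses.append(-math.log(p))
    return sum(losses) / len(losses)


# Hypothetical toy corpus; a real LLM is trained on vastly more text.
corpus = "the cat sat on the mat and the cat ate".split()
vocab = sorted(set(corpus))
model = train_bigram(corpus)

# Baseline: the error of a model that guesses uniformly over the vocabulary.
uniform_loss = math.log(len(vocab))
trained_loss = average_prediction_error(model, corpus, vocab)

print(f"uniform guess: {uniform_loss:.3f} nats per token")
print(f"bigram model:  {trained_loss:.3f} nats per token")
```

Even this crude model achieves a lower prediction error than uniform guessing, because human text is partly predictable from context. The paper's argument concerns what far more capable models must learn in order to keep reducing that same error.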
In summary, this paper aims to reveal how LLMs acquire complex cognitive abilities without direct training and explore the development mechanisms of these abilities and their similarities and differences with human cognition.