The Antecedents of Transformer Models
Simon Dennis, Kevin Shabahang, Hyungwook Yim
DOI: https://doi.org/10.1177/09637214241279504
IF: 7.867
2024-11-24
Current Directions in Psychological Science
Abstract: Transformer models of language represent a step change in our ability to account for cognitive phenomena. Although the specific architecture that has garnered recent interest is quite young, many of its components have antecedents in the cognitive science literature. In this article, we start by providing an introduction to large language models aimed at a general psychological audience. We then highlight some of the antecedents, including the importance of scale, instance-based memory models, paradigmatic association and systematicity, positional encodings of serial order, and the learning of control processes. This article offers an exploration of the relationship between transformer models and their precursors, showing how they can be understood as a next phase in our understanding of cognitive processes.
psychology, multidisciplinary