Abstract: Compositionality has long been considered a key explanatory property underlying human intelligence: arbitrary concepts can be composed into novel complex combinations, permitting the acquisition of an open-ended, potentially infinite expressive capacity from finite learning experiences. Influential arguments have held that neural networks fail to explain this aspect of behavior, leading many to dismiss them as viable models of human cognition. Over the last decade, however, modern deep neural networks (DNNs), which share the same fundamental design principles as their predecessors, have come to dominate artificial intelligence, exhibiting the most advanced cognitive behaviors ever demonstrated in machines. In particular, large language models (LLMs), DNNs trained to predict the next word on a large corpus of text, have proven capable of sophisticated behaviors such as writing syntactically complex sentences without grammatical errors, producing cogent chains of reasoning, and even writing original computer programs -- all behaviors thought to require compositional processing. In this chapter, we survey recent empirical work from machine learning for a broad audience in philosophy, cognitive science, and neuroscience, situating recent breakthroughs within the broader context of philosophical arguments about compositionality. In particular, our review emphasizes two approaches to endowing neural networks with compositional generalization capabilities: (1) architectural inductive biases, and (2) metalearning, or learning to learn. We also present findings suggesting that LLM pretraining can be understood as a kind of metalearning, and can thereby equip DNNs with compositional generalization abilities in a similar way. We conclude by discussing the implications that these findings may have for the study of compositionality in human cognition and by suggesting avenues for future research.

From Words to Worlds: Compositionality for Cognitive Architectures

Do Large Language Models Have Compositional Ability? An Investigation into Limitations and Scalability

Exploring the Compositional Deficiency of Large Language Models in Mathematical Reasoning

Compositionality in a Parallel Architecture for Language Processing

Dissociating language and thought in large language models: a cognitive perspective

A Survey on Compositional Learning of AI Models: Theoretical and Experimental Practices

Modelling Compositionality and Structure Dependence in Natural Language

From Frege to chatGPT: Compositionality in language, cognition, and deep neural networks

Geometric Signatures of Compositionality Across a Language Model's Lifetime

Language models and psychological sciences

Interpreting token compositionality in LLMs: A robustness analysis

What makes Models Compositional? A Theoretical View: With Supplement

Learning to Perform Complex Tasks through Compositional Fine-Tuning of Language Models

Evaluating Morphological Compositional Generalization in Large Language Models

Large Language Models Lack Understanding of Character Composition of Words

Humanlike Cognitive Patterns as Emergent Phenomena in Large Language Models

Compositionality as Lexical Symmetry

Large Language Models with Controllable Working Memory

Exploring the Compositional Deficiency of Large Language Models in Mathematical Reasoning Through Trap Problems

A Sentence is Worth a Thousand Pictures: Can Large Language Models Understand Hum4n L4ngu4ge and the W0rld behind W0rds?

Exposing Limitations of Language Model Agents in Sequential-Task Compositions on the Web