Abstract: We argue that representations in AI models, particularly deep networks, are converging. First, we survey many examples of convergence in the literature: over time and across multiple domains, the ways by which different neural networks represent data are becoming more aligned. Next, we demonstrate convergence across data modalities: as vision models and language models get larger, they measure distance between datapoints in a more and more alike way. We hypothesize that this convergence is driving toward a shared statistical model of reality, akin to Plato's concept of an ideal reality. We term such a representation the platonic representation and discuss several possible selective pressures toward it. Finally, we discuss the implications of these trends, their limitations, and counterexamples to our analysis.

What problem does this paper attempt to address?

This paper explores and tests whether different artificial intelligence models converge in representation space. Specifically, the authors propose "The Platonic Representation Hypothesis": as model scale, data volume, and task diversity grow, different neural network models gradually converge to a shared statistical model of reality. The core claim is that although these models may differ in training objectives, datasets, and modalities, they are ultimately all trying to construct a common representation of the real world, that is, a representation of the joint distribution over the events that generate the data we observe.

### Main issues

1. **Convergence of representation**:
   - The paper first reviews several examples of representation convergence in the literature, pointing out that over time and across multiple domains, different neural networks are representing data in increasingly similar ways.
   - The authors show experimentally that models of different modalities (such as vision and language) measure distances between data points in increasingly similar ways as their scale increases.
2. **Driving factors of convergence**:
   - The authors explore the possible selective pressures behind this convergence, including task generality, growing data volume, and increasing task diversity.
   - The paper also discusses the implications, limitations, and counterexamples of this trend.

### Specific research contents

1. **Representation alignment between different models**:
   - The authors use a mutual nearest-neighbor metric to evaluate the degree of representation alignment between different models.
   - The experimental results show that models with different architectures and training objectives exhibit a high degree of alignment in representation space.
2. **Cross-modal representation alignment**:
   - The authors measure the representation alignment between vision models and language models using paired datasets (such as Wikipedia images and their text descriptions).
   - The results show that better-performing language models have a higher degree of alignment with vision models, and vice versa.
3. **Alignment between models and the brain**:
   - The research finds that neural network models also show significant alignment with representations in biological brains.
   - This alignment may be due to the similarity of the tasks and data constraints that both kinds of systems face.
4. **Relationship between alignment and downstream task performance**:
   - The authors show experimentally that a model's degree of representation alignment is positively correlated with its performance on downstream tasks (such as commonsense reasoning and math problem solving).

### Conclusion

The main contribution of the paper is to propose and present evidence for "The Platonic Representation Hypothesis", that is, that different artificial intelligence models gradually converge to a shared statistical model of reality in representation space. The hypothesis offers an explanation for the convergence currently observed in model representations and a theoretical perspective on how future models may develop.
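The mutual nearest-neighbor metric mentioned under "Representation alignment between different models" can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `knn_indices` and `mutual_knn_alignment`, the use of cosine similarity, and the default `k=10` are all assumptions made for illustration. The idea is that two models embed the same n inputs, each input's k nearest neighbors are found separately in each embedding space, and the score is the average fraction of neighbors the two spaces share.

```python
import numpy as np


def knn_indices(feats: np.ndarray, k: int) -> np.ndarray:
    """Return the indices of each row's k nearest neighbors (cosine similarity)."""
    # L2-normalize rows so the inner product equals cosine similarity.
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    feats = feats / np.maximum(norms, 1e-12)
    sim = feats @ feats.T
    # Exclude each point from its own neighbor list.
    np.fill_diagonal(sim, -np.inf)
    return np.argsort(-sim, axis=1)[:, :k]


def mutual_knn_alignment(feats_a: np.ndarray, feats_b: np.ndarray, k: int = 10) -> float:
    """Average overlap of k-NN sets computed in two embedding spaces.

    feats_a and feats_b hold embeddings of the SAME n inputs (row i of each
    corresponds to input i); their feature dimensions may differ.
    Returns a value in [0, 1], where 1 means identical neighbor sets.
    """
    if feats_a.shape[0] != feats_b.shape[0]:
        raise ValueError("both feature matrices must describe the same inputs")
    nn_a = knn_indices(feats_a, k)
    nn_b = knn_indices(feats_b, k)
    overlap = [len(set(a) & set(b)) for a, b in zip(nn_a, nn_b)]
    return float(np.mean(overlap)) / k
```

For cross-modal comparison, `feats_a` would hold image embeddings and `feats_b` the embeddings of the paired captions, so that row i in both matrices refers to the same underlying item. Because only neighbor identities are compared, the two embedding spaces do not need to share a dimensionality or coordinate system.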

The Platonic Representation Hypothesis

Training objective drives the consistency of representational similarity across datasets

Representations and generalization in artificial and brain neural networks

Formation of Representations in Neural Networks

Prototypical Concept Representation

When Representations Align: Universality in Representation Learning Dynamics

A Timeline and Analysis for Representation Plasticity in Large Language Models

Convergent Learning: Do different neural networks learn the same representations?

Representations as Language: An Information-Theoretic Framework for Interpretability

Understanding Dynamics of Nonlinear Representation Learning and Its Application

Unified Representations for Learning and Reasoning

Bridging the Gap: Representation Spaces in Neuro-Symbolic AI

Getting aligned on representational alignment

Artificial Intelligence's Fusing Representation Model Based on Dynamics of Neural System

Relational Constraints On Neural Networks Reproduce Human Biases towards Abstract Geometric Regularity

Emergence of machine language: towards symbolic intelligence with neural networks

Converging Paradigms: The Synergy of Symbolic and Connectionist AI in LLM-Empowered Autonomous Agents

Representation Engineering: A Top-Down Approach to AI Transparency

Neural and phenotypic representation under the free-energy principle

The Unreasonable Effectiveness of Deep Learning in Artificial Intelligence

A Philosophical Understanding of Representation for Neuroscience