Abstract: Learning disentangled representations of concepts and re-composing them in unseen ways is crucial for generalizing to out-of-domain situations. However, the underlying properties of concepts that enable such disentanglement and compositional generalization remain poorly understood. In this work, we propose the principle of interaction asymmetry which states: "Parts of the same concept have more complex interactions than parts of different concepts". We formalize this via block diagonality conditions on the $(n+1)$th order derivatives of the generator mapping concepts to observed data, where different orders of "complexity" correspond to different $n$. Using this formalism, we prove that interaction asymmetry enables both disentanglement and compositional generalization. Our results unify recent theoretical results for learning concepts of objects, which we show are recovered as special cases with $n\!=\!0$ or $1$. We provide results for up to $n\!=\!2$, thus extending these prior works to more flexible generator functions, and conjecture that the same proof strategies generalize to larger $n$. Practically, our theory suggests that, to disentangle concepts, an autoencoder should penalize its latent capacity and the interactions between concepts during decoding. We propose an implementation of these criteria using a flexible Transformer-based VAE, with a novel regularizer on the attention weights of the decoder. On synthetic image datasets consisting of objects, we provide evidence that this model can achieve comparable object disentanglement to existing models that use more explicit object-centric priors.

What problem does this paper attempt to address?

This paper asks how to learn disentangled, composable concept representations that generalize to unseen scenarios. Specifically, the authors study how generative models can achieve concept disentanglement and compositional generalization by introducing the principle of interaction asymmetry. The main problems and their background are as follows:

### Research Background

1. **Disentangled Representations**: A key challenge in machine learning is learning abstract internal representations of different concepts from observed data while keeping those representations disentangled, that is, the representation of each concept is independent of the other concepts.
2. **Compositional Generalization**: A second challenge is making these disentangled representations handle new combinations of concepts, such as unseen combinations of objects.

### Core Problems of the Paper

The paper points out that although many models can explain the same dataset equally well, only a few learn representations that are truly disentangled and capable of compositional generalization. To reach these goals, appropriate inductive biases must be built into the model, and these biases should reflect basic properties of the concepts underlying the observed data.

### The Principle of Interaction Asymmetry

To address these problems, the authors propose the principle of "interaction asymmetry":

- **Definition**: Parts of the same concept have more complex interactions than parts of different concepts.
- **Formalization**: The principle is formalized via block-diagonality conditions on the $(n+1)$th-order derivatives of the generator $f$ that maps concepts to observed data, where different values of $n$ correspond to different orders of interaction complexity.

### Theoretical Contributions

Using this formalism, the authors prove that interaction asymmetry enables both disentanglement and compositional generalization. They also show that the principle unifies earlier theoretical results on learning concepts of objects, which are recovered as special cases with $n = 0$ or $n = 1$, and they extend these results up to $n = 2$, covering a more flexible class of generator functions.

### Methods and Experiments

- **Methods**: The authors propose a Transformer-based variational autoencoder (VAE) with a new regularizer on the decoder's attention weights that penalizes interactions between different concept slots during decoding.
- **Experiments**: On synthetic image datasets consisting of objects, the model achieves object disentanglement comparable to that of existing models that use more explicit object-centric priors.

In summary, the paper aims to provide a general approach to learning disentangled and composable concept representations through the principle of interaction asymmetry, and thereby to support generalization to unseen scenarios.
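
The block-diagonality condition can be illustrated numerically for the $n = 1$ case, where the relevant $(n+1)$th-order derivatives are the second derivatives (the Hessian) of the generator. The JAX sketch below is only an illustration under stated assumptions: `toy_generator`, the two-slot layout, and `interaction_strengths` are hypothetical names that do not come from the paper, and the additive toy generator is chosen simply because its cross-slot second derivatives vanish by construction.

```python
import jax
import jax.numpy as jnp


def toy_generator(z):
    """Hypothetical generator with two concept slots of two latents each.

    Each slot mixes its own latents nonlinearly, and the two slot
    outputs are combined by plain addition, so latents from different
    slots never multiply each other.
    """
    slot_a = jnp.stack([z[0] * z[1], jnp.sin(z[0]) * z[1], z[0] ** 2])
    slot_b = jnp.stack([z[2] * z[3], jnp.cos(z[3]) * z[2], z[3] ** 2])
    return slot_a + slot_b


def interaction_strengths(f, z, split):
    """Return (within, cross): largest absolute second derivative inside
    the first slot's block and across the two slots, respectively."""
    hess = jax.hessian(f)(z)  # shape: (output_dim, latent_dim, latent_dim)
    within = jnp.max(jnp.abs(hess[:, :split, :split]))
    cross = jnp.max(jnp.abs(hess[:, :split, split:]))
    return within, cross


z = jnp.array([0.3, -1.2, 0.7, 0.5])
within, cross = interaction_strengths(toy_generator, z, split=2)
print("within-slot:", float(within), "cross-slot:", float(cross))
```

For this toy generator the cross-slot block of the Hessian is zero while the within-slot block is not, which is the asymmetry the principle describes at second order. A generator that multiplied latents from different slots would show a nonzero cross-slot value under the same check.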

Interaction Asymmetry: A General Principle for Learning Composable Abstractions

Disentangling Factors of Variation in Deep Representations Using Adversarial Training

A Study of Compositional Generalization in Neural Models

Provable Compositional Generalization for Object-Centric Learning

Vector-based Representation is the Key: A Study on Disentanglement and Compositional Generalization

Improving Compositional Generalization Using Iterated Learning and Simplicial Embeddings

Compositional Generalization in Unsupervised Compositional Representation Learning: A Study on Disentanglement and Emergent Language

Compositional Structures in Neural Embedding and Interaction Decompositions

Next state prediction gives rise to entangled, yet compositional representations of objects

Exploring the Effect of Primitives for Compositional Generalization in Vision-and-Language

Dynamics of Concept Learning and Compositional Generalization

Towards Understanding the Relationship between In-context Learning and Compositional Generalization

Compositional diversity in visual concept learning

Compositional generalization through abstract representations in human and artificial neural networks

Lost in Latent Space: Disentangled Models and the Challenge of Combinatorial Generalisation

Learning to Compose Representations of Different Encoder Layers towards Improving Compositional Generalization

Compositionality as Lexical Symmetry

From abstract items to latent spaces to observed data and back: Compositional Variational Auto-Encoder

Compositional Generalization by Learning Analytical Expressions

Concepts, Properties and an Approach for Compositional Generalization