Connecting Permutation Equivariant Neural Networks and Partition Diagrams

Edward Pearce-Crump
2024-08-08
Abstract: Permutation equivariant neural networks are often constructed using tensor powers of $\mathbb{R}^{n}$ as their layer spaces. We show that all of the weight matrices that appear in these neural networks can be obtained from Schur-Weyl duality between the symmetric group and the partition algebra. In particular, we adapt Schur-Weyl duality to derive a simple, diagrammatic method for calculating the weight matrices themselves.
Machine Learning, Combinatorics, Representation Theory
What problem does this paper attempt to address?
The paper asks how to characterize every possible weight matrix in a permutation-equivariant neural network using Schur-Weyl duality, a classical result in algebraic combinatorics and representation theory. Specifically, the author aims to give a simple, diagrammatic method for calculating the weight matrices of linear layer functions between any two tensor power spaces of $\mathbb{R}^n$.

### Problem Background

Permutation-equivariant neural networks encode permutation symmetry and are used in a range of machine learning tasks, such as learning from sets or graphs, object-motion prediction in computer vision, combinatorial modeling in natural language processing, and auction design in economics. These networks usually use tensor powers of $\mathbb{R}^n$ as their layer spaces, and their weight matrices must be equivariant to the action of the symmetric group $S_n$.

### Existing Work

Existing research mainly focuses on two aspects:

1. **Specific Applications**: Designing permutation-equivariant neural networks for specific tasks, such as learning from point-cloud data, image annotation, and set anomaly detection.
2. **Higher-Order Structures**: Constructing more general permutation-equivariant functions to handle data living on higher-order structures (such as graphs).

However, most existing methods obtain a basis for the weight matrices using heavier machinery, such as Kronecker products and fixed-point equations, which makes the calculation complex and hard to follow.

### Paper Contributions

This paper uses Schur-Weyl duality to characterize, in full, the weight matrices of permutation-equivariant neural networks. The specific steps are as follows:

1. **Introducing Partition Vector Spaces**: Define a vector space $P_k^l(n)$ spanned by diagrams, each of which represents a partition of the set $[l + k]$ into disjoint subsets (blocks).
2. **Establishing a Bijective Correspondence**: Prove that there is a bijective correspondence between the basis elements of $\text{Hom}_{S_n}((\mathbb{R}^n)^{\otimes k}, (\mathbb{R}^n)^{\otimes l})$ and the orbit basis diagrams in $P_k^l(n)$ with at most $n$ blocks.
3. **Constructing Weight Matrices**: Using this correspondence, read off a basis matrix from each orbit basis diagram. Any weight matrix is then a linear combination of these basis matrices.

### Formula Summary

- **Permutation Action**: For any $\sigma\in S_n$ and $a\in [n]$, the permutation action on the standard basis of $\mathbb{R}^n$ is defined as:
\[ \sigma\cdot e_a = e_{\sigma(a)} \]
- **Basis of Tensor Power Spaces**: For $I := (i_1, i_2,\ldots, i_k)\in [n]^k$, the basis element of the tensor power space $(\mathbb{R}^n)^{\otimes k}$ is defined as:
\[ e_I := e_{i_1}\otimes e_{i_2}\otimes\cdots\otimes e_{i_k} \]
- **Permutation-Equivariant Mapping**: Let $\rho_k$ denote the representation of $S_n$ on $(\mathbb{R}^n)^{\otimes k}$ obtained by applying the permutation action to each tensor factor. A mapping $\phi: (\mathbb{R}^n)^{\otimes k}\to (\mathbb{R}^n)^{\otimes l}$ is permutation-equivariant if for all $\sigma\in S_n$ and $v\in (\mathbb{R}^n)^{\otimes k}$, it satisfies:
\[ \phi(\rho_k(\sigma)[v])=\rho_l(\sigma)[\phi(v)] \]

### Conclusion

By combining Schur-Weyl duality with partition vector spaces, this paper provides a concise and intuitive method for calculating the weight matrices of permutation-equivariant neural networks. The method simplifies the calculation of these matrices.
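
The correspondence between set partitions and equivariant basis matrices can be illustrated with a small brute-force sketch. It is not the paper's implementation. It assumes one common reading of the orbit basis: for a set partition of the $l + k$ index positions, the matrix entry at row $I \in [n]^l$ and column $J \in [n]^k$ is 1 exactly when the combined tuple $(I, J)$ has equal entries at two positions if and only if those positions lie in the same block. The function names, the 0-based indexing, and the choice to list the $l$ output positions before the $k$ input positions are all assumptions made for illustration.

```python
import itertools


def set_partitions(items):
    """Yield every set partition of `items` as a list of blocks (lists)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Either `first` forms a new block on its own ...
        yield [[first]] + partition
        # ... or it joins one of the existing blocks.
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]


def equality_pattern(indices):
    """Set partition of positions induced by equal values in `indices`."""
    blocks = {}
    for position, value in enumerate(indices):
        blocks.setdefault(value, []).append(position)
    return frozenset(frozenset(block) for block in blocks.values())


def orbit_basis_matrix(partition, n, k, l):
    """n^l x n^k 0/1 matrix (nested lists) for one partition of l + k positions.

    Assumed convention: positions 0..l-1 index the output tuple I,
    positions l..l+k-1 index the input tuple J.
    """
    target = frozenset(frozenset(block) for block in partition)
    rows = list(itertools.product(range(n), repeat=l))
    cols = list(itertools.product(range(n), repeat=k))
    return [[1 if equality_pattern(I + J) == target else 0 for J in cols]
            for I in rows]


def equivariant_basis(n, k, l):
    """Matrices for all partitions of l + k positions with at most n blocks."""
    return [orbit_basis_matrix(p, n, k, l)
            for p in set_partitions(list(range(l + k)))
            if len(p) <= n]


def is_equivariant(matrix, n, k, l):
    """Check matrix[sigma(I)][sigma(J)] == matrix[I][J] for all sigma in S_n.

    This entrywise condition is the equivariance condition written in the
    standard tensor basis.
    """
    rows = list(itertools.product(range(n), repeat=l))
    cols = list(itertools.product(range(n), repeat=k))
    row_index = {I: r for r, I in enumerate(rows)}
    col_index = {J: c for c, J in enumerate(cols)}
    for sigma in itertools.permutations(range(n)):
        for I in rows:
            sigma_I = tuple(sigma[i] for i in I)
            for J in cols:
                sigma_J = tuple(sigma[j] for j in J)
                if (matrix[row_index[sigma_I]][col_index[sigma_J]]
                        != matrix[row_index[I]][col_index[J]]):
                    return False
    return True
```

For example, `equivariant_basis(n=2, k=2, l=2)` returns one matrix for each of the 8 set partitions of four positions that have at most two blocks, and `is_equivariant` holds for each of them. Partitions with more than $n$ blocks are skipped because no tuple over $[n]$ can realise their equality pattern, so their matrices would be zero. The enumeration is exponential in $l + k$ and is meant only to make the correspondence concrete for small cases.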