Abstract: Cornish (2024) recently gave a general theory of neural network symmetrisation in the abstract context of Markov categories. We give a high-level overview of these results, and their concrete implications for the symmetrisation of deterministic functions and of Markov kernels.
What problem does this paper attempt to address?
This paper addresses how to symmetrise neural networks in concrete settings. Specifically, the author applies the abstract theory of neural network symmetrisation proposed by Cornish (2024) to two concrete cases: deterministic functions and Markov kernels. The main problems and goals of the paper are:
1. **Ensure the equivariance of neural networks**:
- A neural network \( f: X \to Y \) should be equivariant with respect to the action of a group \( G \) on \( X \) and \( Y \). That is, for all \( g \in G \) and \( x \in X \), it must satisfy \( f(g \cdot x) = g \cdot f(x) \).
- To achieve this, the paper studies symmetrisation techniques that convert a non-equivariant neural network \( f_0 \) into an equivariant neural network \( f := \text{sym}(f_0) \).
2. **Generalise the symmetrisation method to stochastic neural networks**:
- Beyond deterministic functions, the paper also covers stochastic neural networks, that is, Markov kernels \( k: X \to Y \). These Markov kernels can model conditional distributions or stochastic functions.
- The author shows how to define symmetrisation in Markov categories so that the resulting Markov kernels are equivariant.
3. **Unify the handling of complex situations**:
- The paper provides a unified framework for handling complex symmetrisation problems, including non-compact translation groups and semi-direct products.
- This framework makes it simpler to express symmetrisation procedures that are composed or applied recursively, so that different symmetrisation methods can be described consistently.
4. **Transition from the abstract to the concrete**:
- The theory of Cornish (2024) is formulated in the abstract algebraic framework of Markov categories, which is not yet widely known in the machine-learning community.
- This paper therefore presents the results in a more familiar, concrete setting so that more researchers can understand and apply them.
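The equivariance condition \( f(g \cdot x) = g \cdot f(x) \) and the map \( f_0 \mapsto \text{sym}(f_0) \) from item 1 can be illustrated with the most familiar symmetrisation procedure, averaging over a finite group. The sketch below is not the paper's general construction. It assumes the cyclic group of order \( n \) acting on \( \mathbb{R}^n \) by coordinate shifts, and the names `act`, `f0`, and `symmetrise` are hypothetical, chosen only for illustration.

```python
import numpy as np

def act(g, x):
    """Assumed group action: the cyclic group C_n shifts coordinates by g."""
    return np.roll(x, g)

def f0(x):
    """Hypothetical non-equivariant map R^n -> R^n, standing in for a network."""
    weights = np.arange(1, x.size + 1, dtype=float)
    return np.tanh(weights * x) + weights

def symmetrise(f, n):
    """Group averaging: x -> (1/|G|) * sum over g of g^{-1} . f(g . x)."""
    def f_sym(x):
        return sum(act(-g, f(act(g, x))) for g in range(n)) / n
    return f_sym

n = 5
rng = np.random.default_rng(0)
x = rng.normal(size=n)
g = 2

f = symmetrise(f0, n)

# f0 fails the equivariance condition, while its group average satisfies it.
equivariant_before = np.allclose(f0(act(g, x)), act(g, f0(x)))
equivariant_after = np.allclose(f(act(g, x)), act(g, f(x)))
print(equivariant_before, equivariant_after)
```

Averaging requires one evaluation of `f0` per group element, so its cost grows with the size of the group.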
### Specific contributions
- **Theorems and formulas**:
- Theorem 1 characterises all possible symmetrisation procedures through the bijection \( \text{Set}^H(RX, RY) \cong \text{Set}^G(G/H \otimes X, Y) \).
- Specific symmetrisation procedures are derived, and the paper shows how symmetrisation can be achieved through precomposition.
- **Stochastic symmetrisation**:
- Stochastic equivariance is defined and a symmetrisation method for Markov kernels is given.
- Specific steps for computing the symmetrised Markov kernel are provided, including a sampling procedure.
- **Practical applications**:
- Symmetrisation makes it possible to avoid some expensive operations, such as averaging, without losing the overall symmetry of the model, which improves computational efficiency.
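A sampling-based symmetrisation of a Markov kernel can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's procedure: the group is the cyclic group of order \( n \) acting on \( \mathbb{R}^n \) by shifts, the group element is drawn uniformly, and `k0_given_noise`, `k_sym_given`, and `sample_k_sym` are hypothetical names. A single draw evaluates the base kernel once instead of averaging over the whole group. Stochastic equivariance is checked here through a coupling: reindexing the group sample by \( g \mapsto g - h \), which preserves the uniform distribution, makes the two sides agree draw by draw.

```python
import numpy as np

def act(g, x):
    """Assumed group action: the cyclic group C_n shifts coordinates by g."""
    return np.roll(x, g)

def k0_given_noise(x, eps):
    """Hypothetical non-equivariant kernel, written as a function of x and noise eps."""
    weights = np.arange(1, x.size + 1, dtype=float)
    return np.tanh(weights * x) + weights * eps

def k_sym_given(x, g, eps):
    """One draw of the symmetrised kernel with its randomness (g, eps) made explicit:
    move x by g^{-1}, apply the base kernel, then move the output back by g."""
    return act(g, k0_given_noise(act(-g, x), eps))

def sample_k_sym(x, rng):
    """Sample g uniformly from C_n and noise from a standard normal, then draw once."""
    n = x.size
    g = int(rng.integers(n))
    eps = rng.normal(size=n)
    return k_sym_given(x, g, eps)

n = 5
rng = np.random.default_rng(0)
x = rng.normal(size=n)
eps = rng.normal(size=n)
h = 2

# Coupling check: a draw at h.x with group sample g equals h applied to a
# draw at x with group sample g - h, for every g and the same noise.
coupled_ok = all(
    np.allclose(
        k_sym_given(act(h, x), g, eps),
        act(h, k_sym_given(x, (g - h) % n, eps)),
    )
    for g in range(n)
)
y = sample_k_sym(x, rng)
print(coupled_ok, y.shape)
```

Because each draw calls the base kernel only once, the cost per sample does not depend on the size of the group, in contrast to explicit averaging.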
In summary, the paper addresses several key problems in neural network symmetrisation using the theoretical framework of Markov categories, and extends symmetrisation to a wider range of settings, including stochastic neural networks.