Graph Neural Networks for Learning Equivariant Representations of Neural Networks

Miltiadis Kofinas, Boris Knyazev, Yan Zhang, Yunlu Chen, Gertjan J. Burghouts, Efstratios Gavves, Cees G. M. Snoek, David W. Zhang
2024-03-21
Abstract: Neural networks that process the parameters of other neural networks find applications in domains as diverse as classifying implicit neural representations, generating neural network weights, and predicting generalization errors. However, existing approaches either overlook the inherent permutation symmetry in the neural network or rely on intricate weight-sharing patterns to achieve equivariance, while ignoring the impact of the network architecture itself. In this work, we propose to represent neural networks as computational graphs of parameters, which allows us to harness powerful graph neural networks and transformers that preserve permutation symmetry. Consequently, our approach enables a single model to encode neural computational graphs with diverse architectures. We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations, predicting generalization performance, and learning to optimize, while consistently outperforming state-of-the-art methods. The source code is open-sourced at
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses how to design neural networks that take the parameters of other neural networks as input, a setting that arises in classifying implicit neural representations, generating neural network weights, and predicting generalization errors. Existing methods either ignore the inherent permutation symmetry of neural network parameters or rely on intricate weight-sharing patterns to achieve equivariance, while ignoring the impact of the network architecture itself. The paper proposes representing neural networks as computational graphs of parameters, so that graph neural networks and transformers, which preserve permutation symmetry, can process them and a single model can encode neural graphs with different architectures. The authors report that the approach is effective across a range of tasks, including classification and editing of implicit neural representations, predicting generalization performance, and learning to optimize, and that it consistently outperforms state-of-the-art methods. The source code is open-sourced.
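The permutation symmetry and the graph view described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The helper names (`mlp_forward`, `mlp_to_graph`, `permute_hidden`), the ReLU activation, and the choice to store biases as node features and weights as edge features are all assumptions made for illustration. The sketch checks two things: reordering the hidden neurons of a small MLP leaves its output unchanged, and the two parameter sets map to the same graph up to a relabeling of nodes.

```python
import numpy as np

def mlp_forward(weights, biases, x):
    """Plain MLP with ReLU on hidden layers; weights[i] has shape (out, in)."""
    h = x
    for i, (W, b) in enumerate(zip(weights, biases)):
        h = W @ h + b
        if i < len(weights) - 1:
            h = np.maximum(h, 0.0)
    return h

def mlp_to_graph(weights, biases):
    """Hypothetical helper: one node per neuron, one edge per weight.

    Node features hold the bias of each neuron (0 for input nodes);
    edges are (source, target, weight) triples.
    """
    sizes = [weights[0].shape[1]] + [W.shape[0] for W in weights]
    offsets = np.concatenate([[0], np.cumsum(sizes)])
    node_feats = np.concatenate([np.zeros(sizes[0])] + list(biases))
    edges = []
    for l, W in enumerate(weights):
        for j in range(W.shape[0]):
            for i in range(W.shape[1]):
                src = int(offsets[l] + i)
                dst = int(offsets[l + 1] + j)
                edges.append((src, dst, float(W[j, i])))
    return node_feats, edges

def permute_hidden(weights, biases, layer, perm):
    """Reorder the neurons produced by weights[layer] (a hidden layer)."""
    weights = [W.copy() for W in weights]
    biases = [b.copy() for b in biases]
    weights[layer] = weights[layer][perm, :]          # rows of the incoming weights
    biases[layer] = biases[layer][perm]               # matching biases
    weights[layer + 1] = weights[layer + 1][:, perm]  # columns of the outgoing weights
    return weights, biases

rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 3)), rng.normal(size=(2, 4))]
biases = [rng.normal(size=4), rng.normal(size=2)]
x = rng.normal(size=3)

perm = np.array([2, 0, 3, 1])
p_weights, p_biases = permute_hidden(weights, biases, 0, perm)

# The permuted parameters compute the same function.
same_function = np.allclose(
    mlp_forward(weights, biases, x), mlp_forward(p_weights, p_biases, x)
)

# Both parameter sets give the same graph once hidden nodes are relabeled.
node_feats, edges = mlp_to_graph(weights, biases)
p_node_feats, p_edges = mlp_to_graph(p_weights, p_biases)
n_in = weights[0].shape[1]
relabel = {n: n for n in range(len(node_feats))}
for k, p in enumerate(perm):
    relabel[n_in + k] = n_in + int(p)
relabeled_edges = {(relabel[s], relabel[t], w) for s, t, w in p_edges}
same_graph = relabeled_edges == set(edges)

print(same_function, same_graph)
```

A flat weight vector changes under such a permutation even though the function does not. The graph changes only by a node relabeling, which is the kind of symmetry that message-passing graph networks and transformers over node sets respect.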