Abstract: Deep learning is widely used in tasks including image recognition and generation, learning dynamical systems from data, and many more. It is important to construct learning architectures with theoretical guarantees to permit safety in applications. There has been considerable progress in this direction lately. In particular, symplectic networks were shown to have the non-vanishing gradient property, which is essential for numerical stability. On the other hand, architectures based on higher-order numerical methods were shown to be efficient in many tasks where the learned function has an underlying dynamical structure. In this work we construct symplectic networks based on higher-order explicit methods with the non-vanishing gradient property and test their efficiency on various examples.

What problem does this paper attempt to address?

This paper discusses the use of symplectic methods in deep learning to address vanishing gradients and numerical stability. The authors propose a neural network architecture based on higher-order explicit symplectic partitioned Runge-Kutta (SPRK) methods, which preserves the non-vanishing gradient property and is tested for efficiency on various tasks. The main problems addressed in the paper can be summarized as follows:

1. **Vanishing gradient problem**: In deep learning, the gradient can become very small or vanish during backpropagation because the chain rule multiplies the Jacobians of successive layers, which hinders training of the network.
2. **Numerical stability**: Architectures built on symplectic methods, such as symplectic neural networks, have been shown to possess the non-vanishing gradient property, which is essential for numerical stability.
3. **Advantages of higher-order methods**: The paper suggests that networks based on higher-order numerical integration methods, such as SPRK, can provide better approximation performance when the learned function has an underlying dynamical structure.
4. **Theoretical guarantees and universality**: The proposed network not only possesses the non-vanishing gradient property but also has universal approximation capability, i.e., the ability to approximate any continuous function, which is an important feature of deep learning networks.
5. **Application examples**: The paper validates the network on tasks such as image classification and learning autonomous symplectic systems, and compares it with existing symplectic neural networks.
6. **Relationship to the continuous learning setting**: The authors point out that SPRK-based networks can be viewed as higher-order approximations of a continuous optimization problem, providing a foundation for further analysis and understanding of the transformations performed by neural networks.

In conclusion, the goal of the paper is to improve the design of deep learning networks by using higher-order symplectic methods to enhance their numerical stability and efficiency.
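
The link between symplecticity and non-vanishing gradients can be illustrated with a small numerical sketch. It is not the paper's architecture, which the summary above does not specify. It assumes the Störmer-Verlet scheme (a second-order explicit SPRK method) as the layer, a hypothetical separable Hamiltonian H(p, q) = |p|^2/2 + V(q) with V(q) = sum(log cosh(Wq + b)), and illustrative names (`verlet_layer`, `forward`, `Omega`). Each layer is a composition of shear maps whose Jacobian M satisfies M^T Ω M = Ω. Since ||Ω|| = 1, this gives 1 = ||M^T Ω M|| <= ||M||^2, so the spectral norm of the end-to-end Jacobian cannot fall below 1 however many layers are stacked.

```python
import numpy as np

def verlet_layer(p, q, W, b, h):
    """One Stormer-Verlet step for the assumed H(p, q) = 0.5*|p|^2 + V(q),
    with the hypothetical potential V(q) = sum(log(cosh(W q + b))).
    Returns the new state and the exact Jacobian of the step w.r.t. (p, q)."""
    d = q.size
    I, Z = np.eye(d), np.zeros((d, d))

    def grad(x):   # gradient of V: W^T tanh(W x + b)
        return W.T @ np.tanh(W @ x + b)

    def hess(x):   # Hessian of V: W^T diag(1 - tanh^2(W x + b)) W (symmetric)
        s = 1.0 - np.tanh(W @ x + b) ** 2
        return W.T @ (s[:, None] * W)

    # Half kick in p, full drift in q, half kick in p; each is a shear map.
    S1 = np.block([[I, -0.5 * h * hess(q)], [Z, I]])
    p_half = p - 0.5 * h * grad(q)
    S2 = np.block([[I, Z], [h * I, I]])
    q_new = q + h * p_half
    S3 = np.block([[I, -0.5 * h * hess(q_new)], [Z, I]])
    p_new = p_half - 0.5 * h * grad(q_new)
    return p_new, q_new, S3 @ S2 @ S1

def forward(z, params, h):
    """Stack the layers; accumulate the end-to-end Jacobian via the chain rule."""
    d = z.size // 2
    p, q = z[:d], z[d:]
    M = np.eye(2 * d)
    for W, b in params:
        p, q, M_layer = verlet_layer(p, q, W, b, h)
        M = M_layer @ M
    return np.concatenate([p, q]), M

# Illustrative sizes and random weights (assumptions, not values from the paper).
rng = np.random.default_rng(0)
d, depth, h = 3, 20, 0.1
params = [(rng.normal(scale=0.5, size=(d, d)), rng.normal(size=d))
          for _ in range(depth)]
z0 = rng.normal(size=2 * d)

z_out, M = forward(z0, params, h)

# Canonical symplectic matrix for the (p, q) ordering used above.
Omega = np.block([[np.zeros((d, d)), np.eye(d)],
                  [-np.eye(d), np.zeros((d, d))]])

# Should be at round-off level if the network Jacobian is symplectic.
symplectic_defect = np.linalg.norm(M.T @ Omega @ M - Omega)
# Largest singular value of the Jacobian; bounded below by 1 for symplectic M.
spectral_norm = np.linalg.norm(M, 2)

print("symplectic defect:", symplectic_defect)
print("spectral norm of Jacobian:", spectral_norm)
```

In a plain feed-forward network the same product of layer Jacobians has no such lower bound, which is where vanishing gradients come from. The sketch tracks the Jacobian analytically only to keep the check exact; in practice an automatic differentiation framework would compute it.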
