Abstract: We investigate what can be learned from translating numerical algorithms into neural networks. On the numerical side, we consider explicit, accelerated explicit, and implicit schemes for a general higher order nonlinear diffusion equation in 1D, as well as linear multigrid methods. On the neural network side, we identify corresponding concepts in terms of residual networks (ResNets), recurrent networks, and U-nets. These connections guarantee Euclidean stability of specific ResNets with a transposed convolution layer structure in each block. We present three numerical justifications for skip connections: as time discretisations in explicit schemes, as extrapolation mechanisms for accelerating those methods, and as recurrent connections in fixed point solvers for implicit schemes. Last but not least, we also motivate uncommon design choices such as nonmonotone activation functions. Our findings give a numerical perspective on the success of modern neural network architectures, and they provide design criteria for stable networks.

What problem does this paper attempt to address?

The paper explores how concepts from numerical algorithms can be translated into neural network architectures, especially convolutional neural networks (CNNs). Specifically, the authors focus on:

1. **Translation from numerical algorithms to neural networks**: Numerical algorithms (explicit, accelerated explicit, and implicit schemes, as well as linear multigrid methods) are matched with modern neural network architectures (ResNets, recurrent networks, and U-nets). This correspondence guarantees the stability of specific ResNets in the Euclidean norm.
2. **The role of skip connections**: Skip connections are justified from three numerical perspectives:
   - as time discretisations in explicit schemes;
   - as extrapolation mechanisms for accelerating those schemes;
   - as recurrent connections in fixed point solvers for implicit schemes.
3. **Nonmonotone activation functions**: The numerical schemes for generalised diffusion processes show that nonmonotone activation functions are admissible and may be advantageous.
4. **The relationship between multigrid methods and U-nets**: An analysis of multigrid methods reveals their connection to the U-net structure, which helps explain why U-nets are so efficient.

### Main contributions

- **Direct correspondence between numerical algorithms and ResNets**: The paper shows how explicit steps of higher order diffusion correspond to ResNet blocks, and it proves the stability and well-posedness of this form of ResNet in the Euclidean norm.
- **New insights into skip connections**: The paper explains skip connections in terms of different numerical methods, which further supports their importance in deep learning.
- **Justification of nonmonotone activation functions**: Based on the numerical interpretation of the diffusion process, the paper argues that nonmonotone activation functions can be beneficial in some cases.
- **Structural similarity between multigrid methods and U-nets**: The paper reveals the structural connection between multigrid methods and U-nets, which offers a new perspective on the efficiency of U-nets.

### Conclusion

By translating successful numerical concepts (such as multigrid methods) into neural network architectures, the authors aim to connect the stability and efficiency of numerical algorithms more closely with the performance of neural networks. The work provides systematic design criteria for stable neural networks and a starting point for future research.
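The correspondence between explicit diffusion steps and ResNet blocks can be illustrated with a minimal sketch. It assumes a block of the form u ← u − τ Kᵀ Φ(K u), where K is a convolution, Kᵀ its transposed convolution, Φ the activation function, and the leading u the skip connection. The specific choices here are illustrative assumptions that the summary above does not state: K is a forward difference with grid size 1, Φ is the Perona-Malik flux s / (1 + s²/λ²), which is nonmonotone, and all function names are hypothetical. For this particular setup the diffusivity lies in (0, 1] and ‖K‖² ≤ 4, so a step size τ ≤ 0.5 keeps the Euclidean norm from growing.

```python
import math


def forward_diff(u):
    """Convolution layer K: forward differences with grid size 1."""
    return [u[i + 1] - u[i] for i in range(len(u) - 1)]


def forward_diff_transposed(v):
    """Transposed convolution K^T of the forward difference above."""
    out = [0.0] * (len(v) + 1)
    for i, vi in enumerate(v):
        out[i] -= vi
        out[i + 1] += vi
    return out


def flux(s, lam):
    """Perona-Malik flux, used as a nonmonotone activation (assumed choice)."""
    return s / (1.0 + (s / lam) ** 2)


def resnet_block(u, tau, lam):
    """One explicit diffusion step: skip connection u plus a residual branch."""
    activated = [flux(s, lam) for s in forward_diff(u)]
    residual = forward_diff_transposed(activated)
    return [ui - tau * ri for ui, ri in zip(u, residual)]


def euclidean_norm(u):
    return math.sqrt(sum(x * x for x in u))


def run_network(u, num_blocks, tau, lam):
    """Chain several blocks and record the Euclidean norm after each one."""
    norms = [euclidean_norm(u)]
    for _ in range(num_blocks):
        u = resnet_block(u, tau, lam)
        norms.append(euclidean_norm(u))
    return norms, u


if __name__ == "__main__":
    signal = [0.0, 0.1, -0.1, 0.0, 0.05, 1.0, 0.9, 1.1, 1.0, 0.95]
    norms, smoothed = run_network(signal, num_blocks=50, tau=0.4, lam=0.5)
    print("norm before:", norms[0], "norm after:", norms[-1])
    print("smoothed signal:", [round(x, 3) for x in smoothed])
```

In this sketch the weights of the second layer are tied to those of the first through the transpose, which is what makes the norm bound possible. A trained network would learn K instead of fixing it to a difference stencil.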

Translating Numerical Concepts for PDEs into Neural Architectures

Definition and delineation of the clinical target volume for rectal cancer.

Beyond Finite Layer Neural Networks: Bridging Deep Architectures and Numerical Differential Equations

Trans-Net: A transferable pretrained neural networks based on temporal domain decomposition for solving partial differential equations

Regularity-Conforming Neural Networks (ReCoNNs) for solving Partial Differential Equations

Numerical Solution of the Parametric Diffusion Equation by Deep Neural Networks

Predictions Based on Pixel Data: Insights from PDEs and Finite Differences

STENCIL-NET: Data-driven solution-adaptive discretization of partial differential equations

Operator Learning Meets Numerical Analysis: Improving Neural Networks through Iterative Methods

Message Passing Neural PDE Solvers

A novel paradigm for solving PDEs: multi scale neural computing

Solving the Discretised Multiphase Flow Equations with Interface Capturing on Structured Grids Using Machine Learning Libraries

Energetic Variational Neural Network Discretizations of Gradient Flows

Deep neural network methods for solving forward and inverse problems of time fractional diffusion equations with conformable derivative

SineNet: Learning Temporal Dynamics in Time-Dependent Partial Differential Equations

PhyCRNet: Physics-informed convolutional-recurrent network for solving spatiotemporal PDEs

Adaptive quadratures for nonlinear approximation of low-dimensional PDEs using smooth neural networks

Numerical solutions of boundary problems in partial differential equations: A deep learning framework with Green's function

A PDE-based Explanation of Extreme Numerical Sensitivities and Edge of Stability in Training Neural Networks

Linking Machine Learning with Multiscale Numerics: Data-Driven Discovery of Homogenized Equations