Translating Numerical Concepts for PDEs into Neural Architectures

Tobias Alt, Pascal Peter, Joachim Weickert, Karl Schrader
DOI: https://doi.org/10.48550/arXiv.2103.15419
2021-05-17
Abstract: We investigate what can be learned from translating numerical algorithms into neural networks. On the numerical side, we consider explicit, accelerated explicit, and implicit schemes for a general higher order nonlinear diffusion equation in 1D, as well as linear multigrid methods. On the neural network side, we identify corresponding concepts in terms of residual networks (ResNets), recurrent networks, and U-nets. These connections guarantee Euclidean stability of specific ResNets with a transposed convolution layer structure in each block. We present three numerical justifications for skip connections: as time discretisations in explicit schemes, as extrapolation mechanisms for accelerating those methods, and as recurrent connections in fixed point solvers for implicit schemes. Last but not least, we also motivate uncommon design choices such as nonmonotone activation functions. Our findings give a numerical perspective on the success of modern neural network architectures, and they provide design criteria for stable networks.
Numerical Analysis, Machine Learning
What problem does this paper attempt to address?
This paper explores what can be learned by translating concepts from numerical algorithms into neural network architectures, in particular convolutional neural networks (CNNs). Specifically, the authors focus on:

1. **Translation from numerical algorithms to neural networks**: Numerical algorithms (explicit, accelerated explicit, and implicit schemes, as well as linear multigrid methods) are matched with modern neural network architectures (ResNets, recurrent networks, and U-nets). These connections guarantee the stability of specific ResNets in the Euclidean norm.
2. **The role of skip connections**: Skip connections are justified from three numerical perspectives:
   - As time discretizations in explicit schemes.
   - As extrapolation mechanisms that accelerate these schemes.
   - As recurrent connections in fixed-point solvers for implicit schemes.
3. **Non-monotonic activation functions**: Numerical schemes for generalized diffusion processes show that non-monotonic activation functions are admissible and may be advantageous.
4. **The relationship between multigrid methods and U-nets**: An analysis of multigrid methods reveals their connection to the U-net structure, which helps explain the efficiency of U-nets.

### Main contributions

- **Direct correspondence between numerical algorithms and ResNets**: Explicit steps of higher-order diffusion are shown to correspond to ResNet blocks, and the stability and well-posedness of the resulting ResNets in the Euclidean norm are proved.
- **New insights into skip connections**: Skip connections receive new interpretations in terms of different numerical methods, which further supports their importance in deep learning.
- **Justification of non-monotonic activation functions**: The numerical interpretation of the diffusion process suggests that non-monotonic activation functions can be beneficial in some cases.
- **Structural similarity between multigrid methods and U-nets**: The structural connection between multigrid methods and U-nets offers a new perspective on the efficiency of U-nets.

### Conclusion

By translating successful numerical concepts (such as multigrid methods) into neural network architectures, the authors aim to connect the stability and efficiency of numerical algorithms more closely with the performance of neural networks. The work provides design criteria for stable neural networks and a blueprint for future research.
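
### Illustrative sketch

The correspondence between an explicit diffusion step and a ResNet block can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a second-order (rather than general higher-order) 1D diffusion with a forward-difference operator `K` on a grid of size 1, reflecting boundaries, and a Perona-Malik flux as the activation. The function names and the parameters `tau` and `lam` are hypothetical. With a diffusivity bounded by 1 and `||K||^2 <= 4`, a time step `tau <= 0.5` keeps the Euclidean norm from increasing.

```python
import numpy as np

def flux(s, lam=1.0):
    """Perona-Malik flux Phi(s) = g(s^2) * s with g(s^2) = 1 / (1 + s^2 / lam^2).

    Plays the role of the activation function. It is non-monotone:
    it rises until |s| = lam and decays afterwards. (Assumed choice.)
    """
    return s / (1.0 + (s / lam) ** 2)

def forward_diff(u):
    """Inner convolution K: forward difference with grid size 1."""
    return u[1:] - u[:-1]

def forward_diff_transposed(v):
    """Outer convolution K^T: the exact transpose of forward_diff."""
    out = np.zeros(v.size + 1)
    out[:-1] -= v
    out[1:] += v
    return out

def diffusion_block(u, tau=0.4, lam=1.0):
    """One explicit step u - tau * K^T Phi(K u).

    The leading `u` is the skip connection; the remaining term is a
    convolution, an activation, and the transposed convolution.
    """
    return u - tau * forward_diff_transposed(flux(forward_diff(u), lam))

def diffusion_resnet(u, n_blocks=50, tau=0.4, lam=1.0):
    """Chain of identical blocks, i.e. n_blocks explicit time steps."""
    for _ in range(n_blocks):
        u = diffusion_block(u, tau, lam)
    return u

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    signal = rng.normal(size=64)
    smoothed = diffusion_resnet(signal)
    print(np.linalg.norm(signal), np.linalg.norm(smoothed))
```

Because the outer operator is the exact transpose of the inner one, each block applies a symmetric positive semidefinite operator scaled by `tau`, which is what makes the norm bound easy to verify in this simplified setting. The same structure also preserves the mean of the signal.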