Predictions Based on Pixel Data: Insights from PDEs and Finite Differences

Elena Celledoni, James Jackaman, Davide Murari, Brynjulf Owren
2024-06-21
Abstract: As supported by abundant experimental evidence, neural networks are state-of-the-art for many approximation tasks in high-dimensional spaces. Still, there is a lack of a rigorous theoretical understanding of what they can approximate, at which cost, and at which accuracy. One network architecture of practical use, especially for approximation tasks involving images, is (residual) convolutional networks. However, due to the locality of the linear operators involved in these networks, their analysis is more complicated than that of fully connected neural networks. This paper deals with approximation of time sequences where each observation is a matrix. We show that with relatively small networks, we can represent exactly a class of numerical discretizations of PDEs based on the method of lines. We constructively derive these results by exploiting the connections between discrete convolution and finite difference operators. Our network architecture is inspired by those typically adopted in the approximation of time sequences. We support our theoretical results with numerical experiments simulating the linear advection, heat, and Fisher equations.
Numerical Analysis, Machine Learning
What problem does this paper attempt to address?
The paper studies how well a two-layer convolutional neural network (CNN) can approximate numerical discretizations of partial differential equations (PDEs) based on the method of lines. By exploiting the connection between discrete convolution and finite difference operators, the authors show that a relatively small network can represent a class of such discretizations exactly. The results are derived constructively and are supported by numerical experiments simulating the linear advection, heat, and Fisher equations.
The main objective is to understand to what extent a two-layer CNN can accurately approximate the space-time discretization of a PDE. The authors treat the time sequences as sequences of matrices generated by discretizing PDEs, and they focus on two-dimensional spatial domains. They show that, for linear PDEs, a two-layer CNN with the ReLU activation function and two channels can reproduce a semi-discretization of second-order accuracy. Similar results hold for nonlinear PDEs with quadratic interaction terms.
To improve the stability of the predictions, the paper proposes two strategies: injecting noise during training, and building certain properties of the PDE (such as gauge invariance) into the network, for example preserving the norm of the initial condition for the linear advection equation. The experiments show that these strategies make the network's predictions of future states more reliable.
In summary, the paper uses deep learning techniques to pursue accurate and stable numerical solutions of PDEs, providing new tools for understanding and simulating complex physical processes.
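The connection between discrete convolution and finite difference operators can be illustrated with a minimal NumPy sketch. It is not the paper's network or code: the periodic boundary conditions, the test function, the grid sizes, the step size dt, and the helper names (conv2d_periodic, relu, laplacian_demo) are all assumptions chosen for illustration. The sketch checks that a 3x3 convolution kernel reproduces the standard five-point Laplacian stencil, that the error against the exact Laplacian shrinks at second order, and that a linear map z can be recovered from two ReLU channels through the identity z = ReLU(z) - ReLU(-z). Whether the paper's construction uses exactly this identity is also an assumption.

```python
import numpy as np


def conv2d_periodic(u, kernel):
    """Apply a 3x3 stencil to u with periodic padding (CNN-style cross-correlation)."""
    out = np.zeros_like(u, dtype=float)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            # Rolling by (-di, -dj) places u[i + di, j + dj] at position (i, j).
            out += kernel[di + 1, dj + 1] * np.roll(u, (-di, -dj), axis=(0, 1))
    return out


def relu(z):
    return np.maximum(z, 0.0)


def laplacian_demo(n):
    """Compare convolution, finite differences, and a two-channel ReLU form on an n x n grid."""
    h = 1.0 / n
    x = np.arange(n) * h
    X, Y = np.meshgrid(x, x, indexing="ij")
    u = np.sin(2 * np.pi * X) * np.cos(2 * np.pi * Y)  # hypothetical test function
    exact = -8 * np.pi**2 * u                           # its exact Laplacian

    # Five-point Laplacian written as a convolution kernel (symmetric, so
    # convolution and cross-correlation coincide).
    kernel = np.array([[0.0, 1.0, 0.0],
                       [1.0, -4.0, 1.0],
                       [0.0, 1.0, 0.0]]) / h**2
    lap_conv = conv2d_periodic(u, kernel)

    # The same operator written as explicit central finite differences.
    lap_fd = (np.roll(u, 1, axis=0) + np.roll(u, -1, axis=0)
              + np.roll(u, 1, axis=1) + np.roll(u, -1, axis=1) - 4 * u) / h**2

    # Two ReLU channels (+z and -z) combined linearly give back z.
    lap_relu = relu(lap_conv) - relu(-lap_conv)

    err = float(np.max(np.abs(lap_conv - exact)))
    return u, lap_conv, lap_fd, lap_relu, err


u, lap_conv, lap_fd, lap_relu, err_64 = laplacian_demo(64)
_, _, _, _, err_32 = laplacian_demo(32)

# One explicit Euler step of the semi-discrete heat equation u_t = Laplacian(u),
# written in residual form: u_next = u + dt * F(u).
dt = 1e-5
u_next = u + dt * lap_relu

# Halving h should cut the error by roughly a factor of 4 for a second-order stencil.
print("error ratio:", err_32 / err_64)
```

The residual update u_next = u + dt * F(u) mirrors one explicit Euler step applied to the method-of-lines system, which is one way to read the link between residual convolutional networks and time stepping.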