Abstract: Recent years have witnessed the promise of coupling machine learning methods and physical domain-specific insights for solving scientific problems based on partial differential equations (PDEs). However, being data-intensive, these methods still require a large amount of PDE data. This reintroduces the need for expensive numerical PDE solutions, partially undermining the original goal of avoiding these expensive simulations. In this work, seeking data efficiency, we design unsupervised pretraining for PDE operator learning. To reduce the need for training data with heavy simulation costs, we mine unlabeled PDE data without simulated solutions, and we pretrain neural operators with physics-inspired reconstruction-based proxy tasks. To improve out-of-distribution performance, we further assist neural operators in flexibly leveraging a similarity-based method that learns in-context examples, without incurring extra training costs or designs. Extensive empirical evaluations on a diverse set of PDEs demonstrate that our method is highly data-efficient, more generalizable, and even outperforms conventional vision-pretrained models. We provide our code at https://github.com/delta-lab-ai/data_efficient_nopt.

What problem does this paper attempt to address?

The problem this paper addresses is data efficiency when solving partial differential equations (PDEs) with scientific machine learning (SciML). Although existing neural-network-based methods have shown potential for solving PDEs, they usually require large amounts of training data, and producing that data is computationally costly because high-fidelity numerical simulations are expensive. The paper therefore proposes a framework that reduces the need for labeled data through unsupervised pre-training and in-context learning (ICL), improving data efficiency and lowering the simulation cost of obtaining PDE solutions.

### Main Contributions

1. **Unsupervised Pre-training**: The paper pre-trains neural operators on unlabeled PDE data. It designs two physics-based reconstruction tasks (Masked Autoencoder and Super-resolution) so that the model learns useful feature representations without relying on expensive simulation data. Experimental results show that this pre-training significantly reduces the amount of simulation data required while improving model performance.
2. **In-context Learning**: To improve generalization in out-of-distribution (OOD) settings, the paper proposes a similarity-based in-context learning method. At inference time, the method takes a small number of context examples (demos), computes the similarity between the input and those examples, and aggregates the examples' solutions to refine the model's prediction. This significantly improves OOD generalization without additional training cost.

### Method Overview

1. **Unsupervised Pre-training**:
   - **Unlabeled PDE Data**: Unlabeled PDE data contains only inputs such as physical parameters, coordinates, and forcing functions; it does not contain the PDE solution.
   - **Surrogate Tasks**: Two surrogate tasks are used, Masked Autoencoder and Super-resolution. Through these tasks, the model learns invariance to sparse sensing and to different resolutions, and thereby extracts useful feature representations.
2. **In-context Learning**:
   - **Similarity Calculation**: Computes the distance between the input and the context examples to find similar samples.
   - **Aggregation**: For each query location, aggregates the solutions of its similar samples as the final prediction.

### Experimental Results

The paper reports extensive experiments on multiple PDE benchmarks and on real-world observational data. The results show that unsupervised pre-training not only significantly reduces the amount of simulation data required but also outperforms models trained from scratch. In addition, in-context learning significantly improves the model's generalization in out-of-distribution settings.

### Formula Examples

- **General Form of PDE**:
  \[
  \sum_{i,j = 1}^{n} a_{ij}(x)\frac{\partial^{2}u}{\partial x_{i}\partial x_{j}} + \sum_{i = 1}^{n} b_{i}(x)\frac{\partial u}{\partial x_{i}} + c(x)u = f(x)
  \]
  where \(x\in\mathbb{R}^{n}\) represents the physical space, \(a_{ij}, b_{i}, c\) are known physical parameters, \(u\) is the target solution, and \(f\) is the external forcing function.
- **Loss Function**:
  - **Masked Autoencoder**:
    \[
    \mathcal{L}_{\text{MAE}} = \frac{1}{|M|}\sum_{(i,j)\in M}\|\hat{y}_{ij} - y_{ij}\|^{2}
    \]
    where \(M\) is the set of masked regions, \(\hat{y}_{ij}\) is the predicted value of the model, and \(y_{ij}\) is the true value.
  - **Super-resol
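
The masked-autoencoder loss and the similarity-based aggregation described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names (`random_mask`, `masked_mse`, `icl_predict`), the use of Euclidean distance, the top-`k` selection, and the plain mean used for aggregation are all assumptions chosen for illustration, and fields are represented as nested lists of scalars rather than tensors.

```python
import math
import random


def random_mask(height, width, mask_ratio, seed=0):
    """Pick grid positions to hide; this plays the role of the set M in the loss.

    Hypothetical helper: the masking strategy here (uniform random cells)
    is an assumption, not taken from the paper.
    """
    rng = random.Random(seed)
    cells = [(i, j) for i in range(height) for j in range(width)]
    n_masked = max(1, int(round(mask_ratio * len(cells))))
    return set(rng.sample(cells, n_masked))


def masked_mse(pred, target, mask):
    """Squared error averaged over masked positions only: (1/|M|) * sum over M."""
    return sum((pred[i][j] - target[i][j]) ** 2 for (i, j) in mask) / len(mask)


def icl_predict(query_feat, demo_feats, demo_solutions, k=2):
    """Average the solutions of the k demos nearest to the query.

    Assumptions: similarity is Euclidean distance between feature vectors,
    and aggregation is an unweighted mean of the selected demo solutions.
    """
    dists = [math.dist(query_feat, feat) for feat in demo_feats]
    nearest = sorted(range(len(demo_feats)), key=lambda idx: dists[idx])[:k]
    return sum(demo_solutions[idx] for idx in nearest) / len(nearest)


# Toy example: a 2x2 field where two cells are masked.
target = [[0.0, 1.0], [2.0, 3.0]]
pred = [[0.0, 1.5], [2.0, 2.0]]
mask = {(0, 1), (1, 1)}
loss = masked_mse(pred, target, mask)  # (0.25 + 1.0) / 2 = 0.625

# Toy example: the two closest demos have solutions 1.0 and 3.0.
estimate = icl_predict([0.0, 0.0],
                       [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]],
                       [1.0, 3.0, 100.0],
                       k=2)  # mean of 1.0 and 3.0 = 2.0
```

Restricting the loss to masked positions means the model is scored only on values it could not see, which is what makes reconstruction a useful proxy task on unlabeled inputs. In the aggregation step, the distant third demo is ignored entirely, so its solution does not affect the estimate.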

Data-Efficient Operator Learning via Unsupervised Pretraining and In-Context Learning

Latent Neural Operator Pretraining for Solving Time-Dependent PDEs

LeMON: Learning to Learn Multi-Operator Networks

Diffeomorphic Latent Neural Operators for Data-Efficient Learning of Solutions to Partial Differential Equations

DeepONet as a Multi-Operator Extrapolation Model: Distributed Pretraining with Physics-Informed Fine-Tuning

DPOT: Auto-Regressive Denoising Operator Transformer for Large-Scale PDE Pre-Training

Synergistic Learning with Multi-Task DeepONet for Efficient PDE Problem Solving

NUNO: A General Framework for Learning Parametric PDEs with Non-Uniform Data

Latent Neural Operator for Solving Forward and Inverse PDE Problems

Learning Partial Differential Equations with Deep Parallel Neural Operator

Strategies for Pretraining Neural Operators

Pretraining a Neural Operator in Lower Dimensions

Data Generation-based Operator Learning for Solving Partial Differential Equations on Unbounded Domains

Neural Operator: Is data all you need to model the world? An insight into the impact of Physics Informed Machine Learning

Physics-enhanced Neural Operator for Simulating Turbulent Transport

Separable Operator Networks

One-shot learning for solution operators of partial differential equations

Learning in latent spaces improves the predictive accuracy of deep neural operators

Neural Operators Meet Energy-based Theory: Operator Learning for Hamiltonian and Dissipative PDEs

Diffeomorphism Neural Operator for various domains and parameters of partial differential equations