DeepONet as a Multi-Operator Extrapolation Model: Distributed Pretraining with Physics-Informed Fine-Tuning

Zecheng Zhang, Christian Moya, Lu Lu, Guang Lin, Hayden Schaeffer
2024-11-12
Abstract: We propose a novel fine-tuning method to achieve multi-operator learning through training a distributed neural operator with diverse function data and then zero-shot fine-tuning the neural network using physics-informed losses for downstream tasks. Operator learning effectively approximates solution operators for PDEs and various PDE-related problems, yet it often struggles to generalize to new tasks. To address this, we investigate fine-tuning a pretrained model, while carefully selecting an initialization that enables rapid adaptation to new tasks with minimal data. Our approach combines distributed learning to integrate data from various operators in pre-training, while physics-informed methods enable zero-shot fine-tuning, minimizing the reliance on downstream data. We investigate standard fine-tuning and Low-Rank Adaptation fine-tuning, applying both to train complex nonlinear target operators that are difficult to learn only using random initialization. Through comprehensive numerical examples, we demonstrate the advantages of our approach, showcasing significant improvements in accuracy. Our findings provide a robust framework for advancing multi-operator learning and highlight the potential of transfer learning techniques in this domain.
Machine Learning
What problem does this paper attempt to address?
The problems this paper addresses fall into four areas:

1. **Data Limitations**: Traditional neural operator learning methods usually need large data sets for training, and such data may be scarce in practice. These methods also struggle with heterogeneous input spaces, where the input functions or function domains differ between tasks.
2. **Generalization and Extrapolation Abilities**: A key challenge in neural operator learning is getting a pre-trained model to predict test samples whose properties differ from those of the training samples. This matters most when the model meets a new operator, such as the solution operator of an unseen partial differential equation.
3. **Initialization Problem**: When training physics-informed DeepONets (Deep Operator Networks), choosing a suitable initialization is difficult. The complexity of the physical system makes a good starting point hard to select, and a poor one affects the whole training process.
4. **Computational Efficiency**: Most multi-operator learning (MOL) frameworks are computationally expensive, need large data sets, and still generalize poorly to partial differential equations with entirely new physical properties. Improving computational efficiency and reducing the dependence on downstream data are therefore central goals.

To address these problems, the paper proposes a fine-tuning method that achieves multi-operator learning by combining distributed pre-training with physics-informed fine-tuning. Specifically, the method includes:

- **Distributed Pre-training**: Pre-train on diverse function data drawn from different operators to obtain a robust pre-trained model.
- **Physics-informed Fine-tuning**: Fine-tune zero-shot on downstream tasks with a physics-informed loss function, reducing the dependence on downstream data.
- **Low-Rank Adaptation (LoRA)**: Introduce low-rank parameter matrices and update only this small set of parameters, improving computational efficiency while maintaining model performance.

With these components, the paper aims to improve the model's generalization and extrapolation abilities, so that it adapts quickly to new tasks and gives accurate predictions.
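
The low-rank idea behind LoRA can be illustrated with a minimal NumPy sketch. It assumes a single dense layer standing in for one layer of a pre-trained network; it is not the paper's DeepONet architecture or training code. The class name `LoRADense`, the layer sizes, and the `rank` and `alpha` values are hypothetical choices made for illustration. The frozen weight `W` is left untouched, and only the two small factors `A` and `B` would be trained during fine-tuning.

```python
import numpy as np


class LoRADense:
    """Dense layer y = x @ (W + scale * A @ B) + b, with W and b frozen.

    Illustrative only: a stand-in for one layer of a pre-trained network.
    """

    def __init__(self, W, b, rank, alpha, rng):
        d_in, d_out = W.shape
        self.W = W  # frozen pre-trained weight, shape (d_in, d_out)
        self.b = b  # frozen pre-trained bias, shape (d_out,)
        # Trainable low-rank factors. B starts at zero so the adapted
        # layer initially reproduces the pre-trained layer exactly.
        self.A = rng.normal(scale=0.01, size=(d_in, rank))
        self.B = np.zeros((rank, d_out))
        self.scale = alpha / rank

    def __call__(self, x):
        # The low-rank update is applied as two thin matrix products,
        # so the full (d_in, d_out) update matrix is never formed.
        return x @ self.W + self.scale * (x @ self.A) @ self.B + self.b

    def num_trainable(self):
        return self.A.size + self.B.size


rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))  # hypothetical pre-trained weight
b = np.zeros(64)
x = rng.normal(size=(5, 64))   # batch of 5 hypothetical inputs

layer = LoRADense(W, b, rank=2, alpha=1.0, rng=rng)
print(layer.num_trainable(), "trainable vs", W.size, "frozen weights")
```

With these assumed sizes, the adapter trains `rank * (d_in + d_out)` values instead of the full `d_in * d_out` weight matrix, which is the source of the efficiency gain described above.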