Self-supervised learning based on Transformer for flow reconstruction and prediction

Bonan Xu, Yuanye Zhou, Xin Bian
2023-11-26
Abstract: Machine learning has great potential for efficient reconstruction and prediction of flow fields. However, existing datasets may have highly diversified labels for different flow scenarios, which makes them unsuitable for training a single model. To this end, we make a first attempt to apply the self-supervised learning (SSL) technique to fluid dynamics, which disregards data labels when pre-training the model. The SSL technique embraces a large amount of data ($8000$ snapshots) at Reynolds numbers of $Re=200$, $300$, $400$, $500$ without discriminating between them, which improves the generalization of the model. The Transformer model is pre-trained via a specially designed pretext task, where it reconstructs the complete flow fields after $20\%$ of the data points in each snapshot are randomly masked. For the downstream task of flow reconstruction, the pre-trained model is fine-tuned separately with $256$ snapshots for each Reynolds number. The fine-tuned models accurately reconstruct the complete flow fields based on less than $5\%$ random data points within a limited window, even for $Re=250$ and $600$, whose data were not seen in the pre-training phase. For the other downstream task of flow prediction, the pre-trained model is fine-tuned separately with $128$ consecutive snapshot pairs for each corresponding Reynolds number. The fine-tuned models then correctly predict the evolution of the flow fields over many periods. We compare all results generated by models trained via SSL with those from models trained via supervised learning, and the former has unequivocally superior performance. We expect that the methodology presented here will have wider applications in fluid mechanics.
Fluid Dynamics
What problem does this paper attempt to address?
### Main Problems Addressed by the Paper

This paper primarily explores how to use self-supervised learning (SSL) techniques to address the issues of flow field reconstruction and prediction in fluid dynamics. Specifically, the paper addresses the following key problems:

1. **Diversity of existing dataset labels**: The currently available datasets contain highly diverse labels for different flow scenarios, which are not suitable for training a single model.
2. **Insufficient labeled data**: Although a large amount of high-fidelity simulation data can be obtained through high-performance scientific computing, these datasets are not always directly usable for supervised learning tasks because they contain too many label categories (such as Reynolds number, Mach number, etc.), making it difficult to merge data from different sources.
3. **Improving model generalization**: The proposed method aims to improve the model's generalization ability by pre-training with a large amount of unlabeled data and fine-tuning with a small amount of labeled data on specific downstream tasks to achieve high performance.

### Overview of Research Methods

- **Self-supervised learning strategy**: The paper adopts a self-supervised learning strategy based on the Transformer architecture. The model is pre-trained through a specially designed pre-training task, which involves reconstructing the complete flow field after randomly masking 20% of the data points in each snapshot.
- **Data augmentation method**: A new data augmentation method is introduced, allowing for the random selection of data point indices and the random removal of certain data points to increase the variability of the input sequence length.
- **Downstream tasks**: The pre-trained model is fine-tuned for two downstream tasks: flow field reconstruction and flow field prediction. In the flow field reconstruction task, the model reconstructs the complete flow field based on sparse data points within the observation window. In the flow field prediction task, the model predicts the subsequent changes in the flow field based on any initial snapshot.

### Main Contributions

- For the first time, self-supervised learning techniques are applied to the tasks of flow field reconstruction and prediction in fluid dynamics.
- A Transformer-based self-supervised learning framework is designed, capable of handling data from unstructured grids and utilizing a large amount of unlabeled data during the pre-training phase to improve the model's generalization ability.
- Experiments validate the superior performance of the proposed self-supervised learning method in flow field reconstruction and prediction tasks, especially when dealing with sparse and randomly distributed data.

### Conclusion

This study demonstrates an effective method for solving the problems of flow field reconstruction and prediction in fluid dynamics using self-supervised learning techniques. The pre-training and fine-tuning process improves the model's generalization ability and adaptability to sparse data. This method is expected to be widely applied in the field of fluid dynamics.
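
### Illustrative Sketch of the Masking Pretext Task

The pre-training step summarized here (randomly masking 20% of the data points in each snapshot, then reconstructing the complete flow field) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `mask_snapshot` and `reconstruction_loss`, the column layout `(x, y, u, v)`, the choice to zero out the masked flow variables while keeping coordinates, and the mean-squared-error loss are all hypothetical choices made for the example. The Transformer itself is omitted.

```python
import numpy as np


def mask_snapshot(points, mask_ratio=0.2, rng=None):
    """Randomly hide a fraction of the data points in one snapshot.

    points: array of shape (n_points, 4), assumed columns (x, y, u, v).
    Returns the masked copy and a boolean array marking the hidden points.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_points = points.shape[0]
    n_masked = int(round(mask_ratio * n_points))

    # Pick distinct point indices to hide.
    hidden = np.zeros(n_points, dtype=bool)
    hidden[rng.choice(n_points, size=n_masked, replace=False)] = True

    # Assumption: masked points keep their coordinates but lose u and v.
    masked = points.copy()
    masked[hidden, 2:] = 0.0
    return masked, hidden


def reconstruction_loss(predicted, target):
    """Mean squared error over the complete field (assumed loss)."""
    return float(np.mean((predicted - target) ** 2))


# Synthetic stand-in for one snapshot; real data would come from simulations.
rng = np.random.default_rng(0)
snapshot = rng.normal(size=(1000, 4))
masked, hidden = mask_snapshot(snapshot, mask_ratio=0.2, rng=rng)

# A model would map `masked` to a full-field prediction; the pretext
# objective then compares that prediction with the unmasked `snapshot`.
baseline_loss = reconstruction_loss(masked, snapshot)
```

Because the target is the snapshot itself, no Reynolds number or other label enters the objective, which is what allows snapshots from different flow conditions to be pooled during pre-training.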