Model scale versus domain knowledge in statistical forecasting of chaotic systems

William Gilpin
2023-11-23
Abstract: Chaos and unpredictability are traditionally synonymous, yet large-scale machine learning methods recently have demonstrated a surprising ability to forecast chaotic systems well beyond typical predictability horizons. However, recent works disagree on whether specialized methods grounded in dynamical systems theory, such as reservoir computers or neural ordinary differential equations, outperform general-purpose large-scale learning methods such as transformers or recurrent neural networks. These prior studies perform comparisons on few individually-chosen chaotic systems, thereby precluding robust quantification of how statistical modeling choices and dynamical invariants of different chaotic systems jointly determine empirical predictability. Here, we perform the largest to-date comparative study of forecasting methods on the classical problem of forecasting chaos: we benchmark 24 state-of-the-art forecasting methods on a crowdsourced database of 135 low-dimensional systems with 17 forecast metrics. We find that large-scale, domain-agnostic forecasting methods consistently produce predictions that remain accurate up to two dozen Lyapunov times, thereby accessing a new long-horizon forecasting regime well beyond classical methods. We find that, in this regime, accuracy decorrelates with classical invariant measures of predictability like the Lyapunov exponent. However, in data-limited settings outside the long-horizon regime, we find that physics-based hybrid methods retain a comparative advantage due to their strong inductive biases.
Machine Learning, Computational Physics
What problem does this paper attempt to address?
This paper compares large-scale general-purpose models with physics-based models for forecasting chaotic systems. Specifically, by benchmarking 24 state-of-the-art forecasting methods on 135 low-dimensional chaotic systems, the author aims to quantify how statistical modeling choices and the dynamical invariants of different chaotic systems jointly determine empirical predictability. The core question is whether large-scale, domain-agnostic forecasting models can outperform physics-based forecasting models, both at long forecast horizons and when data are limited.

### Main research contents of the paper include:

1. **Introducing a large-scale chaotic attractor benchmark dataset**: The author constructs a dataset containing 135 known chaotic attractors described by low-dimensional differential equations. These systems cover examples from multiple fields such as climatology and neuroscience.
2. **Evaluating 24 prediction models**: These models include traditional methods such as linear regression, ARIMA, exponential smoothing, and Fourier mode extrapolation, as well as deep-learning-based models such as Transformer, LSTM, RNN, Temporal Convolutional Neural Network (TCN), and NBEATS/NHiTS.
3. **Designing prediction benchmark experiments**: For each system, two time series are generated from different initial conditions, one for training and one for testing, and model performance is evaluated over different forecast horizons.
4. **Analyzing the results**: The author finds that when there is sufficient training data, large-scale, domain-agnostic prediction models outperform physics-based models in both short-to-medium-term and long-term predictions. However, when computing resources or data are limited, models with strong inductive biases (such as Echo State Networks) perform better.
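The benchmark design in item 3 can be illustrated with a minimal sketch. It assumes the Lorenz system as a familiar stand-in, a fixed-step fourth-order Runge-Kutta integrator, and arbitrary choices of step size, series length, and initial conditions; the summary does not state which of these the paper actually uses, and the helper names `lorenz`, `rk4_step`, and `trajectory` are hypothetical.

```python
from typing import Callable, List, Tuple

State = Tuple[float, ...]


def lorenz(state: State, sigma: float = 10.0, rho: float = 28.0,
           beta: float = 8.0 / 3.0) -> State:
    """Right-hand side of the Lorenz equations (assumed example system)."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(f: Callable[[State], State], state: State, dt: float) -> State:
    """Advance the state by one fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def trajectory(f: Callable[[State], State], initial: State, dt: float = 0.01,
               n_steps: int = 2000, n_transient: int = 1000) -> List[State]:
    """Integrate from `initial`, discard a transient, and return the series."""
    state = initial
    for _ in range(n_transient):  # let the orbit settle onto the attractor
        state = rk4_step(f, state, dt)
    series = []
    for _ in range(n_steps):
        state = rk4_step(f, state, dt)
        series.append(state)
    return series


# Two series from different (arbitrarily chosen) initial conditions:
# one to fit a forecasting model, one held out to score its forecasts.
train_series = trajectory(lorenz, (1.0, 1.0, 1.0))
test_series = trajectory(lorenz, (-5.0, 7.0, 20.0))
```

Because the system is chaotic, the two series sample the same attractor but follow different paths, so the test series checks whether a model has learned the dynamics and not one particular trajectory.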
### Main conclusions:

- **Advantages of large-scale general-purpose models**: When there is sufficient training data, large-scale general-purpose models (such as NBEATS, NHiTS, Transformer, and LSTM) perform well on multiple chaotic systems and can predict future states up to 22 Lyapunov times.
- **Advantages of physics-based models in data-limited situations**: When computing resources or data are limited, models with inductive bias (such as Echo State Networks) show better performance because they can use limited data more effectively.
- **Relationship between the inherent properties of chaotic systems and prediction capabilities**: The study finds that the performance of the best prediction model has a weak correlation with the maximum Lyapunov exponent (λmax) of the chaotic system, indicating that the scale of the model and data availability may be the limiting factors for the current large-scale models' ability to predict chaotic systems.

Through these studies, the paper provides important insights into understanding the potential and limitations of large-scale machine-learning models in chaotic system prediction.
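To make the "Lyapunov times" unit concrete: one Lyapunov time is 1/λmax, so a forecast horizon of duration T corresponds to T·λmax Lyapunov times. The sketch below measures how long a forecast stays within a tolerance of the true series and reports that duration in Lyapunov times. The function name `valid_horizon`, the 0.3 threshold, and the RMS normalization are assumptions made for illustration, not necessarily one of the paper's 17 metrics, and the toy signal is synthetic.

```python
import math
from typing import Sequence


def valid_horizon(true: Sequence[float], pred: Sequence[float], dt: float,
                  lyap_max: float, threshold: float = 0.3) -> float:
    """Time, in Lyapunov times, before the forecast error first exceeds
    `threshold` times the RMS amplitude of the true series."""
    if len(true) != len(pred) or not true:
        raise ValueError("series must be non-empty and of equal length")
    scale = math.sqrt(sum(v * v for v in true) / len(true))
    for i, (t, p) in enumerate(zip(true, pred)):
        if abs(t - p) > threshold * scale:
            return i * dt * lyap_max  # elapsed time divided by 1 / lyap_max
    return len(true) * dt * lyap_max  # error never crossed the threshold


# Toy example: a forecast whose error grows like exp(lyap_max * t),
# mimicking the exponential divergence of nearby chaotic trajectories.
dt, lyap_max, n = 0.01, 0.9, 5000
truth = [math.sin(i * dt) for i in range(n)]
forecast = [v + 1e-6 * math.exp(lyap_max * i * dt) for i, v in enumerate(truth)]
horizon = valid_horizon(truth, forecast, dt, lyap_max)
```

Expressing horizons in Lyapunov times makes them comparable across systems with different intrinsic time scales, which is what allows one figure such as "22 Lyapunov times" to summarize results over many attractors.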