Automatically discovering ordinary differential equations from data with sparse regression

Kevin Egan, Weizhen Li, Rui Carvalho
DOI: https://doi.org/10.1038/s42005-023-01516-2
2024-01-10
Communications Physics
Abstract: Discovering nonlinear differential equations that describe system dynamics from empirical data is a fundamental challenge in contemporary science. While current methods can identify such equations, they often require extensive manual hyperparameter tuning, limiting their applicability. Here, we propose a methodology to identify dynamical laws by integrating denoising techniques to smooth the signal, sparse regression to identify the relevant parameters, and bootstrap confidence intervals to quantify the uncertainty of the estimates. We evaluate our method on well-known ordinary differential equations with an ensemble of random initial conditions, time series of increasing length, and varying signal-to-noise ratios. Our algorithm consistently identifies three-dimensional systems, given moderately-sized time series and high levels of signal quality relative to background noise. By accurately discovering dynamical systems automatically, our methodology has the potential to impact the understanding of complex systems, especially in fields where data are abundant, but developing mathematical models demands considerable effort.
physics, multidisciplinary
What problem does this paper attempt to address?
The paper addresses the problem of automatically discovering nonlinear differential equations that describe system dynamics from observational data. The authors propose a new method, ARGOS, to overcome a key limitation of existing approaches: the need for extensive manual hyperparameter tuning. The main contributions of the paper are:

1. **Noise Handling and Derivative Computation**: Smoothing the signal with a Savitzky-Golay filter and computing numerical derivatives automatically.
2. **Sparse Regression and Variable Selection**: Using sparse regression techniques (such as LASSO and adaptive LASSO) to identify the important terms, and estimating uncertainty intervals through bootstrap sampling.
3. **Automated Model Selection**: Optimizing the Savitzky-Golay filter parameters through grid search and selecting the best model automatically.

By testing on multiple classical differential equations, the authors demonstrate the consistency and accuracy of their method across different signal-to-noise ratios and time series lengths. In particular, it outperforms the existing SINDy with AIC method in identifying three-dimensional systems. The method is also reported to handle noisy data well, which makes the identification of complex systems more efficient.
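To make the pipeline concrete, here is a minimal Python sketch of the three steps on a toy problem. It is not the authors' ARGOS implementation, and it rests on several assumptions. The test system is a two-dimensional damped linear oscillator chosen for illustration. The Savitzky-Golay window is fixed rather than tuned by grid search. The lasso is a small hand-written coordinate-descent solver with an arbitrarily chosen penalty `alpha`. The bootstrap resamples rows independently, ignoring temporal correlation. All function names (`simulate`, `smooth_and_differentiate`, `library`, `lasso`, `fit_once`, `identify`) are hypothetical.

```python
import numpy as np
from scipy.signal import savgol_filter


def simulate(n=2000, dt=0.01, noise=0.01, seed=0):
    """Noisy samples of the toy system dx/dt = -0.1x + 2y, dy/dt = -2x - 0.1y."""
    rng = np.random.default_rng(seed)
    t = np.arange(n) * dt
    decay = np.exp(-0.1 * t)
    clean = np.column_stack([2 * decay * np.cos(2 * t), -2 * decay * np.sin(2 * t)])
    return clean + noise * rng.standard_normal(clean.shape), dt


def smooth_and_differentiate(X, dt, window=31, polyorder=3):
    """Savitzky-Golay smoothing and first derivative; edge samples are trimmed."""
    Xs = savgol_filter(X, window, polyorder, axis=0)
    dX = savgol_filter(X, window, polyorder, deriv=1, delta=dt, axis=0)
    return Xs[window:-window], dX[window:-window]


NAMES = ["1", "x", "y", "x^2", "x*y", "y^2"]


def library(X):
    """Candidate terms: all monomials in (x, y) up to degree 2."""
    x, y = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x**2, x * y, y**2])


def lasso(Theta, target, alpha, sweeps=30):
    """Coordinate-descent lasso on columns rescaled to unit mean square."""
    n, p = Theta.shape
    scale = np.sqrt((Theta**2).mean(axis=0))
    A = Theta / scale
    b = np.zeros(p)
    r = target.copy()  # running residual: target - A @ b
    for _ in range(sweeps):
        for j in range(p):
            r += A[:, j] * b[j]  # remove term j from the fit
            rho = A[:, j] @ r / n
            b[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0)  # soft threshold
            r -= A[:, j] * b[j]
    return b / scale


def fit_once(Theta, target, alpha):
    """Select terms with the lasso, then refit them by ordinary least squares."""
    support = lasso(Theta, target, alpha) != 0
    coef = np.zeros(Theta.shape[1])
    if support.any():
        coef[support] = np.linalg.lstsq(Theta[:, support], target, rcond=None)[0]
    return coef


def identify(Theta, target, alpha=0.05, n_boot=100, seed=1):
    """Keep terms whose 95% bootstrap percentile interval excludes zero."""
    rng = np.random.default_rng(seed)
    n = len(target)
    boots = np.empty((n_boot, Theta.shape[1]))
    for i in range(n_boot):
        idx = rng.integers(0, n, n)  # resample rows with replacement
        boots[i] = fit_once(Theta[idx], target[idx], alpha)
    lo, hi = np.percentile(boots, [2.5, 97.5], axis=0)
    keep = (lo > 0) | (hi < 0)
    return np.where(keep, np.median(boots, axis=0), 0.0), lo, hi


X, dt = simulate()
Xs, dX = smooth_and_differentiate(X, dt)
Theta = library(Xs)
results = {}
for k, lhs in enumerate(["dx/dt", "dy/dt"]):
    coef, lo, hi = identify(Theta, dX[:, k])
    results[lhs] = coef
    terms = [f"{c:+.2f}*{name}" for c, name in zip(coef, NAMES) if c != 0]
    print(lhs, "=", " ".join(terms))
```

On this toy data the dominant coupling terms should come out close to their true values of 2 and -2, and the quadratic candidates should be discarded. Whether the weak damping terms survive depends on the penalty and the noise level. A fuller treatment along the lines the summary describes would search over the filter window and the penalty instead of fixing them by hand.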