Learning dominant physical processes with data-driven balance models

Jared L. Callaham, James V. Koch, Bingni W. Brunton, J. Nathan Kutz, Steven L. Brunton
DOI: https://doi.org/10.1038/s41467-021-21331-z
2021-01-20
Abstract: Throughout the history of science, physics-based modeling has relied on judiciously approximating observed dynamics as a balance between a few dominant processes. However, this traditional approach is mathematically cumbersome and only applies in asymptotic regimes where there is a strict separation of scales in the physics. Here, we automate and generalize this approach to non-asymptotic regimes by introducing the idea of an equation space, in which different local balances appear as distinct subspace clusters. Unsupervised learning can then automatically identify regions where groups of terms may be neglected. We show that our data-driven balance models successfully delineate dominant balance physics in a much richer class of systems. In particular, this approach uncovers key mechanistic models in turbulence, combustion, nonlinear optics, geophysical fluids, and neuroscience.
Fluid Dynamics
What problem does this paper attempt to address?
This paper addresses how to automatically identify the dominant physical processes in complex physical systems, especially in non-asymptotic regimes. Traditional methods approximate the observed dynamics as a balance between a few dominant processes, but this usually applies only in asymptotic regimes with a strict separation of scales. To overcome this limitation, the authors introduce an "equation space", in which different local balances appear as distinct subspace clusters. Unsupervised learning can then automatically identify regions where groups of terms may be neglected, which determines the dominant physical processes.

### Main Problem Summary

1. **Limitations of Traditional Methods**:
   - Traditional physics-based modeling approximates the observed dynamics as a balance between a few dominant processes.
   - This works in asymptotic regimes but not in non-asymptotic regimes, which lack a strict separation of scales.
2. **Objectives of the New Method**:
   - Automate and generalize the traditional approach so that it applies to non-asymptotic regimes.
   - Introduce the concept of an "equation space", in which different local balances appear as distinct subspace clusters.
   - Use unsupervised learning to automatically identify which terms can be neglected, thereby determining the dominant physical processes.

### Specific Implementation

- **Equation Space**: Each coordinate corresponds to one term in the governing equation. At each space-time point \((x, t)\), evaluating all \(K\) terms gives a vector \(f(x, t)\in\mathbb{R}^K\).
- **Dominant Balance Region**: A region where the governing equation is approximately satisfied by a subset of its terms, so that the remaining terms can be neglected.
- **Geometric Interpretation**: In the equation space, dominant-balance physics has a natural geometric interpretation, which allows standard machine-learning tools (such as Gaussian mixture models (GMM) and sparse principal component analysis (SPCA)) to identify these regions automatically.
- **Application Examples**: The method has been applied to turbulence, combustion, nonlinear optics, geophysical fluids, and neuroscience, where it reveals key mechanistic models.

### Result Verification

The method has been verified on multiple physical systems, including flow around a cylinder, a turbulent boundary layer, supercontinuum generation in optical fibers, geostrophic flow in the Gulf of Mexico, and Hodgkin-Huxley-type neuron models. The results show that the method identifies the dominant physical processes effectively and that its output is consistent with classical scaling analysis and known physical behavior.

### Key Formulas

- General form of the governing equation:
  \[ N(u)=\sum_{i=1}^{K}f_i(u, u_x, u_{xx},\dots, u_t,\dots)=0 \]
- Viscous Burgers equation:
  \[ N(u)=u_t+uu_x-\nu u_{xx}=0 \]
- Two-dimensional Navier-Stokes equation (dimensionless form):
  \[ u_t+(u\cdot\nabla)u=-\nabla p+\frac{1}{Re}\nabla^2u \]

With this method, researchers can identify the dominant physical processes in a wider range of systems without relying on specific asymptotic assumptions.
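To make the equation-space idea concrete, the following minimal sketch evaluates the three terms of the viscous Burgers equation, \(f(x,t)=(u_t,\ uu_x,\ -\nu u_{xx})\), along a travelling-wave solution and reports which terms are locally non-negligible. Everything here is an illustrative assumption rather than the authors' implementation: the parameter values, the finite-difference step, and the function names are hypothetical, and a simple relative-magnitude threshold stands in for the GMM clustering and SPCA steps mentioned in the summary.

```python
import math

# Illustrative parameters (assumptions, not taken from the paper):
# viscosity, shock half-amplitude, and shock propagation speed.
NU, A, C = 0.1, 1.0, 1.0

TERM_NAMES = ("u_t", "u*u_x", "-nu*u_xx")


def u(x, t):
    """Travelling-wave solution of u_t + u*u_x - nu*u_xx = 0."""
    return C - A * math.tanh(A * (x - C * t) / (2.0 * NU))


def equation_space_vector(x, t, h=1e-3):
    """Return f(x, t) = (u_t, u*u_x, -nu*u_xx) via central finite differences."""
    u0 = u(x, t)
    u_t = (u(x, t + h) - u(x, t - h)) / (2.0 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2.0 * h)
    u_xx = (u(x + h, t) - 2.0 * u0 + u(x - h, t)) / (h * h)
    return (u_t, u0 * u_x, -NU * u_xx)


def active_terms(f, rel_tol=0.2, floor=1e-3):
    """Names of terms whose magnitude is at least rel_tol times the largest term.

    This thresholding rule is a simplified stand-in for the clustering step;
    points where every term is below `floor` are reported as inactive.
    """
    largest = max(abs(v) for v in f)
    if largest < floor:
        return ()
    return tuple(n for n, v in zip(TERM_NAMES, f) if abs(v) >= rel_tol * largest)


if __name__ == "__main__":
    for x in (-0.5, 0.0, 0.5, 3.0):
        f = equation_space_vector(x, 0.0)
        # The terms sum to roughly zero because u solves the full equation.
        print(f"x={x:+.1f}  residual={sum(f):+.2e}  active={active_terms(f)}")
```

Under these assumptions, different points of the same solution fall on different coordinate subspaces of the equation space: some have one component that is negligible, others have all three active. A clustering method would group such points into dominant balance regions without a hand-picked threshold.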