Improvement of system identification of stochastic systems via Koopman generator and locally weighted expectation

Yuki Tahara, Kakutaro Fukushi, Shunta Takahashi, Kayo Kinjo, Jun Ohkubo
2024-06-27
Abstract: The estimation of equations from data is of interest in physics. One of the famous methods is the sparse identification of nonlinear dynamics (SINDy), which utilizes sparse estimation techniques to estimate equations from data. Recently, a method based on the Koopman operator has been developed; the generator extended dynamic mode decomposition (gEDMD) estimates a time evolution generator of dynamical and stochastic systems. However, a naive application of the gEDMD algorithm cannot work well for stochastic differential equations because of the noise effects in the data. Hence, the estimation based on conditional expectation values, in which we approximate the first and second derivatives on each coordinate, is practical. A naive approach is the usage of locally weighted expectations. We show that the naive locally weighted expectation is insufficient because of the nonlinear behavior of the underlying system. For improvement, we apply the clustering method in two ways; one is to reduce the effective number of data, and the other is to capture local information more accurately. We demonstrate the improvement of the proposed method for the double-well potential system with state-dependent noise.
Dynamical Systems; Data Analysis, Statistics and Probability
What problem does this paper attempt to address?
The paper aims to improve the accuracy of Koopman-generator-based methods for estimating the governing equations of stochastic systems from noisy data. Specifically, it addresses the following points:

1. **Problems with existing methods**: The generator extended dynamic mode decomposition (gEDMD) algorithm performs poorly when applied naively to stochastic differential equations, because noise in the data degrades the estimates.
2. **Improvement goals**: The goal is to make gEDMD-based estimation of stochastic system equations more accurate on noisy data. To that end, the authors propose preprocessing steps that reduce the influence of noise before the estimation.
3. **Proposed solutions**:
   - **Weighted expectations**: Conditional expectations are computed as locally weighted averages using kernel functions, which reduces the impact of noise in the data.
   - **Clustering methods**: Clustering is used twice. First, it selects representative points at which the weighted expectations are computed. Second, an additional clustering step with the Dirichlet Process Mixture Model (DPMM) captures the local information of the data more accurately.
4. **Study subject**: The case study is a double-well potential system with state-dependent noise.
5. **Experimental validation**: Numerical experiments show that, with the additional clustering step, the estimated drift coefficient function is closer to the true one and better reproduces the sample-trajectory behavior of the original system.

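
The kernel-weighted expectation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the drift `x - x**3`, the noise amplitude `0.5 * sqrt(1 + x**2)`, the Gaussian kernel, the bandwidth, and the function names `simulate_double_well` and `local_moments` are all assumptions chosen for illustration. The sketch simulates a one-dimensional double-well system with the Euler-Maruyama scheme and then approximates the first and second conditional moments of the increments at a few query points.

```python
import numpy as np

def simulate_double_well(n_steps, dt, x0=0.0, seed=0):
    """Euler-Maruyama path of dX = (X - X^3) dt + sigma(X) dW.

    The drift and sigma(X) are illustrative assumptions, not the paper's model.
    """
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps + 1)
    x[0] = x0
    noise = rng.standard_normal(n_steps)
    for k in range(n_steps):
        drift = x[k] - x[k] ** 3
        sigma = 0.5 * np.sqrt(1.0 + x[k] ** 2)  # assumed state-dependent noise
        x[k + 1] = x[k] + drift * dt + sigma * np.sqrt(dt) * noise[k]
    return x

def local_moments(x, dt, centers, bandwidth):
    """Gaussian-kernel weighted estimates of the drift and squared diffusion.

    drift(c)   ~ E[dX   | X = c] / dt
    sigma^2(c) ~ E[dX^2 | X = c] / dt
    """
    x_now, dx = x[:-1], np.diff(x)
    # weights[i, j]: influence of sample j on query point i
    weights = np.exp(-0.5 * ((centers[:, None] - x_now[None, :]) / bandwidth) ** 2)
    weights /= weights.sum(axis=1, keepdims=True)
    drift = (weights @ dx) / dt
    diff2 = (weights @ dx**2) / dt
    return drift, diff2

dt = 5e-3
x = simulate_double_well(n_steps=200_000, dt=dt)
centers = np.linspace(-1.0, 1.0, 5)
drift_hat, diff2_hat = local_moments(x, dt, centers, bandwidth=0.1)

for c, f, d in zip(centers, drift_hat, diff2_hat):
    print(f"x = {c:+.1f}  drift ~ {f:+.3f}  sigma^2 ~ {d:.3f}")
```

In this sketch the second-moment estimate is typically much less noisy than the drift estimate, because the drift signal in each increment scales with `dt` while the noise scales with `sqrt(dt)`. The bandwidth is a free parameter here; how it should be chosen is not stated in the summary above.
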
In summary, this paper aims to improve the gEDMD algorithm by introducing weighted expectations and clustering methods, enabling it to more accurately estimate the dynamic equations of stochastic systems in noisy datasets.
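
The first use of clustering, selecting a small set of representative points at which to evaluate the weighted expectations, might look like the sketch below. The summary does not say which clustering algorithm is used for this step, so plain one-dimensional k-means (Lloyd iterations) is assumed here. The function name `kmeans_1d`, the number of clusters, and the synthetic two-clump data are hypothetical, and the DPMM-based second clustering step is not implemented.

```python
import numpy as np

def kmeans_1d(samples, n_clusters, n_iter=50):
    """Plain Lloyd iterations in one dimension; returns sorted centroids."""
    # Start from evenly spaced quantiles so every cluster begins non-empty.
    centroids = np.quantile(samples, (np.arange(n_clusters) + 0.5) / n_clusters)
    for _ in range(n_iter):
        # Assign each sample to its nearest centroid.
        labels = np.argmin(np.abs(samples[:, None] - centroids[None, :]), axis=1)
        for j in range(n_clusters):
            members = samples[labels == j]
            if members.size:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean()
    return np.sort(centroids)

rng = np.random.default_rng(1)
# Stand-in for observed states: two clumps near the wells at x = -1 and x = +1.
samples = np.concatenate([rng.normal(-1.0, 0.3, 5000),
                          rng.normal(1.0, 0.3, 5000)])
reps = kmeans_1d(samples, n_clusters=20)
print(reps.size, "representative points instead of", samples.size, "samples")
```

The centroids concentrate where the data are dense, so the weighted expectations would be evaluated mostly in well-sampled regions rather than on a uniform grid.
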