Neural Network-based High-index Saddle Dynamics Method for Searching Saddle Points and Solution Landscape

Yuankai Liu, Lei Zhang, Jin Zhao
2024-11-25
Abstract: The high-index saddle dynamics (HiSD) method is a powerful approach for computing saddle points and solution landscape. However, its practical applicability is constrained by the need for an explicit expression of the energy function. To overcome this challenge, we propose a neural network-based high-index saddle dynamics (NN-HiSD) method. It utilizes a neural network-based surrogate model to approximate the energy function, allowing the use of the HiSD method in cases where the energy function is either unavailable or computationally expensive. We further enhance the efficiency of the NN-HiSD method by incorporating momentum acceleration techniques, specifically Nesterov's acceleration and the heavy-ball method. We also provide a rigorous convergence analysis of the NN-HiSD method. We conduct numerical experiments on systems with and without explicit energy functions, specifically including the alanine dipeptide model and bacterial ribosomal assembly intermediates for the latter, demonstrating the effectiveness and reliability of the proposed method.
Machine Learning
What problem does this paper attempt to address?
This paper addresses the difficulty of computing saddle points and solution landscapes when an explicit expression for the energy function is unavailable or too expensive to evaluate. Specifically:

1. **Limitations of traditional methods**: The high-index saddle dynamics (HiSD) method is a powerful tool for computing saddle points and solution landscapes, but it requires an explicit expression for the energy function. In many practical problems, the energy function is unknown or too costly to evaluate with traditional analytical or numerical methods.
2. **Introduction of neural network surrogate models**: To remove this requirement, the paper proposes a neural network-based high-index saddle dynamics (NN-HiSD) method. A neural network serves as a surrogate model that approximates the energy function, so HiSD can search for saddle points without an explicit energy function.
3. **Improving computational efficiency**: To speed up convergence of NN-HiSD, the paper incorporates momentum acceleration techniques, namely Nesterov's acceleration and the heavy-ball method.
4. **Verifying the effectiveness of the method**: The paper tests the method in numerical experiments on systems with explicit energy functions and on data-driven systems without them (the alanine dipeptide model and bacterial ribosomal assembly intermediates), demonstrating its effectiveness and reliability in both settings.
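To illustrate what the heavy-ball momentum mentioned in item 3 does, the following minimal sketch applies it to plain gradient descent on an ill-conditioned quadratic. This is the generic heavy-ball iteration, not the paper's NN-HiSD update; the test energy, the function names `grad` and `minimize`, and the step and momentum values (the classical choices for a quadratic with curvatures 1 and 100) are all assumptions made for illustration.

```python
def grad(p):
    """Gradient of the toy quadratic E(x, y) = 0.5 * (x^2 + 100 * y^2)."""
    return (p[0], 100.0 * p[1])


def minimize(p, step, momentum=0.0, tol=1e-8, max_iter=10000):
    """Heavy-ball iteration:
        p_{n+1} = p_n - step * grad(p_n) + momentum * (p_n - p_{n-1}).
    With momentum = 0 this reduces to plain gradient descent.
    Returns the final point and the number of iterations used."""
    prev = p  # no momentum contribution on the first step
    for n in range(max_iter):
        g = grad(p)
        if (g[0] ** 2 + g[1] ** 2) ** 0.5 < tol:
            return p, n
        nxt = tuple(
            pi - step * gi + momentum * (pi - qi)
            for pi, gi, qi in zip(p, g, prev)
        )
        prev, p = p, nxt
    return p, max_iter


# Curvatures are 1 and 100, so the condition number is 100.
gd_point, gd_iters = minimize((1.0, 1.0), step=2.0 / 101.0)
hb_point, hb_iters = minimize(
    (1.0, 1.0), step=4.0 / 121.0, momentum=(9.0 / 11.0) ** 2
)
print("gradient descent iterations:", gd_iters)
print("heavy-ball iterations:", hb_iters)
```

On this problem the heavy-ball run should stop after far fewer iterations than plain gradient descent, which is the kind of speed-up that motivates adding momentum to an iterative saddle search.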
### Formula summary

- **Saddle dynamics equations**:

\[
\begin{cases}
\dot{x}=\beta\left(I-2\sum_{i=1}^{k}v_iv_i^T\right)F(x),\\
\dot{v}_i=-\gamma\left(I-v_iv_i^T-\sum_{j=1}^{i-1}2v_jv_j^T\right)G(x)v_i,\quad i=1,2,\ldots,k,
\end{cases}
\]

  where \(I\) is the identity matrix, \(F(x)=-\nabla E(x)\) is the natural force of the energy \(E(x)\), \(G(x)=\nabla^2E(x)\) is the corresponding Hessian matrix, \(v_1,\ldots,v_k\) approximate the eigenvectors corresponding to the \(k\) smallest eigenvalues of \(G(x)\) (which are negative at an index-\(k\) saddle point), and \(\beta\) and \(\gamma\) are two positive relaxation parameters.

- **Loss function**: The surrogate energy \(E_{NN}(x;\theta)\) with network parameters \(\theta\) is fitted to energy data through

\[
L_1=\|E_{NN}(x;\theta)-E(x)\|.
\]

  When partial gradient information is available, a second loss is defined, where \(\text{AD}_x\) denotes automatic differentiation with respect to \(x\):

\[
L_2=\|\text{AD}_x(E_{NN}(x;\theta))-\nabla E(x)\|.
\]

  The total loss, with weight \(\lambda_2\) on the gradient term, is

\[
L=L_1+\lambda_2L_2.
\]

With these components, the paper provides a method for computing saddle points and solution landscapes without an explicit energy function and demonstrates it on several practical systems.
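As a minimal sketch of how the saddle dynamics equations can be integrated, the code below applies an explicit Euler step with renormalization of \(v\) to the index-1 case (\(k=1\)), where the sum over \(j<i\) is empty. The toy energy \(E(x,y)=(x^2-1)^2+y^2\), its analytic force and Hessian, the step sizes, and the names `force`, `hessian_vec`, and `hisd_index1` are assumptions for illustration. In NN-HiSD the force and Hessian would instead come from the trained surrogate, and the paper's actual discretization may differ.

```python
import math


def force(p):
    """Natural force F = -grad E for the toy energy E(x, y) = (x^2 - 1)^2 + y^2."""
    x, y = p
    return (-4.0 * x * (x * x - 1.0), -2.0 * y)


def hessian_vec(p, v):
    """Hessian-vector product G(p) v; the toy Hessian is diag(12 x^2 - 4, 2)."""
    x, _ = p
    return ((12.0 * x * x - 4.0) * v[0], 2.0 * v[1])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def normalize(v):
    n = math.sqrt(dot(v, v))
    return (v[0] / n, v[1] / n)


def hisd_index1(p, v, beta=0.05, gamma=0.05, tol=1e-10, max_iter=5000):
    """Explicit Euler discretization of the k = 1 saddle dynamics:
        x <- x + beta  * (F - 2 v (v . F))
        v <- v - gamma * (G v - v (v . G v)),  then renormalize v.
    Stops when the force norm falls below tol."""
    v = normalize(v)
    for _ in range(max_iter):
        f = force(p)
        if math.sqrt(dot(f, f)) < tol:
            break
        fv = dot(f, v)
        # Reflect the force along v: ascend along v, descend elsewhere.
        p_new = (p[0] + beta * (f[0] - 2.0 * fv * v[0]),
                 p[1] + beta * (f[1] - 2.0 * fv * v[1]))
        # Move v toward the eigenvector of the smallest Hessian eigenvalue.
        gv = hessian_vec(p, v)
        rq = dot(v, gv)  # Rayleigh quotient v^T G v
        v = normalize((v[0] - gamma * (gv[0] - rq * v[0]),
                       v[1] - gamma * (gv[1] - rq * v[1])))
        p = p_new
    return p, v


saddle, direction = hisd_index1((0.5, 0.3), (1.0, 0.2))
print("saddle estimate:", saddle)
print("unstable direction:", direction)
```

The toy energy has minima at \((\pm 1, 0)\) and an index-1 saddle at the origin, where the Hessian is \(\mathrm{diag}(-4, 2)\). From this starting point the iteration should approach the origin, with `direction` aligning with the \(x\)-axis.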