Mixed Newton Method for Optimization in Complex Spaces

Nikita Yudin, Roland Hildebrand, Sergey Bakhurin, Alexander Degtyarev, Anna Lisachenko, Ilya Kuruzov, Andrei Semenov, Mohammad Alkousa
2024-07-30
Abstract: In this paper, we modify and apply the recently introduced Mixed Newton Method, which was originally designed for minimizing real-valued functions of complex variables, to the minimization of real-valued functions of real variables by extending the functions to complex space. We show that arbitrary regularizations preserve the favorable local convergence properties of the method, and construct a special type of regularization used to prevent convergence to complex minima. We compare several variants of the method applied to training neural networks with real and complex parameters.
Optimization and Control, Machine Learning
What problem does this paper attempt to address?
The paper addresses the minimization of real-valued functions of complex variables and extends the approach to real-valued functions of real variables. Specifically, it modifies and applies the Mixed Newton Method (MNM), adding a regularization term intended to improve its convergence and robustness.

### Core of the problem

1. **Optimizing real-valued functions in the complex space**: The paper first considers how to use the Mixed Newton Method to minimize the real-valued function \( f(z)=\sum_j |g_j(z)|^2 \) in the complex space, where the \( g_j(z) \) are holomorphic functions of the complex variable \( z\in\mathbb{C}^n \).
2. **Extension to the real space**: The paper then extends the method to the minimization of real-valued functions of real variables. The real-valued function is extended to the complex space, and a regularization term is added to ensure that the minimum lies within the real subspace.
3. **The role of regularization**: The regularization term prevents the method from converging to complex minima, i.e. minima lying outside the real subspace, and can alleviate degeneracy caused by symmetries of the objective function or by other factors. Regularization can also stabilize the irregular behavior that may occur far from critical points.

### Specific problem description

- **Form of the objective function**: For the objective function \( f(z)=\sum_j |g_j(z)|^2 \) in the complex space, the iteration formula of the Mixed Newton Method is:
  \[ z_{k + 1}=z_k-\left(\frac{\partial^2 f(z_k)}{\partial\bar{z}\partial z}\right)^{-1}\frac{\partial f(z_k)}{\partial\bar{z}} \]
  where the derivatives are Wirtinger derivatives.
- **Iteration formula after regularization**: To ensure positive definiteness and stability, a positive definite regularization matrix \( P \) is introduced, and the iteration formula becomes:
  \[ z_{k + 1}=z_k-\left(\frac{\partial^2 f(z_k)}{\partial\bar{z}\partial z}+P\right)^{-1}\frac{\partial f(z_k)}{\partial\bar{z}} \]
- **Application in the real space**: For a real-valued function \( F(x) \) of real variables, the Mixed Newton Method can be applied by extending it to a function of the form \( f(z) = |g(z)|^2 \) on the complex space and adding a regularization term of a specific form to ensure that the minimum lies within the real subspace.

### Experimental verification

The paper reports numerical experiments on minimizing non-convex polynomials and on training neural networks with the Regularized Mixed Newton Method (RMNM). According to these results, RMNM shows better global convergence and can effectively avoid local minima and saddle points.

### Summary

The main contribution of the paper is a regularized version of the Mixed Newton Method that applies not only to optimization problems in the complex space but also to optimization problems in the real space. The reported results indicate advantages on non-convex optimization problems and in neural network training.
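
To make the regularized iteration formula from the specific problem description concrete, here is a minimal NumPy sketch. It relies on the fact that, for holomorphic \( g_j \) with Jacobian \( J=\partial g/\partial z \), the Wirtinger gradient of \( f(z)=\sum_j |g_j(z)|^2 \) is \( \partial f/\partial\bar{z}=J^H g \) and the mixed Hessian is \( J^H J \). The function name `rmnm_step`, the toy problem \( g(z)=z^2-1 \), the starting point, and the choice \( P=10^{-3}I \) are assumptions made for illustration and are not taken from the paper. The special regularization the paper constructs for the real case is not reproduced here.

```python
import numpy as np

def rmnm_step(z, g, jac, P):
    """One regularized mixed Newton step for f(z) = sum_j |g_j(z)|^2.

    z   : complex vector of shape (n,)
    g   : callable returning the complex vector g(z), shape (m,)
    jac : callable returning the holomorphic Jacobian dg/dz, shape (m, n)
    P   : positive definite regularization matrix, shape (n, n)
    """
    r = g(z)
    J = jac(z)
    grad = J.conj().T @ r          # Wirtinger gradient df/d(conj z) = J^H g
    mixed_hess = J.conj().T @ J    # mixed Hessian d^2 f / d(conj z) dz = J^H J
    return z - np.linalg.solve(mixed_hess + P, grad)

# Hypothetical toy problem: f(z) = |z^2 - 1|^2, whose minima are z = +1 and z = -1.
def g(z):
    return np.array([z[0] ** 2 - 1.0])

def jac(z):
    return np.array([[2.0 * z[0]]])

z = np.array([2.0 + 1.0j])         # complex starting point (illustrative)
P = 1e-3 * np.eye(1)               # small multiple of the identity (illustrative)
for _ in range(50):
    z = rmnm_step(z, g, jac, P)

print(z, abs(g(z)[0]) ** 2)
```

With \( P=0 \) and a single scalar \( g \), this step reduces to the classical Newton iteration for solving \( g(z)=0 \). Since the gradient \( J^H g \) vanishes wherever \( g \) does, adding \( P \) leaves the zeros of \( g \) as fixed points of the iteration.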