Abstract: This paper presents a novel approach to solving convex optimization problems by leveraging the fact that, under certain regularity conditions, the Karush-Kuhn-Tucker (KKT) conditions are necessary and sufficient for optimality: primal and dual variables are optimal if and only if they satisfy the KKT conditions. Similar to Theory-Trained Neural Networks (TTNNs), the parameters of the convex optimization problem are input to the neural network, and the expected outputs are the optimal primal and dual variables. A natural choice of loss function in this case, which we refer to as the KKT Loss, measures how well the network's outputs satisfy the KKT conditions. We demonstrate the effectiveness of this approach using a linear program as an example. For this problem, we observe that minimizing the KKT Loss alone outperforms training the network with a weighted sum of the KKT Loss and a Data Loss (the mean-squared error between the ground truth optimal solutions and the network's output). Moreover, minimizing only the Data Loss yields inferior results compared to those obtained by minimizing the KKT Loss. While the approach is promising, the obtained primal and dual solutions are not sufficiently close to the ground truth optimal solutions. In the future, we aim to develop improved models to obtain solutions closer to the ground truth and extend the approach to other problem classes.
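
As a concrete illustration of what the KKT Loss penalizes, consider a linear program in inequality form. This form is an assumption made here for illustration; the abstract does not state which form the paper uses. With problem data \( c \in \mathbb{R}^n \), \( A \in \mathbb{R}^{m \times n} \), \( b \in \mathbb{R}^m \) and dual variables \( \lambda \in \mathbb{R}^m \), the problem and its KKT conditions read:

\[ \begin{aligned} & \min_{x \in \mathbb{R}^n} c^\top x \quad \text{subject to } Ax \leq b, \\ & \text{Primal feasibility: } Ax^* \leq b, \\ & \text{Dual feasibility: } \lambda^* \geq 0, \\ & \text{Complementary slackness: } \lambda_i^* (a_i^\top x^* - b_i) = 0, \quad i = 1, \ldots, m, \\ & \text{Stationarity: } c + A^\top \lambda^* = 0, \end{aligned} \]

where \( a_i^\top \) denotes the \( i \)-th row of \( A \). Under this assumed form, a network trained on the KKT Loss would take \( (c, A, b) \) as input and output estimates \( (\hat{x}, \hat{\lambda}) \) whose violations of these four conditions are driven toward zero.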

What problem does this paper attempt to address?

This paper addresses the use of neural networks to solve convex optimization problems. Its central idea is to build the Karush-Kuhn-Tucker (KKT) conditions into the training loss so that the network outputs optimal primal and dual variables. Specifically, the main objectives of the paper are:

1. **Propose a new method**: Use the KKT conditions as part of the loss function to train neural networks to solve convex optimization problems.
2. **Verify the effectiveness of the method**: Using a linear program as an example, show that minimizing only the KKT Loss is more effective than minimizing a weighted sum of the KKT Loss and the Data Loss, or the Data Loss alone.
3. **Explore the impact of different loss functions**: Study how different loss functions (KKT Loss only, Data Loss only, and a weighted sum of the two) affect model performance during training.

### Background and Motivation

Convex optimization problems are traditionally solved by numerical methods, such as the interior-point method or gradient descent. With the development of deep learning, researchers have begun to explore how neural networks can be used to solve these problems. The authors of this paper propose a method based on the KKT conditions that aims to have the neural network directly output a solution satisfying the KKT conditions, thereby simplifying the solution process and improving efficiency.

### Method Overview

1. **Problem Formalization**:
   - A general convex optimization problem can be expressed as: \[ \begin{aligned} & \min_{x \in \mathbb{R}^n} f_0(x), \\ & \text{subject to } f_i(x) \leq 0, \quad i = 1, \ldots, m, \\ & \quad \quad g_i(x) = 0, \quad i = 1, \ldots, p, \end{aligned} \]
   - Here \( x = [x_1, x_2, \ldots, x_n] \in \mathbb{R}^n \), each \( f_i: \mathbb{R}^n \to \mathbb{R} \), \( i = 0, \ldots, m \), is a convex function, and each \( g_i: \mathbb{R}^n \to \mathbb{R} \) is an affine function.
2. **KKT Conditions**:
   - Under certain regularity conditions, the KKT conditions are necessary and sufficient for optimality in convex optimization problems. With \( \lambda^* \) and \( \nu^* \) denoting the dual variables of the inequality and equality constraints, they are:
     - Primal Feasibility: \( f_i(x^*) \leq 0 \), \( i = 1, \ldots, m \), and \( g_i(x^*) = 0 \), \( i = 1, \ldots, p \)
     - Dual Feasibility: \( \lambda_i^* \geq 0 \), \( i = 1, \ldots, m \)
     - Complementary Slackness: \( \lambda_i^* f_i(x^*) = 0 \), \( i = 1, \ldots, m \)
     - Stationarity: \( \nabla f_0(x^*) + \sum_{i = 1}^m \lambda_i^* \nabla f_i(x^*) + \sum_{i = 1}^p \nu_i^* \nabla g_i(x^*) = 0 \)
3. **Loss Function Design**:
   - Define the KKT Loss to measure how well the output of the neural network satisfies the KKT conditions. Specifically, it includes:
     - Primal Feasibility Loss: \[ L_{PF} = \frac{1}{m} \sum_{i = 1}^m \max(0, f_i(\hat{x}))^2 \]
     - Dual Feasibility Loss: \[ L_{DF} = \frac{1}{m} \sum_{i = 1}^m \max(0, -
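
To make the loss design concrete, here is a minimal NumPy sketch of a KKT-style loss for a linear program of the assumed form min cᵀx subject to Ax ≤ b. The primal feasibility term follows the \( L_{PF} \) formula given above. The other three terms (squared violations, averaged over their components) and the unweighted sum are assumptions made by analogy, since their definitions are not shown here. The function name `kkt_loss_lp` and the toy problem are hypothetical, and the sketch only evaluates the loss for given candidate outputs; it does not train a network.

```python
import numpy as np


def kkt_loss_lp(c, A, b, x_hat, lam_hat):
    """Squared KKT residuals for the LP  min c^T x  s.t.  A x <= b.

    x_hat and lam_hat stand in for a network's primal and dual outputs.
    Only the primal feasibility term is taken from the text; the other
    terms and the equal weighting are illustrative assumptions.
    """
    c = np.asarray(c, dtype=float)
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    lam_hat = np.asarray(lam_hat, dtype=float)

    # Constraint values f_i(x_hat) = a_i^T x_hat - b_i (feasible when <= 0).
    f = A @ x_hat - b

    terms = {
        # L_PF: penalize only violated inequality constraints.
        "primal_feasibility": np.mean(np.maximum(0.0, f) ** 2),
        # Assumed analogue: penalize negative dual variables.
        "dual_feasibility": np.mean(np.maximum(0.0, -lam_hat) ** 2),
        # Assumed: each product lambda_i * f_i(x_hat) should vanish.
        "complementary_slackness": np.mean((lam_hat * f) ** 2),
        # Assumed: gradient of the Lagrangian, c + A^T lambda, should vanish.
        "stationarity": np.mean((c + A.T @ lam_hat) ** 2),
    }
    # Unweighted sum of the four terms (an assumption, not the paper's choice).
    terms["total"] = sum(terms.values())
    return terms


if __name__ == "__main__":
    # Toy LP: maximize x1 + x2 over the unit box, written as min -x1 - x2.
    c = [-1.0, -1.0]
    A = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
    b = [1.0, 1.0, 0.0, 0.0]

    optimal = kkt_loss_lp(c, A, b, x_hat=[1.0, 1.0], lam_hat=[1.0, 1.0, 0.0, 0.0])
    poor = kkt_loss_lp(c, A, b, x_hat=[2.0, 0.0], lam_hat=[-1.0, 0.0, 0.0, 0.0])
    print("loss at the optimal primal-dual pair:", optimal["total"])
    print("loss at an infeasible guess:", poor["total"])
```

At the optimal primal-dual pair of the toy problem every term is zero, while the infeasible guess is penalized by all four terms. In a training setup, `x_hat` and `lam_hat` would come from the network and the total would be minimized with an automatic-differentiation framework rather than evaluated with plain NumPy.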

Karush-Kuhn-Tucker Condition-Trained Neural Networks (KKT Nets)
