Towards Constituting Mathematical Structures for Learning to Optimize

Jialin Liu, Xiaohan Chen, Zhangyang Wang, Wotao Yin, HanQin Cai
2023-05-30
Abstract: Learning to Optimize (L2O), a technique that utilizes machine learning to learn an optimization algorithm automatically from data, has gained increasing attention in recent years. A generic L2O approach parameterizes the iterative update rule and learns the update direction as a black-box network. While the generic approach is widely applicable, the learned model can overfit and may not generalize well to out-of-distribution test sets. In this paper, we derive the basic mathematical conditions that successful update rules commonly satisfy. Consequently, we propose a novel L2O model with a mathematics-inspired structure that is broadly applicable and generalizes well to out-of-distribution problems. Numerical simulations validate our theoretical findings and demonstrate the superior empirical performance of the proposed L2O model.
Machine Learning, Optimization and Control
What problem does this paper attempt to address?
The paper addresses a key issue in "Learning to Optimize" (L2O), the use of machine learning to learn optimization algorithms automatically from data: learned optimizers tend to overfit and to perform poorly on out-of-distribution test sets. A generic L2O method parameterizes the iterative update rule and learns the update direction as a black-box network. This approach is widely applicable, but the learned model may not generalize beyond the distribution of problems it was trained on. To address this, the paper proposes a new L2O model whose structure is derived from mathematical conditions that successful update rules commonly satisfy. The model is broadly applicable and generalizes well to out-of-distribution problems.

The main contributions of the paper are:
1. It describes the basic mathematical conditions that a good update rule should satisfy in convex optimization problems.
2. Based on these conditions, it derives a mathematics-inspired structure for the update rule on the variable $x_k$.
3. It validates the proposed approach through numerical experiments, which show superior generalization performance and surprisingly good results even on real datasets.

Overall, the paper aims to improve the generalization ability and stability of L2O models by building mathematical structure into the learned update rule, and thereby to obtain better-performing learned optimization algorithms.
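To make the idea of a parameterized update rule concrete, here is a minimal sketch in Python. It is not the paper's model. It assumes an update of the form $x_{k+1} = x_k - d_k(x_k, \nabla f(x_k))$, where the hypothetical `update_rule` stands in for the learned component. In a generic L2O method this would be a black-box network; here it is a simple per-coordinate step size that is fixed rather than trained, and the quadratic test problem is also an assumption made for illustration.

```python
import numpy as np

def unroll(update_rule, grad_f, x0, num_steps):
    """Run x_{k+1} = x_k - update_rule(x_k, grad f(x_k), k) for num_steps steps."""
    x = np.asarray(x0, dtype=float)
    for k in range(num_steps):
        x = x - update_rule(x, grad_f(x), k)
    return x

def make_diagonal_rule(step_sizes):
    """Hypothetical structured rule: one step size per iteration and coordinate.

    step_sizes has shape (num_steps, dim). In an L2O setting these values
    would be learned from data; here they are supplied by hand.
    """
    def rule(x, g, k):
        return step_sizes[k] * g
    return rule

# Toy convex problem (an assumption): f(x) = 0.5 * x^T A x - b^T x
A = np.diag([1.0, 4.0])
b = np.array([1.0, 2.0])

def grad_f(x):
    return A @ x - b

num_steps = 50
step_sizes = np.full((num_steps, 2), 0.2)  # fixed values standing in for learned ones
x_final = unroll(make_diagonal_rule(step_sizes), grad_f, np.zeros(2), num_steps)
x_star = np.linalg.solve(A, b)  # exact minimizer, used only for comparison
```

With a constant step size this reduces to plain gradient descent, which shows that classical methods are special cases of the parameterized form. Restricting `update_rule` to a structured family such as this one, instead of an unconstrained network, is the kind of design choice the paper motivates, although its actual structure is derived from the conditions it proves.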