Introduction to Online Convex Optimization

Elad Hazan
2023-08-06
Abstract: This manuscript portrays optimization as a process. In many practical applications the environment is so complex that it is infeasible to lay out a comprehensive theoretical model and use classical algorithmic theory and mathematical optimization. It is necessary as well as beneficial to take a robust approach, by applying an optimization method that learns as one goes along, learning from experience as more aspects of the problem are observed. This view of optimization as a process has become prominent in varied fields and has led to some spectacular success in modeling and systems that are now part of our daily lives.
Machine Learning, Optimization and Control
What problem does this paper attempt to address?
This paper addresses how to make effective decisions in a constantly changing and uncertain environment using the Online Convex Optimization (OCO) framework. In OCO, a decision-maker chooses a decision at each step of an iterative process, using only the information observed so far, with the goal of minimizing the cumulative loss. These problems usually have the following characteristics:

1. **Uncertainty**: At each decision point, the loss function faced by the decision-maker is unknown. The loss functions may be chosen by an adversary or generated randomly.
2. **Dynamic environment**: The environment changes over time, and the decision-maker must adapt to it rather than find a single fixed solution to a static problem.
3. **Performance measurement**: The performance of an algorithm is usually measured by "regret", that is, the difference between the algorithm's cumulative loss over all iterations and the cumulative loss of the best fixed decision in hindsight.

### Specific problem examples

1. **Expert advice prediction**:
   - **Problem description**: In each round, the decision-maker chooses one of several experts and follows that expert's advice, then incurs the loss (between 0 and 1) associated with the chosen expert. This process is repeated many times, and the losses in each round may be arbitrary or even adversarial.
   - **OCO framework**: The decision set is the set of all distributions over the experts (i.e., the n-dimensional simplex), and the cost function is linear, representing the expected loss of choosing an expert according to the distribution.
2. **Online spam filtering**:
   - **Problem description**: An online system must classify continuously arriving emails as spam or not spam. The system must cope with adversarially generated data and adjust dynamically as the input changes.
   - **OCO framework**: The decision set is the set of all linear filters that satisfy a norm constraint (i.e., a Euclidean ball of a certain radius), and the cost function is determined by the email and its label, usually through a convex loss function.
3. **Online shortest path**:
   - **Problem description**: In a dynamically changing network, a path is selected in each round with the goal of minimizing the cumulative path cost.
   - **OCO framework**: The decision set is all possible paths, and the cost function is determined by the current network state.
4. **Portfolio selection**:
   - **Problem description**: In a financial market, an investment portfolio is selected in each round with the goal of maximizing returns or minimizing risk.
   - **OCO framework**: The decision set is all possible investment portfolios, and the cost function is usually the negative return or the risk of the portfolio.
5. **Matrix completion and recommendation systems**:
   - **Problem description**: In a recommendation system, missing ratings must be predicted from a partially known user-item rating matrix.
   - **OCO framework**: The decision set is all possible low-rank matrices, and the cost function is determined by the error on the known ratings.

### Summary

This paper provides a systematic way to handle the dynamic and uncertain decision-making problems above through the online convex optimization framework. The OCO framework can handle complex practical applications while guaranteeing algorithm performance through rigorous mathematical analysis. The main contribution of the paper is to present a variety of efficient OCO algorithms and analyze their theoretical properties, especially their regret.
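To make the expert advice setting and the regret measure described above concrete, here is a minimal sketch using the Hedge (multiplicative weights) rule. Hedge is a standard method for this setting and is an assumption here, since the summary does not name a specific algorithm. The function name `hedge`, the step size `eta`, and the random loss sequence are all illustrative choices. The decision in each round is a distribution over experts (a point in the simplex), the cost is the expected loss under that distribution, and regret is measured against the best single expert in hindsight.

```python
import math
import random


def hedge(loss_rounds, eta):
    """Run Hedge on a sequence of loss vectors with entries in [0, 1].

    Returns (cumulative expected loss of the algorithm,
             regret against the best single expert in hindsight).
    """
    n = len(loss_rounds[0])
    cum_losses = [0.0] * n  # cumulative loss of each expert so far
    algo_loss = 0.0

    for losses in loss_rounds:
        # Weights are proportional to exp(-eta * cumulative loss).
        # Subtracting the minimum first keeps the exponentials stable.
        base = min(cum_losses)
        weights = [math.exp(-eta * (c - base)) for c in cum_losses]
        total = sum(weights)
        dist = [w / total for w in weights]  # a point in the simplex

        # The loss vector is revealed only after the distribution is chosen.
        algo_loss += sum(p * l for p, l in zip(dist, losses))
        for i, l in enumerate(losses):
            cum_losses[i] += l

    return algo_loss, algo_loss - min(cum_losses)


# Illustrative run: 5 hypothetical experts, 1000 rounds of random losses.
rng = random.Random(0)
T, n = 1000, 5
rounds = [[rng.random() for _ in range(n)] for _ in range(T)]
eta = math.sqrt(8 * math.log(n) / T)  # a common step-size choice
algo_loss, regret = hedge(rounds, eta)
print(f"algorithm loss: {algo_loss:.2f}, regret: {regret:.2f}")
```

With this step size, the standard analysis of Hedge bounds the regret by sqrt(T ln(n) / 2) for any loss sequence in [0, 1], so the average regret per round shrinks as T grows. The random losses here are only a stand-in; the same guarantee applies when the losses are chosen adversarially.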