Abstract: We introduce an online convex optimization algorithm that uses projected subgradient descent with optimal adaptive learning rates. Our method provides a second-order minimax-optimal dynamic regret guarantee (i.e., one that depends on the sum of squared subgradient norms) for a sequence of general convex functions, which need not be strongly convex, smooth, exp-concave, or even properly Lipschitz-continuous. The regret guarantee holds against any comparator decision sequence with bounded path variation (i.e., the sum of the distances between successive decisions). We derive a lower bound on the worst-case second-order dynamic regret by incorporating the actual subgradient norms. We show that this lower bound matches our regret guarantee up to a constant factor, which makes our algorithm minimax optimal. We also derive an extension that learns in each decision coordinate individually. We demonstrate how to best preserve our regret guarantee in a truly online manner when the bound on the path variation of the comparator sequence grows over time, or when feedback about this bound arrives partially as time goes on. We further build on our algorithm to eliminate the need for any knowledge of the comparator path variation, and provide minimax-optimal second-order regret guarantees with no a priori information. Our approach can compete against all comparator sequences simultaneously (universally) in a minimax-optimal manner, i.e., each regret guarantee depends on the respective comparator path variation. We discuss modifications to our approach that reduce its time, computation, and memory complexity. We further improve our results by making the regret guarantees depend on the diameters of the comparator sets in addition to the respective path variations.
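To make the setting in the abstract concrete, the following is a minimal sketch of projected subgradient descent with a step size that adapts to the running sum of squared subgradient norms. It is an illustration under stated assumptions, not the algorithm of the paper: the decision set is assumed to be a Euclidean ball, and the step-size rule, the function names `project_to_ball` and `adaptive_projected_subgradient`, and the parameters `radius` and `path_bound` are all hypothetical choices made here for the example.

```python
import math


def project_to_ball(x, radius):
    """Euclidean projection of x onto the ball of the given radius centered at 0."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= radius:
        return list(x)
    scale = radius / norm
    return [v * scale for v in x]


def adaptive_projected_subgradient(subgradient_fns, x0, radius, path_bound):
    """Play one decision per round, then take a projected subgradient step.

    subgradient_fns: one callable per round, mapping a decision to a subgradient.
    path_bound: assumed upper bound on the comparator path variation.
    The step size below is an assumed AdaGrad-norm-style rule, chosen only
    for illustration; it is not taken from the paper.
    """
    diameter = 2.0 * radius
    x = project_to_ball(x0, radius)
    sq_norm_sum = 0.0  # running sum of squared subgradient norms
    decisions = []
    for subgrad in subgradient_fns:
        decisions.append(list(x))  # commit to the decision before feedback
        g = subgrad(x)
        sq_norm_sum += sum(v * v for v in g)
        if sq_norm_sum == 0.0:
            continue  # no informative feedback yet, keep the decision
        eta = math.sqrt(diameter * (diameter + path_bound) / sq_norm_sum)
        stepped = [xi - eta * gi for xi, gi in zip(x, g)]
        x = project_to_ball(stepped, radius)
    return decisions


if __name__ == "__main__":
    # Hypothetical drifting targets; round t has loss f_t(x) = |x - c_t|.
    targets = [0.5, 0.5, -0.5, -0.5, 0.5]
    fns = [(lambda x, c=c: [1.0 if x[0] > c else -1.0]) for c in targets]
    for t, d in enumerate(adaptive_projected_subgradient(fns, [0.0], 1.0, 2.0)):
        print(t, d)
```

The sketch needs no Lipschitz constant in advance: the step size is driven only by the subgradient norms actually observed, which is the sense in which a guarantee of this kind is called second-order.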

Efficient Constrained Regret Minimization

Efficient Methods for Non-stationary Online Learning

Online $\mathrm{L}^{\natural}$-Convex Minimization

Online DR-Submodular Maximization: Minimizing Regret and Constraint Violation

On the Computational Efficiency of Adaptive and Dynamic Regret Minimization

Do LLM Agents Have Regret? A Case Study in Online Learning and Games

Online Convex Optimization for Sequential Decision Processes and Extensive-Form Games

Online Reinforcement Learning in Markov Decision Process Using Linear Programming

Constrained Online Two-stage Stochastic Optimization: Near Optimal Algorithms via Adversarial Learning

Doubly Optimal No-Regret Online Learning in Strongly Monotone Games with Bandit Feedback

Minimizing Dynamic Regret and Adaptive Regret Simultaneously

Optimistic Regret Bounds for Online Learning in Adversarial Markov Decision Processes

Online Learning under Budget and ROI Constraints via Weak Adaptivity

Minimizing Adaptive Regret with One Gradient Per Iteration

Universal Online Convex Optimization with Minimax Optimal Second-Order Dynamic Regret

Adaptive Online Learning in Dynamic Environments

No-Regret Learnability for Piecewise Linear Losses

Regret Minimization via Saddle Point Optimization

Regret Minimization and Statistical Inference in Online Decision Making with High-dimensional Covariates

Online Stackelberg Optimization via Nonlinear Control

On Adaptivity in Information-constrained Online Learning