Abstract: We study the online decision making problem (ODMP) as a natural generalization of online linear programming. In ODMP, a single decision maker undertakes a sequence of decisions over $T$ time steps. At each time step, the decision maker makes a locally feasible decision based on information available up to that point. The objective is to maximize the accumulated reward while satisfying some convex global constraints called goal constraints. The decision made at each step results in an $m$-dimensional vector that represents the contribution of this local decision to the goal constraints. In the online setting, these goal constraints are soft constraints that can be violated moderately. To handle potential nonconvexity and nonlinearity in ODMP, we propose a Fenchel dual-based online algorithm. At each time step, the algorithm requires solving a potentially nonconvex optimization problem over the local feasible set and a convex optimization problem over the goal set. Under certain stochastic input models, we show that the algorithm achieves $O(\sqrt{mT})$ goal constraint violation deterministically, and $\tilde{O}(\sqrt{mT})$ regret in expected reward. Numerical experiments on an online knapsack problem and an assortment optimization problem are conducted to demonstrate the potential of our proposed online algorithm.

What problem does this paper attempt to address?

The paper addresses a class of online decision making problems (ODMP), a natural generalization of online linear programming (OLP). In ODMP, a decision maker makes a sequence of decisions over $T$ time steps, each based on the information available up to that point. The objective is to maximize the cumulative reward while satisfying convex global constraints, called goal constraints. The key contributions of the paper are as follows:

1. **Algorithm Design**: An online algorithm based on Fenchel duality is proposed. At each time step $t$, the algorithm solves a potentially nonconvex optimization problem over the local feasible set $\Omega_t$ and a convex optimization problem over the goal set. The algorithm is particularly suitable for mixed-integer subproblems: when $\Omega_t$ is a mixed-integer set, the local subproblem becomes a mixed-integer program (MIP) involving only local decision variables.
2. **Theoretical Guarantees**:
   - Under certain stochastic input models, the algorithm deterministically guarantees that the goal constraint violation is $O(\sqrt{mT})$, where $m$ is the dimension of the vector recording each decision's contribution to the goal constraints and $T$ is the number of time steps.
   - The regret in expected reward (i.e., the reward gap relative to the optimal offline solution) is $\tilde{O}(\sqrt{mT})$, which holds under the uniform random permutation model.
3. **Practical Applications**: Numerical experiments on an online knapsack problem and an assortment optimization problem demonstrate the potential of the proposed online algorithm.

The paper also discusses the relationship between ODMP and fairness over time, i.e., how the impact of decisions is balanced among different stakeholders, and how to handle general global goal constraints, which need not be simple budget constraints.
Overall, this paper presents a new approach to solving online decision-making problems and demonstrates its effectiveness both theoretically and practically.
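
The dual-based idea summarized above can be illustrated on the online knapsack setting. The following is a minimal sketch, assuming a single capacity goal and a plain projected dual-subgradient price update. It is not the paper's Fenchel dual-based algorithm; the function name `online_knapsack_dual`, the default step size $1/\sqrt{T}$, and the random instance are illustrative assumptions.

```python
import math
import random


def online_knapsack_dual(items, budget, step_size=None):
    """Dual-subgradient sketch for online 0/1 knapsack (illustrative only).

    items:  list of (reward, weight) pairs, revealed one at a time.
    budget: total capacity, treated as a soft goal constraint.
    Returns (decisions, total_reward, total_weight).
    """
    T = len(items)
    if T == 0:
        return [], 0.0, 0.0

    b = budget / T  # per-step target consumption
    # Assumed default step size; any positive value can be passed instead.
    eta = step_size if step_size is not None else 1.0 / math.sqrt(T)
    price = 0.0  # dual multiplier for the capacity goal
    decisions, total_reward, total_weight = [], 0.0, 0.0

    for reward, weight in items:
        # Local step: maximize reward*x - price*weight*x over x in {0, 1}.
        x = 1 if reward - price * weight > 0 else 0
        decisions.append(x)
        total_reward += reward * x
        total_weight += weight * x
        # Dual step: raise the price when consumption exceeds the per-step
        # target, lower it otherwise, and keep it nonnegative.
        price = max(0.0, price + eta * (weight * x - b))

    return decisions, total_reward, total_weight


if __name__ == "__main__":
    rng = random.Random(0)  # hypothetical random instance
    items = [(rng.random(), rng.random()) for _ in range(1000)]
    decisions, reward, weight = online_knapsack_dual(items, budget=100.0)
    print(f"accepted={sum(decisions)} reward={reward:.2f} weight={weight:.2f}")
```

In this sketch the local feasible set is simply $\{0, 1\}$, so the local step is a one-line comparison. With a mixed-integer local set, that step would instead be a small MIP in the local variables only. Because the capacity is treated as a soft constraint, the total accepted weight may exceed the budget.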

Online Decision Making with Nonconvex Local and Convex Global Constraints

Online Sequential Decision-Making with Unknown Delays

Online Decisioning Meta-Heuristic Framework for Large Scale Black-Box Optimization

Online Convex Optimization for Dynamic Network Resource Allocation

Online Alternating Direction Method (longer version)

Online Dynamic Submodular Optimization

Optimal and Efficient Algorithms for Decentralized Online Convex Optimization

Online $\mathrm{L}^{\natural}$-Convex Minimization

Online Contextual Decision-Making with a Smart Predict-then-Optimize Method

Online Non-convex Optimization with Long-term Non-convex Constraints

Online Convex Optimization with Continuous Switching Constraint

Online Weakly DR-Submodular Optimization with Stochastic Long-Term Constraints

A Primal-Dual Online Algorithm for Online Matching Problem in Dynamic Environments

Regrets of Proximal Method of Multipliers for Online Non-convex Optimization with Long Term Constraints

An Online Convex Optimization Approach to Proactive Network Resource Allocation

Lazy OCO: Online Convex Optimization on a Switching Budget

Distributed Online Optimization with Long-Term Constraints

Nearly Optimal Regret for Decentralized Online Convex Optimization

Non-stationary Online Convex Optimization with Arbitrary Delays

Online DR-Submodular Maximization with Stochastic Cumulative Constraints